Skip to content

Distributed Strategies

Partitioning Strategies

The project now supports a broader range of client-data topologies.

Strategy Best for Notes
uniform balanced client splits Fast default for smoke tests
stratified class-balance preservation Good baseline for fair comparisons
feature rule-based site segmentation Useful when client boundaries depend on a feature
sized capacity planning Simulate known site volumes
dirichlet non-IID research Standard benchmark for heterogeneous federated learning
label_skew site specialization Simulate clients that only observe a subset of labels

Aggregation Strategies

Legacy paper-style strategies

Strategy Behavior
rf_s_dts_a top-K trees per client by accuracy
rf_s_dts_wa top-K trees per client by weighted accuracy
rf_s_dts_a_all global top-K trees by accuracy
rf_s_dts_wa_all global top-K trees by weighted accuracy

Extended strategies

Strategy Behavior
top_k_global_balanced_accuracy global top-K by balanced accuracy
top_k_global_f1 global top-K by macro F1
proportional_weighted_accuracy allocate tree budget by client data volume, then rank by weighted accuracy
proportional_balanced_accuracy proportional budget with balanced-accuracy ranking
threshold_weighted_accuracy include trees above a quality floor, with fallback selection

When To Use Which Strategy

  • Use rf_s_dts_wa_all when you want a strong general-purpose baseline.
  • Use top_k_global_balanced_accuracy when class imbalance matters.
  • Use proportional_weighted_accuracy when larger clients should contribute more trees.
  • Use aggregation_strategy="auto" in FederatedRandomForest when you want validation-driven strategy selection.

Example

from distributed_random_forest import FederatedRandomForest

model = FederatedRandomForest(
    n_clients=6,
    partition_strategy="dirichlet",
    partition_kwargs={"alpha": 0.3},
    aggregation_strategy="auto",
    selection_metric="weighted_accuracy",
    execution_backend="thread",
)

Metric Bundle

Validation and test reports include:

  • accuracy
  • weighted_accuracy
  • balanced_accuracy
  • f1_score