Core concepts¶
For details on how the training set is split across clients, how per-client forests interact with federated tree aggregation (and differential privacy, DP), and when to use each partition or merge strategy, see Supported distributed RF patterns.
Splitting rules (node impurity)¶
| Criterion | Role |
|---|---|
| Gini | Favors splits that isolate the most frequent class into a pure child node. |
| Entropy | Favors splits that most reduce class mixing across child nodes (information gain). |
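As a quick illustration, here is a minimal, self-contained sketch of both impurity measures and the split gain they induce. The function names and toy arrays are ours for illustration, not the library's API:

```python
import numpy as np

def gini(labels: np.ndarray) -> float:
    """Gini impurity: 1 - sum_k p_k^2, lowest when one class dominates."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def entropy(labels: np.ndarray) -> float:
    """Shannon entropy: -sum_k p_k log2(p_k), zero for a pure node."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

# A candidate split is scored by the impurity decrease it produces:
# impurity(parent) minus the size-weighted mean impurity of the children.
parent = np.array([0, 0, 0, 1, 1, 2])
left, right = np.array([0, 0, 0]), np.array([1, 1, 2])
gain_gini = gini(parent) - (len(left) * gini(left) + len(right) * gini(right)) / len(parent)
gain_entropy = entropy(parent) - (len(left) * entropy(left) + len(right) * entropy(right)) / len(parent)
print(f"Gini gain: {gain_gini:.3f}, entropy gain: {gain_entropy:.3f}")
```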
Local ensemble voting¶
After each tree predicts a class, the forest combines votes:
| Method | Behavior |
|---|---|
| Simple voting (SV) | Plain majority over trees. |
| Weighted voting (WV) | Votes weighted by each tree’s class-wise accuracy. |
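The sketch below shows how the two voting rules can diverge on the same tree predictions. The function names and the toy accuracy matrix are assumptions for illustration, not the library's API:

```python
import numpy as np

def simple_vote(tree_preds: np.ndarray) -> int:
    """SV: each tree casts one vote; the majority class wins."""
    return int(np.bincount(tree_preds).argmax())

def weighted_vote(tree_preds: np.ndarray, class_acc: np.ndarray) -> int:
    """WV: tree t's vote for class c is weighted by class_acc[t, c],
    that tree's validation accuracy on class c."""
    scores = np.zeros(class_acc.shape[1])
    for t, c in enumerate(tree_preds):
        scores[c] += class_acc[t, c]
    return int(scores.argmax())

# Three trees, two classes: trees 0 and 1 vote class 0, tree 2 votes class 1.
preds = np.array([0, 0, 1])
acc = np.array([[0.5, 0.9],   # per-tree, per-class validation accuracy
                [0.4, 0.8],
                [0.9, 0.95]])
print(simple_vote(preds))         # 0 (plain majority)
print(weighted_vote(preds, acc))  # 1 (tree 2 is more reliable on class 1)
```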
Federated aggregation (merging trees)¶
Each client finishes training with its own local RF. A selection step then merges decision trees (DTs) from all clients into one global model. The library offers four named strategies along two axes: the ranking metric (accuracy vs. weighted accuracy) and the tree budget (per-client top-N vs. global top-K). Full descriptions, the parameter names (`n_trees_per_client` vs. `n_total_trees`), and guidance on when to use each pattern are in Supported distributed RF patterns; this section is the short version.
| Strategy name | Ranks by | Tree budget |
|---|---|---|
| rf_s_dts_a | Validation accuracy (A) | N per client (`n_trees_per_client`) |
| rf_s_dts_wa | Weighted accuracy (WA) | N per client (`n_trees_per_client`) |
| rf_s_dts_a_all | A on pooled trees | K total (`n_total_trees`) |
| rf_s_dts_wa_all | WA on pooled trees | K total (`n_total_trees`) |
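The selection logic behind all four strategies fits in a few lines. In the sketch below, the strategy names and the `n_trees_per_client` / `n_total_trees` parameters come from the library, but the function signature and data layout are assumptions:

```python
import numpy as np

def merge_forests(client_trees, scores, strategy,
                  n_trees_per_client=None, n_total_trees=None):
    """Select the trees that form the global forest.

    client_trees: list (one per client) of lists of fitted trees.
    scores: same shape, holding each tree's validation score
            (A for the *_a strategies, WA for the *_wa strategies).
    """
    if strategy in ("rf_s_dts_a", "rf_s_dts_wa"):
        # Per-client budget: keep the top N trees from each client.
        merged = []
        for trees, s in zip(client_trees, scores):
            order = np.argsort(s)[::-1][:n_trees_per_client]
            merged.extend(trees[i] for i in order)
        return merged
    if strategy in ("rf_s_dts_a_all", "rf_s_dts_wa_all"):
        # Global budget: pool all trees, keep the top K overall.
        pooled = [(t, v) for trees, s in zip(client_trees, scores)
                  for t, v in zip(trees, s)]
        pooled.sort(key=lambda tv: tv[1], reverse=True)
        return [t for t, _ in pooled[:n_total_trees]]
    raise ValueError(f"unknown strategy: {strategy}")
```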
Metrics¶
| Symbol | Meaning |
|---|---|
| Accuracy (A) | A tree’s accuracy on the holdout/validation set. |
| Weighted accuracy (WA) | Accuracy times mean per-class accuracy; rewards balance across classes. |
| F1 | Macro or weighted, depending on the experiment. |
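A minimal sketch of WA, following the definition above (the function name is ours, not the library's). It shows why a tree that ignores minority classes scores well on A but poorly on WA:

```python
import numpy as np

def weighted_accuracy(y_true, y_pred):
    """WA = overall accuracy * mean of per-class accuracies (recalls)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    acc = (y_true == y_pred).mean()
    per_class = [(y_pred[y_true == c] == c).mean() for c in np.unique(y_true)]
    return acc * np.mean(per_class)

# A tree that always predicts the majority class: A = 0.75, WA = 0.375.
y_true = [0, 0, 0, 1]
y_pred = [0, 0, 0, 0]
print(weighted_accuracy(y_true, y_pred))  # 0.75 * (1.0 + 0.0) / 2 = 0.375
```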
Analysis may also report client-to-global gaps and DP degradation curves (utility vs. ε).
Implementation highlights¶
- Modular RF, client trainers, and `FederatedAggregator`.
- Gini, entropy, SV, WV, and all four global merge strategies.
- Privacy hooks for mechanisms such as Laplace / Gaussian noise and future extensions (e.g. tree-level clipping); a minimal hook sketch follows this list.
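As a rough idea of what such a hook can do, the sketch below adds Laplace noise to per-leaf class counts before they leave a client. The hook name, signature, and call site are hypothetical, not the library's API:

```python
import numpy as np

def laplace_noise_hook(leaf_counts: np.ndarray, epsilon: float,
                       sensitivity: float = 1.0) -> np.ndarray:
    """Illustrative privacy hook (hypothetical, not the library's API):
    add Laplace(sensitivity / epsilon) noise to per-leaf class counts
    before sharing them, then clip negative counts to zero."""
    noise = np.random.laplace(scale=sensitivity / epsilon,
                              size=leaf_counts.shape)
    return np.clip(leaf_counts + noise, 0, None)
```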
For the scripted evaluation flow, see Experiment pipeline.