FIG. 13 · ATLAS Random Forest
A committee of decision trees.
Hundreds of trees trained on bootstrapped samples and random feature subsets, averaged into a single prediction. Famously hard to beat on tabular data, and forgiving of every kind of bad data hygiene.
The boundary you see at the top is the majority vote across N trees. Underneath, the first five trees in the forest are drawn individually so you can see how each one is wrong in its own way — and how averaging that variety of wrongness produces a smooth right.
§ I The forest, drawn whole
Crank n_trees from 1 to 100. The boundary smooths from jagged to confident. Push max_depth to 10 and watch each individual tree become eager to memorize — the forest still smooths it out.
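If you want to run the same sweep outside the demo, here is a minimal sketch assuming scikit-learn, with the demo's n_trees mapped to sklearn's n_estimators and a toy dataset standing in for the demo's points:

```python
from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier

X, y = make_moons(n_samples=200, noise=0.3, random_state=0)

for n_trees in (1, 10, 100):
    forest = RandomForestClassifier(
        n_estimators=n_trees, max_depth=10, random_state=0).fit(X, y)
    # Each depth-10 tree is eager to memorize; the average still smooths.
    print(n_trees, forest.score(X, y))
```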
First five trees in the forest
Each of these is one tree from the ensemble, trained on a different bootstrap sample with a different random feature picked at each split. No tree is great. The vote across all of them is.
§ II How it works
Train each tree on a bootstrap sample — a random selection of training rows drawn with replacement, so some rows appear twice and some not at all. At every split, restrict the candidate features to a random subset (this is the "random" in random forest, beyond bootstrapping). The two sources of randomness mean the trees disagree, and disagreement is the engine of the ensemble.
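The mechanics fit in a few lines. A from-scratch sketch, assuming numpy arrays and scikit-learn's stock decision tree; train_forest is a hypothetical helper for illustration, not a library API:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def train_forest(X, y, n_trees=100, seed=0):
    rng = np.random.default_rng(seed)
    trees = []
    for _ in range(n_trees):
        # Bootstrap: n rows drawn with replacement, so some rows
        # appear twice and some not at all.
        idx = rng.integers(0, len(X), size=len(X))
        # Feature bagging: max_features="sqrt" restricts each split
        # to a random subset of candidate features.
        tree = DecisionTreeClassifier(
            max_features="sqrt", random_state=int(rng.integers(2**31)))
        trees.append(tree.fit(X[idx], y[idx]))
    return trees
```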
For prediction, run a new point down every tree and take the majority vote. The probability surface you see in the demo is the fraction of trees that voted for class 1 at each grid cell. Where the trees agree the surface is dark; where they're undecided the shading is faint — that's the model's honest uncertainty.
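Prediction is just counting. Continuing the sketch above, with binary 0/1 labels assumed; predict_proba and predict are hypothetical helpers:

```python
import numpy as np

def predict_proba(trees, X_new):
    # Stack each tree's hard vote; the mean is the fraction voting 1,
    # i.e. the shaded surface in the demo.
    votes = np.stack([t.predict(X_new) for t in trees])
    return votes.mean(axis=0)

def predict(trees, X_new):
    # Majority vote: class 1 wherever at least half the trees say so.
    return (predict_proba(trees, X_new) >= 0.5).astype(int)
```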
The math
For T trees voting on a point x:
f̂(x) = mode( T₁(x), T₂(x), …, T_T(x) )

For probabilities (when needed):

P̂(y = 1 | x) = (1/T) Σᵢ Tᵢ(x)

Each tree Tᵢ is trained on a bootstrap sample Dᵢ ∼ D with feature bagging. The out-of-bag (OOB) points — the rows not picked for Dᵢ — give an honest validation estimate at zero extra cost.
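In scikit-learn the OOB estimate is a single flag; a minimal sketch with a synthetic dataset standing in:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=1000, random_state=0)
forest = RandomForestClassifier(
    n_estimators=100, oob_score=True, random_state=0).fit(X, y)
# Each row is scored only by the trees whose bootstrap skipped it.
print(forest.oob_score_)
```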
§ III Where it shines, where it breaks
Tabular accuracy, almost free
Default settings on a clean tabular dataset routinely produce a baseline that takes weeks of work to beat. Feature engineering helps, hyperparameter tuning helps, but the floor is already high.
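As one concrete instance, assuming scikit-learn and one of its bundled tabular datasets:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)
# Stock defaults: no scaling, no tuning, no feature engineering.
scores = cross_val_score(RandomForestClassifier(random_state=0), X, y, cv=5)
print(f"5-fold accuracy: {scores.mean():.3f}")
```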
Robust to messy data
Outliers, mixed scales, irrelevant features, missing values (with simple imputation) — a random forest mostly shrugs. Of the common models, it is the one that punishes sloppy preprocessing least.
Inference cost at scale
100 trees, 10 levels deep each, run for every prediction. That's fine for batch scoring. For real-time inference at high QPS, consider gradient boosting (a shorter sequence of shallower trees) or distill the forest into a smaller model; one distillation sketch follows below.
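A distillation sketch, under the assumption that a single tree fit to the forest's vote fractions is accurate enough for the latency budget; this is one recipe among several, not the only one:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeRegressor

X, y = make_classification(n_samples=2000, random_state=0)
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Teacher signal: the forest's vote fraction, not the hard labels.
soft = forest.predict_proba(X)[:, 1]
student = DecisionTreeRegressor(max_depth=8).fit(X, soft)

# At serve time one tree stands in for a hundred.
agree = ((student.predict(X) >= 0.5) == forest.predict(X)).mean()
print(f"student matches the forest on {agree:.1%} of training points")
```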
Lost interpretability
A single tree is a flowchart. A hundred trees averaged is a weather pattern. Feature importance metrics (Gini, permutation) give a directional read but not the per-prediction explanation a regulator can follow.
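Both directional reads are a few lines in scikit-learn:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

X, y = load_breast_cancer(return_X_y=True)
forest = RandomForestClassifier(random_state=0).fit(X, y)

gini = forest.feature_importances_   # impurity-based, computed at fit time
perm = permutation_importance(       # accuracy drop when a column is shuffled
    forest, X, y, n_repeats=10, random_state=0)
print(gini.argmax(), perm.importances_mean.argmax())
```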
§ IV Trade-off scorecard
Directional, not exact. Inference cost varies with tree count and depth.
- Inference: 0.55
- Accuracy: 0.85
- Training: 0.55
- Small size: 0.30
§ V In production
Microsoft Kinect's body-pose tracking. The original Xbox 360 Kinect classified each pixel of an infrared depth image into one of 31 body parts using a random forest of three trees with 20 levels each — trained on a million synthetic poses, evaluated at 200 FPS on a console GPU. Tabular accuracy showing up in real-time computer vision.
§ VI Compare to
- Decision Tree: a single tree · interpretable
- Gradient Boosting: sequential trees · phase 2
- Logistic Regression: faster · less accurate · calibrated