FIG. 13 · ATLAS Decision Tree
A flowchart that fits the data.
Recursive binary splits on the feature that best separates the target. Reads like a flowchart, handles mixed data without preprocessing, but overfits without depth limits.
Two views of the same model. On the left: the feature space, partitioned into rectangles. On the right: the tree itself, with split conditions inside each interior node and class labels at the leaves. Move the depth slider to feel the trade-off between underfit and overfit.
§ I The tree, drawn twice
Watch how each split slices the feature space with one straight cut at a time. Set depth = 1 for a stump (one cut). Crank to 10 to see the overfit boundary — the tree memorizes individual points.
§ II How it works
Decision trees fit recursively. At each node, the algorithm tries every feature and every candidate threshold, picking the split that produces the purest two children. "Purity" here is Gini impurity: the chance a randomly drawn point from the node would be misclassified if it were labeled at random according to the node's class proportions. Lower is better.
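That purity measure is small enough to write out directly. A minimal sketch (the function name `gini` is ours, not from any library):

```python
from collections import Counter

def gini(labels):
    """Gini impurity: the chance a random point from this node is
    mislabeled if its label is drawn from the node's class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

print(gini(["a", "a", "a", "a"]))   # pure node → 0.0
print(gini(["a", "a", "b", "b"]))   # 50/50 split → 0.5
```

A pure node scores 0; a perfectly mixed two-class node scores 0.5, the worst case for binary labels.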
Splits are axis-aligned — always either "x < threshold" or "y < threshold," never diagonal. That's why the partition map looks like a quilt of rectangles. It's also why a single tree is interpretable: the path from root to leaf is a list of inequalities you can read aloud.
The math
For a node containing classes with proportions p_k, the Gini impurity is:
G = 1 − Σ p_k²

The algorithm picks the (feature, threshold) pair that minimizes the weighted sum of Gini over the two child nodes:

argmin_(f, t) [ (n_L / n) · G_L + (n_R / n) · G_R ]

Recursion stops at max_depth, when a node becomes pure, or when no split lowers impurity. Each leaf's prediction is its majority class.
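The argmin above is an exhaustive search, and a toy version fits in a few lines. A sketch under our own naming (`best_split`, with `gini` redefined inline so the snippet is self-contained); real implementations sort once per feature and update counts incrementally instead of rescanning:

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y):
    """Try every (feature, threshold) pair; return the one minimizing
    the weighted child impurity (n_L/n)·G_L + (n_R/n)·G_R."""
    n = len(y)
    best_f, best_t, best_score = None, None, float("inf")
    for f in range(len(X[0])):
        values = sorted(set(row[f] for row in X))
        # Candidate thresholds: midpoints between consecutive feature values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y[i] for i in range(n) if X[i][f] < t]
            right = [y[i] for i in range(n) if X[i][f] >= t]
            score = len(left) / n * gini(left) + len(right) / n * gini(right)
            if score < best_score:
                best_f, best_t, best_score = f, t, score
    return best_f, best_t, best_score

# Two classes cleanly separated along feature 0.
X = [[1, 5], [2, 6], [8, 5], [9, 6]]
y = [0, 0, 1, 1]
print(best_split(X, y))  # → (0, 5.0, 0.0): split feature 0 at 5.0, pure children
```

Recursing on the two halves until a stopping rule fires is all a full tree-fitting routine adds.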
§ III Where it shines, where it breaks
Mixed-type tabular
Numerical features, categorical features, ordinal features, missing values: a tree handles all of them without one-hot encoding, scaling, or imputation. Real production data is messy. Trees are the model that doesn't mind.
Audit-friendly rules
A regulated lender can't ship a neural network without an interpretability layer. A pruned decision tree IS the interpretation. Every prediction is a path of three to five rules that a compliance officer can read.
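That path-of-rules property is directly inspectable. A sketch using scikit-learn's `export_text`, which prints a fitted tree as nested if/else rules (the dataset choice here is illustrative, not from the original):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_breast_cancer()
# A pruned tree: max_depth=3 keeps every root-to-leaf path to three rules.
clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(data.data, data.target)

rules = export_text(clf, feature_names=list(data.feature_names))
print(rules)
```

Each leaf line in the printout is the end of a path a compliance officer can read aloud: a short conjunction of threshold inequalities ending in a class.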
Overfitting at depth
Crank the depth slider to 10 above. The boundary becomes a haze of narrow rectangles, each carved out around a single training point. Train accuracy climbs to 100%; test accuracy collapses. This is why Random Forest and Gradient Boosting exist.
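The slider's train/test gap is easy to reproduce numerically. A sketch with scikit-learn's `make_moons` (a stand-in for the figure's data, which we don't have):

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Noisy two-class data: some overlap, so a perfect fit must be memorization.
X, y = make_moons(n_samples=400, noise=0.3, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

for depth in (1, 3, 10):
    clf = DecisionTreeClassifier(max_depth=depth, random_state=0).fit(Xtr, ytr)
    print(f"depth={depth:2d}  train={clf.score(Xtr, ytr):.2f}  test={clf.score(Xte, yte):.2f}")
```

At depth 10 the train score approaches 1.0 while the test score sags below the shallower trees: the haze of narrow rectangles, in numbers.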
Diagonal boundaries
Try the spiral preset. Axis-aligned splits can only approximate a diagonal with stair-steps. The tree gets there with enough depth, but the staircase shape is a sign you're paying for the wrong kind of decision surface.
§ IV Trade-off scorecard
Directional, not exact. Reflects shallow trees with reasonable depth limits.
- Inference: 0.85
- Accuracy: 0.70
- Training: 0.85
- Small size: 0.80
§ V In production
Credit scoring at FICO and the German credit bureaus. Pruned, hand-audited decision trees meet the interpretability requirements that regulated lending demands. The same data pumped through gradient boosting would score better, but a tree's path-to-leaf is the explanation a regulator can read aloud.
§ VI Compare to
Random Forest
Many trees, averaged · better accuracy
Gradient Boosting
Trees, sequentially · phase 2
Logistic Regression
Linear, also interpretable