Calculator Inputs
Enter class names and sample counts for the parent node and two child nodes. Large screens show three columns, smaller screens show two, and mobile shows one.
Example Data Table
This example shows a parent node and a possible binary split for a three-class classification problem.
| Class | Parent | Left Child | Right Child |
|---|---|---|---|
| Class A | 50 | 35 | 15 |
| Class B | 30 | 10 | 20 |
| Class C | 20 | 5 | 15 |
| Class D | 0 | 0 | 0 |
| Class E | 0 | 0 | 0 |
| Class F | 0 | 0 | 0 |
Expected output: Parent Gini = 0.620000, Left Gini = 0.460000, Right Gini = 0.660000, Weighted Split Gini = 0.560000, Gini Gain = 0.060000.
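The example above can be checked by hand or with a short script. This is a minimal Python sketch, not the calculator's own code; the `gini` helper is a hypothetical name introduced here for illustration.

```python
def gini(counts):
    """Gini impurity of a node from its per-class sample counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts)

# Counts from the example data table (classes A, B, C).
parent = [50, 30, 20]
left = [35, 10, 5]
right = [15, 20, 15]

g_parent = gini(parent)   # 0.620000
g_left = gini(left)       # 0.460000
g_right = gini(right)     # 0.660000

# Weight each child's impurity by its share of the parent's samples.
n_p, n_l, n_r = sum(parent), sum(left), sum(right)
weighted = (n_l / n_p) * g_left + (n_r / n_p) * g_right  # 0.560000
gain = g_parent - weighted                               # 0.060000

print(f"{g_parent:.6f} {g_left:.6f} {g_right:.6f} {weighted:.6f} {gain:.6f}")
```

Running this reproduces the expected output line above to six decimal places.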
Formula Used
Node Gini Impurity: Gini = 1 − Σ(pᵢ²)
Class Probability: pᵢ = class count / total node count
Weighted Split Gini: ((N_left / N_parent) × Gini_left) + ((N_right / N_parent) × Gini_right)
Gini Gain: Gini_parent − Weighted Split Gini
Lower impurity means a cleaner node. A useful split usually reduces impurity, so a larger Gini gain often indicates a better decision boundary.
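To make "lower impurity means a cleaner node" concrete, here is a small sketch (again using a hypothetical `gini` helper, not the calculator's code) contrasting a pure node with a near-uniform mix:

```python
def gini(counts):
    """Gini impurity of a node from its per-class sample counts."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

print(gini([100, 0, 0]))   # 0.0 -> all samples in one class: perfectly pure
print(gini([34, 33, 33]))  # ~0.6666 -> three classes evenly mixed: high impurity
```

A three-class node cannot exceed an impurity of 1 − 1/3 ≈ 0.6667, which the evenly mixed node nearly reaches.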
How to Use This Calculator
- Enter up to six class names.
- Provide parent node counts for each class.
- Enter matching left and right child counts after a split.
- Click Calculate Gini Impurity.
- Review impurity values, weighted split quality, gain, and dominant classes.
- Export the computed summary as CSV or PDF when needed.
Frequently Asked Questions
1. What does Gini impurity measure?
Gini impurity measures how mixed a node is. A value near zero means most samples belong to one class. Higher values show stronger class overlap inside the node.
2. Why is lower Gini usually better?
Lower Gini means the node is more pure. In tree models, purer child nodes usually help classification because each branch becomes more specialized and easier to interpret.
3. What is weighted split Gini?
Weighted split Gini combines child impurities using their sample shares. Larger child nodes influence the final split score more than smaller nodes.
4. What does Gini gain tell me?
Gini gain shows how much impurity decreases after a split. Positive gain suggests the split improved purity. Larger gain often means a stronger candidate split.
5. Must child counts equal parent counts?
For a strict binary partition, yes. Each parent class count should equal its left count plus right count. This calculator flags mismatches so you can verify your split logic.
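The consistency rule above can be sketched in a few lines of Python. The `split_is_consistent` helper is a hypothetical illustration of the check, not the calculator's actual validation code:

```python
def split_is_consistent(parent, left, right):
    """True when each parent class count equals its left + right counts."""
    return all(p == l + r for p, l, r in zip(parent, left, right))

# Matches the example data table: 50 = 35 + 15, 30 = 10 + 20, 20 = 5 + 15.
split_is_consistent([50, 30, 20], [35, 10, 5], [15, 20, 15])  # True

# Class C is off by five samples, so the split is flagged.
split_is_consistent([50, 30, 20], [35, 10, 5], [15, 20, 10])  # False
```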
6. Can I use more than two classes?
Yes. This page supports up to six classes. You can leave unused classes at zero when your problem has fewer categories.
7. Is this useful outside decision trees?
Yes. It also helps when teaching impurity, testing manual split candidates, auditing dataset segments, or validating tree-building steps before coding a model.
8. What if one child node has zero samples?
That child node gets zero impurity contribution because its weight becomes zero. However, such splits are rarely useful in practical tree construction.
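A quick sketch shows why an empty child drops out of the weighted score: its weight N_child / N_parent is zero, so the weighted split Gini collapses to the other child's impurity. (Hypothetical `gini` helper; the zero-total branch is a convention assumed here, treating an empty node as pure.)

```python
def gini(counts):
    total = sum(counts)
    if total == 0:
        return 0.0  # assumed convention: an empty node is treated as pure
    return 1.0 - sum((c / total) ** 2 for c in counts)

# Degenerate split: every sample goes left, the right child is empty.
parent, left, right = [50, 30, 20], [50, 30, 20], [0, 0, 0]
n_p = sum(parent)
weighted = (sum(left) / n_p) * gini(left) + (sum(right) / n_p) * gini(right)

print(weighted == gini(parent))  # True: impurity is unchanged, so the gain is 0
```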