Random Forest Predictor Calculator

Build your own random forest model from uploaded tables and see accuracy, votes, and predictions instantly. Adjust tree count and depth, then download a report securely.

Model setup

Tip: Use small datasets for best performance. This page trains inside your browser session.


Dataset

First row must be headers. Target is your label/value column; if left blank, the last column is used.
The file upload applies only when “Upload CSV” is selected.

New observation to predict

Fill only feature columns (not the target). For categories, type the exact label found in your dataset.

Example data table

This sample matches the default CSV above.

StudyHours  AttendanceRate  PriorScore  SupportLevel  Passed
6           0.88            72          High          Yes
2           0.60            55          Low           No
4           0.75            63          Medium        Yes
1           0.50            40          Low           No
8           0.92            85          High          Yes

Formula used

Forest prediction
  • Classification: predicted class = majority vote across trees.
  • Regression: predicted value = average of tree predictions.
  • Vote share: class votes ÷ number of trees.
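The three aggregation rules above can be sketched in a few lines of plain Python (a minimal illustration; function names are my own, not the calculator's internals):

```python
from collections import Counter

def forest_predict_class(tree_votes):
    # Classification: the class predicted by the most trees wins.
    return Counter(tree_votes).most_common(1)[0][0]

def forest_predict_value(tree_outputs):
    # Regression: average the individual tree predictions.
    return sum(tree_outputs) / len(tree_outputs)

def vote_share(tree_votes, label):
    # Fraction of trees voting for a given class.
    return tree_votes.count(label) / len(tree_votes)
```

For example, votes of ["Yes", "No", "Yes"] give a prediction of "Yes" with a vote share of 2/3.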
Tree split score
  • Gini: 1 − Σ p(k)², where p(k) is class proportion.
  • Entropy: −Σ p(k) log₂ p(k).
  • MSE: variance of targets in a node (lower is better).
  • Gain: parent impurity − weighted child impurity.
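The split scores listed above translate directly into code. This is a stdlib-only sketch of the standard definitions (not the calculator's own source):

```python
import math
from collections import Counter

def gini(labels):
    # 1 - sum of squared class proportions.
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def entropy(labels):
    # -sum of p(k) * log2 p(k) over the classes present.
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def mse(values):
    # Variance of the targets in a node (lower is better).
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def gain(parent, children, impurity=gini):
    # Parent impurity minus the size-weighted child impurities.
    n = len(parent)
    return impurity(parent) - sum(len(c) / n * impurity(c) for c in children)
```

A perfectly separating split of ["Yes", "Yes", "No", "No"] into its two pure halves yields a gain of 0.5 under Gini.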

How to use this calculator

  1. Choose Classification for labels, or Regression for numbers.
  2. Paste or upload a CSV dataset with headers in the first row.
  3. Set the Target column name (your label/value column).
  4. Tune trees, depth, sampling, and feature subset options.
  5. Enter the new observation feature values to predict.
  6. Press Submit & Predict to view results above the form.
  7. Use the download buttons to export a CSV or PDF report.

For best results, include at least 50 rows and balanced classes.
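The same workflow can be reproduced offline with scikit-learn (an assumption; the calculator's own training code is not shown). This sketch uses the sample table from above, a simple ordinal encoding for SupportLevel, and hypothetical settings of 100 trees and depth 3:

```python
import csv
from io import StringIO
from sklearn.ensemble import RandomForestClassifier

# Sample data matching the page's example table.
data = """StudyHours,AttendanceRate,PriorScore,SupportLevel,Passed
6,0.88,72,High,Yes
2,0.60,55,Low,No
4,0.75,63,Medium,Yes
1,0.50,40,Low,No
8,0.92,85,High,Yes"""

rows = list(csv.DictReader(StringIO(data)))
support_codes = {"Low": 0, "Medium": 1, "High": 2}  # simple ordinal encoding

X = [[float(r["StudyHours"]), float(r["AttendanceRate"]),
      float(r["PriorScore"]), support_codes[r["SupportLevel"]]] for r in rows]
y = [r["Passed"] for r in rows]

model = RandomForestClassifier(n_estimators=100, max_depth=3, random_state=42)
model.fit(X, y)

# Predict a new observation: 5 study hours, 80% attendance, score 70, High support.
prediction = model.predict([[5, 0.80, 70, support_codes["High"]]])[0]
```

The toy dataset is far below the 50-row guideline, so treat the output as a demonstration of the steps, not a trustworthy model.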

FAQs

1) What does the prediction represent?

It is the forest’s combined output for your new observation. Classification returns the most-voted label. Regression returns the mean of all tree outputs.

2) Why do results change when I change the seed?

Random forests rely on randomness for bootstrap samples and feature subsets. A different seed changes those random choices, so the forest structure and output may shift.
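You can see the seed's role by drawing a bootstrap sample directly (a stdlib sketch, not the calculator's implementation):

```python
import random

def bootstrap_sample(rows, seed):
    # Draw len(rows) rows with replacement; the seed fixes the random draws.
    rng = random.Random(seed)
    return [rng.choice(rows) for _ in rows]
```

The same seed always reproduces the same sample, while a different seed yields a different one, which in turn changes every tree grown from it.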

3) What is out-of-bag accuracy or error?

Each tree leaves out some training rows during bootstrapping. Those left-out rows can be predicted by that tree, giving a built-in estimate without a separate validation set.
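A quick sketch of which rows end up out of bag for one tree (illustrative only). Since sampling is with replacement, roughly 1/e, about 37%, of rows are left out on average:

```python
import random

def oob_indices(n_rows, seed):
    # Rows never drawn into a tree's bootstrap sample are "out of bag".
    rng = random.Random(seed)
    in_bag = {rng.randrange(n_rows) for _ in range(n_rows)}
    return [i for i in range(n_rows) if i not in in_bag]
```

Averaging each tree's errors on its own OOB rows gives the out-of-bag estimate reported by the calculator.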

4) How should I set the number of trees?

More trees usually stabilize predictions but require more time. Start around 100–200. Increase if your vote shares or metrics fluctuate, then stop when improvements flatten.

5) What does feature subset per split do?

It limits how many features each split can try. This increases tree diversity and often improves generalization. sqrt(p) is a common default for classification.
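A minimal sketch of the subset draw, assuming the sqrt(p) default mentioned above (function name is my own):

```python
import math
import random

def feature_subset(n_features, seed, mode="sqrt"):
    # Pick the candidate features one split is allowed to try.
    k = max(1, int(math.sqrt(n_features))) if mode == "sqrt" else n_features
    rng = random.Random(seed)
    return rng.sample(range(n_features), k)
```

With 16 features, each split would consider a random 4 of them, so different trees learn from different feature combinations.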

6) Can I use text categories like “High” or “Low”?

Yes. Categorical columns are split using one-vs-rest rules. When predicting, type the exact category spelling that appears in your dataset for consistent behavior.
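A one-vs-rest rule simply partitions rows on whether the column equals one category (a minimal sketch of the rule described above):

```python
def one_vs_rest_split(rows, column, category):
    # Split rows into "equals the category" vs "everything else".
    left = [r for r in rows if r[column] == category]
    right = [r for r in rows if r[column] != category]
    return left, right
```

Because the comparison is an exact string match, "high" and "High" would land on different sides, which is why spelling must match the dataset.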

7) Why is my model accuracy low?

Low accuracy can come from noisy data, weak features, class imbalance, or too small a dataset. Try adding better predictors, collecting more rows, or adjusting depth and leaf sizes.

Related Calculators

learning curve tool

Important Note: All the calculators listed on this site are for educational purposes only, and we do not guarantee the accuracy of results. Please consult other sources as well.