Calculator Inputs
Use manually entered log-likelihood values, paired binary outcomes with predicted probabilities, or both. When paired data are supplied, the log-likelihoods are derived automatically.
Example Data Table
This sample illustrates paired outcomes and predicted probabilities for a binary logistic model.
| Observation | Observed Outcome | Predicted Probability |
|---|---|---|
| Obs 1 | 1 | 0.88 |
| Obs 2 | 0 | 0.22 |
| Obs 3 | 1 | 0.76 |
| Obs 4 | 1 | 0.69 |
| Obs 5 | 0 | 0.31 |
| Obs 6 | 0 | 0.14 |
| Obs 7 | 1 | 0.81 |
| Obs 8 | 0 | 0.40 |
| Obs 9 | 1 | 0.73 |
| Obs 10 | 0 | 0.27 |
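For paired data like the table above, both log-likelihoods can be derived directly: the full model's from the listed predicted probabilities, and the null model's from the mean outcome. A minimal Python sketch of that derivation (variable names are illustrative, not the calculator's internals):

```python
import math

# Sample data from the table above
y = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]  # observed outcomes
p = [0.88, 0.22, 0.76, 0.69, 0.31, 0.14, 0.81, 0.40, 0.73, 0.27]  # predicted probabilities

n = len(y)
y_bar = sum(y) / n  # the null model predicts this constant probability for every case

# Log-likelihood of a Bernoulli model: sum of y*log(p) + (1-y)*log(1-p)
ll_full = sum(yi * math.log(pi) + (1 - yi) * math.log(1 - pi) for yi, pi in zip(y, p))
ll_null = sum(yi * math.log(y_bar) + (1 - yi) * math.log(1 - y_bar) for yi in y)

print(ll_full, ll_null)  # ll_full is less negative than ll_null, indicating a better fit
```

Because half the outcomes here are 1, the null log-likelihood is simply n × ln(0.5).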
Formulas Used
1) McFadden R²
McFadden R² = 1 - (LLfull / LLnull)
2) Adjusted McFadden R²
Adjusted McFadden R² = 1 - ((LLfull - k) / LLnull)
3) Cox-Snell R²
Cox-Snell R² = 1 - exp((2 / n) × (LLnull - LLfull))
4) Nagelkerke R²
Nagelkerke R² = Cox-Snell R² / (1 - exp((2 / n) × LLnull))
5) Efron R²
Efron R² = 1 - Σ(y - p)² / Σ(y - ȳ)²
6) Tjur R²
Tjur R² = mean(p | y = 1) - mean(p | y = 0)
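The six formulas above can be checked numerically against the sample table. A self-contained Python sketch, assuming k = 1 predictor for this illustration (names are illustrative):

```python
import math

y = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
p = [0.88, 0.22, 0.76, 0.69, 0.31, 0.14, 0.81, 0.40, 0.73, 0.27]
n, k = len(y), 1  # k = predictor count (assumed 1 here)
y_bar = sum(y) / n

# Log-likelihoods derived from the paired data
ll_full = sum(yi * math.log(pi) + (1 - yi) * math.log(1 - pi) for yi, pi in zip(y, p))
ll_null = sum(yi * math.log(y_bar) + (1 - yi) * math.log(1 - y_bar) for yi in y)

mcfadden = 1 - ll_full / ll_null
mcfadden_adj = 1 - (ll_full - k) / ll_null
cox_snell = 1 - math.exp((2 / n) * (ll_null - ll_full))
nagelkerke = cox_snell / (1 - math.exp((2 / n) * ll_null))
efron = 1 - (sum((yi - pi) ** 2 for yi, pi in zip(y, p))
             / sum((yi - y_bar) ** 2 for yi in y))
tjur = (sum(pi for yi, pi in zip(y, p) if yi == 1) / sum(y)
        - sum(pi for yi, pi in zip(y, p) if yi == 0) / (n - sum(y)))
```

Note how the likelihood-based measures (McFadden, Cox-Snell, Nagelkerke) and the probability-based measures (Efron, Tjur) give different values for the same fit.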
7) Supporting Diagnostics
- Likelihood Ratio Statistic = 2 × (LLfull - LLnull)
- Deviance = -2 × LL
- Brier Score = mean((y - p)²)
- Accuracy = correct classifications / n
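The same sample data can feed the supporting diagnostics. A short Python sketch, assuming the common 0.50 classification threshold:

```python
import math

y = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
p = [0.88, 0.22, 0.76, 0.69, 0.31, 0.14, 0.81, 0.40, 0.73, 0.27]
n = len(y)
y_bar = sum(y) / n
threshold = 0.50  # assumed classification cutoff

ll_full = sum(yi * math.log(pi) + (1 - yi) * math.log(1 - pi) for yi, pi in zip(y, p))
ll_null = sum(yi * math.log(y_bar) + (1 - yi) * math.log(1 - y_bar) for yi in y)

lr_stat = 2 * (ll_full - ll_null)  # likelihood ratio statistic
deviance = -2 * ll_full            # deviance of the full model
brier = sum((yi - pi) ** 2 for yi, pi in zip(y, p)) / n
accuracy = sum((pi >= threshold) == (yi == 1) for yi, pi in zip(y, p)) / n
```

On this sample, every prediction lands on the correct side of 0.50, so accuracy is 1.0; real data rarely behave this cleanly.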
Pseudo R² values are not equivalent to ordinary least squares R² and should not be read on the same scale. They are fit indicators tailored to categorical-response models, especially logistic regression.
How to Use This Calculator
- Enter sample size, predictor count, and a classification threshold.
- Supply manual null and full log-likelihoods if you already know them.
- Or paste binary observed outcomes and predicted probabilities to derive likelihoods automatically.
- Click Calculate Pseudo R Squared to show results above the form.
- Review multiple pseudo R squared metrics, diagnostics, and the Plotly graphs.
- Use the CSV button for spreadsheet work and the PDF button for reports.
Frequently Asked Questions
1) What does pseudo R squared measure?
It summarizes how much a categorical-response model improves over a null model. Different versions use likelihoods, probabilities, or group separation, so values should be interpreted by method, not as one universal score.
2) Why are there several pseudo R squared formulas?
Logistic and related models do not have one exact OLS-style R squared. Researchers therefore use several alternatives, each emphasizing likelihood gain, probability error, or class discrimination.
3) Which pseudo R squared is most common?
McFadden R squared is very common in logistic regression reporting. Nagelkerke is also popular because it rescales Cox-Snell toward a one-point maximum.
4) Can pseudo R squared be negative?
Yes. Adjusted versions can become negative when predictor penalties outweigh improvement. Negative values suggest the fitted model offers little useful gain over the baseline reference.
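A tiny worked example shows how the sign flips. The log-likelihoods and predictor count here are hypothetical, chosen so the penalty exceeds the likelihood gain:

```python
# Hypothetical values: 5 predictors buy only a 2-point log-likelihood gain
ll_null, ll_full, k = -50.0, -48.0, 5

adjusted_mcfadden = 1 - (ll_full - k) / ll_null
print(adjusted_mcfadden < 0)  # True: the penalty outweighs the gain
```

Here (LLfull - k) = -53 is further from zero than LLnull = -50, so the ratio exceeds 1 and the adjusted value drops below zero (to -0.06).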
5) Should I compare pseudo R squared to linear regression R squared?
No. They are not directly equivalent. Use pseudo R squared to compare categorical-response models, not to match or translate ordinary least squares goodness-of-fit values.
6) Why are log-likelihood values usually negative?
For probability models, log-likelihood values commonly fall below zero because logarithms of probabilities between zero and one are negative. Less negative usually means better fit.
7) What is a good threshold for accuracy?
A threshold of 0.50 is common, but the best value depends on class balance, error costs, and decision context. Accuracy alone should not guide model quality.
8) When should I use Efron or Tjur R squared?
Use them when you want direct probability-based interpretation. Efron reflects squared prediction error, while Tjur shows separation between average predicted probabilities for the two classes.