Compare competing models using rigorous information criteria. Balance fit against simplicity and rank all candidates on a common scale, so the model with the strongest evidence is identified quickly and clearly.
Provide at least two candidate models fitted to the same dataset, each with its total number of estimated parameters and its maximized log-likelihood.
AIC = 2k - 2ln(L)
AICc = AIC + [2k(k + 1) / (n - k - 1)]
ΔAICᵢ = AICᵢ - AICmin
Weightᵢ = exp(-0.5 × ΔAICᵢ) / Σᵣ exp(-0.5 × ΔAICᵣ)
Here, k is the number of estimated parameters, ln(L) is the maximized log-likelihood, and n is sample size for AICc correction.
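The formulas above translate directly into code. The sketch below (an illustration, not part of the calculator itself) checks the linear model from the example table: k = 3 and ln(L) = -126.44 give AIC = 258.88.

```python
def aic(k, log_lik):
    """Akaike Information Criterion: AIC = 2k - 2 ln(L)."""
    return 2 * k - 2 * log_lik

def aicc(k, log_lik, n):
    """Small-sample corrected AIC; valid only when n > k + 1."""
    return aic(k, log_lik) + (2 * k * (k + 1)) / (n - k - 1)

# Linear model from the example table: k = 3, ln(L) = -126.44
print(round(aic(3, -126.44), 2))  # → 258.88
```

Because the correction term is always positive, AICc is never smaller than AIC; the two converge as n grows.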
| Model | Parameters (k) | Log-Likelihood | AIC |
|---|---|---|---|
| Linear Model | 3 | -126.44 | 258.88 |
| Quadratic Model | 4 | -118.91 | 245.82 |
| Cubic Model | 5 | -117.25 | 244.50 |
| Spline Model | 6 | -116.83 | 245.66 |
AIC compares candidate models by balancing fit and complexity. In the example table, the cubic model has AIC 244.50, the lowest score among the four options. It outperforms the quadratic model at 245.82, the spline model at 245.66, and the linear model at 258.88. Lower values indicate better expected out-of-sample performance when all models are fitted to the same response data.
Delta AIC makes raw scores easier to interpret. Using the sample values, the cubic model has delta 0.00, spline 1.16, quadratic 1.32, and linear 14.38. Models within 2 units of the minimum usually remain credible competitors, while values beyond 10 indicate essentially no support. This simple scale helps analysts judge whether a leading model wins narrowly or dominates the candidate set.
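The delta values quoted above can be reproduced directly from the table's AIC scores:

```python
# AIC scores from the example table
aics = {"linear": 258.88, "quadratic": 245.82, "cubic": 244.50, "spline": 245.66}

best = min(aics.values())
# Delta AIC: each model's score minus the minimum score
deltas = {model: round(score - best, 2) for model, score in aics.items()}
# cubic → 0.00, spline → 1.16, quadratic → 1.32, linear → 14.38
```

Only the cubic, spline, and quadratic models fall within the 2-unit band; the linear model, at delta 14.38, can be dismissed.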
The penalty term matters because extra parameters can improve likelihood by chasing noise. AIC adds 2k to discourage unnecessary flexibility. In practice, this protects comparisons involving polynomial terms, interaction effects, or nonlinear expansions. If two models fit almost equally well, the one using fewer parameters often ranks higher because it offers a stronger balance between description and parsimony.
Corrected AIC, or AICc, becomes important when sample size is not large relative to model size. The correction grows rapidly as n approaches k + 1, reducing optimism toward complex specifications. Analysts working with short time series, pilot studies, or restricted experimental samples should inspect AICc carefully. A changed ranking after correction often signals that the richer model demands more information than the data provide.
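A ranking change of this kind can be sketched with the example table. The original table does not state a sample size, so assume a hypothetical n = 20; under that assumption, the correction penalizes the cubic model (k = 5) more heavily than the quadratic model (k = 4), and their order flips.

```python
def aicc(aic_value, k, n):
    """AICc = AIC + 2k(k + 1) / (n - k - 1); valid only when n > k + 1."""
    return aic_value + (2 * k * (k + 1)) / (n - k - 1)

n = 20  # hypothetical small sample, not stated in the example table
quad_aicc = round(aicc(245.82, 4, n), 2)   # 245.82 + 40/15 = 248.49
cubic_aicc = round(aicc(244.50, 5, n), 2)  # 244.50 + 60/14 = 248.79
# Under AIC the cubic model led; under AICc at n = 20 the quadratic wins.
```

At larger samples (say n = 200) the corrections shrink to fractions of a unit and the original AIC ranking is restored.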
Akaike weights convert score differences into relative support that sums to 1.00. This makes reporting easier. Instead of naming only one winning model, teams can show how support is distributed across alternatives. Evidence ratios go further by indicating how many times more support the best model receives than a given competitor. These outputs are useful in forecasting, credit analysis, and scientific reporting where uncertainty must be communicated clearly.
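Weights and evidence ratios follow from the delta values in the example. A minimal sketch, using the deltas computed earlier:

```python
import math

# Delta AIC values from the example table
deltas = {"cubic": 0.00, "spline": 1.16, "quadratic": 1.32, "linear": 14.38}

# Akaike weight: normalized relative likelihood exp(-0.5 * delta)
raw = {model: math.exp(-0.5 * d) for model, d in deltas.items()}
total = sum(raw.values())
weights = {model: r / total for model, r in raw.items()}
# cubic ≈ 0.48, spline ≈ 0.27, quadratic ≈ 0.25, linear ≈ 0.00

# Evidence ratio: how many times more support the best model has
er = weights["cubic"] / weights["spline"]  # ≈ 1.79
```

The weights confirm the delta-scale reading: the cubic model leads but holds under half the total support, so the win is narrow rather than dominant.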
Good selection practice combines information criteria with diagnostics and subject knowledge. The best numerical model may still be unsuitable if residual patterns remain structured or parameters lack meaning. Use this calculator to rank models quickly, then verify assumptions before adoption. That workflow produces choices that are statistically efficient, interpretable, and easier to defend in research, engineering, and business decision settings.
A lower AIC suggests a better balance between model fit and complexity within the submitted candidate set. It does not prove the model is universally true.
Use AICc when sample size is small relative to parameter count. The correction reduces bias that can favor overly complex models in limited datasets.
No. AIC comparisons are valid only when candidate models are fitted to the same response data and estimated under a consistent likelihood framework.
Delta AIC below 2 indicates substantial competitive support. Values between 4 and 7 suggest considerably less support, while values above 10 indicate essentially none.
Akaike weights show each model’s relative support after normalizing all Delta AIC values. Together, they sum to one across the compared models.
Not always. You should also review diagnostics, interpretability, assumptions, and practical relevance before selecting a final model for deployment or reporting.