Poisson Regression Calculator
Example data table
| y | x1 | x2 | offset | exposure |
|---|---|---|---|---|
| 3 | 0.6 | 1.2 | 0 | 1 |
| 2 | 0.2 | 0.7 | 0 | 1 |
| 7 | 1.3 | 1.1 | 0 | 2 |
| 1 | 0.1 | 0.4 | 0 | 1 |
| 5 | 0.9 | 0.9 | 0 | 1 |
| 9 | 1.7 | 1.5 | 0 | 3 |
Formula used
The model assumes a Poisson count response with mean μ:
log(μᵢ) = β₀ + β₁xᵢ₁ + … + βₚxᵢₚ + offsetᵢ
Coefficients are estimated by iteratively reweighted least squares (IRLS), which is Newton’s method (Fisher scoring) for this GLM:
- Wᵢ = μᵢ
- zᵢ = ηᵢ + (yᵢ − μᵢ)/μᵢ
- β ← (XᵀWX)⁻¹XᵀWz
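The update steps above can be sketched in NumPy (an illustrative implementation, not the calculator’s actual code). One detail the compact notation hides: when an offset is present, the weighted regression should target z − offset, since the offset is not multiplied by β.

```python
import numpy as np

def poisson_irls(X, y, offset=None, tol=1e-10, max_iter=100):
    """Fit a Poisson GLM (log link) by IRLS. Illustrative sketch only."""
    n = len(y)
    offset = np.zeros(n) if offset is None else offset
    mu = y + 0.5                             # safe starting values (avoids log(0))
    eta = np.log(mu)
    beta = None
    for _ in range(max_iter):
        W = mu                               # working weights: W_i = mu_i
        z = (eta - offset) + (y - mu) / mu   # working response, offset removed
        XtW = X.T * W                        # scale columns of X' by the weights
        beta = np.linalg.solve(XtW @ X, XtW @ z)
        eta_new = X @ beta + offset
        if np.max(np.abs(eta_new - eta)) < tol:
            return beta
        eta = eta_new
        mu = np.exp(eta)
    return beta
```

With the six example rows above, the score equations Xᵀ(y − μ̂) evaluate to approximately zero at the returned β̂, which is the usual optimality check for a converged GLM fit.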
How to use this calculator
- Paste your dataset as CSV with numeric columns.
- Select the response count column and predictors.
- Optionally add offset and exposure columns.
- Run the model and read the coefficients β and incidence rate ratios (IRR).
- Download a CSV or PDF report for sharing.
Counts, rates, and link scale
Poisson regression targets non‑negative counts and connects predictors to the mean with a log link. When you include exposure, the model estimates rates per unit exposure rather than raw totals. For example, a log(exposure) offset converts “events per week” and “events per month” into comparable scales across rows.
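A toy illustration of the exposure logic (hypothetical numbers, not the calculator’s internals): one row observed for a week and one for a month, both generating two events per day. For an intercept-only Poisson model with a log(exposure) offset, the MLE has a closed form, β₀ = log(Σy / Σexposure), so the baseline rate is directly comparable across the differently sized windows.

```python
import numpy as np

counts   = np.array([14.0, 60.0])  # events: one week at 2/day, one month at 2/day
exposure = np.array([7.0, 30.0])   # observation windows in days
offset = np.log(exposure)          # what gets added to the linear predictor

# Intercept-only Poisson MLE with offset solves exp(b0) * sum(exposure) = sum(counts)
b0 = np.log(counts.sum() / exposure.sum())
baseline_rate = np.exp(b0)         # events per day, comparable across rows
```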
Coefficient meaning and IRR
Each coefficient β is on the log scale. Exponentiating gives the incidence rate ratio (IRR). If a predictor has IRR 1.20, a one‑unit increase multiplies the expected count by 1.20, holding other predictors fixed. An IRR below 1.00 indicates a proportional decrease, such as 0.85 meaning 15% lower expected counts.
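The IRR arithmetic in this paragraph is a one-liner. The numbers below are hypothetical (β chosen as ln 1.2, with an assumed standard error) and include a Wald 95% interval on the ratio scale:

```python
import math

beta = math.log(1.2)  # hypothetical coefficient on the log scale
se = 0.05             # hypothetical standard error

irr = math.exp(beta)                                          # incidence rate ratio
ci = (math.exp(beta - 1.96 * se), math.exp(beta + 1.96 * se)) # 95% Wald interval
pct_change = (irr - 1.0) * 100.0  # percent change in expected count per one-unit step
```

Exponentiating the interval endpoints, rather than building a symmetric interval around the IRR, keeps the bounds positive and matches the log-scale normal approximation.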
Model fit signals you can quantify
Deviance summarizes disagreement between observed counts and fitted means; it is most useful when comparing models fitted to identical rows. Pearson χ² divided by the residual degrees of freedom is a practical dispersion check: values near 1.0 align with Poisson variance, while values of 1.5–2.5 often suggest extra‑Poisson variation worth investigating.
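Both statistics come directly from the observed counts y and fitted means μ. A sketch (the calculator’s exact formulas may differ in details such as the zero-count convention):

```python
import numpy as np

def poisson_fit_stats(y, mu, p):
    """Deviance and Pearson dispersion for a Poisson fit with p parameters."""
    # The deviance term y*log(y/mu) is taken as 0 when y == 0 (its limit).
    with np.errstate(divide="ignore", invalid="ignore"):
        term = np.where(y > 0, y * np.log(y / mu), 0.0)
    deviance = 2.0 * np.sum(term - (y - mu))
    pearson = np.sum((y - mu) ** 2 / mu)   # Pearson chi-square
    dof = len(y) - p                       # residual degrees of freedom
    return deviance, pearson / dof         # (deviance, dispersion estimate)
```

A perfect fit (μ = y on every row) gives zero deviance and zero dispersion; the further the dispersion rises above 1.0, the stronger the evidence of extra‑Poisson variation.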
Overdispersion and robust inference
When dispersion exceeds 1.0, standard errors can be understated. Robust (sandwich) standard errors keep coefficient estimates unchanged but adjust uncertainty to match residual variability. This is helpful for clustered, heterogeneous, or omitted‑variable settings, especially when the primary goal is reliable inference for IRR and confidence intervals.
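A minimal sketch of the HC0 sandwich covariance for a Poisson GLM, assuming the design matrix X, response y, and fitted β are at hand (`robust_se` is an illustrative name, not the calculator’s API):

```python
import numpy as np

def robust_se(X, y, beta, offset=None):
    """HC0 sandwich standard errors for a fitted Poisson GLM (sketch)."""
    if offset is None:
        offset = np.zeros(len(y))
    mu = np.exp(X @ beta + offset)
    # Bread: inverse Fisher information (X' W X)^-1 with W = mu.
    bread = np.linalg.inv((X.T * mu) @ X)
    # Meat: sum of outer products of per-row score contributions X_i (y_i - mu_i).
    score = X * (y - mu)[:, None]
    meat = score.T @ score
    cov = bread @ meat @ bread
    return np.sqrt(np.diag(cov))
```

The coefficient estimates are untouched; only the covariance changes, so IRRs stay the same while their confidence intervals widen (or narrow) to match the observed residual variability.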
Prediction workflow and sanity checks
Predictions return μ = exp(Xβ + offset). Use “row prediction” to validate against your data and “manual prediction” for scenario testing. A useful check is the observed‑vs‑fitted plot: if many points sit far above the diagonal, the model underestimates high counts; far below implies overestimation.
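A “manual prediction” is just the formula above applied to a scenario row. The coefficients here are hypothetical placeholders, not output from the example table:

```python
import numpy as np

# Hypothetical fitted coefficients: intercept, x1, x2 (log scale).
beta = np.array([0.1, 0.8, 0.5])

def predict_mu(rows, offset=0.0):
    """Expected counts mu = exp(X beta + offset) for new predictor rows."""
    X = np.column_stack([np.ones(len(rows)), rows])  # prepend intercept column
    return np.exp(X @ beta + offset)

# Scenario: x1 = 1.0, x2 = 1.0, no offset.
mu_hat = predict_mu(np.array([[1.0, 1.0]]))
```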
Data requirements and stability
Reliable estimation needs variation in predictors and enough rows relative to parameters. If XᵀWX becomes nearly singular, coefficients can blow up and IRR becomes unstable. Reduce correlated predictors, rescale large values, or add more rows. As a rule, aim for at least 10–20 informative rows per parameter for steady behavior.
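One concrete stability check is the condition number of XᵀWX: large values (roughly 10⁸ and above) warn that the linear solve inside each IRLS update is numerically fragile. An illustrative helper, not part of the calculator:

```python
import numpy as np

def irls_condition(X, mu):
    """Condition number of X' W X with W = mu; large values flag near-singularity."""
    XtWX = (X.T * mu) @ X
    return np.linalg.cond(XtWX)
```

Comparing a well-spread design against one whose predictor is nearly constant (and hence nearly collinear with the intercept) shows the blow-up this section describes.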
FAQs
1) When should I use an exposure column?
Use exposure when counts come from different observation times or populations. The calculator adds log(exposure) to the offset, so coefficients represent rate changes rather than total changes.
2) What if my data has many zeros?
The model can handle zeros, but many structural zeros may reduce fit. Check residual plots and consider whether a zero‑inflated or hurdle approach is more appropriate for your process.
3) Why is dispersion greater than 1?
Dispersion above 1 indicates variance exceeds the Poisson mean. Common causes include unmodeled heterogeneity, clustering, or missing predictors. Robust errors help with inference, but model refinement may be needed.
4) Can I interpret the intercept as a baseline rate?
Yes. When all predictors are zero and the offset is zero, exp(intercept) is the baseline mean count. With an exposure or offset in the model, it becomes the baseline rate on that adjusted scale.
5) What does a non‑converged fit mean?
Non‑convergence suggests the algorithm could not stabilize coefficients within the tolerance. Try fewer predictors, rescale inputs, remove collinearity, or increase iterations. Extreme outliers can also cause instability.
6) Are p‑values exact?
The calculator uses large‑sample normal approximations (z tests). For small samples, results can be optimistic. Prefer confidence intervals, robust errors, and validation plots to support conclusions.