Calculator Inputs
Example Data Table
| Scenario | Family | Effect Assumption | Baseline Metric | Adjusted Sample |
|---|---|---|---|---|
| Screening outcome model | Logistic | β = 0.35 | p0 = 0.30 | 408 |
| Counts per exposure model | Poisson | β = 0.22 | μ0 = 0.80 | 243 |
| Continuous outcome model | Gaussian | β = 0.25 | SDy = 1.20 | 230 |
Formula Used
Core Wald approximation:
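n = ((z_{1-α/t} + z_{power})² × φ × VIF) / (β² × Var(X) × I0), where t is the number of test tails (1 or 2), φ is the dispersion factor, and I0 is the per-observation information
Gaussian: I0 = 1 / σ², so n = ((z_{1-α/t} + z_{power})² × φ × VIF × σ²) / (β² × Var(X))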
Logistic: I0 = p0(1-p0)
Poisson: I0 = μ0
Binary predictor variance: Var(X) = q(1-q)
Continuous predictor variance: Var(X) = SD(X)²
Recommended base sample: max(Wald minimum, stability minimum)
Dropout adjustment: Final n = Base n / (1 - dropout proportion)
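The formulas above can be sketched in a few lines of Python. This is an illustrative sketch, not the calculator's own code: the function names (`unit_information`, `wald_n`, `final_n`), the default alpha and power, and the choice to round the final figure up are all assumptions made for the example.

```python
import math
from statistics import NormalDist


def unit_information(family, baseline):
    """Per-observation information I0 for the three supported families."""
    if family == "gaussian":
        return 1.0 / baseline ** 2          # baseline is the outcome SD (sigma)
    if family == "logistic":
        return baseline * (1.0 - baseline)  # baseline is the event probability p0
    if family == "poisson":
        return baseline                     # baseline is the event rate mu0
    raise ValueError(f"unsupported family: {family}")


def wald_n(beta, var_x, family, baseline,
           alpha=0.05, power=0.80, tails=2, phi=1.0, vif=1.0):
    """Core Wald approximation; returns an unrounded sample size."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / tails)
    z_power = NormalDist().inv_cdf(power)
    i0 = unit_information(family, baseline)
    return (z_alpha + z_power) ** 2 * phi * vif / (beta ** 2 * var_x * i0)


def final_n(base_n, dropout):
    """Dropout adjustment; rounding up is an assumption of this sketch."""
    return math.ceil(base_n / (1.0 - dropout))


# Assumed inputs: Var(X) = 1, two-sided alpha = 0.05, power = 0.80, 10% dropout.
base = wald_n(beta=0.35, var_x=1.0, family="logistic", baseline=0.30)
print(round(base, 1), final_n(base, dropout=0.10))
```

The inputs in the last lines borrow β and p0 from the logistic row of the example table, but Var(X), alpha, power, and dropout are guesses, so the printed figures are not expected to match the table's adjusted sample.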
How to Use This Calculator
- Choose the GLM family that matches your outcome.
- Enter the coefficient you want to detect.
- Set predictor variance: enter the SD for a continuous predictor or the prevalence for a binary one.
- Add alpha, target power, and one-sided or two-sided testing.
- Enter the outcome metric: SD, event probability, or event rate.
- Increase dispersion or VIF when data are noisier or predictors correlate.
- Set the number of predictors and the stability factor for practical model reliability.
- Add dropout, then calculate and review the table, summary cards, and graph.
FAQs
1. What does this calculator estimate?
It estimates a planning sample size for testing one GLM coefficient. It supports Gaussian, logistic, and Poisson models. It also applies a model stability safeguard and a dropout adjustment.
2. What does beta mean here?
Beta is the coefficient you want to detect. For Gaussian models, it is a slope. For logistic models, exp(beta) is the odds ratio. For Poisson models, exp(beta) is the rate ratio.
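If you think in odds ratios or rate ratios rather than coefficients, the conversion is a natural logarithm. A minimal sketch, with hypothetical helper names:

```python
import math


def beta_from_ratio(ratio):
    """Convert an odds ratio (logistic) or rate ratio (Poisson) to beta."""
    return math.log(ratio)


def ratio_from_beta(beta):
    """Convert beta back to an odds ratio or rate ratio."""
    return math.exp(beta)


print(round(ratio_from_beta(0.35), 3))  # odds ratio implied by beta = 0.35
print(round(beta_from_ratio(1.5), 3))   # beta implied by a ratio of 1.5
```

For example, β = 0.35 in a logistic model corresponds to an odds ratio of about 1.42, and β = 0.22 in a Poisson model to a rate ratio of about 1.25.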
3. Why is predictor variance important?
Var(X) sits in the denominator of the Wald formula, so higher predictor variance means more information and a smaller sample for the same effect. Continuous predictors use SD squared. Binary predictors use prevalence times one minus prevalence.
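Because n is inversely proportional to Var(X), the effect of a different predictor spread can be checked with simple ratios. The helper names below are hypothetical:

```python
def binary_variance(q):
    """Var(X) for a binary predictor with prevalence q."""
    return q * (1.0 - q)


def continuous_variance(sd):
    """Var(X) for a continuous predictor with standard deviation sd."""
    return sd ** 2


def relative_n(var_x, reference_var):
    """Factor by which n changes when Var(X) replaces reference_var."""
    return reference_var / var_x


# Doubling the SD of a continuous predictor quarters the required sample.
print(relative_n(continuous_variance(2.0), continuous_variance(1.0)))
# A 10% prevalence needs more observations than a balanced 50% split.
print(round(relative_n(binary_variance(0.1), binary_variance(0.5)), 2))
```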
4. Why does the logistic sample explode near extreme probabilities?
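Per-observation information in the logistic model is p(1-p), which peaks at 0.25 when the event rate is 50%. When the baseline probability moves toward zero or one, p(1-p) shrinks toward zero. Because information sits in the denominator of the formula, the required sample grows accordingly.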
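To see how quickly the sample grows, compare p0(1-p0) at several baseline probabilities against its maximum of 0.25. The function name is an assumption made for this sketch:

```python
def logistic_inflation(p0):
    """Sample-size multiplier relative to a 50% baseline event rate."""
    return 0.25 / (p0 * (1.0 - p0))


for p0 in (0.50, 0.30, 0.10, 0.05, 0.01):
    print(f"p0 = {p0:.2f}  multiplier = {logistic_inflation(p0):.2f}")
```

With everything else held fixed, a 1% baseline needs roughly 25 times the sample of a 50% baseline.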
5. What does the VIF input do?
VIF inflates the required sample when predictors are correlated. A VIF of 1 means no inflation. Larger values mean your focal coefficient is estimated less efficiently.
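If you do not have a VIF at hand, the standard definition is VIF = 1 / (1 - R²), where R² comes from regressing the focal predictor on the other predictors. The sketch below uses a hypothetical helper name, and the R² values are illustrative rather than taken from the calculator:

```python
def vif_from_r2(r_squared):
    """Variance inflation factor for a focal predictor.

    r_squared is the R^2 from regressing the focal predictor
    on all other predictors in the model.
    """
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("r_squared must be in [0, 1)")
    return 1.0 / (1.0 - r_squared)


for r2 in (0.0, 0.2, 0.5, 0.8):
    print(f"R^2 = {r2:.1f}  VIF = {vif_from_r2(r2):.2f}")
```

Since VIF multiplies the numerator of the Wald formula, an R² of 0.5 doubles the required sample.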
6. Why include a stability factor?
Pure power formulas can be optimistic for multivariable models. The stability factor adds a practical floor tied to predictor count. That helps prevent fragile estimates and underpowered fitted models.
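The page does not spell out how the stability minimum is computed. The sketch below assumes one plausible rule, stability factor times number of predictors, purely to show how the max(Wald minimum, stability minimum) step behaves. The function name and the rule itself are assumptions.

```python
import math


def recommended_base_n(wald_minimum, n_predictors, stability_factor):
    """Base sample as max(Wald minimum, stability minimum).

    ASSUMPTION: stability minimum = stability_factor * n_predictors.
    The calculator's actual rule is not documented here.
    """
    stability_minimum = stability_factor * n_predictors
    return max(math.ceil(wald_minimum), math.ceil(stability_minimum))


# With few predictors the power calculation drives the answer...
print(recommended_base_n(305.1, n_predictors=4, stability_factor=50))
# ...with many predictors the stability floor takes over.
print(recommended_base_n(305.1, n_predictors=8, stability_factor=50))
```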
7. Should I use one-sided or two-sided testing?
Two-sided tests are the usual default because they guard against effects in both directions. One-sided tests need stronger prior justification. At the same alpha and power, a two-sided test requires a larger sample than a one-sided test.
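The difference comes from the critical value z_{1-α/t}. A short sketch, assuming alpha = 0.05 and 80% power for illustration and using hypothetical function names:

```python
from statistics import NormalDist


def critical_z(alpha, tails):
    """Critical value z_{1 - alpha/tails} for a 1- or 2-tailed test."""
    return NormalDist().inv_cdf(1.0 - alpha / tails)


def z_term_squared(alpha, power, tails):
    """The (z_alpha + z_power)^2 factor from the Wald formula."""
    return (critical_z(alpha, tails) + NormalDist().inv_cdf(power)) ** 2


one_sided = z_term_squared(0.05, 0.80, tails=1)
two_sided = z_term_squared(0.05, 0.80, tails=2)
print(round(critical_z(0.05, 1), 3), round(critical_z(0.05, 2), 3))
print(round(two_sided / one_sided, 2))  # sample-size ratio, two- vs one-sided
```

The critical values are about 1.645 and 1.960, so under these assumed settings the two-sided design needs roughly 27% more observations.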
8. Is this exact for every study design?
No. It is an approximation for early planning. Complex designs, clustering, repeated measures, time-to-event outcomes, and rare events often need specialized methods or simulation.
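As a rough illustration of the simulation route, the sketch below checks the Gaussian formula by Monte Carlo for the simplest case: one normally distributed predictor, no covariates, no clustering. It assumes σ is the residual SD, and the function name, seed, and replication count are all choices made for this example. A real design with the complications listed above would need a simulation that mirrors that design.

```python
import math
import random


def simulate_power(n, beta, sd_x, sd_resid, reps=1000, z_crit=1.959964, seed=1):
    """Monte Carlo power of a two-sided Wald test on a regression slope."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        # Generate one dataset under the assumed effect size.
        x = [rng.gauss(0.0, sd_x) for _ in range(n)]
        y = [beta * xi + rng.gauss(0.0, sd_resid) for xi in x]
        mean_x = sum(x) / n
        mean_y = sum(y) / n
        # Ordinary least squares slope and its standard error.
        sxx = sum((xi - mean_x) ** 2 for xi in x)
        sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
        slope = sxy / sxx
        intercept = mean_y - slope * mean_x
        rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
        se = math.sqrt(rss / (n - 2) / sxx)
        if abs(slope / se) > z_crit:
            rejections += 1
    return rejections / reps


# n = 181 is the Wald formula's answer for beta = 0.25, sigma = 1.2,
# Var(X) = 1, two-sided alpha = 0.05, power = 0.80 (all assumed inputs).
print(simulate_power(n=181, beta=0.25, sd_x=1.0, sd_resid=1.2))
```

If the simulated power lands near the target, the approximation is adequate for that scenario. A large gap suggests the design needs a more tailored method.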