Calculator Inputs
Example Data Table
This sample tensor shows why per-channel calibration can help: some channels occupy tight ranges, while others span much wider ones.
| Channel | Values | Local Min | Local Max | Local Range | Why It Matters |
|---|---|---|---|---|---|
| 1 | 0.12, -0.33, 0.48, -0.05, 0.91 | -0.33 | 0.91 | 1.24 | Moderate range. Shared scaling can work well. |
| 2 | 1.20, 0.95, 1.44, 1.10, 0.88 | 0.88 | 1.44 | 0.56 | Narrow positive range benefits from local scaling. |
| 3 | -1.70, -1.22, -0.94, -1.43, -1.11 | -1.70 | -0.94 | 0.76 | Negative-only values shift zero point behavior. |
| 4 | 0.03, 0.08, -0.02, 0.11, -0.06 | -0.06 | 0.11 | 0.17 | Tiny ranges often lose detail with global calibration. |
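The per-channel statistics in the table can be reproduced with a few lines of Python. This is just a sketch; the `channels` dictionary simply restates the sample values above.

```python
# Per-channel min/max/range for the sample tensor above.
channels = {
    1: [0.12, -0.33, 0.48, -0.05, 0.91],
    2: [1.20, 0.95, 1.44, 1.10, 0.88],
    3: [-1.70, -1.22, -0.94, -1.43, -1.11],
    4: [0.03, 0.08, -0.02, 0.11, -0.06],
}
for ch, values in channels.items():
    lo, hi = min(values), max(values)
    print(f"Channel {ch}: min={lo:.2f}, max={hi:.2f}, range={hi - lo:.2f}")
```

Note how Channel 4's range (0.17) is about seven times smaller than Channel 1's (1.24), which is exactly the situation where a single shared scale wastes resolution.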
Formula Used
Per-tensor quantization uses one scale and one zero point for the whole tensor. Per-channel quantization computes those values independently for each channel.
Asymmetric Quantization
scale = (r_max - r_min) / (q_max - q_min)
zero_point = round(q_min - (r_min / scale))
q = clamp(round(x / scale) + zero_point, q_min, q_max)
x_hat = (q - zero_point) * scale
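The asymmetric formulas translate directly into code. This is a minimal sketch, not the calculator's implementation; the function name and the unsigned 8-bit defaults (`q_min=0`, `q_max=255`) are illustrative.

```python
def quantize_asymmetric(x, r_min, r_max, q_min=0, q_max=255):
    """Quantize and dequantize a list of floats with the asymmetric formulas above."""
    scale = (r_max - r_min) / (q_max - q_min)
    zero_point = round(q_min - r_min / scale)
    # q = clamp(round(x / scale) + zero_point, q_min, q_max)
    q = [min(max(round(v / scale) + zero_point, q_min), q_max) for v in x]
    # x_hat = (q - zero_point) * scale
    x_hat = [(qi - zero_point) * scale for qi in q]
    return q, x_hat
```

For example, Channel 2 from the table (range 0.88 to 1.44) gets a step size of 0.56 / 255 ≈ 0.0022, so every reconstructed value lands within about half a step of the original.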
Symmetric Quantization
abs_max = max(|r_min|, |r_max|)
scale = abs_max / max(|q_min|, |q_max|)
zero_point = 0 for signed symmetric quantization
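A matching sketch for the symmetric case (signed 8-bit defaults assumed for illustration). With the zero point fixed at 0, quantization reduces to scaling, rounding, and clamping.

```python
def quantize_symmetric(x, r_min, r_max, q_min=-128, q_max=127):
    """Quantize and dequantize a list of floats with the symmetric formulas above."""
    abs_max = max(abs(r_min), abs(r_max))
    scale = abs_max / max(abs(q_min), abs(q_max))
    # zero_point is 0, so q = clamp(round(x / scale), q_min, q_max)
    q = [min(max(round(v / scale), q_min), q_max) for v in x]
    x_hat = [qi * scale for qi in q]
    return q, x_hat
```

Note that for Channel 3's negative-only values, symmetric mode spends half its integer range on positive codes that never occur; that is the zero-point behavior the table's last column alludes to.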
Error Metrics
MAE = mean(|x_hat - x|)
RMSE = sqrt(mean((x_hat - x)^2))
Max Error = max(|x_hat - x|)
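These three metrics are straightforward to compute from the original and reconstructed values. A sketch with an illustrative function name:

```python
import math

def error_metrics(x, x_hat):
    """Return (MAE, RMSE, max error) between originals and reconstructions."""
    diffs = [abs(a - b) for a, b in zip(x, x_hat)]
    mae = sum(diffs) / len(diffs)
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return mae, rmse, max(diffs)
```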
Memory Estimate
quantized_payload = elements * bit_width / 8
per_tensor_total = quantized_payload + 8 bytes of metadata (one scale + one zero point)
per_channel_total = quantized_payload + channels * 8 bytes of metadata
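The size estimate in code, as a sketch (assuming 4 bytes each for a scale and a zero point, matching the 8-byte figure above):

```python
def memory_estimate(elements, bit_width, channels):
    """Return (per_tensor_total, per_channel_total) in bytes."""
    payload = elements * bit_width / 8            # quantized values only
    per_tensor = payload + 8                      # one scale + one zero point
    per_channel = payload + channels * 8          # one pair per channel
    return per_tensor, per_channel
```

For the 20-element, 4-channel sample tensor at 8 bits, the payload is 20 bytes either way; per-tensor adds 8 bytes of metadata while per-channel adds 32.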
How to Use This Calculator
- Choose the target bit width.
- Select signed or unsigned quantized integers.
- Pick symmetric or asymmetric mapping.
- Paste tensor values by channel.
- Put each channel on its own line, or separate channels with semicolons.
- Separate values with commas or spaces.
- Press Compare Quantization.
- Review RMSE, MAE, max error, and size estimates.
- Use the Plotly graph to inspect reconstruction quality.
- Export the summary with CSV or PDF buttons.
Frequently Asked Questions
1. What is per-tensor quantization?
Per-tensor quantization uses one shared scale and one shared zero point for the whole tensor. It is simple, fast, and common in deployment pipelines. Accuracy can drop when channel ranges differ strongly.
2. What is per-channel quantization?
Per-channel quantization computes separate calibration values for each channel. It usually preserves weights better when different filters use different ranges. The tradeoff is extra metadata and slightly more implementation complexity.
3. When does per-channel usually help most?
It helps most when channels have very different distributions. Narrow channels can lose detail under a global scale. Local channel scaling preserves those smaller patterns more accurately.
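As a quick numeric illustration of this point, here is a Python sketch that quantizes the narrow Channel 4 from the sample table twice: once with the whole tensor's range and once with its own. The `quant_dequant` helper is hypothetical, not part of the calculator.

```python
# 8-bit asymmetric quantize/dequantize round trip (a sketch).
def quant_dequant(x, r_min, r_max, q_min=0, q_max=255):
    scale = (r_max - r_min) / (q_max - q_min)
    zp = round(q_min - r_min / scale)
    return [(min(max(round(v / scale) + zp, q_min), q_max) - zp) * scale
            for v in x]

ch4 = [0.03, 0.08, -0.02, 0.11, -0.06]
global_hat = quant_dequant(ch4, -1.70, 1.44)   # whole-tensor range
local_hat = quant_dequant(ch4, -0.06, 0.11)    # channel's own range
global_err = max(abs(a - b) for a, b in zip(ch4, global_hat))
local_err = max(abs(a - b) for a, b in zip(ch4, local_hat))
```

Under the global range, Channel 4's step size is about 0.012, roughly the magnitude of its smallest values; under its local range, the step shrinks to under 0.0007 and the reconstruction error drops by more than an order of magnitude.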
4. Why compare RMSE and MAE together?
RMSE highlights larger errors more strongly. MAE shows average absolute drift. Reading both metrics gives a more balanced picture of reconstruction quality.
5. What does symmetric quantization change?
Symmetric quantization centers values around zero. It is often convenient for signed weights. Asymmetric mapping can better fit shifted ranges, especially for activations or positive-only data.
6. Does per-channel always win?
No. Sometimes both methods perform similarly. If channel ranges are already close, per-tensor may be accurate enough and easier to deploy.
7. Why does per-channel use more memory?
Each channel stores its own scale and zero point. That adds metadata. The payload size remains mostly the same, but the overhead grows with channel count.
8. Can this calculator handle activations too?
Yes. The same math applies to activations and weights. The best choice depends on runtime support, calibration data, and acceptable accuracy loss.