Tune feature map sizes for vision models. Compare settings across kernels, strides, padding, and dilation. See the exact formulas, then download your calculations at any time.
| Scenario | Input (H×W) | Kernel | Stride | Padding | Dilation | Output (H×W) |
|---|---|---|---|---|---|---|
| Downsample block | 224×224 | 3×3 | 2×2 | SAME | 1×1 | 112×112 |
| Feature refine | 112×112 | 3×3 | 1×1 | SAME | 1×1 | 112×112 |
| Valid convolution | 32×32 | 5×5 | 1×1 | VALID | 1×1 | 28×28 |
| Dilated context | 64×64 | 3×3 | 1×1 | SAME | 2×2 | 64×64 |
| Upsample step | 56×56 | 4×4 | 2×2 | Custom (PH=2, PW=2) | 1×1 | 112×112 |
Stride controls how densely a kernel samples the input. With stride 1, adjacent receptive fields overlap heavily, preserving detail. With stride 2, you roughly halve spatial resolution per dimension, reducing downstream compute and activation memory by ~4× in 2D. In common backbones, early stride choices drive feature map sizes (e.g., 224→112→56) and determine where fine texture is lost. If accuracy drops, try moving the first downsample later. Stacking two stride‑2 layers yields an effective stride of 4, so each output cell advances by 4 input pixels per dimension and corresponds to a 4×4 block of input positions (its receptive field is larger and depends on the kernels).
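As a rough illustration of how strides compound, the sketch below multiplies per-layer strides and tracks the spatial size after each layer. It assumes SAME-style padding (Out = ceil(In/S)) for every layer; the function names are made up for this example and are not part of the calculator.

```python
from math import prod


def effective_stride(strides):
    """Cumulative stride of stacked layers: the product of per-layer strides."""
    return prod(strides)


def feature_map_sizes(size, strides):
    """Spatial size after each layer, assuming SAME-style padding (Out = ceil(In/S))."""
    sizes = [size]
    for s in strides:
        size = -(-size // s)  # integer ceiling division
        sizes.append(size)
    return sizes


print(effective_stride([2, 2]))        # 4
print(feature_map_sizes(224, [2, 2]))  # [224, 112, 56]
```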
For each dimension, the forward output is Out = floor((In + Ptotal − Keff)/S) + 1, where Keff = D×(K−1)+1. This calculator applies the formula independently to height and width (or length for 1D), so mixed strides like 2×1 are handled correctly for anisotropic inputs. The effective-kernel readout helps verify dilated blocks quickly.
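The formula translates directly into a few lines of Python. This is a minimal sketch, not the calculator's own code: the function names and the `p_total` argument (total padding along one dimension, before plus after) are assumptions chosen to mirror the symbols above.

```python
def effective_kernel(k, d=1):
    """Keff = D*(K-1)+1."""
    return d * (k - 1) + 1


def conv_out(size, k, s=1, p_total=0, d=1):
    """Out = floor((In + Ptotal - Keff)/S) + 1 for a single dimension."""
    return (size + p_total - effective_kernel(k, d)) // s + 1


# "Valid convolution" row of the table: 32 input, 5x5 kernel, stride 1, no padding.
print(conv_out(32, 5))  # 28

# Mixed stride 2x1 on a 224x224 input with a 3x3 kernel and 2 pixels of
# total padding per dimension: each dimension is computed independently.
h = conv_out(224, 3, s=2, p_total=2)
w = conv_out(224, 3, s=1, p_total=2)
print((h, w))  # (112, 224)
```

A non-positive result means the effective kernel does not fit inside the padded input, so it is worth checking for that case before using the value.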
SAME padding targets Out = ceil(In/S) and computes the total padding needed, then splits it into before/after values that may be asymmetric. Odd kernels with stride 1 yield symmetric padding, because the total padding is Keff−1, which is even; larger strides can force uneven borders. VALID padding uses Ptotal=0, shrinking maps by Keff−1 when stride is 1. Dilation increases Keff without increasing parameters, expanding context while keeping stride unchanged; the trade‑off is more boundary dependence and possible gridding at high dilation.
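The SAME computation can be sketched as follows. The total padding is obtained by rearranging the forward formula for the target Out = ceil(In/S), and the split uses floor(total/2) before with the remainder after, as this page describes. The function name and return layout are assumptions for illustration.

```python
def same_padding(size, k, s=1, d=1):
    """Return (out, total, before, after) for SAME padding along one dimension."""
    k_eff = d * (k - 1) + 1
    out = -(-size // s)  # ceil(In / S) using integer arithmetic
    # Smallest total padding for which the forward formula yields `out`.
    total = max((out - 1) * s + k_eff - size, 0)
    before = total // 2
    after = total - before
    return out, total, before, after


print(same_padding(224, 3, s=2))       # (112, 1, 0, 1)  uneven border
print(same_padding(112, 3, s=1))       # (112, 2, 1, 1)  symmetric
print(same_padding(64, 3, s=1, d=2))   # (64, 4, 2, 2)   dilated, Keff = 5
```

The first call corresponds to the "Downsample block" row of the table: a single pixel of padding is needed, and it lands on the trailing border.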
When you know the desired output size (for skip connections, concatenation, or patch grids), solving for stride can save trial and error. Because floor rounding is involved, not every target is achievable with a single integer stride. The solver reports the closest valid stride and shows the achieved output so you can adjust padding, kernel, or dilation to match exactly.
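One simple way to search for a stride is brute force over a small range of integers, keeping the candidate whose output is closest to the target. The sketch below is only an assumption about how such a solver could work; the page does not describe the calculator's actual method, and `solve_stride`, `max_stride`, and the tie-breaking rule (prefer the smaller stride) are hypothetical.

```python
def solve_stride(size, target, k, p_total=0, d=1, max_stride=64):
    """Return (stride, achieved_out) with achieved_out as close to target as possible."""
    k_eff = d * (k - 1) + 1
    span = size + p_total - k_eff
    if span < 0:
        raise ValueError("effective kernel is larger than the padded input")
    best = None
    for s in range(1, max_stride + 1):
        out = span // s + 1
        candidate = (abs(out - target), s, out)  # ties go to the smaller stride
        if best is None or candidate < best:
            best = candidate
    _, stride, achieved = best
    return stride, achieved


print(solve_stride(224, 112, 3, p_total=2))  # (2, 112)  exact match
print(solve_stride(32, 13, 5))               # (2, 14)   13 is not reachable
```

In the second call no integer stride produces 13, so the closest achievable size is reported and the padding, kernel, or dilation would need to change to hit the target exactly.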
In decoders and segmentation heads, transposed layers expand resolution: Out = (In−1)×S − Ptotal + D×(K−1) + OutPad + 1. Output padding is a small correction term (often 0..S−1) used to land on exact sizes like 56→112. Use the history table to compare configurations, document decisions, and export reproducible notes.
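The transposed formula is equally short in code. This is a sketch under the same naming assumptions as the symbols above, with `p_total` meaning total padding along one dimension; it is not taken from the calculator itself.

```python
def conv_transpose_out(size, k, s=1, p_total=0, d=1, out_pad=0):
    """Out = (In-1)*S - Ptotal + D*(K-1) + OutPad + 1 for a single dimension."""
    return (size - 1) * s - p_total + d * (k - 1) + out_pad + 1


# 4x4 kernel, stride 2, total padding 2: lands on 112 with no output padding.
print(conv_transpose_out(56, 4, s=2, p_total=2))             # 112

# 3x3 kernel, stride 2, total padding 2: one pixel short without output padding.
print(conv_transpose_out(56, 3, s=2, p_total=2))             # 111
print(conv_transpose_out(56, 3, s=2, p_total=2, out_pad=1))  # 112
```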
Effective kernel equals D×(K−1)+1. Dilation spreads taps apart, so the receptive field grows without adding weights. The calculator shows Keff for each dimension to avoid manual expansion.
When stride is greater than 1, the required total padding may be odd. The tool splits it as floor(total/2) before and the remainder after, which can shift one border by one pixel.
In 2D, doubling stride typically halves height and width, cutting activations by about 4× and reducing convolution FLOPs similarly. This speeds training, but it can remove small objects and fine edges.
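To see the ~4× effect in numbers, the sketch below counts output activations and multiply-accumulate operations for one convolution layer at two strides. It assumes SAME padding (Out = ceil(In/S)) and uses the standard estimate of Hout × Wout × Cin × Cout × Kh × Kw multiply-accumulates; the channel counts are hypothetical values chosen for illustration.

```python
def layer_cost(size, stride, k=3, c_in=64, c_out=64):
    """Return (activations, macs) for a square input with SAME padding."""
    out = -(-size // stride)            # ceil(In / S)
    activations = out * out * c_out     # number of output values
    macs = out * out * c_in * c_out * k * k
    return activations, macs


act1, macs1 = layer_cost(112, stride=1)
act2, macs2 = layer_cost(112, stride=2)
print(act1 / act2)    # 4.0
print(macs1 / macs2)  # 4.0
```

The ratio is exactly 4 here because 112 is even; odd sizes give a ratio slightly below 4 because of the ceiling.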
The output formula contains a floor operation, so only certain outputs are reachable for a fixed input, kernel, dilation, and padding. If the closest stride is off, adjust padding or kernel, or redesign the downsampling schedule.
Output padding is a small extra term that increases the transposed output size by 0..(stride−1). It is useful for matching skip‑connection sizes without changing kernel or stride.
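Given a target size, the required output padding can be found by computing the transposed output without it and taking the difference. The helper below is a hypothetical sketch: it returns `None` when the difference falls outside 0..(stride−1), which signals that the kernel, padding, or stride has to change instead.

```python
def required_output_padding(size, target, k, s=1, p_total=0, d=1):
    """Return the OutPad that reaches `target`, or None if it is not in 0..S-1."""
    base = (size - 1) * s - p_total + d * (k - 1) + 1  # output with OutPad = 0
    out_pad = target - base
    if 0 <= out_pad < s:
        return out_pad
    return None


print(required_output_padding(56, 112, k=3, s=2, p_total=2))  # 1
print(required_output_padding(56, 112, k=4, s=2, p_total=2))  # 0
print(required_output_padding(56, 113, k=3, s=2, p_total=2))  # None
```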
Use 1D for sequences such as audio frames, token embeddings, or sensor signals. Enter the length as “input height/length”; width-related fields are ignored so you can focus on the single dimension.
Important Note: All the calculators listed on this site are for educational purposes only, and we do not guarantee the accuracy of results. Please consult other sources as well.