Analyze OCR and ASR output with CER metrics. Track substitutions, deletions, insertions, and accuracy trends. See graphs, exports, examples, and formulas for better evaluation.
Character error rate is the number of edit operations divided by the reference length: CER = (S + D + I) / N.
Character Error Rate measures how many character-level edits are needed to transform the reference into the hypothesis, relative to the length of the reference.
S is substitutions, D is deletions, I is insertions, and N is the number of characters in the normalized reference.
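As a worked example, take the reference "character error rate" and the hypothesis "charactr error rate". Assuming spaces count as characters and no other normalization is applied, the reference has N = 20 characters and the hypothesis drops one "e", so S = 0, D = 1, and I = 0:

```latex
\mathrm{CER} = \frac{S + D + I}{N} = \frac{0 + 1 + 0}{20} = 0.05 = 5\%
```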
This file uses dynamic programming to compute the minimum edit distance, then backtracks through the matrix to count each operation precisely.
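The dynamic-programming and backtracking steps described above can be sketched in Python. This is an illustrative version, not the calculator's own source: the names `EditCounts` and `count_edits` are hypothetical, and the tie-breaking rule (prefer a diagonal move, then a deletion, then an insertion) is an assumption. Different tie-breaking can split the same minimum distance into different operation counts.

```python
from typing import NamedTuple


class EditCounts(NamedTuple):
    substitutions: int
    deletions: int
    insertions: int


def count_edits(reference: str, hypothesis: str) -> EditCounts:
    """Count the edits that turn `reference` into `hypothesis`."""
    n, m = len(reference), len(hypothesis)
    # dist[i][j] = minimum edits between reference[:i] and hypothesis[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i  # delete every reference character
    for j in range(1, m + 1):
        dist[0][j] = j  # insert every hypothesis character
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j - 1] + cost,  # match or substitution
                dist[i - 1][j] + 1,         # deletion
                dist[i][j - 1] + 1,         # insertion
            )

    # Backtrack from the bottom-right corner.
    # Assumed tie-breaking: diagonal first, then deletion, then insertion.
    subs = dels = ins = 0
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            cost = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            if dist[i][j] == dist[i - 1][j - 1] + cost:
                subs += cost
                i, j = i - 1, j - 1
                continue
        if i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            dels += 1
            i -= 1
        else:
            ins += 1
            j -= 1
    return EditCounts(subs, dels, ins)


counts = count_edits("character error rate", "charactr error rate")
print(counts, sum(counts) / len("character error rate"))
```

The three counts always sum to the minimum edit distance, so the CER value itself does not depend on the tie-breaking rule; only the per-operation breakdown does.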
| Reference Text | Hypothesis Text | Expected Notes |
|---|---|---|
| character error rate | charactr error rate | One deletion (the missing "e"); with spaces counted, CER = 1/20 = 5%. |
| machine learning model | machien learning model | The transposed "ne" costs two edits (two substitutions, or one deletion plus one insertion); CER = 2/22 ≈ 9.1%. |
| speech recognition | speech recogniton | One deletion (the missing "i"); CER = 1/18 ≈ 5.6%. |
| optical character recognition | opticl charcter recognition | Two deletions (a missing "a" in each of the first two words); CER = 2/29 ≈ 6.9%. |
CER measures the proportion of character edits needed to match predicted text to the reference. Lower CER means better text recognition or transcription quality.
CER is useful for OCR, speech recognition, handwriting recognition, subtitle generation, and any system whose text output is judged at the character level.
Normalization changes the evaluation policy. Ignoring case, spaces, or punctuation can reduce penalties when those differences are irrelevant to your application.
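A minimal sketch of such a normalization step is shown below. The function name `normalize` and its three flags are hypothetical, and `string.punctuation` covers ASCII punctuation only, so this is an illustration of the policy choice rather than the calculator's actual behavior. The same normalization should be applied to both the reference and the hypothesis before counting edits.

```python
import string


def normalize(text: str, ignore_case: bool = False,
              ignore_spaces: bool = False,
              ignore_punctuation: bool = False) -> str:
    """Apply the chosen normalization policy to one string."""
    if ignore_case:
        text = text.lower()
    if ignore_punctuation:
        # Assumption: only ASCII punctuation is stripped.
        text = "".join(ch for ch in text if ch not in string.punctuation)
    if ignore_spaces:
        # Removes all whitespace, including tabs and newlines.
        text = "".join(text.split())
    return text


print(normalize("Hello, World!", ignore_case=True))
print(normalize("Hello, World!", ignore_case=True,
                ignore_spaces=True, ignore_punctuation=True))
```

Because N is the length of the normalized reference, turning a flag on changes the denominator as well as the edit count.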
A good CER depends on the task. Near zero is excellent. Production systems often target very low CER for clean inputs and tolerate higher values for noisy data.
CER evaluates characters, while WER evaluates words. CER is more sensitive to spelling and small text changes, especially in short outputs.
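The difference can be seen by running one edit-distance routine over characters and over words. The sketch below is illustrative: `edit_distance`, `cer`, and `wer` are hypothetical helper names, words are split on whitespace, and both functions assume a non-empty reference.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance between two sequences (strings or word lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j - 1] + (r != h),  # match or substitution
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
            ))
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    # Character level: the sequence items are single characters.
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference: str, hypothesis: str) -> float:
    # Word level: the sequence items are whitespace-separated words.
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


reference = "speech recognition"
hypothesis = "speech recogniton"
print(f"CER = {cer(reference, hypothesis):.3f}")  # 1 edit over 18 characters
print(f"WER = {wer(reference, hypothesis):.3f}")  # 1 wrong word over 2 words
```

A single missing letter costs one edit out of 18 characters but one whole word out of two, which is why the two metrics can diverge sharply on short outputs.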
CER can exceed 100 percent when the number of insertions is large relative to the reference length, which happens most easily with short reference strings.
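For instance, assuming spaces count as characters, the reference "ok" has N = 2 characters. The hypothesis "ok then" adds five characters (a space plus "then"), so S = 0, D = 0, and I = 5:

```latex
\mathrm{CER} = \frac{S + D + I}{N} = \frac{0 + 0 + 5}{2} = 2.5 = 250\%
```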
CER becomes undefined when the normalized reference is empty but the hypothesis contains characters. There is no valid denominator for the formula.
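One way to handle this case in code is to return a sentinel instead of dividing by zero. The sketch below is an assumption about how a caller might guard the formula, not the calculator's own logic: `cer_from_counts` is a hypothetical name, and treating two empty strings as a CER of 0 is a policy choice that the text above does not state.

```python
from typing import Optional


def cer_from_counts(substitutions: int, deletions: int,
                    insertions: int, reference_length: int) -> Optional[float]:
    """Return CER, or None when it is undefined."""
    edits = substitutions + deletions + insertions
    if reference_length == 0:
        # Assumption: empty reference and empty hypothesis is a perfect match.
        # Any edits against an empty reference leave no valid denominator.
        return 0.0 if edits == 0 else None
    return edits / reference_length


print(cer_from_counts(0, 0, 3, 0))   # undefined case
print(cer_from_counts(0, 1, 0, 20))  # ordinary case
```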
The calculator splits text with multibyte-safe character handling, so Unicode text is counted per character rather than per byte.
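The gap between character counting and byte counting can be illustrated in Python, where strings iterate by Unicode code point. This is only a demonstration of the idea; it does not reproduce the calculator's own splitting routine.

```python
word = "caf\u00e9"  # "café" with a precomposed "é" (U+00E9)

characters = list(word)            # one item per Unicode code point
utf8_bytes = word.encode("utf-8")  # raw bytes as stored in UTF-8

print(len(characters))  # 4 code points
print(len(utf8_bytes))  # 5 bytes, because "é" takes two bytes in UTF-8
```

With byte-based counting, a single wrong accented letter would inflate both the edit count and N. Note that a code point is not always a user-perceived character: text that uses combining marks may need Unicode normalization before comparison.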