Model LZW encoding with customizable symbol sets. Compare original bits, output codes, compression ratios, and savings. Trace dictionary growth step by step for faster learning and auditing.
These samples use classic repeated text strings. The numbers below illustrate the kind of output the calculator reports after encoding.
| Sample Text | Symbol Mode | Bits per Character | Code Width | Max Dictionary | Typical Output Codes | What to Observe |
|---|---|---|---|---|---|---|
| TOBEORNOTTOBEORTOBEORNOT | ASCII 256 | 8 | 12 | 4096 | 84, 79, 66, 69, 79, 82, 78, 79, 84, 256, 258, 260, 265, 259, 261, 263 | Repeated patterns create new dictionary entries and reduce average bits per symbol. |
| ABABABAABABA | Unique symbols from input | 8 | 12 | 512 | 0, 1, 2, 4, 4, 3 | Short alphabets often show dictionary growth very clearly in the trace table. |
LZW compression works by replacing repeated symbol sequences with dictionary codes. The algorithm starts with an initial symbol dictionary, emits a code for the longest known phrase, then adds a new phrase formed by the current phrase plus the next symbol.
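The loop described above can be sketched in Python. This is a minimal illustration, not the calculator's own code: the function name `lzw_encode` and the `initial_symbols` parameter are hypothetical, and the default dictionary of 256 single-character entries is an assumption standing in for ASCII 256 mode.

```python
def lzw_encode(text, initial_symbols=None):
    """Return the list of LZW codes for `text` (illustrative sketch)."""
    if initial_symbols is None:
        # Assumption: ASCII 256 mode maps code i to the character chr(i).
        initial_symbols = [chr(i) for i in range(256)]
    dictionary = {sym: code for code, sym in enumerate(initial_symbols)}
    next_code = len(dictionary)

    codes = []
    phrase = ""
    for symbol in text:
        if symbol not in dictionary:
            raise ValueError(f"symbol {symbol!r} is not in the initial dictionary")
        candidate = phrase + symbol
        if candidate in dictionary:
            phrase = candidate                 # keep extending the longest known phrase
        else:
            codes.append(dictionary[phrase])   # emit the code for the known phrase
            dictionary[candidate] = next_code  # add phrase + next symbol
            next_code += 1
            phrase = symbol                    # restart from the unmatched symbol
    if phrase:
        codes.append(dictionary[phrase])       # flush the final phrase
    return codes


print(lzw_encode("TOBEORNOTTOBEORTOBEORNOT"))
# → [84, 79, 66, 69, 79, 82, 78, 79, 84, 256, 258, 260, 265, 259, 261, 263]
```

The sketch has no dictionary size limit, so it matches a real encoder only while the dictionary stays below the selected maximum.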
The calculator reports size estimates with these formulas:
When Fixed code width is selected, every emitted code uses the chosen width. When Dynamic code width is selected, the estimator increases code width as the dictionary grows, starting from the minimum width required by the initial dictionary.
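The fixed and dynamic estimates can be compared with a short Python sketch. Everything here is an assumption made for illustration rather than the calculator's exact formula: the function name `estimate_sizes`, its parameters, and in particular the dynamic rule, which charges each emitted code the number of bits needed to represent the largest code in the dictionary at that moment. Real implementations differ in exactly when they widen codes.

```python
def estimate_sizes(num_chars, num_codes, initial_dict_size,
                   bits_per_char=8, fixed_width=12, max_dict_size=4096):
    """Estimate original, fixed-width, and dynamic-width sizes in bits."""
    original_bits = num_chars * bits_per_char
    fixed_bits = num_codes * fixed_width

    dynamic_bits = 0
    dict_size = initial_dict_size
    for _ in range(num_codes):
        # Assumed rule: bits needed for the largest code currently defined.
        dynamic_bits += max(1, (dict_size - 1).bit_length())
        if dict_size < max_dict_size:
            dict_size += 1  # each emission is followed by one new entry

    return {
        "original_bits": original_bits,
        "fixed_bits": fixed_bits,
        "dynamic_bits": dynamic_bits,
    }


# First sample row: 24 characters, 16 output codes, 256 initial symbols.
est = estimate_sizes(num_chars=24, num_codes=16, initial_dict_size=256)
print(est)
# → {'original_bits': 192, 'fixed_bits': 192, 'dynamic_bits': 143}

savings = 1 - est["dynamic_bits"] / est["original_bits"]
print(round(savings * 100, 1))  # → 25.5
```

Under these assumptions the fixed 12-bit estimate for the first sample saves nothing (192 bits either way), while the dynamic estimate uses one 8-bit code followed by fifteen 9-bit codes.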
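The calculator estimates the LZW encoded output, compressed size, savings, and compression ratio, and it shows dictionary growth with a step-by-step trace of emitted codes and added phrases.
Very short or random text may not repeat enough patterns. In those cases, emitted codes and dictionary overhead can make the encoded size larger than the original. For example, with fixed 12-bit codes, a three-character text with no repeats emits three codes (36 bits) against 24 original bits at 8 bits per character.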
ASCII mode starts with a full 256-symbol dictionary. Unique input symbol mode builds the starting dictionary only from symbols found in the entered text.
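The two starting dictionaries can be sketched in Python as follows. The function name `initial_dictionary` and the mode strings are hypothetical, and assigning codes to unique symbols in sorted order is an assumption. A tool could equally number them in order of first appearance.

```python
def initial_dictionary(text, mode="ascii"):
    """Build the starting symbol-to-code map for the chosen mode."""
    if mode == "ascii":
        symbols = [chr(i) for i in range(256)]  # full 256-symbol dictionary
    elif mode == "unique":
        symbols = sorted(set(text))             # assumption: sorted symbol order
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return {symbol: code for code, symbol in enumerate(symbols)}


def minimum_width(dictionary):
    """Bits needed to represent the largest code in the dictionary."""
    return max(1, (len(dictionary) - 1).bit_length())


ascii_dict = initial_dictionary("ABABABAABABA", mode="ascii")
unique_dict = initial_dictionary("ABABABAABABA", mode="unique")
print(len(ascii_dict), minimum_width(ascii_dict))    # → 256 8
print(unique_dict, minimum_width(unique_dict))       # → {'A': 0, 'B': 1} 1
```

A smaller starting dictionary means new phrases receive low code numbers immediately, which is why short alphabets make dictionary growth easy to follow.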
Dynamic width increases the estimated bits per emitted code as the dictionary grows. It usually reflects real implementations better than a single fixed width.
It checks whether the produced code stream can decode back to the exact original input under the same selected settings and dictionary rules.
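A round-trip check of this kind can be sketched with a small decoder in Python. The name `lzw_decode` and its parameters are hypothetical, and the default 256-entry dictionary is an assumption standing in for ASCII 256 mode. The decoder rebuilds the dictionary from the code stream alone, including the case where a code refers to the entry that is about to be created.

```python
def lzw_decode(codes, initial_symbols=None):
    """Rebuild the original text from a list of LZW codes (illustrative sketch)."""
    if initial_symbols is None:
        initial_symbols = [chr(i) for i in range(256)]  # assumed ASCII 256 mode
    dictionary = {code: sym for code, sym in enumerate(initial_symbols)}
    next_code = len(dictionary)
    if not codes:
        return ""

    previous = dictionary[codes[0]]
    output = [previous]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        elif code == next_code:
            # The code names the entry being built right now:
            # it must be the previous phrase plus its own first symbol.
            entry = previous + previous[0]
        else:
            raise ValueError(f"invalid LZW code: {code}")
        output.append(entry)
        dictionary[next_code] = previous + entry[0]  # mirror the encoder's update
        next_code += 1
        previous = entry
    return "".join(output)


codes = [84, 79, 66, 69, 79, 82, 78, 79, 84, 256, 258, 260, 265, 259, 261, 263]
original = "TOBEORNOTTOBEORTOBEORNOT"
print(lzw_decode(codes) == original)  # → True
```

The check passes only when the decoder starts from the same initial dictionary and follows the same growth rules as the encoder.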
Yes. Spaces, tabs, and line breaks are accepted. The trace table displays them with readable markers like ␠ and \n.
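Making whitespace visible in a trace can be sketched in a few lines of Python. The mapping below is an assumption: the markers for spaces and line breaks follow the ones named above, while the tab marker and the function name `show_whitespace` are hypothetical.

```python
# Assumed marker table; only the space and line-break markers come from the text above.
MARKERS = {" ": "␠", "\n": "\\n", "\t": "\\t"}


def show_whitespace(phrase):
    """Replace whitespace characters with readable markers for display."""
    return "".join(MARKERS.get(ch, ch) for ch in phrase)


print(show_whitespace("to be\n"))  # → to␠be\n
```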
Dictionary saturation shows how close the final dictionary came to the selected maximum size. High saturation means the dictionary limit strongly influenced the encoding path.
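Saturation can be illustrated with a Python sketch that encodes under a dictionary limit. Two assumptions are made here: saturation is taken to be the final dictionary size divided by the maximum size, and a full dictionary simply stops accepting new phrases (some implementations reset it instead). The function name `encode_with_limit` is hypothetical.

```python
def encode_with_limit(text, initial_symbols, max_dict_size):
    """Encode `text` and return (codes, final dictionary size)."""
    dictionary = {sym: code for code, sym in enumerate(initial_symbols)}
    codes, phrase = [], ""
    for symbol in text:
        candidate = phrase + symbol
        if candidate in dictionary:
            phrase = candidate
            continue
        codes.append(dictionary[phrase])
        if len(dictionary) < max_dict_size:
            dictionary[candidate] = len(dictionary)  # grow only while there is room
        phrase = symbol
    if phrase:
        codes.append(dictionary[phrase])
    return codes, len(dictionary)


ascii_symbols = [chr(i) for i in range(256)]
codes, size = encode_with_limit("TOBEORNOTTOBEORTOBEORNOT", ascii_symbols, 4096)
print(size, round(size / 4096 * 100, 1))  # → 271 6.6

# A tiny limit fills up at once and forces shorter phrases and more codes.
codes, size = encode_with_limit("ABABABAABABA", ["A", "B"], 4)
print(codes, size / 4)  # → [0, 1, 2, 2, 0, 2, 2, 0] 1.0
```

In the second call the dictionary is full after two additions, so the encoder emits eight codes where an unrestricted dictionary would need fewer.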
Yes. The trace lists each phrase, next symbol, output code, width estimate, and dictionary update, which makes manual checking much easier.
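A trace with these columns can be reproduced by hand or with a short Python sketch. The function name `lzw_trace`, the row keys, and the width rule (bits needed for the largest code in the dictionary at the time of emission) are assumptions for illustration. The sketch also assumes every input symbol is present in `initial_symbols`.

```python
def lzw_trace(text, initial_symbols):
    """Return one row per emitted code: phrase, next symbol, code, width, update."""
    dictionary = {sym: code for code, sym in enumerate(initial_symbols)}
    rows, phrase = [], ""
    for symbol in text:
        candidate = phrase + symbol
        if candidate in dictionary:
            phrase = candidate
            continue
        new_code = len(dictionary)
        rows.append({
            "phrase": phrase,
            "next": symbol,
            "code": dictionary[phrase],
            "width": max(1, (len(dictionary) - 1).bit_length()),  # assumed width rule
            "added": f"{candidate}={new_code}",
        })
        dictionary[candidate] = new_code
        phrase = symbol
    if phrase:
        # The final phrase is flushed without a dictionary update.
        rows.append({
            "phrase": phrase,
            "next": "",
            "code": dictionary[phrase],
            "width": max(1, (len(dictionary) - 1).bit_length()),
            "added": "",
        })
    return rows


for row in lzw_trace("ABABABA", ["A", "B"]):
    print(row)
```

Each row can be checked manually: the phrase must already be in the dictionary, and the added entry is always that phrase plus the next symbol.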