Algorithm (II) Class code: AM C623 Lecturer: Huilan Chang Primary Source: CLRS, Introduction to Algorithms, Chapter 16.3
Acquire by end of lecture:
If letter a appears 100× , letter b appears 1× and letter c appears 1× in a file, should they get codewords of the same length?
Key Concept: Variable-length codes — trade short codewords for frequent letters against longer codewords for rare ones. Total bits depend on this trade-off.
Given a raw bitstring like 01011, how can a decoder tell where one character ends and the next begins?
Key Concept The prefix-free property — no codeword may be a prefix of another. Without separators, this property alone keeps decoding unambiguous.
| English Term | Definition | 中文翻譯 |
|---|---|---|
| Codeword | Binary string representing one character | 碼字 |
| Fixed-length code | All codewords share the same length | 定長編碼 |
| Variable-length code | Codewords may differ in length | 變長編碼 |
| Prefix code | No codeword is a prefix of another | 前綴碼 |
| Full binary tree | Every internal node has exactly two children | 完全(滿)二元樹 |
| Frequency $c.\text{freq}$ | Number of occurrences of character $c$ in the file | 字元頻率 |
| Depth $d_T(c)$ | Depth of leaf $c$ in tree $T$ (= codeword length) | 葉子深度 |
| Cost $B(T)$ | Total bits to encode the file | 編碼樹成本 |
| Min-priority queue | Supports INSERT and EXTRACT-MIN |
最小優先佇列 |
| Binary min-heap | Array-based heap with parent $\le$ children | 二元最小堆 |
| Greedy-choice property | Locally optimal choice leads to global optimum | 貪婪選擇性質 |
| Optimal-substructure property | Optimal solution contains optimal subproblem solutions | 最佳子結構性質 |
Intuition. Codeword length and leaf depth are the same number wearing two hats. Keep this duality in mind when reading $B(T)$.
Warning. “Prefix code” $\neq$ “prefix of a codeword.” The property concerns whole codewords, not individual bits.