<< Chapter < Page | Chapter >> Page > |
Only Size needs to be Huffman coded in the above scheme, since, within a given Size, all the input values havesufficiently similar probabilities for there to be little gain from entropy coding the Additional Bits (hence they are codedin simple binary as listed). Each coded Size is followed by the appropriate number of Additional Bits (equal to Size) todefine the sign and magnitude of the coefficient difference exactly.
There are only 12 Sizes to be Huffman coded, so specifying the code table can be very simple and require relatively few bitsin the header.
In JPEG all Huffman code tables are defined in the image header. Each table requires $16+n$ bytes, where $n$ is the number of codewords in the table.
The first 16 bytes list the number of codewords of each length from 1 to 16 bits (codewords longer than 16 bits areforbidden). The remaining $n$ bytes list the decoded output values of the $n$ codewords in ascending codeword order ( $n< 256$ ).
Hence $16+12=28$ bytes are needed to specify the code table for DC coefficients.
The remaining 63 coefs (the AC coefs) of each 64-element vector usually contain many zeros and so are coded with acombined run-amplitude Huffman code.
The codeword represents the run-length of zeros before a non-zero coef and the Size of that coef. This is then followed by the Additional Bits whichdefine the coef amplitude and sign precisely. Size and Additional Bits are defined just as for DC coefs.
This 2-dimensional Huffman code (Run, Size) is efficient because there is a strong correlation between the Size of acoef and the expected Run of zeros which precedes it - small coefs usually follow long runs; larger coefs tend to followshorter runs. No single 2-D event is so probable that the Huffman code becomes inefficient.
In order to keep the code table size $n$ below 256, only the following Run and Size values are coded: $$\mathrm{Run}=0\to 15$$ $$\mathrm{Size}=1\to 10$$ These require 160 codes. Two extra codes, corresponding to (Run,Size) = (0,0) and (15,0) are used for EOB (End-of-block)and ZRL (Zero run length).
EOB is transmitted after the last non-zero coef in a 64-vector. It is the most efficient way of coding the finalrun of zeros. It is omitted in the rare case that the final element of the vector is non-zero.
ZRL is transmitted whenever $\mathrm{Run}> 15$ , and represents a run of 16 zeros (15 zeros and a zero amplitude coef) which can be part of a longer run of anylength. Hence a run of 20 zeros followed by -5 would be coded as
(ZRL) (4,3) 010
When the code tables are defined in the image header, each codeword is assigned to a given (Run,Size) pair by making thedecoded output byte Code Byte equal to ( $16\mathrm{Run}+\mathrm{Size}$ ).
The default JPEG code for (Run,Size) of AC luminance DCT coefficients is summarised below in order of decreasing codeprobability:
(Run,Size) | Code Byte (hex) | Code Word (binary) | (Run,Size) | Code Byte (hex) | Code Word (binary) |
---|---|---|---|---|---|
(0,1) | 01 | 00 | (0,6) | 06 | 1111000 |
(0,2) | 02 | 01 | (1,3) | 13 | 1111001 |
(0,3) | 03 | 100 | (5,1) | 51 | 1111010 |
(EOB) | 00 | 1010 | (6,1) | 61 | 1111011 |
(0,4) | 04 | 1011 | (0,7) | 07 | 11111000 |
(1,1) | 11 | 1100 | (2,2) | 22 | 11111001 |
(0,5) | 05 | 11010 | (7,1) | 71 | 11111010 |
(1,2) | 12 | 11011 | (1,4) | 14 | 111110110 |
(2,1) | 21 | 11100 | ⋮ | ||
(3,1) | 31 | 111010 | (ZRL) | F0 | 11111111001 |
(4,1) | 41 | 111011 | ⋮ |
As an example, let us code the following $8\times 8$ block: $$\begin{pmatrix}-13 & -3 & 2 & 0 & 0 & 0 & 1 & 0\\ 6 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ -1 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ \end{pmatrix}$$ Concerting this to (DC Size) or (Run,Size) and values for the Additional Bits gives:
(4) -13 (0,2) -3 (0,3) 6 (2,2) 2 (3,1) -1 (ZRL) (1,1) 1 (EOB)
101 0010 01 00 100 110 11111001 10 111010 0 11111111001 1100 1 1010
The compressed bitstream for this block is listed on the lower line, assuming that the default Huffman code tables, givenabove, are used.
shows the histogram of probabilities for the (Run,Size) codewords used to code Lennausing the ${Q}_{\mathrm{lum}}$ quantisation matrix. The bin number represents the decoded byte value.
shows the equivalent histogram when the quantisation matrix is $2{Q}_{\mathrm{lum}}$ .
Note the strong similarity between these histograms, despite the fact that represents only $\frac{2}{3}$ as many events. Only the EOB probability changes significantly, because its probability goes up as the number of events (non-zero coefs) per block goes down.
It turns out that the (Run,Size) histogram remains relatively constant over a wide range of image material and acrossdifferent regions of each image. This is because of the strong correlation between the run lengths and expected coefsizes. The number of events per block varies considerably depending on the local activity in the image, but theprobability distribution of those events (except for EOB) changes much less.
and also give the mean bit rates to code Lenna for the two quantisation matrices. Comparingthese with the theoretical entropies from this figure (lower row) we get:
Q matrix | Mean Entropy (b/pel) | JPEG Bit Rate (b/pel) | JPEG efficiency |
---|---|---|---|
${Q}_{\mathrm{lum}}$ | 0.8595 | 0.8709 | 98.7% |
$2{Q}_{\mathrm{lum}}$ | 0.5551 | 0.5595 | 99.21% |
Hence we see the high efficiency of the (Run,Size) code at two quite different compression factors. This tends to apply overa wide range of images and compression factors and is an impressive achievement.
There is even very Little efficiency lost if a single code table is used for many images, which can avoid the need totransmit the $16+n$ (168 bytes) of code definition in the header of each image. Using the recommended JPEG default luminance tables(Annex K.3.3) the above efficiencies drop to 97.35% and 95.74% respectively.
Notification Switch
Would you like to follow the 'Image coding' conversation and receive update notifications?