4.3 MPEG layers 1-3: cosine-modulated filterbanks


Here the "polyphase quadrature" filterbank used in the MPEG audio standards is described in great detail. It has the following practical features: real-valued sub-band outputs, near-perfect reconstruction, and polyphase implementation; and is based on cancellation of adjacent sub-band interference.

Though the uniformly modulated filterbank in Figure 4 from "Uniformly-Modulated Filterbanks" was shown to have the fast implementation in Figure 5 from "Uniformly-Modulated Filterbanks", the sub-band outputs are complex-valued for real-valued input, hence inconvenient (at first glance) for sub-band coding of real-valued data. (Note: In the structure in Figure 4 from "Uniformly-Modulated Filterbanks", it would be reasonable to replace the standard DFT with a real-valued DFT (defined in the notes on transform coding), requiring $\approx N \log_{2} N$ real multiplies when $N$ is a power of 2. Though it is not clear to the author why such a structure was not adopted in the MPEG standards, the cosine-modulated filterbank derived in this section has equivalent performance and, with its polyphase/DCT implementation, equivalent implementation cost.) In this section we propose a closely related filterbank with the following properties.
1. Real-valued sub-band outputs (assuming real-valued inputs),
2. Near-perfect reconstruction,
3. Polyphase/fast-transform implementation.
This turns out to be the filterbank specified in the MPEG-1 and MPEG-2 (layers 1-3) audio compression standards (see ISO/IEC 13818-3).

Filter design

Real-valued Sub-band Outputs: Recall the generic filterbank structure of Figure 1 from "Uniformly-Modulated Filterbanks". For the sub-band outputs to be real-valued (for real-valued input), we require that the impulse responses of $\{H_{i} (z)\}$ and $\{K_{i} (z)\}$ are real-valued. We can ensure this by allocating the $N$ (symmetric) frequency band pairs shown in [link]. The positive and negative halves of each band pair are centered at $\pm ω_{i}$, where $ω_{i} = \frac{(2 i + 1) π}{2 N}$ radians.

Frequency band pairs for the polyphase quadrature filterbank ( $N = 4$ ).

We can consider each filter $H_{i} (z)$ as some combination of symmetric positive-frequency and negative-frequency components
$H_{i} (z) = a_{i} F_{i} (z) + b_{i} G_{i} (z)$
as shown in [link] .

Positive- and negative-frequency decomposition of $H_{i} (ω)$ . Note $K_{i} (ω)$ will have a similar, if not identical, frequency response.

When $b_{i} = a_{i}^{*}$ and the pairs $\{F_{i} (z), G_{i} (z)\}$ are modulated versions of the same real-valued prototype filter $H (z)$, we can show that the impulse response of $H_{i} (z)$ must be real-valued:
$\begin{matrix} H_{i} (z) & = & a_{i} \underbrace{H (e^{- j π \frac{2 i + 1}{2 N}} z)}_{F_{i} (z)} + a_{i}^{*} \underbrace{H (e^{j π \frac{2 i + 1}{2 N}} z)}_{G_{i} (z)} \\ & = & a_{i} \sum_{n} h_{n} e^{j π \frac{2 i + 1}{2 N} n} z^{- n} + a_{i}^{*} \sum_{n} h_{n} e^{- j π \frac{2 i + 1}{2 N} n} z^{- n} \\ & = & \text{Re} (a_{i}) \sum_{n} h_{n} z^{- n} (e^{j π \frac{2 i + 1}{2 N} n} + e^{- j π \frac{2 i + 1}{2 N} n}) + j \, \text{Im} (a_{i}) \sum_{n} h_{n} z^{- n} (e^{j π \frac{2 i + 1}{2 N} n} - e^{- j π \frac{2 i + 1}{2 N} n}) \\ & = & \text{Re} (a_{i}) \sum_{n} h_{n} z^{- n} \cdot 2 \cos (π \frac{2 i + 1}{2 N} n) + j \, \text{Im} (a_{i}) \sum_{n} h_{n} z^{- n} \cdot 2 j \sin (π \frac{2 i + 1}{2 N} n) \\ & = & 2 \sum_{n} [\text{Re} (a_{i}) \cos (π \frac{2 i + 1}{2 N} n) - \text{Im} (a_{i}) \sin (π \frac{2 i + 1}{2 N} n)] h_{n} z^{- n} . \end{matrix}$
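The last step of this derivation can be checked numerically. The sketch below is only an illustration: the function name `modulated_tap` and the particular weight `a` are assumptions made here, not part of the original notes. It computes the factor multiplying $h_{n} z^{- n}$ both in its complex form and in the real form of the final line.

```python
import cmath
import math

def modulated_tap(a, i, n, N):
    """Factor multiplying h_n z^{-n} in H_i(z), computed two ways."""
    theta = math.pi * (2 * i + 1) / (2 * N)
    # Complex form: a e^{j theta n} + a^* e^{-j theta n}
    complex_form = (a * cmath.exp(1j * theta * n)
                    + a.conjugate() * cmath.exp(-1j * theta * n))
    # Real form from the last line of the derivation
    real_form = 2 * (a.real * math.cos(theta * n) - a.imag * math.sin(theta * n))
    return complex_form, real_form

a = cmath.exp(0.7j)  # arbitrary unit-modulus weight, chosen only for illustration
for n in range(8):
    c, r = modulated_tap(a, i=2, n=n, N=4)
    # The imaginary part of the complex form should be at rounding level
    print(n, round(r, 6), abs(c.imag) < 1e-12)
```

For real-valued $h_{n}$, each tap of $H_{i} (z)$ is this real factor times $h_{n}$, which is the sense in which the choice $b_{i} = a_{i}^{*}$ guarantees real-valued sub-band outputs.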
Aliasing Cancellation: Recall again the generic filterbank in Figure 1 from "Uniformly-Modulated Filterbanks". Here we determine conditions on real-valued $\{H_{i} (z)\}$ and $\{K_{i} (z)\}$ which lead to near-perfect reconstruction. It will be insightful to derive an expression for the input to the $i^{th}$ reconstruction filter, $\{y_{i} (n)\}$. The downsample-upsample-cascade equation Equation 14 from "Fundamentals of Multirate Signal Processing" (fourth equation) implies that
$\begin{matrix} Y_{i} (z) & = & \frac{1}{N} \sum_{p = 0}^{N - 1} X_{i} (e^{- j \frac{2 π}{N} p} z) \\ & = & \frac{1}{N} \sum_{p = 0}^{N - 1} H_{i} (e^{- j \frac{2 π}{N} p} z) X (e^{- j \frac{2 π}{N} p} z) \\ & = & \frac{1}{N} \sum_{p = 0}^{N - 1} [a_{i} F_{i} (e^{- j \frac{2 π}{N} p} z) + a_{i}^{*} G_{i} (e^{- j \frac{2 π}{N} p} z)] X (e^{- j \frac{2 π}{N} p} z) \\ & = & \underbrace{\frac{1}{N} [a_{i} F_{i} (z) + a_{i}^{*} G_{i} (z)] X (z)}_{\text{desired}} + \underbrace{\frac{1}{N} \sum_{p = 1}^{N - 1} [a_{i} F_{i} (e^{- j \frac{2 π}{N} p} z) + a_{i}^{*} G_{i} (e^{- j \frac{2 π}{N} p} z)] X (e^{- j \frac{2 π}{N} p} z)}_{\text{undesired images}} . \end{matrix}$
Thus the input to the $i^{th}$ reconstruction filter is corrupted by unwanted spectral images, and the reconstruction filter's job is the removal of these images. The reconstruction filter $K_{i} (z)$ will have a bandpass frequency response similar (or identical) to that of $H_{i} (z)$ illustrated in [link]. Due to practical design considerations, neither $K_{i} (z)$ nor $H_{i} (z)$ will be a perfect bandpass filter, but we will assume that the only significant out-of-band energy passed by these filters will occur in the frequency range just outside of their passbands. (Note the limited “spillover” in [link].) Under these assumptions, the only undesired images in $Y_{i} (ω)$ that will not be completely attenuated by $K_{i} (ω)$ are the images adjacent to $F_{i} (ω)$ and $G_{i} (ω)$. Which indices $p$ in [link] (third equation) are responsible for these adjacent images? [link] (third equation) implies that index $p = ℓ$ shifts the frequency response up by $2 π ℓ / N$ radians. Since the passband centers of $F_{i} (z)$ and $G_{i} (z)$ are $(2 i + 1) π / N$ radians apart, the passband of $G_{i} (e^{- j \frac{2 π}{N} p} z)$ will reside directly to the left of the passband of $F_{i} (z)$ when $p = i$. Similarly, the passband of $G_{i} (e^{- j \frac{2 π}{N} p} z)$ will reside directly to the right of the passband of $F_{i} (z)$ when $p = i + 1$. See [link] for an illustration. Using the same reasoning, the passband of $F_{i} (e^{- j \frac{2 π}{N} p} z)$ will reside directly to the right of the passband of $G_{i} (z)$ when $p = - i$ and directly to the left when $p = - (i + 1)$. The only exceptions to this rule occur when $i = 0$, in which case the images to the right of $G_{i} (z)$ and to the left of $F_{i} (z)$ are desired, and when $i = N - 1$, in which case the images to the left of $G_{i} (z)$ and to the right of $F_{i} (z)$ are desired.

Spectral images of $Y_{i} (ω)$ not completely attenuated by $K_{i} (ω)$ .
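The index bookkeeping for the adjacent images can be verified with exact arithmetic. The following sketch is illustrative only: the helper `shifted_center` and the choice $N = 4$, $i = 2$ are assumptions made here. Frequencies are expressed in units of $π$ and wrapped into $(- 1, 1]$.

```python
from fractions import Fraction

def shifted_center(center, p, N):
    """Band center (in units of pi) after z -> e^{-j 2 pi p / N} z, wrapped to (-1, 1]."""
    w = center + Fraction(2 * p, N)   # index p shifts the response up by 2 pi p / N
    while w > 1:
        w -= 2
    while w <= -1:
        w += 2
    return w

N, i = 4, 2
f_center = Fraction(2 * i + 1, 2 * N)   # center of F_i
g_center = -f_center                    # center of G_i
band_spacing = Fraction(1, N)           # adjacent band centers are pi/N apart

left_image = shifted_center(g_center, i, N)        # image of G_i for p = i
right_image = shifted_center(g_center, i + 1, N)   # image of G_i for p = i + 1
print(f_center, left_image, right_image)           # prints 5/8 3/8 7/8
```

The two images of $G_{i}$ land exactly one band spacing to either side of $F_{i}$, and the same helper with $p = - i$ and $p = - (i + 1)$ places the images of $F_{i}$ on either side of $G_{i}$.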

Based on the arguments above, we can write $\{u_{i} (n)\}$, the output of the $i^{th}$ reconstruction filter, as follows:
$\begin{matrix} U_{i} (z) & = & K_{i} (z) Y_{i} (z) \\ & = & \underbrace{\frac{1}{N} K_{i} (z) [a_{i} F_{i} (z) X (z) + a_{i}^{*} G_{i} (z) X (z)]}_{\text{desired}} \\ & & + \underbrace{\frac{1}{N} K_{i} (z) [a_{i} F_{i} (e^{j \frac{2 π}{N} i} z) X (e^{j \frac{2 π}{N} i} z) + a_{i}^{*} G_{i} (e^{- j \frac{2 π}{N} i} z) X (e^{- j \frac{2 π}{N} i} z)]}_{\text{aliasing from inner undesired images when } 1 \leq i \leq N - 1} \\ & & + \underbrace{\frac{1}{N} K_{i} (z) [a_{i} F_{i} (e^{j \frac{2 π}{N} (i + 1)} z) X (e^{j \frac{2 π}{N} (i + 1)} z) + a_{i}^{*} G_{i} (e^{- j \frac{2 π}{N} (i + 1)} z) X (e^{- j \frac{2 π}{N} (i + 1)} z)]}_{\text{aliasing from outer undesired images when } 0 \leq i \leq N - 2} . \end{matrix}$
The previous equation shows that $U_{i} (z)$ is corrupted by the portions of the undesired images not completely removed by the reconstruction filter $K_{i} (z)$. In the filterbank context, this undesired behavior is referred to as aliasing. But notice that aliasing contributions to the signal $U (z) = \sum_{i} U_{i} (z)$ will vanish if the inner aliasing components in $U_{i} (z)$ cancel the outer aliasing components in $U_{i - 1} (z)$. This happens when
$\begin{matrix} K_{i} (z) [a_{i} F_{i} (e^{j \frac{2 π}{N} i} z) X (e^{j \frac{2 π}{N} i} z) + a_{i}^{*} G_{i} (e^{- j \frac{2 π}{N} i} z) X (e^{- j \frac{2 π}{N} i} z)] \\ = - K_{i - 1} (z) [a_{i - 1} F_{i - 1} (e^{j \frac{2 π}{N} i} z) X (e^{j \frac{2 π}{N} i} z) + a_{i - 1}^{*} G_{i - 1} (e^{- j \frac{2 π}{N} i} z) X (e^{- j \frac{2 π}{N} i} z)] , \end{matrix}$
which occurs under satisfaction of the two conditions below.
$\begin{matrix} a_{i} K_{i} (z) F_{i} (e^{j \frac{2 π}{N} i} z) & = & - a_{i - 1} K_{i - 1} (z) F_{i - 1} (e^{j \frac{2 π}{N} i} z) \\ a_{i}^{*} K_{i} (z) G_{i} (e^{- j \frac{2 π}{N} i} z) & = & - a_{i - 1}^{*} K_{i - 1} (z) G_{i - 1} (e^{- j \frac{2 π}{N} i} z) . \end{matrix}$
We assume from this point on that the real-valued filters $\{H_{i} (z)\}$ and $\{K_{i} (z)\}$ are constructed using modulated versions of a lowpass prototype filter $H (z)$. (This assumption is required for the existence of a polyphase filterbank implementation.)
$\begin{matrix} H_{i} (z) & = & a_{i} F_{i} (z) + a_{i}^{*} G_{i} (z) \\ K_{i} (z) & = & c_{i} F_{i} (z) + c_{i}^{*} G_{i} (z) \end{matrix} \quad \text{where} \quad \left\{ \begin{matrix} F_{i} (z) & = & H (e^{- j \frac{π}{2 N} (2 i + 1)} z) \\ G_{i} (z) & = & H (e^{j \frac{π}{2 N} (2 i + 1)} z) \end{matrix} \right.$
Then condition [link] (upper equation) becomes
$\begin{matrix} a_{i} c_{i} H (e^{- j \frac{π}{2 N} (2 i + 1)} z) H (e^{j \frac{π}{2 N} (2 i - 1)} z) + a_{i} c_{i}^{*} H (e^{j \frac{π}{2 N} (2 i + 1)} z) H (e^{j \frac{π}{2 N} (2 i - 1)} z) \\ = - a_{i - 1} c_{i - 1} H (e^{- j \frac{π}{2 N} (2 i - 1)} z) H (e^{j \frac{π}{2 N} (2 i + 1)} z) - a_{i - 1} c_{i - 1}^{*} H (e^{j \frac{π}{2 N} (2 i - 1)} z) H (e^{j \frac{π}{2 N} (2 i + 1)} z) . \end{matrix}$
Let's take a closer look at the products $H (e^{- j \frac{π}{2 N} (2 i + 1)} z) H (e^{j \frac{π}{2 N} (2 i - 1)} z)$ and $H (e^{- j \frac{π}{2 N} (2 i - 1)} z) H (e^{j \frac{π}{2 N} (2 i + 1)} z)$ in the previous equation. As illustrated in [link], these products equal zero when $1 \leq i \leq N - 1$ since their passbands do not overlap. Setting these products to zero in [link] (bottom equation) yields the condition
$\begin{matrix} a_{i} c_{i}^{*} = - a_{i - 1} c_{i - 1}^{*} \quad \text{for } 1 \leq i \leq N - 1, \end{matrix}$
which can also be shown to satisfy [link] (bottom equation).

Illustration of vanishing terms in [link] (lower equation).

Next we concern ourselves with the requirements on $a_{0}$ and $c_{0}$. Assuming [link] is satisfied, we know that inner aliasing in $U_{i} (z)$ cancels outer aliasing in $U_{i - 1} (z)$ for $1 \leq i \leq N - 1$. Hence, from [link] (fourth equation) and [link] (lower equation),
$\begin{matrix} U (z) & = & \sum_{i = 0}^{N - 1} U_{i} (z) \\ & = & \frac{1}{N} \sum_{i = 0}^{N - 1} K_{i} (z) H_{i} (z) X (z) \\ & = & \frac{1}{N} \sum_{i = 0}^{N - 1} [c_{i} H (e^{- j \frac{π}{2 N} (2 i + 1)} z) + c_{i}^{*} H (e^{j \frac{π}{2 N} (2 i + 1)} z)] \\ & & \cdot [a_{i} H (e^{- j \frac{π}{2 N} (2 i + 1)} z) + a_{i}^{*} H (e^{j \frac{π}{2 N} (2 i + 1)} z)] X (z) . \end{matrix}$
Noting that the passbands of $H (e^{- j \frac{π}{2 N} (2 i + 1)} z)$ and $H (e^{j \frac{π}{2 N} (2 i + 1)} z)$ do not overlap for $1 \leq i \leq N - 2$, we have
$\begin{matrix} U (z) & = & \frac{1}{N} [(a_{0} c_{0}^{*} + a_{0}^{*} c_{0}) H (e^{- j \frac{π}{2 N}} z) H (e^{j \frac{π}{2 N}} z) \\ & & + (a_{N - 1} c_{N - 1}^{*} + a_{N - 1}^{*} c_{N - 1}) H (e^{- j \frac{π}{2 N} (2 N - 1)} z) H (e^{j \frac{π}{2 N} (2 N - 1)} z) \\ & & + \sum_{i = 0}^{N - 1} (a_{i} c_{i} H^{2} (e^{- j \frac{π}{2 N} (2 i + 1)} z) + a_{i}^{*} c_{i}^{*} H^{2} (e^{j \frac{π}{2 N} (2 i + 1)} z))] X (z) . \end{matrix}$
The first two terms in [link] (third equation) represent aliasing components that prevent flat overall response at $ω = 0$ and $ω = π$ , respectively. These aliasing terms vanish when
$\begin{matrix} \begin{matrix} \begin{matrix} a_{0} c_{0}^{*} & = & - a_{0}^{*} c_{0} \\ a_{N - 1} c_{N - 1}^{*} & = & - a_{N - 1}^{*} c_{N - 1} \end{matrix} \end{matrix} \end{matrix}$
What remains is
$\begin{matrix} U (z) & = & \frac{1}{N} \sum_{i = 0}^{N - 1} (a_{i} c_{i} H^{2} (e^{- j \frac{π}{2 N} (2 i + 1)} z) + a_{i}^{*} c_{i}^{*} H^{2} (e^{j \frac{π}{2 N} (2 i + 1)} z)) X (z) . \end{matrix}$
Phase Distortion: Perfect reconstruction requires that the analysis/synthesis system has no phase distortion. To guarantee the absence of phase distortion, we require that the composite system
$\begin{matrix} Q (z) : = \frac{U (z)}{X (z)} = \frac{1}{N} \sum_{i = 0}^{N - 1} [a_{i} c_{i} H^{2} (e^{- j \frac{π}{2 N} (2 i + 1)} z) + a_{i}^{*} c_{i}^{*} H^{2} (e^{j \frac{π}{2 N} (2 i + 1)} z)] \end{matrix}$
has a linear phase response. (Recall that a linear phase response is equivalent to a pure delay in the time domain.) This linear-phase constraint will provide the final condition used to specify the constants $\{a_{i}\}$ and $\{c_{i}\}$. We start by examining the impulse response of $Q (z)$. Using a technique analogous to [link] (fifth equation), we can write
$\begin{matrix} Q (z) = \frac{2}{N} \sum_{n = 0}^{2 M - 2} (\sum_{i = 0}^{N - 1} [\text{Re} (a_{i} c_{i}) \cos (π \frac{2 i + 1}{2 N} n) - \text{Im} (a_{i} c_{i}) \sin (π \frac{2 i + 1}{2 N} n)]) (\sum_{k} h_{k} h_{n - k}) z^{- n} . \end{matrix}$
Above, we have used the property that multiplication in the $z$-domain implies convolution in the time domain. For $Q (z)$ to be linear phase, its impulse response must be symmetric. Let us assume that the prototype filter $H (z)$ is linear phase, so that $\{h_{n}\}$ is symmetric. Then $\sum_{k} h_{k} h_{n - k}$ is symmetric about $n = M - 1$, and thus for linear phase $Q (z)$, we require that the quantity
$\begin{matrix} \sum_{i = 0}^{N - 1} [\text{Re} (a_{i} c_{i}) \cos (π \frac{2 i + 1}{2 N} n) - \text{Im} (a_{i} c_{i}) \sin (π \frac{2 i + 1}{2 N} n)] \end{matrix}$
is symmetric about $n = M - 1$ , i.e.,
$\begin{matrix} \sum_{i = 0}^{N - 1} [\text{Re} (a_{i} c_{i}) \cos (π \frac{2 i + 1}{2 N} (M - 1 + n)) - \text{Im} (a_{i} c_{i}) \sin (π \frac{2 i + 1}{2 N} (M - 1 + n))] \\ = \sum_{i = 0}^{N - 1} [\text{Re} (a_{i} c_{i}) \cos (π \frac{2 i + 1}{2 N} (M - 1 - n)) - \text{Im} (a_{i} c_{i}) \sin (π \frac{2 i + 1}{2 N} (M - 1 - n))] \end{matrix}$
for $n = 0, \dots, M - 1$. Using trigonometric identities, it can be shown that the condition above is equivalent to
$\begin{matrix} 0 & = & \sum_{i = 0}^{N - 1} \sin (π \frac{2 i + 1}{2 N} n) [\text{Re} (a_{i} c_{i}) \sin (π \frac{2 i + 1}{2 N} (M - 1)) + \text{Im} (a_{i} c_{i}) \cos (π \frac{2 i + 1}{2 N} (M - 1))], \end{matrix}$
which is satisfied when
$\begin{matrix} \frac{\text{Im} (a_{i} c_{i})}{\text{Re} (a_{i} c_{i})} & = & - \frac{\sin (π \frac{2 i + 1}{2 N} (M - 1))}{\cos (π \frac{2 i + 1}{2 N} (M - 1))} = \tan (- π \frac{2 i + 1}{2 N} (M - 1)) . \end{matrix}$
Restricting $| a_{i} | = | c_{i} | = 1$ , the previous equation requires that
$\begin{matrix} \begin{matrix} a_{i} c_{i} = e^{- j π \frac{2 i + 1}{2 N} (M - 1)} \end{matrix} . \end{matrix}$
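As a sanity check on the linear-phase argument, the sketch below evaluates the impulse response of $Q (z)$ for a small example and confirms its symmetry about $n = M - 1$. The function name `composite_impulse_response` and the 7-tap prototype are hypothetical, chosen only for illustration. With $a_{i} c_{i} = e^{- j π \frac{2 i + 1}{2 N} (M - 1)}$, the per-band term $\text{Re} (a_{i} c_{i}) \cos (\cdot) - \text{Im} (a_{i} c_{i}) \sin (\cdot)$ collapses to $\cos (π \frac{2 i + 1}{2 N} (n - (M - 1)))$, which is what the code uses.

```python
import math

def composite_impulse_response(h, N):
    """q_n from the expression for Q(z), assuming a_i c_i = exp(-j pi (2i+1)(M-1)/(2N))."""
    M = len(h)
    q = []
    for n in range(2 * M - 1):
        # g_n = sum_k h_k h_{n-k}: the prototype convolved with itself
        g_n = sum(h[k] * h[n - k] for k in range(M) if 0 <= n - k < M)
        # Sum over bands of cos(theta_i (n - (M - 1)))
        band_sum = sum(
            math.cos(math.pi * (2 * i + 1) / (2 * N) * (n - (M - 1)))
            for i in range(N)
        )
        q.append(2.0 / N * band_sum * g_n)
    return q

h = [0.1, 0.4, 0.9, 1.0, 0.9, 0.4, 0.1]  # hypothetical symmetric prototype, M = 7
q = composite_impulse_response(h, N=4)
print(all(abs(q[n] - q[-1 - n]) < 1e-12 for n in range(len(q))))
```

This only demonstrates the symmetry (hence linear phase) of the composite response; it says nothing about how flat $| Q (ω) |$ is, which depends on the prototype design.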
It can be easily verified that the following ${a_{i}}$ and ${c_{i}}$ satisfy conditions [link] , [link] , and [link] :
$\begin{matrix} \begin{matrix} \begin{matrix} a_{i} & = & e^{- j π \frac{M + N - 1}{4 N} (2 i + 1)} \\ c_{i} & = & e^{- j π \frac{M - N - 1}{4 N} (2 i + 1)} . \end{matrix} \end{matrix} \end{matrix}$
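The verification is short enough to run numerically. In this sketch the function names `modulation_constants` and `check_conditions` are assumptions introduced for illustration; the three tests correspond to the adjacent-band cancellation condition, the band-edge conditions at $i = 0$ and $i = N - 1$, and the linear-phase condition.

```python
import cmath
import math

def modulation_constants(i, N, M):
    """a_i and c_i as given in the text."""
    a = cmath.exp(-1j * math.pi * (M + N - 1) * (2 * i + 1) / (4 * N))
    c = cmath.exp(-1j * math.pi * (M - N - 1) * (2 * i + 1) / (4 * N))
    return a, c

def check_conditions(N, M, tol=1e-9):
    consts = [modulation_constants(i, N, M) for i in range(N)]
    # Adjacent-band cancellation: a_i c_i^* = -a_{i-1} c_{i-1}^* for 1 <= i <= N-1
    adjacent = all(
        abs(consts[i][0] * consts[i][1].conjugate()
            + consts[i - 1][0] * consts[i - 1][1].conjugate()) < tol
        for i in range(1, N)
    )
    # Band-edge conditions at i = 0 and i = N-1: a c^* = -a^* c
    edges = all(
        abs(a * c.conjugate() + a.conjugate() * c) < tol
        for a, c in (consts[0], consts[N - 1])
    )
    # Linear phase: a_i c_i = exp(-j pi (2i+1)(M-1)/(2N))
    linear = all(
        abs(a * c - cmath.exp(-1j * math.pi * (2 * i + 1) * (M - 1) / (2 * N))) < tol
        for i, (a, c) in enumerate(consts)
    )
    return adjacent, edges, linear

print(check_conditions(N=32, M=513))  # prints (True, True, True)
```

The checks pass for any $N$ and $M$ because $a_{i} c_{i}^{*} = e^{- j π (2 i + 1) / 2}$ is purely imaginary and flips sign from one band to the next.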
Plugging these into the expression for $H_{i} (z)$ we find that
$\begin{matrix} H_{i} (z) & = & a_{i} H (e^{- j π \frac{2 i + 1}{2 N}} z) + a_{i}^{*} H (e^{j π \frac{2 i + 1}{2 N}} z) \\ & = & \sum_{n = 0}^{M - 1} (a_{i} e^{j π \frac{2 i + 1}{2 N} n} + a_{i}^{*} e^{- j π \frac{2 i + 1}{2 N} n}) h_{n} z^{- n} \\ & = & \sum_{n = 0}^{M - 1} (e^{j π \frac{2 i + 1}{2 N} (n - \frac{M + N - 1}{2})} + e^{- j π \frac{2 i + 1}{2 N} (n - \frac{M + N - 1}{2})}) h_{n} z^{- n} \\ & = & \sum_{n = 0}^{M - 1} \underbrace{2 \cos (π \frac{2 i + 1}{2 N} (n - \frac{M + N - 1}{2})) h_{n}}_{\text{impulse response of } H_{i} (z)} z^{- n} . \end{matrix}$
Repeating this procedure for $K_{i} (z)$ yields
$\begin{matrix} K_{i} (z) & = & \sum_{n = 0}^{M - 1} \underbrace{2 \cos (π \frac{2 i + 1}{2 N} (n - \frac{M - N - 1}{2})) h_{n}}_{\text{impulse response of } K_{i} (z)} z^{- n} . \end{matrix}$
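The two impulse-response expressions translate directly into code. The following is a minimal sketch, assuming a hypothetical function name `cosine_modulated_filters` and an arbitrary symmetric 9-tap prototype (not a designed filter). It also illustrates a consequence of the two formulas that is easy to confirm by substitution: when $\{h_{n}\}$ is symmetric, each synthesis response is the time-reversed analysis response.

```python
import math

def cosine_modulated_filters(h, N):
    """Return (analysis, synthesis): N impulse responses each, built from prototype h."""
    M = len(h)
    analysis, synthesis = [], []
    for i in range(N):
        theta = math.pi * (2 * i + 1) / (2 * N)
        analysis.append(
            [2 * math.cos(theta * (n - (M + N - 1) / 2)) * h[n] for n in range(M)]
        )
        synthesis.append(
            [2 * math.cos(theta * (n - (M - N - 1) / 2)) * h[n] for n in range(M)]
        )
    return analysis, synthesis

# Hypothetical symmetric prototype, used only to exercise the formulas
h = [0.05, 0.2, 0.6, 0.9, 1.0, 0.9, 0.6, 0.2, 0.05]
analysis, synthesis = cosine_modulated_filters(h, N=4)

M = len(h)
time_reversed = all(
    abs(synthesis[i][n] - analysis[i][M - 1 - n]) < 1e-12
    for i in range(4) for n in range(M)
)
print(time_reversed)
```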
At this point we make a few comments on the design of the lowpass prototype $H (z)$. The perfect $H (z)$ would be an ideal linear-phase lowpass filter with cutoff at $ω = π / (2 N)$, as illustrated in [link]. Such a filter would perfectly separate the subbands as well as yield flat composite magnitude response, as per [link]. Unfortunately, however, this perfect filter is not realizable with a finite number of filter coefficients. So, what we really want is a finite-length FIR filter having good frequency selectivity, nearly-flat composite response, and linear phase. The length-512 prototype filter specified in the MPEG standards is such a filter, as evidenced by the responses in [link]. Unfortunately, the standards do not describe how this filter was designed, and a thorough discussion of multirate filter design is outside the scope of this course. For more on prototype filter design, we point the interested reader to page 358 of Vaidyanathan or Crochiere & Rabiner.

Ideal (dashed) and typical (solid) prototype-filter magnitude responses for the cosine-modulated filterbank. Note bandwidth relative to [link] .

Magnitude response $| H (ω) |$ of the MPEG prototype filter and the resulting composite response $| Q (ω) |$, where $N = 32$ and $M = 16 N = 512$.

To conclude, [link] (fourth equation) and [link] give impulse response expressions for a set of real-valued filters that comprise a near-perfectly reconstructing filterbank (under suitable selection of $\{h_{n}\}$). This is commonly referred to as a “cosine-modulated filterbank” because all filters are based on cosine modulations of a real-valued linear-phase lowpass prototype $H (z)$. (Note: The MPEG standards refer to this filterbank as a “polyphase quadrature” filterbank (PQF), the name given to the technique by an early technical paper: Rothweiler ICASSP 83.) The near-perfect reconstruction property follows from the frequency-domain cancellation of adjacent-spectrum aliasing and the lack of phase distortion.
It should be noted that our derivation of the cosine-modulated filterbank is similar to that in Rothweiler ICASSP 83 except for the treatment of phase distortion. See Chapter 8 of Vaidyanathan for a more comprehensive view of cosine-modulated filterbanks.
Polyphase Implementations: Recall the uniformly modulated filterbank in Figure 4 from "Uniformly-Modulated Filterbanks", whose combined modulator-filter coefficients can be constructed using products of the terms $h_{n}$ and $e^{j \frac{π}{N} i n}$. Figure 5 from "Uniformly-Modulated Filterbanks" shows a computationally-efficient polyphase/DFT implementation of the analysis filter which requires only $M$ multiplies and one $N$-dimensional DFT computation for calculation of $N$ subband outputs. We might wonder: Is there a similar polyphase/fast-transform implementation of the cosine-modulated filterbank derived in this section? From [link] (fourth equation), we see that the impulse responses of $\{H_{i} (z)\}$ are products of the terms $h_{n}$ and $\cos (π \frac{2 i + 1}{2 N} (n - \frac{M + N - 1}{2}))$ for $n = 0, \dots, M - 1$. Note that the inverse-DCT matrix $C_{N}^{t}$ can be specified via components with form similar to the cosine term in [link] (fourth equation):
$\begin{matrix} {[C_{N}^{t}]}_{i, n} & = & \sqrt{\frac{2}{N}} α_{n} \cos (π \frac{2 i + 1}{2 N} n) ; \quad i, n = 0, \dots, N - 1, \\ & & \text{for } α_{0} = 1 / \sqrt{2}, \; α_{n \neq 0} = 1 . \end{matrix}$
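For concreteness, the matrix above can be built entry by entry and checked for orthonormality, which is what makes it the inverse of the DCT. This is a direct $O (N^{2})$ sketch with hypothetical helper names (`inverse_dct_matrix`, `gram`); it is not a fast algorithm.

```python
import math

def inverse_dct_matrix(N):
    """[C_N^t]_{i,n} = sqrt(2/N) * alpha_n * cos(pi (2i+1) n / (2N))."""
    rows = []
    for i in range(N):
        row = []
        for n in range(N):
            alpha = 1 / math.sqrt(2) if n == 0 else 1.0
            row.append(
                math.sqrt(2 / N) * alpha * math.cos(math.pi * (2 * i + 1) * n / (2 * N))
            )
        rows.append(row)
    return rows

def gram(C):
    """C times its transpose; equals the identity when C is orthonormal."""
    N = len(C)
    return [[sum(C[r][k] * C[s][k] for k in range(N)) for s in range(N)]
            for r in range(N)]

G = gram(inverse_dct_matrix(8))
print(all(abs(G[r][s] - (1.0 if r == s else 0.0)) < 1e-12
          for r in range(8) for s in range(8)))
```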
Thus it may not be surprising that there exist polyphase/DCT implementations of the cosine-modulated filterbank. Indeed, one such implementation is specified in the MPEG-2 audio compression standard (see ISO/IEC 13818-3). This particular implementation is the focus of the next section.

MPEG filterbank implementation

Since MPEG audio compression standards are so well-known and widespread, a detailed look at the MPEG filterbank implementation is warranted. The cosine-modulated, or polyphase-quadrature, filterbank described in the previous section is used in MPEG Layers 1-3. (The MPEG hierarchy will be described in a later chapter.) This section discusses the specific implementation suggested by the MPEG-2 standard (see ISO/IEC 13818-3).
The MPEG standard specifies 512 prototype filter coefficients, the first of which is zero. To adapt the MPEG filter to our cosine-modulated-filterbank framework, we append a zero-valued $513^{th}$ coefficient so that the resulting MPEG prototype filter becomes symmetric and hence linear phase. Since the standard specifies $N = 32$ frequency bands, we have
$M = 513 = 16 N + 1 .$
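Plugging this value of $M$ into the filter expressions [link] (fourth equation) and [link], we find $\frac{M + N - 1}{2} = \frac{N}{2} + 8 N$ and $\frac{M - N - 1}{2} = - \frac{N}{2} + 8 N$. Since $π \frac{2 i + 1}{2 N} \cdot 8 N = 4 π (2 i + 1)$ is a multiple of $2 π$, the $2 π$-periodicity of the cosine implies that the filters may be rewritten as follows. (The upper summation limit drops to $16 N - 1$ because the appended coefficient $h_{16 N}$ is zero.)
$\begin{matrix} H_{i} (z) & = & \sum_{n = 0}^{16 N - 1} \underbrace{2 \cos (π \frac{2 i + 1}{2 N} (n - \frac{N}{2})) h_{n}}_{\text{impulse response of } H_{i} (z)} z^{- n} \\ K_{i} (z) & = & \sum_{n = 0}^{16 N - 1} \underbrace{2 \cos (π \frac{2 i + 1}{2 N} (n + \frac{N}{2})) h_{n}}_{\text{impulse response of } K_{i} (z)} z^{- n} . \end{matrix}$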
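The periodicity argument can be confirmed by comparing the general cosine terms (with $M = 513$, $N = 32$) against the simplified ones for every band and tap. The function names below are hypothetical and serve only this comparison.

```python
import math

def general_cosine(i, n, N, M, synthesis=False):
    """Cosine factor from the general H_i / K_i impulse-response expressions."""
    theta = math.pi * (2 * i + 1) / (2 * N)
    shift = (M - N - 1) / 2 if synthesis else (M + N - 1) / 2
    return 2 * math.cos(theta * (n - shift))

def mpeg_cosine(i, n, N, synthesis=False):
    """Simplified cosine factor: (n - N/2) for analysis, (n + N/2) for synthesis."""
    theta = math.pi * (2 * i + 1) / (2 * N)
    shift = -N / 2 if synthesis else N / 2
    return 2 * math.cos(theta * (n - shift))

N, M = 32, 513
worst = max(
    abs(general_cosine(i, n, N, M, syn) - mpeg_cosine(i, n, N, syn))
    for i in range(N) for n in range(16 * N) for syn in (False, True)
)
print(worst < 1e-9)  # the two forms agree up to floating-point rounding
```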
Encoding: Here we derive the encoder filterbank implementation suggested in the MPEG-2 standard (see ISO/IEC 13818-3). Using $x_{i} (n)$ to denote the output of the $i^{th}$ analysis filter, we have
$x_{i} (n) = \sum_{k = 0}^{16 N - 1} [2 cos (π \frac{2 i + 1}{2 N} (k - \frac{N}{2})) h_{k}] x (n - k) .$
The relationship between $x_{i} (n)$ and its downsampled version $s_{i} (m)$ is given by
$s_{i} (m) = x_{i} (m N),$
so that the downsampled analysis output $s_{i} (m)$ can be written as
$\begin{matrix} s_{i} (m) & = & \sum_{n = 0}^{16 N - 1} [2 cos (π \frac{2 i + 1}{2 N} (n - \frac{N}{2})) h_{n}] x (m N - n) . \end{matrix}$
Using the substitution $n = k N + ℓ$ for $0 \leq ℓ \leq N - 1$ ,
$\begin{matrix} s_{i} (m) & = & 2 \sum_{k = 0}^{15} \sum_{ℓ = 0}^{N - 1} \underbrace{\cos (π \frac{2 i + 1}{2 N} (k N + ℓ - \frac{N}{2}))}_{\begin{matrix} \text{repeats every 4 increments of } k \\ \text{sign changes every 2 increments of } k \end{matrix}} h_{k N + ℓ} x ((m - k) N - ℓ) \\ & = & \sum_{k = 0}^{15} \sum_{ℓ = 0}^{N - 1} \underbrace{\cos (π \frac{2 i + 1}{2 N} ({⟨ k ⟩}_{2} N + ℓ - \frac{N}{2}))}_{\text{repeats every 2 increments of } k} \underbrace{2 {(- 1)}^{⌊ k / 2 ⌋} h_{k N + ℓ}}_{\text{analysis window}} x ((m - k) N - ℓ) . \end{matrix}$
[link] illustrates this process.

MPEG encoder filterbank implementation suggested in ISO/IEC 13818-3.
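The equivalence between the direct downsampled convolution and the window/fold/matrix form derived above can be checked numerically. The sketch below is an illustration under stated assumptions: the function names are hypothetical, $N = 4$ is used to keep the run short (the derivation does not depend on $N = 32$), and the taps `h` are random numbers, not the coefficient table from the standard.

```python
import math
import random

def analysis_direct(x, h, m, N):
    """s_i(m) computed directly from the 16N-tap downsampled convolution."""
    s = []
    for i in range(N):
        theta = math.pi * (2 * i + 1) / (2 * N)
        s.append(sum(2 * math.cos(theta * (n - N / 2)) * h[n] * x[m * N - n]
                     for n in range(16 * N)))
    return s

def analysis_polyphase(x, h, m, N):
    """Same s_i(m) via windowing, folding to 2N values, and cosine matrixing."""
    # Analysis window 2 (-1)^floor(k/2) h_n applied to x(mN - n), where k = n // N
    z = [2 * (-1) ** ((n // N) // 2) * h[n] * x[m * N - n] for n in range(16 * N)]
    # Fold the 16N windowed samples into 2N values: terms sharing j = n mod 2N
    w = [sum(z[j + 2 * N * r] for r in range(8)) for j in range(2 * N)]
    # Cosine matrixing: s_i = sum_j cos(theta_i (j - N/2)) w_j
    return [sum(math.cos(math.pi * (2 * i + 1) / (2 * N) * (j - N / 2)) * w[j]
                for j in range(2 * N))
            for i in range(N)]

random.seed(0)
N = 4                                               # small N keeps the demo fast
h = [random.uniform(-1, 1) for _ in range(16 * N)]  # arbitrary taps, NOT the MPEG table
x = [random.uniform(-1, 1) for _ in range(20 * N)]
m = 17                                              # late enough that x[mN - n] is in range
direct = analysis_direct(x, h, m, N)
fast = analysis_polyphase(x, h, m, N)
print(max(abs(a - b) for a, b in zip(direct, fast)) < 1e-9)
```

The polyphase form needs $16 N$ window multiplies plus an $N \times 2 N$ cosine matrix per output frame, versus $N \cdot 16 N$ multiplies for the direct form.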
Decoding: Here we derive the decoder filterbank implementation suggested in the MPEG-2 standard (see ISO/IEC 13818-3). Using $y_{i} (n)$ to denote the output of the $i^{th}$ upsampler, the output of the $i^{th}$ reconstruction filter is
$\begin{matrix} u_{i} (n) & = & \sum_{k = 0}^{16 N - 1} [2 \cos (π \frac{2 i + 1}{2 N} (k + \frac{N}{2})) h_{k}] y_{i} (n - k) . \end{matrix}$
The input to the upsampler $s_{i} (m)$ is related to the output $y_{i} (n)$ by
$y_{i} (n) = \left\{ \begin{matrix} s_{i} (n / N) & \text{when } n / N \in Z \\ 0 & \text{else,} \end{matrix} \right.$
so that
$\begin{matrix} u_{i} (n) & = & \sum_{\{k : \frac{n - k}{N} \in Z\}} [2 \cos (π \frac{2 i + 1}{2 N} (k + \frac{N}{2})) h_{k}] s_{i} (\frac{n - k}{N}) . \end{matrix}$
Let's write $n = m N + ℓ$ for $0 \leq ℓ \leq N - 1$ and $k = p N + q$ for $0 \leq q \leq N - 1$. Then due to the restricted ranges of $ℓ$ and $q$,
$\frac{n - k}{N} = m - p + \frac{ℓ - q}{N} \in Z \Rightarrow ℓ = q .$
Using these substitutions in the previous equation for $u_{i} (n)$ ,
$\begin{matrix} u_{i} (m N + ℓ) & = & 2 \sum_{p = 0}^{15} \cos (π \frac{2 i + 1}{2 N} (p N + ℓ + \frac{N}{2})) h_{p N + ℓ} s_{i} (m - p) . \end{matrix}$
Summing $u_{i} (m N + ℓ)$ over $i$ to create $u (m N + ℓ)$,
$\begin{matrix} u (m N + ℓ) & = & 2 \sum_{i = 0}^{N - 1} \sum_{p = 0}^{15} \underbrace{\cos (π \frac{2 i + 1}{2 N} (p N + ℓ + \frac{N}{2}))}_{\begin{matrix} \text{repeats every 4 increments of } p \\ \text{sign changes every 2 increments of } p \end{matrix}} h_{p N + ℓ} s_{i} (m - p) \\ & = & \sum_{p = 0}^{15} \underbrace{2 {(- 1)}^{⌊ p / 2 ⌋} h_{p N + ℓ}}_{\text{synthesis window}} \sum_{i = 0}^{N - 1} \underbrace{\cos (π \frac{2 i + 1}{2 N} ({⟨ p ⟩}_{2} N + ℓ + \frac{N}{2}))}_{= \left\{ \begin{matrix} \cos (π \frac{2 i + 1}{2 N} (ℓ + \frac{N}{2})) & p \text{ even} \\ \cos (π \frac{2 i + 1}{2 N} (ℓ + N + \frac{N}{2})) & p \text{ odd} \end{matrix} \right.} s_{i} (m - p) . \end{matrix}$
If we define
$v_{j} (m) = \sum_{i = 0}^{N - 1} \cos (π \frac{2 i + 1}{2 N} (j + \frac{N}{2})) s_{i} (m) \quad \text{for } 0 \leq j \leq 2 N - 1,$
(note the range of $j$ !) then we can rewrite
$\begin{matrix} u (m N + ℓ) = \sum_{p = 0, 2, \dots, 14} 2 {(- 1)}^{⌊ p / 2 ⌋} h_{p N + ℓ} v_{ℓ} (m - p) + \sum_{p = 1, 3, \dots, 15} 2 {(- 1)}^{⌊ p / 2 ⌋} h_{p N + ℓ} v_{ℓ + N} (m - p) . \end{matrix}$
[link] illustrates the construction of $u (m N + ℓ)$ using the notation
$v (m) = (\begin{matrix} v_{0} (m) & \dots & v_{2 N - 1} (m) \end{matrix}) .$

MPEG decoder filterbank implementation suggested in ISO/IEC 13818-3.
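As with the encoder, the decoder derivation can be checked by comparing the direct synthesis sum against the matrix-then-window form. This sketch makes the same assumptions as before: hypothetical function names, $N = 4$ for speed, and random taps and sub-band samples rather than data from the standard. The synthesis window carries the factor $2 {(- 1)}^{⌊ p / 2 ⌋}$ from the derivation above.

```python
import math
import random

def synthesis_direct(s, h, n, N):
    """u(n): sum over bands of K_i filtering the zero-stuffed sub-band signals."""
    total = 0.0
    for i in range(N):
        theta = math.pi * (2 * i + 1) / (2 * N)
        for k in range(16 * N):
            if (n - k) % N == 0:  # the upsampler output is nonzero only here
                total += 2 * math.cos(theta * (k + N / 2)) * h[k] * s[(n - k) // N][i]
    return total

def synthesis_polyphase(s, h, m, l, N):
    """u(mN + l) via cosine matrixing (v_j) followed by the synthesis window."""
    def v(j, frame):
        return sum(math.cos(math.pi * (2 * i + 1) / (2 * N) * (j + N / 2)) * s[frame][i]
                   for i in range(N))
    total = 0.0
    for p in range(16):
        j = l if p % 2 == 0 else l + N  # even p uses v_l, odd p uses v_{l+N}
        total += 2 * (-1) ** (p // 2) * h[p * N + l] * v(j, m - p)
    return total

random.seed(1)
N = 4
h = [random.uniform(-1, 1) for _ in range(16 * N)]                  # arbitrary taps
s = [[random.uniform(-1, 1) for _ in range(N)] for _ in range(17)]  # frames s[m][i]
m = 16                                                              # needs 16 past frames
errors = [abs(synthesis_direct(s, h, m * N + l, N) - synthesis_polyphase(s, h, m, l, N))
          for l in range(N)]
print(max(errors) < 1e-9)
```

In a practical decoder the vector $v (m)$ would be computed once per frame and stored, rather than recomputed for each $p$ as this sketch does for brevity.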
DCT Implementation of Cosine Matrixing: As seen in [link] and [link], the filterbank implementations suggested by the MPEG standard require a cosine matrix operation that, if implemented using straightforward arithmetic, requires $32 \times 64 = 2048$ multiply/adds at both the encoder and decoder. Note, however, that the cosine transformations in [link] and [link] do bear a great deal of similarity to the DCT:
$\begin{matrix} y_{k} & = & \sqrt{\frac{2}{N}} α_{k} \sum_{n = 0}^{N - 1} x_{n} \cos (π \frac{2 n + 1}{2 N} k) ; \quad k = 0, \dots, N - 1, \\ & & \text{for } α_{0} = 1 / \sqrt{2}, \; α_{k \neq 0} = 1, \\ x_{n} & = & \sqrt{\frac{2}{N}} \sum_{k = 0}^{N - 1} α_{k} y_{k} \cos (π \frac{2 n + 1}{2 N} k) ; \quad n = 0, \dots, N - 1, \end{matrix}$
which we know has a fast algorithm: Lee's $32 \times 32$ fast-DCT, for example, requires only 80 multiplications and 209 additions (see B.G. Lee TASSP Dec 84). So how do we implement the matrix operation using the fast-DCT? A technique has been described clearly in Konstantinides SPL 1994, the results of which are summarized below. At the encoder, the matrix operation can be written
$\begin{matrix} s_{i} (m) & = & \sum_{j = 0}^{2 N - 1} \cos (π \frac{2 i + 1}{2 N} (j - \frac{N}{2})) w_{j} (m) \quad \text{for } i = 0, \dots, N - 1, \end{matrix}$
where $\{w_{0} (m), \dots, w_{2 N - 1} (m)\}$ is created from $\{x (m N), \dots, x (m N - 16 N + 1)\}$ by windowing, shifting, and adding. (See [link].) Equivalently, we can write
$\begin{matrix} s_{i} (m) & = & \sum_{j = 0}^{N - 1} \cos (π \frac{2 i + 1}{2 N} j) {\bar{w}}_{j} (m) ; \quad i = 0, \dots, N - 1, \end{matrix}$
where, for $N = 32$, $\{{\bar{w}}_{j} (m)\}$ is the following manipulation of $\{w_{j} (m)\}$:
$\begin{matrix} {\bar{w}}_{j} (m) : = \left\{ \begin{matrix} w_{16} (m) & j = 0 \\ w_{16 + j} (m) + w_{16 - j} (m) & j = 1, 2, \dots, 16 \\ w_{16 + j} (m) - w_{80 - j} (m) & j = 17, 18, \dots, 31 . \end{matrix} \right. \end{matrix}$
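The folding rule can be tested against the original $32 \times 64$ matrixing. In the sketch below, the function names are hypothetical, the input `w` is random test data, and the plain $O (N^{2})$ sum in `matrix_folded` stands in for a fast DCT-type routine, which is not implemented here.

```python
import math
import random

def matrix_direct(w):
    """s_i = sum_{j=0}^{63} cos(pi (2i+1)/64 (j - 16)) w_j, the 32 x 64 matrixing."""
    return [sum(math.cos(math.pi * (2 * i + 1) / 64 * (j - 16)) * w[j] for j in range(64))
            for i in range(32)]

def fold(w):
    """Map the 64 values w_j to the 32 values wbar_j (N = 32 case from the text)."""
    wbar = [0.0] * 32
    wbar[0] = w[16]
    for j in range(1, 17):
        wbar[j] = w[16 + j] + w[16 - j]
    for j in range(17, 32):
        wbar[j] = w[16 + j] - w[80 - j]
    return wbar

def matrix_folded(w):
    """s_i = sum_{j=0}^{31} cos(pi (2i+1)/64 j) wbar_j, a 32 x 32 DCT-like sum."""
    wbar = fold(w)
    return [sum(math.cos(math.pi * (2 * i + 1) / 64 * j) * wbar[j] for j in range(32))
            for i in range(32)]

random.seed(2)
w = [random.uniform(-1, 1) for _ in range(64)]  # arbitrary test vector
gap = max(abs(a - b) for a, b in zip(matrix_direct(w), matrix_folded(w)))
print(gap < 1e-9)
```

Note that $w_{48} (m)$ never appears in the folded form: in the direct sum it is multiplied by $\cos (π (2 i + 1) / 2) = 0$.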
Compare [link] to the inverse DCT in [link] (lower equation). At the decoder, the matrix operation can be written
$\begin{matrix} v_{j} (m) & = & \sum_{i = 0}^{N - 1} \cos (π \frac{2 i + 1}{2 N} (j + \frac{N}{2})) s_{i} (m) \quad \text{for } j = 0, \dots, 2 N - 1, \end{matrix}$
where $\{v_{0} (m), \dots, v_{2 N - 1} (m)\}$ are windowed, shifted, and added to compute $\{u (m)\}$. (See [link].) It is shown in Konstantinides SPL 1994 that, for $N = 32$, $\{v_{j} (m)\}$ can be calculated by first computing $\{{\bar{v}}_{j} (m)\}$:
$\begin{matrix} {\bar{v}}_{j} (m) & = & \sum_{i = 0}^{N - 1} \cos (π \frac{2 i + 1}{2 N} j) s_{i} (m) ; \quad j = 0, \dots, N - 1 \end{matrix}$
and rearranging the outputs according to
$\begin{matrix} v_{j} (m) : = \left\{ \begin{matrix} {\bar{v}}_{j + 16} (m) & j = 0, 1, \dots, 15, \\ 0 & j = 16, \\ - {\bar{v}}_{48 - j} (m) & j = 17, 18, \dots, 47, \\ - {\bar{v}}_{j - 48} (m) & j = 48, 49, \dots, 63 . \end{matrix} \right. \end{matrix}$
Compare [link] to the DCT in [link] (upper equation).
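The decoder-side rearrangement can be checked the same way. This is a sketch with hypothetical function names and random sub-band samples; the $O (N^{2})$ sum for ${\bar{v}}_{j}$ stands in for a fast DCT-type routine.

```python
import math
import random

def v_direct(s):
    """v_j = sum_i cos(pi (2i+1)/64 (j + 16)) s_i for j = 0..63 (N = 32)."""
    return [sum(math.cos(math.pi * (2 * i + 1) / 64 * (j + 16)) * s[i] for i in range(32))
            for j in range(64)]

def v_rearranged(s):
    """Compute the 32 values vbar_j, then expand to 64 values by the rule in the text."""
    vbar = [sum(math.cos(math.pi * (2 * i + 1) / 64 * j) * s[i] for i in range(32))
            for j in range(32)]
    v = [0.0] * 64
    for j in range(0, 16):
        v[j] = vbar[j + 16]
    v[16] = 0.0
    for j in range(17, 48):
        v[j] = -vbar[48 - j]
    for j in range(48, 64):
        v[j] = -vbar[j - 48]
    return v

random.seed(3)
s = [random.uniform(-1, 1) for _ in range(32)]  # arbitrary sub-band samples
gap = max(abs(a - b) for a, b in zip(v_direct(s), v_rearranged(s)))
print(gap < 1e-9)
```

Only 32 distinct cosine sums are needed per frame; the remaining 32 outputs follow from sign flips and index reversal, which is where the savings over the $32 \times 64$ matrix come from.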

Source: OpenStax, An introduction to source-coding: quantization, dpcm, transform coding, and sub-band coding. OpenStax CNX. Sep 25, 2009 Download for free at http://cnx.org/content/col11121/1.2

Google Play and the Google Play logo are trademarks of Google Inc.

Notification Switch

Would you like to follow the 'An introduction to source-coding: quantization, dpcm, transform coding, and sub-band coding' conversation and receive update notifications?

Ask

	7 Sociology 07 Deviance, Crime, Social Control MCQ By OpenStax Start Quiz
	Vocabulary Week 1-3 By Rachel Woolard Start Quiz
	4 BOD Hemolymphatic -Dr. Han By Brooke Delaney Start Exam
	1 Microbiology Final 1 By Madison Christian Start Quiz
	9 Sociology 09 Social Stratification in the US MCQ By OpenStax Start Quiz
	Nutrition Exam By Hannah Sheth Start Quiz
	Immunology Practice Test By Sandhills MLT Start Test
©flickr: Iqbal	Liver Cancer By Darlene Paliswat Start Test
	1 Endocrinology (MCQ) By Rohini Ajay Start Quiz
	7 Java Messaging Service By JavaChamp Team Start Quiz