Introduces the Karhunen-Loève transform, with applications.

Define the random vector

$$X = [x_1\ x_2\ \cdots\ x_N]^T$$

with mean zero and covariance matrix $R_X = E[XX^*]$; this matrix is Hermitian and positive semidefinite.

Lemma 1 Every eigenvalue of $R_X$ is real and non-negative.

Let $e$ be an eigenvector of $R_X$ with eigenvalue $\lambda$. Then

$$\lambda \|e\|^2 = \lambda \langle e, e \rangle = \langle \lambda e, e \rangle = \langle R_X e, e \rangle \geq 0.$$

The last inequality follows from the definition of positive semidefiniteness. We have $\lambda \|e\|^2 \geq 0$; since $\|e\|^2 > 0$, it follows that $\lambda \geq 0$, i.e., all the eigenvalues are non-negative. The eigenvectors of the matrix $R_X$ provide an orthonormal basis $\{\varphi_1, \varphi_2, \ldots, \varphi_N\}$, which can be collected into an orthonormal basis matrix $\Phi = [\varphi_1\ \varphi_2\ \cdots\ \varphi_N]$. Then let $y = \Phi^* x$. We have:

$$R_Y = E[yy^*] = E[\Phi^* x x^* \Phi] = \Phi^* E[xx^*] \Phi = \Phi^* R_X \Phi.$$

Let us look at the adjoint of $\Phi^* R_X$:

$$(\Phi^* R_X)^* = R_X \Phi = R_X [\varphi_1\ \varphi_2\ \cdots\ \varphi_N] = [\lambda_1 \varphi_1\ \lambda_2 \varphi_2\ \cdots\ \lambda_N \varphi_N].$$

If we take the adjoint again, we get

$$\Phi^* R_X = \begin{bmatrix} \lambda_1 \varphi_1^* \\ \lambda_2 \varphi_2^* \\ \vdots \\ \lambda_N \varphi_N^* \end{bmatrix}.$$

Going back to our derivation of $R_Y$:

$$R_Y = \Phi^* R_X \Phi = \begin{bmatrix} \lambda_1 \varphi_1^* \\ \vdots \\ \lambda_N \varphi_N^* \end{bmatrix} [\varphi_1\ \cdots\ \varphi_N] = \begin{bmatrix} \lambda_1 \langle \varphi_1, \varphi_1 \rangle & \lambda_1 \langle \varphi_1, \varphi_2 \rangle & \cdots & \lambda_1 \langle \varphi_1, \varphi_N \rangle \\ \lambda_2 \langle \varphi_2, \varphi_1 \rangle & \lambda_2 \langle \varphi_2, \varphi_2 \rangle & \cdots & \lambda_2 \langle \varphi_2, \varphi_N \rangle \\ \vdots & \vdots & \ddots & \vdots \\ \lambda_N \langle \varphi_N, \varphi_1 \rangle & \lambda_N \langle \varphi_N, \varphi_2 \rangle & \cdots & \lambda_N \langle \varphi_N, \varphi_N \rangle \end{bmatrix} = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_N \end{bmatrix}.$$

The matrix $\Phi$ is known as the KLT matrix defined by $R_X$. The transformation given by the KLT matrix provides a set of random variables $y_i = \langle \varphi_i, x \rangle$ that are uncorrelated.
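This decorrelation is easy to check numerically. The following sketch (hypothetical synthetic data; NumPy assumed available) builds an empirical covariance matrix, applies the KLT, and verifies that the transformed covariance is diagonal:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical correlated data: mixing white noise through A gives covariance ~ A A^T.
N, n_samples = 4, 100_000
A = rng.standard_normal((N, N))
samples = A @ rng.standard_normal((N, n_samples))

R_X = samples @ samples.T / n_samples      # empirical covariance (zero-mean data)
eigvals, Phi = np.linalg.eigh(R_X)         # columns of Phi: orthonormal eigenvectors

# KLT: y = Phi^* x applied to every sample.
Y = Phi.T @ samples
R_Y = Y @ Y.T / n_samples                  # equals Phi^T R_X Phi = diag(eigvals)

off_diag = R_Y - np.diag(np.diag(R_Y))
print(np.max(np.abs(off_diag)))            # ~0: the coefficients y_i are uncorrelated
```

The eigenvalues returned by `eigh` are exactly the variances of the KLT coefficients $y_i$, matching the diagonal of $R_Y$ derived above.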

Example 1 (Whitening Filter) For a random vector $X$ whose covariance matrix $R_X$ is invertible, all eigenvalues are strictly positive. Let us write $R_Y^{-1/2} = \mathrm{diag}(\lambda_1^{-1/2}, \ldots, \lambda_N^{-1/2})$ and $z = R_Y^{-1/2} y$, where $y = \Phi^* x$. We have

$$R_Z = E[zz^*] = E[R_Y^{-1/2} y y^* R_Y^{-1/2}] = R_Y^{-1/2} E[yy^*] R_Y^{-1/2} = R_Y^{-1/2} R_Y R_Y^{-1/2} = I.$$

The matrix $R_Y^{-1/2} \Phi^*$ is known as a "whitening filter," as it maps an arbitrary random vector $x$ to a vector $z$ with identity covariance, i.e., uncorrelated unit-variance ("white") entries; if $x$ is Gaussian, $z$ is white Gaussian noise.
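A minimal numerical sketch of the whitening filter (again with hypothetical data; this assumes $R_X$ is invertible so all eigenvalues are positive):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical correlated zero-mean data.
N, n_samples = 3, 200_000
A = rng.standard_normal((N, N))
X = A @ rng.standard_normal((N, n_samples))

R_X = X @ X.T / n_samples
eigvals, Phi = np.linalg.eigh(R_X)         # eigvals > 0 when R_X is invertible

# Whitening filter W = R_Y^{-1/2} Phi^*.
W = np.diag(eigvals ** -0.5) @ Phi.T
Z = W @ X

R_Z = Z @ Z.T / n_samples                  # equals W R_X W^T = I
print(np.round(R_Z, 6))
```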

Example 2 (Transform Coding) Let $U : \mathbb{C}^n \to \mathbb{C}^n$ be a unitary operator. Assume we have a signal $x \in \mathbb{C}^n$ that we want to send through a channel by sending only $k$ numbers or "items," where $k < n$; in other words, we wish to compress the signal $x$. The block diagram for the compression/transmission system is given in [link].

[Figure: Block diagram for a transform coding system.]

We want to minimize $E[\|x - \hat{x}\|^2]$ for a given $k$ by choosing the optimal transformation $U$. We know $y = U^* x$, which implies $x = U y$ since $U$ is unitary. Therefore,

$$\|x - \hat{x}\| = \|Uy - U\hat{y}\| = \|U(y - \hat{y})\| = \|y - \hat{y}\|.$$

This means that we can minimize $\|y - \hat{y}\|$ in place of $\|x - \hat{x}\|$. For simplicity, we choose a basic means of compression that preserves only the first $k$ entries of $y$:

$$\hat{y}_i = \begin{cases} y_i, & \text{if } i = 1, 2, \ldots, k, \\ 0, & \text{if } i = k+1, k+2, \ldots, n. \end{cases}$$

We then have $E[\|x - \hat{x}\|^2] = E[\|y - \hat{y}\|^2] = E\left[\sum_{i=k+1}^{n} |y_i|^2\right] = \sum_{i=k+1}^{n} E[|y_i|^2]$. Therefore,

$$\min_{\hat{x}} E[\|x - \hat{x}\|^2] = \min_U \sum_{i=k+1}^{n} E[|y_i|^2] = \min_U \sum_{i=k+1}^{n} E[|\langle x, u_i \rangle|^2] = \min_U \sum_{i=k+1}^{n} E[u_i^T x x^T u_i] = \min_U \sum_{i=k+1}^{n} u_i^T E[x x^T] u_i = \min_U \sum_{i=k+1}^{n} u_i^T R_X u_i.$$

It turns out that the choice of transform basis U that minimizes this amount is provided by the eigendecomposition of R X , as specified by the following theorem.

Theorem 1 Let $x$ be a length-$n$ random vector with covariance matrix $R_X = E[xx^*]$ that has eigenvalues $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n \geq 0$ and matching eigenvectors $\varphi_1, \varphi_2, \ldots, \varphi_n$. Let $x_M$ be the orthogonal projection of $x$ onto a subspace $M$ of dimension $k$. Then

$$E[\|x - x_M\|^2] \geq \sum_{i=k+1}^{n} \lambda_i,$$

with equality if $M = \mathrm{span}(\{\varphi_1, \varphi_2, \ldots, \varphi_k\})$.

From equation [link] , we have

$$\min_M E[\|x - x_M\|^2] = \min_U \sum_{i=k+1}^{n} u_i^T R_X u_i = \min_U \sum_{i=k+1}^{n} u_i^T \Phi \Lambda \Phi^T u_i,$$

where R X = Φ Λ Φ T is the eigendecomposition of R X . Now, since

$$\Phi^T u_i = \begin{bmatrix} \langle u_i, \varphi_1 \rangle \\ \langle u_i, \varphi_2 \rangle \\ \vdots \\ \langle u_i, \varphi_n \rangle \end{bmatrix} \quad \text{and} \quad \Lambda \Phi^T u_i = \begin{bmatrix} \lambda_1 \langle u_i, \varphi_1 \rangle \\ \lambda_2 \langle u_i, \varphi_2 \rangle \\ \vdots \\ \lambda_n \langle u_i, \varphi_n \rangle \end{bmatrix},$$

we have that $u_i^T \Phi \Lambda \Phi^T u_i = (\Phi^T u_i)^T \Lambda \Phi^T u_i = \sum_{j=1}^{n} |\langle u_i, \varphi_j \rangle|^2 \lambda_j$. Plugging this into [link], we have

$$\min_M E[\|x - x_M\|^2] = \min_U \sum_{i=k+1}^{n} \sum_{j=1}^{n} |\langle u_i, \varphi_j \rangle|^2 \lambda_j = \min_U \sum_{j=1}^{n} \lambda_j \sum_{i=k+1}^{n} |\langle u_i, \varphi_j \rangle|^2.$$

Now, denote $\alpha_j = \sum_{i=k+1}^{n} |\langle u_i, \varphi_j \rangle|^2$, and see that

$$\sum_{j=1}^{n} \alpha_j = \sum_{j=1}^{n} \sum_{i=k+1}^{n} |\langle u_i, \varphi_j \rangle|^2 = \sum_{i=k+1}^{n} \sum_{j=1}^{n} |\langle u_i, \varphi_j \rangle|^2 = \sum_{i=k+1}^{n} \|u_i\|^2 = n - k,$$

as all $u_i$ are unit-norm; note also that $\alpha_j \leq \|\varphi_j\|^2 = 1$ by Bessel's inequality, so $0 \leq \alpha_j \leq 1$. Now, we have that

$$\min_M E[\|x - x_M\|^2] = \min_U \sum_{j=1}^{n} \lambda_j \alpha_j = \min_U \left[ \sum_{j=1}^{k} \lambda_j \alpha_j - \sum_{j=k+1}^{n} \lambda_j (1 - \alpha_j) + \sum_{j=k+1}^{n} \lambda_j \right].$$

Since the $\lambda_j$ are monotonically non-increasing, we have $\lambda_j \geq \lambda_k$ for $j \leq k$ and $\lambda_j \leq \lambda_k$ for $j > k$, and therefore

$$\begin{aligned} \min_M E[\|x - x_M\|^2] &\geq \min_U \left[ \sum_{j=1}^{k} \lambda_k \alpha_j - \sum_{j=k+1}^{n} \lambda_k (1 - \alpha_j) + \sum_{j=k+1}^{n} \lambda_j \right] \\ &= \min_U \left[ \lambda_k \left( \sum_{j=1}^{k} \alpha_j - \sum_{j=k+1}^{n} 1 + \sum_{j=k+1}^{n} \alpha_j \right) + \sum_{j=k+1}^{n} \lambda_j \right] \\ &= \min_U \left[ \lambda_k \left( \sum_{j=1}^{n} \alpha_j - (n - k) \right) + \sum_{j=k+1}^{n} \lambda_j \right] \\ &= \lambda_k \big( (n - k) - (n - k) \big) + \sum_{j=k+1}^{n} \lambda_j = \sum_{j=k+1}^{n} \lambda_j. \end{aligned}$$

If we set $M = \mathrm{span}(\{\varphi_1, \varphi_2, \ldots, \varphi_k\})$ (i.e., $U = \Phi$), then it is easy to check that

$$E[\|x - x_M\|^2] = \sum_{j=k+1}^{n} \lambda_j,$$

proving the theorem.
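The theorem can be sanity-checked numerically: project hypothetical data onto the span of the top $k$ eigenvectors of its empirical covariance and compare the mean squared error with the tail eigenvalue sum (a sketch, assuming NumPy):

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical zero-mean data with a full-rank covariance.
n, k, n_samples = 6, 3, 200_000
A = rng.standard_normal((n, n))
X = A @ rng.standard_normal((n, n_samples))

R_X = X @ X.T / n_samples
eigvals, Phi = np.linalg.eigh(R_X)
order = np.argsort(eigvals)[::-1]          # sort so that lambda_1 >= lambda_2 >= ...
eigvals, Phi = eigvals[order], Phi[:, order]

# Orthogonal projector onto M = span{phi_1, ..., phi_k}.
P = Phi[:, :k] @ Phi[:, :k].T
err = np.mean(np.sum((X - P @ X) ** 2, axis=0))

print(err, eigvals[k:].sum())              # the projection error equals the tail sum
```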

Example 3 (Transform Coding) Transform coding is a common scheme for data compression that leverages the Karhunen-Loève transform. Examples include JPEG and MP3. In particular, JPEG can be broadly described as follows:

  1. Take the image $x$ and partition it into tiles of size $8 \times 8$. We assume that the tiles are draws from a random vector $X$, i.e., the tiles $x_1, x_2, \ldots \in \mathbb{R}^{64}$ with $R_X = \frac{1}{n} \sum_{i=1}^{n} x_i x_i^T$.
  2. Compute the KLT of the tile random vector $X$ from $R_X$ by obtaining its eigendecomposition $R_X = \Phi \Lambda \Phi^T$.
  3. Compute KLT coefficients for each tile as $c_i = \Phi^T x_i$.
  4. Pick as many coefficients of $c_i$ as allowed by communications or storage constraints; save them as the compressed image.
  5. Load the saved coefficients and append zeros to build the coefficient vector $\hat{c}_i$.
  6. Run the inverse KLT to obtain the decompressed tiles $\hat{x}_i = \Phi \hat{c}_i$.
  7. Reassemble the image from the decompressed tiles.
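The steps above can be sketched in code. The following is a toy illustration, not the JPEG standard: the "image" is synthetic smoothed noise, tile means are not removed, and coefficient selection simply keeps the $k$ largest-variance KLT coefficients:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical "image": smoothed noise, so neighboring pixels are correlated.
img = rng.standard_normal((64, 64))
for _ in range(4):
    img = (img + np.roll(img, 1, axis=0) + np.roll(img, 1, axis=1)) / 3

# 1. Tile into 8x8 blocks; each tile becomes one column vector in R^64.
tiles = img.reshape(8, 8, 8, 8).swapaxes(1, 2).reshape(-1, 64).T
R_X = tiles @ tiles.T / tiles.shape[1]

# 2. KLT from the eigendecomposition of R_X (eigenvectors sorted by variance).
eigvals, Phi = np.linalg.eigh(R_X)
Phi = Phi[:, np.argsort(eigvals)[::-1]]

# 3. KLT coefficients for each tile.
C = Phi.T @ tiles

# 4./5. Keep only the first k coefficients; zero-pad the rest.
k = 16
C_hat = np.zeros_like(C)
C_hat[:k] = C[:k]

# 6./7. Inverse KLT and reassemble the image from the decompressed tiles.
tiles_hat = Phi @ C_hat
img_hat = tiles_hat.T.reshape(8, 8, 8, 8).swapaxes(1, 2).reshape(64, 64)

rel_err = np.linalg.norm(img - img_hat) / np.linalg.norm(img)
print(f"kept {k}/64 coefficients per tile, relative error {rel_err:.3f}")
```

Because the KLT concentrates the tile energy into the leading coefficients, keeping a quarter of them reproduces the smoothed image with modest error.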

In practice, it is not desirable to recompute the KLT for each individual image. Thus, the JPEG algorithm employs the discrete cosine transform (DCT). It turns out that the DCT is a good approximation of the KLT for tiles of natural images. Additionally, instead of selecting a subset of the coefficients, they are quantized to varying quality/error according to their index and the total amount of bits available.

Source:  OpenStax, Signal theory. OpenStax CNX. Oct 18, 2013 Download for free at http://legacy.cnx.org/content/col11542/1.3