
Objective

Minimize instantaneous squared error

$$ e_k^2(w) = \left( y_k - x_k^T w \right)^2 $$

LMS algorithm

$$ w_k = w_{k-1} + \mu x_k e_k $$
where $w_k$ is the new weight vector, $w_{k-1}$ is the old weight vector, and $\mu x_k e_k$ is a small step in the direction of the negative instantaneous gradient of the squared error.
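As a concrete illustration of one iteration, here is a minimal NumPy sketch (the function and variable names are my own, not part of the module):

```python
import numpy as np

def lms_step(w_prev, x_k, y_k, mu):
    """One LMS iteration: form the instantaneous error with the old
    weights, then step along x_k scaled by mu and that error."""
    e_k = y_k - x_k @ w_prev        # e_k = y_k - x_k^T w_{k-1}
    w_k = w_prev + mu * x_k * e_k   # w_k = w_{k-1} + mu * x_k * e_k
    return w_k, e_k
```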

Interpretation in terms of weight error vector

Define

$$ v_k = w_k - w_{\mathrm{opt}} , $$
where $w_{\mathrm{opt}}$ is the optimal weight vector, and
$$ \epsilon_k = y_k - x_k^T w_{\mathrm{opt}} , $$
where $\epsilon_k$ is the minimum error. The stochastic difference equation is:
$$ v_k = \left( I - \mu x_k x_k^T \right) v_{k-1} + \mu x_k \epsilon_k $$
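Filling in the algebra behind this equation (the intermediate steps are mine, but they follow directly from the definitions above):
$$ \begin{aligned} v_k &= w_k - w_{\mathrm{opt}} = w_{k-1} + \mu x_k e_k - w_{\mathrm{opt}} = v_{k-1} + \mu x_k \left( y_k - x_k^T w_{k-1} \right) \\ &= v_{k-1} + \mu x_k \left( \epsilon_k - x_k^T v_{k-1} \right) = \left( I - \mu x_k x_k^T \right) v_{k-1} + \mu x_k \epsilon_k . \end{aligned} $$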

Convergence/stability analysis

Show that (tightness)

$$ \lim_{B \to \infty} \Pr\left[\, \|v_k\| \ge B \,\right] = 0 . $$
With probability 1, the weight error vector is bounded for all $k$.

Chebyshev's inequality is

$$ \Pr\left[\, \|v_k\| \ge B \,\right] \le \frac{E\left[ \|v_k\|^2 \right]}{B^2} $$
and
$$ \Pr\left[\, \|v_k\| \ge B \,\right] \le \frac{1}{B^2}\left( \left\| E[v_k] \right\|^2 + \mathrm{Var}(v_k) \right) , $$
where $\left\| E[v_k] \right\|^2$ is the squared bias. If $\left\| E[v_k] \right\|^2 + \mathrm{Var}(v_k)$ is finite for all $k$, then $\lim_{B \to \infty} \Pr\left[\, \|v_k\| \ge B \,\right] = 0$ for all $k$.

Also,

$$ \mathrm{Var}(v_k) \le \operatorname{tr}\left( E\left[ v_k v_k^T \right] \right) . $$
Therefore $\mathrm{Var}(v_k)$ is finite if the diagonal elements of $\Gamma_k = E\left[ v_k v_k^T \right]$ are bounded.

Convergence in mean

$E[v_k] \to 0$ as $k \to \infty$. Take the expectation of the stochastic difference equation, using the smoothing property to simplify the calculation (a short sketch follows the list below). We have convergence in mean if

  • $R_{xx}$ is positive definite (invertible).
  • $\mu < \frac{2}{\lambda_{\max}(R_{xx})}$.
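To spell out where these conditions come from (this step is a sketch of the standard argument, not verbatim from the module): take the expectation of the stochastic difference equation, use the orthogonality of $\epsilon_k$ and $x_k$ at the optimum ($E[x_k \epsilon_k] = 0$), and invoke the usual independence assumption that $x_k$ is independent of $v_{k-1}$:
$$ E[v_k] = \left( I - \mu R_{xx} \right) E[v_{k-1}] = \left( I - \mu R_{xx} \right)^k E[v_0] , $$
which decays to zero for any initial condition exactly when every eigenvalue of $I - \mu R_{xx}$ has magnitude less than one, i.e. when $R_{xx}$ is positive definite and $0 < \mu < \frac{2}{\lambda_{\max}(R_{xx})}$.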

Bounded variance

Show that $\Gamma_k = E\left[ v_k v_k^T \right]$, the weight vector error covariance, is bounded for all $k$.

We could have $E[v_k] \to 0$ but $\mathrm{Var}(v_k) \to \infty$, in which case the algorithm would not be stable.
Recall that it is fairly straightforward to show that the diagonal elements of the transformed covariance $C_k = U^T \Gamma_k U$ tend to zero if $\mu < \frac{1}{\lambda_{\max}(R_{xx})}$ ($U$ is the eigenvector matrix of $R_{xx}$; $R_{xx} = U D U^T$). The diagonal elements of $C_k$ were denoted by $\gamma_{k,i}$, $i = 1, \ldots, p$.
$$ \mathrm{Var}(v_k) \le \operatorname{tr}(\Gamma_k) = \operatorname{tr}\left( U C_k U^T \right) = \operatorname{tr}(C_k) = \sum_{i=1}^{p} \gamma_{k,i} $$
Thus, to guarantee boundedness of $\mathrm{Var}(v_k)$ we need to show that the "steady-state" values $\gamma_{k,i} \to \gamma_i < \infty$.

We showed that

$$ \gamma_i = \frac{\mu \left( \alpha + \sigma_\epsilon^2 \right)}{2 \left( 1 - \mu \lambda_i \right)} $$
where $\sigma_\epsilon^2 = E\left[ \epsilon_k^2 \right]$, $\lambda_i$ is the $i$th eigenvalue of $R_{xx}$ ($R_{xx} = U \operatorname{diag}(\lambda_1, \ldots, \lambda_p)\, U^T$), and
$$ \alpha = \frac{c\, \sigma_\epsilon^2}{1 - c} , \qquad 0 \le c = \frac{1}{2} \sum_{i=1}^{p} \frac{\mu \lambda_i}{1 - \mu \lambda_i} < 1 . $$
We found a sufficient condition for $\mu$ that guarantees that the steady-state $\gamma_i$'s (and hence $\mathrm{Var}(v_k)$) are bounded:
$$ \mu < \frac{2}{3 \sum_{i=1}^{p} \lambda_i} , $$
where $\sum_{i=1}^{p} \lambda_i = \operatorname{tr}(R_{xx})$ is the input vector energy.
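Since $\sum_{i=1}^{p} \lambda_i = \operatorname{tr}(R_{xx}) = E\left[ \|x_k\|^2 \right]$, this bound can be estimated directly from data without an eigendecomposition. A minimal sketch, assuming the inputs are available as rows of an array (the helper name and the sample-average estimator are my own choices):

```python
import numpy as np

def lms_step_size_bound(X):
    """Sufficient step-size bound mu < 2 / (3 * tr(Rxx)).

    X is an (N, p) array whose rows are the input vectors x_k;
    tr(Rxx) = E[||x_k||^2] is estimated by the sample mean of the
    squared norms of the rows.
    """
    trace_rxx = np.mean(np.sum(X ** 2, axis=1))
    return 2.0 / (3.0 * trace_rxx)
```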

With this choice of μ we have:

  • convergence in mean
  • bounded steady-state variance
This implies
$$ \lim_{B \to \infty} \Pr\left[\, \|v_k\| \ge B \,\right] = 0 . $$
In other words, the LMS algorithm is stable about the optimum weight vector $w_{\mathrm{opt}}$.

Learning curve

Recall that

$$ e_k = y_k - x_k^T w_{k-1} $$
and $\epsilon_k = y_k - x_k^T w_{\mathrm{opt}}$. These imply
$$ e_k = \epsilon_k - x_k^T v_{k-1} , $$
where $v_k = w_k - w_{\mathrm{opt}}$. So the MSE is
$$ \begin{aligned} E\left[ e_k^2 \right] &= \sigma_\epsilon^2 + E\left[ v_{k-1}^T x_k x_k^T v_{k-1} \right] \\ &= \sigma_\epsilon^2 + E\left[ v_{k-1}^T R_{xx} v_{k-1} \right] && \text{($v_{k-1}$ depends only on $x_n$, $\epsilon_n$, $n < k$)} \\ &= \sigma_\epsilon^2 + \operatorname{tr}\left( R_{xx} E\left[ v_{k-1} v_{k-1}^T \right] \right) \\ &= \sigma_\epsilon^2 + \operatorname{tr}\left( R_{xx} \Gamma_{k-1} \right) \end{aligned} $$
where $\operatorname{tr}\left( R_{xx} \Gamma_{k-1} \right) \to \alpha$ as $k \to \infty$, with $\alpha = \frac{c\, \sigma_\epsilon^2}{1 - c}$. So the limiting MSE is
$$ \epsilon_\infty = \lim_{k \to \infty} E\left[ e_k^2 \right] = \sigma_\epsilon^2 + \frac{c\, \sigma_\epsilon^2}{1 - c} = \frac{\sigma_\epsilon^2}{1 - c} . $$
Since $0 < c < 1$ was required for convergence, $\epsilon_\infty > \sigma_\epsilon^2$, so we see that noisy adaptation leads to an MSE larger than the optimal
$$ E\left[ \epsilon_k^2 \right] = E\left[ \left( y_k - x_k^T w_{\mathrm{opt}} \right)^2 \right] = \sigma_\epsilon^2 . $$
To quantify the increase in the MSE, define the so-called misadjustment :
$$ M = \frac{\epsilon_\infty - \sigma_\epsilon^2}{\sigma_\epsilon^2} = \frac{\epsilon_\infty}{\sigma_\epsilon^2} - 1 = \frac{\alpha}{\sigma_\epsilon^2} = \frac{c}{1 - c} . $$
We would of course like to keep M as small as possible.
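As a quick worked number (not from the original module): if the step size yields $c = 0.1$, then $M = \frac{0.1}{0.9} \approx 0.11$, i.e. the steady-state MSE sits roughly 11% above $\sigma_\epsilon^2$.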

Learning speed and misadjustment trade-off

Fast adaptation and quick convergence require that we take steps as large as possible. In other words, learning speed is proportional to $\mu$; larger $\mu$ means faster convergence. How does $\mu$ affect the misadjustment?

To guarantee convergence/stability we require
$$ \mu < \frac{2}{3 \sum_{i=1}^{p} \lambda_i(R_{xx})} . $$
Let's assume that in fact $\mu \ll \frac{1}{\sum_{i=1}^{p} \lambda_i}$, so that there is no problem with convergence. This condition implies $\mu \ll \frac{1}{\lambda_i}$, or $\mu \lambda_i \ll 1$, $i = 1, \ldots, p$. From here we see that

$$ c = \frac{1}{2} \sum_{i=1}^{p} \frac{\mu \lambda_i}{1 - \mu \lambda_i} \approx \frac{1}{2}\, \mu \sum_{i=1}^{p} \lambda_i \ll 1 . $$
Thus the misadjustment is
$$ M = \frac{c}{1 - c} \approx c \approx \frac{1}{2}\, \mu \sum_{i=1}^{p} \lambda_i . $$
This shows that larger step size μ leads to larger misadjustment.

Since we still have convergence in mean, this essentially means that with a larger step size we "converge" faster but have a larger variance (rattling) about $w_{\mathrm{opt}}$.

Summary

small μ implies

  • small misadjustment in steady-state
  • slow adaptation/tracking
large μ implies
  • large misadjustment in steady-state
  • fast adaptation/tracking

Example:
$$ w_{\mathrm{opt}} = \begin{pmatrix} 1 \\ 1 \end{pmatrix} , \qquad x_k \sim \mathcal{N}\!\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix} , \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \right) , \qquad y_k = x_k^T w_{\mathrm{opt}} + \epsilon_k , \qquad \epsilon_k \sim \mathcal{N}(0,\, 0.01) $$

LMS algorithm

Initialization: $w_0 = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$, and for $k \ge 1$, $w_k = w_{k-1} + \mu x_k e_k$, where $e_k = y_k - x_k^T w_{k-1}$.
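A compact NumPy simulation of this example is sketched below (my own sketch under the setup above; the random seed, trial count, and iteration count are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)

def lms_learning_curve(mu, n_iter=500, n_trials=200):
    """Monte Carlo estimate of the learning curve E[e_k^2] for the
    example: w_opt = (1, 1)^T, x_k ~ N(0, I_2), eps_k ~ N(0, 0.01)."""
    w_opt = np.array([1.0, 1.0])
    sq_err = np.zeros(n_iter)
    for _ in range(n_trials):
        w = np.zeros(2)                        # w_0 = (0, 0)^T
        for k in range(n_iter):
            x = rng.standard_normal(2)         # x_k ~ N(0, I_2)
            eps = 0.1 * rng.standard_normal()  # eps_k ~ N(0, 0.01)
            y = x @ w_opt + eps                # y_k = x_k^T w_opt + eps_k
            e = y - x @ w                      # e_k = y_k - x_k^T w_{k-1}
            w = w + mu * x * e                 # LMS update
            sq_err[k] += e ** 2
    return sq_err / n_trials

# Learning curves for the two step sizes used below: mu = 0.3 converges
# faster but levels off at a larger steady-state MSE than mu = 0.05.
curve_slow = lms_learning_curve(0.05)
curve_fast = lms_learning_curve(0.3)
```

With $\operatorname{tr}(R_{xx}) = 2$ here, the small-step approximation $M \approx \frac{1}{2}\mu \operatorname{tr}(R_{xx}) = \mu$ predicts about 5% excess MSE for $\mu = 0.05$; the approximation is rough for $\mu = 0.3$, which sits near the sufficient bound $\frac{2}{3\operatorname{tr}(R_{xx})} = \frac{1}{3}$.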

LMS learning curve, $\mu = 0.05$ (figure)

LMS learning curve, $\mu = 0.3$ (figure)

Comparison of learning curves (figure)

Source:  OpenStax, Statistical signal processing. OpenStax CNX. Jun 14, 2004 Download for free at http://cnx.org/content/col10232/1.1