
Recall the Wiener filter problem

$\{x_k\}$, $\{d_k\}$ jointly wide-sense stationary

Find $W$ minimizing $E[\epsilon_k^2]$, where

$\epsilon_k = d_k - y_k = d_k - \sum_{i=0}^{M-1} w_i x_{k-i} = d_k - {X^k}^T W^k$

$X^k = \begin{pmatrix} x_k \\ x_{k-1} \\ \vdots \\ x_{k-M+1} \end{pmatrix}$, $W^k = \begin{pmatrix} w_0^k \\ w_1^k \\ \vdots \\ w_{M-1}^k \end{pmatrix}$

The superscript denotes absolute time, and the subscript denotes time or a vector index.

The solution can be found by setting the gradient to $0$:

$\nabla^k = \frac{\partial E[\epsilon_k^2]}{\partial W} = E\left[2\epsilon_k \frac{\partial \epsilon_k}{\partial W}\right] = E\left[-2(d_k - {X^k}^T W^k)X^k\right] = -2E[d_k X^k] + 2E[X^k {X^k}^T]W = -2P + 2RW$
$W_{\mathrm{opt}} = R^{-1}P$

Alternatively, $W_{\mathrm{opt}}$ can be found iteratively using a gradient descent technique:

$W^{k+1} = W^k - \mu\nabla^k$

In practice, we don't know $R$ and $P$ exactly, and in an adaptive context they may be slowly varying with time.
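The closed-form solution $W_{\mathrm{opt}} = R^{-1}P$ can be sketched in NumPy. This is a minimal sketch, not part of the notes: the function name is made up, and estimating $R$ and $P$ by sample averages over a finite record is an assumption (the notes point out that the true statistics are unknown in practice).

```python
import numpy as np

def wiener_solution(x, d, M):
    """Closed-form Wiener filter W_opt = R^{-1} P.

    R and P are estimated by sample averages over the data record
    (an assumption; the true statistics are unknown in practice).
    """
    N = len(x)
    R = np.zeros((M, M))
    P = np.zeros(M)
    for k in range(M - 1, N):
        X = x[k::-1][:M]          # X^k = (x_k, x_{k-1}, ..., x_{k-M+1})^T
        R += np.outer(X, X)       # accumulate E[X^k X^k^T]
        P += d[k] * X             # accumulate E[d_k X^k]
    K = N - M + 1
    return np.linalg.solve(R / K, P / K)   # W_opt = R^{-1} P
```

With a white input and a desired signal generated by an order-$M$ FIR filter, this recovers the true tap weights.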

To find the (approximate) Wiener filter, some approximations are necessary. As always, the key is to make the right approximations!

• Approximate $R$ and $P$: RLS methods, as discussed last time.
• Approximate the gradient! $\nabla^k = \frac{\partial E[\epsilon_k^2]}{\partial W}$
Note that $\epsilon_k^2$ itself is a very noisy approximation to $E[\epsilon_k^2]$. We can get a noisy approximation to the gradient by finding the gradient of $\epsilon_k^2$! Widrow and Hoff first published the LMS algorithm, based on this clever idea, in 1960.

$\hat{\nabla}^k = \frac{\partial \epsilon_k^2}{\partial W} = 2\epsilon_k \frac{\partial (d_k - {W^k}^T X^k)}{\partial W} = -2\epsilon_k X^k$

This yields the LMS adaptive filter algorithm:

• $y_k = {W^k}^T X^k = \sum_{i=0}^{M-1} w_i^k x_{k-i}$
• $\epsilon_k = d_k - y_k$
• $W^{k+1} = W^k - \mu\hat{\nabla}^k = W^k - \mu(-2\epsilon_k X^k) = W^k + 2\mu\epsilon_k X^k$  ( $w_i^{k+1} = w_i^k + 2\mu\epsilon_k x_{k-i}$ )
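The three steps above can be sketched as a NumPy loop. This is a minimal sketch under stated assumptions: the function name is made up, the tap weights are initialized to zero, and the step size $\mu$ is a free parameter not fixed by the notes.

```python
import numpy as np

def lms(x, d, M, mu):
    """LMS adaptive filter: one pass over the data.

    Assumptions: W is initialized to zero; mu is a constant
    convergence weight factor chosen by the caller.
    """
    W = np.zeros(M)
    y = np.zeros(len(x))
    e = np.zeros(len(x))
    for k in range(M - 1, len(x)):
        X = x[k::-1][:M]              # X^k = (x_k, ..., x_{k-M+1})^T
        y[k] = W @ X                  # y_k = W^k^T X^k
        e[k] = d[k] - y[k]            # epsilon_k = d_k - y_k
        W = W + 2 * mu * e[k] * X     # W^{k+1} = W^k + 2 mu epsilon_k X^k
    return W, y, e
```

In a system-identification setup with a white input and no measurement noise, the weights converge to the true filter taps.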

The LMS algorithm is often called a stochastic gradient algorithm, since $\hat{\nabla}^k$ is a noisy gradient. This is by far the most commonly used adaptive filtering algorithm, because

• it was the first
• it is very simple
• in practice it works well (except that sometimes it converges slowly)
• it requires relatively little computation
• it updates the tap weights every sample, so it continually adapts the filter
• it tracks slow changes in the signal statistics well

Computational cost of LMS

To compute:    $y_k$      $\epsilon_k$    $W^{k+1}$    Total
multiplies:    $M$        $0$             $M+1$        $2M+1$
adds:          $M-1$      $1$             $M$          $2M$

So the LMS algorithm is $O(M)$ per sample. In fact, it is nicely balanced in that the filter computation and the adaptation require the same amount of computation.

Note that the parameter $\mu$ plays a very important role in the LMS algorithm. It can also be varied with time, but usually a constant $\mu$ ("convergence weight factor") is used, chosen after experimentation for a given application.

large $\mu$: fast convergence, fast adaptivity

small $\mu$: accurate $W$ (less misadjustment error), stability
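This trade-off is easy to see numerically. The sketch below (an illustration, not from the notes: the helper name, the particular step sizes, and the test signal are all assumptions) runs a bare-bones LMS loop twice on the same data and compares how quickly the error dies out.

```python
import numpy as np

def lms_errors(x, d, M, mu):
    """Bare-bones LMS loop (sketch) returning the error sequence."""
    W = np.zeros(M)
    e = np.zeros(len(x))
    for k in range(M - 1, len(x)):
        X = x[k::-1][:M]          # X^k
        e[k] = d[k] - W @ X       # epsilon_k
        W += 2 * mu * e[k] * X    # LMS update
    return e

rng = np.random.default_rng(1)
x = rng.standard_normal(3000)
w_true = np.array([0.5, -0.3, 0.2])
d = np.convolve(x, w_true)[:len(x)]

e_fast = lms_errors(x, d, 3, mu=0.05)    # large mu: fast convergence
e_slow = lms_errors(x, d, 3, mu=0.005)   # small mu: slow convergence
```

Averaging $|\epsilon_k|$ over the early samples shows the large-$\mu$ run converging much faster; with noisy data, the same comparison at steady state would show the large-$\mu$ run paying for this with more misadjustment.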