A newer algorithm called “l1_ls” [link] is based on an interior-point algorithm that uses a preconditioned conjugate gradient (PCG) method to approximately solve linear systems in a truncated-Newton framework. The algorithm exploits the structure of the Hessian to construct its preconditioner; thus, it is a second-order method. Computational results show that about a hundred PCG steps are sufficient for obtaining an accurate reconstruction. This method has typically been shown to be slower than first-order methods, but it could be faster in cases where the true target signal is highly sparse.
As opposed to solving the constrained formulation, an alternative approach is to solve the unconstrained formulation in [link]. A widely used method for solving ${\ell}_{1}$-minimization problems of the form
$$\min_{x} \; \mu {\parallel x\parallel}_{1} + H(x),$$
for a convex and differentiable $H$, is an iterative procedure based on shrinkage (also called soft thresholding; see [link] below). In the context of solving [link] with a quadratic $H$, this method was independently proposed and analyzed in [link], [link], [link], [link], and then further studied or extended in [link], [link], [link], [link], [link], [link]. Shrinkage is a classic method used in wavelet-based image denoising. The shrinkage operator on any scalar component can be defined as follows:
$$\mathrm{shrink}(t,\alpha) = \begin{cases} t-\alpha & \text{if } t > \alpha, \\ 0 & \text{if } -\alpha \le t \le \alpha, \\ t+\alpha & \text{if } t < -\alpha. \end{cases}$$
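The soft-thresholding rule can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard definition of the operator with a non-negative threshold; the names `shrink` and `shrink_vector` are chosen here for illustration.

```python
def shrink(t, alpha):
    """Soft-threshold the scalar t with threshold alpha >= 0."""
    if t > alpha:
        return t - alpha
    if t < -alpha:
        return t + alpha
    return 0.0  # |t| <= alpha: the component is set to zero


def shrink_vector(x, alpha):
    """Apply the scalar operator to each component of the list x."""
    return [shrink(t, alpha) for t in x]


print(shrink_vector([3.0, -0.5, 0.2, -2.0], 1.0))  # [2.0, 0.0, 0.0, -1.0]
```

Components whose magnitude is at most the threshold are zeroed, which is what promotes sparsity in the iterates.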
This concept can be used effectively to solve [link]. In particular, the basic algorithm can be written as the following fixed-point iteration: for $i=1,...,N$, the ${i}^{\mathrm{th}}$ coefficient of $x$ at the ${(k+1)}^{\mathrm{th}}$ time step is given by
$$x_{i}^{k+1} = \mathrm{shrink}\left({\left(x^{k} - \tau \nabla H(x^{k})\right)}_{i}, \mu \tau \right),$$
where $\tau >0$ serves as a step-length for gradient descent (which may vary with $k$) and $\mu $ is as specified by the user. It is easy to see that the larger $\mu $ is, the larger the allowable distance between ${x}^{k+1}$ and ${x}^{k}$. For a quadratic penalty term $H(\cdot)$, the gradient $\nabla H$ can be easily computed as a linear function of ${x}^{k}$; thus each iteration of [link] essentially boils down to a small number of matrix-vector multiplications.
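As a worked instance, assume the quadratic penalty $H(x)={\parallel y-\Phi x\parallel}_{2}^{2}$ with measurement matrix $\Phi $ and measurements $y$ (the choice adopted later in this section). The gradient at ${x}^{k}$ is then

$$\nabla H(x^{k}) = 2\,\Phi^{T}(\Phi x^{k} - y) = -2\,\Phi^{T} r^{k}, \qquad r^{k} = y - \Phi x^{k},$$

where the residual ${r}^{k}$ is notation introduced here for illustration. Under this assumption, one iteration costs one multiplication by $\Phi $ (to form ${r}^{k}$) and one by ${\Phi}^{T}$, followed by componentwise shrinkage.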
The simplicity of the iterative approach is quite appealing from both a computational and a code-design standpoint. Various modifications, enhancements, and generalizations to this approach have been proposed, both to improve the efficiency of the basic iteration in [link] and to extend its applicability to various kinds of $J$ [link], [link], [link]. In practice, the basic iteration in [link] would not be effective without a continuation (or path-following) strategy [link], [link] in which we choose a gradually decreasing sequence of values for the parameter $\mu $ to guide the intermediate iterates towards the final optimal solution.
This procedure is known as continuation; in [link], the performance of an algorithm known as Fixed-Point Continuation (FPC) has been compared favorably with another similar method known as Gradient Projection for Sparse Reconstruction (GPSR) [link] and “l1_ls” [link]. A key aspect of solving the unconstrained optimization problem is the choice of the parameter $\mu $. As discussed above, for CS recovery, $\mu $ may be chosen by trial and error; for the noiseless constrained formulation, we may solve the corresponding unconstrained minimization by choosing a large value for $\mu $.
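The continuation idea (a decreasing sequence of $\mu $ values, with each subproblem warm-started from the previous solution) can be sketched as follows. This is an illustrative sketch, not the FPC implementation: the geometric schedule, the function names, and the scalar toy problem $\min_x \mu |x| + (x-b)^2$ are all assumptions made here to keep the example small.

```python
def shrink(t, alpha):
    """Scalar soft thresholding with threshold alpha >= 0."""
    if t > alpha:
        return t - alpha
    if t < -alpha:
        return t + alpha
    return 0.0


def geometric_mu_schedule(mu_start, mu_target, factor=0.5):
    """Decreasing values mu_start, factor*mu_start, ..., ending exactly at mu_target."""
    if not (0 < mu_target <= mu_start and 0 < factor < 1):
        raise ValueError("need 0 < mu_target <= mu_start and 0 < factor < 1")
    schedule = []
    mu = mu_start
    while mu > mu_target:
        schedule.append(mu)
        mu *= factor
    schedule.append(mu_target)
    return schedule


def continuation(solve, x0, schedule):
    """Solve one subproblem per mu, warm-starting from the previous solution."""
    x = x0
    for mu in schedule:
        x = solve(mu, x)
    return x


def toy_solve(mu, x_init, b=3.0, tau=0.25, iters=50):
    """Shrinkage iterations for the scalar toy problem min_x mu*|x| + (x - b)**2."""
    x = x_init
    for _ in range(iters):
        grad = 2.0 * (x - b)  # gradient of H(x) = (x - b)**2
        x = shrink(x - tau * grad, mu * tau)
    return x


schedule = geometric_mu_schedule(8.0, 1.0)  # [8.0, 4.0, 2.0, 1.0]
x_final = continuation(toy_solve, 0.0, schedule)
print(schedule, x_final)  # x_final is close to shrink(3.0, 0.5) = 2.5
```

Passing the subproblem solver in as a callable keeps the schedule logic independent of which shrinkage-based solver is used for each value of $\mu $.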
In the case of recovery from noisy compressive measurements, a commonly used choice for the convex cost function $H\left(x\right)$ is the square of the norm of the residual. Thus we have:
$$H(x) = {\parallel y - \Phi x\parallel}_{2}^{2}.$$
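For this particular choice of penalty function, [link] reduces, after absorbing constant factors from the gradient into $\tau $ and $\mu $, to the following iteration:
$$x_{i}^{k+1} = \mathrm{shrink}\left({\left(x^{k} + \tau {\Phi}^{T}(y - \Phi x^{k})\right)}_{i}, \mu \tau \right),$$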
which is run until convergence to a fixed point. The algorithm is detailed in pseudocode form below.
Inputs: CS matrix $\Phi $, signal measurements $y$, parameter sequence ${\mu}_{k}$
Outputs: Signal estimate $\widehat{x}$
initialize: $\widehat{x}=0$, $r=y$, $k=0$.
while halting criterion false do
1. $k\leftarrow k+1$
2. $x\leftarrow \widehat{x}+\tau {\Phi}^{T}r$ {take a gradient step; since $r=y-\Phi \widehat{x}$, the descent direction is $+{\Phi}^{T}r$}
3. $\widehat{x}\leftarrow \mathrm{shrink}(x,{\mu}_{k}\tau )$ {perform soft thresholding}
4. $r\leftarrow y-\Phi \widehat{x}$ {update measurement residual}
end while
return $\widehat{x}$
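The pseudocode above can be turned into a short runnable sketch. The following is a minimal, dependency-free Python version under stated assumptions: matrices are lists of rows, the halting criterion is simply "the $\mu $ sequence is exhausted", the step length satisfies $\tau < 2/\lambda_{\max}(\Phi^{T}\Phi)$, and the gradient step uses $+\tau \Phi^{T} r$ (the descent direction for the squared residual, with constant factors absorbed into $\tau $). The function names and the small test problem are illustrative only.

```python
def shrink(t, alpha):
    """Scalar soft thresholding with threshold alpha >= 0."""
    if t > alpha:
        return t - alpha
    if t < -alpha:
        return t + alpha
    return 0.0


def matvec(A, v):
    """Compute A v for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]


def rmatvec(A, v):
    """Compute A^T v without forming the transpose."""
    return [sum(row[j] * vi for row, vi in zip(A, v)) for j in range(len(A[0]))]


def shrinkage_recovery(Phi, y, mus, tau):
    """Iterative shrinkage: one gradient step and one soft threshold per mu in mus."""
    x_hat = [0.0] * len(Phi[0])
    r = list(y)  # residual for the initial estimate x_hat = 0
    for mu in mus:  # assumed halting criterion: stop when mus is exhausted
        g = rmatvec(Phi, r)
        x = [xi + tau * gi for xi, gi in zip(x_hat, g)]          # gradient step
        x_hat = [shrink(t, mu * tau) for t in x]                 # soft thresholding
        r = [yi - pi for yi, pi in zip(y, matvec(Phi, x_hat))]   # update residual
    return x_hat


Phi = [[1.0, 0.0, 1.0],
       [0.0, 1.0, 1.0]]
y = [1.0, 1.0]  # measurements of the sparse signal [0, 0, 1]
x_hat = shrinkage_recovery(Phi, y, mus=[0.1] * 300, tau=0.3)
print(x_hat)  # approximately [0.0, 0.0, 0.95]
```

With these constants the iteration targets $\mu {\parallel x\parallel}_{1}+\frac{1}{2}{\parallel y-\Phi x\parallel}_{2}^{2}$, so the recovered coefficient is biased slightly towards zero ($0.95$ instead of $1$); a smaller $\mu $ reduces this bias. In practice the list-based products would be replaced by an array library.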
It turns out that an efficient method to obtain the solution to the constrained optimization problem in [link] can be devised by solving a small number of unconstrained problems of the form of [link]. These subproblems are commonly referred to as Bregman iterations. A simple version can be written as follows:
$$y^{k+1} = y^{k} + y - \Phi x^{k},$$
$$x^{k+1} = \arg \min_{x} \; J(x) + \frac{\mu}{2}{\parallel \Phi x - y^{k+1}\parallel}_{2}^{2}.$$
The problem in the second step can be solved by the algorithms reviewed above. Bregman iterations were introduced in [link] for constrained total variation minimization problems, and were proved to converge for closed, convex functions $J\left(x\right)$. In [link], the method is applied to [link] for $J\left(x\right)={\parallel x\parallel}_{1}$ and shown to converge in a finite number of steps for any $\mu >0$. For moderate $\mu $, the number of iterations needed is typically fewer than 5. Compared to the alternative approach that solves [link] by directly solving the unconstrained problem in [link] with a very large $\mu $, Bregman iterations are often more stable and sometimes much faster.
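A minimal sketch of Bregman iterations for $J\left(x\right)={\parallel x\parallel}_{1}$ is given below. It assumes the common "add back the residual" form, in which each outer step adds the current residual to a running measurement vector and then solves $\min_x {\parallel x\parallel}_{1}+\frac{\mu}{2}{\parallel \Phi x-y_k\parallel}_{2}^{2}$; the inner solver is a plain shrinkage iteration with a fixed iteration count. The names `y_k`, `solve_unconstrained`, and `bregman`, as well as the iteration counts and test problem, are assumptions made for illustration.

```python
def shrink(t, alpha):
    """Scalar soft thresholding with threshold alpha >= 0."""
    if t > alpha:
        return t - alpha
    if t < -alpha:
        return t + alpha
    return 0.0


def matvec(A, v):
    """Compute A v for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]


def rmatvec(A, v):
    """Compute A^T v without forming the transpose."""
    return [sum(row[j] * vi for row, vi in zip(A, v)) for j in range(len(A[0]))]


def solve_unconstrained(Phi, b, mu, x_init, tau, iters):
    """Approximately minimize ||x||_1 + (mu/2)*||Phi x - b||_2^2 by shrinkage."""
    x = list(x_init)
    for _ in range(iters):
        r = [bi - pi for bi, pi in zip(b, matvec(Phi, x))]
        g = rmatvec(Phi, r)
        # Dividing the objective by mu gives threshold tau / mu for step length tau.
        x = [shrink(xi + tau * gi, tau / mu) for xi, gi in zip(x, g)]
    return x


def bregman(Phi, y, mu, tau, outer_iters=5, inner_iters=200):
    """Bregman iterations: add the residual back, then re-solve the subproblem."""
    x = [0.0] * len(Phi[0])
    y_k = [0.0] * len(y)
    for _ in range(outer_iters):
        residual = [yi - pi for yi, pi in zip(y, matvec(Phi, x))]
        y_k = [a + b for a, b in zip(y_k, residual)]                # step 1
        x = solve_unconstrained(Phi, y_k, mu, x, tau, inner_iters)  # step 2
    return x


Phi = [[1.0, 0.0, 1.0],
       [0.0, 1.0, 1.0]]
y = [1.0, 1.0]  # measurements of the sparse signal [0, 0, 1]
x_breg = bregman(Phi, y, mu=10.0, tau=0.3)
print(x_breg)  # approximately [0.0, 0.0, 1.0]
```

On this toy problem a single unconstrained solve with the same $\mu $ returns a coefficient shrunk towards zero, while adding the residual back removes that bias after the second outer step, without requiring a very large $\mu $.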
All the methods discussed in this section optimize a convex function (usually the ${\ell}_{1}$-norm) over a convex (possibly unbounded) set. This implies guaranteed convergence to the global optimum. In other words, given that the sampling matrix $\Phi $ satisfies the conditions specified in "Signal recovery via ${\ell}_{1}$ minimization", convex optimization methods will recover the underlying signal $x$. In addition, convex relaxation methods also guarantee stable recovery by reformulating the recovery problem as an SOCP or as the unconstrained formulation.