
We now consider the worst-case bound for this error. Using standard properties of the singular value decomposition, it is straightforward to show that if $\Phi$ satisfies the RIP of order $2K$ (with constant $\delta_{2K}$), then the largest singular value of $\Phi_{\Lambda_0}^\dagger$ lies in the range $\left[1/\sqrt{1+\delta_{2K}},\, 1/\sqrt{1-\delta_{2K}}\right]$. Thus, if we consider the worst-case recovery error over all $e$ such that $\|e\|_2 \le \epsilon$, then the recovery error can be bounded by

$$\frac{\epsilon}{\sqrt{1+\delta_{2K}}} \le \|\hat{x} - x\|_2 \le \frac{\epsilon}{\sqrt{1-\delta_{2K}}}.$$

Therefore, if $x$ is exactly $K$-sparse, then the guarantee for the pseudoinverse recovery method, which is given perfect knowledge of the true support of $x$, cannot improve upon the bound in [link] by more than a constant factor.
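As a purely illustrative numerical sketch (not part of the original development), the snippet below performs oracle-assisted pseudoinverse recovery with numpy: it builds a random $\Phi$, aligns a noise vector of norm $\epsilon$ with the worst-case direction for the submatrix $\Phi_{\Lambda_0}$, and verifies that the resulting error equals $\epsilon/s_{\min}$, which the RIP traps inside the interval above. All parameter values and variable names are our own choices.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, K = 100, 256, 5
eps = 0.1

# Random sensing matrix and a K-sparse signal with known support Lambda0.
Phi = rng.normal(size=(M, N)) / np.sqrt(M)
Lambda0 = np.sort(rng.choice(N, size=K, replace=False))
x = np.zeros(N)
x[Lambda0] = rng.normal(size=K)

# Worst-case noise of norm eps: aligned with the left singular vector of
# Phi_{Lambda0} associated with its smallest singular value.
U, s, _ = np.linalg.svd(Phi[:, Lambda0], full_matrices=False)
e = eps * U[:, -1]
y = Phi @ x + e

# Oracle recovery: least squares (pseudoinverse) restricted to the true support.
x_hat = np.zeros(N)
x_hat[Lambda0], *_ = np.linalg.lstsq(Phi[:, Lambda0], y, rcond=None)

# The error equals eps / s_min, which the RIP keeps between
# eps / sqrt(1 + delta_2K) and eps / sqrt(1 - delta_2K).
print(np.linalg.norm(x_hat - x), eps / s[-1])
```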

We now examine a slightly different noise model. Whereas [link] assumed that the noise norm $\|e\|_2$ was small, the theorem below analyzes a different recovery algorithm known as the Dantzig selector in the case where $\|\Phi^T e\|_\infty$ is small [link]. We will see below that this leads to a simple analysis of the performance of this algorithm in Gaussian noise.

Suppose that $\Phi$ satisfies the RIP of order $2K$ with $\delta_{2K} < \sqrt{2} - 1$ and we obtain measurements of the form $y = \Phi x + e$ where $\|\Phi^T e\|_\infty \le \lambda$. Then when $\mathcal{B}(y) = \{z : \|\Phi^T(\Phi z - y)\|_\infty \le \lambda\}$, the solution $\hat{x}$ to [link] obeys

$$\|\hat{x} - x\|_2 \le C_0 \frac{\sigma_K(x)_1}{\sqrt{K}} + C_3 \sqrt{K}\,\lambda,$$

where

$$C_0 = 2\,\frac{1 - (1-\sqrt{2})\delta_{2K}}{1 - (1+\sqrt{2})\delta_{2K}}, \qquad C_3 = \frac{4\sqrt{2}}{1 - (1+\sqrt{2})\delta_{2K}}.$$
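For reference, here is a small sketch (our own, not from the original text) that evaluates these constants for a hypothetical value of $\delta_{2K}$, checking first that the theorem's condition $\delta_{2K} < \sqrt{2} - 1$ holds.

```python
import numpy as np

def dantzig_constants(delta_2k):
    # The theorem requires delta_2K < sqrt(2) - 1, which keeps the denominator positive.
    assert delta_2k < np.sqrt(2) - 1
    denom = 1 - (1 + np.sqrt(2)) * delta_2k
    C0 = 2 * (1 - (1 - np.sqrt(2)) * delta_2k) / denom
    C3 = 4 * np.sqrt(2) / denom
    return C0, C3

print(dantzig_constants(0.2))  # illustrative value of delta_2K
```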

The proof mirrors that of [link]. Since $\|\Phi^T e\|_\infty \le \lambda$, we again have that $x \in \mathcal{B}(y)$, so $\|\hat{x}\|_1 \le \|x\|_1$ and thus [link] applies. We follow a similar approach as in [link] to bound $\langle \Phi h_\Lambda, \Phi h \rangle$. We first note that

$$\|\Phi^T \Phi h\|_\infty \le \|\Phi^T(\Phi \hat{x} - y)\|_\infty + \|\Phi^T(y - \Phi x)\|_\infty \le 2\lambda,$$

where the last inequality again follows since $x, \hat{x} \in \mathcal{B}(y)$. Next, note that $\Phi h_\Lambda = \Phi_\Lambda h_\Lambda$. Using this we can apply the Cauchy-Schwarz inequality to obtain

$$\langle \Phi h_\Lambda, \Phi h \rangle = \langle h_\Lambda, \Phi_\Lambda^T \Phi h \rangle \le \|h_\Lambda\|_2 \, \|\Phi_\Lambda^T \Phi h\|_2.$$

Finally, since $\|\Phi^T \Phi h\|_\infty \le 2\lambda$, we have that every coefficient of $\Phi^T \Phi h$ is at most $2\lambda$ in magnitude, and thus, since $|\Lambda| \le 2K$, $\|\Phi_\Lambda^T \Phi h\|_2 \le \sqrt{2K}\,(2\lambda)$. Thus,

$$\|h\|_2 \le C_0 \frac{\sigma_K(x)_1}{\sqrt{K}} + C_1 \sqrt{2K}\,(2\lambda) = C_0 \frac{\sigma_K(x)_1}{\sqrt{K}} + C_3 \sqrt{K}\,\lambda,$$

as desired.
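To make the optimization behind this theorem concrete, the sketch below casts the Dantzig selector (minimize $\|z\|_1$ subject to $\|\Phi^T(\Phi z - y)\|_\infty \le \lambda$) as a linear program and solves it with scipy. The splitting of $z$ into a magnitude bound $u$, the function name dantzig_selector, and the solver choice are our own assumptions, not part of the original text.

```python
import numpy as np
from scipy.optimize import linprog

def dantzig_selector(Phi, y, lam):
    """Solve  min ||z||_1  s.t.  ||Phi^T (Phi z - y)||_inf <= lam  as an LP."""
    M, N = Phi.shape
    G = Phi.T @ Phi                      # N x N Gram matrix
    b = Phi.T @ y
    I, Z = np.eye(N), np.zeros((N, N))
    # Variables w = [z, u]; minimize sum(u) with |z_i| <= u_i enforced below.
    c = np.concatenate([np.zeros(N), np.ones(N)])
    A_ub = np.block([[ I, -I],           #   z - u <= 0
                     [-I, -I],           #  -z - u <= 0
                     [ G,  Z],           #   G z <= lam + Phi^T y
                     [-G,  Z]])          #  -G z <= lam - Phi^T y
    b_ub = np.concatenate([np.zeros(2 * N), lam + b, lam - b])
    bounds = [(None, None)] * N + [(0, None)] * N
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return res.x[:N]
```

For large $N$ this dense formulation becomes expensive; practical implementations exploit structure or use first-order methods, but the sketch suffices for experimenting with the bound above.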

Gaussian noise

Finally, we also examine the performance of these approaches in the presence of Gaussian noise. The case of Gaussian noise was first considered in [link], which examined the performance of $\ell_0$ minimization with noisy measurements. We now see that [link] and [link] can be leveraged to provide similar guarantees for $\ell_1$ minimization. To simplify our discussion we will restrict our attention to the case where $x \in \Sigma_K = \{x : \|x\|_0 \le K\}$, so that $\sigma_K(x)_1 = 0$ and the error bounds in [link] and [link] depend only on the noise $e$.

To begin, suppose that the coefficients of $e \in \mathbb{R}^M$ are i.i.d. according to a Gaussian distribution with mean zero and variance $\sigma^2$. Since the Gaussian distribution is itself sub-Gaussian, we can apply results such as Corollary 1 from "Concentration of measure for sub-Gaussian random variables" to show that there exists a constant $c_0 > 0$ such that for any $\epsilon > 0$,

$$P\left(\|e\|_2 \ge (1+\epsilon)\sqrt{M}\,\sigma\right) \le \exp\left(-c_0 \epsilon^2 M\right).$$
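A quick Monte Carlo sketch (ours, with arbitrary illustrative parameters) shows how rarely $\|e\|_2$ exceeds $(1+\epsilon)\sqrt{M}\,\sigma$ in practice; the constant $c_0$ is not specified by the bound, so the sketch only checks that the event is rare.

```python
import numpy as np

rng = np.random.default_rng(0)
M, sigma, eps, trials = 1000, 1.0, 0.1, 10000

# Draw many Gaussian noise vectors and estimate P(||e||_2 >= (1 + eps) sqrt(M) sigma).
e = rng.normal(0.0, sigma, size=(trials, M))
norms = np.linalg.norm(e, axis=1)
print(np.mean(norms >= (1 + eps) * np.sqrt(M) * sigma))
```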

Applying this result to [link] with $\epsilon = 1$, we obtain the following result for the special case of Gaussian noise.

Suppose that $\Phi$ satisfies the RIP of order $2K$ with $\delta_{2K} < \sqrt{2} - 1$. Furthermore, suppose that $x \in \Sigma_K$ and that we obtain measurements of the form $y = \Phi x + e$ where the entries of $e$ are i.i.d. $\mathcal{N}(0, \sigma^2)$. Then when $\mathcal{B}(y) = \{z : \|\Phi z - y\|_2 \le 2\sqrt{M}\,\sigma\}$, the solution $\hat{x}$ to [link] obeys

$$\|\hat{x} - x\|_2 \le 8\,\frac{\sqrt{1+\delta_{2K}}}{1 - (1+\sqrt{2})\delta_{2K}}\,\sqrt{M}\,\sigma$$

with probability at least $1 - \exp(-c_0 M)$.
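As an illustration only, the recovery program in this corollary (an $\ell_1$ objective with an $\ell_2$ constraint) is a second-order cone program. A minimal sketch using cvxpy (assuming it is installed; the function name l1_recover is our own) might look as follows, with the constraint radius $2\sqrt{M}\,\sigma$ taken from the definition of $\mathcal{B}(y)$ above.

```python
import cvxpy as cp
import numpy as np

def l1_recover(Phi, y, sigma):
    """min ||z||_1  s.t.  ||Phi z - y||_2 <= 2 sqrt(M) sigma."""
    M, N = Phi.shape
    z = cp.Variable(N)
    constraints = [cp.norm(Phi @ z - y, 2) <= 2 * np.sqrt(M) * sigma]
    cp.Problem(cp.Minimize(cp.norm(z, 1)), constraints).solve()
    return z.value
```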

We can similarly consider [link] in the context of Gaussian noise. If we assume that the columns of $\Phi$ have unit norm, then each coefficient of $\Phi^T e$ is a Gaussian random variable with mean zero and variance $\sigma^2$. Using standard tail bounds for the Gaussian distribution (see Theorem 1 from "Sub-Gaussian random variables"), we have that

$$P\left(\left|\left[\Phi^T e\right]_i\right| \ge t\sigma\right) \le \exp\left(-t^2/2\right)$$

for $i = 1, 2, \ldots, N$. Thus, setting $t = 2\sqrt{\log N}$ and taking the union bound over all $N$ coordinates, we obtain

$$P\left(\|\Phi^T e\|_\infty \ge 2\sqrt{\log N}\,\sigma\right) \le N \exp\left(-2\log N\right) = \frac{1}{N}.$$
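The following Monte Carlo sketch (ours, with illustrative sizes) checks this union bound empirically: it normalizes the columns of a random $\Phi$, draws many noise vectors, and compares the empirical frequency of $\{\|\Phi^T e\|_\infty \ge 2\sqrt{\log N}\,\sigma\}$ against $1/N$.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, sigma, trials = 200, 400, 1.0, 5000

# Random matrix with unit-norm columns, so each entry of Phi^T e is N(0, sigma^2).
Phi = rng.normal(size=(M, N))
Phi /= np.linalg.norm(Phi, axis=0)

e = rng.normal(0.0, sigma, size=(trials, M))
corr = e @ Phi                              # row i holds Phi^T e for the i-th draw
freq = np.mean(np.max(np.abs(corr), axis=1) >= 2 * np.sqrt(np.log(N)) * sigma)
print(freq, 1 / N)                          # empirical frequency vs. the 1/N bound
```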

Applying this to [link], we obtain the following result, which is a simplified version of Theorem 1.1 of [link].

Suppose that $\Phi$ has unit-norm columns and satisfies the RIP of order $2K$ with $\delta_{2K} < \sqrt{2} - 1$. Furthermore, suppose that $x \in \Sigma_K$ and that we obtain measurements of the form $y = \Phi x + e$ where the entries of $e$ are i.i.d. $\mathcal{N}(0, \sigma^2)$. Then when $\mathcal{B}(y) = \{z : \|\Phi^T(\Phi z - y)\|_\infty \le 2\sqrt{\log N}\,\sigma\}$, the solution $\hat{x}$ to [link] obeys

$$\|\hat{x} - x\|_2 \le 4\sqrt{2}\,\frac{\sqrt{1+\delta_{2K}}}{1 - (1+\sqrt{2})\delta_{2K}}\,\sqrt{K \log N}\,\sigma$$

with probability at least $1 - \frac{1}{N}$.
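Putting the pieces together, here is a hypothetical end-to-end run (our illustration, reusing the dantzig_selector sketch from earlier) that forms the threshold $\lambda = 2\sqrt{\log N}\,\sigma$ and compares the recovery error to the corollary's bound for an assumed, unverified value of $\delta_{2K}$.

```python
import numpy as np

rng = np.random.default_rng(1)
M, N, K, sigma = 128, 512, 5, 0.01

# Sensing matrix with unit-norm columns, as the corollary requires.
Phi = rng.normal(size=(M, N))
Phi /= np.linalg.norm(Phi, axis=0)

# K-sparse signal and Gaussian measurement noise.
x = np.zeros(N)
x[rng.choice(N, size=K, replace=False)] = rng.normal(size=K)
y = Phi @ x + sigma * rng.normal(size=M)

# Dantzig selector with the threshold from the corollary.
lam = 2 * np.sqrt(np.log(N)) * sigma
x_hat = dantzig_selector(Phi, y, lam)       # sketch defined earlier

# Error bound for an assumed RIP constant delta_2K (not computable in practice).
delta = 0.2
bound = (4 * np.sqrt(2) * np.sqrt(1 + delta)
         / (1 - (1 + np.sqrt(2)) * delta)) * np.sqrt(K * np.log(N)) * sigma
print(np.linalg.norm(x_hat - x), bound)
```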

Ignoring the precise constants and the probabilities with which the bounds hold (which we have made no effort to optimize), we observe that if $M = O(K \log N)$ then these results appear to be essentially the same. However, there is a subtle difference. Specifically, if $M$ and $N$ are fixed and we consider the effect of varying $K$, we can see that [link] yields a bound that is adaptive to this change, providing a stronger guarantee when $K$ is small, whereas the bound in [link] does not improve as $K$ is reduced. Thus, while they provide very similar guarantees, there are certain circumstances where the Dantzig selector is preferable. See [link] for further discussion of the comparative advantages of these approaches.





Source:  OpenStax, An introduction to compressive sensing. OpenStax CNX. Apr 02, 2011 Download for free at http://legacy.cnx.org/content/col11133/1.5