
Let $X_l$ denote a sequence of independent, identically distributed random variables. Assuming they have zero means and finite variances (equaling $\sigma^2$), the Central Limit Theorem states that the normalized sum $\frac{1}{\sqrt{L}} \sum_{l=1}^{L} X_l$ converges in distribution to a Gaussian random variable:

$$\frac{1}{\sqrt{L}} \sum_{l=1}^{L} X_l \xrightarrow{\;d\;} \mathcal{N}(0, \sigma^2)$$

Because of its generality, this theorem is often used to simplify calculations involving finite sums of non-Gaussian random variables. However, attention is seldom paid to the convergence rate of the Central Limit Theorem. Kolmogorov, the famous twentieth-century mathematician, is reputed to have said, "The Central Limit Theorem is a dangerous tool in the hands of amateurs." Let's see what he meant.
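As a quick illustration (a minimal sketch, not part of the original text), the following Python snippet compares the empirical tail probability of a normalized sum of uniform variates against the Gaussian tail probability; the choice of uniform variates, the seed, and the sample sizes are arbitrary:

```python
import math
import random

random.seed(0)

def normalized_sum(L):
    # zero-mean, unit-variance uniform variates on (-sqrt(3), sqrt(3)),
    # summed and scaled by 1/sqrt(L) as in the CLT statement
    a = math.sqrt(3.0)
    return sum(random.uniform(-a, a) for _ in range(L)) / math.sqrt(L)

def Q(x):
    # Gaussian tail probability via the complementary error function
    return 0.5 * math.erfc(x / math.sqrt(2.0))

trials = 20000
L = 100
exceed = sum(normalized_sum(L) > 1.0 for _ in range(trials)) / trials
print(f"empirical tail at x=1: {exceed:.3f}")
print(f"Gaussian  Q(1):        {Q(1.0):.3f}")  # Q(1) is about 0.159
```

Near the mode the agreement is already good at modest $L$; the discussion below shows why the tails are a different story.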

Taking $\sigma^2 = 1$, the key result is that the magnitude of the difference between $P(x)$, defined to be the probability that the sum given above exceeds $x$, and $Q(x)$, the probability that a unit-variance Gaussian random variable exceeds $x$, is bounded by a quantity inversely related to the square root of $L$ (Cramér: Theorem 24):

$$|P(x) - Q(x)| \le c\, \frac{\mathrm{E}\!\left[|X_l|^3\right]}{\sigma^3}\, \frac{1}{\sqrt{L}} = \frac{c\,\gamma_X}{\sqrt{L}}$$

The constant of proportionality $c$ is a number known to be about 0.8 (Hall: p. 6). The ratio of the absolute third moment of $X_l$ to the cube of its standard deviation, known as the skew and denoted here by $\gamma_X$, depends only on the distribution of $X_l$ and is independent of scale. This bound on the absolute error has been shown to be tight (Cramér: pp. 79ff). Using our lower bound for $Q(x)$, namely $Q(x) \ge \frac{1}{\sqrt{2\pi}}\,\frac{x}{1+x^2}\,e^{-x^2/2}$, we find that the relative error in the Central Limit Theorem approximation to the distribution of finite sums is bounded for $x > 0$ as
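To make the absolute-error bound concrete, here is a short sketch (the function names are our own) that evaluates it for a zero-mean uniform variate, whose skew $\gamma_X = 3\sqrt{3}/4 \approx 1.3$ follows from $\mathrm{E}[|X|^3] = a^3/4$ and $\sigma = a/\sqrt{3}$ on $(-a, a)$:

```python
import math

c = 0.8  # proportionality constant quoted in the text (Hall)

def skew_uniform():
    # gamma_X = E|X|^3 / sigma^3 for a zero-mean uniform variate on (-a, a):
    # E|X|^3 = a^3/4 and sigma = a/sqrt(3), so the ratio is 3*sqrt(3)/4,
    # independent of the scale a
    return 3.0 * math.sqrt(3.0) / 4.0

def berry_esseen_bound(L, gamma):
    # absolute-error bound |P(x) - Q(x)| <= c * gamma / sqrt(L)
    return c * gamma / math.sqrt(L)

gamma = skew_uniform()  # about 1.299
for L in (10, 100, 10_000):
    print(f"L = {L:>6}: |P - Q| <= {berry_esseen_bound(L, gamma):.4f}")
```

Note that $c\,\gamma_X \approx 0.8 \times 1.3 \approx 1$ here, which is consistent with the text's later remark that taking $c\,\gamma_X = 1$ is a reasonable value.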

$$\frac{|P(x) - Q(x)|}{Q(x)} \le \frac{c\,\gamma_X \sqrt{2\pi}}{\sqrt{L}}\, \frac{1+x^2}{x}\, e^{x^2/2}, \qquad x > 0$$
Suppose we require that the relative error not exceed some specified value $\epsilon$. The normalized (by the standard deviation) boundary $x$ at which the approximation is evaluated must not violate

$$\frac{L\,\epsilon^2}{2\pi\, c^2 \gamma_X^2} \ge e^{x^2}\, \frac{(1+x^2)^2}{x^2}$$

As shown in the figure, the right side of this equation is a monotonically increasing function over the region of interest ($x$ greater than about 0.64).
The quantity that governs the limits of validity for numerically applying the Central Limit Theorem to finite amounts of data is shown over a portion of its range. To judge these limits, we must compute the quantity $\frac{L\,\epsilon^2}{2\pi\, c^2 \gamma_X^2}$, where $\epsilon$ denotes the desired percentage error in the Central Limit Theorem approximation and $L$ the number of observations. Selecting this value on the vertical axis and determining the value of $x$ yielding it, we find the normalized ($x = 1$ corresponds to one standard deviation) upper limit on an $L$-term sum to which the Central Limit Theorem is guaranteed to apply. Note how rapidly the curve increases, suggesting that large amounts of data are needed for accurate approximation.
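Since the figure itself is not reproduced here, a short tabulation conveys the same behavior; in this sketch, `f` is simply the right side of the validity condition above:

```python
import math

def f(x):
    # right side of the validity condition:
    # L*eps^2 / (2*pi*c^2*gamma^2) >= exp(x^2) * (1 + x^2)^2 / x^2
    return math.exp(x * x) * (1.0 + x * x) ** 2 / (x * x)

# tabulate the curve the figure plots; note the explosive growth with x
for x in (0.5, 1.0, 1.5, 2.0, 2.5, 3.0):
    print(f"x = {x:.1f}   f(x) = {f(x):.4g}")
```

The jump from $f(1) \approx 11$ to $f(3) \approx 9 \times 10^4$ is the quantitative content of "note how rapidly the curve increases."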

If $\epsilon = 0.1$ and taking $c\,\gamma_X$ arbitrarily to be unity (a reasonable value), the left side of the preceding equation becomes $1.6 \times 10^{-3} L$. Examining the figure, we find that for $L = 10{,}000$, $x$ must not exceed 1.17. Because we have normalized to unit variance, this example suggests that the Gaussian approximates the distribution of a ten-thousand-term sum only over a range corresponding to a 76% area about the mean. Consequently, the Central Limit Theorem, as a finite-sample distributional approximation, is only guaranteed to hold near the mode of the Gaussian, with huge numbers of observations needed to specify the tail behavior. Realizing this fact will keep us from being ignorant amateurs.
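These numbers can be checked directly. The sketch below (the name `boundary_x` is our own) inverts the validity condition by bisection on its increasing branch, assuming $c\,\gamma_X = 1$ as in the example:

```python
import math

def f(x):
    # right side of the validity condition: exp(x^2) * (1 + x^2)^2 / x^2
    return math.exp(x * x) * (1.0 + x * x) ** 2 / (x * x)

def boundary_x(L, eps, c_gamma=1.0):
    # largest normalized x satisfying L*eps^2/(2*pi*c_gamma^2) >= f(x),
    # found by bisection on the increasing branch of f (x > 1 here)
    target = L * eps ** 2 / (2.0 * math.pi * c_gamma ** 2)
    lo, hi = 1.0, 10.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if f(mid) <= target:
            lo = mid
        else:
            hi = mid
    return lo

x = boundary_x(10_000, 0.1)           # about 1.17, as in the text
area = math.erf(x / math.sqrt(2.0))   # central area 1 - 2*Q(x), about 0.76
print(f"x = {x:.2f}, central area = {area:.2f}")
```

With $L = 10^4$ and $\epsilon = 0.1$ the target is $\approx 15.9$, giving $x \approx 1.17$ and a central area of about 76%, matching the figures quoted above.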

Source:  OpenStax, Statistical signal processing. OpenStax CNX. Dec 05, 2011 Download for free at http://cnx.org/content/col11382/1.1