In this section, we consider only real-valued wavelet functions that form an orthogonal basis, hence $\varphi \equiv \tilde{\varphi}$ and $\psi \equiv \tilde{\psi}$ . We saw in Orthogonal Bases from Multiresolution analysis and wavelets how a given function belonging to ${L}_{2}\left(\mathbb{R}\right)$ could be represented as a wavelet series. Here, we explain how to use a wavelet basis to construct a nonparametric estimator for the regression function $m$ in the model
$$Y_i = m(x_i) + \epsilon_i, \qquad i = 1, \ldots, n,$$
where $x_i = \frac{i}{n}$ are equispaced design points and the errors are i.i.d. Gaussian, $\epsilon_i \sim N(0, \sigma_\epsilon^2)$.
A wavelet estimator can be linear or nonlinear. The linear wavelet estimator proceeds by projecting the data onto a coarse level space. This estimator is of kernel type; see "Linear smoothing with wavelets". Another possibility for estimating $m$ is to detect which detail coefficients convey the important information about the function $m$ and to set all the other coefficients to zero. This yields a nonlinear wavelet estimator, as described in "Nonlinear smoothing with wavelets".
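The nonlinear route can be sketched in a few lines, assuming PyWavelets is available; the wavelet (`db4`), the threshold value, and the test signal below are illustrative choices, not prescribed by the text:

```python
import numpy as np
import pywt

rng = np.random.default_rng(0)
n = 256
x = np.arange(1, n + 1) / n
y = np.sin(4 * np.pi * x) + rng.normal(0.0, 0.3, n)   # noisy samples of m

# Full wavelet decomposition in an orthogonal Daubechies basis.
coeffs = pywt.wavedec(y, "db4", mode="periodization")

# Nonlinear estimator: keep only the detail coefficients that appear to
# carry signal (hard thresholding), set all the others to zero.
thr = 0.3 * np.sqrt(2.0 * np.log(n))   # illustrative universal-type threshold
denoised = [coeffs[0]] + [pywt.threshold(c, thr, mode="hard") for c in coeffs[1:]]
m_hat = pywt.waverec(denoised, "db4", mode="periodization")
```

Most pure-noise detail coefficients fall below the threshold and are zeroed, while the few large coefficients describing the signal survive.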
Suppose we are given data ${({x}_{i},{Y}_{i})}_{i=1}^{n}$ coming from the model [link] and an orthogonal wavelet basis generated by $\{\varphi ,\psi \}$. The linear wavelet estimator proceeds by choosing a cutting level $j_1$ and yields an estimate of the projection of $m$ onto the space $V_{j_1}$:

$$\widehat{m}(x) = \sum_{k} \hat{c}_{j_0,k}\,\varphi_{j_0,k}(x) + \sum_{j=j_0}^{j_1-1} \sum_{k} \hat{d}_{j,k}\,\psi_{j,k}(x),$$
with $j_0$ the coarsest level in the decomposition, and where the so-called empirical coefficients are computed as

$$\hat{c}_{j,k} = \frac{1}{n}\sum_{i=1}^{n} Y_i\,\varphi_{j,k}(x_i), \qquad \hat{d}_{j,k} = \frac{1}{n}\sum_{i=1}^{n} Y_i\,\psi_{j,k}(x_i).$$
The cutting level $j_1$ plays the role of a smoothing parameter: a small value of $j_1$ means that many detail coefficients are left out, which may lead to oversmoothing. On the other hand, if $j_1$ is too large, too many coefficients are kept, and spurious bumps are likely to remain in the estimate of $m(x)$.
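The effect of the cutting level can be sketched as follows, again assuming PyWavelets; here the number of retained detail levels plays the role of $j_1$, and the wavelet and test signal are illustrative:

```python
import numpy as np
import pywt

rng = np.random.default_rng(1)
n = 512
x = np.arange(1, n + 1) / n
y = np.cos(2 * np.pi * x) + rng.normal(0.0, 0.2, n)

def linear_wavelet_estimator(y, keep_details, wavelet="db4"):
    """Keep the approximation and the `keep_details` coarsest detail
    levels; zero the finer ones (projection onto a coarse space)."""
    coeffs = pywt.wavedec(y, wavelet, mode="periodization")
    kept = [c if i <= keep_details else np.zeros_like(c)
            for i, c in enumerate(coeffs)]
    return pywt.waverec(kept, wavelet, mode="periodization")

m_smooth = linear_wavelet_estimator(y, 2)   # few detail levels: smooth fit
m_rough = linear_wavelet_estimator(y, 6)    # many detail levels: wiggly fit
```

Choosing `keep_details` too small smooths away genuine features of $m$; choosing it too large lets noise-driven bumps back into the fit.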
To see that the estimator [link] is of kernel type, consider first the projection of $m$ onto $V_{j_1}$:

$$P_{V_{j_1}} m(x) = \sum_{k} c_{j_1,k}\,\varphi_{j_1,k}(x) = \int K_{j_1}(x,y)\, m(y)\, dy,$$
where the (convolution) kernel $K_{j_1}(x,y)$ is given by

$$K_{j_1}(x,y) = \sum_{k} \varphi_{j_1,k}(x)\,\varphi_{j_1,k}(y).$$
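For the Haar scaling function ($\varphi = \mathbf{1}_{[0,1)}$), the kernel $K_{j_1}(x,y)=\sum_k \varphi_{j_1,k}(x)\varphi_{j_1,k}(y)$ has a closed form: it equals $2^{j_1}$ when $x$ and $y$ lie in the same dyadic interval of width $2^{-j_1}$, and $0$ otherwise. The sketch below (grid size and test function are illustrative) checks numerically that $\int K_{j_1}(x,y)\,dy = 1$ and that the kernel projection averages $m$ over the dyadic cell containing $x$:

```python
import numpy as np

def haar_kernel(x, y, j1):
    # K_{j1}(x, y) = 2^{j1} iff x and y fall in the same dyadic cell
    # of width 2^{-j1}; 0 otherwise.
    return 2.0 ** j1 * (np.floor(2.0 ** j1 * x) == np.floor(2.0 ** j1 * y))

j1 = 3
grid = (np.arange(4096) + 0.5) / 4096            # midpoint rule on [0, 1]
m = grid ** 2                                    # illustrative test function
x0 = 0.3                                         # lies in [0.25, 0.375)
mass = np.mean(haar_kernel(x0, grid, j1))        # ≈ ∫ K(x0, y) dy
proj = np.mean(haar_kernel(x0, grid, j1) * m)    # ≈ (P m)(x0)
```

The projection at $x_0$ reproduces the average of $m$ over $[0.25, 0.375)$, which is the piecewise-constant Haar approximation at level $j_1 = 3$.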
Härdle et al. [link] studied the approximation properties of this projection operator. In order to estimate [link], Antoniadis et al. [link] proposed to take:

$$\widehat{m}(x) = \sum_{i=1}^{n} Y_i \int_{(i-1)/n}^{i/n} K_{j_1}(x,y)\, dy = \sum_{k} \Big( \sum_{i=1}^{n} Y_i \int_{(i-1)/n}^{i/n} \varphi_{j_1,k}(y)\, dy \Big)\, \varphi_{j_1,k}(x).$$
Approximating the last integral by $\frac{1}{n}{\varphi}_{{j}_{1},k}\left({x}_{i}\right)$, we recover the estimator $\widehat{m}\left(x\right)$ in [link].
By orthogonality of the wavelet transform and Parseval's equality, the $L_2$-risk (or integrated mean square error, IMSE) of a linear wavelet estimator equals the $\ell_2$-risk of its wavelet coefficients:

$$E\,\|\widehat{m} - m\|_2^2 = \underbrace{\sum_{k} E\,(\hat{c}_{j_0,k} - c_{j_0,k})^2}_{S_1} + \underbrace{\sum_{j=j_0}^{j_1-1} \sum_{k} E\,(\hat{d}_{j,k} - d_{j,k})^2}_{S_2} + \underbrace{\sum_{j \geq j_1} \sum_{k} d_{j,k}^2}_{S_3},$$
where

$$c_{j,k} = \int m(x)\,\varphi_{j,k}(x)\,dx, \qquad d_{j,k} = \int m(x)\,\psi_{j,k}(x)\,dx$$
are called `theoretical' coefficients in the regression context. The term $S_1 + S_2$ in [link] constitutes the stochastic bias, whereas $S_3$ is the deterministic bias. The optimal cutting level is such that these two bias terms are of the same order. If $m$ is $\beta$-Hölder continuous, it is easy to see that the optimal cutting level satisfies $2^{{j}_{1}(n)} = O\left({n}^{1/(1+2\beta )}\right)$. The resulting optimal IMSE is of order ${n}^{-\frac{2\beta}{2\beta +1}}$. In practice, cross-validation methods are often used to determine the optimal level ${j}_{1}$ [link], [link].
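Two quick numerical illustrations of this risk calculation, assuming PyWavelets: first, Parseval's equality for an orthogonal discrete wavelet transform (with periodized boundary handling the transform is orthonormal, which is what turns the $L_2$-risk into an $\ell_2$-risk of coefficients); second, the optimal cutting-level arithmetic for an assumed smoothness $\beta = 2$. The wavelet and sample sizes are illustrative choices:

```python
import math
import numpy as np
import pywt

# Parseval: coefficient energy equals signal energy for an orthonormal DWT.
rng = np.random.default_rng(2)
y = rng.normal(size=256)
coeffs = pywt.wavedec(y, "db8", mode="periodization")
energy_coeffs = sum(float(np.sum(c ** 2)) for c in coeffs)
energy_signal = float(np.sum(y ** 2))

# Optimal cutting level for a beta-Hoelder function:
# 2^{j1} ~ n^{1/(1+2*beta)}, giving an IMSE of order n^{-2*beta/(2*beta+1)}.
def optimal_cut_and_rate(n, beta):
    j1 = max(1, round(math.log2(n) / (1 + 2 * beta)))
    imse_order = n ** (-2 * beta / (2 * beta + 1))
    return j1, imse_order

for n in (256, 1024, 4096):
    print(n, *optimal_cut_and_rate(n, beta=2.0))
```

For $\beta = 2$ the IMSE decays like $n^{-4/5}$; for $n = 1024$ this is exactly $2^{-8}$, while the optimal level grows only logarithmically in $n$.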