# 4.16 Robust hypothesis testing


"Robust" is a technical word that implies insensitivity to modeling assumptions. As we have seen, some algorithms arerobust while others are not. The intent of robust signal processing is to derive algorithms that are explicitly insensitive to the underlying signal and/or noise models. The way in which modelingincertainties are described is typified by the approach we shall use in the following discussion of robust model evaluation.

We assume that two nominal models of the generation of the statistically independent observations are known; the "actual" conditional probability density that describes the data under the assumptions of each model is not known exactly, but is "close" to the nominal. Letting $p(\cdot)$ be the actual probability density for each observation and $p^{o}(\cdot)$ the nominal, we say that (Huber, 1981)
$$p(x) = (1-\epsilon)\,p^{o}(x) + \epsilon\, p^{d}(x)$$
where $p^{d}$ is the unknown disturbance density and $\epsilon$ is the uncertainty variable ($0 \le \epsilon < 1$). The uncertainty variable specifies how accurate the nominal model is thought to be: the smaller $\epsilon$, the smaller the contribution of the disturbance. It is assumed that some value for $\epsilon$ can be rationally assigned. The disturbance density is entirely unknown and is assumed to be any valid probability density function. The expression given above is normalized so that $p(\cdot)$ is itself a probability density: the weights $1-\epsilon$ and $\epsilon$ sum to one, so $p(\cdot)$ integrates to one. An example of densities described this way is shown in the figure.
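As a concrete sketch of the $\epsilon$-contamination model, the snippet below mixes a nominal density with a disturbance density and checks numerically that the result is still a valid density. The choice of a Gaussian nominal and a heavy-tailed Laplace disturbance is purely illustrative, not part of the general theory.

```python
import numpy as np
from scipy.stats import norm, laplace

eps = 0.1  # uncertainty variable, 0 <= eps < 1 (assumed value)

def contaminated_pdf(x, eps, nominal_pdf, disturbance_pdf):
    """Actual density p(x) = (1 - eps) * nominal(x) + eps * disturbance(x)."""
    return (1.0 - eps) * nominal_pdf(x) + eps * disturbance_pdf(x)

# Evaluate the mixture on a fine grid; Laplace plays the (unknown) disturbance.
x = np.linspace(-30.0, 30.0, 200001)
p = contaminated_pdf(x, eps, norm.pdf, laplace.pdf)

# Because both components integrate to one and the weights sum to one,
# the mixture integrates to one as well (up to grid error).
total = np.sum(p) * (x[1] - x[0])
print(total)
```

Any other valid disturbance density could be substituted; the normalization argument does not depend on its shape.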

The robust model evaluation problem is formally stated as
$$\mathcal{M}_0:\; p_{\mathbf{r}|\mathcal{M}_0}(\mathbf{r}) = \prod_{l=0}^{L-1}\left[(1-\epsilon)\,p^{o}_{r_l|\mathcal{M}_0}(r_l) + \epsilon\, p^{d}_{r_l|\mathcal{M}_0}(r_l)\right]$$
$$\mathcal{M}_1:\; p_{\mathbf{r}|\mathcal{M}_1}(\mathbf{r}) = \prod_{l=0}^{L-1}\left[(1-\epsilon)\,p^{o}_{r_l|\mathcal{M}_1}(r_l) + \epsilon\, p^{d}_{r_l|\mathcal{M}_1}(r_l)\right]$$
The nominal densities under each model correspond to the conditional densities that we have been using until now. The disturbance densities are intended to model the imprecision of both descriptions; hence, they are assumed to be different in the context of each model. Note that the measure of imprecision $\epsilon$ is assumed to be the same under either model.
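The two contaminated models above can be evaluated numerically for a batch of independent observations. In this sketch the nominals are Gaussians differing only in their means (the case emphasized below), and a Laplace disturbance stands in for the unknown $p^d$; all of these specific choices are assumptions for illustration.

```python
import numpy as np
from scipy.stats import norm, laplace

eps, m, L = 0.1, 1.0, 20   # assumed uncertainty, mean shift, and sample count
rng = np.random.default_rng(0)
r = rng.normal(m, 1.0, size=L)   # data drawn near the model-1 nominal

def model_density(r, mean, eps):
    """Per-observation contaminated density under a given model's mean."""
    return (1.0 - eps) * norm.pdf(r, loc=mean) + eps * laplace.pdf(r, loc=mean)

# Joint densities are products over the L statistically independent observations.
p0 = np.prod(model_density(r, 0.0, eps))   # under model 0 (mean 0)
p1 = np.prod(model_density(r, m, eps))     # under model 1 (mean m)
ratio = p1 / p0
print(ratio)
```

The ratio `p1 / p0` is the likelihood ratio for these particular contaminated densities; the minimax development that follows asks which disturbance choices make this ratio least informative.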

To solve this problem, we take what is known as a minimax approach: find the worst-case combination of a priori densities (max), then minimize the consequences of this situation (mini) according to some criterion. In this way, bad situations are handled as well as can be expected while the more tolerable ones are (hopefully) processed well also. The "mini" phase of the minimax solution corresponds to the likelihood ratio for many criteria. Thus, the "max" phase amounts to finding the worst-case probability distributions for the likelihood ratio test described in the previous section: find the disturbance densities that result in a constant value for the ratio over large domains of the data. When the two nominal distributions scaled by $1-\epsilon$ can be brought together so that they are equal for some disturbance, the likelihood ratio will be constant in that domain. Of most interest here is the case where the models differ only in the value of the mean, as shown in the figure. "Bringing the distributions together" means, in this case, scaling the distribution for $\mathcal{M}_0$ by $1-\epsilon$ while adding the constant $\epsilon$ to the scaled distribution for $\mathcal{M}_1$. One can show in general that if the ratio of the nominal densities is monotonic, this procedure finds the worst-case distributions (Huber, 1965). The distributions overlap for small and for large values of the data, with no overlap in a central region. As we shall see, the size of this central region depends greatly on the choice of $\epsilon$. The tails of the worst-case distributions under each model are equal; conceptually, we consider that the worst-case densities have exponential tails in the model evaluation problem.
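Because the worst-case distributions agree in their tails, the per-observation likelihood ratio saturates there, which is often implemented by clipping the nominal ratio between two constants. The sketch below does this for the Gaussian mean-shift case; the clipping bounds `c0 < c1` are simply stated as given constants here (in Huber's construction they would be determined from $\epsilon$, which is beyond this sketch).

```python
import numpy as np
from scipy.stats import norm

m = 1.0            # mean shift between the two nominal models
c0, c1 = 0.2, 5.0  # assumed clipping bounds on the per-sample ratio, c0 < c1

def robust_log_likelihood_ratio(r, m, c0, c1):
    """Sum of per-sample log likelihood ratios, clipped where the
    worst-case distributions coincide and the ratio is constant."""
    nominal_ratio = norm.pdf(r, loc=m) / norm.pdf(r, loc=0.0)
    return np.sum(np.log(np.clip(nominal_ratio, c0, c1)))

rng = np.random.default_rng(1)
data = rng.normal(m, 1.0, size=50)
stat = robust_log_likelihood_ratio(data, m, c0, c1)
print(stat)
```

Clipping is what confers robustness: a single wildly outlying observation can contribute at most $\ln c_1$ (or at least $\ln c_0$) to the decision statistic, rather than dominating it as it would in the unclipped test.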