Optimization theory is the branch of applied mathematics concerned with finding the set of parameters that maximizes or minimizes a given mathematical expression. Being an applied discipline, its problems usually arise from real-life situations in areas such as science, engineering, and finance (among many others). This section presents some basic concepts for completeness and is not meant to replace a treatise on the subject. The reader is encouraged to consult further references for more information.
Solution of linear weighted least squares problems
Consider the quadratic problem
$$\min_x \; \|Ax - b\|_2,$$
which can be written as
$$\min_x \; f(x) = (Ax - b)^T (Ax - b),$$
omitting the square root since this problem is a strictly convex one. Therefore its unique (and thus global) solution is found at the point where the partial derivatives with respect to the optimization variable are equal to zero. That is,
$$\frac{\partial f(x)}{\partial x} = 2A^T A x - 2A^T b = 0.$$
The solution of this system is given by
$$x = (A^T A)^{-1} A^T b,$$
where the inverted term $(A^T A)^{-1} A^T$ is referred to in the literature as the Moore-Penrose pseudoinverse of $A$.
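As a quick numerical sanity check, a minimal sketch follows (the NumPy usage and the random test data are illustrative assumptions, not part of the original text): the closed-form solution above is compared against a generic least squares solver.

```python
import numpy as np

# Arbitrary example data, assumed for illustration only.
rng = np.random.default_rng(0)
A = rng.standard_normal((8, 3))   # full column rank with probability 1
b = rng.standard_normal(8)

# Closed-form solution via the Moore-Penrose pseudoinverse (A^T A)^{-1} A^T.
x_closed = np.linalg.solve(A.T @ A, A.T @ b)

# Reference solution from a generic least squares routine.
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)

print(np.allclose(x_closed, x_lstsq))  # True
```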
In the case of a weighted version of this problem,
$$\min_x \; \sum_{i=1}^{M} w_i^2 \big(c_i^T x - b_i\big)^2,$$
where $c_i^T$ is the $i$-th row of $A$, one can write the objective as
$$\min_x \; (Ax - b)^T W^T W (Ax - b),$$
where $W = \operatorname{diag}(w)$ contains the weighting vector $w$. The solution is therefore given by
$$x = (A^T W^T W A)^{-1} A^T W^T W b.$$
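A minimal sketch of the weighted solution (again with assumed random data; the positive weights are an arbitrary choice) checks the closed form against ordinary least squares applied to the pre-weighted system $WAx \approx Wb$:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((8, 3))
b = rng.standard_normal(8)
w = rng.uniform(0.5, 2.0, size=8)    # assumed positive weights
W = np.diag(w)

# Weighted solution x = (A^T W^T W A)^{-1} A^T W^T W b.
WA, Wb = W @ A, W @ b
x_weighted = np.linalg.solve(WA.T @ WA, WA.T @ Wb)

# Equivalent: ordinary least squares on the pre-weighted system W A x = W b.
x_check, *_ = np.linalg.lstsq(WA, Wb, rcond=None)
print(np.allclose(x_weighted, x_check))  # True
```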
Newton's method and the approximation of linear systems in an $\ell_p$ sense
Newton's method and linear phase systems
Consider the problem
$$\min_a \; \|Aa - b\|_p$$
for $2 < p < \infty$. This problem is equivalent to the better posed problem
$$\min_a \; f(a) = \|Aa - b\|_p^p,$$
where $A \in \mathbb{R}^{M \times N}$, $a \in \mathbb{R}^{N}$, $b \in \mathbb{R}^{M}$, and
$$f(a) = \sum_{i=1}^{M} |g_i(a)|^p.$$
The $i$-th element of the residual $g(a) = Aa - b$ is given by
$$g_i(a) = c_i^T a - b_i,$$
where $c_i^T$ is the $i$-th row of $A$ and $b_i$ is the $i$-th element of $b$. From the definition of $f$ we have that
$$\frac{\partial f(a)}{\partial a_j} = \sum_{i=1}^{M} \frac{\partial}{\partial a_j}\,|g_i(a)|^p,$$
where $g_i(a)$ is the $i$-th element of $g(a)$ and $\partial g_i(a)/\partial a_j = A_{ij}$. Now,
$$\frac{\partial}{\partial a_j}\,|g_i(a)|^p = p\,|g_i(a)|^{p-1}\operatorname{sign}\!\big(g_i(a)\big)\,A_{ij},$$
where
$$\operatorname{sign}(u) = \begin{cases} 1 & u > 0 \\ -1 & u < 0. \end{cases}$$
Therefore the Jacobian of $f$ is the row vector whose $j$-th element is given by
$$\big[\nabla f(a)\big]_j = p \sum_{i=1}^{M} |g_i(a)|^{p-1}\operatorname{sign}\!\big(g_i(a)\big)\,A_{ij}.$$
The Hessian of $f$ is the matrix $H(a)$ whose $(j,k)$-th element ($1 \le j, k \le N$) is given by
$$H_{jk}(a) = \frac{\partial^2 f(a)}{\partial a_j\,\partial a_k} = p \sum_{i=1}^{M} \frac{d}{du}\Big(|u|^{p-1}\operatorname{sign}(u)\Big)\Big|_{u = g_i(a)}\, A_{ij}\,A_{ik},$$
where adequate substitutions have been made for the sake of simplicity. We have
$$\frac{d}{du}\,|u|^p = p\,|u|^{p-1}\operatorname{sign}(u).$$
Note that the partial derivative of $|u|$ at $u = 0$ is not defined. Therefore
$$\frac{d}{du}\,|u| = \operatorname{sign}(u) \quad \text{for } u \neq 0.$$
Note that $\operatorname{sign}^2(u) = 1$ for all $u$ except at $u = 0$, where it is not defined. Then
$$\frac{d^2}{du^2}\,|u|^p = p\,(p-1)\,|u|^{p-2}$$
except at $u = 0$, where it is not defined. It follows that
$$H_{jk}(a) = p\,(p-1) \sum_{i=1}^{M} |g_i(a)|^{p-2}\,A_{ij}\,A_{ik}.$$
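These two scalar derivative identities are easy to verify numerically. The following sketch (sample values of $p$ and $u$ are arbitrary assumptions, with $u \neq 0$) compares them against central finite differences:

```python
import numpy as np

p, u = 3.5, 0.7                      # assumed sample values, u != 0
h = 1e-5
phi = lambda u: np.abs(u) ** p

# First derivative: d/du |u|^p = p |u|^{p-1} sign(u).
d1_formula = p * np.abs(u) ** (p - 1) * np.sign(u)
d1_numeric = (phi(u + h) - phi(u - h)) / (2 * h)

# Second derivative: d^2/du^2 |u|^p = p (p-1) |u|^{p-2}.
d2_formula = p * (p - 1) * np.abs(u) ** (p - 2)
d2_numeric = (phi(u + h) - 2 * phi(u) + phi(u - h)) / h**2

print(abs(d1_formula - d1_numeric), abs(d2_formula - d2_numeric))  # both tiny
```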
Based on the gradient and Hessian expressions above, one can apply Newton's method to this problem as follows (a code sketch is given after the list),
- Given $a_0 \in \mathbb{R}^N$, set $k = 0$.
- For $k = 0, 1, 2, \ldots$
- Find $\nabla f(a_k)$.
- Find $H(a_k)$.
- Solve $H(a_k)\,d_k = -\nabla f(a_k)$ for $d_k$.
- Let $a_{k+1} = a_k + d_k$.
- Check for convergence and iterate if necessary.
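A minimal Python sketch of these steps (the function name, the NumPy usage, and the tolerance and iteration cap are assumptions; `grad` and `hess` are user-supplied callables):

```python
import numpy as np

def newton(a0, grad, hess, tol=1e-10, max_iter=50):
    """Undamped Newton iteration following the steps listed above."""
    a = np.asarray(a0, dtype=float).copy()
    for _ in range(max_iter):
        d = np.linalg.solve(hess(a), -grad(a))  # solve H(a_k) d_k = -grad f(a_k)
        a += d                                  # a_{k+1} = a_k + d_k
        if np.linalg.norm(d) < tol:             # convergence check
            break
    return a
```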
Note that for this problem the Jacobian of $f$ can be written as
$$\nabla f(a) = p\,A^T z(a),$$
where
$$z_i(a) = |g_i(a)|^{p-1}\operatorname{sign}\!\big(g_i(a)\big).$$
Also,
$$z(a) = W(a)\,g(a),$$
where
$$W(a) = \operatorname{diag}\!\big(|g_i(a)|^{p-2}\big)$$
and
$$g(a) = Aa - b.$$
Therefore
$$\nabla f(a) = p\,A^T W(a)\,(Aa - b).$$
From the expression for $H_{jk}(a)$ above, the Hessian $H(a)$ can be expressed as
$$H(a) = p\,(p-1)\,A^T W(a)\,A,$$
where the matrix $W(a)$ is given by
$$W(a) = \begin{bmatrix} |g_1(a)|^{p-2} & & \\ & \ddots & \\ & & |g_M(a)|^{p-2} \end{bmatrix}.$$
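A sketch of these matrix forms in NumPy (the function name is an assumption), producing the two callables the Newton skeleton above expects:

```python
import numpy as np

def lp_grad_hess(A, b, p):
    """grad/Hessian callables for f(a) = ||A a - b||_p^p, using
    grad f = p A^T W g and H = p (p-1) A^T W A, W = diag(|g_i|^{p-2})."""
    def grad(a):
        g = A @ a - b
        w = np.abs(g) ** (p - 2)          # diagonal of W(a)
        return p * (A.T @ (w * g))

    def hess(a):
        g = A @ a - b
        w = np.abs(g) ** (p - 2)
        return p * (p - 1) * (A.T @ (w[:, None] * A))

    return grad, hess
```

For instance, `newton(a0, *lp_grad_hess(A, b, 4.0))` would refine a least squares starting point $a_0 = (A^T A)^{-1} A^T b$ toward the $\ell_4$ solution.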
The matrix $H(a)$ is positive definite (for $2 < p < \infty$, assuming $A$ has full column rank and no residual $g_i(a)$ vanishes). To see this, consider
$$v^T H(a)\,v = p\,(p-1)\,v^T A^T W(a)\,A\,v,$$
where $v \in \mathbb{R}^N$, $v \neq 0$. Let $u = Av$, $u \in \mathbb{R}^M$. Then
$$v^T H(a)\,v = p\,(p-1) \sum_{i=1}^{M} |g_i(a)|^{p-2}\,u_i^2 \geq 0$$
unless $u = 0$. But since $W(a)$ is diagonal with positive entries and $A$ is full column rank, $u = Av \neq 0$. Thus $v^T H(a)\,v > 0$ (equality only if $u = 0$) and so $H(a)$ is positive definite.
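This argument can be checked numerically; in the sketch below (random data and $p = 4$ are arbitrary assumptions), a Cholesky factorization of $H(a)$ succeeds exactly when the matrix is positive definite:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((10, 4))   # full column rank with probability 1
b = rng.standard_normal(10)
a = rng.standard_normal(4)
p = 4.0

g = A @ a - b
W = np.diag(np.abs(g) ** (p - 2))
H = p * (p - 1) * A.T @ W @ A

np.linalg.cholesky(H)              # raises LinAlgError unless H is positive definite
print("H is positive definite")
```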
Newton's method and complex linear systems
Consider the problem
$$\min_a \; \|Aa - b\|_p^p,$$
where $A \in \mathbb{C}^{M \times N}$, $a \in \mathbb{R}^N$ and $b \in \mathbb{C}^M$. One can write this problem in terms of the real and imaginary parts of $A$ and $b$,
$$\min_a \; f(a) = \sum_{i=1}^{M} g_i(a)^{p/2}, \qquad g_i(a) = \big(R_i a - \alpha_i\big)^2 + \big(I_i a - \beta_i\big)^2,$$
where $R = \operatorname{Re}(A)$, $I = \operatorname{Im}(A)$ (with $R_i$, $I_i$ their $i$-th rows) and $\alpha = \operatorname{Re}(b)$, $\beta = \operatorname{Im}(b)$. The gradient $\nabla f(a)$ is the vector whose $k$-th element is given by
$$\frac{\partial f(a)}{\partial a_k} = p \sum_{i=1}^{M} g_i(a)^{\frac{p}{2}-1}\,\gamma_{ik}(a),$$
where $\gamma_i(a)$ is the row vector whose $k$-th element is
$$\gamma_{ik}(a) = \big(R_i a - \alpha_i\big) R_{ik} + \big(I_i a - \beta_i\big) I_{ik}.$$
Therefore one can express the gradient of $f$ by
$$\nabla f(a) = p \sum_{i=1}^{M} g_i(a)^{\frac{p}{2}-1}\,\gamma_i(a)^T,$$
with $\gamma_i(a)$ as above. Note that one can also write the gradient in vector form as follows,
$$\nabla f(a) = p\,\Gamma(a)^T s(a), \qquad s_i(a) = g_i(a)^{\frac{p}{2}-1},$$
where $\Gamma(a)$ is the matrix whose $i$-th row is $\gamma_i(a)$.
The Hessian $H(a)$ is the matrix of second derivatives whose $(k,l)$-th entry is given by
$$H_{kl}(a) = \frac{\partial^2 f(a)}{\partial a_k\,\partial a_l} = p \sum_{i=1}^{M} \frac{\partial}{\partial a_l}\Big[g_i(a)^{\frac{p}{2}-1}\,\gamma_{ik}(a)\Big].$$
Now,
$$\frac{\partial}{\partial a_l}\,g_i(a)^{\frac{p}{2}-1} = (p-2)\,g_i(a)^{\frac{p}{2}-2}\,\gamma_{il}(a)$$
and
$$\frac{\partial}{\partial a_l}\,\gamma_{ik}(a) = R_{ik} R_{il} + I_{ik} I_{il}.$$
Substituting these two expressions into the formula above we obtain
$$H_{kl}(a) = p \sum_{i=1}^{M} \Big[(p-2)\,g_i(a)^{\frac{p}{2}-2}\,\gamma_{ik}(a)\,\gamma_{il}(a) + g_i(a)^{\frac{p}{2}-1}\big(R_{ik} R_{il} + I_{ik} I_{il}\big)\Big].$$
Note that $H(a)$ can be written in matrix form as
$$H(a) = p\Big[(p-2)\,\Gamma(a)^T S_2(a)\,\Gamma(a) + R^T S_1(a)\,R + I^T S_1(a)\,I\Big],$$
where $S_1(a) = \operatorname{diag}\!\big(g_i(a)^{\frac{p}{2}-1}\big)$ and $S_2(a) = \operatorname{diag}\!\big(g_i(a)^{\frac{p}{2}-2}\big)$.
Therefore to solve this problem one can use Newton's method as follows: given an initial point $a_0$, each iteration gives a new estimate $a_{k+1}$ according to the formulas
$$H(a_k)\,d_k = -\nabla f(a_k), \qquad a_{k+1} = a_k + d_k,$$
where $H(a_k)$ and $\nabla f(a_k)$ correspond to the Hessian and gradient of $f$ as defined previously, evaluated at the current point $a_k$. Since the $\ell_p$ norm is convex for $1 \le p < \infty$, the problem above is convex. Therefore Newton's method will converge to the global minimizer $a^\star$ as long as $H(a_k)$ is not ill-conditioned.
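A sketch of the complex-data case under the same conventions (the function name is an assumption; NumPy handles the real/imaginary split): it assembles $\nabla f$ and $H$ from $R$, $I$, $\Gamma$, $S_1$ and $S_2$ as above, and the resulting callables can be reused with the Newton skeleton given earlier.

```python
import numpy as np

def complex_lp_grad_hess(A, b, p):
    """grad/Hessian of f(a) = sum_i g_i^{p/2}, g_i = |c_i^T a - b_i|^2,
    for complex A, b and real a, using the matrix forms derived above."""
    R, I = A.real, A.imag
    alpha, beta = b.real, b.imag

    def parts(a):
        x = R @ a - alpha                        # real part of the residual
        y = I @ a - beta                         # imaginary part of the residual
        g = x**2 + y**2                          # squared magnitudes g_i(a)
        Gamma = x[:, None] * R + y[:, None] * I  # rows gamma_i(a)
        return g, Gamma

    def grad(a):
        g, Gamma = parts(a)
        return p * (Gamma.T @ g ** (p / 2 - 1))  # p Gamma^T s(a)

    def hess(a):
        g, Gamma = parts(a)
        s1 = g ** (p / 2 - 1)                    # diagonal of S_1(a)
        s2 = g ** (p / 2 - 2)                    # diagonal of S_2(a)
        return p * ((p - 2) * Gamma.T @ (s2[:, None] * Gamma)
                    + R.T @ (s1[:, None] * R) + I.T @ (s1[:, None] * I))

    return grad, hess
```

For example, `newton(a0, *complex_lp_grad_hess(A, b, 4.0))` would run the iteration above for an $\ell_4$ complex approximation problem.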