Sometimes we are interested in derivatives of higher order, e.g., when we want to use Newton's Method for optimization, which requires second-order derivatives (Nocedal and Wright, 2006).
Consider a function $f : \mathbb{R}^2 \to \mathbb{R}$ of two variables $x, y$. We use the following notation for higher-order partial derivatives:
$\frac{\partial ^2f}{\partial x^2}$ is the second partial derivative of $f$ with respect to $x$.
$\frac{\partial ^nf}{\partial x^n}$ is the $n$th partial derivative of $f$ with respect to $x$.
$\frac{\partial ^2f}{\partial y \partial x}=\frac{\partial }{\partial y}(\frac{\partial f}{\partial x})$
is the mixed partial derivative obtained by first differentiating with respect to $x$ and then with respect to $y$.
$\frac{\partial ^2f}{\partial x \partial y}$ is the mixed partial derivative obtained by first differentiating with respect to $y$ and then with respect to $x$.
The Hessian is the collection of all second-order partial derivatives.
If $f(x,y)$ is a twice continuously differentiable function, then
$\frac{\partial ^2f}{\partial x \partial y}=\frac{\partial ^2f}{\partial y \partial x}$
i.e., the order of differentiation does not matter, and the corresponding Hessian matrix
$H=\begin{bmatrix}
\frac{\partial ^2f}{\partial x^2} &\frac{\partial ^2f}{\partial x \partial y} \\
\frac{\partial ^2f}{\partial x \partial y}& \frac{\partial ^2f}{\partial y^2}
\end{bmatrix}$
is symmetric. The Hessian is denoted by $\triangledown ^2_{x,y}f(x,y)$. In general, for $x \in \mathbb{R}^n$ and $f:\mathbb{R}^n \to \mathbb{R}$, the Hessian is an $n \times n$ matrix. The Hessian measures the curvature of the function locally around $(x,y)$.
General Hessian Matrix
$H=\begin{bmatrix}\frac{\partial ^2f}{\partial x_1^2} &\frac{\partial ^2f}{\partial x_1 \partial x_2}& \cdots & \frac{\partial ^2f}{\partial x_1 \partial x_n} \\
\frac{\partial ^2f}{\partial x_2 \partial x_1} &\frac{\partial ^2f}{\partial x_2^2}& \cdots & \frac{\partial ^2f}{\partial x_2 \partial x_n} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial ^2f}{\partial x_n \partial x_1} &\frac{\partial ^2f}{\partial x_n \partial x_2}& \cdots & \frac{\partial ^2f}{\partial x_n^2}
\end{bmatrix}$
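Each entry of this matrix is just a second partial derivative, so the whole thing can be built mechanically with a computer algebra system. Below is a minimal sketch using Python's sympy (the example function $f$ is an arbitrary illustrative choice, not one from the examples below):

```python
import sympy as sp

# Symbolic variables and an arbitrary example function f : R^2 -> R.
x, y = sp.symbols('x y')
f = x**3 * y + sp.exp(x * y)

# sympy.hessian assembles the matrix of all second-order partials.
H = sp.hessian(f, (x, y))
print(H)

# For a twice continuously differentiable f the mixed partials agree,
# so the Hessian is symmetric.
print(H == H.T)  # True
```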
Hessian of a vector field: if $f:\mathbb{R}^n \to \mathbb{R}^m$ is a vector field, the Hessian is an $m \times n \times n$ tensor.
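As a quick illustration (assuming the JAX library is available), jax.hessian applied to a vector-valued function returns exactly such an $m \times n \times n$ array:

```python
import jax
import jax.numpy as jnp

# A vector field f : R^2 -> R^3.
def f(v):
    x, y = v
    return jnp.array([x**2 * y, x + y**3, x * y])

# jax.hessian stacks one n x n Hessian per output component.
H = jax.hessian(f)(jnp.array([1.0, 2.0]))
print(H.shape)  # (3, 2, 2), i.e. m x n x n
```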
Hessians are used in machine learning to find local minima and local maxima when solving optimization problems.
Conditions for minima, maxima, and saddle points
The Hessian of a function is denoted by $\triangledown ^2_{x,y}f(x,y)$, where $f$ is a twice differentiable function. If $(x_0,y_0)$ is one of its stationary points, then:
If $\triangledown ^2_{x,y}f(x_0,y_0)>0$, i.e., positive definite, $(x_0,y_0)$ is a point of local minimum.
If $\triangledown ^2_{x,y}f(x_0,y_0)<0$, i.e., negative definite, $(x_0,y_0)$ is a point of local maximum.
If $\triangledown ^2_{x,y}f(x_0,y_0)$ is neither positive definite nor negative definite, i.e., indefinite, $(x_0,y_0)$ is a saddle point.
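In practice, definiteness is usually read off from the eigenvalues of the Hessian: all positive means positive definite, all negative means negative definite, and mixed signs mean indefinite. A minimal numerical sketch (the classify helper and its tolerance are my own illustrative choices):

```python
import numpy as np

def classify(H, tol=1e-9):
    """Classify a symmetric Hessian by the signs of its eigenvalues."""
    eig = np.linalg.eigvalsh(H)  # eigenvalues of a symmetric matrix
    if np.all(eig > tol):
        return "positive definite -> local minimum"
    if np.all(eig < -tol):
        return "negative definite -> local maximum"
    if np.any(eig > tol) and np.any(eig < -tol):
        return "indefinite -> saddle point"
    return "semidefinite -> test inconclusive"

print(classify(np.array([[2.0, 0.0], [0.0, 2.0]])))   # local minimum
print(classify(np.array([[-2.0, 0.0], [0.0, 3.0]])))  # saddle point
```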
The determinant ($D$) of the Hessian matrix can be used to determine whether a critical point of a function is a local minimum, local maximum, or a saddle point. Here's how it works:
Local Minimum or Maximum Test:
If the determinant ($D$) of the Hessian matrix at a critical point is positive, the function curves the same way in both principal directions around that point (upward in both, or downward in both), suggesting a local minimum or a local maximum.
If the determinant ($D$) is negative, the function curves upward in one direction and downward in another, which means the critical point is a saddle point.
Determining Minima or Maxima:
To further classify whether the critical point is a local minimum or maximum, you can examine the signs of the second-order partial derivatives. Specifically, look at the signs of $\frac{\partial^2{f}}{\partial{x^2}}$ and $\frac{\partial^2{f}}{\partial{y^2}}$ (or corresponding partial derivatives in higher dimensions).
If both $\frac{\partial^2{f}}{\partial{x^2}}$ and $\frac{\partial^2{f}}{\partial{y^2}}$ are positive at the critical point (i.e., the leading principal minors of the Hessian matrix are positive), then it's a local minimum.
If both $\frac{\partial^2{f}}{\partial{x^2}}$ and $\frac{\partial^2{f}}{\partial{y^2}}$ are negative at the critical point, then it's a local maximum.
If one of them is positive while the other is negative, it's a saddle point.
Please note that this test applies to functions of two variables (i.e., $f(x,y)$), and the classification of critical points can be more complex in higher dimensions. Additionally, when $D=0$, the test is inconclusive, and further analysis may be needed.
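For two variables the whole test (determinant first, then the sign of $\frac{\partial^2{f}}{\partial{x^2}}$) fits in a few lines. A sketch with sympy; the function name second_derivative_test is an illustrative choice:

```python
import sympy as sp

x, y = sp.symbols('x y')

def second_derivative_test(f, point):
    """Classify a critical point of f(x, y) using D = fxx*fyy - fxy**2."""
    fxx = sp.diff(f, x, 2).subs(point)
    fyy = sp.diff(f, y, 2).subs(point)
    fxy = sp.diff(f, x, y).subs(point)
    D = fxx * fyy - fxy**2
    if D > 0:
        return "local minimum" if fxx > 0 else "local maximum"
    if D < 0:
        return "saddle point"
    return "inconclusive (D = 0)"

print(second_derivative_test(x**2 + y**2, {x: 0, y: 0}))  # local minimum
```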
Example:
Let the function be $f(x,y)= x^2+y^2$. Its second-order partial derivatives exist and are continuous throughout the domain. Find the Hessian matrix.
$\frac{\partial ^2f}{\partial x^2} =2$
$\frac{\partial ^2f}{\partial y^2} =2$
$\frac{\partial ^2f}{\partial x \partial y} =0$
$\frac{\partial ^2f}{\partial y \partial x} =0$
$H=\begin{bmatrix}
\frac{\partial ^2f}{\partial x^2} &\frac{\partial ^2f}{\partial x \partial y} \\
\frac{\partial ^2f}{\partial x \partial y}& \frac{\partial ^2f}{\partial y^2}
\end{bmatrix}$
$H=\begin{bmatrix}
2&0 \\
0& 2
\end{bmatrix}$
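As a quick sanity check, sympy reproduces this Hessian (a minimal sketch):

```python
import sympy as sp

x, y = sp.symbols('x y')
f = x**2 + y**2

# Prints Matrix([[2, 0], [0, 2]]).
print(sp.hessian(f, (x, y)))
```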
Suppose a function is defined by $f(x,y)=x^4-32x^2+y^4-18y^2$. Find the maximum and minimum values of the function, if they exist. Justify your answer.
$\frac{\partial ^2f}{\partial x^2} =12x^2-64$
$\frac{\partial ^2f}{\partial y^2} =12y^2-36$
$\frac{\partial ^2f}{\partial x \partial y} =0$
$\frac{\partial ^2f}{\partial y \partial x} =0$
$H=\triangledown ^2_{x,y}f(x,y)=\begin{bmatrix}
12x^2-64&0 \\
0& 12y^2-36
\end{bmatrix}$
We solve for the stationary points of the function $f(x,y)$ by setting its partial derivatives $\frac{\partial{f}}{\partial{x}}$ and $\frac{\partial{f}}{\partial{y}}$ equal to 0.
$4x(x^2−16)=0⟹x=±4,0$
$4y(y^2−9)=0⟹y=±3,0$
The possible pairings give us the critical points $(\pm4,\pm3), (\pm4,0), (0,\pm3), (0,0)$.
Since the entries of the Hessian are even functions of $x$ and $y$, its value is the same at $\pm$ pairs, which saves a lot of effort: we only need to check the representatives $(4,3),(4,0),(0,3),(0,0)$.
$\triangledown ^2_{x,y}f(4,3)=\begin{bmatrix}
128&0 \\
0& 72
\end{bmatrix}$
It's positive definite, so $(4,3)$ is a local minimum of the function.
$\triangledown ^2_{x,y}f(4,0)=\begin{bmatrix}
128&0 \\
0& -36
\end{bmatrix}$
It's indefinite, so $(4,0)$ is a saddle point and is ruled out.
$\triangledown ^2_{x,y}f(0,3)=\begin{bmatrix}
-64&0 \\
0&72
\end{bmatrix}$
It's indefinite, so $(0,3)$ is a saddle point and is ruled out.
$\triangledown ^2_{x,y}f(0,0)=\begin{bmatrix}
-64&0 \\
0&-36
\end{bmatrix}$
This is negative definite, making $(0,0)$ a local maximum of the function.
Among the stationary values, $f(0,0)=0$ is the largest and $f(\pm4,\pm3)=-337$ is the smallest.
Thus the points of local minimum are $(\pm4,\pm3)$ and the point of local maximum is $(0,0)$. Since $f(x,y)\to\infty$ as $\|(x,y)\|\to\infty$, the minimum value $-337$ is in fact global, while the function is unbounded above and has no global maximum.
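The whole analysis can be automated; here is a sketch with sympy (declaring the symbols real so that solve returns only the nine real stationary points):

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)
f = x**4 - 32*x**2 + y**4 - 18*y**2

# Stationary points: both first-order partials vanish.
points = sp.solve([sp.diff(f, x), sp.diff(f, y)], [x, y], dict=True)

H = sp.hessian(f, (x, y))
for p in points:
    # The Hessian is diagonal here, so its entries are its eigenvalues.
    print(p, 'f =', f.subs(p), 'H =', H.subs(p).tolist())
```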
Example:
$f(x,y)=x^3−12x+y^3−75y+91$. Find the local minima and maxima.
We'll follow these steps:
Calculate the first-order partial derivatives.
Find the critical points by setting both partial derivatives equal to zero.
Use the second-order partial derivatives to classify these critical points.
Step 1: Calculate the first-order partial derivatives:
$\frac{\partial{f}}{\partial{x}}=3x^2-12$
$\frac{\partial{f}}{\partial{y}}=3y^2-75$
Step 2: Find the critical points by setting both partial derivatives equal to zero:
$3x^2−12=0$ (Equation 1)
$3y^2−75=0$ (Equation 2)
Solving for $x$ and $y$ gives $x=\pm 2$ and $y=\pm 5$, so the critical points are $(2,5),(-2,5),(2,-5),(-2,-5)$.
Step 3: Use the second-order partial derivatives to classify these critical points. We'll calculate the Hessian matrix and evaluate it at each critical point:
The Hessian matrix is given by:
$H=\begin{bmatrix}
6x&0 \\
0&6y
\end{bmatrix}$
$H(2,5)=\begin{bmatrix}
12&0 \\
0&30
\end{bmatrix}$
$H(-2,-5)=\begin{bmatrix}
-12&0 \\
0&-30
\end{bmatrix}$
$H(2,-5)=\begin{bmatrix}
12&0 \\
0&-30
\end{bmatrix}$
$H(-2,5)=\begin{bmatrix}
-12&0 \\
0&30
\end{bmatrix}$
$H(2,5)$ is positive definite and $H(-2,-5)$ is negative definite, so $(2,5)$ is a local minimum and $(-2,-5)$ is a local maximum; $H(2,-5)$ and $H(-2,5)$ are indefinite, so $(2,-5)$ and $(-2,5)$ are saddle points.
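The same sympy pattern as before confirms this classification (an illustrative sketch):

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)
f = x**3 - 12*x + y**3 - 75*y + 91

points = sp.solve([sp.diff(f, x), sp.diff(f, y)], [x, y], dict=True)
H = sp.hessian(f, (x, y))
for p in points:
    print(p, H.subs(p).tolist())
# (2, 5):   [[12, 0], [0, 30]]   -> positive definite, local minimum
# (-2, -5): [[-12, 0], [0, -30]] -> negative definite, local maximum
```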