5.4 Constrained Optimization and Lagrange Multipliers

 


In unconstrained optimization we considered the problem of finding the minimum of a function

$\min_x \quad f(x)$ where $f : \mathbb{R}^D \to \mathbb{R}$

In constrained optimization we have additional constraints. That is, for real-valued functions $g_i : \mathbb{R}^D \to \mathbb{R}$, $i = 1,\ldots,m$, we consider the constrained optimization problem

$\min_x \quad f(x)$
subject to $g_i(x) \le 0$ for all $i = 1,\ldots,m$.

In general, the functions $f$ and $g_i$ could be non-convex.




One obvious, but not very practical, way of converting the constrained problem into an unconstrained one is to use an indicator function

$J(x)=f(x)+ \sum_{i=1}^{m} 1(g_i(x))$

where $1(z)$ is an infinite step function
$1(z)=\begin{cases}
0 & \text{if } z \le 0\\
\infty & \text{otherwise}
\end{cases}$

This gives an infinite penalty if a constraint is violated and hence yields the same solution. However, this infinite step function is equally difficult to optimize. We can overcome this difficulty by introducing Lagrange multipliers. The idea of Lagrange multipliers is to replace the step function with a linear function.

We associate with this problem the Lagrangian, introducing a Lagrange multiplier $\lambda_i \ge 0$ corresponding to each inequality constraint, so that

$L(x,\lambda)=f(x)+\sum_{i=1}^m \lambda_i g_i(x)=f(x)+\lambda^T g(x)$

where the constraints are collected into the vector $g(x) = [g_1(x),\ldots,g_m(x)]^T$ and the multipliers into $\lambda \in \mathbb{R}^m$.

We now introduce the idea of Lagrangian duality. In general, duality in optimization is the idea of converting an optimization problem in one set of variables $x$ (called the primal variables) into another optimization problem in a different set of variables $\lambda$ (called the dual variables). We introduce two different approaches to duality:
Lagrangian duality and Legendre–Fenchel duality.

Lagrangian Duality
The problem
$\min_x \quad f(x)$
subject to $g_i(x) \le 0$ for all $i = 1,\ldots,m$
is known as the primal problem and corresponds to the primal variable $x$. The associated Lagrangian dual problem is given by

$\max \quad D(\lambda),\quad \lambda \in \mathbb{R}^m$
subject to $\lambda \ge 0$

where $\lambda$ are the dual variables and $D(\lambda)=\min_{x \in \mathbb{R}^D} L(x,\lambda)$

Assuming that $f(\cdot)$ and $g(\cdot)$ are differentiable, we find the Lagrangian dual problem by differentiating the Lagrangian with respect to $x$, setting the derivative to zero, and solving for the optimal value.
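As a minimal numerical sketch of this recipe (assuming numpy and scipy are available; the toy problem $\min x^2$ subject to $1-x \le 0$ is an illustration chosen here, not taken from the text above), the dual function can be evaluated by an inner minimization over $x$ and then maximized over $\lambda \ge 0$:

```python
# Sketch: evaluating the dual function D(lambda) = min_x L(x, lambda)
# for the toy problem  min x^2  subject to  1 - x <= 0  (i.e. x >= 1).
# Analytically: L = x^2 + lam*(1 - x) is minimized at x = lam/2, so
# D(lam) = lam - lam^2/4, maximized at lam = 2 with D = 1,
# matching the primal optimum f(1) = 1 (no duality gap for this convex problem).
import numpy as np
from scipy.optimize import minimize_scalar

def f(x):
    return x**2

def g(x):               # constraint g(x) <= 0
    return 1.0 - x

def D(lam):             # dual function: inner minimization over x
    res = minimize_scalar(lambda x: f(x) + lam * g(x))
    return res.fun

lams = np.linspace(0.0, 4.0, 401)
vals = [D(l) for l in lams]
best = lams[int(np.argmax(vals))]
print(f"dual maximum ~ {max(vals):.4f} at lambda ~ {best:.2f}")  # ~1.0 at ~2.0
```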

Equality constraints
Consider
$\min_x \quad f(x)$
subject to $g_i(x) \le 0$ for all $i = 1,\ldots,m$
and $h_j(x) = 0$ for all $j=1,\ldots,n$.
We can model equality constraints by replacing each of them with two inequality constraints.
That is, for each equality constraint $h_j(x) = 0$ we equivalently replace it by the two constraints $h_j(x) \le 0$ and $h_j(x) \ge 0$. It turns out that the resulting Lagrange multipliers are then unconstrained. Therefore, we constrain the Lagrange multipliers corresponding to the inequality constraints to be non-negative, and leave the Lagrange multipliers corresponding to the equality constraints unconstrained:
$L(x,\lambda,v)=f(x)+\sum_{i=1}^{m} \lambda_i g_i(x)+ \sum_{j=1}^{n} v_j h_j(x)$
The optimal value of the dual problem is less than or equal to the optimal value of the primal problem (weak duality):

$D(\lambda,v) = \inf_{x} L(x,\lambda,v) \le p^*$, where $p^*$ is the optimal value of the primal problem.
Example:
Least squares problem (find the dual)
Minimize $\|x\|^2$
subject to $Ax=b$
$L(x,v)=\|x\|^2+\sum_i v_i (Ax-b)_i$
$L(x,v)=x^Tx+v^T(Ax-b)$
$\nabla_x L(x,v)=2x + A^Tv=0$
$x^\star=-\frac{1}{2}A^Tv$
So the Lagrangian dual is
$D(v)=L(x^\star,v)=\frac{1}{4}v^TAA^Tv-\frac{1}{2}v^TAA^Tv-v^Tb$
$\qquad =-\frac{1}{4}v^TAA^Tv-v^Tb$
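A quick numerical check of this derivation (a sketch assuming numpy; the random $A$ and $b$ below are arbitrary illustrations):

```python
# Verify the least-squares dual numerically: the dual maximizer v* of
# D(v) = -(1/4) v^T A A^T v - v^T b  satisfies  A A^T v* = -2b,
# and x* = -(1/2) A^T v* should equal the minimum-norm solution of Ax = b.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 5))   # 3 equality constraints, 5 variables
b = rng.standard_normal(3)

v_star = np.linalg.solve(A @ A.T, -2.0 * b)   # stationarity of D(v)
x_star = -0.5 * A.T @ v_star                  # recover primal from dual
x_pinv = np.linalg.pinv(A) @ b                # minimum-norm solution directly

print(np.allclose(x_star, x_pinv))            # True
print(np.allclose(A @ x_star, b))             # constraint satisfied
# Strong duality: the dual optimum equals the primal optimum ||x*||^2
D_opt = -0.25 * v_star @ A @ A.T @ v_star - v_star @ b
print(np.isclose(D_opt, x_star @ x_star))     # True
```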

Example:
Dual of a general LP.
Find the dual function of the LP 
minimize $c^Tx$ 
subject to 
$Ax=b$ 
$x \ge 0$
The Lagrangian is
$L(x, \lambda, v) = c^T x + \lambda^T(-x) + v^T (Ax - b)$
$L(x, \lambda, v) = c^Tx - \lambda^Tx + v^TAx - v^Tb$
$L(x, \lambda, v) = (c^T-\lambda^T+v^TA)x - v^Tb$
The dual problem is
maximize $D(\lambda,v)$ subject to $\lambda \ge 0$.
Since the Lagrangian is linear in $x$, there are two cases:
$D(\lambda,v)=-v^Tb \qquad \text{if} \quad c^T-\lambda^T+v^TA=0$
$D(\lambda,v)=-\infty \qquad \text{if} \quad c^T-\lambda^T+v^TA \ne 0$ (the Lagrangian is then unbounded below in $x$)
So the minimization LP becomes a maximization problem in standard LP form:
Maximize $-b^Tv$
subject to
$c+A^Tv \ge 0$
(this is just the condition $\lambda = c+A^Tv$ together with $\lambda \ge 0$).
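This primal–dual pair is easy to check numerically; a minimal sketch with scipy.optimize.linprog (the small LP below is an arbitrary illustration):

```python
# Check LP duality numerically: solve the primal
#   min c^T x  s.t.  Ax = b, x >= 0
# and the dual, rewritten in minimization form as
#   min b^T v  s.t.  -A^T v <= c      (i.e.  A^T v + c >= 0).
# Strong duality: primal optimum = -(dual minimum).
import numpy as np
from scipy.optimize import linprog

c = np.array([1.0, 2.0])
A = np.array([[1.0, 1.0]])
b = np.array([1.0])

primal = linprog(c, A_eq=A, b_eq=b, bounds=(0, None))
dual = linprog(b, A_ub=-A.T, b_ub=c, bounds=(None, None))

print(primal.fun)        # 1.0   (x = [1, 0])
print(-dual.fun)         # 1.0   (v = [-1], value of maximize -b^T v)
```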

Example:
Minimize the function $f(x,y)=x^2+y^2$
Subject to the constraint $x+2y-5=0$
$L(x,y,\lambda)=f(x,y)+\lambda g(x,y)$
$L(x,y,\lambda)=x^2+y^2+\lambda (x+2y-5)$
$\frac{\partial L}{\partial x}=0,\quad 2x+\lambda=0$
$\frac{\partial L}{\partial y}=0,\quad 2y+2\lambda=0$
$\frac{\partial L}{\partial \lambda}=0,\quad x+2y-5=0$
From equation 1, $\lambda=-2x$, and substituting into equation 2 gives $y=-\lambda=2x$.
From equation 3,
$x+2y-5=0$
$x+4x-5=0$
$x=1$, so $y=2$
So the minimum is $f(x,y)=x^2+y^2=f(1,2)=1^2+2^2=5$
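Stationarity systems like this one are easy to hand to a computer algebra system; a minimal sympy sketch of this example (the same pattern can be used to verify the remaining worked examples below):

```python
# Solve the stationarity conditions of L(x, y, lam) = x^2 + y^2 + lam*(x + 2y - 5)
# symbolically and evaluate f at the critical point.
import sympy as sp

x, y, lam = sp.symbols('x y lam', real=True)
f = x**2 + y**2
g = x + 2*y - 5
L = f + lam * g

sols = sp.solve([sp.diff(L, v) for v in (x, y, lam)], [x, y, lam], dict=True)
for s in sols:
    print(s, 'f =', f.subs(s))   # {x: 1, y: 2, lam: -2} f = 5
```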
Example:
Minimize the function
$f(x,y)=x^2-10x+y^2-14y+28$
subject to the constraint $x+y=8$
$L(x,y,\lambda)=f(x,y)+\lambda g(x,y)$
$L(x,y,\lambda)=x^2-10x+y^2-14y+28+\lambda (x+y-8)$
$\frac{\partial L}{\partial x}=0,\quad 2x-10+\lambda=0$
$\frac{\partial L}{\partial y}=0,\quad 2y-14+\lambda=0$
$\frac{\partial L}{\partial \lambda}=0,\quad x+y-8=0$
Subtracting equation 2 from equation 1:
$2x-2y+4=0$
$x-y=-2$
$y=x+2$
Using equation 3:
$x+y-8=0$
$x+x+2-8=0$
$x=3$
so $y=5$
The minimum value is $f(3,5)=-38$.
Example:
Use the method of Lagrange multipliers to find the point that represents the minimum value of
$f(x,y)=x^2+y^2$ subject to the constraint $x+4y=20$

$L(x,y,\lambda)=f(x,y)+\lambda g(x,y)$
$L(x,y,\lambda)=x^2+y^2+\lambda (x+4y-20)$
$\frac{\partial L}{\partial x}=0,\quad 2x+\lambda=0$
$\frac{\partial L}{\partial y}=0,\quad 2y+4\lambda=0$
$\frac{\partial L}{\partial \lambda}=0,\quad x+4y-20=0$
From equations 1 and 2, $\lambda=-2x$ and $y=-2\lambda=4x$.
Using equation 3:
$x+16x-20=0$
$x=\frac{20}{17}$, so $y=\frac{80}{17}$
The minimum point is $(20/17, 80/17)$ and the minimum value is $f(20/17,80/17)=\frac{6800}{289}=\frac{400}{17}$.
Example:
Maximize $f(x,y)=2x+2xy+y$ subject to the constraint $2x+y=100$
$L(x,y,\lambda)=f(x,y)+\lambda g(x,y)$
$L(x,y,\lambda)=2x+2xy+y+\lambda (2x+y-100)$
$\frac{\partial L}{\partial x}=0,\quad 2+2y+2\lambda=0$
$\frac{\partial L}{\partial y}=0,\quad 2x+1+\lambda=0$
$\frac{\partial L}{\partial \lambda}=0,\quad 2x+y-100=0$
From equation 1, $\lambda=-(1+y)$; from equation 2, $\lambda=-(2x+1)$. Equating the two gives $y=2x$.
From equation 3, $2x+2x-100=0$, so
$x=25$
and
$y=50$
The maximum point is $(25,50)$ and the value is $f(25,50)=50+2500+50=2600$.

Example:
Maximize the function $f(x,y)=2xy$ subject to the constraint $x+y=6$

$L(x,y,\lambda)=f(x,y)+\lambda g(x,y)$
$L(x,y,\lambda)=2xy+\lambda (x+y-6)$
$\frac{\partial L}{\partial x}=0,\quad 2y+\lambda=0$
$\frac{\partial L}{\partial y}=0,\quad 2x+\lambda=0$
$\frac{\partial L}{\partial \lambda}=0,\quad x+y-6=0$
from the first two equations
$2y-2x=0$
$y=x$
from equation 3
$2x-6=0$
$x=3$
So  $y=3$
So the maximum occurs at $(3,3)$ and the value is $f(3,3)=2 \cdot 3 \cdot 3=18$
Example:
Optimize $f(x,y)=xy+1$ subject to the constraint
$g(x,y)=x^2+y^2-1=0$
$L(x,y,\lambda)=f(x,y)+\lambda g(x,y)$
$L(x,y,\lambda)=xy+1+\lambda (x^2+y^2-1)$
$\frac{\partial L}{\partial x}=0,\quad y+\lambda 2x=0$
$\frac{\partial L}{\partial y}=0,\quad x+\lambda 2y=0$
$\frac{\partial L}{\partial \lambda}=0,\quad x^2+y^2-1=0$
From the first equation (assuming $\lambda \ne 0$; $\lambda=0$ would force $x=y=0$, which violates the constraint)
$x=\frac{-y}{2 \lambda}$
Substituting into the second:
$\frac{-y}{2 \lambda}+\lambda 2y=0$
$-y+4 \lambda^2 y=0$
$y(4\lambda^2-1)=0$
This means that $y=0$ or $4\lambda^2-1=0$.
If $y=0$, the second equation forces $x=0$, which contradicts equation 3, so this branch yields no critical point.
Therefore
$\lambda= \pm \frac{1}{2}$
From the first equation, $\lambda=-\frac{1}{2}$ gives $y=x$ and $\lambda=\frac{1}{2}$ gives $y=-x$.
Using equation 3:
$x^2+x^2-1=0$
$2x^2=1$
$x=\pm \frac{1}{\sqrt{2}}$
So we have four points
$(\frac{1}{\sqrt{2}},\frac{1}{\sqrt{2}}),(\frac{1}{\sqrt{2}},\frac{-1}{\sqrt{2}}),(\frac{-1}{\sqrt{2}},\frac{1}{\sqrt{2}}),(\frac{-1}{\sqrt{2}},\frac{-1}{\sqrt{2}})$

$f(x,y)=f(\frac{1}{\sqrt{2}},\frac{1}{\sqrt{2}})=\frac{1}{4}+1=\frac{5}{4}$
$f(x,y)=f(\frac{1}{\sqrt{2}},\frac{-1}{\sqrt{2}})=\frac{-1}{4}+1=\frac{3}{4}$
$f(x,y)=f(\frac{-1}{\sqrt{2}},\frac{1}{\sqrt{2}})=\frac{-1}{4}+1=\frac{3}{4}$
$f(x,y)=f(\frac{-1}{\sqrt{2}},\frac{-1}{\sqrt{2}})=\frac{1}{4}+1=\frac{5}{4}$
$f(x,y)=f(1,0)=1$
$f(x,y)=f(-1,0)=1$
From this maximum and minimum can be found very easily.
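Because the constraint is the unit circle, the parameterization $x=\cos t$, $y=\sin t$ gives an independent check, since then $f = 1 + \frac{1}{2}\sin 2t$ (a sketch assuming numpy):

```python
# Independent check for f(x, y) = xy + 1 on the unit circle:
# parameterize x = cos(t), y = sin(t), so f = 1 + 0.5*sin(2t).
import numpy as np

t = np.linspace(0.0, 2.0 * np.pi, 100001)
f = np.cos(t) * np.sin(t) + 1.0

print(f.max())   # ~1.5 = 3/2, at t = pi/4 and 5pi/4, i.e. ±(1/√2, 1/√2)
print(f.min())   # ~0.5 = 1/2, at t = 3pi/4 and 7pi/4, i.e. ±(-1/√2, 1/√2)
```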

Example:
Maximize the function $f(x, y, z) = xy + yz + xz$ on the unit sphere $g(x, y, z) = x^2 + y^2 +z^2=1$
$L(x,y,z,\lambda)=xy+yz+xz+\lambda(x^2+y^2+z^2-1)$
$\frac{\partial L}{\partial x}=0,\quad y+z+\lambda 2x=0$
$\frac{\partial L}{\partial y}=0,\quad x+z+\lambda 2y=0$
$\frac{\partial L}{\partial z}=0,\quad y+x+\lambda 2z=0$
$\frac{\partial L}{\partial \lambda}=0,\quad x^2+y^2+z^2-1=0$
Subtracting the first equation from the second:
$x-2\lambda x-y+2\lambda y=0$
$x(1-2 \lambda)-y(1-2 \lambda)=0$
$(1-2 \lambda)(x-y)=0$
When $x-y=0$ we have $x=y$. Applying the same argument to the second and third equations gives $(1-2\lambda)(y-z)=0$, so if $\lambda \ne \frac{1}{2}$ we also get $y=z$. With $x=y=z$, equation 4 gives
$3x^2=1$
$x=y=z=\pm \frac{1}{\sqrt{3}}$
and at both of these points
$f(x,y,z)=3 \cdot \frac{1}{3}=1$
When $\lambda = \frac{1}{2}$, adding the first three equations gives $(2+2\lambda)(x+y+z)=3(x+y+z)=0$, so $z=-x-y$. Then
$f(x,y,z)=xy+y(-x-y)+x(-x-y)$
$f(x,y,z)=-x^2-xy-y^2$
and the constraint becomes $x^2+y^2+(x+y)^2=1$, i.e. $x^2+xy+y^2=\frac{1}{2}$, so $f=-\frac{1}{2}$ on this entire family of points.
Hence the maximum value is $1$, attained at $\pm \frac{1}{\sqrt{3}}(1,1,1)$, and the minimum value is $-\frac{1}{2}$.
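Since $f$ is a quadratic form, its extremes on the unit sphere are the extreme eigenvalues of the associated symmetric matrix (the stationarity conditions above read exactly as an eigenvalue problem), which gives a one-line numerical check (a sketch assuming numpy):

```python
# f(x, y, z) = xy + yz + zx = p^T S p with S the symmetric matrix below.
# On the unit sphere the extremes of a quadratic form are the extreme
# eigenvalues of S: the Lagrange conditions grad f + lam * grad g = 0
# read S p = -lam * p, an eigenvalue problem.
import numpy as np

S = 0.5 * (np.ones((3, 3)) - np.eye(3))   # zeros on diagonal, 1/2 off-diagonal
w = np.linalg.eigvalsh(S)
print(w)            # [-0.5, -0.5, 1.0]: minimum -1/2, maximum 1
```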

Example:
Find the extreme values of the function $f(x,y)=x^2+2y^2$ on the circle $x^2+y^2=1.$
Using Lagrange multipliers, 
$f_x = \lambda g_x$ 
$f_y = \lambda g_y$

which, together with the constraint, results in the following equations:

$2x=2\lambda x \qquad (1)$
$4y=2\lambda y \qquad (2)$
$x^2+y^2=1 \qquad (3)$

From equation (1), either $x=0$ or $\lambda=1$.
From equation (3), when $x=0$, $y=\pm 1$.
This gives the candidate points $(0,1)$ and $(0,-1)$.
When $\lambda=1$, equation (2) gives $y=0$,
which gives the candidate points $(1,0)$ and $(-1,0)$.

Evaluating $f$ at these four points, we find that 
$ f(0,1)=2 $
$ f(0,-1)=2 $
$ f(1,0)=1 $
$ f(-1,0)=1$
Therefore the maximum value of $f$ on the circle $x^2+y^2=1$ is $f(0,\pm 1)=2$ and the minimum value is $f(\pm 1,0)=1$.
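The same system can also be solved symbolically in the $\nabla f = \lambda \nabla g$ form; a minimal sympy sketch that recovers all four candidate points:

```python
# Solve  2x = 2*lam*x,  4y = 2*lam*y,  x^2 + y^2 = 1  symbolically;
# sympy returns all the branches found by hand above.
import sympy as sp

x, y, lam = sp.symbols('x y lam', real=True)
f = x**2 + 2*y**2
eqs = [2*x - 2*lam*x, 4*y - 2*lam*y, x**2 + y**2 - 1]
for s in sp.solve(eqs, [x, y, lam], dict=True):
    print(s, 'f =', f.subs(s))
# (±1, 0) with lam = 1 -> f = 1 (minimum); (0, ±1) with lam = 2 -> f = 2 (maximum)
```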

Example:
Find the extreme values of $f(x,y,z)=x^2+2y^2+3z^2$ on the unit sphere $x^2+y^2+z^2=1$.

The Lagrangian is

$L(x,y,z,\lambda)=x^2+2y^2+3z^2+\lambda(x^2+y^2+z^2-1)$

Setting the gradient to zero:

$\frac{\partial L}{\partial x}=0,\quad 2x+2\lambda x=0$
$\frac{\partial L}{\partial y}=0,\quad 4y+2\lambda y=0$
$\frac{\partial L}{\partial z}=0,\quad 6z+2\lambda z=0$
$\frac{\partial L}{\partial \lambda}=0,\quad x^2+y^2+z^2-1=0$

The first three equations read $x(1+\lambda)=0$, $y(2+\lambda)=0$ and $z(3+\lambda)=0$; no single value of $\lambda$ makes more than one factor vanish, so exactly one coordinate is nonzero. This gives the solutions $(\pm 1,0,0)$ with $f=1$, $(0,\pm 1,0)$ with $f=2$, and $(0,0,\pm 1)$ with $f=3$. So the minimum value is $1$ and the maximum value is $3$.


Example:

Find the extrema of $f(x, y) = x$ subject to $g(x, y) = x^2 + 2y^2 =3.$

$f_x=\lambda g_x$
$1=\lambda 2x$
$x=\frac{1}{2\lambda}$

$f_y=\lambda g_y$
$0=\lambda 4y$
Since $1=2\lambda x$ forces $\lambda \ne 0$, we must have $y=0$.

When $y=0$, the constraint gives $x=\pm \sqrt{3}$.
So
$f(\sqrt{3},0)=\sqrt{3}$
$f(-\sqrt{3},0)=-\sqrt{3}$

so the maximum is $\sqrt{3}$ and the minimum is $-\sqrt{3}$

Example:
Find the maximum value of $f(x,y,z) = xyz$ given that $g(x,y,z) = x + y + z = 3$ and $x, y, z \ge 0$.
$L(x,y,z,\lambda)=xyz+\lambda(x+y+z-3)$
$\frac{\partial L}{\partial x}=0,\quad yz+\lambda =0$
$\frac{\partial L}{\partial y}=0,\quad xz+\lambda =0$
$\frac{\partial L}{\partial z}=0,\quad xy+\lambda =0$
$\frac{\partial L}{\partial \lambda}=0,\quad x+y+z-3=0$
The first three equations imply that
$yz=xz=xy$
(each equals $-\lambda$). If any coordinate is zero then $f=0$, which cannot be the maximum, so assume $x,y,z>0$.
From $xz=yz$:
$z(x-y)=0$
so
$x=y$
Similarly, from $xz=xy$:
$x(z-y)=0$
so $z=y$
This implies that $x=y=z$.
Substituting into equation 4, $x+y+z-3=0$:
$3x-3=0$
So $x=1,y=1,z=1$ gives the maximum and the value is $f(1,1,1)=1 \cdot 1 \cdot 1=1$
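For problems like this one, a general-purpose constrained solver gives a quick numerical confirmation; a minimal sketch with scipy.optimize.minimize using SLSQP (the starting point below is an arbitrary feasible guess):

```python
# Numerical confirmation: maximize xyz subject to x + y + z = 3, x, y, z >= 0,
# by minimizing -xyz with an equality constraint and nonnegativity bounds.
import numpy as np
from scipy.optimize import minimize

res = minimize(
    lambda p: -p[0] * p[1] * p[2],                 # maximize xyz
    x0=np.array([0.5, 1.0, 1.5]),                  # arbitrary feasible start
    method='SLSQP',
    bounds=[(0, None)] * 3,
    constraints=[{'type': 'eq', 'fun': lambda p: p.sum() - 3.0}],
)
print(res.x, -res.fun)   # ~[1, 1, 1], ~1.0
```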
