The generalization of the derivative to functions of several variables is the gradient. Here the function $f$ depends on one or more variables $x \in \mathbb{R}^n$, e.g., $f(x) = f(x_1, x_2)$.
We find the gradient of the function $f$ with respect to $x$ by varying one variable at a time and keeping the others constant. The gradient is then the collection of these partial derivatives.
Definition (Partial Derivative)
For a function $f : \mathbb{R}^n \to \mathbb{R}$, $x \mapsto f(x)$, $x \in \mathbb{R}^n$, of $n$ variables $x_1, x_2, \ldots, x_n$, we define the partial derivatives as

$$\frac{\partial f}{\partial x_1} = \lim_{h \to 0} \frac{f(x_1 + h, x_2, \ldots, x_n) - f(x)}{h}$$

$$\vdots$$

$$\frac{\partial f}{\partial x_n} = \lim_{h \to 0} \frac{f(x_1, \ldots, x_{n-1}, x_n + h) - f(x)}{h}$$

and collect them in a row vector. This row vector is called the gradient of $f$, or the Jacobian, and is the generalization of the single-variable derivative.
Example:

If $f(x_1, x_2) = x_1^2 x_2 + x_1 x_2^3 \in \mathbb{R}$, then the partial derivatives of $f$ with respect to $x_1$ and $x_2$ are

$$\frac{\partial f(x_1, x_2)}{\partial x_1} = 2x_1 x_2 + x_2^3$$

$$\frac{\partial f(x_1, x_2)}{\partial x_2} = x_1^2 + 3x_1 x_2^2$$

and the gradient is then

$$\frac{df}{dx} = \begin{bmatrix} \dfrac{\partial f(x_1, x_2)}{\partial x_1} & \dfrac{\partial f(x_1, x_2)}{\partial x_2} \end{bmatrix} = \begin{bmatrix} 2x_1 x_2 + x_2^3 & x_1^2 + 3x_1 x_2^2 \end{bmatrix} \in \mathbb{R}^{1 \times 2}$$
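For a quick check, the same gradient can be computed symbolically; here is a minimal sketch using sympy (the use of sympy is our choice, not part of the original example):

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
f = x1**2 * x2 + x1 * x2**3

# Collect the partial derivatives into a 1 x 2 row vector (the gradient/Jacobian)
grad = sp.Matrix([[sp.diff(f, x1), sp.diff(f, x2)]])
print(grad)  # Matrix([[2*x1*x2 + x2**3, x1**2 + 3*x1*x2**2]])
```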
The gradient corresponds to the direction of steepest ascent (and its negative to steepest descent). Each component of the gradient tells you how fast the function is changing with respect to the corresponding standard basis direction. It's not too far-fetched, then, to wonder how fast the function might be changing with respect to some arbitrary direction. Letting $\vec{v}$ denote a unit vector, we can project along this direction in the natural way, namely via the dot product $\nabla f(a) \cdot \vec{v}$. This is a fairly common definition of the directional derivative.
We can then ask: in what direction is this quantity maximal? You'll recall that

$$\nabla f(a) \cdot \vec{v} = |\nabla f(a)|\,|\vec{v}|\cos(\theta)$$

Since $\vec{v}$ is a unit vector, this reduces to $|\nabla f(a)|\cos(\theta)$, which is maximal when $\cos(\theta) = 1$, in particular when $\vec{v}$ points in the same direction as $\nabla f(a)$.
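A quick numerical illustration of this fact, using the gradient from the example above (the sweep over directions is our own construction):

```python
import numpy as np

# Gradient of f(x1, x2) = x1^2*x2 + x1*x2^3 from the example above
def grad_f(x1, x2):
    return np.array([2*x1*x2 + x2**3, x1**2 + 3*x1*x2**2])

g = grad_f(1.0, 2.0)

# Sweep unit vectors v(theta) and record the directional derivative grad . v
thetas = np.linspace(0, 2*np.pi, 3600)
dirs = np.stack([np.cos(thetas), np.sin(thetas)], axis=1)
directional = dirs @ g

# The maximizing direction agrees (numerically) with the normalized gradient
print(dirs[np.argmax(directional)])
print(g / np.linalg.norm(g))
```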
Basic Rules of Partial Differentiation
Product rule:

$$\frac{\partial}{\partial x}\big(f(x)g(x)\big) = \frac{\partial f}{\partial x}\,g(x) + f(x)\,\frac{\partial g}{\partial x}$$

Sum rule:

$$\frac{\partial}{\partial x}\big(f(x) + g(x)\big) = \frac{\partial f}{\partial x} + \frac{\partial g}{\partial x}$$

Chain rule:

$$\frac{\partial}{\partial x}(g \circ f)(x) = \frac{\partial}{\partial x}\big(g(f(x))\big) = \frac{\partial g}{\partial f}\,\frac{\partial f}{\partial x}$$
Example (partial derivatives using the chain rule):

If $f(x, y) = (x + 2y^3)^2$, we obtain the partial derivatives

$$\frac{\partial f(x, y)}{\partial x} = 2(x + 2y^3)\,\frac{\partial}{\partial x}(x + 2y^3) = 2(x + 2y^3)$$

$$\frac{\partial f(x, y)}{\partial y} = 2(x + 2y^3)\,\frac{\partial}{\partial y}(x + 2y^3) = 12(x + 2y^3)y^2$$
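A short symbolic check of these two results (sympy applies the chain rule internally; comparing against the manual answers is our addition):

```python
import sympy as sp

x, y = sp.symbols('x y')
f = (x + 2*y**3)**2

# Differences expand to 0 when the manual chain-rule results are correct
print(sp.expand(sp.diff(f, x) - 2*(x + 2*y**3)))        # 0
print(sp.expand(sp.diff(f, y) - 12*(x + 2*y**3)*y**2))  # 0
```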
Chain Rule
The chain rule is widely used in machine learning, especially in neural networks during backpropagation.
Consider a function $f : \mathbb{R}^2 \to \mathbb{R}$ of two variables $x_1, x_2$, where $x_1(t)$ and $x_2(t)$ are themselves functions of $t$. To compute the gradient of $f$ with respect to $t$, we apply the chain rule for multivariate functions:
$$\frac{df}{dt} = \begin{bmatrix} \dfrac{\partial f}{\partial x_1} & \dfrac{\partial f}{\partial x_2} \end{bmatrix} \begin{bmatrix} \dfrac{\partial x_1(t)}{\partial t} \\[4pt] \dfrac{\partial x_2(t)}{\partial t} \end{bmatrix} = \frac{\partial f}{\partial x_1}\frac{\partial x_1}{\partial t} + \frac{\partial f}{\partial x_2}\frac{\partial x_2}{\partial t}$$
where $d$ denotes the total derivative and $\partial$ denotes partial derivatives.
Example:
Consider $f(x_1, x_2) = x_1^2 + 2x_2$, where $x_1 = \sin(t)$ and $x_2 = \cos(t)$. Then

$$\frac{df}{dt} = \frac{\partial f}{\partial x_1}\frac{\partial x_1}{\partial t} + \frac{\partial f}{\partial x_2}\frac{\partial x_2}{\partial t}$$

$$= 2x_1\cos(t) + 2\cdot(-\sin(t))$$

$$= 2\sin(t)\cos(t) - 2\sin(t)$$

$$= 2\sin(t)\big(\cos(t) - 1\big)$$
is the corresponding derivative of f with respect to t.
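The same computation can be checked symbolically; a small sketch (our construction) comparing the chain-rule route with substituting first and differentiating directly:

```python
import sympy as sp

t = sp.symbols('t')
x1, x2 = sp.sin(t), sp.cos(t)

# Chain rule: (df/dx1)(dx1/dt) + (df/dx2)(dx2/dt) for f = x1^2 + 2*x2
df_dt = 2*x1*sp.diff(x1, t) + 2*sp.diff(x2, t)

# Direct route: substitute x1(t), x2(t) into f first, then differentiate
direct = sp.diff(x1**2 + 2*x2, t)

print(sp.simplify(df_dt - direct))  # 0, so the two routes agree
```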
If $f(x_1, x_2)$ is a function of $x_1$ and $x_2$, where $x_1(s,t)$ and $x_2(s,t)$ are themselves functions of two variables $s$ and $t$, the chain rule yields the partial derivatives
$$\frac{\partial f}{\partial s} = \frac{\partial f}{\partial x_1}\frac{\partial x_1}{\partial s} + \frac{\partial f}{\partial x_2}\frac{\partial x_2}{\partial s}$$

$$\frac{\partial f}{\partial t} = \frac{\partial f}{\partial x_1}\frac{\partial x_1}{\partial t} + \frac{\partial f}{\partial x_2}\frac{\partial x_2}{\partial t}$$

and the gradient is obtained by the matrix product

$$\frac{df}{d(s,t)} = \frac{df}{dx}\frac{dx}{d(s,t)} = \begin{bmatrix} \dfrac{\partial f}{\partial x_1} & \dfrac{\partial f}{\partial x_2} \end{bmatrix} \begin{bmatrix} \dfrac{\partial x_1}{\partial s} & \dfrac{\partial x_1}{\partial t} \\[4pt] \dfrac{\partial x_2}{\partial s} & \dfrac{\partial x_2}{\partial t} \end{bmatrix}$$
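As a sanity check of the matrix form, here is a sketch with a concrete, hypothetical choice of inner functions $x_1(s,t) = st$ and $x_2(s,t) = s + t$:

```python
import sympy as sp

s, t = sp.symbols('s t')
x1, x2 = s*t, s + t          # hypothetical inner functions, chosen for illustration
f = x1**2 + 2*x2

# Row vector df/dx times the 2 x 2 Jacobian dx/d(s,t)
df_dx = sp.Matrix([[2*x1, 2]])
dx_dst = sp.Matrix([[sp.diff(x1, s), sp.diff(x1, t)],
                    [sp.diff(x2, s), sp.diff(x2, t)]])
chain = df_dx * dx_dst

# Compare with differentiating the composed f(s, t) directly
direct = sp.Matrix([[sp.diff(f, s), sp.diff(f, t)]])
print(sp.simplify(chain - direct))  # Matrix([[0, 0]])
```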
Example Problems

Find the gradient of $f(x,y) = x^2 y$ at the point $(3,2)$.

The gradient is just the vector of partial derivatives:

$$\frac{\partial f}{\partial x} = 2xy \qquad \frac{\partial f}{\partial y} = x^2$$

The gradient is

$$\nabla f = \begin{bmatrix} 2xy & x^2 \end{bmatrix}$$

The gradient at $(3,2)$ is

$$\nabla f(3,2) = \begin{bmatrix} 12 & 9 \end{bmatrix}$$
Let $f(x,y,z) = xy\,e^{x^2 + z^2 - 5}$. Calculate the gradient of $f$ at the point $(1, 3, -2)$.

$$\nabla f(x,y,z) = \begin{bmatrix} \dfrac{\partial f}{\partial x} & \dfrac{\partial f}{\partial y} & \dfrac{\partial f}{\partial z} \end{bmatrix}$$

$$\frac{\partial f}{\partial x} = y\big(x\,e^{x^2+z^2-5}\cdot 2x + e^{x^2+z^2-5}\big) = (y + 2x^2 y)\,e^{x^2+z^2-5}$$

$$\frac{\partial f}{\partial y} = x\,e^{x^2+z^2-5}$$

$$\frac{\partial f}{\partial z} = 2xyz\,e^{x^2+z^2-5}$$

Therefore

$$\nabla f(1, 3, -2) = \begin{bmatrix} 9 & 1 & -12 \end{bmatrix}$$
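The same numbers fall out of a finite-difference approximation of the limit definition; a minimal numpy sketch (the step size is our choice):

```python
import numpy as np

def f(p):
    x, y, z = p
    return x * y * np.exp(x**2 + z**2 - 5)

def num_grad(p, h=1e-6):
    # Central finite differences, one coordinate at a time
    p = np.asarray(p, dtype=float)
    g = np.zeros_like(p)
    for i in range(len(p)):
        e = np.zeros_like(p)
        e[i] = h
        g[i] = (f(p + e) - f(p - e)) / (2 * h)
    return g

print(num_grad([1, 3, -2]))  # approximately [ 9.  1. -12.]
```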
Let $g(x,y) = \dfrac{x^2 y}{x^2 + y^2}$ if $(x,y) \neq (0,0)$, with $g(0,0) = 0$. Find the partial derivatives of $g$ at $(0,0)$.

Since $g$ is defined piecewise at the origin, we use the limit definition. Along the $x$-axis, $g(h, 0) = 0$ for all $h$, so

$$\frac{\partial g}{\partial x}(0,0) = \lim_{h \to 0}\frac{g(h,0) - g(0,0)}{h} = 0$$

and by the same argument along the $y$-axis,

$$\frac{\partial g}{\partial y}(0,0) = 0$$
For the scalar function $f(x,y,z) = x^2 + 3y^2 + 2z^2$, find the gradient and its magnitude at the point $(1, 2, -1)$. (university question)

$$\frac{\partial f}{\partial x} = 2x \qquad \frac{\partial f}{\partial y} = 6y \qquad \frac{\partial f}{\partial z} = 4z$$

The gradient is

$$\nabla f(x,y,z) = \begin{bmatrix} 2x & 6y & 4z \end{bmatrix}$$

Therefore

$$\nabla f(1, 2, -1) = \begin{bmatrix} 2 & 12 & -4 \end{bmatrix}$$

and its magnitude is $|\nabla f(1, 2, -1)| = \sqrt{2^2 + 12^2 + (-4)^2} = \sqrt{164} = 2\sqrt{41}$.
Suppose you were trying to minimize $f(x,y) = x^2 + 2y + 2y^2$. Along what vector should you travel from $(5, 12)$?

To minimize, we should travel in the negative direction of the gradient.

$$\frac{\partial f}{\partial x} = 2x \qquad \frac{\partial f}{\partial y} = 2 + 4y$$

$$\nabla f(x,y) = \begin{bmatrix} 2x & 2 + 4y \end{bmatrix}$$

The gradient at $(5, 12)$ is

$$\nabla f(5, 12) = \begin{bmatrix} 10 & 50 \end{bmatrix}$$

So to minimize $f$, travel in the direction $-\begin{bmatrix} 10 & 50 \end{bmatrix}$.
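Following the negative gradient repeatedly is exactly gradient descent; a minimal sketch (the step size and iteration count are our choices):

```python
import numpy as np

def grad_f(x, y):
    return np.array([2*x, 2 + 4*y])

p = np.array([5.0, 12.0])
lr = 0.1  # step size, chosen for illustration

# Repeatedly step along the negative gradient
for _ in range(100):
    p = p - lr * grad_f(p[0], p[1])

print(p)  # approaches the minimizer (0, -0.5)
```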
A skier is on a mountain with equation $z = 100 - 0.4x^2 - 0.3y^2$, where $z$ denotes height.

(a) The skier is located at the point with $xy$-coordinates $(1, 1)$ and wants to ski downhill along the steepest possible path. In which direction (indicated by a vector $(a, b)$ in the $xy$-plane) should the skier begin skiing?
Solution:
The direction of the greatest rate of decrease is opposite to the direction of the gradient. Writing $g(x,y) = 100 - 0.4x^2 - 0.3y^2$,

$$\nabla g(x,y) = \begin{bmatrix} -0.8x & -0.6y \end{bmatrix} \qquad \nabla g(1,1) = \begin{bmatrix} -0.8 & -0.6 \end{bmatrix}$$

This gradient vector has magnitude $\sqrt{0.8^2 + 0.6^2} = 1$, so the unit vector in the opposite direction is

$$u = -\nabla g(1,1) = \begin{bmatrix} 0.8 & 0.6 \end{bmatrix}$$
(b) The skier begins skiing in the direction given by the $xy$-vector $(a, b)$ found in part (a), so the skier heads in a direction in space given by the vector $(a, b, c)$. Find the value of $c$.
Solution:
$$D_u g(1,1) = \nabla g(1,1) \cdot u = (-u) \cdot u = -1$$

gives the slope, which is the ratio of vertical change to horizontal change. In the direction of the vector $(a, b, c)$, this ratio is $\dfrac{c}{\sqrt{a^2 + b^2}}$. So

$$D_u g(1,1) = \frac{c}{\sqrt{a^2 + b^2}} = \frac{c}{1} = c$$

hence $c = -1$.
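A one-line numeric confirmation of the slope computation (our addition):

```python
import numpy as np

u = np.array([0.8, 0.6])           # downhill unit direction from part (a)
grad_g = np.array([-0.8, -0.6])    # gradient of g at (1, 1)

slope = grad_g @ u                 # directional derivative D_u g(1, 1)
print(slope)  # -1.0, so c = -1 and the skiing direction in space is (0.8, 0.6, -1)
```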
Find the direction of greatest increase of the function $f(x,y) = 4x^2 + y^2 + 2y$ at the point $P(1, 2)$. (university question)

$$\frac{\partial f}{\partial x} = 8x \qquad \frac{\partial f}{\partial y} = 2y + 2$$

$$\nabla f(x,y) = \begin{bmatrix} 8x & 2y + 2 \end{bmatrix}$$

The gradient at $(1, 2)$ is

$$\nabla f(1, 2) = \begin{bmatrix} 8 & 6 \end{bmatrix} = 8\mathbf{i} + 6\mathbf{j}$$
Find the partial derivatives and gradient of the function $f(x,y,z) = \dfrac{x^5 e^{2z}}{y}$. (university question)

$$\frac{\partial f}{\partial x} = \frac{y \cdot 5x^4 e^{2z}}{y^2} = \frac{5x^4 e^{2z}}{y}$$

$$\frac{\partial f}{\partial y} = -\frac{x^5 e^{2z}}{y^2}$$

$$\frac{\partial f}{\partial z} = \frac{y \cdot 2x^5 e^{2z}}{y^2} = \frac{2x^5 e^{2z}}{y}$$

$$\nabla f(x,y,z) = \begin{bmatrix} \dfrac{5x^4 e^{2z}}{y} & -\dfrac{x^5 e^{2z}}{y^2} & \dfrac{2x^5 e^{2z}}{y} \end{bmatrix}$$
Find the gradient of the function $f(x,y) = x^2 + y^2$ at the point $(x,y) = (1, 5)$. (university question)

$$\frac{\partial f}{\partial x} = 2x \qquad \frac{\partial f}{\partial y} = 2y$$

$$\nabla f(x,y) = \begin{bmatrix} 2x & 2y \end{bmatrix}$$

The gradient at $(1, 5)$ is

$$\nabla f(1, 5) = \begin{bmatrix} 2 & 10 \end{bmatrix}$$