Skip to main content

2.13 Eigen decomposition and diagonalization

eigendecomposition (also known as eigenvalue decomposition, spectral decomposition, or diagonalization)

A diagonal matrix is a matrix that has value zero on all off-diagonal elements i.e., they are of the form
$D=\begin{bmatrix}
c_1 & \cdots & 0\\
\vdots&\ddots & \vdots \\
0& \cdots & c_n
\end{bmatrix}$
They allow fast computation of determinants, powers and inverses.The determinant is the product of its diagonal entries, a matrix power $D^k$ is given by each diagonal element raised to the power $k$, and the inverse $D^{-1}$ is the reciprocal of its diagonal elements, if all of them are nonzero.

Definition(Diagonalizable):A matrix $A \in \mathbb{R}^{n \times n}$ is diagonalizable, if it is similar to a diagonal matrix i.e., if there exists an invertable matrix $P \in \mathbb{R}^{n \times n}$ such that $D=P^{-1}AP$.

 Eigen decomposition can be done only on a square matrix.

let $A \in \mathbb{R}^{n \times n}$ and let $\lambda_1,\lambda_2,\ldots,\lambda_n$, be a set of scalars, and let $p_1,p_2,\ldots,p_n$ be set of vectors in $\mathbb{R}^n$.We define $P=[p_1,p_2,\ldots,p_n]$ and let $D \in \mathbb{R}^{n \times n}$ be a diagonal matrix with diagonal entries $\lambda_1,\lambda_2,\ldots,\lambda_n$, then we can show that

$A=PDP^{-1}$

if  and only if $\lambda_1,\lambda_2,\ldots,\lambda_n$ are the eigen values of $A$ and $p_1,p_2,\ldots,p_n$ are the corresponding eigen vectors of $A$.This requires that $P$ must be invertable and $P$ must have full rank.This requires  us to have $n$ linearly independent eigen vectors $p_1,\ldots,p_n$ and forms the basis of $\mathbb{R}^n$.

Theorem:(Eigen Decomposition). A square matrix $A \in \mathbb{R}^{n \times n}$ can be factored into

$A=PDP^{-1}$

where $P \in \mathbb{R}^n$ and $D$ is a diagonal matrix whose diagonal entries are the eigenvalues of $A$, if and only if the eigen vectors of $A$ form the basis of $\mathbb{R}^n$.

This theorem implies that only non defective matrix can be diagonalized and that the columns of $P$ are the $n$ eigen vectors of $A$.

A symmetric matrix $S \in \mathbb{R}^{n \times n}$ can always be diagonalized.

Spectral theorem states that we can find orthonormal eigen vectors .This makes $P$ an orthogonal matrix so that $A=PDP^T$ and $D=P^TAP$

Geometric Intuition

We can interpret the eigendecomposition of a matrix as follows.Let $A$ be a transformation matrix of a linear mapping with respect to the standard basis.$P^{-1}$ performs a basis change from the standard basis into eigen basis. Then the diagonal $D$ scales the vectors along these axes by the eigenvalues $\lambda_i$.Finally, $P$ transforms these scaled vectors back into standard/canonical coordinates.

Advantages

1.Diagonal matrix $D$ can be efficiently be raised to a power. Therefore we can find a matrix power for a matrix $A \in \mathbb{R}^{n \times n}$ via eigen value decomposition(if exist) so that

$A^k= ( PDP^{-1})^k= PD^kP^{-1}$

Computing $D^k$ is efficient because we apply this operation individually to any diagonal element.

2.Assume that eigen decomposition exist $A=PDP^{-1}$ exist, Then

   $det(A)=det(PDP^{-1})=det(P)det(D)det(P^{-1})=det(D)=\prod_i d_{ii}$

This allows efficient computation of the determinant of $A$.

3.The inverse of $A$ is $A^{-1}=(PDP^{-1})^{-1}=PD^{-1}P^{-1}$

$P^{-1}=P^T$ for orthonormal eignen vectors and $D^{-1}$ can be found by taking $1/\lambda_{ii}$

Eg: lets compute the eigen decomposition of  (  university question)

$A=\begin{bmatrix}2 &1 \\
1 & 2
\end{bmatrix}$

step1:Compute the eigen values and eigen vectors
The characteristic polynomial of $A$ is
$det(A-\lambda I)=det \left(\begin{bmatrix}
2-\lambda &1 \\
1 & 2-\lambda
\end{bmatrix}\right)$
$=(2-\lambda)^2-1=\lambda^2-4\lambda+3=(\lambda-3)(\lambda-1)$

Therefore the eigen values of $A$ are $\lambda_1=1$ and $\lambda_2=3$ and the corresponding normalized  eigen vectors are 
$p_1=\frac{1}{\sqrt{2}}\begin{bmatrix}1 \\
-1
\end{bmatrix}$ and 
$p_2=\frac{1}{\sqrt{2}}\begin{bmatrix}1 \\
1
\end{bmatrix}$ 

step2:Check for existence

The eigen vectors $p_1,p_2$ form a basis of $\mathbb{R}^2$. Therefore $A$ can be diagonalized.

step3: construct the matrix P to diagonalize A

We collect the eigenvectors of $A$ in $P$ so that

$P=[p_1,p_2]=\frac{1}{\sqrt{2}}\begin{bmatrix}1 &1 \\
-1 &1
\end{bmatrix}$
We then obtain 
$P^{-1}AP=\begin{bmatrix}1 &0 \\0 &3\end{bmatrix}=D$
Note that $P^{-1}=P^T$ since the eigenvectors form an Orthonormal basis.

Now the eigen decompostion of $A=PDP^T$ is 


import numpy as np
# define matrix
A = array([[2,1],[1,2]])
print("Original Matrix A")
print(A)
# factorize
values, vectors = np.linalg.eig(A)
# create matrix from eigenvectors
P = vectors
print("Normalized Eigen Vectors P")
print(P)
# create inverse of eigenvectors matrix
Pt = np.linalg.inv(P)
print("Inverse of P which is P^T")
print(Pt)
# create diagonal matrix from eigenvalues
D = np.diag(values)
print("diagonal Matrix D with eigen values on the diagonal")
print(D)
# reconstruct the original matrix
print("reconstructed original matrix PDP^T")
B = P.dot(D).dot(Pt)
print(B)

o/p
Original Matrix A 
[[2 1] 
 [1 2]] 
Normalized Eigen Vectors P
 [[ 0.70710678 -0.70710678] 
 [ 0.70710678 0.70710678]]
 Inverse of P which is P^T
 [[ 0.70710678 0.70710678] 
 [-0.70710678 0.70710678]] 
Diagonal Matrix D with eigen values on the digonal 
[[3. 0.]
 [0. 1.]] 
reconstructed original matrix PDP^T 
[[2. 1.] 
 [1. 2.]]

Sympy Code
from sympy import Matrix
M = Matrix(3, 3, [1, 2, 0, 0, 3, 0, 2, -4, 2])
(P, D) = M.diagonalize()
print(M)
print(D)
print(P)
print(P.inv() * M * P)

Matrix([ 
[1, 0, 0], 
[0, 2, 0], 
[0, 0, 3]]) 

Matrix([
 [-1, 0, -1], 
[ 0, 0, -1],
 [ 2, 1, 2]])

Matrix([ 
[1, 0, 0], 
[0, 2, 0], 
[0, 0, 3]])

Matrix([[1, 0, 0], [0, 2, 0], [0, 0, 3]])

Diagonalize the following matrix ( university question)
$A=\begin{bmatrix}
1 & -2 & 0 \\
-2 & 0 & 2 \\
0 & 2 & -1 \\
\end{bmatrix}$


The eigen values are $\lambda_1=0,\lambda_2=3,\lambda=-3$
The eigen vectors are$P$ $(-1, 2, -2), (-2, 1, 2), (2, 2, 1)$


So the Diagonalized matrix is $P^{-1}.A.P$
$\begin{bmatrix}
-3 & 0 & 0 \\
0 & 0 & 0 \\
0 & 0 & 3 \\
\end{bmatrix}$



Comments

Popular posts from this blog

Mathematics for Machine Learning- CST 284 - KTU Minor Notes - Dr Binu V P

  Introduction About Me Syllabus Course Outcomes and Model Question Paper University Question Papers and Evaluation Scheme -Mathematics for Machine learning CST 284 KTU Overview of Machine Learning What is Machine Learning (video) Learn the Seven Steps in Machine Learning (video) Linear Algebra in Machine Learning Module I- Linear Algebra 1.Geometry of Linear Equations (video-Gilbert Strang) 2.Elimination with Matrices (video-Gilbert Strang) 3.Solving System of equations using Gauss Elimination Method 4.Row Echelon form and Reduced Row Echelon Form -Python Code 5.Solving system of equations Python code 6. Practice problems Gauss Elimination ( contact) 7.Finding Inverse using Gauss Jordan Elimination  (video) 8.Finding Inverse using Gauss Jordan Elimination-Python code Vectors in Machine Learning- Basics 9.Vector spaces and sub spaces 10.Linear Independence 11.Linear Independence, Basis and Dimension (video) 12.Generating set basis and span 13.Rank of a Matrix 14.Linear Mapping...

Vectors in Machine Learning

As data scientists we work with data in various formats such as text images and numerical values We often use vectors to represent data in a structured and efficient manner especially in machine learning applications In this blog post we will explore what vectors are in terms of machine learning their significance and how they are used What is a Vector? In mathematics, a vector is a mathematical object that has both magnitude and direction. In machine learning, a vector is a mathematical representation of a set of numerical values. Vectors are usually represented as arrays or lists of numbers, and each number in the list represents a specific feature or attribute of the data. For example, suppose we have a dataset of houses, and we want to predict their prices based on their features such as the number of bedrooms, the size of the house, and the location. We can represent each house as a vector, where each element of the vector represents a specific feature of the house, such as the nu...

2.14 Singular Value Decomposition

The Singular Value Decomposition ( SVD) of a matrix is a central matrix decomposition method in linear algebra.It can be applied to all matrices,not only to square matrices and it always exists.It has been referred to as the 'fundamental theorem of linear algebra'( strang 1993). SVD Theorem: Let $A^{m \times n}$ be a rectangular matrix of rank $r \in [0,min(m,n)]$. The SVD of A is a decomposition of the form. $A= U \Sigma V^T $ with an orthogonal matrix $U \in \mathbb{R}^{m \times m}$ with column vectors $u_i, i=1,\ldots,m$ and an orthogonal matrix $V \in \mathbb{R}^{n \times n}$ with column vectors $v_j, j=1,\ldots,n$.Moreover, $\Sigma$ is an $m \times n$ matrix with $\sum_{ii} = \sigma \ge 0$ and $\sigma_{ij}=0, i \ne j$. The diagonal entries $\Sigma_i=1,\ldots,r$ of $\sigma$ are called singular values . $u_i$ are called left singular vectors , and $v_j$ are called right singular vectors .By convention singular values are ordered ie; $\sigma_1 \ge \sigma_2 \ldots \sigma_r \...