Projections are an important class of linear transformations (besides rotations and reflections) and play an important role in graphics, coding theory, statistics, and machine learning. In machine learning, we often deal with high-dimensional data, which is hard to analyze or visualize. However, high-dimensional data quite often has the property that only a few dimensions contain most of the information, while the remaining dimensions are not essential for describing the key properties of the data. When we compress or visualize high-dimensional data, we lose information. To minimize this compression loss, we ideally find the most informative dimensions in the data. More specifically, we can project the original high-dimensional data onto a lower-dimensional feature space and work in this lower-dimensional space to learn more about the dataset and extract relevant patterns.
Machine learning algorithms, such as principal component analysis (PCA) by Pearson (1901) and Hotelling (1933) and deep neural networks (e.g., deep auto-encoders (Deng et al., 2010)), heavily exploit the idea of dimensionality reduction. For a given lower-dimensional subspace, orthogonal projections of high-dimensional data retain as much information as possible and minimize the difference/error between the original data and the corresponding projection.
A projection is a linear mapping $\pi$ from a vector space to itself with the property $\pi^2 = \pi$, i.e., applying the projection twice gives the same result as applying it once. Since linear mappings can be expressed by transformation matrices, this definition applies equally to a special kind of transformation matrix, the projection matrix $P_\pi$, which exhibits the property that $P_\pi^2 = P_\pi$.
Projection onto One-Dimensional Subspaces (Lines)
Assume we are given a line (one-dimensional subspace) through the origin with basis vector $b \in \mathbb{R}^n$. The line is a one-dimensional subspace $U \subseteq \mathbb{R}^n$ spanned by $b$. When we project $x \in \mathbb{R}^n$ onto $U$, we seek the vector $\pi_U(x) \in U$ that is closest to $x$.
Using geometric arguments, we observe the following:
- The projection $\pi_U(x)$ is closest to $x$, which implies that $\left \| x -\pi_U(x) \right \|$ is minimal. It follows that $\pi_U(x)-x$ is orthogonal to $b$, which means their inner product is zero: $<\pi_U(x)-x,b>=0$.
- The projection $\pi_U(x)$ of $x$ onto $U$ must be an element of $U$ and, therefore, a multiple of the basis vector $b$ that spans $U$. Hence $\pi_U(x)=\lambda b$ for some $\lambda \in \mathbb{R}$.
The following three steps determine the projection coordinate $\lambda$, the projection $\pi_U(x) \in U$, and the projection matrix $P_\pi$.
1. Let us find the coordinate $\lambda$.
$<x-\pi_U(x),b> = 0 \Leftrightarrow <x-\lambda b,b> = 0$
$<x,b>=\lambda <b,b> $
$\lambda= \frac{<x,b>}{<b,b>} $
$\lambda= \frac{<b,x>}{<b,b>}$
$\lambda= \frac{b^Tx}{b^Tb} $
$\lambda= \frac{b^Tx}{\left \| b \right \|^2}$
If $\left \| b \right \|=1$, then the coordinate $\lambda$ of the projection is given by $b^Tx$.
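As a quick check, here is a minimal numpy sketch (using the same values as the projection example further below, $x=[2,3]$ and $b=[1,1]$) that computes the coordinate $\lambda$:
import numpy as np
x=np.array([2.,3.])
b=np.array([1.,1.])
lam=b.dot(x)/b.dot(b)               # lambda = b^T x / (b^T b)
print(lam)                          # 2.5
b_unit=b/np.linalg.norm(b)          # if b is normalized, lambda is simply b^T x
print(b_unit.dot(x))                # coordinate with respect to the unit basis vector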
2. Finding the projection point $\pi_U(x) \in U$. Since $\pi_U(x)=\lambda b$, we immediately obtain that
$\pi_U(x)=\lambda b = \frac{<x,b>}{\left \| b \right\|^2}b = \frac{b^T x}{\left \| b \right \|^2}b$, where the last equality holds if we use the dot product as the inner product. The length of this projection is
$\left \| \pi_U(x) \right \|=\frac{| b^Tx|}{{\left \| b \right \|^2}}\left \| b \right \|=|\cos \omega| \left \| x \right \| \left \| b \right \|\frac{\left \| b \right \|}{\left \| b \right \|^2}=|\cos \omega|\left \| x\right \|$
Here $\omega$ is the angle between $x$ and $b$.
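The following minimal sketch (same assumed example values as above) verifies numerically that $\left \| \pi_U(x) \right \| = |\cos \omega| \left \| x \right \|$:
import numpy as np
x=np.array([2.,3.])
b=np.array([1.,1.])
proj=(b.dot(x)/b.dot(b))*b                            # pi_U(x) = lambda * b
cosw=x.dot(b)/(np.linalg.norm(x)*np.linalg.norm(b))   # cosine of the angle between x and b
print(np.linalg.norm(proj))                           # length of the projection
print(abs(cosw)*np.linalg.norm(x))                    # same value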
3. Find the projection matrix $P_\pi$. Since a projection is a linear mapping with $\pi_U(x)=P_\pi x$, and
$\pi_U(x)=\lambda b = b\lambda = b\frac{b^Tx}{\left \| b \right\|^2} = \frac{bb^T}{\left \| b \right\|^2}x$,
we immediately see that
$P_\pi=\frac{bb^T}{\left \| b \right\|^2}$
Note that $bb^T$ (and, consequently, $P_\pi$) is a symmetric matrix of rank 1, and $\left \| b \right\|^2$ is a scalar. The projection matrix $P_\pi$ projects any vector $x \in \mathbb{R}^n$ onto the line through the origin with direction $b$ (equivalently, the subspace $U$ spanned by $b$).
Remark. The projection $\pi_U(x) \in \mathbb{R}^n$ is still an $n$-dimensional vector and not a scalar. However, we no longer require $n$ coordinates to represent the projection; a single coordinate $\lambda$ suffices if we express it with respect to the basis vector $b$ that spans the subspace $U$.
Remark. $\pi_U(x)$ is an eigenvector of the projection matrix $P_\pi$, with eigenvalue 1, since $P_\pi \pi_U(x)=\pi_U(x)$.
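A minimal sketch illustrating both remarks (assuming the same example values as the example below): the projected point is left unchanged by $P_\pi$ (eigenvalue 1), and $P_\pi$ is idempotent:
import numpy as np
x=np.array([2.,3.])
b=np.array([1.,1.])
P=np.outer(b,b)/b.dot(b)             # P = b b^T / ||b||^2
xp=P.dot(x)                          # pi_U(x)
print(np.allclose(P.dot(xp),xp))     # True: P * pi_U(x) = pi_U(x)
print(np.allclose(P.dot(P),P))       # True: P^2 = P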
Example
Projecting the point [2,3] onto a line passing through the origin spanned by [1,1]
import numpy as np
x=np.array([2,3])
b=np.array([1,1])
print("Projection Matrix b.bT/bT.b")
P=np.outer(b,b)/np.dot(b,b)    # P = b b^T / (b^T b)
print(P)
print("Projection Point")
xp=P.dot(x)                    # pi_U(x) = P x
print(xp)
O/P
Projection Matrix b.bT/bT.b
[[0.5 0.5]
 [0.5 0.5]]
Projection Point
[2.5 2.5]
Projection of the point [1,1,1] onto the line through the origin spanned by [1,2,2]
import numpy as np
x=np.array([1,1,1])
b=np.array([1,2,2])
print("Projection Matrix b.bT/bT.b")
P=np.outer(b,b)/np.dot(b,b)    # P = b b^T / (b^T b)
print(P)
print("Projection Point")
xp=P.dot(x)                    # pi_U(x) = P x
print(xp)
print("applying projection again will not change anything")
print(P.dot(xp))               # projecting the projected point leaves it unchanged
print("Squaring the projection matrix P will give P")
print(P.dot(P))                # P^2 = P (idempotent)
O/P
Projection Matrix b.bT/bT.b
[[0.11111111 0.22222222 0.22222222]
[0.22222222 0.44444444 0.44444444]
[0.22222222 0.44444444 0.44444444]]
Projection Point
[0.55555556 1.11111111 1.11111111]
applying projection again will not change anything
[0.55555556 1.11111111 1.11111111]
Squaring the projection matrix P will give P
[[0.11111111 0.22222222 0.22222222]
[0.22222222 0.44444444 0.44444444]
[0.22222222 0.44444444 0.44444444]]
Projection onto General Subspaces
Let us consider orthogonal projections of vectors $x \in \mathbb{R}^n$ onto lower-dimensional subspaces $U \subseteq \mathbb{R}^n$ with $\dim(U) = m \ge 1$.
Assume that $(b_1,\ldots,b_m)$ is an ordered basis of $U$. Any projection $\pi_U(x)$ onto $U$ is necessarily an element of $U$. Therefore, it can be represented as a linear combination of the basis vectors $b_1,\ldots,b_m$ of $U$, such that $\pi_U(x)=\sum_{i=1}^{m}\lambda_ib_i$.
1. Let us find the coordinates $\lambda_1,\ldots,\lambda_m$ of the projection with respect to the basis of $U$.
$\pi_U(x)=\sum_{i=1}^{m}\lambda_ib_i=B\lambda$
where $B=[b_1,\ldots,b_m] \in \mathbb{R}^{n \times m}$ and $\lambda=[\lambda_1,\ldots,\lambda_m]^T \in \mathbb{R}^m$.
Note that the vector connecting $x \in \mathbb{R}^n$ and $\pi_U(x) \in U$ must be orthogonal to all the basis vectors of $U$. Therefore, we obtain $m$ simultaneous conditions:
$<b_1,x-\pi_U(x)> \quad = b_1^T(x-\pi_U(x))=0$
$\vdots$
$<b_m,x-\pi_U(x)> \quad = b_m^T(x-\pi_U(x))=0$
Since $\pi_U(x)=B\lambda$, this becomes
$b_1^T(x-B\lambda)=0$
$ \vdots $
$ b_m^T(x-B \lambda)=0 $
So we get $m$ homogeneous equations, which can be written compactly in matrix form:
$\begin{bmatrix}
b_1^T\\
\vdots\\
b_m^T
\end{bmatrix}\begin{bmatrix}
x-B\lambda
\end{bmatrix}=0$
$B^T(x-B\lambda)=0$
$B^Tx=B^TB\lambda$
The last expression is called the normal equation. Since $b_1,\ldots,b_m$ are a basis of $U$ and, therefore, linearly independent, $B^TB \in \mathbb{R}^{m \times m}$ is regular and can be inverted. This allows us to solve for the coefficients/coordinates:
$\lambda=(B^TB)^{-1}B^Tx$
The matrix $(B^TB)^{-1}B^T$ is also called the pseudo-inverse of $B$, which can be computed for non-square matrices $B$. It only requires that $B^TB$ is positive definite, which is the case if $B$ has full rank.
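As a sanity check, the sketch below (using the basis from the example further down, purely for illustration) compares the explicit pseudo-inverse $(B^TB)^{-1}B^T$ with numpy's built-in np.linalg.pinv:
import numpy as np
B=np.array([[1.,0.],[1.,1.],[1.,2.]])                 # full-rank basis matrix
pinv_explicit=np.linalg.inv(B.T.dot(B)).dot(B.T)      # (B^T B)^-1 B^T
print(np.allclose(pinv_explicit,np.linalg.pinv(B)))   # True for full-rank B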
2. Find the projection $\pi_U(x) \in U$. We know that $\pi_U(x)=B\lambda$, so
$\pi_U(x)=B(B^TB)^{-1}B^Tx$
3. Find the projection matrix $P_\pi$. Comparing with $\pi_U(x)=P_\pi x$, we can read off
$P_\pi=B(B^TB)^{-1}B^T$
Note: The projection onto a one-dimensional subspace is a special case: if $\dim(U)=1$, then $B^TB \in \mathbb{R}$ is a scalar and we can rewrite the projection matrix $P_\pi=B(B^TB)^{-1}B^T$ as $P_\pi=\frac{BB^T}{B^TB}$.
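A minimal sketch of this special case (reusing $b=[1,2,2]$ from the earlier one-dimensional example): treating $b$ as an $n \times 1$ matrix $B$, the general formula reduces to $\frac{bb^T}{b^Tb}$:
import numpy as np
b=np.array([1.,2.,2.])
B=b.reshape(-1,1)                                     # b as a single-column matrix
P_general=B.dot(np.linalg.inv(B.T.dot(B))).dot(B.T)   # B (B^T B)^-1 B^T
P_1d=np.outer(b,b)/b.dot(b)                           # b b^T / (b^T b)
print(np.allclose(P_general,P_1d))                    # True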
Example
Projecting the point [6,0,0] onto the subspace spanned by [1,1,1] and [0,1,2]
import numpy as np
x=np.array([6,0,0])
B=np.array([[1,0],[1,1],[1,2]])    # basis vectors as columns of B
print("Projection Matrix B.(BT.B)^-1.BT")
P=np.dot(B.T,B)                    # B^T B
P=np.linalg.inv(P)                 # (B^T B)^-1
P=P.dot(B.T)                       # pseudo-inverse (B^T B)^-1 B^T
P=B.dot(P)                         # P = B (B^T B)^-1 B^T
print(P)
print("Projection Point")
xp=P.dot(x)                        # pi_U(x) = P x
print(xp)
print("applying projection again will not change anything")
print(P.dot(xp))                   # projecting the projected point leaves it unchanged
print("Squaring the projection matrix P will give P")
print(P.dot(P))                    # P^2 = P (idempotent)
O/P
Projection Matrix B.(BT.B)^-1.BT
[[ 0.83333333 0.33333333 -0.16666667]
[ 0.33333333 0.33333333 0.33333333]
[-0.16666667 0.33333333 0.83333333]]
Projection Point
[ 5. 2. -1.]
applying projection will not change anything
[ 5. 2. -1.]
Squaring the projection matrix P will give P
[[ 0.83333333 0.33333333 -0.16666667]
[ 0.33333333 0.33333333 0.33333333]
[-0.16666667 0.33333333 0.83333333]]
Remarks: The projections $\pi_U(x)$ are still vectors in $\mathbb{R}^n$ although they lie in an $m$-dimensional subspace $U \subseteq \mathbb{R}^n$. However, to represent a projected vector we only need the $m$ coordinates $\lambda_1,\ldots,\lambda_m$ with respect to the basis vectors $b_1,\ldots,b_m$ of $U$.
Projections allow us to look at situations where we have a linear system $Ax = b$ without a solution. Recall that this means that $b$ does not lie in the span of $A$, i.e., the vector $b$ does not lie in the subspace spanned by the columns of $A$. Given that the linear equation cannot be solved exactly, we can find an approximate solution. The idea is to find the vector in the subspace spanned by the columns of $A$ that is closest to $b$, i.e., we compute the orthogonal projection of $b$ onto the subspace spanned by the columns of $A$. This problem arises often in practice, and the solution is called the least-squares solution.
The projection error is the norm of the difference vector between the original vector and its projection onto $U$, i.e., $\left \| x- \pi_U(x) \right \|$. This displacement vector is orthogonal to all basis vectors of $U$. If the basis vectors $\{b_1,\ldots,b_m\}$ are orthonormal, then the projection equation simplifies to
$\pi_U(x)=BB^Tx$ since $B^TB=I$
The coordinates $\lambda$ become
$\lambda=B^Tx$
This means that we no longer have to compute the inverse $(B^TB)^{-1}$, which saves computation time.
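The sketch below (reusing the example $x$ and $B$ from above, purely for illustration) orthonormalizes the basis with a QR decomposition and checks that $QQ^Tx$ gives the same projection; it also shows the least-squares connection via np.linalg.lstsq:
import numpy as np
x=np.array([6.,0.,0.])
B=np.array([[1.,0.],[1.,1.],[1.,2.]])
Q,_=np.linalg.qr(B)                                   # orthonormal basis of the column space of B
proj_onb=Q.dot(Q.T).dot(x)                            # B B^T x with an orthonormal basis
proj_general=B.dot(np.linalg.inv(B.T.dot(B))).dot(B.T).dot(x)
print(np.allclose(proj_onb,proj_general))             # True
lam,*_=np.linalg.lstsq(B,x,rcond=None)                # least-squares solution of B lam = x
print(np.allclose(B.dot(lam),proj_general))           # True: B lam is the projection of x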
Example- University Question
Find the projection of the vector $x=\begin{bmatrix} 6\\0\\0\\ \end{bmatrix}$ in $\mathbb{R}^3$ onto the span of $\begin{bmatrix} 1\\1\\1\\ \end{bmatrix}$ and $\begin{bmatrix} 0\\1\\2\\ \end{bmatrix}$
The projection Matrix is
$P_\pi=B(B^TB)^{-1}B^T$
$=\begin{bmatrix} 1 & 0 \\ 1 & 1\\1 & 2 \end{bmatrix} \begin{bmatrix} 3 & 3\\ 3 & 5 \\ \end{bmatrix}^{-1} \begin{bmatrix} 1 & 1 & 1 \\0 & 1 & 2 \end{bmatrix}$
$=\begin{bmatrix} 1 & 0 \\ 1 & 1\\1 & 2 \end{bmatrix} \begin{bmatrix} 5/6 & -1/2\\ -1/2 & 1/2 \\ \end{bmatrix} \begin{bmatrix} 1 & 1 & 1 \\0 & 1 & 2 \end{bmatrix}$
$=\begin{bmatrix} 1 & 0 \\ 1 & 1\\1 & 2 \end{bmatrix} \begin{bmatrix} 5/6 & 1/3 & -1/6\\ -1/2 & 0 & 1/2 \\ \end{bmatrix} $
$= \begin{bmatrix} 5/6 & 1/3 & -1/6\\ 1/3 & 1/3 & 1/3\\ -1/6 & 1/3 & 5/6 \end{bmatrix} $
The projection of the vector $x$ is
$\pi_U(x)=P_\pi x = \begin{bmatrix} 5/6 & 1/3 & -1/6\\ 1/3 & 1/3 & 1/3\\ -1/6 & 1/3 & 5/6 \end{bmatrix}\begin{bmatrix} 6 \\ 0 \\ 0 \end{bmatrix} $
$=\begin{bmatrix} 5 \\ 2 \\ -1 \end{bmatrix}$