
Singular Value Decomposition

note

Previously we have seen that for an $n\times n$ matrix (say $A$) with $n$ independent eigenvectors, the factorization is,

$$A=S\Lambda S^{-1}$$

Previously we have seen that if the eigenvectors of matrix $A$ are orthonormal then $Q^{-1}=Q^T$, and the factorization is,
(We discussed it in detail HERE)

$$A=Q\Lambda Q^{T}$$

But in general, eigenvectors are not orthogonal.

Singular Value Decomposition (SVD) is a factorization of any $m\times n$ matrix.

danger
$$A=U\Sigma V^T$$

Say that we have an $m\times n$ matrix $A$ and $\text{Rank}(A)=r$,

$$A_{m\times n}=\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}$$
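
(As a quick sanity check of the shapes, here is a minimal numpy sketch; the matrix below is a made-up example, not one from this page.)

```python
import numpy as np

# A hypothetical 3x4 matrix, just to see the shapes of the three SVD factors.
A = np.arange(12, dtype=float).reshape(3, 4)

U, s, Vt = np.linalg.svd(A)          # s holds the singular values, in descending order
print(U.shape, s.shape, Vt.shape)    # (3, 3) (3,) (4, 4)
print(np.linalg.matrix_rank(A))      # 2: only two nonzero singular values
```
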
info

IDEA: the idea behind SVD is to map an orthonormal basis of $\mathbb{R}^n$ (Row Space + Null Space of matrix $A^TA$) to an orthonormal basis of $\mathbb{R}^m$ (Column Space + Left Null Space of matrix $AA^T$).

(Figure: the four Fundamental Subspaces)

The row space is in $\mathbb{R}^n$; here our row vectors are,

$$\begin{bmatrix} a_{11} \\ a_{12} \\ \vdots \\ a_{1n} \end{bmatrix}\in\mathbb{R}^n,\; \begin{bmatrix} a_{21} \\ a_{22} \\ \vdots \\ a_{2n} \end{bmatrix}\in\mathbb{R}^n,\; \cdots,\; \begin{bmatrix} a_{m1} \\ a_{m2} \\ \vdots \\ a_{mn} \end{bmatrix}\in\mathbb{R}^n$$

The column space is in $\mathbb{R}^m$; our column vectors are,

$$\begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{bmatrix}\in\mathbb{R}^m,\; \begin{bmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{bmatrix}\in\mathbb{R}^m,\; \cdots,\; \begin{bmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{bmatrix}\in\mathbb{R}^m$$

A vector (say) $\vec{v}\in\mathbb{R}^n$ has some portion in the Row Space and some in the Null Space, $\vec{v}=\vec{v}_p+\vec{v}_n$ [Reference]. When we transform $\vec{v}$ using matrix $A$, we get $A\vec{v}=\vec{u}\in\mathbb{R}^m$; the Null Space portion is annihilated ($A\vec{v}_n=\vec{0}$), so $\vec{u}$ lies in the Column Space. (Likewise, a general vector in $\mathbb{R}^m$ has some portion in the Column Space and some in the Left Null Space.)
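
(Here is a small numpy sketch of this split on a made-up rank-$2$ matrix; the orthonormal basis vectors $\vec{v}_i$ come from the SVD we are about to build.)

```python
import numpy as np

# Hedged sketch: split a vector v in R^n into its Row Space part and its
# Null Space part, using orthonormal basis vectors from the SVD.
A = np.array([[1., 2., 3., 4.],
              [2., 4., 6., 8.],
              [1., 0., 1., 0.]])          # made-up example, rank 2
r = np.linalg.matrix_rank(A)
_, _, Vt = np.linalg.svd(A)               # rows of Vt: v_1, ..., v_n

v = np.array([1., 1., 1., 1.])
V_row, V_null = Vt[:r].T, Vt[r:].T        # bases for Row Space and Null Space
v_p = V_row @ (V_row.T @ v)               # portion of v in the Row Space
v_n = V_null @ (V_null.T @ v)             # portion of v in the Null Space

assert np.allclose(v, v_p + v_n)          # v = v_p + v_n
assert np.allclose(A @ v, A @ v_p)        # the Null Space portion is annihilated
```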

Row Space $(C(A^T))$ (Gather Orthonormal Basis vectors for the Row Space)

note

Row vectors live in an $n$-dimensional vector space.
$\text{Rank}(A)=r$, so we have $r$ independent row vectors.
So the space spanned by the row vectors (the Row Space) is an $r$-dimensional vector space; to span this whole $r$-dimensional space we need $r$ independent basis vectors.
Among those we choose orthonormal basis vectors.

(We could get them using the Gram-Schmidt method, but here we will use $A^TA$ to get them; we discuss this below.)
(say) our orthonormal basis vectors for the Row Space are $\vec{v}_1,\vec{v}_2,\cdots,\vec{v}_r$.

Null Space $(N(A))$ (Orthonormal Basis vectors of the Null Space)

note

The Null Space is perpendicular to the Row Space (we talked about orthogonal subspaces HERE), so every vector in the Null Space is perpendicular to the Row Space.
$\text{Rank}(A)=r$, so we have $n-r$ dependent (free) columns.
So the space spanned by the vectors in the Null Space is an $(n-r)$-dimensional vector space (the Null Space); to span it we need $n-r$ independent basis vectors in the Null Space.
(We don't need to find these basis vectors for the SVD; we discuss this below in the Column Space $(C(A))$ section.)
(say) our orthonormal basis vectors for the Null Space are $\vec{v}_{r+1},\vec{v}_{r+2},\cdots,\vec{v}_n$.

Column Space $(C(A))$ (Map the Orthonormal Basis vectors from the Row Space to the Column Space)

note

After we get our orthonormal basis vectors for the Row Space, $\vec{v}_1,\vec{v}_2,\cdots,\vec{v}_r$, we map them to the Column Space using matrix $A$.
$\text{Rank}(A)=r$, so,

$$A\vec{v}_i=\sigma_i\vec{u}_i;\quad\forall i\in\{1,2,\cdots,r\}$$

For $i\in\{1,2,\cdots,r\}$, $A$ maps $\vec{v}_i$ to $\sigma_i\vec{u}_i$; we normalize to get the unit vector $\vec{u}_i$, so $\sigma_i$ is the length of $A\vec{v}_i$.
Now we have the orthonormal basis vectors for our Column Space: $\vec{u}_1,\vec{u}_2,\cdots,\vec{u}_r$.

$$A\vec{v}_i=\vec{0};\quad\forall i\in\{r+1,r+2,\cdots,n\}$$

For $i\in\{r+1,r+2,\cdots,n\}$, $\vec{v}_i$ lives in the Null Space of $A$.
So $\vec{u}_{r+1},\vec{u}_{r+2},\cdots,\vec{u}_n$ are all $\vec{0}$.

But wait, WHY are all the $\vec{u}_i$'s perpendicular to each other?
We discuss it below in the $A^TA$ section.
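
(A minimal numpy sketch of this mapping on the made-up rank-$2$ matrix from above: the first $r$ basis vectors $\vec{v}_i$ map to $\sigma_i\vec{u}_i$, the rest map to $\vec{0}$.)

```python
import numpy as np

# The same made-up 3x4 matrix with rank r = 2.
A = np.array([[1., 2., 3., 4.],
              [2., 4., 6., 8.],
              [1., 0., 1., 0.]])
U, s, Vt = np.linalg.svd(A)
r = np.linalg.matrix_rank(A)

# A v_i = sigma_i u_i for the Row Space basis vectors...
for i in range(r):
    assert np.allclose(A @ Vt[i], s[i] * U[:, i])

# ...and A v_i = 0 for the Null Space basis vectors.
for i in range(r, A.shape[1]):
    assert np.allclose(A @ Vt[i], 0)
```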

Left Null Space $(N(A^T))$ (Orthonormal Basis vectors of the Left Null Space)

note

The Left Null Space is perpendicular to the Column Space (we talked about orthogonal subspaces HERE), so every vector in the Left Null Space is perpendicular to the Column Space.
$\text{Rank}(A)=r$, so we have $m-r$ dependent rows.
So the space spanned by the vectors in the Left Null Space is an $(m-r)$-dimensional vector space (the Left Null Space); to span it we need $m-r$ independent basis vectors in the Left Null Space.
(say) our orthonormal basis vectors for the Left Null Space are $\vec{u}_{r+1},\vec{u}_{r+2},\cdots,\vec{u}_m$.

Terminologies
Say $\vec{v}_1,\vec{v}_2,\cdots,\vec{v}_r$ are the orthonormal basis vectors for the Row Space $(C(A^T))$,
and $\vec{v}_{r+1},\vec{v}_{r+2},\cdots,\vec{v}_n$ are the orthonormal basis vectors for the Null Space $(N(A))$.

$$\text{(say)}\ \mathbf{V}'=\begin{bmatrix} \begin{matrix} \vdots & \vdots & & \vdots \\ \vec{v}_1 & \vec{v}_2 & \cdots & \vec{v}_r \\ \vdots & \vdots & & \vdots \end{matrix} & \begin{matrix} \vdots & \vdots & & \vdots \\ \vec{v}_{r+1} & \vec{v}_{r+2} & \cdots & \vec{v}_n \\ \vdots & \vdots & & \vdots \end{matrix} \end{bmatrix}_{n\times n}$$
$$\text{(say)}\ \mathbf{U}'=\begin{bmatrix} \begin{matrix} \vdots & \vdots & & \vdots \\ \vec{u}_1 & \vec{u}_2 & \cdots & \vec{u}_r \\ \vdots & \vdots & & \vdots \end{matrix} & \begin{matrix} \vdots & \vdots & & \vdots \\ \vec{u}_{r+1} & \vec{u}_{r+2} & \cdots & \vec{u}_n \\ \underbrace{\vdots}_{=\vec{0}} & \underbrace{\vdots}_{=\vec{0}} & & \underbrace{\vdots}_{=\vec{0}} \end{matrix} \end{bmatrix}_{m\times n}$$
$$\text{(say)}\ \Sigma'=\begin{bmatrix} \begin{bmatrix} \sigma_1 & & & \huge0 \\ & \sigma_2 & & \\ & & \ddots & \\ \huge0 & & & \sigma_r \end{bmatrix}_{r\times r} & \mathbf{0}_{r\times (n-r)} \\ \mathbf{0}_{(n-r)\times r} & \mathbf{0}_{(n-r)\times (n-r)} \end{bmatrix}_{n\times n}$$

The mapping from $\mathbb{R}^n$ to $\mathbb{R}^m$ is,

danger
$$A\vec{v}_i=\sigma_i\vec{u}_i;\quad\forall i\in\{1,2,\cdots,n\}\qquad(\text{with }\sigma_i=0\text{ for }i>r)$$

We can write it as,

$$A\underbrace{\begin{bmatrix} \vdots & \vdots & & \vdots \\ \vec{v}_1 & \vec{v}_2 & \cdots & \vec{v}_n \\ \vdots & \vdots & & \vdots \end{bmatrix}}_{V'_{n\times n}} = \underbrace{\begin{bmatrix} \vdots & \vdots & & \vdots \\ \vec{u}_1 & \vec{u}_2 & \cdots & \vec{u}_n \\ \vdots & \vdots & & \vdots \end{bmatrix}}_{U'_{m\times n}} \underbrace{\begin{bmatrix} \begin{bmatrix} \sigma_1 & & & \huge0 \\ & \sigma_2 & & \\ & & \ddots & \\ \huge0 & & & \sigma_r \end{bmatrix}_{r\times r} & \mathbf{0}_{r\times (n-r)} \\ \mathbf{0}_{(n-r)\times r} & \mathbf{0}_{(n-r)\times (n-r)} \end{bmatrix}}_{\Sigma'_{n\times n}}$$

So,

$$AV'_{n\times n}=U'_{m\times n}\Sigma'_{n\times n}$$

Full SVD

Say that all rows of our matrix $A$ are independent $\Rightarrow \text{Rank}(A)=m$, so

  • Dimension of the Row Space: $m$

  • Dimension of the Null Space: $n-m$

  • Dimension of the Column Space: $m$

  • Dimension of the Left Null Space: $m-m=0$

    Our equation was,

    $$A\underbrace{\begin{bmatrix} \vdots & \vdots & & \vdots \\ \vec{v}_1 & \vec{v}_2 & \cdots & \vec{v}_n \\ \vdots & \vdots & & \vdots \end{bmatrix}}_{V'_{n\times n}} = \underbrace{\begin{bmatrix} \vdots & \vdots & & \vdots \\ \vec{u}_1 & \vec{u}_2 & \cdots & \vec{u}_n \\ \vdots & \vdots & & \vdots \end{bmatrix}}_{U'_{m\times n}} \underbrace{\begin{bmatrix} \begin{bmatrix} \sigma_1 & & & \huge0 \\ & \sigma_2 & & \\ & & \ddots & \\ \huge0 & & & \sigma_m \end{bmatrix}_{m\times m} & \mathbf{0}_{m\times (n-m)} \\ \mathbf{0}_{(n-m)\times m} & \mathbf{0}_{(n-m)\times (n-m)} \end{bmatrix}}_{\Sigma'_{n\times n}}$$

    So,

    $$AV'_{n\times n}=U'_{m\times n}\Sigma'_{n\times n}$$

    And as we discussed above, $\vec{u}_{m+1},\vec{u}_{m+2},\cdots,\vec{u}_n$ are all $\vec{0}$ (here $r=m$), so we can discard those vectors from our equation; now our equation becomes,

    $$A\underbrace{\begin{bmatrix} \vdots & \vdots & & \vdots \\ \vec{v}_1 & \vec{v}_2 & \cdots & \vec{v}_n \\ \vdots & \vdots & & \vdots \end{bmatrix}}_{V_{n\times n}} = \underbrace{\begin{bmatrix} \vdots & \vdots & & \vdots \\ \vec{u}_1 & \vec{u}_2 & \cdots & \vec{u}_m \\ \vdots & \vdots & & \vdots \end{bmatrix}}_{U_{m\times m}} \underbrace{\begin{bmatrix} \begin{bmatrix} \sigma_1 & & & \huge0 \\ & \sigma_2 & & \\ & & \ddots & \\ \huge0 & & & \sigma_m \end{bmatrix}_{m\times m} & \mathbf{0}_{m\times (n-m)} \end{bmatrix}}_{\Sigma_{m\times n}}$$

    So,

    $$A_{m\times n}V_{n\times n}=U_{m\times m}\Sigma_{m\times n}$$

    This equation is what we call the Full SVD.
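
(A hedged numpy sketch of the Full SVD; `full_matrices=True` is numpy's flag for exactly this square-$U$, square-$V$ form, and the matrix below is a made-up full-row-rank example.)

```python
import numpy as np

# A made-up 2x3 matrix whose 2 rows are independent (Rank(A) = m = 2).
rng = np.random.default_rng(0)
A = rng.standard_normal((2, 3))

# full_matrices=True asks numpy for the Full SVD: U is m x m, Vt is n x n.
U, s, Vt = np.linalg.svd(A, full_matrices=True)

Sigma = np.zeros(A.shape)                # Sigma is m x n
Sigma[:len(s), :len(s)] = np.diag(s)     # sigma_1, ..., sigma_m on the diagonal

assert np.allclose(A @ Vt.T, U @ Sigma)  # A V = U Sigma
assert np.allclose(A, U @ Sigma @ Vt)    # A = U Sigma V^T
```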

Reduced SVD

If some of the columns of our matrix $A$ are dependent, say $\text{Rank}(A)=r$, then

  • Dimension of the Row Space: $r$

  • Dimension of the Null Space: $n-r$

  • Dimension of the Column Space: $r$

  • Dimension of the Left Null Space: $m-r$

    Our equation was,

    $$A\underbrace{\begin{bmatrix} \vdots & \vdots & & \vdots \\ \vec{v}_1 & \vec{v}_2 & \cdots & \vec{v}_n \\ \vdots & \vdots & & \vdots \end{bmatrix}}_{V'_{n\times n}} = \underbrace{\begin{bmatrix} \vdots & \vdots & & \vdots \\ \vec{u}_1 & \vec{u}_2 & \cdots & \vec{u}_n \\ \vdots & \vdots & & \vdots \end{bmatrix}}_{U'_{m\times n}} \underbrace{\begin{bmatrix} \begin{bmatrix} \sigma_1 & & & \huge0 \\ & \sigma_2 & & \\ & & \ddots & \\ \huge0 & & & \sigma_r \end{bmatrix}_{r\times r} & \mathbf{0}_{r\times (n-r)} \\ \mathbf{0}_{(n-r)\times r} & \mathbf{0}_{(n-r)\times (n-r)} \end{bmatrix}}_{\Sigma'_{n\times n}}$$

    So,

    $$AV'_{n\times n}=U'_{m\times n}\Sigma'_{n\times n}$$

    And as we discussed above, $\vec{u}_{r+1},\vec{u}_{r+2},\cdots,\vec{u}_n$ are all $\vec{0}$, so these vectors are redundant and we can discard them from our equation; now our equation becomes,

    $$A\underbrace{\begin{bmatrix} \vdots & \vdots & & \vdots \\ \vec{v}_1 & \vec{v}_2 & \cdots & \vec{v}_r \\ \vdots & \vdots & & \vdots \end{bmatrix}}_{V_{n\times r}} = \underbrace{\begin{bmatrix} \vdots & \vdots & & \vdots \\ \vec{u}_1 & \vec{u}_2 & \cdots & \vec{u}_r \\ \vdots & \vdots & & \vdots \end{bmatrix}}_{U_{m\times r}} \underbrace{\begin{bmatrix} \sigma_1 & & & \huge0 \\ & \sigma_2 & & \\ & & \ddots & \\ \huge0 & & & \sigma_r \end{bmatrix}}_{\Sigma_{r\times r}}$$

    So,

    $$A_{m\times n}V_{n\times r}=U_{m\times r}\Sigma_{r\times r}$$

    This equation is what we call the Reduced SVD.
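
(A hedged numpy sketch of the Reduced SVD, reusing the made-up rank-$2$ matrix from above and keeping only the first $r$ singular triplets.)

```python
import numpy as np

# The same made-up 3x4 matrix with rank r = 2.
A = np.array([[1., 2., 3., 4.],
              [2., 4., 6., 8.],
              [1., 0., 1., 0.]])
r = np.linalg.matrix_rank(A)

# Keep only the first r singular triplets: U becomes m x r, Vt becomes r x n.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
U_r, s_r, Vt_r = U[:, :r], s[:r], Vt[:r]

assert np.allclose(A @ Vt_r.T, U_r @ np.diag(s_r))  # A V = U Sigma (r x r Sigma)
assert np.allclose(A, U_r @ np.diag(s_r) @ Vt_r)    # A is still reconstructed exactly
```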

We have seen that we can write any $m\times n$ matrix $A$ as $AV=U\Sigma$, where $V$ and $U$ are orthonormal matrices.
Now the real challenge is to find these orthonormal matrices $V$ and $U$.
Once we have $V$ and $U$, we can find $A$ as,
$$AV=U\Sigma$$
$$A=U\Sigma V^{-1}$$
And because $V$ is an orthogonal matrix, we know that $V^{-1}=V^T$.
To use this we need $V$ to be a square matrix, so we have to use the Full SVD method; then we get

danger
$$A_{m\times n}=U_{m\times m}\Sigma_{m\times n}V_{n\times n}^{T}$$

Everything looks good, but how do we find these orthonormal matrices $V$ and $U$?

info

IDEA: the keys to finding $V$ and $U$ are the matrices $A^TA$ and $AA^T$.

info
  • The Row Space of $A$ is the same as the Row Space of $A^TA$.
  • It is easy to find a basis for the Row Space of $A^TA$, so rather than finding a basis for the Row Space of $A$ we will find a basis for the Row Space of $A^TA$.

Explanation:

$$A_{m\times n}=\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}$$
Let's rewrite $A^TA$:
$$A^TA=A^T\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}$$

Here we can see that each column of $A^TA$ is $A^T$ times a column of $A$, so the columns of $A^TA$ live in the Column Space of $A^T$, which is the Row Space of $A$.

And if $\text{Rank}(A)=r$ then $\text{Rank}(A^T)=r$, so $A$ and $A^T$ both have $r$-dimensional Row Spaces.

So when we map the columns of $A$ by $A^T$, the resulting space is still $r$-dimensional, so we can say that $\text{Rank}(A^TA)=r$.

Now we can say that the Row Space of $A$ is the same as the Row Space of $A^TA$.

$(A^TA)^T=A^TA \Rightarrow A^TA$ is a symmetric matrix, so
Row Space of $A^TA$ $=$ Column Space of $A^TA$ $=$ Row Space of $A$.
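
(A small numpy check of these claims on the made-up rank-$2$ matrix: $A^TA$ is symmetric, has the same rank as $A$, and stacking its rows onto the rows of $A$ adds no new directions.)

```python
import numpy as np

A = np.array([[1., 2., 3., 4.],
              [2., 4., 6., 8.],
              [1., 0., 1., 0.]])   # made-up example, rank 2
AtA = A.T @ A

assert np.allclose(AtA, AtA.T)     # A^T A is symmetric
assert np.linalg.matrix_rank(AtA) == np.linalg.matrix_rank(A)

# Row Space check: the rows of A^T A span no directions outside the rows of A,
# so stacking them together does not increase the rank.
stacked = np.vstack([A, AtA])
assert np.linalg.matrix_rank(stacked) == np.linalg.matrix_rank(A)
```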

info
  • The Column Space of $A$ is the same as the Column Space of $AA^T$.
  • It is easy to find a basis for the Column Space of $AA^T$, so rather than finding a basis for the Column Space of $A$ we will find a basis for the Column Space of $AA^T$.

Explanation:
$$A^T_{n\times m}=\begin{bmatrix} a_{11} & a_{21} & \cdots & a_{m1} \\ a_{12} & a_{22} & \cdots & a_{m2} \\ \vdots & \vdots & \ddots & \vdots \\ a_{1n} & a_{2n} & \cdots & a_{mn} \end{bmatrix}$$
Let's rewrite $AA^T$:
$$AA^T=A\begin{bmatrix} a_{11} & a_{21} & \cdots & a_{m1} \\ a_{12} & a_{22} & \cdots & a_{m2} \\ \vdots & \vdots & \ddots & \vdots \\ a_{1n} & a_{2n} & \cdots & a_{mn} \end{bmatrix}$$

Here we can see that each column of $AA^T$ is $A$ times a column of $A^T$, so the columns of $AA^T$ live in the Column Space of $A$.

And if $\text{Rank}(A)=r$ then $\text{Rank}(A^T)=r$, so $A$ and $A^T$ both have $r$-dimensional Column Spaces.

So when we map the columns of $A^T$ by $A$, the resulting space is still $r$-dimensional, so we can say that $\text{Rank}(AA^T)=r$.

Now we can say that the Column Space of $A$ is the same as the Column Space of $AA^T$.

$(AA^T)^T=AA^T \Rightarrow AA^T$ is a symmetric matrix, so
Column Space of $AA^T$ $=$ Row Space of $AA^T$ $=$ Column Space of $A$.

Finding orthonormal Row vectors $(V)$ for matrix $A$

We discussed that the Row Space of $A$ is the same as the Row Space of $A^TA$.
So we need to find orthonormal row vectors for matrix $A^TA$.

$$A_{m\times n}=U_{m\times m}\Sigma_{m\times n}V_{n\times n}^{T}$$
$(\Sigma^T_{m\times n}=\Sigma_{n\times m})$
$$\Rightarrow (A^T)_{n\times m} = V_{n\times n}\Sigma_{n\times m}U_{m\times m}^{T}$$

$$\Rightarrow (A^T)_{n\times m}A_{m\times n} = V_{n\times n}\Sigma_{n\times m} \underbrace{U_{m\times m}^{T} U_{m\times m}}_{=\mathcal{I}_m} \Sigma_{m\times n}V_{n\times n}^{T}$$
$$\Sigma_{n\times m}\Sigma_{m\times n}= \underbrace{\begin{bmatrix} \begin{bmatrix} \sigma_1 & & & \huge0 \\ & \sigma_2 & & \\ & & \ddots & \\ \huge0 & & & \sigma_m \end{bmatrix}_{m\times m} \\ \mathbf{0}_{(n-m)\times m} \end{bmatrix}}_{\Sigma_{n\times m}} \underbrace{\begin{bmatrix} \begin{bmatrix} \sigma_1 & & & \huge0 \\ & \sigma_2 & & \\ & & \ddots & \\ \huge0 & & & \sigma_m \end{bmatrix}_{m\times m} & \mathbf{0}_{m\times (n-m)} \end{bmatrix}}_{\Sigma_{m\times n}}$$
$$\Sigma_{n\times m}\Sigma_{m\times n}= \underbrace{\begin{bmatrix} \begin{bmatrix} \sigma_1^2 & & & \huge0 \\ & \sigma_2^2 & & \\ & & \ddots & \\ \huge0 & & & \sigma_m^2 \end{bmatrix}_{m\times m} & \mathbf{0}_{m\times (n-m)} \\ \mathbf{0}_{(n-m)\times m} & \mathbf{0}_{(n-m)\times (n-m)} \end{bmatrix}}_{\Sigma^2_{n\times n}} = \Sigma^2_{n\times n}$$
$$\Sigma^2_{n\times n}= \begin{bmatrix} \begin{bmatrix} \sigma_1^2 & & & \huge0 \\ & \sigma_2^2 & & \\ & & \ddots & \\ \huge0 & & & \sigma_m^2 \end{bmatrix}_{m\times m} & \mathbf{0}_{m\times (n-m)} \\ \mathbf{0}_{(n-m)\times m} & \mathbf{0}_{(n-m)\times (n-m)} \end{bmatrix}_{n\times n}$$

$$\Rightarrow (A^T)_{n\times m}A_{m\times n} = V_{n\times n}\Sigma^2_{n\times n}V_{n\times n}^{T}$$
Those $3$ $\mathbf{0}$ matrices are interesting,

  • $\mathbf{0}_{m\times (n-m)}$
  • $\mathbf{0}_{(n-m)\times m}$
  • $\mathbf{0}_{(n-m)\times (n-m)}$

Remember that in the Full SVD we said $\text{Rank}(A)=m$, so there must be $n-m$ dependent column vectors, and these $\mathbf{0}$ matrices are negating those dependent vectors.
info

Remember $A=Q\Lambda Q^T$, where $A$ is a symmetric matrix.

$A^TA$ is a symmetric matrix, so

info

The columns of $V$ are the eigenvectors of matrix $A^TA$, and $\sigma_i^2$ is the $i^{th}$ eigenvalue of matrix $A^TA$.
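
(A minimal numpy sketch of this fact on a made-up example: the eigenvalues of $A^TA$ returned by `np.linalg.eigh` match the squared singular values of $A$.)

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 3))     # made-up full-row-rank example

# A^T A is symmetric, so eigh applies; it returns eigenvalues in ascending order.
eigvals, eigvecs = np.linalg.eigh(A.T @ A)

# The nonzero eigenvalues of A^T A are the squared singular values of A.
s = np.linalg.svd(A, compute_uv=False)                 # descending order
assert np.allclose(np.sqrt(eigvals[::-1][:len(s)]), s)
```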

Finding orthonormal Column vectors $(U)$ for matrix $A$

We discussed that the Column Space of $A$ is the same as the Column Space of $AA^T$.
So we need to find orthonormal column vectors for matrix $AA^T$.

$$A_{m\times n}=U_{m\times m}\Sigma_{m\times n}V_{n\times n}^{T}$$
$(\Sigma^T_{m\times n}=\Sigma_{n\times m})$
$$\Rightarrow (A^T)_{n\times m} = V_{n\times n}\Sigma_{n\times m}U_{m\times m}^{T}$$
$$\Rightarrow A_{m\times n}(A^T)_{n\times m} = U_{m\times m}\Sigma_{m\times n} \underbrace{V_{n\times n}^{T} V_{n\times n}}_{=\mathcal{I}_n} \Sigma_{n\times m}U_{m\times m}^{T}$$

$$\Sigma_{m\times n}\Sigma_{n\times m}= \underbrace{\begin{bmatrix} \begin{bmatrix} \sigma_1 & & & \huge0 \\ & \sigma_2 & & \\ & & \ddots & \\ \huge0 & & & \sigma_m \end{bmatrix}_{m\times m} & \mathbf{0}_{m\times (n-m)} \end{bmatrix}}_{\Sigma_{m\times n}} \underbrace{\begin{bmatrix} \begin{bmatrix} \sigma_1 & & & \huge0 \\ & \sigma_2 & & \\ & & \ddots & \\ \huge0 & & & \sigma_m \end{bmatrix}_{m\times m} \\ \mathbf{0}_{(n-m)\times m} \end{bmatrix}}_{\Sigma_{n\times m}}$$
$$\Sigma_{m\times n}\Sigma_{n\times m}= \underbrace{\begin{bmatrix} \sigma_1^2 & & & \huge0 \\ & \sigma_2^2 & & \\ & & \ddots & \\ \huge0 & & & \sigma_m^2 \end{bmatrix}}_{\Sigma^2_{m\times m}} = \Sigma^2_{m\times m}$$
$$\Sigma^2_{m\times m}= \begin{bmatrix} \sigma_1^2 & & & \huge0 \\ & \sigma_2^2 & & \\ & & \ddots & \\ \huge0 & & & \sigma_m^2 \end{bmatrix}_{m\times m}$$

$$\Rightarrow A_{m\times n}(A^T)_{n\times m} = U_{m\times m}\Sigma^2_{m\times m}U_{m\times m}^{T}$$
This time there isn't any $\mathbf{0}$ matrix because, as we discussed, in the Full SVD $\text{Rank}(A)=m$, so $AA^T$ is a full-rank $m\times m$ matrix: all of its column vectors and row vectors are independent.

info

Remember $A=Q\Lambda Q^T$, where $A$ is a symmetric matrix.

$AA^T$ is a symmetric matrix, so

info

The columns of $U$ are the eigenvectors of matrix $AA^T$, and $\sigma_i^2$ is the $i^{th}$ eigenvalue of matrix $AA^T$.
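
(Putting it all together, a hedged end-to-end sketch on a made-up full-row-rank matrix: build $V$ and the $\sigma_i$'s from $A^TA$, get $U$ via $\vec{u}_i=A\vec{v}_i/\sigma_i$, and verify $A=U\Sigma V^T$. Using $A\vec{v}_i/\sigma_i$ instead of a separate eigendecomposition of $AA^T$ side-steps the sign ambiguity of eigenvectors.)

```python
import numpy as np

A = np.random.default_rng(1).standard_normal((2, 3))   # made-up example, Rank(A) = m
m, n = A.shape

eigvals, V = np.linalg.eigh(A.T @ A)       # columns of V: eigenvectors of A^T A
order = np.argsort(eigvals)[::-1]          # sort eigenpairs in descending order
eigvals, V = eigvals[order], V[:, order]

sigma = np.sqrt(eigvals[:m])               # nonzero singular values (rank = m here)
U = (A @ V[:, :m]) / sigma                 # u_i = A v_i / sigma_i, column by column

Sigma = np.zeros((m, n))
Sigma[:m, :m] = np.diag(sigma)

assert np.allclose(A, U @ Sigma @ V.T)     # A = U Sigma V^T
assert np.allclose(U.T @ U, np.eye(m))     # the u_i's are orthonormal
```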