Say we have an m×n matrix $Q$ with orthonormal column vectors.
Now say we want the projection of a vector (say $v$) onto the column space of $Q$.
Our projection matrix $P$ is:
$P = Q(Q^TQ)^{-1}Q^T$
and the projection of the vector $v$ onto the column space of $Q$ is $Pv$.
Because $Q$ has orthonormal column vectors,
$Q^TQ = I$.
So our projection matrix $P$ becomes $P = QIQ^T$, giving:
Projection Matrix $P$:
danger
$P = QQ^T$
(So now we don't have to take the inverse of $Q^TQ$.)
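As a quick numerical sketch of this (using NumPy, and `np.linalg.qr` just as a convenient way to get a matrix with orthonormal columns), both formulas give the same projection, with no inverse needed:

```python
import numpy as np

rng = np.random.default_rng(0)

# A 5x3 matrix Q with orthonormal columns, obtained via QR factorization.
A = rng.standard_normal((5, 3))
Q, _ = np.linalg.qr(A)

v = rng.standard_normal(5)

# Full projection formula: P = Q (Q^T Q)^{-1} Q^T
P_full = Q @ np.linalg.inv(Q.T @ Q) @ Q.T
# Simplified formula, valid because Q^T Q = I: P = Q Q^T
P_simple = Q @ Q.T

assert np.allclose(Q.T @ Q, np.eye(3))        # columns really are orthonormal
assert np.allclose(P_full @ v, P_simple @ v)  # same projection of v
```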
We can see a direct benefit of having a matrix with orthonormal column vectors in least squares.
info
In least squares we have an equation of the form $A^TAX = A^Tv$, and if $A$ has orthonormal column vectors,
then $A^TA = I$, so our equation becomes $X = A^Tv$.
No need to compute $(A^TA)^{-1}$.
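A small sketch of this shortcut (assuming NumPy; the orthonormal columns again come from a QR factorization): when $A$ has orthonormal columns, $A^Tv$ matches the full least-squares solution.

```python
import numpy as np

rng = np.random.default_rng(1)
A, _ = np.linalg.qr(rng.standard_normal((6, 2)))  # A has orthonormal columns
v = rng.standard_normal(6)

# Normal equations A^T A X = A^T v collapse to X = A^T v when A^T A = I.
x_shortcut = A.T @ v
x_lstsq, *_ = np.linalg.lstsq(A, v, rcond=None)   # general least-squares solver

assert np.allclose(x_shortcut, x_lstsq)
```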
As a quick check, $P = QQ^T$ has the properties of a projection matrix: $P^T = (QQ^T)^T = QQ^T = P$ and $P^2 = QQ^TQQ^T = QIQ^T = P$.
Ok, now we know that matrices with orthonormal column vectors are important.
But if our matrix (with independent column vectors) does not have orthonormal column vectors, then how do we make its columns orthonormal?
This is where Gram–Schmidt comes into the picture.
First let's look at the smaller picture.
Say that we have 2 vectors $a \in \mathbb{R}^n$ and $b \in \mathbb{R}^n$.
We have just 2 (non-parallel) vectors, and the span of 2 (non-parallel) vectors is just a 2-dimensional plane.
What we want is two orthonormal vectors (say $q_a$ and $q_b$) in this 2-dimensional plane.
Let's take $a$ as our first vector; it's easy because it's only one vector, so we just normalize it.
So
info
$q_a = \frac{a}{\|a\|}$
Now let's find the second orthonormal vector. The IDEA is to take $b$ and remove its direction along $q_a$.
So first take the projection of $b$ onto the vector space of $q_a$; call this projection of
$b$ on $q_a$ $b_p$.
Now the vector joining $b$ and $b_p$ is orthogonal to $q_a$; call this vector $b_o$.
And $b_p + b_o = b$, so $b_o = b - b_p$. Recall our projection matrix $P$ onto a vector $v$:
$P = \frac{vv^T}{v^Tv}$
So the projection of $b$ onto the vector space of $q_a$ is $b_p = Pb$:
$b_p = \frac{q_aq_a^Tb}{q_a^Tq_a}$ (and $q_a^Tq_a = 1$)
$\Rightarrow b_p = q_aq_a^Tb$.
info
$b_o = b - (q_a \cdot b)\,q_a$
And
$q_b = \frac{b_o}{\|b_o\|}$
And we know that $b_o$ is perpendicular to $q_a$, and if we think of $q_a$ as
a matrix with one column, then $b_o$ is perpendicular to the column space of
$q_a$.
So $b_o$ is in the null space of $q_a^T$, i.e. $q_a^Tb_o = 0$. Let's verify it:
$q_a^T(b - q_aq_a^Tb) = 0$
$\Rightarrow q_a^Tb - (q_a^Tq_a)q_a^Tb = 0$
$\Rightarrow q_a^Tb - q_a^Tb = 0$ ✓
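The two-vector case above can be sketched numerically (a NumPy sketch; the concrete values of $a$ and $b$ are just an example):

```python
import numpy as np

a = np.array([3.0, 0.0, 4.0])
b = np.array([1.0, 2.0, 2.0])

q_a = a / np.linalg.norm(a)        # q_a = a / ||a||
b_o = b - (q_a @ b) * q_a          # b_o = b - (q_a . b) q_a
q_b = b_o / np.linalg.norm(b_o)    # q_b = b_o / ||b_o||

assert np.isclose(q_a @ b_o, 0.0)                # b_o is in the null space of q_a^T
assert np.isclose(q_a @ q_b, 0.0)                # q_a ⟂ q_b
assert np.isclose(np.linalg.norm(q_b), 1.0)      # unit length
```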
Now let's push ourselves a little further and add another vector $c$.
We have our two orthonormal vectors $q_a$ and $q_b$, and this vector $c$
is not in the vector space of $q_a$ and $q_b$; in other words, $c$ is
out of the plane spanned by the vectors $q_a$ and $q_b$.
So $c$ gives us access to the 3rd dimension. Using this $c$ we need to find a
vector orthogonal to the vector space of $q_a$ and $q_b$. IDEA: from $c$, first remove its direction along $q_a$ and then remove its direction
along $q_b$.
First remove its direction along $q_a$.
Let's call this vector $c_{oa}$:
$c_{oa} = c - \frac{q_aq_a^Tc}{q_a^Tq_a}$; since $q_a^Tq_a = 1$,
$\Rightarrow c_{oa} = c - q_aq_a^Tc$
$\Rightarrow c_{oa} = c - (q_a \cdot c)\,q_a$
Now remove its direction along $q_b$.
This gives our vector $c_o$:
$c_o = (c - (q_a \cdot c)\,q_a) - \frac{q_bq_b^Tc}{q_b^Tq_b}$; since $q_b^Tq_b = 1$,
$\Rightarrow c_o = (c - (q_a \cdot c)\,q_a) - (q_b \cdot c)\,q_b$
$q_c = \frac{c_o}{\|c_o\|}$
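The three-vector step can be sketched the same way (a NumPy sketch; the sample vectors are arbitrary independent vectors in $\mathbb{R}^3$):

```python
import numpy as np

a = np.array([1.0, 1.0, 0.0])
b = np.array([1.0, 0.0, 1.0])
c = np.array([0.0, 1.0, 1.0])

q_a = a / np.linalg.norm(a)
b_o = b - (q_a @ b) * q_a
q_b = b_o / np.linalg.norm(b_o)

c_oa = c - (q_a @ c) * q_a      # first strip c's direction along q_a
c_o = c_oa - (q_b @ c) * q_b    # then strip its direction along q_b
q_c = c_o / np.linalg.norm(c_o)

Q = np.column_stack([q_a, q_b, q_c])
assert np.allclose(Q.T @ Q, np.eye(3))  # q_a, q_b, q_c are orthonormal
```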
Now we can see a pattern here.
Say that we have $n$ independent vectors $(a_1, a_2, \cdots, a_n)$ and we have to find
$n$ orthonormal vectors $(q_1, q_2, \cdots, q_n)$ using these $n$ independent vectors.
We can deduce the pattern from above:
$a_{io} = a_i - \sum_{k=1}^{i-1}(q_k \cdot a_i)\,q_k$
$q_i = \frac{a_{io}}{\|a_{io}\|}$
So now we can find orthonormal vectors for any set of independent vectors.
This is the Gram–Schmidt process.
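The general pattern above can be sketched as a short function (a NumPy sketch of classical Gram–Schmidt; the function name `gram_schmidt` is just my choice):

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthonormalize a list of independent vectors (classical Gram-Schmidt)."""
    qs = []
    for a in vectors:
        # a_io = a_i - sum_{k=1}^{i-1} (q_k . a_i) q_k
        a_o = a - sum((q @ a) * q for q in qs)
        # q_i = a_io / ||a_io||
        qs.append(a_o / np.linalg.norm(a_o))
    return np.column_stack(qs)

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))      # random columns are independent (almost surely)
Q = gram_schmidt(list(A.T))
assert np.allclose(Q.T @ Q, np.eye(4))  # columns of Q are orthonormal
```

Note that each $q_i$ spans the same space as $a_1, \cdots, a_i$, which is exactly what the QR factorization computes.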