Recall our previous example, where we have a
system of equations $AX = Y$.
Let's take an example. Suppose we have 3 points of the form $(a_1, a_2)$: $(1,1)$, $(2,2)$, $(3,2)$.
Our objective is to find the best possible linear function for $a_2$; call that function $f$.
Our function might not give the exact $a_2$ that corresponds to each $a_1$, but it will give us the best possible
approximation of $a_2$.
The simplest linear function is $a_2 = f(a_1) = x_1 + a_1 x_2$.
Here $x_1, x_2$ are our (unknown) parameters.
Our observations say: $f(1) = x_1 + 1x_2 = 1$, $f(2) = x_1 + 2x_2 = 2$, $f(3) = x_1 + 3x_2 = 2$.
We can also write it as,
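$$\underbrace{\begin{bmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{bmatrix}}_{A} \underbrace{\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}}_{X} = \underbrace{\begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix}}_{Y}$$
$$AX = Y$$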
So we want to find the linear combination of the column vectors of $A$ that gives us
$Y$, but $Y$ does not live in the column space of $A$.
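One way to see that $Y$ is not in the column space of $A$ is to solve the first two equations exactly and then check the third. The snippet below is a minimal sketch in plain Python; the variable names (`points`, `third_lhs`) are illustrative choices, not part of the example itself.

```python
from fractions import Fraction

# The three data points (a1, a2) from the example.
points = [(1, 1), (2, 2), (3, 2)]

# Each row of A is [1, a1]; Y collects the observed a2 values.
A = [[1, a1] for a1, _ in points]
Y = [a2 for _, a2 in points]

# Solve the first two equations x1 + a1*x2 = a2 exactly.
x2 = Fraction(Y[1] - Y[0], A[1][1] - A[0][1])
x1 = Y[0] - A[0][1] * x2

# Plug that solution into the left-hand side of the third equation.
third_lhs = A[2][0] * x1 + A[2][1] * x2

print("x1 =", x1, "x2 =", x2)
print("third equation gives", third_lhs, "but the observation is", Y[2])
```

The only line through the first two points predicts 3 at $a_1 = 3$, while the observed value is 2, so no choice of $x_1, x_2$ satisfies all three equations at once.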
So now we will find a vector $\hat{Y}$ in the column space of $A$
that is closest to $Y$; here closeness is determined by the Euclidean distance between
$Y$ and $\hat{Y}$.
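Written as a formula, this requirement reads as follows. The notation is an assumption made here for illustration: $\hat{Y}$ is the closest vector, $C(A)$ is the column space of $A$, and $Z$ is a candidate vector in that space.

$$\hat{Y} = \arg\min_{Z \in C(A)} \lVert Y - Z \rVert, \qquad \lVert Y - Z \rVert = \sqrt{\sum_{i=1}^{3} (Y_i - Z_i)^2}$$

Since every $Z$ in $C(A)$ has the form $AX$, for our three points this is the same as choosing $x_1, x_2$ to minimize

$$(1 - x_1 - x_2)^2 + (2 - x_1 - 2x_2)^2 + (2 - x_1 - 3x_2)^2.$$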
So instead we will solve,
$$A\hat{X} = \hat{Y}$$
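(The hat on $\hat{X}$ is just a way to indicate that our solution is an estimate of the exact solution.) $\hat{Y}$ lives in the column space of $A$ and is the point of that space closest to $Y$, while $Y$ lies outside the column space of $A$, so
the vector $Y - \hat{Y}$ is perpendicular to the column space of $A$.
$\Rightarrow Y - \hat{Y}$ is in the null space of $A^T$.
$\Rightarrow A^T(Y - \hat{Y}) = 0$, and we know that $\hat{Y} = A\hat{X}$.
$\Rightarrow A^T(Y - A\hat{X}) = 0$
$\Rightarrow A^T A \hat{X} = A^T Y$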
$$A^T A = \begin{bmatrix} 3 & 6 \\ 6 & 14 \end{bmatrix}, \quad A^T Y = \begin{bmatrix} 5 \\ 11 \end{bmatrix}$$
Now we have to solve,
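$$\begin{bmatrix} 3 & 6 \\ 6 & 14 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 5 \\ 11 \end{bmatrix}$$
We can write it as,
$$3x_1 + 6x_2 = 5$$
$$6x_1 + 14x_2 = 11$$
By solving we get $x_1 = \frac{2}{3}$ and $x_2 = \frac{1}{2}$.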
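The computation of $A^T A$, $A^T Y$ and the solution can be checked with a few lines of Python. This is a minimal sketch using only the standard library and exact fractions; it solves the $2 \times 2$ system with Cramer's rule, which is a choice made here for brevity and not a method prescribed by the text.

```python
from fractions import Fraction

A = [[1, 1], [1, 2], [1, 3]]
Y = [1, 2, 2]

# Columns of A; entry (i, j) of A^T A is the dot product of columns i and j.
cols = list(zip(*A))
AtA = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]

# Entry i of A^T Y is the dot product of column i of A with Y.
AtY = [sum(p * y for p, y in zip(ci, Y)) for ci in cols]

# Solve the 2x2 normal equations with Cramer's rule, keeping exact fractions.
det = AtA[0][0] * AtA[1][1] - AtA[0][1] * AtA[1][0]
x1 = Fraction(AtY[0] * AtA[1][1] - AtA[0][1] * AtY[1], det)
x2 = Fraction(AtA[0][0] * AtY[1] - AtA[1][0] * AtY[0], det)

print(AtA, AtY)  # [[3, 6], [6, 14]] [5, 11]
print(x1, x2)    # 2/3 1/2
```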
So our function $f(a_1) = x_1 + a_1 x_2$ becomes,
$$f(a_1) = \frac{2}{3} + \frac{1}{2} a_1$$
Let's take a look at our estimate for each of our 3 data points and its error (which is $a_2 - \hat{a}_2$).
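For $(a_1, a_2) = (1, 1)$
$$\hat{a}_2 = \frac{2}{3} + \frac{1}{2}(1) = \frac{7}{6}$$
$$e_1 = 1 - \frac{7}{6} = -\frac{1}{6}$$
For $(a_1, a_2) = (2, 2)$
$$\hat{a}_2 = \frac{2}{3} + \frac{1}{2}(2) = \frac{5}{3}$$
$$e_2 = 2 - \frac{5}{3} = \frac{1}{3}$$
For $(a_1, a_2) = (3, 2)$
$$\hat{a}_2 = \frac{2}{3} + \frac{1}{2}(3) = \frac{13}{6}$$
$$e_3 = 2 - \frac{13}{6} = -\frac{1}{6}$$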
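Now represent our estimates and errors as vectors,
$$\hat{Y} = \begin{bmatrix} \frac{7}{6} \\ \frac{5}{3} \\ \frac{13}{6} \end{bmatrix}, \quad e = Y - \hat{Y} = \begin{bmatrix} -\frac{1}{6} \\ \frac{1}{3} \\ -\frac{1}{6} \end{bmatrix}$$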
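The estimates and errors can be reproduced with a short Python sketch. It assumes the fitted parameters $x_1 = \frac{2}{3}$, $x_2 = \frac{1}{2}$ from the previous step; the names `Y_hat` and `e` are chosen here only to mirror the vectors in the text.

```python
from fractions import Fraction

points = [(1, 1), (2, 2), (3, 2)]

# Fitted parameters from the normal equations.
x1, x2 = Fraction(2, 3), Fraction(1, 2)

def f(a1):
    """Fitted line f(a1) = x1 + a1 * x2."""
    return x1 + a1 * x2

# Estimate for each point, and the error a2 - estimate.
Y_hat = [f(a1) for a1, _ in points]
e = [a2 - y_hat for (_, a2), y_hat in zip(points, Y_hat)]

print([str(v) for v in Y_hat])  # ['7/6', '5/3', '13/6']
print([str(v) for v in e])      # ['-1/6', '1/3', '-1/6']
```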
As we discussed above, $\hat{Y}$ is in the column space of $A$
and $Y - \hat{Y}$ is perpendicular to that column space.
We can now see it in this example.
First notice that the dot product of $\hat{Y}$ and $Y - \hat{Y}$ is 0.
$$\hat{Y} \cdot (Y - \hat{Y}) = \hat{Y}^T (Y - \hat{Y}) = 0$$
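Writing out the dot product term by term with the estimate and error vectors of this example confirms it:

$$\hat{Y} \cdot (Y - \hat{Y}) = \frac{7}{6}\left(-\frac{1}{6}\right) + \frac{5}{3}\cdot\frac{1}{3} + \frac{13}{6}\left(-\frac{1}{6}\right) = -\frac{7}{36} + \frac{20}{36} - \frac{13}{36} = 0$$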
As we said, $Y - \hat{Y}$ is perpendicular to the whole column space, so
you can take any linear combination of the columns of $A$ and it will be
perpendicular to $Y - \hat{Y}$.
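This can be checked numerically for the example. The sketch below takes the error vector computed earlier and tests it against each column of $A$ and against one linear combination of them; the coefficients `c1` and `c2` are arbitrary values picked for illustration.

```python
from fractions import Fraction

A = [[1, 1], [1, 2], [1, 3]]

# Error vector e = Y - Y_hat from the example.
e = [Fraction(-1, 6), Fraction(1, 3), Fraction(-1, 6)]

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(p * q for p, q in zip(u, v))

col1 = [row[0] for row in A]
col2 = [row[1] for row in A]

# An arbitrary linear combination of the columns of A.
c1, c2 = 5, -7
combo = [c1 * p + c2 * q for p, q in zip(col1, col2)]

print(dot(col1, e), dot(col2, e), dot(combo, e))  # 0 0 0
```

Because the dot product is linear, being perpendicular to both columns is enough to guarantee a zero dot product with every combination of them.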