A transformation t of V is linear
<=>
For all vectors u, v and all real numbers r

        t(u+v) = t(u) + t(v)   and   t(r.u) = r.t(u)

The set of all linear transformations of V is L(V).
Examples:

        t : R x R -> R x R : (x,y) -> (x+y, x)

        t : R x R -> R x R : (x,y) -> (0, y)

        t : R -> R : x -> 6x

        R[x] is the set of all polynomials in x with real coefficients.
        t : R[x] -> R[x] : p(x) -> p'(x), with p'(x) the derivative of p(x)

Exercise: check for each example that both conditions are met. Search for an
original example and verify your answer.
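The two conditions can also be tested numerically on sample vectors. Here is a short Python sketch (using numpy) for the first example, t(x,y) = (x+y, x):

```python
import numpy as np

def t(v):
    # t : R x R -> R x R : (x, y) -> (x + y, x)
    return np.array([v[0] + v[1], v[0]])

u = np.array([2.0, -3.0])
v = np.array([1.0, 5.0])
r = 4.0

# First condition: t(u + v) = t(u) + t(v)
assert np.allclose(t(u + v), t(u) + t(v))
# Second condition: t(r.u) = r.t(u)
assert np.allclose(t(r * u), r * t(u))
```

A failing sample would disprove linearity; passing samples only support it, so the algebraic check of the exercise is still needed.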
Counterexample:
        t : R x R -> R x R : (x,y) -> (x+4, y)

Why is this not a linear transformation?
I : V -> V : v -> v
        t(0) = t(0.v) = 0.t(v) = 0

Hence, the image of the vector 0 is 0.
t is in L(V)
<=>
For all vectors u, v and all real numbers r, s

        t(r.u + s.v) = r.t(u) + s.t(v)

Proof:

Part 1: If t is linear, then

        t(r.u + s.v) = t(r.u) + t(s.v) = r.t(u) + s.t(v)

Part 2: If t(r.u + s.v) = r.t(u) + s.t(v) for all r, s, then

        take r = s = 1 :  t(u+v) = t(u) + t(v)
        take s = 0     :  t(r.u) = r.t(u)

Q.E.D.
Proof :
An arbitrary vector v in V can be written as v = k.e_{1} + l.e_{2} + m.e_{3}.
An arbitrary vector w in V can be written as w = k'.e_{1} + l'.e_{2} + m'.e_{3}.
Then v + w = (k+k')e_{1} + (l+l')e_{2} + (m+m')e_{3}.
In order to prove the existence of such a transformation, we start
with a well-chosen transformation t of V.
We'll show afterwards that the required conditions for linearity are met.
So, first we define a transformation t by specifying the image of each vector.
If v = k.e_{1} + l.e_{2} + m.e_{3}, then we define

        t(v) = k.u_{1} + l.u_{2} + m.u_{3}

First condition: t is linear, because:
        t(v + w) = t( (k+k')e_{1} + (l+l')e_{2} + (m+m')e_{3} )
                 = (k+k')u_{1} + (l+l')u_{2} + (m+m')u_{3}                      (our definition of t)
                 = k.u_{1} + l.u_{2} + m.u_{3} + k'.u_{1} + l'.u_{2} + m'.u_{3} (property in V)
                 = t(v) + t(w)                                                  (our definition of t)

        t(r.v) = t(rk.e_{1} + rl.e_{2} + rm.e_{3})
               = rk.u_{1} + rl.u_{2} + rm.u_{3}     (our definition of t)
               = r(k.u_{1} + l.u_{2} + m.u_{3})     (property in V)
               = r.t(v)                             (our definition of t)

Second condition:
        t(e_{1}) = t(1.e_{1} + 0.e_{2} + 0.e_{3}) = 1.u_{1} + 0.u_{2} + 0.u_{3} = u_{1}
        t(e_{2}) = t(0.e_{1} + 1.e_{2} + 0.e_{3}) = 0.u_{1} + 1.u_{2} + 0.u_{3} = u_{2}
        t(e_{3}) = t(0.e_{1} + 0.e_{2} + 1.e_{3}) = 0.u_{1} + 0.u_{2} + 1.u_{3} = u_{3}

The existence is proved.
Suppose t and t' are two linear transformations such that
t(e_{1}) = u_{1} and t'(e_{1}) = u_{1}
t(e_{2}) = u_{2} and t'(e_{2}) = u_{2}
t(e_{3}) = u_{3} and t'(e_{3}) = u_{3}
Then, for each v in V
        t(v) = t(k.e_{1} + l.e_{2} + m.e_{3})
             = k.u_{1} + l.u_{2} + m.u_{3}              (our definition of t)

and

        t'(v) = t'(k.e_{1} + l.e_{2} + m.e_{3})
              = t'(k.e_{1}) + t'(l.e_{2}) + t'(m.e_{3})   (t' is linear)
              = k.t'(e_{1}) + l.t'(e_{2}) + m.t'(e_{3})   (t' is linear)
              = k.u_{1} + l.u_{2} + m.u_{3}

So t = t'.

Conclusion:
A linear transformation of a real vector space V is completely and unambiguously determined by the images of the basis vectors of V.
Example 1:
There is exactly one linear transformation t of the vector space R x R
such that
t(1,0) = (3,2)
t(0,1) = (5,4)
We calculate the image of the vector (-1,5) = -1(1,0) + 5(0,1) .
t(-1,5) = t( -1(1,0) + 5(0,1) ) = -1(3,2) + 5(5,4) = (22,18)
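The same computation in a short numpy sketch:

```python
import numpy as np

# The given images of the basis vectors
t_e1 = np.array([3, 2])   # t(1,0)
t_e2 = np.array([5, 4])   # t(0,1)

# (-1,5) = -1.(1,0) + 5.(0,1), so by linearity:
image = -1 * t_e1 + 5 * t_e2
assert (image == np.array([22, 18])).all()
```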
Example 2:
Take the real vector space C of the complex numbers.
In that vector space we choose an ordered basis (1, i).
A linear transformation of C is completely and unambiguously
determined by the images of 1 and i.
We define a linear transformation t by
t(1) = 1 + i
t(i) = 1 - i
Now we can calculate the image of each vector. For example, we calculate
the image of 3 - 2i.
t(3 - 2i) = t( 3 . 1 + (-2) . i ) = 3 (1+i) + (-2)(1-i) = 1 + 5i
If (e_{1}, e_{2}, e_{3}) is an ordered basis of V,
and if (u_{1}, u_{2}, u_{3}) is an arbitrarily chosen ordered set of three vectors from V,
we know that there is exactly one linear transformation t of V such that
t(e_{1}) = u_{1}
t(e_{2}) = u_{2}
t(e_{3}) = u_{3}
These vectors u_{1}, u_{2}, u_{3} can be expressed in e_{1}, e_{2}, e_{3} .
        u_{1} = a.e_{1} + b.e_{2} + c.e_{3}
        u_{2} = d.e_{1} + e.e_{2} + f.e_{3}
        u_{3} = g.e_{1} + h.e_{2} + i.e_{3}

An arbitrary vector v = k.e_{1} + l.e_{2} + m.e_{3} is transformed by t into

        t(v) = k.u_{1} + l.u_{2} + m.u_{3}
        t(v) = k.u_{1} + l.u_{2} + m.u_{3}
<=>
        t(v) = k.(a.e_{1} + b.e_{2} + c.e_{3}) + l.(d.e_{1} + e.e_{2} + f.e_{3}) + m.(g.e_{1} + h.e_{2} + i.e_{3})
<=>
        t(v) = (k.a + l.d + m.g).e_{1} + (k.b + l.e + m.h).e_{2} + (k.c + l.f + m.i).e_{3}

The coordinates of v relative to the basis (e_{1}, e_{2}, e_{3}) are (k,l,m).
Call the coordinates of t(v) relative to that basis (k',l',m'). Then

        k' = k.a + l.d + m.g
        l' = k.b + l.e + m.h
        m' = k.c + l.f + m.i
<=>
        [k']   [a d g] [k]
        [l'] = [b e h].[l]
        [m']   [c f i] [m]

The last matrix formula is the transformation formula associated with t relative
to the basis (e_{1}, e_{2}, e_{3}). With this matrix we transform the coordinates
of v into the coordinates of t(v). The matrix
        [a d g]
        [b e h]
        [c f i]

is called the matrix of the linear transformation relative to the basis
(e_{1}, e_{2}, e_{3}). The columns of this matrix are the coordinates of
u_{1}, u_{2}, u_{3}.
Example 1:
In R^{2} we choose ( (1,0) , (0,1) ) as basis.
There is exactly one linear transformation of R^{2} such that
t(1,0) = (3,2)
t(0,1) = (5,4)
The matrix of the linear transformation relative to the chosen basis is
        [3 5]
        [2 4]

Take v with coordinates (3,1). The coordinates of t(v) are (14,10) because:

        [3 5][3]   [14]
        [2 4][1] = [10]

Example 2:
In the vector space of polynomials of degree at most two, with basis (1, x, x^{2}), we build a linear transformation t by choosing x, x + 1 and 3x - 2 as the images of the three basis vectors.
The matrix of the linear transformation is
        [0 1 -2]
        [1 1  3]
        [0 0  0]

The columns of this matrix are the coordinates of x, x + 1 and 3x - 2.
The polynomial 2x^{2} - x + 4 has coordinates (4,-1,2).
The image of the polynomial 2x^{2} - x + 4 by the linear transformation t has coordinates
        [0 1 -2] [ 4]   [-5]
        [1 1  3] [-1] = [ 9]
        [0 0  0] [ 2]   [ 0]

The image is the polynomial 9x - 5.
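Working in coordinates relative to (1, x, x^{2}), this calculation is an ordinary matrix-vector product; a short numpy check:

```python
import numpy as np

# Columns: coordinates of x, x+1 and 3x-2 relative to the basis (1, x, x^2)
A = np.array([[0, 1, -2],
              [1, 1,  3],
              [0, 0,  0]])

p = np.array([4, -1, 2])  # coordinates of 2x^2 - x + 4
image = A @ p             # coordinates of the image polynomial
assert (image == np.array([-5, 9, 0])).all()   # the polynomial 9x - 5
```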
The matrix of the rotation about O through an angle t is

        [cos(t) -sin(t)]
        [sin(t)  cos(t)]

Example:
The rotation about O by an angle of 60 degrees has as matrix
        [0.5   -0.866]
        [0.866  0.5  ]

The coordinates of the image w of the vector v(4,7) are

        [0.5   -0.866][4]
        [0.866  0.5  ][7]

        w = w(-4.062 ; 6.964)
Check this result with a figure.
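Besides a figure, the result can be checked numerically; the sketch below also verifies that a rotation preserves the length of the vector:

```python
import numpy as np

t = np.deg2rad(60)
R = np.array([[np.cos(t), -np.sin(t)],
              [np.sin(t),  np.cos(t)]])

v = np.array([4.0, 7.0])
w = R @ v

# Matches the rounded values above
assert np.allclose(w, [-4.062, 6.964], atol=1e-3)
# A rotation preserves length
assert np.isclose(np.linalg.norm(w), np.linalg.norm(v))
```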
        [cos(t)  sin(t)]
        [sin(t) -cos(t)]

Now, with m = tan(t/2) the slope of the mirror line,

                 2 tan(t/2)         2m
        sin(t) = ----------------  = -----------
                 1 + tan^{2}(t/2)    1 + m^{2}

                 1 - tan^{2}(t/2)    1 - m^{2}
        cos(t) = ----------------  = -----------
                 1 + tan^{2}(t/2)    1 + m^{2}

The matrix becomes

        [ (1-m^{2})/(1+m^{2})    2m/(1+m^{2})        ]
        [ 2m/(1+m^{2})          -(1-m^{2})/(1+m^{2}) ]

Example:
We consider the orthogonal reflection in the line y = 0.5 x.
Then m = 0.5
The matrix of the orthogonal reflection is
        [0.6  0.8]
        [0.8 -0.6]

The coordinates of the image w of the vector v(4,7) are

        [0.6  0.8][4]
        [0.8 -0.6][7]

        w = w(8,-1)
Check this result with a figure.
Reflection in the x-axis
In this case t = 0. The matrix of the orthogonal reflection is
        [ 1  0]
        [ 0 -1]

Reflection in the y-axis

        [-1 0]
        [ 0 1]

Reflection in the line y = x

        [0 1]
        [1 0]
        [cos^{2}(t)    sin(t)cos(t)]
        [cos(t)sin(t)  sin^{2}(t)  ]

Now, with m = tan(t) the slope of the line,

                         1
        cos^{2}(t) = ---------
                     1 + m^{2}

                       m^{2}
        sin^{2}(t) = ---------
                     1 + m^{2}

                                            m
        cos(t)sin(t) = (1/2) sin(2t) = ---------
                                       1 + m^{2}

The matrix of the orthogonal projection is

        [ 1/(1+m^{2})   m/(1+m^{2})     ]
        [ m/(1+m^{2})   m^{2}/(1+m^{2}) ]

Example:
We'll find the matrix of the orthogonal projection on the line y = 2x. So, m = 2.
The matrix of the projection is
        [0.2 0.4]
        [0.4 0.8]

The coordinates of the image w of the vector v(4,7) are

        [0.2 0.4][4]
        [0.4 0.8][7]

        w = w(3.6 ; 7.2)
Check this result with a figure.
Exercise: Find the matrices of the orthogonal projection on the x-axis, on the y-axis, on the line y = -x.
Take u and v in ker(t), so t(u) = t(v) = 0. Then

        t(r.u + s.v) = r.t(u) + s.t(v) = r.0 + s.0 = 0

So, r.u + s.v is in ker(t). Hence the kernel is a subspace of V.
Example 1:
Relative to a basis the matrix of a linear transformation is
        [0 1 -2]
        [1 1  3]
        [0 0  0]

The vector v with coordinates (x,y,z) belongs to the null-space if and only if

        [0 1 -2] [x]   [0]
        [1 1  3].[y] = [0]
        [0 0  0] [z]   [0]
<=>
        y - 2z = 0
        x + y + 3z = 0
<=>
        y = 2z
        x = -5z

For each z-value there is a vector of ker(t).
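A short numpy check confirms that every vector of the form (-5z, 2z, z) is sent to 0:

```python
import numpy as np

A = np.array([[0, 1, -2],
              [1, 1,  3],
              [0, 0,  0]])

# For every z, the vector (-5z, 2z, z) lies in the null-space
for z in (1.0, -3.5, 7.0):
    v = np.array([-5 * z, 2 * z, z])
    assert np.allclose(A @ v, 0)
```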
Example 2:
R[x] is the set of all polynomials in x with real coefficients.
t : R[x] -> R[x] : p(x) --> p'(x) with p'(x) is the derivative of p(x)
The null-space is the set of all polynomials p(x) such that p'(x) = 0, i.e. the constant polynomials. Ker(t) = R.
Example 3:
Take the orthogonal projection of all vectors of the plane on the line y = m x.
This transformation is a linear transformation.
The null-space is the set of all vectors v such that orthogonal projection(v) = 0.
The image points of these vectors are the points of the line y = (-1/m) x.
It is a subspace of V with dimension 1.
The vector v is a fixed point of t if and only if

        t(v) = v

It is easy to show that the set of all fixed points of t is a subspace of V. (exercise)
Note that fixed points are vectors.
Example:
Let t = the orthogonal projection of all vectors of the plane on the line y = m x. All vectors with image point on the line y = m x are fixed points of t.
                [a d g]
        A_{o} = [b e h]
                [c f i]

The linear transformation t transforms an arbitrary vector v with coordinates
(k,l,m) into the vector t(v) with coordinates (k',l',m'):

        [k']   [a d g] [k]
        [l'] = [b e h].[l]
        [m']   [c f i] [m]

                     [k']              [k]
        Let K_{o}' = [l']  and K_{o} = [l]
                     [m']              [m]

K_{o} is the matrix of the coordinates of v and K_{o}' is the matrix of the
coordinates of t(v), relative to the basis (e_{1}, e_{2}, e_{3}), and

        K_{o}' = A_{o}.K_{o}     (*)

Now we take a new basis in V. Then all the vectors of V get new coordinates.
From the theory of vector spaces, we know that these new coordinates are linked
to the old ones by a transformation matrix C.
Denote the new coordinates of a vector, in matrix form, as K_{n}. Index o stands for old, index n stands for new.
Then K_{o} = C.K_{n} and K_{o}' = C.K_{n}'
We write (*) with the new coordinates.
        C.K_{n}' = A_{o}.C.K_{n}
<=>
        K_{n}' = C^{-1}.A_{o}.C.K_{n}     (**)

The last formula gives the connection between the coordinates of v and t(v)
relative to the new basis.
Denote A_{n} as the matrix of t relative to the new basis. Then we have
        K_{n}' = A_{n}.K_{n}     (***)

From (**) and (***) we see that

        A_{n} = C^{-1}.A_{o}.C

The last formula gives us the possibility to calculate the new matrix A_{n} of t
from the old matrix A_{o}.
Example
In a 2-dimensional space with basis (e_{1}, e_{2}), a linear transformation t has matrix
        [ 3 1]
        [-1 1]

Now we take a new basis

        e_{1}' = e_{1} + e_{2}
        e_{2}' = e_{1} - e_{2}

Then the transformation matrix C is

        [1  1]
        [1 -1]

and from this C^{-1} is

        [1/2  1/2]
        [1/2 -1/2]

The matrix of the linear transformation t relative to the new basis is

        [1/2  1/2] [ 3 1] [1  1]   [2 0]
        [1/2 -1/2] [-1 1] [1 -1] = [2 2]
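The calculation A_{n} = C^{-1}.A_{o}.C can be checked with a few lines of numpy:

```python
import numpy as np

A_old = np.array([[ 3, 1],
                  [-1, 1]])
C = np.array([[1,  1],
              [1, -1]])    # columns: the new basis vectors in old coordinates

A_new = np.linalg.inv(C) @ A_old @ C
assert np.allclose(A_new, [[2, 0], [2, 2]])
```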
Application
In a plane we start with two orthogonal axes and two unit vectors (e_{1}, e_{2}) as basis. We take a rotation with an angle of pi/4 radians. We know that this is a linear transformation with matrix A
        A = [1/sqrt(2) -1/sqrt(2)]
            [1/sqrt(2)  1/sqrt(2)]

          = 1/sqrt(2) . [1 -1]
                        [1  1]

Suppose that for a given application it is useful to move to skewed axes with
new basis vectors (e_{1}', e_{2}') such that

        e_{1}' = e_{1}
        e_{2}' = e_{1} + e_{2}

If we want to use the rotation in the new coordinate system, we need to
transform the matrix A for use in this new coordinate system. This new matrix
is equal to C^{-1}.A.C
The columns of C are the coordinates of the new basis-vectors with respect to the original (old) basis vectors .
        C = [1 1]
            [0 1]

After calculation, the new matrix A_{n} of the rotation is

        A_{n} = 1/sqrt(2) . [0 -2]
                            [1  2]
Two matrices A and B are similar if and only if there is an invertible matrix C
such that

        B = C^{-1}.A.C

As a corollary of the previous formula, we see that two matrices of a linear
transformation, relative to different bases, are similar.
        B = C^{-1}.A.C
        => |B| = |C^{-1}|.|A|.|C|
        => |B| = |C^{-1}|.|C|.|A|
        => |B| = |A|

So, similar matrices have the same determinant.
        t + t' : V -> V : v -> t(v) + t'(v)

It can easily be proved that t + t' is a linear transformation and that the
matrix of t + t' is equal to the sum of the matrices of t and of t'.
        r.t : V -> V : v -> r.t(v)

It can easily be proved that r.t is a linear transformation and that the matrix
of r.t is equal to r times the matrix of t.
        t' o t : V -> V : v -> t'(t(v))

It can easily be proved that t' o t is a linear transformation and that the
matrix of t' o t is equal to the product of the matrix of t' and the matrix of
t, in that order.
Example
t_{1} and t_{2} are linear transformations with matrices A and B respectively.
        A = [1 3]      B = [-1 1]
            [0 3]          [ 1 4]

The linear transformation t_{2} o t_{1} has the matrix

        B.A = [-1  0]
              [ 1 15]

The linear transformation t_{1} o t_{2} has the matrix

        A.B = [2 13]
              [3 12]

The composition of linear transformations is not commutative.
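A numpy sketch of both products makes the non-commutativity explicit:

```python
import numpy as np

A = np.array([[1, 3],
              [0, 3]])
B = np.array([[-1, 1],
              [ 1, 4]])

assert (B @ A == np.array([[-1,  0], [1, 15]])).all()  # matrix of t2 o t1
assert (A @ B == np.array([[ 2, 13], [3, 12]])).all()  # matrix of t1 o t2
assert not (A @ B == B @ A).all()                      # not commutative
```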
        for n = 2 :  t^{2} : V -> V : v -> t(t(v))
        for n > 2 :  t^{n} : V -> V : v -> t(t^{n-1}(v))

From this it follows that t^{n} has matrix A^{n}.

Example:

        t^{4}(v) = t(t(t(t(v))))
        3 t^{3} - 4 t^{2} + t + I is a linear transformation with matrix
        3 A^{3} - 4 A^{2} + A + I

        (t - I).(2 t^{2} + 5 I) is a linear transformation with matrix
        (A - I).(2 A^{2} + 5 I)

        (t - I).(2 t^{2} + 5 I) = 2 t^{3} - 2 t^{2} + 5 t - 5 I

Hence:
f(t) = 0 <=> f(A) = 0
Then v can be written in exactly one way as v = m + n, with m in M and n in N.

Now we can define the transformation

        p : V --> V : v --> m

We define this transformation as the projection of V on M relative to N.

It is easy to show that this transformation is linear.
Say p is the projection of V on M relative to N, then
        p(2x^{3} - x^{2} + 4x - 7) = 4x - 7

Say q is the projection of V on N relative to M, then

        q(2x^{3} - x^{2} + 4x - 7) = 2x^{3} - x^{2}

We choose the basis (1, x, x^{2}, x^{3}) in V. Now we can create the matrix of
the projection.
For this projection we find
        [1 0 0 0]
        [0 1 0 0]
        [0 0 0 0]
        [0 0 0 0]
To find A_{o} we first take a new basis B_{1}.
B_{1} = ( (1,0,2) , (0,1,0) , (2,0,-1) ).
The new basis is chosen such that the projection of these new basis vectors is very simple.
Indeed, we see that :
The projection of (1,0,2) is (1,0,2)
The projection of (0,1,0) is (0,1,0)
The projection of (2,0,-1) is (0,0,0)
The matrix A_{1} of the projection has as columns the coordinates of the projected
basis vectors (1,0,2), (0,1,0) and (0,0,0) relative to the new basis B_{1}.
This matrix is
                [1 0 0]
        A_{1} = [0 1 0]
                [0 0 0]

The connection between A_{1} and A_{o} is A_{1} = C^{-1}.A_{o}.C with

            [1 0  2]
        C = [0 1  0]
            [2 0 -1]

Now

        A_{o} = C.A_{1}.C^{-1}

                      [ 1  0  2 ]
        A_{o} = (1/5) [ 0  5  0 ]
                      [ 2  0  4 ]

With this matrix we can calculate the projection of any vector in an easy way.
For example, the projection of the vector (6,-1,2) is (2,-1,4):

              [ 1  0  2 ] [ 6]   [ 2]
        (1/5) [ 0  5  0 ] [-1] = [-1]
              [ 2  0  4 ] [ 2]   [ 4]
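The computation of A_{o} = C.A_{1}.C^{-1}, and the fact that a projection applied twice changes nothing, can be verified with numpy:

```python
import numpy as np

C = np.array([[1, 0,  2],
              [0, 1,  0],
              [2, 0, -1]])      # columns: the new basis B_1
A1 = np.diag([1.0, 1.0, 0.0])   # matrix of the projection relative to B_1

A0 = C @ A1 @ np.linalg.inv(C)
assert np.allclose(A0, np.array([[1, 0, 2],
                                 [0, 5, 0],
                                 [2, 0, 4]]) / 5)
# A projection applied twice gives the same result
assert np.allclose(A0 @ A0, A0)
# The example vector (6,-1,2) projects to (2,-1,4)
assert np.allclose(A0 @ np.array([6, -1, 2]), [2, -1, 4])
```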
        h : V --> V : v --> r.v

We say that h is a similarity transformation of V with factor r.
It is easy to show that this transformation is linear.
Important special values of r are 0, 1 and -1.
Now we define the transformation
        s : V --> V : v --> m - n

We say that s is the reflection of V in M relative to N.
It is easy to show that this transformation is linear.
This definition is a generalization of the ordinary reflection in a plane. Indeed, if you take the ordinary vectors in a plane and if M and N are one dimensional supplementary subspaces, then you'll see that with the previous definition, s becomes the ordinary reflection in M relative to the direction given by N.
Take V = R^{4}.

        M = span{(0,1,3,1), (1,0,-1,0)}
        N = span{(0,0,0,1), (3,2,1,0)}

It is easy to show that M and N have only the vector 0 in common. (This is left
as an exercise.) So, M and N are supplementary subspaces.
Now we'll calculate the image of the reflection of vector v = (4,3,3,1) in M relative to N.
First we write v, in exactly one way, as the sum of a vector m of M and a vector n of N.
        (4,3,3,1) = x.(0,1,3,1) + y.(1,0,-1,0) + z.(0,0,0,1) + t.(3,2,1,0)

The solution of this system gives x = 1; y = 1; z = 0; t = 1. The unique
representation of v is

        (4,3,3,1) = (1,1,2,1) + (3,2,1,0)

The image of the reflection of vector v = (4,3,3,1) in M relative to N is the
vector

        v' = (1,1,2,1) - (3,2,1,0) = (-2,-1,1,1)

We choose in R^{4} the natural basis. We want to find the matrix A_{o} of the
reflection relative to this natural basis. If this matrix is known, we can find
the reflection of any vector in a simple way.
Finding the matrix A_{o} is not obvious. For this purpose we rely on the knowledge about changing the basis of the vector space and its consequences on the matrix of the reflection.
We start from a natural basis. Let A_{o} be the required matrix of our reflection.
Now we change the basis to a new basis. As new basis vectors we choose the
generators of M and N. The new basis is
( (0,1,3,1) ; (1,0,-1,0) ; (0,0,0,1) ; (3,2,1,0) )
Relative to this new basis all vectors have new coordinates. From the theory of
vector spaces we know that these new coordinates are connected to the old
coordinates by a matrix C. The columns of C are the coordinates of the new basis
vectors relative to the original (old) basis. In our example
            [ 0  1  0  3 ]
        C = [ 1  0  0  2 ]
            [ 3 -1  0  1 ]
            [ 1  0  1  0 ]

Relative to this new basis the linear transformation has a simple matrix. The
columns of this matrix are the new coordinates of the images of the reflection
of the new basis vectors.
        Basis vector    image of reflection    new coordinates of that image
        ------------    -------------------    -----------------------------
        (0,1,3,1)        (0,1,3,1)             (1, 0,  0,  0)
        (1,0,-1,0)       (1,0,-1,0)            (0, 1,  0,  0)
        (0,0,0,1)       -(0,0,0,1)             (0, 0, -1,  0)
        (3,2,1,0)       -(3,2,1,0)             (0, 0,  0, -1)

The matrix of the reflection relative to the new basis is

                [1 0  0  0]
        A_{n} = [0 1  0  0]
                [0 0 -1  0]
                [0 0  0 -1]

Between the old and the new matrix of the reflection we have the connection

        A_{n} = C^{-1}.A_{o}.C  <=>  A_{o} = C.A_{n}.C^{-1}

With this we calculate the matrix A_{o} of the reflection with respect to the
natural basis. We find:

                [4 -9 3  0]
        A_{o} = [2 -5 2  0]
                [1 -3 2  0]
                [2 -4 2 -1]

Once this result is found, it is very easy to find the reflection of any vector.
As an example, we retake the vector v = (4,3,3,1) from above. The image is :
        [4 -9 3  0][4]   [-2]
        [2 -5 2  0][3] = [-1]
        [1 -3 2  0][3]   [ 1]
        [2 -4 2 -1][1]   [ 1]

So, in a very simple way we find the same result as above.
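The whole computation A_{o} = C.A_{n}.C^{-1} can be replayed in numpy; the sketch also checks that reflecting twice gives the identity:

```python
import numpy as np

C = np.array([[0,  1, 0, 3],
              [1,  0, 0, 2],
              [3, -1, 0, 1],
              [1,  0, 1, 0]])        # columns: generators of M and N
An = np.diag([1.0, 1.0, -1.0, -1.0])

Ao = C @ An @ np.linalg.inv(C)
assert np.allclose(Ao, [[4, -9, 3,  0],
                        [2, -5, 2,  0],
                        [1, -3, 2,  0],
                        [2, -4, 2, -1]])
# Reflecting twice is the identity
assert np.allclose(Ao @ Ao, np.eye(4))
# The example vector from above
assert np.allclose(Ao @ np.array([4, 3, 3, 1]), [-2, -1, 1, 1])
```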
u is called an eigenvector or characteristic vector relative to t if and only if
u is not 0 and there is a real number r such that

        t(u) = r.u

The real number r is called the eigenvalue of u.
An eigenvector relative to t is a vector different from 0 such that
t transforms this vector into a multiple of itself.
The vector u is not zero, but the eigenvalue r can be zero.
This means that all the vectors of the null-space, different from 0,
are eigenvectors with eigenvalue 0.
Example 1:
Take an origin O in the plane and orthonormal basis vectors e_{1} and e_{2}.
Take the linear transformation t such that t projects each vector orthogonally onto the x-axis.
The vectors, different from 0, with image point on the y-axis are transformed into 0.
These vectors are eigenvectors with eigenvalue 0.
Each vector, different from 0, with image point on the x-axis is transformed into itself.
It is an eigenvector with eigenvalue 1.
Example 2:
Take an origin O in the plane and orthonormal basis vectors e_{1} and e_{2}.
Take the linear transformation t such that t is a rotation about O with angle pi/11.
There is no vector, different from 0, such that t transforms this vector into a multiple of itself.
This rotation has no eigenvectors.
Exercise :
Take an origin O in the plane and orthonormal basis vectors e_{1} and e_{2}.
Take the orthogonal reflection in the line y = x as linear transformation.
Find the eigenvectors and the eigenvalues.
        [a b c]
        [d e f]
        [g h i]

We denote co(u) = (x,y,z). Now, u(x,y,z) is a characteristic vector of t with
eigenvalue r

<=>     t(u) = r.u and u not 0

<=>     [a b c] [x]     [x]        [x]     [0]
        [d e f].[y] = r.[y]   with [y] not [0]
        [g h i] [z]     [z]        [z]     [0]

<=>     [a b c] [x]     [1 0 0] [x]   [0]        [x]     [0]
        [d e f].[y] - r.[0 1 0].[y] = [0]   with [y] not [0]
        [g h i] [z]     [0 0 1] [z]   [0]        [z]     [0]

<=>     [a b c] [x]   [r 0 0] [x]   [0]        [x]     [0]
        [d e f].[y] - [0 r 0].[y] = [0]   with [y] not [0]
        [g h i] [z]   [0 0 r] [z]   [0]        [z]     [0]

<=>     the homogeneous system in x,y,z

        [a-r  b    c  ] [x]   [0]
        [d    e-r  f  ].[y] = [0]
        [g    h    i-r] [z]   [0]

        has a solution different from (0,0,0)

<=>     |a-r  b    c  |
        |d    e-r  f  | = 0
        |g    h    i-r|

The last equation is called the characteristic equation of t relative to the
fixed basis in V. This equation is usually written as determinant(A - r I) = 0
or |A - r I| = 0. Here A is the matrix of the linear transformation and I is the
suitable unit matrix.
If r is a solution of this equation, then the system
        [a b c] [x]     [x]
        [d e f].[y] = r.[y]
        [g h i] [z]     [z]

has a solution (x,y,z) different from (0,0,0). With this solution corresponds a
characteristic vector u(x,y,z) with eigenvalue r of t.
Example :
Find the eigenvalues and characteristic vectors of the matrix
        [ 2  1  0 ]
        [-1  0  4 ]
        [ 0  2  1 ]
        | 2-r  1   0  |
        | -1  -r   4  | = 0
        | 0    2  1-r |

The solutions are 3, sqrt(5) and -sqrt(5).
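The three solutions can be cross-checked with numpy's eigenvalue routine:

```python
import numpy as np

A = np.array([[ 2, 1, 0],
              [-1, 0, 4],
              [ 0, 2, 1]])

eigenvalues = np.sort(np.linalg.eigvals(A).real)
expected = np.sort([3.0, np.sqrt(5), -np.sqrt(5)])
assert np.allclose(eigenvalues, expected)
```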
Characteristic vectors corresponding with eigenvalue 3
        [ 2  1  0 ][x]     [x]
        [-1  0  4 ][y] = 3 [y]
        [ 0  2  1 ][z]     [z]

This homogeneous system has one dependent equation, which may be deleted. We
find as characteristic vectors all the multiples of (1,1,1) different from
(0,0,0).
Characteristic vectors corresponding with eigenvalue sqrt(5).
        [ 2  1  0 ][x]           [x]
        [-1  0  4 ][y] = sqrt(5) [y]
        [ 0  2  1 ][z]           [z]

This homogeneous system has one dependent equation, which may be deleted. We
find as characteristic vectors all the multiples of (1, sqrt(5)-2, (3-sqrt(5))/2)
different from (0,0,0).
Finally, the characteristic vectors with eigenvalue -sqrt(5).
This is left as an exercise.
This way of working can be extended to vector spaces with dimension n.
Take two characteristic vectors u and v with the same eigenvalue k. Then

        t(r.u + s.v) = r.t(u) + s.t(v) = r.k.u + s.k.v = k.(r.u + s.v)

So, for all real r and s, (r.u + s.v) is a characteristic vector with eigenvalue
k, whenever it is not 0.
Proof:
Let v = characteristic vector with eigenvalue k.
Let w = characteristic vector with eigenvalue l.
Suppose v and w are linearly dependent. Since w is not 0, there is a scalar r,
not 0, such that

        w = r.v
        => t(w) = t(r.v)
        => l.w = r.t(v)
        => l.w = r.k.v
        => l.r.v = r.k.v
        => k = l

This gives a contradiction with the fact that the two characteristic vectors
correspond with different eigenvalues.
This theorem can be extended for a vector space with dimension n.
Take a vector space with dimension n and a linear transformation t.
If n characteristic vectors correspond with n different eigenvalues,
then these characteristic vectors are linearly independent.
        t(v) = k.v = k.v + 0.w
        t(w) = l.w = 0.v + l.w

The matrix of t relative to the basis (v, w) is

        [ k 0 ]
        [ 0 l ]

We say that the matrix of t is diagonal.
This can be extended for a vector space with dimension n.
Take a vector space V with dimension n and a linear transformation t.
If n characteristic vectors correspond with n different eigenvalues,
then these characteristic vectors are linearly independent.
The characteristic vectors can be used as a basis of V.
The matrix of t relative to that basis is diagonal and the eigenvalues
are the diagonal elements of the matrix.
        D = C^{-1}.A.C

Here, C is the transformation matrix. The columns of C are the coordinates of
the new basis vectors (the characteristic vectors) relative to the original
basis in V.
        [4 -1]
        [2  1]

Calculating the eigenvalues, we find 3 and 2. As corresponding characteristic
vectors we can take (1,1) and (1,2), so the matrix C is

        [1 1]
        [1 2]

Then we have

        [3 0]   [1 1]^{-1} [4 -1] [1 1]
        [0 2] = [1 2]      [2  1] [1 2]

The matrix of the linear transformation is diagonalized.
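The diagonalization can be checked in a few lines of numpy:

```python
import numpy as np

A = np.array([[4, -1],
              [2,  1]])
C = np.array([[1, 1],
              [1, 2]])   # columns: eigenvectors for the eigenvalues 3 and 2

D = np.linalg.inv(C) @ A @ C
assert np.allclose(D, [[3, 0], [0, 2]])
```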
If a real n x n matrix A is similar to a diagonal matrix D, then the eigenvectors of A form a set that generates the vector space V = R^{n}.
Proof:
Say t is the linear transformation with matrix A. Since A and D are similar matrices there is a suitable basis
B of V such that D becomes the matrix of the linear transformation t.
The coordinates X of an arbitrary vector v are then transformed into the coordinates X' of t(v).
The transformation formula is X' = D X with D the diagonal matrix diag(d_{1}, d_{2}, ..., d_{n}).
Each basis vector of B is then converted to a multiple of itself and therefore it is an eigenvector.
But if the basis vectors of B are eigenvectors, then the set of eigenvectors will generate the vector space V= R^{n}.
If the eigenvectors of A form a set that generates the vector space V = R^{n}, then A is similar to a diagonal matrix D.
Proof:
Let t be the linear transformation with matrix A.
Since the eigenvectors of A form a set that generates the vector space V, we can choose a basis from that set.
Since all the basis vectors are eigenvectors, the image of each basis vector is a multiple of itself.
The i-th column of the matrix of t relative to the chosen basis contains the coordinates of the image of the i-th basis vector, and that image is a multiple of the basis vector itself. So, the columns form a diagonal matrix D, and D is similar to A.
Conclusion from theorem 1 and 2:
A real n x n matrix A is similar to a diagonal matrix D if and only if the eigenvectors of A form a set that generates the vector space V = R^{n}.
Take

        A = [0 0 0 0]
            [0 0 0 0]
            [1 0 1 0]
            [0 1 0 1]

The eigenvalues are 0, 0, 1, 1. We can choose the corresponding eigenvectors as
simple as possible:

        [ 1]  [ 0]  [0]  [0]
        [ 0]  [ 1]  [0]  [0]
        [-1]  [ 0]  [1]  [0]
        [ 0]  [-1]  [0]  [1]

They generate V = R^{4}.
With these eigenvectors as columns we form the matrix P =

        [ 1  0 0 0]
        [ 0  1 0 0]
        [-1  0 1 0]
        [ 0 -1 0 1]

Then P^{-1}.A.P =

        [0 0 0 0]
        [0 0 0 0]
        [0 0 1 0]
        [0 0 0 1]

It is a diagonal matrix with the eigenvalues on the diagonal. The matrix A is
diagonalized.
Take

        A = [2  0 -1  0]
            [2  1  0 -2]
            [0  0  1  0]
            [2 -1 -1  0]

The eigenvalues are 1, -1, 2, 2. We can choose the corresponding eigenvectors as
simple as possible:

        [1]  [0]  [1]  [1]
        [0]  [1]  [2]  [0]
        [1]  [0]  [0]  [0]
        [1]  [1]  [0]  [1]

They generate V = R^{4}.
With these eigenvectors as columns we form the matrix P =

        [1 0 1 1]
        [0 1 2 0]
        [1 0 0 0]
        [1 1 0 1]

Then P^{-1}.A.P = D =

        [1  0 0 0]
        [0 -1 0 0]
        [0  0 2 0]
        [0  0 0 2]

It is a diagonal matrix with the eigenvalues on the diagonal. The matrix A is
diagonalized.
        ( diag(a, b, ..., l) )^{n} = diag(a^{n}, b^{n}, ..., l^{n})
        D = P^{-1}.A.P  <=>  A = P.D.P^{-1}

Then

        A^{n} = P.D.P^{-1} . P.D.P^{-1} . P.D.P^{-1} . ... . P.D.P^{-1}
        A^{n} = P.D^{n}.P^{-1}

Since it is easy to calculate D^{n}, A^{n} can be calculated.
Example : Take A =
        [2  0 -1  0]
        [2  1  0 -2]
        [0  0  1  0]
        [2 -1 -1  0]

In a previous example we have diagonalized this matrix. We found
        D = [1  0 0 0]
            [0 -1 0 0]
            [0  0 2 0]
            [0  0 0 2]

Here, P =
        [1 0 1 1]
        [0 1 2 0]
        [1 0 0 0]
        [1 1 0 1]

Say we want A^{4}. D^{4} is easy to calculate.
        [1 0  0  0]
        [0 1  0  0]
        [0 0 16  0]
        [0 0  0 16]

        A^{4} = P.D^{4}.P^{-1} =
        [ 16   0 -15   0 ]
        [ 10  11   0 -10 ]
        [  0   0   1   0 ]
        [ 10  -5 -15   6 ]

Many applications of this method can be found in theoretical areas.
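The claim A^{4} = P.D^{4}.P^{-1} can be verified directly in numpy:

```python
import numpy as np

A = np.array([[2,  0, -1,  0],
              [2,  1,  0, -2],
              [0,  0,  1,  0],
              [2, -1, -1,  0]])
P = np.array([[1, 0, 1, 1],
              [0, 1, 2, 0],
              [1, 0, 0, 0],
              [1, 1, 0, 1]])
D = np.diag([1.0, -1.0, 2.0, 2.0])

A4 = P @ np.linalg.matrix_power(D, 4) @ np.linalg.inv(P)
assert np.allclose(A4, np.linalg.matrix_power(A, 4))
assert np.allclose(A4, [[16,  0, -15,   0],
                        [10, 11,   0, -10],
                        [ 0,  0,   1,   0],
                        [10, -5, -15,   6]])
```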
The recursive formula is u_{n} = u_{n-1} + u_{n-2}
Starting from this, we'll find the formula for the n-th term of the fibonacci sequence.
First we write u_{n-1} + u_{n-2} = u_{n} in matrix notation.
        [ 1 1 ] [ u_{n-1} ]   [ u_{n}   ]
        [ 1 0 ] [ u_{n-2} ] = [ u_{n-1} ]

Let

        M = [ 1 1 ]          F = [1]
            [ 1 0 ]    and       [1]

Then

        [u_{3}]
        [u_{2}] = M.F

        [u_{4}]     [u_{3}]
        [u_{3}] = M.[u_{2}] = M^{2}.F

        ...

        [u_{n}  ]
        [u_{n-1}] = M^{n-2}.F        (1)

Now we'll calculate M^{n-2}. To use the method from above, we need the
eigenvalues and characteristic vectors connected with the matrix M.
The characteristic equation is r^{2} - r - 1 = 0 .
The eigenvalues are (1 + sqrt(5))/2 and (1 - sqrt(5))/2.
We call these eigenvalues respectively k and l.
Note that k - l = sqrt(5), k + l = 1 and k.l = -1.
You'll find that (k,1) is a characteristic vector corresponding with k
and (l,1) is a characteristic vector corresponding with l.
If we choose these characteristic vectors as a new basis, then we have the connection between M and the diagonal matrix.
        [k 0]   [k l]^{-1}     [k l]
        [0 l] = [1 1]    . M . [1 1]

This is equivalent with

        M = [k l] [k 0] [k l]^{-1}
            [1 1] [0 l] [1 1]

From this we can calculate M^{n-2} :

        M^{n-2} = [k l] [k^{n-2}    0   ] [k l]^{-1}
                  [1 1] [0      l^{n-2}] [1 1]

Now

        [k l]^{-1}   [ 1/(k-l)  -l/(k-l)]   [ 1/sqrt(5)  -l/sqrt(5)]
        [1 1]      = [-1/(k-l)   k/(k-l)] = [-1/sqrt(5)   k/sqrt(5)]

Then we have

        M^{n-2} = [k l] [k^{n-2}    0   ] [ 1/sqrt(5)  -l/sqrt(5)]
                  [1 1] [0      l^{n-2}] [-1/sqrt(5)   k/sqrt(5)]

Writing only the first row of this product, we have

        (1/sqrt(5)) . [ k^{n-1} - l^{n-1}    -l.k^{n-1} + l^{n-1}.k ]

Now from (1) above we can write

        u_{n} = (1/sqrt(5)) . ( k^{n-1} - l^{n-1} - l.k^{n-1} + l^{n-1}.k )
<=>
        u_{n} = (1/sqrt(5)) . ( k^{n-1}.(1 - l) - l^{n-1}.(1 - k) )
<=>
        u_{n} = (1/sqrt(5)) . ( k^{n} - l^{n} )

since 1 - l = k and 1 - k = l. Here k = (1 + sqrt(5))/2 and l = (1 - sqrt(5))/2.

This is the formula for the n-th term of the fibonacci sequence.
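The closed formula can be compared with the recursion itself; a short Python sketch:

```python
import math

def fib_binet(n):
    # u_n = (k^n - l^n) / sqrt(5), with k, l the roots of r^2 - r - 1 = 0
    k = (1 + math.sqrt(5)) / 2
    l = (1 - math.sqrt(5)) / 2
    return round((k ** n - l ** n) / math.sqrt(5))

# The recursion u_n = u_{n-1} + u_{n-2} with u_1 = u_2 = 1
u = [1, 1]
for _ in range(18):
    u.append(u[-1] + u[-2])

assert [fib_binet(n) for n in range(1, 21)] == u
```

The rounding absorbs the floating-point error; for moderate n the two agree exactly.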
We know that the eigenvalues r are the solutions of the characteristic equation |A - r.I| = 0.
The polynomial |A - r.I| is called the characteristic polynomial of A.
It is a polynomial in r.
Every square matrix A satisfies its own characteristic equation |A - r.I| = 0.
Example:
Let

        A = [ 2  1  0 ]
            [-1  0  4 ]
            [ 0  2  1 ]

The characteristic equation is

        | 2-r  1   0  |
        | -1  -r   4  | = 0   <=>   - r^{3} + 3 r^{2} + 5 r - 15 = 0
        | 0    2  1-r |

The theorem claims:

        - A^{3} + 3 A^{2} + 5 A - 15 I = 0
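The claim can be verified numerically; note that the constant term becomes 15 times the unit matrix I:

```python
import numpy as np

A = np.array([[ 2, 1, 0],
              [-1, 0, 4],
              [ 0, 2, 1]])
I = np.eye(3)

# Substitute A into the characteristic polynomial -r^3 + 3r^2 + 5r - 15
result = (-np.linalg.matrix_power(A, 3)
          + 3 * np.linalg.matrix_power(A, 2)
          + 5 * A
          - 15 * I)
assert np.allclose(result, np.zeros((3, 3)))
```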