Linear transformations

A linear transformation

In a real vector space V we define
 
        A transformation t of V is linear

                <=>

        For all vectors u , v and all real numbers r
        t(u+v) = t(u)+t(v) and t(r.u) = r.t(u)
The set of all linear transformations of V is L(V).
Examples :
 
t : R x R  -> R x R  : (x,y)  -> (x+y,x)
t : R x R  -> R x R  : (x,y)  -> (0,y)
t : R -> R     : x -> 6x

R[x] is the set of all polynomials in x with real coefficients.
t : R[x] -> R[x] : p(x) --> p'(x)  where p'(x) is the derivative of p(x)
Exercise: check for each example that both conditions are met. Find an example of your own and verify your answer.
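A quick numerical sanity check of both conditions is also possible. The sketch below (Python with numpy; the function name t and the random test vectors are our own illustration) tests the first example t(x,y) = (x+y, x):

import numpy as np

def t(v):                       # the first example: t(x, y) = (x + y, x)
    x, y = v
    return np.array([x + y, x])

rng = np.random.default_rng(0)  # arbitrary test vectors
u, v = rng.normal(size=2), rng.normal(size=2)
r = rng.normal()
print(np.allclose(t(u + v), t(u) + t(v)))    # True : t(u+v) = t(u)+t(v)
print(np.allclose(t(r * u), r * t(u)))       # True : t(r.u) = r.t(u)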

Counterexample:

 
t : R x R  -> R x R  : (x,y)  -> (x+4,y)
Why is this not a linear transformation?

The identity transformation I

The identity transformation I transforms each vector into itself.
 
I : V -> V : v -> v

Image of the vector 0.

Let t be a linear transformation of V, then
 
        t(0) = t(0.v) = 0.t(v) = 0     (for any vector v of V)
Hence, the image of the vector 0 is 0.

Criterion for the linearity of a transformation of V

Theorem :
Take a transformation t of V.
 
                t is in L(V)
                   <=>
        For all vectors u, v and all real numbers r, s
        t(r.u + s.v) = r.t(u) + s.t(v)
Proof :
Part 1 : If t is in L(V) then
 
        t(r.u + s.v) = t(r.u) + t(s.v) =  r.t(u) + s.t(v)
Part 2 : If t(r.u + s.v) = r.t(u) + s.t(v) for all r, s then
 
        take r = s = 1   t(u+v) = t(u)+t(v)
        take s = 0     t(r.u) = r.t(u)
Q.E.D.

Building linear transformations

We show this for dimension(V) = 3, but all can easily be generalized.
Theorem :
If (e1, e2, e3) is an ordered basis of V, and if (u1, u2, u3) is an arbitrarily chosen ordered set of three vectors from V,
then there is exactly one linear transformation t of V such that
t(e1) = u1
t(e2) = u2
t(e3) = u3

Proof :
An arbitrary vector v in V can be written as v = k.e1+l.e2+m.e3.
An arbitrary vector w in V can be written as w = k'.e1+l'.e2+m'.e3.
Then v + w = (k+k')e1 + (l+l')e2 + (m+m')e3.

In order to prove the existence of such a transformation, we start with a well-chosen transformation t of V.
We'll show afterwards that the required conditions for linearity are met.

So, first we define a transformation t by specifying the image of each vector.

 
        If v = k.e1 + l.e2 + m.e3 then we define
        t(v) = k.u1 + l.u2 + m.u3
First condition: t is linear because :
 
      t(v + w)  = t( (k + k')e1  +  (l + l')e2  +  (m + m')e3 )
                                our definition of t
                = (k + k')u1  +  (l + l')u2  +  (m + m')u3
                                property in V
                =  k.u1 + l.u2 + m.u3  +  k'.u1 + l'.u2 + m'.u3
                                our definition of t
                =  t(v)  +  t(w)

        t(r.v)   = t(rk.e1 + rl.e2 + rm.e3)
                   our definition of t
                = rk.u1 + rl.u2 + rm.u3
                       property in V
                = r(k.u1 + l.u2 + m.u3)
                    our definition of t
                = rt(v)
Second condition:
 
        t(e1) = t(1.e1 + 0.e2 + 0.e3) = 1.u1 + 0.u2 + 0.u3 = u1
        t(e2) = t(0.e1 + 1.e2 + 0.e3) = 0.u1 + 1.u2 + 0.u3 = u2
        t(e3) = t(0.e1 + 0.e2 + 1.e3) = 0.u1 + 0.u2 + 1.u3 = u3
The existence is proved.
Now, we show that such a t is unique.

Suppose t and t' are two linear transformations such that
t(e1) = u1 and t'(e1) = u1
t(e2) = u2 and t'(e2) = u2
t(e3) = u3 and t'(e3) = u3
Then, for each v in V

 
        t(v) = t(k.e1+l.e2+m.e3)
                       our definition of t
             = k.u1 + l.u2 + m.u3
and
        t'(v)= t'(k.e1 + l.e2 + m.e3)
                          t' is linear
             = t'(k.e1) + t'(l.e2) + t'(m.e3)
                          t' is linear
             = k.t'(e1) + l.t'(e2) + m.t'(e3)

             = k.u1 + l.u2 + m.u3
So t = t'
Conclusion:

A linear transformation of a real vector space V is completely and unambiguously determined by the images of the basis vectors of V.

Example 1:
There is exactly one linear transformation t of the vector space R x R such that
t(1,0) = (3,2)
t(0,1) = (5,4)
We calculate the image of the vector (-1,5) = -1(1,0) + 5(0,1) .
t(-1,5) = t( -1(1,0) + 5(0,1) ) = -1(3,2) + 5(5,4) = (22,18)

Example 2:
Take the real vector space C of the complex numbers.
In that vector space we choose an ordered basis (1 , i).
A linear transformation of C is completely and unambiguously determined by the images of 1 and i.
We define a linear transformation t by
t(1) = 1 + i
t(i) = 1 - i
Now we can calculate the image of each vector. For example, we calculate the image of 3 - 2i.
t(3 - 2i) = t( 3 . 1 + (-2) . i ) = 3 (1+i) + (-2)(1-i) = 1 + 5i

Matrices and linear transformations

Matrix of a linear transformation relative to a basis in V.

We show this for dim(V) = 3, but all can easily be generalized.

If (e1, e2, e3) is an ordered basis of V, and if (u1, u2, u3) is an arbitrarily chosen ordered set of three vectors from V, we know that there is exactly one linear transformation t of V such that
t(e1) = u1
t(e2) = u2
t(e3) = u3

These vectors u1, u2, u3 can be expressed in e1, e2, e3 .

 
        u1 = a.e1 + b.e2 + c.e3
        u2 = d.e1 + e.e2 + f.e3
        u3 = g.e1 + h.e2 + i.e3
An arbitrary vector v = k.e1+l.e2+m.e3 is transformed by t into t(v) = k.u1+l.u2+m.u3
So,
 
        t(v) = k.u1 + l.u2 + m.u3
<=>
        t(v) =  k.(a.e1 + b.e2 + c.e3) +
                l.(d.e1 + e.e2 + f.e3) +
                m.(g.e1 + h.e2 + i.e3)
<=>
        t(v) =  (k.a + l.d + m.g).e1 +
                (k.b + l.e + m.h).e2 +
                (k.c + l.f + m.i).e3
The coordinates of v relative to the basis (e1, e2, e3) are (k,l,m).
Let (k', l', m') be the coordinates of t(v) relative to the basis (e1, e2, e3).
Then,
 
        k' = k.a + l.d + m.g
        l' = k.b + l.e + m.h
        m' = k.c + l.f + m.i

<=>

        [k']    [a  d  g]   [k]
        [l'] =  [b  e  h] . [l]
        [m']    [c  f  i]   [m]
The last matrix formula is the transformation formula associated with t relative to the basis (e1, e2, e3). With this matrix we transform the coordinates of v into the coordinates of t(v). The matrix
 
         [a  d  g]
         [b  e  h]
         [c  f  i]
is called the matrix of the linear transformation relative to the basis (e1, e2, e3). The columns of this matrix are the coordinates of (u1, u2, u3).

Example 1:
In R^2 we choose ( (1,0) , (0,1) ) as basis.
There is exactly one linear transformation t of R^2 such that
t(1,0) = (3,2)
t(0,1) = (5,4)
The matrix of the linear transformation relative to the chosen basis is

 
        [3  5]
        [2  4]
Take v with coordinates (3, 1). The coordinates of t(v) are (14,10) because :
 
        [3  5][3]    [14]
        [2  4][1]  = [10]
Example 2:
Take the vector space of the polynomials in x of second degree or less with real coefficients. We choose ( 1, x, x^2 ) as basis.

We build a linear transformation t by choosing x , x + 1 and 3x - 2 as the images of the three basis vectors.

The matrix of the linear transformation is

 
         [0  1 -2]
         [1  1  3]
         [0  0  0]
The columns of this matrix are the coordinates of x , x + 1 and 3x - 2.

The polynomial 2x^2 - x + 4 has coordinates (4,-1,2).
The image of the polynomial 2x^2 - x + 4 by the linear transformation t has coordinates

 
         [0  1 -2] [4]    [-5]
         [1  1  3] [-1] = [ 9]
         [0  0  0] [2]    [ 0]
The image is the polynomial 9x - 5.
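This computation is easy to reproduce with a short numpy sketch (the matrix and coordinates are exactly those of the example above):

import numpy as np

# matrix of t relative to the basis (1, x, x^2); its columns are the
# coordinates of the images x, x + 1 and 3x - 2
A = np.array([[0, 1, -2],
              [1, 1,  3],
              [0, 0,  0]])
p = np.array([4, -1, 2])    # coordinates of 2x^2 - x + 4
print(A @ p)                # [-5  9  0], the coordinates of 9x - 5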

Rotation of all the vectors of the plane

Take an origin O in the plane and orthonormal basis vectors e1 and e2.
Draw the unit vectors and the unit circle.
The rotation about O, by an angle t, is a linear transformation.
The image of e1(1,0) is the vector u1(cos(t), sin(t)).
The image of e2(0,1) is the vector u2(-sin(t), cos(t)).
The matrix of the rotation is
 
   [cos(t)   -sin(t)]
   [sin(t)    cos(t)]
Example:

The rotation about O by an angle of 60 degrees has as matrix

 
   [0.5    -0.866]
   [0.866    0.5 ]
The coordinates of the image w of the vector v(4,7) are
 
   [0.5    -0.866][4]
   [0.866    0.5 ][7]
w = w(-4.062 ; 6.964)

Check this result with a figure.
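Instead of (or in addition to) a figure, the result can be checked with a few lines of numpy:

import numpy as np

t = np.radians(60)                          # rotation angle of 60 degrees
R = np.array([[np.cos(t), -np.sin(t)],
              [np.sin(t),  np.cos(t)]])
print(R @ np.array([4, 7]))                 # approximately [-4.062  6.964]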

Reflection of all the vectors of the plane in the line y = m x

Take an origin O in the plane and orthonormal basis vectors e1 and e2.
Draw the unit vectors and the unit circle.
We choose a fixed line s through O enclosing an angle t/2 with the x-axis. The slope of the line s is m = tan(t/2).
The orthogonal reflection of all the vectors of the plane in the line s is a linear transformation.
The image of e1(1,0) is the vector u1(cos(t), sin(t)).
The image of e2(0,1) is the vector u2(sin(t), -cos(t)).
The matrix of the orthogonal reflection is
 
   [cos(t)    sin(t)]
   [sin(t)   -cos(t)]

Now
              2 tan(t/2)           2m
   sin(t) = ---------------- = ---------
             1 + tan^2(t/2)     1 + m^2

             1 - tan^2(t/2)     1 - m^2
   cos(t) = ---------------- = ---------
             1 + tan^2(t/2)     1 + m^2

The matrix becomes

   [ (1-m^2)/(1+m^2)       2m/(1+m^2)     ]
   [   2m/(1+m^2)       -(1-m^2)/(1+m^2)  ]

Example:

We consider the orthogonal reflection in the line y = 0.5 x. Then m = 0.5
The matrix of the orthogonal reflection is

 
 [0.6   0.8]
 [0.8  -0.6]
The coordinates of the image w of the vector v(4,7) are
 
 [0.6   0.8][4]
 [0.8  -0.6][7]

w = w(8,-1)

Check this result with a figure.
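Again a numerical check is possible; this sketch builds the reflection matrix directly from the slope m:

import numpy as np

m = 0.5                                      # slope of the mirror line y = m x
S = np.array([[1 - m**2,  2*m       ],
              [2*m,      -(1 - m**2)]]) / (1 + m**2)
print(S @ np.array([4, 7]))                  # [ 8. -1.]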

Reflection in the x-axis
In this case t = 0. The matrix of the orthogonal reflection is

 
   [ 1    0]
   [ 0   -1]
Reflection in the y-axis
In this case t = pi. The matrix of the orthogonal reflection is
 
   [ -1    0]
   [  0    1]
Reflection in the line y = x
In this case t = pi/2. The matrix of the orthogonal reflection is
 
   [0    1]
   [1    0]

Orthogonal projection of all the vectors of the plane on the line y = m x

Take an origin O in the plane and orthonormal basis vectors e1 and e2.
Draw the unit vectors and the unit circle.
We choose a fixed line s through O enclosing an angle t with the x-axis. The slope of the line s is m = tan(t).
The orthogonal projection of all the vectors of the plane on the line s is a linear transformation.
The image of e1(1,0) is the vector u1(cos^2(t), sin(t)cos(t)).
The image of e2(0,1) is the vector u2(sin(t)cos(t), sin^2(t)).
The matrix of the orthogonal projection is
 
   [cos^2(t)        sin(t)cos(t)]
   [sin(t)cos(t)      sin^2(t)  ]

Now
                  1
   cos^2(t) = ---------
               1 + m^2

                 m^2
   sin^2(t) = ---------
               1 + m^2

                                        m
   sin(t)cos(t) = (1/2) sin(2t) = ---------
                                   1 + m^2

The matrix of the orthogonal projection is

   [  1/(1+m^2)       m/(1+m^2)  ]
   [  m/(1+m^2)      m^2/(1+m^2) ]

Example:

We'll find the matrix of the orthogonal projection on the line y = 2x. So, m = 2.
The matrix of the projection is

 
  [ 0.2    0.4]
  [ 0.4    0.8]
The coordinates of the image w of the vector v(4,7) are
 
  [ 0.2    0.4][4]
  [ 0.4    0.8][7]
w = w(3.6 ; 7.2)

Check this result with a figure.
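A numerical check, building the projection matrix from the slope m:

import numpy as np

m = 2                                        # slope of the line y = m x
P = np.array([[1, m   ],
              [m, m**2]]) / (1 + m**2)
print(P @ np.array([4, 7]))                  # [3.6 7.2]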

Exercise: Find the matrices of the orthogonal projection on the x-axis, on the y-axis, on the line y = -x.

Kernel or Null-space of a linear transformation

The kernel or null-space of a linear transformation t is the set of all vectors v such that t(v) = 0. Notation: ker(t).

The null-space is a subspace of V

Since t(0) = 0, the null-space is not empty. If u and v are in the null-space, then
 
        t(r.u + s.v) = r.t(u) + s.t(v)
                     = r.0 + s.0
                     = 0
        So, r.u + s.v is in ker(t).
Hence the kernel is a subspace of V.

Example 1:
Relative to a basis the matrix of a linear transformation is

 
         [0  1 -2]
         [1  1  3]
         [0  0  0]
The vector v with coordinates (x,y,z) belongs to the null-space if and only if
 
         [0  1 -2] [x]    [ 0]
         [1  1  3] [y]  = [ 0]
         [0  0  0] [z]    [ 0]
<=>
          y - 2z = 0
      x + y + 3z = 0
<=>
          y =  2z
          x = -5z
For each z-value there is a vector of ker(t).
The coordinates of the vectors of the null-space are
{ (-5r , 2r, r) | r in R } = { r(-5,2,1) | r in R }.
The kernel contains all the multiples of v(-5,2,1). It is a subspace of V with dimension 1.
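The null-space can also be computed symbolically; a sketch using the sympy library (assuming it is available) reproduces the result:

from sympy import Matrix

A = Matrix([[0, 1, -2],
            [1, 1,  3],
            [0, 0,  0]])
print(A.nullspace())    # [Matrix([[-5], [2], [1]])] : the kernel is span{(-5, 2, 1)}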

Example 2:

R[x] is the set of all polynomials in x with real coefficients.
t : R[x] -> R[x] : p(x) --> p'(x)  where p'(x) is the derivative of p(x)

The null-space is the set of all polynomials p(x) such that p'(x) = 0, i.e. the constant polynomials. So ker(t) = R.

Example 3:

Take the orthogonal projection of all vectors of the plane on the line y = m x. This transformation is a linear transformation.
The null-space is the set of all vectors v such that orthogonal projection(v) = 0. The image points of these vectors are the points of the line y = (-1/m) x. It is a subspace of V with dimension 1.

Nullity

The dimension of the null-space is called the nullity of the linear transformation t.

Fixed points

Let t be a linear transformation of V.
 
  The vector v is a fixed point of t

          if and only if

          t(v) = v
It is easy to show that the set of all fixed points of t is a subspace of V. (exercise)

Note that fixed points are vectors.

Example:

Let t = the orthogonal projection of all vectors of the plane on the line y = m x. All vectors with image point on the line y = m x are fixed points of t.

Matrix and the change of a basis

We show this for dim(V) = 3, but all can easily be generalized.
Let (e1, e2, e3) be an ordered basis of V and let t be a linear transformation of V with matrix Ao.
 
             [a  d  g]
        Ao = [b  e  h]
             [c  f  i]
The linear transformation t transforms a random vector
v = k.e1+l.e2+m.e3 in t(v) = k'.e1+l'.e2+m'.e3 and then
 
        [k']   [a  d  g] [k]
        [l'] = [b  e  h].[l]
        [m']   [c  f  i] [m]

          [k']            [k]
Let Ko' = [l']   and Ko = [l]
          [m']            [m]
Ko is the matrix of the coordinates of v and Ko' is the matrix of the coordinates of t(v) relative to the basis (e1, e2, e3) and
 
    Ko' = Ao.Ko            (*)
Now we take a new basis in V. Then, all the vectors of V get new coordinates. From the theory of vector spaces, we know that these new coordinates are linked to the old ones with a transformation matrix C.
The columns of the transformation matrix C are the coordinates of the new basis vectors relative to the original (old) basis vectors.

Denote the new coordinates of a vector, in matrix form, as Kn. Index o stands for old, index n stands for new.

Then Ko = C.Kn and Ko' = C.Kn'
We write (*) with the new coordinates.

 
        C.Kn' = Ao.C.Kn

<=>       Kn' = C^-1.Ao.C.Kn      (**)
The last formula gives the connection between the coordinates of v and t(v) relative to the new basis.

Denote An as the matrix of t relative to the new basis. Then we have

 
        Kn' = An.Kn      (***)
From (**) and (***) we see that
 
        An = C^-1.Ao.C
The last formula allows us to calculate the new matrix An of t from the old matrix Ao.

Example

In a 2-dimensional space with basis (e1, e2), a linear transformation t has matrix

 
        [3  1]
        [-1 1]
Now we take a new basis
 
        e1' = e1 + e2
        e2' = e1 - e2

Then the transformation matrix C is

        [1  1]
        [1 -1]

and from this C^-1 is

        [1/2   1/2]
        [1/2  -1/2]

The matrix of the linear transformation t relative to the new
basis is

        [1/2   1/2] [3  1] [1  1]
        [1/2  -1/2] [-1 1] [1 -1]


 =      [2   0]
        [2   2]
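The computation C^-1.Ao.C is easy to check with numpy:

import numpy as np

Ao = np.array([[ 3, 1],
               [-1, 1]])
C = np.array([[1,  1],
              [1, -1]])               # columns: coordinates of e1', e2'
print(np.linalg.inv(C) @ Ao @ C)      # [[2. 0.] [2. 2.]]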


Application

In a plane we start with two orthogonal axes and two unit vectors (e1, e2) as basis. We take a rotation with an angle of pi/4 radians. We know that this is a linear transformation with matrix A

 
A = [1/sqrt(2)   -1/sqrt(2)]
    [1/sqrt(2)    1/sqrt(2)]


  = 1/sqrt(2) . [1 -1]
                [1  1]
Suppose that for a given application it is useful to move to oblique (non-perpendicular) axes with new basis vectors (e1', e2') such that
e1' = e1 and e2' = e1 + e2 .

If we want to use the rotation in the new coordinate system, we need to transform the matrix A for use in this new coordinate system. This new matrix is equal to C^-1.A.C

The columns of C are the coordinates of the new basis-vectors with respect to the original (old) basis vectors .

 
   C= [1  1]
      [0  1]
After calculation the new matrix An of the rotation is
 
  An =  1/sqrt(2) . [0   -2]
                    [1    2]

Similar matrices

Two n x n matrices A and B are similar if and only if there is a nonsingular n x n matrix C such that
 
        B  = C^-1. A .C
As a corollary of the previous formula we see that two matrices of the same linear transformation, relative to different bases, are similar.

Property of similar matrices.

Say A and B are similar matrices. Then
 
        B  = C^-1. A .C

=>     |B| = |C^-1|.|A|.|C|

=>     |B| = |C^-1|.|C|.|A|

=>     |B| = |A|        (since |C^-1|.|C| = |C^-1.C| = |I| = 1)

So, similar matrices have the same determinant.

Sum of linear transformations

The sum of two linear transformations t and t' is defined by
 
        t+t' : V -> V : v -> t(v) + t'(v)
It can easily be proved that t+t' is a linear transformation and that the matrix of t+t' is equal to the sum of the matrices of t and of t'.

Scalar multiplication of a linear transformation with a real number

The scalar multiplication of a linear transformation t with a real number r is defined by
 
        r.t : V -> V : v -> r.t(v)
It can easily be proved that r.t is a linear transformation and that the matrix of r.t is equal to r.(matrix of t).

Composition of two linear transformations

The composition t' o t of two linear transformations t and t' is defined by
 
        t' o t : V -> V : v -> t'(t(v))
It can easily be proved that t' o t is a linear transformation and that the matrix of t' o t is equal to
(matrix of t').(matrix of t) .

Example
t1 and t2 are linear transformations with matrices A and B respectively.

 
A= [1  3]   B= [-1  1]
   [0  3]      [1   4]
The linear transformation t2 o t1 has the matrix
 
B.A = [ -1, 0  ]
      [ 1,  15 ]
The linear transformation t1 o t2 has the matrix
 
A.B = [ 2, 13 ]
      [ 3, 12 ]
The composition of linear transformations is not commutative.
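Both products can be computed directly, confirming that the two compositions differ:

import numpy as np

A = np.array([[1, 3],
              [0, 3]])
B = np.array([[-1, 1],
              [ 1, 4]])
print(B @ A)    # matrix of t2 o t1 : [[-1  0] [ 1 15]]
print(A @ B)    # matrix of t1 o t2 : [[ 2 13] [ 3 12]]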

Power of a linear transformation

Let t be a linear transformation with matrix A relative to a basis in V.
The n-th power of t is defined by
 
 for n = 2 :
 t^2 : V -> V : v -> t(t(v))
 for n > 2 :
 t^n : V -> V : v -> t(t^(n-1)(v))

Example:
t^4(v) = t(t(t(t(v))))
From this it follows that t^n has matrix A^n.

Polynomials in t

Let t be the linear transformation with matrix A relative to a basis in V.
I is the identity transformation and we use the same symbol I for the identity matrix.
Since the addition, the scalar multiplication and the power of t is defined, we can write
 
    3 t^3 - 4 t^2 + t + I is a linear transformation with matrix  3 A^3 - 4 A^2 + A + I

     (t - I).(2 t^2 + 5I) is a linear transformation with matrix  (A - I).(2 A^2 + 5I)

     (t - I).(2 t^2 + 5I) = 2 t^3 - 2 t^2 + 5 t - 5 I
Hence:
Let f(x) be a polynomial in x and let t be a linear transformation with matrix A; then
 
   f(t) = 0    <=>     f(A) = 0
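As a sketch of this correspondence, here is the matrix of 3 t^3 - 4 t^2 + t + I for a concrete A (the matrix below is our own arbitrary illustration, not taken from the text):

import numpy as np

A = np.array([[2, 1],
              [0, 3]])        # an arbitrary illustration matrix
I = np.eye(2)
A2 = A @ A
A3 = A2 @ A
# matrix of the linear transformation 3 t^3 - 4 t^2 + t + I
print(3*A3 - 4*A2 + A + I)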

Projection, reflection and similarity in a vector space V

Projection in a vector space

Choose two supplementary subspaces M and N relative to the space V. Each vector v of V can be written in exactly one way as the sum of an element m of M and an element n of N.

Then v = m + n .

Now we can define the transformation

 
p: V --> V : v --> m

We call this transformation

        the projection of V on M relative to N
It is easy to show that this transformation is linear.

Projection, example

V is the space of all polynomials with a degree not greater than 3.
We define two supplementary subspaces
M = span { 1, x }
N = span { x^2, x^3 }
Each vector of V is the sum of exactly one vector of M and one of N.
e.g. 2x^3 - x^2 + 4x - 7 = (2x^3 - x^2) + (4x - 7)

Say p is the projection of V on M relative to N, then

 
    p(2x^3 - x^2 + 4x - 7 ) = 4x - 7
Say q is the projection of V on N relative to M, then
 
    q(2x^3 - x^2 + 4x - 7 ) = 2x^3 - x^2
We choose the basis (1, x, x^2, x^3) in V. Now we can create the matrix of the projection.
The columns of the matrix are the coordinates of the images of the basis vectors.

For this projection we find

 
 [ 1 0 0 0 ]
 [ 0 1 0 0 ]
 [ 0 0 0 0 ]
 [ 0 0 0 0 ]

Projection example 2

V = R^3 with the natural basis Bo.
M is the subspace span( (1,0,2) ; (0,1,0) ).
We calculate the matrix Ao of the projection p of V on M relative to N = span( (2,0,-1) ).

To find Ao we first take a new basis B1.
B1 = ( (1,0,2) , (0,1,0) , (2,0,-1) ).
The new basis is chosen such that the projection of these new basis vectors is very simple.

Indeed, we see that :
The projection of (1,0,2) is (1,0,2)
The projection of (0,1,0) is (0,1,0)
The projection of (2,0,-1) is (0,0,0)
The matrix A1 of the projection has as columns the coordinates of the projected basis vectors (1,0,2) ; (0,1,0) and (0,0,0) relative to the new basis B1.

This matrix is

 
       [1 0 0]
 A1 =  [0 1 0]
       [0 0 0]
The connection between A1 and Ao is A1 = C^-1 Ao C
Here C is the transformation matrix from basis Bo to B1. The columns of C are the coordinates of the new basis vectors relative to the original basis Bo.
 
     [1  0  2]
 C = [0  1  0]
     [2  0 -1]

Now,  Ao = C A1 C^-1

             [ 1, 0, 2 ]
 Ao = (1/5)  [ 0, 5, 0 ]
             [ 2, 0, 4 ]
With this matrix we can calculate the projection of any vector in an easy way.
Example: we calculate the projection of (6,-1,2).
 
          [ 1, 0, 2 ] [ 6]     [ 2  ]
    (1/5) [ 0, 5, 0 ] [-1] =   [ -1 ]
          [ 2, 0, 4 ] [ 2]     [ 4  ]

Projection example 3

Orthogonal projection of a vector on a plane of the ordinary three dimensional space and matrix of the projection.

Similarity transformation of a vector space

Let r be any fixed real number.
In a vector space V we define the transformation
 
   h : V --> V : v --> r.v
We say that h is a similarity transformation of V with factor r.

It is easy to show that this transformation is linear.

Important special values of r are 0, 1 and -1.

Reflection in a vector space

Choose two supplementary subspaces M and N relative to the space V.
Each vector v of V is the sum of exactly one vector m of M and n of N.

Now we define the transformation

 
    s : V --> V : v --> m - n
We say that s is the reflection of V in M relative to N.

It is easy to show that this transformation is linear.

This definition is a generalization of the ordinary reflection in a plane. Indeed, if you take the ordinary vectors in a plane and if M and N are one dimensional supplementary subspaces, then you'll see that with the previous definition, s becomes the ordinary reflection in M relative to the direction given by N.

Example of a reflection and matrix of the reflection

 
Take V = R^4.

M = span{(0,1,3,1);(1,0,-1,0)}

N = span{(0,0,0,1);(3,2,1,0)}
It is easy to show that M and N have only the vector 0 in common. (This is left as an exercise.) So, M and N are supplementary subspaces.

Now we'll calculate the image of the reflection of vector v = (4,3,3,1) in M relative to N.

First we write v as the sum of exactly one vector m of M and n of N.

 
 (4,3,3,1) = x.(0,1,3,1) + y.(1,0,-1,0) + z.(0,0,0,1) + t.(3,2,1,0)
The solution of this system gives x = 1; y = 1; z = 0; t = 1. The unique representation of v is
 
(4,3,3,1) = (1,1,2,1) + (3,2,1,0)
The image of the reflection of vector v = (4,3,3,1) in M relative to N is vector v' =
 
   (1,1,2,1) - (3,2,1,0) = (-2,-1,1,1)
We choose in R^4 the natural basis. We want to find the matrix Ao of the reflection relative to this natural basis. If this matrix is known, we can find the reflection of any vector in a simple way.

Finding the matrix Ao is not obvious. For this purpose we rely on the knowledge about changing the basis of the vector space and its consequences on the matrix of the reflection.

We start from a natural basis. Let Ao be the required matrix of our reflection.

Now we change the basis to a new basis. As new basis vectors we choose the generators of M and N. The new basis is
( (0,1,3,1) ; (1,0,-1,0) ; (0,0,0,1) ; (3,2,1,0) )

Relative to this new basis all vectors have new coordinates. From the theory of vector spaces we know that these new coordinates are connected to the old coordinates by a matrix C. The columns of C are the coordinates of the new basis vectors relative to the original (old) basis. In our example

 
        [ 0, 1,  0, 3 ]
  C =   [ 1, 0,  0, 2 ]
        [ 3, -1, 0, 1 ]
        [ 1, 0,  1, 0 ]
Relative to this new basis the linear transformation has a simple matrix. The columns of this matrix are the new coordinates of the images under the reflection of the new basis vectors.
 
   Basis vector    image of reflection   new coordinates of that image
   ----------      ------------------   --------------------------------
    (0,1,3,1)          (0,1,3,1)               (1, 0, 0, 0)
    (1,0,-1,0)         (1,0,-1,0)              (0, 1, 0, 0)
    (0,0,0,1)         -(0,0,0,1)               (0, 0,-1, 0)
    (3,2,1,0)         -(3,2,1,0)               (0, 0, 0,-1)

The matrix of the reflection relative to the new basis is

        [1 0  0  0]
  An =  [0 1  0  0]
        [0 0 -1  0]
        [0 0  0 -1]

Between the old and the new matrix of the reflection we have the connection

     An = C^-1 Ao C
<=>
     Ao = C An C^-1

With this we calculate the  matrix Ao of the reflection with respect
to the natural basis. We find:

       [4 -9  3  0]
 Ao =  [2 -5  2  0]
       [1 -3  2  0]
       [2 -4  2 -1]
Once this result is found, it is very easy to find the reflection of any vector.

As an example, we retake the vector v = (4,3,3,1) from above. The image is :

 
      [4 -9  3  0][4]    [-2]
      [2 -5  2  0][3] =  [-1]
      [1 -3  2  0][3]    [1 ]
      [2 -4  2 -1][1]    [1 ]
So, in a very simple way we find the same result as above.
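The whole computation, from C and An to the image of v, takes only a few lines of numpy:

import numpy as np

C = np.array([[0,  1, 0, 3],
              [1,  0, 0, 2],
              [3, -1, 0, 1],
              [1,  0, 1, 0]])             # columns: the new basis vectors
An = np.diag([1, 1, -1, -1])              # matrix of s relative to the new basis
Ao = C @ An @ np.linalg.inv(C)
print(np.round(Ao))                       # the matrix Ao found above
print(Ao @ np.array([4, 3, 3, 1]))        # [-2. -1.  1.  1.]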

Example 2 of a reflection and matrix of the reflection

Orthogonal reflection of a vector in a plane of the ordinary three dimensional space and matrix of this reflection.

Eigenvectors or characteristic vectors and eigenvalues

Definition

Say t is a linear transformation of a vector space V.
 
   u is called an eigenvector or characteristic vector relative to t

                      if and only if

                     u is not 0
       and there is a real number r such that t(u) = r.u

The real number r is called the eigenvalue of u.

An eigenvector relative to t is a vector different from 0 such that t transforms this vector into a multiple of itself.
The vector u is not zero, but the eigenvalue r can be zero. This means that all the vectors of the null-space, different from 0, are eigenvectors with eigenvalue 0.

Example 1: Take an origin O in the plane and orthonormal basis vectors e1 and e2.
Take the linear transformation t that projects each vector orthogonally onto the x-axis. The vectors, different from 0, with image point on the y-axis are transformed into 0. These vectors are eigenvectors with eigenvalue 0. Each vector, different from 0, with image point on the x-axis is transformed into itself. It is an eigenvector with eigenvalue 1.

Example 2: Take an origin O in the plane and orthonormal basis vectors e1 and e2.
Take the linear transformation t such that t is a rotation about O through an angle pi/11. There is no vector, different from 0, that t transforms into a multiple of itself.
This rotation has no eigenvectors.

Exercise : Take an origin O in the plane and orthonormal basis vectors e1 and e2.
Take the orthogonal reflection in the line y = x as linear transformation. Find the eigenvectors and the eigenvalues.

Eigenvalues and eigenvectors in a space with dimension 3.

Say t is a linear transformation of a vector space V with dimension 3.
We fix a basis in V. Relative to that basis, t has a unique matrix.
 
          [a  b  c]
          [d  e  f]
          [g  h  i]

We denote the coordinates of u by co(u) = (x,y,z).

Now,     u(x,y,z) is a characteristic vector of t with eigenvalue r

                          <=>

                 t(u) = r.u  and  u not 0

                          <=>

                [a  b  c] [x]     [x]       [x]     [0]
                [d  e  f].[y] = r.[y]  with [y] not [0]
                [g  h  i] [z]     [z]       [z]     [0]


                          <=>

               [a  b  c] [x]     [1  0  0] [x]   [0]         [x]     [0]
               [d  e  f].[y] - r.[0  1  0].[y] = [0]    with [y] not [0]
               [g  h  i] [z]     [0  0  1] [z]   [0]         [z]     [0]


                         <=>

               [a  b  c] [x]     [r  0  0] [x]   [0]         [x]     [0]
               [d  e  f].[y]  -  [0  r  0].[y] = [0]    with [y] not [0]
               [g  h  i] [z]     [0  0  r] [z]   [0]         [z]     [0]


                        <=>
              The homogeneous system in x,y,z

               [a-r  b   c ] [x]    [0]
               [d   e-r  f ].[y]  = [0]
               [g    h  i-r] [z]    [0]

              has a solution different from (0,0,0).

                        <=>

                   |a-r  b   c |
                   |d   e-r  f | = 0
                   |g    h  i-r|

The last equation is called the characteristic equation of t relative to the fixed basis in V. This equation is usually written as det(A - r.I) = 0 or |A - r.I| = 0. Here A is the matrix of the linear transformation and I is the identity matrix of the same size.

If r is a solution of this equation, then the system

 
                [a  b  c] [x]     [x]
                [d  e  f].[y] = r.[y]
                [g  h  i] [z]     [z]

has a solution (x,y,z) different from (0,0,0). With this solution corresponds a characteristic vector u(x,y,z) with eigenvalue r of t.

Example :

Find the eigenvalues and characteristic vectors of the matrix

 
 [ 2   1  0 ]
 [ -1  0  4 ]
 [ 0   2  1 ]

The eigenvalues are the solutions of
 
 | 2-r   1      0 |
 | -1    -r     4 | = 0
 | 0     2    1-r |
The solutions are 3, sqrt(5) and -sqrt(5).

Characteristic vectors corresponding with eigenvalue 3

 
 [ 2   1  0 ][x]     [x]
 [ -1  0  4 ][y] = 3 [y]
 [ 0   2  1 ][z]     [z]
This homogeneous system contains one redundant equation, which may be dropped. We find as characteristic vectors all the multiples of (1,1,1) different from (0,0,0).

Characteristic vectors corresponding with eigenvalue sqrt(5).

 
 [ 2   1  0 ][x]           [x]
 [ -1  0  4 ][y] = sqrt(5) [y]
 [ 0   2  1 ][z]           [z]
This homogeneous system contains one redundant equation, which may be dropped. We find as characteristic vectors all the multiples of (1, sqrt(5)-2, (3-sqrt(5))/2 ) different from (0,0,0).

Finally, the characteristic vectors with eigenvalue -sqrt(5).
This is left as an exercise.

This method can be extended to vector spaces with dimension n.
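In practice, eigenvalues and characteristic vectors are usually computed numerically; a numpy sketch for the example above:

import numpy as np

A = np.array([[ 2, 1, 0],
              [-1, 0, 4],
              [ 0, 2, 1]])
vals, vecs = np.linalg.eig(A)     # eigenvalues and eigenvectors (as columns)
print(vals)                       # 3, sqrt(5), -sqrt(5) (order may vary)
i = np.argmin(abs(vals - 3))      # locate the eigenvalue 3
print(vecs[:, i])                 # a scalar multiple of (1, 1, 1)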

Eigenspace associated with an eigenvalue.

Theorem:
The set of all characteristic vectors associated with an eigenvalue k, together with 0, forms a subspace of V.
Proof:
Say u and v are characteristic vectors with eigenvalue k, then t(u) = k.u and t(v) = k.v.
Hence, for all real r and s we have
 
t(r.u + s.v) = r.t(u) + s.t(v) = r.k.u + s.k.v = k.(r.u + s.v)
So, for all real r and s, (r.u + s.v) is a characteristic vector with eigenvalue k.

Linearly independent characteristic vectors.

Theorem:
Take a vector space with dimension 2 and a linear transformation t.
If two characteristic vectors correspond with different eigenvalues, then these characteristic vectors are linearly independent.

Proof:
Let v = characteristic vector with eigenvalue k.
Let w = characteristic vector with eigenvalue l.
Suppose v and w are linearly dependent; then there is a scalar r such that

 
        w = r v
=>      t(w) = t(r v)
=>      l w = r.t(v)
=>      l w = r.k v
=>      l r v = r.k v
=>        k = l        (r is not 0 since w is not 0, and v is not 0)
This gives a contradiction with the fact that the two characteristic vectors correspond with different eigenvalues.

This theorem can be extended to a vector space with dimension n. Take a vector space with dimension n and a linear transformation t.
If n characteristic vectors correspond with n different eigenvalues, then these characteristic vectors are linearly independent.

Characteristic vectors as a basis of a vector space

From the previous theorem we know:
Take a vector space V with dimension n and a linear transformation t.
If n characteristic vectors correspond with n different eigenvalues, then these characteristic vectors are linearly independent. The characteristic vectors can be used as a basis of V.

Diagonal matrices and characteristic vectors.

Take a vector space V with dimension 2 and a linear transformation t.
If two characteristic vectors v and w correspond with different eigenvalues k and l, then these characteristic vectors are linearly independent and they constitute a basis for V. The images of v and w are
 
        t(v) = k.v = k.v + 0.w
        t(w) = l.w = 0.v + l.w

The matrix of t relative to the basis ( v , w ) is

        [ k   0 ]
        [ 0   l ]
We say that the matrix of t is diagonal.

This can be extended to a vector space with dimension n.
Take a vector space V with dimension n and a linear transformation t.
If n characteristic vectors correspond with n different eigenvalues, then these characteristic vectors are linearly independent. The characteristic vectors can be used as a basis of V. The matrix of t relative to that basis is diagonal and the eigenvalues are the diagonal elements of the matrix.

Corollary

Say a linear transformation t has a matrix A relative to a basis and t has n characteristic vectors corresponding with n different eigenvalues.
If we take these characteristic vectors as a new basis, the new matrix of t is a diagonal matrix D. We know that there is a formula that connects A and D.
 
        D = C^-1 . A . C
Here, C is the transformation matrix. The columns of C are the coordinates of the new basis-vectors (characteristic vectors) relative to the original basis in V.

Example.

In a vector space with dimension 2 and with basis (e1, e2) a linear transformation t has matrix A =
 
        [4 -1]
        [2  1]
Calculating the eigenvalues we find 3 and 2.
As corresponding characteristic vectors we choose v1(1,1) and v2(1,2).
Take these characteristic vectors as a new basis.
The transformation matrix is
 
        [1  1]
        [1  2]
Then we have
 
        [3  0]    [1  1]-1 [4  -1] [1  1]
        [0  2]  = [1  2]   [2   1] [1  2]
The matrix of the linear transformation is diagonalized.

Criterion for a diagonalizable matrix

Theorem 1

If a real n x n matrix A is similar to a diagonal matrix D,
then the eigenvectors of A form a set that generates the vector space V = R^n.

Proof:
Say t is the linear transformation with matrix A. Since A and D are similar matrices there is a suitable basis B of V such that D becomes the matrix of the linear transformation t.

The coordinates X of an arbitrary vector v are then transformed into the coordinates X' of t(v).
The transformation formula is X' = D X with D the diagonal matrix diag(d1, d2, ..., dn).

Each basis vector of B is then converted to a multiple of itself and therefore it is an eigenvector.

But if the basis vectors of B are eigenvectors, then the set of eigenvectors generates the vector space V = R^n.

Theorem 2

If the eigenvectors of A form a set that generates the vector space V = R^n,
then A is similar to a diagonal matrix D.

Proof:
Let t be the linear transformation with matrix A. Since the eigenvectors of A form a set that generates the vector space V, we can choose a basis from that set. Since all the basis vectors are eigenvectors, the image of each basis vector is a multiple of itself.

The columns of the matrix of t relative to the chosen basis are multiples of the basis vectors. So, they form a diagonal matrix D. Then D is similar to A.

Conclusion from theorem 1 and 2:

Criterion

A real n x n matrix A is similar to a diagonal matrix D if and only if the eigenvectors of A form a set that generates the vector space V = R^n.

Procedure

To diagonalize an n x n matrix A, we calculate n independent eigenvectors.
We choose these vectors as columns of a matrix P.
The relationship between A and D is P^-1.A.P = D.
On the diagonal of the diagonal matrix D are the eigenvalues corresponding to those eigenvectors.

A simple example

 
Take A =

   [0 0 0 0]
   [0 0 0 0]
   [1 0 1 0]
   [0 1 0 1]
The eigenvalues are 0, 0, 1, 1. We can choose the corresponding eigenvectors as simply as possible:
 
  [1]    [0]   [0]  [0]
  [0]    [1]   [0]  [0]
  [-1]   [0]   [1]  [0]
  [0]    [-1]  [0]  [1]
They generate V = R^4.
We put the eigenvectors as columns in the matrix P. Then P =
 
  [1  0 0 0]
  [0  1 0 0]
  [-1 0 1 0]
  [0 -1 0 1]

Then P^-1.A.P =
 
  [0 0 0 0]
  [0 0 0 0]
  [0 0 1 0]
  [0 0 0 1]

It is a diagonal matrix with the eigenvalues on the diagonal. The matrix A is diagonalized.

Second example

 
Take A =

   [2  0 -1  0]
   [2  1  0 -2]
   [0  0  1  0]
   [2 -1 -1  0]
The eigenvalues are 1, -1, 2, 2.
We can choose the corresponding eigenvectors as simply as possible:
 
  [1]    [0]   [1]  [1]
  [0]    [1]   [2]  [0]
  [1]    [0]   [0]  [0]
  [1]    [1]   [0]  [1]
They generate V = R^4.
We put the eigenvectors as columns in the matrix P. Then P =
 
  [1  0  1  1]
  [0  1  2  0]
  [1  0  0  0]
  [1  1  0  1]

Then P^-1.A.P = D =
 
  [1  0  0  0]
  [0 -1  0  0]
  [0  0  2  0]
  [0  0  0  2]

It is a diagonal matrix with the eigenvalues on the diagonal. The matrix A is diagonalized.
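Both diagonalization examples can be verified mechanically; here is a numpy check of the second one:

import numpy as np

A = np.array([[2,  0, -1,  0],
              [2,  1,  0, -2],
              [0,  0,  1,  0],
              [2, -1, -1,  0]])
P = np.array([[1, 0, 1, 1],
              [0, 1, 2, 0],
              [1, 0, 0, 0],
              [1, 1, 0, 1]])      # columns: the chosen eigenvectors
print(np.linalg.inv(P) @ A @ P)   # diag(1, -1, 2, 2), up to rounding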

Power of a diagonal matrix

Using mathematical induction, it is easy to prove that

diag(a, b, ..., l)^n = diag(a^n, b^n, ..., l^n)

Power of a matrix

Suppose that it is possible to transform a matrix A to a diagonal matrix D. Then there is a matrix P such that
 
    D = P^-1.A.P  <=>  A = P.D.P^-1
Then

    A^n = P.D.P^-1 . P.D.P^-1 . P.D.P^-1 . ... . P.D.P^-1    (n factors)

    A^n = P.D^n.P^-1
Since it is easy to calculate D^n, A^n can be calculated.

Example : Take A =

 
   [2  0 -1  0]
   [2  1  0 -2]
   [0  0  1  0]
   [2 -1 -1  0]
In a previous example we have diagonalized this matrix. We found
P^-1.A.P = D =
 
  [1  0  0  0]
  [0 -1  0  0]
  [0  0  2  0]
  [0  0  0  2]

Here, P =
 
  [1  0  1  1]
  [0  1  2  0]
  [1  0  0  0]
  [1  1  0  1]

Say we want A^4. D^4 is easy to calculate.
D^4 =
 
  [1  0  0  0]
  [0  1  0  0]
  [0  0 16  0]
  [0  0  0 16]

A^4 = P.D^4.P^-1 =
 
 [ 16, 0,  -15,  0  ]
 [ 10, 11,  0,  -10 ]
 [ 0,  0,   1,   0  ]
 [ 10, -5, -15,  6  ]
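A numpy check of this power computation:

import numpy as np

A = np.array([[2,  0, -1,  0],
              [2,  1,  0, -2],
              [0,  0,  1,  0],
              [2, -1, -1,  0]])
P = np.array([[1, 0, 1, 1],
              [0, 1, 2, 0],
              [1, 0, 0, 0],
              [1, 1, 0, 1]])
D4 = np.diag([1, 1, 16, 16])
print(P @ D4 @ np.linalg.inv(P))         # A^4 via the diagonalization
print(np.linalg.matrix_power(A, 4))      # direct check: the same matrix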


Many applications of this method can be found in theoretical areas.
As an illustration of the power of this result, we'll find the formula for the n-th term of the Fibonacci sequence starting from the recursive formula.

Fibonacci

The Fibonacci sequence is 1, 1, 2, 3, 5, 8, ...

The recursive formula is u_n = u_(n-1) + u_(n-2).

Starting from this, we'll find the formula for the n-th term of the Fibonacci sequence.

First we write u_(n-1) + u_(n-2) = u_n in matrix notation.

    [ 1   1 ]   [ u_(n-1) ]     [  u_n    ]
    [ 1   0 ] . [ u_(n-2) ]  =  [ u_(n-1) ]


Let M =
        [ 1   1 ]
        [ 1   0 ]
and
Let F =
        [1]
        [1]



Then
     [u_3]
     [   ]   =  M.F
     [u_2]


Then [u_4]      [u_3]
     [   ]  = M.[   ]    =   M^2.F
     [u_3]      [u_2]

...

Then [u_n    ]
     [       ]   =   M^(n-2).F                (1)
     [u_(n-1)]

Now we'll calculate M^(n-2). To use the method from above, we need the eigenvalues and characteristic vectors of the matrix M.

The characteristic equation is r^2 - r - 1 = 0.

The eigenvalues are (1 + sqrt(5))/2 and (1 - sqrt(5))/2.
We call these eigenvalues respectively k and l.

Note that k - l = sqrt(5), k + l = 1 and k.l = -1.

You'll find that (k,1) is a characteristic vector corresponding with k
and (l,1) is a characteristic vector corresponding with l.

If we choose these characteristic vectors as a new basis, then we have the connection between M and the diagonal matrix.

 
    [k  0]     [k  l]-1      [k  l]
    [0  l]  =  [1  1]   . M. [1  1]

This is equivalent with

    M =

    [k  l]  [k  0]  [k  l]-1
    [1  1]  [0  l]  [1  1]

From this we can calculate M^(n-2) :

    M^(n-2) =

    [k  l]  [k^(n-2)     0    ]  [k  l]-1
    [1  1]  [   0     l^(n-2) ]  [1  1]


Now
    [k  l]-1
    [1  1]          =

    [ 1/(k-l)    -l/(k-l)]
    [-1/(k-l)     k/(k-l)]    =

    [ 1/sqrt(5)  -l/sqrt(5)]
    [-1/sqrt(5)   k/sqrt(5)]

Then, we have

   M^(n-2) =

    [k  l]  [k^(n-2)     0    ]  [ 1/sqrt(5)  -l/sqrt(5)]
    [1  1]  [   0     l^(n-2) ]  [-1/sqrt(5)   k/sqrt(5)]

Writing only the first row of this product we have

    = (1/sqrt(5)) . [k^(n-1) - l^(n-1)     -l.k^(n-1) + l^(n-1).k]

Now from (1) above we can write

  u_n =  (1/sqrt(5)) . (k^(n-1) - l^(n-1) - l.k^(n-1) + l^(n-1).k)

<=>

  u_n =  (1/sqrt(5)) . (k^(n-1).(1-l) - l^(n-1).(1-k))

<=>      (since k + l = 1, we have 1 - l = k and 1 - k = l)

  u_n =  (1/sqrt(5)) . (k^n - l^n)


 with k = (1 + sqrt(5))/2  and l = (1 - sqrt(5))/2.

This is the formula for the n-th term of the Fibonacci sequence.
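A short script (our own sketch, with u_1 = u_2 = 1) compares the matrix form (1) with the closed formula:

import numpy as np

M = np.array([[1, 1],
              [1, 0]])
F = np.array([1, 1])                                   # (u_2, u_1)
k = (1 + np.sqrt(5)) / 2
l = (1 - np.sqrt(5)) / 2
for n in range(3, 11):
    u_n = (np.linalg.matrix_power(M, n - 2) @ F)[0]    # recursion via (1)
    binet = (k**n - l**n) / np.sqrt(5)                 # closed formula
    print(n, u_n, round(binet))                        # the two columns agree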

Cayley - Hamilton

Characteristic polynomial and characteristic equation

Let A be a matrix corresponding with the linear transformation t.

We know that the eigenvalues r are the solutions of the characteristic equation |A - r.I| = 0.
The polynomial |A - r.I| is called the characteristic polynomial of A. It is a polynomial in r.

Cayley-Hamilton theorem

Every square matrix A satisfies its own characteristic equation |A - r.I| = 0, in the sense that substituting A for r in the characteristic polynomial (and reading each constant term c as c.I) gives the zero matrix.

The proof of this theorem is beyond the framework of this presentation.

Example:

 
        [ 2   1  0 ]
Let A = [ -1  0  4 ]
        [ 0   2  1 ]

The characteristic equation is

     | 2-r   1      0 |
     | -1    -r     4 | = 0   <=>   - r^3 + 3 r^2 + 5 r - 15 = 0
     | 0     2    1-r |

The theorem claims :    - A^3 + 3 A^2 + 5 A - 15 I = 0
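The claim is easy to verify numerically (note that the constant term -15 becomes -15 I):

import numpy as np

A = np.array([[ 2, 1, 0],
              [-1, 0, 4],
              [ 0, 2, 1]])
A2 = A @ A
A3 = A2 @ A
print(-A3 + 3*A2 + 5*A - 15*np.eye(3))    # the zero matrix, as the theorem claims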




Solved Problems

 
You can find solved problems about linear transformations using this link
 




