Linear Algebra: a Geometric Approach
 9781351435291, 1351435299

Table of contents:
Cover
Half Title
Title Page
Copyright Page
Table of Contents
Preface
Notes for the reader
I: Affine Geometry
1: Vectors and vector spaces
2: Matrices
3: Systems of linear equations
4: Some linear algebra
5: Rank
6: Determinants
7: Affine space (I)
8: Affine space (II)
9: Geometry in affine planes
10: Geometry in 3-dimensional affine space
11: Linear maps
12: Linear maps and matrices; affine changes of coordinates
13: Linear operators
14: Transformation groups
II: Euclidean Geometry
15: Bilinear and quadratic forms
16: Diagonalizing quadratic forms
17: Scalar product
18: Vector product
19: Euclidean spaces
20: Unitary operators and isometries
21: Isometries of planes and three-dimensional space
22: Diagonalizing symmetric operators
23: The complex case
Appendices
A: Domains, rings and fields
B: Permutations
Selected solutions
Bibliography
Index of notation
Index

Linear Algebra A geometric approach

E. Sernesi
Professor of Geometry, La Sapienza, Rome, Italy

Translated by J. Montaldi
Mathematics Institute, Warwick University, UK

CRC Press is an imprint of the Taylor & Francis Group, an informa business

Library of Congress Cataloging-in-Publication Data
Catalog record is available from the Library of Congress

CRC Press
6000 Broken Sound Parkway, NW, Suite 300, Boca Raton, FL 33487
270 Madison Avenue, New York, NY 10016
2 Park Square, Milton Park, Abingdon, Oxon OX14 4RN, UK

This book contains information obtained from authentic and highly regarded sources. Reprinted material is quoted with permission, and sources are indicated. A wide variety of references are listed. Reasonable efforts have been made to publish reliable data and information, but the author and the publisher cannot assume responsibility for the validity of all materials or for the consequences of their use.

Apart from any fair dealing for the purpose of research or private study, or criticism or review, as permitted under the UK Copyright Designs and Patents Act, 1988, this publication may not be reproduced, stored or transmitted, in any form or by any means, electronic or mechanical, including photocopying, microfilming, and recording, or by any information storage or retrieval system, without the prior permission in writing of the publishers, or in the case of reprographic reproduction only in accordance with the terms of the licenses issued by the Copyright Licensing Agency in the UK, or in accordance with the terms of the license issued by the appropriate Reproduction Rights Organization outside the UK.

The consent of CRC Press LLC does not extend to copying for general distribution, for promotion, for creating new works, or for resale. Specific permission must be obtained in writing from CRC Press LLC for such copying. Direct all inquiries to CRC Press LLC, 2000 N.W. Corporate Blvd., Boca Raton, Florida 33431.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation, without intent to infringe.

Visit the CRC Press Web site at www.crcpress.com

© 1993 by Chapman & Hall/CRC
English language edition 1993
Originally published by Chapman & Hall
Reprinted 2009 by CRC Press
No claim to original U.S. Government works
International Standard Book Number 0-412-40680-2
2 3 4 5 6 7 8 9 0
Printed on acid-free paper


Preface

This book is written primarily for Mathematics students, and it covers the topics usually contained in a first course on linear algebra and geometry. Adopting a geometric approach, the text develops linear algebra alongside affine and Euclidean geometry, in such a way as to emphasize their close relationships and the geometric motivation. In order to reconcile as much as possible the need for full rigour with the equally important need not to weigh the treatment down with too abstract and formal a theory, the linear algebra is developed gradually and in alternation with the geometry. This is done also to give due prominence to the geometric aspects of vector spaces.

The exposition makes use of numerous examples to help the reader acquire the more delicate concepts. Some of the chapters contain ‘Complements’ which develop further more specialized topics. Many of the exercises appearing at the end of each chapter have solutions at the end of the book.

The structure of the book is designed to allow as much flexibility as possible in designing a course, both by reordering or omitting chapters, and, within individual chapters, by omitting examples or complements.

E.S.

Notes for the reader

This text book assumes a knowledge of the basics of set theory and the principal properties of the fundamental sets of numbers, for which we use the following notation.

$\mathbf{N}$: the set of natural numbers (including 0);
$\mathbf{Z}$: the set of integers;
$\mathbf{Q}$: the set of rational numbers;
$\mathbf{R}$: the set of real numbers;
$\mathbf{C}$: the set of complex numbers.

On a first reading of this text, it is not strictly necessary to be familiar with complex numbers. The notation and symbols used are those most commonly used in the mathematical literature. For the reader’s benefit, they are listed here.

The empty set is denoted $\emptyset$. $A \subset B$ and $B \supset A$ mean that the set $A$ is a subset of the set $B$. $a \in A$ means that $a$ is an element of the set $A$. If $A \subset B$ then $B \setminus A$ denotes the difference $B$ minus $A$, consisting of the elements of $B$ which do not belong to $A$. If $n \ge 1$ is an integer and $A$ is a set, then $A^n$ denotes the Cartesian product of $A$ with itself $n$ times.

The expression

$$f : A \to B, \qquad a \mapsto b$$

means that the map $f$ from the set $A$ to the set $B$ sends the element $a \in A$ to the element $b \in B$. If $f : A \to B$ and $g : B \to C$ are two maps, then their composite is denoted $g \circ f$.

For every positive integer $k$ the symbol $k!$ means the product $1 \cdot 2 \cdot 3 \cdots k$ and is called $k$ factorial. By definition, one takes $0! = 1$.

Given $a, b \in \mathbf{R}$, with $a < b$, the symbols $(a, b)$, $[a, b]$, $(a, b]$ and $[a, b)$ mean the intervals with end points $a$ and $b$ which are, respectively, open, closed, open at the left and open at the right.

The conjugate $a - ib$ of the complex number $z = a + ib$ is denoted $\bar{z}$. The modulus of $z$ is $|z| = \sqrt{a^2 + b^2}$.

For other symbols, we refer the reader to the list at the end of the book. The notions introduced in the appendices are used liberally throughout the text.

Part I Affine Geometry

1 Vectors and vector spaces

The study of geometry in secondary schools is based upon Euclid’s axiomatic system in the modern formulation given at the end of the nineteenth century by David Hilbert. For plane geometry, this system considers points and lines as the primitive objects. It also takes as primitive the notions of a point belonging to a line, a point lying between two points, equality of segments and equality of angles (the notions of segment and angle are defined in terms of the axioms). There is an analogous system of axioms for space geometry.

We will adopt a different point of view, founding geometry on the concept of ‘vector’. The axiomatics based on this concept are not only very simple but are also of great importance throughout Mathematics. To motivate the definitions needed, we begin by introducing the concept of vectors in the Euclidean plane and in Euclidean space (which henceforth we refer to as the ordinary plane and space), and then highlight those properties which will subsequently be used to formulate the axioms. For now, we limit ourselves to an intuitive approach without being too concerned about giving complete proofs.

A based vector (or oriented segment) in ordinary space is specified by a base point $A$ and an end point $B$ and is denoted by the symbol $(A, B)$. The point $A$ is also called the initial point of the vector. A based vector is represented by an arrow which joins the points $A$ and $B$ as in Fig. 1.1.

[Fig. 1.1: a based vector drawn as an arrow with base point $A$]

Two based vectors $(A, B)$ and $(C, D)$ are said to be equivalent if they have the same direction and the same length. That is, they are equivalent if they lie on two parallel (possibly coincident) lines and if, moving one of the lines in such a way that it always remains parallel to the other, it is possible to move one segment so that both the initial and the end points coincide. In the set of all based vectors this equivalence is indeed an equivalence relation because it satisfies for obvious reasons the three properties of reflexivity, symmetry and transitivity.

A geometric vector (or simply vector) is by definition an equivalence class of based vectors, that is, it is the set of all oriented segments equivalent to a given oriented segment (Fig. 1.2). Vectors will usually be denoted by letters in bold face $\mathbf{a}$, $\mathbf{b}$, $\mathbf{v}$, $\mathbf{w}$, etc.

[Fig. 1.2: three pairs of based vectors, one pair equivalent and two pairs not equivalent]

Every based vector that determines the geometric vector $\mathbf{a}$ is said to be a representative of $\mathbf{a}$. The vector having representative $(A, B)$ will also be denoted $\overrightarrow{AB}$. Given a point $A$, each vector $\mathbf{a}$ has one and only one representative whose base point is $A$. In the definition, we did not exclude the possibility that $A = B$. The vector determined by $(A, A)$ is called the zero vector: it has zero length and undefined direction and is denoted by $\mathbf{0}$.

The sum of two vectors can be defined using their representatives in the following way (Fig. 1.3). Let $\mathbf{a} = \overrightarrow{AB}$ and $\mathbf{b} = \overrightarrow{BC}$; then $\mathbf{a} + \mathbf{b} = \overrightarrow{AC}$. If instead the vectors $\mathbf{a}$ and $\mathbf{b}$ are given by representatives based at the same point, that is $\mathbf{a} = \overrightarrow{AB}$ and $\mathbf{b} = \overrightarrow{AD}$, then $\mathbf{a} + \mathbf{b} = \overrightarrow{AC}$, where $C$ is the fourth vertex of the parallelogram whose other three vertices are $A$, $B$ and $D$, with $C$ lying opposite the common base point $A$. This method of constructing $\mathbf{a} + \mathbf{b}$ is called the parallelogram rule.

[Fig. 1.3: the sum of two vectors]

The operation of adding two vectors is associative, that is,

$$\mathbf{a} + (\mathbf{b} + \mathbf{c}) = (\mathbf{a} + \mathbf{b}) + \mathbf{c} \tag{1.1}$$

for every triple of vectors $\mathbf{a}, \mathbf{b}, \mathbf{c}$. This is verified immediately using Fig. 1.4.
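The constructions above can be tried out numerically. The following is only a sketch under an assumption the text has not yet made: points are given Cartesian coordinates, so that the based vector $(A, B)$ determines the coordinate difference $B - A$. The helper names (`displacement`, `equivalent`, `translate`, `add`) are hypothetical and chosen for illustration.

```python
# Assumption (not from the text): points are pairs of integer coordinates.

def displacement(base, end):
    """Coordinates of the geometric vector represented by the based vector (base, end)."""
    return tuple(e - b for b, e in zip(base, end))

def equivalent(bv1, bv2):
    """Based vectors (A, B) and (C, D) are equivalent when B - A equals D - C."""
    return displacement(*bv1) == displacement(*bv2)

def translate(point, vector):
    """End point of the representative of `vector` based at `point`."""
    return tuple(p + v for p, v in zip(point, vector))

def add(u, v):
    """Componentwise sum of two coordinate vectors."""
    return tuple(x + y for x, y in zip(u, v))

A, B, D = (0, 0), (2, 1), (1, 3)
a = displacement(A, B)           # a = AB
b = displacement(A, D)           # b = AD

# Head-to-tail rule: move the representative of b so that it starts at B.
C = translate(B, b)
head_to_tail = displacement(A, C)

# Parallelogram rule: C is the fourth vertex, opposite the common base point A.
parallelogram = add(a, b)

c = (-1, 2)
associative = add(a, add(b, c)) == add(add(a, b), c)
```

In this sample both rules give the same vector, and `(A, D)` and `(B, C)` are equivalent based vectors, which is exactly why $C$ is the fourth vertex of the parallelogram.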

[Fig. 1.4: associativity of vector addition]

From (1.1) it follows that when adding three vectors it is possible to omit the brackets because the expression $\mathbf{a} + \mathbf{b} + \mathbf{c}$ has only one meaning. A similar property holds for the sum of any finite number of vectors (see Observation 1.3(2)).

From the way addition of vectors is defined, it is obviously commutative, that is,

$$\mathbf{a} + \mathbf{b} = \mathbf{b} + \mathbf{a}$$

for every pair of vectors $\mathbf{a}, \mathbf{b}$. Note also that the vector $\mathbf{0}$ satisfies

$$\mathbf{a} + \mathbf{0} = \mathbf{0} + \mathbf{a} = \mathbf{a}$$

for every vector $\mathbf{a}$. If $\mathbf{a} = \overrightarrow{AB}$ then $-\mathbf{a}$ denotes $\overrightarrow{BA}$. It is then obvious that

$$\mathbf{a} + (-\mathbf{a}) = \mathbf{0}.$$

We now define the product of a vector $\mathbf{a}$ by a scalar $k$ (in the context of vectors, real numbers are often called scalars). This is, by definition, the vector $k\mathbf{a}$ which is parallel to $\mathbf{a}$, whose length is that of $\mathbf{a}$ multiplied by $|k|$, and whose direction is either the same as that of $\mathbf{a}$ if $k > 0$, or opposite to that of $\mathbf{a}$ if $k < 0$; if $k = 0$ or $\mathbf{a} = \mathbf{0}$ then $k\mathbf{a} = \mathbf{0}$.

The operation of multiplication of a vector by a scalar is compatible with the operation of summing two vectors, and with the operations of addition and multiplication of scalars. For example, the following identity is easy to establish:

$$n\mathbf{a} = \mathbf{a} + \mathbf{a} + \cdots + \mathbf{a} \quad (n \text{ times})$$

for every vector $\mathbf{a}$ and every positive integer $n$. In particular, $1\mathbf{a} = \mathbf{a}$. One can easily verify that

$$(h + k)\mathbf{a} = h\mathbf{a} + k\mathbf{a}$$

and that

$$(kh)\mathbf{a} = k(h\mathbf{a})$$

for every pair of scalars $k, h$ and every vector $\mathbf{a}$. It is also easy to verify the following identity geometrically:

$$k(\mathbf{a} + \mathbf{b}) = k\mathbf{a} + k\mathbf{b}$$

for every scalar $k$ and every pair of vectors $\mathbf{a}, \mathbf{b}$.

In a similar way, one can introduce vectors in a line, and vectors in the ordinary plane. It is important to notice that in order to define vectors and the operations of addition and scalar multiplication, we have only used

the notion of two lines being parallel and the possibility of comparing the lengths of two segments on parallel lines (i.e. that it is possible to find a real number $k$ which represents the ratio of the lengths of two segments, and conversely to associate to any segment and any scalar $k$ a second segment whose length is $k$ times that of the first segment). The axioms of Euclidean geometry guarantee that these are indeed possible. In our definitions we have not needed to compare two arbitrary segments or to measure angles between two lines. In particular, it is not necessary to have at our disposal an absolute unit for measuring distance, nor do we need the concept of perpendicularity.

Now that we have verified, in an intuitive geometric way, the properties of vectors, we will turn this point of view upside down and take these properties as the axioms for the definition of ‘vector space’.

A vector space is always defined with respect to a field of scalars (cf. Appendix A), which in the preceding examples was $\mathbf{R}$ but which can be chosen completely generally. We will not retain the maximum possible generality, but will take our field to be a subfield of $\mathbf{C}$. Consequently, with the exception of Appendix A, we will from now on denote by $\mathbf{K}$ a subfield of $\mathbf{C}$, which we suppose to be fixed once and for all. At first reading it is sufficient to limit oneself to the case $\mathbf{K} = \mathbf{R}$. Nevertheless, it is important to bear in mind that what follows is valid in far greater generality.

Definition 1.1
A vector space over $\mathbf{K}$, or a $\mathbf{K}$-vector space, is a non-empty set $V$ such that,
1) for every pair of elements $\mathbf{v}, \mathbf{w} \in V$ there is defined a third element of $V$ called the sum of $\mathbf{v}$ and $\mathbf{w}$, which is denoted $\mathbf{v} + \mathbf{w}$,
2) for every $\mathbf{v} \in V$ and every $k \in \mathbf{K}$, there is defined an element of $V$ called the product of $\mathbf{v}$ and $k$, denoted $k\mathbf{v}$,
in such a way that the following properties are satisfied:

[VS1] (Associativity of vector addition) For every $\mathbf{u}, \mathbf{v}, \mathbf{w} \in V$, $(\mathbf{u} + \mathbf{v}) + \mathbf{w} = \mathbf{u} + (\mathbf{v} + \mathbf{w})$.
[VS2] (Existence of zero) There exists an element $\mathbf{0} \in V$, called the zero vector, such that $\mathbf{0} + \mathbf{v} = \mathbf{v} + \mathbf{0} = \mathbf{v}$ for every $\mathbf{v} \in V$.
[VS3] (Existence of opposites) For every $\mathbf{v} \in V$ the element $(-1)\mathbf{v}$ satisfies the identity $\mathbf{v} + (-1)\mathbf{v} = \mathbf{0}$.
[VS4] (Commutativity) For every $\mathbf{u}, \mathbf{v} \in V$ one has $\mathbf{u} + \mathbf{v} = \mathbf{v} + \mathbf{u}$.
[VS5] (Distributive property over sum of vectors) For every $\mathbf{u}, \mathbf{v} \in V$ and every $k \in \mathbf{K}$, one has $k(\mathbf{u} + \mathbf{v}) = k\mathbf{u} + k\mathbf{v}$.
[VS6] (Distributive property over sum of scalars) For every $\mathbf{v} \in V$ and every $h, k \in \mathbf{K}$, one has $(h + k)\mathbf{v} = h\mathbf{v} + k\mathbf{v}$.
[VS7] (Associativity of scalar multiplication) For every $\mathbf{v} \in V$ and every $h, k \in \mathbf{K}$, one has $(hk)\mathbf{v} = h(k\mathbf{v})$.
[VS8] For every $\mathbf{v} \in V$ one has $1\mathbf{v} = \mathbf{v}$.

We will say that the two operations above define on $V$ the structure of a $\mathbf{K}$-vector space. It may be possible to define on a set the structure of a $\mathbf{K}$-vector space in more than one way. In other words there may be more than one way to define on $V$ the operations that make it into a vector space, and possibly over different fields (see Example 1.2(4)).

The elements of a vector space $V$ are called vectors, and the elements of $\mathbf{K}$ are called scalars. The vectors of the form $k\mathbf{v}$, $k \in \mathbf{K}$, are said to be proportional to or multiples of $\mathbf{v}$.
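As a concrete illustration of Definition 1.1, the sketch below spot-checks axioms VS1 to VS8 on a handful of sample vectors. It assumes that vectors are tuples of rational numbers with componentwise addition and scalar multiplication (the numerical $n$-space introduced in the examples that follow), and that `Fraction` stands in for the field $\mathbf{K} = \mathbf{Q}$. The function names are hypothetical. A finite check of this kind can expose a failing axiom but is of course not a proof.

```python
from fractions import Fraction
from itertools import product

def vec_add(u, v):
    """Componentwise sum of two tuples."""
    return tuple(x + y for x, y in zip(u, v))

def scal_mul(k, v):
    """Componentwise product of the scalar k with the tuple v."""
    return tuple(k * x for x in v)

def failed_axioms(vectors, scalars, zero):
    """Return the sorted list of axioms VS1..VS8 violated on the given samples."""
    failed = set()
    for u, v, w in product(vectors, repeat=3):
        if vec_add(vec_add(u, v), w) != vec_add(u, vec_add(v, w)):
            failed.add("VS1")
    for v in vectors:
        if vec_add(zero, v) != v or vec_add(v, zero) != v:
            failed.add("VS2")
        if vec_add(v, scal_mul(Fraction(-1), v)) != zero:
            failed.add("VS3")
        if scal_mul(Fraction(1), v) != v:
            failed.add("VS8")
        for h, k in product(scalars, repeat=2):
            if scal_mul(h + k, v) != vec_add(scal_mul(h, v), scal_mul(k, v)):
                failed.add("VS6")
            if scal_mul(h * k, v) != scal_mul(h, scal_mul(k, v)):
                failed.add("VS7")
    for u, v in product(vectors, repeat=2):
        if vec_add(u, v) != vec_add(v, u):
            failed.add("VS4")
        for k in scalars:
            if scal_mul(k, vec_add(u, v)) != vec_add(scal_mul(k, u), scal_mul(k, v)):
                failed.add("VS5")
    return sorted(failed)

F = Fraction
sample_vectors = [(F(1), F(2)), (F(-3, 2), F(0)), (F(5), F(-7, 3))]
sample_scalars = [F(0), F(2), F(-1, 3)]
violations = failed_axioms(sample_vectors, sample_scalars, (F(0), F(0)))
```

Exact rational arithmetic is used rather than floating point so that the equalities in the axioms are tested exactly.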

For every $\mathbf{v} \in V$, the vector $(-1)\mathbf{v}$, called the opposite of $\mathbf{v}$, is written $-\mathbf{v}$, and one writes $\mathbf{u} - \mathbf{v}$ rather than $\mathbf{u} + (-\mathbf{v})$.

From the 8 axioms, there follow several elementary properties of vector spaces which will be discussed at the end of this chapter (in Observations 1.3), and which from then on we will assume the reader knows.

When $\mathbf{K} = \mathbf{R}$ or $\mathbf{K} = \mathbf{C}$, $V$ is often called a real vector space or a complex vector space, respectively.

A set with just one element is in a trivial way a $\mathbf{K}$-vector space whose only element is the zero vector. We will now see a few non-trivial examples of vector spaces.

Examples 1.2

1. Let $V$ be the set of geometric vectors in the plane. The properties that we derived at the beginning of this chapter show that the operations of addition and scalar multiplication define on $V$ the structure of a vector space over $\mathbf{R}$. In a similar way, the sets of geometric vectors in space or in a line form vector spaces over $\mathbf{R}$.

2. Let $n \ge 1$ be an integer and $V = \mathbf{K}^n$, the set of ordered $n$-tuples of elements in $\mathbf{K}$. Define the sum of two $n$-tuples $(x_1, \ldots, x_n)$ and $(y_1, \ldots, y_n)$ by

$$(x_1, \ldots, x_n) + (y_1, \ldots, y_n) = (x_1 + y_1, \ldots, x_n + y_n),$$

and for each $k \in \mathbf{K}$ and $(x_1, \ldots, x_n) \in \mathbf{K}^n$, define

$$k(x_1, \ldots, x_n) = (kx_1, \ldots, kx_n);$$

in particular $-(x_1, \ldots, x_n) = (-x_1, \ldots, -x_n)$. It is easy to check that with these operations $\mathbf{K}^n$ satisfies axioms VS1, ..., VS8, and so is a $\mathbf{K}$-vector space. It is usually called the numerical $n$-space over $\mathbf{K}$. The numerical 1-space over $\mathbf{K}$ is $\mathbf{K}$ itself, which, with the usual definitions of sum and product, is a vector space over itself. If $(x_1, \ldots, x_n) \in \mathbf{K}^n$, the scalars $x_1, \ldots, x_n$ are called the components, and $x_i$ the $i$-th component, of $(x_1, \ldots, x_n)$.

3. Let $I$ be any non-empty set and let $V$ be the set whose elements are the maps $f : I \to \mathbf{K}$. For every $f, g \in V$ define $f + g : I \to \mathbf{K}$ by putting

$$(f + g)(x) = f(x) + g(x)$$

for every $x \in I$. We thus obtain an element $f + g \in V$. If $k \in \mathbf{K}$ and $f \in V$, define $kf : I \to \mathbf{K}$ by

$$(kf)(x) = kf(x)$$

for every $x \in I$. We thus obtain an element $kf \in V$. It is easy to verify that with these two operations $V$ is a $\mathbf{K}$-vector space.

4. Let $\mathbf{F}$ be a subfield of $\mathbf{K}$ and let $V$ be a vector space over $\mathbf{K}$. Then $V$ has induced on it the structure of an $\mathbf{F}$-vector space by the same operations that define the $\mathbf{K}$-vector space structure. In other words, the sum of two vectors remains the same as before, and multiplication by a scalar $a \in \mathbf{F}$ is defined by treating $a$ as an element of $\mathbf{K}$, and so by using the definition of multiplication by scalars in $\mathbf{K}$. For example, the field $\mathbf{C}$ of complex numbers can be considered not only as a complex vector space (the numerical 1-space over $\mathbf{C}$), but also as a real vector space because $\mathbf{R}$ is a subfield of $\mathbf{C}$.

Observations 1.3

1.

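Some of the properties that follow from the axioms VS1, ..., VS8 are obvious for geometric vectors, but require a proof for more general vector spaces. Let $V$ be a $\mathbf{K}$-vector space.

In $V$ there is a unique zero vector. That is, if $\mathbf{0}_1$ and $\mathbf{0}_2$ are two vectors satisfying $\mathbf{0}_1 + \mathbf{v} = \mathbf{v}$ and $\mathbf{0}_2 + \mathbf{v} = \mathbf{v}$ for every $\mathbf{v} \in V$, then $\mathbf{0}_1 = \mathbf{0}_2$. To see this, first put $\mathbf{v} = \mathbf{0}_1$ in $\mathbf{0}_2 + \mathbf{v} = \mathbf{v}$ to obtain $\mathbf{0}_2 + \mathbf{0}_1 = \mathbf{0}_1$. Next put $\mathbf{v} = \mathbf{0}_2$ in the other equation, to obtain $\mathbf{0}_1 + \mathbf{0}_2 = \mathbf{0}_2$. Thus

$$\mathbf{0}_1 = \mathbf{0}_2 + \mathbf{0}_1 = \mathbf{0}_1 + \mathbf{0}_2 = \mathbf{0}_2.$$

For every $\mathbf{v} \in V$ there is only one opposite, that is, if $\mathbf{v} + \mathbf{v}_1 = \mathbf{0} = \mathbf{v} + \mathbf{v}_2$ then $\mathbf{v}_1 = \mathbf{v}_2$. Indeed, one has

$$\mathbf{v}_1 = \mathbf{0} + \mathbf{v}_1 = (\mathbf{v} + \mathbf{v}_2) + \mathbf{v}_1 = \mathbf{v} + (\mathbf{v}_2 + \mathbf{v}_1) = \mathbf{v} + (\mathbf{v}_1 + \mathbf{v}_2) = (\mathbf{v} + \mathbf{v}_1) + \mathbf{v}_2 = \mathbf{0} + \mathbf{v}_2 = \mathbf{v}_2.$$

For every $\mathbf{a}, \mathbf{b} \in V$, the equation $\mathbf{x} + \mathbf{a} = \mathbf{b}$ has a unique solution $\mathbf{x} = \mathbf{b} - \mathbf{a}$. In fact, $(\mathbf{b} - \mathbf{a}) + \mathbf{a} = \mathbf{b}$, and, as $\mathbf{x} + \mathbf{a} = \mathbf{b}$, one has $\mathbf{x} = (\mathbf{x} + \mathbf{a}) - \mathbf{a} = \mathbf{b} - \mathbf{a}$.

For every $\mathbf{v} \in V$, one has $0\mathbf{v} = \mathbf{0}$, where $0 \in \mathbf{K}$ is the zero scalar and $\mathbf{0}$ is the zero vector. This is deduced as follows: $0\mathbf{v} = (0 + 0)\mathbf{v} = 0\mathbf{v} + 0\mathbf{v}$, thus $0\mathbf{v} = 0\mathbf{v} - 0\mathbf{v} = \mathbf{0}$.

Analogously, $k\mathbf{0} = \mathbf{0}$ for every $k \in \mathbf{K}$. This follows from a similar argument: $k\mathbf{0} = k(\mathbf{0} + \mathbf{0}) = k\mathbf{0} + k\mathbf{0}$, so $k\mathbf{0} = k\mathbf{0} - k\mathbf{0} = \mathbf{0}$.

2.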

Let $V$ be a $\mathbf{K}$-vector space. Given three vectors $\mathbf{u}, \mathbf{v}, \mathbf{w} \in V$ one writes $\mathbf{u} + \mathbf{v} + \mathbf{w}$ for the sum calculated in one of the two possible ways, which, by axiom VS1, give the same result.

Suppose now we are given $n \ge 3$ vectors $\mathbf{v}_1, \ldots, \mathbf{v}_n \in V$. We want to show that calculating their sum always gives the same answer, no matter how the brackets are distributed: any bracketing such as

$$(\mathbf{v}_1 + (\mathbf{v}_2 + (\cdots + (\mathbf{v}_{n-1} + \mathbf{v}_n) \cdots))) \tag{1.2}$$

gives a result which depends only on $\mathbf{v}_1, \ldots, \mathbf{v}_n$, and so can be written without the brackets as $\mathbf{v}_1 + \mathbf{v}_2 + \cdots + \mathbf{v}_n$.

The proof is by induction on $n$: if $n = 3$ the statement is true because of axiom VS1. Let $n \ge 4$, and suppose that the statement is true for every integer between 3 and $n - 1$. We can write two different sums (1.2) as $\alpha + \beta$ and $\gamma + \delta$ respectively, where $\alpha, \beta$ are sums in which $\mathbf{v}_1, \ldots, \mathbf{v}_k$ and $\mathbf{v}_{k+1}, \ldots, \mathbf{v}_n$ appear respectively, while $\gamma, \delta$ are such that $\mathbf{v}_1, \ldots, \mathbf{v}_j$ and $\mathbf{v}_{j+1}, \ldots, \mathbf{v}_n$ appear respectively, for some integers $k, j$ less than $n$. If $j = k$, then by the inductive hypothesis, $\alpha = \gamma$ and $\beta = \delta$, whence $\alpha + \beta = \gamma + \delta$. Suppose now that $k < j$. By the inductive hypothesis one has

$$\alpha + (\mathbf{v}_{k+1} + \cdots + \mathbf{v}_j) = \gamma, \qquad (\mathbf{v}_{k+1} + \cdots + \mathbf{v}_j) + \delta = \beta,$$

and so

$$\alpha + \beta = \alpha + (\mathbf{v}_{k+1} + \cdots + \mathbf{v}_j) + \delta = \gamma + \delta.$$
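The statement just proved can be explored by brute force. The sketch below enumerates every way of fully bracketing a sum of vectors (keeping their order) by splitting the list into a left part and a right part, which mirrors the decomposition $\alpha + \beta$ used in the proof. It assumes vectors are tuples of integers with componentwise addition; the function names are hypothetical.

```python
def vec_add(u, v):
    """Componentwise sum of two tuples."""
    return tuple(x + y for x, y in zip(u, v))

def all_bracketings(vectors):
    """Yield the value of every fully bracketed sum of `vectors`, in the given order."""
    if len(vectors) == 1:
        yield vectors[0]
        return
    # Split as (v_1 ... v_k) + (v_{k+1} ... v_n) for every possible k.
    for k in range(1, len(vectors)):
        for alpha in all_bracketings(vectors[:k]):
            for beta in all_bracketings(vectors[k:]):
                yield vec_add(alpha, beta)

vs = [(1, 0), (2, -1), (0, 5), (-4, 3), (7, 7)]
results = list(all_bracketings(vs))
distinct_values = set(results)
```

For five vectors there are 14 bracketings, and in this sample they all produce the single value `(6, 14)`, as the observation predicts.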

EXERCISES

1.1 A map $s : \mathbf{N} \to \mathbf{K}$ of the set of natural numbers to $\mathbf{K}$ is called a sequence of elements of $\mathbf{K}$. If $s(n) = a_n \in \mathbf{K}$, the sequence $s$ is also denoted by $\{a_n\}_{n \in \mathbf{N}}$. Let $S_{\mathbf{K}}$ be the set of all sequences of elements of $\mathbf{K}$. In $S_{\mathbf{K}}$ one defines the following operations:

$$\{a_n\} + \{b_n\} = \{a_n + b_n\}, \qquad k\{a_n\} = \{ka_n\}$$

for every $\{a_n\}, \{b_n\} \in S_{\mathbf{K}}$, and $k \in \mathbf{K}$. Show that, with these operations, $S_{\mathbf{K}}$ is a $\mathbf{K}$-vector space (this is a particular example of the vector space $V$ considered in Example 1.2(3)).

1.2 A sequence $\{a_n\} \in S_{\mathbf{R}}$ is said to be bounded if there is a real number $R \in \mathbf{R}$ such that $|a_n| < R$ for every $n \in \mathbf{N}$. Let $B_{\mathbf{R}}$ denote the set of bounded sequences. Show that, with the operations defined in Exercise 1.1, $B_{\mathbf{R}}$ is a real vector space.

1.3 Let $a, b \in \mathbf{R}$, $a < b$, and let $C_{(a,b)}$ be the set of all continuous real valued functions defined on the interval $(a, b)$. For every $f, g \in C_{(a,b)}$ the function $f + g : (a, b) \to \mathbf{R}$ is defined by

$$(f + g)(x) = f(x) + g(x) \quad \text{for every } x \in (a, b).$$

If $f \in C_{(a,b)}$ and $c \in \mathbf{R}$ then $cf : (a, b) \to \mathbf{R}$ is defined by

$$(cf)(x) = cf(x) \quad \text{for every } x \in (a, b).$$

Prove that $f + g$ and $cf$ are continuous, so that these define two operations on $C_{(a,b)}$. Prove that with these operations, $C_{(a,b)}$ has the structure of a real vector space.

1.4 Let $X$ be an unknown and let $\mathbf{K}[X]$ be the set of polynomials in $X$ with coefficients in $\mathbf{K}$. For every $f, g \in \mathbf{K}[X]$ and $a \in \mathbf{K}$, let $f + g \in \mathbf{K}[X]$ be the polynomial sum of $f$ and $g$, and let $af$ be the polynomial product of $a$, considered as the constant polynomial, and $f$. Prove that these operations define on $\mathbf{K}[X]$ the structure of a $\mathbf{K}$-vector space.

2 Matrices

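Let $m, n$ be positive integers. An $m \times n$ matrix is a rectangular array

$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}$$

[...]

[...] upper triangular if $a_{ij} = 0$ for every $i > j$ and lower triangular if $a_{ij} = 0$ for every $i < j$. Furthermore, if $a_{ij} = 0$ for every $i \ge j$, then $A$ is said to be strictly upper triangular. Strictly lower triangular matrices are defined similarly. For example, of the three square matrices with real entries,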

[Display: three square matrices with real entries; the entries are not recoverable]

the first is diagonal, the second upper triangular and the third is strictly lower triangular.

A square matrix $A \in M_n(\mathbf{K})$ satisfying $A^t = A$ is said to be symmetric; if, on the other hand, $A^t = -A$ then it is said to be skew-symmetric. For example, the matrix

[Display: a symmetric matrix; the entries are not recoverable]

is symmetric, while

[Display: a skew-symmetric matrix; the entries are not recoverable]

is skew-symmetric. Notice that any skew-symmetric matrix must have only zeros down the diagonal. On the other hand, the matrix

[Display: a matrix; the entries are not recoverable]

is neither symmetric nor skew-symmetric.

Proposition 2.2
1) If $A, B \in M_{m,n}(\mathbf{K})$, $C, D \in M_{n,p}(\mathbf{K})$ [...]
2) [...] then $(AB)C = A(BC)$.
3) If $A$ and $B$ can be multiplied, then $B^t$ and $A^t$ can be multiplied, and $(AB)^t = B^tA^t$.
4) If $A, B \in M_{m,n}(\mathbf{K})$ then $A^t + B^t = (A + B)^t$.

Proof
1) Let $A^{(i)} = (a_{i1}\ a_{i2}\ \ldots\ a_{in})$ and $B^{(i)} = (b_{i1}\ b_{i2}\ \ldots\ b_{in})$ be the $i$-th rows of $A$ and $B$ (for $i = 1, \ldots, m$) and

$$C_{(k)} = \begin{pmatrix} c_{1k} \\ c_{2k} \\ \vdots \\ c_{nk} \end{pmatrix}$$

be the $k$-th column of $C$ (for $k = 1, \ldots, p$). The entry of $(A + B)C$ in position $i, k$ is then

$$(A + B)^{(i)}C_{(k)} = (a_{i1} + b_{i1})c_{1k} + (a_{i2} + b_{i2})c_{2k} + \cdots + (a_{in} + b_{in})c_{nk},$$

while the entry of $AC + BC$ in position $i, k$ is

$$A^{(i)}C_{(k)} + B^{(i)}C_{(k)} = (a_{i1}c_{1k} + a_{i2}c_{2k} + \cdots + a_{in}c_{nk}) + (b_{i1}c_{1k} + b_{i2}c_{2k} + \cdots + b_{in}c_{nk}).$$

These two are clearly equal, which proves the first identity in (1). The second and third identities are proved in a similar manner. The entry of $AI_n$ in position $i, j$ is

$$A^{(i)}(I_n)_{(j)} = a_{i1}0 + \cdots + a_{i\,j-1}0 + a_{ij}1 + a_{i\,j+1}0 + \cdots + a_{in}0 = a_{ij},$$

and so $AI_n = A$. The proof that $I_nC = C$ is similar.

2) Notice that $AB \in M_{m,p}(\mathbf{K})$ and $BC \in M_{n,s}(\mathbf{K})$, and consequently, not only $AB$ and $C$ but also $A$ and $BC$ can be multiplied. The $i$-th row of $AB$ is (for $i = 1, \ldots, m$)

$$(AB)^{(i)} = (A^{(i)}B_{(1)}\ \ A^{(i)}B_{(2)}\ \ \cdots\ \ A^{(i)}B_{(p)}),$$

while the $h$-th column of $BC$ is (for $h = 1, \ldots, s$)

$$(BC)_{(h)} = \begin{pmatrix} B^{(1)}C_{(h)} \\ B^{(2)}C_{(h)} \\ \vdots \\ B^{(n)}C_{(h)} \end{pmatrix}.$$

It follows that the entry of $(AB)C$ in position $i, h$ is

$$\begin{aligned} (AB)^{(i)}C_{(h)} &= (A^{(i)}B_{(1)})c_{1h} + (A^{(i)}B_{(2)})c_{2h} + \cdots + (A^{(i)}B_{(p)})c_{ph} \\ &= (a_{i1}b_{11} + a_{i2}b_{21} + \cdots + a_{in}b_{n1})c_{1h} + (a_{i1}b_{12} + a_{i2}b_{22} + \cdots + a_{in}b_{n2})c_{2h} \\ &\quad + \cdots + (a_{i1}b_{1p} + a_{i2}b_{2p} + \cdots + a_{in}b_{np})c_{ph}. \end{aligned} \tag{2.4}$$

The entry of $A(BC)$ in position $i, h$ is, instead,

$$\begin{aligned} A^{(i)}(BC)_{(h)} &= a_{i1}(B^{(1)}C_{(h)}) + a_{i2}(B^{(2)}C_{(h)}) + \cdots + a_{in}(B^{(n)}C_{(h)}) \\ &= a_{i1}(b_{11}c_{1h} + b_{12}c_{2h} + \cdots + b_{1p}c_{ph}) + a_{i2}(b_{21}c_{1h} + b_{22}c_{2h} + \cdots + b_{2p}c_{ph}) \\ &\quad + \cdots + a_{in}(b_{n1}c_{1h} + b_{n2}c_{2h} + \cdots + b_{np}c_{ph}). \end{aligned} \tag{2.5}$$

A comparison of (2.4) and (2.5) shows that they are equal, because each is the sum of all terms of the form $a_{ij}b_{jk}c_{kh}$ as $j$ varies from 1 to $n$ and $k$ varies from 1 to $p$. Thus the matrices $(AB)C$ and $A(BC)$ coincide entry for entry, and the assertion is proved.

3) Suppose $A \in M_{m,n}(\mathbf{K})$ and $B \in M_{n,p}(\mathbf{K})$. Then $A^t \in M_{n,m}(\mathbf{K})$ and $B^t \in M_{p,n}(\mathbf{K})$ and so $B^t$ and $A^t$ can be multiplied. Furthermore,

$$((AB)^t)_{ij} = (AB)_{ji} = A^{(j)}B_{(i)} = (B^t)^{(i)}(A^t)_{(j)} = (B^tA^t)_{ij}.$$

[...]

If $A_1, A_2, \ldots, A_m$ are matrices with entries in $\mathbf{K}$ such that for each $k = 1, 2, \ldots, m - 1$ the matrices $A_k$ and $A_{k+1}$ can be multiplied, one can easily show that their product does not depend on how the brackets are distributed. Thus for $A_1(A_2(\cdots A_m) \cdots)$,

one need only write $A_1A_2 \cdots A_m$. The proof is similar to that given in Observation 1.3(2) and is left to the reader.

A square matrix $A$ of order $n$ is said to be invertible if there is a matrix $M \in M_n(\mathbf{K})$ such that $AM = MA = I_n$. If such an $M$ exists then it is unique: for if $N$ also satisfies $AN = NA = I_n$, then

$$M = MI_n = M(AN) = (MA)N = I_nN = N.$$

This unique $M$ is called the inverse of $A$ and is written $A^{-1}$. In fact, if $A \in M_n(\mathbf{K})$ is invertible and $M$ satisfies one of the identities $AM = I_n$ or $MA = I_n$ then it necessarily satisfies the other too. Indeed, if, for example, $AM = I_n$, then

$$MA = (A^{-1}A)MA = A^{-1}(AM)A = A^{-1}I_nA = A^{-1}A = I_n.$$

In a similar way, one shows that $MA = I_n$ implies $AM = I_n$.

The identity matrix $I_n$ is invertible and is equal to its inverse. It follows immediately from the definition that $(A^{-1})^{-1} = A$ for any invertible matrix $A \in M_n(\mathbf{K})$. If $A, B \in M_n(\mathbf{K})$ are invertible then so is their product $AB$, and $(AB)^{-1} = B^{-1}A^{-1}$. Indeed,

$$(B^{-1}A^{-1})(AB) = B^{-1}(A^{-1}A)B = B^{-1}I_nB = B^{-1}B = I_n.$$

More generally, if $A_1, A_2, \ldots, A_k \in M_n(\mathbf{K})$ are all invertible, then their product $A_1A_2 \cdots A_k$ is also invertible, and

$$(A_1A_2 \cdots A_k)^{-1} = A_k^{-1} \cdots A_2^{-1}A_1^{-1}.$$

The proof is similar to the preceding one. In Chapter 3 we will describe a procedure for finding the inverse of any invertible matrix. The subset of $M_n(\mathbf{K})$ of invertible matrices is denoted $GL_n(\mathbf{K})$.

For any integer $k \ge 1$ the product $AA \cdots A$ of $A \in M_n(\mathbf{K})$ with itself $k$ times is written $A^k$; by convention, $A^0 = I_n$. For $A \in GL_n(\mathbf{K})$ one defines $A^{-k} = (A^{-1})^k$.

A real square matrix $A \in M_n(\mathbf{R})$ is said to be orthogonal if $AA^t = I_n$, that is, if $A^t = A^{-1}$. The set of orthogonal matrices is denoted $O(n)$. By definition $O(n) \subset GL_n(\mathbf{R})$.
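The identities in this passage can be checked on small examples. The sketch below is illustrative only: matrices are lists of rows with exact rational entries, and the inverse of a $2 \times 2$ matrix is computed with the standard formula $\frac{1}{ad - bc}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$, which is a stand-in chosen for brevity and not the procedure the text develops in Chapter 3. All function names and the sample matrices are hypothetical.

```python
from fractions import Fraction as F

def mat_mul(A, B):
    """Row-by-column product of two matrices given as lists of rows."""
    return [[sum(A[i][j] * B[j][k] for j in range(len(B)))
             for k in range(len(B[0]))]
            for i in range(len(A))]

def transpose(A):
    """Transpose: rows become columns."""
    return [list(col) for col in zip(*A)]

def identity(n):
    """The identity matrix of order n."""
    return [[F(int(i == j)) for j in range(n)] for i in range(n)]

def inverse_2x2(A):
    """Inverse of a 2x2 matrix via the closed formula (assumed here, see lead-in)."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[F(2), F(1)], [F(5), F(3)]]
B = [[F(1), F(2)], [F(3), F(4)]]

# (AB)^{-1} = B^{-1} A^{-1}, with the factors in reversed order.
lhs = inverse_2x2(mat_mul(A, B))
rhs = mat_mul(inverse_2x2(B), inverse_2x2(A))

# (AB)^t = B^t A^t.
transpose_rule = transpose(mat_mul(A, B)) == mat_mul(transpose(B), transpose(A))

# An orthogonal matrix: Q Q^t = I_2, equivalently Q^t = Q^{-1}.
Q = [[F(3, 5), F(-4, 5)], [F(4, 5), F(3, 5)]]
is_orthogonal = mat_mul(Q, transpose(Q)) == identity(2)
```

Note that the order of the factors matters: in this sample `mat_mul(inverse_2x2(A), inverse_2x2(B))` differs from `lhs`.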

The only orthogonal $1 \times 1$ matrices are $(1)$ and $(-1)$. A matrix $A \in M_2(\mathbf{R})$ is orthogonal if and only if it has one of the forms

$$\begin{pmatrix} a & -b \\ b & a \end{pmatrix} \tag{2.6}$$

or

$$\begin{pmatrix} a & b \\ b & -a \end{pmatrix} \tag{2.7}$$

with $a^2 + b^2 = 1$. Indeed, if $A = (a_{ij})$ then

$$A^tA = \begin{pmatrix} a_{11}^2 + a_{21}^2 & a_{11}a_{12} + a_{21}a_{22} \\ a_{12}a_{11} + a_{22}a_{21} & a_{12}^2 + a_{22}^2 \end{pmatrix}$$

and so $A \in O(2)$ if and only if

$$a_{11}^2 + a_{21}^2 = 1 = a_{12}^2 + a_{22}^2, \qquad a_{11}a_{12} + a_{21}a_{22} = 0.$$

From these two conditions, it follows that there exists $\rho \ne 0$ such that

$$(a_{11}, a_{21}) = (-\rho a_{22}, \rho a_{12}).$$

From the first two equations, it then follows that $\rho^2 = 1$, that is $\rho = \pm 1$. Thus $a_{12} = \pm a_{21}$ and $a_{22} = \mp a_{11}$, so $A$ is either of the form (2.6) or of the form (2.7). We will return to a more general discussion of orthogonal matrices in Chapters 20 and 21.

To describe matrices the so-called block notation is often useful. This consists of writing a matrix $A \in M_{m,n}(\mathbf{K})$ in the following form:

:

\ Ah\

A\2 A 22

... •••

:

Aik \ A2k :

Ah2 . . .

Ahk J

where the Aij are themselves matrices of some appropriate sizes. More precisely, Aij £ Mm|,n>(K ) where rrti + m 2 H + = m and n\ + n2 H (- rtk = n. For example, m atrix (2 . 1 ) can be written in block notation as, A = (B

C\

22

Affine geom etry

where

O bservation 2.3 Matrices can be defined with entries in any domain D (see Appendix A). The set of all m x n matrices with entries in D is denoted by MmjTl(D ), and the set of square matrices of order n by Afn(D ). The most im portant cases for us will be D = Z and D = K [ X \, X i , . . . , X n], where X \ , . . . , X n are unknowns. The product of two matrices with entries in a domain is defined in exactly the same way as for D = K . Proposition 2.2 also remains unchanged if the field K is replaced by a domain D .

EXERCISES

2.1 Calculate: [...]

[...]

$$a_{21}X_1 + a_{22}X_2 + \cdots + a_{2n}X_n = 0 \quad \cdots \tag{3.3}$$

[...]

$$\cdots \quad a_{mm}X_m + \cdots + a_{mn}X_n = b_m \tag{3.6}$$

with $a_{11}, a_{22}, \ldots, a_{mm} \ne 0$. The matrix of coefficients of (3.6) is [...]

[...] $a_1, \ldots, a_n$.

The vector space $\{\mathbf{0}\}$ consisting of the single vector $\mathbf{0}$ does not have a finite basis because its only vector is linearly dependent. There are other vector spaces, different from $\{\mathbf{0}\}$, which do not possess a finite basis (see Example 4.15(5)).

We will prove soon that if a vector space $V$ has a basis consisting of $n$ vectors, then every other basis of $V$ also consists of $n$ vectors. This fundamental result follows from the following theorem.

Theorem 4.12
Let $\{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ be a system of generators of $V$ and let $\mathbf{w}_1, \ldots, \mathbf{w}_m$ be elements of $V$. If $m > n$ then $\mathbf{w}_1, \ldots, \mathbf{w}_m$ are linearly dependent.

Proof
If $\mathbf{w}_1, \ldots, \mathbf{w}_n$ are linearly dependent then so are $\mathbf{w}_1, \ldots, \mathbf{w}_m$, by Proposition 4.9. We can therefore restrict our attention to the case where $\mathbf{w}_1, \ldots, \mathbf{w}_n$ are linearly independent. It will be sufficient to show that $\mathbf{w}_1, \ldots, \mathbf{w}_n$ generate $V$ because then $\mathbf{w}_m$ can be expressed as a linear combination of $\mathbf{w}_1, \ldots, \mathbf{w}_n$ and so by Proposition 4.7 it would follow that $\mathbf{w}_1, \ldots, \mathbf{w}_m$ are indeed linearly dependent.

From the hypothesis that $\mathbf{v}_1, \ldots, \mathbf{v}_n$ generate $V$ it follows that there are scalars $a_1, \ldots, a_n$ such that

$$\mathbf{w}_1 = a_1\mathbf{v}_1 + \cdots + a_n\mathbf{v}_n.$$

Now, since $\mathbf{w}_1, \ldots, \mathbf{w}_n$ are linearly independent, $\mathbf{w}_1 \ne \mathbf{0}$ and so the coefficients $a_1, \ldots, a_n$ are not all zero. After possibly reordering $\mathbf{v}_1, \ldots, \mathbf{v}_n$ we can suppose that $a_1 \ne 0$. Then,

$$\mathbf{v}_1 = a_1^{-1}\mathbf{w}_1 - a_1^{-1}a_2\mathbf{v}_2 - \cdots - a_1^{-1}a_n\mathbf{v}_n,$$

that is, $\mathbf{v}_1 \in \langle \mathbf{w}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n \rangle$. Since $\mathbf{v}_2, \ldots, \mathbf{v}_n \in \langle \mathbf{w}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n \rangle$ we have

$$\langle \mathbf{w}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n \rangle \supset \langle \mathbf{v}_1, \ldots, \mathbf{v}_n \rangle = V,$$

that is, $\mathbf{w}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n$ generate $V$. Suppose now that for some $1 \le s < n$, $\langle \mathbf{w}_1, \ldots, \mathbf{w}_s, \mathbf{v}_{s+1}, \ldots, \mathbf{v}_n \rangle$ [...]

… $\in \mathbf{K}$ one has

$$t_{m+1}s_{m+1} + \cdots + t_n s_n = (t_{m+1}s_{m+1,1} + \cdots + t_n s_{n1},\ t_{m+1}s_{m+1,2} + \cdots + t_n s_{n2},\ \ldots,\ t_{m+1}s_{m+1,m} + \cdots + t_n s_{nm},\ t_{m+1}, t_{m+2}, \ldots, t_n). \tag{4.7}$$

The right hand side is equal to $(0, 0, \ldots, 0)$ if and only if $t_{m+1} = t_{m+2} = \cdots = t_n = 0$, so $s_{m+1}, \ldots, s_n$ are linearly independent. Every solution of (4.6) is obtained by choosing values $t_{m+1}, \ldots, t_n$ for the unknowns $X_{m+1}, X_{m+2}, \ldots, X_n$. Equation (4.7) shows that any such solution is a linear combination of $s_{m+1}, s_{m+2}, \ldots, s_n$. Thus $s_{m+1}, s_{m+2}, \ldots, s_n$ generate the space $\Sigma_0$ of solutions of (4.6). It follows that $\{s_{m+1}, s_{m+2}, \ldots, s_n\}$ is a basis for $\Sigma_0$. In particular, $\dim(\Sigma_0) = n - m$.

Recall from Chapter 3 that system (4.6) has an $(n - m)$-parameter family of solutions, and thus the number of parameters is the same as the dimension of the space of solutions. More generally, since any system of linear homogeneous equations (4.5) can be reduced to one in echelon form, we have that a system of homogeneous linear equations has an $N$-parameter family of solutions if and only if the space of its solutions has dimension $N$.

5. Let $X$ be an unknown. The $\mathbf{K}$-vector space $\mathbf{K}[X]$ of polynomials in $X$ with coefficients in $\mathbf{K}$ does not have finite dimension. To see this, suppose that $\mathbf{K}[X]$ does have a finite basis $\{f_1(X), \ldots, f_n(X)\}$. Let $d_1, \ldots, d_n$ be the degrees of $f_1(X), \ldots, f_n(X)$ respectively, and put $D = \max\{d_1, \ldots, d_n\}$. Let $f(X)$ be a polynomial of degree $d > D$. Since $\{f_1(X), \ldots, f_n(X)\}$ is a finite basis, there exist $a_1, \ldots, a_n \in \mathbf{K}$ such that

$$f(X) = a_1 f_1(X) + \cdots + a_n f_n(X).$$

However, the polynomial on the right hand side has degree at most $D$, and so cannot be equal to $f(X)$. This contradiction implies that a finite basis of $\mathbf{K}[X]$ cannot exist. In a similar fashion one can show that the $\mathbf{K}$-vector space $\mathbf{K}[X_1, X_2, \ldots, X_n]$ of polynomials in the unknowns $X_1, \ldots, X_n$ with coefficients in $\mathbf{K}$ is not finite dimensional.

6. For any positive integer $d$, the vector space $\mathbf{K}[X_1, \ldots, X_n]_d$ consists of all homogeneous polynomials of degree $d$ in $X_1, \ldots, X_n$ with coefficients in $\mathbf{K}$ together with the polynomial 0. This vector space has a finite basis: one can take the set of all monomials of degree $d$. Thus $\mathbf{K}[X_1, \ldots, X_n]_d$ has dimension $\binom{n+d-1}{d}$ by Lemma A.11. In particular the vector space $\mathbf{K}[X_1, \ldots, X_n]_1$ of all homogeneous polynomials of degree 1 in $X_1, \ldots, X_n$ has dimension $n$.

7. The vector space $M_{m,n}(\mathbf{K})$ of all $m \times n$ matrices with entries in $\mathbf{K}$ has dimension $mn$. To see this consider, for each $1 \le i \le m$ and each $1 \le j \le n$, the matrix $\mathbf{1}_{ij}$ which has a 1 in the $(i,j)$-th position and 0s elsewhere. We obtain a set of $mn$ matrices $\{\mathbf{1}_{11}, \mathbf{1}_{12}, \ldots, \mathbf{1}_{mn}\}$ which forms a basis for $M_{m,n}(\mathbf{K})$. Indeed, if $A = (a_{ij}) \in M_{m,n}(\mathbf{K})$, then there is a unique expression for $A$ as

$$A = a_{11}\mathbf{1}_{11} + a_{12}\mathbf{1}_{12} + \cdots + a_{mn}\mathbf{1}_{mn}.$$
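The dimension count for homogeneous polynomials in the examples above can be checked by brute force for small cases. The following is only a sketch: the function name `monomials` and the representation of a monomial by its tuple of exponents are illustrative choices, not notation from the text.

```python
from itertools import combinations_with_replacement
from math import comb


def monomials(n, d):
    """Exponent tuples of all monomials of degree d in n unknowns.

    A monomial X_1^{e_1} ... X_n^{e_n} is stored as (e_1, ..., e_n).
    """
    result = []
    # Choosing d unknowns with repetition, order ignored, gives one monomial.
    for choice in combinations_with_replacement(range(n), d):
        exponents = [0] * n
        for index in choice:
            exponents[index] += 1
        result.append(tuple(exponents))
    return result


# The number of monomials agrees with the binomial coefficient (n+d-1 choose d).
for n in range(1, 5):
    for d in range(1, 5):
        assert len(monomials(n, d)) == comb(n + d - 1, d)
```

For $d = 1$ the list has exactly $n$ entries, in agreement with the last remark of the example on homogeneous polynomials.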

Proposition 4.16 Suppose that $\dim(\mathbf{V}) = n$.

1) If $\mathbf{v}_1, \ldots, \mathbf{v}_n \in \mathbf{V}$ are linearly independent then $\{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ is a basis for $\mathbf{V}$.

2) If $\mathbf{v}_1, \ldots, \mathbf{v}_k \in \mathbf{V}$ are linearly independent then there are vectors $\mathbf{v}_{k+1}, \ldots, \mathbf{v}_n \in \mathbf{V}$ such that $\{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ is a basis for $\mathbf{V}$.

Proof 1) It is enough to show that $\langle \mathbf{v}_1, \ldots, \mathbf{v}_n \rangle = \mathbf{V}$. By hypothesis there is a basis $\{\mathbf{b}_1, \ldots, \mathbf{b}_n\}$ of $\mathbf{V}$, and this implies, by Theorem 4.12, that for every $\mathbf{v} \in \mathbf{V}$, the vectors $\mathbf{v}_1, \ldots, \mathbf{v}_n, \mathbf{v}$ are linearly dependent. Therefore there exist $a_1, \ldots, a_n, a \in \mathbf{K}$, not all zero, such that

$$a_1\mathbf{v}_1 + \cdots + a_n\mathbf{v}_n + a\mathbf{v} = \mathbf{0}.$$

Since $\mathbf{v}_1, \ldots, \mathbf{v}_n$ are linearly independent, $a$ must be non-zero. Thus we have that

$$\mathbf{v} = -a^{-1}a_1\mathbf{v}_1 - \cdots - a^{-1}a_n\mathbf{v}_n,$$

and so $\mathbf{v} \in \langle \mathbf{v}_1, \ldots, \mathbf{v}_n \rangle$. As $\mathbf{v}$ was arbitrary, we have proved the assertion.

2) By Theorem 4.12, we have $k \le n$. If $k = n$ the result follows from part (1), so suppose now that $k < n$. Now, $\mathbf{v}_1, \ldots, \mathbf{v}_k$ cannot generate $\mathbf{V}$, as otherwise there would be a basis with $k \neq n$ elements, contradicting Corollary 4.13. Thus we can find a vector $\mathbf{v}_{k+1} \in \mathbf{V} \setminus \langle \mathbf{v}_1, \ldots, \mathbf{v}_k \rangle$. Suppose that $a_1, \ldots, a_k, a_{k+1} \in \mathbf{K}$ are such that …

… (5.5)

is obvious when interpreted as a relation between the row ranks of the two matrices. On the other hand $B$ is a submatrix of $C$:

$$B = C(1\ 2 \ldots p \mid j_1\ j_2 \ldots j_q),$$

formed from the $j_1$-th, $j_2$-th, …, $j_q$-th columns of $C$. In this case the inequality

$$\mathrm{rk}(B) \le \mathrm{rk}(C) \tag{5.6}$$

is obvious when seen as a relation between the column ranks. Together, (5.5) and (5.6) imply that $\mathrm{rk}(B) \le \mathrm{rk}(A)$.



As a consequence of the preceding results we obtain the following theorem.

Theorem 5.6 The rank of a matrix $A$ is the maximum order of its invertible square submatrices.

Proof Let $p$ be the maximum order of the invertible square submatrices of $A$, and let $r = \mathrm{rk}(A)$. From Propositions 5.4 and 5.5 it follows that $p \le r$. On the other hand, there are $r$ linearly independent rows of $A$, say $A^{(i_1)}, A^{(i_2)}, \ldots, A^{(i_r)}$. It follows that the matrix $B = A(i_1\ i_2 \ldots i_r \mid 1\ 2 \ldots n)$ has rank $r$, and so it has $r$ linearly independent columns, say $B_{(j_1)}, B_{(j_2)}, \ldots, B_{(j_r)}$. The square submatrix $B(1 \ldots r \mid j_1 \ldots j_r)$ of $B$ has rank $r$, that is, it is invertible. Since

$$B(1 \ldots r \mid j_1 \ldots j_r) = A(i_1 \ldots i_r \mid j_1 \ldots j_r)$$

is a submatrix of $A$ one also has that $p \ge r$. □
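Theorem 5.6 can be checked numerically on small matrices. The sketch below is not the practical method the text announces for the next chapter: it computes the rank by Gaussian elimination over the rationals and compares it with a brute-force search over all square submatrices. The helper names `rank` and `max_invertible_order` are illustrative assumptions.

```python
from fractions import Fraction
from itertools import combinations


def rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination over Q."""
    m = [[Fraction(x) for x in row] for row in rows]
    ncols = len(m[0]) if m else 0
    r = 0  # number of pivots found so far
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def max_invertible_order(rows):
    """Largest k such that some k x k submatrix is invertible (has rank k)."""
    nrows, ncols = len(rows), len(rows[0])
    for k in range(min(nrows, ncols), 0, -1):
        for row_idx in combinations(range(nrows), k):
            for col_idx in combinations(range(ncols), k):
                sub = [[rows[i][j] for j in col_idx] for i in row_idx]
                if rank(sub) == k:
                    return k
    return 0


A = [[1, 2, 3], [2, 4, 6], [1, 0, 1]]
assert rank(A) == max_invertible_order(A) == 2
```

The search is exponential in the size of the matrix, so it is only useful as a check of the statement, not as a way of computing ranks.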



The notion of 'determinant', which will be introduced in the next chapter, and Theorem 5.6 together provide a practical method for computing the rank of a matrix (cf. Corollary 6.6). From the proof of Theorem 5.6 it follows that if a square submatrix of $A$, $B = A(i_1 \ldots i_r \mid j_1 \ldots j_r)$, is invertible then the rows $A^{(i_1)}, A^{(i_2)}, \ldots, A^{(i_r)}$ of $A$ are linearly independent and the columns $A_{(j_1)}, A_{(j_2)}, \ldots, A_{(j_r)}$ are linearly independent.


We can now apply these considerations to systems of linear equations. The notion of rank permits the following simple criterion for the compatibility of a system of equations.

Theorem 5.7 (Kronecker-Rouché-Capelli) A system of $m$ equations in $n$ unknowns

$$AX = \mathbf{b}, \tag{5.7}$$

where $A \in M_{m,n}(\mathbf{K})$, $\mathbf{b} \in M_{m,1}(\mathbf{K})$ and $X = (X_1\ X_2 \ldots X_n)^t$, is compatible if and only if $\mathrm{rk}(A) = \mathrm{rk}(A\ \mathbf{b})$. In this case the space of solutions of system (5.7) has dimension $n - r$, where $r = \mathrm{rk}(A)$.

Proof Let

$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}, \qquad \mathbf{b} = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{pmatrix}.$$

An $n$-tuple $(x_1, \ldots, x_n) \in \mathbf{K}^n$ is a solution of (5.7) if and only if

$$x_1 \begin{pmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{pmatrix} + x_2 \begin{pmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{pmatrix} + \cdots + x_n \begin{pmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{pmatrix} = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{pmatrix}. \tag{5.8}$$

Equation (5.8) expresses the fact that $\mathbf{b}$ is a linear combination of the columns of $A$. This is satisfied if and only if the augmented matrix $(A\ \mathbf{b})$ has the same column rank as $A$, that is, if and only if $\mathrm{rk}(A\ \mathbf{b}) = \mathrm{rk}(A)$. The first part of the theorem is thus proved.

If system (5.7) is compatible, and $r = \mathrm{rk}(A)$, then we can suppose that its first $r$ equations are linearly independent, and substitute (5.7) with the equivalent system,

$$\begin{aligned} a_{11}X_1 + a_{12}X_2 + \cdots + a_{1n}X_n &= b_1 \\ a_{21}X_1 + a_{22}X_2 + \cdots + a_{2n}X_n &= b_2 \\ &\ \,\vdots \\ a_{r1}X_1 + a_{r2}X_2 + \cdots + a_{rn}X_n &= b_r. \end{aligned} \tag{5.9}$$

Applying Gaussian elimination to system (5.9), one can see that none of the equations becomes $0 = 0$, because this would imply that that equation is a linear combination of the others. Thus system (5.9) can be transformed into an equivalent system in echelon form with $r$ equations. Therefore (5.9), and so (5.7), has an $(n - r)$-parameter family of solutions, and consequently the space of solutions is $(n - r)$-dimensional (see Example 4.15(4)). □
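The criterion of Theorem 5.7 translates directly into a rank comparison. The following sketch assumes rational entries and uses illustrative names (`rank`, `classify_system`) that do not come from the text; it returns whether the system is compatible and, if so, the dimension $n - r$ of its space of solutions.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination over Q."""
    m = [[Fraction(x) for x in row] for row in rows]
    ncols = len(m[0]) if m else 0
    r = 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def classify_system(A, b):
    """Return (compatible, dimension of the solution space or None) for AX = b."""
    n = len(A[0])                      # number of unknowns
    r = rank(A)
    augmented = [list(row) + [bi] for row, bi in zip(A, b)]
    if rank(augmented) != r:           # rk(A b) > rk(A): no solutions
        return False, None
    return True, n - r


# X + Y + Z = 1, 2X + 2Y + 2Z = 2: compatible, 2-parameter family.
assert classify_system([[1, 1, 1], [2, 2, 2]], [1, 2]) == (True, 2)
# Same left hand sides, inconsistent right hand sides.
assert classify_system([[1, 1, 1], [2, 2, 2]], [1, 3]) == (False, None)
```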

EXERCISES

5.1 Calculate the rank of each of the following matrices with rational entries: - 1\ 1 /I 0 1 1 1 /2 3 1 1 -1 -1 a) 1 4 2 b) 0 0 0 - 1 /2 - 2 - 1 1 o / 0 0 / \o (I -2 4 1 4 7 2 c) 11 7 2 2 18 9 / V3 6

5.2 Prove that every $n \times m$ matrix with entries in $\mathbf{K}$ and rank at most 1 is of the form

$$\begin{pmatrix} a_1 \\ \vdots \\ a_n \end{pmatrix} (b_1 \ \cdots \ b_m)$$

for suitable $a_1, \ldots, a_n, b_1, \ldots, b_m \in \mathbf{K}$.

6 Determinants

In this chapter we describe a way of associating an element of $\mathbf{K}$ to any square matrix $A$ with entries in $\mathbf{K}$. This is called the 'determinant' of $A$. The determinant, as we shall see, is an instrument of fundamental practical importance in linear algebra.

We will make use of the summation symbol $\Sigma$ to denote the sum of any finite number of terms indexed by one or more indices; the sets of values of the indices will be indicated under and/or over the symbol $\Sigma$. Similarly, the symbol $\Pi$ will be used to denote products.

Definition 6.1 Let $n \ge 1$ and

$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix} \in M_n(\mathbf{K}).$$

The determinant of $A$ is the element of $\mathbf{K}$ given by

$$\det(A) = \sum_{p \in S_n} \varepsilon(p)\, a_{1p(1)} a_{2p(2)} \cdots a_{np(n)}, \tag{6.1}$$

where $S_n$ denotes the set of all permutations of $\{1, 2, \ldots, n\}$, and $\varepsilon(p)$ is the sign of the permutation $p \in S_n$ (see Appendix B). $\det(A)$ is also denoted $\det(a_{ij})$ or $|A|$ or again $|a_{ij}|$.


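Expression (6.1) is the sum of $n!$ terms. Ignoring the signs, these terms are all possible products of $n$ entries of $A$ belonging to different rows and different columns. If $n = 1$ and $A = (a)$ then $\det(A) = a$. If $n = 2$ then

$$\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} - a_{12}a_{21}.$$

If $n = 3$ then

$$\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = a_{11}a_{22}a_{33} - a_{11}a_{23}a_{32} + a_{12}a_{23}a_{31} - a_{12}a_{21}a_{33} + a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31}.$$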
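Definition (6.1) can be evaluated literally for small matrices. The sketch below sums over all permutations and computes the sign by counting inversions; the function names `sign` and `det_leibniz` are illustrative, and indices are 0-based as is usual in Python, whereas the text uses 1-based indices.

```python
from itertools import permutations


def sign(p):
    """Sign of a permutation given as a tuple of images, via its inversion count."""
    inversions = sum(
        1
        for i in range(len(p))
        for j in range(i + 1, len(p))
        if p[i] > p[j]
    )
    return -1 if inversions % 2 else 1


def det_leibniz(a):
    """Determinant of a square matrix as the sum (6.1) over all permutations."""
    n = len(a)
    total = 0
    for p in permutations(range(n)):
        term = sign(p)
        for i in range(n):
            term *= a[i][p[i]]  # one entry from each row and each column
        total += term
    return total


assert det_leibniz([[5]]) == 5
assert det_leibniz([[1, 2], [3, 4]]) == 1 * 4 - 2 * 3
assert det_leibniz([[1, 2, 3], [4, 5, 6], [7, 8, 10]]) == -3
```

The loop visits $n!$ permutations, which illustrates why the definition is impractical for large $n$.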

As $n$ increases it becomes difficult to compute the determinant of a general $n \times n$ matrix directly from the definition (6.1). We will see shortly some simpler ways of doing it, without using (6.1).

Let $A \in M_n(\mathbf{K})$. As usual, denote the rows of $A$ by $A^{(1)}, \ldots, A^{(n)}$ and the columns by $A_{(1)}, \ldots, A_{(n)}$. Using the block notation, we can write either

$$A = (A_{(1)}\ A_{(2)} \ldots A_{(n)}) \qquad \text{or} \qquad A = \begin{pmatrix} A^{(1)} \\ A^{(2)} \\ \vdots \\ A^{(n)} \end{pmatrix}.$$

We can therefore write

$$\det(A) = \det(A_{(1)}\ A_{(2)} \ldots A_{(n)}) = \det \begin{pmatrix} A^{(1)} \\ A^{(2)} \\ \vdots \\ A^{(n)} \end{pmatrix}.$$

With this notation, the determinant will be considered as a function of $n$ row vectors, or of $n$ column vectors.


Theorem 6.2 Let $n \ge 1$ be an integer, and let

$$A = (a_{ij}) = (A_{(1)}\ A_{(2)} \ldots A_{(n)}) = \begin{pmatrix} A^{(1)} \\ A^{(2)} \\ \vdots \\ A^{(n)} \end{pmatrix} \in M_n(\mathbf{K}).$$

Then:

1) $\det(A^t) = \det(A)$.

2) If $A^{(i)} = cV + c'V'$ for some $1 \le i \le n$ and $c, c' \in \mathbf{K}$, that is, if the $i$-th row of $A$ is a linear combination of two row $n$-vectors, then

$$\det(A) = c \det \begin{pmatrix} A^{(1)} \\ \vdots \\ V \\ \vdots \\ A^{(n)} \end{pmatrix} + c' \det \begin{pmatrix} A^{(1)} \\ \vdots \\ V' \\ \vdots \\ A^{(n)} \end{pmatrix}.$$

Similarly, if $A_{(i)} = cW + c'W'$ for some $1 \le i \le n$, where $W$ and $W'$ are two column $n$-vectors, then

$$\det(A) = c \det(A_{(1)} \ldots W \ldots A_{(n)}) + c' \det(A_{(1)} \ldots W' \ldots A_{(n)}).$$

3) If the matrix $B \in M_n(\mathbf{K})$ is obtained from $A$ by swapping a pair of rows or a pair of columns, then $\det(B) = -\det(A)$.

4) If $A$ has two identical rows or columns then $\det(A) = 0$.

5) $\det(I_n) = 1$.

Proof 1) By definition,

$$\det(A^t) = \sum_{p \in S_n} \varepsilon(p)\, a_{p(1)1} a_{p(2)2} \cdots a_{p(n)n}. \tag{6.2}$$

Ignoring the signs, the terms in (6.2) are the same as in (6.1). Indeed, the term

$$a_{p(1)1} a_{p(2)2} \cdots a_{p(n)n} \tag{6.3}$$

can also be written as

$$a_{1q(1)} a_{2q(2)} \cdots a_{nq(n)}, \tag{6.4}$$


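where $q = p^{-1} \in S_n$. Now, to see that the signs of the summands in $\det(A)$ and $\det(A^t)$ are the same, note that $\varepsilon(p^{-1}) = \varepsilon(p)$. Thus $\det(A) = \det(A^t)$.

2) By part (1), the two statements are equivalent, therefore we need only prove the first one. Let

$$V = (v_1 \ldots v_n) \qquad \text{and} \qquad V' = (v'_1 \ldots v'_n).$$

Then,

$$(a_{i1}\ a_{i2} \ldots a_{in}) = (cv_1 + c'v'_1 \quad cv_2 + c'v'_2 \ \ldots\ cv_n + c'v'_n)$$

and

$$\begin{aligned} \det(A) &= \sum_{p \in S_n} \varepsilon(p)\, a_{1p(1)} a_{2p(2)} \cdots a_{np(n)} \\ &= \sum_{p \in S_n} \varepsilon(p)\, a_{1p(1)} \cdots (cv_{p(i)} + c'v'_{p(i)}) \cdots a_{np(n)} \\ &= c \sum_{p \in S_n} \varepsilon(p)\, a_{1p(1)} \cdots v_{p(i)} \cdots a_{np(n)} + c' \sum_{p \in S_n} \varepsilon(p)\, a_{1p(1)} \cdots v'_{p(i)} \cdots a_{np(n)} \\ &= c \det \begin{pmatrix} A^{(1)} \\ \vdots \\ V \\ \vdots \\ A^{(n)} \end{pmatrix} + c' \det \begin{pmatrix} A^{(1)} \\ \vdots \\ V' \\ \vdots \\ A^{(n)} \end{pmatrix}. \end{aligned}$$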

3) By part (1) it is sufficient to consider the case where $B$ is obtained from $A$ by swapping two rows. Suppose the $i$-th and $j$-th rows are swapped, where $1 \le i < j \le n$. Putting $B = (b_{hk})$, we have

$$\det(B) = \sum_{p \in S_n} \varepsilon(p)\, b_{1p(1)} \cdots b_{ip(i)} \cdots b_{jp(j)} \cdots b_{np(n)}$$

…

… $A(2 \ldots n \mid 1 \ldots \hat{\jmath} \ldots n)$. By varying $p \in S_n$ with $p(1) = j$, the permutation $q$ defined in (6.8) varies through all of $S_{n-1}$. Therefore the sum of the terms in (6.7) is equal to $a_{1j}(-1)^{\ldots}$ …

… $\mathbf{V}$ that satisfy Axioms AS1 and AS2.

From now on we will assume that every affine space we consider is such that the associated vector space $\mathbf{V}$ is finite dimensional. The dimension of $\mathbf{V}$ is also called the dimension of the affine space $\mathbf{A}$, and is denoted $\dim(\mathbf{A})$. Affine spaces of dimension 1 or 2 are commonly called affine lines or affine planes, respectively.

Examples 7.2

1. The ordinary line, plane and space are examples of a real affine line, a real affine plane and a real 3-dimensional affine space, respectively. The associated vector spaces are the spaces of geometric vectors of the respective spaces, and the operation which associates a vector to an ordered pair of points is the one that was used in Chapter 1 to define geometric vectors. Thus affine spaces are a generalization of the ordinary line, plane and space.

2. Let $\mathbf{V}$ be a finite dimensional vector space over $\mathbf{K}$. Putting

$$\overrightarrow{\mathbf{a}\mathbf{b}} = \mathbf{b} - \mathbf{a}$$

defines on $\mathbf{V}$ the structure of an affine space over itself. Axiom AS1 is satisfied because for every point $\mathbf{p} \in \mathbf{V}$ and for every vector $\mathbf{v} \in \mathbf{V}$ the point $\mathbf{q} = \mathbf{p} + \mathbf{v}$ is the unique point that satisfies the equation $\mathbf{q} - \mathbf{p} = \mathbf{v}$. Axiom AS2 is satisfied because the identity

$$\mathbf{r} - \mathbf{p} = (\mathbf{q} - \mathbf{p}) + (\mathbf{r} - \mathbf{q})$$

holds for every $\mathbf{p}, \mathbf{q}, \mathbf{r} \in \mathbf{V}$.


Thus every vector space $\mathbf{V}$ can be considered as an affine space over itself. With this affine structure, $\mathbf{V}$ is denoted by $\mathbf{V}_a$.

3. A particular case of the preceding example arises when $\mathbf{V} = \mathbf{K}^n$. The resulting affine space $\mathbf{K}^n_a$ is called the affine numerical $n$-space over $\mathbf{K}$. It is usually denoted by $\mathbf{A}^n(\mathbf{K})$, or just by $\mathbf{A}^n$ if $\mathbf{K}$ is clear from the context.

Axiom AS1 implies that given any point $O \in \mathbf{A}$ the resulting map associating $P \in \mathbf{A}$ to $\overrightarrow{OP} \in \mathbf{V}$ is one-to-one and onto. This correspondence is the generalization of the one in ordinary space, which, if a point $O$ is specified, associates to each point $P$ the geometric vector represented by the oriented segment with initial point $O$ and end point $P$.

Definition 7.3 Let $\mathbf{V}$ be a $\mathbf{K}$-vector space, and $\mathbf{A}$ an affine space over $\mathbf{V}$. An affine system of coordinates in the space $\mathbf{A}$ is given by a point $O \in \mathbf{A}$ and a basis $\{\mathbf{e}_1, \ldots, \mathbf{e}_n\}$ of $\mathbf{V}$. This coordinate system is denoted $O\mathbf{e}_1 \ldots \mathbf{e}_n$. For every point $P \in \mathbf{A}$ one has

$$\overrightarrow{OP} = a_1\mathbf{e}_1 + \cdots + a_n\mathbf{e}_n$$

for some $a_1, \ldots, a_n \in \mathbf{K}$. These scalars $a_1, \ldots, a_n$ are called the affine coordinates (or just coordinates), and $(a_1, \ldots, a_n)$ the coordinate $n$-tuple, of $P$ with respect to the coordinate system $O\mathbf{e}_1 \ldots \mathbf{e}_n$. The point $O$ is called the origin of this coordinate system. It has coordinate $n$-tuple $(0, 0, \ldots, 0)$.

Given an affine coordinate system $O\mathbf{e}_1 \ldots \mathbf{e}_n$ on $\mathbf{A}$, we will write $P(x_1, \ldots, x_n)$ for the point $P \in \mathbf{A}$ with coordinates $x_1, \ldots, x_n$ (Fig. 7.1 refers to the ordinary plane). If $A(a_1, \ldots, a_n), B(b_1, \ldots, b_n) \in \mathbf{A}$, the vector $\overrightarrow{AB}$ has coordinate $n$-tuple $(b_1 - a_1, \ldots, b_n - a_n)$ with respect to the basis $\{\mathbf{e}_1, \ldots, \mathbf{e}_n\}$. This follows from the identity $\overrightarrow{AB} = \overrightarrow{OB} - \overrightarrow{OA}$.

If $\mathbf{A} = \mathbf{A}^n$, then the affine coordinate system $O\mathbf{E}_1 \ldots \mathbf{E}_n$ in which $O = (0, \ldots, 0)$ and $\{\mathbf{E}_1, \ldots, \mathbf{E}_n\}$ is the canonical basis of $\mathbf{K}^n$ is called the standard affine coordinate system. In this coordinate system, every point $(x_1, \ldots, x_n) \in \mathbf{A}^n$ has itself as coordinate $n$-tuple.

The most important subsets of affine spaces are the 'affine subspaces':


Definition 7.4 Let $\mathbf{V}$ be a $\mathbf{K}$-vector space and let $\mathbf{A}$ be an affine space over $\mathbf{V}$. Given a point $Q \in \mathbf{A}$ and a vector subspace $\mathbf{W}$ of $\mathbf{V}$, the affine subspace of $\mathbf{A}$ passing through $Q$ and parallel to $\mathbf{W}$ is the subset $S \subset \mathbf{A}$ consisting of the points $P \in \mathbf{A}$ with $\overrightarrow{QP} \in \mathbf{W}$.

Note that $Q \in S$ because the vector subspace $\mathbf{W}$ contains the zero vector and $\mathbf{0} = \overrightarrow{QQ}$, so $S$ is non-empty. The subspace $\mathbf{W} \subset \mathbf{V}$ is called the vector subspace associated to $S$. The number $\dim(\mathbf{W})$ is called the dimension of $S$ and written $\dim(S)$.

If $\dim(S) = 0$ then $S = \{Q\}$ is a single point; conversely, every subset of $\mathbf{A}$ consisting of one point is an affine subspace of dimension 0. If $\dim(S) = 1$ then $S$ is said to be a line in $\mathbf{A}$, and $\mathbf{W}$ the direction of $S$; any non-zero vector $\mathbf{a} \in \mathbf{W}$ is called a direction vector for the line. It follows from the definition that the line $S$ consists of all points $P \in \mathbf{A}$ such that $\overrightarrow{QP} = t\mathbf{a}$ for some $t \in \mathbf{K}$. If $\dim(S) = 2$ then $S$ is said to be a plane in $\mathbf{A}$. Fig. 7.2 represents a plane in ordinary space.

The dimension of an affine subspace of $\mathbf{A}$ cannot be more than $\dim(\mathbf{A})$. If $\dim(S) = \dim(\mathbf{A})$ then $S = \mathbf{A}$ because in this case $\mathbf{W} = \mathbf{V}$ and so $\overrightarrow{QP} \in \mathbf{W}$ for every $P \in \mathbf{A}$. If $\dim(S) = \dim(\mathbf{A}) - 1$, then $S$ is said to be a hyperplane. For example, a line in the affine plane is a hyperplane, as is a plane in a 3-dimensional affine space.
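In the numerical space $\mathbf{A}^n$ with rational coordinates, Definition 7.4 gives a direct membership test: $P$ lies in the subspace through $Q$ parallel to $\mathbf{W}$ exactly when adjoining $\overrightarrow{QP} = P - Q$ to a set of generators of $\mathbf{W}$ does not raise the rank. The sketch below makes that assumption about coordinates, and the names `rank` and `in_affine_subspace` are illustrative.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination over Q."""
    m = [[Fraction(x) for x in row] for row in rows]
    ncols = len(m[0]) if m else 0
    r = 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def in_affine_subspace(p, q, w_generators):
    """True if the vector from q to p lies in the span of w_generators."""
    qp = [x - y for x, y in zip(p, q)]
    generators = [list(w) for w in w_generators]
    return rank(generators + [qp]) == rank(generators)


# The line through Q = (1, 1, 0) with direction vector (1, 0, 1).
assert in_affine_subspace((3, 1, 2), (1, 1, 0), [(1, 0, 1)])
assert not in_affine_subspace((3, 2, 2), (1, 1, 0), [(1, 0, 1)])
# Dimension 0: the subspace is the single point Q.
assert in_affine_subspace((1, 2), (1, 2), [])
```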


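Fig. 7.2

Examples 7.5

1. Let $\mathbf{A}$ be ordinary space. The affine subspaces of $\mathbf{A}$ are the points, the lines, the planes and $\mathbf{A}$ itself.

2. Let $\mathbf{V}$ be a non-zero finite dimensional vector space over $\mathbf{K}$. Consider a vector subspace $\mathbf{W} \subset \mathbf{V}$ and a point $\mathbf{q} \in \mathbf{V}_a$. The affine subspace of $\mathbf{V}_a$ passing through $\mathbf{q}$ and parallel to $\mathbf{W}$ is the set

$$\mathbf{q} + \mathbf{W} = \{\mathbf{q} + \mathbf{w} \mid \mathbf{w} \in \mathbf{W}\}.$$

Indeed, by definition $\mathbf{q} + \mathbf{W}$ consists of all $\mathbf{v} \in \mathbf{V}_a$ satisfying $\mathbf{v} - \mathbf{q} \in \mathbf{W}$. If in particular $\mathbf{q} \in \mathbf{W}$ then $\mathbf{q} + \mathbf{W} = \mathbf{W}$. We thus see that all vector subspaces of a vector space $\mathbf{V}$ are affine subspaces of $\mathbf{V}_a$, and that every affine subspace is of the form $\mathbf{q} + \mathbf{W}$ for some $\mathbf{q} \in \mathbf{V}_a$ and for some subspace $\mathbf{W} \subset \mathbf{V}$, i.e. it is a translate of a vector subspace of $\mathbf{V}$.

3. Given $N + 1 \ge 2$ points $P_0, \ldots, P_N$ in an affine space $\mathbf{A}$, the affine subspace passing through $P_0$ and having associated vector subspace $\langle \overrightarrow{P_0P_1}, \overrightarrow{P_0P_2}, \ldots, \overrightarrow{P_0P_N} \rangle$ is written $\overline{P_0P_1 \ldots P_N}$, and called the subspace generated by (or the span of) $P_0, P_1, \ldots, P_N$. Although the role of $P_0$ in this definition is different from the others, the affine subspace $\overline{P_0P_1 \ldots P_N}$ does not depend on the order in which the points are taken. To see this, note that the vector subspace $\langle \overrightarrow{P_0P_1}, \overrightarrow{P_0P_2}, \ldots, \overrightarrow{P_0P_N} \rangle$ contains all the vectors $\overrightarrow{P_iP_j} = \overrightarrow{P_0P_j} - \overrightarrow{P_0P_i}$, and so

$$\langle \overrightarrow{P_iP_0}, \ldots, \overrightarrow{P_iP_N} \rangle \subset \langle \overrightarrow{P_0P_1}, \ldots, \overrightarrow{P_0P_N} \rangle \tag{7.2}$$

for every $i = 0, 1, \ldots, N$. Conversely, for each $i$, every vector $\overrightarrow{P_0P_j}$ can be expressed as $\overrightarrow{P_0P_j} = \overrightarrow{P_iP_j} - \overrightarrow{P_iP_0}$, and so

$$\langle \overrightarrow{P_0P_1}, \ldots, \overrightarrow{P_0P_N} \rangle \subset \langle \overrightarrow{P_iP_0}, \ldots, \overrightarrow{P_iP_N} \rangle. \tag{7.3}$$

Thus the associated vector subspace does not depend on the order of the points. Furthermore, if $P \in \mathbf{A}$ satisfies $\overrightarrow{P_0P} \in \langle \overrightarrow{P_0P_1}, \ldots, \overrightarrow{P_0P_N} \rangle$ then

$$\overrightarrow{P_iP} = \overrightarrow{P_0P} - \overrightarrow{P_0P_i} \in \langle \overrightarrow{P_iP_0}, \ldots, \overrightarrow{P_iP_N} \rangle$$

by (7.3), and if $P \in \mathbf{A}$ satisfies $\overrightarrow{P_iP} \in \langle \overrightarrow{P_iP_0}, \ldots, \overrightarrow{P_iP_N} \rangle$ then, by (7.2),

$$\overrightarrow{P_0P} = \overrightarrow{P_iP} - \overrightarrow{P_iP_0} \in \langle \overrightarrow{P_0P_1}, \ldots, \overrightarrow{P_0P_N} \rangle.$$

By the definition of $\overline{P_0P_1 \ldots P_N}$ it follows that $\dim(\overline{P_0P_1 \ldots P_N})$ …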


… $b_1, b_2, b_3$)) if and only if its coordinates satisfy the equation in $X, Y, Z$:

$$\begin{vmatrix} X - q_1 & Y - q_2 & Z - q_3 \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix} = 0. \tag{10.3}$$

Expanding the left hand side of (10.3) by the first row, and putting

$$A = \begin{vmatrix} a_2 & a_3 \\ b_2 & b_3 \end{vmatrix}, \qquad B = -\begin{vmatrix} a_1 & a_3 \\ b_1 & b_3 \end{vmatrix}, \qquad C = \begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix} \tag{10.4}$$

gives

$$A(X - q_1) + B(Y - q_2) + C(Z - q_3) = 0. \tag{10.5}$$

Note that $A, B, C$ are not all zero: this follows from the fact that $\mathbf{a}$ and $\mathbf{b}$ are linearly independent. If we put

$$D = -Aq_1 - Bq_2 - Cq_3, \tag{10.6}$$

then (10.5) can be rewritten in the form

$$AX + BY + CZ + D = 0. \tag{10.7}$$
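The passage from (10.3) to (10.7) is a short computation with $2 \times 2$ minors. As a sketch, with the illustrative function name `plane_coefficients` and coordinates given as tuples, the coefficients $A, B, C, D$ can be computed from the point $Q(q_1, q_2, q_3)$ and the two direction vectors as follows.

```python
def plane_coefficients(q, a, b):
    """Coefficients (A, B, C, D) of AX + BY + CZ + D = 0.

    q is a point of the plane; a and b are linearly independent vectors
    generating its associated vector subspace, as in (10.3)-(10.7).
    """
    A = a[1] * b[2] - a[2] * b[1]        # minor of the first entry, (10.4)
    B = -(a[0] * b[2] - a[2] * b[0])     # note the minus sign in (10.4)
    C = a[0] * b[1] - a[1] * b[0]
    D = -(A * q[0] + B * q[1] + C * q[2])  # (10.6)
    return A, B, C, D


def evaluate(plane, p):
    """Left hand side of (10.7) at the point p."""
    A, B, C, D = plane
    return A * p[0] + B * p[1] + C * p[2] + D


q, a, b = (1, 0, 0), (-1, 1, 0), (-1, 0, 1)
plane = plane_coefficients(q, a, b)
assert plane == (1, 1, 1, -1)            # the plane X + Y + Z - 1 = 0
# Q, Q + a and Q + b all satisfy the equation.
assert evaluate(plane, q) == 0
assert evaluate(plane, (0, 1, 0)) == 0
assert evaluate(plane, (0, 0, 1)) == 0
```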


Equation (10.7) is satisfied by the points $P(x, y, z) \in \mathcal{P}$, and only by these, and so it is the Cartesian equation of the plane $\mathcal{P}$.

If $\mathcal{P}$ is specified by giving three points which are not collinear, $P_0(x_0, y_0, z_0)$, $P_1(x_1, y_1, z_1)$ and $P_2(x_2, y_2, z_2)$, then its associated vector subspace is generated by the vectors $\overrightarrow{P_0P_1}$ and $\overrightarrow{P_0P_2}$, and (10.3) takes the form

$$\begin{vmatrix} X - x_0 & Y - y_0 & Z - z_0 \\ x_1 - x_0 & y_1 - y_0 & z_1 - z_0 \\ x_2 - x_0 & y_2 - y_0 & z_2 - z_0 \end{vmatrix} = 0. \tag{10.8}$$

Equation (10.8) is the Cartesian equation for the plane passing through the points $P_0$, $P_1$ and $P_2$.

A line $\ell \subset \mathbf{A}$ has parametric equations of the form

$$x = a + lt, \qquad y = b + mt, \qquad z = c + nt, \tag{10.9}$$

where $Q(a, b, c) \in \ell$ and $\mathbf{v}(l, m, n)$ is its direction vector. The line $\ell$ can also be defined by two Cartesian equations, that is, as the intersection of two planes, because it has codimension 2:

$$\begin{aligned} AX + BY + CZ + D &= 0 \\ A'X + B'Y + C'Z + D' &= 0. \end{aligned} \tag{10.10}$$

The direction of $\ell$ is the 1-dimensional vector subspace of $\mathbf{V}$ given by the associated homogeneous system,

$$\begin{aligned} AX + BY + CZ &= 0 \\ A'X + B'Y + C'Z &= 0. \end{aligned} \tag{10.11}$$

It follows that the vector $\mathbf{v}(l, m, n)$ with coordinates

$$l = \begin{vmatrix} B & C \\ B' & C' \end{vmatrix}, \qquad m = -\begin{vmatrix} A & C \\ A' & C' \end{vmatrix}, \qquad n = \begin{vmatrix} A & B \\ A' & B' \end{vmatrix}$$

is a direction vector for $\ell$ because $(l, m, n)$ is a non-zero solution of (10.11). This observation provides a practical method for finding the direction vector of a line which is given by Cartesian equations.

If, on the other hand, the line $\ell$ is specified by a point $Q(a, b, c)$ on it and a direction vector $\mathbf{v}(l, m, n)$, i.e. by (10.9), then the Cartesian


equations for the line can be found by requiring that the order two minors of the matrix

$$\begin{pmatrix} X - a & Y - b & Z - c \\ l & m & n \end{pmatrix} \tag{10.12}$$

be zero. Indeed, this condition on the indeterminates $X, Y, Z$ is satisfied by the coordinates of points $P(x, y, z) \in \mathbf{A}$ for which the vector $\overrightarrow{QP}$ is proportional to $\mathbf{v}$, that is, by points in $\ell$.
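The method described earlier for reading a direction vector off the Cartesian equations (10.10) amounts to three $2 \times 2$ minors of the coefficient matrix. The following sketch uses the illustrative name `line_direction` and represents each plane by its coefficient tuple $(A, B, C, D)$; the two planes are assumed not to be parallel.

```python
def line_direction(plane, plane_prime):
    """Direction vector (l, m, n) of the line cut out by two non-parallel planes.

    Each plane is a tuple (A, B, C, D) of coefficients of AX + BY + CZ + D = 0.
    """
    A, B, C, _ = plane
    A1, B1, C1, _ = plane_prime
    l = B * C1 - C * B1
    m = -(A * C1 - C * A1)   # the middle minor enters with a minus sign
    n = A * B1 - B * A1
    return l, m, n


# The line X + Y + Z - 3 = 0, X - Y + 2 = 0.
v = line_direction((1, 1, 1, -3), (1, -1, 0, 2))
assert v == (1, 1, -2)
# (l, m, n) solves the associated homogeneous system (10.11).
assert 1 * v[0] + 1 * v[1] + 1 * v[2] == 0
assert 1 * v[0] - 1 * v[1] + 0 * v[2] == 0
```

The result is the zero vector exactly when the two coefficient rows are proportional, that is, when the two equations do not define a line.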

Suppose that, for example, $l \neq 0$. Then the condition above becomes

$$\begin{aligned} m(X - a) - l(Y - b) &= 0 \\ n(X - a) - l(Z - c) &= 0, \end{aligned} \tag{10.13}$$

or, equivalently,

$$\begin{aligned} mX - lY - (ma - lb) &= 0 \\ nX - lZ - (na - lc) &= 0, \end{aligned} \tag{10.14}$$

which are the equations of two distinct planes containing $\ell$. That the planes are distinct can be seen because, since $l \neq 0$, the matrix of coefficients of (10.14) has rank 2. These planes therefore define the line $\ell$. Note finally that (10.13) implies that the remaining minor in (10.12) also vanishes:

$$n(Y - b) - m(Z - c) = l^{-1}\{m[n(X - a) - l(Z - c)] - n[m(X - a) - l(Y - b)]\} = 0.$$

Proposition 10.1 Let $\mathcal{P}$ and $\mathcal{P}'$ be two planes in $\mathbf{A}$ with Cartesian equations

$$\begin{aligned} AX + BY + CZ + D &= 0 \\ A'X + B'Y + C'Z + D' &= 0 \end{aligned} \tag{10.15}$$

respectively. Then

1) $\mathcal{P}$ and $\mathcal{P}'$ are parallel if and only if the matrix

$$\begin{pmatrix} A & B & C \\ A' & B' & C' \end{pmatrix} \tag{10.16}$$

has rank 1.

2) If the matrix (10.16) has rank 1 then $\mathcal{P}$ and $\mathcal{P}'$ are disjoint if the augmented matrix

$$\begin{pmatrix} A & B & C & D \\ A' & B' & C' & D' \end{pmatrix} \tag{10.17}$$


has rank 2; if its rank is 1 then they coincide.

3) If $\mathcal{P}$ and $\mathcal{P}'$ are not parallel then they intersect, and $\mathcal{P} \cap \mathcal{P}'$ is a line; this occurs if and only if matrix (10.16) has rank 2.

Proof The cases considered in the proposition correspond to the different possibilities that can occur for the system of equations (10.15), for which (10.16) and (10.17) are the matrix of coefficients and the augmented matrix, respectively. Since the vector subspaces associated to $\mathcal{P}$ and $\mathcal{P}'$ are given by the associated homogeneous equations

$$\begin{aligned} AX + BY + CZ &= 0 \\ A'X + B'Y + C'Z &= 0, \end{aligned} \tag{10.18}$$

$\mathcal{P}$ and $\mathcal{P}'$ are parallel if and only if the equations in (10.18) are proportional, which is in turn true if and only if matrix (10.16) has rank 1. In this case, matrix (10.17) has rank 1 or 2, according to whether system (10.15) is compatible or not, that is, according to whether $\mathcal{P} \cap \mathcal{P}' \neq \emptyset$ or $\mathcal{P} \cap \mathcal{P}' = \emptyset$. In the first case we must have that $\mathcal{P} = \mathcal{P}'$ since they are parallel. This proves (1) and (2).

By (1) matrix (10.16) has rank 2 if and only if $\mathcal{P}$ and $\mathcal{P}'$ are not parallel. In this case system (10.15) is compatible because (10.17) also has rank 2. System (10.15) therefore has a 1-parameter family of solutions and $\mathcal{P} \cap \mathcal{P}'$ is thus a line. □

Let us now consider the case of a line and a plane. The following proposition describes all possibilities for their relative positions.

Proposition 10.2 Let $\ell$ be a line with parametric equations (10.9), and Cartesian equations (10.10), and let $\mathcal{P}''$ be a plane with equation

$$A''X + B''Y + C''Z + D'' = 0. \tag{10.19}$$

1) $\ell$ and $\mathcal{P}''$ are parallel if and only if

$$\begin{vmatrix} A & B & C \\ A' & B' & C' \\ A'' & B'' & C'' \end{vmatrix} = 0 \tag{10.20}$$


or equivalently, if and only if

$$A''l + B''m + C''n = 0. \tag{10.21}$$

2) If (10.20) is satisfied, then $\ell \subset \mathcal{P}''$ if and only if the matrix

$$\begin{pmatrix} A & B & C & D \\ A' & B' & C' & D' \\ A'' & B'' & C'' & D'' \end{pmatrix} \tag{10.22}$$

has rank 2, otherwise they are disjoint (and the matrix has rank 3).

3) If $\ell$ and $\mathcal{P}''$ are not parallel, then they are incident, and $\ell \cap \mathcal{P}''$ consists of only 1 point; this occurs if and only if

$$\begin{vmatrix} A & B & C \\ A' & B' & C' \\ A'' & B'' & C'' \end{vmatrix} \neq 0 \tag{10.23}$$

or equivalently, if and only if

$$A''l + B''m + C''n \neq 0. \tag{10.24}$$

Proof Suppose that (10.20) is satisfied. Then the homogeneous system

$$\begin{aligned} AX + BY + CZ &= 0 \\ A'X + B'Y + C'Z &= 0 \\ A''X + B''Y + C''Z &= 0 \end{aligned} \tag{10.25}$$

is equivalent to the system consisting of only the first two equations, which defines the direction of $\ell$. Thus every direction vector of $\ell$ satisfies the third equation in (10.25), which is the equation of the vector subspace associated to $\mathcal{P}''$; that is, $\mathcal{P}''$ and $\ell$ are parallel. Conversely, if $\mathcal{P}''$ and $\ell$ are parallel then the third equation in (10.25) is linearly dependent on the first two, and so (10.20) is satisfied. The fact that (10.20) and (10.21) are equivalent follows by expanding the determinant (10.20) by its last row, whose cofactors are precisely $l$, $m$ and $n$. This proves (1).

If $\mathcal{P}''$ and $\ell$ are parallel, then either $\ell \subset \mathcal{P}''$ or they are disjoint according to whether the system

$$\begin{aligned} AX + BY + CZ + D &= 0 \\ A'X + B'Y + C'Z + D' &= 0 \\ A''X + B''Y + C''Z + D'' &= 0 \end{aligned} \tag{10.26}$$


is or is not compatible. Part (2) is now proved using Theorem 5.7.

By (1), $\mathcal{P}''$ and $\ell$ are not parallel if and only if (10.23) and (10.24) (which, as already noted, are equivalent) are satisfied. In this case the system (10.26) is compatible and admits a unique solution, by Theorem 5.7. That is, $\mathcal{P}''$ and $\ell$ have a unique point in common. □

Equation (10.20), or equation (10.21), is the condition for a line and a plane to be parallel.

We now proceed to the case of the relative position of two lines. Two lines $\ell$ and $\ell_1$ are said to be coplanar if there is a plane which contains them both.

Proposition 10.3 Two lines $\ell$ and $\ell_1$ in $\mathbf{A}$ are coplanar if and only if one of the following conditions is satisfied:

1) $\ell$ and $\ell_1$ are parallel;

2) $\ell$ and $\ell_1$ are incident.

In particular $\ell$ and $\ell_1$ in $\mathbf{A}$ are coplanar if and only if they are not skew.

Proof Since two lines in an affine plane which do not meet are parallel it follows that two coplanar lines are either incident or parallel.

Conversely, if $\ell = \ell_1$, then it is obvious that they are coplanar. If $\ell$ and $\ell_1$ are parallel and distinct then they are contained in the plane passing through any point $Q \in \ell$ and with associated vector subspace $\langle \mathbf{v}, \overrightarrow{QQ_1} \rangle$, where $\mathbf{v}$ is a direction vector for $\ell$ and $\ell_1$, and $Q_1 \in \ell_1$. If $\ell$ and $\ell_1$ are incident and distinct, then they lie in the plane passing through the point $\ell \cap \ell_1$ and with associated vector subspace $\langle \mathbf{v}, \mathbf{v}_1 \rangle$, where $\mathbf{v}$ and $\mathbf{v}_1$ are direction vectors for $\ell$ and $\ell_1$ respectively. □
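Propositions 10.1 and 10.2 reduce questions of relative position to a few rank and sign checks, which can be sketched as follows for rational coordinates. The function names and returned strings are illustrative assumptions. For the line and plane, the sketch uses the parametric data $Q$, $\mathbf{v}$ and condition (10.21); in the parallel case it tests whether $Q$ lies on the plane instead of computing the rank of (10.22), which is an equivalent but different check.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination over Q."""
    m = [[Fraction(x) for x in row] for row in rows]
    ncols = len(m[0]) if m else 0
    r = 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def planes_position(plane, plane_prime):
    """Relative position of two planes (A, B, C, D), following Proposition 10.1."""
    coefficients = [list(plane[:3]), list(plane_prime[:3])]    # matrix (10.16)
    augmented = [list(plane), list(plane_prime)]               # matrix (10.17)
    if rank(coefficients) == 2:
        return "intersect in a line"
    return "coincide" if rank(augmented) == 1 else "parallel and disjoint"


def line_plane_position(q, v, plane):
    """Relative position of the line through q with direction v and a plane."""
    A, B, C, D = plane
    if A * v[0] + B * v[1] + C * v[2] != 0:                    # condition (10.24)
        return "incident in one point"
    on_plane = A * q[0] + B * q[1] + C * q[2] + D == 0
    return "line contained in plane" if on_plane else "parallel and disjoint"


assert planes_position((1, 1, 1, -1), (2, 2, 2, -2)) == "coincide"
assert planes_position((1, 1, 1, -1), (1, 1, 1, 0)) == "parallel and disjoint"
assert planes_position((1, 0, 0, 0), (0, 1, 0, 0)) == "intersect in a line"
assert line_plane_position((0, 0, 1), (1, 0, 0), (0, 0, 1, 0)) == "parallel and disjoint"
```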

The following result gives conditions under which two lines will be coplanar.


Proposition 10.4 Let $\ell$ and $\ell_1$ be two lines in $\mathbf{A}$ and suppose that $\ell$ has Cartesian equations (10.10) and $\ell_1$ has equations

$$\begin{aligned} A_1X + B_1Y + C_1Z + D_1 &= 0 \\ A'_1X + B'_1Y + C'_1Z + D'_1 &= 0. \end{aligned} \tag{10.27}$$

Let $Q(a, b, c) \in \ell$ and $Q_1(a_1, b_1, c_1) \in \ell_1$, and let $\mathbf{v}(l, m, n)$ and $\mathbf{v}_1(l_1, m_1, n_1)$ be direction vectors for $\ell$ and $\ell_1$ respectively. The following conditions are equivalent:

1) $\ell$ and $\ell_1$ are coplanar;

2) $$\begin{vmatrix} a - a_1 & b - b_1 & c - c_1 \\ l & m & n \\ l_1 & m_1 & n_1 \end{vmatrix} = 0;$$

3) $$\begin{vmatrix} A & B & C & D \\ A' & B' & C' & D' \\ A_1 & B_1 & C_1 & D_1 \\ A'_1 & B'_1 & C'_1 & D'_1 \end{vmatrix} = 0.$$

Proof (1) ⇒ (2) If $\ell$ and $\ell_1$ are parallel, then $\mathbf{v}$ and $\mathbf{v}_1$ are proportional, and so (2) is satisfied. If $\ell$ and $\ell_1$ are incident then they are contained in the same plane $\mathcal{P}$; the vector $\overrightarrow{QQ_1}$ belongs to the vector subspace associated to $\mathcal{P}$, which is generated by $\mathbf{v}$ and $\mathbf{v}_1$, and again (2) is satisfied.

(2) ⇒ (1) If (2) is satisfied, then either the last two rows of the matrix

$$\begin{pmatrix} a - a_1 & b - b_1 & c - c_1 \\ l & m & n \\ l_1 & m_1 & n_1 \end{pmatrix} \tag{10.28}$$

are proportional and the lines are parallel, or the first row is a linear combination of the other two, and the plane $\mathcal{P}$ with associated vector subspace $\langle \mathbf{v}, \mathbf{v}_1 \rangle$ passing through $Q$ and containing $\ell$ contains also $Q_1$, and so contains $\ell_1$. In either case the lines are coplanar.

(3) ⇒ (1) If the system comprised of (10.10) and (10.27) is compatible, then $\ell$ and $\ell_1$ are coplanar because they have a point in common.


Suppose that this system is not compatible. Then, since the rank $R$ of the matrix

$$\begin{pmatrix} A & B & C & D \\ A' & B' & C' & D' \\ A_1 & B_1 & C_1 & D_1 \\ A'_1 & B'_1 & C'_1 & D'_1 \end{pmatrix} \tag{10.29}$$

is at most 3 (by (3)), and the rank $r$ of the matrix

$$\begin{pmatrix} A & B & C \\ A' & B' & C' \\ A_1 & B_1 & C_1 \\ A'_1 & B'_1 & C'_1 \end{pmatrix} \tag{10.30}$$

is at least two, we must have $R = 3$ and $r = 2$. Now, $r = 2$ implies that the last two rows of (10.30) are linear combinations of the first two. Thus $\ell$ and $\ell_1$ have the same direction; that is, they are parallel, and hence coplanar.

(1) ⇒ (3) If $\ell$ and $\ell_1$ are incident, then the system consisting of (10.10) and (10.27) is compatible, and so (10.29) has rank less than 4. If instead $\ell$ and $\ell_1$ are parallel, then (10.30) has rank 2 and so (10.29) has rank at most 3. In either case (3) holds. □

Let $\ell$ be a line in $\mathbf{A}$. The set $\Phi$ of planes in $\mathbf{A}$ which contain $\ell$ is called a proper pencil of planes, and $\ell$ is called the axis of the pencil (Fig. 10.2).
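Condition (2) of Proposition 10.4 is the easiest of the three to test when both lines are given by a point and a direction vector. The following is a sketch with illustrative names (`det3`, `coplanar`); exact integer or rational coordinates are assumed so that the comparison with zero is meaningful.

```python
def det3(m):
    """Determinant of a 3 x 3 matrix, expanded along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def coplanar(q, v, q1, v1):
    """True if the lines (q, v) and (q1, v1) satisfy condition (2)."""
    first_row = [x - y for x, y in zip(q, q1)]   # (a - a1, b - b1, c - c1)
    return det3([first_row, list(v), list(v1)]) == 0


# Two incident lines through the origin.
assert coplanar((0, 0, 0), (1, 0, 0), (0, 0, 0), (0, 1, 0))
# Two distinct parallel lines.
assert coplanar((0, 0, 0), (1, 0, 0), (0, 0, 1), (2, 0, 0))
# Skew lines: the X-axis and a line parallel to the Y-axis at height 1.
assert not coplanar((0, 0, 0), (1, 0, 0), (0, 0, 1), (0, 1, 0))
```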

Consider two distinct planes $\mathcal{P}$ and $\mathcal{P}_1$ in $\Phi$, with respective equations

$$\begin{aligned} AX + BY + CZ + D &= 0 \\ A_1X + B_1Y + C_1Z + D_1 &= 0. \end{aligned}$$

As in the case of pencils of lines, one can show that every plane belonging to $\Phi$ has an equation of the form

$$\lambda(AX + BY + CZ + D) + \mu(A_1X + B_1Y + C_1Z + D_1) = 0$$

for some $\lambda, \mu \in \mathbf{K}$ not both zero. In this case too, we can use a non-homogeneous parameter $t$ to represent the planes of the pencil $\Phi$ in the form

$$AX + BY + CZ + D + t(A_1X + B_1Y + C_1Z + D_1) = 0,$$

remembering though, that the plane $\mathcal{P}_1$ is the one plane which cannot be put in this form.

Let $\Phi$ be the pencil of planes with axis $\ell$ and let $\mathcal{Q}$ be any plane not parallel to $\ell$. The planes belonging to $\Phi$ intersect $\mathcal{Q}$ in the pencil of lines in $\mathcal{Q}$ with centre the point $\ell \cap \mathcal{Q}$ (Fig. 10.3).
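One common use of the pencil equation is to find the plane of $\Phi$ passing through a given point $P$ not on the axis. A minimal sketch, under the assumption that planes are stored as coefficient tuples and with the illustrative names `evaluate` and `pencil_plane_through`: taking $\lambda$ to be the value of the second equation at $P$ and $\mu$ minus the value of the first makes the combination vanish at $P$.

```python
def evaluate(plane, p):
    """Value of AX + BY + CZ + D at the point p."""
    A, B, C, D = plane
    return A * p[0] + B * p[1] + C * p[2] + D


def pencil_plane_through(plane, plane1, p):
    """Plane of the pencil generated by plane and plane1 that contains p."""
    lam = evaluate(plane1, p)
    mu = -evaluate(plane, p)
    if lam == 0 and mu == 0:
        # p lies on both planes, hence on the axis of the pencil.
        raise ValueError("p lies on the axis; every plane of the pencil contains it")
    return tuple(lam * x + mu * y for x, y in zip(plane, plane1))


# Pencil with axis the Z-axis, generated by X = 0 and Y = 0.
result = pencil_plane_through((1, 0, 0, 0), (0, 1, 0, 0), (1, 1, 5))
assert result == (1, -1, 0, 0)                 # the plane X - Y = 0
assert evaluate(result, (1, 1, 5)) == 0        # it contains P
assert evaluate(result, (0, 0, 7)) == 0        # and a point of the axis
```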

Let $\mathbf{W}$ be a subspace of $\mathbf{V}$ of dimension 2. The set of planes in $\mathbf{A}$ having associated vector subspace $\mathbf{W}$ is called an improper pencil of planes and $\mathbf{W}$ is called the vector subspace associated to the pencil. If $\mathcal{Q}$ is a plane in the improper pencil with equation $AX + BY + CZ + D = 0$, then the other planes in the pencil have equations

$$AX + BY + CZ + t = 0$$

as $t$ varies.

Example 10.5 Let $\ell$ and $\ell'$ be a pair of skew lines in $\mathbf{A}$, and let $P \in \mathbf{A}$ be a point not belonging to $\ell \cup \ell'$ (Fig. 10.4).

There is a unique line through $P$ which is coplanar with both $\ell$ and $\ell'$. To see this, note that there is a unique plane $\mathcal{P}$ passing through $P$ and containing $\ell$, and similarly, there is a unique plane $\mathcal{P}'$ passing through $P$ and containing $\ell'$. Since $\mathcal{P}$ and $\mathcal{P}'$ are distinct (otherwise $\ell$ and $\ell'$ would be coplanar), and intersect at $P$, the set $\mathcal{P} \cap \mathcal{P}'$ is a line passing through $P$ which, by construction, is coplanar with both $\ell$ and $\ell'$. The uniqueness of this line follows from the uniqueness of $\mathcal{P}$ and $\mathcal{P}'$.

Note that the line we have just constructed is not parallel to both $\ell$ and $\ell'$, for otherwise $\ell$ and $\ell'$ would be parallel to each other. Thus this line meets at least one of $\ell$ and $\ell'$.

EXERCISES

10.1 Establish which of the following triples of points in $\mathbf{A}^3(\mathbf{C})$ are collinear:
a) $\{(2, 1, -3), (1, -1, 2), (3/2, 0, -1/2)\}$
b) $\{(1, 1, 1), (2, -1, 3), (2, 1, -5)\}$
c) $\{(i, 0, 0), (1 + i, 2i, 1), (1, 2, -i)\}$
d) $\{(1, 0, 0), (2, -1, -1), (-2, -2, 1)\}$
e) $\{(1, 0, -1), (2, 1, 2), (-1, -1, 3)\}$
f) $\{(1 - i, i, 2), (3, 6i, -3), (2 - i, 3i, 1)\}$.

10.2 In each of the following, find the value (if one exists) of the real parameter $m$ for which the triple of points is collinear in $\mathbf{A}^3(\mathbf{R})$:
a) $\{(2, -1, 2), (1, 1, 1), (4, -m + 1, 4)\}$;
b) $\{(3, 0, 0), (0, 1, 1), (m, m, m)\}$;
c) $\{(1, -m, 0), (m, 1, 1), (1, -1, -3)\}$;
d) $\{(1, m, 0), (2, \sqrt{2}, 1), (2, 110, 1)\}$.

10.3 After checking, for each of the following, that the points are not collinear, find parametric and Cartesian equations for the planes determined by the points
a) {(2, ^, 1), (1, 1, √2), (0, 0, 1)};
b) $\{(5, -1, 0), (1, 1, \sqrt{5}), (-3, 1, \pi/2)\}$;
c) $\{(1, 1, 1), (-2, 1, 0), (2, 2, 2)\}$;
d) $\{(1, 1000, 0), (3.55, 2\pi, 0), (1, 105, 0)\}$.

10.4 In each of the following, find a Cartesian equation of the plane in $\mathbf{A}^3(\mathbf{C})$ passing through $Q$ and parallel to the plane $\mathcal{P}$:
a) $Q = (-1, 2, 2)$, $\mathcal{P}: X + 2Y + 3Z + 1 = 0$;
b) $Q = (i, i, i)$, $\mathcal{P}: 2X - Y = 0$;
c) $Q = (1, i, i + 1)$, $\mathcal{P}: iY - 2Z + 3i + 10 = 0$;
d) Q = (1 − 2i, 1, *r*), $\mathcal{P}: iY = 3$.

10.5 For each of the following, determine whether or not the three planes belong to the same pencil.
a) $X - Y + Z = 0$, $-X + 3Y - 5Z + 2 = 0$, $Y - 2Z + 1 = 0$
b) $2X - 3Y + 3 = 0$, $X - Y + 6 = 0$, $X - 3Z = -1$
c) $X - 5Y + 1 = 0$, $X - 5Y = 0$, $2X + Z = 0$
d) $X - Y + Z + 5 = 0$, $2X - 2Y + 2Z + 77 = 0$, $-X + Y - Z = 0$.

10.6 For each of the following, find parametric and Cartesian equations for the line in $A^3(\mathbf{R})$ passing through the point $Q$ and parallel to the vector $\mathbf{v}$.
a) $Q = (1, 1, 0)$, $\mathbf{v} = (2, -1, \sqrt{2})$
b) $Q = (-2, 2, -2)$, $\mathbf{v} = (1, 1, 0)$
c) $Q = (1, 2, 3)$, $\mathbf{v} = (1, 2, 3)$
d) $Q = (0, 0, 0)$, $\mathbf{v} = (1, 0, 0)$
e) $Q = (1, 1, 0)$, $\mathbf{v} = (1, 1, -1)$.

10.7

Find param etric equations for each of the following lines in A 3(C). a) X - i Y = 0, 2Y + Z + 1 = 0 b) 3 * + Z - 1 = 0, Y + Z - 5 = 0 c) X -1=0, Z - 1= 0 d) 2 i X - (i + 2 ) Y + Z - 3 + i = 0, Z + i Y = 2 i.

10.8

For each of the following, find param etric equations for the line in A 3(C ) passing through the point Q parallel to the line £. a) b) c) d)

10.9

Q Q Q Q

= = = =

(1,1,0), (1,0,0), (2,1, - 5 ) , (3,0,0),

£:X-iY = 0, Z + l = 0 £:X + 2 Y - 1 = 0 , X = 2 k Y = 2 , X = iZ + 7 l-.ZX - Y - Z + \ = 0, X - 5 F + s/2 Z - 7000 = 0.

In each of the following, find a Cartesian equation of the plane in A 3(R ) passing through Q and parallel to the lines I and a)

Q = ( l,- l,-2 ) ,

b)

Q = (0,1,3),

c)

Q = (3,3,3),

£ : X - Y = 1, X + Z = 5 £': X = 1, Z = 2 £: X + Y = —5, X — Y -\-2 Z = 0 £' : 2 X — 2 Y = X — Y + 2Z = I £ : X - 2 Y = -1 , X + Z = - 1 £ ': 2 X - 2 Z = 1, X - 2 Y = - 1 .

10.10 In each of the following, determine whether the lines $\ell$ and $\ell'$ are skew or coplanar. If they are coplanar, find whether they are incident or parallel, and then, after checking that they are distinct, find a Cartesian equation for the plane containing them.
a) $\ell : x = 1 + t,\ y = -t,\ z = 2 + 2t$, $\quad \ell' : x = 2 - t,\ y = -1 + 3t,\ z = t$
b) $\ell : 2X + Y + 1 = 0,\ Y - Z = 2$, $\quad \ell' : x = 2 - t,\ y = 3 + 2t,\ z = 1$
c) $\ell : 2X + 3Y - Z = 0,\ 5X + 2Z - 1 = 0$, $\quad \ell' : 3X - 3Y + 3Z - 1 = 0,\ 5X + 2Z + 1 = 0$
d) $\ell : 2X + Z - 1 = 0,\ Y - Z + 1 = 0$, $\quad \ell' : 2X - Y + 3Z = 0,\ 2X + Y - 3 = 0$
e) $\ell : X + 1 = 0,\ Z - 2 = 0$, $\quad \ell' : 2X + Y - 2Z + 6 = 0,\ Y + Z - 2 = 0$.

10.11 In each of the following, find the relative positions of the line $\ell$ and the plane $\mathcal{P}$ in $A^3(\mathbf{R})$, and, if they are incident, determine the point of intersection.
a) $\ell : x = 1 + t,\ y = 2 - 2t,\ z = 1 - 4t$, $\quad \mathcal{P} : 2X - Y + Z - 1 = 0$
b) $\ell : x = 2 - t,\ y = 1 + 2t,\ z = -1 + 3$ [...]

10.12 [...]

[...] A map $F : V \to W$ is said to be linear if for every $\mathbf{v}, \mathbf{v}' \in V$ and $c \in \mathbf{K}$ one has

$$F(\mathbf{v} + \mathbf{v}') = F(\mathbf{v}) + F(\mathbf{v}') \tag{11.1}$$

$$F(c\mathbf{v}) = cF(\mathbf{v}). \tag{11.2}$$

The two properties (11.1) and (11.2) are equivalent to the single property

$$F(c\mathbf{v} + c'\mathbf{v}') = cF(\mathbf{v}) + c'F(\mathbf{v}') \tag{11.3}$$

for every $\mathbf{v}, \mathbf{v}' \in V$ and $c, c' \in \mathbf{K}$. Indeed, (11.3) becomes (11.1) by putting $c = c' = 1$, while it becomes (11.2) if $c' = 0$; conversely, if $F$ satisfies (11.1) and (11.2) then

$$F(c\mathbf{v} + c'\mathbf{v}') = F(c\mathbf{v}) + F(c'\mathbf{v}') = cF(\mathbf{v}) + c'F(\mathbf{v}'),$$

the first equality following from (11.1) and the second from (11.2). By applying (11.3) repeatedly, it is easy to see that if $F$ is linear, then for every $\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_n \in V$ and $c_1, c_2, \dots, c_n \in \mathbf{K}$ one has

$$F(c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \dots + c_n\mathbf{v}_n) = c_1F(\mathbf{v}_1) + c_2F(\mathbf{v}_2) + \dots + c_nF(\mathbf{v}_n). \tag{11.4}$$

Notice that (11.2) with $c = 0$ implies that for $F$ linear, $F(\mathbf{0}) = \mathbf{0}$.
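As a quick sanity check of property (11.3), the following sketch tests it on sample vectors with exact rational arithmetic. The maps `F` and `G` and the helper names are hypothetical illustrations, not taken from the text; a finite test of this kind can refute linearity but cannot prove it.

```python
from fractions import Fraction

def add(v, w):
    return tuple(a + b for a, b in zip(v, w))

def scale(c, v):
    return tuple(c * a for a in v)

def satisfies_11_3(F, v, w, c, d):
    """Check F(cv + dw) == cF(v) + dF(w) for one choice of v, w, c, d."""
    return F(add(scale(c, v), scale(d, w))) == add(scale(c, F(v)), scale(d, F(w)))

def F(v):
    """A map Q^2 -> Q^3 given by homogeneous linear polynomials (linear)."""
    x, y = v
    return (x + 2 * y, 3 * x, -y)

def G(v):
    """A translation of Q^2: not linear, since G(0) != 0."""
    x, y = v
    return (x + 1, y)

ok_F = satisfies_11_3(F, (1, 2), (-3, 5), Fraction(1, 2), 4)
ok_G = satisfies_11_3(G, (1, 2), (-3, 5), 1, 1)
print(ok_F, ok_G)  # True False
```

The failure of `G` is consistent with the remark that a linear map must send the zero vector to the zero vector.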


A linear map $F : V \to V$ is called a linear operator on $V$, or an endomorphism of $V$. A linear map $F : V \to \mathbf{K}$ is called a linear functional on $V$. If $G : U \to V$ and $F : V \to W$ are linear maps, then so is their composite $F \circ G : U \to W$. The proof is immediate, and is left to the reader.

A linear map $F : V \to W$ is called an isomorphism if it is bijective. The inverse $F^{-1} : W \to V$ of an isomorphism is also linear and so is itself an isomorphism. Indeed, let $\mathbf{w}, \mathbf{w}' \in W$ and let $\mathbf{v} = F^{-1}(\mathbf{w})$, $\mathbf{v}' = F^{-1}(\mathbf{w}')$. Then

$$F^{-1}(\mathbf{w} + \mathbf{w}') = F^{-1}(F(\mathbf{v}) + F(\mathbf{v}')) = F^{-1}(F(\mathbf{v} + \mathbf{v}')) = \mathbf{v} + \mathbf{v}' = F^{-1}(\mathbf{w}) + F^{-1}(\mathbf{w}').$$

Similarly, for $c \in \mathbf{K}$,

$$F^{-1}(c\mathbf{w}) = F^{-1}(cF(\mathbf{v})) = F^{-1}(F(c\mathbf{v})) = c\mathbf{v} = cF^{-1}(\mathbf{w}).$$

An isomorphism of a space $V$ to itself is said to be an automorphism of $V$.

In the following, $\mathrm{Hom}(V, W)$ will denote the set of all linear maps from $V$ to $W$, $\mathrm{End}(V)$ will denote the set of all linear operators on $V$ (endomorphisms of $V$), and $V^*$ will denote the set of all linear functionals on $V$. Finally, $\mathrm{GL}(V)$ will denote the subset of $\mathrm{End}(V)$ consisting of all automorphisms of $V$ (GL stands for 'General Linear').

Examples 11.1

1. For every $V$ and $W$ the zero map $\mathbf{0} : V \to W$ defined by $\mathbf{0}(\mathbf{v}) = \mathbf{0} \in W$ for all $\mathbf{v} \in V$ is a linear map. In every vector space $V$ the identity map $1_V : V \to V$, which sends every vector to itself, is an automorphism. If $F, G \in \mathrm{GL}(V)$ then $F^{-1}$ and $F \circ G$ also belong to $\mathrm{GL}(V)$. Let $c \in \mathbf{K}$. The map $c1_V$ defined by

$$(c1_V)(\mathbf{v}) = c\mathbf{v}$$

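is linear. If $c = 0$ then $c1_V = \mathbf{0}$. If $c \neq 0$ then $c1_V$ is an automorphism whose inverse is $c^{-1}1_V$.

If $\dim(V) = 1$ the only linear operators on $V$ are of the form $c1_V$. For suppose $F \in \mathrm{End}(V)$ and $\mathbf{e} \in V$, $\mathbf{e} \neq \mathbf{0}$; then $\{\mathbf{e}\}$ is a basis of $V$, so $F(\mathbf{e}) = c\mathbf{e}$ for some $c \in \mathbf{K}$, and so for every $\mathbf{v} = x\mathbf{e} \in V$,

$$F(\mathbf{v}) = F(x\mathbf{e}) = xF(\mathbf{e}) = x(c\mathbf{e}) = c\mathbf{v}.$$

2. Let $\boldsymbol{e} = \{\mathbf{e}_1, \dots, \mathbf{e}_n\}$ be a basis of the $\mathbf{K}$-vector space $V$, and let $\phi_{\boldsymbol{e}} : V \to \mathbf{K}^n$ be the map defined by

$$\phi_{\boldsymbol{e}}(x_1\mathbf{e}_1 + \dots + x_n\mathbf{e}_n) = (x_1, \dots, x_n).$$

That is, $\phi_{\boldsymbol{e}}$ is the map which associates to every vector $\mathbf{v} = x_1\mathbf{e}_1 + \dots + x_n\mathbf{e}_n \in V$ the $n$-tuple $(x_1, \dots, x_n)$ of its coordinates with respect to the basis $\boldsymbol{e}$. Now $\phi_{\boldsymbol{e}}$ is an isomorphism, since, for every $c \in \mathbf{K}$, $\mathbf{v} = x_1\mathbf{e}_1 + \dots + x_n\mathbf{e}_n$ and $\mathbf{v}' = y_1\mathbf{e}_1 + \dots + y_n\mathbf{e}_n \in V$ one has

$$\phi_{\boldsymbol{e}}(\mathbf{v} + \mathbf{v}') = \phi_{\boldsymbol{e}}((x_1 + y_1)\mathbf{e}_1 + \dots + (x_n + y_n)\mathbf{e}_n) = (x_1 + y_1, \dots, x_n + y_n) = (x_1, \dots, x_n) + (y_1, \dots, y_n) = \phi_{\boldsymbol{e}}(\mathbf{v}) + \phi_{\boldsymbol{e}}(\mathbf{v}'),$$

$$\phi_{\boldsymbol{e}}(c\mathbf{v}) = (cx_1, \dots, cx_n) = c(x_1, \dots, x_n) = c\phi_{\boldsymbol{e}}(\mathbf{v}),$$

and so $\phi_{\boldsymbol{e}}$ is linear. Moreover, by the properties of the coordinates of a vector, $\phi_{\boldsymbol{e}}$ is bijective. $\phi_{\boldsymbol{e}}$ is the isomorphism defined by the basis $\boldsymbol{e}$. In the case that $V = \mathbf{K}^n$ and $\boldsymbol{e}$ is the canonical basis, $\phi_{\boldsymbol{e}}$ is the identity on $\mathbf{K}^n$.

3. Let $V$ be a $\mathbf{K}$-vector space and let $U$ and $W$ be complementary subspaces of $V$, that is suppose

$$V = U \oplus W.$$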

(11.5)

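Since every $\mathbf{v} \in V$ can be expressed in a unique way as $\mathbf{v} = \mathbf{u} + \mathbf{w}$, with $\mathbf{u} \in U$ and $\mathbf{w} \in W$, we can define a map

$$p : V \to W$$

by putting

$$p(\mathbf{u} + \mathbf{w}) = \mathbf{w}.$$

$p$ is called the projection of $V$ onto $W$ defined by the decomposition (11.5). The projection $p$ is a linear map. Indeed, if $\mathbf{v} = \mathbf{u} + \mathbf{w}$, $\mathbf{v}' = \mathbf{u}' + \mathbf{w}'$ and $c, c' \in \mathbf{K}$ then

$$p(c\mathbf{v} + c'\mathbf{v}') = p(c\mathbf{u} + c\mathbf{w} + c'\mathbf{u}' + c'\mathbf{w}') = p(c\mathbf{u} + c'\mathbf{u}' + c\mathbf{w} + c'\mathbf{w}') = c\mathbf{w} + c'\mathbf{w}' = cp(\mathbf{v}) + c'p(\mathbf{v}').$$

In particular, if $W$ is a hyperplane then $U = \langle \mathbf{u} \rangle$ for some $\mathbf{u} \in V \setminus W$, and in this case $p$ is called the projection of $V$ onto $W$ in the direction $\langle \mathbf{u} \rangle$. If $\{\mathbf{e}_1, \dots, \mathbf{e}_n\}$ is a basis of $V$ and $1 \le k < n$, then the projection of $V$ onto $\langle \mathbf{e}_{k+1}, \dots, \mathbf{e}_n \rangle$ defined by the decomposition $V = \langle \mathbf{e}_1, \dots, \mathbf{e}_k \rangle \oplus \langle \mathbf{e}_{k+1}, \dots, \mathbf{e}_n \rangle$ is given by

$$p(c_1\mathbf{e}_1 + \dots + c_n\mathbf{e}_n) = c_{k+1}\mathbf{e}_{k+1} + \dots + c_n\mathbf{e}_n.$$

If $\pi$ is the ordinary plane, $\ell$ a line in $\pi$, and $V$ and $W$ are the real vector spaces of geometric vectors of $\pi$ and of $\ell$ respectively, and $\mathbf{u} \in V \setminus W$ is a vector not parallel to $\ell$, then the projection of $V$ onto $W$ in the direction $\langle \mathbf{u} \rangle$ is the map illustrated in Fig. 11.1.

Fig. 11.1

In a similar way, given any plane $\pi$ in ordinary space $S$, letting $V$ and $W$ be the spaces of geometric vectors of $S$ and $\pi$ respectively, and given any $\mathbf{u} \in V \setminus W$, the projection of $V$ onto $W$ in the direction $\langle \mathbf{u} \rangle$ is depicted in Fig. 11.2.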
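The planar case of this projection can be sketched as follows. This is a minimal illustration assuming $V = \mathbf{Q}^2$, $U = \langle \mathbf{u} \rangle$ and $W = \langle \mathbf{w} \rangle$ with $\mathbf{u}$, $\mathbf{w}$ linearly independent; the function name and the sample vectors are hypothetical. The vector is decomposed as $\mathbf{v} = a\mathbf{u} + b\mathbf{w}$ by Cramer's rule and the $W$-component is returned.

```python
from fractions import Fraction

def project_onto_W(v, u, w):
    """Projection of v onto <w> in the direction <u>, for vectors of Q^2.

    Solves v = a*u + b*w by Cramer's rule and returns b*w.
    """
    det = u[0] * w[1] - u[1] * w[0]
    if det == 0:
        raise ValueError("u and w must be linearly independent")
    b = Fraction(u[0] * v[1] - u[1] * v[0], det)
    return (b * w[0], b * w[1])

u, w = (1, 1), (1, 0)      # hypothetical direction and target line
v = (3, 2)                 # v = 2u + 1w
pv = project_onto_W(v, u, w)
print(pv)                  # the W-component of v, namely w itself
```

Applying the function twice gives the same result as applying it once, and it sends $\mathbf{u}$ to the zero vector, as expected of a projection along $\langle \mathbf{u} \rangle$.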


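The projections described here for vectors in the ordinary plane and space are based on the same geometrical principle as the projection of an affine space onto a subspace defined at the end of Chapter 8.

4. Let $V$ be a $\mathbf{K}$-vector space and $\boldsymbol{e} = \{\mathbf{e}_1, \dots, \mathbf{e}_n\}$ be a basis for $V$. For each $i = 1, \dots, n$ define

$$\eta_i : V \to \mathbf{K}$$

by

$$\eta_i(c_1\mathbf{e}_1 + \dots + c_n\mathbf{e}_n) = c_i,$$

that is, associating to each vector its $i$-th coordinate with respect to $\boldsymbol{e}$. It is easy to see that $\eta_i$ is a linear functional on $V$. Indeed, given any

$$\mathbf{v} = c_1\mathbf{e}_1 + \dots + c_n\mathbf{e}_n, \qquad \mathbf{v}' = d_1\mathbf{e}_1 + \dots + d_n\mathbf{e}_n$$

in $V$ and $k, k' \in \mathbf{K}$, one has

$$\eta_i(k\mathbf{v} + k'\mathbf{v}') = \eta_i((kc_1 + k'd_1)\mathbf{e}_1 + \dots + (kc_n + k'd_n)\mathbf{e}_n) = kc_i + k'd_i = k\eta_i(\mathbf{v}) + k'\eta_i(\mathbf{v}').$$

[...]

5. [...] define $p : V \to V$ by $p(\mathbf{u} + \mathbf{w}) = \mathbf{u} - \mathbf{w}$ for every $\mathbf{v} = \mathbf{u} + \mathbf{w}$, with $\mathbf{u} \in U$, $\mathbf{w} \in W$.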

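It is straightforward to check that $p$ is a linear operator with the following properties:

$$p(\mathbf{u}) = \mathbf{u} \quad \forall\, \mathbf{u} \in U, \qquad p \circ p = 1_V.$$

The second property implies that $p$ is invertible, and so $p \in \mathrm{GL}(V)$. If $\{\mathbf{e}_1, \dots, \mathbf{e}_n\}$ is a basis for $V$ such that

$$U = \langle \mathbf{e}_1, \dots, \mathbf{e}_k \rangle, \qquad W = \langle \mathbf{e}_{k+1}, \dots, \mathbf{e}_n \rangle,$$

for some $k$ with $1 \le k < n$, then

$$p(c_1\mathbf{e}_1 + \dots + c_n\mathbf{e}_n) = c_1\mathbf{e}_1 + \dots + c_k\mathbf{e}_k - c_{k+1}\mathbf{e}_{k+1} - \dots - c_n\mathbf{e}_n.$$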

6. Let $V$ be a vector space over $\mathbf{K}$, $U$ a subspace, and consider the quotient space $V/U$. The map

$$p : V \to V/U$$

defined by $p(\mathbf{v}) = \mathbf{v} + U$ is linear. It is called the natural projection of $V$ onto $V/U$. It is left to the reader to check that $p$ is indeed linear.

7. Let $U$ be a subspace of the $\mathbf{K}$-vector space $V$. The inclusion $i$ of $U$ into $V$ is a linear map. This too is left to the reader to check.

Proposition 11.2 Let $V$ and $W$ be $\mathbf{K}$-vector spaces, and $F : V \to W$ a linear map. Let $\mathbf{v}_1, \dots, \mathbf{v}_n \in V$, and $\mathbf{w}_i = F(\mathbf{v}_i)$ for $i = 1, \dots, n$. If $\mathbf{v}_1, \dots, \mathbf{v}_n$ are linearly dependent then $\mathbf{w}_1, \dots, \mathbf{w}_n$ are also linearly dependent. Equivalently, if $\mathbf{w}_1, \dots, \mathbf{w}_n$ are linearly independent then so too are $\mathbf{v}_1, \dots, \mathbf{v}_n$.

Proof We prove the first statement. Let $c_1, \dots, c_n \in \mathbf{K}$ be scalars which are not all 0, satisfying

$$c_1\mathbf{v}_1 + \dots + c_n\mathbf{v}_n = \mathbf{0}.$$

Then

$$c_1\mathbf{w}_1 + \dots + c_n\mathbf{w}_n = c_1F(\mathbf{v}_1) + \dots + c_nF(\mathbf{v}_n) = F(c_1\mathbf{v}_1 + \dots + c_n\mathbf{v}_n) = F(\mathbf{0}) = \mathbf{0},$$

and so $\mathbf{w}_1, \dots, \mathbf{w}_n$ are linearly dependent. □



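Theorem 11.3 Let $V$ and $W$ be two $\mathbf{K}$-vector spaces, $\boldsymbol{e} = \{\mathbf{e}_1, \mathbf{e}_2, \dots, \mathbf{e}_n\}$ a basis for $V$ and $\mathbf{w}_1, \mathbf{w}_2, \dots, \mathbf{w}_n$ arbitrary vectors of $W$. Then there is a unique linear map $F : V \to W$ such that

$$F(\mathbf{e}_i) = \mathbf{w}_i, \qquad i = 1, 2, \dots, n.$$

Proof If $F$ exists it is unique, because for every $\mathbf{v} = x_1\mathbf{e}_1 + \dots + x_n\mathbf{e}_n \in V$ one has, by (11.4), that

$$F(\mathbf{v}) = x_1F(\mathbf{e}_1) + \dots + x_nF(\mathbf{e}_n) = x_1\mathbf{w}_1 + \dots + x_n\mathbf{w}_n \tag{11.6}$$

and the coefficients $x_1, \dots, x_n$ are uniquely determined as $\boldsymbol{e}$ is a basis. It thus suffices to show that the map defined by (11.6) is linear. First we check condition (11.1). If

$$\mathbf{v} = x_1\mathbf{e}_1 + \dots + x_n\mathbf{e}_n, \qquad \mathbf{v}' = y_1\mathbf{e}_1 + \dots + y_n\mathbf{e}_n$$

are elements of $V$ then

$$F(\mathbf{v} + \mathbf{v}') = (x_1 + y_1)\mathbf{w}_1 + \dots + (x_n + y_n)\mathbf{w}_n = (x_1\mathbf{w}_1 + \dots + x_n\mathbf{w}_n) + (y_1\mathbf{w}_1 + \dots + y_n\mathbf{w}_n) = F(\mathbf{v}) + F(\mathbf{v}').$$

Now check (11.2). Let $c \in \mathbf{K}$ and $\mathbf{v} = x_1\mathbf{e}_1 + \dots + x_n\mathbf{e}_n$. Then

$$F(c\mathbf{v}) = cx_1\mathbf{w}_1 + \dots + cx_n\mathbf{w}_n = c(x_1\mathbf{w}_1 + \dots + x_n\mathbf{w}_n) = cF(\mathbf{v}).$$

Thus $F$ is linear. □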
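Formula (11.6) is effectively a recipe: once the images of the basis vectors are chosen, the map is determined. The sketch below assumes $V = \mathbf{Q}^n$ with its canonical basis, so that the coordinates of a vector are its entries; the function name and the sample images are hypothetical.

```python
def linear_map_from_images(images):
    """Return the map F with F(e_i) = images[i], following formula (11.6).

    `images` lists the vectors w_1, ..., w_n, all of the same length m.
    The returned F takes the coordinate tuple (x_1, ..., x_n) of a vector.
    """
    m = len(images[0])

    def F(x):
        # F(x) = x_1 w_1 + ... + x_n w_n, computed coordinate by coordinate
        return tuple(sum(xi * w[j] for xi, w in zip(x, images)) for j in range(m))

    return F

# Hypothetical choice of images w_1, w_2 in Q^3
F = linear_map_from_images([(1, 0, 2), (0, 1, -1)])
print(F((1, 0)), F((0, 1)), F((3, -2)))  # (1, 0, 2) (0, 1, -1) (3, -2, 8)
```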


Definition 11.4 Let $F : V \to W$ be a linear map. The kernel of $F$ is

$$\ker(F) = \{\mathbf{v} \in V \mid F(\mathbf{v}) = \mathbf{0}\},$$

that is, it is the subset of $V$ consisting of all vectors which are mapped to $\mathbf{0} \in W$ by $F$. The image of $F$ is the subset of $W$

$$\mathrm{Im}(F) = \{\mathbf{w} = F(\mathbf{v}) \mid \mathbf{v} \in V\}.$$

The sets $\ker(F)$ and $\mathrm{Im}(F)$ are vector subspaces of $V$ and $W$ respectively (the proof of this is left to the reader). If they are finite dimensional then the dimension of $\mathrm{Im}(F)$ is called the rank of $F$, denoted $\mathrm{rk}(F)$, and the dimension of $\ker(F)$ is called the nullity of $F$. From (11.4) it follows that if $\{\mathbf{e}_1, \dots, \mathbf{e}_n\}$ is a basis for $V$ then

$$\mathrm{Im}(F) = \langle F(\mathbf{e}_1), \dots, F(\mathbf{e}_n) \rangle.$$

Proposition 11.5 A linear map $F : V \to W$ is injective if and only if $\ker(F) = \langle \mathbf{0} \rangle$.

Proof If $F$ is injective then $\mathbf{0}$ is the only element of $\ker(F)$, so $\ker(F) = \langle \mathbf{0} \rangle$. Conversely, suppose that $\ker(F) = \langle \mathbf{0} \rangle$. If $\mathbf{v}, \mathbf{v}'$ are such that $F(\mathbf{v}) = F(\mathbf{v}')$ then

$$F(\mathbf{v} - \mathbf{v}') = F(\mathbf{v}) - F(\mathbf{v}') = \mathbf{0}$$

and so $\mathbf{v} - \mathbf{v}' \in \ker(F)$. Since $\mathbf{0}$ is the only element of $\ker(F)$ it follows that $\mathbf{v} - \mathbf{v}' = \mathbf{0}$, or $\mathbf{v} = \mathbf{v}'$. Thus $F$ is injective. □

We now have a particularly important theorem.

Theorem 11.6 Let $F : V \to W$ be a linear map of $\mathbf{K}$-vector spaces, with $\dim(V) = n$. Then $\ker(F)$ and $\mathrm{Im}(F)$ are finite dimensional, and

$$\dim(\ker(F)) + \mathrm{rk}(F) = n.$$
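The dimension formula of Theorem 11.6 can be observed on a concrete map $\mathbf{Q}^4 \to \mathbf{Q}^3$ given by a matrix. The sketch below is an illustration only: the matrix and the helper names are hypothetical, and Gauss-Jordan elimination is used here as one standard way to obtain the rank and a basis of the kernel independently of each other.

```python
from fractions import Fraction

def rref(rows):
    """Reduced row echelon form over Q; returns (matrix, pivot column indices)."""
    A = [[Fraction(x) for x in r] for r in rows]
    pivots, r = [], 0
    for c in range(len(A[0])):
        piv = next((i for i in range(r, len(A)) if A[i][c] != 0), None)
        if piv is None:
            continue                      # no pivot in this column
        A[r], A[piv] = A[piv], A[r]
        p = A[r][c]
        A[r] = [x / p for x in A[r]]      # normalise the pivot row
        for i in range(len(A)):
            if i != r and A[i][c] != 0:
                f = A[i][c]
                A[i] = [a - f * b for a, b in zip(A[i], A[r])]
        pivots.append(c)
        r += 1
        if r == len(A):
            break
    return A, pivots

def kernel_basis(rows):
    """One basis vector of the kernel for each non-pivot (free) column."""
    R, pivots = rref(rows)
    n = len(rows[0])
    basis = []
    for f in (c for c in range(n) if c not in pivots):
        vec = [Fraction(0)] * n
        vec[f] = Fraction(1)
        for i, p in enumerate(pivots):
            vec[p] = -R[i][f]
        basis.append(tuple(vec))
    return basis

A = [[1, 2, 1, 0], [2, 4, 0, 2], [3, 6, 1, 2]]   # hypothetical; third row = first + second
rank = len(rref(A)[1])
kernel = kernel_basis(A)
print(rank, len(kernel))  # 2 2
```

Here the rank and the nullity are both 2, and their sum is the dimension 4 of the domain.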


Proof Since $V$ is finite dimensional and $\ker(F)$ is a subspace of $V$ it follows that $\ker(F)$ is also finite dimensional. Let $s = \dim(\ker(F))$, and fix a basis $\{\mathbf{n}_1, \dots, \mathbf{n}_s\}$ of $\ker(F)$. Let $\mathbf{v}_{s+1}, \dots, \mathbf{v}_n \in V$ be such that $\{\mathbf{n}_1, \dots, \mathbf{n}_s, \mathbf{v}_{s+1}, \dots, \mathbf{v}_n\}$ is a basis of $V$. To complete the proof, it is sufficient to show that $\{F(\mathbf{v}_{s+1}), \dots, F(\mathbf{v}_n)\}$ forms a basis for $\mathrm{Im}(F)$.

Every vector $\mathbf{w} \in \mathrm{Im}(F)$ is of the form

$$\mathbf{w} = F(a_1\mathbf{n}_1 + \dots + a_s\mathbf{n}_s + b_{s+1}\mathbf{v}_{s+1} + \dots + b_n\mathbf{v}_n) = a_1F(\mathbf{n}_1) + \dots + a_sF(\mathbf{n}_s) + b_{s+1}F(\mathbf{v}_{s+1}) + \dots + b_nF(\mathbf{v}_n) = b_{s+1}F(\mathbf{v}_{s+1}) + \dots + b_nF(\mathbf{v}_n),$$

for some scalars $a_1, \dots, a_s, b_{s+1}, \dots, b_n$. Thus $F(\mathbf{v}_{s+1}), \dots, F(\mathbf{v}_n)$ generate $\mathrm{Im}(F)$. Suppose $c_{s+1}, \dots, c_n \in \mathbf{K}$ are such that

$$c_{s+1}F(\mathbf{v}_{s+1}) + \dots + c_nF(\mathbf{v}_n) = \mathbf{0},$$

so that $c_{s+1}\mathbf{v}_{s+1} + \dots + c_n\mathbf{v}_n \in \ker(F)$. Since $\{\mathbf{n}_1, \dots, \mathbf{n}_s\}$ is a basis for $\ker(F)$, there are $d_1, \dots, d_s \in \mathbf{K}$ such that

$$c_{s+1}\mathbf{v}_{s+1} + \dots + c_n\mathbf{v}_n = d_1\mathbf{n}_1 + \dots + d_s\mathbf{n}_s,$$

i.e. such that

$$d_1\mathbf{n}_1 + \dots + d_s\mathbf{n}_s - c_{s+1}\mathbf{v}_{s+1} - \dots - c_n\mathbf{v}_n = \mathbf{0}.$$

However, $\mathbf{n}_1, \dots, \mathbf{n}_s, \mathbf{v}_{s+1}, \dots, \mathbf{v}_n$ are linearly independent and so all the coefficients are zero; in particular $c_{s+1} = \dots = c_n = 0$, and $F(\mathbf{v}_{s+1}), \dots, F(\mathbf{v}_n)$ are indeed linearly independent. □

Notice that the preceding theorem does not require that $W$ be finite dimensional.

Corollary 11.7 If $U$ is a subspace of a finite dimensional $\mathbf{K}$-vector space $V$, then

$$\dim(V/U) = \dim(V) - \dim(U).$$

(11.7)


Proof The natural projection $p : V \to V/U$ is surjective and has kernel $U$, so (11.7) follows from Theorem 11.6 applied to $p$. □

The following corollary provides a simple characterization of isomorphisms, and follows immediately from Theorem 11.6. Its proof is left to the reader.

Corollary 11.8 If $V$ and $W$ are $\mathbf{K}$-vector spaces of the same finite dimension, and $F : V \to W$ is a linear map, then the following conditions are equivalent:
1) $\ker(F) = \langle \mathbf{0} \rangle$;
2) $\mathrm{Im}(F) = W$;
3) $F$ is an isomorphism.

Theorem 11.9 (Homomorphism theorem for vector spaces) Let $F : V \to W$ be a linear map of $\mathbf{K}$-vector spaces. $F$ defines an isomorphism

$$F' : V/\ker(F) \to \mathrm{Im}(F)$$

such that

$$F = i \circ F' \circ p, \tag{11.8}$$

where $p : V \to V/\ker(F)$ is the natural projection, and $i : \mathrm{Im}(F) \to W$ is the inclusion.

Proof Let $\mathbf{v}_1, \mathbf{v}_2 \in V$. Now, $F(\mathbf{v}_1) = F(\mathbf{v}_2)$ if and only if $F(\mathbf{v}_2 - \mathbf{v}_1) = \mathbf{0}$, that is, if and only if $\mathbf{v}_2 - \mathbf{v}_1 \in \ker(F)$. In other words,

$$F(\mathbf{v}_1) = F(\mathbf{v}_2) \iff \mathbf{v}_2 \in [\mathbf{v}_1 + \ker(F)]. \tag{11.9}$$

We can therefore define a map $F' : V/\ker(F) \to \mathrm{Im}(F)$ by

$$F'(\mathbf{v} + \ker(F)) = F(\mathbf{v}).$$

$F'$ is bijective because it is obviously surjective, and by (11.9) it is also injective. The proof that $F'$ is linear and of (11.8) is left to the reader. □




Definition 11.10 Two $\mathbf{K}$-vector spaces $V$ and $W$ are said to be isomorphic if there is an isomorphism $V \to W$. One also says that $V$ is isomorphic to $W$.

Every vector space is isomorphic to itself, because $1_V$ is an isomorphism. If $V$ is isomorphic to $W$ then $W$ is isomorphic to $V$, because the inverse of an isomorphism is an isomorphism. Finally, since the composite of two isomorphisms is an isomorphism, if $V$ is isomorphic to $W$ and $W$ is isomorphic to $U$ then $V$ is isomorphic to $U$. Thus isomorphism is an equivalence relation between vector spaces.

Theorem 11.11 Two finite dimensional $\mathbf{K}$-vector spaces are isomorphic if and only if they have the same dimension.

Proof Suppose $V$ and $W$ are isomorphic, and let $F : V \to W$ be an isomorphism. Since $\ker(F) = \langle \mathbf{0} \rangle$ and $\mathrm{Im}(F) = W$, by Theorem 11.6 one has

$$\dim(W) = \mathrm{rk}(F) = \dim(V).$$

Conversely, suppose $\dim(V) = n = \dim(W)$, and let $\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ and $\{\mathbf{w}_1, \dots, \mathbf{w}_n\}$ be bases for $V$ and $W$ respectively. The linear map $F : V \to W$ defined by

$$F(\mathbf{v}_i) = \mathbf{w}_i, \qquad i = 1, \dots, n,$$

is surjective because $\mathbf{w}_1, \dots, \mathbf{w}_n$ generate $W$. It now follows from Theorem 11.6 that $\dim(\ker(F)) = 0$ and so $F$ is also injective. Thus $F$ is an isomorphism. □

We conclude this chapter with a discussion of a few properties of the space of linear functionals on a vector space.

Let $V$ be a $\mathbf{K}$-vector space. Recall that a linear functional on $V$ is a linear map $L : V \to \mathbf{K}$; the set of all linear functionals is denoted $V^*$. One can define on $V^*$ the structure of a $\mathbf{K}$-vector space by defining operations in the following manner. For $L_1, L_2 \in V^*$, define $L_1 + L_2 \in V^*$ by

$$(L_1 + L_2)(\mathbf{v}) = L_1(\mathbf{v}) + L_2(\mathbf{v}), \qquad \forall\, \mathbf{v} \in V.$$


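To see that $L_1 + L_2$ is linear, let $\mathbf{v}, \mathbf{v}' \in V$ and $c \in \mathbf{K}$. Then

$$(L_1 + L_2)(\mathbf{v} + \mathbf{v}') = L_1(\mathbf{v} + \mathbf{v}') + L_2(\mathbf{v} + \mathbf{v}') = L_1(\mathbf{v}) + L_1(\mathbf{v}') + L_2(\mathbf{v}) + L_2(\mathbf{v}') = L_1(\mathbf{v}) + L_2(\mathbf{v}) + L_1(\mathbf{v}') + L_2(\mathbf{v}')$$

[...]

[...] if $\mathbf{v}$ is orthogonal to every $L \in \Phi$, that is if $L(\mathbf{v}) = 0$ for all $L \in \Phi$. The set

$$\Phi^\perp = \{\mathbf{v} \in V \mid \mathbf{v} \text{ is orthogonal to } \Phi\} = \bigcap_{L \in \Phi} \ker(L) \subset V$$

is a vector subspace of $V$, as it is the intersection of a family of vector subspaces of $V$. We will call $\Phi^\perp$ the subspace of $V$ orthogonal to $\Phi$.

Suppose that $\Phi$ is a vector subspace of $V^*$ of dimension $t$ and let $\{L_1, \dots, L_t\}$ be a basis for $\Phi$. Then, with $n = \dim(V)$,

$$\Phi^\perp = \ker(L_1) \cap \dots \cap \ker(L_t) \tag{11.12}$$

and

$$\dim(\Phi^\perp) = n - t. \tag{11.13}$$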


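It is obvious that $\Phi^\perp \subset \ker(L_1) \cap \dots \cap \ker(L_t)$ [...]

[...] $V \to \mathbf{K}^t$, $\mathbf{v} \mapsto (L_1(\mathbf{v}), \dots, L_t(\mathbf{v}))$.

It follows from Theorem 11.6 that $m \ge n - t$ (here $m = \dim(\Phi^\perp)$). Suppose, for a contradiction, that $m > n - t$. Choose a basis $\{\mathbf{e}_1, \dots, \mathbf{e}_n\}$ such that $\{\mathbf{e}_1, \dots, \mathbf{e}_m\}$ is a basis for $\Phi^\perp$. The $t \times n$ matrix

$$\begin{pmatrix} 0 & \cdots & 0 & L_1(\mathbf{e}_{m+1}) & \cdots & L_1(\mathbf{e}_n) \\ 0 & \cdots & 0 & L_2(\mathbf{e}_{m+1}) & \cdots & L_2(\mathbf{e}_n) \\ \vdots & & \vdots & \vdots & & \vdots \\ 0 & \cdots & 0 & L_t(\mathbf{e}_{m+1}) & \cdots & L_t(\mathbf{e}_n) \end{pmatrix}$$

has rank at most $t - 1$, since it has $m > n - t$ columns which are zero. On the other hand, its $t$ rows are linearly independent, because they are the vectors with the coordinates of $L_1, \dots, L_t$ with respect to the basis $\{\eta_1, \dots, \eta_n\}$ dual to $\{\mathbf{e}_1, \dots, \mathbf{e}_n\}$. This is a contradiction; consequently $m = n - t$.

Let $V$ and $W$ be two complex vector spaces. A map $F : V \to W$ is said to be antilinear if it satisfies:

$$F(\mathbf{v} + \mathbf{v}') = F(\mathbf{v}) + F(\mathbf{v}'), \tag{11.14}$$

$$F(c\mathbf{v}) = \bar{c}F(\mathbf{v}), \tag{11.15}$$

for every $\mathbf{v}, \mathbf{v}' \in V$ and every $c \in \mathbf{C}$. Equivalently, it should satisfy

$$F(c\mathbf{v} + c'\mathbf{v}') = \bar{c}F(\mathbf{v}) + \bar{c}'F(\mathbf{v}')$$

for every $\mathbf{v}, \mathbf{v}' \in V$ and every $c, c' \in \mathbf{C}$. Showing that these conditions are equivalent is left to the reader.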


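EXERCISES

11.1 Let $g : U \to V$, $f : V \to W$ be linear maps. Prove that $\ker($ [...]

[...] $M_{m,n}(\mathbf{K})$, $F \mapsto M_{\boldsymbol{w},\boldsymbol{v}}(F)$, is an isomorphism of $\mathbf{K}$-vector spaces. In particular, $\dim(\mathrm{Hom}(V, W)) = mn$.

Proof Let $F, G \in \mathrm{Hom}(V, W)$, $M = M_{\boldsymbol{w},\boldsymbol{v}}(F)$, $N = M_{\boldsymbol{w},\boldsymbol{v}}(G)$, and $c \in \mathbf{K}$. If $\mathbf{v} = x_1\mathbf{v}_1 + \dots + x_n\mathbf{v}_n$, let $\mathbf{x} = (x_1 \dots x_n)^t$, the column $n$-vector with the coordinates of $\mathbf{v}$ with respect to the basis $\boldsymbol{v}$. The column $m$-vectors with the coordinates of $F(\mathbf{v})$ and $G(\mathbf{v})$ with respect to $\boldsymbol{w}$ are $M\mathbf{x}$ and $N\mathbf{x}$ respectively, while the column vector with the coordinates of $(F + G)(\mathbf{v}) = F(\mathbf{v}) + G(\mathbf{v})$ is $M\mathbf{x} + N\mathbf{x} = (M + N)\mathbf{x}$. Thus,

$$M_{\boldsymbol{w},\boldsymbol{v}}(F + G) = M_{\boldsymbol{w},\boldsymbol{v}}(F) + M_{\boldsymbol{w},\boldsymbol{v}}(G).$$

Furthermore, the column $m$-vector with the coordinates of $(cF)(\mathbf{v}) = cF(\mathbf{v})$ is $c(M\mathbf{x}) = (cM)\mathbf{x}$, and so

$$M_{\boldsymbol{w},\boldsymbol{v}}(cF) = cM_{\boldsymbol{w},\boldsymbol{v}}(F).$$

Therefore $M_{\boldsymbol{w},\boldsymbol{v}}$ is a linear map.

Now let $A \in M_{m,n}(\mathbf{K})$, and define $F_A : V \to W$ by

$$F_A(x_1\mathbf{v}_1 + \dots + x_n\mathbf{v}_n) = (A^{(1)}\mathbf{x})\mathbf{w}_1 + \dots + (A^{(m)}\mathbf{x})\mathbf{w}_m,$$

where $\mathbf{x} = (x_1 \dots x_n)^t$ and $A^{(1)}, \dots, A^{(m)}$ are the rows of $A$. In other words, $A\mathbf{x}$ is the column vector with the coordinates of $F_A(\mathbf{v})$. First we check that $F_A$ is a linear map. Consider two arbitrary vectors $\mathbf{v} = x_1\mathbf{v}_1 + \dots + x_n\mathbf{v}_n$ and $\mathbf{v}' = x'_1\mathbf{v}_1 + \dots + x'_n\mathbf{v}_n$ in $V$. Then

$$\begin{aligned} F_A(\mathbf{v} + \mathbf{v}') &= A^{(1)}(\mathbf{x} + \mathbf{x}')\mathbf{w}_1 + \dots + A^{(m)}(\mathbf{x} + \mathbf{x}')\mathbf{w}_m \\ &= (A^{(1)}\mathbf{x} + A^{(1)}\mathbf{x}')\mathbf{w}_1 + \dots + (A^{(m)}\mathbf{x} + A^{(m)}\mathbf{x}')\mathbf{w}_m \\ &= (A^{(1)}\mathbf{x})\mathbf{w}_1 + \dots + (A^{(m)}\mathbf{x})\mathbf{w}_m + (A^{(1)}\mathbf{x}')\mathbf{w}_1 + \dots + (A^{(m)}\mathbf{x}')\mathbf{w}_m \\ &= F_A(\mathbf{v}) + F_A(\mathbf{v}'). \end{aligned}$$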

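And for $c \in \mathbf{K}$,

$$\begin{aligned} F_A(c\mathbf{v}) &= (A^{(1)}c\mathbf{x})\mathbf{w}_1 + \dots + (A^{(m)}c\mathbf{x})\mathbf{w}_m \\ &= c(A^{(1)}\mathbf{x})\mathbf{w}_1 + \dots + c(A^{(m)}\mathbf{x})\mathbf{w}_m \\ &= c[(A^{(1)}\mathbf{x})\mathbf{w}_1 + \dots + (A^{(m)}\mathbf{x})\mathbf{w}_m] \\ &= cF_A(\mathbf{v}). \end{aligned}$$

Thus $F_A$ is linear. By definition, $M_{\boldsymbol{w},\boldsymbol{v}}(F_A) = A$. On the other hand, if $A = M_{\boldsymbol{w},\boldsymbol{v}}(F)$ it is obvious that $F_A = F$. Thus the map

$$M_{m,n}(\mathbf{K}) \to \mathrm{Hom}(V, W), \qquad A \mapsto F_A$$

is the inverse of $M_{\boldsymbol{w},\boldsymbol{v}}$. The last assertion of the theorem follows from Example 4.15(7). □

The map $F_A$ introduced during the proof of the preceding theorem is called the linear map associated to the matrix $A$ with respect to the bases $\boldsymbol{v}$ and $\boldsymbol{w}$. It is defined, for each $A \in M_{m,n}(\mathbf{K})$ [...]

[...] $= M_{\boldsymbol{w},\boldsymbol{v}}(F)[M_{\boldsymbol{v},\boldsymbol{u}}(G)\mathbf{x}] = [M_{\boldsymbol{w},\boldsymbol{v}}(F)M_{\boldsymbol{v},\boldsymbol{u}}(G)]\mathbf{x}$. □

If $V = W$ and if $\boldsymbol{v} = \{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ and $\boldsymbol{w} = \{\mathbf{w}_1, \dots, \mathbf{w}_n\}$ are two bases of $V$, then to each linear operator $F$ on $V$ there is associated a square matrix $M_{\boldsymbol{w},\boldsymbol{v}}(F) \in M_n(\mathbf{K})$.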


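If $\boldsymbol{v} = \boldsymbol{w}$ it follows from Proposition 12.3 that $F \in \mathrm{GL}(V)$ if and only if $M_{\boldsymbol{v},\boldsymbol{v}}(F) \in \mathrm{GL}_n(\mathbf{K})$, in which case one has

$$M_{\boldsymbol{v},\boldsymbol{v}}(F^{-1}) = M_{\boldsymbol{v},\boldsymbol{v}}(F)^{-1}.$$

Furthermore, $M_{\boldsymbol{v},\boldsymbol{v}}(F) = I_n$ if and only if $F = 1_V$.

A particularly important case arises when $\boldsymbol{v}$ and $\boldsymbol{w}$ are two distinct bases of $V$ and $F = 1_V$, the identity map. In this case, $M_{\boldsymbol{w},\boldsymbol{v}}(1_V)$ is called the change of basis matrix from the basis $\boldsymbol{v}$ to the basis $\boldsymbol{w}$.

By definition, the $j$-th column of $M_{\boldsymbol{w},\boldsymbol{v}}(1_V)$ consists of the coordinates of $\mathbf{v}_j$ with respect to the basis $\boldsymbol{w}$. For every vector $\mathbf{v} \in V$ one has

$$\mathbf{v} = x_1\mathbf{v}_1 + \dots + x_n\mathbf{v}_n = y_1\mathbf{w}_1 + \dots + y_n\mathbf{w}_n$$

and, putting $\mathbf{x} = (x_1 \dots x_n)^t$, $\mathbf{y} = (y_1 \dots y_n)^t$, one has

$$\mathbf{y} = M_{\boldsymbol{w},\boldsymbol{v}}(1_V)\mathbf{x}.$$

Thus, from the matrix $M_{\boldsymbol{w},\boldsymbol{v}}(1_V)$ one can find the coordinates $\mathbf{y}$ of any vector $\mathbf{v}$ with respect to the basis $\boldsymbol{w}$, given the coordinates $\mathbf{x}$ of $\mathbf{v}$ with respect to the basis $\boldsymbol{v}$. Note that by Proposition 12.3,

$$M_{\boldsymbol{v},\boldsymbol{w}}(1_V)M_{\boldsymbol{w},\boldsymbol{v}}(1_V) = M_{\boldsymbol{v},\boldsymbol{v}}(1_V) = I_n$$

and so

$$M_{\boldsymbol{w},\boldsymbol{v}}(1_V) = M_{\boldsymbol{v},\boldsymbol{w}}(1_V)^{-1}. \tag{12.1}$$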
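The relation between coordinates under a change of basis can be sketched in $\mathbf{Q}^2$. The two bases below are hypothetical examples, written as matrices whose columns are the basis vectors in canonical coordinates; under that assumption the change of basis matrix from $\boldsymbol{v}$ to $\boldsymbol{w}$ is the inverse of the second matrix times the first.

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def inv2(M):
    """Inverse of an invertible 2x2 matrix over Q."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

# Columns are the basis vectors: v1 = (1,0), v2 = (1,1); w1 = (2,1), w2 = (1,1).
V = [[1, 1], [0, 1]]
W = [[2, 1], [1, 1]]

M_wv = matmul(inv2(W), V)   # change of basis matrix from v to w
M_vw = matmul(inv2(V), W)   # change of basis matrix from w to v

x = [[3], [2]]              # coordinates of 3*v1 + 2*v2 = (5, 2) in the basis v
y = matmul(M_wv, x)         # coordinates of the same vector in the basis w
print(M_wv, y)
```

With these choices the vector $(5, 2)$ has coordinates $(3, 2)$ in the first basis and $(3, -1)$ in the second, and the two change of basis matrices are inverse to each other as in (12.1).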

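Suppose now that $V$ is a real vector space. Two bases

$$\boldsymbol{e} = \{\mathbf{e}_1, \dots, \mathbf{e}_n\} \quad \text{and} \quad \boldsymbol{f} = \{\mathbf{f}_1, \dots, \mathbf{f}_n\}$$

of $V$ are said to have the same orientation if $\det(M_{\boldsymbol{e},\boldsymbol{f}}(1_V)) > 0$, and one writes $\boldsymbol{e} \sim_{or} \boldsymbol{f}$. Otherwise the bases have different orientations. Clearly, every basis has the same orientation as itself since

$$M_{\boldsymbol{e},\boldsymbol{e}}(1_V) = I_n.$$

Moreover, since $M_{\boldsymbol{f},\boldsymbol{e}}(1_V) = M_{\boldsymbol{e},\boldsymbol{f}}(1_V)^{-1}$, whether two bases have the same or different orientation does not depend on the order in which they are taken. Finally, if $\boldsymbol{e}$ and $\boldsymbol{f}$ have the same orientation and $\boldsymbol{f}$ and $\boldsymbol{g} = \{\mathbf{g}_1, \dots, \mathbf{g}_n\}$ have the same orientation, then $\boldsymbol{e}$ and $\boldsymbol{g}$ have the same orientation. For,

$$\det(M_{\boldsymbol{e},\boldsymbol{g}}(1_V)) = \det(M_{\boldsymbol{e},\boldsymbol{f}}(1_V)M_{\boldsymbol{f},\boldsymbol{g}}(1_V)) = \det(M_{\boldsymbol{e},\boldsymbol{f}}(1_V))\det(M_{\boldsymbol{f},\boldsymbol{g}}(1_V)) > 0.$$

We see therefore that $\sim_{or}$ is an equivalence relation on the set $\mathcal{B}$ of all bases of $V$. Each equivalence class is called an orientation of $V$.

How many orientations are there of $V$? Clearly there are at most two, since if there were three distinct bases $\boldsymbol{e}$, $\boldsymbol{f}$ and $\boldsymbol{g}$ defining three different orientations one could deduce the absurd statement:

$$0 > \det(M_{\boldsymbol{e},\boldsymbol{g}}(1_V)) = \det(M_{\boldsymbol{e},\boldsymbol{f}}(1_V))\det(M_{\boldsymbol{f},\boldsymbol{g}}(1_V)) > 0.$$

On the other hand, if $\boldsymbol{e} = \{\mathbf{e}_1, \dots, \mathbf{e}_n\}$ is a basis then $\boldsymbol{e}$ and the basis $\boldsymbol{f} = \{-\mathbf{e}_1, \mathbf{e}_2, \dots, \mathbf{e}_n\}$ have different orientations since

$$M_{\boldsymbol{e},\boldsymbol{f}}(1_V) = \begin{pmatrix} -1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix}$$

has determinant $-1$.

Thus any real vector space $V$ has exactly two orientations, that is, the set $\mathcal{B}$ consists of two equivalence classes for the relation $\sim_{or}$. The orientation of $V$ to which $\boldsymbol{e} \in \mathcal{B}$ belongs is called the orientation of $V$ defined by $\boldsymbol{e}$.

Examples 12.4

1.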

If $V = \mathbf{K}^n$, $W = \mathbf{K}^m$ and $\boldsymbol{v}$ and $\boldsymbol{w}$ are the canonical bases of $\mathbf{K}^n$ and $\mathbf{K}^m$ respectively, then the linear map $F_A$ associated to a matrix $A \in M_{m,n}(\mathbf{K})$ is given by

$$F_A(\mathbf{x}) = A\mathbf{x} \qquad \text{for every } \mathbf{x} \in \mathbf{K}^n,$$

where the elements of $\mathbf{K}^n$ and $\mathbf{K}^m$ are viewed as column vectors. Since the columns of $A$ are the vectors $F_A(E_1), F_A(E_2), \dots, F_A(E_n)$, one has $\mathrm{Im}(F_A) = \langle F_A(E_1), \dots, F_A(E_n) \rangle$. In particular $\mathrm{rk}(F_A) = \mathrm{rk}(A)$.

Note that the coordinates of the vector $A\mathbf{x}$ are $m$ homogeneous linear polynomials in $x_1, \dots, x_n$. Conversely, any linear map $F : \mathbf{K}^n \to \mathbf{K}^m$ is of the form

$$F(x_1, \dots, x_n) = (F_1(x_1, \dots, x_n), \dots, F_m(x_1, \dots, x_n))$$

in which each $F_j(x_1, \dots, x_n)$ is a linear functional on $\mathbf{K}^n$: indeed, $F_j$ is equal to the composite

$$\eta_j \circ F : \mathbf{K}^n \to \mathbf{K}^m \to \mathbf{K},$$

which is linear. By (11.11), we see that each of the $F_j(x_1, \dots, x_n)$ is a homogeneous linear polynomial in $x_1, \dots, x_n$. Thus, every linear map $F : \mathbf{K}^n \to \mathbf{K}^m$ is determined by $m$ homogeneous linear polynomials in $x_1, \dots, x_n$, and the rows of the corresponding matrix are the coefficients in $F_1(x_1, \dots, x_n), F_2(x_1, \dots, x_n), \dots, F_m(x_1, \dots, x_n)$.

For example, the matrix

$$A = \begin{pmatrix} 1 & 2 & \sqrt{2} \\ 3 & 1 & -\tfrac{3}{2} \end{pmatrix} \in M_{2,3}(\mathbf{R})$$

is associated to the linear map $F_A : \mathbf{R}^3 \to \mathbf{R}^2$ defined by

$$F_A(x_1, x_2, x_3) = \left(x_1 + 2x_2 + \sqrt{2}\,x_3,\ 3x_1 + x_2 - \tfrac{3}{2}x_3\right).$$

On the other hand, the linear map $F_A : \mathbf{R}^4 \to \mathbf{R}^3$ defined by

$$F_A(x_1, x_2, x_3, x_4) = \left(2x_1 - x_3 + x_4,\ x_2 - \sqrt{3}\,x_3 + \tfrac{3}{2}x_4,\ x_1 - x_2 + x_3 + 5x_4\right)$$

has associated matrix

$$\begin{pmatrix} 2 & 0 & -1 & 1 \\ 0 & 1 & -\sqrt{3} & 3/2 \\ 1 & -1 & 1 & 5 \end{pmatrix}.$$

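2. Let $A \in M_{m,n}(\mathbf{K})$ and $\mathbf{b} = (b_1 \dots b_m)^t \in \mathbf{K}^m$, and consider the system of $m$ equations in $n$ unknowns

$$A\mathbf{X} = \mathbf{b} \tag{12.2}$$

where $\mathbf{X} = (X_1 \dots X_n)^t$. A vector $\mathbf{x} = (x_1 \dots x_n)^t \in \mathbf{K}^n$ is a solution of (12.2) if and only if $F_A(\mathbf{x}) = \mathbf{b}$, where $F_A : \mathbf{K}^n \to \mathbf{K}^m$ is the linear map associated to $A$. For such an $\mathbf{x}$ to exist it is necessary and sufficient that $\mathbf{b} \in \mathrm{Im}(F_A)$. On the other hand, since $\mathrm{Im}(F_A)$ is generated by the columns of $A$, in order that $\mathbf{b} \in \mathrm{Im}(F_A)$, that is for system (12.2) to be compatible, it is necessary and sufficient that $\mathrm{rk}(A) = \mathrm{rk}(A\ \mathbf{b})$. This argument gives a second proof of the theorem of Kronecker-Rouché-Capelli.

If system (12.2) is compatible, we know that the space of solutions has dimension $n - r$, where $r = \mathrm{rk}(A)$. This can be proved by noting that the space of solutions of system (12.2) has the same dimension as the space of solutions of the associated homogeneous system $A\mathbf{X} = \mathbf{0}$, and that this space is just $\ker(F_A)$. By Theorem 11.6, we have that $\dim(\ker(F_A)) = n - \mathrm{rk}(A) = n - r$.

3. Let $V$ be a real vector space of dimension 3, and let $\boldsymbol{e} = \{\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3\}$ be a basis of $V$. Consider the following bases, whose vectors are given in coordinates with respect to $\boldsymbol{e}$:

$$\boldsymbol{v} = \{\mathbf{v}_1(1, 1, 0),\ \mathbf{v}_2(2, 1, 1),\ \mathbf{v}_3(0, -2, 1)\}, \qquad \boldsymbol{w} = \{\mathbf{w}_1(-1, 0, 1),\ \mathbf{w}_2(1, -2, -3),\ \mathbf{w}_3(1, 1, 1)\}.$$

Then

$$M_{\boldsymbol{e},\boldsymbol{v}}(1_V) = \begin{pmatrix} 1 & 2 & 0 \\ 1 & 1 & -2 \\ 0 & 1 & 1 \end{pmatrix}.$$

Similarly,

$$M_{\boldsymbol{e},\boldsymbol{w}}(1_V) = \begin{pmatrix} -1 & 1 & 1 \\ 0 & -2 & 1 \\ 1 & -3 & 1 \end{pmatrix}.$$

To find $M_{\boldsymbol{w},\boldsymbol{v}}(1_V)$ one can use Proposition 12.3. Thus

$$M_{\boldsymbol{w},\boldsymbol{v}}(1_V) = M_{\boldsymbol{e},\boldsymbol{w}}(1_V)^{-1}M_{\boldsymbol{e},\boldsymbol{v}}(1_V) = \begin{pmatrix} 1/2 & -2 & 3/2 \\ 1/2 & -1 & 1/2 \\ 1 & -1 & 1 \end{pmatrix} \begin{pmatrix} 1 & 2 & 0 \\ 1 & 1 & -2 \\ 0 & 1 & 1 \end{pmatrix} = \begin{pmatrix} -3/2 & 1/2 & 11/2 \\ -1/2 & 1/2 & 5/2 \\ 0 & 2 & 3 \end{pmatrix}.$$

Finally, to find $M_{\boldsymbol{v},\boldsymbol{w}}(1_V)$ we can use the identity

$$M_{\boldsymbol{v},\boldsymbol{w}}(1_V) = M_{\boldsymbol{w},\boldsymbol{v}}(1_V)^{-1}$$

and calculate the inverse of $M_{\boldsymbol{w},\boldsymbol{v}}(1_V)$, or else write

$$M_{\boldsymbol{v},\boldsymbol{w}}(1_V) = M_{\boldsymbol{e},\boldsymbol{v}}(1_V)^{-1}M_{\boldsymbol{e},\boldsymbol{w}}(1_V) = \begin{pmatrix} 3 & -2 & -4 \\ -1 & 1 & 2 \\ 1 & -1 & -1 \end{pmatrix} \begin{pmatrix} -1 & 1 & 1 \\ 0 & -2 & 1 \\ 1 & -3 & 1 \end{pmatrix} = \begin{pmatrix} -7 & 19 & -3 \\ 3 & -9 & 2 \\ -2 & 6 & -1 \end{pmatrix}.$$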
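The matrices of Example 3 can be recomputed mechanically. The sketch below takes the coordinates of the two bases with respect to $\boldsymbol{e}$ from the example; the helper functions (a Gauss-Jordan inverse over the rationals and a matrix product) are illustrative choices, and the inverse routine assumes its argument is invertible.

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def inverse(M):
    """Gauss-Jordan inverse over Q; assumes M is square and invertible."""
    n = len(M)
    A = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(M)]
    for c in range(n):
        piv = next(i for i in range(c, n) if A[i][c] != 0)  # fails if M is singular
        A[c], A[piv] = A[piv], A[c]
        p = A[c][c]
        A[c] = [x / p for x in A[c]]
        for i in range(n):
            if i != c:
                f = A[i][c]
                A[i] = [a - f * b for a, b in zip(A[i], A[c])]
    return [row[n:] for row in A]

# Columns are the coordinates of v1, v2, v3 and of w1, w2, w3 with respect to e.
M_ev = [[1, 2, 0], [1, 1, -2], [0, 1, 1]]
M_ew = [[-1, 1, 1], [0, -2, 1], [1, -3, 1]]

M_wv = matmul(inverse(M_ew), M_ev)
M_vw = matmul(inverse(M_ev), M_ew)
print(M_wv)
print(M_vw)
```

The two results are inverse to each other, in agreement with (12.1).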

Let V and W be real vector spaces with dim (V) = 4 and dim (W ) = 2, and let v = {vi, v 2, v 3, v 4} and w = {wi, w 2}be bases of V and W . Let F :V —> W be the linear map for which M „ ,„ ( F ) = ( ‘

I

-*

‘f ) .

Let e = {e i (l, 1,1,2), e 2(2 ,- 1 ,3 ,0 ) , e3( A l , 0 , 0 ) , e4( l , - 1 /2 ,1 ,5 )} / = (fi(2 ,l) , f2( - l , l ) } be new bases of V and W respectively, given by their coordinates in the bases v and w . By Proposition 12.3 one has M f , e ( F ) = M f tW( l w ) M w , v ( F ) Mv , e ( l v ) -

172

Affine geom etry Since Mw , / ( 1w ) = ( j

M/>t£,(lw) = MWtf(l\ v ) 1 = (_jy3 2 / 3 ) ’ and M v ,e ( l v ) =

n 2 1 -1

1 -1/2

1 3 V2 0

0 0

i n

1 5

)

one gets M f , e (F) = ( 1/3 V —1/3

1 /3 \ / I 3 2/3 A 2 0

n

1

-1/3

U

-1

4/3

( 22

0 7

U

1

/I 1 1 -1 /6 / V2 1\ 2/ ‘

1/6 \

\/2 + l V 2 -1

/I 1 0 J 1 V2 2 \/2 -1 1 3 0 0 0

1/2 \

2

2 \/2 -1 1 3 0 0 0 1 \ -1 /2 1 5 )

-

1 \ 1/2 1 5 /

5. In the vector space M ziC ) of 2 x 2 matrices with complex entries, consider the bases c

=

P =

{ I 11 , I 12 , I 21 , I 22 }

{12,01, __ > 3 i cjfi + C2 f2 + C3f3 = F E = —E F = —(5ei — - e 2 + 2 e3)* Thus, Mf , e ( 1 y ) \

-5 3/2 ■1 / 2 ,

0

/1 7 /2 \

I1!?)'

-1

-

2

- 1 1 1

-

1 0

1

1

\

(

~5 3/2 / V—1 / 2


We conclude th at the formula for changing coordinates from lie i e 2 e3 to F f\ f 2 f3 is

or, Vi

=

—Xl + 2x2 - X3 + —

y\

=

13 -X l +X2 + y

yi

=

Xl - x2 + x3 - 7.

EXERCISES 12.1

Let F : R 2 —» R 3 be the linear map F ( x i , x2) = (xj + x 2, xj —2x2, x i). Find

where

6 = {(1,1), (0 ,-1 )} , 12.2

b ' = {(1,1,1), (1 ,—2,0), (0,0,1)}.

Let F : R 2 —►R 3 be defined by Xl +L — 3x2 , — 3x1 - — X2 , o ^ Fm{ x ,, x2)\ = ( R 3 satisfying F (v i) = (1,0,0),

12.5

F ( v 2) = (0,1,0),

fc' = { (l,V 5 ),(V S ,l)} V = {(1,0), (1,1)} b> = {(V5, -v /5 ), ( VE, V5)}.

For each of the following pairs of bases b and 6' of C 2 find

a) 6 = { ( l , t ) ,( t ,l ) } b) b = {(*,*), (—1,1)} 12.7

F ( v 3) = (0,0,1).

For each of the following pairs of bases b and bf of R 2 find

a) 6 = {(1,0), (0,1)} b) b = { (1 ,-1 ), (1,1)} c) b = {(2,1), (2,2)} 12.6

177

b' = {(2,1), (1,2)} 6' = {(»,0),(0,*)}.

For each of the following pairs of bases 6 and 5' of Q 3 find % ,6 '( 1 ) : a) b)

6 = {(1,0,1), (1,1,0), (0,1,1)} b' = {(1,1,1), (0,1,1), (0,0,1)} 6 = { (1 ,-1 ,1 ), ( - 1 ,1 ,1 ) , (1,1,1)} b ' = {(13,5, - 6 ) , ( 8 ,- 1 0 ,- 4 ) , ( - 1 7 ,0 ,- 7 ) } .

12.8

Let Oij be a fixed frame for a real affine plane A. Find the formula for changing coordinates from the frame Oij to the frame O 'i'j' where O' = 0 '( 1 ,2), i' = i + 3j and j ' = i + j.

12.9

Let A be a real affine plane in which there is a given affine frame Oij. Let l , P and I 11 be the lines with equations e : X + Y = 0,

ft; : X - Y - 1 = 0,

t ': 2 X + Y + 2 = 0.

P utting O' = e n e , U = and U' = let i' = O U , ► j ' = O 't/'. First show th at the vectors i' and j ' are linearly in­ dependent, and then find the formula for changing coordinates from the frame Oij to the frame O 'i'j'.


12.10 Let A be a 3-dimensional real affine space with a fixed frame O ijk. Find the formula for changing coordinates from the frame O ijk to the frame O 'i'j'k ', where O' = 0 ' ( | , j'=j-k,

|) ,

i' = i + k,

k' = i + j + k.

12.11 Let e = { e i , . . . , e „ } , b = {b!,... ,bn} be two bases of the Kvector space V , and let tj — {j?j,. . . , i)„} and /3 = {/?i,. . . , /?„} be the bases of V* dual to e and b respectively. Show th a t

13 Linear operators

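Let $V$ be a finite dimensional vector space over $\mathbf{K}$, and let $\boldsymbol{e} = \{\mathbf{e}_1, \dots, \mathbf{e}_n\}$ be a basis of $V$. For every operator $F \in \mathrm{End}(V)$ we will write $M_{\boldsymbol{e}}(F)$ rather than $M_{\boldsymbol{e},\boldsymbol{e}}(F)$, and $M_{\boldsymbol{e}}(F)$ is called the matrix of $F$ with respect to the basis $\boldsymbol{e}$. From Theorem 12.2 it follows that the map

$$M_{\boldsymbol{e}} : \mathrm{End}(V) \to M_n(\mathbf{K}), \qquad F \mapsto M_{\boldsymbol{e}}(F)$$

is an isomorphism of $\mathbf{K}$-vector spaces. In particular,

$$M_{\boldsymbol{e}}(1_V) = I_n,$$

and $M_{\boldsymbol{e}}(F) \in \mathrm{GL}_n(\mathbf{K})$ if and only if $F \in \mathrm{GL}(V)$, that is, $F$ is an automorphism if and only if $M_{\boldsymbol{e}}(F)$ is invertible. Thus $M_{\boldsymbol{e}}$ induces a bijection, which we denote by the same symbol,

$$M_{\boldsymbol{e}} : \mathrm{GL}(V) \to \mathrm{GL}_n(\mathbf{K}).$$

Let $\boldsymbol{f} = \{\mathbf{f}_1, \dots, \mathbf{f}_n\}$ be a second basis of $V$. By Proposition 12.3 it follows that

$$M_{\boldsymbol{f}}(F) = M_{\boldsymbol{f},\boldsymbol{e}}(1_V)\,M_{\boldsymbol{e}}(F)\,M_{\boldsymbol{e},\boldsymbol{f}}(1_V).$$

Since $M_{\boldsymbol{f},\boldsymbol{e}}(1_V) = M_{\boldsymbol{e},\boldsymbol{f}}(1_V)^{-1}$, one has

$$M_{\boldsymbol{f}}(F) = M_{\boldsymbol{e},\boldsymbol{f}}(1_V)^{-1}\,M_{\boldsymbol{e}}(F)\,M_{\boldsymbol{e},\boldsymbol{f}}(1_V), \tag{13.1}$$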


from which it follows immediately that $\det(M_f(F)) = \det(M_e(F))$, that is, $\det(M_e(F))$ does not depend on the basis $e$ but only on the operator $F$. We therefore call $\det(M_e(F))$ the determinant of the operator $F$ and we denote it by $\det(F)$, without having to specify the matrix $M_e(F)$ used to compute it.

Definition 13.1 Two matrices $A, B \in M_n(K)$ are said to be similar if there is a matrix $M \in GL_n(K)$ such that $B = M^{-1}AM$.

Similarity is an equivalence relation in $M_n(K)$. Indeed, every matrix is similar to itself: $A = I_n^{-1} A I_n$; secondly, if $B = M^{-1}AM$ then multiplying by $M = (M^{-1})^{-1}$ on the left gives $MB = MM^{-1}AM = AM$, and then multiplying by $M^{-1}$ on the right gives $MBM^{-1} = AMM^{-1} = A$, i.e. $(M^{-1})^{-1}BM^{-1} = A$. Finally, if $C = N^{-1}BN$ and $B = M^{-1}AM$ then

$$C = N^{-1}(M^{-1}AM)N = (MN)^{-1}A(MN)$$

and the relation is transitive.

Proposition 13.2 Let $V$ be a vector space over $K$ with $\dim(V) = n$, and let $A, B \in M_n(K)$. Then $A$ and $B$ are similar if and only if there is a linear operator $F \in \mathrm{End}(V)$ and bases $e$ and $f$ of $V$ such that $M_e(F) = A$ and $M_f(F) = B$.

Proof If such an $F$, $e$ and $f$ exist, then it follows from (13.1) that $A$ and $B$ are similar. Conversely, suppose that

$$B = M^{-1}AM. \tag{13.2}$$

Let $e$ be any basis of $V$ and let $F = F_A$ be the operator associated to the matrix $A$ with respect to $e$, so that $M_e(F) = A$. For each $j = 1, \dots, n$ let $\mathbf{f}_j$ be the vector whose coordinates, with respect to $e$, are the elements of the $j$-th column of $M$, that is, $\mathbf{f}_j = m_{1j}\mathbf{e}_1 + \cdots + m_{nj}\mathbf{e}_n$. Since $M$ has rank $n$, the vectors $\mathbf{f}_1, \dots, \mathbf{f}_n$ are linearly independent and so form a basis of $V$ which we call $f$. Moreover, $M = M_{e,f}(\mathbf{1}_V)$. From (13.2) it follows that $B = M_f(F)$. □
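The construction in the proof of Proposition 13.2 can be tried on a small case. The following is a minimal sketch in Python, assuming $V = \mathbb{Q}^2$ with $e$ the standard basis. The matrices `A` and `M` and the helper names `matmul`, `det2` and `inv2` are hypothetical choices made for illustration and are not taken from the text.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def det2(X):
    """Determinant of a 2x2 matrix."""
    return X[0][0] * X[1][1] - X[0][1] * X[1][0]

def inv2(X):
    """Inverse of an invertible 2x2 matrix, via the adjugate formula."""
    d = Fraction(det2(X))
    if d == 0:
        raise ValueError("matrix is not invertible")
    return [[X[1][1] / d, -X[0][1] / d],
            [-X[1][0] / d, X[0][0] / d]]

# Assumed data: A plays the role of M_e(F); the columns of M hold the
# e-coordinates of a new basis f_1 = (1, 1), f_2 = (1, 2).
A = [[2, 1], [0, 3]]
M = [[1, 1], [1, 2]]

# B = M^{-1} A M is then the matrix of the same operator in the basis f.
B = matmul(matmul(inv2(M), A), M)

print(B)
print(det2(A) == det2(B))  # similar matrices share the same determinant
```

Reading `B` column by column gives the coordinates of $F(\mathbf{f}_1)$ and $F(\mathbf{f}_2)$ with respect to $f$, which is one way to check the result by hand.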

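Definition 13.3 Let $V$ be a $K$-vector space of dimension $n$. An operator $F \in \mathrm{End}(V)$ is said to be diagonalizable if there is a basis $e$ of $V$ such that $M_e(F)$ is a diagonal matrix, i.e. is of the form

$$\begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix}$$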

for some $\lambda_1, \dots, \lambda_n \in K$. In this case $e$ is said to be a diagonalizing basis for $F$. A matrix $A \in M_n(K)$ is said to be diagonalizable if it is similar to a diagonal matrix.

Clearly, if $F \in \mathrm{End}(V)$ and $e$ is a basis of $V$ then $F$ is diagonalizable if and only if $M_e(F)$ is a diagonalizable matrix. In particular, $A \in M_n(K)$ is diagonalizable if and only if the operator $F_A : K^n \to K^n$ defined by $A$ is diagonalizable.

If $F : V \to V$ is a diagonalizable linear operator and $e$ is a diagonalizing basis for $F$ then

$$F(\mathbf{e}_i) = \lambda_i \mathbf{e}_i \quad \text{for each } i = 1, \dots, n. \tag{13.3}$$

Conversely, if there exists a basis $e$ satisfying (13.3) then the matrix $M_e(F)$ is diagonal, and so $F$ is diagonalizable and $e$ is a diagonalizing basis for $F$.

Note that, if $\dim(V) = 1$ then every $F \in \mathrm{End}(V)$ is diagonalizable, and every basis is diagonalizing for $F$. If $\dim(V) > 1$ then there are some operators $F \in \mathrm{End}(V)$ which are not diagonalizable. Similarly, not all matrices in $M_n(K)$ are diagonalizable if $n > 1$; see Example 13.15(2).

The notions of 'eigenvector' and 'eigenvalue' arise naturally when considering the problem of existence of diagonalizing bases.
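Condition (13.3) is easy to test mechanically. The sketch below, in Python, assumes $V = \mathbb{Q}^2$ with vectors given by their coordinates in the standard basis. The matrix `A` and the function names `apply`, `scaling_factor` and `diagonal_entries` are hypothetical and serve only as an illustration. The code does not verify that the vectors passed in are linearly independent; that is left to the caller.

```python
from fractions import Fraction

def apply(A, v):
    """Apply the matrix A (list of rows) to the coordinate vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def scaling_factor(A, v):
    """Return lam with A v = lam v, or None if no such scalar exists."""
    if all(x == 0 for x in v):
        return None  # the zero vector is excluded
    w = apply(A, v)
    i = next(k for k, x in enumerate(v) if x != 0)  # a non-zero coordinate
    lam = Fraction(w[i], v[i])
    return lam if all(wk == lam * vk for wk, vk in zip(w, v)) else None

def diagonal_entries(A, basis):
    """If every vector of `basis` satisfies (13.3), return the scalars
    [lam_1, ..., lam_n]; otherwise return None.
    Assumption: `basis` really is a basis (independence is not checked)."""
    lams = [scaling_factor(A, v) for v in basis]
    return None if any(lam is None for lam in lams) else lams

A = [[2, 1], [0, 3]]  # assumed example matrix

print(diagonal_entries(A, [[1, 0], [1, 1]]))  # this basis satisfies (13.3)
print(diagonal_entries(A, [[1, 0], [0, 1]]))  # the standard basis does not
```

The same operator can therefore have a diagonal matrix in one basis and a non-diagonal matrix in another, which is why diagonalizability is a property of the operator and not of a particular matrix representing it.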


Definition 13.4 Let $V$ be a $K$-vector space and let $F \in \mathrm{End}(V)$. A non-zero vector $\mathbf{v} \in V$ is called an eigenvector of $F$ if there is a scalar $\lambda \in K$ such that $F(\mathbf{v}) = \lambda\mathbf{v}$. The scalar $\lambda$ is then called the eigenvalue associated to the eigenvector $\mathbf{v}$. The set of eigenvalues of an operator $F$ is called the spectrum of $F$. For $A \in M_n(K)$, an eigenvector of $A$ is an eigenvector $\mathbf{x} \in K^n$ of the operator $F_A : K^n \to K^n$ defined by $A$, and an eigenvalue of $A$ is an eigenvalue of $F_A$.

For example, if $F = \mathbf{1}_V$, then every vector $\mathbf{v} \neq \mathbf{0}$ is an eigenvector of $F$ with eigenvalue $\lambda = 1$. If $F$ is an operator with $\ker(F) \neq (0)$, then every $\mathbf{v} \in \ker(F) \setminus \{\mathbf{0}\}$ is an eigenvector of $F$ with eigenvalue $\lambda = 0$.

In the remainder of this chapter we describe some of the simple properties of eigenvectors and eigenvalues of an operator $F \in \mathrm{End}(V)$. We suppose that $\dim(V) = n \geq 1$.

Proposition 13.5 The eigenvalue associated to an eigenvector is uniquely determined.

Proof If $\lambda\mathbf{v} = F(\mathbf{v}) = \mu\mathbf{v}$ for some $\lambda, \mu \in K$ then $(\lambda - \mu)\mathbf{v} = \mathbf{0}$ and, since $\mathbf{v} \neq \mathbf{0}$, this implies $\lambda - \mu = 0$, i.e. $\lambda = \mu$. □

Proposition 13.6 If $\mathbf{v}_1, \mathbf{v}_2 \in V$ are eigenvectors with the same eigenvalue $\lambda$, then for every $c_1, c_2 \in K$ the vector $c_1\mathbf{v}_1 + c_2\mathbf{v}_2$, if it is non-zero, is also an eigenvector with eigenvalue $\lambda$.

Proof One has

$$F(c_1\mathbf{v}_1 + c_2\mathbf{v}_2) = c_1F(\mathbf{v}_1) + c_2F(\mathbf{v}_2) = c_1\lambda\mathbf{v}_1 + c_2\lambda\mathbf{v}_2 = \lambda(c_1\mathbf{v}_1 + c_2\mathbf{v}_2).$$ □




From Proposition 13.6, it follows that for each $\lambda \in K$,

$$V_\lambda(F) = \{\mathbf{v} \in V \mid \mathbf{v} \text{ is an eigenvector of } F \text{ with eigenvalue } \lambda\} \cup \{\mathbf{0}\}$$

is a vector subspace of $V$, called the eigenspace for the eigenvalue $\lambda$. For a matrix $A \in M_n(K)$ the eigenspace for the eigenvalue $\lambda$ is defined to be the subspace $V_\lambda(A) = V_\lambda(F_A)$ in $K^n$.

Proposition 13.7 If $\mathbf{v}_1, \dots, \mathbf{v}_k \in V$ are eigenvectors with eigenvalues $\lambda_1, \dots, \lambda_k$, and these $\lambda_i$ are pairwise distinct, then $\mathbf{v}_1, \dots, \mathbf{v}_k$ are linearly independent.

Proof The assertion is trivial if $k = 1$, as $\mathbf{v}_1 \neq \mathbf{0}$. We now proceed by induction on $k$, and suppose that $k \geq 2$. If

$$c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_k\mathbf{v}_k = \mathbf{0}, \tag{13.4}$$

then applying $F$ to both sides gives

$$c_1F(\mathbf{v}_1) + c_2F(\mathbf{v}_2) + \cdots + c_kF(\mathbf{v}_k) = \mathbf{0},$$

that is,

$$c_1\lambda_1\mathbf{v}_1 + c_2\lambda_2\mathbf{v}_2 + \cdots + c_k\lambda_k\mathbf{v}_k = \mathbf{0}. \tag{13.5}$$

On the other hand, multiplying (13.4) by $\lambda_1$ gives

$$c_1\lambda_1\mathbf{v}_1 + c_2\lambda_1\mathbf{v}_2 + \cdots + c_k\lambda_1\mathbf{v}_k = \mathbf{0}, \tag{13.6}$$

and subtracting (13.6) from (13.5) gives

$$c_2(\lambda_2 - \lambda_1)\mathbf{v}_2 + \cdots + c_k(\lambda_k - \lambda_1)\mathbf{v}_k = \mathbf{0}. \tag{13.7}$$

By the inductive hypothesis, $\mathbf{v}_2, \dots, \mathbf{v}_k$ are linearly independent, whence $c_2(\lambda_2 - \lambda_1) = \cdots = c_k(\lambda_k - \lambda_1) = 0$. Since $\lambda_j - \lambda_1 \neq 0$ for $j = 2, \dots, k$ it follows that $c_2 = \cdots = c_k = 0$. Thus (13.4) reduces to $c_1\mathbf{v}_1 = \mathbf{0}$, which implies that $c_1 = 0$ since $\mathbf{v}_1 \neq \mathbf{0}$. □
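Proposition 13.7 can be illustrated on a concrete matrix. The following Python sketch assumes $V = \mathbb{Q}^3$. The matrix `A`, the list `eigenpairs` and the helpers `apply` and `rank` are hypothetical names introduced here for illustration. Independence is tested by computing a rank with Gaussian elimination over the rationals, which is a different route from the inductive argument of the proof.

```python
from fractions import Fraction

def apply(A, v):
    """Apply the matrix A (list of rows) to the coordinate vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination over Q."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0  # number of pivots found so far
    for c in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

# Assumed example: an upper triangular matrix with eigenvalues 1, 2, 3.
A = [[1, 1, 0], [0, 2, 0], [0, 0, 3]]
eigenpairs = [(1, [1, 0, 0]), (2, [1, 1, 0]), (3, [0, 0, 1])]

# Each listed vector really is an eigenvector for the listed eigenvalue.
for lam, v in eigenpairs:
    assert apply(A, v) == [lam * x for x in v]

# Pairwise distinct eigenvalues: the eigenvectors have full rank.
independent = rank([v for _, v in eigenpairs]) == len(eigenpairs)
print(independent)

# The hypothesis matters: two eigenvectors sharing an eigenvalue may be dependent.
print(rank([[1, 0, 0], [2, 0, 0]]))
```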

Proposition 13.8 If every $\mathbf{v} \in V \setminus \{\mathbf{0}\}$ is an eigenvector of $F$ then there exists $\lambda \in K$ such that $F = \lambda\mathbf{1}_V$.


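Proof If $\dim(V) = 1$ the assertion is obvious, and we can therefore suppose that $\dim(V) > 1$. Let $\{\mathbf{e}_1, \dots, \mathbf{e}_n\}$ be a basis of $V$. From the hypothesis it follows that there are $\lambda_1, \dots, \lambda_n \in K$ such that $F(\mathbf{e}_i) = \lambda_i\mathbf{e}_i$, for $i = 1, \dots, n$. Let $i, j$ be distinct integers with $1 \leq i, j \leq n$, and let $\mathbf{v}_{ij} = \mathbf{e}_i + \mathbf{e}_j$. By hypothesis there is a scalar $\lambda_{ij} \in K$ for which

$$F(\mathbf{v}_{ij}) = \lambda_{ij}\mathbf{v}_{ij} = \lambda_{ij}\mathbf{e}_i + \lambda_{ij}\mathbf{e}_j.$$

On the other hand,

$$F(\mathbf{v}_{ij}) = F(\mathbf{e}_i + \mathbf{e}_j) = F(\mathbf{e}_i) + F(\mathbf{e}_j) = \lambda_i\mathbf{e}_i + \lambda_j\mathbf{e}_j,$$

and from the independence of $\mathbf{e}_i$ and $\mathbf{e}_j$ it follows that $\lambda_i = \lambda_j = \lambda_{ij}$. In conclusion, we have $\lambda_1 = \cdots = \lambda_n$, and the proposition is proved. □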



In order to find the eigenvalues of a linear operator or a matrix one uses the so-called 'characteristic polynomial', whose definition relies on the following simple result.

Proposition 13.9 Let $V$ be a finite dimensional vector space and let $F \in \mathrm{End}(V)$. A scalar $\lambda \in K$ is an eigenvalue of $F$ if and only if the operator

$$F - \lambda\mathbf{1}_V : V \to V,$$

which is defined by $(F - \lambda\mathbf{1}_V)(\mathbf{v}) = F(\mathbf{v}) - \lambda\mathbf{v}$, is not an isomorphism, that is, if and only if $\det(F - \lambda\mathbf{1}_V) = 0$.

Proof $F - \lambda\mathbf{1}_V$ fails to be an isomorphism if and only if $\ker(F - \lambda\mathbf{1}_V) \neq (0)$, that is, if and only if there is a $\mathbf{v} \in V$ with $\mathbf{v} \neq \mathbf{0}$ for which $(F - \lambda\mathbf{1}_V)(\mathbf{v}) = \mathbf{0}$, i.e. for which

$$F(\mathbf{v}) = \lambda\mathbf{v}. \tag{13.8}$$

Equation (13.8) states that $F$ has an eigenvector $\mathbf{v}$ with eigenvalue $\lambda$. □
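The criterion of Proposition 13.9 gives a direct, if naive, test for whether a given scalar is an eigenvalue. The Python sketch below assumes matrices with rational entries. The names `det`, `shifted` and `is_eigenvalue` and the matrix `A` are hypothetical and chosen only for this illustration. The determinant is computed by Laplace expansion along the first row, which is adequate for small matrices but not an efficient general method.

```python
from fractions import Fraction

def det(rows):
    """Determinant by Laplace expansion along the first row (small n only)."""
    n = len(rows)
    if n == 0:
        return Fraction(1)  # determinant of the empty matrix
    total = Fraction(0)
    for j in range(n):
        # Minor obtained by deleting the first row and the j-th column.
        minor = [row[:j] + row[j + 1:] for row in rows[1:]]
        total += (-1) ** j * rows[0][j] * det(minor)
    return total

def shifted(A, lam):
    """Matrix of F - lam * 1_V, obtained by subtracting lam on the diagonal."""
    n = len(A)
    return [[A[i][j] - (lam if i == j else 0) for j in range(n)]
            for i in range(n)]

def is_eigenvalue(A, lam):
    """True exactly when det(A - lam * I_n) = 0."""
    return det(shifted(A, lam)) == 0

A = [[2, 1], [0, 3]]  # assumed example matrix

# Scan a few integer candidates; only the eigenvalues pass the test.
print([lam for lam in range(-5, 6) if is_eigenvalue(A, lam)])
```

Scanning candidates works here only because the example has integer eigenvalues; in general one studies $\det(F - \lambda\mathbf{1}_V)$ as a function of $\lambda$.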


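Let $e = \{\mathbf{e}_1, \dots, \mathbf{e}_n\}$ be a basis of $V$. The matrix associated to the operator $\lambda\mathbf{1}_V$ is

$$\begin{pmatrix} \lambda & 0 & \cdots & 0 \\ 0 & \lambda & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda \end{pmatrix}$$

and, if $A = (a_{ij}) = M_e(F)$ then

$$M_e(F - \lambda\mathbf{1}_V) = \begin{pmatrix} a_{11} - \lambda & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} - \lambda & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} - \lambda \end{pmatrix}$$

[…] for every $g, g', g'' \in G$.

[G2] (Existence of an identity element) There is an element $e \in G$ satisfying $e \cdot g = g$ for every $g \in G$.

[G3] (Existence of an inverse) For every $g \in G$ there is an element $g^{-1} \in G$ satisfying $g \cdot g^{-1} = g^{-1} \cdot g = e$.

A group $(G, \cdot)$ is said to be commutative or abelian if it satisfies the additional axiom:

[G4] (Commutativity) $g \cdot g' = g' \cdot g$, for every $g, g' \in G$.

In an abelian group, the binary operation is usually denoted by $+$ and called 'addition'. When it is clear from the context what operation is being used on $G$ then the group $(G, \cdot)$ is usually denoted simply $G$.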

Transformation groups


An important example of a group is the set $T(S)$ of all the bijections of a non-empty set $S$ into itself, also called transformations of $S$. To any pair ($f$,