Linear algebra over skewfields

Table of contents:
CHAPTER
1. PRELIMINARIES
2. REDUCED CHARACTERISTIC POLYNOMIALS
3. THE REDUCED DETERMINANT
4. LINEAR TRANSFORMATIONS
5. DIAGONALIZABLE OPERATORS AND CANONICAL FORMS
APPENDIX I
APPENDIX II
LIST OF REFERENCES
VITA


This dissertation has been microfilmed exactly as received

g8-8271

WILSON, Paul Robert, 1939-
LINEAR ALGEBRA OVER SKEWFIELDS.
University of Illinois, Ph.D., 1967
Mathematics

University Microfilms, Inc., Ann Arbor, Michigan

LINEAR ALGEBRA OVER SKEWFIELDS

BY

PAUL ROBERT WILSON A.B., University of Cincinnati, 1961 A.M., University of Cincinnati, 1962

THESIS Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Mathematics in the Graduate College of the University of Illinois, 1967

Urbana, Illinois

UNIVERSITY OF ILLINOIS THE GRADUATE COLLEGE

SEPTEMBER 11, 1967

I HEREBY RECOMMEND THAT THE THESIS PREPARED UNDER MY SUPERVISION BY

PAUL ROBERT WILSON

ENTITLED

LINEAR ALGEBRA OVER SKEWFIELDS

BE ACCEPTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF

DOCTOR OF PHILOSOPHY


If $T \in \mathrm{Hom}(V)$, then a matrix in $(K)_n$ afforded by $T$ will often be indicated by $\underline{T}$; matrices will usually be represented by underlined capitals. The symbol $I$ always stands for the identity transformation and $\underline{I}$ for the identity matrix. The symbol $\oplus$ means "direct sum"; iff means "if and only if."


CHAPTER 1 PRELIMINARIES

It is assumed that the reader is familiar with the elementary theory of vector spaces as presented in (8) and with the theory of Dieudonné determinants (of which there is an excellent short exposition in (1)).

In this chapter we review, briefly and selectively, some important results in the theory of linear algebras over skewfields. For all details the reader is referred to the third chapter of Jacobson's Theory of Rings.

It is well known that the ring of polynomials in one indeterminate over a skewfield has both left and right division algorithms, so we can make the following definition.

DEFINITION 1.1. Let $K$ be a skewfield and $R = K[X]$ be its ring of polynomials. Given $a, b \in R$, not both zero, there exist uniquely determined elements $d, m \in R$ such that
(i) $d$ and $m$ are monic polynomials,
(ii) $Ra + Rb = Rd$ and $Ra \cap Rb = Rm$.
Then $d$ will be called the GREATEST COMMON RIGHT DIVISOR of $a$ and $b$ and written $d = \mathrm{gcrd}(a,b)$; $m$ will be called the LEAST COMMON LEFT MULTIPLE of $a$ and $b$ and written $m = \mathrm{lclm}(a,b)$. The polynomials $\mathrm{gcld}(a,b)$ and $\mathrm{lcrm}(a,b)$ are similarly defined.

Most important results in linear algebra depend in some way on the correspondence between linear transformations and finitely generated modules over polynomial rings. Such modules can always be decomposed into cyclic modules.
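As an illustration of Definition 1.1, the following Python sketch computes $\mathrm{gcrd}(a,b)$ by the Euclidean algorithm, using the right division $a = qb + r$ in $K[X]$. It assumes that $K$ is the skewfield of rational quaternions; the tuple encoding of quaternions and the names `quat`, `right_divmod`, and `gcrd` are choices made for this example only and do not come from the text.

```python
from fractions import Fraction as F

# A quaternion a + b*i + c*j + d*k is stored as a 4-tuple of Fractions
# (an encoding assumed for this sketch).
def quat(a, b=0, c=0, d=0):
    return (F(a), F(b), F(c), F(d))

ZERO = quat(0)

def qsub(p, q):
    return tuple(x - y for x, y in zip(p, q))

def qmul(p, q):
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qinv(p):
    n = sum(x * x for x in p)
    return (p[0] / n, -p[1] / n, -p[2] / n, -p[3] / n)

# A polynomial is a list of quaternion coefficients; the index is the degree.
def trim(f):
    f = list(f)
    while f and f[-1] == ZERO:
        f.pop()
    return f

def right_divmod(a, b):
    """Return (q, r) with a = q*b + r and deg r < deg b, for nonzero b."""
    a = trim(a)
    quo = [ZERO] * max(len(a) - len(b) + 1, 0)
    lead_inv = qinv(b[-1])
    while len(a) >= len(b):
        shift = len(a) - len(b)
        c = qmul(a[-1], lead_inv)      # c * lead(b) = lead(a)
        quo[shift] = c
        # Subtract (c X^shift) * b; X is central, so c acts on the left only.
        a = trim([qsub(ak, qmul(c, b[k - shift])) if k >= shift else ak
                  for k, ak in enumerate(a)])
    return trim(quo), a

def gcrd(a, b):
    """Monic generator d of the left ideal Ra + Rb (a, b not both zero)."""
    a, b = trim(a), trim(b)
    while b:
        a, b = b, right_divmod(a, b)[1]
    inv = qinv(a[-1])
    return [qmul(inv, c) for c in a]

a = [quat(-1), quat(0, 2), quat(1)]   # (X + i)^2 = X^2 + 2iX - 1
b = [quat(1), quat(0), quat(1)]       # (X - i)(X + i) = X^2 + 1
d = gcrd(a, b)                        # X + i
```

Here the quotient is accumulated on the left, so each remainder stays in $Ra + Rb$; an analogous left-division routine would give $\mathrm{gcld}$.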

In the commutative case two cyclic modules $R/Ra$ and $R/Rb$ are isomorphic iff $a$ and $b$ are associates in $R$. The situation is less simple in the non-commutative case.

DEFINITION 1.2. Two elements $a, b \in R$ are said to be LEFT SIMILAR if $R/Ra \cong R/Rb$; RIGHT SIMILAR if $R/aR \cong R/bR$.

O. Ore (see (12)) introduced this notion and proved that left and right similarity are equivalent, so we may speak of SIMILAR elements; we shall often write $a \sim b$ to indicate the similarity of $a$ and $b$.

THEOREM 1.3. Let $a, b \in R$ be non-constant polynomials. The following statements are equivalent.
(i) $a$ and $b$ are similar;
(ii) there is a $u \in R$ such that $\deg(u) < \deg(b)$, $Ru + Rb = R$, and $Ru \cap Rb = Rau$;
(iii) there is a $v \in R$ such that $\deg(v) < \deg(a)$, $vR + aR = R$, and $vR \cap aR = vbR$;
(iv) there exist $u, v \in R$ such that $\mathrm{gcrd}(u,b) = 1$, $\mathrm{gcld}(v,a) = 1$, $\deg(u) < \deg(b)$, $\deg(v) < \deg(a)$, and $au = vb$.

The proof is clear.

The annihilator in $R$ of any cyclic $R$-module is a principal two-sided ideal; Jacobson shows that any such ideal has a generator which is a monic polynomial in the center $C$ of $R$.

DEFINITION 1.4. The sum of all two-sided ideals contained in a left (or right) ideal $A$ of $R$ will be called the BOUND of $A$ and denoted by $\mathrm{bd}(A)$. If $a \in R$, $a \neq 0$, then the unique monic polynomial $a^* \in C$ such that $\mathrm{bd}(Ra) = Ra^*$ will be called the BOUND of $a$ and written either $\mathrm{bd}(a)$ or $a^*$. We set $\mathrm{bd}(0) = 0$.

The apparent left-right asymmetry in the definition of $\mathrm{bd}(a)$ is illusory, for $\mathrm{bd}(Ra) = \mathrm{bd}(aR)$; also, $\mathrm{bd}(Ra)$ is the annihilator in $R$ of $R/Ra$, so $a \sim b$ implies $\mathrm{bd}(a) = \mathrm{bd}(b)$.

THEOREM 1.5. Every non-zero element of $R$ has a non-zero bound.

It suffices to prove that if $a \in R$, $a \neq 0$, then $\mathrm{bd}(Ra) \neq 0$. If $a$ is a unit, $Ra = R$ and $\mathrm{bd}(Ra) = R \neq 0$. If $a$ has degree $n \geq 1$ then $R/Ra$ is an $n$-dimensional (left) $K$-vector space. Let $T \in \mathrm{Hom}(R/Ra)$ be defined by $T(u + Ra) = Xu + Ra$. Obviously $T \in \mathrm{Hom}_Z(R/Ra, R/Ra)$, so by the Cayley-Hamilton Theorem there is an $f \in C$, $f \neq 0$, such that $f(T) = 0$. Then $f \in \mathrm{bd}(Ra)$.

COROLLARY 1.6. If $a \in R$, $a \neq 0$, then $\mathrm{bd}(a)$ is the monic polynomial of least degree in $C$ which is a multiple (left and right) of $a$.

That $\mathrm{bd}(a)$ is both a left and right multiple of $a$ follows from $\mathrm{bd}(Ra) = \mathrm{bd}(aR) = R(\mathrm{bd}(a))$.

DEFINITION 1.7. If $a, b \in R$ and $Ra \supseteq RbR$, we say that $a$ is a TOTAL DIVISOR of $b$ and write $a \mid b$.

NOTE. Since $aR \supseteq \mathrm{bd}(aR) = \mathrm{bd}(Ra)$ it is clear that $a \mid b$ iff $aR \supseteq RbR$.

The next theorem asserts that total divisibility is a similarity invariant.

THEOREM 1.8. Suppose $a \sim a'$, $b \sim b'$. Then $a \mid b$ iff $a' \mid b'$.

The proof is straightforward.

We now take up the study of square matrices over $R$ and their reduction to a canonical form. Jacobson, working over a non-commutative principal ideal domain, defines three kinds of elementary transformations in $(R)_n$ and proves that if $A \in (R)_n$, then by using these transformations on $A$ one can reduce it to a diagonal matrix $\mathrm{diag}(a_1, \ldots, a_n)$ in which (if $n > 1$) $a_i \mid a_{i+1}$ for $1 \leq i < n$. Because we are working over a ring which has both left and right division algorithms, we can simplify Jacobson's set of elementary transformations and at the same time make an important improvement in his canonical form.

DEFINITION 1.9. Let $GL(n,R)$ be the multiplicative group of units in $(R)_n$. For each $a \in R$ and each pair of integers $i, j$ such that $1 \leq i, j \leq n$, $i \neq j$, let $E_{ij}(a)$ be the element of $(R)_n$ having 1's down the main diagonal, $a$ in the $(i,j)$ spot, and 0's elsewhere. Denote by $SL(n,R)$ the subgroup of $GL(n,R)$ generated by all such $E_{ij}(a)$'s.

NOTE 1. The inverse of $E_{ij}(a)$ is $E_{ij}(-a)$.

NOTE 2. $GL(1,R) = K^*$, $SL(1,R) = \{1\}$. For $n > 1$, $SL(n,R)$ is the set of elements of $GL(n,R)$ having Dieudonné determinant 1 (see (5)). Therefore $SL(n,R)$ is a normal subgroup of $GL(n,R)$ for all $n$.

Observing that all of the elementary transformations used by Jacobson are members of $SL(n,R)$, we have:

THEOREM 1.10. Let $A \in (R)_n$. Then there exist $U, W$ in $SL(n,R)$ such that $UAW = \mathrm{diag}(a_1, \ldots, a_n)$, where $a_i \mid a_{i+1}$ for $1 \leq i < n$.

See lemma 1, Appendix I.

We now fix our attention on the class of matrices over $R$ of form $\underline{I}X - \underline{T}$, where $\underline{T} = (\alpha_{ij}) \in (K)_n$. Let us recall how such matrices appear in the study of linear transformations. Let $V$ be an $n$-dimensional $K$-vector space with basis $\{v_1, \ldots, v_n\}$ and let $T \in \mathrm{Hom}(V)$ afford the matrix $\underline{T}$ relative to this basis. Make $V$ into a left $R$-module $V_T$ by letting $X$ act as $T$. Now if $F = Rx_1 \oplus \cdots \oplus Rx_n$ is a free left $R$-module on $n$ generators, and $N$ is the kernel in $F$ of the $R$-homomorphism defined by $x_i \to v_i$, $1 \leq i \leq n$, then $V_T$ is $R$-isomorphic to $F/N$. It is easily proved that the elements $y_1, \ldots, y_n \in N$ defined by $y_i = Xx_i - \sum_j \alpha_{ij} x_j$ form a basis for $N$ (i.e. $N = Ry_1 \oplus \cdots \oplus Ry_n$) and that $(x_1, \ldots, x_n)(\underline{I}X - \underline{T}) = (y_1, \ldots, y_n)$. By (1.10) we can find $U, W \in SL(n,R)$ such that $U(\underline{I}X - \underline{T})W = \mathrm{diag}(a_1, \ldots, a_n)$, where $a_i \mid a_{i+1}$ for $1 \leq i < n$ [...]

[...] $n > 1$ and there exist $U, W \in SL(n,R)$ such that $U(X - \alpha)W = (X - \beta)\underline{I}$, it is not difficult to show that $W = U^{-1}$ and that $W$ may be chosen from $SL(n,K)$. Therefore if $(X - \alpha)\underline{I}$ and $(X - \beta)\underline{I}$ are equivalent normal forms of $\underline{T}$ there is a $W \in SL(n,K)$ such that [...]

[...] therefore $a[K_1,K_1] = \Delta(U \cdot \mathrm{diag}(1, \ldots, a) \cdot V) = \Delta(\mathrm{diag}(1, \ldots, b)) = b[K_1,K_1]$, as asserted.

COROLLARY 2.6. If $a, b \in R$ are monic and similar, then $N^{rd}(a) = N^{rd}(b) \in C$.

The converse is false.

Let $Z$ be the field of rational numbers, and $K = Z[i,j]$ be the quaternions over $Z$. Every element of $K_1$ has a unique representation of the form $f_1 + f_2 i + f_3 j + f_4 k$, where $k = ij$ and $f_m \in Z_1$ for $1 \leq m \leq 4$. The reduced norm of such an element is $f_1^2 + f_2^2 + f_3^2 + f_4^2$. Consider the elements $a = (X^2 - 1) + (2X)i$, $b = (X^2 + 1)$. It is clear that $N^{rd}(a) = N^{rd}(b)$. If $a \sim b$ then by (1.3) we can find monic polynomials $u, v \in R$ of degree $< 2$ such that $au = vb$ and $\mathrm{gcrd}(u,b) = 1$. But no such polynomials exist: obviously $u$ and $v$ cannot be constants; suppose $u = X + \alpha$, $v = X + \beta$, $\alpha, \beta \in K$; then $au = vb$ implies (comparing coefficients) that $\alpha = -i$, so $X - i$ is the only possible choice for $u$. But then $\mathrm{gcrd}(u,b) = X - i \neq 1$, for $b = (X + i)(X - i)$. Thus $a$ and $b$ are not similar.
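The arithmetic behind this example can be checked mechanically. The Python sketch below verifies that $(X+i)^2 = (X^2 - 1) + (2X)i$, that $(X+i)(X-i) = X^2 + 1$, and that both elements have reduced norm $(X^2+1)^2$. The integer 4-tuple encoding of quaternions and the helper names `pmul` and `nrd` are assumptions made for the illustration; `nrd` simply applies the sum-of-squares formula quoted above and is only meant for elements of $R$ with rational coefficients.

```python
# Quaternions are 4-tuples (a, b, c, d) standing for a + b*i + c*j + d*k.
ZERO, ONE = (0, 0, 0, 0), (1, 0, 0, 0)
I, NEG_I = (0, 1, 0, 0), (0, -1, 0, 0)

def qadd(p, q):
    return tuple(x + y for x, y in zip(p, q))

def qmul(p, q):
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def pmul(f, g):
    """Product in K[X]; polynomials are coefficient lists, index = degree."""
    out = [ZERO] * (len(f) + len(g) - 1)
    for m, fm in enumerate(f):
        for n, gn in enumerate(g):
            out[m + n] = qadd(out[m + n], qmul(fm, gn))
    return out

def poly_mul(u, v):
    """Product of ordinary polynomials with integer coefficients."""
    out = [0] * (len(u) + len(v) - 1)
    for m, um in enumerate(u):
        for n, vn in enumerate(v):
            out[m + n] += um * vn
    return out

def nrd(f):
    """Reduced norm f1^2 + f2^2 + f3^2 + f4^2 of f = f1 + f2*i + f3*j + f4*k."""
    total = [0] * (2 * len(f) - 1)
    for comp in range(4):
        part = [coeff[comp] for coeff in f]
        total = [t + s for t, s in zip(total, poly_mul(part, part))]
    return total

a = pmul([I, ONE], [I, ONE])        # (X + i)(X + i)
b = pmul([I, ONE], [NEG_I, ONE])    # (X + i)(X - i)
```

With these definitions `nrd(a)` and `nrd(b)` both come out as the coefficient list of $X^4 + 2X^2 + 1$, in agreement with the claim $N^{rd}(a) = N^{rd}(b)$.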

This example also proves that similarity classes do not multiply, that is, if $f \sim f'$ and $g \sim g'$ it may happen that $fg \not\sim f'g'$; for in the example, $a = (X+i)^2$, $b = (X+i)(X-i)$, and the polynomials $X+i$ and $X-i$ are similar.

We have shown that $N^{rd}$ induces a multiplicative group homomorphism from $K_1^*/[K_1,K_1]$ into $Z_1$, which takes cosets of elements of $K^*$ into $Z$, and cosets of elements of $R$ into $C$. We now compose this induced map with the Dieudonné determinant from $(K_1)_n$ into $K_1^*/[K_1,K_1]$.

DEFINITION 2.7. If $A \in (K_1)_n$ is not invertible, define the REDUCED DETERMINANT of $A$ to be 0. If $A \in GL(n,K_1)$, let $\alpha$ be any representative of the coset $\Delta(A)$; define the reduced determinant of $A$ to be $N^{rd}(\alpha)$. For all $A \in (K_1)_n$ the reduced determinant of $A$ will be written $\Delta^{rd}_{K_1/Z_1}(A)$ or simply $\Delta^{rd}(A)$.

The discussion preceding (2.7) and the fact that the Dieudonné determinant on $(K_1)_n$ is an extension of that on $(K)_n$ imply that if $A \in (K)_n$ then $\Delta^{rd}(A) \in Z$, while $A \in (R)_n$ implies $\Delta^{rd}(A) \in C$. Therefore $\Delta^{rd}$ may be viewed as a group homomorphism from $GL(n,K)$ into $Z^*$, or from $GL(n,K_1)$ into $Z_1^*$. As the subscript $K_1/Z_1$ in the definition of $\Delta^{rd}_{K_1/Z_1}$ serves no useful purpose, it will be dropped.

CHAPTER 3 THE REDUCED DETERMINANT

Let $T \in \mathrm{Hom}(V)$; if $\underline{T}_1$ and $\underline{T}_2$ are matrices afforded by $T$, then obviously $\Delta^{rd}(\underline{T}_1) = \Delta^{rd}(\underline{T}_2)$. Suppose $T$ is afforded by $\underline{T}$ and $\mathrm{diag}(e_1, \ldots, e_n)$ is a normal form of $\underline{I}X - \underline{T}$; then there exist $U, W \in SL(n,R)$ such that $U(\underline{I}X - \underline{T})W = \mathrm{diag}(e_1, \ldots, e_n)$.

THEOREM 3.1. If $\underline{T} \in (K)_n$ has normal form $\mathrm{diag}(e_1, \ldots, e_n)$, then $\Delta^{rd}(\underline{I}X - \underline{T}) = N^{rd}(e_1 e_2 \cdots e_n)$.

The proof follows from the fact that $SL(n,R)$ is contained in the kernel of $\Delta^{rd}$ viewed as a homomorphism on $GL(n,K_1)$; theorem (1.11) guarantees that $\underline{I}X - \underline{T} \in GL(n,K_1)$.

If $(e_1, \ldots, e_n)$ and $(f_1, \ldots, f_n)$ are invariant lists for $T \in \mathrm{Hom}(V)$, then by Nakayama's theorem $e_i \sim f_i$ for each $i$, so by (2.6) $N^{rd}(e_i) = N^{rd}(f_i)$ for each $i$. This shows that the next definition is well formed.

DEFINITION 3.2. Let $T \in \mathrm{Hom}(V)$ and $(e_1, \ldots, e_n)$ be any invariant list for $T$; then the sequence $(N^{rd}(e_1), \ldots, N^{rd}(e_n))$ will be called the REDUCED LIST of $T$. An analogous definition is given for the reduced list of a $\underline{T} \in (K)_n$.

The reader might note here that if $K$ is a field (so $K = Z$), then $N^{rd}$ is the identity map on $K$ and $\Delta^{rd}$ is the ordinary determinant; thus the definitions and results so far obtained coincide with the usual ones of linear algebra.

Two questions occur naturally at this point: Is there a practicable way to calculate the reduced list of $\underline{T} \in (K)_n$? Can $\underline{T}$ be recovered from its reduced list?

The second question is easy. Choose $a, b \in R$, monic, with the same reduced norm, but not similar (see the example following (2.6)). If $a$ and $b$ have degree $n > 1$, then the $n$-tuples $(1, \ldots, 1, a)$ and $(1, \ldots, 1, b)$ are invariant lists for two non-similar transformations on an $n$-dimensional vector space, yet these transformations have exactly the same reduced list: $(1, \ldots, 1, N^{rd}(a))$. In general $\underline{T}$ cannot be recovered from its reduced list.

To answer the first question we proceed in the following way.

Suppose $A \in (K_1)_n$ and $0 < r \leq n$; after striking out any $n-r$ rows and $n-r$ columns of $A$, we are left with an $r \times r$ submatrix of $A$; the reduced determinant of this submatrix is an element of $Z_1$; it will be called an $r$-rowed reduced minor of $A$.

DEFINITION 3.3. Let $A \in (K_1)_n$ and $r$ be a non-negative integer $\leq n$. Define $D_r : (K_1)_n \to Z_1$ as follows. If every $r$-rowed reduced minor of $A$ is 0, put $D_r(A) = 0$. If some $r$-rowed reduced minor of $A$ is different from 0 let $D_r(A)$ be the greatest common divisor of all the $r$-rowed reduced minors of $A$.

It is obvious that if $A \in (K_1)_n$ and $B, C \in SL(n,K_1)$, then $D_r(BAC) = D_r(A)$ for all $r$. In particular, if $\underline{T} \in (K)_n$, then for any normal form $\mathrm{diag}(e_1, \ldots, e_n)$ of $\underline{T}$ we have: $D_r(\underline{I}X - \underline{T}) = D_r(\mathrm{diag}(e_1, \ldots, e_n))$. It is easily verified that the greatest common divisor of the (ordinary) $r$-rowed minors of the matrix $\mathrm{diag}(N^{rd}(e_1), \ldots, N^{rd}(e_n))$ is $D_r(\mathrm{diag}(e_1, \ldots, e_n))$. But for each $i < n$, $e_i \mid e_{i+1}$, so $N^{rd}(e_i) \mid N^{rd}(e_{i+1})$; therefore the greatest common divisor of the $r$-rowed minors of $\mathrm{diag}(N^{rd}(e_1), \ldots, N^{rd}(e_n))$ is exactly $N^{rd}(e_1) \cdots N^{rd}(e_r)$.

We can now calculate the reduced list of $\underline{T} \in (K)_n$ in this way:
(1) Calculate the reduced norm of each entry of $\underline{I}X - \underline{T}$. The greatest common divisor of these reduced norms (in $C$) will be $N^{rd}(e_1)$;
(2) Calculate all 2-rowed reduced minors of $\underline{I}X - \underline{T}$. The greatest common divisor of these will be $D_2(\underline{I}X - \underline{T}) = N^{rd}(e_1)N^{rd}(e_2)$. Divide by $N^{rd}(e_1)$ to obtain $N^{rd}(e_2)$.

Repeating this process for $r = 3, 4, \ldots, n$ produces the desired result.

We now turn our attention from matrices to linear transformations. Let $V$ be an $n$-dimensional $K$-vector space and $S = \mathrm{Hom}(V)$; then $S$ is a central simple $Z$-algebra isomorphic to $(K)_n$, and $S$ can be decomposed into a direct sum of $n$ minimal left ideals, all of which are isomorphic to $V$ as $S$-modules: $S \cong V^{(n)}$. Let $E$ be a splitting field for $S$; $S \otimes_Z E$ is a central simple $E$-algebra; if $W$ is one of its minimal left ideals, then $S \otimes E \cong W^{(nd)}$ and $V \otimes E \cong W^{(d)}$. Let $s \in S$, and let $s \otimes 1$ be its image in $S \otimes E$; clearly, if $(s \otimes 1)_L \in \mathrm{Hom}_E(W,W)$ is left multiplication by $s \otimes 1$ on the left ideal $W$, then $CP_E((s \otimes 1)_L;X) = CP^{rd}_{S/Z}(s;X) \in Z[X] = C$. In addition, we observe that if $(s \otimes 1)_{L'} \in \mathrm{Hom}_E(W^{(d)},W^{(d)})$ is left multiplication by $s \otimes 1$ on the $nd^2$-dimensional $E$-vector space $W^{(d)}$, then $CP_E((s \otimes 1)_{L'};X)$ is exactly $CP_Z(s_L;X)$, where $s_L \in \mathrm{Hom}_Z(V,V)$ is left multiplication by $s$ on the $nd^2$-dimensional $Z$-vector space $V$. But $CP_E((s \otimes 1)_{L'};X) = (CP_E((s \otimes 1)_L;X))^d = (CP^{rd}_{S/Z}(s;X))^d$. We have proved:

THEOREM 3.4. Let $K$ be a skewfield of index $d$ over its center $Z$, let $V$ be an $n$-dimensional vector space over $K$, $S = \mathrm{Hom}_K(V)$, and $T \in S$. Then letting $CP_Z(T;X)$ denote the characteristic polynomial of $T$ considered as an element of $\mathrm{Hom}_Z(V,V)$ we have $(CP^{rd}_{S/Z}(T;X))^d = CP_Z(T;X)$.

From this point on, for $T \in S = \mathrm{Hom}_K(V)$, $CP^{rd}(T;X)$ will always mean $CP^{rd}_{S/Z}(T;X)$.

Suppose $a \in R$ is a monic polynomial of degree $n \geq 1$, let $V = R/Ra$ be the associated $n$-dimensional $K$-vector space, and let $T \in \mathrm{Hom}(V)$ correspond to multiplication by $X$ in $V$. Since any $Z$-basis for $K$ is also a $C$-basis for $R$, as well as a $Z_1$-basis for $K_1$, we have $N(a) = CP_Z(T;X) \in C$. (This follows from the fact that $N(a)$ is the determinant of $a_L$, where $a_L \in \mathrm{Hom}_{Z_1}(K_1,K_1)$ is left multiplication by $a$ in $K_1$.) Therefore, by (2.4), $(N^{rd}(a))^d = N(a) = CP_Z(T;X) = (CP^{rd}(T;X))^d$.

LEMMA 3.5. Let $x, y \in C$ be monic and let $m$ be any positive integer; then $x^m = y^m$ implies $x = y$.

Let $xy^{-1} = z \in Z_1$, so that $z^m = 1$. Since $x$ and $y$ have the same degree, $z$ is a power series of the form $\sum_{i=0}^{\infty} \alpha_i X^{-i}$, where $\alpha_i \in Z$ for all $i$ and $\alpha_0 \neq 0$. Let $\alpha_1 = \cdots = \alpha_{s-1} = 0$, $\alpha_s \neq 0$. Assume the characteristic of $Z$ is 0. Then $z^m = 1$ implies that $\alpha_0^m + m\alpha_0^{m-1}\alpha_s X^{-s} + \cdots = 1$, whence $\alpha_s = 0$. The contradiction shows that if $Z$ has characteristic 0, $z = \alpha_0 \in Z$; but then, since $x$ and $y$ are monic, $z = 1$ and $x = y$. If $Z$ has characteristic $p > 0$, let $m = p^k m_1$, where $(m_1,p) = 1$. The proof just given shows that $z^{p^k} = 1$, so $x^{p^k} = y^{p^k}$. But then $x^{p^k} - y^{p^k} = (x - y)^{p^k} = 0$, whence $x = y$.

THEOREM 3.6. Let $a \in R$ be monic of degree $n \geq 1$, and let $V = R/Ra$ be the associated $n$-dimensional $K$-vector space. If $T \in \mathrm{Hom}(V)$ is multiplication by $X$ on $V$, then $CP^{rd}(T;X) = N^{rd}(a)$.

The proof is an immediate consequence of (3.5) and the remarks preceding it.

Now let $V$ be any $n$-dimensional vector space over $K$ and choose $T \in \mathrm{Hom}(V)$. We can decompose the left $R$-module $V_T$ into a direct sum $R/Ra_1 \oplus \cdots \oplus R/Ra_n$, in which the $a_i$'s are monic polynomials. Let $T_i$ be the restriction of $T$ to $R/Ra_i$. Relative to an appropriate basis, the matrix of $T$ is the direct sum of the matrices of the $T_i$'s; from the ordinary properties of the determinant used to define $CP^{rd}(T;X)$, we therefore have the following important consequence of (3.6).

THEOREM 3.7. If $T \in \mathrm{Hom}(V)$ and $V_T \cong R/Ra_1 \oplus \cdots \oplus R/Ra_n$, where the $a_i$'s are monic, then we have: $CP^{rd}(T;X) = N^{rd}(a_1) \cdots N^{rd}(a_n) = N^{rd}(a_1 \cdots a_n)$.

COROLLARY 3.8. Let $(e_1, \ldots, e_n)$ be any invariant list for $T \in \mathrm{Hom}(V)$; then $CP^{rd}(T;X) = N^{rd}(e_1 \cdots e_n) = N^{rd}(e_1) \cdots N^{rd}(e_n)$. In particular, $CP^{rd}(T;X)$ is the product of the entries in the reduced list of $T$.

THEOREM 3.9. If $\underline{T} \in (K)_n$, then $CP^{rd}(\underline{T};X) = \Delta^{rd}(\underline{I}X - \underline{T})$.

The proof follows from (3.1) and (3.8).

We can now prove a non-commutative version of one of the basic results of linear algebra.

THEOREM 3.10. (Cayley-Hamilton). For every $T \in \mathrm{Hom}(V)$, $CP^{rd}(T;T) = 0$.

The ordinary Cayley-Hamilton theorem says that $CP_Z(T;T) = 0$, so the proof follows from (3.4).

The matrix form of (3.10) is:

THEOREM 3.11. For every $\underline{T} \in (K)_n$, $CP^{rd}(\underline{T};\underline{T}) = 0$.
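The smallest non-commutative instance of (3.4) and (3.10) can be checked numerically. In the Python sketch below we assume that $K$ is the rational quaternion algebra (so $d = 2$), that $V = K$ is one-dimensional, and that $T$ is the $K$-linear map $v \mapsto v\alpha$ for an arbitrarily chosen $\alpha$; its reduced characteristic polynomial is then $X^2 - \mathrm{tr}(\alpha)X + N^{rd}(\alpha)$. The names `qmul`, `matmul`, and `charpoly`, and the use of the Faddeev-LeVerrier recursion for the ordinary characteristic polynomial, are choices made for this illustration and are not taken from the text.

```python
from fractions import Fraction as F

def qmul(p, q):
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

alpha = (F(1), F(2), F(-1), F(3))     # 1 + 2i - j + 3k, an arbitrary choice
trace = 2 * alpha[0]                  # reduced trace of alpha
norm = sum(x * x for x in alpha)      # reduced norm of alpha

# Cayley-Hamilton: evaluate X^2 - trace*X + norm at alpha.
square = qmul(alpha, alpha)
value = tuple(s - trace * a for s, a in zip(square, alpha))
value = (value[0] + norm,) + value[1:]

# Matrix over Z = Q of v -> v*alpha with respect to the basis 1, i, j, k.
basis = [tuple(F(int(r == c)) for c in range(4)) for r in range(4)]
M = [list(qmul(e, alpha)) for e in basis]

def matmul(A, B):
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def charpoly(A):
    """Coefficients of det(X*I - A), lowest degree first (Faddeev-LeVerrier)."""
    n = len(A)
    coeffs = [F(0)] * n + [F(1)]
    M_k = [[F(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        AM = matmul(A, M_k)
        M_k = [[AM[r][c] + (coeffs[n - k + 1] if r == c else 0)
                for c in range(n)] for r in range(n)]
        AM = matmul(A, M_k)
        coeffs[n - k] = -sum(AM[r][r] for r in range(n)) / k
    return coeffs

# Square of the reduced characteristic polynomial, lowest degree first.
reduced = [norm, -trace, F(1)]
squared = [F(0)] * 5
for m, x in enumerate(reduced):
    for n, y in enumerate(reduced):
        squared[m + n] += x * y
```

For this $\alpha$ the tuple `value` is zero, and `charpoly(M)` agrees with `squared`, which is the statement $(CP^{rd}(T;X))^d = CP_Z(T;X)$ with $d = 2$.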

We conclude with two more generalizations of standard results.

THEOREM 3.12. If $S, T \in \mathrm{Hom}(V)$, then $CP^{rd}(ST;X) = CP^{rd}(TS;X)$.

The proof results from (3.4), (3.5), and the truth of the corresponding theorem in the ordinary case.

COROLLARY 3.13. If $S, T \in \mathrm{Hom}(V)$ are similar, then $CP^{rd}(S;X) = CP^{rd}(T;X)$.
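The procedure given earlier in this chapter for computing a reduced list from the greatest common divisors $D_r$ of $r$-rowed minors can be tried out in the commutative special case $K = Z$, where $N^{rd}$ is the identity and the reduced list is the usual list of invariant factors of $\underline{I}X - \underline{T}$. The Python sketch below makes exactly that assumption (rational entries, $d = 1$); the coefficient-list representation of polynomials and all function names are introduced here for illustration, and the cofactor expansion used for determinants is only suitable for small matrices.

```python
from fractions import Fraction as F
from itertools import combinations

# Polynomials over Q are coefficient lists (lowest degree first); [] is zero.
def trim(p):
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    return trim([(p[k] if k < len(p) else 0) + (q[k] if k < len(q) else 0)
                 for k in range(n)])

def neg(p):
    return [-c for c in p]

def mul(p, q):
    if not p or not q:
        return []
    out = [F(0)] * (len(p) + len(q) - 1)
    for m, pm in enumerate(p):
        for n, qn in enumerate(q):
            out[m + n] += pm * qn
    return trim(out)

def divmod_poly(p, q):
    """Quotient and remainder of p by a nonzero polynomial q."""
    p = trim(p)
    quo = [F(0)] * max(len(p) - len(q) + 1, 0)
    while len(p) >= len(q):
        shift = len(p) - len(q)
        c = p[-1] / q[-1]
        quo[shift] = c
        p = trim([pc - c * q[k - shift] if k >= shift else pc
                  for k, pc in enumerate(p)])
    return trim(quo), p

def gcd(p, q):
    """Monic greatest common divisor; gcd([], []) is []."""
    p, q = trim(p), trim(q)
    while q:
        p, q = q, divmod_poly(p, q)[1]
    return [c / p[-1] for c in p] if p else []

def det(M):
    """Determinant by cofactor expansion along the first row."""
    if not M:
        return [F(1)]
    total = []
    for col, entry in enumerate(M[0]):
        minor = [row[:col] + row[col + 1:] for row in M[1:]]
        term = mul(entry, det(minor))
        total = add(total, term if col % 2 == 0 else neg(term))
    return total

def determinantal_divisor(M, r):
    """D_r: gcd of all r-rowed minors of M."""
    g = []
    for rows in combinations(range(len(M)), r):
        for cols in combinations(range(len(M)), r):
            g = gcd(g, det([[M[i][j] for j in cols] for i in rows]))
    return g

def invariant_list(T):
    """Invariant factors e_r = D_r / D_{r-1} of X*I - T."""
    n = len(T)
    X = [F(0), F(1)]
    M = [[add(X if i == j else [], [-F(T[i][j])]) for j in range(n)]
         for i in range(n)]
    D = [[F(1)]] + [determinantal_divisor(M, r) for r in range(1, n + 1)]
    return [divmod_poly(D[r], D[r - 1])[0] for r in range(1, n + 1)]

nilpotent = invariant_list([[0, 1], [0, 0]])   # invariant list (1, X^2)
zero_map = invariant_list([[0, 0], [0, 0]])    # invariant list (X, X)
```

Both matrices have characteristic polynomial $X^2$, yet their lists differ, which is the commutative shadow of the remark that the product of the entries of the reduced list is $CP^{rd}(T;X)$ while the list itself carries more information.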


CHAPTER 4 LINEAR TRANSFORMATIONS

Earlier, in (1.4), we defined the bound $a^*$ of a polynomial $a \in R$. Recall that $a^*$ is monic (unless $a = 0$) and belongs to $C$, it generates the annihilator in $R$ of $R/Ra$, and is the monic multiple of $a$ of least degree which lies in $C$.

THEOREM 4.1. If $a \in R$, $a \neq 0$, then $a \mid a^*$ and $a^* \mid N^{rd}(a)$.

It suffices to prove the result for monic polynomials; we may assume $a$ has degree $n \geq 1$. Form the vector space $V = R/Ra$, and let $T \in \mathrm{Hom}(V)$ be multiplication by $X$ on $V$. By the results of the last chapter, $N^{rd}(a) = CP^{rd}(T;X)$, and $CP^{rd}(T;T) = 0$, so that $N^{rd}(a)$ belongs to the annihilator of $R/Ra$; that is, $a^* \mid N^{rd}(a)$. We already know that $a \mid a^*$, so the theorem is proved.

The following example shows that $a$, $a^*$, and $N^{rd}(a)$ can all be distinct. Let $Z$ be the field of rational numbers, and $K$ be the quaternions over $Z$. Choose $a = X(X+i) \in R$. It is easily seen that $a^* = X^3 + X$ and $N^{rd}(a) = X^4 + X^2$.
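The two polynomials in this example can be confirmed by direct multiplication. The Python sketch below checks that $X^3 + X$ is a central polynomial which is both a right and a left multiple of $a = X(X+i)$, and that the sum-of-squares formula for the reduced norm gives $X^4 + X^2$. It does not verify that $X^3 + X$ has least degree among such multiples. The integer 4-tuple encoding of quaternions and the helper names are assumptions made for this illustration.

```python
# Quaternions are 4-tuples (a, b, c, d) standing for a + b*i + c*j + d*k.
ZERO, ONE = (0, 0, 0, 0), (1, 0, 0, 0)
I, NEG_I = (0, 1, 0, 0), (0, -1, 0, 0)

def qadd(p, q):
    return tuple(x + y for x, y in zip(p, q))

def qmul(p, q):
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def pmul(f, g):
    """Product in K[X]; polynomials are coefficient lists, index = degree."""
    out = [ZERO] * (len(f) + len(g) - 1)
    for m, fm in enumerate(f):
        for n, gn in enumerate(g):
            out[m + n] = qadd(out[m + n], qmul(fm, gn))
    return out

def nrd(f):
    """Reduced norm via f1^2 + f2^2 + f3^2 + f4^2 (components in Q[X])."""
    total = [0] * (2 * len(f) - 1)
    for comp in range(4):
        part = [coeff[comp] for coeff in f]
        for m, x in enumerate(part):
            for n, y in enumerate(part):
                total[m + n] += x * y
    return total

a = pmul([ZERO, ONE], [I, ONE])          # X(X + i) = X^2 + iX
right_multiple = pmul(a, [NEG_I, ONE])   # a (X - i)
left_multiple = pmul([NEG_I, ONE], a)    # (X - i) a
```

Both products equal $X^3 + X$, whose coefficients are rational and hence central, and `nrd(a)` is the coefficient list of $X^4 + X^2 = X \cdot (X^3 + X)$, in line with $a^* \mid N^{rd}(a)$.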

DEFINITION 4.2. Let $V$ be a vector space over $K$, and let $T \in \mathrm{Hom}(V)$. We define the MINIMAL POLYNOMIAL of $T$ over $K$ to be the unique monic polynomial $f \in R$ of least degree such that $f$ annihilates the $R$-module $V_T$. This $f$ will be denoted by $MP_K(T;X)$ or by $MP(T;X)$.

We note that such an $f$ always exists; it is the monic generator of the two-sided ideal in $R$ which annihilates $V_T$, and therefore lies in $C$. It is immediate from the definition that $MP(T;T) = 0$.

THEOREM 4.3. Let $V$ be a $K$-vector space, $T \in \mathrm{Hom}(V)$. If $(e_1, \ldots, e_n)$ is any invariant list for $T$, then $MP(T;X) = e_n^*$.

First, if $(a_1, \ldots, a_n)$ is another invariant list, we have $a_n \sim e_n$, so $a_n^* = e_n^*$. We know that $V_T$ is isomorphic to the direct sum of the modules $R/Re_i$, for $i = 1, 2, \ldots, n$. Since $e_i \mid e_n$ for each $i$, the annihilator of $V_T$ is exactly the annihilator of $R/Re_n$, i.e. the ideal generated by $e_n^*$.

Theorem (4.3) has the following easy generalization.

THEOREM 4.4. Let $V$ and $T$ be as in (4.3). If $a_1, \ldots, a_n$ are arbitrary polynomials in $R$ such that $V_T \cong R/Ra_1 \oplus \cdots \oplus R/Ra_n$, then $MP(T;X)$ is the least common multiple of the polynomials $a_1^*, \ldots, a_n^*$.

The proof is obvious. One other elementary fact about minimal polynomials deserves mention; if $V$ is a $K$-vector space, $T \in \mathrm{Hom}(V)$, then considering $T$ as a member of $\mathrm{Hom}_Z(V,V)$, the minimal polynomial of $T$, denoted by $MP_Z(T;X)$, is exactly the same as $MP(T;X)$.

We next examine questions relating to characteristic values of linear transformations. Recall that for $\alpha, \beta \in K$, the relation $\alpha \underset{c}{=} \beta$ means that there is a $\lambda \in K^*$ such that $\lambda\alpha\lambda^{-1} = \beta$; we say that $\alpha$ and $\beta$ are conjugate in $K$. Throughout this discussion $V$ will represent an $n$-dimensional $K$-vector space and $T$ will be a fixed transformation on $V$. For $\alpha, \lambda \in K$, $\lambda \neq 0$, the product $\lambda^{-1}\alpha\lambda$ will be denoted by $\alpha^\lambda$.

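DEFINITION 4.5. An element $\alpha \in K$ will be called a CHARACTERISTIC VALUE of $T$ if $Tv = \alpha v$ for some $v \in V$, $v \neq 0$. For each $\alpha \in K$, define $S_\alpha$ to be $\{v \in V : Tv = \alpha v\}$, and define $W_\alpha$ to be $\sum_\lambda S_{\alpha^\lambda}$, where $\lambda$ runs over all non-zero elements of $K$.

THEOREM 4.6. For each $\alpha \in K$, $W_\alpha$ is a $K$-subspace of $V$.

Note that $S_{\alpha^\lambda}$ is a vector space over $Z$ for each $\lambda$, and that if $\beta \in K$, $\beta \neq 0$, then $\beta S_{\alpha^\lambda} = \{\beta v : v \in S_{\alpha^\lambda}\} = S_{\alpha^{\lambda\beta^{-1}}}$. Therefore $\beta W_\alpha = \sum_\lambda \beta S_{\alpha^\lambda} = \sum_\lambda S_{\alpha^{\lambda\beta^{-1}}} \subseteq W_\alpha$.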
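A one-dimensional example makes the role of conjugation concrete. In the Python sketch below we assume that $K$ is the rational quaternion algebra, that $V = K$, and that $T$ is the $K$-linear map $v \mapsto v\alpha$; then every non-zero $v$ satisfies $Tv = (v\alpha v^{-1})v$, so $v \in S_{\alpha^\lambda}$ with $\lambda = v^{-1}$, and the characteristic values of $T$ fill out the conjugacy class of $\alpha$. The code also checks the scaling rule used in the proof of (4.6). The encoding and the names `quat`, `T`, and `char_value` are assumptions made for this illustration.

```python
from fractions import Fraction as F

def quat(a, b=0, c=0, d=0):
    return (F(a), F(b), F(c), F(d))

def qmul(p, q):
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qinv(p):
    n = sum(x * x for x in p)
    return (p[0] / n, -p[1] / n, -p[2] / n, -p[3] / n)

alpha = quat(1, 2)                 # 1 + 2i, an arbitrary choice

def T(v):
    """The left K-linear map v -> v*alpha on V = K."""
    return qmul(v, alpha)

def char_value(v):
    """The scalar gamma = v*alpha*v^(-1), for which T(v) = gamma*v."""
    return qmul(qmul(v, alpha), qinv(v))

v = quat(1, 1, 1)                  # 1 + i + j
gamma = char_value(v)

# Scaling rule from the proof of (4.6): beta*v has value beta*gamma*beta^(-1).
beta = quat(0, 0, 1, 3)            # j + 3k
w = qmul(beta, v)
gamma_w = qmul(qmul(beta, gamma), qinv(beta))
```

Here `gamma` differs from `alpha` but has the same real part and the same reduced norm, as any conjugate must.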

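DEFINITION 4.7. $W_\alpha$ will be called the CHARACTERISTIC SPACE of $T$ associated with $\alpha$.

NOTE. $W_\alpha$ is $T$-invariant, i.e., $T(W_\alpha) \subseteq W_\alpha$. We observe that $\alpha \underset{c}{=} \beta$ implies $W_\alpha = W_\beta$. The converse is also true.

LEMMA 4.8. If $\alpha, \beta \in K$ are not conjugate, then $S_\alpha \cap W_\beta = \{0\}$.

Suppose $v \in S_\alpha \cap W_\beta$, $v \neq 0$. Then $Tv = \alpha v$; but also $v = v_1 + \cdots + v_m$, $v_i \in S_{\beta_i}$, where $\beta_i \underset{c}{=} \beta$ for each $i$, and $\beta_1, \ldots, \beta_m$ are distinct. Of all such $v$'s, choose one for which $m$ is minimal. We have $\alpha v = Tv = T(v_1 + \cdots + v_m) = \beta_1 v_1 + \cdots + \beta_m v_m$. But $\alpha v = \alpha v_1 + \cdots + \alpha v_m$, so $\sum (\beta_i - \alpha)v_i = 0$. By hypothesis $\alpha$ is not conjugate to $\beta$, so $\lambda = \beta_1 - \alpha \neq 0$. Obviously $m > 1$, so we can write $v_1 = \lambda^{-1}(\alpha - \beta_2)v_2 + \cdots + \lambda^{-1}(\alpha - \beta_m)v_m$. Putting $\lambda_i = \lambda^{-1}(\alpha - \beta_i)$ for $i = 2, \ldots, m$, we have $v_1 = \lambda_2 v_2 + \cdots + \lambda_m v_m$. This contradiction to the minimality of $m$ shows that no such $v$ exists.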

THEOREM 4.9. Suppose that $\alpha_1, \ldots, \alpha_t \in K$ are non-conjugate in pairs and that $W_{\alpha_i} \neq 0$ for each $i$. Then $W_{\alpha_1} + \cdots + W_{\alpha_t}$ is a direct sum.

If the sum is not direct choose $w_i \in W_{\alpha_i}$ such that $\sum w_i = 0$, not all $w_i = 0$. By renumbering we may assume $w_1 + \cdots + w_t = 0$ with $w_j \neq 0$ for $1 \leq j \leq t$. Let $t$ be the smallest integer for which such a relation holds. Now from all such relations ($t$ is fixed), pick one in which $w_1$ is expressible as a sum $v_1 + \cdots + v_m$, $v_i \in S_{\alpha_1^{\lambda_i}}$, $m$ minimal. By (4.8) $m > 1$. The minimality of $m$ implies that the elements $\beta_i = \alpha_1^{\lambda_i}$, for $1 \leq i \leq m$, are all distinct. Since $\sum w_j = 0$, we have

$0 = T(w_1 + \cdots + w_t) - \beta_1(w_1 + \cdots + w_t) = w_1' + \cdots + w_t'$,

where $w_j' = T(w_j) - \beta_1 w_j \in W_{\alpha_j}$ for each $j$. In particular, $w_1' = T(w_1) - \beta_1 w_1 = \sum \beta_i v_i - \sum \beta_1 v_i = \sum (\beta_i - \beta_1)v_i$ is a sum of fewer than $m$ elements, each of the same genre as the $v_i$'s (see the proof of (4.6)). If this fact is not to contradict the minimality of $m$, then it must happen that each of the $w_j'$'s is 0 (otherwise the minimality of $t$ would be contradicted). But if $j > 1$, then $T(w_j) - \beta_1 w_j = w_j' = 0$ implies $w_j \in S_{\beta_1} = S_{\alpha_1^{\lambda_1}}$. Since, by hypothesis, the $\alpha_i$'s are non-conjugate in pairs, this contradicts (4.8). We conclude that no such $w_i$'s can exist, so the sum is direct.

We can use (4.9) to prove the non-commutative counterparts of many familiar results.

THEOREM 4.10. A linear transformation has at most $n$ distinct, non-conjugate characteristic values in $K$, where $n$ is the dimension of the vector space.

If $\alpha_1, \ldots, \alpha_t$ are characteristic values of $T$, non-conjugate in pairs, then for each $i$, $W_{\alpha_i}$ has dimension at least 1 over $K$, so by (4.9) there can be at most $n$ such $\alpha_i$'s.

Let us look at a special case; suppose $a \in R$ is monic of degree $m \geq 1$. Let $U = R/Ra$ be the associated $m$-dimensional vector space, and let $S \in \mathrm{Hom}(U)$ be multiplication by $X$ on $U$.

THEOREM 4.11. If $a$, $U$, and $S$ are as above, then $\alpha \in K$ is a characteristic value of $S$ iff there is a $\lambda \neq 0$ in $K$ such that $(X - \alpha^\lambda)$ is a left divisor of $a$.

Suppose $u = f + Ra \in U$, $u \neq 0$, and $Su = \alpha u$. We may assume the degree of $f$ is less than that of $a$. Then, using the fact that $S$ is multiplication by $X$, we see that $(X - \alpha)f \in Ra$. By comparing degrees of the two sides we find that there is a $\lambda \neq 0$ in $K$ such that $(X - \alpha)f = \lambda a$. The result follows. Conversely, if $(X - \alpha^\lambda)f = a$, then for the element $u = \lambda f + Ra \in U$ we have: $Su = \alpha u$. Since $u$ cannot be 0, the converse is proved.

In the classical theory the characteristic values of a transformation are exactly the zeros of its minimal polynomial; after some preliminary work we shall prove the same result for the non-commutative case.

DEFINITION 4.12. Let $f \in C$, $f \neq 0$, and let $\alpha \in K$. We shall call $\alpha$ a ZERO of $f$ if $f(\alpha) = 0$.

NOTE. Since $Z(\alpha)$ is a field for all $\alpha \in K$, there is no ambiguity about this substitution. Obviously $\alpha$ is a zero of $f \in C$ iff $X - \alpha$ is a divisor of $f$ in $R$.

LEMMA 4.13. Let $a$, $U$, and $S$ be as in (4.11). If $\alpha \in K$ is a characteristic value of $S$, then it is a zero of $CP^{rd}(S;X)$, i.e., $CP^{rd}(S;\alpha) = 0$.

By (4.11) $a = (X - \alpha^\lambda)f$ for some $\lambda \neq 0$, $f \in R$. Therefore $CP^{rd}(S;X) = N^{rd}(a) = N^{rd}(X - \alpha^\lambda) \cdot N^{rd}(f)$. By (4.1) $(X - \alpha^\lambda)$ divides $N^{rd}(X - \alpha^\lambda)$, so it divides $CP^{rd}(S;X)$. Therefore $CP^{rd}(S;\alpha^\lambda) = 0$, whence the result.

COROLLARY 4.14. If $\alpha$ is a characteristic value of $S$, then $\alpha$ is a zero of $a^*$.

Same proof as for (4.13).

Before we can prove a converse to (4.14), we need to know more about zeros of polynomials.

THEOREM 4.15. If $c \in C$ is irreducible and non-constant, then all zeros of $c$ in $K$ are conjugate.

Let $\alpha, \beta \in K$ be zeros of $c$. Let $\sigma : Z(\alpha) \to Z(\beta)$ be the $Z$-isomorphism which takes $\alpha$ to $\beta$. By the Skolem-Noether theorem (see (3), chapter 8, page 110) there is a $\lambda \neq 0$ in $K$ such that $\sigma(x) = x^\lambda$ for every $x \in Z(\alpha)$; in particular, $\alpha^\lambda = \beta$.
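For the rational quaternions, Theorem 4.15 can be illustrated with the irreducible central polynomial $X^2 + 1$, which has many zeros in $K$. The Python sketch below checks that $i$ and $(3i + 4j)/5$ are both zeros and exhibits an explicit $\lambda$ with $i^\lambda = \lambda^{-1} i \lambda = (3i+4j)/5$. The choice $\lambda = \alpha + \beta$ is a standard device for two zeros of $X^2 + 1$ with $\alpha \neq -\beta$ and is an assumption of this example, not a construction from the text; the helper names are likewise illustrative.

```python
from fractions import Fraction as F

def quat(a, b=0, c=0, d=0):
    return (F(a), F(b), F(c), F(d))

def qadd(p, q):
    return tuple(x + y for x, y in zip(p, q))

def qmul(p, q):
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qinv(p):
    n = sum(x * x for x in p)
    return (p[0] / n, -p[1] / n, -p[2] / n, -p[3] / n)

def evaluate(coeffs, x):
    """Evaluate a polynomial with central (rational) coefficients at x."""
    total, power = quat(0), quat(1)
    for coeff in coeffs:
        total = qadd(total, tuple(coeff * t for t in power))
        power = qmul(power, x)
    return total

poly = [F(1), F(0), F(1)]              # X^2 + 1, lowest degree first
alpha = quat(0, 1)                     # i
beta = quat(0, F(3, 5), F(4, 5))       # (3i + 4j)/5

lam = qadd(alpha, beta)                # candidate conjugating element
conjugate = qmul(qmul(qinv(lam), alpha), lam)   # alpha^lam
```

The identity $\alpha(\alpha + \beta) = (\alpha + \beta)\beta$, which holds because $\alpha^2 = \beta^2 = -1$, is what makes `conjugate` equal to `beta`.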

Since $K$ is finite dimensional over $Z$, every element of $K$ is algebraic over $Z$. For $\beta \in K$, let the irreducible polynomial of $\beta$ over $Z$ be written $\mathrm{Irr}(\beta;X)$. It follows from the definitions that for $\alpha \in K$, $\mathrm{Irr}(\alpha;X) = (X - \alpha)^*$; if $\mu \in K$, $\mu \neq 0$, it is clear that $(X - \alpha^\mu)$ is a left divisor of $(X - \alpha)^*$. Let $f$ be the monic polynomial of least degree in $R$ having the property that for each $\mu \neq 0$, $(X - \alpha^\mu)$ is a left divisor of $f$; for each $\mu \neq 0$ choose $g_\mu \in R$ so that $f = (X - \alpha^\mu)g_\mu$. If $f \notin C$ choose $\lambda \in K$, $\lambda \neq 0$, such that $f^\lambda \neq f$; let $f^\lambda - f = \delta h$, where $h \in R$ is monic and $\delta \neq 0$. For each $\mu \neq 0$ we have $f^\lambda = (X - \alpha^{\mu\lambda})g_\mu^\lambda$; but as $\mu$ ranges over the non-zero elements of $K$, so does $\mu\lambda$. Thus each $(X - \alpha^\mu)$ is a left divisor of $\delta h$, and so of $h$. The degree of $h$ is less than that of $f$; this contradicts the minimality of the degree of $f$. We conclude that $f \in C$. Since $(X - \alpha)^*$ is the monic polynomial of least degree in $C$ having the stated property, it follows that $f = (X - \alpha)^*$.

THEOREM 4.16. For each $\alpha \in K$, $\mathrm{Irr}(\alpha;X)$ can be written as a product of linear factors $(X - \alpha_1)(X - \alpha_2) \cdots (X - \alpha_k)$, where the $\alpha_i$'s are all conjugate and $\alpha_1 = \alpha$.

We can write $\mathrm{Irr}(\alpha;X) = (X - \alpha_1)v_1$, where $\alpha_1 = \alpha$ and $v_1 \in R$. If $\alpha \in Z$, then $v_1 = 1$ and we are through; if $\alpha \notin Z$, we shall pick a conjugate of $\alpha$ by making use of the following observation. Suppose that $X - \beta$ is a left divisor of $uv$, but not of $u$. Let $u = (X - \beta)u_1 + \lambda$, $\lambda \neq 0$; then $uv = (X - \beta)u_1 v + \lambda v$, so $(X - \beta^\lambda)$ is a left divisor of $v$. Returning to our proof, we note that since $\alpha \notin Z$ we can pick a conjugate $\alpha_1'$ of $\alpha$ which is not $\alpha$; therefore $(X - \alpha_1')$ is a left divisor of $(X - \alpha_1)v_1$, but not of $(X - \alpha_1)$. By the above remark there is a conjugate $\alpha_2$ of $\alpha_1'$ such that $\mathrm{Irr}(\alpha;X) = (X - \alpha_1)(X - \alpha_2)v_2$. Now if, for each $\mu \neq 0$, $(X - \alpha^\mu)$ is a left divisor of $(X - \alpha_1)(X - \alpha_2)$ then by the remarks preceding the theorem $(X - \alpha_1)(X - \alpha_2)$ is already equal to $\mathrm{Irr}(\alpha;X)$, and we are through. Otherwise we proceed as before to find $\alpha_3 \underset{c}{=} \alpha$ such that $\mathrm{Irr}(\alpha;X)$ is equal to $(X - \alpha_1)(X - \alpha_2)(X - \alpha_3)v_3$. An obvious induction completes the proof.

LEMMA 4.17. Let $a, b \in C$ be distinct, monic, non-constant, irreducible polynomials, and let $\alpha, \beta \in K$ be zeros of $a$ and $b$. Then $\alpha$ and $\beta$ are not conjugate.

If $\alpha^\lambda = \beta$, then $a(\beta) = a(\alpha^\lambda) = (a(\alpha))^\lambda = 0$. But this is impossible, for $\beta$ satisfies only one monic irreducible polynomial in $C$.

We wish to relate the factors of $a \in R$ to the zeros of its bound $a^*$.

LEMMA 4.18. Let $a \in R$. If $\alpha \in K$ has the property that for each $\lambda \neq 0$ in $K$, $(X - \alpha^\lambda)$ is not a left divisor of $a$, then $\mathrm{gcld}(a, \mathrm{Irr}(\alpha;X)) = 1$.

By (4.15) all zeros of $\mathrm{Irr}(\alpha;X)$ are conjugate; by (4.16) $\mathrm{Irr}(\alpha;X)$ splits in $R$. The rest is obvious.

LEMMA 4.19. Suppose $a \in R$ is a left divisor of $cd$, where $c, d \in C$. If $\mathrm{gcld}(a,c) = 1$, then $a$ is a left divisor of $d$.

Choose $x, y \in R$ such that $ax + cy = 1$; then $a(xd) + c(yd) = d$. But $d \in C$, so the result follows.

THEOREM 4.20. Let $a \in R$ and $\alpha \in K$. If $\alpha$ is a zero of $a^*$, then for some $\lambda \neq 0$, $(X - \alpha^\lambda)$ is a left divisor of $a$.

If for all $\lambda \neq 0$, $(X - \alpha^\lambda)$ is not a left divisor of $a$, then by (4.18) $\mathrm{gcld}(a, \mathrm{Irr}(\alpha;X)) = 1$. Since $\alpha$ is a zero of $a^*$, $(X - \alpha)$ is a factor of $a^*$, and necessarily $a^*$ belongs to the annihilator of $R/R(X - \alpha)$, i.e., $(X - \alpha)^*$ is a factor of $a^*$. Let $a^* = (X - \alpha)^* d$, $d \in C$. We know that $a$ divides $a^*$, so by (4.19) $a$ divides $d$. This contradicts (1.6). The result follows.

Let us return to a situation dealt with earlier: $a \in R$ is monic of degree $m \geq 1$, $U = R/Ra$ is the associated $m$-dimensional vector space over $K$, and $S \in \mathrm{Hom}(U)$ is multiplication by $X$ on $U$. We know from earlier results that $CP^{rd}(S;X) = N^{rd}(a)$ and $MP(S;X)$ is equal to $a^*$. In the discussion just completed, we have proved most of the following theorem.

THEOREM 4.21. The following are equivalent:
(i) $\alpha \in K$ is a characteristic value of $S$;
(ii) $\alpha \in K$ is a zero of $a^* = MP(S;X)$;
(iii) $\alpha \in K$ is a zero of $N^{rd}(a) = CP^{rd}(S;X)$;
(iv) there is a $\lambda \neq 0$ in $K$ such that $(X - \alpha^\lambda)$ is a left divisor of $a$ in $R$.

(i) implies (ii), by (4.14). (ii) implies (iii), because $a^*$ divides $N^{rd}(a)$. (iii) implies (iv): we have $(CP^{rd}(S;X))^d = CP_Z(S;X)$, and $MP(S;X) = MP_Z(S;X)$. Ordinary linear algebra tells us that any zero of $CP_Z(S;X)$ (in an extension field of $Z$) is also a zero of $MP_Z(S;X) = MP(S;X)$, so any zero of $CP^{rd}(S;X)$ must be a zero of $MP(S;X)$. This proves that (iii) implies (ii); by (4.20) (ii) implies (iv); therefore (iii) implies (iv). Finally, (iv) implies (i), by (4.11).

NOTE. It is worth remarking that we can omit the word left in statement (iv) of (4.21); for conditions (ii) and (iv) say that $\alpha \in K$ is a zero of the (left) bound of $a$ iff for some $\lambda \neq 0$, $(X - \alpha^\lambda)$ is a left divisor of $a$. It is quite clear that by working with right modules rather than left, we could prove that $\alpha \in K$ is a zero of the (right) bound of $a$ iff for some $\lambda \neq 0$, $(X - \alpha^\lambda)$ is a right divisor of $a$. We long ago observed that the right and left bounds of $a$ are the same, so we may assert that for some $\mu \neq 0$, $(X - \alpha^\mu)$ is a left divisor of $a$ iff for some $\lambda \neq 0$, $(X - \alpha^\lambda)$ is a right divisor of $a$.

We can use (4.21) to prove the corresponding result in the more general case.

Let $V$ be an $n$-dimensional vector space over $K$, $T \in \mathrm{Hom}(V)$, and let $(e_1, \dots, e_n)$ be an invariant list for $T$.

THEOREM 4.22. The following are equivalent:

(i) $\alpha \in K$ is a characteristic value of $T$;

(ii) $\alpha \in K$ is a zero of $e_n^* = MP(T;X)$;

(iii) $\alpha \in K$ is a zero of $N_{rd}(e_1) \cdots N_{rd}(e_n) = CP_{rd}(T;X)$;

(iv) there is a $\lambda \ne 0$ in $K$ such that $(X-\alpha^{\lambda})$ is a left divisor of $e_n$ in $R$.

(i) implies (ii). $V_T$ is isomorphic to the direct sum of the modules $R/Re_i$, $1 \le i \le n$, so if $Tv = \alpha v$ for some $v \ne 0$ in $V$, we can find $f_1, \dots, f_n \in R$ such that under the above mentioned isomorphism $v$ corresponds to $(f_1 + Re_1, \dots, f_n + Re_n)$. Then $Tv = \alpha v$ implies that $(X-\alpha)f_i \in Re_i$ for each $i$. We may assume that the degree of $f_i$ is less than that of $e_i$ for each $i$, and that $f_j \ne 0$ for at least one $j$ (since $v \ne 0$). Then $(X-\alpha)f_j \in Re_j$ implies that $(X-\alpha)f_j = \lambda e_j$, $\lambda$ not 0. In turn, this implies that $(X-\alpha^{\lambda})$ is a left divisor of $e_j$, so by (4.21) $\alpha$ is a zero of $e_j^*$. Since $e_j^*$ divides $e_n^*$, we are through.

(ii) implies (iii). Obvious, for $e_n^*$ divides $N_{rd}(e_n)$.

(iii) implies (iv). If $\alpha$ is a zero of $N_{rd}(e_1 \cdots e_n)$, then $\alpha$ is a zero of some $N_{rd}(e_j)$, so by (4.21) it also has the property that $(X-\alpha^{\lambda})$ is a left divisor of $e_j$ for some $\lambda \ne 0$ in $K$. Since $e_j$ divides $e_n$, there is a $b \in R$ such that $e_n = be_j$. Consider the polynomial $b(X-\alpha^{\lambda})$; by virtue of the note following (4.21), there is a $\mu \ne 0$ in $K$ such that $(X-\alpha^{\mu})$ is a left divisor of $b(X-\alpha^{\lambda})$ and therefore also of $be_j = e_n$.

(iv) implies (i). Let $T_n$ be the restriction of $T$ to $R/Re_n$. By (4.11) $\alpha$ is a characteristic value of $T_n$. Choose $f_n \in R$, $f_n \ne 0$, such that $(X-\alpha^{\lambda})f_n = e_n$. Taking $v \in V$ to be the (non-zero) vector identified with $(0, \dots, 0, \lambda f_n + Re_n)$ we have $Tv = \alpha v$, as required.

Before closing this section, let us use the results just obtained to draw some interesting conclusions about 'zeros' of polynomials in $R$.

DEFINITION 4.23.

Let $a \in R$, $a \ne 0$. An element $\alpha$ of $K$ will be called a PSEUDO-ZERO of $a$ if for some $\lambda \ne 0$ in $K$, $(X-\alpha^{\lambda})$ is a left (or right) divisor of $a$.

COROLLARY 4.24. Let $a \in R$, $a \ne 0$, $\alpha \in K$. Then $\alpha$ is a pseudo-zero of $a$ iff $\alpha$ is a zero of $a^*$.

This is a restatement of part of (4.21).

The next theorem is the non-commutative version of the fundamental theorem of algebra.

THEOREM 4.25. Let $a \in R$ have degree $n \ge 1$. Then $a$ has at most $n$ non-conjugate pseudo-zeros in $K$.

Let $U = R/Ra$, and let $S \in \mathrm{Hom}(U)$ be multiplication by $X$ on $U$. By (4.10) $S$ has at most $n$ non-conjugate characteristic values in $K$. Thus by (4.21) and (4.23) the proof is complete.
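To see why Theorem 4.25 counts zeros only up to conjugacy, consider an example that is not in the thesis: assume $K$ is the skewfield of Hamilton's quaternions and $a = X^2 + 1$. The Python sketch below checks that $i$, $j$ and $k$ are all zeros of $a$, and that they are pairwise conjugate, so that they account for a single conjugacy class even though $a$ has degree 2. The names `qmul`, `f` and `conjugate_via` are hypothetical helpers; `conjugate_via(lam, a, b)` tests $\lambda a = b \lambda$, i.e. $b = \lambda a \lambda^{-1}$, which may differ from the thesis's convention for $\alpha^{\lambda}$ but defines the same conjugacy classes.

```python
# Quaternions are 4-tuples (w, x, y, z) meaning w + x*i + y*j + z*k.
# Assumption for this sketch: K is the skewfield of Hamilton quaternions.
ZERO, ONE = (0, 0, 0, 0), (1, 0, 0, 0)
I, J, KQ = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)

def qadd(p, q):
    return tuple(s + t for s, t in zip(p, q))

def qmul(p, q):
    """Hamilton product of two quaternions."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def f(alpha):
    """Evaluate X^2 + 1 at alpha (unambiguous, since the coefficients are central)."""
    return qadd(qmul(alpha, alpha), ONE)

def conjugate_via(lam, a, b):
    """True when lam * a == b * lam, i.e. b = lam * a * lam^(-1) for lam != 0."""
    return lam != ZERO and qmul(lam, a) == qmul(b, lam)

# i, j and k are all zeros of X^2 + 1 ...
assert all(f(z) == ZERO for z in (I, J, KQ))

# ... and they lie in one conjugacy class:
assert conjugate_via((1, 0, 0, 1), I, J)    # lam = 1 + k carries i to j
assert conjugate_via((1, 0, -1, 0), I, KQ)  # lam = 1 - j carries i to k
```

In this example the polynomial has infinitely many zeros in $K$, but only one up to conjugacy, which is within the bound $n = 2$.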


CHAPTER 5 DIAGONALIZABLE OPERATORS AND CANONICAL FORMS

Our first theorem is the well known Primary Decomposition Theorem.

THEOREM 5.1. Let $V$ be an $n$-dimensional vector space over $K$, and let $T \in \mathrm{Hom}(V)$. Suppose that

$$MP(T;X) = (p_1)^{r_1}(p_2)^{r_2} \cdots (p_k)^{r_k},$$

where the $p_i$'s are monic irreducible polynomials in $C$, and the $r_i$'s are positive integers. Let $W_i = \{\, v \in V : p_i(T)^{r_i}(v) = 0 \,\}$, $1 \le i \le k$, be the null space of the transformation $p_i(T)^{r_i}$. Then

(i) $V = W_1 \oplus \cdots \oplus W_k$.

(ii) $T(W_i) \subseteq W_i$, $1 \le i \le k$.

(iii) If $T_i$ is the restriction of $T$ to $W_i$, then $MP(T_i;X) = (p_i)^{r_i}$.

The proof found in most elementary books on linear algebra (e.g., (7)) remains valid in the non-commutative case.

Until further notice $V$ and $T$ are fixed, and for each $\alpha \in K$, $W_\alpha$ will represent the characteristic space of $T$ associated with $\alpha$. (See (4.7).)

DEFINITION 5.2. $T$ is said to be DIAGONALIZABLE if $V = \sum W_\alpha$, where $\alpha$ ranges over all of $K$.

THEOREM 5.3. $T$ is diagonalizable iff there exist $\alpha_1, \dots, \alpha_k \in K$, $k \le n$, such that the $\alpha_i$'s are non-conjugate in pairs and $V = W_{\alpha_1} + \cdots + W_{\alpha_k}$.

This follows immediately from (5.2) and (4.9).

THEOREM 5.4. $T$ is diagonalizable iff there is a basis of $V$, each element of which is a characteristic vector of $T$.

Suppose $T$ is diagonalizable; then $V = \sum_{i=1}^{k} W_{\alpha_i}$, as in (5.3). Recalling that $W_{\alpha_i} = \sum_{\lambda \ne 0} S^{\lambda}_{\alpha_i}$, it is clear that one can choose a $K$-basis for $W_{\alpha_i}$, each element of which is a characteristic vector of $T$. Conversely, if $\{v_1, \dots, v_n\}$ is a basis of $V$ and $Tv_i = \alpha_i v_i$ for each $i$, arrange the $v_i$'s so that $\{\alpha_1, \dots, \alpha_k\}$ is a complete set of pairwise non-conjugate elements of the set $\{\alpha_1, \dots, \alpha_n\}$. Then for each $j > k$, there is a unique $i \le k$ such that $\alpha_j$ is conjugate to $\alpha_i$ and $W_{\alpha_j} = W_{\alpha_i}$. Thus $V = W_{\alpha_1} + \cdots + W_{\alpha_k}$, so by (4.9) the proof is complete.

DEFINITION 5.5.

A set $\{\alpha_1, \dots, \alpha_k\} \subseteq K$ will be called a COMPLETE SET OF CHARACTERISTIC VALUES of $T$ if every characteristic value of $T$ is conjugate to exactly one of the $\alpha_i$'s.

THEOREM 5.6. Let $T$ be diagonalizable, and let $\{\alpha_1, \dots, \alpha_k\}$ be a complete set of characteristic values of $T$. Then

$$MP(T;X) = (X-\alpha_1)^* \cdots (X-\alpha_k)^*.$$

Let $f = MP(T;X)$, and let $\{v_1, \dots, v_n\}$ be the basis of $V$ guaranteed by (5.4). As in the proof of (5.4), arrange the $v_i$'s so that $Tv_i = \alpha_i v_i$, where, for $i > k$, $\alpha_i$ is conjugate to one of $\alpha_1, \dots, \alpha_k$. Thus, $f(T) = 0$ implies $f(\underline{T}) = 0$, with $\underline{T}$ being the diagonal matrix $\mathrm{diag}(\alpha_1, \dots, \alpha_n)$ afforded by $T$; that is,

$$0 = f(\underline{T}) = \mathrm{diag}(f(\alpha_1), \dots, f(\alpha_n)).$$

But then, $f(\alpha_i) = 0$ for each $i \le n$. Therefore $(X-\alpha_i)^*$ divides $f$ for each $i$. By (4.15), the polynomials $(X-\alpha_i)^*$, $1 \le i \le k$, are distinct; all are irreducible, so they are relatively prime in pairs. We conclude that their product $(X-\alpha_1)^* \cdots (X-\alpha_k)^*$ divides $f$. Let $f'$ denote this product; by (4.16) each $(X-\alpha_i)$, $i \le k$, is a left divisor of $f'$, so $f'(\alpha_i) = 0$ for $i \le k$. But then $f'(\alpha_i) = 0$ for all $i \le n$, so $f'(\underline{T}) = 0$, where $\underline{T} = \mathrm{diag}(\alpha_1, \dots, \alpha_n)$; thus $f'$ annihilates the $R$-module $V_T$. It follows that $f \mid f'$, so $f = f'$.
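A small numerical check of Theorem 5.6 can be sketched as follows, under assumptions that go beyond the text: $K$ is taken to be the skewfield of Hamilton's quaternions, $T$ is the operator with matrix $\mathrm{diag}(i, j)$, and we take for granted (without proof here) that $(X - i)^* = X^2 + 1$ over the quaternions. Since $j$ is conjugate to $i$, the set $\{i\}$ is then a complete set of characteristic values, and the theorem predicts $MP(T;X) = X^2 + 1$. The Python code only verifies that $X^2 + 1$ annihilates the matrix; the helper names are hypothetical.

```python
# Quaternions are 4-tuples (w, x, y, z) meaning w + x*i + y*j + z*k.
# Assumption for this sketch: K is the skewfield of Hamilton quaternions.
ZERO, ONE = (0, 0, 0, 0), (1, 0, 0, 0)
I, J = (0, 1, 0, 0), (0, 0, 1, 0)

def qadd(p, q):
    return tuple(s + t for s, t in zip(p, q))

def qmul(p, q):
    """Hamilton product of two quaternions."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def matmul(A, B):
    """Product of square matrices with quaternion entries."""
    n = len(A)
    out = [[ZERO] * n for _ in range(n)]
    for r in range(n):
        for c in range(n):
            acc = ZERO
            for t in range(n):
                acc = qadd(acc, qmul(A[r][t], B[t][c]))
            out[r][c] = acc
    return out

T = [[I, ZERO], [ZERO, J]]         # diag(i, j)
IDENT = [[ONE, ZERO], [ZERO, ONE]]
NULL = [[ZERO, ZERO], [ZERO, ZERO]]

T2 = matmul(T, T)
f_of_T = [[qadd(T2[r][c], IDENT[r][c]) for c in range(2)] for r in range(2)]

# X^2 + 1 annihilates diag(i, j), as the theorem predicts for this example.
assert f_of_T == NULL
```

Because $X^2 + 1$ is central, evaluating it at a matrix is unambiguous; no central polynomial of degree 1 can annihilate $\mathrm{diag}(i, j)$, since $i$ is not in the center.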

We could also give a non-matrix proof of theorem (5.6) by arguing that for $1 \le i \le n$, the vector space $Kv_i$ is isomorphic to $R/R(X-\alpha_i)$ as an $R$-module, so that

$$V_T \cong \bigoplus_{i=1}^{n} R/R(X-\alpha_i).$$

For each $j > k$, there is a unique $i \le k$ for which $R/R(X-\alpha_j)$ and $R/R(X-\alpha_i)$ have the same annihilator in $R$. From this, and from (4.4), it follows that the annihilator of $V_T$ is the least common multiple of the polynomials $(X-\alpha_i)^*$, $1 \le i \le k$, i.e., $(X-\alpha_1)^* \cdots (X-\alpha_k)^*$. This proves the theorem, for the annihilator of $V_T$ is $MP(T;X)$.

LEMMA 5.7. If

2

and that the result holds for all matrices of order $n-1$ which satisfy conditions (i)-(iii) of the lemma. If $a_{1j} = a_{j1} = 0$ for all $j > 1$, then by the induction hypothesis we are through. Otherwise, some $a_{1j}$ or $a_{j1}$ is different from 0; for notational convenience assume $a_{21} = \alpha \ne 0$. Using $a_{21}$ to eliminate all other column 1 entries, we obtain

[matrix display, together with the formulas defining its entries $g_j$, illegible in the source]

Note that $g_2$ is monic of degree $m+1$, while the remaining $g_j$ are all of degree $m$. All of the entries in the lower right hand $(n-2) \times (n-1)$ block are constants except those marked "lin", which are either constants or linear polynomials; those polynomials marked "lin" along the upper diagonal of this block are all monic of first degree. We now use $\alpha$ to eliminate all $a_{2j}$, $j \ge 2$, and then interchange rows 1 and 2, at the same time changing the signs of the new row two entries. This gives us:

[matrix display illegible in the source]

Notice that by adding an appropriate constant multiple of the $j$-th column ($j > 2$) to the second column we can insure that all of the entries (3,2), (4,2), ..., (n,2) are constants, while keeping $\alpha^{-1}$ as the leading coefficient of the (2,2) entry. Finally, use the division algorithm in $R$ to reduce the (2,3), (2,4), ..., (2,n) entries to constants, by adding to them appropriate left multiples of the (3,3), (4,4), ..., (n,n) elements. None of these alterations affects either the degree or the leading coefficient of the (2,2) entry, for that polynomial has degree $m+1 > 1$. The result is

[matrix display illegible in the source]

Let $x, y \in R$ be monic, with $Rx \not\supseteq RyR$. Then there is a monic polynomial $b$ such that $yb \notin Rx$. Using the algorithm, we get:

$$yb = q_0 x + r_1, \qquad \mathrm{degree}(r_1) < \mathrm{degree}(x),$$
$$x = q_1 r_1 + r_2, \qquad \mathrm{degree}(r_2) < \mathrm{degree}(r_1),$$
$$\vdots$$
$$r_{t-1} = q_t r_t + \cdots$$

Let l the leading co-

. We now choose elements

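from SL(n,R) to effect the transformations indicated below by arrows.

[Display: a chain of $2 \times 2$ matrices joined by arrows, beginning with $\mathrm{diag}(x, y)$; entries involving $yb$, $r_1$, $r_2$, $P(q_2, q_1)y$ and $P(q_3, q_2, q_1)$ are visible, but the full display is illegible in the source.]

... over again.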

Because degrees are continually reduced, this must lead finally to a matrix which satisfies the requirements of the lemma. This completes the proof of (1.12).

APPENDIX II

The characterizations of similarity noted in chapter 1 are deficient in two respects: first, they suggest no general method for determining whether two given elements of a ring are similar; second, they provide no method for constructing similar elements (except for associates). Here we give a characterization of similarity in a non-commutative Euclidean ring, which eliminates at least the second of these difficulties.

Let $R$ be a ring with 1 having both left and right division algorithms and no zero divisors; we shall assume the definition of the function $P$ on finite sequences of elements of $R$ to be known (see Appendix I, definition 6). It is an easy induction to prove that if $x_1, \dots, x_n \in R$ and $n \ge 2$, then

$$P(x_1, \dots, x_n) = P(x_1, \dots, x_{n-1})\, x_n + P(x_1, \dots, x_{n-2}).$$

Readers familiar with continued fractions will recognize $P$ as the recursive function which gives the numerators and denominators of the convergents.

Suppose $x_1, y_1 \in R$ are similar.
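As an aside, the recursion for $P$ can be made concrete with a short Python sketch. Definition 6 of Appendix I is not reproduced in this section, so the base cases used here, $P() = 1$ and $P(x_1) = x_1$, are an assumption borrowed from the classical continuant; the function name `P` follows the text, and the integer example is only an illustration in a commutative ring. The factor $x_n$ is kept on the right, as in the recursion, so the same code would apply to any non-commutative coefficient type supporting `*` and `+`.

```python
def P(*xs):
    """Continuant-style recursion:
    P(x1, ..., xn) = P(x1, ..., x_{n-1}) * xn + P(x1, ..., x_{n-2}),
    with the assumed base cases P() = 1 and P(x1) = x1.
    """
    prev, cur = 0, 1              # value two steps back, and P() = 1
    for x in xs:
        prev, cur = cur, cur * x + prev   # x multiplies on the right
    return cur

# Convergents of the continued fraction [1; 2, 2, 2] = 17/12:
numerator = P(1, 2, 2, 2)
denominator = P(2, 2, 2)
assert (numerator, denominator) == (17, 12)
```

Here the numerator uses all partial quotients and the denominator drops the first one, which is the usual way continuants produce convergents.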

If they are in $K$ they are associates; if they are not in $K$ we can choose $x_2 \in R$, of lower degree than $x_1$, such that $Rx_1 + Rx_2 = R$ and $Rx_1 \cap Rx_2 = Ry_1 x_2$. From the division algorithm we get

$$x_1 = w_1 x_2 + x_3$$
$$x_2 = w_2 x_3 + x_4 \qquad (1)$$

, deg(xi) 1 x. = P(w , ...,w )(5 .

P(w ,...,w. ,)x. = P(w ,... ,w. )x. -, for i