Introduction to Modern Algebra


PURE AND APPLIED MATHEMATICS

A Series of Monographs and Textbooks

INTRODUCTION TO MODERN ALGEBRA

Marvin Marcus



INTRODUCTION TO MODERN ALGEBRA

PURE AND APPLIED MATHEMATICS A Program of Monographs, Textbooks, and Lecture Notes

EXECUTIVE EDITORS—MONOGRAPHS, TEXTBOOKS, AND LECTURE NOTES

Earl J. Taft, Rutgers University, New Brunswick, New Jersey
Edwin Hewitt, University of Washington, Seattle, Washington

CHAIRMAN OF THE EDITORIAL BOARD

S. Kobayashi, University of California, Berkeley, Berkeley, California

EDITORIAL BOARD

Masanao Aoki, University of California, Los Angeles
Glen E. Bredon, Rutgers University
Sigurdur Helgason, Massachusetts Institute of Technology
G. Leitmann, University of California, Berkeley
W. S. Massey, Yale University
Irving Reiner, University of Illinois at Urbana-Champaign
Paul J. Sally, Jr., University of Chicago
Jane Cronin Scanlon, Rutgers University
Marvin Marcus, University of California, Santa Barbara
Martin Schechter, Yeshiva University
Julius L. Shaneson, Rutgers University

MONOGRAPHS AND TEXTBOOKS IN PURE AND APPLIED MATHEMATICS

1. K. Yano, Integral Formulas in Riemannian Geometry (1970)
2. S. Kobayashi, Hyperbolic Manifolds and Holomorphic Mappings (1970)
3. V. S. Vladimirov, Equations of Mathematical Physics (A. Jeffrey, editor; A. Littlewood, translator) (1970)
4. B. N. Pshenichnyi, Necessary Conditions for an Extremum (L. Neustadt, translation editor; K. Makowski, translator) (1971)
5. L. Narici, E. Beckenstein, and G. Bachman, Functional Analysis and Valuation Theory (1971)
6. D. S. Passman, Infinite Group Rings (1971)
7. L. Dornhoff, Group Representation Theory (in two parts). Part A: Ordinary Representation Theory. Part B: Modular Representation Theory (1971, 1972)
8. W. Boothby and G. L. Weiss (eds.), Symmetric Spaces: Short Courses Presented at Washington University (1972)
9. Y. Matsushima, Differentiable Manifolds (E. T. Kobayashi, translator) (1972)
10. L. E. Ward, Jr., Topology: An Outline for a First Course (1972) (out of print)
11. A. Babakhanian, Cohomological Methods in Group Theory (1972)
12. R. Gilmer, Multiplicative Ideal Theory (1972)
13. J. Yeh, Stochastic Processes and the Wiener Integral (1973) (out of print)
14. J. Barros-Neto, Introduction to the Theory of Distributions (1973) (out of print)
15. R. Larsen, Functional Analysis: An Introduction (1973)
16. K. Yano and S. Ishihara, Tangent and Cotangent Bundles: Differential Geometry (1973)
17. C. Procesi, Rings with Polynomial Identities (1973)
18. R. Hermann, Geometry, Physics, and Systems (1973)
19. N. R. Wallach, Harmonic Analysis on Homogeneous Spaces (1973)
20. J. Dieudonné, Introduction to the Theory of Formal Groups (1973)
21. I. Vaisman, Cohomology and Differential Forms (1973)
22. B.-Y. Chen, Geometry of Submanifolds (1973)
23. M. Marcus, Finite Dimensional Multilinear Algebra (in two parts) (1973, 1975)
24. R. Larsen, Banach Algebras: An Introduction (1973)
25. R. O. Kujala and A. L. Vitter (eds.), Value Distribution Theory: Part A; Part B: Deficit and Bezout Estimates by Wilhelm Stoll (1973)
26. K. B. Stolarsky, Algebraic Numbers and Diophantine Approximation (1974)
27. A. R. Magid, The Separable Galois Theory of Commutative Rings (1974)
28. B. R. McDonald, Finite Rings with Identity (1974)
29. I. Satake, Linear Algebra (S. Koh, T. Akiba, and S. Ihara, translators) (1975)
30. J. S. Golan, Localization of Noncommutative Rings (1975)
31. G. Klambauer, Mathematical Analysis (1975)
32. M. K. Agoston, Algebraic Topology: A First Course (1976)
33. K. R. Goodearl, Ring Theory: Nonsingular Rings and Modules (1976)
34. L. E. Mansfield, Linear Algebra with Geometric Applications (1976)
35. N. J. Pullman, Matrix Theory and Its Applications: Selected Topics (1976)
36. B. R. McDonald, Geometric Algebra over Local Rings (1976)
37. C. W. Groetsch, Generalized Inverses of Linear Operators: Representation and Approximation (1977)
38. J. E. Kuczkowski and J. L. Gersting, Abstract Algebra: A First Look (1977)
39. C. O. Christenson and W. L. Voxman, Aspects of Topology (1977)
40. M. Nagata, Field Theory (1977)
41. R. L. Long, Algebraic Number Theory (1977)
42. W. F. Pfeffer, Integrals and Measures (1977)
43. R. L. Wheeden and A. Zygmund, Measure and Integral: An Introduction to Real Analysis (1977)
44. J. H. Curtiss, Introduction to Functions of a Complex Variable (1978)
45. K. Hrbacek and T. Jech, Introduction to Set Theory (1978)
46. W. S. Massey, Homology and Cohomology Theory (1978)
47. M. Marcus, Introduction to Modern Algebra (1978)

INTRODUCTION TO MODERN ALGEBRA

Marvin Marcus
Institute for the Interdisciplinary Applications of Algebra and Combinatorics
University of California, Santa Barbara
Santa Barbara, California

MARCEL DEKKER, INC.

New York and Basel

Library of Congress Cataloging in Publication Data

Marcus, Marvin [Date]
Introduction to modern algebra.

(Monographs and textbooks in pure and applied mathematics; v. 47)
1. Algebra, Abstract. I. Title.
QA162.M37    512'.02    78-2482
ISBN 0-8247-6479-X

COPYRIGHT © 1978 by MARCEL DEKKER, INC. ALL RIGHTS RESERVED

Neither this book nor any part may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, microfilming, and recording, or by any information storage and retrieval system, without permission in writing from the publisher.

MARCEL DEKKER, INC.
270 Madison Avenue, New York, New York 10016

Current printing (last digit):
10 9 8 7 6 5 4 3 2 1

PRINTED IN THE UNITED STATES OF AMERICA

To my wife

Rebecca Elizabeth Marcus

Contents

Preface    vii

1. Basic Structures    1
   1.1 Sets and Functions    1
       Exercises    14
       Glossary    15
   1.2 Algebraic Structures    16
       Exercises    25
       Glossary    30
   1.3 Permutation Groups    31
       Exercises    52
       Glossary    54

2. Groups    56
   2.1 Isomorphism Theorems    56
       Exercises    83
       Glossary    86
   2.2 Group Actions and the Sylow Theorems    86
       Exercises    106
       Glossary    109
   2.3 Some Additional Topics    109
       Exercises    123
       Glossary    123

3. Rings and Fields    124
   3.1 Basic Facts    124
       Exercises    136
       Glossary    138
   3.2 Introduction to Polynomial Rings    139
       Exercises    170
       Glossary    172
   3.3 Unique Factorization Domains    173
       Exercises    190
       Glossary    193
   3.4 Polynomial Factorization    194
       Exercises    216
       Glossary    224
   3.5 Polynomials and Resultants    225
       Exercises    255
       Glossary    257
   3.6 Applications to Geometric Constructions    257
       Exercises    263
       Glossary    264
   3.7 Galois Theory    264
       Exercises    282
       Glossary    285

4. Modules and Linear Operators    287
   4.1 The Hermite Normal Form    287
       Exercises    316
       Glossary    319
   4.2 The Smith Normal Form    320
       Exercises    344
       Glossary    350
   4.3 The Structure of Linear Transformations    351
       Exercises    381
       Glossary    396
   4.4 Introduction to Multilinear Algebra    396
       Exercises    411
       Glossary    412

5. Representations of Groups    413
   5.1 Representations    413
       Exercises    435
       Glossary    439
   5.2 Matrix Representations    440
       Exercises    474
       Glossary    477

Index    479

Preface

This book is written as a text for a basic one-year course in algebra at the advanced undergraduate or beginning graduate level. The presentation is oriented toward the applications of algebra to other branches of mathematics and to science in general. This point of view is reflected in the selection of constructive methods of proof, the choice of topics, and the space devoted to items related to current applications of algebra. Thus, modules over a principal ideal domain are studied via elementary operations on matrices. Considerable space is devoted to such topics as permutation groups and the Polya counting theory; polynomial theory; canonical forms for matrices; applications of linear algebra to differential equations; and representations of groups. Prerequisites for a course based on this book are minimal: standard one-quarter courses in the theory of equations and elementary matrix theory suffice.

Altogether there are 390 exercises, and these constitute an integral part of the book. Problems that require somewhat intricate arguments are accompanied by complete solutions. The exercises contain a number of important results and several definitions. Occasionally they are used to remove technical arguments from the mainstream of a proof within the section. Students should at least read the exercises. Frequently exercises appearing at the end of a section are mentioned within the section so that they can be easily assigned at the appropriate time.

We have not hesitated to reiterate definitions and results throughout the book. For example, conjugacy classes are discussed in the chapter on group theory and again in the chapter on group representation theory. Moreover, some arguments are repeated if they are separated from their last occurrence by a substantial amount of intervening material. Each section of the


book is followed by a Glossary which contains the page numbers for important definitions, “name” theorems, and special notations.

What follows is a description of the contents of each of the chapters. A diagram illustrating the interdependency of the various sections appears after the Preface.

Chapter 1, Basic Structures, introduces many of the basic ideas that occur later in the book. Section 1.1, Sets and Functions, contains the usual material on sets, functions, the de Morgan formulas, function composition, Cartesian products, equivalence relations, quotient sets, systems of distinct representatives, universal properties, partial and linear orderings, etc. Towards the end of the section, the Axiom of Choice is discussed in a heuristic way. The equivalence of Zorn's lemma and the Axiom of Choice is mentioned without going into much detail.

Section 1.2, Algebraic Structures, introduces in order of increasing complexity some of the basic items developed in the remainder of the book. Thus groupoids, semigroups, monoids, groups, modules, vector spaces, algebras, and matrices appear here. In this section we also define permutation groups, free monoids, groupoid rings, polynomial rings, formal power series, etc. The section contains an extensive list of elementary examples illustrating the definitions. The student can obtain considerable practice in the manipulation of these basic ideas in the exercise sections. Categories and morphisms appear in the exercises, but only peripherally.

Section 1.3, Permutation Groups, examines the details of permutation groups and their structure. The basic properties of permutations (including cycle structure and the simplicity of the alternating group of degree n for n ≥ 5) appear here. Many of the basic ideas of group theory are illustrated in Section 1.3 in the context of permutation groups.

Chapter 2, Groups, is a rather thorough study of most of the major elementary theorems in group theory.
Section 2.1, Isomorphism Theorems, carries the reader through the Jordan-Hölder theorem, properties of solvable groups, and composition series. Section 2.2, Group Actions and the Sylow Theorems, is devoted to a systematic study of the three major Sylow theorems. Since this section is highly combinatorial in nature, it seemed appropriate to include the Polya counting theorem and some interesting combinatorial applications.

Section 2.3, Some Additional Topics, contains a number of more advanced items in group theory, beginning with the Zassenhaus isomorphism lemma for groups. We then develop the Schreier refinement theorem for subnormal series of a (not necessarily finite) group. This section also includes the notion of a group with operators, admissible subgroups, and linear maps on vector spaces.

Chapter 3, Rings and Fields, is the longest chapter in the book. Section


3.1, Basic Facts, covers ring characteristics, universal factorization properties of quotient rings, and the three ring isomorphism theorems.

Section 3.2, Introduction to Polynomial Rings, shows how an indeterminate can be constructed over an arbitrary ring. The ring extension theorem is proved and used here to imbed a ring in a ring with an indeterminate. This section also contains material on polynomials in several variables, including the basis theorem for symmetric polynomials. Ascending and descending chain conditions for ideals in a ring are discussed, and the Hilbert basis theorem for Noetherian rings is proved. Quotient fields of integral domains, and more generally, rings of fractions with respect to ideals appear toward the end of the section.

Section 3.3, Unique Factorization Domains, starts with the usual material on polynomial division, the division algorithm, and the remainder theorem. The division algorithm is proved over a noncommutative ring; it is required later in the study of elementary divisors over matrix rings. The basic fact that any principal ideal domain is a unique factorization domain is proved in Theorem 3.6. Example 6 shows how to calculate the greatest common divisor of two Gaussian integers using the Euclidean algorithm. Nilradicals, quotients of ideals, and the Jacobson radical all occur at the end of this section.

Section 3.4, Polynomial Factorization, begins in the standard way with Gauss' lemma and goes on to show that unique factorization is inherited by the polynomial ring over a unique factorization domain. Considerable space is devoted here to the practical problem of factoring polynomials. Theorem 4.7 shows how to construct a splitting field for a polynomial, and Theorem 4.13 exhibits the relationship between any two such splitting fields. The section concludes with a proof of the primitive element theorem for fields of characteristic zero. The exercises in this chapter contain a good deal of material, but detailed solutions are included for all but the most routine problems.

Section 3.5, Polynomials and Resultants, deals with the classical theory of polynomials. Sylvester's determinant, homogeneous polynomials, resultants, and discriminants appear here, and the fundamental question of when two polynomials have a common factor is investigated in some detail. This section concludes with a statement and proof of the Hilbert invariant theorem and a discussion of algebraic independence.

Section 3.6, Applications to Geometric Constructions, applies the preceding material on field theory to problems of ruler and compass construction of regular polygons and angle trisection.

Section 3.7, Galois Theory, is devoted to the proof of the fundamental

theorem of Galois theory for fields of characteristic zero and its application


to the classical problem of the solvability of a general polynomial of degree n by radicals.

Chapter 4, Modules and Linear Operators, begins in Section 4.1, The Hermite Normal Form, with the derivation of a normal form under left equivalence of matrices over a principal ideal domain. This theorem is then applied to finitely generated modules, yielding many of the standard results in module theory as easy consequences. The Hermite normal form is also useful in showing how to compute generators for ideals in a matrix ring. This section also contains the basic theory of finite dimensional vector spaces, the Steinitz exchange theorem, and the theory of linear equations.

Section 4.2, The Smith Normal Form, shows how to compute canonical forms for matrices under right and left equivalence over a principal ideal domain. The Smith form is then used to analyze the structure of finitely generated modules as direct sums of free submodules. The fundamental theorem of abelian groups appears in Corollary 7. We then determine all low-order abelian groups in some of the examples and exercises. The cyclic primary decomposition of a module is carried out in the exercises, together with an analysis of finitely generated abelian groups given in terms of certain defining relations, i.e., group presentations.

Section 4.3, The Structure of Linear Transformations, develops the standard elementary divisor theory and matrix canonical forms over a field. Our approach is computational, and the canonical forms under similarity are derived in terms of the reduction of the characteristic matrix via equivalence over a polynomial ring. Most of the important normal forms for matrices under similarity occur here, e.g., the Frobenius normal form and the Jordan normal form. A considerable part of the section deals with the problem of computing the elementary divisors of a function of a matrix. These important results are used in other parts of mathematics, e.g., the theory of ordinary differential equations.

In the last section of the chapter, Introduction to Multilinear Algebra, we introduce symmetry classes of tensors and briefly study the tensor, Grassmann, and completely symmetric spaces. As an example of the use of an inner product in a symmetry class, we show how the famous van der Waerden conjecture concerning doubly stochastic matrices can be partially resolved.

The fifth and final chapter of the book, Representations of Groups, is essentially self-contained and could be used for a short course on group representation theory. The major part of the chapter is concerned with matrix representations of finite groups. This permits us to achieve deep penetration of the subject rather rapidly.

The contents of a course in algebra vary considerably and seem to depend more on individual tastes and prejudices than do corresponding courses in analysis. The present book is no exception. However, a good deal


of the material included can be justified in terms of its applications to other parts of mathematics and science. We anticipate that a student who gains

reasonable mastery of the contents will be ready for more advanced courses in algebra and the applications of algebra to a wide range of fields, e.g., computer science, control theory, algebraic coding theory, system theory, numerical linear algebra, quantum mechanics, and crystallography.

References

Each chapter is divided into sections. Thus Section 4.2 is Section 2 of Chapter 4. Definitions, theorems, and examples are numbered serially within a section. Thus Theorem 1.4 is the fourth theorem in Section 1. Any reference to a definition, theorem, or example within the chapter in which it appears does not identify the chapter. Any reference to a definition, theorem, or example occurring in another chapter includes the chapter and section number. The symbol ∎ is used to denote the end of a proof.

Acknowledgment

The author is pleased to acknowledge the assistance of Dr. Ivan Filippenko

in reading the original manuscript and providing invaluable help in proofreading the printed copy.

Marvin Marcus

[Chart: Dependence of Sections. A diagram showing which sections of the book depend on which, from 1.1 Sets and Functions through 5.2 Matrix Representations.]

INTRODUCTION TO MODERN ALGEBRA

Basic Structures

1.1

Sets and Functions

We shall assume that the reader is familiar with the notion of a set or collection of objects. The purpose of this section is to set forth the notation and language used throughout this book. If S is a set and x is a member or an element of S, we write x ∈ S; if x is not an element of S, we write x ∉ S.

If X is a set consisting of all elements x for which a certain proposition p(x) is true, we write

X = {x | p(x)}.

Thus, for example,

{x | x is an integer and 1 ≤ x < 5}

is the set consisting of the integers 1, 2, 3, and 4. It is often feasible to explicitly write out the elements of a set, e.g.,

X = {2,4,6}    (1)

means that X consists of the numbers 2, 4, and 6. The curly bracket notation in (1) is usually reserved for finite sets, but sometimes infinite sets can be written this way by use of the ubiquitous “triple dot” notation, e.g., N = {1,2,3, . . .} is the set of positive integers.
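As an aside, both notations have direct analogues in most programming languages; a minimal sketch in Python (the variable names are ours):

```python
# Set-builder notation {x | x is an integer and 1 <= x < 5}
# corresponds to a comprehension over a range of candidates.
builder = {x for x in range(-100, 100) if 1 <= x < 5}

# Roster notation X = {2, 4, 6} corresponds to a set literal.
roster = {2, 4, 6}

assert builder == {1, 2, 3, 4}
assert 2 in roster and 5 not in roster
```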


If every element of the set X is in the set Y, we write X ⊂ Y, or Y ⊃ X, and call X a subset of Y. If X ⊂ Y but X ≠ Y, then X is a proper subset of Y. The empty set or null set, denoted by ∅, is the set with no elements; clearly, ∅ ⊂ X for any X.

The power set of a set X is the set of all subsets of X. It is denoted by P(X):

P(X) = {Y | Y ⊂ X}.

If X contains only finitely many elements, we denote the number of elements in X by |X|. It is an easy exercise to verify that for a finite set X,

|P(X)| = 2^|X|    (2)

(See Exercise 1.) For example, if X is the set (1), then the elements of P(X) are the eight subsets

∅,  {2},  {4},  {6},  {2,4},  {2,6},  {4,6},  X.
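Formula (2) is easy to confirm by enumeration for a small set; a sketch in Python (the helper name `power_set` is ours, not the book's notation):

```python
from itertools import combinations

def power_set(xs):
    """Return P(xs): every subset of the finite set xs, as frozensets."""
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1)
            for c in combinations(xs, r)]

X = {2, 4, 6}
P = power_set(X)

assert len(P) == 2 ** len(X) == 8      # formula (2): |P(X)| = 2^|X|
assert frozenset() in P                # the empty set is a subset of X
assert frozenset(X) in P               # as is X itself
```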

The union of two sets X and Y is the set of elements in either X or Y and is denoted by

X ∪ Y.

The intersection of X and Y is the totality of elements in both X and Y and is denoted by

X ∩ Y.

We say that I is an indexing set or labeling set for a family of sets if to each


element of I there corresponds a well-defined set Xᵢ in the family. The union and intersection of a family of sets indexed by I are written, respectively,

∪_{i∈I} Xᵢ    and    ∩_{i∈I} Xᵢ.

Thus, x ∈ ∪_{i∈I} Xᵢ means that x ∈ Xᵢ for some i ∈ I, whereas x ∈ ∩_{i∈I} Xᵢ means that x ∈ Xᵢ for every i ∈ I. For example, if I = N and Xᵢ is the closed interval on the real line consisting of all x such that 1/i ≤ x ≤ 1, then

∪_{i∈I} Xᵢ = {x | 0 < x ≤ 1}    and    ∩_{i∈I} Xᵢ = {1}.

A function f: X → Y is injective (1-1) or an injection if f(x₁) = f(x₂) implies x₁ = x₂; it is surjective (onto) or a surjection if f(X) = Y; it is bijective (1-1, onto) or a bijection if f is injective and surjective. Other words are monomorphic (injective); epimorphic (surjective); a matching (bijective) of X and Y.



[Figure: an injective map]
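The three definitions can be tested mechanically on finite restrictions of the maps used in the examples below, f(n) = 2n and g(n) = (−1)ⁿ; a sketch in Python (the truncation of N to a finite range is our device for checking; it illustrates, but of course does not prove, the properties on all of N):

```python
# f(n) = 2n and g(n) = (-1)**n, restricted to a finite piece of N.
domain = range(1, 101)

f_vals = [2 * n for n in domain]
assert len(set(f_vals)) == len(f_vals)   # no collisions: f is injective
assert 1 not in set(f_vals)              # f: N -> N misses every odd
                                         # number, so it is not surjective

g_vals = {(-1) ** n for n in domain}
assert g_vals == {1, -1}                 # g is surjective onto {1, -1}
```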

For example, the map f: N → N defined by the formula

f(n) = 2n

is an injection but certainly not a surjection. However, if by 2N we mean the set of positive even integers, then f: N → 2N is a bijection. If g: N → {1,−1} is defined by g(n) = (−1)ⁿ, then g is an epimorphism or a surjection, and

g⁻¹({1}) = 2N,    g⁻¹({−1}) = N − 2N.

Two functions f: X → Y and g: X → Y are equal if and only if (hereafter abbreviated “iff”)

f(x) = g(x)    for all x ∈ X.

The composite of two functions f: X → Y and g: Y → Z is the map h: X → Z whose value at any x ∈ X is h(x) = g(f(x)). The composite of f and g is written gf or g · f. The composition of maps is associative: If f: X → Y, g: Y → Z, and h: Z → W, then h · (g · f) = (h · g) · f (verify!). Thus the triple composite may simply be denoted by h · g · f.

If {Yᵢ | i ∈ I} is a family of sets, then the cartesian product of the


members of the family is the set of all functions f: I → ∪_{i∈I} Yᵢ such that f(i) ∈ Yᵢ for each i ∈ I. The cartesian product is denoted by

×_{i∈I} Yᵢ = {f | f: I → ∪_{i∈I} Yᵢ and f(i) ∈ Yᵢ for each i ∈ I}.

If {Y₁, . . . , Yₙ} is a family of n sets, their cartesian product is also written Y₁ × · · · × Yₙ and can be thought of as the totality of ordered n-tuples f = (y₁, . . . , yₙ), yᵢ ∈ Yᵢ, i = 1, . . . , n; that is, f(i) = yᵢ, i = 1, . . . , n. Two n-tuples (y₁, . . . , yₙ) and (z₁, . . . , zₙ) are equal iff yᵢ = zᵢ, i = 1, . . . , n.

Suppose I = [0,1], i.e., I is the closed interval on the real line consisting of all x for which 0 ≤ x ≤ 1. For each i ∈ I let Yᵢ = I. We assert that the cartesian product ×_{i∈I} Yᵢ is in fact the set of all maps from I to I, i.e.,

I^I = ×_{i∈I} Yᵢ.

For, any f ∈ ×_{i∈I} Yᵢ is a function whose value at each i ∈ I is an element of Yᵢ = I.

The special map ι_X: X → X, called the identity map, is defined by

ι_X(x) = x

for each x ∈ X. If Z ⊂ X and f: X → Y, then f|Z is the function whose domain is Z and whose value for each z ∈ Z is f(z); f|Z is called the restriction of f to Z, and f is called an extension of f|Z. If Z ⊂ X, then the map ι_X|Z is called the canonical injection of Z into X.

Compositions of maps are often depicted by mapping diagrams; for example,

[Mapping diagram: a square of maps f, g, h, k]    (5)


indicates that g · f = k · h. Diagrams showing the equality of compositions of sequences of maps, such as (5) and (6), are called commutative diagrams.

If f: X → Y, g: Y → X, and gf = ι_X, then g is a left inverse of f; if fg = ι_Y, then g is a right inverse of f. If g is a left and right inverse of f, then it is an inverse of f. (See Exercise 3.)

Theorem 1.1  Assume f: X → Y. Then

(i) f is injective iff it has a left inverse.
(ii) f is surjective iff it has a right inverse.

(iii) If f has a left inverse g and a right inverse h, then g = h.
(iv) f is bijective iff it has an inverse.
(v) If f has an inverse, it is unique and is denoted by f⁻¹.

(vi) If f has an inverse, then (f⁻¹)⁻¹ = f.

Proof: (i) If f has a left inverse g: Y → X, then f(x₁) = f(x₂) implies that g(f(x₁)) = g(f(x₂)) and hence that

x₁ = ι_X(x₁) = (g · f)(x₁) = g(f(x₁)) = g(f(x₂)) = (g · f)(x₂) = ι_X(x₂) = x₂.

Hence f is injective. Conversely, if f is injective, then for each y ∈ f(X) there is exactly one element in X, call it x_y ∈ X, such that f(x_y) = y; define g on f(X) by g(y) = x_y. For any other z ∈ Y, let g(z) = x₀, some fixed element in X. Obviously (g · f)(x) = g(f(x)) = x = ι_X(x) for all x ∈ X, so g is a left inverse of f.


(ii) If f: X → Y is surjective, then f(X) = Y. Let g: Y → X be defined as follows: For each y ∈ Y choose an x_y ∈ f⁻¹({y}) and let g(y) = x_y. Then (f · g)(y) = f(g(y)) = f(x_y) = y = ι_Y(y), i.e., f · g = ι_Y. Hence f has a right inverse. Conversely, if g: Y → X is a right inverse of f and y ∈ Y, then y = ι_Y(y) = (f · g)(y) = f(g(y)) and hence y ∈ im f. Thus f is surjective.


(iii) If y ∈ Y, then g(y) = g(ι_Y(y)) = g((f · h)(y)) = (g · f)(h(y)) = ι_X(h(y)) = h(y). Hence g = h.

(iv) This follows immediately from (i), (ii), and (iii).

(v) If f has an inverse, it is a right and left inverse and by (iii) it is uniquely determined.

(vi) Let g = f⁻¹. Then g · f = ι_X and f · g = ι_Y, so that f is a left and right inverse for g. Thus by (v), g⁻¹ = f; that is, (f⁻¹)⁻¹ = f. ∎

In the proof of Theorem 1.1(ii) we were required to make (in general) an infinite number of choices of an element x_y ∈ f⁻¹({y}). At the end of this section we shall discuss the Axiom of Choice, which deals with the justification for this and similar arguments.
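For maps between finite sets, the constructions in the proof of Theorem 1.1 can be carried out explicitly, with no appeal to the Axiom of Choice; a sketch in Python (function and variable names are ours):

```python
def left_inverse(f, X, Y, x0):
    """Theorem 1.1(i): for injective f: X -> Y (given as a dict), build
    g: Y -> X with g(f(x)) = x; points outside f(X) are sent to x0."""
    g = {y: x0 for y in Y}
    for x in X:
        g[f[x]] = x
    return g

def right_inverse(f, X):
    """Theorem 1.1(ii): for surjective f: X -> Y, choose for each y some
    x_y in the preimage f^{-1}({y}) and set h(y) = x_y."""
    h = {}
    for x in X:
        h.setdefault(f[x], x)   # keep the first x found for each y
    return h

X, Y = [1, 2, 3], ['a', 'b', 'c', 'd']
f = {1: 'b', 2: 'd', 3: 'a'}             # injective but not surjective
g = left_inverse(f, X, Y, x0=1)
assert all(g[f[x]] == x for x in X)      # g . f = identity on X

s = {1: 'a', 2: 'a', 3: 'b'}             # surjective onto {'a', 'b'}
h = right_inverse(s, [1, 2, 3])
assert all(s[h[y]] == y for y in ['a', 'b'])   # s . h = identity on Y
```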

The set of all mappings from {1,2} to a (nonempty) set X is called the cartesian square of X and denoted by X² (or X × X), i.e.,

X² = {f | f: {1,2} → X}.

We can, of course, represent any f ∈ X² as f = (x₁,x₂), where x₁ = f(1) and x₂ = f(2); (x₁,x₂) is called an ordered pair. If D = {(x,x) | x ∈ X} ⊂ X², then D is called the diagonal of X². Any subset R ⊂ X² is called a relation on

X. If R is a relation on X, then R is called an equivalence relation on X if the following three conditions are satisfied:

D ⊂ R    (7)

(x₁,x₂) ∈ R implies (x₂,x₁) ∈ R    (8)

(x₁,x₂) ∈ R and (x₂,x₃) ∈ R implies (x₁,x₃) ∈ R.    (9)


Properties (7), (8), and (9) are known as the reflexive, symmetric, and transitive properties of R, respectively. If x ∈ X, then R(x) is the set

R(x) = {y | y ∈ X and (x,y) ∈ R}

and is called the R-equivalence class containing x. The set of all R-equivalence classes is denoted by X/R and is called the factor set or quotient set of X modulo R. It is easy to see that if R is an equivalence relation on X, then the family of all R-equivalence classes forms a partition of X. We leave as an exercise the easy verification that either R(x) ∩ R(y) = ∅ or R(x) = R(y) (see Exercise 10). On the other hand, it is quite simple to see that if X = ∪_{i∈I} Xᵢ is a partition of X and R = {(x,y) | x and y belong to the same Xᵢ}, then R is an equivalence relation on X (see Exercise 11).

A system of distinct representatives (abbreviated S.D.R.), or a transversal

for R, is a set T that contains precisely one element from each of the R-equivalence classes. The concept of an S.D.R. can be extended to a more general situation. Thus let 𝒜 = {Xᵢ | i ∈ I} be a family of subsets of X (not necessarily pairwise disjoint or even distinct) indexed by an indexing set I. Let T: I → ∪_{i∈I} Xᵢ be an injection satisfying

T(i) ∈ Xᵢ,    i ∈ I.

Then T is called an S.D.R. or a transversal for 𝒜. For example, suppose

X = {1,2,3,4} and the family 𝒜 ⊂ P(X) consists of the subsets

X₁ = {1,2,4},    X₂ = {1,3},    X₃ = {1,3,4},    X₄ = {3,4}.
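For a family this small, the transversals can also be enumerated by brute force; a sketch in Python (the brute-force method is our addition; the text obtains the same count via an incidence matrix):

```python
from itertools import permutations

# The family A = {X1, X2, X3, X4} of subsets of X = {1, 2, 3, 4}.
family = [{1, 2, 4}, {1, 3}, {1, 3, 4}, {3, 4}]

# A transversal is an injective choice T with T(i) in X_i; since here
# |I| = |X| = 4, the injective choice functions are exactly the
# permutations of (1, 2, 3, 4).
transversals = [p for p in permutations((1, 2, 3, 4))
                if all(p[i] in family[i] for i in range(4))]

assert len(transversals) == 3
```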

Question: How many transversals are there for 𝒜? We can answer this rather easily by constructing an incidence matrix as follows. Write a 4 × 4 array

          1  2  3  4
     X₁   1  1  0  1
M =  X₂   1  0  1  0        (10)
     X₃   1  0  1  1
     X₄   0  0  1  1

in which a 1 or 0 is entered as the (i,j) entry according as xⱼ ∈ Xᵢ or xⱼ ∉ Xᵢ, respectively. A transversal is then defined by a set of four nonzero entries in M, precisely one from each row and each column. Since the entries in M are either 1 or 0, it is clear that there is a one-to-one correspondence between the


nonzero terms in the determinant expansion of M and the set of transversals. The reader can easily verify that the total number of such nonzero terms in the determinant is 3 (see Exercise 15). Hence there are three transversals for the family 𝒜.

If R is an equivalence relation on X, then the mapping ν: X → X/R defined by

ν(x) = R(x)    for x ∈ X

is called the natural map induced by R. For example, suppose X = Z, the set of all integers, and let p be a fixed positive integer. Define

R = {(m,n) | m − n is divisible by p}.    (11)
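The three defining properties of an equivalence relation can be checked mechanically for (11) on a finite window of Z; a sketch in Python (the choice p = 3 and the window are ours, purely for illustration):

```python
# Relation (11) with p = 3, restricted to the window {-10, ..., 10}.
p = 3
window = range(-10, 11)

def related(m, n):
    return (m - n) % p == 0

assert all(related(m, m) for m in window)                       # reflexive
assert all(related(n, m)
           for m in window for n in window if related(m, n))    # symmetric
assert all(related(m, k)
           for m in window for n in window for k in window
           if related(m, n) and related(n, k))                  # transitive

# Within the window the R-equivalence classes are the residue classes
# mod p, so there are exactly p of them.
classes = {frozenset(m for m in window if related(m, n)) for n in window}
assert len(classes) == p
```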

Then R is easily seen to be an equivalence relation on Z (see Exercise 12). The value ν(n) = R(n) is the set of all integers which differ from n by a multiple of p.

There is an elementary but important result which shows that any function defines an equivalence relation in a very simple way.

Theorem 1.2

Let X and Y be nonempty sets and suppose f: X → Y. Define

R_f = {(x₁,x₂) | f(x₁) = f(x₂)}.

Then
(i) R_f is an equivalence relation on X.
(ii) If g: X → Z and R_f ⊂ R_g, then there exists a unique map ḡ: X/R_f → Z such that the diagram

[Diagram: ν: X → X/R_f, g: X → Z, ḡ: X/R_f → Z]    (12)

commutes. In (12), ν is the natural map induced by R_f. [In particular, there exists a unique map f̄: X/R_f → Y such that the diagram

[Diagram: ν: X → X/R_f, f: X → Y, f̄: X/R_f → Y]

commutes.]

Proof: (i) See Exercise 13.
(ii) Observe first that if R_f(x₁) = R_f(x₂), then (x₁,x₂) ∈ R_f ⊂ R_g, so that g(x₁) = g(x₂). We can thus define a function ḡ: X/R_f → Z by ḡ(R_f(x)) =


g(x). Of course R_f(x) = ν(x), so that ḡ · ν = g. Obviously the values of ḡ are completely determined by g. ∎

The result in Theorem 1.2 is called the factor theorem for maps.

Let X be a nonempty set. A relation R ⊂ X² is antisymmetric if (x₁,x₂)

∈ R and (x₂,x₁) ∈ R together imply x₁ = x₂. If R is transitive, reflexive, and antisymmetric, then R is called a partial ordering in X. If, for any two elements x₁ and x₂ in X, (x₁,x₂) ∈ R or (x₂,x₁) ∈ R, then the partial ordering R is called a complete or a linear ordering. If C ⊂ X and C is linearly ordered with respect to the partial ordering R in X, then C is called a chain in X. If R is a partial ordering in X, then an element a ∈ Y ⊂ X is a least element of Y if for every y ∈ Y, (a,y) ∈ R; similarly, an element b ∈ Y ⊂ X is a greatest element of Y if for every y ∈ Y, (y,b) ∈ R.

It is customary to denote a partial ordering by the symbol ≤ and to write (x₁,x₂) ∈ ≤ as x₁ ≤ x₂. Also, if x₁ ≤ x₂ and x₁ ≠ x₂, we write x₁ < x₂. Thus the definitions just given for a partial ordering read as follows:

(i) x ≤ x for every x ∈ X (reflexive),
(ii) x ≤ y and y ≤ z imply x ≤ z (transitive),
(iii) x ≤ y and y ≤ x imply x = y (antisymmetric),
(iv) if, in addition, x₁ ∈ X and x₂ ∈ X imply x₁ ≤ x₂ or x₂ ≤ x₁, then ≤ is a linear ordering.

An element a ∈ Y ⊂ X is a minimal element of Y if there is no x ∈ Y for which x < a; similarly, b ∈ Y ⊂ X is a maximal element of Y if there is no x ∈ Y for which x > b. It is obvious from condition (iii) that there can be at most one least (greatest)

element of a subset Y of a partially ordered set X (see Exercise 14).

The set of integers Z (i.e., positive, negative, and 0) is linearly ordered by the usual ordering: x₁ ≤ x₂ iff x₂ − x₁ is nonnegative. Let X be a set and P(X) its power set. Then the relation ⊂ is a partial ordering in P(X) but not a linear ordering; i.e., given two subsets of X, it is not generally true that one must be a subset of the other.

As an example let

X = {0,1,2,3,4},   R = {(0,0), (1,1), (2,2), (3,3), (4,4), (0,2), (1,2), (3,4)}.

1.1

Sets and Functions

11

We can construct a diagram of R:

Thus 0 ≤ 2 simply means that 0 is below 2 and they are connected by a line segment. Note that since neither 2 ≤ 3 nor 3 ≤ 2 holds, R is not a linear ordering. Also, 0, 1, and 3 are all minimal elements because, e.g., there is no x ∈ X for which x < 3. Similarly, 2 and 4 are maximal elements. The set X does not have a least or greatest element. As another example let

X = P({0,1,2}), and let (ascending) inclusion be the ordering. The elements of X are

∅,   α = {0},   β = {1},   γ = {2},   δ = {0,1},   μ = {0,2},   ω = {1,2},   τ = {0,1,2}.

The diagram that goes with this ordering is

The circle around α indicates that α ⊂ α, etc. The line going upward from α to μ means that α ⊂ μ, etc. Finally, since the inclusion relation is transitive, we need not connect α to τ by a separate line segment. Note that {∅, β, ω, τ} is a chain in X (identify some others), and that ∅ and τ are the least and greatest elements in X, respectively.
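Small poset examples of this kind can be checked mechanically. The sketch below (an illustration, not part of the text) represents the elements of P({0,1,2}) as Python frozensets, with inclusion as the partial ordering, and verifies the chain {∅, β, ω, τ} and the least and greatest elements.

```python
from itertools import combinations

# The power set of {0, 1, 2}, ordered by inclusion; frozensets stand in for
# the elements ∅, α, β, γ, δ, μ, ω, τ of the example above.
X = [frozenset(s) for r in range(4) for s in combinations({0, 1, 2}, r)]

def is_chain(c):
    """A subset c of X is a chain iff any two of its members are comparable."""
    return all(a <= b or b <= a for a in c for b in c)

least = [a for a in X if all(a <= b for b in X)]      # least element(s)
greatest = [b for b in X if all(a <= b for a in X)]   # greatest element(s)

# The chain {∅, β, ω, τ} from the text.
chain = [frozenset(), frozenset({1}), frozenset({1, 2}), frozenset({0, 1, 2})]
print(is_chain(chain))      # True
print(least, greatest)      # only ∅ is least, only {0, 1, 2} is greatest
```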


There is an important axiom in set theory known as Zorn's lemma which states:

If X is a partially ordered set and every chain C in X has an upper bound, then X contains at least one maximal element.

Zorn's lemma can be justified by the following heuristic argument. With the set X assumed to be nonempty, let α0 ∈ X. If α0 is not maximal in X, there exists α1 ∈ X such that α0 < α1. If α1 is not maximal in X, then there exists an α2 ∈ X such that α0 < α1 < α2. Either we obtain a maximal element or we construct an infinite chain α0 < α1 < α2 < ⋯.


Exercises 1.1

1. Prove formula (2).

2. Prove formulas (3) and (4).

3. Let Z be the set of integers and define f: Z → Z and g: Z → Z by f(n) = 2n and

   g(n) = n/2 if n is even,
   g(n) = 0 if n is odd.

   Then show that: (i) f is injective but not surjective; (ii) g is surjective but not injective; (iii) gf = i_Z, i.e., g is a left inverse of f; (iv) fg ≠ i_Z, i.e., g is not a right inverse of f.

4. Prove that if f: X → Y and g: Y → Z and f and g are surjective (injective, bijective), then so is gf.

5. Complete the details of the proof of Theorem 1.1(iv).

6. Show that if X is a set and G = {f | f: X → X, f a bijection}, then: (i) if f and g are in G, then f ∘ g ∈ G; (ii) if f, g, h are in G, then (f ∘ g) ∘ h = f ∘ (g ∘ h); (iii) i_X ∈ G; (iv) if f ∈ G, then f⁻¹ ∈ G.

7. Show that if {X_i | i ∈ I} is a family of subsets of X and f: X → X, then

   f(∪_{i∈I} X_i) = ∪_{i∈I} f(X_i),   f⁻¹(∪_{i∈I} X_i) = ∪_{i∈I} f⁻¹(X_i),

   and

   f⁻¹(∩_{i∈I} X_i) = ∩_{i∈I} f⁻¹(X_i).

8. Let {X_i | i ∈ I} be a family of subsets of U and for a fixed k ∈ I let p_k: ×_{i∈I} X_i → X_k be defined by p_k(f) = f(k). Then p_k is called the projection on X_k. Take X_i = X, i ∈ I, and let d: X → ×_{i∈I} X be defined by d(x) = f, where f(i) = x for all i ∈ I. Then d is called the diagonal injection. Prove that p_k ∘ d = i_X for all k ∈ I.

9. Let X1, . . . , Xn be subsets of U. Show that if 1 ≤ k ≤ n, then there is a bijection

   (×_{i=1}^{k} X_i) × (×_{i=k+1}^{n} X_i) → ×_{i=1}^{n} X_i.

10. Prove that if R is an equivalence relation, then the R-equivalence classes form a partition of X.

11. Prove that if X = ∪_{i∈I} X_i is a partition of X and R = {(x,y) | x and y belong to the same X_i}, then R is an equivalence relation on X.

12. Prove that R as defined in (11) is an equivalence relation on Z.

13. Prove Theorem 1.2(i).

14. Prove that a least (greatest) element in a partially ordered set is unique, if it exists. Hint: Use antisymmetry.

15. Verify that the number of nonzero terms in the determinant expansion of M in (10) is 3.

16. Give an example of a partially ordered set which contains minimal elements but


no least element. Hint: Let X = P({0,1}) − {∅} with inclusion as the ordering. Then {0} and {1} are minimal elements but neither is least.

17. Show that if (x,y) is defined to be the set {x, {x,y}}, then (x,y) = (u,v) iff x = u and y = v. Hint: First note that x ≠ {x,y}; otherwise x ∈ x, contradicting the regularity axiom. Then from {x, {x,y}} = {u, {u,v}}, either (a) x = u and {x,y} = {u,v}, or (b) x = {u,v} and {x,y} = u. Clearly (b) is impossible; otherwise x ∈ u ∈ x. Hence (a) holds; so x = u, and {x,y} = {u,v} becomes {u,y} = {u,v}. If u = v, then {u,v} = {u} and y ∈ {u}, i.e., y = u = v. If u ≠ v, then y ≠ u because |{u,v}| = 2 = |{u,y}|. Thus, y ∈ {u,v} and y ≠ u imply y = v.

Glossary 1.1

antisymmetric, 10
Axiom of Choice, 12
bijection, 4
canonical injection, 5
Cartesian product, 4; ×_{i∈I} Y_i, 5
Cartesian square, 7; X², 7
chain, 10
codomain, 4
commutative diagrams, 6
complement, 3; Y − X, 3; Xᶜ, 3
composite, 4
De Morgan formulas, 3
diagonal, 7
diagonal injection, 14
domain, 4; dmn f, 4
element, 1
empty set, 2; ∅, 2
epimorphic, 4
equivalence relation, 7
factor set, 8; X/R, 8
factor theorem for maps, 10
function, 3; f: X → Y, 3
greatest element, 10
identity, 5
identity map, 5; i_X, 5
image, 3; f(X), 4; im f, 4
incidence matrix, 8
inclusion, 2; X ⊂ Y, 2
indexing set, 2
injection, 4
intersection, 2; X ∩ Y, 2
inverse, 6
inverse image, 4; f⁻¹(Z), 4
least element, 10
left inverse, 6
linear ordering, 10
lower bound, 10
map, 3
mapping, 3
mapping diagrams, 5
matching, 4
maximal element, 10
member, 1
minimal element, 10
monomorphic, 4
N, 1
natural map induced by R, 9
number of elements, 2; |X|, 2
ordered pair, 7
partial ordering, 10
partition, 3
power set, 2; P(X), 2
projection, 14
quotient set, 8
range, 4
reflexive, 8
regularity axiom, 13
relation, 7
restriction, 5; f|Z, 5
right inverse, 6
set notation, 1
subset, 2
surjection, 4
symmetric, 8
system of distinct representatives, 8; S.D.R., 8
transitive, 8
transversal, 8
union, 2; X ∪ Y, 2
upper bound, 10
value of a function, 3
Zorn's lemma, 12

1.2 Algebraic Structures

Let S be a nonempty set. If n ∈ N, let Sⁿ = S × ⋯ × S (n factors) be the cartesian nth power of S, and let S⁰ = ∅. An n-ary operation on S is simply a function ω: Sⁿ → S. We agree that a 0-ary operation ω is just some fixed element of S. A 1-ary operation is called unary, and a 2-ary operation is called binary. Let Ω be a collection of n-ary operations on S (n may vary). Then the set S together with the operations Ω is called an Ω-algebra. The set S is called the carrier of Ω, and the Ω-algebra is often written (S,Ω). If Ω = {ω} and ω is binary, then (S,Ω) is called a groupoid. If we write ω(x,y) = x · y (it is convenient to use juxtaposition for the values of ω, sometimes with a dot between x and y), it is natural to call xy the product of x and y and simply refer to "the groupoid S." Thus a groupoid is a nonempty set equipped with a binary operation whose values are in S; i.e., S is closed under the operation.

If S is a groupoid and for all x, y, and z in S the associative law

x(yz) = (xy)z        (1)

holds, then S is called a semigroup. If a groupoid S possesses an element e such that

ex = xe = x        (2)

for all x ∈ S, then e is called a neutral or identity element. A semigroup with an identity is called a monoid. Obviously a monoid can have only one identity, for if e and e′ are identities, then e = ee′ = e′ (why?).


If S is a monoid and x ∈ S, then y is an inverse of x if

xy = yx = e.        (3)

Obviously, if y′ is also an inverse of x, then

y′ = y′e = y′(xy) = (y′x)y = ey = y.

Hence an element of a monoid can have at most one inverse. A monoid in which every element has an inverse is called a group. Summarizing: A groupoid S is simply a set with a binary operation taking on values in S; a semigroup is a groupoid in which the operation is associative; a monoid is a semigroup with an identity; a group is a monoid in which every element has an inverse.

GROUP ⊂ MONOID ⊂ SEMIGROUP ⊂ GROUPOID

Groupoid + associativity = semigroup
Semigroup + identity = monoid
Monoid + inverses = group.

For example, if Z denotes the set of integers

Z = {0, ±1, ±2, ±3, ±4, . . .}

and the operation is subtraction, then (Z, {−}) is a groupoid, but it is certainly not a semigroup:

(a − b) − c = a − (b − c)

can only hold for c = 0. In general, a binary operation ω on S is abelian or commutative if

ω(x,y) = ω(y,x)

for all x and y in S. Let Ω = {α, β} with α and β binary operations on the carrier S. If

(i) (S, {α}) is a group and α is abelian,
(ii) (S, {β}) is a groupoid,
(iii) β(α(x,y), z) = α(β(x,z), β(y,z))        (4)

and

β(z, α(x,y)) = α(β(z,x), β(z,y))        (5)

for all x, y, and z in S, then the Ω-algebra (S,Ω) is called a ring. Usually α(x,y) is written x + y and β(x,y) is written x · y or simply xy, with the obvious names of addition and multiplication, respectively. The formulas (4) and (5) then read

(x + y)z = xz + yz        (6)

and

z(x + y) = zx + zy,        (7)


respectively, and are called the distributive laws. The identity of (S, {α}) is usually written 0 and is called the zero of the ring. If (S, {β}) possesses an identity, then we denote it by 1 and the ring is said to have a multiplicative identity. If β is commutative, then (S,Ω) is called a commutative ring. If (S, {β}) is a semigroup, then (S,Ω) is called an associative ring. If (S,Ω) is a ring and xy = 0, x ≠ 0, y ≠ 0, then x and y are called divisors of 0 (or zero divisors). A commutative associative ring (S,Ω) with multiplicative identity 1 ≠ 0 and which possesses no divisors of 0 is called an integral domain.

An associative ring (S,Ω) in which (S − {0}, {β}) is a group is called a division ring or a skew-field. A division ring in which the multiplication is commutative is called a field. Summarizing:

A ring is a set S with two binary operations +, - such

that (S, {+}) is an abelian group, (S, {·}) is a groupoid, and the distributive laws (6) and (7) are satisfied; an integral domain is an associative commutative ring with at least two elements, having a multiplicative identity and no divisors of 0; a division ring is an associative ring in which the nonzero elements form a group with respect to the multiplication operation; a field is a division ring in which the multiplication is abelian. If S = 2Z denotes the set of even integers with the usual multiplication and addition, then S is clearly a commutative, associative ring without a

multiplicative identity. The ring of integers Z with the usual operations is an integral domain. The set Q of all rational numbers p/q (p, q ∈ Z, q ≠ 0) is a field with respect to the usual operations. Similarly R and C, the real and complex numbers, respectively, are fields with respect to the usual operations of addition and multiplication. As a further example, let X = {0, 1} and S = P(X). Define ω(U, V) = U · V = U − V = U ∩ Vᶜ. Then (S, {ω}) is a finite groupoid, but it is not a semigroup. For, take U = V = W = X. Then

(UV)W = (U ∩ Vᶜ) ∩ Wᶜ = U ∩ Vᶜ ∩ Wᶜ = ∅,

whereas

U(VW) = U ∩ (VW)ᶜ = U ∩ (V ∩ Wᶜ)ᶜ = U ∩ (Vᶜ ∪ W) = X ∩ (∅ ∪ X) = X ≠ ∅.

Thus associativity fails.
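This failure of associativity is easy to confirm directly; the following sketch (an illustration, not part of the text) repeats the computation with Python sets, where the operation ω is ordinary set difference.

```python
# The groupoid (P({0,1}), −) from the text: with U = V = W = X = {0, 1},
# (U − V) − W = ∅ while U − (V − W) = X, so the operation is not associative.
X = {0, 1}

def w(U, V):
    """The binary operation ω(U, V) = U − V = U ∩ Vᶜ (set difference)."""
    return U - V

U = V = W = set(X)
left = w(w(U, V), W)     # (UV)W
right = w(U, w(V, W))    # U(VW)
print(left, right)       # set() {0, 1}
```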

Let (M, {+}) be an abelian group and let (R, {+, ·}) be an associative ring. Let μ: R × M → M satisfy

μ(r, m1 + m2) = μ(r, m1) + μ(r, m2),        (8)
μ(r1 + r2, m) = μ(r1, m) + μ(r2, m),        (9)
μ(r1r2, m) = μ(r1, μ(r2, m)),        (10)


for all r, r1, r2 in R and m, m1, m2 in M. Then (M, {+}) taken together with (R, {+, ·}) is called a left R-module. Usually we say simply that M is a left R-module. If R has a multiplicative identity 1 and

μ(1, m) = m        (11)

for all m ∈ M, then M is called a unital R-module. Some standard notational confusions: the big + in M and the small + in R are both written (small) +. The value of μ(r, m) is simply written rm, and we say that M is an R-module (the word "left" being understood). Thus (8) to (11) can be written

r(m1 + m2) = rm1 + rm2,
(r1 + r2)m = r1m + r2m,
(r1r2)m = r1(r2m),
1m = m.

If M is a unital R-module and R is a field, then M is called a vector space over R, the elements of M are called vectors, and the elements of R are called scalars. The value rm is called the scalar product of m by r. A submodule N of a (left) module M is a subset of M which is a (left) module over R in which the addition in N is the same as that in M and the

scalar multiplication by elements of the ring R is also precisely the same as it is in M. The following are some interesting elementary examples of the preceding structures.

Example 1 (Multiples of Z) If Z is the ring of integers and 0 ≠ n ∈ Z, then the subset nZ = {m | m ∈ Z and n | m} is a commutative ring. (n | m means n divides m, i.e., there exists q ∈ Z such that m = nq.)

Example 2 (The Integers Modulo p) Let R ⊂ Z² be the equivalence relation defined by (m,n) ∈ R iff p | m − n, where p is some fixed positive integer. Then the quotient set Z/R can be made into a commutative ring, denoted by Z_p. The equivalence class R(n) is denoted by [n]; addition and multiplication are defined by

[m] + [n] = [m + n],        (12)

[m][n] = [mn]        (13)

(the same notations are used for the operations in Z_p and Z). It is easily checked that these operations are well-defined. If p is a prime, then Z_p is a field, and if p is composite, then Z_p possesses divisors of 0. The ring Z_p is called the ring of integers modulo p, and the equivalence classes [n] are called residue classes modulo p. The confirmations of these assertions concerning Z_p are easy. For example, if p is a prime and [n] ≠ [0], then n is not divisible by p, i.e., p ∤ (n − 0), and thus the greatest common divisor of n and p is 1. The Euclidean algorithm then yields integers x and y such that px + ny = 1. Hence [1] = [p][x] + [n][y] = [n][y], i.e., [n] possesses the multiplicative inverse [y].
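This argument via the Euclidean algorithm is effective, and a short sketch makes it concrete (an illustration, not part of the text; the helper names `egcd` and `inverse_mod` are our own).

```python
def egcd(a, b):
    """Extended Euclidean algorithm: returns (g, x, y) with ax + by = g = gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, x, y = egcd(b, a % b)
    return g, y, x - (a // b) * y

def inverse_mod(n, p):
    """Multiplicative inverse of [n] in Z_p when gcd(n, p) = 1: from
    px + ny = 1 we get [n][y] = [1], exactly as in the argument above."""
    g, x, y = egcd(p, n)
    if g != 1:
        raise ValueError("[%d] has no inverse in Z_%d" % (n, p))
    return y % p

print(inverse_mod(3, 7))  # 5, since [3][5] = [15] = [1] in Z_7
```

When p is prime, every nonzero class has an inverse, in agreement with the assertion that Z_p is then a field.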


The addition and multiplication in Z3 are given in the following tables:

  +  | [0] [1] [2]            ·  | [0] [1] [2]
 ----+-------------          ----+-------------
 [0] | [0] [1] [2]           [0] | [0] [0] [0]
 [1] | [1] [2] [0]           [1] | [0] [1] [2]
 [2] | [2] [0] [1]           [2] | [0] [2] [1]

Note that each nonzero element in Z3 has a multiplicative inverse. The tables for Z4 are

  +  | [0] [1] [2] [3]        ·  | [0] [1] [2] [3]
 ----+-----------------      ----+-----------------
 [0] | [0] [1] [2] [3]       [0] | [0] [0] [0] [0]
 [1] | [1] [2] [3] [0]       [1] | [0] [1] [2] [3]
 [2] | [2] [3] [0] [1]       [2] | [0] [2] [0] [2]
 [3] | [3] [0] [1] [2]       [3] | [0] [3] [2] [1]

Note here that [2] has no multiplicative inverse in Z4. It is relatively easy to see that [n] has a multiplicative inverse in Z_p iff n and p are relatively prime (verify).

Example 3 (n-Tuples) Let R be an associative ring, and let [1, n] be the segment of the first n positive integers. Let M = R^[1,n], and define an addition in M by

(u + v)(j) = u(j) + v(j),        j = 1, . . . , n,        (14)

and a scalar multiplication by

(ru)(j) = r·u(j),        j = 1, . . . , n.        (15)

Then M is a left R-module; the elements of M are usually written out

(u(1), u(2), . . . , u(n))   or   (u1, u2, . . . , un)

and are called n-tuples. Usually M is written Rⁿ, and if R is a field, then Rⁿ is called the vector space of n-tuples over R. If scalar multiplication is defined by

μ(r, u)(j) = u(j)·r,        j = 1, . . . , n,

rather than by (15), then M is a right R-module. A right R-module satisfies precisely the same conditions as a left R-module with the exception that (10) is replaced by

μ(r1r2, m) = μ(r2, μ(r1, m)),

or

m(r1r2) = (mr1)r2.

If M is a vector space over a field R and (M, {+, ×}) is a ring which satisfies

r(m1 × m2) = (rm1) × m2 = m1 × (rm2)

for r ∈ R and m1, m2 in M, then M is called a linear algebra over R. If (M, {+, ×})


is associative with respect to the × operation, then M is called a linear associative algebra over R.

Example 4 (Matrices) Let R be a ring and define

M = R^([1,m] × [1,n]),

where m and n are positive integers. Define an addition in M by

(A + B)((i,j)) = A((i,j)) + B((i,j)),        1 ≤ i ≤ m, 1 ≤ j ≤ n,

and a left scalar multiplication by

(rA)((i,j)) = r·A((i,j)).

Then M is a left R-module; the elements of M are called m × n matrices, and A is usually written out as a rectangular array:

    A = | a11  a12  ⋯  a1n |
        | a21  a22  ⋯  a2n |        (16)
        |  .    .        .  |
        | am1  am2  ⋯  amn |

in which aij is the customary way of designating A((i,j)). The values aij are called elements or entries of A; A|({i} × [1,n]) is called the ith row of A and is designated by

A_(i) = [ai1 ⋯ ain];

A|([1,m] × {j}) is called the jth column of A and is denoted by A^(j), the column whose entries are a1j, . . . , amj. The R-module M is usually denoted by M_{m,n}(R) or R^{m×n} and is called the module of m × n matrices over R. If m = n, then M_{n,n}(R) is abbreviated to M_n(R). There is a binary operation, denoted by juxtaposition, defined from M_{m,n}(R) × M_{n,p}(R) → M_{m,p}(R) as follows:

(AB)((i,j)) = Σ_{k=1}^{n} A((i,k))B((k,j)).

R[S]. The multiplication (23) needs some explanation: For f, g ∈ R[S] we let f · g be the function whose value at s ∈ S is the sum on the right in (23), in which the summation is over all pairs (r,t) for which rt = s [if no such pair exists, then the value of (f · g)(s) is 0 by definition].

The multiplication in (23) is called convolution. Now clearly, since f(r) ≠ 0 and g(t) ≠ 0 for at most a finite number of elements in S, the summation in (23) is a finite sum of elements of R. Moreover, there are only finitely many s for which (f · g)(s) ≠ 0. For if r1, . . . , rp are the values for which f is not 0 and t1, . . . , tq are the values for which g is not 0, then r_i t_j, i = 1, . . . , p, j = 1, . . . , q, are the only elements of S for which f · g can be different from 0. The assertion here is that R[S] is a ring: It is called the groupoid ring of S over R. The fact that (R[S], {+}) is a commutative group is easy, with the "zero function" (the function identically 0) as the additive identity and with (−f)(s) = −f(s), s ∈ S, as the obvious definition of the additive inverse, −f, of f. If S is a semigroup and R is an associative ring, it turns out that R[S] is an associative ring. The only slightly troublesome point in verifying this is the associativity of multiplication:

((f · g) · h)(s) = Σ_{rt=s} (f · g)(r)h(t).

If f: A → B, then ℒ(f) = ι_B and ℛ(f) = ι_A. If g: C → D, then gf is defined iff the domain of g is the same as the codomain of f, i.e., C = B. But ℛ(g) = ι_C and ℒ(f) = ι_B, so that ι_C = ι_B iff C = B.

29. Let S be a regular semigroupoid, and let the product xy ∈ S be defined. Show that

ℒ(xy) = ℒ(x)   and   ℛ(xy) = ℛ(y).

Hint: ℒ(x)x is defined and xy is defined; thus by Exercise 22(ii),

(ℒ(x)x)y = ℒ(x)(xy)

is defined. But ℒ(x)x = x, so that the above equation becomes

ℒ(x)(xy) = xy.

Since ℒ(x) is an identity, we conclude from Exercise 26 (Exercise 23) that ℒ(x) = ℒ(xy).

Similarly,

ℛ(xy) = ℛ(y).

30. An element x in a regular semigroupoid S has an inverse y if xy = ℒ(x) and yx = ℛ(x). Prove that if x has an inverse, then it is unique. The unique inverse, if it exists, is denoted by x⁻¹. Hint: Suppose xy′ = ℒ(x) and y′x = ℛ(x). Both yx and xy′ are defined, and hence the triple product

(yx)y′ = y(xy′)

is defined. Then

y(xy′) = y·ℒ(x) = y.

Theorem 3.1 If X is a monoid, there is a monomorphism f: X → X^X mapping the monoid X into a monoid of functions in X^X in which function composition is the operation.

Proof: For each x ∈ X, define f(x) ∈ X^X to be the function whose value at each z ∈ X is xz; i.e.,

f(x)(z) = xz.

Let e be the identity element in X. If f(x) = f(y), then f(x)(e) = f(y)(e), and so x = xe = ye = y. Hence f is injective. Now,

f(x1x2)(z) = (x1x2)z.

On the other hand,

(f(x1) ∘ f(x2))(z) = f(x1)(f(x2)(z)) = f(x1)(x2z) = x1(x2z) = (x1x2)z.

1.3 Permutation Groups

Thus f(x1) ∘ f(x2) = f(x1x2), and f is a monomorphism. ∎

Corollary 1 If X is a group, then the monomorphism f of Theorem 3.1 maps X into S_X, the group of all bijections on X.

Proof:

We need only show that f(x) ∈ S_X. But f(x)(z1) = f(x)(z2) implies xz1 = xz2, and hence z1 = z2 since X is a group. Also, if z ∈ X, then f(x)(x⁻¹z) = xx⁻¹z = z; thus f(x) is a bijection. ∎

A subsemigroup T of a semigroup (S, {α}) is a subset, ∅ ≠ T ⊂ S, such that im α|T² ⊂ T; i.e., if t1, t2 ∈ T, then α(t1, t2) ∈ T. In other words, T is closed with respect to α. More accurately, the subsemigroup is (T, {α|T²}). If (S, {α}) is a monoid and (T, {α|T²}) is a subsemigroup which contains the identity element of S, then (T, {α|T²}) is called a submonoid of (S, {α}). If (T, {α|T²}) is a subsemigroup of the semigroup (S, {α}) and (T, {α|T²}) is a group, then we say that (T, {α|T²}) is a subgroup of the semigroup (S, {α}). We usually drop the cumbersome notation and simply say T is a subsemigroup (or submonoid, or subgroup) of S. A number of elementary facts about homomorphisms appear in the next

result.

Theorem 3.2 Let h: X → Y be a homomorphism of semigroups. If S is a subsemigroup of X, then h(S) is a subsemigroup of Y. If T is a subsemigroup of Y, then h⁻¹(T) is a subsemigroup of X. If X and Y are groups and S is a subgroup of X, then h(S) is a subgroup of Y. If T is a subgroup of Y, then h⁻¹(T) is a subgroup of X.

Proof: The first and second assertions are trivial and are left to the reader (see Exercise 2). We prove the third assertion. By the first assertion h(S) is a subsemigroup of Y. Since S is a subgroup of X, e ∈ S, and x ∈ S implies x⁻¹ ∈ S (verify). That is, the identity of X must be in any subgroup of X. We claim that h(e) is the identity element e′ of Y. For if x ∈ X, h(e)h(x) = h(ex) = h(x) = e′h(x), and since h(x)⁻¹ exists, we have h(e) = e′. Of course, h(e) ∈ h(S), and so h(S) contains the identity of Y. Now if x ∈ X, then h(x)h(x⁻¹) = h(xx⁻¹) = h(e) = e′. Thus h(x⁻¹) = h(x)⁻¹ for x ∈ X; in particular h(S) contains the inverse of each element in h(S). Hence it is a subgroup of Y. To prove the final assertion, we note the above statement that any subgroup of a group must contain the identity of the group. Thus e′ ∈ T. Now h(e) = e′, and so e ∈ h⁻¹(T). Also x ∈ h⁻¹(T) implies h(x) ∈ T, and since h(x⁻¹) = h(x)⁻¹ ∈ T, we have x⁻¹ ∈ h⁻¹(T). In other words, h⁻¹(T) is a subgroup of X. ∎

In view of Theorem 3.2, we can rephrase Corollary 1 as follows:


Any group X is isomorphic to a subgroup of the group S_X of bijections on X. Specializing this to finite groups, we have

Corollary 2 If X is a finite group of order n, i.e., |X| = n, then X is isomorphic to a subgroup of the symmetric group of degree n on X.
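The embedding behind these corollaries can be verified by brute force for a small group. The sketch below (an illustration, not part of the text) checks the left-translation map f for the additive group Z_4: each f(x) is a bijection of the carrier, f is injective, and f is a homomorphism into the bijections under composition.

```python
# A concrete check of Theorem 3.1 / Corollary 1 for Z_4 under addition:
# f(x) is left translation z ↦ x + z (mod 4), recorded as a tuple.
n = 4
carrier = range(n)

def f(x):
    """Left translation by x: the tuple whose z-th entry is x + z (mod n)."""
    return tuple((x + z) % n for z in carrier)

translations = [f(x) for x in carrier]
assert len(set(translations)) == n                       # f is injective
assert all(sorted(t) == list(carrier) for t in translations)  # each a bijection

def compose(s, t):
    """(s ∘ t)(z) = s(t(z)) for permutations given as tuples."""
    return tuple(s[t[z]] for z in carrier)

# Homomorphism property: f(x1 + x2) = f(x1) ∘ f(x2).
assert all(f((x1 + x2) % n) == compose(f(x1), f(x2))
           for x1 in carrier for x2 in carrier)
print("Cayley embedding verified for Z_4")
```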

Corollary 2 justifies the study of symmetric groups: Among their subgroups all finite groups must appear to within isomorphism. Let |X| = n; to simplify notation, we may as well assume X = {1, . . . , n} = [1, n], so that S_X becomes S_n. There are several convenient ways

of writing permutations: If σ ∈ S_n, then we use the two-rowed array

σ = (   1      2      3    ⋯    n   )
    ( σ(1)   σ(2)   σ(3)   ⋯  σ(n)  )        (3)

to depict σ; i.e., σ(i) is written directly under i. Of course the order in which the columns appear in (3) is immaterial; so we can write

σ⁻¹ = ( σ(1)  σ(2)  ⋯  σ(n) )
      (  1     2    ⋯   n   )        (4)

If σ ∈ S_n, then

σ⁰ = e,    σᵖ = σ ⋯ σ (p factors),    σ⁻ᵖ = (σ⁻¹)ᵖ,    p ∈ N.

If there exists a subset K = {i1, . . . , ik} ⊂ [1, n] such that

σ(i_t) = i_{t+1},    t = 1, . . . , k − 1,
σ(i_k) = i1,

and

σ|Kᶜ = ι_{Kᶜ},

then σ is called a cycle of length k or a k-cycle, and σ is written

σ = (i1 i2 ⋯ ik).        (5)

If σ = e, then of course the notation in (5) requires some explanation, e.g., e = (1) = (2) = ⋯ = (n) in this notation. For notational convenience we shall let e be designated by any of these cycles of length 1. A cycle of length 2 is called a transposition. Any cycle of length at least 2 is called a circular permutation.

Let G ⊂ S_n be any subgroup of S_n. Define an equivalence relation R ⊂ X² by

R = {(a, b) | σ(a) = b for some σ ∈ G}        (6)


(see Exercise 4). The equivalence classes in X/R are called the orbits of G in X, or simply the G-orbits. If R(x) is a G-orbit and |R(x)| = 1, then R(x) is called a trivial orbit; obviously, R(x) is trivial iff σ(x) = x for every σ ∈ G. In general, an orbit R(x) consists of all the distinct values σ(x) as σ runs over G.

If G is any group (not necessarily a group of permutations), it is quite simple to see (verify) that for σ ∈ G,

[σ] = {φ | φ = σᵗ, t ∈ Z}        (7)

is a subgroup of G. The meaning of σᵗ for a negative integer t is clear, e.g., σ⁻² = (σ⁻¹)². The subgroup [σ] defined in (7) is called the cyclic group generated by σ. We also observe that if for some t ≠ 0, σᵗ = e (where e is the identity in G), then

|[σ]| = k        (8)

where k is the least positive integer for which σᵏ = e, and in fact

[σ] = {e = σ⁰, σ, σ², . . . , σᵏ⁻¹}.        (9)

For suppose σᵗ = e for some t ≠ 0; we can assume t ∈ N (since σᵗ = e iff σ⁻ᵗ = e). If n ∈ Z, divide n by k:

n = qk + r,    0 ≤ r < k.

Then σⁿ = σ^(qk+r) = (σᵏ)^q σʳ = σʳ, so every power of σ is one of e, σ, . . . , σᵏ⁻¹.

Theorem 3.3 Let σ ∈ S_n with |[σ]| = k > 1. Then σ is a k-cycle iff [σ] has precisely one nontrivial orbit.

Proof: If σ = (i1 i2 . . . ik) is a k-cycle, then by the above remark

[σ] = {e, σ, . . . , σᵏ⁻¹}.

Moreover, σᵗ(i_j) = i_{j+t}, where j + t is computed modulo k; i.e., if j + t ≥ k, then the subscript is the remainder obtained upon dividing j + t by k. It follows that {i1, . . . , ik} is a [σ]-orbit. Also, since σ|{i1, . . . , ik}ᶜ is the identity, {i1, . . . , ik} is the only nontrivial [σ]-orbit. Conversely, suppose that [σ] has precisely one nontrivial orbit 𝒪. Let j ∈ 𝒪. We assert that


σ = (j σ(j) σ²(j) ⋯ σᵏ⁻¹(j)).        (10)

To begin with, the elements j = σ⁰(j), σ(j), . . . , σᵏ⁻¹(j) are distinct. For if not, then for some s ≠ t we have σˢ(j) = σᵗ(j), and hence σᵐ(j) = j for some m, 1 ≤ m ≤ k − 1. But then for any integer r, σᵐ(σʳ(j)) = σʳ(σᵐ(j)) = σʳ(j), so that σᵐ holds each element of 𝒪 fixed, and moreover holds any other element of [1, n] fixed because σ does. In other words, σᵐ = e, and this implies |[σ]| ≤ m < k, in contradiction to |[σ]| = k. Also from (7) and (9) we have

[σ] = {e, σ, . . . , σᵏ⁻¹} = {φ | φ = σᵗ, t ∈ Z},

and hence 𝒪 = {j, σ(j), . . . , σᵏ⁻¹(j)}, for any element of 𝒪 is of the form σᵗ(j) and σᵗ is one of e, σ, . . . , σᵏ⁻¹. If i ∉ 𝒪, then σ(i) = i since 𝒪 is the only nontrivial [σ]-orbit. This proves (10). ∎

Let σ and π be any two permutations in S_n. If each nontrivial orbit of [σ] is disjoint from the union of the nontrivial orbits of [π], then we say that σ and π act on disjoint subsets of [1, n], or more simply, are disjoint permutations. Clearly, if σ and π are disjoint, then σπ = πσ. For if j is in a nontrivial orbit of [π], then π(j) is in this orbit and hence is not in a nontrivial orbit of [σ], i.e., σ(π(j)) = π(j) and σ(j) = j, so that πσ(j) = π(j) also. If π(j) = j, then σπ(j) = σ(j). Now there are two possibilities for j: If it is in a nontrivial orbit of [σ], then so is σ(j), and πσ(j) = σ(j); if σ(j) = j, then πσ(j) = π(j) = j = σ(j) = σπ(j). Thus σ and π commute.

The fundamental result concerning cycles is the following so-called cycle decomposition theorem.

Theorem 3.4 Let e ≠ σ ∈ S_n. Then σ is a product of pairwise disjoint cycles of length at least 2. This representation is unique except for the order in which the cycles occur.

Proof: Let 𝒪1, 𝒪2, . . . , 𝒪p be all the nontrivial [σ]-orbits. Each orbit 𝒪t is of the form

𝒪t = {j_t, σ(j_t), . . . , σ^(k_t−1)(j_t)},    k_t > 1.        (11)

We assert that if

σ_t = (j_t σ(j_t) ⋯ σ^(k_t−1)(j_t)),

then

σ = σ1 ⋯ σp.        (12)

Let i ∈ [1, n]. If i is in a trivial [σ]-orbit, then σ(i) = i, and since i ∉ ∪_{t=1}^{p} 𝒪t, σ_t(i) = i, t = 1, . . . , p. Hence σ(i) = (σ1 ⋯ σp)(i). If i ∈ 𝒪t, then i = σˢ(j_t) for some 0 ≤ s < k_t, and σ(i) = σ(σˢ(j_t)) = σˢ⁺¹(j_t). Also if r ≠ t, then σ_r(i) = i and σ_r(σ_t(i)) = σ_t(i) since the 𝒪t are pairwise disjoint, and thus

(σ1 ⋯ σp)(i) = σ_t(i) = σ_t(σˢ(j_t)) = σˢ⁺¹(j_t) = σ(i).


Thus the equality (12) is established, and the cycles σ_t, t = 1, . . . , p, are disjoint (see Exercise 7).

To prove uniqueness, suppose σ is in some way represented as a product of disjoint cycles of length at least 2, and π1 is one of these cycles: π1 = (i1 ⋯ ir). Let τ be the product of the remaining cycles in this second representation of σ. Then σ = π1τ = τπ1. None of the cycles whose product is τ involve any of i1, . . . , ir (see Exercise 7); thus τ(i_j) = i_j, j = 1, . . . , r, and hence σ(i_j) = π1(i_j), j = 1, . . . , r. In other words, {i1, σ(i1), . . . , σ^(r−1)(i1)} = {i1, π1(i1), . . . , π1^(r−1)(i1)} is a nontrivial [σ]-orbit, so π1 must be one of the σ_t defined above. Thus each cycle such as π1 in this second representation of σ must be one of the σ_t. Suppose σ = π1 ⋯ πm = σ1 ⋯ σp. Then we can cancel the π_i's with the corresponding equal σ_t's. Clearly no σ_t's can be left, for otherwise we would have a product of disjoint cycles of length at least 2 equal to e. ∎

To find the disjoint cycle decomposition of any particular σ ∈ S_n is relatively straightforward. For example, suppose σ ∈ S_10 is given by

σ = ( 1  2  3  4  5  6  7   8  9  10 )
    ( 3  4  6  7  5  1  9  10  8   2 ).

According to Theorem 3.4 [see (11) and (12)] we can start with any integer (1 is sensible) and determine the [σ]-orbit in which it lies. Thus

σ(1) = 3,   σ(3) = 6,   σ(6) = 1,

so that

𝒪1 = {1, 3, 6}   and   σ1 = (1 3 6).

Choose any integer in [1, 10] − 𝒪1, say 2, and compute that

σ(2) = 4,   σ(4) = 7,   σ(7) = 9,   σ(9) = 8,   σ(8) = 10,   σ(10) = 2.

Thus

𝒪2 = {2, 4, 7, 9, 8, 10}   and   σ2 = (2 4 7 9 8 10).

Finally σ(5) = 5, and so 5 determines a trivial orbit. We have

σ = (1 3 6)(2 4 7 9 8 10).

Any cycle can be written as a product of transpositions: If σ = (i1 ⋯ ir),

then

σ = (i1 ir)(i1 i_{r−1})(i1 i_{r−2}) ⋯ (i1 i2).
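The orbit-chasing procedure just illustrated for σ ∈ S_10 mechanizes easily. The following sketch (an illustration, not part of the text) computes the disjoint cycle decomposition of a permutation given as a dictionary.

```python
def cycle_decomposition(sigma):
    """Disjoint cycle decomposition (Theorem 3.4): sigma maps i to sigma[i]
    on {1, ..., n}; cycles of length 1 (trivial orbits) are dropped."""
    seen, cycles = set(), []
    for start in sorted(sigma):
        if start in seen:
            continue
        orbit, j = [start], sigma[start]
        while j != start:            # follow the [sigma]-orbit of start
            orbit.append(j)
            j = sigma[j]
        seen.update(orbit)
        if len(orbit) > 1:           # keep only nontrivial orbits
            cycles.append(tuple(orbit))
    return cycles

# The example from the text: sigma in S_10.
sigma = dict(zip(range(1, 11), [3, 4, 6, 7, 5, 1, 9, 10, 8, 2]))
print(cycle_decomposition(sigma))  # [(1, 3, 6), (2, 4, 7, 9, 8, 10)]
```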


From Theorem 3.4, any permutation can be written as a product of transpositions, but not uniquely, e.g.,

(1 2) = (3 4)(1 2)(3 4).

If e ≠ σ ∈ S_n and σ = σ1 ⋯ σp is its unique (except for order) cycle decomposition, then we define

I(σ) = (k1 − 1) + (k2 − 1) + ⋯ + (kp − 1)
     = k1 + ⋯ + kp − p,

where k_t is the length of σ_t. The integer I(σ) is called the Cauchy index of σ. We define I(e) = 0. Now, although the above example shows that the representation of a permutation as a product of transpositions is not unique, we have the following theorem.

Theorem 3.5 If σ = τ1 ⋯ τk ∈ S_n, where τ_i is a transposition for i = 1, . . . , k, then I(σ) and k have the same parity, i.e., they are both even or both odd.

Proof: We first observe that if a, b, x1, . . . , xr, y1, . . . , ys are distinct, then

(a b)(a x1 ⋯ xr b y1 ⋯ ys) = (a x1 ⋯ xr)(b y1 ⋯ ys),        (13)

(a b)(a x1 ⋯ xr) = (a x1 ⋯ xr b),        (14)

and

(a b)(a x1 ⋯ xr)(b y1 ⋯ ys) = (a x1 ⋯ xr b y1 ⋯ ys).        (15)

Thus if (a b) is any transposition and σ = σ1 ⋯ σp is the unique cycle decomposition of σ, we can assume that if a or b appear among the cycles σ1, . . . , σp, then the product (a b)σ begins with (13), (14), or (15) (recall that disjoint cycles commute). But then clearly (see Exercise 9),

I((a b)σ) = I(σ) ± 1        (16)

in all cases. In other words, multiplying a permutation by a transposition changes the index of the permutation by 1. It follows immediately that

I(e) = I(τk ⋯ τ1 σ) = 1 + ⋯ + 1 (m terms) + (−1) + ⋯ + (−1) (q terms) + I(σ)
     = m − q + I(σ).

Now I(e) = 0 and m − q has the same parity as m + q = k. Thus I(σ) has the same parity as k. ∎


The sign or signum of a permutation σ is defined to be

ε(σ) = (−1)^I(σ).

If ε(σ) = 1, the permutation σ is even; if ε(σ) = −1, the permutation σ is odd.
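The sign can be computed directly from the Cauchy index. The sketch below (an illustration, not part of the text) does so for permutations given as dictionaries, summing k_t − 1 over the orbits.

```python
def cauchy_index(sigma):
    """I(sigma) = Σ (k_t − 1) over the orbits of [sigma]; trivial orbits
    contribute 0.  sigma maps i to sigma[i] on {1, ..., n}."""
    seen, index = set(), 0
    for start in sigma:
        if start in seen:
            continue
        orbit, j = {start}, sigma[start]
        while j != start:
            orbit.add(j)
            j = sigma[j]
        seen |= orbit
        index += len(orbit) - 1
    return index

def sign(sigma):
    """ε(sigma) = (−1)^I(sigma)."""
    return (-1) ** cauchy_index(sigma)

# A transposition is odd; the 3-cycle (1 2 3) is even.
transposition = {1: 2, 2: 1, 3: 3}
three_cycle = {1: 2, 2: 3, 3: 1}
print(sign(transposition), sign(three_cycle))  # -1 1
```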

Corollary 1 If σ ∈ S_n and σ = τ1 ⋯ τk, where each τ_i is a transposition, then

ε(σ) = (−1)ᵏ.        (17)

Moreover, if θ ∈ S_n, then

ε(σθ) = ε(σ)ε(θ).        (18)

Proof: From Theorem 3.5, I(σ) has the same parity as k, and (17) follows. If θ = μ1 ⋯ μq, where each μ_j is a transposition, then σθ is a product of k + q transpositions and (18) follows. ∎

Let A_n denote the set of all even permutations in S_n. It follows (see Exercise 10) immediately from (18) that A_n is a subgroup of S_n; it is called the alternating group of degree n. If σ ∈ S_n, then according to Theorem 3.4, σ is a unique product of disjoint cycles

alternating group of degree n. If a E S", then according to Theorem 3.4, a is a unique product of disjoint cycles

a = 0102 - - ' op Where

and

0', = (j! 0(jt) ' ' ' Olin—10.1)):

0‘, = {j,, o(j,),. . ., okr'1(j,)},

t: 13 ' ' ‘ ,P,

(19)

t: l, . . .,p

are the nontrivial [oi-orbits. Let {fl}, . . . , {fM} denote the trivial orbits, i.e., f1, . . . , f,1L1 are the elements mapped into themselves by a. If there are ,1, orbits each with s elements in it, s = l, . . . , n, then we say that a has the cycle structure

[111, 212, . . . , n‘n].

(20)

Thus a has cycle structure (20) ifl' a is a product of A, disjoint cycles of length s, s = l, . . . , n. (Of course, the cycles of length l are all e.)
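The definitions above are easy to make computational. A minimal sketch (the helper names are our own, not the book's): represent σ ∈ S_n as a dict i ↦ σ(i), extract its disjoint cycles, its cycle structure (20), and its sign. Since I(σ) = Σ(k_t − 1) summed over the nontrivial cycles, the Cauchy index equals n minus the total number of σ-orbits.

```python
def cycle_decomposition(sigma):
    """Disjoint cycles of sigma (a dict i -> sigma(i)), trivial ones included."""
    seen, cycles = set(), []
    for start in sorted(sigma):
        if start in seen:
            continue
        cycle, j = [], start
        while j not in seen:
            seen.add(j)
            cycle.append(j)
            j = sigma[j]
        cycles.append(tuple(cycle))
    return cycles

def cycle_structure(sigma):
    """(lambda_1, ..., lambda_n): lambda_s = number of cycles of length s."""
    lam = [0] * len(sigma)
    for c in cycle_decomposition(sigma):
        lam[len(c) - 1] += 1
    return tuple(lam)

def sign(sigma):
    """epsilon(sigma) = (-1)**I(sigma), where I(sigma) = n - (number of orbits)."""
    return (-1) ** (len(sigma) - len(cycle_decomposition(sigma)))

# sigma = (1 4 2 5 3)(6) in S_6, i.e., 1->4, 4->2, 2->5, 5->3, 3->1, 6->6
sigma = {1: 4, 2: 5, 3: 1, 4: 2, 5: 3, 6: 6}
print(cycle_decomposition(sigma))   # [(1, 4, 2, 5, 3), (6,)]
print(cycle_structure(sigma))       # (1, 0, 0, 0, 1, 0): cycle structure [1^1, 5^1]
print(sign(sigma))                  # 1: a 5-cycle is even
```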

Let G be any group, and define R ⊂ G² by

R = {(σ, φ) | there exists θ ∈ G such that φ = θσθ⁻¹}.   (21)

It is routine to verify that R is an equivalence relation on G (see Exercise 15). This relation is called conjugacy, and the equivalence classes R(σ) ∈ G/R are called conjugacy classes. The following result provides us with an interesting connection between cycle structure and conjugacy.

40

Basic Structures

Theorem 3.6  (i) Two permutations σ and φ in S_n are conjugate iff they have the same cycle structure.
(ii) (Cauchy's formula) The number of permutations in the conjugacy class determined by the cycle structure [1^λ₁, 2^λ₂, . . . , n^λₙ] is

h(λ₁, . . . , λ_n) = n! / (1^λ₁ λ₁! 2^λ₂ λ₂! ··· n^λₙ λ_n!).   (22)

Proof: (i) Suppose σ = σ₁ ··· σ_p, where σ_t is given by (19), and let φ = θσθ⁻¹ for some θ ∈ S_n. Then φ = θσ₁θ⁻¹ θσ₂θ⁻¹ ··· θσ_pθ⁻¹. Now set θ(σ^m(j_t)) = i_{t,m}, m = 0, . . . , k_t − 1, t = 1, . . . , p. We compute

θσ_tθ⁻¹(i_{t,m}) = θσ_t(σ^m(j_t)) = θ(σ^{m+1}(j_t)) = i_{t,m+1},   m = 0, . . . , k_t − 2,

and

θσ_tθ⁻¹(i_{t,k_t−1}) = θσ_t(σ^{k_t−1}(j_t)) = θ(j_t) = i_{t,0}.

Now let r ∈ {i_{t,0}, . . . , i_{t,k_t−1}}ᶜ. Then θ⁻¹(r) lies outside the orbit of σ_t, because θ maps that orbit onto {i_{t,0}, . . . , i_{t,k_t−1}}; hence σ_tθ⁻¹(r) = θ⁻¹(r). Thus θσ_tθ⁻¹(r) = θθ⁻¹(r) = r. We have proved that θσ_tθ⁻¹ = (i_{t,0} ··· i_{t,k_t−1}), and moreover the sets {i_{t,0}, . . . , i_{t,k_t−1}} are disjoint. Also r ∈ [1, n] is a fixed point of σ iff θ(r) is a fixed point of φ = θσθ⁻¹, since θσθ⁻¹(θ(r)) = θσ(r).

Thus φ is the product of the disjoint cycles (i_{t,0} ··· i_{t,k_t−1}), t = 1, . . . , p, and has the same number of fixed points as σ does. It follows that φ and σ have the same cycle structure.

To prove the converse, suppose σ and φ have the same cycle structure:

σ = (f₁) ··· (f_μ) ··· (j_{t,0} ··· j_{t,k_t−1}) ···,
φ = (g₁) ··· (g_μ) ··· (i_{t,0} ··· i_{t,k_t−1}) ···,

where we have written the corresponding cycles of equal length in φ directly below those in σ. Now define θ ∈ S_n to be the permutation whose two-line representation has f₁, . . . , f_μ, . . . , j_{t,0}, . . . , j_{t,k_t−1}, . . . on top and g₁, . . . , g_μ, . . . , i_{t,0}, . . . , i_{t,k_t−1}, . . . directly below. Then

θσθ⁻¹(g_ν) = θσ(f_ν) = θ(f_ν) = g_ν = φ(g_ν),   ν = 1, . . . , μ,

and

θσθ⁻¹(i_{t,u}) = θσ(j_{t,u}) = θ(j_{t,u+1}) = i_{t,u+1} = φ(i_{t,u}),   u = 0, . . . , k_t − 1, t = 1, . . . , p

(subscripts taken mod k_t). Thus θσθ⁻¹ = φ.
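The relabeling step in part (i) (see also Exercise 23) can be tried directly. A sketch with our own helpers: conjugating the cycle (1 2 3) by θ yields the cycle (θ(1) θ(2) θ(3)).

```python
def compose(f, g):
    """(f∘g)(i) = f(g(i)) for permutations stored as dicts."""
    return {i: f[g[i]] for i in g}

def inverse(f):
    return {v: k for k, v in f.items()}

def from_cycle(cycle, n):
    """The permutation of {1, ..., n} that is the given single cycle."""
    p = {i: i for i in range(1, n + 1)}
    for a, b in zip(cycle, cycle[1:] + cycle[:1]):
        p[a] = b
    return p

n = 5
sigma = from_cycle((1, 2, 3), n)             # sigma = (1 2 3)
theta = {1: 3, 2: 5, 3: 4, 4: 1, 5: 2}       # an arbitrary theta in S_5
conj = compose(compose(theta, sigma), inverse(theta))
assert conj == from_cycle((theta[1], theta[2], theta[3]), n)   # the cycle (3 5 4)
print("theta sigma theta^-1 relabels the cycle entries by theta")
```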

(ii) Permutations with cycle structure (20) can be constructed as follows:

(*)(*) ··· (*) (* *)(* *) ··· (* *) ··· (* * ··· *)   (23)

where there are λ_s cycles of length s, and all the cycles are disjoint. Now λ₁ + 2λ₂ + ··· + nλ_n = n, and the n elements 1, . . . , n are to be placed where the asterisks (*) are in the scheme (23). There are n! ways of doing this, but some of the resulting permutations are the same: The λ_s cycles of length s can be permuted among themselves, and there are λ₁! ··· λ_n! ways altogether of doing this; also the elements in a given s-cycle can be shifted ahead 1, 2, . . . , s places without altering the permutation [e.g., (1 2 3) = (3 1 2) = (2 3 1)], and there are

s · s ··· s  (λ_s factors) = s^λ_s

ways of doing this for the cycles of length s. Thus with each arrangement of 1, . . . , n in the scheme (23), there are 1^λ₁λ₁! 2^λ₂λ₂! ··· n^λₙλ_n! of the permutations that are the same. Hence there are

n! / (1^λ₁λ₁! 2^λ₂λ₂! ··· n^λₙλ_n!)

distinct permutations with the cycle structure (20). ∎

As an example, suppose n = 3 and we want to list all the distinct permutations in S₃ with cycle structure [1¹, 2¹]. We begin, as in the proof of Theorem 3.6(ii), by listing all six arrangements of 1, 2, 3 in the scheme (*) (* *) and identifying the equal ones:

(1)(2 3)]      (2)(1 3)]      (3)(1 2)]
(1)(3 2)       (2)(3 1)       (3)(2 1) .

The bracketed permutations are equal: each cycle of a bracketed pair is obtained by cyclically permuting the integers in the other cycle of the same length. Here λ₁ = λ₂ = 1, λ₃ = 0 and


h(λ₁, λ₂, λ₃) = 3! / (1¹·1! · 2¹·1! · 3⁰·0!) = 3.

Somewhat less trivial to compute is the case n = 4 with prescribed cycle structure [2²]. Then the scheme to be filled in is

(* *)(* *).

We have the list

(1 2)(3 4)  (2 1)(3 4)  (1 2)(4 3)  (2 1)(4 3)  (3 4)(1 2)  (4 3)(1 2)  (3 4)(2 1)  (4 3)(2 1)
(1 3)(2 4)  (3 1)(2 4)  (1 3)(4 2)  (3 1)(4 2)  (2 4)(1 3)  (4 2)(1 3)  (2 4)(3 1)  (4 2)(3 1)
(1 4)(2 3)  (4 1)(2 3)  (1 4)(3 2)  (4 1)(3 2)  (2 3)(1 4)  (3 2)(1 4)  (2 3)(4 1)  (3 2)(4 1)

The first 4 permutations in each list are obtained by cyclically permuting the integers in each of the cycles. The second 4 are obtained by interchanging cycles and then cyclically permuting the integers within a cycle. There are 3 distinct permutations with the cycle structure [2²]. Note that λ₁ = 0, λ₂ = 2, λ₃ = λ₄ = 0, and

h(λ₁, λ₂, λ₃, λ₄) = 4! / (2²·2!) = 3.
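Cauchy's formula (22) can be confirmed by brute force on a small symmetric group. A self-contained sketch (the helper names are ours): count the permutations of S₄ in each conjugacy class and compare with n!/(1^λ₁λ₁! ··· n^λₙλ_n!).

```python
from itertools import permutations
from math import factorial
from collections import Counter

def cycle_structure(sigma):
    """(lambda_1, ..., lambda_n) for sigma given as a dict i -> sigma(i)."""
    seen, lam = set(), [0] * len(sigma)
    for start in sigma:
        if start in seen:
            continue
        length, j = 0, start
        while j not in seen:
            seen.add(j)
            j = sigma[j]
            length += 1
        lam[length - 1] += 1
    return tuple(lam)

def cauchy(lam):
    """n!/(1^l1 l1! 2^l2 l2! ... n^ln ln!) for lam = (l1, ..., ln)."""
    n = sum(s * ls for s, ls in enumerate(lam, start=1))
    denom = 1
    for s, ls in enumerate(lam, start=1):
        denom *= s ** ls * factorial(ls)
    return factorial(n) // denom

def class_sizes(n):
    counts = Counter()
    for images in permutations(range(1, n + 1)):
        counts[cycle_structure(dict(zip(range(1, n + 1), images)))] += 1
    return counts

counts = class_sizes(4)
assert all(counts[lam] == cauchy(lam) for lam in counts)
print(counts[(0, 2, 0, 0)])   # 3: the class [2^2] listed above
```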

In order to further unravel the structure of S_n, we introduce a few more general notions about groups. Let G be any group (with the operation denoted simply by juxtaposition), and let ∅ ≠ A ∈ P(G). Then A is called a complex. The product and inverses of complexes are given by

AB = {ab | a ∈ A, b ∈ B}

and

A⁻¹ = {a⁻¹ | a ∈ A}.

If H is a subgroup of G, we write H < G. If a ∈ G and H < G, then the set {a}H (H{a}) is called a left (right) H-coset in G. We write these cosets as aH or Ha. The element a is called a representative of the coset aH or Ha. It should be noted that if A is any complex and g ∈ G, then

|A| = |gA| = |Ag|.

The reader will also confirm (see Exercise 16) that for arbitrary complexes we have

(AB)⁻¹ = B⁻¹A⁻¹   (24)

and

A(BC) = (AB)C.   (25)

. . . If f: G → K is any function, then R_f = {(x₁, x₂) | f(x₁) = f(x₂)} is an equivalence relation on G. We assert that if G and K are groups and f is a homomorphism, then

R_f = {(a, b) | a⁻¹b ∈ f⁻¹({e})}.   (34)

(Here we denote by e the neutral element in either G or K.) For, a⁻¹b ∈ f⁻¹({e}) iff f(a⁻¹b) = e iff f(a) = f(b). In particular, let ν: G → G/H be the canonical homomorphism, and consider the natural map μ: G → G/R_ν induced by R_ν, i.e., μ(a) = R_ν(a), a ∈ G. According to (33) we have

R_ν = {(a, b) | a⁻¹b ∈ H},   (35)

. . . H ⊂ φ⁻¹({e}) . . . there exists a unique homomorphism φ̄: G/H → K such that φ = φ̄ν, i.e., the diagram

        ν
    G ————→ G/H
     \        |
    φ \       | φ̄
       ↘      ↓
          K                  (36)

is commutative.

Proof: According to (34),

R_φ = {(a, b) | a⁻¹b ∈ φ⁻¹({e})},

and from (35) and H ⊂ φ⁻¹({e}) we have R_ν ⊂ R_φ. But then by Theorem 1.2(ii) there exists a unique φ̄: G/R_ν → K such that φ = φ̄ν. Of course G/R_ν is just another way of writing G/H, and so φ̄ completes the diagram (36), as advertised. Observe that

φ̄(aH·bH) = φ̄(abH) = φ̄ν(ab) = φ(ab) = φ(a)φ(b) = φ̄(ν(a))φ̄(ν(b)) = φ̄(aH)φ̄(bH),

so that φ̄ is indeed a homomorphism. ∎

If φ: G → K is a homomorphism, then the inverse image φ⁻¹({e}) of the neutral element e is called the kernel of φ and is denoted by ker φ. It is very simple to verify that ker φ . . .

. . . σ(i) > σ(j) for some i < j. Define ι(σ) to be the total number of such inversions, e.g.,

ι( 1 2 3 4 5 6 )
  ( 4 5 1 2 3 6 ) = 2 + 2 + 2 = 6

because each of 1, 2, and 3 is preceded by two larger integers. Prove that

ε(σ) = (−1)^ι(σ).

Hint: Let a be the first integer in the second row of σ which is preceded by a larger integer (if no such a exists, then σ = e, ι(σ) = 0, and we are done). We assert that the closest such larger integer b must immediately precede a; otherwise the second row of σ looks like

( ··· b ··· x ··· a ··· ).

Now x is closer to a than b is, and so x < a. Also b cannot be less than x since b > a; hence b > x. But then x precedes a and has a larger integer, namely b, preceding it. This contradicts the choice of a, and our assertion is established. We compute the product

(a b)σ = (a b)( 1   2  ···  i−1      i  i+1  ···  n  )
              ( x₁  x₂ ···  x_{i−1}  b  a    ···  y_n )

       = ( 1   2  ···  i−1      i  i+1  ···  n  )
         ( x₁  x₂ ···  x_{i−1}  a  b    ···  y_n ).

Hence the number of inversions in (a b)σ is one less than the number of inversions in σ, i.e., ι((a b)σ) = ι(σ) − 1. Repeating the argument with (a b)σ, etc., we see that if k inversions occur in σ, then by premultiplying σ by k transpositions we arrive at a permutation with no inversions, i.e., the identity. We have

0 = ι(e) = ι(τ_k ··· τ₁σ) = ι(σ) − k,

and so ι(σ) = k. Finally, 1 = ε(e) = ε(τ_k ··· τ₁σ) = ε(τ_k) ··· ε(τ₁)ε(σ) = (−1)^k ε(σ) implies ε(σ) = (−1)^k = (−1)^ι(σ).)

13. Show that S_n is generated by the n − 1 transpositions (1 2), (2 3), (3 4), . . . , (n−1 n), i.e., any σ ∈ S_n is a product of these transpositions. Hint: Using the notation of the previous hint, we compute that

σ(i i+1) = ( 1   2  ···  i−1      i  i+1  ···  n  )
           ( x₁  x₂ ···  x_{i−1}  b  a    ···  y_n ) (i i+1)

         = ( 1   2  ···  i−1      i  i+1  ···  n  )
           ( x₁  x₂ ···  x_{i−1}  a  b    ···  y_n ).

Hence ι(σ(i i+1)) = ι(σ) − 1. In other words, if k inversions occur in σ, then postmultiplying σ by k of the required transpositions produces e.
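The inversion-counting argument in these two hints can be exercised directly. A sketch with our own helper names: count ι(σ), then sort the second row by adjacent swaps; each swap removes exactly one inversion, so exactly ι(σ) adjacent transpositions are needed.

```python
def inversions(images):
    """iota(sigma): number of pairs i < j with sigma(i) > sigma(j)."""
    n = len(images)
    return sum(1 for i in range(n) for j in range(i + 1, n)
               if images[i] > images[j])

def sorting_swaps(images):
    """Adjacent transpositions (i, i+1) that sort the second row of sigma;
    each swap removes exactly one inversion (bubble sort)."""
    row, swaps = list(images), []
    changed = True
    while changed:
        changed = False
        for i in range(len(row) - 1):
            if row[i] > row[i + 1]:
                row[i], row[i + 1] = row[i + 1], row[i]
                swaps.append((i + 1, i + 2))    # 1-based positions
                changed = True
    return swaps

sigma = (4, 5, 1, 2, 3, 6)
print(inversions(sigma))          # 6, matching the example in the text
print(len(sorting_swaps(sigma)))  # 6 adjacent transpositions suffice
assert (-1) ** inversions(sigma) == 1   # this sigma is even
```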

14. Write the permutation σ = ( ··· ) as a product of transpositions of the form (i i+1). Hint: σ = (6 7)(7 8)(2 3)(3 4)(1 2)(2 3).

15. Verify that the relation R in (21) is an equivalence relation on G.

16. Confirm (24), (25), (26), and (27). Hint: (24) and (25) are trivial. One direction of (26) is trivial. Conversely, if H·H = H, then H is closed; if H = H⁻¹, then H contains inverses, and e ∈ H follows; (27) follows by application of (26) to HK.

17. Let H < G. Show that the mapping ν(aH) = Ha⁻¹ = H⁻¹a⁻¹ is bijective with domain the set of left H-cosets and codomain the set of right H-cosets.
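Exercise 17 (and the coset facts above) can be checked on a small case; a sketch, with our own tuple encoding of S₃: for H = {e, (1 2)} the left cosets partition the group into |G|/|H| blocks, and ν(aH) = Ha⁻¹ matches them bijectively with the right cosets.

```python
from itertools import permutations

def compose(f, g):
    """Permutations as image tuples; (f∘g) has i-th image f[g[i]-1]."""
    return tuple(f[j - 1] for j in g)

def inverse(f):
    inv = [0] * len(f)
    for i, v in enumerate(f, start=1):
        inv[v - 1] = i
    return tuple(inv)

G = list(permutations((1, 2, 3)))
H = [(1, 2, 3), (2, 1, 3)]                  # e and the transposition (1 2)

left = {frozenset(compose(a, h) for h in H) for a in G}
right = {frozenset(compose(h, a) for h in H) for a in G}
assert len(left) == len(right) == len(G) // len(H)      # 3 cosets each

# nu(aH) = H a^{-1} sends left cosets onto right cosets, bijectively
nu = {frozenset(compose(a, h) for h in H):
      frozenset(compose(h, inverse(a)) for h in H) for a in G}
assert set(nu.values()) == right
print(len(left))    # 3
```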

18. Prove: If X is a group and x² = x ∈ X, then x = e, the neutral element in X.

19. Verify (33). Hint: If ν(a) = H, then aH = H, and so a ∈ H.

20. Complete the proof of (37).

21. Show that if G < S_n, then either every σ ∈ G is even or G contains precisely the same number of even and odd permutations. Hint: Let s = Σ_{σ∈G} ε(σ), and let θ ∈ G be fixed. Then

s = Σ_{σ∈G} ε(θσ) = ε(θ) Σ_{σ∈G} ε(σ) = ε(θ)s.

Summing on θ yields |G|s = s². Hence s = |G| or s = 0. This means that either ε(σ) = 1 for all σ ∈ G or there are as many −1's as 1's in the sum s.

22. Show that |A_n| = n!/2 for n ≥ 2. Hint: If τ ∈ S_n is a transposition, the mapping σ ↦ τσ of A_n into S_n is an injection.

23. Show that if σ = (i₁ i₂ ··· i_k) ∈ S_n and θ ∈ S_n, then θσθ⁻¹ = (θ(i₁) θ(i₂) ··· θ(i_k)).

24. Assume H ⊴ G and ν: G → G/H is the canonical homomorphism. Prove that ker ν = H.

25. Verify that K = {e, (1 2)(3 4), (2 3)(1 4), (1 3)(2 4)} is a normal subgroup of A₄.

26. Let G be the multiplicative 4-element group of complex numbers {1, −1, i, −i}. Show that G is not isomorphic to the Klein four group (45). Hint: In (45) the square of every element is the identity.
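Exercises 22 and 25 can be verified by machine on the case n = 4. A sketch (the helpers and encodings are ours, not the book's):

```python
from itertools import permutations
from math import factorial

def sign(p):
    """Sign via inversion count, for a permutation given as an image tuple."""
    n = len(p)
    return (-1) ** sum(1 for i in range(n) for j in range(i + 1, n)
                       if p[i] > p[j])

def compose(f, g):
    return tuple(f[j - 1] for j in g)

def inverse(f):
    inv = [0] * len(f)
    for i, v in enumerate(f, start=1):
        inv[v - 1] = i
    return tuple(inv)

A4 = [p for p in permutations((1, 2, 3, 4)) if sign(p) == 1]
assert len(A4) == factorial(4) // 2        # Exercise 22 for n = 4

# K = {e, (1 2)(3 4), (2 3)(1 4), (1 3)(2 4)} as image tuples
K = {(1, 2, 3, 4), (2, 1, 4, 3), (4, 3, 2, 1), (3, 4, 1, 2)}
assert all({compose(compose(s, k), inverse(s)) for k in K} == K for s in A4)
print("K is a normal subgroup of A4")      # Exercise 25
```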

27. Let F be the class of all groups, and let S be the class of all group homomorphisms. Show that S is a regular semigroupoid with function composition as the operation. For each A ∈ F define ν(A) = 1_A. Prove that 𝒢 = (F, S, ν) is a category, called the category of groups (see Section 1.2, Exercise 32). Hint: Suppose f: A → B, g: C → D, h: K → L are group homomorphisms. If f(gh) is defined, then L ⊂ C and D ⊂ A, so that obviously (fg)h is defined and f(gh) = (fg)h. The composition of group homomorphisms is of course a group homomorphism when it is defined. Similarly, if hg and gf are defined, then (hg)f is defined. The set I(S) consists of the identity homomorphisms 1_A, A ∈ F.

Glossary 1.3

alternating group of degree n, 39
A_n, 39
[A], 48
automorphism, 32
canonical homomorphism, 44
category of groups, 54
Cauchy's formula, 40
Cauchy index, 38
Cayley's theorem, 32
circular permutation, 34
closed, 33
complex, 42
conjugacy, 39
conjugacy classes, 39
cycle decomposition theorem, 36
cycle of length k, 34
cycle structure, 39
cyclic group generated by a, 35
k-cycle, 34
σ = (i₁ i₂ ··· i_k), 34
[1^λ₁, 2^λ₂, . . . , n^λₙ], 39
kernel, 46
ker φ, 46
Klein four group, 52
Lagrange's theorem, 43
left coset, 42
monomorphism, 31
normal divisor, 44
normal subgroup, 44

the mapping f: Z → H defined by f(n) = nd is a group isomorphism. To see this, observe that since H contains some integer other than 0, it must contain a positive integer and hence a least positive integer d. Then of course dZ ⊂ H. If m ∈ H, divide m by d:

m = qd + r,   0 ≤ r < d.

If m > 1, then [aᵐ] is a proper normal subgroup of [a]; so e is not maximal in [a].

Example 7 (a) We construct a composition series for G = S₄. To begin with, A₄ ⊴ S₄; therefore set G₁ = A₄. Next define

G₂ = {e, (1 2)(3 4), (2 3)(1 4), (1 3)(2 4)};

G₂ is the Klein four group whose table appears in Example 3. It is relatively easy to check that if τ is any transposition in S₄, then τG₂ = G₂τ. Since any permutation is a product of transpositions, we conclude that G₂ ⊴ S₄ (see Exercise 15). Let G₃ = {e, (1 2)(3 4)}. The group G₃ has index 2 in G₂, so G₃ ⊴ G₂. Then

S₄ ▷ G₁ ▷ G₂ ▷ G₃ ▷ e   (51)

is a composition series for S₄; each factor is of order 2 or 3, hence simple, and we can apply Theorem 1.9 to conclude that (51) is a composition series for S₄.

(b) Let G be the additive group (Z₆, {+}). Define G₁ = {[0], [2], [4]}, so that |G₁| = 3 and |G/G₁| = 2. Then G/G₁ is simple, and thus G₁ is maximal in G. Also, |G₁| = 3 implies G₁ is simple. Therefore G = G₀ ▷ G₁ ▷ e is a composition series whose factors are of orders 2 and 3, respectively, and hence are cyclic.

(c) Let G = S₃, G₁ = A₃. Then clearly |A₃| = 3 and |G/G₁| = 2. Hence

G = G₀ ▷ G₁ ▷ e

is a composition series for S₃ with factors which are cyclic of orders 2 and 3, respectively.

We see from Example 7(b) and (c) that the factors of a composition series do not suffice for a reconstruction of the entire group. For S₃ and the additive group (Z₆, {+}) are obviously not isomorphic, although they have composition series whose factors are isomorphic. However, we shall shortly prove a famous theorem (the Jordan-Holder theorem) that shows the extent to which a finite group determines its composition series. We need a preliminary result.
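The non-isomorphism in Example 7(b)–(c) is easy to confirm computationally. A sketch with our own encodings: S₃ and Z₆ both have composition factors of orders 2 and 3, yet Z₆ is abelian while S₃ is not.

```python
from itertools import permutations

def compose(f, g):
    """Permutations as image tuples; (f∘g) has i-th image f[g[i]-1]."""
    return tuple(f[j - 1] for j in g)

S3 = list(permutations((1, 2, 3)))
s3_abelian = all(compose(a, b) == compose(b, a) for a in S3 for b in S3)
z6_abelian = all((a + b) % 6 == (b + a) % 6 for a in range(6) for b in range(6))
print(s3_abelian, z6_abelian)   # False True: equal factor orders,
                                # non-isomorphic groups
```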

Theorem 1.12  Let G₁ and G₁′ be distinct maximal subgroups of G, and set K = G₁ ∩ G₁′. Then (i) K is maximal in both G₁ and G₁′.

2.1

Isomorphism Theorems

75

(ii) G/G₁′ ≅ G₁/K and G/G₁ ≅ G₁′/K.

Theorem 1.13 (Jordan-Holder)  Let

G = G₀ ▷ G₁ ▷ ··· ▷ G_r = e   (53)

and

G = G₀′ ▷ G₁′ ▷ ··· ▷ G_s′ = e   (54)

be two composition series for G. Then r = s, and the factors in the two series can be paired off in some order into isomorphic pairs.

Proof: The proof is by induction on n = |G|. If n = 2, the problem is

Groups

76

trivial; thus assume that n > 2 and that the theorem holds for groups of order n − 1 or less. Let K = G₁ ∩ G₁′, and suppose first that K = e. Unless one of G₁ or G₁′ is e itself, i.e., unless r = 1 or s = 1, it follows that G₁ and G₁′ must be distinct. Suppose for a moment that r = 1. Then e is maximal in G, and so G is simple. Hence s = 1 and G₁′ = e as well; i.e., (53) and (54) both collapse to G ▷ e, and there is nothing to prove. We may thus assume that in case K = e, we have r > 1 and s > 1. By Theorem 1.12, K = e is maximal in both G₁ and G₁′, and hence r = s = 2. Thus

G ▷ G₁ ▷ e   (55)

and

G ▷ G₁′ ▷ e   (56)

are the composition series (53) and (54), respectively. Theorem 1.12 also tells us that

G/G₁′ ≅ G₁/K = G₁/e ≅ G₁   (57)

and

G/G₁ ≅ G₁′/K = G₁′/e ≅ G₁′.   (58)

Now the factors in (55) are

G/G₁ and G₁/e ≅ G₁,   (59)

whereas the factors in (56) are

G/G₁′ and G₁′/e ≅ G₁′.   (60)

Then according to (57) and (58),

{G/G₁, G₁} and {G/G₁′, G₁′}   (61)

are isomorphic pairings of the factors. This proves the result in the event K = e.

Now assume K ≠ e. We can also assume that G₁ ≠ G₁′; otherwise we simply apply the induction hypothesis to G₁ = G₁′ to obtain the result. Again, by Theorem 1.12 we know that K = G₁ ∩ G₁′ is maximal in both G₁ and G₁′. Let

K ▷ K₁ ▷ K₂ ▷ ··· ▷ K_p = e   (62)

be a composition series for K. Then the maximality of K in both G₁ and G₁′ implies that

G₁ ▷ K ▷ K₁ ▷ ··· ▷ K_p = e   (63)

and

G₁′ ▷ K ▷ K₁ ▷ ··· ▷ K_p = e   (64)

are composition series for G₁ and G₁′, respectively, both of length p + 1. On the other hand,

G₁ ▷ G₂ ▷ ··· ▷ G_r = e   (65)

and

G₁′ ▷ G₂′ ▷ ··· ▷ G_s′ = e   (66)

are also composition series for G₁ and G₁′, respectively. Applying the induction hypothesis to G₁ and G₁′, we obtain p + 1 = r − 1 and p + 1 = s − 1, so that r = s. Moreover, the factors

{G₁/K, K/K₁, K₁/K₂, . . . , K_{p−1}/K_p}   (67)

and

{G₁/G₂, G₂/G₃, G₃/G₄, . . . , G_{r−1}/G_r}   (68)

can be paired off into isomorphic pairs in some order, as can the factors

{G₁′/K, K/K₁, K₁/K₂, . . . , K_{p−1}/K_p}   (69)

and

{G₁′/G₂′, G₂′/G₃′, . . . , G_{s−1}′/G_s′}   (70)

(s = r). Again applying Theorem 1.12, we have

G/G₁ ≅ G₁′/K   (71)

and

G/G₁′ ≅ G₁/K.   (72)

Include G/G₁ in both the sets (67), (68) and include G/G₁′ in both the sets (69), (70). We then have

{G/G₁, G₁/G₂, . . . , G_{r−1}/G_r} ≅ {G/G₁, G₁/K, K/K₁, . . . , K_{p−1}/K_p}   (73)

and

{G/G₁′, G₁′/G₂′, . . . , G_{s−1}′/G_s′} ≅ {G/G₁′, G₁′/K, K/K₁, . . . , K_{p−1}/K_p}   (74)

where "≅" means isomorphic in pairs in some order. In view of (71) and (72), the factor G/G₁ appearing in (73) is isomorphic to G₁′/K appearing in (74), and G/G₁′ in (74) is isomorphic to G₁/K in (73). Thus the two sets on the left in (73) and (74) can be paired into isomorphic pairs, and the induction is complete. ∎

In view of the Jordan-Holder theorem we can refer to the factors in a composition series for a finite group, for these are determined, except for order, to within isomorphism.

Observe that the factors in the composition series for S₄ given in Example 7(a) are abelian, so that S₄ is solvable. In fact we have the following result for any finite group G.

Theorem 1.14  If |G| < ∞, then G is solvable iff each factor in a composition series for G is of prime order.


Proof: Assume first that G is solvable, i.e., G has a subnormal series with abelian factors. According to Theorem 1.11, this series can be refined to a composition series with abelian factors. By Theorem 1.9, these factors are simple groups. But we assert (see Exercise 1) that the only simple finite abelian groups are cyclic groups of prime order. By Theorem 1.13 the composition factors are determined by G to within isomorphism; so in any composition series for G the order of each factor must be prime.

On the other hand, if the order of each factor in a composition series for G is prime, each factor must be a cyclic group and hence abelian. Thus G itself is solvable. ∎

Example 8  (a) S_n is not solvable for n ≥ 5. For consider the series

S_n ▷ A_n ▷ e.   (75)

Since |S_n/A_n| = 2, S_n/A_n is simple, and hence A_n is obviously maximal in S_n. According to Theorem 3.10(v), Chapter 1, A_n is a simple group for n ≥ 5. Thus (75) is a composition series for S_n. But A_n/e ≅ A_n is not abelian (and A_n is simple of non-prime order; see Exercise 1). Now, of course, S_n cannot have any subnormal series with abelian factors; for Theorem 1.11 tells us that any such series can be refined to a composition series with abelian factors, and these factors are unique to within isomorphism by Theorem 1.13.

(b) We saw in Example 7(a) that S₄ has a composition series in which each factor has order 2 or 3. Therefore S₄ is solvable by Theorem 1.14. Next, consider S₃. We have |A₃| = 3; so A₃ is simple and hence

S₃ ▷ A₃ ▷ e

is a composition series with abelian quotients. Again, S₃ is solvable. (Of course, S₂ and S₁ are trivially solvable.)

There are several interesting and important results about finite solvable groups which we shall need.

Theorem 1.15  (i) If G is solvable and H < G, then H is solvable.
(ii) If G is solvable and f: G → G′ is an epimorphism, then G′ is solvable.
(iii) If H ⊴ G and both H and G/H are solvable, then G is solvable.
(iv) If

G = G₀ ▷ G₁ ▷ G₂ ▷ ··· ▷ G_r = e   (76)

is a subnormal series with solvable factors, then G is solvable.

Proof: (i) Let

G = G₀ ▷ G₁ ▷ ··· ▷ G_r = e   (77)

be a subnormal series with abelian factors, and define

H_i = H ∩ G_i,   i = 0, . . . , r.


Consider the diagram (9) with G replaced by G_{i−1}, K by G_i, and H by H_{i−1}. This is permissible since H_{i−1} < G_{i−1} and G_i ⊴ G_{i−1}. It follows that

H = H₀ ▷ H₁ ▷ ··· ▷ H_r = e

is a subnormal series for H with abelian factors, so that H is solvable.

(ii) Let

G = G₀ ▷ G₁ ▷ ··· ▷ G_r = e

be a subnormal series with abelian factors. Define G_i′ = f(G_i) and check (see Exercise 4) that G_{i−1}′/G_i′ is a homomorphic image of G_{i−1}/G_i. It follows that G_{i−1}′/G_i′ is abelian, so

f(G) = G′ = G₀′ ▷ G₁′ ▷ ··· ▷ G_r′ = e

is a subnormal series for G′ with abelian factors. Thus G′ is solvable.

(iii) Since H is solvable, there is a subnormal series

H = K₀ ▷ K₁ ▷ ··· ▷ K_p = e   (80)

with abelian factors. Also G/H is solvable, so there exists a subnormal series for G/H with abelian factors, say

G/H = H̄₀ ▷ H̄₁ ▷ H̄₂ ▷ ··· ▷ H̄_q = identity.   (81)

By Theorem 1.3(ii) there are unique subgroups H_i (H₀ = G) such that H ⊴ H_i and H̄_i = H_i/H, i = 0, . . . , q.

(iv) The proof is by induction on r. Assume the result holds for series of length r − 1 or less. Now

G₁ ▷ G₂ ▷ ··· ▷ G_r = e

is a subnormal series for G₁ with solvable factors. By the induction hypothesis G₁ is solvable; since G/G₁ is solvable by hypothesis, (iii) implies that G is solvable. ∎

We conclude this section by introducing three interesting ideas in group theory. Let G be a group and a, b ∈ G. The commutator of a and b is the element aba⁻¹b⁻¹. The commutator subgroup of G is the group G⁽¹⁾ generated by the set of all commutators in G:

G⁽¹⁾ = [{z | z = aba⁻¹b⁻¹, a, b ∈ G}].

Note that (aba⁻¹b⁻¹)⁻¹ = bab⁻¹a⁻¹, i.e., the inverse of a commutator is again a commutator. Thus G⁽¹⁾ consists precisely of all finite products of commutators in G. The higher-order commutator groups are defined inductively:

Letting G⁽⁰⁾ = G, we set

G⁽ˢ⁾ = (G⁽ˢ⁻¹⁾)⁽¹⁾,   s = 1, 2, . . . .
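The commutator subgroup can be computed by brute force on a small example. A sketch with our own encodings: collect all commutators of S₃ and close the set under products (in a finite group this yields the generated subgroup); the result is A₃.

```python
from itertools import permutations

def compose(f, g):
    """Permutations as image tuples; (f∘g) has i-th image f[g[i]-1]."""
    return tuple(f[j - 1] for j in g)

def inverse(f):
    inv = [0] * len(f)
    for i, v in enumerate(f, start=1):
        inv[v - 1] = i
    return tuple(inv)

def commutator_subgroup(G):
    """Close the set of commutators a b a^-1 b^-1 under products."""
    gens = {compose(compose(a, b), compose(inverse(a), inverse(b)))
            for a in G for b in G}
    closure = set(gens)
    while True:
        new = {compose(x, y) for x in closure for y in closure} - closure
        if not new:
            return closure
        closure |= new

S3 = list(permutations((1, 2, 3)))
print(sorted(commutator_subgroup(S3)))   # A3: the identity and both 3-cycles
```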

The normalizer N(X) of a subset X ⊂ G (X ≠ ∅) is defined by

N(X) = {g ∈ G | gX = Xg}.

The centralizer Z(X) of a subset X ⊂ G (X ≠ ∅) is defined by

Z(X) = {g ∈ G | gx = xg for all x ∈ X}.

In particular, Z(G) is called the center of G:

Z(G) = {g ∈ G | gx = xg for all x ∈ G}.

It is quite simple (see Exercise 5) to prove that for ∅ ≠ X ⊂ G,

Z(X) < N(X) < G.   (84)

In particular we have

Z(G) < G.   (85)
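These definitions, too, are directly computable. A sketch with our own tuple encoding: the center of S₃ and the normalizer of X = {(1 2)} in S₃.

```python
from itertools import permutations

def compose(f, g):
    """Permutations as image tuples; (f∘g) has i-th image f[g[i]-1]."""
    return tuple(f[j - 1] for j in g)

S3 = list(permutations((1, 2, 3)))
X = {(2, 1, 3)}                     # the transposition (1 2)

center = [g for g in S3
          if all(compose(g, x) == compose(x, g) for x in S3)]
normalizer = [g for g in S3
              if {compose(g, x) for x in X} == {compose(x, g) for x in X}]

print(center)            # [(1, 2, 3)]: the center of S3 is trivial
print(len(normalizer))   # 2: e and (1 2) itself
```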

Theorem 1.16  Let G be a group. Then

(i) G⁽¹⁾ ⊴ G.
(ii) G/G⁽¹⁾ is abelian.
(iii) If f: G → S is a homomorphism and S is abelian, then G⁽¹⁾ < ker f.
(iv) If K ⊴ G . . .

14. Let p and q be primes with p ≢ ±1 (mod q). Prove that if G is a group of order

2.3

Some Additional Topics

109

p²q, then either G is the direct product of cyclic subgroups of order p² and q, or G is the direct product of two cyclic subgroups of order p and a cyclic subgroup of order q. In either case G is abelian. Hint: Follow the method used in Example 10 with the role of 5 played by p and the role of 3 played by q. The argument is identical.

15. Classify all groups of order 147. Hint: Use Exercise 14.

16. Prove that any group of order 36 is abelian. Hint: The same argument as used in Example 10 will work.

Glossary 2.2

Burnside's lemma, 88
character of H of degree 1, 88
conjugate subgroups, 98
cycle index polynomial of H, 91
P(H: x₁, . . . , x_n), 91
group of rotations of the cube, 92
H acts on the set X, 87
H: X, 87
orbits, 87
O_x, 87
Pólya counting theorem, 91
stabilizer, 87
H_x, 87
Sylow theorems, 99
p-Sylow subgroups, 101
111(k), 89
A, 88
Z, 88
51-partition, 95

2.3

Some Additional Topics

In this section we consider several more advanced topics in group theory. We begin with a rather technical isomorphism theorem known as the Zassenhaus lemma.

Theorem 3.1  Assume that H₀ ⊴ H < G and K₀ ⊴ K < G. . . .

Let

G = H₀ ▷ H₁ ▷ ··· ▷ H_r = e   (29)

and

G = K₀ ▷ K₁ ▷ ··· ▷ K_s = e   (30)

be two subnormal series for G. For each i = 1, . . . , r define

H_{ij} = H_i(H_{i−1} ∩ K_j),   j = 0, . . . , s.

Since H_i ⊴ H_{i−1}, . . . ,

K_{j−1} = K_{j0} ▷ K_{j1} ▷ ··· ▷ K_{jr} = K_j,

so that the groups K_{ji}, i = 1, . . . , r − 1, are inserted between K_{j−1} and K_j. We obtain a refinement of (30) of length rs. Finally, Theorem 3.1 states that

H_{i,j−1}/H_{ij} = H_i(H_{i−1} ∩ K_{j−1}) / H_i(H_{i−1} ∩ K_j)
              ≅ K_j(K_{j−1} ∩ H_{i−1}) / K_j(K_{j−1} ∩ H_i) = K_{j,i−1}/K_{ji},

for i = 1, . . . , r, j = 1, . . . , s. Thus the two refined series are isomorphic. ∎

We can now easily extend Theorem 1.13. Recall that an infinite group need not have a composition series (see Example 6, Section 2.1). However,

the following theorem shows that if a composition series exists, then the factors are uniquely determined.

Theorem 3.3 (Jordan-Holder-Schreier Theorem)  Let

G = G₀ ▷ G₁ ▷ ··· ▷ G_r = e   (35)

and

G = H₀ ▷ H₁ ▷ ··· ▷ H_s = e   (36)

be two composition series for the group G. Then the two series (35) and (36) are isomorphic, i.e., r = s and the factors in the two series can be paired off in some order into isomorphic pairs.

Proof: Since the two series (35) and (36) are composition series, neither one of them can be (properly) refined, i.e., in either series each group is maximal in the group that contains it. But by Theorem 3.2 the two series have isomorphic refinements. Thus the series themselves must be isomorphic. ∎

Recall (Section 2.2) that a group M is said to act on a set X if there exists a homomorphism

f: M → S_X.

An important modification of this idea is useful in the study of modules and vector spaces and in group representation theory. Let M be an arbitrary set and G a group. Denote by

end(G)   (37)

the totality of endomorphisms of G (homomorphisms φ: G → G). Let

f: M → end(G)   (38)

be a function. Then the triple of objects

(M, G, f)

is called a group with operators; the elements of M are called the operators. Note that since f(m) ∈ end(G) (m ∈ M), we have for g₁, g₂ ∈ G that

f(m)(g₁g₂) = f(m)(g₁) f(m)(g₂).   (39)

It is sometimes convenient to call G an M-group and abbreviate f(m)(g) by

mg.   (40)

In the notation (40), the formula (39) becomes

m(g₁g₂) = (mg₁)(mg₂),   m ∈ M, g₁, g₂ ∈ G.

Example 1  (a) Let G be an arbitrary group, take M = G, and define

f(m)g = m⁻¹gm,   m ∈ M, g ∈ G;

f(m) is called an inner automorphism, or more precisely, f(m) is conjugation by m. Observe that

f(m)(g₁g₂) = m⁻¹g₁g₂m = (m⁻¹g₁m)(m⁻¹g₂m) = f(m)(g₁)f(m)(g₂).

(b) Let M be the set of all automorphisms of a group G, and define f to be the inclusion of M in end(G).

(c) Let (R, {+, ·}) be a ring, take M = R, and define

f(m)r = m·r,   m ∈ M, r ∈ R;

f(m) is "left multiplication by m." Clearly

f(m)(r₁ + r₂) = m(r₁ + r₂) = mr₁ + mr₂ = f(m)(r₁) + f(m)(r₂).

A similar example can be constructed for right multiplication.

(d) Let G be a left R-module (see Section 1.2), and let M = Z be the ring of integers. Define

f(m)g = Σ_{i=1}^{m} g if m > 0,   f(0)g = 0,   f(m)g = −Σ_{i=1}^{−m} g if m < 0.

If G and K are M-groups with operator maps f₁ and f₂, and φ: G → K is a group homomorphism, then φ is called an operator homomorphism or an M-homomorphism if

φ(f₁(m)(g)) = f₂(m)(φ(g))   for all m ∈ M, g ∈ G.   (42)

Example 2  Let G and K be vector spaces over a common field R, regarded as abelian groups. The scalar products ru and rv (r ∈ R, u ∈ G, v ∈ K) define R as a set of operators on G and K. To be precise, (R, G, f) is an M-group with M = R and f: R → end(G) given by

f(r)u = ru,   r ∈ R, u ∈ G.
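Example 2 can be rendered as a toy computation (the encoding is ours): take G = K = Q², let the field Q act by scalar multiplication, and check that a linear map φ satisfies the operator-homomorphism condition (42), i.e., φ(ru) = rφ(u).

```python
from fractions import Fraction as Q

def f(r):                       # f: R -> end(G), scalar multiplication by r
    return lambda u: (r * u[0], r * u[1])

def phi(u):                     # a linear map G -> K; here (x, y) -> (2x + y, y)
    return (2 * u[0] + u[1], u[1])

u, r = (Q(1, 2), Q(3)), Q(5, 7)
assert phi(f(r)(u)) == f(r)(phi(u))   # condition (42): phi(ru) = r phi(u)
print("phi is an operator homomorphism on this sample")
```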

If φ: G → K is an operator homomorphism, then (42) reads φ(ru) = rφ(u).

Define σ: Q[√5] → Q[√5] by σ(a + b√5) = a − b√5. Then

σ((a + b√5) + (c + d√5)) = σ(a + c + (b + d)√5)
                         = a + c − (b + d)√5
                         = (a − b√5) + (c − d√5)
                         = σ(a + b√5) + σ(c + d√5).

. . . S/{0} is an isomorphism (see Exercise 15). To avoid a conflict in notation, we reproduce diagram (25) with φ* denoting the map on the lower horizontal line:

R/A ————→ S/{0}
      φ*

Then φ*ν = μφ, or

φ = μ⁻¹φ*ν.   (29)

Now φ̄ = μ⁻¹φ*: R/A → S is a ring homomorphism and satisfies φ = φ̄ν. Obviously the values of φ̄ are completely determined by the values of φ. ∎

The next result, which follows immediately from Theorem 1.3, is called the first ring isomorphism theorem.

Theorem 1.4  If φ: R → S is a ring homomorphism, then φ(R) is a subring of S and R/ker φ is isomorphic to φ(R).

Proof: It is simple to verify that φ(R) is a subring of S (see Exercise 13). In diagram (28) replace S by φ(R) and A by ker φ. Then the diagram becomes

Rings and Fields 132

        ν
    R ————→ R/ker φ
     \        |
    φ \       | φ̄
       ↘      ↓
        φ(R)

φ = φ̄ν. It is obvious that φ̄ is surjective. We claim φ̄ is injective. It suffices to prove that ker φ̄ contains only the zero in R/ker φ (see Exercise 14). Now φ̄(ν(r)) = 0 implies (since φ = φ̄ν) that φ(r) = 0, i.e., r ∈ ker φ. But then ν(r) = ker φ, the zero in R/ker φ. ∎

If two rings R and S are isomorphic, this fact is denoted by R ≅ S. Thus, for example, Theorem 1.4 states that

φ(R) ≅ R/ker φ.

The following interesting results for integral domains are applications of Theorem 1.4 (compare with Corollary 1, Section 2.1).

Theorem 1.5

Let R be an integral domain. If char R = 0, then R contains a

subring isomorphic to Z. If char R = p, then R contains a subring isomorphic to Z1,, the ring of integers modulo p.

Proof: Define φ: Z → R by φ(n) = n·1. Clearly φ is a ring homomorphism. By Theorem 1.4, φ(Z) is a subring of R and the mapping φ̄: Z/ker φ → φ(Z) given by

φ̄(m + ker φ) = φ(m),   m ∈ Z,

is an isomorphism. Now suppose char R = 0. Then (16) implies that ker φ = {0}, and hence the canonical map ν: Z → Z/{0} is an isomorphism. But then φ̄ν: Z → φ(Z) is an isomorphism.

On the other hand, suppose char R = p. By the discussion following Theorem 1.1, we have ker φ = {n ∈ Z | n·1 = 0} = {n ∈ Z | na = 0 for all a ∈ R} = pZ. Since Z/pZ = Z_p (see Exercise 16), the proof is complete.

I
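The map φ(n) = n·1 in Theorem 1.5 can be sketched concretely. The following is a hedged illustration, not from the text, taking the integral domain R = Z_7 (so char R = 7); the names `phi`, `ker`, and `image` are mine.

```python
# Hedged sketch of Theorem 1.5 for R = Z_7: the homomorphism
# phi: Z -> R, phi(n) = n*1, has kernel 7Z, so phi(Z) is isomorphic
# to Z/7Z (here phi(Z) is in fact all of Z_7).
p = 7
phi = lambda n: n % p                          # n*1 computed in Z_7
ker = [n for n in range(-21, 22) if phi(n) == 0]
print(ker)                                     # [-21, -14, -7, 0, 7, 14, 21]
image = {phi(n) for n in range(100)}
print(image == set(range(p)))                  # True: phi(Z) = Z_7
```

The kernel consists exactly of the multiples of p, matching ker φ = pZ in the proof.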

Theorem 1.6 If S is a homomorphic image of Z, then either S is isomorphic to Z or S is isomorphic to Z_n for some n ∈ N.

Proof:

Let φ: Z → S be an epimorphism; ker φ is a subring of Z (in fact, it is an ideal in Z). But the only subrings of Z other than {0} are of the form

3.1

Basic Facts

133

nZ for some n ∈ N (see Exercise 17). Hence by Theorem 1.4, S is isomorphic to either Z/{0} ≅ Z or Z/nZ = Z_n. ∎

If A and B are arbitrary nonempty subsets of a ring R, define

A + B = {a + b | a ∈ A, b ∈ B}.

It is simple to verify that if S is a subring of R and A …

… S makes t into an indeterminate with respect to R (recall the definition immediately preceding Theorem 2.1). This means that

(i) tr = rt for each r ∈ R;
(ii) every element of S is of the form Σ_{k=0}^n r_k t^k, where r_k ∈ R, n ∈ N ∪ {0};
(iii) if Σ_{k=0}^n r_k t^k = 0, r_k ∈ R, then r_0 = ··· = r_n = 0.

The ring S is called the ring of polynomials in the indeterminate t over R.

Theorem 2.4 Let R be a ring with identity 1. Then there exists an indeterminate t over R.

Proof: By Theorem 2.1 there exists a ring with identity S containing an indeterminate x with respect to R. Thus we have a monomorphism

3.2

Introduction to Polynomial Rings

145

ι: R → S

such that

(i) x commutes with every element of the ring ι(R);
(ii) every element of S is of the form Σ_{k=0}^n a_k x^k, where a_k ∈ ι(R), n ∈ N ∪ {0};
(iii) if Σ_{k=0}^n a_k x^k = 0, a_k = ι(r_k), r_k ∈ R, then r_0 = ··· = r_n = 0.

Now by Theorem 2.3, there exists a ring ℛ and an isomorphism θ: S → ℛ such that R is a subring of ℛ and θ | ι(R) = ι⁻¹ (or equivalently, θ⁻¹ | R = ι).

Set t = θ(x). Then if r ∈ R,

tr = θ(θ⁻¹(tr)) = θ(θ⁻¹(t)θ⁻¹(r)) = θ(xι(r)) = θ(ι(r)x)   (by (i))
   = θ(ι(r))θ(x) = ι⁻¹(ι(r))t = rt.

Thus t commutes with every element of R. Every element of S is of the form

Σ_{k=0}^n ι(r_k)x^k,

and hence, since θ is surjective, every element of ℛ is of the form

θ(Σ_{k=0}^n ι(r_k)x^k) = Σ_{k=0}^n θ(ι(r_k))θ(x)^k = Σ_{k=0}^n r_k t^k.

Finally,

Σ_{k=0}^n r_k t^k = 0

implies that

Σ_{k=0}^n θ⁻¹(r_k)θ⁻¹(t)^k = 0,


i.e.,

Σ_{k=0}^n θ⁻¹(r_k)x^k = 0.

Since θ⁻¹(r_k) = ι(r_k), we have

Σ_{k=0}^n ι(r_k)x^k = 0,

and hence r_0 = ··· = r_n = 0 by (iii). Thus t is an indeterminate over R.

I

Theorem 2.4 tells us then that given any ring R with identity, we can construct a ring of polynomials in an indeterminate over R. We shall usually use x (instead of t) to denote an indeterminate over R, provided there is no conflict in the notation. The ring of polynomials in the indeterminate x over R will be denoted, as before, by R[x]. In conformity with standard usage the notation R[x] will also be applied in a somewhat more general context. If R is a subring of a ring S and s is any element of S which commutes with every element of R, then R[s] will denote the totality of elements of the form Σ_{k=0}^n r_k s^k, r_k ∈ R, n ∈ N ∪ {0}. It is clear that R[s] is a subring of S; the elements of R[s] will be called polynomials in s. We say that s is adjoined to R.
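Adjoining an element can be sketched concretely. The following hedged example, not from the text, takes R = Z inside S = R (the reals) and s = √2; every polynomial in s collapses to the form a + b√2, so Z[√2] is stored as integer pairs. The names `add` and `mul` are mine.

```python
# Hedged sketch of Z[s] for s = sqrt(2): elements a + b*sqrt(2)
# stored as integer pairs (a, b).

def add(u, v):
    (a, b), (c, d) = u, v
    return (a + c, b + d)

def mul(u, v):
    # (a + b√2)(c + d√2) = (ac + 2bd) + (ad + bc)√2, so Z[√2] is
    # closed under multiplication: every polynomial in s collapses
    # back to the form a + b√2.
    (a, b), (c, d) = u, v
    return (a * c + 2 * b * d, a * d + b * c)

x = (1, 2)   # 1 + 2√2
y = (3, -1)  # 3 − √2
print(add(x, y))  # (4, 1)
print(mul(x, y))  # (-1, 5)
```

Since s² = 2 ∈ Z, no powers of s beyond the first are needed, which is why the pair representation suffices.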

In particular the ring R is a subring of R[x], the ring of polynomials in the indeterminate x over R:

R ⊂ R[x].

(17)

It is obvious that the identity of the ring R is the identity of the ring R[x]. Also observe that if R is commutative, then R[x] is commutative. In general, repeated use of the fact that x commutes with elements of R shows that

(Σ_{p=0}^r a_p x^p)(Σ_{q=0}^s b_q x^q) = Σ_{m=0}^{r+s} c_m x^m,   where   c_m = Σ_{p+q=m} a_p b_q.   (18)

Corollary 1

If R is an integral domain, then so is R[x].

Proof: We have already seen that R[x] is a commutative ring with identity when R is. Observe that if f(x)g(x) = 0, then (18) implies that the product of the leading coefficients is 0. Since R has no zero divisors, it follows that either f(x) or g(x) is 0. Finally, 1 ≠ 0 in the integral domain R ⊂ R[x]. Thus R[x] is an integral domain.

I
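Formula (18) is exactly a convolution of coefficient sequences. A minimal sketch over R = Z (the function name `poly_mul` is mine):

```python
# Hedged sketch of formula (18): polynomial multiplication in R[x] as
# the convolution c_m = sum over p + q = m of a_p * b_q.
# Coefficient lists are in ascending order of degree; here R = Z.

def poly_mul(a, b):
    c = [0] * (len(a) + len(b) - 1)
    for p, ap in enumerate(a):
        for q, bq in enumerate(b):
            c[p + q] += ap * bq
    return c

# (1 + 2x)(3 + x + x^2) = 3 + 7x + 3x^2 + 2x^3
print(poly_mul([1, 2], [3, 1, 1]))  # [3, 7, 3, 2]
```

Note that the top coefficient c_{r+s} = a_r·b_s is the product of the leading coefficients, which is the observation used in the proof of Corollary 1.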

Suppose that R is a ring with identity and we construct a ring R[x₁] of polynomials in an indeterminate x₁ over R. Then we can construct a ring R[x₁][x₂] of polynomials in an indeterminate x₂ over R[x₁]. Continuing, we construct a ring R[x₁][x₂]···[xₙ], where x_k is an indeterminate over R[x₁]···[x_{k−1}], 2 ≤ k ≤ n. …

… define λ: R → S/ρ by

λ(a) = a/1,   a ∈ R,   (86)

then λ is a ring homomorphism which is not necessarily injective. Of course, if no element of A is a zero divisor and 0 ∉ A, then λ is an injection [λ(a) = λ(c) iff a/1 = c/1 iff (a − c)u = 0 for some u ∈ A iff a − c = 0]. The ring S/ρ is called the ring of fractions over R with respect to A.

Example 7 (a) Let R = Z and A = Z − 3Z; A consists of all integers which are not multiples of 3. Clearly 1 ∈ A, and if a,b ∈ A, then ab ∈ A (otherwise 3 divides ab and hence divides either a or b). Now ((a,b),(c,d)) ∈ ρ iff (ad − bc)u = 0 for some u ∈ A. Since 0 ∉ A, we have ((a,b),(c,d)) ∈ ρ iff ad − bc = 0. Thus a/b just consists of all pairs of integers (c,d) such that ad = bc, and d is not divisible by 3. We can identify S/ρ with the set of all rational numbers a/b such that b is not a multiple of 3. In this example the mapping λ: R → S/ρ is an injection.

(b) Let R = Z₆ be the ring of integers modulo 6, and let A = {[1],[2],[4]}. Then A is a submonoid of the multiplicative monoid in Z₆. But [5]/[1] = [2]/[1] in S/ρ because ([5] − [2])[2] = [3][2] = [0]. Thus the mapping λ given by (86) is not an injection.
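Example 7(a) can be sketched as follows; the helper names `in_A` and `frac_eq` are mine, and fractions are stored as integer pairs (a, b) with b ∈ A.

```python
# Hedged sketch of Example 7(a): fractions a/b with b in A = Z − 3Z,
# compared by the rule a/b = c/d iff ad − bc = 0 (valid since 0 is
# not in A and Z has no zero divisors).

def in_A(b):
    return b % 3 != 0   # b is not a multiple of 3

def frac_eq(ab, cd):
    (a, b), (c, d) = ab, cd
    assert in_A(b) and in_A(d)
    return a * d - b * c == 0

print(frac_eq((1, 2), (2, 4)))   # True: 1/2 = 2/4
print(frac_eq((1, 2), (1, 4)))   # False
```

In Example 7(b) the same comparison would need the general rule (ad − bc)u = 0 for some u ∈ A, since Z₆ has zero divisors.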


In constructing the ring of fractions over R with respect to A, it was stipulated that A be a submonoid of the multiplicative monoid in R. There is a routine way of constructing such A that leads us to consider two important kinds of ideals in a ring.

Recall that an ideal M in a ring R is maximal if M ≠ R and there is no ideal B in R with M ⊂ B ⊂ R, both inclusions proper. …

Let ν: R → R/A be the canonical epimorphism given by ν(r) = r + A. To show that R/A is a field, it suffices to show that if ν(r) ≠ A (the zero of R/A), then ν(r) possesses an inverse in R/A. So assume ν(r) ≠ A. Then r ∉ A, and C = A + rR is clearly an ideal in R properly containing A. Hence C = R because A is maximal. There exist elements a ∈ A and s ∈ R such that 1 = a + rs. Then ν(1) = ν(a) + ν(rs) = ν(r)ν(s), i.e., ν(s) is the multiplicative inverse of ν(r) in R/A.

Conversely, suppose that R/A is a field; in particular, A ≠ R. Let C be an ideal in R properly containing A, and let r ∈ C − A. Then ν(r) ≠ A, so ν(r) possesses an inverse in R/A. Let s ∈ R, and consider the element ν(r)⁻¹ν(s) ∈ R/A. Choose u ∈ R such that ν(u) = ν(r)⁻¹ν(s). We have

ν(ur) = ν(u)ν(r) = ν(r)⁻¹ν(s)ν(r) = ν(s).

In other words, s − ur ∈ A ⊂ C. But ur ∈ C implies s ∈ C. Since s was chosen to be an arbitrary element in R, we conclude that C = R. Thus A is a maximal ideal. ∎
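The forward direction can be checked numerically for R = Z and the maximal ideal A = pZ. A hedged illustration, with the function name `inverses_mod` mine:

```python
# Hedged illustration of the theorem for R = Z, A = 7Z: every nonzero
# element of Z/7Z has a multiplicative inverse, so Z/7Z is a field.

def inverses_mod(p):
    return {r: next(s for s in range(1, p) if (r * s) % p == 1)
            for r in range(1, p)}

inv = inverses_mod(7)
print(inv[3])                                          # 5: 3*5 = 15 ≡ 1 (mod 7)
print(all((r * s) % 7 == 1 for r, s in inv.items()))   # True
```

Running `inverses_mod(6)` instead would fail on r = 2 (no inverse exists), reflecting that 6Z is not maximal in Z.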


Corollary 6 Let R be a commutative ring with identity. Then any maximal ideal in R is a prime ideal.

Proof: If A is a maximal ideal in R, then R/A is a field by Theorem 2.11(ii). In particular, R/A is an integral domain, and hence A is a prime ideal by Theorem 2.11(i). ∎

Example 9 (a) The ring Z_p of integers modulo p is a field iff p is a prime integer. We have Z_p = Z/pZ; if p is prime, then pZ is a maximal ideal in Z by Example 8(e), and Theorem 2.11(ii) implies that Z_p is a field. Conversely, suppose p is composite, say p = mn for positive integers m,n ≠ 1. Then obviously [m][n] = [0], so Z_p possesses zero divisors and hence is not a field.

(b) Let p be a prime integer. Since Z_p is a field [see part (a)], Z_p − {[0]} is a multiplicative group of order p − 1. In any (multiplicative) group of order m with identity e, g^m = e for all elements g in the group (see Exercise 21). Hence if [a] ≠ [0], then [a]^{p−1} = [1]. This means that if the integer a is not a multiple of p, then a^{p−1} − 1 is divisible by p. For example, if p = 3 and a = 5, we obtain that 5² − 1 is divisible by 3. This result is known as Fermat's theorem. An obvious corollary is that if a is any integer, then a^p − a is divisible by p.

(c) Let x be an indeterminate over the real field R and consider the polynomial p(x) = x² + 1 ∈ R[x]. For the purposes of this important example, we shall assume that p(x) is an irreducible polynomial in R[x], meaning that it cannot be factored. Hence if f(x) ∈ R[x] and f(x) ∉ p(x)R[x], then f(x) and p(x) have a greatest common divisor of 1 [we shall thoroughly investigate greatest common divisors (g.c.d.'s) in polynomial rings in the next section, but we assume here that the reader is familiar with the process of finding the g.c.d. of two polynomials with real coefficients]. It follows that there exist polynomials s(x) and t(x) in R[x] such that s(x)f(x) + t(x)p(x) = 1. Thus if C is an ideal in R[x] containing p(x)R[x] and if f(x) ∈ C, then 1 = s(x)f(x) + t(x)p(x) ∈ C so that C = R[x]. This shows that p(x)R[x] is a maximal ideal. We conclude by Theorem 2.11(ii) that R[x]/p(x)R[x] is a field. Let

ν: R[x] → R[x]/p(x)R[x]

be the canonical epimorphism, and let μ = ν|R be the restriction of ν to R. Suppose μ(r₁) = μ(r₂); then r₁ − r₂ = (x² + 1)f(x) for some f(x) ∈ R[x], and hence r₁ = r₂. Thus

μ: R → R[x]/p(x)R[x]

is a monomorphism and μ(R) is a field contained in the field R[x]/p(x)R[x] (see Exercises 14 and 22). At this point we use Theorem 2.3 to secure a ring L containing R as a subring and an isomorphism τ: L → R[x]/p(x)R[x] such that τ|R = μ. Of course L is a field because L ≅ R[x]/p(x)R[x]. Now set i = τ⁻¹(ν(x)) ∈ L and compute that

τ(i² + 1) = τ(i)² + τ(1) = ν(x)² + ν(1) = ν(x² + 1) = ν(p(x)) = p(x)R[x].

In other words, τ(i² + 1) is the zero element of R[x]/p(x)R[x]. Since τ is injective, it follows that i² + 1 = 0. Let f(x) ∈ R[x], say f(x) = Σ_{j=0}^n r_j x^j; then

τ⁻¹(ν(f(x))) = τ⁻¹(Σ_{j=0}^n ν(r_j)ν(x)^j) = Σ_{j=0}^n τ⁻¹(ν(r_j))(τ⁻¹(ν(x)))^j = Σ_{j=0}^n τ⁻¹(μ(r_j))i^j = Σ_{j=0}^n r_j i^j.

But τ⁻¹ν: R[x] → L is surjective, and so we conclude that every element of L is a polynomial in i. Thus L = R[i]. Furthermore, since i² + 1 = 0, we have i² = −1, and hence every element of R[i] is of the form a + bi, a,b ∈ R. We have constructed a field R[i] containing R, containing an element i such that i² + 1 = 0, and consisting of all elements of the form a + bi, a,b ∈ R. Of course, R[i] is what we ordinarily call the field of complex numbers, C. As we shall see, the construction given here can be carried out for any irreducible polynomial p(x), but we must first systematically investigate the problem of factorization in a general polynomial ring R[x]. ∎
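The arithmetic of R[x]/(x² + 1) can be sketched directly: cosets are represented by pairs (a, b) standing for a + bi, with i² = −1. The names `cadd` and `cmul` are mine.

```python
# Hedged sketch of Example 9(c): arithmetic in R[x]/(x^2 + 1), cosets
# represented as pairs (a, b) standing for a + bi.

def cadd(u, v):
    return (u[0] + v[0], u[1] + v[1])

def cmul(u, v):
    a, b = u
    c, d = v
    # (a + bi)(c + di) = (ac - bd) + (ad + bc)i, using i^2 = -1
    return (a * c - b * d, a * d + b * c)

i = (0, 1)
print(cmul(i, i))             # (-1, 0): so i^2 + 1 = 0
print(cmul((1, 2), (3, -1)))  # (5, 5): (1 + 2i)(3 - i) = 5 + 5i
```

Reducing every product modulo x² + 1 is exactly what replacing i² by −1 accomplishes, which is why the pair representation closes under multiplication.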

As a final application of the ring extension theorem in this section, we indicate a method of embedding an arbitrary ring in a ring with a multiplicative identity.

Theorem 2.12 Let R be a ring. Then there exists a ring S with a multiplicative identity such that R is a subring of S.

Proof: Let T = Z × R and define two binary operations ⊕ and ⊗ on T by

(m,r₁) ⊕ (n,r₂) = (m + n, r₁ + r₂)   (87)

and

(m,r₁) ⊗ (n,r₂) = (mn, n·r₁ + m·r₂ + r₁r₂).   (88)

It is easy to verify that (T, {⊕,⊗}) is a ring and that (1,0) is the multiplicative identity in T (see Exercise 24). Moreover, L = {0} × R is a subring of T and the mapping ν: R → L given by ν(r) = (0,r) is a ring isomorphism (see Exercise 25). By the ring extension theorem there exist a ring S containing R as a subring and an isomorphism τ: S → T such that τ|R = ν. Since S ≅ T it follows that S contains a multiplicative identity. ∎

Corollary 7 Let R be a ring. Then there exist a ring T containing R, a ring with identity S containing R, and an indeterminate x over S such that T is the subring of S[x] consisting precisely of those polynomials with coefficients in R.

Proof:

By Theorem 2.12, there exists a ring with identity S containing R. By Theorem 2.4, there exists an indeterminate x over S. Let T be the totality of polynomials in S[x] whose coefficients belong to R. Then T is obviously a subring of S[x]. ∎
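The operations (87) and (88) on T = Z × R can be sketched as follows; this hedged example takes R = 2Z (which lacks an identity), and the names `t_add` and `t_mul` are mine.

```python
# Hedged sketch of Theorem 2.12: T = Z x R with operations (87) and
# (88), here for R = 2Z, so pairs are (m, r) with r even.

def t_add(u, v):
    (m, r1), (n, r2) = u, v
    return (m + n, r1 + r2)

def t_mul(u, v):
    (m, r1), (n, r2) = u, v
    return (m * n, n * r1 + m * r2 + r1 * r2)

one = (1, 0)
u = (3, 4)
print(t_mul(one, u), t_mul(u, one))  # both (3, 4): (1, 0) is the identity
```

The element (1, 0) acts as identity on both sides even though R itself has none; R sits inside T as the subring {0} × R.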

over R. We denote the ring T of the corollary simply by R[x]. Example 10 Let R = 2Z be the ring of even integers. Then R[x] is the ring of polynomials with even coefficients in an indeterminate x over Z. Exercises

3.2

1. Show that if f ∈ R[Z⁺] is the zero function, then f = L(f(0))x⁰ [see formula (5) for notation].
2. Complete the verification that (B, {⊕,∗}) in Theorem 2.2 is a ring.
3. Let R be a ring with identity. Let x₁ be an indeterminate over R and x_k an indeterminate over R[x₁]···[x_{k−1}], 2 ≤ k ≤ n. Show that each x_k is an indeterminate over R.
4. Prove formulas (30) to (33).
5. Confirm formulas (42) and (43).
6. Show that if R is an integral domain, the leading term of a product of polynomials in R[x₁, . . ., xₙ] is the product of the leading terms. Hint: Prove this for the product of two polynomials and proceed by induction.
7. Let x be an indeterminate over R, and let f(x) ∈ R[x] be a nonzero polynomial whose leading coefficient is not a zero divisor in R. Show that if g(x) ∈ R[x] is any other nonzero polynomial, then

deg f(x)g(x) = deg f(x) + deg g(x).

Show that in general, if f(x), g(x) ∈ R[x] and f(x)g(x) ≠ 0, then

deg f(x)g(x) ≤ deg f(x) + deg g(x).

Finally, confirm formula (45).
8. Prove formula (60).
9. Show that if {A_i | i ∈ I} is a family of right ideals in a ring R, then ∩_{i∈I} A_i is a right ideal in R. Show that the same result is true for left and two-sided ideals.


10. Show that if {A_i | i ∈ I} is a family of right ideals in a ring R, then Σ_{i∈I} A_i is a right ideal in R. Show that the same result is true for left and two-sided ideals.
11. Confirm formula (70). Hint: Clearly aR + Ra + RaR + Z·a is an ideal in R containing a; so (a) ⊂ aR + Ra + RaR + Z·a. But if A is any ideal in R containing a, then aR ⊂ A, Ra ⊂ A, RaR ⊂ A, and Z·a ⊂ A.
12. Show that the relation ρ defined by (75) is an equivalence relation on S.
13. Verify assertions (a) to (f) in the proof of Theorem 2.9.
14. Show that if θ: K → L is a ring isomorphism and K is a field, then L is a field.
15. Show that the units in M₂(Z) are the matrices with determinant ±1.
16. Show that the relation ρ defined by (83) is an equivalence relation on S. Hint: It is obvious that ρ is reflexive and symmetric. To show that ρ is transitive, suppose ((a,b),(c,d)) ∈ ρ and ((c,d),(e,f)) ∈ ρ, so that there exist u,v ∈ A such that (ad − bc)u = 0 and (cf − de)v = 0. Multiply the first of the equalities by fv and the second by bu: adfvu − bcfvu = 0, cfbuv − debuv = 0. Now add these equations to obtain adfvu − debuv = 0, i.e., (af − eb)duv = 0. Since d, u, v ∈ A we have duv ∈ A. Hence ((a,b),(e,f)) ∈ ρ.
17. Show that the definitions (84) and (85) of addition ⊕ and multiplication ⊗ in the ring of fractions over R with respect to A do not depend on the particular representatives. Hint: Suppose a/b = α/β, c/d = γ/δ, so that (aβ − bα)u = 0, (cδ − dγ)v = 0 for some u,v ∈ A. We must show that

(i) (ad + bc)/bd = (αδ + βγ)/βδ

and

(ii) ac/bd = αγ/βδ.

Now

((ad + bc)βδ − (αδ + βγ)bd)uv = ((aβ − bα)dδ + (cδ − dγ)βb)uv
  = (aβ − bα)udδv + (cδ − dγ)vβbu = 0,

and uv ∈ A. This proves (i), and (ii) is proved similarly.

18. Show that the ring of fractions over R with respect to A is a commutative ring with identity. Hint: The identity is 1/1.
19. Let R be a commutative ring with identity 1, and let P be a prime ideal in R. Set A = R − P, and consider the ring of fractions S/ρ over R with respect to A. Prove that S/ρ is a local ring. Hint: Following Example 8(f), let M = {a/b ∈ S/ρ | a ∈ P}. Then obviously M is an ideal in S/ρ and M ≠ S/ρ. Suppose C is an ideal in S/ρ which is not a subset of M. Choose c/d ∈ C such that c/d ∉ M. Then c ∈ R − P = A, so that d/c ∈ S/ρ. Hence 1/1 = d/c · c/d ∈ C, and it follows that C = S/ρ. Thus M is the unique maximal ideal in S/ρ.
20. Show that the ideal (2,x) in Example 8(d) is a maximal ideal in Z[x]. Hint: (2,x) consists of the polynomials in Z[x] with even constant terms. Suppose C is an ideal in Z[x] containing (2,x) and a polynomial f(x) with an odd constant term, say f(x) = 2m − 1 + xg(x), where m ∈ Z and g(x) ∈ Z[x]. Consider the ideal M = (2,x,f(x)) = 2Z[x] + xZ[x] + f(x)Z[x]. We have 2m ∈ M, xg(x) ∈ M, and f(x) ∈ M; hence 1 ∈ M. It follows that M = Z[x], and since M ⊆ C, we obtain C = Z[x]. Thus (2,x) is a maximal ideal.

21. Let a group G with identity element e be written multiplicatively, and suppose |G| = m. Show that g^m = e for all g ∈ G. Hint: If g ∈ G, the elements g⁰ = e, g¹, . . . , g^m cannot all be distinct. It follows that g^k = e for some k ≥ 1. Then [g] = {g⁰, . . . , g^{k−1}} is a subgroup of G, and Lagrange's theorem implies that k|m. Hence g^m = e.
22. Let K be a field, L a ring, and μ: K → L a ring homomorphism. Show that if μ(K) ≠ {0}, then μ(K) is a field. Hint: Since ker μ is an ideal in the field K, we have ker μ = K or ker μ = {0}. But μ(K) ≠ {0}, so that ker μ = {0}. Thus μ(K) is the isomorphic image of the field K and hence is itself a field (see Exercise 14).
23. Following Example 9(c), show how to construct a field containing Q and an element r such that r² − 2 = 0.
24. In the proof of Theorem 2.12, show that (T, {⊕,⊗}) is a ring with (1,0) as multiplicative identity.
25. Show that the mapping ν: R → L in the proof of Theorem 2.12 is a ring isomorphism.

Glossary

3.2

ascending chain condition, 158
a.c.c., 158
chain of ideals (ascending, descending), 158
completely symmetric functions, 149
constant term, 142
degree of f(x), 142
deg f(x), 142
descending chain condition, 158
d.c.c., 158
elementary symmetric functions, 149
e.s.f., 149
Fermat's theorem, 168
field of fractions, 163
finitely generated ideal, 157
(a₁, . . . , aₙ), 157
Hilbert Basis Theorem, 160
ideal generated by X, 157
(X), 157
independent indeterminates, 147
indeterminate over a ring R, 144
indeterminate with respect to a ring R, 139
leading coefficient, 142, 150
leading term, 142, 150
local ring, 167
maximal ideal, 166
monic, 142
monomial, 142
Newton identities, 154
Noetherian ring, 159
prime ideal, 166
principal ideal, 157
principal ideal ring, 157
quotient field, 163
rational functions, 163
rational function field, 163
R(x₁, . . . , xₙ), 163
ring extension theorem, 143
ring of fractions over R with respect to A, 165
ring of polynomials in the indeterminates x₁, . . . , xₙ over R, 147
R[x₁, . . . , xₙ], 147
ring of polynomials in x, 142
R[x], 142
symmetric polynomial, 149
transcendental over R, 142
unit in a ring, 163
variable over R, 142


Wronski polynomials, 149

3.3 Unique Factorization Domains

A principal ideal domain, abbreviated P.I.D., is a principal ideal ring which is also an integral domain. Thus a ring R is a P.I.D. iff R is a commutative ring with identity containing no zero divisors and such that any ideal in R is of the form (a) = Ra.

We know from elementary arithmetic that any integer is a product of prime integers. Moreover, this factorization is unique except possibly for order and factors of ±1, e.g.,

6 = 2·3 = (−2)(−3) = 3·2 = (−3)(−2).

In this section we will show that if R is a P.I.D., then a unique factorization into primes is possible in R as it is in Z.

Example 1 Z is a P.I.D., for Z is certainly an integral domain, and if A is an ideal in Z, then A = nZ = (n) for some n ∈ Z (see Exercise 17, Section 3.1).
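Example 1 can be made concrete: since every ideal of Z is nZ, the ideal generated by a finite set of integers is generated by their greatest common divisor. A hedged sketch, with the function name `ideal_generator` mine:

```python
# Hedged sketch of Example 1: the ideal of Z generated by a finite
# set of integers is nZ, where n is the gcd of the set.
from math import gcd
from functools import reduce

def ideal_generator(gens):
    return reduce(gcd, gens)

print(ideal_generator([12, 18, 30]))  # 6: (12, 18, 30) = 6Z
```

Every Z-linear combination of 12, 18, 30 is a multiple of 6, and 6 itself is such a combination, so (12, 18, 30) = 6Z.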

If R is a field and x is an indeterminate over R, then R[x] is a principal ideal domain. In order to establish this important fact, we need to study divisibility in polynomial rings. We will conduct this study in somewhat greater generality so that our results may be used later in discussing canonical forms for matrices.

Let R be a subring of an arbitrary ring S, and let x be an indeterminate over R. Let t be a fixed element of S, and define two mappings, φ_r: R[x] → S and φ_l: R[x] → S, as follows: for f(x) = a₀ + a₁x + ··· + aₙxⁿ ∈ R[x], put

φ_r(f(x)) = a₀ + a₁t + ··· + aₙtⁿ

and

φ_l(f(x)) = a₀ + ta₁ + ··· + tⁿaₙ;

we also denote φ_r(f(x)) by f_r(t) and φ_l(f(x)) by f_l(t). The function φ_r is called right substitution or right specialization at t, and f_r(t) is called the right-hand value of f(x) at t. Similarly, the function φ_l is called left substitution or left specialization at t, and f_l(t) is the left-hand value of f(x) at t. If t commutes with the elements of R, then obviously f_r(t) = φ_r(f(x)) = φ_l(f(x)) = f_l(t); in this case we designate φ_r and φ_l simply by φ and call f_r(t) = f_l(t) the value of f(x) at t, denoting it by f(t). If R is a subring of a commutative ring S and c ∈ S is such that f(c) = 0, then c is called a root or a zero of f(x) in S. The


following result is evident from the definitions, and its proof is left to the reader (see Exercise 1).

Theorem 3.1 Let R be a subring of a ring S, x an indeterminate over R, and t a fixed element of S.
(i) The mappings φ_r and φ_l are group homomorphisms of (R[x], {+}) into (S, {+}).
(ii) If t commutes with every element of R, then φ: R[x] → S is a ring homomorphism.
(iii) If t is an indeterminate over R, then φ: R[x] → R[t] is a ring isomorphism.
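The distinction between right and left substitution matters only when t fails to commute with R. A hedged illustration in M₂(Z), with the names `mmul`, `f_r`, `f_l` mine:

```python
# Hedged illustration: for f(x) = a1*x with a1 a 2x2 integer matrix,
# the right-hand value f_r(t) = a1*t and left-hand value f_l(t) = t*a1
# can differ when t does not commute with a1.

def mmul(A, B):
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

a1 = ((1, 1), (0, 1))
t = ((0, 1), (1, 0))
f_r = mmul(a1, t)   # right substitution: coefficient, then t
f_l = mmul(t, a1)   # left substitution: t, then coefficient
print(f_r)          # ((1, 1), (1, 0))
print(f_l)          # ((0, 1), (1, 1))
```

When t is central the two values coincide, which is case (ii) of Theorem 3.1.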

The key elementary result concerning polynomials is

Theorem 3.2 (Division Algorithm) Let R be a ring with identity and x an indeterminate over R. Let

a(x) = a₀ + a₁x + ··· + aₙxⁿ ∈ R[x]

and

b(x) = b₀ + b₁x + ··· + b_m x^m ∈ R[x],

where aₙ ≠ 0 and b_m is a unit in R. Then there exist unique polynomials q(x) and r(x) in R[x] such that

a(x) = q(x)b(x) + r(x)   (1)

and r(x) = 0 or deg r(x) < deg b(x). Similarly, there exist unique polynomials s(x) and u(x) in R[x] such that

a(x) = b(x)s(x) + u(x)   (2)

and u(x) = 0 or deg u(x) < deg b(x).

Proof: We first establish the existence of the polynomials q(x) and r(x). The proof is by induction on n. Suppose n = 0, so that a(x) = a₀. If m > 0, set q(x) = 0 and r(x) = a₀. If m = 0, set q(x) = a₀b₀⁻¹ and r(x) = 0. Now assume that n > 0 and that (1) holds for all polynomials of degree n − 1 or less. If m > n, set q(x) = 0 and r(x) = a(x). If m ≤ n, consider the polynomial

ā(x) = a(x) − aₙb_m⁻¹x^{n−m}b(x).   (3)

It is obvious that ā(x) = 0 or deg ā(x) ≤ n − 1. If ā(x) = 0, set q(x) = aₙb_m⁻¹x^{n−m} and r(x) = 0. Otherwise by the induction hypothesis there exist polynomials q₁(x) and r₁(x) such that

ā(x) = q₁(x)b(x) + r₁(x)   (4)

and r₁(x) = 0 or deg r₁(x) < m. Substituting (4) in (3) yields

a(x) = [aₙb_m⁻¹x^{n−m} + q₁(x)]b(x) + r₁(x);

now set q(x) = aₙb_m⁻¹x^{n−m} + q₁(x) and r(x) = r₁(x) to obtain (1). To show that q(x) and r(x) are uniquely determined, suppose we also have



a(x) = Q(x)b(x) + p(x),   (5)

where p(x) = 0 or deg p(x) < deg b(x) = m. Combining (1) and (5) yields

[Q(x) − q(x)]b(x) = r(x) − p(x).   (6)

If Q(x) ≠ q(x), then since b_m is not a zero divisor in R (being a unit), it follows that deg [Q(x) − q(x)]b(x) ≥ m, whereas deg [r(x) − p(x)] < m. Thus Q(x) = q(x) and [by (6)] r(x) = p(x). This establishes the uniqueness of the polynomials q(x) and r(x). The proof of the second conclusion [involving (2)] is entirely analogous. ∎

Example 2

Let x be an indeterminate over M₂(Z). Suppose

a(x) = [2 1; 0 0]x² + [0 0; 1 0]x + [1 0; 0 1]

and

b(x) = [0 1; −1 0]x + [1 1; 1 1],

where [a b; c d] denotes the matrix with rows (a b) and (c d). Then b₁ = [0 1; −1 0] is a unit in M₂(Z): b₁⁻¹ = [0 −1; 1 0]. We find the polynomials q(x), r(x) ∈ M₂(Z)[x] such that a(x) = q(x)b(x) + r(x) and r(x) = 0 or deg r(x) < deg b(x). Mimicking the proof of Theorem 3.2, we set

ā(x) = a(x) − [2 1; 0 0][0 −1; 1 0]xb(x) = [1 1; 1 0]x + [1 0; 0 1].   (7)

Since deg ā(x) = 1 = deg b(x), we repeat the process with ā(x) and compute

ā(x) − [1 1; 1 0][0 −1; 1 0]b(x) = [1 0; 1 2],

so that

ā(x) = [1 −1; 0 −1]b(x) + [1 0; 1 2].   (8)

Substituting (8) into the first equality in (7) yields

a(x) = [1 −2; 0 0]xb(x) + [1 −1; 0 −1]b(x) + [1 0; 1 2].

Therefore

q(x) = [1 −2; 0 0]x + [1 −1; 0 −1]   and   r(x) = [1 0; 1 2].

Observe that

b(x)q(x) + r(x) = [0 0; −1 2]x² + [1 −3; 0 −1]x + [2 −2; 2 0] ≠ a(x);

thus in general q(x) ≠ s(x) and r(x) ≠ u(x).
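The right-division loop from the proof of Theorem 3.2 can be run mechanically and checked against Example 2. This is a hedged sketch, not the book's notation: polynomials are lists of 2×2 integer matrices (tuples of tuples) in ascending degree, and the names `right_divide`, `mmul`, `msub` are mine.

```python
# Hedged sketch of right division in M2(Z)[x] (Theorem 3.2), with p[k]
# the coefficient of x^k.

def mmul(A, B):
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def msub(A, B):
    return tuple(tuple(A[i][j] - B[i][j] for j in range(2)) for i in range(2))

ZERO = ((0, 0), (0, 0))

def right_divide(a, b, b_lead_inv):
    """Return (q, r) with a(x) = q(x)b(x) + r(x), r = 0 or deg r < deg b."""
    q = [ZERO] * max(len(a) - len(b) + 1, 1)
    r = list(a)
    while len(r) >= len(b):
        k = len(r) - len(b)
        c = mmul(r[-1], b_lead_inv)       # leading coeff times b_m^{-1}
        q[k] = c
        for j, bj in enumerate(b):        # subtract c x^k b(x)
            r[k + j] = msub(r[k + j], mmul(c, bj))
        while r and r[-1] == ZERO:        # leading term cancels
            r.pop()
    return q, r

# Example 2, coefficients in ascending order of degree:
a = [((1, 0), (0, 1)), ((0, 0), (1, 0)), ((2, 1), (0, 0))]
b = [((1, 1), (1, 1)), ((0, 1), (-1, 0))]
q, r = right_divide(a, b, ((0, -1), (1, 0)))
print(q)  # [((1, -1), (0, -1)), ((1, -2), (0, 0))]
print(r)  # [((1, 0), (1, 2))]
```

The loop terminates because each pass cancels the current leading coefficient exactly, mirroring step (3) of the proof.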



The polynomials a(x), b(x), q(x), r(x), s(x), and u(x) in Theorem 3.2 are called the dividend, divisor, right quotient, right remainder, left quotient, and left remainder, respectively. Of course, if R is a commutative ring, there is no distinction between "right" and "left"; q(x) = s(x) is called the quotient and r(x) = u(x) is called the remainder. If R is commutative and r(x) = 0, so that

a(x) = q(x)b(x) = b(x)q(x),

then we say b(x) divides a(x) and write b(x) | a(x). The reader will easily confirm the following facts (see Exercise 2):

If f(x) | g(x) and g(x) | h(x), then f(x) | h(x).   (9)

If f(x) | g(x), f(x) | h(x), and m(x), n(x) ∈ R[x], then f(x) | [m(x)g(x) + n(x)h(x)].   (10)

If the leading coefficient of f(x) is not a zero divisor, then f(x)g(x) = f(x)h(x) implies g(x) = h(x).   (11)

If f(x) | g(x), g(x) | f(x), and the leading coefficients of f(x) and g(x) are not zero divisors, then f(x) = rg(x) for some unit r ∈ R.   (12)

Theorem 3.3 (Remainder Theorem) Let R be a ring with identity and x an indeterminate over R. If a(x) ∈ R[x], a(x) ≠ 0, and c ∈ R, then there exist unique polynomials q(x) and s(x) in R[x] such that

a(x) = q(x)(x − c) + a_r(c),   (13)
a(x) = (x − c)s(x) + a_l(c).   (14)

Proof:

By Theorem 3.2 there exist unique polynomials q(x), r(x) ∈ R[x] such that

a(x) = q(x)(x − c) + r(x)   (15)

and r(x) = 0 or deg r(x) < deg (x − c) = 1. In any event we have r(x) = d ∈ R, so that

a(x) = q(x)(x − c) + d.   (16)

Let φ_r: R[x] → R be right substitution at c; recall that φ_r is a homomorphism of the additive group (R[x], {+}). Applying φ_r to both sides of (16) [say q(x) = Σ_k q_k x^k], we compute that

a_r(c) = φ_r(a(x)) = φ_r[q(x)(x − c)] + φ_r(d)
  = φ_r[(Σ_k q_k x^k)(x − c)] + d
  = φ_r(Σ_k q_k(x^{k+1} − cx^k)) + d
  = Σ_k q_k c^{k+1} − Σ_k (q_k c)c^k + d
  = Σ_k q_k c^{k+1} − Σ_k q_k c^{k+1} + d = d.

Thus a(x) = q(x)(x − c) + a_r(c), proving (13). The proof of (14) is similar. ∎
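Over a commutative ring the Remainder Theorem says the remainder on division by (x − c) is the value a(c). A hedged sketch for R = Z, using synthetic division; the names `divmod_linear` and `value` are mine.

```python
# Hedged check of the Remainder Theorem for R = Z: dividing a(x) by
# (x - c) leaves remainder a(c).

def divmod_linear(a, c):
    """Divide a(x) (ascending coefficient list) by (x - c); return (q, r)."""
    q = [0] * (len(a) - 1)
    r = a[-1]
    for k in range(len(a) - 2, -1, -1):
        q[k] = r
        r = a[k] + c * r
    return q, r

def value(a, c):
    return sum(ak * c**k for k, ak in enumerate(a))

a = [1, 0, 2, 1]               # 1 + 2x^2 + x^3
q, r = divmod_linear(a, 3)
print(q, r)                    # [15, 5, 1] 46
print(r == value(a, 3))        # True
```

The loop is Horner's scheme: the running value r is simultaneously the next quotient coefficient and, at the end, the remainder a(c).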

Corollary 1 If R is a commutative ring with identity and a(x) ∈ R[x], a(x) ≠ 0, then c ∈ R is a root of a(x) iff (x − c) | a(x).

Proof: If c is a root of a(x), then (13) reads a(x) = q(x)(x − c), so that (x − c) | a(x). On the other hand, if (x − c) | a(x), the uniqueness of the remainder in (13) implies that a(c) = a_r(c) = 0, i.e., c is a root of a(x). ∎

Corollary 2 If R is an integral domain, a(x) ∈ R[x], and deg a(x) = n ≥ 0, then a(x) has at most n roots in R.

Proof: The proof is by induction on n. The result is trivial if n = 0. Suppose n = 1 and c,d ∈ R are roots of a(x) = a₁x + a₀. Then a₁c + a₀ = 0 = a₁d + a₀ implies a₁(c − d) = 0. Since R is an integral domain, we obtain c = d. Now assume that n > 1 and the result holds for polynomials of degree less than n. If a(x) has a root c ∈ R, then by Corollary 1

a(x) = q(x)(x − c)

for some q(x) ∈ R[x]. Note that d ≠ c is a root of a(x) iff 0 = a(d) = q(d)(d − c) iff q(d) = 0, since R is an integral domain. Thus the only roots of a(x) other than c are those that q(x) might possess. But deg q(x) = n − 1 (see Exercise 7, Section 3.2), so q(x) has at most n − 1 roots in R by the induction hypothesis. We conclude that a(x) has at most n roots in R. ∎

Corollary 3

Let R be an integral domain, and let f(x), g(x) ∈ R[x] be polynomials of degree at most n ≥ 0. Suppose c₁, . . . , c_{n+1} are n + 1 distinct elements of R. Then f(c_k) = g(c_k) for k = 1, . . . , n + 1 iff f(x) = g(x).

Proof: Apply Corollary 2 to the polynomial f(x) − g(x). ∎
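Corollary 3 makes agreement at n + 1 distinct points a valid equality test for polynomials of degree at most n. A hedged illustration over R = Z, showing that n points are not enough; the name `value` is mine.

```python
# Hedged illustration of Corollary 3 over Z: f = x^2 - x (degree 2)
# agrees with the zero polynomial at the 2 points 0 and 1, but a third
# point exposes the difference, as the corollary guarantees.

def value(coeffs, c):
    return sum(a * c**k for k, a in enumerate(coeffs))

f = [0, -1, 1]   # x^2 - x = x(x - 1)
g = [0]          # the zero polynomial
print([value(f, c) for c in (0, 1)])  # [0, 0]: agreement at 2 points
print(value(f, 2))                    # 2: disagreement at a 3rd point
```

This is consistent with Corollary 2: a nonzero polynomial of degree 2 can vanish at no more than 2 points.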

Corollary 4 Let R be an integral domain, let x₁, . . . , xₙ be independent indeterminates over R, and let f(x₁, . . . , xₙ) ∈ R[x₁, . . . , xₙ]. Suppose R has an infinite subset S such that f(s₁, . . . , sₙ) = 0 for arbitrary specializations of the x_k to s_k ∈ S, k = 1, . . . , n. Then f(x₁, . . . , xₙ) = 0.

Proof: We first remark that if f(x₁, . . . , xₙ) = Σ_γ a_γ x₁^{γ(1)} ··· xₙ^{γ(n)}, then f(s₁, . . . , sₙ) has the obvious meaning

f(s₁, . . . , sₙ) = Σ_γ a_γ s₁^{γ(1)} ··· sₙ^{γ(n)}.   (17)

The proof is by induction on n. If n = 1, the result follows trivially from Corollary 2. Assume that n > 1 and that the result holds for polynomials in fewer than n indeterminates over R. Write f(x₁, . . . , xₙ) as a polynomial in xₙ with coefficients in R[x₁, . . . , x_{n−1}]:

f(x₁, . . . , xₙ) = Σ_{j=0}^r g_j(x₁, . . . , x_{n−1})xₙ^j.   (18)

Let s_k⁰, k = 1, . . . , n − 1, be arbitrary but fixed elements of S. From (18) we obtain

f(s₁⁰, . . . , s⁰_{n−1}, xₙ) = Σ_{j=0}^r g_j(s₁⁰, . . . , s⁰_{n−1})xₙ^j.   (19)

Now the polynomial (19) belongs to R[xₙ] and vanishes for more than r specializations of xₙ to elements of R. We conclude by Corollary 2 that f(s₁⁰, . . . , s⁰_{n−1}, xₙ) = 0, i.e., g_j(s₁⁰, . . . , s⁰_{n−1}) = 0 for j = 0, . . . , r. It follows by the induction hypothesis that g_j(x₁, . . . , x_{n−1}) = 0 for j = 0, . . . , r. Finally, (18) implies that f(x₁, . . . , xₙ) = 0. ∎

Corollary 5  Let R be an integral domain, let x₁, . . . , xₙ be independent indeterminates over R, and let f(x₁, . . . , xₙ), g(x₁, . . . , xₙ) ∈ R[x₁, . . . , xₙ]. Suppose R has an infinite subset S such that f(s₁, . . . , sₙ) = g(s₁, . . . , sₙ) for arbitrary specializations of the x_k to s_k ∈ S, k = 1, . . . , n. Then f(x₁, . . . , xₙ) = g(x₁, . . . , xₙ).

Proof:  Apply Corollary 4 to the polynomial f(x₁, . . . , xₙ) − g(x₁, . . . , xₙ). ∎

The most important consequence of Theorem 3.2 is the following result.

Theorem 3.4  If R is a field and x is an indeterminate over R, then R[x] is a P.I.D.

Proof:  We know that R[x] is an integral domain (see Corollary 1, Section 3.2). Suppose A ≠ (0).

. . .

then g(x) = h(x)k(x), where h(x), k(x) ∈ F[x] have positive degree. By Theorem 4.3 we may write h(x) = (a₁/b₁)h₁(x) and k(x) = (a₂/b₂)k₁(x), where a₁, b₁, a₂, b₂ ∈ D and h₁(x), k₁(x) are primitive polynomials in D[x]. Then

g(x) = (a₁a₂/b₁b₂) h₁(x)k₁(x),

and h₁(x)k₁(x) ∈ D[x] is a primitive polynomial by Theorem 4.2. The uniqueness part of Theorem 4.3 implies that g(x) = a h₁(x)k₁(x) for some unit a ∈ D; so again g(x) is not irreducible in D[x]. Conversely, assume g(x) is not irreducible in D[x]. If deg g(x) = 0, then 0 ≠ g(x) ∈ F is a unit in F[x] and hence is not irreducible in F[x]. If deg g(x) > 0, then since g(x) is primitive, it follows that g(x) is a product of two polynomials in D[x] of positive degree. Again, this obviously implies that g(x) is not irreducible in F[x].

(ii) Suppose φ(x) ∈ D[x] is of positive degree. If φ(x) has two factors of positive degree in D[x], then obviously these factors also belong to F[x]. Conversely, assume φ(x) has two factors of positive degree in F[x], i.e., φ(x) is not irreducible in F[x]. Write φ(x) = ag(x), where a ∈ D and g(x) ∈ D[x] is primitive (see Exercise 1). Since deg g(x) > 0 and g(x) is primitive, it follows from (i) that g(x) has two factors of positive degree in D[x]. Hence the same is true of φ(x). ∎

Corollary 1  If D is a U.F.D. and x is an indeterminate over D, then the set of prime elements and the set of irreducible elements in D[x] coincide.

Proof:  Recall that any prime in an integral domain is irreducible [see Example 5(c), Section 3.3]. On the other hand, suppose φ(x) ∈ D[x] is irreducible in D[x]. If φ(x) is not a primitive polynomial, then there exists a prime p ∈ D which divides every coefficient of φ(x). Since φ(x) is irreducible in D[x] and p | φ(x), it follows that φ(x) is an associate of p and hence is itself a prime in D. Now Exercise 2 implies that φ(x) is a prime in D[x]. So assume φ(x) is a primitive irreducible polynomial in D[x]; then we must have deg φ(x) > 0. Suppose φ(x) | α(x)β(x) for some 0 ≠ α(x), β(x) ∈ D[x]. Let F be the quotient field of D. Since F[x] is a P.I.D. by Theorem 3.4, every irreducible element of F[x] is a prime. We conclude from Theorem 4.4(ii) that φ(x) is irreducible in F[x] and hence is a prime in F[x]. Regarding the division φ(x) | α(x)β(x) as taking place in F[x], it follows that φ(x) | α(x) or φ(x) | β(x) in F[x]. Assume without loss of generality that

3.4  Polynomial Factorization  197

α(x) = γ(x)φ(x)

for some γ(x) ∈ F[x]. Write α(x) = d₁α₁(x), γ(x) = (d₂/d₃)γ₁(x), where d₁, d₂, d₃ ∈ D and α₁(x), γ₁(x) are primitive polynomials in D[x]. Then

d₁α₁(x) = (d₂/d₃) γ₁(x)φ(x),

and γ₁(x)φ(x) is a primitive polynomial in D[x] by Theorem 4.2. The uniqueness part of Theorem 4.3 implies that d₂/d₃ = εd₁ for some unit ε ∈ D. Hence

γ(x) = (d₂/d₃) γ₁(x) = εd₁γ₁(x) ∈ D[x],  i.e.,  φ(x) divides α(x) in D[x].

This shows that φ(x) is a prime in D[x]. ∎

Theorem 4.4(i) implies the important fact that an irreducible polynomial in D[x] is either an irreducible element of D or a primitive polynomial of positive degree in D[x] which is irreducible in F[x]. Using Corollary 1, we obtain the equivalent statement that a prime polynomial in D[x] is either a prime in D or a primitive polynomial of positive degree in D[x] which is prime in F[x]. The result of our work so far is the following important statement.

Theorem 4.5  If D is a U.F.D. and x is an indeterminate over D, then D[x] is a U.F.D.

Proof:  Let 0 ≠ f(x) ∈ D[x] be a nonunit in D[x]. If f(x) ∈ D, then since D is a U.F.D., f(x) can be written as a product of primes in D (and hence in D[x]). So assume deg f(x) ≥ 1 and write f(x) = dg(x), where d ∈ D and g(x) ∈ D[x] is a primitive polynomial of positive degree. Let F be the quotient field of D. Since F[x] is a U.F.D. by Theorems 3.4 and 3.6, regarding g(x) as an element of F[x] we can write

g(x) = φ₁(x) · · · φᵣ(x),  (3)

where φᵢ(x) ∈ F[x] is a prime polynomial of positive degree in F[x], i = 1, . . . , r. By Theorem 4.3 we have for each i that

φᵢ(x) = (aᵢ/bᵢ) gᵢ(x),  (4)

where aᵢ, bᵢ ∈ D, gᵢ(x) ∈ D[x], and gᵢ(x) is primitive. For each i, gᵢ(x) is prime in F[x] since φᵢ(x) is, and hence gᵢ(x) is a prime polynomial in D[x] by the remark following Corollary 1. Combining (3) and (4), we have

g(x) = (a₁ · · · aᵣ)/(b₁ · · · bᵣ) g₁(x) · · · gᵣ(x),  (5)

and g₁(x) · · · gᵣ(x) is a primitive polynomial in D[x] by Theorem 4.2.

Since g(x) ∈ D[x] is primitive, the uniqueness part of Theorem 4.3 implies that

(a₁ · · · aᵣ)/(b₁ · · · bᵣ) = ε  (6)

for some unit ε ∈ D. Using (5) and (6), we obtain

f(x) = dg(x) = d (a₁ · · · aᵣ)/(b₁ · · · bᵣ) g₁(x) · · · gᵣ(x) = dε g₁(x) · · · gᵣ(x).  (7)

Now D is a U.F.D., so there exist a unit δ ∈ D and primes d₁, . . . , d_s ∈ D such that dε = δd₁ · · · d_s (if d is a unit in D no primes are required). Then from (7),

f(x) = δd₁ · · · d_s g₁(x) · · · gᵣ(x)  (8)

is a factorization of f(x) into a product of a unit δ ∈ D, primes d₁, . . . , d_s in D (and hence in D[x]), and prime polynomials of positive degree g₁(x), . . . , gᵣ(x) in D[x]. Suppose that

f(x) = μe₁ · · · eₙ h₁(x) · · · h_m(x)  (9)

is another such factorization. The gᵢ(x) (i = 1, . . . , r) and h_j(x) (j = 1, . . . , m) are prime polynomials of positive degree in D[x], and hence are primitive polynomials in D[x] which are prime in F[x] (see the remark following Corollary 1). Since F[x] is a U.F.D., δd₁ · · · d_s g₁(x) · · · gᵣ(x) = μe₁ · · · eₙ h₁(x) · · · h_m(x), and δd₁ · · · d_s and μe₁ · · · eₙ are units in F[x], it follows that r = m and, after a suitable reordering, hᵢ(x) = (kᵢ/lᵢ)gᵢ(x) for some kᵢ, lᵢ ∈ D, i = 1, . . . , r. Hence there exists a unit μ′ ∈ D such that δd₁ · · · d_s = μ′e₁ · · · eₙ, and the fact that D is a U.F.D. completes the proof. ∎

Thus we see that if D is a U.F.D., then any polynomial f(x) ∈ D[x] of positive degree has a factorization of the form

f(x) = δd₁ · · · d_s g₁(x) · · · gᵣ(x),

where δ is a unit in D, d₁, . . . , d_s are primes in D, and g₁(x), . . . , gᵣ(x) are prime polynomials of positive degree in D[x] (which are thereby primitive in D[x] and prime in F[x]).

Corollary 2  If D is a U.F.D. and x₁, . . . , xₙ are independent indeterminates over D, then D[x₁, . . . , xₙ] is a U.F.D.

Proof:  Since D[x₁, . . . , xₙ] = D[x₁, . . . , xₙ₋₁][xₙ], this result follows from Theorem 4.5 by induction. ∎


Example 1  (a) The polynomial x² + x + 1 ∈ Z[x] is a primitive polynomial of positive degree in Z[x] which is easily seen to be irreducible in Q[x]; thus x² + x + 1 is irreducible in Z[x] by the remark following Corollary 1. On the other hand, 2x² + 2x + 2 = 2(x² + x + 1) is not an irreducible polynomial in Z[x].

(b) Let D be an integral domain and x an indeterminate over D. Let r be a nonnegative integer, and let c_k, d_k ∈ D, k = 0, . . . , r, where the d_k are r + 1 distinct elements of D. We assert that there is at most one polynomial f(x) ∈ D[x] of degree at most r such that f(d_k) = c_k, k = 0, . . . , r. First of all, if r = 0 we simply set f(x) = c₀ (uniqueness is obvious). Assume that r ≥ 1, let F be the quotient field of D, and consider

f(x) = Σ_{k=0}^r c_k Π_{i≠k} (x − d_i)/(d_k − d_i) ∈ F[x].  (10)

Since f(x) is clearly a polynomial of degree at most r satisfying f(d_k) = c_k, k = 0, . . . , r, if f(x) ∈ D[x] it is a polynomial of the desired sort, and in this case the uniqueness of f(x) follows from Corollary 3, Section 3.3. A somewhat different argument can be used to show that f(x) ∈ D[x] is unique if it exists. Suppose

f(x) = Σ_{k=0}^r α_k x^k,  g(x) = Σ_{k=0}^r β_k x^k ∈ D[x]

are polynomials of degree at most r such that

f(d_j) = c_j = g(d_j),  j = 0, . . . , r.

Then we obtain

(α₀ − β₀) + (α₁ − β₁)d_j + · · · + (α_r − β_r)d_j^r = 0,  j = 0, . . . , r.  (11)

Regard (11) as a system of r + 1 equations in the r + 1 "unknowns" α_k − β_k with coefficient matrix

A = ⎡1  d₀  d₀²  · · ·  d₀^r⎤
    ⎢1  d₁  d₁²  · · ·  d₁^r⎥
    ⎢⋮                      ⎥
    ⎣1  d_r  d_r²  · · ·  d_r^r⎦  (12)

The distinctness of the d_k implies that det A ≠ 0 (see Exercise 3). Thus by an elementary result in matrix algebra the system (11) has only the solution α_k − β_k = 0, k = 0, . . . , r.

(c) In this example we exhibit an explicit constructive procedure for finding the factors of a polynomial in D[x], where D is a U.F.D. of characteristic 0 with a finite number of units. Let F be the quotient field of D. Suppose f(x) and g(x) are in D[x] and deg g(x) = r < deg f(x) = n. Let ξ₀, . . . , ξ_r be r + 1 distinct elements of D such that η₀ = f(ξ₀), . . . , η_r = f(ξ_r) are all different from 0. Let δ_k1, δ_k2, . . . , δ_km_k (not Kronecker deltas!) be the complete set of all divisors of η_k, k = 0, . . . , r. (There are only finitely many of these in a U.F.D. with finitely many units; see Exercise 4.) We construct a table:

ξ     η     δ
ξ₀    η₀    δ₀₁ δ₀₂ · · · δ₀m₀
⋮
ξ_r   η_r   δ_r1 δ_r2 · · · δ_rm_r        (13)

Clearly, if g(x) | f(x) in D[x], then g(ξ_k) | f(ξ_k), k = 0, . . . , r. Thus a necessary condition for g(x) | f(x) is that g(ξ_k) must be one of δ_k1, . . . , δ_km_k, say

g(ξ_k) = δ_kt_k,  k = 0, . . . , r.  (14)

We construct a unique polynomial g(x) ∈ F[x] of degree r which satisfies (14). If g(x) ∉ D[x], we reject it. If g(x) ∈ D[x], we test by dividing to see if g(x) is actually a divisor of f(x) in D[x]. Thus the procedure is to take all possible choices δ₀t₀, . . . , δ_rt_r of δ's, one out of each row of table (13), construct g(x) satisfying (14), see if it is in D[x], and finally test whether or not it divides f(x). Surely any divisor g(x) ∈ D[x] of f(x) of degree r must appear in this process. For if g(x) | f(x), then g(ξ_k) = δ_kt_k, k = 0, . . . , r, for some choice of t₀, . . . , t_r, and g(x) is uniquely determined by these r + 1 equalities. Of course, what one wants to do here is make as many of the η_k as possible into primes so that the number of δ's is diminished.

(d) We try to find a quadratic factor of f(x) = x⁵ + x − 1 in Z[x] by the preceding method. Here r = 2 and we can take ξ₀ = −1, ξ₁ = 0, ξ₂ = 1, so that η₀ = f(ξ₀) = −3, η₁ = f(0) = −1, η₂ = f(1) = 1. Table (13) becomes

ξ      η     δ
−1    −3     3  −3  1  −1
 0    −1     1  −1
 1     1     1  −1

For example, take 3, 1, 1 as the choice of δ's and construct an interpolating polynomial g(x) using formula (10):

g(x) = 3 · x(x − 1)/[(−1)(−1 − 1)] + 1 · (x + 1)(x − 1)/[(0 + 1)(0 − 1)] + 1 · (x + 1)x/[(1 + 1)(1 − 0)]
     = x² − x + 1.

If we divide g(x) into f(x), we see that

x⁵ + x − 1 = (x² − x + 1)(x³ + x² − 1).
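Parts (c) and (d) are mechanical enough to script. Below is a minimal sketch over D = Z (all function names are ours, not the book's): it builds the divisor table for f(x) = x⁵ + x − 1, interpolates each candidate g(x) by formula (10) in exact rational arithmetic, and keeps the candidates that land in Z[x] and pass trial division.

```python
from fractions import Fraction
from itertools import product

# Polynomials over Q as coefficient lists, lowest degree first.
def add(p, q):
    n = max(len(p), len(q))
    r = [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0) for i in range(n)]
    while len(r) > 1 and r[-1] == 0:
        r.pop()
    return r

def mul(p, q):
    r = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

def divmod_poly(f, g):
    q, r = [Fraction(0)], [Fraction(c) for c in f]
    while len(r) >= len(g) and any(r):
        t = [Fraction(0)] * (len(r) - len(g)) + [r[-1] / g[-1]]
        q, r = add(q, t), add(r, mul(t, [-Fraction(c) for c in g]))
    return q, r

def lagrange(points):
    """Formula (10): the interpolating polynomial through the pairs (d_k, c_k)."""
    f = [Fraction(0)]
    for k, (dk, ck) in enumerate(points):
        term = [Fraction(ck)]
        for i, (di, _) in enumerate(points):
            if i != k:
                term = mul(term, [Fraction(-di, dk - di), Fraction(1, dk - di)])
        f = add(f, term)
    return f

def divisors(n):
    return [s * d for d in range(1, abs(n) + 1) if n % d == 0 for s in (1, -1)]

f = [-1, 1, 0, 0, 0, 1]                       # f(x) = x^5 + x - 1
xi = [-1, 0, 1]                               # the specializations of x
eta = [sum(c * x**k for k, c in enumerate(f)) for x in xi]   # [-3, -1, 1]
factors = []
for choice in product(*(divisors(e) for e in eta)):          # one delta from each row
    g = lagrange(list(zip(xi, choice)))
    if len(g) == 3 and all(c.denominator == 1 for c in g):   # degree 2 and in Z[x]
        if divmod_poly(f, g)[1] == [0]:                      # trial division
            factors.append([int(c) for c in g])
print(factors)   # [[1, -1, 1], [-1, 1, -1]] -- i.e., x^2 - x + 1 up to sign
```

The two survivors are ±(x² − x + 1), agreeing with the factorization above; candidates such as the one interpolating (−1, 1), (0, −1), (1, 1) are generated but rejected by the divisibility test.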

Although the method of Example 1(c) is finite, it can obviously become extraordinarily tedious even for polynomials of small degree. There is a very useful result for testing irreducibility in D[x].

Theorem 4.6  (Eisenstein Criterion)  Let D be a U.F.D., let p ∈ D be a prime, and let f(x) = Σ_{k=0}^n a_k x^k ∈ D[x], aₙ ≠ 0. Assume that

(i)  aₙ ∉ (p);
(ii)  a₀ ∉ (p²);
(iii)  a_k ∈ (p), 0 ≤ k < n.

Then f(x) has no divisors of positive degree in D[x], and hence is irreducible in F[x], where F is the quotient field of D.

Proof:  Suppose

f(x) = (b_r x^r + · · · + b₀)(c_s x^s + · · · + c₀),  b_r c_s ≠ 0,  1 ≤ r < n.

Now a₀ = b₀c₀, and since a₀ is not divisible by p² [by (ii)], it follows that not both b₀ and c₀ are divisible by p, say b₀ ∉ (p). But p | a₀ by (iii), so p | b₀c₀. Thus p | b₀ or p | c₀, i.e., p | c₀. Similarly, we conclude from (i) and aₙ = b_r c_s that p is not a divisor of either b_r or c_s. Thus p divides c₀, but it does not divide c_s. We pick the least integer m ≥ 1 for which c_m is not divisible by p. Then a_m = b₀c_m + b₁c_{m−1} + · · · + b_m c₀. Now the minimality of m implies that

p | c_k,  0 ≤ k ≤ m − 1.  (15)

Thus a_m − b₀c_m ∈ (p). But b₀ ∉ (p) and c_m ∉ (p), and since p is prime it follows that b₀c_m ∉ (p). Thus a_m ∉ (p) [otherwise b₀c_m ∈ (p)]. But m ≤ s
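Over D = Z the three hypotheses of Theorem 4.6 translate directly into code; a minimal sketch (the function name is ours):

```python
def eisenstein(coeffs, p):
    """Check the Eisenstein criterion for f(x) = coeffs[0] + coeffs[1]*x + ...
    over Z with a prime p: a_n not in (p), a_0 not in (p^2), a_k in (p) for k < n."""
    a_n, a_0 = coeffs[-1], coeffs[0]
    return (a_n % p != 0                                # (i)
            and a_0 % (p * p) != 0                      # (ii)
            and all(a_k % p == 0 for a_k in coeffs[:-1]))  # (iii)

assert eisenstein([2, 2, 0, 1], 2)        # x^3 + 2x + 2 is irreducible over Q
assert not eisenstein([4, 2, 0, 1], 2)    # fails (ii): 4 is divisible by 2^2
```

When the test succeeds for some prime p, f(x) is irreducible in Q[x]; when it fails, nothing is concluded — the criterion is sufficient, not necessary.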
μ(K) is a monomorphism, and of course μ(K) is a subfield of K[x]/(p(x)). Now by Theorem 2.3, Section 3.2, secure a ring 𝒵 such that K is a subring of 𝒵 and an isomorphism τ: 𝒵 → K[x]/(p(x)) for which τ | K = μ. We know that 𝒵 is an extension of K. Now set α = τ⁻¹(ν(x)) ∈ 𝒵 and suppose p(x) = Σ_{i=0}^n kᵢxⁱ, kᵢ ∈ K, i = 0, . . . , n. Then

τ(p(α)) = Σᵢ τ(kᵢ)τ(α)ⁱ = Σᵢ μ(kᵢ)ν(x)ⁱ = ν(p(x)).

Thus τ(p(α)) is the zero in K[x]/(p(x)), and since τ is an isomorphism, it follows that p(α) = 0. Hence α ∈ 𝒵 is a root of the polynomial p(x). If f(x) ∈ K[x], divide f(x) by p(x) to obtain a remainder r(x) = Σ_{i=0}^ρ aᵢxⁱ, aᵢ ∈ K, i = 0, . . . , ρ, 0 ≤ ρ ≤ n − 1:

f(x) = q(x)p(x) + r(x).  (18)

Then

f(α) = q(α)p(α) + r(α) = r(α) = Σ_{i=0}^ρ aᵢαⁱ.

Now suppose f(x) = Σᵢ bᵢxⁱ. Then

τ⁻¹(ν(f(x))) = τ⁻¹(Σᵢ μ(bᵢ)ν(x)ⁱ) = Σᵢ τ⁻¹(μ(bᵢ)) τ⁻¹(ν(x))ⁱ = Σᵢ bᵢαⁱ = f(α).  (19)

But τ⁻¹ is surjective so that everything in 𝒵 is of the form τ⁻¹(ν(f(x))), and hence from (18) and (19) everything in 𝒵 is of the form (17). To confirm the uniqueness of the representation (17), suppose r(x) and s(x) are polynomials in K[x], neither of which has degree exceeding n − 1 and which satisfy r(α) = s(α). Thus we have a polynomial g(x) = r(x) − s(x) which satisfies g(α) = 0, and if g(x) ≠ 0, then deg g(x) < deg p(x); hence g(x) ∉ (p(x)). By the maximality of (p(x)), (p(x), g(x)) = K[x], and hence there exist polynomials u(x) and v(x) in K[x] such that

u(x)p(x) + v(x)g(x) = 1.

If we specialize x to α, we obtain the contradiction 0 = 1. Thus g(x) = 0, and the uniqueness of the representation (17) follows. We have proved that

𝒵 is a field consisting of polynomials in α, and it is obvious that 𝒵 = K(α), i.e., it is the smallest field containing α and K. It is also a consequence of the uniqueness that if a₀ + a₁α + · · · + a_ρα^ρ = 0, 0 < ρ ≤ n, then a₀ = · · · = a_ρ = 0.

We have proved the first three parts of the theorem for the field 𝒵 and the root α ∈ 𝒵. The remaining part (iv) is something of a technicality that requires replacing 𝒵 by an isomorphic copy as follows. Let y be an indeterminate over 𝒵, and consider the polynomial ring 𝒵[y]. Since K ⊂ 𝒵, it is clear that y is an indeterminate over K as well and K[y] is a subring of 𝒵[y]. Moreover it is obvious that the mapping ψ: K[y] → K[x] defined by ψ(Σᵢ bᵢyⁱ) = Σᵢ bᵢxⁱ is an isomorphism. [We write ψ(f(y)) = f(x).] Thus by Theorem 2.3, Section 3.2, there exists a ring ℛ and an isomorphism φ: 𝒵[y] → ℛ such that K[x] ⊂ ℛ and φ | K[y] = ψ.

(Commutative diagram: φ carries 𝒵[y] onto ℛ, restricting to ψ: K[y] → K[x].)

Now let φ(𝒵) = L, and note that since ψ | K = 1_K it follows that L is an extension field of K. Moreover, for θ = φ(α) ∈ L we have

p(θ) = Σᵢ kᵢθⁱ = Σᵢ kᵢφ(α)ⁱ = Σᵢ φ(kᵢ)φ(α)ⁱ

(remember that φ | K[y] = ψ and ψ | K = 1_K)

= φ(Σᵢ kᵢαⁱ) = φ(p(α)) = φ(0) = 0.

The rest of the conclusions (i) to (iii) concerning L and θ are also obvious consequences of the same statements for 𝒵 and α. Now y is an indeterminate over 𝒵, so φ(y) = ψ(y) = x must be an indeterminate over φ(𝒵) = L. Finally, anything in ℛ is of the form φ(h(y)), h(y) ∈ 𝒵[y]. But then since φ(y) = x, anything in ℛ is of the form h(x) where h(x) has coefficients in L, i.e., ℛ = L[x]. ∎

In the notation of Theorem 4.7 we have the following:


Corollary 3  If f(x) ∈ K[x] and f(θ) ≠ 0, then there exists a polynomial u(x) ∈ K[x] such that deg u(x) < n = deg p(x) and u(θ) = 1/f(θ) ∈ K(θ).

Proof:  This follows immediately from Theorem 4.7 because L = K(θ) is a field. ∎

Example 3  (a) If a method for computing the g.c.d. of two polynomials in K[x] is available, then the polynomial u(x) in Corollary 3 can be computed. For if f(θ) ≠ 0, then f(x) ∉ (p(x)) and hence (f(x), p(x)) = K[x] [i.e., (p(x)) is maximal]. But then g.c.d.(f(x), p(x)) = 1, and we can find u(x) and g(x) in K[x] such that

u(x)f(x) + g(x)p(x) = 1.

Specializing x to θ in L, we have u(θ)f(θ) + g(θ)p(θ) = 1. But p(θ) = 0, so u(θ)f(θ) = 1.

(b) We express 1/[(1 + i)³ + 3(1 + i) + 2] as a polynomial in 1 + i with rational number coefficients. Let θ = 1 + i. Then (θ − 1)² = i² = −1, and hence θ² − 2θ + 2 = 0. Let p(x) = x² − 2x + 2 ∈ Q[x]. Then p(x) is irreducible (use Theorem 4.6) in Q[x]. Now we want to compute g.c.d.(f(x), p(x)) where f(x) = x³ + 3x + 2. Divide f(x) by p(x) to obtain

f(x) = p(x)(x + 2) + 5x − 2,

and then divide p(x) by 5x − 2 to obtain

p(x) = (5x − 2)(x/5 − 8/25) + 34/25.

Combining these equations, we obtain

p(x) = [f(x) − p(x)(x + 2)](x/5 − 8/25) + 34/25,

or

34/25 = p(x)[1 + (x + 2)(x/5 − 8/25)] − f(x)(x/5 − 8/25).

Evaluating at θ, we have

34/25 = −f(θ)(θ/5 − 8/25),

or

1/f(θ) = 4/17 − (5/34)θ.

The field L = K(θ) of Theorem 4.7 is called a simple algebraic extension of K. The element θ ∈ L is said to be algebraic over K because it satisfies the equation p(θ) = 0 in L: An element of an extension field L of a field K is algebraic over K if it is the root of a polynomial in K[x]. If every element of L is algebraic over K, then L is called an algebraic extension of K. The degree of the polynomial p(x) is called the degree of the extension L = K(θ) over K; it is denoted by [L:K].
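The g.c.d. computation of Example 3 mechanizes directly: an extended Euclidean algorithm in Q[x] produces the u(x) of Corollary 3. A minimal sketch in exact rational arithmetic (helper names ours), applied to f(x) = x³ + 3x + 2 and p(x) = x² − 2x + 2, the minimal polynomial of θ = 1 + i:

```python
from fractions import Fraction

# Polynomials over Q as coefficient lists, lowest degree first.
def add(p, q):
    n = max(len(p), len(q))
    r = [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0) for i in range(n)]
    while len(r) > 1 and r[-1] == 0:
        r.pop()
    return r

def mul(p, q):
    r = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

def divmod_poly(f, g):
    q, r = [Fraction(0)], [Fraction(c) for c in f]
    while len(r) >= len(g) and any(r):
        t = [Fraction(0)] * (len(r) - len(g)) + [r[-1] / Fraction(g[-1])]
        q, r = add(q, t), add(r, mul(t, [-Fraction(c) for c in g]))
    return q, r

def ext_gcd(f, g):
    """Extended Euclidean algorithm in Q[x]: returns (d, u, v) with u*f + v*g = d."""
    r0, r1 = f, g
    u0, u1 = [Fraction(1)], [Fraction(0)]
    v0, v1 = [Fraction(0)], [Fraction(1)]
    while any(r1):
        q, r = divmod_poly(r0, r1)
        r0, r1 = r1, r
        u0, u1 = u1, add(u0, mul(q, [-c for c in u1]))
        v0, v1 = v1, add(v0, mul(q, [-c for c in v1]))
    return r0, u0, v0

p = [2, -2, 1]           # p(x) = x^2 - 2x + 2
f = [2, 3, 0, 1]         # f(x) = x^3 + 3x + 2
d, u, _ = ext_gcd(f, p)  # d is a nonzero constant: p is irreducible, p does not divide f
inv = [c / d[0] for c in u]   # u(theta) f(theta) = d, so divide by the constant d
print(inv)               # [Fraction(4, 17), Fraction(-5, 34)]: 1/f(theta) = 4/17 - (5/34)theta
```

Numerically, f(θ) = 3 + 5i and (4/17 − (5/34)(1 + i))(3 + 5i) = 1, matching Example 3(b).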

More generally, if L is an extension of K and there exist elements u₁, . . . , uₙ in L such that every element u ∈ L has a representation of the form

u = Σ_{i=1}^n kᵢuᵢ,  kᵢ ∈ K,  (20)

then L is called a finite extension of K and {u₁, . . . , uₙ} is called a spanning set for L over K. If no finite spanning set exists, then L is called an infinite extension of K. If the coefficients k₁, . . . , kₙ in (20) are uniquely determined by u for each u ∈ L, we call {u₁, . . . , uₙ} a basis of L over K, and the integer n, also denoted by [L:K] = n, is called the degree of L over K. It is quite easy to prove (see Exercise 14) that if {u₁, . . . , uₙ} is a spanning set for L over K, then some subset of {u₁, . . . , uₙ} is a basis of L over K. Thus in Theorem 4.7, K(θ) is a finite extension of K with {1, θ, . . . , θⁿ⁻¹} as a basis of K(θ) over K. We shall see shortly that any finite extension L of a field K is an algebraic extension, and that any two bases of L over K must have the same number of elements in them.

Theorem 4.8  Let p(x) ∈ K[x] be irreducible, and let θ be a root of p(x) in some extension field F of K (e.g., the field F = L provided by Theorem 4.7). Let ν: K[x] → K[x]/(p(x)) be the canonical epimorphism. Then the mapping

σ: K[x]/(p(x)) → K(θ)

defined by

σ(ν(f(x))) = f(θ),  f(x) ∈ K[x],  (21)

is an isomorphism.

Proof:  First define μ: K[x] → K[θ] by μ(f(x)) = f(θ), an obvious homomorphism. If f(x) ∈ ker μ, then f(θ) = 0. Now unless f(x) ∈ (p(x)), we know from the maximality of (p(x)) that g.c.d.(f(x), p(x)) = 1. Hence there exist u(x) and v(x) such that u(x)p(x) + f(x)v(x) = 1, which is clearly a contradiction since 0 = p(θ) = f(θ). Thus p(x) | f(x), and it follows that ker μ = (p(x)). But then Theorem 1.4, Section 3.1, tells us that there exists a unique isomorphism σ: K[x]/(p(x)) → K[θ] such that σν = μ. Finally, K[x]/(p(x)) is a field so that K[θ] is a field, and hence K[θ] = K(θ) (see Exercise 7). ∎

Corollary 4

If p(x) ∈ K[x] is irreducible and θ₁ and θ₂ are roots of p(x) in extension fields F₁ and F₂ of K, respectively, then there exists precisely one isomorphism ω: K(θ₁) → K(θ₂) which satisfies ω(θ₁) = θ₂ and ω | K = 1_K.

Proof:  Let σᵢ: K[x]/(p(x)) → K(θᵢ), i = 1, 2, be the two isomorphisms (21). Set ω = σ₂σ₁⁻¹, so that ω: K(θ₁) → K(θ₂) is an isomorphism. From (21), for f(x) ∈ K[x] we have

σ₂σ₁⁻¹(f(θ₁)) = σ₂(ν(f(x))) = f(θ₂),

and hence ω(θ₁) = θ₂, ω | K = 1_K. Since everything in K(θ₁) has the form f(θ₁), we conclude that ω is uniquely determined by ω(θ₁) = θ₂ and ω | K = 1_K. ∎

Two extensions L₁ and L₂ of K are equivalent extensions of K if there exists an isomorphism ω: L₁ → L₂ such that ω | K = 1_K. Thus K(θ₁) and K(θ₂) in Corollary 4 are equivalent extensions of K. In order to investigate finite field extensions, we need the following elementary fact.

Theorem 4.9  Let L be a finite extension of K and suppose {u₁, . . . , uₙ} is a basis of L over K. If v₁, . . . , v_m are any m elements of L, m > n, then there exist k_j ∈ K, j = 1, . . . , m, not all 0, such that

Σ_{j=1}^m k_j v_j = 0.  (22)

Proof:  Write

v_j = Σ_{i=1}^n c_{ij} u_i,  j = 1, . . . , m.

To satisfy (22), we must have

0 = Σ_{j=1}^m k_j v_j = Σ_{i=1}^n (Σ_{j=1}^m c_{ij} k_j) u_i,

and thus we want k₁, . . . , k_m to satisfy

Σ_{j=1}^m c_{ij} k_j = 0,  i = 1, . . . , n.  (23)

We prove the existence of k₁, . . . , k_m satisfying (23) by induction on n. We can begin by assuming that no v_j = 0; otherwise if v_{j₀} = 0, simply take k_{j₀} = 1 and k_j = 0 for j ≠ j₀ in (22). If n = 1, then v_j = c_{1j}u₁, j = 1, . . . , m, and no v_j is 0. Take k₁ = c₁₂, k₂ = −c₁₁, k₃ = · · · = k_m = 0, so that

Σ_j k_j v_j = k₁v₁ + k₂v₂ = c₁₂c₁₁u₁ − c₁₁c₁₂u₁ = 0.

Suppose then that n > 1. Since not all c_{ij} = 0 we can, without loss of generality, assume that c_{1m} ≠ 0. (Otherwise simply renumber both the uᵢ's and v_j's.) Then a nonzero m-tuple (k₁, . . . , k_m) satisfying (23) exists iff such an m-tuple exists for the system of equations obtained from (23) by performing the following operations: From equation i subtract c_{im}c_{1m}⁻¹ times equation 1, i = 2, . . . , n. By induction the last n − 1 equations of the resulting system then have a solution k₁⁰, . . . , k⁰_{m−1} not all 0. Put these in the first equation

c₁₁k₁⁰ + · · · + c_{1,m−1}k⁰_{m−1} + c_{1m}k_m = 0;  (24)

since c_{1m} ≠ 0, k_m = k_m⁰ can be determined from (24). ∎
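Theorem 4.9 is the statement that a homogeneous system with more unknowns than equations has a nontrivial solution, and the elimination in its proof can be run mechanically. A minimal sketch over K = Q in exact arithmetic (the function name is ours):

```python
from fractions import Fraction

def nontrivial_kernel_vector(c, m=None):
    """Return (k_1, ..., k_m) != 0 with sum_j c[i][j]*k_j = 0 for every row i,
    for an n x m system over Q with m > n, via the elimination of Theorem 4.9."""
    if m is None:
        m = len(c[0])
    c = [[Fraction(x) for x in row] for row in c]
    if not c:                           # no equations left: anything nonzero works
        return [Fraction(1)] + [Fraction(0)] * (m - 1)
    pivots = [j for j in range(m) if c[0][j] != 0]
    if not pivots:                      # equation 1 reads 0 = 0; drop it
        return nontrivial_kernel_vector(c[1:], m)
    p = pivots[-1]
    for row in c:                       # "renumber": move column p to the end
        row.append(row.pop(p))
    # From equation i subtract (c_im / c_1m) times equation 1, i = 2, ..., n.
    reduced = [[row[j] - row[-1] / c[0][-1] * c[0][j] for j in range(m - 1)]
               for row in c[1:]]
    k = nontrivial_kernel_vector(reduced, m - 1)   # solve the last n - 1 equations
    k_m = -sum(c[0][j] * k[j] for j in range(m - 1)) / c[0][-1]
    k.append(k_m)
    k.insert(p, k.pop())                # undo the renumbering
    return k

c = [[1, 2, 3], [4, 5, 6]]              # two equations, three unknowns
k = nontrivial_kernel_vector(c)
assert any(k) and all(sum(Fraction(c[i][j]) * k[j] for j in range(3)) == 0 for i in (0, 1))
```

The recursion mirrors the induction on n exactly: eliminate the last unknown from all but the first equation, solve the smaller system, then recover the last unknown from (24).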

Theorem 4.10  Let L be a finite extension of K. Then

(i)  Any two bases of L over K contain the same number of elements, say n.
(ii)  Any element θ ∈ L is algebraic over K; in fact, θ satisfies an irreducible polynomial p(x) ∈ K[x] with deg p(x) ≤ n.

Proof:  (i) Suppose {u₁, . . . , uₙ} and {v₁, . . . , v_m} are two bases of L over K and m > n. Then by Theorem 4.9 there exist k₁, . . . , k_m ∈ K not all 0 such that Σ_{j=1}^m k_j v_j = 0. But this contradicts the uniqueness of the representation of 0 in terms of v₁, . . . , v_m.

(ii) Consider the n + 1 elements 1, θ, . . . , θⁿ in L. Then by Theorem 4.9 again there exist k₀, . . . , kₙ in K not all 0 such that

k₀ + k₁θ + · · · + kₙθⁿ = 0.

Thus θ is a root of a polynomial f(x) = k₀ + · · · + kₙxⁿ ∈ K[x] of degree at most n. Now K[x] is a U.F.D., so we can factor f(x) into a product of prime polynomials in K[x]. Then obviously p(θ) = 0 for one of these primes p(x). ∎

Corollary 5  If L is a finite extension of K and θ ∈ L, then there is precisely one monic polynomial p(x) ∈ K[x] of least degree such that p(θ) = 0, and p(x) is irreducible in K[x]. Moreover, any polynomial f(x) ∈ K[x] for which f(θ) = 0 is divisible by p(x) in K[x].

Proof:  According to Theorem 4.10(ii), there are polynomials in K[x] having θ as a root. Let . . .

. . . be the isomorphisms of Theorem 4.8, and denote by ν₁ and ν₂ the canonical epimorphisms ν₁: F[x] → F[x]/(p(x)), ν₂: K[y] → K[y]/(q(y)). Consider the diagram

F[x] → K[y]  (δ)
ν₁ ↓      ↓ ν₂        (27)
F[x]/(p(x)) → K[y]/(q(y))  (μ)

in which δ: F[x] → K[y] is the obvious ring isomorphism satisfying δ | F = φ and δ(x) = y. The existence of a ring (in this case field) homomorphism μ satisfying

μν₁ = ν₂δ  (28)

is guaranteed by the basic mapping diagram (25) in Section 3.1. Since δ and ν₂ are surjective, μ is surjective. Also from (28), ν₁(f(x)) ∈ ker μ iff ν₂(δ(f(x))) is the zero in K[y]/(q(y)) iff δ(f(x)) ∈ (q(y)) iff δ(f(x)) = . . .

. . . f(x) = 0, it would be reducible iff √m ∈ Z. But m is square free. Thus √m is irrational and we conclude ab = 0. By similar reasoning (since √n is irrational) b ≠ 0. Hence a = 0, √m = b√n, m = b²n, b² | m, a contradiction unless b² = 1, i.e., m = n. (ii) is obvious. (iii) δ(α) = |N(α)| = |αᾱ|, so δ(αβ) = |αβ · ᾱβ̄| = δ(α)δ(β). Also (ᾱ/N(α)) · α = ᾱα/ᾱα = 1. (iv) If αβ = 1, then δ(α)δ(β) = 1, and δ(α) and δ(β) are positive integers. Conversely, if N(α) = ±1, then ᾱ/N(α) ∈ Z[√m]. The rest of (iv) is easy.

34.  Let D be a U.F.D., x an indeterminate over D, and let α be algebraic over D. Let K be the quotient field of D. An element ξ ∈ K(α) is said to be an algebraic integer if φ(ξ) = 0 for some monic polynomial φ(x) ∈ D[x]. Prove that if f(x) ∈ D[x] is a primitive polynomial of least degree in D[x] for which f(ξ) = 0, then the leading coefficient of f(x) is a unit in D. Hint: By the usual argument, any polynomial in K[x] of which ξ is a root is divisible in K[x] by f(x). Thus suppose . . .

. . . α = βγ + ρ. Also ρ = α − βγ ∈ D. Assume ρ ≠ 0. Then compute that

δ(ρ) = |N(β)| |N((x − r) + (y − s)√m)| = δ(β)|(x − r)² − m(y − s)²|
     ≤ δ(β){|x − r|² + |m||y − s|²} ≤ δ(β){(1/2)² + |m|(1/2)²} ≤ δ(β){1/4 + 3/4} = δ(β).

Equality can occur only when |x − r| = |y − s| = 1/2 and m = 3, but then |(x − r)² − m(y − s)²| = |1/4 − 3/4| = 1/2, and δ(ρ) = (1/2)δ(β). Thus in fact δ(ρ) < δ(β) and (D, δ) is Euclidean.

Case (ii): m ≡ 1 (mod 4). Then m = −11, −7, −3, 5. By Exercise 36, D = {a + bτ | a, b ∈ Z, τ = (√m − 1)/2}, obviously an integral domain (why?). Observe that a + b(√m + 1)/2 ∈ D for any a, b ∈ Z (why?). Let α and β be in D, β ≠ 0, and again write α/β = x + y√m, x, y ∈ Q. Obtain s ∈ Z such that |2y − s| ≤ 1/2. Also x − s/2 ∈ Q, so choose r ∈ Z such that |(x − s/2) − r| ≤ 1/2. Let γ = r + s(1 + √m)/2 ∈ D, let ρ = β[(x − r − s/2) + (y − s/2)√m], and check that α = β(x + y√m) = βγ + ρ. Since α, β, γ ∈ D it follows that ρ ∈ D. Moreover, if ρ ≠ 0, we have

δ(ρ) = |N(β)| |N((x − r − s/2) + (y − s/2)√m)| = δ(β)|(x − r − s/2)² − m(y − s/2)²|
     ≤ δ(β)(1/4 + |m|/16) < δ(β).

The problem of determining the entire set of values of m ∈ Z (square free) for which the integers in Q(√m) form a Euclidean domain is very difficult but known: These are m = −11, −7, −3, −2, −1, 2, 3, 5, 6, 7, 11, 13, 17, 19, 21, 29, 33, 37, 41, 57, 73. It is also known that m = 2, 3, 5, 6, 7, 11, 13, 14, 17, 19, 21, 22, 23, 29, 31, 33, 37, 38, 41, 43, 46, 47, 53, 57, 59, 61, 62, 67, 69, 71, 73, 77, 83, 86, 89, 93, 94, 97 are values of m for which the algebraic integers in Q(√m) form a U.F.D. It is conjectured that there exist infinitely many such m > 0, but this has not been proved.

Glossary

3.4

algebraic extension, 205
algebraic integer, 223
basis, 206
C(f(x)), 194
characteristic polynomial of an algebraic number, 208
content of a polynomial, 194
degree of an extension, 206
degree theorem, 218
derivative, 219
divisor, 222
Eisenstein criterion, 200
equivalent extensions, 207
Euclidean algorithm, 219
Euclidean domain, 222
field extension, 202
finite extension, 206
f′(x), 219
f(x) splits, 202
Gauss' lemma, 194
infinite extension, 206
[L:K], 206
l.i., 218
linearly independent, 218
norm, 222
primitive element, 215
primitive element theorem, 214
primitive polynomial, 194
quadratic number field, 222
quotient, 222
remainder, 222
root of multiplicity k, 220
simple algebraic extension, 205
simple field extension, 202
simple root, 220
spanning set, 206
splitting field, 202
square free integer, 222
stathm, 222
θ is algebraic over K, 205
valuation, 222

3.5

Polynomials and Resultants

As we saw in Corollary 2, Section 3.4, if D is a U.F.D., then the domain of polynomials in n independent indeterminates, D[x₁, . . . , xₙ], is also a U.F.D. We can then apply Theorem 4.1, Section 3.4, to immediately conclude the following.

Theorem 5.1  Let D be a U.F.D.; x₁, . . . , xₙ, n independent indeterminates over D; and φᵢ(x₁, . . . , xₙ), i = 1, . . . , n, nonzero polynomials in D[x₁, . . . , xₙ]. Then there exists a g.c.d. of these polynomials in D[x₁, . . . , xₙ] which is unique except for associates.

We specialize a number of earlier results to D[x₁, . . . , xₙ].

Corollary 1

If f, g, h are in D[x₁, . . . , xₙ], f | gh and g.c.d.(f, g) = 1, then f | h.

Proof:  Let p₁, . . . , p_m be all the distinct prime factors that appear in the factorizations of f, g, h. Write

f = p₁^a₁ · · · p_m^a_m,  g = p₁^b₁ · · · p_m^b_m,  h = p₁^c₁ · · · p_m^c_m,

where the aᵢ, bᵢ, and cᵢ are nonnegative integers. Now g.c.d.(f, g) = 1 means that aᵢbᵢ = 0, i = 1, . . . , m, and f | gh is equivalent to

aᵢ ≤ bᵢ + cᵢ,  i = 1, . . . , m.  (1)

Thus if aᵢ > 0, then bᵢ = 0 and (1) implies that aᵢ ≤ cᵢ. If aᵢ = 0, then trivially aᵢ ≤ cᵢ. Hence f | h. ∎

Corollary 2

Let 0 ≠ φᵢ ∈ D[x₁, . . . , xₙ], i = 1, . . . , m, and set d = g.c.d.(φ₁, . . . , φ_m), so that

φᵢ = dqᵢ,  where  qᵢ ∈ D[x₁, . . . , xₙ],  i = 1, . . . , m.  (2)

Then

g.c.d.(q₁, . . . , q_m) = 1.  (3)

Proof:  Let p₁, . . . , p_k be all the distinct primes that appear in the factorizations of the φᵢ, i = 1, . . . , m, so that

φᵢ = Π_{t=1}^k p_t^e_{it},  e_{it} ≥ 0,  i = 1, . . . , m,  t = 1, . . . , k.

Let

e_t = min_{1≤i≤m} e_{it},  t = 1, . . . , k,  (4)

and as in Theorem 4.1, Section 3.4, we have

d = Π_{t=1}^k p_t^e_t.

By (4), for each t = 1, . . . , k there is at least one i_t such that the power of p_t occurring in φ_{i_t} is the same as the power of p_t occurring in d. Thus p_t does not occur in the prime factorization of q_{i_t}. In other words, if

qᵢ = Π_{t=1}^k p_t^a_{it},

then for each t there is an i_t such that a_{i_t t} = 0. Hence

min_{1≤i≤m} a_{it} = 0,  t = 1, . . . , k,

and it follows that

g.c.d.(q₁, . . . , q_m) = 1. ∎

Corollary 3  If p ∈ D[x₁, . . . , xₙ] is prime and g.c.d.(p, f) ≠ 1, then p | f.

Proof:  Let d = g.c.d.(p, f). Since d ≠ 1 and d | p, it follows that d = εp, ε a unit in D. Hence p | f. ∎
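The proofs above are pure exponent bookkeeping, which can be mirrored on factorizations stored as prime-exponent dictionaries; a minimal sketch (representation and names ours):

```python
def divides(a, b):
    """a | b for factorizations given as {prime: exponent} dictionaries."""
    return all(e <= b.get(p, 0) for p, e in a.items())

def product_of(a, b):
    """Factorization of the product: exponents add."""
    return {p: a.get(p, 0) + b.get(p, 0) for p in set(a) | set(b)}

def coprime(a, b):
    """g.c.d.(a, b) = 1: a_i * b_i = 0 for every prime p_i."""
    return all(b.get(p, 0) == 0 for p in a)

f = {2: 3, 5: 1}              # f = 2^3 * 5
g = {3: 2}                    # g = 3^2
h = {2: 4, 3: 1, 5: 2}        # h = 2^4 * 3 * 5^2
assert coprime(f, g) and divides(f, product_of(g, h))
assert divides(f, h)          # exactly the conclusion of Corollary 1
```

The same dictionaries model polynomials in D[x₁, . . . , xₙ] just as well as integers, since only the unique factorization into primes is used.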


Corollary 4  Let fᵢ ∈ D[x₁, . . . , xₙ], i = 1, 2, and assume g.c.d.(f₁, f₂) = 1. If fᵢ | φ, i = 1, 2, then f₁f₂ | φ.

Proof:  Write

φ = q₁f₁.  (5)

Then f₂ | φ and g.c.d.(f₁, f₂) = 1 imply (by Corollary 1) that f₂ | q₁. Hence from (5), f₁f₂ | φ. ∎

Example 1

(a) In D[x,,x2] observe that g.c.d.(x,,x2) = 1. However, there do

not exist polynomials f,g E D[x,,x2] such that

x1f(xl,x2) + x2g(xnx2) = l,

(6)

for the left side of (6) has degree at least 1 [see the definition of degree preceding formula (45) in Section 3.2] and thus formula (6) is not possible. (b) D[x1,x2] is not a P.I.D., for otherwise f and g satisfying (6) would exist. (0) D[x,,x2] is not a Euclidean domain, for otherwise it would be a P.I.D.

[see Exercise 33(v), Section 3.4].

(d) Let f, g, φ, θ be nonzero polynomials in D[x_1, . . . , x_n], and assume that f/g is in lowest terms in the rational function field D(x_1, . . . , x_n), i.e., g.c.d.(f, g) = 1. Let S ⊂ D be an infinite set and suppose

f(s_1, . . . , s_n)/g(s_1, . . . , s_n) = φ(s_1, . . . , s_n)/θ(s_1, . . . , s_n)

for all specializations of x_i to s_i in S. Then there exists h ∈ D[x_1, . . . , x_n] such that

φ = hf   and   θ = hg.

For let ψ = fθ − gφ ∈ D[x_1, . . . , x_n] and use Corollary 4, Section 3.3, to conclude that ψ = 0, i.e.,

fθ − gφ = 0   or   fθ = gφ.   (7)

Now f | gφ from (7) and g.c.d.(f, g) = 1, so Corollary 1 implies that f | φ, say φ = kf. Similarly θ = hg, and (7) becomes

fhg = gkf,

and hence h = k.
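The argument in (d) is easy to check computationally in the one-variable case. The sketch below (plain Python; the coefficient-list encoding and the helper names `pmul` and `pdiv` are ours, not the text's) takes f = x + 1, g = x − 1 and φ = (x + 2)f, θ = (x + 2)g, verifies the identity fθ = gφ of (7), and recovers the common multiplier h = x + 2 by polynomial division.

```python
from fractions import Fraction

def pmul(a, b):
    # multiply polynomials given as ascending coefficient lists
    r = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            r[i + j] += Fraction(x) * y
    return r

def pdiv(num, den):
    # long division over the rationals: returns (quotient, remainder)
    num = [Fraction(x) for x in num]
    q = [Fraction(0)] * (len(num) - len(den) + 1)
    for k in range(len(q) - 1, -1, -1):
        q[k] = num[k + len(den) - 1] / den[-1]
        for j, y in enumerate(den):
            num[k + j] -= q[k] * y
    return q, num[:len(den) - 1]

f, g = [1, 1], [-1, 1]                 # f = x + 1, g = x - 1, g.c.d.(f, g) = 1
phi, theta = [2, 3, 1], [-2, 1, 1]     # phi = (x + 2) f, theta = (x + 2) g
assert pmul(f, theta) == pmul(g, phi)  # f*theta = g*phi, as in (7)
h, rem = pdiv(phi, f)
print(h, rem)  # the common multiplier h = x + 2 divides phi exactly
```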

Let m be a nonnegative integer. A polynomial

f = Σ_γ a_γ x_1^{γ(1)} · · · x_n^{γ(n)}   (8)

in D[x_1, . . . , x_n] is homogeneous of degree m, or a form of degree m, if whenever a_γ ≠ 0 we have

Σ_{t=1}^{n} γ(t) = deg ∏_{t=1}^{n} x_t^{γ(t)} = m.

(The zero polynomial is homogeneous of arbitrary degree.) Thus the elementary symmetric function (e.s.f.)

E_m = Σ_{1 ≤ γ(1) < · · · < γ(m) ≤ n} x_{γ(1)} · · · x_{γ(m)},   1 ≤ m ≤ n,

and the completely symmetric function

h_m = Σ_{1 ≤ γ(1) ≤ · · · ≤ γ(m) ≤ n} x_{γ(1)} · · · x_{γ(m)},   1 ≤ m ≤ n,

are both homogeneous of degree m.

A simple test for homogeneity is given in the next result.

Theorem 5.2   Let x_1, . . . , x_n, t be independent indeterminates over the integral domain D. Let f(x_1, . . . , x_n) ∈ D[x_1, . . . , x_n]. Then f is homogeneous of degree m iff

f(tx_1, tx_2, . . . , tx_n) = t^m f(x_1, . . . , x_n)   (9)

in D[x_1, . . . , x_n, t].

Proof:   Let f be homogeneous of degree m and given by (8). Observe that whenever a_γ ≠ 0,

∏_{i=1}^{n} (t x_i)^{γ(i)} = t^{Σ_{i=1}^{n} γ(i)} ∏_{i=1}^{n} x_i^{γ(i)} = t^m ∏_{i=1}^{n} x_i^{γ(i)},

and (9) follows. Conversely, let f be any nonzero polynomial in D[x_1, . . . , x_n] and group together the nonzero terms of equal degree in f,

f = Σ_{i=1}^{k} f_{n_i},   (10)

in which f_{n_i} ≠ 0 is the sum of all homogeneous terms of degree n_i appearing in f and 0 ≤ n_1 < n_2 < · · · < n_k. Then from (10),

f(tx_1, . . . , tx_n) = Σ_{i=1}^{k} f_{n_i}(tx_1, . . . , tx_n) = Σ_{i=1}^{k} t^{n_i} f_{n_i}(x_1, . . . , x_n).   (11)

Now if

f(tx_1, . . . , tx_n) = t^m f(x_1, . . . , x_n),   (12)

we combine (11) and (12) to obtain

t^m f(x_1, . . . , x_n) − Σ_{i=1}^{k} t^{n_i} f_{n_i}(x_1, . . . , x_n) = 0.   (13)

Since t is an indeterminate over D[x_1, . . . , x_n], k must be 1. For if k ≥ 2, the left side of (13) is a nonzero polynomial in D[x_1, . . . , x_n][t]. We also must have

m = n_1   and   f = f_{n_1};

thus f is homogeneous of degree m.   ∎
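Theorem 5.2 suggests a practical randomized check for homogeneity: specialize the x_i and t to random values and compare f(tx_1, . . . , tx_n) with t^m f(x_1, . . . , x_n). A minimal sketch (plain Python; the dictionary encoding of polynomials and the helper names are ours, not the text's):

```python
import random
from fractions import Fraction

def poly_eval(terms, point):
    # terms: dict mapping exponent tuples to coefficients
    total = Fraction(0)
    for exps, c in terms.items():
        v = Fraction(c)
        for x, e in zip(point, exps):
            v *= x ** e
        total += v
    return total

def looks_homogeneous(terms, n, m, trials=5):
    # probabilistic form of the test (9): f(t x_1, ..., t x_n) == t^m f(x_1, ..., x_n)
    for _ in range(trials):
        xs = [Fraction(random.randint(1, 50)) for _ in range(n)]
        t = Fraction(random.randint(2, 50))
        if poly_eval(terms, [t * x for x in xs]) != t ** m * poly_eval(terms, xs):
            return False
    return True

random.seed(0)
# E_2 in three variables: x1 x2 + x1 x3 + x2 x3, a form of degree 2
e2 = {(1, 1, 0): 1, (1, 0, 1): 1, (0, 1, 1): 1}
print(looks_homogeneous(e2, 3, 2))                             # True
print(looks_homogeneous({(1, 0, 0): 1, (0, 0, 0): 1}, 3, 1))   # False: x1 + 1 is not a form
```

A passing random check is only evidence, not a proof, but a single failing specialization certifies inhomogeneity, exactly as (9) fails in D[x_1, . . . , x_n, t].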

Example 2   (a) The discriminant polynomial in D[x_1, . . . , x_n] is the polynomial

Δ(x_1, . . . , x_n) = ∏_{1 ≤ i < j ≤ n} (x_i − x_j)^2.

Here u_k = 0 for k < 0 or k > m, and v_k = 0 for k < 0 or k > n. A typical term in R = det S is

ε(σ) ∏_{i=1}^{n} u_{σ(i)−i} ∏_{i=n+1}^{m+n} v_{n+σ(i)−i}.   (19)

If a term (19) is not 0, its weight [see (17)] is

Σ_{i=1}^{n} (σ(i) − i) + Σ_{i=n+1}^{m+n} (n + σ(i) − i) = mn + Σ_{i=1}^{m+n} σ(i) − Σ_{i=1}^{m+n} i = mn.   ∎


If

f(x) = a_m x^m + a_{m−1} x^{m−1} + · · · + a_0   (20)

and

g(x) = b_n x^n + b_{n−1} x^{n−1} + · · · + b_0   (21)

are in D[x], then the resultant of these two polynomials, denoted by R(f, g), is the specialization of Sylvester's determinant R(u_0, . . . , u_m, v_0, . . . , v_n) to R(a_0, . . . , a_m, b_0, . . . , b_n).

We shall assume in what follows that D has characteristic 0 and a_m b_n ≠ 0, m ≥ 1, n ≥ 1, so that deg f(x) = m, deg g(x) = n. The following result provides a useful test for deciding if (20) and (21) have a common factor of positive degree.

Theorem 5.5   Assume that D is a U.F.D. Then the polynomials (20) and (21) have a common factor of positive degree in D[x] iff R(f, g) = 0.

Proof:   We first show that f and g have a common factor of positive degree iff there exist nonzero polynomials φ and ψ in D[x], deg φ < m, deg ψ < n, such that

ψf = φg.   (22)

Suppose first that h is a common factor of positive degree, so that

f = hφ,   g = hψ.

Then obviously

ψf = ψhφ = φg.

On the other hand, if (22) holds, then every prime divisor of g of positive degree must be a divisor of ψf. They cannot all be divisors of ψ because deg ψ < n = deg g. Hence at least one of the prime divisors of g must divide f, and hence f and g have a common factor of positive degree.

Thus in proving the theorem it suffices to show that R(f, g) = 0 iff there exist polynomials φ and ψ,

0 ≠ φ = α_{m−1} x^{m−1} + · · · + α_0,
0 ≠ ψ = β_{n−1} x^{n−1} + · · · + β_0,

such that (22) holds. The condition (22) is then equivalent to the system of linear equations

a_0 β_0 = b_0 α_0,
a_1 β_0 + a_0 β_1 = b_0 α_1 + b_1 α_0,
a_2 β_0 + a_1 β_1 + a_0 β_2 = b_0 α_2 + b_1 α_1 + b_2 α_0,   (23)




a_3 β_0 + a_2 β_1 + a_1 β_2 + a_0 β_3 = b_0 α_3 + b_1 α_2 + b_2 α_1 + b_3 α_0,
. . . . . . . . . .
a_m β_{n−1} = b_n α_{m−1}.

Regard (23) as a system of linear equations over the quotient field of D for the determination of the m + n unknowns α_0, . . . , α_{m−1}, β_0, . . . , β_{n−1}. The matrix of coefficients is

   β_0    β_1   · · ·  β_{n−1}   α_0    α_1   · · ·  α_{m−1}

[  a_0     0    · · ·    0      −b_0     0    · · ·    0     ]
[  a_1    a_0   · · ·    0      −b_1   −b_0   · · ·    0     ]
[  a_2    a_1   · · ·    0      −b_2   −b_1   · · ·    0     ]   (24)
[   ⋮      ⋮             ⋮        ⋮      ⋮             ⋮     ]
[   0      0    · · ·   a_m       0      0    · · ·  −b_n    ]

where the columns in (24) are ordered according to the indicated unknowns. By taking the transpose of (24) and multiplying the last m rows of the resulting matrix by −1, we obtain Sylvester's matrix specialized to the coefficients of f(x) and g(x). But then the homogeneous system (23) has a nonzero solution iff the determinant of the coefficient matrix is 0, i.e., iff R(f, g) = 0. Of course, any nonzero solution in the quotient field of D produces a nonzero solution in D to the homogeneous system (23) by simply multiplying through every equation by the same common denominator.   ∎

Example 3   (a) We construct the system of equations (23) and the matrix (24) for m = 3, n = 2. The product fψ is

(a_3 x^3 + a_2 x^2 + a_1 x + a_0)(β_1 x + β_0)
   = a_0 β_0 + (a_1 β_0 + a_0 β_1)x + (a_2 β_0 + a_1 β_1)x^2 + (a_3 β_0 + a_2 β_1)x^3 + a_3 β_1 x^4.

Similarly the product gφ is

(b_2 x^2 + b_1 x + b_0)(α_2 x^2 + α_1 x + α_0)
   = b_0 α_0 + (b_1 α_0 + b_0 α_1)x + (b_2 α_0 + b_1 α_1 + b_0 α_2)x^2 + (b_2 α_1 + b_1 α_2)x^3 + b_2 α_2 x^4.

The condition fψ = gφ then becomes

a_0 β_0 = b_0 α_0,
a_1 β_0 + a_0 β_1 = b_1 α_0 + b_0 α_1,
a_2 β_0 + a_1 β_1 = b_2 α_0 + b_1 α_1 + b_0 α_2,
a_3 β_0 + a_2 β_1 = b_2 α_1 + b_1 α_2,
a_3 β_1 = b_2 α_2.

The matrix (24) is


   β_0    β_1    α_0    α_1    α_2

[  a_0     0    −b_0     0      0   ]
[  a_1    a_0   −b_1   −b_0     0   ]
[  a_2    a_1   −b_2   −b_1   −b_0  ]   (25)
[  a_3    a_2     0    −b_2   −b_1  ]
[   0     a_3     0      0    −b_2  ]

Taking the transpose of the matrix (25) and multiplying the last three rows of the resulting matrix by −1, we obtain the matrix

[  a_0   a_1   a_2   a_3    0  ]
[   0    a_0   a_1   a_2   a_3 ]
[  b_0   b_1   b_2    0     0  ]
[   0    b_0   b_1   b_2    0  ]
[   0     0    b_0   b_1   b_2 ]

Thus f and g have a common factor of positive degree in D[x] iff the determinant R(f, g) of this matrix is 0.
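The determinant criterion of Theorem 5.5 is easy to mechanize. The sketch below (plain Python; `sylvester` builds the matrix in the row layout displayed above from coefficient lists in ascending order, and `det` is ordinary Gaussian elimination over the rationals; the helper names are ours) checks that f = (x + 1)(x − 2) and g = (x − 2)(x + 3) have resultant 0, while the coprime pair f = x^2 + 1, g = x − 2 does not.

```python
from fractions import Fraction

def sylvester(a, b):
    # a = [a_0, ..., a_m], b = [b_0, ..., b_n]: ascending coefficient lists.
    # Layout as in the text: n rows of a-coefficients, then m rows of b-coefficients.
    m, n = len(a) - 1, len(b) - 1
    S = [[0] * (m + n) for _ in range(m + n)]
    for i in range(n):
        for j, c in enumerate(a):
            S[i][i + j] = c
    for i in range(m):
        for j, c in enumerate(b):
            S[n + i][i + j] = c
    return S

def det(M):
    # Gaussian elimination over the rationals
    M = [[Fraction(x) for x in row] for row in M]
    d = Fraction(1)
    for c in range(len(M)):
        piv = next((r for r in range(c, len(M)) if M[r][c] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != c:
            M[c], M[piv] = M[piv], M[c]
            d = -d
        d *= M[c][c]
        for r in range(c + 1, len(M)):
            factor = M[r][c] / M[c][c]
            for k in range(c, len(M)):
                M[r][k] -= factor * M[c][k]
    return d

print(det(sylvester([-2, -1, 1], [-6, 1, 1])))  # 0: common factor x - 2
print(det(sylvester([1, 0, 1], [-2, 1])))       # 5: no common factor
```

The second value, 5, also equals f(2) for f = x^2 + 1, in agreement with the g(x) = x − b calculation of Example 3(b) below.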

(b) Let f(x) = a_3 x^3 + a_2 x^2 + a_1 x + a_0, g(x) = x − b. Then

R(f, g) = det [  a_0   a_1   a_2   a_3
                 −b     1     0     0
                  0    −b     1     0
                  0     0    −b     1  ].   (26)

Now add b times column 2 to column 1, b^2 times column 3 to column 1, and finally b^3 times column 4 to column 1 to obtain

R(f, g) = det [  f(b)  a_1   a_2   a_3
                  0     1     0     0
                  0    −b     1     0
                  0     0    −b     1  ] = f(b).

Thus f and g have a common nonconstant factor iff f(b) = 0. But this is precisely the content of Corollary 1, Section 3.3. This calculation can obviously be generalized to a polynomial f(x) of arbitrary degree.
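The identity R(f, x − b) = f(b) is easy to confirm numerically. A sketch (plain Python; `det` is a small cofactor-expansion helper of ours), using the 4×4 matrix of (26) with f(x) = 2x^3 + 3x^2 + 5x + 7 and b = 4:

```python
def det(M):
    # cofactor expansion along the first row; fine for a 4x4 matrix
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

# f(x) = 2x^3 + 3x^2 + 5x + 7 and g(x) = x - 4, i.e. b = 4
a0, a1, a2, a3, b = 7, 5, 3, 2, 4
R = det([[a0, a1, a2, a3],
         [-b,  1,  0,  0],
         [ 0, -b,  1,  0],
         [ 0,  0, -b,  1]])
print(R, a0 + a1 * b + a2 * b ** 2 + a3 * b ** 3)  # both are f(4) = 203
```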

Corollary 5   Let D be a U.F.D. and let f(x) ∈ D[x], deg f(x) ≥ 2.

(a) If g(x) ∈ D[x] is irreducible, deg g(x) ≥ 1, then g^2 | f iff g | f and g | f′.

(b) There exists a polynomial g(x) ∈ D[x], deg g(x) ≥ 1, such that g^2 | f iff R(f, f′) = 0.

Proof:   (a) If g | f then

f = gh,   f′ = g′h + gh′,   (27)

so that g | f′ implies from (27) that g | g′h. Now deg g′ < deg g, and since g is irreducible it follows that g | h. Hence g^2 | f. Conversely, if g^2 | f then

f = g^2 k,   f′ = 2gg′k + g^2 k′,

and hence g | f′.

(b) By Theorem 5.5, f and f′ have a common nonconstant factor iff R(f, f′) = 0. Apply (a).   ∎

The resultant of f and f′, R(f, f′), is called the discriminant of f.
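By Corollary 5, the discriminant vanishes exactly when f has a repeated factor, so it gives a mechanical multiplicity test. A sketch (plain Python; `sylvester` and `det` rebuild the machinery of Theorem 5.5 in the text's row layout, `deriv` is the formal derivative, and all helper names are ours):

```python
def deriv(a):
    # formal derivative of a_0 + a_1 x + ... , ascending coefficient lists
    return [i * c for i, c in enumerate(a)][1:]

def sylvester(a, b):
    # n rows of a-coefficients, then m rows of b-coefficients, as in the text
    m, n = len(a) - 1, len(b) - 1
    S = [[0] * (m + n) for _ in range(m + n)]
    for i in range(n):
        for j, c in enumerate(a):
            S[i][i + j] = c
    for i in range(m):
        for j, c in enumerate(b):
            S[n + i][i + j] = c
    return S

def det(M):
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([r[:j] + r[j + 1:] for r in M[1:]])
               for j in range(len(M)))

def discriminant(a):
    return det(sylvester(a, deriv(a)))

print(discriminant([2, -3, 0, 1]))   # 0: x^3 - 3x + 2 = (x - 1)^2 (x + 2)
print(discriminant([-2, 0, 0, 1]))   # 108: x^3 - 2 has no repeated root
```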

Corollary 6   Assume that D_1 and D_2 are U.F.D.'s and D_1 ⊂ D_2. Let x be an indeterminate over D_2. If the polynomials (20) and (21) in D_1[x] have a common factor of positive degree in D_2[x], then they have a common factor of positive degree in D_1[x].

Proof:   By Theorem 5.5 applied to D_2, R(f, g) = 0. But R(f, g) ∈ D_1, so that we can apply Theorem 5.5 again to conclude that f and g have a common factor of positive degree in D_1[x].   ∎

Theorem 5.6

Let D be a U.F.D. Then for the polynomials f(x) and g(x) in (20) and (21) there exist polynomials α(x) and β(x) in D[x], deg α ≤ n − 1, deg β ≤ m − 1, such that

R(f, g) = αf + βg.   (28)

Proof:

Write

f = a_0 + a_1 x + a_2 x^2 + · · · + a_m x^m,
xf =       a_0 x + a_1 x^2 + · · · + a_{m−1} x^m + a_m x^{m+1},
. . . . . . . . . .
x^{n−1} f = a_0 x^{n−1} + · · · + a_m x^{m+n−1},
                                                         (29)
g = b_0 + b_1 x + b_2 x^2 + · · · + b_n x^n,
xg =       b_0 x + b_1 x^2 + · · · + b_{n−1} x^n + b_n x^{n+1},
. . . . . . . . . .
x^{m−1} g = b_0 x^{m−1} + · · · + b_n x^{m+n−1}.

Observe that the coefficient matrix S_0 = S(a_0, . . . , a_m, b_0, . . . , b_n) of 1, x, x^2, . . . , x^{m+n−1} on the right in (29) is precisely Sylvester's matrix (15), specialized to the coefficients of f and g. Now let

c_t = (−1)^{t+1} det S_0(t | 1),   t = 1, . . . , m + n,

be the cofactors of the elements in column 1 of S_0. It follows from the Laplace expansion theorem that

Σ_{t=1}^{m+n} c_t s_{tk} = δ_{1k} det S_0 = δ_{1k} R(f, g).   (30)

Multiplying the tth equation in (29) by c_t and adding all m + n equations, we have

(c_1 + c_2 x + · · · + c_n x^{n−1}) f + (c_{n+1} + c_{n+2} x + · · · + c_{n+m} x^{m−1}) g = R(f, g).

Simply let α and β be the coefficients of f and g, respectively, in the preceding equation to obtain (28).   ∎

Example 4   (a) In Corollary 6, let D_1 = R, D_2 = C, f(x) ∈ R[x], and assume f(α) = 0, α = a + ib ∈ C, b ≠ 0. Form the real polynomial

g(x) = (x − α)(x − ᾱ)

which, of course, is irreducible in R[x]. Now R(f, g) = 0 in C because f and g have the common factor x − α. But then R(f, g) = 0 in R, so that f and g must have a common factor of positive degree in R[x]. Since g(x) is irreducible, it follows that this common factor must be g, g | f, and hence f(ᾱ) = 0 also. In other words, we have as a consequence of Corollary 6 the well-known result that complex roots of a real polynomial occur in complex conjugate pairs.

(b) Let f(x) = x^3 + px + q ∈ R[x]. We compute that the discriminant of f is

R(f, f′) = det [  q   p   0   1   0
                  0   q   p   0   1
                  p   0   3   0   0
                  0   p   0   3   0
                  0   0   p   0   3  ] = 27q^2 + 4p^3.

Thus if 27q^2 + 4p^3 = 0, then f(x) must be divisible by g(x)^2, deg g(x) ≥ 1. Obviously deg g = 1, and hence f(x) has a real root of multiplicity at least 2.
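The evaluation of this 5×5 determinant can be checked by machine. The sketch below (plain Python; `det` and `disc_matrix` are our own helper names) builds exactly the matrix displayed above and compares its determinant with 27q^2 + 4p^3 over a sample of integer pairs (p, q):

```python
import random

def det(M):
    # cofactor expansion along the first row; fine for a 5x5 matrix
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([r[:j] + r[j + 1:] for r in M[1:]])
               for j in range(len(M)))

def disc_matrix(p, q):
    # the 5x5 matrix of Example 4(b): f = x^3 + px + q, f' = 3x^2 + p
    return [[q, p, 0, 1, 0],
            [0, q, p, 0, 1],
            [p, 0, 3, 0, 0],
            [0, p, 0, 3, 0],
            [0, 0, p, 0, 3]]

random.seed(1)
for _ in range(20):
    p, q = random.randint(-9, 9), random.randint(-9, 9)
    assert det(disc_matrix(p, q)) == 27 * q ** 2 + 4 * p ** 3
print("R(f, f') = 27q^2 + 4p^3 on all samples")
```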

Let x_1, . . . , x_p be independent indeterminates over the U.F.D. D. If f and g are in D[x_1, . . . , x_p] and deg_{x_p} f = m, deg_{x_p} g = n, write

f = a_0 + a_1 x_p + · · · + a_m x_p^m,   (31)

g = b_0 + b_1 x_p + · · · + b_n x_p^n   (32)

in which the polynomials a_0, . . . , a_m, b_0, . . . , b_n in (31) and (32) are in D[x_1, . . . , x_{p−1}]. Then the resultant R(f, g) is obviously a polynomial R(x_1, . . . , x_{p−1}) in D[x_1, . . . , x_{p−1}]. Of course if f is homogeneous of degree m in x_1, . . . , x_p, then clearly a_t(x_1, . . . , x_{p−1}) must be homogeneous of degree m − t, t = 0, . . . , m. Similarly for g. Then R(tx_1, . . . , tx_{p−1}) is given by


[  t^m a_0   t^{m−1} a_1   · · ·   t a_{m−1}   a_m                          ]
[            t^m a_0       · · ·               t a_{m−1}   a_m              ]   n rows
det [  . . . . . . . . . .                                                  ]
[  t^n b_0   t^{n−1} b_1   · · ·   b_n                                      ]   (33)
[  . . . . . . . . . .                                                      ]   m rows
[                          t^n b_0   t^{n−1} b_1   · · ·   b_n              ]

Multiply the ith row by t^{n−i+1}, i = 1, . . . , n, and multiply the (n + i)th row by t^{m−i+1}, i = 1, . . . , m. Then the jth column of the resulting matrix will have t^{n+m−j+1} as the power of t occurring in all m + n rows, j = 1, . . . , n + m. The effect of the row multiplications is to multiply R(tx_1, . . . , tx_{p−1}) by t^θ, where

θ = Σ_{i=1}^{n} (n − i + 1) + Σ_{i=1}^{m} (m − i + 1) = n(n + 1)/2 + m(m + 1)/2.

Taking t^{n+m−j+1} out of the jth column, we have t^θ R(tx_1, . . . , tx_{p−1}) = t^σ R(x_1, . . . , x_{p−1}) where

σ = Σ_{j=1}^{n+m} (n + m − j + 1) = (m + n)(m + n + 1)/2.

Thus

R(tx_1, . . . , tx_{p−1}) = t^{σ−θ} R(x_1, . . . , x_{p−1}) = t^{mn} R(x_1, . . . , x_{p−1}).   (34)

In other words, if f(x_1, . . . , x_p) and g(x_1, . . . , x_p) are homogeneous of degrees m and n, respectively, then the resultant R(f, g) with respect to x_p (or, of course, any other x_j) is either 0 or homogeneous of degree mn. This observation permits a relatively easy proof of the following important result.

Theorem 5.7   Let f(x) and g(x) be the polynomials in (20) and (21). If ξ_1, . . . , ξ_m and θ_1, . . . , θ_n are the roots of f(x) and g(x), respectively, in an appropriate extension field of the quotient field of D, then

R(f, g) = (−1)^{mn} a_m^n ∏_{i=1}^{m} g(ξ_i)   (35)

        = b_n^m ∏_{j=1}^{n} f(θ_j).   (36)

Proof:   First observe that if c, d are in D, then

R(cf, dg) = c^n d^m R(f, g)   (37)

(see Exercise 1). Thus we may begin by assuming f(x) and g(x) are monic: a_m = 1, b_n = 1, and

f(x) = ∏_{i=1}^{m} (x − ξ_i),   g(x) = ∏_{j=1}^{n} (x − θ_j).

Let x_1, . . . , x_m, y_1, . . . , y_n be indeterminates so that x, x_i, y_j, i = 1, . . . , m, j = 1, . . . , n, are independent over the splitting field for f and g, and define

F(x, x_1, . . . , x_m, y_1, . . . , y_n) = ∏_{t=1}^{m} (x − x_t),   (38)

G(x, x_1, . . . , x_m, y_1, . . . , y_n) = ∏_{t=1}^{n} (x − y_t),   (39)

H(x_1, . . . , x_m, y_1, . . . , y_n) = ∏_{i=1}^{m} ∏_{j=1}^{n} (x_i − y_j).   (40)

Now F is homogeneous of degree m, G is homogeneous of degree n, and H is homogeneous of degree mn. Let

R(x_1, . . . , x_m, y_1, . . . , y_n) = R(F, G)   (41)

denote the resultant of (38) and (39) with respect to x. By the observation immediately preceding the statement of the theorem, R(x_1, . . . , x_m, y_1, . . . , y_n) is homogeneous of degree mn or possibly 0. That is, (41) is the resultant of homogeneous polynomials. Clearly R(x_1, . . . , x_m, y_1, . . . , y_n) is not the zero polynomial. For we can specialize the x_i and y_j to elements of D so that the resulting polynomials have no common roots (remember that D has characteristic 0), hence no common factors, and thus their resultant with respect to x must be different from 0 (Theorem 5.5). On the other hand, if we substitute y_j for x_i in (38), then the resulting polynomials will have a common factor and the resultant (41) must be 0, i.e.,

R(x_1, . . . , x_{i−1}, y_j, x_{i+1}, . . . , x_m, y_1, . . . , y_n) = 0.

But then regarding (41) as a polynomial in x_i over D[x_1, . . . , x_{i−1}, x_{i+1}, . . . , x_m, y_1, . . . , y_n], it follows that (x_i − y_j) | R(x_1, . . . , x_m, y_1, . . . , y_n). Since this is true for any binomial x_i − y_j, it follows that

H | R(x_1, . . . , x_m, y_1, . . . , y_n).   (42)

But both polynomials in (42) are homogeneous of degree mn, so that (42) implies that

R(x_1, . . . , x_m, y_1, . . . , y_n) = a H(x_1, . . . , x_m, y_1, . . . , y_n)   (43)

where a ∈ D, a ≠ 0, and a is completely independent of f(x) and g(x) and depends only on the polynomials (38) and (39). In other words, as polynomials in x_1, . . . , x_m, y_1, . . . , y_n the following equality holds:

R(F(x, x_1, . . . , x_m, y_1, . . . , y_n), G(x, x_1, . . . , x_m, y_1, . . . , y_n)) = a ∏_{i=1}^{m} ∏_{j=1}^{n} (x_i − y_j).   (44)

If we specialize x_i to ξ_i and y_j to θ_j in (44), i = 1, . . . , m, j = 1, . . . , n, then since

F(x, ξ_1, . . . , ξ_m, θ_1, . . . , θ_n) = f(x) = ∏_{i=1}^{m} (x − ξ_i)

and

G(x, ξ_1, . . . , ξ_m, θ_1, . . . , θ_n) = g(x) = ∏_{j=1}^{n} (x − θ_j),

we have

R(f(x), g(x)) = a ∏_{i=1}^{m} ∏_{j=1}^{n} (ξ_i − θ_j) = a ∏_{i=1}^{m} g(ξ_i)   (45)

              = a ∏_{j=1}^{n} ((−1)^m ∏_{i=1}^{m} (θ_j − ξ_i)) = (−1)^{mn} a ∏_{j=1}^{n} f(θ_j).   (46)

To determine a, we argue as follows. Regard the coefficients of f(x) and g(x) as additional independent indeterminates over D. Then the coefficient of a_0^n b_n^m = a_0^n (since b_n = 1) in the determinant expansion of Sylvester's matrix is precisely 1. On the other hand, in (46) take every θ_j to be 1 [we can do this, since a depends only on the polynomials (38) and (39), which are defined completely independently of f(x) and g(x)] to obtain

R(f(x), g(x)) = (−1)^{mn} a ∏_{j=1}^{n} (a_0 + a_1 + · · · + a_m) = (−1)^{mn} a (a_0^n + · · ·).   (47)

Thus, matching coefficients, a = (−1)^{mn}, and the formulas (35) and (36) follow immediately from (45) and (46), respectively (remembering our assumption that a_m = b_n = 1).   ∎
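Formula (36) can be confirmed numerically whenever g has known roots. The sketch below (plain Python; `sylvester`, `det`, and `peval` are our own helpers, with `sylvester` in the row layout used throughout this section) takes f = x^3 + 2 and g = (x − 1)(x − 3), and compares the determinant with b_n^m f(θ_1) f(θ_2):

```python
def sylvester(a, b):
    # ascending coefficient lists; n rows of a-coefficients, then m rows of b's
    m, n = len(a) - 1, len(b) - 1
    S = [[0] * (m + n) for _ in range(m + n)]
    for i in range(n):
        for j, c in enumerate(a):
            S[i][i + j] = c
    for i in range(m):
        for j, c in enumerate(b):
            S[n + i][i + j] = c
    return S

def det(M):
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([r[:j] + r[j + 1:] for r in M[1:]])
               for j in range(len(M)))

def peval(c, x):
    return sum(ci * x ** i for i, ci in enumerate(c))

f = [2, 0, 0, 1]    # f(x) = x^3 + 2, m = 3
g = [3, -4, 1]      # g(x) = (x - 1)(x - 3), n = 2, roots theta = 1, 3
lhs = det(sylvester(f, g))
rhs = 1 ** 3 * peval(f, 1) * peval(f, 3)   # b_n^m f(theta_1) f(theta_2), formula (36)
print(lhs, rhs)   # 87 87
```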

If we take g(x) = f′(x) in (35), then since n = deg f′(x) = m − 1, the discriminant becomes

R(f, f′) = (−1)^{m(m−1)} a_m^{m−1} ∏_{i=1}^{m} f′(ξ_i)   (48)

where ξ_1, . . . , ξ_m are the roots of f(x). Now write

f(x) = a_m ∏_{i=1}^{m} (x − ξ_i)

so that

f′(x) = a_m Σ_{i=1}^{m} ∏_{t≠i} (x − ξ_t),

and hence

f′(ξ_i) = a_m ∏_{t≠i} (ξ_i − ξ_t).   (49)

Substituting (49) into (48), we obtain

R(f, f′) = (−1)^{m(m−1)} a_m^{m−1} a_m^m ∏_{i=1}^{m} ∏_{t≠i} (ξ_i − ξ_t) = (−1)^{m(m−1)/2} a_m^{2m−1} ∏_{1≤i<t≤m} (ξ_i − ξ_t)^2.

Here u_0, u_1, . . . , u_{m−1}, x are independent indeterminates over D. If ξ_1, . . . , ξ_m are roots of f(x) in some splitting field, then Theorem 5.8 tells us that the discriminant satisfies

R(f(x), f′(x)) = (−1)^{m(m−1)/2} · · ·

Suppose m > n, and consider the system of homogeneous linear equations

Σ_{j=1}^{m} σ_i(k_j) x_j = 0,   i = 1, . . . , n.   (9)

Since m > n, the system (9) has a nontrivial solution ξ = (ξ_1, . . . , ξ_m). We can assume ξ_1 ≠ 0 (by renumbering the k_j if necessary). Now according to Exercise 4, σ_1, . . . , σ_n are l.i. over K so that Σ_{i=1}^{n} σ_i ≠ 0. Thus there exists θ ∈ K, θ ≠ 0, such that

Σ_{i=1}^{n} σ_i(θ) ≠ 0.   (10)

Choose λ ∈ K such that λξ_1 = θ, i.e., λ = θξ_1^{−1}. Then λξ is a solution to (9) and (λξ)_1 = θ satisfies (10). Thus we can assume that ξ is already chosen so that (9) is satisfied and

Σ_{i=1}^{n} σ_i(ξ_1) ≠ 0.   (11)

Let t be fixed for the moment, and evaluate σ_t on both sides of (9) to obtain

Σ_{j=1}^{m} σ_t σ_i(k_j) σ_t(ξ_j) = 0,   i = 1, . . . , n.   (12)

Of course, σ_t σ_i = σ_{i′}, and as i runs over {1, . . . , n} so does i′. Thus (12) becomes

Σ_{j=1}^{m} σ_{i′}(k_j) σ_t(ξ_j) = 0,   i′ = 1, . . . , n,   (13)

and we have a system such as (13) for each t = 1, . . . , n. In other words, the n vectors

(σ_t(ξ_1), . . . , σ_t(ξ_m)),   t = 1, . . . , n,   (14)

all satisfy the homogeneous system (9), and hence their sum

η = (Σ_{t=1}^{n} σ_t(ξ_1), . . . , Σ_{t=1}^{n} σ_t(ξ_m))   (15)

also satisfies (9). Note that (11) states that η_1 ≠ 0, and hence η is a nontrivial solution to (9). Also observe that

σ_s(η_j) = Σ_{t=1}^{n} σ_s σ_t(ξ_j) = Σ_{t=1}^{n} σ_t(ξ_j) = η_j,

s = 1, . . . , n,   j = 1, . . . , m,

and hence

η_j ∈ K(G),   j = 1, . . . , m.

Observe also that if for a fixed i′ we sum (13) for t = 1, . . . , n, then we obtain (dropping the prime on i)

Σ_{j=1}^{m} σ_i(k_j) η_j = 0,   i = 1, . . . , n.   (16)

In particular, there is an i such that σ_i = e, and for this i (16) becomes

Σ_{j=1}^{m} k_j η_j = 0,

a linear dependence relation for k_1, . . . , k_m over K(G).

(ii) It is routine to verify that the polynomial (8) is equal to

p(x) = x^p − E_1(σ_1(α), . . . , σ_p(α)) x^{p−1} + · · · + (−1)^p E_p(σ_1(α), . . . , σ_p(α))   (17)

where E_r(t_1, . . . , t_p) is the rth elementary symmetric function of t_1, . . . , t_p, and the ordering of the σ_i is chosen so that σ_1(α), . . . , σ_p(α) are all the distinct values of σ_j(α), j = 1, . . . , n (see Exercises 5 to 7). But then for any k = 1, . . . , n,

σ_k σ_1(α), . . . , σ_k σ_p(α)   (18)

are distinct and hence must be σ_1(α), . . . , σ_p(α) in some order [since σ_1(α), . . . , σ_p(α) are all the distinct values of σ_i(α), i = 1, . . . , n, and the p values (18) are distinct]. Now (see Exercise 8)

σ_k E_r(σ_1(α), . . . , σ_p(α)) = E_r(σ_k σ_1(α), . . . , σ_k σ_p(α)) = E_r(σ_1(α), . . . , σ_p(α)).

In other words, the coefficients of the polynomial p(x) are in K(G) and, in fact, σ_1(α), . . . , σ_p(α) are the roots of p(x). Of course, since G is a group, one of these values is α itself. Suppose f(x) ∈ K(G)[x] is a polynomial for which f(α) = 0; then for any i = 1, . . . , n,

0 = σ_i(f(α)) = f(σ_i(α)),

so that σ_i(α) is a root of f(x) as well. Since there are p distinct values among the σ_i(α), i = 1, . . . , n, it follows that deg f(x) ≥ p. In other words, p(x) is the monic polynomial of least degree in K(G)[x] having α as a root, i.e., it is the characteristic polynomial for α and hence is irreducible.   ∎

3.7   Galois Theory

Example 4   Let K = Q(2^{1/3}). The question we wish to consider is this: Does there exist a group G < A(K) such that Q = K(G)? That is, does there exist a group of automorphisms of Q(2^{1/3}) for which Q is the fixed field? Observe first that σ(2^{1/3})^3 = σ(2) = 2 for any σ ∈ A(K) holding Q elementwise fixed. The elements of K are real numbers, and the only real cube root of 2 is 2^{1/3}. Thus if σ ∈ A(K) and σ holds Q elementwise fixed, then σ(2^{1/3}) = 2^{1/3} and hence σ = e. But K(e) = K ≠ Q, so there is no group G < A(K) such that Q = K(G).

Our next result formalizes a fact we have verified several times before.

Theorem 7.3   Let F be a field, f(x) ∈ F[x], and let K be a splitting field for f(x). Let E = {α_1, . . . , α_m} be the set of distinct roots of f(x) in K. Then A(K|F)|E < S_E, the symmetric group on E, where A(K|F)|E = {σ|E : σ ∈ A(K|F)}.

Proof:   First observe that if σ ∈ A(K|F), then σ is completely determined by its values on the roots in E, i.e., by the values σ(α_t). Indeed, any element of K is a linear combination of products of powers of the α_t with coefficients from F [i.e., K is a splitting field for f(x)]. Now if σ ∈ A(K|F), then for t = 1, . . . , m we have

f(σ(α_t)) = σ(f(α_t)) = σ(0) = 0,

so σ(α_t) is a root of f(x) and hence must be one of α_1, . . . , α_m. Since σ is injective, we conclude that σ|E ∈ S_E.   ∎

We also have the following:

Theorem 7.4   Let K be a field, G < A(K), |G| = n, and let f(x) ∈ K(G)[x] be an irreducible polynomial with some root α ∈ K. Then, in fact, K contains all the roots of f(x), i.e., K contains a splitting field for f(x).

Proof:   From Theorem 7.2 we know that the polynomial p(x) defined in (8) has the following properties: p(x) ∈ K(G)[x]; p(x) is the characteristic polynomial for α in K(G)[x]; all the roots of p(x) are in K, and one of these roots is α. Since p(x) is the characteristic polynomial for α and f(α) = 0, we know that p(x) | f(x) in K(G)[x]. But the irreducibility of f(x) implies that, in fact, f(x) = c p(x), c ∈ K(G). Thus all the roots of f(x) are in K.   ∎

If f(x) ∈ K[x] and every irreducible factor of f(x) in K[x] has distinct roots [in the splitting field for f(x)], then f(x) is called a separable polynomial. If we look back at Exercises 25 and 26, Section 3.4, we see that if char K = 0, then any polynomial is separable. If char K = p, however, then there exist irreducible polynomials in K[x] which do not have distinct roots. We shall exhibit an example of an irreducible polynomial which is not separable, i.e., which does not have distinct roots. It is a fact that no such example exists if K is a finite field, but we shall not investigate this problem further. We content ourselves with the following rather unpleasant example.


Example 5   Let x be an indeterminate over Z_2, and let F = Z_2(x). Let y be an indeterminate over F and consider the polynomial ring F[y]. First observe that char F = 2. Now consider the polynomial f(y) = y^2 − x in F[y]. The derivative (in y) of f(y) is [2]y = 0, so by Exercise 25, Section 3.4, f(y) has fewer than 2 distinct roots. On the other hand, if f(y) were reducible in F[y], there would exist an a ∈ F such that a^2 = x. But a = p(x)/q(x), p(x) and q(x) in Z_2[x], and hence p(x)^2 = x q(x)^2. But deg p(x)^2 = 2 deg p(x), while deg x q(x)^2 = deg x + 2 deg q(x) = 1 + 2 deg q(x), which is clearly impossible. Actually, the reason y^2 − x has only one root (aside from the argument given above) is that its splitting field is F(√x),

y^2 − x = (y − √x)(y + √x),

and F(√x) also has characteristic 2. Thus −√x = √x and there is only one root of this irreducible polynomial.
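The degree-parity argument of Example 5 can also be brute-forced over Z_2[x]. Encoding a polynomial over Z_2 as a bitmask (bit i = coefficient of x^i), multiplication is carry-less addition of shifts; the sketch below (plain Python; the encoding and the helper name `clmul` are ours) confirms that p(x)^2 = x q(x)^2 has no solution among all nonzero p, q of degree < 6:

```python
def clmul(a, b):
    # multiply two polynomials over Z_2, encoded as bitmasks (bit i = coeff of x^i)
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

X = 0b10  # the polynomial x
ok = all(clmul(p, p) != clmul(X, clmul(q, q))
         for p in range(1, 64) for q in range(1, 64))
print(ok)  # True: p^2 has even degree, x q^2 has odd degree, so they never agree
```

Of course the brute force only samples low degrees; the parity argument in the text settles every degree at once.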

If K is an extension field of F, then K is a normal extension of F if there exists a finite group G < A(K) such that F = K(G). In other words, K is a normal extension of F if F is precisely the fixed field of some finite group of automorphisms of K. Observe that if k ∈ K and k ∉ F (where K is a normal extension of F), then there must be some σ ∈ G such that σ(k) ≠ k; otherwise k would be in F.

The following result is of crucial importance in understanding the structure of normal extension fields.

Theorem 7.5   If F ⊂ K are fields, then K is a normal extension of F iff K is the splitting field of a separable polynomial in F[x]. In fact, if K is the splitting field of the separable polynomial p(x) ∈ F[x], then

F = K(A(K|F)),   (19)

i.e., F is precisely the fixed field of the Galois group of K over F.

Proof:   We begin by proving that if F = K(G), |G| = n, then K is the splitting field of a separable polynomial in F[x]. From Theorem 7.2(i) we know that [K:F] = n so that

K = F(a_1, . . . , a_n)   (20)

where a_t ∈ K, t = 1, . . . , n, comprise a basis of K over F. According to Theorem 7.2(ii) we can construct the characteristic polynomial p_t(x) ∈ F[x] of each a_t. Moreover, p_t(x) ∈ F[x] has distinct roots [they are, in fact, all the distinct values σ_i(a_t) obtained as σ_i runs over G], and of course p_t(x) splits in K[x], t = 1, . . . , n. If we let

p(x) = p_1(x) · · · p_n(x) ∈ F[x],

then p(x) is obviously a separable polynomial. Also K is the splitting field for p(x) because K contains all the roots of p(x), and in view of (20) it is the smallest such field since a_1, . . . , a_n are among the roots of p(x).

In order to prove the sufficiency, we must show that if K is the splitting field of a separable polynomial p(x) ∈ F[x], then

270

Rings and Fields

Example 5 Let x be an indeterminate over Z2, and let F = Z2(x). Let y be an indeterminate over F and consider the polynomial ring F[y]. First observe that char F = 2. Now consider the polynomial f(y) = yz — x in F[y]. The derivative (in y) off(y) is [2]y = 0, so by Exercise 25, Section 3.4, f(y) has fewer than 2 distinct roots. On the other hand, iff(y) were reducible in F[y], there would exist an a e F such that a2 = x. But a = p(x)/q(x), p(x) and q(x) in Zz[x], and hence p(x)2 =

xq(x)2. But deg p(x)2 = 2 deg p(x) :2 deg xq(x)2 = deg x + 2 deg q(x) = 1 + 2 deg q(x), which is clearly impossible. Actually, the reason y2 — x has only one root (aside from the argument given above) is that its splitting field is F(~/ ,3),

y2 — x = (y — ~/—x_)(y + \/_x’), and F(~/}) also has characteristic 2. Thus —~/‘}c = J? and there is only one root of this irreducible polynomial.

If K is an extension field of F, then K is a normal extension of F if there exists a finite group G < A(K) such that F = K(G). In other words, K is a normal extension of F if F is precisely the fixed field of some finite group of automorphisms of K. Observe that if k e K and k GE F (where K is a normal extension of F), then there must be some a e G such that a(k) =/= k; otherwise k would be in F.

The following result is of crucial importance in understanding the structure of normal extension fields.

Theorem 7.5 If F C K are fields, then K is a normal extension of F if K is the splitting field of a separable polynomial in F[x]. In fact, if K is the splitting field of the separable polynomial p(x) E F[x], then

F = K(A(K|F)),

(19)

i.e., F is precisely thefixedfield of the Galois group ofK over F. Proof:

We begin by proving that if F = K(G), |G| = n, then K is the splitting field of a separable polynomial in F[x]. From Theorem 7.2(i) we know that [K:F] = n, so that

K = F(a_1, . . . , a_n), (20)

where a_t ∈ K, t = 1, . . . , n, comprise a basis of K over F. According to Theorem 7.2(ii) we can construct the characteristic polynomial p_t(x) ∈ F[x] of each a_t. Moreover, p_t(x) ∈ F[x] has distinct roots [they are, in fact, all the distinct values σ(a_t) obtained as σ runs over G], and of course p_t(x) splits in K[x], t = 1, . . . , n. If we let

p(x) = p_1(x) · · · p_n(x) ∈ F[x],

then p(x) is obviously a separable polynomial. Also, K is the splitting field for p(x) because K contains all the roots of p(x), and in view of (20) it is the smallest such field, since a_1, . . . , a_n are among the roots of p(x).

In order to prove the sufficiency, we must show that if K is the splitting field of a separable polynomial p(x) ∈ F[x], then

3.7

Galois Theory

271

F = K(A(K|F)),

(21)

so that K is indeed a normal extension of F [notice that A(K|F) is a finite group as a consequence of Theorem 7.3]. Thus suppose

p(x) = c(x - a_1) · · · (x - a_r) p_1(x) · · · p_m(x) (22)

is the prime factorization over F[x] of the separable polynomial p(x), where each of the irreducible polynomials p_i(x) (of degree greater than 1) has distinct roots. We argue by induction on k = n - r, where n = deg p(x). Namely, we propose to prove that (21) holds if K is the splitting field of (22). For k = 0, i.e., r = n, p(x) splits in F[x] so that K = F, A(K|F) = {e}, and (21) is trivially true. We thus assume that (21) holds whenever p(x) has r + 1 linear factors, 0 ≤ r ≤ n - 1, and try to prove that (21) holds when p(x) has r linear factors [i.e., we go from k = n - (r + 1) to k + 1 = n - r]. Let a_{r+1} ∈ K be a root of p_1(x), and let F(a_{r+1}) ⊂ K be the corresponding extension field of F. Let a_{r+1}, . . . , a_{r+s}, all in the splitting field K, be the complete set of roots of p_1(x), deg p_1(x) = s. These are distinct by the separability of p(x). The extensions F(a_{r+1}), . . . , F(a_{r+s}) are all equivalent extensions of F. In fact, by Theorem 4.12, Section 3.4, there exist field isomorphisms σ_1, . . . , σ_s such that σ_t: F(a_{r+1}) → F(a_{r+t}), σ_t holds F pointwise fixed, and

σ_t(a_{r+1}) = a_{r+t},  1 ≤ t ≤ s. (23)

Now K is a splitting field for p(x) regarded as a polynomial in F(a_{r+1})[x] and also regarded as a polynomial in F(a_{r+t})[x]; σ_t is a field isomorphism of F(a_{r+1}) onto F(a_{r+t}) in which the coefficients of p(x) are held pointwise fixed. If we apply Theorem 4.13, Section 3.4, we can extend σ_t to an isomorphism of K onto itself, i.e., to an automorphism of K. We denote this extended automorphism by σ_t also. Thus σ_1, . . . , σ_s are automorphisms of K for which (23) holds and which hold F pointwise fixed, so that

σ_t ∈ A(K|F),  t = 1, . . . , s. (24)

Now rewrite (22) as

p(x) = c(x - a_1) · · · (x - a_r)(x - a_{r+1}) q_1(x) p_2(x) · · · p_m(x), (25)

regarded as a factorization in F(a_{r+1})[x], also with splitting field K. The induction hypothesis applies to the factorization (25), in which the coefficient field is F(a_{r+1}), i.e., F(a_{r+1}) is playing the role of F. Thus by induction [we have at least one more linear factor in (25) than in (22)] we have

F(a_{r+1}) = K(A(K|F(a_{r+1}))), (26)

i.e., F(a_{r+1}) is precisely the fixed field of the automorphism group of K over F(a_{r+1}). Now suppose

θ ∈ K(A(K|F)). (27)

272

Rings and Fields

Since obviously A(K|F(a_{r+1})) < A(K|F), it follows from (27) that if σ ∈ A(K|F(a_{r+1})), then σ(θ) = θ, i.e., θ ∈ K(A(K|F(a_{r+1}))), or, by (26),

θ ∈ F(a_{r+1}). (28)

Recall that a_{r+1} is a root of the irreducible polynomial p_1(x), deg p_1(x) = s; so (28) means that

θ = c_0 + c_1 a_{r+1} + · · · + c_{s-1} a_{r+1}^{s-1}, (29)

where c_i ∈ F, i = 0, . . . , s - 1. Now (27) means that θ is held fixed by every σ ∈ A(K|F), so by (24) we have [see (23)]

θ = σ_t(θ) = c_0 + c_1 a_{r+t} + · · · + c_{s-1} a_{r+t}^{s-1},

or

c_{s-1} a_{r+t}^{s-1} + · · · + c_1 a_{r+t} + (c_0 - θ) = 0,  t = 1, . . . , s. (30)

Consider the polynomial

f(x) = c_{s-1} x^{s-1} + · · · + c_1 x + (c_0 - θ) ∈ K[x].

The s elements a_{r+1}, . . . , a_{r+s} are distinct and according to (30) are roots of f(x). But deg f(x) ≤ s - 1, so in fact f(x) = 0. We conclude that

c_0 - θ = 0,  or  θ = c_0 ∈ F.

We have proved that

K(A(K|F)) ⊂ F,

and the reverse inclusion is trivial. ∎

Corollary 1 Let K be the splitting field of a separable polynomial p(x) ∈ F[x]. Then

(i) [K:F] = |A(K|F)|.
(ii) If f(x) ∈ F[x] is any irreducible polynomial with a root in K, then the splitting field of f(x) is a subfield of K.

Proof:

(i) We have from Theorem 7.5 that

F = K(A(K|F)). (31)

But then Theorem 7.2(i) implies that

[K:F] = [K : K(A(K|F))] = |A(K|F)|.

(ii) Again, (31) implies that

f(x) ∈ F[x] = K(A(K|F))[x].

Now apply Theorem 7.4. ∎

Example 6 Let K be the splitting field for p(x) = x^5 - 3x^4 - 2x + 6 ∈ Q[x]. Then

p(x) = (x - 3)(x^4 - 2) = (x - 3)(x^2 - √2)(x^2 + √2)
     = (x - 3)(x - 2^{1/4})(x + 2^{1/4})(x - i2^{1/4})(x + i2^{1/4}).

Thus K must contain 2^{1/4} and i2^{1/4}, and hence i. But then K = Q(2^{1/4}, i), and it is trivial to see that x^2 + 1 is irreducible over Q(2^{1/4}), as is x^4 - 2 over Q. Thus by the degree theorem (Theorem 6.2, Section 3.6),

[K:Q] = [Q(2^{1/4}) : Q][Q(2^{1/4})(i) : Q(2^{1/4})] = 4 · 2 = 8.

Hence |A(K|Q)| = |A(Q(2^{1/4}, i)|Q)| = 8 by Corollary 1(i).
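The factorization in Example 6 and the location of its five roots can be checked numerically; the following is a quick sketch (a sanity check only, not part of the text):

```python
# Check that x^5 - 3x^4 - 2x + 6 = (x - 3)(x^4 - 2) and that its roots are
# 3, +-2**(1/4), and +-i*2**(1/4), so the splitting field is Q(2**(1/4), i).
p = lambda z: z**5 - 3*z**4 - 2*z + 6

# Two degree-5 polynomials agreeing at 21 integer points are identical:
assert all(p(x) == (x - 3) * (x**4 - 2) for x in range(-10, 11))

root = 2 ** 0.25
for z in (3, root, -root, 1j * root, -1j * root):
    assert abs(p(z)) < 1e-9   # each of the five roots annihilates p
print("factorization and roots check out")
```

The degree count [K:Q] = 4 · 2 = 8 then matches the order of the Galois group computed in Example 7 below.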

In general, if p(x) ∈ F[x] is separable and K is the splitting field for p(x), then A(K|F) is called the Galois group of the polynomial p(x). Our next result, which will take a bit of proving, is called the Fundamental Theorem of Galois Theory (hereafter abbreviated to F.T.G.T.). We assume that the underlying field has characteristic 0, so that any polynomial is separable.

Theorem 7.6 (F.T.G.T.) Let F be a field of characteristic 0, p(x) ∈ F[x] a (separable) polynomial, and let K be the splitting field for p(x). (In other words, we let K be a normal extension of F.) Then

(i) If F ⊂ B ⊂ K, B a field, then

B = K(A(K|B)). (32)

(ii) If Φ is the set of all subfields of K containing F and Ω is the set of all subgroups of A(K|F), then the mapping μ: Φ → Ω defined by

μ(B) = A(K|B) (33)

is a bijection. Moreover, if A and B are in Φ, then A ⊂ B, A ≠ B implies μ(B) ⊂ μ(A), μ(B) ≠ μ(A).

(iii) If B ∈ Φ, then

[B : F] = |A(K|F)| / |A(K|B)|. (34)

(iv) If B ∈ Φ, then B is a normal extension of F iff

A(K|B) ◁ A(K|F), (35)

i.e., iff A(K|B) is a normal subgroup of A(K|F); and in this case

A(B|F) ≅ A(K|F)/A(K|B). (36)

Proof: (i) K is the splitting field of p(x) regarded as a separable polynomial in B[x], so Theorem 7.5, with B playing the role of F, yields (32).

(ii) To show that μ: Φ → Ω is surjective, let H ∈ Ω and set B = K(H), the fixed field of H, so that B ∈ Φ. By Theorem 7.2(i),

[K : K(G)] = |G| (37)

for any finite group G < A(K).



Thus

[K:B] = [K:K(H)] = |H|,

and also, from (i) and (37),

[K:B] = [K:K(A(K|B))] = |A(K|B)|,

so that

|H| = |A(K|B)|. (38)

But H < A(K|B), because A(K|B) consists of all automorphisms of K which hold B pointwise fixed. Thus (38) implies that

H = A(K|B),  i.e.,  μ(B) = H,

and μ: Φ → Ω is surjective. Suppose that μ(A) = μ(B), i.e., A(K|A) = A(K|B). Then by (i),

A = K(A(K|A)) = K(A(K|B)) = B.

Thus μ is an injection. Obviously, if A ⊂ B then A(K|B) ⊂ A(K|A), and since μ is an injection, it preserves proper inclusion.

(iii) From Theorem 6.2(ii), Section 3.6 (the degree theorem), we have

[B:F] = [K:F] / [K:B]. (39)

Now F = K(A(K|F)) and B = K(A(K|B)) by (i), and hence from Theorem 7.2(i),

[K:F] = |A(K|F)|, (40)

[K:B] = |A(K|B)|. (41)

Combining this with (39) produces (34).

(iv) Let B ∈ Φ. Then by Exercise 9, B is a normal extension of F iff

A(K|F)B ⊂ B, (42)

i.e., iff σ(B) ⊂ B for every σ ∈ A(K|F). Assume then that B is a normal extension of F, σ ∈ A(K|F), τ ∈ A(K|B), and b ∈ B. Then σ(b) ∈ B by (42), so that τ(σ(b)) = σ(b) and hence

σ^{-1}τσ(b) = σ^{-1}σ(b) = b,

i.e., σ^{-1}τσ ∈ A(K|B). Thus (35) follows. Conversely, assume (35). Then σ^{-1}τσ ∈ A(K|B), and hence for any b ∈ B,

σ^{-1}τσ(b) = b,  or  τ(σ(b)) = σ(b). (43)


Now (43) asserts that σ(b) is held fixed by any τ ∈ A(K|B). But by (i),

B = K(A(K|B)),

so that any element held fixed by all τ ∈ A(K|B) must be in B, i.e., σ(b) ∈ B. Hence (42) holds and B is a normal extension of F.

To complete the proof we must confirm (36), and this is done by exhibiting a group homomorphism

f: A(K|F) → A(B|F)

whose kernel is precisely A(K|B). Thus assume B is a normal extension of F so that (42) holds. Then since σ(B) ⊂ B for any σ ∈ A(K|F), we can define

f(σ) = σ|B,

which obviously belongs to A(B|F). Also, f is clearly a homomorphism. Now σ ∈ ker f iff σ|B = 1_B iff σ ∈ A(K|B). Hence A(K|B) = ker f. Thus

A(K|F)/ker f = A(K|F)/A(K|B) ≅ im f. (44)

But

|A(K|F)| / |A(K|B)| = [B:F]  [by (34)].

However, since B is a normal extension of F, F is precisely the fixed field of A(B|F) in B (Theorem 7.5), so by Theorem 7.2(i) we have

[B:F] = |A(B|F)|.

Thus we have proved that f: A(K|F) → A(B|F) satisfies im f < A(B|F), and

|im f| = |A(K|F)| / |A(K|B)| = [B:F] = |A(B|F)|.

Thus im f = A(B|F), and (44) produces the result (36). ∎
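Before Example 7 below works out the Galois group of x^4 - 2 over Q, the group itself can be sketched computationally. Each automorphism ψ is determined by where it sends 2^{1/4} and i; we encode ψ as a pair (k, e) with ψ(2^{1/4}) = i^k · 2^{1/4} and ψ(i) = (-1)^e · i. The encoding and helper names are assumptions made for this sketch, not the book's notation:

```python
# Composing the encoded automorphisms: if psi1 = (k1, e1) and psi2 = (k2, e2),
# then (psi1 o psi2)(2**(1/4)) = psi1(i**k2 * 2**(1/4)) = i**(k1 + (-1)**e1 * k2) * 2**(1/4).
def compose(f, g):
    k1, e1 = f
    k2, e2 = g
    return ((k1 + (-1) ** e1 * k2) % 4, (e1 + e2) % 2)

def generated_group(gens):
    """Close a set of encoded automorphisms under composition."""
    elems = set(gens) | {(0, 0)}          # include the identity
    while True:
        new = {compose(f, g) for f in elems for g in elems} - elems
        if not new:
            return elems
        elems |= new

sigma = (1, 0)   # 2**(1/4) -> i * 2**(1/4), i -> i
tau = (0, 1)     # complex conjugation
G = generated_group([sigma, tau])
print(len(G))    # 8: the dihedral group of order 8
```

The group is non-abelian (sigma and tau do not commute), consistent with its identification as a dihedral group of order 8.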

Example 7 Let p(x) = x^4 - 2 ∈ Q[x]; this is a separable polynomial in Q[x]. Obviously the splitting field of p(x) is K = Q(2^{1/4}, i) (why?) and [K:Q] = 8 (why?). We compute the Galois group A(K|Q), which by Theorem 7.5 and Theorem 7.2(i) is of order 8. Any ψ ∈ A(K|Q) is completely determined by what it does to the roots of x^4 - 2, and these in fact are simply permuted by ψ (Theorem 7.3). Note that ψ(i)^2 = -1, so that ψ(i) = i or -i.

Then by (60) and (61),

w = Aε = XDε = Xu,

so that w_i ∈ ⟨u_1, . . . , u_n⟩ = N, i = 1, . . . , n. Hence

4.1

307

The Hermite Normal Form

W ⊂ N. (62)

Conversely, suppose N = ⟨u_1, . . . , u_n⟩ is a submodule of M and W ⊂ N. Then every element of W is a linear combination of u_1, . . . , u_n, i.e., w = Xu for some X ∈ M_n(R). Since u = Dε for some D ∈ M_n(R), we have

Aε = w = Xu = XDε,

and it follows (why?) that A = XD, i.e., that D is a right divisor of A. In other words, if W = ⟨Aε⟩ = ⟨w_1, . . . , w_m, 0, . . . , 0⟩ and N = ⟨u_1, . . . , u_n⟩, then v_1, . . . , v_n, v_t are l.d., so that there exist c_1, . . . , c_n, c in R, not all 0, such that

Σ_{j=1}^n c_j v_j + c v_t = 0.

Clearly c ≠ 0; otherwise we would contradict the linear independence of v_1, . . . , v_n. Then

v_t = Σ_{j=1}^n (-c^{-1} c_j) v_j

∈ ⟨v_1, . . . , v_n⟩. In other words, v_t ∈ ⟨v_1, . . . , v_n⟩, t = 1, . . . , p, and hence M ⊂ ⟨v_1, . . . , v_n⟩. If k > r, then every k-th order subdeterminant of A is 0. We also conclude from the Laplace expansion theorem that if d_k(A) ≠ 0, then

d_k(A) | d_{k+1}(A). (1)
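Over R = Z, the determinantal divisors d_k (the g.c.d. of all k × k subdeterminants) and the divisibility relation (1) can be checked directly; a minimal sketch on a hypothetical example matrix (helper names assumed):

```python
from itertools import combinations
from math import gcd
from functools import reduce

def det(M):
    """Determinant by cofactor expansion along the first row (fine for small k)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j+1:] for row in M[1:]])
               for j in range(len(M)))

def dk(A, k):
    """k-th determinantal divisor: gcd of all k x k subdeterminants of A."""
    m, n = len(A), len(A[0])
    minors = [det([[A[i][j] for j in cols] for i in rows])
              for rows in combinations(range(m), k)
              for cols in combinations(range(n), k)]
    return reduce(gcd, (abs(x) for x in minors))

A = [[7, 3, 15], [1, 5, 3]]
print(dk(A, 1), dk(A, 2))        # 1 2
assert dk(A, 2) % dk(A, 1) == 0  # d_k | d_{k+1}, as in (1)
```

The same matrix reappears in the diophantine example of the next section, where its invariant factors turn out to be q_1 = d_1 = 1 and q_2 = d_2/d_1 = 2.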

Assume in what follows that a fixed complete system of nonassociates containing 1 has been stipulated in R, so that all g.c.d.'s are chosen from this system.

Theorem 2.1 If A, B ∈ M_{m,n}(R) and A ~ B (i.e., A and B are equivalent), then d_k(A) = d_k(B), k = 1, . . . , min{m, n}.

Proof: Suppose A = PBQ, P ∈ GL(m,R), Q ∈ GL(n,R). Then if α ∈ Q_{k,m}, β ∈ Q_{k,n}, the Cauchy-Binet theorem implies that

det A[α|β] = Σ_{ω ∈ Q_{k,m}} Σ_{γ ∈ Q_{k,n}} det P[α|ω] det B[ω|γ] det Q[γ|β]. (2)

Thus if d_k(B) ≠ 0, it follows from (2) that d_k(B) | det A[α|β] for all α ∈ Q_{k,m}, β ∈ Q_{k,n}. We conclude that d_k(B) | d_k(A). Similarly, d_k(A) | d_k(B); so d_k(A) and d_k(B) are associates and hence equal. Since ρ(A) = ρ(B), d_k(A) = 0 iff d_k(B) = 0. ∎

The main result of this section can now be stated.

Theorem 2.2

(Smith Normal Form) Let A ∈ M_{m,n}(R), ρ(A) = r. Then

A ~ [ diag(q_1, . . . , q_r)  0 ]
    [           0            0 ]   (3)

where q_i | q_{i+1}, i = 1, . . . , r - 1, and q_i ≠ 0, i = 1, . . . , r. The matrix on the right in (3) is called the Smith normal form of A, denoted by S(A). The elements q_i are uniquely determined by A to within unit multiples.

Proof: We can assume A ≠ 0. By type I elementary row and column operations (see Exercise 16, Section 4.1), a nonzero element of A can be brought into the (1,1) position, and then it can be replaced by the g.c.d. of the entries in the first column as in Theorem 1.2, Section 4.1.

322        Modules and Linear Operators

After that, the elements in column 1 below the first row can be made 0 by type II elementary row operations. The matrix on hand at this point has the form

B = [ b_11  b_12  . . .  b_1n ]
    [  0    b_22  . . .  b_2n ]
    [  .     .            .   ]
    [  0    b_m2  . . .  b_mn ]

If b_11 divides all the remaining entries in row 1, then these can be made 0 by elementary type II column operations. If not, then let r_1, . . . , r_n be elements in R such that Σ_{j=1}^n b_1j r_j = d_1 = g.c.d.(b_11, . . . , b_1n) and g.c.d.(r_1, . . . , r_n) = 1. Then construct a unimodular Q ∈ GL(n,R) with first column [r_1, . . . , r_n]^T, so that d_1 is then the (1,1) entry of BQ. If d_1 divides all the entries in the first row and column of BQ, these can all be made 0 by type II elementary row and column operations [except the (1,1) entry, of course]. If not, then as before replace d_1 by the g.c.d. (call it d_2) of the elements in column 1 of BQ and make the rest of the elements in the first column equal to 0. If d_2 divides the rest of the entries now on hand in row 1, make them all 0 while leaving column 1 unaltered. Clearly this process can be continued until we obtain an equivalent matrix whose (1,1) entry divides all the other entries in row 1 and column 1. For at each stage we replace the (1,1) entry by an element with fewer prime factors; e.g., d_2 is the g.c.d. of d_1 and other elements not divisible by d_1, and since R is a U.F.D., d_2 has fewer prime factors than d_1. Thus the process must terminate with a matrix equivalent to A whose (1,1) entry divides every other entry in row 1 and column 1 (it may have "zero" prime factors, i.e., it may be a unit). Thus A is equivalent to a matrix of the form

C = [ c_11   0    . . .   0   ]
    [  0    c_22  . . .  c_2n ]
    [  .     .            .   ]
    [  0    c_m2  . . .  c_mn ]   (4)

Suppose that c_ij is an entry in C(1|1) not divisible by c_11. Add row i to row 1 in C to put c_ij in the (1,j) position; this leaves c_11 alone. Then repeat the whole procedure described before to obtain a matrix equivalent to C whose (1,1) entry is a proper divisor of c_11 and which has the same form as the matrix (4). By repeated applications of the process, we finally obtain a matrix D having the same form as (4) in which the (1,1) entry, call it q_1, divides all the other entries. For at each stage we replace the (1,1) entry by an element with fewer prime factors. It is important to observe that if P = 1 ∔ P_1 and Q = 1 ∔ Q_1, then


4.2        The Smith Normal Form        323

PDQ = [ q_1          0          ]
      [  0   P_1 D(1|1) Q_1     ]   (5)

and the fact that q_1 divides all the entries in D(1|1) remains true of any matrix equivalent to D(1|1). Now the entire process is repeated with D(1|1), and as indicated in (5) this can be done by left and right multiplications by unimodular matrices in M_m(R) and M_n(R), respectively, in such a way that row 1 and column 1 are unaltered. Thus A is equivalent to a matrix of the form

S = [ diag(q_1, . . . , q_p)  0 ]
    [           0            0 ]   (6)

where q_i | q_{i+1}, i = 1, . . . , p - 1. However, since rank is invariant under equivalence, it follows that p = r. The divisibility properties of q_1, . . . , q_r immediately imply that

d_k(S) = ε_k q_1 · · · q_k,  k ≤ r,

and

d_k(S) = 0,  k > r,

where ε_k is a unit in R. But d_k(A) = d_k(S), so that

ε_k q_1 · · · q_k = d_k(A),  k = 1, . . . , r.

Hence

q_k = (q_1 · · · q_{k-1} q_k)/(q_1 · · · q_{k-1}) = u_k d_k(A)/d_{k-1}(A),  k = 1, . . . , r  (d_0(A) = 1),

where u_k is a unit in R. By obvious type III elementary row operations we can replace q_k in (6) by u_k^{-1} q_k, k = 1, . . . , r; so we can assume that q_k = d_k(A)/d_{k-1}(A). ∎

The elements q_1, . . . , q_r in S(A),

q_k(A) = d_k(A)/d_{k-1}(A),  k = 1, . . . , r = ρ(A), (7)


are called the invariant factors of A. We make the convention that S(A) is chosen so that (7) holds, and hence S(A) is uniquely determined by A, since we have stipulated the system of nonassociates in R in advance. Since R is a P.I.D., we can write

q_k = ε_k p_1^{e_k1} p_2^{e_k2} · · · p_m^{e_km},  k = 1, . . . , r, (8)

where p_1, . . . , p_m are distinct primes in the complete system of nonassociates in R, the e_kj are nonnegative integers, j = 1, . . . , m, k = 1, . . . , r, and the ε_k are units, k = 1, . . . , r. From q_k | q_{k+1} it follows that

e_kj ≤ e_{k+1,j},  k = 1, . . . , r - 1,  j = 1, . . . , m.

A prime power p_j^{e_kj} in which e_kj > 0 in (8) is called an elementary divisor of A. The list of elementary divisors of A is the totality of elementary divisors, each counted the number of times it appears among the factorizations (8).
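For R = Z, the passage between invariant factors and the list of elementary divisors can be sketched as follows (all helper names are assumptions; the numbers are those of Example 1 below):

```python
def prime_powers(n):
    """Prime-power factors of n by trial division (fine for small examples)."""
    out, p = [], 2
    while n > 1:
        if n % p == 0:
            e = 0
            while n % p == 0:
                n //= p
                e += 1
            out.append(p ** e)
        p += 1
    return out

def smallest_prime(pp):
    p = 2
    while pp % p:
        p += 1
    return p

def elementary_divisors(invariant_factor_list):
    """Flatten the prime-power factorizations (8) of the q_k into one list."""
    return sorted(pp for q in invariant_factor_list for pp in prime_powers(q))

def invariant_factors(elem_divisors, rank):
    """Rebuild q_1 | q_2 | ... | q_rank: the largest factor takes the highest
    remaining power of each prime, then the next largest, and so on."""
    pool = list(elem_divisors)
    qs = []
    for _ in range(rank):
        best = {}                      # highest remaining power of each prime
        for pp in pool:
            p = smallest_prime(pp)
            if pp > best.get(p, 1):
                best[p] = pp
        q = 1
        for pp in best.values():
            q *= pp
            pool.remove(pp)
        qs.append(q)
    return qs[::-1]                    # q_1 (possibly the unit 1) comes first

print(invariant_factors([8, 16, 16, 3, 27, 27, 25, 3125, 7, 49], 4))
# [1, 24, 75600, 66150000], i.e. 1, 2^3*3, 2^4*3^3*5^2*7, 2^4*3^3*5^5*7^2
```

The reconstruction direction is exactly the procedure of Example 1(b): each invariant factor is the product of the highest prime powers remaining in the list.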

Example 1 (a) If R = Z and q_1 = 2^3·3, q_2 = 2^4·3^3·5^2·7, q_3 = 2^4·3^3·5^5·7^2, then the list of elementary divisors is

2^3, 2^4, 2^4, 3, 3^3, 3^3, 5^2, 5^5, 7, 7^2.

(b) If 2^3, 2^4, 2^4, 3, 3^3, 3^3, 5^2, 5^5, 7, 7^2 is the complete list of elementary divisors of A ∈ M_{5,7}(Z) and ρ(A) = 4, we can construct the Smith normal form S(A). First note that since ρ(A) = 4, there are four invariant factors of A. Moreover, q_4 must be the product of all the highest powers of the distinct primes that appear in the list of elementary divisors of A:

q_4 = 2^4·3^3·5^5·7^2.

Then q_3 must be the product of all the highest powers of the distinct primes in the list after q_4 has been constructed:

q_3 = 2^4·3^3·5^2·7.

Similarly,

q_2 = 2^3·3.

Hence q_1 must be a unit: q_1 = 1, if the system of nonassociates in Z is the set of nonnegative integers. Thus

S(A) = [ q_1  0   0   0   0  0  0 ]
       [ 0   q_2  0   0   0  0  0 ]
       [ 0   0   q_3  0   0  0  0 ]
       [ 0   0   0   q_4  0  0  0 ]
       [ 0   0   0   0   0  0  0 ]

We see from Example 1(b) that if the list of elementary divisors, the rank, and the dimensions of A are given, then S(A) can be reconstructed. Also note that

d_1(A) = q_1,


d_2(A) = q_1 q_2,
 . . .
d_r(A) = q_1 · · · q_r,

so that the determinantal divisors can be reconstructed from the list of elementary divisors. We summarize.

Corollary 1 The rank, dimensions, and list of elementary divisors (invariant factors, determinantal divisors) comprise a complete set of invariants with respect to equivalence. That is, two rectangular matrices are equivalent over R iff they have the same rank, dimensions, and list of elementary divisors (invariant factors, determinantal divisors).

In analogy to Theorem 1.7, Section 4.1, we have

Theorem 2.3 If (R,δ) is a Euclidean domain and A ∈ M_{m,n}(R), then A can be reduced to Smith normal form S(A) by elementary row and column operations.

Proof: If we examine the proof of Theorem 2.2, we see that the reduction of A to S(A) is accomplished by elementary row and column operations except at one stage: the replacement of the first element in a column (or row) of a k-rowed matrix by the g.c.d. of the elements a_1, . . . , a_k appearing in the column. In a Euclidean domain we can proceed as follows: Scan the nonzero elements among a_1, . . . , a_k for the one of least norm, which should lie in the first row (bring it there if necessary). Then from each of rows 2, . . . , k, subtract appropriate multiples of row 1. This results in a column of the form a_1, r_2, . . . , r_k, in which r_2, . . . , r_k are all 0 or are of norm strictly less than δ(a_1). Repeat the process until the elements in rows 2, . . . , k are all 0. This must ultimately happen, since the maximum norm of any element in the column is decreased after each such step. The matrix then on hand has the following form:

[ b_1  b_2  . . .  b_k ]
[  0    *   . . .   *  ]
[  .    .           .  ]
[  0    *   . . .   *  ]

Now replace b_1 by the element of least norm in row 1, and perform elementary column operations to ultimately reduce to 0 all the elements in row 1 except the (1,1) element. Of course, the elements in column 1 may no longer be 0, but the (1,1) entry has no larger a norm than δ(b_1). The sweep through column 1 can then be repeated, etc. Finally, an element must appear in the (1,1) entry which must divide all the other elements in row 1 and column 1, and then these can be made 0. The process can then be continued as in the proof of Theorem 2.2. ∎
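Over the Euclidean domain Z, the reduction described in Theorems 2.2 and 2.3 can be sketched as follows. This is a minimal implementation under stated assumptions (nonnegative integers as the system of nonassociates), not the book's algorithm verbatim:

```python
def smith_normal_form(A):
    """Reduce an integer matrix (list of rows) to Smith normal form using
    only elementary (unimodular) row and column operations; a sketch of
    the Theorem 2.3 procedure over Z."""
    A = [row[:] for row in A]
    m, n = len(A), len(A[0])
    for t in range(min(m, n)):
        while True:
            # Type I: bring a nonzero entry of least absolute value to (t,t).
            entries = [(abs(A[i][j]), i, j) for i in range(t, m)
                       for j in range(t, n) if A[i][j] != 0]
            if not entries:
                return A                       # the remaining block is zero
            _, i, j = min(entries)
            A[t], A[i] = A[i], A[t]
            for row in A:
                row[t], row[j] = row[j], row[t]
            # Type II: reduce column t and row t modulo the pivot.
            for i in range(t + 1, m):
                q = A[i][t] // A[t][t]
                A[i] = [a - q * b for a, b in zip(A[i], A[t])]
            for j in range(t + 1, n):
                q = A[t][j] // A[t][t]
                for row in A:
                    row[j] -= q * row[t]
            if any(A[i][t] for i in range(t + 1, m)) or \
               any(A[t][j] for j in range(t + 1, n)):
                continue                       # nonzero remainders: repeat
            # Enforce divisibility: the pivot must divide every later entry.
            bad = [i for i in range(t + 1, m)
                   for j in range(t + 1, n) if A[i][j] % A[t][t]]
            if not bad:
                break
            A[t] = [a + b for a, b in zip(A[t], A[bad[0]])]  # add a bad row
        if A[t][t] < 0:                        # type III: normalize the sign
            A[t] = [-a for a in A[t]]
    return A

print(smith_normal_form([[7, 3, 15], [1, 5, 3]]))   # [[1, 0, 0], [0, 2, 0]]
```

Each pass either clears row t and column t or strictly reduces the absolute value of the pivot, so the loop terminates, mirroring the "fewer prime factors" argument in the proof of Theorem 2.2.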

Corollary 2 If R is a field, A ∈ M_{m,n}(R), and ρ(A) = r, then A can be reduced to

S(A) = [ I_r  0 ]
       [  0   0 ]

by elementary row and column operations.

Proof: Any field is a Euclidean domain, and {0,1} is a complete system of nonassociates. Apply Theorem 2.3. ∎

The following result is very helpful in determining S(A) in the event A has nonzero entries only in positions (t,t), t = 1, . . . , k.

Theorem 2.4 Suppose that D ∈ M_{m,n}(R) has nonzero elements h_1, . . . , h_k in positions (1,1), . . . , (k,k), respectively, and 0 elsewhere. Then the list of elementary divisors of D is the list of all the prime power factors of any of the h_1, . . . , h_k, each counted the number of times it occurs.

Proof: Let p_1, . . . , p_m be all the distinct prime divisors of any of the h_i, i = 1, . . . , k. Then write

h_t = p_1^{e_t1} p_2^{e_t2} · · · p_m^{e_tm},  t = 1, . . . , k,

where the e_tj are nonnegative integers. Let 1 ≤ s ≤ m be fixed for the following discussion. We will show that if e_ts > 0, then p_s^{e_ts} is an elementary divisor of D. We can perform elementary type I operations on D so that we can assume that

e_1s ≤ e_2s ≤ · · · ≤ e_ks, (9)

i.e., we can assume that h_1, . . . , h_k are arranged in order of increasing powers of p_s. Then

h_t = p_s^{e_ts} a_t,

where p_s and a_t are relatively prime, t = 1, . . . , k. If 1 ≤ r ≤ k and ω ∈ Q_{r,k}, then

det D[ω|ω] = h_{ω(1)} · · · h_{ω(r)} = p_s^{e_{ω(1)s}} a_{ω(1)} · · · p_s^{e_{ω(r)s}} a_{ω(r)} = p_s^{m(ω,s)} a_ω,

where

m(ω,s) = e_{ω(1)s} + · · · + e_{ω(r)s}  and  a_ω = a_{ω(1)} · · · a_{ω(r)}.

Now p_s and a_ω are relatively prime, and obviously the power of p_s occurring in

d_r(D) = g.c.d.{ det D[α|β] | α ∈ Q_{r,m}, β ∈ Q_{r,n} }

is min_{ω ∈ Q_{r,k}} m(ω,s). But from (9),

min_{ω ∈ Q_{r,k}} m(ω,s) = e_1s + e_2s + · · · + e_rs.

Thus the power of p_s occurring in d_r(D) is Σ_{i=1}^r e_is. From (7), the invariant factors of D are given by

q_r(D) = d_r(D)/d_{r-1}(D),

and hence the power of p_s occurring in q_r(D) is

Σ_{i=1}^r e_is - Σ_{i=1}^{r-1} e_is = e_rs.

Thus in the factorization of q_r(D), p_s appears with power e_rs. We conclude that the list of elementary divisors of D consists precisely of all the p_s^{e_ts} with e_ts > 0. ∎

Example 2 If D = diag(3^2, 7^3, 24), we find S(D) over Z. We recall that if X = [x_ij] is a square matrix with nonzero elements only on the main diagonal, then diag(x_11, . . . , x_nn) is used to denote X. From Theorem 2.4 we need only factor each of the main diagonal entries (24 = 2^3·3) to obtain the list of elementary divisors of D:

2^3, 3, 3^2, 7^3.

The list of invariant factors is

q_3 = 2^3·3^2·7^3,  q_2 = 3,  q_1 = 1.

Thus

S(D) = diag(1, 3, 2^3·3^2·7^3).

Corollary 3 Assume that A ∈ M_{m,n}(R) has the following form:

A = [ B  0  0 ]
    [ 0  C  0 ]
    [ 0  0  0 ]

where B is p × q and C is r × s.


Then the list of elementary divisors of A is the combined list of elementary divisors of B and C.

Proof: Let P_1, Q_1, P_2, and Q_2 be p × p, q × q, r × r, and s × s unimodular matrices, respectively, so chosen that

P_1 B Q_1 = S(B)  and  P_2 C Q_2 = S(C).

Let

P = P_1 ∔ P_2 ∔ I_{m-(p+r)}  and  Q = Q_1 ∔ Q_2 ∔ I_{n-(q+s)}.

Then by the Laplace expansion theorem, P and Q are both unimodular, and by block multiplication,

PAQ = [ S(B)   0   0 ]
      [  0   S(C)  0 ]
      [  0    0    0 ]

By elementary type I operations, PAQ can be brought to the form

diag(q_1(B), . . . , q_{ρ_1}(B), q_1(C), . . . , q_{ρ_2}(C), 0, . . . , 0), (10)

ρ_1 = ρ(B), ρ_2 = ρ(C). From Theorem 2.4, the list of elementary divisors of A is just the list of all prime power factors of any of the nonzero entries appearing in the matrix (10). But this is exactly the combined list of elementary divisors of B and C. ∎

Example 3 (a) Suppose A ∈ M_{m,n}(R) and b ∈ R^m, and it is required to solve the system of linear equations

Ax = b,

(11)

for x = (x_1, . . . , x_n) ∈ R^n. This problem is more difficult than the corresponding problem for a field that we considered in Theorem 1.12, Section 4.1. However, suppose P and Q are unimodular matrices such that PAQ = S(A). Let x = Qy in (11), so that

AQy = b,  PAQy = Pb,

or

S(A)y = c, (12)

where c = Pb. Then (12) becomes

q_1(A) y_1 = c_1
      . . .
q_r(A) y_r = c_r
         0 = c_{r+1}
      . . .
         0 = c_m   (13)

where r = ρ(A). Thus we can state the following: The system of equations (11) has a solution iff q_t(A) | c_t, t = 1, . . . , r, and c_{r+1} = · · · = c_m = 0. The totality of solutions x is then obtained by letting y_{r+1}, . . . , y_n be arbitrary elements of R in

x = Q[c_1/q_1(A), c_2/q_2(A), . . . , c_r/q_r(A), y_{r+1}, . . . , y_n]^T.
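The recipe just described can be sketched over Z; the unimodular P and Q below are taken as given (they are the ones used for the diophantine system in part (b), and the matrix helpers are assumptions for this sketch):

```python
def matvec(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]

def matmul(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(len(N)))
             for j in range(len(N[0]))] for i in range(len(M))]

A = [[7, 3, 15], [1, 5, 3]]
b = [100, 120]
P = [[0, 1], [-1, 7]]
Q = [[1, 10, -33], [0, 1, -3], [0, -5, 16]]

S = matmul(matmul(P, A), Q)
print(S)           # [[1, 0, 0], [0, 2, 0]]: the Smith normal form S(A)
c = matvec(P, b)   # [120, 740]
assert c[0] % S[0][0] == 0 and c[1] % S[1][1] == 0   # the system is solvable
y3 = 0             # the free parameter y_3 (any integer works)
x = matvec(Q, [c[0] // S[0][0], c[1] // S[1][1], y3])
assert matvec(A, x) == b    # x = [3820, 370, -1850] solves the system
```

Varying the free parameter y3 over Z sweeps out the full solution set, as worked out by hand in part (b).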

(b) A linear diophantine system is a system such as (11) in which R = Z. We solve the diophantine system

7x_1 + 3x_2 + 15x_3 = 100
 x_1 + 5x_2 +  3x_3 = 120.   (14)

The system (14) has the form Ax = b, where

A = [ 7  3  15 ]
    [ 1  5   3 ]

and b = (100, 120). A minor calculation (see Exercise 6) shows that PAQ = S(A), where

P = [  0  1 ]       Q = [ 1  10  -33 ]
    [ -1  7 ],          [ 0   1   -3 ]
                        [ 0  -5   16 ],

and

S(A) = [ 1  0  0 ]
       [ 0  2  0 ].

Also,

c = Pb = (120, 740).

Thus

c_1/q_1(A) = 120/1 = 120,   c_2/q_2(A) = 740/2 = 370.


Then the totality of solutions must have the form

x = Q[120, 370, y_3]^T = [3820 - 33y_3, 370 - 3y_3, -1850 + 16y_3]^T.

If we set y_3 = 116 - d, where d runs through Z, we have a neater expression for x:

x_1 = -8 + 33d,
x_2 = 22 + 3d,

x_3 = 6 - 16d.

Our immediate goal at this point is to use the Smith normal form as the main tool in proving the fundamental theorem describing the structure of a finitely generated R-module. We will illustrate some of the aspects of the proof in the following example.

Example 4 By using the Smith normal form, we will reprove Theorem 1.9 in Section 4.1 to the effect that any finite dimensional vector space M over a field R has a basis. By definition, M possesses a finite generating (or spanning) set {s_1, . . . , s_k}. Any element a ∈ M has the form

a = r_1 s_1 + · · · + r_k s_k,  r_i ∈ R, i = 1, . . . , k, (15)

but in general there is nothing necessarily unique about the representation (15), because we do not know that s_1, . . . , s_k are l.i. Consider the subset W ⊂ R^k consisting of all (r_1, . . . , r_k) for which

Σ_{i=1}^k r_i s_i = 0.

It is an easy matter (see Exercise 7) to confirm that W is a subspace of R^k. Since R^k has the obvious basis {ε_j = (δ_{j1}, . . . , δ_{jk}) | j = 1, . . . , k}, we can conclude by Theorem 1.4, Section 4.1 that W has a finite spanning set, say

{(a_{i1}, . . . , a_{ik}) | i = 1, . . . , q}. (16)

Let A ∈ M_{q,k}(R) be the matrix whose i-th row is (a_{i1}, . . . , a_{ik}), i = 1, . . . , q. Observe that the row space of A satisfies

ℛ(A) = W,

and if P ∈ GL(q,R) then ℛ(PA) = ℛ(A) = W. If we define

y = Q^{-1}s,

where Q ∈ GL(k,R), then as we have seen a number of times before, {y_1, . . . , y_k} is also a set of generators for M. Now suppose P and Q are chosen so that PAQ = S(A), and set r = ρ(A). If c = [c_1, . . . , c_r, 0, . . . , 0] ∈ ℛ(PAQ) = ℛ(S(A)), then

cy = cQ^{-1}s = [Σ_{i=1}^r c_i S(A)_{(i)}] Q^{-1}s   (see Corollary 2)
   = [Σ_{i=1}^r c_i (PAQ)_{(i)}] Q^{-1}s = [Σ_{i=1}^r c_i (PA)_{(i)} Q] Q^{-1}s = [Σ_{i=1}^r c_i (PA)_{(i)}] s. (17)

But ℛ(PA) = ℛ(A), and so Σ_{i=1}^r c_i (PA)_{(i)} ∈ W; hence (17) is 0. Thus if c ∈ ℛ(PAQ) = ℛ(S(A)), then cy = 0. Conversely, suppose d = [d_1, . . . , d_k] and dy = 0. Then dQ^{-1}s = 0, so dQ^{-1} ∈ W = ℛ(A), i.e.,

dQ^{-1} = Σ_{i=1}^q r_i A_{(i)},  d = Σ_{i=1}^q r_i A_{(i)} Q = Σ_{i=1}^q r_i (AQ)_{(i)} ∈ ℛ(AQ).

But ℛ(PAQ) = ℛ(AQ), and hence d ∈ ℛ(PAQ) = ℛ(S(A)). Thus ℛ(S(A)) consists precisely of those k-tuples [c_1, . . . , c_k] for which

Σ_{i=1}^k c_i y_i = 0.

We have found a generating set y_1, . . . , y_r, y_{r+1}, . . . , y_k of M with the property that

c_1 y_1 + · · · + c_r y_r = 0 (18)

for arbitrary choices of c_1, . . . , c_r in R. For

S(A) = [ I_r  0 ]
       [  0   0 ]

(see Corollary 2), and hence any c = [c_1, . . . , c_r, 0, . . . , 0] is in ℛ(S(A)). Moreover,

d_1 y_1 + · · · + d_k y_k = 0  iff  d_{r+1} = · · · = d_k = 0,  i.e., iff  d ∈ ℛ(S(A)).

It follows from (18) that y_1 = · · · = y_r = 0 (choose c_i = 1, c_j = 0, j ≠ i, i = 1, . . . , r). Also, if

d_{r+1} y_{r+1} + · · · + d_k y_k = 0,

then d_{r+1} = · · · = d_k = 0, and hence y_{r+1}, . . . , y_k are l.i.

A torsion element a in a module M is an element for which there exists r ∈ R, r ≠ 0, such that

ra = 0. (19)

In other words, a is not free. The module M is torsion free if a = 0 is the only torsion element. The totality of torsion elements in M is a submodule (see Exercise 8), and it is called the torsion submodule of M. If a ∈ M is a torsion element, then the totality of r ∈ R for which (19) holds forms an ideal in R (see Exercise 9) called the order ideal of a and denoted by

O(a). (20)

We can extend (20) to any element a by simply defining O(a) to be {0} if a is free.


This is sensible, since the only r ∈ R for which ra = 0 is r = 0 iff a is free, i.e., not a torsion element. Since R is a P.I.D. and since any two generators of an ideal are associates, it follows that

O(a) = (q),

and q is uniquely determined if we insist that it lie in the stipulated system of nonassociates in R. We call q the order of a. For example, if R = Z and the system of nonassociates is N ∪ {0}, then the order of a torsion element a is the least positive integer q such that qa = 0. More generally, if N is any subset of M, then the set of all r ∈ R such that rN = 0 is an ideal in R (see Exercise 10), called the order ideal of N and denoted by O(N). The generator of O(N) is called the order of N.
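For R = Z the order is easy to compute: in the Z-module Z/n, the order ideal of a is generated by n/gcd(a, n). A minimal sketch (the helper name is an assumption):

```python
from math import gcd

def order(a, n):
    """Least positive q with q*a congruent to 0 mod n: the order of a in Z/n."""
    return n // gcd(a, n)

# In Z/12: order(8, 12) == 3, since 3*8 = 24 is divisible by 12.
print([order(a, 12) for a in range(12)])
```

Note that order(0, n) = 1, matching the convention that O(0) is the whole ring (generated by the unit 1).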

If W_1, . . . , W_p are submodules of M such that M = Σ_{i=1}^p W_i, and moreover every element a ∈ M has a unique representation as a = w_1 + · · · + w_p, w_i ∈ W_i, i = 1, . . . , p, then M is called the direct sum of W_1, . . . , W_p and we write

M = Σ_{i=1}^p ∔ W_i = W_1 ∔ · · · ∔ W_p. (21)

The fundamental theorem concerning finitely generated modules over a P.I.D. R is the following.

Theorem 2.5 Let M be a (nonzero) finitely generated R-module. Then there exist s + r nonzero elements a_1, a_2, . . . , a_s, a_{s+1}, . . . , a_{s+r} in M such that

(i) M = ⟨a_1⟩ ∔ · · · ∔ ⟨a_{s+r}⟩. (22)

(ii) a_i is a torsion element of order μ_i, i = 1, . . . , s, and μ_1 | μ_2 | · · · | μ_s.

(iii) a_{s+1}, . . . , a_{s+r} are free.

Proof: We remark that none of the μ_i can be 1 (i.e., a unit), since 1·a_i = 0 means a_i = 0 and none of a_1, . . . , a_s is 0. The argument begins in the same way as in Example 4. Let {s_1, . . . , s_k} be a generating set for M, and let W ⊂ R^k be the set of all k-tuples (c_1, . . . , c_k) for which

cs = Σ_{j=1}^k c_j s_j = 0.

Since R^k has a basis, it follows that W has a spanning set

{(a_{i1}, . . . , a_{ik}) | i = 1, . . . , m},

and we define A = [a_{ij}] ∈ M_{m,k}(R). Clearly, c ∈ ℛ(A) = W iff cs = 0. Choose P and Q unimodular such that PAQ = S(A), and let q_t = q_t(A), t = 1, . . . , p = ρ(A), be the invariant factors of A:

S(A) = [ diag(q_1, . . . , q_p)  0 ]
       [           0            0 ]   (23)


Define y = Q^{-1}s, also a generating set for M. We prove first that:

d ∈ ℛ(S(A))  iff  dy = 0. (24)

Suppose first that d ∈ ℛ(S(A)), so that d = Σ_{t=1}^p δ_t S(A)_{(t)}:

dy = dQ^{-1}s = (Σ_{t=1}^p δ_t S(A)_{(t)}) Q^{-1}s = (Σ_{t=1}^p δ_t (PAQ)_{(t)}) Q^{-1}s = (Σ_{t=1}^p δ_t (PA)_{(t)} Q) Q^{-1}s = (Σ_{t=1}^p δ_t (PA)_{(t)}) s. (25)

Now Σ_{t=1}^p δ_t (PA)_{(t)} ∈ ℛ(PA) = ℛ(A), and hence (25) is 0. Conversely, if dy = 0, then dQ^{-1}s = 0; so dQ^{-1} ∈ W = ℛ(A), i.e.,

dQ^{-1} = Σ_{i=1}^m r_i A_{(i)},  d = Σ_{i=1}^m r_i A_{(i)} Q = Σ_{i=1}^m r_i (AQ)_{(i)} ∈ ℛ(AQ) = ℛ(PAQ).

Hence

d ∈ ℛ(S(A)). (26)

This establishes (24). We examine the meaning of (26). From (23) we see that (26) is equivalent to the following: i: l,...,p,

q_t | d_t,  t = 1, . . . , p,   (27)

d_{p+1} = · · · = d_k = 0.   (28)

Thus dy = 0 iff both (27) and (28) hold. If we take d = q_t e_t, e_t = (δ_{t1}, . . . , δ_{tk}), t = 1, . . . , p, then d satisfies (27) and (28), and hence

q_t y_t = 0,  t = 1, . . . , p.

Moreover, if m y_t = 0, then (m e_t)y = 0, so that m e_t ∈ ℛ(S(A)), i.e., m e_t satisfies (27) and (28). Thus q_t | m. In other words, m y_t = 0 iff q_t | m. It follows that q_t is the order of y_t, t = 1, . . . , p. Suppose now that q_1 = · · · = q_v = 1, and q_{v+1}, . . . , q_p are not 1. Then since

q_t y_t = 0 it follows that y_1 = · · · = y_v = 0, and thus M is generated by

y_{v+1}, . . . , y_p, y_{p+1}, . . . , y_k,

and y_t has order q_t ≠ 1, t = v + 1, . . . , p. We assert that y_{p+1}, . . . , y_k are free. For suppose

d_{p+1}y_{p+1} + · · · + d_k y_k = 0.   (29)

Then if we set d = [0, . . . , 0, d_{p+1}, . . . , d_k], it follows from dy = 0 that d ∈ ℛ(S(A)), i.e., that (28) holds. Suppose finally that

Modules and Linear Operators

d_{v+1}y_{v+1} + · · · + d_p y_p + d_{p+1}y_{p+1} + · · · + d_k y_k = δ_{v+1}y_{v+1} + · · · + δ_p y_p + δ_{p+1}y_{p+1} + · · · + δ_k y_k.   (30)

Then if we set d = [0, . . . , 0, d_{v+1}, . . . , d_k] and δ = [0, . . . , 0, δ_{v+1}, . . . , δ_k], it follows that (d − δ)y = 0. Hence from (27) and (28) again we have

q_t | (d_t − δ_t),  t = v + 1, . . . , p,   (31)

and d_{p+1} = δ_{p+1}, . . . , d_k = δ_k. But since q_t y_t = 0, t = v + 1, . . . , p, it follows from (31) that

(d_t − δ_t)y_t = 0,  i.e.,  d_t y_t = δ_t y_t,  t = v + 1, . . . , p.

In other words, any element in M can be written as an element in

(y_{v+1}) + · · · + (y_p) + (y_{p+1}) + · · · + (y_k),

and moreover this can be done in only one way. If we set a_j = y_{v+j}, j = 1, . . . , k − v, s = p − v, r = k − p, μ_j = q_{v+j}, j = 1, . . . , s, then (i), (ii) and (iii) immediately follow. ∎

Corollary 4

In the notation of Theorem 2.5, the torsion submodule of M is precisely

T = Σ_{i=1}^s (a_i).   (32)

Moreover, F = Σ_{i=s+1}^{s+r} (a_i) is a free submodule of M of rank r, and

M = T ∔ F.   (33)

Proof:  Clearly, T = Σ_{i=1}^s (a_i) consists of torsion elements only, for obviously μ_s T = 0. On the other hand, if

a = Σ_{i=1}^s r_i a_i + Σ_{i=s+1}^{s+r} r_i a_i

and ca = 0, then since the sum (22) is direct, c r_i a_i = 0, i = 1, . . . , s + r. But then c r_i ∈ O(a_i), i = 1, . . . , s + r, and this in turn implies that c r_i = 0, i = s + 1, . . . , s + r, because a_{s+1}, . . . , a_{s+r} are free. Hence if c ≠ 0, then r_i = 0, i = s + 1, . . . , s + r, i.e., a ∈ Σ_{i=1}^s (a_i). This proves (32), and (33) is obvious. ∎

Corollary 5

The module M in Theorem 2.5 is free iff it is torsion free.

Proof:  If M is torsion free, then T = 0 and M = F, which is a free module by Corollary 4. Conversely, a free module M cannot contain a nonzero torsion element (see Exercise 11). ∎

In what follows we assume that the system of nonassociates in Z is N U {0}.
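With this convention, the invariant factors q_1 | q_2 | · · · used in the proof of Theorem 2.5 can be computed over Z by an explicit unimodular reduction. The following is a minimal sketch of such a reduction (our own helper, not the text's algorithm; it returns only the nonzero invariant factors):

```python
def smith_normal_form(A):
    """Nonzero invariant factors q_1 | q_2 | ... of an integer matrix,
    obtained by unimodular row and column operations (a sketch)."""
    A = [row[:] for row in A]
    m, n = len(A), len(A[0])
    res = []
    for k in range(min(m, n)):
        while True:
            # move a nonzero entry of least absolute value to position (k, k)
            nz = [(abs(A[i][j]), i, j)
                  for i in range(k, m) for j in range(k, n) if A[i][j]]
            if not nz:
                return res
            _, pi, pj = min(nz)
            A[k], A[pi] = A[pi], A[k]                 # row swap
            for row in A:                             # column swap
                row[k], row[pj] = row[pj], row[k]
            p = A[k][k]
            # reduce the rest of row k and column k modulo the pivot
            changed = False
            for i in range(k + 1, m):
                if A[i][k] % p:
                    changed = True
                q = A[i][k] // p
                A[i] = [a - q * b for a, b in zip(A[i], A[k])]
            for j in range(k + 1, n):
                if A[k][j] % p:
                    changed = True
                q = A[k][j] // p
                for row in A:
                    row[j] -= q * row[k]
            if changed:
                continue
            # enforce q_k | (every remaining entry): fold a bad row into row k
            bad = next(((i, j) for i in range(k + 1, m)
                        for j in range(k + 1, n) if A[i][j] % p), None)
            if bad is None:
                break
            A[k] = [a + b for a, b in zip(A[k], A[bad[0]])]
        res.append(abs(A[k][k]))
    return res
```

For the matrix of Exercise 4 below, A = [[7, 3, 15], [1, 5, 3]], it returns [1, 2].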


Corollary 6  If R = Z, then (a_i) is isomorphic to Z_{μ_i}, i = 1, . . . , s, and (a_i) is isomorphic to Z, i = s + 1, . . . , s + r, as abelian groups.

Proof:  For i = 1, . . . , s define φ_i: Z_{μ_i} → (a_i) by

φ_i([k]) = k a_i.

First observe that if [k_1] = [k_2], then μ_i | (k_1 − k_2) and hence (k_1 − k_2)a_i = 0, k_1 a_i = k_2 a_i. Thus φ_i is well-defined. Also,

φ_i([k] + [l]) = φ_i([k + l]) = (k + l)a_i = k a_i + l a_i = φ_i([k]) + φ_i([l]).

Note that if φ_i([k]) = φ_i([l]), then (k − l)a_i = 0 and hence μ_i | (k − l), i.e., [k] = [l]. Since φ_i is clearly surjective, it is an isomorphism. For i = s + 1, . . . , s + r define φ_i: Z → (a_i) by

φ_i(k) = k a_i.

It is obvious that φ_i is a surjective homomorphism, and since a_i is free, it is clearly injective. ∎

If G is an abelian group, we can regard it as a Z-module [see Example 2(b), Section 4.1]. We then have the so-called fundamental theorem of finitely generated abelian groups:

Corollary 7  If G is a finitely generated abelian group (as a Z-module), then G is the direct sum (as submodules) of s cyclic subgroups G_1, . . . , G_s of finite order and r infinite cyclic subgroups G_{s+1}, . . . , G_{s+r}. Moreover, |G_i| | |G_{i+1}|, i = 1, . . . , s − 1.

Proof:  We simply apply the decomposition of Theorem 2.5 and take G_i = (a_i), i = 1, . . . , s + r. ∎

Corollary 8  If G is a finite abelian group (as a Z-module), then

G = Σ_{i=1}^s G_i,   (34)

where G_1, . . . , G_s are finite cyclic subgroups satisfying |G_i| | |G_{i+1}|, i = 1, . . . , s − 1. Moreover |G| = Π_{i=1}^s |G_i|, and G_i is isomorphic to Z_{|G_i|}, i = 1, . . . , s.

Proof:  Everything has been proved except |G| = Π_{i=1}^s |G_i|. But we know from Corollary 6 that

G_i = {0, a_i, 2a_i, . . . , (μ_i − 1)a_i},  μ_i = |G_i|,   (35)

and the sum (34) is direct.
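Corollary 8 can be illustrated concretely by realizing a direct sum of cyclic groups as tuples; the helper below is our own, not the text's notation:

```python
from itertools import product

def direct_sum_elements(orders):
    """All elements of Z_{m_1} ⊕ ... ⊕ Z_{m_s}, represented as tuples."""
    return list(product(*(range(m) for m in orders)))

G = direct_sum_elements([2, 6])    # orders satisfy 2 | 6, as in Corollary 8
# |G| is the product of the orders, and μ_s = 6 annihilates every element
annihilated = all(all((6 * x) % m == 0 for x, m in zip(g, [2, 6])) for g in G)
```

Here |G| = 2 · 6 = 12, in agreement with |G| = Π |G_i|.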


k_c(Σ_{i=1}^p W_i) = Σ_{i=1}^p k_c(W_i).   (43)

Formula (42) is obvious, and (43) is contained in Exercise 15. That T = Σ_{i=1}^s (a_i) is the torsion submodule of M was proved in Corollary 4. Next observe that since μ_i | μ_s and μ_i a_i = 0, it follows that μ_s T = Σ_{i=1}^s μ_s (a_i) = 0. Hence μ_s ∈ O(T). On the other hand, if fT = 0, then f a_i = 0, i = 1, . . . , s, and f ∈ O(a_i) = (μ_i), i = 1, . . . , s; in particular μ_s | f. In other words, μ_s is the order of T. This proves (ii). Now write M = T ∔ F, where T = Σ_{i=1}^s (a_i) and F = Σ_{i=s+1}^{s+r} (a_i). Then μ_s M = μ_s T + μ_s F by (42), and thus

μ_s M = μ_s F.

But

μ_s F = Σ_{i=s+1}^{s+r} (μ_s a_i)

is obviously a free submodule of rank r. Hence (iii) follows. ∎

Corollary 10

If M has a second decomposition (as in Theorem 2.5) of the form M = …

… x^{sl} = x^{tl} with s > t implies x^{(s−t)l} = e. But since s and t are both less than q − 1, (s − t)l < ql = m. Thus some power of x less than m is e. This contradicts the fact that |A| = m. Finally, we saw that B consists of powers of x^l, i.e., x^{kl}. If k > q, write k = tq + v, 0 ≤ v < q, and then x^{kl} = x^{(tq+v)l} = x^{tql}x^{vl} = x^{tm}x^{vl} = x^{vl}. Thus B is precisely the set {x^l, x^{2l}, . . . , x^{ql} = e}.

2. Suppose G is a proper subgroup of Z_n and n = p^m, p a prime. Show that G is a

cyclic subgroup generated by [p^a], where 0 < a < m. Hint: Z_n is itself a cyclic group, which we write additively:

Z_n = {0·[1] = [0], [1], 2[1], . . . , (n − 1)[1]}.

If G is a subgroup, then by Exercise 1 above,

G = {[l], 2[l], . . . , (q − 1)[l], [0]}


and ql = n = p^m. But then l = p^a for some 0 < a < m.

3. Let M be a finitely generated free R-module of rank n. Let N be a submodule. Show that there exists a basis {γ_1, . . . , γ_n} of M and nonzero elements q_1, . . . , q_m of R, q_1 | q_2 | · · · | q_m, such that {q_1γ_1, . . . , q_mγ_m} is a basis of N. Hint: Let {e_1, . . . , e_n} be a basis of M. By Theorems 1.5 and 1.8 in Section 4.1, there exists a basis {v_1, . . . , v_m} of N and a matrix A ∈ M_{m,n}(R) of rank m such that v = Ae. Now write A = P^{-1}S(A)Q^{-1}, where P and Q are unimodular, and set Q^{-1}e = γ, Pv = u. Then γ is a basis of M, u is a basis of N, and u = S(A)γ. But

S(A) = [q_1  0 · · ·  0 · · · 0
        ⋮        ⋱
        0  · · · q_m · · · 0],

so that u_i = q_i γ_i, i = 1, . . . , m.

4. In Z^3 let N be the submodule generated by (7,3,15) and (1,5,3). Find a basis

{y_1, y_2, y_3} of Z^3 and positive integers q_1 and q_2, q_1 | q_2, such that {q_1y_1, q_2y_2} is a basis of N. Hint: Write v_1 = (7,3,15), v_2 = (1,5,3), so that v = Ae, e_i = (δ_{i1}, δ_{i2}, δ_{i3}), i = 1, 2, 3. Note that v is a basis of N. Now A = P^{-1}S(A)Q^{-1}, where P and Q are unimodular and

A = [7 3 15
     1 5  3],   S(A) = [1 0 0
                        0 2 0].

Thus Pv = S(A)Q^{-1}e, and if we set Q^{-1}e = y and u = Pv, then y is a basis of Z^3 and u is a basis of N. Now

Q^{-1} = (1/det Q) adj Q = [1  5 3
                            0 16 3
                            0  5 1].

Thus y_1 = (1,5,3), y_2 = (0,16,3), y_3 = (0,5,1). Also u_1 = y_1 = (1,5,3); u_2 = 2y_2 = (0,32,6).

5. Establish formula (43). Hint: Suppose a ∈ k_c(Σ_{i=1}^p W_i). Then a = w_1 + · · · + w_p and 0 = ca = cw_1 + · · · + cw_p. The uniqueness of the representation of an element as a sum of elements in W_1, . . . , W_p then implies that cw_1 = · · · = cw_p = 0, i.e., w_i ∈ k_c(W_i), i = 1, . . . , p. Thus k_c(Σ_{i=1}^p W_i) ⊂ Σ_{i=1}^p k_c(W_i), and the other inclusion is equally obvious.

6. Verify the computations of the matrices P, Q, and S(A) in Example 3(b).

7. Show that if s_1, . . . , s_k are generators of an R-module M, then the set W ⊂ R^k consisting of all (r_1, . . . , r_k) ∈ R^k for which Σ_{j=1}^k r_j s_j = 0 is a submodule of R^k. Hint: It is obviously closed under addition and scalar multiplication.

8. Show that if T is the set of all torsion elements in the R-module M, then T is a submodule of M (the torsion submodule). Hint: Suppose α and β are in T, cα = 0 and dβ = 0, 0 ≠ c, d ∈ R. Then cd(α + β) = 0 and c(aα) = a(cα) = 0 for any a ∈ R.
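The conclusion of Exercise 4 can be verified independently: {(1,5,3), (0,32,6)} generates the same submodule of Z^3 as {(7,3,15), (1,5,3)}. A sketch using exact rational elimination (the helper and its name are ours; it assumes its two generators are linearly independent over Q):

```python
from fractions import Fraction

def in_span_Z(target, g1, g2):
    """Is target an INTEGER combination a*g1 + b*g2?  (two generators in Z^3)"""
    # solve a*g1 + b*g2 = target from two coordinates, then check the rest
    for i in range(3):
        for j in range(i + 1, 3):
            det = g1[i] * g2[j] - g1[j] * g2[i]
            if det:
                a = Fraction(target[i] * g2[j] - target[j] * g2[i], det)
                b = Fraction(g1[i] * target[j] - g1[j] * target[i], det)
                ok = all(a * g1[k] + b * g2[k] == target[k] for k in range(3))
                return ok and a.denominator == 1 and b.denominator == 1
    return False

v1, v2 = (7, 3, 15), (1, 5, 3)            # the given generators of N
u1, u2 = (1, 5, 3), (0, 32, 6)            # q1*y1 and q2*y2 from the hint
same_span = all(in_span_Z(u, v1, v2) for u in (u1, u2)) and \
            all(in_span_Z(v, u1, u2) for v in (v1, v2))
```

Indeed u_2 = −v_1 + 7v_2 and v_1 = 7u_1 − u_2, so each generating set lies in the span of the other.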


9. Show that if α is a torsion element in M, then the totality O(α) of r ∈ R for which rα = 0 is an ideal in R (the order ideal of α). Hint: If r_1, r_2 ∈ O(α), then (r_1 + r_2)α = r_1α + r_2α = 0, so O(α) is closed under addition. Also, if a ∈ R, r ∈ O(α), then (ar)α = a(rα) = 0.

10. Show that if N ⊂ M, then the totality of r ∈ R for which rN = 0 is an ideal in R. Hint: Same argument as in Exercise 9.

11. Show that a free module M can contain no nonzero torsion elements. Hint: Suppose {s_1, . . . , s_k} is a basis of M and α = Σ_{t=1}^k c_t s_t is a torsion element. Then for some c ≠ 0, 0 = cα = Σ_{t=1}^k c c_t s_t, and hence c c_t = 0, t = 1, . . . , k. Since c ≠ 0, it follows that c_t = 0, t = 1, . . . , k, and hence α = 0.

12. Let N be a submodule of M, and let k_c(N) = {α ∈ N | cα = 0}. Show that k_c(N) is a submodule of M (k_c(N) is called the kernel of c with respect to N). Hint: If α, β ∈ k_c(N) and r ∈ R, then c(rα) = r(cα) = 0 and c(α + β) = cα + cβ = 0.

13. Verify formula (42).

14. Let M and N be R-modules, and let v: M → N be a homomorphism of the additive abelian group structures in M and N which satisfies v(rα) = rv(α) for all r ∈ R, α ∈ M. Then v is a module homomorphism. The words epimorphism (surjective), monomorphism (injective), isomorphism (bijective) are also used in connection with module homomorphisms.
(a) Show that I_M: M → M is a homomorphism.
(b) If M ⊂ N and ι: M → N with ι(α) = α, then ι is a homomorphism.
(c) If v: M → N, then im v and ker v = {α ∈ M | v(α) = 0} are submodules of N and M, respectively.
(d) Let N be a submodule of M, and define a scalar multiplication in the quotient group M/N by r(α + N) = rα + N. Show that M/N is an R-module with this definition and that the canonical map v: M → M/N, v(α) = α + N, is a module homomorphism. The R-module M/N is called the quotient module.
(e) Show that if v: M → N, then im v ≅ M/ker v.
(f) Show that if v: M → N, then v is injective iff ker v = (0).
(g) Show that if W_i, i = 1, . . . , p, are submodules of M and σ ∈ S_p, then Σ_{i=1}^p W_i ≅ Σ_{i=1}^p W_{σ(i)}.
(h) Suppose F is a ring and M is an R-module. Assume that W_i is a submodule of M and W_i ≅ F, i = 1, . . . , p. Then Σ_{i=1}^p W_i ≅ Σ_{i=1}^p F = F^p as R-modules. [We abuse the notation for direct sum slightly by regarding each copy of F in Σ_{i=1}^p F as the set of all p-tuples over F of the form (0, . . . , 0, f, 0, . . . , 0).] The scalar multiplication in F^p by elements of R is of course r(f_1, . . . , f_p) = (rf_1, . . . , rf_p).
Hint: (a), (b), (c) are immediate. (d) If α − β ∈ N, then r(α − β) ∈ N, so rα + N = rβ + N. Thus the scalar multiplication is well-defined. Then v(rα) = rα + N = r(α + N) = rv(α). (e) Define ṽ: M/ker v → im v by ṽ(α + ker v) = v(α). This is well-defined since α − β ∈ ker v implies v(α) = v(β). Also ṽ is a surjective homomorphism. If ṽ(α + ker v) = 0, then α ∈ ker v, so α + ker v = ker v; i.e., ṽ maps only the zero in M/ker v into 0. This makes ṽ injective and


hence an isomorphism. (f) This is obvious. (g) Define v: Σ_{i=1}^p W_i → Σ_{i=1}^p W_{σ(i)} by v(w_1 + · · · + w_p) = w_{σ(1)} + · · · + w_{σ(p)}. (h) Let φ_i: W_i → F be the isomorphism as R-modules, i = 1, . . . , p. Then set v(w_1 + · · · + w_p) = (φ_1(w_1), . . . , φ_p(w_p)).

Combine this with Exercise 21. The orders of the various primary cyclic submodules are called the elementary divisors of T.

25. Let M be an abelian group, and let {s_1, . . . , s_k} be a generating set for M. Let (a_{i1}, . . . , a_{ik}) ∈ Z^k, i = 1, . . . , m. Then M is said to be generated by {s_1, . . . , s_k} subject to the defining relations

Σ_{j=1}^k a_{ij}s_j = 0,  i = 1, . . . , m,

if (a_{i1}, . . . , a_{ik}), i = 1, . . . , m, span the submodule W ⊂ Z^k consisting of all (c_1, . . . , c_k) for which Σ_{j=1}^k c_j s_j = 0. We also say that M has the presentation

(s_1, . . . , s_k | Σ_{j=1}^k a_{ij}s_j = 0, i = 1, . . . , m).

The proof of Theorem 2.5 shows how M can be decomposed into a direct sum of s torsion subgroups (a_1), . . . , (a_s) of orders μ_1, . . . , μ_s and r infinite cyclic subgroups … (2s_1 + s_2), and the isomorphic copy of Z in M is (y_1) = (s_1).

Glossary

4.2

cyclic primary decomposition, 348
determinantal divisor of A, 320
d_t(A), 320
diophantine system of equations, 329
direct sum, 332
elementary divisors of A, 324
elementary divisors of a torsion submodule, 348
equivalence of matrices, 320
external direct sum of R-modules, 343
group generated by {s_1, . . . , s_k} subject to defining relations, 348
invariant factors, 324
isomorphism of R-modules, 339
kernel of c with respect to N, 338
k_c(N), 338
list of elementary divisors, 324
matrix of relations, 349
module epimorphism, 346
module homomorphism, 346
module isomorphism, 339
module monomorphism, 346
order ideal of α, 331
O(α), 331
order of α, 332
order ideal of N, 332
O(N), 332
order of N, 332
presentation of an abelian group, 348
primary module, 347
p-module, 347
q_t(A), 323
quotient module, 346
M/N, 346
Smith normal form, 321
S(A), 321
spanning set for a vector space, 330
T_p(W), 347
torsion element in a module, 331
torsion free, 331
torsion submodule, 331

4.3

The Structure of Linear Transformations

Let V and W be finite dimensional vector spaces over a field R; i.e., V and W are finitely generated modules over R. Let T: V → W be a module homomorphism (see Exercise 14, Section 4.2). Then T is called a linear transformation (abbreviated l.t.), and the totality of such T is denoted by Hom(V, W). It is obvious that if T and S are in Hom(V, W), then their sum T + S, defined by

(T + S)v = Tv + Sv,  v ∈ V,   (1)

is in Hom(V, W). Also, if r ∈ R, then the scalar product rT ∈ Hom(V, W) is the l.t. defined by

(rT)v = r(Tv),  v ∈ V.   (2)

Thus Hom(V, W) is a vector space. If {e_1, . . . , e_n} is a basis of V, then there is one and only one l.t. which satisfies

Te_i = w_i,  i = 1, . . . , n,

where the w_i are prescribed vectors in W. For if v = Σ_{i=1}^n ξ_i e_i ∈ V, then

Tv = T Σ_{i=1}^n ξ_i e_i = Σ_{i=1}^n ξ_i Te_i = Σ_{i=1}^n ξ_i w_i.

This process of defining an l.t. by stipulating its values on a basis is called linear extension. Let λ be an indeterminate over R, and let T ∈ Hom(V, V) be a fixed linear transformation. We define a scalar multiplication of vectors v ∈ V by polynomials f(λ) ∈ R[λ] as follows: If f(λ) = a_0 + a_1λ + · · · + a_kλ^k, then

f(λ)v = f(T)v = (a_0 I_V + a_1 T + a_2 T² + · · · + a_k T^k)v.

It is simple (see Exercise 2) to confirm that V is an R[λ]-module. It is, in fact, a torsion module. For by Exercise 1, dim Hom(V, V) = N = n², where dim V = n. Hence the N + 1 powers I_V, T, T², . . . , T^N must be linearly dependent in Hom(V, V). Let a_0, . . . , a_N be elements of R, not all 0, for which


a_0 I_V + a_1 T + · · · + a_N T^N = 0. If f(λ) = a_0 + a_1λ + · · · + a_Nλ^N, then obviously f(λ)V = 0. We stipulate the monic polynomials in R[λ] as a system of nonassociates. Then the order of V as an R[λ]-module is called the minimal polynomial of T; i.e., the minimal polynomial of T is the monic polynomial m(λ) of least degree such that m(T) = 0. According to Exercise 24 in Section 4.2, V can be written uniquely as a direct sum of cyclic primary R[λ]-submodules; that is, V is a direct sum of R[λ]-submodules of the form (α), where p(λ)^e is the order of α, p(λ) is a prime in R[λ], and e is a positive integer. To say that (α) is an R[λ]-submodule means that for any g(λ) ∈ R[λ], g(T)α ∈ (α).
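The module action f(λ)v = f(T)v is just evaluation of a polynomial at a matrix. As a small check (the matrix T below is our own example, not one from the text), m(λ) = (λ − 1)² annihilates T = [[1, 1], [0, 1]] while λ − 1 does not, so (λ − 1)² is the minimal polynomial of T:

```python
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def poly_at_matrix(coeffs, T):
    """Evaluate f(T) for f(λ) = coeffs[0] + coeffs[1]λ + ... by Horner."""
    n = len(T)
    result = [[0] * n for _ in range(n)]
    for c in reversed(coeffs):
        result = mat_mul(result, T)
        for i in range(n):
            result[i][i] += c        # add c * identity
    return result

T = [[1, 1], [0, 1]]
zero = [[0, 0], [0, 0]]
m_T = poly_at_matrix([1, -2, 1], T)      # (λ - 1)^2 = 1 - 2λ + λ^2
lin_T = poly_at_matrix([-1, 1], T)       # λ - 1
```

Hence R² is a torsion R[λ]-module of order (λ − 1)² under this T.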

Let

g(λ) = p(λ)^e = b_0 + b_1λ + · · · + λ^k.

Then

b_0α + b_1Tα + · · · + T^kα = 0,

so that T^kα is in the subspace U of V spanned by α, Tα, . . . , T^{k−1}α. It follows immediately that T^mα ∈ U for any positive integer m, and thus that f(T)α ∈ U for any f(λ) ∈ R[λ]. Hence f(λ)α ∈ U for all f(λ), and since by definition (α) consists of the totality of such R[λ]-multiples of α, it follows that (α), as a subspace of V, is spanned by α, Tα, . . . , T^{k−1}α. But we can also see that these vectors are l.i. over R; otherwise there would exist c_0, . . . , c_{k−1}, not all 0, such that

0 = c_0α + c_1Tα + · · · + c_{k−1}T^{k−1}α = (c_0 + c_1T + · · · + c_{k−1}T^{k−1})α = h(T)α,

where h(λ) = c_0 + c_1λ + · · · + c_{k−1}λ^{k−1}. But g(λ) = p(λ)^e is the order of (α), which means that if h(λ)α = 0, then g(λ) | h(λ). But deg g(λ) = k, deg h(λ) ≤ k − 1. Thus α, Tα, . . . , T^{k−1}α is a basis of (α) as a subspace of V. We can apply Exercise 24, Section 4.2, to conclude the following result.

Theorem 3.1

(Cyclic Invariant Subspace Decomposition)  If dim V = n and T ∈ Hom(V, V), then there exists a unique list of prime power monic polynomials

p_1(λ)^{e_{11}}, . . . , p_1(λ)^{e_{1n_1}}, p_2(λ)^{e_{21}}, . . . , p_2(λ)^{e_{2n_2}}, . . . , p_r(λ)^{e_{r1}}, . . . , p_r(λ)^{e_{rn_r}}

such that V is the direct sum of subspaces W_{ij},

V = Σ_{i,j} W_{ij},   (2)

with the following properties:

(i) T(W_{ij}) ⊂ W_{ij}, j = 1, . . . , n_i, i = 1, . . . , r.

(ii) p_i(λ)^{e_{ij}} is the order of W_{ij} as an R[λ]-module, and dim W_{ij} = deg p_i(λ)^{e_{ij}} = m_{ij}, j = 1, . . . , n_i, i = 1, . . . , r.

(iii) There exist vectors α_{ij} ∈ V such that

α_{ij}, Tα_{ij}, T²α_{ij}, . . . , T^{m_{ij}−1}α_{ij}   (3)

comprise a basis of W_{ij}.

Observe that since the sum (2) is direct, it follows that the totality of vectors (3), j = 1, . . . , n_i, i = 1, . . . , r, comprise a basis of V. The subspaces W_{ij} in Theorem 3.1 are called cyclic invariant subspaces.

Although Theorem 3.1 is essentially the main result concerning the structure of a single linear transformation T, its proof is unsatisfactory in that we have no idea of how to systematically go about constructing the subspaces W_{ij} in (2). We shall now take a somewhat different approach to overcome this difficulty. Let V = ⟨e_1, . . . , e_n⟩, W = ⟨f_1, . . . , f_m⟩, where the indicated vectors constitute bases of the vector spaces V and W. If T ∈ Hom(V, W), then there exist uniquely defined scalars c_{ij} ∈ R, i = 1, . . . , m, j = 1, . . . , n, such that

Te_j = Σ_{i=1}^m c_{ij} f_i,  j = 1, . . . , n.

The m × n matrix [c_{ij}] = C is called the matrix representation of T with respect to the ordered bases E = {e_1, . . . , e_n} and F = {f_1, . . . , f_m}. We use the notation

C = [T]_E^F.

There are a number of elementary results concerning matrix representations.

Theorem 3.2  Let U, V, and W be vector spaces over R with ordered bases G = {g_1, . . . , g_p}, E = {e_1, . . . , e_n}, and F = {f_1, . . . , f_m}, respectively.

(a) If φ_{E,F}: Hom(V, W) → M_{m,n}(R) is the function that maps each T ∈ Hom(V, W) into [T]_E^F, then φ_{E,F} is a vector space isomorphism.

(b) If S: U → V and T: V → W, then

[TS]_G^F = [T]_E^F [S]_G^E.   (4)

(c) [I_U]_G^G = I_p, the p × p identity matrix [δ_{ij}].

(d) If A ∈ M_n(R), then A is a unit (i.e., A is nonsingular) iff there exists a basis H = {h_1, . . . , h_n} of V such that [I_V]_H^E = A.

(e) If T: V → V, then T is a unit in Hom(V, V) iff [T]_E^E is a unit in M_n(R), and in this event

[T^{-1}]_E^E = ([T]_E^E)^{-1}.   (5)

(f) Let T: V → V and A = [T]_E^E. If B ∈ M_n(R), then there exists a basis H = {h_1, . . . , h_n} of V such that [T]_H^H = B iff there exists a unit matrix P ∈ M_n(R) such that

A = PBP^{-1}.   (6)

Proof:  The proofs of (a), (b), (c) are left to the reader (see Exercise 3).

(d) Set B = [I_V]_E^H, and use (4) to compute

AB = [I_V]_H^E [I_V]_E^H = [I_V]_E^E = I_n,

and similarly BA = I_n. Thus if A = [I_V]_H^E, then A is a unit. Conversely, suppose A is a unit and define

h_j = Σ_{i=1}^n a_{ij} e_i,  j = 1, . . . , n,   (7)

i.e., h = A^T e. Let c = [c_1, . . . , c_n], c_i ∈ R, and assume that ch = 0. Then cA^T e = 0, and since E is a basis, cA^T = 0. But then Ac^T = 0, and A is a unit, so that c^T = 0. Thus the vectors H = {h_1, . . . , h_n} are l.i. and hence form a basis of V. The equations (7) state that [I_V]_H^E = A.

(e) This follows immediately from (4).

(f) Suppose such a basis H exists. Then from (4) we have

A = [T]_E^E = [I_V]_H^E [T]_H^H [I_V]_E^H.   (8)

Now [I_V]_H^E = P is a unit by (d), and by (4) again, P^{-1} = [I_V]_E^H. Thus (8) becomes (6). Suppose conversely that A = PBP^{-1}. Define vectors h_1, . . . , h_n by h = P^T e. Since P is a unit so is P^T, and hence h_1, . . . , h_n form a basis of V. The equation h = P^T e is the statement that [I_V]_H^E = P. Then

[T]_H^H = [I_V]_E^H [T]_E^E [I_V]_H^E = P^{-1}AP = B.  ∎

Theorem 3.2(f) is a particularly important result: It states that two matrices A and B are matrix representations of the same T ∈ Hom(V, V) iff there exists a nonsingular P such that

A = PBP^{-1}.   (9)

Two matrices A and B in M_n(R) are similar over R, written

A ≈ B,   (10)

if there exists P ∈ GL(n, R) such that (9) holds. It is simple (see Exercise 4) to

confirm that similarity is an equivalence relation. Obviously we can analyze the structure of a T ∈ Hom(V, V) if we can find a simple matrix representation for T. In view of (9) the problem is to reduce any matrix representation of T to something simple by similarity transformations. The key notion in this reduction procedure is the connection between similarity over the field R and equivalence over the ring of polynomials over R. Let λ be an indeterminate over M_n(R). We observe that ι: R → M_n(R), ι(r) = rI_n, is an injection and ι(R) is isomorphic to R. Also, λ is an indeterminate over ι(R); so by a now familiar construction there is an indeterminate λ_1 over R and a ring isomorphism ι_1: R[λ_1] → ι(R)[λ] such that ι_1(λ_1) = λ, ι_1|R = ι. Observe that

ι_1(λ_1 r) = ι_1(λ_1)ι_1(r) = λι(r) = λ(rI_n).   (11)

In order to avoid cumbersome notation, we shall not distinguish between λ and λ_1. Thus λ will be regarded as an indeterminate over R as well. Thus we have

M_n(R)[λ] = M_n(R[λ]).   (12)

Note that equivalence in M_n(R[λ]) makes perfect sense since R[λ] is a P.I.D.; in fact, R[λ] is a Euclidean ring, so that equivalence can be accomplished by elementary row and column operations (see Theorem 2.3, Section 4.2).

If A ∈ M_n(R), then the characteristic matrix of A is the matrix

λI_n − A ∈ M_n(R[λ]),   (13)

and

det(λI_n − A) ∈ R[λ]   (14)

is the characteristic polynomial of A. If F is a splitting field for the characteristic polynomial, then the roots α_1, . . . , α_n ∈ F of the characteristic polynomial are called the characteristic roots or eigenvalues (e.v.) of A. Thus

det(α_j I_n − A) = 0,  j = 1, . . . , n.   (15)

The splitting field F is determined to within equivalent extensions (see Theorem 4.14, Section 3.4).

The next result is fundamental.

Theorem 3.3  Let A and B be in M_n(R). Then A ≈ B iff λI_n − A ≡ λI_n − B, where ≡ is equivalence in M_n(R[λ]).

Proof:  Assume that λI_n − A ≡ λI_n − B, and let P = P(λ) and Q = Q(λ) be unimodular matrices in M_n(R[λ]) for which

λI_n − A = P(λI_n − B)Q.


Recall that P is unimodular iff det P is a unit in R[λ], i.e., iff 0 ≠ det P ∈ R. Let L = Q^{-1} ∈ M_n(R[λ]), and write L = L(λ) as a polynomial in λ with coefficients in M_n(R):

L = Σ_{t=0}^m L_{m−t} λ^{m−t},  L_{m−t} ∈ M_n(R),  t = 0, . . . , m   (16)

(see Exercise 5). Then

I_n = Q^{-1}Q = Σ_{t=0}^m L_{m−t}λ^{m−t}Q = Σ_{t=0}^m L_{m−t}Qλ^{m−t}.   (17)

Now write Q = Q(λ) as a polynomial in λ with coefficients in M_n(R), and let

W = Q_r(A);   (18)

i.e., W is the right-hand value of Q(λ) at A. Then, since (see Exercise 6)

(Q(λ)λ^{m−t})_r(A) = Q_r(A)A^{m−t} = WA^{m−t},

we can evaluate the right-hand value of (17) at A to obtain

I_n = Σ_{t=0}^m L_{m−t}(Q(λ)λ^{m−t})_r(A) = Σ_{t=0}^m L_{m−t}WA^{m−t}.   (19)

Now

P^{-1}(λI_n − A) = (λI_n − B)Q(λ) = λQ(λ) − BQ(λ) = Q(λ)λ − BQ(λ).   (20)

It is elementary to verify (see Exercise 7) that the right-hand value of P^{-1}(λI_n − A) at A is 0, and thus from (20)

0 = (Q(λ)λ)_r(A) − (BQ(λ))_r(A) = Q_r(A)A − BQ_r(A) = WA − BW,

so that

WA = BW.   (21)

Thus, from (21) we have

WA^{m−t} = B^{m−t}W,  t = 0, . . . , m,   (22)

and substituting (22) into (19) produces

I_n = (Σ_{t=0}^m L_{m−t}B^{m−t})W,

so that W = Q_r(A) is nonsingular (see Exercise 8). From (21) we have

A = W^{-1}BW   (23)

and A ≈ B. Conversely, if A ≈ B so that A = S^{-1}BS, S ∈ M_n(R), then

λI_n − A = λI_n − S^{-1}BS


= S^{-1}(λI_n − B)S,   (24)

and since S is nonsingular in M_n(R), it is unimodular in M_n(R[λ]). But then (24) implies λI_n − A ≡ λI_n − B. ∎

It should be observed that the proof of Theorem 3.3 provides a constructive procedure for obtaining a matrix W in GL(n, R) for which A = W^{-1}BW if λI_n − A ≡ λI_n − B:

(i) Obtain Q = Q(λ) (a product of elementary matrices) in M_n(R[λ]) for which λI_n − A = P(λI_n − B)Q.

(ii) Write Q(λ) as a polynomial in λ with coefficients in M_n(R).

(iii) Evaluate W = Q(λ)_r(A).

Example 1

We show that

A = [1 1
     1 1]   and   B = [2 0
                       0 0]

are similar over M_2(R), where R is the field of rational numbers. First,

λI_2 − A = [λ−1  −1
            −1  λ−1],   λI_2 − B = [λ−2  0
                                    0    λ].
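Before reducing the characteristic matrices, note that a claimed similarity A = W^{-1}BW can always be confirmed by exhibiting one invertible W with WA = BW. For the pair above (which we read off the scan as A = [[1, 1], [1, 1]] and B = diag(2, 0)), a W whose rows are eigenvectors of A works; the code below is our own check, not part of the text's procedure:

```python
def mat_mul(A, B):
    n, m, p = len(A), len(B), len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]

A = [[1, 1], [1, 1]]
B = [[2, 0], [0, 0]]
W = [[1, 1], [1, -1]]                            # rows: eigenvectors of A
det_W = W[0][0] * W[1][1] - W[0][1] * W[1][0]    # -2, so W is invertible
# WA = BW together with det W != 0 is equivalent to A = W^{-1} B W
similar = (mat_mul(W, A) == mat_mul(B, W)) and det_W != 0
```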

There is no difficulty in confirming that elementary row and column operations over R[λ] reduce both λI_2 − A and λI_2 − B to

[1  0
 0  λ(λ − 2)],   (25)

so that the two characteristic matrices have the same invariant factors 1, λ(λ − 2) and are therefore equivalent; by Theorem 3.3, A ≈ B.

… If m > 1, then

H(g(λ)) = [a 1 0 · · · 0
           0 a 1 · · · 0
                ⋱  ⋱
           0 0 · · · a 1
           0 0 · · · 0 a],

and clearly det(λI_m − H(g(λ)))(m | 1) = (−1)^{m−1}. Thus d_0 = · · · = d_{m−1} = 1, where the d_t are the determinantal divisors of λI_m − H(g(λ)). Also, it is obvious that det(λI_m − H(g(λ))) = (λ − a)^m = g(λ), and hence the invariant factors of λI_m − H(g(λ)) are 1, . . . , 1, g(λ). But according to Theorem 3.5, these are precisely the invariant factors of λI_m − C(g(λ)). Thus by Theorem 3.3, H(g(λ)) ≈ C(g(λ)). ∎

The following theorem is classical.

Theorem 3.11 (Jordan Normal Form)  Let A ∈ M_n(R), and let R be a splitting field for the characteristic polynomial of A. Then the polynomials p_1(λ), . . . , p_r(λ) in the list (43) are of the form

p_i(λ) = λ − a_i,  i = 1, . . . , r,   (44)

and the matrix A is similar over R to the direct sum of the hypercompanion matrices of the polynomials (43).

Proof:  The product of the elementary divisors of λI_n − A is the characteristic polynomial of A, and by hypothesis this factors into linear factors in R[λ]. Since R[λ] is a U.F.D. and each of the elementary divisors is a power of a prime in R[λ], it follows that each p_i(λ) is linear. The Frobenius normal form of A is

Σ_{i=1}^r Σ_{j=1}^{n_i} ∔ C((λ − a_i)^{e_{ij}}),

which by Theorem 3.10 is similar to

Σ_{i=1}^r Σ_{j=1}^{n_i} ∔ H((λ − a_i)^{e_{ij}}).   (45)  ∎

The matrix (45) is called the Jordan normal form of A.

Example 3
(a) Suppose R = ℝ, the real number field. Then the irreducible monic polynomials in ℝ[λ] are of the form λ − a and λ² + c_1λ + c_0, c_1² − 4c_0 < 0. Thus the possibilities for the elementary divisors of the characteristic matrix of a matrix in M_2(ℝ) are λ − a_1, λ − a_2, a_i ∈ ℝ; (λ − a)², a ∈ ℝ; λ² + c_1λ + c_0, c_1² − 4c_0 < 0. Thus any A ∈ M_2(ℝ) is similar over ℝ to one of the following three types of matrices:

[a_1  0       [a 1       [0 −c_0
 0   a_2],     0 a],       1 −c_1].

(b) We compute the Jordan normal form over the complex numbers ℂ of the matrix A = C(f(λ)), f(λ) = (λ − 1)²(λ² + 1). The single nonunit determinantal divisor of λI_4 − A is f(λ), and hence the only nonunit invariant factor is f(λ). Factoring f(λ), we have

f(λ) = (λ − 1)²(λ + i)(λ − i),

and hence the list of elementary divisors of λI_4 − A is

(λ − 1)², λ + i, λ − i.

We have

H((λ − 1)²) = [1 1
               0 1],   H(λ + i) = [−i],   H(λ − i) = [i].

Thus the Jordan normal form of A is

[1 1
 0 1] ∔ [−i] ∔ [i].
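Behind these computations is the fact that H(g(λ)) and C(g(λ)) have the same invariant factors, and in particular the same characteristic polynomial. The sketch below checks this for g(λ) = (λ − 2)³, computing characteristic polynomials exactly by the Faddeev–LeVerrier recursion (the helpers are ours; the companion matrix is written with the negated coefficients in its last column, as in this section):

```python
from fractions import Fraction

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def char_poly(A):
    """Coefficients [c_0, ..., c_{n-1}, 1] of det(λI - A), computed
    exactly over the rationals (Faddeev-LeVerrier recursion)."""
    n = len(A)
    A = [[Fraction(x) for x in row] for row in A]
    M = [[Fraction(0)] * n for _ in range(n)]
    coeffs = [Fraction(1)]               # leading coefficient of λ^n
    for k in range(1, n + 1):
        for i in range(n):               # M <- M + (newest coefficient) I
            M[i][i] += coeffs[0]
        M = mat_mul(A, M)                # M <- A M
        c = -sum(M[i][i] for i in range(n)) / k
        coeffs.insert(0, c)
    return coeffs

a, m = 2, 3
H = [[a if i == j else 1 if j == i + 1 else 0 for j in range(m)]
     for i in range(m)]                          # hypercompanion of (λ-2)^3
# (λ-2)^3 = λ^3 - 6λ^2 + 12λ - 8, so c_0 = -8, c_1 = 12, c_2 = -6:
C = [[0, 0, 8], [1, 0, -12], [0, 1, 6]]          # companion matrix
```

Both `char_poly(H)` and `char_poly(C)` give [−8, 12, −6, 1], i.e., λ³ − 6λ² + 12λ − 8 = (λ − 2)³.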

(c) We compute the Jordan normal form over ℝ of

A = [1 3 −1
     0 1  2
     0 0  1].

The characteristic matrix is

λI_3 − A = [λ−1  −3    1
            0   λ−1   −2
            0    0   λ−1].

Thus d_0 = 1, d_1 = 1, and two of the 2 × 2 subdeterminants are (λ − 1)² and λ − 7 (to within unit multiples). It follows that d_2 = 1, and clearly d_3 = (λ − 1)³. The invariant factors are q_1 = 1, q_2 = 1, q_3 = (λ − 1)³, and (λ − 1)³ is the single elementary divisor. Hence

A ≈ H((λ − 1)³) = [1 1 0
                   0 1 1
                   0 0 1].

(d)

If A ∈ M_n(R) and A² = A, we discuss the possibilities for the Frobenius normal form of A. Since A² − A = 0, it follows that the minimal polynomial of A, m(λ), is a divisor of λ² − λ. Thus the possibilities for m(λ) are: (i) m(λ) = λ; (ii) m(λ) = λ − 1; (iii) m(λ) = λ(λ − 1). In case (i), A = 0; in (ii), A − I_n = 0, i.e., A = I_n; in (iii), the highest degree elementary divisors are λ and λ − 1, i.e., the factors of m(λ). Thus the Frobenius normal form of A is a direct sum of companion matrices of λ and λ − 1. Thus A is similar over R to a direct sum of 1-square matrices of the form [0] or [1]. But then

A ≈ I_r ∔ 0_{n−r},

where r = ρ(A), I_r is the r-square identity matrix and 0_{n−r} is the (n − r)-square zero matrix.

(e) Let B ∈ M_2(R) be the companion matrix of the irreducible polynomial f(λ) = λ² + c_1λ + c_0 ∈ R[λ]:

B = [0 −c_0
     1 −c_1].

Define A ∈ M_{2p}(R) to be the matrix in which there are p occurrences of B down the main diagonal, 1's occur in positions (2,3), (4,5), . . . , (2p − 2, 2p − 1), and all other entries of A are 0. Then

A ≈ C(f(λ)^p).

For it is easy to see that

det(λI_{2p} − A)(2p | 1) = c_0^{p−1},

and since f(λ) is irreducible, c_0 ≠ 0. Hence d_{2p−1} = 1, and the determinantal divisors of λI_{2p} − A are d_0 = · · · = d_{2p−1} = 1, d_{2p} = det(λI_{2p} − A) = {det(λI_2 − B)}^p = f(λ)^p. Thus λI_{2p} − A has the single elementary divisor f(λ)^p, as does the characteristic matrix of C(f(λ)^p).
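Part (d) can be made concrete: for an idempotent A of rank r, conjugating by any S whose columns are a basis of im A followed by a basis of ker A produces I_r ∔ 0_{n−r}. The 2 × 2 instance below is our own example:

```python
def mat_mul(A, B):
    n, m, p = len(A), len(B), len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]

A = [[1, 0], [1, 0]]               # A^2 = A, rank 1
S = [[1, 0], [1, 1]]               # columns: image vector (1,1), kernel vector (0,1)
S_inv = [[1, 0], [-1, 1]]          # inverse of S
frobenius_form = mat_mul(S_inv, mat_mul(A, S))   # expect I_1 ∔ 0_1
```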

Our next sequence of results will help answer questions of the following kind: If we know the Frobenius normal form of A ∈ M_n(R), then what is the Frobenius normal form of A² or A³? We illustrate this kind of problem in the following example.

Example 4
(a) Let f(λ) = p(λ)^e ∈ R[λ], p(λ) a monic prime polynomial, deg f(λ) = n > 0. Let A = C(f(λ)). Since det(λI_n − A) = f(λ), it follows that f(0) = (−1)^n det A, so that f(0) ≠ 0 iff A is nonsingular. Now write

f(λ) = c_0 + c_1λ + · · · + c_{n−1}λ^{n−1} + λ^n,  c_0 ≠ 0,

and

g(λ) = λ^n + (c_1/c_0)λ^{n−1} + · · · + (c_{n−1}/c_0)λ + 1/c_0.

Then

g(A^{-1}) = A^{-n} + (c_1/c_0)A^{-(n−1)} + · · · + (c_{n−1}/c_0)A^{-1} + (1/c_0)I_n
         = (1/c_0)A^{-n}(A^n + c_{n−1}A^{n−1} + · · · + c_1A + c_0I_n)
         = (1/c_0)A^{-n}f(A)
         = 0   (by the Cayley–Hamilton theorem).
It follows that if h(λ) is the minimal polynomial of A^{-1}, then h(λ) | g(λ). Let g(λ) = h(λ)k(λ), deg h(λ) = p, deg k(λ) = q, p + q = n. Observe that

c_0 λ^n g(1/λ) = c_0 λ^n (1/λ^n + (c_1/c_0)(1/λ^{n−1}) + · · · + (c_{n−1}/c_0)(1/λ) + 1/c_0)
             = c_0 + c_1λ + · · · + c_{n−1}λ^{n−1} + λ^n = f(λ).

Now h(0)k(0) = g(0) = 1/c_0 ≠ 0. Thus the constant terms of both h(λ) and k(λ) are nonzero; call them h_0 and k_0, respectively.
=31 ,=,2; f "~"(a..)fi,‘”)=31 5' (i-l) (“‘)z"’

i=1 j=l

then the linear independence of the Z,.j implies that f(j_“(ai)=g(j_n(ai),

.= l, ‘

-

- ’flia

i=1! -

-

‘ ’S'

If X(t) = [x_{ij}(t)] is a matrix whose entries are differentiable functions of a parameter t, then by the derivative of X(t), Ẋ(t), we mean the matrix

Ẋ(t) = [ẋ_{ij}(t)].

The following interesting result is the fundamental theorem concerning systems of first-order linear differential equations with constant coefficients.

Theorem 3.16  If A ∈ M_n(ℂ) and X(t) = e^{tA}, where t is a complex parameter, then

Ẋ(t) = Ae^{tA} = AX(t).   (73)

Proof:  Clearly, since f(λ) = e^{tλ} has derivatives of all orders, f(A) = e^{tA} is well-defined. In fact, from the representation (63) and the fact that f^{(j−1)}(λ) = t^{j−1}e^{tλ}, we have

X(t) = e^{tA} = f(A) = Σ_{i=1}^s Σ_{j=1}^{e_i} t^{j−1}e^{ta_i} Z_{ij}.   (74)

Since the Z_{ij} are constant matrices (independent of t), we can differentiate both sides of (74) to obtain


Ẋ(t) = Σ_{i=1}^s [Σ_{j=1}^{e_i} a_i e^{ta_i} t^{j−1} Z_{ij} + e^{ta_i} Σ_{j=2}^{e_i} (j − 1)t^{j−2} Z_{ij}]

     = Σ_{i=1}^s [e^{ta_i}a_i Z_{i1} + e^{ta_i}(ta_i + 1)Z_{i2} + e^{ta_i}(t²a_i + 2t)Z_{i3} + · · · + e^{ta_i}(t^{e_i−1}a_i + (e_i − 1)t^{e_i−2})Z_{ie_i}]

     = Σ_{i=1}^s Σ_{j=1}^{e_i} e^{ta_i}(t^{j−1}a_i + (j − 1)t^{j−2})Z_{ij}.   (75)

Now consider the function g(λ) = λe^{tλ}. An elementary induction on r shows that

g^{(r)}(λ) = e^{tλ}(t^r λ + rt^{r−1}),  r = 1, 2, . . . .   (76)

It is also immediate (see Exercise 19) that

g(A) = Ae^{tA}.   (77)

But from (76),

g^{(j−1)}(a_i) = e^{ta_i}(t^{j−1}a_i + (j − 1)t^{j−2}),

and thus from (63) we have

g(A) = Σ_{i=1}^s Σ_{j=1}^{e_i} e^{ta_i}(t^{j−1}a_i + (j − 1)t^{j−2})Z_{ij}.   (78)

If we compare (75) and (78) and use (77), we obtain (73). ∎

A homogeneous first-order system of linear ordinary differential equations with constant coefficients has the following form:

ẋ(t) = Ax(t),   (79)

where x(t) = (x_1(t), . . . , x_n(t)) and ẋ(t) = (ẋ_1(t), . . . , ẋ_n(t)). The initial value problem in connection with (79) is to determine a solution x(t) for which x(t_0) = x_0, a given n-tuple. Now consider the matrix X(t) = e^{tA}, and form the n-tuple of functions

x(t) = X(t − t_0)x_0.

Since e^0 = I_n (see Exercise 20), we have x(t_0) = X(0)x_0 = x_0. From Theorem 3.16 we have

ẋ(t) = Ẋ(t − t_0)x_0 = AX(t − t_0)x_0 = Ax(t).

Thus the solution to (79) which takes on the value x_0 when t = t_0 is

x(t) = e^{(t−t_0)A}x_0.   (80)

Example 7  We solve the system

4.3

The Structure of Linear Transformations

381

    ẋ₁ = 3x₁ − x₂ + x₃
    ẋ₂ = 2x₁ + x₃        (81)
    ẋ₃ = x₁ − x₂ + 2x₃

with the condition x(0) = x₀ = (1, −1, 0). We rewrite (81) in the form ẋ = Ax where

    A = [ 3  −1   1 ]
        [ 2   0   1 ]
        [ 1  −1   2 ].

The matrix e^{tA} was computed in (72). Evaluating (80) in this case, we have

    x(t) = e^{tA}x₀ = ((t + 1)e^{2t} + te^{2t}, −e^t + (1 + t)e^{2t} − e^t + te^{2t}, −e^t + e^{2t} − e^t + e^{2t})
         = ((2t + 1)e^{2t}, −2e^t + (2t + 1)e^{2t}, −2e^t + 2e^{2t})
         = e^{2t}(2t + 1, 2t + 1, 2) + e^t(0, −2, −2).
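The closed-form solution just obtained can be checked numerically. The following sketch assumes NumPy; `expm` is a hypothetical helper that sums the Taylor series for e^M (adequate for small matrices of modest norm), not a routine from the text.

```python
import numpy as np

def expm(M, terms=60):
    # Taylor series e^M = I + M + M^2/2! + ...; fine for small ||M||.
    result = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        result += term
    return result

A = np.array([[3.0, -1.0, 1.0],
              [2.0,  0.0, 1.0],
              [1.0, -1.0, 2.0]])
x0 = np.array([1.0, -1.0, 0.0])

for t in (0.0, 0.5, 1.3):
    numeric = expm(t * A) @ x0                      # x(t) = e^{tA} x_0
    closed = (np.exp(2 * t) * np.array([2 * t + 1, 2 * t + 1, 2.0])
              + np.exp(t) * np.array([0.0, -2.0, -2.0]))
    assert np.allclose(numeric, closed)
```

Each sampled t reproduces e^{2t}(2t + 1, 2t + 1, 2) + e^t(0, −2, −2) to machine accuracy.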

Exercises 4.3

1. Show that dim Hom(V,W) = dim V · dim W, and exhibit a basis of Hom(V,W). Hint: Let {e₁, . . . , e_n} be a basis of V, {f₁, . . . , f_m} a basis of W, and define T_ij ∈ Hom(V,W) by T_ij e_i = f_j, T_ij e_k = 0, k ≠ i, and linear extension. Then the T_ij form a basis of Hom(V,W).

2. Let T ∈ Hom(V,V). For each f(λ) ∈ R[λ] define f(λ)v = f(T)v. Show that V is an R[λ]-module.

3. Prove Theorem 3.2(a) to (c).

4. Let R be a P.I.D. If A and B are in M_n(R), then A is similar to B, A ≅ B, if there exists P ∈ M_n(R) such that P is unimodular and A = PBP⁻¹. Prove that A ≅ B is an equivalence relation in M_n(R). The equivalence classes are called similarity classes.

5. Let a_ij(λ) ∈ R[λ], i,j = 1, . . . , n. By adjoining 0 coefficients if necessary, write a_ij(λ) = Σ_{k=0}^{m} a_ij^(k) λ^k, where m does not depend on i and j. If L(λ) = [a_ij(λ)] ∈ M_n(R[λ]), show that L(λ) can be regarded as an element of M_n(R)[λ] by writing

    L(λ) = Σ_{k=0}^{m} [a_ij^(k)] λ^k = Σ_{k=0}^{m} A_k λ^k,    A_k = [a_ij^(k)], k = 0, . . . , m.

Also prove that if M(λ) = [b_ij(λ)] = Σ_{k=0}^{m} B_k λ^k, then

    L(λ)M(λ) = [Σ_{v=1}^{n} a_iv(λ)b_vj(λ)] = Σ_{t=0}^{2m} (Σ_{s=0}^{t} A_{t−s} B_s) λ^t.


Hint:

    Σ_{v=1}^{n} a_iv(λ)b_vj(λ) = Σ_{v=1}^{n} (Σ_{k=0}^{m} (A_k)_iv λ^k)(Σ_{l=0}^{m} (B_l)_vj λ^l)
        = Σ_{k=0}^{m} Σ_{l=0}^{m} (A_k B_l)_ij λ^{k+l} = Σ_{t=0}^{2m} (Σ_{s=0}^{t} A_{t−s} B_s)_ij λ^t.
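The coefficient formula of Exercise 5 amounts to a convolution of the matrix coefficients. The sketch below (NumPy assumed; all names are illustrative choices) compares the convolved coefficients with pointwise evaluation at a few scalars.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 2, 3                               # degree m polynomials, n-square coefficients
A = [rng.integers(-3, 4, (n, n)).astype(float) for _ in range(m + 1)]  # L(lam) = sum A_k lam^k
B = [rng.integers(-3, 4, (n, n)).astype(float) for _ in range(m + 1)]  # M(lam) = sum B_k lam^k

# Coefficients of L(lam)M(lam) by the convolution of Exercise 5: C_t = sum_{k+l=t} A_k B_l.
C = [sum(A[k] @ B[t - k] for k in range(max(0, t - m), min(m, t) + 1))
     for t in range(2 * m + 1)]

def evaluate(coeffs, lam):
    return sum(c * lam ** k for k, c in enumerate(coeffs))

for lam in (0.0, 1.0, -2.5):
    assert np.allclose(evaluate(A, lam) @ evaluate(B, lam), evaluate(C, lam))
```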

6. Let f(λ) ∈ M_n(R)[λ], A ∈ M_n(R). Show that

    (f(λ)λ^k)_r(A) = f_r(A)A^k.

Hint: Write f(λ) = A₀ + A₁λ + ··· + A_mλ^m. Then

    (f(λ)λ^k)_r(A) = (A₀λ^k + A₁λ^{k+1} + ··· + A_mλ^{k+m})_r(A)
        = A₀A^k + A₁A^{k+1} + ··· + A_mA^{k+m}
        = (A₀ + A₁A + ··· + A_mA^m)A^k = f_r(A)A^k.

7. Let f(λ) ∈ M_n(R)[λ]. Show that

    (f(λ)(λI_n − A))_r(A) = 0.

Hint: Write f(λ) = A₀ + A₁λ + ··· + A_{m−1}λ^{m−1} + A_mλ^m, so that

    f(λ)(λI_n − A) = A₀λ + A₁λ² + ··· + A_{m−1}λ^m + A_mλ^{m+1}
                     − A₀A − A₁Aλ − ··· − A_{m−1}Aλ^{m−1} − A_mAλ^m.

Then

    (f(λ)(λI_n − A))_r(A) = A₀A + A₁A² + ··· + A_{m−1}A^m + A_mA^{m+1}
                            − A₀A − A₁A² − ··· − A_{m−1}A^m − A_mA^{m+1} = 0.

8. Show that if A and B are in M_n(R), R a P.I.D., and AB = I_n, then BA = I_n. Hint: det AB = det A det B = 1; so det A is a unit. Thus (det A)⁻¹ adj A = A⁻¹ ∈ M_n(R), and from AB = I_n we have B = A⁻¹, so that BA = I_n.

9. Show that any two similar matrices have the same minimal polynomial. Show that if T ∈ Hom(V,V), then the minimal polynomial of T [i.e., the monic polynomial m(λ) of least degree for which m(T) = 0] is equal to the minimal polynomial of any matrix representation of T. Hint: The first part follows easily from the fact that p(SAS⁻¹) = Sp(A)S⁻¹ for any polynomial p(λ) ∈ R[λ]. The second statement is obvious from properties of matrix representations.

10. Let A be an n-square matrix partitioned into blocks A_ij of sizes n_i × n_j, i,j = 1, . . . , k, n₁ + ··· + n_k = n. Let P be an n × n permutation matrix with block rows of sizes n₁, . . . , n_k and block columns of sizes n_{σ(1)}, . . . , n_{σ(k)}, in which σ ∈ S_k and in block row s every block is 0 except for the block in block column t = σ(s), where the identity matrix I_{n_{σ(s)}} is found. Similarly, for μ ∈ S_k define an n-square permutation matrix Q by a partitioning with block rows of sizes n_{μ(1)}, . . . , n_{μ(k)}, wherein block column t has 0 blocks except in block row s = μ(t), where the identity matrix I_{n_{μ(t)}} is found. Show that the (s,t) block in the partitioned matrix PAQ is A_{σ(s)μ(t)}, s,t = 1, . . . , k.

11. Show that if A ∈ M_{m,n}(R), dim V = n, and dim U = m, then there exists T ∈ Hom(V,U) such that [T]_E^F = A, where E and F are bases of V and U, respectively. Hint: Define T by Te_j = Σ_{i=1}^{m} a_ij f_i, j = 1, . . . , n, and linear extension.

12. Let V = W ∔ U, where W and U are invariant subspaces under T ∈ Hom(V,V). Let E = {e₁, . . . , e_r, e_{r+1}, . . . , e_n} be a basis of V such that E_r = {e₁, . . . , e_r} and E_{n−r} = {e_{r+1}, . . . , e_n} are bases of W and U, respectively. Show that [T]_E = A₁ ∔ A₂, where A₁ = [T₁]_{E_r}, A₂ = [T₂]_{E_{n−r}}, and T₁ = T|W, T₂ = T|U. Hint: This is an immediate consequence of the definition of a matrix representation.

13. State and prove the converse of the result in Exercise 12.

14. Show that if A ≅ C ∔ D and A ∈ GL(n,R), then A⁻¹ ≅ C⁻¹ ∔ D⁻¹.

15. Let A and B be matrices in M_n(R). Let F be an extension field of R, and suppose that there exists a nonsingular matrix S ∈ M_n(F) such that S⁻¹AS = B. Then show that there exists a nonsingular matrix P ∈ M_n(R) such that P⁻¹AP = B. Hint: The characteristic matrices λI_n − A and λI_n − B are equivalent as matrices over F[λ] and hence have the same determinantal divisors. However these determinantal divisors are g.c.d.'s of polynomials in R[λ] and hence lie in R[λ]. Thus as matrices over R[λ], λI_n − A and λI_n − B have the same determinantal divisors. Thus they are equivalent over R[λ], so that by Theorem 3.3, A and B are similar over R.


16. Let a₁, . . . , a_s be distinct elements in a field R of characteristic 0, and let γ₁, . . . , γ_s be nonnegative integers. Let

    b_{k0}, . . . , b_{kγ_k},    k = 1, . . . , s,

be prescribed elements of R, and let a ∈ R. Define the integer m = (s − 1) + Σ_{k=1}^{s} γ_k. Then prove that there exists precisely one polynomial

    p(λ) = Σ_{j=0}^{m} a_j(λ − a)^j

of degree at most m for which

    p^{(j)}(a_k) = b_{kj},    j = 0, . . . , γ_k,  k = 1, . . . , s.

Hint: There are a total of γ_k + 1 linear conditions imposed on a₀, . . . , a_m by the equalities p^{(j)}(a_k) = b_{kj}, j = 0, . . . , γ_k. Thus altogether there are Σ_{k=1}^{s} (γ_k + 1) = s + Σ_{k=1}^{s} γ_k = m + 1 linear conditions on the coefficients a₀, . . . , a_m. Now imagine every b_{kj} = 0, so that the resulting system of m + 1 linear equations in a₀, . . . , a_m is homogeneous. Any solution defines a polynomial p(λ) of degree at most m for which a_k is a root of multiplicity at least γ_k + 1, k = 1, . . . , s. Thus p(λ) has a total of at least Σ_{k=1}^{s} (γ_k + 1) = Σ_{k=1}^{s} γ_k + s = m + 1 roots, counting multiplicities. But p(λ) has degree at most m and hence must be 0. In other words, the only solution a₀, . . . , a_m to the system of homogeneous linear equations must be a₀ = ··· = a_m = 0. But this happens iff the coefficient matrix is nonsingular. Thus the system has a unique solution for any b_{kj}.

17. Let p(λ) ∈ R[λ], and let m(λ) be the minimal polynomial of A ∈ M_n(R). Prove that if r = deg m(λ), then there exists d(λ) ∈ R[λ] with deg d(λ) ≤ r − 1 such that p(A) = d(A). Hint: Write

    p(λ) = m(λ)q(λ) + d(λ),    deg d(λ) ≤ r − 1;

then p(A) = d(A).
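Exercise 17 can be illustrated numerically. In the sketch below (NumPy assumed) the matrix, its minimal polynomial, and p(λ) are arbitrary choices; `np.polydiv` supplies the division p = mq + d.

```python
import numpy as np

A = np.diag([2.0, 2.0, 3.0])              # minimal polynomial m(lam) = (lam-2)(lam-3)
m_coeffs = [1.0, -5.0, 6.0]               # lam^2 - 5 lam + 6, leading coefficient first
p_coeffs = [1.0, 0.0, -4.0, 7.0, 2.0]     # an arbitrary p(lam) of degree 4

q, d = np.polydiv(p_coeffs, m_coeffs)     # p = m q + d, with deg d <= 1

def poly_at_matrix(coeffs, M):
    # Horner's rule with a matrix argument; coeffs listed highest degree first.
    result = np.zeros_like(M)
    for c in coeffs:
        result = result @ M + c * np.eye(M.shape[0])
    return result

assert np.allclose(poly_at_matrix(m_coeffs, A), 0)    # m(A) = 0
assert np.allclose(poly_at_matrix(p_coeffs, A), poly_at_matrix(d, A))
```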

18. Let m(λ) = (λ − a₁)^{μ₁} ··· (λ − a_s)^{μ_s} be the minimal polynomial of A ∈ M_n(R) (a₁, . . . , a_s are the distinct eigenvalues of A). Let f(λ) be a function defined at A. Show that there exists a polynomial p(λ) ∈ R[λ] such that p(A) = f(A) and deg p(λ) < deg m(λ). Hint: We need only exhibit a polynomial p(λ) of degree less than deg m(λ) for which

    p^{(j)}(a_k) = f^{(j)}(a_k),    j = 0, . . . , μ_k − 1,  k = 1, . . . , s.

Apply Exercises 16 and 17.

19. Let f₁, . . . , f_m be functions defined at A, G(z₁, . . . , z_m) a polynomial in z₁, . . . , z_m, and let

    h(λ) = G(f₁(λ), . . . , f_m(λ)).

Show that h(A) = G(f₁(A), . . . , f_m(A)). Hint: Choose polynomials p₁(λ), . . . , p_m(λ) such that p_i(A) = f_i(A), i = 1, . . . , m, and let r(λ) = G(p₁(λ), . . . , p_m(λ)), a polynomial. By composite differentiation, r^{(j)}(λ) is a sum of products involving partial derivatives of G and derivatives of the form p_i^{(k)}(λ), 0 ≤ k ≤ j, 1 ≤ i ≤ m. Now p_i^{(j)}(a_k) = f_i^{(j)}(a_k), where a₁, . . . , a_s are the distinct eigenvalues of A. The jth derivative of h(λ) = G(f₁(λ), . . . , f_m(λ)) is expressed in terms of the partials of G and the derivatives of the f_i in exactly the same way as r^{(j)}(λ). Thus r^{(j)}(a_k) = h^{(j)}(a_k). Thus h is defined at A and, by the remark following formula (61), h(A) = r(A). But r(A) = G(p₁(A), . . . , p_m(A)) because these are all scalar polynomials. Since p_i(A) = f_i(A), we are done.

20. Show that e⁰ = I_n. Hint: The eigenvalues of 0 are all 0, and every elementary divisor is linear. Thus s = 1 (there is only one eigenvalue) and μ₁ = 1. The polynomial f₁₁(λ) in Example 6(a) used to define the component matrix Z₁₁ must satisfy f₁₁(0) = 1; so we can take f₁₁(λ) = 1. Then Z₁₁ = I_n. If f(λ) = e^λ, then e⁰ = e⁰Z₁₁ = I_n [see Example 6(b)].

21. Assume f and h are functions defined at A, g is defined at f(A), and g has derivatives of order μ_i − 1 at f(a_i), i = 1, . . . , s. Let h = g∘f be the composite function. Then show that h(A) = g(f(A)). Hint: Let p(λ) and q(λ) be polynomials such that p(A) = f(A) and q(f(A)) = g(f(A)). Let r(λ) = q(p(λ)). Then r^{(j)}(λ) is a sum of products of factors of the form q^{(l)}(p(λ)) and p^{(i)}(λ), i,l ≤ j. Also p^{(i)}(a_k) = f^{(i)}(a_k) and q^{(l)}(p(a_k)) = g^{(l)}(f(a_k)) for any eigenvalue a_k of A. If we differentiate h(λ) = g(f(λ)) j times, we get a sum of products of terms such as g^{(l)}(f(λ)) and f^{(i)}(λ), i,l ≤ j, and this expression is the same as the expression for r^{(j)}(λ) in terms of q^{(l)}(p(λ)) and p^{(i)}(λ). Hence r^{(j)}(a_k) = h^{(j)}(a_k), so that h(A) = r(A) = q(p(A)) = q(f(A)) = g(f(A)).

22. Let V be an n-square unitary matrix, i.e., V*V = I_n, where V* = [v̄_ji]. Show that there exists a skew-hermitian matrix X such that e^X = V. Hint: Let QVQ⁻¹ = diag(e^{iθ₁}, . . . , e^{iθ_n}), Q unitary. Let D = diag(iθ₁, . . . , iθ_n). Clearly, e^λ is defined at any A, so that from (61) we have e^{Q⁻¹DQ} = Q⁻¹e^D Q. Now λI_n − D has linear elementary divisors, so if we obtain a polynomial p(λ) such that p(iθ_k) = e^{iθ_k}, then p(D) = e^D. But of course p(D) = diag(p(iθ₁), . . . , p(iθ_n)) = diag(e^{iθ₁}, . . . , e^{iθ_n}) = QVQ⁻¹. Note that X = Q⁻¹DQ is skew-hermitian.

23. Let f(λ) and g(λ) be functions defined at A, h(λ) = f(λ)g(λ), m(λ) = f(λ) + g(λ). Show that h(A) = f(A)g(A), m(A) = f(A) + g(A). Hint: Let G(z₁,z₂) = z₁z₂ and G(z₁,z₂) = z₁ + z₂ in Exercise 19.
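Exercise 22 can be illustrated constructively: pick a random unitary Q and angles θ_k, build the skew-hermitian X = Q diag(iθ)Q*, and confirm that e^X is the corresponding unitary matrix. NumPy assumed; `expm` is the same hypothetical Taylor-series helper used earlier.

```python
import numpy as np

def expm(M, terms=60):
    # Taylor series e^M over the complexes.
    result = np.eye(M.shape[0], dtype=complex)
    term = np.eye(M.shape[0], dtype=complex)
    for k in range(1, terms):
        term = term @ M / k
        result += term
    return result

rng = np.random.default_rng(1)
n = 4
Q, _ = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))
theta = rng.uniform(0, 2 * np.pi, n)

X = Q @ np.diag(1j * theta) @ Q.conj().T              # skew-hermitian: X* = -X
V = Q @ np.diag(np.exp(1j * theta)) @ Q.conj().T      # the unitary e^X

assert np.allclose(X.conj().T, -X)
assert np.allclose(V.conj().T @ V, np.eye(n))
assert np.allclose(expm(X), V)
```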

24. Show that det(e^A) = e^{tr A}. Hint: The eigenvalues of e^A are e^{a₁}, . . . , e^{a_n}. Then det(e^A) = e^{a₁} ··· e^{a_n} = e^{a₁+···+a_n} = e^{tr A}.

25. Let f be defined at A. Show that f(A^T) = f(A)^T. If, in addition, f^{(j)}(ā_k) is the complex conjugate of f^{(j)}(a_k), j = 0, . . . , μ_k − 1, k = 1, . . . , s, then f(A*) = f(A)*. Hint: Consider the formula (63) for f(A) in terms of the component matrices: f(A) = Σ_{i=1}^{s} Σ_{j=1}^{μ_i} f^{(j−1)}(a_i) f_ij(A). Now A^T is similar to A (why?), and hence from Example 6(a) the polynomials f_ij(λ) for A^T are the same as those for A. Since p(A)^T = p(A^T) for any polynomial p(λ),

    f(A^T) = Σ_{i=1}^{s} Σ_{j=1}^{μ_i} f^{(j−1)}(a_i) f_ij(A^T)
           = Σ_{i=1}^{s} Σ_{j=1}^{μ_i} f^{(j−1)}(a_i)(f_ij(A))^T
           = [Σ_{i=1}^{s} Σ_{j=1}^{μ_i} f^{(j−1)}(a_i) f_ij(A)]^T = (f(A))^T.

Also, f(A*) = f(Ā^T) = (f(Ā))^T. Thus we need only show that f(Ā) is the conjugate of f(A). If J is the Jordan normal form of A, then a moment's reflection on the form of f(J) shows that the conjugation condition on the f^{(j)}(a_k) implies that f(J̄) is the conjugate of f(J); writing A = S⁻¹JS, it follows that f(Ā) = f(S̄⁻¹J̄S̄) = S̄⁻¹f(J̄)S̄, which is the conjugate of S⁻¹f(J)S = f(A).

26. Let A_k = [a_ij^(k)], k = 1, 2, . . . , be a sequence of matrices in M_n(C). Then we write lim_{k→∞} A_k = A and say A_k converges to A ∈ M_n(C) if each sequence a_ij^(k), k = 1, 2, . . . , converges to a_ij, i.e., lim_{k→∞} a_ij^(k) = a_ij. Show that if lim_{k→∞} A_k = A and lim_{k→∞} B_k = B, then lim_{k→∞}(A_k + B_k) = A + B and lim_{k→∞} A_kB_k = AB. Hint: Use properties of limits.

27. Let f_k, k = 1, 2, . . . , be a sequence of functions each defined at A ∈ M_n(C). If lim_{k→∞} f_k(A) = B, then f_k(A) is said to converge to B. Prove that f_k(A) converges to some matrix iff each of the limits

    lim_{k→∞} f_k^{(j)}(a_t),    j = 0, . . . , μ_t − 1,  t = 1, . . . , s,

exists. Hint: f_k(A) is by definition similar (via a fixed matrix S independent of k) to a direct sum of matrices of the type Σ_{j=0}^{e−1} (f_k^{(j)}(a_t)/j!) U^j, where 1 ≤ e ≤ μ_t and U is an auxiliary unit matrix. Thus, if each lim_{k→∞} f_k^{(j)}(a_t) exists, then clearly lim_{k→∞} f_k(A) exists. Conversely, each f_k^{(j)}(a_t) can be expressed as a fixed linear combination of the elements of f_k(A), and hence if lim_{k→∞} f_k(A) exists, then so does each lim_{k→∞} f_k^{(j)}(a_t).

28. Let f_k(λ) = Σ_{i=0}^{k} c_i(λ − a)^i be the kth partial sum of a power series whose circle of convergence contains all of the eigenvalues of A ∈ M_n(C) in its interior. Suppose f(λ) = lim_{k→∞} f_k(λ). Then prove that f is defined at A and lim_{k→∞} f_k(A) = f(A). Hint: Obviously each f_k(λ) is defined at A because f_k(λ) is a polynomial. Also, a convergent power series can be differentiated term by term in the interior of its circle of convergence, and hence f^{(j)}(λ) = lim_{k→∞} f_k^{(j)}(λ) for any λ in the interior of that circle. Hence f is defined at A. Now from formula (63),

    f_k(A) = Σ_{i=1}^{s} Σ_{j=1}^{μ_i} f_k^{(j−1)}(a_i) Z_ij,

and the component matrices Z_ij do not depend on f_k(λ). Also lim_{k→∞} f_k^{(j−1)}(a_i) = f^{(j−1)}(a_i) because each a_i is in the interior of the circle of convergence. It follows that

    lim_{k→∞} f_k(A) = Σ_{i=1}^{s} Σ_{j=1}^{μ_i} lim_{k→∞} f_k^{(j−1)}(a_i) Z_ij = Σ_{i=1}^{s} Σ_{j=1}^{μ_i} f^{(j−1)}(a_i) Z_ij = f(A).

29. Prove:

    e^A = I_n + A + A²/2! + A³/3! + ···.

Hint: Let f_k(z) = 1 + z + z²/2! + ··· + z^k/k! and apply Exercise 28.
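Exercises 28 and 29 say the partial sums f_k(A) converge to e^A. The sketch below (NumPy assumed; the random matrix is generically diagonalizable) compares a long partial sum with e^A computed independently through the spectral decomposition f(A) = S f(J) S⁻¹.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(3, 3))

def partial_sum(M, k):
    # f_k(M) = I + M + M^2/2! + ... + M^k/k!, the partial sums of Exercise 29.
    term = np.eye(M.shape[0])
    total = np.eye(M.shape[0])
    for j in range(1, k + 1):
        term = term @ M / j
        total += term
    return total

# Independent value of e^A via the spectral decomposition
# (a random real matrix is diagonalizable with probability 1).
w, S = np.linalg.eig(A)
expA = (S @ np.diag(np.exp(w)) @ np.linalg.inv(S)).real

assert np.allclose(partial_sum(A, 50), expA)
```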


30. Prove: If

    P = [ 0  −i ]
        [ i   0 ],

then

    e^{φP} = [ cosh φ    −i sinh φ ]
             [ i sinh φ    cosh φ ].

Hint: Choose a unitary V such that

    V*PV = [ 1   0 ]
           [ 0  −1 ],

e.g.,

    V = [ 1/√2    1/√2 ]
        [ i/√2   −i/√2 ].

Then

    e^{φP} = e^{φV diag(1,−1)V*} = V e^{φ diag(1,−1)} V* = V diag(e^φ, e^{−φ}) V*

           = (1/2) [ 1   1 ] [ e^φ     0    ] [ 1  −i ]
                   [ i  −i ] [ 0    e^{−φ} ] [ 1   i ]

           = [ (e^φ + e^{−φ})/2      −i(e^φ − e^{−φ})/2 ]
             [ i(e^φ − e^{−φ})/2      (e^φ + e^{−φ})/2 ]

           = [ cosh φ    −i sinh φ ]
             [ i sinh φ    cosh φ ].

31. Assume Log λ is defined at A, where we use the principal value of the logarithm. Show that e^{Log A} = A. Hint: Use Exercise 21.

32. Let R be a field, A,B ∈ M_n(R), and let f(λ), g(λ) be the characteristic polynomials of A, B. Prove: If g.c.d.(f(λ), g(λ)) = 1, then

    S((λI_n − A)(λI_n − B)) = S(λI_n − A)S(λI_n − B).

Hint: It suffices to prove that the product of the kth determinantal divisors d_{A,k}(λ) of λI_n − A and d_{B,k}(λ) of λI_n − B equals the kth determinantal divisor d_{AB,k}(λ) of the product (λI_n − A)(λI_n − B). Let

    S_A = diag(q₁(λ), q₂(λ), . . . , q_n(λ)),    S_B = diag(r₁(λ), r₂(λ), . . . , r_n(λ))

denote, respectively, the Smith normal forms of λI_n − A and λI_n − B. There exist unimodular S, T, V, W in M_n(R[λ]) so that λI_n − A = S·S_A·T and λI_n − B = V·S_B·W. Thus

    (λI_n − A)(λI_n − B) = S·S_A·T·V·S_B·W = S·S_A·U·S_B·W,

where U = TV is unimodular. Using the Cauchy–Binet theorem,

    d_{AB,k}(λ) = g.c.d.{ det S_A[α|α] det U[α|β] det S_B[β|β] : α,β ∈ Q_{k,n} }
                = g.c.d.{ (Π_{l=1}^{k} q_{α(l)}(λ)) det U[α|β] (Π_{l=1}^{k} r_{β(l)}(λ)) : α,β ∈ Q_{k,n} }.    (82)

The polynomials

    d_{A,k}(λ) = q₁(λ) ··· q_k(λ)    and    d_{B,k}(λ) = r₁(λ) ··· r_k(λ)

divide, respectively, the first and third factors in each of the terms in (82). Hence d_{A,k}(λ)d_{B,k}(λ) | d_{AB,k}(λ).

Suppose that there exists some (irreducible) polynomial h(λ) in R[λ] so that h(λ)d_{A,k}(λ)d_{B,k}(λ) divides each of the terms in (82). Then, in particular:

(a) It divides each element of { q₁(λ) ··· q_k(λ) det U[1, . . . , k|β] Π_{l=1}^{k} r_{β(l)}(λ) : β ∈ Q_{k,n} }.

(b) It divides each element of { (Π_{l=1}^{k} q_{α(l)}(λ)) det U[α|1, . . . , k] r₁(λ) ··· r_k(λ) : α ∈ Q_{k,n} }.

Using the Laplace expansion theorem on the first k rows (and then on the first k columns) of the unimodular matrix U, we see that each of the sets

    { det U[1, . . . , k|β] : β ∈ Q_{k,n} }    and    { det U[α|1, . . . , k] : α ∈ Q_{k,n} }

consists of relatively prime elements in R[λ]. So from (a) and the fact that the determinantal divisors form a divisibility chain, the irreducible polynomial h(λ) divides some product Π_{l=1}^{k} r_{β(l)}(λ), and from (b), h(λ) also divides some product Π_{l=1}^{k} q_{α(l)}(λ). By hypothesis, these are relatively prime, and so we have a contradiction. It follows that d_{A,k}(λ)d_{B,k}(λ) = d_{AB,k}(λ).

33. Suppose that A and B have the same irreducible characteristic polynomial. Prove that A and B are similar. Hint: If a matrix has an irreducible characteristic polynomial, then that polynomial is the only nontrivial invariant factor of the characteristic matrix. Thus λI_n − A and λI_n − B have the same invariant factors, i.e., they are equivalent, and so A and B are similar.

34. Let A have the form

    A = [ A₁  A₃ ]
        [ 0   A₂ ]  ∈ M_{m+n}(R),

where A₁ ∈ M_m(R), A₂ ∈ M_n(R). Suppose further that A₁ and A₂ have no eigenvalues in common. Prove that A ≅ A₁ ∔ A₂. Hint: Take unimodular P(λ) and Q(λ) in M_m(R[λ]) such that

    P(λ)(λI_m − A₁)Q(λ) = S(λI_m − A₁),

the Smith normal form. Similarly, take unimodular D(λ) and F(λ) in M_n(R[λ]) such that

    D(λ)(λI_n − A₂)F(λ) = S(λI_n − A₂).

By block multiplication,

    [ P(λ)  0   ] [ λI_m − A₁   −A₃      ] [ Q(λ)  0   ]   [ S(λI_m − A₁)   B(λ)        ]
    [ 0     D(λ)] [ 0           λI_n − A₂] [ 0     F(λ)] = [ 0              S(λI_n − A₂)],

where B(λ) ∈ M_{m,n}(R[λ]). Let

    N = [ S(λI_m − A₁)   B(λ)        ]
        [ 0              S(λI_n − A₂)].

We show that by a sequence of elementary row and column operations on N all of the entries of B(λ) can be made zero. Take a nonzero element of B(λ), say b_ij(λ). Since A₁ and A₂ have no eigenvalues in common,

    g.c.d.(q_i(λI_m − A₁), q_j(λI_n − A₂)) = 1,    i = 1, . . . , m,  j = 1, . . . , n,

where q_k(·) denotes the kth invariant factor. Thus there exist p₁(λ), p₂(λ) ∈ R[λ] such that

    p₁(λ)q_i(λI_m − A₁) + p₂(λ)q_j(λI_n − A₂) = 1.

Hence

    −b_ij(λ)p₁(λ)q_i(λI_m − A₁) − b_ij(λ)p₂(λ)q_j(λI_n − A₂) = −b_ij(λ),

and it follows that the (i, m + j) entry of

    E_{i,m+j}(−b_ij(λ)p₂(λ)) N E_{i,m+j}(−b_ij(λ)p₁(λ))

is zero. We have eliminated the b_ij(λ) entry without disturbing any of the other elements of N, so we can repeat the process, ultimately eliminating all of the nonzero entries of B(λ). But then λI_{m+n} − A is equivalent to (λI_m − A₁) ∔ (λI_n − A₂), and so A ≅ A₁ ∔ A₂.

35. Let

    A = [ A₁  A₃ ]
        [ 0   A₂ ]  ∈ M_{m+n}(R),

where A₁, A₂ are m- and n-square, respectively. Assume a is an eigenvalue of A₁ but not of A₂. Show that the elementary divisors of λI_{m+n} − A and λI_m − A₁ of the form (λ − a)^e are the same. Hint: Choose nonsingular P, Q in M_m(R), M_n(R), respectively, such that PA₁P⁻¹ = J₁, QA₂Q⁻¹ = J₂, where J_i, i = 1, 2, are Jordan normal forms and furthermore the Jordan blocks of A₁ involving the eigenvalue a are in the first k rows (i.e., a is a root of multiplicity k of the characteristic polynomial of A₁). By block multiplication

    [ P  0 ] [ A₁  A₃ ] [ P⁻¹  0  ]   [ J₁  B ]
    [ 0  Q ] [ 0   A₂ ] [ 0    Q⁻¹] = [ 0   J₂],

where B ∈ M_{m,n}(R). Let

    W = [ J₁  B ]
        [ 0   J₂],

and re-partition the matrix W so that

    W = [ B₂  B₄ ]
        [ 0   B₃ ],

where B₂ is k × k (i.e., with the k occurrences of a down the main diagonal) and B₃ is upper triangular. Apply Exercise 34 to W, i.e., W ≅ B₂ ∔ B₃. But B₃ is upper triangular; so the eigenvalues are displayed on the main diagonal, and none of them is a. Now since the complete list of elementary divisors of a direct sum is the combined lists of the summands, the elementary divisors of the characteristic matrix of A of the form (λ − a)^e are exactly those of the characteristic matrix of A₁.

36. Let A and B be rectangular matrices. Define A ⊗ B = [a_ij B]; that is, A ⊗ B is a block matrix whose (i,j) block is a_ij B. Prove that (C ⊗ D)(A ⊗ B) = CA ⊗ DB (assume that the products CA and DB are defined). The matrix A ⊗ B is called the Kronecker product of A and B. Hint: By block multiplication, the (i,j) block in the product [c_ik D][a_kj B] is

    Σ_k c_ik D a_kj B = (Σ_k c_ik a_kj) DB,

which is precisely the (i,j) block in the matrix CA ⊗ DB.

37. Let R be a P.I.D., and let A ∈ M_m(R), B ∈ M_n(R). Let p ∈ R be a prime, and suppose p^{e₁}, . . . , p^{e_r} are all the elementary divisors of A involving p while p^{f₁}, . . . , p^{f_s} are all the elementary divisors of B involving p. Prove that the elementary divisors of the Kronecker product A ⊗ B involving p are p^{e_i + f_j}, i = 1, . . . , r, j = 1, . . . , s. Hint: First note that A ⊗ B is equivalent to S_A ⊗ S_B, where S_A, S_B are in Smith normal form over R. For suppose PAQ = S_A and DBG = S_B. Then by Exercise 36, (P ⊗ D)(A ⊗ B)(Q ⊗ G) = S_A ⊗ S_B. But S_A ⊗ S_B is diagonal; so its elementary divisors are just the prime power factors of the main diagonal elements. Hence the elementary divisors of A ⊗ B involving p are p^{e_i + f_j}, i = 1, . . . , r, j = 1, . . . , s. (Here we must allow the exponents e_i, f_j to be zero, if necessary.)
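The mixed-product property of Exercise 36 is easy to verify with `numpy.kron` (NumPy assumed; the shapes are arbitrary compatible choices):

```python
import numpy as np

rng = np.random.default_rng(3)
C = rng.integers(-5, 6, (2, 3)).astype(float)
A = rng.integers(-5, 6, (3, 2)).astype(float)
D = rng.integers(-5, 6, (2, 4)).astype(float)
B = rng.integers(-5, 6, (4, 2)).astype(float)

# Mixed-product property: (C (x) D)(A (x) B) = CA (x) DB.
left = np.kron(C, D) @ np.kron(A, B)
right = np.kron(C @ A, D @ B)
assert np.allclose(left, right)
```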

38. Suppose A₁ ∈ M_m(R), A₂ ∈ M_n(R). Define a linear map L: M_{m,n}(R) → M_{m,n}(R) by L(X) = A₁X − XA₂, X ∈ M_{m,n}(R). Prove that a necessary and sufficient condition that L be nonsingular is that A₁ and A₂ have no eigenvalues in common. Also show that

    det L = Π_{i=1}^{m} Π_{j=1}^{n} (α_i − β_j),

where the α_i are the eigenvalues of A₁ and the β_j are the eigenvalues of A₂. Hint: The matrix representation of L with respect to the basis {E_ij}, ordered lexicographically in the pairs (i,j), is

    A₁ ⊗ I_n − I_m ⊗ A₂^T.

The eigenvalues of A₁ ⊗ I_n are the eigenvalues of A₁ listed n times, and similarly the eigenvalues of I_m ⊗ A₂^T are the eigenvalues of A₂^T (i.e., the eigenvalues of A₂) listed m times. Now A₁ ⊗ I_n and I_m ⊗ A₂^T commute, and hence (why?) the eigenvalues of L are {α_i − β_j | 1 ≤ i ≤ m, 1 ≤ j ≤ n}. Thus

    det L = Π_{i=1}^{m} Π_{j=1}^{n} (α_i − β_j).

Hence L is nonsingular iff det L ≠ 0, iff A₁ and A₂ have no eigenvalues in common.

39. Let A ∈ M_n(C), and consider the differential equation

    ẋ(t) = Ax(t) + f(t),    x(t₀) = c,

i.e.,

    ẋ_i(t) = Σ_{j=1}^{n} a_ij x_j(t) + f_i(t),    x_i(t₀) = c_i.

Show that

    x(t) = e^{(t−t₀)A} c + ∫_{t₀}^{t} e^{(t−s)A} f(s) ds.

Hint: We need only show that if

    x(t) = e^{(t−t₀)A} c + ∫_{t₀}^{t} e^{(t−s)A} f(s) ds,

then ẋ(t) = Ax(t) + f(t) and x(t₀) = c. But

    ẋ(t) = Ae^{(t−t₀)A} c + (d/dt)[ ∫_{t₀}^{t} e^{(t−s)A} f(s) ds ]
         = Ae^{(t−t₀)A} c + (d/dt)[ e^{tA} ∫_{t₀}^{t} e^{−sA} f(s) ds ]
         = Ae^{(t−t₀)A} c + Ae^{tA} ∫_{t₀}^{t} e^{−sA} f(s) ds + e^{tA}(e^{−tA} f(t))
         = Ax(t) + f(t),

and x(t₀) = e⁰c = c.

40. Prove that if A,B ∈ M_n(C) and e^{λ(A+B)} = e^{λA}e^{λB} for all λ ∈ C, then AB = BA. Hint: If e^{λ(A+B)} = e^{λA}e^{λB} for all λ ∈ C, then

    (d/dλ) e^{λ(A+B)} = (A + B)e^{λ(A+B)} = Ae^{λA}e^{λB} + e^{λA}Be^{λB}

and

    (d²/dλ²) e^{λ(A+B)} = (A + B)²e^{λ(A+B)} = A²e^{λA}e^{λB} + Ae^{λA}Be^{λB} + Ae^{λA}Be^{λB} + e^{λA}B²e^{λB}.

Evaluating at λ = 0, we have

    (A + B)² = A² + 2AB + B²

or

    A² + AB + BA + B² = A² + AB + AB + B².

Hence

    AB = BA.
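Returning to Exercise 38, its determinant formula can be spot-checked numerically: build the matrix A₁ ⊗ I_n − I_m ⊗ A₂^T of L in the basis {E_ij} and compare det L with Π(α_i − β_j). NumPy assumed; the matrices are random choices.

```python
import numpy as np

rng = np.random.default_rng(4)
m, n = 3, 2
A1 = rng.normal(size=(m, m))
A2 = rng.normal(size=(n, n))

# Matrix of L(X) = A1 X - X A2 on row-major vec(X): A1 (x) I_n - I_m (x) A2^T.
L = np.kron(A1, np.eye(n)) - np.kron(np.eye(m), A2.T)

alphas = np.linalg.eigvals(A1)
betas = np.linalg.eigvals(A2)
product = np.prod([a - b for a in alphas for b in betas])

# det L = prod over i,j of (alpha_i - beta_j); the imaginary parts cancel.
assert np.isclose(np.linalg.det(L), product.real)
```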

41. Let A be an n-square upper triangular matrix, and let a be a characteristic root of multiplicity r ≥ 2. Assume that

    A[i, i+1 | i, i+1] = [ a  x ]
                         [ 0  a ],    x ≠ 0.

Prove that λI_n − A has an elementary divisor of the form (λ − a)^e, e ≥ 2. Hint: λI_n − A is upper triangular, and so

    det(λI_n − A) = (λ − a)^r p(λ),

where g.c.d.(λ − a, p(λ)) = 1 and r ≥ 2. Observe that (λI_n − A)(i + 1 | i) is upper triangular, and

    d_{n−1}(λ) | det(λI_n − A)(i + 1 | i) = −x(λ − a)^{r−2} p(λ),

so at most (λ − a)^{r−2} divides d_{n−1}(λ). Hence

    (λ − a)² | q_n(λ) = d_n(λ)/d_{n−1}(λ).

42. Let f(λ) and g(λ) be monic, g.c.d.(f(λ), g(λ)) = 1 in R[λ]. Prove that

    C(f(λ)g(λ)) ≅ C(f(λ)) ∔ C(g(λ)).

Hint: The list of elementary divisors of the characteristic matrix of C(f(λ)g(λ)) is obtained by factoring its single nontrivial invariant factor, f(λ)g(λ). The list of elementary divisors of the characteristic matrix of C(f(λ)) ∔ C(g(λ)) is obtained by putting together the lists of elementary divisors of the characteristic matrices of C(f(λ)) and C(g(λ)), i.e., by listing the factors of f(λ) and those of g(λ). Since g.c.d.(f(λ), g(λ)) = 1, these lists are the same.

43. Discuss the elementary divisors over Q[λ] of the characteristic matrix of a permutation matrix A(σ), σ ∈ S_n. Hint: Let σ ∈ S_n be a permutation with cycle structure [1^{λ₁}, 2^{λ₂}, . . . , n^{λ_n}], i.e., σ is a product of λ₁ 1-cycles, λ₂ 2-cycles, . . . , λ_n n-cycles, all of these cycles being disjoint (some of the λ_i are 0). If τ ∈ S_n is any other permutation with the same cycle structure as σ, then there exists θ ∈ S_n such that τ = θσθ⁻¹. Thus

    A(τ) = A(θσθ⁻¹) = A(θ)A(σ)A(θ⁻¹) = A(θ)A(σ)A(θ)⁻¹.

This means that permutation matrices corresponding to permutations with the same cycle structure are similar. Hence their characteristic matrices have the same elementary divisors. Consider the following permutation τ with the same cycle structure as σ:

    τ = (1)(2) ··· (λ₁)(λ₁+1, λ₁+2)(λ₁+3, λ₁+4) ··· .

Then A(τ) is a direct sum of λ₁ 1 × 1 blocks, λ₂ 2 × 2 blocks, . . . , λ_n n × n blocks, where the typical i × i block is the full-cycle permutation matrix

    A_i = [ 0           1 ]
          [ 1  0          ]
          [    1  0       ]
          [       ·  ·    ]
          [          1  0 ],

i.e., A_i = C(λ^i − 1), the companion matrix of λ^i − 1. The complete list of elementary divisors of λI − A(τ) is the combined lists of elementary divisors of the characteristic matrices of the direct summands. However, λI_i − A_i = λI_i − C(λ^i − 1) has as its only invariant factor λ^i − 1. Thus the elementary divisors of λI_i − A_i are the prime power factors of λ^i − 1 over Q. But

    λ^i − 1 = Π_{d|i} Φ_d(λ),

where Φ_d is the dth cyclotomic polynomial; i.e., the Φ_d(λ) are the irreducible factors of λ^i − 1 over Q[λ]. Therefore the elementary divisors of λI_i − A_i are the cyclotomic polynomials Φ_d(λ), where d | i.
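The computation in Exercise 43 rests on the fact that the permutation matrix of a single i-cycle is the companion matrix of λ^i − 1. A quick check (NumPy assumed; `cycle_matrix` is an illustrative helper):

```python
import numpy as np

def cycle_matrix(i):
    # Permutation matrix of a single i-cycle: the companion matrix of lam^i - 1.
    P = np.zeros((i, i))
    for k in range(i):
        P[(k + 1) % i, k] = 1.0
    return P

for i in (2, 3, 4, 6):
    P = cycle_matrix(i)
    coeffs = np.poly(P)                      # characteristic polynomial, leading coeff first
    expected = np.zeros(i + 1)
    expected[0], expected[-1] = 1.0, -1.0    # lam^i - 1
    assert np.allclose(coeffs, expected)
    assert np.allclose(np.linalg.matrix_power(P, i), np.eye(i))
```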

44. If A(σ) and A(φ) are similar permutation matrices over Q, prove that there exists θ ∈ S_n such that A(θ⁻¹σθ) = A(φ).

45. Let X be an n-square matrix of functions of t which satisfies

    Ẋ(t) = AX + XB,    X(0) = C.

Express the solution in terms of A, B, C. Hint: Try e^{tA}Ce^{tB}. Remember B commutes with any function evaluated at B.

46. (Hard!) Let A and B be hermitian matrices. Prove

    tr(e^{A+B}) ≤ tr(e^A e^B).

(Note that e^A, e^B are positive definite hermitian matrices.)
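The candidate solution of Exercise 45 can be tested by a centered finite difference (NumPy assumed; `expm` is the Taylor-series helper used earlier, and the matrices are random choices):

```python
import numpy as np

def expm(M, terms=60):
    # Taylor series e^M; adequate for the small matrices used here.
    result = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        result += term
    return result

rng = np.random.default_rng(5)
n = 3
A = rng.normal(size=(n, n))
B = rng.normal(size=(n, n))
C = rng.normal(size=(n, n))

def X(t):
    # Candidate solution of Exercise 45: X(t) = e^{tA} C e^{tB}.
    return expm(t * A) @ C @ expm(t * B)

t, h = 0.7, 1e-5
deriv = (X(t + h) - X(t - h)) / (2 * h)      # centered difference for X'(t)
assert np.allclose(X(0.0), C)
assert np.allclose(deriv, A @ X(t) + X(t) @ B, atol=1e-4)
```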

47. If A ∈ M_n(R) and Z_ij are the component matrices of A, show that Σ_{i=1}^{s} Z_i1 = I_n. Hint: Let Z_ij(A) be the i,j component matrix of A, and Z_ij(J_A) be the i,j component matrix of the Jordan form of A. Observe that

    Z_ij(A) = f_ij(A) = f_ij(S J_A S⁻¹) = S f_ij(J_A) S⁻¹ = S(Z_ij(J_A))S⁻¹,

where Z_ij(J_A) is a direct sum of zero blocks [corresponding to the elementary divisors (λ − a_k)^e, k ≠ i, since f_ij^{(t)}(a_k) = 0 when 0 ≤ t ≤ e − 1 and k ≠ i] together with nonzero blocks corresponding to the elementary divisors (λ − a_i)^e. In each of these nonzero blocks there is exactly one nonzero stripe, i.e., 1/(j − 1)! appears as an entry in the positions (k, j + k − 1), k = 1, . . . , e − j + 1. This follows directly from the definition of a function defined at a matrix, and from the definition of f_ij. Hence Z_i1(J_A) has identity subblocks corresponding to elementary divisors involving (λ − a_i)^e, with zero blocks elsewhere. So, clearly,

    Σ_{i=1}^{s} Z_i1(J_A) = I_n,

since the blocks are disjoint. But then

    Σ_{i=1}^{s} Z_i1(A) = S(Σ_{i=1}^{s} Z_i1(J_A))S⁻¹ = S I_n S⁻¹ = I_n.

48. Show that

    Z_ij Z_lk = 0    for i ≠ l.

Hint: In the notation of Exercise 47 we first consider Z_ij(J_A)Z_lk(J_A), i ≠ l. Z_ij(J_A) and Z_lk(J_A) are block diagonal matrices; so Z_ij(J_A)Z_lk(J_A) is also block diagonal, where each diagonal block of the product is the product of the corresponding diagonal blocks of Z_ij(J_A) and Z_lk(J_A). Thus a diagonal block of Z_ij(J_A)Z_lk(J_A) can be nonzero only if both corresponding diagonal blocks are nonzero. This is impossible if i ≠ l, since the only nonzero blocks of Z_ij(J_A) are those corresponding to elementary divisors of the form (λ − a_i)^e, and the only nonzero blocks of Z_lk(J_A) are those corresponding to elementary divisors of the form (λ − a_l)^e. Therefore Z_ij(J_A)Z_lk(J_A) = 0. Now, for some nonsingular S,

    Z_ij(A)Z_lk(A) = S Z_ij(J_A) S⁻¹ S Z_lk(J_A) S⁻¹ = S Z_ij(J_A)Z_lk(J_A) S⁻¹ = 0.


49. Show that Z_i1² = Z_i1. Hint: First, observe that

    (Z_i1(A))² = (S Z_i1(J_A) S⁻¹)² = S(Z_i1(J_A))²S⁻¹.

But Z_i1(J_A) is the direct sum of zero blocks and identity blocks, and so (Z_i1(J_A))² = Z_i1(J_A).

50. Show that Z_k1 Z_kr = Z_kr. Hint: By block multiplication, Z_k1(J_A)Z_kr(J_A) = Z_kr(J_A), since the nonzero blocks in Z_k1(J_A) are identity matrices. But then

    Z_k1(A)Z_kr(A) = S(Z_k1(J_A))S⁻¹S(Z_kr(J_A))S⁻¹ = S(Z_k1(J_A)Z_kr(J_A))S⁻¹ = S(Z_kr(J_A))S⁻¹ = Z_kr(A).

51. Show that

    Z_ij = (1/(j − 1)!)(A − a_i I_n)^{j−1} Z_i1.

Hint: Again, in the notation of Exercise 47, we consider Z_ij(J_A). Observe that J_A − a_i I_n is a block diagonal matrix with U_e as the block corresponding to the elementary divisor (λ − a_i)^e. Thus in (1/(j − 1)!)(J_A − a_i I_n)^{j−1} the block corresponding to (λ − a_i)^e is

    (1/(j − 1)!) U_e^{j−1},

a matrix whose only nonzero entries form the single stripe of 1/(j − 1)!'s along the (j − 1)st superdiagonal. Now Z_i1(J_A) has I_e as the block corresponding to (λ − a_i)^e, and 0 blocks elsewhere. Thus, by block multiplication, (1/(j − 1)!)(J_A − a_i I_n)^{j−1} Z_i1(J_A) has (1/(j − 1)!)U_e^{j−1} as the block corresponding to (λ − a_i)^e, and 0 blocks elsewhere. Therefore

    (1/(j − 1)!)(J_A − a_i I_n)^{j−1} Z_i1(J_A) = Z_ij(J_A).

Finally,

    Z_ij(A) = S Z_ij(J_A) S⁻¹
            = S[(1/(j − 1)!)(J_A − a_i I_n)^{j−1} Z_i1(J_A)]S⁻¹
            = (1/(j − 1)!)(S J_A S⁻¹ − a_i I_n)^{j−1}(S Z_i1(J_A) S⁻¹)
            = (1/(j − 1)!)(A − a_i I_n)^{j−1} Z_i1(A).


Glossary 4.3

auxiliary unit matrix, 367; U_n, 367
Cayley–Hamilton theorem, 363
characteristic matrix of A, 355
characteristic polynomial of A, 355
characteristic roots of A, 355
companion matrix, 358; C(g(λ)), 358
component matrices, 377; Z_ij, 377
cyclic invariant subspace, 353
cyclic invariant subspace decomposition, 352
cyclotomic polynomial, 393; Φ_d(λ), 393
derivative of a matrix, 379; Ẋ(t), 379
direct sum of transformations, 363; T₁ ∔ T₂, 363
eigenvalues, 355; e.v., 355
f is defined at A, 376
Frobenius normal form, 360
hypercompanion matrix, 367; H(g(λ)), 367
invariant subspace under T, 359
irreducible matrix, 364
irreducible transformation, 363
Jordan normal form, 368
Kronecker product, 390; A ⊗ B, 390
linear extension, 351
linear transformation, 351; l.t., 351
matrix convergence, 386
matrix representation of T, 353; C = [T]_E^F, 353
minimal polynomial, 361
reducible matrix, 364
reducible transformation, 363
scalar product, 351; rT, 351
sequence of matrices, 386
similar matrices, 354; A ≅ B, 355
similarity classes, 381
splitting of an elementary divisor, 375
unitary matrix, 385; V*, 385
value of f at A, 376

4.4  Introduction to Multilinear Algebra

Let V₁, . . . , V_m and U be vector spaces over the same field R. A function φ: ⨉_{i=1}^{m} V_i → U is said to be multilinear if it is linear separately in each vector variable. This means that

    φ(. . . , cv_i + dw_i, . . .) = cφ(. . . , v_i, . . .) + dφ(. . . , w_i, . . .)    (1)

for all vectors v_i, w_i ∈ V_i, i = 1, . . . , m, and all scalars c,d ∈ R. The dots indicate the same fixed but unspecified vectors on both sides of (1). Multilinear algebra is concerned with the study of such multilinear functions.

For example, if V₁ = M_{1,m}(R), V₂ = M_{1,n}(R), and U = M_{m,n}(R), define

    φ: V₁ × V₂ → U

by

    φ(v₁, v₂) = [ξ_i η_j] = v₁^T v₂,

where v₁ = (ξ₁, . . . , ξ_m) and v₂ = (η₁, . . . , η_n). Observe that φ is linear separately in v₁ and v₂; also im φ consists of all rank 1 matrices in M_{m,n}(R) together with the 0 matrix, so that ⟨im φ⟩ = U. However, im φ itself is not a vector space unless m = 1 or n = 1 (why?).

Sometimes problems of symmetry are important. For example, if

    φ: R^m × ··· × R^m → R

is defined by

    φ(v₁, . . . , v_m) = det(v₁, . . . , v_m),

then for any σ ∈ S_m,

    φ(v_{σ(1)}, . . . , v_{σ(m)}) = det(v_{σ(1)}, . . . , v_{σ(m)}) = ε(σ) det(v₁, . . . , v_m) = ε(σ)φ(v₁, . . . , v_m).

This situation is described by saying that φ is "symmetric with respect to the group S_m and the character ε." For the purposes of this introduction we shall only discuss multilinear functions φ defined on the cartesian product of a finite dimensional vector space V with itself m times, with values in some vector space U, i.e., φ: ⨉₁^m V → U.

The theory, however, is easily extended to multilinear functions defined on cartesian products of arbitrary finite dimensional vector spaces. The underlying field R is always assumed to have characteristic 0. The initial problem is to show how the study of multilinear functions can be referred to the simpler study of linear functions. The concept required for this is that of the tensor product.

Definition 4.1  (Tensor Product)  Let V be a vector space over R. The pair (P, ν) is called a tensor product of V with itself m times if the following two conditions are satisfied.

(i) The map ν: ⨉₁^m V → P is a multilinear function with values in the vector space P over R such that ⟨im ν⟩ = P.    (2)

(ii) (Universal Factorization Property) If U is any vector space over R and φ: ⨉₁^m V → U is any multilinear function, then there exists a linear function h: P → U such that

398

Modules and Linear Operators

¢ = hv, ¢(v,, . . . , v") = hv(vl, . . . , v")

i.e.,

for all vectors v1, . . . , v”' E V. Thus the following diagram is required to be commutative :

TV__E__. 1

Our first and most important problem is to show that a tensor product of V with itself m times always exists. To do this we let M_m(V:U) denote the totality of multilinear functions φ: ×_1^m V → U. It is trivial to verify that M_m(V:U) is a vector space over R with addition and scalar multiplication defined by

    (φ + θ)(v_1, . . . , v_m) = φ(v_1, . . . , v_m) + θ(v_1, . . . , v_m),
    (cφ)(v_1, . . . , v_m) = cφ(v_1, . . . , v_m);

here φ, θ ∈ M_m(V:U), v_i ∈ V for i = 1, . . . , m, and c ∈ R.

Theorem 4.1  The vector space M_m(V:U) is finite dimensional with

    dim M_m(V:U) = (dim V)^m dim U.

Proof: We remind the reader that Γ_m(n) denotes the totality of n^m sequences of length m of integers chosen from {1, 2, . . . , n} = [1, n]:

    Γ_m(n) = {ω | ω: [1, m] → [1, n]}

[see Section 3.2, formula (23)]. Suppose f_1, . . . , f_m are elements of V* = {f: V → R | f is linear}, the dual space of V; thus each f_j: V → R is a linear transformation, or linear functional, on V to R. Then their product

    φ = f_1 · · · f_m    (3)

is the element of M_m(V:R) defined by

    φ(v_1, . . . , v_m) = f_1(v_1) · · · f_m(v_m),   v_i ∈ V, i = 1, . . . , m.

Now let E = {e_1, . . . , e_n} be a basis of V and W = {w_1, . . . , w_p} a basis of U. Let {f_1, . . . , f_n} be the basis of V* dual to E, so that f_i(e_j) = δ_{ij}, i, j = 1, . . . , n (see Exercise 1). For each α ∈ Γ_m(n) and 1 ≤ j ≤ p, define φ_{α,j}: ×_1^m V → U by

    φ_{α,j}(v_1, . . . , v_m) = ( ∏_{k=1}^m f_{α(k)}(v_k) ) w_j.    (4)

It is easy to see that φ_{α,j} ∈ M_m(V:U). We claim that {φ_{α,j} | α ∈ Γ_m(n), j = 1, . . . , p} is a basis of M_m(V:U). First, the φ_{α,j} are linearly independent. For if c_{α,j} ∈ R, α ∈ Γ_m(n), j = 1, . . . , p, are scalars such that

    Σ_{α∈Γ_m(n)} Σ_{j=1}^p c_{α,j} φ_{α,j} = 0,

then for β ∈ Γ_m(n) we have

    0 = Σ_{α,j} c_{α,j} φ_{α,j}(e_{β(1)}, . . . , e_{β(m)})
      = Σ_{α,j} c_{α,j} [ ∏_{k=1}^m f_{α(k)}(e_{β(k)}) ] w_j
      = Σ_{α,j} c_{α,j} δ_{αβ} w_j
      = Σ_{j=1}^p c_{β,j} w_j.    (5)

The linear independence of the w_j now implies that c_{β,j} = 0, j = 1, . . . , p.

Finally, if φ ∈ M_m(V:U), then for each γ ∈ Γ_m(n) there exist scalars c_{γ,j} ∈ R, j = 1, . . . , p, such that

    φ(e_γ) = φ(e_{γ(1)}, . . . , e_{γ(m)}) = Σ_{j=1}^p c_{γ,j} w_j.    (6)

Let

    θ = Σ_{α∈Γ_m(n)} Σ_{j=1}^p c_{α,j} φ_{α,j}

and note, as in the calculation (5), that

    θ(e_γ) = Σ_{j=1}^p c_{γ,j} w_j = φ(e_γ).

Since this holds for arbitrary γ ∈ Γ_m(n) and φ and θ are multilinear, it follows that φ = θ. Thus {φ_{α,j} | α ∈ Γ_m(n), j = 1, . . . , p} is a basis of M_m(V:U). It is said to be dual to the bases E and W. The facts that |Γ_m(n)| = n^m and dim U = p yield the result. ∎

Note that in the event U = R, Theorem 4.1 states that

    dim M_m(V:R) = (dim V)^m.

Let

    P = M_m(V:R)*

be the dual space of M_m(V:R), and let v ∈ M_m(V:P) be defined by

    v(v_1, . . . , v_m)(φ) = φ(v_1, . . . , v_m),   v_1, . . . , v_m ∈ V,    (7)

for any φ ∈ M_m(V:R). The equation (7) makes sense: im v ⊂ P means that v(v_1, . . . , v_m) must be a linear functional on M_m(V:R), i.e., the value v(v_1, . . . , v_m)(φ) must be specified for each φ ∈ M_m(V:R).

Theorem 4.2  The pair (P, v) is a tensor product of V with itself m times.

Proof: We must verify that (P, v) satisfies conditions (i) and (ii) in Definition 4.1. Let E = {e_1, . . . , e_n} be a basis of V, and let {f_1, . . . , f_n} be a basis of V* dual to E. We know from the proof of Theorem 4.1 that

    {φ_α = f_{α(1)} · · · f_{α(m)} | α ∈ Γ_m(n)}

is a basis of M_m(V:R) and

    φ_α(e_β) = φ_α(e_{β(1)}, . . . , e_{β(m)}) = δ_{αβ},   α, β ∈ Γ_m(n).    (8)

Then the definition (7) of v ∈ M_m(V:P) yields

    v(e_β)(φ_α) = v(e_{β(1)}, . . . , e_{β(m)})(φ_α) = δ_{αβ},   α, β ∈ Γ_m(n).    (9)

Thus the linear functionals v(e_β) ∈ P = M_m(V:R)* form a basis of P dual to the basis {φ_α | α ∈ Γ_m(n)} of M_m(V:R) (see Exercise 1); in particular, ⟨im v⟩ = P.

To confirm that the universal factorization property holds, we must show that given φ ∈ M_m(V:U), the commutative diagram in Definition 4.1(ii) can be completed with a linear h. For this purpose simply define h: P → U on the basis {v(e_α) | α ∈ Γ_m(n)} of P by

    h(v(e_α)) = φ(e_α) = φ(e_{α(1)}, . . . , e_{α(m)}),   α ∈ Γ_m(n),

and extend h linearly to all of P. Then if v_t = Σ_{j=1}^n ξ_{tj} e_j ∈ V, t = 1, . . . , m, we have

    h(v(v_1, . . . , v_m)) = h( Σ_{γ∈Γ_m(n)} ξ_{1γ(1)} · · · ξ_{mγ(m)} v(e_γ) )
                           = Σ_{γ∈Γ_m(n)} ( ∏_{t=1}^m ξ_{tγ(t)} ) h(v(e_γ))
                           = Σ_{γ∈Γ_m(n)} ( ∏_{t=1}^m ξ_{tγ(t)} ) φ(e_{γ(1)}, . . . , e_{γ(m)})
                           = φ( Σ_{j=1}^n ξ_{1j} e_j , . . . , Σ_{j=1}^n ξ_{mj} e_j )
                           = φ(v_1, . . . , v_m).

Thus hv = φ, and the proof is complete. ∎

It should be noted that if (P, v) and (Q, μ) are both tensor products of V with itself m times, then there exists precisely one linear bijection h: P → Q such that

    hv = μ.    (10)

Indeed, since (P, v) and (Q, μ) both have the universal factorization property, we obtain linear maps h: P → Q with hv = μ and k: Q → P with kμ = v. Then hv = μ and kμ = v imply khv = v. Since ⟨im v⟩ = P, it follows that kh = I_P. Similarly hk = I_Q, so that h is a bijection. Finally, if h_1: P → Q and h_1v = μ = hv, then ⟨im v⟩ = P implies that h_1 = h. The same kind of argument shows that the linear map h in the diagram of Definition 4.1(ii) is uniquely determined by φ: We have hv = φ and the values of v span P.

In the definition of a tensor product the assumption is made that v ∈ M_m(V:P) satisfies ⟨im v⟩ = P. It seems to be a very difficult question to characterize those φ ∈ M_m(V:U) for which im φ = U. In other words, when is the range of a multilinear function a vector space? This problem is in sharp contrast to the situation for linear maps; the range of a linear map is always a vector space.

As an example of a tensor product we observe that the space M_n(R) of n × n matrices over R is a tensor product of M_{1,n}(R) with itself. Define

    v: M_{1,n}(R) × M_{1,n}(R) → M_n(R)

by

    v(ξ, η) = [ξ_i η_j] = ξ^T η,

where ξ = [ξ_1, . . . , ξ_n] and η = [η_1, . . . , η_n]. Notice that if e_i = [δ_{i1}, . . . , δ_{in}], i = 1, . . . , n, then

    {v(e_i, e_j) | i, j = 1, . . . , n}

is a basis for M_n(R). Clearly v is bilinear and ⟨im v⟩ = M_n(R). Also, if φ: M_{1,n}(R) × M_{1,n}(R) → U is any bilinear function, define h: M_n(R) → U by

    h(v(e_i, e_j)) = φ(e_i, e_j)    (11)

and linear extension. We have for all ξ, η ∈ M_{1,n}(R) that

    h(v(ξ, η)) = h( Σ_{i,j=1}^n ξ_i η_j v(e_i, e_j) )
               = Σ_{i,j=1}^n ξ_i η_j h(v(e_i, e_j))
               = Σ_{i,j=1}^n ξ_i η_j φ(e_i, e_j)
               = φ(ξ, η);

thus hv = φ.
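The factorization just carried out can be checked numerically; a sketch, not from the text, under the assumption U = R with φ the dot product (a bilinear function). Then the linear extension of h(v(e_i, e_j)) = φ(e_i, e_j) = δ_{ij} is the trace.

```python
# Sketch (not from the text): for the bilinear dot product phi, the linear
# map h on M_n(R) determined by h(v(e_i, e_j)) = phi(e_i, e_j) = delta_ij
# is the trace, and indeed h(v(xi, eta)) = phi(xi, eta).

def v(xi, eta):                      # v(xi, eta) = xi^T eta
    return [[x * y for y in eta] for x in xi]

def phi(xi, eta):                    # a bilinear function into U = R
    return sum(x * y for x, y in zip(xi, eta))

def h(A):                            # linear extension of h(v(e_i, e_j)) = delta_ij
    return sum(A[i][i] for i in range(len(A)))

xi, eta = [1, 2, 3], [4, 5, 6]
assert h(v(xi, eta)) == phi(xi, eta) == 32
```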

In view of the uniqueness properties of a tensor product (P, v) of V with itself m times (see the remarks following the proof of Theorem 4.2), it is customary to write

    P = V ⊗ · · · ⊗ V = ⊗_1^m V   (m factors)

and

    v(v_1, . . . , v_m) = v_1 ⊗ · · · ⊗ v_m,   v_i ∈ V, i = 1, . . . , m.

An element in ⊗_1^m V of the form v_1 ⊗ · · · ⊗ v_m is called decomposable. Observe, for example, that not every element in M_n(R) = M_{1,n}(R) ⊗ M_{1,n}(R) is decomposable, i.e., not every n × n matrix is of the form [ξ_i η_j] for ξ = [ξ_1, . . . , ξ_n], η = [η_1, . . . , η_n] ∈ M_{1,n}(R) (the matrices of this form are precisely the rank 1 matrices together with the zero matrix).

It follows from Theorem 4.1, Theorem 4.2, and Exercise 1 that if dim V = n, then dim ⊗_1^m V = n^m. Moreover, if {e_1, . . . , e_n} is a basis of V and {f_1, . . . , f_n} is a dual basis of V*, then the equation (8) states that

    {φ_α = f_{α(1)} · · · f_{α(m)} | α ∈ Γ_m(n)}   and   {e_{α(1)} ⊗ · · · ⊗ e_{α(m)} | α ∈ Γ_m(n)}    (12)

are dual bases of M_m(V:R) and ⊗_1^m V, respectively. We abbreviate the notation for the elements in (12) to

    φ_α   and   e_α^⊗,   α ∈ Γ_m(n).

In general, if v_1, . . . , v_m belong to V we will sometimes write v^⊗ for v_1 ⊗ · · · ⊗ v_m.

If V is a unitary space, there is a natural way of defining an inner product in ⊗_1^m V which makes the tensor product into a unitary space. Let {e_1, . . . , e_n} be an orthonormal basis of V; then the decomposable tensors e_α^⊗, α ∈ Γ_m(n), comprise a basis of ⊗_1^m V. For arbitrary z = Σ_α c_α e_α^⊗ and w = Σ_α d_α e_α^⊗ in ⊗_1^m V, define

    (z, w) = Σ_{α∈Γ_m(n)} c_α d̄_α.    (13)

It is obvious that the definition (13) produces an inner product in ⊗_1^m V, called the induced inner product. The only question is whether this inner product is independent of the orthonormal basis of V used to define it. This is resolved in the following theorem.

Theorem 4.3  Let u_i, v_i ∈ V, i = 1, . . . , m. Then if (·, ·) denotes the inner product in both V and ⊗_1^m V, we have

    (u^⊗, v^⊗) = ∏_{i=1}^m (u_i, v_i).    (14)

In particular, the right side of (14) is independent of the orthonormal basis {e_1, . . . , e_n} of V used to define the inner product in (13). Thus, since the decomposable elements span ⊗_1^m V, the definition (13) is independent of {e_1, . . . , e_n}.

Proof: Formula (13) implies that (e_α^⊗, e_β^⊗) = δ_{αβ} for α, β ∈ Γ_m(n). Setting u_i = Σ_{j=1}^n ξ_{ij} e_j and v_i = Σ_{j=1}^n η_{ij} e_j, i = 1, . . . , m, we compute that

    u^⊗ = Σ_{ω∈Γ_m(n)} ξ_ω e_ω^⊗,   ξ_ω = ∏_{t=1}^m ξ_{tω(t)},
    v^⊗ = Σ_{ω∈Γ_m(n)} η_ω e_ω^⊗,   η_ω = ∏_{t=1}^m η_{tω(t)},

and hence

    (u^⊗, v^⊗) = Σ_{ω∈Γ_m(n)} ξ_ω η̄_ω = Σ_{ω∈Γ_m(n)} ∏_{t=1}^m ξ_{tω(t)} η̄_{tω(t)}
               = ∏_{t=1}^m Σ_{j=1}^n ξ_{tj} η̄_{tj} = ∏_{t=1}^m (u_t, v_t). ∎
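Theorem 4.3 is easy to test for m = 2; a sketch (not from the text) identifying ⊗²R^n with n × n matrices via the outer product, so that the induced inner product (13) in the standard basis is the entrywise sum.

```python
# Sketch (not from the text), m = 2, V = R^n with the standard inner
# product: u1 (x) u2 is the matrix [u1_i * u2_j]; with respect to the
# orthonormal basis {e_i (x) e_j} the induced inner product (13) is the
# entrywise sum, and Theorem 4.3 says it equals (u1, v1)(u2, v2).

def outer(u1, u2):
    return [[x * y for y in u2] for x in u1]

def induced(Z, W):                   # (z, w) = sum of c_ab * d_ab
    return sum(z * w for rz, rw in zip(Z, W) for z, w in zip(rz, rw))

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

u1, u2 = [1, 2], [3, -1]
v1, v2 = [0, 5], [2, 2]
assert induced(outer(u1, u2), outer(v1, v2)) == dot(u1, v1) * dot(u2, v2)
```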

As an example of this construction consider M_n(C) as the tensor product

    M_n(C) = M_{1,n}(C) ⊗ M_{1,n}(C)

with

    ξ ⊗ η = [ξ_i η_j],   ξ = [ξ_1, . . . , ξ_n], η = [η_1, . . . , η_n] ∈ M_{1,n}(C).

Using the standard inner product (u, v) = Σ_{i=1}^n u_i v̄_i in M_{1,n}(C), we see by Theorem 4.3 that

    (x ⊗ y, ξ ⊗ η) = (x, ξ)(y, η) = Σ_{i,k=1}^n x_i ξ̄_i y_k η̄_k.    (15)

On the other hand,

    tr((ξ ⊗ η)*(x ⊗ y)) = Σ_{i,k=1}^n ξ̄_k x_k η̄_i y_i = (x ⊗ y, ξ ⊗ η)   [from (15)].

Since any element in M_n(C) is a sum of decomposable elements, it follows that the induced inner product is the usual inner product in M_n(C) given by

    (A, B) = tr(B*A),   A, B ∈ M_n(C).
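The identity tr((ξ ⊗ η)*(x ⊗ y)) = (x, ξ)(y, η) can be spot-checked with complex entries; a sketch, not from the text:

```python
# Sketch (not from the text): for decomposables x (x) y and xi (x) eta in
# M_n(C), tr((xi (x) eta)* (x (x) y)) equals (x, xi)(y, eta), where
# (u, v) = sum u_i conj(v_i), matching formula (15).

def outer(a, b):
    return [[x * y for y in b] for x in a]

def adjoint(A):
    n = len(A)
    return [[A[j][i].conjugate() for j in range(n)] for i in range(n)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def inner(u, v):
    return sum(a * b.conjugate() for a, b in zip(u, v))

x, y = [1 + 1j, 2], [0, 3 - 1j]
xi, eta = [2, 1j], [1, 1]
lhs = trace(matmul(adjoint(outer(xi, eta)), outer(x, y)))
rhs = inner(x, xi) * inner(y, eta)
assert abs(lhs - rhs) < 1e-12
```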

Much of multilinear algebra is devoted to the study of multilinear functions which have certain symmetries. For example, the multilinear function φ ∈ M_n(M_{1,n}(R): R) defined by

    φ(v_1, . . . , v_n) = det[ξ_{ij}],   v_i = [ξ_{i1}, . . . , ξ_{in}] ∈ M_{1,n}(R), i = 1, . . . , n,

satisfies

    φ(v_{σ(1)}, . . . , v_{σ(n)}) = ε(σ) φ(v_1, . . . , v_n)

for all σ ∈ S_n. More generally, let H < S_m and let χ: H → R be a homomorphism of H into the multiplicative group of nonzero elements of R. Then φ ∈ M_m(V:U) is said to be symmetric with respect to H and χ if

    φ(v_{σ(1)}, . . . , v_{σ(m)}) = χ(σ) φ(v_1, . . . , v_m)

holds for all σ ∈ H and v_i ∈ V, i = 1, . . . , m. As another example,

φ ∈ M_2(M_{1,n}(R): M_n(R)) defined by

    φ(ξ, η) = [ξ_i η_j] − [η_i ξ_j] = ξ^T η − η^T ξ,   ξ = [ξ_1, . . . , ξ_n], η = [η_1, . . . , η_n] ∈ M_{1,n}(R),

is symmetric with respect to S_2 and ε. We now introduce the important concept of a symmetry class of tensors in terms of certain universal properties.

Definition 4.2  (Symmetry Class of Tensors)  Let V and P be vector spaces over R, H < S_m, and χ: H → R a homomorphism of H into the multiplicative group of nonzero elements of R. Assume that v ∈ M_m(V:P) is symmetric with respect to H and χ and satisfies the following two conditions:

(i) ⟨im v⟩ = P.
(ii) For any vector space U over R and any φ ∈ M_m(V:U), symmetric with respect to H and χ, there exists a linear function h: P → U which makes the diagram

    ×_1^m V --v--> P
          \        |
           φ       | h
            \      v
             '---> U

commutative, i.e.,

    φ = hv.    (16)

Then the pair (P, v) is called a symmetry class of tensors associated with H and χ.

It will be shown in Theorem 4.4 that a symmetry class of tensors associated with H and χ always exists. We need some preliminary definitions. Let σ ∈ S_m, and define φ ∈ M_m(V: ⊗_1^m V) by

    φ(v_1, . . . , v_m) = v_{σ⁻¹(1)} ⊗ · · · ⊗ v_{σ⁻¹(m)},   v_i ∈ V, i = 1, . . . , m.

By the universal factorization property there exists a unique linear function P(σ): ⊗_1^m V → ⊗_1^m V which makes the diagram commute. Thus

    P(σ) v_1 ⊗ · · · ⊗ v_m = v_{σ⁻¹(1)} ⊗ · · · ⊗ v_{σ⁻¹(m)},   v_i ∈ V, i = 1, . . . , m.

The linear transformation P(σ) is called a permutation operator on ⊗_1^m V. Any linear combination of permutation operators S = Σ_σ c_σ P(σ): ⊗_1^m V → ⊗_1^m V . . .

. . . χ: H → R. Then

    (x_1 ∗ · · · ∗ x_m , y_1 ∗ · · · ∗ y_m) = (1/|H|) Σ_{σ∈H} χ(σ) ∏_{i=1}^m (x_i, y_{σ(i)}).    (26)

Proof: Let

    S = (1/|H|) Σ_{σ∈H} χ(σ) P(σ)

be the symmetry operator on ⊗_1^m V defined by H and χ. As we saw in the proof of Theorem 4.4,

    S² = S.

Observe also that for any σ ∈ S_m,

    (P(σ)x^⊗, y^⊗) = ∏_{t=1}^m (x_{σ⁻¹(t)}, y_t) = ∏_{t=1}^m (x_t, y_{σ(t)}) = (x^⊗, P(σ⁻¹)y^⊗).

[Note: since this holds for any decomposable elements x^⊗, y^⊗ ∈ ⊗_1^m V, and since the decomposable elements span ⊗_1^m V, we conclude that (P(σ)u, v) = (u, P(σ⁻¹)v) for all u, v ∈ ⊗_1^m V. Thus

    P(σ)* = P(σ⁻¹),    (27)

where P(σ)* is the conjugate dual of P(σ) with respect to the induced inner product in ⊗_1^m V.] Hence

    (x_1 ∗ · · · ∗ x_m , y_1 ∗ · · · ∗ y_m) = (Sx^⊗, Sy^⊗) = (S*Sx^⊗, y^⊗) = (S²x^⊗, y^⊗) = (Sx^⊗, y^⊗)
        = (1/|H|) Σ_{σ∈H} χ(σ) (P(σ)x^⊗, y^⊗)
        = (1/|H|) Σ_{σ∈H} χ(σ) ∏_{t=1}^m (x_t, y_{σ(t)}). ∎

The formulas

    (x_1 ∧ · · · ∧ x_m , y_1 ∧ · · · ∧ y_m) = (1/m!) det[(x_i, y_j)]    (28)
    (x_1 · · · · · x_m , y_1 · · · · · y_m) = (1/m!) per[(x_i, y_j)]    (29)

are particular instances of (26). The permanent of an m × m matrix A = [a_{ij}] is defined by

    per A = Σ_{σ∈S_m} ∏_{i=1}^m a_{iσ(i)}.
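The permanent can be computed by brute force straight from this definition; the listing below is an illustrative sketch (not from the text) and is exponential in m, which is fine for small matrices.

```python
# Brute-force permanent straight from the definition:
# per A = sum over sigma in S_m of prod_i a_{i, sigma(i)}.
from itertools import permutations
from math import prod, factorial

def per(A):
    m = len(A)
    return sum(prod(A[i][s[i]] for i in range(m))
               for s in permutations(range(m)))

assert per([[1, 2], [3, 4]]) == 1 * 4 + 2 * 3   # all products enter with + sign
m = 3
assert per([[1] * m] * m) == factorial(m)       # per of the all-ones matrix is m!
```

Unlike the determinant, no fast general algorithm for the permanent is known; the definition itself is often the practical choice for small m.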

As an interesting application of the result (29), we can resolve the positive semidefinite case of the famous van der Waerden conjecture for doubly stochastic matrices. A doubly stochastic matrix is a matrix with nonnegative entries each of whose row and column sums is 1. In 1926 B. L. van der Waerden conjectured that if the matrix A is n-square doubly stochastic, then

    per A ≥ per J_n = n!/n^n,    (30)

where the n-square matrix J_n has every entry equal to 1/n. Despite efforts by many workers over the last half century, the only general case of (30) presently known is for A positive semidefinite hermitian. An excellent survey of the problem is found in the article by H. J. Ryser, "Permanents and Systems of Distinct Representatives," The University of North Carolina Monograph Series in Probability and Statistics (1969), pp. 55–70, The University of North Carolina Press, Chapel Hill.

Theorem 4.6  (van der Waerden Conjecture)  Let A be an n-square positive semidefinite hermitian doubly stochastic matrix. Then

    per A ≥ n!/n^n.

Proof:

We first show that the positive semidefinite determination of A^{1/2} has every row and column sum equal to 1. For as we know from Theorem 3.15, A^{1/2} is a polynomial in A:

    A^{1/2} = c_0 I_n + · · · + c_p A^p.

Therefore

    A^{1/2} J_n = J_n A^{1/2} = cJ_n,

where c = c_0 + · · · + c_p; the second equality follows since A is doubly stochastic and hence J_n A = A J_n = J_n. Now

    J_n = J_n A J_n = (J_n A^{1/2})(A^{1/2} J_n) = c² J_n,

so that c² = 1. On the other hand, J_n A^{1/2} J_n = cJ_n, and each entry of J_n A^{1/2} J_n is (1/n²)(A^{1/2}e, e) ≥ 0 for e = (1, . . . , 1), since A^{1/2} is positive semidefinite; hence c ≥ 0. Thus c = 1.

Denote by (·, ·) the standard inner product in V = C^n. From (29) with m = n and the Cauchy–Schwarz inequality we have, setting e_i = (δ_{i1}, . . . , δ_{in}), i = 1, . . . , n, that

    |per([(A^{1/2}e_i, J_n e_j)])|² = |n! (A^{1/2}e_1 · · · · · A^{1/2}e_n , J_n e_1 · · · · · J_n e_n)|²
                                    ≤ (n!)² ‖A^{1/2}e_1 · · · · · A^{1/2}e_n‖² ‖J_n e_1 · · · · · J_n e_n‖²
                                    = per([(A^{1/2}e_i, A^{1/2}e_j)]) per([(J_n e_i, J_n e_j)]).    (31)

Now (J_n e_i, J_n e_j) = 1/n for i, j = 1, . . . , n, since J_n e_i = (1/n)e where e = (1, . . . , 1). Also

    (A^{1/2}e_i, A^{1/2}e_j) = (Ae_i, e_j) = a_{ji},   i, j = 1, . . . , n,

and

    (A^{1/2}e_i, J_n e_j) = (J_n A^{1/2}e_i, e_j) = (J_n e_i, e_j) = 1/n   (since J_n A^{1/2} = J_n),   i, j = 1, . . . , n.

Thus (31) simplifies to

    (per J_n)² ≤ per(A^T) per J_n,

and per(A^T) = per A yields per A ≥ per J_n = n!/n^n. ∎
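The bound of Theorem 4.6 can be spot-checked numerically; a sketch (not from the text) using the family A = tI + (1 − t)J_n, which for 0 ≤ t ≤ 1 is doubly stochastic and positive semidefinite (its eigenvalues are 1 and t).

```python
# Sketch (not from the text): A = t*I + (1-t)*J_n is doubly stochastic and
# positive semidefinite for 0 <= t <= 1, and per A >= per J_n = n!/n^n.
from itertools import permutations
from math import prod, factorial

def per(A):
    n = len(A)
    return sum(prod(A[i][s[i]] for i in range(n))
               for s in permutations(range(n)))

n, t = 4, 0.5
J = [[1.0 / n] * n for _ in range(n)]
A = [[t * (i == j) + (1 - t) * J[i][j] for j in range(n)] for i in range(n)]

assert all(abs(sum(row) - 1) < 1e-12 for row in A)   # row sums are 1
bound = factorial(n) / n ** n                        # per J_n = n!/n^n
assert abs(per(J) - bound) < 1e-12
assert per(A) >= bound - 1e-12
```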

Exercises 4.4

1. Let V be a vector space over a field R, dim V = n, and let V* be the dual space of V, i.e., V* = Hom(V, R). Show that dim V = dim V*. Hint: Let {e_1, . . . , e_n} be a basis of V. For i = 1, . . . , n define f_i ∈ V* by f_i(e_j) = δ_{ij}, j = 1, . . . , n, and linear extension, i.e., f_i(Σ_{j=1}^n ξ_j e_j) = Σ_{j=1}^n ξ_j f_i(e_j) = ξ_i. Verify that {f_1, . . . , f_n} is a basis of V*. The basis {f_1, . . . , f_n} is said to be dual to {e_1, . . . , e_n}.

2. Let V be a vector space over a field R, dim V = n, and let {f_1, . . . , f_n} be a basis of V*. Show that there exists a basis {e_1, . . . , e_n} of V such that {f_1, . . . , f_n} is dual to {e_1, . . . , e_n}. Hint: Let {v_1, . . . , v_n} be any basis of V. We seek scalars ξ_{ij} ∈ R such that {e_i = Σ_{j=1}^n ξ_{ij} v_j | i = 1, . . . , n} is a basis of V dual to {f_1, . . . , f_n}. Let {g_1, . . . , g_n} be the basis of V* dual to {v_1, . . . , v_n} and write f_j = Σ_{k=1}^n η_{jk} g_k, j = 1, . . . , n. We want

    δ_{ij} = f_j(e_i) = Σ_k Σ_l ξ_{il} η_{jk} g_k(v_l) = Σ_{k=1}^n ξ_{ik} η_{jk} . . .

3. Let χ: G → C be a homomorphism of the finite group G into the group of nonzero complex numbers. Prove that for all σ ∈ G, |χ(σ)| = 1 and χ(σ⁻¹) = χ(σ)̄.

4. Show that the linear map h in the diagram in Definition 4.2(ii) is uniquely determined by φ.

5. Prove that if (P, v) and (Q, μ) are two symmetry classes of tensors associated with H and χ, then there exists precisely one linear bijection h: P → Q such that hv = μ.

6. Suppose dim V ≥ 4 and let e_1, e_2, e_3, e_4 be linearly independent elements in V. Show that e_1 ⊗ e_2 + e_3 ⊗ e_4 is not a decomposable element of V ⊗ V.

7. Let v_1, . . . , v_m ∈ V. Show that v_1 ⊗ · · · ⊗ v_m = 0 iff some v_i = 0.

8. Let v_i, u_i ∈ V, i = 1, . . . , m. Show that if v_1 ⊗ · · · ⊗ v_m ≠ 0, then v_1 ⊗ · · · ⊗ v_m = u_1 ⊗ · · · ⊗ u_m iff v_i = c_i u_i, i = 1, . . . , m, and ∏_{i=1}^m c_i = 1. Hint: Assume 0 ≠ v_1 ⊗ · · · ⊗ v_m = u_1 ⊗ · · · ⊗ u_m. Then by Exercise 7, each v_i and u_i is nonzero. For a fixed k, let f_k ∈ V* be arbitrary and choose f_i ∈ V*, i ≠ k, such that f_i(v_i) = 1. Then there exists a linear h: ⊗_1^m V → R such that . . .

If L: S → Hom(V, V) is a representation function, then V is called a representation module for S, or simply an S-module, and V is a proper representation module for S if im L ⊂ GL(V). A representation function L: S → M_n(R) is called a matrix representation (function) for S, and L is a proper matrix representation if im L ⊂ GL(n, R).

Example 1

(a) Let S = S_n be the symmetric group of degree n, and define L: S → GL(n, R) by

    L(σ) = A(σ) = [δ_{iσ(j)}],   σ ∈ S,

so that L(σ) is the permutation matrix associated with σ. Then L is a faithful proper matrix representation function for S (see Section 1.3, Exercise 1).

(b) Let V be a vector space of dimension n over a field R, and suppose E is a basis of V. Let S = Hom(V, V), and define

    L: S → M_n(R)

by

    L(T) = [T]_E^E,   T ∈ S.

Then L is a faithful matrix representation function for S (see Section 4.3, Theorem 3.2).

Representations of Groups

Recall that if R is a field and S is a groupoid, then R[S] is the totality of

functions f ∈ R^S which are nonzero for at most a finite number of elements of S. In Section 1.2, Example 9, we saw that operations of addition and multiplication can be defined in R[S] by the formulas

    (f + g)(s) = f(s) + g(s),   s ∈ S,    (2)

and

    (f · g)(s) = Σ_{rt=s} f(r)g(t),   s ∈ S,    (3)

where f, g ∈ R[S]. (If no pair r, t ∈ S exists for which rt = s, then (f · g)(s) = 0 by definition.) It was proved in the above cited reference that R[S] is a ring with respect to these operations, called the groupoid ring of S over R. If S is a semigroup (so that the operation in S is associative), then R[S] is an associative ring, called the semigroup ring of S over R. We can also define a scalar multiplication in R[S] by the obvious formula

    (rf)(s) = rf(s),   f ∈ R[S], r ∈ R;    (4)

it is then trivial to verify that R[S] is a linear algebra over R (see Example 3, Section 1.2), called the groupoid algebra of S over R. If S is a group, then R[S] is of course called the group algebra of S over R. For any s ∈ S, the indicator function f_s ∈ R[S] of s is defined by

    f_s(t) = δ_{st},   t ∈ S.    (5)

Theorem 1.1

If the groupoid S possesses an identity, then so does the groupoid algebra R[S]. If v: S → R[S] is defined by

    v(s) = f_s,   s ∈ S,    (6)

then v is a faithful representation of S in R[S] and

    R[S] = ⟨im v⟩.    (7)

Proof: For s, t, and x in S we have

    v(st)(x) = f_{st}(x) = δ_{st,x},

while

    (v(s)v(t))(x) = (f_s · f_t)(x) = Σ_{uw=x} f_s(u) f_t(w) . . .    (8)

. . . GL(n, R) is a proper matrix representation of S and we write L(s)v = sv, s ∈ S, v ∈ V, then

    [L(s)]_E^E = A(s)   for all s ∈ S,

and hence L(s) ∈ GL(V) for all s ∈ S, i.e., V is a proper representation module for S. ∎

We note that if V is a proper representation module for a groupoid S and S has an identity e, then

    ev = v,   v ∈ V.    (20)

For let L: S → GL(V) be the representation function. Then L(e) = L(e²) = L(ee) = L(e)L(e), and L(e) has an inverse, so that L(e) = I_V.

The following fundamental result shows that a representation module for a groupoid S can be regarded as a representation module for the groupoid algebra 𝔄 of S over R.

Theorem 1.3  Let S be a groupoid, and V an S-module. Let 𝔄 be the groupoid algebra of S over R. Then V is made into an 𝔄-module by setting

    fv = Σ_{s∈S} a_s (sv)   for each f = Σ_{s∈S} a_s s ∈ 𝔄.    (21)

5.1  Representations

Proof: Each f ∈ 𝔄 has a unique representation of the form f = Σ_{s∈S} a_s s (only a finite number of the a_s ∈ R are different from 0), and so the vector fv ∈ V is well-defined by (21). We must verify that

    (fg)v = f(gv),    (22)
    (f + g)v = fv + gv,    (23)
    (rf)v = r(fv),    (24)
    f(v + w) = fv + fw,    (25)
    f(rv) = r(fv)    (26)

for all f, g ∈ 𝔄, v, w ∈ V, and r ∈ R. If g = Σ_{s∈S} b_s s, we have

    f(gv) = Σ_{x∈S} a_x (x(gv)) = Σ_{x,y∈S} a_x b_y (x(yv)) = Σ_{x,y∈S} a_x b_y ((xy)v)   (V is an S-module)
          = Σ_{s∈S} ( Σ_{xy=s} a_x b_y ) (sv) = (fg)v . . .

. . . ρ: 𝔅 → Hom(𝔅, 𝔅), ρ(f)v = fv, f, v ∈ 𝔅; ρ is called the (left) regular representation of 𝔅. (The notation ρ is standard and should not be confused with the notation for rank.) In particular, if S is a semigroup and 𝔄 is the semigroup algebra of S over R, then 𝔄 may be regarded as an S-module by defining sv, s ∈ S, v ∈ 𝔄, to be the product of s and v in 𝔄. Explicitly, the representation ρ: S → Hom(𝔄, 𝔄) is given by

    ρ(s)v = sv,   s ∈ S, v ∈ 𝔄.

Observe that if S is a group and s ∈ S, then ρ(s)x = sx runs over S exactly once as x does. Since S is a basis of 𝔄 over R, it follows that ρ(s) is nonsingular, i.e., ρ(s) ∈ GL(𝔄). In other words, the regular representation ρ: S → GL(𝔄) is proper when S is a group.
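These constructions are easy to sketch concretely for a small group; below, elements of the group algebra are represented as dicts from group elements to coefficients, and S is taken to be Z/3 written additively, so the "product" of group elements is addition mod 3 (an assumption of this sketch, not the book's notation; it mirrors the computation in Example 2(a) below).

```python
# Sketch (not from the text): the group algebra R[S] for S = Z/3, with
# convolution multiplication (3), and the regular representation
# rho(s): f -> s*f, which permutes the basis of indicator functions
# and is therefore invertible.

def conv(f, g, n=3):                  # formula (3): (f.g)(s) = sum over rt = s
    h = {s: 0 for s in range(n)}
    for r, fr in f.items():
        for t, gt in g.items():
            h[(r + t) % n] += fr * gt
    return h

def indicator(s, n=3):                # f_s of formula (5)
    return {t: int(t == s) for t in range(n)}

# f_s . f_t = f_{st} (here f_{s+t mod 3}), as in the proof of Theorem 1.1
assert conv(indicator(1), indicator(2)) == indicator(0)

def rho(s, f):                        # regular representation: left multiply by f_s
    return conv(indicator(s), f)

f = {0: 2, 1: -1, 2: 5}               # 2e + (-1)g + 5g^2
assert rho(1, f) == {0: 5, 1: 2, 2: -1}   # basis elements cyclically permuted
```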

Observe also that if 𝔅 is an associative linear algebra over R and f, g ∈ 𝔅, then fv = gv for all v ∈ 𝔅 iff (f − g)v = 0 for all v ∈ 𝔅. Thus the regular representation ρ: 𝔅 → Hom(𝔅, 𝔅) is faithful iff the following holds: whenever hv = 0 for all v ∈ 𝔅, we have h = 0.

Example 2  (a) Let S be the cyclic group of order 3: S = {e, g, g²}. The group algebra 𝔄 of S over R consists of all elements of the form

    f = ae + bg + cg²,   a, b, c ∈ R,

and S = {e, g, g²} is a basis of 𝔄. The regular representation is defined by left multiplication by each of the elements of S; and since S is a group, the regular representation is proper. We can compute the matrices A(e), A(g), A(g²) of Theorem 1.2 to obtain an explicit proper matrix representation of S. Clearly A(e) = I_3. To compute A(g), we use the fact (see the proof of Theorem 1.2) that

    A(g) = [ρ(g)]_E^E,

where E = {e, g, g²}. Since ρ(g)e = g, ρ(g)g = g², and ρ(g)g² = g³ = e, we have

    A(g) = [0 0 1]
           [1 0 0]
           [0 1 0].

Similarly,

    A(g²) = [0 1 0]
            [0 0 1]
            [1 0 0].

(b) If S = {g_1, . . . , g_n} is a finite group and 𝔄 is the group algebra of S over R, then for each g_k ∈ S there exists σ_k ∈ S_n such that

    g_k g_t = g_{σ_k(t)},   t = 1, . . . , n.

Thus the matrix representation A: S → GL(n, R) associated with the regular representation ρ: S → GL(𝔄) of S has the form

    A(g_k) = [δ_{iσ_k(j)}],   k = 1, . . . , n

(see Theorem 1.2).
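The matrices A(g_k) can be generated mechanically from the group's multiplication table; a sketch (not from the text) for the cyclic group of order 3, consistent with the matrices A(g), A(g²) computed in (a).

```python
# Sketch (not from the text): A(g_k) = [delta_{i, sigma_k(j)}] for the
# regular representation of Z/3, plus a check of the homomorphism property.

def A(k, n=3):
    # column j holds the coordinates of g_k * g_j: a single 1 in row (k+j) % n
    return [[int(i == (k + j) % n) for j in range(n)] for i in range(n)]

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][l] * Y[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]

assert A(1) == [[0, 0, 1], [1, 0, 0], [0, 1, 0]]   # the matrix A(g) of part (a)
assert matmul(A(1), A(1)) == A(2)                  # A(g)^2   = A(g^2)
assert matmul(A(1), A(2)) == A(0)                  # A(g)A(g^2) = A(e) = I_3
```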

If f = Σ_{k=1}^n c_k g_k ∈ 𝔄, then ρ(f)v = (Σ_{k=1}^n c_k g_k)v = Σ_{k=1}^n c_k (g_k v) = (Σ_{k=1}^n c_k ρ(g_k))v for any v ∈ 𝔄. Hence, letting E = {g_1, . . . , g_n} denote S as a basis of 𝔄, we have

    [ρ(f)]_E^E = Σ_{k=1}^n c_k A(g_k).

Since ρ(f_1 f_2) = ρ(f_1)ρ(f_2) and ρ(r_1 f_1 + r_2 f_2) = r_1 ρ(f_1) + r_2 ρ(f_2), if we define

    A(f) = [ρ(f)]_E^E,

we can conclude that A: 𝔄 → M_n(R) is a matrix representation of 𝔄 in the sense that

    A(f_1 f_2) = A(f_1)A(f_2)   and   A(r_1 f_1 + r_2 f_2) = r_1 A(f_1) + r_2 A(f_2)

for all f_1, f_2 ∈ 𝔄 and r_1, r_2 ∈ R. Observe that for any f ∈ 𝔄, A(f) is a matrix all of whose row and column sums are equal (see Exercise 2).

(c) Let S = S_3 be the symmetric group of degree 3. Then V = R³ can be regarded as a proper representation module for S by defining

    σx = A(σ)x,   σ ∈ S_3, x ∈ R³,

where A(σ) is the permutation matrix A(σ) = [δ_{iσ(j)}]; thus

    σx = (x_{σ⁻¹(1)}, x_{σ⁻¹(2)}, x_{σ⁻¹(3)}) . . .

. . . If A: S → M_n(R) and χ: S → M_n(R) are the associated matrix representations with respect to bases E and F of V, respectively, then for all s ∈ S we have

    A(s) = [L(s)]_E^E = [I_V]_E^F [L(s)]_F^F [I_V]_F^E = B⁻¹ χ(s) B,

where B = [I_V]_F^E ∈ GL(n, R). Thus Theorem 1.5(i) can be reformulated as follows: A representation of S is reducible iff the matrices in any associated matrix representation of S form a reducible set. Observe that Theorem 1.5(ii) says that a representation of S is fully reducible iff whenever A: S → M_n(R) is an associated matrix representation of S

of the form

    A(s) = [ A_11(s)     0     ]
           [ A_21(s)  A_22(s) ],   s ∈ S,    (34)

where A_11(s) ∈ M_r(R) and A_22(s) ∈ M_{n−r}(R) for some fixed integer 1 ≤ r < n, there exists a fixed nonsingular matrix

    P = [ P_11    0   ]
        [ P_21  P_22 ] ∈ GL(n, R)

such that P_11 ∈ M_r(R) and

    P⁻¹A(s)P = [ A′_11(s)     0      ]
               [    0      A′_22(s) ],   s ∈ S,

where A′_11(s) ∈ M_r(R) and A′_22(s) ∈ M_{n−r}(R); in this case we have P_22⁻¹ A_22(s) P_22 = A′_22(s) for all s ∈ S.

In general, a set of matrices M ⊂ M_n(R) is said to be fully reducible if whenever (34) holds there exists a fixed nonsingular matrix C ∈ GL(n, R) such that

    C⁻¹AC = [ D_11    0   ]
            [  0    D_22 ],   A ∈ M,    (35)

where D_11 ∈ M_p(R) and D_22 ∈ M_q(R). It follows from the above reformulation of Theorem 1.5(ii) that if a representation of S is fully reducible, then the matrices in any associated matrix representation of S form a fully reducible set. As a final remark we note that any irreducible (i.e., not reducible) set of matrices is fully reducible, since the defining condition is fulfilled vacuously.

Example 3  (a) Let S be the multiplicative group of nonzero complex numbers. Let E = {e_1, e_2} be a basis of the two-dimensional vector space V = R², and for x + iy ∈ S define L(x + iy) ∈ GL(V) by

    L(x + iy)e_1 = xe_1 + ye_2,   L(x + iy)e_2 = −ye_1 + xe_2

and linear extension [notice that det L(x + iy) = x² + y² ≠ 0, i.e., L(x + iy) is indeed nonsingular]. Then set . . .

. . . If L: S → M_n(C) [L: S → M_n(R)] is a matrix representation such that im L consists of unitary (orthogonal) matrices, then L is called a unitary (orthogonal) matrix representation of S.

Theorem 1.6  Any set M ⊂ M_n(C) of unitary matrices is fully reducible. Moreover, if S is a groupoid and V is a unitary representation module for S, then V is fully reducible.

Proof: If the set M ⊂ M_n(C) of unitary matrices is irreducible, we are done. So assume M is reducible. Let E be an orthonormal basis of the unitary space V, and let 𝔏 be the set of all unitary transformations T ∈ Hom(V, V) such that [T]_E^E = U for some U ∈ M. Since M is reducible, there exists a proper subspace W of V which is invariant under all T ∈ 𝔏. Choose an orthonormal basis {w_{r+1}, . . . , w_n} of W (1 ≤ . . .

. . . is an invariant subspace of every L(s), s ∈ S, and the proof is complete. ∎

We remark that Theorem 1.6 holds, of course, if C is replaced by R and "unitary" is replaced by "orthogonal."

Example 4  (a) Let S = {g_1, g_2, . . . , g_n} be a finite group and let ρ: S → GL(𝔄)

be the regular representation of S, where 𝔄 is the group algebra of S over either C or R. Then ρ is a fully reducible representation. To see this, denote the basis {g_1, . . . , g_n} of 𝔄 by E, and define an inner product β on 𝔄 by

    β( Σ_{k=1}^n a_k g_k , Σ_{k=1}^n b_k g_k ) = Σ_{k=1}^n a_k b̄_k.

Then E is clearly an orthonormal basis of 𝔄 with respect to β. For each r = 1, . . . , n, define σ_r ∈ S_n by g_r g_k = g_{σ_r(k)}, k = 1, . . . , n. We compute that

    β( ρ(g_r) Σ_{k=1}^n a_k g_k , ρ(g_r) Σ_{k=1}^n b_k g_k ) = β( Σ_{k=1}^n a_k g_r g_k , Σ_{k=1}^n b_k g_r g_k )
        = β( Σ_{k=1}^n a_k g_{σ_r(k)} , Σ_{k=1}^n b_k g_{σ_r(k)} )
        = Σ_{k=1}^n a_k b̄_k
        = β( Σ_{k=1}^n a_k g_k , Σ_{k=1}^n b_k g_k ),   r = 1, . . . , n.

Thus each ρ(g_r) is unitary (orthogonal if the underlying field is R). It follows from Theorem 1.6 that ρ is fully reducible.

(b) Let V be a finite dimensional unitary space with inner product β, and let T ∈ Hom(V, V). Recall that T is said to be hermitian if T* = T. Also, T is said to be positive semidefinite if β(Tx, x) ≥ 0 for all x ∈ V, and T is positive definite if in addition equality holds iff x = 0. It is true that T is positive semidefinite iff the eigenvalues of T are nonnegative, and T is positive definite iff the eigenvalues of T are positive. A standard result in elementary linear algebra states that if T is hermitian, then V has an orthonormal basis consisting of eigenvectors of T. Moreover, if E is any orthonormal basis of V and [T]_E^E = A, then T is hermitian iff the matrix A satisfies A* = A (i.e., A is hermitian). If T is hermitian and X ∈ GL(V) satisfies X*TX = T, then X is called an automorph of T. Observe that if X and Y are automorphs of T, we have (XY)*T(XY) = Y*(X*TX)Y = Y*TY = T, so XY is an automorph of T.

(c) Let S be a finite group, and let V be a proper S-module over C. Let β be an inner product on V (we are not assuming that V is necessarily a unitary S-module, i.e., that the transformations in the representation are unitary with respect to β). Denote by L: S → GL(V) the representation function for S. Let T ∈ Hom(V, V) be positive definite hermitian and define

    H = Σ_{s∈S} L(s)* T L(s) ∈ Hom(V, V).    (36)

Notice that H is positive definite hermitian, being a sum of positive definite hermitian transformations. Then for t ∈ S we compute that

    L(t)*HL(t) = Σ_{s∈S} L(t)* L(s)* T L(s) L(t) = Σ_{s∈S} (L(s)L(t))* T L(s)L(t) = Σ_{s∈S} L(st)* T L(st) = H,

where the final equality follows since st runs over S as s does. Thus L(t) is an automorph of H for each t ∈ S. Now define a new inner product on V, denoted simply (·, ·), by the formula

    (x, y) = β(Hx, y),   x, y ∈ V.    (37)

We have

    (L(s)x, L(s)y) = β(HL(s)x, L(s)y) = β(L(s)*HL(s)x, y) = β(Hx, y) = (x, y),   s ∈ S.

In other words, each L(s) ∈ im L is unitary with respect to the inner product given in (37). Thus V together with the inner product given in (37) is a unitary S-module, and hence V is fully reducible by Theorem 1.6. Since any vector space V over C can be equipped with an inner product β (see Exercise 8), we conclude that if S is a finite group and V is a proper S-module over C, then V is fully reducible. The same argument shows that this result also holds if C is replaced by R.
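The averaging argument in (c) can be verified numerically for a small group; a sketch (not from the text, over R, so adjoints are transposes) with S = {e, s}, s² = e, acting on R² by a non-orthogonal matrix M, and T = I in (36).

```python
# Sketch (not from the text): S = {e, s} with s^2 = e acts on R^2 by
# L(e) = I and L(s) = M (not orthogonal). Averaging as in (36)-(37) with
# T = I gives H = I + M^T M, and the new inner product (x, y) = x^T H y
# is invariant: (Mx, My) = (x, y).

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def T(X):                                  # transpose = adjoint over R
    return [list(r) for r in zip(*X)]

I2 = [[1, 0], [0, 1]]
M = [[1, 1], [0, -1]]
assert matmul(M, M) == I2                  # M represents an element of order 2

H = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(I2, matmul(T(M), M))]
assert H == [[2, 1], [1, 3]]
assert matmul(T(M), matmul(H, M)) == H     # L(s) is an automorph of H, as in (c)

def ip(x, y):                              # (x, y) = x^T H y, formula (37)
    return sum(x[i] * H[i][j] * y[j] for i in range(2) for j in range(2))

def apply(A, x):
    return [sum(A[i][j] * x[j] for j in range(2)) for i in range(2)]

x, y = [1, 2], [3, -1]
assert ip(apply(M, x), apply(M, y)) == ip(x, y)
```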

(d) Let A: S → GL(n, C) be a matrix representation of the finite group S. Then there exists a fixed nonsingular matrix P ∈ GL(n, C) such that P⁻¹A(s)P is unitary for every s ∈ S. To see this, let V be an n-dimensional vector space over C with basis E, and define the representation L: S → GL(V) of S by [L(s)]_E^E = A(s), s ∈ S. In (c) above we saw that there exists an inner product (·, ·) on V which makes V into a unitary S-module, i.e., each of the transformations L(s) ∈ im L is unitary with respect to (·, ·). Now let F be an orthonormal basis of V with respect to (·, ·); then

    [L(s)]_F^F = χ(s)

is a unitary matrix for each s ∈ S. Finally,

    χ(s) = [L(s)]_F^F = P⁻¹A(s)P,   s ∈ S,

where P = [I_V]_F^E ∈ GL(n, C). An analogous result holds if C is replaced by R and "unitary" is replaced by "orthogonal."

Actually, formula (36) suggests how to construct a matrix P ∈ GL(n, C) such that P⁻¹A(s)P is a unitary matrix for every s ∈ S. Let

    H = Σ_{s∈S} A(s)* A(s) ∈ GL(n, C);

for each s ∈ S, A(s)*A(s) is positive definite hermitian since A(s) is nonsingular, and hence H is positive definite hermitian. Now let P = H^{-1/2} be the inverse of the positive definite hermitian square root of H. Then for every s ∈ S,

    (P⁻¹A(s)P)*(P⁻¹A(s)P) = P*A(s)* P⁻* P⁻¹ A(s)P
                           = H^{-1/2} A(s)* H A(s) H^{-1/2}
                           = H^{-1/2} H H^{-1/2}   (since A(s)*HA(s) = H)
                           = I_n,

and so P⁻¹A(s)P is a unitary matrix. This construction is illustrated in Exercise 9.

Let S be a groupoid and Van S-module over R. Let W be an invariant subspace of V for the representation of S, and denote by q: V —> V/ W the ca-

nonical quotient map. Then the quotient space V/ W can be made into an Smodule by defining

5.1

Representations

431

sq(v) = q(sv),

s E S, v E V.

(38)

To see that (38) makes sense, observe that q(vl) = q(v2) implies vl — v2 E W and hence s(v1 — v2) E Wfor s E S; but then q(sv1) = q(sv2). Also observe that for s,t E S and v E V,

    s(tq(v)) = s(q(tv)) = q(s(tv)) = q((st)v) = (st)q(v);

so (38) indeed defines a representation of S. It is called the quotient representation of S (with respect to W).

Suppose V has dimension n over R. Let {e_{r+1}, . . . , e_n} (1 ≤ r ≤ n) be a basis of W, and augment it to a basis E = {e_1, . . . , e_n} of V. Then the associated matrix representation A: S → M_n(R) of S has the form

    A(s) = [ A11(s)     0    ]
           [ A21(s)  A22(s) ],    s ∈ S,

where A11(s) ∈ M_r(R) and A22(s) ∈ M_{n−r}(R). As we saw in Theorem 1.5, the matrices A11(s) and A22(s) constitute respective matrix representations A11: S → M_r(R) and A22: S → M_{n−r}(R) of S [see formulas (31) and (32)]. Now it is clear that the invariant subspace W of the S-module V itself becomes an S-module if we simply restrict the original representation transformations to W, and A22: S → M_{n−r}(R) is precisely the associated matrix representation of S relative to the basis {e_{r+1}, . . . , e_n} of W. Moreover, for s ∈ S and 1 ≤ k ≤ r we have

    sq(e_k) = q(se_k) = q(Σ_{i=1}^n A_ik(s)e_i) = Σ_{i=1}^n A_ik(s)q(e_i)
            = Σ_{i=1}^r A_ik(s)q(e_i)    [since q(e_{r+1}) = · · · = q(e_n) = 0].

It follows that A11: S → M_r(R) is the matrix representation associated with the quotient representation of S relative to the basis {q(e_1), . . . , q(e_r)} of V/W.

We have the following interesting result.

Theorem 1.7 Let S be a groupoid and V an S-module.
(i) If W is a minimal invariant subspace of V for the representation of S, then W is an irreducible S-module.
(ii) If W ≠ V is an invariant subspace of V for the representation of S, then V/W is an irreducible S-module (under the quotient representation) iff W is a maximal invariant subspace.

Proof: (i) This is obvious.
(ii) Assume V/W is an irreducible S-module (under the quotient representation). If W is not a maximal invariant subspace of V for the representation of S, there exists an invariant subspace U of V such that W ⊂ U ⊂ V and the inclusions are strict. Let q: V → V/W denote the quotient map, and set W1 = q(U). Clearly W1 is a proper subspace of V/W (see Exercise 11). Moreover, for s ∈ S and u ∈ U we have

sq(u) = q(su) ∈ W1 (su ∈ U because U is an invariant subspace of V). Thus W1 is a proper invariant subspace of the S-module V/W, contradicting the irreducibility of V/W. We conclude that W is a maximal invariant subspace of V.

Conversely, assume W is a maximal invariant subspace of V for the representation of S. Suppose W1 is an invariant subspace of the S-module V/W. Then W1 = q(U), where U = q⁻¹(W1) is a subspace of V containing W (see Exercise 10). If s ∈ S and u ∈ U, then q(u) ∈ W1 and the invariance of W1 for the quotient representation of S implies that sq(u) ∈ W1. Since

sq(u) = q(su), we have q(su) ∈ W1, i.e., su ∈ U. Hence U is an invariant subspace of V for the representation of S, and since W ⊂ U the maximality of W implies that U = W or U = V. If U = W, then W1 = q(U) = {0}. If U = V, then W1 = q(U) = V/W. This shows that V/W has only {0} and V/W as invariant subspaces for the quotient representation of S. Thus V/W is an irreducible S-module. ∎

Theorem 1.7 tells us the following. If S is a groupoid and V an S-module, then both minimal and maximal invariant subspaces of V give rise

to irreducible representations of S: A minimal invariant subspace W is made into an irreducible S-module by simply restricting the original representation transformations to W; and if W is a maximal invariant subspace, then the

quotient representation of S in V/W is irreducible.

We next introduce the important notion of equivalence. Let S be a groupoid, and let V and U be S-modules. If there exists a bijective T ∈ Hom(U,V) such that

    sTu = Tsu,    s ∈ S, u ∈ U,    (39)

then the two S-modules V and U (or the corresponding representations) are said to be equivalent. If L: S → Hom(V,V) and M: S → Hom(U,U) are the respective representations, then (39) states that

    L(s)Tu = TM(s)u,    s ∈ S, u ∈ U,

or

    L(s)T = TM(s),    s ∈ S.    (40)

Even if T ∈ Hom(U,V) is not necessarily bijective in (40), we say that L and M are linked by T (or that U and V are linked by T). The linking is of course trivial if T = 0. When L and M are equivalent representations, we write L ~ M. It is easy to check that in the class of all representations of a groupoid S, ~ is an equivalence relation (see Exercise 12).
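A small sketch (ours, not the text's) of conditions (39) and (40): conjugating a matrix representation by any fixed invertible T produces an equivalent representation, and T links the two.

```python
from fractions import Fraction

def mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def inv2(T):
    # inverse of a 2x2 matrix via the adjugate formula
    d = Fraction(T[0][0] * T[1][1] - T[0][1] * T[1][0])
    return [[T[1][1] / d, -T[0][1] / d], [-T[1][0] / d, T[0][0] / d]]

# A matrix representation L of the two-element group {e, g}, g*g = e.
L = {'e': [[1, 0], [0, 1]], 'g': [[0, 1], [1, 0]]}

# Any fixed invertible T yields the equivalent representation
# M(s) = T^{-1} L(s) T, and then L(s)T = TM(s) for every s: T links L and M.
T = [[1, 1], [0, 1]]
Tinv = inv2(T)
M = {s: mul(Tinv, mul(L[s], T)) for s in L}

for s in L:
    assert mul(L[s], T) == mul(T, M[s])     # condition (40)
assert mul(M['g'], M['g']) == M['e']        # M is again a representation
```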


There is a similar notion of linking sets of matrices. Let Ω ⊂ M_n(R), Γ ⊂ M_m(R), and suppose M ∈ M_{m,n}(R). If for each A ∈ Ω there exists a B ∈ Γ such that

    MA = BM    (41)

and also for each B ∈ Γ there exists an A ∈ Ω such that (41) holds, then M is said to link Ω and Γ.

Finally, two matrix representations A: S → M_n(R) and κ: S → M_n(R) of a groupoid S are said to be equivalent if there exists a nonsingular matrix P ∈ GL(n,R) such that

    A(s)P = Pκ(s),    s ∈ S.

Once again, we write A ~ κ, and ~ is an equivalence relation in the class of all matrix representations of S.

The next result shows that any nontrivial irreducible representation of a semigroup over a field is equivalent to a quotient representation of the regular representation with respect to a maximal invariant subspace of the semigroup algebra over the field. (The trivial irreducible representation is the 0 representation in a space of dimension 1.)

Theorem 1.8 Let S be a semigroup, 𝔄 the semigroup algebra of S over a field R, and L: S → Hom(V,V) a nontrivial irreducible representation of S over R. Then there exists a maximal invariant subspace A of 𝔄 for the regular representation ρ: S → Hom(𝔄,𝔄) such that the corresponding irreducible quotient representation ρ1: S → Hom(𝔄/A, 𝔄/A) is equivalent to L.

Proof: Let v0 ∈ V, v0 ≠ 0. Then 𝔄v0 = V by Theorem 1.4(iv). Let A = {f ∈ 𝔄 | L(f)v0 = 0}. It is obvious that A ≠ 𝔄 is a subspace of 𝔄, and in fact A is invariant under ρ; for if f ∈ A and g ∈ 𝔄, then

    L(gf)v0 = L(g)L(f)v0 = L(g)0    [since L(f)v0 = 0]
            = 0,

so that ρ(g)f = gf ∈ A. We obtain the quotient representation ρ1: S → Hom(𝔄/A, 𝔄/A) given by

    ρ1(s)q(f) = q(ρ(s)(f)) = q(sf),    s ∈ S, f ∈ 𝔄,

where q: 𝔄 → 𝔄/A is the quotient map. Define the function T: V → 𝔄/A by

    Tv = q(f),    v ∈ V,    (42)

where f ∈ 𝔄 is such that L(f)v0 = v (such an f ∈ 𝔄 exists because 𝔄v0 = V).

We must confirm that T is well-defined: if L(f)v0 = v = L(g)v0, then


L(f − g)v0 = 0, so that f − g ∈ A, and hence q(f) = q(g). Observe also that T is linear. For if Tu = q(f), Tv = q(g), and r,k ∈ R, then (using L extended to 𝔄) we have

    L(rf + kg)v0 = rL(f)v0 + kL(g)v0 = ru + kv

[since Tu = q(f) means L(f)v0 = u and Tv = q(g) means L(g)v0 = v], and hence

    T(ru + kv) = q(rf + kg) = rq(f) + kq(g) = rTu + kTv.

Next note that T is injective. For if Tu = q(f) = q(g) = Tv, then f − g ∈ A, L(f − g)v0 = 0, L(f)v0 = L(g)v0, and since L(f)v0 = u and L(g)v0 = v, we obtain u = v. Clearly if f ∈ 𝔄 and L(f)v0 = v, then Tv = q(f) by definition; so T is surjective as well. Finally, observe that T links L and ρ1. Indeed,

suppose s ∈ S, v ∈ V, and f ∈ 𝔄 is chosen so that L(f)v0 = v. Then

    TL(s)v = TL(s)L(f)v0
           = TL(sf)v0
           = q(sf)          [by the definition (42)]
           = ρ1(s)q(f)      [by the definition of ρ1]
           = ρ1(s)Tv        [by the definition (42)].

Thus L ~ ρ1. It is easy to check (see Exercise 13) that since L is irreducible and ρ1 is equivalent to L, ρ1 is irreducible. Hence by Theorem 1.7(ii), A is a maximal invariant subspace of 𝔄 for the regular representation ρ: S → Hom(𝔄,𝔄). ∎
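A minimal numerical sketch of the proof's construction (ours, not the text's): take S the two-element group {e, g}, 𝔄 = R[S], and for L the degree-1 representation with L(g) = −1. Then A = {f ∈ 𝔄 : L(f)v0 = 0} is the span of e + g, and the quotient 𝔄/A, identified with R via L itself (which plays the role of the inverse of the equivalence T in the proof), carries the quotient representation ρ1 with ρ1(g) = −1 = L(g).

```python
from fractions import Fraction as F

# Group S = {e, g}, g*g = e.  The algebra element a*e + b*g is the pair (a, b).
def alg_mul(x, y):
    a, b = x
    c, d = y
    # (a e + b g)(c e + d g) = (ac + bd) e + (ad + bc) g
    return (a * c + b * d, a * d + b * c)

# The representation L on V = R: L(e) = 1, L(g) = -1, extended linearly.
def L(x):
    a, b = x
    return a - b

v0 = F(1)                        # any nonzero vector of V = R
# A = {f : L(f) v0 = 0} = span{e + g}; the quotient algebra/A is identified
# with R by sending the coset of f to L(f) (a functional vanishing on A).
def q(x):
    return L(x)

g = (F(0), F(1))
f = (F(1), F(0))                 # L(f) v0 = v0, so T v0 = q(f)

# Quotient representation: rho1(g) q(f) = q(g f).
assert q(alg_mul(g, f)) == -q(f)   # rho1(g) is multiplication by -1
assert L(g) * v0 == -v0            # ...which agrees with L(g); so L ~ rho1
```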

Theorem 1.8 has the following immediate consequence.

Corollary 1 Let S be a semigroup, 𝔄 the semigroup algebra of S over a field R, and L: S → Hom(V,V) a nontrivial irreducible representation of S over R. Assume that the regular representation ρ: S → Hom(𝔄,𝔄) of S is fully reducible. Then there exists a minimal invariant subspace B of 𝔄 for ρ such that the restriction ρ|B of ρ to B [i.e., (ρ|B)(s) = ρ(s)|B, s ∈ S] is equivalent to L.

Proof: By Theorem 1.8, there exists a maximal invariant subspace A of 𝔄 for the regular representation ρ such that L ~ ρ1, where ρ1: S → Hom(𝔄/A, 𝔄/A) is the corresponding quotient representation of S. Since ρ is fully reducible by hypothesis, there exists an invariant subspace B of 𝔄 for ρ such that 𝔄 = A ∔ B. Then B is a minimal invariant subspace of 𝔄 (see Exercise 14). Let Q be the restriction of the quotient map q: 𝔄 → 𝔄/A to B, i.e., Q = q|B. Then Q is nonsingular, and the statement (29) specializes to

    ρ1(s) = Q(ρ(s)|B)Q⁻¹,    s ∈ S.    (43)

But (43) says precisely that ρ|B ~ ρ1, and since ρ1 ~ L, we obtain L ~ ρ|B. ∎

Example 5 We determine all the irreducible representations over C of a cyclic group S = {e = g⁰, g, . . . , g^{n−1}} of order n. Let 𝔄 be the group algebra of S over C. By Example 4(a) we know that the regular representation ρ: S → GL(𝔄) of S is fully reducible. Suppose L: S → Hom(V,V) is a nontrivial irreducible representation of S over C. It follows from Corollary 1 that L ~ ρ|B, where B is a minimal invariant subspace of 𝔄 for ρ. Now let b ∈ B be an eigenvector of ρ(g). Since ρ(g^k) = ρ(g)^k, k ∈ Z, b is an eigenvector of each ρ(g^k). Hence ⟨b⟩ ⊂ B is a one-dimensional invariant subspace of 𝔄 for ρ, and the minimality of B implies that ⟨b⟩ = B.

But L ~ ρ|B entails in particular that dim V = dim B; so dim V = 1. Write V = ⟨v⟩, where 0 ≠ v ∈ V, and let λ ∈ C be such that L(g)v = λv. Then L(g^k)v = L(g)^k v = λ^k v, k ∈ Z, and since g^n = e, we have λ^n = 1.

In Theorem 1.8 we saw that for a semigroup S, any nontrivial irreducible representation L of S over a field R is equivalent to a quotient representation of the regular representation ρ of S with respect to a maximal invariant subspace A of the semigroup algebra 𝔄 of S over R. Moreover, Corollary 1 told us that if the regular representation ρ is fully reducible, then L is equivalent to ρ acting on a minimal invariant subspace B of 𝔄. It is important to observe that an invariant subspace A of 𝔄 for the regular representation ρ is simply a left ideal in 𝔄; i.e., A satisfies 𝔄A ⊂ A.

We define minimal (maximal) left ideals in 𝔄 to be minimal (maximal) invariant subspaces of 𝔄 for ρ. Thus A is a maximal left ideal in 𝔄 iff A is a left ideal not strictly contained in any left ideal in 𝔄 other than 𝔄 itself. Similarly, a minimal left ideal in 𝔄 is a left ideal not strictly containing any left ideal in 𝔄 other than {0}. The study of representations over R of a semigroup S is equivalent to the study of the ideal structure in the linear associative semigroup algebra 𝔄 of S over R.

In the next section we shall study representations of groups (mostly finite groups) mainly through the use of matrices. There are several reasons for proceeding in this way. First, the matrix-theoretic approach is easily understood. Second, the originators of group representation theory (Burnside, Frobenius, Schur, Maschke) developed the theory as a part of matrix theory. Third, the results are very specific and suitable for applications. Finally, the main results are quickly accessible.
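As a numerical check on Example 5 (the code is ours, not the text's): the degree-1 representations of the cyclic group of order n over C send a generator g to an nth root of unity, each such choice is multiplicative, and there are exactly n of them.

```python
import cmath

n = 6  # any order; the cyclic group {e, g, ..., g^(n-1)}
# The n degree-1 representations: L_j(g^k) = lambda_j^k, lambda_j = e^{2*pi*i*j/n}.
roots = [cmath.exp(2j * cmath.pi * j / n) for j in range(n)]

for lam in roots:
    assert abs(lam ** n - 1) < 1e-9          # lambda^n = 1, since g^n = e
    # L(g^a) L(g^b) = L(g^(a+b)): each choice of lambda is multiplicative.
    for a in range(n):
        for b in range(n):
            assert abs(lam ** a * lam ** b - lam ** ((a + b) % n)) < 1e-9

# The n representations are pairwise distinct (hence inequivalent, degree 1).
assert len({round(lam.real, 9) + 1j * round(lam.imag, 9) for lam in roots}) == n
```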

Exercises 5.1

1. Verify (23) to (26). Hint: For example, f(rv) = (Σ_{x∈S} a_x x)(rv) = Σ_{x∈S} a_x x(rv) = Σ_{x∈S} a_x r(xv) = r Σ_{x∈S} a_x(xv) = r(fv).
2. Referring to Example 2(b), prove that for each f ∈ 𝔄 the row and column sums of the matrix A(f) are all equal. Hint: A(f) = Σ_{k=1}^h c_k A(g_k). The A(g_k) are permutation matrices. Thus c_1 + · · · + c_h is the common value of the row and column sums.
3. Prove the assertion following (27) that {q(v_{r+1}), . . . , q(v_n)} is a basis of V/W. Hint: If 0 = Σ_{t=r+1}^n c_t q(v_t) = q(Σ_{t=r+1}^n c_t v_t), then Σ_{t=r+1}^n c_t v_t ∈ W, and hence c_{r+1} = · · · = c_n = 0. Also, since q(v_i) = 0 for i = 1, . . . , r, it follows that V/W = q(V) is spanned by the indicated set.
4. Prove that the function T1: V/W1 → V/W2 defined in (28) is well-defined and linear. Hint: If q(v_1) = q(v_2), then v_1 − v_2 ∈ W1, so that T(v_1 − v_2) ∈ W2. Hence Tv_1 − Tv_2 ∈ W2 and q(Tv_1) = q(Tv_2). Thus T1 is well-defined. Also for r,s ∈ R we have T1(q(rv_1 + sv_2)) = q(T(rv_1 + sv_2)) = q(rTv_1 + sTv_2) = rq(Tv_1) + sq(Tv_2) = rT1 q(v_1) + sT1 q(v_2), showing that T1 is linear.

5. Prove that (29) is valid. Hint: First observe that Q: W2 → V/W1 is bijective. For if Qw_2 = 0, then q(w_2) = 0 and w_2 ∈ W1 ∩ W2 = {0}. Thus Q is injective. Also if v = w_1 + w_2 ∈ V, then q(v) = q(w_1) + q(w_2) = q(w_2) = Qw_2. Thus V/W1 = im q = im Q and Q is surjective. Finally, if v = w_1 + w_2 ∈ V, we have QPQ⁻¹q(v) = QPQ⁻¹q(w_2) = QPQ⁻¹Qw_2 = QPw_2 = QTw_2 = q(Tw_2) = q(Tw_1) + q(Tw_2) = q(Tv) = T1 q(v).
6. Prove the assertion following (29) concerning W1 and W2. Hint: Assume W1 is not strictly contained in any proper invariant subspace of V under T. Suppose W3 is a proper invariant subspace of V under T strictly contained in W2. Then

QPQ"q(v)= QPQ“q(w2)— — QP “Qw2= QPw2= QTwz — q(Tw2)— — q(Twl) + q(TWz) = q(TV)= T140) 6. Prove the assertion following (29) concerning W, and W,. Hint: Assume W, is not strictly contained in any proper invariant subspace of V under T. Suppose W3 is a proper invariant subspace of V under T strictly contained in W,. Then

W, —i— W, is an invariant subspace of Vunder T. Since the inclusions {0} C W3 C W2 are strict, so are the inclusions W, C W, -l- W3 C W, 4- W2: V. This contradicts our initial assumption.

Conversely, assume W2 does not strictly contain any proper invariant sub-

space of V under T. Suppose W3 is a proper invariant subspace of Vunder T which strictly contains W,. Then W3 n W2 is an invariant subspace of Vunder T, and since theinclusions W, C W3 C Vare strict, so are the inclusions {0} C W3 {‘1 W2 C W,. Again, this provides a contradiction. 7. Let Vbe a unitary space. Prove that if E and F are o.n. bases of V, then [I y];

is a unitary matrix. Hint: Let A = [I_V]_F^E. Then e_j = I_V e_j = Σ_{i=1}^n a_ij f_i, j = 1, . . . , n, so that

    δ_pq = (e_p, e_q) = (Σ_{i=1}^n a_ip f_i, Σ_{t=1}^n a_tq f_t)
         = Σ_{i,t} a_ip ā_tq (f_i, f_t) = Σ_{i=1}^n a_ip ā_iq
         = Σ_{i=1}^n (A*)_qi a_ip = (A*A)_qp.

Hence A*A = I_n, and A is unitary.
8. Let V be a vector space over either R or C. Prove that it is always possible to define an inner product (·,·) on V. Hint: Let E = {e_1, . . . , e_n} be a basis of V and define (Σ_{i=1}^n ξ_i e_i, Σ_{i=1}^n η_i e_i) = Σ_{i=1}^n ξ_i η̄_i.
9. Let S ⊂ M2(C) be the group of matrices

    [1  0]  [0  1]  [-1 -1]  [ 1  0]  [ 0  1]  [-1 -1]
    [0  1], [1  0], [ 0  1], [-1 -1], [-1 -1], [ 1  0].


(a) Show that the matrices in S constitute a faithful irreducible matrix representation of S3.
(b) Find a matrix P ∈ GL(2,C) such that P⁻¹A(s)P is a unitary matrix for each matrix A(s) in S.
Hint: (a) To see that the matrices in S constitute a faithful representation of S3, construct a 6 × 6 multiplication table for each group and compare the two. Suppose S were reducible over C and Q ∈ GL(2,C) were such that Q⁻¹A(s)Q is lower triangular for each matrix A(s) in S. Then it would follow that Q^{(1)} is an eigenvector of each A(s). The eigenvectors of

    [0  1]
    [1  0]

are (1,1) and (1,−1) (and scalar multiples of these). However,

    [-1 -1](1,1) = (-2,1)    and    [-1 -1](1,-1) = (0,-1),
    [ 0  1]                         [ 0  1]

so that

    [0  1]    and    [-1 -1]
    [1  0]           [ 0  1]

do not have a common eigenvector. Thus S must be irreducible over C.
(b) First form the sum H = Σ_{s∈S} A(s)*A(s). Then find a 2 × 2 unitary matrix U such that U*HU = diag(a1,a2). Define P = U diag(a1^{-1/2}, a2^{-1/2})U⁻¹.

matrix U such that U“HU = diag(a,,a2). DefineP = U diag(¢z{‘ll 2,u,“’ 2)U'1. 10. Let Wbe a subspace of the vector space V, and let q: V—> V/ Wbe the quotient map. Show that any subspace Wl of V/ W has the form WI = q(U), where U is a subspace of Vcontaining W. Hint: Since 4: V—> V/ Wis linear, it follows that

U = q‘1(W1) = {x E V|q(x) E W,} isasubspace of V, and obviouslyq(U) = W]. Note that W = q“({0}) C q"(W1) = U.

11. Let W and U be subspaces of the vector space V. Suppose W ⊂ U ⊂ V and the inclusions are strict. Let q: V → V/W be the quotient map. Show that q(U) is a proper subspace of V/W. Hint: Since U is a subspace of V and q: V → V/W is a linear map, q(U) is a subspace of V/W. If q(U) = {0}, then q(u) = 0 for all u ∈ U and hence U ⊂ W, which is impossible. Next, choose v0 ∈ V − U. Then if q(v0) ∈ q(U), it would follow that q(v0) = q(u) for some u ∈ U, and hence v0 = u + w for some w ∈ W. Since w ∈ W ⊂ U, we would have v0 ∈ U, a contradiction. Thus q(v0) ∉ q(U) and q(U) is a proper subspace of V/W.

12. Show that if S is a groupoid, then ~ is an equivalence relation in the class of all representations of S. Hint: Let L: S → Hom(V,V), M: S → Hom(U,U), and K: S → Hom(W,W) be representations of S. Clearly I_V L(s) = L(s)I_V for s ∈ S; so L ~ L. If T ∈ Hom(U,V) is a bijection such that L(s)T = TM(s), s ∈ S, then T⁻¹ ∈ Hom(V,U) and M(s)T⁻¹ = T⁻¹L(s), s ∈ S. Hence L ~ M implies M ~ L. If T ∈ Hom(U,V) is a bijection linking L and M, and if D ∈ Hom(W,U) is a bijection linking M and K, then TD ∈ Hom(W,V) is a bijection and L(s)TD = TM(s)D = TDK(s), s ∈ S. Thus L ~ M and M ~ K imply L ~ K.

13. Let S be a groupoid, and let V and U be equivalent S-modules; say the respective representations L: S → Hom(V,V) and M: S → Hom(U,U) are linked by the bijection T ∈ Hom(U,V). Show that W is a minimal (maximal) invariant subspace of U iff T(W) is a minimal (maximal) invariant subspace of V.

14. Let S be a groupoid and L: S → Hom(V,V) a representation of S. Let W1 and W2 be subspaces of V such that the pair (W1,W2) reduces each L(s) ∈ im L. Prove that W1 is a maximal invariant subspace of V for the representation of S iff W2 is a minimal invariant subspace of V for the representation of S. Hint: This is essentially Exercise 6.

15. Let S be a group, and V an S-module of dimension n over R. Let e be the identity of S. Denote by L: S → Hom(V,V) the representation function for S. Show that
(a) eV = L(e)V is a proper S-module.
(b) V = eV ∔ (I_V − L(e))V.
(c) L(s)|(I_V − L(e))V = 0 for each s ∈ S.
(d) There exists a basis E of V such that if A(s) = [L(s)]_E^E, s ∈ S, then

    A(s) = [κ(s)  0]
           [  0   0],

where κ: S → GL(r,R) is a proper matrix representation of S and r = dim eV.
Hint: (a) If s ∈ S, then s(ev) = (se)v = (es)v = e(sv) for all v ∈ V, and

hence s(eV) ⊂ eV. Also e(ev) = e²v = ev for all v ∈ V, so that L(e)|eV = I_{eV}. In general, if M: S → Hom(W,W) is a representation of the group S and M(e) = I_W, then M is proper. Indeed, for each s ∈ S we have I_W = M(e) = M(ss⁻¹) = M(s)M(s⁻¹), and so M(s) has an inverse. Thus each L(s)|eV has an inverse, and it follows that eV is a proper S-module for which the representation function is simply L|eV.
(b) If w ∈ eV ∩ (I_V − L(e))V, then w = ev = u − eu for some u,v ∈ V. We have ew = e(ev) = ev = w and ew = e(u − eu) = eu − e²u = eu − eu = 0. Hence w = 0. Also, each v ∈ V can be written as v = ev + (v − ev) ∈ eV + (I_V − L(e))V.
(c) For v ∈ V,

    L(s)(I_V − L(e))v = L(s)v − L(s)L(e)v = L(s)v − L(se)v = 0,    s ∈ S,

(d) Let E = {e_1, . . . , e_n} be a basis of V so chosen that {e_1, . . . , e_r} is a basis of eV and {e_{r+1}, . . . , e_n} is a basis of (I_V − L(e))V. Obviously each [L(s)]_E^E has the indicated form. In particular, we have proved that if L: S → Hom(V,V) is a representation of a group S, then all the transformations L(s), s ∈ S, have the same rank r.

16. Let M be a multiplicative group of (not necessarily nonsingular) matrices in M_n(R), where R is a field. Prove the following statements.
(a) There exists an integer r such that the rank of every matrix A ∈ M is r.
(b) There exists a fixed nonsingular matrix P ∈ GL(n,R) such that for each A ∈ M, there is a B ∈ M_r(R) with P⁻¹AP = B ∔ 0_{n−r}, where 0_{n−r} is the (n − r)-square zero matrix.
(c) If H is the identity in M, then P⁻¹HP = I_r ∔ 0_{n−r}.
Hint: Let V be an n-dimensional vector space over R, F a basis of V, and S the group of transformations T ∈ Hom(V,V) such that [T]_F^F = A for some A ∈ M.


Define L: S → Hom(V,V) by L(T) = T, T ∈ S; then V becomes an S-module. According to Exercise 15, there exists a basis E of V such that if A(T) = [L(T)]_E^E = [T]_E^E, T ∈ S, then A(T) = κ(T) ∔ 0_{n−r}, where κ: S → GL(r,R) is a proper matrix representation of S. But for each T ∈ S, [T]_E^E = [I_V]_E^F [T]_F^F [I_V]_F^E, and hence P⁻¹AP = B ∔ 0_{n−r}, where P = [I_V]_F^E, A = [T]_F^F ∈ M, and B = κ(T). Since κ is proper, each κ(T) = B has rank r, so that each matrix A has rank r. This proves (a) and (b). If H is the identity in M, the fact that κ is proper also implies κ(H) = I_r. This proves (c).
17. Let R be a field of characteristic not 2. Let M be the set of all matrices of the form

    c[1  1]
     [1  1],    0 ≠ c ∈ R.

(a) Show that M is a group with respect to matrix multiplication.
(b) Find a fixed matrix P ∈ GL(2,R) such that

    P⁻¹AP = [b  0]
            [0  0]

for every A ∈ M (b depends on A).
Hint: (a) M is obviously closed with respect to matrix multiplication. The identity in M is

    (1/2)[1  1]
         [1  1],

and the inverse of the element of M with scalar c is the element with scalar 1/(4c).
(b) Let

    P = [1  1]
        [1 -1]  ∈ GL(2,R);

then P⁻¹ = (1/2)P, and for the element A of M with scalar c,

    P⁻¹AP = [2c  0]
            [ 0  0].

Glossary 5.1

automorph, 429
completely reducible representation (module), 423
conjugate dual, 427
conjugate transpose, 427
equivalent representations, 432
Euclidean space, 427
Euclidean representation module, 427
fully reducible representation (module), 423
fully reducible set of matrices, 425
general linear group, 413
GL(V), 413
group algebra, 414
groupoid algebra, 414
groupoid ring, 414
hermitian matrix, 429
hermitian transformation, 429
indicator function, 414
fn, 414
inner product, 427
invariant subset, 421
irreducible representation (module), 422
linked transformations, 432
matrix representation, 413
maximal invariant subspace, 422
maximal left ideal, 435
minimal invariant subspace, 422
minimal left ideal, 435
orthogonal matrix, 427
orthogonal matrix representation, 428
orthogonal transformation, 427
orthonormal basis, 427
o.n., 427
positive definite hermitian transformation, 429
positive semidefinite hermitian transformation, 429
proper matrix representation, 413
proper representation module, 413
quotient representation, 431
reducible set of matrices, 424
regular representation, 419
ρ, 419
representation, 413
representation function, 413
representation module, 413
semigroup ring, 414
S-module, 413
trivial representation, 433
unitary matrix representation, 428
unitary representation module, 427
unitary space, 427
unitary transformation, 427
(W1, W2) reduces T, 421

5.2 Matrix Representations

Let S be a group and V an S-module. It is shown in Exercise 15, Section 5.1, that if e is the identity in S, then eV is a proper representation module for S. Moreover, there exists a basis E = {e_1, . . . , e_n} of V such that if L: S → Hom(V,V) is the representation function and A(s) = [L(s)]_E^E, s ∈ S, then

    A(s) = κ(s) ∔ 0_{n−r},

where r = dim eV and κ: S → GL(r,R) is a proper matrix representation with κ(s) = [L(s)|eV]_{E_1}^{E_1}, E_1 = {e_1, . . . , e_r}. Thus it suffices to consider proper representations when dealing with group representations. Henceforth in this section we shall assume that all representations of groups are proper. The degree of a representation L: S → Hom(V,V) (or the degree of the S-module V) is the dimension of V, and if A: S → GL(n,R) is a matrix representation, then n is the degree of A.

Following is a basic result in the representation theory of finite groups.

Theorem 2.1 (Maschke's Theorem) Let S be a finite group, and V an S-module of dimension n over R. Let h = |S|, and assume that h = h·1 ≠ 0 in R. Then V is fully reducible.
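The proof below obtains the reducing basis change by an averaging trick. As a preview, here is a minimal numerical sketch (ours, not the text's) for the two-element group with 1 × 1 blocks: from a block lower triangular representation one forms C = Σ_{s∈S} A21(s)A11(s)⁻¹, and the fixed matrix P with lower-left block C/h diagonalizes every A(s) at once.

```python
from fractions import Fraction as F

def mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# A reducible (block lower triangular) representation of S = {e, g}, g^2 = e:
A = {'e': [[F(1), F(0)], [F(0), F(1)]],
     'g': [[F(1), F(0)], [F(1), F(-1)]]}
assert mul(A['g'], A['g']) == A['e']
h = 2                                   # |S|, invertible in Q

# With 1x1 blocks A11(s), A21(s), A22(s): C = sum of A21(s) * A11(s)^{-1}.
C = sum(A[s][1][0] / A[s][0][0] for s in A)

# P = [[1, 0], [C/h, 1]]; its inverse just flips the sign of the corner.
P = [[F(1), F(0)], [C / h, F(1)]]
Pinv = [[F(1), F(0)], [-C / h, F(1)]]

# Every P^{-1} A(s) P is block diagonal: the off-diagonal corners vanish.
for s in A:
    B = mul(Pinv, mul(A[s], P))
    assert B[0][1] == 0 and B[1][0] == 0
```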

Proof: Let L: S → GL_n(V) denote the representation function. Suppose E is a basis of V such that the associated matrix representation A: S → GL(n,R) has the form

    A(s) = [ A11(s)     0    ]
           [ A21(s)  A22(s) ],    s ∈ S,    (1)

where A11(s) ∈ GL(p,R) and A22(s) ∈ GL(q,R) for some fixed integers p and q, 1 ≤ p < n, p + q = n. The theorem will be proved once we exhibit a fixed matrix

    P = [ P11    0  ]
        [ P21  P22 ]  ∈ GL(n,R)

such that P11 ∈ M_p(R) and

    P⁻¹A(s)P = κ(s) ∔ μ(s),    s ∈ S,

where κ(s) ∈ M_p(R) and μ(s) ∈ M_q(R) (recall the discussion following Theorem 1.5).


Let

    C = Σ_{s∈S} A21(s)A11(s)⁻¹ ∈ M_{q,p}(R),

and set

    P = [   I_p     0  ]
        [ (1/h)C  I_q ]  ∈ GL(n,R).

By block multiplication, we confirm directly from (1) and the equality A(st) = A(s)A(t) that for all s,t ∈ S,

    A11(st) = A11(s)A11(t),
    A22(st) = A22(s)A22(t),    and    (2)
    A21(st) = A21(s)A11(t) + A22(s)A21(t).

From (2), we have

    A21(s)A11(t) = A21(st) − A22(s)A21(t),

and multiplying both sides on the right by A11(t)⁻¹ yields

    A21(s) = A21(st)A11(t)⁻¹ − A22(s)A21(t)A11(t)⁻¹
           = A21(st)A11(st)⁻¹A11(s) − A22(s)A21(t)A11(t)⁻¹.    (3)

Sum both sides of (3) over all t ∈ S to obtain

    hA21(s) = (Σ_{t∈S} A21(st)A11(st)⁻¹)A11(s) − A22(s) Σ_{t∈S} A21(t)A11(t)⁻¹.    (4)

Now as t runs over S, st runs over S, and recalling the definition of C we see that equation (4) becomes

    hA21(s) = CA11(s) − A22(s)C.    (5)

Finally, since

    P⁻¹ = [    I_p     0  ]
          [ −(1/h)C  I_q ],

we compute by block multiplication that

    P⁻¹A(s)P = [    I_p     0  ] [ A11(s)     0    ] [   I_p     0  ]
               [ −(1/h)C  I_q ] [ A21(s)  A22(s) ] [ (1/h)C  I_q ]

             = [                   A11(s)                      0    ]
               [ −(1/h)CA11(s) + A21(s) + (1/h)A22(s)C      A22(s) ]

             = [ A11(s)     0    ]
               [   0     A22(s) ],    s ∈ S    [by (5)].

Thus, in fact, P⁻¹A(s)P = A11(s) ∔ A22(s), s ∈ S. This completes the proof. ∎

The next result, due to I. Schur, is used throughout representation theory.

Theorem 2.2 (Schur's Lemma)
(a) Let Ω ⊂ M_n(R) and Γ ⊂ M_m(R) be irreducible (i.e., not reducible) sets of matrices. If M ∈ M_{m,n}(R) links Ω and Γ, then either M = 0 or m = n and M is nonsingular.
(b) Let S be a groupoid, and let L: S → Hom(V,V) and K: S → Hom(U,U) be irreducible representations of S. If T ∈ Hom(U,V) links L and K, then either T = 0 or dim U = dim V, T is a bijection, and L ~ K.

Proof: (a) To say that M ∈ M_{m,n}(R) links Ω and Γ means that for each A ∈ Ω there exists a B ∈ Γ such that

    MA = BM,    (6)

and for each B ∈ Γ there exists an A ∈ Ω such that (6) holds. Assume that M ≠ 0. [. . .]

[. . .] In terms of matrices, this means that if A: S → GL(n,R) is any associated matrix representation of S and F is any extension field of R, there exists no matrix B ∈ GL(n,F) such that


    B⁻¹A(s)B = [ A11(s)     0    ]
               [ A21(s)  A22(s) ],    s ∈ S,

where A11(s) and A22(s) are square matrices of fixed size. More generally, a set of matrices Ω ⊂ M_n(R) is said to be absolutely irreducible over R if there is no extension field F of R for which there exists a matrix B ∈ GL(n,F) such that

    B⁻¹AB = [ A11    0  ]
            [ A21  A22 ],    A ∈ Ω,

where A11 and A22 are square matrices of fixed size.

Recall that if S is a finite abelian group, then S is the direct sum of cyclic subgroups of prime power order (see Theorem 2.7, Section 4.2). We change the notation slightly to conform to our present treatment. Namely, we write the operation in S multiplicatively instead of additively. Thus S is

the direct product of prime-power-order cyclic subgroups H1, . . . , Hk:

    S = H1 · · · Hk    (19)

(the H_i here correspond to the H_{ij} in the statement of Theorem 2.7, Section 4.2). Let h1, . . . , hk be generators of H1, . . . , Hk, respectively, i.e., H_i = [h_i], i = 1, . . . , k. Then if e is the identity in S and |H_i| = p_i^{e_i}, where p_i is a prime and e_i is a positive integer, we have h_i^{p_i^{e_i}} = e, i = 1, . . . , k.

Let n = p_1^{e_1} · · · p_k^{e_k} = |S| and let L: S → GL(V) be a representation of S. If s = h_1^{m_1} · · · h_k^{m_k} ∈ S, then s^n = (h_1^n)^{m_1} · · · (h_k^n)^{m_k}. Since p_i^{e_i} | n, i = 1, . . . , k, we conclude that s^n = e. Thus L(s)^n = L(s^n) = L(e) = I_V. In other words, the eigenvalues of L(s) are roots of the polynomial λ^n − 1.

Corollary 3 Let S be a finite abelian group of order n = p_1^{e_1} · · · p_k^{e_k}, where p_i is a prime and e_i is a positive integer, i = 1, . . . , k. Let R be a splitting field for the polynomial λ^n − 1 and assume that n·1 ≠ 0 in R. Then any irreducible representation L: S → GL(V), where V is a finite dimensional vector space over R, is in fact of degree 1. Moreover, the number of such pairwise inequivalent irreducible representations of S is precisely n.

Proof: Let L: S → GL(V) be an irreducible representation of S, where V is a finite dimensional vector space over R. Since R is a splitting field for the polynomial λ^n − 1, it follows from the remarks immediately preceding the statement of Corollary 3 that the characteristic polynomial of each L(s) ∈

im L splits in R. Hence by Corollary 2, L is of degree 1; for each s ∈ S, there exists a scalar λ(s) ∈ R such that

    L(s)v = λ(s)v,    v ∈ V.

Every element s ∈ S is obtained precisely once as

    s = h_1^{m_1} h_2^{m_2} · · · h_k^{m_k}    (20)


for 1 ≤ m_i ≤ p_i^{e_i}, i = 1, . . . , k; this is just a consequence of the fact that S is a direct product of the H_i = [h_i]. The values λ(s) for all s ∈ S are thus completely determined by the values λ(h_1), . . . , λ(h_k). Since h_i^{p_i^{e_i}} = e, we must have λ(h_i)^{p_i^{e_i}} = 1, i = 1, . . . , k, so each λ(h_i) is a p_i^{e_i}-th root of 1, and any p_i^{e_i}-th root of 1 is an nth root of 1. Now nλ^{n−1} ≠ 0 and λ^n − 1 have no common factors, so the roots of λ^n − 1 are distinct. Thus formally there are n = p_1^{e_1} · · · p_k^{e_k} ways in which the function λ: S → R can be defined. But of course if two assignments of values to the λ(h_i) are different, then we are obviously dealing with distinct representations (they do not agree on all the h_i, i = 1, . . . , k). The proof is completed by observing that representations of degree 1 are inequivalent if and only if they are distinct. ∎

Example 1 We list all the irreducible representations over C of all abelian groups of order 18. By Example 5, Section 4.2, the nonisomorphic abelian groups of order

18 are Z18 and Z2 ∔ Z3 ∔ Z3. Multiplicatively we have

    S = C18    (21)

or

    S = C2 · C3 · C3,    (22)

where Cp is a cyclic group of order p. In the case (21) we can define

    λ(h) = ξ,

where ξ is any eighteenth root of 1 and h is a generator of C18. In the case (22) we can define

    λ(h_j) = ξ_j,    j = 1, 2, 3,

where ξ_1 = ±1, ξ_2 and ξ_3 are any two cube roots of 1, and h_1, h_2, h_3 are generators of C2, C3, and C3 respectively.
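A quick enumeration of case (22) (code ours, not the text's): the 18 characters λ are indexed by the choices ξ1 ∈ {±1} and ξ2, ξ3 among the cube roots of 1, and each is multiplicative on C2 · C3 · C3.

```python
import cmath
from itertools import product

w = cmath.exp(2j * cmath.pi / 3)          # a primitive cube root of 1
square_roots = [1, -1]                     # choices for xi_1
cube_roots = [1, w, w * w]                 # choices for xi_2, xi_3

# Each character sends the generators (h1, h2, h3) to (xi1, xi2, xi3) and an
# element h1^a h2^b h3^c to xi1^a * xi2^b * xi3^c.
chars = list(product(square_roots, cube_roots, cube_roots))
assert len(chars) == 18                    # exactly |S| = 18 of them

def value(chi, elem):
    (x1, x2, x3), (a, b, c) = chi, elem
    return x1 ** a * x2 ** b * x3 ** c

# Multiplicativity: chi(st) = chi(s)chi(t) for all s, t in C2 x C3 x C3.
elems = list(product(range(2), range(3), range(3)))
chi = chars[5]                             # spot-check one character
for s in elems:
    for t in elems:
        st = ((s[0] + t[0]) % 2, (s[1] + t[1]) % 3, (s[2] + t[2]) % 3)
        assert abs(value(chi, s) * value(chi, t) - value(chi, st)) < 1e-9
```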

Much of what is known about representations of a group depends on the following important result.

Theorem 2.3 (Burnside's Theorem) Let S be a group, V an n-dimensional vector space over R, and L: S → GL_n(V) an absolutely irreducible representation of S. Then

    dim ⟨im L⟩ = n².    (23)

In other words, ⟨im L⟩ = Hom(V,V).

Proof: Let A: S → GL(n,R) be an associated matrix representation of S. The theorem will be proved once we show that ⟨im A⟩ = M_n(R), i.e.,

    dim ⟨im A⟩ = n².    (24)

To simplify notation, we will denote a typical matrix in im A by A. Assume that dim [. . .]

[. . .] q: V → V/W the quotient map. We saw in Section 5.1 that two representations of S can be obtained in terms of L: Define

    L1: S → Hom(W,W)    by    L1(s) = L(s)|W,    s ∈ S,


and define

    L2: S → Hom(V/W, V/W)    by    L2(s)q(v) = q(L(s)v),    s ∈ S, v ∈ V.

If {e_1, . . . , e_n} is a basis of V such that {e_{r+1}, . . . , e_n} is a basis of W, the associated matrix representation of S has the form

    [ A(s)   0   ]
    [  *    B(s) ],    s ∈ S,    (54)

where A(s) ∈ M_r(R) and B(s) ∈ M_{n−r}(R). It was observed immediately prior to Theorem 1.7 that B: S → M_{n−r}(R) is a matrix representation of S associated with L1, and A: S → M_r(R) is a matrix representation of S associated with L2. Now if L1 and L2 are themselves reducible, there exist nonsingular matrices P and Q of appropriate sizes such that

    P⁻¹A(s)P = [ C(s)   0   ]
               [  *    D(s) ],    s ∈ S,

    Q⁻¹B(s)Q = [ E(s)   0   ]
               [  *    F(s) ],    s ∈ S,

and

    (P ∔ Q)⁻¹ [ A(s)   0   ] (P ∔ Q) = [ C(s)   0     0     0   ]
              [  *    B(s) ]           [  *    D(s)   0     0   ]
                                       [  *     *    E(s)   0   ]
                                       [  *     *     *    F(s) ],    s ∈ S.    (55)

Proceeding in this fashion, we finally obtain a matrix U ∈ GL(n,R) such that

    U⁻¹A(s)U = [ A11(s)    0       0    · · ·    0    ]
               [   *     A22(s)    0    · · ·    0    ]
               [   *       *     A33(s) · · ·    0    ]
               [   ·       ·       ·             ·    ]
               [   *       *       *    · · ·  App(s) ],    s ∈ S,    (56)

and for each i = 1, . . . , p the matrices A_ii(s) constitute an irreducible matrix representation of S. In terms of L, we conclude that there exists a basis E of V such that for each s ∈ S, [L(s)]_E^E is the partitioned matrix on the right in (56). Next define a chain of invariant subspaces of V for the representation L as follows: V_p is the space spanned by the basis vectors in E that correspond to the A_pp block; V_{p−1} is the space spanned by the basis vectors in E that correspond to the A_pp and A_{p−1,p−1} blocks; V_{p−2} is the space spanned by the basis

vectors in E that correspond to the A_pp, A_{p−1,p−1}, and A_{p−2,p−2} blocks; and so on. Then clearly

    V_p ⊂ V_{p−1} ⊂ V_{p−2} ⊂ · · · ⊂ V_1 = V.    (57)

The invariant subspace V_p is an irreducible S-module because the matrix representation A_pp of S is irreducible. Moreover, we know that A_{p−1,p−1} is a matrix representation of S associated with the quotient representation of S in V_{p−1}/V_p. Since A_{p−1,p−1} is irreducible, it follows that this quotient representation is irreducible and hence (see Theorem 1.7, Section 5.1) that V_p is a maximal invariant subspace of V_{p−1}. In the same way, for each i = 1, . . . , p−1 the quotient representation of S in V_i/V_{i+1} is irreducible with associated matrix representation A_ii, and V_{i+1} is a maximal invariant subspace of V_i. These representations of S in V_i/V_{i+1}, i = 1, . . . , p (set V_{p+1} = {0} so that V_p/{0} = V_p), are called irreducible constituents or components of L, and a chain of subspaces such as (57) is called a reduction of V (or L). Similarly, the irreducible matrix representations A11, A22, . . . , A_pp of S are called irreducible constituents or components of A, and the matrix (56) is called a reduction of A

into irreducible components. The integer p is called the length of the reduction. If S is a group, the Burnside—Frobenius—Schur theorem (Theorem 2.4) allows us to prove the essential uniqueness of these components in a surprisingly easy way.

Theorem 2.5  Let S be a group, and $V_k$ an $n_k$-dimensional vector space over R for k = 1, …, m. Let $L_k \colon S \to GL(V_k)$, k = 1, …, m, be pairwise inequivalent absolutely irreducible representations of S. Then the characters $\chi_k = \chi_{L_k} \colon S \to R$ are linearly independent elements of the vector space $R^S$ over R.

Proof:  Suppose $c_k \in R$, k = 1, …, m, are scalars such that

$$\sum_{k=1}^{m} c_k \chi_k = 0. \tag{58}$$

Choose a basis for each vector space $V_k$, let $A_k \colon S \to GL(n_k, R)$ denote the matrix representation of S associated with $L_k$ relative to the chosen basis of $V_k$, and set $A_k(s) = [a_{ij}^k(s)]$, $s \in S$. Then for each $s \in S$ we have

5.2    Matrix Representations    455

$$0 = \sum_{k=1}^{m} c_k \chi_k(s) = \sum_{k=1}^{m} c_k \,\mathrm{tr}\, A_k(s) = \sum_{k=1}^{m} c_k \sum_{i=1}^{n_k} a_{ii}^k(s). \tag{59}$$

Now it follows from (the proof of) Theorem 2.4 that the $n_1^2 + \cdots + n_m^2$ "entry functions" $a_{ij}^k \colon S \to R$, i, j = 1, …, $n_k$, k = 1, …, m, are linearly independent in $R^S$. Otherwise there would exist a nonzero matrix K of the form

$$K = K_1 \dotplus \cdots \dotplus K_m, \qquad K_k \in M_{n_k}(R), \quad k = 1, \ldots, m,$$

such that

$$\mathrm{tr}\, KA(s) = \sum_{k=1}^{m} \mathrm{tr}\,(K_k A_k(s)) = \sum_{k=1}^{m} \sum_{i=1}^{n_k} \sum_{j=1}^{n_k} K_{ij}^k a_{ji}^k(s) = 0, \qquad s \in S$$

(where $A(s) = A_1(s) \dotplus \cdots \dotplus A_m(s)$), and this in turn would imply that $\dim \langle \mathrm{im}\, A \rangle < n_1^2 + \cdots + n_m^2$, contradicting Theorem 2.4. Hence the entry functions are linearly independent; reading (59) as a relation in $R^S$, the functions $a_{ii}^k$ occur with coefficients $c_k$, and therefore $c_k = 0$, k = 1, …, m. ∎

Corollary 4  Let S be a group, and V and W proper finite dimensional S-modules with representation functions $L \colon S \to GL(V)$ and $M \colon S \to GL(W)$. Suppose that $\chi_L = \chi_M$. Then dim V = dim W, any reduction of L into absolutely irreducible components has the same length as any reduction of M into absolutely irreducible components, and the components of any two such reductions can be paired off into equivalent pairs.

Proof:  Let bases of V and W be given such that the matrix representations A and κ of S associated with L and M, respectively, have the form

$$A(s) = \begin{bmatrix} A_{11}(s) & 0 & \cdots & 0 \\ * & A_{22}(s) & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ * & * & \cdots & A_{pp}(s) \end{bmatrix}, \qquad s \in S, \tag{60}$$

and


$$\kappa(s) = \begin{bmatrix} \kappa_{11}(s) & 0 & \cdots & 0 \\ * & \kappa_{22}(s) & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ * & * & \cdots & \kappa_{qq}(s) \end{bmatrix}, \qquad s \in S, \tag{61}$$

where the component representations

$$A_{11}, \ldots, A_{pp}, \kappa_{11}, \ldots, \kappa_{qq} \tag{62}$$

are all absolutely irreducible. Let $\tau_1, \ldots, \tau_r$ be the complete list of pairwise inequivalent matrix representations occurring among the representations (62), and let $\chi_i$ be the character of $\tau_i$, i = 1, …, r. Suppose each $\tau_i$ is equivalent to $\delta_i$ of the representations $A_{11}, \ldots, A_{pp}$ and $k_i$ of the representations $\kappa_{11}, \ldots, \kappa_{qq}$. Then

$$\chi_L(s) = \mathrm{tr}\, A(s) = \sum_{i=1}^{p} \mathrm{tr}\, A_{ii}(s) = \sum_{i=1}^{r} \delta_i \,\mathrm{tr}\, \tau_i(s) = \sum_{i=1}^{r} \delta_i \chi_i(s), \qquad s \in S,$$

and similarly

$$\chi_M(s) = \sum_{i=1}^{r} k_i \chi_i(s), \qquad s \in S.$$

Since $\chi_L = \chi_M$, we have

$$\sum_{i=1}^{r} (\delta_i - k_i)\chi_i = 0. \tag{63}$$

The characters $\chi_1, \ldots, \chi_r$ are linearly independent by Theorem 2.5, so that

$$\delta_i = k_i, \qquad i = 1, \ldots, r.$$

This means that the number of representations in $\{A_{11}, \ldots, A_{pp}\}$ equivalent to $\tau_i$ is the same as the number of representations in $\{\kappa_{11}, \ldots, \kappa_{qq}\}$ equivalent to $\tau_i$, i = 1, …, r. It follows that dim V = dim W, p = q, and except for order the components of the representations A and κ are pairwise equivalent. ∎
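A numerical sketch of Corollary 4 for the two-element group (our own encoding, not part of the text): two reduced representations listing the same irreducible constituents in different orders have equal characters, and the multiplicities $\delta_i = k_i$ of each constituent agree.

```python
# Characters determine the irreducible components up to order: two diagonal
# (already reduced) representations of S = {e, s}, s^2 = e, with the same
# constituents in different order have equal characters and equal
# multiplicities of each constituent.
from collections import Counter

# The 1x1 irreducible constituents of S: principal and sign characters.
triv = {'e': 1, 's': 1}
sign = {'e': 1, 's': -1}

# Two reduced representations, given as lists of diagonal constituents.
L = [triv, triv, sign]          # A ~ triv + triv + sign (direct sum)
M = [triv, sign, triv]          # same constituents, different order

def character(components):
    """Character of a direct sum = sum of the component characters."""
    return {g: sum(c[g] for c in components) for g in 'es'}

chi_L, chi_M = character(L), character(M)

def multiplicities(components):
    """Count how often each constituent occurs (the delta_i resp. k_i)."""
    return Counter(tuple(sorted(c.items())) for c in components)
```

Here $\chi_L = \chi_M$ even though the orderings differ, and the multiplicity counters coincide, exactly as the corollary predicts.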

Corollary 5  Let S be a group, and V a proper finite dimensional S-module with representation function $L \colon S \to GL(V)$. Then any two reductions of L into absolutely irreducible components have the same length, and apart from order the components are pairwise equivalent.

Proof:  Apply Corollary 4 with L = M. ∎


Corollary 6  Let S be a group, and V and W two proper finite dimensional S-modules over R. Suppose the respective representations $L \colon S \to GL(V)$ and $M \colon S \to GL(W)$ are absolutely irreducible. Then $L \sim M$ iff $\chi_L = \chi_M$.

Proof:  If $L \sim M$, we know from general considerations that $\chi_L = \chi_M$ [see (52)]. If $\chi_L = \chi_M$, then by Corollary 4 (since the representations L and M are already reduced and the reductions have length 1) we have $L \sim M$. ∎

Corollary 7  Let S be a finite group with |S| = h, R a field in which $h \cdot 1 \ne 0$, and V a proper finite dimensional S-module over R. Denote by $L \colon S \to GL(V)$ the representation function for S. Then there exists a direct sum decomposition (perhaps over an extension field of R) of the S-module V into invariant subspaces $V_1, \ldots, V_p$ of V for the representation L,

$$V = V_1 \dotplus \cdots \dotplus V_p,$$

such that for each k = 1, …, p the representation $L_k = L|V_k$ of S is absolutely irreducible. Moreover, if

$$V = U_1 \dotplus \cdots \dotplus U_q$$

is another such decomposition of V, then p = q and for some $\sigma \in S_p$ we have $\dim V_k = \dim U_{\sigma(k)}$ and $L|V_k \sim L|U_{\sigma(k)}$, k = 1, …, p.

Proof:  Let a basis of V be chosen so that the matrix representation A of S associated with L has the form (60), in which $A_{11}, \ldots, A_{pp}$ are absolutely irreducible components (it may be necessary to go to an extension field of R). For each $s \in S$, let $\bar A(s)$ denote the principal submatrix of A(s) immediately below and to the right of $A_{11}(s)$; this provides a matrix representation $\bar A$ of S. By the proof of Maschke's theorem (Theorem 2.1), we know that $A \sim A_{11} \dotplus \bar A$. Applying the same reasoning to $\bar A$ (or using induction), we obtain

$$A \sim A_{11} \dotplus A_{22} \dotplus \cdots \dotplus A_{pp}. \tag{64}$$

In other words, there exists a basis E of V such that

$$[L(s)]_E = A_{11}(s) \dotplus \cdots \dotplus A_{pp}(s), \qquad s \in S. \tag{65}$$

Now define $V_k$ to be the subspace of V spanned by the vectors in E corresponding to $A_{kk}$, k = 1, …, p. Obviously (65) implies that each $V_k$ is an invariant subspace of V for the representation L. Moreover, since $A_{kk}$ is a matrix representation of S associated with $L_k = L|V_k$, it follows that $L_k$ is absolutely irreducible, k = 1, …, p. The final statement of the result follows from Corollary 5. ∎
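The direct sum decomposition of Corollary 7 can be exhibited numerically. The following sketch (our own, not from the text) uses the averaging idea behind the proof of Maschke's theorem: for the permutation representation of $S_3$ on $R^3$, the operator $P = \frac{1}{|S|}\sum_s A(s)$ is a projection commuting with every $A(s)$, so $V = \mathrm{im}\,P \dotplus \ker P$ is an invariant direct sum.

```python
# Invariant direct sum decomposition for the permutation representation of
# S_3 on R^3, via the averaging operator from the proof of Maschke's theorem.
from fractions import Fraction
from itertools import permutations

n = 3
group = list(permutations(range(n)))          # S_3, |S| = 6

def perm_matrix(p):
    """Permutation matrix A(p) with A(p) e_j = e_{p(j)}."""
    return [[Fraction(1) if i == p[j] else Fraction(0) for j in range(n)]
            for i in range(n)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Averaging operator P = (1/|S|) sum_s A(s); exact rational arithmetic.
h = Fraction(len(group))
P = [[sum(perm_matrix(p)[i][j] for p in group) / h for j in range(n)]
     for i in range(n)]

# P is a projection (P^2 = P) onto the fixed subspace spanned by (1,1,1),
# and it commutes with every A(s), so im P and ker P are both invariant.
is_projection = matmul(P, P) == P
commutes = all(matmul(perm_matrix(p), P) == matmul(P, perm_matrix(p))
               for p in group)
```

Here im P is the one-dimensional trivial constituent and ker P carries the two-dimensional constituent; the decomposition needs no field extension because both components are already absolutely irreducible over the rationals.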

Example 3  Let S be a group with identity e, and let r be a positive integer. The group S is said to have exponent r if $s^r = e$ for all $s \in S$. We shall show in this example that if R is a field in which $r \cdot 1 \ne 0$ and $S < GL(n,R)$ has exponent r, then S is a finite group. In other words, if in a group of nonsingular n-square matrices the r-th power of every matrix is $I_n$, then the group is finite. This result is due to Burnside.

Case 1. S is absolutely irreducible as a set of matrices. If $A \in S$, then since $A^r = I_n$, the eigenvalues of A are roots of the polynomial $\lambda^r - 1$. This polynomial has r roots, and it follows that tr A takes on at most a finite number of distinct values as A ranges over S. In fact, there are r possible choices for each $\lambda_i$ in $\mathrm{tr}\, A = \lambda_1 + \cdots + \lambda_n$, so that tr A takes on at most $r^n$ distinct values as A ranges over S. Now the group S provides an absolutely irreducible matrix representation of itself of degree n, and

so $\dim_R \langle S \rangle = n^2$.

6. Let S be a group (written multiplicatively) and H < S. Then the number of left cosets sH is called the index of H in S. Prove that if $H_1$ and $H_2$ are subgroups of finite index in S, then $H_1 \cap H_2$ is of finite index in S. Hint: We first assert that if $s, t \in S$ are in different left $(H_1 \cap H_2)$-cosets, then they are in different $H_1$-cosets or they are in different $H_2$-cosets. For otherwise $s^{-1}t \in H_1$ and $s^{-1}t \in H_2$, and hence $s^{-1}t \in H_1 \cap H_2$. Let the index of $H_1$ in S be p, and write $S = C_1 \cup \cdots \cup C_p$, a disjoint union of p left $H_1$-cosets. Similarly, let the index of $H_2$ in S be q, and write $S = D_1 \cup \cdots \cup D_q$, a disjoint union of q left $H_2$-cosets. Then for any i = 1, …, p we have

$$C_i = S \cap C_i = (D_1 \cap C_i) \cup \cdots \cup (D_q \cap C_i).$$

Now assume that $H_1 \cap H_2$ is not of finite index in S, so that the number of left $(H_1 \cap H_2)$-cosets in S is certainly greater than pq. Then there exists a sequence of N > pq elements of S, say $s_1, \ldots, s_N$, which lie in different $(H_1 \cap H_2)$-cosets. Each $s_i$ lies in some set $D_{m_i} \cap C_{n_i}$:

$$s_i \in D_{m_i} \cap C_{n_i}, \qquad i = 1, \ldots, N.$$

Suppose that $(m_i, n_i) = (m_j, n_j)$ for $i \ne j$. Then $s_i \in D_{m_i} \cap C_{n_i}$ and $s_j \in D_{m_i} \cap C_{n_i}$.


By the first statement in the present argument, $s_i$ and $s_j$ must lie in the same $(H_1 \cap H_2)$-coset, and this contradicts the choice of the sequence $s_1, \ldots, s_N$. Thus the pairs $(m_1, n_1), \ldots, (m_N, n_N)$ are distinct. But $m_i \in \{1, \ldots, q\}$, $n_i \in \{1, \ldots, p\}$, and hence there are only pq distinct pairs $(m_i, n_i)$ possible, contradicting N > pq.

7. Let $\chi \colon S \to R$ be a character of the groupoid S. Show that $\chi(st) = \chi(ts)$ for all $s, t \in S$. Hint: Let the matrix representation A of S over R afford $\chi$. Then for all $s, t \in S$ we have $\chi(st) = \mathrm{tr}\, A(st) = \mathrm{tr}\, A(s)A(t) = \mathrm{tr}\, A(t)A(s) = \mathrm{tr}\, A(ts) = \chi(ts)$.

8. Let S be a finite group with |S| = h, and let R be a field in which $h \cdot 1 \ne 0$. Let $A \colon S \to GL(n,R)$ be a proper matrix representation of S over R. Prove that there exists a finite algebraic (and hence simple algebraic) extension field F of R such that $A \sim A_1 \dotplus \cdots \dotplus A_p$ over F and the representations $A_1, \ldots, A_p$ of S over F are absolutely irreducible. Hint: Let $\mathfrak{G}$ be the totality of matrices $X \in M_n(R)$ which satisfy XA = AX for all $A \in \mathrm{im}\, A$. If A is not itself absolutely irreducible, there exist some extension field D of R and some $B \in GL(n,D)$ such that for any $A \in \mathrm{im}\, A$ we have

$$B^{-1}AB = \begin{bmatrix} A_{11} & 0 \\ 0 & A_{22} \end{bmatrix},$$

where $A_{11}$ is p × p and $A_{22}$ is q × q (by Maschke's theorem the reduction over D may be taken in direct sum form). Then for any $\alpha, \beta \in D$, $M_{\alpha\beta} = B(\alpha I_p \dotplus \beta I_q)B^{-1}$ commutes with every $A \in \mathrm{im}\, A$. Think of AX − XA = 0, $A \in \mathrm{im}\, A$, as a system of linear homogeneous equations in X (S is finite, so im A is finite). This system of equations has a null space of dimension at least 2 over D and hence over R. It follows that there exists $M \in M_n(R)$ such that $M \ne \gamma I_n$ for any $\gamma \in R$ and MA = AM for all $A \in \mathrm{im}\, A$. Let $\mu$ be an eigenvalue of M, set $W = M - \mu I_n$, and note that $W \ne 0$ and W commutes with every $A \in \mathrm{im}\, A$. Also $R(\mu)$ is a finite algebraic extension of R. Since W is singular, obtain matrices $P, Q \in GL(n, R(\mu))$ such that $PWQ = I_k \dotplus 0$, where $0 < k = \rho(W) < n$. Since WA = AW, it follows by block multiplication from the equation $PAP^{-1} \cdot PWQ = PWQ \cdot Q^{-1}AQ$ that if

$$Q^{-1}AQ = \begin{bmatrix} K_{11} & K_{12} \\ K_{21} & K_{22} \end{bmatrix} = K,$$

then $K_{12} = 0$, for any $A \in \mathrm{im}\, A$. Thus over $R(\mu)$, A is equivalent to the representation K. Then by Maschke's theorem, $K \sim K_{11} \dotplus K_{22}$ over $R(\mu)$, and hence $A \sim K_{11} \dotplus K_{22}$ over $R(\mu)$. Repeat the procedure with $K_{11}$ and $K_{22}$, etc., each time extending the field in question to a finite algebraic extension, until absolutely irreducible components are obtained.
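The phenomenon in Exercise 8 already occurs for cyclic groups. A small computational sketch (our own example, with hypothetical names): the rotation representation $s \mapsto \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}$ of the cyclic group of order 4 is irreducible over the reals, but over the extension field C it is equivalent to the direct sum of the degree-1 representations $s \mapsto i$ and $s \mapsto -i$.

```python
# A representation irreducible over R that splits over an extension field:
# A = [[0,-1],[1,0]] (rotation by 90 degrees) satisfies A^4 = I; over C,
# B^{-1} A B = diag(i, -i), where the columns of B are eigenvectors of A.

def matmul(A, B):
    """2x2 matrix product (works for int or complex entries)."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv2(B):
    """Inverse of a 2x2 complex matrix."""
    det = B[0][0] * B[1][1] - B[0][1] * B[1][0]
    return [[ B[1][1] / det, -B[0][1] / det],
            [-B[1][0] / det,  B[0][0] / det]]

A = [[0, -1], [1, 0]]
A4 = matmul(matmul(A, A), matmul(A, A))       # A has exponent 4

B = [[1, 1], [-1j, 1j]]        # columns: eigenvectors (1, -i) and (1, i)
Ac = [[complex(x) for x in row] for row in A]
D = matmul(inv2(B), matmul(Ac, B))            # should be diag(i, -i)

diagonalized = (abs(D[0][0] - 1j) < 1e-12 and abs(D[1][1] + 1j) < 1e-12
                and abs(D[0][1]) < 1e-12 and abs(D[1][0]) < 1e-12)
```

Over R no such B exists (the eigenvalues ±i are not real), which is exactly why the hint passes to an algebraic extension field.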

9. Prove that conjugacy in a group is an equivalence relation.

10. Prove Theorem 2.8. Hint: (a) Trivial. (b) Clearly $zxz^{-1} = y$ iff $zx^{-1}z^{-1} = y^{-1}$. (c) sX = Xs and tX = Xt imply $Xs^{-1} = s^{-1}X$ and (st)X = s(tX) = s(Xt) = (sX)t = (Xs)t = X(st); so $N_X$ is a group. (d) Let $u_1 N_x, \ldots, u_k N_x$ be all the left cosets of $N_x$ in S. The k conjugates $u_i x u_i^{-1}$ are distinct. Indeed, $u_i x u_i^{-1} = u_j x u_j^{-1}$ implies $u_j^{-1} u_i \in N_x$, i.e., $u_i \in u_j N_x$. Thus the number of distinct conjugates of x is at least k. On the other hand, if $z^{-1}xz$ is any conjugate of x, then $z^{-1}$ must lie in some left coset, say $z^{-1} = u_i y^{-1}$, $y^{-1} \in N_x$. Hence $z^{-1}xz = u_i y^{-1} x y u_i^{-1} = u_i x u_i^{-1}$. Thus there are at most k conjugates of x. (e) Let C be the conjugacy class of x, i.e., C consists of all the distinct conjugates of x. According to part (d), |C| is the index of $N_x$ in S and hence must divide h. (f) $\chi(zxz^{-1}) = \mathrm{tr}\, A(zxz^{-1}) = \mathrm{tr}\, A(z)A(x)A(z)^{-1} = \mathrm{tr}\, A(x) = \chi(x)$.
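Parts (d) and (e) of the hint can be checked by machine for $S_3$ (a sketch with our own permutation encoding, not part of the text): the conjugacy class sizes are 1, 2, 3, each class size equals the index of the corresponding normalizer $N_x$, and each divides h = 6.

```python
# Conjugacy classes of S_3: each class size equals the index [S : N_x] of
# the normalizer of a class representative and divides |S| = 6.
from itertools import permutations

group = list(permutations(range(3)))            # S_3 as tuples

def compose(p, q):
    """(p o q)(i) = p(q(i))."""
    return tuple(p[q[i]] for i in range(3))

def inverse(p):
    q = [0] * 3
    for i, pi in enumerate(p):
        q[pi] = i
    return tuple(q)

def conj_class(x):
    """All conjugates z x z^{-1} of x."""
    return {compose(z, compose(x, inverse(z))) for z in group}

def normalizer(x):
    """N_x = elements commuting with x, as in part (c) of the hint."""
    return [z for z in group if compose(z, x) == compose(x, z)]

classes = {frozenset(conj_class(x)) for x in group}
sizes = sorted(len(c) for c in classes)

# Part (d): number of conjugates of x times |N_x| equals |S|.
index_matches = all(len(conj_class(x)) * len(normalizer(x)) == len(group)
                    for x in group)
```

The three classes are the identity, the three transpositions, and the two 3-cycles, matching the class sizes 1, 3, 2.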

11. Let R be a field of characteristic 0, and suppose p and q are relatively prime integers. Show that if p/q is a root of the monic polynomial $\lambda^n + a_{n-1}\lambda^{n-1} + \cdots + a_1\lambda + a_0 \in R[\lambda]$, in which the $a_i \in R$ are integral multiples of $1 \in R$, then p/q is an integer. Hint: We have

$$\left(\frac{p}{q}\right)^n \cdot 1 + a_{n-1}\left(\frac{p}{q}\right)^{n-1} + \cdots + a_1\frac{p}{q} + a_0 = 0,$$

so that $p^n \cdot 1 + q(a_{n-1}p^{n-1} + a_{n-2}p^{n-2}q + \cdots + a_1 p q^{n-2} + a_0 q^{n-1}) = 0$. Then $p^n \cdot 1 - (qm) \cdot 1 = 0$, where m is an integer, and since R has characteristic 0 this implies $p^n - qm = 0$. It follows that $q \mid p^n$; since q and p have no common factors, we obtain $q = \pm 1$.

12. Show that in a field of characteristic 0, $(p/q)^2 = p^2/q^2$ is an integer iff p/q is an integer. Hint: $p^2/q^2 = m$ means $p^2 \cdot 1 = q^2 m \cdot 1$, or $p^2 - q^2 m = 0$. We can assume p and q are relatively prime; so $q^2 \mid p^2$ implies that $q \mid p$ and hence $q = \pm 1$.

13. The prime subfield $\pi$ of a field R is the smallest subfield of R containing 0 and 1. Prove that if S is a finite group with |S| = h such that $h \cdot 1 \ne 0$ in R, then the absolutely irreducible components of a proper matrix representation A of S over R can be chosen to be over a finite algebraic (and hence simple algebraic) extension of $\pi$. Hint: First reduce A into irreducible components over R; so we may as well start by assuming A itself is irreducible over R. By Theorem 2.7(b), A is equivalent over R to a component $\rho_1$ of the regular representation $\rho$. The entries in the matrices in $\mathrm{im}\, \rho$ are 0 and 1 when S itself is used as a basis of the group algebra $\mathfrak{A}$ of S over R. Hence $\rho_1$ is over $\pi$, since the proof of Maschke's theorem shows that the components of $\rho$ are obtained using an equivalence over the field in which the entries lie. Thus $A \sim \rho_1$ over R, and $\rho_1$ is over $\pi$. Now apply Exercise 8 to conclude that the absolutely irreducible components of $\rho_1$ can be chosen to be over a finite algebraic extension of $\pi$.

Glossary

5.2

absolutely irreducible character, 460
absolutely irreducible representation, 444
absolutely irreducible set of matrices, 445
Burnside's theorem, 446
Burnside-Frobenius-Schur theorem, 450
character of a representation, 452
character orthogonality relationships of the first kind, 462
character orthogonality relationships of the second kind, 472
complete list of absolutely irreducible characters for a finite group, 471
component of a representation, 454
conjugacy classes, 469
conjugate elements, 469
constituent of a representation, 454
degree of a character, 460
degree of a representation, 440
element of finite period, 452
exponent of a group, 458
irreducible character, 460
length of a reduction, 454
Maschke's theorem, 440
modular representation, 460
normalizer, N_x, 469
ordinary representation, 460
prime subfield, 477
principal character, 460
reduction of a representation, 454
scalar product, 460
Schur's lemma, 442
simple character, 460
symmetry, 464

Index

abelian, 17
absolutely irreducible character, 460
absolutely irreducible representation, 444
absolutely irreducible set of matrices, 445
admissible M-group, 115
algebraic extension, 205
algebraic integer, 223
algebraically dependent polynomials, 254
algebraically independent polynomials, 254
alternating group of degree n, A_n, 39
antisymmetric, 10
ascending chain condition (a.c.c.), 158
associate, 180
associative law, 16
associative ring, 18
automorph, 429
automorphism, 32
automorphism group of K, A(K), 265
automorphism group of K over F, A(K|F), 265
auxiliary unit matrix, U_m, 367
axiom of choice, 12
a|b, 179
A(α|β), A[α|β), A(α|β], A[α|β], 289
[A:B], 315
basis, 206
basis of an R-module, 297
bijection, 4
binary, 16
Burnside-Frobenius-Schur Theorem, 450
Burnside's Lemma, 446
b(x)|a(x), 176
canonical homomorphism, 44
canonical injection, 5
carrier, 16
cartesian product, 4
cartesian square, X², 7
category, 30
category of groups, 54
Cauchy index, 38
Cauchy-Binet Theorem, 289
Cauchy's formula, 363
Cayley-Hamilton Theorem, 363
Cayley's Theorem, 32
center, Z(G), 81
centralizer, Z(X), 81
chain, 10
chain of ideals (ascending, descending), 158
character of a representation, 452
character of degree 1, 264
character of H of degree 1, 88
character orthogonality relationships of the first kind, 462
character orthogonality relationships of the second kind, 472
characteristic, char R, 126
characteristic matrix of A, 355
characteristic of a ring, 126
characteristic polynomial of A, 355
characteristic polynomial of an algebraic number, 208
characteristic roots of A, 355
circular permutation, 34
class equation of G, 85
classical groups, 21
closed, 16, 33
codomain, 4
column rank of A, 313
column space of A, 313
commutative, 17
commutative diagrams, 6
commutative ring, 124
commutator of two elements, 80
commutator subgroup, 80
companion matrix, 358
companion matrix of f(x), C(f(x)), 358
complement, Y − X, X^c, 3
complementary sequence, 289
complete list of absolutely irreducible characters of a finite group, 471
complete system of nonassociates, 180
complete system of residues modulo M, 181
completely reducible representation (module), 423
completely symmetric functions, 149
complex, 42
complex numbers, 18
complex symplectic group, Sp_m(C), 22
component matrices, 377
component of a representation, 454
composite, 4
composition series, 73
conjugacy, 39
conjugacy classes, 39, 469
conjugate dual, 427
conjugate elements, 469
conjugate transpose, A*, 22, 427
constant term, 142
constituent of representation, 454
constructible number, 259
constructible number field, 260
content of a polynomial, 194
contragredient, 411
cube of volume 2, 262
cycle decomposition theorem, 36
cycle index polynomial of H, P(H; x_1, …, x_n), 91
cycle of length k, 34
cycle structure, 39
cyclic group, [a], 58
cyclic group generated by a, 58
cyclic invariant subspace, 353
cyclic invariant subspace decomposition, 352
cyclic primary decomposition, 348
cyclic subgroup generated by a, 58
cyclotomic polynomial, 393
De Morgan formulas, 3
decomposable element, 402, 407
Dedekind's Theorem, 298
degree of a character, 460
degree of a representation, 440
degree of an extension, [L:K], 206
degree of f(x), deg f(x), 142
degree theorem, 218, 257
derivative, f′(x), 219
derivative of a matrix, 379
descending chain condition (d.c.c.), 158
determinantal divisor of A, d_k(A), 320
diagonal, 7
diagonal injection, 14
dimension of a vector space, dim M, 311
diophantine system of equations, 329
direct sum, 63, 332
direct sum of matrices, X ∔ Y, 303
direct sum of transformations, T_1 ∔ T_2, 303
discriminant of F, 235
discriminant polynomial, 229
disjoint permutations, 36
distributive laws, 18, 124
dividend, 176
division algorithm, 174
division ring, 18
divisor, 176, 222
divisors of 0, 18
domain, 4
doubly stochastic matrix, 409
dual basis, 398
dual space of V, 398
eigenvalues (e.v.), 355
Eisenstein criterion, 200
element, 1
element of finite order, 58
element of finite period, 452
element of infinite order, 58
elementary column operations, 319
elementary divisors of A, 324
elementary divisors of a torsion submodule, 348
elementary matrix, E_{ij}, E_{ij}(r), E_i(u), 288
elementary row operation, 288
elementary symmetric functions (e.s.f.), 149
empty set, ∅, 2
endomorphism, 32
endomorphism ring, 128
epimorphic, 4
epimorphism, 31
equivalence, 316
equivalence of matrices, 320
equivalence relation, 7
equivalent extensions, 207
equivalent representations, 432
Euclidean algorithm, 219
Euclidean domain, 222
Euclidean space, 427
even permutation, 39
exponent of a group, 458
extension by radicals, 276
exterior product, 408, 409
external direct product, 65
external direct sum of R-modules, 343
f is defined at A, 376
F-R module, 118
factor group, 44
factor ring, 130
factor set, X/R, 8
factor theorem for maps, 10
factors of a series, 73
Feit-Thompson Theorem, 83
Fermat's Theorem, 168
field, 18
field extension, 202
field of fractions, 163
finite dimensional vector space, 310
finite extension, 206
finitely generated ideal, (a_1, …, a_n), 157
finitely generated R-module, 297
fixed field of G, K(G), 265
form of degree m, 228
free abelian group of rank n, 298
free generators of an R-module, 297
free monoid on X, 23
free power series ring over R, R[[X]], 25
free R-module of rank N, 297
free ring on X over R, 24
Frobenius normal form, 360
fully reducible, 65
fully reducible set of matrices, 425
function, 3
Fundamental Theorem of Galois Theory, 273
G-orbits, 35
Galois group of a polynomial, 273
Galois group of K over F, 265
Gauss' Lemma, 194
Gaussian integers, 180
general linear group, GL(n,R), 21; GL(V), 413
greatest common divisor, 182
greatest common right divisor (g.c.r.d.), 302
greatest element, 10
group, 17
group algebra, 414
group generated by A, [A], 48, 66
group generated by {s_1, …, s_k} subject to defining relations, 66
group homomorphism theorem, 57
group isomorphism, S ≅ T, 59
group of rotations of the cube, 92
group with operators, 114
groupoid, 16
groupoid algebra, 414
groupoid ring, 414
groupoid ring of S over R, R[S], 24
H acts on the set X, H:X, 87
Hermite normal form, 292
Hermite's Theorem, 187
Hermitian matrix, 429
Hermitian transformation, 429
Hilbert Basis Theorem, 160
Hilbert Invariant Theorem, 253
homogeneous of degree m, 228
homogeneous system of linear equations, 314
homomorphism, 31
hypercompanion matrix, 367
ideal, 129
ideal generated by X, (X), 157
identity, 5
identity element, 16
identity map, 5
identity matrix, I_n, 21
illegal trisection of an angle, 262
image, im f, 4
incidence matrix, 8, 52
inclusion, X ⊂ Y, 2
independent indeterminates, 147
indeterminate over a ring R, 144
indeterminate with respect to a ring R, 139
index of H in G, [G:H], 43
indexing set, 2
indicator function, 288
indices of A, 292
indices of H, 296
induced homomorphism, 58
infinite extension, 206
injection, 4
inner automorphism, 114
inner product, 427
integers, 17
integers modulo p, 19
integral domain, 18
internal direct product, 63
intersection, X ∩ Y, 2
invariant factors, 324
invariant subset, 421
invariant subspace under T, 359
inverse, 6, 16
inverse image, 4
inversion, 53
irreducibility of det X, 247
irreducible character, 460
irreducible element, 181
irreducible matrix, 364
irreducible representation (module), 422
irreducible transformation, 363
isobaric, 231
isomorphic rings, 127
isomorphic series, 112
isomorphism, 31
isomorphism of R-modules, 339
isomorphism theorem for groups (first), 57
isomorphism theorem for groups (second), 62
isomorphism theorem for groups (third), 60
isomorphism theorem for rings (first), 131
isomorphism theorem for rings (second), 133
isomorphism theorem for rings (third), 134
Jacobson radical, J(R), 193
Jordan normal form, 368
Jordan-Holder Theorem, 75
Jordan-Holder-Schreier Theorem, 113
k-cycle, 34
kernel, ker φ, 46, 129
kernel of C with respect to N, k_C(N), 390
Klein four group, 52
Kronecker product, A ⊗ B, 390
Lagrange's Theorem, 43
leading coefficient, 142, 150
leading term, 142, 150
least common left multiple (l.c.l.m.), 303
least element, 10
left coset, 42
left equivalent, 287
left-hand value of f(x) at t, f_l(t), 173
left ideal, 129
left inverse, 6
left quotient, 176
left remainder, 176
left specialization at t, 173
left substitution at t, 173
legal constructions, 259
length of a reduction, 454
linear algebra, 28
linear associative algebra, 21
linear extension, 351
linear functional, 398
linear map, 115
linear ordering, 10
linear transformation (l.t.), 115, 351
linearly dependent (l.d.), 257, 310
linearly independent (l.i.), 218, 257, 310
linked transformations, 432
list of elementary divisors, 338
local ring, 167
lower bound, 10
lowest terms, 227
M is generated by S, 296
M-group, 114
M-homomorphism, 115
map, 3
mapping, 3
mapping diagram for quotient rings, 130
mapping diagrams, 5
Maschke's Theorem, 440
matching, 4
matrix, 21
matrix A associated with the submodule W, 305
matrix convergence, 326
matrix of relations, 349
matrix product, 21
matrix representation, 413
matrix representation of T, C = [T]_E^E, 353
maximal element, 10
maximal ideal, 135, 166
maximal invariant subspace, 422
maximal left ideal, 435
maximal normal subgroup, 70
member, 1
metacyclic group, 73
minimal element, 10
minimal invariant subspace, 422
minimal left ideal, 435
modular representation, 460
module epimorphism, 346
module homomorphism, 339
module isomorphism, 339
module monomorphism, 346
monic, 142
monoid, 16
monomial, 142
monomorphic, 4
monomorphism, 31
morphisms of a category, 30
mth completely symmetric space over V, 408
mth exterior space over V, 409
multiplicative identity, 16
n-ary operation, 16
n-tuples, 20
natural map induced by R, 9
neutral element, 16
Newton identities, 154
nilpotent element, 191
nilradical, 191
Noetherian ring, 159
nonhomogeneous polynomial associated with h, 248
nonsingular matrix, 312
norm, 222
normal divisor, 44
normal field extension, 270
normal subgroup, 44
normalizer, N_x, 469
nullity of A, n(A), 314
number of elements, |X|, 2
nZ, 19
N, 1
objects of a category, 30
odd permutation, 39
Ω-algebra, 16
operator homomorphism, 115
orbits, 87
order ideal of a, O(a), 332
order of a, 332
order of an element, 58
order of N, 332
ordered pair, 7
ordinary representation, 460
orthogonal matrix, 427
orthogonal matrix representation,