The Theory of Matrices, Volume II [2]


THE THEORY OF MATRICES

F. R. GANTMACHER

VOLUME TWO

1959

PREFACE

The matrix calculus is widely applied nowadays in various branches of mathematics, mechanics, theoretical physics, theoretical electrical engineering, etc. However, neither in the Soviet nor the foreign literature is there a book that gives a sufficiently complete account of the problems of matrix theory and of its diverse applications. The present book is an attempt to fill this gap in the mathematical literature.

The book is based on lecture courses on the theory of matrices and its applications that the author has given several times in the course of the last seventeen years at the Universities of Moscow and Tiflis and at the Moscow Institute of Physical Technology.

The book is meant not only for mathematicians (undergraduates and research students) but also for specialists in allied fields (physics, engineering) who are interested in mathematics and its applications. Therefore the author has endeavoured to make his account of the material as accessible as possible, assuming only that the reader is acquainted with the theory of determinants and with the usual course of higher mathematics within the programme of higher technical education. Only a few isolated sections in the last chapters of the book require additional mathematical knowledge on the part of the reader. Moreover, the author has tried to keep the individual chapters as far as possible independent of each other. For example, Chapter V, Functions of Matrices, does not depend on the material contained in Chapters II and III. At those places of Chapter V where fundamental concepts introduced in Chapter IV are being used for the first time, the corresponding references are given. Thus, a reader who is acquainted with the rudiments of the theory of matrices can immediately begin with reading the chapters that interest him.

The book consists of two parts, containing fifteen chapters. In Chapters I and III, information about matrices and linear operators is developed ab initio and the connection between operators and matrices is introduced. Chapter II expounds the theoretical basis of Gauss's elimination method and certain associated effective methods of solving a system of n linear equations, for large n. In this chapter the reader also becomes acquainted with the technique of operating with matrices that are divided into rectangular 'blocks.'

In Chapter IV we introduce the extremely important 'characteristic' and 'minimal' polynomials of a square matrix, and the 'adjoint' and 'reduced adjoint' matrices.

In Chapter V, which is devoted to functions of matrices, we give the general definition of f(A) as well as concrete methods of computing it, where f(λ) is a function of a scalar argument λ and A is a square matrix. The concept of a function of a matrix is used in §§ 5 and 6 of this chapter for a complete investigation of the solutions of a system of linear differential equations of the first order with constant coefficients. Both the concept of a function of a matrix and this latter investigation of differential equations are based entirely on the concept of the minimal polynomial of a matrix and, in contrast to the usual exposition, do not use the so-called theory of elementary divisors, which is treated in Chapters VI and VII.

These five chapters constitute a first course on matrices and their applications. Very important problems in the theory of matrices arise in connection with the reduction of matrices to a normal form. This reduction is carried out on the basis of Weierstrass' theory of elementary divisors. In view of the importance of this theory we give two expositions in this book: an analytic one in Chapter VI and a geometric one in Chapter VII. We draw the reader's attention to §§ 7 and 8 of Chapter VI, where we study effective methods of finding a matrix that transforms a given matrix to normal form. In § 8 of Chapter VII we investigate in detail the method of A. N. Krylov for the practical computation of the coefficients of the characteristic polynomial.

In Chapter VIII certain types of matrix equations are solved. We also consider here the problem of determining all the matrices that are permutable with a given matrix and we study in detail the many-valued functions of matrices ᵐ√A and ln A.

Chapters IX and X deal with the theory of linear operators in a unitary space and the theory of quadratic and hermitian forms. These chapters do not depend on Weierstrass' theory of elementary divisors and use, of the preceding material, only the basic information on matrices and linear operators contained in the first three chapters of the book. In § 9 of Chapter X we apply the theory of forms to the study of the principal oscillations of a system with n degrees of freedom. In § 11 of this chapter we give an account of Frobenius' deep results on the theory of Hankel forms. These results are used later, in Chapter XV, to study special cases of the Routh-Hurwitz problem.

The last five chapters form the second part of the book [the second volume, in the present English translation]. In Chapter XI we determine normal forms for complex symmetric, skew-symmetric, and orthogonal matrices

and establish interesting connections of these matrices with real matrices of the same classes and with unitary matrices.

In Chapter XII we expound the general theory of pencils of matrices of the form A + λB, where A and B are arbitrary rectangular matrices of the same dimensions. Just as the study of regular pencils of matrices A + λB is based on Weierstrass' theory of elementary divisors, so the study of singular pencils is built upon Kronecker's theory of minimal indices, which is, as it were, a further development of Weierstrass's theory. By means of Kronecker's theory (the author believes that he has succeeded in simplifying the exposition of this theory) we establish in Chapter XII canonical forms of the pencil of matrices A + λB in the most general case. The results obtained there are applied to the study of systems of linear differential equations with constant coefficients.

In Chapter XIII we explain the remarkable spectral properties of matrices with non-negative elements and consider two important applications of matrices of this class: 1) homogeneous Markov chains in the theory of probability and 2) oscillatory properties of elastic vibrations in mechanics. The matrix method of studying homogeneous Markov chains was developed in the book [46] by V. I. Romanovskii and is based on the fact that the matrix of transition probabilities in a homogeneous Markov chain with a finite number of states is a matrix with non-negative elements of a special type (a 'stochastic' matrix). The oscillatory properties of elastic vibrations are connected with another important class of non-negative matrices, the 'oscillation matrices.' These matrices and their applications were studied by M. G. Krein jointly with the author of this book. In Chapter XIII, only certain basic results in this domain are presented. The reader can find a detailed account of the whole material in the monograph [17].

In Chapter XIV we compile the applications of the theory of matrices to systems of differential equations with variable coefficients. The central place (§§ 5-9) in this chapter belongs to the theory of the multiplicative integral (Produktintegral) and its connection with Volterra's infinitesimal calculus. These problems are almost entirely unknown in Soviet mathematical literature. In the first sections and in § 11, we study reducible systems (in the sense of Lyapunov) in connection with the problem of stability of motion; we also give certain results of N. P. Erugin. Sections 9-11 refer to the analytic theory of systems of differential equations. Here we clarify an inaccuracy in Birkhoff's fundamental theorem, which is usually applied to the investigation of the solution of a system of differential equations in the neighborhood of a singular point, and we establish a canonical form of the solution in the case of a regular singular point.

In § 12 of Chapter XIV we give a brief survey of some results of the fundamental investigations of I. A. Lappo-Danilevskii on analytic functions of several matrices and their applications to differential systems.

The last chapter, Chapter XV, deals with the applications of the theory of quadratic forms (in particular, of Hankel forms) to the Routh-Hurwitz problem of determining the number of roots of a polynomial in the right half-plane (Re z > 0). The first sections of the chapter contain the classical treatment of the problem. In § 5 we give the theorem of A. M. Lyapunov in which a stability criterion is set up which is equivalent to the Routh-Hurwitz criterion. Together with the stability criterion of Routh-Hurwitz we give, in § 11 of this chapter, the comparatively little known criterion of Lienard and Chipart, in which the number of determinant inequalities is only about half of that in the Routh-Hurwitz criterion.

At the end of Chapter XV we exhibit the close connection between stability problems and two remarkable theorems of A. A. Markov and P. L. Chebyshev, which were obtained by these celebrated authors on the basis of the expansion of certain continued fractions of special types in series of decreasing powers of the argument. Here we give a matrix proof of these theorems.

This, then, is a brief summary of the contents of this book.

F. R. Gantmacher

PUBLISHERS' PREFACE

The Publishers wish to thank Professor Gantmacher for his kindness in communicating to the translator new versions of several paragraphs of the original Russian-language book.

The Publishers also take pleasure in thanking the VEB Deutscher Verlag der Wissenschaften, whose many published translations of Russian scientific books into the German language include a counterpart of the present work, for their kind spirit of cooperation in agreeing to the use of their formulas in the preparation of the present work.

No material changes have been made in the text in translating the present work from the Russian except for the replacement of several paragraphs by the new versions supplied by Professor Gantmacher. Some changes in the references and in the Bibliography have been made for the benefit of the English-language reader.

CONTENTS

Preface ............ iii
Publishers' Preface ............ v

XI. Complex Symmetric, Skew-Symmetric, and Orthogonal Matrices ............ 1
 § 1. Some formulas for complex orthogonal and unitary matrices ............ 1
 § 2. Polar decomposition of a complex matrix ............ 6
 § 3. The normal form of a complex symmetric matrix ............ 9
 § 4. The normal form of a complex skew-symmetric matrix ............ 12
 § 5. The normal form of a complex orthogonal matrix ............ 18

XII. Singular Pencils of Matrices ............ 24
 § 1. Introduction ............ 24
 § 2. Regular pencils of matrices ............ 25
 § 3. Singular pencils. The reduction theorem ............ 29
 § 4. The canonical form of a singular pencil of matrices ............ 35
 § 5. The minimal indices of a pencil. Criterion for strong equivalence of pencils ............ 37
 § 6. Singular pencils of quadratic forms ............ 40
 § 7. Application to differential equations ............ 45

XIII. Matrices with Non-Negative Elements ............ 50
 § 1. General properties ............ 50
 § 2. Spectral properties of irreducible non-negative matrices ............ 53
 § 3. Reducible matrices ............ 66
 § 4. The normal form of a reducible matrix ............ 74
 § 5. Primitive and imprimitive matrices ............ 80
 § 6. Stochastic matrices ............ 82
 § 7. Limiting probabilities for a homogeneous Markov chain with a finite number of states ............ 87
 § 8. Totally non-negative matrices ............ 98
 § 9. Oscillatory matrices ............ 103

XIV. Applications of the Theory of Matrices to the Investigation of Systems of Linear Differential Equations ............ 113
 § 1. Systems of linear differential equations with variable coefficients. General concepts ............ 113
 § 2. Lyapunov transformations ............ 116
 § 3. Reducible systems ............ 118
 § 4. The canonical form of a reducible system. Erugin's theorem ............ 121
 § 5. The matricant ............ 125
 § 6. The multiplicative integral. The infinitesimal calculus of Volterra ............ 131
 § 7. Differential systems in a complex domain. General properties ............ 135
 § 8. The multiplicative integral in a complex domain ............ 138
 § 9. Isolated singular points ............ 142
 § 10. Regular singularities ............ 148
 § 11. Reducible analytic systems ............ 164
 § 12. Analytic functions of several matrices and their application to the investigation of differential systems. The papers of Lappo-Danilevskii ............ 168

XV. The Problem of Routh-Hurwitz and Related Questions ............ 172
 § 1. Introduction ............ 172
 § 2. Cauchy indices ............ 173
 § 3. Routh's algorithm ............ 177
 § 4. The singular case. Examples ............ 181
 § 5. Lyapunov's theorem ............ 185
 § 6. The theorem of Routh-Hurwitz ............ 190
 § 7. Orlando's formula ............ 196
 § 8. Singular cases in the Routh-Hurwitz theorem ............ 198
 § 9. The method of quadratic forms. Determination of the number of distinct real roots of a polynomial ............ 201
 § 10. Infinite Hankel matrices of finite rank ............ 201
 § 11. Determination of the index of an arbitrary rational fraction by the coefficients of numerator and denominator ............ 208
 § 12. Another proof of the Routh-Hurwitz theorem ............ 216
 § 13. Some supplements to the Routh-Hurwitz theorem. Stability criterion of Lienard and Chipart ............ 220
 § 14. Some properties of Hurwitz polynomials. Stieltjes' theorem. Representation of Hurwitz polynomials by continued fractions ............ 225
 § 15. Domain of stability. Markov parameters ............ 232
 § 16. Connection with the problem of moments ............ 236
 § 17. Theorems of Markov and Chebyshev ............ 240
 § 18. The generalized Routh-Hurwitz problem ............ 248

Bibliography ............ 251
Index ............ 268

CHAPTER XI

COMPLEX SYMMETRIC, SKEW-SYMMETRIC, AND ORTHOGONAL MATRICES

In Volume I, Chapter IX, in connection with the study of linear operators in a euclidean space, we investigated real symmetric, skew-symmetric, and orthogonal matrices, i.e., real square matrices characterized by the relations* S^T = S, K^T = −K, and Q^T = Q^{−1}, respectively (here Q^T denotes the transpose of the matrix Q). We have shown that in the field of complex numbers all these matrices have linear elementary divisors and we have set up normal forms for them, i.e., 'simplest' real symmetric, skew-symmetric, and orthogonal matrices to which arbitrary matrices of the types under consideration are real-similar and orthogonally similar.

The present chapter deals with the investigation of complex symmetric, skew-symmetric, and orthogonal matrices. We shall clarify the question of what elementary divisors these matrices can have and shall set up normal forms for them. These forms have a considerably more complicated structure than the corresponding normal forms in the real case. As a preliminary, we shall establish in the first section interesting connections between complex orthogonal and unitary matrices on the one hand, and real symmetric, skew-symmetric, and orthogonal matrices on the other hand.

§ 1. Some Formulas for Complex Orthogonal and Unitary Matrices

1. We begin with a lemma:

Lemma 1:† 1. If a matrix G is both hermitian and orthogonal (Ḡ = G^T = G^{−1}), then it can be represented in the form

 G = I e^{iK}, (1)

where I is a real symmetric involutory matrix and K a real skew-symmetric matrix permutable with it:

* See [169], pp. 223-225.
† In this and in the following chapters, a matrix denoted by the letter Q is not necessarily orthogonal.

 I² = E, I^T = I; K̄ = K = −K^T; IK = KI. (2)

2. If, in addition, G is a positive-definite hermitian matrix,² then in (1) I = E and

 G = e^{iK}. (3)

Proof. Let

 G = S + iT, (4)

where S and T are real matrices. Then

 Ḡ = S − iT and G^T = S^T + iT^T. (5)

Therefore the equation Ḡ = G^T implies that S = S^T and T = −T^T, i.e., S is symmetric and T skew-symmetric. Moreover, when the expressions for G and Ḡ from (4) and (5) are substituted in the complex equation ḠG = E, it breaks up into two real equations:

 S² + T² = E and ST = TS. (6)

The second of these equations shows that S and T commute. By Theorem 12' of Chapter IX (Vol. I, p. 292), the commuting normal matrices S and T can be carried simultaneously into quasi-diagonal form by a real orthogonal transformation. Therefore³

 S = Q {s₁, s₁, s₂, s₂, …, s_q, s_q, s_{2q+1}, …, s_n} Q^{−1},
 T = Q {t₁J, t₂J, …, t_qJ, 0, …, 0} Q^{−1}   (Q = Q̄ = (Q^T)^{−1}), (7)

where J denotes the 2×2 block

 J = |  0 1 |
     | −1 0 |

and, by (6),

 s_j² − t_j² = 1 (j = 1, 2, …, q),  s_k = ±1 (k = 2q + 1, …, n). (8)

² I.e., G is the coefficient matrix of a positive-definite hermitian form (see Vol. I, Chapter X, § 9).
³ See also the Note following Theorem 12' of Vol. I, Chapter IX (p. 293).


Now it is easy to verify that a matrix of the type

 |   s  it |
 | −it   s |   with s² − t² = 1

can always be represented in the form

 ± e^{iφJ} = ± |  cosh φ   i sinh φ |
               | −i sinh φ  cosh φ |,

where cosh φ = |s| and sinh φ = ±t.
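The relations established in this proof are easy to confirm numerically. A minimal sketch, assuming numpy and scipy are available: starting from a real skew-symmetric K, the matrix G = e^{iK} is hermitian and orthogonal (the case I = E of formula (3)), and its real and imaginary parts satisfy (6).

```python
import numpy as np
from scipy.linalg import expm

# A real skew-symmetric K gives G = e^{iK}: hermitian and orthogonal.
K = np.array([[ 0.0,  0.7, -0.2],
              [-0.7,  0.0,  0.4],
              [ 0.2, -0.4,  0.0]])
G = expm(1j * K)
E = np.eye(3)

assert np.allclose(G, G.conj().T)      # hermitian
assert np.allclose(G @ G.T, E)         # orthogonal

# Split G = S + iT as in (4) and check the conclusions of the proof.
S, T = G.real, G.imag
assert np.allclose(S, S.T)             # S symmetric
assert np.allclose(T, -T.T)            # T skew-symmetric
assert np.allclose(S @ S + T @ T, E)   # S^2 + T^2 = E, cf. (6)
assert np.allclose(S @ T, T @ S)       # ST = TS, cf. (6)
print("relations (4)-(6) verified")
```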

we introduce the corresponding symmetric form

 S^{(j)} = T^{(j)} H^{(j)} [T^{(j)}]^{−1} (j = 1, 2, …, u);

it follows that

 λ_j E^{(j)} + S^{(j)} = T^{(j)} [λ_j E^{(j)} + H^{(j)}] [T^{(j)}]^{−1}.

Therefore, setting

 S̃ = {λ₁E^{(1)} + S^{(1)}, λ₂E^{(2)} + S^{(2)}, …, λ_u E^{(u)} + S^{(u)}}, (47)
 T = {T^{(1)}, T^{(2)}, …, T^{(u)}}, (48)

we have S̃ = TJT^{−1}. Thus S̃ is a symmetric form of J: it is similar to J and has the same elementary divisors (46) as J. This proves the theorem.

Corollary 1. Every square complex matrix A = ‖a_{ik}‖₁ⁿ is similar to a symmetric matrix.

Applying Theorem 4, we obtain:

Corollary 2. Every complex symmetric matrix S = ‖a_{ik}‖₁ⁿ is orthogonally similar to a symmetric matrix with the normal form S̃, i.e., there exists an orthogonal matrix Q such that

 S = Q S̃ Q^{−1}. (49)

The normal form of a complex symmetric matrix has the quasi-diagonal form

 S̃ = {λ₁E^{(1)} + S^{(1)}, λ₂E^{(2)} + S^{(2)}, …},

where the blocks are

 S^{(p)} = ½ [H^{(p)} + H^{(p)T} + i (H^{(p)}F^{(p)} − F^{(p)}H^{(p)})], (61)

with H^{(p)} the block carrying units along the first superdiagonal and F^{(p)} the block carrying units along the secondary diagonal. (For a block of order two, for example, this gives S^{(p)} = ½ | i 1; 1 −i |.)
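A short numerical check of this block formula, assuming numpy (the helper name symmetric_block is illustrative, not the book's): the block is complex symmetric and has the single elementary divisor λⁿ, exactly like the Jordan block H^{(p)} itself.

```python
import numpy as np

def symmetric_block(n):
    """(1/2)[H + H^T + i(HF - FH)] with H the nilpotent Jordan block
    (units on the first superdiagonal) and F the flip matrix (units
    on the secondary diagonal), following formula (61) above."""
    H = np.diag(np.ones(n - 1), k=1)
    F = np.fliplr(np.eye(n))
    return 0.5 * (H + H.T + 1j * (H @ F - F @ H))

n = 5
S = symmetric_block(n)
assert np.allclose(S, S.T)            # complex symmetric

# The ranks of S, S^2, ..., S^n fall by exactly one each time:
# a single Jordan block of order n, i.e. one elementary divisor.
ranks = [np.linalg.matrix_rank(np.linalg.matrix_power(S, k))
         for k in range(1, n + 1)]
print(ranks)    # [4, 3, 2, 1, 0]
```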

§ 4. The Normal Form of a Complex Skew-symmetric Matrix

1. We shall examine what restrictions the skew symmetry of a matrix imposes on its elementary divisors. In this task we shall make use of the following theorem:

Theorem 6: A skew-symmetric matrix always has even rank.

Proof. Let r be the rank of the skew-symmetric matrix K. Then K has r linearly independent rows, say those numbered i₁, i₂, …, i_r; all the remaining rows are linear combinations of these r rows. Since the columns of K are obtained from the corresponding rows by multiplying the elements by −1, every column of K is a linear combination of the columns numbered i₁, i₂, …, i_r. Therefore every minor of order r of K can be represented in the form

 α K (i₁ i₂ … i_r ; i₁ i₂ … i_r),

where α is a constant. Hence it follows that

 K (i₁ i₂ … i_r ; i₁ i₂ … i_r) ≠ 0.

But a skew-symmetric determinant of odd order is always zero. Therefore r is even, and the theorem is proved.

Theorem 7: 1. If λ₀ is a characteristic value of the skew-symmetric matrix K with the corresponding elementary divisors

 (λ − λ₀)^{f₁}, (λ − λ₀)^{f₂}, …, (λ − λ₀)^{f_u},

then −λ₀ is also a characteristic value of K with the same number and the same powers of the corresponding elementary divisors of K:

 (λ + λ₀)^{f₁}, (λ + λ₀)^{f₂}, …, (λ + λ₀)^{f_u}.

2. If zero is a characteristic value of the skew-symmetric matrix K, then in the system of elementary divisors of K all those of even degree corresponding to the characteristic value zero are repeated an even number of times.

Proof. 1. The transposed matrix K^T has the same elementary divisors as K. But K^T = −K, and the elementary divisors of −K are obtained from those of K by replacing the characteristic values λ₁, λ₂, … by −λ₁, −λ₂, …. Hence the first part of our theorem follows.

2. Suppose that to the characteristic value zero of K there correspond d₁ elementary divisors of the form λ, d₂ of the form λ², etc. In general, we denote by d_p the number of elementary divisors of the form λ^p (p = 1, 2, …). We shall show that every d_{2q} (q = 1, 2, …) is an even number.

Since (h(λ))^{m₁} vanishes on the spectrum of Q, it is divisible by the minimal polynomial of Q without remainder. Hence N^{m₁} = O, i.e., N is a nilpotent matrix with m₁ as index of nilpotency. From (80) we find:²³

 N^T = (Q^T − E) P. (81)

²¹ From the fundamental formula (see Vol. I, p. 104)

 g(Q) = Σ_{k=1}^{s} [g(λ_k) Z_{k1} + g′(λ_k) Z_{k2} + ⋯]

it follows that P = Σ_k Z_{k1}.
²² A hermitian operator P is called projective if P² = P. In accordance with this, a hermitian matrix P for which P² = P is called projective. An example of a projective operator P in a unitary space R is the operator of the orthogonal projection of a vector x ∈ R into a subspace S = PR, i.e., Px = x_S, where x_S ∈ S and (x − x_S) ⊥ S (see Vol. I, p. 248).
²³ All the matrices that occur here, P, N, N^T, Q^T = Q^{−1}, are permutable among each other and with Q, since they are all functions of Q.


Let us consider the matrix

 R = N (N^T + 2E). (82)

From (78), (80), and (81) it follows that

 R = NN^T + 2N = (Q − Q^T) P.

From this representation of R it is clear that R is skew-symmetric. On the other hand, from (82),

 R^k = N^k (N^T + 2E)^k (k = 1, 2, …). (83)

But N^T, like N, is nilpotent, and therefore

 |N^T + 2E| ≠ 0.

Hence it follows from (83) that the matrices R^k and N^k have the same rank for every k. Now for odd k the matrix R^k is skew-symmetric and therefore (see p. 12) has even rank. Therefore each of the matrices

 N, N³, N⁵, …

has even rank. By repeating verbatim for N the arguments that were used on p. 13 for K, we may therefore state that among the elementary divisors of N those of the form λ^{2p} are repeated an even number of times. But to each elementary divisor λ^{2p} of N there corresponds an elementary divisor (λ − 1)^{2p} of Q, and vice versa.²⁴ Hence it follows that among the elementary divisors of Q those of the form (λ − 1)^{2p} are repeated an even number of times.

We obtain a similar statement for the elementary divisors of the form (λ + 1)^{2p} by applying what has just been proved to the matrix −Q. Thus, the proof of the theorem is complete.

2. We shall now prove the converse theorem.

²⁴ Since h(1) = 0, h′(1) ≠ 0, in passing from Q to N = h(Q) the elementary divisors of the form (λ − 1)^{2p} of Q do not split and are therefore replaced by elementary divisors λ^{2p} (see Vol. I, Chapter VI, § 7).
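Both facts on which this argument leans, the even rank of a skew-symmetric matrix (Theorem 6) and the pairing λ → −λ of its spectrum (Theorem 7), can be spot-checked numerically; a sketch assuming numpy:

```python
import numpy as np

rng = np.random.default_rng(0)

# Theorem 6: a skew-symmetric matrix always has even rank.
for n in (3, 4, 5, 6, 7):
    X = rng.standard_normal((n, n))
    K = X - X.T                       # K^T = -K
    r = np.linalg.matrix_rank(K)
    assert r % 2 == 0
    print(f"n = {n}: rank {r}")

# Theorem 7: the spectrum is symmetric under lambda -> -lambda.
# For real K the eigenvalues are purely imaginary, so the sorted
# imaginary parts must be symmetric about 0.
X = rng.standard_normal((6, 6))
K = X - X.T
imag = np.sort(np.linalg.eigvals(K).imag)
assert np.allclose(imag, -imag[::-1])
print("spectrum symmetric about 0")
```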

Theorem 10: Every system of powers of the form

 (λ − λ_j)^{p_j}, (λ − λ_j^{−1})^{p_j} (λ_j ≠ 0; j = 1, 2, …, u),
 (λ − 1)^{q₁}, (λ − 1)^{q₂}, …, (λ − 1)^{q_v},
 (λ + 1)^{t₁}, (λ + 1)^{t₂}, …, (λ + 1)^{t_w}
 (q₁, …, q_v, t₁, …, t_w are odd numbers) (84)

is the system of elementary divisors of some complex orthogonal matrix Q.²⁵

Proof. We denote by μ_j the numbers connected with the numbers λ_j (j = 1, 2, …, u) by the equations

 λ_j = e^{μ_j} (j = 1, 2, …, u).

We now introduce the 'canonical' skew-symmetric matrices (see the preceding section)

 K_j (j = 1, 2, …, u); K^{(1)}, …, K^{(v)}; K̃^{(1)}, …, K̃^{(w)},

with the elementary divisors

 (λ − μ_j)^{p_j}, (λ + μ_j)^{p_j} (j = 1, 2, …, u); λ^{q₁}, …, λ^{q_v}; λ^{t₁}, …, λ^{t_w}.

If K is a skew-symmetric matrix, then Q = e^K is orthogonal (Q^T = e^{K^T} = e^{−K} = Q^{−1}). Moreover, to each elementary divisor (λ − μ)^p of K there corresponds an elementary divisor (λ − e^μ)^p of Q.²⁶ Therefore the quasi-diagonal matrix

 Q = {e^{K₁}, …, e^{K_u}; e^{K^{(1)}}, …, e^{K^{(v)}}; −e^{K̃^{(1)}}, …, −e^{K̃^{(w)}}}

is orthogonal and has the elementary divisors (84).
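The two facts on which the construction turns, that Q = e^K is orthogonal for skew-symmetric K and that each characteristic value μ of K goes over into e^μ for Q, are again easy to confirm numerically (a sketch assuming numpy and scipy):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)

# A complex skew-symmetric K: Q = e^K satisfies Q^T = e^{-K} = Q^{-1}.
X = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
K = X - X.T
Q = expm(K)
assert np.allclose(Q @ Q.T, np.eye(4))

# Spectral mapping: the characteristic values of Q are e^{mu}.
mu = np.linalg.eigvals(K)
lam = np.linalg.eigvals(Q)
assert np.allclose(np.sort_complex(np.exp(mu)), np.sort_complex(lam))
print("Q = e^K is orthogonal, with spectrum {e^mu}")
```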

 N^{(u)} = E^{(u)} + λH^{(u)}. (5)

We multiply the second diagonal block on the right-hand side of (4) by J₁^{−1}; it can then be put into the form J + λE by a similarity transformation, where J is a matrix of normal form⁹ and E the unit matrix. We have thus arrived at the following theorem:

Theorem 3: Every regular pencil A + λB can be reduced to a (strictly equivalent) canonical quasi-diagonal form

 {N^{(u₁)}, N^{(u₂)}, …, N^{(u_s)}, J + λE}, (6)

where the first s diagonal blocks correspond to infinite elementary divisors μ^{u₁}, …, μ^{u_s} of the pencil A + λB and where the normal form of the last block, J + λE, is uniquely determined by the finite elementary divisors of the given pencil.

Denote the columns of A₁ and B₁ by a_k and b_k:

 A₁ = (a₁, a₂, …, a_{n−ε−1}), B₁ = (b₁, b₂, …, b_{n−ε−1}).

Then the matrix equation (22) can be replaced by a system of scalar equations that expresses the equality of the elements of the k-th column on the right-hand and left-hand sides of (22) (k = 1, 2, …, n − ε − 1):¹¹

¹¹ This follows from the fact that the rank of the matrix (20) for k = ε − 1 is equal to εn; a similar equation holds for the rank of the matrix M[L_ε].


 x_{2k} + λx_{1k} = d_{1k} + λf_{1k} + y₁a_k + λy₁b_k,
 x_{3k} + λx_{2k} = d_{2k} + λf_{2k} + y₂a_k + λy₂b_k,
 .......................................................
 x_{ε+1,k} + λx_{εk} = d_{εk} + λf_{εk} + y_ε a_k + λy_ε b_k; (23)

 y₁a_k − y₂b_k = f_{2k} − d_{1k},
 y₂a_k − y₃b_k = f_{3k} − d_{2k},
 .......................................................
 y_{ε−1}a_k − y_ε b_k = f_{εk} − d_{ε−1,k} (k = 1, 2, …, n − ε − 1). (24)

If (24) holds, then the required elements of X can obviously be determined from (23). It now remains to show that the system of equations (24) for the elements of Y always has a solution for arbitrary d_{ik} and f_{ik} (i = 1, 2, …, ε; k = 1, 2, …, n − ε − 1). Indeed, the matrix formed from the coefficients of the unknown elements of the rows y₁, −y₂, y₃, −y₄, … can be written, after transposition, in the form

 | A₁  0  …  0  |
 | B₁  A₁ …  0  |
 | 0   B₁ …  …  |
 | …   …  …  A₁ |
 | 0   0  …  B₁ |.

Therefore by Theorem 4 the given pencil can be transformed into the form

 | L_{ε₁}  O        |
 | O       A₁ + λB₁ |,

where the equation (A₁ + λB₁)x = o has no solution x(λ) of degree less than ε₁. If this equation has a non-zero solution of minimal degree ε₂ (where, necessarily, ε₂ ≥ ε₁), then by applying Theorem 4 to the pencil A₁ + λB₁ we can transform the given pencil into the form

 | L_{ε₁}  O       O        |
 | O       L_{ε₂}  O        |
 | O       O       A₂ + λB₂ |.

Continuing this process, we can put the given pencil into the quasi-diagonal form

 {L_{ε₁}, L_{ε₂}, …, L_{ε_p}, A_p + λB_p}, (25)

where 0 < ε₁ ≤ ε₂ ≤ ⋯ ≤ ε_p and the equation (A_p + λB_p)x = o has no non-zero solution, so that the columns of A_p + λB_p are linearly independent.¹² If the rows of A_p + λB_p are linearly dependent, then the transposed pencil A_p^T + λB_p^T can be put into the form (25), where instead of ε₁, ε₂, …, ε_p there occur the numbers 0 < η₁ ≤ η₂ ≤ ⋯ ≤ η_q.

¹² A zero value ε = 0 is absent among these minimal indices: if the columns of A_p + λB_p were linearly dependent, a non-zero solution of degree 0 would exist.
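The role of the minimal index ε can be made concrete on the block L_ε itself. In the sketch below (numpy assumed; the helper L_eps and the stacked coefficient matrix are an illustration of the rank argument, not the book's own computation), the pencil equation (A + λB)x(λ) = 0 has the degree-ε solution x(λ) = (1, −λ, …, (−λ)^ε) and none of lower degree:

```python
import numpy as np

def L_eps(eps):
    """The canonical eps x (eps+1) block: A + lambda*B with the rows
    (lambda, 1) marching down the diagonal."""
    A = np.eye(eps, eps + 1, k=1)     # constant term
    B = np.eye(eps, eps + 1, k=0)     # coefficient of lambda
    return A, B

eps = 3
A, B = L_eps(eps)

# x(lambda) = (1, -lambda, ..., (-lambda)^eps) solves the pencil
# equation identically in lambda.
for lam in (0.0, 1.5, -2.0, 3.7):
    x = np.array([(-lam) ** k for k in range(eps + 1)])
    assert np.allclose((A + lam * B) @ x, 0.0)

# For a candidate solution of degree d = eps - 1, comparing the
# coefficients of the powers of lambda gives the stacked system
# A x_0 = 0, A x_j + B x_{j-1} = 0, B x_d = 0.  Its matrix is
# square of full rank, so only the trivial solution exists.
d = eps - 1
M = np.zeros(((d + 2) * eps, (d + 1) * (eps + 1)))
for j in range(d + 1):
    M[j * eps:(j + 1) * eps, j * (eps + 1):(j + 1) * (eps + 1)] = A
    M[(j + 1) * eps:(j + 2) * eps, j * (eps + 1):(j + 1) * (eps + 1)] = B
assert np.linalg.matrix_rank(M) == (d + 1) * (eps + 1)
print("minimal index of L_eps:", eps)
```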
i.e., (73). In that case we can take arbitrary functions of t for the unknown functions z₁, z₂, …, z_g that form the corresponding columns.

2) The system (66) is of the form (74) or, more explicitly,²⁰

 dz₁/dt + z₂ = f₁(t), dz₂/dt + z₃ = f₂(t), …, dz_ε/dt + z_{ε+1} = f_ε(t). (75)

Such a system is always consistent. If we take for z_{ε+1}(t) an arbitrary function of t, then all the remaining unknown functions z_ε, z_{ε−1}, …, z₁ can be determined from (75) by successive quadratures.

3) The system (67) is of the form (76) or, more explicitly,²¹

 dz₁/dt = f₁(t), dz₂/dt + z₁ = f₂(t), …, dz_η/dt + z_{η−1} = f_η(t), z_η = f_{η+1}(t). (77)

From all the equations (77) except the first we determine z_η, z_{η−1}, …, z₁ uniquely:

 z_η = f_{η+1},
 z_{η−1} = f_η − df_{η+1}/dt,
 .......................................................
 z₁ = f₂ − df₃/dt + d²f₄/dt² − ⋯ + (−1)^{η−1} d^{η−1}f_{η+1}/dt^{η−1}. (78)

Substituting this expression for z₁ into the first equation, we obtain the condition for consistency:

 f₁ − df₂/dt + d²f₃/dt² − ⋯ + (−1)^η d^η f_{η+1}/dt^η = 0. (79)

²⁰ We have changed the indices of z and f to simplify the notation. In order to return from (75) to (66) we have to replace ε by ε_{g+i} and to add to each index of z the number g + ε_{g+1} + ⋯ + ε_{g+i−1} + i − 1, and to each index of f the number h + ε_{g+1} + ⋯ + ε_{g+i−1}.
²¹ Here, as in the preceding case, we have changed the notation. See the preceding footnote.
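For case 2) the phrase 'successive quadratures' translates directly into computation. A rough numerical sketch assuming numpy and scipy (the grid, trapezoidal quadrature, and sample data are illustrative choices): pick z_{ε+1}(t) freely, then integrate dz_i/dt = f_i − z_{i+1} downward from i = ε to i = 1.

```python
import numpy as np
from scipy.integrate import cumulative_trapezoid

eps = 2
t = np.linspace(0.0, 1.0, 2001)
f = [np.sin(t), np.cos(t)]         # sample right-hand sides f_1, f_2
z = [None] * (eps + 2)             # 1-based slots z[1] ... z[eps+1]
z[eps + 1] = t ** 2                # arbitrary choice of z_{eps+1}(t)

# Successive quadratures, as in (75): dz_i/dt = f_i - z_{i+1},
# here with the initial values z_i(0) = 0.
for i in range(eps, 0, -1):
    z[i] = cumulative_trapezoid(f[i - 1] - z[i + 1], t, initial=0.0)

# Residual of the first equation, checked by finite differences.
res = np.gradient(z[1], t) + z[2] - f[0]
print("max residual:", np.max(np.abs(res)))   # small, up to grid error
```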


4) The system (68) is of the form (80) or, more explicitly,

 z₁ + dz₂/dt = f₁(t), z₂ + dz₃/dt = f₂(t), …, z_{u−1} + dz_u/dt = f_{u−1}(t), z_u = f_u(t). (81)

Hence we determine successively the unique solutions z_u, z_{u−1}, …, z₁:

 z_u = f_u, z_{u−1} = f_{u−1} − df_u/dt, …, z₁ = f₁ − df₂/dt + ⋯ + (−1)^{u−1} d^{u−1}f_u/dt^{u−1}. (82)

5) The system (69) is of the form

 dz/dt + Jz = f(t). (83)

As we have proved in Vol. I, Chapter V, § 5, the general solution of such a system has the form

 z = e^{−Jt} z₀ + ∫₀ᵗ e^{−J(t−τ)} f(τ) dτ; (84)

here z₀ is a column matrix with arbitrary elements (the initial values of the unknown functions for t = 0).

The inverse transition from the system (61) to (59) is effected by the formulas (60) and (62), according to which each of the functions x₁, …, x_n is a linear combination of the functions z₁, …, z_n, and each of the functions f₁(t), …, f_m(t) is expressed linearly (with constant coefficients) in terms of the functions f̃₁(t), …, f̃_m(t).

CHAPTER XIII

MATRICES WITH NON-NEGATIVE ELEMENTS

§ 1. General Properties

Definition 1: A rectangular matrix A = ‖a_{ik}‖ is called non-negative (A ≥ 0) or positive (A > 0) if all the elements of A are non-negative (a_{ik} ≥ 0) or positive (a_{ik} > 0).

Definition 2: A square matrix A = ‖a_{ik}‖₁ⁿ is called reducible if the index set 1, 2, …, n can be split into two complementary sets (without common indices) i₁, i₂, …, i_μ; k₁, k₂, …, k_ν (μ + ν = n) such that

 a_{i_α k_β} = 0 (α = 1, 2, …, μ; β = 1, 2, …, ν).

Otherwise the matrix is called irreducible.

By a permutation of a square matrix A = ‖a_{ik}‖₁ⁿ we mean a permutation of the rows of A combined with the same permutation of the columns.

The definition of a reducible matrix and an irreducible matrix can also be formulated as follows:

Definition 2': A matrix A = ‖a_{ik}‖₁ⁿ is called reducible if there is a permutation that puts it into the form

 | B  O |
 | C  D |,

where B and D are square matrices. Otherwise A is called irreducible.


Suppose that A = ‖a_{ik}‖₁ⁿ corresponds to a linear operator A in an n-dimensional vector space R with the basis e₁, e₂, …, e_n. To a permutation of A there corresponds a renumbering of the basis vectors, i.e., a transition from the basis e₁, e₂, …, e_n to a new basis e′₁ = e_{j₁}, e′₂ = e_{j₂}, …, e′_n = e_{j_n}, where (j₁, j₂, …, j_n) is a permutation of the indices 1, 2, …, n. The matrix A then goes over into a similar matrix Ã = T^{−1}AT. (Each row and each column of the transforming matrix T contains a single element 1, and the remaining elements are zero.)
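In matrix terms the renumbering looks as follows; a small sketch assuming numpy (the matrix and the permutation are chosen for illustration):

```python
import numpy as np

A = np.array([[1, 0, 2],
              [3, 4, 5],
              [6, 0, 7]], dtype=float)

perm = [0, 2, 1]               # renumbering of the basis vectors
T = np.eye(3)[:, perm]         # a single 1 in each row and column
assert np.allclose(np.linalg.inv(T), T.T)

# A permutation of A = the same renumbering of rows and columns:
At = T.T @ A @ T               # T^{-1} A T
assert np.allclose(At, A[np.ix_(perm, perm)])

# Here it exposes reducibility: zeros fill the upper right block.
print(At)    # [[1. 2. 0.], [6. 7. 0.], [3. 5. 4.]]
```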

2. By a ν-dimensional coordinate subspace of R we mean a subspace of R with a basis e_{k₁}, e_{k₂}, …, e_{k_ν} (1 ≤ k₁ < k₂ < ⋯ < k_ν ≤ n). It is easy to see that A is reducible if and only if the operator A has a proper coordinate subspace invariant under A.

Lemma 1: If A ≥ 0 is an irreducible matrix, then

 (E + A)^{n−1} > 0. (1)

Proof. For the proof of the lemma it is sufficient to show that for every vector¹ (i.e., column) y ≥ o (y ≠ o) the inequality

 (E + A)^{n−1} y > o

holds. This inequality will be established if we can only show that under the conditions y ≥ o and y ≠ o the vector z = (E + A)y always has fewer zero coordinates than y does. Let us assume the contrary. Then y and z have the same zero coordinates.² Without loss of generality we may assume that the columns y and z have the form³

¹ Here and throughout this chapter we mean by a vector a column of n numbers. In this way we identify, as it were, a vector with the column of its coordinates in that basis in which the given matrix A = ‖a_{ik}‖₁ⁿ determines a certain linear operator.
² Here we start from the fact that z = y + Ay and Ay ≥ o; therefore to positive coordinates of y there correspond positive coordinates of z.
³ The columns y and z can be brought into this form by means of a suitable renumbering of the coordinates (the same for y and z).

 y = | u |,  z = | v |   (u > o, v > o),
     | o |       | o |

where the columns u and v are of the same dimension. Setting

 A = | A₁₁  A₁₂ |
     | A₂₁  A₂₂ |,

we have z = y + Ay, and hence

 A₂₁ u = o.

Since u > o, it follows that

 A₂₁ = 0.

This equation contradicts the irreducibility of A. Thus the lemma is proved.

We introduce the powers of A:

 A^q = ‖a_{ik}^{(q)}‖₁ⁿ (q = 1, 2, …).

Then the lemma has the following corollary:

Corollary: If A ≥ 0 is an irreducible matrix, then for every index pair i, k (1 ≤ i, k ≤ n) there exists a positive integer q such that

 a_{ik}^{(q)} > 0. (2)

Moreover, q can always be chosen within the bounds

 q ≤ m − 1 if i ≠ k,  q ≤ m if i = k, (3)

where m is the degree of the minimal polynomial ψ(λ) of A.

For let r(λ) denote the remainder on dividing (1 + λ)^{n−1} by ψ(λ). Then by (1) we have r(A) > 0. Since the degree of r(λ) is less than m, it follows from this inequality that for arbitrary i, k (1 ≤ i, k ≤ n) at least one of the non-negative numbers

 δ_{ik}, a_{ik}, a_{ik}^{(2)}, …, a_{ik}^{(m−1)}

is not zero. Since δ_{ik} = 0 for i ≠ k, the first of the relations (3) follows.


The other relation in (3) (for i = k) is obtained similarly if the inequality r(A) > 0 is replaced by Ar(A) > 0.⁴

Note. This corollary of the lemma shows that in (1) the number n − 1 can be replaced by m − 1.
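Lemma 1 also works as a practical irreducibility test, since the converse holds as well: for a non-negative A, positivity of (E + A)^{n−1} forces irreducibility (a standard fact, though not stated at this point of the text). A sketch assuming numpy:

```python
import numpy as np

def is_irreducible(A):
    """Test a non-negative square matrix via Lemma 1:
    A is irreducible iff (E + A)^{n-1} > 0."""
    n = A.shape[0]
    P = np.linalg.matrix_power(np.eye(n) + A, n - 1)
    return bool((P > 0).all())

# One cycle through all the indices: irreducible.
C = np.array([[0, 1, 0],
              [0, 0, 1],
              [1, 0, 0]], dtype=float)
print(is_irreducible(C))         # True

# a_{12} = 0 splits the indices {1} | {2}: reducible (Definition 2).
R = np.array([[1, 0],
              [1, 1]], dtype=float)
print(is_irreducible(R))         # False
```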

§ 2. Spectral Properties of Irreducible Non-negative Matrices

1. In 1907 Perron found a remarkable property of the spectra (i.e., the characteristic values and characteristic vectors) of positive matrices.⁵

Theorem 1 (Perron): A positive matrix A = ‖a_{ik}‖₁ⁿ always has a real and positive characteristic value r which is a simple root of the characteristic equation and exceeds the moduli of all the other characteristic values. To this 'maximal' characteristic value r there corresponds a characteristic vector z = (z₁, z₂, …, z_n) of A with positive coordinates z_i > 0 (i = 1, 2, …, n).⁶

A positive matrix is a special case of an irreducible non-negative matrix. Frobenius⁷ has generalized Perron's theorem by investigating the spectral properties of irreducible non-negative matrices.

Theorem 2 (Frobenius): An irreducible non-negative matrix A = ‖a_{ik}‖₁ⁿ always has a positive characteristic value r that is a simple root of the characteristic equation. The moduli of all the other characteristic values do not exceed r. To the 'maximal' characteristic value r there corresponds a characteristic vector with positive coordinates.

Moreover, if A has h characteristic values λ₀ = r, λ₁, …, λ_{h−1} of modulus r, then these numbers are all distinct and are roots of the equation

 λ^h − r^h = 0. (4)

More generally: The whole spectrum λ₀, λ₁, …, λ_{n−1} of A, regarded as a system of points in the complex λ-plane, goes over into itself under a rotation through the angle 2π/h.

⁴ The product of an irreducible non-negative matrix and a positive matrix is itself positive.
⁵ See [316], [317], and [17], p. 100.
⁶ Since r is a simple characteristic value, the characteristic vector z belonging to it is determined to within a scalar factor; by Perron's theorem all the coordinates of z are positive.
⁸ In particular, C ≥ 0 (C > 0) means that all the elements of C are non-negative (positive). Furthermore, we denote by C⁺ the matrix mod C which arises from C when all the elements are replaced by their moduli.

2. Proof of Frobenius' Theorem:⁹ Let x = (x₁, x₂, …, x_n) (x ≥ o, x ≠ o) be a fixed real vector. We set

 r_x = min_{x_i ≠ 0} (Ax)_i / x_i;

r_x is the largest number ρ for which ρx ≤ Ax. When x is multiplied by a number λ > 0, the value of r_x does not change. Therefore, in the computation of the maximum of r_x we can restrict ourselves to the closed set M of vectors x for which

 x ≥ o and (x, x) = 1.

But r_x may have discontinuities at the boundary points of M at which one of its coordinates vanishes. Therefore, we introduce in place of M the set N of all the vectors y of the form

 y = (E + A)^{n−1} x (x ∈ M).

The set N, like M, is bounded and closed and by Lemma 1 consists of positive vectors only. Moreover, when we multiply both sides of the inequality r_x x ≤ Ax by (E + A)^{n−1} > 0, we obtain

 r_x y ≤ Ay (y = (E + A)^{n−1} x).

Hence, from the definition of r_y, we have r_x ≤ r_y. Therefore in the computation of the maximum of r_x we can replace M by the set N, which consists of positive vectors only. On the bounded and closed set N the function r_x is continuous and therefore assumes a largest value

 r = max_x r_x (7)

for some vector z > o. Every vector z ≥ o for which

 r_z = r (8)

will be called extremal.
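The max-min mechanism of this proof can be watched numerically. A sketch assuming numpy (the positive test matrix is arbitrary): r_x never exceeds the maximal characteristic value r, and the positive extremal vector attains it.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.random((4, 4)) + 0.1          # positive, hence irreducible

def r_x(x):
    """min over the positive coordinates of (Ax)_i / x_i."""
    x = np.asarray(x, dtype=float)
    m = x > 0
    return np.min((A @ x)[m] / x[m])

ev, V = np.linalg.eig(A)
k = np.argmax(ev.real)
r = ev[k].real                        # maximal characteristic value
z = np.abs(V[:, k].real)              # the positive extremal vector

for _ in range(1000):                 # r_x <= r for every x >= o
    assert r_x(rng.random(4)) <= r + 1e-12
assert np.isclose(r_x(z), r)          # the extremal vector attains r
print("max over x of r_x =", r)
```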


We shall now show that: 1) the number r defined by (7) is positive and is a characteristic value of A; 2) every extremal vector z is positive and is a characteristic vector of A for the characteristic value r, i.e.,

 r > 0, z > o, Az = rz. (9)

Indeed, for the vector u = (1, 1, …, 1) we have r_u = min_i Σ_k a_{ik} > 0, because no row of an irreducible matrix can consist of zeros only. Therefore r > 0, since r ≥ r_u. Now let

 x = (E + A)^{n−1} z. (10)

Then, by Lemma 1, x > o. Suppose that Az − rz ≠ o. Then by (1), (8), and (10) we obtain successively:

 Az − rz ≥ o, (E + A)^{n−1}(Az − rz) > o, Ax − rx > o.

The last inequality contradicts the definition of r, because it would imply that Ax − (r + ε)x > o for sufficiently small ε > 0, i.e., r_x ≥ r + ε > r. Therefore Az = rz. But then

 o < x = (E + A)^{n−1} z = (1 + r)^{n−1} z,

so that z > o.

We shall now show that the moduli of all the characteristic values do not exceed r. Every characteristic vector y = (y₁, y₂, …, y_n) of A for the characteristic value r has, by what precedes, y_i ≠ 0 (i = 1, 2, …, n). Hence it follows that only one characteristic direction corresponds to this characteristic value; for if there were two linearly independent characteristic vectors z and z₁, we could choose numbers c and d such that the characteristic vector y = cz + dz₁ has a zero coordinate, and by what we have shown this is impossible.¹⁰

¹⁰ Regarding the notation y⁺, see p. 54.


We now consider the adjoint matrix of the characteristic matrix λE − A:

 B(λ) = ‖B_{ik}(λ)‖₁ⁿ = Δ(λ) (λE − A)^{−1},

where Δ(λ) is the characteristic polynomial of A and B_{ik}(λ) the algebraic complement of the element λδ_{ki} − a_{ki} in the determinant Δ(λ). From the fact that only one characteristic vector z = (z₁, z₂, …, z_n) with z₁ > 0, z₂ > 0, …, z_n > 0 corresponds to the characteristic value r (apart from a factor) it follows that B(r) ≠ 0 and that in every non-zero column of B(r) all the elements are different from zero and are of the same sign. The same is true for the rows of B(r), since in the preceding argument A can be replaced by the transposed matrix A^T. From these properties of the rows and columns of B(r) it follows that all the B_{ik}(r) (i, k = 1, 2, …, n) are different from zero and are of the same sign σ. Therefore

 σ Δ′(r) = σ Σ_{i=1}^{n} B_{ii}(r) > 0.

We define a diagonal matrix D by the equation

 D = {e^{iθ₁}, e^{iθ₂}, …, e^{iθ_n}}.

Then y = Dy⁺. Substituting this expression for y in (17) and then setting λ = re^{iφ}, we find easily:

 Fy⁺ = ry⁺, (21)

where

 F = e^{−iφ} D^{−1} C D. (22)

Comparing (19) with (21), we obtain

 Fy⁺ = Ay⁺. (23)

But by (22) and (20),

 F⁺ = C⁺ = A.


Therefore we find from (23):

 Fy⁺ = F⁺y⁺.

Since y⁺ > o, this equation can hold only if F = F⁺, i.e.,

 e^{−iφ} D^{−1} C D = A.

Hence

 C = e^{iφ} D A D^{−1}, (24)

and the Lemma is proved.

4. We return to Frobenius' theorem and apply the lemma to an irreducible matrix A ≥ 0 that has precisely h characteristic values of maximal modulus r:

 λ₀ = re^{iφ₀}, λ₁ = re^{iφ₁}, …, λ_{h−1} = re^{iφ_{h−1}} (0 = φ₀ < φ₁ < ⋯ < φ_{h−1} < 2π). (25)

To the characteristic value λ_k there corresponds a characteristic vector y with

 y = D_k y⁺ (y⁺ = z > o), (26)

where D_k is a diagonal matrix whose diagonal elements have modulus 1 (k = 0, 1, …, h − 1).


Furthermore, from (24) it follows that

 A = e^{iφ_k} D_k A D_k^{−1} (k = 0, 1, …, h − 1).
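The rotational symmetry of the spectrum asserted in Theorem 2 can be seen on a concrete imprimitive matrix; a closing sketch assuming numpy (the blocks are chosen arbitrarily, with h = 3):

```python
import numpy as np

# An irreducible non-negative matrix in cyclic form with h = 3 blocks.
B1, B2, B3 = np.ones((2, 2)), 2 * np.ones((2, 2)), 3 * np.ones((2, 2))
Z = np.zeros((2, 2))
A = np.block([[Z,  B1, Z ],
              [Z,  Z,  B2],
              [B3, Z,  Z ]])

ev = np.linalg.eigvals(A)
rot = ev * np.exp(2j * np.pi / 3)     # rotate the spectrum by 2*pi/h

# The two spectra coincide as point sets in the lambda-plane.
ev_s = np.sort_complex(np.round(ev, 8))
rot_s = np.sort_complex(np.round(rot, 8))
print(np.allclose(ev_s, rot_s))       # True
```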