
GRADUATE STUDIES IN MATHEMATICS 232

Linear Algebra in Action
Third Edition

Harry Dym

Dedicated to the memory of: Irene Lillian Dym, special friend for nigh onto 67 years, and to the memory of our two oldest sons and our first granddaughter, who were recalled prematurely for no apparent reason: Jonathan Carol Dym, and he but 44, David Loren Dym, and he but 57, while playing basketball in the gym named for his older brother Jonathan and his daughter Avital, Avital Chana Dym, and she but 12. Yhi zichram baruch

EDITORIAL COMMITTEE
Matthew Baker, Marco Gualtieri, Gigliola Staffilani (Chair), Jeff A. Viaclovsky, Rachel Ward

2020 Mathematics Subject Classification. Primary 15-01, 30-01, 34-01, 39-01, 46E22, 47B32, 52-01, 93-01.

For additional information and updates on this book, visit www.ams.org/bookpages/gsm-232

Library of Congress Cataloging-in-Publication Data
Names: Dym, H. (Harry), 1938- author.
Title: Linear algebra in action / Harry Dym.
Description: Third edition. | Providence, Rhode Island : American Mathematical Society, [2023] | Series: Graduate studies in mathematics, 1065-7339 ; Volume 232 | Includes bibliographical references and index.
Identifiers: LCCN 2023008107 | ISBN 9781470472061 (hardcover) | ISBN 9781470474195 (paperback) | ISBN 9781470474188 (ebook)
Subjects: LCSH: Algebras, Linear. | AMS: Linear and multilinear algebra; matrix theory. | Functions of a complex variable. | Ordinary differential equations. | Difference and functional equations. | Functional analysis – Hilbert spaces with reproducing kernels (including de Branges-Rovnyak and other structured spaces). | Operator theory – Operators in reproducing-kernel Hilbert spaces. | Convex and discrete geometry. | Systems theory; control.
Classification: LCC QA184.2 .D96 2023 | DDC 512/.5–dc23/eng/20230519
LC record available at https://lccn.loc.gov/2023008107

Copying and reprinting. Individual readers of this publication, and nonprofit libraries acting for them, are permitted to make fair use of the material, such as to copy select pages for use in teaching or research. Permission is granted to quote brief passages from this publication in reviews, provided the customary acknowledgment of the source is given. Republication, systematic copying, or multiple reproduction of any material in this publication is permitted only under license from the American Mathematical Society. Requests for permission to reuse portions of AMS publication content are handled by the Copyright Clearance Center. For more information, please visit www.ams.org/publications/pubpermissions. Send requests for translation rights and licensed reprints to [email protected].

© 2023 by the American Mathematical Society. All rights reserved. The American Mathematical Society retains all rights except those granted to the United States Government. Printed in the United States of America.

The paper used in this book is acid-free and falls within the guidelines established to ensure permanence and durability.

Visit the AMS home page at https://www.ams.org/

Contents

Preface to the third edition
Preface to the second edition
Preface to the first edition

Chapter 1. Prerequisites
  §1.1. Main definitions
  §1.2. Mappings
  §1.3. Triangular matrices
  §1.4. The binomial formula
  §1.5. Supplementary notes

Chapter 2. Dimension and rank
  §2.1. The conservation of dimension
  §2.2. Conservation of dimension for matrices
  §2.3. What you need to know about rank
  §2.4. {0, 1, ∞}
  §2.5. Block triangular matrices
  §2.6. Supplementary notes

Chapter 3. Gaussian elimination
  §3.1. Examples
  §3.2. A remarkable formula
  §3.3. Extracting a basis
  §3.4. Augmenting a given set of vectors to form a basis
  §3.5. Computing the coefficients in a basis
  §3.6. The Gauss-Seidel method
  §3.7. Block Gaussian elimination
  §3.8. Supplementary notes

Chapter 4. Eigenvalues and eigenvectors
  §4.1. The first step
  §4.2. Diagonalizable matrices
  §4.3. Invariant subspaces
  §4.4. Jordan cells
  §4.5. Linear transformations
  §4.6. Supplementary notes

Chapter 5. Towards the Jordan decomposition
  §5.1. Direct sums
  §5.2. The null spaces of powers of B
  §5.3. Verification of Theorem 5.4
  §5.4. Supplementary notes

Chapter 6. The Jordan decomposition
  §6.1. The Jordan decomposition
  §6.2. Overview
  §6.3. Dimension of nullspaces of powers of B
  §6.4. Computing J
  §6.5. Computing U
  §6.6. Two simple examples
  §6.7. Real Jordan forms
  §6.8. Supplementary notes

Chapter 7. Determinants
  §7.1. Determinants
  §7.2. Useful rules for calculating determinants
  §7.3. Exploiting block structure
  §7.4. Minors
  §7.5. Eigenvalues
  §7.6. Supplementary notes

Chapter 8. Companion matrices and circulants
  §8.1. Companion matrices
  §8.2. Circulants
  §8.3. Interpolating polynomials
  §8.4. An eigenvalue assignment problem
  §8.5. Supplementary notes

Chapter 9. Inequalities
  §9.1. A touch of convex function theory
  §9.2. Four inequalities
  §9.3. The Krein-Milman theorem
  §9.4. Supplementary notes

Chapter 10. Normed linear spaces
  §10.1. Normed linear spaces
  §10.2. The vector space of matrices A
  §10.3. Evaluating some operator norms
  §10.4. Small perturbations
  §10.5. Supplementary notes

Chapter 11. Inner product spaces
  §11.1. Inner product spaces
  §11.2. Gram matrices
  §11.3. Adjoints
  §11.4. Spectral radius
  §11.5. What you need to know about A
  §11.6. Supplementary notes

Chapter 12. Orthogonality
  §12.1. Orthogonality
  §12.2. Projections and direct sums
  §12.3. Orthogonal projections
  §12.4. The Gram-Schmidt method
  §12.5. QR factorization
  §12.6. Supplementary notes

Chapter 13. Normal matrices
  §13.1. Normal matrices
  §13.2. Schur's theorem
  §13.3. Commuting normal matrices
  §13.4. Real Hermitian matrices
  §13.5. Supplementary notes

Chapter 14. Projections, volumes, and traces
  §14.1. Projection by iteration
  §14.2. Computing nonorthogonal projections
  §14.3. The general setting
  §14.4. Detour on the angle between subspaces
  §14.5. Areas, volumes, and determinants
  §14.6. Trace formulas
  §14.7. Supplementary notes

Chapter 15. Singular value decomposition
  §15.1. Singular value decompositions
  §15.2. A characterization of singular values
  §15.3. Sums and products of singular values
  §15.4. Properties of singular values
  §15.5. Approximate solutions of linear equations
  §15.6. Supplementary notes

Chapter 16. Positive definite and semidefinite matrices
  §16.1. A detour on triangular factorization
  §16.2. Characterizations of positive definite matrices
  §16.3. Square roots
  §16.4. Polar forms and partial isometries
  §16.5. Some useful formulas
  §16.6. Supplementary notes

Chapter 17. Determinants redux
  §17.1. Differentiating determinants
  §17.2. The characteristic polynomial
  §17.3. The Binet-Cauchy formula
  §17.4. Inequalities for determinants
  §17.5. Some determinant identities
  §17.6. Jacobi's determinant identity
  §17.7. Sylvester's determinant identity
  §17.8. Supplementary notes

Chapter 18. Applications
  §18.1. A minimization problem
  §18.2. Strictly convex functions and spaces
  §18.3. Fitting a line in R^2
  §18.4. Fitting a line in R^p
  §18.5. Schur complements for semidefinite matrices
  §18.6. von Neumann's inequality for contractive matrices
  §18.7. Supplementary notes

Chapter 19. Discrete dynamical systems
  §19.1. Homogeneous systems
  §19.2. Nonhomogeneous systems
  §19.3. Second-order difference equations
  §19.4. Higher-order difference equations
  §19.5. Nonhomogeneous equations
  §19.6. Supplementary notes

Chapter 20. Continuous dynamical systems
  §20.1. Preliminaries on matrix-valued functions
  §20.2. The exponential of a matrix
  §20.3. Systems of differential equations
  §20.4. Uniqueness
  §20.5. Isometric and isospectral flows
  §20.6. Nonhomogeneous differential systems
  §20.7. Second-order differential equations
  §20.8. Higher-order differential equations
  §20.9. Wronskians
  §20.10. Supplementary notes

Chapter 21. Vector-valued functions
  §21.1. Mean value theorems
  §21.2. Taylor's formula with remainder
  §21.3. Mean value theorem for functions of several variables
  §21.4. Mean value theorems for vector-valued functions of several variables
  §21.5. Convex minimization problems
  §21.6. Supplementary notes

Chapter 22. Fixed point theorems
  §22.1. A contractive fixed point theorem
  §22.2. A refined contractive fixed point theorem
  §22.3. Other fixed point theorems
  §22.4. Applications of fixed point theorems
  §22.5. Supplementary notes

Chapter 23. The implicit function theorem
  §23.1. The inverse function theorem
  §23.2. The implicit function theorem
  §23.3. Continuous dependence of solutions
  §23.4. Roots of polynomials
  §23.5. Supplementary notes

Chapter 24. Extremal problems
  §24.1. Classical extremal problems
  §24.2. Extremal problems with constraints
  §24.3. Examples
  §24.4. Supplementary notes

Chapter 25. Newton's method
  §25.1. Newton's method for scalar functions
  §25.2. Newton's method for vector-valued functions
  §25.3. Supplementary notes

Chapter 26. Matrices with nonnegative entries
  §26.1. A warm-up theorem
  §26.2. The Perron-Frobenius theorem
  §26.3. Supplementary notes

Chapter 27. Applications of matrices with nonnegative entries
  §27.1. Stochastic matrices
  §27.2. Behind Google
  §27.3. Leslie matrices
  §27.4. Minimum matrices
  §27.5. Doubly stochastic matrices
  §27.6. Inequalities of Ky Fan and von Neumann
  §27.7. Supplementary notes

Chapter 28. Eigenvalues of Hermitian matrices
  §28.1. The Courant-Fischer theorem
  §28.2. Applications of the Courant-Fischer theorem
  §28.3. Ky Fan's maximum principle
  §28.4. The sum of two Hermitian matrices
  §28.5. On the right-differentiability of eigenvalues
  §28.6. Sylvester's law of inertia
  §28.7. Supplementary notes

Chapter 29. Singular values redux I
  §29.1. Sums of singular values
  §29.2. Majorization
  §29.3. Norms based on sums of singular values
  §29.4. Unitarily invariant norms
  §29.5. Products of singular values
  §29.6. Eigenvalues versus singular values
  §29.7. Supplementary notes

Chapter 30. Singular values redux II
  §30.1. Sums of powers of singular values
  §30.2. Inequalities for singular values in terms of A
  §30.3. Perturbation of singular values
  §30.4. Supplementary notes

Chapter 31. Approximation by unitary matrices
  §31.1. Approximation in the Frobenius norm
  §31.2. Approximation in other norms
  §31.3. Supplementary notes

Chapter 32. Linear functionals
  §32.1. Linear functionals
  §32.2. Extensions of linear functionals
  §32.3. The Minkowski functional
  §32.4. Separation theorems
  §32.5. Another path
  §32.6. Supplementary notes

Chapter 33. A minimal norm problem
  §33.1. Dual extremal problems
  §33.2. Preliminary calculations
  §33.3. Evaluation of (33.1)
  §33.4. A numerical example
  §33.5. A review
  §33.6. Supplementary notes

Chapter 34. Conjugate gradients
  §34.1. The recursion
  §34.2. Convergence estimates
  §34.3. Krylov subspaces
  §34.4. The general conjugate gradient method
  §34.5. Supplementary notes

Chapter 35. Continuity of eigenvalues
  §35.1. Contour integrals of matrix-valued functions
  §35.2. Continuous dependence of the eigenvalues
  §35.3. Matrices with distinct eigenvalues
  §35.4. Supplementary notes

Chapter 36. Eigenvalue location problems
  §36.1. Geršgorin disks
  §36.2. Spectral radius redux
  §36.3. Shifting eigenvalues
  §36.4. The Hilbert matrix
  §36.5. Fractional powers
  §36.6. Supplementary notes

Chapter 37. Matrix equations
  §37.1. The equation X − AXB = C
  §37.2. The Sylvester equation AX − XB = C
  §37.3. AX = XB
  §37.4. Special classes of solutions
  §37.5. Supplementary notes

Chapter 38. A matrix completion problem
  §38.1. Constraints on Ω
  §38.2. The central diagonals are specified
  §38.3. A moment problem
  §38.4. Supplementary notes

Chapter 39. Minimal norm completions
  §39.1. A minimal norm completion problem
  §39.2. A description of all solutions to the minimal norm completion problem
  §39.3. Supplementary notes

Chapter 40. The numerical range
  §40.1. The numerical range is convex
  §40.2. Eigenvalues versus numerical range
  §40.3. The Gauss-Lucas theorem
  §40.4. The Heinz inequality
  §40.5. Supplementary notes

Chapter 41. Riccati equations
  §41.1. Riccati equations
  §41.2. Two lemmas
  §41.3. The LQR problem
  §41.4. Supplementary notes

Chapter 42. Supplementary topics
  §42.1. Gaussian quadrature
  §42.2. Bezoutians
  §42.3. Resultants
  §42.4. General QR factorization
  §42.5. The QR algorithm
  §42.6. Supplementary notes

Chapter 43. Toeplitz, Hankel, and de Branges
  §43.1. Reproducing kernel Hilbert spaces
  §43.2. de Branges spaces
  §43.3. The space of polynomials of degree ≤ n − 1
  §43.4. Two subspaces
  §43.5. G is a Toeplitz matrix
  §43.6. G is a Hankel matrix
  §43.7. Supplementary notes

Bibliography
Notation index
Subject index

Preface to the third edition

This new edition of Linear Algebra in Action is significantly different from the previous edition in both content and style: It includes a number of topics that did not appear in the earlier edition and excludes some that did. In the earlier edition I entered into the proofs of every fact that was used. In this edition I have relegated the proofs of a number of theorems to outside references and have focused instead on their applications, which to my mind has more impact than a proof, especially on a first pass. Moreover, most of the material that is adapted from the previous edition has been rewritten and reorganized.

I have organized this book into short chapters, most of which are a dozen pages or less, because I think this is more amenable to classroom use. I have tried to write this book in the style that most of the mathematicians that I know work in, rather than in the way that they write. In particular, I believe that the discussion of a well-chosen example is often much more helpful than a formal proof, which in many cases is an example hidden by elaborate bookkeeping. A good student will be able to pass to the general setting from the example, and a weaker student will at least have something concrete to focus on.

The book is intended primarily for students who have had at least a little exposure to linear algebra. Nevertheless, the first twelve chapters or so are basically a quick review of the material that is typically offered in a first course, plus a little. The content of this introductory material is dealt with in greater detail in the second edition of this book, which is a useful supplement to this third edition, but the two can be used independently.

A reader who is familiar with the main contents of the first sixteen chapters and Chapter 21 should be able to read any of the other chapters without difficulty, as they are for the most part independent of each other.

The entries keep in mind, warning, and notation appear in the index. The first is to call attention to compilations that I feel are helpful, the second is to call attention to conventions that have been introduced, and the third is to point out the introduction of new notation, most of which is fairly standard, except that a distinction is made between the matrices A^H (the conjugate transpose of A) and A^* (the adjoint of A, which depends upon the inner product).

I extend my thanks to the readers who contributed corrections to the first two editions over the past several years and to Shmuel Aviya for reading and commenting on a number of chapters as they were being prepared for this third edition. I owe a special note of thanks to Dr. Andrei Iacob, who carefully copyedited a close to final version of the third edition. It is also a pleasure to thank the staff of the AMS for their friendly help, with extra special thanks to my copyeditor, Arlene O'Sean, for accommodating the author and for her care and devotion to getting things right.

I plan to use the AMS website www.ams.org/bookpages/gsm-232 to supply some supplementary material as well as for sins of omission and commission (and just plain afterthoughts).

I did not add words of wisdom at the beginning of each chapter as in the earlier editions. However, I cannot resist repeating two of them: The first is based on some four score and five years of observing the human scene: Those who think they know all the answers don't know all the questions. A Chinese proverb puts it well: Trust only those who doubt. The second is one that I am especially fond of (both the saying and its originator, who was a very special person), though I have never been able to live up to it: Let's throw everything away; then there will be room for what's left. –Irene Dym

TAM ACH TEREM NISHLAM

February 14, 2023
Rehovot, Israel

Preface to the second edition

I have an opinion. But I do not agree with it.
Joshua Sobol [83]

Most of the chapters in the first edition have been revised, some extensively. The revisions include changes in a number of proofs, to either simplify the argument and/or make the logic clearer, and, on occasion, to sharpen the result. New short introductory sections on linear programming, extreme points for polyhedra, and a Nevanlinna-Pick interpolation problem have been added, as have some very short introductory sections on the mathematics behind Google, Drazin inverses, band inverses, and applications of svd, together with a number of new exercises.

I would like to thank the many readers who e-mailed me helpful lists of typographical errors. I owe a special word of thanks to David Kimsey and Motke Porat, whose lists hit double figures. I believe I have fixed all the reported errors and then some. A couple of oversights in the first edition that came to light (principally the fact that the word Hankel should be removed from the statement and proof of Corollary 21.2; an incomplete definition of a support hyperplane; and a certain fuzziness in the discussion of operator norms and multiplicative norms) have also been fixed.

It is a pleasure to thank the staff of the AMS for being so friendly and helpful; a special note of thanks to my copy/production editor Mike Saitas for his sharp eye and cheerful willingness to accommodate the author and to Mary Medeiros for preparing the indices and her expertise in LaTeX.

The AMS website www.ams.org/bookpages/gsm-78 will be used for sins of omission and commission (and just plain afterthoughts) for the second edition as well as the first.

TAM, ACH TEREM NISHLAM, ...

July 19, 2013
Rehovot, Israel

Preface to the first edition

A foolish consistency is the hobgoblin of little minds, ...
Ralph Waldo Emerson, Self Reliance

This book is based largely on courses that I have taught at the Feinberg Graduate School of the Weizmann Institute of Science over the past 35 years to graduate students with widely varying levels of mathematical sophistication and interests. The objective of a number of these courses was to present a user-friendly introduction to linear algebra and its many applications. Over the years I wrote and rewrote (and then, more often than not, rewrote some more) assorted sets of notes and learned many interesting things en route. This book is the current end product of that process.

The emphasis is on developing a comfortable familiarity with the material. Many lemmas and theorems are made plausible by discussing an example that is chosen to make the underlying ideas transparent in lieu of a formal proof; i.e., I have tried to present the material in the way that most of the mathematicians that I know work rather than in the way they write.

The coverage is not intended to be exhaustive (or exhausting), but rather to indicate the rich terrain that is part of the domain of linear algebra and to present a decent sample of some of the tools of the trade of a working analyst that I have absorbed and have found useful and interesting in more than 40 years in the business. To put it another way, I wish someone had taught me this material when I was a graduate student. In those days, in the arrogance of youth, I thought that linear algebra was for boys and girls and that real men and women worked in functional analysis. However, this is but one of many opinions that did not stand the test of time.

In my opinion, the material in this book can (and has been) used on many levels. A core course in classical linear algebra topics can be based on the first six chapters, plus selected topics from Chapters 7–9 and 13. The latter treats difference equations, differential equations, and systems thereof. Chapters 14–16 cover applications to vector calculus, including a proof of the implicit function theorem based on the contractive fixed point theorem, and extremal problems with constraints. Subsequent chapters deal with matrix-valued holomorphic functions, matrix equations, realization theory, eigenvalue location problems, zero location problems, convexity, and matrices with nonnegative entries.

I have taken the liberty of straying into areas that I consider significant, even though they are not usually viewed as part of the package associated with linear algebra. Thus, for example, I have added short sections on complex function theory, Fourier analysis, Lyapunov functions for dynamical systems, boundary value problems, and more. A number of the applications are taken from control theory.

I have adapted material from many sources. But the one which was most significant for at least the starting point of a number of topics covered in this work is the wonderful book [56] by Lancaster and Tismenetsky.

A number of students read and commented on substantial sections of assorted drafts: Boris Ettinger, Ariel Ginis, Royi Lachmi, Mark Kozdoba, Evgeny Muzikantov, Simcha Rimler, Jonathan Ronen, Idith Segev, and Amit Weinberg. I thank them all, and extend my appreciation to two senior readers, Aad Dijksma and Andrei Iacob, for their helpful insightful remarks. A special note of thanks goes to Deborah Smith, my copy editor at the AMS, for her sharp eye and expertise in the world of commas and semicolons. On the production side, I thank Jason Friedman for typing an early version, and our secretaries Diana Mandelik, Ruby Musrie, Linda Alman, Terry Debesh, all of whom typed selections, and Diana again for preparing all the figures and clarifying numerous mysterious intricacies of LaTeX. I also thank Barbara Beeton of the AMS for helpful advice on AMS LaTeX.

One of the difficulties in preparing a manuscript for a book is knowing when to let go. It is always possible to write it better.¹ Fortunately the AMS maintains a web page: http://www.ams.org/bookpages/gsm-78, for sins of omission and commission (or just plain afterthoughts).

TAM, ACH TEREM NISHLAM, ...

October 18, 2006
Rehovot, Israel

¹ Israel Gohberg tells of a conversation with Lev Sakhnovich that took place in Odessa many years ago: Lev: Israel, how is your book with Mark Gregorovic (Krein) progressing? Israel: It's about 85% done. Lev: That's great! Why so sad? Israel: If you would have asked me yesterday, I would have said 95%.

Chapter 1

Prerequisites

We shall assume that the reader is familiar with real and complex vector spaces and the elements of matrix theory and linear transformations. Nevertheless, for the sake of completeness, we begin with a quick review of the main definitions and notations that will be in use in the sequel.

1.1. Main definitions

• Notation: The symbols R and C will be used to denote the real and complex numbers, respectively. For us, the most important vector spaces are C^{p×q}, the set of p × q matrices with complex entries, and R^{p×q}, the set of p × q matrices with real entries; C^p is short for C^{p×1} and R^p is short for R^{p×1}. An element v in a vector space V is called a vector and is usually printed in boldface. We shall say that V is a complex vector space if αv ∈ V for every α ∈ C and every v ∈ V. Analogously, we shall say that V is a real vector space if αv ∈ V for every α ∈ R and every v ∈ V. In both cases, the numbers α will be referred to as scalars.

• Subspaces: A subspace M of a vector space V is a nonempty subset of V that is closed under vector addition and scalar multiplication. In other words, if x and y belong to M, then x + y ∈ M and αx ∈ M for every scalar α. A subspace of a vector space is automatically a vector space in its own right.

• Span: If v_1, . . . , v_k is a given set of vectors in a vector space V, then
$$ \operatorname{span}\{v_1,\ldots,v_k\} = \Big\{ \sum_{j=1}^{k} \alpha_j v_j : \alpha_1,\ldots,\alpha_k \ \text{are scalars} \Big\}. $$
In words, the span is the set of all linear combinations α_1 v_1 + · · · + α_k v_k of the indicated set of vectors, with scalar coefficients α_1, . . . , α_k. Or, to put it another way, span{v_1, . . . , v_k} is the smallest subspace of V that contains the vectors v_1, . . . , v_k. It is important to keep in mind that the number of vectors k that were used to define the span is not a good indicator of the size of this space. Thus, for example, if
$$ v_1 = \begin{bmatrix}1\\2\\1\end{bmatrix}, \quad v_2 = \begin{bmatrix}2\\4\\2\end{bmatrix}, \quad\text{and}\quad v_3 = \begin{bmatrix}3\\6\\3\end{bmatrix}, $$
then span{v_1, v_2, v_3} = span{v_1}. To clarify the notion of the size of the span we need the concept of linear dependence.

• Linear dependence: A set of vectors {v_1, . . . , v_k} in a vector space V is said to be linearly dependent if there exists a set of scalars α_1, . . . , α_k, not all of which are zero, such that
$$ \alpha_1 v_1 + \cdots + \alpha_k v_k = 0. $$
This permits us to express one or more of the given vectors in terms of the others. Thus, if α_1 ≠ 0, then
$$ v_1 = -\frac{\alpha_2}{\alpha_1}\, v_2 - \cdots - \frac{\alpha_k}{\alpha_1}\, v_k $$
and hence span{v_1, . . . , v_k} = span{v_2, . . . , v_k}. Further reductions are possible if the vectors v_2, . . . , v_k are still linearly dependent.

• Linear independence: A set of vectors {v_1, . . . , v_k} in a vector space V is linearly independent if the only scalars α_1, . . . , α_k for which α_1 v_1 + · · · + α_k v_k = 0 are α_1 = · · · = α_k = 0. This is just another way of saying that we cannot express one of these vectors as a linear combination of the others. Moreover, if {v_1, . . . , v_k} is a set of linearly independent vectors in a vector space V and if
(1.1)  v = α_1 v_1 + · · · + α_k v_k  and  v = β_1 v_1 + · · · + β_k v_k
for some choice of scalars α_1, . . . , α_k, β_1, . . . , β_k, then α_j = β_j for j = 1, . . . , k.

Exercise 1.1. Show that if (1.1) holds for a linearly independent set of vectors {v_1, . . . , v_k}, then α_j = β_j for j = 1, . . . , k. Show by example that this conclusion is false if the given set of k vectors is not linearly independent.

• Basis: A set of vectors {v_1, . . . , v_k} is a basis for a vector space V if:
(1) span{v_1, . . . , v_k} = V.
(2) The vectors v_1, . . . , v_k are linearly independent.
Both of these conditions are essential. The first guarantees that the given set of k vectors is large enough to express every vector v ∈ V as a linear combination of v_1, . . . , v_k; the second ensures that we cannot achieve this with less than k vectors.

Exercise 1.2. Let u_1, u_2, u_3 be linearly independent vectors in a vector space U and let u_4 = u_1 + 2u_2 + u_3. (a) Show that the vectors u_1, u_2, u_4 are linearly independent and that span{u_1, u_2, u_3} = span{u_1, u_2, u_4}. (b) Express the vector 7u_1 + 13u_2 + 5u_3 as a linear combination of the vectors u_1, u_2, u_4. [Note that the coefficients of u_1 and u_2 change.]

• Dimension: A nontrivial vector space V has many bases. However, the number of elements in each basis for V is exactly the same and is referred to as the dimension of V and will be denoted dim V. If U is a subspace of a vector space V, then dim U ≤ dim V, with equality if and only if U = V.

Example 1.1. The p × q matrices E_{ij}, i = 1, . . . , p, j = 1, . . . , q, that are defined by setting every entry in E_{ij} equal to zero except for the ij entry, which is set equal to one, form a basis for the vector space C^{p×q}. Thus, dim C^{p×q} = pq.

• Identity matrix: The symbol I_n denotes the n × n matrix A = [a_{ij}], i, j = 1, . . . , n, with a_{ii} = 1 for i = 1, . . . , n and a_{ij} = 0 for i ≠ j. The name stems from the fact that I_n x = x for every vector x ∈ C^n.

• Zero matrix: The symbol O_{p×q} denotes the matrix in C^{p×q} all of whose entries are equal to zero. The subscript p × q will be dropped if the size is clear from the context.

• Transposes: The transpose of a p × q matrix A is the q × p matrix A^T whose k'th row is equal to the k'th column of A laid sideways, k = 1, . . . , q. In other words, the ij entry of A is equal to the ji entry of A^T.

• Hermitian transposes: The Hermitian transpose A^H of a p × q matrix A is the same as the transpose A^T of A, except that all the entries in the transposed matrix are replaced by their complex conjugates. Thus, for example,
$$ A = \begin{bmatrix} 1 & 3i & 5+i \\ 4 & -i & 6i \end{bmatrix} \implies A^T = \begin{bmatrix} 1 & 4 \\ 3i & -i \\ 5+i & 6i \end{bmatrix} \quad\text{and}\quad A^H = \begin{bmatrix} 1 & 4 \\ -3i & i \\ 5-i & -6i \end{bmatrix}. $$
It is readily checked that
$$ (A^T)^T = A, \quad (AB)^T = B^T A^T, \quad (A^H)^H = A, \quad\text{and}\quad (AB)^H = B^H A^H. $$

• Permutation matrices: Every n × n permutation matrix P is obtained by taking the identity matrix I_n and interchanging some of the rows. Consequently, P can be expressed in terms of the columns e_j, j = 1, . . . , n, of I_n and a one-to-one mapping σ of the set of integers {1, . . . , n} onto itself by the formula
(1.2)  $$ P = P_\sigma = \sum_{j=1}^{n} e_j\, e_{\sigma(j)}^T, \quad\text{and hence}\quad I_n = \sum_{j=1}^{n} e_j\, e_j^T. $$
Thus, for example, if n = 4 and σ(1) = 3, σ(2) = 2, σ(3) = 4, and σ(4) = 1, then
$$ P_\sigma = e_1 e_3^T + e_2 e_2^T + e_3 e_4^T + e_4 e_1^T = \begin{bmatrix} 0&0&1&0 \\ 0&1&0&0 \\ 0&0&0&1 \\ 1&0&0&0 \end{bmatrix}. $$
(A small numerical sketch of this construction appears at the end of this section.)

• Orthogonal matrices: A matrix V ∈ R^{n×n} is said to be an orthogonal matrix if V^T V = I_n.

• Matrix multiplication: Let A = [a_{ij}] be a p × q matrix and let B = [b_{st}] be a q × r matrix. Then the product AB is the p × r matrix C = [c_{kℓ}] with entries
$$ c_{k\ell} = \sum_{j=1}^{q} a_{kj} b_{j\ell}, \qquad k = 1,\ldots,p, \quad \ell = 1,\ldots,r. $$
Notice that c_{kℓ} is the matrix product of the k'th row of A with the ℓ'th column of B:
$$ c_{k\ell} = \begin{bmatrix} a_{k1} & \cdots & a_{kq} \end{bmatrix} \begin{bmatrix} b_{1\ell} \\ \vdots \\ b_{q\ell} \end{bmatrix}. $$

• Matrix multiplication is not commutative, e.g.,
$$ A = \begin{bmatrix} 1&0 \\ 1&0 \end{bmatrix} \ \text{and}\ B = \begin{bmatrix} 0&0 \\ 1&1 \end{bmatrix} \implies AB = \begin{bmatrix} 0&0 \\ 0&0 \end{bmatrix} \ne \begin{bmatrix} 0&0 \\ 2&0 \end{bmatrix} = BA. $$

• Matrix multiplication is associative: If A ∈ C^{p×q}, B ∈ C^{q×r}, and C ∈ C^{r×s}, then (AB)C = A(BC).

• Matrix multiplication is distributive: If A, A_1, A_2 ∈ C^{p×q} and B, B_1, B_2 ∈ C^{q×r}, then
(A_1 + A_2)B = A_1 B + A_2 B  and  A(B_1 + B_2) = AB_1 + AB_2.

• If A ∈ C^{p×q} is expressed both as an array of p row vectors of width q and as an array of q column vectors of height p,
$$ A = \begin{bmatrix} \vec a_1 \\ \vdots \\ \vec a_p \end{bmatrix} = \begin{bmatrix} a_1 & \cdots & a_q \end{bmatrix}, $$
and if B ∈ C^{q×r} is expressed both as an array of q row vectors of width r and as an array of r column vectors of height q,
$$ B = \begin{bmatrix} \vec b_1 \\ \vdots \\ \vec b_q \end{bmatrix} = \begin{bmatrix} b_1 & \cdots & b_r \end{bmatrix}, $$
then the product AB can be expressed in the following three ways:
(1.3)  $$ AB = \begin{bmatrix} \vec a_1 B \\ \vdots \\ \vec a_p B \end{bmatrix} = \begin{bmatrix} A b_1 & \cdots & A b_r \end{bmatrix} = \sum_{s=1}^{q} a_s\, \vec b_s. $$

• If x ∈ C^q with entries x_1, . . . , x_q, then
(1.4)  $$ Ax = \begin{bmatrix} a_1 & \cdots & a_q \end{bmatrix} x = x_1 a_1 + \cdots + x_q a_q. $$

• Inverses: Let A ∈ C^{p×q}. Then A is:
(1) Left invertible if there is a matrix C ∈ C^{q×p} such that CA = I_q; in this case C is called a left inverse of A and the equation Ax = b has at most one solution x.
(2) Right invertible if there is a matrix B ∈ C^{q×p} such that AB = I_p; in this case B is called a right inverse of A and the equation Ax = b has at least one solution x.
If a matrix A ∈ C^{p×q} has both a left inverse C and a right inverse B, then
(1.5)  C = C I_p = C(AB) = (CA)B = I_q B = B  and  q = p.
Thus, if A has both a left and a right inverse, then it has exactly one left inverse and exactly one right inverse. Moreover, they are equal and (as will be shown in Section 2.3) q = p. In this instance, we shall say that A is invertible and refer to B = C as the inverse of A and denote it by A^{−1}.

Exercise 1.3. Show that if A ∈ C^{n×n} and B ∈ C^{n×n} are invertible, then AB is invertible and (AB)^{−1} = B^{−1} A^{−1}.

• Block multiplication: It is often convenient to express a large matrix as an array of submatrices (i.e., blocks of numbers) rather than as an array of numbers. Then the rules of matrix multiplication still apply (block by block) provided that the block decompositions are compatible. Thus, for example, if
$$ A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \\ A_{31} & A_{32} \end{bmatrix} \quad\text{and}\quad B = \begin{bmatrix} B_{11} & B_{12} & B_{13} & B_{14} \\ B_{21} & B_{22} & B_{23} & B_{24} \end{bmatrix} $$
with entries A_{ij} ∈ C^{p_i×q_j} and B_{jk} ∈ C^{q_j×r_k}, then
$$ C = AB = [\,C_{ij}\,], \quad i = 1,\ldots,3, \ \ j = 1,\ldots,4, $$
where C_{ij} = A_{i1} B_{1j} + A_{i2} B_{2j} is a p_i × r_j matrix.

Exercise 1.4. Verify the three ways of writing a matrix product in formula (1.3). [HINT: Show that the ij entries coincide for the first two; then observe that if I_r = [e_1 · · · e_r], then AB = A I_r B and I_r = Σ_{s=1}^r e_s e_s^T for the third.]

Exercise 1.5. Show that every permutation matrix P ∈ R^{n×n} is an orthogonal matrix, i.e., P^T P = P P^T = I_n. [HINT: Use formula (1.2).]
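A minimal numerical sketch (an editorial addition, not part of the book) of the permutation-matrix construction (1.2), using numpy; it rebuilds the 4 × 4 example above and checks the orthogonality claim of Exercise 1.5.

```python
# Build P_sigma = sum_j e_j e_{sigma(j)}^T of formula (1.2) and check P^T P = I.
import numpy as np

n = 4
sigma = {1: 3, 2: 2, 3: 4, 4: 1}               # the permutation used in the example above

I = np.eye(n)
e = [I[:, j].reshape(n, 1) for j in range(n)]  # the columns e_1, ..., e_n of I_n

P = sum(e[j - 1] @ e[sigma[j] - 1].T for j in range(1, n + 1))
print(P.astype(int))                            # matches the displayed 4 x 4 matrix
print(np.allclose(P.T @ P, I), np.allclose(P @ P.T, I))   # both True: P is orthogonal
```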

A11 A12 B11 B12 B13 B14 ⎦ ⎣ and B = A = A21 A22 B21 B22 B23 B24 A31 A32 with entries Aij ∈ C pi ×qj and Bjk ∈ C qj ×rk , then   C = AB = Cij , i = 1, . . . , 3, j = 1, . . . , 4 , where Cij = Ai1 B1j + Ai2 B2j is a pi × rj matrix. Exercise 1.4. Verify the three ways of writing a matrix product in formula (1.3). [HINT: Show that the two; then observe  ij entries coincide for the first that if Ir = e1 · · · er , then AB = AIr B and Ir = rs=1 es eTs for the third.] Exercise 1.5. Show that every permutation matrix P ∈ R n×n is an orthogonal matrix, i.e., P T P = P P T = In . [HINT: Use formula (1.2).]

1.2. Mappings

• Mappings: A mapping (or transformation) T from a vector space U into a vector space V is a rule that assigns exactly one vector v ∈ V to each u ∈ U. In this framework either U and V will both be complex vector spaces or they will both be real vector spaces. We shall refer to the set
N_T = {u ∈ U : T u = 0_V}
as the nullspace (or kernel) of T and to the set
R_T = {T u : u ∈ U}
as the range (or image) of T. The subscript V is added to the symbol 0 in the first definition to emphasize that it is the zero vector in V, not in U.

• Linear mappings: A mapping T from a vector space U into a vector space V is linear if for every choice of u, v ∈ U and every scalar α the following two conditions are met:
(1) T(u + v) = T u + T v.
(2) T(αu) = αT u.
If T is a linear mapping from a vector space U into a vector space V, then N_T is a subspace of U and R_T is a subspace of V.

• The identity: The special linear transformation from a vector space U into U that maps each vector u ∈ U into itself is called the identity mapping. It is denoted by the symbol I_n if U = C^n or U = R^n and by I_U otherwise, though, more often than not, when the underlying space U is clear from the context, the subscript U will be dropped.

If T is a linear mapping of a vector space U with basis {u_1, . . . , u_q} into a vector space V with basis {v_1, . . . , v_p}, then there exists a unique set of scalars a_{ij}, i = 1, . . . , p and j = 1, . . . , q, such that
(1.6)  $$ T u_j = \sum_{i=1}^{p} a_{ij} v_i \quad\text{for}\quad j = 1,\ldots,q $$
and hence that
(1.7)  $$ T\Big( \sum_{j=1}^{q} x_j u_j \Big) = \sum_{i=1}^{p} y_i v_i \iff A \begin{bmatrix} x_1 \\ \vdots \\ x_q \end{bmatrix} = \begin{bmatrix} y_1 \\ \vdots \\ y_p \end{bmatrix}. $$
(A small numerical illustration of this correspondence appears at the end of this section.)

• Warning: If A ∈ C^{p×q}, then matrix multiplication defines a linear map that sends x ∈ C^q to Ax ∈ C^p. Correspondingly, the nullspace of this map,
N_A = {x ∈ C^q : Ax = 0},  is a subspace of C^q,
and the range of this map,
R_A = {Ax : x ∈ C^q},  is a subspace of C^p.
However, if A ∈ R^{p×q}, then matrix multiplication also defines a linear map that sends x ∈ R^q to Ax ∈ R^p; in this setting
N_A = {x ∈ R^q : Ax = 0}  is a subspace of R^q,
and the range of this map,
R_A = {Ax : x ∈ R^q},  is a subspace of R^p.
In short, it is important to clarify the space on which A is acting, i.e., the domain of A. This will usually be clear from the context.
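The following sketch of the correspondence (1.6)-(1.7) is an editorial addition; the particular example (the differentiation map on polynomials of degree at most 2, with the monomial basis) is ours, not the book's.

```python
# The map T = d/dx on polynomials of degree <= 2, with basis u_j = x^{j-1}.
# The matrix A = [a_ij] satisfies T u_j = sum_i a_ij u_i, so by (1.7) the
# coefficient vector of T(f) is A times the coefficient vector of f.
import numpy as np

# T(1) = 0, T(x) = 1, T(x^2) = 2x  =>  columns of A are the coefficient vectors
# of T u_1, T u_2, T u_3 in the basis {1, x, x^2}.
A = np.array([[0., 1., 0.],
              [0., 0., 2.],
              [0., 0., 0.]])

x = np.array([5., 3., 4.])   # the polynomial 5 + 3x + 4x^2
print(A @ x)                 # [3. 8. 0.], i.e. 3 + 8x, the derivative, as (1.7) predicts
```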

1.3. Triangular matrices

A matrix A ∈ C^{n×n} with entries a_{ij}, i, j = 1, . . . , n, is said to be:
• upper triangular if all its nonzero entries sit either on or above the diagonal, i.e., if a_{ij} = 0 when i > j;
• lower triangular if all its nonzero entries sit either on or below the diagonal, i.e., if A^T is upper triangular;
• triangular if it is either upper triangular or lower triangular;
• diagonal if a_{ij} = 0 when i ≠ j.

If A ∈ C^{n×n} is a triangular matrix, then:
• A is invertible if and only if all its diagonal entries are nonzero.
If A is an invertible triangular matrix, then:
• A is upper triangular ⇐⇒ A^{−1} is upper triangular.
• A is lower triangular ⇐⇒ A^{−1} is lower triangular.
Triangular matrices will be discussed in more detail in Section 2.5.

Systems of equations based on a triangular matrix are particularly convenient to work with, even if the matrix is not invertible.

Example 1.2. Let A ∈ C^{4×4} be a 4 × 4 upper triangular matrix with nonzero diagonal entries and let b be any vector in C^4. Then the vector x is a solution of the equation
(1.8)  Ax = b
if and only if
a_{11} x_1 + a_{12} x_2 + a_{13} x_3 + a_{14} x_4 = b_1,
a_{22} x_2 + a_{23} x_3 + a_{24} x_4 = b_2,
a_{33} x_3 + a_{34} x_4 = b_3,
a_{44} x_4 = b_4.
Therefore, since the diagonal entries of A are nonzero, it is readily seen that these equations admit exactly one solution, by working from the bottom up:
x_4 = a_{44}^{−1} b_4,
x_3 = a_{33}^{−1} (b_3 − a_{34} x_4),
x_2 = a_{22}^{−1} (b_2 − a_{23} x_3 − a_{24} x_4),
x_1 = a_{11}^{−1} (b_1 − a_{12} x_2 − a_{13} x_3 − a_{14} x_4).
Thus, equation (1.8) admits exactly one solution x for each b. Let x_j denote the solution of the equation A x_j = e_j for j = 1, . . . , 4, where e_j, j = 1, . . . , 4, denotes the j'th column of the identity matrix I_4. Then the 4 × 4 matrix X = [x_1 x_2 x_3 x_4] is a right inverse of A:
AX = A[x_1 · · · x_4] = [A x_1 · · · A x_4] = [e_1 · · · e_4] = I_4.
(A short computational sketch of this back-substitution appears at the end of this section.)
Analogous examples can be built for lower triangular p × p matrices. The only difference is that now it is advantageous to work from the top down. The existence of a left inverse can also be obtained by writing down the requisite equations that must be solved and imitating the preceding arguments. It is easier, however, to take advantage of the fact that Y A = I_p ⇐⇒ A^T Y^T = I_p. (In fact, as we shall see shortly, if A, B ∈ C^{p×p}, then AB = I_p ⇐⇒ BA = I_p.)
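A short numerical sketch of the back-substitution in Example 1.2 (an editorial addition; the helper name back_substitute and the sample matrix are ours).

```python
# Solve Ax = b for upper triangular A with nonzero diagonal, working from the
# bottom up, and assemble a right inverse of A column by column.
import numpy as np

def back_substitute(A, b):
    n = A.shape[0]
    x = np.zeros(n, dtype=complex)
    for i in range(n - 1, -1, -1):
        x[i] = (b[i] - A[i, i + 1:] @ x[i + 1:]) / A[i, i]
    return x

A = np.array([[1., 2., 3., 4.],
              [0., 5., 6., 7.],
              [0., 0., 8., 9.],
              [0., 0., 0., 2.]])
X = np.column_stack([back_substitute(A, np.eye(4)[:, j]) for j in range(4)])
print(np.allclose(A @ X, np.eye(4)))   # True: X is a right (in fact two-sided) inverse
```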

1.4. The binomial formula

The familiar binomial identity
$$ (a + b)^m = \sum_{k=0}^{m} \binom{m}{k} a^k b^{m-k} $$
for complex numbers a and b remains valid for square matrices A and B of the same size if they commute:
(1.9)  $$ (A + B)^m = \sum_{k=0}^{m} \binom{m}{k} A^k B^{m-k} \quad\text{if } AB = BA. $$
It is easy to see that the condition AB = BA is necessary for the formula in (1.9) to hold by comparing both sides when m = 2; the sufficiency may be verified by induction. If A = λI_n and B ∈ C^{n×n}, then AB = BA and hence
(1.10)  $$ (\lambda I_n + B)^m = \sum_{k=0}^{m} \binom{m}{k} \lambda^k B^{m-k}. $$
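A quick numerical check of (1.9), added editorially (the particular commuting pair, a scalar matrix plus a nilpotent matrix, is our choice).

```python
# Verify (A + B)^m = sum_k C(m, k) A^k B^{m-k} for a commuting pair A, B.
import numpy as np
from math import comb

n, m, lam = 4, 5, 2.0
A = lam * np.eye(n)
B = np.diag(np.ones(n - 1), k=1)   # nilpotent, and AB = BA since A is scalar

lhs = np.linalg.matrix_power(A + B, m)
rhs = sum(comb(m, k) * np.linalg.matrix_power(A, k) @ np.linalg.matrix_power(B, m - k)
          for k in range(m + 1))
print(np.allclose(lhs, rhs))       # True, as formula (1.9) predicts
```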

1.5. Supplementary notes

Most of this chapter is adapted from Chapter 1 of [30], which contains more details. A principal difference is that in the present treatment we have simply accepted the fact that if a vector space has a basis with a finite number of vectors, then every other basis of that space will have exactly the same number of vectors.

In the somewhat distant future we shall also deal with vector spaces F of functions that are defined on some reasonable subset Ω of C^n or R^n. Then vector addition and multiplication by scalars is defined in a natural way: If f, g ∈ F, ω ∈ Ω, and α is a scalar, then
(1.11)  (f + g)(ω) = f(ω) + g(ω)  and  (αf)(ω) = α f(ω).

Example 1.3. The set F of continuous complex- (resp., real-) valued functions f(x) on the interval 0 ≤ x ≤ 1 is a complex (resp., real) vector space with respect to the natural rules of vector addition and scalar multiplication that were introduced in (1.11).

Exercise 1.6. Show that the set F_0 = {f ∈ F : f(0) = 0 and f(1) = 0} is a subspace of the vector space F considered in the preceding example, but the set F_1 = {f ∈ F : f(0) = 0 and f(1) = 1} is not a subspace of F.

Chapter 2

Dimension and rank

In this chapter we shall first establish a useful formula (see (2.1)) that we call the conservation of dimension and then explore some of its implications. The last section is devoted to block triangular matrices.

2.1. The conservation of dimension

Theorem 2.1. Let T be a linear mapping from a finite-dimensional vector space U into a vector space V (finite dimensional or not). Then
(2.1)  dim N_T + dim R_T = dim U.

Proof. Since dim R_T ≤ dim U, R_T is automatically a finite-dimensional space regardless of the dimension of V. Suppose first that N_T ≠ {0} and R_T ≠ {0}, let u_1, . . . , u_k be a basis for N_T, let v_1, . . . , v_ℓ be a basis for R_T, and choose vectors y_j ∈ U such that
T y_j = v_j,  j = 1, . . . , ℓ.
To verify (2.1), we shall show that the set of vectors {u_1, . . . , u_k, y_1, . . . , y_ℓ} is a basis for U. The first item of business is to show that the k + ℓ vectors in this set are linearly independent. If there exist scalars α_1, . . . , α_k and β_1, . . . , β_ℓ such that
(2.2)  $$ \sum_{i=1}^{k} \alpha_i u_i + \sum_{j=1}^{\ell} \beta_j y_j = 0, $$
then
$$ 0 = T0 = T\Big( \sum_{i=1}^{k} \alpha_i u_i + \sum_{j=1}^{\ell} \beta_j y_j \Big) = \sum_{i=1}^{k} \alpha_i T u_i + \sum_{j=1}^{\ell} \beta_j T y_j = 0 + \sum_{j=1}^{\ell} \beta_j v_j. $$
Therefore, β_1 = · · · = β_ℓ = 0 and so too, by (2.2), α_1 = · · · = α_k = 0. This completes the proof of the asserted linear independence.

The next step is to check that
(2.3)  span{u_1, . . . , u_k, y_1, . . . , y_ℓ} = U.
Towards this end, let w ∈ U. Then, since
$$ Tw = \sum_{j=1}^{\ell} \beta_j v_j = \sum_{j=1}^{\ell} \beta_j T y_j $$
for some choice of scalars β_1, . . . , β_ℓ, it follows that
$$ T\Big( w - \sum_{j=1}^{\ell} \beta_j y_j \Big) = 0 $$
and hence that
$$ w - \sum_{j=1}^{\ell} \beta_j y_j \in N_T. $$
Consequently, this vector can be expressed as a linear combination of the vectors u_1, . . . , u_k. In other words,
$$ w = \sum_{i=1}^{k} \alpha_i u_i + \sum_{j=1}^{\ell} \beta_j y_j $$
for some choice of scalars α_1, . . . , α_k and β_1, . . . , β_ℓ. But this means that (2.3) holds and hence, in view of the already exhibited linear independence, that
dim U = k + ℓ = dim N_T + dim R_T,
as claimed.

Suppose next that N_T = {0} and R_T ≠ {0}. Then much the same sort of argument serves to prove that if v_1, . . . , v_ℓ is a basis for R_T and if y_j ∈ U is such that T y_j = v_j for j = 1, . . . , ℓ, then the vectors y_1, . . . , y_ℓ are linearly independent and span U. Thus, dim U = dim R_T = ℓ, and hence formula (2.1) is still in force, since dim N_T = 0.

It remains only to consider the case R_T = {0}. But then N_T = U, and formula (2.1) is still valid. □

We shall refer to formula (2.1) as the principle of conservation of dimension.

2.2. Conservation of dimension for matrices

One of the main applications of the principle of conservation of dimension is to the particular linear transformation T from C^q into C^p that is defined by multiplying each vector x ∈ C^q by a given matrix A ∈ C^{p×q}. In this setting:
• N_A = {x ∈ C^q : Ax = 0} is a subspace of C^q.
• R_A = {Ax : x ∈ C^q} is a subspace of C^p.
• The conservation of dimension principle translates to
(2.4)  q = dim N_A + dim R_A.
• The dimension of R_A is termed the rank of A: rank A = dim R_A.
Moreover, it is important to keep in mind that if A = [a_1 · · · a_q], then:
• R_A = span{a_1, . . . , a_q}, and the vectors a_1, . . . , a_q are linearly independent if and only if N_A = {0}.
(A numerical illustration of (2.4) appears at the end of this section.)

Exercise 2.1. Show that if A ∈ C^{p×q} and C ∈ C^{k×q}, then
(2.5)  $$ \operatorname{rank}\begin{bmatrix} A \\ C \end{bmatrix} = q \iff N_A \cap N_C = \{0\}. $$

Exercise 2.2. Show that if A is a triangular matrix (either upper or lower), then rank A is bigger than or equal to the number of nonzero diagonal entries in A. Give an example of an upper triangular matrix A for which the inequality is strict.

Exercise 2.3. Find the null space N_A and the range R_A of the matrix
$$ A = \begin{bmatrix} 3 & 1 & 0 & 2 \\ 4 & 1 & 0 & 2 \\ 5 & 2 & 0 & 4 \end{bmatrix} \quad\text{acting on } C^4 $$
and check that the principle of conservation of dimension holds.
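A numerical illustration of the rank-nullity relation (2.4), added editorially (the random test matrix is our own choice, not the book's).

```python
# For a random A in C^{p x q}, compute rank A numerically and read off dim N_A
# from the conservation-of-dimension formula (2.4): q = dim N_A + rank A.
import numpy as np

rng = np.random.default_rng(0)
p, q = 4, 7
A = rng.standard_normal((p, q)) + 1j * rng.standard_normal((p, q))

rank = np.linalg.matrix_rank(A)       # dim R_A
dim_null = q - rank                   # dim N_A, by (2.4)
print(rank, dim_null, rank + dim_null == q)   # e.g. 4 3 True
```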

2.3. What you need to know about rank

The next theorem is a little on the long side but has the advantage that it plus the conservation of dimension for A ∈ C^{p×q},
(2.6)  q = dim N_A + dim R_A,
contains almost everything you need to know about rank in one location.

Theorem 2.2. If A ∈ C^{p×q}, B ∈ C^{q×r}, and C ∈ C^{n×p}, then:
(1) rank AB ≤ min{rank A, rank B}.
(2) N_{A^H A} = N_A, rank A^H A = rank A, and R_{A^H A} = R_{A^H}.
(3) rank A = rank A^H = rank A^T.
(4) rank A ≤ min{p, q}.
(5) A is right invertible ⇐⇒ R_A = C^p ⇐⇒ rank A = p.
(6) A is left invertible ⇐⇒ N_A = {0} ⇐⇒ rank A = q.
(7) If B is right invertible, then R_{AB} = R_A and rank AB = rank A.
(8) If C is left invertible, then N_{CA} = N_A and rank CA = rank A.
(9) If B ∈ C^{q×p}, then q + rank(I_p − AB) = p + rank(I_q − BA).

Proof. Since R_{AB} ⊆ R_A, it is clear that rank AB ≤ rank A. The auxiliary inequality rank AB ≤ rank B then follows from the fact that if {v_1, . . . , v_k} is a basis for R_B, then R_{AB} ⊆ span{Av_1, . . . , Av_k}. Thus, (1) holds.
Suppose next that x ∈ N_{A^H A} and let y = Ax. Then

$$ \sum_{j=1}^{p} |y_j|^2 = y^H y = (Ax)^H Ax = x^H (A^H A x) = x^H 0 = 0. $$
Therefore, Ax = y = 0, i.e., N_{A^H A} ⊆ N_A. Thus, as the inclusion N_A ⊆ N_{A^H A} is obvious, the first equality in (2) holds. The second equality in (2) then follows from the principle of conservation of dimension applied first to A ∈ C^{p×q} and then to A^H A ∈ C^{q×q}:
q = dim N_A + dim R_A = dim N_{A^H A} + dim R_{A^H A}.
Consequently, dim R_A = dim R_{A^H A} and hence, in view of (1), rank A = rank A^H A ≤ rank A^H. Since the last inequality may be applied to A^H as well as to A and (A^H)^H = A, it follows that rank A^H ≤ rank A. Thus, equality must hold. Moreover, dim R_{A^H A} = dim R_{A^H} and therefore the self-evident inclusion R_{A^H A} ⊆ R_{A^H} must be an equality. This completes the proof of (2) and the first equality in (3); the second is left to the reader.
Assertion (4) is immediate from (1) and the observation that A = I_p A = A I_q.
Suppose next that A ∈ C^{p×q} is right invertible. Then there exists a matrix B ∈ C^{q×p} such that AB = I_p and hence, by (1) and (4),
p = rank AB ≤ rank A ≤ p =⇒ rank A = p =⇒ R_A = C^p.
Conversely, if R_A = C^p, then the equations Ax_j = b_j, j = 1, . . . , p, are solvable for every choice of the vectors b_j. If, in particular, b_j is set equal to the j'th column of the identity matrix I_p, then
A[x_1 · · · x_p] = [b_1 · · · b_p] = I_p.
Thus, R_A = C^p =⇒ A is right invertible, since the q × p matrix X = [x_1 · · · x_p] with columns x_1, . . . , x_p is a right inverse of A. Consequently, (5) holds.
Next, (6) follows from (5) and the observation that
N_A = {0} ⇐⇒ rank A = q  (by (2.4))
⇐⇒ rank A^H = q  (by (3))
⇐⇒ A^H is right invertible  (by (5))
⇐⇒ A is left invertible.
The main step in the justification of (7) is to show that if B is right invertible, then R_A ⊆ R_{AB}, because the opposite inclusion is valid for every B ∈ C^{q×r}, right invertible or not. But, if BD = I_q for some D ∈ C^{r×q}, then
R_A = R_{A(BD)} = R_{(AB)D} ⊆ R_{AB} ⊆ R_A.
Therefore, (7) holds, since all these spaces must be equal.
Suppose next that C is left invertible. Then EC = I_p for some E ∈ C^{p×n} and
N_A = N_{(EC)A} = N_{E(CA)} ⊇ N_{CA} ⊇ N_A.
Therefore (8) holds, since all these spaces must be equal.
Item (9) is listed here for ease of access; the proof is postponed to Exercise 3.17. □

It is important to keep in mind the following implications of Theorem 2.2:
(1) If A ∈ C^{p×q} is right invertible, then p ≤ q, i.e., A is either a square matrix or a fat matrix.
(2) If A ∈ C^{p×q} is left invertible, then q ≤ p, i.e., A is either a square matrix or a thin matrix.
(3) If A ∈ C^{p×p}, then N_A = {0} ⇐⇒ R_A = C^p.
(4) If A, B ∈ C^{p×p}, then AB = I_p ⇐⇒ BA = I_p ⇐⇒ A and B are both invertible.

The assumption that A and B are square matrices is crucial for the validity of (4):
$$ \begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = 1, \quad\text{whereas}\quad \begin{bmatrix} 1 \\ 0 \end{bmatrix} \begin{bmatrix} 1 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}. $$
In a similar vein,
$$ \begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} = 0, \quad\text{whereas}\quad \begin{bmatrix} 0 \\ 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}. $$

Exercise 2.4. Show that the matrix [1 0] has infinitely many right inverses and no left inverses, whereas the matrix [1 0]^T has infinitely many left inverses and no right inverses.

Exercise 2.5. Show that if A ∈ C^{p×q}, B ∈ C^{q×p}, and AB is invertible, then q ≥ p, A is right invertible, and B is left invertible; then show that the converse is false when q > p.

Exercise 2.6. Let A ∈ C^{n×n} and suppose that A^{k−1} ≠ O, but A^k = O. Show that
$$ \operatorname{rank} \begin{bmatrix} A^{k-1} & A^{k-2} & \cdots & I_n \\ O & A^{k-1} & \cdots & A \\ \vdots & & \ddots & \vdots \\ O & O & \cdots & A^{k-1} \end{bmatrix} = n. $$
[HINT: The given matrix can be expressed as the product of its last block column with its first block row.]

Exercise 2.7. Find a matrix A ∈ C^{p×q} and an invertible matrix C ∈ C^{p×p} such that R_{CA} ≠ R_A. Show that
(2.7)  R_{CA} = C R_A  for any C ∈ C^{m×p}, invertible or not.

Exercise 2.8. Let A ∈ C^{4×5}, let v_1, v_2, v_3 be a basis for R_A, and let V = [v_1 v_2 v_3]. Show that V^H V is invertible, that C = V(V^H V)^{−1} V^H is not left invertible, and yet R_C = R_{CA}.

Exercise 2.9. Let A ∈ C^{p×q}, B ∈ C^{q×r}. Show that: (1) if A and B are both left invertible, then AB is left invertible; (2) if A and B are both right invertible, then AB is right invertible.

Exercise 2.10. Find a matrix A ∈ C^{p×q} and an invertible matrix B ∈ C^{q×q} such that N_{AB} ≠ N_A. Show that
(2.8)  B N_{AB} ⊆ N_A  for every B ∈ C^{q×r}, with equality if B is invertible.

Exercise 2.11. Let A ∈ C^{p×q}, C ∈ C^{m×p}, and let {u_1, . . . , u_k} be a basis for R_A. Show that if C is left invertible, then {Cu_1, . . . , Cu_k} is a basis for R_{CA}.

Exercise 2.12. Find a pair of matrices A ∈ C^{p×q} and B ∈ C^{p×p} such that B is not left invertible and yet {Bu_1, . . . , Bu_k} is a basis for R_{BA} for every basis {u_1, . . . , u_k} of R_A.

Exercise 2.13. Show that if A ∈ C^{p×q}, then rank A = rank A^T; however, rank A^T A is not always equal to rank A.

Exercise 2.14. Show that if A ∈ C^{p×q} and B ∈ C^{p×r}, then
(2.9)  rank [A  B] = p ⇐⇒ N_{A^H} ∩ N_{B^H} = {0}.

Exercise 2.15. Let A ∈ C^{p×q}, B ∈ C^{q×r}, and let {u_1, . . . , u_k} be a basis for R_B. Show that {Au_1, . . . , Au_k} is a basis for R_{AB} if and only if R_B ∩ N_A = {0}.

Exercise 2.16. Show that if A ∈ C^{p×q} and B ∈ C^{q×p}, then N_{AB} = {0} if and only if N_A ∩ R_B = {0} and N_B = {0}.

Exercise 2.17. Find a matrix A ∈ C^{p×q} and a vector b ∈ C^p such that N_A = {0} and yet the equation Ax = b has no solutions.

Exercise 2.18. Find a basis for R_A and N_A if
$$ A = \begin{bmatrix} 1 & 3 & 1 & 8 & 2 \\ 0 & 1 & 2 & 1 & 3 \\ 1 & -2 & 3 & 3 & 1 \\ 1 & 6 & 11 & 5 & 9 \end{bmatrix}. $$
[REMARK: A systematic way of tackling such problems will be presented in the next chapter.]

2.4. {0, 1, ∞}

Theorem 2.3. The equation Ax = b has either 0, 1, or infinitely many solutions.

Proof. There are three possibilities to consider:
(1) b ∉ R_A.
(2) b ∈ R_A and N_A = {0}.
(3) b ∈ R_A and N_A ≠ {0}.
In case (1) the equation Ax = b has no solutions. Suppose next that b ∈ R_A and that x_1 and x_2 are both solutions to the given equation. Then the identities
0 = b − b = Ax_1 − Ax_2 = A(x_1 − x_2)
imply that (x_1 − x_2) ∈ N_A. Thus, in case (2) x_1 − x_2 = 0; i.e., the equation has exactly one solution, whereas in case (3) it has infinitely many solutions: If Ax = b and u ∈ N_A, then A(x + αu) = b for every scalar α. □

Exercise 2.19. Find a system of 5 equations and 3 unknowns that has exactly one solution and a system of 3 equations and 5 unknowns that has no solutions.

Exercise 2.20. Let n_L = n_L(A) and n_R = n_R(A) denote the number of left and right inverses, respectively, of a matrix A ∈ C^{p×q}. Show that the combinations (n_L = 0, n_R = 0), (n_L = 0, n_R = ∞), (n_L = 1, n_R = 1), and (n_L = ∞, n_R = 0) are possible.

Exercise 2.21. In the notation of the previous exercise, show that the combinations (n_L = 0, n_R = 1), (n_L = 1, n_R = 0), (n_L = ∞, n_R = 1), (n_L = 1, n_R = ∞), and (n_L = ∞, n_R = ∞) are impossible.

Exercise 2.22. Find the set of all right inverses B to the matrix A = [A_{11}  A_{12}] and the set of all left inverses C to A^T when A_{11} is invertible.

2.5. Block triangular matrices

Theorem 2.4. If A ∈ C^{n×n} is a block triangular matrix that is either of the form
(2.10)  $$ A = \begin{bmatrix} A_{11} & A_{12} \\ O & A_{22} \end{bmatrix} \quad\text{or the form}\quad A = \begin{bmatrix} A_{11} & O \\ A_{21} & A_{22} \end{bmatrix} $$
with A_{11} ∈ C^{p×p} and A_{22} ∈ C^{q×q}, then A is invertible if and only if A_{11} and A_{22} are both invertible. Moreover, if A_{11} and A_{22} are invertible, then
(2.11)  $$ \begin{bmatrix} A_{11} & A_{12} \\ O & A_{22} \end{bmatrix}^{-1} = \begin{bmatrix} A_{11}^{-1} & -A_{11}^{-1} A_{12} A_{22}^{-1} \\ O & A_{22}^{-1} \end{bmatrix} \quad\text{and}\quad \begin{bmatrix} A_{11} & O \\ A_{21} & A_{22} \end{bmatrix}^{-1} = \begin{bmatrix} A_{11}^{-1} & O \\ -A_{22}^{-1} A_{21} A_{11}^{-1} & A_{22}^{-1} \end{bmatrix}. $$
(A numerical check of the first formula in (2.11) appears at the end of this section.)

Proof. Suppose first that A is an invertible block upper triangular matrix. Then there exists a matrix B ∈ C^{n×n} such that AB = I_n and BA = I_n. Thus, upon writing B in the block form that is compatible with the block form of A and invoking the first of these formulas, we obtain
$$ AB = \begin{bmatrix} A_{11} & A_{12} \\ O & A_{22} \end{bmatrix} \begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix} = \begin{bmatrix} I_p & O_{p\times q} \\ O_{q\times p} & I_q \end{bmatrix}. $$
But this is equivalent to the following four identities:
(1) A_{11} B_{11} + A_{12} B_{21} = I_p.
(2) A_{22} B_{21} = O_{q×p}.
(3) A_{11} B_{12} + A_{12} B_{22} = O_{p×q}.
(4) A_{22} B_{22} = I_q.
Then, since A_{22}, B_{22} ∈ C^{q×q} and A_{11}, B_{11} ∈ C^{p×p}, (4), (2), and (1), considered in that order, imply that
A_{22} is invertible,  B_{21} = O_{q×p},  A_{11} B_{11} = I_p,  and  A_{11} is invertible.
Therefore, the assertion that A invertible implies that A_{11} and A_{22} are invertible is verified. The converse implication is justified by noting that if A_{11} and A_{22} are invertible, then the first block matrix in (2.11) is well-defined, and then checking by direct calculation that the block upper triangular matrix B with these block entries really is the inverse of A. (This is logically correct but leaves open the question of where the entries came from. The answer is that once you know that B_{22} = A_{22}^{−1} and B_{11} = A_{11}^{−1}, then the formula B_{12} = −A_{11}^{−1} A_{12} A_{22}^{−1} is obtained from (3).)
The analysis for block lower triangular matrices is similar and is left to the reader. (Another option is to exploit the fact that A is block lower triangular if and only if A^T is block upper triangular.) □

Exercise 2.23. Show that if A ∈ C^{n×n} is an invertible triangular matrix with entries a_{ij} ∈ C for i, j = 1, . . . , n, then a_{ii} ≠ 0 for i = 1, . . . , n. [HINT: Use Theorem 2.4 to show that if the claim is true for n = k, then it is also true for n = k + 1.]
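A numerical check of the first formula in (2.11), added editorially; the particular random blocks (shifted to be comfortably invertible) are our own choice.

```python
# Build a block upper triangular A and verify that the block matrix B given by
# the first formula in (2.11) really is the inverse of A.
import numpy as np

rng = np.random.default_rng(1)
p, q = 3, 2
A11 = rng.standard_normal((p, p)) + 5 * np.eye(p)
A22 = rng.standard_normal((q, q)) + 5 * np.eye(q)
A12 = rng.standard_normal((p, q))

A = np.block([[A11, A12], [np.zeros((q, p)), A22]])
B = np.block([[np.linalg.inv(A11), -np.linalg.inv(A11) @ A12 @ np.linalg.inv(A22)],
              [np.zeros((q, p)), np.linalg.inv(A22)]])
print(np.allclose(A @ B, np.eye(p + q)), np.allclose(B @ A, np.eye(p + q)))  # True True
```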

2.6. Supplementary notes This chapter is partially adapted from Chapters 1 and 2 in [30]. Theorem 2.2 and the keep in mind notes are new to this edition. Formula (3.2) in the next chapter exhibits the secret behind the mysterious equality rank A = rank AT . The equivalence NA = {0} ⇐⇒ RA = C p for matrices A ∈ C p×p is a special case of the Fredholm alternative, which, in its most provocative form, states that if the solution to the equation Ax = b is unique, then it exists.

Chapter 3

Gaussian elimination

Gaussian elimination is a systematic way of passing from a given system of equations Ax = b

to a new system of equations U x = c

that is easier to analyze. The passage from the given system to the new system is effected by multiplying both sides of the given system successively on the left by an appropriately chosen sequence of invertible matrices. The restriction to invertible multipliers is essential. Otherwise, the new system may not have the same set of solutions as the given one. In particular, the left multipliers will be either permutation matrices P (which serve to interchange rows) or lower triangular matrices with ones on the diagonal E (which serve to add multiples of one row to another). Thus, for example, if A ∈ C 3×4 , then ⎤ ⎡ ⎤ ⎡ ⎤⎡ a21 a22 a23 a24 0 1 0 a11 a12 a13 a14 P A = ⎣1 0 0⎦ ⎣a21 a22 a23 a24 ⎦ = ⎣a11 a12 a13 a14 ⎦ , a31 a32 a33 a34 0 0 1 a31 a32 a33 a34 whereas



⎤⎡ 1 0 0 a11 · · · ⎣ ⎦ ⎣ a21 · · · EA = α 1 0 a31 · · · β 0 1

⎤ ⎡ a11 a14 ··· ⎦ ⎣ a24 = αa11 + a21 · · · a34 βa11 + a31 · · ·

⎤ a14 αa14 + a24 ⎦ . βa14 + a34

If A is a square matrix, then U will be upper triangular. If A is not square, then U will be an upper echelon matrix (which will be defined below). In practice, Gaussian is carried out by first forming the  multiplication   augmented matrix A = A b and then by carrying out the indicated row 21

22

3. Gaussian elimination

operations without referring to the corresponding matrix multipliers on the left. This works because if P1 , . . . , Pk is a sequence of permutation matrices and E1 , . . . , Ek is a sequence of lower triangular matrices with ones on the diagonal such that     (3.1) Ek Pk · · · E1 P1 A b = U c , then the matrix G = Ek Pk · · · E1 P1 is invertible and (3.1) ensures that Ax = b ⇐⇒ GAx = Gb ⇐⇒ U x = c . The following notions will prove useful: • Upper echelon: If U ∈ C p×q and rank U = k, then U is an upper echelon matrix if: (1) each of the first k rows contains at least one nonzero entry; (2) the first nonzero entry in row i lies to the left of the first nonzero entry in row i+1 for i = 1, . . . , k −1; (3) the entries in the remaining p − k rows (if any) are all zero (because of this special structure, rank U = rank U T for every upper echelon matrix U ). Thus, for example, if ⎡ ⎤ ⎡ ⎤ ⎡ ⎤ 3 6 2 4 1 0 0 2 3 1 0 0 0 ⎢0 0 1 0 5 0⎥ ⎢ ⎥ ⎢ ⎥ ⎥ , W = ⎢0 0 6 0⎥ , and X = ⎢4 2 3⎥ , V =⎢ ⎣0 0 0 0 2 0⎦ ⎣0 0 0 0⎦ ⎣0 0 6⎦ 0 0 0 0 0 0 0 0 0 0 0 5 0 then V and W are upper echelon matrices, while X is not. • Pivots: The first nonzero entry in each row of an upper echelon matrix is termed a pivot. In the preceding display, V has the 3 pivots v11 , v23 , and v35 , and rank V = rank V T = 3; W has the 2 pivots w12 and w23 , and rank W = rank W T = 2. Since a byproduct of Gaussian elimination is the fact that if A ∈ C p×q and rank A = k ≥ 1, then there exists an invertible matrix G ∈ C p×p such that GA = U is in upper echelon form with k pivots, we now can provide a more transparent proof of the fact that rank A = rank AT : (3.2)

rank A = rank U = number of pivots = rank U T = rank AT . • Pivot columns: A column in an upper echelon matrix U will be called a pivot column if it contains a pivot. Thus, the first, third, and fifth columns of V are the pivot columns of V , whereas the second and fourth columns of W are the pivot columns of W . Moreover, if GA = U , then the columns ai1 , . . . , aik of A that correspond in position to the pivot columns ui1 , . . . , uik of U will also be called pivot columns (even though the pivots are in U not in A); they form a basis for RA . The entries xi1 , . . . , xik in x ∈ C q will be referred to as pivot variables.

3.1. Examples

23

3.1. Examples Example 3.1. Consider the equation Ax = b, where ⎡ ⎤ ⎡ ⎤ 0 2 3 1 1 ⎣ ⎦ ⎣ (3.3) A= 1 5 3 4 and b = 2⎦ . 2 6 3 2 1 1. Construct the augmented matrix ⎡ ⎤ 0 2 3 1 1    = A b = ⎣1 5 3 4 2⎦ ; (3.4) A 2 6 3 2 1 it is introduced to ensure that the row operations that are applied to the matrix A are also applied to the vector b.  to get a nonzero 2. Interchange the first two rows of A per left-hand corner of the new matrix ⎡ ⎡ ⎤ 0 1 1 5 3 4 2  , with P1 = ⎣1 0 ⎣0 2 3 1 1⎦ = P1 A 0 0 2 6 3 2 1

entry in the up⎤ 0 0⎦ . 1

 from its bottom row 3. Subtract two times the top row of the matrix P1 A to get ⎡ ⎤ ⎡ ⎤ 1 0 0 1 5 3 4 2  , where E1 = ⎣ 0 1 0 ⎦ ⎣ 0 2 3 1 1 ⎦ = E1 P1 A −2 0 1 0 −4 −3 −6 −3 is chosen to obtain all zeros below the pivot in the first column. 4. Add two times the ⎡ 1 5 ⎣ 0 2 0 0 where

 to its third row to get second row of E1 P1 A ⎤ 3 4 2   = U c , 3 1 1 ⎦ = E2 E1 P1 A 3 −4 −1 ⎡

⎤ 1 0 0 E2 = ⎣0 1 0⎦ 0 2 1

is chosen to obtain all zeros below the pivot in the second column, U = E2 E1 P1 A is in upper echelon form, and c = E2 E1 P1 b.

24

3. Gaussian elimination

5. Try to solve the new system of equations ⎡ ⎤ ⎡ ⎤ x1 ⎡ ⎤ 1 5 3 4 ⎢ ⎥ 2 x 2⎥ ⎣ 1 ⎦ 1 ⎦⎢ (3.5) Ux = ⎣ 0 2 3 ⎣ x3 ⎦ = 0 0 3 −4 −1 x4 by solving for the pivot variables x1 , x2 , x3 in terms of x4 , working from the bottom row up: row 3: 3x3 − 4x4 = −1 =⇒ 3x3 = 4x4 − 1 , row 2: 2x2 + 3x3 + x4 = 1 =⇒ 2x2 = −3x3 − x4 + 1 , row 1: x1 + 5x2 + 3x3 + 4x4 = 2 =⇒ x1 = −5x2 − 3x3 − 4x4 + 2 . Thus, each of the pivot variables x1 , x2 , x3 can be expressed in terms of the variable x4 : x3 = (4x4 − 1)/3, x2 = (−5x4 + 2)/2, and x1 = (9x4 − 4)/2, i.e., ⎡ ⎤ ⎤ ⎡ ⎤ ⎡ 9/2 −2 x1 ⎢−5/2⎥ ⎢x2 ⎥ ⎢ 1 ⎥ ⎢ ⎥ ⎥ ⎢ ⎥ x=⎢ ⎣x3 ⎦ = ⎣−1/3⎦ + x4 u with u = ⎣ 4/3 ⎦ x4 1 0 is a solution of the system of equations (3.5), or equivalently, (3.6)

E2 E1 P1 Ax = E2 E1 P1 b

(with A and b as in (3.3)) for every choice of x4 . Therefore, u ∈ NA and, as the matrices E2 , E1 , and P1 are invertible, x is a solution of (3.6) if and only if Ax = b, i.e., if and only if x is a solution of the original equation. 6. Check that the computed solution solves the original system of equations. Strictly speaking, this step is superfluous, because the construction guarantees that every solution of the new system is a solution of the old system, and vice versa. Nevertheless, this is an extremely important step, because it gives you a way to check your calculations.  Exercise 3.1. Show that for the matrix A considered in the preceding  T example NA = span{ 9/2 −5/2 4/3 1 }. Example 3.2. Let ⎡

0 ⎢ 0 A=⎢ ⎣ 0 0

0 1 2 0

3 0 3 6

⎤ 4 7 0 0 ⎥ ⎥ 6 8 ⎦ 8 14

⎤ b1 ⎢ b2 ⎥ ⎥ b=⎢ ⎣ b3 ⎦ . b4 ⎡

and

3.1. Examples

25

Then a vector x ∈ C 5 is a solution of the equation Ax = b if and only if ⎡

0 ⎢ 0 ⎢ ⎣ 0 0

1 0 0 0

0 3 0 0

⎡ ⎤ x 0 ⎢ 1 ⎢ x2 7 ⎥ ⎥ ⎢ x3 1 ⎦⎢ ⎣ x4 0 x5

0 4 2 0



⎡ b2 ⎥ ⎥ ⎢ b1 ⎥=⎢ ⎥ ⎣ b3 − 2b2 − b1 ⎦ b4 − 2b1

⎤ ⎥ ⎥ . ⎦

The pivots of the upper echelon matrix on the left are in columns 2, 3, and 4. Therefore, upon solving for the pivot variables x2 , x3 , and x4 in terms of x1 , x5 , and b1 , . . . , b4 from the bottom row up, we obtain the formulas 0 = b4 − 2b1 , 2x4 = b3 − 2b2 − b1 − x5 , 3x3 = b1 − 4x4 − 7x5 , = 3b1 + 4b2 − 2b3 − 5x5 , x2 = b2 . But this is the same as ⎤ ⎡ ⎤ ⎡ ⎡ x1 1 x1 ⎢ ⎥ ⎢ ⎢ x2 ⎥ b 2 ⎥ ⎢ ⎥ ⎢ 0 ⎢ ⎢ x3 ⎥ = ⎢ (−5x5 + 3b1 + 4b2 − 2b3 )/3 ⎥ = x1 ⎢ 0 ⎥ ⎢ ⎥ ⎢ ⎢ ⎣ (−x5 + b3 − 2b2 − b1 )/2 ⎦ ⎣ 0 ⎣ x4 ⎦ x5 x5 0 ⎡ ⎤ ⎡ ⎤ ⎡ 0 0 0 ⎢ 0 ⎥ ⎢ 1 ⎥ ⎢ 0 ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ + b1 ⎢ ⎢ 1 ⎥ + b2 ⎢ 4/3 ⎥ + b3 ⎢ −2/3 ⎣ −1/2 ⎦ ⎣ −1 ⎦ ⎣ 1/2 0 0 0





⎥ ⎢ ⎥ ⎢ ⎥ + x5 ⎢ ⎥ ⎢ ⎣ ⎦ ⎤

0 0 −5/3 −1/2 1

⎤ ⎥ ⎥ ⎥ ⎥ ⎦

⎥ ⎥ ⎥, ⎥ ⎦

i.e., in self-evident notation, x = x1 u1 + x5 u2 + b1 u3 + b2 u4 + b3 u5 . Thus, a vector b ∈ C 4 belongs to RA if and only if b4 = 2b1 , and, in this case, the displayed vector x is a solution of the equation Ax = b for every choice of x1 and x5 . Therefore, x1 u1 + x5 u2 is a solution of the equation Ax = 0 for every choice of x1 , x5 ∈ C. Thus, u1 , u2 ∈ NA and RA = span{a2 , a3 , a4 }, the pivot columns of A. Since u1 and u2 are linearly independent and dim NA = 2 by conservation of dimension, {u1 , u2 } is a  basis for NA .

26

3. Gaussian elimination

3.2. A remarkable formula One of the surprising dividends of Gaussian elimination is: Theorem 3.1. If A ∈ C p×q and rank A = r ≥ 1, then there exists a lower triangular p×p matrix L with ones on the diagonal, an upper echelon matrix U ∈ C p×q with rank U = r, and a p × p permutation matrix P such that (3.7)

P A = LU .

Discussion. To understand where this formula comes from, suppose that A is a nonzero 4 × 5 matrix and let e1 , . . . , e4 denote the columns of I4 . Then there exists a choice of permutation matrices P1 , P2 , P3 and lower triangular matrices ⎡ ⎤ ⎡ ⎤ 1 0 0 0 0 ⎢ a 1 0 0⎥ ⎢ ⎥ T ⎥ ⎢a⎥ E1 = ⎢ ⎣ b 0 1 0⎦ = I4 + u1 e1 with u1 = ⎣ b ⎦ , c 0 0 1 c ⎡ ⎤ ⎡ ⎤ 1 0 0 0 0 ⎢ 0 1 0 0⎥ ⎢ ⎥ T ⎥ ⎢0⎥ E2 = ⎢ ⎣0 d 1 0⎦ = I4 + u2 e2 with u2 = ⎣d⎦ , 0 e 0 1 e ⎡ ⎤ ⎡ ⎤ 1 0 0 0 0 ⎢ 0 1 0 0⎥ ⎢ ⎥ ⎥ = I4 + u3 eT3 with u3 = ⎢ 0 ⎥ , E3 = ⎢ ⎣ 0 0 1 0⎦ ⎣0⎦ 0 0 f 1 f such that E3 P3 E2 P2 E1 P1 A = U

(3.8)

is in upper echelon form. The crucial fact is that P2 is chosen so that it interchanges the second row of E1 P1 A with its third or fourth row, if necessary, whereas P3 interchanges the third and fourth rows, if necessary. Consequently, eTi Pj = ei if i < j and P2 E1 = P2 (I4 + u1 eT1 ) = P2 + (P2 u1 )eT1 = E1 P2 , where E1 = I4 + (P2 u1 )eT1 has the same form as E1 . Similarly, Pi Ej = Ej Pi

if i > j ,

Ej

denotes a matrix of the same form as Ej . Thus, in self-evident where notation , E3 P3 E2 P2 E1 P1 = E3 E2 P3 E1 P2 P1 = E3 E2 E1 P3 P2 P1 = EP , which yields (3.7) with L = E −1 .



3.2. A remarkable formula

27

Exercise 3.2. Show that if A ∈ C n×n is an invertible matrix, then there exists a permutation matrix P ∈ R n×n such that (3.9)

P A = LDU ,

where L ∈ C n×n is lower triangular with ones on the diagonal, U ∈ C n×n is upper triangular with ones on the diagonal, and D is a diagonal matrix. Exercise 3.3. Show that if L1 D1 U1 = L2 D2 U2 , where Lj , Dj , and Uj are n × n matrices of the form exhibited in Exercise 3.2, then L1 = L2 , −1 is both lower and D1 = D2 , and U1 = U2 . [HINT: L−1 2 L1 D1 = D2 U2 U1 upper triangular.] Exercise 3.4. Show that there exist a 3 × 3 permutation matrix P and a lower triangular matrix ⎡ ⎤ ⎡ ⎤⎡ ⎤ 1 0 0 0 1 0 1 0 0 B = ⎣ b21 1 0 ⎦ such that ⎣ 1 0 0 ⎦ ⎣ α 1 0 ⎦ = BP 0 0 1 β 0 1 b31 b32 1 if and only if α = 0. Exercise 3.5. Find a permutation matrix P such that P A = LU , where L is a lower triangular invertible 3 × 3 matrix ⎡ and U ⎤is an upper triangular 0 1 1 ⎣ invertible 3 × 3 matrix for the matrix A = 1 1 1⎦. 1 1 0 Exercise 3.6. Find a lower triangular matrix Ek ∈ C (k+1)×(k+1) with ones on the diagonal such that ⎡ ⎤ ⎤ ⎡ 1 ∗ ∗ ··· ∗ ∗ k 1 α ··· α ⎢0 ρ ∗ · · · ∗ ∗⎥ k−1 ⎥ ⎢ ⎥ ⎢α 1 · · · α ⎥ ⎢ .. ⎥ ⎢ . . Ek ⎢ . ⎥ .. ⎥ = ⎢ . . . ⎢ ⎥ ⎣ . . ⎦ ⎣ 0 0 0 · · · ρ ∗⎦ k k−1 ··· 1 α α 0 0 0 ··· 0 ρ with ρ = 1 − |α|2 . [HINT: Compare α times row j − 1 with row j in the matrix that is being processed.] Exercise 3.7. Show that if |α| = 1 and ⎤⎡ ⎤ ⎡ ⎤ ⎡ α ··· αk 1 x0 0 k−1 ⎥ ⎢x ⎥ ⎢ ⎢α .⎥ 1 · · · α ⎥ ⎢ 1 ⎥ ⎢ .. ⎥ ⎢ (3.10) ⎢ .. .. ⎥ ⎢ .. ⎥ = ⎢ ⎥ , ⎣ . . ⎦ ⎣ . ⎦ ⎣0⎦ k k−1 xk 1 ··· 1 α α then xk = (1 − |α|2 )−1 . [HINT: Exploit the displayed formula in Exercise 3.6; it is not necessary to compute Ek .]

28

3. Gaussian elimination

3.3. Extracting a basis Let {v1 , . . . , vk } be a set of vectors in C m . To find a basis for the subspace 

(1) Let A = v1

V = span{v1 , . . . , vk } :  · · · vk .

(2) Use Gaussian elimination to reduce A to an upper echelon matrix U. (3) The pivot columns of A form a basis for V. Example 3.3. Let ⎡ ⎤ ⎡ ⎤ ⎡ ⎤ ⎡ ⎤ 1 2 2 0 v1 = ⎣3⎦ , v2 = ⎣6⎦ , v3 = ⎣10⎦ , and v4 = ⎣2⎦ . 1 2 4 1 Then, following the the matrix ⎡ 1 ⎣ A= 3 1

indicated strategy, we apply Gaussian elimination to ⎤ 2 2 0 6 10 2 ⎦ 2 4 1

⎡ ⎤ 1 2 2 0 to get U = ⎣0 0 4 2⎦ . 0 0 0 0

The pivot columns of U are the first and the third. Therefore, by the recipe furnished above, dim RA = dim RU = 2 and RA = span{v1 , v2 , v3 , v4 } = span{v1 , v3 } . Exercise 3.8. Find a basis for ⎡ ⎤ ⎡ 2 1 ⎢ 3 ⎥ ⎢ 0 ⎢ ⎥, ⎢ ⎣ 1 ⎦ ⎣ 2 4 1

the span of the vectors ⎤ ⎡ ⎤ ⎡ ⎤ 0 3 ⎥ ⎢ 3 ⎥ ⎢ −3 ⎥ ⎥, ⎢ ⎥ ⎢ ⎥ ⎦ ⎣ −3 ⎦ , ⎣ 9 ⎦ . 2 1

You should keep in mind that if you use Gaussian elimination to obtain a basis for NA , then the pivot columns of A furnish a basis for RA . Exercise 3.9. Use Gaussian elimination to find NA and RA for following choices of the matrix A: ⎡ ⎤ ⎡ ⎤ ⎡ 3 1 2 4 1 2 0 2 1 0 0 8 ⎣ 2 1 8 7 ⎦ , ⎣ −1 −2 ⎦ ⎣ 1 1 0 , 1 2 4 3 2 6 1 1 2 −3 −7 −2 2 3 0

each of the ⎤ 1 1 ⎦. 0

3.4. Augmenting a given set of vectors to form a basis In future applications we shall need a convenient way to augment a given set of linearly independent vectors with vectors chosen from a second set of linearly independent vectors that span a larger vector space.

3.5. Computing the coefficients in a basis

29

Lemma 3.2. Let {w1 , . . . , wk } be a basis for a subspace W of a vector space V ⊆ C n with basis {v1 , . . . , v }. Then the  pivot columns of the n × (k + ) matrix A = [w1 · · · wk v1 · · · v ] will be a basis for V that includes the vectors w1 , . . . , wk . Proof. Since rank A = , the corresponding upper echelon form that is computed by Gaussian elimination will have  pivots. Moreover, it will have k pivots in the first k columns, since the vectors w1 , . . . , wk are linearly independent. The remaining −k pivots indicate the positions of the vectors   from the submatrix v1 · · · v that together with w1 , . . . , wk form a basis for V.  the Exercise 3.10. Find a basis for C 4 using  T of the vectors v1 , . . . , v4 , when w1 = 1 0 1     v1T = 1 1 1 1 , v2T = 0 1 0 1 , v3T =   0 0 0 1 .

vectors w1 ,  w2 and two   T 0 , w2 = 0 1 2 0 ,   0 0 1 1 , and v4T =

3.5. Computing the coefficients in a basis Let {u1 , . . . , uk } be a basis for a k-dimensional subspace U of Cn . Then every vector b ∈ U can be expressed as a unique linear combination of the vectors {u1 , . . . , uk }; i.e., there exists a unique set of coefficients c1 , . . . , ck such that k  cj uj . b= j=1

The problem of computing these coefficients is equivalent to the problem of solving the equation 



Ac = b ,

where A = u1 · · · uk is the n × k matrix with columns u1 , . . . , uk and c = [c1 · · · ck ]T . This problem, too, can be solved efficiently by Gaussian elimination. Exercise 3.11. Let 

u1 u2 u3 u4



1  ⎢ 0 v =⎢ ⎣ 0 0

5 2 0 0

7 1 4 0

4 6 2 0

⎤ 3 1 ⎥ ⎥. 1 ⎦ 0

Show that span{u1 , u2 , u3 } = span{u1 , u2 , u4 } and that v = au1 + bu2 + cu3 = du1 + eu2 + f u4 , and calculate the coefficients a, b, c, d, e, f in the two representations.

30

3. Gaussian elimination

3.6. The Gauss-Seidel method Gaussian elimination can be used to find the inverse of a p × p invertible matrix A by solving each of the equations Axj = ej , j = 1, . . . , p, where the right-hand side ej is the j’th column of the identity matrix Ip . Then the formula     A x1 · · · xp = e1 · · · ep = Ip   identifies X = x1 · · · xp as the inverse A−1 of A. The Gauss-Seidel method is a systematic way of organizing all p of these separate calculations into one more efficient calculation by proceeding as follows, given A ∈ C p×p , invertible or not: 1. Construct the p × 2p augmented matrix    = A Ip . A  that are designed to bring 2. Carry out elementary row operations on A A into upper echelon form U . This is equivalent to choosing an invertible matrix G so that       G A Ip = GA G = U G . 3. Observe that U is a p × p upper triangular matrix with k pivots. If k < p, then A is not invertible and the procedure grinds to a halt. If k = p, then uii = 0 for i = 1, . . . , p. Therefore, by Theorem 2.4, there exists an upper triangular matrix F such that F U = Ip and hence         F G A Ip = F GA F G = F U F G = Ip F G . Therefore, A−1 = F G, which is equal to the second block in the p × 2p matrix on the right in the last display. To obtain A−1 numerically, go on to the next steps.     4. Multiply U G on the left by a diagonal matrix D = dij with dii = (uii )−1 for i = 1, . . . , p to obtain      DG , D U G = U  = DU is an upper triangular matrix with ones on the diagonal. where now U    DG working from 5. Carry out elementary row manipulations on U  to the identity. This is the bottom row up that are designed to bring U

3.6. The Gauss-Seidel method

31

 = Ip . equivalent to choosing an upper triangular matrix F such that FU Then        DG = FU  FDG = Ip F DG , F U  = F DGA = Ip , the second block on the right is and hence, as FU FDG = F G = A−1 . 6. Check! Multiply your candidate for A−1 by A to see if you really get the identity matrix Ip as an answer. Thus, for example, if ⎡ ⎤ 1 3 1 ⎢ ⎥ A = ⎣ 2 8 4 ⎦ ,

⎡ then

1 3 1

1 0 0

0 4 7

0 0 1

 = ⎢ A ⎣ 2 8 4

0 4 7



⎥ 0 1 0 ⎦ ,

and two steps of Gaussian elimination lead in turn to the forms ⎡ ⎤ 1 3 1 1 0 0 ⎥ 1 = ⎢ −2 1 0 ⎦ A ⎣ 0 2 2 0 4 7 and



0 0 1

1 3 1

1

2 = ⎢ A ⎣ 0 2 2

−2



1 0

3 = ⎢ A ⎣ 0

1 2

0 0

0



⎥ 1 0 ⎦ .

4 −2 1

0 0 3 Next, let



1 3 1

⎥  ⎢ 0 ⎦A 2 = ⎣ 0 1 1 1 3



0 0

1

0

−1

1 2 − 23

4 3

0 0 1

0



⎥ 0 ⎦ , 1 3

3 from the second and first rows to and then subtract the bottom row of A obtain ⎡ 2 1 ⎤ 1 3 0 − 13 3 −3 7 1 ⎥ 4 = ⎢ − 73 A ⎣ 0 1 0 6 −3 ⎦ . 0 0 1

4 3

− 23

1 3

4 from The next to last step is to subtract three times the second row of A the first to obtain ⎡ 2 ⎤ 20 1 0 0 − 17 3 6 3 7 7 1 ⎥ 5 = ⎢ 0 1 0 − − A ⎣ 3 6 3 ⎦ . 0 0 1

4 3

− 23

1 3

32

3. Gaussian elimination

5 is A−1 . The final step is to The matrix built from the last 3 columns of A check that this matrix, which is conveniently written as ⎡ ⎤ 40 −17 4 1 ⎢ ⎥ 7 −2 ⎦ , B = ⎣ −14 6 8 −4 2 is indeed the inverse of A, i.e., AB = I3 .

⎤ 1 3 2 Exercise 3.12. Find the inverse of the matrix ⎣ 2 4 1 ⎦ by the Gauss0 4 2 Seidel method. Exercise 3.13. Use ⎡ 1 ⎢ 1 ⎢ (3.11) ⎣ 1 1



the Gauss-Seidel method to show that ⎤ ⎤−1 ⎡ 1 0 0 0 0 0 0 ⎢ 0 0⎥ 1 0 0 ⎥ ⎥. ⎥ = ⎢−1 1 ⎣ ⎦ 0 −1 1 0⎦ 1 1 0 0 0 −1 1 1 1 1

To compute the right inverses of a matrix A ∈ C p×n with rank A = p and q = n − p ≥ 1, use Gaussian elimination to find the pivot columns of choose a permutation matrix P ∈ R n×n such that AP =   A and then B11 B12 and B11 is invertible. Then express X ∈ C n×p in block form with blocks X11 ∈ C p×p and X21 ∈ C q×p and observe that

  X11  −1 = Ip ⇐⇒ X11 = B11 AP X = Ip ⇐⇒ B11 B12 (Ip − B12 X21 ) . X21 −1 (Ip −B12 X21 ) is a right inverse Consequently the matrix P X with X11 = B11 of A for every choice of the pq entries in X21 .

The left inverses of a matrix A ∈ C n×q with rank A = q and n > q are obtained easily from the right inverses of AT .

3.7. Block Gaussian elimination Gaussian elimination can also be carried out in block matrices provided that appropriate range conditions are fulfilled. Thus, for example, if ⎤ ⎡ A11 A12 A13 A = ⎣ A21 A22 A23 ⎦ A31 A32 A33 is a block matrix and if there exists a pair of matrices K1 and K2 such that (3.12)

A21 = K1 A11

and

A31 = K2 A11 ,

3.7. Block Gaussian elimination

33

then ⎡

⎤ ⎡ ⎤ I O O A11 A12 A13 ⎣ −K1 I O ⎦ A = ⎣ O −K1 A12 + A22 −K1 A13 + A23 ⎦ . −K2 O I O −K2 A12 + A32 −K2 A13 + A33

This operation is the block matrix analogue of clearing the first column in conventional Gaussian elimination. The implementation of such a step depends critically on the existence of matrices K1 and K2 that fulfill the conditions in (3.12). If A11 is invertible, then clearly K1 = A21 A−1 11 and −1 K2 = A31 A11 meet the requisite conditions. However, matrices K1 and K2 that satisfy the conditions in (3.12) may exist even if A11 is not invertible. Lemma 3.3. Let A ∈ C p×q , B ∈ C p×r , and C ∈ C r×q . Then: (1) There exists a matrix K ∈ C r×q such that A = BK if and only if RA ⊆ RB . (2) There exists a matrix L ∈ C p×r such that A = LC if and only if RAH ⊆ RC H . Proof. Suppose first that RA ⊆ RB and let ej denote the j’th column of Iq . Then the presumed range inclusion implies that Aej =   Buj for some vector uj ∈ C r for j = 1, . . . , q and hence that A = A e1 · · · eq =     B u1 · · · uq = BK, with K = u1 · · · uq . This proves half of (1). The other half is easy and is left to the reader together with (2).  Under appropriate assumptions, a double application of block Gaussian elimination (once on the left and once on the right) is applicable and leads to useful factorization formulas for square matrices. Theorem 3.4. If A11 ∈ C p×p , A12 ∈ C p×q , A21 ∈ C q×p , A22 ∈ C q×q and the range conditions (3.13)

RA12 ⊆ RA11

and

RAH ⊆ RAH 21

11

are in force, then there exists a pair of matrices K ∈ C p×q and L ∈ C q×p such that (3.14)

A12 = A11 K

and

A21 = LA11

and hence    

Ip K O Ip O A11 A11 A12 = . (3.15) A21 A22 L Iq O A22 − LA11 K O Iq Proof. Lemma 3.3 guarantees the existence of a pair of matrices L ∈ C q×p and K ∈ C p×q that meet the conditions in (3.14). Thus,   

A12 A11 A12 A11 Ip O = A21 A22 O A22 − LA12 −L Iq

34

and

3. Gaussian elimination

A12 A11 O A22 − LA12



Ip −K O Iq



=

O A11 O A22 − LA12

 ,

which in turn leads easily to formula (3.15).



The right-hand side of (3.15) is the product of block lower, block diagonal, and block upper triangular matrices. The next exercise reverses the order. Exercise 3.14. Let A ∈ C n×n be a four-block matrix with entries A11 ∈ C p×p , A12 ∈ C p×q , A21 ∈ C q×p , A22 ∈ C q×q , where n = p + q. Show that if the range conditions (3.16)

RAH ⊆ RAH 12

22

and

RA21 ⊆ RA22

are in force, then A admits a factorization of the form    

Ip M A11 − M A22 N O Ip O A11 A12 = . (3.17) A21 A22 O Iq O A22 N Iq Exercise 3.15. Show that if A11 is invertible, then formula (3.15) can be reexpressed as (3.18)    A11 Ip O O A12 Ip A−1 11 . A= Iq O Iq A21 A−1 O A22 − A21 A−1 11 11 A12 Exercise 3.16. Show that if A22 is invertible, then formula (3.17) can be reexpressed as (3.19)    O Ip A11 − A12 A−1 Ip A12 A−1 O 22 22 A21 . A= A−1 O Iq O A22 22 A21 Iq Exercise 3.17. Verify item (9) of Theorem 2.2. [HINT: Exploit formulas (3.18) and (3.19) for the block matrix with A11 = Ip , A22 = Iq and A and B in the corners.]

3.8. Supplementary notes This chapter is partially adapted from Chapters 2 and 3 of [30]. The ma−1 trices A22 − A21 A−1 11 A12 in formula (3.18) and A11 − A12 A22 A21 in formula (3.19) are referred to as the Schur complements of A11 and A22 , respectively. They make many computations transparent. If A is invertible and B = A−1 is expressed in compatible four-block form, then (3.20) .

−1 , A and A11 invertible =⇒ B22 = (A22 − A21 A−1 11 A12 ) −1 . A and A22 invertible =⇒ B11 = (A11 − A12 A−1 22 A21 )

Chapter 4

Eigenvalues and eigenvectors

• A point λ ∈ C is said to be an eigenvalue of a matrix A ∈ C n×n if there exists a nonzero vector u ∈ C n such that Au = λu, i.e., if N(A−λIn ) = {0} . Every nonzero vector u ∈ N(A−λIn ) is said to be an eigenvector of A corresponding to the eigenvalue λ.

4.1. The first step Theorem 4.1. If A ∈ C n×n , then: (1) A has at least one eigenvalue λ ∈ C. (2) A has at most n distinct eigenvalues. (3) Eigenvectors corresponding to distinct eigenvalues are linearly independent. Proof. Let u be a nonzero vector in C n . Then, since the set of n+1 vectors u, Au, . . . , An u is linearly dependent, there exists a set of complex numbers c0 , . . . , cn , not all of which are zero, such that c0 u + c1 Au + · · · + cn An u = 0 . Let k = max {j : cj = 0}. Then, by the fundamental theorem of algebra, the polynomial p(x) = c0 + c1 x + · · · + cn xn = c0 + c1 x + · · · + ck xk 35

36

4. Eigenvalues and eigenvectors

can be factored as a product of k polynomials of degree one with roots μ1 , . . . , μk ∈ C: p(x) = ck (x − μk ) · · · (x − μ1 ) . Moreover, the same holds true for polynomials in A (e.g., x2 + 5x + 6 = (x + 3)(x + 2) =⇒ A2 + 5A + 6In = (A + 3In )(A + 2In )). Correspondingly, c0 u + · · · + cn An u = c0 u + · · · + ck Ak u = ck (A − μk In ) · · · (A − μ2 In )(A − μ1 In )u = 0 . This in turn implies that there are k possibilities: (1) (A − μ1 In )u = 0. (2) (A − μ1 In )u = 0 and (A − μ2 In )(A − μ1 In )u = 0. .. . (k) (A−μk−1 In ) · · · (A−μ1 In )u = 0 and (A−μk In ) · · · (A−μ1 In )u = 0. In the first case, μ1 is an eigenvalue and u is an eigenvector. In the second case, w1 = (A − μ1 In )u is a nonzero vector in Cn and Aw1 = μ2 w1 . Therefore, (A − μ1 In )u is an eigenvector of A corresponding to the eigenvalue μ2 . .. . In the k’th case, wk−1 = (A − μk−1 In ) · · · (A − μ1 In )u is a nonzero vector in Cn and Awk−1 = μk wk−1 . Therefore, wk−1 is an eigenvector of A corresponding to the eigenvalue μk . This completes the verification of (1). Items (2) and (3) of Theorem 4.1 are stated here for perspective; they will be justified in the proof of Theorem 4.2. (But clearly, (2) follows from (3).)  Notice that the proof does not guarantee the existence of real eigenvalues for A even if A ∈ R n×n , because the polynomial p(x) = c0 + c1 x+· · ·+ck xk may have only complex roots μ1 , . . . , μk even if the coefficients c1 , . . . , ck are real.

 0 −1 Exercise 4.1. Show that if A = , then the equation Ax = μx has 1 0 a nonzero solution x ∈ C 2 if and and only if μ = ±i. If λ1 , . . . , λk are distinct eigenvalues of a matrix A ∈ C n×n , then: • The number γj = dim N(A−λj In ) is termed the geometric multiplicity of the eigenvalue λj , j = 1, . . . , k.

4.2. Diagonalizable matrices

37

• The number αj = dim N(A−λj In )n is termed the algebraic multiplicity of the eigenvalue λj , j = 1, . . . , k. • The inclusions (4.1)

N(A−λj In ) ⊆ N(A−λj In )2 ⊆ · · · ⊆ N(A−λj In )n guarantee that γj ≤ αj

(4.2)

for

j = 1, . . . , k ,

and hence (as α1 + · · · + αk = n by Theorem 5.4) that k ≤ γ1 + · · · + γk ≤ α1 + · · · + αk = n .

(4.3) • The set (4.4)

σ(A) = {λ ∈ C : N(A−λIn ) = {0}} is called the spectrum of A. Clearly, σ(A) = {λ1 , . . . , λk }, the set of all the distinct eigenvalues of the matrix A in C. • A nonzero vector u ∈ C n is said to be a generalized eigenvector of order k for the matrix A ∈ C n×n corresponding to the eigenvalue λ ∈ C if u ∈ N(A−λIn )k but u ∈ N(A−λIn )k−1 .

4.2. Diagonalizable matrices A matrix A ∈ C n×n is said to be similar to a matrix B ∈ C n×n if there exists an invertible matrix U ∈ C n×n such that A = U BU −1 ; A is said to be diagonalizable if it is similar to a diagonal matrix D, i.e., if A = U DU −1 .

(4.5)

Theorem 4.2. Let A ∈ C n×n and suppose that A has exactly k distinct eigenvalues λ1 , . . . , λk ∈ C with geometric multiplicities γ1 , . . . , γk , respectively. Then:   (1) There exists an n × (γ1 + · · · + γk ) matrix U = U1 · · · Uk with blocks Uj ∈ C n×γj such that rank U = γ1 + · · · + γk and AU = U D ,

with

D = diag{λ1 Iγ1 , . . . , λk Iγk } .

(2) k ≤ γ1 + · · · + γk ≤ n. (3) A is diagonalizable if and only if γ1 + · · · + γk = n. (4) A is diagonalizable if k = n.

38

4. Eigenvalues and eigenvectors

Discussion.

To ease the exposition, suppose that k = 3 and let

r = γ1

and

{u1 , . . . , ur }

be a basis for N(A−λ1 In ) ,

s = γ2

and

{v1 , . . . , vs }

be a basis for N(A−λ2 In ) ,

t = γ3

and

{w1 , . . . , wt }

and set

 U1 = u1 · · ·

be a basis for N(A−λ3 In ) ,

  ur , U2 = v1 · · ·

  vs , U3 = w1 · · ·

 wt .

Then       A U1 U2 U3 = AU1 AU2 AU3 = U1 (λ1 Ir ) U2 (λ2 Is ) U3 (λ3 It ) ⎤ ⎡ O O   λ1 Ir = U1 U2 U3 ⎣ O λ2 Is O ⎦ . O O λ3 It To verify (1), it remains to show that rank U = γ1 + γ2 + γ3 . Towards this end, suppose that r 

ai ui +

s 

i=1

bj vj +

j=1

t 

c k wk = 0

k=1

for some set of coefficients a1 , . . . , ar , b1 , . . . , bs , c1 , . . . , ct ∈ C and, to simplify the bookkeeping, let u=

r  i=1

ai ui ,

v=

s 

bj vj ,

and

j=1

w=

t 

c k wk .

k=1

Then Au = λ1 u, Av = λ2 v, Aw = λ3 w

and

u + v + w = 0.

Thus, 0 = (A − λ1 In )(u + v + w) = (λ2 − λ1 )v + (λ3 − λ1 )w and 0 = (A − λ2 In )((λ2 − λ1 )v + (λ3 − λ1 )w) = (λ3 − λ2 )(λ3 − λ1 )w . Since the eigenvalues are distinct,  starting from the  the last three displays, third, clearly imply that w = tk=1 ck wk = 0, v = sj=1 bj vj = 0, and  u = ri=1 ai ui = 0. Therefore, since each of these three sums is a linear combination of linearly independent vectors, all the coefficients are equal to zero. Consequently, (1) holds for k = 3. The argument is easily adapted for arbitrary positive integers k. Items (2) and (3) are immediate from (1); (4) is immediate from (2).



4.3. Invariant subspaces

39

Diagonalizable matrices are pleasant to work with. In particular, formula (4.5) implies that A2 = (U DU −1 )(U DU −1 ) = U D 2 U −1 , A3 = U D 3 U −1 , . . . , Ak = U D k U −1 . The advantage is that the powers D 2 , D 3 , . . . , D k are easy to compute: ⎤ ⎤ ⎡ ⎡ k μ1 μ1 ⎥ ⎥ ⎢ ⎢ k .. .. (4.6) D=⎣ ⎦ =⇒ D = ⎣ ⎦. . . k μn μn Exercise 4.2. Show that if a matrix A ∈ C n×n is diagonalizable, i.e., if A = U DU −1 with D = diag{λ1 , . . . , λn }, and if     and (U −1 )T = v1 · · · vn , then: U = u1 · · · un  (1) Ak = U D k U −1 = nj=1 λkj uj vjT .   (2) (A − λIn )−1 = U (D − λIn )−1 U −1 = nj=1 (λj − λ)−1 uj vjT , if λ ∈ σ(A). Exercise 4.3. Show that the matrices

 1 −1 A= and 1 1

A=

2 −1 3 −1



have no real eigenvalues, i.e., σ(A) ∩ R = ∅ in both cases. Exercise 4.4. Show that although the following upper triangular matrices ⎡ ⎤ ⎡ ⎤ ⎡ ⎤ 2 0 0 2 1 0 2 1 0 ⎣ 0 2 0 ⎦ , ⎣ 0 2 0 ⎦ , ⎣ 0 2 1 ⎦ 0 0 2 0 0 2 0 0 2 have the same diagonal, dim N(A−2I3 ) is equal to three for the first, two for the second, and one for the third. Calculate N(A−2I3 )j for j = 1, 2, 3, 4 for each of the three choices of A. n×n is a triangular matrix with entries Exercise 4.5. Show n that if A ∈ C aij , then σ(A) = i=1 {aii }.

4.3. Invariant subspaces A subspace U of C n is invariant under A if Au ∈ U for every vector u ∈ U . The simplest invariant subspaces are the one-dimensional ones. Exercise 4.6. Show that if A ∈ C n×n , then a nonzero vector u ∈ C n is an eigenvector of A if and only if the one-dimensional subspace {αu : α ∈ C} is invariant under A.

40

4. Eigenvalues and eigenvectors

Example 4.1. If A ∈ Cn×n , then N(A−λIn ) = {u ∈ C n×n : Au = λu} and R(A−λIn ) = {(A − λIn )u : u ∈ C n } are both invariant under A: (A − λIn )u = 0 =⇒ (A − λIn )Au = A(A − λIn )u = 0 and u = (A − λIn )v =⇒ Au = A(A − λIn )v = (A − λIn )Av . Theorem 4.3. Let A ∈ C n×n and let U be a subspace of C n that is invariant under A, i.e., u ∈ U =⇒ Au ∈ U . Then: (1) Either U = {0} or there exist a nonzero vector u ∈ U and a number λ ∈ C such that Au = λu. (2) If u1 , . . . , uk ∈ U are eigenvectors of A corresponding to distinct eigenvalues λ1 , . . . , λk , then k ≤ dim U . Proof. If u is a nonzero vector in U and dim U = m, then, since the m + 1 vectors u, Au, . . . , Am u all belong to U , they must be linearly dependent. Therefore, the same argument that was used to justify item (1) of Theorem 4.1 serves to justify item (1) of this theorem too. Item (2) then follows from  the fact that k = dim{span{u1 , . . . , uk }} ≤ dim U .

4.4. Jordan cells Not all matrices are diagonalizable: The criterion γ1 +· · ·+γk = n established in Theorem 4.2 may not be satisfied. Thus, for example, if ⎡ ⎤ ⎡ ⎤ 2−λ 1 0 2 1 0 2−λ 1 ⎦ A = ⎣0 2 1⎦ , then the nullspace of A−λI3 = ⎣ 0 0 0 2−λ 0 0 2 will be nonzero if and only if λ = 2. In this case dim N(A−2I3 ) = 1 and dim N(A−λI3 ) = 0 if λ = 2. More elaborate examples may be constructed by taking larger matrices of the same form or by putting such blocks together. The matrix A in the last display is of the form (4.7)

Cμ(p) = μIp +

p−1 

ej eTj+1

 where Ip = e1 · · ·

 ep ,

j=1

with μ = 2 and p = 3. Matrices of the form (4.7) are called Jordan cells. (p)

Exercise 4.7. Let A = Cμ be a Jordan cell of size p × p with μ on the diagonal. Show that N(A−λIp ) = {0} if and only if λ = μ and that dim N(A−μIp )k = k for k = 1, . . . , p.

4.5. Linear transformations

41

Nevertheless, the news is not all bad. There is a more general factorization formula than (4.5) in which the matrix D is replaced by a block diagonal matrix J = diag{Bλ1 , . . . , Bλk }, where Bλj is an αj × αj matrix that is block diagonal with γj Jordan cells as blocks. Thus, it is an αj × αj upper triangular matrix with λj on the diagonal and the columns of U are generalized eigenvectors of A. This representation will be developed in the next two chapters.

4.5. Linear transformations In this section we shall reformulate some of the main conclusions that were obtained for matrices in the language of linear transformations. Let T be a linear transformation (mapping) from a vector space V into itself. Then a subspace M of V is said to be invariant under T if T v ∈ M whenever v ∈ M. A number λ ∈ C is an eigenvalue of T if there exists a nonzero vector v ∈ V such that T v = λv. Such a vector v is said to be an eigenvector of T . Thus, every nonzero vector in N(T −λIV ) is an eigenvector of T . Let γj = dim N(T −λj IV ) . The next two theorems paraphrase the main conclusions that have already been obtained for matrices. The proofs, which are much the same as for their matrix counterparts, are left to the reader. Theorem 4.4. If T is a linear transformation from an n-dimensional complex vector space V into itself, then: (1) T has at least one eigenvalue λ ∈ C. (2) Eigenvectors v1 , . . . , vk of T that correspond to distinct eigenvalues λ1 , . . . , λk are linearly independent. Moreover, if T has exactly k distinct eigenvalues λ1 , . . . , λk with geometric multiplicities γj = dim N(T −λj IV ) , then: (3) k ≤ γ1 + · · · + γk ≤ n. (4) The space V admits a basis of eigenvectors of T if and only if γ1 + · · · + γk = n. Theorem 4.5. Let T be a linear transformation from a complex vector space V into itself and let M be a finite-dimensional subspace of V that is invariant under T . Then either M = {0} or there exist a nonzero vector w ∈ M and a number λ ∈ C such that T w = λw .

42

4. Eigenvalues and eigenvectors

Exercise 4.8. Show that if T is a linear transformation from a complex vector space V into itself, then the vector spaces N(T −λI) and R(T −λI) are both invariant under T for each choice of λ ∈ C. Exercise 4.9. The set V of polynomials p(t) with complex coefficients is a complex vector space with respect to the natural rules of vector addition and scalar multiplication. Let T p = p (t) + tp (t) and Sp = p (t) + t2 p (t). Show that the subspace Uk of V of polynomials p(t) = c0 + c1 t + · · · + ck tk of degree less than or equal to k is invariant under T , but not under S. Find a nonzero polynomial p ∈ U3 and a number λ ∈ C such that T p = λp. Exercise 4.10. Let T be a linear transformation from a real vector space V into itself and let U be a two-dimensional subspace of V with basis {u1 , u2 }. Show that if T u1 = u2 and T u2 = −u1 , then T 2 u + u = 0 for every vector u ∈ U , but that there are no one-dimensional subspaces of U that are invariant under T . Why? [HINT: One-dimensional subspace of U are of the form {α(c1 u1 + c2 u2 ) : α ∈ R} for some choice of c1 , c2 ∈ R with |c1 | + |c2 | > 0.]

4.6. Supplementary notes This chapter is adapted from Chapter 4 of [30]. As noted there, the proof of Theorem 4.1 (which avoids the use of determinants) was influenced by a conversation with Sheldon Axler at the Holomorphic Functions Session at MSRI, Berkeley, in 1995 and his paper [7].

Chapter 5

Towards the Jordan decomposition

The main result in this chapter is Theorem 5.4. To understand what it says, we need to understand direct sums of subspaces.

5.1. Direct sums Let U and V be subspaces of a vector space Y and let U + V = {u + v : u ∈ U and v ∈ V} . Clearly, the sum U + V is a subspace of Y with respect to the rules of vector addition and scalar multiplication that are inherited from the vector space Y, since U + V is closed under vector addition and scalar multiplication. If U and V are finite dimensional, then the sum U + V is said to be a direct sum if dim U + dim V = dim(U + V) . ˙ i.e., U +V ˙ rather than U + V. Direct sums are denoted by the symbol +, Analogously, if the Uj , j = 1, . . . , k, are subspaces of a vector space Y, then the sum (5.1)

U1 + · · · + Uk = {u1 + · · · + uk : ui ∈ Ui

for

i = 1, . . . , k}

is a subspace of Y with respect to the rules of vector addition and scalar multiplication that are inherited from the vector space Y. If these k subspaces are finite dimensional, the sum U = U1 + · · · + Uk is said to be a direct sum if (5.2)

dim U1 + · · · + dim Uk = dim{U1 + · · · + Uk } . 43

44

5. Towards the Jordan decomposition

If the sum U is direct, then we write ˙ · · · +U ˙ k. U = U1 + Example 5.1. If {u1 , . . . , un } is a basis for U and 1 < k <  < n, then ˙ span{uk , . . . , u−1 }+ ˙ span{u , . . . , un } . U = span{u1 , . . . , uk−1 }+ Theorem 5.1. If U and V are finite-dimensional subspaces of a vector space Y, then dim(U + V) = dim U + dim V − dim(U ∩ V)

(5.3) and hence

the sum U + V is direct

(5.4)

⇐⇒

U ∩ V = {0} .

Proof. If U ∩ V = U or U ∩ V = V, then it is easily seen that formula (5.3) holds, since U + V = V in the first case and U + V = U in the second. Suppose next that U ∩ V is a nonzero proper subspace of U and of V and that {w1 , . . . , wk } is a basis for U ∩ V. Then there exists a family of vectors {u1 , . . . , ur } and a family of vectors {v1 , . . . , vs } such that {w1 , . . . , wk , u1 , . . . , ur } is a basis for U and {w1 , . . . , wk , v1 , . . . , vs } is a basis for V. It is clear that span{w1 , . . . , wk , u1 , . . . , ur , v1 , . . . , vs } = U + V. Moreover, if a1 , . . . , ak , b1 , . . . , br , c1 , . . . , cs are scalars, then k  i=1

ai wi +

r 

bj uj +

j=1

s 

c v = 0 =⇒

k  i=1

=1

ai wi +

r 

bj uj = −

j=1

s 

c v

=1

s

and hence c1 = · · · = cs = 0 , since the last equality implies that =1 cs v belongs to U ∩ V = span{w1 , . . . , wk }. Thus, also a1 = · · · = ak = b1 = · · · = bs = 0, since {w1 , . . . , wk , u1 , . . . , ur } is a basis for U . Consequently, the full family of k + r + s vectors is linearly independent and so is a basis for U + V. Therefore, dim(U + V) = k + r + s = (k + r) + (k + s) − k , which serves to verify (5.3) when U ∩ V is a nonzero proper subspace of both U and V. The verification of (5.3) when U ∩ V = {0} is similar, but easier: If {u1 , . . . , us } is a basis for U and {v1 , . . . , vt } is a basis for V, then clearly span{u1 , . . . , us , v1 , . . . , vt } = U + V and this set of s + t vectors is linearly independent since s  i=1

bi ui +

t  j=1

cj vj = 0 =⇒

s  i=1

bi ui = −

t  j=1

cj vj

5.1. Direct sums

45

and hence both sides of the last equality belong to U ∩ V = {0}. Therefore, (5.3) holds in this case also. Finally, the characterization (5.4) is immediate from the definition of a direct sum and formula (5.3).  We remark that the characterization (5.4) has the advantage of being applicable to infinite-dimensional spaces. However, it does not have simple analogues for sums of three or more subspaces; see, e.g., Example 5.2. Example 5.2. If ⎧⎡ ⎤⎫ ⎨ 1 ⎬ U = span ⎣1⎦ , ⎩ ⎭ 0

⎧⎡ ⎤⎫ ⎨ 0 ⎬ V = span ⎣−1⎦ , ⎩ ⎭ 1

and

⎧⎡ ⎤⎫ ⎨ 1 ⎬ W = span ⎣0⎦ , ⎩ ⎭ 1

then U ∩ V = {0}, U ∩ W = {0}, and V ∩ W = {0} but the sum U + V + W is not direct.  Exercise 5.1. Let Y be a finite-dimensional vector space. Show that if ˙ ˙ +W. ˙ ˙ and V = X +W, then Y = U +X Y = U +V If the vector space Y admits the direct sum decomposition ˙ , Y = U +V then the subspace V is said to be a complementary space to the subspace U and the subspace U is said to be a complementary space to the subspace V. Exercise 5.2. Let T be a linear transformation from a real vector space V into itself and let U be a two-dimensional subspace of V with basis {u1 , u2 }. Show that if T u1 = u1 + 2u2 and T u2 = 2u1 + u2 , then U is the direct sum of two one-dimensional spaces that are each invariant under T . ˙ = Lemma 5.2. Let U and V be subspaces of a vector space Y such that U +V Y. Then every vector y ∈ Y can be expressed as a sum of the form y = u+v for exactly one pair of vectors u ∈ U and v ∈ V. Proof. Suppose that y = u1 +v1 = u2 +v2 with u1 , u2 ∈ U and v1 , v2 ∈ V. Then the vector u1 − u2 = v2 − v1 belongs to U ∩ V and is therefore equal  to 0. Thus, u1 = u2 and v1 = v2 . Lemma 5.3. Let U , V, and W be subspaces of a vector space Y such that ˙ =Y U +V

and

U ⊆W.

Then (5.5)

˙ W = U +(W ∩ V) .

46

5. Towards the Jordan decomposition

Proof. Clearly, U + (W ∩ V) = (W ∩ U ) + (W ∩ V) ⊆ W + W = W . To establish the opposite inclusion, let w ∈ W. Then, since W ⊆ Y and ˙ Y = U +V, w = u + v for exactly one pair of vectors u ∈ U and v ∈ V. Moreover, under the added assumption that U ⊆ W, it follows that both u and v = w − u belong to W. Therefore, u ∈ W ∩ U and v ∈ W ∩ V, and hence W ⊆ (W ∩ U ) + (W ∩ V) .   Example 5.3. If I2 = e1 e2 , U = span{e1 }, V = span{e2 }, and W = ˙ but (W ∩ U ) + (W ∩ V) = {0} = W. What span{e1 + e2 }, then C 2 = U +V, went wrong?  Therefore, (5.5) holds.



The next theorem is the first step towards the verification of the general formula A = U JU −1 alluded to earlier. Theorem 5.4. Let A ∈ C n×n and suppose that A has exactly k distinct eigenvalues, λ1 , . . . , λk ∈ C. Then (5.6)

˙ ···+ ˙ N(A−λk In )n . C n = N(A−λ1 In )n +

The proof of this theorem will be furnished in Section 5.3. At this point we shall focus on its implications. Corollary 5.5. If A ∈ C n×n has exactly k distinct eigenvalues, λ1 , . . . , λk ∈ C, and the matrices Uj ∈ C n×αj , j = 1, . . . , k, are constructed so that the columns of Uj are a basis for N(A−λj In )n , then:   (1) The n × n matrix U = U1 · · · Uk is invertible. (2) There exists a set of matrices Gj ∈ C αj ×αj such that AUj = Uj Gj for j = 1, . . . , k. (3) A = U GU −1 with G = diag{G1 , . . . , Gk }. Proof. It suffices to verify (2), since (1) follows easily from Theorem 5.4 and (3) follows easily from (2). Towards this end, let μ = λj and m = αj for any choice of the  of j in the set {1, . . . , k} and suppose that the columns n matrix V = v1 , . . . , vm are a basis for N(A−μIn )n . Then (A − μIn ) vi = 0 for i = 1, . . . , m. Therefore, (A − μIn )n Avi = A[(A − μIn )n vi ] = A0 = 0, i.e., vi ∈ N(A−μIn )n =⇒ Avi ∈ N(A−μIn )n . But this means that Avi is a linear combination of the columns of V , i.e., Avi = V gi for some vector

5.1. Direct sums

47

gi ∈ C m . Thus,  AV = A v1 · · ·

    vm = Av1 · · · Avm = V g1 · · ·   = V G with G = g1 · · · gm .

V gm



This serves to justify the formula AUj = Uj Gj for j = 1, . . . , k. Therefore (2) holds.  Example 5.4. Let A ∈ C 9×9 and suppose that A has exactly three distinct eigenvalues λ1 , λ2 , and λ3 with algebraic multiplicities α1 = 4, α2 = 2, and α3 = 3, respectively. Let {v1 , v2 , v3 , v4 } be any basis for N(A−λ1 I9 )9 , let {w1 , w2 } be any basis for N(A−λ2 I9 )9 , and let {x1 , x2 , x3 } be any basis for N(A−λ3 I9 )9 . Then, since each of the spaces N(A−λj I9 )9 , j = 1, 2, 3, is invariant under multiplication by the matrix A, A[v1 v2 v3 v4 ] = [v1 v2 v3 v4 ]G1 , A[w1 w2 ] = [w1 w2 ]G2 , A[x1 x2 x3 ] = [x1 x2 x3 ]G3 for some choice of G1 ∈ C 4×4 , G2 ∈ C 2×2 , and G3 ∈ C 3×3 . In other notation, upon setting V = [v1 v2 v3 v4 ] , W = [w1 w2 ] ,

and

X = [x1 x2 x3 ] ,

we can write the preceding three sets of equations together as ⎡ ⎤ G1 O O A[V W X] = [V W X] ⎣ O G2 O ⎦ O O G3 or, equivalently, upon setting U = [V W ⎡ G1 O ⎣ O G2 (5.7) A=U O O

X], as ⎤ O O ⎦ U −1 , G3

since the matrix U = [V W X] is invertible, thanks to Theorem 5.4.



Formula (5.7) is the best that can be achieved in Example 5.4 if we only know α1 , α2 , α3 . Our ultimate objective is to obtain a factorization in which each of the blocks G1 , G2 , G3 is itself a block diagonal matrix of Jordan cells. To achieve this, the corresponding bases {v1 , v2 , v3 , v4 }, {w1 w2 }, and {x1 x2 x3 } must be chosen appropriately. This will be taken up in the next chapter.

48

5. Towards the Jordan decomposition

5.2. The null spaces of powers of B The next few lemmas focus on the null spaces NB j for j = 0, 1, . . . when B ∈ C n×n . They will play a role in the proof of Theorem 5.11. However, to verify the first assertion of that theorem, only (1) and (2) of Lemma 5.6 and (1)–(3) of Lemma 5.8 are needed. Lemma 5.6. If B ∈ C n×n , then: (1) NB ⊆ NB 2 ⊆ NB 3 ⊆ · · · . (2) If NB j = NB j+1 for some integer j ≥ 1, then NB j+1 = NB j+2 . (3) If j ≥ 1 is an integer, then NB j = {0} ⇐⇒ NB j+1 = {0}. Proof. The proof is left to the reader.



Exercise 5.3. Verify the assertions in Lemma 5.6. Lemma 5.7. If B ∈ C n×n and u ∈ C n belongs to NB k for some positive integer k, then the vectors u, Bu, . . . , B k−1 u are linearly independent ⇐⇒ B k−1 u = 0 . Proof. Suppose first that B k−1 u = 0 and that α0 u + α1 Bu + · · · + αk−1 B k−1 u = 0 for some choice of coefficients α0 , . . . , αk−1 ∈ C. Then, since u ∈ NB k , B k−1 (α0 u + α1 Bu + · · · + αk−1 B k−1 u) = α0 B k−1 u = 0 , which clearly implies that α0 = 0. Similarly, the identity B k−2 (α1 Bu + · · · + αk−1 B k−1 u) = α1 B k−1 u = 0 implies that α1 = 0. Continuing in this vein it is readily seen that B k−1 u = 0 =⇒ the vectors u, Bu, . . . , B k−1 u are linearly independent. Thus, as the converse is self-evident, the proof is complete.  Lemma 5.8. If B ∈ C n×n , then: (1) NB k ⊆ NB n and RB k ⊇ RB n for every positive integer k. ˙ B n. (2) C n = NB n +R (3) N(B+λIn ) n ⊆ RB n for every point λ ∈ C \ {0}. (4) RB ∩ NB = {0} =⇒ RB ∩ NB k = {0} for k = 1, 2, . . .. ˙ B ⇐⇒ NB = NB n . (5) C n = NB +R Proof. If NB k = {0}, then the first assertion in (1) is clear. If B k u = 0 for some nonzero vector u ∈ C n and some positive integer k, let j be the smallest positive integer such that B j−1 u = 0 and B j u = 0. Then, in view of Lemma 5.7, the vectors u, Bu, . . . , B j−1 u are linearly independent.

5.2. The null spaces of powers of B

49

Therefore, j ≤ n, and hence B n u = B n−j (B j u) = B n−j 0 = 0, i.e., B k u = 0 =⇒ B n u = 0. This justifies the first assertion in (1); the second is left to the reader. To verify (2), suppose first that u ∈ NB n ∩ RB n . Then B n u = 0 and u = B n v for some vector v ∈ C n . Therefore, 0 = B n u = B 2n v. But, then (1) ensures that u = B n v = 0. Thus, the sum is direct. Therefore, as {RB n + NB n } ⊆ C n and dim{RB n + NB n } = dim RB n + dim NB n = n by the principle of conservation of dimension, (2) holds. Suppose next that (B + λIn )n u = 0 for some nonzero vector u ∈ C n and some nonzero number λ ∈ C. Then, since λIn commutes with B, we can invoke the binomial theorem for matrices to obtain n   n     n n−j j n n−j j n 0= λ B u=λ u+ λ B u. j j j=0

j=1

Therefore, (5.8)

u = Bp(B)u ,

where

p(B) = −

n    n j=1

j

λ−j B j−1

is a polynomial in B. Thus, as Bp(B) = p(B)B, u = Bp(B)u = (Bp(B))2 u = · · · = (Bp(B))n u = B n p(B)n u ∈ RB n , as claimed in (3). We now turn to (4): If RB ∩ NB = {0} and x ∈ RB ∩ NB k for some positive integer k ≥ 2, then B k−1 x ∈ RB ∩NB = {0} and hence RB ∩NB k ⊆ RB ∩ NB k−1 ⊆ · · · ⊆ RB ∩ NB . ˙ B , then NB ∩ RB = {0} and hence, in view of Finally, if C n = NB +R Lemma 5.3 and (4), ˙ B ∩ NB n ) = NB +(R ˙ B ∩ NB n ) NB n = (NB ∩ NB n )+(R ˙ B ∩ NB ) = NB . = NB +(R ˙ B . This is left to It remains to show that if NB n = NB , then C n = NB +R the reader as an exercise.  Exercise 5.4. Show that if B ∈ C n×n , then N(B+λIn ) k ⊆ RB k for every point λ ∈ C \ {0} and every positive integer k.

50

5. Towards the Jordan decomposition

˙ B , to complete Exercise 5.5. Show that if NB n = NB , then C n = NB +R the proof of (5) in Lemma 5.8. Remark 5.9. The last lemma may be exploited to give a quick proof of the fact that generalized eigenvectors corresponding to distinct eigenvalues are automatically linearly independent. To verify this, let (A − λj In )n uj = 0, j = 1, . . . , k, for some set of distinct eigenvalues λ1 , . . . , λk and suppose that c1 u1 + · · · + ck uk = 0 . Then −c1 u1 = c2 u2 + · · · + ck uk and, since −c1 u1 ∈ N(A−λ1 In )n and, by (3) of Lemma 5.8, c2 u2 +· · ·+ck uk ∈ R(A−λ1 In )n , both sides of the last displayed equality must equal zero, thanks to (2) of Lemma 5.8. Therefore, c1 = 0 and c2 u2 + · · · + ck uk = 0. To complete the verification, just keep on going. Exercise 5.6. Show that if B ∈ C n×n , λ ∈ C, and λ = 0, then NB j ⊆ R(B+λIn ) k

(5.9)

for every choice of j, k ∈ {1, . . . , n}. [HINT: The proof of (3) in Lemma 5.8.]

5.3. Verification of Theorem 5.4 Lemma 5.8 guarantees that (5.10)

˙ (A−λIn )n C n = N(A−λIn )n +R

for every point λ ∈ C. The next step is to obtain an analogous direct sum decomposition for R(A−λIn )n . Lemma 5.10. If A ∈ C n×n , λ1 , λ2 ∈ C, and λ1 = λ2 , then (5.11)

˙ (A−λ1 In )n ∩ R(A−λ2 In )n ) . R(A−λ1 In )n = N(A−λ2 In )n +(R

˙ (A−λ2 In )n and N(A−λ2 In )n ⊆ R(A−λ1 In )n Proof. Since C n = N(A−λ2 In )n +R when λ1 = λ2 by Lemma 5.8, the assertion follows from Lemma 5.3 (with  U = N(A−λ2 In )n , V = R(A−λ2 In )n , and W = R(A−λ1 In )n ). The first assertion in the next theorem is used in the proof of Theorem 5.4; the second assertion, which serves to characterize diagonalizable matrices, is included for added perspective.

5.3. Verification of Theorem 5.4

51

Theorem 5.11. If A ∈ C n×n has k distinct eigenvalues λ1 , . . . , λk ∈ C with geometric multiplicities γ1 , . . . , γk and algebraic multiplicities α1 , . . . , αk , respectively, then (5.12)

R(A−λ1 In )n ∩ R(A−λ2 In )n ∩ · · · ∩ R(A−λk In )n = {0} .

However, (5.13)

R(A−λ1 In ) ∩ R(A−λ2 In ) ∩ · · · ∩ R(A−λk In ) = {0} ⇐⇒ γj = αj f or j = 1, . . . , k .

Proof. Let M denote the intersection of the k subspaces on the left-hand side of the asserted identity (5.12). Then it is readily checked that M is invariant under A; i.e., if u ∈ M, then Au ∈ M, because each of the subspaces R(A−λj In )n is invariant under A: if u ∈ R(A−λj In )n , then u = (A − λj In )n vj for some vector vj ∈ C n for each choice of j = 1, . . . , n and hence Au = A(A − λj In )n vj = (A − λj In )n Avj ∈ R(A−λj In )n

for j = 1, . . . , k .

Thus, if M = {0}, then, by Theorem 4.3, there exist a number λ ∈ C and a nonzero vector v ∈ M such that Av − λv = 0. But this means that λ is equal to one of the eigenvalues of A, say λt , i.e., v ∈ N(A−λt In ) . But this in turn implies that v ∈ N(A−λt In )n ∩ R(A−λt In )n = {0} . Therefore, M = {0}. This completes the proof of (5.12). To verify the implication =⇒ in (5.13), observe first that, in view of Lemma 5.8 and the inclusions NB ⊆ NB n and RB n ⊆ RB : (a) NA−λ1 In ) ⊆ R(A−λ2 In ) ∩ · · · ∩ R(A−λk In ) . (b) R(A−λ1 In ) ∩ N(A−λ1 In ) ⊆ R(A−λ1 In ) ∩ · · · ∩ R(A−λk In ) . (c) R(A−λ1 In ) ∩ · · · ∩ R(A−λk In ) = {0} =⇒ R(A−λ1 In ) ∩ N(A−λ1 In ) = {0}. ˙ (A−λ1 In ) = C n and hence, by (5) of Lemma 5.8, Therefore, R(A−λ1 In ) +N α1 = γ1 , and, by analogous arguments, αj = γj for j = 2, . . . , k also. Conversely, if αj = γj for j = 1, . . . , k, then N(A−λ1 In ) = N(A−λ1 In )n . Therefore, R(A−λ1 In ) = R(A−λ1 In )n and hence the implication ⇐= in (5.13) follows from (5.12).  We are now ready to prove Theorem 5.4.

52

5. Towards the Jordan decomposition

Proof of Theorem 5.4. Let us suppose that k ≥ 3. Then, by Lemmas 5.8 and 5.10, ˙ (A−λ1 In )n C n = N(A−λ1 In )n +R and ˙ (A−λ1 In )n ∩ R(A−λ2 In )n . R(A−λ1 In )n = N(A−λ2 In )n +R Therefore, (5.14)

˙ (A−λ2 In )n +R ˙ (A−λ1 In )n ∩ R(A−λ2 In )n . C n = N(A−λ1 In )n +N

˙ (A−λ3 In )n and Moreover, since C n = N(A−λ3 In )n +R N(A−λ3 In )n ⊆ R(A−λ1 In )n ∩ R(A−λ2 In )n by Lemma 5.8, the supplementary formula R(A−λ1 In )n ∩ R(A−λ2 In )n ˙ (A−λ1 In )n ∩ R(A−λ2 In )n ∩ R(A−λ3 In )n = N(A−λ3 In )n +R follows from Lemma 5.3 with U = N(A−λ3 In )n , V = R(A−λ3 In )n , and W = R(A−λ1 In )n ∩ R(A−λ2 In )n . When substituted into formula (5.14), this yields ˙ · · · +N ˙ (A−λj In )n +R ˙ (A−λ1 In )n ∩ · · · ∩ R(A−λj In )n (5.15) C n = N(A−λ1 In )n + for j = 3. To complete the proof, just keep on going until you run out of eigenvalues (i.e., until j = k in (5.15)) and then invoke (5.12). 

5.4. Supplementary notes This chapter is mostly adapted from Chapter 4 in [30]. However, (5.13) does not appear there.

Chapter 6

The Jordan decomposition

The decomposition (5.6) ensures that if A ∈ C n×n has k distinct eigenvalues λ1 , . . . , λk with algebraic multiplicities α1 , . . . , αk , then there exist an invertible matrix U ∈ C n×n and a block diagonal matrix G = diag{G1 , . . . , Gk } with blocks Gj of size αj ×αj , j = 1, . . . , k, such that A = U GU −1 . Our next objective is to establish the Jordan decomposition theorem, which is essentially an algorithm for choosing the basis for each of the spaces N(A−λj In )n to obtain a nice form for Gj .

6.1. The Jordan decomposition Theorem 6.1. If A ∈ C n×n has k distinct eigenvalues λ1 , . . . , λk with geometric multiplicities γ1 , . . . , γk and algebraic multiplicities α1 , . . . , αk , respectively, then there exists an invertible matrix U ∈ C n×n such that AU = U J, where: (1) J = diag {Bλ1 , . . . , Bλk }. (2) Bλj is an αj ×αj block diagonal matrix that is built out of γj Jordan (·)

cells Cλj of the form (4.7). (m)

(3) The number of Jordan cells Cλj of size m×m in Bλj is equal to the number of columns of height m in the array of αj symbols × that is constructed by placing κi = dim N(A−λj In )i − dim N(A−λj In )i−1 53

54

6. The Jordan decomposition

symbols in row i for i = 1, 2, . . . (Lemma 6.4 guarantees that κ1 ≥ κ2 ≥ · · · ) : × × × × × ··· .. .

(6.1)

× ··· ×

× ··· ×

×

κ1 symbols κ2 symbols κ3 symbols .. . .

(4) The columns of U are generalized eigenvectors of the matrix A. (5) (A − λ1 In )α1 · · · (A − λk In )αk = O. (6) If νj = min{i : dim N(A−λj In )i = dim N(A−λj In )n }, then νj ≤ αj for j = 1, . . . , k and (A − λ1 In )ν1 · · · (A − λk In )νk = O. Discussion. Corollary 5.5 guarantees that there exist an invertible matrix U ∈ C n×n and a block diagonal matrix G = diag{G1 , . . . , Gk } with Gj ∈ C αj ×αj for j = 1, . . . , k such that AU = U G. This result was obtained by choosing Uj ∈ C n×αj so that the columns of Uj are a basis for N(A−λj In )n   and then setting U = U1 · · · Uk . It remains to show that it is possible to choose the Uj , i.e., the basis of N(A−λj In )n , in such a way that Gj will be a block diagonal matrix made up of γj Jordan cells. This will be done in two steps. The first step rests on Lemma 5.7, which guarantees that if u ∈ NB m and = 0, then the vectors u, Bu, . . . , B m−1 u are linearly independent and hence that   rank B m−1 u · · · u = m.

B m−1 u

The reason for stacking the vectors in this order is that then      B B m−1 u · · · u = B m u · · · Bu = 0 B m−1 u · · ·   (m) = B m−1 u · · · u C0 , (m)

where C0 is the m × m Jordan cell B = A − μIn , then    A B m−1 u · · · u = μ B m−1 u · · ·  = B m−1 u · · ·  = B m−1 u · · · which is of the form  (6.2) A x1 · · ·

 Bu

with 0 on the diagonal. Thus, if    (m) u + B m−1 u · · · u C0    (m) u μIm + B m−1 u · · · u C0  u Cμ(m) ,

  xm = x1 · · ·

 xm Cμ(m) .

6.1. The Jordan decomposition

55

Thus, x1 = B m−1 u = (A − μIn )m−1 u is an eigenvector of A and u is a nonzero vector in N(A−μIn )m and xj = B m−j u is a generalized eigenvector of A of order j for j = 2, . . . , m . The second step is to show that it is possible to choose γj such chains of vectors in N(A−λj In )αj that are linearly independent and span the space. We shall present an algorithm that achieves this in Section 6.5. These two steps and the algorithm referred to above serve to justify the first four assertions; (5) and (6) then drop out easily from the formulas A = U JU −1 and A − μIn = U (J − μIn )U −1 upon taking the structure of J into account; see Exercise 6.1 for the underlying idea.  A set of vectors {x1 , . . . , xm } with x1 = 0 for which (6.2) holds for some μ ∈ C is called a Jordan chain. Remark 6.2. Item (5) is the Cayley-Hamilton theorem: (6.3)

(λ − λ1 )

α1

· · · (λ − λk )

αk

n

=λ +

n−1 

cj λ =⇒ A = − j

n

j=0

n−1 

cj Aj .

j=0

In view of (6), the polynomial p(λ) = (λ − λ1 )ν1 · · · (λ − λk )νk is referred to as the minimal polynomial for A. Moreover, the number νj is the size of the largest Jordan cell in Bλj . Exercise 6.1. Show that if A = U JU −1 and J = diag{Cλ1 , Cλ2 } with λ1 = λ2 , then (A − λ1 I7 )4 (A − λ2 I7 )3 = O. [HINT: (J − λ1 I7 )4 (J − λ2 I7 )3 = O.] (4)

(3)

Exercise 6.2. Show that if {x1 , . . . , xm } is a set of vectors in C n such that (6.2) holds for some μ ∈ C and x1 = 0, then: (1) Ax1 = μx1 and Axj = μxj + xj−1 for j = 2, . . . , m if m > 1. (2) xj = B m−j xm for j = 1, . . . , m with B = A − μIn . Exercise 6.3. Show that if A = U Cα U −1 , then dim N(A−λIn ) = 1 if λ = α and 0 otherwise. (n)

Exercise 6.4. Calculate dim N(A−λIp )t for every λ ∈ C and t = 1, 2, . . . for the 17 × 17 matrix A = U JU −1 when J = diag {Bλ1 , Bλ2 , Bλ3 }, the points (4) (3) (3) λ1 , λ2 , λ3 are distinct, Bλ1 = Cλ1 , Bλ2 = diag{Cλ2 , Cλ2 }, and Bλ3 = (4)

(2)

(1)

diag{Cλ3 , Cλ3 , Cλ3 }. Then check that the number of Jordan cells in Bλj of size i × i is equal to the number of columns of height i in the array that is built via the instructions in item (3) of Theorem 6.1.

56

6. The Jordan decomposition

Exercise 6.5. Show that if A ∈ C n×n has exactly k distinct eigenvalues λ1 , . . . , λk in C with algebraic multiplicities α1 , . . . , αk , then ˙ · · · +N ˙ (A−λk In )αk = C n . N(A−λ1 In )α1 + Is it possible to reduce the powers further? Explain your answer. Exercise 6.6. Let u, v ∈ C n and B ∈ C n×n be such that B 4 u = 0, B 3 v = 0, and the pair of vectors B 3 u and B 2 v are linearly independent. Show that the seven vectors u, Bu, B 2 u, B 3 u, v, Bv, and B 2 v are linearly independent. (4)

(4)

(4)

Exercise 6.7. Calculate (Cμ )100 . [HINT: Cμ = μI4 + C0 .]

6.2. Overview The calculation of the matrices U and J in the representation A = U JU −1 presented in Theorem 6.1 can be conveniently divided into three parts: (1) Obtain the distinct eigenvalues λ1 , . . . , λk of the matrix A and their algebraic multiplicities α1 , . . . , αk . These are the points λ ∈ C at which N(λIn −A) = {0} and αj = dim N(λj In −A)n . As will be explained in the next chapter, the eigenvalues may also be characterized as the distinct roots λ1 , . . . , λk of the characteristic polynomial p(λ) = det (λIn − A) = a0 + a1 λ + · · · + an−1 λn−1 + λn that are obtained by writing it in factored form as p(λ) = (λ − λ1 )α1 · · · (λ − λk )αk , which also displays the algebraic multiplicities. (2) Compute J = diag{Bλ1 . . . , Bλk } by calculating dim N(A−λj In )i

for i = 1, . . . , αj − 1

for each of the distinct eigenvalues λ1 , . . . , λk , in order to obtain the sizes of the Jordan cells in Bλj from the algorithm in (3) of Theorem 6.1. The dimensions of these null spaces specify J up to order. (3) Construct a basis of N(A−λj In )αj made up of γj Jordan chains, one Jordan chain for each Jordan cell in Bλj , in order to obtain blocks   Uj such that AUj = Uj Bλj , j = 1, . . . , k. Then U = U1 · · · Uk . Remark 6.3. The information in (1) is enough to guarantee a factorization of A of the form A = U GU −1 , where G = diag{G1 , . . . , Gk } for some choice of Gj ∈ C αj ×αj , j = 1, . . . , k. The information in (2) serves to determine the number of Jordan cells in J and their sizes, but not the order in which

6.3. Dimension of nullspaces of powers of B

57

they appear, which can be chosen freely. Finally, the computations in (3) are to choose the columns of U so that each of the blocks Gj , j = 1, . . . , k, can be expressed as a block diagonal matrix with γj Jordan cells as diagonal blocks; the sizes of these cells are determined by (2).

6.3. Dimension of nullspaces of powers of B Lemma 6.4. If B ∈ C n×n , then (6.4)

dim NB j+1 − dim NB j ≤ dim NB j − dim NB j−1

f or j = 1, 2, . . . .

Proof. Fix an integer j ≥ 1 and assume that ˙ span{v1 , . . . , vs } and NB j = NB j−1 +

˙ span{w1 , . . . , wt } , NB j+1 = NB j +

where the vectors {v1 , . . . , vs } and {w1 , . . . , wt } are linearly independent and NB 0 = {0}. The proof amounts to verifying that: (1) span{B j w1 , . . . , B j wt } is a t-dimensional subspace of NB . (2) span{B j−1 v1 , . . . , B j−1 vs } is an s-dimensional subspace of NB . (3) span{B j w1 , . . . , B j wt } ⊆ span{B j−1 v1 , . . . , bj−1 vs }. Clearly B j wi ∈ NB for i = 1, . . . , t, since B(B j wi ) = B j+1 wi = 0. Moreover, the vectors {B j w1 , . . . , B j wt } are linearly independent, because δ1 B j w1 + · · · + δt B j wt = 0 =⇒ B j (δ1 w1 + · · · + δt wt ) = 0 =⇒ δ1 w1 + · · · + δt wt ∈ NB j ∩ span{w1 , . . . , wt } = {0} . Therefore, the coefficients δ1 = · · · = δt = 0, since {w1 , . . . , wt } is a linearly independent set of vectors. This completes the proof of (1). The proof of (2) is similar and is left to the reader. Next, since Bwi ∈ NB j , it follows that Bwi = u + β1 v1 + · · · + βs vs for some choice of u ∈ NB j−1 and β1 , . . . , βs ∈ C. Thus, B j wi = B j−1 u + B j−1 (β1 v1 + · · · + βs vs ) = 0 + β1 B j−1 v1 + · · · + βs B j−1 vs . Therefore, span{B j w1 , . . . , B j wt } ⊆ span{B j−1 v1 , . . . , B j−1 vs } , which justifies (3) and so too that t ≤ s . Consequently, dim NB j+1 − dim NB j = t ≤ s = dim NB j − dim NB j−1 , as claimed.



58

6. The Jordan decomposition

6.4. Computing J To illustrate the construction of J, suppose that A ∈ C n×n has k distinct eigenvalues λ1 , . . . , λk with geometric multiplicities γ1 , . . . , γk and algebraic multiplicities α1 , . . . , αk , respectively. To construct the Jordan blocks associated with λ1 , let B = A − λ1 In for short and suppose for the sake of definiteness that γ1 = 6, α1 = 15, and, to be more concrete, suppose that dim NB = 6 , dim NB 2 = 10 , dim NB 3 = 13 ,

and

dim NB 4 = 15 .

These numbers are chosen to meet the inequalities in (6.4) but are otherwise completely arbitrary. To see what to expect, construct an array of ×’s with 6 in the first row, 10 − 6 = 4 in the second row, 13 − 10 = 3 in the third row, and 15 − 13 = 2 in the fourth row: × × × × × × × × × × × × × × × There will be one Jordan cell for each column; the size of the cell is equal to the height of the column: two cells of size 4, one cell of size 3, one cell of size 2, and two cells of size 1. Exercise 6.8. Find an 11×11 matrix B such that dim NB = 4, dim NB 2 = 7, dim NB 3 = 9, dim NB 4 = 10, and dim NB 5 = 11. [HINT: Use the array.]

6.5. Computing U The Jordan decomposition theorem amounts to showing that there exist γj Jordan chains in N(A−λj In )n such the vectors in these chains form a basis for N(A−λj In )n . To illustrate the main ideas underlying the proof of this theorem, suppose, for example, that B = A − μIn , {u1 , . . . , ur } is a basis for NB , {u1 , . . . , ur ; v1 , . . . , vs } is a basis for NB 2 , {u1 , . . . , ur ; v1 , . . . , vs ; w1 , . . . , , wt } is a basis for NB 3 , and NB 3 = NB 4 . Then NB = span{u1 , . . . , ur }, ˙ span{v1 , . . . , vs }, NB 2 = NB +

and

˙ span{w1 , . . . , wt }. NB 3 = NB 2 + Since wi ∈ NB 3 and B 2 wi = 0, Lemma 5.7 guarantees that {B 2 wi , Bwi , wi } is a set of three linearly independent vectors for i = 1, . . . , t. Analogously,

6.5. Computing U

59

as vj ∈ NB 2 and Bvj = 0, the same lemma guarantees that {Bvj , vj } is a set of two linearly independent vectors for j = 1, . . . , s. Next, consider the following three sets of chains: (1) {B 2 w1 , Bw1 , w1 }, . . . , {B 2 wt , Bwt , wt } (2) {Bv1 , v1 }, . . . , {Bvs , vs } (3) {u1 }, . . . , {ur }

(t chains of length 3),

(s chains of length 2),

(r chains of length 1).

Exercise 6.9. Show that the 3t vectors in (1) are linearly independent and the 2s vectors in (2) are linearly independent. Since the r vectors in (3) are a basis for NB they must also be linearly independent. However, the full collection of 3t + 2s + r vectors cannot be linearly independent, since they all belong to NB 3 and dim NB 3 = r + s + t. The next step is to show that it is possible to select a subset of the exhibited set of 3t + 2s + r chains to form a basis for NB 3 . The proof of Lemma 6.4 ensures that (6.5)

t = dim span{B 2 w1 , . . . , B 2 wt } ,

s = dim span{Bv1 , . . . , Bvs } ,

and (6.6)

span{B 2 w1 , . . . , B 2 wt } ⊆ span{Bv1 , . . . , Bvs } ⊆ span{u1 , . . . , ur }.

The information in (6.5) and (6.6) is super important. It is the key to the whole construction. In view of (6.5) and (6.6), t ≤ s ≤ r and: (a) The vectors B 2 w1 , . . . , B 2 wt , together with an appropriately chosen set of s − t vectors from the set {Bv1 , . . . , Bvs } and an appropriately chosen set of r − s vectors from the set {u1 , . . . , ur } form a basis for NB . (b) The set of 3t+2(s−t)+1(r −s) = t+s+r vectors in the chains corresponding to the vectors selected in (a) are linearly independent. Therefore, they form a good basis for NB 3 = NB n . If μ = λj and the columns of Uj are constructed from the chains selected in (b), then AUj = Uj Bλj in which Bλj ∈ C αj ×αj is itself a block diagonal matrix with r = δj Jordan cells, one for each of the selected chains: (3) (2) (1) t Jordan cells Cλj , s − t Jordan cells Cλj , and r − s Jordan cells Cλj . • Thus, in order to describe the sizes of the Jordan cells in the matrix Bλj corresponding to a good choice of the block Uj , it is only necessary to know the numbers r, s, and t, or, equivalently, the dimensions of the spaces NB , NB 2 , . . ..

60

6. The Jordan decomposition

It is helpful to record this information graphically as a stacked array of r symbols in the first row, s symbols in the second, t symbols in the third, which suffices for the example under consideration: × ··· × ··· × ···

× × × × × × × × × ×

(r entries) (s entries) (t entries) .

The columns are in one-to-one correspondence with the Jordan cells: t columns of height 3; s − t columns of height 2; and r − s columns of height 1.

6.6. Two simple examples Example 6.1. Let ⎡

3 ⎢0 ⎢ A=⎢ ⎢0 ⎣0 0

0 3 0 0 1

0 0 3 0 0

0 1 0 3 0

⎤ 1 0⎥ ⎥ 0⎥ ⎥, 0⎦ 3

B = A − 3I5 ,

and let ej denote the j’th column of I5 . It is then readily checked that ˙ span{e5 }, NB 3 = NB 2 + ˙ span{e2 }, and NB = span{e1 , e3 }, NB 2 = NB + ˙ span{e4 }. The algorithm is to consider the array NB 4 = NB 3 + B 3 e4 B 2 e2 Be5 e1 e3 B 2 e4 Be2 e5 Be4 e2 e4 Since span{B 3 e4 } ⊆ span{B 2 e2 } ⊆ span{Be5 } ⊆ span{e1 , e3 } and equality prevails in the first two inclusions, the second and third columns in the array can be deleted. Furthermore, as B3 e4 = e1 , the fourth column  should also be deleted. Consequently, if U1 = B 3 e4 B 2 e4 Be4 e4 and     U2 = e3 , then U = U1 U2 is invertible and      C3(4) O  A U1 U2 = U1 U2 (1) . O C3 Example 6.2. Let A ∈ C n×n , λ1 ∈ σ(A), B = A − λ1 In and suppose that dim NB = 2 , dim NB 2 = 4 , and dim NB j = 5 for j = 3, . . . , n .

6.6. Two simple examples

61

The given information guarantees the existence of five linearly independent vectors a1 , a2 , b1 , b2 , and c1 such that NB = span{a1 , a2 } , ˙ span{b1 , b2 } , NB 2 = NB + ˙ span{c1 } , NB 3 = NB 2 + i.e., {a1 , a2 } is a basis for NB , {a1 , a2 , b1 , b2 } is a basis for NB 2 , and {a1 , a2 , b1 , b2 , c1 } is a basis for NB 3 . By the list that is justified in the proof of Lemma 6.4, span{B 2 c1 } is a one-dimensional subspace of span{Bb1 , Bb2 } and span{Bb1 , Bb2 } is a two-dimensional subspace of (and so equal to) NB . If B 2 c1 and Bb1 are linearly independent, then the strategy is to build the array B 2 c1 Bb1 Bc1 b1 c1 and then to check that the five vectors in these two columns are linearly independent. To this end, suppose that there exist scalars β1 , β2 , δ1 , δ2 , and δ3 such that β1 b1 + β2 Bb1 + δ1 c1 + δ2 Bc1 + δ3 B 2 c1 = 0 β1 Bb1 + δ1 Bc1 + δ2 B 2 c1 = 0

=⇒

=⇒

δ 1 B 2 c1 = 0 ,

where the implications in the second and third row are obtained by multiplying the preceding row through by B. The third row implies that δ1 = 0, since B 2 c1 = 0; then the second row implies that β1 = δ2 = 0, since the vectors B 2 c1 and Bb1 are presumed to be linearly independent in the case at hand. Then the first row implies that β2 = δ3 = 0 for the same reason. Thus, the set of vectors {B 2 c1 , Bc1 , c1 , Bb1 , b1 } is a basis for NB 3 . Moreover, since B[B 2 c1

Bc1

c1

b1 ] = [B 3 c1

Bb1

B 2 c1

= [0 B 2 c1 2

(3)

(2)

where N = diag C0 , C0

!

= [B c1

Bc1

Bc1

Bc1

B 2 b1

Bb1 ]

0 Bb1 ]

c1

Bb1

b1 ]N ,

, it is now readily seen that the vectors

u1 = B 2 c1 , u2 = Bc1 , u3 = c1 , u4 = Bb1 ,

and

u5 = b1

62

6. The Jordan decomposition

are linearly independent and that (6.7)



(3)

Cλ1 A[u1 · · · u5 ] = [u1 · · · u5 ] O

 O (2) . Cλ1

Similar conclusions prevail with b2 in place of b1 if B 2 c1 and Bb1 are linearly dependent.  p−1 Exercise 6.10. Show that if N = j=1 ej eTj+1 is the p×p matrix with ones on the first super diagonal and zeros elsewhere, then αIp + βN is similar to αIp + N if β = 0.

6.7. Real Jordan forms If A ∈ R n×n , then the Jordan decomposition A = U JU −1 can be reexpressed in terms of real matrices. Thus, for example, if A ∈ R 4×4 and

     ω 1 u u u A 1 2 = u1 2 0 ω for a pair of nonzero vectors u1 , u2 ∈ C 4 and a point ω ∈ C \ R, then ⎤ ⎡ ω 1 0 0    ⎢0 ω 0 0⎥  ⎥ A u1 u2 u1 u2 = u1 u2 u1 u2 ⎢ ⎣0 0 ω 1⎦ . 0 0 0 ω Consequently, upon writing uj = xj + iyj with xj , yj ∈ R 4 for j = 1, 2, and ω = r cos θ + ir sin θ with r > 0 and θ ∈ [0, 2π), it is readily checked that there exists an invertible matrix V ∈ C 4×4 such that     u1 u2 u1 u2 V = x1 y1 x2 y2 and subsequently that   A x1 y1 x2 y2  = x1 y1 x2



⎤ r cos θ r sin θ 1 0  ⎢−r sin θ r cos θ 0 1 ⎥ ⎥. y2 ⎢ ⎣ 0 0 r cos θ r sin θ ⎦ 0 0 −r sin θ r cos θ

Exercise 6.11. Justify the transition from the complex Jordan decomposition to the real Jordan decomposition and en route show that the complex Jordan form is similar to the real Jordan form.

6.8. Supplementary notes This chapter is adapted from Chapters 4 and 6 in [30].

Chapter 7

Determinants

In this chapter we shall develop the theory of determinants axiomatically and shall then briefly survey a number of their properties. Some more advanced topics on determinants will be considered in Chapter 17.

7.1. Determinants Let Σn denote the set of all the n! one-to-one mappings σ of the set of integers {1, . . . , n} onto itself and let ei denote the i’th column of the identity matrix In . Then the formula ⎤ eTσ(1) ⎥ ⎢ = ⎣ ... ⎦ eTσ(n) ⎡

Pσ =

n  i=1

ei eTσ(i)

that was introduced earlier defines a one-to-one correspondence between the set of all n × n permutation matrices Pσ and the set Σn . A permutation matrix Pσ ∈ R n×n with n ≥ 2 is said to be simple if σ interchanges exactly two of the integers in the set {1, . . . , n}, i.e., if and only if it can be expressed as  P = ej eTj + ei1 eTi2 + ei2 eTi1 , j∈Λ

where Λ = {1, . . . , n} \ {i1 , i2 }, i1 , i2 ∈ {1, . . . , n}, and i1 = i2 . Exercise 7.1. Show that: (1) if P ∈ R n×n is a permutation matrix, then P P T = In ; (2) if P is simple, then P = P T . 63

64

7. Determinants

Theorem 7.1. There is exactly one way of assigning a number d(A) ∈ C to each matrix A ∈ C n×n that meets the following three requirements: 1◦ . d(In ) = 1. 2◦ . d(P A) = −d(A) for every simple permutation matrix P . 3◦ . d(A) is a multilinear functional of the rows of A; i.e., it is linear in each row separately. Discussion. The first two of these requirements are easily understood. The third is perhaps best visualized by example. Thus, if ⎤ ⎡→ ⎤ ⎡ a1 a11 a12 a13   ⎢→ ⎥ A = ⎣a21 a22 a23 ⎦ = ⎣ a 2 ⎦ and I3 = e1 e2 e3 , → a31 a32 a33 a3 then →

a 1=

3  i=1



a1i eTi , a 2 =

3 

a2j eTj ,

and

j=1

→ a 3=

3 

a3k eTk .

k=1

3◦ ,

Therefore, by successive applications of rule ⎧ ⎛⎡ → ⎤⎞ ⎛⎡ → ⎤⎞⎫ ⎪ ⎪ e ei 3 3 3 i ⎬ ⎨   ⎜⎢ → ⎥⎟  ⎜⎢ → ⎥⎟ a1i d ⎝⎣ a2 ⎦⎠ = a1i a2j d ⎝⎣ ej ⎦⎠ d(A) = ⎪ ⎪ → → ⎭ ⎩ j=1 i=1 i=1 a3 a3 ⎧ ⎛⎡ → ⎤⎞⎤⎫ ⎡ ⎪ ⎪ ei 3 3 3 ⎬ ⎨   ⎜⎢ → ⎥⎟⎥ ⎢ a1i a2j ⎣ a3k d ⎝⎣ ej ⎦⎠⎦ , = ⎪ ⎪ → ⎭ ⎩ j=1 i=1 k=1 ek which is an explicit formula⎛for in terms of the entries ast in the matrix ⎡ d(A) → ⎤⎞ ei ⎜⎢ → ⎥⎟ A and the 27 numbers d ⎝⎣ ej ⎦⎠. But, the second rule in Theorem → ek 7.1 implies that if two rows of A ∈ C n×n are identical, then d(A) = 0. Consequently only 6 = 3! of these 27 numbers are not equal to 0 and the last expression simplifies to ⎛⎡ T ⎤⎞ eσ(1)  ⎜⎢ ⎥⎟ a1σ(1) a2σ(2) a3σ(3) d ⎝⎣ eTσ(2) ⎦⎠ , d(A) = σ∈Σ3 eTσ(3) where, as noted earlier, Σn denotes the set of all the n! one-to-one mappings of the set {1, . . . , n} onto itself.

7.2. Useful rules for calculating determinants

65

Analogously, if A ∈ C n×n and ej now denotes the j’th column of In , then ⎡ T ⎤ eσ(1)  ⎢ ⎥ a1σ(1) · · · anσ(n) d(Pσ ) with Pσ = ⎣ ... ⎦ . (7.1) d(A) = σ∈Σn eTσ(n) If Pσ is equal to the product of k simple permutations, then d(Pσ ) = (−1)k d(In ) = (−1)k

(= (−1)sign σ in the usual notation) .



The unique number d(A) that is determined by the three conditions in Theorem 7.1 is called the determinant of A and will be denoted det (A)

or

det A or

|A| ,

from now on. It is clear from formula (7.1) that if A ∈ C n×n , then det A is a continuous function of the n2 entries in A. Exercise 7.2. Use the three rules in Theorem 7.1 to show that if A ∈ C 2×2 , then det A = a11 a22 − a12 a21 , and if A ∈ C 3×3 , then det A is equal to a11 a22 a33 − a11 a23 a32 + a12 a23 a31 − a12 a21 a33 + a13 a21 a32 − a13 a22 a31 .

7.2. Useful rules for calculating determinants Theorem 7.2. The determinant of a matrix A ∈ C n×n satisfies the following rules: 4◦ . If two rows of A are identical, then det A = 0. 5◦ . If B is the matrix that is obtained by adding a multiple of one row of A to another row of A, then det B = det A. 6◦ . If A has a row in which all the entries are equal to zero, then det A = 0. 7◦ . If two nonzero rows of A are linearly dependent, then det A = 0. 8◦ . If A ∈ C n×n is either upper triangular or lower triangular, then det A = a11 · · · ann . 9◦ . If A ∈ C n×n , then A is invertible if and only if det A = 0. 10◦ . If A, B ∈ C n×n , then det(AB) = det A × det B = det(BA). 11◦ . If A ∈ C n×n and A is invertible, then det A−1 = (det A)−1 . 12◦ . If A ∈ C n×n , then det A = det AT . 13◦ . If A ∈ C n×n , then rules 3◦ to 7◦ remain valid if the word rows is replaced by the word columns and the row interchange in rule 2◦ is replaced by a column interchange.

66

7. Determinants

Proof. We shall discuss rules 8◦ , 9◦ , 10◦ , and 12◦ and leave the other rules to the reader. To % %a11 % % 0 % % 0

illustrate 8◦ , observe that, in view of rules 3◦ , 5◦ , and 1◦ , % % % % % %a11 a12 a13 % %a11 a12 0% a12 a13 %% % % % % a22 a23 %% = a33 %% 0 a22 a23 %% = a33 %% 0 a22 0%% % 0 % 0 0 1% 0 a33 % 0 1 % % % % % %a11 a12 0% %a11 0 0% % % % % 1 0%% = a33 a22 %% 0 1 0%% = a33 a22 a11 . = a33 a22 %% 0 % 0 % 0 0 1% 0 1%

The computation for triangular matrices in C n×n is much the same. The verification of 9◦ rests on the fact that the two basic steps of Gaussian elimination applied to a matrix A ∈ C n×n , i.e., (1) permuting rows and (2) adding a multiple of one row to another, preserve both the rank of A and (in view of 2◦ and 5◦ ) | det A |. More precisely, there exists a permutation matrix P ∈ C n×n (which is a product of simple permutations) and an upper echelon matrix U ∈ C n×n such that det P A = det U

and

rank A = rank U .

Thus, as U is automatically upper triangular (since it is square in this application), | det A | = | det U | = |u11 · · · unn | . But this serves to justify 9◦ , since A is invertible ⇐⇒ U is invertible and U is invertible ⇐⇒ u11 · · · unn = 0. To verify 10◦ , observe first that if det B = 0, then the asserted identities are immediate from rule 9◦ , since B, AB, and BA are then all noninvertible matrices. If det B = 0, set det(AB) det B and check that ϕ(A) meets rules 1◦ –3◦ . Then ϕ(A) =

ϕ(A) = det A , since there is only one functional that meets these three conditions, i.e., det(AB) = det A × det B ,

7.3. Exploiting block structure

67

as claimed. Now, having this last formula for every choice of A and B, invertible or not, we can interchange the roles of A and B to obtain det(BA) = det B × det A = det A × det B = det(AB) . To verify 12◦ , we first invoke the formula EP A = U that summarizes Gaussian elimination to obtain the equalities det P A = det EP A = det U = det U T = det AT P T E T = det AT P T , since E is lower triangular with ones on the diagonal and U is triangular. The formula in 12◦ emerges by multiplying through by det P , since (det P )2 = 1  and det P T P = det In = 1 for permutation matrices P . det(AB) Exercise 7.3. Show that if det B = 0, then the functional ϕ(A) = det B meets conditions 1◦ –3◦ . [HINT: To verify 3◦ , observe that if a1 , . . . , an designate the rows of A, then the rows of AB are a1 B, . . . , an B.] Exercise 7.4. Calculate Gaussian elimination: ⎡ ⎤ ⎡ 1 3 2 1 1 0 ⎢ 0 4 1 6 ⎥ ⎢ 0 1 ⎢ ⎥ ⎢ ⎣ 0 0 2 1 ⎦ , ⎣ 1 0 1 1 0 4 0 1

the determinants of the following matrices by 1 0 0 1

⎤ ⎡ 0 1 3 ⎥ ⎢ 1 ⎥ ⎢ 0 2 , 1 ⎦ ⎣ 0 0 0 0 0

2 1 3 1

⎤ ⎡ 4 0 0 ⎥ ⎢ 6 ⎥ ⎢ 1 2 , 0 ⎦ ⎣ 0 0 2 0 1

0 3 1 2

⎤ 4 1 ⎥ ⎥ . 1 ⎦ 6

[HINT: If k simple permutations were used to pass from A to U , then det A = (−1)k det U .] Exercise 7.5. Calculate the determinants of the matrices in the previous exercise by rules 1◦ to 13◦ . ⎤ ⎡ 1 α α2 α3 ⎢ α 1 α α2 ⎥ ⎥ Exercise 7.6. Calculate det A when A = ⎢ ⎣α2 α 1 α ⎦. [HINT: Use α3 α2 α 1 Gaussian elimination to find a lower triangular matrix E with ones on the diagonal such that U = EA is an upper echelon matrix.]

7.3. Exploiting block structure The calculation of determinants is often simplified by taking advantage of block structure. Lemma 7.3. If A ∈ C n×n is a block triangular matrix, i.e., if either 



A11 O A11 A12 or A = A= O A22 A21 A22 with square blocks A11 ∈ C p×p and A22 ∈ C q×q , then det A = det A11 det A22 .

68

7. Determinants

Proof. In view of Theorem 6.1, A11 = V1 J1 V1−1 and A22 = V2 J2 V2−1 , where J1 and J2 are in Jordan form. Thus, if A is block upper triangular, then

   −1 A11 A12 V1 O J1 V1−1 A12 V2 V1 O A= . = O V2 O O V2 O A22 J2 Therefore, since J1 and J2 are upper triangular, the middle matrix on the right is also upper triangular and det A = det J1 det J2 = det A11 det A22 , as claimed. The proof for block lower triangular matrices is similar.  ⎡ ⎤ 3 1 4 6 7 8 ⎢0 2 5 1 9 4⎥ ⎢ ⎥ ⎢0 0 1 1 1 1⎥ ⎢ ⎥. [REMARK: Exercise 7.7. Compute det A when A = ⎢ ⎥ ⎢0 0 0 4 0 0⎥ ⎣0 0 0 1 1 0⎦ 0 0 0 2 1 1 This should not take more than about 30 seconds.] 

A11 A12 ∈ C n×n be a four-block matrix with Exercise 7.8. Let A = A21 A22 entries A11 ∈ C p×p , A12 ∈ C p×q , A21 ∈ C q×p , A22 ∈ C q×q , where n = p + q. Show that: (7.2) if A11 is invertible, then

det A = det A11 det (A22 − A21 A−1 11 A12 ) ;

(7.3) if A22 is invertible, then

det A = det A22 det (A11 − A12 A−1 22 A21 ).

[HINT: Use the Schur complement formulas (3.18) and (3.19) .] Exercise 7.9. Show that if A ∈ C p×q and B ∈ C q×p , then det(Ip − AB) = det(Iq − BA) and q + rank(Ip − AB) = p + rank(Iq − BA) . [HINT: Imbed A and B appropriately in a (p + q) × (p + q) matrix and then invoke formulas (7.2) and (7.3).] Exercise 7.10. Show that if A ∈ C p×q and B ∈ C q×p , then (7.4)

λq det(λIp − AB) = λp det(λIq − BA)

for every λ ∈ C .

Exercise 7.11. Show that if u, v ∈ C n , then det (In − uvH ) = 0 if and only if 1 − vH u = 0 and then compute (In − uvH )−1 when this condition is in force. [HINT: (In − uvH )−1 must be of the form (In + κuvH )−1 for some choice of κ ∈ C.]

7.4. Minors

69

Exercise 7.12. Let A ∈ C n×n be invertible and let u, v ∈ C n . Show that the matrix A + uvH is invertible if and only if 1 + vH A−1 u = 0 and that if this condition is in force, then (A + uvH )−1 = A−1 −

A−1 uvH A−1 . 1 + vH A−1 u

[HINT: Exploit Exercise 7.11.] Exercise 7.13. Verify the formula det(In − ek eTj + uvH ) = (1 + vH u)(1 − eTj ek ) + uj vk , where u, v ∈ C n , uj = eTj u, vk = eTk v, and ei denotes the i’th column of In , i = 1, . . . , n. [HINT: First verify the formula under the condition 1 + vH u = 0, and then use the fact that the determinant of a matrix A ∈ C n×n is a continuous function of the entries aij in A.] 

Ip Op×q . Exercise 7.14. Calculate the determinant of the matrix Iq Oq×p

7.4. Minors The ij minor A{i;j} of a matrix A ∈ C n×n is defined as the determinant of the (n − 1) × (n − 1) matrix that is obtained by deleting the i’th row and the j’th column of A. Thus, for example, if ⎡ ⎤

 1 3 1 2 4 . A = ⎣2 0 4⎦ , then A{1;2} = det 1 2 1 1 2  where Exercise 7.15. Show that if A ∈ C n×n , then A{i;j} = (−1)i+j det A,  does  denotes the matrix A with its i’th row replaced by eT . [HINT: det A A j not change if akj is replaced by 0 when k = i. Then by i − 1 + j − 1 row and column interchanges the one in the ij position moves to the 11 position. Now invoke Lemma 7.3.] Theorem 7.4. If A is an n × n matrix, then det A can be expressed as an expansion along the i’th row: n  aij (−1)i+j A{i;j} (7.5) det A = j=1

for each choice of i, i = 1, . . . , n, and as an expansion along the j’th column: n  aij (−1)i+j A{i;j} (7.6) det A = i=1

for each choice of j, j = 1, . . . , n.

70

7. Determinants

Discussion. Let ej denote the j’th column of In . Then (7.5) follows from T the fact that ei A = nj=1 aij eTj and Exercise 7.15. The verification of formula (7.6) rests on analogous decompositions for the columns of A.  • Moral: Formulas (7.5) and (7.6) yield 2n different ways of calculating the determinant of an n × n matrix, one for each row and one for each column, respectively. It is usually advantageous to expand along the row or column with the most zeros. Exercise 7.16. Evaluate the determinant ⎡ 5 2 3 ⎢3 0 0 A=⎢ ⎣1 1 0 0 2 0

of the 4 × 4 matrix ⎤ 1 2⎥ ⎥ 1⎦ 1

twice; first begin by expanding in minors along the third column and then again by expanding in minors along the fourth column. Theorem 7.5. If A ∈ C n×n and if C ∈ C n×n denotes the matrix with entries cij = (−1)i+j A{j;i} , i, j = 1, . . . , n , then: (1) AC = CA = det A · In . (2) If det A = 0, then A is invertible and A−1 = Discussion. ⎡ a11 ⎣a21 (7.7) a31

1 C. det A

If A is a 3 × 3 matrix, then this theorem states that ⎤⎡ ⎤ A{3;1} A{1;1} −A{2;1} a12 a13 A{2;2} −A{3;2} ⎦ = det A · I3 . a22 a23 ⎦ ⎣ −A{1;2} a32 a33 A{1;3} −A{2;3} A{3;3}

This formula may be verified by three simple sets of calculations. The first set is based on the formula ⎡ ⎤ x y z (7.8) det ⎣a21 a22 a23 ⎦ = xA{1;1} − yA{1;2} + zA{1;3} a31 a32 a33 and the observation that: • if (x, y, z) = (a11 , a12 , a13 ), then the left-hand side of (7.8) is equal to det A; • if (x, y, z) = (a21 , a22 , a23 ), then, by rule 4◦ , the left-hand side of (7.8) is equal to 0;

7.4. Minors

71

• if (x, y, z) = (a31 , a32 , a33 ), then, by rule 4◦ , the left-hand side of (7.8) is equal to 0. These three evaluations can be recorded in the following more revealing way: ⎤⎡ ⎤ ⎡ ⎤ ⎡ A{1;1} 1 a11 a12 a13 ⎣a21 a22 a23 ⎦ ⎣ −A{1;2} ⎦ = det A ⎣0⎦ . 0 a31 a32 a33 A{1;3} The next set of calculations uses the formula ⎡ ⎤ a11 a12 a13 y z ⎦ = −xA{2;1} + yA{2;2} − zA{2;3} det ⎣ x a31 a32 a33 to verify that ⎤⎡ ⎤ ⎡ ⎤ −A{2;1} 0 a11 a12 a13 ⎣a21 a22 a23 ⎦ ⎣ A{2;2} ⎦ = det A ⎣1⎦ . a31 a32 a33 0 −A{2;3} ⎡

The final step in the verification of (7.7) is to substitute x, y, and z for a31 , a32 , and a33 , respectively, in order to obtain analogues of the formulas obtained in the first two steps and then to combine these results appropriately.  Exercise 7.17. Formulate and verify the analogue of formula (7.7) for 4 × 4 matrices. Exercise 7.18. Show that if ⎡ b1 a12 det ⎣b2 a22 b3 a32 x1 = det A

A ∈ C 3×3 is invertible and Ax = b, then ⎡ ⎤ ⎤ a11 b1 a13 a13 a23 ⎦ det ⎣a21 b2 a23 ⎦ a33 a31 b3 a33 , x2 = det A

and state and verify the analogous formula for x3 . [REMARK: This is an example of Cramer’s rule.] Exercise 7.19. Show that if A, B ∈ C n×n and AB = In , then for every λ∈C ⎡ ⎤ 1 λ · · · λn−1 ⎢ a21 a22 · · · a2n ⎥ ⎢ ⎥ det ⎢ . .. ⎥ ⎣ .. . ⎦ an1 an2 · · · ann . (7.9) b11 + b21 λ + · · · + bn1 λn−1 = det A

72

7. Determinants ⎡

1 2 Exercise 7.20. Compute the inverse of the matrix A = ⎣ 2 1 1 x those values of x for which A is invertible. [HINT: Exploit formula

⎤ 2 2 ⎦ for 0 (7.7).]

7.5. Eigenvalues Determinants play a useful role in calculating the eigenvalues of a matrix A ∈ C n×n . In particular, if A = U JU −1 , where J is in Jordan form, then & ' det(λIn − A) = det(λIn − U JU −1 ) = det U (λIn − J)U −1 . Therefore, by rules 10◦ , 11◦ , and 8◦ , applied in that order, det(λIn − A) = det(λIn − J) = (λ − j11 )(λ − j22 ) · · · (λ − jnn ), where jii , i = 1, . . . , n, are the diagonal entries of J. The polynomial p(λ) = det(λIn − A)

(7.10)

is termed the characteristic polynomial of A. In particular, a number λ is an eigenvalue of the matrix A if and only if p(λ) = 0. Thus, for example, to find the eigenvalues of the matrix

 1 2 A= , 2 1 look for the roots of the polynomial det(λI2 − A) = (λ − 1)2 − 22 = λ2 − 2λ − 3. This leads readily to the conclusion that the eigenvalues of the given matrix A are λ1 = 3 and λ2 = −1. Moreover, if J = diag{3, −1}, then A2 − 2A − 3I2 = U (J 2 − 2J − 3I)U −1 = U (J − 3I2 )(J + I2 )U −1

 

 0 0 4 0 −1 0 0 −1 = U U =U U , 0 −4 0 0 0 0 which yields the far from obvious conclusion A2 − 2A − 3I2 = O . The argument propogates: If λ1 , . . . , λk denote the distinct eigenvalues of A and if αi denotes the algebraic multiplicity of the eigenvalue λi , i = 1, . . . , k, then the characteristic polynomial can be written in the more revealing form (7.11)

p(λ) = (λ − λ1 )α1 (λ − λ2 )α2 · · · (λ − λk )αk .

Thus, p(A) = (A − λ1 In )α1 (A − λ2 In )α2 · · · (A − λk In )αk = U (J − λ1 In )α1 (J − λ2 In )α2 · · · (J − λk In )αk U −1 = O.

7.5. Eigenvalues

73

This serves to justify the Cayley-Hamilton theorem that was referred to in Remark 6.2. In more striking terms, it states that det (λIn − A)

=

a0 + · · · + an−1 λn−1 + λn

(7.12) =⇒ a0 In + · · · + an−1 An−1 + An = O . (5)

(3)

(2)

Exercise 7.21. Show that if J = diag{Cλ1 , Cλ2 , Cλ3 }, then (J − λ1 I10 )5 (J − λ2 I10 )3 (J − λ3 I10 )2 = O . . Exercise 7.22. Show that if λ1 = λ2 in Exercise 7.21, then (J − λ1 I10 )5 (J − λ3 I10 )2 = O . Exercise 7.22 illustrates the fact that if νj , j = 1, . . . , k, denotes the size of the largest Jordan cell in the matrix J with λj on its diagonal, then p(A) = 0 holds for the possibly lower-degree polynomial pmin (λ) = (λ − λ1 )ν1 (λ − λ2 )ν2 · · · (λ − λk )νk , which is the minimal polynomial referred to in Remark 6.2: pmin (A) = (A − λ1 In )ν1 (A − λ2 In )ν2 · · · (A − λk In )νk = U (J − λ1 In )ν1 (J − λ2 In )ν2 · · · (J − λk In )νk U −1 = O. The Jordan decomposition A = U JU −1 leads easily to a number of useful formulas for determinants and traces, where the trace of an n × n matrix A is defined as the sum of its diagonal elements: (7.13)

trace A = a11 + a22 + · · · + ann .

Theorem 7.6. If A ∈ C n×n has k distinct eigenvalues λ1 , . . . , λk with algebraic multiplicities α1 , . . . , αk , then (7.14)

det A = λα1 1 λα2 2 · · · λαk k

and (7.15)

trace A = α1 λ1 + α2 λ2 + · · · + αk λk .

Moreover, if f (λ) is a polynomial, then (7.16)

det(λIn − f (A)) = (λ − f (λ1 )) α1 · · · (λ − f (λk )) αk ,

(7.17)

det f (A) = f (λ1 )α1 f (λ2 )α2 · · · f (λk )αk ,

and (7.18)

trace f (A) = α1 f (λ1 ) + α2 f (λ2 ) + · · · + αk f (λk ) .

74

7. Determinants

Proof. The verification of formulas (7.15) and (7.18) depends upon the fact that n  n  aij bji = trace (BA) . (7.19) trace (AB) = i=1 j=1

Thus, in particular, (7.20)

trace A = trace (U JU −1 ) = trace (JU −1 U ) = trace J ,

which leads easily to (7.15); the verification of (7.18) is similar but is based  on the formula f (A) = U f (J)U −1 . The rest is left to the reader. Corollary 7.7 (Spectral mapping principle). If A ∈ C n×n and f (λ) is a polynomial, then (7.21)

λ ∈ σ(A) ⇐⇒ f (λ) ∈ σ(f (A)) .

Exercise 7.23. Verify Corollary 7.7, but show by example that the multiplicities may change. [HINT: The key is (7.16).]

7.6. Supplementary notes This chapter is partially adapted in abbreviated form from Chapter 5 of [30]. As noted earlier, additional properties of determinants will be considered in Chapter 17. Formula (7.13) is a special case of a general formula for the trace that will be presented in Section 14.6. The formula in Exercise 7.12 is known as the Sherman-Morrison formula. A byproduct of formula (7.4) is that: (7.22)

if A, B ∈ C n×n , then σ(AB) \ {0} = σ(BA) \ {0}.

Moreover, since det A is a homogeneous polynomial of degree n in its variables, ∂ det A/∂aij = (−1)i+j A{i;j} and hence formulas (7.5) and (7.6) can be expressed as n n   ∂ det A ∂ det A aij and det A = aij , (7.23) det A = ∂aij ∂aij j=1

respectively.

i=1

Chapter 8

Companion matrices and circulants

A matrix A ∈ C n×n of the form ⎡ 0 1 0 ⎢ 0 0 1 ⎢ ⎢ .. (8.1) A=⎢ . ⎢ ⎣ 0 0 0 −a0 −a1 −a2

··· ··· .. .

0 0 .. .

··· ···

1 −an−1

⎤ ⎥ ⎥ ⎥ ⎥ ⎥ ⎦

is called a companion matrix. In this chapter some of the special properties of companion matrices are developed and then applied to study a class of matrices called circulants. Sections 8.3 and 8.4 treat advanced applications that can be postponed to a later reading without loss of continuity.

8.1. Companion matrices Theorem 8.1. If the companion matrix A ∈ C n×n in (8.1) has k distinct eigenvalues λ1 , . . . , λk with geometric multiplicities γ1 , . . . , γk and algebraic multiplicities α1 , . . . , αk , respectively, then: (1) det (λIn − A) = a0 + a1 λ + · · · + an−1 λn−1 + λn . (2) γj = 1 for j = 1, . . . , k. (α )

(α )

(3) A is similar to the Jordan matrix J = diag{Cλ1 1 , . . . , Cλk k }. (4) A is invertible if and only if a0 = 0. Proof. The formula in (1) is obtained by expanding in minors along the last column of λIn − A and taking advantage of the structure; see Exercises 75

76

8. Companion matrices and circulants

8.1 and 8.2. Next, since dim R(A−λIn ) ≥ n − 1 for every point λ ∈ C, it follows that dim N(A−λj In ) = 1 for j = 1, . . . , k. Therefore (2) holds, and (α )

hence there is exactly one Jordan cell Cλj j for each distinct eigenvalue λj ,  i.e., (3) holds. Assertion (4) holds because rank A = n ⇐⇒ a0 = 0. Exercise 8.1. Verify the formula in (1) in Theorem 8.1 for n = 2 and n = 3. Exercise 8.2. Justify the formula for pn (λ) = det (λIn − A) in (1) in Theorem 8.1 for an arbitrary positive integer n by induction after first verifying the identity (8.2)

pn (λ) = (an−1 + λ)λn−1 + pn−1 (λ) − λn−1

for n ≥ 2 .

[HINT: Expand in minors along the last column. The key observation when n = 4, for example, is that the 34 minor ⎡ ⎤ ⎡ ⎤ ⎤ ⎡ λ −1 0 λ −1 0 λ −1 0 det ⎣ 0 λ −1⎦ = det ⎣ 0 λ −1 ⎦ − det ⎣ 0 λ 0 ⎦ .] a0 a1 a2 0 0 λ a0 a1 a2 + λ Our next objective is to analyze the Jordan decomposition of companion matrices. To warm up, we begin with an example. Example 8.1. If λ is an eigenvalue of a 3 × 3 companion matrix A with characteristic polynomial f (λ) = a0 + a1 λ + a2 λ2 + λ3 , then ⎡ ⎤ ⎡ ⎤ ⎤⎡ ⎤ ⎡ x1 0 1 0 x1 x2 ⎦. x3 0 1 ⎦ ⎣x2 ⎦ = ⎣ λ ⎣ x2 ⎦ = ⎣ 0 x3 x3 −(a0 x1 + a1 x2 + a2 x3 ) −a0 −a1 −a2 Consequently, x2 = λx1 , x3 = λx2 = λ2 x1 , and −(a0 x1 + a1 x2 + a2 x3 ) = −(a0 + a1 λ + a2 λ2 )x1 = −f (λ)x1 + λ3 x1 = λ3 x1 ,  T  T = x1 1 λ λ2 , since f (λ) = 0 when λ ∈ σ(A). Thus, x1 x2 x3 and, if A has three distinct eigenvalues λ1 , λ2 , λ3 , then ⎤ ⎡ ⎤⎡ ⎤ ⎡ 1 1 1 λ1 0 0 1 1 1 A ⎣λ1 λ2 λ3 ⎦ = ⎣λ1 λ2 λ3 ⎦ ⎣ 0 λ2 0 ⎦ . λ21 λ22 λ23 λ21 λ22 λ23 0 0 λ3 Analogous formulas hold for n × n companion matrices with n distinct eigenvalues. This and more follows from the evaluations in the next lemma. Lemma 8.2. If A ∈ C n×n is a companion matrix with characteristic poly nomial f (λ) = a0 + · · · + an−1 λn−1 + λn and v(λ)T = 1 λ · · · λn−1 , then (8.3)

A v(λ) = λ v(λ) − f (λ)en ,

8.1. Companion matrices

77

where en denotes the n’th column of In , and (8.4)

A

v(j) (λ) v(j−1) (λ) f (j) (λ) v(j) (λ) =λ + − en j! j! (j − 1)! j!

Proof. By direct computation ⎡ λ ⎢ .. ⎢ . A v(λ) = ⎢ ⎣ λn−1 −(a0 + a1 λ + · · · + an−1 λn−1 )

⎤ ⎥ ⎥ ⎥ ⎦



f or j = 1, 2, . . . .

λ ⎢ .. ⎢ =⎢ . ⎣ λn−1 λn





0 ⎥ ⎢ .. ⎥ ⎢ . ⎥−⎢ ⎦ ⎣ 0 f (λ)

⎤ ⎥ ⎥ ⎥, ⎦

which coincides with (8.3). The formulas in (8.4) are obtained by differentiating both sides of (8.3) k times with respect to λ to first verify the formula (8.5)

A v(k) (λ) = λ v(k) (λ) + k v(k−1) (λ) − f (k) (λ)en

for k = 1, 2, . . . by Leibniz’s rule (gh)

(k)

=

k    k j=0

j

g (j) h(k−j)

for the derivative of a product (or by induction) and then dividing both sides by k!.  Exercise 8.3. Use formulas (8.3) and (8.4) to give another proof  of formula v(1) (λ) v(λ) v(n−1) (λ) (1) in Theorem 8.1. [HINT: det 0! = 1.] ··· 1! (n−1)! Exercise 8.4. ⎡ 0 1 ⎣ 0 0 −a0 −a1

Show that if f (λ) = a0 + a1 λ + a2 λ2 + λ3 = (λ − μ)3 , ⎤ ⎡ ⎤ ⎡ μ 1 0 1 0 0 1 ⎦ V = V ⎣ 0 μ 1 ⎦ , where V = ⎣ μ 1 0 0 μ μ2 2μ −a2

then ⎤ 0 0⎦ . 1

[HINT: Invoke (8.3), (8.4), and the evaluations f (μ) = f  (μ) = f  (μ) = 0.] (α )

(α )

Theorem 8.3. If J = diag{Cλ1 1 , . . . , Cλk k } is an n × n matrix in Jordan form based on k distinct eigenvalues λ1 , . . . , λk , then J is similar to the companion matrix A in (8.1) with entries aj equal to the coefficients of the polynomial a0 + a1 λ + · · · + an−1 λn−1 + λn = (λ − λ1 )α1 · · · (λ − λk )αk : If   v(αj −1) (λj ) v(λj ) v(1) (λj ) ··· (8.6) Vj = 0! 1! (αj − 1)!   for j = 1, . . . , k with v(λ)T = 1 λ · · · λn−1 , then the n × n matrix   (8.7) V = V1 · · · Vk

78

8. Companion matrices and circulants

is invertible and A = V JV −1 .

(8.8)

If A has n distinct eigenvalues λ1 , . . . , λn , then J = diag{λ1 , . . . , λn } and ⎡ ⎤ 1 ··· 1 ⎢ λ1 · · · λn ⎥ ⎢ ⎥ (8.9) V = ⎢ .. .. ⎥ . ⎣ . . ⎦ · · · λn−1 λn−1 n 1 Proof. In view of formulas (8.3) and (8.4), (8.10)

(αj )

AVj = Vj (λj Iαj + C0

(α )

) = Vj Cλj j

for

j = 1, . . . , k .

Consequently, AV = V J with V as in (8.7) and, as rank Vj = αj for j = 1, . . . , k and rank V = rank V1 + · · · + rank Vk = n, V is invertible. Thus, (8.8) holds. 

Formula (8.9) is obtained from (8.6) and (8.7) when k = n.

A matrix V ∈ C n×n of the form (8.9) is called a Vandermonde matrix, whereas a matrix of the form (8.7) with Vj as in (8.6) is called a generalized Vandermonde matrix. Corollary 8.4. The Vandermonde matrix V defined by formula (8.9) is invertible if and only if the points λ1 , . . . , λn are distinct. Example 8.2. If f (λ) = (λ − α)3 (λ − β)2 = a0 + a1 λ + · · · + a4 λ4 + λ5 and α = β, then ⎡ 0 1 ⎢ 0 0 ⎢ ⎢ 0 0 ⎢ ⎣ 0 0 −a0 −a1 ⎡ 1 ⎢ α ⎢ 2 =⎢ ⎢ α ⎣ α3 α4

0 0 0 1 0 0 0 1 0 0 0 1 −a2 −a3 −a4 0 1 2α 3α2 4α3

0 0 1 3α 6α2

1 β β2 β3 β4

⎤⎡

1 ⎥⎢ α ⎥⎢ 2 ⎥⎢ α ⎥⎢ ⎦ ⎣ α3 α4 ⎤⎡

0 1 2β 3β 2 4β 3

⎥ ⎥ ⎥ ⎥ ⎦

α ⎢ 0 ⎢ ⎢ 0 ⎢ ⎣ 0 0

0 1 2α 3α2 4α3 1 α 0 0 0

0 0 1 3α 6α2 0 1 α 0 0

0 0 0 β 0

1 β β2 β3 β4 0 0 0 1 β

0 1 2β 3β 2 4β 3 ⎤

⎤ ⎥ ⎥ ⎥ ⎥ ⎦

⎥ ⎥ ⎥. ⎥ ⎦

Exercise 8.5. Compute the determinant of the Vandermonde matrix V given by formula (8.9) when n = 4. [HINT: Let f (x) denote the value of the determinant when λ4 is replaced by x and observe that f (x) is a polynomial of degree ≤ 3 and f (λ1 ) = f (λ2 ) = f (λ3 ) = 0.]

8.2. Circulants

79

Exercise 8.6. Find an invertible matrix U and a matrix J in Jordan form such that A = U JU −1 if A ∈ C 6×6 is a companion matrix, det (λI6 − A) = (λ − λ1 )4 (λ − λ2 )2 , and λ1 = λ2 . If A ∈ C n×n has k distinct eigenvalues λ1 , . . . , λk with geometric multiplicities γ1 = · · · = γk = 1, then A is similar to a companion matrix. The next three exercises serve to indicate the possibilities when this condition fails. Exercise 8.7. Show that if A ∈ C n×n is similar to the Jordan matrix (4)

(2)

(3)

(1)

(3)

J = diag{Cλ1 , Cλ1 , Cλ2 , Cλ2 , Cλ3 }

with 3 distinct eigenvalues ,

then A is also similar to the block diagonal matrix diag{A1 , A2 } based on a pair of companion matrices with characteristic polynomials f1 (λ) and f2 (λ), and find the polynomials. Exercise 8.8. Find a Jordan form J for the matrix ⎡ ⎤ 0 1 0 0 0 ⎢ 0 0 1 0 0 ⎥ ⎢ ⎥ ⎢ A = ⎢ 8 −12 6 0 0 ⎥ ⎥. ⎣ −1 1 0 0 1 ⎦ −4 1 0 −4 4 [HINT: You may find the formula x3 − 6x2 + 12x − 8 = (x − 2)3 useful.] Exercise 8.9. Find an invertible matrix U such that AU = U J for the matrices A and J considered in Exercise 8.8.

8.2. Circulants A matrix A ∈ C n×n of the form A = g(P ) = a0 In + a1 P + · · · + an−1 P n−1

(8.11)

based on the polynomial g(λ) = a0 + a1 λ + · · · + an−1 λn−1 and the n × n permutation matrix (8.12)

P =

n−1 

ej eTj+1 + en eT1

(which is also a companion matrix)

j=1

is called a circulant. ⎡ 0 1 0 ⎢0 0 1 ⎢ P =⎢ ⎢0 0 0 ⎣0 0 0 1 0 0

To illustrate more graphically, ⎤ ⎡ 0 0 a0 a1 ⎥ ⎢ 0 0⎥ ⎢ a4 a0 ⎢ a3 a4 1 0⎥ and A = ⎥ ⎢ ⎣ a2 a3 0 1⎦ a1 a2 0 0

if n = 5, then ⎤ a2 a3 a4 a1 a2 a3 ⎥ ⎥ a0 a1 a2 ⎥ ⎥. a4 a0 a1 ⎦ a3 a4 a0

80

8. Companion matrices and circulants

Circulants have very nice properties: (8.13)

if A ∈ C n×n and B ∈ C n×n are circulants, then AB = BA, AH is a circulant, and hence AH A = AAH .

Moreover, circulants are diagonalizable and, as will be spelled out below in Theorem 8.5, it is easy to compute their Jordan decomposition. Exercise 8.10. Show that if a permutation matrix P ∈ R n×n is also a companion matrix, then P P H = In and P n = In . Exercise 8.11. Verify the assertions in (8.13). [HINT: Exploit Exercise 8.10.] Exercise 8.12. Show that there exist permutation matrices P ∈ R n×n such that P n = In if n ≥ 3. Theorem 8.5. If A ∈ C n×n is the circulant that is defined in terms of the polynomial g(λ) = a0 + a1 λ + · · · + an−1 λn−1 and the permutation matrix P by formula (8.11), then (8.14)

A=

1 V DV H , where D = diag{g(λ1 ), . . . , g(λn )} , n λj = exp (2πi(j/n)) for j = 1, . . . , n ,

V is the Vandermonde matrix defined by formula (8.9), and V H V = nIn . Proof. The permutation matrix P defined by (8.12) is a companion matrix. Thus, in view of Theorem 8.1, det (λIn − P ) = λn − 1 and hence P has n distinct eigenvalues λj = ζ j for j = 1, . . . , n, with ζ = ei2π/n . Therefore, by Theorem 8.3, P = V ΔV −1 with Δ = diag{λ1 , . . . , λn } and V as in (8.9). Consequently, A = g(P ) = V g(Δ)V −1 = V DV −1 . Moreover, since ( 0 if j = k , n−1 = (8.15) 1 + λj λk + · · · + (λj λk ) n if j = k , 

V H V = n In . Exercise 8.13. Verify (8.15).

Exercise 8.14. Show that if P denotes the permutation matrix defined by formula (8.12), then N(P −λj In ) = span{vj }, where vj denotes the j’th column of the Vandermonde matrix V with λj = exp (2πij/n) and justify the formula (8.16)

P = V ΔV −1

where

Δ = diag{λ1 , . . . , λn } .

8.3. Interpolating polynomials

81

Remark 8.6. If A and V are defined as in Theorem 8.5, but the eigenvalues λj = exp (2πi(j/n) of A are indexed j = 0, . . . , n − 1, then

(8.17)

⎡ 1 ⎢1 ⎢ V = ⎢ .. ⎣.

··· ···

1 ω

1 ω n−1 · · ·



1

⎥ ⎥ ⎥. ⎦

ω n−1 .. . ω (n−1)

2

The matrix Fn = n−1/2 V is called the Fourier matrix. Exercise 8.15. Show that if n ≥ 4, then σ(Fn ) = {1, i, −1, −i}. [HINT: First show that Fn4 = In .]

8.3. Interpolating polynomials The next theorem is a useful byproduct of the properties of Vandermonde matrices that were developed in Section 8.1. Theorem 8.7. If {α0 , . . . , αn } and {β0 , . . . , βn } are two sets of points in C and αi = αj when i = j, then there exists exactly one polynomial p(λ) = c0 + c1 λ + · · · + cn λn of degree n such that p(αi ) = βi for i = 0, . . . , n. The coefficients cj of this polynomial are specified by (8.18) below. Proof. Let p(λ) = c0 + c1 λ + · · · + cn λn be a polynomial of degree n. Then p(αj ) = βj for j = 0, . . . , n if and only if

(8.18)

⎡ 1 α0 ⎢1 α1 ⎢ ⎢ .. ⎣. · · · 1 αn

··· ··· .. . ···

⎤ α0n α1n ⎥ ⎥ ⎥ ⎦ n αn

⎡ ⎤ ⎡ ⎤ c0 β0 ⎢ c 1 ⎥ ⎢ β1 ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ .. ⎥ = ⎢ .. ⎥ . ⎣.⎦ ⎣ . ⎦ cn

βn

Since the points α0 , . . . , αn are distinct, the matrix in (8.18) is the transpose of an invertible Vandermonde matrix. Therefore, there is only one set of  coefficients c0 , . . . , cn , for which (8.18) holds. The same circle of ideas applied to generalized Vandermonde matrices allows us to specify derivatives. Example 8.3. If λ1 = λ2 , then for each choice of β0 , . . . , β4 ∈ C, there exists exactly one polynomial p(λ) = c0 + c1 λ + · · · + c4 λ4 of degree four such that p(λ1 ) = β0 , p (λ1 ) = β1 , p (λ1 )/2! = β2 , p(λ2 ) = β3 , and p (λ2 ) = β4 . The

82

8. Companion matrices and circulants

coefficients of this polynomial ⎡ 1 λ1 λ21 ⎢0 1 2λ1 ⎢ ⎢0 0 1 ⎢ ⎣1 λ2 λ2 2 0 1 2λ1

are the solutions of the equation ⎤ λ31 λ41 ⎡c ⎤ ⎡β ⎤ 0 0 3λ21 4λ31 ⎥ ⎢c1 ⎥ ⎢β1 ⎥ ⎥ ⎢ ⎥ ⎢ ⎥ 3λ1 6λ21 ⎥ .⎥=⎢ . ⎥. ⎥⎢ ⎣ .. ⎦ ⎣ .. ⎦ 3 4 ⎦ λ2 λ2 c4 β4 3λ22 4λ32

This equation has exactly one solution because the matrix on the left is the transpose of an invertible generalized Vandermonde matrix. 

8.4. An eigenvalue assignment problem Let

(8.19)

⎡ ⎢ ⎢ ⎢ Kf = ⎢ ⎢ ⎣

0 0 .. .

1 0

0 1

··· ··· .. .

0 0 0 ··· −a0 −a1 −a2 · · ·

0 0 .. . 1 −an−1

⎤ ⎥ ⎥ ⎥ ⎥ ⎥ ⎦

denote the companion matrix based on the polynomial (8.20)

f (λ) = a0 + a1 λ + · · · + an−1 λn−1 + λn

and let Hf denote the invertible Hankel matrix ⎡ ⎤ a1 a2 · · · an−1 an ⎢ a2 a3 · · · an 0⎥ ⎢ ⎥ (8.21) Hf = ⎢ . .. ⎥ with an = 1 . ⎣. .⎦ an 0 · · · 0 0 based on the coefficients of the same polynomial. Lemma 8.8. The product Hf Kf of the matrices Kf and Hf defined by formulas (8.19) and (8.21) is symmetric: (8.22)

KfT Hf = Hf Kf

Proof. It turns out to be convenient to express Kf in terms of the n × n   (n) Jordan cell C0 with 0 on the diagonal, the vector uT = a0 · · · an−1 , (n)

and the standard basis vectors e1 , . . . , en of C n as Kf = C0 − en uT . Then  

0 a1 · · · an−1 −a0 0T (n) T T − e1 u = , Hf Kf = Hf (C0 − en u ) = 0 B 0 B

8.4. An eigenvalue assignment problem

where



a2 a3 .. .

··· ···

a3 a4

⎢ ⎢ ⎢ B=⎢ ⎢ ⎣an−1 an 0 ··· an

⎤ an−1 an an 0⎥ ⎥ .. ⎥ = B T .⎥ ⎥ 0 0⎦ 0 0

83

with an = 1 .

Thus, Hf Kf = (Hf Kf )T = KfT Hf , since Hf is symmetric.



Theorem 8.9. If A ∈ C n×n has k distinct eigenvalues λ1 , . . . , λk , with geometric multiplicities γ1 , . . . , γk and algebraic multiplicities α1 , . . . , αk , respectively, then the following statements are equivalent: (1) γ1 = · · · = γk = 1. (2) A is similar to the companion matrix Kf based on the polynomial f (λ) = det (λIn − A). (3) A is similar to KfT . (4) There exists a vector b ∈ C n such that the matrix C = [b Ab · · · An−1 b] is invertible. (In other terminology, b is a cyclic vector for A.) Proof. If (1) is in force, then the matrix J in the Jordan decomposition (α ) (α ) of A is of the form J = diag{Cλ1 1 , . . . , Cλk k }. Thus, in view of Theorem 8.3, J is similar to the companion matrix Kf based on the characteristic polynomial of A. Therefore, (1) =⇒ (2). The implications (2) =⇒ (1) and (2) ⇐⇒ (3) are justified by Theorem 8.1 and Lemma 8.8, respectively. Suppose next that (3) holds. Then there exists an invertible matrix U ∈ C n×n with columns u1 , . . . , un such that     u1 · · · un KfT = A u1 · · · un , i.e., u2 = Au1 ,

u3 = Au2 = A2 u1 , · · · , un = Aun−1 = An−1 u1 .

Thus, as U is invertible, (4) holds with b = u1 . Conversely, if (4) holds, then   with c = −(a0 b+· · ·+an−1 An−1 b) = An b , CKfT = Ab · · · An−1 b c by the Cayley-Hamilton theorem. Therefore, C KfT = AC and, as C is invertible, (4) =⇒ (3). 

84

8. Companion matrices and circulants

A basic problem in control theory amounts to shifting the eigenvalues of a given matrix A to preassigned values, or a preassigned region, by an appropriately chosen additive perturbation of the matrix, which in practice is implemented by feedback. Since the eigenvalues of A are the roots of its characteristic polynomial, this corresponds to shifting the polynomial f (λ) = det (λIn − A) to a polynomial g(λ) = c0 + · · · + cn−1 λn−1 + λn with suitably chosen roots. Theorem 8.10. If A ∈ C n×n has k distinct eigenvalues λ1 , . . . , λk , with geometric multiplicities γ1 = · · · = γk = 1, then for every polynomial g(λ) = c0 + c1 λ + · · · + cn−1 λn−1 + λn , there exists a pair of vectors b, u ∈ C n such that det(λIn − A − buT ) = g(λ). Proof. Let Kf denote the companion matrix based on the polynomial f (λ) = det (λIn − A) and let Hf denote the invertible Hankel matrix based on the coefficients of f (λ) that is defined in (8.21). Then, in view of The orem 8.9, there exists an invertible matrix C = b Ab · · · An−1 b such that AC = CKfT . Thus, as KfT Hf = Hf Kf , (A + buT )CHf = CKfT Hf + buT CHf = CHf Kf + buT CHf . Therefore, since Kg − Kf = en wT with  wT = (a0 − c0 ) (a1 − c1 ) · · ·

 (an−1 − cn−1 ) ,

(A + buT )CHf = CHf Kg ⇐⇒ buT CHf = CHf (Kg − Kf ) ⇐⇒ buT CHf = CHf en wT ⇐⇒ buT CHf = bwT ⇐⇒ uT = wT (CHf )−1 . (The passage to the last line uses the identities Hf en = e1 and Ce1 = b.)  to a companion matrix Exercise 8.16. Show that if A ∈ C n×n   is similar   −1 Kf : A = U Kf U , U = u1 · · · un , C = u1 Au1 · · · An−1 u1 , and Hf is the Hankel matrix defined in (8.21), then A = (CHf )Kf (CHf )−1 . ⎡ ⎤ ⎡ ⎤ 1 1 0 0 ⎣ ⎦ ⎣ Exercise 8.17. Let A = 0 1 1 and let b = 0⎦. Find a vector u ∈ C 3 0 0 1 1 H such that σ(A + bu ) = {2, 3, 4}.

8.5. Supplementary notes Companion matrices play a significant role in Chapters 19, 20, and 42. The restriction in Theorem 8.10 that the matrix A is similar to a companion matrix will be relaxed in Section 36.3.

Chapter 9

Inequalities

In this chapter we shall establish a number of inequalities for future use. We begin, however, with a brief introduction to the theory of convex functions, because they play a useful role in verifying these inequalities (and many, many others). In the last section we discuss ever so briefly a finitedimensional version of another basic tool in convex analysis: the KreinMilman theorem.

9.1. A touch of convex function theory A subset Q of a vector space is said to be a convex set if ta + (1 − t)b ∈ Q for every pair of vectors a, b ∈ Q and every t in the interval 0 ≤ t ≤ 1. This is the same as saying that if a and b belong to Q, then every point on the line segment between a and b also belongs to Q. A real-valued function f (x) defined on a convex set Q is said to be convex if (9.1) f (ta + (1 − t)b) ≤ tf (a) + (1 − t)f (b) for every choice of a, b ∈ Q and every choice of t in the interval 0 ≤ t ≤ 1; f (x) is said to be a strictly convex function if the inequality in (9.1) is strict whenever a = b and 0 < t < 1. Lemma 9.1 (Jensen’s inequality). If f is a real-valued convex function that is defined on a convex set Q and if x1 , . . . , xn ∈ Q and t1 , . . . , tn are positive numbers such that t1 + · · · + tn = 1, then ) n * n n    ti xi ∈ Q and f ti xi ≤ ti f (xi ) . (9.2) i=1

i=1

i=1

If f is strictly convex, then equality holds if and only if x1 = · · · = xn . 85

86

9. Inequalities

Proof. To justify (9.2), we shall show that if it is valid for n = k for some positive integer k ≥ 2, then it is also valid for n = k + 1. Since (9.2) is valid for n = 2 by definition, this will complete the justification. Towards this end, let x1 , . . . , xk+1 ∈ Q and let t1 , . . . , tk+1 be positive numbers such that t1 + · · · + tk+1 = 1. Then t1 x1 + · · · + tk xk + tk+1 xk+1 = (1 − tk+1 )u + tk+1 xk+1 , where u = τ1 x1 +· · ·+τk xk and τj = (1−tk+1 )−1 tj for j = 1, . . . , k. Thus, as τ1 +· · ·+τk = 1, the presumed validity of (9.2) for n = k ensures that u ∈ Q. Therefore, by the identity in the last display, t1 x1 +· · ·+tk xk +tk+1 xk+1 ∈ Q and f (t1 x1 + · · · + tk+1 xk+1 ) = f ((1 − tk+1 )u + tk+1 xk+1 ) ≤ (1 − tk+1 )f (u) + tk+1 f (xk+1 ) ≤ (1 − tk+1 )(τ1 f (x1 ) + · · · + τk f (xk )) + tk+1 f (xk+1 ) = t1 f (x1 ) + · · · + tk+1 f (xk+1 ) , as needed. Suppose next that f is strictly convex and equality holds in (9.2) for n = k + 1. Then the inequalities in the preceding display are all equalities. Therefore, xk+1 = u

and

τ1 f (x1 ) + · · · + τk f (xk ) = f (τ1 x1 + · · · + τk xk ) .

Repeating this procedure k − 2 times, we obtain three positive numbers s1 , s2 , s3 such that s1 + s2 + s3 = 1 and x3 = s1 x1 + s2 x2

and f (s1 x1 + s2 x2 ) = s1 f (x1 ) + s2 f (x2 ) ,

where sj = sj /(1 − s3 ) for j = 1, 2. Thus, as f is strictly convex, x1 = x2 and hence x3 = x1 and, as xj+1 is a convex combination of x1 , . . . , xj for j = 2, . . . , k, it follows that x1 = · · · = xk+1 . The converse is selfevident.  For the rest of this section we shall restrict our attention to convex functions f (x) of one real variable that are defined on convex subsets Q of R. Exercise 9.1. Show that if Q is a convex subset of R, then Q must be an interval, i.e., the set of points between a pair of points α ∈ R and β ∈ R, where one or both of these points may or may not belong to Q. [HINT: Let α = inf {x ∈ R : x ∈ Q} and β = sup {x ∈ R : x ∈ Q}.]

9.1. A touch of convex function theory

87

Lemma 9.2. If f (x) is defined on an open subinterval Q of R, then: (1) f is convex if and only if (9.3)

f (b) − f (a) f (b) − f (c) f (c) − f (a) ≤ ≤ c−a b−a b−c for every set of three points a, b, c in Q with a < c < b. (2) f is strictly convex if and only if

(9.4)

f (b) − f (a) f (b) − f (c) f (c) − f (a) < < c−a b−a b−c for every set of three points a, b, c in Q with a < c < b.

Proof. Suppose first that f is strictly convex. Then the inequality f (c) < tf (a) + (1 − t)f (b) for a < b, c = ta + (1 − t)b, and 0 < t < 1 implies that f (c) − f (a) < (1 − t)(f (b) − f (a)) and

t(f (b) − f (a)) < f (b) − f (c) .

The inequalities in (9.4) are obtained by noting that c−a b−c and 1 − t = . b−a b−a Conversely, if (9.4) holds for a < c < b, then c = ta + (1 − t)b for some t ∈ (0, 1) and   c−a c−a f (c) < 1 − f (a) + f (b) = tf (a) + (1 − t)f (b) , b−a b−a c = ta + (1 − t)b =⇒ t =

and hence f is strictly convex. The preceding argument is easily adapted to show that f is convex if and only if (9.3) holds for every choice of points a, b, c ∈ Q with a < c < b; the details are left to the reader.  Exercise 9.2. Show that the inequalities in (9.3) hold for every choice of points a, b, c ∈ Q with a < c < b if and only if f is convex. Theorem 9.3. If f is a convex function that is defined on an open interval (α, β) ⊆ R, then f is automatically continuous and the one-sided derivatives f + (x) = lim ε↓0

f (x + ε) − f (x) ε

and

f − (x) = lim ε↓0

f (x) − f (x − ε) ε

exist at every point x ∈ (α, β). Proof. We first show that each point a ∈ (α, β) has a right derivative f + (a). Choose points a , a and a sequence ε1 > ε2 > · · · > 0 such that α < a < a < · · · < a + εj+1 < a + εj < · · · < a + ε1 < β

88

9. Inequalities

and let μj =

f (a + εj ) − f (a) . εj

Then, by repeated applications of the first inequality in (9.3), we obtain μ1 ≥ μ2 ≥ · · · ≥ μj ≥ μj+1 ≥ · · · ≥

f (a) − f (a ) . a − a

A monotonely decreasing sequence of finite numbers that is bounded below must tend to a limit, and that limit (which must be the same regardless of how the points εj are chosen) is f + (a). A similar argument based on the second inequality in (9.3) shows that (f (b) − f (b − εj ))/εj is a monotonely increasing sequence of numbers that is bounded above and hence that f has a left derivative f − (b) at each point b ∈ (α, β). Thus, as f has both a left and a right derivative at every point in (α, β), it must be continuous. The details are left to the reader.  Exercise 9.3. Show that if f is a convex function on (α, β) ⊆ R, then f has a left derivative at every point in (α, β). Exercise 9.4. Show that if f is a convex function on (α, β) ⊆ R, then f is continuous. [HINT: If ε > 0, then |f (c + ε) − f (c)| ≤ ε|ε−1 (f (c + ε) − f (c)) − f + (c)| + ε|f + (c)|. ] Our next objective is to establish a practical way of checking whether or not a given function f (x) of one variable is convex, at least for functions in the class C 2 (Q), where C k (Q) = {f with k continuous derivatives in the open interval Q ⊆ R} . To attain this objective we need a preliminary result. Theorem 9.4. Let Q = (α, β) be an open subinterval of R and let f ∈ C 1 (Q). Then: (1) f (x) is convex on Q if and only if f  (x) ≤ f  (y) for every pair of points x, y ∈ Q with x < y. (2) f (x) is strictly convex on Q if and only if f  (x) < f  (y) for every pair of points x, y ∈ Q with x < y. Moreover, if f is convex and (9.5)

f  (x) = 0 for some x ∈ Q, then f (x) ≤ f (y) for every y ∈ Q;

if f is strictly convex and (9.6)

f  (x1 ) = f  (x2 ) = 0 for x1 , x2 ∈ Q, then x1 = x2 .

9.1. A touch of convex function theory

89

Proof. Suppose first that f (x) is strictly convex on Q and let a < c < b be three points in Q. Then, upon letting c ↓ a in the first inequality in (9.4) and c ↑ b in the second inequality in (9.4), it is readily seen (with the help of Exercise 9.5, to justify strict inequalities) that f  (a)
0, it follows that (f (c) − f (a))(b − c) < (f (b) − f (c))(c − a) .

But this in turn implies that f (c)(b − a) < (b − c)f (a) + (c − a)f (b) , which, upon setting c = ta + (1 − t)b for any choice of t ∈ (0, 1), is easily seen to be equivalent to the requisite condition for strict convexity. This completes the proof of (2). Suppose next that f is convex on Q and that a, b ∈ Q and f  (a) = 0. Then, as the inequality f (a + t(b − a)) ≤ f (a) + t(f (b) − f (a)) is valid for every t ∈ (0, 1), f  (a) = lim t↓0

f (a + t(b − a)) − f (a) ≤ f (b) − f (a) t

and hence (9.5) holds. Thus, if f is strictly convex and f (a1 ) ≤ f (b) and f (a2 ) ≤ f (b) for every point b ∈ Q, then f (a1 ) = f (a2 ) and hence if a1 = a2 and 0 < t < 1, then f (ta1 + (1 − t)a2 ) < tf (a1 ) + (1 − t)f (a2 ) = f (a1 ), which is not possible. Therefore, (9.6) holds. The verification of (1) is left to the reader.  Exercise 9.5. Show that if f is strictly convex on (α, β) ⊆ R and if α < a < a + ε1 < b < β and ε1 > ε2 > · · · tends to zero, then f (a + εj ) − f (a) f (a + εj+1 ) − f (a) f (b) − f (a) < < εj+1 εj b−a f (b) − f (b − εj+1 ) f (b) − f (b − εj ) < . < εj εj+1

90

9. Inequalities

Exercise 9.6. Show that if Q = (α, β) is an open subinterval of R and f ∈ C 2 (Q), then (9.7)

is convex on Q if and only if f  (x) ≥ 0

f

for every x ∈ Q .

Exercise 9.7. Show that if Q = (α, β) is an open subinterval of R and if f ∈ C 2 (Q), then (9.8)

f

is strictly convex on Q if f  (x) > 0

Exercise 9.8. Show that

x4

for every x ∈ Q .

is strictly convex on R even though f  (0) = 0.

Exercise 9.9. Show that x^r is convex on (0, ∞) if and only if r ≥ 1 or r ≤ 0 and that −x^r is convex on (0, ∞) if and only if 0 ≤ r ≤ 1.

Exercise 9.10. Show that e^x is strictly convex on R and that − ln x and x^r with 1 < r < ∞ are strictly convex on (0, ∞).

Exercise 9.11. Show that if f(x) is convex on an interval Q and g(y) is a nondecreasing convex function on an interval that contains {f(x) : x ∈ Q}, then g(f(x)) is convex on Q.

Exercise 9.12. Show that if f ∈ C¹((0, ∞)) ∩ C([0, ∞)) is convex and f(0) = 0, then ∫_0^x f(s) ds ≤ x² f′(x)/2 for every x > 0. [HINT: Exploit (9.7).]

Exercise 9.13. Show that if a_1, ..., a_n and t_1, ..., t_n are two sequences of positive numbers and t_1 + ··· + t_n = 1, then

(9.9)    (Σ_{j=1}^n t_j a_j)^r ≤ Σ_{j=1}^n t_j a_j^r   when r ≥ 1.

Example 9.1. If a_1, ..., a_n and t_1, ..., t_n are two sequences of positive numbers such that t_1 + ··· + t_n = 1, then

(9.10)    a_1^{t_1} ··· a_n^{t_n} ≤ t_1 a_1 + ··· + t_n a_n,   with equality ⇐⇒ a_1 = ··· = a_n.

In particular, the geometric mean of a given set of positive numbers a_1, ..., a_n is less than or equal to its arithmetic mean, i.e.,

(9.11)    (a_1 a_2 ··· a_n)^{1/n} ≤ (a_1 + a_2 + ··· + a_n)/n.

Since − ln x is a strictly convex function of x on the interval (0, ∞), Lemma 9.1 ensures that − ln(t_1 a_1 + ··· + t_n a_n) ≤ −(t_1 ln a_1 + ··· + t_n ln a_n) = − ln(a_1^{t_1} ··· a_n^{t_n}), with equality if and only if a_1 = ··· = a_n. But this is easily seen to be equivalent to (9.10). □

Exercise 9.14. Show that if A ∈ C^{n×n} has nonnegative eigenvalues, then

(9.12)    (det A)^{1/n} ≤ (trace A)/n.
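For readers who like to experiment, here is a brief numerical sketch of (9.10) and (9.12); it is not part of the text and assumes only that NumPy is available and that the data are randomly generated for illustration.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 5
    a = rng.uniform(0.1, 10.0, n)            # positive numbers a_1, ..., a_n
    t = rng.uniform(0.1, 1.0, n)
    t = t / t.sum()                          # weights with t_1 + ... + t_n = 1

    geometric = np.prod(a ** t)              # a_1^{t_1} ... a_n^{t_n}
    arithmetic = np.dot(t, a)                # t_1 a_1 + ... + t_n a_n
    assert geometric <= arithmetic + 1e-12   # inequality (9.10)

    # (9.12): if A has nonnegative eigenvalues, then (det A)^{1/n} <= trace(A)/n
    M = rng.standard_normal((n, n))
    A = M @ M.T                              # positive semidefinite, eigenvalues >= 0
    assert np.linalg.det(A) ** (1 / n) <= np.trace(A) / n + 1e-12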


9.2. Four inequalities

If s > 1 and t > 1, then it is readily checked that

(9.13)    1/s + 1/t = 1 ⇐⇒ (s − 1)(t − 1) = 1 ⇐⇒ (s − 1)t = s ⇐⇒ (t − 1)s = t.

Lemma 9.5. If c > 0, d > 0, s > 1, t > 1, and (s − 1)(t − 1) = 1, then

(9.14)    cd ≤ c^s/s + d^t/t,   with equality if and only if c^s = d^t.

Proof. This is just (9.10) with n = 2, t_1 = 1/s, t_2 = 1/t, c = a_1^{t_1}, and d = a_2^{t_2}; it is treated in detail in the discussion of Example 9.1. □

Theorem 9.6 (Hölder). If s > 1, t > 1, and (s − 1)(t − 1) = 1 and if a, b ∈ C^n with components a_1, ..., a_n and b_1, ..., b_n, respectively, then

(9.15)    Σ_{k=1}^n |a_k b_k| ≤ (Σ_{k=1}^n |a_k|^s)^{1/s} (Σ_{k=1}^n |b_k|^t)^{1/t}.

Moreover, equality will prevail in (9.15) if and only if

(9.16)    either a = 0, or a ≠ 0 and |b_k|^t = μ|a_k|^s for k = 1, ..., n and some μ ≥ 0.

Proof. Suppose first that a ≠ 0 and b ≠ 0 and let

    α_k = a_k / (Σ_{j=1}^n |a_j|^s)^{1/s}   and   β_k = b_k / (Σ_{j=1}^n |b_j|^t)^{1/t}.

Then

    Σ_{k=1}^n |α_k|^s = 1   and   Σ_{k=1}^n |β_k|^t = 1,

and hence, in view of Lemma 9.5,

(9.17)    Σ_{k=1}^n |α_k β_k| ≤ Σ_{k=1}^n |α_k|^s/s + Σ_{k=1}^n |β_k|^t/t = 1/s + 1/t = 1.

This yields the desired inequality because

    Σ_{k=1}^n |α_k β_k| = Σ_{k=1}^n |a_k b_k| / [(Σ_{j=1}^n |a_j|^s)^{1/s} (Σ_{j=1}^n |b_j|^t)^{1/t}].


Moreover, equality will prevail in (9.15) if and only if it prevails in (9.17), i.e., if and only if

(9.18)    |α_i β_i| = |α_i|^s/s + |β_i|^t/t   for i = 1, ..., n.

If α_k β_k ≠ 0, then Lemma 9.5 implies that (9.18) holds for i = k if and only if

    |α_k|^s = |β_k|^t ⇐⇒ |b_k|^t = μ|a_k|^s   with μ = Σ_{j=1}^n |b_j|^t / Σ_{j=1}^n |a_j|^s.

On the other hand, if α_k β_k = 0, then equality holds in (9.18) for i = k if and only if α_k = 0 and β_k = 0, i.e., if and only if a_k = 0 and |b_k|^t = μ|a_k|^s. This completes the proof when a ≠ 0 and b ≠ 0. This leaves two cases to consider: (1) a ≠ 0 and b = 0; (2) a = 0. Equality holds in (9.15) for both, and both are covered by (9.16). □

Exercise 9.15. Show that if a, b ∈ C, then

(9.19)    |a + b| = |a| + |b| ⇐⇒ either a = 0, or a ≠ 0 and b = μa for some μ ≥ 0.

[HINT: If a ≠ 0, then b/a = re^{iθ} and (9.19) holds ⇐⇒ |1 + re^{iθ}| = 1 + r.]

Exercise 9.16. Show that if c ∈ C^n with entries c_1, ..., c_n and c ≠ 0, then

(9.20)    |Σ_{j=1}^n c_j| = Σ_{j=1}^n |c_j| ⇐⇒ c_j = e^{iθ}|c_j| for j = 1, ..., n and some θ ∈ [0, 2π).

[HINT: Suppose c_k ≠ 0 and deduce from (9.20) that |c_k + c_j| = |c_k| + |c_j| for j = 1, ..., n and hence, in view of Exercise 9.15, that c_j = t_j c_k = t_j |c_k| e^{iθ} = |c_j| e^{iθ} with t_j ≥ 0 for some θ ∈ [0, 2π) and j = 1, ..., n.]

The next exercise supplements Theorem 9.6.

Exercise 9.17. Show that if, in the setting of Theorem 9.6, Σ_{k=1}^n |a_k| > 0, then

(9.21)    |Σ_{k=1}^n a_k b_k| = (Σ_{k=1}^n |a_k|^s)^{1/s} (Σ_{k=1}^n |b_k|^t)^{1/t}

if and only if |b_k|^t = μ|a_k|^s for some choice of μ ≥ 0 and a_k b_k = e^{iθ}|a_k b_k| for k = 1, ..., n and some choice of θ ∈ [0, 2π). [HINT: Exploit the implications of Theorem 9.6 and Exercises 9.15 and 9.16.]
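As a supplementary illustration (not from the text; NumPy assumed), the following sketch checks Hölder's inequality (9.15) on random complex vectors and then builds a vector b with |b_k|^t = μ|a_k|^s and a common argument for the products a_k b_k, so that equality is attained as in Exercise 9.17.

    import numpy as np

    rng = np.random.default_rng(1)
    n, s = 6, 3.0
    t = s / (s - 1.0)                        # so that (s - 1)(t - 1) = 1

    a = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    b = rng.standard_normal(n) + 1j * rng.standard_normal(n)

    lhs = np.sum(np.abs(a * b))
    rhs = np.sum(np.abs(a) ** s) ** (1 / s) * np.sum(np.abs(b) ** t) ** (1 / t)
    assert lhs <= rhs + 1e-12                # inequality (9.15)

    # equality case: |b_k|^t = mu |a_k|^s and a_k b_k = e^{i theta} |a_k b_k|
    mu, theta = 2.0, 0.7
    b_eq = np.exp(1j * theta) * (mu * np.abs(a) ** s) ** (1 / t) * np.conj(a) / np.abs(a)
    lhs_eq = np.abs(np.sum(a * b_eq))
    rhs_eq = np.sum(np.abs(a) ** s) ** (1 / s) * np.sum(np.abs(b_eq) ** t) ** (1 / t)
    assert np.isclose(lhs_eq, rhs_eq)        # equality as in (9.21)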


If s = t = 2, then (9.15) reduces to the well-known Cauchy-Schwarz inequality:

Theorem 9.7 (Cauchy-Schwarz). If a, b ∈ C^n with entries a_1, ..., a_n and b_1, ..., b_n, respectively, then

(9.22)    |Σ_{k=1}^n a_k \bar{b}_k| ≤ (Σ_{k=1}^n |a_k|²)^{1/2} (Σ_{k=1}^n |b_k|²)^{1/2},

with equality if and only if

(9.23)    either a = 0, or a ≠ 0 and b = βa for some β ∈ C.

Proof. The inequality is immediate from (9.15) with s = t = 2. Let f(a, b) denote the right-hand side of (9.22). If equality holds in (9.22), then

    f(a, b) = |Σ_{k=1}^n a_k \bar{b}_k| ≤ Σ_{k=1}^n |a_k b_k| ≤ f(a, b).

Therefore, equality prevails throughout, and hence, if a ≠ 0, then, in view of (9.20) and Theorem 9.6, there exists a θ ∈ [0, 2π) and a μ ≥ 0 such that

    a_k \bar{b}_k = e^{iθ}|a_k b_k|   and   |b_k| = μ|a_k|   for k = 1, ..., n.

Thus, if a_k ≠ 0, then

    \bar{b}_k = e^{iθ} |a_k b_k|/a_k = e^{iθ} μ |a_k|²/a_k = e^{iθ} μ \bar{a}_k,   i.e.,   b_k = e^{−iθ} μ a_k.

Thus equality in (9.22) implies that (9.23) is in force. Since the converse implication is self-evident, the proof is complete. □

Exercise 9.18. Show that if α, β ∈ R and θ ∈ [0, 2π), then α cos θ + β sin θ ≤ √(α² + β²) and that the upper bound is achieved for some choice of θ.

Theorem 9.8 (Minkowski). If a ∈ C^n with entries a_1, ..., a_n, b ∈ C^n with entries b_1, ..., b_n, and 1 ≤ s < ∞, then

(9.24)    (Σ_{k=1}^n |a_k + b_k|^s)^{1/s} ≤ (Σ_{k=1}^n |a_k|^s)^{1/s} + (Σ_{k=1}^n |b_k|^s)^{1/s}.

Moreover, if 1 < s < ∞, then equality holds in (9.24) if and only if either a = 0, or a ≠ 0 and b = μa for some μ ≥ 0. If s = 1, then equality holds in (9.24) if and only if for each index k, either a_k = 0, or a_k ≠ 0 and b_k = μ_k a_k for some μ_k ≥ 0.


Proof. The inequality for s = 1 is an immediate consequence of the fact that for every pair of complex numbers a and b, |a + b| ≤ |a| + |b|; the conditions for equality follow from Exercise 9.15. The proof for 1 < s < ∞ is divided into parts.

1. Verification of the inequality (9.24) for 1 < s < ∞. If s > 1, then

(9.25)    Σ_{k=1}^n |a_k + b_k|^s = Σ_{k=1}^n |a_k + b_k|^{s−1} |a_k + b_k| ≤ Σ_{k=1}^n |a_k + b_k|^{s−1} (|a_k| + |b_k|).

By Hölder's inequality,

(9.26)    Σ_{k=1}^n |a_k + b_k|^{s−1} |a_k| ≤ (Σ_{k=1}^n |a_k + b_k|^{(s−1)t})^{1/t} (Σ_{k=1}^n |a_k|^s)^{1/s}

and

(9.27)    Σ_{k=1}^n |a_k + b_k|^{s−1} |b_k| ≤ (Σ_{k=1}^n |a_k + b_k|^{(s−1)t})^{1/t} (Σ_{k=1}^n |b_k|^s)^{1/s}

for t = s/(s − 1). Since (s − 1)t = s, the last three inequalities imply that

    Σ_{k=1}^n |a_k + b_k|^s ≤ (Σ_{k=1}^n |a_k + b_k|^s)^{1/t} {(Σ_{k=1}^n |a_k|^s)^{1/s} + (Σ_{k=1}^n |b_k|^s)^{1/s}}.

Now, if Σ_{k=1}^n |a_k + b_k|^s > 0, then we can divide both sides of the last inequality by {Σ_{k=1}^n |a_k + b_k|^s}^{1/t} to obtain the desired inequality (9.24). On the other hand, if Σ_{k=1}^n |a_k + b_k|^s = 0, then the inequality (9.24) is self-evident.

2. Verification of the conditions for equality in (9.24) for 1 < s < ∞. Clearly equality holds in (9.24) if a = 0. Suppose next that a ≠ 0 and let c ∈ C^n with components c_1 = a_1 + b_1, ..., c_n = a_n + b_n. A necessary condition for equality in (9.24) to hold when a ≠ 0 is that c ≠ 0. Moreover, if equality


holds in (9.24), then, in view of (9.25)–(9.27),

    Σ_{k=1}^n |c_k|^s ≤ Σ_{k=1}^n |c_k|^{s−1} (|a_k| + |b_k|) = Σ_{k=1}^n |c_k|^{s−1}|a_k| + Σ_{k=1}^n |c_k|^{s−1}|b_k|
              ≤ (Σ_{k=1}^n |c_k|^s)^{1/t} {(Σ_{k=1}^n |a_k|^s)^{1/s} + (Σ_{k=1}^n |b_k|^s)^{1/s}}
              = (Σ_{k=1}^n |c_k|^s)^{1/t} (Σ_{k=1}^n |c_k|^s)^{1/s} = Σ_{k=1}^n |c_k|^s.

Thus, if equality holds in (9.24), then it also holds in (9.25)–(9.27). Then Theorem 9.6 ensures that equality holds in (9.26) if and only if |a_k|^s = α|c_k|^{(s−1)t} for k = 1, ..., n and some α ≥ 0. Similarly, equality holds in (9.27) if and only if |b_k|^s = β|c_k|^{(s−1)t} for k = 1, ..., n and some β ≥ 0. Moreover, since a ≠ 0 by assumption, α > 0. Thus, as (s − 1)t = s,

    |b_k|^s = β|c_k|^s = (β/α) α|c_k|^s = (β/α)|a_k|^s   for k = 1, ..., n.

Therefore,

(9.28)    |b_k| = μ|a_k|   for k = 1, ..., n and some μ ≥ 0.

Equality in (9.25) implies that |c_k|^{s−1} |a_k + b_k| = |c_k|^{s−1} (|a_k| + |b_k|) for k = 1, ..., n. If a_k = 0, then this is automatically so. If a_k ≠ 0, then the already established equality |a_k|^s = α|c_k|^s ensures that c_k ≠ 0 and hence that |a_k + b_k| = |a_k| + |b_k|. But then (9.19) ensures that b_k = t_k a_k for some t_k ≥ 0. Thus, in view of (9.28), a_k = 0 =⇒ b_k = 0 and a_k ≠ 0 =⇒ t_k = μ. Consequently, b_k = μa_k in both cases; i.e., if a ≠ 0, then b = μa for some μ ≥ 0. □

Remark 9.9. The inequality (9.14) is a special case of a more general statement that is usually referred to as Young's inequality: If b_1, ..., b_n and p_1, ..., p_n are positive numbers and 1/p_1 + ··· + 1/p_n = 1, then

(9.29)    b_1 ··· b_n ≤ b_1^{p_1}/p_1 + ··· + b_n^{p_n}/p_n,

which is equivalent to (9.10).
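A minimal numerical sketch of Minkowski's inequality (9.24) and Young's inequality (9.29); it is not part of the text, assumes NumPy, and uses randomly generated data purely for illustration.

    import numpy as np

    rng = np.random.default_rng(2)
    n, s = 8, 2.5

    a = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    b = rng.standard_normal(n) + 1j * rng.standard_normal(n)

    def ell_s(x, s):
        # the quantity (sum |x_k|^s)^{1/s}
        return np.sum(np.abs(x) ** s) ** (1 / s)

    assert ell_s(a + b, s) <= ell_s(a, s) + ell_s(b, s) + 1e-12    # (9.24)

    # Young's inequality (9.29) with 1/p_1 + ... + 1/p_n = 1
    p = np.array([2.0, 3.0, 6.0])                                  # 1/2 + 1/3 + 1/6 = 1
    c = rng.uniform(0.1, 5.0, 3)
    assert np.prod(c) <= np.sum(c ** p / p) + 1e-12                # (9.29)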


9.3. The Krein-Milman theorem A point c in a convex set Q is said to be an extreme point of Q if 0 < t < 1, a, b ∈ Q, and c = ta + (1 − t)b =⇒ a = b = c . Theorem 9.10 (Krein-Milman). If Q is a closed convex subset of R n such that Q ⊂ {x ∈ R n : xH x ≤ R} for some R < ∞, then every vector in Q is a convex combination of the extreme points of Q. 

Proof. See, e.g., Section 22.8 in [30].

In fact, since Q ⊂ R^n, every vector in Q is a convex combination of a finite number of extreme points of Q. An argument of Carathéodory serves to bound the number of extreme points needed:

Theorem 9.11. If Q is a closed convex subset of R^n such that Q ⊂ {x ∈ R^n : x^H x ≤ R} for some R < ∞, then every vector in Q is a convex combination of at most n + 1 extreme points of Q.

Proof. If u ∈ Q, then, in view of Theorem 9.10, u = Σ_{j=1}^k t_j v_j is a convex combination of k extreme points v_1, ..., v_k of Q. If k > n + 1, then the vectors [1  v_1^T]^T, ..., [1  v_k^T]^T are linearly dependent. Therefore, there exists a set of numbers s_1, ..., s_k, not all of which are zero, such that

    Σ_{j=1}^k s_j [1 ; v_j] = 0_{n+1},   i.e.,   Σ_{j=1}^k s_j = 0   and   Σ_{j=1}^k s_j v_j = 0_n.

Since we may assume that t_j > 0 for j = 1, ..., k, there exists a number μ > 0 such that t_j − μs_j ≥ 0 for j = 1, ..., k, with equality for at least one index j. Thus, as Σ_{j=1}^k (t_j − μs_j) = 1,

    u = Σ_{j=1}^k (t_j − μs_j) v_j

is a convex combination of at most k − 1 extreme points. The argument can be repeated until a representation with at most n + 1 extreme points is obtained. □

Exercise 9.19. Show that if v_1, ..., v_k are linearly independent vectors in R^n and Q = v_1 + {Σ_{j=2}^k t_j v_j : t_j ≥ 0 and Σ_{j=2}^k t_j = 1}, then Q is a convex combination of k − 1 extreme points.

Exercise 9.20. Show that if f(x) is a real-valued linear function on R^n (i.e., f(cu + v) = cf(u) + f(v) for u, v ∈ R^n and c ∈ R) and Q is as in Exercise 9.19 with extreme points u_1, ..., u_{k−1}, then max{f(u) : u ∈ Q} = max_j f(u_j).


9.4. Supplementary notes

For general versions of the Krein-Milman theorem see, e.g., Bollobás [12]. Exercise 9.19 illustrates a refinement of Theorem 9.11 for convex subsets of R^n having (appropriately defined) dimension less than n; see, e.g., Theorem 8.11 in Simon [70]. Exercise 9.12 was posed by Andrica [4].

Chapter 10

Normed linear spaces

This chapter introduces normed linear spaces and surveys a number of their basic properties. In the final section it is shown that left invertibility and right invertibility of a matrix are preserved under small perturbations, though rank is not.

10.1. Normed linear spaces

A vector space U is said to be a normed linear space if there exists a number ϕ(x) assigned to each vector x ∈ U such that for every choice of x, y ∈ U and every scalar α the following four conditions are met:
(1) ϕ(x) ≥ 0.
(2) ϕ(x) = 0 if and only if x = 0.
(3) ϕ(αx) = |α|ϕ(x).
(4) ϕ(x + y) ≤ ϕ(x) + ϕ(y).
Every function ϕ(x) that meets these four conditions is called a norm and is usually denoted by the symbol ‖x‖, or by the symbol ‖x‖_U if it is desired to clarify the space under consideration. The inequality in (4) is called the triangle inequality.

Lemma 10.1. Let U be a normed linear space with norm ϕ(x). Then |ϕ(x) − ϕ(y)| ≤ ϕ(x − y) for every choice of x and y in U (and hence ϕ is continuous).


Proof. The triangle inequality implies that ϕ(x) = ϕ(x − y + y) ≤ ϕ(x − y) + ϕ(y). Therefore, ϕ(x) − ϕ(y) ≤ ϕ(x − y) and, since x and y may be interchanged, the supplementary inequality ϕ(y) − ϕ(x) ≤ ϕ(y − x) = ϕ(x − y) also holds. Thus, −ϕ(x − y) ≤ ϕ(x) − ϕ(y) ≤ ϕ(x − y), which is equivalent to the stated inequality.



In the special case that U = C^n or U = R^n the classical norms are

(10.1)    ‖x‖_s = max{|x_j| : 1 ≤ j ≤ n} if s = ∞,   and   ‖x‖_s = (Σ_{j=1}^n |x_j|^s)^{1/s} if 1 ≤ s < ∞.

Exercise 10.1. Verify that ‖x‖_s defines a norm on C^n for each choice of s, 1 ≤ s ≤ ∞. [HINT: Use Minkowski's inequality (9.24) to justify the triangle inequality when 1 < s < ∞.]

Exercise 10.2. Let U be a vector space with basis u_1, ..., u_n. Show that for each choice of s in the interval 1 ≤ s < ∞ the formula

    ϕ(Σ_{j=1}^n x_j u_j) = {Σ_{j=1}^n |x_j|^s}^{1/s}

defines a norm on U. [HINT: See the hint in Exercise 10.1.]

The next exercise illustrates a special case of the general principle that in a finite-dimensional normed linear space all norms are equivalent, i.e., if ϕ(u) and ψ(u) are norms in a finite-dimensional normed linear space U, then there exists a pair of positive constants α and β such that

(10.2)    αψ(u) ≤ ϕ(u) ≤ βψ(u).

Exercise 10.3. Show that if x ∈ C^n, then

    (1/n)‖x‖_1 ≤ ‖x‖_∞ ≤ ‖x‖_2 ≤ ‖x‖_1 ≤ n‖x‖_∞.

Exercise 10.4. Show that if x ∈ C^n, s ≥ 1, and t ≥ 0, then

(10.3)    ‖x‖_1 ≥ ‖x‖_s ≥ ‖x‖_{s+t} ≥ ‖x‖_∞

for each vector x ∈ C^n. [HINT: If y_j = (‖x‖_s)^{−1}|x_j|, then 0 ≤ y_j ≤ 1 and Σ_{j=1}^n y_j^{s+t} ≤ Σ_{j=1}^n y_j^s = 1.]
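The chain (10.3) is easy to test numerically; the sketch below (NumPy assumed; not part of the text) evaluates the classical norms of (10.1) for a random complex vector.

    import numpy as np

    rng = np.random.default_rng(3)
    x = rng.standard_normal(7) + 1j * rng.standard_normal(7)

    def norm_s(x, s):
        # the classical norms of (10.1); s = np.inf gives the max norm
        return np.max(np.abs(x)) if np.isinf(s) else np.sum(np.abs(x) ** s) ** (1 / s)

    values = [norm_s(x, s) for s in (1, 2.5, 4.0, np.inf)]
    assert all(values[i] >= values[i + 1] - 1e-12 for i in range(3))   # (10.3)
    assert norm_s(x, 1) <= len(x) * norm_s(x, np.inf)                  # part of Exercise 10.3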


Exercise 10.5. Show that

(10.4)    Σ_{j=1}^n |a_j|^t ≥ (Σ_{j=1}^n |a_j|)^t   if 0 < t < 1.

[HINT: 0 < t < 1 if and only if 1 < 1/t < ∞.]

Exercise 10.6. Show that lim_{s↑∞} ‖x‖_s = ‖x‖_∞ for each vector x ∈ C^n. [HINT: ‖x‖_s ≤ n^{1/s}‖x‖_∞.]

Exercise 10.7. Show that ‖x‖_3 is equivalent to ‖x‖_1. [HINT: Exploit the preceding exercises.]

The most important norms in C^n and R^n are ‖x‖_1, ‖x‖_2, and ‖x‖_∞; the choice s = 2 yields the familiar Euclidean norm:

(10.5)    ‖x‖_2 = {Σ_{j=1}^n |x_j|²}^{1/2}.

Remark 10.2. Even though all norms on a finite-dimensional normed linear space are equivalent in the sense noted above, particular choices may be most appropriate for certain applications. Thus, for example, if the entries u_i in a vector u ∈ R^n denote deviations from a navigational path (such as a channel through shallow waters) at successive increments of time, it's important to keep ‖u‖_∞ small. If a, b ∈ R², then although ‖a − b‖_2 is equal to the usual Euclidean distance between the points a and b, the norm ‖a − b‖_1 might give a better indication of the driving distance.

10.2. The vector space of matrices A

There are many ways to define a norm on the vector space C^{p×q}. One could for example view a matrix A ∈ C^{p×q} as a funny way to record the pq entries a_{ij} of a vector in the space C^{pq}. Then

(10.6)    ‖A‖_s = {Σ_{i=1}^p Σ_{j=1}^q |a_{ij}|^s}^{1/s} if 1 ≤ s < ∞,   and   ‖A‖_s = max{|a_{ij}| : i = 1, ..., p, j = 1, ..., q} if s = ∞.

We shall, however, be primarily interested in other norms which reflect the action of A as a linear transformation. We begin with a preliminary estimate:

Lemma 10.3. If A ∈ C^{p×q}, u ∈ C^q, and 1 ≤ s ≤ ∞, then

(10.7)    ‖Au‖_t ≤ {Σ_{i=1}^p (Σ_{j=1}^q |a_{ij}|)^t}^{1/t} ‖u‖_s   for 1 ≤ t < ∞

and

(10.8)    ‖Au‖_∞ ≤ max_i {Σ_{j=1}^q |a_{ij}|} ‖u‖_s ≤ q‖A‖_∞ ‖u‖_s.

Proof. If 1 ≤ t < ∞, then

    |Σ_{j=1}^q a_{ij} u_j|^t ≤ (Σ_{j=1}^q |a_{ij} u_j|)^t ≤ (Σ_{j=1}^q |a_{ij}|)^t ‖u‖_s^t,

since |u_j| ≤ ‖u‖_s when 1 ≤ s ≤ ∞. But this leads easily to (10.7). The verification of (10.8) is left to the reader. □

Exercise 10.8. Show that if A ∈ C^{p×q}, u ∈ C^q, and 1 ≤ t ≤ ∞, then

(10.9)    ‖Au‖_t ≤ ‖A‖_{t′} ‖u‖_t,

where t′ = t/(t − 1) if 1 < t < ∞, t′ = ∞ if t = 1, and t′ = 1 if t = ∞. [HINT: Hölder's inequality is the key.]

Lemma 10.3 ensures that there exists a finite positive number γ_{t,s} such that

(10.10)    ‖Au‖_t ≤ γ_{t,s} ‖u‖_s   for every vector u ∈ C^q and 1 ≤ s ≤ ∞.

Thus, the function f(u) = ‖Au‖_t is continuous in C^q with respect to the norm ‖·‖_s. Therefore it attains its maximum value on the set {u ∈ C^q : ‖u‖_s = 1}, since this is a closed bounded set in a finite-dimensional space. Consequently, for every pair of numbers 1 ≤ s, t ≤ ∞, there exists at least one vector u_max ∈ C^q with ‖u_max‖_s = 1 such that

(10.11)    ‖Au‖_t ≤ ‖Au_max‖_t   for every u ∈ C^q with ‖u‖_s = 1.

Let ‖A‖_{s,t} = ‖Au_max‖_t. Then

(10.12)    ‖A‖_{s,t} = max{‖Au‖_t : u ∈ C^q and ‖u‖_s = 1}

for every pair of numbers 1 ≤ s, t ≤ ∞. (Some evaluations of ‖A‖_{s,t} are furnished in Section 10.3.)

Theorem 10.4. If A ∈ C^{p×q} and 1 ≤ s, t ≤ ∞, then the number ‖A‖_{s,t} that is defined by formula (10.12) defines a norm on C^{p×q}. Moreover,

(10.13)    ‖Au‖_t ≤ ‖A‖_{s,t} ‖u‖_s   for every vector u ∈ C^q,

and ‖A‖_{s,t} may also be evaluated by each of the following two supplementary recipes:

(10.14)    ‖A‖_{s,t} = max{‖Au‖_t : u ∈ C^q and ‖u‖_s ≤ 1}
(10.15)              = max{‖Au‖_t / ‖u‖_s : u ∈ C^q and u ≠ 0}.


Proof. Let α, β, and γ denote the right-hand sides of (10.12), (10.14), and (10.15), respectively, and note that if u is a nonzero vector in C^q and v = u/‖u‖_s, then, as ‖v‖_s = 1,

    ‖Au‖_t = (‖Au‖_t / ‖u‖_s) ‖u‖_s = ‖Av‖_t ‖u‖_s ≤ α‖u‖_s.

This clearly implies that γ ≤ α and also serves to justify (10.13), since the inequality ‖Au‖_t ≤ α‖u‖_s is also valid for u = 0. On the other hand, if ‖u‖_s ≤ 1 and u ≠ 0, then

    ‖Au‖_t ≤ ‖Au‖_t / ‖u‖_s ≤ γ,

which implies that β ≤ γ, and hence, in view of the already established bound γ ≤ α and the self-evident inequality α ≤ β, that α ≤ β ≤ γ ≤ α. Thus, α = β = γ.

We still need to verify that ‖A‖_{s,t} really defines a norm on C^{p×q}. The conditions ‖A‖_{s,t} ≥ 0 and ‖αA‖_{s,t} = |α| ‖A‖_{s,t} for α ∈ C are self-evident. Moreover, if ‖A‖_{s,t} = 0, then, in view of the bound (10.13), ‖Au‖_t = 0 for every vector u ∈ C^q. Therefore Au = 0 for every vector u ∈ C^q and hence ‖A‖_{s,t} = 0 =⇒ A = O. It remains only to check the triangle inequality. But if also B ∈ C^{p×q}, then ‖(A + B)u‖_t ≤ ‖Au‖_t + ‖Bu‖_t ≤ ‖A‖_{s,t}‖u‖_s + ‖B‖_{s,t}‖u‖_s, which implies that ‖A + B‖_{s,t} ≤ ‖A‖_{s,t} + ‖B‖_{s,t}, as needed to complete the proof. □

To put the preceding discussion into context, we define the operator norm of a linear transformation T from a finite-dimensional normed linear space U into a normed linear space V by the formula

(10.16)    ‖T‖_{U,V} = max{‖Tu‖_V : u ∈ U and ‖u‖_U = 1}.

Thus, ‖A‖_{s,t} is the operator norm for the linear transformation that sends vectors u in the normed linear space C^q with norm ‖u‖_s to vectors v = Au in the normed linear space C^p with norm ‖v‖_t. The assumption that U is finite dimensional is essential to ensure the existence of a vector u_max ∈ U with ‖u_max‖_U = 1 such that ‖T‖_{U,V} = ‖Tu_max‖_V (or, in the setting discussed above, that (10.11) holds); see Exercise 10.9.

Exercise 10.9. The vector space U of infinite column vectors u with entries u_1, u_2, ... that meet the constraint Σ_{j=1}^∞ |u_j|² < ∞ is a normed linear space with norm ‖u‖ = {Σ_{i=1}^∞ |u_i|²}^{1/2}. Let T denote the operator that


maps u into the vector Tu with components a_1u_1, a_2u_2, ..., where 0 < a_j < 1 for all j and lim_{j↑∞} a_j = 1. Show that sup{‖Tu‖ : ‖u‖ = 1} = 1, but that there does not exist a nonzero vector u ∈ U such that ‖Tu‖ = ‖u‖.

Exercise 10.10. Show that if T is a linear transformation from a finite-dimensional normed linear space U into a normed linear space V, then

(10.17)    ‖T‖_{U,V} = max{‖Tu‖_V : u ∈ U and ‖u‖_U ≤ 1} = max{‖Tu‖_V / ‖u‖_U : u ∈ U and u ≠ 0}.

Exercise 10.11. Show that if A ∈ C^{p×1}, then ‖A‖_{s,t} = ‖A‖_t.

The next theorem gives useful bounds.

Theorem 10.5. If A ∈ C^{p×q}, u ∈ {1, ..., p}, and v ∈ {1, ..., q}, then

(10.18)    |a_{uv}| ≤ ‖A‖_{s,t} ≤ Σ_{i=1}^p Σ_{j=1}^q |a_{ij}|   for 1 ≤ s, t ≤ ∞,

i.e.,

(10.19)    ‖A‖_∞ ≤ ‖A‖_{s,t} ≤ ‖A‖_1   for 1 ≤ s, t ≤ ∞.

Proof. Let E_{ij} denote the p × q matrix with a 1 in the ij place and 0's elsewhere. Then

    ‖A‖_{s,t} = ‖Σ_{i=1}^p Σ_{j=1}^q a_{ij} E_{ij}‖_{s,t} ≤ Σ_{i=1}^p Σ_{j=1}^q |a_{ij}| ‖E_{ij}‖_{s,t}.

To evaluate ‖E_{ij}‖_{s,t}, let e_1, ..., e_p denote the columns of I_p and let f_1, ..., f_q denote the columns of I_q. Then, since E_{ij} = e_i f_j^T,

    ‖E_{ij}u‖_t = ‖e_i f_j^T u‖_t = |u_j| ‖e_i‖_t = |u_j| ≤ ‖u‖_s

for every choice of s, 1 ≤ s ≤ ∞. Moreover, equality is achieved by choosing u = f_j. Thus, ‖E_{ij}‖_{s,t} = 1. This establishes the upper bound in (10.18). On the other hand, if a_v denotes the v'th column of A, then ‖A‖_{s,t} ≥ ‖Af_v‖_t = ‖a_v‖_t ≥ |a_{uv}| for 1 ≤ s, t ≤ ∞ and every choice of u ∈ {1, ..., p}. □


Corollary 10.6. If A, B ∈ C^{p×q}, u ∈ {1, ..., p}, and v ∈ {1, ..., q}, then

(10.20)    |a_{uv} − b_{uv}| ≤ ‖A − B‖_{s,t} ≤ Σ_{i=1}^p Σ_{j=1}^q |a_{ij} − b_{ij}|   for 1 ≤ s, t ≤ ∞.

The bounds in (10.18) are not the best possible; however, they have the advantage of simplicity and, as is spelled out in (10.20), they clearly display the fact that ‖A − B‖_{s,t} is small if and only if the entries in A are close to the entries in B.

We have special interest in norms ‖A‖ on matrices A that have two extra properties:

(10.21)    ‖AB‖ ≤ ‖A‖ ‖B‖   and   ‖I_n‖ = 1.

Theorem 10.7. The norm ‖A‖_{s,s}, 1 ≤ s ≤ ∞, meets both of the conditions in (10.21). The norm ‖A‖_s never meets both of these conditions: If A, B ∈ C^{n×n} and n > 1, then

(10.22)    ‖AB‖_s ≤ ‖A‖_s ‖B‖_s ⇐⇒ 1 ≤ s ≤ 2   and   ‖I_p‖_s = 1 ⇐⇒ s = ∞.

Proof. If A ∈ C^{p×q}, B ∈ C^{q×k}, and u ∈ C^k, then, by successive applications of the bound (10.13),

    ‖(AB)u‖_t = ‖A(Bu)‖_t ≤ ‖A‖_{s,t} ‖Bu‖_s ≤ ‖A‖_{s,t} ‖B‖_{r,s} ‖u‖_r,

which implies that

(10.23)    ‖AB‖_{r,t} ≤ ‖A‖_{s,t} ‖B‖_{r,s}   for 1 ≤ r, s, t ≤ ∞

and hence that ‖AB‖_{s,s} ≤ ‖A‖_{s,s} ‖B‖_{s,s}. Thus, as ‖I_n‖_{s,s} = 1, both of the conditions in (10.21) are met. The verification of (10.22) is left to the reader; see Exercises 10.12, 10.13, and 10.14. □

Exercise 10.12. Show that if 1 ≤ s ≤ ∞ and n > 1, then ‖I_n‖_s = 1 if and only if s = ∞.

Exercise 10.13. Show that if A ∈ C^{p×q}, B ∈ C^{q×r}, and 1 ≤ s ≤ 2, then ‖AB‖_s ≤ ‖A‖_s ‖B‖_s. [HINT: If 1 < s ≤ 2 and s^{−1} + t^{−1} = 1, then t ≥ s and hence ‖u‖_t ≤ ‖u‖_s.]

Exercise 10.14. Show that if A = [ a  a ; a  a ] with a > 0, then ‖A²‖_s > ‖A‖_s² when 2 < s ≤ ∞ and ‖A²‖_s < ‖A‖_s² < 2‖A²‖_s when 1 ≤ s < 2.

Exercise 10.15. Show that if A ∈ C^{p×q}, then

(10.24)    max_i {Σ_{j=1}^q |a_{ij}|²}^{1/2} ≤ ‖A‖_{2,2} ≤ {Σ_{i=1}^p Σ_{j=1}^q |a_{ij}|²}^{1/2}.


Exercise 10.16. Show that if A = [ a  b ; 0  c ] ∈ R^{2×2} and d² = a² + b² + c², then

    max{‖Ax‖_2² : x ∈ R² and ‖x‖_2 = 1} = (d² + √(d⁴ − 4a²c²))/2.

[HINT: If u ∈ R² and ‖u‖_2 = 1, then u^T = [cos θ  sin θ]. To finish, refer to Exercise 9.18.]

10.3. Evaluating some operator norms The next lemma lists a number of cases for which it is possible to evaluate As,t precisely. Lemma 10.8. If A ∈ C p×q , then:  (1) A1,1 = max { pi=1 |aij |}. j

(2) A∞,∞ = max

q

i

j=1 |aij |

! .

(3) A2,2 = s1 , where s21 is the largest eigenvalue of the matrix AH A. (4) A1,∞ = max |aij |. i,j

!  (5) A2,∞ = max ( qj=1 |aij |2 )1/2 . i 3 4 (6) A1,2 = max ( pi=1 |aij |2 )1/2 . j

Discussion.

To obtain the first formula, observe that % % * ) p % p % q   % q %  % aij xj %% ≤ |aij | |xj | Ax1 = % % j=1 i=1 i=1 % j=1

and hence that (10.25)

Ax1 ≤ max

- p 

j

|aij | x1 .

i=1

This establishes the inequality (10.26)

.

A1,1 ≤ max j

- p 

. |aij |

.

i=1

To obtain equality, it suffices to exhibit a vector x ∈ C q such that x = 0 and equality prevails in formula (10.25). Suppose that the maximum in (10.26)


is achieved when j = k. Then for the vector u with uk = 1 and all other coordinates equal to zero, we obtain u1 = 1 and % % . - p % p % q p   % %  % A1,1 ≥ Au1 = aij uj %% = |aik | = max |aij | . % j % % i=1 i=1 j=1 i=1 This completes the proof of the first formula. Next, to obtain the second formula, observe that % % % % q q q  %  % % % a x |a ||x | ≤ |aij |x∞ ≤ ij j % ij j % % j=1 % j=1 j=1 and hence that (10.27)

Ax∞

⎫ ⎧% %⎫ ⎧ %⎬ q q ⎬ ⎨%% ⎨ % = max %% aij xj %% ≤ max |aij | x∞ , i ⎩% i ⎩ ⎭ %⎭ j=1 j=1

i.e., (10.28)

A∞,∞ ≤ max i

⎧ q ⎨ ⎩

j=1

|aij |

⎫ ⎬ ⎭

.

To obtain equality in (10.28), it suffices to exhibit a vector x ∈ C q such that x = 0 and equality prevails in (10.27). Suppose that the maximum in (10.28) is attained at i = k and that it is not equal to zero, and let u be the vector in C q with entries ( akj /|akj | if akj = 0, uj = 0 if akj = 0. Then u∞ = 1 and A∞,∞ ≥ Au∞

% ⎫ % ⎧ % q % q q ⎬ ⎨    % % ≥ %% akj uj %% = |akj | = max |aij | . i ⎩ ⎭ % j=1 % j=1 j=1

This completes the proof of the second assertion if A ≠ O_{p×q}. However, if A = O_{p×q}, then the asserted formula is self-evident. We shall postpone the proof of the third assertion to Lemma 15.3 and leave the remaining assertions to the reader. □

Exercise 10.17. Compute the maximum eigenvalue of the matrix A^H A when A = [ a  b ; 0  c ] ∈ R^{2×2} and show that it is equal to the maximum that was calculated in Exercise 10.16.


10.4. Small perturbations In subsequent chapters we shall work mostly (though not exclusively) with the norm A2,2 . But in this section (and this section only), just to give some idea of the possibilities, we shall let A denote the operator norm for A ∈ C p×q that is defined by the formula A = max {Ax : x ∈ C q and x = 1} , where x is any norm on C q . You can, if it makes you more comfortable, replace A by A2,2 and x by x2 . Lemma 10.9. If X ∈ C p×p and X < 1, then Ip − X is invertible. Proof. Let u ∈ C p and (Ip − X)u = 0; then u = Xu ≤ X u . Therefore, (1 − X ) u ≤ 0, which implies that u = 0 and hence that the nullspace of Ip − X is equal to {0}. Therefore, Ip − X is invertible.  Theorem 10.10. If A , B ∈ C p×p , A is invertible, and A − B < {A−1  }−1 , then B is invertible. Proof. Since A is invertible, B = A − (A − B) = A(Ip − A−1 (A − B)) , which will be invertible if A−1 (A − B) < 1 by Lemma 10.9. But, if A−B < {A−1  }−1 , then A−1 (A−B) ≤ A−1  A−B < 1.  Theorem 10.10 ensures that invertibilty is preserved under small perturbations. But more is true: (1) If B is close to A, then the eigenvalues of B will be close to the eigenvalues of A; see Theorem 35.7. (2) Left and right invertibility are also preserved under small perturbations. We shall justify this indirectly by first exploring the behavior of the rank, which is not necessarily preserved under small perturbations. Lemma 10.11. If A ∈ C p×q and B is a submatrix of A, then B ≤ A . Discussion. Suppose that A ∈ C 6×5 and 

a21 a23 a24 B= a41 a43 a44 and let ei denote the i’th column of I6 for i = 1, . . . , 6 and fj denote the j’th column of I5 for j = 1, . . . , 5. Then     and F = f1 f3 f4 . B = E T AF, where E = e2 e4


Thus, B = E T AF  ≤ E T  A F  . But, F  = max{F x : x ∈ C 3 = max{I5 x : x ∈ C 5 , ≤ max{I5 x : x ∈ C

5

and x = 1} x2 = x5 = 0, and

and

x = 1}

x = 1} = I5  = 1 .

By similar considerations, E T  ≤ 1. Therefore, B ≤ A in this example. The verification of this inequality in the general setting is essentially the same; only the bookkeeping is a little more elaborate.  Exercise 10.18. Show that in the setting of the preceding discussion, F  = E T  = 1 . Exercise 10.19. Show by direct calculation that if E ∈ R n×k is a submatrix of In that is obtained by discarding n − k columns of In for some choice of k, 1 ≤ k ≤ n, then E = 1 and E T  = 1. Theorem 10.12. If A ∈ C p×q , then: (1) rank A = r =⇒ there exists an invertible r × r submatrix of A. (2) If there exists a k × k invertible submatrix of A, then rank A ≥ k.   Proof. If A = a1 · · · aq and rank A = r, then there exists a submatrix B ∈ C p×r of A with rank B = r. Therefore, rank B T = r and there exists an r × r submatrix C of B T with rank C = r. Thus C T is an invertible r × r submatrix of A. This completes the proof of (1). Suppose next that there exists a k × k invertible submatrix of A. Then the k columns of A that overlap the columns of this submatrix are linearly independent. Thus, rank A ≥ k.  Theorem 10.13. If A ∈ C p×q and rank A = r, then there exists an ε > 0 such that if B ∈ C p×q and A − B < ε, then rank B ≥ r. Proof. If A ∈ C p×q and rank A = r, then there exist a p × r submatrix E of Ip and a q × r submatrix F of Iq such that the r × r submatrix E T AF of A is invertible. Moreover, since E T AF − E T BF  = E T (A − B)F  ≤ A − B , Theorem 10.10 ensures that E T BF is invertible if 1 . A − B < T (E AF )−1  Thus, if this condition is met, then, in view of Theorem 10.12, rank B ≥ r, as claimed. 


Example 10.1. If

0 1 A= 0 1



and

0 1 B= α 1

 with α = 0 ,

then A − B = |α| and hence there exists a matrix B with rank B = 2 in the set {X ∈ C p×q : A − X < ε} for every ε > 0, no matter how small, whereas rank A = 1.  Corollary 10.14. If A , B ∈ C p×q , then there exists an ε > 0 such that: (1) A left invertible and A − B < ε =⇒ B is left invertible. (2) A right invertible and A − B < ε =⇒ B is right invertible. Proof. This depends upon the fact that A is right invertible (resp., left invertible) if and only if rank A = p (resp., rank A = q). Consequently, if A is right invertible and B − A is small enough, then p ≥ min {p, q} ≥ rank B ≥ rank A = p . Thus, rank B = p and (1) holds; the justification of (2) is similar.



Theorem 10.15. If A ∈ C^{p×p} and ε > 0, then there exists a diagonalizable matrix B ∈ C^{p×p} such that ‖B − A‖ < ε.

Proof. If A = UJU^{−1}, choose B = U(J + D)U^{−1}, where D is a diagonal matrix that is chosen so that the diagonal entries of J + D are all distinct and ‖UDU^{−1}‖ < ε. □

Theorem 10.15 implies that the set of complex diagonalizable matrices is dense in C^{p×p}; however, it is not an open set: If A ∈ C^{p×p} is diagonalizable and ε > 0, then {B ∈ C^{p×p} : ‖B − A‖ < ε} will contain nondiagonalizable matrices. The simplest example is

    A = [ 1  0 ; 0  1 ]   and   [ 1  α ; 0  1 ]   with 0 < |α| < ε.

Exercise 10.20. Show that if A ∈ C^{n×n} and ‖A‖_∞ < 1/(n + 1), then the matrix B = I_n − A is invertible and ‖B‖_∞ < 3/2.
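A short numerical sketch (NumPy assumed; not part of the text) of the two phenomena discussed in this section: a tiny perturbation can raise the rank, as in Example 10.1, whereas invertibility survives any perturbation smaller than 1/‖A^{−1}‖, as in Theorem 10.10. The particular matrices below are chosen only for illustration.

    import numpy as np

    # Example 10.1: rank is not preserved under small perturbations
    alpha = 1e-8
    A = np.array([[0.0, 1.0], [0.0, 1.0]])
    B = np.array([[0.0, 1.0], [alpha, 1.0]])
    assert np.linalg.matrix_rank(A) == 1 and np.linalg.matrix_rank(B) == 2

    # Theorem 10.10: invertibility is preserved if ||A - B|| < 1/||A^{-1}||
    rng = np.random.default_rng(5)
    A = np.eye(3) + 0.1 * rng.standard_normal((3, 3))     # invertible (with overwhelming probability)
    threshold = 1.0 / np.linalg.norm(np.linalg.inv(A), 2)
    E = rng.standard_normal((3, 3))
    E *= 0.5 * threshold / np.linalg.norm(E, 2)           # ||E|| < 1/||A^{-1}||
    B = A - E
    assert np.linalg.matrix_rank(B) == 3                  # B is still invertible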

10.5. Supplementary notes This chapter is partially adapted from Chapter 7 of [30]. For a proof of the equivalence of norms in a finite-dimensional space, see, e.g., Section 7.3 of [30]. A sequence of vectors u1 , u2 , . . . in a normed linear space U is a Cauchy sequence if for every ε > 0 there exists a positive integer N such that un+k − un U < ε for n ≥ N and all positive integers k. A normed linear space is a Banach space if every Cauchy sequence tends to a limit in the


space. In a finite-dimensional normed linear space every Cauchy sequence tends to a limit in the space. Thus, every finite-dimensional normed linear space is a Banach space. Moreover, in a finite-dimensional normed linear space, every infinite sequence of vectors u1 , u2 , . . . with uj U ≤ K < ∞ has a convergent subsequence; and if S, T are linear transformations from U into U , then ST = I ⇐⇒ T S = I. Both of these properties fail in infinite-dimensional spaces such as the space U considered in Exercise 10.9.

Chapter 11

Inner product spaces

The first three sections of this chapter are devoted to inner product spaces, Gram matrices, and the adjoint of a linear transformation, respectively. We then consider the spectral radius of a matrix (which could just as well have been presented in Chapter 10) and finally present a list of what you need to know about the operator norm A2,2 of a matrix A ∈ C p×q .

11.1. Inner product spaces A vector space U is said to be an inner product space if there is a number u, vU ∈ C associated with every pair of vectors u, v ∈ U such that: (1) u + w, vU = u, vU + w, vU for every w ∈ U . (2) αu, vU = αu, vU for every scalar α . (3) u, vU = v, uU . (4) u, uU ≥ 0 with equality if and only if u = 0. The number u, vU is termed the inner product. Items (1) and (2) imply that the inner product is linear in the first entry and hence, in particular, that 20, vU = 20, vU = 0, vU , which implies that 0, vU = 0. Item (3) serves to guarantee that the inner product is additive in the second entry, i.e., u, v + wU = u, vU + u, wU ;

however, u, βvU = βu, vU .

When the underlying inner product space U is clear from the context, the inner product is often denoted simply as u, v instead of u, vU . 113

114

11. Inner product spaces

A Hilbert space is an inner product space U in which every Cauchy sequence tends to a limit in U . A finite-dimensional inner product space is automatically a Hilbert space. Exercise 11.1. Let U be an inner product space and let u ∈ U . Show that u, v = 0 for every v ∈ U ⇐⇒ u = 0 and (consequently) u1 , v = u2 , v

for every v ∈ U ⇐⇒ u1 = u2 .

The notation x, yst , which is defined for x, y ∈ Cn by the formula n  (11.1) x, yst = yH x = yi xi , i=1

will be used on occasion to denote the standard inner product on C n . The conjugation in this formula can be dropped if x, y ∈ Rn . It is important to bear in mind that there are many other inner products that can be imposed on C n : Exercise 11.2. Show that if B ∈ C p×q and rank B = q, then the formula x, y = (By)H Bx

(11.2)

defines an inner product on C q . Exercise 11.3. Show that the formula A, B = trace(B H A) defines an inner product on the space U = C p×q and then find a basis for C p×q that is orthonormal with respect to this inner product. The norm A2 = (A, A)1/2 based on the inner product introduced in Exercise 11.3 is called the Frobenius norm. Although it looks totally new, the identities p  q  aij bij = trace B H A = trace AB H for A, B ∈ C p×q (11.3) i=1 j=1

show that it really is just the standard inner product for vectors in C pq that are displayed as matrices. Exercise 11.4. Let U denote the set of continuous complex-valued functions f (t) on the finite closed interval [a, b]. (a) Show that U is a complex vector space with respect to the natural rules of addition and multiplication by constants. Identify the zero element. (b) Show that U is a normed linear space with respect to the norm !1/2 ,b . f  = a |f (t)|2 dt


(c) Show that U is an inner product space with respect to the inner product ⟨f, g⟩ = ∫_a^b f(t)\bar{g}(t) dt.

Lemma 11.1 (The Cauchy-Schwarz inequality for inner products). Let U be an inner product space with inner product ⟨u, v⟩ for every pair of vectors u, v ∈ U. Then

(11.4)    |⟨u, v⟩| ≤ (⟨u, u⟩)^{1/2} (⟨v, v⟩)^{1/2},

with equality if and only if either (1) v = 0 and u is arbitrary or (2) v ≠ 0 and u = λv for some λ ∈ C.

Proof. The proof rests essentially on the fact that the inequality

    0 ≤ ⟨u − λv, u − λv⟩ = ⟨u, u⟩ − \bar{λ}⟨u, v⟩ − λ⟨v, u⟩ + |λ|²⟨v, v⟩

is valid for every choice of λ ∈ C. If v ≠ 0, then ⟨v, v⟩ > 0 and we may set

    λ = ⟨u, v⟩ / ⟨v, v⟩

to obtain

    ⟨u − λv, u − λv⟩ = (⟨u, u⟩⟨v, v⟩ − |⟨u, v⟩|²) / ⟨v, v⟩,

which clearly justifies the inequality (11.4) and in addition shows that if equality prevails when v ≠ 0, then u − λv = 0. On the other hand, if v = 0, then equality holds in (11.4) for every vector u ∈ U. It remains only to show that if v = 0, or v ≠ 0 and u − λv = 0, then equality holds in (11.4). But this is easy and is left to the reader. □

The condition for equality in the Cauchy-Schwarz inequality should not be overlooked; it is useful:

Example 11.1. If x ∈ C^n with components x_i, i = 1, ..., n, then

(11.5)    ‖x‖_1 ≤ √n ‖x‖_2,   with equality if and only if |x_1| = ··· = |x_n|.

Discussion. Let v, a ∈ C^n with components v_i = |x_i| and a_i = 1 for i = 1, ..., n, respectively. Then

    ‖x‖_1 = Σ_{i=1}^n |x_i| = |⟨v, a⟩| ≤ ‖v‖_2 ‖a‖_2 = √n ‖x‖_2,

with equality if and only if v = μa for some constant μ ≥ 0. □

Exercise 11.5. Show that if f(t) and g(t) are continuous complex-valued functions on the finite closed interval [a, b], then

    |∫_a^b f(t)\bar{g}(t) dt|² ≤ ∫_a^b |f(t)|² dt · ∫_a^b |g(t)|² dt,

with equality if and only if either f(t) ≡ 0 (i.e., f(t) = 0 for every point t ∈ [a, b]) or f(t) ≢ 0 and g(t) = βf(t) for some point β ∈ C.
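A brief numerical companion (NumPy assumed; not part of the text) to Lemma 11.1 and Example 11.1: it checks the Cauchy-Schwarz inequality for the standard inner product and the bound (11.5), with equality when all the |x_i| coincide.

    import numpy as np

    rng = np.random.default_rng(6)
    n = 5
    u = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    v = rng.standard_normal(n) + 1j * rng.standard_normal(n)

    inner = np.vdot(v, u)                  # standard inner product <u, v> = v^H u
    assert abs(inner) <= np.linalg.norm(u) * np.linalg.norm(v) + 1e-12   # (11.4)

    x = rng.standard_normal(n)
    assert np.sum(np.abs(x)) <= np.sqrt(n) * np.linalg.norm(x) + 1e-12   # (11.5)

    # equality in (11.5) when |x_1| = ... = |x_n|
    y = np.exp(1j * rng.uniform(0, 2 * np.pi, n))     # all entries of modulus 1
    assert np.isclose(np.sum(np.abs(y)), np.sqrt(n) * np.linalg.norm(y))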


Since an inner product space U is automatically a normed linear space with respect to the norm u = {u, u}1/2 , it is natural to ask whether or not the converse is true: Is every normed linear space automatically an inner product space? The answer is no, because the norm induced by the inner product has an extra property: It satisfies the parallelogram law: (11.6)

u + v2 + u − v2 = 2u2 + 2v2 .

It can be shown that every normed linear space for which (11.6) holds is an inner product space and that the inner product is then specified in terms of the norm by the polarization identity 1 k i u + ik v2 4 4

(11.7)

u, v =

for complex spaces

k=1

and (11.8)

1 u, v = {u + v2 − u − v2 } 4

for real spaces .

11.2. Gram matrices Let v1 , . . . , vk be a set of vectors in an inner product space U . Then the k × k matrix G with entries (11.9)

gij = vj , vi U

for

i, j = 1, . . . , k

is called the Gram matrix of the given set of vectors. It is easy to see that G = GH and, in terms of notation that will be discussed in Chapter 16, (11.10)

G  O,

i.e., Gx, xst ≥ 0 for every x ∈ C k .

The notation G  O signifies that G  O and G is invertible, i.e., (11.11)

G  O ⇐⇒ Gx, xst > 0 for every nonzero x ∈ C k .

Lemma 11.2. Let U be an inner product space and let G denote the Gram matrix of a set of vectors v1 , . . . , vk in U . Then G  O if and only if the vectors v1 , . . . , vk are linearly independent. Proof. Let c, d ∈ C k with components c1 , . . . , ck and d1 , . . . , dk , respec  tively, and let v = kj=1 cj vj and w = ki=1 di vi . Then v, wU = dH Gc = Gc, dst .  If G is invertible and kj=1 cj vj = 0 for some choice of c1 , . . . , ck ∈ C, then, in view of formula (11.12), 8 7 k k   cj vj , di vi = dH Gc = Gc, dst 0= (11.12)

j=1

i=1

U


for every choice of d1 , . . . , dk ∈ C. Therefore, Gc = 0, which in turn implies that c = 0, since G is invertible. Thus, the vectors v1 , . . . , vk are linearly independent. Suppose next that the vectors v1 , . . . , vk are linearly independent and that c ∈ NG . Then, by formula (11.12), 8 7 k k   cj vj , ci vi = cH Gc = 0 . k

j=1

i=1

U

Therefore, j=1 cj vj = 0 and hence, in view of the presumed linear inde pendence, c1 = · · · = ck = 0. Thus, G is invertible. Exercise 11.6. Verify the assertions in (11.10). Exercise 11.7. Verify formula (11.12). Lemma 11.2 can be strengthened: Theorem 11.3. If G is the Gram matrix of a set of vectors v1 , . . . , vk in an inner product space U and V = span{v1 , . . . , vk }, then (11.13)

rank G = dim span{v1 , . . . , vk } = dim V .

Proof. Suppose first that dim RG = r, with 1 ≤ r < k. Then, there exists a permutation matrix P such that the first r columns of GP are a basis for RG . Consequently, the first r columns of P H GP are a basis for RP H GP . Thus, if r×r , P H GP is written in block form as [Aij ], i, j = 1, 2, with A11 = AH 11 ∈ C H (k−r)×r H (k−r)×(k−r) , and A22 = A22 ∈ C , then there exists a A21 = A12 ∈ C r×(k−r) such that the following implications hold: matrix B ∈ C

 

   A11 A12 A11 A11 Ir = B =⇒ = A11 =⇒ rank = rank A11 . A22 A21 A21 BH A21 Thus, as A11 can be identified as the Gram matrix of a set {vi1 , . . . , vir } of r linearly independent vectors , rank G = dim span{vi1 , . . . , vir } ≤ dim V . Conversely, if dim V = r and {vi1 , . . . , vir } is a basis for V, then the Gram matrix of this set of vectors is invertible. Thus, as this Gram matrix is a submatrix of G, it follows that r = dim V ≤ rank G . This completes the proof.



Exercise 11.8. Show that if the second and fourth columns of a matrix A ∈ C^{4×4} are linearly independent and if A = A^H and rank A = 2, then [ a_{22}  a_{24} ; a_{42}  a_{44} ] is invertible.
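The rank statement (11.13) of Theorem 11.3 is easy to illustrate; the sketch below (NumPy assumed; not part of the text) builds the Gram matrix of k real vectors that span an r-dimensional subspace and checks that rank G = r.

    import numpy as np

    rng = np.random.default_rng(7)
    n, r, k = 6, 3, 5
    basis = rng.standard_normal((n, r))        # r linearly independent vectors in R^n
    coeffs = rng.standard_normal((r, k))
    V = basis @ coeffs                         # k vectors spanning an r-dimensional space

    G = V.T @ V                                # Gram matrix: g_ij = <v_j, v_i> for real vectors
    assert np.linalg.matrix_rank(G) == np.linalg.matrix_rank(V) == r   # (11.13)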


Exercise 11.9. Show that if {v1 , . . . , vr } is a basis for the space V introduced in the proof of Theorem 11.3 and 1 ≤ r < k, then there exists a  T matrix A ∈ C s×r with s = k − r such that the columns of A Is are linearly independent and belong to NG . Theorem 11.4. If C n is equipped with an inner product x, yU and G is the Gram matrix with entries gij = ej , ei U based on the columns of In , then (in terms of the notation (11.11)), G  O and (11.14)

x, yU = Gx, yst

for every choice of x, y ∈ C n .

Proof. The proof is easy and is left to the reader.



11.3. Adjoints If T is a linear transformation from a finite-dimensional inner product space U into a finite-dimensional inner product space V, then (as is spelled out in Exercises 11.10 and 11.11 below) there exists exactly one transformation T ∗ from V into U such that (11.15)

T u, vV = u, T ∗ vU

for every choice of u ∈ U and v ∈ V .

The transformation T ∗ is called the adjoint of T ; it is automatically linear and depends upon the inner products. Moreover, if T1 and T2 are linear transformations from U into V and α ∈ C, then: (1) (αT1 + T2 )∗ = αT1∗ + T2∗ ,

(T ∗ )∗ = T,

and T U,V = T ∗ V,U .

(2) T U,V = max{|T u, vV | : u ∈ U , v ∈ V, and uU = vV = 1}. (3) If dim U = dim V, then T ∗ T = IU ⇐⇒ T T ∗ = IV . Discussion. The first two equalities in (1) are easy consequences of the definition of the adjoint; the third follows from (2). To verify (2), observe first that the Cauchy-Schwarz inequality ensures that the right-hand side of the asserted equality in (2) is always ≤ T U,V . To achieve equality, fix a vector u0 ∈ C q such that u0 U = 1 and T u0 V = T U,V and choose v = T u0 /T u0 V . To justify (3), it suffices to verify the implication =⇒ . If T ∗ T = IU , then dim U = dim RT , since NT = {0}. Thus, if also dim U = dim V, then T maps U onto V. Therefore, for each vector v ∈ V, there exists exactly one vector u ∈ U such that T u = v. Consequently, T ∗ v = T ∗ T u = u, and  hence T T ∗ v = T u = v for every vector v ∈ V. Example 11.2. The most important example for us is when U = C q and V = C p . In keeping with the definition for linear transformations, we shall say that the matrix A∗ ∈ C q×p is the adjoint of a matrix A ∈ C p×q if (11.16)

Au, vV = u, A∗ vU

for every choice of u ∈ C q and v ∈ C p .


In view of Theorem 11.4, (11.16) can be expressed in terms of the Gram matrices B and C of the columns of Iq in U and the columns of Ip in V as (11.17)

CAu, vst = Bu, A∗ vst

for every u ∈ C q and v ∈ C p .

But, as Xu, vst = vH Xu = (X H v)H u = u, X H vst , (11.18)

(11.17) holds if and only if A∗ = B −1 AH C.

If C q and C p are both equipped with the standard inner product, then  B = Iq , C = Ip and hence A∗ = AH . Exercise 11.10. Let T be a linear transformation from an inner product space U with basis {u1 , . . . , uq } and Gram matrix GU into an inner product space V with basis {v1 , . . . , vp } and Gram matrix GV that is defined p in terms p×q of the entries aij of a matrix A ∈ C by the formula T ui = k=1 aki vk for i = 1, . . . , q, and let S be a linear transformation from U into V that q×p by the formula is defined qin terms of the entries bij of a matrix B ∈ C Svj = k=1 bkj uk for j = 1, . . . , p. Show that T u, vV = u, SvU for H every choice of u ∈ U and v ∈ V if and only if B = G−1 U A GV . Exercise 11.11. Use Exercise 11.10 to show that if T is a linear transformation from a finite-dimensional inner product space U into a finite-dimensional inner product space V, then there exists at least one transformation T ∗ from V into U such that (11.15) holds and then show that if S is any transformation from V into U such that T u, vV = u, SvU for every choice of u ∈ U and v ∈ V, then S = T ∗ . Example 11.3. If U = C p×q is equipped with the inner product A, BU = trace B H A, V = C p is equipped with the standard inner product, and u ∈ C q , then the adjoint T ∗ of the linear transformation T from U into V that is defined by the formula T A = Au for every A ∈ U must satisfy the identity T A, vV = A, T ∗ vU for every choice of A ∈ C p×q and v ∈ C p . Thus, T A, vV

= Au, vV = vH Au = trace{vH Au} = trace{uvH A} = A, vuH U ,

i.e., A, T ∗ vU = A, vuH U for every v ∈ C p . Therefore, T ∗ v = vuH for every v ∈ V.  Exercise 11.12. Let U = C n equipped with the inner product u, vU = n n j=1 jvj uj for vectors u, v ∈ C with components u1 , . . . , un and v1 , . . . , vn , respectively. Find the adjoint A∗ of a matrix A ∈ C n×n with respect to this inner product.
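A short sketch (NumPy assumed; not part of the text) of formula (11.18): with Gram matrices B and C, so that ⟨x, y⟩_U = y^H Bx on C^q and ⟨u, v⟩_V = v^H Cu on C^p, the adjoint of A is A* = B^{−1} A^H C. The positive definite matrices below are generated randomly only to play the role of Gram matrices.

    import numpy as np

    rng = np.random.default_rng(8)
    p, q = 3, 4

    def random_pd(n):
        # a random Hermitian positive definite matrix to serve as a Gram matrix
        M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
        return M @ M.conj().T + n * np.eye(n)

    B, C = random_pd(q), random_pd(p)            # Gram matrices on C^q and C^p
    A = rng.standard_normal((p, q)) + 1j * rng.standard_normal((p, q))
    A_star = np.linalg.inv(B) @ A.conj().T @ C   # formula (11.18)

    u = rng.standard_normal(q) + 1j * rng.standard_normal(q)
    v = rng.standard_normal(p) + 1j * rng.standard_normal(p)
    lhs = np.vdot(v, C @ (A @ u))                # <Au, v>_V = v^H C (Au)
    rhs = np.vdot(A_star @ v, B @ u)             # <u, A*v>_U = (A*v)^H B u
    assert np.isclose(lhs, rhs)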


11.4. Spectral radius The next theorem provides a remarkable connection between the growth of the numbers An  and the spectral radius (11.19)

rσ (A) = max {|λ| : λ ∈ σ(A)}

for

A ∈ C p×p .

Theorem 11.5. If A ∈ C p×p and A = A2,2 , then lim An 1/n = rσ (A) ;

(11.20)

n↑∞

i.e., the indicated limit exists and is equal to the spectral radius of A. Proof. To verify (11.20), it suffices to justify the inequalities (11.21)

rσ (A) ≤ An 1/n ≤ rσ (A)(1 + κn ) ,

for every positive integer n, where κn ≥ 0 tends to zero as n ↑ ∞. The lower bound is easy: If Ax = λx for some nonzero vector x ∈ C p , then, since An x = λn x, it is readily seen that |λn |x2 = An x2 ≤ An x2 and hence that |λn | ≤ An  for every λ ∈ σ(A). Therefore, rσ (A) ≤ An 1/n for every positive integer n. To verify the upper bound in (11.21), we first invoke the Jordan decomposition theorem, which ensures that there exists an invertible matrix U ∈ C p×p such that A = U JU −1 . Therefore, An  = U J n U −1  ≤ U J n U −1  and (11.22) where

An 1/n ≤ U 1/n J n 1/n U −1 1/n = J n 1/n (1 + εn ) , (1 + εn ) = {U U −1 }1/n ≥ U U −1 1/n = 1

and εn ↓ 0 as n ↑ ∞. To obtain an upper bound on J n , it suffices to obtain an upper bound (k) (k) (k) on (Cμ )n  for every Jordan cell Cμ = μIk + N (with N = C0 ) that k appears in J. But, if μ ∈ σ(A) and n > p, then, as N = O, 5 5 5 5 5 n   5 5k−1   5  n  n 5 5 5 5 (k) n n−j j n−j j μ N 5 μ N 5 =5 (Cμ )  = 5 5 5 5 5 5 j=0 j 5 5 j=0 j 5 p−1 k−1   k−1    n n−j j n−j ≤ n (rσ (A)) ≤ nj (rσ (A))n−j |μ| ≤ j j=0

j=0 p

≤ (rσ (A)) (n/rσ (A)) n

j=0

if n ≥ 2rσ (A) and p ≥ 1 .


Therefore, (11.23)

J n 1/n ≤ rσ (A)(1 + δn ) ,

where 1 + δn = {n/rσ (A)}p/n = exp

! p ln[n/rσ (A)] → 1 n

as

n ↑ ∞.

The bounds (11.22) and (11.23) imply that An 1/n ≤ rσ (A)(1+δn )(1+εn ). Therefore, 0 ≤ A1/n − rσ (A) ≤ rσ (A)(εn + δn + εn δn ), which serves to  complete the proof, since εn + δn + εn δn tends to 0 as n ↑ ∞. Theorem 11.6. If A, B ∈ C p×p and AB = BA, then: (1) σ(A + B) ⊆ σ(A) + σ(B). (2) rσ (A + B) ≤ rσ (A) + rσ (B). (3) rσ (AB) ≤ rσ (A)rσ (B). Proof. Let u be an eigenvector of A + B corresponding to the eigenvalue μ. Then (A + B)u = μu and hence, since BA = AB, (A + B)Bu = B(A + B)u = μBu ; that is to say, N(A+B−μIp ) is a nonzero subspace of C p that is invariant under B. Therefore, by Theorem 4.5, there exists an eigenvector v of B in this null space, i.e., there exists a nonzero vector v ∈ C p such that (A + B)v = μv and Bv = βv for some β ∈ C. But this in turn implies that β ∈ σ(B) and Av = (μ − β)v , i.e., the number α = μ − β is an eigenvalue of A. Thus we have shown that μ ∈ σ(A + B) =⇒ μ = α + β ,

where

α ∈ σ(A) and β ∈ σ(B).

Therefore, (1) holds. Moreover, (2) is an immediate consequence of (1) and the definition of spectral radius; (3) is left to the reader as an exercise.  Exercise 11.13. Verify the third assertion in Theorem 11.6. Exercise 11.14. Verify the second assertion in Theorem 11.6 by estimating (A + B)n  with the aid of the binomial theorem. [REMARK: This is not as easy as the proof furnished above, but it has the advantage of being applicable in wider circumstances.] Exercise 11.15. Show that if A, B ∈ C n×n , then rσ (AB) = rσ (BA), even if AB = BA. [HINT: Recall formula (7.4).]




 1 1 1 0 Exercise 11.16. Show that if A = and B = , then 0 0 1 0 rσ (AB) > rσ (A) rσ (B) and

rσ (A + B) > rσ (A) + rσ (B) .

Exercise 11.17. Show that if A, B ∈ C^{n×n}, then r_σ(A + B) ≤ r_σ(A) + ‖B‖, even if the two matrices do not commute.
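The limit (11.20) can be observed numerically, although the convergence is slow; the sketch below (NumPy assumed; not part of the text) compares ‖A^n‖^{1/n} with r_σ(A) for a randomly generated matrix and checks the lower bound in (11.21).

    import numpy as np

    rng = np.random.default_rng(9)
    A = rng.standard_normal((4, 4)) / 2.0
    r_sigma = np.max(np.abs(np.linalg.eigvals(A)))

    for n in (1, 2, 4, 8, 16, 32, 64, 128):
        estimate = np.linalg.norm(np.linalg.matrix_power(A, n), 2) ** (1.0 / n)
        print(n, estimate, r_sigma)        # the estimates approach r_sigma as n grows

    # the lower bound r_sigma <= ||A^n||^{1/n} of (11.21) holds for every n
    assert all(
        np.linalg.norm(np.linalg.matrix_power(A, n), 2) ** (1.0 / n) >= r_sigma - 1e-10
        for n in range(1, 20)
    )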

11.5. What you need to know about A • Warning: From now on, we adopt the convention that, unless specified otherwise, A = A2,2 for matrices A ∈ C p×q and x = x2 for vectors x ∈ C q , for every choice of the positive integers p and q. The main properties of A are: (1) AB ≤ A B for B ∈ C q×r . (2) Ip  = 1. (3) The inequality Au2 ≤ A u2 is in force for every u ∈ C q . (4) A = max{|Ax, y| : x ∈ C q , y ∈ C p , and x2 = y2 = 1}. (5) A = AH  (this is an easy consequence of the formula in (4)). (6) (AH A)k  = A2k and A(AH A)k  = A2k+1 for k = 1, 2, . . .. (7) If V ∈ C n×p , V H V = Ip , U ∈ C r×q , and U H U = Iq , then V AU H  = A. Exercise 11.18. Verify the first formula in (6) when k = 1. [HINT: AH A ≥ max {AH Ax, x : x = 1} = max {Ax2 : x = 1} = A2 .] Exercise 11.19. Verify the second formula in (6) for k = 1. [HINT: A4 = AH AAH A ≤ AH  AAH A ≤ A4 .] Exercise 11.20. Show that if (AH A)k  = A2k for some integer k ≥ 2, then (AH A)k−1  = A2(k−1) and A(AH A)k−1  = A2k−1 . [HINT: A2k = AH A(AH A)k−1  ≤ A2 (AH A)k−1  ≤ A2k , for the first.] Exercise 11.21. Verify the formulas in (6) for k = 1, 2, . . .. [HINT: Exploit the implications in Exercises 11.18–11.20.] Exercise 11.22. Verify the formula in (7).

11.6. Supplementary notes This chapter is partially adapted from Chapter 8 of [30]. Formula (11.20) is valid in a much wider context than was considered here; see, e.g., Chapter 18 of Rudin [66]. Lemma 42.1 and the surrounding discussion is a good supplement to the section on adjoints. Example 11.3 is adapted from the monograph by Borwein and Lewis [13].


The next few exercises deal with variations of the polarization identities exhibited in (11.7) and (11.8). Exercise 11.23. Show that if G ∈ C n×n , then 1 k i (u + ik v)H G(u + ik v) v Gu = 4 4

(11.24)

H

for every u, v ∈ C n .

k=1

Exercise 11.24. Show that if G ∈ C n×n , then (11.25)

xH Gx = 0 for every x ∈ C n =⇒ G = O .

[HINT: Exploit Exercise 11.23.] Exercise 11.25. Show that if G ∈ R n×n , then 1 (11.26) vH (G + GT )u = {(u + v)T G(u + v) − (u − v)T G(u − v)} 2 for every choice of u, v ∈ R n . Exercise 11.26. Show that if G ∈ R n×n , then (11.27)

xH Gx = 0 for every x ∈ R n does not imply that G = O .

However, (11.28)

G = GT and xH Gx = 0 for every x ∈ R n =⇒ G = O .

Chapter 12

Orthogonality

In this chapter we shall discuss orthogonality in inner product spaces. Recall that the cosine of the angle θ between the line segment running from 0 to a = (a_1, a_2, a_3) and the line segment running from 0 to b = (b_1, b_2, b_3) for a pair of points a and b in the first octant in R³ is

(12.1)    cos θ = (‖a‖² + ‖b‖² − ‖b − a‖²) / (2‖a‖‖b‖) = Σ_{i=1}^3 a_i b_i / (√(Σ_{i=1}^3 a_i²) √(Σ_{i=1}^3 b_i²)).

Thus, in terms of the standard inner product ⟨a, b⟩ = Σ_{i=1}^3 a_i b_i in R³,

    ⟨a, b⟩ = 0 ⇐⇒ cos θ = 0 ⇐⇒ θ = π/2 ⇐⇒ a ⊥ b.

12.1. Orthogonality Formula (12.1) (the law of cosines) serves to motivate the following definitions in an inner product space U with inner product u, vU : • Orthogonal vectors: A pair of vectors u and v in U is said to be orthogonal if u, vU = 0. • Orthogonal family: A set of nonzero vectors {u1 , . . . , uk } in U is said to be an orthogonal family if ui , uj U = 0

when i = j.

The assumption that none of the vectors u1 , . . . , uk are equal to 0 serves to guarantee that they are automatically linearly independent. 125


• Orthonormal family: A set of vectors u1 , . . . , uk in U is said to be an orthonormal family if: (1) ui , uj U = 0 for i, j = 1, . . . , k and i = j and (2) ui 2U = ui , ui U = 1 for i = 1, . . . , k. • Orthogonal decomposition: A pair of subspaces V and W of U is said to form an orthogonal decomposition of U if: (1) V + W = U and (2) v, wU = 0 for every v ∈ V and w ∈ W. Orthogonal decompositions will be indicated by the symbol U =V ⊕W. • Orthogonal complement: If V is a subspace of an inner product space U , then the set V ⊥ = {u ∈ U : u, vU = 0

(12.2)

for every

v ∈ V}

is referred to as the orthogonal complement of V in U . It is a subspace of U . • Orthonormal expansions: An orthonormal expansion of a vector  u ∈ U is a linear combination of vectors u = kj=1 cj uj wherein the vectors u1 , . . . , uk are orthonormal. The advantage of orthonormal expansions is that the computation of the coefficients c1 , . . . , ck and the evaluation of u, uU is now easy: u, ui U =

(12.3)

k 

cj uj , ui U = ci

for i = 1, . . . , k ,

j=1

and u, uU =

7 k 

(12.4) =

k 

8 ci ui

i=1

=

k 

U

i=1

7 ci

k  j=1

8 cj uj , ui U

|ci |2 .

i=1

Moreover, if w = culation, (12.5)

cj uj ,

j=1

k 

k

j=1 dj uj ,

u, wU =

then, by much the same sort of caln 

di ci .

i=1

Exercise 12.1. Show that every orthogonal sum decomposition is a direct sum decomposition and give an example of a direct sum decomposition that is not an orthogonal decomposition.


Exercise 12.2. Show that if {u_1, ..., u_k} is an orthogonal family of nonzero vectors in an inner product space U, then u_1, ..., u_k are linearly independent.

Exercise 12.3. Show that if A ∈ C^{p×q} and C^q and C^p are both equipped with the standard inner product, then

(12.6)    C^q = R_{A^H} ⊕ N_A   and   C^p = R_A ⊕ N_{A^H}.
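A compact numerical sketch (NumPy assumed; not part of the text) of the first decomposition in (12.6): the orthonormal right singular vectors of A split C^q into the range of A^H and the null space of A.

    import numpy as np

    rng = np.random.default_rng(10)
    p, q = 3, 5
    A = rng.standard_normal((p, q)) + 1j * rng.standard_normal((p, q))

    U, s, Vh = np.linalg.svd(A)
    r = np.sum(s > 1e-12)                  # rank of A
    range_AH = Vh[:r].conj().T             # orthonormal basis for the range of A^H
    null_A = Vh[r:].conj().T               # orthonormal basis for the null space of A

    # the two subspaces are orthogonal and their dimensions add up to q
    assert np.allclose(range_AH.conj().T @ null_A, 0)
    assert range_AH.shape[1] + null_A.shape[1] == q
    assert np.allclose(A @ null_A, 0)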

12.2. Projections and direct sums Recall that the sum V + W = {v + w : v ∈ V and w ∈ W} of a pair of subspaces V and W of a vector space U is direct if V ∩ W = {0}. In this section we shall establish a correspondence between the decomposition ˙ U = V +W of a vector space U as a direct sum and a special class of linear transformations from U into U that are called projections. • Projections: A linear transformation T of a vector space U into itself is said to be a projection if T 2 = T . Lemma 12.1. If a linear transformation T from a vector space U into itself is a projection, then ˙ T. U = RT +N

(12.7)

Proof. Let x ∈ U . Then clearly x = T x + (I − T )x and T x ∈ RT . Moreover, (I − T )x ∈ NT , since T (I − T )x = (T − T 2 )x = (T − T )x = 0. Thus, U = RT + NT . The sum is direct because y ∈ RT ⇐⇒ y = T y

and

y ∈ NT ⇐⇒ T y = 0 .

(The key to the first equivalence is y = T x =⇒ T y = T 2 x = T x = y.)



Lemma 12.1 exhibits U as the direct sum of the spaces V = RT and W = NT that are defined in terms of a given projection T . Conversely, the ˙ complementary spaces in any given direct sum decomposition U = V +W may be identified as the range and null space, respectively, of a projection


T , i.e., V = RT and W = NT : Lemma 12.2. Let V and W be subspaces of a vector space U and suppose ˙ that U = V +W. Then: (1) For every vector u ∈ U there exists exactly one vector v ∈ V such that u − v ∈ W. The transformation T that maps u ∈ U into the unique vector v ∈ V considered in (1) enjoys the following properties: (2) RT = V and T v = v for every vector v ∈ V; NT = R(I−T ) = W and (I − T )w = w for every vector w ∈ W. (3) T is linear and T 2 = T . Proof. The first assertion is immediate from the definition of a direct sum decomposition. To verify (2), suppose first that v ∈ V. Then, since v = v + 0 and 0 ∈ W, it follows that v = T v and hence that V ⊆ RT . Thus, as RT ⊆ V by definition, the first equality in (2) must hold. To get the second, let u ∈ NT and write u = v + w with v ∈ V and w ∈ W. Then, since 0 = T u = v, it follows that NT ⊆ W. Thus, as the opposite inclusion follows from the equality w = 0 + w, the proof of (2) is complete. Suppose next that u1 = αv1 + w1 and u2 = v2 + w2 with v1 , v2 ∈ V, w1 , w2 ∈ W, and a scalar α. Then, as αv1 + v2 ∈ V and w1 + w2 ∈ W, T (αu1 + u2 ) = αv1 + v2 = αT u1 + T u2 . Thus, T is linear. The equality T 2 = T is immediate from (2).



• Notation: We shall use the symbol PVW to denote the projection onto the ˙ in order to emphasize subspace V with respect to the decomposition V +W that this projection depends upon both V and the complementary space W. Exercise 12.4. Let {v, w} be a basis for a vector space U . Find the projection of the vector u = 2v + 3w onto the space V with respect to each of ˙ ˙ 1 , when the following direct sum decompositions: U = V +W and U = V +W V = span{v}, W = span{w}, and W1 = span{w + v}. Exercise 12.5. Let 

u1 u2 u3 u4 u5



1  ⎢ 2 u6 = ⎢ ⎣ 1 0

1 1 0 4 1 1 1 −1

2 3 1 5 0 1 0 −1

⎤ 4 0 ⎥ ⎥, 0 ⎦ 1

and let U = span{u1 , u2 , u3 , u4 } , V = span{u1 , u2 , u3 } , W1 = span{u4 }, and W2 = span{u5 }.


(a) Find a basis for the vector space V. ˙ 1 and U = V +W ˙ 2. (b) Show that U = V +W (c) Find the projection of the vector u6 onto the space V with respect to each of the two direct sum decompositions defined in (b). Exercise 12.6. Show that if A ∈ R n×n and rank A = r, then A is a projection (i.e., A2 = A) if and only if det (λIn − A) = (λ − 1)r λn−r and A is diagonalizable.

12.3. Orthogonal projections • Orthogonal projections: A linear transformation T of an inner product space U into itself is an orthogonal projection if (12.8)

T2 = T

and NT is orthogonal to RT

(hence U = RT ⊕ NT , which is a stronger condition than (12.7)). Exercise 12.7. Let u1 and u2 be a pair of orthonormal vectors in an inner product space U and let α be a scalar. Show that the transformation T that is defined by the formula T u = u, u1 + αu2 U u1 is a projection but is not an orthogonal projection unless α = 0. Theorem 12.3. If a linear transformation T from an inner product space U into itself is a projection, then T is an orthogonal projection if and only if (12.9)

T x, yU = x, T yU

for every choice of

x, y ∈ U .

Proof. Suppose first that T is an orthogonal projection, i.e., (12.10)

v, wU = 0

for every choice of

v ∈ RT and w ∈ NT .

Then, since R(I−T ) = NT , T x, yU = T x, T y + (I − T )yU = T x, T yU = T x + (I − T )x, T yU = x, T yU for every choice of x, y ∈ U . Thus, (12.10) implies (12.9). Conversely, if v ∈ RT , w ∈ NT , and (12.9) is in force, then v, w = T v, w = v, T w = v, 0 = 0 . Therefore, (12.9) implies (12.10), as needed to complete the proof. Exercise 12.8. Show that if T 2 = T and (12.9) holds, then (12.11)

T x, yU = T x, T yU

for every choice of

x, y ∈ U .



130

12. Orthogonality

The next result is an analogue of Lemma 12.2 for orthogonal projections that also includes a recipe for calculating the projection. It is formulated in terms of one subspace V of the underlying inner product space U rather than in terms of a pair of complementary subspaces V and W, because the second space W is specified as the orthogonal complement V ⊥ of V, i.e., U = V ⊕ V⊥ .

(12.12)

Since V ∩ V ⊥ = {0}, Lemma 12.2 guarantees the existence of exactly one linear transformation T that maps U onto V such that u − T u ∈ V ⊥ and further guarantees that T 2 = T , RT = V, and NT = V ⊥ . We shall refer to this transformation as the orthogonal projection of U onto V and denote it by the symbol ΠV . Theorem 12.4. If U is an inner product space, V is a subspace of U with basis {v1 , . . . , vk }, u ∈ U , and G ∈ C (k+1)×(k+1) is the Gram matrix of the set of vectors {v1 , . . . , vk , u}, then the orthogonal projection ΠV of U onto V is given by the formula

 k  G11 G12 −1 (G11 G12 )j vj , where G = , (12.13) ΠV u = G21 G22 j=1

(G11 )ij = vj , vi U f or i, j = 1, . . . , k, (G12 )i = u, vi U f or i = 1, . . . , k , G21 = GH 12 ,

and

G22 = u, uU .

Moreover, (12.14)

u − v2U ≥ u2U − ΠV u2U = G22 − G21 G−1 11 G12

for every vector v ∈ V, with equality if and only if v = ΠV u. If U = C n is endowed with the inner product x, yU = Bx, yst = (based on any B ∈ C n×n for which xH Bx > 0 for every x = 0) and V = [v1 · · · vk ], then G11 = V H BV , G12 = V H Bu, G22 = uH Bu, and yH Bx

(12.15)

ΠV u = V (V H BV )−1 V H Bu = V G−1 11 G12

for every u ∈ C n .

 Proof. The vector u − kj=1 cj vj belongs to V ⊥ if and only if ⎞ 8 7⎛ k  ⎝u − cj vj ⎠ , vi = 0 for i = 1, . . . , k , j=1

U

or, equivalently, in terms of the entries in G, if and only if u, vi U =

k  j=1

(G11 )ij cj

for

i = 1, . . . , k .

12.3. Orthogonal projections

131

But this in turn is the same as saying that the vector c ∈ C k with components c1 , . . . , ck is a solution of the vector equation G12 = G11 c. Since G11 is invertible by Lemma 11.2, ΠV u is uniquely specified by formula (12.13). Next, since (u − ΠV u) ∈ V ⊥ and (ΠV u − v) ∈ V, it is readily seen that (12.16) u − v2U = u − ΠV u + ΠV u − v2U = u − ΠV u2U + ΠV u − v2U ≥ u − ΠV u2U = u2U − ΠV u2U for every v ∈ V, which serves to justify (12.14), modulo a straightforward calculation. Finally, (12.15) follows from (12.13), since G11 = V H BV and G12 =  V H Bu in the given setting. Exercise 12.9. Show that if u1 , . . . , un are linearly independent vectors in an inner product space U and Gj is the Gram matrix for {u1 , . . . , uj }, then min{uk − uU : u ∈ span{u1 , . . . , uk−1 }} = (det Gk / det Gk−1 )1/2 for k = 2, . . . , n. In the future we shall usually denote the orthogonal projection of an inner product space U onto a subspace V by ΠV . Here, there is no danger of going astray because it is understood that the projection is with respect to the decomposition U = V ⊕ V ⊥ . If the vectors v1 , . . . , vk that are specified in Theorem 12.4 are orthonormal in U , then the formulas simplify, because G11 = Ik , and the orthogonal projection ΠV u of a vector u ∈ U onto V is given by the formula ΠV u = u, v1 U v1 + · · · + u, vk U vk .

(12.17) Correspondingly,

ΠV u2U

(12.18)

=

k 

|u, vj U |2

for every vector u ∈ U

j=1

and (Bessel’s inequality) (12.19)

k 

|u, vj U |2 ≤ u2U ,

with equality if and only if u ∈ V .

j=1

Moreover, the coefficients cj = u, vj U , j = 1, . . . , k, computed in (12.17) do not change if the space V is enlarged by adding more orthonormal vectors. To this point the analysis in this section is applicable to any inner product space. Thus, for example, we may choose U equal to the set of continuous

132

12. Orthogonality

complex-valued functions on the interval [0, 1], with inner product 6 1 f, gU = f (t)g(t)dt. 0

Then it is readily checked that the set of functions ϕj (t) = ej2πit ,

j = 1, . . . , k,

is an orthonormal family in U for any choice of the integer k. Consequently, %2 6 1 k %6 1  % % % f (t)ϕj (t)dt%% ≤ |f (t)|2 dt, % j=1

0

0

by (12.19). Exercise 12.10. Show that no matter how large you choose k, the family ϕj (t) = ej2πit , j = 1, . . . , k, is not a basis for the space of continuous complex-valued functions U considered just above.   Exercise 12.11. Let A = a1 a2 a3 a4 ∈ C4×4 , let G = AH A, and let ΠV denote the orthogonal projection onto V = span{a1 , a2 , a3 }. Show that: (a) G is a Gram matrix. (b) If a1 , a2 , a3 are linearly independent and H denotes the 3 × 3 Gram matrix for these 3 vectors, then the Schur complement ⎡ ⎤  −1 g14  g44 − g41 g42 g43 H ⎣g24 ⎦ = a4 − ΠV a4 2 . g34 Exercise 12.12. Show that the choice λ = u, v/v, v when v = 0 in the proof of Lemma 11.1 (the Cauchy-Schwarz inequality) minimizes {u−αv : α ∈ C}. Exercise 12.13. Verify directly that if V ∈ C n×k with 1 ≤ k < n and rank V = k, then V H V is invertible and ΠV = V (V H V )−1 V H is an orthogonal projection in the space C n equipped with the standard inner product. Show that RΠV = RV and NΠV = NV H . [HINT: The fact that vj = V ej , where ej is the j’th column vector of Ik , may be helpful.] Exercise 12.14. Show that the norm of the projection that is defined in Exercise 12.7 is equal to (1 + |α|2 )1/2 . Exercise 12.15. Show that if P ∈ C n×n , rank P ≥ 1, and P 2 = P , then: (a) P  = 1 if RP is orthogonal to NP . (b) P  can be very large if RP is not orthogonal to NP . Exercise 12.16. Find the orthogonal projection of the vector u6 onto the space V in the setting of Exercise 12.5.

12.4. The Gram-Schmidt method

133

Exercise 12.17. Let μ1 < · · · < μk and let p(x) = c0 + c1 x + · · · + cn xn be a polynomial of degree n ≥ k with coefficients c0 , . . . , cn ∈ R. Show that if p(μj ) = βj for j = 1, . . . , k, then ⎡ ⎤ ⎡ ⎤ 1 ··· 1 β1 n ⎢ μ1  μk ⎥ ⎢ ⎥ ⎢ .. ⎥ 2 T T −1 cj ≥ b (V V ) b, where V = ⎢ . .. ⎥ and b = ⎣ . ⎦ , ⎣ .. . ⎦ j=0 βk μnk μn1 and find a polynomial that achieves the exhibited minimum. [HINT: First verify the fact that V (V T V )−1 V T  = 1.]

12.4. The Gram-Schmidt method Let {u1 , . . . , uk } be a set of linearly independent vectors in an inner product space U . The Gram-Schmidt method is a procedure for finding a set of orthonormal vectors {v1 , . . . , vk } such that Vj = span{v1 , . . . , vj } = span{u1 , . . . , uj }

for j = 1, . . . , k.

The steps of this procedure may be expressed in terms of the orthogonal projection ΠVj of U onto Vj as follows: v1 = u1 /ρ1 with ρ1 = u1 U > 0 , uj+1 − ΠVj uj+1 uj+1 − [uj+1 , v1 U v1 + · · · + uj+1 , vj U vj ] = vj+1 = ρj+1 ρj+1 with ρj+1 = uj+1 − [uj+1 , v1 U v1 + · · · + uj+1 , vj U vj ]U > 0 for j = 1, . . . , k. It is easily checked that the vectors constructed this way are orthonormal and that Vj = span{u1 , . . . , uj } for j = 1, . . . , k − 1. To see the pattern underlying this construction more clearly, note that v1 = u1 /ρ1 , (12.20)

v2 = [u2 − u2 , v1 U v1 ]/ρ2 , v3 = [u3 − u3 , v1 U v1 − u3 , v2 U v2 ]/ρ3 .

Exercise 12.18. Find a set of orthonormal vectors {y1 , y2 , y3 } in C 4 such that span{y1 , y2 } = span{u1 , u2 }, span{y1 , y2 , y3 } = span{u1 , u2 , u4 }, and span{y1 } = span{u1 }, for the vectors u1 , u2 , and u4 defined in Exercise 12.5. Exercise 12.19. Find a set of three polynomials p0 (t) = a, p1 (t) = b + ct, and p3 (t) = d + et + f t2 with a, b, c, d, e, f ∈ R so that they form an ,2 orthonormal set with respect to the real inner product f, g = 0 f (t)g(t)dt.

134

12. Orthogonality

12.5. QR factorization Lemma 12.5. If A ∈ C p×q and rank A = q, then there exist exactly one matrix Q ∈ C p×q with QH Q = Iq and exactly one upper triangular matrix R ∈ C q×q with positive entries on the diagonal such that A = QR. Proof. The existence of at least one factorization of the indicated form is a consequence of the Gram-Schmidt procedure. Thus, for example, if k = 3, then (12.20) can be reexpressed as ⎤ ⎡ ρ u , v  u , v  1 2 1 3 1     u1 u2 u3 = v1 v2 v3 ⎣ 0 u3 , v2 ⎦ . ρ2 0 0 ρ3 To verify the asserted uniqueness, suppose that there were two such factorizations: A = Q1 R1 and A = Q2 R2 . Then H H H H R1H R1 = R1H QH 1 Q1 R1 = A A = R2 Q2 Q2 R2 = R2 R2

and hence

R1 (R2 )−1 = (R1H )−1 R2H . Therefore, since the left-hand side of the last equality is upper triangular and the right-hand side is lower triangular, it follows that D = R1 (R2 )−1 is a diagonal matrix with positive diagonal entries djj for j = 1, . . . , q and, as R2H R2 = R1H R1 = R2H D H DR2 , that D H D = Iq . Thus, |djj |2 = 1 and hence as djj > 0, D = Iq , R1 = R2 ,  and Q1 = Q2 , as claimed. Exercise 12.20. Show that if A ∈ C n×n is invertible, then there exists an invertible lower triangular matrix C such that the columns of CA are orthonormal.

12.6. Supplementary notes The factorization A = QR established in Lemma 12.5 for matrices A ∈ C p×q with rank A = q is called the QR factorization of A. There is a very beautiful formula for the columns of Q when p = q and A is invertible: Theorem 12.6. The columns q1 , . . . , qn of the matrix Q in the QR factorization of an invertible matrix A ∈ C n×n can be expressed in terms of the columns a1 , . . . , an of A by the formula   det Gk 1/2 Ak G−1 (12.21) qk = k fk f or k = 2, . . . , n , det Gk−1   Ak is the Gram matrix of the columns where Ak = a1 · · · ak , Gk = AH  T k k of Ak , and fk = 0 · · · 0 1 ∈ R .

12.6. Supplementary notes

135

Proof. Since R is upper triangular, it is readily checked that Ak = Qk R[1,k] ,   where Qk = q1 · · · qk , R[1,k] denotes the upper left k × k corner of R, and hence that Qk = Ak (R[1,k] )−1 . Therefore, qk = Ak (R[1,k] )−1 fk . H H However, as QH k Qk = Ik , Gk = Ak Ak = (R[1,k] ) R[1,k] . Consequently, −1 H (R[1,k] )−1 fk = G−1 k (R[1,k] ) fk = Gk fk rkk ,

since rkk > 0. The final formula (12.21) is obtained by noting that −2 −1 −1 H = fkH R[1,k] (R[1,k] ) fk = fkH G−1 rkk k fk = G{k;k} / det Gk = det Gk−1 / det Gk .



Chapter 13

Normal matrices

A matrix A ∈ C n×n is said to be • normal if AH A = AAH , • Hermitian if AH = A, • skew-Hermitian if AH = −A, • unitary if AH A = In and AAH = In (but keep Exercise 13.1 in mind). A real unitary matrix is also called an orthogonal matrix. Permutation matrices are orthogonal matrices. A matrix A ∈ C p×q is said to be • isometric if AH A = Iq . Warning: The definitions of normal, unitary, and isometric matrices provided above are linked to the standard inner product. In particular, the term isometric stems from the fact that if A ∈ C p×q and AH A = Iq , then Ax, Axst = x, AH Axst = x, xst , i.e., the norm is preserved: Axst = xst . If A ∈ C p×q , U = C q , and V = C p are equipped with arbitrary inner products, then Ax, AxV = x, xU ⇐⇒ A∗ A = Iq , i.e., AH should be replaced by A∗ . Correspondingly, A is isometric if A∗ A = Iq ; if p = q, then A is normal if A∗ A = AA∗ and unitary if A∗ A = Ip ; see Example 11.2 and Theorem 13.10. Exercise 13.1. Show that if A ∈ C p×q , then AH A = Iq and AAH = Ip ⇐⇒ AH A = Iq and p = q (13.1)

⇐⇒ AH A = Iq and RA = C p ⇐⇒ AH A = Iq and A is invertible . 137

138

13. Normal matrices

Exercise 13.2. Show that if A ∈ C n×n is a cyclic matrix, then AH A = AAH .

13.1. Normal matrices The main result of this section is Theorem 13.1. As a byproduct of this theorem we shall see that the class of n × n normal matrices is the largest class of matrices A ∈ C n×n that admit a factorization of the form (13.2) A = U DU H ,

with U ∈ C n×n unitary and D ∈ C n×n diagonal .

Exercise 13.3. Show that a matrix A ∈ C n×n admits a factorization of the form (13.2) with D ∈ R n×n if and only if A = AH . Theorem 13.1. If A ∈ C n×n , then there exists an orthonormal family of eigenvectors {u1 , . . . , un } of A in C n (equipped with the standard inner product) if and only if AH A = AAH . Moreover: (1) If A = AH , then AH A = AAH and σ(A) ⊂ R. (2) If A = −AH , then AH A = AAH and σ(A) ⊂ iR. (3) If AH A = In , then AH A = AAH and σ(A) ⊂ {λ ∈ C : |λ| = 1}. Proof. If {u1 , . . . , un } is a family of eigenvectors of A in C n that are orthonormal in the standard inner product and Auj = λj uj for j = 1, . . . , n,   H u · · · u then the matrix U = 1 n is unitary (i.e., U U = In ) and AU = U D

with D = diag{λ1 , . . . , λn } .

Therefore, A = U DU H , AH = U D H U H , and, as D H D = DD H , AH A = (U D H U H )(U DU H ) = U D H DU H = U DD H U H = (U DU H )(U D H U H ) = AAH . Thus, AH A = AAH ; the same argument shows that if A ∈ C n×n admits a factorization of the form (13.2), then AH A = AAH . The verification of the converse and assertions (1)–(3) is broken into steps. 1. If A ∈ C n×n and AH A = AAH , then (A − λIn )u = (AH − λIn )u for every vector u ∈ C n and every point λ ∈ C. This is an easy consequence of the fact that (13.3) (AH −λIn )(A−λIn ) = (A−λIn )(AH −λIn )

for every point λ ∈ C .

13.1. Normal matrices

139

2. If A ∈ C n×n and AH A = AAH , then N(A−λIn )2 = N(A−λIn ) for every point λ ∈ C. If u ∈ N(A−λIn )2 and v = (A − λIn )u, then, in view of step 1, 0 = (A − λIn )v = (AH − λIn )v =⇒ (AH − λIn )v = 0 . Therefore, 0 = (AH − λIn )(A − λIn )u, u = (A − λIn )u2 , which implies that N(A−λIn )2 ⊆ N(A−λIn ) . Since the opposite inclusion is self-evident, this completes the proof of this step. 3. If A ∈ C n×n and AH A = AAH , then there exists an orthonormal family of eigenvectors {u1 , . . . , un } of A. Suppose that Uk = span{u1 , . . . , uk } is the span of k orthonormal eigenvectors of A for some positive integer k with k < n and let Uk⊥ denote the orthogonal complement of U with respect to the standard inner product. Then Uk⊥ = {0}. The key observation is that Uk⊥ is invariant under AH , i.e., if v ∈ Uk⊥ , then AH v ∈ Uk⊥ : uj , AH v = Auj , v = λj uj , v = 0 for j = 1, . . . , k . Consequently, there exists an eigenvector w of AH in Uk⊥ with w = 1. But if AH w = βw, then, in view of step 1, Aw = βw, i.e., w is an eigenvector of A that is orthogonal to Uk . Therefore, Uk+1 = span{u1 , . . . , uk , w} is an orthonormal family of k + 1 eigenvectors of A. If k + 1 = n, then we are finished. If k + 1 < n, then we repeat the procedure n − (k + 1) more times. 4. Verification of (1)–(3). It is easily seen that if AH = ±A, or if A ∈ C n×n and AH A = In , then = AAH (see Exercise 13.1 for help with the last assertion).

AH A

If Auj = λj uj , uj  = 1, and A = AH , then λj uj , uj  = Auj , uj  = uj , Auj  = uj , λj uj  = λj uj , uj  . Therefore, (1) holds. If Auj = λj uj , uj  = 1, and AH A = In , then |λj |2 uj , uj  = Auj , Auj  = uj , AH Auj  = uj , uj  . Therefore, (3) holds; (2) is left to the reader as an exercise. Exercise 13.4. Show that if A ∈ C n×n and AH = −A, then σ(A) ⊂ iR.



140

13. Normal matrices

Exercise 13.5. Show that if A ∈ R n×n , AT = −A, and n is odd, then det A = 0. Exercise 13.6. Show that if A ∈ C n×n and AH A = AAH , then rσ (A) = A. Exercise 13.7. Find the Jordan decomposition of the matrix ⎡ ⎤ 0 0 9 A = ⎣0 1 0⎦ . 0 0 1 [HINT: First check that A is a projection.] Exercise 13.8. Show that if A ∈ C n×n and A2 = A, then A is an orthogonal projection if and only if A is normal. Exercise 13.9. Show that if A ∈ C p×q , then (13.4)

AH A = Iq ⇐⇒ Ax, Axst = x, xst

for every x ∈ C q .

[HINT: Use the polarization identities to check that Bx, x = x, x for all vectors x ∈ C q if and only if Bx, y = x, y for all vectors x, y ∈ C q .]

Some elementary facts to keep in mind. (1) If U ∈ C n×n and U H U = In , then the columns of U form an orthonormal basis of C n with respect to the standard inner product. (2) If U ∈ R n×n and U H U = In , then the columns of U form an orthonormal basis of R n with respect to the standard inner product. (3) If A ∈ R n×n , then A is Hermitian if and only if it is symmetric. But this is not true if A ∈ C n×n . Thus, for example,



 1 i 1 i = B H = B T . A= = AT = AH , whereas B = −i 1 i 1 (4) (13.4) Exercise 13.10. Show that if A ∈ C p×q , then RA ∩ NAH = {0} and hence that if A = AH , then p = q and RA ∩ NA = {0}.

13.2. Schur’s theorem Theorem 13.1 ensures that every normal matrix is unitarily equivalent to a diagonal matrix, i.e., it admits a factorization of the form exhibited in (13.2). A theorem of Issai Schur states that every square matrix is unitarily

13.2. Schur’s theorem

141

equivalent to a triangular matrix: Theorem 13.2 (Schur). If A ∈ C n×n , then there exist a matrix V ∈ C n×n and an upper triangular matrix T such that V H V = In and A = V TV H

(13.5)

is upper triangular. Moreover, V can be chosen so that the diagonal entries of T coincide with the eigenvalues of A, repeated according to their algebraic multiplicity. Proof. The Jordan decomposition theorem guarantees that there exists an invertible matrix U ∈ C n×n such that A = U JU −1 . Since U is invertible, it has a QR decomposition U = QR with a unitary factor Q ∈ Cn×n and an upper triangular invertible factor R ∈ C n×n with positive entries on the diagonal. Thus, A = QRJR−1 Q−1 = Q(RJR−1 )QH . Moreover, upon writing R = D1 + X1 , J = D0 + X0 , and R−1 = D2 + X2 , where Dj is diagonal and Xj is strictly upper triangular (i.e., upper triangular with zero entries on the diagonal), it is readily checked that RJR−1 = (D1 + X1 )(D0 + X0 )(D2 + X2 ) = D1 D0 D2 + X3 , where X3 is strictly upper triangular and hence as D1 D0 D2 = D0 D1 D2 and D1 D2 = In , the diagonal entries of RJR−1 coincide with the diagonal entries of J, which run through the eigenvalues of A, repeated according to their algebraic mul tiplicity. Thus, (13.5) holds with V = Q and T = RJR−1 . Theorem 13.3. If A ∈ C n×n with eigenvalues μ1 , . . . , μn , repeated according to their algebraic multiplicity, then (13.6)

n  j=1

|μj |2 ≤

n 

|aij |2 ,

i,j=1

with equality if and only if AH A = AAH . Proof. By Schur’s theorem, there exists a unitary matrix U such that U H AU = T is upper triangular and tii = μi for i = 1, . . . , n. Consequently, trace{AH A} = trace{U T H U H U T U H } = trace{U T H T U H } = trace{T H T U H U } = trace{T H T } ,

142

13. Normal matrices

and hence the identity n 

|tij | = 2

i,j=1

n 

|aij |2

i,j=1

is in force for the entries tij of T and aij of A. Therefore, (13.7)

n  i=1

|μi |2 =

n  i=1

|tii |2 ≤

n 

|tij |2 = trace AH A =

i,j=1

n 

|aij |2 ,

i,j=1

with equality if and only if tij = 0 when i = j, i.e., if and only if T is a diagonal matrix. But then AH A = U T H T U H = U T T H U H = AAH , since diagonal matrices commute. Consequently, equality in (13.7) implies that A is normal. The opposite implication is left to the reader.  Exercise 13.11. Show that if A ∈ C n×n , then | det A | ≤ An∞ nn/2 . [HINT: Use (13.6) and (9.11).]

13.3. Commuting normal matrices Lemma 13.4. If A, B ∈ C n×n and B H B = BB H , then (13.8)

AB = BA ⇐⇒ AB H = B H A .

Proof. If AB = BA, then AB m = B m A for every positive integer m. Therefore, Aϕ(B) = ϕ(B)A for every polynomial ϕ. Moreover, since B H B = BB H , B admits a representation of the form B = W DW H with D ∈ C n×n diagonal and W ∈ C n×n unitary. Consequently, ϕ(B) = ϕ(W DW H ) = W ϕ(D)W H

and

B H = W DH W H .

Thus, ϕ(B) = B H ⇐⇒ ϕ(D) = D H . If D has k distinct entries λ1 , . . . , λk with k ≥ 2 and ϕ(λ) = a0 + a1 λ + · · · + ak−1 λk−1 , then ⎡ ⎤⎡ ⎤ ⎡ ⎤ 1 λ1 · · · λk−1 λ1 a0 1 ⎢1 λ2 · · · λk−1 ⎥ ⎢ a1 ⎥ ⎢ λ2 ⎥ 2 ⎥ ⎢ ⎢ ⎥ ⎢ ⎥ ϕ(λj ) = λj for j = 1, . . . , k ⇐⇒ ⎢ . .. ⎥ ⎢ .. ⎥ = ⎢ .. ⎥ . ⎣ .. ⎦ ⎣ ⎦ ⎣ . ⎦ . . 1 λk · · ·

λk−1 k

ak−1

λk

Thus, as the Vandermonde matrix in the preceding display is invertible, the  j H polynomial ϕ(λ) = k−1 j=0 aj λ maps λj to λj . Consequently, ϕ(B) = B and AB = BA =⇒ A ϕ(B) = ϕ(B) A =⇒ A B H = B H A . This serves to complete the proof, since the implication ⇐= follows from  the implication =⇒ and the fact that (B H )H = B. Exercise 13.12. Show that if A, B, C ∈ C n×n , B H B = BB H , and C H C = CC H , then AB = CA ⇐⇒ AB H = C H A.

13.3. Commuting normal matrices

143

Theorem 13.5. If A ∈ C n×n and B ∈ C n×n are both normal matrices, then AB = BA if and only if there exists a unitary matrix U ∈ C n×n that diagonalizes both A and B. Proof. Suppose first that there exists a unitary matrix U ∈ C n×n such that DA = U H AU and DB = U H BU are both diagonal matrices. Then, since DA DB = DB DA , AB = U DA U H U DB U H = U DA DB U H = U DB DA U H = U DB U H U DA U H = BA . Suppose next that AB = BA and A has k distinct eigenvalues λ1 , . . . , λk with geometric multiplicities γ1 , . . . , γk , respectively. Then there exists a set matrices U1 ∈ C n×γ1 , . . . , Uk ∈ C n×γk such that U =   of k isometric U1 · · · Uk is unitary, AUj = λj Uj , and AH Uj = λj Uj . Therefore, ABUj = BAUj = λj BUj

for

j = 1, . . . , k ,

which implies that the columns of the matrix BUj belong to N(A−λj In ) and hence, since the columns of Uj form a basis for that space, that there exists a set of γj × γj matrices Cj such that BUj = Uj Cj

for j = 1, . . . , k .

Since AB = BA =⇒ AB H = B H A, by Lemma 13.4, analogous reasoning yields a set of γj × γj matrices Dj such that B H Uj = Uj Dj

for j = 1, . . . , k .

Moreover, CjH = (UjH BUj )H = UjH B H Uj = Dj

for j = 1, . . . , k

and CjH Cj = UjH B H Uj Cj = UjH B H BUj = UjH BB H Uj = DjH Dj = Cj CjH for j = 1, . . . , k, i.e., the Cj are all normal matrices. Therefore, upon writing Cj = Wj Δj WjH

for j = 1, . . . , k ,

with Wj unitary and Δj diagonal, and setting Vj = Uj Wj

for j = 1, . . . , k

and

W = diag{W1 , . . . , Wk } ,

it is readily seen that AVj = AUj Wj = λj Uj Wj = λj Vj

for

j = 1, . . . , k

and BVj = BUj Wj = Uj Cj Wj = Uj Wj Δj = Vj Δj for j = 1, . . . , k.   Thus, the matrix V = V1 · · · Vk = U W is a unitary matrix that serves to diagonalize both A and B. 

144

13. Normal matrices

Remark 13.6. The proof of Theorem 13.5 simplifies when k = n, because then every eigenvector of A is also an eigenvector of B.

13.4. Real Hermitian matrices In this section we shall show that if A = AH and A ∈ R n×n , then the unitary matrix U in the formula A = U DU H may also be chosen to belong to R n×n . Theorem 13.7. If A = AH and A ∈ R n×n , then there exist an orthogonal matrix Q ∈ R n×n and a real diagonal matrix D ∈ R n×n such that A = QDQT .

(13.9)

Proof. Let μ ∈ σ(A) and let u1 , . . . , u be a basis for the nullspace of the matrix B = A − μIn . Then, since B ∈ R n×n , the real and imaginary parts of the vectors uj also belong to NB : If uj = xj + iyj with xj and yj in R n for j = 1, . . . , , then B(xj + iyj ) = 0 =⇒ Bxj = 0 and Moreover, if  U = u1 · · ·

 u ,

 X = x1 · · ·

Byj = 0 for j = 1, . . . ,  .

 x ,

 and Y = y1 · · ·

 y ,

then U = X + iY and the formula

   I U= X Y iI implies that

  rank X Y ≥ rank U = .   Therefore,  of the columns in X Y form a basis for NB , and an orthonormal basis of  vectors in R n for NB may be found by invoking the Gram-Schmidt procedure. If A has k distinct eigenvalues λ1 , . . . , λk , let Qi , i = 1, . . . , k, denote the n × γi matrix that is obtained by stacking the vectors that are obtained by applying the procedure described above to Bi = A − λi In for i = 1, . . . , k. Then, AQi = λi Qi and     A Q1 · · · Qk = Q1 · · · Qk D, where D = diag{λ1 Iγ1 , . . . , λk Iγk } .   Moreover, the matrix Q = Q1 · · · Qk is an orthogonal matrix, since the columns in Qj form an orthonormal basis for N(A−λj In ) and the columns in  Qi are orthogonal to the columns in Qj if i = j. Lemma 13.8. If A ∈ R p×q , then max {Axst : x ∈ C q

and

xst = 1}

= max {Axst : x ∈ R q

and

xst = 1} .

13.5. Supplementary notes

145

Proof. Since A ∈ R p×q , AH A is a real q × q Hermitian matrix. Therefore, AH A = QDQT , where Q ∈ R q×q is orthogonal, D = diag{λ1 , . . . , λq } ∈ R q×q , and λj ≥ 0 for j = 1, . . . , q. Thus, if x ∈ C q , y = QT x, and δ = max {λj : j = 1, . . . , q}, then δ ≥ 0 and Ax2st = AH Ax, xst = QDQT x, xst = DQT x, QT xst = Dy, yst n n   = λj |yj |2 ≤ δ |yj |2 j=1

=

δy2st

j=1 T

= δQ x2st = δx2st .

Consequently, max {Axst : x ∈ C n

and

xst = 1} ≤

√ δ.

However, if δ = λk , then it is readily seen that this maximum can be attained by choosing x = Qek , the k’th column of Q. But this proves the claim, since  Qek ∈ R q .

13.5. Supplementary notes This chapter is partially adapted from Chapter 9 in [30]. The inequality in Exercise 13.11 is due to Hadamard; the simple proof based on (13.6) is due to Schur. Lemma 13.4 and Exercise 13.12 are finite-dimensional versions of theorems due to Fuglede and Putnam, respectively. Theorem 13.1 is a special case of a general result for linear transformations T from a finite-dimensional inner product space U into itself. In this setting, T is said to be normal if T ∗ T = T T ∗ ; selfadjoint if T = T ∗ ; and unitary if T ∗ T = I (here, too, just as in Exercise 13.1, T ∗ T = I ⇐⇒ T T ∗ = I when U is a finite-dimensional inner product space). Theorem 13.9. If T is a linear transformation from an n-dimensional inner product space U into itself, then: (1) There exists an orthonormal basis {u1 , . . . , un } of U of eigenvectors of T if and only if T is normal. (2) If T is selfadjoint, then the eigenvalues of T are all real. (3) If T is unitary, then the eigenvalues of T all have absolute value equal to one. The proof is much the same as the proof of Theorem 13.1. If T is normal, then every vector u ∈ U can be expressed in terms of the orthonormal

146

13. Normal matrices

 eigenvectors u1 , . . . , un of T as u = nj=1 u, uj U uj . Moreover, if T uj = λj uj for j = 1, . . . , n, then T ∗ uj = λj uj for j = 1, . . . , n. Correspondingly, (13.10)

Tu =

n 

λj u, uj U uj

j=1

and



T u=

n 

λj u, uj U uj

j=1

for every vector u ∈ U . It is easy to extract (2) and (3) from (13.10). Nevertheless, it is helpful to keep the following simple alternate argument in mind: If T uj = λj uj for j = 1, . . . , n and T = T ∗ , then (13.11)

λj uj , ui U = T uj , ui U = uj , T ∗ ui U = uj , T ui U = uj , λi ui U = λi uj , ui U ,

i.e., (λj − λi )uj , ui U = 0. Consequently, λi = λi for i = 1, . . . , n and uj , ui U = 0 for i, j = 1, . . . , n if i = j. For additional details, see, e.g., Section 8.12 of [30]. Thus, if T : x ∈ C n → Ax ∈ C n , then T ∗ : x ∈ C n → A∗ x ∈ C n and: Theorem 13.10. If U = C n equipped with the inner product x, yU = Gx, yst for some G ∈ C n×n with G  O, then A∗ = G−1 AH G and (13.12)

A∗ A = AA∗ ⇐⇒ there exist W, D ∈ C n×n with D diagonal such that AW = W D

and

W H GW = G (i.e., W ∗ W = In ) .

Moreover, A∗ = A =⇒ D ∈ R n×n and A∗ A = In =⇒ |dii | = 1.

Chapter 14

Projections, volumes, and traces

The first three sections of this chapter deal with projections; the fourth is a short detour on the angle between subspaces; the fifth gives a geometric interpretation to determinants; the sixth develops trace formulas.

14.1. Projection by iteration In this section we consider the problem of computing the orthogonal projection of a vector u in a finite-dimensional inner product space U onto the intersection V ∩ W of a pair of subspaces V and W of U in terms of the orthogonal projections ΠV of U onto V and ΠW of U onto W. Lemma 14.1. If V and W are subspaces of a finite-dimensional inner product space U and ΠV , ΠW , and ΠV∩W denote the orthogonal projections onto V, W, and V ∩ W, respectively, then lim ΠV∩W − (ΠV ΠW )k  = 0 .

(14.1)

k↑∞

Moreover, (14.2)

ΠV ΠW  < 1 ⇐⇒ V ∩ W = {0} .

Proof. The proof is divided into steps; the first serves to justify (14.2). 1. V ∩ W = {0} ⇐⇒ ΠV ΠW  = 1. If ΠV ΠW  = 1 and ΠV ΠW x = ΠV ΠW  x for some nonzero vector x ∈ C n , then ΠW  = ΠV  = 1 and (14.3)

x = ΠV ΠW x ≤ ΠV  ΠW x = ΠW x ≤ ΠW  x = x . 147

148

14. Projections, volumes, and traces

Consequently, equality holds throughout (14.3). In particular, (In − ΠW )x2 = x2 − ΠW x2 = 0 , which implies that x = ΠW x ∈ W; and (In − ΠV )x2 = x2 − ΠV x2 = x2 − ΠV ΠW x2 = 0 , which implies that x = ΠV x ∈ V. Therefore, x ∈ V ∩ W, i.e., ΠV ΠW  = 1 =⇒ V ∩ W = {0}. Conversely, if x ∈ V ∩ W and x = 0, then ΠW x = x and ΠV ΠW x = ΠV x = x. Therefore, x = ΠV ΠW x ≤ ΠV ΠW  x ≤ x , i.e., V ∩ W = {0} =⇒ ΠV ΠW  = 1. 2. Verification of (14.1). Let X = V ∩ W and consider the orthogonal decompositions V = X ⊕ V1

W = X ⊕ W1 .

and

Then, in self-evident notation, the orthogonal projections ΠV = ΠX + ΠV1 ,

ΠW = ΠX + ΠW1

and, as ΠX ΠV1 = ΠV1 ΠX = O and ΠX ΠW1 = ΠW1 ΠX = O, ΠV ΠW = (ΠX + ΠV1 )(ΠX + ΠW1 ) = (ΠX )2 + ΠV1 ΠW1 = ΠX + ΠV1 ΠW1 . Since ΠX ΠV1 ΠW1 = O = ΠV1 ΠW1 ΠX , we can invoke the binomial formula to obtain k    k k k (ΠX )k−j (ΠV1 ΠW1 )j (ΠV ΠW ) = (ΠX + ΠV1 ΠW1 ) = j j=0

k

k

= (ΠX ) + (ΠV1 ΠW1 ) = ΠX + (ΠV1 ΠW1 )k for k = 1, 2, . . .. Therefore, ΠV∩W − (ΠV ΠW )k  = (ΠV1 ΠW1 )k  ≤ ΠV1 ΠW1 k , which tends to zero as k ↑ ∞, since V1 ∩ W1 ⊆ V ∩ W = X =⇒ V1 ∩ W1 ⊆ V1 ∩ X = {0} , and hence, in view of (14.2), ΠV1 ΠW1  < 1.



14.2. Computing nonorthogonal projections

149

Exercise 14.1. Show that if V is a subspace of a finite-dimensional inner product space U , ΠV is an orthogonal projection, and u ∈ U , then (14.4)

ΠV uU = uU ⇐⇒ ΠV u = u .

  Exercise 14.2. Show that if the columns of the matrices V = X V1 ,   W = X W1 , and X form orthonormal bases for the subspaces V, W, and V ∩ W, respectively, then V1H W1  < 1 and (14.5)

ΠV∩W − (ΠV ΠW )k  = (ΠV1 ΠW1 )k  = (V1 V1H W1 W1H )k  .

[HINT: ΠV ΠW = (XX H + V1 V1H )(XX H + W1 W1H ) = Y + Z, where Y = XX H , Z = V1 V1H W1 W1H , and Y Z = ZY = O.] Exercise 14.3. Show that in the setting of Exercise 14.2, (14.6)

(ΠV1 ΠW1 )k  = V1H W1 2k−1 = ΠV1 ΠW1 2k−1 .

[HINT: (ΠV1 ΠW1 )k = V1 A(AH A)k−1 W1H with A = V1H W1 ; and Section 11.5.]

14.2. Computing nonorthogonal projections A formula for the orthogonal projection ΠV of an inner product space U onto a subspace V was presented in Theorem 12.4. This is the projection that is defined by the orthogonal sum decomposition U = V ⊕ V ⊥ . In this section we shall obtain a formula for the projection PVW of U onto V with ˙ respect to the direct sum decomposition U = V +W. To carry out the computations, it is again convenient to draw upon some facts from the theory of positive definite matrices that will be justified in Chapter 16: If G ∈ C k×k and xH Gx > 0 for every nonzero x ∈ C k , then G is positive definite and we indicate this by writing G  O. Moreover, if G ∈ C k×k and G  O, then G = GH and there exists exactly one matrix F ∈ C k×k such that F  O and F 2 = G. The symbol G1/2 is used to denote this matrix. To warm up, we first consider the special case in which U is a subspace of C n .   Theorem 14.2. If V ∈ C n×p , W ∈ C n×q , A = V W , and rank A = p + q, then the Gram matrix G = AH A is positive definite and the space U = RA is the direct sum of the spaces V = RV and W = RW . Moreover,   (14.7) PVW = V O G−1 AH

150

14. Projections, volumes, and traces

and PVW 2 =

(14.8)

1 . 1 − ΠV ΠW 2

Proof. Let u ∈ U . Then, as rank V = p, rank W = q, and V ∩ W = {0}, there exists exactly one vector a ∈ C p and exactly one vector b ∈ C q such that u = V a+W b. Thus, PVW u = V a. In terms of the blocks Gij , i, j = 1, 2, of the Gram matrix G, V H u = V H V a + V H W b = G11 a + G12 b and W H u = W H V a + W H W b = G21 a + G22 b . Moreover, since G  O, the matrices G11 , G22 , and G11 − G12 G−1 22 G21 are all positive definite. Thus, (14.9)

H a = G−1 11 (V u − G12 b)

H and b = G−1 22 (W u − G21 a) ,

and hence

  H −1 H H −1 (G11 − G12 G−1 22 G21 )a = V u − G12 G22 W u = Ip −G12 G22 A u .

Formula (14.7) drops out easily upon invoking the identity     −1 Ip O G−1 = (G11 − G12 G−1 (14.10) Ip −G12 G−1 22 G21 ) 22 , which is readily obtained from the Schur complement identity (3.19). To verify (14.8), observe first that 5

H 5 5 5  −1 H W 2 W W H −1 V 5 5 PV  = PV (PV )  = 5 V O G A AG O 5 −1 H = V (G11 − G12 G−1 22 G21 ) V  −1/2

= V G11

−1/2

(Ip − G11

−1/2

= (Ip − G11

−1/2 −1

G12 G−1 22 G21 G11 −1/2 −1

G12 G−1 22 G21 G11

)

)

−1/2

G11

V H

.

Thus, if rank G12 = r, then the isometric factor V1 in the singular value −1/2 −1/2 decomposition G11 G12 G22 = V 1 S1 U1H is of size p × r. If r < p, then there exists a matrix V2 such that V1 V2 is unitary and hence PVW 2 = (Ip − V1 S1 U1H U1 S1 V1H )−1  5

 H −1 5 5 5  2  − S O I V1 5 5 r 1 = 5 V1 V2 5 O Ip−r V2H 5 5 5 5  −1 5 5 1 5 5 I − S12 O , =5 r 5= O Ip−r 5 1 − s21 5

14.3. The general setting

151

where −1/2

s1 = G11

−1/2

G12 G22

−1 H  = V G−1 11 G12 G22 W  = ΠV ΠW  .

Since the same conclusion holds when r = p, the proof is complete.



Exercise 14.4. Verify formula (14.10). Exercise 14.5. Show that in the setting of Theorem 14.2, (14.11)

PVW = (In − ΠV ΠW )−1 ΠV (In − ΠV ΠW ).

H = Π , V G−1 G W H = [HINT: Invoke (14.9) and the identities V G−1 V 11 V 11 12 −1 G G G = Π Π V .] ΠV ΠW , and V G−1 12 21 V W 11 22

Exercise 14.6. Show that if V ∈ C n×p and W ∈ C n×q are isometric matrices, then (14.12)

min{V a − W b2 : a = b = 1} = 2(1 − ΠV ΠW ) .

[HINT: First show that the left-hand side of (14.12) is ≥ 2(1 − V H W ).]   Exercise 14.7. Show that if A = V W with V ∈ C n×p , W ∈ C n×q , and G = AH A, then rank W = q =⇒ V a − ΠW V a2 = (G11 − G12 G−1 22 G21 )a, a and rank V = p =⇒ W a − ΠV W a2 = (G22 − G21 G−1 11 G12 )a, a .

14.3. The general setting In this section we obtain formulas for PVW and PVW {U, U} when V and W are subspaces of a finite-dimensional inner product space U . To distinguish between the two inner products that will be in play, the subscript U is added to all norms and inner products in U ; the standard inner product and norm are not marked. ˙ Theorem 14.3. If U = V +W, {v1 , . . . , vp } is a basis for V, {w1 , . . . , wq } is a basis for W, and n = p + q, then the Gram matrix G ∈ Cn×n of the vectors {v1 . . . , vp , w1 , . . . , wq }, G11 , the Gram matrix for {v1 , . . . , vp }, G22 , the Gram matrix for {w1 , . . . , wq }, and G11 − G12 G−1 22 G21 are all positive definite. Moreover, (14.13)

PVW u =

p  s=1

as vs ,

152

14. Projections, volumes, and traces

where



T

−1 −1 = a = (G11 − G12 G−1 22 G21 ) (c − G12 G22 d) ,  T T  and d = u, w1 U , . . . , u, wq U , c = u, v1 U , . . . , u, vp U

(14.14)

a1 · · ·

ap

and PVW 2{U, U} =

(14.15)

1 1−

−1/2 −1/2 G11 G12 G22 2

=

1 . 1 − ΠV ΠW 2{U, U}

Proof. Since the sum is direct, G , G11 , G22 , and G11 − G12 G−1 22 G21 are all positive definite (see (3.19) for the latter). The rest of the proof is divided into parts. 1. Verification of (14.13) and (14.14). If (14.16)

p 

u=

s=1

as vs +

q 

bt wt ,

t=1

then u, vi U =

p 

as vs , vi U +

q 

s=1

p q   bt wt , vi U = (G11 )is as + (G12 )it bt

t=1

s=1

t=1

for i = 1, . . . , p and u, wi U =

p 

as vs , wi U +

s=1

q 

p q   bt wt , wi U = (G21 )is as + (G22 )it bt

t=1

s=1

t=1

for i = 1, . . . , q. Therefore, c = G11 a + G12 b and d = G21 a + G22 b , T with b = b1 · · · bq ; (14.13) and (14.14) are then obtained by straightforward computation. The details are left to the reader. 

2. Verification of (14.15). −1/2

Let Q = G11

−1/2

G12 G22

. Then, since

−1/2

Iq − QH Q = G22

−1/2

(G22 − G21 G−1 11 G21 )G22

O

it follows that Q < 1 and hence that Q = cos θ for some θ ∈ (0, π/2]. Thus, in view of formulas (14.16) and (14.14), it is readily checked that PVW u2U = G11 a2 1/2

1/2

1/2

and u2U = G11 a2 + G22 b2 + 2G12 b, a ,

14.3. The general setting

153

1/2

1/2

which, upon setting x = G11 a and y = G22 b, implies that PVW u2U x2 x2 ≤ = x2 + y2 + 2 Qy, x x2 + y2 − 2Qyx u2U x2 x2 = x2 + y2 − 2 cos θyx (y − x cos θ)2 + x2 sin2 θ 1 . ≤ sin2 θ

=

This supplies the bound PVW {U, U} ≤ 1/ sin θ. To complete the proof it remains to show first that this upper bound can be achieved (by choosing y = cos θ u1 and x = −v1 , where u1 and v1 are the first columns in the isometric factors U1 and V1 in the singular value decomposition Q = V1 S1 U1H , respectively, and cos θ = s1 (Q) = Q) and then to ver ify that ΠV ΠW {U, U} = Q. (See Exercises 14.9 and 14.10.) Exercise 14.8. Show that if W is orthogonal to V, then formula (14.14) reduces to a = G−1 11 c. Exercise 14.9. Show that in the setting of Theorem 14.3, ΠV ΠW {U, U} = max {ΠV ΠW wU : w ∈ W and wU = 1} . Exercise 14.10. Show that in the setting of Theorem 14.3, −1/2

ΠV ΠW {U, U} = G11

−1/2

G12 G22

.

Theorem 14.4. In the setting and notation of Theorem 14.3, the projection PVW can be expressed in terms of the orthogonal projections ΠV and ΠW by the formula (14.17)

PVW u = (IU − ΠV ΠW )−1 ΠV (IU − ΠV ΠW )u .

Proof. The proof rests on the evaluations ΠV wi = ΠW vj = ΠV u =

p  s=1 q  t=1 p  s=1

(G−1 11 G12 )si vs

for i = 1, . . . , q ,

(G−1 22 G21 )tj wt

for j = 1, . . . , p ,

(a + G−1 11 G12 b)s vs

 and the formula PVW u = ps=1 as vs , where a1 , . . . , ap are the entries in the vector a that is specified in (14.14). The proof amounts to checking the

154

14. Projections, volumes, and traces

following sequence of equalities: (IU − ΠV ΠW ) PVW u = (IU − ΠV ΠW )

p 

as vs

s=1

=

p 

−1 ((Ip − G−1 11 G12 G22 G21 )a)s vs

s=1 p  −1 = ΠV u − (G−1 11 G12 (G22 G21 a + b))s vs s=1

= ΠV u − ΠV ΠW u = ΠV (IU − ΠV ΠW )u , which yields (14.17), since IU − ΠV ΠW is invertible.



Notice that formulas (14.13) and (14.17) for PVW u depend upon both V and W. But, if V is orthogonal to W, then G12 = O and ΠV ΠW = O and hence PVW u = ΠV u, which depends only on V. Exercise 14.11. Let  A = a1 a2 a3

⎡ 1  ⎢0 a4 = ⎢ ⎣0 0

1 1 0 0

1 1 1 0

⎤ 1 1⎥ ⎥, 1⎦ 1

⎡ ⎤ 1 ⎢2⎥ ⎥ u=⎢ ⎣3⎦ , 4

V = span{a1 , a2 }, W1 = span{a3 , a4 }, and W2 = span{a1 + a3 , a2 + a4 }. ˙ 1 and C4 = V +W ˙ 2 , i.e., both sums are direct. (a) Show that C4 = V +W (b) Find the projection of the vector u onto V with respect to the ˙ 1. decomposition C4 = V +W (c) Find the projection of the vector u onto V with respect to the ˙ 2. decomposition C4 = V +W (d) Compute the orthogonal projection of u onto V. [REMARK: Exploit the fact that A is block triangular to compute A−1 .]

14.4. Detour on the angle between subspaces The symbol cos θ was introduced as a convenient shorthand for the norm of the operator Q in step 2 of the proof of Theorem 14.3. But there is more to this choice of notation than the fact that Q ≤ 1. The story begins with a definition: The angle θ between a pair of subspaces V and W of a finite-dimensional inner product space U is defined in the interval [0, π/2] by the formula (14.18) cos θ = max{|v, w| : v ∈ V, w ∈ W,

and

v = w = 1} .

14.4. Detour on the angle between subspaces

155

The next few exercises are designed to give just a little additional insight. They can be skipped without loss of continuity. Exercise 14.12. Show that the angle θ defined in (14.18) can also be expressed in terms of the orthogonal projectors ΠV and ΠW by the formula cos θ = ΠV ΠW . Exercise 14.13. Show that in the setting of Exercise 14.12, (14.19) min {ΠV w2 : w ∈ W

and

w = 1} = 1 − (I − ΠV )ΠW 2

and (14.20) min {(I − ΠV )w2 : w ∈ W

and

w = 1} = 1 − ΠV ΠW 2 .

[HINT: w2 = ΠV w2 + (I − ΠV )w2 .] In view of (14.18) and (14.20), (14.21)

sin2 θ = min {(I − ΠV )w2 : w ∈ W

and

w = 1} .

Exercise 14.14. Show that if V and W are subspaces of a finite-dimensional inner product space U , then (14.22)

w ∈ W =⇒ min{v − w : v ∈ V} = (I − ΠV )w

and (14.23)

v ∈ V =⇒ min{v − w : w ∈ W} = (I − ΠW )v .

Exercise 14.15. Show that if, in the notation of Exercise 14.14, we set d(v; W) = min{v − w : w ∈ W} (which is a reasonable measure of the distance from a point v ∈ V to W), then (14.24)

max {d(v, W) : v ∈ V

and v = 1} = (I − ΠW )ΠV  .

Exercise 14.16. Show that (14.25)

(ΠV − ΠW )u2 = (ΠV (I − ΠW ))u2 + ((I − ΠV )ΠW )u2

for every vector u ∈ U and (14.26)

ΠV − ΠW  = max{ΠV (I − ΠW ), (I − ΠV )ΠW } .

[HINT: To obtain (14.26) from (14.25), let u = cos θw + sin θx, where w ∈ W, x is orthogonal to W, and w = x = 1.]

156

14. Projections, volumes, and traces

14.5. Areas, volumes, and determinants To warm up, consider the following: Exercise 14.17. Let a, b ∈ R 2 , let V = [a b] denote the 2×2 matrix with columns a and b in the first quadrant, and let G = V H V denote the Gram matrix of the given vectors. Then the area of the parallelogram generated by a and b is equal to (det G)1/2 . Lemma 14.5. Let a, b ∈ R n , let V = [a b] denote the n × 2 matrix with columns a and b, and let G = V H V denote the Gram matrix of the given vectors. Then the area of the parallelogram generated by a and b is equal to (det G)1/2 . Proof. Suppose first that a and b are linearly independent and let A = span{a}. Since span{a} = span{−a} and det G does not change if a is replaced by −a, we can suppose without loss of generality that the angle θ between a and b is between 0 and π/2. Then b admits the orthogonal decomposition b = ΠA b + (I − ΠA )b and the area α of the parallelogram is equal to a b sin θ = a (I − ΠA )b. Consequently, α2 = (I − ΠA )b2 a2 = (I − ΠA )b, (I − ΠA )b a2 = (I − ΠA )b, b a2 = b2 a2 − ΠA b, b a2 . Thus, as ΠA b = a(aH a)−1 aH b = b, aa−2 a by formula (12.15), (14.27)

α2 = a2 b2 − |a, b|2 .

To complete the proof, observe that the Gram matrix

H  a H [a b] G = V V = bH   

H a2 b, a a a aH b = = bH a bH b a, b b2 and hence that (14.28)

det G = a2 b2 − |a, b|2 = α2 .

This completes the proof of the asserted formula when a and b are linearly independent. Formula (14.28) remains valid, however, even if a and b are linearly dependent because then both sides are equal to zero. 

14.5. Areas, volumes, and determinants

157

As a byproduct of the proof of the last lemma we obtain the formula (14.29)

|a, b|2 = a2 b2 − α2 .

This yields another proof of the Cauchy-Schwarz inequality for vectors in R n: |a, b| ≤ ab with equality if and only if the area is equal to zero, i.e., if and only if a and b are linearly dependent. Formula (14.28) is a special case of the formula for the volume of the parallelepiped ⎧ ⎫ k ⎨ ⎬  P(v1 , . . . , vk ) = t1 v1 + · · · + tk vk : tj ≥ 0 for j = 1, . . . , k and tj = 1 ⎩ ⎭ j=1

generated by the vectors v1 , . . . , vk in R n that is defined inductively by the formula (14.30)

vol P(v1 , . . . , vj+1 ) = vol P(v1 , . . . , vj )(In − ΠVj )vj+1  ,

where ΠVj denotes the orthogonal projection of R n onto Vj = span{v1 , . . . , vj } . Theorem 14.6. If 

G11 G12 G= G21 G22

with G11 ∈ R(k−1)×(k−1) and G22 ∈ R

is the Gram matrix of a set of vectors {v1 , . . . , vk } in R n , then: (1) The volume of the parallelepiped generated by these vectors is given by the formula (14.31)

vol P(v1 , . . . , vk ) = (det G)1/2 .

(2) If G11 is invertible, (14.32)

(In − ΠVk−1 )vk 2 = G22 − G21 G−1 11 G12 .

Proof. If the vectors v1 , . . . , vk−1 are linearly dependent, then det G11 = 0 and det G = 0. Thus, in view of (14.30) with j = k −1, both sides of (14.31) are equal to zero. If the vectors v1 , . . . , vk−1 are linearly independent and   V = v1 · · · vk−1 , H then V H V = G11 , V H vk = G12 = GH 21 , vk vk = G22 , and

(In − ΠVk−1 )vk 2 = (In − V (V H V )−1 V H )vk , vk  = vkH vk − vkH V (V H V )−1 V H vk = G22 − G21 G−1 11 G12 ,

158

14. Projections, volumes, and traces

which justifies (2) and so too the formula {vol P(v1 , . . . , vk )}2 = {vol P(v1 , . . . , vk−1 )}2 (G22 − G21 G−1 11 G12 ) for every k ≥ 3. Thus, if (14.31) holds for k − 1 vectors, then the first factor on the right is equal to det G11 and hence, as (G22 − G21 G−1 11 G12 ) ∈ R in this case, {vol P(v1 , . . . , vk )}2 = det G11 det (G22 − G21 G−1 11 G12 ) = det G, by (7.3). Consequently, if (14.31) holds for k, then it holds for k + 1. Thus, as (14.31) holds for k = 2, it holds for all positive integers k ≥ 2. 

14.6. Trace formulas The trace of a linear transformation T that maps an inner product space U into itself is defined by the formula (14.33)

trace T =

n 

T uj , uj U ,

j=1

where {u1 , . . . , un } is any orthonormal basis for U . The fact that the sum in (14.33) is independent of the choice of the orthonormal basis will be established in Lemma 14.7. This is perhaps less sur  prising if you keep in mind that if U = C n , A ∈ C n×n , In = e1 · · · en ,   and {u1 , . . . , un } is an orthonormal basis for C n , then U = u1 · · · un is unitary and, in view of (7.13) and (7.19), n 

Auj , uj  = trace U H AU = trace A =

j=1

n 

ajj =

j=1

n  Aej , ej  . j=1

Thus, formula (7.13) is consistent with formula (14.33). Lemma 14.7. Let T be a linear transformation from an inner product space U into itself and let {u1 , . . . , un } and {w1 , . . . , wn } be any two orthonormal bases for U . Then n 

(14.34)

T uj , uj U =

j=1

n 

T wj , wj U .

j=1

Proof. Since T uj =

n  i=1

T uj , wi U wi

and

uj =

n  k=1

uj , wk U wk ,

14.6. Trace formulas

159

formula (12.5) ensures that n 

T uj , uj U =

j=1

- n n  

. T uj , wi U uj , wi U

j=1

i=1

i=1

j=1

⎧ ⎫ n ⎨ n ⎬  wi , uj U T ∗ wi , uj U = ⎩ ⎭ =

n 



wi , T wi U =

i=1

n 

T wi , wi U ,

i=1



as claimed. Exercise 14.18. Show that in the setting of Lemma 14.7 (14.35)

n 

T uj 2U =

j=1

n 

T wj 2U =

j=1

n 

T ∗ uj 2U .

j=1

Lemma 14.8. Let U and V be finite-dimensional inner product spaces and let T be a linear transformation from U into V and let S be a linear transformation from V into U . Then (14.36)

trace ST = trace T S.

Proof. Let u1 , . . . , uq be an orthonormal basis for U and let v1 , . . . , vp be an orthonormal basis for V. Then, in view of the definition of the adjoint of a linear transformation and (12.5), - p . q q q     ∗ ST uj , uj U = T uj , S uj V = T uj , vi V S ∗ uj , vi V j=1

j=1

j=1

i=1

⎧ ⎫ p ⎨ q ⎬  Svi , uj U T ∗ vi , uj U = ⎩ ⎭ i=1

=

p 

j=1

Svi , T ∗ vi U =

i=1

p 

T Svi , vi V ,

i=1



which justifies (14.36).

Corollary 14.9. If T is a linear transformation from a finite-dimensional inner product space U into an inner product space V and if {u1 , . . . , uq } is an orthonormal basis for U and {v1 , . . . , vp } is an orthonormal basis for RT , then (14.37)

q  j=1

T uj 2V

=

p  i=1

T ∗ vi 2U .

160

14. Projections, volumes, and traces

Proof. This is immediate from Lemma 14.8, upon viewing T as a transformation onto the finite-dimensional subspace RT of V and then setting  S = T ∗.

14.7. Supplementary notes This chapter was partially adapted from Chapters 8 and 9 of [30]. Section 14.1 was motivated by a report by Klaus Diepold [24] on the design of sensors for cars. A formula for PVW  appears in the paper [53] by Krein and Spitkovsky. Exercises 14.14–14.16 are adapted from the discussion in the book [39] by Glazman and Ljubic of (in their terminology) the aperture between subspaces.

Chapter 15

Singular value decomposition

In this chapter we introduce the singular value decomposition for matrices A ∈ C p×q . It is convenient to first review some facts about matrices A ∈ C p×q that preserve norm: Au = u for every vector u ∈ C q . • 1 Warning: Recall that, unless explicitly indicated otherwise, u = u, u and u, v = u, vst for vectors u, v ∈ C n and A = A2,2 for matrices A. • A matrix A ∈ C p×q is said to be isometric if AH A = Iq . The name stems from the fact that (15.1)

AH A = Iq ⇐⇒ Ax, Ax = x, x

for every x ∈ C q .

If A ∈ C p×q is isometric, then rank A = q and hence p ≥ q. Thus, there are two possibilities: (1) If p = q, then AH A = Ip =⇒ AAH = Ip . (2) If p > q, then AH A = Ip =⇒ AAH = A(AH A)−1 AH = ΠRA , the orthogonal projection of C p onto RA . In case (1), A and AH are both isometric; in case (2), A is an isometry and AH is a partial isometry, i.e., it is isometric on the orthogonal complement of NAH (i.e., on RA ). • A matrix A ∈ C p×q is said to be unitary if it is both isometric and invertible, i.e., if AH A = Iq

and

AAH = Ip ,

which is only possible if q = p.

If U1 ∈ C n×p is isometric and n − p = q ≥ 1, then there exists a second isometric matrix U2 ∈ C n×q such that U = U1 U2 is a unitary matrix. 161

162

15. Singular value decomposition

Thus,



H     U1H  Op×q U1 U1 U1H U2 Ip U1 U2 = In = U U = = , U2H U2H U1 U2H U2 Oq×p Iq

   U1H H = U1 U1H + U2 U2H = ΠRU1 + ΠRU2 , In = U U = U1 U2 U2H H

and U1H and U2H are both partial isometries. In particular, x, x = (U1 U1H + U2 U2H )x, x = U1H x, U1H x + U2H x, U2H x ≥ U1H x, U1H x ,

with equality if and only if x ∈ RU1 .

Exercise 15.1. Justify the equivalence in (15.1). [HINT: Exploit Exercise 11.24.] Exercise 15.2. Show that if A ∈ C p×p , then AH A = Ip ⇐⇒ AAH = Ip and hence every isometric matrix A ∈ C p×p is automatically unitary.

15.1. Singular value decompositions The statement of the next theorem is rather long. The main fact to focus on at a first pass is (15.2). The remaining assertions can be absorbed later, as needed. Theorem 15.1. If A ∈ C p×q and rank A = r, with r ≥ 1, then there exist a pair of isometric matrices     U1 = u1 · · · ur ∈ C q×r and V1 = v1 · · · vr ∈ C p×r and a diagonal matrix S1 = diag{s1 , . . . , sr } ∈ Rr×r

with

s1 ≥ · · · ≥ sr > 0

such that (15.2)

A=

r 

H sj vj uH j = V1 S1 U1 .

j=1

Moreover, RA = RV1 , NA = RU2 , RAH = RU1 , NAH = RV2 , and: (1) V1 V1H = ΠRA , the orthogonal projection of C p onto RA , and V1 U1H maps RAH isometrically onto RA . (2) U1 U1H = ΠRAH , the orthogonal projection of C q onto RAH , and U1 V1H maps RA isometrically onto RAH . p×(p−r) is any isometric matrix such that V = (3) If  r < pand V2 ∈ C V1 V2 is unitary, then V2 V2H = ΠNAH , the orthogonal projection of C p onto NAH .

15.1. Singular value decompositions

163

(4) If r < q and U2 ∈ C q×(q−r) is any isometric matrix such that U = U1 U2 is unitary, then U2 U2H = ΠNA , the orthogonal projection of C q onto NA .  (5) AH = rj=1 sj uj vjH = U1 S1 V1H . (6) If also X1 ∈ C q×r and Y1 ∈ C p×r are isometric matrices such that A = V1 S1 U1H = Y1 S1 X1H , then V1 U1H = Y1 X1H . Proof. Let A ∈ C p×q with rank A = r ≥ 1. Then, since AH A ∈ C q×q  and (AH A)H = AH A, there exists a unitary matrix U = u1 · · · uq such that AH AU = U D with D = diag{μ1 , . . . , μq } ∈ R q×q . Thus, rank A = rank AH A = rank D and μj = μj uj , uj  = AH Auj , uj  = Auj , Auj  ≥ 0 . Consequently, upon setting μj = s2j with sj ≥ 0 and rearranging the indices, if need be, so that s1 ≥ s2 ≥ · · · ≥ sq , the preceding formula for AH A can be expressed as ⎤ ⎡ 2 s1 0 · · · 0 ⎢ 0 s2 · · · 0 ⎥ ⎥ ⎢ (15.3) AH AU = U ⎢ . .. ⎥ with s1 ≥ s2 ≥ · · · ≥ sq ≥ 0 . . . . ⎣. . .⎦ 0

0

···

s2q

Formula (15.3) implies that Auj  = sj for j = 1, . . . , q and rank A = r ⇐⇒ sr > 0 and sj = 0 for j > r (if r < q) . Let

vj = s−1 j Auj

for j = 1, . . . , r .

Then −1 −1 AH Auj , uk  vj , vk  = s−1 j Auj , sk Auk  = {sj sk } ( 1 if j = k , −1 = sj sk uj , uk  = 0 if j = k .

Thus,

 AU1 = A u1 · · ·

  ur = Au1 · · ·

  Aur = s1 v1 · · ·

 sr vr = V1 S1 .

H If  r = q, then U1 = U is unitary and A = V1 S1 U1 . If r < q and U2 = ur+1 · · · uq , then AU2 = O and

V1 S1 U1H = AU1 U1H = A(U1 U1H + U2 U2H ) = AU U H = A . This completes the justification of (15.2). Assertions (1)–(5) are left to the reader to fill in at his/her leisure; (6) is covered by Corollary 16.8 in the next chapter. 

164

15. Singular value decomposition

The numbers s1 ≥ · · · ≥ sq ≥ 0 in the decomposition of AH A in (15.3) are termed the singular values of A; formula (15.2) is called the singular value decomposition of A, or the svd of A for short. Exercise 15.3. Show that if A ∈ R p×q with rank A = r ≥ 1, then (15.2) holds with V1 ∈ R p×r and U1 ∈ R q×r . [HINT: Keep Theorem 13.7 in mind.] Remark 15.2. In the setting and terminology of Theorem 15.1 formula (15.2) can be expressed in a number of different ways:   (1) If r < q and U = U1 U2 is unitary, then   (15.4) A = V1 S1 Or×(q−r) U H .   (2) If r < p and V = V1 V2 is unitary, then

 S1 (15.5) A=V UH . O(p−r)×r 1     (3) If r < min{p, q} and U = U1 U2 ∈ C q×q and V = V1 V2 ∈ C p×p are both unitary, then

 Or×(q−r) S1 (15.6) A=V UH . O(p−r)×r O(p−r)×(q−r) Formula (15.6) may seem particularly attractive because of the presence of unitary factors. However, formula (15.2) works just as well because V1 BU1H  = B for every B ∈ C r×r . Moreover, there is the added advantage that S1 is invertible. Exercise 15.4. Show that if A ∈ C p×q , then   p > q =⇒ A Op×(p−q) has the same nonzero singular values as A ,

 A p < q =⇒ has the same nonzero singular values as A . O(q−p)×q If A ∈ C p×q is expressed in the form (15.2), then the matrix (15.7)

A† = U1 S1−1 V1H

is called the Moore-Penrose inverse of A. It can be characterized as the only matrix in Cq×p that meets the four conditions A† AA† = A† , AA† A = A, (A† A)H = A† A, and (AA† )H = AA† . Exercise 15.5. Show that if A = V1 S1 U1H and A† = U1 S1−1 V1H , then A† AA† = A† , AA† A = A, (A† A)H = A† A, and (AA† )H = AA† . Exercise 15.6. Redo Exercises 11.18–11.21 using the singular value decomposition (15.2).

15.2. A characterization of singular values

165

15.2. A characterization of singular values Lemma 15.3. If A ∈ C p×q and rank A ≥ 1, then A = s1 . Proof. By formula (15.2), A = V1 S1 U1H , with isometric factors V1 ∈ C p×r and U1 ∈ C q×r . Therefore, A = V1 S1 U1H  = S1 . Thus, as S1 x = 2

r 

|sj xj | ≤ 2

j=1

s21

r 

|xj |2 = s21 x2

j=1

 T for every vector x = x1 · · · xr in C r , S1  ≤ s1 . Since equality is  attained by choosing x1 = 1 and xj = 0 for j = 2, . . . , r, A = s1 . The next result extends Lemma 15.3 and serves to characterize all the singular values of A ∈ C p×q in terms of a problem of best approximation. Theorem 15.4. If A ∈ C p×q , then its singular values s1 , . . . , sq can be characterized as solutions of the following extremal problem: 3 4 (15.8) sk+1 = min A − B : B ∈ C p×q and rank B ≤ k , for k = 0, . . . , q − 1. Proof. If k = 0 in (15.8), then B = O and the assertion follows from Lemma 15.3. If rank A = r and k ≥ r, then the minimum is achieved by choosing B = A and then (15.8) simply confirms the fact that si = 0 if i > r. Suppose therefore that rank r with r ≥ 1and rank B ≤ k < r;   A =q×r  k with p×r u v · · · u · · · v and V1 = 1 be the and let U1 = 1 r ∈ C r ∈C isometric matrices in the singular value decomposition (15.2) of A. Then,   since rank B u1 · · · uk+1 ≤ k, there exists a unit vector d ∈ C k+1 such   that B u1 · · · uk+1 d = 0. Thus, upon setting



     d d x = u1 · · · uk+1 d = u1 · · · ur = U1 with 0 ∈ C r−k−1 , 0 0   it is readily seen that if dT = d1 · · · dk+1 , then

  k+1 d H H Ax = V1 S1 U1 x = V1 S1 U1 U1 dj sj vj = 0 j=1

and hence (since Bx = 0 and x = 1) that

5 52 5k+1 5  5 5 2 2 2 dj sj vj 5 A − B ≥ Ax − Bx = Ax = 5 5 5 5 j=1 5 =

k+1  j=1

|dj |2 s2j



s2k+1

k+1  j=1

|dj |2 = s2k+1 .

166

15. Singular value decomposition

Let δk denote the right-hand side of (15.8). Then, as the last inequality is valid for every choice of B ∈ C p×q with rank B ≤ k, it follows that δk ≥ sk+1 . To complete the proof, it suffices to check that if ⎤ ⎡ s1 0 · · · 0 ⎥  ⎢ ⎢ 0 s2 · · · 0 ⎥  (15.9) B = v1 · · · vk ⎢ . ⎥ u1 · · · .. ⎦ ⎣ .. . 0 0 sk

uk

H

,

then rank B ≤ k and A − B = sk+1 . Therefore, δk ≤ sk+1 .



The usefulness of the singular value decomposition stems in large part  from the fact the matrix B = kj=1 sj vj uH j in (15.9) is the best approximation to A in the matrix norm  2,2 in the set of p × q matrices with rank ≤ k. It is also the best approximation in the norm ⎞1/2 ⎛  & '1/2 |aij |2 ⎠ = trace AH A , (15.10) A2 = ⎝ i,j

which is also referred to as the Frobenius norm; see (15.11) and (15.13) below. Theorem 15.5. If A ∈ C p×q and rank A = r ≥ 1, with nonzero singular values s1 , . . . , sr , then (15.11)

r 

3 s2j = min A − B22 : B ∈ C p×q

and

4 rank B ≤ k ,

j=k+1

for k = 0, . . . , r − 1. Proof. Let A = V1 S1 U1H be the singular value decomposition of A with V1 ∈ C p×r and U1 ∈ C q×r isometric, S1 = diag{s1 , . . . , sr }, s1 ≥ · · · ≥ sr > 0, and let C = V1H BU1 . Then rank C ≤ rank B ≤ k and A − B22 ≥ V1 S1 U1H − V1 V1H BU1 U1H 22 = S1 − V1H BU1 22 = S1 − C22 =

r 

|si − cii |2 +

i=1

(15.12) ≥ min

i,j=1

- r  i=1

=

r 

r  j=k+1

s2j .

|cij |2 ≥

r 

|si − cii |2

i=1

i=j

|si − cii | : C is diagonal and rank C ≤ k 2

.

15.3. Sums and products of singular values

167

Therefore, δk equal to the right-hand side of (15.11), we see that r setting 2 δk ≥ j=k+1 sj . On the other hand, the specific choice (in the notation of (15.9)) (15.13)

B=

k 

sj vj uH j

j=1

=⇒ A −

B22

=

r 

s2j ≥ δk ,

j=k+1



which completes the proof.

The last evaluation in the proof of Theorem 15.5 serves to identify B = H j=1 sj vj uj as the best approximation to A (in the matrix norm  2 ) in the set of p × q matrices with rank ≤ k.

k

Exercise 15.7. Justify the first inequality in the first line of (15.12) by H 2 H 2 H 2 H 2, showing 2  1 S1 U1 − B2 = S1 − V1 BU1 2 + V1 BU2 2 + V2 B  that V 2 where V1 V2 and U1 U2 are both unitary matrices. [HINT: A2 = trace AH A.] Exercise 15.8. Let A ∈ C p×q and suppose that s1 (A) ≤ 1. Show that the matrix Ip − AB is invertible for every choice of B ∈ C q×p with s1 (B) ≤ 1 if and only if s1 (A) < 1.

15.3. Sums and products of singular values

The formula

(15.14)    s_1 = max { |⟨Ax, y⟩| : x ∈ C^q, y ∈ C^p, and x^H x = y^H y = 1 }

for matrices A ∈ C^{p×q} has a natural generalization:

(15.15)    s_1 + ⋯ + s_k = max { |trace(Y^H A X)| : X ∈ C^{q×k}, Y ∈ C^{p×k}, and X^H X = I_k = Y^H Y },

since the inner products in these two formulas are equal to trace y^H A x and trace Y^H A X, respectively. To put it another way, formula (15.14) coincides with (15.15) when k = 1. In a similar vein, the formula

(15.16)    s_1^2 ⋯ s_k^2 = max { det {W^H A^H A W} : W ∈ C^{q×k} and W^H W = I_k }

for the singular values s_1, …, s_k of A ∈ C^{p×q} is a natural generalization of the formula

(15.17)    s_1^2 = max { det {w^H A^H A w} : w ∈ C^q and w^H w = 1 }.

Proofs of (15.15) and (15.16) will be furnished in Chapter 29.


15.4. Properties of singular values

The next theorem summarizes a number of important properties of singular values that are good to keep in mind. The verification of the theorem will make use of the extremal characterizations (15.15) and (15.16), even though they have not been justified yet.

Theorem 15.6. If A ∈ C^{p×q}, B ∈ C^{m×p}, and C ∈ C^{q×n}, then:

(1) s_j(A) = s_j(A^H) for j ≤ rank A.
(2) s_j(BA) ≤ ‖B‖ s_j(A), with equality if B ∈ C^{p×p} is unitary (and hence also ‖B‖ = 1).
(3) s_j(AC) ≤ s_j(A) ‖C‖, with equality if C ∈ C^{q×q} is unitary (and hence also ‖C‖ = 1).
(4) ∏_{j=1}^k s_j(AC) ≤ ∏_{j=1}^k s_j(A) ∏_{j=1}^k s_j(C).
(5) Σ_{j=1}^k s_j(A + B) ≤ Σ_{j=1}^k s_j(A) + Σ_{j=1}^k s_j(B) when also B ∈ C^{p×q}.

Proof. Item (1) follows from Theorem 15.4, since ‖A − B‖ = ‖A^H − B^H‖ and rank B = rank B^H. To verify (2), observe that if B ∈ C^{m×p} and j ≥ 1, then, in view of Theorem 15.4,

(15.18)    s_j(BA) = min { ‖BA − D‖ : D ∈ C^{m×q} and rank D ≤ j − 1 }
                   ≤ min { ‖BA − BE‖ : E ∈ C^{p×q} and rank E ≤ j − 1 }
                   ≤ ‖B‖ min { ‖A − E‖ : E ∈ C^{p×q} and rank E ≤ j − 1 }
                   = ‖B‖ s_j(A).

Therefore (2) holds. The proof of (3) is similar and is left to the reader.

Next, a double application of the formula

(15.19)    s_1^2 ⋯ s_k^2 det {W^H W} = max { det {W^H A^H A W} : W ∈ C^{q×k} },

which stems from (15.16), yields the inequalities

    det {V^H C^H A^H A C V} ≤ s_1(A)^2 ⋯ s_k(A)^2 det {V^H C^H C V} ≤ s_1(A)^2 ⋯ s_k(A)^2 s_1(C)^2 ⋯ s_k(C)^2 det {V^H V}

for every matrix V ∈ C^{n×k}. The inequality (4) then follows from (15.16) (with V in place of W and AC in place of A).


Finally, the justification of (5) rests on (15.15) and the observation that if V ∈ C^{n×k} is isometric, then

    |trace {V^H (A + B) V}| = |trace {V^H A V} + trace {V^H B V}|
                            ≤ |trace {V^H A V}| + |trace {V^H B V}|
                            ≤ Σ_{j=1}^k s_j(A) + Σ_{j=1}^k s_j(B).

Item (5) is then obtained by maximizing the left-hand side of this inequality over all isometric matrices V ∈ C^{n×k} and invoking (15.15) once more.  □

Exercise 15.9. Show that Σ_{j=1}^k s_j(A) defines a norm on C^{p×q} when 1 ≤ k ≤ q. [HINT: Use (15.15) to obtain the triangle inequality.]

Exercise 15.10. Let λ_1(A), …, λ_n(A) denote the eigenvalues of A ∈ C^{n×n}, repeated according to their algebraic multiplicity, and let s_1(A), …, s_n(A) denote the singular values of A. Show that Σ_{j=1}^n |λ_j(A)|^2 ≤ Σ_{j=1}^n s_j(A)^2. [HINT: Theorem 13.3 is the key.]

Exercise 15.11. Show that if A, B ∈ C^{n×n} and AB = (AB)^H, then trace {(AB)^H AB} ≤ trace {(BA)^H BA}. [HINT: Use formula (7.4) and Exercise 15.10.]

Exercise 15.12. Justify s_j(A) = s_j(A^H) for A ∈ C^{p×q} and j ≤ rank A by comparing det (λI_q − A^H A) with det (λI_p − A A^H).
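As a quick numerical sanity check of items (3), (4), and (5) of Theorem 15.6, the following Python/NumPy sketch (not part of the text; the random matrices are only illustrative) compares singular values of products and sums:

import numpy as np

rng = np.random.default_rng(1)
p, q = 6, 6
A = rng.standard_normal((p, q))
B = rng.standard_normal((p, q))
C = rng.standard_normal((q, q))

sA = np.linalg.svd(A, compute_uv=False)      # s_1(A) >= s_2(A) >= ...
sB = np.linalg.svd(B, compute_uv=False)
sC = np.linalg.svd(C, compute_uv=False)
sAC = np.linalg.svd(A @ C, compute_uv=False)
sApB = np.linalg.svd(A + B, compute_uv=False)

k = 3
# (3): s_j(AC) <= s_j(A) * ||C||, where ||C|| = s_1(C).
print(np.all(sAC[:k] <= sA[:k] * sC[0] + 1e-12))
# (4): product inequality for the first k singular values of AC.
print(np.prod(sAC[:k]) <= np.prod(sA[:k]) * np.prod(sC[:k]) + 1e-12)
# (5): Ky Fan sum inequality for A + B.
print(np.sum(sApB[:k]) <= np.sum(sA[:k]) + np.sum(sB[:k]) + 1e-12)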

15.5. Approximate solutions of linear equations

If A ∈ C^{p×q} and b ∈ C^p, then the equation Ax = b has a solution x ∈ C^q if and only if b ∈ R_A. However, if b ∉ R_A, then a reasonable strategy is to look for vectors x ∈ C^q that minimize ‖Ax − b‖. (There may be many.) Since this problem is only of interest if b ∉ R_A, it suffices to focus on the case where rank A = r with 1 ≤ r < p.

Theorem 15.7. If A ∈ C^{p×q}, rank A = r, 1 ≤ r < p, and, in terms of the notation introduced earlier for the singular value decomposition of A, A = V_1 S_1 U_1^H and A^† = U_1 S_1^{-1} V_1^H, where [V_1 V_2] and [U_1 U_2] are unitary, then:

(1) min { ‖Ax − b‖ : x ∈ C^q } = ‖(I_p − Π_{R_A}) b‖ = ‖V_2 V_2^H b‖ = ‖V_2^H b‖.
(2) ‖Ax − b‖ = ‖V_2^H b‖ ⟺ x = A^† b + U_2 c for some c ∈ C^{q−r}, with U_2 = O if q = r.

Proof. Let b = b_1 + b_2 with b_1 ∈ R_A and b_2 orthogonal to R_A. Then b_2 = (I_p − Π_{R_A}) b = V_2 V_2^H b and ‖Ax − b‖^2 = ‖Ax − b_1‖^2 + ‖b_2‖^2. Therefore, (1) holds and

    Ax = b_1 ⟺ V_1 S_1 U_1^H x = V_1 V_1^H b ⟺ S_1 U_1^H x = V_1^H b ⟺ U_1^H x = S_1^{-1} V_1^H b.

Thus, upon writing x = U_1 a + U_2 c, it is easily seen that the last equality is met if and only if a = S_1^{-1} V_1^H b, i.e., if and only if

    x = U_1 S_1^{-1} V_1^H b + U_2 c = A^† b + U_2 c.

Thus, (2) holds.  □

The matrix A^† = U_1 S_1^{-1} V_1^H is called the Moore-Penrose inverse of the matrix A with singular value decomposition A = V_1 S_1 U_1^H.

Exercise 15.13. Show that if A ∈ C^{p×q}, then rank A^† = rank A and

(15.20)    A A^† A = A,   A^† A A^† = A^†,   A^† A = (A^† A)^H,   and   A A^† = (A A^†)^H.

Exercise 15.14. Show that A A^† = Π_{R_A}, A^† A = Π_{R_{A^H}}, and hence that rank A = p ⟺ A A^† = I_p and rank A = q ⟺ A^† A = I_q.

Exercise 15.15. In the setting of Theorem 15.7, show that if r < q, then

(15.21)    x = A^† b = U_1 S_1^{-1} V_1^H b = Σ_{j=1}^r (⟨b, v_j⟩ / s_j) u_j

is the vector of smallest norm in C^q that minimizes ‖Ax − b‖.

Exercise 15.16. Show that if A ∈ C^{p×q} and rank A = q, then A^H A is invertible and Ax = Π_{R_A} b if and only if x = (A^H A)^{-1} A^H b.
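The description in Theorem 15.7 translates directly into a computation. The sketch below (Python/NumPy, not part of the text; the rank-deficient test matrix is only illustrative) forms A^† = U_1 S_1^{-1} V_1^H from the SVD, checks agreement with numpy.linalg.pinv, and verifies that the residual at x = A^† b equals ‖V_2^H b‖, as in item (1).

import numpy as np

rng = np.random.default_rng(2)
p, q, r = 6, 5, 3
A = rng.standard_normal((p, r)) @ rng.standard_normal((r, q))   # a p x q matrix of rank r
b = rng.standard_normal(p)

V, s, Uh = np.linalg.svd(A)                      # full SVD; the last q - r values of s are ~ 0
V1, U1 = V[:, :r], Uh[:r, :].conj().T
A_dag = U1 @ np.diag(1.0 / s[:r]) @ V1.conj().T  # Moore-Penrose inverse U_1 S_1^{-1} V_1^H
x = A_dag @ b                                    # minimal-norm least-squares solution

print(np.allclose(A_dag, np.linalg.pinv(A)))     # same pseudoinverse
# Residual = norm of the component of b orthogonal to R_A, i.e. ||V_2^H b||.
print(np.isclose(np.linalg.norm(A @ x - b),
                 np.linalg.norm(V[:, r:].conj().T @ b)))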

15.6. Supplementary notes The monograph [43] of Gohberg and Krein is an excellent source of supplementary information on singular value decompositions in a Hilbert space setting. Theorem 15.5 was presented in a 1936 paper [35] by Eckart and Young. In 1960 Mirsky observed [62] that the same conclusions hold for any norm ϕ(A) on C p×q if ϕ(A) = ϕ(V AU ) when V and U are unitary; see also Golub , Hoffman, and Stewart [44] for additional developments and references. If A ∈ C p×q , then there is exactly one matrix A† ∈ C q×p that meets the four conditions in (15.20); see, e.g., Section 11.2 of [30].

Chapter 16

Positive definite and semidefinite matrices

A matrix A ∈ C^{n×n} is said to be positive semidefinite if

(16.1)    ⟨Ax, x⟩ ≥ 0 for every x ∈ C^n;

it is said to be positive definite if

(16.2)    ⟨Ax, x⟩ > 0 for every nonzero vector x ∈ C^n.

Correspondingly, A ∈ C^{n×n} is said to be negative semidefinite if −A is positive semidefinite, and it is said to be negative definite if −A is positive definite. If A ∈ C^{n×n} and B ∈ C^{n×n}, then the notation A ⪰ B (resp., A ≻ B) means that A − B is positive semidefinite (resp., A − B is positive definite).

Lemma 16.1. If A ∈ C^{n×n} and A ⪰ O, then:

(1) A is automatically Hermitian.
(2) The eigenvalues of A are nonnegative numbers.

Moreover,

(16.3)    A ≻ O ⟺ A = A^H and the eigenvalues of A are all positive
(16.4)          ⟺ A ⪰ O and det A > 0.

Proof. If A ⪰ O, then ⟨Ax, x⟩ is real, so that ⟨Ax, x⟩ = ⟨x, Ax⟩

for every x ∈ C^n. Therefore, by a straightforward calculation,

    4⟨Ax, y⟩ = Σ_{k=1}^4 i^k ⟨A(x + i^k y), (x + i^k y)⟩ = Σ_{k=1}^4 i^k ⟨(x + i^k y), A(x + i^k y)⟩ = 4⟨x, Ay⟩;

i.e., ⟨Ax, y⟩ = ⟨x, Ay⟩ for every choice of x, y ∈ C^n. Therefore, (1) holds.

Next, let x be an eigenvector of A corresponding to the eigenvalue λ. Then λ⟨x, x⟩ = ⟨Ax, x⟩ ≥ 0. Therefore A ⪰ O ⟹ λ ≥ 0 and A ≻ O ⟹ λ > 0, since ⟨x, x⟩ > 0. Thus, (2) and, in view of (1), the implication ⟹ in (16.3) hold. The implication ⟸ in (16.3) follows from the fact that A = A^H ⟹ A = W D W^H, in which W is unitary and D = diag{λ_1, …, λ_n}; the verification of (16.4) is left to the reader.  □

• Warning: The conclusions of Lemma 16.1 are not true under the less restrictive constraint ⟨Ax, x⟩ ≥ 0 for every x ∈ R^n. Thus, for example, if

    A = [ 2  −2 ; 0  2 ]   and   x = [ x_1 ; x_2 ],

then ⟨Ax, x⟩ = (x_1 − x_2)^2 + x_1^2 + x_2^2 > 0 for every nonzero vector x ∈ R^2. However, A is clearly not Hermitian. The next lemma serves to clarify this example.

Lemma 16.2. If A ∈ R^{n×n}, then

    ⟨Au, u⟩ ≥ 0 for every u ∈ C^n  ⟺  ⟨Ax, x⟩ ≥ 0 for every x ∈ R^n and A = A^T.

Proof. If the conditions on the right hold and u = x + iy with x, y ∈ R n , then A(x + iy), (x + iy) = Ax, x − iAx, y + iAy, x + Ay, y = Ax, x + Ay, y ≥ 0 , since Ay, x = y, Ax = Ax, y when A = AT ∈ R n×n and x, y ∈ R n . Thus, the conditions on the left hold. The converse implication is justified by Lemma 16.1. 
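The warning above is easy to see numerically. The short Python/NumPy sketch below (not from the text; purely illustrative) confirms that the matrix A of the example satisfies ⟨Ax, x⟩ > 0 for real x, even though ⟨Au, u⟩ fails to be real (let alone nonnegative) for some complex u, in line with Lemma 16.2.

import numpy as np

A = np.array([[2.0, -2.0],
              [0.0,  2.0]])       # the matrix from the warning: not Hermitian

rng = np.random.default_rng(3)
# <Ax, x> = x^T A x > 0 for every nonzero real x (checked on many random samples).
xs = rng.standard_normal((1000, 2))
print(np.all(np.einsum('ij,jk,ik->i', xs, A, xs) > 0))   # True

# But for a complex vector u the form <Au, u> = u^H A u need not even be real.
u = np.array([1.0, 1.0j])
form = u.conj() @ (A @ u)
print(form)                        # 4 - 2j: nonzero imaginary part
print(np.iscomplex(form))          # True, so A is not >= O as a complex quadratic form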


Exercise 16.1. Show that if A ∈ C^{n×n} and A = A^H with eigenvalues λ_1 ≥ ⋯ ≥ λ_n, then λ_1 I_n − A ⪰ O (even if λ_1 ≤ 0).

Exercise 16.2. Show that if V ∈ C^{n×k} and rank V = k, then A ≻ O ⟹ V^H A V ≻ O, but the converse implication is not true if k < n.

Exercise 16.3. Show that if A ∈ C^{n×n} with entries a_ij, i, j = 1, …, n, and A ⪰ O, then |a_ij|^2 ≤ a_ii a_jj.

Exercise 16.4. Show that if A ∈ C^{n×n}, n = p + q, and

    A = [ A_11  A_12 ; A_21  A_22 ],

where A_11 ∈ C^{p×p}, A_22 ∈ C^{q×q}, then

    A ≻ O ⟺ A_11 ≻ O,   A_21 = A_12^H,   and   A_22 − A_21 A_11^{-1} A_12 ≻ O.

Exercise 16.5. Let U ∈ C^{n×n} be unitary and let A ∈ C^{n×n}. Show that if A ≻ O and AU ≻ O, then U = I_n. [HINT: Consider ⟨AUx, x⟩ for eigenvectors x of U.]

16.1. A detour on triangular factorization

The notation

(16.5)    A_{[j,k]} = [ a_jj ⋯ a_jk ; ⋮ ⋱ ⋮ ; a_kj ⋯ a_kk ]   for A ∈ C^{n×n} and 1 ≤ j ≤ k ≤ n

will be convenient. The trade secret behind the factorization formulas that will be considered below is that if B, L, U ∈ C^{n×n}, L is lower triangular, and U is upper triangular, then

(16.6)    (LB)_{[1,k]} = L_{[1,k]} B_{[1,k]}   and   (BU)_{[1,k]} = B_{[1,k]} U_{[1,k]},
          (BL)_{[k,n]} = B_{[k,n]} L_{[k,n]}   and   (UB)_{[k,n]} = U_{[k,n]} B_{[k,n]}

for k = 1, …, n.

Exercise 16.6. Let P_k = diag{I_k, O_{(n−k)×(n−k)}}. Show that

(a) A ∈ C^{n×n} is upper triangular ⟺ A P_k = P_k A P_k for k = 1, …, n.
(b) A ∈ C^{n×n} is lower triangular ⟺ P_k A = P_k A P_k for k = 1, …, n.

We shall say that a matrix A ∈ C^{n×n} admits an LU (resp., UL) factorization if there exist a lower triangular matrix L ∈ C^{n×n} and an upper triangular matrix U ∈ C^{n×n} such that A = LU (resp., A = UL).


Theorem 16.3. If A ∈ C^{n×n}, then:

(1) A admits an LU factorization with invertible triangular factors L and U ⟺ det A_{[1,k]} ≠ 0 for k = 1, …, n.
(2) A admits a UL factorization with invertible triangular factors L and U ⟺ det A_{[k,n]} ≠ 0 for k = 1, …, n.
(3) If det A_{[1,k]} ≠ 0 for k = 1, …, n, then A = LDU for exactly one lower triangular matrix L with ones on the diagonal, one upper triangular matrix U with ones on the diagonal, and one diagonal matrix D.
(4) If det A_{[k,n]} ≠ 0 for k = 1, …, n, then A = UDL for exactly one lower triangular matrix L with ones on the diagonal, one upper triangular matrix U with ones on the diagonal, and one diagonal matrix D.

Proof. The proof is divided into steps.

1. Verification of (1): Suppose first that A = LU with invertible factors L and U. Then, by the first formula in (16.6), A_{[1,k]} = L_{[1,k]} U_{[1,k]}. Moreover, since L and U are triangular matrices, L_{[1,k]} and U_{[1,k]} are also invertible for k = 1, …, n. Thus, A_{[1,k]} is invertible for k = 1, …, n.

Suppose next that A_{[1,k]} is invertible for k = 1, …, n and let X ∈ C^{n×n} be the lower triangular matrix with entries x_ij for i ≥ j that are specified by the formulas

(16.7)    [x_k1 ⋯ x_kk] = [0 ⋯ 0 1] (A_{[1,k]})^{-1}   for k = 1, …, n,

with the understanding that x_11 = 1/a_11. Now, let Y = XA. Then, by the first formula in (16.6), Y_{[1,k]} = X_{[1,k]} A_{[1,k]} for k = 1, …, n and hence, in view of (16.7),

    [y_k1 ⋯ y_kk] = [x_k1 ⋯ x_kk] A_{[1,k]} = [0 ⋯ 0 1].

Thus, Y is upper triangular with y_jj = 1 for j = 1, …, n. Therefore, Y and X = Y A^{-1} are invertible and A = LU with L = X^{-1} and U = Y.

2. Verification of (2): If A = UL and U and L are both invertible, then, as U and L are triangular, U_{[k,n]} and L_{[k,n]} are both invertible for k = 1, …, n. Thus, as A_{[k,n]} = U_{[k,n]} L_{[k,n]} for k = 1, …, n by the fourth formula in (16.6), A_{[k,n]} is also invertible for k = 1, …, n.


Suppose next that A_{[k,n]} is invertible for k = 1, …, n and let X ∈ C^{n×n} be the lower triangular matrix with entries x_ij for i ≥ j that are specified by the formulas

(16.8)    [x_kk ; x_{k+1,k} ; ⋯ ; x_nk] = (A_{[k,n]})^{-1} [1 ; 0 ; ⋯ ; 0]   for k = 1, …, n,

with the understanding that x_nn = 1/a_nn, and let Y = AX. Then, by the third formula in (16.6), Y_{[k,n]} = A_{[k,n]} X_{[k,n]} for k = 1, …, n and hence

    [y_kk ; ⋯ ; y_nk] = A_{[k,n]} [x_kk ; ⋯ ; x_nk] = [1 ; 0 ; ⋯ ; 0].

Thus, Y is upper triangular with y_jj = 1 for j = 1, …, n. Therefore, Y and X = A^{-1} Y are invertible and A = UL with L = X^{-1} and U = Y.

3. Verification of (3) and (4): To verify (3), suppose that an invertible matrix A admits a pair of factorizations A = L_1 D_1 U_1 = L_2 D_2 U_2 in which the diagonal entries of the triangular factors are all equal to one. Then the identity L_2^{-1} L_1 D_1 = D_2 U_2 U_1^{-1} implies that D_1 = D_2 and that L_2^{-1} L_1 D_1 is both upper and lower triangular and hence is a diagonal matrix, which must be equal to D_1. Therefore, L_1 = L_2 and U_1 = U_2. The verification of (4) is left to the reader; it is similar to the verification of (3).  □

Remark 16.4. Formulas (16.7) and (16.8) serve to make the proof of Theorem 16.3 efficient, but mysterious. To explain where the first of these two formulas comes from, we first observe that if A = LU is invertible, then L and U are invertible and the diagonal matrix Δ = diag{u_11, …, u_nn} based on the diagonal entries of U is invertible. Therefore, Y = Δ^{-1} U is an upper triangular matrix with diagonal entries y_jj = 1 for j = 1, …, n and

    A = LU ⟺ L^{-1} A = U ⟺ Δ^{-1} L^{-1} A = Δ^{-1} U ⟺ XA = Y,

with X = Δ^{-1} L^{-1}. Thus A admits an LU factorization if and only if there exist a lower triangular matrix X ∈ C^{n×n} and an upper triangular matrix Y ∈ C^{n×n} with y_jj = 1 for j = 1, …, n such that XA = Y (because then X is invertible and A = LY with L = X^{-1}).


It is remarkable that the awesome looking nonlinear matrix equation XA = Y, which is a system of n^2 equations with (n^2 + n)/2 unknown entries x_ij with 1 ≤ j ≤ i ≤ n in X and (n^2 − n)/2 unknown entries y_ij with 1 ≤ i < j ≤ n in Y, is tractable. But, in self-evident notation,

(16.9)    Y_{[1,k]} = (XA)_{[1,k]} = [I_k  O] [ X_11  O ; X_21  X_22 ] [ A_11  A_12 ; A_21  A_22 ] [ I_k ; O ]
                    = X_11 A_11 = X_{[1,k]} A_{[1,k]}

for k = 1, …, n. It is now easily seen that when det A_{[1,k]} ≠ 0 for k = 1, …, n, then (16.7) is just the bottom row of (16.9). The motivation for (16.8) is similar.

Exercise 16.7. Verify item (4) in Theorem 16.3.

Exercise 16.8. Show that if A ∈ C^{n×n} is invertible and A^{-1} = B, then A_{[1,k]} is invertible for k = 1, …, n if and only if B_{[k,n]} is invertible for k = 1, …, n.

Exercise 16.9. Show that if A ∈ C^{n×n} and A_{[1,k]} is invertible for k = 1, …, n, then formula (16.7) implies that x_kk = det A_{[1,k−1]} / det A_{[1,k]} for k = 2, …, n, whereas, if A_{[k,n]} is invertible for k = 1, …, n, then (16.8) implies that x_kk = det A_{[k+1,n]} / det A_{[k,n]} for k = 1, …, n − 1.

Exercise 16.10. Show that the matrix A = [ 1  −1 ; 1  0 ] admits an LU factorization but does not admit a UL factorization and find matrices L, D, U such that A = LDU with L lower (resp., U upper) triangular with ones on the diagonal and D diagonal.

Exercise 16.11. Let A ∈ C^{n×n} be a Vandermonde matrix with columns v(λ_1), …, v(λ_n) based on n distinct points λ_1, …, λ_n. Show that A admits an LU factorization, but may not admit a UL factorization.

Exercise 16.12. Show that if A ∈ C^{n×n} and A^2 = A, then A is an orthogonal projection if and only if A ⪰ O.
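Formula (16.7) is constructive: it builds, row by row, a lower triangular X with XA upper triangular and unit diagonal, from which A = LU with L = X^{-1}. The following sketch (Python/NumPy, not from the text; the helper name lu_via_16_7 is hypothetical and no pivoting is used, so every leading section A_{[1,k]} is assumed invertible) implements it directly.

import numpy as np

def lu_via_16_7(A):
    """Return L (lower triangular) and U (upper triangular with unit diagonal)
    with A = L @ U, assuming det A[1,k] != 0 for every leading section."""
    n = A.shape[0]
    X = np.zeros_like(A)
    for k in range(1, n + 1):
        ek = np.zeros(k)
        ek[-1] = 1.0
        # Row k of X on columns 1..k is the last row of (A_{[1,k]})^{-1}, as in (16.7):
        # solve x^T A_{[1,k]} = e_k^T, i.e. A_{[1,k]}^T x = e_k.
        X[k - 1, :k] = np.linalg.solve(A[:k, :k].T, ek)
    U = X @ A                 # upper triangular with ones on the diagonal (Y in the proof)
    L = np.linalg.inv(X)      # lower triangular
    return L, U

rng = np.random.default_rng(4)
A = rng.standard_normal((5, 5))
L, U = lu_via_16_7(A)
print(np.allclose(L, np.tril(L)), np.allclose(U, np.triu(U)))   # triangular factors
print(np.allclose(np.diag(U), 1.0))                             # unit diagonal of U
print(np.allclose(L @ U, A))                                    # A = L U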

16.2. Characterizations of positive definite matrices

There are a number of different characterizations of positive definite matrices:

Theorem 16.5. If A ∈ C^{n×n}, then the following statements are equivalent:

(1) A ≻ O.
(2) A = A^H and the eigenvalues, λ_1, …, λ_n, of A are all positive; i.e., λ_j > 0 for j = 1, …, n.
(3) A = A^H and det A_{[1,k]} > 0 for k = 1, …, n.
(4) A = L L^H, where L is a lower triangular invertible matrix.
(5) A = A^H and det A_{[k,n]} > 0 for k = 1, …, n.
(6) A = U U^H, where U is an upper triangular invertible matrix.

Proof. Lemma 16.1 ensures that (1) ⟹ (2). Conversely, if (2) is in force, then A = V D V^H with V ∈ C^{n×n} unitary and D ≻ O and diagonal. Therefore, ⟨Ax, x⟩ = ⟨D V^H x, V^H x⟩ > 0 for every nonzero vector x ∈ C^n. Thus, (2) ⟺ (1).

Next, it is clear that (1) ⟹ (3) and hence, in view of Theorem 16.3, that A admits exactly one factorization of the form A = L_1 D U_1, where L_1 is lower triangular with ones on the diagonal, U_1 is upper triangular with ones on the diagonal, and D is a diagonal matrix. Since A = A^H, L_1 D U_1 = U_1^H D^H L_1^H. Consequently, U_1 = L_1^H and D = D^H. Moreover, as

    A = L_1 D L_1^H ⟹ A_{[1,k]} = (L_1)_{[1,k]} D_{[1,k]} ((L_1)_{[1,k]})^H
                    ⟹ det A_{[1,k]} = |det (L_1)_{[1,k]}|^2 det D_{[1,k]} = det D_{[1,k]}

for k = 1, …, n, the diagonal entries in the matrix D = diag{λ_1, …, λ_n} are positive. Thus, D admits a square root

    D^{1/2} = diag{√λ_1, …, √λ_n}

and hence (4) holds with L = L_1 D^{1/2}. Since the implication (4) ⟹ (1) is clear, the implications (1) ⟹ (3) ⟹ (4) ⟹ (1) are justified. To complete the proof, it suffices to check that (1) ⟹ (5) ⟹ (6) ⟹ (1). The details are left to the reader.  □

Exercise 16.13. Show that if A ∈ C^{n×n}, then A ≻ O if and only if A = V^H V for some invertible matrix V ∈ C^{n×n}.

The next three exercises are formulated in terms of the matrix

(16.10)    Z_n = Σ_{j=1}^n e_j e_{n+1−j}^T = [ 0 0 ⋯ 0 1 ; 0 0 ⋯ 1 0 ; ⋮ ⋰ ⋮ ; 1 0 ⋯ 0 0 ],   where [e_1 ⋯ e_n] = I_n.

Exercise 16.14. Show that ZnH = Zn and ZnH Zn = In , i.e., Zn is both Hermitian and unitary.


Exercise 16.15. Show that U ∈ C^{n×n} is an invertible upper triangular matrix if and only if Z_n U Z_n is an invertible lower triangular matrix and then use this information to verify the equivalence of (4) and (6) in Theorem 16.5.

Exercise 16.16. Show that if p ≥ 1, q ≥ 1, and p + q = n, then

    [ O  Z_q ; Z_p  O ] [ A_11  A_12 ; A_21  A_22 ] [ O  Z_p ; Z_q  O ] = [ Z_q A_22 Z_q   Z_q A_21 Z_p ; Z_p A_12 Z_q   Z_p A_11 Z_p ]

and then use this identity to verify the equivalence of (3) and (5) in Theorem 16.5.
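Item (4) of Theorem 16.5 is the Cholesky factorization, which is available directly in standard libraries. A short sketch (Python/NumPy, not part of the text; the random Hermitian positive definite test matrix is only for illustration):

import numpy as np

rng = np.random.default_rng(5)
M = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = M @ M.conj().T + 4 * np.eye(4)       # Hermitian and positive definite by construction

# numpy.linalg.cholesky returns a lower triangular L with A = L @ L^H,
# exactly the factorization in item (4) of Theorem 16.5.
L = np.linalg.cholesky(A)
print(np.allclose(L, np.tril(L)))        # L is lower triangular
print(np.allclose(L @ L.conj().T, A))    # A = L L^H

# Item (3): all leading principal minors det A_{[1,k]} are positive.
print(all(np.linalg.det(A[:k, :k]).real > 0 for k in range(1, 5)))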

16.3. Square roots

Theorem 16.6. If A ∈ C^{n×n} and A ⪰ O, then there is exactly one matrix B ∈ C^{n×n} such that B ⪰ O and B^2 = A.

Proof. If A ∈ C^{n×n} and A ⪰ O, then there exists a unitary matrix U and a diagonal matrix D = diag{d_11, …, d_nn} with nonnegative entries such that A = U D U^H. Therefore, upon setting

    D^{1/2} = diag{d_11^{1/2}, …, d_nn^{1/2}},

it is readily checked that the matrix B = U D^{1/2} U^H is again positive semidefinite and

    B^2 = (U D^{1/2} U^H)(U D^{1/2} U^H) = U D U^H = A.

This completes the proof of the existence of at least one positive semidefinite square root of A.

Suppose next that there are two positive semidefinite square roots of A, B_1 and B_2. Then there exist a pair of unitary matrices U_1 and U_2 and a pair of diagonal matrices D_1 = diag{α_1, …, α_n} with α_1 ≥ ⋯ ≥ α_n ≥ 0 and D_2 = diag{β_1, …, β_n} with β_1 ≥ ⋯ ≥ β_n ≥ 0 such that

    B_1 = U_1 D_1 U_1^H   and   B_2 = U_2 D_2 U_2^H.

Thus, as U_1 D_1^2 U_1^H = B_1^2 = A = B_2^2 = U_2 D_2^2 U_2^H, it follows that U_2^H U_1 D_1^2 = D_2^2 U_2^H U_1 and hence, upon setting W = U_2^H U_1, that

    W D_1^2 − D_2 W D_1 = D_2^2 W − D_2 W D_1.

But this in turn implies that the matrix

    X = W D_1 − D_2 W

with entries x_ij for i, j = 1, …, n is a solution of the equation X D_1 + D_2 X = O and hence that

    x_ij α_j + β_i x_ij = 0   for i, j = 1, …, n.

Thus, x_ij = 0 if α_j + β_i > 0. On the other hand, if α_j + β_i = 0, then α_j = β_i = 0 and so x_ij = w_ij α_j − β_i w_ij = 0 in this case also. Therefore, X = O is the only solution of the equation X D_1 + D_2 X = O. Consequently,

    U_2^H U_1 D_1 − D_2 U_2^H U_1 = X = O;

i.e., B_1 = U_1 D_1 U_1^H = U_2 D_2 U_2^H = B_2, as claimed.  □

If A ⪰ O, the symbol A^{1/2} will be used to denote the unique n×n matrix B ⪰ O with B^2 = A. Correspondingly, B will be referred to as the square root of A. The restriction that B ⪰ O is essential to ensure uniqueness. Thus, for example,

    [ O  A ; A^{-1}  O ] [ O  A ; A^{-1}  O ] = I_{2n}

for every invertible matrix A ∈ C^{n×n}.

Exercise 16.17. Show that if A ∈ C^{n×n}, then

    [ A  A ; A  A ] ⪰ O ⟺ A ⪰ O.

Exercise 16.18. Show that if A ∈ C^{n×n}, then

    [ A^2  A ; A  I_n ] ⪰ O ⟺ A = A^H.

Exercise 16.19. Show that if A, G ∈ C^{n×n}, G ≻ O, and GA = A^H G, then σ(A) ⊂ R and A = A^* with respect to an appropriately defined inner product.

Exercise 16.20. Show that if A, B ∈ C^{n×n} and A ≻ B ≻ O, then B^{-1} ≻ A^{-1} ≻ O. [HINT: A − B ≻ O ⟹ A^{-1/2} B A^{-1/2} ≺ I_n.]

Exercise 16.21. Show that if A, B ∈ C^{n×n} and if A ⪰ O and B ⪰ O, then trace AB ≥ 0 (even if AB is not positive semidefinite).
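The construction in the proof of Theorem 16.6 is directly computable: diagonalize A by a unitary matrix and take the nonnegative square roots of the eigenvalues. A sketch (Python/NumPy, not from the text; the helper name psd_sqrt and the random test matrix are only illustrative):

import numpy as np

def psd_sqrt(A):
    """The unique positive semidefinite square root A^{1/2} of a positive
    semidefinite A, via A = U D U^H  =>  A^{1/2} = U D^{1/2} U^H."""
    w, U = np.linalg.eigh(A)              # real eigenvalues and a unitary U
    w = np.clip(w, 0.0, None)             # guard against tiny negative round-off
    return (U * np.sqrt(w)) @ U.conj().T

rng = np.random.default_rng(6)
M = rng.standard_normal((5, 3))
A = M @ M.T                               # rank 3, positive semidefinite
B = psd_sqrt(A)

print(np.allclose(B, B.conj().T))                 # B is Hermitian
print(np.all(np.linalg.eigvalsh(B) >= -1e-12))    # B >= O
print(np.allclose(B @ B, A))                      # B^2 = A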


16.4. Polar forms and partial isometries

A matrix A ∈ C^{p×q} is said to be a partial isometry if A^H A x = x for every vector x ∈ C^q that is orthogonal to N_A. Since C^q = N_A ⊕ R_{A^H}, A ∈ C^{p×q} is a partial isometry if and only if A^H A A^H y = A^H y for every y ∈ C^p. Thus:

(1) A ∈ C^{p×q} is an isometry if A^H A = I_q.
(2) A ∈ C^{p×q} is a partial isometry if A^H A A^H = A^H.

Exercise 16.22. Show that if A ∈ C^{p×q} is a partial isometry, then it is an isometry if and only if rank A = q.

Theorem 16.7. If A ∈ C^{p×q}, then there exists exactly one partial isometry B ∈ C^{p×q} and one positive semidefinite matrix P ∈ C^{q×q} such that A = BP and N_B = N_P. In this factorization, P is the positive semidefinite square root of A^H A.

Proof. If B and P meet the stated conditions, then C^q = N_B ⊕ R_{B^H} = N_P ⊕ R_{P^H} and P = P^H. Therefore, R_{B^H} = R_P and hence

    B^H B P = P   and   A^H A = P B^H B P = P^2.

Thus, P is the one and only positive semidefinite square root of A^H A. If C ∈ C^{p×q} is a partial isometry such that A = CP and N_C = N_P, then

    C P x = A x = B P x for every x ∈ C^q   and   C y = B y for every y ∈ N_P.

Therefore, C = B.  □

Corollary 16.8. If A ∈ C^{p×q} and rank A = r ≥ 1 and A admits a pair of singular value decompositions A = V_1 S_1 U_1^H = Y_1 S_1 X_1^H, with isometric factors V_1, Y_1 ∈ C^{p×r} and U_1, X_1 ∈ C^{q×r}, then V_1 U_1^H = Y_1 X_1^H.

Proof. If A ∈ C^{p×q} and A = V_1 S_1 U_1^H with isometric factors V_1 ∈ C^{p×r} and U_1 ∈ C^{q×r} and S_1 = diag{s_1, …, s_r} ≻ O, then

(16.11)    A = BP   with B = V_1 U_1^H ∈ C^{p×q} and P = U_1 S_1 U_1^H ∈ C^{q×q}.

The asserted uniqueness follows from Theorem 16.7, since B^H B B^H = B^H, P ⪰ O, and N_B = N_P.  □

Exercise 16.23. Show that the factors V_1 and U_1 in the factorization A = V_1 S_1 U_1^H are not unique. [HINT: Diagonal matrices commute.]

The factorization BP in (16.11) is called the right polar form of A.

Exercise 16.24. Show that if A ∈ C^{p×q} and rank A = r ≥ 1, then A admits exactly one left polar form A = QC in which Q ⪰ O is a square root of A A^H and C is a partial isometry with R_C = R_Q.
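The right polar form (16.11) can be read off from any singular value decomposition. The sketch below (Python/NumPy, not part of the text; the rank-deficient random matrix is only illustrative) computes B = V_1 U_1^H and P = U_1 S_1 U_1^H and checks the properties used in Corollary 16.8.

import numpy as np

rng = np.random.default_rng(7)
p, q, r = 6, 4, 2
A = rng.standard_normal((p, r)) @ rng.standard_normal((r, q))   # rank r

V, s, Uh = np.linalg.svd(A)
V1, U1, S1 = V[:, :r], Uh[:r, :].conj().T, np.diag(s[:r])

B = V1 @ U1.conj().T             # partial isometry
P = U1 @ S1 @ U1.conj().T        # positive semidefinite square root of A^H A

print(np.allclose(B @ P, A))                                   # A = B P
print(np.allclose(P @ P, A.conj().T @ A))                      # P^2 = A^H A
print(np.allclose(B.conj().T @ B @ B.conj().T, B.conj().T))    # B^H B B^H = B^H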


Exercise 16.25. Show that if P ∈ C^{n×n} is a positive semidefinite matrix and Y_1, Y_2 ∈ C^{n×n} are such that Y_1 P = Y_2 P and N_{Y_1} = N_{Y_2} = N_P, then Y_1 = Y_2.

Theorem 16.9. If A ∈ C^{p×q} and rank A = r ≥ 1, then

(16.12)    A^H A = I_q ⟺ ‖Ax‖ = ‖x‖ for every x ∈ C^q

and

(16.13)    A^H A A^H = A^H ⟺ ‖A A^H y‖ = ‖A^H y‖ for every y ∈ C^p.

Proof. Since (16.12) is a special case of (16.13), it suffices to deal with the latter. Suppose first that ‖A A^H y‖ = ‖A^H y‖ for every y ∈ C^p and let x ∈ C^q. Then x = u + A^H y for some choice of u ∈ N_A and y ∈ C^p. Thus, as ⟨(I_q − A^H A) A^H y, A^H y⟩ = 0, it is readily checked that

    ⟨(I_q − A^H A) x, x⟩ = ⟨(I_q − A^H A)(u + A^H y), u + A^H y⟩ = ⟨u, u⟩ ≥ 0,

i.e., (I_q − A^H A) ⪰ O. Consequently,

    ⟨(I_q − A^H A) A^H y, A^H y⟩ = 0 ⟹ (I_q − A^H A)^{1/2} A^H y = 0 ⟹ (I_q − A^H A) A^H y = 0.

Since these implications are valid for every y ∈ C^p, (I_q − A^H A) A^H = O. The converse implication is easy and is left to the reader.  □

Lemma 16.10. If P ∈ C^{n×n} is a positive semidefinite matrix and B ∈ C^{n×n} is a partial isometry with N_B = N_P, then

(16.14)    B P = P B^H ⟹ B = B^H

(i.e., B P = (B P)^H ⟹ B = B^H).

Proof. Under the given assumptions,

    B P B^H B P B^H = B P^2 B^H = P B^H B P = P^2 ⟹ B P B^H = P,

since the positive semidefinite matrix P^2 has exactly one positive semidefinite square root. But this in turn implies that

    B^H P = B^H B P B^H = P B^H = B P.

Therefore, to complete the proof it suffices to show that N_{B^H} = N_P. But

    B^H a = 0 ⟹ B P B^H a = 0 ⟹ P a = 0,

i.e., N_{B^H} ⊆ N_P = N_B. Thus, as dim N_B = n − rank B = n − rank B^H = dim N_{B^H}, we see that N_{B^H} = N_B = N_P.  □


16.5. Some useful formulas

It is useful to keep in mind that if A = A^H, then a number of formulas that were established earlier assume a more symmetric form:

Theorem 16.11. If A ∈ C^{n×n} and rank A = r, r ≥ 1, then:

(1) A^H A = A A^H ⟹ ‖A‖ = max { |⟨Ax, x⟩| : x ∈ C^n and ‖x‖ = 1 }.
(2) A ⪰ O ⟹ ‖A‖ = max { ⟨Ax, x⟩ : x ∈ C^n and ‖x‖ = 1 }.
(3) A ⪰ O ⟹ in the singular value decomposition A = V_1 S_1 U_1^H in (15.2), the two n × r isometric matrices coincide: V_1 = U_1.
(4) A ≻ O ⟹ ϕ(x) = (⟨Ax, x⟩)^{1/2} is a norm on C^n.
(5) A ≻ O ⟹ ϕ(x, y) = ⟨Ax, y⟩ is an inner product on C^n.

Proof. We shall verify (3) and leave the justification of the rest to the reader. Since A ⪰ O ⟹ A = A^H, the corresponding singular value decompositions must coincide, i.e., V_1 S_1 U_1^H = U_1 S_1 V_1^H. Thus, in view of Corollary 16.8, V_1 U_1^H = U_1 V_1^H and hence V_1 = U_1 V_1^H U_1 = U_1 K with K = V_1^H U_1. Consequently,

    I_r = V_1^H V_1 = K^H U_1^H U_1 K = K^H K,

i.e., K is unitary. Moreover, K S_1 ⪰ O, since A = U_1 K S_1 U_1^H ⪰ O. Therefore, K S_1 = S_1 K^H and hence (K S_1)^2 = S_1 K^H K S_1 = S_1^2. Thus, K S_1 = S_1, since they are both positive definite square roots of S_1^2. Therefore, K = I_r and V_1 = U_1.  □

Exercise 16.26. Verify items (1), (2), (4), and (5) in Theorem 16.11.

We remark that if A, B ∈ C^{n×n}, A ⪰ B ≻ O, and 0 < t < 1, then

(16.15)    A^t − B^t = (sin πt / π) ∫_0^∞ x^t (x I_n + A)^{-1} (A − B) (x I_n + B)^{-1} dx.

Exercise 16.27. Use formula (16.15) to show that if A, B ∈ C^{n×n}, then

(16.16)    A ⪰ B ⪰ O ⟹ A^t ⪰ B^t for 0 < t < 1.

Exercise 16.28. Let A = [ x  1 ; 1  1 ] and B = [ 1  0 ; 0  0 ]. Show that if 2 < x < 1 + √2, then A ⪰ B ⪰ O, but A^2 − B^2 has one positive eigenvalue and one negative eigenvalue, i.e., A ⪰ B ⪰ O does not imply that A^2 ⪰ B^2.
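Exercises 16.27 and 16.28 can be explored numerically. The sketch below (Python/NumPy, not part of the text) takes the matrices of Exercise 16.28 with the illustrative choice x = 2.2, verifies A ⪰ B ⪰ O, checks that A^2 − B^2 is indefinite, and confirms that the square roots (the case t = 1/2 of (16.16)) are still ordered.

import numpy as np
from numpy.linalg import eigh, eigvalsh

x = 2.2
A = np.array([[x, 1.0], [1.0, 1.0]])
B = np.array([[1.0, 0.0], [0.0, 0.0]])

def psd_sqrt(M):
    # unique positive semidefinite square root via the spectral decomposition
    w, U = eigh(M)
    return (U * np.sqrt(np.clip(w, 0.0, None))) @ U.T

print(eigvalsh(A - B))                        # both eigenvalues >= 0: A >= B >= O
print(eigvalsh(A @ A - B @ B))                # one negative, one positive: indefinite
print(eigvalsh(psd_sqrt(A) - psd_sqrt(B)))    # both >= 0: A^{1/2} >= B^{1/2}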


16.6. Supplementary notes This chapter is partially adapted from Chapter 12 in [30], which contains information on Toeplitz matrices, block Toeplitz matrices, and polynomial identities. Sections 16.4 and 16.5 are new, but (16.15) is discussed in [30].

Chapter 17

Determinants redux

This chapter deals with more advanced topics in the theory of determinants. The first three sections are devoted to developing the Binet-Cauchy formula for evaluating the determinants of matrix products of the form AB when A, B T ∈ C n×k and n > k. Subsequently, some useful inequalities and the formulas of Jacobi and Sylvester are discussed.

17.1. Differentiating determinants

Theorem 17.1. If

    ϕ(t) = det [ f_11(t) ⋯ f_1n(t) ; ⋮ ⋱ ⋮ ; f_n1(t) ⋯ f_nn(t) ] = det [ R_1(t) ; ⋮ ; R_n(t) ],

where the f_ij(t) are smooth functions that can be differentiated freely with respect to t, R_i(t) = [f_i1(t) ⋯ f_in(t)] for i = 1, …, n, and the notation |B| is used to denote the determinant of a matrix B, then

(17.1)    ϕ′(t) = | R_1′(t) ; R_2(t) ; ⋮ ; R_n(t) | + | R_1(t) ; R_2′(t) ; ⋮ ; R_n(t) | + ⋯ + | R_1(t) ; R_2(t) ; ⋮ ; R_n′(t) |,

i.e., ϕ′(t) is the sum of the n determinants obtained by differentiating one row at a time.

Discussion. Formula (17.1) follows from the fact that the determinant of a matrix is linear in each row of the matrix and is a continuous function of 185


each entry in the matrix. Thus, if n = 3, then % % % % % % %R1 (t + ε) − R1 (t)% % % % % R1 (t) R1 (t) % % % % % % % % % % % % R2 (t + ε) R2 (t) ϕ(t + ε) = % % + %R2 (t + ε) − R2 (t)% + % % % % % % %R3 (t + ε) − R3 (t)% R3 (t + ε) R3 (t + ε) + ϕ(t) and ϕ(t + ε) − ϕ(t) ϕ (t) = lim ε→0 ε % % % % % %   (t) f  (t)% %f (t) f (t) f (t)% %f (t) f (t) f (t)% %f11 (t) f12 12 13 11 12 13 13 % % 11 % % % %  (t) f  (t) f  (t)% %f (t) f (t) f (t)% + = %%f21 (t) f22 (t) f23 (t)%% + %%f21 21 22 23 22 23 % % %. %f31 (t) f32 (t) f33 (t)% %f31 (t) f32 (t) f33 (t)% %f  (t) f  (t) f  (t)% 31 32 33 

The case of general n is treated in just the same way. Lemma 17.2. If A ∈ C n×n and ϕ(t) = det (tIn − A), then ϕ (t) = trace(tIn − A)−1 ϕ(t)

(17.2) Discussion. reduces to

for t ∈ σ(A) .

  If n = 3, t ∈ σ(A), and I3 = e1 e2 e3 , then (17.1)

% % % T % % % % e1 % %R1 (t)% %R1 (t)% % % % % % % ϕ (t) = %%R2 (t)%% + %% eT2 %% + %%R2 (t)%% %R3 (t)% %R3 (t)% % eT % 3 = (tIn − A){1;1} + (tIn − A){2;2} + (tIn − A){3;3} = ϕ(t)[((tIn − A)−1 )11 + ((tIn − A)−1 )22 + ((tIn − A)−1 )33 ] = ϕ(t) trace(tIn − A)−1 . 

The general case is evaluated in exactly the same way. Lemma 17.3. If C ∈ C n×n , t ∈ R, and ϕ(t) = det (In + tC), then ϕ (0) = trace C .

(17.3) Discussion.

If n = 3, then ⎤ ⎡ tc12 tc13 1 + tc11 1 + tc22 tc23 ⎦ = det ϕ(t) = det ⎣ tc21 tc31 tc32 1 + tc33

Thus,

⎤ ⎡ R1 (t) ⎣R2 (t)⎦ . R3 (t)

% % % % % % % % % %  % % %R1 (t)% %R1 (t)% %R1 (t)% % eT1 C % %R1 (t)% %R1 (t)% % % % % % % % % % % % % ϕ (t) = %%R2 (t)%% + %%R2 (t)%% + %%R2 (t)%% = %%R2 (t)%% + %% eT2 C %% + %%R2 (t)%% %R3 (t)% %R3 (t)% %R (t)% %R3 (t)% %R3 (t)% % eT C % 3 3

17.1. Differentiating determinants

and

187

% % % % % % %c11 c12 c13 % % 1 0 0 %% 0 0 %% %% 1 % % % ϕ (0) = %% 0 1 0 %% = trace C . 1 0 %% + %%c21 c22 c23 %% + %% 0 % % %0 % % 0 0 1 c31 c32 c33 % 0 1

The computation for general n proceeds in the same way, just the bookkeeping is more elaborate.  Lemma 17.4. If A, B ∈ C n×n , A  O, B = B H , h(t) = ln det (A + tB), and t ∈ R, then h (0) = trace A−1 B .

(17.4)

Proof. Let C = A−1/2 BA−1/2 and ϕ(t) = det (In + tC). Then h(t) = ln [det A × det (In + tC)] = h(0) + ln ϕ(t) and, in view of Lemma 17.3, h (0) = ϕ (0)/ϕ(0) = trace C.



Exercise 17.1. Give alternate proofs of formulas (17.2) and (17.3) using the Jordan decomposition for A. Lemmas 17.2 and 17.3 are special cases of the following more general result: Lemma 17.5. Let ⎡

f11 (t) · · · ⎢ .. F (t) = ⎣ . fn1 (t) · · ·

⎤ f1n (t) .. ⎥ . ⎦

and

ϕ(t) = det F (t) ,

fnn (t)

where the fij (t) are smooth functions that can be differentiated freely with respect to t and F (t) is invertible in the interval a < t < b. Then ϕ (t) = trace{F  (t)F (t)−1 } ϕ(t)

(17.5) and hence

(6

t

(17.6) ϕ(t) = ϕ(c) exp



f or

trace{F (s)F (s)

−1

a 0 and  T bj > 0 for j = 1, . . . , k. Let u = cos θ sin θ be a vector in R 2 with 0 < θ < π/2 and let ΠU x = u(uH u)−1 uH x = u uH x denote the orthogonal projection of x ∈ R2 onto the subspace U = span{u} = {tu : t ∈ R}. Then xj = ΠU xj + (I − ΠU )xj and the square of the distance from xj to U is equal to (I − ΠU )xj 2 = (I − ΠU )xj , (I − ΠU )xj  = (I − ΠU )xj , xj  = xj 2 − ΠU xj 2 = a2j + b2j − (aj cos θ + bj sin θ)2 .  We wish to choose θ ∈ (0, π/2) to minimize kj=1 (I − ΠU )xj 2 . In terms of the notation  T (18.11) a = a1 · · · ak , (18.12)

k 

(I −ΠU )xj  = 2

j=1

 b = b1 · · ·

k 

xj  − 2

j=1

k 

bk

T

,

and

  A= a b ,

ΠU xj 2 = a2 +b2 −f (θ) ,

j=1

where f (θ) =

k 

ΠU xj 2 = a2 cos2 θ + b2 sin2 θ + 2a, b cos θ sin θ

j=1

< 

= cos θ cos θ = A ,A . sin θ sin θ Since f (θ) ≥ 0, a2 +b2 −f (θ) will be minimized if we choose θ ∈ (0, π/2) to maximize f (θ). Let α = a2 − b2 and β = 2a, b for short. Then β > 0 and, upon invoking the trigonometric identities 1 + cos 2θ , 2 we see that cos2 θ =

(18.13)

sin2 θ =

1 − cos 2θ , 2

and

sin 2θ = 2 cos θ sin θ ,

' & f (θ) = a2 + b2 + α cos 2θ + β sin 2θ /2 .

Moreover, by the Cauchy-Schwarz inequality, 1 1 1 |α cos 2θ + β sin 2θ| ≤ α2 + β 2 cos2 2θ + sin2 2θ = α2 + β 2 , with equality if and only if



 cos 2θ α =γ sin 2θ β

for some γ ∈ R .

18.4. Fitting a line in R p

205

Therefore, γ 2 (α2 + β 2 ) = cos2 2θ + sin2 θ = 1, i.e., γ = ±(α2 + β 2 )−1/2 . Since β > 0 and sin 2θ > 0 when θ ∈ (0, π/2), we must choose γ = (α2 + β 2 )−1/2 . Consequently, cos 2θ =

(α2

α , + β 2 )1/2

sin 2θ =

(α2

β , + β 2 )1/2

and

cot 2θ =

α . β

Thus, as cot 2θ decreases strictly monotonically from +∞ to −∞ as θ increases from 0 to π/2, there exists exactly one angle θ1 ∈ (0, π/2) such that cot 2θ1 = α/β. If α > 0, then θ1 ∈ (0, π/4); if α = 0, then θ1 = π/4; if α < 0, then θ1 ∈ (π/4, π/2). Exercise 18.6. Show that the function f (θ) defined in (18.13) is subject to the bounds s22 ≤ f (θ) ≤ s21 , where 1 1 s21 = {a2 + b2 + α2 + β 2 }/2, s22 = {a2 + b2 − α2 + β 2 }/2, and s1 and s2 are the singular values of the matrix A in (18.11). Exercise 18.7. Continuing Exercise 18.6, show that f (θ) > s2 if 0 < θ ≤ π/2, but there exists exactly one angle θ2 ∈ (π/2, π) such that cos 2θ2 = − cos 2θ1 and sin 2θ2 = − sin 2θ1 , where θ1 ∈ (0, π/2) and f (θ1 ) = s21 . Then check that θ2 = θ1 + π/2 and f (θ2 ) = s22 . Exercise 18.8. Continuing Exercise 18.7, show that if A is the matrix defined in (18.11), then

2 

 s1 0 cos θ1 cos θ2 T 2 −1 2 , U= , A A = U S U , where S = sin θ1 sin θ2 0 s22 and U U T = I2 , and then express U in terms of θ1 only. Exercise 18.9. Find the angle θ1 for the best fitting line in the sense that  T it minimizes the total mean square distance for the points x1 = 1 2 ,  T  T x2 = 2 1 , and x3 = 2 3 .

18.4. Fitting a line in R p If x, u, v ∈ R p and v = 1, then the distance of x from the line {u + tv : t ∈ R} is equal to min x − u − tv = x − u − ΠV (x − u) , t∈R

where ΠV denotes the orthogonal projection of R p onto the vector space V = span{v}.

206

18. Applications

We wish to choose the line, i.e., the vectors u and v, to minimize k 

xj − u − ΠV (xj − u)2

j=1

for a given set of k points x1 , . . . , xk ∈ R p .  Let a = k −1 kj=1 xj . Then k 

(xj − a) = 0

j=1

and the sum of interest is equal to k 

xj − a − ΠV (xj − a) + a − u − ΠV (a − u)2

j=1

(18.14)

=

k  3

4 xj − a − ΠV (xj − a)2 + a − u − ΠV (a − u)2 ,

j=1

since −2

k 

xj − a − ΠV (xj − a), a − u − ΠV (a − u) = 0 .

j=1

Thus, as both of the terms inside the curly brackets in the second line of (18.14) are nonnegative, the sum is minimized by choosing u = a and then choosing the unit vector v to minimize k 

xj − a − ΠV (xj − a)2 =

j=1

k 

{xj − a2 − ΠV (xj − a)2 }

j=1

=

k 

{xj − a2 − xj − a, v2 } .

j=1

But this is the same as choosing v to maximize k k   2 xj − a, v = vT (xj − a)(xj − a)T v = vT Y Y T v , j=1

j=1

  where Y = x1 − a · · · xk − a . Thus, if Y is expressed in terms of its singular value decomposition Y = V1 S1 U1T , then Y Y T = V1 S12 V1T and the sum is maximized by choosing v equal to the first column of V1 .

18.5. Schur complements for semidefinite matrices

207

18.5. Schur complements for semidefinite matrices In this section we shall show that if A  O, then analogues of the Schur complement formulas considered in (3.18) and (3.19) hold even if neither of the block diagonal entries are invertible. (Similar formulas hold if A ! O.) Recall that B † denotes the Moore-Penrose inverse of B and that BB † = (BB † )H is the orthogonal projection onto the range of B. Lemma 18.5. If a positive semidefinite matrix A ∈ C n×n is written in standard four-block form as 

A11 A12 A= A21 A22 with A11 ∈ C p×p , A22 ∈ C q×q , and n = p + q, then: (1) NA11 ⊆ NA21 and NA22 ⊆ NA12 . (2) RA12 ⊆ RA11 and RA21 ⊆ RA22 . (3) A11 A†11 A12 = A12 , A11 A†11 = A†11 A11 , A22 A†22 A21 = A21 , and A22 A†22 = A†22 A22 . (4) The matrix A admits the (lower-upper) factorization   

A11 Ip O O Ip A†11 A12 . (18.15) A = O Iq A21 A†11 Iq O A22 − A21 A†11 A21 (5) The matrix A admits the (upper-lower) factorization   

O Ip A11 − A12 A†22 A21 O Ip A12 A†22 . (18.16) A = A†22 A21 Iq O Iq O A22 Proof. Since A  O, the inequality xH (A11 x + A12 y) + yH (A21 x + A22 y) ≥ 0 must be in force for every choice of x ∈ C p and y ∈ C q . If x ∈ NA11 , then this reduces to xH A12 y + yH (A21 x + A22 y) ≥ 0 for every choice of y ∈ C q and hence, upon replacing y by εy, to εxH A12 y + εyH A21 x + ε2 yH A22 y ≥ 0 for every choice of ε > 0 as well. Consequently, upon dividing through by ε and then letting ε ↓ 0, it follows that xH A12 y + yH A21 x ≥ 0 for every choice of y ∈ C q . But if y = −A21 x, then, as A12 = AH 21 , the last inequality implies that −2A21 x2 = −2xH A12 A21 x ≥ 0 .

208

18. Applications

Therefore, A21 x = 0 , i.e., NA11 ⊆ NA21 and, since the orthogonal complements of these two sets satisfy the opposite inclusion, RA12 = (NA21 )⊥ ⊆ (NA11 )⊥ = RA11 , H as A12 = AH 21 and A11 = A11 . This completes the verification of the first assertions in (1) and (2); the second assertions in (1) and (2) may be verified in much the same way.

Next, the formulas in (2) imply that there exists a pair of matrices X ∈ C p×q and Y ∈ C q×p such that A12 = A11 X and A21 = A22 Y . Therefore, A11 A†11 A12 = A11 A†11 A11 X = A11 X = A12 and

A22 A†22 A21 = A22 A†22 A22 Y = A22 Y = A21 .

This justifies two of the formulas in (3); the other two follow from the fact that A11  O and A22  O. Items (4) and (5) are straightforward computations based on the formulas in (3) and their Hermitian transposes. They are left to the reader.  Exercise 18.10. Show that if A  O, then, in the notation of Lemma 18.5, A11 A†11 = A†11 A11 and A†22 A22 = A22 A†22 . Exercise 18.11. Verify the identities in (4) and (5) of Lemma 18.5.

18.6. von Neumann’s inequality for contractive matrices A matrix A ∈ C p×q is said to be contractive if A ≤ 1. Exercise 18.12. Show that if A ∈ C p×q , then (18.17)

A ≤ 1 ⇐⇒ Iq − AH A  O ⇐⇒ Ip − AAH  O .

Exercise 18.13. Show that if A ≤ 1, then (18.18)

A(Iq − AH A)1/2 = (Ip − AAH )1/2 A .

[HINT: In the usual notation for svd’s, (Iq − AH A)1/2 = U1 (Ir − S12 )1/2 U1H + U2 U2H ; also keep in mind that diagonal matrices commute.] Lemma 18.6. If A ∈ C p×p and A ≤ 1, then there exists an np × np unitary matrix Bn with the special property that   (18.19) E T Bnk E = Ak for E T = Ip O · · · O , k = 0, . . . , n − 1 , when n ≥ 2.

18.7. Supplementary notes

209

Proof. Let DA = (Ip − AH A)1/2 and DAH = (Ip − AAH )1/2 , for short. Then, since ADA = DAH A, it is readily checked by direct calculation that the matrices ⎤ ⎡ ⎤ ⎡ A O O D AH 

A O D AH ⎢DA O O −AH ⎥ A D AH ⎥, , B3 = ⎣DA O −AH ⎦ , B4 = ⎢ B2 = H ⎣ O Ip O DA −A O ⎦ O Ip O O O O Ip ⎡ ⎤ ⎤ ⎡ Op×p A ΘTn−2 DAH ⎢ .. ⎥ T H ⎦ ⎣ Θn−2 −A with Θn−2 = ⎣ . ⎦ ∈ C (n−2)p · · · , Bn = D A Θn−2 I(n−2)p Θn−2 Op×p possess the requisite properties for n = 2, 3, 4 and n ≥ 5, respectively.  Theorem 18.7. If A ∈ C p×p and A ≤ 1, then (18.20)

f (A) ≤ max{|f (λ)| : λ ∈ C

and

|λ| = 1}

for every polynomial f (λ). Proof. Let f (λ) be a polynomial of degree at most n with n ≥ 2 and let B = Bn+1 be a unitary matrix of the form indicated in Lemma 18.6. Then (18.19) holds and, as B = U DU H with U unitary and D = diag{λ1 , . . . , λ(n+1)p } with |λj | = 1 for j = 1, . . . , (n + 1)p, it is then easily checked by direct calculation that f (A) = E T f (B)E ≤ f (B) = f (D) = max{|f (λj )| : j = 1, . . . , np}, which is clearly less than or equal to the right side of (18.20).



18.7. Supplementary notes Sections 18.3–18.6 are taken from [30]. The discussion in Sections 18.3 and 18.4 was adapted from Shuchat [69]. The discussion in Section 18.6 was adapted from Levy and Shalit [56]; see also [60] for further developments and the references in both for extensions to more general settings. Another approach to the minimization problem considered in Theorem 18.1 will be furnished in Example 21.1 for matrices A ∈ R n×n .

Chapter 19

Discrete dynamical systems

A discrete dynamical system is a sequence of vectors x0 , x1 , x2 , . . . in a set E that are generated by some rule. An intriguing example with a deceptively simple formulation is the 3x + 1 problem, wherein E = {1, 2, 3, . . .} is the set of positive integers and the sequence is {x, T x, T 2 x, . . .}, in which x ∈ E, T x = 3x+1 if x is odd, and T x = x/2 if x is even. Thus, for example, the initial choice x = 13, leads to the sequence 13, 40, 20, 10, 5, 16, 8, 4, 2, 1 (if one stops at 1). It is conjectured that for every initial state x ∈ E, there exists an integer k such that T k x = 1. This has been verified by computer for initial states up to 1020 , but a definitive answer is not known. In this chapter we shall restrict our attention to sequences in C p of the form (19.1)

xk+1 = Axk + fk ,

k = 0, 1, . . . ,

in which A ∈ C p×p , x0 ∈ C p , and f0 , f1 , f2 , . . . are specified vectors in C p ; our objective is to understand the behavior of the solution xn as n gets large. We shall subsequently use this understanding to study difference equations of the form (19.9).

19.1. Homogeneous systems The system (19.1) is said to be homogeneous if fk = 0 for k = 0, 1, . . .. The solution of the homogeneous system is xn = An x0

for n = 0, 1, . . . . 211

212

19. Discrete dynamical systems

This is a nice formula. However, it does not provide much insight into the behavior of xn . This is where the fact that A is similar to a Jordan matrix J comes into play: (19.2)

A = V JV −1 =⇒ xn = V J n V −1 x0

for n = 0, 1, . . . .

The advantage of this new formulation is that J n is relatively easy to compute: If A is diagonalizable, then J = diag{λ1 , . . . , λp } ,

J n = diag{λn1 , . . . , λnp } ,

and xn =

p 

dj λnj vj

j=1

is a linear combination of the eigenvectors vj of A, alias the columns of V , with coefficients that are proportional to λnj . If A is not diagonalizable, then J = diag{J1 , . . . , Jr } , where each block entry Ji is a Jordan cell, and J n = diag{J1n , . . . , Jrn } . Consequently, the key issue reduces to understanding the behavior of the (m) (m) n’th power (Cλ )n of the m × m Jordan cell Cλ as n tends to ∞. Fortunately, this is still relatively easy: (m)

Lemma 19.1. If N = Cλ (19.3)

(m) (Cλ )n

=

(m)

− λIm = C0

m−1  j=0

, then

 n n−j j λ N j

when

n ≥ m.

Proof. Since N commutes with λIm , the binomial theorem is applicable and supplies the formula n    n n−j j (m) (Cλ )n = (λIm + N )n = λ N . j j=0

But this is the same as formula (19.3), since N j = 0 for j ≥ m. (m)



The matrix (Cλ )n is an upper triangular Toeplitz matrix, i.e., it is constant on diagonals. Thus, it is completely specified by its top row: & n ' n−m+1  &n' n &n' n−1 &n' n−2 · · · λ λ λ . 0 1 2 m−1 λ

19.1. Homogeneous systems

213

If m = 3 and n > 3 for example, then ⎡&n' n &n' n−1 &n' n−2 ⎤ 0 λ 1 λ 2 λ ⎢ ⎥ ⎢ &n' n &n' n−1 ⎥ (3) n ⎢ ⎥. (Cλ ) = ⎢ 0 0 λ 1 λ ⎥ ⎣ ⎦ &n' n 0 0 0 λ

  Exercise 19.1. Show that if J = diag{λ1 , . . . , λp }, V = v1 · · · vp , and   (V −1 )T = w1 · · · wp , then the solution (19.2) of the homogeneous system can be expressed in the form xn =

p 

λnj vj wjT x0 .

j=1

Exercise 19.2. Show that if, in the setting of Exercise 19.1, |λ1 | > |λj | for j = 2, . . . , p, then 1 lim n xn = v1 w1T x0 . n↑∞ λ1 Exercise 19.3. The output un of a chemical plant at time n, n = 0, 1, . . ., is modeled by a system of the form un = An u0 . Show that if ⎡ ⎤ ⎡ ⎤ ⎡ ⎤ a a − 3b 1 −3/2 0 ⎦. 0 A=⎣ 0 1/2 0 ⎦ and u0 = ⎣ b ⎦ , then lim un = ⎣ n→∞ c 0 0 0 1/4 Exercise 19.4. Find an explicit formula for the solution un of the system un = An u0 when ⎡ ⎤ ⎡ ⎤ 2 1 2 0 A = ⎣0 1 3⎦ and u0 = ⎣3⎦ . 0 0 0 1 It is not necessary to compute V −1 in the formula for the solution in (19.2). It is enough to compute V −1 x0 , which is often much less work (since only some of the minors of V may be needed): Set y0 = V −1 x0

and solve the equation V y0 = x0 . ⎡ ⎤ ⎡ ⎤ 6 2 2 6 Exercise 19.5. Calculate V −1 x0 when V = ⎣0 3 1⎦ and x0 = ⎣0⎦, 0 0 1 0 −1 and then calculating the product both directly (i.e., by first calculating V V −1 x0 ) and indirectly by solving the equation V y0 = x0 , and compare the effort.

214

19. Discrete dynamical systems

19.2. Nonhomogeneous systems In this section we look briefly at the nonhomogeneous system (19.1). It is readily checked that x1 = Ax0 + f0 , x2 = Ax1 + f1 = A2 x0 + Af0 + f1 , x3 = Ax2 + f2 = A3 x0 + A2 f0 + Af1 + f2 , and, in general, n

(19.4)

xn = A x0 +

n−1 

An−1−k fk .

k=0

 n−1−k f for Exercise 19.6. Let un = An x0 for n = 0, 1, . . ., vn = n−1 k k=0 A n = 1, 2 . . ., and v0 = 0. Show that un is a solution of the homogeneous system un+1 = Aun for n = 0, 1, . . . with u0 = x0 and that vn is a solution of the system vn+1 = Avn + fn for n = 0, 1, . . . with v0 = 0.

19.3. Second-order difference equations A second-order homogeneous difference equation is an equation of the form a0 xn + a1 xn+1 + a2 xn+2 = 0, n = 0, 1, . . . , with a0 a2 = 0 ,

(19.5)

where a0 , a1 , a2 are fixed and x0 and x1 are given. The objective is to obtain a formula for xn and, if possible, to understand how xn behaves as n ↑ ∞. We shall solve this second-order difference equation by embedding it into a first-order vector equation by setting  

xn 0 1 so that xn+1 = x for n = 0, 1, . . . . xn = xn+1 −a0 /a2 −a1 /a2 n Thus,

n

xn = A x0

0 1 with A = −a0 /a2 −a1 /a2

 for n = 0, 1, . . . .

Since A is a companion matrix, Theorem 8.1 implies that det(λI2 − A) = (a0 + a1 λ + a2 λ2 )/a2 = (λ − λ1 )(λ − λ2 ) and that there are only two possible Jordan forms:

 

λ1 1 λ1 0 if λ1 = λ2 and J = (19.6) J = 0 λ2 0 λ1

if λ1 = λ2 .

19.3. Second-order difference equations

215

Therefore, A = U JU −1 , where U is a Vandermonde Vandermonde matrix: 

1 1 1 if λ1 = λ2 and U = U= λ1 λ2 λ1

matrix or a generalized  0 1

if λ1 = λ2 .

Moreover, since a0 = 0 by assumption, λ1 λ2 = 0. Case 1 (λ1 = λ2 ): (19.7)

xn = An u0 = U J n U −1 x0 =

1 1 λ1 λ2



 λn1 0 U −1 x0 . 0 λn2

Consequently, (19.8)

 n   1 1   λ1 0 xn = 1 0 U −1 x0 = λn1 λn2 U −1 x0 . n 0 λ2 λ1 λ2 

However, it is not necessary to calculate U −1 . It suffices to note that formula (19.8) guarantees that xn must be of the form xn = αλn1 + βλn2

(λ1 = λ2 )

and then to solve for α and β from the given initial conditions: x0 = α + β and x1 = αλ1 + βλ2 . Case 2 (λ1 = λ2 ):

λ 1 xn = A x0 = U 1 0 λ1 n

n U

−1

1 0 x0 = λ1 1



 λn1 nλn−1 1 U −1 x0 . 0 λn1

Consequently,   xn = 1 0

1 0 λ1 1



   −1 λn1 nλn−1 1 U x0 U −1 x0 = λn1 nλn−1 n 1 0 λ1

must be of the form xn = αλn1 + βnλn1 . The coefficients α and β are obtained from the initial conditions: x0 = α and x1 = αλ1 + βλ1 . Notice that the second term in the solution was written as βnλn1 and . This is possible because λ1 = 0, and hence a (positive or not as βnλn−1 1 negative) power of λ1 can be absorbed into the constant β. The preceding analysis leads to the following recipe for obtaining the solutions of the second-order (homogeneous) difference equation a2 xn+2 + a1 xn+1 + a0 xn = 0

for n = 0, 1, . . . ,

with a2 a0 = 0

216

19. Discrete dynamical systems

and initial conditions x0 = c and

x1 = d :

(1) Solve for the roots λ1 , λ2 of the polynomial a2 λ2 + a1 λ + a0 and note that the factorization a2 (λ − λ1 )(λ − λ2 ) = a2 λ2 + a1 λ + a0 implies that λ1 λ2 = a0 /a2 = 0. (2) Express the solution as αλn1 + βλn2 if λ1 = λ2 , xn = αλn1 + βnλn1 if λ1 = λ2 for some choice of α and β. (3) Solve for α and β by invoking the initial conditions: c = x0 = α + β and d = x1 = αλ1 + βλ2 if λ1 = λ2 , c = x0 = α

and

d = x1 = αλ1 + βλ1

if λ1 = λ2 .

Example 19.1. Let xn+2 − 3xn+1 − 4xn = 0 for n = 0, 1, . . . , with initial conditions x0 = 5 and x1 = 0. Discussion. The roots of the equation λ2 −3λ−4 are λ1 = 4 and λ2 = −1. Therefore, the solution xn must be of the form xn = α4n + β(−1)n

for n = 0, 1, . . . .

The initial conditions x0 = 5 and x1 = 0 imply that α+β =5

and

α4 − β = 0 .

Thus, α = 1, β = 4, and xn = 4n + 4(−1)n

for

n = 0, 1, . . . .

Example 19.2. Let xn+2 − 2xn+1 + xn = 0 for n = 0, 1, . . . with initial conditions x0 = 3 and x1 = 5. Discussion.

The equation λ2 − 2λ + 1 = 0 has two equal roots: λ1 = λ2 = 1.

Therefore, xn = α1n + βn1n = α + βn . Substituting the initial conditions x0 = α = 3

and

x1 = α + β = 5 ,

we see that α = 3 and β = 2 and hence that xn = 3 + 2n

for

n = 0, 1, . . . .

19.4. Higher-order difference equations

217

Exercise 19.7. Find an explicit formula for xn , for n = 0, 1, . . ., given that x0 = −1, x1 = 2, and xk+1 = 3xk − 2xk−1 for k = 1, 2, . . .. Exercise 19.8. The Fibonacci sequence xn , n = 0, 1, . . ., is prescribed by the initial conditions x0 = 1, x1 = 1 and the difference equation xn+1 = xn + xn−1 for n = 1, 2, . . .. Find an explicit formula for xn and use it to calculate the golden mean, limn↑∞ xn+1 /xn . Exercise 19.9. Let xn = txn−1 + (1 − t)xn+1 for n = 1, 2, . . . . Evaluate the limn↑∞ xn as a function of t for 0 < t < 1 when x0 = 0 and x1 = 1.

19.4. Higher-order difference equations The solution of the p’th-order equation (19.9) a0 xn + a1 xn+1 + · · · + ap xn+p = 0 ,

n = 0, 1, . . . ,

with a0 ap = 0

and given initial conditions x0 , x1 , . . . , xp−1 can be obtained from the solution of the first-order vector equation xn = Axn−1 where (19.10)







0 0 .. .

for

n = 0, 1, . . . ,

1 0

0 1

···

xn ⎢ ⎢ xn+1 ⎥ ⎢ ⎢ ⎥ ⎢ xn = ⎢ . ⎥ , A = ⎢ ⎢ ⎣ .. ⎦ ⎣ 0 0 0 xn+p−1 −a0 /ap −a1 /ap −a2 /ap · · ·

0 0 .. . 1

⎤ ⎥ ⎥ ⎥ ⎥. ⎥ ⎦

−ap−1 /ap

The solution

  xn = 1 0 · · · 0 xn is obtained from the top entry in xn . To see how this works, consider the case of a 6 × 6 companion matrix A with three distinct eigenvalues λ1 , λ2 , and λ3 and det (λI6 − A) = (λ − λ1 )3 (λ − λ2 )2 (λ − λ1 ) and let T  v(λ) = 1 λ · · · λ5 . Then A = V JV −1 with    (λ ) 1 V = v(λ1 ) v (λ1 ) v 2! v(λ2 ) v (λ2 ) v(λ3 ) , (3)

(2)

(1)

and J = diag{Cλ1 , Cλ2 , Cλ3 }. Therefore,  xn = 1 0 · · ·

  0 An x0 = 1 0 · · ·

⎡ (3) ⎤n C O O  ⎢ λ1 ⎥ 0 V ⎣ O C (2) O ⎦ c , λ2 O O λ3

with c = V −1 x0 and, as     1 0 0 0 0 0 V = 1 0 0 1 0 1 ,

218

19. Discrete dynamical systems

             n n n n−1 n n−2 n n n n−1 n n xn = λ1 λ1 λ1 λ2 λ2 λ3 c 0 1 2 0 1 0 

n n n n n(n − 1) λn λn n λn λ λ λ = 1 λ 1 1 2 3 c. λ2 2 1 2λ21 (3)

(2)

Thus, xn is a linear combination of the top rows of (Cλ1 )n , (Cλ2 )n , and (1)

(Cλ3 )n . But, this is the same as saying λn1 , nλn1 , n2 λn1 , λn2 , nλn2 , and λn3 ; e.g., ⎡

 1   n(n − 1) 2 ⎣0 1 n 1 n n = μ 2μ2 0

that xn is a linear combination of ⎤ 0 0 1/μ −1/(2μ2 )⎦ 0 1/(2μ2 )

when μ = 0

and the lower triangular matrix on the right is invertible. This serves to motivate the following recipe for obtaining the solution of equation (19.9): (1) Find the roots of the polynomial a0 + a1 λ + · · · + ap λp . (2) If a0 + a1 λ + · · · + ap λp = ap (λ − λ1 )α1 · · · (λ − λk )αk with distinct roots λ1 , . . . , λk , then the solution must be of the form xn =

k 

pj (n)λnj ,

where

pj

is a polynomial of degree αj − 1 .

j=1

(3) Invoke the initial conditions to solve for the coefficients of the polynomials pj . Discussion.

The algorithm works because A is a companion matrix. Thus, det (λIp − A) = (a0 + a1 λ + · · · + ap λp )/ap

and hence, if det (λIp − A) = (λ − λ1 )α1 · · · (λ − λk )αk with distinct roots λ1 , . . . , λk , then A is similar to the Jordan matrix (α )

(α )

J = diag{Cλ1 1 , . . . , Cλk k } , with one Jordan cell for each distinct eigenvalue, and the matrix U in the Jordan decomposition A = U JU −1 is a generalized Vandermonde matrix. Therefore, the solution must be of the form indicated in (2). Remark 19.2. The equation a0 + a1 λ + · · · + ap λp = 0 for the eigenvalues of A may be obtained with minimum thought by letting xj = λj in equation (19.9) and then factoring out the highest common power of λ. The assumption a0 = 0 ensures that the eigenvalues are all nonzero.

19.6. Supplementary notes

219

Exercise 19.10. Find the solution of the third-order difference equation xn+3 − 3xn+2 + 3xn+1 − xn = 0 , n = 0, 1, . . . , subject to the initial conditions x0 = 1, x1 = 2, and x2 = 8. [HINT: (x − 1)3 = x3 − 3x2 + 3x − 1.]

19.5. Nonhomogeneous equations The solution xn of the p’th-order equation (19.11) a0 xn + a1 xn+1 + · · · + ap xn+p = fn , n = 0, 1, . . . ,

with

a0 ap = 0

and given initial conditions xj = cj for j = 0, . . . , p − 1, can be obtained from the solution of the first-order vector equation xn = Axn−1 + fn

for

n = 0, 1, . . . ,   where xn and A are as in (19.10) and = 0 · · · 0 fn /ap . Let ej denote the j’th column of Ip for j = 1, . . . , p. Then, by (19.4), which is valid for any A ∈ C p×p , fnT

xn = eT1 xn = eT1 An x0 + eT1 

n−1 

An−1−j fj .

j=0

 Exercise 19.11. Show that if fnT = 0 · · · 0 fn /ap and xn and A are as in (19.10), then un = eT1 An x0 for n = 0, 1, . . . is a solution of the homogeneous equation (19.9) with uj = cj for j = 0, . . . , p − 1, whereas  n−1−j f for n = 1, 2, . . . and v = 0 is a solution of (19.11) vn = eT1 n−1 j 0 j=0 A with v0 = · · · = vp−1 = 0.

19.6. Supplementary notes This chapter was adapted from Chapter 13 of [30]. The 3x + 1 problem was formulated by L. Collatz in 1937. It is currently regarded as unsolvable. A perhaps surprising application of the methods introduced in this chapter is to the computation of determinants of some special classes of matrices: (n)

(n)

Exercise 19.12. Let An = 5In + 2C0 + 2(C0 )T . Compute det An . [HINT: det An is a solution of the difference equation xn −5xn−1 +4xn−2 = 0 for n = 3, 4, . . ..] (n)

Exercise 19.13. Let An = bIn + cC0 det An .

(n)

+ c(C0 )T with b, c ∈ R. Compute

Chapter 20

Continuous dynamical systems

A continuous dynamical system is a curve x(t) in a set E that evolves according to some rule as t runs over an interval I ⊆ R. In this chapter we will focus initially on the special case in which x(t) is a solution of the first-order vector differential equation x (t) = Ax(t) + f (t), α ≤ t < β, based on a matrix A ∈ C p×p and a p × 1 vector-valued function f (t). We shall then use the theory established for systems to develop an algorithm for solving differential equations of the form (20.17). We begin with some prerequisites.

20.1. Preliminaries on matrix-valued functions Let



f11 (t) · · · ⎢ .. F (t) = ⎣ . fp1 (t) · · ·

⎤ f1q (t) .. ⎥ . ⎦ fpq (t)

be a p × q matrix-valued functions with entries fij (t) that are smooth functions. Then, the rules of matrix addition imply that ⎤ ⎡   (t) f11 (t) · · · f1q F (t + ε) − F (t) ⎢ . .. ⎥ . = ⎣ .. F  (t) = lim . ⎦ ε→0 ε   fp1 (t) · · · fpq (t) Consequently, the formula 6

t

F  (s)ds = F (t) − F (a)

a

221

222

20. Continuous dynamical systems

will hold if and only if ⎡, t

6

t a

a

⎢ F (s)ds = ⎣ ,t a

,t

f11 (s)ds · · · .. .

a

,t

fp1 (s)ds · · ·

a

⎤ f1q (s)ds ⎥ .. ⎦, . fpq (s)ds

i.e., differentiation and integration of a matrix-valued function are carried out on each entry in the matrix separately:

6 b  6 b   F (t) = [fij (t)] and F (s)ds = fij (s)ds . a

a

Moreover, if B ∈ C k×p and C ∈ C q×r , then 6

6

b

F (s)ds =

(20.1) B

6

b

BF (s)ds

a

and

a

6

b

b

F (s)ds C = a

F (s)Cds , a

i.e., multiplication by constant matrices can be brought inside the integral. To verify this, notice, for example, that the ij entry of the first term on the left in (20.1) is equal to p  k=1

6

b

bik

 F (s)ds

a

=

kj

p  k=1 b

6

6

b

bik

 6 b p fkj (s)ds = bik fkj (s)ds

a

a k=1

(BF (s))ij ds .

= a

Similar considerations serve to justify the second formula in (20.1) and the rule for differentiating the product of two matrix-valued functions: (F (s)G(s)) = F  (s)G(s) + F (s)G (s) . Thus, if F (s) and G(s) are p × p matrix-valued functions such that F (s)G(s) = Ip , then O = F  (s)G(s) + F (s)G (s) ,

i.e., G (s) = −F (s)−1 F  (s)F (s)−1 .

20.2. The exponential of a matrix The exponential eA of a matrix A ∈ C p×p is defined by the formula (20.2)

A

e =

∞  Aj j=0

j!

= Ip + A +

A2 + ··· . 2!

20.3. Systems of differential equations

223

The two main facts that we shall use are: (1) If A, B ∈ C p×p and AB = BA, then eA+B = eA eB . (2) If F (t) = etA , then F  (t) = AF (t) = F (t)A, since   hA − Ip e(t+h)A − etA F (t + h) − F (t) tA e = =e → etA A = AetA h h h as h tends to zero, and F (0) = Ip . 

0 b2 . Exercise 20.1. Calculate eA when A = 2 c 0 

a b2 A . [HINT: aI2 and A − aI2 Exercise 20.2. Calculate e when A = 2 c a commute.] Exercise 20.3. Show that if A, B ∈ C p×p , then etA etB e−tA e−tB − Ip = AB − BA . t→0 t2 ≈ Ip + tA + (t2 /2)A2 when t is close to zero.] lim

[HINT: etA

Exercise 20.4. Show that if A, B ∈ C p×p and etA etB = etA+tB for t ∈ R, then AB = BA. [HINT: Exploit the formula in Exercise 20.3.]

20.3. Systems of differential equations The set of solutions of the homogeneous system x (t) − Ax(t) = 0

(20.3)

is a vector space. In view of item (2) in the preceding section, x(t) = e(t−a)A c

(20.4)

is a solution of (20.3) for t ≥ a with initial condition x(a) = c. If A = V JV −1

for some Jordan matrix J, then

etA = V etJ V −1

and (20.5)

x(t) = V e(t−a)J d ,

where

d = V −1 x(a) .

Note that it is not necessary to calculate V −1 , since only d is needed. The advantage of this new formula is that it is easy to calculate etJ : If J = diag{λ1 , . . . , λp } , then etJ = diag{etλ1 , . . . , etλp }     and hence, upon writing V = v1 · · · vp and dT = d1 · · · dp , (20.6)

x(t) =

p  j=1

dj e(t−a)λj vj ,

224

20. Continuous dynamical systems

which exhibits the set {etλ1 v1 , . . . , etλp vp } of vector-valued functions of t as a basis for the set of solutions of the homogeneous system (20.3), i.e., of the null space of the linear transformation that maps x(t) into x (t) − Ax(t). If A is not diagonalizable, then J = diag{J1 , . . . , Jr }

and

etJ = diag{etJ1 , . . . , etJr } ,

where each block entry J_i is a Jordan cell and the set of columns of V e^{tJ} is a basis for the set of solutions of (20.3).

Lemma 20.1. If N = C_λ^{(m)} − λI_m = C_0^{(m)}, then

(20.7)  e^{tC_λ^{(m)}} = e^{t(λI_m + N)} = e^{tλ} e^{tN} = e^{tλ} Σ_{j=0}^{m−1} (tN)^j/j! .

Proof. Since (λI_m)N = N(λI_m) and N^m = O,

e^{tC_λ^{(m)}} = e^{t(λI_m + N)} = e^{tλI_m} e^{tN} = e^{tλ} e^{tN} = e^{tλ} { I_m + tN + ··· + t^{m−1} N^{m−1}/(m − 1)! } .  □

Formula (20.7) exhibits e^{tC_λ^{(m)}} as an upper triangular Toeplitz matrix. Thus, for example, if m = 3, then

e^{tC_λ^{(3)}} = [ e^{tλ}  te^{tλ}  (t²/2!)e^{tλ} ; 0  e^{tλ}  te^{tλ} ; 0  0  e^{tλ} ] .

The same pattern propagates for every Jordan cell.

Exercise 20.5. Show that if J = diag{λ_1, ..., λ_p}, V = [v_1 ··· v_p], and (V^{−1})^T = [w_1 ··· w_p], then the solution (20.4) of the system (20.3) can be expressed in the form

x(t) = Σ_{j=1}^p e^{(t−a)λ_j} v_j w_j^T x(a) .

Exercise 20.6. Show that if, in the setting of Exercise 20.5, λ_1 > |λ_j| for j = 2, ..., p, then

lim_{t↑∞} e^{−tλ_1} x(t) = e^{−aλ_1} v_1 w_1^T x(a) .

Exercise 20.7. Find an explicit formula for e^{tA} when

A = [ 0  1  0 ; −1  0  1 ; 0  −1  0 ] .

[HINT: You may use the fact that the eigenvalues of A are equal to 0, i√2, and −i√2.]

Exercise 20.8. Let A = VJV^{−1}, where J = [ 2  1  0 ; 0  2  0 ; 0  0  3 ], V = [v_1 v_2 v_3], and (V^T)^{−1} = [w_1 w_2 w_3]. Evaluate the limit of the matrix-valued function e^{−3t} e^{tA} as t ↑ ∞.
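A short numerical check of formula (20.7) may also be helpful; the sketch below (illustrative values only, not from the text) compares the finite sum in (20.7) with scipy.linalg.expm for a single Jordan cell.

```python
# Check of (20.7): e^{tC} = e^{t*lam} * sum_{j<m} (tN)^j / j! for a Jordan cell.
import numpy as np
from scipy.linalg import expm
from math import factorial

lam, m, t = 2.0, 4, 0.7                      # illustrative values
C = lam * np.eye(m) + np.eye(m, k=1)         # Jordan cell with eigenvalue lam
N = np.eye(m, k=1)                           # nilpotent part, N^m = 0

poly = sum(np.linalg.matrix_power(t * N, j) / factorial(j) for j in range(m))
print(np.allclose(expm(t * C), np.exp(t * lam) * poly))   # True
```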

20.4. Uniqueness

Formula (20.4) provides a (smooth) solution to the first-order vector differential equation (20.3). However, it remains to check that there are no others.

Lemma 20.2. The differential equation (20.3) has only one solution x(t) with continuous derivative x'(t) on the interval a ≤ t ≤ b that meets the specified initial condition at t = a.

Proof. Suppose to the contrary that there are two solutions x(t) and y(t) and let u(t) = x(t) − y(t). Then u(a) = 0 and

u(t) = ∫_a^t u'(s) ds = ∫_a^t Au(s) ds .

Therefore, upon iterating the last equality, we obtain the formulas

u(t) = A^n ∫_a^t ∫_a^{s_1} ··· ∫_a^{s_{n−1}} u(s_n) ds_n ··· ds_1   for n = 2, 3, ... ,

which, upon setting M = max{‖u(t)‖ : a ≤ t ≤ b}, leads to the inequality

M ≤ M ‖A^n‖ (b − a)^n/n! ≤ M ‖A‖^n (b − a)^n/n! .

If n is large enough, then ‖A‖^n (b − a)^n/n! < 1 and hence

0 ≤ M { 1 − ‖A‖^n (b − a)^n/n! } ≤ 0 .
Therefore, M = 0; i.e., there is only one smooth solution of the differential equation (20.3) that meets the given initial conditions.  □

Much the same sort of analysis leads to Gronwall's inequality:

Exercise 20.9. Let h(t) be a continuous real-valued function on the interval a ≤ t ≤ b. Show that

∫_a^t h(s_2) ( ∫_a^{s_2} h(s_1) ds_1 ) ds_2 = ( ∫_a^t h(s) ds )² / 2! ,

∫_a^t h(s_3) ∫_a^{s_3} h(s_2) ( ∫_a^{s_2} h(s_1) ds_1 ) ds_2 ds_3 = ( ∫_a^t h(s) ds )³ / 3! ,

etc.

Exercise 20.10 (Gronwall's inequality). Let α > 0 and let u(t) and h(t) be continuous real-valued functions on the interval a ≤ t ≤ b such that

u(t) ≤ α + ∫_a^t h(s) u(s) ds   and   h(t) ≥ 0  for a ≤ t ≤ b .

Show that

u(t) ≤ α exp{ ∫_a^t h(s) ds }   for a ≤ t ≤ b .

[HINT: Iterate the inequality and exploit Exercise 20.9.]
20.5. Isometric and isospectral flows

A matrix B ∈ R^{p×p} is said to be skew-symmetric if B = −B^T. Analogously, B ∈ C^{p×p} is said to be skew-Hermitian if B = −B^H.

Exercise 20.11. Let B ∈ C^{p×p}. Show that if B is skew-Hermitian, then B is normal and e^B is unitary.

Exercise 20.12. Let F(t) = e^{tB}, where B ∈ R^{p×p}. Show that F(t) is an orthogonal matrix for every t ∈ R if and only if B is skew-symmetric. [HINT: If F(t) is orthogonal, then the derivative {F(t)F(t)^T}' = 0.]

Exercise 20.13. Let B ∈ R^{p×p} and let x(t), t ≥ 0, denote the solution of the differential equation x'(t) = Bx(t) for t ≥ 0 that meets the initial condition x(0) = c ∈ R^p.

(a) Show that (d/dt)‖x(t)‖² = x(t)^T (B + B^T) x(t) for every t ≥ 0.

(b) Show that if B is skew-symmetric, then ‖x(t)‖ = ‖x(0)‖ for every t ≥ 0.

Exercise 20.14. Let A ∈ R^{p×p} and let U(t) and B(t), t ≥ 0, be one-parameter families of real p × p matrices such that U'(t) = B(t)U(t) for t > 0 and U(0) = I_p. Show that F(t) = U(t)AU(t)^{−1} is a solution of the differential equation

(20.8)  F'(t) = B(t)F(t) − F(t)B(t)   for t ≥ 0 .

Exercise 20.15. Show that if F(t) is the only smooth solution of a differential equation of the form (20.8) with suitably smooth B(t), then F(t) = U(t)F(0)U(t)^{−1} for t ≥ 0. [HINT: Consider U(t)F(0)U(t)^{−1} when U(t) is a solution of U'(t) = B(t)U(t) with U(0) = I_p.]

A pair of matrix-valued functions F(t) and B(t) that are related by equation (20.8) is said to be a Lax pair, and the solution F(t) = U(t)F(0)U(t)^{−1} is said to be isospectral because its eigenvalues are independent of t.

20.6. Nonhomogeneous differential systems

In this section we shall consider the nonhomogeneous differential system

(20.9)  x'(t) − Ax(t) = f(t) ,  a ≤ t < b ,  with initial condition x(a) = c ,

where A ∈ R^{n×n} and f(t) is a continuous n × 1 real vector-valued function on the interval a ≤ t < b. Then, since

x'(t) − Ax(t) = e^{tA} { e^{−tA} x(t) }' ,

it is readily seen that the given system can be expressed as

{ e^{−sA} x(s) }' = e^{−sA} f(s)

and hence, upon integrating both sides from a to a point t ∈ (a, b), that

e^{−tA} x(t) − e^{−aA} x(a) = ∫_a^t { e^{−sA} x(s) }' ds = ∫_a^t e^{−sA} f(s) ds

or, equivalently, that

(20.10)  x(t) = e^{(t−a)A} c + ∫_a^t e^{(t−s)A} f(s) ds   for a ≤ t < b .

To explore formula (20.10) further, let

(20.11)  u(t) = e^{(t−a)A} c   and   y(t) = ∫_a^t e^{(t−s)A} f(s) ds

and note that u(t) is a solution of the homogeneous equation

(20.12)  u'(t) = Au(t)   for a ≤ t with initial condition u(a) = c ,

whereas y(t) is a solution of the equation

(20.13)  y'(t) = Ay(t) + f(t)   for a ≤ t with initial condition y(a) = 0 .

The key to this calculation is the general formula (for suitably smooth functions)

(20.14)  (d/dt) ∫_0^t g(t, s) f(s) ds = g(t, t) f(t) + ∫_0^t (∂g/∂t)(t, s) f(s) ds

applied to each entry of y(t). Thus,

u'(t) + y'(t) = A(u(t) + y(t)) + f(t)   and   u(a) + y(a) = c ,

as needed.

Exercise 20.16. Show that if x and w are solutions of (20.9) with x(a) = w(a), then x(t) = w(t) for a ≤ t < b. [HINT: Lemma 20.2 is applicable.]
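Formula (20.10) can be checked against a general-purpose ODE solver. In the sketch below (illustrative data only, not from the text), the integral in (20.10) is evaluated with scipy.integrate.quad_vec and compared with the output of scipy.integrate.solve_ivp.

```python
# Variation-of-constants formula (20.10) vs. a numerical integrator.
import numpy as np
from scipy.linalg import expm
from scipy.integrate import solve_ivp, quad_vec

A = np.array([[0.0, 1.0], [-2.0, -3.0]])
c = np.array([1.0, 0.0])
f = lambda t: np.array([0.0, np.sin(t)])
a, t1 = 0.0, 2.0

# x(t1) from (20.10): e^{(t1-a)A} c + int_a^{t1} e^{(t1-s)A} f(s) ds
integral = quad_vec(lambda s: expm((t1 - s) * A) @ f(s), a, t1)[0]
x_formula = expm((t1 - a) * A) @ c + integral

# x(t1) from solving x' = Ax + f directly
sol = solve_ivp(lambda t, x: A @ x + f(t), (a, t1), c, rtol=1e-9, atol=1e-12)
print(np.allclose(x_formula, sol.y[:, -1], atol=1e-6))   # True
```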

20.7. Second-order differential equations

Ordinary differential equations with constant coefficients can be solved by imbedding them in first-order vector differential equations and exploiting the theory developed in Sections 20.3 and 20.6. Thus, for example, to solve the second-order differential equation

(20.15)  a_0 x(t) + a_1 x'(t) + a_2 x''(t) = f(t)   for t ≥ 0

with a_2 ≠ 0 and initial conditions x(0) = c_1 and x'(0) = c_2, let

x(t) = [ x(t) ; x'(t) ] .

Then, since x''(t) = (−a_0 x(t) − a_1 x'(t) + f(t))/a_2,

x'(t) = [ x'(t) ; x''(t) ] = [ 0  1 ; −a_0/a_2  −a_1/a_2 ] [ x(t) ; x'(t) ] + (1/a_2) [ 0 ; f(t) ] = Ax(t) + f(t)

for t ≥ 0, with

(20.16)  A = [ 0  1 ; −a_0/a_2  −a_1/a_2 ] ,  f(t) = (1/a_2) [ 0 ; f(t) ] ,  and  x(0) = [ c_1 ; c_2 ] = c .

Thus,

x(t) = e^{tA} c + ∫_0^t e^{(t−s)A} f(s) ds = e^{tA} c + y(t)

and

x(t) = [1  0] x(t) = [1  0] e^{tA} c + [1  0] ∫_0^t e^{(t−s)A} f(s) ds .

Let λ_1, λ_2 denote the roots of a_0 + a_1 λ + a_2 λ². Since A is a companion matrix, there are only two possible cases to consider. They correspond to the two Jordan forms described in (19.6).

Case 1 (λ_1 ≠ λ_2):

e^{tA} = V e^{tJ} V^{−1} ,  with  V = [ 1  1 ; λ_1  λ_2 ]  and  e^{tJ} = [ e^{λ_1 t}  0 ; 0  e^{λ_2 t} ] ,

and hence the solution x(t) of equation (20.15) must be of the form

x(t) = γ e^{λ_1 t} + δ e^{λ_2 t} + [1  0] ∫_0^t e^{(t−s)A} f(s) ds

for some choice of the constants γ and δ.

Case 2 (λ_1 = λ_2):

e^{tA} = V e^{tJ} V^{−1} ,  with  V = [ 1  0 ; λ_1  1 ]  and  e^{tJ} = [ e^{λ_1 t}  t e^{λ_1 t} ; 0  e^{λ_1 t} ] ,

and hence the solution x(t) of the equation must be of the form

x(t) = γ e^{λ_1 t} + δ t e^{λ_1 t} + [1  0] ∫_0^t e^{(t−s)A} f(s) ds

for some choice of the constants γ and δ.

In both cases, the constants γ and δ are determined by the initial conditions x(0) and x'(0). The particular solution

y(t) = [1  0] ∫_0^t e^{(t−s)A} f(s) ds

does not influence the choice of these constants, since y(0) = 0 and

y'(t) = [1  0] { f(t) + A ∫_0^t e^{(t−s)A} f(s) ds }  ⟹  y'(0) = 0 .

Exercise 20.17. Show that if A, f, and c are as in (20.16) and x(t) = [ x_1(t)  x_2(t) ]^T is a solution of the equation x'(t) = Ax(t) + f(t) for t ≥ 0 and x(0) = c, then x_1(t) is a solution of (20.15) with x_1(0) = c_1 and x_1'(0) = c_2. [REMARK: The preceding discussion justified the passage from scalar equations to vector equations; the exercise runs the other way.]
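The reduction to a first-order system is easy to test numerically. The sketch below (an illustration, not from the text) treats x'' + x = 0 with x(0) = 1, x'(0) = 0, whose solution is x(t) = cos t, via the companion matrix of (20.16).

```python
# Second-order equation solved through the companion-matrix system (20.16).
import numpy as np
from scipy.linalg import expm

a0, a1, a2 = 1.0, 0.0, 1.0                          # x'' + x = 0
A = np.array([[0.0, 1.0], [-a0 / a2, -a1 / a2]])    # companion matrix
c = np.array([1.0, 0.0])                            # [x(0), x'(0)]

for t in (0.5, 1.0, 2.0):
    x_t = (expm(t * A) @ c)[0]          # first entry of e^{tA} c
    print(np.isclose(x_t, np.cos(t)))   # True
```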

20.8. Higher-order differential equations

The strategy for solving the p'th-order differential equation

(20.17)  a_p x^{(p)}(t) + a_{p−1} x^{(p−1)}(t) + ··· + a_0 x(t) = f(t)   for t ≥ 0

with constant coefficients that is subject to the constraint a_p ≠ 0 and to the initial conditions

(20.18)  x(0) = c_1 , ..., x^{(p−1)}(0) = c_p

is the same as for the case p = 2 considered in Section 20.7. Again a recipe for the solution is obtained by identifying x(t) as the first entry of the solution x(t) = [ x_1(t) ··· x_p(t) ]^T of the vector equation

x'(t) = Ax(t) + f(t)   for t ≥ 0 with x(0) = c

based on c = [ c_1 ··· c_p ]^T,

(20.19)  A = [ 0  1  0  ···  0 ; 0  0  1  ···  0 ; ⋮        ⋱  ⋮ ; 0  0  0  ···  1 ; −a_0/a_p  −a_1/a_p  −a_2/a_p  ···  −a_{p−1}/a_p ]   and   f(t) = (1/a_p) [ 0  0  ···  0  f(t) ]^T ,

and then invoking formulas (20.10)–(20.13).

Let e_j denote the j'th column of I_p for j = 1, ..., p. Then, since

x(t) = u(t) + y(t) ,  with  u(t) = e^{tA} c  and  y(t) = ∫_0^t e^{(t−s)A} f(s) ds ,

it follows that x(t) = e_1^T x(t) = u(t) + y(t) with

u(t) = e_1^T u(t)   and   y(t) = e_1^T y(t)

for t ≥ 0 and hence, by straightforward computations, that:

(1) u(t) is a solution of the homogeneous equation

(20.20)  a_0 u(t) + a_1 u^{(1)}(t) + ··· + a_p u^{(p)}(t) = 0 :

a_0 u(t) + a_1 u^{(1)}(t) + ··· + a_p u^{(p)}(t) = e_1^T (a_0 I_p + a_1 A + ··· + a_p A^p) u(t) = 0 ,

since a_0 I_p + a_1 A + ··· + a_p A^p = O by the Cayley-Hamilton theorem.

(2) u(t) meets the initial conditions u^{(j−1)}(0) = c_j for j = 1, ..., p:

u^{(j−1)}(t) = e_1^T A^{j−1} e^{tA} c   and hence   u^{(j−1)}(0) = e_1^T A^{j−1} c = c_j

for j = 1, ..., p. (The last evaluation exploits the fact that e_1^T A = e_2^T, e_1^T A² = e_2^T A = e_3^T, ..., e_1^T A^{p−1} = e_p^T.)

(3) y(t) is a solution of (20.17) with y^{(j)}(0) = 0 for j = 0, ..., p − 1:

(20.21)  y(t) = e_1^T y(t) = ∫_0^t h(t − s) f(s) ds   with   h(s) = e_1^T e^{sA} e_p / a_p ,

and, by repeated use of formula (20.14) (with g(t, s) = ∂^k h(t − s)/∂t^k for k = 0, ..., p), we obtain the evaluations

(20.22)  y^{(j)}(t) = h^{(j−1)}(0) f(t) + ∫_0^t h^{(j)}(t − s) f(s) ds = (1/a_p) e_1^T { A^{j−1} e_p f(t) + ∫_0^t A^j e^{(t−s)A} e_p f(s) ds } ,

for j = 1, ..., p. Thus, as e_1^T A^{j−1} = e_j^T for j = 1, ..., p and Σ_{j=0}^p a_j A^j = O,

a_0 y(t) + a_1 y^{(1)}(t) + ··· + a_p y^{(p)}(t) = f(t)   and   y^{(j)}(0) = 0   for j = 0, ..., p − 1 .
The algorithm for computing the solution u(t) = e_1^T V e^{tJ} V^{−1} c of the homogeneous equation (20.20) is:

(1) Find the roots of the polynomial a_p λ^p + a_{p−1} λ^{p−1} + ··· + a_0.

(2) If a_p λ^p + a_{p−1} λ^{p−1} + ··· + a_0 = a_p (λ − λ_1)^{α_1} ··· (λ − λ_k)^{α_k} with k distinct roots λ_1, ..., λ_k, then the solution u(t) of the homogeneous equation is a linear combination of the top rows of e^{tC_{λ_j}^{(α_j)}}, j = 1, ..., k, which can be reexpressed in the form

u(t) = e^{tλ_1} p_1(t) + ··· + e^{tλ_k} p_k(t) ,

where p_j(t) is a polynomial of degree α_j − 1 for j = 1, ..., k. In other words, the set of these entries is a basis for the vector space of solutions to the homogeneous differential equation (20.20).

(3) Find the coefficients of the polynomials p_j(t) by imposing the initial conditions.

The same three steps serve to solve for h(t) = a_p^{−1} e_1^T e^{tA} e_p, except that now the initial conditions that are imposed in the third step are h^{(j)}(0) = 0 for j = 0, ..., p − 2 and h^{(p−1)}(0) = 1/a_p; y(t) is then obtained from (20.21).

Example 20.1. If J = diag{C_{λ_1}^{(3)}, C_{λ_2}^{(2)}}, then

e_1^T V e^{tJ} = [1  0  0  1  0] e^{tJ} = [ e^{λ_1 t}  t e^{λ_1 t}  (t²/2) e^{λ_1 t}  e^{λ_2 t}  t e^{λ_2 t} ] .

The key observation is that e_1^T V selects the top rows of each block in e^{tJ}, i.e.,

[1  0  0  1  0] [ a  b  c  0  0 ; *  *  *  0  0 ; *  *  *  0  0 ; 0  0  0  d  e ; 0  0  0  *  * ] = [ a  b  c  d  e ] .

Example 20.2. The recipe for solving the third-order differential equation

a_3 x^{(3)}(t) + a_2 x^{(2)}(t) + a_1 x^{(1)}(t) + a_0 x(t) = f(t) ,  t ≥ 0 and a_3 ≠ 0 ,

is:

(1) Solve for the roots λ_1, λ_2, λ_3 of the polynomial a_3 λ³ + a_2 λ² + a_1 λ + a_0.

(2) The solution u(t) of the homogeneous equation is

u(t) = α e^{λ_1 t} + β e^{λ_2 t} + γ e^{λ_3 t}   if λ_1, λ_2, λ_3 are all different ,
u(t) = α e^{λ_1 t} + β t e^{λ_1 t} + γ e^{λ_3 t}   if λ_1 = λ_2 ≠ λ_3 ,
u(t) = α e^{λ_1 t} + β t e^{λ_1 t} + γ t² e^{λ_1 t}   if λ_1 = λ_2 = λ_3 .

(3) Determine the constants α, β, γ from the initial conditions x(0), x'(0), and x''(0).

The function h(t) in (20.21) is of the same form as u(t), but subject to the initial conditions h(0) = h^{(1)}(0) = 0 and h^{(2)}(0) = 1/a_3.

Exercise 20.18. Find the solution of the third-order differential equation

x^{(3)}(t) − 3x^{(2)}(t) + 3x^{(1)}(t) − x(t) = e^t ,  t ≥ 0 ,

subject to the initial conditions x(0) = 1, x^{(1)}(0) = 2, x^{(2)}(0) = 8. [ANSWER: x(t) = u(t) + y(t), where u(t) = (1 + t + (5/2)t²) e^t and y(t) = t³ e^t/6.]

Exercise 20.19. Let u'(t) = [ 0  α ; α  0 ] u(t) for t ≥ 0. Show in two different ways that ‖u(t)‖² = ‖u(0)‖² if α + ᾱ = 0: first by showing that the derivative of ‖u(t)‖² with respect to t is equal to zero and then by invoking Exercise 20.11.

Exercise 20.20. In the setting of Exercise 20.19, describe ‖u(t)‖² as t ↑ ∞ if α + ᾱ ≠ 0.

Exercise 20.21. Evaluate lim_{t↑∞} t^{−2} e^{−2t} x(t) for the solution x(t) of the equation

x'(t) = [ 0  1  0 ; 0  0  1 ; 8  −12  6 ] x(t) ,  t ≥ 0 ,  when  x(0) = [ 8 ; 8 ; 8 ] .
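The answer quoted in Exercise 20.18 can be verified symbolically; the sketch below (an illustration using sympy, not part of the text) checks both the differential equation and the initial conditions.

```python
# Symbolic check of the answer to Exercise 20.18.
import sympy as sp

t = sp.symbols('t')
x = (1 + t + sp.Rational(5, 2) * t**2) * sp.exp(t) + t**3 * sp.exp(t) / 6
lhs = sp.diff(x, t, 3) - 3 * sp.diff(x, t, 2) + 3 * sp.diff(x, t) - x

print(sp.simplify(lhs - sp.exp(t)))   # 0, so the equation is satisfied
print([x.subs(t, 0),
       sp.diff(x, t).subs(t, 0),
       sp.diff(x, t, 2).subs(t, 0)])  # [1, 2, 8], the prescribed initial data
```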

20.9. Wronskians

Let u_1(t), ..., u_p(t) be solutions of the homogeneous differential equation

x^{(p)}(t) + a_{p−1} x^{(p−1)}(t) + ··· + a_1 x^{(1)}(t) + a_0 x(t) = 0 ,  c ≤ t ≤ d ,

and let u_j(t)^T = [ u_j(t)  u_j^{(1)}(t)  ···  u_j^{(p−1)}(t) ]. Then the function

(20.23)  φ(t) = det [ u_1(t) ··· u_p(t) ] = det [ u_1(t)  ···  u_p(t) ; u_1^{(1)}(t)  ···  u_p^{(1)}(t) ; ⋮ ; u_1^{(p−1)}(t)  ···  u_p^{(p−1)}(t) ]

is called the Wronskian of the functions u_1(t), ..., u_p(t).

Exercise 20.22. Show that

(20.24)  φ(t) = exp{ −(t − c) a_{p−1} } φ(c) .

[HINT: Exploit formula (17.1) and item 4° of Theorem 7.2.]

Exercise 20.23. Show that the vectors u_j(t), j = 1, ..., p, in formula (20.23) are linearly independent at one point in the interval c ≤ t ≤ d if and only if they are linearly independent at every point in the interval.
20.10. Supplementary notes

This chapter is partially adapted from Chapter 13 of [30].
Chapter 21

Vector-valued functions

In this chapter we shall discuss vector-valued functions of one and many variables and some of their applications. We begin with some notation for classes of functions with different degrees of smoothness. Let Q be an open subset of R^n. A function f that maps Q into R is said to belong to the class

• C(Q) if f is continuous on Q,

• C^k(Q) for some positive integer k if f and all its partial derivatives of order up to and including k are continuous on Q.

A vector-valued function f from Q into R^m is said to belong to one of the two classes listed above if all its components belong to that class. Moreover, on occasion, f is said to be smooth if it belongs to C^k(Q) for k large enough for the application at hand.
21.1. Mean value theorems

We begin with the classical mean value theorem for real-valued functions f(x) of one variable x:

Theorem 21.1. If Q is an open subset of R, f ∈ C^1(Q), and Q contains the finite closed interval [a, b], then

(21.1)  f(b) − f(a) = f'(c)(b − a)

for some point c in the open interval (a, b) = {x ∈ R : a < x < b}.

Proof. See, e.g., [5].  □

The mean value theorem is a powerful tool for verifying inequalities.
Exercise 21.1. Show that if 0 < a < b, then √(ab) − a ≤ (b − a)/2.

Exercise 21.2. Show that if a > 0, b > 0, and 0 < t < 1, then (a + b)^t ≤ a^t + b^t. [HINT: If a < b, then (a + b)^t − b^t = t (a/(c + b))^{1−t} a^t for some point c ∈ (0, a).]

Exercise 21.3. Show that if b > 1 and 0 < t < 1, then |b − b^t| ≤ (1 − t) b ln b.

Exercise 21.4. Show that if a > 0, b > 0, and 1 < t < ∞, then (a + b)^t ≥ a^t + b^t. [HINT: See the hint to Exercise 21.2.]

Exercise 21.5. If Q is an open subset of R, f ∈ C(Q), and Q contains the finite closed interval [a, b], then

(21.2)  ∫_a^b f(x) dx = (b − a) f(c)   for some point c ∈ (a, b) .

[HINT: Apply the mean value theorem to the function h(y) = ∫_a^y f(x) dx.]
21.2. Taylor's formula with remainder

Theorem 21.2. If Q is an open subset of R, f ∈ C^n(Q), and Q contains the finite closed interval [a, b], then

(21.3)  f(b) = f(a) + Σ_{k=1}^{n−1} f^{(k)}(a) (b − a)^k/k! + f^{(n)}(c) (b − a)^n/n!

for some point c ∈ (a, b).

Proof. See, e.g., [5].  □

Formula (21.3) may be used to approximate f(b) by a sum that is expressed in terms of f(a) and its derivatives f^{(1)}(a), f^{(2)}(a), ..., when b is close to a: If b = a + h, then (21.3) can be rewritten as

(21.4)  f(a + h) − { f(a) + Σ_{k=1}^{n−1} f^{(k)}(a) h^k/k! } = f^{(n)}(c) h^n/n!

and the right-hand side may be used to estimate the difference between the true value of f(a + h) and the approximant

f(a) + Σ_{k=1}^{n−1} f^{(k)}(a) h^k/k! .

Thus, for example, in order to calculate (27.1)^{5/3} to an accuracy of 1/100, let f(x) = x^{5/3}, a = 27, and b = 27.1.
Then, as

f'(x) = (5/3) x^{2/3}   and   f''(x) = (10/9) x^{−1/3} ,

the formula f(b) = f(a) + f'(a)(b − a) + f''(c)(b − a)²/2! translates to

(27.1)^{5/3} − { (27)^{5/3} + (5/3)(27)^{2/3} (1/10) } = (10/9) c^{−1/3} (1/200) ;

i.e.,

| (27.1)^{5/3} − (3^5 + 3/2) | = c^{−1/3}/180 ,

for some number c that lies between 27 and 27.1. In particular, this constraint implies that c > 27 and hence that c^{−1/3} < 1/3. Consequently

| (27.1)^{5/3} − (3^5 + 3/2) | < (1/3)/180 = 1/540 .

Thus, the error in approximating (27.1)^{5/3} by 3^5 + 3/2 is less than 1/540.

Exercise 21.6. Show that |(27.1)^{5/3} − (27)^{5/3}| > 3/2.

Exercise 21.7. Show that

(21.5)  e^t − ( 1 + t + ··· + t^k/k! ) = ∫_0^t ((t − s)^k/k!) e^s ds   for t ≥ 0 .

[HINT: Integrate the right-hand side once by parts to reveal the key.]

21.3. Mean value theorem for functions of several variables

Let f(x) = f(x_1, ..., x_n) be a real-valued function of the vector x with components x_1, ..., x_n and suppose that f ∈ C^1(Q) in some open subset Q of R^n. Then the vector

(21.6)  (∇f)(x) = [ ∂f/∂x_1 (x)  ···  ∂f/∂x_n (x) ]

is called the gradient of f; it may also be written as a column vector or even as a matrix if f is a function of n² entries x_ij for i, j = 1, ..., n. If f ∈ C²(Q), then the matrix-valued function

(21.7)  H_f(x) = [ ∂²f/∂x_i ∂x_j (x) ]  (i, j = 1, ..., n)

is called the Hessian of f.

Theorem 21.3. If Q is an open convex subset of R^q, f is a real-valued function in C^1(Q), and a, b ∈ Q, then

(21.8)  f(b) − f(a) = (∇f)(c)(b − a)

for some point c = a + t_0(b − a), 0 < t_0 < 1, in the open line segment between a and b. If f ∈ C²(Q), then

(21.9)  f(b) = f(a) + (∇f)(a)(b − a) + (1/2) ⟨H_f(c)(b − a), (b − a)⟩

for some point c = a + t_0(b − a), 0 < t_0 < 1, in the open line segment between a and b.

Proof. Let

h(t) = f(a + t(b − a)) = f(x_1(t), ..., x_q(t)) ,  where  x_j(t) = a_j + t(b_j − a_j)

for 0 ≤ t ≤ 1. Then clearly h ∈ C(I) for some open interval I that contains [0, 1], and

h'(t) = Σ_{j=1}^q (∂f/∂x_j)(a + t(b − a)) (b_j − a_j) = (∇f)(a + t(b − a))(b − a)

exists for each point t in the open interval (0, 1). Therefore, by Theorem 21.1, there exists a point t_0 ∈ (0, 1) such that h(1) − h(0) = h'(t_0). But, in view of the preceding calculation, this is easily seen to be the same as formula (21.8) with c = a + t_0(b − a). If f ∈ C²(Q), then Taylor's formula with remainder applied to h(t) implies that

h(1) = h(0) + h'(0)·1 + h''(t_0)·(1²/2!)

for some point t_0 ∈ (0, 1). But this is the same as saying that

f(b) = f(a) + Σ_{j=1}^q (∂f/∂x_j)(a)(b_j − a_j) + (1/2) Σ_{i,j=1}^q (b_i − a_i)(∂²f/∂x_i ∂x_j)(c)(b_j − a_j) ,

which coincides with (21.9).  □

Theorem 21.4. If Q is an open convex subset of R^q, f is a real-valued function in C^1(Q), and a, a + u ∈ Q, then

(21.10)  lim_{t↓0} (f(a + tu) − f(a))/t = (∇f)(a)u .

In particular, if u = e_j, the j'th column of I_q,

(21.11)  ∂f/∂x_j (a) = lim_{t↓0} (f(a + te_j) − f(a))/t = (∇f)(a)e_j .

Proof. In view of formula (21.8),

(f(a + tu) − f(a))/t = (∇f)(c)u

where c = a + t_0 tu and 0 < t_0 < 1. Therefore, ‖c − a‖ ≤ t‖u‖, which tends to 0 as t ↓ 0. Thus, as (∇f)(x) is continuous, (∇f)(c) tends to (∇f)(a) as t ↓ 0. This completes the justification of (21.10); (21.11) is a special case of (21.10).  □

Exercise 21.8. Show that if A ∈ R^{n×n}, x ∈ R^n, and f(x) = ⟨Ax, x⟩, then

(21.12)  (∇f)(x)^T = (A + A^T)x   and   (H_f)(x) = A + A^T .

Exercise 21.9. Show that if x ∈ R^n, b ∈ R^n, and f(x) = ⟨x, b⟩, then (∇f)(x)^T = b and H_f(x) = O.

Exercise 21.10. Show that if g ∈ C^1(R^q), A ∈ R^{q×q}, a ∈ R^q, and f(x) = g(a + Ax), then (∇f)(x) = (∇g)(a + Ax)A.
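Formula (21.12) can be checked by finite differences; the sketch below (illustrative random data, not from the text) compares a central-difference gradient of f(x) = ⟨Ax, x⟩ with (A + A^T)x.

```python
# Finite-difference check of (21.12).
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))
x = rng.standard_normal(3)
f = lambda z: z @ (A @ z)                     # f(z) = <Az, z>

eps = 1e-6
grad_fd = np.array([(f(x + eps * e) - f(x - eps * e)) / (2 * eps)
                    for e in np.eye(3)])
print(np.allclose(grad_fd, (A + A.T) @ x, atol=1e-4))   # True
```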

21.4. Mean value theorems for vector-valued functions of several variables

We turn now to vector-valued functions of several variables. We assume that each of the components f_i(x) = f_i(x_1, ..., x_q), i = 1, ..., p, of f(x) is real valued. Thus, f(x) defines a mapping from some subset of R^q into R^p. If Q is an open subset of R^q and f ∈ C^1(Q), then the Jacobian matrix

J_f(x) = [ ∂f_i/∂x_j (x) ]  (i = 1, ..., p; j = 1, ..., q)   of   f(x) = [ f_1(x_1, ..., x_q) ; ⋮ ; f_p(x_1, ..., x_q) ]

is a continuous p × q matrix-valued function on Q.

Theorem 21.5. If Q is an open convex subset of R^q, a, b ∈ Q, and each of the components f_i(x), i = 1, ..., p, of f(x) is a real-valued function in the class C^1(Q), then

(21.13)  f(b) − f(a) = [ (∇f_1)(c_1) ; ⋮ ; (∇f_p)(c_p) ] (b − a)

for some set of points c_i = a + t_i(b − a), 0 < t_i < 1, i = 1, ..., p.

Proof. This is an immediate consequence of the mean value theorem for functions of many variables (Theorem 21.3), applied to each component f_i(x) of f(x) separately.  □

The equality (21.13) is expressed in terms of p unknown points c_1, ..., c_p. The next theorem is formulated in terms of only one unknown point, but there is a price to pay: The equality is replaced by an inequality.

Theorem 21.6. If Q is an open convex subset of R^q, a, b ∈ Q, and each of the components f_i(x), i = 1, ..., p, of f(x) is a real-valued function in the class C^1(Q), then

(21.14)  ‖f(b) − f(a)‖ ≤ ‖J_f(c)‖ ‖b − a‖

for some point c in the open line segment between a and b.

Proof. Let u = f(b) − f(a) and let h(t) = u^T f(a + t(b − a)). Then, by the classical mean value theorem, h(1) − h(0) = h'(t_0)(1 − 0) for some point t_0 ∈ (0, 1). But now as

h(1) − h(0) = u^T f(b) − u^T f(a) = ‖u‖²

and

h'(t) = (d/dt) Σ_{j=1}^p u_j f_j(a + t(b − a)) = Σ_{j=1}^p u_j Σ_{k=1}^q (∂f_j/∂x_k)(a + t(b − a))(b_k − a_k) = ⟨J_f(a + t(b − a))(b − a), u⟩ ,

the mean value theorem yields the formula

‖u‖² = ⟨J_f(c)(b − a), u⟩

for some point c = a + t_0(b − a) in the open line segment between a and b. Thus, by the Cauchy-Schwarz inequality,

‖u‖² ≤ ‖J_f(c)(b − a)‖ ‖u‖ ,

which clearly implies the validity of (21.14) when u ≠ 0. Since (21.14) is also valid when u = 0, the proof is complete.  □

Exercise 21.11. Show that if h(x) = f(g(x)), where g(x) is a smooth map of R^r into R^q and f is a smooth map of R^q into R^p, then J_h(x) = J_f(g(x)) J_g(x).
21.5. Convex minimization problems

Convex functions have important extremal properties.

Theorem 21.7. If Q is an open convex subset of R^n and f ∈ C^1(Q) is a real-valued convex function on Q, then the following implications are in force:

(1) If a ∈ Q and (∇f)(a) = 0, then f(a) ≤ f(b) for every point b ∈ Q, i.e., f achieves its minimal value at the point a.

(2) If a ∈ Q, (∇f)(a) = 0, and f is strictly convex, then f(a) < f(b) for every point b ∈ Q \ {a}, i.e., f(c) = f(a) ⟺ c = a.

Proof. Suppose that a, b ∈ Q. Then the inequality (9.1) for convex functions implies that f(a + t(b − a)) ≤ (1 − t)f(a) + tf(b) and hence that

(f(a + t(b − a)) − f(a))/t ≤ f(b) − f(a)   if 0 < t < 1 .

But this leads easily to the inequality in (1), since the term on the left tends to (∇f)(a)(b − a) = 0 as t ↓ 0.

Suppose next that f is strictly convex and there exists a point a ∈ Q such that f(a) ≤ f(b) for every point b ∈ Q. Then, if b ≠ a and 0 < t < 1,

f(a) ≤ f(ta + (1 − t)b) < t f(a) + (1 − t) f(b) ≤ t f(b) + (1 − t) f(b) = f(b) ,

i.e., f(a) < f(b) when a ≠ b, as claimed.  □

Exercise 21.12. Show that if Q is an open nonempty convex subset of R^n, then a real-valued function f ∈ C²(Q) is convex if and only if the Hessian H_f(x) ⪰ O on Q; and that f is strictly convex if H_f(x) ≻ O on Q.

Example 21.1. If A ∈ R^{n×n}, A ≻ O with block decomposition

A = [ A_11  A_12 ; A_21  A_22 ]   and   f(x) = ⟨ A [ a ; x ], [ a ; x ] ⟩ ,

with A_11 ∈ R^{p×p}, a ∈ R^p, A_22 ∈ R^{q×q}, and x ∈ R^q, then

(f(x + εu) − f(x))/ε = 2 ⟨ A [ a ; x ], [ 0 ; u ] ⟩ + ε ⟨ A [ 0 ; u ], [ 0 ; u ] ⟩ ,

which tends to

(∇f)(x)u = 2 ⟨ A [ a ; x ], [ 0 ; u ] ⟩

as ε ↓ 0. Thus,

(∇f)(x)^T = 2A_21 a + 2A_22 x   for every x ∈ R^q ,

and, as A_22 is invertible,

(∇f)(x)^T = 0 ⟺ x = x_0 ,  where  x_0 = −A_22^{−1} A_21 a .

Moreover,

f(x_0) = ⟨ (A_11 − A_12 A_22^{−1} A_21) a, a ⟩ ,

and, since the Hessian H_f(x) = 2A_22 is positive definite, f(x) ≥ f(x_0) for every x ∈ R^q with strict inequality if x ≠ x_0, thanks to Exercise 21.12. This gives a geometric interpretation to the Schur complement A_11 − A_12 A_22^{−1} A_21 for positive definite matrices A and is yet another way of treating the problem considered in Theorem 18.1 when A ∈ R^{n×n} and a ∈ R^p.

Exercise 21.13. Show that if the positions and sizes of the vectors a and x in Example 21.1 are interchanged, then the minimum value of the new f(x) will be ⟨ (A_22 − A_21 A_11^{−1} A_12) a, a ⟩.

Example 21.2. Consider Example 21.1 but with A = A^T only assumed to be positive semidefinite and rank A_22 = r with 1 ≤ r < q. Then A_22 = V_1 S_1 V_1^T, where V_1 ∈ R^{q×r}, V = [ V_1  V_2 ] is a real q × q unitary matrix, and S_1 = diag{s_1, ..., s_r} with s_1 ≥ ··· ≥ s_r > 0. Then, upon setting x = V_1 v + V_2 w,

(∇f)(x)^T = 2A_21 a + 2A_22 x = 2A_21 a + 2V_1 S_1 V_1^T (V_1 v + V_2 w) = 2A_21 a + 2V_1 S_1 v ,

and hence

(∇f)(x)^T = 0 ⟺ v = −S_1^{−1} V_1^T A_21 a ,

i.e., if and only if x = x_0, where

x_0 = −V_1 S_1^{−1} V_1^T A_21 a + V_2 w = −A_22^† A_21 a + V_2 w .

Lemma 18.5 ensures that N_{A_22} ⊆ N_{A_12} and A_22 A_22^† A_21 = A_21. Thus, as V_2 w ∈ N_{A_22}, A_12 V_2 w = 0 and f(x_0) = a^T (A_11 − A_12 A_22^† A_21) a.

Exercise 21.14. Show that in the setting of Example 21.1, f(x)^{1/2} is convex. [HINT: Identify ⟨Au, u⟩^{1/2} as a norm on R^n and note that

f(tx + (1 − t)y) = ⟨ A [ ta + (1 − t)a ; tx + (1 − t)y ], [ ta + (1 − t)a ; tx + (1 − t)y ] ⟩ .]

Exercise 21.15. Show that if f(x) is a convex function on a convex subset Q of R^n and f(x) ≥ 0 when x ∈ Q, then f(x)² is also convex on Q.

21.6. Supplementary notes

This chapter is partially adapted from Chapter 14 of [30].

Chapter 22

Fixed point theorems

A vector x is said to be a fixed point of a vector-valued function f(x) that maps a subset E of a vector space V into itself if x ∈ E and f(x) = x.

22.1. A contractive fixed point theorem

Theorem 22.1. Let f(x) be a continuous map of a closed subset E of a finite-dimensional normed linear space X into itself such that

(22.1)  ‖f(b) − f(a)‖_X ≤ K ‖b − a‖_X

for some constant K, 0 < K < 1, and every pair of points a, b in the set E. Then:

(1) There is exactly one point x_* ∈ E such that f(x_*) = x_*.

(2) If x_0 ∈ E and x_{n+1} = f(x_n) for n = 0, 1, ..., then x_1, x_2, ... ∈ E and x_* = lim_{n↑∞} x_n; i.e., the limit exists and is independent of how the initial point x_0 is chosen.

(3) ‖x_* − x_n‖_X ≤ (K^n/(1 − K)) ‖x_1 − x_0‖_X.

Proof. Choose any point x_0 ∈ E and then define the sequence of points x_1, x_2, ... in E by the rule x_{n+1} = f(x_n).
Then, setting ‖x‖ = ‖x‖_X for short,

‖x_2 − x_1‖ = ‖f(x_1) − f(x_0)‖ ≤ K ‖x_1 − x_0‖ ,
‖x_3 − x_2‖ = ‖f(x_2) − f(x_1)‖ ≤ K ‖x_2 − x_1‖ ≤ K² ‖x_1 − x_0‖ ,
⋮
‖x_{n+1} − x_n‖ ≤ K^n ‖x_1 − x_0‖ ,

and hence

‖x_{n+k} − x_n‖ ≤ ‖x_{n+k} − x_{n+k−1}‖ + ··· + ‖x_{n+1} − x_n‖ ≤ (K^{n+k−1} + ··· + K^n) ‖x_1 − x_0‖ ≤ (K^n/(1 − K)) ‖x_1 − x_0‖ .

Therefore, since K^n tends to 0 as n ↑ ∞, this last bound guarantees that the sequence {x_n} is a Cauchy sequence in the closed subset E of X. Thus, x_n converges to a limit u ∈ E as n ↑ ∞. The next step is to show that this limit, which, as far as we know at this moment, may depend upon x_0, is a fixed point of f:

‖f(u) − u‖ = ‖f(u) − f(x_n) + x_{n+1} − u‖ ≤ ‖f(u) − f(x_n)‖ + ‖x_{n+1} − u‖ ≤ K ‖u − x_n‖ + ‖x_{n+1} − u‖ .

Since the right-hand side tends to zero as n ↑ ∞ and the left-hand side is independent of n, this implies that ‖f(u) − u‖ ≤ ε for every ε > 0 and hence that f(u) = u, i.e., u is a fixed point of f.

Suppose next that v is also a fixed point of f in E. Then

0 ≤ ‖u − v‖ = ‖f(u) − f(v)‖ ≤ K ‖u − v‖ .

Therefore, 0 ≤ (1 − K) ‖u − v‖ ≤ 0. This proves that u = v and hence, upon denoting this one and only fixed point by x_*, that (1) and (2) hold. The upper bound in (3) is obtained from the inequality

‖x_* − x_n‖ ≤ ‖x_* − x_{n+k}‖ + ‖x_{n+k} − x_n‖ ≤ ‖x_* − x_{n+k}‖ + (K^n/(1 − K)) ‖x_1 − x_0‖

by letting k ↑ ∞.  □

22.2. A refined contractive fixed point theorem

The next theorem relaxes the constraint (22.1) that was imposed in the formulation of Theorem 22.1.

Theorem 22.2. Let f(x) be a continuous map of a closed subset E of a finite-dimensional normed linear space X into itself such that the j'th iterate

f^{[j]}(x) = f(f^{[j−1]}(x))   for j = 2, 3, ...

of f = f^{[1]} satisfies the constraint

‖f^{[j]}(x) − f^{[j]}(y)‖_X ≤ K ‖x − y‖_X

for some constant K, 0 < K < 1, and some positive integer j. Then f has exactly one fixed point x_* in E.

Proof. If g(x) = f^{[j]}(x) meets the indicated constraint, then, by Theorem 22.1, g has exactly one fixed point x_* in E. Moreover,

f(x_*) = f(g(x_*)) = g(f(x_*)) .

But this exhibits f(x_*) as a fixed point of g. Thus, as g has only one fixed point in E, we must have f(x_*) = x_*; i.e., x_* is a fixed point of f.  □

Example 22.1. Let A = [ α  β ; 0  α ] with |α| < 1 and |β| > 1. Then, since ‖A‖ ≥ {|α|² + |β|²}^{1/2}, the contractive fixed point theorem is not applicable to the vector-valued function f(x) = Ax. However, since

‖A^n‖ = ‖ [ α^n  nα^{n−1}β ; 0  α^n ] ‖ ≤ |α^n| + n|α^{n−1}| |β| ,

it will be applicable to f^{[n]} if n is large enough.

Exercise 22.1. Show that if A, C ∈ C^{n×n} and N = C_0^{(n)}, the n × n Jordan cell with zeros on the diagonal, then A = C + NCN^H + ··· + N^{n−1}C(N^H)^{n−1} is the one and only solution of the equation A − NAN^H = C. [HINT: Invoke Theorem 22.2 with f(A) = C + NAN^H, and keep in mind that ‖N^k‖ = 1 if k = 1, ..., n − 1.]
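The iteration in Theorem 22.1 and the bound in (3) can be illustrated with a simple scalar example (not from the text): f(x) = cos x is a contraction on E = [0, 1] with K = sin 1 < 1.

```python
# Contractive fixed point iteration and the error bound in item (3).
import numpy as np

f = np.cos
K = np.sin(1.0)                  # a Lipschitz constant for cos on [0, 1]
x = 0.2                          # any starting point x0 in E = [0, 1]
x1_minus_x0 = abs(f(x) - x)
for n in range(25):
    x = f(x)                     # after the loop, x is x_25

x_star = 0.7390851332151607      # the unique fixed point of cos x = x
bound = K**25 / (1 - K) * x1_minus_x0
print(abs(x - x_star) <= bound)  # True: bound (3) holds with n = 25
```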

22.3. Other fixed point theorems

To add perspective, we list a number of fixed point theorems with less restrictive assumptions but (as is only fair) weaker conclusions. We begin with the Brouwer fixed point theorem:

Theorem 22.3. If f(x) is a continuous mapping of the closed unit ball B = {x ∈ R^n : ‖x‖ ≤ 1} into itself, then there is at least one point x ∈ B such that f(x) = x.

Proof. See, e.g., Chapter 14 of [30].  □

The Brouwer fixed point theorem can be strengthened to:

Theorem 22.4. If f(x) is a continuous mapping of a closed bounded convex subset of R^n into itself, then there is at least one point x in this set such that f(x) = x.

Proof. See, e.g., Chapter 22 of [30].  □

Exercise 22.2. Invoke Theorem 22.4 to show that if 0 < a < b,

E = {x ∈ R² : a ≤ x_1, x_2 ≤ b} ,   and   f( [ x_1 ; x_2 ] ) = [ √(x_1 x_2) ; (x_1 + x_2)/2 ] ,

then f has a fixed point in E. [REMARK: Every vector x ∈ E with equal components is a fixed point.]

There are also more general fixed point theorems that are applicable in infinite-dimensional spaces: The Leray-Schauder theorem: Every compact convex subset in a Banach space has the fixed point property; see, e.g., [67], for a start.
22.4. Applications of fixed point theorems

Example 22.2. Let A ∈ C^{p×p}, B ∈ C^{q×q}, and C ∈ C^{p×q}. Then the equation

(22.2)  X − AXB = C

has a unique solution X ∈ C^{p×q} if ‖A‖ ‖B‖ < 1. (Much stronger results will be obtained in Chapter 37.)

Discussion. Let f(X) = C + AXB. Then clearly f maps C^{p×q} into C^{p×q} and X is a solution of (22.2) if and only if f(X) = X. Thus, as

‖f(X) − f(Y)‖ = ‖A(X − Y)B‖ ≤ ‖A‖ ‖X − Y‖ ‖B‖ ,

Theorem 22.1 (with E = C^{p×q}) guarantees that (22.2) has exactly one solution X_* in C^{p×q} if ‖A‖ ‖B‖ < 1. Moreover, it is readily checked that

X_* ∈ { X ∈ C^{p×q} : ‖X‖ ≤ ‖C‖/(1 − ‖A‖ ‖B‖) } ,

since f maps the set in the last display into itself.  □

Example 22.3. Let Q = (α, β) ⊂ R and suppose that f ∈ C²(Q) maps Q into R and that there exists a point b ∈ Q such that f(b) = 0 and f'(b) ≠ 0, and let E_δ = {x ∈ R : |x − b| ≤ δ}. Suppose further that although δ > 0, it is small enough so that E_δ ⊂ Q and f'(x) ≠ 0 for x ∈ E_δ. Then, by the mean value theorem,

f(b) = f(x) + (b − x) f'(c)   with c = tx + (1 − t)b for some point t ∈ (0, 1) .

Thus, as f(b) = 0 and f'(c) ≠ 0,

b = x − f(x)/f'(c) ≈ x − f(x)/f'(x) .

The function N(x) = x − f(x)/f'(x) is called the Newton step. Since N(x) = x if and only if f(x) = 0, the problem of locating the zeros of f(x) in E_δ is equivalent to finding the fixed points of N(x) in E_δ. The Newton method sets up a sequence of points x_{k+1} = N(x_k), k = 0, 1, ..., in E_δ that converge to b if the initial point x_0 was chosen close enough to b. This will now be justified by the contractive fixed point theorem; the speed of convergence and Newton's method for vector-valued functions of many variables will be discussed in Chapter 25.

Discussion. Let

α_δ = max{ |f''(x)| : x ∈ E_δ } ,   β_δ = max{ 1/|f'(x)| : x ∈ E_δ } ,   and   κ_δ = max{ |f'(x)| : x ∈ E_δ } ,

and observe that if 0 < δ_1 < δ_2, then α_{δ_1} ≤ α_{δ_2}, β_{δ_1} ≤ β_{δ_2}, and κ_{δ_1} ≤ κ_{δ_2}. Then, since E_δ is a convex set and

N'(x) = 1 − (f'(x)f'(x) − f(x)f''(x))/f'(x)² = f(x)f''(x)/f'(x)² ,

the mean value theorem implies that for every pair of points x, y ∈ E_δ

N(x) − N(y) = (x − y) f(c)f''(c)/f'(c)²   for some point c ∈ E_δ .

Therefore,

|N(x) − N(y)| ≤ |x − y| α_δ β_δ² |f(c)| = |x − y| α_δ β_δ² |f(c) − f(b)| ≤ |x − y| α_δ β_δ² |c − b| κ_δ ≤ |x − y| γ_δ   with γ_δ = δ α_δ β_δ² κ_δ .

If δ > 0 is small enough, then γ_δ < 1. Moreover, for such δ,

|N(x) − b| = |N(x) − N(b)| ≤ |x − b| γ_δ ,

which ensures that N(x) maps points x ∈ E_δ into E_δ. Therefore, the contractive fixed point theorem is applicable to ensure that N(x) has exactly one fixed point in E_δ. Thus, as b ∈ E_δ and N(b) = b, that unique fixed point is b, and hence the sequence of points x_{k+1} = N(x_k) will tend to b as k ↑ ∞ for every choice of x_0 ∈ E_δ.  □

Example 22.4. Let A ∈ C^{p×p} and b ∈ C^p. Then Ax = b if and only if x is a fixed point of the function f(x) = b − Ax + x. Let B = I_p − A and suppose further that ‖B‖ < 1. Then A is invertible and hence x is a fixed point of f if and only if x = A^{−1}b. Thus, as f maps C^p into C^p and

‖f(x) − f(y)‖ ≤ ‖B‖ ‖x − y‖   for every pair of vectors x, y ∈ C^p ,

the contractive fixed point theorem (with E = C^p) guarantees that if x_k = f(x_{k−1}) for k = 1, 2, ..., then

x_k = b + Bb + ··· + B^{k−1}b + B^k x_0

tends to A^{−1}b as k ↑ ∞ for every choice of the initial point x_0. Moreover, as f maps the set

E = { x ∈ C^p : ‖x‖ ≤ ‖b‖/(1 − ‖B‖) }

into itself, A^{−1}b ∈ E.

Exercise 22.3. Show by direct computation that in the setting of Example 22.4, x_k − A^{−1}b = B^k (x_0 − A^{−1}b). [HINT: First verify the identity I_n + B + ··· + B^{k−1} = A^{−1}(I − B^k).]

Example 22.5. Let E = {(x, y) ∈ R² : 0 ≤ x ≤ y ≤ 1} (i.e., E is the triangle with vertices (0, 0), (0, 1), and (1, 1) that lies above the line y = x) and let

f(x, y) = [ f_1(x, y) ; f_2(x, y) ] = [ √(xy) ; (x + y)/2 ] .

Then, f maps E into E, since (x, y) ∈ E ⟹ 0 ≤ √(xy) ≤ (x + y)/2 ≤ 1. Moreover, a point (x, y) ∈ E is a fixed point of f if and only if

[ √(xy) ; (x + y)/2 ] = [ x ; y ]  ⟺  x = y .

Thus, as f has infinitely many fixed points in E, there does not exist a positive constant γ < 1 such that ‖f(u) − f(v)‖ ≤ γ ‖u − v‖ for all points u, v ∈ E. Nevertheless, the limit in the recursive scheme discussed in the proof of the contractive fixed point theorem exists and is a fixed point of f, but now it depends upon the initial point in the recursion: If

(22.3)  [ a_0 ; b_0 ] ∈ E   and   [ a_{k+1} ; b_{k+1} ] = [ √(a_k b_k) ; (a_k + b_k)/2 ]   for k = 0, 1, 2, ... ,

then, since a_k ≤ b_k for all nonnegative integers k, a_{k+1} = √(a_k b_k) ≥ a_k, b_{k+1} = (a_k + b_k)/2 ≤ b_k, and

b_{k+1} − a_{k+1} = (a_k + b_k)/2 − √(a_k b_k) ≤ (a_k + b_k)/2 − a_k = (b_k − a_k)/2 ≤ (b_0 − a_0)/2^{k+1} .

Therefore, a_k tends to a limit α as k ↑ ∞, b_k tends to a limit β as k ↑ ∞, and

|α − β| = |α − a_k + (a_k − b_k) + b_k − β| ≤ |α − a_k| + |a_k − b_k| + |b_k − β| ,

which implies that α = β and hence that it is a fixed point of f. Notice that a_k ≤ α = β ≤ b_k for k = 0, 1, 2, ....

Exercise 22.4. Let E = {x ∈ R : 0 ≤ x ≤ 1} and let f(x) = (1 + x²)/2. Show that:

(a) f maps E into E.

(b) There does not exist a positive constant γ < 1 such that |f(b) − f(a)| ≤ γ|b − a| for every choice of a, b ∈ E.

(c) f has exactly one fixed point x_* ∈ E.

Exercise 22.5. Show that the polynomial p(x) = 1 − 4x + x² − x³ has exactly one root in the interval 0 ≤ x ≤ 1. [HINT: Use the contractive fixed point theorem.]

Exercise 22.6. Show that the function f(x, y) = [ (1 − y)/2 ; (1 + x²)/3 ] has a fixed point inside the set {(x, y) ∈ R² : x² + y² ≤ 1}.
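The following sketch (illustrative data, not from the text) runs the iteration of Example 22.4 and also checks the identity of Exercise 22.3; the matrix B is scaled explicitly so that ‖B‖ < 1.

```python
# Iterative solution of Ax = b as in Example 22.4, with B = I - A, ||B|| < 1.
import numpy as np

rng = np.random.default_rng(3)
M = rng.standard_normal((4, 4))
B = 0.5 * M / np.linalg.norm(M, 2)     # spectral norm of B is 1/2 < 1
A = np.eye(4) - B
b = rng.standard_normal(4)
x_true = np.linalg.solve(A, b)

x0 = rng.standard_normal(4)            # arbitrary starting point
x = x0.copy()
for k in range(50):
    x = b + B @ x                      # x_k = f(x_{k-1}) = b - Ax_{k-1} + x_{k-1}

Bk = np.linalg.matrix_power(B, 50)
print(np.allclose(x, x_true, atol=1e-8))                         # converged
print(np.allclose(x - x_true, Bk @ (x0 - x_true), atol=1e-10))   # Exercise 22.3
```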

22.5. Supplementary notes

This chapter is partially adapted from Chapters 14 and 22 in [30]. A perhaps surprising application of the sequence of vectors considered in (22.3) is to the evaluation of integrals of the form

I(a, b) = ∫_0^{π/2} (a² sin²θ + b² cos²θ)^{−1/2} dθ   for 0 < a < b < 1 .

Gauss discovered that if a = a_0 and b = b_0, then I(a, b) = I(a_k, b_k) for k = 1, 2, ... and hence that if a_k → α, then I(a, b) = I(α, α) = π/(2α); see Duren [26] for a detailed account.
Chapter 23

The implicit function theorem

This chapter is devoted primarily to the implicit function theorem and a few of its applications. We shall, however, begin with the inverse function theorem and then use it to establish the implicit function theorem. (The two are really equivalent, but this direction makes for easier reading.) The notation

B_α(x_0) = {x ∈ R^q : ‖x − x_0‖ < α}

for the ball with center x_0 ∈ R^q and radius α > 0 will be used frequently.

23.1. The inverse function theorem

If A ∈ R^{p×p}, then the function f(x) = Ax maps R^p onto R^p if and only if A = J_f(x) is invertible. The inverse function theorem is an extension of this fact to a wider class of functions f(x). It makes extensive use of the observation that if f ∈ C^1(Q) in some open set Q ⊆ R^p and x_0 ∈ Q, then

f(x) ≈ f(x_0) + J_f(x_0)(x − x_0)   if x ∈ Q is close to x_0 .
Theorem 23.1. Suppose that the p × 1 vector-valued function

f(x) = [ f_1(x_1, ..., x_p) ; ⋮ ; f_p(x_1, ..., x_p) ]

maps an open set Q ⊆ R^p into R^p, that f ∈ C^1(Q), and that the Jacobian matrix

J_f(x) = [ ∂f_i/∂x_j (x) ]  (i, j = 1, ..., p)

is invertible at a point x_0 ∈ Q and y_0 = f(x_0). Then there exists a pair of numbers α > 0 and β > 0 such that B_α(x_0) ⊂ Q and for each vector y ∈ B_β(y_0) there exists exactly one vector x = g(y) in the ball B_α(x_0) such that y = f(x). Moreover, the function g ∈ C^1(B_β(y_0)).

Proof. Let A = J_f(x_0) and w(x) = x − A^{−1}f(x) for x ∈ Q. Then

(23.1)  J_w(x) = I_p − A^{−1}J_f(x) = A^{−1}[J_f(x_0) − J_f(x)] .

Now choose α > 0 such that B_α(x_0) ⊆ Q and ‖J_w(x)‖ ≤ 1/2 for x ∈ B_α(x_0). Then, as A^{−1}J_f(x) = I_p − J_w(x), this ensures that

(23.2)  J_f(x) is invertible for every point x ∈ B_α(x_0) .

The strategy of the proof is to invoke the contractive fixed point theorem to show that if y is close enough to y_0, then the vector-valued function

h_y(x) = x + A^{−1}[y − f(x)] = w(x) + A^{−1}y

has exactly one fixed point x_* ∈ B_α(x_0). This then forces f(x_*) = y. There are some additional fine points, since B_α(x_0) is an open set and the contractive fixed point theorem is formulated for closed subsets of Q. To overcome this difficulty, the initial analysis is carried out in a closed nonempty subset B̄_δ(x_0) of B_α(x_0). The details are presented in steps.

1. If 0 < δ < α and β ≤ δ/(2‖A^{−1}‖), then h_y(x) has exactly one fixed point g(y) in B̄_δ(x_0) for each choice of y ∈ B_β(y_0).

In view of the bound (21.14), the condition ‖J_w(x)‖ ≤ 1/2 ensures that

‖w(x) − w(v)‖ ≤ (1/2) ‖x − v‖   for every pair of points x, v ∈ B_α(x_0) .

Consequently,

‖h_y(x) − x_0‖ = ‖w(x) − w(x_0) + A^{−1}[y − f(x_0)]‖ ≤ (1/2) ‖x − x_0‖ + ‖A^{−1}‖ β ≤ δ

for every point x ∈ B̄_δ(x_0), i.e., h_y maps B̄_δ(x_0) into itself. Moreover,

(23.3)  ‖h_y(x) − h_y(v)‖ = ‖w(x) − w(v)‖ ≤ (1/2) ‖x − v‖

for every pair of points x, v ∈ B̄_δ(x_0). Therefore, the contractive fixed point theorem ensures that h_y(x) has exactly one fixed point x_* ∈ B̄_δ(x_0) and hence that y = f(x_*).

2. h_y(x) has exactly one fixed point g(y) in B_α(x_0) for each choice of y ∈ B_β(y_0).

If v_* and x_* are fixed points of h_y in B_α(x_0), then

‖x_* − v_*‖ = ‖h_y(x_*) − h_y(v_*)‖ = ‖w(x_*) − w(v_*)‖ ≤ (1/2) ‖x_* − v_*‖ ,

which is only viable if x_* = v_*.

3. g ∈ C^1(B_β(y_0)).

Let a, b ∈ B_β(y_0). Then u = g(a) and v = g(b) belong to B_α(x_0) and, in view of (23.3),

(1/2) ‖v − u‖ ≥ ‖w(v) − w(u)‖ = ‖v − u − A^{−1}(f(v) − f(u))‖ = ‖v − u − A^{−1}(b − a)‖ ≥ ‖v − u‖ − ‖A^{−1}(b − a)‖ .

Therefore,

(23.4)  ‖g(b) − g(a)‖ = ‖v − u‖ ≤ 2 ‖A^{−1}‖ ‖b − a‖ ,

which clearly implies that g is continuous on B_β(y_0). Moreover,

b − a = f(v) − f(u) = A(t_1, ..., t_p)(v − u) = A(t_1, ..., t_p)(g(b) − g(a)) ,

where A(t_1, ..., t_p) is the p × p matrix-valued function with i'th row equal to

e_i^T A(t_1, ..., t_p) = ∇f_i(u + t_i(v − u))

for some choice of t_i ∈ (0, 1). Consequently, v = g(b) tends to u = g(a) as b tends to a and hence A(t_1, ..., t_p) tends to J_f(u) as b tends to a. Thus, as J_f(u) is invertible, A(t_1, ..., t_p) will be invertible if v is close to u. Thus, if b = a + εc for some unit vector c ∈ R^p, then

J_f(a)^{−1} c = lim_{ε→0} A(t_1, ..., t_p)^{−1} c = lim_{ε→0} (g(a + εc) − g(a))/ε .

Therefore g ∈ C^1(B_β(y_0)), as claimed.  □
Corollary 23.2 (The open mapping theorem). Let Q be an open nonempty subset of R^p and suppose that f ∈ C^1(Q) maps Q into R^p and that J_f(x) is invertible for every point x ∈ Q. Then f(Q) is an open subset of R^p.

Proof. Let x_0 ∈ Q and y_0 = f(x_0). Then there exists a pair of numbers α > 0 and β > 0 such that B_α(x_0) ⊆ Q and for each vector y ∈ B_β(y_0) there exists exactly one vector x = g(y) in B_α(x_0) such that f(x) = y. Thus,

B_β(y_0) = f(g(B_β(y_0))) ⊆ f(B_α(x_0)) ⊆ f(Q) .  □

Exercise 23.1. Let

f(x) = [ x_1² − x_2 ; x_2 + x_3 ; x_3² − 2x_3 + 1 ] ,   x_0 = [ 1 ; −1 ; −1 ] ,   and   y_0 = f(x_0) .

(a) Calculate J_f(x), A = J_f(x_0), J_f(x)^{−1}, and A^{−1}.

(b) Show that ‖A^{−1}‖_2 < 5/3.

(c) Show that the equation f(x) = y_0 has exactly two real solutions, but only one in the ball B_2(x_0).

Exercise 23.2. Let u_0 ∈ R², f ∈ C²(B_r(u_0)), and suppose that [ (∇f_1)(u) ; (∇f_2)(v) ] is invertible for every pair of vectors u, v ∈ B_r(u_0). Show that if a, b ∈ B_r(u_0), then f(a) = f(b) ⟺ a = b.

Exercise 23.3. Show that the condition in Exercise 23.2 cannot be weakened to: [ (∇f_1)(u) ; (∇f_2)(u) ] is invertible for every vector u ∈ B_r(u_0). [HINT: Consider the function f(x) with components f_1(x) = x_1 cos x_2 and f_2(x) = x_1 sin x_2 in a ball of radius 2π centered at the point (3π, 2π).]

Exercise 23.4. Calculate the Jacobian matrix J_f(x) of the function f(x) with components f_i(x_1, x_2, x_3) = x_i/(1 + x_1 + x_2 + x_3) for i = 1, 2, 3 that are defined at all points x ∈ R³ with x_1 + x_2 + x_3 ≠ −1.

Exercise 23.5. Show that the vector-valued function that is defined in Exercise 23.4 defines a one-to-one map from its domain of definition in R³ and find the inverse mapping.

23.2. The implicit function theorem

To warm up, consider first the problem of describing the set of solutions u ∈ R^n to the equation Au = b, when A ∈ R^{p×n}, n > p, b ∈ R^p, and rank A = p. The rank condition implies that there exists an n × n permutation matrix P such that the last p columns of the matrix AP are linearly independent. Thus, upon writing

AP = [ A_11  A_12 ]   and   [ x ; y ] = P^T u

with A_12 ∈ R^{p×p} invertible, x ∈ R^q, y ∈ R^p, and n = p + q, the original equation can be rewritten as

0 = b − Au = b − AP P^T u = b − [ A_11  A_12 ] [ x ; y ] = b − A_11 x − A_12 y .

Therefore,

(23.5)  Au − b = 0  ⟺  y = A_12^{−1}(b − A_11 x) ;

i.e., the constraint g(u) = 0 for the function g(u) = Au − b implicitly prescribes some of the entries in u in terms of the others. The calculation rests on the fact that rank J_g(u) = rank A = p. The implicit function theorem yields similar conclusions for a more general class of functions g(u).

Theorem 23.3. Let Q be a nonempty open subset of R^n and let

g(u) = [ g_1(u_1, ..., u_n) ; ⋮ ; g_p(u_1, ..., u_n) ]   belong to C^1(Q) ,

and suppose that q = n − p ≥ 1 and that there exists a point u_0 ∈ Q such that

g(u_0) = 0   and   rank J_g(u_0) = p .

Suppose further (to ease the exposition) that the last p columns of J_g(u_0) are linearly independent, and let

x = [ u_1 ; ⋮ ; u_q ] ,  x_0 = a ,   y = [ u_{q+1} ; ⋮ ; u_{q+p} ] ,  y_0 = b ,

A_11(u) = [ ∂g_i/∂x_j (u) ]  (i = 1, ..., p; j = 1, ..., q) ,   and   A_12(u) = [ ∂g_i/∂y_j (u) ]  (i, j = 1, ..., p) .

Then there exists a pair of positive numbers α and β such that for every x in the ball

B_β(x_0) = {x ∈ R^q : ‖x − x_0‖ < β}

there exists exactly one vector y = φ(x) in the ball

B_α(y_0) = {y ∈ R^p : ‖y − y_0‖ < α}

such that g(x, φ(x)) = 0. Moreover, φ ∈ C^1(B_β(x_0)) and

(23.6)  [ ∂φ_i/∂x_j (x) ]  (i = 1, ..., p; j = 1, ..., q) = −A_12(x, φ(x))^{−1} A_11(x, φ(x)) .

Proof. In order to apply the inverse function theorem, we embed g(u) in the function

f(u) = [ u_1 ; ⋮ ; u_q ; g(u) ] .   Then   J_f(u) = [ I_q  O ; A_11(u)  A_12(u) ]

and, as A_12(u_0) is invertible by assumption, J_f(u_0) is also invertible. Therefore, by the inverse function theorem, there exists a pair of numbers α > 0 and β > 0 such that for every vector v in the ball B_β(f(u_0)), there exists exactly one vector u in the ball B_α(u_0) such that f(u) = v. Thus, if

v = [ w ; 0 ]   with w ∈ R^q, 0 ∈ R^p, and ‖w − a‖ < β ,

then

‖v − f(u_0)‖ = ‖ [ w ; 0 ] − [ a ; 0 ] ‖ = ‖w − a‖ < β .

Therefore, the vector u ∈ B_α(u_0) that meets the condition

f(u) = [ w ; 0 ]   must be of the form   u = [ w ; φ(w) ] .

The inequality

‖ [ w ; φ(w) ] − [ a ; b ] ‖ < α

ensures that ‖φ(w) − b‖ < α and hence that for each w ∈ B_β(a) there exists at least one function φ ∈ C^1(B_β(a)) such that g(w, φ(w)) = 0 and φ(w) ∈ B_α(b).

It remains to check that φ(w) is uniquely specified at every point w ∈ B_β(a). To see this, fix x ∈ B_β(a), let h(y) = y − A_12(a, b)^{−1} g(x, y), and observe that

h(y) = y ⟺ g(x, y) = O   and   J_h(y) = I_p − A_12(a, b)^{−1} A_12(x, y) .

Therefore, if α and β are small enough, then ‖J_h(y)‖ ≤ K < 1 for all points x ∈ B_β(a) and y ∈ B_α(b). Thus, if c, d ∈ B_α(b) and g(x, c) = g(x, d) = 0,
then

‖c − d‖ = ‖h(c) − h(d)‖ ≤ ‖J_h(e)‖ ‖c − d‖ ≤ K ‖c − d‖

for a point e = c + t(d − c), 0 < t < 1. But this in turn implies that c = d.  □

Exercise 23.6. Suppose that g(x, y) = x² + y² − 1 and

g(x_0, y_0) = 0   and   ∂g/∂y (x_0, y_0) ≠ 0 .

Find a pair of positive numbers α and β and a function φ ∈ C^1(B_β(x_0)) such that φ(x) ∈ B_α(y_0) and g(x, φ(x)) = 0.

Exercise 23.7. Let g_1(x, y_1, y_2) = x²(y_1² + y_2²) − 5 and g_2(x, y_1, y_2) = (x − y_2)² + y_1² − 2. Show that in a neighborhood of the point x = 1, y_1 = −1, y_2 = 2, the curve of intersection of the two surfaces g_1(x, y_1, y_2) = 0 and g_2(x, y_1, y_2) = 0 can be described by a pair of functions y_1 = φ_1(x) and y_2 = φ_2(x).

Exercise 23.8. Let

S = { x ∈ R⁴ : x_1 − 2x_2 + 2x_4 − x_3² = 0 and x_1 − 2x_2 + 2x_3 − x_4 = 0 } ,   u = [ 1 ; 1 ; 1 ; 1 ] ,   and   v = [ 1 ; 1 ; 0 ; −1 ] .

Use the implicit function theorem to show that it is possible to solve for x_3 and x_4 as functions of x_1 and x_2 for points in S that are close to u and to points in S that are close to v and write down formulas for these functions for each of these two cases.
23.3. Continuous dependence of solutions

The implicit function theorem is often a useful tool to check the continuous dependence of the solution of an equation on the coefficients appearing in the equation. Suppose, for example, that Y ∈ R^{2×2} is a solution of the matrix equation

A^T Y + YA = B

for some fixed choice of the matrices A, B ∈ R^{2×2}. To try to show that if A changes only a little, then Y will also change only a little, consider the matrix-valued function F(A, Y) = A^T Y + YA − B with entries

f_ij(A, Y) = e_i^T (A^T Y + YA − B) e_j ,   i, j = 1, 2 ,

where e_1, e_2 denote the standard basis vectors in R². Then, upon writing

Y = [ y_11  y_12 ; y_21  y_22 ] ,

it is readily checked that

∂f_ij/∂y_st = e_i^T (A^T e_s e_t^T + e_s e_t^T A) e_j = a_si e_t^T e_j + a_tj e_i^T e_s

and hence that

∂f_ij/∂y_11 = a_1i e_1^T e_j + a_1j e_i^T e_1 ,
∂f_ij/∂y_12 = a_1i e_2^T e_j + a_2j e_i^T e_1 ,
∂f_ij/∂y_21 = a_2i e_1^T e_j + a_1j e_i^T e_2 ,
∂f_ij/∂y_22 = a_2i e_2^T e_j + a_2j e_i^T e_2 .

Consequently,

[ ∂f_11/∂y_11  ∂f_11/∂y_12  ∂f_11/∂y_21  ∂f_11/∂y_22 ]     [ 2a_11      a_21          a_21         0    ]
[ ∂f_12/∂y_11  ∂f_12/∂y_12  ∂f_12/∂y_21  ∂f_12/∂y_22 ]  =  [ a_12    a_11 + a_22       0          a_21  ]
[ ∂f_21/∂y_11  ∂f_21/∂y_12  ∂f_21/∂y_21  ∂f_21/∂y_22 ]     [ a_12        0         a_11 + a_22    a_21  ]
[ ∂f_22/∂y_11  ∂f_22/∂y_12  ∂f_22/∂y_21  ∂f_22/∂y_22 ]     [   0       a_12          a_12        2a_22  ]

Now suppose that F(A_0, Y_0) = 0 and that the matrix on the right in the last identity is invertible when the terms a_ij are taken from A_0. Then the implicit function theorem guarantees the existence of a pair of numbers α > 0 and β > 0 such that for every matrix A in the ball B_β(A_0) = {A ∈ R^{2×2} : ‖A − A_0‖ < β}, there exists exactly one matrix Y = φ(A) in the ball B_α(Y_0) = {Y ∈ R^{2×2} : ‖Y − Y_0‖ < α} such that F(A, φ(A)) = 0 and hence that Y = φ(A) is a continuous function of A in the ball B_β(A_0); in fact φ ∈ C^1(B_β(A_0)).
23.4. Roots of polynomials

259

be established later by invoking a different circle of ideas in Chapter 35. Another approach is considered in Exercise 35.7. Theorem 23.4. The roots of a polynomial f (λ) = λn + a1 λn−1 + · · · + an with distinct roots vary continuously with the coefficients a1 , . . . , an . Discussion.

To illustrate the basic idea, consider the polynomial f (λ) = λ3 + λ2 − 4λ + 6.

It has three distinct roots: λ1 = 1+i, λ2 = 1−i, and λ3 = −3. Consequently, the equation (23.7)

(μ + iν)3 + a(μ + iν)2 + b(μ + iν) + c = 0

in terms of the 5 real variables μ, ν, a, b, c is satisfied by the following choices: a = 1, b = −4, c = 6,

μ = 1,

ν = 1,

μ = 1,

ν = −1, a = 1, b = −4, c = 6,

μ = −3, ν = 0,

a = 1, b = −4, c = 6.

To put this into the setting of the implicit function theorem, express f (a, b, c, μ + iν) = (μ + iν)3 + a(μ + iν)2 + b(μ + iν) + c = μ3 + 3μ2 iν + 3μ(iν)2 + (iν)3 + a(μ2 + 2μiν + (iν)2 ) + b(μ + iν) + c in terms of its real and imaginary parts as f (a, b, c, μ + iν) = f1 (a, b, c, μ, ν) + if2 (a, b, c, μ, ν), where f1 (a, b, c, μ, ν) = μ3 − 3μν 2 + aμ2 − aν 2 + bμ + c and f2 (a, b, c, μ, ν) = 3μ2 ν − ν 3 + 2aμν + bν. Thus, the study of the roots of the equation λ3 + aλ2 + bλ + c = 0 with real coefficients a, b, c has been converted into the study of the solutions of the system f1 (a, b, c, μ, ν) = 0 , f2 (a, b, c, μ, ν) = 0 .

260

23. The implicit function theorem

The implicit function theorem guarantees the continuous dependence of the pair (μ, ν) on (a, b, c) in the vicinity of a solution provided that the matrix ⎡ ∂f ∂f ⎤ 1

∂μ $(a, b, c, μ, ν) = ⎣ ∂f2 ∂μ

1

∂ν ⎦ ∂f2 ∂ν

is invertible. To explore this condition, observe first that ∂f2 ∂f1 = 3μ2 − 3ν 2 + 2aμ + b = ∂μ ∂ν

(23.8) and

∂f1 ∂f2 = −6μν − 2aν = − . ∂ν ∂μ

(23.9) Therefore, (23.10)

⎡ ∂f

1

∂μ det ⎣ ∂f2 ∂μ

∂f1 ⎤ % %2 % % % % % ∂f2 %2 % ∂f %2 ∂ν ⎦ %% ∂f1 %% % =% % . =% + %% % ∂μ % ∂f2 ∂μ % ∂μ % ∂ν

In the case at hand, ∂f2 ∂f1 (1, −4, 6, 1, 1) = −2 = (1, −4, 6, 1, 1) ∂μ ∂ν and ∂f2 ∂f1 (1, −4, 6, 1, 1) = −8 = − , ∂ν ∂μ and hence (23.11)

 −2 −8 det $(1, −4, 6, 1, 1) = det = 22 + 82 = 68 . 8 −2

Thus, the implicit function theorem guarantees that if the coefficients a, b, c of the polynomial λ3 + aλ2 + bλ + c change a little bit from 1, −4, 6, then the root 1 + i will only change a little bit. Similar considerations apply to the other two roots in this example. The same analysis applies to the simple roots of any polynomial f (λ), since the identities ∂f1 /∂μ = ∂f2 /∂ν and ∂f1 /∂ν = −∂f2 /∂μ (the CauchyRiemann equations) exhibited in (23.8) and (23.9) that connect the real and imaginary parts of f continue to hold and hence, if a polynomial f (λ)  of degree n has n distinct roots μ1 , . . . , μn , then |(∂f /∂μ)(μj )| = 0. Exercise 23.9. Show that there exists a pair of numbers α > 0 and β > 0 such that the polynomial λ3 + aλ2 + bλ + c with real coefficients has exactly one root λ = μ+iν in the ball μ2 +(ν −2)2 < β if (a−1)2 +(b−4)2 +(c−4)2 < α. [HINT: The roots of the polynomial λ3 +λ2 +4λ+4 are 2i, −2i, and −1.]

23.5. Supplementary notes

261

23.5. Supplementary notes

This chapter is partially adapted from Chapter 15 of [30]. However, there (in contrast to the present treatment) the proof of the inverse function theorem is based on the implicit function theorem, which was proved first. The monograph [52] by Krantz and Parks is a good source of additional information on the implicit function theorem and its applications.
Chapter 24

Extremal problems

This chapter is devoted primarily to classical extremal problems and extremal problems with constraints, which are resolved by the method of Lagrange multipliers.

24.1. Classical extremal problems

Let f(x) = f(x_1, ..., x_n) be a real-valued function of the variables x_1, ..., x_n that is defined in some open set Ω ⊂ R^n and suppose that f ∈ C^1(Ω) and a ∈ Ω. Then Theorem 21.4 guarantees that the directional derivative

(24.1)  (D_u f)(a) = lim_{ε↓0} (f(a + εu) − f(a))/ε

exists for every choice of u ∈ R^n with ‖u‖ = 1 and supplies the formula

(24.2)  (D_u f)(a) = (∇f)(a)u .

If a is a local maximum, then f(a) ≥ f(a + εu) for all unit vectors u and all sufficiently small positive numbers ε. Thus,

ε > 0 ⟹ (f(a + εu) − f(a))/ε ≤ 0 ⟹ (D_u f)(a) = (∇f)(a)u ≤ 0

for all unit vectors u ∈ R^n. However, since the same inequality holds when u is replaced by −u, it follows that the last inequality must in fact be an equality: If a is a local maximum, then

(∇f)(a)u = (D_u f)(a) = 0

for all directions u. Therefore, as similar arguments lead to the same conclusion when a is a local minimum point for f(x), we obtain the following result:

Theorem 24.1. Let Q be an open subset of R^n and let f ∈ C^1(Q). If a vector a ∈ Q is a local maximum or a local minimum for f(x), then

(24.3)  (∇f)(a) = O_{1×n} .

• Warning: The condition (24.3) is necessary but not sufficient for a to be a local extremum (i.e., a local maximum or a local minimum). Thus, for example, the point (0, 0) is not a local extremum for the function f(x_1, x_2) = x_1² − x_2², even though (∇f)(0, 0) = [0  0]. Convex functions that are defined on convex sets are a happy exception, as we have already seen in Section 21.5.

If f ∈ C²(Q), then additional insight may be obtained from the Hessian H_f(a): If B_ε(a) = {x ∈ R^q : ‖a − x‖ < ε} ⊂ Q for some ε > 0, then, in view of formula (21.9), which is applicable as B_ε(a) is convex,

(24.4)  (∇f)(a) = 0_{1×n} ⟹ f(a + εu) − f(a) = (ε²/2) ⟨H_f(c)u, u⟩

for some point c on the open line segment between a + εu and a. Thus, if 0 < δ < ε and

(24.5)  H_f(x) ⪰ O for every x ∈ B_δ(a), then a is a local minimum for f ,

whereas if

(24.6)  H_f(x) ⪯ O for every x ∈ B_δ(a), then a is a local maximum for f .

This leads easily to the following conclusions (consistent with Theorem 21.7):

Theorem 24.2. If Q ⊆ R^n is an open set that contains the point a, f(x) = f(x_1, ..., x_n) belongs to C²(Q), and (∇f)(a) = 0_{1×n}, then

(24.7)  H_f(a) ≻ O ⟹ a is a local minimum for f(x) ,

(24.8)  H_f(a) ≺ O ⟹ a is a local maximum for f(x) .

Proof. Assertion (24.7) follows from (24.5) and the observation that if H_f(a) ≻ O, then there exists a δ > 0 such that H_f(x) ≻ O for every x ∈ B_δ(a). The verification that (24.6) implies (24.8) is similar.  □

Exercise 24.1. Let A, B ∈ R^{n×n} and suppose that A ≻ O, B = B^T, and ‖A − B‖ < λ_min, where λ_min denotes the smallest eigenvalue of A. Show that B ≻ O. [HINT: ⟨Bu, u⟩ = ⟨Au, u⟩ + ⟨(B − A)u, u⟩.]

Exercise 24.2. Let A, B ∈ R^{n×n} and suppose that A ≻ O and B = B^T. Show that A + εB ≻ O if |ε| is sufficiently small.


Theorem 24.2 implies that the behavior of a smooth function f(x) in the vicinity of a point a at which (∇f)(a) = 0_{1×n} depends critically on the eigenvalues of the real symmetric matrix H_f(a).

Example 24.1. Let f(u, v) = α(u − 1)^2 + β(v − 2)^3 with nonzero coefficients α ∈ R and β ∈ R. Then

∂f/∂u (u, v) = 2α(u − 1)   and   ∂f/∂v (u, v) = 3β(v − 2)^2 .

Hence,

(∇f)(u, v) = [0  0]   if u = 1 and v = 2 .

However, the point (1, 2) is not a local maximum point or a local minimum point for the function f(u, v), since

f(1 + ε_1, 2 + ε_2) − f(1, 2) = αε_1^2 + βε_2^3 ,

which changes sign as ε_2 passes through zero when ε_1 = 0. This is consistent with the fact that the Hessian

H_f(u, v) = [∂^2f/∂u^2  ∂^2f/∂u∂v ; ∂^2f/∂v∂u  ∂^2f/∂v^2] = [2α  0 ; 0  6β(v − 2)] ,

and

H_f(1, 2) = [2α  0 ; 0  0]

is neither positive definite nor negative definite. □

Exercise 24.3. Show that the Hessian H_g(1, 2) of the function g(u, v) = α(u − 1)^2 + β(v − 2)^4 is the same as the Hessian H_f(1, 2) of the function considered in the preceding example.

• Warning: A local minimum or maximum point is not necessarily an absolute minimum or maximum point:

Exercise 24.4. Show that the point (1, 2) is an absolute minimum for the function g(u, v) = (u − 1)^2 + (v − 2)^4, but it is not even a local minimum point for the function f(u, v) = (u − 1)^2 + (v − 2)^3.
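To see Theorem 24.2 and Example 24.1 in action numerically, here is a minimal Python/NumPy sketch; the choice α = β = 1 is mine, made only for illustration, and the analytic gradient and Hessian are simply transcribed from the formulas above.

```python
import numpy as np

# Example 24.1 with the (illustrative) choice alpha = beta = 1: the gradient
# vanishes at (1, 2), the Hessian is [[2*alpha, 0], [0, 0]], and f still
# changes sign near (1, 2), so the point is not a local extremum.
alpha, beta = 1.0, 1.0
f = lambda u, v: alpha * (u - 1) ** 2 + beta * (v - 2) ** 3

grad = np.array([2 * alpha * (1 - 1), 3 * beta * (2 - 2) ** 2])   # (0, 0)
hess = np.array([[2 * alpha, 0.0], [0.0, 6 * beta * (2 - 2)]])
print(grad, np.linalg.eigvalsh(hess))        # eigenvalues 0 and 2*alpha
# f(1, 2 + e) - f(1, 2) = beta * e**3 changes sign with e:
print(f(1, 2 + 0.1) - f(1, 2), f(1, 2 - 0.1) - f(1, 2))   # 0.001 and -0.001
```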


Exercise 24.5. Let f ∈ C^2(R^2) and suppose that (∇f)(a, b) = [0  0], and let λ_1 and λ_2 denote the eigenvalues of H_f(a, b). Show that the point (a, b) is: (i) a local minimum for f if λ_1 > 0 and λ_2 > 0, (ii) a local maximum for f if λ_1 < 0 and λ_2 < 0, (iii) neither a local maximum nor a local minimum if |λ_1 λ_2| > 0, but λ_1 λ_2 < 0.

Exercise 24.6. In many textbooks on calculus the conclusions formulated in Exercise 24.5 are given in terms of the second-order partial derivatives α = (∂^2 f/∂x^2)(a, b), β = (∂^2 f/∂y^2)(a, b), and γ = (∂^2 f/∂x∂y)(a, b) by the conditions (i) α > 0, β > 0, and αβ − γ^2 > 0; (ii) α < 0, β < 0, and αβ − γ^2 > 0; (iii) αβ − γ^2 < 0, respectively. Show that the two formulations are equivalent.

24.2. Extremal problems with constraints

In this section we shall consider extremal problems with constraints, using the method of Lagrange multipliers. Let a be a local extremum of the function f(x_1, . . . , x_n) when the variables (x_1, . . . , x_n) belong to the surface S that is defined by the constraint g(x_1, . . . , x_n) = 0. If x(t), −1 ≤ t ≤ 1, is any smooth curve on this surface with x(0) = a, then g(x(t)) = g(x_1(t), . . . , x_n(t)) = 0 for −1 < t < 1 and hence, upon writing gradients as column vectors,

(d/dt) g(x(t)) = ⟨(∇g)(x(t)), x′(t)⟩ = 0   for all t ∈ (−1, 1) .

In particular,

(24.9)    ⟨(∇g)(a), x′(0)⟩ = ⟨(∇g)(x(0)), x′(0)⟩ = 0

for all such curves x(t). At the same time, since a is a local extremum for f(x) on S, t = 0 is also a local extremum for the function f(x(t)). Therefore,

(24.10)    0 = (d/dt) f(x(t))|_{t=0} = ⟨(∇f)(a), x′(0)⟩ .

Thus, if rank (∇g)(a) = 1 and the span of the set of possible vectors x′(0) fills out an (n − 1)-dimensional space, then

(24.11)    (∇f)(a) = λ(∇g)(a) ,   i.e.,   (∂f/∂x_i)(a) = λ (∂g/∂x_i)(a)

for some constant λ ∈ R and i = 1, . . . , n.
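A quick numerical illustration of the multiplier condition (24.11) may be helpful; the example below (maximize f(x) = x_1 + x_2 on the unit circle) is my own choice, not from the text, and the sketch assumes NumPy.

```python
import numpy as np

# Maximize f(x) = x1 + x2 subject to g(x) = x1**2 + x2**2 - 1 = 0 and check
# that grad f = lambda * grad g at the maximizer, as in (24.11).
a = np.array([1.0, 1.0]) / np.sqrt(2.0)     # the constrained maximizer
grad_f = np.array([1.0, 1.0])
grad_g = 2.0 * a
lam = grad_f[0] / grad_g[0]                 # candidate Lagrange multiplier
print(lam, np.allclose(grad_f, lam * grad_g))     # 1/sqrt(2), True

# A crude check that a really is the maximizer: sample the circle.
theta = np.linspace(0.0, 2.0 * np.pi, 100001)
print((np.cos(theta) + np.sin(theta)).max(), np.sqrt(2.0))   # both ~ 1.41421
```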


Example 24.2. Let g(x) = ‖x‖^2 − 1 and let a ∈ R^n be a local extremum for a smooth function f(x) = f(x_1, . . . , x_n) on the unit sphere S = {x ∈ R^n : g(x) = 0}. If u ∈ S and ϕ(t) = ‖a + tu‖, then the curve x(t) = (a + tu)/ϕ(t) belongs to S for −1 < t < 1, x(0) = a, and

x′(t) = (ϕ(t)u − ϕ′(t)(a + tu))/ϕ(t)^2 .

Thus, as ϕ(0) = 1 and ϕ(t)^2 = ⟨a + tu, a + tu⟩, 2ϕ(t)ϕ′(t) = 2⟨u, a + tu⟩. Therefore, ϕ′(0) = ⟨u, a⟩ and

x′(0) = u − ⟨u, a⟩a = (I_n − Π_A)u ,

where Π_A denotes the orthogonal projection of R^n onto A = {ca : c ∈ R}. Since u is an arbitrary vector in R^n, the condition

0 = ⟨v, x′(0)⟩ = ⟨v, (I_n − Π_A)u⟩ = ⟨(I_n − Π_A)v, u⟩

implies that (I_n − Π_A)v = 0, i.e., v = ⟨v, a⟩a is a constant multiple of the vector a. Therefore, in view of (24.9) and (24.10), ∇g(a) and ∇f(a) are both constant multiples of a. But, this in turn implies that (24.11) holds, since (∇g)(a) = 2a and ‖a‖ = 1. □

Analogously, if u_0 is a local extremum for f(x) subject to p constraints g_1(x) = 0, . . . , g_p(x) = 0 and

rank J_g(u_0) = rank [(∇g_1)(u_0) ; . . . ; (∇g_p)(u_0)] = p ,

then

(24.12)    (∇f)(u_0) = λ_1(∇g_1)(u_0) + · · · + λ_p(∇g_p)(u_0) .

Our next objective is to justify this conclusion with the help of the implicit function theorem.

Theorem 24.3. Let f(u) = f(u_1, . . . , u_n), g_j(u) = g_j(u_1, . . . , u_n), j = 1, . . . , p, be real-valued functions in C^1(Q) for some open set Q in R^n, where p < n. Let S = {(u_1, . . . , u_n) ∈ Q : g_j(u_1, . . . , u_n) = 0 for j = 1, . . . , p} and assume that there exist a point u_0 ∈ S and a number δ > 0 such that:

(1) B_δ(u_0) ⊆ Q and either f(u) ≥ f(u_0) for all u ∈ S ∩ B_δ(u_0) or f(u) ≤ f(u_0) for all u ∈ S ∩ B_δ(u_0).


(2) rank [(∇g_1)(u) ; . . . ; (∇g_p)(u)] = p   for all points u in the ball B_δ(u_0).

Then there exists a set of p constants λ_1, . . . , λ_p such that (24.12) holds.

Proof. To ease the exposition, we shall assume that the last p columns of J_g(u_0) are linearly independent. Then the implicit function theorem guarantees the existence of a pair of constants α > 0 and β > 0 such that if

u = [x ; y]   with x ∈ R^q, y ∈ R^p, p + q = n,   and   u_0 = [x_0 ; y_0] ,

then for each point x in the ball B_β(x_0) there exists exactly one point y = ϕ(x) in the ball B_α(y_0) such that g_i(x, ϕ(x)) = 0 for i = 1, . . . , p. Moreover, ϕ ∈ C^1(B_β(x_0)). Thus, if w ∈ R^q and ‖w‖ < β, then

x(t) = x_0 + tw ,   −1 ≤ t ≤ 1 ,

belongs to the ball B_β(x_0). If α^2 + β^2 ≤ δ^2, then

u(t) = [x(t) ; ϕ(x(t))] ∈ B_δ(u_0)

and

u′(0) = [x′(0) ; y′(0)] = [w ; Cw] ,

where

C = [∂ϕ_1/∂x_1(x_0)  · · ·  ∂ϕ_1/∂x_q(x_0) ; . . . ; ∂ϕ_p/∂x_1(x_0)  · · ·  ∂ϕ_p/∂x_q(x_0)] .

Thus, the set of u′(0) that is obtained this way spans a q-dimensional subspace of R^n. Moreover,

(24.13)    (d/dt) f(u(t))|_{t=0} = (∇f)(u_0) u′(0) = 0

and g_i(u(t)) = 0 for −1 ≤ t ≤ 1

with A^T A = U S^2 U^T, where U = [u_1 · · · u_q] is a real q × q unitary matrix and S^2 = diag{s_1^2, . . . , s_q^2} with s_1 ≥ · · · ≥ s_q ≥ 0, and let f(x) = ⟨Ax, Ax⟩, g_1(x) = ⟨x, x⟩ − 1, and g_2(x) = ⟨x, u_1⟩. The problem is to find

max f(x) subject to the constraints g_1(x) = 0 and g_2(x) = 0.

Discussion. In view of (24.12), there exists a pair of real constants α and β such that

(∇f)(a) = α(∇g_1)(a) + β(∇g_2)(a)

at each local extremum a of the given problem. Therefore,

(24.16)    2A^T A a = 2αa + βu_1 ,

and the constraints supply the supplementary information that ⟨a, a⟩ = 1 and ⟨a, u_1⟩ = 0. Thus, as u_1, . . . , u_q is an orthonormal basis for R^q,

a = Σ_{j=1}^{q} c_j u_j ,   where c_j = ⟨a, u_j⟩ for j = 1, . . . , q .

This last formula exhibits once again the advantage of working with an orthonormal basis: It is easy to calculate the coefficients. In particular, the constraint g_2(a) = 0 forces c_1 = 0 and hence that

a = Σ_{j=2}^{q} c_j u_j .

Substituting this into formula (24.16), we obtain

2α Σ_{j=2}^{q} c_j u_j + βu_1 = 2A^T A Σ_{j=2}^{q} c_j u_j = 2 Σ_{j=2}^{q} c_j s_j^2 u_j .


Therefore, β = 0 and c_j s_j^2 = αc_j for j = 2, . . . , q. Moreover, since the constraint

g_1(a) = 0 =⇒ Σ_{j=2}^{q} c_j^2 = 1 ,

it is easily seen that

⟨Aa, Aa⟩ = Σ_{j=2}^{q} c_j^2 s_j^2 ≤ (Σ_{j=2}^{q} c_j^2) s_2^2 = s_2^2

and hence that the maximum value of ⟨Aa, Aa⟩ subject to the two given constraints is obtained by choosing α = s_2^2, c_2 = 1, and c_j = 0 for j = 3, . . . , q. This gives a geometric interpretation to s_2. Analogous interpretations hold for the other singular values of A. □

Example 24.5. The problem is to find max f(x) subject to the constraints g_1(x) = 0 and g_2(x) = 0 when f(x) and g_1(x) are the same as in Example 24.4, but now g_2(x) = ⟨x, 3u_1 + 4u_2⟩, q ≥ 3, and s_1 > s_2.

Discussion. In view of (24.12), there exists a pair of real constants α and β such that

(∇f)(a) = α(∇g_1)(a) + β(∇g_2)(a)

at each local extremum a of the given problem. Therefore,

(24.17)    2A^T A a = 2αa + β(3u_1 + 4u_2) ,

and the constraints supply the supplementary information that ⟨a, a⟩ = 1 and ⟨a, 3u_1 + 4u_2⟩ = 0. Thus, as u_1, . . . , u_q is an orthonormal basis for R^q,

a = Σ_{j=1}^{q} c_j u_j ,   where c_j = ⟨a, u_j⟩ for j = 1, . . . , q .

Consequently,

(24.18)    2 Σ_{j=1}^{q} c_j s_j^2 u_j = 2α Σ_{j=1}^{q} c_j u_j + β(3u_1 + 4u_2) ,

which, upon matching coefficients, implies that

2c_1 s_1^2 = 2αc_1 + 3β ,   2c_2 s_2^2 = 2αc_2 + 4β ,   and   2c_j s_j^2 = 2αc_j for j ≥ 3 .

Moreover, the constraints g_1(a) = 0 and g_2(a) = 0 yield the supplementary conditions Σ_{j=1}^{q} c_j^2 = 1 and 3c_1 + 4c_2 = 0, respectively. Thus, as 8c_1 s_1^2 = 8αc_1 + 12β and 6c_2 s_2^2 = 6αc_2 + 12β, it follows that 8c_1 s_1^2 − 6c_2 s_2^2 = 8αc_1 − 6αc_2


and hence, as c_2 = −3c_1/4, that c_1(16s_1^2 + 9s_2^2) = 25c_1 α. If c_1 ≠ 0, then α = (16s_1^2 + 9s_2^2)/25 > s_2^2 ≥ s_j^2 when j ≥ 3. Therefore, c_j = 0 for j ≥ 3 and consequently c_1^2 + c_2^2 = 1. Therefore, either c_1 = 4/5 and c_2 = −3/5, or c_1 = −4/5 and c_2 = 3/5. In both cases f(c_1 u_1 + c_2 u_2) = (16s_1^2 + 9s_2^2)/25. On the other hand, if c_1 = 0, then also c_2 = 0 and f(Σ_{j=3}^{q} c_j u_j) ≤ s_3^2 < (16s_1^2 + 9s_2^2)/25, so this option does not yield a maximum. □

Exercise 24.7. In the setting of Example 24.5, find the minimum value of f(x) over all vectors x ∈ R^4 that are subject to the same two constraints that are considered there.

Example 24.6. Let X ∈ R^{p×p}, [e_1 · · · e_p] = I_p, f(X) = (det X)^2, and g(X) = trace X^T X − 1. The problem is to compute max{f(X) : X ∈ R^{p×p} and g(X) = 0}. If A is a local extremum and E_{ij} = e_i e_j^T for i, j = 1, . . . , p, then, in view of (24.11), there exists a number μ ∈ R such that

(24.19)    lim_{ε→0} (f(A + εE_{ij}) − f(A))/ε = μ lim_{ε→0} (g(A + εE_{ij}) − g(A))/ε   for i, j = 1, . . . , p ,

i.e., (∂f/∂x_{ij})(X) = μ (∂g/∂x_{ij})(X) for i, j = 1, . . . , p when X = A. We shall carry out the implications of this condition in a series of exercises.

Exercise 24.8. Show that (24.19) holds if and only if

2(−1)^{i+j} A_{{i;j}} det A = μ 2e_i^T A e_j = μ 2a_{ij}   for i, j = 1, . . . , p .

Exercise 24.9. Show that the condition in Exercise 24.8 holds if and only if f(A)I_p = μAA^T and hence that if this condition is met, then μ = pf(A), which in turn implies that AA^T = p^{−1}I_p. Consequently, (det A)^2 = det AA^T = p^{−p}.

Exercise 24.10. Confirm that p^{−p} is indeed the maximum value of f under the given constraints. [HINT: It is enough to show that det(p^{−1}I_p + X) ≤ p^{−p} for every symmetric matrix X ∈ R^{p×p} with trace X = 0.]

Exercise 24.11. Show that the problem considered in Example 24.6 is equivalent to finding the maximum value of the product s_1^2 · · · s_p^2 for a set of real numbers s_1, . . . , s_p that are subject to the constraint s_1^2 + · · · + s_p^2 = 1.

Exercise 24.12. Let B ∈ R^{n×n} be positive definite with entries b_{ij}, let

Q_m = {A ∈ R^{n×n} : A ≻ O and a_{ij} = b_{ij} for |i − j| ≤ m} ,   0 ≤ m < n ,

and let f(X) = ln det X. Show that if [e_1 · · · e_n] = I_n, Â ∈ Q_m, and f(Â) ≥ f(A) for every matrix A ∈ Q_m, then e_i^T (Â)^{−1} e_j = 0 for |i − j| > m.
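The conclusions of Examples 24.4 and 24.5 are easy to test numerically. The sketch below uses a randomly generated 6 × 4 matrix (my own choice, purely for illustration) and computes the constrained maxima by restricting A^T A to an orthonormal basis of the constraint hyperplane.

```python
import numpy as np

# Numerical check of Examples 24.4 and 24.5 on a random test matrix.
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))
s = np.linalg.svd(A, compute_uv=False)      # s[0] >= s[1] >= ... singular values
u = np.linalg.svd(A)[2].T                   # columns: eigenvectors of A^T A

def constrained_max(w):
    # max <Ax, Ax> over ||x|| = 1 with <x, w> = 0: restrict A^T A to an
    # orthonormal basis of the hyperplane orthogonal to w.
    B = np.linalg.svd(w.reshape(1, -1))[2][1:].T
    return np.linalg.eigvalsh(B.T @ (A.T @ A) @ B).max()

print(constrained_max(u[:, 0]), s[1] ** 2)                  # agree (Example 24.4)
print(constrained_max(3 * u[:, 0] + 4 * u[:, 1]),
      (16 * s[0] ** 2 + 9 * s[1] ** 2) / 25)                # agree (Example 24.5)
```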


24.4. Supplementary notes

This chapter is adapted from Chapter 16 in [30], which contains additional exercises. Exercise 24.12 is connected with the maximum entropy completion problem that will be considered in Chapter 38.

Chapter 25

Newton’s method

In Example 22.3 we used the contractive fixed point theorem to show that if Q = (α, β) ⊂ R, f ∈ C^2(Q) maps Q into R, and there exists a point b ∈ Q such that f(b) = 0 and f′(b) ≠ 0, then there exists a δ > 0 such that:

(1) The set E_δ = {x ∈ R : |x − b| ≤ δ} is a subset of Q and f′(x) ≠ 0 for x ∈ E_δ.

(2) The Newton step N(x) = x − f(x)/f′(x) maps E_δ into itself.

(3) The sequence of points x_{k+1} = N(x_k) tends to b as k ↑ ∞ for every choice of x_0 ∈ E_δ.

In this chapter we will find estimates for the speed of this convergence, first for scalar-valued functions and then, after developing the Newton method for vector-valued functions, for vector-valued functions too.

25.1. Newton’s method for scalar functions

Let b ∈ R,

B_δ(b) = {y ∈ R : |y − b| ≤ δ} ,   and   N(x) = x − f(x)/f′(x) .

Theorem 25.1. Let Q be a nonempty open subset of R, let f ∈ C^2(Q) map Q into R, and suppose that there exists a point b ∈ Q such that f(b) = 0 and f′(b) ≠ 0. Then there exists a pair of constants δ > 0 and γ_δ > 0 with δγ_δ < 1 such that B_δ(b) ⊂ Q, f′(y) is invertible for every point y ∈ B_δ(b), and:

(1) |N(x) − b| ≤ |x − b|^2 γ_δ < |x − b| for every point x ∈ B_δ(b).


(2) If x_0 ∈ B_δ(b) and x_j = N(x_{j−1}) for j = 1, 2, . . . , then x_j ∈ B_δ(b) and

(25.1)    |x_j − b| ≤ γ_δ^{−1} {|x_0 − b| γ_δ}^{2^j}   for j = 1, 2, . . . .

Proof. Choose δ > 0 (but small enough) so that B_δ(b) ⊂ Q and |f′(y)| > 0 for y ∈ B_δ(b); and let

α_δ = max {|f′′(y)| : y ∈ B_δ(b)}   and   β_δ = max {|f′(y)|^{−1} : y ∈ B_δ(b)} .

Then, since f(b) = 0, the Taylor theorem with remainder implies that

0 = f(x) + (b − x)f′(x) + ((b − x)^2/2!) f′′(c)

for x ∈ B_δ(b) and some point c between x and b. Thus, as c ∈ B_δ(b),

|N(x) − b| = |(b − x)^2 f′′(c)|/(2|f′(x)|) ≤ (b − x)^2 α_δ β_δ/2 ≤ |b − x| δγ_δ

with γ_δ = α_δ β_δ/2. Since γ_{δ_1} ≤ γ_{δ_2} when δ_1 ≤ δ_2, we can, by shrinking δ if need be, assume that δγ_δ < 1. Consequently, (1) holds; (2) follows easily from (1): If |x_k − b| ≤ (γ_δ)^{−1}(|x_0 − b|γ_δ)^{2^k} for some positive integer k, then, in view of (1),

|x_{k+1} − b| = |N(x_k) − b| ≤ |x_k − b|^2 γ_δ ≤ {(|x_0 − b|γ_δ)^{2^k}}^2 / γ_δ ,

which coincides with the right-hand side of (25.1) if j = k + 1. □

Example 25.1. Let f(x) = x^2 − 3x + 2 = (x − 2)(x − 1). Then clearly f ∈ C^2(R), f(2) = f(1) = 0, f′(x) = 2x − 3, f′′(x) = 2, and

N(x) = x − (x^2 − 3x + 2)/(2x − 3) = (x^2 − 2)/(2x − 3) .

Thus, in terms of the notation introduced in the proof of Theorem 25.1, α_δ = 2 for every δ > 0 and (focusing on the root b = 2)

|f′(x)^{−1}| = 1/|2x − 3| = 1/|2(x − 2) + 1| ≤ 1/(1 − 2δ)

if |x − 2| ≤ δ and δ < 1/2. Consequently, β_δ = 1/(1 − 2δ) and

(α_δ β_δ/2) δ = δ/(1 − 2δ) < 1 ⇐⇒ δ < 1/3 .

This constraint is met by the initial choice x_0 = 15/8 = 2 − 1/8. Then

x_1 = 97/48 = 2 + 1/48 ,   x_2 = 2 + 1/2400 ,   . . . .


Exercise 25.1. Show that the Newton recursion x_{n+1} = N(x_n), n = 0, 1, . . . , for solving the equation x^2 − a = 0 to find the square roots of a > 0 is

x_{n+1} = (1/2)(x_n + a/x_n)   if x_n ≠ 0 ,

and calculate x_1, x_2, x_3 when a = 4 and x_0 = ±1.

Exercise 25.2. Show that in the setting of Exercise 25.1

|x_{n+1} − x_n| ≤ (1/2)|x_n^{−1}| |x_n − x_{n−1}|^2 .

Exercise 25.3. Show that the Newton recursion x_{n+1} = N(x_n), n = 0, 1, . . . , for solving the equation x^3 − a = 0 to find the cube roots of a is

x_{n+1} = (1/3)(2x_n + a/x_n^2)   if x_n ≠ 0 ,

and calculate x_1, x_2, x_3 when a = 8 and x_0 = ±1.
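A minimal Python sketch of the scalar Newton recursion, run in exact rational arithmetic on the data of Example 25.1 (the number of steps shown is an arbitrary choice of mine):

```python
from fractions import Fraction

# Newton's recursion N(x) = x - f(x)/f'(x) for f(x) = x^2 - 3x + 2, started at
# x0 = 15/8 as in Example 25.1; the iterates 97/48, 4801/2400, ... show the
# error being roughly squared at every step.
f = lambda x: x * x - 3 * x + 2
df = lambda x: 2 * x - 3
x = Fraction(15, 8)
for _ in range(3):
    x = x - f(x) / df(x)
    print(x, float(x))
```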

25.2. Newton’s method for vector-valued functions

Newton’s method for vector-valued functions is an iterative scheme for solving equations of the form f(x) = 0 for vector-valued functions f ∈ C^2(Q) that map an open subset Q of R^p into R^p. The underlying idea is that if f(b) = 0 and the Jacobian matrix

J_f(x) = [∂f_1/∂x_1(x)  · · ·  ∂f_1/∂x_p(x) ; . . . ; ∂f_p/∂x_1(x)  · · ·  ∂f_p/∂x_p(x)]

is invertible at the point x = b, then, since 0 = f(b) ≈ f(x) + J_f(x)(b − x) and J_f(x) is invertible if x is close enough to b,

b ≈ x − J_f(x)^{−1} f(x) .

The vector-valued function

(25.2)    N(x) = x − J_f(x)^{−1} f(x)

is called the Newton step. Clearly, N(b) = b, and it turns out that if ‖x − b‖ is small enough, then ‖N(x) − b‖ < ‖x − b‖. Thus, a reasonable strategy is to define a sequence of points x_{j+1} = N(x_j), j = 0, 1, . . . , and then to try to show that this sequence converges to b as j ↑ ∞.


Theorem 25.2. Let Q be a nonempty open subset of R p , let f ∈ C 2 (Q) map Q into R p , and suppose that there exists a point b ∈ Q such that f (b) = 0

and the Jacobian matrix Jf (x) is invertible at b .

Then there exists a pair of numbers δ > 0 and γδ > 0 with δγδ < 1 such that Bδ (b) ⊂ Q and the following implications hold: (1) x ∈ Bδ (b) =⇒ N (x) ∈ Bδ (b) and N (x) − b ≤ γδ x − b2 ≤ δγδ x − b .

(25.3)

(2) If x0 ∈ Bδ (b) and xj = N (xj−1 ) for j = 1, 2, . . . , then xj ∈ Bδ (b) and xj − b ≤ γδ−1 {γδ x0 − b}2

(25.4)

j

f or j = 1, 2, . . . .

Discussion. The proof is similar in spirit to the proof for the scalar case; however, the estimates are a little more complicated. If f ∈ C 2 (Q), u ∈ C p , and x(t) = a + t(b − a) for 0 ≤ t ≤ 1, then the function h(t) = uT f (x(t)) can be differentiated twice to obtain a formula that is expressed in terms of the Hessian Hfj of each component fj of f and the components uj of u: h (t) =

p 

uj Hfj (x(t))(b − a), (b − a) .

j=1

Thus, as h(1) = h(0) + h (0) + 12 h (t0 ) for some point t0 ∈ (0, 1), 1 uj Hfj (c)(b − a), (b − a) 2 p

uT [f (b) − f (a)] = uT Jf (a)(b − a) +

j=1

with c = a + t0 (b − b). If f (b) = 0, this reduces to 1 1 uj Hfj (c)(b − a), (b − a) = v, u , Jf (a){N (a) − b}, u = 2 2 p

j=1

for every choice of u ∈

R p,

where v ∈ R p is the vector with entries

vj = Hfj (c)(b − a), (b − a) . Therefore, 1 |Jf (a){N (a) − b}, u| ≤ v u , 2 which implies that 1 Jf (a){N (a) − b} ≤ v 2 and hence that 1 N (a) − b = Jf (a)−1 Jf (a){N (a) − b} ≤ Jf (a)−1  v . 2


Consequently, upon setting (25.5)

αδ = max{Hfj (x) : x ∈ Bδ (b) and j = 1, . . . , p} , βδ = max{Jf (x)−1  : x ∈ Bδ (b)} ,

and

γδ = βδ

1√ p αδ , 2

it is readily checked that v ≤

√ p αδ a − b2

and hence that N (a) − b ≤ βδ

1√ p αδ a − b2 = γδ a − b2 . 2

Moreover, as αδ1 ≤ αδ2 and βδ1 ≤ βδ2 if δ1 ≤ δ2 , we can, by shrinking δ if need be, ensure that δ γδ < 1 and hence that (25.3) holds; (25.4) is an easy consequence of (25.3) (see the tail end of the proof of Theorem 25.1 for the strategy).  

Example 25.2. Let

f(x) = [f_1(x) ; f_2(x)] = [x_1^2 + x_2^2 − 1 ; x_1^2 − x_2] .

Then f ∈ C^2(R^2) and, since

∂f_1/∂x_1 = 2x_1 ,   ∂f_1/∂x_2 = 2x_2 ,   ∂f_2/∂x_1 = 2x_1 ,   and   ∂f_2/∂x_2 = −1 ,

it is readily checked that

J_f(x) = [2x_1  2x_2 ; 2x_1  −1] ,   H_{f_1}(x) = [2  0 ; 0  2] ,   and   H_{f_2}(x) = [2  0 ; 0  0] .

Thus, the upper bound α_δ defined in (25.5) is equal to 2 for every δ > 0. Moreover, f(x) = 0 if and only if x_1 = ±x_2^{1/2} and x_2 = (−1 + √5)/2, and as J_f(x) is invertible at these two points (which are the points at which the parabola intersects the circle) that are approximately equal to (±0.786, 0.618), there clearly exists a δ > 0 and a β_δ > 0 such that if b is one of these points and ‖x − b‖ ≤ δ, then

‖J_f(x)^{−1}‖ = ‖{−2x_1(1 + 2x_2)}^{−1} [−1  −2x_2 ; −2x_1  2x_1]‖ ≤ β_δ .

Therefore, the Newton recursion will converge if the initial point x_0 is chosen close enough to b. □

Exercise 25.4. Show that if Q is an open subset of R^p and f ∈ C^2(Q) maps Q into R^p and B_δ(b) ⊂ Q for some δ > 0, then there exists a constant α_δ > 0 such that ‖J_f(x) − J_f(y)‖ ≤ α_δ ‖x − y‖ for x, y ∈ B_δ(b).
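The Newton recursion (25.2) for the system in Example 25.2 can be tried out directly; in the sketch below the initial point (1, 1) is my own (reasonably close) choice.

```python
import numpy as np

# Newton's method for f1 = x1^2 + x2^2 - 1, f2 = x1^2 - x2 (Example 25.2).
def f(x):
    return np.array([x[0] ** 2 + x[1] ** 2 - 1.0, x[0] ** 2 - x[1]])

def Jf(x):
    return np.array([[2.0 * x[0], 2.0 * x[1]], [2.0 * x[0], -1.0]])

x = np.array([1.0, 1.0])                   # initial guess
for _ in range(6):
    x = x - np.linalg.solve(Jf(x), f(x))   # the Newton step N(x)
    print(x, np.linalg.norm(f(x)))
# x converges quadratically to approximately (0.786, 0.618).
```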


Exercise 25.5. Show that if, in the setting of Exercise 25.4, f (b) = 0 and βδ = max {Jf (x)−1  : x ∈ Bδ (b)} < ∞, then the Newton step N (x) satisfies the equality 6 1 {Jf (x + s(b − x)) − Jf (x)}ds(b − x) (25.6) N (x) − b = Jf (x)−1 0

,1

and hence N (x) − b ≤ αδ βδ 0 sdsb − x2 for every point x ∈ Bδ (b). ,1 [HINT: If h(s) = f (x + s(b − x)), then h(1) − h(0) = 0 h (s)ds.]

25.3. Supplementary notes

This chapter is an expanded and revised version of Section 14.7 in [30]. The discussion of Newton’s method was originally adapted from [65]. A more sophisticated version due to Kantorovich may be found in the book [67] by Saaty and Bram.

Chapter 26

Matrices with nonnegative entries

We shall say that a matrix A ∈ R^{p×q} with entries a_{ij} belongs to the class

• R_≥^{p×q} if a_{ij} ≥ 0 for i = 1, . . . , p and j = 1, . . . , q;

• R_>^{p×q} if a_{ij} > 0 for i = 1, . . . , p and j = 1, . . . , q.

The class R_≥^{n×n} (resp., R_>^{n×n}) of matrices with nonnegative (resp., positive) entries is not the same as the class of n × n positive semidefinite (resp., positive definite) matrices:

if A = [1  −1 ; −1  2] , then A ≻ O, but A ∉ R_≥^{n×n} ;

if A = [1  2 ; 2  2] , then A ∈ R_>^{n×n}, but A is not ≻ O .

A matrix A ∈ R_≥^{n×n} is said to be irreducible if for every pair of indices i, j ∈ {1, . . . , n} there exists an integer k ≥ 1 such that the ij entry of A^k is positive, i.e., in terms of the standard basis e_i, i = 1, . . . , n, of R^n, if

⟨A^k e_j, e_i⟩ > 0   for some positive integer k that may depend upon i, j .

This is less restrictive than assuming that there exists an integer k ≥ 1 such that A^k ∈ R_>^{n×n}.

Example 26.1. The matrix A = [0  1 ; 1  0] is irreducible, but A^k ∉ R_>^{2×2} for any positive integer k. □


The main objective of this chapter is to show that if A ∈ R≥n×n is irreducible, then rσ (A), the spectral radius of A, is an eigenvalue of A with algebraic multiplicity equal to one and there exists a pair of vectors u, v ∈ R>n such that Au = rσ (A)u, AT v = rσ (A)v, and vT u = 1. These facts are part and parcel of the Perron-Frobenius theorem. To warm up we first consider a special case, wherein these conclusions are obtained easily under more restrictive conditions on A. This case will also serve as a useful review of a number of concepts that were introduced earlier.

26.1. A warm-up theorem

We begin with a general observation that is formulated as a lemma for ease of future reference.

Lemma 26.1. If A ∈ R_≥^{n×n}, Ax = μx for μ ∈ C and some nonzero vector x ∈ C^n, and v is the vector with components v_i = |x_i| for i = 1, . . . , n, then

(26.1)    |μ| ⟨v, v⟩ ≤ ⟨Av, v⟩ .

Proof. Since

|μ|v_i = |μx_i| = |Σ_{j=1}^{n} a_{ij} x_j| ≤ Σ_{j=1}^{n} a_{ij} v_j   for i = 1, . . . , n ,

it follows that

|μ| Σ_{i=1}^{n} |v_i|^2 ≤ Σ_{i,j=1}^{n} v_i a_{ij} v_j ,

which is equivalent to (26.1). □

Theorem 26.2. If A ∈ R_>^{n×n} is symmetric with eigenvalues λ_1 ≥ λ_2 ≥ · · · ≥ λ_n, repeated according to their algebraic multiplicity, then:

(1) λ_1 > 0.

(2) ⟨Ax, x⟩ ≤ λ_1 ⟨x, x⟩ for every vector x ∈ R^n.

(3) λ_1 I_n − A ⪰ O.

(4) If Ax = λ_1 x for some nonzero vector x ∈ C^n and v is the vector with entries v_j = |x_j| for j = 1, . . . , n, then Av = λ_1 v and v ∈ R_>^n.

(5) The algebraic multiplicity of λ_1 is equal to 1.

(6) If μ ∈ σ(A) and μ ≠ λ_1, then |μ| < λ_1 (and hence λ_1 = r_σ(A)).

Proof. Under the given assumptions, A = UDU^T, where U ∈ R^{n×n} is unitary and D = diag{λ_1, . . . , λ_n} with λ_1 ≥ · · · ≥ λ_n. Consequently, (1) holds, since

λ_1 + · · · + λ_n = trace A = Σ_{i=1}^{n} a_{ii} > 0 .


To verify (2), observe that if x ∈ R n and y = U T x, then Ax, x = U DU T x, x = DU T x, U T x =

n 

λj yj2

j=1

≤ λ1

n 

yj2 = λ1 y, y = λ1 U T x, U T x = λ1 x, x .

j=1

Assertion (3) is then immediate from (2), since A = AT . To verify (4), let x ∈ N(A−λ1 In ) and let v be the vector with vi = |xi | for i = 1, . . . , n. Then, in view of Lemma 26.1, λ1 v, v ≤ Av, v, whereas (3) implies that λ1 v, v ≥ Av, v. Therefore, 0 = (λ1 In − A)v, v = (λ1 In − A)1/2 v2 , 1/2 which implies that n (λ1 In − A) v = 0 and hence that λ1 v = Av. Consequently, λ1 vi = j=1 aij vj > 0 for i = 1, . . . , n. This completes the proof of (4).

To verify (5), suppose that λ1 x = Ax and λ1 y = Ay for a pair of nonzero vectors x, y ∈ R n with entries x1 , . . . , xn and y1 , . . . , yn , respectively. Then the vector u = y1 x − x1 y also belongs to N(λ1 In −A) . Therefore, either u = 0 or uj = 0 for j = 1, . . . , n. Since u1 = 0, u = 0. Thus, the geometric multiplicity of λ1 is equal to one. But as A ∈ R n×n and A = AT , the algebraic multiplicity of each eigenvalue of A is equal to its geometric multiplicity. The first step in the proof of (6) is to observe that if Ax = μx for some nonzero vector x ∈ C n with components x1 , . . . , xn and v is the vector with vi = |xi | for i = 1, . . . , n, then in view of Lemma 26.1 and (3), |μ| v, v ≤ Av, v ≤ λ1 v, v . Therefore, |μ| ≤ λ1 . Thus, to complete the proofof (6), it remains only to show that −λ1 ∈ σ(A). However, if −λ1 xi = nj=1 aij xj and w is the vector with wi = |xi | for i = 1, . . . , n, then, by another application of (3) and Lemma 26.1, Aw = λ1 w. Consequently, λ1 (|xi | − xi ) =

n 

aij [|xj | + xj ] .

j=1

Thus, if we say xm > 0 for any index m, then 0 = λ1 (|xm | − xm ) =

n  j=1

amj [|xj | + xj ] > 0 ,


which is clearly not viable. Consequently, xj = −|xj | for j = 1, . . . , n. But then n  aij [|xj | − |xj |] = 0 for i = 1, . . . , n . 2λ1 |xi | = λ1 (|xi | − xi ) = j=1

Therefore, −λ1 is not an eigenvalue of A. Exercise

0 A= 1



26.1. Show that the matrix 

 1 1 1 is irreducible, but the matrix B = is not irreducible 1 0 1

and, more generally, that every triangular matrix B ∈ R≥n×n is not irreducible. be a diagonal matrix. Exercise 26.2. Let A ∈ R≥n×n and let D ∈ Rn×n > Show that A is irreducible ⇐⇒ AD is irreducible ⇐⇒ DA is irreducible .

26.2. The Perron-Frobenius theorem We begin with a preliminary lemma. The notation (26.2)

a≥b

for a pair of vectors a, b ∈

Rn

(resp., a > b)

means that a − b ∈ R≥n (resp., a − b ∈ R>n ).

Lemma 26.3. If A ∈ R≥n×n , x ∈ R≥n , and there exists a matrix P ∈ R>n×n such that AP = P A, then (26.3)

Ax − rσ (A)x ∈ R≥n ⇐⇒ Ax = rσ (A)x .

Moreover, if Ax − rσ (A)x ∈ R≥n×n and ϕ(t) is a polynomial such that (26.4)

ϕ(A) ∈ R>n×n

and

ϕ(rσ (A)) > 0 ,

then (26.5)

ϕ(A)x ∈ R>n ⇐⇒ x ∈ R>n .

Proof. The equivalence (26.3) clearly holds when Ax − rσ (A)x = 0. Thus, to verify (26.3) under the given assumptions, it suffices to focus on the case when Ax − rσ (A)x = 0. But, if Ax − rσ (A)x ∈ R≥n and Ax − rσ (A)x = 0, then x = 0, A = O, P x ∈ R>n , and AP x − rσ (A)P x = P (Ax − rσ (A)x) ∈ R>n . Therefore, (26.6)

ε = min

i=1,...,n

(AP x − rσ (A)P x)i >0 (P x)i

and AP x ≥ [rσ (A) + ε]P x .


Thus, Ak P x ≥ [rσ (A) + ε]k P x > 0 for every positive integer k and Ak  ≥

Ak P x ≥ [rσ (A) + ε]k . P x

But this implies that rσ (A) = lim Ak 1/k ≥ rσ (A) + ε k↑∞

and hence that ε = 0, which contradicts (26.6). Consequently, the initial assumption that Ax − rσ (A)x = 0 is not viable. Therefore, Ax = rσ (A)x, i.e., the implication =⇒ in (26.3) holds. Thus, as the opposite implication is self-evident, the proof of (26.3) is complete. Finally, if Ax − rσ (A)x ∈ R≥n×n and ϕ is a polynomial for which (26.4) holds, then Ax = rσ (A)x by (26.3), since ϕ(A) ∈ R>n×n and ϕ(A) commutes with A. Thus, ϕ(A)x = ϕ(rσ (A))x, and hence, as ϕ(rσ (A)) > 0, (26.5) must hold.  In future applications of Lemma 26.3, P will usually be taken equal to Ak for some positive integer k, or (In + A)n−1 , depending upon the constraints imposed on A. Exercise 26.3. Show that if A ∈ R≥n×n and if A is irreducible, then (In + A)n−1 ∈ R>n×n . [HINT: If eTi (In + A)n−1 ej = 0, then eTi Ak ej = 0 for every positive integer k.] Theorem 26.4 (Perron-Frobenius). If A ∈ R≥n×n is irreducible, then A = On×n and: (1) rσ (A) ∈ σ(A). (2) There exists a pair of vectors u, v ∈ R>n such that (26.7)

Au = rσ (A)u,

AT v = rσ (A)v,

and

vT u = 1 .

(3) The algebraic multiplicity of rσ (A) as an eigenvalue of A is equal to one. (4) If x is an eigenvector of A with nonnegative entries, then Ax = rσ (A)x. If also Ak ∈ R>n×n for some positive integer k, then the following supplementary assertions are valid: (5) If μ ∈ σ(A) and μ = rσ (A), then |μ| < rσ (A). (6) limk↑∞ rσ (A)−k Ak = uvT .


Proof. The proof is divided into steps. 1. If μ ∈ C and x ∈ C n is a nonzero vector such that Ax = μ x and |μ| = rσ (A), then xi = 0 for i = 1, . . . , n and the vector w with entries wi = |xi | for i = 1, . . . , n is in the nullspace of A − rσ (A)In , i.e., Aw = rσ (A)w. Under the given assumptions, % % % n % n n  % %  % % aij xj % ≤ aij |xj | = aij wi , rσ (A)wi = |μxi | = % % j=1 % j=1 j=1 i.e., Aw − rσ (A)w ∈ R≥n . Therefore, by Lemma 26.3 with P = (In + A)n−1 , Aw = rσ (A)w and w ∈ R>n . 2. Verification of (1) and (2). Step 1, applied first to A and then to AT ensures that there exists a pair of vectors u, v ∈ R>n such that Au = rσ (A)u and AT v = rσ (A)v. Moreover, u and v can be normalized to achieve the condition vT u = 1. Thus, (1) and (2) hold. 3. The geometric multiplicity of rσ (A) as an eigenvalue of A is equal to one. T T   Let x = x1 · · · xn and y = y1 · · · yn be any two nonzero vectors in C n such that Ax = rσ (A)x and Ay = rσ (A)y. Then y1 x − x1 y is also in the null space of the matrix A − rσ (A)In . Thus, by step 2, x1 = 0, y1 = 0, and either y1 x − x1 y = 0 or |y1 xj − x1 yj | > 0 for j = 1, . . . , n. However, the second alternative is clearly impossible, since the first entry in the vector y1 x − x1 y is equal to zero. Thus, x and y are linearly dependent. 4. Verification of (3). Let λ1 , . . . , λn denote the eigenvalues of A repeated in accordance with their algebraic multiplicity, let u, v ∈ R>n satisfy (26.7), and let λ1 = rσ (A), ϕ(λ) = det (λIn − A) ,

and

ψ(λ) = det (λIn − [A − λ1 uvT ]) .

Then, since (λIn − A)−1 λ1 u = (λ − λ1 )−1 λ1 u for λ ∈ σ(A), ψ(λ) = det(λIn − A + λ1 uvT ) = det(λIn − A) det (In + (λIn − A)−1 λ1 uvT ) = ϕ(λ) det (In + (λ − λ1 )−1 λ1 uvT )   ϕ(λ) − ϕ(λ1 ) ϕ(λ) λ1 v T u =λ =λ = ϕ(λ) 1 + λ − λ1 λ − λ1 λ − λ1

for λ ∈ σ(A) .


Thus, as λ1 ϕ (λ1 ) = ψ(λ1 ), λ1 will be a simple root of ϕ(λ) if and only if ψ(λ1 ) = 0. But, if (A − λ1 uvT )x = λ1 x for some vector x ∈ C n , then λ1 vT x = vT (A − λ1 uvT )x = λ1 vT x − λ1 vT x = 0 . Therefore, Ax = λ1 x and hence, in view of step 3, x = αu for some α ∈ C. Thus, α = αvT u = vT x = 0 . But this means that the null space of A − λ1 uvT is equal to zero and, consequently, ψ(λ1 ) = 0, as needed to complete the proof of this step. 5. Verification of (4). If Ax = μx for some nonzero vector x ∈ R≥n , then μx, v = Ax, v = x, AT v = rσ (A)x, v . Therefore, μ = rσ (A) and hence as this eigenvalue has geometric multiplicity equal to one, x = αu for some α > 0. 6. If c1 , . . . , cn ∈ C\{0} and |c1 +· · ·+cn | = |c1 | +· · ·+|cn |, then cj = eiθ |cj | for some θ ∈ [0, 2π) and j = 1, . . . , n. To minimize the bookkeeping, let n = 3. Then, under the given assumptions, |c1 | + |c2 | + |c3 | = |c1 + c2 + c3 | ≤ |c1 | + |c2 + c3 | ≤ |c1 | + |c2 | + |c3 | . Therefore, equality prevails throughout. In particular, |c2 + c3 | = |c2 | + |c3 |, and hence, upon setting c3 /c2 = reiα with 0 ≤ α < 2π and r > 0, |1+reiα | = 1 + r. But this is only possible if 1 + 2r cos α + r2 = 1 + 2r + r2 , which forces α = 0. Thus, c3 = rc2 for some r > 0. Since also |c1 + c2 | = |c1 | + |c2 |, the same reasoning yields the equality c2 = δc1 for some δ > 0. Therefore, cj = γj c1 = γj eiθ |c1 | with γj > 0 for j = 1, 2, 3. Consequently, cj = eiθ |cj | for j = 1, 2, 3 in the case at hand and, in general, for j = 1, . . . , n . 7. Verification of (5). Suppose that Ax = μx for some nonzero vector x ∈ C n and that |μ| = rσ (A) and B = Ak ∈ R>n×n . Then Bx = μk x and |μk | = rσ (B). Thus, if w is the vector with entries wi = |xi | for i = 1, . . . , n, then % % % n % n n   % %  bij xj %% ≤ bij |xj | = bij wj rσ (B)wi = |μ|k |xi | = %% % j=1 % j=1 j=1


for i = 1, . . . , n, i.e., Bw − rσ (B) w ∈ R≥n . Therefore, Bw − rσ (B) w = 0, thanks to Lemma 26.3, and hence w ∈ R>n . Consequently, % % % n % n  % % b1j |xj | = %% b1j xj %% , % j=1 % j=1 which, in view of step 6, implies that xj = eiθ |xj | = eiθ wj for j = 1, . . . , n, i.e., x = eiθ w. Consequently, |μ| = rσ (A) =⇒ μ = rσ (A), or, to put it another way, μ = rσ (A) =⇒ |μ| < rσ (A) =⇒ (5) holds. 8. Verification of (6).

 rσ (A) O1×(n−1) and let u1 (resp,. v1 ) Let A = with J = O(n−1)×1 J1 denote the first column of U (resp., V = (U −1 )T ). Then

 Ak 1 O =U V T = u1 v1T . lim O O k↑∞ rσ (A)k

U JU −1

But, as AU = U J =⇒ Au1 = rσ (A)u1 =⇒ u1 = α u , V T A = JV T =⇒ v1T A = rσ (A)v1T =⇒ v1 = β v , and V T U = In =⇒ v1T u1 = 1 =⇒ α β vT u = 1 =⇒ α β = 1 , it follows that u1 v1T = α β u vT = u vT , as claimed.



Exercise 26.4. Show that if, in the setting of Theorem 26.4, B = A − rσ (A)uvT , then rσ (B) < rσ (A). [HINT: See the verification of step 8 in the proof of Theorem 26.4.] Exercise 26.5. Show by explicit calculation

 that the six assertions of The1 4 orem 26.4 hold for the matrix A = . 1 1 ⎡ ⎤ 0 1 0 Exercise 26.6. Show that the matrix A = ⎣0 0 1⎦ is irreducible and 1 0 0 then check that the first four assertions of Theorem 26.4 hold, but the last two do not.

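The conclusions of Theorem 26.4 for the matrix of Exercise 26.5 are easy to check numerically; the NumPy sketch below computes r_σ(A), the Perron vectors u and v normalized so that v^T u = 1, and the limit in assertion (6).

```python
import numpy as np

# Numerical companion to Exercise 26.5: A = [[1, 4], [1, 1]] has r_sigma(A) = 3,
# Au = 3u and A^T v = 3v with u, v > 0, and r_sigma(A)^{-k} A^k -> u v^T.
A = np.array([[1.0, 4.0], [1.0, 1.0]])
w, V = np.linalg.eig(A)
k = np.argmax(np.abs(w))
r = w[k].real                                   # spectral radius, here 3
u = np.abs(V[:, k].real)                        # right Perron vector (positive)
w2, V2 = np.linalg.eig(A.T)
v = np.abs(V2[:, np.argmax(np.abs(w2))].real)   # left Perron vector (positive)
v = v / (v @ u)                                 # normalize so that v^T u = 1
print(r)                                        # 3.0
print(np.linalg.matrix_power(A, 30) / r ** 30)  # ~ [[0.5, 1.0], [0.25, 0.5]]
print(np.outer(u, v))                           # the limit u v^T, same matrix
```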

Exercise 26.7. Show that if u and v are vectors in R>3 that meet the conditions in (26.7) for the matrix A specified in Exercise 26.6, then 1 lim rσ (A)−k Ak = uvT . n↑∞ n n

k=1

be an irreducible matrix with spectral radius Exercise 26.8. Let A ∈ Rn×n ≥ rσ (A) = 1 and let B = A − rσ (A)uvT , where u, v ∈ R>n meet the conditions in (26.7). Show that: (a) σ(B) ⊂ σ(A) ∪ {0}, but 1 ∈ σ(B).  k (b) limN →∞ N1 N k=1 B = 0. (c) B k = Ak − uvT for k = 1, 2, . . ..  k T (d) limN →∞ N1 N k=1 A = uv . [HINT: If rσ (B) < 1, then it is readily checked that B k → O as k → ∞. However, if rσ (B) = 1, then B may have complex eigenvalues of the careful analysis is required that exploits the fact that form eiθ and a more ikθ = 0 if eiθ = 1.] e limN →∞ N1 N k=1 (3)

Exercise 26.9. Show that if A = Cμ , then there does not exist a pair of vectors u and v in R3 that meet the three conditions in (26.7). Lemma 26.5. If A ∈ R≥n×n is irreducible, B ∈ R≥n×n , and A − B ∈ R≥n×n , then: (1) rσ (A) ≥ rσ (B). (2) rσ (A) = rσ (B) ⇐⇒ A = B. Proof. Let β ∈ σ(B) with |β| = rσ (B), let By = βy for some nonzero vector y ∈ C n , and let w ∈ R n be the vector with components wi = |yi | for i = 1, . . . , n. Then % % % % n  % % % rσ (B)wi = |β||yi | = % bij yj %% % % j=1 (26.8) n n   bij |yj | ≤ aij wj for i = 1, . . . , n . ≤ j=1

j=1

By Theorem 26.4, there exists a vector v ∈ R>n such that AT v = rσ (A)v. Consequently rσ (B)w, v ≤ Aw, v = w, AT v = rσ (A)w, v . Therefore, since w, v > 0, (1) holds.


Suppose next that rσ (A) = rσ (B). Then the inequality (26.8) implies that Aw − rσ (A)w ≥ 0 and hence, by Lemma 26.3 with P = (In + A)n−1 , that Aw − rσ (A)w = 0 and w > 0. But this in turn implies that n  (aij − bij )wj = 0 for i = 1, . . . , n j=1

and thus, as aij −bij ≥ 0 and wj > 0, we must have aij = bij for every choice of i, j ∈ {1, . . . , n}, i.e., rσ (A) = rσ (B) =⇒ A = B. The other direction is self-evident.   T Exercise 26.10. Show that if e = 1 1 · · · 1 and A ∈ R≥n×n is irreducible then: (a) C = {x ∈ R≥n : x, e = 1} is a closed convex set. (b) The function f (x) = (Ax, e)−1 Ax has a fixed point in C. (c) If f (x) = x for some x ∈ C, then x ∈ R>n and rσ (A) = Ax, e. Exercise 26.11. Show that if A ∈ R≥n×n is irreducible, then ⎧ ⎫ ⎧ ⎫ n n ⎨ ⎬ ⎨ ⎬ min aij ≤ rσ (A) ≤ max aij . i ⎩ i ⎩ ⎭ ⎭ j=1

j=1

 j, Exercise 26.12. Show that if A ∈ R n×n with entries aij ≥ 0 when i = then etA ∈ R≥n×n for every t ≥ 0. [HINT: There exists a δ > 0 such that A + δIn ∈ R≥n×n .] Exercise 26.13. Show that the implication in Exercise 26.12 is really an equivalence, i.e., if A ∈ R n×n and etA ∈ R≥n×n for every t ≥ 0, then aij ≥ 0 when i = j.

26.3. Supplementary notes

The idea for a warm-up theorem came from the paper by Ninio [63]. The treatment of the Perron-Frobenius theorem here differs considerably from the treatment in [30]. It contains more information and the exposition is simpler; it is partially adapted from an internet source that unfortunately I cannot recover and so cannot reference. Exercises 26.12 and 26.13 are adapted from an article by Glück [40].

Chapter 27

Applications of matrices with nonnegative entries

In this chapter a number of applications of matrices with nonnegative entries are considered.

27.1. Stochastic matrices A matrix A ∈ R≥p×q with entries aij , i = 1, . . . , p, j = 1, . . . , q, is said to be a stochastic matrix if (27.1)

q 

aij = 1

for i = 1, . . . , p .

j=1

 Lemma 27.1. If A ∈ R≥n×n is a stochastic matrix and a = 1 1 · · · then (27.2)

Aa = a

and

1

T

,

rσ (A) = 1 .

Proof. The first assertion in (27.2) is immediate from the definition of a stochastic matrix. To verify the second, assume that μx = Ax for some nonzero vector x ∈ C n and let δ = max{|xi | : i = 1, . . . , n}. Then, since % % % n % n n  % %  aij xj %% ≤ aij |xj | ≤ aij δ = δ for i = 1, . . . , n , |μxi | = %% % j=1 % j=1 j=1 291


it follows that |μ|δ ≤ δ and hence that |μ| ≤ 1 for every μ ∈ σ(A). Thus, as  1 ∈ σ(A), rσ (A) = 1. Exercise 27.1. Show that A ∈ R n×n is a stochastic matrix if and only if Ak is a stochastic matrix for every positive integer k. Exercise 27.2. Show that if A ∈ R n×n is an irreducible stochastic matrix n with entries aij for i, j = 1, . . . , n, then there exists na positive vector u ∈ R with entries ui for i = 1, . . . , n such that uj = i=1 ui aij for j = 1, . . . , n. [HINT: Exploit Theorem 26.4.] ⎡ ⎤ 1/2 0 1/2 Exercise 27.3. Show that the matrix A = ⎣1/4 1/2 1/4⎦ is an irre1/8 3/8 1/2 ducible stochastic matrix and find a positive vector u ∈ R 3 that meets the conditions discussed in Exercise 27.2.
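One way to explore Exercises 27.2 and 27.3 numerically is to iterate x ↦ xA; the sketch below approximates the positive left eigenvector of the matrix in Exercise 27.3 (the number of iterations is an arbitrary choice of mine).

```python
import numpy as np

# Power iteration on the left for the stochastic matrix of Exercise 27.3:
# the limit is a positive vector u with u_j = sum_i u_i a_ij, i.e., uA = u.
A = np.array([[1/2, 0,   1/2],
              [1/4, 1/2, 1/4],
              [1/8, 3/8, 1/2]])
print(A.sum(axis=1))             # each row sums to 1, so A is stochastic
x = np.array([1.0, 0.0, 0.0])
for _ in range(200):
    x = x @ A
x = x / x.sum()
print(x)                         # a positive vector
print(np.allclose(x @ A, x))     # True
```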

27.2. Behind Google

In a library of n documents, the Google search engine associates a vector in R^n to each document. The entries of each such vector are based on a weighted average of the number of links of this document to other documents with overlapping sets of keywords; the vectors are normalized so that their entries sum to one. Thus, if G = [g_1 · · · g_n] is the array of the n vectors corresponding to the n documents, then G^T is a stochastic matrix. Let

a^T = [1  · · ·  1] ,   A = (1/n) aa^T ,   and   B = tG + (1 − t)A

for some fixed choice of t ∈ (0, 1); B is the Google matrix.

Theorem 27.2. If 0 < t < 1 and the matrix B = tG + (1 − t)A has k distinct eigenvalues λ_1, . . . , λ_k with |λ_1| ≥ · · · ≥ |λ_k|, then

(27.3)    λ_1 = 1   and   |λ_j| ≤ t for j = 2, . . . , k .

Moreover, if B = UJU^{−1}, U = [u_1 · · · u_n], J is in Jordan form, and Bu_1 = u_1, then

(27.4)    lim_{m↑∞} ‖B^m (Σ_{j=1}^{n} c_j u_j) − c_1 u_1‖ = 0

for every choice of c1 , . . . , cn ∈ R. Proof. Since B T is a stochastic matrix, B T a = a, and rσ (B T ) = 1. Thus, as σ(B T ) = σ(B), 1 ∈ σ(B) and rσ (B) = 1. Moreover, as B ∈ R>n×n , Theorem 26.4 ensures that if λ ∈ σ(B) and λ = 1, then |λ| < 1. Therefore,  1 O1×n−1 . λ1 = 1 and |λj | < 1 for j = 2, . . . , k. Consequently, J = On−1×1 J


    (3) Suppose next, for example, that B u2 u3 u4 = u2 u3 u4 Cλ2 . Then u2 , a = u2 , B T a = Bu2 , a = λ2 u2 , a =⇒ u2 , a = 0 , u3 , a = u3 , B T a = Bu3 , a = λ2 u3 , a + u2 , a = λ2 u3 , a =⇒ u3 , a = 0 , u4 , a = u4 , B T a = Bu4 , a = λ2 u4 , a + u3 , a = λ2 u4 , a =⇒ u4 , a = 0 . The same argument is applicable to each of the Jordan chains associated with u2 , . . . , un . Thus, uj , a = 0 for j = 2, . . . , n  and hence A u2 · · · un = On,n−1 and        tG u2 · · · un = B u2 · · · un = u1 · · · un J e2 · · ·   = u2 · · · un J  . 

en



Consequently, λ2 , . . . , λk ∈ σ(tG) and thus, as rσ (tG) = t rσ (G) = t, |λj | ≤ t for j = 1, . . . , k. This completes the proof of (27.3). Next, as GT is a stochastic matrix, nAGm uj = Gm uj , a a = uj , (GT )m a a = uj , a a = 0 for j = 2, . . . , n. Therefore, n

B m uj = tm Gm uj

for j = 2, . . . , n .

then, as Bu1 = u1 , 5 5 ⎛ ⎞5 5 5 5 5 5 n n   5 5 5 5 m m m 5 ⎝ cj uj − c1 u1 ⎠5 cj uj 5 B x − c1 u1  = 5 5B 5 = 5B 5 5 5 5 5 j=1 j=2 5 5 5 5 5 n 5 5 5 n  5 5 5 5 m5 m m m 5 5 cj uj 5 ≤ t G  5 cj uj 5 = t 5G 5, 5 j=2 5 5 5 j=2

Thus, if x =

j=1 cj uj ,

which tends to 0 as m ↑ ∞, since limm↑∞ Gm 1/m = rσ (G) and rσ (G) =  rσ (GT ) = 1, because GT is a stochastic matrix. The ranking of documents is based on the entries in a good approximation to u1 (normalized so that u1 1 = 1) that is obtained by computing B m x for large enough m. The numbers involved are reportedly on the order of n = 25,000,000,000, k = 100 with t = .85. Presumably, these computations are tractable because most of the entries in the vectors gj are equal to zero.
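A toy version of the computation described above: the 4 × 4 link matrix G below is invented purely for illustration and is not data from the text, while t = 0.85 matches the reported value.

```python
import numpy as np

# A tiny PageRank-style iteration: B = tG + (1-t)A, and B^m x approaches the
# Perron vector u1 of B, which supplies the ranking.
n = 4
G = np.array([[0.0, 0.5, 0.5, 0.0],
              [0.0, 0.0, 1.0, 0.0],
              [1/3, 1/3, 0.0, 1/3],
              [0.0, 0.0, 1.0, 0.0]]).T    # columns g_j sum to 1, so G^T is stochastic
t = 0.85
A = np.ones((n, n)) / n
B = t * G + (1 - t) * A
x = np.ones(n) / n
for _ in range(200):
    x = B @ x
    x = x / np.abs(x).sum()               # keep the entries summing to 1
print(x)                                  # the (approximate) ranking vector u1
print(np.allclose(B @ x, x))              # True: Bx = x, since r_sigma(B) = 1
```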


27.3. Leslie matrices

Matrices of the form

A = [f_1  f_2  f_3  · · ·  f_{n−1}  f_n ; s_1  0  0  · · ·  0  0 ; 0  s_2  0  · · ·  0  0 ; . . . ; 0  0  0  · · ·  s_{n−1}  0]

in which fj ≥ 0 for j = 1, . . . , n and 0 ≤ sj ≤ 1 for j = 1, . . . , n − 1 serve to model population growth for many different species; they are called Leslie matrices. In order to simplify the exposition we shall restrict our attention to the case in which fj > 0 for j = 1, . . . , n, 0 < sj ≤ 1 for j = 1, . . . , n − 1, and n ≥ 2. Then An ∈ R>n×n and hence (all six assertions of) Theorem 26.4 are applicable. To get some feeling that if n = 4 and ⎡ 0 0 ⎢0 0 Z4 = ⎢ ⎣0 1 1 0 and

for the properties of this class of matrices observe 0 1 0 0

⎤ 1 0⎥ ⎥, 0⎦ 0

then Z4 = Z4T , Z4 Z4T = I4 ,

⎤ ⎡ ⎤ 0 s3 0 0 f1 f2 f3 f4 ⎢ 0 0 s2 0 ⎥ ⎢s1 0 0 0 ⎥ ⎥ ⎢ ⎥ Z4 ⎢ ⎣ 0 s2 0 0 ⎦ Z4 = ⎣ 0 0 0 s1 ⎦ , f4 f3 f2 f1 0 0 s3 0 ⎡

which has the same structure as the companion matrices that we studied earlier. In particular, if s1 = s2 = s3 = 1, then (27.5) det (λI4 − A) = det (λI4 − Z4 AZ4 ) = −f4 − f3 λ − f2 λ2 − f1 λ3 + λ4 . Theorem 26.4 ensures that rσ (A) is an eigenvalue of A. Since fj > 0 for j = 1, . . . , 4, this polynomial has exactly one positive root. Consequently, this root must be equal to rσ (A). Analogously, the spectral radius of an n × n Leslie matrix is equal to the positive root of the polynomial det (λIn − A) = λn − f1 λn−1 − s1 f2 λn−2 − s2 s1 f3 λn−3 − · · · − sn−1 · · · s1 fn . This polynomial is of the form g(x) = x − n

n−1  j=0

aj xj

with

aj > 0 for j = 0, . . . , n − 1 .


 j Thus, g(x) = 0 if and only if xn = n−1 j=0 aj x , which is very special in that both the left and right sides of the last equality are polynomials with positive coefficients (and hence both sides are convex functions of x on [0, ∞)).  Moreover, as g(0) = −a0 < 0 and g(b) > 0 if b > max{ n−1 j=0 aj , 1}, there n−1 n j must exist a point α ∈ (0, b) at which g(α) = 0, i.e., α = j=0 aj α . Therefore, αg  (α) = nαn −

n−1 

jaj αj = na0 +

j=1

n−1 

(n − j)aj αj > 0 .

j=1

Consequently, as g(α) = 0 and g  (α) = 0, this root may be computed by Newton’s method. Example 27.1. ⎡ 1 1 A = ⎣1 0 0 1

Let ⎤ 3 0⎦ . Then f (x) = det (xI3 − A) = x3 − x2 − x − 3 , 0

and, since f (0) = −3, f (1) = −4, f (2) = −1, and f (3) = 12, there exists a point μ with 2 < μ < 3 at which f (μ) = 0. Moreover, since f  (x) = 3x2 − 2x − 1 and μ3 = μ2 + μ + 3, μf  (μ) = 3μ3 − 2μ2 − μ = 3(μ2 + μ + 3) − 2μ2 − μ > 0 , as it should be in view of the preceding analysis. Thus, as f (μ) = 0 and f  (μ) = 0, the sequence of points xk = N (xk−1 ), k = 1, 2, . . ., in the Newton recursion will converge to μ if x0 is chosen close enough to μ. Since 2 < μ < 3, it is reasonable to choose x0 = 2 and to hope that this is close enough to μ to ensure that the sequence converges. Since N (x) = x − N (2) = 15/7 ≈ 2.14 , there are grounds for optimism.

2x3 − x2 + 3 f (x) = , f  (x) 3x2 − 2x − 1 and

N (15/7) =

6204 ≈ 2.13 , 2912 
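Newton’s method applied to the characteristic polynomial of Example 27.1, together with a direct eigenvalue computation of r_σ(A) for comparison (a sketch assuming NumPy):

```python
import numpy as np

# Newton's recursion for f(x) = x^3 - x^2 - x - 3 started at x0 = 2, as
# suggested in Example 27.1, and a direct computation of r_sigma(A).
f = lambda x: x ** 3 - x ** 2 - x - 3.0
df = lambda x: 3.0 * x ** 2 - 2.0 * x - 1.0
x = 2.0
for _ in range(6):
    x = x - f(x) / df(x)
    print(x)                                # 2.142857..., 2.130494..., -> ~2.1304
A = np.array([[1.0, 1.0, 3.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
print(max(abs(np.linalg.eigvals(A))))       # the same root, r_sigma(A) ~ 2.1304
```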

27.4. Minimum matrices The entries of the minimum matrix Amin ∈ R n×n are aij = min {i, j}. Thus, for example, if n = 4, then ⎡ ⎤ 1 1 1 1 ⎢1 2 2 2⎥ ⎥ Amin = ⎢ ⎣1 2 3 3⎦ , 1 2 3 4


which can be ⎡ 1 ⎢1 Amin = ⎢ ⎣1 1

expressed as ⎤ ⎡ 1 1 1 0 ⎢0 1 1 1⎥ ⎥+⎢ 1 1 1⎦ ⎣0 1 1 1 0

0 1 1 1

0 1 1 1

⎤ ⎡ 0 0 ⎢0 1⎥ ⎥+⎢ 1⎦ ⎣0 1 0

0 0 0 0

⎤ ⎡ 0 0 ⎢0 0⎥ ⎥+⎢ 1⎦ ⎣0 1 0

0 0 1 1

0 0 0 0

0 0 0 0

⎤ 0 0⎥ ⎥ 0⎦ 1

= a1 aT1 + a2 aT2 + a3 aT3 + a4 aT4 = L LT , where



 L = a1 a2 a3

1  ⎢1 a4 = ⎢ ⎣1 1

0 1 1 1

0 0 1 1

⎤ 0 0⎥ ⎥ 0⎦ 1



and L−1

1 0 0 ⎢−1 1 0 =⎢ ⎣ 0 −1 1 0 0 −1

⎤ 0 0⎥ ⎥ 0⎦ 1

(see (3.11) for L−1 ). Therefore, Amin is positive definite and ⎡ ⎤ 2 −1 0 0 ⎢−1 2 −1 0 ⎥ ⎥ (Amin )−1 = (LT )−1 L−1 = ⎢ ⎣ 0 −1 2 −1⎦ . 0 0 −1 1 If λ is an eigenvalue of (Amin )−1 , then λ > 0 and, as Amin ∈ R n×n and 4 Amin = AH min , there exists a nonzero vector x ∈ R such that ⎡ ⎤ ⎡ ⎤⎡ ⎤ x1 2 −1 0 0 x1 ⎢x2 ⎥ ⎢−1 2 −1 0 ⎥ ⎢x2 ⎥ ⎢ ⎥ ⎢ ⎥⎢ ⎥ ⎣ 0 −1 2 −1⎦ ⎣x3 ⎦ = λ ⎣x3 ⎦ x4 x4 0 0 −1 1 and hence that −xj−1 + 2xj − xj+1 = λxj

for j = 1, . . . , 4 with x0 = 0 and x5 = x4 .

Let δ = max{|x1 |, . . . , |x4 |}. Then, (2 − λ)xj = xj−1 + xj+1 =⇒ |2 − λ| |xj | = |xj−1 + xj+1 | ≤ 2δ =⇒ |2 − λ| δ ≤ 2δ =⇒ |2 − λ| ≤ 2 . (This is a special case of Gerˇsgorin’s theorem that will be discussed in Chapter 36.) Therefore, 2 − λ = 2 cos θ for exactly one choice of θ ∈ [0, π] and the general solution of an equation of the form −xj−1 + 2 cos θxj − xj+1 = 0 is of the form xj = αω1j + βω2j , where ω1 , ω2 are the roots of 1 − 2 cos θ ω + ω 2 = 0, i.e., √ 2 cos θ ± 4 cos2 θ − 4 = cos θ ± i sin θ . ω= 2 Consequently, xk = αeikθ + βe−ikθ and the condition x0 = 0 implies that α = −β and hence that we may choose xk = sin kθ for k = 0, . . . , 4. This


in turn imposes the restriction that 0 < θ < π, and so in particular that eiθ − 1 = 0. It remains only to choose θ so that x5 = x4 . But sin(n + 1)θ = sin nθ ⇐⇒ ei(n+1)θ − einθ = e−i(n+1)θ − e−inθ ⇐⇒ einθ (eiθ − 1) = e−i(n+1)θ (1 − eiθ ) ⇐⇒ ei(2n+1)θ = −1 . Consequently, if n = 4, the permissible values of θ are θj = (2j + 1)π/9 for j = 0, . . . , 3. Correspondingly, the eigenvalues of (Amin )−1 are λj = 2(1 − cos θj ) for j = 0, . . . , 3 and hence rσ (Amin ) = {2(1 − cos(π/9))}−1 and  T Amin x = rσ (Amin )x for x = sin(π/9), sin(3π/9), sin(5π/9), sin(7π/9) . Notice that x ∈ R>4 , in keeping with the Perron-Frobenius theorem.
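The eigenvalue formula derived above for the 4 × 4 minimum matrix is easy to confirm numerically; a short sketch:

```python
import numpy as np

# Eigenvalues of (A_min)^{-1} for n = 4: 2(1 - cos theta_j), theta_j = (2j+1)pi/9,
# and hence r_sigma(A_min) = 1/(2(1 - cos(pi/9))).
n = 4
Amin = np.fromfunction(lambda i, j: np.minimum(i, j) + 1.0, (n, n))
theta = (2 * np.arange(n) + 1) * np.pi / (2 * n + 1)
print(np.sort(np.linalg.eigvalsh(np.linalg.inv(Amin))))
print(np.sort(2 * (1 - np.cos(theta))))                    # the same four numbers
print(np.linalg.eigvalsh(Amin).max(),
      1 / (2 * (1 - np.cos(np.pi / 9))))                   # r_sigma(A_min), twice
```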

27.5. Doubly stochastic matrices A matrix A ∈ R≥n×n is said to be a doubly stochastic matrix if both A and AT are stochastic matrices, i.e., if n n   aij = 1 for i = 1, . . . , n and aij = 1 for j = 1, . . . , n . (27.6) j=1

i=1

Exercise 27.4. Find the eigenvalues of the (doubly) stochastic matrix A = e1 e3 e2 e4 based on the columns ej , j = 1, . . . , 4, of I4 . Exercise 27.5. Show that the set of doubly stochastic n × n matrices is a convex set and that a matrix A ∈ R n×n is an extreme point of this set if and only if A is a permutation matrix. [HINT: 0 ≤ aij ≤ 1.] Theorem 27.3 (Birkhoff-von Neumann). A matrix A ∈ R n×n is doubly stochastic if and only if it is a convex combination of permutation matrices. Proof. The set of n × n doubly stochastic matrices is a closed bounded convex subset of R n×n . The extreme points of this set are the n × n permutation matrices. Thus, the fact that every doubly stochastic matrix is a convex combination of permutation matrices follows from the Krein-Milman theorem. The converse implication is self-evident.  Example 27.2. To illustrate ⎡ 3 5 1⎣ 4 4 A= 9 2 0

Theorem 27.3, let ⎤ ⎡ ⎤ 1 0 1 0 1⎦ and P1 = ⎣1 0 0⎦ . 7 0 0 1

Then A is doubly stochastic and ⎡ ⎤   3 1 1 4 1 A1 = (1 − 4/9)−1 A − P1 = ⎣0 4 1⎦ 9 5 2 0 3


is a doubly stochastic matrix with one more 0 than the matrix A. Similarly, ⎡ ⎤⎞ ⎡ ⎤ ⎛ 0 0 1 3 1 0 1 1 A2 = (1 − 1/5)−1 ⎝A1 − ⎣0 1 0⎦⎠ = ⎣0 3 1⎦ 5 4 1 0 0 1 0 3 is a doubly stochastic matrix with one more zero than A1 . The procedure  terminates at the k’th step if Ak is a permutation matrix. Exercise 27.6. Show that the doubly stochastic matrix A considered in Example 27.2 is a convex combination of permutation matrices. Exercise 27.7. Show that if A ∈ R n×n is a doubly stochastic matrix that is not a permutation matrix, σ is a permutation of the integers {1, . . . , n} such that t1 = min{a1σ(1) , . . . , anσ(n) } belongs to the open interval (0, 1), and P1 is the permutation matrix with 1’s in the iσ(i) position for i = 1, . . . , n, then A1 = (1 − t1 )−1 (A − t1 P1 ) is a doubly stochastic matrix with at least one more zero entry than A and A = t1 P1 + (1 − t1 )A1 . [REMARK: If A1 is not a permutation matrix, then the preceding argument can be repeated.] Exercise 27.8. Show that the representation of a doubly stochastic matrix A ∈ R n×n as a convex combination of permutation matrices is unique if n = 1, 2, 3, but not if n ≥ 4. [HINT: If n ≥ 4, then n! > n2 .]
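The peeling procedure of Example 27.2 and Exercise 27.7 can be sketched in a few lines; the greedy choice of permutation below is my own shortcut and is adequate for this small example, though it is not the book’s argument.

```python
import numpy as np
from itertools import permutations

# Greedy Birkhoff decomposition of the doubly stochastic matrix of Example 27.2:
# repeatedly subtract t*P for a permutation P supported where A is positive.
A = np.array([[3, 5, 1], [4, 4, 1], [2, 0, 7]]) / 9.0
decomposition = []
while not np.allclose(A, 0):
    sigma = next(p for p in permutations(range(3))
                 if all(A[i, p[i]] > 1e-12 for i in range(3)))
    t = min(A[i, sigma[i]] for i in range(3))
    P = np.zeros((3, 3))
    for i in range(3):
        P[i, sigma[i]] = 1.0
    decomposition.append((t, P))
    A = A - t * P
print(sum(t for t, P in decomposition))       # the weights sum to 1
print(sum(t * P for t, P in decomposition))   # reconstructs the original matrix
```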

27.6. Inequalities of Ky Fan and von Neuman In this section we shall use the Birkhoff-von Neumann theorem (Theorem 27.3) and the Hardy-Littlewood-Polya rearrangement lemma (Lemma 27.4) to obtain bounds on the trace of the product AB of a pair of matrices in terms of the eigenvalues and the singular values of A and of B. Lemma 27.4. Let a and b be vectors in R n with entries a1 ≥ a2 ≥ · · · ≥ an and b1 ≥ b2 ≥ · · · ≥ bn , respectively. Then aT P b ≤ aT b

(27.7)

for every n × n permutation matrix P .  Proof. Let P = nj=1 ej eTσ(j) for some one-to-one mapping σ of the integers {1, . . . , n} onto themselves, and suppose that P = In . Then aT P b =  n j=1 aj bσ(j) and there exists a smallest positive integer k such that σ(k) = k, i.e., n n   aj bσ(j) = a1 b1 + · · · + ak−1 bk−1 + aj bσ(j) . j=1

j=k

Thus, σ(k) > k and k = σ(), for some integer  > k . Therefore, (ak − a )(bσ() − bσ(k) ) = (ak − a )(bk − bσ(k) ) ≥ 0 ,


and hence ak bσ(k) + a bσ() ≤ ak bσ() + a bσ(k) = ak bk + a bσ(k) . In the same way, one can rearrange the remaining terms to obtain the inequality (27.7).  Example 27.3. Suppose that in the setting of Lemma 27.4, n = 5, σ(1) = 1, σ(2) = 4, σ(3) = 5, σ(4) = 3, and σ(5) = 2. Then aT Pσ b = a1 b1 + a2 b4 + a3 b5 + a4 b3 + a5 b2 ≤ a1 b1 + a2 b2 + a3 b5 + a4 b3 + a5 b4 , since (a2 b2 + a5 b4 ) − (a2 b4 + a5 b2 ) = (a2 − a5 )(b2 − b4 ) ≥ 0 . Thus, the sum is “increased” by performing the interchange that pairs b2 with a2 . Next, to pair b3 with a3 , we consider the terms in the sum in which a3 and b3 appear and observe that a3 b5 + a4 b3 ≤ a3 b3 + a4 b5

(since (a3 − a4 )(b3 − b5 ) ≥ 0) .

Finally, to pair b4 with a4 , we consider the terms in which a4 and b4 appear and observe that a4 b5 + a5 b4 ≤ a4 b4 + a5 b5 . Exercise 27.9. Show by explicit calculation that if, in the setting of Lemma 27.4, n = 4, σ(1) = 1, σ(2) = 4, σ(3) = 2, and σ(4) = 3, then 4 

aj bσ(j) = a1 b1 + a2 b4 + a3 b2 + a4 b3 ≤ a1 b1 + a2 b2 + a3 b4 + a4 b3 .

j=1

[HINT: The inequality (a2 − a3 )(b2 − b4 ) ≥ 0 is the key.] Lemma 27.5. If U ∈ C n×n is a unitary matrix with entries uij , then the matrix W ∈ R n×n with entries wij = |uij |2 is doubly stochastic. Proof. Let vij denote the ij entry of U H . Then wij = uij uij = uij vji for i, j = 1, . . . , n. Therefore, in self-evident notation, n 

wij =

j=1

n 

uij vji = (U U H )ii = (In )ii = 1

for i = 1, . . . , n

uij vji = (U H U )jj = (In )jj = 1

for j = 1, . . . , n ,

j=1

and n 

wij =

i=1

as claimed.

n  i=1




Doubly stochastic matrices of the special form considered in Lemma 27.5 are often referred to as orthostochastic matrices. Not every doubly stochastic matrix is an orthostochastic matrix; see, e.g., Exercise 27.10. ⎡ ⎤ 0 3 3 Exercise 27.10. Show that the doubly stochastic matrix A = 16 ⎣3 1 2⎦ 3 2 1 is not an orthostochastic matrix. [HINT: Show that a matrix U with |uij | = √ aij is not unitary.] Theorem 27.6 (Ky Fan). Let A, B ∈ C n×n be Hermitian matrices with eigenvalues μ1 ≥ μ2 ≥ · · · ≥ μn

and

ν1 ≥ ν2 ≥ · · · ≥ νn ,

respectively (counting algebraic multiplicities). Then (27.8)

μ1 νn + · · · + μn ν1 ≤ trace AB ≤ μ1 ν1 + · · · + μn νn .

Proof. Under the given assumptions there exists a pair of n × n unitary matrices U and V such that A = U DA U H

and

B = V DB V H ,

where DA = diag{μ1 , . . . , μn } and

DB = diag{ν1 , . . . , νn } .

Thus trace AB = trace{U DA U H V DB V H } = trace{DA XDB X H } n n   = μi |xij |2 νj = μi wij νj , i,j=1

i,j=1

where xij denotes the ij entry of the unitary matrix X = U H V and wij = |xij |2 , i, j = 1, . . . , n, is the ij entry of the doubly stochastic matrix W ∈ R n×n . Consequently, by the Birkhoff-von Neumann theorem, W =

k 

t j Pj

with tj > 0 and

j=1

k 

tj = 1

j=1

is a convex combination of permutation matrices Pj . Thus, upon setting uT = [μ1 , . . . , μn ] and vT = [ν1 , . . . , νn ] and invoking Lemma 27.4, it is readily seen that T

trace AB = u W v =

k  j=1

t j u Pj v ≤ T

k  j=1

tj uT v = uT v ,


which completes the proof of the upper bound. To obtain the lower bound, apply the upper bound to the pair A and −B.  Remark 27.7. The upper (resp., lower) bound in (27.8) is achieved if and only if there exists a unitary matrix that diagonalizes both A and B and preserves the order of the eigenvalues of both matrices (resp., reverses the order of the eigenvalues of one of the matrices): If the upper bound in (27.8) is achieved for a pair of Hermitian matrices A, B ∈ C n×n with eigenvalues μ1 ≥ · · · ≥ μn and ν1 ≥ · · · ≥ νn , respectively, and U (t) is a family of unitary matrices for t ∈ R with U (0) = In , then σ(U (t)BU (t)H ) = σ(B) and the function f (t) = trace{AU (t)BU (t)H } achieves its maximum value at t = 0. Thus, as U (t) = etK is unitary if t ∈ R, K ∈ Cn×n , and K H = −K, and f  (t) = trace{AU  (t)BU (t)H } + trace{AU (t)BU  (t)H } , we see that f  (0) = trace{AKB − ABK} = trace{K(BA − AB)} = 0 for every skew Hermitian matrix K ∈ C n×n . Since C = BA − AB = −C H , we can choose K = C H to obtain trace C H C = 0 and hence that C = O, i.e., AB = BA. But this in turn implies that there exists a single unitary matrix W that serves to diagonalize both A and B, i.e., A =W ΔA W H and B = W ΔB W H . Consequently, trace AB = trace ΔA ΔB = ni=1 μi νσ(i) for some permutation σ. In view of Lemma 27.4, this sum will only equal the maximum if νσ(i) = νi . Exercise 27.11. Let A, B ∈ C n×n be normal matrices with eigenvalues (counting algebraic multiplicities) μ1 , . . . , μn and ν1 , . . . , νn , respectively, that are indexed so that |μ1 | ≥ · · · ≥ |μn | and |ν1 | ≥ · · · ≥ |νn |. Show that | trace AB| ≤ |μ1 | |ν1 | + · · · + |μn | |νn |. [HINT: The proof of Theorem 27.6 is a helpful guide.] Exercise 27.12. Show that if A, B ∈ C n×n with singular values s1 ≥ s2 ≥ · · · ≥ sn and t1 ≥ t2 ≥ · · · ≥ tn , respectively, then ⎧ ⎫1/2 ⎧ ⎫1/2 n n ⎨ ⎬ ⎨ ⎬ s2j t2j | trace B H A | ≤ ⎩ ⎭ ⎩ ⎭ j=1

j=1

and then identify this inequality as the Cauchy-Schwarz inequality in an appropriately defined inner product space.
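Both the Ky Fan bounds (27.8) and the von Neumann bound (27.9) are easy to test on randomly generated matrices; the sizes and the random seed in the sketch below are arbitrary choices of mine.

```python
import numpy as np

# Random-matrix check of the Ky Fan bounds (27.8) and the bound (27.9).
rng = np.random.default_rng(1)
n = 5
A = rng.standard_normal((n, n)); A = (A + A.T) / 2   # real symmetric (Hermitian)
B = rng.standard_normal((n, n)); B = (B + B.T) / 2
mu = np.sort(np.linalg.eigvalsh(A))[::-1]            # mu_1 >= ... >= mu_n
nu = np.sort(np.linalg.eigvalsh(B))[::-1]
tr = np.trace(A @ B)
print(mu @ nu[::-1] - 1e-10 <= tr <= mu @ nu + 1e-10)   # True: Ky Fan bounds

C = rng.standard_normal((4, 6))
D = rng.standard_normal((6, 4))
s = np.linalg.svd(C, compute_uv=False)               # singular values of C
t = np.linalg.svd(D, compute_uv=False)               # singular values of D
print(abs(np.trace(C @ D)) <= s @ t + 1e-10)         # True: inequality (27.9)
```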


Theorem 27.8 (von Neumann). If A ∈ C p×q and B ∈ C q×p with singular values s1 ≥ s2 ≥ · · · ≥ sq and t1 ≥ t2 ≥ · · · ≥ tp , respectively, then (27.9)

| trace AB| ≤

m 

si ti ,

where m = min {rank A, rank B} .

i=1

Proof. The singular value decompositions A = V1 S1 U1H and B = X1 T1 Y1H , wherein V1 ∈ C p×r1 , U1 ∈ C q×r1 , X1 ∈ C q×r2 , and Y1 ∈ C p×r2 are isometric matrices, r1 = rank A, and r2 = rank B, ensure that | trace AB| = | trace V1 S1 U1H X1 T1 Y1H | = | trace S1 U1H X1 T1 Y1H V1 | % % % r1 r2 % %  % si mij tj nji %% = | trace S1 M1 T1 N1 | = %% % i=1 j=1 % 1  2 1 si tj [|mij |2 + |nji |2 ] , 2

r



r

i=1 j=1

where mij is the ij entry of the r1 × r2 matrix M1 = U1H X1 and nji is the ji entry of the r2 × r1 matrix N1 = Y1H V1 . The matrices M1 and N1 are not unitary matrices, unless p = q = r1 = r2 . However, they can be embedded in unitary matrices. Thus, for example, if U2 and X2 are chosen so that U = U1 U2 and X = X1 X2 are both q ×q unitary matrices, then M1 is the upper left r1 ×r2 corner of the unitary matrix

H   U M = 1H X1 X2 ∈ C q×q with entries mij for i, j = 1, . . . , q . U2 2 Consequently, the matrix W ∈ C q×q with wij = |m ij | for i, j = 1, . . . , q is doubly stochastic and hence, upon expressing W = rk=1 μk Pk as a convex combination of permutation matrices and setting   T T a = s1 · · · sr1 0 · · · 0 and b = t1 · · · tr2 0 · · · 0 ,

both in R q with entries a1 ≥ · · · ≥ aq and b1 ≥ · · · ≥ bq , it is readily seen that q r2 r1  r    2 si tj |mij | = ai bj wij = W b, a = μk Pk b, a i=1 j=1

i,j=1



r  k=1

k=1

μk b, a = b, a = r 1 r 2

r 1 ∧r2

si ti ,

i=1

where r1 ∧ r2 = min{r1 , r2 }. Since i=1 j=1 si tj |nji |2 is subject to the same bound, this serves to complete the proof. 


Exercise 27.13. Show that the set of p × q stochastic matrices is a convex set with q p extreme points.

27.7. Supplementary notes

The discussion of minimum matrices is partially adapted from the article [11] by R. Bhatia. Notice that A_min ≻ O and its inverse is restricted to a band. Thus, if

A(a, b, c) = [1  1  a  b ; 1  2  2  c ; a  2  3  3 ; b  c  3  4]

is positive definite, then, as will follow from Theorem 38.3, det A(a, b, c) ≤ det A_min. The Birkhoff-von Neumann theorem can be proved directly by relatively elementary arguments instead of referring to the Krein-Milman theorem, much as in Example 27.2. The missing ingredient is a proof that if A ∈ R^{n×n} is a doubly stochastic matrix that is not a permutation matrix, then there exists a permutation σ such that a_{jσ(j)} > 0 for j = 1, . . . , n. This is usually justified by invoking the theory of permanents; see, e.g., Chapter 23 of [30]. Remark 27.7 is adapted from Section 28 of Bhatia [10].

Chapter 28

Eigenvalues of Hermitian matrices

This chapter is devoted to a number of classical results on the eigenvalues of Hermitian matrices.

28.1. The Courant-Fischer theorem

The notation

(28.1)   (A; X)_min = min {⟨Ax, x⟩ : x ∈ X and ‖x‖ = 1}

and

(28.2)   (A; X)_max = max {⟨Ax, x⟩ : x ∈ X and ‖x‖ = 1}

for Hermitian matrices A ∈ C^{n×n} and subspaces X of C^n will be convenient. These definitions are meaningful because ⟨Ax, x⟩ is a real number when A = A^H.

Theorem 28.1 (Courant-Fischer). Let A ∈ C^{n×n} be a Hermitian matrix with eigenvalues λ1 ≥ · · · ≥ λn and let μj = λ_{n+1−j} for j = 1, . . . , n (so that the numbers μ1, . . . , μn run through the same set of eigenvalues but are now indexed so that μ1 ≤ · · · ≤ μn). Then

(28.3)   λj = μ_{n+1−j} = max {(A; X)_min : dim X = j}   for j = 1, . . . , n

and

(28.4)   μj = λ_{n+1−j} = min {(A; X)_max : dim X = j}   for j = 1, . . . , n .

Proof. Let u1, . . . , un be an orthonormal set of eigenvectors of A corresponding to the eigenvalues λ1, . . . , λn. Then, to verify (28.3), let X be any

j-dimensional subspace of C^n and let U_j = span{u_j, . . . , u_n}. Then, since dim U_j = n + 1 − j and dim(X + U_j) ≤ n,

   dim(X ∩ U_j) = dim X + dim U_j − dim(X + U_j) ≥ j + n + 1 − j − n = 1 .

Thus, X ∩ U_j ≠ {0}, and if v ∈ X ∩ U_j with ‖v‖ = 1, then

   v = Σ_{i=j}^n c_i u_i ,

and hence

   ⟨Av, v⟩ = Σ_{i=j}^n λ_i |c_i|^2 ≤ λ_j Σ_{i=j}^n |c_i|^2 = λ_j .

Therefore,

   (A; X)_min ≤ λ_j   for every j-dimensional subspace X of C^n .

Consequently, max {(A; X)_min : dim X = j} ≤ λ_j. Since the upper bound is attained when X = span{u1, . . . , u_j}, the verification of (28.3) is complete.

Next, (28.4) may be verified by applying (28.3) to −A and noting that λ_j(−A) = −λ_{n+1−j}(A), (−A; X)_min = −(A; X)_max, and

   max {−(A; X)_max : dim X = j} = − min {(A; X)_max : dim X = j} .

Another option, which some may find to be more congenial, is to imitate the proof of (28.3) but with U_j = span{u1, . . . , u_{n−j+1}}. Details are left to the reader. □

Exercise 28.1. Show that if A, B ∈ C^{n×n} are Hermitian matrices such that λ1(A) ≥ · · · ≥ λn(A), λ1(B) ≥ · · · ≥ λn(B), and ⟨Ax, x⟩ ≤ ⟨Bx, x⟩ for every vector x ∈ C^n, then λj(A) ≤ λj(B) for j = 1, . . . , n.

Exercise 28.2. Show that if A ∈ C^{n×n} is a Hermitian matrix with eigenvalues μ1 ≤ · · · ≤ μn and X^⊥ denotes the orthogonal complement of X in C^n, then

   μ_{n−j+1} = min_{X ∈ S_j} max {⟨Ax, x⟩ : x ∈ X^⊥ and ‖x‖ = 1}   for j = 1, . . . , n .

Exercise 28.3. Show that if A ∈ C^{n×n} is a Hermitian matrix with eigenvalues μ1 ≤ · · · ≤ μn and X^⊥ denotes the orthogonal complement of X in C^n, then

   μ_j = max_{X ∈ S_j} min { ⟨Ax, x⟩ / ⟨x, x⟩ : x ∈ X^⊥ and x ≠ 0 }   for j = 1, . . . , n .

Exercise 28.4. Let A ∈ C^{n×n} be a Hermitian matrix with eigenvalues λ1 ≥ · · · ≥ λn. Show that λn ≤ min a_ii ≤ max a_ii ≤ λ1.

Exercise 28.5. Use the Courant-Fischer theorem to give another proof of item (2) of Theorem 15.6. [HINT: Compare the eigenvalues of A^H A and A^H B^H BA.]

Exercise 28.6. Use the Courant-Fischer theorem to give another proof of item (3) of Theorem 15.6. [HINT: Compare the eigenvalues of A^H A and C^H A^H AC.]

Exercise 28.7. Let A ∈ C^{n×n}, and let β1 ≥ · · · ≥ βn and δ1 ≥ · · · ≥ δn denote the eigenvalues of the Hermitian matrices B = (A + A^H)/2 and C = (A − A^H)/(2i), respectively. Show that

   βn ≤ (λ + λ̄)/2 ≤ β1  and  δn ≤ (λ − λ̄)/(2i) ≤ δ1  for every point λ ∈ σ(A) .

[HINT: If Ax = λx and ‖x‖ = 1, then λ + λ̄ = ⟨Ax, x⟩ + ⟨x, Ax⟩.]
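The extremal characterization (28.3) is easy to probe numerically. The following minimal sketch is not part of the text; it assumes NumPy and checks that the span of the top j eigenvectors attains the maximum in (28.3), while random j-dimensional subspaces never exceed it.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (M + M.conj().T) / 2                      # Hermitian test matrix

lam, U = np.linalg.eigh(A)                    # eigh returns ascending order
lam, U = lam[::-1], U[:, ::-1]                # reorder so lam[0] >= ... >= lam[n-1]

def rayleigh_min(A, X):
    """(A; X)_min: minimum of <Ax, x> over unit vectors x in the column span of X."""
    Q, _ = np.linalg.qr(X)                    # orthonormal basis for the subspace
    return np.linalg.eigvalsh(Q.conj().T @ A @ Q).min()

for j in range(1, n + 1):
    attained = rayleigh_min(A, U[:, :j])      # X = span{u_1, ..., u_j}
    assert np.isclose(attained, lam[j - 1])
    for _ in range(200):                      # random j-dimensional subspaces do no better
        X = rng.standard_normal((n, j)) + 1j * rng.standard_normal((n, j))
        assert rayleigh_min(A, X) <= lam[j - 1] + 1e-10
print("Courant-Fischer characterization (28.3) verified")
```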

28.2. Applications of the Courant-Fischer theorem

If a Hermitian matrix undergoes a small Hermitian perturbation, then its eigenvalues will only change a little (but multiplicities may change):

Theorem 28.2. If A, B ∈ C^{n×n} are Hermitian matrices with eigenvalues λ1(A) ≥ · · · ≥ λn(A) and λ1(B) ≥ · · · ≥ λn(B), respectively, then

(28.5)   |λj(A) − λj(B)| ≤ ‖A − B‖   for j = 1, . . . , n .

Proof. Let X be any j-dimensional subspace of C^n. Then

   ⟨Ax, x⟩ = ⟨Bx, x⟩ + ⟨(A − B)x, x⟩ ≤ ⟨Bx, x⟩ + ‖A − B‖

for every vector x ∈ X with ‖x‖ = 1. Therefore, (A; X)_min ≤ ⟨Bx, x⟩ + ‖A − B‖ for each vector x ∈ X with ‖x‖ = 1 and hence

   (A; X)_min ≤ (B; X)_min + ‖A − B‖ ≤ max {(B; X)_min : dim X = j} + ‖A − B‖ = λj(B) + ‖A − B‖ .

Thus, λj(A) − λj(B) ≤ ‖A − B‖ and, as follows by interchanging A and B,

   λj(B) − λj(A) ≤ ‖B − A‖ = ‖A − B‖ .

But these two inequalities serve to justify (28.5). □
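A quick numerical sanity check of (28.5) is given by the sketch below; it is not part of the text, assumes NumPy, and uses the spectral norm for ‖·‖.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8

def random_hermitian(n):
    M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return (M + M.conj().T) / 2

for _ in range(1000):
    A, B = random_hermitian(n), random_hermitian(n)
    lam_A = np.sort(np.linalg.eigvalsh(A))[::-1]   # descending
    lam_B = np.sort(np.linalg.eigvalsh(B))[::-1]
    gap = np.abs(lam_A - lam_B).max()
    assert gap <= np.linalg.norm(A - B, 2) + 1e-10   # (28.5)
print("eigenvalue perturbation bound (28.5) verified on random samples")
```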



We turn next to the Cauchy interlacing theorem.

Theorem 28.3. Let

   A = ⎡A11 A12⎤ = A^H   with A11 ∈ C^{p×p}, A22 ∈ C^{q×q}, n = p + q ,
       ⎣A21 A22⎦

and suppose that λ1(A) ≥ · · · ≥ λn(A) and λ1(A11) ≥ · · · ≥ λp(A11). Then

(28.6)   λ_{j+q}(A) ≤ λ_j(A11) ≤ λ_j(A)   for j = 1, . . . , p .

Proof. For each subspace Y of C^p, let

   Ỹ = { [y ; 0] ∈ C^n : y ∈ Y } .

Then Ỹ is a subspace of C^n and (A11; Y)_min = (A; Ỹ)_min, since

   ⟨A11 y, y⟩ = ⟨A [y ; 0], [y ; 0]⟩   when [y ; 0] ∈ Ỹ .

Therefore,

   λ_j(A11) = max{(A11; Y)_min : dim Y = j} = max{(A; Ỹ)_min : dim Ỹ = j}
            ≤ max{(A; X)_min : X is a subspace of C^n and dim X = j} = λ_j(A)   for j = 1, . . . , p .

This completes the justification of the upper bound in (28.6). The lower bound in (28.6) is obtained by applying the upper bound to −A to get λ_k(−A11) ≤ λ_k(−A) for k = 1, . . . , p and then noting that

   λ_k(−A11) = −λ_{p+1−k}(A11)   and   λ_k(−A) = −λ_{n+1−k}(A) .

The final inequality emerges upon setting j = p + 1 − k, which in turn implies that n + 1 − k = n + 1 − (p + 1 − j) = j + q. □

A tridiagonal Hermitian matrix A_n ∈ R^{n×n} of the form

   A_n = Σ_{j=1}^n a_j e_j e_j^T + Σ_{j=1}^{n−1} b_j (e_j e_{j+1}^T + e_{j+1} e_j^T)
       = ⎡a1  b1  0   · · ·  0      0 ⎤
         ⎢b1  a2  b2  · · ·  0      0 ⎥
         ⎢0   b2  a3  · · ·  0      0 ⎥
         ⎢ ⋮              ⋱         ⋮ ⎥
         ⎣0   0   0   · · ·  b_{n−1} a_n⎦

with b_j > 0 and a_j ∈ R is termed a Jacobi matrix.

Exercise 28.8. Show that a Jacobi matrix An+1 has n + 1 distinct eigenvalues λ1 < · · · < λn+1 and that if μ1 < · · · < μn denote the eigenvalues of the Jacobi matrix An , then λj < μj < λj+1 for j = 1, . . . , n.
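The interlacing inequalities (28.6), and the strict interlacing for Jacobi matrices asserted in Exercise 28.8, can both be observed numerically. Here is a minimal sketch, not from the text, assuming NumPy, for a random Jacobi matrix.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 7
a = rng.standard_normal(n + 1)        # diagonal entries of A_{n+1}
b = rng.random(n) + 0.1               # off-diagonal entries, b_j > 0

A_np1 = np.diag(a) + np.diag(b, 1) + np.diag(b, -1)   # Jacobi matrix A_{n+1}
A_n = A_np1[:n, :n]                                    # leading principal submatrix A_n

lam = np.linalg.eigvalsh(A_np1)       # ascending: lam_1 < ... < lam_{n+1}
mu = np.linalg.eigvalsh(A_n)          # ascending: mu_1 < ... < mu_n

# strict interlacing: lam_j < mu_j < lam_{j+1}
assert np.all(lam[:n] < mu) and np.all(mu < lam[1:])
print("strict interlacing for Jacobi matrices verified")
```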

28.3. Ky Fan’s maximum principle Theorem 28.4. If A = AH ∈ C n×n with eigenvalues λ1 ≥ · · · ≥ λn , then k 

(28.7)

λj = max{trace X H AX : X ∈ C n×k and X H X = Ik }

j=1

and n 

(28.8)

λj = min{trace X H AX : X ∈ C n×k and X H X = Ik }

j=n−k+1

for k = 1, . . . , n.   Proof. Let X Y be a unitary matrix with isometric blocks X ∈ C n×k and Y ∈ C n×(n−k) , where 1 ≤ k ≤ n − 1, and let 

H

H    B11 B12 X X AX X H AY = B= A X Y = . YH B21 B22 Y H AX Y H AY Then, by Cauchy’s interlacing theorem, λj (A) = λj (B) ≥ λj (B11 )

for j = 1, . . . , k .

Therefore, (28.9)

k  j=1

λj (A) ≥

k 

λj (B11 ) = trace B11 = trace X H AX .

j=1

H On the other  hand,  ifHA = U DU with D = diag{λ1 , . . . , λn }, U unitary, H and X = Ik O U , then equality is achieved in (28.9). This completes the verification of (28.7) for k = 1, . . . , n − 1. Since the case k = n is selfevident, the proof of (28.7) is complete. The proof of (28.8) is left to the reader. 

Exercise 28.9. Verify (28.8).
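As a numerical illustration (a sketch that is not part of the text, assuming NumPy), the maximum in (28.7) is attained by the isometry whose columns are the top k eigenvectors, and random isometries never exceed the sum of the top k eigenvalues:

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 6, 3
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (M + M.conj().T) / 2

lam, U = np.linalg.eigh(A)
lam, U = lam[::-1], U[:, ::-1]                  # descending order

top_k_sum = lam[:k].sum()
X_opt = U[:, :k]                                # columns = top-k eigenvectors
assert np.isclose(np.trace(X_opt.conj().T @ A @ X_opt).real, top_k_sum)

for _ in range(500):                            # random isometries X in C^{n x k}
    Z = rng.standard_normal((n, k)) + 1j * rng.standard_normal((n, k))
    X, _ = np.linalg.qr(Z)
    assert np.trace(X.conj().T @ A @ X).real <= top_k_sum + 1e-10
print("Ky Fan maximum principle (28.7) verified")
```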

28.4. The sum of two Hermitian matrices In this section we shall survey a number of classical inequalities for the eigenvalues of the sum A + B of two Hermitian matrices. Theorem 28.5. If A ∈ C n×n and B ∈ C n×n are Hermitian matrices and λi (C), i = 1, . . . , n, denotes the i’th eigenvalue of C = C H indexed in descending order, i.e., λ1 (C) ≥ · · · ≥ λn (C), then the following inequalities

are in force: (1) (Weyl’s inequalities) (28.10) λi+j−1 (A + B) ≤ λi (A) + λj (B)

f or i, j ≥ 1 and i + j − 1 ≤ n .

(28.11) λi (A) + λn (B) ≤ λi (A + B) ≤ λi (A) + λ1 (B) (28.12)

|λi (A + B) − λi (A)| ≤ B

f or i = 1, . . . , n .

f or i = 1, . . . , n .

(2) (Ky Fan’s inequality) (28.13)

k 

λi (A + B) −

i=1

k 

λi (A) ≤

k 

i=1

λi (B)

f or k = 1, . . . , n ,

i=1

with equality when k = n. (3) (Lidskii’s inequality) (28.14)

k 

λij (A + B) −

j=1

k 

λij (A) ≤

j=1

k 

λi (B)

i=1

for 1 ≤ i1 < · · · < ik ≤ n and k = 1, . . . , n, with equality when k = n. Proof. Since A and B are Hermitian, there exist three orthonormal sets of vectors Auj = aj uj , Bvj = bj vj , and (A + B)wj = cj wj , for j = 1, . . . , n, where aj = λj (A), bj = λj (B), and cj = λj (A + B). The rest of the proof is divided into steps. 1. Verification of (28.10)–(28.12). If U = span{ui , . . . , un }, V = span{vj , . . . , vn }, W = span{w1 , . . . , wk } , with k = i + j − 1, and Y = U ∩ W, then dim Y = dim U + dim W − dim(U + W) ≥ (n − i + 1) + (i + j − 1) − n = j and dim Y ∩ V = dim Y + dim V − dim(Y + V) ≥ j + (n − j + 1) − n = 1 . Therefore, there exists a unit vector x ∈ U ∩ V ∩ W. Thus, ci+j−1 ≤ (A + B)x, x = Ax, x + Bx, x ≤ ai + bj , i.e., (28.10) holds.

The upper bound in (28.11) is obtained by choosing j = 1 in (28.10). The lower bound follows by applying the upper bound to the matrices −A and −B. The inequality (28.12) is an easy consequence of (28.11),

   λ_n(B) ≤ λ_i(A + B) − λ_i(A) ≤ λ_1(B)   for i = 1, . . . , n ,

and the fact that ‖B‖ = max {|b_1|, |b_n|}.

2. Verification of (28.13) and (28.14). It suffices to verify (28.14), since (28.13) is a special case of (28.14). Towards this end, fix 1 ≤ k ≤ n − 1 and note that (28.14) holds if and only if

   Σ_{j=1}^k [λ_{i_j}(A + B − b_k I_n) − λ_{i_j}(A)] ≤ Σ_{i=1}^k λ_i(B − b_k I_n)

and that

   B − b_k I_n = Σ_{i=1}^n (b_i − b_k) v_i v_i^H ≤ B_+ ,   where   B_+ = Σ_{i=1}^k (b_i − b_k) v_i v_i^H .

Therefore, by Exercise 28.1,

   λ_i(A + B − b_k I_n) ≤ λ_i(A + B_+)   and   λ_i(A) ≤ λ_i(A + B_+)

for i = 1, . . . , n. Consequently,

   Σ_{j=1}^k [λ_{i_j}(A + B − b_k I_n) − λ_{i_j}(A)] ≤ Σ_{j=1}^k [λ_{i_j}(A + B_+) − λ_{i_j}(A)]
   ≤ Σ_{i=1}^n [λ_i(A + B_+) − λ_i(A)]
   = trace(A + B_+) − trace(A) = trace B_+ = Σ_{i=1}^k (b_i − b_k) ,

which is equivalent to (28.14). □
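The Weyl inequalities (28.10) are also easy to test numerically; the following minimal sketch (not from the text, assuming NumPy) checks every admissible pair (i, j) for random Hermitian A and B.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 6

def rand_herm(n):
    M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return (M + M.conj().T) / 2

def eig_desc(H):
    return np.sort(np.linalg.eigvalsh(H))[::-1]   # lambda_1 >= ... >= lambda_n

for _ in range(200):
    A, B = rand_herm(n), rand_herm(n)
    a, b, c = eig_desc(A), eig_desc(B), eig_desc(A + B)
    for i in range(1, n + 1):
        for j in range(1, n + 2 - i):              # i + j - 1 <= n
            assert c[i + j - 2] <= a[i - 1] + b[j - 1] + 1e-10   # (28.10)
print("Weyl's inequalities (28.10) verified on random samples")
```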



28.5. On the right-differentiability of eigenvalues

The next theorem gives more refined information on the behavior of the eigenvalues of a Hermitian matrix A under small (Hermitian) perturbations of A than the inequality (28.12). It is a key step in understanding the behavior of the singular values of a matrix A under small perturbations of A.

Theorem 28.6. If A, B ∈ C^{n×n}, A = A^H with eigenvalues λ1(A) ≥ · · · ≥ λn(A), k of which are distinct with geometric multiplicities γ1, . . . , γk, respectively, and B = B^H, then

(28.15)   lim_{μ↓0} [λ_j(A + μB) − λ_j(A)] / μ = ν_j ,

where the ν_j are the eigenvalues of a block diagonal matrix diag{C11, . . . , Ckk} (based on blocks C_jj of size γ_j × γ_j) that is extracted from a matrix C ∈ C^{n×n} that is similar to B and Σ_{j=1}^n ν_j = trace B. (Here, too, λ1(A + μB) ≥ · · · ≥ λn(A + μB).)

Discussion. Since A = A^H, it admits a representation of the form A = U D U^H with U ∈ C^{n×n} unitary and D = diag{λ1, . . . , λn}. Therefore,

   λ_j(A + μB) = λ_j(U [D + μ U^H B U] U^H) = λ_j(D + μC)   with C = U^H B U .

Moreover, λ_j(D + μC) = λ_j(e^{μM}(D + μC)e^{−μM}) for every choice of M ∈ C^{n×n} and μ ∈ R. But,

   e^{μM}(D + μC)e^{−μM} = (I_n + μM + (μ^2/2!)M^2 + · · ·)(D + μC)(I_n − μM + (μ^2/2!)M^2 + · · ·)
   = D + μ(MD − DM + C) + · · · ,

which for small μ behaves essentially like D + μ(MD − DM + C). To make the picture as transparent as possible, suppose that

   D = diag{κ1 I3, κ2 I4, κ3 I2}   with κ1 > κ2 > κ3

so that λ1 = λ2 = λ3 = κ1, λ4 = λ5 = λ6 = λ7 = κ2, and λ8 = λ9 = κ3. Suppose further that C = [C_ij] and M = [M_ij] for i, j = 1, 2, 3 with blocks C_ii and M_ii of size 3 × 3 for i = 1, 4 × 4 for i = 2, and 2 × 2 for i = 3. Then, since

   (MD − DM + C)_{ij} = M_{ij} κ_j − κ_i M_{ij} + C_{ij} ,

we can choose

   M_{ij} = C_{ij} / (κ_i − κ_j)   if i ≠ j

so that (MD − DM + C) = diag {C11, C22, C33}. Consequently,

   λ_i(D + μC) = λ_i(λ1 I3 + μC11) = λ_i + μν_i   for i = 1, 2, 3 ,
   λ_i(D + μC) = λ_i(λ4 I4 + μC22) = λ_i + μν_i   for i = 4, . . . , 7 ,
   λ_i(D + μC) = λ_i(λ8 I2 + μC33) = λ_i + μν_i   for i = 8, 9 ,

when μ is small, where ν1 ≥ · · · ≥ ν3 are the eigenvalues of C11, ν4 ≥ · · · ≥ ν7 are the eigenvalues of C22, and ν8 ≥ ν9 are the eigenvalues of C33. Therefore, the limit (28.15) clearly holds for this example, and

   Σ_{j=1}^9 ν_j = Σ_{i=1}^3 trace C_ii = trace C = trace B .

The general case is treated in exactly the same way; the only difference is that the bookkeeping is more elaborate. □

Exercise 28.10. Show that in the setting and notation of Theorem 28.6, Σ_{j=1}^n λ_j ν_j = trace DC. [HINT: Look first at the example considered in the proof of the theorem.]

and E0 (A) = dim NA .

Theorem 28.7. If A ∈ C n×n is Hermitian and C ∈ C m×n , then E+ (CAC H ) ≤ E+ (A)

and

E− (CAC H ) ≤ E− (A) ,

with equality in both if and only if rank A = rank CAC H . Proof. Since A and B = CAC H are Hermitian matrices, there exists a pair of invertible matrices X ∈ C n×n and Y ∈ C m×m such that ⎡ ⎤ ⎤ ⎡ Is2 O O O O Is1 A = X ⎣ O −It1 O ⎦ X H and B = Y ⎣ O −It2 O ⎦ Y H , O O O O O O where s1 = E+ (A), t1 = E− (A), s2 = E+ (B), and t2 ⎤ ⎡ ⎡ Is1 O O O Is2 ⎣ O −It2 O ⎦ = Q ⎣ O −It1 O O O O O

= E− (B). Therefore, ⎤ O O ⎦ QH , O

314

28. Eigenvalues of Hermitian matrices

where Q = Y −1 CX. Thus, upon expressing the m × n matrix Q in block form as ⎡ ⎤ Q11 Q12 Q13 Q = ⎣ Q21 Q22 Q23 ⎦ , Q31 Q32 Q33 where the heights of the block rows are s2 , t2 , and m−s2 −t2 and the widths of the block columns are s1 , t1 , and n − s1 − t1 , respectively, it is readily seen that H Is2 = Q11 QH 11 − Q12 Q12

and

H It2 = −Q21 QH 21 + Q22 Q22 .

Therefore, H Q11 QH 11 = Is2 + Q12 Q12

H and Q22 QH 22 = It2 + Q21 Q21 .

Thus, s2 = rank Q11 QH 11 ≤ rank Q11 ≤ s1

and

t2 = rank Q22 QH 22 ≤ rank Q22 ≤ t1 . Consequently, rank A = s1 + t1 ≥ s2 + t1 ≥ s2 + t2 = rank B , which clearly displays the fact that rank A = rank B ⇐⇒ E+ (A) = E+ (B) and E− (A) = E− (B) .



Corollary 28.8 (Sylvester’s law of inertia). If A ∈ C n×n is Hermitian and C ∈ C n×n is invertible, then E+ (CAC H ) = E+ (A),

E− (CAC H ) = E− (A),

and

E0 (CAC H ) = E0 (A) .

H Exercise 28.11. Show that if A, B ∈ C n×n , A  O, and  B = B , then   A O In .] E− (A − B) ≤ E+ (B). [HINT: A − B = In In O −B In

28.7. Supplementary notes Sections 28.1 and 28.6 are adapted from [30]. Sections 28.4 and 28.3 are adapted from Bhatia [9]. Theorem 28.6 is adapted from Theorem A4 in the paper [58] by R. Lippert, which has stronger results than are presented here. The fact that the limit (28.15) does not change if λj (A + μB) is replaced by λj (D + μ(M D − DM + C)) can be justified by invoking (28.5). There is an analogue of Theorem 28.6 in which the limit as μ ↓ 0 is replaced by the limit as μ ↑ 0. These two limits are not always the same. They will be if A has n distinct eigenvalues.

Chapter 29

Singular values redux I

In this chapter we shall establish extremal characterizations of the partial sums s1 (A) + · · · + sk (A) and partial products s1 (A) · · · sk (A), k = 1, . . . , q, of the singular values sj (A) of a matrix A ∈ C p×q . These characterizations will then be used to develop a new class of norms on the space Cp×q and to establish upper bounds on the eigenvalues of a square matrix in terms of its singular values.

29.1. Sums of singular values The next theorem supplies the extension of formula (15.14) that was used in Chapter 15; the proof rests on von Neumann’s trace inequality (27.9). Theorem 29.1. If A ∈ C p×q and k ∈ {1, . . . , q}, then k 

(29.1)

sj (A) = max{| trace(Y H AX)| : X ∈ C q×k , Y ∈ C p×k ,

j=1

and X H X = Y H Y = Ik } . Proof. Let m = min {rank A, rank XY H }. In view of (27.9), | trace(Y H AX)| = | trace(AXY H )| ≤

m 

sj (A)sj (XY H ) =

j=1

k 

sj (A) ,

j=1

since rank XY H = k and sj (XY H ) = 1 for j = 1, . . . , k. 315

316

29. Singular values redux I

Equality is obtained by choosing X and Y appropriately: If rank A = r and the singular value decomposition of A is A = V1 S1 U1H with isometric  factors V1 ∈ C p×r and U1 ∈ C q×r and 1 ≤ k < r, then V1 = V11 V12 and   U1 = U11 U12 with V11 ∈ C p×k and U11 ∈ C q×k . Thus, if X = U11 and Y = V11 , then

H

     U11 I H H U11 = Ik O S1 k Y AX = V11 V11 V12 S1 H U12 O and hence trace(Y H AX) = s1 + · · · + sk . If k = r, then the equality trace(Y H AX) = s1 + · · · + sr is obtained by  choosing X = U1 and Y = V1 . Corollary 29.2. If A, B ∈ C p×q , then (29.2)

k 

sj (A + B) ≤

j=1

k 

sj (A) +

j=1

k 

sj (B)

for k = 1, . . . , q .

j=1

Proof. In view of Theorem 29.1, |(A + B)X, Y | = |AX, Y  + BX, Y | ≤ |AX, Y | + |BX, Y | ≤

k 

sj (A) +

j=1

k 

sj (B)

j=1

for every pair of isometric matrices X ∈ C q×k and Y ∈ C p×k . The inequality (29.2) is obtained by choosing X and Y to maximize the first term in the last display.  Our next objective is to extend the inequality (29.2) to k  j=1

sj (A + B)t ≤

k  (sj (A) + sj (B))t

for k = 1, . . . , q and 1 ≤ t < ∞

j=1

by a method called majorization.

29.2. Majorization The main result of this section is Theorem 29.4. It is convenient to begin with a lemma. Lemma 29.3. Let {a1 , . . . , an } and {b1 , . . . , bn } be two sequences of real numbers such that a1 ≥ a2 ≥ · · · ≥ an , b1 ≥ b2 · · · ≥ bn , (29.3)

and

k  j=1

aj ≤

k  j=1

bj

for k = 1, . . . , n ,

29.2. Majorization

317

and let

(

x − s for x − s > 0 , 0 for x − s ≤ 0.

(x − s)+ = Then k 

k  (aj − s)+ ≤ (bj − s)+

j=1

j=1

for every s ∈ R. Proof. Let α(s) = (a1 − s)+ + · · · + (ak − s)+ and β(s) = (b1 − s)+ + · · · + (bk − s)+ and consider the following cases: (1) If s < ak , then α(s) = (a1 − s) + · · · + (ak − s) ≤ (b1 − s) + · · · + (bk − s) ≤ (b1 − s)+ + · · · + (bk − s)+ = β(s) . (2) If aj ≤ s ≤ aj−1 for j = 2, . . . , k, then α(s) = (a1 − s)+ + · · · + (aj − s)+ = (a1 − s) + · · · + (aj−1 − s) ≤ (b1 − s) + · · · + (bj−1 − s) ≤ (b1 − s)+ + · · · + (bk − s)+ = β(s) . (3) If s ≥ a1 , then α(s) = 0 and so β(s) ≥ α(s), since β(s) ≥ 0.  Theorem 29.4. If a1 ≥ · · · ≥ an ≥ a and b1 ≥ · · · ≥ bn ≥ a, then k 

(29.4)

j=1

aj ≤

k 

bj f or k = 1, . . . , n

j=1

=⇒

k 

ϕ(aj ) ≤

j=1

k 

ϕ(bj ) f or k = 1, . . . , n

j=1

and every convex function ϕ ∈ C 2 (Q) on the interval Q = (a, ∞) for which limb↓a ϕ(b) = 0, limb↓a ϕ (b) = 0, and limb↓a bϕ (b) = 0. Proof. The key to the proof is the integral representation 6 ∞ (x − u)+ ϕ (u)du (29.5) ϕ(x) = a

318

29. Singular values redux I

for functions ϕ(x) that meet the given assumptions. Then, since ϕ(x) is convex, ϕ (u) ≥ 0 for u ∈ (a, ∞) and hence, in view of Lemma 29.3, k 

ϕ(aj ) =

j=1

k 6  j=1

=

k 





(aj − u)+ ϕ (u)du ≤

a

k 6  j=1



(bj − u)+ ϕ (u)du

a

ϕ(bj ) .

j=1

It remains to verify (29.5). Towards this end, choose x > b > a. Then 2 6 x 6 x (6 s    ϕ(x) − ϕ(b) = ϕ (s)ds = ϕ (u)du + ϕ (b) ds b b b 6 x 6 x  ds ϕ (u)du + (x − b)ϕ (b) = u 6b x (x − u)ϕ (u)du + (x − b)ϕ (b) . = b

Formula (29.5) is obtained by letting b ↓ a and replacing (x − u) by  (x − u)+ . Corollary 29.5. Let {a1 , . . . , an } and {b1 , . . . , bn } be two sequences of real numbers such that a1 ≥ a2 ≥ · · · ≥ an , b1 ≥ b2 ≥ · · · ≥ bn , and k 

aj ≤

j=1

k 

bj

for k = 1, . . . , n .

j=1

Then k 

e

aj

j=1



k 

ebj

for k = 1, . . . , n .

j=1

Proof. This is an immediate consequence of Theorem 29.4 and the formula 6 x 6 ∞ x s e = (x − s)e ds = (x − s)+ es ds , −∞

−∞

which corresponds to the choice ϕ(x) = ex and a = −∞.



Exercise 29.1. Show that if a1 ≥ · · · ≥ an > 0, b1 ≥ · · · ≥ bn > 0, and 1 > t > 0, then k  j=1

aj ≤

k  j=1

bj for k = 1, . . . , n =⇒

k  j=1

atj



k  j=1

btj for k = 1, . . . , n.

29.4. Unitarily invariant norms

319

29.3. Norms based on sums of singular values Theorem 29.6. If A, B ∈ C p×q and 1 ≤ t < ∞, then (29.6)

k 

sj (A + B)t ≤

j=1

k  (sj (A) + sj (B))t

for k = 1, . . . , q

j=1

and the function ⎛ ⎞1/t k  sj (A)t ⎠ ϕk,t (A) = ⎝

(29.7)

j=1

defines a norm on C p×q for each choice of k ∈ {1, . . . , q} and t ∈ [1, ∞). Proof. Let aj = sj (A + B) and bj = sj (A) + sj (B) for j = 1, . . . , q. Then  a1 ≥ · · · ≥ aq ≥ 0, b1 ≥ · · · ≥ bq ≥ 0, and, in view of (29.2), kj=1 aj ≤ k t j=1 bj for k = 1, . . . , q. Thus, as the function ϕ(x) = x is convex on   (0, ∞) and ϕ(x), ϕ (x), and xϕ (x) all tend to zero as x ↓ 0 when t > 1, Theorem 29.4 ensures that k 

t

sj (A + B) =

j=1

k  j=1

atj



k  j=1

btj

k  = (sj (A) + sj (B))t

for 1 < t < ∞ .

j=1

Consequently, as ⎞1/t ⎛ ⎞1/t ⎛ ⎞1/t ⎛ k k k    ⎝ (sj (A) + sj (B))t ⎠ ≤ ⎝ (sj (A))t ⎠ + ⎝ (sj (B))t ⎠ j=1

j=1

j=1

for 1 ≤ t < ∞ by Minkowski’s inequality (9.24), ϕk,t (A + B) ≤ ϕk,t (A) + ϕk,t (B) , i.e., the triangle inequality holds. The remaining properties of a norm are easily verified; the details are left to the reader. 

29.4. Unitarily invariant norms A norm ϕ(A) on C p×q is said to be unitarily invariant if ϕ(U AV ) = ϕ(A) for every A ∈ C p×q and every pair of unitary matrices U ∈ C p×p and V ∈ C q×q . The norm ϕk,t (A) considered in (29.7) is an example of a unitarily invariant norm.

320

29. Singular values redux I

A norm g(x) on R n is said to be a symmetric gauge function if: (1) g(P x) = g(x) for every x ∈ R n and every permutation matrix P ∈ R n×n . (2) g(Dx) = g(x) for every x ∈ R n and every orthogonal diagonal matrix D ∈ R n×n . The classical norms xt , 1 ≤ t ≤ ∞, are all symmetric gauge functions. A theorem of von Neumann establishes a connection between these two classes of norms. To formulate it, the notation  T for A ∈ C n×n sA = s1 (A) · · · sn (A) and Dx = diag{x1 , . . . , xn }

for x ∈ R n

will be convenient. Theorem 29.7. If A ∈ C n×n and g is a symmetric gauge function on R n , then the function ϕ(A) = g(sA ) is a unitarily invariant norm on C n×n . Conversely, if A ∈ C n×n and ϕ(A) is a unitarily invariant norm on then the function g(x) = ϕ(Dx ) is a symmetric gauge function.

C n×n ,



Proof. See, e.g., Theorem IV.2.1 on page 91 in Bhatia [9].

29.5. Products of singular values The main result in this section is Theorem 29.9. We begin with a preliminary lemma. Lemma 29.8. Let A ∈ C p×q , let s1 ≥ · · · ≥ sq denote the singular values of A, and let 1 ≤ k ≤ q. Then det(W H AH AW ) ≤ s21 · · · s2k det(W H W )

for every choice of

W ∈ C q×k .

Proof. Let A ∈ Cp×q and AH A = U S 2 U H with U ∈ Cq×q unitary, S = diag{s1 , . . . , sq }, and W ∈ Cq×k . Then det {W H AH AW } = det {W H U S 2 U H W } = det XX H with X = W H U S ∈ Ck×q . Thus, upon writing X ∈ Ck×q as an array X = x1 · · · xq of q columns of height k and setting   Y = W H U = y1 · · · yq

29.5. Products of singular values

321

so that X = Y S, the Binet-Cauchy formula implies that ⎡ H⎤ xi1    ⎢ .. ⎥ H det XX = det xi1 · · · xik det ⎣ . ⎦ xH ik   2  = | det xi1 · · · xik |    = | det yi1 · · · yik |2 s2i1 · · · s2ik    ≤ | det yi1 · · · yik |2 s21 · · · s2k , where each sum is over all k-tuples 1 ≤ i1 < · · · < ik ≤ q. Thus, det {W H AH AW } = det XX H ≤ s21 · · · s2k det Y Y H = s21 · · · s2k det W H W, 

as claimed.

Theorem 29.9. If A ∈ Cp×q with singular values s1 ≥ · · · ≥ sq and k ≤ min{p, q}, then ! (29.8) s21 · · · s2k = max det {W H AH AW }; W ∈ Cq×k and W H W = Ik . Proof. Lemma 29.8 ensures that det {W H AH AW } ≤ s21 · · · s2k

for every W ∈ Cq×k such that W H W = Ik .

If AH A = U S 2 U H with U ∈ C q×q unitary and S = diag{s1 , . . . , sq }, then equality in (29.8) is achieved by choosing 

Ik .  W =U O(q−k)×k Exercise 29.2. Show that if A ∈ Cp×q with singular values s1 ≥ · · · ≥ sq and 1 ≤ k ≤ rank A, then (29.9) ( 2 det {W H AH AW } q×k max : W ∈C and rank W = k = s21 · · · s2k . det W H W Exercise 29.3. Let A, B ∈ C n×n with singular values s1 ≥ · · · ≥ sn and t1 ≥ · · · ≥ tn , respectively. Show that max {| trace(U AV B)| : U, V ∈ C n×n and U H U = V H V = In } =

n  j=1

[HINT: Theorem 27.8.]

sj tj .

322

29. Singular values redux I

29.6. Eigenvalues versus singular values Lemma 29.10. Let A ∈ C n×n , let s1 ≥ · · · ≥ sn denote the singular values of A, and suppose that the eigenvalues λ1 , . . . , λn of A, repeated according to their algebraic multiplicity, are indexed so that |λ1 | ≥ |λ2 | ≥ · · · ≥ |λn |. Then |λ1 | · · · |λk | ≤ s1 · · · sk for k = 1, . . . , n . Proof. By Schur’s theorem, there exists a unitary matrix U ∈ C n×n such that T = U H AU is upper  triangular and tjj = λj for j = 1, . . . , n. Thus, if k < n and U = U1 U2 with U1 ∈ C n×k and W = U1 , then

   H I H H H H H W A AW = U1 U T T U U1 = Ik O T T k , O which, upon writing the upper triangular matrix T in compatible block form, can be expressed as   

H   T11 T11 T12 Ik O H Ik O = T11 T11 , H TH O T O T12 22 22 where T11 denotes the upper left-hand k × k corner of T . Therefore, H |λ1 · · · λk |2 = | det T11 |2 = det{T11 T11 }

= det{W H U T H T U H W } = det{W H U T H U H U T U H W } = det{W H AH AW } ≤ s21 · · · s2k det{W H W } = s21 · · · s2k , 

by Lemma 29.8.

Corollary 29.11. Let A ∈ C n×n with singular values s1 ≥ · · · ≥ sn and eigenvalues λ1 , . . . , λn , repeated according to their algebraic multiplicity and indexed so that |λ1 | ≥ · · · ≥ |λn |. Then sk = 0 =⇒ λk = 0 (i.e., |λk | > 0 =⇒ sk > 0). Theorem 29.12. Let A ∈ C n×n , let s1 , . . . , sn denote the singular values of A, and let λ1 , . . . , λn denote the eigenvalues of A, repeated according to their algebraic multiplicity and indexed so that |λ1 | ≥ · · · ≥ |λn |. Then (29.10)

k  j=1

|λj |t ≤

k 

stj

for t > 0 and k = 1, . . . , n .

j=1

Proof. Lemma 29.10 guarantees that |λ1 | · · · |λk | ≤ s1 · · · sk

for k = 1, . . . , n .

Suppose that |λk | > 0. Then ln |λ1 | + · · · + ln |λk | ≤ ln s1 + · · · + ln sk

29.7. Supplementary notes

323

and hence, if t > 0, t(ln |λ1 | + · · · + ln |λk |) ≤ t(ln s1 + · · · + ln sk ) or, equivalently, ln |λ1 |t + · · · + ln |λk |t ≤ ln st1 + · · · + ln stk . Consequently, Corollary 29.5 is applicable to the numbers aj = ln |λj |t and bj = ln stj for j = 1, . . . , k and yields the inequality eln |λ1 | + · · · + eln |λk | ≤ eln s1 + · · · + eln sk , t

t

t

t

which is equivalent to |λ1 |t + · · · + |λk |t ≤ st1 + · · · + stk . Thus we have established the inequality for every integer k ∈ {1, . . . , n} for which |λk | > 0. However, this is really enough, because if λ = 0, then |λj | ≤ sj for j = , . . . , n. Thus, for example, if n = 5 and |λ3 | > 0 but λ4 = 0, then the inequality (29.10) holds for k = 1, 2, 3 by the preceding analysis. However, it must also hold for k = 4 and k = 5, since λ4 = 0 =⇒ λ5 = 0  and thus |λ4 | ≤ s4 and |λ5 | ≤ s5 .

29.7. Supplementary notes The proof of Theorem 29.4 is adapted from the proof of Lemma 3.4 in Gohberg-Krein [43], which is formulated with less restrictive smoothness conditions on the convex function ϕ(x). They credit Herman Weyl and Hardy, Littlewood, and Polya [47]. They also present a number of inequalities for the real and imaginary parts of eigenvalues. There is an extensive literature on inequalities for eigenvalues and singular values of matrices; see, e.g., Ando [3] and Bhatia [9] and the references cited therein.

Chapter 30

Singular values redux II

In this chapter we shall develop a number of inequalities for singular values, partially for future use and partially because a number of them play a significant role in assorted algorithms in numerical analysis. Recall that if A ∈ C p×q and rank A = r ≥ 1, then AH A can be expressed in the form AH A = U S 2 U H , where U ∈ C q×q is unitary and S = diag{s1 , . . . , sq }, where s≥ · · · ≥ sr > 0 and the remaining singular values of A, if any, are all equal to zero (i.e., sj = 0 for j = r + 1, . . . , q if r < q).

30.1. Sums of powers of singular values In this section we develop two extremal characterizations for sums of powers of the singular values of a matrix A and then use these characterizations to establish some inequalities in terms of the entries in A. Lemma 30.1. If D ∈ C n×n is a positive semidefinite diagonal matrix and x ∈ C n , then (30.1)

(Dx, x)t ≤ D t x, x

for x = 1 and 1 ≤ t < ∞

(Dx, x)t ≥ D t x, x

for x = 1 and 0 < t ≤ 1 .

and (30.2)

Proof. The asserted inequalities are self-evident if t = 1. Therefore, it suffices to verify (30.1) for 1 < t < ∞ and (30.2) for 0 < t < 1. 325

326

30. Singular values redux II

If t > 1, 1/s = 1 − 1/t, djj = δj , and x = 1, then, by H¨older’s inequality, Dx, x =

n 

δj |xj | = 2

j=1

n 

δj |xj |2/t |xj |2/s

j=1

⎛ ⎞1/t ⎛ ⎞1/s ⎛ ⎞1/t n n n    ≤⎝ δjt |xj |2 ⎠ ⎝ |xj |2 ⎠ = ⎝ δjt |xj |2 ⎠ , j=1

j=1

which is equivalent to (30.1), since

j=1

n

t j=1 δj

|xj |2 = D t x, x.

Suppose next that 0 < t < 1. Then 1 < t−1 < ∞. Thus, if C ∈ C n×n is a positive semidefinite diagonal matrix, then (Cx, x)1/t ≤ C 1/t x, x x2(1−t)/t by (30.1). The inequality (30.2) is obtained by setting C = D t and raising both sides to the t’th power.  Exercise 30.1. Show that if V ∈ C n×k is isometric and D ∈ C k×k is a positive semidefinite diagonal matrix, then 't & (30.3) V DV H x, x ≤ V D t V H x, x for x = 1 and 1 ≤ t < ∞ and (30.4)

't & V DV H x, x ≥ V D t V H x, x

for x = 1 and 0 < t ≤ 1 .

Theorem 30.2. If A ∈ C p×q , rank A = r ≥ 1, k ∈ {1, . . . , q}, and ej denotes the j’th column of Ik , then ⎧ ⎫ k ⎨ ⎬ t q×k max AW ej  : W ∈ C is isometric ⎩ ⎭ j=1 (30.5) k  sj (A)t if 2 ≤ t < ∞ = j=1

and min (30.6)

⎧ k ⎨ ⎩

AW ej t : W ∈ C q×k

j=1

=

q  j=q−k+1

sj (A)t

⎫ ⎬ is isometric ⎭

if 0 < t ≤ 2 .

30.1. Sums of powers of singular values

327

Proof. Since AH A = U S 2 U H with S = diag{s1 (A), . . . , sq (A)} and U ∈ C q×q unitary, AW ej 2 = AH AW ej , W ej  = U S 2 U H W ej , W ej  = S 2 Xej , Xej  with X = U H W . Therefore, k 

AW ej t =

j=1

k  't/2 & 2 . S Xej , Xej  j=1

Moreover, since U ∈ C q×q is unitary and W ∈ C q×k is isometric, X is also isometric and hence Xej  = 1. Thus, Lemma 30.1 is applicable. There are two cases to consider: 1. If 2 ≤ t < ∞, then 1 ≤ t/2 < ∞ and hence, in view of (30.1), k 

AW ej t =

j=1

k  &

S 2 Xej , Xej 

't/2

j=1



k 

S t Xej , Xej 

j=1

= trace(X H S t X) = trace(S t XX H ) ≤

k 

sj (A)t ,

j=1

by the upper bound in the Ky Fan inequality (27.8), since the eigenvalues of S t are s1 (A)t , . . . , sq (A)t and the eigenvalues of XX H are equal to 1 with multiplicity k and 0 with multiplicity q − k. This upper bound is valid for every choice of the isometric matrix W . Equality  holds if W is chosen equal to U1 in the block decomposition U = U1 U2 of the unitary matrix U

 I q×k H , because then X = U W = k and hence with U1 ∈ C O k 

(S Xej , Xej )

j=1

2

t/2

=

k 

sj (A)t .

j=1

2. If 0 < t ≤ 2, then 0 < t/2 ≤ 1 and, in view of (30.2), k  j=1

k k  't/2  & 2 AW ej  = ≥ S t Xej , Xej  S Xej , Xej  t

j=1

j=1

= trace X H S t X = trace S t XX H ≥

q 

sj (A)t ,

j=q−k+1

by the lower bound in the Ky Fan inequality (27.8). This lower bound is valid for every choice of the isometric matrix W . Equality holds if W = U2

328

30. Singular values redux II

  in the block decomposition U = U1 U2 of the unitary matrix U with

 O q×k H U2 ∈ C , because then X = U W = .  Ik Corollary 30.3. If A ∈ C n×n and rank A = r ≥ 1, then (30.7)

n 

|Auj , uj | ≤ t

r 

j=1

for 1 ≤ t < ∞

sj (A)t

j=1

and every orthonormal basis {u1 , . . . , un } of C n . Proof. Since rank A = r, A admits a singular value decomposition of the form A = V1 S1 U1H , with isometric factors V1 , U1 ∈ C n×r . Thus, if x ∈ C n and 1 ≤ t < ∞, then 1/2

1/2

|Ax, x|t = |V1 S1 U1H x, x|t = |S1 U1H x, S1 V1H x|t 1/2

1/2

1/2

≤ S1 U1H xt S1 V1H xt ≤

1/2

S1 U1H x2t + S1 V1H x2t . 2

Consequently, if {u1 , . . . , un } is any orthonormal basis of C n , then, in view of Theorem 30.2, ⎧ ⎫ n n n ⎨ ⎬    1 1/2 1/2 |Auj , uj |t ≤ (sj (S1 U1H ))2t + (sj (S1 V1H ))2t ⎭ 2⎩ j=1 j=1 j=1 ⎧ ⎫ n n ⎬  1 ⎨ sj (A)t + sj (A)t , ≤ ⎭ 2⎩ j=1

j=1



as claimed.

30.2. Inequalities for singular values in terms of A Theorem 30.4. If A ∈ C p×q with rank A = r ≥ 1 and positive singular values s1 ≥ · · · ≥ sr , then (30.8)

p  q 

|aij |t ≤

i=1 j=1

r 

sj (A)t

if 2 ≤ t < ∞

sj (A)t

if 0 < t ≤ 2 .

j=1

and (30.9)

p  q  i=1 j=1

|aij | ≥ t

r  j=1

30.3. Perturbation of singular values

329

Proof. Let aj denote the j’th column of A, j = 1, . . . , q. If t ≥ 2, aj t ≤ aj 2 , i.e., p  |aij |t ≤ aj t2 . i=1

Consequently, q  p 

|aij | ≤ t

j=1 i=1

q 

aj  ≤ t

j=1

q 

sj (A)t ,

j=1

where the last inequality follows from (30.5) upon choosing k = q and W = Iq . This completes the proof of (30.8). The proof of (30.9) rests upon the inequalities p 

(30.10)

|aij |t ≥ aj t2

if 0 < t ≤ 2

i=1

and q 

(30.11)

j=1

aj t2



q 

sj (A)t

if 0 < t ≤ 2 .

j=1

The verification of (30.10) is relegated to Exercise 30.2; the inequality (30.11) is immediate from (30.6). The details are left to the reader.  Exercise 30.2. Verify the inequality (30.10). [HINT: Use Exercise 21.2.]

30.3. Perturbation of singular values In this section we shall investigate the properties of sj (A) under small changes in the matrix A. The next lemma shows that sj (A) is a continuous function of A. Lemma 30.5. If A, B ∈ C p×q , then |sj (A) − sj (B)| ≤ A − B

(30.12)

for j = 1, . . . , q .

Proof. If j = 1, then (30.12) is just another way of expressing the wellknown inequality |A − B| ≤ A − B. If j > 1, then sj (A) = min{A − C : C ∈ C p×q

and

rank C ≤ j − 1} .

Therefore, sj (A) ≤ A − C = B − C + A − B ≤ B − C + A − B for every C ∈ C p×q with rank C ≤ j − 1. Thus, upon minimizing the term

330

30. Singular values redux II

on the right over all admissible C, we obtain the inequality sj (A) ≤ sj (B) + A − B . But this justifies (30.12), since A and B may be interchanged.



Lemma 30.6. If A, B ∈ C p×q and t ≥ 1, then (30.13) |sj (A)t −sj (B)t | ≤ t A−B (sj (A)+sj (B))t−1

for j = 1, . . . , q .

Proof. Let f (x) = xt . If a ≥ 0, b ≥ 0, and t > 1, then, by the mean value theorem, there exists a point c = τ a + (1 − τ )b, 0 < τ < 1, such that at − bt = t(a − b) ct−1 = t(a − b)(τ a + (1 − τ )b)t−1 . Consequently, |sj (A)t − sj (B)t | = t|sj (A) − sj (B)|(τ sj (A) + (1 − τ )sj (B))t−1 ≤ tA − B(sj (A) + sj (B))t−1 , 

as claimed.

Our next objective is to obtain information on the singular values of A from information on the eigenvalues of the matrix A constructed in Exercise 30.3 (below) and Theorem 28.6. Exercise 30.3. Show that if A ∈ C p×q with rank A = r ≥ 1 and p + q = n, then the nonzero eigenvalues δ1 (A) ≥ · · · ≥ δr (A) and δn+r−1 (A) ≥ · · · ≥ O A can be expressed in terms of δn (A) of the Hermitian matrix A = AH O the nonzero singular values sj (A) by the formulas δj (A) = sj (A)

and

δn+1−j (A) = −sj (A)

for j = 1, . . . , r .

Theorem 30.7. If A, B ∈ C n×n and A has the singular value decomposition A = V SU H with unitary factors V, U ∈ C n×n , and C = V H BU , then (30.14)

lim μ↓0

sj (A + μB)t − sj (A)t = tsj (A)t−1 νj μ

for j = 1, . . . , n

and (30.15)

lim μ↓0

n  sj (A + μB)t − sj (A)t j=1

μ

= t trace{U S t−1 V H B}

. , νn are the eigenvalues of a for every choice of t > 1. In (30.14), ν1 , . .  block diagonal submatrix of (C + C H )/2 and nj=1 νj = trace(C + C H )/2. If sj (A) > 0, then (30.14) holds for every t > 0; if A is invertible, then (30.15) holds for every t > 0.

30.3. Perturbation of singular values

Proof. Since the matrices

O A= AH

A O



331

and

O B= BH

B O



are Hermitian, Theorem 28.6 guarantees that lim μ↓0

λj (A + μB) − λj (A) = νj μ

exists for j = 1, . . . , 2n. Moreover, if A = V SU H is the singular value decomposition of A and Zn is defined in terms of the standard basis {e1 , . . . , en } H by the formula Zn = e1 eH n + · · · + en e1 , then the matrix  



1 V 1 V O In Zn V Zn =√ W =√ In −Zn 2 O U 2 U −U Zn is unitary and W H (A + μB)W = Σ + μF , where

 S O , Σ= O −Zn SZn

 1 C + CH (C H − C)Zn F = , 2 Zn (C − C H ) −Zn (C + C H )Zn

and C = V H BU . Therefore, λj (A + μB) − λj (A) λj (Σ + μF ) − λj (Σ) = lim μ↓0 μ↓0 μ μ λj (Σ + μF ) − sj (A) = lim μ↓0 μ

νj = lim

for j = 1, . . . , n and the numbers ν1 , . . . , νn are the eigenvalues of a certain block diagonal submatrix of (C + C H )/2. Consequently, as sj (A + μB) = λj (A + μB) for j = 1, . . . , n and μ ∈ R, it follows that (30.16)

lim μ↓0

sj (A + μB) − sj (A) = νj μ

for j = 1, . . . , n

and hence that (30.17)

sj (A + μB) = sj (A) + μ[νj + ε(μ)] ,

where

lim ε(μ) = 0 . μ↓0

Thus, sj (A + μB)t = (sj (A) + μ[νj + ε(μ)])t , which is of the form f (x) = (sj (A) + x)t with x = μ[νj + ε(μ)]. By the mean value theorem, f (x) − f (0) = xf  (ξ) = xt(sj (A) + ξ)t−1

for some point ξ with |ξ| < |x| .

332

30. Singular values redux II

Therefore, sj (A + μB)t − sj (A)t = t[νj + ε(μ)](sj (A) + ξ)t−1 , μ which tends to the right-hand side of (30.14) as μ ↓ 0 when t ≥ 1 even if sj (A) = 0. If sj (A) > 0, then the same conclusion holds for every t > 0.  Then, since nj=1 νj = trace(C + C H )/2 and S is a diagonal matrix, (30.18)

n 

sj (A)t−1 νj =

j=1

1 trace S t−1 (C + C H ) =  trace S t−1 C 2

=  trace{S t−1 V H BU } =  trace{U S t−1 V H B} . This completes the justification of formula (30.15) for every A ∈ C n×n when t > 1 and for invertible matrices A ∈ C n×n when t > 0. (If rank A = r < n, then S t−1 is only defined for t > 1. The constraint t > 1 ensures that the limit in (30.14) is equal to zero for j > r.)  Exercise 30.4. Verify the first equality in the last display. [HINT: Look at Exercise 28.10.] Notice that if A, B ∈ R n×n , then the limit in (30.15) is a linear function of B. The extra complication of taking the real part enters because we are allowing complex matrices. It is tempting to conclude that if t > 1, then the limit in (30.14) is also equal to the real part of a function f (A; B) that is linear in B. However, the next example shows that this is not the case. Example 30.1. If

 1 0 A= , 0 1

b ∈ C,

then the eigenvalues of the matrix

and

 0 b B= , 0 0

1 μb (A + μB) (A + μB) = μb 1 + μ2 |b|2



H

are equal to 2 + μ2 |b|2 ± Thus,

1

4μ2 |b|2 + μ4 |b|4 . 2

1 μ|b|2 + 4|b|2 + μ2 |b|4 s1 (A + μB)2 − s1 (A)2 = lim = |b| , lim μ↓0 μ↓0 μ 2 whereas 1 μ|b|2 − 4|b|2 + μ2 |b|4 s2 (A + μB)2 − s2 (A)2 = lim = −|b| . lim μ↓0 μ↓0 μ 2

30.3. Perturbation of singular values

333

Neither of these two limits is equal to the real part of a linear function of B. However, (30.19)

lim μ↓0

2  sj (A + μB)2 − sj (A)2

= 0,

μ

j=1



which is a linear function of B. Example 30.2. Let 

a11 0 A= 0 a22

and

0 b B = 11 0 b22

 with |a11 | > |a22 | .

Then, since |a11 + μb11 | > |a22 + μb22 | if μ > 0 is sufficiently small, s1 (A + μB) = |a11 + μb11 |

and

s2 (A + μB) = |a22 + μb22 | for such μ.

Consequently,

. -% %t % % b 11 % −1 , s1 (A + μB) − s1 (A) = |a11 + μb11 | − |a11 | = |a11 | %%1 + μ a11 % t

t

t

t

t

which, upon setting α + iβ = b11 /a11 (with α, β ∈ R), can be expressed as s1 (A + μB)t − s1 (A)t = |a11 |t {|1 + μ(α + iβ)|t − 1} = |a11 |t {|(1 + μα)2 + (μβ)2 |t/2 − 1} . Therefore, (30.20)

lim μ↓0

b11 s1 (A + μB)t − s1 (A)t = t|a11 |t α = t|a11 |t  , μ a11

and hence the limit is of the form tf1 (A; B), where f1 is linear in B. A similar formula holds when s1 is replaced by s2 if |a22 | > 0. However, if a22 = 0, then s2 (A + μB)t − s2 (A)t |μb22 |t lim = lim = μ↓0 μ↓0 μ μ

(

0 if t > 1 , |b22 | if t = 1 .

Thus, the limit exists for t ≥ 1; however, it can only be expressed as the  real part of a function f2 (A; B) that is linear in B if t > 1. Exercise 30.5. Let ⎡ ⎡ ⎤ 1 1 0 0 0 

⎢0 ⎢0 1 0 0⎥ B B 11 12 ⎢ ⎥ A=⎢ ⎣0 0 0 0⎦ , B = B21 B22 = ⎣0 1 0 0 0 0

0 2 1 0

0 1 3 0

⎤ 1

 0⎥ ⎥, B > = B11 O , O B22 0⎦ 4

334

30. Singular values redux II

where Bij ∈ C 2×2 for i, j = 1, 2. Compute lim μ↓0

> − λj (A) λj (A + μB) μ

for j = 1, . . . , 4 ,

where λ1 (C) ≥ · · · ≥ λn (C) for all matrices C that intervene. > = ∅ in ExExercise 30.6. Show by direct computation that σ(B) ∩ σ(B)  4 4 > = ercise 30.5, but nevertheless j=1 λj (B) j=1 λj (B). Explain why this is not surprising. The basic facts for A, B ∈ C n×n are: (1) The limit in (30.14) exists for each singular value sj (A), but it is not necessarily equal to the real part of a linear function of B. (2) The limit of the full sum is (30.21)

lim μ↓0

n  sj (A + μB)t − sj (A)t j=1

μ

= tft (A; B) ,

where ft (A; B) is linear in B when 1 < t < ∞. Exercise 30.7. Show that if a and b belong to an inner product space U , then a + μb, a + μbU − a, aU lim = 2b, aU , μ↓0 μ i.e., the limit is the real part of a complex-valued function that is linear in b. Exercise 30.8. Compute the limit referred to in the previous exercise explicitly when U = C p×q equipped with the inner product A, BU = trace B H A, and express it as the real part of a complex-valued function that is linear in B.  Exercise 30.9. Show that if A ∈ C p×q , then f (A) = kj=1 sj (A)t is convex on C p×q for each choice of k ∈ {1, . . . , , q} and 1 ≤ t < ∞, and it is strictly convex if 1 < t < ∞. [HINT: f (A)1/t is a norm when 1 ≤ t < ∞.] Exercise 30.10. A real-valued function ϕ is said to be Fr´echet differentiable on C p×q if (30.22)

lim μ↓0

ϕ(A + μB) − ϕ(A) =  f (A; B) μ

30.4. Supplementary notes

335

exists and f (A; B) is linear in B for every choice of A, B ∈ C p×q . Show that if a Fr´echet differentiable function ϕ on C p×q has a local maximum or local minimum at A, then the limit in (30.22) is equal to zero for every B ∈ C p×q .

30.4. Supplementary notes The last section was developed to provide backing for the next chapter.

Chapter 31

Approximation by unitary matrices

Given a basis {a1 , . . . , an } of C n , we wish to find an orthonormal basis n n {w , . . . , wn } of C such that the error incurred by replacing j=1 cj aj by n1 cj wj is small for a reasonable set of coefficients, say for all cj with     j=1 n 2 j=1 |cj | ≤ 1. Upon setting A = a1 · · · an , W = w1 · · · wn , and  T c = c1 · · · cn , the identity n 

cj aj −

j=1

n 

cj wj = (A − W )c

j=1

enables us to reformulate the problem to that of approximating an invertible matrix A ∈ C n×n by a unitary matrix W ∈ C n×n . This chapter is devoted to evaluating this approximation in terms of the function (31.1)

ϕt (X) =

n 

sj (X)t

for X ∈ C n×n and t > 0 ,

j=1

which is equal to the t’th power of a norm when t ≥ 1. The main conclusion is: Theorem 31.1. If A ∈ C n×n is invertible and t > 1, then (31.2)

n 4  3 |sj (A) − 1|t . min ϕt (A − W ) : W W H = W H W = In = j=1

337

338

31. Approximation by unitary matrices

Moreover, in terms of the unitary factors V and U in the singular value decomposition A = V SU H : (1) The minimum in (31.2) is attained by exactly one unitary matrix W = V U H , which can be characterized as the one and only unitary matrix for which W H A is positive definite (as in (6) of Theorem 15.1). (2) If A  O, then the minimum in (31.2) is attained by W = In .

31.1. Approximation in the Frobenius norm To warm up, we shall first evaluate (31.2) for t = 2, the square of the Frobenius norm, because this computation is easy, thanks to the identities ϕ2 (A − W ) =

n 

sj (A − W )2 = trace(A − W )H (A − W )

j=1

=

n 

(A − W )ej  = 2

j=1

n 

|aij − wij |2 ,

i,j=1

where ej denotes the j’th column of In . Since A is presumed to be invertible, it admits a singular value decomposition A = V SU H in which V, U ∈ C n×n are unitary and S is the diagonal n × n matrix with entries s1 (A) ≥ · · · ≥ sn (A) > 0. Thus, in view of (2) and (3) of Theorem 15.6, sj (A − W ) = sj (V SU H − W ) = sj (V [S − V H W U ]U H ) = sj (S − Z) , wherein Z = V H W U is unitary. Consequently, ϕ2 (A − W ) = ϕ2 (S − Z) =

n 

|si − zii | + 2

i=1

n  i,j=1,i=j

|zij | ≥ 2

n 

|si − zii |2 ,

i=1

with equality if and only zij = 0 for i = j, i.e., if and only if Z is a diagonal matrix. Since Z is also unitary, the constraint |zii | = 1 must also be met. n Thus, as si > 0, the sum i=1 |si − zii |2 will attain its minimum value when zii = 1 for i = 1, . . . , n, i.e., when V H W U = Z = In , i.e., when W = V U H . Thus, ϕ2 (A − W ) is minimized by choosing W = V U H , the outside factors in the singular value decomposition A = V SU H . But then (31.2) holds for t = 2 and W H A = U V H V SU H = U SU H , which is clearly positive definite. It remains to show that there is exactly one unitary matrix W with the property that W H A is positive definite. But, if W1 and W2 are unitary matrices such that both W1H A and W2H A are positive definite, then W1H A = (W1H A)H = AH W1

and

W2H A = (W2H A)H = AH W2 .

31.2. Approximation in other norms

339

Therefore, (W1H A)2 = (W1H A)H (W1H A) = AH A = (W2H A)H (W2H A) = (W2H A)2 . But this serves to identify both W1H A and W2H A as positive definite square roots of the positive definite matrix AH A. Therefore, they must coincide: W1H A = W2H A. Thus, as A is invertible, W1 = W2 . This completes the proof of (1). Finally, to verify (2), observe that if A  O, then V = U and the  minimum is achieved by W = V V H = In .

31.2. Approximation in other norms In this section we shall establish Theorem 31.1. We begin, however, with a general principle that will be useful in the rest of this section. Lemma 31.2. If P, Z ∈ C n×n , P  0, and Z H Z = ZZ H = In , then (31.3)

P Z = ZH P

if and only if

P Z = ZP and σ(P Z) ⊂ R .

Proof. If P Z = Z H P , then (Z H P Z)2 = Z H P ZZ H P Z = Z H P P Z = P ZZ H P = P 2 . Therefore, Z H P Z = P , since they are both positive semi-definite square roots of P 2 . Consequently P Z = ZP , and, as P Z = (P Z)H , σ(P Z) ⊂ R. Conversely, if P Z = ZP , then, since P and Z are both normal matrices, Theorem 13.5 ensures that there exists an orthonormal basis {u1 , . . . , un } such that P uj = ρj uj with ρj ≥ 0 and Zuj = ωj uj with |ωj | = 1 for j = 1, . . . , n. Consequently, P Zuj = ρj ωj uj for j = 1, . . . , n, and hence, in view of the condition σ(P Z) ⊂ R, ρj ωj = ρj ωj . Therefore, Z H uj = ωj uj = ωj uj and P Zuj = Z H P uj for j = 1, . . . , n.  It is readily checked that if A = V SU H with unitary factors V and U , then ϕt (A − V U H ) = ϕt (V (S − In )U H ) = ϕt (S − In ) and hence that the minimum in (31.2) (which exists since ϕt is a continuous function acting on a closed bounded subset of C n×n ) is ≤ ϕt (S − In ). Theorem 31.1 guarantees that the last inequality is in fact an equality when t > 1. The verification is carried out next in a number of small steps, many of which are of independent interest.

340

31. Approximation by unitary matrices

1. If X, Y ∈ C n×n , α, β ∈ C, and t ≥ 1, then |ϕt (X + αY ) − ϕt (X + βY )| ≤ tn|α − β|Y (2X + (|α| + |β|)Y )t−1 .

In view of Lemma 30.6, |ϕt (X + αY ) − ϕt (X + βY )| ≤ ≤

n  j=1 n 

|sj (X + αY )t − sj (X + βY )t | t|α − β|Y {sj (X + αY ) + sj (X + βY )}t−1

j=1

≤ tn|α − β|Y {s1 (X + αY ) + s1 (X + βY )}t−1 , which leads easily to the asserted inequality. 2. If W is a unitary matrix such that ϕt (A − W ) ≤ ϕt (A − B), B unitary, > H is the singular value decomposition of A − W , then and V> S>U (31.4)

> S>t−1 V> H W x, x ∈ R for t ≥ 1 and every vector x ∈ C n . U

Let {u1 , . . . , un } be an orthonormal basis for C n and let M (θ) = e



u1 uH 1

+

n 

uj uH j

for − π < θ < π .

j=2

Then M (θ) is a unitary matrix with M (0) = In . Thus, if W ∈ C n×n is a unitary matrix such that ϕt (A − W ) ≤ ϕt (A − B) as B runs through the set of unitary matrices in C n×n , then ϕt (A − W M (θ)) = ϕt (A − W − W (M (θ) − In )) = ϕt (A − W − (eiθ − 1)W u1 uH 1 ) ≥ ϕt (A − W ) for small |θ|, since |eiθ − 1| ≤ |θ| when θ ∈ R. To ease the reading, let X = iθ A − W , Y = W u1 uH 1 , f (θ) = ϕt (X − (e − 1)Y ), and g(θ) = ϕt (X − iθY ). Then the last inequality implies that (31.5)

f (θ) − g(θ) + g(θ) − g(0) f (θ) − f (0) = ≥0 θ θ

when θ > 0 and is sufficiently small. Since |eiθ − 1 − iθ| ≤ |θ|2 /2!, the inequality in step 1 ensures that limθ↓0 [f (θ) − g(θ)]/θ = 0. Therefore, in

31.2. Approximation in other norms

341

view of formula (30.15), lim θ↓0

f (θ) − f (0) g(θ) − g(0) > S>t−1 V> H Y } ≥ 0 . = lim = t trace {−iU θ↓0 θ θ

−iθ H Since n the Hopposite inequality is obtained by setting M (θ) = e u1 u1 + j=2 uj uj with θ > 0, we see that

> S>t−1 V> H W u1 , u1 } , > S>t−1 V> H Y } = t {−iU 0 = t trace {−iU which yields (31.4), since u1 is any unit vector in C n . > S>t−1 V> H W = W H V> S>t−1 U >H. 3. If W is as in 2 and t ≥ 1, then U This is an immediate consequence of step 2 and (11.25): If C ∈ C n×n and Cx, x ∈ R for every vector x ∈ C n , then (C − C H )x, x = Cx, x − x, Cx = 0 for every x ∈ C n , and hence C = C H . > = V> H W U > S>t−1 and 4. If W is as in 2 and t ≥ 1, then S>t−1 V> H W U > ) ⊂ R. σ(S>t−1 V> H W U > . Then, Z ∈ C n×n is unitary, S>t−1  O, and the Let Z = V> H W U preceding step implies that S>t−1 Z = Z H S>t−1 . Thus, S>t−1 Z = Z S>t−1 and σ(S>t−1 Z) ⊂ R by Lemma 31.2. > = V> H W U > S> = U > H W H V> S. > 5. If W is as in 2 and t > 1, then S>V> H W U To this point we know that S>t−1 Z = Z S>t−1 = Z H S>t−1 and σ(S>t−1 Z) ⊂ > and t > 1. But as S>t−1 and Z are both normal R when Z = V> H W U operators, Theorem 13.5 ensures that there exists an orthonormal basis {u1 , . . . , un } of C n such that Zuj = ωj uj with |ωj | = 1 and S>t−1 uj = ρj uj with ρj ≥ 0 for j = 1, . . . , n. Moreover, ωj = ±1 if ρj > 0. Consequently, > j = ρ1/(t−1) uj and Su j > j = ρ1/(t−1) ωj uj = Z H Su > j > j = ρ1/(t−1) ωj uj = Z Su SZu j j Therefore, 5 holds.

for j = 1, . . . , n .

342

31. Approximation by unitary matrices

6. If W is as in 2, A  O, and t > 1, then AW = W A and W = W H . > H is the singular value decomposition of A − W . Then, Recall that V> S>U H as A = A and W is unitary, > U >H > {S>V> H W U > }U >H = U > {U > H W H V> S} AW − In = (A − W )H W = U > H = W H (A − W ) = W H A − In . = W H V> S>U Therefore, AW = W H A and hence AW = W A by Lemma 31.2. Thus, W H A = W A and, as A is invertible, W = W H . 7. If W is as in 2, A  O, and t > 1, then ϕt (A − W ) ≥ with equality if and only if W = In .

n

j=1 |sj (A)

− 1|t

Since A and W are normal matrices that commute, Theorem 13.5 ensures that there exists an orthonormal basis {u1 , . . . , un } of C n such that Auj = μj uj

and

W uj = ωj uj

for j = 1, . . . , n .

Consequently, (A − W )uj = (μj − ωj )uj for j = 1, . . . , n. Thus, if the eigenvectors are indexed so that |μ1 − ω1 | ≥ |μ2 − ω2 | ≥ · · · ≥ |μn − ωn | , then Theorem 29.12 ensures that n n n    sj (A − W )t ≥ |λj (A − W )|t = |μj − ωj |t . j=1

j=1

j=1

However, μj > 0, since A  O, and ωj = ±1, since W = W H and W H W = In . Therefore, the sum on the far right is clearly minimized when ωj = 1 for j = 1, . . . , n, i.e., if and only if W = In . Moreover, since A  O,  n n t = t |μ − 1| j j=1 j=1 |sj (A) − 1| . 8. Verification of (1) and (2) of Theorem 31.1. If A = V SU H with S = diag {s1 (A), . . . , sn (A)} and V, U ∈ C n×n unitary, then sj (A − W ) = sj (V SU H − W ) = sj (V [S − V H W U ]U H ) = sj (S − V H W U ) for every unitary matrix W ∈ C n×n . Thus, as S  O and V H W U is unitary, step 7 ensures that n  H |sj (A) − 1|t , ϕt (A − W ) = ϕt (S − V W U ) ≥ ϕt (S − In ) = j=1

UH.

Thus (1) holds; (2) then follows with equality if and only if W = V H  from (1) because V U = In when A  O.

31.2. Approximation in other norms

343

Remark 31.3. The passage from S t−1 Z = ZS t−1 to SZ = ZS in the verification of step 5 exploited the fact that S t−1 and Z are both normal matrices. It is also of interest to show that if B ∈ C n×n and S α B = BS α for some α > 0, then SB = BS. Towards this end, we first show that (31.6)

S α B = BS α =⇒ S α/2 B = BS α/2 .

Under the given assumptions S α B − S α/2 BS α/2 = BS α − S α/2 BS α/2 and hence S α/2 (S α/2 B − BS α/2 ) + (S α/2 B − BS α/2 )S α/2 = O . Thus, if qij denotes the ij entry of the matrix Q = S α/2 B − BS α/2 , then α/2

si

α/2

qij + qij sj α/2

= 0,

α/2

which implies that qij = 0 if si + sj > 0. If rank S = r and r < n, then this ensures that qij = 0 for all the indices ij except possibly if i > r and j > r. However, if i > r and j > r, then qij = 0 by definition. Thus, (31.6) holds and serves to ensure that if S α B = BS α for some α > 0, then the same equality holds for an α ∈ (0, δ) for any choice of δ > 0, no matter how small. Consequently, there exists a positive integer m such that mα < 1 ≤ (m + 1)α and 0 < α < δ. Let τ = mα. Then S τ B = BS τ and 0 < 1 − τ ≤ δ. Therefore, (31.7) SB − BS = (S − S τ )B − B(S − S τ ) ≤ S − S τ  2B ≤ κδ , where κ is a positive constant that depends only upon S. Thus, SB = BS. Exercise 31.1. Verify the upper bound in (31.7). [HINT: Exercise 21.3 does part of the job.] Exercise 31.2. Show that if A, U ∈ C n×n , A  O, and U is unitary, then In − A ≤ U − A ≤ In + A. Exercise 31.3. Show that if A, U ∈ C n×n , A  O, U is unitary, and t ≥ 1, then ϕt (In − A) ≤ ϕt (U − A) ≤ ϕt (In + A). Exercise 31.4. Show that if A, U ∈ C n×n , A  O, U is unitary, and t > 0, then ϕt (In − A) ≤ ϕt (U − A) ≤ ϕt (In + A). Exercise 31.5. Show that if 0 < a < 1,



 cos θ − sin θ 1+a 0 , A= , and Wθ = sin θ cos θ 0 1−a then ϕ1 (A−Wθ ) = 2aϕ1 (A−I2 ) for all points θ for which cos θ > (2−a2 )/2. [HINT: Exploit the formula (s1 + s2 )2 = trace B H B + 2{det B H B}1/2 for the singular values s1 ≥ s2 of B ∈ C 2×2 to minimize the calculations.]

344

31. Approximation by unitary matrices

31.3. Supplementary notes This chapter was adapted from a 1980 article [1] by Aiken, Erdos, and Goldstein that was motivated by a problem in quantum chemistry. In [1], the analysis was carried out in a Hilbert space setting. Exercises 31.2–31.5 are adapted from [1].

Chapter 32

Linear functionals

This chapter develops a number of basic properties of linear functionals and related applications, including the Hahn-Banach extension theorem and the Hahn-Banach separation theorem in finite-dimensional normed linear spaces.

32.1. Linear functionals A linear transformation from a complex (resp., real) vector space X into C (resp., R) is called a linear functional. Thus, if f belongs to the set X  of linear functionals on X , then f (αx + βy) = αf (x) + βf (y) for every choice of x, y ∈ X and every pair of scalars α, β. The set X  is a vector space with respect to the natural rules of addition and scalar multiplication: if f, g ∈ X  and α is a scalar, then (f + g)(x) = f (x) + g(x) and

(αf )(x) = α f (x) for every x ∈ X .

The main facts to keep in mind are: Theorem 32.1. If X is a finite-dimensional vector space, then: (1) X  is a finite-dimensional vector space and dim X = dim X  . (2) If X is a normed linear space with norm xX , then X  is a normed linear space with norm f X  = max{|f (x)| : x ∈ X and xX = 1} = max{|f (x)| : x ∈ X and xX ≤ 1} 2 ( |f (x)| : x ∈ X and x = 0 . = max xX 345

346

32. Linear functionals

The numerical value of f X  depends upon the choice of norm in X ; see Exercise 32.2. (3) (X  ) = X . Proof. To verify (1), suppose that {x1 , . . . , xn } is a basis for X and define the linear functionals fj , j = 1, . . . , n, by the rule ( 1 if j = i , fj (xi ) = 0 if j = i .  Then, if f ∈ X  , x = nj=1 cj xj , and f (xj ) = αj , ⎞ ⎛ n n n n     cj xj ⎠ = cj f (xj ) = cj αj = cj αj fj (xj ) f (x) = f ⎝ j=1

= i.e., f =

n

n 

)

αj fj

j=1

j=1 n  i=1

j=1 αj fj .

j=1

*

ci xi

=

n  j=1

αj fj (x) =

j=1 n 

(αj fj )(x) ,

j=1

This proves that span{f1 , . . . , fn } = X  .

To complete the verification of (1), it remains to  show that the linear functionals f1 , . . . , fn are linearly independent. But if nj=1 aj fj (x) = 0 for some set of scalars a1 , . . . , an and every vector x ∈ X , then ai =

n 

aj fj (xi ) = 0 for i = 1, . . . , n .

j=1

Thus, {f1 , . . . , fn } is a basis for X  and hence (1) holds. The verification of (2) and (3) is left to the reader.



Exercise 32.1. Let f (x) be a linear functional on C n and let f (ej ) = aj for j = 1, . . . , n, where ej denotes the j’th column of In . Show that if A = a1 · · · an and 1 ≤ s ≤ ∞, then max {|f (x)| : xs = 1} = As,∞ . Exercise 32.2. Let f (x) be a linear functional on C n and let f (ej ) = aj for j = 1, . . . , n, where ej denotes the j’th column of In . Show that if T  a = a1 · · · an , then max {|f (x)| : xs = 1} = as , where s = s/(s − 1) if 1 < s < ∞; s = 1 if s = ∞; and s = ∞ if s = 1. [HINT: The inequality |f (x)| ≤ as xs is covered by Theorem 10.4. Equality is obtained by making an appropriate choice for x; if 1 < s < ∞,  try the vector x with entries xj = aj |aj |s −2 when aj = 0.]

32.1. Linear functionals

347

Theorem 32.2. If f is a linear functional from a complex (resp., real) vector space X into C (resp., R), then: (1) The set Nf = {x ∈ X : f (x) = 0} is a subspace of X . (2) If Nf = X , then there exists a vector u ∈ X such that ( ˙ {αu : α ∈ R} Nf + if X is a real vector space , X = ˙ if X is a complex vector space . Nf + {αu : α ∈ C} (3) If X is a finite-dimensional inner product space, then there exists exactly one vector v ∈ X such that (32.1)

f (x) = x, vX

for every vector x ∈ X .

Proof. The proof of (1) is left to the reader. To verify (2), suppose that Nf = X . Then there exists a vector u ∈ X such that f (u) = 0. Thus, as       f (x) f (x) f (x) u + u and x− u ∈ Nf , (32.2) x= x− f (u) f (u) f (u) it is clear that X = Nf + span{u} . Moreover, this sum is direct, because if w = αu belongs to Nf , then f (w) = αf (u) = 0, which forces α = 0 and hence w = 0. Therefore, (2) holds. If X is a finite-dimensional inner product space and Nf = X , then there exists a nonzero vector u ∈ X that is orthogonal to Nf . Thus, as f (u) = 0, the formulas in (32.2) are in force. Moreover, as u is orthogonal to Nf , x, uX =

f (x) u, uX , f (u)

i.e., (32.1) holds with v = f (u) u/u, uX when Nf = X . Since (32.1) holds with v = 0 when Nf = X , the proof of the existence of at least one vector v ∈ X for which (32.1) holds is complete. It remains to show that there is only one such vector v. This is left to the reader.  Exercise 32.3. Show that there is only one vector v ∈ X for which (32.1) holds. The next two exercises are formulated in terms of the notation x + V = {x + v : v ∈ V} for a vector x and a vector space V. Exercise 32.4. Show that if V and W are subspaces of a vector space X and x, y ∈ X , then (32.3)

x + V = y + W ⇐⇒ V = W

and x − y ∈ V ∩ W .

348

32. Linear functionals

A subset Q of an n-dimensional vector space X is a hyperplane if there exists a vector x ∈ X and an (n − 1)-dimensional subspace V of X such that Q = x + V. Exercise 32.5. Show that a subset Q of an n-dimensional vector space X is a hyperplane if and only if there exists a nonzero linear functional f ∈ X  and a scalar α such that Q = {x ∈ X : f (x) = α}.

32.2. Extensions of linear functionals If f is a linear functional on a vector space X , then the function ϕ(x) = |f (x)| meets the following two conditions: ϕ(x + y) ≤ ϕ(x) + ϕ(y)

(32.4)

and ϕ(αx) = αϕ(x)

for every choice of x, y ∈ X and α ≥ 0. A real-valued function ϕ(x) that meets the two conditions in (32.4) is said to be sublinear. Lemma 32.3. If f is a linear functional on a proper subspace V of a finitedimensional real vector space X and p is a sublinear functional on all of X such that f (v) ≤ p(v) for every v ∈ V, then there exists a linear functional g on X such that g(v) = f (v) for every v ∈ V and g(x) ≤ p(x) for every x ∈ X. Proof. Since X is finite dimensional, it suffices to show that if w ∈ X \ V, then there exists a linear functional g on the vector space V + {αw : α ∈ R} such that g(v + αw) = f (v) + αg(w) and

g(v + αw) ≤ p(v + αw)

for every α ∈ R and every v ∈ V. The last inequality will hold if and only if g(v ± αw) ≤ p(v ± αw) for every α > 0 and every v ∈ V. But this imposes two constraints on g(w): g(w) ≤ p(α−1 v + w) − f (α−1 v)

and

− g(w) ≤ p(α−1 v − w) − f (α−1 v) .

In order to meet these constraints, we must choose g(w) so that f (x) − p(x − w) ≤ g(w) ≤ p(y + w) − f (y) for every choice of x, y ∈ V , if possible. But the development f (x) +f (y) = f (x+y) ≤ p(x+y) = p(x−w +y +w) ≤ p(x−w) +p(y +w) implies that m = max {f (x) − p(x − w) : x ∈ V} ≤ min {p(y + w) − f (y) : y ∈ V} = M. Therefore, if we set g(w) = γ for any point γ ∈ [m, M ] and take α > 0, we see that g(x + αw) = f (x) + αγ ≤ f (x) + α(p(y + w) − f (y)) for every y ∈ V

32.3. The Minkowski functional

349

and g(x − αw) = f (x) − αγ ≥ f (x) − α(p(y − w) − f (y)) for every y ∈ V . Thus, if y = x/α, then it is readily checked that g(x + αw) ≤ f (x) + α(p(α−1 x + w) − f (α−1 x)) = p(x + αw) and g(x − αw) ≤ f (x) + α(p(α−1 x − w) − f (α−1 x)) = p(x − αw) , as needed to define an extension on a space of dimension equal to dim V + 1. Since X is finite dimensional, we can obtain an extension on the full space X by repeating this procedure a finite number of times.  The next theorem is a finite-dimensional version of the Hahn-Banach extension theorem. Theorem 32.4. If f is a linear functional on a proper subspace V of a finitedimensional normed linear space X , then there exists a linear functional g on the full space X such that: (1) f (v) = g(v) for every vector v ∈ V. (2) max {|f (v)| : v ∈ V and vX ≤ 1} = max {|g(x)| : x ∈ X and xX ≤ 1}. Discussion. If X is a real normed linear space and κ = max {|f (v)| : v ∈ V and vX ≤ 1}, then p(x) = κ xX is a sublinear function on X and f (v) ≤ p(v) for every vector v ∈ V. Therefore, Lemma 32.3 guarantees that there exists a linear functional g on the full space X such that (1) holds and g(±x) ≤ p(±x) = κ xX . Consequently, |g(x)| ≤ κ xX , which leads easily to (2). These conclusions can be extended to complex normed linear spaces; see, e.g., the proof of Theorem 7.30 in [30]. 

32.3. The Minkowski functional Let X be a normed linear space and let Q ⊆ X . Then the functional ! x pQ (x) = inf t > 0 : ∈ Q t is called the Minkowski functional. If the indicated set of t is empty, then pQ (x) = ∞. Recall that in a normed linear space X , Br (0) = {x ∈ X : x < r}, and a subset B of X is bounded if there exists an R > 0 such that B ⊂ BR (0).

350

32. Linear functionals

Lemma 32.5. If Q is a convex subset of a normed linear space X such that Q ⊇ Br (0)

for some

r > 0,

then: (1) pQ (x + y) ≤ pQ (x) + pQ (y) for x, y ∈ X . (2) pQ (αx) = αpQ (x) for α ≥ 0 and x ∈ X . (3) |pQ (x) − pQ (y)| ≤ (2/r)x − yX for x, y ∈ X (and hence pQ (x) is a continuous function of x on X ). (4) If Q is open, then Q = {x ∈ X : pQ (x) < 1}. (5) If Q is closed, then Q = {x ∈ X : pQ (x) ≤ 1}. (6) If Q is bounded, then pQ (x) = 0 =⇒ x = 0. Proof. Let x, y ∈ X and suppose that α−1 x ∈ Q and β −1 y ∈ Q for some choice of α > 0 and β > 0. Then, since Q is convex, α β x+y = (α−1 x) + (β −1 y) α+β α+β α+β belongs to Q and hence, pQ (x + y) ≤ α + β . Consequently, upon letting α run through a sequence of values α1 ≥ α2 ≥ · · · that tend to pQ (x) and letting β run through a sequence of values β1 ≥ β2 ≥ · · · that tend to pQ (y), it is readily seen that pQ (x + y) ≤ pQ (x) + pQ (y) . Suppose next that α > 0 and pQ (x) = a. Then there exists a sequence of numbers t1 , t2 , . . . such that tj > 0, x ∈ Q and tj

lim tj = a .

j↑∞

Therefore, since αx ∈ Q and αtj

lim αtj = αa ,

j↑∞

pQ (αx) ≤ αpQ (x) . However, the same argument yields the opposite inequality: αpQ (x) = αpQ (α−1 αx) ≤ αα−1 pQ (αx) = pQ (αx) . Therefore, equality prevails. This completes the proof of (2) when α > 0. But, (2) also holds when α = 0, because pQ (0) = 0.

32.4. Separation theorems

351

Next, to verify (3), we first observe that pQ (x) ≤ (2/r)xX

for every point x ∈ X ,

since pQ (0) = 0 and ((2/r)x)−1 X x ∈ Br (0) if x = 0. Thus, pQ (x) = pQ (x − y + y) ≤ pQ (x − y) + pQ (y) ≤ (2/r)x − yX + pQ (y) and pQ (y) = pQ (y − x + x) ≤ pQ (y − x) + pQ (x) ≤ (2/r)y − xX + pQ (x) for every choice of x, y ∈ X , which justifies (3). Items (4) and (5) are left to the reader. Finally, to verify (6), suppose that pQ (x) = 0. Then there exists a sequence of points α1 ≥ α2 ≥ · · · decreasing to 0 such that αj−1 x ∈ Q. Therefore, since Q ⊆ BR (0) for some R > 0, the inequality αj−1 xX ≤ R implies that xX ≤ αj R for j = 1, 2 . . . and hence that x = 0.  Exercise 32.6. Complete the proof of Lemma 32.5 by verifying items (4) and (5). Exercise 32.7. Show that in the setting of Lemma 32.5, pQ (x) < 1 =⇒ x ∈ Q and x ∈ Q =⇒ pQ (x) ≤ 1.

32.4. Separation theorems In this section we shall use the Hahn-Banach extension theorem to justify the key steps in the proof of Theorem 32.9, which is the main conclusion of this section. Lemma 32.6. If A is a nonempty open convex set in a real normed linear space X such that 0 ∈ A, then there exists a linear functional f ∈ X  such that f (a) > 0 for every vector a ∈ A. Proof. Choose a point a0 ∈ A and let p(x) = pQ (x), the Minkowski functional for the set Q = a0 − A = {a0 − a : a ∈ A} . Then Q is an open convex subset of X that contains the point 0 and hence Q = {x ∈ X : p(x) < 1} . Thus, as a0 ∈ Q, p(a0 ) ≥ 1. Let Y = {αa0 : α ∈ R}

and define h(αa0 ) = αp(a0 )

for all α ∈ R .

Then h is a linear functional on Y such that h(αa0 ) = p(αa0 ) if α ≥ 0 , and h(αa0 ) = αp(a0 ) < 0 ≤ p(αa0 )

if α < 0 .

352

32. Linear functionals

Therefore, h(y) ≤ p(y) for every y ∈ Y. Consequently, Lemma 32.3 ensures that there exists a linear functional f ∈ X  such that f (y) = h(y) for every y ∈ Y and f (x) ≤ p(x) for every x ∈ X . Therefore, 1 > p(a0 − a) ≥ f (a0 − a) = f (a0 ) − f (a) = p(a0 ) − f (a) , i.e., f (a) > p(a0 ) − 1 ≥ 0 for every a ∈ A, as needed.



Lemma 32.7. If A and B are nonempty convex sets in a finite-dimensional real normed linear space such that A∩B = ∅ and A is open, then there exists a linear functional f ∈ X  and a point c ∈ R such that (32.5)

f (b) ≤ c < f (a)

for every a ∈ A and b ∈ B .

Proof. Let Q = A − B. Then Q is an open convex subset of X that does not contain the point 0, since A ∩ B = ∅. Therefore, by Lemma 32.6, there exists a linear functional f ∈ X  such that f (q) > 0 for every vector q ∈ Q. Thus, f (a − b) = f (a) − f (b) > 0 for every a ∈ A and every b ∈ B. But, as f (A) is an open convex subset of R, f (A) = (c, d), and hence f (b) ≤ c < f (a) for every a ∈ A and b ∈ B . 

as claimed.

Lemma 32.8. If B is a nonempty closed convex set in a finite-dimensional real normed linear space X such that B ⊂ Q for some open subset Q of X and B is bounded, then there exists an r > 0 such that B + Br (0) ⊆ Q for some r > 0. Proof. If the assertion is false, then there exists a sequence of vectors bj +xj such that bj ∈ B and xj  < 1/j for j = 1, 2, . . . such that bj + xj ∈ X \ Q. Since B is a closed bounded + subset of X , a subsequence of the bj tends to a limit b. But then b ∈ B (X \ Q), since both of these sets are closed. But this contradicts the fact that B ⊂ Q.  Theorem 32.9. If A and B are nonempty closed convex sets in a finitedimensional real normed linear space X such that A ∩ B = ∅ and B is bounded, then there exist a linear functional f ∈ X  and a pair of numbers c, d ∈ R such that (32.6)

f (b) < c < d < f (a)

for every choice of a ∈ A and b ∈ B (i.e., A and B are strictly separated by the hyperplane Q = {x : f (x) = c + 2−1 (d − c)}). Proof. Since B ⊆ X \ A and X \ A is an open subset of X , Lemma 32.8 ensures that there exists an r > 0 such that B + Br (0) ⊆ X \ A. Therefore, ? (B + B(r/2) (0)) (A + B(r/2) (0)) = ∅ .

32.5. Another path

353

Thus, as the two sets on the left are disjoint convex open sets, Lemma 32.7 guarantees that there exists a linear functional f ∈ X  such that f (b + x) ≤ d < f (a + y) for every b ∈ B, a ∈ A, and x, y ∈ B(r/2) (0). To finish, fix x ∈ Br/2 (0) such that f (x) > 0. Then f (b) ≤ d − f (x) < d − 2−1 f (x) < d < f (a) and (32.6) follows upon setting c = d − 2−1 f (x).



Theorem 32.9 supplies a key step in the proof of Farkas’s lemma, which is an important result in linear programming. Lemma 32.10 (Farkas). If A ∈ R p×q and b ∈ Rp , then either (1) there exists a vector x ∈ R≥q such that Ax = b or (2) there exists a vector y ∈ R p such that AT y ∈ R≥q and bT y < 0, but not both. Proof. Let A = {Ax : x ∈ R≥q }. If b ∈ A, then (1) is in force. If b ∈ A, then C = {a − b : a ∈ A} is a closed convex subset of R p and 0 ∈ C. Therefore, Theorem 32.9 (with B = {0} and A replaced by C) guarantees that there exist a linear functional f on R p and a number c ∈ R such that f (a − b) > c > f (0) = 0

for every a ∈ A .

Thus, there exists a vector y ∈ R p such that yT (Ax − b) > 0 for every x ∈ R≥q . Consequently, yT b < 0 and (32.7)

xT AT y > yT b

for every x ∈ Rq≥ .

But this implies that AT y ∈ Rq≥ , because if (AT y)j < 0 for some j ∈ {1, . . . , q}, then there exists a t > 0 such that t eTj (AT y) < yT b, which violates the inequality (32.7). Finally, if x ∈ R≥q is a vector for which (1) holds and y ∈ R p is a vector for which (2) holds, then yT Ax = yT b, which is impossible, since yT Ax ≥ 0  and yT b < 0. Therefore, (1) and (2) cannot both hold.

32.5. Another path In this section we shall present a direct proof of a special case of Theorem 32.9 that is formulated in a finite-dimensional real inner product space X , because it is instructive. Warning: To avoid clutter, we shall drop the subscripts X in xX and x, yX throughout this section.

354

32. Linear functionals

Lemma 32.11. If Q is nonempty closed convex subset of a finite-dimensional real inner product space X and a ∈ X , then there exists exactly one vector qa ∈ Q such that a − qa  ≤ a − q

(32.8)

for every q ∈ Q .

Proof. Let d = inf {a − q : q ∈ Q}. If a ∈ Q, then d = 0 and qa = a. If a ∈ Q, Then there exists a sequence of vectors q1 , q2 , . . . such that a − qj  ≤ d + 1/j . Since this sequence, qj  = qj − a + a ≤ qj − a + a ≤ d + 1/j + a , is bounded, a subsequence qj1 , qj2 , . . . will tend to a limit q , and the bounds d ≤ a − q  ≤ a − qjk  + qjk − q  ≤ d + 1/jk + qjk − q  clearly imply that a − q  = d ≤ a − q

(32.9)

for every q ∈ Q .

If also q ∈ Q and a − q  ≤ a − q

for every q ∈ Q ,

then, by the parallelogram law, 4d2 = 2q − a2 + 2q − a2 = q − a + q − a2 + q − q 2 = 4a − (q + q )/22 + q − q 2 ≥ 4d2 + q − q 2 . Therefore, q = q .



Lemma 32.12. Let Q be a closed nonempty convex subset of X , let a ∈ X , and let qa be the unique element in Q that is closest to a. Then (32.10)

q − qa , a − qa  ≤ 0

for every q ∈ Q .

Proof. Let q ∈ Q. Then clearly (1 − t)qa + tq ∈ Q for every number t in the interval 0 ≤ t ≤ 1. Therefore, a − qa 2 ≤ a − (1 − t)qa − tq2 = a − qa − t(q − qa )2 = a − qa 2 − 2ta − qa , q − qa  + t2 q − qa 2 , since t ∈ R and X is a real inner product space. But this in turn implies that 2ta − qa , q − qa  ≤ t2 q − qa 2

32.5. Another path

355

and hence that 2a − qa , q − qa  ≤ tq − qa 2 for every t in the interval 0 < t ≤ 1. The inequality (32.10) now drops out easily upon letting t ↓ 0.  Exercise 32.8. Show that if X = R n in Lemma 32.12, then 2a−qa , q−qa  is equal to the directional derivative (Du f )(x0 ) = lim t↓0

f (x0 + tu) − f (x) = (∇f )(x0 ), u t

for f (x) = x2 and appropriate choices of x0 and u (when (∇f )(x0 ) is written as a column vector). When is (∇f )(x0 ) = 0? Lemma 32.13. Let Q be a closed nonempty convex subset of X , let a, b ∈ X , and let qa and qb denote the unique elements in Q that are closest to a and b, respectively. Then qa − qb  ≤ a − b .

(32.11) Proof. Let

α = 2qb − qa , a − qa 

and

β = 2qa − qb , b − qb  .

In view of Lemma 32.12, α ≤ 0 and β ≤ 0. Therefore a − b2 = (a − qa ) − (b − qb ) + (qa − qb )2 = (a − qa ) − (b − qb )2 − α − β + qa − qb 2 ≥ qa − qb 2 , 

as claimed.

The inequality (32.11) implies that if Q is a closed nonempty convex subset of X , then the mapping from a ∈ X → qa ∈ Q is continuous. This fact will be used to advantage in the next proof: Theorem 32.14. If A and B are nonempty closed convex sets in a finitedimensional real inner product space X such that A∩B = ∅ and B is bounded, then there exist a linear functional f ∈ X  and a pair of numbers c, d ∈ R such that (32.12)

f (b) < c < d < f (a)

for every choice of a ∈ A and b ∈ B.

356

32. Linear functionals

Proof. Let ax denote the unique point in A that is closest to x ∈ X . Then, by Lemma 32.12, x − ax , a − ax  ≤ 0 for every a ∈ A . Moreover, in view of Lemma 32.13, g(x) = x − ax  is a continuous function of x ∈ X . In particular, g is continuous on the closed bounded set B, and hence there exists a vector b0 ∈ B such that b0 − ab0  ≤ b − ab  ≤ b − ab0  for every b ∈ B. Let a0 = ab0 . Then b0 − a0  ≤ b − a0  for every b ∈ B, and hence, as B is convex, b0 − a0 2 ≤ (1 − t)b0 + tb − a0 2 = t(b − b0 ) − (a0 − b0 )2 = t2 b − b0 2 − 2tb − b0 , a0 − b0  + a0 − b0 2 for 0 ≤ t ≤ 1. But this implies that 2b − b0 , a0 − b0  ≤ tb − b0 2 for every t in the interval 0 < t ≤ 1 and hence that b − b0 , a0 − b0  ≤ 0 for every b ∈ B , which, upon setting f (x) = x, a0 − b0 , yields the inequality (32.13)

f (b0 ) = b0 , a0 − b0  ≥ b, a0 − b0  = f (b) for every b ∈ B .

Moreover, since a0 = ab0 , Lemma 32.12 implies that a0 − a, a0 − b0  ≤ 0 for every a ∈ A and hence that (32.14)

f (a) = a, a0 − b0  ≥ a0 , a0 − b0  = f (a0 ) for every a ∈ A .

Consequently, in view of (32.14) and (32.13), f (a) ≥ f (a0 ) = a0 − b0 2 + f (b0 ) ≥ a0 − b0 2 + f (b) , for every choice of a ∈ A and b ∈ B. The inequality (32.6) is obtained by  setting d = f (a0 ) − ε and c = f (b0 ) + ε with 0 < 2ε < a0 − b0 2 .

32.6. Supplementary notes

357

32.6. Supplementary notes This chapter is partially adapted from Chapters 7 and 22 in [30]. Section 32.4 is adapted from Chapter 4 of Conway [19], which establishes a separation theorem from a Hahn-Banach extension theorem in a much more general setting, whereas Section 32.5 is adapted from Section 2.4 in the monograph by Webster [75], which is an eminently readable source of supplementary information on convexity in R n . Almost any textbook on functional analysis will contain more general formulations of the finite-dimensional versions of the Hahn-Banach extension theorem (Theorem 32.4) and the Hahn-Banach separation theorem (Theorem 32.9) considered here; see, e.g., Bollob´as [12] and Conway [19].

Chapter 33

A minimal norm problem

In this chapter we will consider the following problem: Given A ∈ C p×n and a vector b ∈ RA , find (33.1)

min {x1 : x ∈ Sb } ,

where (33.2)

Sb = {x ∈ C n : Ax = b} .

If the minimization in (33.1) is carried out with respect to x2 instead of x1 , then this problem is similar to problems that were resolved in Chapter 15 by invoking the singular value decomposition A = V1 S1 U1H , in which V1 ∈ C p×r and U1 ∈ C n×r are isometric matrices, S1 = diag{s1 , . . . , sr } is a positive definite matrix, and r = rank A. In particular, Ax = b ⇐⇒ V1 S1 U1H x = b ⇐⇒ U1H x = S1−1 V1H b . If r = n, then U1 is unitary and the problem is not interesting because the equation Ax = b will have only one solution: x = U1 S1−1 V1H b = A† b = (AH A)−1 AH b , and hence Sb is a set with only one vector.

  If r < n, then U1 can be embedded in a unitary matrix U = U1 U2

and (33.3)

Sb = {A† b + U2 y : y ∈ Cn−r } . 359

360

33. A minimal norm problem

Thus, as A† b + U2 y22 = A† b22 + U2 y22 , it is clear that min {x2 : x ∈ Sb } = A† b2 .

(33.4)

Minimization with respect to x1 requires a totally different approach that falls within the class of dual extremal problems that will be introduced in the next section. Exercise 33.1. Verify the formulas in (33.3) and (33.4).

33.1. Dual extremal problems Recall that if X is a finite-dimensional normed linear space and f ∈ X  , the set of linear functionals on X , then f X  = max {|f (x)| : x ∈ X and xX = 1} .   T n, x = x v1 · · · · · · x , and v = Moreover, if X = C 1 n n  f (x) = j=1 xj vj belongs to X and, as we shall see shortly,

vn

T

, then

xX = x1 =⇒ f X  = v∞ . Theorem 33.1. Let X be a finite-dimensional complex normed linear space, let U be a subspace of X , and let U ◦ = {f ∈ X  : f (u) = 0 for every u ∈ U }. Then for each vector x ∈ X , min x − uX = max◦ |f (x)| .

(33.5)

u∈U

f ∈U f X  ≤1

Proof. If x ∈ U , then (33.5) is self-evident, since both sides of the asserted equality are equal to zero. Suppose therefore that x ∈ U . Then for any f ∈ U ◦ with f X  ≤ 1 and any u ∈ U , |f (x)| = |f (x − u)| ≤ f X  x − uX ≤ x − uX . Therefore, (33.6) max {|f (x)| : f ∈ U ◦ and f X  ≤ 1} ≤ min{x − uX : u ∈ U } . To obtain the opposite inequality for x ∈ U , define the linear functional g on the subspace W = {αx + u : α ∈ C and

u ∈ U}

by the formula g(αx + u) = αd , Then α = 0 =⇒ α = 0 =⇒

with d = min{x − uX : u ∈ U } .

αx + uX = uX ≥ 0 = |g(αx + u)| , αx + uX = |α| x + u/αX ≥ |α| d = |g(αx + u)| .

33.2. Preliminary calculations

361

Thus, |g(w)| ≤ wX

for every w ∈ W .

Theorem 32.4 guarantees the existence of a linear functional h on the full space X such that h(w) = g(w) for every w ∈ W and hX  = max {|g(w)| : wX = 1} . Thus, h ∈

U ◦,

hX  ≤ 1, and h(x) = g(x) = d. Therefore,

max {|f (x)| : f ∈ U ◦ and f X  ≤ 1} ≥ |h(x)| = |h(x − u)|

(33.7)

= min{x − uX : u ∈ U } , 

and the proof is complete.

Exercise 33.2. Show that the conclusions of Theorem 33.1 are also valid when X is a finite-dimensional real normed linear space. [HINT: f ∈ U ◦ ⇐⇒ −f ∈ U ◦ and u ∈ U ⇐⇒ −u ∈ U .] Remark 33.2. Theorem 33.1 is also valid if X is a Banach space, but with inf in place of min and sup in place of max: If x ∈ X , then inf{x − u : u ∈ U } = max{|f (x)| : f ∈ U ◦ and f  ≤ 1} , whereas if f ∈ X  , then min{f − g : g ∈ U ◦ } = sup{|f (u)| : u ∈ U

and

u ≤ 1} .

The proof is much the same, except that the Hahn-Banach theorem is invoked in place of the finite-dimensional version considered here.

33.2. Preliminary calculations  If u = u1 · · · (33.8)

un

 T and v = v1 · · · vn , it is readily seen that % % % n % % % % uj vj %% ≤ u1 v∞ . |u, v| = % % j=1 %

T

However, more is true: Theorem 33.3. If u, v ∈ C n , then (33.9)

v1 = max {|u, v| : u∞ ≤ 1}

and (33.10)

v∞ = max {|u, v| : u1 ≤ 1} .

362

33. A minimal norm problem

Moreover, if equality holds in (33.8), then (33.11)

uj = 0

|vj | < v∞ .

if

Proof. To verify (33.9) when v = 0, let u be the vector with entries ( vj /|vj | if vj = 0 , uj = 0 if vj = 0 . Then u∞ = 1 and u, v =

n 

uj vj =

j=1

n 

|vj | = u∞ v1 .

j=1

To verify (33.10) when v = 0, let {1, . . . , n} = Ω1 ∪ Ω2 , where (33.12)

Ω1 = {j : |vj | = v∞ }

and Ω2 = {j : |vj | < v∞ }

and let u be the vector with entries ( if j ∈ Ω1 , tj vj /|vj | uj = 0 if j ∈ Ω2 ,   where tj > 0 and j∈Ω1 tj = 1. Then u1 = j∈Ω1 tj = 1 and u, v =

n 

uj vj =

j=1



|uj | v∞ = u1 v∞ .

j∈Ω1

To verify (33.11), suppose that % % % % n % % % uj vj %% = u1 v∞ . % % % j=1 Then, in terms of the notation (33.12), % % % % n n  % % % |uj | v∞ = u1 v∞ = % uj vj %% % % j=1 j=1 ≤

n  j=1

which implies that 0≥

 j∈Ω2

|uj vj | =



|uj | v∞ +

j∈Ω1



|uj | |vj | ,

j∈Ω2



|uj | [v∞ − |vj |] ≥

|uj | ε

j∈Ω2

for some ε > 0 and hence that uj = 0 if j ∈ Ω2 .



33.2. Preliminary calculations

363

Exercise 33.3. Show that if u, v ∈ C n , then u1 = max {|u, v| : v∞ ≤ 1}

(33.13) and

u∞ = max {|u, v| : v1 ≤ 1} .

(33.14)

In view of (33.11), solutions x of the minimization problem with respect to the norm x1 will typically have many zero entries. The situation is markedly different if the minimization is carried out with respect to x2 , or with respect to any of the other classical norms xt for 1 < t < ∞, as is spelled out in the next theorem. Theorem 33.4. If u, v ∈ C n , 1 < t < ∞, and t = t/(t − 1), then ut = max {|u, v| : vt = 1}

(33.15) and

vt = max {|u, v| : ut = 1} .

(33.16)

Moreover, if u = 0 and u, v = ut for some v ∈ C n with vt = 1, then uj for some γ ∈ C if uj = 0 . (33.17) vj = 0 if uj = 0 and vj = γ |uj |2−t  T T  Proof. Let u = u1 · · · un and v = v1 · · · vn . Then H¨older’s inequality ensures that % % % % n % % t % . uj vj %% ≤ ut vt with 1 < t < ∞ and t = |u, v| = % t−1 % % j=1 Therefore, max {|u, v| : vt = 1} ≤ ut . Moreover, if uj = 0 and γ > 0, then uj vj = γ|uj | ⇐⇒ vj = γuj |uj | t

t−2

=⇒

n 

uj vj = γ

j=1

and hence the equality

n 

j=1

uj vj = ut

j=1

is attained by choosing γ = ut

⎧ n ⎨ ⎩

j=1

|uj |t

⎫−1 ⎬ ⎭

=

⎧ n ⎨ ⎩

j=1

|uj |t

n 

⎫−1/t ⎬ ⎭

.

|uj |t

364

33. A minimal norm problem

It is also readily checked that vt = 1 and hence that (33.15) holds; (33.16) is obtained from (33.15) by interchanging the roles of u and v and t and t . Finally, since

% % % % n n %  % |u, v| = %% uj vj %% ≤ |uj | |vj | ≤ ut vt , % j=1 % j=1

it is clear that |u, v| = ut vt if and only if the two inequalities in the last display are both equalities, i.e., if and only if there exist a θ ∈ [0, 2π) and a δ ≥ 0 such that uj vj = eiθ |uj vj | and



|vj |t = δ|uj |t

for j = 1, . . . , n ;

see Exercise 9.16 and Theorem 9.6. The second condition implies that −1

|vj | = δ 1−t |uj |t−1

for j = 1, . . . , n .

Thus, vj = 0 if uj = 0 and, in view of the first condition, vj = eiθ

|uj | uj −1 |vj | = eiθ δ 1−t uj |uj |t−2 |vj | = eiθ uj |uj |

if uj = 0 .

Therefore, (33.17) holds.



The next theorem provides some explicit evaluations for f X  when f ∈ X  and X = C n . Theorem 33.5. If X = C n and f (x) = x, v for some vector v ∈ C n , then: (1) xX = x1 =⇒ f X  = v∞ . (2) xX = x∞ =⇒ f X  = v1 . (3) xX = xt , 1 < t < ∞, =⇒ f X  = vt , where 1/t = 1−1/t. Proof. Assertions (1)–(3) follow from the fact that f X  = max {|f (x)| : x ∈ C n and xX ≤ 1} = max {|x, v| : x ∈ C n and xX ≤ 1} and the formulas (33.10), (33.9), and (33.16), respectively.

33.3. Evaluation of (33.1) Theorem 33.6. If A ∈ C p×n with rank A = p, p < n, and b ∈ RA , then min {x1 : x ∈ Sb } (33.18) = max {|b, y| : y ∈ C p and AH y∞ ≤ 1} . Moreover, if 1 < s < ∞ and 1/s + 1/t = 1, then (33.19) min {xs : x ∈ Sb } = max {|yH b| : y ∈ C p and AH yt ≤ 1} .



33.4. A numerical example

365

Proof. This theorem is a special case of Theorem 33.1, in which X = C n and U = NA . Thus, a linear functional f (x) = x, v belongs to U ◦ if and only if v = AH y for some vector y ∈ C p . Moreover, since xX = x1 , f X  = AH y∞

and

x, AH y = Ax, y = b, y .

Consequently, (33.18) holds; (33.19) is left to the reader.



> ∈ C n is a solution of the minimization problem in Lemma 33.7. If x > ∈ C p is a solution of the maximization problem in (33.18), (33.18) and y then > | = > > ∞ x1 AH y |> x, AH y

(33.20) and (hence)

>=0 eTj x

(33.21)

if

> | < AH y > ∞ . |eTj AH y

Proof. Under the given assumptions, it is readily seen with the help of (33.9) that x − u1 : u ∈ NA } = max {|> x, AH y| : AH y∞ = 1} > x1 = min {> > | ≤ > > ∞ ≤ > x1 x1 AH y = |> x, AH y and hence that (33.20) holds. Formula (33.21) then follows from formula (33.11) in Theorem 33.3.  Exercise 33.4. Verify the equality (33.19). Exercise 33.5. Show that if s = t = 2 in (33.19), then min {x2 : x ∈ Sb } = max {|yH b| : y ∈ NAH and y∞ ≤ 1} = A† b . Exercise 33.6. Show that if A ∈ C p×q , c ∈ C q , 1 < s < ∞, and 1/s+1/t = 1, then min {c − ys : y ∈ NA } = max {|cH AH x| : x ∈ C p and AH xt ≤ 1} .

33.4. A numerical example Let

1 1/2 1/3 1/4 1/5 A= 1 −1/2 1/4 −1/8 1/16

 and

 3 b= . 4

Then {y ∈ R 2 : AT y∞ ≤ 1} is clearly a convex subset of R 2 , which is  T given explicitly by the set of y = c d that satisfy the following 5 sets of inequalities: −1 ≤ c + d ≤ 1, −2 ≤ c − d ≤ 2, −1 ≤ c/3 + d/4 ≤ 1, −1 ≤ c/4 − d/8 ≤ 1, and −1 ≤ c/5 + d/16 ≤ 1 .

366

33. A minimal norm problem

The first two sets of constraints define a closed parallelogram Q with vertices







 1/2 3/2 −1/2 −3/2 v1 = , v2 = , v3 = , and v4 = ; −3/2 −1/2 3/2 1/2 and it turns out that the last three constraints are automatically satisfied by the points in Q; see Exercise 33.7. Thus, to this point, we have reduced the problem to finding max {|b, y| : y ∈ Q}. But, as y ∈ Q if and only if −y ∈ Q, this maximum is equal to max {b, y : y ∈ Q} . However, since the four vertices are exactly the extreme points of Q, the Krein-Milman theorem ensures that it suffices to evaluate b, vj  for these four points v1 , . . . , v4 . Thus, as b, v1  = −9/2, b, v2  = 5/2, b, v3  = 9/2, b, v4  = −5/2 , it follows that max {b, y : AT y∞ ≤ 1} = max {b, y : y ∈ Ω} = 9/2 = b, v3  , > = v3 . Moreover, as i.e., y ⎡ ⎤ ⎡ ⎤ 1 1 1 ⎢1/2 −1/2⎥  ⎢ −1 ⎥ ⎢ ⎥ −1/2 ⎢ ⎥ T ⎢ ⎥ ⎥ 5/24 =⎢ A v3 = ⎢1/3 1/4 ⎥ ⎢ ⎥ ⎣1/4 −1/8⎦ 3/2 ⎣−5/16⎦ 1/5 1/16 67/80

and AT v3 ∞ = 1 ,

> are equal to zero. Theorem 33.3 implies that the last three entries of x Consequently,



   3 1 1/2 1/3 1/4 1/5 1 1/2 x >1 >= >= = , Ax x >2 4 1 −1/2 1/4 −1/8 1/16 1 −1/2 x  T > = 7/2 −1 0 0 0 and hence that > x1 = 9/2, which implies that x as it should be in order that >  = b, y >  = b, v3  = > x, AT y

9 = > x1 AT v3 ∞ . 2

Exercise 33.7. Verify the claim that the last three constraints in the set of five constraints listed above are automatically met by every point in the closed parallelogram Q. [HINT: It is enough to check that every convex combination of the extreme points v1 , . . . , v4 meets these three constraints. Why?]

33.6. Supplementary notes

367

33.5. A review To summarize, we solved the problem formulated in (33.1) by the following steps: (1) Find the extreme points v1 , . . . , vk of the closed convex bounded set {y ∈ R p : AT y∞ ≤ 1}. (2) Find the extreme points vi1 , . . . , vir at which max {b, vj  : j = > be any convex combination of these r 1, . . . , k} is attained and let y > is a solution of the maximization problem. extreme points. Then y > is a solution of the > )j | < AT y > ∞ . Then x (3) Let Ω2 = {j : |(AT y equation A> x = b with (> x)j = 0 if j ∈ Ω2 ; it is a solution of the minimization problem. >  = > > ∞ . (4) Check that > x, AT y x1 AT y

33.6. Supplementary notes This chapter was adapted from a recent article [18] by R. Cheng and Y. Xu, which treats an infinite-dimensional version of the problem considered here. The problem (33.1) is in some sense an approximation to the following open problem: Given A ∈ C p×n with rank A = p and p < n and a vector b ∈ C p , find (33.22)

min {ν(x) : x ∈ C n

and Ax = b} ,

where ν(x) = the number of nonzero entries in x.

Chapter 34

Conjugate gradients

The method of conjugate gradients is an iterative approach to solving equations of the form Ax = b for matrices A ∈ R n×n that are positive definite and vectors b ∈ R n . It is based on the observation that if A  O, then the gradient (∇ϕ)(x) of the function (34.1)

1 ϕ(x) = Ax, x − b, x 2

(written as a column vector) is equal to ⎡ ∂ϕ ⎢ ∂x1 ⎢ (∇ϕ)(x) = ⎢ ... ⎣ ∂ϕ ∂xn

⎤ ⎥ ⎥ ⎥ = Ax − b , ⎦

and the Hessian Hϕ (x) = A  O. Therefore, ϕ is strictly convex, and hence the solution x of the equation Ax = b is the unique point in R n at which ϕ(x) attains its minimum value, i.e., (34.2)

ϕ(A−1 b) < ϕ(x) if x ∈ R n and x = A −1 b .

The method of conjugate gradients exploits this fact in order to find the solution of this equation recursively, as the limit of the solutions x1 , x2 , . . . of a sequence of minimization problems. Lemma 34.1. If A ∈ R n×n is positive definite and Q is a closed convex subset of R n , then there exists exactly one point q ∈ Q at which the function ϕ(x) defined by formula (34.1) attains its minimum value, i.e., at which ϕ(q) ≤ ϕ(x)

for every

x ∈ Q. 369

370

34. Conjugate gradients

Proof. Let s1 ≥ · · · ≥ sn denote the singular values of A. Then, sn > 0 and 1 ϕ(x) = Ax, x − b, x 2 1 sn x22 − b2 x2 ≥ 2   1 sn x2 − b2 , = x2 2 which is clearly positive if x2 > 2b2 /sn . Thus, as ϕ(0) = 0, ϕ(x) will achieve its lowest values in the set Q ∩ {x : x2 ≤ 2b2 /sn }. Since this set is closed and bounded and ϕ(x) is a continuous function of x, ϕ(x) will attain its minimum value in this set. Moreover, as the Hessian of ϕ is equal to A and A  O, ϕ(x) is strictly convex (see also Exercise 34.2). Therefore, Theorem 18.2 ensures that ϕ(x) attains its minimum at exactly one point in the subset Q.  Exercise 34.1. Show that if u ∈ R n and V is a subspace of R n , then the set u + V = {u + v : v ∈ V} is a closed convex subset of R n . Exercise 34.2. Verify directly that the function ϕ(x) defined by formula (34.1) is strictly convex. [HINT: Check that the identity 1 (34.3) ϕ(tu + (1 − t)v) = tϕ(u) + (1 − t)ϕ(v) − t(1 − t)A(u − v), u − v 2 n is valid for every pair of vectors u and v in R and every number t ∈ R.] Let (34.4)

Hj = span{b, Ab, . . . , Aj−1 b}

for j = 1, . . . , n .

Then, in view of Lemma 34.1 and Exercise 34.1, there exists a sequence of points xj ∈ R n such that (34.5)

xj ∈ Hj

and ϕ(xj ) < ϕ(x)

if x ∈ Hj and x = xj .

Lemma 34.2. If A ∈ R n×n is positive definite, b ∈ R n , and the spaces Hj are defined by (34.4), then: (1) A−1 b ∈ Hn . (2) If Hk = Hk+1 for some positive integer k ≤ n − 2, then Hk+1 = Hk+2 . (3) If A−1 b ∈ Hk for some positive integer k < n, then A−1 b = xk . Proof. Let det (λIn − A) = a0 + a1 λ + · · · + an−1 λn−1 + λn .

34. Conjugate gradients

371

Then the Cayley-Hamilton theorem implies that a0 In + · · · + an−1 An−1 + An = O and hence, as a0 = det (−An ) = (−1)n det A = 0, that 1 In = − [a1 A + · · · + an−1 An−1 + An ] . a0 Therefore, 1 A−1 b = − [a1 In + · · · + an−1 An−2 + An−1 ]b a0 which belongs to Hn . Thus, (1) holds. To verify (2), suppose next that Hk = Hk+1 for some integer k ≤ n − 2. Then Ak b ∈ Hk . Therefore, Ak+1 b ∈ Hk+1 , which ensures that Hk+1 = Hk+2 , as claimed. Finally, if A−1 b ∈ Hk for some positive integer k < n, then ϕ(A−1 b) ≤ ϕ(xk ) ≤ ϕ(A−1 b) , since

ϕ(A−1 b) ≤ ϕ(x)

for every x ∈ R n

and ϕ(xk ) ≤ ϕ(x) for every x ∈ Hk . Thus, Lemma 34.1 ensures that A−1 b = xk .



The notation u, vA = Au, vst = vH Au will be used in the sequel. Lemma 34.3. If Hj is a proper subspace of Hj+1 , then (34.6)

Axj − b, x = 0

for every x ∈ Hj

and (34.7)

xj+1 − xj , xA = 0

for every x ∈ Hj .

Proof. If x ∈ Hj , then ϕ(xj + εx) − ϕ(xj ) ≥ 0

for every x ∈ Hj and ε > 0 .

Thus, as the left-hand side is equal to ε(∇ϕ)(c), x for some point c between xj + εx and xj , it follows that (∇ϕ)(xj ), x = lim ε↓0

ϕ(xj + εx) − ϕ(xj ) ≥0 ε

for every vector x ∈ Hj . But, by the same reasoning, (∇ϕ)(xj ), −x ≥ 0 for every vector x ∈ Hj . Thus, (34.6) holds.

372

34. Conjugate gradients

To verify (34.7), it suffices to note that xj+1 − xj , xA = A(xj+1 − xj ), xst = Axj+1 − b, xst − Axj − b, xst = 0 

for every x ∈ Hj by (34.6).

34.1. The recursion In this section we shall establish a recursive procedure for calculating the points x1 , x2 , . . . specified in (34.5) that avoids the need for solving minimization problems. It is convenient to express some formulas in terms of the orthogonal projection ΠXj from R n onto the subspace Xj = {α(xj − xj−1 ) : α ∈ R} with respect to the inner product ·, ·A : (34.8)

ΠXj : u ∈ R n →

u, xj − xj−1 A (xj − xj−1 ) xj − xj−1 , xj − xj−1 A

for j = 1, . . . , n. Lemma 34.4. If Hj is a proper subspace of Hj+1 and x0 = 0, then: (1) The set of vectors {Ax0 − b, Ax1 − b, . . . , Axj − b} is a basis for Hj+1 that is orthogonal with respect to the standard inner product. (2) The set of vectors {x1 − x0 , x2 − x1 , . . . , xj+1 − xj } is a basis for Hj+1 that is orthogonal with respect to the inner product ·, ·A . (3) There exists exactly one vector vj ∈ Hj+1 such that vj , Ay = 0 for every vector y ∈ Hj and Axj − b − vj ∈ Hj ; it may be specified in terms of the projection ΠXj by the formulas (34.9)

vj = Axj − b − ΠXj (Axj − b) = ΠXj+1 (Axj − b) .

Proof. The first two assertions follow easily from (34.6) and (34.7), respectively. Suppose next that vj and wj are vectors in Hj+1 that meet the two constraints in (3). Then vj − wj = (Axj − b − wj ) − (Axj − b − vj ) ∈ Hj and vj − wj is orthogonal to Hj in the inner product ·, ·A . Therefore, vj = wj . Moreover, in view of (2), Axj − b =

j+1  i=1

ci (xi − xi−1 ) ,

with ci =

Axj − b, xi − xi−1 A . xi − xi−1 , xi − xi−1 A

34.1. The recursion

373

The condition Axj − b, x = 0 for x ∈ Hj forces ci = 0 for i = 1, . . . , j − 1. Thus, Axj − b = cj+1 (xj+1 − xj ) + cj (xj − xj−1 ) . Therefore, the vector vj = cj+1 (xj+1 − xj ) = Axj − b − cj (xj − xj−1 ) meets the two constraints in (3). Since ci (xi − xi−1 ) = ΠXi (Axj − b), the two formulas for vj in the last display coincide with the formulas in (34.9). Thus, the proof is complete.  Lemma 34.5. The initial vectors in the recursion are x1 =

(34.10)

b, b b Ab, b

and (34.11)

v1 = Ax1 − b −

Ax1 − b, Ab b. Ab, b

Proof. Since x1 = tb for some t ∈ R, ϕ(tb) =

t2 Ab, b − tb, b , 2

which clearly achieves its minimum value when t = t1 = b, b/Ab, b. Thus, (34.10) holds; (34.11) follows from the first equality in (34.9), since  X1 = {αb : α ∈ R}. The next three exercises are a warmup to the general recursion that will be presented in Theorem 34.6. Exercise 34.3. Verify the identities Ax1 − b, x2 − x1 A = v1 , x2 − x1 A and Ax1 − b, x2 − x1 A = −x2 − x1 , x2 − x1 A . [HINT: A(x2 − x1 ) = (Ax2 − b) − (Ax1 − b).] Exercise 34.4. Show that x2 = x1 − γ1 v1

with γ1 =

Ax1 − b, v1  . Av1 , v1 

[HINT: Use the second equality in (34.9) and the two identities that were referred to in Exercise 34.3.] Exercise 34.5. Show that v2 = Ax2 − b − β1 v1

with β1 =

Ax2 − b, Av1  . Av1 , v1 

374

34. Conjugate gradients

Theorem 34.6. The general recursion is given by the formulas xj+1 = xj − γj vj

(34.12)

with γj =

Axj − b, vj  Avj , vj 

and (34.13)

vj+1 = Axj+1 − b − βj vj

with βj =

Axj+1 − b, Avj  Avj , vj 

for j = 1, . . . , n − 1 and the initial conditions x1 and v1 given by (34.10) and (34.11), respectively. Proof. Formula (34.12) is obtained by inverting the second formula for vj in (34.9): Axj − b, xj+1 − xj A vj = (xj+1 − xj ) , xj+1 − xj , xj+1 − xj A in order to obtain an expression for xj+1 −xj in terms of vj . The verification rests on the following two extensions of the identities in Exercise 34.3: Axj − b, xj+1 − xj A = Axj − b − vj , xj+1 − xj A + vj , xj+1 − xj A = vj , xj+1 − xj A and xj+1 − xj , xj+1 − xj A = (Axj+1 − b) − (Axj − b), xj+1 − xj  = −Axj − b, xj+1 − xj  . These identities enable us to write xj+1 − xj =

xj+1 − xj , xj+1 − xj A Axj − b, xj+1 − xj  vj = − vj , Axj − b, xj+1 − xj A vj , xj+1 − xj A

which is equivalent to (34.12), since xj+1 − xj = δj vj . Formula (34.13) is immediate from (34.9): vj+1 = Axj+1 − b − ΠXj+1 (Axj+1 − b) , since Axj+1 − b, xj+1 − xj A (xj+1 − xj ) = βj vj .  xj+1 − xj , xj+1 − xj A



 2 1 1 Exercise 34.6. Let A = and b = − . Show that A  O and 1 1 1 then use the formulas in Lemma 34.5 and Theorem 34.6 to solve the equation Ax = b. ΠXj+1 (Axj+1 − b) =

34.2. Convergence estimates

Exercise 34.7. Let

375

⎡ ⎤ 3 1 0 A = ⎣1 1 1⎦ 0 1 2

⎡ ⎤ 0 ⎣ and b = 0⎦ . 1

Show that A  O and then use the formulas in Lemma 34.5 and Theorem 34.6 to solve the equation Ax = b.

34.2. Convergence estimates In this section we shall develop upper bounds on xj − A−1 bA . Theorem 34.7. If A ∈ R n×n is positive definite with singular values s1 ≥ · · · ≥ sn and singular value decomposition A = V SV T and A−1 b ∈ Hk , then - n .  c2 −1 2 2 i (34.14) xk − A bA = min q(si ) : q ∈ Pk and q(0) = 1 , si i=1

where Pk denotes the set of polynomials of degree ≤ k and c1 , . . . , cn are the entries in the vector c = V T b. Proof. The proof is divided into steps. 1. Verification of the formula (34.15)

xk − A−1 bA = min{x − A−1 bA : x ∈ Hk } .

If x ∈ Hk and k < m = dim H, then A−1 b = xm and x − A−1 b2A = x − xk + xk − xm 2A = x − xk 2A + xk − xm 2A , since xm − xk is orthogonal to Hk with respect to the inner product ·, ·A . Therefore, the assertion is now self-evident. 2. Verification of the formula xk − A−1 bA = min {p(A)b − A−1 bA } . p∈Pk−1

This is immediate from the fact that x ∈ Hk if and only if x = p(A)b for some polynomial of degree ≤ k − 1.

376

34. Conjugate gradients

3. Verification of (34.14). Since A = V SV T with V ∈ R n×n unitary, A[p(A)b − A−1 b], [p(A)b − A−1 b] = V SV T [V p(S)V T − V S −1 V T ]b, [V p(S)V T − V S −1 V T ]b = V S[p(S)V T − S −1 V T ]b, V [p(S)V T − S −1 V T ]b = S −1 [Sp(S) − In ]c, [Sp(S) − In ]c , 

which yields (34.14).

We shall not attempt to evaluate the minimization indicated on the right-hand side of (34.14). Instead, we shall obtain an upper bound on the right-hand side of (34.14) by choosing a particular polynomial q ∈ Pk with q(0) = 1. Towards this end, we introduce the Chebyshev polynomial Tk (x) =

1 1 1 {(x + x2 − 1)k + (x − x2 − 1)k } , 2

k = 0, 1, . . . ,

which really is a polynomial in x of degree k (since odd powers of the term √ 2 x − 1 cancel out). Moreover, Tk (x) > 2−1 (x +

1

x2 − 1)k > 1/2

if x > 1

and |Tk (x)| ≤ 1 if |x| ≤ 1, since (cos θ + i sin θ)k + (cos θ − i sin θ)k 2 ikθ −ikθ e +e = 2 = cos kθ .

Tk (cos θ) =

Now, assuming that s1 > sn , let  hk (x) = Tk

s1 + sn − 2x s1 − sn

 and

qk (x) =

hk (x) . hk (0)

Then hk (x) is a polynomial of degree k in x and |hk (si )| ≤ 1 for i = 1, . . . , n, since s1 + sn − 2si s1 − sn sn − s1 ≤ ≤ = 1. −1 = s1 − sn s1 − sn s1 − sn

34.3. Krylov subspaces

377

Consequently, qk (x) is a polynomial of degree k in x such that qk (0) = 1 and (34.16) √ 1 κ+1 2 and κ = s1 /sn > 1 . , where μ = √ qk (si ) ≤ = k −k hk (0) μ +μ κ−1 The upper bounds in (34.16) depend essentially upon the observation that @  √ κ+1 2 κ κ+1 κ+1 2 + + −1= κ−1 κ−1 κ−1 κ−1 √ √ κ+1 ( κ + 1)2 √ =√ =μ = √ ( κ + 1)( κ − 1) κ−1 and, by a similar calculation, κ+1 − κ−1

@

κ+1 κ−1

2

− 1 = μ−1 .

Upon inserting these bounds into (34.14), we see that if A  O and s1 > sn , then  2 n  c2j 2 xk − A−1 b2A ≤ sj μk + μ−k j=1 (34.17)  2 2 −1 = A b, bst . μk + μ−k The number κ = s1 /sn is called the condition number of the matrix A. If κ is close to 1, then μ is large and the upper bound in (34.17) tends to zero quickly.

34.3. Krylov subspaces The spaces Hj generated by b ∈ R n and A ∈ R n×n that are defined in (34.4) are examples of Krylov subspaces. The k’th Krylov subspace of R n generated by a nonzero vector u ∈ R n and a matrix A ∈ R n×n is defined by the formula (34.18)

Hk = span{u, Au, . . . , Ak−1 u}

for

k = 1, 2, . . . .

Clearly, dim H1 = 1, dim Hk ≤ k for k = 2, 3, . . ., and, if Hk+1 = Hk for some positive integer k, then Hj = Hk for every integer j ≥ k. Exercise 34.8. Show that if Hk+1 = Hk for some positive integer k, then Hj = Hk for every integer j ≥ k. [HINT: If Hk+1 = Hk for some positive integer k, then Ak u = ck−1 Ak−1 u + · · · + c0 u.]

378

34. Conjugate gradients

Exercise 34.9. Let A ∈ R n×n , let u ∈ R n , and let k ≥ 1 be a positive integer. Show that if A  0, then the matrix ⎤ ⎡ Au, u A2 u, u ··· Ak u, u ⎢ Au, Au ··· Ak u, Au ⎥ A2 u, Au ⎥ ⎢ ⎥ ⎢ .. .. ⎦ ⎣ . ··· . Au, Ak−1 u A2 u, Ak−1 u · · · Ak u, Ak−1 u is invertible if and only if the vectors u, Au, . . . , Ak−1 u are linearly independent in R n .

34.4. The general conjugate gradient method In this section we shall outline the conjugate gradient algorithm for computing a sequence of points x0 , x1 , . . . that tend to the solution of the equation Ax = b when A  O, starting from an arbitrary point x0 ∈ Rn for which Ax0 − b = 0 in a series of exercises. Correspondingly, the spaces (34.19)

Hj = span{u, Au, . . . , Aj−1 u}

with u = Ax0 − b = 0

for j = 1, . . . , n. Exercise 34.10. Show that if x0 ∈ Rn and X is a subspace of Rn , then there exists exactly one point u ∈ x0 + X such that ϕ(u) ≤ ϕ(x) for every point x ∈ x0 + X . Exercise 34.11. Let x0 ∈ Rn and assume that u = Ax0 − b = 0 and let xj ∈ x0 + Hj be the unique point in that set such ϕ(xj ) ≤ ϕ(x) for every point x ∈ x0 + Hj . Show that: (a) xj − x0 ∈ Hj for j = 1, 2, . . . , . (b) xj − xj−1 ∈ Hj for j = 1, 2, . . . , . (c) Axj−1 − b ∈ Hj for j = 1, 2, . . . , . Exercise 34.12. Show that Axj − b, x = 0 for every vector x ∈ x0 + Hj and hence that: (a) Axj = b if and only if Hj+1 = Hj . (b) If 1 ≤ j ≤ , then the vectors Ax0 − b, Ax1 − b, . . . , Axj−1 − b form an orthogonal basis for Hj with respect to the standard inner product. (c) If 1 ≤ j ≤ , then the vectors x1 − x0 , x2 − x1 , . . . , xj − xj−1 form an orthogonal basis for Hj with respect to the “A”-inner product, i.e., A(xs − xs−1 ), xt − xt−1  = 0 if s, t = 1, . . . ,  and s = t.

34.5. Supplementary notes

379

Exercise 34.13. Show that if 1 ≤ j ≤  − 1, then: (a) Axj − b ∈ span{xj+1 − xj , xj − xj−1 }. (b) There exists exactly one vector vj ∈ Hj+1 that is of the form vj = αj (xj+1 − xj ) with αj ∈ R such that Axj − b − vj ∈ Hj . Exercise 34.14. Use the formulas xj+1 − xj = γj vj

and

Axj+1 − b = vj+1 + βj vj

to establish the recursions Axj − b, vj  vj , Avj , vj  A(Axj+1 − b), vj  vj , = Axj+1 − b − Avj , vj 

xj+1 = xj − vj+1

for j = 0, . . . ,  − 1, with initial conditions x0 and v0 = Ax0 − b. The recursions in Exercise 34.14 enable us to solve sequentially for x1 , v1 , x2 , v2 , . . . and only involve simple computations; it is not necessary to compute A−1 or to solve minimization problems. Exercise 34.15. Use to solve the equation

the recursions

 in Exercise

34.14  2 1 3 1 Ax = b when A = ,b= , and x0 = . 1 1 2 0

34.5. Supplementary notes The discussion of the general conjugate gradient method in Section 34.4 is based partially on the analysis in Section 16.6 of [30], which was adapted from the discussion in the monographs of Luenberger [59] and of Trefethen and Bau [72]. The analysis in Sections 34.1 and 34.2 does not appear in [30]. The convergence estimates in Section 34.2 were adapted from the online article [68] by Shewchuk.

Chapter 35

Continuity of eigenvalues

In this chapter we shall use complex function theory to show that if A, B ∈ C n×n and A − B is small, then σ(A) is close to σ(B) in a sense that will be made precise in Theorem 35.7, which is the main result of this chapter. We shall assume that the reader is familiar with the elements of complex function theory. A quick introduction is presented in Chapter 17 of [30]. However, all you need to proceed is a willingness to accept the following facts: (1) A complex-valued function f (λ) of the complex variable λ that is defined in an open set Ω ⊆ C is said to be holomorphic (or analytic) in Ω if the limit f (λ + ξ) − f (λ) ξ→0 ξ

f  (λ) = lim

(35.1)

exists for every point λ ∈ Ω. This is a very strong constraint because the variable ξ in this difference ratio is complex and the definition requires the limit to be the same regardless of how ξ tends to zero. In fact, it is so strong that f

analytic in Ω =⇒ f  =⇒ f



analytic in Ω analytic in Ω =⇒ · · · ,

and hence (see Exercise 35.1) the function f and its successive derivatives f  , f  . . . are all continuous in Ω. 381

382

35. Continuity of eigenvalues

, (2) The contour integral Γ f (λ)dλ of a continuous complex-valued function that is defined on a smooth curve Γ that is parametrized by γ(t), a ≤ t ≤ b, is defined by the formula 6 b 6 f (λ)dλ = f (γ(t))γ  (t)dt . (35.2) Γ

a

The curve Γ is said to be simple if a < t1 , t2 < b and t1 = t2 , then γ(t1 ) = γ(t2 ). (3) If f (λ) is holomorphic in some open nonempty set Ω and Γ is a simple closed piecewise smooth curve in Ω (think of a rubber band) that is directed counterclockwise such that all the points enclosed by Γ also belong to Ω, then 6 1 f (λ)dλ = 0 (35.3) 2πi Γ and (35.4)

1 2πi

6 Γ

⎧ ⎨ f (k) (ω) f (λ) dλ = k! ⎩ (λ − ω)k+1 0

if ω is inside Γ , if ω is outside Γ

for k = 0, 1, . . .. (Here f (0) = f and f (k) denotes the k’th derivative of f for k = 1, 2, . . ..) The numerical value of the integral in (35.2) depends upon the curve Γ, but not upon the particular choice of the (one-to-one) function γ(t) that is used to describe the curve. Exercise 35.1. Show that if f is analytic in an open set Ω ⊆ C, then f is continuous in Ω. [HINT: |f (λ + ξ) − f (λ)| = |ξ| |(f (λ + ξ) − f (λ))/ ξ − f  (λ) + f  (λ)|.] Exercise 35.2. Show that if f and g are analytic in a nonempty open set Ω ⊆ C, then f g and f + g are analytic in Ω, and that if |f (ω)| > 0 for every point ω ∈ Ω, then 1/f is also analytic in Ω. Exercise 35.3. Show that if f (λ) is a polynomial, then f and ef are analytic in C; but if f (λ) = (λ − λ1 )−1 (λ − λ2 )−1 with λ1 = λ2 , then f is analytic in C \ {λ1 , λ2 }. Exercise 35.4. Show that if f is analytic in an open set Ω ⊆ C and ω ∈ Ω, then the function f (λ) − f (ω) if λ = ω , λ−ω g(λ) =  if λ = ω f (ω) is analytic in Ω.

35.1. Contour integrals of matrix-valued functions

383

Remark 35.1. The formulas in (35.4) are all obtained from (35.3). In particular, the formula for k = 0, 6 1 f (λ) (35.5) dλ = f (ω) for ω inside Γ , 2πi Γ (λ − ω) may be obtained by applying (35.3) to the function g considered in Exercise 35.4. Formula (35.5) implies that

2 6 ( 1 1 1 f (ω + ξ) − f (ω) = − dλ ξ 2πiξ Γ λ − ω − ξ λ − ω 2 6 ( 1 1 = dλ 2πi Γ (λ − ω − ξ)(λ − ω)

when |ξ| > 0 and ω + ξ is inside Γ and hence, upon letting ξ → 0, yields the formula in (35.4) for k = 1: 6 1 f (λ) dλ for ω inside Γ . (35.6) f  (ω) = 2πi Γ (λ − ω)2 Analogously, the formula for k = 2 is obtained from the formula for k = 1 by writing 2 6 ( 1 1 1 f  (ω + ξ) − f  (ω) dλ = − ξ 2πiξ Γ (λ − ω − ξ)2 (λ − ω)2 ( 2 6 2(λ − ω) − ξ 1 dλ = 2πi Γ (λ − ω − ξ)2 (λ − ω)2 and then letting ξ → 0. This justifies the claim that if f is analytic in Ω, then f  is also analytic in Ω. In much the same way one can obtain the formula for f (k) (ω) from the formula for f (k−1) (ω); see Exercise 35.5. Exercise 35.5. Show that if ω ∈ Ω, |ξ| > 0, and ω + ξ ∈ Ω, then 2 ( 6 1 f (k−1) (ω + ξ) − f (k−1) (ω) 1 =k dλ . lim f (λ) ξ→0 ξ (k − 1)! 2πi Γ (λ − ω)(k+1)

35.1. Contour integrals of matrix-valued functions The contour integral

6 F (λ)dλ Γ

of a p × q matrix-valued function ⎡ f11 (λ) · · · ⎢ .. F (λ) = ⎣ . fp1 (λ) · · ·

⎤ f1q (λ) ⎥ .. ⎦ . fpq (λ)

384

35. Continuity of eigenvalues

is defined by the formula 6

where



a11 · · · ⎢ .. F (λ)dλ = ⎣ . Γ ap1 · · ·

⎤ a1q .. ⎥ , . ⎦ apq

6 aij =

fij (λ)dλ , i = 1, . . . , p , j = 1, . . . , q ; Γ

i.e., each entry is integrated separately. At first glance, this may seem unnatural. However, it is a consequence of the fact that ⎤ ⎡   (λ) f11 (λ) · · · f1q ⎥ ⎢ .. .. F  (λ) = ⎣ ⎦. . .   fp1 (λ) · · · fpq (λ) It is readily checked that 6 6 6 {F (λ) + G(λ)}dλ = F (λ)dλ + G(λ)dλ Γ

Γ

Γ

and that if B and C are appropriately sized constant matrices, then 6  6 BF (λ)Cdλ = B F (λ)dλ C . Γ

Γ

Moreover, if ϕ(λ) is a scalar-valued function and C ∈ Cp×q , then 6  6 ϕ(λ)Cdλ = ϕ(λ)dλ C . Γ

Γ

Lemma 35.2. If Γ = {γ(t) : a ≤ t ≤ b} is a simple smooth curve that is parametrized by a function γ(t) ∈ C 1 (Q), where Q is an open subset of R that contains [a, b] and F (λ) is a continous p × q matrix-valued function on Γ, then 5 6 b 56 5 5 5 F (λ)dλ5 ≤ F (γ(t))|γ  (t)|dt . 5 5 Γ

a

Proof. This is a straightforward consequence of the triangle inequality applied to the Riemann sums that are used to approximate the integral.  (k)

Lemma 35.3. Let Cμ = μIk + N be a Jordan cell of size k × k and let Γ be a simple piecewise smooth counterclockwise directed closed curve in the complex plane C that does not intersect the point μ. Then ( 6 1 Ik if μ is inside Γ , (k) −1 (λIk − Cμ ) dλ = (35.7) Ok×k if μ is outside Γ . 2πi Γ

35.1. Contour integrals of matrix-valued functions

385

Proof. Clearly, λIk − Cμ(k) = (λ − μ)Ik − N and, since N k = Ok×k ,  (λIk − Cμ(k) )−1 = (λ − μ)−1 Ik − =

N λ−μ

−1

N Ik N k−1 + + · · · + , λ − μ (λ − μ)2 (λ − μ)k

when λ = μ. Therefore, 1 2πi

6 (λIk −

Cμ(k) )−1 dλ

Γ

2 6 k (  1 1 = dλ N j−1 . 2πi Γ (λ − μ)j j=1

But this yields the asserted formula, since 6 1 1 dλ = 0 2πi Γ (λ − μ)j and 1 2πi

6 Γ

1 dλ = λ−μ

(

if j > 1

1 if μ is inside Γ , 0 if μ is outside Γ .



Let A ∈ C n×n admit a Jordan decomposition of the form ⎡ ⎤⎡ T ⎤ J1 V1 ⎢ ⎥ ⎢ . ⎥ −1 . .. (35.8) A = U JU = [U1 · · · U ] ⎣ ⎦ ⎣ .. ⎦ , J VT where J1 , . . . , J denote the Jordan cells of J, U1 , . . . , U denote the corresponding block columns of U , and V1T , . . . , VT denote the corresponding block rows of U −1 . Consequently, (35.9)

A=

 

Ui Ji ViT

i=1

and, if Ji is a Jordan cell of size ni × ni , then (35.10)

(λIn − A)−1 = U (λIn − J)−1 U −1 =

 

Ui (λIni − Ji )−1 ViT .

i=1

If A has k distinct eigenvalues with geometric multiplicities γ1 , . . . , γk , then  = γ1 + · · · + γk in formula (35.10).

386

35. Continuity of eigenvalues

Lemma 35.4. Let A ∈ C n×n admit a Jordan decomposition of the form (35.8), where the ni × ni Jordan cell Ji = βi Ini + Nni , and let Γ be a simple piecewise smooth counterclockwise directed closed curve in C that does not intersect any of the eigenvalues of A. Then 1 2πi

(35.11)

6

(λIn − A)−1 dλ = Γ

where

( Xi =

 

Ui Xi ViT ,

i=1

In i 0ni ×ni

if βi if βi

is inside Γ , is outside Γ .

Proof. This is an easy consequence of (35.10) and Lemma 35.3.



It is readily checked that the sum on the right-hand side of formula (35.11) is a projection: )

 

*2 Ui Xi ViT

i=1

=

 

Ui Xi ViT .

i=1

Therefore, the integral on the left-hand side of (35.11), 6 1 A (λIn − A)−1 dλ , (35.12) PΓ = 2πi Γ is also a projection. It is termed the Riesz projection. Lemma 35.5. Let A ∈ C n×n , let det (λIn − A) = (λ − λ1 )α1 · · · (λ − λk )αk , where the points λ1 , . . . , λk are distinct, and let Γ be a simple piecewise smooth counterclockwise directed closed curve that does not intersect any of the eigenvalues of A. Then  (35.13) rank PΓA = αi where G = {i : λi is inside Γ} . i∈G

Proof. The conclusion rests on the observation that rank

 

Ui Xi ViT

& ' = rank U (diag{X1 , . . . , X }) V T

i=1

= rank (diag{X1 , . . . , X }) , since U and V T are invertible. Therefore, the rank of the indicated sum is equal to the sum of the sizes of the nonzero Xi that intervene in the formula  for PΓA , which agrees with formula (35.13).

35.2. Continuous dependence of the eigenvalues

387

35.2. Continuous dependence of the eigenvalues In this section we shall use the projection formulas PΓA to establish the continuous dependence of the eigenvalues of a matrix A ∈ C n×n on A. The strategy is to show that if B ∈ C n×n is sufficiently close to A, then PΓA − PΓB  < 1, and hence, as will be shown in the next lemma, rank PΓA = rank PΓB . Lemma 35.6. If P, Q ∈ C n×n , P = P 2 , Q = Q2 , and P − Q < 1, then (35.14)

rank P = rank Q .

Proof. If P −Q < 1, then the matrix In ±(P −Q) is invertible. Therefore, rank P = rank{P (In − (P − Q))} = rank{P Q} = rank{(In − (Q − P ))Q} = rank Q .



The notation Dr (μ) = {λ ∈ C : |λ − μ| < r} will be used in the next theorem. Theorem 35.7. Let A, B ∈ C n×n and let det(λIn − A) = (λ − λ1 )α1 · · · (λ − λk )αk , where the k points λ1 , . . . , λk are distinct. Then for every r > 0, there exists a δ > 0 such that if A − B < δ, then σ(B) ⊂

k A

Dr (λi ) .

i=1

Moreover, if 1 min {|λi − λj | : i = j} , 2 then each disk Dr (λi ) will contain exactly αi eigenvalues of B, counting algebraic multiplicities. r < r0 =

Proof. Let r < r0 and let Γj = Γj (r) denote a circle of radius r centered at λj and directed counterclockwise. This constraint on r ensures that Γj ∩Γi = ∅ if j = i and that λj is the only eigenvalue of A inside the closed disc Dr (λj ) = {λ ∈ C : |λ − λj | ≤ r}. The remainder of the proof is divided into steps.  1. If 0 < r < r0 and λ ∈ kj=1 Γj (r), then λIn − A is invertible and there exists a constant δr such that (35.15)

(λIn − A)

−1

 ≤ δr

for every point λ ∈

k A j=1

Γj (r) .

388

35. Continuity of eigenvalues

 It is clear that λIn − A is invertible for every point λ ∈ kj=1 Γj (r). Moreover, upon invoking the Jordan decomposition A = U JU −1 , we may write (λIn − A)−1  = U (λIn − J)−1 U −1  ≤ U  (λIn − J)−1  U −1  . Thus, as J = diag{Jλ1 , . . . , Jλk } with blocks Jλj = λj Iαj + Tj , where Tj = (k )

(k )

diag{Cλj 1 , . . . , Cλj m } (with k1 + · · · + km = αj and m = γj ) is a strictly αj

upper triangular matrix of size αj × αj such that Tj  ≤ 1 and Tj 5  −1 5 5 5 T 5 5 j (λIαj − Jλj )−1  = 5(λ − λj )−1 Iαj − 5 5 5 λ − λj 5 5 αj i αj  5 (T )i−1 5  1 5 5 j ≤ =5 5 5 (λ − λj )i 5 |λ − λj | i=1

= O,

i=1

αj n   1 1 ≤ ≤ . i r ri i=1

i=1

Therefore, (35.15) holds with δr = U  U −1 

n

i=1 r

−i .

2. If A − B < 1/δr , then λIn − B is invertible and (λIn − A)

−1

− (λIn − B)

−1

δr2 B − A ≤ 1 − δr B − A

for every λ ∈

k A

Γj (r) .

j=1

Let X = (λIn − A)−1 (B − A). Then λIn − B = (λIn − A − (B − A)) = (λIn − A)(In − X)  is invertible if λ ∈ kj=1 Γj (r) and A − B < δ1r , because then In − X is invertible, since X ≤ δr A − B < 1. Moreover, (λIn − A)−1 − (λIn − B)−1  = (In − (In − X)−1 )(λIn − A)−1  = X(In − X)−1 (λIn − A)−1  ≤ δr

δr2 A − B X ≤ , 1 − X 1 − δr A − B

as claimed. 3. If B − A < (δr + δr2 r)−1 , then each disk Dr (λi ) will contain exactly αi points of spectrum of B, counting multiplicities.

35.3. Matrices with distinct eigenvalues

389

In view of step 2 and Lemma 35.2, 5 5 5 1 6 5 5 5 A B −1 −1 [(λIn − A) − (λIn − B) ]dλ5 PΓj − PΓj  = 5 5 2πi Γj 5 ≤

r δr2 B − A < 1. 1 − δr B − A

Then, by Lemma 35.6, rank PΓBj = rank PΓAj = αj for j = 1, . . . , k, and hence by Lemma 35.5, rank PΓBj = the sum of the algebraic multiplicities of the eigenvalues of B inside Dr (λj ) , 

which yields the desired result. αj

Exercise 35.6. Verify the claim that Tj  ≤ 1 and Tj in step 1 in the proof of Theorem 35.7.

= O that was made

Exercise 35.7. Show that the roots λ1 , . . . , λn of the polynomial p(λ) = a0 + a1 λ + · · · + an λn with an = 0 depend continuously on the coefficients a0 , . . . , an of the polynomial. [HINT: Exploit companion matrices.]

35.3. Matrices with distinct eigenvalues In this section we shall focus on matrices A ∈ C n×n with n distinct eigenvalues. Lemma 35.8. If A ∈ C n×n has n distinct eigenvalues, then there exists a number δ > 0 such that every matrix B ∈ C n×n for which A − B < δ also has n distinct eigenvalues. Proof. This is an easy consequence of Theorem 35.7.



The next result shows that any matrix A ∈ C n×n can be approximated arbitrarily well by a matrix B ∈ C n×n with n distinct eigenvalues; i.e., the class of n × n matrices with n distinct eigenvalues is dense in C n×n . Lemma 35.9. If A ∈ C n×n and δ > 0 are specified, then there exists a matrix B ∈ C n×n with n distinct eigenvalues such that A − B < δ. Proof. Let A = U JU −1 be a Jordan decomposition of the matrix A and let D be a diagonal matrix such that the diagonal entries of the matrix D + J are all distinct and |dii | ≤ (U U −1 )−1 δ

390

35. Continuity of eigenvalues

for i = 1, . . . , n. Then the matrix B = U (D + J)U −1 has n distinct eigenvalues and A − B = U DU −1  ≤ U DU −1  ≤ δ .



Note that this lemma does not say that every matrix B that meets the inequality A − B < δ has n distinct eigenvalues. Since the class of n × n matrices with n distinct eigenvalues is a subclass of the set of diagonalizable matrices, it is reasonable to ask whether or not a diagonalizable matrix remains diagonalizable if some of its entries are changed just a little. The answer is: not always!



 1 0 1 β Exercise 35.8. Let A = and B = , where β = 0. Show 0 1 0 1 that A − B = |β| but that A is diagonalizable, whereas B is not. Exercise 35.9. Let A ∈ C p×q and suppose that rank A = k and k < min{p, q}. Show that for every ε > 0 there exists a matrix B ∈ C p×q such that A − B < ε and rank B = k + 1.

Some conclusions A subset X of the set of p × q matrices C p×q is said to be open if for every matrix A ∈ X there exists a number δ > 0 such that the open ball 3 4 Bδ (A) = C ∈ C p×q : A − C < δ is also a subset of X . The meaning of this condition is that if the entries in a matrix A ∈ X are changed only slightly, then the new perturbed matrix will also belong to the class X . This is a significant property in applications and computations because the entries in any matrix that is obtained from data or from a numerical algorithm are only known approximately. The preceding analysis implies that: • {A ∈ C p×q : rank A = min{p, q}} is an open subset of C p×q . • {A ∈ C p×q : rank A < min{p, q}} is not an open subset of C p×q . • {A ∈ C n×n with n distinct eigenvalues} is an open subset of C n×n . • {A ∈ C n×n : A is diagonalizable} is not an open subset of C n×n . The set {A ∈ C n×n : with n distinct eigenvalues} is particularly useful because it is both open and dense in C n×n , thanks to Lemma 35.9. Open dense sets are said to be generic. Exercise 35.10. Show that {A ∈ C n×n : A is invertible} is a generic set.

35.4. Supplementary notes

391

35.4. Supplementary notes Some additional applications of complex function theory to linear algebra will be discussed in the next chapter. In particular, another verification of formula (11.20) for the spectral radius of a matrix A ∈ C n×n and formulas for fractional powers At when A  O will be presented there. The continuous dependence of the eigenvalues of a matrix A that is established in Theorem 35.7 is (in the terminology of the article [57] by Li and Zhang) topological continuity, not functional continuity. In particular, they note the following result, which is based on Theorem 5.2 on page 109 of Kato’s monograph [51]: Theorem 35.10. If A(μ) ∈ C n×n is a continuous function of μ for all points μ in a connected domain Ω ⊂ C such that either (1) Ω is a real interval or (2) the eigenvalues of A(λ) are real, then there exist n eigenvalues (counted with algebraic multiplicities) of A(μ) that can be parametrized as continuous functions λ1 (μ), . . . , λn (μ) from Ω to C. In the second case, one can choose λ1 (μ) ≥ · · · ≥ λn (μ).

Chapter 36

Eigenvalue location problems

In this chapter we shall continue to use complex function theory to extract information about the eigenvalues of matrices.

36.1. Gerˇ sgorin disks The notation Δ(α; r) = {λ ∈ C : |λ − α| ≤ r} for the closed disk with center α and radius r will be used in this section. Let A ∈ C n×n with entries aij and let ρi (A) =

n 

|aij | − |aii | for

i = 1, . . . , n .

j=1

sgorin disk of A. The set Δ(aii ; ρi (A)) is called the i’th Gerˇ Theorem 36.1. If A ∈ C n×n , then:  (1) σ(A) ⊂ ni=1 Δ(aii ; ρi (A)). (2) A union Ω1 of k Gerˇsgorin disks that has no points in common with the union Ω2 of the remaining n − k Gerˇsgorin disks contains exactly k eigenvalues of A, counting their algebraic multiplicities. Proof. If λ ∈ σ(A), then there exists a nonzero vector u ∈ C n with components u1 , . . . , un such that (λIn − A)u = 0. Suppose that |uk | = max {|uj | : j = 1, . . . , n} . 393

394

36. Eigenvalue location problems

Then the identity (λ − akk )uk =

n 

akj uj − akk uk

j=1

implies that |(λ − akk )uk | ≤

n 

|akj ||uj | − |akk ||uk |

j=1

≤ ρk (A)|uk | . Therefore, λ ∈ Δ(akk ; ρk (A)) ⊂

n A

Δ(aii ; ρi (A)) .

i=1

This completes the proof of (1). Next, to verify (2), let D = diag{a11 , . . . , ann } and let B(t) = D + t(A − D) for 0 ≤ t ≤ 1 . Then

(

aii taij

bij (t) = dij + t(aij − dij ) =

if i = j , if i =  j,

and hence each eigenvalue λ of B(t) belongs to one of the disks Δ(aii ; tρi (A)), i.e., n A Δ(aii ; tρi (A)) . σ(B(t)) ⊂ i=1

Now, let Γ be a simple smooth closed curve that contains Ω1 in its interior and Ω2 in its exterior such that min{|ω − λ| : ω ∈ Ω1 ∪ Ω2

and

λ ∈ Γ} ≥ δ

for some δ > 0. Then λIn − B(t) is invertible for every choice of λ ∈ Γ and t ∈ [0, 1] and there exists a finite positive number β such that (λIn − B(t))−1  ≤ β

for every choice of λ ∈ Γ and t ∈ [0, 1] .

Thus, (λIn − B(t))−1 − (λIn − B(s))−1  = (λIn − B(t))−1 (B(t) − B(s))(λIn − B(s))−1  ≤ β 2 A − D |t − s|

36.2. Spectral radius redux

395

and hence if the curve Γ is parameterized by the function γ(u), a ≤ u ≤ b, the corresponding projectors meet the constraint 5 5 6 5 1 5 B(t) B(s) −1 −1 5 PΓ − PΓ  = 5 [(λIn − B(t)) − (λIn − B(s)) ]dλ5 5 2πi Γ 6 b 1 |γ  (u)|du β 2 A − D |t − s| , ≤ 2π a which is < 1 if |t − s is small enough. Thus, by partitioning [0, 1] and proceeding from 0 to 1 in a sequence of small steps, we see that B(0)

k = rank PΓ

B(1)

= rank PΓ

= rank PΓA

and hence that A has k eigenvalues inside Γ, counting algebraic multiplici ties. But, as σ(A) ⊂ Ω1 ∪ Ω2 , these k eigenvalues must belong to Ω1 . Exercise 36.1. Show that the spectral radius rσ (A) of a matrix A ∈ Cn×n is subject to the bound ⎧ ⎫ n ⎨ ⎬ |aij | : i = 1, . . . , n . rσ (A) ≤ max ⎩ ⎭ j=1

Exercise 36.2. Show that the spectral radius rσ (A) of a matrix A ∈ C n×n is subject to the bound - n .  |aij | : j = 1, . . . , n . rσ (A) ≤ max i=1

Exercise 36.3. Let A ∈ C n×n . Show that if |aii | > ρi (A) for i = 1, . . . , n, then A is invertible. Exercise 36.4. Let A ∈ C n×n . Show that A is a diagonal matrix if and only if σ(A) = ni=1 Δ(aii ; 0). Exercise 36.5. Show that if A ∈ C n×n is a circulant, then all the Gerˇsgorin disks Δ(ajj ; ρj (A)), j = 1, . . . , n, are the same and then check that σ(A) ⊆ Δ(a11 ; ρ1 (A))

36.2. Spectral radius redux In this section we shall use the methods of complex analysis to obtain a simple proof of the formula rσ (A) = lim Ak 1/k k↑∞

for the spectral radius rσ (A) = max{|λ| : λ ∈ σ(A)}

396

36. Eigenvalue location problems

of a matrix A ∈ C n×n . Since the inequality rσ (A) ≤ Ak 1/k

for k = 1, 2, . . .

is easily verified, it suffices to show that (36.1)

lim Ak 1/k ≤ rσ (A) .

k↑∞

Lemma 36.2. If A ∈ C n×n and σ(A) belongs to the set of points enclosed by a simple piecewise smooth counterclockwise directed closed curve Γ, then 6 1 k λk (λIn − A)−1 dλ . (36.2) A = 2πi Γ Proof. Let Γr denote a circle of radius r centered at zero and directed counterclockwise, and suppose that r > A and g(λ) = λk . Then 6 6 ∞  1 Aj 1 λk (λIn − A)−1 dλ = g(λ) dλ 2πi Γr 2πi Γr λj+1 j=0 2 6 ∞ (  g(λ) 1 dλ Aj = 2πi Γr λj+1 =

j=0 ∞  j=0

g (j) (0) j A = Ak . j!

The assumption r > A is used to guarantee the uniform convergence of the  −j Aj on Γ and subsequently to justify the interchange in the λ sum ∞ r j=0 order of summation and integration. Thus, to this point, we have established formula (36.2) for Γ = Γr and r > A. However, since λk (λIn − A)−1 is holomorphic in an open set that contains the points between and on the curves Γ and Γr , it follows that 6 6 1 1 k −1 λ (λIn − A) dλ = λk (λIn − A)−1 dλ .  2πi Γ 2πi Γr Corollary 36.3. If A ∈ C n×n and σ(A) belongs to the set of points enclosed by a simple piecewise smooth counterclockwise directed closed curve Γ, then 6 1 (36.3) p(A) = p(λ)(λIn − A)−1 dλ 2πi Γ for every polynomial p(λ). Theorem 36.4. If A ∈ C n×n , then (36.4)

lim Ak 1/k = rσ (A) ;

k↑∞

i.e., the limit exists and is equal to the modulus of the maximum eigenvalue of A.

36.2. Spectral radius redux

397

Proof. Fix  > 0, let r = rσ (A) + , and let δr = max{(λIn − A)−1  : |λ| = r} . Then, by formula (36.2), 5 5 6 2π 5 1 5 k iθ k iθ −1 iθ 5 5 (re ) (re In − A) ire dθ5 A  = 5 2πi 0 6 2π 1 rk (reiθ In − A)−1 rdθ ≤ 2π 0 ≤ rk+1 δr . Thus, Ak 1/k ≤ r(rδr )1/k and, as (rδr )1/k → 1 as k ↑ ∞, there exists a positive integer N such that Ak 1/k ≤ rσ (A) + 2

for every integer k ≥ N .

Therefore, since rσ (A) ≤ Ak 1/k for k = 1, 2, . . ., it follows that 0 ≤ Ak 1/k − rσ (A) ≤ 2ε for every integer k ≥ N , 

which serves to establish formula (36.4).

Exercise 36.6. Show that if A ∈ C n×n and σ(A) belongs to the set of points enclosed by a simple piecewise smooth counterclockwise directed closed curve Γ, then 6 ∞  Aj 1 . eλ (λIn − A)−1 dλ = (36.5) 2πi Γ j! j=0

Let A ∈ C n×n and let f (λ) be holomorphic in an open set Ω that contains σ(A). Then, in view of formulas (36.3) and (36.5) it is reasonable to define 6 1 f (λ)(λIn − A)−1 dλ , (36.6) f (A) = 2πi Γ where Γ is any simple piecewise smooth counterclockwise directed closed curve in Ω that encloses σ(A) such that every point inside Γ also belongs to Ω. This definition is independent of the choice of Γ and is consistent with the definitions of f (A) considered earlier. Exercise 36.7. Show that if A ∈ C n×n , f (λ) and g(λ) are holomorphic in an open set Ω that contains σ(A), and Γ is any simple piecewise smooth

398

36. Eigenvalue location problems

counterclockwise directed closed curve in Ω that encloses σ(A) such that every point inside Γ also belongs to Ω, then (36.7) ( 2 6 6 1 1 −1 g(ζ)(ζIn − A) dζ = f (λ)g(λ)(λIn − A)−1 dλ. f (A) 2πi Γ 2πi Γ [HINT: Use the identity (λIn − A)−1 (ζIn − A)−1 = (ζ − λ)−1 {(λIn − A)−1 − (ζIn − A)−1 } to reexpress 2πif (A)2πig(A) as (6 (6 2 2 6 6 g(ζ) −1 −1 dζ dλ − f (λ)(λIn − A) g(ζ)(ζIn − A) · · · dζ , Γ Γ1 ζ − λ Γ1 Γ where the contour Γ sits inside the contour Γ1 .] Exercise 36.8. Show that if, in terms of the notation introduced in (35.9), (p) (q) A = U1 Cα V1T + U2 Cβ V2T , then (36.8)

f (A) = U1

p−1 (j)  f (α) j=0

j!

(p) (C0 )j V1T

+ U2

q−1 (j)  f (α) j=0

j!

(q)

(C0 )j V2T

for every function f (λ) that is holomorphic in an open set that contains the points α and β. Exercise 36.9. Show that in the setting of Exercise 36.8 (36.9)

det (λIn − f (A)) = (λ − f (α))p (λ − f (β))q .

Exercise 36.10. Show that if A ∈ C n×n and f (λ) is holomorphic in an open set that contains σ(A), then (36.10)

det (λIn − A) = (λ − λ1 )α1 · · · (λ − λk )αk =⇒ det (λIn − f (A)) = (λ − f (λ1 ))α1 · · · (λ − f (λk ))αk .

[HINT: The main ideas are contained in Exercises 36.8 and 36.9. The rest is just more elaborate bookkeeping.] Theorem 36.5 (The spectral mapping theorem). Let A ∈ C n×n and let f (λ) be holomorphic in an open set that contains σ(A). Then (36.11)

μ ∈ σ(f (A)) ⇐⇒ f (μ) ∈ σ(f (A)) .

Proof. This is immediate from formula (36.10).



36.3. Shifting eigenvalues

399

36.3. Shifting eigenvalues In this section we extend the discussion in Section 8.4 by dropping the restriction that the geometric multiplicity of each eigenvalue of the given matrix A is equal to one. We will now work with a matrix   C = B AB · · · An−1 B , in which A ∈ C n×n and B ∈ C n×k with k ≥ 1. In the control theory literature, a pair of matrices (A, B) ∈ C n×n × C n×k with k ≥ 1 is said to be a controllable pair if rank C = n. Lemma 36.6. If (A, B) ∈ C n×n × C n×k is a controllable pair and k > 1, then there exist a matrix C ∈ C k×n and a vector b ∈ RB such that (A + BC, b) is a controllable pair. Discussion. Suppose for the sake of definiteness that A ∈ C 10×10 , B = [b1 · · · b5 ], and then permute the columns in the matrix   B AB · · · An−1 B to obtain the matrix  b1 Ab1 · · ·

A9 b1 · · ·

b5 Ab5 · · ·

 A9 b5 .

Then, moving from left to right, discard vectors that may be expressed as linear combinations of vectors that sit to their left. Suppose further that A3 b1 ∈ span{b1 , Ab1 , A2 b1 } , A5 b2 ∈ span{b1 , Ab1 , A2 b1 , b2 , . . . , A4 b2 } , A2 b3 ∈ span{b1 , Ab1 , A2 b1 , b2 , . . . , A4 b2 , b3 , Ab3 } and the vectors in each of the sets on the right are linearly independent. Then the matrix Q = [b1 Ab1 A2 b1 b2 Ab2 A2 b2 A3 b2 A4 b2 b3 Ab3 ] is invertible. Let ej denote the j’th column of I5 and let fk denote the k’th column of I10 and set G = [0 0 e2 0 0 0 0 e3 0 0] and C = GQ−1 . Then (A + BC)b1 = Ab1 + BGQ−1 Qf1 = Ab1 , (A + BC)2 b1 = (A + BC)Ab1 = A2 b1 + BGQ−1 Qf2 = A2 b1 , (A + BC)3 b1 = (A + BC)A2 b1 = A3 b1 + BGQ−1 Qf3 = A3 b1 + Be2 = A3 b1 + b2 .

400

Thus,

36. Eigenvalue location problems



b1 (A + BC)b1 (A + BC)2 b1 (A + BC)3 b1   4 = Q f1 f2 f3 i=1 c4i fi



with c44 = 1. Similar considerations lead to the conclusion that [b1 (A + BC)b1 · · · (A + BC)9 b1 ] = QU , where U is an upper triangular matrix with ones on the diagonal. Therefore,  since QU is invertible, (A + BC, b1 ) is a controllable pair. Theorem 36.7. Let (A, B) ∈ C n×n × C n×k be a controllable pair and let {μ1 , . . . , μn } be any set of points in C (not necessarily distinct). Then there exists a matrix K ∈ C k×n such that det(λIn − A − BK) = (λ − μ1 ) · · · (λ − μn ). Proof. If k = 1, then the conclusion is given in Theorem 8.10. If k > 1, then, by Lemma 36.6, there exist a matrix C ∈ C k×n and a vector b ∈ RB such that (A + BC, b) is a controllable pair. Therefore, by Theorem 8.10, there exists a vector u ∈ C n such that det(λIn − A − BC − buH ) = (λ − μ1 ) · · · (λ − μn ). However, BC + buH = B(C + vuH ) for some v ∈ C k , since b ∈ RB . Thus,  the proof may be completed by choosing K = C + vuH . n×n × C n×k is a controllable pair if Exercise 36.11. Show that (A,  B) ∈ C and only if rank A − λIn B = n for every point λ ∈ C.

36.4. The Hilbert matrix The matrix An ∈ C (n+1)×(n+1) with jk entry (36.12)

ajk =

1 j+k+1

for j, k = 0, . . . , n

is called the Hilbert matrix. It is readily checked that An  O by setting f (x) = nk=0 ck xk and noting that 6 1  6 1 n 2 |f (x)| dx = ck xk cj xj dx 0

= 

where cH = c0 · · ·

 cn .

0 j,k=0 n  j,k=0

cj ck = cH An c , j+k+1

36.6. Supplementary notes

401

Lemma 36.8. The Hilbert matrix An defined by (36.12) is subject to the bound (36.13)

An  ≤ π

for every choice of the positive integer n .

Proof. Let Γ1 denote the semicircle of radius 1 in the open upper half-plane with base [−1, 1] directed counterclockwise, let Γ2 denote the arc of that semicircle, and set g(λ) = nk=0 |ck |λk . Then, by the preceding calculations, 6 1 6 1 6 1 |f (x)|2 dx ≤ g(x)2 dx ≤ g(x)2 dx cH An c = 6 0 6−1 60 g(λ)2 dλ − g(λ)2 dλ = − g(λ)2 dλ = Γ1 Γ2 Γ2 6 π 6 π = −i g(eiθ )2 eiθ dθ ≤ |g(eiθ )|2 dθ =

1 2

6

0 π −π

0

|g(eiθ )|2 dθ = π

n 

|ck |2 = πc2 .

k=0

Thus, the bound (36.13) holds, as claimed.



36.5. Fractional powers Since the function f (λ) = λt is holomorphic in C \ (−∞, 0] when t > 0, formula (36.6) may be used to obtain useful bounds on fractional powers of positive definite matrices. In particular, if A ∈ C n×n and A  O, then 6 √ 1 1/2 λ(λIn − A)−1 dλ A = 2πi Γ for any simple closed piecewise smooth curve Γ in the open right half-plane that includes the eigenvalues of A in its interior. Moreover, if A, B ∈ C n×n , A  B  O, and 0 < t < 1, then 6 3 4 1 t t λt (λIn − A)−1 − (λIn − B)−1 dλ , (36.14) A −B = 2πi Γ for an appropriately chosen contour Γ. The formula (16.15) is obtained by passing to appropriate limits; see, e.g., Section 17.10 in [30] for details.

36.6. Supplementary notes The argument used to bound the Hilbert matrix that is presented in Section 36.4 is taken from [47]. The authors credit this approach to Fej´er and F. Riesz and present several other ways of verifying this bound (including one based on Lagrange multipliers by J. W. S. Cassels that was presented as an exercise in the first edition of [30]); for yet another approach, see the discussion of the Hilbert matrix in Peller [64]

402

36. Eigenvalue location problems

Section 36.3 is adapted from Section 20.9 in [30], which is partially adapted from Heymann [49]. ⎡ ⎤ C ⎢ CA ⎥ ⎢ ⎥ If C ∈ C p×n , A ∈ C n×n , and O = ⎢ . ⎥, then the pair (C, A) is ⎣ .. ⎦ CAn−1 said to be observable if NO = {0}. Exercise 36.12. Show that a pair of matrices (C, A) ∈ C p×n × C n×n is λIn − A for every point λ ∈ C. observable if and only if rank C The conditions in Exercises 36.11 and 36.12 are often referred to as Hautus tests, or Popov-Belevich-Hautus tests. For additional equivalences for controllability and observability, see, e.g., Lemmas 19.2 and 19.3 in [30].

Chapter 37

Matrix equations

In this chapter we shall analyze the existence and uniqueness of solutions to a number of matrix equations that occur frequently in applications. The notation (37.1)

CR = {λ ∈ C : λ + λ > 0}

and

CL = {λ ∈ C : λ + λ < 0}

for the open right and open left half-plane, respectively, will be useful.

37.1. The equation X − AXB = C In this section we shall study the equation X − AXB = C for appropriately sized matrices A, B, C, and X. If A = diag{λ1 , . . . , λp } and B = diag{μ1 , . . . , μq } are diagonal matrices, then it is easily seen that the equation xij −λi xij μj = cij for the ij entry xij of X has exactly one solution if and only if 1−λi μj = 0 for i = 1, . . . , p and j = 1, . . . , q. This condition on the eigenvalues of A and B holds for nondiagonal matrices too but requires a little more work to justify. Lemma 37.1. Let A ∈ C p×p , B ∈ C q×q and let λ1 , . . . , λk and β1 , . . . , βm denote the distinct eigenvalues of the matrices A and B, respectively; and let T denote the linear transformation from C p×q into C p×q that is defined by the rule (37.2)

T : X ∈ C p×q → X − AXB ∈ C p×q .

Then NT = {O p×q } if and only if (37.3)

λi βj = 1

for

i = 1, . . . , k

and

j = 1, . . . , m . 403

404

37. Matrix equations

Proof. Let Aui = λi ui and B T vj = βj vj for some pair of nonzero vectors ui ∈ C p and vj ∈ C q and let X = ui vjT . Then the formula T X = ui vjT − Aui vjT B = (1 − λi βj )ui vjT clearly implies that the condition (37.3) is necessary for NT = {O p×q }. To prove the sufficiency of this condition,  invoke the Jordan decomposition B = U JU −1 with U = U1 · · · Um and J = diag{Jβ1 , . . . , Jβm }, in which Uj ∈ C q×kj and Jβj ∈ C kj ×kj is upper triangular with βj on the diagonal for j = 1, . . . , m and kj is equal to the algebraic multiplicity of βj for j = 1, . . . , m. Then, since Jβj = βj Ikj + Nj and (Nj ) kj = O, X − AXB = O ⇐⇒ X − AXU JU −1 = O ⇐⇒ XU − AXU J = O ⇐⇒ XUj = AXUj Jβj

for j = 1, . . . , m

⇐⇒ (Ip − βj A)XUj = AXUj Nj

for j = 1, . . . , m .

Therefore, since (Ip − βj A) is invertible when (37.3) is in force, X − AXB = O ⇐⇒ XUj = Mj XUj Nj

for j = 1, . . . , m

with Mj = (Ip − βj A)−1 A. But upon iterating the last displayed equality n times we obtain XUj = Mjn XUj Njn = O for j = 1, . . . , m if n ≥ q .   Therefore, X = XU U −1 = XU1 · · · XUm U −1 = O is the only solution  of the equation X − AXB = O when λi βj = 1. Theorem 37.2. Let A ∈ C p×p , B ∈ C q×q and let λ1 , . . . , λk and β1 , . . . , βm denote the distinct eigenvalues of the matrices A and B, respectively. Then, for each choice of C ∈ C p×q , the equation (37.4)

X − AXB = C

has exactly one solution X ∈ C p×q if and only if (37.3) holds. Proof. This is immediate from Lemma 37.1 and the principle of conservation of dimension: If T is the linear transformation that is defined by the rule (37.2), then pq = dim NT + dim RT . Therefore, T maps onto C p×q if and only if NT = {Op×q }, i.e., if and only  if λi βj = 1 for every choice of i and j. Corollary37.3. If A ∈ C p×p , B ∈ C q×q , C ∈ C p×q , and rσ (A) rσ (B) < 1, j j then X = ∞ j=0 A CB is the only solution of equation (37.4). Exercise 37.1. Use Theorem 22.2, the refined fixed point theorem, to justify Corollary 37.3 a second way (that does not depend upon Theorem 37.2).

37.2. The Sylvester equation AX − XB = C

405

Corollary 37.4. Let A ∈ C p×p , C ∈ C p×p and let λ1 , . . . , λp denote the eigenvalues of the matrix A. Then the Stein equation (37.5)

X − AH XA = C

has exactly one solution X ∈ C p×p if and only if 1 − λi λj = 0 for every choice of i and j. Exercise 37.2. Verify Corollary 37.4. (2)

(2)

(2)

(2)

Exercise 37.3. Let A = Cα and B = Cβ and suppose that αβ = 1. Show that the equation X − AXB = C has no solutions if either c21 = 0 or αc11 = βc22 . Exercise 37.4. Let A = Cα and B = Cβ and suppose that αβ = 1. Show that if c21 = 0 and αc11 = βc22 , then the equation X − AXB = C has infinitely many solutions. Exercise 37.5. Find the unique solution X ∈ C p×p of equation (37.5) when   (p) H A = C0 , C = e1 uH + ueH 1 , and u = t0 t1 · · · tp−1 . (4)

Exercise 37.6. Let A = C0 and let T denote the linear transformation from C4×4 into itself that is defined by the formula T X = X − AH XA. (a) Calculate dim NT . 4×4 (b) Show that a matrix ⎡ X∈C a b c d ⎢ e 0 0 0 X − AH XA = ⎢ ⎣ f 0 0 0 g 0 0 0 matrix (i.e., xij = xi+1,j+1 ) j = 1, . . . , 4.

is ⎤ a solution of the matrix equation ⎥ ⎥ if and only if X is a Toeplitz ⎦ with x1j = c1j and xj1 = cj1 for

37.2. The Sylvester equation AX − XB = C The strategy for studying the equation AX − XB = C is much the same as for the equation X − AXB = C. Again the special case in which A = diag{λ1 , . . . , λp } and B = diag{β1 , . . . , βq } are diagonal matrices points the way: The equation λi xij − xij βj = cij for the ij entry xij of X has exactly one solution if and only if λi − βj = 0 for i = 1, . . . , p and j = 1, . . . , q. This condition on the eigenvalues of A and B holds for nondiagonal matrices too but requires a little more work to justify. Lemma 37.5. Let A ∈ C p×p , B ∈ C q×q and let λ1 , . . . , λk and β1 , . . . , βm denote the distinct eigenvalues of the matrices A and B, respectively, and

406

37. Matrix equations

let T denote the linear transformation from C p×q into C p×q that is defined by the rule (37.6)

T : X ∈ C p×q → AX − XB ∈ C p×q .

Then NT = {Op×q } if and only if (37.7)

λi − βj = 0

for

i = 1, . . . , p

and

j = 1, . . . , q .

Proof. Let Aui = λi ui and B T vj = βj vj for some pair of nonzero vectors ui ∈ C p and vj ∈ C q and let X = ui vjT . Then the formula T X = Aui vjT − ui vjT B = (λi − βj )ui vjT clearly implies that the condition (37.7) is necessary for NT = {Op×q }. To prove the sufficiency of  this condition,  invoke the Jordan decomposition B = U JU −1 with U = U1 · · · Um and J = diag{Jβ1 , . . . , Jβm }, in which Uj ∈ C q×kj and Jβj ∈ C kj ×kj for j = 1, . . . , m and kj is equal to the algebraic multiplicity of βj for j = 1, . . . , m. Then, since Jβj = βj Ikj + Nj and (Nj ) kj = O, AX − XB = O ⇐⇒ AX − XU JU −1 = O ⇐⇒ AXU − XU J = O ⇐⇒ AXUj = XUj Jβj

for j = 1, . . . , m

⇐⇒ (A − βj Ip )XUj = XUj Nj

for j = 1, . . . , m .

Therefore, since (A − βj Ip ) is invertible when (37.7) is in force, AX − XB = O ⇐⇒ XUj = Mj XUj Nj

for j = 1, . . . , m

with Mj = (A − βj Ip )−1 A. But upon iterating the last displayed equality n times we obtain XUj = Mjn XUj Njn = O for j = 1, . . . , m if n ≥ q .   Therefore X = XU U −1 = XU1 · · · XUm U −1 = O is the only solution of the equation AX − XB = O when λi βj = 1.  Theorem 37.6. Let A ∈ C p×p , B ∈ C q×q and let λ1 , . . . , λk and β1 , . . . , βm denote the distinct eigenvalues of the matrices A and B, respectively. Then, for each choice of C ∈ C p×q , the equation AX − XB = C has exactly one solution X ∈ C p×q if and only if (37.7) holds, i.e., if and only if σ(A) ∩ σ(B) = ∅. Proof. This is an immediate corollary of Lemma 37.5 and the principle of conservation of dimension. 

37.2. The Sylvester equation AX − XB = C

407

Exercise 37.7. Let A ∈ C n×n . Show that the Lyapunov equation AH X + XA = Q

(37.8)

has exactly one solution for each choice of Q ∈ C n×n if and only if σ(A) ∩ σ(−AH ) = ∅. Lemma 37.7. If A, Q ∈ C n×n and if σ(A) ⊂ CL and −Q  O, then the Lyapunov equation (37.8) has exactly one solution X ∈ C n×n . Moreover, this solution is positive semidefinite. Proof. Since σ(A) ⊂ CL , the matrix 6 ∞ H Z=− etA QetA dt 0

is well-defined and is positive semidefinite. Moreover, 6 ∞ H H AH etA QetA dt A Z = −  60 ∞  d tAH e QetA dt = − dt 2 (0 6 ∞ % tAH tA %∞ tAH d tA (Qe )dt Qe − e = − e t=0 dt 0 6 ∞ H etA QetA dt A = Q+ 0

= Q − ZA . Thus, the matrix Z is a solution of the Lyapunov equation (37.8) and hence, as the assumption σ(A) ⊂ CL implies that σ(A) ∩ σ(−AH ) = ∅, Theorem 37.6 (as reformulated for (37.8) in Exercise 37.7) ensures that this equation has only one solution. Therefore, X = Z is positive semidefinite.  Exercise 37.8. Show that if, in addition to the assumptions that σ(A) ⊂ CL and −Q  O, it is also assumed in Lemma 37.7 that A, Q ∈ R n×n , then the solution X of (37.8) belongs to R n×n . Exercise 37.9. Let A ∈ C n×n . Show that if σ(A) ⊂ CR , the open right half-plane, then the equation AH X + XA = Q has exactly one solution for every choice of Q ∈ C n×n and that this solution can be expressed as 6 ∞ H e−tA Qe−tA dt X= 0

C n×n .

[HINT: Integrate the formula for every choice of Q ∈ 6 ∞ / 6 ∞ 0 d H H −tAH −tA e−tA Q e−tA dt e Qe dt = − A dt 0 0 by parts.]

408

37. Matrix equations

Exercise 37.10. Show that in the setting of Exercise 37.9, the solution X can also be expressed as 6 ∞ 1 X=− (iμIn + AH )−1 Q(iμIn − A)−1 dμ . 2π −∞ ,R 1 H [HINT: Write AH X = − limR↑∞ 2πi −R (A + iμIn − iμIn ){· · · }dμ and evaluate the integral by adding a semicircle of radius R to complete the contour keeping (35.4) in mind.] Exercise 37.11. Let A = diag{A11 , A22 } be a block diagonal matrix in C n×n with σ(A11 ) ⊂ CR and σ(A22 ) ⊂ CL , let Q ∈ C n×n , and let Y ∈ C n×n and Z ∈ C n×n be solutions of the Lyapunov equation AH X + XA = Q. Show that if Y and Z are written in block form consistent with the block decomposition of A, then Y11 = Z11 and Y22 = Z22 . Exercise 37.12. Let A, Q ∈ C n×n . Show that if σ(A) ∩ iR = ∅ and if Y and Z are both solutions of the same Lyapunov equation AH X + XA = Q and Y − Z  O, then Y = Z. [HINT: To warm up, suppose first that A = diag{A11 , A22 }, where σ(A11 ) ⊂ CR and σ(A22 ) ⊂ CL and consider Exercise 37.11.]  (4) Exercise 37.13. Let A = 3j=1 ej eTj+1 = C0 and let T denote the linear transformation from C4×4 into itself that is defined by the formula T X = AH X − XA. (a) Calculate dim NT . (b) Show that a matrix X ∈ C 4×4 with ⎡ entries xij is a ⎤solution of the 0 −a −b −c ⎢ a 0 0 0 ⎥ ⎥ if and only if matrix equation AH X − XA = ⎢ ⎣ b 0 0 0 ⎦ c 0 0 0 X is a Hankel matrix (i.e., xij = xi+1,j−1 ) with x11 = a, x12 = b, and x13 = c. Exercise 37.14. Show that if A, B, C ∈ C n×n , then X(t) = etA Ce−tB is a solution of the differential equation X  (t) = AX(t) − X(t)B that meets the initial condition X(0) = C.

37.4. Special classes of solutions

409

37.3. AX = XB The next result complements Lemma 37.5. It deals with the case where the nullspace NT of the linear transformation T introduced in (37.6) is not equal to zero. Lemma 37.8. Let A, X, B ∈ C n×n and suppose that AX = XB. Then there exists a matrix C ∈ C n×n such that AX = X(B + C) and σ(B + C) ⊆ σ(A). Proof. If X is invertible, then σ(A) = σ(B); i.e., the matrix C = O does (k) the trick. Suppose therefore that X is not invertible and that Cβ is a k × k Jordan cell in the Jordan decomposition of B = U JU −1 such that β ∈ σ(A). (k) Then there exists a subblock W ∈ C n×k of U such that BW = W Cβ . Therefore, (k)

AXW − XW Cβ = AXW − XBW = O and hence, as β ∈ σ(A), Lemma 37.5 implies that XW = O. Thus, if (k)

B1 = B + W (Cα(k) − Cβ )V H , where V H ∈ C k×n is the block of rows in U −1 corresponding to the columns (k) in W (i.e., U JU −1 = W Cβ V H + · · · ) and α ∈ σ(A), then XB1 = XB = AX, and the diagonal entry of the block under consideration in the Jordan decomposition of B1 now belongs to σ(A) and not to σ(B). Moreover, none of the other Jordan blocks in the Jordan decomposition of B are affected by this change. The same procedure can now be applied to change the diagonal entry of any Jordan cell in the Jordan decomposition of B1 from a point that is not in σ(A) to a point that is in σ(A). The proof is completed by repeating this procedure.  Exercise 37.15. Let A, X, B ∈ C n×n . Show that if AX = XB and the columns of V ∈ C n×k form a basis for NX , then there exists a matrix L ∈ C k×n such that σ(B + V L) ⊆ σ(A).

37.4. Special classes of solutions Let A ∈ C n×n and now let: • E+ (A) = the number of zeros of det (λIn − A) in CR , • E− (A) = the number of zeros of det (λIn − A) in CL , • E0 (A) = the number of zeros of det (λIn − A) in iR,

410

37. Matrix equations

counting multiplicities in all three. The triple (E+ (A), E− (A), E0 (A)) is called the inertia of A. Since multiplicities are counted, E+ (A) + E− (A) + E0 (A) = n . Theorem 37.9. Let A ∈ C n×n and suppose that σ(A) ∩ iR = ∅. Then there exists a Hermitian matrix G ∈ C n×n such that: (1) AH G + GA  O. (2) E+ (G) = E+ (A), E− (G) = E− (A), and E0 (G) = E0 (A) = 0. Proof. Suppose first that E+ (A) = p ≥ 1 and E− (A) = q ≥ 1. Then the assumption σ(A) ∩ iR = ∅ guarantees that p + q = n and hence that A admits a Jordan decomposition U JU −1 of the form 

J1 O U −1 A=U O J2 with J1 ∈ C p×p , σ(J1 ) ⊂ CR , J2 ∈ C q×q , and σ(J2 ) ⊂ CL . Let P11 ∈ C p×p and P22 ∈ C q×q be positive definite matrices. Then it is readily checked, much as in the proof of Lemma 37.7, that 6 ∞ H e−tJ1 P11 e−tJ1 dt X11 = 0

is a positive definite solution of the equation J1H X11 + X11 J1 = P11 6

and that



X22 = −

H

etJ2 P22 etJ2 dt

0

is a negative definite solution of the equation J2H X22 + X22 J2 = P22 . (The two integrals are well-defined because σ(J1 ) ⊂ CR =⇒ σ(J1H ) ⊂ CR and σ(J2 ) ⊂ CL =⇒ σ(J2H ) ⊂ CL .) Let X = diag {X11 , X22 } Then

JHX

and

P = diag {P11 , P22 } .

+ XJ = P and hence

(U H )−1 J H U H (U H )−1 XU −1 + (U H )−1 XU −1 U JU −1 = (U H )−1 P U −1 . Thus, the matrix G = (U H )−1 XU −1 is a solution of the equation AH G + GA = (U H )−1 P U −1 and hence, as (U H )−1 P U −1  O, (1) holds when p > 0 and q > 0. The cases p = 0, q = n and p = n, q = 0 are left to the reader. Sylvester’s inertia theorem (which is discussed in Section 28.6) guaran tees that E± (G) = E± (X) = E± (A), which justifies (2).

37.5. Supplementary notes

411

Exercise 37.16. Complete the proof of Theorem 37.9 by verifying the cases p = 0 and p = n. Exercise 37.17. Find a Hermitian matrix G ∈ C 2×2 that fulfills the conditions of Theorem 37.9 when A = diag{1 + i, 1 − i}.

37.5. Supplementary notes This chapter is adapted from Chapter 18 and Section 20.8 of [30]. A number of refinements may be found in Lancaster and Tismenetsky [55].

Chapter 38

A matrix completion problem

In this chapter we shall consider the problem of filling in the missing entries of a partially specified positive definite matrix Z ∈ C n×n . Towards this end, we shall suppose that the entries zij of Z are specified for those indices (i, j) that belong to a proper subset Ω of {(i, j) : i, j = 1, . . . , n} and shall let ZΩ = {A ∈ C n×n : A  O

aij = zij when (i, j) ∈ Ω} .

and

Since A  O =⇒ A = AH , we shall assume that Ω is symmetric, i.e., (i, j) ∈ Ω ⇐⇒ (j, i) ∈ Ω. The set ZΩ can also be conveniently described in terms of the standard basis vectors e1 , . . . , en for C n and the transformation ΠΩ that acts on A ∈ C n×n by the rule   ΠΩ A = A, ei eTj  ei eTj = trace{ej ei T A} ei eTj (i,j)∈Ω

=



(i,j)∈Ω

trace{ei Aej } ei eTj = T

(i,j)∈Ω



aij ei eTj ,

(i,j)∈Ω

i.e., ZΩ = {A ∈ C n×n : A  O

and ΠΩ A = ΠΩ Z} .

In these terms, a basic question is: (38.1)

For which sets Ω does there exist a matrix A ∈ C n×n such that A  O and A−1 − ΠΩ A−1 = O? 413

414

38. A matrix completion problem

We shall see that a necessary condition is that no principal submatrix of A is in the complement of Ω (and hence that aii ∈ Ω for i = 1, . . . , n). But that is not the whole story. Exercise 38.1. Show that the transformation ΠΩ is an orthogonal projection in the inner product space C n×n equipped with the inner product A, B = trace B H A.

38.1. Constraints on Ω Let g(X) = ln det X and recall that −g(X) is a strictly convex function on the convex set of positive definite matrices X ∈ C n×n and so too on the set ZΩ , since ZΩ is also a convex set. The main result of this section is Theorem 38.3. We begin, however, with a number of preliminary lemmas. Lemma 38.1. If A ∈ ZΩ , B = B H ∈ C n×n , and ΠΩ B = O, then A + μB ∈ ZΩ for every choice of μ ∈ R for which |μ| B < sn , the smallest singular value of A. Proof. Under the given assumptions, A + μB  O, since (A + μB)x, x = Ax, x + μBx, x ≥ sn x, x − |μ| B x, x > 0 for every nonzero vector x ∈ C n . ΠΩ (A + μB) = ΠΩ A.

Moreover, (A + μB) ∈ ZΩ , since 

Lemma 38.2. If Ω is symmetric and the matrix A ∈ ZΩ is a local extremum for the function g(X) = ln det X in ZΩ , then: (1) eTi (A−1 )ej = (A−1 )ij = 0 for every pair of points (i, j) ∈ Ω. (2) The indices (i, i) ∈ Ω for i = 1, . . . , n. Proof. If A is a local extremum for g(X) in ZΩ and if B = B H and ΠΩ B = O, then A + μB ∈ ZΩ for sufficiently small μ ∈ R and, in view of Lemma 17.4, g(A + μB) − g(A) = trace A−1 B . 0 = lim μ↓0 μ Thus, if B = αej eTi + αei eTj and (i, j) ∈ Ω, then trace A−1 B = αeTi (A−1 )ej + αeTj (A−1 )ei = 0 . Therefore, (A−1 )ij + (A−1 )ji = 0 if α ∈ R and

(A−1 )ij − (A−1 )ji = 0 if iα ∈ R .

Thus, (1) holds; and (2) follows from (1), since (A−1 )ii > 0.



There exist sets Ω for which ZΩ does not have a local extremum.

38.1. Constraints on Ω

415

 ? 1 Example 38.1. Let Ω = {(1, 2), (2, 1)} and Z = . Then 1 ? (

 2 a 1 2×2 : X= with a > 0, b > 0, and ab > 1 . ZΩ = X ∈ C 1 b The function g(X) does not have a local extremum in ZΩ : g(X) = ln (ab−1) and ln [(a + α)(b + β) − 1] > ln (ab − 1) if αb + βa + αβ > 0, whereas ln [(a + α)(b + β) − 1] < ln (ab − 1) if αb + βa + αβ < 0.  Exercise 38.2. Show that ZΩ is a closed subset of C n×n , i.e., if X ∈ C n×n , Xk ∈ ZΩ , and limk↑∞ X − Xk  = 0, then X ∈ ZΩ . Theorem 38.3. If (i, i) ∈ Ω for i = 1, . . . , n and ZΩ = ∅, then there exists exactly one matrix A ∈ ZΩ such that (38.2)

ln det A > ln det X

if

X ∈ ZΩ and X = A

(and hence (A−1 )ij = 0 for (i, j) ∈ Ω). Proof. Let γ = max {|zii | : i = 1, . . . , n} and proceed in steps. 1. If B ∈ ZΩ , then |bij | ≤ γ, B ≤ nγ, and det B ≤ γ n . Since B  O,

 bii bij O bji bjj

for every choice of 1 ≤ i < j ≤ n .

Therefore, |bij |2 ≤ bii bjj = zii zjj ≤ γ 2 , and hence |bij | ≤ γ.   The second assertion follows by writing B = b1 · · · bn as an array T  of its columns and noting that if c = c1 · · · cn , then 5 5 5  5 n n  5 5 n √ 5 ≤ c b |c | b  ≤ |cj | n γ ≤ c n γ , Bc = 5 j j5 j j 5 5 j=1 5 j=1 j=1 since bj  ≤



nγ and

n

j=1 |cj |

√ ≤ c n.

Finally, invoking the singular value decomposition B = V SV H with V ∈ C n×n unitary and S = diag{s1 , . . . , sn }, the inequality between the geometric and arithmetic means ensures that (det B)1/n = (s1 · · · sn )1/n ≤

trace B s1 + · · · + sn = ≤γ, n n

which justifies the last assertion of this step.

416

38. A matrix completion problem

2. There exists exactly one matrix A ∈ ZΩ such that (38.2) holds. In view of step 1, {det X : X ∈ ZΩ } is a bounded subset of R. By assumption, ZΩ = ∅. Therefore, there exists a matrix C ∈ ZΩ . Thus, the set {X ∈ ZΩ : det X ≥ det C} is a nonempty, closed, bounded, convex set and hence 2 follows from the fact that −g is strictly convex.  The assumption that ZΩ = ∅ in the formulation of Theorem 38.3 is essential; an example of a set Ω that contains the points (i, i) but ZΩ = ∅ is developed in the next six exercises, which culminate in Exercise 38.8. ⎡ ⎤ a xH μ Exercise 38.3. Let A = ⎣x B y⎦, where B ∈ C (n−2)×(n−2), x, y ∈ μ yH d C n−2 , and a, b ∈ C are known, but μ ∈ C is not. Show that (in terms of the notation | C | for det C) %% % % % %% % % B y% % a xH % % xH μ% %x B % % % % % % % % %. − (38.3) |A| |B | = % H d % %x B % % B y % %μ y H % y [HINT: Use Sylvester’s formula.] 



B y a xH  O and H  O in the setting Exercise 38.4. Show that if x B y d of Exercise 38.3, then



 a xH B y det det H x B y d H −1 2 A  O ⇐⇒ |μ − x B y| < . (det B)2 Exercise 38.5. Show that if A is a positive definite matrix of the form specified in Exercise 38.3, then | A | ≤ | B| (d − yH B −1 y) (a − xH B −1 x) with equality if and only if μ = xH B −1 y. Exercise 38.6. Show that if μ = xH B −1 y in Exercise 38.3, then (A−1 )n1 = (A−1 )1n = 0. ⎤ ⎡ β 1 μ Exercise 38.7. Show that if β > 1, then ⎣ 1 β 1 ⎦  O if and only if μ 1 β ⎤ ⎡ β 1 −1 |μ − β1 | < β − β1 , whereas ⎣ 1 β μ ⎦  O if and only if |μ + β1 | < β − β1 . −1 μ β

38.2. The central diagonals are specified

417

√ Exercise 38.8. Show that if 1 < β < 2, then there does not exist a choice of μ ∈ C for which the matrix ⎡ ⎤ β 1 1 −1 ⎢1 β 1 μ⎥ ⎥ A=⎢ ⎣1 1 β 1⎦ −1 μ 1 β √ is positive definite. [HINT: If 1 < β < 2, then, in view of Exercise 38.7, there does not exist a choice of μ ∈ C for which A{11} and A{33} are both positive definite.]

38.2. The central diagonals are specified Our next objective is to formulate necessary and sufficient conditions on the specified entries zij that ensure that ZΩ = ∅ when (38.4)

def

Ω = Λm = {(i, j) : i, j = 1, . . . , n

and

|i − j| ≤ m} ,

i.e., when the entries in the 2m + 1 central diagonals of the matrix are specified. It is tempting to set the unknown entries equal to zero. However, the matrix that is obtained this way is not necessarily positive definite; see Exercise 38.9 for a simple example, which also displays the fact that there may be many ways to fill in the missing entries to obtain a positive definite completion, even though there is only one way to fill in the missing entries so that the ij entries of the inverse of this completion are equal to zero if (i, j) ∈ Λm . We shall present an algorithm for obtaining this particular completion that is based on factorization and shall present another proof of the fact that it can be characterized as the completion which maximizes the determinant. Because of this property this particular completion is commonly referred to as the maximum entropy completion. ⎡ ⎤ 3 2 x Exercise 38.9. Show that if x ∈ R, then the matrix ⎣ 2 2 1 ⎦ is positive x 1 1 definite if and only (x − 1)2 < 1/2. However, there is only one choice of x for which A  O and (A−1 )31 = (A−1 )13 = 0. It is convenient to begin with a lemma: Lemma 38.4. If B = Y Y H , Y ∈ C n×n is an invertible triangular matrix (upper or lower), 1 ≤ i, j ≤ n, and 1 ≤ k ≤ n − 1, then (38.5)

bij = 0

for

|i − j| ≥ k ⇐⇒ yij = 0

for

|i − j| ≥ k .

418

38. A matrix completion problem

Discussion. The verification of (38.5) becomes transparent if the calculations are organized properly. The underlying ideas are best conveyed by example. Let B ∈ C 4×4 , k = 2 and suppose that Y is lower triangular. Then, in terms of the standard basis e1 , . . . , e4 for C 4 , ⎡ ⎤ y11 ⎢   0 ⎥ ⎥ b31 = eT3 Be1 = eT3 Y Y H e1 = y31 y32 y33 0 ⎢ ⎣ 0 ⎦ = y31 y11 , 0 ⎡ ⎤ y11 ⎢   0 ⎥ ⎥ b41 = eT4 Be1 = eT4 Y Y H e1 = y41 y42 y43 y44 ⎢ ⎣ 0 ⎦ = y41 y11 , 0 and  b42 = eT4 Be2 = eT4 Y Y H e2 = y41 y42 y43

⎡ ⎤ y21  ⎢y22 ⎥ ⎥ y44 ⎢ ⎣ 0 ⎦ = y41 y21 + y42 y22 . 0

Since Y is presumed to be invertible, yii = 0 for i = 1, . . . , 4, and hence the preceding three formulas clearly imply that b31 = b41 = b42 = 0 ⇐⇒ y31 = y41 = y42 = 0 . The verification of (38.5) in the general setting for lower triangular Y goes through in exactly the same way; the only difference is that the bookkeeping is a little more complicated. The proof for upper triangular Y is left to the reader. Recall the notation ⎡ B[j,k]

bjj ⎢ .. =⎣ .

···

bkj

···

⎤ bjk .. ⎥ . ⎦



for 1 ≤ j ≤ k ≤ n .

bkk

Theorem 38.5. If Ω = Λm for some nonnegative integer m ≤ n − 1, then ⎡ ⎤ zjj ··· zj,j+m ⎢ ⎥ .. ZΛm = ∅ ⇐⇒ Z[j,j+m] = ⎣ ... ⎦O . (38.6) zj+m,j · · · zj+m,j+m for j = 1, . . . , n − m . Moreover, if these conditions are met, then there is exactly one matrix (38.7)

A ∈ Z Λm

with

(A−1 )ij = 0 for (i, j) ∈ Λm .

38.2. The central diagonals are specified

419

(This unique matrix A is given by the formula A = (X H )−1 DX −1 ; X and D are defined in the proof.) Proof. The proof of the implication =⇒ in (38.6) is easy and is left to the reader. Conversely, if the matrices on the right-hand side of (38.6) are positive definite, then the following implications are in force: 1. There exists exactly one lower triangular matrix X ∈ C n×n such that ⎡ ⎤ ⎡ ⎤ 1 xjj ⎢0⎥ ⎢ ⎥ ⎢ ⎥ (38.8) Z[j,j+m] ⎣ ... ⎦ = ⎢ . ⎥ for j = 1, . . . , n − m , ⎣ .. ⎦ xj+m,j 0 ⎡ ⎤ ⎤ 1 xjj ⎢0⎥ ⎢ ⎥ ⎢ ⎥ Z[j,n] ⎣ ... ⎦ = ⎢ . ⎥ ⎣ .. ⎦ xn,j 0 ⎡

(38.9)

for

j = n − m + 1, . . . , n ,

and xij = 0 for (i, j) ∈ Λm . This is self-evident since the matrices Z[j,j+m] and Z[j,n] in (38.8) and (38.9) are positive definite. 2. The diagonal matrix D with entries dii = xii is positive definite. This is an easy computation. 3. The matrix A = (X H )−1 DX −1 belongs to ZΛm and (A−1 )ij = 0 for (i, j) ∈ Λm . The fact that (A−1 )ij = 0 for (i, j) ∈ Λm follows from Lemma 38.4, since xij = 0 for (i, j) ∈ Λm . It remains therefore only to check that aij = zij for (i, j) ∈ Λm . Towards this end, let U = (X H )−1 D. Then U is an upper triangular matrix with diagonal entries uii = 1 for i = 1, . . . , n. Consequently, the formula A = (X H )−1 DX −1 can be rewritten as AX = U , which in turn implies that (38.8) and (38.9) hold with A[j,j+m] and A[j,n] in place of Z[j,j+m] and Z[j,n] , respectively. Therefore, the numbers cij = zij − aij are solutions of the equations (38.8) and (38.9), but with right-hand

420

38. A matrix completion problem

sides equal ⎡ c11 ⎣c21 c31 ⎡ c33 ⎣c43 c53

to zero. Thus, for example, if n = 5 and m = 2, then ⎤⎡ ⎤ ⎡ ⎤ ⎡ ⎤⎡ ⎤ ⎡ ⎤ x11 0 c22 c23 c24 x22 0 c12 c13 c22 c23 ⎦ ⎣x21 ⎦ = ⎣0⎦ , ⎣c32 c33 c34 ⎦ ⎣x32 ⎦ = ⎣0⎦ , 0 0 c32 c33 x31 c42 c43 c44 x42 ⎤⎡ ⎤ ⎡ ⎤   

c34 c35 x33 0 0 c44 c45 x44 ⎦ ⎣ ⎦ ⎣ ⎦ c44 c45 x43 = 0 , = , and c54 c55 x54 0 c54 c55 x53 0 c55 x55 = 0 .

But, since the xjj are positive and each of the five submatrices are Hermitian, it is readily seen that cij = 0 for all the indicated entries; i.e., aij = zij for |i − j| ≤ 2. (Start with c55 x55 = 0 and work your way back.) Thus, the  matrix A = (X H )−1 DX −1 is a positive definite completion. A word on the intuition behind the proof of Theorem 38.5 is in order. It rests on the observation that if there exists a matrix A that meets the constraints in (38.7), then, since A−1  O, there exists an invertible lower triangular matrix Y ∈ C n×n with positive diagonal entries yii , i = 1, . . . , n, such that A−1 = Y Y H . Lemma 38.4 then guarantees that yij = 0 for (i, j) ∈ Λm . Thus, the matrix U = (Y H )−1 D with D = diag{y11 , . . . , ynn } is upper triangular with uii = 1 for i = 1, . . . , n and A = U X −1 with X = DY . The system of equations considered in step 1 in the proof of the theorem is obtained by considering certain subblocks of the system AX = U . The next example should help to make this more transparent. Example 38.2. If and m = 1 is ⎡ z11 z12 ? ⎢z21 z22 z23 ⎢ ⎣ ? z32 z33 ? ? z43

A ∈ ZΛm , then the matrix equation AX = U for n = 4 ⎤ ? ? ⎥ ⎥ z34 ⎦ z44

⎤ ⎤ ⎡ 0 0 x11 0 1 u12 u13 u14 ⎥ ⎢x21 x22 0 ⎢ 0 ⎥ ⎢ ⎥ = ⎢0 1 u23 u24 ⎥ . ⎣ 0 x32 x33 0 ⎦ ⎣0 0 1 u34 ⎦ 0 0 0 1 0 0 x43 x44 ⎡

Therefore, the entries xij in X for (i, j) ∈ Λm must satisfy the equations   

  

1 z22 z23 x22 1 z11 z12 x11 = , = , (38.10) z21 z22 x21 z32 z33 x32 0 0

(38.11)

z33 z34 z43 b44



  x33 1 = , x43 0

and

z44 x44 = 1 .

Next, since xjj > 0 for j = 1, . . . , 4, the matrices D = diag{x11 , . . . , x44 } and (38.12)

A = (X H )−1 DX −1

38.2. The central diagonals are specified

421

are positive definite. Moreover, equations (38.10) and (38.11) are in force, but with aij in place of zij . Therefore, the numbers cij = zij − aij are solutions of the equations

  

   c11 c12 x11 0 c22 c23 x22 0 (38.13) = , = , c21 c22 x21 c32 c33 x32 0 0

(38.14)

c33 c34 c43 c44



  x33 0 = , 0 x43

and

c44 x44 = 0 .

But, since the xjj are positive and each of the four submatrices are Hermitian, it is readily seen that cij = 0 for all the indicated entries; i.e., aij = zij for |i − j| ≤ 1. (Start with c44 x44 = 0 and work your way up.) Thus,  A ∈ Z Λm . We next give an independent proof (based on factorization) that the matrix A that satisfies the constraints in (38.7) maximizes the determinant. Theorem 38.6. If the conditions in (38.6) are in force and if A meets the constraints in (38.7) and C is any matrix in ZΛm , then det A ≥ det C, with equality if and only if C = A. Proof. In view of Theorem 16.5, (38.15)

A = (X H )−1 DX −1

and

C = (Y H )−1 GY −1 ,

where X ∈ C n×n and Y ∈ C n×n are lower triangular matrices with ones on the diagonal and D ∈ C n×n and G ∈ C n×n are positive definite diagonal matrices. Moreover, the formulas C = A + (C − A)

and

W = Y −1 X

imply that W H GW = D + X H (C − A)X . Thus, as W is lower triangular with ones on the diagonal and xij = 0 for i ≥ m + j, the diagonal entries of X H (C − A)X are all equal to zero. Consequently, djj =

n  s=j

gss |wsj |2 = gjj +



gss |wsj |2 ≥ gjj ,

s>j

with strict inequality unless wsj = 0 for s > j, i.e., unless W = In and hence C = A.  Exercise 38.10. Show that in terms of the notation in (38.15) D  G, with equality if and only if A = C.

422

38. A matrix completion problem

38.3. A moment problem Positive definite matrices A ∈ C n×n with (A−1 )ij = 0 for (i, j) ∈ Λm occur naturally in certain classes of moment problems. (See also Section 27.4.) Example 38.3. If p(λ) = λ − ω, |ω| > 1, and 6 2π 1 tj = e−ijθ |p(eiθ )|−2 dθ for j = 0, ±1, . . . , ±3 , 2π 0 then the matrix ⎤ ⎡ t0 t−1 t−2 t−3 ⎢t1 t0 t−1 t−2 ⎥ ⎥ A=⎢ ⎣t2 t1 t0 t−1 ⎦ t3 t2 t1 t0 is positive definite and (A−1 )ij = 0 for |i − j| > 1. Discussion.

It is readily checked that tj =

ω −j = t−j |ω|2 − 1

for j = 0, . . . , 3 ,

and hence (since the ranks of the relevant subblocks are all less than 3) that  the minors A{1;3} = A{1;4} = A{2;4} are all equal to zero.

38.4. Supplementary notes The section on the maximum entropy completion problem is adapted from the paper [32]. The analysis therein exploits factorization. The pioneering work on this problem was done by J. P. Burg [15], [16], who used Lagrange multipliers to maximize the determinant of a class of partially specified positive semidefinite matrices, as in Exercise 24.12. A description of the set of all the completions to the problem considered in [32] may be found, e.g., in Chapter 10 of [28]. Reformulations of this completion problem as a convex optimization problem, which in turn led to significant generalizations, were presented in Grone, Johnson, de S´a, and Wolkowicz [45]; for additional perspective, see Glunt, Hayden, Johnson, and Tarazaga [41], the survey paper by Johnson [50], and Theorem 3.2 of the earlier paper by Deutsch and Schneider [23]. Exercises 38.3–38.8 are adapted from an example that is presented in [50]. Example 38.3 is a special case of Theorem 12.20 in [30]. The algebraic structure underlying Theorem 38.5 is clarified in [33]; see also Gohberg, Kaashoek, and Woerdeman [42] for further generalizations.

Chapter 39

Minimal norm completions

In its simplest form, the minimal norm completion problem is formulated in terms of a given set of three scalars a, b, and c and

the objective is to a b choose a scalar z so that the norm of the 2 × 2 matrix is as small as c z possible. It is easily checked that 5 5 (5 5 2 5 a b 5 5 a 5   5 5 5 5 (39.1) 5 c z 5 ≥ γ, where γ = max 5 c 5 ,  a b  . Thus, |c|2 ≤ γ 2 − |a|2

and

|b|2 ≤ γ 2 − |a|2 .

Therefore, there exists a pair of scalars x, y ∈ C with |x| ≤ 1 and |y| ≤ 1 such that c = x(γ 2 − |a|2 )1/2 and b = (γ 2 − |a|2 )1/2 y . Upon making these substitutions and setting z = −xay, the matrix of interest can be expressed in factored form as 

  

a b 1 0 1 0 a (γ 2 − |a|2 )1/2 . = E , with E = c z 0 x 0 y (γ 2 − |a|2 )1/2 −a Thus, as E H E = γ 2 I2 , it is readily checked that the minimum norm is in fact equal to γ. The matrix analogue of this conclusion is presented in Theorem 39.6. To obtain it we shall need a more subtle variant of the elementary observation that if A is invertible, then: (1) AH A  C H C =⇒ C = KA for some K with K ≤ 1. (2) AAH  BB H =⇒ B = AL for some L with L ≤ 1. 423

424

39. Minimal norm completions

The next lemma serves to show that the preceding two implications are valid even if A is not invertible. Lemma 39.1. If A ∈ C p×q , B ∈ C p×r , and C ∈ C s×q , then: (1) AH A  C H C =⇒ there exists exactly one matrix K ∈ C s×p such that KA = C and Ku = 0 for every vector u ∈ NAH . (2) AAH  BB H =⇒ there exists exactly one matrix L ∈ C q×r such that B = AL and LH u = 0 for every vector u ∈ NA . Moreover, both of these uniquely specified matrices are contractions: K ≤ 1 and L ≤ 1. Proof. The inequality AH A  C H C implies that Ax ≥ Cx for every vector x ∈ C q and hence that NA ⊆ NC . Thus, RAH ⊇ RC H , since C q = NA ⊕ RAH = NC ⊕ RC H . Consequently, there exists a matrix G ∈ C p×s such that AH G = C H . Thus, upon setting K = GH , we see that Cu = KAu for every u ∈ C q . We shall also choose K so that Kv = 0 for every vector v ∈ NAH . This serves to define K uniquely. Moreover, since every vector x ∈ C p can be written as x = Au + v with u ∈ C q and v ∈ NAH , it is easily seen that Kx = KAu + Kv = KAu = Cu ≤ Au ≤ x and hence that K ≤ 1. The proof of the second assertion and the fact that L ≤ 1 goes through in much the same way. The details are left to the reader. 

39.1. A minimal norm completion problem This section treats a minimal norm completion problem. The main result is Theorem 39.5, which is usually referred to as Parrott’s lemma. The proof rests essentially on three preliminary lemmas that will be presented first. Lemma 39.2. If A ∈ C p×q , B ∈ C p×r , and C ∈ C s×q , then: (1) A ≤ γ ⇐⇒ γ 2 Iq − AH A  O ⇐⇒ γ 2 Ip − AAH  O. 5 5 5 A 5 2 H H 5 (2) γ ≥ 5 5 C 5 ⇐⇒ γ Iq − A A  C C. (3) γ 2 Iq − AH A  C H C ⇐⇒ C = X(γ 2 Iq − AH A)1/2 for some X ∈ C s×q with X ≤ 1. 5 5 (4) γ ≥ 5 A B 5 ⇐⇒ γ 2 Ip − AAH  BB H . (5) γ 2 Ip − AAH  BB H ⇐⇒ Y ∈ C p×r with Y  ≤ 1.

B = (γ 2 Ip − AAH )1/2 Y for some

39.1. A minimal norm completion problem

425

Proof. (1), (2), and (4) are easy consequences of the definitions. Thus, for example, γ ≥ A ⇐⇒ γ 2 x, x ≥ Ax, Ax

for every vector x ∈ C q

⇐⇒ (γ 2 Iq − AH A)x, x ≥ 0

for every vector x ∈ C q .

Lemma 39.1 then justifies the verification of (3) from (2) and (5) from (4).  Lemma 39.3. If A ∈ C p×q and A ≤ γ, then: (39.2)

(γ 2 Iq − AH A)1/2 AH = AH (γ 2 Ip − AAH )1/2 ,

(39.3)

(γ 2 Ip − AAH )1/2 A = A(γ 2 Iq − AH A)1/2

and the matrix (39.4)

E=

A (γ 2 Ip − AAH )1/2 2 H 1/2 (γ Iq − A A) −AH

satisfies the identity

EE H =

γ 2 Ip O O γ 2 Iq



 = γ 2 In .

Proof. The first two identities may be established with the aid of the singular value decomposition of A; the identity EE H = γ 2 In is an easy consequence of the first two.  Lemma 39.4. If A ∈ C p×q , B ∈ C p×r , C ∈ C s×q , and 5 2 (5 5 A 5   5 max 5 5 C 5,  A B  ≤ γ , then there exists a matrix D ∈ C s×r such that 5 5 5 A B 5 5 5 5 C D 5≤γ. Proof. The given inequality implies that γ 2 Iq − AH A  C H C

and

γ 2 Ip − AAH  BB H .

Therefore, by Lemma 39.1, (39.5)

B = (γ 2 Ip − AAH )1/2 X

for some choice of X ∈ C p×r and Y Thus, upon setting D = −Y AH X, it

 A B Ip = O C D

and

C = Y (γ 2 Iq − AH A)1/2

∈ C s×q with X ≤ 1 and Y  ≤ 1. is readily seen that   O Iq O E , Y O X

426

39. Minimal norm completions

where E is given by formula (39.4), and hence that 5 5 5 5 5 5 5 A B 5 5 Ip O 5 5 5 5 5≤5 5 E 5 Iq O 5 = E = γ , 5 C D 5 5 O Y 5 5 O X 5 since EE H = γ 2 In by Lemma 39.3 and the norm of each of the two block diagonal matrices is equal to one.  Theorem 39.5. If A ∈ C p×q , B ∈ C p×r , and C ∈ C s×q , then (39.6) (5 2 (5 5 2 5 5 A 5 5 A B 5   s×r 5 5 = max 5 min 5 5 C 5,  A B  . 5 C D 5:D∈C Proof. Let γ be equal to the right-hand side of (39.6). Then Lemma 39.4 implies that there exists a matrix D ∈ C s×r such that 5 5 5 A B 5 5 5 5 C D 5≤γ. Therefore, the left-hand side of (39.6) is less than or equal to the right-hand side of (39.6). Thus, as the left-hand side of (39.6) is ≥ γ, equality must prevail. 

39.2. A description of all solutions to the minimal norm completion problem Theorem 39.6. A matrix D ∈ C s×r achieves the minimum in (39.6) if and only if it can be expressed in the form D = −Y AH X + (Is − Y Y H )1/2 Z(Ir − X H X)1/2 ,

(39.7) where (39.8)

X = (γ 2 Ip − AAH )1/2

!†

B,

Y = C (γ 2 Iq − AH A)1/2

!†

and Z H Z ≤ γ 2 Ir .

is any matrix in C s×r such that

(39.9)

Z

Discussion.

The proof is outlined in a series of exercises with hints.

Exercise 39.1. Show that the inequality  

H CH A B A ! γ 2 Iq+r B H DH C D holds if and only if

H

H   A C 2 C D ! γ Iq+r − (39.10) H D BH





 A B .

,

39.2. A description of all solutions to the completion problem

427

Exercise 39.2. Show that

AH γ Iq+r − BH 2

where





 A B = MHM ,

 −AH X (γ 2 Iq − AH A)1/2 . M= O γ(Ir − X H X)1/2

[HINT: Use Lemma 39.3 and the formulas in (39.5).]   Exercise 39.3. Show that there exists exactly one matrix K1 K2 with components K1 ∈ C s×q and K2 ∈ C s×r such that     C D = K1 K2 M   = K1 (γ 2 Iq − AH A)1/2 −K1 AH X + K2 γ(Ir − X H X)1/2 and

K1 u1 + K2 u2 = 0 if M

H

 u1 = 0. u2

[HINT: Use (39.10), the identity in Exercise 39.2, and Lemma 39.1.] Exercise 39.4. Show that K1 = Y . [HINT: First check that

 H u1 = 0 ⇐⇒ (γ 2 Iq − AH A)1/2 u1 = 0 and M u2 −X H Au1 + γ(Ir − X H X)1/2 u2 = 0 ⇐⇒ (γ 2 Iq − AH A)1/2 u1 = 0 and (Ir − X H X)1/2 u2 = 0 , because / / 0† 0† X H A = B H (γ 2 Iq − AH A)1/2 A = B H A (γ 2 Ip − AAH )1/2 and NW H = NW † for any matrix W ∈ C k×k .] Exercise 39.5. Complete the proof. [HINT: Extract the formula D = −K1 AH X + γK2 (Iq − XX H )1/2 from Exercise 39.3 and then, taking note of the fact that K1 K1H + K2 K2H ! Is , replace K1 by Y and γK2 by (Is − Y Y H )1/2 Z.] 

428

39. Minimal norm completions

39.3. Supplementary notes This chapter is based largely on Section 12.13 in [30]. The minimal norm completion problem is adapted from Feintuch [36] and Zhou, Glover, and Doyle [77], both of which cite Davis, Kahan, and Weinberger [17] as a basic reference for this problem. Lemma 39.1 is a special case of (part of) Douglas’s lemma [25], which is applicable to bounded linear operators in Hilbert space.

Chapter 40

The numerical range

Let A ∈ C n×n . The set W (A) = {Ax, x : x ∈ C n and x = 1} is called the numerical range of A.

40.1. The numerical range is convex The objective of this section is to show that W (A) is a convex subset of C. We begin with a special case. Lemma 40.1. If B ∈ C n×n and if 0 ∈ W (B) and 1 ∈ W (B), then every point t ∈ [0, 1] also belongs to W (B). Proof. Let x, y ∈ C n be vectors such that x = y = 1, Bx, x = 1, and By, y = 0 and let ut = tγx + (1 − t)y , where |γ| = 1 and 0 ≤ t ≤ 1. Then But , ut  = t2 Bx, x + t(1 − t){γBx, y + γBy, x} + (1 − t)2 By, y = t2 + t(1 − t){γBx, y + γBy, x} . The next step is to show that there exists a choice of γ such that γBx, y + γBy, x is a real number. To this end it is convenient to write B = C + iD 429

430

40. The numerical range

in terms of its real and imaginary parts B + BH B − BH and D = . 2 2i Then, since C and D are both Hermitian matrices, C=

γBx, y + γBy, x = γCx, y + iγDx, y + γCy, x + iγDy, x = γc + γc + i{γd + γd} , where c = Cx, y

and d = Dx, y

are both independent of t. Now, in order to eliminate the imaginary component, set ( 1 if d = 0 , γ= i|d|−1 d if d = 0 . Then, for this choice of γ, But , ut  = t2 + t(1 − t)(γc + γc) . Moreover, since Bx, x = 1 and By, y = 0, the vectors x and y are linearly independent. Thus, ut 2 = tγx + (1 − t)y2 = t2 + (1 − t)t{γx, y + y, γx} + (1 − t)2 > 0 for every choice of t in the interval 0 ≤ t ≤ 1. Therefore, ut vt = ut  is a well-defined unit vector and Bvt , vt  =

t2 + t(1 − t){γc + γc} t2 + t(1 − t){γx, y + y, γx} + (1 − t)2

is a continuous real-valued function of t on the interval 0 ≤ t ≤ 1 such that Bv0 , v0  = 0 and

Bv1 , v1  = 1 .

Therefore, the equation Bvt , vt  = μ has at least one solution t ∈ [0, 1] for every choice of μ ∈ [0, 1].



Theorem 40.2 (Toeplitz-Hausdorff ). The numerical range W (A) of a matrix A ∈ C n×n is a convex subset of C. Proof. The objective is to show that if x = y = 1 and if Ax, x = a

and Ay, y = b ,

40.2. Eigenvalues versus numerical range

431

then for each choice of the number t in the interval 0 ≤ t ≤ 1, there exists a vector ut such that ut  = 1

Aut , ut  = ta + (1 − t)b .

and

If a = b, then ta + (1 − t)b = a = b, and hence we can choose ut = x or ut = y. Suppose therefore that a = b and let B = αA + βIn , where α, β are solutions of the system of equations aα + β = 1 bα + β = 0 . Then Bx, x = αAx, x + βx, x = αa + β = 1 and By, y = αAy, y + βy, y = αb + β = 0 . Therefore, by Lemma 40.1, for each choice of t in the interval 0 ≤ t ≤ 1, there exists a vector wt such that wt  = 1 and

Bwt , wt  = t .

But this in turn is the same as to say that αAwt , wt  + βwt , wt  = t + (1 − t)0 = t(αa + β) + (1 − t)(bα + β) = α{ta + (1 − t)b} + β . Thus, as wt , wt  = 1 and α = 0, Awt , wt  = ta + (1 − t)b , 

as claimed.

40.2. Eigenvalues versus numerical range Let λ1 , . . . , λk denote the distinct eigenvalues of a matrix A ∈ C n×n . The set ⎫ ⎧ k k ⎬ ⎨  tj λj : tj ≥ 0 and tj = 1 conv σ(A) = ⎭ ⎩ j=1

j=1

432

40. The numerical range

of convex combinations of λ1 , . . . , λk is called the convex hull of σ(A). Since the numerical range W (A) is convex and λj ∈ W (A) for j = 1, . . . , k, it is clear that (40.1)

conv σ(A) ⊆ W (A)

for every A ∈ C n×n .

In general, however, these two sets can be quite different. If

 0 0 A= , 1 0 for example, then σ(A) = {0}

and W (A) = {ab : a, b ∈ C and |a|2 + |b|2 = 1} .

The situation for normal matrices is markedly different:

Theorem 40.3. Let $A\in\mathbb{C}^{n\times n}$ be a normal matrix, i.e., $AA^H = A^HA$. Then the convex hull of $\sigma(A)$ is equal to the numerical range of $A$, i.e., $\operatorname{conv}\sigma(A) = W(A)$.

Proof. Since $A$ is normal, it is unitarily equivalent to a diagonal matrix, i.e., there exists a unitary matrix $U\in\mathbb{C}^{n\times n}$ such that $U^HAU = \operatorname{diag}\{\lambda_1,\ldots,\lambda_n\}$. The columns $u_1,\ldots,u_n$ of $U$ form an orthonormal basis for $\mathbb{C}^n$. Thus, if $x\in\mathbb{C}^n$ and $\|x\| = 1$, then $x = \sum_{i=1}^n c_iu_i$ is a linear combination of $u_1,\ldots,u_n$,
$$\langle Ax, x\rangle = \left\langle A\sum_{i=1}^n c_iu_i,\ \sum_{j=1}^n c_ju_j\right\rangle = \sum_{i,j=1}^n \lambda_i c_i\overline{c_j}\langle u_i, u_j\rangle = \sum_{i=1}^n \lambda_i|c_i|^2\,,$$
and
$$\sum_{i=1}^n |c_i|^2 = \|x\|^2 = 1\,.$$
Therefore, $W(A)\subseteq\operatorname{conv}\sigma(A)$ and hence, as the opposite inclusion (40.1) is already known to be in force, the proof is complete. $\Box$

Exercise 40.1. Verify the inclusion $\operatorname{conv}\sigma(A)\subseteq W(A)$ for normal matrices $A\in\mathbb{C}^{n\times n}$ by checking directly that every convex combination $\sum_{i=1}^n t_i\lambda_i$ of the eigenvalues $\lambda_1,\ldots,\lambda_n$ of $A$ belongs to $W(A)$. [HINT: $\sum_{i=1}^n t_i\lambda_i = \sum_{i=1}^n t_i\langle Au_i, u_i\rangle = \left\langle A\sum_{i=1}^n\sqrt{t_i}\,u_i,\ \sum_{j=1}^n\sqrt{t_j}\,u_j\right\rangle$.]

Exercise 40.2. Find the numerical range of the matrix $\begin{bmatrix}0 & 0 & i\\ 1 & 0 & 0\\ 0 & 1 & 0\end{bmatrix}$.
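Theorem 40.3 can also be tested numerically. The following sketch (using numpy and scipy; the random normal matrix and the nonnegative least-squares membership test are just one convenient set of choices) samples $W(A)$ for a normal $A$ and checks that every sampled point is a convex combination of the eigenvalues.

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(1)
n = 5
# A normal matrix: a unitary conjugate of a diagonal matrix.
D = np.diag(rng.standard_normal(n) + 1j * rng.standard_normal(n))
Q, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
A = Q @ D @ Q.conj().T

eigs = np.linalg.eigvals(A)
X = rng.standard_normal((5000, n)) + 1j * rng.standard_normal((5000, n))
X /= np.linalg.norm(X, axis=1, keepdims=True)
W = np.einsum('ij,jk,ik->i', X.conj(), A, X)

# Membership in conv sigma(A): solve for nonnegative weights t with
# sum t_i lambda_i = w and sum t_i = 1 (feasible iff the residual is ~ 0).
M = np.vstack([eigs.real, eigs.imag, np.ones(n)])
ok = True
for w in W[:200]:
    t, res = nnls(M, np.array([w.real, w.imag, 1.0]))
    ok &= res < 1e-8
print("all sampled points of W(A) lie in conv sigma(A):", ok)
```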

40.3. The Gauss-Lucas theorem

Theorem 40.4. Let $f(\lambda) = a_0 + a_1\lambda + \cdots + a_{n-1}\lambda^{n-1} + \lambda^n$ be a polynomial of degree $n\ge 1$ with coefficients $a_i\in\mathbb{C}$ for $i = 0,\ldots,n-1$. Then the roots of the derivative $f'(\lambda)$ lie in the convex hull of the roots of $f(\lambda)$.

Proof. The proof exploits two general facts for matrices $A\in\mathbb{C}^{n\times n}$ that are expressed in terms of the notation $\varphi(\lambda) = \det(\lambda I_n - A)$ and the block decomposition $A = \begin{bmatrix}A_{11} & A_{12}\\ A_{21} & A_{22}\end{bmatrix}$ with $A_{11}\in\mathbb{C}$ and $A_{22}\in\mathbb{C}^{(n-1)\times(n-1)}$ as:
$$(40.2)\qquad \varphi'(\lambda) = \operatorname{trace}\{(\lambda I_n - A)^{-1}\}\,\varphi(\lambda) \quad\text{for } \lambda\notin\sigma(A)$$
and
$$(40.3)\qquad \det(\lambda I_{n-1} - A_{22}) = e_1^T(\lambda I_n - A)^{-1}e_1\,\varphi(\lambda) \quad\text{for } \lambda\notin\sigma(A)\,.$$
Let $\mu_1,\ldots,\mu_n$ denote the roots of $f(\lambda)$, allowing repetitions as needed, and set $A = UDU^H$, $D = \operatorname{diag}\{\mu_1,\ldots,\mu_n\}$, with $U\in\mathbb{C}^{n\times n}$ unitary and $U^He_1 = \frac{1}{\sqrt{n}}\begin{bmatrix}1 & 1 & \cdots & 1\end{bmatrix}^T$. Then
$$e_1^T(\lambda I_n - A)^{-1}e_1 = e_1^TU(\lambda I_n - D)^{-1}U^He_1 = \frac{1}{n}\sum_{i=1}^n\frac{1}{\lambda - \mu_i} = \frac{1}{n}\operatorname{trace}(\lambda I_n - D)^{-1} = \frac{1}{n}\operatorname{trace}(\lambda I_n - A)^{-1}$$
for $\lambda\notin\sigma(A)$. Thus, in view of (40.2) and (40.3),
$$\frac{\varphi'(\lambda)}{\varphi(\lambda)} = n\,\frac{\det(\lambda I_{n-1} - A_{22})}{\varphi(\lambda)}\,.$$
Consequently, $f'(\lambda) = \varphi'(\lambda) = n\det(\lambda I_{n-1} - A_{22})$.

The last formula serves to identify the eigenvalues of $A_{22}$ with the roots of $f'(\lambda)$. Moreover, in view of (40.1),
$$\operatorname{conv}\sigma(A_{22}) \subseteq W(A_{22}) = \{x^HA_{22}x : x\in\mathbb{C}^{n-1}\ \text{and}\ \|x\| = 1\} = \left\{\begin{bmatrix}0 & x^H\end{bmatrix}A\begin{bmatrix}0\\ x\end{bmatrix} : x\in\mathbb{C}^{n-1}\ \text{and}\ \|x\| = 1\right\} \subseteq \{y^HAy : y\in\mathbb{C}^n\ \text{and}\ \|y\| = 1\} = W(A) = \operatorname{conv}\sigma(A)\,;$$
the last equality follows from Theorem 40.3, since $A$ is normal. $\Box$
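The key identity $f'(\lambda) = n\det(\lambda I_{n-1} - A_{22})$ in the proof can be checked numerically: build a unitary $U$ whose image of $e_1$ under $U^H$ is the normalized all-ones vector, form $A = UDU^H$, and compare the eigenvalues of $A_{22}$ with the roots of $f'$. The sketch below uses a Householder reflection for $U$ (one convenient choice among many) and randomly chosen roots.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 6
mu = rng.standard_normal(n) + 1j * rng.standard_normal(n)      # roots of f
f = np.poly(mu)                                                 # monic coefficients of f
fprime = np.polyder(f)

# Householder reflection U (unitary and Hermitian) with U^H e1 = (1,...,1)^T/sqrt(n)
v = np.ones(n) / np.sqrt(n)
w = v - np.eye(n)[:, 0]
U = np.eye(n) - 2.0 * np.outer(w, w) / (w @ w)

A = U @ np.diag(mu) @ U.conj().T
A22 = A[1:, 1:]

roots_fprime = np.sort_complex(np.roots(fprime))
eigs_A22 = np.sort_complex(np.linalg.eigvals(A22))
# expect True for generic (well-separated) roots
print(np.allclose(roots_fprime, eigs_A22, atol=1e-8))
```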


40.4. The Heinz inequality

Lemma 40.5. If $A = A^H\in\mathbb{C}^{p\times p}$ and $B = B^H\in\mathbb{C}^{q\times q}$ are both invertible and $X\in\mathbb{C}^{p\times q}$, then
$$(40.4)\qquad 2\|X\| \le \|AXB^{-1} + A^{-1}XB\|\,.$$

Proof. Suppose first that $p = q$ and $X = X^H$, and let $\lambda\in\sigma(X)$. Then $\lambda\in\sigma(A^{-1}XA)$, and hence there exists a unit vector $x\in\mathbb{C}^p$ such that $\lambda = \langle AXA^{-1}x, x\rangle$ and
$$\overline{\lambda} = \langle x, AXA^{-1}x\rangle = \langle A^{-1}XAx, x\rangle\,.$$
Therefore, since $\lambda = \overline{\lambda}$,
$$|2\lambda| = |\langle(AXA^{-1} + A^{-1}XA)x, x\rangle| \le \|AXA^{-1} + A^{-1}XA\|\,,$$
which leads easily to the inequality
$$(40.5)\qquad 2\|X\| \le \|AXA^{-1} + A^{-1}XA\|\,.$$
To extend this inequality to matrices $X\in\mathbb{C}^{p\times p}$ that are not necessarily Hermitian, apply it to the matrices
$$\mathscr{X} = \begin{bmatrix}O & X\\ X^H & O\end{bmatrix} \quad\text{and}\quad \mathscr{A} = \begin{bmatrix}A & O\\ O & A\end{bmatrix}$$
and note that $\|\mathscr{X}\| = \|X\|$ and
$$\|\mathscr{A}\mathscr{X}\mathscr{A}^{-1} + \mathscr{A}^{-1}\mathscr{X}\mathscr{A}\| = \|AXA^{-1} + A^{-1}XA\|\,.$$
Finally, (40.4) follows from (40.5) applied to the square matrices
$$\begin{bmatrix}O_{p\times p} & X\\ O_{q\times p} & O_{q\times q}\end{bmatrix} \ \text{in place of } X \quad\text{and}\quad \begin{bmatrix}A & O_{p\times q}\\ O_{q\times p} & B\end{bmatrix} \ \text{in place of } A\,. \qquad\Box$$

The fractional power $A^t$, $0\le t\le 1$, of a matrix $A\succeq O$ with singular value decomposition $A = VSV^H$ is defined as $A^t = VS^tV^H$. We shall also make use of the fact that if $f$ is continuous on the interval $[0,1]$ and $f((a+b)/2)\le[f(a)+f(b)]/2$, then $f$ is convex on $[0,1]$.

Exercise 40.3. Show that if $A\in\mathbb{C}^{n\times n}$ and $A\succeq O$, then $f(t) = \|A^t\|$ is continuous on $[0,1]$.

Theorem 40.6 (Heinz). If $A\in\mathbb{C}^{p\times p}$ and $B\in\mathbb{C}^{q\times q}$ are both positive semidefinite and $X\in\mathbb{C}^{p\times q}$, then
$$(40.6)\qquad 2\|AXB\| \le \|A^2X + XB^2\|$$
and
$$(40.7)\qquad \|A^tXB^{1-t} + A^{1-t}XB^t\| \le \|AX + XB\| \quad\text{for } 0\le t\le 1\,.$$

Proof. To verify (40.6), let $A_\varepsilon = A + \varepsilon I_p$ and $B_\varepsilon = B + \varepsilon I_q$ with $\varepsilon > 0$. Then, since $A_\varepsilon$ and $B_\varepsilon$ are invertible Hermitian matrices, we can invoke (40.4) to obtain the inequality
$$2\|A_\varepsilon XB_\varepsilon\| \le \|A_\varepsilon^2XB_\varepsilon B_\varepsilon^{-1} + A_\varepsilon^{-1}A_\varepsilon XB_\varepsilon^2\| = \|A_\varepsilon^2X + XB_\varepsilon^2\|\,,$$
which tends to (40.6) as $\varepsilon\downarrow 0$.

Next, to verify (40.7), let $f(t) = \|A^tXB^{1-t} + A^{1-t}XB^t\|$, let $0\le a < b\le 1$, and set $c = (a+b)/2$ and $d = (b-a)/2$. Then, as $c = a + d$ and $1 - c = 1 - b + d$, (40.6) implies that
$$f(c) = \|A^cXB^{1-c} + A^{1-c}XB^c\| = \left\|A^d\left[A^aXB^{1-b} + A^{1-b}XB^a\right]B^d\right\|$$
$$\le \tfrac{1}{2}\left\|A^{2d}\left[A^aXB^{1-b} + A^{1-b}XB^a\right] + \left[A^aXB^{1-b} + A^{1-b}XB^a\right]B^{2d}\right\| = \tfrac{1}{2}\left\|A^bXB^{1-b} + A^{1-a}XB^a + A^aXB^{1-a} + A^{1-b}XB^b\right\| \le \frac{f(a) + f(b)}{2}\,;$$
i.e., $f(t)$ is a convex function on the interval $0\le t\le 1$. Thus, as $f(0) = f(1) = \|AX + XB\|$,
$$f(t) = f((1-t)\cdot 0 + t\cdot 1) \le (1-t)f(0) + tf(1) = (1-t)f(0) + tf(0) = f(0)$$
for every point $t$ in the interval $0\le t\le 1$, which is equivalent to (40.7). $\Box$

Theorem 40.7. Let $A\in\mathbb{C}^{p\times p}$, $B\in\mathbb{C}^{p\times p}$ and suppose that $A\succeq O$ and $B\succeq O$. Then
$$(40.8)\qquad \|A^sB^s\| \le \|AB\|^s \quad\text{for } 0\le s\le 1\,.$$

Proof. Let $Q = \{u\in[0,1] : \|A^uB^u\| \le \|AB\|^u\}$ and let $s$ and $t$ be a pair of points in $Q$. Then, with the help of the auxiliary inequality
$$\|A^{(s+t)/2}B^{(s+t)/2}\|^2 = \|B^{(s+t)/2}A^{s+t}B^{(s+t)/2}\| = r_\sigma(B^{(s+t)/2}A^{s+t}B^{(s+t)/2}) = r_\sigma(B^sA^{s+t}B^t) \le \|B^sA^{s+t}B^t\| \le \|B^sA^s\|\,\|A^tB^t\| = \|A^sB^s\|\,\|A^tB^t\| \le \|AB\|^{s+t}\,,$$
it is readily checked that $(s+t)/2\in Q$ and hence that $Q$ is convex. The proof is easily completed, since $0\in Q$ and $1\in Q$. $\Box$

Exercise 40.4. Verify each of the assertions that lead to the auxiliary inequality in the proof of Theorem 40.7.

Exercise 40.5. Show that if $A$ and $B$ are as in Theorem 40.7, then $\varphi(s) = \|A^sB^s\|^{1/s}$ is an increasing function of $s$ for $s > 0$.
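The inequalities (40.6), (40.7), and (40.8) are easy to spot-check numerically. The sketch below uses random positive semidefinite test matrices, the spectral norm, and scipy.linalg.fractional_matrix_power for the fractional powers $A^t$; the small tolerance added on the right-hand sides only absorbs floating-point rounding.

```python
import numpy as np
from scipy.linalg import fractional_matrix_power

rng = np.random.default_rng(3)
p, q = 4, 3
M = rng.standard_normal((p, p)); A = M @ M.T            # positive semidefinite
N = rng.standard_normal((q, q)); B = N @ N.T
X = rng.standard_normal((p, q))
op = lambda Y: np.linalg.norm(Y, 2)                      # operator (spectral) norm

# Heinz inequality (40.6): 2*||A X B|| <= ||A^2 X + X B^2||
print(2 * op(A @ X @ B) <= op(A @ A @ X + X @ B @ B) + 1e-10)

# (40.7) for a few values of t
for t in (0.0, 0.25, 0.5, 0.75, 1.0):
    At, A1t = fractional_matrix_power(A, t), fractional_matrix_power(A, 1 - t)
    Bt, B1t = fractional_matrix_power(B, t), fractional_matrix_power(B, 1 - t)
    print(t, op(At @ X @ B1t + A1t @ X @ Bt) <= op(A @ X + X @ B) + 1e-10)

# (40.8) for a few values of s, with a second PSD matrix of the same size
C = rng.standard_normal((p, p)); C = C @ C.T
for s in (0.25, 0.5, 0.75):
    lhs = op(fractional_matrix_power(A, s) @ fractional_matrix_power(C, s))
    print(s, lhs <= op(A @ C) ** s + 1e-10)
```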

40.5. Supplementary notes This chapter is adapted from Chapter 22 of [30]. The presented proof of the convexity of numerical range is based on an argument that is sketched briefly in [46]. Halmos credits it to C. W. R. de Boor. The presented proof works also for bounded operators in Hilbert space; see also McIntosh [61] for another very attractive approach. The proof of the Heinz inequality is taken from the beautiful short paper [37] by Fujii, Fujii, Furuta, and Nakomoto that establishes the Heinz inequality (40.7) for bounded operators in Hilbert space and sketches the history. The elegant passage from (40.6) to (40.7) is credited to an unpublished paper of A. McIntosh. The proof of Theorem 40.7 is adapted from a paper by Furuta [38]. The proof of the Gauss-Lucas theorem in Section 40.3 is adapted from an exercise in the monograph [8] by Bakonyi and Woerdeman. A complete description of the numerical range W (A) of a matrix A ∈ C n×n is presented in Helton and Spitkovsky [48].

Chapter 41

Riccati equations

In this chapter we shall investigate the existence and uniqueness of solutions $X\in\mathbb{C}^{n\times n}$ to the Riccati equation
$$(41.1)\qquad A^HX + XA + XRX + Q = O \quad\text{when } R = R^H,\ Q = Q^H,$$
and $A, R, Q\in\mathbb{C}^{n\times n}$. This class of equations has important applications, one of which (the LQR problem) will be discussed in Section 41.3.

Exercise 41.1. Show that if the Riccati equation (41.1) has exactly one solution $X\in\mathbb{C}^{n\times n}$, then $X = X^H$.

41.1. Riccati equations

The study of the Riccati equation (41.1) is intimately connected with the invariant subspaces of the matrix
$$(41.2)\qquad G = \begin{bmatrix}A & R\\ -Q & -A^H\end{bmatrix} \quad\text{with } R = R^H \text{ and } Q = Q^H,$$
which is often referred to as the Hamiltonian matrix in the control theory literature. The first order of business is to verify that the eigenvalues of $G$ are symmetrically distributed with respect to the imaginary axis $i\mathbb{R}$:

Lemma 41.1. The roots of the polynomial $p(\lambda) = \det(\lambda I_{2n} - G)$ are symmetrically distributed with respect to $i\mathbb{R}$.

Proof. This is a simple consequence of the identity
$$(41.3)\qquad SGS^{-1} = -G^H,$$
in terms of the orthogonal matrix
$$(41.4)\qquad S = \begin{bmatrix}O & -I_n\\ I_n & O\end{bmatrix}. \qquad\Box$$

Exercise 41.2. Verify the identity $SGS^{-1} = -G^H$ and the assertion of Lemma 41.1.

Exercise 41.3. Show that if $E = \begin{bmatrix}A & B\\ C & A\end{bmatrix}$ is a $2p\times 2p$ matrix with $p\times p$ blocks $A = A^H$, $B = B^H$, and $C = C^H$, then $\lambda\in\sigma(E) \iff \overline{\lambda}\in\sigma(E)$. [HINT: It suffices to show that $E$ is similar to $E^H$.]

If $\sigma(G)\cap i\mathbb{R} = \emptyset$, then Lemma 41.1 guarantees that $G$ admits a Jordan decomposition of the form
$$(41.5)\qquad G = U\begin{bmatrix}J_1 & O\\ O & J_2\end{bmatrix}U^{-1},$$
where $J_1, J_2\in\mathbb{C}^{n\times n}$, $\sigma(J_1)\subset\mathbb{C}_L$, the open left half-plane, and $\sigma(J_2)\subset\mathbb{C}_R$, the open right half-plane. It turns out that the upper left-hand $n\times n$ corner $X_1$ of the matrix $U$ will play a central role in the subsequent analysis; i.e., upon writing
$$U\begin{bmatrix}I_n\\ O\end{bmatrix} = \begin{bmatrix}X_1\\ X_2\end{bmatrix} \quad\text{and}\quad \Lambda = J_1,$$
so that
$$(41.6)\qquad G\begin{bmatrix}X_1\\ X_2\end{bmatrix} = \begin{bmatrix}X_1\\ X_2\end{bmatrix}\Lambda \quad\text{and}\quad \sigma(\Lambda)\subset\mathbb{C}_L\,,$$
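The identity (41.3) and the spectral symmetry of Lemma 41.1 can be confirmed numerically. The following sketch uses random real test matrices (so that $G^H = G^T$); the choice of sizes is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 3
A = rng.standard_normal((n, n))
R = rng.standard_normal((n, n)); R = R + R.T             # R = R^H
Q = rng.standard_normal((n, n)); Q = Q + Q.T             # Q = Q^H

G = np.block([[A, R], [-Q, -A.T]])                       # Hamiltonian matrix (41.2)
S = np.block([[np.zeros((n, n)), -np.eye(n)],
              [np.eye(n), np.zeros((n, n))]])

print(np.allclose(S @ G @ np.linalg.inv(S), -G.T))       # identity (41.3)

eigs = np.linalg.eigvals(G)
# Lemma 41.1: the multiset of real parts is symmetric about 0.
print(np.allclose(np.sort(eigs.real), np.sort(-eigs.real)))
```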

the case in which $X_1$ is invertible will be particularly significant.

Lemma 41.2. If $X\in\mathbb{C}^{n\times n}$ is a solution of the Riccati equation (41.1) such that $\sigma(A + RX)\subset\mathbb{C}_L$, then $X = X^H$.

Proof. If $X$ is a solution of (41.1), then
$$(41.7)\qquad A^HX^H + X^HA + X^HRX^H + Q = O\,.$$
Therefore,
$$A^H(X - X^H) + (X - X^H)A + XRX - X^HRX^H = O$$
and hence
$$(A^H + X^HR)(X - X^H) + (X - X^H)(A + RX) = O\,.$$
Consequently, upon setting $\Lambda = A + RX$, we see that the matrix $Z = X - X^H$ is a solution of the equation
$$(41.8)\qquad Z\Lambda + \Lambda^HZ = O.$$
However, since $\sigma(\Lambda)\subset\mathbb{C}_L \Longrightarrow \sigma(\Lambda^H)\subset\mathbb{C}_L$, Theorem 37.6 ensures that $Z = O$ is the only solution of (41.8). Therefore, $X = X^H$. $\Box$

Exercise 41.4. Show that if $\sigma(G)\cap i\mathbb{R} = \emptyset$, then (41.6) is in force for some matrix $\Lambda\in\mathbb{C}^{n\times n}$ (that is not necessarily in Jordan form) and
$$(41.9)\qquad X_1^HX_2 = X_2^HX_1\,.$$
[HINT: Use (41.3) to obtain (41.8), but with $Z = X_2^HX_1 - X_1^HX_2$.]

Theorem 41.3. If $\sigma(G)\cap i\mathbb{R} = \emptyset$ and the matrix $X_1$ in (41.6) is invertible, then the matrix $X = X_2X_1^{-1}$ enjoys the following properties:
(1) $X$ is a solution of the Riccati equation (41.1).
(2) $\sigma(A + RX)\subset\mathbb{C}_L$.
(3) $X = X^H$.

Proof. If $X_1$ is invertible and $X = X_2X_1^{-1}$, then formula (41.6) implies that
$$G\begin{bmatrix}I_n\\ X\end{bmatrix} = \begin{bmatrix}I_n\\ X\end{bmatrix}X_1\Lambda X_1^{-1}$$
and hence, upon filling in the block entries in $G$ and writing this out in detail, that
$$A + RX = X_1\Lambda X_1^{-1}\,, \qquad -Q - A^HX = X(X_1\Lambda X_1^{-1})\,.$$
Therefore, $-Q - A^HX = X(A + RX)$, which serves to verify (1). Assertion (2) follows from the formula $A + RX = X_1\Lambda X_1^{-1}$ and the fact that $\sigma(\Lambda)\subset\mathbb{C}_L$; (3) now follows from (1), (2), and Lemma 41.2. $\Box$

Exercise 41.5. Show that if $X$ is a solution of the Riccati equation (41.1) such that $\sigma(A + RX)\subset\mathbb{C}_L$, then $\sigma(G)\cap i\mathbb{R} = \emptyset$ and the matrix $X_1$ in (41.6) is invertible. [REMARK: This is a converse to Theorem 41.3.]

Theorem 41.4. The Riccati equation (41.1) has at most one solution $X\in\mathbb{C}^{n\times n}$ such that $\sigma(A + RX)\subset\mathbb{C}_L$.

Proof. Let $X$ and $Y$ be a pair of solutions of the Riccati equation (41.1) such that $\sigma(A + RX)\subset\mathbb{C}_L$ and $\sigma(A + RY)\subset\mathbb{C}_L$. Then, since
$$A^HX + XA + XRX + Q = O \quad\text{and}\quad A^HY + YA + YRY + Q = O\,,$$
it is clear that
$$A^H(X - Y) + (X - Y)A + XRX - YRY = O\,.$$
However, as $Y = Y^H$ by Lemma 41.2, this last equation can also be reexpressed as
$$(A + RY)^H(X - Y) + (X - Y)(A + RX) = O\,,$$
which exhibits $X - Y$ as the solution of an equation of the form $BZ + ZC = O$ with $\sigma(B)\subset\mathbb{C}_L$ and $\sigma(C)\subset\mathbb{C}_L$. Theorem 37.6 ensures that this equation has at most one solution. Thus, as $Z = O_{n\times n}$ is a solution, it is in fact the only solution. Therefore $X = Y$, as claimed. $\Box$

The next theorem provides conditions under which the constraints imposed on the Hamiltonian matrix $G$ in Theorem 41.3 are satisfied when $R = -BB^H$ and $Q = C^HC$.

Theorem 41.5. Let $A\in\mathbb{C}^{n\times n}$, $B\in\mathbb{C}^{n\times k}$, $C\in\mathbb{C}^{r\times n}$ and suppose that

$$(41.10)\qquad \operatorname{rank}\begin{bmatrix}A - \lambda I_n\\ C\end{bmatrix} = n \quad\text{for every point } \lambda\in i\mathbb{R}$$
and
$$(41.11)\qquad \operatorname{rank}\,[A - \lambda I_n \quad B] = n \quad\text{for every point } \lambda\in\mathbb{C}_R\,.$$
Then there exists exactly one solution $X\in\mathbb{C}^{n\times n}$ of the Riccati equation
$$(41.12)\qquad A^HX + XA - XBB^HX + C^HC = O$$
such that $\sigma(A - BB^HX)\subset\mathbb{C}_L$. Moreover, this solution $X$ is positive semidefinite, and if $A$, $B$, and $C$ are real matrices, then $X\in\mathbb{R}^{n\times n}$.

Proof. If $R = -BB^H$ and $Q = C^HC$, then
$$G = \begin{bmatrix}A & -BB^H\\ -C^HC & -A^H\end{bmatrix}.$$
The proof is divided into steps.

1. If (41.10) and (41.11) are in force, then $\sigma(G)\cap i\mathbb{R} = \emptyset$.

If
$$\begin{bmatrix}A & -BB^H\\ -C^HC & -A^H\end{bmatrix}\begin{bmatrix}x\\ y\end{bmatrix} = \lambda\begin{bmatrix}x\\ y\end{bmatrix}$$
for some choice of $x\in\mathbb{C}^n$, $y\in\mathbb{C}^n$, and $\lambda\in\mathbb{C}$, then
$$(A - \lambda I_n)x = BB^Hy \quad\text{and}\quad (A^H + \lambda I_n)y = -C^HCx\,.$$
Therefore,
$$\langle(A - \lambda I_n)x, y\rangle = \langle BB^Hy, y\rangle = \|B^Hy\|_2^2$$
and
$$\langle(A + \overline{\lambda}I_n)x, y\rangle = \langle x, (A^H + \lambda I_n)y\rangle = -\langle x, C^HCx\rangle = -\|Cx\|_2^2\,.$$
Thus,
$$-(\lambda + \overline{\lambda})\langle x, y\rangle = \langle(A - \lambda I_n)x, y\rangle - \langle(A + \overline{\lambda}I_n)x, y\rangle = \|B^Hy\|_2^2 + \|Cx\|_2^2$$
and hence
$$\lambda + \overline{\lambda} = 0 \Longrightarrow B^Hy = 0 \quad\text{and}\quad Cx = 0\,,$$
which in turn implies that
$$\begin{bmatrix}A - \lambda I_n\\ C\end{bmatrix}x = 0 \quad\text{and}\quad y^H[A + \overline{\lambda}I_n \quad B] = 0$$
when $\lambda\in i\mathbb{R}$. However, in view of (41.10) and (41.11), this is viable only if $x = 0$ and $y = 0$. Consequently, $\sigma(G)\cap i\mathbb{R} = \emptyset$.

2. If (41.10) and (41.11) are in force, then
$$\begin{bmatrix}A & -BB^H\\ -C^HC & -A^H\end{bmatrix}\begin{bmatrix}X_1\\ X_2\end{bmatrix} = \begin{bmatrix}X_1\\ X_2\end{bmatrix}\Lambda \quad\text{and}\quad \operatorname{rank}\begin{bmatrix}X_1\\ X_2\end{bmatrix} = n\,,$$
where $X_1, X_2, \Lambda\in\mathbb{C}^{n\times n}$, $\sigma(\Lambda)\subset\mathbb{C}_L$, and $X_1$ is invertible.

The displayed formula and the inclusion $\sigma(\Lambda)\subset\mathbb{C}_L$ follow from step 1. The main task is to show that $X_1$ is invertible. Towards this end, suppose that $u\in\mathcal{N}_{X_1}$. Then
$$-BB^HX_2u = X_1\Lambda u\,,$$
and hence, as $X_2^HX_1 = X_1^HX_2$ by Exercise 41.4,
$$-\|B^HX_2u\|^2 = \langle X_1\Lambda u, X_2u\rangle = \langle\Lambda u, X_1^HX_2u\rangle = \langle\Lambda u, X_2^HX_1u\rangle = \langle\Lambda u, 0\rangle = 0\,.$$
Thus, $B^HX_2u = 0$ and $X_1\Lambda u = -BB^HX_2u = 0$, which means that $\mathcal{N}_{X_1}$ is invariant under $\Lambda$ and hence that either $\mathcal{N}_{X_1} = \{0\}$ or $\Lambda v = \lambda v$ for some point $\lambda\in\mathbb{C}_L$ and some nonzero vector $v\in\mathcal{N}_{X_1}$. In the latter case,
$$-BB^HX_2v = X_1\Lambda v = \lambda X_1v = 0$$
and
$$-A^HX_2v = X_2\Lambda v = \lambda X_2v\,,$$
i.e.,
$$v^HX_2^H[A + \overline{\lambda}I_n \quad B] = 0^H$$
for some point $\lambda\in\mathbb{C}_L$. Therefore, since $-\overline{\lambda}\in\mathbb{C}_R$, assumption (41.11) implies that $X_2v = 0$. Consequently,
$$\begin{bmatrix}X_1\\ X_2\end{bmatrix}v = 0 \Longrightarrow v = 0 \Longrightarrow \mathcal{N}_{X_1} = \{0\} \Longrightarrow X_1 \text{ is invertible}\,.$$

3. Completing the proof.

Theorem 41.3 ensures that there exists at least one solution $X$ of the Riccati equation (41.12) such that $\sigma(A - BB^HX)\subset\mathbb{C}_L$, whereas Theorem 41.4 ensures that there is at most one such solution. Therefore, there is exactly one such solution. To verify that this solution $X$ is positive semidefinite, it is convenient to express the Riccati equation $A^HX + XA - XBB^HX + C^HC = O$ as
$$(A - BB^HX)^HX + X(A - BB^HX) = -C^HC - XBB^HX\,,$$
which is of the form $A_1^HX + XA_1 = Q$, where $\sigma(A_1)\subset\mathbb{C}_L$ and $-Q\succeq O$. The desired result then follows by invoking Lemma 37.7. Finally, if the matrices $A$, $B$, and $C$ are real, then the matrix $\overline{X}$ is also a Hermitian solution of the Riccati equation (41.12) such that $\sigma(A - BB^H\overline{X})\subset\mathbb{C}_L$. Therefore, $\overline{X} = X$, i.e., $X\in\mathbb{R}^{n\times n}$. $\Box$

Exercise 41.6. Let $A\in\mathbb{C}^{n\times n}$, $B\in\mathbb{C}^{n\times k}$. Show that if $\sigma(A)\cap i\mathbb{R} = \emptyset$ and (41.11) holds, then there exists exactly one Hermitian solution $X$ of the Riccati equation $A^HX + XA - XBB^HX = O$ such that $\sigma(A - BB^HX)\subset\mathbb{C}_L$.

For future applications, it will be convenient to have another variant of Theorem 41.5.
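The recipe of Theorem 41.3 (form the Hamiltonian, extract its stable invariant subspace, and set $X = X_2X_1^{-1}$) can be carried out directly in a few lines. The sketch below uses random real data, for which the rank conditions (41.10)-(41.11) hold generically, builds the stable subspace from eigenvectors (this assumes $G$ is diagonalizable; an ordered Schur decomposition is the numerically preferred route), and cross-checks the result against scipy.linalg.solve_continuous_are.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

rng = np.random.default_rng(5)
n, k, r = 4, 2, 3
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, k))
C = rng.standard_normal((r, n))

# Hamiltonian matrix for R = -B B^H and Q = C^H C
G = np.block([[A, -B @ B.T], [-C.T @ C, -A.T]])

w, V = np.linalg.eig(G)
Vs = V[:, w.real < 0]                         # basis for the stable invariant subspace
X1, X2 = Vs[:n, :], Vs[n:, :]
X = np.real(X2 @ np.linalg.inv(X1))           # X = X2 X1^{-1}, as in Theorem 41.3
X = (X + X.T) / 2                             # symmetrize away rounding errors

print(np.allclose(A.T @ X + X @ A - X @ B @ B.T @ X + C.T @ C, 0, atol=1e-8))
print(np.all(np.linalg.eigvals(A - B @ B.T @ X).real < 0))          # stabilizing
print(np.allclose(X, solve_continuous_are(A, B, C.T @ C, np.eye(k)), atol=1e-6))
```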


Theorem 41.6. Let $A\in\mathbb{C}^{n\times n}$, $B\in\mathbb{C}^{n\times k}$, $Q\in\mathbb{C}^{n\times n}$, $R\in\mathbb{C}^{k\times k}$; and suppose that $Q\succeq O$, $R\succ O$,
$$(41.13)\qquad \operatorname{rank}\begin{bmatrix}A - \lambda I_n\\ Q\end{bmatrix} = n \quad\text{for every point } \lambda\in i\mathbb{R},$$
and (41.11) holds. Then there exists exactly one solution $X\in\mathbb{C}^{n\times n}$ of the Riccati equation
$$A^HX + XA - XBR^{-1}B^HX + Q = O$$
such that $\sigma(A - BR^{-1}B^HX)\subset\mathbb{C}_L$. Moreover, this solution $X$ is positive semidefinite, and if $A$, $B$, and $Q$ are real matrices, then $X\in\mathbb{R}^{n\times n}$.

Proof. Since $Q\succeq O$, there exists a matrix $C\in\mathbb{C}^{r\times n}$ such that $C^HC = Q$ and $\operatorname{rank}C = \operatorname{rank}Q = r$. Thus, upon setting $B_1 = BR^{-1/2}$, we see that the matrix
$$\begin{bmatrix}A & -BR^{-1}B^H\\ -Q & -A^H\end{bmatrix} = \begin{bmatrix}A & -B_1B_1^H\\ -C^HC & -A^H\end{bmatrix}$$
is of the form considered in Theorem 41.5. Moreover, since
$$\begin{bmatrix}A - \lambda I_n\\ C\end{bmatrix}u = 0 \iff \begin{bmatrix}A - \lambda I_n\\ Q\end{bmatrix}u = 0,$$
condition (41.13) implies that
$$\operatorname{rank}\begin{bmatrix}A - \lambda I_n\\ C\end{bmatrix} = n \quad\text{for every point } \lambda\in i\mathbb{R}.$$
Furthermore, as
$$\operatorname{rank}\,[A - \lambda I_n \quad B] = \operatorname{rank}\,[A - \lambda I_n \quad B(R^{1/2})^{-1}],$$
assumption (41.11) guarantees that
$$\operatorname{rank}\,[A - \lambda I_n \quad B_1] = n \quad\text{for every point } \lambda\in\mathbb{C}_R\,.$$
The asserted conclusion now follows from Theorems 41.5 and 13.7. $\Box$

41.2. Two lemmas

The two lemmas in this section are prepared for use in the next section.

Lemma 41.7. Let $A, Q\in\mathbb{C}^{n\times n}$, $B, L\in\mathbb{C}^{n\times k}$, $R\in\mathbb{C}^{k\times k}$,
$$E = \begin{bmatrix}Q & L\\ L^H & R\end{bmatrix}$$
and suppose that $E\succeq O$, $R\succ O$ and that
$$(41.14)\qquad \operatorname{rank}E = \operatorname{rank}Q + \operatorname{rank}R.$$
Then the formulas
$$(41.15)\qquad \operatorname{rank}\begin{bmatrix}\widehat{A} - \lambda I_n\\ \widehat{Q}\end{bmatrix} = \operatorname{rank}\begin{bmatrix}A - \lambda I_n\\ Q\end{bmatrix}$$
and
$$(41.16)\qquad \operatorname{rank}\,[\widehat{A} - \lambda I_n \quad B] = \operatorname{rank}\,[A - \lambda I_n \quad B]$$
are valid for the matrices
$$(41.17)\qquad \widehat{A} = A - BR^{-1}L^H \quad\text{and}\quad \widehat{Q} = Q - LR^{-1}L^H$$
and every point $\lambda\in\mathbb{C}$.

Proof. The Schur complement formula
$$\begin{bmatrix}Q & L\\ L^H & R\end{bmatrix} = \begin{bmatrix}I_n & LR^{-1}\\ O & I_k\end{bmatrix}\begin{bmatrix}\widehat{Q} & O\\ O & R\end{bmatrix}\begin{bmatrix}I_n & O\\ R^{-1}L^H & I_k\end{bmatrix}$$
implies that
$$\operatorname{rank}E = \operatorname{rank}\widehat{Q} + \operatorname{rank}R \quad\text{and}\quad \widehat{Q}\succeq O.$$
Thus, in view of assumption (41.14), $\operatorname{rank}Q = \operatorname{rank}\widehat{Q}$ and, since $Q = \widehat{Q} + LR^{-1}L^H$ is the sum of two positive semidefinite matrices,
$$\mathcal{N}_Q = \mathcal{N}_{\widehat{Q}}\cap\mathcal{N}_{L^H}\subseteq\mathcal{N}_{\widehat{Q}}\,.$$
However, since
$$\operatorname{rank}Q = \operatorname{rank}\widehat{Q} \Longrightarrow \dim\mathcal{N}_Q = \dim\mathcal{N}_{\widehat{Q}}\,,$$
the last inclusion is in fact an equality:
$$\mathcal{N}_Q = \mathcal{N}_{\widehat{Q}} \quad\text{and}\quad \mathcal{N}_Q\subseteq\mathcal{N}_{L^H}$$
and hence,
$$\begin{bmatrix}\widehat{A} - \lambda I_n\\ \widehat{Q}\end{bmatrix}u = 0 \iff \begin{bmatrix}A - \lambda I_n\\ Q\end{bmatrix}u = 0.$$
The conclusion (41.15) now follows easily from the principle of conservation of dimension. The second conclusion (41.16) is immediate from the identity
$$[\widehat{A} - \lambda I_n \quad B] = [A - \lambda I_n \quad B]\begin{bmatrix}I_n & O\\ -R^{-1}L^H & I_k\end{bmatrix},$$
since the last matrix on the right is invertible. $\Box$

Lemma 41.8. Assume that the matrices $A$, $\widehat{A}$, $Q$, $\widehat{Q}$, $B$, $L$, $R$, and $E$ are as in Lemma 41.7 and that (41.14), (41.13), and (41.11) are in force. Then there exists exactly one solution $X\in\mathbb{C}^{n\times n}$ of the Riccati equation
$$(41.18)\qquad \widehat{A}^HX + X\widehat{A} - XBR^{-1}B^HX + \widehat{Q} = O$$
such that $\sigma(\widehat{A} - BR^{-1}B^HX)\subset\mathbb{C}_L$. Moreover, this solution $X$ is positive semidefinite, and if the matrices $A$, $B$, $Q$, $L$, and $R$ are real, then $X\in\mathbb{R}^{n\times n}$.

Proof. Under the given assumptions, Lemma 41.7 guarantees that
$$\operatorname{rank}\begin{bmatrix}\widehat{A} - \lambda I_n\\ \widehat{Q}\end{bmatrix} = n \quad\text{for every point } \lambda\in i\mathbb{R}$$
and
$$\operatorname{rank}\,[\widehat{A} - \lambda I_n \quad B] = n \quad\text{for every point } \lambda\in\mathbb{C}_R\,.$$
Therefore, Theorem 41.6 is applicable with $\widehat{A}$ in place of $A$ and $\widehat{Q}$ in place of $Q$. $\Box$

41.3. The LQR problem

Let $A\in\mathbb{R}^{n\times n}$ and $B\in\mathbb{R}^{n\times k}$ and let
$$x(t) = e^{tA}x(0) + \int_0^t e^{(t-s)A}Bu(s)\,ds\,, \qquad 0\le t < \infty\,,$$
be the solution of the first-order vector system of equations $x'(t) = Ax(t) + Bu(t)$, $t\ge 0$, in which the vector $x(0)\in\mathbb{R}^n$ and the vector-valued function $u(t)\in\mathbb{R}^k$, $t\ge 0$, are specified. The LQR (linear quadratic regulator) problem in control engineering is to choose $u$ to minimize the value of the integral
$$(41.19)\qquad Z(t) = \int_0^t \begin{bmatrix}x(s)^T & u(s)^T\end{bmatrix}\begin{bmatrix}Q & L\\ L^T & R\end{bmatrix}\begin{bmatrix}x(s)\\ u(s)\end{bmatrix}ds$$
when $Q = Q^T\in\mathbb{R}^{n\times n}$, $L\in\mathbb{R}^{n\times k}$, $R = R^T\in\mathbb{R}^{k\times k}$,
$$\begin{bmatrix}Q & L\\ L^T & R\end{bmatrix}\succeq O,$$
and $R$ is assumed to be invertible. The first step in the analysis of this problem is to express it in simpler form by invoking the Schur complement formula:
$$\begin{bmatrix}Q & L\\ L^T & R\end{bmatrix} = \begin{bmatrix}I_n & LR^{-1}\\ O & I_k\end{bmatrix}\begin{bmatrix}Q - LR^{-1}L^T & O\\ O & R\end{bmatrix}\begin{bmatrix}I_n & O\\ R^{-1}L^T & I_k\end{bmatrix}.$$
Then, upon setting
$$\widehat{A} = A - BR^{-1}L^T\,, \qquad \widehat{Q} = Q - LR^{-1}L^T\,, \qquad\text{and}\qquad v(s) = R^{-1}L^Tx(s) + u(s)\,,$$
the integral (41.19) can be reexpressed more conveniently as
$$(41.20)\qquad Z(t) = \int_0^t \begin{bmatrix}x(s)^T & v(s)^T\end{bmatrix}\begin{bmatrix}\widehat{Q} & O\\ O & R\end{bmatrix}\begin{bmatrix}x(s)\\ v(s)\end{bmatrix}ds\,,$$
where the vectors $x(s)$ and $v(s)$ are linked by the equation
$$(41.21)\qquad x'(s) = \widehat{A}x(s) + Bv(s)\,,$$
i.e.,
$$(41.22)\qquad x(t) = e^{t\widehat{A}}x(0) + \int_0^t e^{(t-s)\widehat{A}}Bv(s)\,ds\,.$$

Theorem 41.9. Let $X$ be the unique solution of the Riccati equation (41.18) based on matrices $A$, $B$, $Q$, $L$, and $R$ with real entries such that $\sigma(\widehat{A} - BR^{-1}B^TX)\subset\mathbb{C}_L$. Then:
(1) $X\in\mathbb{R}^{n\times n}$ and $Z(t)$ can be expressed in terms of the solution $x(t)$ of (41.21) and the function $\varphi(s) = x(s)^TXx(s)$ as
$$(41.23)\qquad Z(t) = \varphi(0) - \varphi(t) + \int_0^t \|R^{-1/2}(B^TXx(s) + Rv(s))\|_2^2\,ds\,.$$
(2) $Z(t)\ge\varphi(0) - \varphi(t)$ with equality if $v(s) = -R^{-1}B^TXx(s)$ for $0\le s\le t$.
(3) If $v(s) = -R^{-1}B^TXx(s)$ for $0\le s < \infty$, then $Z(\infty) = \varphi(0)$.

Proof. The proof is broken into parts.

1. Verification of (1). In view of (41.21),
$$\varphi'(s) = x'(s)^TXx(s) + x(s)^TXx'(s) = (\widehat{A}x(s) + Bv(s))^TXx(s) + x(s)^TX(\widehat{A}x(s) + Bv(s)) = x(s)^T(\widehat{A}^TX + X\widehat{A})x(s) + v(s)^TB^TXx(s) + x(s)^TXBv(s) = x(s)^T(XBR^{-1}B^TX - \widehat{Q})x(s) + v(s)^TB^TXx(s) + x(s)^TXBv(s) = (x(s)^TXB + v(s)^TR)R^{-1}(B^TXx(s) + Rv(s)) - x(s)^T\widehat{Q}x(s) - v(s)^TRv(s)\,.$$

Therefore,
$$Z(t) = \int_0^t \{x(s)^T\widehat{Q}x(s) + v(s)^TRv(s)\}\,ds = -\int_0^t \frac{d}{ds}\{x(s)^TXx(s)\}\,ds + \int_0^t (B^TXx(s) + Rv(s))^TR^{-1}(B^TXx(s) + Rv(s))\,ds\,,$$
which is equivalent to (41.23).

2. Verification of (2) and (3). Assertion (2) is immediate from formula (41.23). Moreover, if $v(s)$ is chosen as specified in assertion (3), then $x(t)$ is a solution of the vector differential equation
$$x'(t) = (\widehat{A} - BR^{-1}B^TX)x(t)\,,$$
and hence, as $\sigma(\widehat{A} - BR^{-1}B^TX)\subset\mathbb{C}_L$, $x(t)\to 0$ as $t\uparrow\infty$. $\Box$
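In terms of the original input, the minimizing choice $v(s) = -R^{-1}B^TXx(s)$ corresponds to the state feedback $u(s) = -R^{-1}(B^TX + L^T)x(s)$, and assertion (3) predicts the optimal cost $Z(\infty) = x(0)^TXx(0)$. The sketch below assembles this feedback with scipy (the particular $A$, $B$, $Q$, $L$, $R$ and the integration horizon are arbitrary illustrative choices) and integrates the closed loop.

```python
import numpy as np
from scipy.linalg import solve_continuous_are
from scipy.integrate import solve_ivp

rng = np.random.default_rng(6)
n, k = 3, 1
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, k))
Q = np.eye(n)
L = np.zeros((n, k))                              # simple choice: no cross term
R = np.eye(k)

Ahat = A - B @ np.linalg.inv(R) @ L.T
Qhat = Q - L @ np.linalg.inv(R) @ L.T
X = solve_continuous_are(Ahat, B, Qhat, R)        # stabilizing solution of (41.18)
K = np.linalg.inv(R) @ (B.T @ X + L.T)            # optimal feedback: u = -K x

x0 = rng.standard_normal(n)
sol = solve_ivp(lambda t, x: (A - B @ K) @ x, (0.0, 40.0), x0)
print("||x(T)|| =", np.linalg.norm(sol.y[:, -1]))          # decays toward 0
print("predicted optimal cost Z(inf) = x0^T X x0 =", x0 @ X @ x0)
```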


41.4. Supplementary notes This chapter is adapted from Chapter 18 in [30]. The discussion of Riccati equations and the LQR problem therein was partially adapted from the monograph Zhou, Doyle, and Glover [77], which is an excellent source of supplementary information on both of these topics. The monograph [54] by Lancaster and Rodman is recommended for more advanced studies.

Chapter 42

Supplementary topics

This chapter is devoted to four distinct topics: Gaussian quadrature, Bezoutians and resultants, general QR factorization, and the QR algorithm. The first, third, and fourth are presented in reasonable detail. The treatment of the second is limited to a brief introduction to the properties and significance of Bezoutians and resultants and is less complete. Companion matrices play a significant role in Gaussian quadrature and the theory of Bezoutians.

42.1. Gaussian quadrature

Let $w(x)$ denote a positive continuous function on a finite interval $a\le x\le b$ and let $\mathcal{U}$ denote the space of continuous complex-valued functions on this interval, equipped with the inner product
$$\langle f, g\rangle_{\mathcal{U}} = \int_a^b \overline{g(x)}\,w(x)f(x)\,dx\,.$$
Let $f_j(x) = x^{j-1}$ for $j = 1, 2, \ldots$ and let $\mathcal{P}_n = \operatorname{span}\{f_1,\ldots,f_n\}$ denote the $n$-dimensional subspace of polynomials of degree less than or equal to $n-1$ (with complex coefficients), and let
$$P_n = \Pi_n M|_{\mathcal{P}_n} \quad\text{for } n = 0, 1, \ldots,$$
where $\Pi_n$ denotes the orthogonal projection from $\mathcal{U}$ onto $\mathcal{P}_n$ and $M$ denotes the linear transformation on $\mathcal{U}$ of multiplication by the independent variable, i.e., $(Mf)(x) = xf(x)$. Then
$$P_nf_j = \begin{cases} f_{j+1} & \text{if } j = 1,\ldots,n-1\,,\\ c_1f_1 + \cdots + c_nf_n & \text{if } j = n\,,\end{cases}$$
where the coefficients $c_1,\ldots,c_n$ are chosen so that
$$(42.1)\qquad \left\langle f_{n+1} - \sum_{j=1}^n c_jf_j,\ f_i\right\rangle_{\mathcal{U}} = 0 \quad\text{for } i = 1,\ldots,n\,.$$
Since the $n\times n$ matrix $G$ with entries $g_{ij} = \langle f_j, f_i\rangle_{\mathcal{U}}$ for $i,j = 1,\ldots,n$ is invertible, the condition in (42.1) can be expressed as
$$(42.2)\qquad c = \begin{bmatrix}c_1\\ \vdots\\ c_n\end{bmatrix} = G^{-1}\begin{bmatrix}g_{1,n+1}\\ \vdots\\ g_{n,n+1}\end{bmatrix} \quad\text{with } g_{i,n+1} = \langle f_{n+1}, f_i\rangle_{\mathcal{U}}\,.$$
Now let $\mathcal{G} = \mathbb{C}^n$ equipped with the inner product $\langle x, y\rangle_{\mathcal{G}} = \langle Gx, y\rangle_{st}$. Then the linear operator $T$ that maps $f_j\in\mathcal{P}_n$ to $e_j\in\mathcal{G}$ for $j = 1,\ldots,n$ is unitary: $T$ is invertible and $\langle Tf_j, Tf_i\rangle_{\mathcal{G}} = \langle e_j, e_i\rangle_{\mathcal{G}} = g_{ij} = \langle f_j, f_i\rangle_{\mathcal{U}}$, and $P_n$ is unitarily equivalent to multiplication by the matrix
$$(42.3)\qquad A = \begin{bmatrix}e_2 & \cdots & e_n & c\end{bmatrix} = \begin{bmatrix}0 & 0 & \cdots & 0 & c_1\\ 1 & 0 & \cdots & 0 & c_2\\ \vdots & & \ddots & & \vdots\\ 0 & 0 & \cdots & 1 & c_n\end{bmatrix}$$
in $\mathcal{G}$, i.e., $TP_nf_j = Ae_j = ATf_j$ for $j = 1,\ldots,n$.

Lemma 42.1. If $A$ is the matrix that is defined by the formulas in (42.2) and (42.3), then:
(1) Multiplication by $A$ is a selfadjoint linear transformation in $\mathcal{G}$, i.e.,
$$(42.4)\qquad \langle Au, v\rangle_{\mathcal{G}} = \langle u, Av\rangle_{\mathcal{G}} \quad\text{for every choice of } u, v\in\mathbb{C}^n.$$
(2) $A$ has $n$ distinct real eigenvalues $\lambda_1,\ldots,\lambda_n$ and a corresponding set $\{v_1,\ldots,v_n\}$ of eigenvectors that are orthonormal in $\mathcal{G}$.
(3) The coefficients in the expansion of $e_j$ in terms of the orthonormal eigenvectors $\{v_1,\ldots,v_n\}$ of $A$ are
$$(42.5)\qquad \langle e_j, v_s\rangle_{\mathcal{G}} = \lambda_s^{j-1}\langle e_1, v_s\rangle_{\mathcal{G}} \quad\text{for } j = 1,\ldots,n$$
and
$$(42.6)\qquad \langle e_j, e_i\rangle_{\mathcal{G}} = \sum_{s=1}^n \lambda_s^{i+j-2}\,|\langle e_1, v_s\rangle_{\mathcal{G}}|^2 \quad\text{for } i, j = 1,\ldots,n\,.$$

Proof. Let $h_j = \langle f_j, f_2\rangle_{\mathcal{U}}$. Then $g_{ij} = h_{i+j-2}$ for $i, j = 1, 2, \ldots$, and
$$GA = G\begin{bmatrix}e_2 & \cdots & e_n & c\end{bmatrix} = \begin{bmatrix}g_{12} & \cdots & g_{1n} & g_{1,n+1}\\ \vdots & & \vdots & \vdots\\ g_{n2} & \cdots & g_{nn} & g_{n,n+1}\end{bmatrix} = \begin{bmatrix}h_1 & h_2 & \cdots & h_n\\ h_2 & h_3 & \cdots & h_{n+1}\\ \vdots & & & \vdots\\ h_n & h_{n+1} & \cdots & h_{2n-1}\end{bmatrix} = (GA)^H = A^HG\,,$$
since $h_j\in\mathbb{R}$. Thus,
$$\langle Ax, y\rangle_{\mathcal{G}} = \langle GAx, y\rangle_{st} = \langle A^HGx, y\rangle_{st} = \langle x, Ay\rangle_{\mathcal{G}}$$
for every pair of vectors $x, y\in\mathbb{C}^n$, i.e., (1) holds.

Suppose next that $A$ has $k$ distinct eigenvalues $\lambda_1,\ldots,\lambda_k$. Then it follows readily from (42.4) that $\lambda_i\in\mathbb{R}$ for $i = 1,\ldots,k$ and that if $Av_i = \lambda_iv_i$, then $\langle v_i, v_j\rangle_{\mathcal{G}} = 0$ if $i\ne j$. Moreover, if $(A - \lambda_jI_n)^2u = 0$ for some vector $u\in\mathbb{C}^n$, then
$$0 = \langle(A - \lambda_jI_n)^2u, u\rangle_{\mathcal{G}} = \langle(A - \lambda_jI_n)u, (A - \lambda_jI_n)u\rangle_{\mathcal{G}}\,.$$
Therefore, $\mathcal{N}_{(A-\lambda_jI_n)^2} = \mathcal{N}_{(A-\lambda_jI_n)}$, and hence the algebraic multiplicity $\alpha_j$ of each eigenvalue is equal to its geometric multiplicity $\gamma_j$, which is equal to one, since $\sigma(A) = \sigma(A^T)$ and $A^T$ is a companion matrix. Thus, $k = \alpha_1 + \cdots + \alpha_k = n$. Consequently, (2) holds.

Finally, (42.5) follows from the observation that
$$\langle e_j, v_s\rangle_{\mathcal{G}} = \langle Ae_{j-1}, v_s\rangle_{\mathcal{G}} = \langle e_{j-1}, Av_s\rangle_{\mathcal{G}} = \lambda_s\langle e_{j-1}, v_s\rangle_{\mathcal{G}}$$
for $j = 2,\ldots,n$; and (42.6) is obtained by substituting the expansions
$$e_j = \sum_{s=1}^n \langle e_j, v_s\rangle_{\mathcal{G}}\,v_s \quad\text{and}\quad e_i = \sum_{t=1}^n \langle e_i, v_t\rangle_{\mathcal{G}}\,v_t$$
into $\langle e_j, e_i\rangle_{\mathcal{G}}$ and then invoking (42.5). $\Box$

Theorem 42.2. Let $W_j = |\langle e_1, v_j\rangle_{\mathcal{G}}|^2$ for $j = 1,\ldots,n$. Then the formula
$$(42.7)\qquad \int_a^b f(x)w(x)\,dx = \sum_{i=1}^n W_if(\lambda_i)$$
is valid for every polynomial $f(x)$ of degree less than or equal to $2n-1$ with complex coefficients.

Proof. It suffices to verify this formula for $f(x) = f_k(x)$, $k = 1,\ldots,2n$. Consider first the case $k = i + j$ with $i, j = 1,\ldots,n-1$. Then, by (42.6),
$$\int_a^b f_{i+j}(x)w(x)\,dx = \langle f_j, f_{i+1}\rangle_{\mathcal{U}} = \langle e_j, e_{i+1}\rangle_{\mathcal{G}} = \sum_{s=1}^n \lambda_s^{i+j-1}W_s = \sum_{s=1}^n f_{i+j}(\lambda_s)\,W_s\,.$$
To complete the proof, it remains only to check that
$$(42.8)\qquad \int_a^b x^{2n-1}w(x)\,dx = \sum_{s=1}^n \lambda_s^{2n-1}W_s\,.$$
But, as $\langle e_j, e_i\rangle_{\mathcal{G}} = \sum_{s=1}^n \lambda_s^{i+j-2}W_s$ for $i, j = 1,\ldots,n$,
$$\int_a^b x^{2n-1}w(x)\,dx = \langle f_{n+1}, f_n\rangle_{\mathcal{U}} = \langle Mf_n, f_n\rangle_{\mathcal{U}} = \langle P_nf_n, f_n\rangle_{\mathcal{U}} = \langle Ae_n, e_n\rangle_{\mathcal{G}} = \sum_{j=1}^n c_j\langle e_j, e_n\rangle_{\mathcal{G}} = \sum_{j=1}^n c_j\sum_{s=1}^n \lambda_s^{n+j-2}W_s = \sum_{s=1}^n W_s\lambda_s^{n-1}\left\{\sum_{j=1}^n c_j\lambda_s^{j-1}\right\} = \sum_{s=1}^n \lambda_s^{2n-1}W_s\,,$$
since
$$(42.9)\qquad \sum_{j=1}^n c_j\lambda_s^{j-1} = \lambda_s^n \quad\text{for } s = 1,\ldots,n\,,$$
because $\sigma(A) = \sigma(A^T)$ and $A^T$ is a companion matrix. $\Box$

Exercise 42.1. Verify formula (42.9). [HINT: (λ − λ1 ) · · · (λ − λn ) = det (λIn − A) = det(λIn − AT ) and AT is a companion matrix.] Finite sums like (42.7) that serve to approximate definite integrals, with equality for a reasonable class of functions, are termed quadrature formulas.
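The construction of the nodes $\lambda_i$ and weights $W_i$ in Theorem 42.2 translates directly into a short computation from the moments of $w$. The sketch below carries it out for the weight $w(x) = 1$ on $[0,1]$ (any positive continuous weight with computable moments would do) and checks exactness on $x^k$ for $k \le 2n-1$.

```python
import numpy as np

n = 4
a, b = 0.0, 1.0
# moments m_k = integral of x^k over [a, b] for w(x) = 1
moments = np.array([(b**(k + 1) - a**(k + 1)) / (k + 1) for k in range(2 * n)])

G = np.array([[moments[i + j] for j in range(n)] for i in range(n)])   # g_ij = <f_j, f_i>_U
rhs = np.array([moments[i + n] for i in range(n)])                     # g_{i,n+1}
c = np.linalg.solve(G, rhs)                                            # (42.2)

A = np.zeros((n, n))
A[1:, :n - 1] = np.eye(n - 1)                                          # columns e_2, ..., e_n
A[:, -1] = c                                                           # last column c, as in (42.3)

lam, V = np.linalg.eig(A)                                              # nodes = eigenvalues of A
W = np.empty(n)
for s in range(n):
    v = V[:, s] / np.sqrt(np.real(V[:, s].conj() @ G @ V[:, s]))       # orthonormal in <.,.>_G
    W[s] = abs(v.conj() @ G @ np.eye(n)[:, 0]) ** 2                    # W_s = |<e_1, v_s>_G|^2

for k in range(2 * n):                                                  # exactness up to degree 2n-1
    assert abs(np.sum(W * np.real(lam) ** k) - moments[k]) < 1e-8
print("nodes:", np.sort(np.real(lam)))
print("weights:", W[np.argsort(np.real(lam))])
```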

42.2. Bezoutians

Let $f(\lambda) = f_0 + f_1\lambda + \cdots + f_n\lambda^n$, $f_n\ne 0$, be a polynomial of degree $n$ with coefficients $f_0,\ldots,f_n\in\mathbb{C}$ and let $g(\lambda) = g_0 + g_1\lambda + \cdots + g_n\lambda^n$ be a polynomial of degree less than or equal to $n$ with coefficients $g_0,\ldots,g_n\in\mathbb{C}$, at least one of which is nonzero. Then the matrix $B\in\mathbb{C}^{n\times n}$ with entries $b_{ij}$, $i,j = 0,\ldots,n-1$, that is uniquely defined by the formulas
$$(42.10)\qquad \frac{f(\lambda)g(\mu) - g(\lambda)f(\mu)}{\lambda - \mu} = \sum_{i,j=0}^{n-1}\lambda^i b_{ij}\mu^j = v(\lambda)^TBv(\mu) \quad\text{and}\quad v(\lambda)^T = \begin{bmatrix}1 & \lambda & \cdots & \lambda^{n-1}\end{bmatrix}$$
is called the Bezoutian of the polynomials $f(\lambda)$ and $g(\lambda)$ and will be denoted by the symbol $B(f,g)$ (or just plain $B$ if the polynomials are clear from the context).

It is clear from formula (42.10) that if $f(\alpha) = 0$ and $g(\alpha) = 0$, then $v(\lambda)^TBv(\alpha) = 0$ for every point $\lambda\in\mathbb{C}$, and hence that $Bv(\alpha) = 0$. Moreover, if $f(\alpha) = f'(\alpha) = 0$ and $g(\alpha) = g'(\alpha) = 0$, then the identity
$$\frac{f(\lambda)g'(\mu) - g(\lambda)f'(\mu)}{\lambda - \mu} + \frac{f(\lambda)g(\mu) - g(\lambda)f(\mu)}{(\lambda - \mu)^2} = v(\lambda)^TBv'(\mu)\,,$$
which is obtained by differentiating both sides of formula (42.10) with respect to $\mu$, implies that
$$(42.11)\qquad v(\lambda)^TBv'(\alpha) = 0$$
for every point $\lambda\in\mathbb{C}$. Therefore, $\dim\mathcal{N}_B\ge 2$, since the vectors $v(\alpha)$ and $v'(\alpha)$ both belong to $\mathcal{N}_B$ and are linearly independent. Much the same sort of reasoning leads rapidly to the conclusion that if
$$f(\alpha) = f^{(1)}(\alpha) = \cdots = f^{(k-1)}(\alpha) = 0 = g(\alpha) = g^{(1)}(\alpha) = \cdots = g^{(k-1)}(\alpha)\,,$$
then the vectors $v(\alpha),\ldots,v^{(k-1)}(\alpha)$ all belong to $\mathcal{N}_B$. Thus, as these vectors are linearly independent if $k\le n$, $\dim\mathcal{N}_B\ge k$. Moreover, if $\alpha\ne\beta$ and
$$f(\beta) = f^{(1)}(\beta) = \cdots = f^{(j-1)}(\beta) = 0 = g(\beta) = g^{(1)}(\beta) = \cdots = g^{(j-1)}(\beta)\,,$$
then the vectors $v(\beta),\ldots,v^{(j-1)}(\beta)$ all belong to $\mathcal{N}_B$. Therefore, since this set of vectors is linearly independent of the set $v(\alpha),\ldots,v^{(k-1)}(\alpha)$ when $\alpha\ne\beta$ (as was shown in Theorem 8.3), $\dim\mathcal{N}_B\ge k + j$. Proceeding this way, it is readily seen that $\dim\mathcal{N}_{B(f,g)}\ge\nu(f,g)$, where
$$\nu(f,g) = \text{the number of common roots of the polynomials } f(\lambda) \text{ and } g(\lambda)\text{, counting multiplicities}\,.$$
It is also easy to show that $\dim\mathcal{N}_{B(f,g)} = \nu(f,g)$ when $f$ has distinct roots, as is spelled out in Exercises 42.6 and 42.5 below. In fact equality holds


even if f does not have distinct roots, but the proof is not so simple; it rests on a highly nontrivial identity and the following lemma: (p)

Lemma 42.3. If g(λ) is a polynomial and N = C0 , ⎡ g (1) (λ) ⎢ g(λ) 1! ⎢ ⎢ ⎢ p−1 (j)  0 g(λ) g (λ) j ⎢ (p) (42.12) g(Cλ ) = N =⎢ ⎢ j! ⎢ . j=0 ⎢ .. ⎢ ⎣ 0 0 and

(

(p) rank g(Cλ )

=

p if p − k if

then ··· ··· ..

.

···

⎤ g (p−1) (λ) (p − 1)! ⎥ ⎥ ⎥ g (p−2) (λ) ⎥ ⎥ (p − 2)! ⎥ ⎥ ⎥ .. ⎥ . ⎥ ⎦ g(λ)

g(λ) = 0 , g(λ) = · · · = g (k−1) (λ) = 0 but g (k) (λ) = 0 ,

where, in the last line, k is an integer such that 1 ≤ k ≤ p. Proof. Let Γ denote a circle of radius R > |λ| that is centered at the origin and is directed counterclockwise. Then, by Cauchy’s formula, 6 1 g(ζ)(ζIp − λIp − N )−1 dζ g(λIp + N ) = 2πi Γ 6 p−1  1 g(ζ) N j dζ = 2πi Γ (ζ − λ)j+1 j=0

=

p−1  j=0

g (j) (λ) j N . j! (p)

The formula for the rank of g(Cλ ) is clear from formula (42.12), which (p) exhibits g(Cλ ) as an upper triangular matrix that is constant on diagonals, i.e., as an upper triangular Toeplitz matrix.  Example 42.1. If p = 3, then

⎤ g(λ) g (1) (λ) g (2) (λ)/2! (3) g(Cλ ) = g(λI3 + N ) = ⎣ 0 g(λ) g (1) (λ) ⎦ . 0 0 g(λ)

But this clearly exhibits the fact ⎧ ⎨ 3 if 2 if rank g(λI3 + N ) = ⎩ 1 if



that g(λ) = 0 , g(λ) = 0 and g (1) (λ) = 0 , g(λ) = g (1) (λ) = 0 and g (2) (λ) = 0 .

42.2. Bezoutians

455

Exercise 42.2. Confirm formula (42.12) for the polynomial n n   k g(λ) = gk λ by writing g(λIp + N ) = gk (λIp + N )k k=0

k=0

and invoking the binomial formula. [REMARK: This is a good exercise in manipulating formulas, but it’s a lot more work than the proof of Lemma 42.3 that was presented above.] Theorem 42.4. If f (λ) is a polynomial of degree n (i.e., fn = 0) and g(λ) is a polynomial of degree ≤ n with at least one nonzero coefficient, then dim NB(f,g) = ν(f, g) .

(42.13) Discussion. (42.14)

The proof rests on the Barnett identity B(f, g) = Hf g(Kf ) ,

in which Kf is the companion matrix based on the polynomial f (see (8.19) and (8.20)) and Hf is an invertible Hankel matrix. To understand why (42.14) yields (42.13), consider the case f (λ) = (λ − λ1 )3 (λ − λ2 )2 with λ1 = λ2 . Then     (3) (3) Cλ1 g(C O ) O −1 λ1 V −1 . and g(Kf ) = V Kf = V (2) V (2) O Cλ2 O g(Cλ2 ) Therefore, dim NB(f,g) = dim Ng(Kf ) = dim Ng(C (3) ) + dim Ng(C (2) ) . λ1

− λ2 )(λ − λ3 Thus, if g(λ) = (λ − λ1 dim Ng(C (3) ) = 2 and dim Ng(C (2) ) = 1. )2 (λ

λ1

λ2

)2

λ2

has three distinct roots, then 

Exercise 42.3. Show that if A ∈ C n×n , B ∈ Cn×n , and AB = BA, then n  g (k) (A) (B − A)k g(B) = k! k=0

for every polynomial g(λ) of degree ≤ n. Exercise 42.4. Use Theorem 42.4 and formula (42.14) to calculate the number of common roots of the polynomials f (x) = 2 − 3x + x3 and g(x) = −2 + x + x2 . Exercise 42.5. Show that if f (λ) is a polynomial of degree n with n distinct roots λ1 , . . . , λn , g(λ) is a polynomial of degree ≤ n, and V is the Vandermonde matrix with columns v(λ1 ), . . . , v(λn ), then (42.15)

V T B(f, g)V = diag{f  (λ1 )g(λ1 ), . . . , f  (λn )g(λn )} .

[HINT: Compute v(λj )T B(f, g)v(λk ), for j = k and j = k via (42.10).]


Exercise 42.6. Show that (42.13) holds in the setting of Exercise 42.5. [HINT: Since V is invertible when f has n distinct roots, formula (42.15) implies that dim NB(f,g) is equal to the number of points αj at which g(αj ) = 0.]

42.3. Resultants

There is another formula for computing the number of common roots of a pair of polynomials $f(\lambda)$ and $g(\lambda)$ that is easier to write down, since it is expressed in terms of the $2n\times 2n$ matrix
$$R(f,g) = \begin{bmatrix} f_0 & f_1 & \cdots & f_{n-1} & f_n & 0 & \cdots & 0\\ 0 & f_0 & \cdots & f_{n-2} & f_{n-1} & f_n & \cdots & 0\\ \vdots & & & & & & & \vdots\\ 0 & 0 & \cdots & f_0 & f_1 & f_2 & \cdots & f_n\\ g_0 & g_1 & \cdots & g_{n-1} & g_n & 0 & \cdots & 0\\ 0 & g_0 & \cdots & g_{n-2} & g_{n-1} & g_n & \cdots & 0\\ \vdots & & & & & & & \vdots\\ 0 & 0 & \cdots & g_0 & g_1 & g_2 & \cdots & g_n \end{bmatrix}$$
based on the coefficients of the polynomials $f(\lambda) = f_0 + f_1\lambda + \cdots + f_n\lambda^n$ and $g(\lambda) = g_0 + g_1\lambda + \cdots + g_n\lambda^n$. The matrix $R(f,g)$ is called the resultant of $f(\lambda)$ and $g(\lambda)$.

Theorem 42.5. If $f(\lambda)$ is a polynomial of degree $n$ (i.e., $f_n\ne 0$) and $g(\lambda)$ is a polynomial of degree $\le n$ with at least one nonzero coefficient, then
$$(42.16)\qquad \dim\mathcal{N}_{R(f,g)} = \nu(f,g)\,.$$

Proof. See, e.g., Theorem 21.9 in [30]. $\Box$

The next exercise is a good way to check that the theorem is correct in at least one case.

Exercise 42.7. Show that if $f$ is a polynomial of degree 3 with 3 distinct roots $\lambda_1, \lambda_2, \lambda_3$, $g$ is a polynomial of degree 2 with 2 distinct roots $\mu_1, \mu_2$, and these five points are distinct, then $R(f,g)$ is invertible. [HINT: It's enough to show that if $\alpha\in\mathbb{C}\setminus\{\lambda_1,\lambda_2,\lambda_3,\mu_1,\mu_2\}$ and $\nu(f,g) = 0$, then the matrix
$$R(f,g)\begin{bmatrix}v(\mu_1) & v(\mu_2) & v(\alpha) & v(\lambda_1) & v(\lambda_2) & v(\lambda_3)\end{bmatrix} \quad\text{with } v(\lambda) = \begin{bmatrix}1 & \lambda & \cdots & \lambda^5\end{bmatrix}^T$$
is invertible.]
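Theorem 42.5 is easy to try out numerically on the pair of polynomials from Exercise 42.4, $f(x) = 2 - 3x + x^3 = (x-1)^2(x+2)$ and $g(x) = -2 + x + x^2 = (x-1)(x+2)$, whose greatest common divisor has degree 2. The sketch below builds the $2n\times 2n$ resultant matrix in the layout displayed above (the coefficient ordering and the rank tolerance are implementation choices) and computes its nullity.

```python
import numpy as np

def resultant(fc, gc):
    """Build the 2n x 2n resultant matrix R(f, g); fc and gc list the
    coefficients f_0, ..., f_n and g_0, ..., g_n (g padded to length n+1)."""
    n = len(fc) - 1
    R = np.zeros((2 * n, 2 * n))
    for i in range(n):
        R[i, i:i + n + 1] = fc
        R[n + i, i:i + n + 1] = gc
    return R

f = [2.0, -3.0, 0.0, 1.0]           # f(x) = 2 - 3x + x^3
g = [-2.0, 1.0, 1.0, 0.0]           # g(x) = -2 + x + x^2, padded to degree 3
R = resultant(f, g)
nullity = 2 * 3 - np.linalg.matrix_rank(R)
print("dim N_R(f,g) =", nullity)    # expect 2 = number of common roots, with multiplicity
```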

42.4. General QR factorization

457

42.4. General QR factorization In this section we consider QR factorization for matrices A ∈ C p×q that are not assumed to be left invertible. Theorem 42.6. If A ∈ C p×q and rank A = r ≥ 1, then A = QR, where Q ∈ C p×q is isometric and R ∈ C q×q is upper triangular with r positive entries on the diagonal and q − r zero entries on the diagonal. Discussion. The idea underlying the proof is best conveyed by example. The general case is established in exactly the same way, but with more elaborate bookkeeping.   Let A = a1 a2 a3 a4 and suppose that a1 and a3 are the pivot columns of A. Then rank A = 2, RA = span {a1 , a3 }, and standard QR factorization ensures that

    a b  a1 a3 = q1 q3 with q1 , q2 orthonormal, a > 0, and c > 0 . 0 c Thus, 

  a1 0 a3 0 = q1 0 q3

⎡ a  ⎢0 0 ⎢ ⎣0 0 ⎡

0 0 0 0

b 0 c 0

⎤ 0 0⎥ ⎥ 0⎦ 0

⎤ 0  0⎥ ⎥ = QB , = q1 q2 q3 0⎦ 0   where q2 and q4 are orthonormal vectors such that Q = q1 q2 q3 q4 is isometric and the 4 × 4 matrix B is upper triangular with nonnegative entries on the diagonal, two of which are positive.

     d e a a a a Next, since 2 , 4 = 1 3 0 f ⎡ ⎤ 0 d 0 e  ⎢0 0 0 0 ⎥      ⎥ 0 a2 0 a4 = a1 0 a3 0 ⎢ ⎣0 0 0 f ⎦ = a1 0 a3 0 C , 0 0 0 0 a  ⎢0 q4 ⎢ ⎣0 0

0 0 0 0

b 0 c 0

where the 4 × 4 matrix C is strictly upper triangular. Consequently, A = QB + QBC = QB(I4 + C) = QR

with R = B(I4 + C) .

Thus, as B is upper triangular and I4 + C is upper triangular with ones on the diagonal, the diagonal entries of R coincide with the diagonal entries of B, two of which are positive and two are equal to zero. 


As an application we consider a subcase of the CS decomposition: Theorem 42.7. If A, B ∈ C k×k and AH A + B H B = Ik , then there exists a set of unitary matrices W1 , W2 ∈ C 2k×2k such that

 

Ik C Ik A W2 = , W1 O B O S where C, S ∈ R k×k are diagonal matrices such that c11 ≤ · · · ≤ ckk , s11 ≥ · · · ≥ skk , and C 2 + S 2 = Ik . Proof. By reordering the entries in the singular value decomposition of A, we obtain a pair of unitary matrices V1 , U1 ∈ C k×k such that A = V1 CU1H . Theorem 42.6 ensures that there exists a unitary matrix Q ∈ C k×k and an upper triangular matrix R ∈ C k×k with nonnegative entries on the diagonal such that BU1 = QR. Thus, upon setting W1 = diag{V1H , QH } and W2 = diag{V1 , U1 }, it is readily checked that     

H O Ik A V1 O Ik V1H AU1 Ik C V1 . = = O QH BU1 O QH O B O U1 O R It remains to show that the upper triangular matrix R is actually a diagonal matrix. This follows from the fact that C H C + RH R = Ik . If C = Ik , then R = O. If C = Ik , then 0 ≤ c11 < 1. Consequently, r11 > 0, and hence, as  T the columns of C R are orthonormal, r1j = 0 for j = 2, . . . , k. Similarly,  if rii > 0, then rij = 0 for j = i + 1, . . . , k.

42.5. The QR algorithm

Recall that if $A\in\mathbb{C}^{p\times q}$ and $\operatorname{rank}A = q$, then there exists exactly one isometric matrix $Q\in\mathbb{C}^{p\times q}$ and exactly one upper triangular matrix $R\in\mathbb{C}^{q\times q}$ with positive entries on the diagonal such that $A = QR$. The QR algorithm defines a sequence of invertible matrices $A_j\in\mathbb{C}^{n\times n}$ by successive applications of QR factorization starting with $A = A_1$, by setting $A_{m+1} = R_mQ_m$ when $A_m = Q_mR_m$ and then applying QR factorization to $A_{m+1}$ to obtain $A_{m+1} = Q_{m+1}R_{m+1}$ for $m = 1, 2, \ldots$. Thus,
$$A_1 = Q_1R_1\,, \quad A_2 = R_1Q_1 = Q_2R_2\,, \quad A_3 = R_2Q_2 = Q_3R_3\,, \ldots,$$
where $Q_m$ is unitary and $R_m$ is upper triangular with positive diagonal entries for $m = 1, 2, \ldots$. The notation
$$(42.17)\qquad \mathcal{Q}_m = Q_1Q_2\cdots Q_m \quad\text{and}\quad \mathcal{R}_m = R_mR_{m-1}\cdots R_1$$
will be convenient; it will be used in this section only.

Exercise 42.8. Show that the QR factorization of $A^m$ is equal to $\mathcal{Q}_m\mathcal{R}_m$, i.e.,
$$(42.18)\qquad A^m = \mathcal{Q}_m\mathcal{R}_m\,.$$
[HINT: $A^3 = Q_1R_1Q_1R_1Q_1R_1 = Q_1(R_1Q_1R_1Q_1)R_1 = Q_1Q_2R_2Q_2R_2R_1$.]

Exercise 42.9. Show that
$$(42.19)\qquad A_{m+1} = \mathcal{Q}_m^HA\mathcal{Q}_m\,.$$
[HINT: $A_{m+1} = R_mQ_m \Longrightarrow Q_mA_{m+1}Q_m^H = A_m$.]

Theorem 42.8. If $A\in\mathbb{C}^{n\times n}$ is invertible with Jordan decomposition $A = XDX^{-1}$, where
(1) $D = \operatorname{diag}\{\lambda_1,\ldots,\lambda_n\}$ with $|\lambda_1| > |\lambda_2| > \cdots > |\lambda_n|$ and
(2) the $k\times k$ upper left-hand corners of $X^{-1}$ are invertible for $k = 1,\ldots,n$,
then the absolute values of $(A_m)_{ij}$, the $ij$ entry in $A_m$ (the $m$'th matrix in the QR algorithm), tend to a limit when $i < j$ and
$$\lim_{m\uparrow\infty}(A_m)_{ij} = \begin{cases}0 & \text{if } i > j\,,\\ \lambda_j & \text{if } i = j\,.\end{cases}$$

Proof. Under the given assumptions, Theorem 16.3 ensures that the matrix $Y = X^{-1}$ admits a factorization of the form $Y = L_YU_Y$, where $L_Y$ is lower triangular with ones on the diagonal and $U_Y$ is upper triangular. Then, in view of formula (42.18),
$$\mathcal{Q}_m\mathcal{R}_m = A^m = XD^mL_YU_Y = XC_mD^mU_Y$$

with Cm = D m LY D −m .

−m and LY is lower Since (in self-evident notation) (Cm )ij = λm i (LY )ij λj −m m triangular, (Cm )ij = 0 if j > i. Furthermore, as λi λj → 0 when m ↑ ∞ if j < i, ( 0 if i = j , −m m lim (Cm )ij = lim λi (LY )ij λj = 1 if i = j . m↑∞ m↑∞

Thus, Cm is an invertible matrix that tends to In as m ↑ ∞. Consequently, upon invoking the QR factorization X = QX RX , Qm Rm = XCm D m UY = QX RX Cm D m UY −1 = QX (RX Cm RX )RX D m UY = QX Gm RX D m UY m = QX Qm Rm RX D m UY = QX Qm Δm ΔH m Rm RX D UY ,

where Qm is the unitary factor and Rm is the upper triangular factor with positive entries on the diagonal in the QR factorization of the invertible −1 , and Δm = diag{(μm )1 , . . . , (μm )n } is a unitary matrix Gm = RX Cm RX matrix that is chosen so that the diagonal entries of the upper triangular


m matrix ΔH m Rm RX D UY are all positive. Therefore, by the uniqueness of QR factorization, Qm = QX Qm Δm . The rest of the proof is divided into parts.

1. Qm → In as m ↑ ∞ It is easily checked that the entries in the matrices Qm and Rm are bounded and hence that: (1) A subsequence of {Q1 , Q2 , . . .} converges to a unitary matrix Q∞ . (2) A subsequence of the matrices {R1 , R2 , . . .} converges to an upper triangular matrix R∞ with nonnegative entries on the diagonal. (3) Q∞ R∞ = In . Consequently, the unitary matrix Q∞ = RH ∞ is a lower triangular matrix with nonnegative entries on the diagonal, which is only possible if Q∞ = In . Since the same conclusions hold for all convergent subsequences, it follows that the original sequences Qm → In and Rm → In as m ↑ ∞. 2. Completion of the proof In view of (42.19) and the formulas Qm = QX Qm Δm and X = QX RX , H H H −1 QX Qm Δm Am+1 = QH m AQm = Δm Qm QX XDX −1 H = ΔH m Qm RX DRX Qm Δm .

The stated conclusions of the theorem are now easily read off the formula −1 (Am+1 )ij = (μm )i (QH m RX DRX Qm )ij (μm )j , −1 −1 −1 since QH m RX DRX Qm → RX DRX as m ↑ ∞ and RX DRX is upper trian−1 )ii = λi for i = 1, . . . , n.  gular with (RX DRX
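The convergence asserted in Theorem 42.8 shows up after a modest number of iterations. The following sketch builds a matrix with well-separated eigenvalue moduli (the hypothesis of the theorem), runs the iteration with the sign convention of this section (positive diagonal in each $R_m$, which numpy's QR does not enforce by itself), and inspects the diagonal and the subdiagonal part.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 4
D = np.diag([8.0, 4.0, 2.0, 1.0])                 # |lambda_1| > ... > |lambda_n|
X = rng.standard_normal((n, n))
A = X @ D @ np.linalg.inv(X)

Am = A.copy()
for m in range(200):
    Q, R = np.linalg.qr(Am)
    s = np.sign(np.diag(R))                       # flip signs so diag(R) > 0
    Q, R = Q * s, s[:, None] * R
    Am = R @ Q

print(np.round(np.diag(Am), 6))                   # approximately 8, 4, 2, 1
print(np.allclose(np.tril(Am, -1), 0, atol=1e-6)) # entries below the diagonal -> 0
```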

42.6. Supplementary notes The treatment of Gaussian quadrature is partially adapted from the discussion in Section 8.14 of [30], which was adapted from the Ph.D. thesis of Ilan Degani [22]. The present version is based on first showing that the operator Pn in the space of polynomials of degree ≤ n − 1 is unitarily equivalent to multiplication by a companion matrix in C n . The sections on Bezoutians and resultants are much abbreviated versions of the presentations in Chapter 21 of [30]. A neat way to establish the properties of Bezoutians and resultants that avoids the use of the Barnett identity is presented in Curgus and Dijksma [20]. The last two sections are new. Section 42.4 is adapted from [9]. The presented proof of Theorem 42.8 in the last section is taken from the paper [76] by J. N. Wilkinson. For additional insight and references, see the review article [73] by D. Watkins, and, for additional perspective [74].

Chapter 43

Toeplitz, Hankel, and de Branges

In this chapter we shall develop some elementary properties of finite-dimensional reproducing kernel Hilbert spaces and shall then use these properties to identify positive definite Hankel matrices as the Gram matrices of a set of polynomials in a very important class of reproducing kernel Hilbert spaces of the kind introduced and intensively studied by L. de Branges. Analogous conclusions will be presented first for positive definite Toeplitz matrices. This serves to identify the densities that define the inner products in these two spaces as solutions of a pair of truncated moment problems.

Recall that a matrix $A\in\mathbb{C}^{n\times n}$ with entries $a_{ij}$, $i,j = 1,\ldots,n$, is a:
• Toeplitz matrix if $a_{ij} = t_{i-j}$ for $i,j = 1,\ldots,n$ and some set of $2n-1$ numbers $t_{-n+1},\ldots,t_{n-1}$,
• Hankel matrix if $a_{ij} = h_{i+j-2}$ for $i,j = 1,\ldots,n$ and some set of $2n-1$ numbers $h_0,\ldots,h_{2n-2}$.
Thus, for example, if $n = 3$,
$$B = \begin{bmatrix}t_0 & t_{-1} & t_{-2}\\ t_1 & t_0 & t_{-1}\\ t_2 & t_1 & t_0\end{bmatrix} \quad\text{and}\quad C = \begin{bmatrix}h_0 & h_1 & h_2\\ h_1 & h_2 & h_3\\ h_2 & h_3 & h_4\end{bmatrix},$$
then $B$ is a Toeplitz matrix and $C$ is a Hankel matrix.

43.1. Reproducing kernel Hilbert spaces

A finite-dimensional inner product space $\mathcal{H}$ of complex-valued functions that are defined on a nonempty subset $\Omega$ of $\mathbb{C}$ is said to be a reproducing kernel Hilbert space if there exists a function $K_\omega(\lambda)$ that is defined on $\Omega\times\Omega$ such that for every choice of $\omega\in\Omega$ the following two conditions are fulfilled:
(1) $K_\omega\in\mathcal{H}$ (as a function of $\lambda$).
(2) $\langle f, K_\omega\rangle_{\mathcal{H}} = f(\omega)$ for every point $\omega\in\Omega$ and every function $f\in\mathcal{H}$.
A function $K_\omega(\lambda)$ that meets these two conditions is called a reproducing kernel for $\mathcal{H}$.

Theorem 43.1. If $K_\omega(\lambda)$ is a reproducing kernel for a finite-dimensional reproducing kernel Hilbert space $\mathcal{H}$ of complex-valued functions defined on a subset $\Omega$ of $\mathbb{C}$, then:
(1) $K_\omega(\lambda)$ is a positive kernel, i.e.,
$$(43.1)\qquad \sum_{i,j=1}^m \overline{c_i}\,K_{\omega_j}(\omega_i)\,c_j\ge 0$$
for every positive integer $m$ and every choice of points $\omega_1,\ldots,\omega_m$ in $\Omega$ and coefficients $c_1,\ldots,c_m\in\mathbb{C}$.
(2) $K_\alpha(\beta) = \overline{K_\beta(\alpha)}$ for every pair of points $\alpha, \beta\in\Omega$.
(3) $\mathcal{H}$ has exactly one reproducing kernel (though it may be expressed in more than one way).

Proof. Let $f = \sum_{j=1}^m c_jK_{\omega_j}$. Then the evaluation
$$(43.2)\qquad \|f\|_{\mathcal{H}}^2 = \sum_{i,j=1}^m \langle c_jK_{\omega_j}, c_iK_{\omega_i}\rangle_{\mathcal{H}} = \sum_{i,j=1}^m \overline{c_i}\,K_{\omega_j}(\omega_i)\,c_j$$
clearly serves to establish (1), since $\|f\|_{\mathcal{H}}^2\ge 0$. The special case of (43.2) with $m = 2$, $\omega_1 = \alpha$, and $\omega_2 = \beta$ implies that
$$\begin{bmatrix}\overline{c_1} & \overline{c_2}\end{bmatrix}\begin{bmatrix}K_\alpha(\alpha) & K_\beta(\alpha)\\ K_\alpha(\beta) & K_\beta(\beta)\end{bmatrix}\begin{bmatrix}c_1\\ c_2\end{bmatrix}\ge 0$$
for every choice of $c_1, c_2\in\mathbb{C}$ and hence that the $2\times 2$ matrix in the preceding display is positive semidefinite. Therefore, it is Hermitian and (2) holds.

Suppose next that $L_\omega(\lambda)$ is also a reproducing kernel for $\mathcal{H}$. Then
$$L_\alpha(\beta) = \langle L_\alpha, K_\beta\rangle_{\mathcal{H}} = \overline{\langle K_\beta, L_\alpha\rangle_{\mathcal{H}}} = \overline{K_\beta(\alpha)} = K_\alpha(\beta)$$
for every pair of points $\alpha, \beta\in\Omega$. Therefore, (3) holds. $\Box$

Example 43.1. Let $\mathcal{H}$ be an $n$-dimensional inner product space of complex-valued functions that are defined on a nonempty subset $\Omega$ of $\mathbb{C}$, and let $f_1,\ldots,f_n$ be a basis for $\mathcal{H}$ with Gram matrix $G\in\mathbb{C}^{n\times n}$, i.e., $g_{ij} = \langle f_j, f_i\rangle_{\mathcal{H}}$ for $i,j = 1,\ldots,n$. Then
$$(43.3)\qquad K_\omega(\lambda) = \sum_{i,j=1}^n f_i(\lambda)\,(G^{-1})_{ij}\,\overline{f_j(\omega)}$$
is the reproducing kernel for $\mathcal{H}$.

It is clear that $K_\omega(\lambda)$ belongs to $\mathcal{H}$ (as a function of $\lambda$) for every choice of $\omega\in\Omega$. Moreover,
$$\langle f_k, K_\omega\rangle_{\mathcal{H}} = \left\langle f_k,\ \sum_{i,j=1}^n f_i\,(G^{-1})_{ij}\,\overline{f_j(\omega)}\right\rangle_{\mathcal{H}} = \sum_{i,j=1}^n (G^{-1})_{ji}\,f_j(\omega)\,\langle f_k, f_i\rangle_{\mathcal{H}} = \sum_{i,j=1}^n (G^{-1})_{ji}\,f_j(\omega)\,g_{ik} = f_k(\omega)\,,$$
since $\sum_{i=1}^n (G^{-1})_{ji}\,g_{ik}$ is equal to the $jk$ entry of $I_n$. Therefore, by linearity, $\langle f, K_\omega\rangle_{\mathcal{H}} = f(\omega)$ for every $f\in\mathcal{H}$ and hence $K_\omega(\lambda)$ is the reproducing kernel for $\mathcal{H}$. $\Box$

Exercise 43.1. Show that in the setting of Example 43.1, the reproducing kernel $K_\omega(\lambda)$ defined by formula (43.3) can also be expressed as
$$K_\omega(\lambda) = F(\lambda)G^{-1}F(\omega)^H = -(\det G)^{-1}\det\begin{bmatrix}G & F(\omega)^H\\ F(\lambda) & 0\end{bmatrix} \quad\text{with } F(\lambda) = \begin{bmatrix}f_1(\lambda) & \cdots & f_n(\lambda)\end{bmatrix}.$$

Exercise 43.2. Show that if the space $\mathcal{M}$ considered in Example 43.1 is a proper subspace of an inner product space $\mathcal{U}$ and $\langle g, h\rangle_{\mathcal{M}} = \langle g, h\rangle_{\mathcal{U}}$ for every choice of $g, h\in\mathcal{M}$, then the orthogonal projection $\Pi_{\mathcal{M}}f$ of $f\in\mathcal{U}$ at $\omega\in\Omega$ is equal to $(\Pi_{\mathcal{M}}f)(\omega) = \langle f, K_\omega\rangle_{\mathcal{U}}$.
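Formula (43.3), in the form $K_\omega(\lambda) = F(\lambda)G^{-1}F(\omega)^H$ of Exercise 43.1, is easy to verify numerically for the polynomial basis $f_j(\lambda) = \lambda^{j-1}$ and any positive definite Gram matrix; the following sketch uses an arbitrarily chosen $G$ and the identity $\langle Fu, Fv\rangle_{\mathcal{H}} = v^HGu$.

```python
import numpy as np

rng = np.random.default_rng(8)
n = 5
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
G = M @ M.conj().T + n * np.eye(n)                 # a positive definite Gram matrix

F = lambda lam: np.array([lam ** j for j in range(n)])     # the row F(lambda)
u = rng.standard_normal(n) + 1j * rng.standard_normal(n)   # f = F(.) u
omega = 0.3 + 0.7j

k_coeffs = np.linalg.inv(G) @ F(omega).conj()      # K_omega = F(.) k_coeffs
lhs = k_coeffs.conj() @ G @ u                      # <F u, K_omega>_H = k^H G u
print(np.allclose(lhs, F(omega) @ u))              # reproduces f(omega)
```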

43.2. de Branges spaces

Classical de Branges spaces are reproducing kernel Hilbert spaces of entire functions (i.e., functions that are holomorphic on the full complex plane) with reproducing kernels of the form
$$(43.4)\qquad K_\omega(\lambda) = \frac{E_+(\lambda)\overline{E_+(\omega)} - E_-(\lambda)\overline{E_-(\omega)}}{-2\pi i(\lambda - \overline{\omega})} \quad\text{for } \lambda\ne\overline{\omega}$$
and inner product
$$(43.5)\qquad \langle f, g\rangle_{\Delta_E} = \int_{-\infty}^\infty \overline{g(\mu)}\,\Delta_E(\mu)f(\mu)\,d\mu \quad\text{with } \Delta_E(\mu) = |E_+(\mu)|^{-2}\,,$$
where $E_\pm$ are entire functions that enjoy the following properties:
(1) $|E_+(\lambda)| > 0$ for all points $\lambda$ in the closed upper half-plane.
(2) $|(E_+^{-1}E_-)(\lambda)|\le 1$ for all points $\lambda$ in the closed upper half-plane with equality for $\lambda\in\mathbb{R}$.

Analogous conditions can be formulated for the unit disc $\mathbb{D} = \{\lambda\in\mathbb{C} : |\lambda| < 1\}$. The reproducing kernel $K_\omega(\lambda)$ is then of the form
$$(43.6)\qquad K_\omega(\lambda) = \frac{E_+(\lambda)\overline{E_+(\omega)} - \lambda\overline{\omega}\,E_-(\lambda)\overline{E_-(\omega)}}{1 - \lambda\overline{\omega}} \quad\text{for } \lambda\overline{\omega}\ne 1,$$
where the functions $E_\pm(\lambda)$ are subject to the following constraints:
(3) $|E_+(\lambda)| > 0$ for all points $\lambda$ in the closed unit disc.
(4) $|(E_+^{-1}E_-)(\lambda)|\le 1$ for all points $\lambda$ in the closed unit disc with equality for $|\lambda| = 1$.
The inner product in this setting is
$$(43.7)\qquad \langle f, g\rangle_{\Delta_E} = \frac{1}{2\pi}\int_0^{2\pi} \overline{g(e^{i\theta})}\,|E_+(e^{i\theta})|^{-2}f(e^{i\theta})\,d\theta\,.$$
To distinguish between these two cases, we shall refer to the space in the first (resp., second) setting as a de Branges space with respect to $\mathbb{C}_+$ (resp., $\mathbb{D}$).
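In the simplest example of a de Branges space with respect to $\mathbb{D}$, the polynomials of degree $\le n-1$ with $E_+(\lambda) = 1$ and $E_-(\lambda) = \lambda^{n-1}$ (so that $\Delta_E\equiv 1$), the kernel (43.6) reduces to $K_\omega(\lambda) = \sum_{j=0}^{n-1}(\lambda\overline{\omega})^j$, and the reproducing property with respect to (43.7) can be checked by direct numerical integration over the circle. The sketch below does this with a uniform Riemann sum; it anticipates the identification carried out in Section 43.5, here in the special case of the identity Gram matrix.

```python
import numpy as np

n = 5
omega = 0.4 - 0.2j
coeffs = np.arange(1, n + 1, dtype=complex)            # f(lambda) = sum_j coeffs[j] lambda^j

f = lambda lam: np.polyval(coeffs[::-1], lam)
E_plus = lambda lam: np.ones_like(lam)
E_minus = lambda lam: lam ** (n - 1)
K = lambda lam, om: (E_plus(lam) * np.conj(E_plus(om))
                     - lam * np.conj(om) * E_minus(lam) * np.conj(E_minus(om))) / (1 - lam * np.conj(om))

theta = np.linspace(0.0, 2 * np.pi, 4096, endpoint=False)
z = np.exp(1j * theta)
integrand = np.conj(K(z, omega)) * np.abs(E_plus(z)) ** (-2) * f(z)
inner = integrand.mean()                               # (1/2pi) * integral over [0, 2pi)
print(inner, f(omega))                                 # the two values should agree
```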

43.3. The space of polynomials of degree ≤ n − 1 For the rest of this chapter we shall focus on the special case that H is equal to the n-dimensional vector space of polynomials of degree ≤ n−1, equipped with an inner product and Ω = C. This is a nice space to work with because the vectors in H are entire functions and H is invariant under the action of the generalized backward shift operator Rα , which is defined by the formula f (λ) − f (α) if λ = α , (43.8) (Rα f )(λ) = λ−α  if λ = α f (α) for every α ∈ C: If f (λ) = c0 + c1 λ + · · · + cn−1 λn−1 , then   f (λ) − f (α)  λj − αj = = cj cj αj−1−i λi , λ−α λ−α n−1

n−1

j−1

j=1

j=1

i=1

which is a polynomial in λ of degree ≤ n − 2. It is convenient  to describe the space H in terms of the matrix-valued function F (λ) = 1 λ · · · λn−1 : (43.9)

H = {F (λ)u : u ∈ C n }.

43.3. The space of polynomials of degree ≤ n − 1

465

In particular (R0 F )(λ)u =

 F (λ) − F (0) u = 0 1 ··· λ

 λn−2 u = F (λ)Au

(n)

with A = C0 , the n × n Jordan cell with 0 on the diagonal. Consequently, F (λ)(In − λA) = F (0) , and hence, upon setting F (0) = C, we obtain the formulas F (λ) = C(In − λA)−1

(43.10) and

(Rα F )(λ) = F (λ)A(In − αA)−1 .

(43.11) The formula

F u, F vH = vH Gu

(43.12)

defines an inner product on H for every G ∈ C n×n that is positive definite. Lemma 43.2. Let H denote the n-dimensional inner product space of polynomials of degree ≤ n − 1 equipped with the inner product (43.12).Then H is a reproducing kernel Hilbert space with reproducing kernel Kω (λ) = F (λ)G−1 F (ω)H .

(43.13)

Proof. This is a particular case of Example 43.1.



For each choice of α ∈ C, let Hα = {f ∈ H : f (α) = 0} .

(43.14)

Lemma 43.3. The space Hα is a reproducing kernel Hilbert space with inner product (43.12) and reproducing kernel (43.15)

Kω(α) (λ) = Kω (λ) − Kα (λ)Kα (α)−1 Kω (α)

for each point α ∈ C. (α)

(α)

Proof. Since Kα (α) = 0, it is clear that Kα ∈ Hα . Moreover, if f ∈ Hα , then f, Kω(α) H = f, Kω H − f, Kα Kα (α)−1 Kω (α)H = f (ω) − Kα (ω)Kα (α)−1 f (α) = f (ω) , since f (α) = 0 when f ∈ Hα .



466

43. Toeplitz, Hankel, and de Branges

43.4. Two subspaces In this section, we shall have special interest in two subspaces of the inner product space H considered in Lemma 43.2: (43.16)

H0 = {f ∈ H : f (0) = 0}

and

H• = {R0 f : f ∈ H0 } .

To describe these spaces and their reproducing kernels, it is convenient to let Γ = G−1

and

e1 , . . . , en

denote the standard basis for C n .

Then H0 = {F v : v ∈ C n

(43.17)

and eH 1 v = 0}

and, in view of formula (43.15), the reproducing kernel for H0 is −1 H Kω(0) (λ) = Kω (λ) − F (λ)Γe1 (eH e1 ΓF (ω)H . 1 Γe1 )

(43.18)

Lemma 43.4. The space H• = {R0 f : f ∈ H0 } can be described as H• = {F v : v ∈ C n

(43.19)

and

eH n v = 0} ;

it is a reproducing kernel Hilbert space with reproducing kernel −1 H en ΓF (ω)H . Kω(•) (λ) = Kω (λ) − F (λ)Γen (eH n Γen )

(43.20)

Proof. Observe first that f = F v ∈ H0 if and only if f (0) = F (0)v = eH 1 v = 0. Thus, 2 (  0 : u ∈ C n−1 H0 = F u and, as





 0 0 u = FA =F , R0 F u u 0 we see that (  2 u n−1 H• = F : u∈C = {F v : v ∈ C n and eH n v = 0} . 0 Formula (43.20) implies that Kω(•) (λ) = F (λ)BF (ω)H

−1 H with B = Γ − Γen (eH en Γ . n Γen ) (•)

H Thus, as eH n BF (ω) = 0 for every point ω ∈ C, it follows that Kω ∈ H• . Moreover, if f ∈ H• , then

F v, Kω(•) H = F (ω)B H Gv = F (ω)BGv −1 H en )v = F (ω)v , = F (ω)(In − Γen (eH n Γen )

since eH n v = 0.



43.5. G is a Toeplitz matrix

467

43.5. G is a Toeplitz matrix In this section we shall identify H as a de Branges space with respect to D when the matrix G is a Toeplitz matrix. The notation ⎤ ⎡ gkk · · · gkm ⎢ .. ⎥ for 1 ≤ k ≤ m ≤ n (43.21) G[k,m] = ⎣ ... . ⎦ gmk · · · gmm (which appeared earlier in (16.5)) and (43.22)

R• : f ∈ H• → λf (λ)

will be useful. Lemma 43.5. The operator R0 maps H0 isometrically onto H• if and only if the positive definite matrix G that defines the inner product (43.12) in H is a Toeplitz matrix. Proof. It is readily checked that R• R0 f = f for every f ∈ H0 . Moreover, R0 maps H0 onto H• , because if f ∈ H• , then R• f ∈ H0 and R0 R• f = f . If f, g ∈ H0 , then f = F v, g = F w for vectors v, w ∈ C n that  are 0 H H and subject to the constraints e1 v = 0 and e1 w = 0. Thus, if v = x

 0 w= with x, y ∈ C n−1 , then, in terms of the notation (43.21), y F v, F wH = wH Gv = yH G[2,n] x and R0 F v, R0 F wH = F Av, F AwH = wH AH GAv = yH G[1,n−1] x . Thus, R0 maps H0 isometrically onto H• if and only if G[2,n] = G[1,n−1] , i.e., if and only if gij = gi+1,j+1 for i, j = 1, . . . , n − 1, i.e., if and only if G is a Toeplitz matrix.  Exercise 43.3. Show that the operator Sα that is defined by the formula (43.23)

(Sα f )(λ) =

1 − λα f (λ) for |α| = 1 λ−α

maps Hα isometrically onto H1/α if and only if the positive definite matrix G that defines the inner product (43.12) in H is a Toeplitz matrix. We remark that R0 = S0 = limα↓0 Sα and R• = limα↓0 S1/α (i.e., α tends to 0 through positive values).

468

43. Toeplitz, Hankel, and de Branges

Lemma 43.6. If the positive definite matrix G that defines the inner product (43.12) in H is a Toeplitz matrix, then λ ω Kω(•) (λ) = Kω(0) (λ)

(43.24)

for every pair of points λ, ω ∈ C and the reproducing kernel for H admits the representation (43.25) Kω (λ) = F (λ)ΓF (ω) =

E+ (λ)E+ (ω) − λ ω E− (λ)E− (ω) , λω = 1, 1 − λω

in terms of the matrix polynomials (43.26)

E+ (λ) =

F (λ)Γe1 1/2 (eH 1 Γe1 )

and

E− (λ) =

F (λ)Γen . 1/2 (eH n Γen )

Proof. Formula (43.24) follows easily from the evaluations (0)

(0)

(0)

R0 Kλ , ωKω(•) H = ω(R0 Kλ )(ω) = Kλ (ω) and (0)

(0)

ωKω(•) , R0 Kλ H = ωR• Kω(•) , Kλ H = λωKω(•) (λ) . Formulas (43.25) and (43.26) are obtained by substituting the expressions (43.18) and (43.20) into (43.24) and then solving for Kω (λ), the reproducing kernel for the full space H.  Exercise 43.4. Show that if G ∈ C n×n is a positive definite Toeplitz matrix and Γ = G−1 , then (43.27)

H eH 1 Γe1 = en Γen .

Theorem 43.7. If the positive definite matrix G that defines the inner product (43.12) in H is a Toeplitz matrix, then the space H is a de Branges space with respect to D: Its reproducing kernel can be expressed in the form (43.25) with entries E± (λ) specified by (43.26) and inner product 6 2π 1 g(eiθ )|E+ (eiθ )|−2 f (eiθ )dθ = f, gH (43.28) f, gΔE = 2π 0 for every pair of functions f, g ∈ H, i.e., 6 2π 1 vH F (eiθ )H |E+ (eiθ )|−2 F (eiθ )udθ = vH Gu (43.29) F u, F vΔE = 2π 0 for every choice of u, v ∈ C n . Proof. We have already obtained the basic identity (43.25) in tems of the −1 E− meet polynomials E± in (43.26). It remains to show that E+ and E+ the constraints (3) and (4) in Section 43.2. The proof is divided into parts.

43.5. G is a Toeplitz matrix

469

1. |E+ (ω)| > 0 if |ω| ≤ 1 and |E+ (ω)| = |E− (ω)| if |ω| = 1. Formula (43.25) clearly implies that |E+ (ω)|2 = |ω|2 |E− (ω)|2 + (1 − |ω|2 )Kω (ω) > 0 for every point ω ∈ D, since Kω (ω) > 0, and that |E+ (ω)|2 = |ω|2 |E− (ω)|2

if |ω| = 1 .

Consequently, |E+ (ω)| > 0 when |ω| = 1, because otherwise (by (43.25)), F (λ)ΓF (ω)H = 0

for every point λ ∈ D when |ω| = 1 .

But this in turn would imply that ΓF (ω)H = 0 when |ω| = 1 and hence, as Γ is invertible, that F (ω)H = 0 when |ω| = 1, which is false. The inequality |E+ (ω)| > 0 when |ω| = 1 ensures that the inner product (43.28) is well-defined. 2. f, Kω ΔE = f (ω) for f ∈ H. Let ρω (λ) = 1 − λω and then, using formula (43.25) for the reproducing kernel, check that for every f ∈ H and every ω ∈ D 8 7 6 2π 1 1 f (eiθ ) E+ E+ (ω) E+ (ω)dθ = f (ω) = f, iθ ρω 2π 0 E+ (e ) (1 − e−iθ ω) ΔE

= |E− (eiθ )|) and (as |E+ = < 6 2π (1 − ρω )E− E− (ω)H e−iθ ω 1 f (eiθ ) f, E− (ω)dθ = 0 . = ρω 2π 0 E− (eiθ ) (1 − e−iθ ω) ΔE (eiθ )|

These two integrals can be evaluated either by using, Cauchy’s formula from 2π complex function theory or by using the fact that 0 eikθ dθ = 0 for every nonzero integer k. This justifies the formula 6 2π 1 f (eiθ )ΔE (eiθ )F (ω)ΓF (eiθ )H dθ f (ω) = 2π 0 for f ∈ H and ω ∈ D. However, since both sides are entire functions of ω, the formula is valid for all points ω ∈ C. 3. Verification of (43.28). Let Q denote the Gram matrix of the functions fj (λ) = λj , j = 0, . . . , n − 1, with respect to the inner product (43.28). Then, in view of step 2, F (ω)u = F (ω)ΓGu = F u, Kω H = F u, F ΓF (ω)H H = F u, Kω ΔE = F u, F ΓF (ω)H ΔE = F (ω)ΓQu for every choice of ω ∈ C and u ∈ C n . Therefore, ΓG = ΓQ, and hence G = Q, since Γ is invertible. Therefore, (43.28) holds. 

470

43. Toeplitz, Hankel, and de Branges

43.6. G is a Hankel matrix In this section we shall identify H as a de Branges space with respect to C+ when the matrix G that defines the inner product (43.12) is a Hankel matrix. The analysis exploits the interplay between the operators (43.30)

(Ti f )(λ) =

λ+i f (λ) and λ−i

(T−i f )(λ) =

λ−i f (λ) λ+i

for f ∈ Hi and f ∈ H−i , respectively Lemma 43.8. The operator Ti maps Hi isometrically onto H−i if and only if the positive definite matrix G that defines the inner product (43.12) in H is a Hankel matrix. Proof. Since f (λ) λ+i f (λ) = f (λ) + 2i = f (λ) + 2i(Ri f )(λ) λ−i λ−i for f ∈ Hi , it is readily checked with the help of (43.11) that Ti F u = F v

with v = (In + iA)(In − iA)−1 u .

Consequently, F u, F uH − Ti F u, Ti F uH = uH Gu − vH Gv = uH (In + iAH )−1 X(In − iA)−1 u , with X = (In + iAH )G(In − iA) − (In − iAH )G(In + iA) = 2i(AH G − GA) . Thus, isometry holds for all f ∈ Hi if and only if uH (In + iAH )−1 (AH G − GA)(In − iA)−1 u = 0 −1 for all vectors u ∈ C n such that eH 1 (In − iA) u = 0, i.e., if and only if

   H 0 H 0 w (A G − GA) for every w ∈ C n−1 . w

In view of Exercise 11.24, the last condition holds if and only if   H A G − GA [2,n] = O, i.e., if and only if gij = gi+1,j−1 for i = 1, . . . , n − 1 and j = 2, . . . , n. Thus, the isometry holds if and only if G is a Hankel matrix. 

43.6. G is a Hankel matrix

471

Exercise 43.5. Show that the operator Tα that is defined by the formula λ−α (Tα f )(λ) = f (λ) λ−α maps Hα isometrically onto Hα if and only if the positive definite matrix G that defines the inner product (43.12) in H is a Hankel matrix. [REMARK: The assertion is correct for every α ∈ C, but only of interest for α ∈ R.] Exercise 43.6. Show that Tα acting from Hα to Hα is equal to the adjoint Tα∗ of the operator Tα acting from Hα to Hα . Lemma 43.9. If the positive definite matrix G that defines the inner product (43.12) in H is a Hankel matrix, then ω − i (i) λ − i (−i) Kω (λ) = K (λ) . λ+i ω+i ω

(43.31) (i)

(−i)

Proof. Since Ti Kλ ∈ H−i and T−i Kλ (i)

∈ Hi ,

(i)

Ti Kλ , Kω(−i) H = (Ti Kλ )(ω) =

ω + i (i) K (ω) ω−i λ

and (i)

T−i Kω(−i) , Kλ H = (T−i Kω(−i) )(λ) =

λ − i (−i) K (λ) . λ+i ω

However, since (i)

(i)

(i)

(−i)

T−i Kω(−i) , Kλ H = Kω(−i) , Ti Kλ H = Ti Kλ , Kω

H , 

it is readily seen that (43.31) holds.

Theorem 43.10. If the positive definite matrix G that defines the inner product (43.12) in H is a Hankel matrix, then (43.32)

Kω (λ) =

E+ (λ)E+ (ω) − E− (λ)E− (ω) −2πi(λ − ω)

for λ = ω ,

where (43.33)

E+ (λ) = π(λ + i)Ki (λ)(πKi (i))−1/2

and

E− (λ) = π(λ − i)K−i (λ)(πK−i (−i))−1/2 .

Proof. Substituting (43.15) with α = ±i into (43.31) yields the identity ' λ−i & Kω (λ) − K−i (λ)K−i (−i)−1 Kω (−i) λ+i ' ω−i & Kω (λ) − Ki (λ)Ki (i)−1 Kω (i) . = ω+i Formulas (43.32) and (43.33) are then obtained by straightforward computation. 

472

43. Toeplitz, Hankel, and de Branges

Theorem 43.11. If the positive definite matrix G that defines the inner product (43.12) in H is a Hankel matrix, then H is a de Branges space with respect to C_+: Its reproducing kernel can be expressed in the form (43.32) with entries E_±(λ) specified by (43.33) and inner product
$$(43.34)\qquad \langle f, g\rangle_{\Delta_E} = \int_{-\infty}^{\infty} \overline{g(\mu)}\,|E_+(\mu)|^{-2}\, f(\mu)\, d\mu$$
for every pair of functions f, g ∈ H, i.e.,
$$(43.35)\qquad \langle Fu, Fv\rangle_{\Delta_E} = \int_{-\infty}^{\infty} v^H F(\mu)^H\, |E_+(\mu)|^{-2}\, F(\mu)u\, d\mu = v^H G u$$
for every choice of u, v ∈ C^n.

Proof. We have already obtained the basic identity (43.32). The rest of the proof is divided into parts.

1. $|E_+(\omega)| > 0$ if $\Im\omega \geq 0$ and $|E_+(\omega)| = |E_-(\omega)|$ if ω ∈ R. Formula (43.32) clearly implies that
$$|E_+(\omega)|^2 - |E_-(\omega)|^2 = -2\pi i(\omega - \bar\omega)\,K_\omega(\omega) > 0$$
for every point ω ∈ C_+, since $K_\omega(\omega) > 0$, and that $|E_+(\omega)| = |E_-(\omega)|$ if ω ∈ R. Consequently, $|E_+(\omega)| > 0$ when ω ∈ R, because otherwise $F(\lambda)\Gamma F(\omega)^H = 0$ for every point λ ∈ C when ω ∈ R. But this in turn implies that $\Gamma F(\omega)^H = 0$ when ω ∈ R and hence, as Γ is invertible, that $F(\omega)^H = 0$ when ω ∈ R, which is false. The last inequality ensures that the inner product (43.34) is well-defined.

2. $E_\pm$ are polynomials of degree n and $|E_-(\omega)| > 0$ if $\Im\omega < 0$. It is clear from the formulas in (43.33) and the identity $|E_+(\omega)| = |E_-(\omega)|$ for ω ∈ R that E_± are both polynomials of degree m and m ≤ n. On the other hand, in view of formulas (43.13) with Γ = G^{-1} and (43.32),
$$\lim_{\nu \uparrow \infty} \frac{|E_+(i\nu)|^2 - |E_-(i\nu)|^2}{\nu^{2n-1}} = 4\pi\, e_n^T \Gamma e_n > 0\,.$$
But this is only viable if m = n (and $\lim_{\nu\uparrow\infty} \nu^{-n}\{|E_+(i\nu)| - |E_-(i\nu)|\} = 0$). The proof that $|E_-(\omega)| > 0$ if $\Im\omega < 0$ is similar to the justification of step 1 and is left to the reader.


3. $\langle f, K_\omega\rangle_{\Delta_E} = f(\omega)$ for f ∈ H. Let $\rho_\omega(\lambda) = -2\pi i(\lambda - \bar\omega)$ and then, using formula (43.32) for the reproducing kernel, check that for every f ∈ H and every ω ∈ C_+
$$\left\langle f,\ \frac{E_+\overline{E_+(\omega)}}{\rho_\omega}\right\rangle_{\Delta_E} = \frac{1}{2\pi i}\int_{-\infty}^{\infty} \frac{f(\mu)}{E_+(\mu)}\,\frac{1}{(\mu - \omega)}\, E_+(\omega)\, d\mu = f(\omega)$$
and
$$\left\langle f,\ \frac{E_-\overline{E_-(\omega)}}{\rho_\omega}\right\rangle_{\Delta_E} = \frac{1}{2\pi i}\int_{-\infty}^{\infty} \frac{f(\mu)}{E_-(\mu)}\,\frac{1}{(\mu - \omega)}\, E_-(\omega)\, d\mu = 0\,.$$
These two integrals can be evaluated by using Cauchy's formula from complex function theory to evaluate the indicated integrals with limits ±R and then letting R ↑ ∞. This justifies the formula
$$f(\omega) = \int_{-\infty}^{\infty} f(\mu)\,\Delta_E(\mu)\,F(\omega)\Gamma F(\mu)^H\, d\mu$$
for f ∈ H and ω ∈ C_+. However, since both sides are entire functions of ω, the formula is valid for all points ω ∈ C.

4. Verification of (43.35). Let Q denote the Gram matrix of the functions $f_j(\lambda) = \lambda^j$, $j = 0, \ldots, n-1$, with respect to the inner product (43.34). Then, in view of step 3,
$$F(\omega)u = F(\omega)\Gamma G u = \langle Fu, F\Gamma F(\omega)^H\rangle_{H} = \langle Fu, K_\omega\rangle_{H} = \langle Fu, K_\omega\rangle_{\Delta_E} = \langle Fu, F\Gamma F(\omega)^H\rangle_{\Delta_E} = F(\omega)\Gamma Q u$$
for every choice of ω ∈ C and u ∈ C^n. Therefore, ΓG = ΓQ, and hence G = Q, since Γ is invertible. Therefore, (43.35) holds. □
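
Formula (43.35) can also be checked numerically. The sketch below rests on the same assumptions as before (monomial basis F(μ) = (1, μ, ..., μ^{n-1}), Γ = G^{-1}, and E_+ built from (43.33)); it takes the Hilbert matrix as a positive definite Hankel matrix G and compares the Gram matrix Q computed from the inner product (43.34) with G.

import numpy as np
from scipy.integrate import quad

n = 3
# Hilbert matrix: a positive definite Hankel matrix (entries depend only on j + k).
G = np.array([[1.0 / (j + k + 1) for k in range(n)] for j in range(n)])
Gamma = np.linalg.inv(G)

def F(z):
    return np.array([z ** j for j in range(n)])            # assumed monomial basis

def K(lam, om):
    return F(lam) @ Gamma @ np.conj(F(om))                  # K_omega(lambda) = F(lambda) Gamma F(omega)^H

def E_plus(z):
    return np.pi * (z + 1j) * K(z, 1j) * (np.pi * K(1j, 1j)) ** (-0.5)   # cf. (43.33)

def weight(mu):
    return 1.0 / abs(E_plus(mu)) ** 2                       # |E_+(mu)|^{-2}

# Q[j, k] = integral of mu^{j+k} |E_+(mu)|^{-2} dmu; by (43.35) this should reproduce G.
Q = np.array([[quad(lambda mu, j=j, k=k: mu ** (j + k) * weight(mu), -np.inf, np.inf)[0]
               for k in range(n)] for j in range(n)])
print(np.max(np.abs(Q - G)))    # should be at the quadrature-tolerance level

In particular, the integrals $\int_{-\infty}^{\infty} \mu^k |E_+(\mu)|^{-2}\, d\mu$ recover the moments that define the Hankel matrix G, in line with the truncated Hamburger moment problem discussed in the notes below.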

43.7. Supplementary notes

Formula (43.31) is a special case of the general formula
$$(43.36)\qquad \frac{\bar\omega - \alpha}{\bar\omega - \bar\alpha}\, K_\omega^{(\alpha)}(\lambda) = \frac{\lambda - \alpha}{\lambda - \bar\alpha}\, K_\omega^{(\bar\alpha)}(\lambda)\,,$$
which is valid for every point α ∉ R when the operator
$$T_\alpha : f(\lambda) \mapsto \frac{\lambda - \bar\alpha}{\lambda - \alpha}\, f(\lambda) = \big((I + (\alpha - \bar\alpha)R_\alpha)f\big)(\lambda)$$
maps $H_\alpha$ isometrically onto $H_{\bar\alpha}$. This identity is due to L. de Branges [21] and was used by him in conjunction with (43.15) to characterize reproducing kernel Hilbert spaces of entire functions that admit reproducing kernels $K_\omega(\lambda)$ of the form (43.32), i.e., in the terminology of this chapter, reproducing kernel Hilbert spaces of entire functions that are de Branges spaces with respect to C_+. It extends to spaces of vector-valued functions; see, e.g., [34] and, for more information on such spaces, [6]. Analogous characterizations of vector-valued de Branges spaces with respect to D are furnished in [31].

To ease the reading, the analysis in this chapter was carried out for inner products based on positive definite matrices G. However, many of the basic identities remain valid when G is only assumed to be an invertible Hermitian matrix such that $e_n^H G^{-1} e_n \neq 0$. This in turn yields formulas that connect the number of zeros of the polynomials E_±(λ) (or det E_±(λ) in the vector case) with the inertia of G; see, e.g., [27] and [29].

In the setting of Theorem 43.7 (wherein G is a Toeplitz matrix with $g_{jk} = t_{j-k}$ for j, k = 1, ..., n), formula (43.29) exhibits $(2\pi)^{-1}|E_+(e^{i\theta})|^{-2}$ as a solution of the truncated trigonometric moment problem:
$$\frac{1}{2\pi}\int_0^{2\pi} e^{ik\theta}\, |E_+(e^{i\theta})|^{-2}\, d\theta = t_k \qquad\text{for } k = -(n-1), \ldots, (n-1)\,.$$
Analogously, in the setting of Theorem 43.11 (wherein G is a Hankel matrix with $g_{jk} = h_{j+k-2}$ for j, k = 1, ..., n), formula (43.35) exhibits $|E_+(\mu)|^{-2}\,d\mu$ as a solution of the truncated Hamburger moment problem:
$$\int_{-\infty}^{\infty} \mu^k\, |E_+(\mu)|^{-2}\, d\mu = h_k \qquad\text{for } k = 0, \ldots, 2n-2\,.$$

Bibliography

[1] John G. Aiken, John A. Erdos, and Jerome A. Goldstein, Unitary approximation of positive operators, Illinois J. Math. 24 (1980), no. 1, 61–72. MR550652 [2] Alkiviadis G. Akritas, Evgenia K. Akritas, and Genadii I. Malaschonok, Various proofs of Sylvester’s (determinant) identity, Symbolic computation, new trends and developments (Lille, 1993), Math. Comput. Simulation 42 (1996), no. 4-6, 585–593, DOI 10.1016/S03784754(96)00035-3. MR1430843 [3] T. Ando, Majorizations and inequalities in matrix theory, Linear Algebra Appl. 199 (1994), 17–67, DOI 10.1016/0024-3795(94)90341-7. MR1274407 [4] Dorin Andrica, A new problem, Eur. Math. Soc. Mag. 125 (2022), 53. [5] Tom M. Apostol, Mathematical analysis: a modern approach to advanced calculus, AddisonWesley Publishing Co., Inc., Reading, Mass., 1957. MR0087718 [6] Damir Z. Arov and Harry Dym, Multivariate prediction, de Branges spaces, and related extension and inverse problems, Operator Theory: Advances and Applications, vol. 266, Birkh¨ auser/Springer, Cham, 2018, DOI 10.1007/978-3-319-70262-9. MR3793176 [7] Sheldon Axler, Down with determinants!, Amer. Math. Monthly 102 (1995), no. 2, 139–154, DOI 10.2307/2975348. MR1315593 [8] Mih´ aly Bakonyi and Hugo J. Woerdeman, Matrix completions, moments, and sums of Hermitian squares, Princeton University Press, Princeton, NJ, 2011, DOI 10.1515/9781400840595. MR2807419 [9] Rajendra Bhatia, Matrix analysis, Graduate Texts in Mathematics, vol. 169, Springer-Verlag, New York, 1997, DOI 10.1007/978-1-4612-0653-8. MR1477662 [10] Rajendra Bhatia, Perturbation bounds for matrix eigenvalues, Classics in Applied Mathematics, vol. 53, reprint of the 1987 original, Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 2007, DOI 10.1137/1.9780898719079. MR2325304 [11] Rajendra Bhatia, Min matrices and mean matrices, Math. Intelligencer 33 (2011), no. 2, 22–28, DOI 10.1007/s00283-010-9194-z. MR2813259 [12] B´ ela Bollob´ as, Linear analysis: An introductory course, 2nd ed., Cambridge University Press, Cambridge, 1999, DOI 10.1017/CBO9781139168472. MR1711398 [13] Jonathan M. Borwein and Adrian S. Lewis, Convex analysis and nonlinear optimization: Theory and examples, 2nd ed., CMS Books in Mathematics/Ouvrages de Math´ematiques de la SMC, vol. 3, Springer, New York, 2006, DOI 10.1007/978-0-387-31256-9. MR2184742


[14] Richard A. Brualdi and Hans Schneider, Determinantal identities: Gauss, Schur, Cauchy, Sylvester, Kronecker, Jacobi, Binet, Laplace, Muir, and Cayley, Linear Algebra Appl. 52/53 (1983), 769–791, DOI 10.1016/0024-3795(83)80049-4. MR1500275 [15] John P. Burg, Maximum entropy spectral analysis, in Modern Spectrum Analysis, Donald G. Childers, ed., John Wiley & Sons, 1978. pp. 34–41. [16] John P. Burg, Maximum entropy spectral analysis, Ph.D. Thesis, 1975. [17] Chandler Davis, W. M. Kahan, and H. F. Weinberger, Norm-preserving dilations and their applications to optimal error bounds, SIAM J. Numer. Anal. 19 (1982), no. 3, 445–469, DOI 10.1137/0719029. MR656462 [18] Raymond Cheng and Yuesheng Xu, Minimum norm interpolation in the 1 (N) space, Anal. Appl. (Singap.) 19 (2021), no. 1, 21–42, DOI 10.1142/S0219530520400059. MR4178411 [19] John B. Conway, A course in functional analysis, 2nd ed., Graduate Texts in Mathematics, vol. 96, Springer-Verlag, New York, 1990. MR1070713 ´ [20] Branko Curgus and Aad Dijksma, A proof of the main theorem on Bezoutians, Elem. Math. 69 (2014), no. 1, 33–39, DOI 10.4171/EM/243. MR3182264 [21] Louis de Branges, Hilbert spaces of entire functions, Prentice-Hall, Inc., Englewood Cliffs, N.J., 1968. MR0229011 [22] Ilan Degani, RCMS - right correction Magnus schemes for oscillatory ode’s and cubature formulas and oscillatory extensions, Ph.D. Thesis, 2005. [23] Emeric Deutsch and Hans Schneider, Bounded groups and norm-Hermitian matrices, Linear Algebra Appl. 9 (1974), 9–27, DOI 10.1016/0024-3795(74)90022-6. MR382315 [24] Klaus Diepold, Intersection of subspaces, Patrick Dewilde Workshop on Algebra, Networks, Signal Processing and System Theory, Waasenaar (2008). [25] R. G. Douglas, On majorization, factorization, and range inclusion of operators on Hilbert space, Proc. Amer. Math. Soc. 17 (1966), 413–415, DOI 10.2307/2035178. MR203464 [26] Peter Duren, Invitation to classical analysis, Pure and Applied Undergraduate Texts, vol. 17, American Mathematical Society, Providence, RI, 2012. MR2933135 [27] Harry Dym, Hermitian block Toeplitz matrices, orthogonal polynomials, reproducing kernel Pontryagin spaces, interpolation and extension, Orthogonal matrix-valued polynomials and applications (Tel Aviv, 1987), Oper. Theory Adv. Appl., vol. 34, Birkh¨ auser, Basel, 1988, pp. 79–135, DOI 10.1007/978-3-0348-5472-6 5. MR1021062 [28] Harry Dym, J contractive matrix functions, reproducing kernel Hilbert spaces and interpolation, CBMS Regional Conference Series in Mathematics, vol. 71, Published for the Conference Board of the Mathematical Sciences, Washington, DC; by the American Mathematical Society, Providence, RI, 1989, DOI 10.1090/cbms/071. MR1004239 [29] Harry Dym, On Hermitian block Hankel matrices, matrix polynomials, the Hamburger moment problem, interpolation and maximum entropy, Integral Equations Operator Theory 12 (1989), no. 6, 757–812, DOI 10.1007/BF01196878. MR1018213 [30] Harry Dym, Linear algebra in action, 2nd ed., Graduate Studies in Mathematics, vol. 78, American Mathematical Society, Providence, RI, 2013, DOI 10.1090/gsm/078. MR3154813 [31] Harry Dym, Two classes of vector valued de Branges spaces, J. Funct. Anal. 284 (2023), no. 3, Paper No. 109758, 31, DOI 10.1016/j.jfa.2022.109758. MR4513110 [32] Harry Dym and Israel Gohberg, Extensions of band matrices with band inverses, Linear Algebra Appl. 36 (1981), 1–24, DOI 10.1016/0024-3795(81)90215-9. MR604325 [33] Harry Dym and Israel Gohberg, Extensions of kernels of Fredholm operators, J. Analyse Math. 
42 (1982/83), 51–97, DOI 10.1007/BF02786871. MR729402 [34] Harry Dym and Santanu Sarkar, Multiplication operators with deficiency indices (p, p) and sampling formulas in reproducing kernel Hilbert spaces of entire vector valued functions, J. Funct. Anal. 273 (2017), no. 12, 3671–3718, DOI 10.1016/j.jfa.2017.09.007. MR3711878


[35] Carl Eckart and Gale Young, The approximation of one matrix by another of lower rank, Psychometrika 1 (1936), 211–218. [36] Avraham Feintuch, Robust control theory in Hilbert space, Applied Mathematical Sciences, vol. 130, Springer-Verlag, New York, 1998, DOI 10.1007/978-1-4612-0591-3. MR1482802 [37] Junichi Fujii, Masatoshi Fujii, Takayuki Furuta, and Ritsuo Nakamoto, Norm inequalities equivalent to Heinz inequality, Proc. Amer. Math. Soc. 118 (1993), no. 3, 827–830, DOI 10.2307/2160128. MR1132412 [38] Takayuki Furuta, Norm inequalities equivalent to L¨ owner-Heinz theorem, Rev. Math. Phys. 1 (1989), no. 1, 135–137, DOI 10.1142/S0129055X89000079. MR1041534 [39] I. M. Glazman and Ju. I. Ljubiˇ c, Finite-dimensional linear analysis: a systematic presentation in problem form, translated from the Russian and edited by G. P. Barker and G. Kuerti, The M.I.T. Press, Cambridge, Mass.-London, 1974. MR0354718 [40] Jochen Gl¨ uck, Evolution equations with eventually positive solutions, Eur. Math. Soc. Mag. 123 (2022), 4–11, DOI 10.4171/mag-65. MR4429067 [41] W. Glunt, T. L. Hayden, Charles R. Johnson, and P. Tarazaga, Positive definite completions and determinant maximization, Linear Algebra Appl. 288 (1999), no. 1-3, 1–10, DOI 10.1016/S0024-3795(98)10211-2. MR1670594 [42] I. Gohberg, M. A. Kaashoek, and H. J. Woerdeman, The band method for positive and contractive extension problems, J. Operator Theory 22 (1989), no. 1, 109–155. MR1026078 [43] I. C. Gohberg and M. G. Kre˘ın, Introduction to the theory of linear nonselfadjoint operators, translated from the Russian by A. Feinstein, Translations of Mathematical Monographs, Vol. 18, American Mathematical Society, Providence, R.I., 1969. MR0246142 [44] G. H. Golub, Alan Hoffman, and G. W. Stewart, A generalization of the Eckart-YoungMirsky matrix approximation theorem, Linear Algebra Appl. 88/89 (1987), 317–327, DOI 10.1016/0024-3795(87)90114-5. MR882452 [45] Robert Grone, Charles R. Johnson, Eduardo M. de S´ a, and Henry Wolkowicz, Positive definite completions of partial Hermitian matrices, Linear Algebra Appl. 58 (1984), 109–124, DOI 10.1016/0024-3795(84)90207-6. MR739282 [46] Paul R. Halmos, A Hilbert space problem book, D. Van Nostrand Co., Inc., Princeton, N.J.Toronto, Ont.-London, 1967. MR0208368 [47] G. H. Hardy, J. E. Littlewood, and G. P´ olya, Inequalities, 2nd ed., Cambridge, at the University Press, 1952. MR0046395 [48] J. William Helton and I. M. Spitkovsky, The possible shapes of numerical ranges, Oper. Matrices 6 (2012), no. 3, 607–611, DOI 10.7153/oam-06-41. MR2987030 [49] Michael Heymann, The pole shifting theorem revisited, IEEE Trans. Automat. Control 24 (1979), no. 3, 479–480, DOI 10.1109/TAC.1979.1102057. MR533402 [50] Charles R. Johnson, Matrix completion problems: a survey, Matrix theory and applications (Phoenix, AZ, 1989), Proc. Sympos. Appl. Math., vol. 40, Amer. Math. Soc., Providence, RI, 1990, pp. 171–198, DOI 10.1090/psapm/040/1059486. MR1059486 [51] Tosio Kato, Perturbation theory for linear operators, reprint of the 1980 edition, Classics in Mathematics, Springer-Verlag, Berlin, 1995. MR1335452 [52] Steven G. Krantz and Harold R. Parks, The implicit function theorem: History, theory, and applications, reprint of the 2003 edition, Modern Birkh¨ auser Classics, Birkh¨ auser/Springer, New York, 2013, DOI 10.1007/978-1-4614-5981-1. MR2977424 [53] M. G. Kre˘ın and I. M. Spitkovski˘ı, Some generalizations of Szeg˝ o’s first limit theorem (Russian, with English summary), Anal. Math. 9 (1983), no. 
1, 23–41, DOI 10.1007/BF01903988. MR705805 [54] Peter Lancaster and Leiba Rodman, Algebraic Riccati equations, Oxford Science Publications, The Clarendon Press, Oxford University Press, New York, 1995. MR1367089 [55] Peter Lancaster and Miron Tismenetsky, The theory of matrices, 2nd ed., Computer Science and Applied Mathematics, Academic Press, Inc., Orlando, FL, 1985. MR792300


[56] Eliahu Levy and Orr Moshe Shalit, Dilation theory in finite dimensions: the possible, the impossible and the unknown, Rocky Mountain J. Math. 44 (2014), no. 1, 203–221, DOI 10.1216/RMJ-2014-44-1-203. MR3216017 [57] Chi-Kwong Li and Fuzhen Zhang, Eigenvalue continuity and Gerˇsgorin’s theorem, Electron. J. Linear Algebra 35 (2019), 619–625, DOI 10.13001/1081-3810.4123. MR4044371 [58] Ross A. Lippert, Fixing two eigenvalues by a minimal perturbation, Linear Algebra Appl. 406 (2005), 177–200, DOI 10.1016/j.laa.2005.04.004. MR2156435 [59] David G. Luenberger, Optimization by vector space methods, John Wiley & Sons, Inc., New York-London-Sydney, 1969. MR0238472 [60] John E. McCarthy and Orr Moshe Shalit, Unitary N -dilations for tuples of commuting matrices, Proc. Amer. Math. Soc. 141 (2013), no. 2, 563–571, DOI 10.1090/S0002-9939-201211714-9. MR2996961 [61] Alan McIntosh, The Toeplitz-Hausdorff theorem and ellipticity conditions, Amer. Math. Monthly 85 (1978), no. 6, 475–477, DOI 10.2307/2320069. MR506368 [62] L. Mirsky, Symmetric gauge functions and unitarily invariant norms, Quart. J. Math. Oxford Ser. (2) 11 (1960), 50–59, DOI 10.1093/qmath/11.1.50. MR114821 [63] F. Ninio, A simple proof of the Perron-Frobenius theorem for positive symmetric matrices, J. Phys. A 9 (1976), no. 8, 1281–1282. MR409523 [64] Vladimir V. Peller, Hankel operators and their applications, Springer Monographs in Mathematics, Springer-Verlag, New York, 2003, DOI 10.1007/978-0-387-21681-2. MR1949210 [65] Elijah Polak, Optimization: Algorithms and consistent approximations, Applied Mathematical Sciences, vol. 124, Springer-Verlag, New York, 1997, DOI 10.1007/978-1-4612-0663-7. MR1454128 [66] Walter Rudin, Real and complex analysis, McGraw-Hill Book Co., New York-Toronto-London, 1966. MR0210528 [67] Thomas L. Saaty and Joseph Bram, Nonlinear mathematics, reprint of the 1964 original, Dover Publications, Inc., New York, 1981. MR662681 [68] Jonathan R. Shewchuk, An introduction to the conjugate gradient method without the agonizing pain, https://www.cs.cmu.edu/ quake-papers/painless-conjugate-gradient.pdf (1994), 1–58. [69] Alan Shuchat, Generalized least squares and eigenvalues, Amer. Math. Monthly 92 (1985), no. 9, 656–659, DOI 10.2307/2323714. MR810663 [70] Barry Simon, Convexity: An analytic viewpoint, Cambridge Tracts in Mathematics, vol. 187, Cambridge University Press, Cambridge, 2011, DOI 10.1017/CBO9780511910135. MR2814377 [71] Terence Tao, Topics in random matrix theory, Graduate Studies in Mathematics, vol. 132, American Mathematical Society, Providence, RI, 2012, DOI 10.1090/gsm/132. MR2906465 [72] Lloyd N. Trefethen and David Bau, III, Numerical linear algebra, SIAM, 1997. [73] David S. Watkins, The QR algorithm revisited, SIAM Rev. 50 (2008), no. 1, 133–145, DOI 10.1137/060659454. MR2403061 [74] David S. Watkins, Francis’s algorithm, Amer. Math. Monthly 118 (2011), no. 5, 387–403, DOI 10.4169/amer.math.monthly.118.05.387. MR2805025 [75] Roger Webster, Convexity, Oxford Science Publications, The Clarendon Press, Oxford University Press, New York, 1994. MR1443208 [76] J. H. Wilkinson, Convergence of the LR, QR, and related algorithms, Comput. J. 8 (1965), 77–84, DOI 10.1093/comjnl/8.3.273. MR183108 [77] Kemin Zhou, John C. Doyle, and Keith Glover, Robust and optimal control, Prentice Hall, 1996.

Notation index

Δ(α; r), 393 ΠV , 130 ˙ 43 +, ⊕, 126 αj , 37 γj , 36 σ(A), 37 (A; X )max , 305 (A; X )min , 305 A  B, 171 A  B, 171 A∗ , 118 AH , 4 AT , 4 A1/2 , 179 A† , 164 A[j,k] , 173 A{I;J} , 194 A{i,j,k;r,s,t} , 197 A{i;j} , 69 As , 101 As,t , 102 A, 330 |A|, 65 (p)

Cμ , 40 C, 1 C p×q , 1 C p, 1 CL , 403 CR , 403 C(Q), 235

C k (Q), 88, 235 (Du f ), 263 det A, 65 dim V, 3 E+ (A), 313, 409 E− (A), 313, 409 E0 (A), 313, 409 (∇f ), 237 G[k,m] , 467 Hf (x), 237 Hα , 465 H• , 466 In , 3 IU , 7 f (λ)dλ, 382 Γ Jf , 239 (α)

Kω , 465 (•) Kω , 466 NT , 7 NA , 7 Op×q , 3 PΓA , 386 PVW , 128 Rα , 464


R• , 467 R, 1 R p×q , 1 R p, 1 p×q , 281 R> p×q , 281 R≥ RT , 7 RA , 7 rσ (A), 120 Sb , 359 T ∗ , 118 Tk (x), 376 T U,V , 103 u, vA , 371 u, vU , 113 V ⊥ , 126 W (A), 429 X  , 345 xs , 100 x, yst , 114 ZΩ , 413


Subject index

adjoint, 118 Aiken, John G., 344 Akritas, Alkiviadis G., 198 Akritas, Evgenia K., 198 algebraic multiplicity, 37 algorithm, 231 analytic, 381 Ando, Tsuyoshi, 323 Andrica, Dorin, 97 angle between subspaces, 154 Apostol, Tom M., 235, 236 approximate solutions, 169 approximating by unitary matrices, 337 area of parallelogram, 156 arithmetic mean, 90 Arov, Damir Z., 474 Banach space, 111 Barnett identity, 455 basis, 3 Bau, III, David, 379 Belevich, Vitold, 402 Bessel’s inequality, 131 best approximation, 166, 167 Bezoutian, 453 Bhatia, Rajendra, 303, 314, 320, 323, 460 Binet-Cauchy formula, 189 binomial formula, 9 Birkhoff-von Neumann theorem, 297 block Gaussian elimination, 32 block multiplication, 6 block triangular matrices, 18

Bollob´ as, B´ela, 97, 357 Borwein, Jonathan M., 122 bounded, 349 Bram, Joseph, 280 Brouwer fixed point theorem, 246 Brualdi, Richard A., 198 Burg, John P., 422 Carath´eodory, Constantin, 96 Cassels, John W. S., 401 Cauchy interlacing theorem, 308 Cauchy sequence, 111 Cauchy-Riemann equations, 260 Cauchy-Schwarz inequality, 93, 115, 157 Cayley-Hamilton theorem, 55, 73 characteristic polynomial, 56, 72, 76, 188 Chebyshev polynomial, 376 circulant, 79 companion matrix, 75 complementary space, 45 completing the square, 200 complex vector space, 1 conjugate gradient recursion, 374 conjugate gradients, 369 conservation of dimension, 13 continuity of eigenvalues, 387 contour integral, 382 contractive fixed point theorem, 243 contractive matrix, 208 controllable pair, 399 convergence estimates, 375 convex function, 85, 241, 434


convex hull, 432 convex set, 85 Conway, John B., 357 cosine, 154 Courant-Fischer theorem, 305 Cramer’s rule, 71 CS decomposition, 458 Curgus, Branko, 460 cyclic matrix, 138 cyclic vector, 83 Davis, Chandler, 428 de Boor, Carl W. R., 436 de Branges space, 463 de Branges, Louis, 473 de S´ a, Eduardo M., 422 Degani, Ilan, 460 determinant, 65 determinant identities, 193 Deutsch, Emeric, 422 diagonalizable, 37 Diepold, Klaus, 160 difference equations, 214, 217 differentiability of eigenvalues, 312 differentiating determinants, 185 differentiation of matrix-valued functions, 222 Dijksma, Aad, 460 dimension, 3 direct sum decomposition, 43, 45, 46 directional derivative, 263 discrete dynamical system, 211 doubly stochastic matrix, 297 Douglas, Ron G., 428 Doyle, John C., 428, 447 dual extremal problems, 360 Duren, Peter L., 249 Eckhart, Carl, 170 eigenvalue, 35, 41, 72 eigenvector, 35, 41 entire function, 463 equivalent norms, 100 Erdos, John A., 344 exponential of a matrix, 222 extremal problems, 263, 266 extreme point, 96, 297 Fan, Ky, 300, 310 Farkas’s lemma, 353 Feintuch, Avraham, 428 Fej´er, Leopold, 401


Fibonacci sequence, 217 fitting a line, 204, 205 fixed point, 243 fixed point theorems, 246 Fourier matrix, 81 fractional powers of positive definite matrices, 401, 434 Frobenius norm, 114, 166, 338 Fuglede, Bert, 145 Fujii, J. I., 436 Fujii, M., 436 Furuta, T., 436 Gauss, Carl F., 249 Gauss-Lucas theorem, 433 Gauss-Seidel method, 30 Gaussian elimination, 21 Gaussian quadrature, 449 generalized backward shift, 464 generalized eigenvector, 37 generalized Vandermonde matrix, 78 generic, 390 geometric mean, 90 geometric multiplicity, 36 Gerˇsgorin disks, 393 Gl¨ uch, Jochen, 290 Glazman, Israel N., 160 Glover, Keith, 428, 447 Glunt, William, 422 Gohberg, Israel, 170, 323, 422 Goldstein, Jerome A., 344 Golub, Gene H., 170 Google matrix, 292 gradient, 237 Gram matrix, 116, 117, 130 Gram-Schmidt method, 133 Grone, Robert, 422 Gronwall’s inequality, 226 Hadamard’s inequality, 191 Hadamard, Jacques, 145 Hahn-Banach extension theorem, 349 Halmos, Paul, 436 Hamiltonian matrix, 437 Hankel matrix, 82, 408, 461 Hardy, Godfrey H., 323, 401 Hautus, Malo L. J., 402 Hayden, Thomas L., 422 Heinz inequality, 435 Heinz, E., 436 Helton, J. William, 436 Hermitian matrix, 137, 140, 144


Hermitian transpose, 4 Hessian, 237, 241 Heymann, Michael, 402 Hilbert matrix, 400 Hilbert space, 114 Hoffman, Alan, 170 H¨ older’s inequality, 91 holomorphic, 381 homogeneous system, 211 hyperplane, 348 identity matrix, 3 implicit function theorem, 255 inequalities for determinants, 191, 192 inertia, 410 inner product, 113 inner product space, 113 integration of matrix-valued functions, 222 invariant subspace, 39, 41 inverse, 6 inverse function theorem, 251 invertible, 6 irreducible matrix, 281, 285 isometric matrix, 137, 161 isospectral, 226 Jacobi matrix, 308 Jacobi’s determinant identity, 195 Jacobi’s formula, 187 Jacobi, Carl G., 198 Jacobian, 239 Jensen’s inequality, 85 Johnson, Charles R., 422 Jordan cells, 40 Jordan chain, 55, 56 Jordan decomposition, 53 Kaashoek, Marinus A., 422 Kahan, William M., 428 Kantorovich, Leonid V., 280 Kato, Tosio, 391 keep in mind, 13–15, 28, 65, 122, 137, 140, 162, 168, 176, 182, 185 Krantz, Steven G., 261 Krein, Mark G., 160, 170, 323 Krein-Milman theorem, 96 Krylov subspace, 377 Ky Fan’s inequality, 310 Ky Fan’s maximum principle, 309 Lagrange multipliers, 266 Lancaster, Peter, 411, 447


Lax pair, 226 left invertible, 5, 14, 110 Leibniz’s rule, 77 Leray-Schauder theorem, 246 Leslie matrices, 294 Levy, Eliahu, 209 Lewis, Adrian S., 122 Li, C-K., 391 Lidskii’s inequality, 310 Lidskii, Viktor B., 310 linear combinations, 2 linear dependence, 2 linear functional, 345 linear independence, 2 linear mapping, 7 Lippert, Robert, 314 Littlewood, John E., 323, 401 Ljubic, Ju. L., 160 lower triangular, 8 LQR problem, 445 LU factorization, 174 Luenberger, David G., 379 Lyapunov equation, 407 Malaschonok, Genadii I., 198 mappings, 6 matrices with nonnegative entries, 281 matrices with positive entries, 281 matrix completion, 413 matrix multiplication, 4 maximum entropy completion, 417 McIntosh, Alan, 436 mean value theorem, 235, 238 Mihaly, Bakonyi, 436 minimal norm completion, 423, 424, 426 minimal polynomial, 55, 73 minimum matrix, 295 Minkowski’s inequality, 93 minor, 69 Mirsky, Leonid, 170 moment problem, 422, 461, 474 Moore Penrose inverse, 164 Nakomoto, R., 436 negative definite, 171 negative semidefinite, 171 Newton recursion, 277 Newton step, 247, 277 Newton’s method, 247, 275 Ninio, F., 290 nonhomogeneous system, 214 norm, 99


normal matrix, 137, 140, 142, 143 normal transformation, 145 normed linear space, 99 notation, 1, 114, 128, 171, 173, 235, 251, 284, 305, 313, 347, 371, 403, 418, 467 nullspace, 7 numerical range, 429 observable, 402 open mapping theorem, 253 operator norm, 103 orthogonal, 125 orthogonal complement, 126, 130 orthogonal decomposition, 126 orthogonal family, 125 orthogonal matrix, 4, 137, 140 orthogonal projection, 129, 130, 201 orthonormal expansion, 126 orthostochastic matrix, 300 othonormal family, 126 parallelogram law, 116 Parks, Harold R., 261 Parrott’s lemma, 424 partial isometry, 161, 180 Peller, Vladimir V., 401 permutation matrix, 4, 63, 137 Perron-Frobenius theorem, 285 pivot column, 22 pivot variables, 22 pivots, 22 Polak, Elijah, 280 polar form, 180 polarization identity, 116 Polya, George, 323, 401 Popov, Vasilie M., 402 positive definite, 149, 171, 176 positive semidefinite, 171 principal submatrix, 188 products of singular values, 167 projection, 127 projection by iteration, 147 projection formulas, 151, 153 Putnam, Calvin R., 145 QR factorization, 134, 457 quadrature formulas, 452 range, 7 rank, 13, 14 real Jordan forms, 62 real vector space, 1


reproducing kernel, 462 reproducing kernel Hilbert space, 462 resultants, 456 Riccati equation, 437 Riesz projection, 386 Riesz, Frigyes, 401 right invertible, 6, 14, 110 Rodman, Leiba, 447 roots of polynomials, 258 Rudin, Walter, 122 Saaty, Thomas L., 280 Santanu, Sartu, 474 Schneider, Hans, 198, 422 Schur complements, 34, 200, 207 Schur’s theorem, 141 Schur, Issai, 141, 145 selfadjoint transformation, 145 Shalit, Orr M., 209 Sherman Morrison formula, 74 Shewchuk, Jonathan R., 379 shifting eigenvalues, 400 Shuchat, Alan, 209 similar, 37 Simon, Barry, 97 simple curve, 382 simple permutation matrix, 63 sine, 155 singular value decomposition, 162, 164 singular values, 164, 165, 168 skew-Hermitian matrix, 137 smooth, 235 span, 2 spectral mapping, 74, 398 spectral radius, 120, 121, 396 spectrum, 37 Spitkovsky, Ilya M., 160, 436 square root, 178, 179 standard inner product, 114 Stein equation, 405 Stewart, G. W., 170 stochastic matrix, 291 strictly convex, 241, 370 strictly convex function, 85, 90, 201, 241 strictly convex normed linear space, 201 sublinear functional, 348 subspace, 1 sum of subspaces, 43 sums of singular values, 167 svd, 164 Sylvester equation, 405 Sylvester’s determinant identity, 196


Sylvester’s law of inertia, 314 symmetric gauge function, 320 symmetric matrix, 140 Tao, Terence, 198 Tarazaga, Pablo, 422 Taylor’s formula, 236 Tismenetsky, Miron, 411 Toeplitz matrix, 212, 405, 461 Toeplitz-Hausdorff theorem, 430 trace, 73, 90, 114, 158 transpose, 4 Trefethen, Lloyd N., 379 triangle inequality, 99 triangular, 8 triangular factorization, 173 triangular matrices, 18 UL factorization, 174, 200 unitarily invariant norm, 319 unitary matrix, 137, 140, 161 unitary transformation, 145 upper echelon matrix, 22 upper triangular, 8 Vandermonde matrix, 78 volume of parallelepiped, 157 von Neumann’s inequality, 208 von Neumann’s trace inequality, 301 von Neumann, John, 301, 320 warning, 7, 122, 137, 161, 172, 201, 353 warnings, 264, 265 Watkins, David, 460 Webster, Roger, 357 Weinberger, Hans F., 428 Weyl’s inequalities, 310 Weyl, Herman, 310, 323 Wilkinson, John N., 460 Woerdeman, Hugo J., 422, 436 Wolkowicz, Henry, 422 Wronskian, 232 Young’s inequality, 95 Young, Gale, 170 zero matrix, 3 Zhang, F., 391 Zhou, Kemin, 428, 447



This book is based largely on courses that the author taught at the Feinberg Graduate School of the Weizmann Institute. It conveys in a user-friendly way the basic and advanced techniques of linear algebra from the point of view of a working analyst. The techniques are illustrated by a wide sample of applications and examples that are chosen to highlight the tools of the trade. In short, this is material that the author has found to be useful in his own research and wishes that he had been exposed to as a graduate student. Roughly the first quarter of the book reviews the contents of a basic course in linear algebra, plus a little. The remaining chapters treat singular value decompositions, convexity, special classes of matrices, projections, assorted algorithms, and a number of applications. The applications are drawn from vector calculus, numerical analysis, control theory, complex analysis, convex optimization, and functional analysis. In particular, fixed point theorems, extremal problems, best approximations, matrix equations, zero location and eigenvalue location problems, matrices with nonnegative entries, and reproducing kernels are discussed. This new edition differs significantly from the second edition in both content and style. It includes a number of topics that did not appear in the earlier edition and excludes some that did. Moreover, most of the material that has been adapted from the earlier edition has been extensively rewritten and reorganized.
