Linear Algebra - A Modern Introduction (4e) ISM [4 ed.]
 1285869605, 9781285869605

Table of contents :
Blank Page
Blank Page
Blank Page
Poole 4e CSM title page.pdf
Linear Algebra
A Modern Introduction

Citation preview

Complete Solutions Manual

Linear Algebra A Modern Introduction FOURTH EDITION

David Poole Trent University

Prepared by Roger Lipsett

Australia • Brazil • Japan • Korea • Mexico • Singapore • Spain • United Kingdom • United States

ISBN-13: 978-128586960-5 ISBN-10: 1-28586960-5

© 2015 Cengage Learning ALL RIGHTS RESERVED. No part of this work covered by the copyright herein may be reproduced, transmitted, stored, or used in any form or by any means graphic, electronic, or mechanical, including but not limited to photocopying, recording, scanning, digitizing, taping, Web distribution, information networks, or information storage and retrieval systems, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without the prior written permission of the publisher except as may be permitted by the license terms below.

For product information and technology assistance, contact us at Cengage Learning Customer & Sales Support, 1-800-354-9706. For permission to use material from this text or product, submit all requests online at www.cengage.com/permissions Further permissions questions can be emailed to [email protected].

Cengage Learning 200 First Stamford Place, 4th Floor Stamford, CT 06902 USA Cengage Learning is a leading provider of customized learning solutions with office locations around the globe, including Singapore, the United Kingdom, Australia, Mexico, Brazil, and Japan. Locate your local office at: www.cengage.com/global. Cengage Learning products are represented in Canada by Nelson Education, Ltd. To learn more about Cengage Learning Solutions, visit www.cengage.com. Purchase any of our products at your local college store or at our preferred online store www.cengagebrain.com.

NOTE: UNDER NO CIRCUMSTANCES MAY THIS MATERIAL OR ANY PORTION THEREOF BE SOLD, LICENSED, AUCTIONED, OR OTHERWISE REDISTRIBUTED EXCEPT AS MAY BE PERMITTED BY THE LICENSE TERMS HEREIN.

READ IMPORTANT LICENSE INFORMATION Dear Professor or Other Supplement Recipient: Cengage Learning has provided you with this product (the “Supplement”) for your review and, to the extent that you adopt the associated textbook for use in connection with your course (the “Course”), you and your students who purchase the textbook may use the Supplement as described below. Cengage Learning has established these use limitations in response to concerns raised by authors, professors, and other users regarding the pedagogical problems stemming from unlimited distribution of Supplements. Cengage Learning hereby grants you a nontransferable license to use the Supplement in connection with the Course, subject to the following conditions. The Supplement is for your personal, noncommercial use only and may not be reproduced, posted electronically or distributed, except that portions of the Supplement may be provided to your students IN PRINT FORM ONLY in connection with your instruction of the Course, so long as such students are advised that they

Printed in the United States of America 1 2 3 4 5 6 7 17 16 15 14 13

may not copy or distribute any portion of the Supplement to any third party. You may not sell, license, auction, or otherwise redistribute the Supplement in any form. We ask that you take reasonable steps to protect the Supplement from unauthorized use, reproduction, or distribution. Your use of the Supplement indicates your acceptance of the conditions set forth in this Agreement. If you do not accept these conditions, you must return the Supplement unused within 30 days of receipt. All rights (including without limitation, copyrights, patents, and trade secrets) in the Supplement are and will remain the sole and exclusive property of Cengage Learning and/or its licensors. The Supplement is furnished by Cengage Learning on an “as is” basis without any warranties, express or implied. This Agreement will be governed by and construed pursuant to the laws of the State of New York, without regard to such State’s conflict of law rules. Thank you for your assistance in helping to safeguard the integrity of the content contained in this Supplement. We trust you find the Supplement a useful teaching tool.

Contents 1 Vectors 1.1 The Geometry and Algebra of Vectors 1.2 Length and Angle: The Dot Product . Exploration: Vectors and Geometry . . . . 1.3 Lines and Planes . . . . . . . . . . . . Exploration: The Cross Product . . . . . . 1.4 Applications . . . . . . . . . . . . . . . Chapter Review . . . . . . . . . . . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

3 3 10 25 27 41 44 48

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

2 Systems of Linear Equations 2.1 Introduction to Systems of Linear Equations . . . . . 2.2 Direct Methods for Solving Linear Systems . . . . . . Exploration: Lies My Computer Told Me . . . . . . . . . . Exploration: Partial Pivoting . . . . . . . . . . . . . . . . . Exploration: An Introduction to the Analysis of Algorithms 2.3 Spanning Sets and Linear Independence . . . . . . . . 2.4 Applications . . . . . . . . . . . . . . . . . . . . . . . . 2.5 Iterative Methods for Solving Linear Systems . . . . . Chapter Review . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

53 . 53 . 58 . 75 . 75 . 77 . 79 . 93 . 112 . 123

3 Matrices 3.1 Matrix Operations . . . . . . . . . . . . 3.2 Matrix Algebra . . . . . . . . . . . . . . 3.3 The Inverse of a Matrix . . . . . . . . . 3.4 The LU Factorization . . . . . . . . . . 3.5 Subspaces, Basis, Dimension, and Rank 3.6 Introduction to Linear Transformations 3.7 Applications . . . . . . . . . . . . . . . . Chapter Review . . . . . . . . . . . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

129 129 138 150 164 176 192 209 230

4 Eigenvalues and Eigenvectors 4.1 Introduction to Eigenvalues and Eigenvectors . . 4.2 Determinants . . . . . . . . . . . . . . . . . . . . Exploration: Geometric Applications of Determinants 4.3 Eigenvalues and Eigenvectors of n × n Matrices . 4.4 Similarity and Diagonalization . . . . . . . . . . 4.5 Iterative Methods for Computing Eigenvalues . . 4.6 Applications and the Perron-Frobenius Theorem Chapter Review . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

235 235 250 263 270 291 308 326 365

. . . . . . . .

. . . . . . . .

. . . . . . . .

1

. . . . . . . .

2

CONTENTS

5 Orthogonality 5.1 Orthogonality in Rn . . . . . . . . . . . . . . . . . . . . . . 5.2 Orthogonal Complements and Orthogonal Projections . . . 5.3 The Gram-Schmidt Process and the QR Factorization . . . Exploration: The Modified QR Process . . . . . . . . . . . . . . Exploration: Approximating Eigenvalues with the QR Algorithm 5.4 Orthogonal Diagonalization of Symmetric Matrices . . . . . 5.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . Chapter Review . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

371 371 379 388 398 402 405 417 442

6 Vector Spaces 6.1 Vector Spaces and Subspaces . . . . . . . . . . . . . . . . . 6.2 Linear Independence, Basis, and Dimension . . . . . . . . . Exploration: Magic Squares . . . . . . . . . . . . . . . . . . . . . 6.3 Change of Basis . . . . . . . . . . . . . . . . . . . . . . . . . 6.4 Linear Transformations . . . . . . . . . . . . . . . . . . . . 6.5 The Kernel and Range of a Linear Transformation . . . . . 6.6 The Matrix of a Linear Transformation . . . . . . . . . . . Exploration: Tiles, Lattices, and the Crystallographic Restriction 6.7 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . Chapter Review . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

451 451 463 477 480 491 498 507 525 527 531

7 Distance and Approximation 7.1 Inner Product Spaces . . . . . . . . . . . . . . . . . . . . . . Exploration: Vectors and Matrices with Complex Entries . . . . Exploration: Geometric Inequalities and Optimization Problems 7.2 Norms and Distance Functions . . . . . . . . . . . . . . . . 7.3 Least Squares Approximation . . . . . . . . . . . . . . . . . 7.4 The Singular Value Decomposition . . . . . . . . . . . . . . 7.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . Chapter Review . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

537 537 546 553 556 568 590 614 625

8 Codes 8.1 Code Vectors . . . . . . . 8.2 Error-Correcting Codes . 8.3 Dual Codes . . . . . . . . 8.4 Linear Codes . . . . . . . 8.5 The Minimum Distance of

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

633 633 637 641 647 650

. . . . a

. . . . . . . . . . . . . . . . Code

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

Chapter 1

Vectors 1.1

The Geometry and Algebra of Vectors H-2, 3L

1.

H2, 3L

3

2

1

H3, 0L -2

1

-1

2

3

-1

H3, -2L -2

2. Since 

     2 3 5 + = , −3 0 −3



     2 2 4 + = , −3 3 0



     2 −2 0 + = , −3 3 0

plotting those vectors gives

1

2

3

4

-1

c

b

-2

a -3

d -4

-5

3

5



     2 3 5 + = , −3 −2 −5

4

CHAPTER 1. VECTORS 3.

2

c

z

b

1

-2

0

-1

0

1

y2

1

0 3

2

-1

a

x -1

d -2

4. Since the heads are all at (3, 2, 1), the tails are at       3 0 3 2 − 2 = 0 , 1 0 1

      0 3 3 2 − 2 = 0 , 1 1 0

      3 1 2 2 − −2 = 4 , 1 1 0

# » 5. The four vectors AB are 3

c 2

1

d a 1

2

3

4

-1

b -2

In standard position, the vectors are # » (a) AB = [4 − 1, 2 − (−1)] = [3, 3]. # » (b) AB = [2 − 0, −1 − (−2)] = [2, 1]    # »  (c) AB = 12 − 2, 3 − 32 = − 32 , 32    # »  (d) AB = 16 − 13 , 12 − 13 = − 16 , 16 . 3

2

a

1

c b d -1

1

2

3

      3 −1 4 2 − −1 = 3 . 1 −2 3

1.1. THE GEOMETRY AND ALGEBRA OF VECTORS

5

6. Recall the notation that [a, b] denotes a move of a units horizontally and b units vertically. Then during the first part of the walk, the hiker walks 4 km north, so a = [0, 4]. During the second part of the walk, the hiker walks a distance of 5 km northeast. From the components, we get " √ √ # 5 2 5 2 ◦ ◦ , . b = [5 cos 45 , 5 sin 45 ] = 2 2 Thus the net displacement vector is " √ √ # 5 2 5 2 c=a+b= , 4+ . 2 2 7. a + b =

        3 2 3+2 5 + = = . 0 3 0+3 3

3

2

a+b b

1

a 1

        2 −2 2 − (−2) 4 8. b−c = − = = . 3 3 3−3 0

2

3

4

5

3

2

b

-c

1

b-c 1



    3 −2 5 9. d − c = − = . −2 3 −5

2

3

4



1

2

3

4

5

d

-1

-2

d-c

-3

-c

-4

-5

      3 3 3+3 10. a + d = + = = 0 −2 0 + (−2)   6 . −2

a 1

2

3

4

5

6

d

-1

a+d

-2

11. 2a + 3c = 2[0, 2, 0] + 3[1, −2, 1] = [2 · 0, 2 · 2, 2 · 0] + [3 · 1, 3 · (−2), 3 · 1] = [3, −2, 3]. 12. 3b − 2c + d = 3[3, 2, 1] − 2[1, −2, 1] + [−1, −1, −2] = [3 · 3, 3 · 2, 3 · 1] + [−2 · 1, −2 · (−2), −2 · 1] + [−1, −1, −2] = [6, 9, −1].

6

CHAPTER 1. VECTORS 13. u = [cos 60◦ , sin 60◦ ] =

h

1 2,



3 2

i

h √ i , and v = [cos 210◦ , sin 210◦ ] = − 23 , − 21 , so that

# √ √ 1 3 3 1 u+v = − , − , 2 2 2 2

# √ √ 1 3 3 1 u−v = + , + . 2 2 2 2

"

"

# » 14. (a) AB = b − a. # » # » # » # » (b) Since OC = AB, we have BC = OC − b = (b − a) − b = −a. # » (c) AD = −2a. # » # » # » (d) CF = −2OC = −2AB = −2(b − a) = 2(a − b). # » # » # » (e) AC = AB + BC = (b − a) + (−a) = b − 2a. # » # » # » # » (f ) Note that F A and OB are equal, and that DE = −AB. Then # » # » # » # » # » BC + DE + F A = −a − AB + OB = −a − (b − a) + b = 0.

15. 2(a − 3b) + 3(2b + a)

property e. distributivity

=

(2a − 6b) + (6b + 3a)

property b. associativity

=

(2a + 3a) + (−6b + 6b) = 5a.

16. property e. distributivity

−3(a − c) + 2(a + 2b) + 3(c − b)

=

property b. associativity

=

(−3a + 3c) + (2a + 4b) + (3c − 3b)

(−3a + 2a) + (4b − 3b) + (3c + 3c)

= −a + b + 6c. 17. x − a = 2(x − 2a) = 2x − 4a ⇒ x − 2x = a − 4a ⇒ −x = −3a ⇒ x = 3a. 18. x + 2a − b = 3(x + a) − 2(2a − b) = 3x + 3a − 4a + 2b x − 3x = −a − 2a + 2b + b





−2x = −3a + 3b ⇒ 3 3 x = a − b. 2 2 19. We have 2u + 3v = 2[1, −1] + 3[1, 1] = [2 · 1 + 3 · 1, 2 · (−1) + 3 · 1] = [5, 1]. Plots of all three vectors are 2

w

1

v

v 1

-1

2

3

4

v u -1

v u -2

5

6

1.1. THE GEOMETRY AND ALGEBRA OF VECTORS

7

20. We have −u − 2v = −[−2, 1] − 2[2, −2] = [−(−2) − 2 · 2, −1 − 2 · (−2)] = [−2, 3]. Plots of all three vectors are

-u -v 3

-u 2

w -v 1

u

-2

1

-1

2

v -1

-2

21. From the diagram, we see that w = −2u + 4v.

6

-u 5

-u w

4

v 3

v 2

v 1

v 1

-1

2

3

4

6

5

u -1

22. From the diagram, we see that w = 2u + 3v.

9

8

u 7

6

w

5

u 4

3

v 2

u

v 1

v -2

-1

1

2

3

4

5

6

7

23. Property (d) states that u + (−u) = 0. The first diagram below shows u along with −u. Then, as the diagonal of the parallelogram, the resultant vector is 0. Property (e) states that c(u + v) = cu + cv. The second figure illustrates this.

8

CHAPTER 1. VECTORS

cv

cHu + vL cu

u

v cu

u

u+v

u

cv v

-u

24. Let u = [u1 , u2 , . . . , un ] and v = [v1 , v2 , . . . , vn ], and let c and d be scalars in R. Property (d): u + (−u) = [u1 , u2 , . . . , un ] + (−1[u1 , u2 , . . . , un ]) = [u1 , u2 , . . . , un ] + [−u1 , −u2 , . . . , −un ] = [u1 − u1 , u2 − u2 , . . . , un − un ] = [0, 0, . . . , 0] = 0. Property (e): c(u + v) = c ([u1 , u2 , . . . , un ] + [v1 , v2 , . . . , vn ]) = c ([u1 + v1 , u2 + v2 , . . . , un + vn ]) = [c(u1 + v1 ), c(u2 + v2 ), . . . , c(un + vn )] = [cu1 + cv1 , cu2 + cv2 , . . . , cun + cvn ] = [cu1 , cu2 , . . . , cun ] + [cv1 , cv2 , . . . , cvn ] = c[u1 , u2 , . . . , un ] + c[v1 , v2 , . . . , vn ] = cu + cv. Property (f): (c + d)u = (c + d)[u1 , u2 , . . . , un ] = [(c + d)u1 , (c + d)u2 , . . . , (c + d)un ] = [cu1 + du1 , cu2 + du2 , . . . , cun + dun ] = [cu1 , cu2 , . . . , cun ] + [du1 , du2 , . . . , dun ] = c[u1 , u2 , . . . , un ] + d[u1 , u2 , . . . , un ] = cu + du. Property (g): c(du) = c(d[u1 , u2 , . . . , un ]) = c[du1 , du2 , . . . , dun ] = [cdu1 , cdu2 , . . . , cdun ] = [(cd)u1 , (cd)u2 , . . . , (cd)un ] = (cd)[u1 , u2 , . . . , un ] = (cd)u. 25. u + v = [0, 1] + [1, 1] = [1, 0]. 26. u + v = [1, 1, 0] + [1, 1, 1] = [0, 0, 1].

1.1. THE GEOMETRY AND ALGEBRA OF VECTORS

9

27. u + v = [1, 0, 1, 1] + [1, 1, 1, 1] = [0, 1, 0, 0]. 28. u + v = [1, 1, 0, 1, 0] + [0, 1, 1, 1, 0] = [1, 0, 1, 0, 0]. 29. + 0 1 2 3

0 0 1 2 3

1 1 2 3 0

2 2 3 0 1

3 3 0 1 2

· 0 1 2 3

0 0 0 0 0

1 0 1 2 3

2 0 2 0 2

3 0 3 2 1

0 0 1 2 3 4

1 1 2 3 4 0

2 2 3 4 0 1

3 3 4 0 1 2

4 4 0 1 2 3

· 0 1 2 3 4

0 0 0 0 0 0

1 0 1 2 3 4

2 0 2 4 1 3

3 0 3 1 4 2

30. + 0 1 2 3 4

4 0 4 3 2 1

31. 2 + 2 + 2 = 6 = 0 in Z3 . 32. 2 · 2 · 2 = 3 · 2 = 0 in Z3 . 33. 2(2 + 1 + 2) = 2 · 2 = 3 · 1 + 1 = 1 in Z3 . 34. 3 + 1 + 2 + 3 = 4 · 2 + 1 = 1 in Z4 . 35. 2 · 3 · 2 = 4 · 3 + 0 = 0 in Z4 . 36. 3(3 + 3 + 2) = 4 · 6 + 0 = 0 in Z4 . 37. 2 + 1 + 2 + 2 + 1 = 2 in Z3 , 2 + 1 + 2 + 2 + 1 = 0 in Z4 , 2 + 1 + 2 + 2 + 1 = 3 in Z5 . 38. (3 + 4)(3 + 2 + 4 + 2) = 2 · 1 = 2 in Z5 . 39. 8(6 + 4 + 3) = 8 · 4 = 5 in Z9 . 10 40. 2100 = 210 = (1024)10 = 110 = 1 in Z11 . 41. [2, 1, 2] + [2, 0, 1] = [1, 1, 0] in Z33 . 42. 2[2, 2, 1] = [2 · 2, 2 · 2, 2 · 1] = [1, 1, 2] in Z33 . 43. 2([3, 1, 1, 2] + [3, 3, 2, 1]) = 2[2, 0, 3, 3] = [2 · 2, 2 · 0, 2 · 3, 2 · 3] = [0, 0, 2, 2] in Z44 . 2([3, 1, 1, 2] + [3, 3, 2, 1]) = 2[1, 4, 3, 3] = [2 · 1, 2 · 4, 2 · 3, 2 · 3] = [2, 3, 1, 1] in Z45 . 44. x = 2 + (−3) = 2 + 2 = 4 in Z5 . 45. x = 1 + (−5) = 1 + 1 = 2 in Z6 46. x = 2−1 = 2 in Z3 . 47. No solution. 2 times anything is always even, so cannot leave a remainder of 1 when divided by 4. 48. x = 2−1 = 3 in Z5 . 49. x = 3−1 4 = 2 · 4 = 3 in Z5 . 50. No solution. 3 times anything is always a multiple of 3, so it cannot leave a remainder of 4 when divided by 6 (which is also a multiple of 3). 51. No solution. 6 times anything is always even, so it cannot leave an odd number as a remainder when divided by 8.

10

CHAPTER 1. VECTORS

52. x = 8−1 9 = 7 · 9 = 8 in Z11 53. x = 2−1 (2 + (−3)) = 3(2 + 2) = 2 in Z5 . 54. No solution. This equation is the same as 4x = 2 − 5 = −3 = 3 in Z6 . But 4 times anything is even, so it cannot leave a remainder of 3 when divided by 6 (which is also even). 55. Add 5 to both sides to get 6x = 6, so that x = 1 or x = 5 (since 6 · 1 = 6 and 6 · 5 = 30 = 6 in Z8 ). 56. (a)

All values.

(b)

All values.

(c)

All values.

57. (a) All a 6= 0 in Z5 have a solution because 5 is a prime number. (b) a = 1 and a = 5 because they have no common factors with 6 other than 1. (c) a and m can have no common factors other than 1; that is, the greatest common divisor, gcd, of a and m is 1.

1.2

Length and Angle: The Dot Product

    −1 3 · = (−1) · 3 + 2 · 1 = −3 + 2 = −1. 2 1     3 4 2. Following Example 1.15, u · v = · = 3 · 4 + (−2) · 6 = 12 − 12 = 0. −2 6     2 1 3. u · v = 2 · 3 = 1 · 2 + 2 · 3 + 3 · 1 = 2 + 6 + 3 = 11. 1 3

1. Following Example 1.15, u · v =

4. u · v = 3.2 · 1.5 + (−0.6) · 4.1 + (−1.4) · (−0.2) = 4.8 − 2.46 + 0.28 = 2.62.     4 √1 √  2 − 2 √ √ √  √   5. u · v =   3 ·  0  = 1 · 4 + 2 · (− 2) + 3 · 0 + 0 · (−5) = 4 − 2 = 2. −5 0     −2.29 1.12 −3.25  1.72    6. u · v =   2.07 ·  4.33 = −1.12 · 2.29 − 3.25 · 1.72 + 2.07 · 4.33 − 1.83 · (−1.54) = 3.6265. −1.54 −1.83 7. Finding a unit vector v in the same direction as a given vector u is called normalizing the vector u. Proceed as in Example 1.19: p √ kuk = (−1)2 + 22 = 5, so a unit vector v in the same direction as u is " # " # − √15 1 1 −1 v= u= √ = . kuk √2 5 2 5 8. Proceed as in Example 1.19: kuk =

p √ √ 32 + (−2)2 = 9 + 4 = 13,

so a unit vector v in the direction of u is " # " # √3 3 1 1 13 v= u= √ = . kuk 13 −2 − √213

1.2. LENGTH AND ANGLE: THE DOT PRODUCT

11

9. Proceed as in Example 1.19: kuk =

p

12 + 22 + 32 =



14,

so a unit vector v in the direction of u is     √1 1 14  2  1 1    = v= u= √  2  √14  .   kuk 14 3 √ 3 14 10. Proceed as in Example 1.19: p √ √ kuk = 3.22 + (−0.6)2 + (−1.4)2 = 10.24 + 0.36 + 1.96 = 12.56 ≈ 3.544, so a unit vector v in the direction of u is 

   1.5 0.903 1 1  0.4 ≈ −0.169 . v= u= kuk 3.544 −2.1 −0.395 11. Proceed as in Example 1.19: r kuk =

12 +

√ 2 √ 2 √ 2 + 3 + 02 = 6,

so a unit vector v in the direction of u is   1   1  √  6 √ √ 1 6 6 6 √ √ √    2  1   3 √  √   3  1  2 1 √6  =  3  =  √  = u= √  v= √    1   2  3 kuk 6  3   √6   √2   2  0 0 0 0 

12. Proceed as in Example 1.19: p √ kuk = 1.122 + (−3.25)2 + 2.072 + (−1.83)2 = 1.2544 + 10.5625 + 4.2849 + 3.3489 √ = 19.4507 ≈ 4.410, so a unit vector v in the direction of u is   1 1  1.12 −3.25 2.07 −1.83 ≈ 0.254 −0.737 v u= kuk 4.410       −1 3 −4 13. Following Example 1.20, we compute: u − v = − = , so 2 1 1 q √ 2 d(u, v) = ku − vk = (−4) + 12 = 17. 

     3 4 −1 14. Following Example 1.20, we compute: u − v = − = , so −2 6 −8 q √ 2 2 d(u, v) = ku − vk = (−1) + (−8) = 65.       2 −1 1 15. Following Example 1.20, we compute: u − v = 2 − 3 = −1, so 3 1 2 q √ 2 2 d(u, v) = ku − vk = (−1) + (−1) + 22 = 6.

0.469

 −0.415 .

12

CHAPTER 1. VECTORS 

     3.2 1.5 1.7 16. Following Example 1.20, we compute: u − v = −0.6 −  4.1 = −4.7, so −1.4 −0.2 −1.2 d(u, v) = ku − vk =

p

1.72 + (−4.7)2 + (−1.2)2 =



26.42 ≈ 5.14.

17. (a) u · v is a real number, so ku · vk is the norm of a number, which is not defined. (b) u · v is a scalar, while w is a vector. Thus u · v + w adds a scalar to a vector, which is not a defined operation. (c) u is a vector, while v · w is a scalar. Thus u · (v · w) is the dot product of a vector and a scalar, which is not defined. (d) c · (u + v) is the dot product of a scalar and a vector, which is not defined. 18. Let θ be the angle between u and v. Then √ 3 u·v 3 · (−1) + 0 · 1 2 p cos θ = =√ =− √ =− . 2 2 2 2 kuk kvk 2 3 2 3 + 0 (−1) + 1 Thus cos θ < 0 (in fact, θ =

3π 4 ),

so the angle between u and v is obtuse.

19. Let θ be the angle between u and v. Then cos θ = Thus cos θ > 0 (in fact, θ =

1 u·v 2 · 1 + (−1) · (−2) + 1 · (−1) p = . =p 2 2 2 2 2 2 kuk kvk 2 2 + (−1) + 1 1 + (−2) + 1 π 3 ),

so the angle between u and v is acute.

20. Let θ be the angle between u and v. Then cos θ =

0 4 · 1 + 3 · (−1) + (−1) · 1 u·v p = √ √ = 0. =p 2 2 2 2 2 2 kuk kvk 26 3 4 + 3 + (−1) 1 + (−1) + 1

Thus the angle between u and v is a right angle. 21. Let θ be the angle between u and v. Note that we can determine whether θ is acute, right, or obtuse u·v by examining the sign of kukkvk , which is determined by the sign of u · v. Since u · v = 0.9 · (−4.5) + 2.1 · 2.6 + 1.2 · (−0.8) = 0.45 > 0, we have cos θ > 0 so that θ is acute. 22. Let θ be the angle between u and v. Note that we can determine whether θ is acute, right, or obtuse u·v by examining the sign of kukkvk , which is determined by the sign of u · v. Since u · v = 1 · (−3) + 2 · 1 + 3 · 2 + 4 · (−2) = −3, we have cos θ < 0 so that θ is obtuse. 23. Since the components of both u and v are positive, it is clear that u · v > 0, so the angle between them is acute since it has a positive cosine.  √  √ ◦ 24. From Exercise 18, cos θ = − 22 , so that θ = cos−1 − 22 = 3π 4 = 135 . 25. From Exercise 19, cos θ = 12 , so that θ = cos−1 26. From Exercise 20, cos θ = 0, so that θ =

π 2

1 2

=

π 3

= 60◦ .

= 90◦ is a right angle.

1.2. LENGTH AND ANGLE: THE DOT PRODUCT

13

27. As in Example 1.21, we begin by calculating u · v and the norms of the two vectors: u · v = 0.9 · (−4.5) + 2.1 · 2.6 + 1.2 · (−0.8) = 0.45, p √ kuk = 0.92 + 2.12 + 1.22 = 6.66, p √ kvk = (−4.5)2 + 2.62 + (−0.8)2 = 27.65. So if θ is the angle between u and v, then cos θ =

0.45 u·v 0.45 √ ≈√ , =√ kuk kvk 6.66 27.65 182.817

so that −1

θ = cos



0.45 √ 182.817



≈ 1.5375 ≈ 88.09◦ .

Note that it is important to maintain as much precision as possible until the last step, or roundoff errors may build up. 28. As in Example 1.21, we begin by calculating u · v and the norms of the two vectors: u · v = 1 · (−3) + 2 · 1 + 3 · 2 + 4 · (−2) = −3, p √ kuk = 12 + 22 + 32 + 42 = 30, p √ kvk = (−3)2 + 12 + 22 + (−2)2 = 18. So if θ is the angle between u and v, then cos θ =

3 u·v 1 = −√ √ = − √ kuk kvk 30 18 2 15

so that

  1 θ = cos−1 − √ ≈ 1.7 ≈ 97.42◦ . 2 15

29. As in Example 1.21, we begin by calculating u · v and the norms of the two vectors: u · v = 1 · 5 + 2 · 6 + 3 · 7 + 4 · 8 = 70, p √ kuk = 12 + 22 + 32 + 42 = 30, p √ kvk = 52 + 62 + 72 + 82 = 174. So if θ is the angle between u and v, then u·v 70 35 cos θ = =√ √ = √ kuk kvk 30 174 3 145

so that

−1

θ = cos



35 √ 3 145



≈ 0.2502 ≈ 14.34◦ .

30. To show that 4ABC is right, we need only show that one pair of its sides meets at a right angle. So # » # » # » let u = AB, v = BC, and w = AC. Then we must show that one of u · v, u · w or v · w is zero in order to show that one of these pairs is orthogonal. Then # » # » u = AB = [1 − (−3), 0 − 2] = [4, −2] , v = BC = [4 − 1, 6 − 0] = [3, 6] , # » w = AC = [4 − (−3), 6 − 2] = [7, 4] , and u · v = 4 · 3 + (−2) · 6 = 12 − 12 = 0. # » # » Since this dot product is zero, these two vectors are orthogonal, so that AB ⊥ BC and thus 4ABC is a right triangle. It is unnecessary to test the remaining pairs of sides.

14

CHAPTER 1. VECTORS

31. To show that 4ABC is right, we need only show that one pair of its sides meets at a right angle. So # » # » # » let u = AB, v = BC, and w = AC. Then we must show that one of u · v, u · w or v · w is zero in order to show that one of these pairs is orthogonal. Then # » u = AB = [−3 − 1, 2 − 1, (−2) − (−1)] = [−4, 1, −1] , # » v = BC = [2 − (−3), 2 − 2, −4 − (−2)] = [5, 0, −2], # » w = AC = [2 − 1, 2 − 1, −4 − (−1)] = [1, 1, −3], and u · v = −4 · 5 + 1 · 0 − 1 · (−2) = −18 u · w = −4 · 1 + 1 · 1 − 1 · (−3) = 0. # » # » Since this dot product is zero, these two vectors are orthogonal, so that AB ⊥ AC and thus 4ABC is a right triangle. It is unnecessary to test the remaining pair of sides. 32. As in Example 1.22, the dimensions of the cube do not matter, so we work with a cube with side length 1. Since the cube is symmetric, we need only consider one diagonal and adjacent edge. Orient the cube as shown in Figure 1.34; take the diagonal to be [1, 1, 1] and the adjacent edge to be [1, 0, 0]. Then the angle θ between these two vectors satisfies   1 1·1+1·0+1·0 1 √ √ cos θ = =√ , so θ = cos−1 √ ≈ 54.74◦ . 3 1 3 3 Thus the diagonal and an adjacent edge meet at an angle of 54.74◦ . 33. As in Example 1.22, the dimensions of the cube do not matter, so we work with a cube with side length 1. Since the cube is symmetric, we need only consider one pair of diagonals. Orient the cube as shown in Figure 1.34; take the diagonals to be u = [1, 1, 1] and v = [1, 1, 0] − [0, 0, 1] = [1, 1, −1]. Then the dot product is u · v = 1 · 1 + 1 · 1 + 1 · (−1) = 1 + 1 − 1 = 1 6= 0. Since the dot product is nonzero, the diagonals are not orthogonal. 34. To show a parallelogram is a rhombus, it suffices to show that its diagonals are perpendicular (Euclid). But     1 2 d1 · d2 = 2 · −1 = 2 · 1 + 2 · (−1) + 0 · 3 = 0. 3 0 To determine its side length, note that since the diagonals are perpendicular, one half of each diagonal are the legs of a right triangle whose hypotenuse is one side of the rhombus. So we can use the Pythagorean Theorem. Since 2

2

kd1 k = 22 + 22 + 02 = 8,

kd2 k = 12 + (−1)2 + 32 = 11,

we have for the side length s2 =



kd1 k 2

2

 +

kd2 k 2

2 =

8 11 19 + = , 4 4 4



so that s =

19 2

≈ 2.18.

35. Since ABCD is a rectangle, opposite sides BA and CD are parallel and congruent. So we can use # » the method of Example 1.1 in Section 1.1 to find the coordinates of vertex D: we compute BA = # » # » [1 − 3, 2 − 6, 3 − (−2)] = [−2, −4, 5]. If BA is then translated to CD, where C = (0, 5, −4), then D = (0 + (−2), 5 + (−4), −4 + 5) = (−2, 1, 1).

1.2. LENGTH AND ANGLE: THE DOT PRODUCT

15

36. The resultant velocity of the airplane is the sum of the velocity of the airplane and the velocity of the wind:       200 0 200 r=p+w = + = . 0 −40 −40 37. Let the x direction be east, in the direction of the current, and the y direction be north, across the river. The speed of the boat is 4 mph north, and the current is 3 mph east, so the velocity of the boat is       0 3 3 v= + = . 4 0 4 38. Let the x direction be the direction across the river, and the y direction be downstream. Since vt = d, use the given information to find v, then solve for t and compute   d. Since the speed of the boat is 20 20 km/h and the speed of the current is 5 km/h, we have v = . The width of the river is 2 km, and  5 2 the distance downstream is unknown; call it y. Then d = . Thus y     20 2 vt = t= . 5 y Thus 20t = 2 so that t = 0.1, and then y = 5 · 0.1 = 0.5. Therefore (a) Ann lands 0.5 km, or half a kilometer, downstream; (b) It takes Ann 0.1 hours, or six minutes, to cross the river. Note that the river flow does not increase the time required to cross the river, since its velocity is perpendicular to the direction of travel. 39. We want to find the angle between Bert’s resultant vector, r, and his velocity vector upstream, v. Let the first coordinate of the vector be the direction across the river, and the second be the direction   x upstream. Bert’s velocity vector directly across the river is unknown, say u = . His velocity vector 0   0 upstream compensates for the downstream flow, so v = . So the resultant vector is r = u + v = 1       x 0 x + = . Since Bert’s speed is 2 mph, we have krk = 2. Thus 0 1 1 2

x2 + 1 = krk = 4,

so that

x=



3.

If θ is the angle between r and v, then √ r·v 3 cos θ = = , krk kvk 2

so that

−1

θ = cos

√ ! 3 = 60◦ . 2

40. We have       u·v (−1) · (−2) + 1 · 4 −1 −1 −3 proju v = u= =3 = . 1 1 3 u·u (−1) · (−1) + 1 · 1 A graph of the situation is (with proju v in gray, and the perpendicular from v to the projection also drawn)

16

CHAPTER 1. VECTORS 4

v

3

proju HvL

2

1

u

-3

-2

-1

41. We have u·v u= proju v = u·u

3 4 5 · 1 + −5 · 3 3 4 5 · 5 + −5 ·

 " 3# 2 5 = − 1 u = −u.  1 − 45 − 45

A graph of the situation is (with proju v in gray, and the perpendicular from v to the projection also drawn) 2

v

1

proju HvL

1

u

42. We have  proju v =

u·v u= u·u

1 2 1 2

·

1 2

+

·

2 − 14  − 14

1 2

· 2 − · (−2)    − 41 + − 12 − 12



1 2  1 − 4  − 12

 =

8 3



1 2  1 − 4  − 12

 =



4 3  2 − 3  . − 43

43. We have     3 1 1 2     3 u·v 1 · 2 + (−1) · (−3) + 1 · (−1) + (−1) · (−2) −1 6 −1  − 2  3 proju v = u=   =   =   = u. u·u 1 · 1 + (−1) · (−1) + 1 · 1 + (−1) · (−1)  1 4  1  32  2 − 32 −1 −1 

44. We have proju v =

      u·v 0.5 · 2.1 + 1.5 · 1.2 0.5 2.85 0.5 0.57 u= = = = 1.14u. 1.71 u·u 0.5 · 0.5 + 1.5 · 1.5 1.5 2.5 1.5

1.2. LENGTH AND ANGLE: THE DOT PRODUCT

17

45. We have   3.01 u·v 3.01 · 1.34 − 0.33 · 4.25 + 2.52 · (−1.66)  −0.33 proju v = u= u·u 3.01 · 3.01 − 0.33 · (−0.33) + 2.52 · 2.52 2.52     3.01 −0.301 1.5523  1 −0.33 ≈  0.033 ≈ − u. =− 15.5194 10 2.52 −0.252 # » 46. Let u = AB =

       # » 2−1 1 4−1 3 = and v = AC = = . 2 − (−1) 3 0 − (−1) 1



(a) To compute the projection, we need     1 3 u·v = · = 6, 3 1 Thus

u·u=

    1 1 · = 10. 3 3

" # " # 3 3 1 u·v u= = 59 , proju v = u·u 5 3 5

so that

" # " # " # 12 3 3 5 . v − proju v = − 59 = 4 1 − 5 5

Then kuk =

p

12

+

32

=



s kv − proju vk =

10,

12 5

2

√  2 4 4 10 + = , 5 5

so that finally

√ 1 1√ 4 10 A = kuk kv − proju vk = = 4. 10 · 2 2 5 √ √ √ (b) We already know u · v = 6 and kuk = 10 from part (a). Also, kvk = 32 + 12 = 10. So cos θ =

3 u·v 6 =√ √ = , kuk kvk 5 10 10

so that

s sin θ =

p

1 − cos2 θ =

1−

 2 4 3 = . 5 5

Thus

1 1√ √ 4 A = kuk kvk sin θ = 10 10 · = 4. 2 2 5         4−3 1 5−3 2 # » # » 47. Let u = AB = −2 − (−1) = −1 and v = AC = 0 − (−1) =  1. 6−4 2 2−4 −2 (a) To compute the projection, we need     1 2 u · v = −1 ·  1 = −3, 2 −2 Thus



   1 1 u · u = −1 · −1 = 6. 2 2 

   1 −1 3    21  u·v u = − −1 =  2  , proju v = u·u 6 2 −1

18

CHAPTER 1. VECTORS so that

     5 − 21 2 2    1   1 v − proju v =  1 −  2  =  2  −2 −1 −1 

Then kuk =

p

12 + (−1)2 + 22 =



6,

s  √  2 2 5 1 30 2 + + (−1) = kv − proju vk = , 2 2 2

so that finally

√ √ 1√ 30 3 5 1 kuk kv − proju vk = = . 6· 2 2 2 2 p √ (b) We already know u · v = −3 and kuk = 6 from part (a). Also, kvk = 22 + 12 + (−2)2 = 3. So √ u·v 6 −3 cos θ = = √ =− , kuk kvk 6 3 6 so that v u √ !2 √ u p 6 30 t 2 = . sin θ = 1 − cos θ = 1 − 6 6 A=

Thus

√ √ 1 1√ 3 5 30 kuk kvk sin θ = = . 6·3· 2 2 6 2 48. Two vectors u and v are orthogonal if and only if their dot product u · v = 0. So we set u · v = 0 and solve for k:     1 2 k+1 u·v = · = 0 ⇒ 2(k + 1) + 3(k − 1) = 0 ⇒ 5k − 1 = 0 ⇒ k = . 3 k−1 5 A=

Substituting into the formula for v gives " v=

1 5 1 5

# " # 6 +1 5 . = −1 − 45

As a check, we compute " # " # 6 2 5 = 12 − 12 = 0, · u·v = 5 5 − 45 3 and the vectors are indeed orthogonal. 49. Two vectors u and v are orthogonal if and only if their dot product u · v = 0. So we set u · v = 0 and solve for k:    2 1 k u · v = −1 ·  k  = 0 ⇒ k 2 − k − 6 = 0 ⇒ (k + 2)(k − 3) = 0 ⇒ k = 2, −3. 2 −3 Substituting into the formula for v gives     (−2)2 4 k = 2 : v1 =  −2  = −2 , −3 −3

   32 9 k = −3 : v2 =  3  =  3 . −3 −3 

As a check, we compute         1 4 1 9 u · v1 = −1 · −2 = 1 · 4 − 1 · (−2) + 2 · (−3) = 0, u · v2 = −1 ·  3 = 1 · 9 − 1 · 3 + 2 · (−3) = 0 2 −3 2 −3 and the vectors are indeed orthogonal.

1.2. LENGTH AND ANGLE: THE DOT PRODUCT

19

50. Two vectors u and v are orthogonal if and only if their dot product u · v = 0. So we set u · v = 0 and solve for y in terms of x:     3 x u·v = · = 0 ⇒ 3x + y = 0 ⇒ y = −3x. 1 y Substituting y = −3x back into the formula for v gives     x 1 v= =x . −3x −3 Thus any vector orthogonal to

u·v =

    3 1 is a multiple of . As a check, 1 −3     3 x · = 3x − 3x = 0 for any value of x, 1 −3x

so that the vectors are indeed orthogonal. 51. As noted in the remarks just prior     to Example 1.16, the zero vector   0 is orthogonal to all vectors in a x a 2 R . So if = 0, any vector will do. Now assume that 6= 0; that is, that either a or b is b y b nonzero. Two vectors u and v are orthogonal if and only if their dot product u · v = 0. So we set u · v = 0 and solve for y in terms of x:     a x u·v = · = 0 ⇒ ax + by = 0. b y First assume b 6= 0. Then y = − ab x, so substituting back into the expression for v we get # " # " # " 1 b x x =x = . v= b −a − ab − ab x Next, if b = 0, then a 6= 0, so that x = − ab y, and substituting back into the expression for v gives " # " # " # − ab y − ab b y . v= =y =− a −a y 1 So in either case, a vector orthogonal to

    a b , if it is not the zero vector, is a multiple of . As a b −a

check, note that     a rb · = rab − rab = 0 for all values of r. b −ra 52. (a) The geometry of the vectors in Figure 1.26 suggests that if ku + vk = kuk + kvk, then u and v point in the same direction. This means that the angle between them must be 0. So we first prove Lemma 1. For all vectors u and v in R2 or R3 , u · v = kuk kvk if and only if the vectors point in the same direction. Proof. Let θ be the angle between u and v. Then cos θ =

u·v , kuk kvk

so that cos θ = 1 if and only if u · v = kuk kvk. But cos θ = 1 if and only if θ = 0, which means that u and v point in the same direction.

20

CHAPTER 1. VECTORS We can now show Theorem 2. For all vectors u and v in R2 or R3 , ku + vk = kuk + kvk if and only if u and v point in the same direction. Proof. First assume that u and v point in the same direction. Then u · v = kuk kvk, and thus 2

ku + vk =u · u + 2u · v + v · v 2

By Example 1.9

2

2

= kuk + 2u · v + kvk 2

Since w · w = kwk for any vector w 2

= kuk + 2 kuk kvk + kvk

By the lemma

2

= (kuk + kvk) . Since ku + vk and kuk+kvk are both nonnegative, taking square roots gives ku + vk = kuk+kvk. For the other direction, if ku + vk = kuk + kvk, then their squares are equal, so that 2

2

2

(kuk + kvk) = kuk + 2 kuk kvk + kvk and 2

ku + vk = u · u + 2u · v + v · v 2

are equal. But kuk = u · u and similarly for v, so that canceling those terms gives 2u · v = 2 kuk kvk and thus u · v = kuk kvk. Using the lemma again shows that u and v point in the same direction. (b) The geometry of the vectors in Figure 1.26 suggests that if ku + vk = kuk − kvk, then u and v point in opposite directions. In addition, since ku + vk ≥ 0, we must also have kuk ≥ kvk. If they point in opposite directions, the angle between them must be π. This entire proof is exactly analogous to the proof in part (a). We first prove Lemma 3. For all vectors u and v in R2 or R3 , u · v = − kuk kvk if and only if the vectors point in opposite directions. Proof. Let θ be the angle between u and v. Then cos θ =

u·v , kuk kvk

so that cos θ = −1 if and only if u · v = − kuk kvk. But cos θ = −1 if and only if θ = π, which means that u and v point in opposite directions. We can now show Theorem 4. For all vectors u and v in R2 or R3 , ku + vk = kuk − kvk if and only if u and v point in opposite directions and kuk ≥ kvk. Proof. First assume that u and v point in opposite directions and kuk ≥ kvk. Then u · v = − kuk kvk, and thus 2

ku + vk =u · u + 2u · v + v · v 2

By Example 1.9

2

2

= kuk + 2u · v + kvk 2

Since w · w = kwk for any vector w 2

= kuk − 2 kuk kvk + kvk

By the lemma

2

= (kuk − kvk) . Now, since kuk ≥ kvk by assumption, we see that both ku + vk and kuk − kvk are nonnegative, so that taking square roots gives ku + vk = kuk − kvk. For the other direction, if ku + vk =

1.2. LENGTH AND ANGLE: THE DOT PRODUCT

21

kuk − kvk, then first of all, since the left-hand side is nonnegative, the right-hand side must be as well, so that kuk ≥ kvk. Next, we can square both sides of the equality, so that 2

2

2

(kuk − kvk) = kuk − 2 kuk kvk + kvk and 2

ku + vk = u · u + 2u · v + v · v 2

are equal. But kuk = u · u and similarly for v, so that canceling those terms gives 2u · v = −2 kuk kvk and thus u · v = − kuk kvk. Using the lemma again shows that u and v point in opposite directions. 53. Prove Theorem 1.2(b) by applying the definition of the dot product: u · v = u1 (v1 + w1 ) + u2 (v2 + w2 ) + · · · + un (vn + wn ) = u1 v1 + u1 w1 + u2 v2 + u2 w2 + · · · + un vn + un wn = (u1 v1 + u2 v2 + · · · + un vn ) + (u1 w1 + u2 w2 + · · · + un wn ) = u · v + u · w. 54. Prove the three parts of Theorem 1.2(d) by applying the definition of the dot product and various properties of real numbers: Part 1: For any vector u, we must show u · u ≥ 0. But u · u = u1 u1 + u2 u2 + · · · + un un = u21 + u22 + · · · + u2n . Since for any real number x we know that x2 ≥ 0, it follows that this sum is also nonnegative, so that u · u ≥ 0. Part 2: We must show that if u = 0 then u · u = 0. But u = 0 means that ui = 0 for all i, so that u · u = 0 · 0 + 0 · 0 + · · · + 0 · 0 = 0. Part 3: We must show that if u · u = 0, then u = 0. From part 1, we know that u · u = u21 + u22 + · · · + u2n , and that u2i ≥ 0 for all i. So if the dot product is to be zero, each u2i must be zero, which means that ui = 0 for all i and thus u = 0. 55. We must show d(u, v) = ku − vk = kv − uk = d(v, u). By definition, d(u, v) = ku − vk. Then by Theorem 1.3(b) with c = −1, we have k−wk = kwk for any vector w; applying this to the vector u − v gives ku − vk = k−(u − v)k = kv − uk , which is by definition equal to d(v, u). 56. We must show that for any vectors u, v and w that d(u, w) ≤ d(u, v) + d(v, w). This is equivalent to showing that ku − wk ≤ ku − vk + kv − wk. Now substitute u − v for x and v − w for y in Theorem 1.5, giving ku − wk = k(u − v) + (v − w)k ≤ ku − vk + kv − wk . 57. We must show that d(u, v) = ku − vk = 0 if and only if u = v. This follows immediately from Theorem 1.3(a), kwk = 0 if and only if w = 0, upon setting w = u − v. 58. Apply the definitions: u · cv = [u1 , u2 , . . . , un ] · [cv1 , cv2 , . . . , cvn ] = u1 cv1 + u2 cv2 + · · · + un cvn = cu1 v1 + cu2 v2 + · · · + cun vn = c(u1 v1 + u2 v2 + · · · + un vn ) = c(u · v).

22

CHAPTER 1. VECTORS

59. We want to show that ku − vk ≥ kuk − kvk. This is equivalent to showing that kuk ≤ ku − vk + kvk. This follows immediately upon setting x = u − v and y = v in Theorem 1.5. 60. If u · v = u · w, it does not follow that v = w. For example, since 0 · v = 0 for every vector v in Rn , the zero vector is orthogonal to every vector v. So if u = 0 in the above equality, we know nothing about v and w. (as an example, 0 · [1, 2] = 0 · [−17, 12]). Note, however, that u · v = u · w implies that u · v − u · w = u(v − w) = 0, so that u is orthogonal to v − w. 2

2

61. We must show that (u + v)(u − v) = kuk − kvk for all vectors in Rn . Recall that for any w in Rn 2 that w · w = kwk , and also that u · v = v · u. Then 2

2

2

2

(u + v)(u − v) = u · u + v · u − u · v − v · v = kuk + u · v − u · v − kvk = kuk − kvk . 62. (a) Let u, v ∈ Rn . Then 2

2

ku + vk + ku − vk = (u + v) · (u + v) + (u − v) · (u − v) = (u · u + v · v + 2u · v) + (u · u + v · v − 2u · v)     2 2 2 2 = kuk + kvk + 2u · v + kuk + kvk − 2u · v 2

2

= 2 kuk + 2 kvk . (b) Part (a) tells us that the sum of the squares of the lengths of the diagonals of a parallelogram is equal to the sum of the squares of the lengths of its four sides.

u+v

u-v

u

v

63. Let u, v ∈ Rn . Then 1 1 1 2 2 ku + vk − ku − vk = [(u + v) · (u + v) − ((u − v) · (u − v)] 4 4 4 1 = [(u · u + v · v + 2u · v) − (u · u + v · v − 2u · v)] 4    i 1 h 2 2 2 2 = kuk − kuk + kvk − kvk + 4u · v 4 = u · v. 64. (a) Let u, v ∈ Rn . Then using the previous exercise, 2

2

2

2

ku + vk = ku − vk ⇔ ku + vk = ku − vk

⇔ ku + vk − ku − vk = 0 1 1 2 2 ⇔ ku + vk − ku − vk = 0 4 4 ⇔u·v =0 ⇔ u and v are orthogonal.

1.2. LENGTH AND ANGLE: THE DOT PRODUCT

23

(b) Part (a) tells us that a parallelogram is a rectangle if and only if the lengths of its diagonals are equal.

u+v

u-v

u v

2

2

2

2

65. (a) By Exercise 55, (u+v)·(u−v) = kuk −kvk . Thus (u+v)·(u−v) = 0 if and only if kuk = kvk . It follows immediately that u + v and u − v are orthogonal if and only if kuk = kvk. (b) Part (a) tells us that the diagonals of a parallelogram are perpendicular if and u+v only if the lengths of its sides are equal, i.e., if and only if it is a rhombus. u-v u

v

2

2

2

2

66. From Example 1.9 and the fact that w · w = kwk q, we have ku + vk = kuk + 2u · v + kvk . Taking 2

2

the square root of both sides yields ku + vk = kuk + 2u · v + kvk . Now substitute in the given √ values kuk = 2, kvk = 3, and u · v = 1, giving r √ 2 √ √ ku + vk = 22 + 2 · 1 + 3 = 4 + 2 + 3 = 9 = 3.

67. From Theorem 1.4 (the Cauchy-Schwarcz inequality), we have |u · v| ≤ kuk kvk. If kuk = 1 and kvk = 2, then |u · v| ≤ 2, so we cannot have u · v = 3. 68. (a) If u is orthogonal to both v and w, then u · v = u · w = 0. Then u · (v + w) = u · v + u · w = 0 + 0 = 0, so that u is orthogonal to v + w. (b) If u is orthogonal to both v and w, then u · v = u · w = 0. Then u · (sv + tw) = u · (sv) + u · (tw) = s(u · v) + t(u · w) = s · 0 + t · 0 = 0, so that u is orthogonal to sv + tw. 69. We have  u · v  u · (v − proju v) = u · v − u u ·u  u ·v =u·v−u· u u ·  u · v u =u·v− (u · u) u·u =u·v−u·v = 0.

24

CHAPTER 1. VECTORS

70. (a) proju (proju v) = proju

u·v u·u u



=

u·v u·u

proju u =

u·v u·u u

= proju v.

(b) Using part (a),  u · v u · v  u · v u− proj u proju (v − proju v) = proju v − u = u·u u·u u·u  u u · v u·v = u− u = 0. u·u u·u (c) From the diagram, we see that proju vku, so that proju (proju v) = proju v. Also, (v − proju v) ⊥ u, so that proju (v − proju v) = 0.

u

proju HvL

v

v - proju HvL

-proju HvL

71. (a) We have (u21 + u22 )(v12 + v22 ) − (u1 v1 + u2 v2 )2 = u21 v12 + u21 v22 + u22 v12 + u22 v22 − u21 v12 − 2u1 v1 u2 v2 − u22 v22 = u21 v22 + u22 v12 − 2u1 u2 v1 v2 = (u1 v2 − u2 v1 )2 . But the final expression is nonnegative since it is a square. Thus the original expression is as well, showing that (u21 + u22 )(v12 + v22 ) − (u1 v1 + u2 v2 )2 ≥ 0. (b) We have (u21 + u22 + u23 )(v12 + v22 + v32 ) − (u1 v1 + u2 v2 + u3 v3 )2 = u21 v12 + u21 v22 + u21 v32 + u22 v12 + u22 v22 + u22 v32 + u23 v12 + u23 v22 + u23 v32 − u21 v12 − 2u1 v1 u2 v2 − u22 v22 − 2u1 v1 u3 v3 − u23 v32 − 2u2 v2 u3 v3 = u21 v22 + u21 v32 + u22 v12 + u22 v32 + u23 v12 + u23 v22 − 2u1 u2 v1 v2 − 2u1 v1 u3 v3 − 2u2 v2 u3 v3 = (u1 v2 − u2 v1 )2 + (u1 v3 − u3 v1 )2 + (u3 v2 − u2 v3 )2 . But the final expression is nonnegative since it is the sum of three squares. Thus the original expression is as well, showing that (u21 + u22 + u23 )(v12 + v22 + v32 ) − (u1 v1 + u2 v2 + u3 v3 )2 ≥ 0. 72. (a) Since proju v =

u·v u·u u,

we have  u·v u·v  u· v− u u·u u·u  u·v u·v = u·v− u·u u·u u·u u·v = (u · v − u · v) u·u = 0,

proju v · (v − proju v) =

Exploration: Vectors and Geometry

25

so that proju v is orthogonal to v − proju v. Since their vector sum is v, those three vectors form a right triangle with hypotenuse v, so by Pythagoras’ Theorem, 2

2

2

2

kproju vk ≤ kproju vk + kv − proju vk = kvk . Since norms are always nonnegative, taking square roots gives kproju vk ≤ kvk. (b)

 u · v 

u ≤ kvk kproju vk ≤ kvk ⇐⇒ u · u

!

u·v

u ⇐⇒

≤ kvk

kuk2

u·v ⇐⇒ kuk ≤ kvk kuk2 |u · v| ≤ kvk kuk ⇐⇒ |u · v| ≤ kuk kvk , ⇐⇒

which is the Cauchy-Schwarcz inequality. 73. Suppose proju v = cu. From the figure, we see that cos θ = two expressions are equal, i.e.,

ckuk kvk .

But also cos θ =

u·v kukkvk .

Thus these

c kuk u·v u·v u·v u·v = ⇒ c kuk = ⇒c= = . kvk kuk kvk kuk kuk kuk u·u 74. The basis for induction is the cases n = 1 and n = 2. The n = 1 case is the assertion that kv1 k ≤ kv2 k, which is obviously true. The n = 2 case is the Triangle Inequality, which is also true. Now assume the statement holds for n = k ≥ 2; that is, for any v1 , v2 , . . . , vk , kv1 + v2 + · · · + vk k ≤ kv1 k + kv2 k + · · · + kvk k . Let v1 , v2 , . . . , vk , vk+1 be any vectors. Then kv1 + v2 + · · · + vk + vk+1 k = kv1 + v2 + · · · + (vk + vk+1 )k ≤ kv1 k + kv2 k + · · · + kvk + vk+1 k using the inductive hypothesis. But then using the Triangle Inequality (or the case n = 2 in this theorem), kvk + vk+1 k ≤ kvk k + kvk+1 k. Substituting into the above gives kv1 + v2 + · · · + vk + vk+1 k ≤ kv1 k + kv2 k + · · · + kvk + vk+1 k ≤ kv1 k + kv2 k + · · · + kvk k + kvk+1 k , which is what we were trying to prove.

Exploration: Vectors and Geometry # » # » # » 1. As in Example 1.25, let p = OP . Then p − a = AP = 31 AB = 13 (b − a), so that p = a + 13 (b − a) = # » # » 1 1 3 (2a + b). More generally, if P is the point n of the way from A to B along AB, then p − a = AP = 1# » 1 1 1 n AB = n (b − a), so that p = a + n (b − a) = n ((n − 1)a + b). # » 2. Use the notation that the vector OX is written x. Then from exercise 1, we have p = 21 (a + c) and q = 12 (b + c), so that 1 1 1 1# » # » P Q = q − p = (b + c) − (a + c) = (b − a) = AB. 2 2 2 2

26

CHAPTER 1. VECTORS # » # » # » # » # » 3. Draw AC. Then from exercise 2, we have P Q = 12 AB = SR. Also draw BD. Again from exercise 2, # » # » # » we have P S = 12 BD = QR. Thus opposite sides of the quadrilateral P QRS are equal. They are also parallel: indeed, 4BP Q and 4BAC are similar, since they share an angle and BP : BA = BQ : BC. Thus ∠BP Q = ∠BAC; since these angles are equal, P QkAC. Similarly, SRkAC so that P QkSR. In a like manner, we see that P SkRQ. Thus P QRS is a parallelogram. 4. Following the hint, we find m, the point that is two-thirds of the distance from A to P . From exercise 1, we have   1 1 1 1 1 p = (b + c), so that m = (2p + a) = 2 · (b + c) + a = (a + b + c). 2 3 3 2 3 Next we find m0 , the point that is two-thirds of the distance from B to Q. Again from exercise 1, we have   1 1 1 1 1 0 q = (a + c), so that m = (2q + b) = 2 · (a + c) + b = (a + b + c). 2 3 3 2 3 Finally we find m00 , the point that is two-thirds of the distance from C to R. Again from exercise 1, we have   1 1 1 1 1 00 2 · (a + b) + c = (a + b + c). r = (a + b), so that m = (2r + c) = 2 3 3 2 3 Since m = m0 = m00 , all three medians intersect at the centroid, G. # » # » # » # » 5. With notation as in the figure, we know that AH is orthogonal to BC; that is, AH · BC = 0. Also # » # » # » # » # » # » BH is orthogonal to AC; that is, BH · AC = 0. We must show that CH · AB = 0. But # » # » AH · BC = 0 ⇒ (h − a) · (b − c) = 0 ⇒ h · b − h · c − a · b + a · c = 0 # » # » BH · AC = 0 ⇒ (h − b) · (c − a) = 0 ⇒ h · c − h · a − b · c + b · a = 0. Adding these two equations together and canceling like terms gives # » # » 0 = h · b − h · a − c · b + a · c = (h − c) · (b − a) = CH · AB, so that these two are orthogonal. Thus all the altitudes intersect at the orthocenter H. # » # » # » # » 6. We are given that QK is orthogonal to AC and that P K is orthogonal to CB, and must show that # » # » RK is orthogonal to AB. By exercise 1, we have q = 21 (a + c), p = 12 (b + c), and r = 12 (a + b). Thus   1 # » # » QK · AC = 0 ⇒ (k − q) · (c − a) = 0 ⇒ k − (a + c) · (c − a) = 0 2   1 # » # » P K · CB = 0 ⇒ (k − p) · (b − c) = 0 ⇒ k − (b + c) · (b − c) = 0. 2 Expanding the two dot products gives 1 1 1 1 k·c−k·a− a·c+ a·a− c·c+ a·c=0 2 2 2 2 1 1 1 1 k · b − k · c − b · b + b · c − c · b + c · c = 0. 2 2 2 2 Add these two together and cancel like terms to get   1 1 1 # » # » 0 = k · b − k · a − b · b + a · a = k − (b + a) · (b − a) = (k − r) · (b − a) = RK · AB. 2 2 2 # » # » Thus RK and AB are indeed orthogonal, so all the perpendicular bisectors intersect at the circumcenter.

1.3. LINES AND PLANES

27 2

2

7. Let O, the center of the circle, be the origin. Then b = −a and kak = kck = r2 where r is the radius # » # » of the circle. We want to show that AC is orthogonal to BC. But # » # » AC · BC = (c − a) · (c − b) = (c − a) · (c + a) 2

2

= kck + c · a − kak − a · c = (a · c − c · a) + (r2 − r2 ) = 0. Thus the two are orthogonal, so that ∠ACB is a right angle. 8. As in exercise 5, we first find m, the point that is halfway from P to R. We have p = 21 (a + b) and r = 12 (c + d), so that   1 1 1 1 1 (a + b) + (c + d) = (a + b + c + d). m = (p + r) = 2 2 2 2 4 Similarly, we find m0 , the point that is halfway from Q to S. We have q = 21 (b + c) and s = 12 (a + d), so that   1 1 1 1 1 0 m = (q + s) = (b + c) + (a + d) = (a + b + c + d). 2 2 2 2 4 # » # » Thus m = m0 , so that P R and QS intersect at their mutual midpoints; thus, they bisect each other.

1.3

Lines and Planes

     3 0 1. (a) The normal form is n · (x − p) = 0, or · x− = 0. 2 0   x (b) Letting x = , we get y           3 x 0 3 x · − = · = 3x + 2y = 0. 2 y 0 2 y The general form is 3x + 2y = 0. 

    3 1 2. (a) The normal form is n · (x − p) = 0, or · x− = 0. −4 2   x (b) Letting x = , we get y           3 x 1 3 x−1 · − = · = 3(x − 1) − 4(y − 2) = 0. −4 y 2 −4 y−2

3. (a) (b)

4. (a) (b)

Expanding and simplifying gives the general form 3x − 4y = −5.     1 −1 In vector form, the equation of the line is x = p + td, or x = +t . 0 3       x x 1−t Letting x = and expanding the vector form from part (a) gives = , which yields y y 3t the parametric form x = 1 − t, y = 3t.     −4 1 In vector form, the equation of the line is x = p + td, or x = +t . 4 1   x Letting x = and expanding the vector form from part (a) gives the parametric form x = −4+t, y y = 4 + t.

28

CHAPTER 1. VECTORS

5. (a)

(b)

6. (a)

(b)

7. (a)

(b)

    0 1 In vector form, the equation of the line is x = p + td, or x = 0 + t −1. 0 4   x Letting x = y  and expanding the vector form from part (a) gives the parametric form x = t, z y = −t, z = 4t.     3 2 In vector form, the equation of the line is x = p + td, or x =  0 + t 5. −2 0       x x 3 + 2t Letting x = y  and expanding the vector form from part (a) gives y  =  5t , which z z −2 yields the parametric form x = 3 + 2t, y = 5t, z = −2.      3 0 The normal form is n · (x − p) = 0, or 2 · x − 1 = 0. 1 0   x Letting x = y , we get z           3 x 0 3 x 2 · y  − 1 = 2 · y − 1 = 3x + 2(y − 1) + z = 0. 1 z 0 1 z

Expanding and simplifying gives the general form 3x + 2y + z = 2.      3 2 8. (a) The normal form is n · (x − p) = 0, or 5 · x −  0 = 0. −2 0   x (b) Letting x = y , we get z           x 3 2 x−3 2 5 · y  −  0 = 5 ·  y  = 2(x − 3) + 5y = 0. 0 z −2 0 z+2 Expanding and simplifying gives the general form 2x + 5y = 6. 9. (a) In vector form, the equation of the line is x = p + su + tv, or       0 2 −3 x = 0 + s 1 + t  2 . 0 2 1   x (b) Letting x = y  and expanding the vector form from part (a) gives z     x 2s − 3t y  =  s + 2t  z 2s + t which yields the parametric form the parametric form x = 2s − 3t, y = s + 2t, z = 2s + t.

1.3. LINES AND PLANES

29

10. (a) In vector form, the equation of the line is x = p + su + tv, or       6 0 −1 x = −4 + s 1 + t  1 . −3 1 1   x (b) Letting x = y  and expanding the vector form from part (a) gives the parametric form x = 6−t, z y = −4 + s + t, z = −3 + s + t. 11. Any pair of points on ` determine a direction vector, so we use P and Q. We choose P to represent # » the point on the line. Then a direction vector forthe line is  d  = P Q = (3, 0) − (1, −2) = (2, 2). The 1 2 vector equation for the line is x = p + td, or x = +t . −2 2 12. Any pair of points on ` determine a direction vector, so we use P and Q. We choose P to represent the # » point on the line. Then a direction vector for the line is d=  P Q= (−2,  1, 3) − (0, 1, −1) = (−2, 0, 4). 0 −2 The vector equation for the line is x = p + td, or x =  1 + t  0. −1 4 13. We must find two direction vectors, u and v. Since P , Q, and R lie in a plane, we compute We get two direction vectors # » u = P Q = q − p = (4, 0, 2) − (1, 1, 1) = (3, −1, 1) # » v = P R = r − p = (0, 1, −1) − (1, 1, 1) = (−1, 0, −2). Since u and v are not scalar multiples of each other, they will serve as direction vectors (if they were parallel to each other, wewould have   not aplane  but a line). So the vector equation for the plane is 3 −1 1 x = p + su + tv, or x = 1 + s −1 + t  0. 1 −2 1 14. We must find two direction vectors, u and v. Since P , Q, and R lie in a plane, we compute We get two direction vectors # » u = P Q = q − p = (1, 0, 1) − (1, 1, 0) = (0, −1, 1) # » v = P R = r − p = (0, 1, 1) − (1, 1, 0) = (−1, 0, 1). Since u and v are not scalar multiples of each other, they will serve as direction vectors (if they were parallel to each other, wewould have   not aplane  but a line). So the vector equation for the plane is 1 0 −1 x = p + su + tv, or x = 1 + s −1 + t  0. 0 1 1 15. The parametric and associated vector forms x = p + td found below are not unique. (a) As in the remarks prior to Example 1.20, we start by letting x = t. Substituting x = t into y = 3x − 1 gives Sowe get parametric equations x = t, y = 3t − 1, and corresponding  y=  3t − 1.  x 0 1 vector form = +t . y −1 3 (b) In this case since the coefficient of y is 2, we start by letting x = 2t. Substituting x = 2t into 3x + 2y = 5 gives 3 · 2t + 2y = 5, which gives y = −3t + 25 . So we get parametric equations x = 2t, y = 52 − 3t, with corresponding vector equation " # " # " # x 0 2 = 5 +t . y −3 2

30

CHAPTER 1. VECTORS Note that theequation was of the form ax + by = c with a = 3, b = 2, and that a direction vector  b was given by . This is true in general. −a

16. Note that x = p + t(q − p) is the line that passes through p (when t = 0) and q (when t = 1). We write d = q − p; this is a direction vector for the line through p and q. (a) As noted above, the line p + td passes through P at t = 0 and through Q at t = 1. So as t varies from 0 to 1, the line describes the line segment P Q. (b) As shown in Exploration: Vectors and Geometry, to find the midpoint of P Q, we start at P # » and travel half the length of P Q in the direction of the vector P Q = q − p. That is, the midpoint of P Q is the head of the vector p + 12 (q − p). Since x = p + t(q − p), we see that this line passes through the midpoint at t = 21 , and that the midpoint is in fact p + 21 (q − p) = 21 (p + q). (c) From part (b), the midpoint is (d) From part (b), the midpoint is

1 2 1 2

([2, −3] + [0, 1]) = 12 [2, −2] = [1, −1].   ([1, 0, 1] + [4, 1, −2]) = 12 [5, 1, −1] = 52 , 12 , − 12 .

(e) Again from Exploration: Vectors and Geometry, the vector whose head is 31 of the way from P to Q along P Q is x1 = 13 (2p + q). Similarly, the vector whose head is 32 of the way from P to Q along P Q is also the vector one third of the way from Q to P along QP ; applying the same formula gives for this point x2 = 31 (2q + p). When p = [2, −3] and q = [0, 1], we get 1 x1 = (2[2, −3] + [0, 1]) = 3 1 x2 = (2[0, 1] + [2, −3]) = 3

  1 4 5 [4, −5] = ,− 3 3 3   1 2 1 [2, −1] = ,− . 3 3 3

(f ) Using the formulas from part (e) with p = [1, 0, −1] and q = [4, 1, −2] gives 1 (2[1, 0, −1] + [4, 1, −2]) = 3 1 x2 = (2[4, 1, −2] + [1, 0, −1]) = 3 x1 =

 1 [6, 1, −4] = 2, 3  1 [9, 2, −5] = 3, 3

 1 4 ,− 3 3  2 5 ,− . 3 3

17. A line `1 with slope m1 has equation y = m1 x + b1 , or −m1 x + y = b1 . Similarly, a line `2 withslope −m1 m2 has equation y = m2 x + b2 , or −m2 x + y = b2 . Thus the normal vector for `1 is n1 = , and 1   −m2 the normal vector for `2 is n2 = . Now, `1 and `2 are perpendicular if and only if their normal 1 vectors are perpendicular, i.e., if and only if n1 · n2 = 0. But     −m1 −m2 n1 · n2 = · = m1 m2 + 1, 1 1 so that the normal vectors are perpendicular if and only if m1 m2 +1 = 0, i.e., if and only if m1 m2 = −1. 18. Suppose the line ` has direction vector d, and the plane P has normal vector n. Then if d · n = 0 (d and n are orthogonal), then the line ` is parallel to the plane P. If on the other hand d and n are parallel, so that d = n, then ` is perpendicular to P.   2 (a) Since the general form of P is 2x + 3y − z = 1, its normal vector is n =  3. Since d = 1n, we −1 see that ` is perpendicular to P.

1.3. LINES AND PLANES

31 

 4 (b) Since the general form of P is 4x − y + 5z = 0, its normal vector is n = −1. Since 5     2 4 d · n =  3 · −1 = 2 · 4 + 3 · (−1) − 1 · 5 = 0, −1 5 ` is parallel to P. 

 1 (c) Since the general form of P is x − y − z = 3, its normal vector is n = −1. Since −1     2 1 d · n =  3 · −1 = 2 · 1 + 3 · (−1) − 1 · (−1) = 0, −1 −1 ` is parallel to P. 

 4 (d) Since the general form of P is 4x + 6y − 2z = 0, its normal vector is n =  6. Since −2     4 2 1 1 d =  3 =  6 = n, 2 2 −1 −2 ` is perpendicular to P. 19. Suppose the plane P1 has normal vector n1 , and the plane P has normal vector n. Then if n1 · n = 0 (n1 and n are orthogonal), then P1 is perpendicular to P. If on the other hand n1 and n are parallel, so that n1 = cn, thenP1is parallel to P. Note that in this exercise, P1 has the equation 4 4x − y + 5z = 2, so that n1 = −1. 5   2 (a) Since the general form of P is 2x + 3y − z = 1, its normal vector is n =  3. Since −1     4 2 n1 · n = −1 ·  3 = 4 · 2 − 1 · 3 + 5 · (−1) = 0, −1 5 the normal vectors are perpendicular, and thus P1 is perpendicular to P. 

 4 (b) Since the general form of P is 4x − y + 5z = 0, its normal vector is n = −1. Since n1 = n, 5 P1 is parallel to P.   1 (c) Since the general form of P is x − y − z = 3, its normal vector is n = −1. Since −1     4 1 n1 · n = −1 · −1 = 4 · 1 − 1 · (−1) + 5 · (−1) = 0, 5 −1 the normal vectors are perpendicular, and thus P1 is perpendicular to P.

32

CHAPTER 1. VECTORS 

 4 (d) Since the general form of P is 4x + 6y − 2z = 0, its normal vector is n =  6. Since −2 

   4 4 n1 · n = −1 ·  6 = 4 · 4 − 1 · 6 + 5 · (−2) = 0, 5 −2 the normal vectors are perpendicular, and thus P1 is perpendicular to P. 20. Since the vector form is x = p + td, we use the given information to determine p and d. The general   2 equation of the given line is 2x − 3y = 1, so its normal vector is n = . Our line is perpendicular −3   2 to that line, so it has direction vector d = n = . Furthermore, since our line passes through the −3   2 point P = (2, −1), we have p = . Thus the vector form of the line perpendicular to 2x − 3y = 1 −1 through the point P = (2, −1) is       x 2 2 = +t . y −1 −3 21. Since the vector form is x = p + td, we use the given information todetermine p and d. The general  2 equation of the given line is 2x − 3y = 1, so its normal vector is n = . Our line is parallel to that −3   3 (note that d · n = 0). Since our line passes through the point line, so it has direction vector d = 2   2 P = (2, −1), we have p = , so that the vector equation of the line parallel to 2x − 3y = 1 through −1 the point P = (2, −1) is       x 2 3 = +t . y −1 2 22. Since the vector form is x = p + td, we use the given information to determine p and d. A line is perpendicular to a plane if its direction vector d is the normal vector n of  the  plane. The general 1 equation of the given plane is x − 3y + 2z = 5, so its normal vector is n = −3. Thus the direction 2   1 vector of our line is d = −3. Furthermore, since our line passes through the point P = (−1, 0, 3), we 2   −1 have p =  0. So the vector form of the line perpendicular to x − 3y + 2z = 5 through P = (−1, 0, 3) 3 is       x −1 1 y  =  0 + t −3 . z 3 2 23. Since the vector form is x = p + td, we use the given information to determine p and d. Since the given line has parametric equations       x 1 −1 x = 1 − t, y = 2 + 3t, z = −2 − t, it has vector form y  =  2 + t  3 . z −2 −1

1.3. LINES AND PLANES

33

  −1 So its direction vector is  3, and this must be the direction vector d of the line we want, which is −1   −1 parallel to the given line. Since our line passes through the point P = (−1, 0, 3), we have p =  0. 3 So the vector form of the line parallel to the given line through P = (−1, 0, 3) is       −1 −1 x y  =  0 + t  3 . z 3 −1 24. Since the normal form is n · x = n · p, we use the given information to determine n and p. Note that a plane is parallel to a given plane if their normal vectors  are  equal. Since the general form of the 6 given plane is 6x − y + 2z = 3, its normal vector is n = −1, so this must be a normal vector of the 2 desired plane as well. Furthermore, since our plane passes through the point P = (0, −2, 5), we have   0 p = −2. So the normal form of the plane parallel to 6x − y + 2z = 3 through (0, −2, 5) is 5             x 6 0 6 x 6 −1 · y  = −1 · −2 or −1 · y  = 12. z 2 5 2 z 2 25. Using Figure 1.34 in Section 1.2 for reference, we will find a normal vector n and a point vector p for each of the sides, then substitute into n · x = n · p to get an equation for each plane. (a) Start with P 1 determined by the face of the cube in the xy-plane. Clearly a normal vector for 1 P1 is n = 0, or any vector parallel to the x-axis. Also, the plane passes through P = (0, 0, 0), 0  0 so we set p = 0. Then substituting gives 0         1 x 1 0 0 · y  = 0 · 0 or x = 0. 0 z 0 0 So the general equation for P1 is x = 0. Applying the same argument above to the plane P2 determined by the face in the xz-plane gives a general equation of y = 0, and similarly the plane P3 determined by the face in the xy-plane gives a general equation of z = 0. Now consider P4 , the plane containing the face parallel to the face  in  the yz-plane but passing 1 through (1, 1, 1). Since P4 is parallel to P1 , its normal vector is also 0; since it passes through 0   1 (1, 1, 1), we set p = 1. Then substituting gives 1         1 x 1 1 0 · y  = 0 · 1 or x = 1. 0 z 0 1 So the general equation for P4 is x = 1. Similarly, the general equations for P5 and P6 are y = 1 and z = 1.

34

CHAPTER 1. VECTORS   x (b) Let n = y  be a normal vector for the desired plane P. Since P is perpendicular to the z xy-plane, their normal vectors must be orthogonal. Thus     x 0 y  · 0 = x · 0 + y · 0 + z · 1 = z = 0. z 1   x Thus z = 0, so the normal vector is of the form n = y . But the normal vector is also 0 perpendicular to the plane in question, by definition. Since that plane contains both the origin and (1, 1, 1), the normal vector is orthogonal to (1, 1, 1) − (0, 0, 0):     x 1 y  · 1 = x · 1 + y · 1 + z · 0 = x + y = 0. 0 1   x Thus x + y = 0, so that y = −x. So finally, a normal vector to P is given by n = −x for 0   1 any nonzero x. We may as well choose x = 1, giving n = −1. Since the plane passes through 0 (0, 0, 0), we let p = 0. Then substituting in n · x = n · p gives         0 1 x 1 −1 · y  = −1 · 0 , or x − y = 0. 0 0 z 0 Thus the general equation for the plane perpendicular to the xy-plane and containing the diagonal from the origin to (1, 1, 1) is x − y = 0. (c) As in Example 1.22 (Figure 1.34)   in Section 1.2, use u = [0, 1, 1] and v = [1, 0, 1] as two vectors x in the required plane. If n = y  is a normal vector to the plane, then n · u = 0 = n · v: z         x 0 x 1 n · u = y  · 1 = y + z = 0 ⇒ y = −z, n · v = y  · 0 = x + z = 0 ⇒ x = −z. z 1 z 1     −z 1 Thus the normal vector is of the form n = −z  for any z. Taking z = −1 gives n =  1. z −1 Now, the side diagonals pass through (0, 0, 0), so set p = 0. Then n · x = n · p yields         1 x 1 0  1 · y  =  1 · 0 , or x + y − z = 0. −1 z −1 0 The general equation for the plane containing the side diagonals is x + y − z = 0.

26. Finding the distance between points A and B is equivalent to finding d(a, b), where a is the vector from the origin to A, and similarly for b. Given x = [x, y, z], p = [1, 0, −2], and q = [5, 2, 4], we want to solve d(x, p) = d(x, q); that is, p p d(x, p) = (x − 1)2 + (y − 0)2 + (z + 2)2 = (x − 5)2 + (y − 2)2 + (z − 4)2 = d(x, q).

1.3. LINES AND PLANES

35

Squaring both sides gives (x − 1)2 + (y − 0)2 + (z + 2)2 = (x − 5)2 + (y − 2)2 + (z − 4)2 2

2

2

2

2



2

x − 2x + 1 + y + z + 4z + 4 = x − 10x + 25 + y − 4y + 4 + z − 8z + 16 8x + 4y + 12z = 40





2x + y + 3z = 10. Thus all such points (x, y, z) lie on the plane 2x + y + 3z = 10. 27. To calculate d(Q, `) =

|ax0 +by0 −c| √ , a2 +b2

 we first put ` into general form. With d =

   1 1 , we get n = −1 1

since then n · d = 0. Then we have n·x=n·p ⇒

       1 x 1 −1 · = = 1. 1 y 1 2

Thus x + y = 1 and thus a = b = c = 1. Since Q = (2, 2) = (x0 , y0 ), we have √ 3 3 2 |1 · 2 + 1 · 2 − 1| √ . =√ = d(Q, `) = 2 2 12 + 12   −2 28. Comparing the given equation to x = p + td, we get P = (1, 1, 1) and d =  0. As suggested by 3 # » Figure 1.63, we need to calculate the length of RQ, where R is the point on the line at the foot of the # » perpendicular from Q. So if v = P Q, then # » # » P R = projd v, RQ = v − projd v.       1 −1 0 # » Now, v = P Q = q − p = 1 − 1 =  0, so that 1 −1 0  projd v =

d·v d·d



 d=

    2  −2 13 −2 · (−1) + 3 · (−1)      0 =  0 . −2 · (−2) + 3 · 3 3 3 − 13

Thus       2 15 −1 − 13 13       v − projd v =  0 −  0 =  0 . 3 −1 − 13 − 10 13 Then the distance d(Q, `) from Q to ` is

  √

3 p

5

0 = 5 32 + 22 = 5 13 . kv − projd vk =

13 13

2 13 0 +by0 +cz0 −d| 29. To calculate d(Q, P) = |ax√ , we first note that the plane has equation x + y − z = 0, so a2 +b2 +c2 that a = b = 1, c = −1, and d = 0. Also, Q = (2, 2, 2), so that x0 = y0 = z0 = 2. Hence √ |1 · 2 + 1 · 2 − 1 · 2 − 0| 2 2 3 d(Q, P) = p =√ = . 3 3 12 + 12 + (−1)2

36

CHAPTER 1. VECTORS

0 +by0 +cz0 −d| 30. To calculate d(Q, P) = |ax√ , we first note that the plane has equation x − 2y + 2z = 1, so a2 +b2 +c2 that a = 1, b = −2, c = 2, and d = 1. Also, Q = (0, 0, 0), so that x0 = y0 = z0 = 0. Hence

d(Q, P) =

1 |1 · 0 − 2 · 0 + 2 · 0 − 1| p = . 3 12 + (−2)2 + 22

# » # » 31. Figure 1.66 suggests that we let v = P Q;  then  w = P R = projd v. Comparing    the  given   line ` to # » 1 2 −1 3 x = p + td, we get P = (−1, 2) and d = . Then v = P Q = q − p = − = . Next, −1 2 2 0  w = projd v =

d·v d·d



 d=

1 · 3 + (−1) · 0 1 · 1 + (−1) · (−1)



   3 1 1 = . −1 2 −1

So

" # " # " # 1 3 −1 # » 2 = 2 . r = p + P R = p + projd v = p + w = + 1 − 32 2 2  1 1 So the point R on ` that is closest to Q is 2 , 2 . # » # » 32. Figure 1.66 suggests that we let = P Q; then P R = projd v. Comparing givenline ` to x = p+td,    the   v 0 1 −1 −2 # » we get P = (1, 1, 1) and d =  0. Then v = P Q = q − p = 1 − 1 =  0. Next, 0 1 −1 3  projd v =

d·v d·d



 d=

    2  −2 13 −2 · (−1) + 3 · (−1)      0 =  0 . (−2)2 + 32 3 3 − 13

So

      15 2 1 13 13 # »       r = p + P R = p + projd v = 1 +  0 =  1  . 3 10 − 13 1 13  15 10 So the point R on ` that is closest to Q is 13 , 1, 13 .

# » # » 33. Figure 1.67 suggests we let v = P Q, where Pis some point on the plane; then QR = projn v. The  1 equation of the plane is x + y − z = 0, so n =  1. Setting y = 0 shows that P = (1, 0, 1) is a point −1 on the plane. Then       2 1 1 # » v = P Q = q − p = 2 − 0 = 2 , 2 1 1 so that

 projn v =

n · v n·n

 n=

1·1+1·1−1·1 12 + 12 + (−1)2



   2 1 3    2  1 =  3  . −1 − 23

Finally,         2 4 1 1 3 3 # » # »         r = p + P Q + P R = p + v − projn v = 0 + 2 −  23  =  43  . 8 1 1 − 23 3  Therefore, the point R in P that is closest to Q is 34 , 43 , 83 .

1.3. LINES AND PLANES

37

# » # » 34. Figure 1.67 suggests we let v = P Q, where P is some  point on the plane; then QR = projn v. The 1 equation of the plane is x − 2y + 2z = 1, so n = −2. Setting y = z = 0 shows that P = (1, 0, 0) is a 2 point on the plane. Then       0 1 −1 # » v = P Q = q − p = 0 − 0 =  0 , 0 0 0 so that

   − 91 1    2 −2 =  9  . 2 − 29 

projn v =

n · v n·n

 n=

12

1 · (−1) + (−2)2 + 22



Finally,         − 19 −1 1 − 19 # » # »      2  2 r = p + P Q + P R = p + v − projn v = 0 +  0 −  9  =  9  . 0 0 − 29 − 29  Therefore, the point R in P that is closest to Q is − 19 , 29 , − 29 . 35. Since the given lines `1 and `2 are parallel, choose arbitrary points Q on `1 and P on `2 , say Q = (1, 1) and P = (5, 4). The direction vector of `2 is d = [−2, 3]. Then       # » 1 5 −4 v = PQ = q − p = − = , 1 4 −3 so that

      −2 · (−4) + 3 · (−3) d·v 1 −2 −2 =− d= . projd v = 3 3 d·d (−2)2 + 32 13 Then the distance between the lines is given by

 

      √

−4



1 −2

= 1 −54 = 18 3 = 18 13. kv − projd vk = +

−3



3 13 13 −36 13 2 13 

36. Since the given lines `1 and `2 are parallel, choose arbitrary points Q on `1 and P on `2 , say Q = (1, 0, −1) and P = (0, 1, 1). The direction vector of `2 is d = [1, 1, 1]. Then       1 0 1 # » v = P Q = q − p =  0 − 1 = −1 , −2 −1 1 so that  projd v =

d·v d·d



 d=

1 · 1 + 1 · (−1) + 1 · (−2) 12 + 12 + 12

     1 1 2 1 = − 1 . 3 1 1

Then the distance between the lines is given by

      5

1

√ 1

3 42

  2    1  1 p 2 kv − projd vk = −1 + 1 = − 3  = 5 + (−1)2 + (−4)2 = .



3 3 3 4

−2 1 −3 37. Since P1 and P2 are parallel, we choose an arbitrary point on P1 , say Q = (0, 0, 0), and compute d(Q, P2 ). Since the equation of P2 is 2x + y − 2z = 5, we have a = 2, b = 1, c = −2, and d = 5; since Q = (0, 0, 0), we have x0 = y0 = z0 = 0. Thus the distance is d(P1 , P2 ) = d(Q, P2 ) =

|ax0 + by0 + cz0 − d| |2 · 0 + 1 · 0 − 2 · 0 − 5| 5 √ √ = = . 2 2 2 2 2 2 3 a +b +c 2 +1 +2

38

CHAPTER 1. VECTORS

38. Since P1 and P2 are parallel, we choose an arbitrary point on P1 , say Q = (1, 0, 0), and compute d(Q, P2 ). Since the equation of P2 is x + y + z = 3, we have a = b = c = 1 and d = 3; since Q = (1, 0, 0), we have x0 = 1, y0 = 0, and z0 = 0. Thus the distance is √ |ax0 + by0 + cz0 − d| |1 · 1 + 1 · 0 + 1 · 0 − 3| 2 2 3 √ √ d(P1 , P2 ) = d(Q, P2 ) = = =√ = . 3 3 a2 + b2 + c2 12 + 1 2 + 1 2   # » a 0 −c| , where n = , n · a = c, and B = (x0 , y0 ). If v = AB = 39. We wish to show that d(B, `) = |ax√0a+by 2 +b2 b b − a, then     a x n · v = n · (b − a) = n · b − n · a = · 0 − c = ax0 + by0 − c. b y0 Then from Figure 1.65, we see that

 n · v  |n · v| |ax0 + by0 − c|

√ n = = . d(B, `) = kprojn vk = n·n knk a2 + b2   a 0 +by0 +cz0 −d|  b , n · a = d, and B = (x0 , y0 , z0 ). If 40. We wish to show that d(B, `) = |ax√ , where n = a2 +b2 +c2 c # » v = AB = b − a, then     x0 a n · v = n · (b − a) = n · b − n · a =  b  ·  y0  − d = ax0 + by0 + cz0 − d. c z0 Then from Figure 1.65, we see that

 n · v  |n · v| |ax0 + by0 + cz0 − d|

√ d(B, `) = kprojn vk = = . n = n·n knk a2 + b2 + c2 41. Choose B = (x0 , y0 ) on `1 ; since `1 and   `2 are parallel, the distance between them is d(B, `2 ). Then a x since B lies on `1 , we have n · b = · 0 = ax0 + by0 = c1 . Choose A on `2 , so that n · a = c2 . Set b y0 v = b − a. Then using the formula in Exercise 39, the distance is d(`1 , `2 ) = d(B, `2 ) =

|n · v| |n · (b − a)| |n · b − n · a| |c1 − c2 | = = = . knk knk knk knk

42. Choose B = (x0 , y0 , z0 ) on P1 ; since P1 and parallel, the distance between them is d(B, P2 ). P  2 are  a x0 Then since B lies on P1 , we have n · b =  b  ·  y0  = ax0 + by0 + cz0 = d1 . Choose A on P2 , so c z0 that n · a = d2 . Set v = b − a. Then using the formula in Exercise 40, the distance is |n · v| |n · (b − a)| |n · b − n · a| |d1 − d2 | = = = . knk knk knk knk     1 2 43. Since P1 has normal vector n1 = 1 and P2 has normal vector n2 =  1, the angle θ between 1 −2 the normal vectors satisfies d(P1 , P2 ) = d(B, P2 ) =

cos θ =

n1 · n2 1 · 2 + 1 · 1 + 1 · (−2) 1 p =√ = √ . 2 2 2 2 2 2 kn1 k kn2 k 3 3 1 + 1 + 1 2 + 1 + (−2)

Thus θ = cos−1



1 √ 3 3



≈ 78.9◦ .

1.3. LINES AND PLANES

39 

   3 1 44. Since P1 has normal vector n1 = −1 and P2 has normal vector n2 =  4, the angle θ between 2 −1 the normal vectors satisfies cos θ =

3 · 1 − 1 · 4 + 2 · (−1) 3 1 n1 · n2 p =p = −√ √ = −√ . 2 2 2 2 2 2 kn1 k kn2 k 14 18 28 3 + (−1) + 2 1 + 4 + (−1)

This is an obtuse angle, so the acute angle is   1 π − θ = π − cos−1 − √ ≈ 79.1◦ . 28 45. First, to see that P and ` intersect, substitute the parametric equations for ` into the equation for P, giving x + y + 2z = (2 + t) + (1 − 2t) + 2(3 + t) = 9 + t = 0, so that t = −9 represents the point of intersection, which is thus (2 + (−9), 1 − 2(−9),    3+ (−9)) = 1 1 (−7, 19, −6). Now, the normal to P is n = 1, and a direction vector for ` is d = −2. So if θ is 2 1 the angle between n and d, then θ satisfies cos θ =

1 · 1 + 1 · (−2) + 2 · 1 1 n·d p =√ = , 2 2 2 2 2 2 knk kdk 6 1 + 1 + 1 1 + (−2) + 1

so that −1

θ = cos

  1 ≈ 80.4◦ . 6

Thus the angle between the line and the plane is 90◦ − 80.4◦ ≈ 9.6◦ . 46. First, to see that P and ` intersect, substitute the parametric equations for ` into the equation for P, giving 4x − y − z = 4 · t − (1 + 2t) − (2 + 3t) = −t − 3 = 6, so that t = −9 represents the point of intersection, · (−9)) =   which is thus (−9, 1 + 2 · (−9), 2 +3  1 4 (−9, −17, −25). Now, the normal to P is n = −1, and a direction vector for ` is d = 2. So if θ 3 −1 is the angle between n and d, then θ satisfies cos θ =

n·d 4·1−1·2−1·3 1 √ =√ = −√ √ . knk kdk 18 14 42 + 12 + 12 12 + 22 + 32

This corresponds to an obtuse angle, so the acute angle between the two is   1 −1 θ = π − cos −√ √ ≈ 86.4◦ . 18 14 Thus the angle between the line and the plane is 90◦ − 86.4◦ ≈ 3.6◦ . 47. We have p = v − c n, so that c n = v − p. Take the dot product of both sides with n, giving (c n) · n = (v − p) · n



c(n · n) = v · n − p · n



c(n · n) = v · n (since p and n are orthogonal) n·v c= . n·n



40

CHAPTER 1. VECTORS Note that another interpretation of the figure is that c n = projn v = n·v c = n·n .

n·v n·n



n, which also implies that

Now substitute this value of c into the original equation, giving n · v n. p = v − cn = v − n·n   1 48. (a) A normal vector to the plane x + y + z = 0 is n = 1. Then 1     1 1 n · v = 1 ·  0 = 1 · 1 + 1 · 0 + 1 · (−2) = −1 1 −2     1 1 n · n = 1 · 1 = 1 · 1 + 1 · 1 + 1 · 1 = 3, 1 1 so that c = − 31 . Then      4 1 1 3   1    1 p=v− n =  0 + 1 =  3  . n·n 3 1 −2 − 53 

n · v

 3 (b) A normal vector to the plane 3x − y + z = 0 is n = −1. Then 1 

   1 3 n · v = −1 ·  0 = 3 · 1 − 1 · 0 + 1 · (−2) = 1 −2 1     3 3 n · n = −1 · −1 = 3 · 3 − 1 · (−1) + 1 · 1 = 11, 1 1 

so that c =

1 11 .

Then 

     8 1 3 11 1    1   p=v− n =  0 − −1 =  11  . n·n 11 −2 1 − 23 11 n · v



 1 (c) A normal vector to the plane x − 2z = 0 is n =  0. Then −2 

   1 1 n · v =  0 ·  0 = 1 · 1 + 0 · 0 − 2 · (−2) = 5 −2 −2     1 1 n · n =  0 ·  0 = 1 · 1 + 0 · 0 − 2 · (−2) = 5, −2 −2

Exploration: The Cross Product

41

so that c = 1. Then    1 1     n =  0 −  0 = 0. p=v− n·n −2 −2 

n · v

Note that the projection is 0 because the vector is normal to the plane, so its projection onto the plane is a single point.   2 (d) A normal vector to the plane 2x − 3y + z = 0 is n = −3. Then 1 

   2 1 n · v = −3 ·  0 = 2 · 1 − 3 · 0 + 1 · (−2) = 0 1 −2     2 2 n · n = −3 · −3 = 2 · 2 − 3 · (−3) + 1 · 1 = 14, 1 1  1 so that c = 0. Thus p = v =  0. Note that the projection is the vector itself because the −2 vector is parallel to the plane, so it is orthogonal to the normal vector. 

Exploration: The Cross Product 1. (a)

(b)

(c)

(d)

      3 u2 v3 − u3 v2 1 · 3 − 0 · (−1) u × v = u3 v1 − u1 v3  =  1 · 3 − 0 · 2  =  3. −3 u1 v2 − u2 v1 0 · (−1) − 1 · 3       u2 v3 − u3 v2 −1 · 1 − 2 · 1 −3 u × v = u3 v1 − u1 v3  =  2 · 0 − 3 · 1  = −3. u1 v2 − u2 v1 3 · 1 − (−1) · 0 3     u2 v3 − u3 v2 2 · (−6) − 3 · (−4) u × v = u3 v1 − u1 v3  = 3 · 2 − (−1) · (−6) = 0. u1 v2 − u2 v1 −1 · (−4) − 2 · 2       u2 v3 − u3 v2 1·3−1·2 1 u × v = u3 v1 − u1 v3  = 1 · 1 − 1 · 3 = −2. 1·2−1·1 1 u1 v2 − u2 v1

2. We have         1 0 0·0−0·1 0 e1 × e2 = 0 × 1 = 0 · 0 − 1 · 0 = 0 = e3 0 0 1·1−0·0 1         0 0 1·1−0·0 1 e2 × e3 = 1 × 0 = 0 · 0 − 0 · 1 = 0 = e1 0 1 0·0−1·0 0         0 1 0·0−1·0 0 e3 × e1 = 0 × 0 = 1 · 1 − 0 · 0 = 1 = e2 . 1 0 0·0−0·1 0

42

CHAPTER 1. VECTORS 3. Two vectors are orthogonal if their dot product equals zero. But     u2 v3 − u3 v2 u1 (u × v) · u = u3 v1 − u1 v3  · u2  u1 v2 − u2 v1 u3 = (u2 v3 − u3 v2 )u1 + (u3 v1 − u1 v3 )u2 + (u1 v2 − u2 v1 )u3 = (u2 v3 u1 − u1 v3 u2 ) + (u3 v1 u2 − u2 v1 u3 ) + (u1 v2 u3 − u3 v2 u1 ) = 0     u2 v3 − u3 v2 v1 (u × v) · v = u3 v1 − u1 v3  · v2  u1 v2 − u2 v1 v3 = (u2 v3 − u3 v2 )v1 + (u3 v1 − u1 v3 )v2 + (u1 v2 − u2 v1 )v3 = (u2 v3 v1 − u2 v1 v3 ) + (u3 v1 v2 − u3 v2 v1 ) + (u1 v2 v3 − u1 v3 v2 ) = 0. 4. (a) By Exercise 1, a vector normal to the plane is         0 3 1 · 2 − 1 · (−1) 3 n = u × v = 1 × −1 =  1 · 3 − 0 · 2  =  3 . 1 2 0 · (−1) − 1 · 3 −3 So the normal form for the equation of this plane is n · x = n · p, or         x 3 1 3  3 · y  =  3 ·  0 = 9. z −3 −2 −3 This simplifies to 3x + 3y − 3z = 9, or x + y − z = 3.     1 2 # » # » (b) Two vectors in the plane are u = P Q = 1 and v = P R =  3. So by Exercise 1, a vector −2 1 normal to the plane is         −5 1 · (−2) − 1 · 3 1 2 n = u × v = 1 ×  3 = 1 · 1 − 2 · (−2) =  5 . 5 2·3−1·1 −2 1 So the normal form for the equation of this plane is n · x = n · p, or         −5 0 x −5  5 · y  =  5 · −1 = 0. 1 z 5 5

5. (a)

(b)

(c)

(d)

This simplifies to −5x + 5y + 5z = 0, or x − y − z = 0.     v2 u3 − v3 u2 u2 v3 − u3 v2 v × u = v3 u1 − u3 v1  = − u3 v1 − u1 v3  = −(u × v). v1 u2 − v2 u1 u1 v2 − u2 v1         u1 0 u2 · 0 − u3 · 0 0 u × 0 = u2  × 0 = u3 · 0 − u1 · 0 = 0 = 0. u3 0 u1 · 0 − u2 · 0 0     u2 u3 − u3 u2 0 u × u = u3 u1 − u1 u3  = 0 = 0. u1 u2 − u2 u1 0     u2 kv3 − u3 kv2 u2 v3 − u3 v2 u × kv = u3 kv1 − u1 kv3  = k u3 v1 − u1 v3  = k(u × v). u1 kv2 − u2 kv1 u1 v2 − u2 v1

Exploration: The Cross Product

43

(e) u × ku = k(u × u) = k(0) = 0 by parts (d) and (c). (f ) Compute the cross-product:   u2 (v3 + w3 ) − u3 (v2 + w2 ) u × (v + w) = u3 (v1 + w1 ) − u1 (v3 + w3 ) u1 (v2 + w2 ) − u2 (v1 + w1 )   (u2 v3 − u3 v2 ) + (u2 w3 − u3 w2 ) = (u3 v1 − u1 v3 ) + (u3 w1 − u1 w3 ) (u1 v2 − u2 v1 ) + (u1 w2 − u2 w1 )     u2 v3 − u3 v2 u2 w3 − u3 w2 = u3 v1 − u1 v3  + u3 w1 − u1 w3  u1 v2 − u2 v1 u1 w2 − u2 w1 = u × v + u × w. 6. In each case, simply compute: (a)     v2 w3 − v3 w2 u1 u · (v × w) = u2  · v3 w1 − v1 w3  v1 w2 − v2 w1 u3 = u1 v2 w3 − u1 v3 w2 + u2 v3 w1 − u2 v1 w3 + u3 v1 w2 − u3 v2 w1 = (u2 v3 − u3 v2 )w1 + (u3 v1 − u1 v3 )w2 + (u1 v2 − u2 v1 )w3 = (u × v) · w. (b)     u1 v2 w3 − v3 w2 u × (v × w) = u2  × v3 w1 − v1 w3  u3 v1 w2 − v2 w1   u2 (v1 w2 − v2 w1 ) − u3 (v3 w1 − v1 w3 ) = u3 (v2 w3 − v3 w2 ) − u1 (v1 w2 − v2 w1 ) u1 (v3 w1 − v1 w3 ) − u2 (v2 w3 − v3 w2 )   (u1 w1 + u2 w2 + u3 w3 )v1 − (u1 v1 + u2 v2 + u3 v3 )w1 = (u1 w1 + u2 w2 + u3 w3 )v2 − (u1 v1 + u2 v2 + u3 v3 )w2  (u1 w1 + u2 w2 + u3 w3 )v3 − (u1 v1 + u2 v2 + u3 v3 )w3     v1 w1 = (u1 w1 + u2 w2 + u3 w3 ) v2  − (u1 v1 + u2 v2 + u3 v3 ) w2  v3 w3 = (u · w)v − (u · v)w. (c)

 

u2 v3 − u3 v2 2

2   ku × vk =

u3 v1 − u1 v3

u1 v2 − u2 v1 = (u2 v3 − u3 v2 )2 + (u3 v1 − u1 v3 )2 + (u1 v2 − u2 v1 )2 = (u21 + u22 + u23 )2 (v12 + v22 + v32 )2 − (u1 v1 + u2 v2 + u3 v3 )2 2

2

= kuk kvk − (u · v)2 .

44

CHAPTER 1. VECTORS 7. For problem 2, use the computation in the solution above to show that e1 × e2 = e3 . Then e2 × e3 = e2 × (e1 × e2 ) = (e2 · e2 )e1 − (e2 · e1 )e2

6(b)

= 1 · e1 − 0 · e2

ei · ej = 0 if i 6= j, ei · ei = 1

= e1 . Similarly, e3 × e1 = e3 × (e2 × e3 ) = (e3 · e3 )e2 − (e3 · e2 )e3

6(b)

= 1 · e2 − 0 · e3

ei · ej = 0 if i 6= j, ei · ei = 1

= e2 . For problem 3, we have u · (u × v) = (u × u) · v by 6(a), and then since u × u = 0 by Exercise 5(c), this reduces to 0 · v = 0. Thus u is orthogonal to u × v. Similarly, v · (u × v) = v · (−v × u) = −v · (v × u) by Exercise 5(a), and then as before, −v · (v × u) = −(v × v) · u = −0 · u = 0, so that v is also orthogonal to u × v. 8. (a) We have 2

2

2

2

2

2

2

ku × vk = kuk kvk − (u · v)2 2

2

= kuk kvk − kuk kvk cos2 θ  2 2 = kuk kvk 1 − cos2 θ = kuk kvk sin2 θ. Since the angle between u and v is always between 0 and π, it always has a nonnegative sine, so taking square roots gives ku × vk = kuk kvk sin θ. (b) Recall that the area of a triangle is A = 21 (base)(height). If the angle between u and v is θ, then the length of the perpendicular from the head of v to the line determined by u is an altitude of the triangle; the corresponding base is u. Thus the area is 1 1 1 kuk · (kvk sin θ) = kuk kvk sin θ = ku × vk . 2 2 2     1 4 # »   # »   (c) Let u = AB = −1 and v = AC = −3 . Then from part (b), we see that the area is −1 2

   

 

1

4

1 −5 1 √ 1

      A = −1 × −3 = −6

= 2 62. 2 2 −1 2 1 A=

1.4

Applications

1. Use the method of example 1.34. The magnitude of the resultant force r is q p 2 2 krk = kf1 k + kf2 k = 122 + 52 = 13 N, while the angle between r and east (the direction of f2 ) is θ = tan−1

12 ≈ 67.4◦ . 5

Note that the resultant is closer to north than east; the larger force to the north pulls the object more strongly to the north.

1.4. APPLICATIONS

45

2. Use the method of example 1.34. The magnitude of the resultant force r is q p 2 2 krk = kf1 k + kf2 k = 152 + 202 = 25 N, while the angle between r and west (the direction of f1 ) is θ = tan−1

20 ≈ 53.1◦ . 15

Note that the resultant is closer to south than west; the larger force to the south pulls the object more strongly to the south.       4 8 8 cos 60◦ √ 3. Use the method of Example 1.34. If we let f1 = . So the resultant , then f2 = = 0 8 sin 60◦ 4 3 force is       12 4 8 r = f1 + f2 = + √ = √ . 0 4 3 4 3 q √ 2 √ √ The magnitude of r is krk = 122 + 4 3 = 192 = 8 3 N, and the angle formed by r and f1 is −1

θ = tan

√ 4 3 1 = tan−1 √ = 30◦ . 12 3

Note that the resultant also forms a 30◦ degree angle with f2 ; since the magnitudes of the two forces are the same, the resultant points equally between them.      √  4 6 cos 135◦ −3√ 2 4. Use the method of Example 1.34. If we let f1 = . So the , then f2 = = 0 6 sin 135◦ 3 2 resultant force is √     √   4 −3√ 2 4 −√3 2 r = f1 + f2 = + = . 0 3 2 3 2 The magnitude of r is q q √ √ √ 2 2 krk = (4 − 3 2) + (3 2) = 52 − 24 2 ≈ 4.24 N, and the angle formed by r and f1 is −1

θ = tan

√ 3 2 √ ≈ 93.3◦ 4−3 2

        2 2 −6 4 cos 60◦ √ . 5. Use the method of Example 1.34. If we let f1 = , then f2 = , and f3 = = 0 0 4 sin 60◦ 2 3 So the resultant force is         2 −2 2 −6 r = f1 + f2 + f3 = + + √ = √ . 0 0 2 3 2 3 The magnitude of r is krk =

q √ √ (−2)2 + (2 3)2 = 16 = 4 N,

and the angle formed by r and f1 is √ √ 2 3 θ = tan = tan−1 (− 3) = 120◦ . −2 √ (Note that many CAS’s will return −60◦ for tan−1 (− 3); by convention we require an angle between 0 and 180◦ , so we add 180◦ to that answer, since tan θ = tan(180◦ + θ)). −1

46

CHAPTER 1. VECTORS 6. Use the method of Example 1.34. If we let f1 =

        10 0 −5 0 , then f2 = , f3 = , and f4 = . So 0 13 0 −8

the resultant force is           10 0 −5 0 5 r = f1 + f2 + f3 + f4 = + + + = . 0 13 0 −8 5 The magnitude of r is krk =

p √ √ 52 + 52 = 50 = 5 2 N,

and the angle formed by r and f1 is θ = tan−1

5 = tan−1 1 = 45◦ . 5

7. Following Example 1.35, we have the following diagram: f

fy

60°

fx

Here f makes a 60◦ angle with fy , so that kfy k = kf k cos 60◦ , √

Thus kfx k = 10 ·

3 2

√ = 5 3 ≈ 8.66 N and kfy k = 10 · fx =

 √  5 3 , 0

kfx k = kf k sin 60◦ . 1 2

= 5 N. Finally, this gives fy =

  0 . 5

8. The force that must be applied parallel to the ramp is the force needed to counteract the component of the force due to gravity that acts parallel to the ramp. Let f be the force due to gravity; this acts downwards. We can decompose it into components fp , acting parallel to the ramp, and fr , acting orthogonally to the ramp. Since the ramp makes an angle of 30◦ with the horizontal, it makes an angle of 60◦ with f . Thus 1 kfp k = kf k cos 60◦ = kf k = 5 N. 2 9. The vertical force is the vertical component of the force vector f . Since this acts at an angle of 45◦ to the horizontal, the magnitude of the component of this force in the vertical direction is √ √ 2 ◦ kfy k = kf k cos 45 = 1500 · = 750 2 ≈ 1060.66 N. 2 10. The vertical force is the vertical component of the force vector f , which has magnitude 100 N and acts at an angle of 45◦ to the horizontal. Since it acts at an angle of 45◦ to the horizontal, the magnitude of the component of this force in the vertical direction is √ √ 2 kfy k = kf k cos 45◦ = 100 · = 50 2 ≈ 70.7 N. 2 Note that the mass of the lawnmower itself is irrelevant; we are not considering the gravitational force in this exercise, only the force imparted by the person mowing the lawn.

1.4. APPLICATIONS

47

11. Use the method of Example 1.36. Let t be the force vector on the cable; then the tension on the cable is ktk. The force imparted by the hanging sign is the y component of t; call it ty . Since t and ty form an angle of 60◦ , we have 1 kty k = ktk cos 60◦ = ktk . 2 The gravitational force on the sign is its mass times the acceleration due to gravity, which is 50·9.8 = 490 N. Thus ktk = 2 kty k = 2 · 490 = 980 N. 12. Use the method of Example 1.36. Let w be the force vector created by the sign. Then kwk = 1·9.8 = 9.8 N, since the sign weighs 1 kg. By symmetry, each string carries half the weight of the sign, since the angles each string forms with the vertical are the same. Let s be the tension in the left-hand string. Since the angle between s and w is 45◦ , we have √ 2 1 kwk = ksk cos 45◦ = ksk . 2 2 √ 9.8 = 4.9 2 ≈ 6.9 N. Thus ksk = √12 kwk = √ 2 13. A diagram of the situation is 25

15

20

15 kg

The triangle is a right triangle, since 152 + 202 = 625 = 252 . Thus if θ1 is the angle that the left-hand 20 wire makes with the ceiling, then sin θ1 = 25 = 45 ; likewise, if θ2 is the angle that the right-hand wire 15 3 makes with the ceiling, then sin θ2 = 25 = 5 . Let f1 be the force on the left-hand wire and f2 the force on the right-hand wire. Let r be the force due to gravity acting on the painting. Then following Example 1.36, we have, using the Law of Sines, kf1 k 15 · 9.8 kf2 k krk = = 147 N. = = sin θ1 sin θ2 sin 90◦ 1 Then kf1 k = 147 sin θ1 = 147 ·

4 = 117.6 N, 5

kf2 k = 147 sin θ2 = 147 ·

3 = 88.2 N. 5

14. Let r be the force due to gravity acting on the painting, f1 be the tension on the wire opposite the 30◦ angle, and f2 be the tension on the wire opposite the 45◦ angle (don’t these people know to hang paintings straight?). Then krk = 20 · 9.8 = 196 N. Note that the remaining angle in the triangle is 105◦ . Then using the method of Example 1.36, we have, using the Law of Sines, kf1 k kf2 k krk = = . sin 30◦ sin 45◦ sin 105◦ Thus 196 · 12 krk · sin 30◦ ≈ ≈ 101.46 N sin 105◦ 0.9659√ 196 · 22 krk · sin 45◦ kf2 k = ≈ ≈ 143.48 N. sin 105◦ 0.9659 kf1 k =

48

CHAPTER 1. VECTORS

Chapter Review 1. (a) True. This follows from the properties of Rn listed in Theorem 1.1 in Section 1.1: u=u+0

Zero Property, Property (c)

= u + (w + (−w))

Additive Inverse Property, Property (d)

= (u + w) + (−w)

Distributive Property, Property (b)

= (v + w) + (−w)

By the given condition u + w = v + w

= v + (w + (−w))

Distributive Property, Property (b)

=v+0

Additive Inverse Property, Property (d)

=v

Zero Property, Property (c).

(b) False. See Exercise 60 in Section 1.2. For one counterexample, let w = 0 and u and v be arbitrary vectors. Then u · w = v · w = 0 since the dot product of any vector with the zero vector is zero. But certainly u and v need not be equal. As a second counterexample, suppose that both u and v are orthogonal to w; clearly they need not be equal, but u · w = v · w = 0. (c) False. For example, let u be any nonzero vector, and v any vector orthogonal to u. Let w = u. Then u is orthogonal to v and v is orthogonal to w = u, but certainly u and w = u are not orthogonal. (d) False. When a line is parallel to a plane, then d · n = 0; that is, d is orthogonal to the normal vector of the plane. (e) True. Since a normal vector n for P and the line ` are both perpendicular to P, they must be parallel. See Figure 1.62. (f ) True. See the remarks following Example 1.31 in Section 1.3. (g) False. They can be skew lines, which  arenonintersecting lines with nonparallel  direction   vectors. 0 0 1 For example, let `1 be the line x = t 0 (the x-axis), and `2 be the line x = 0 + t 1 (the 0 1 0 line through (0, 0, 1) that is parallel to the y-axis). These two lines do not intersect, yet they are not parallel.     1 1 (h) False. For example, 0 · 0 = 1 + 0 + 1 = 0 in Z2 . 1 1 (i) True. If ab = 0 in Z/5, then ab must be a multiple of 5. But 5 is prime, so either a or b must be divisible by 5, so that either a = 0 or b = 0 in Z5 . (j) False. For example, 2 · 3 = 6 = 0 in Z6 , but neither 2 nor 3 is zero in Z6 .   10 2. Let w = . Then the head of the resulting vector is −10         −1 3 10 9 4u + v + w = 4 + + = . 5 2 −10 12 So the coordinates of the point at the head of 4u + v are (9, 12). 3. Since 2x + u = 3(x − v) = 3x − 3v, simplifying gives u + 3v = x. Thus       −1 3 8 x = u + 3v = +3 = . 5 2 11 # » # » 4. Since ABCD is a square, OC = −OA, so that # » # » # » # » # » BC = OC − OB = −OA − OB = −a − b.

Chapter Review

49

5. We have     −1 2 u · v =  1 ·  1 = −1 · 2 + 1 · 1 + 2 · (−1) = −3 2 −1 p √ kuk = (−1)2 + 12 + 22 = 6 p √ kvk = 22 + 12 + (−1)2 = 6. Then if θ is the angle between u and v, it satisfies cos θ =

1 u·v 3 = −√ √ = − . kuk kvk 2 6 6

Thus −1

θ = cos

  1 − = 120◦ . 2

6. We have      1 1 1 1 · 1 − 2 · 1 + 2 · 1   1    92  u= proju v = −2 = −2 = − 9  . u·u 1 · 1 − 2 · (−2) + 2 · 2 9 2 2 2 9 

u · v

7. We are looking for a vector   in the xy-plane; any such vector has a z-coordinate of 0. So the vector we a are looking for is u =  b  for some a, b. Then we want 0       1 a 1 u · 2 =  b  · 2 = a + 2b = 0, 3 0 3 so that a = −2b. So for example choose b = 1; then a = −2, and the vector   −2 u =  1 0   1 is orthogonal to 2. Finally, to get a unit vector, we must normalize u. We have 3 kuk =

p √ (−2)2 + 12 + 02 = 5

to get   − √25   1 √1  w= u=   5 kuk 0 that is orthogonal to the given vector. We could have chosen any value for b, but we would have gotten either w or −w for the normalized vector. 8. The vector form of the given line is 

   2 −1 x =  3 + t  2 , −1 1

50

CHAPTER 1. VECTORS   −1 so the line has a direction vector d =  2. Since the plane is perpendicular to this line, d is a normal 1 vector n to the plane. Then the plane passes through P = (1, 1, 1), so equation n · x = n · p becomes         −1 x −1 1  2 · y  =  2 · 1 = 2. 1 z 1 1 Expanding gives the general equation −x + 2y + z = 2. 

 2 9. Parallel planes have parallel normals, so the vector n =  3, which is a normal to the given plane, −1 is also a normal to the desired plane. The plane we want must pass through P = (3, 2, 5), so equation n · x = n · p becomes         2 x 2 3  3 · y  =  3 · 2 = 7. −1 z −1 5 Expanding gives the general equation 2x + 3y − z = 7. 10. The three points give us two vectors that lie in the plane:       0 1 1 # »       d1 = AB = 0 − 1 = −1 , 1 0 1       −1 1 0 # » d2 = BC = 1 − 0 =  1 . 1 1 2   a A normal vector n =  b  must be orthogonal to both of these vectors, so c     0 a n · d1 =  b  · −1 = −b + c = 0 ⇒ b = c 1 c     a −1 n · d2 =  b  ·  1 = −a + b + c = 0 ⇒ a = b + c. c 1   2c Since b = c and a = b + c, we get a = 2c, so n =  c  for any value of c. Choosing c = 1 gives the c   2 vector n = 1. Let P be the point A = (1, 1, 0) (we could equally well choose B or C), and compute 1 n · x = n · p:         2 x 2 1 1 · y  = 1 · 1 = 3. 1 z 1 0 Expanding gives the general equation 2x + y + z = 3.

Chapter Review

51

11. Let     1−1 0 # » u = AB = 0 − 1 = −1 1−0 1     0−1 −1 # » v = AC = 1 − 1 =  0 . 2−0 2 Using the first method from Figure 1.39 (Exercises 46 and 47) in Section 1.2, we have       0 0 0 u · v 0 · (−1) − 1 · 0 + 1 · 2   2     −1 −1 −1 u= = = proju v = u·u 02 + (−1)2 + 12 2 1 1 1       −1 0 −1 v − proju v =  0 − −1 =  1 . 2 1 1 Then the area of the triangle is p 1p 2 1√ √ 1√ 1 kuk kv − proju vk = 0 + (−1)2 + 12 (−1)2 + 12 + 12 = 2 3= 6. 2 2 2 2 12. From the first example in Exploration: Vectors and Geometry, we have a formula for the midpoint:         5 3 8 4 1     1     1 1 + −7 −6 = −3 . = m = (a + b) = 2 2 2 −2 0 −2 −1 13. Suppose that kuk = 2 and kvk = 3. Then from the Cauchy-Schwarcz inequality, we have |u · v| ≤ kuk kvk = 2 · 3 = 6. Thus −6 ≤ u · v ≤ 6, so the dot product cannot equal −7. 14. We will apply the formula |ax0 + by0 + cz0 − d| √ a2 + b2 + c2 where A = (x0 , y0 , z0 ) is a point, and a general equation for the plane is ax + by + cz = d. Here, we have a = 2, b = 3, c = −1, and d = 0; since the point is (3, 2, 5), we have x0 = 3, y0 = 2, and z0 = 5. So the distance from the point to the plane is √ 7 |2 · 3 + 3 · 2 − 1 · 5 − 0| 14 d(A, P) = p =√ = . 2 14 22 + 32 + (−1)2 d(A, P) =

15. As in example 1.32 in Section 1.3, we have B = (3, 2, 5), and the line ` has vector form     0 1 x = 1 + t 1 , 2 1   1 so that A = (0, 1, 2) lies on the line, and a direction vector for the line is d = 1. Then 1       3 0 3 # » v = AB = 2 − 1 = 1 , 5 2 3

52

CHAPTER 1. VECTORS and then  projd v =

d·v d·d



 d=

    7  1 1 · 3 + 1 · 1 + 1 · 3    37  1 =  3  . 12 + 12 + 12 7 1 3

Then the vector from B that is perpendicular to the line is the vector       7 2 3 3 3   7  4 v − projd v = 1 −  3  = − 3  . 7 2 3 3 3 So the distance from B to ` is s   2  2 2 2 4 2 1√ 2√ kv − projd vk = + − + = 4 + 16 + 4 = 6. 3 3 3 3 3 16. We have 3 − (2 + 4)3 (4 + 3)2 = 3 − 13 · 22 = 3 − 1 · 4 = −1 = 4 in Z5 . Note that 2 + 4 = 6 = 1 in Z5 , 4 + 3 = 7 = 2 in Z5 , and −1 = 4 in Z5 . 17. 3(x + 2) = 5 implies that 5 · 3(x + 2) = 5 · 5 = 25 = 4. But 5 · 3 = 15 = 1 in Z7 , so this is the same as x + 2 = 4, so that x = 2. To check the answer, we have 3(2 + 2) = 3 · 4 = 12 = 5 in Z7 . 18. This has no solutions. For any value of x, the left-hand side is a multiple of 3, so it cannot leave a remainder of 5 when divided by 9 (which is also a multiple of 3). 19. Compute the dot product in Z45 : [2, 1, 3, 3] · [3, 4, 4, 2] = 2 · 3 + 1 · 4 + 3 · 4 + 3 · 2 = 6 + 4 + 12 + 6 = 1 + 4 + 2 + 1 = 8 = 3. 20. Suppose that [1, 1, 1, 0] · [d1 , d2 , d3 , d4 ] = d1 + d2 + d3 = 0 in Z2 . Then an even number (either zero or two) of d1 , d2 , and d3 must be 1 and the others must be zero. d4 is arbitrary (either 0 or 1). So the eight possible vectors are [0, 0, 0, 0], [0, 0, 0, 1], [1, 1, 0, 0], [1, 1, 0, 1], [1, 0, 1, 0], [1, 0, 1, 1], [0, 1, 1, 0], [0, 1, 1, 1].

Chapter 2

Systems of Linear Equations 2.1

Introduction to Systems of Linear Equations

1. This is equation is linear since each of x, y, and z appear to the first power with constant coefficients. √ 1, π, and 3 5 are constants. 2. This is not linear since each of x, y, and z appears to a power other than 1. 3. This is not linear since x appears to a power other than 1. 4. This is not linear since the equation contains the term xy, which is a product of two of the variables. 5. This is not linear since cos x is a function of x other than a constant times a multiple of x. 6. This is linear, since here cos 3 is a constant, and x, y, and √ z all appear to the first power with constant coefficients cos 3, −4, and 1, and the remaining term, 3, is also a constant. 7. Put the equation into the general form ax + by + c by adding y to both sides to get 2x + 4y = 7. There are no restrictions on x and y since there were none in the original equation. 8. In the given equation, the denominator must not be zero, so we must have x 6= y. With this restriction, we have x2 − y 2 (x + y)(x − y) = 1 ⇐⇒ = 1 ⇐⇒ x + y = 1, x 6= y. x−y x−y The corresponding linear equation is x + y = 1, for x 6= y. 9. Since x and y each appear in the denominator, neither one can be zero. With this restriction, we have 1 4 y x 4 1 + = ⇐⇒ + = ⇐⇒ y + x = 4 ⇐⇒ x + y = 4. x y xy xy xy xy The corresponding linear equation is x + y = 4, for x 6= 0 and y 6= 0. 10. Since log10 x and log10 y are defined only for x > 0 and y > 0, these are the restrictions on the variables. With these restrictions, we have log10 x − log10 y = 2 ⇐⇒ log10

x x = 2 ⇐⇒ = 102 = 100 ⇐⇒ x = 100y ⇐⇒ x − 100y = 0. y y

The corresponding linear equation is x − 100y = 0 for x > 0 and y > 0. 11. As in Example 2.2(a), we set x = t and solve for y. Setting x = t in 3x − 6y =0 gives  3t − 6y = 0, so that 6y = 3t and thus y = 12 t. So the set of solutions has the parametric form t, 21 t . Note that if we had set x = 2t to start with, we would have gotten the simpler (but equivalent) parametric form [2t, t]. 53

54

CHAPTER 2. SYSTEMS OF LINEAR EQUATIONS

12. As in Example 2.2(a), we set x1 = t and solve for x2 . Setting x1 = t in 2x1 +3x2 = 5 gives 2t+3x 2 = 5,   so that 3x2 = 5 − 2t and thus x2 = 35 − 32 t. So the set of solutions has the parametric form t, 53 − 32 t . Note that if we had set x = 3t to start with, we would have gotten the simpler (but equivalent) parametric form [3t, 5 − 2t]. 13. As in Example 2.2(b), we set y = s and z = t and solve for x. (We choose y and z since the coefficient of the remaining variable, x, is one, so we will not have to do any division). Then x + 2s + 3t = 4, so that x = 4 − 2s − 3t. Thus a complete set of solutions written in parametric form is [4 − 2s − 3t, s, t]. 14. As in Example 2.2(b), we set x1 = s and x2 = t and solve for x3 . This substitution yields 4s+3t+2x3 = 3 1 1, so that  2x31 = 1−4s−3t  and x3 = 2 −2s− 2 t. Thus a complete set of solutions written in parametric 3 form is s, t, 2 − 2s − 2 t . 3

15. Subtract the first equation from the second to get x = 3; substituting x = 3 back into the first equation gives y = −3. Thus the unique intersection point is (3, −3).

2x+y‡3

2 1

1

2

3

4

5

-1 -2

H3, -3L

-3

x+y‡0

-4 -5 -6 -7

16. Add twice the second equation to the first equation to get 7x = 21 and thus x = 3. Substitute x = 3 back into the first equation to get y = −2. So the unique intersection point is (3, −2).

6

3x+y‡7

4 2

1

2

3

4

5

-2

H3, -3L

x-2y‡7

-4 -6 -8

3

17. Add three times the second equation to the first equation. This gives 0 = 6, which is impossible. So this system has no solutions — the two lines are parallel.

3x-6y‡3

2

1

-x + 2 y ‡ 1 -2

1

-1

2

3

4

5

4

5

-1

18. Add 53 times the first equation to the second, giving 0 = 0. Thus the system has an infinite number of solutions — the two lines are the same line (they are coincident).

6

4

0.1 x - 0.05 y ‡ 0.2 2

-2

1

-1 -2

2

3

-0.06 x + 0.03 y ‡ -0.12

-4

-6

-8

19. Starting from the second equation, we find y = 3. Substitute y = 3 into x − 2y = 1 to get x − 2 · 3 = 1, so that x = 7. The unique solution is [x, y] = [7, 3]. 20. Starting from the second equation, we find 2v = 6 so that v = 3. Substitute v = 3 into 2u − 3v = 5 to get 2u − 3 · 3 = 5, or 2u = 14, so that u = 7. The unique solution is [u, v] = [7, 3]. 21. Starting from the third equation, we have 3z = −1, so that z = − 13 . Now substitute this value into the second equation:   1 1 2 1 2y − z = 1 ⇒ 2y − − = 1 ⇒ 2y + = 1 ⇒ 2y = ⇒ y = . 3 3 3 3

2.1. INTRODUCTION TO SYSTEMS OF LINEAR EQUATIONS

55

Finally, substitute these values for y and z into the first equation:   1 1 2 2 x−y+z =0⇒x− + − =0⇒x− =0⇒x= . 3 3 3 3   So the unique solution is [x, y, z] = 32 , 13 , − 13 . 22. The third equation gives x3 = 0; substituting into the second equation gives −5x2 = 0, so that x2 = 0. Substituting these values into the first equation gives x1 = 0. So the unique solution is [x1 , x2 , x3 ] = [0, 0, 0]. 23. Substitute x4 = 1 from the fourth equation into the third equation, giving x3 − 1 = 0, so that x3 = 1. Substitute these values into the second equation, giving x2 + 1 + 1 = 0 and thus x2 = −2. Finally, substitute into the first equation: x1 + x2 − x3 − x4 = 1 ⇒ x1 + (−2) − 1 − 1 = 1 ⇒ x1 − 4 = 1 ⇒ x1 = 5. The unique solution is [x1 , x2 , x3 , x4 ] = [5, −2, 1, 1]. 24. Let z = t, so that y = 2t − 1. Substitute for y and z in the first equation, giving x − 3y + z = 5 ⇒ x − 3(2t − 1) + t = 5 ⇒ x − 5t + 3 = 5 ⇒ x = 5t + 2. This system has an infinite number of solutions: [x, y, z] = [5t + 2, 2t − 1, t]. 25. Working forward, we start with x = 2. Substitute that into the second equation to get 2 · 2 + y = −3, or 4 + y = −3, so that y = −7. Finally, substitute those two values into the third equation: −3x − 4y + z = −10 ⇒ −3 · 2 − 4 · (−7) + z = −10 ⇒ 22 + z = −10 ⇒ z = −32. The unique solution is [x, y, z] = [2, −7, −32]. 26. Working forward, we start with x1 = −1. Substituting into the second equation gives − 21 ·(−1)+x2 = 5, or 21 + x2 = 5. Thus x2 = 92 . Finally, substitute those two values into the third equation: 3 3 9 15 1 x1 + 2x2 + x3 = 7 ⇒ · (−1) + 2 · + x3 = 7 ⇒ + x3 = 7 ⇒ x3 = − . 2 2 2 2 2   The unique solution is [x1 , x2 , x3 ] = −1, 29 , − 12 . 27. As in Example 2.6, we create the augmented matrix from the coefficients:   1 −1 0 2 1 3 28. As in Example 2.6, we create the augmented matrix from the coefficients:   2 3 −1 1  1 0 1 0 −1 2 −2 0 29. As in Example 2.6, we create the augmented matrix from the coefficients:   1 5 −1 −1 1 −5 2 4 4 30. As in Example 2.6, we create the augmented matrix from the coefficients:   1 −2 0 1 2 −1 1 −1 −3 1

56

CHAPTER 2. SYSTEMS OF LINEAR EQUATIONS

31. There are three variables, since there are three columns not counting the last one. Use x, y, and z. Then the system becomes y+z=1 x−y =1 2x − y + z = 1 . 32. There are five variables, since there are five columns not counting the last one. Use x1 , x2 , x3 , x4 , and x5 . Then the system becomes x1 − x2 + 3x4 + x5 = 2 x1 + x2 + 2x3 + x4 − x5 = 4 x2 + 2x4 − 3x5 = 0 . 33. Add the first equation to the second, giving 3x = 3, so that x = 1. Substituting into the first equation gives 1 − y = 0, so that y = 1. The unique solution is [x, y] = [1, 1]. 34. As in Example 2.6, we perform row operations on the augmented matrix of this system (from Exercise 28) to get it into a form where we can read off the solution. 

2 3  1 0 −1 2

−1 1 −2

  1 0 1 1 R1 ↔R2  0 − 2 3 −1 −−−−→ 0 −1 2 −2  1 0 1  . . . 0 1 −1 0 2 −1

   1 0 1 0 1 0 R −2R1 0 3 −3 1 −− 3 R2 . . . 1 −−2−−−→ −→ R3 +R1 0 − −−−−→ 0 2 −1 0    R1 −R3  0 0 −−− 1 0 1 −−→ 1 0 0  1 1  R2 +R3  0 1 −1  −−−−→ 0 1 0 3 3 − R3 −2R2 2 0 −−−−−→ 0 0 1 −3 0 0 1

From this we get the unique solution [x1 , x2 , x3 ] =

2

3,



2 3  − 31  . − 32

 − 31 , − 23 .

35. Add the first two equations to give 6y = −6, so that y = −1. Substituting back into the second equation gives −x − 1 = −5, so that x = 4. Further, 2 · 4 + 4 · (−1) = 4, so this solution satisfies the third equation as well. Thus the unique solution is [x, y] = [4, −1]. 36. As in Example 2.6, we perform row operations on the augmented matrix of this system (from Exercise 30) to get it into a simpler form. 

1 −1

−2 0 1 1 −1 −3

  2 1 R2 +R1 1 − 0 −−−−→

−2 −1

 0 1 2 −R2 . . . −1 −2 3 − −−→   R1 +2R2  1 −2 0 1 2 − −−−−→ 1 ... 0 1 1 2 −3 0

0 1

2 1

5 2

 −4 . −3

This augmented matrix corresponds to the linear system a

+ 2c + 5d = −4 b + c + 2d = −3

so that, letting c = s and d = t, we get a = −4 − 2s − 5t and b = −3 − s − 2t. So a complete set of solutions is parametrized by [a, b, c, d] = [−4 − 2s − 5t, −3 − s − 2t, s, t]. 37. Starting with the linear system given in Exercise 31, add the first equation to the second and to the third, giving x+ z=2 2x + 2z = 2 . Subtracting twice the first of these from the second gives 0 = −2, which is impossible. So this system is inconsistent and has no solution.

2.1. INTRODUCTION TO SYSTEMS OF LINEAR EQUATIONS

57

38. As in Example 2.6, we perform row operations on the augmented matrix of this system (from Exercise 32) to get it into a simpler form.  1 1 0

   1 −1 0 3 1 2 −1 0 3 1 2 R2 −R1  R2 ↔R3 1 2 1 −1 4 − 2 2 −2 −2 2 − −−−−→ 0 −−−−→ . . . 1 0 2 3 0 0 1 0 2 3 0     1 −1 0 3 1 2 1 −1 0 3 1 2 0 1 0 2 3 0 1 0 2 3 0 1 . . . 0 ... R3 −2R2 2 R3 0 2 2 −2 −2 2 − 0 0 2 −6 −8 2 −−−−→ −−−→  R1 +R2   5 4 1 −1 0 3 1 2 −−−−−→ 1 0 0 0 1 0 1 0 2 3 0 2 3 . . . 0 0 0 1 −3 −4 1 0 0 1 −3 −4

 2 0 1

Then setting x4 = s and x5 = t, we have x1

+ 5s + 4t = 2 x2 + 2s + 3t = 0 x3 − 3s − 4t = 1 ,

x1 = 2 − 5s − 4t so that

x2 = −2s − 3t x3 = 1 + 3s + 4t.

So the general solution to this system has parametric form [x1 , x2 , x3 , x4 , x5 ] = [2 − 5s − 4t, −2s − 3t, 1 + 3s + 4t, s, t] . 39. (a) Since t = x and y = 3 − 2t, we can substitute x for t in the equation for y to get y = 3 − 2x, or 2x + y = 3. (b) If y = s, then substituting s for y in the general equation above gives  2x + s = 3, or 2x = 3 − s, so that x = 23 − 12 s. So the parametric solution is [x, y] = 32 − 21 s, s . 40. (a) Since x1 = t, substitute x1 for t in the second and third equations, giving x2 = 1 + x1 and x3 = 2 − x1 . This gives the linear system −x1 + x2 =1 x1 + x3 = 2 . (b) Substitute s for x3 into the second equation, giving x1 + s = 2, so that x1 = 2 − s. Substitute that value for x1 into the first equation, giving −(2 − s) + x2 = 1, or x2 = 3 − s. So the parametric form given in this way is [x1 , x2 , x3 ] = [2 − s, 3 − s, s]. 41. Let u = x1 and v = y1 . Then the system of equations becomes 2u + 3v = 0 and 3u + 4v = 1. Solving the second equation for v gives v = 41 − 43 u. Substitute that value into the first equation, giving   1 3 1 3 2u + 3 − u = 0, or − u = − , so u = 3. 4 4 4 4     Then v = 41 − 34 · 3 = −2, so the solution is [u, v] = [3, −2], or [x, y] = u1 , v1 = 13 , − 21 . 42. Let u = x2 and v = y 2 . Then the system of equations becomes u + 2v = 6 and u − v = 3. Subtracting the second of these from the first gives 3v = 3, so that v = 1. Substituting that into the second equation gives u − 1 = 3, so that u = 4. The solution in terms of u and v is thus [u, v] = [x2 , y 2 ] = [4, 1]. Since 1 and 4 have both a positive and a negative square root, the original system has the four solutions [x, y] = [2, 1],

[−2, 1],

[2, −1],

or

[−2, −1].

58

CHAPTER 2. SYSTEMS OF LINEAR EQUATIONS

43. Let u = tan x, v = sin y, and w = cos z. Then the system becomes u − 2v = 2 u− v+w= 2 v − w = −1 . Subtract the first equation from the second, giving v + w = 0. Add that to the third equation,  so1 that 1 these into the second equation gives u− − 2v = −1 and v = − 12 . Thus w = 12 . Substituting 2 + 2 = 2,   so that u = 1. The solution is [u, v, w] = 1, − 12 , 12 , so that π x = tan−1 u = tan−1 1 = + kπ,  4 1 7π π y = sin−1 v = sin−1 − = + 2`π or − + 2`π, 2 6 6 1 π z = cos−1 w = cos−1 = ± + 2nπ, 2 3 where k, `, and n are integers. 44. Let r = 2a and s = 3b . Then the system becomes −r + 2s = 1 and 3r − 4s = 1. Adding three times the first equation to the second gives 2s = 4, so that s = 2. Substituting that back into the first equation gives −r + 4 = 1, so that r = 3. So the solution is [r, s] = [2a , 3b ] = [3, 2]. In terms of a and b, this gives [a, b] = [log2 3, log3 2].

2.2

Direct Methods for Solving Linear Systems

1. This matrix is not in row echelon form, since the leading entry in row 2 appears to the right of the leading entry in row 3, violating condition 2 of the definition. 2. This matrix is in row echelon form, since the row of zeros is at the bottom and the leading entry in each nonzero row is to the left of the ones in all succeeding rows. It is not in reduced row echelon form since the leading entry in the first row is not 1. 3. This matrix is in reduced row echelon form: the leading entry in each nonzero row is 1 and is to the left of the ones in all succeeding rows, and any column containing a leading 1 contains zeros everywhere else. 4. This matrix is in reduced row echelon form: all zero rows are at the bottom, and all other conditions are vacuously satisfied (that is, they are true since no rows or columns satisfy the conditions). 5. This matrix is not in row echelon form since it has a zero row, but that row is not at the bottom. 6. This matrix is not in row echelon form since the leading entries in rows 1 and 2 appear to the right of leading entries in succeeding rows. 7. This matrix is not in row echelon form since the leading entry in row 1 does not appear to the left of the leading entry in row 2. It appears in the same column, but that is not sufficient for the definition. 8. This matrix is in row echelon form, since the zero row is at the bottom and the leading entry in each row appears in a column to the left of the leading entries in succeeding rows. However, it is not in reduced row echelon form since the leading entries in rows 1 and 3 are not 1. (Also, it is not in reduced row echelon form since the third column contains a leading 1 but also contains other nonzero entries). 9. (a)  0 0 1

0 1 1

  1 1 R1 ↔R2  1 − −−−−→ 0 1 0

1 1 0

 1 1 · · · 1

2.2. DIRECT METHODS FOR SOLVING LINEAR SYSTEMS

59

(b) R1 −R3  −−− −−→ 1 R2 −R3 0 ···− −−−−→ 0

1 1 0

 R1 −R2  0 − −−−−→ 1 0 0 1 0

 0 0 1

0 1 0

10. (a) "

4 2

" # 3 R2 ↔R1 2 −−−−−→ 4 1

# 1 " 2 R1 1 −− −→ 1 3 4

1 2

"

#

1 R −4R1 3 −−2−−−→ 0

1 2

1

# ···

(b) " R1 − 12 R2 1 − − − − − − → ··· 0

0 1

#

11. (a)  3  5 2

 1 R1  2 4 −− −→ 1   −2 5 5 3

  2 5  R ↔R3  −2 −−1−−−→ 5 3 4

  1 2  R −5R1  −2 −−2−−−→ 0 R3 −3R1 5 − −−−−→ 0

 2  −12 · · · −1

(b)  R −2R2  1 2 −−1−−−→   1 0 R3 +R2 −1 − −−−−→ 0

 1 1 − 12 R2  ···− −−−→ 0 0

 0  1 0

12. (a)  2 3

 1 R1  2 −4 −2 6 −− −→ 1 1 6 6 3

  −2 −1 3 1 R2 −3R1 1 6 6 − −−−−→ 0

 −2 −1 3 ··· 7 9 −3

(b) " 1 · · · 71 R2 −−−→ 0

" # R +2R2 1 0 3 −−1−−−→ − 73 0 1

−2 −1 9 1 7

11 7 9 7

15 7 − 73

#

13. (a)  3 2 4

−2 −1 −3

    3 −1 3 −2 −1 3R2 R2 −2R1  6 −3 −3 − −1 −−→ −−−−→ 0 −3R3 R3 +4R1 −1 − 9 3 − −−→ −12 −−−−→ 0

  −2 −1 3 0 1 −1 R3 −R2 1 −1 − −−−−→ 0

 −2 −1 1 −1 · · · 0 0

(b)  R +2R2 3 −−1−−−→ 0 ··· 0

0 1 0

 1 R1  3 −3 −− −→ 1  0 −1 0 0

0 1 0

 −1 −1 0

14. (a)  −2 −3 1

  −4 7 1 R1 ↔R3  −6 10 − −3 −−−−→ 2 −3 −2

  2 −3 1 R +3R1 0 −6 10 −−2−−−→ R3 +2R1 −4 7 − −−−−→ 0

(b)  R −3R2 1 −−1−−−→ 0 ··· 0

2 0 0

 0 1 0

2 0 0

  −3 1 0 1 R3 −R2 1 − −−−−→ 0

2 0 0

 3 1 · · · 0

60

CHAPTER 2. SYSTEMS OF LINEAR EQUATIONS

15.  1 0  0 0

  1 2 2 −4 −4 5 0 −1 −1 10 9 −5   0 0 0 1 1 −1 R4 +29R3 0 0 0 0 0 24 − −−−−−→  1 2 0 −1  0 0 R −3R2 0 3 −−4−−−→  1 2 R2 +2R1 −−−−−→  2 4  R +2R1  2 3 −−3−−−→ R4 −R1 −1 1 −−−−−→

   1 2 −4 −4 5 −4 −4 5  9 −5 10 9 −5   8R 0 −1 10 3  0 0 8 8 −8 1 1 −1 − −→ 0 0 29 29 −5 29 29 −5    1 2 −4 −4 5 −4 −4 5  0 8 8 −8 10 9 −5 R3 ↔R2 0  − − − − − → 0 −1 10 9 −5 8 8 −8 0 3 −1 2 10 −1 2 10  −4 −4 5 0 0 2  2 1 5 3 6 5

undoes itself; k1 Ri undoes kR1 ; and Ri − kRj undoes Ri + kRJ .     − 1 R1  1  R1 + 7 R2  2 2 3 −1 2 1 2 − − 2 −1 − − − − → − − − − − → = B. So A and B are row equivalent, R2 −2R1 1 0 4 − 1 0 −−−−→ 1 0 and we can convert A to B by the sequence R2 − 2R1 , − 12 R1 , and R1 + 72 R2 .

16. Ri ↔ Rj  1 17. A = 3

18. 

2 A= 1 −1

0 1 1

 R1 +R2  3 −1 − −−−−→  1 0 −1 1

   3 1 −1 −1 R +2R 3  0 −−2−−−→ −1 3 2 1 −1 1 1     3 3 1 −1 3 1 −1 1 R + R R2 +R1  2 −1 3 2 3 3 2 − 1 −−− −−−−→ 2 4 − − − → R3 +R1 2 2 2 0 2 2 0 −−− −−→ 1 1 1 

1 5 2

 −1 1 = B. 0

Therefore the matrices A and B are row equivalent. 19. Performing R2 + R1 gives a new second row R20 = R2 + R1 . Then performing R1 + R2 gives a new first row with value R1 + R20 = 2R1 + R2 . Only one row operation can be performed at a time (although in a sequence of row operations, we may perform more than one in one step as long as any rows affected do not appear in other transformations in that step, just for efficiency).       R1 −R2   −R1   x1 x1 −x2 −x2 − − − − − − → −−→ x2 . The net effect is to interchange the 20. R2 +R1 R2 +R1 x2 − x + x x + x x x1 2 1 2 − 1 −−−−→ 1 −−−−→ first and second rows. So this sequence of elementary row operations is the same as R1 ↔ R2 . 21. Note that 3R2 − 2R1 is not an elementary row operation since it Ri + kRj (it is not of the third form since the coefficient of R2 is      3 1 3 1 3 R2 − 32 R1 3R2 10 2 4 − 0 −−−−−→ 3 −−→ 0

is not of the form Ri ↔ Rj , kRi , or 3, not 1). However,  1 , 10

so this is in fact a composition of elementary row operations. 22. We can create a 1 in the top left corner as follows:       1 R1  3 3 2 R1 ↔R2 1 4 3 2 −− −→ 1 , 1 4 −−−−−→ 3 2 1 4 1

2 3

4

 ,

 3 1

 R1 −2R2  2 − −−−−→ 1 1 4

 −6 . 4

The first method, simply interchanging the two rows, requires the least computation and seems simplest. Our next choice would probably be the third method, since at least it produces integer results.

2.2. DIRECT METHODS FOR SOLVING LINEAR SYSTEMS

61

23. The rank of a matrix is the number of nonzero rows in its row echelon form. (1) Since A is not in row echelon form,  1 0 0 (2) (3) (4) (5)

we first put it in row echelon form:    0 1 1 0 1 R2 ↔R3  0 3 − −−−−→ 0 1 0 . 1 0 0 0 3

Since the row echelon form has three nonzero rows, rank A = 3. A is already in row echelon form, and it has two nonzero rows, so rank A = 2. A is already in row echelon form, and it has two nonzero rows, so rank A = 2. A is already in row echelon form, and it has no nonzero rows, so rank A = 0. Since A is not in row echelon form, we first put it in row echelon form:     1 0 3 −4 0 1 0 3 −4 0 R2 ↔R3  0 0 0 0 0 − 0 1 . −−−−→ 0 1 5 0 1 5 0 1 0 0 0 0 0

Since the row echelon form has two nonzero rows, rank A = 2. (6) Since A is not in row echelon form, we first put it in row echelon form:     0 0 1 1 0 0 R1 ↔R3  0 1 0 − −−−−→ 0 1 0 1 0 0 0 0 1 Since the row echelon form has three nonzero rows, rank A = 3. (7) Since A is not in row echelon form, we first put it in row echelon form:       1 2 3 1 2 3 1 2 3 0 −2 −3 1 0 0       R2 −R1 0 −2 −3    −−−−−→   R3 + 21 R2  0 1 1 0 0 − 12  1 1 −−−−−−→ 0 0

0

1

0

0

1

0

0

 1 0   0 R +2R3 1 −−4−−−→ 0

 2 3 −2 −3  . 0 − 12  0 0

Since the row echelon form has three nonzero rows, rank A = 3. (8) A is already in row echelon form, and it has three nonzero rows, so rank A = 3. 24. Such a matrix has 0, 1, 2, or 3 nonzero rows.   0 0 0 • If it has no nonzero rows, then all rows are zero, so the only such matrix is 0 0 0. 0 0 0 • If it has one nonzero row, that must be the first row, and any entries in that row following the 1 are arbitrary, so the possibilities are       1 ∗ ∗ 0 1 ∗ 0 0 1 0 0 0 , 0 0 0 , 0 0 0 , 0 0 0 0 0 0 0 0 0 where ∗ denotes that any entry is allowed. • If it has two nonzero rows, then the third row is zero. There are several possibilities for the arrangements of 1s in the first two rows: they could be in columns 1 and 2, 1 and 3, or 2 and 3. In each case, the entries following each 1 are arbitrary, except for the column in the first row above the 1 in the second row. This gives the possibilities       1 0 ∗ 1 ∗ 0 0 1 0 0 1 ∗ , 0 0 1 , 0 0 1 . 0 0 0 0 0 0 0 0 0

62

CHAPTER 2. SYSTEMS OF LINEAR EQUATIONS  1 • If the matrix has all nonzero rows, then the only possibility is 0 0 entries must be zero since each column has a one in it.

0 1 0

 0 0. Note that all other 1

25. First row-reduce the augmented matrix of the system:  1  2 4

2 −3 −1 1 −1 1

  1 9  R −2R1  0−−2−−−→ 0 R3 −4R1 4 − −−−−→ 0  1  0 R3 +9R2 −−−−−→ 0 R +3R3  −−1−−−→ 1 R2 + 57 R3  −−−−−−→ 0 0

  9 1  − 51 R2  −18 −−−  −→ 0 0 −32   2 −3 1 2 9  18  7 1 −5 5  0 1 5 2 2 2 R3 0 −−→ 0 0 5 5 −  R −2R  2 2 0 12 −−1−−−→ 1 0   1 0 5 0 1 0 1 0 0 1 2 −3 −5 7 −9 13

2 −3 1 − 75 −9 13  −3 9  − 57 18 5  1 1  0 2  0 5 0 1

 9 18  5  −32

    x1 2 Thus the solution is x2  = 5. 1 x3 26. First row-reduce the augmented matrix of the system: 

1  −1 3

−1 1 3 1 1 7

  0 1  R2 +R1  5 −−−−−→ 0 R3 −3R1 2 − −−−−→ 0

−1 1 2 2 4 4

  0 1  12 R2  5 −−−→ 0 2 0

−1 1 1 1 4 4

  0 1  5 0 2 R3 −4R2 2 −−−−−→ 0

−1 1 1 1 0 0

 0 5 2 −8

This is an inconsistent system since the bottom row of the matrix is zero but the constant is nonzero. Thus the system has no solution. 27. First row-reduce the augmented matrix of the system: 

1 −1 2

−3 −2 2 1 4 6

   0 1 −3 −2 0 R2 +R1 −R2 0 −−−−−→ 0 −1 −1 0 − −−→ R −2R 3 1 0 −−−−−→ 0 10 10 0    1 −3 −2 0 1 0 0 1 1 0 R3 −10R2 0 10 10 0 − −−−−−→ 0

−3 −2 1 1 0 0

 R1 +3R2  0 − −−−−→ 1 0 0 0 0

0 1 0

1 1 0

 0 0 0

Since there is a row of zeros, and no leading 1 in the third column, we know that x3 is a free variable, say x3 = t. Back substituting gives x2 + t = 0 and x1 + t = 0, so that x1 = x2 = −t. So the solution is       x1 −t −1 x2  = −t = t −1 . x3 t 1

2.2. DIRECT METHODS FOR SOLVING LINEAR SYSTEMS

63

28. First row-reduce the augmented matrix of the system:   1 R1     3 3 1 2 2 3 −1 4 1 −− 1 − 12 2 12 − 21 2 −→ 1 2 2 2     R −3R1   3 0 1 1 −−2−−−→ 0 1 1 −5 − 12  3 −1 0 − 11 3 −1 2 2 R3 −3R1 17 5 1 3 −4 1 −1 2 3 −4 1 −1 2 − −7 −−−−→ 0 − 2 2 2    3 1 1 − 12 2 1 32 − 21 2 2 2 2  − 11 R2  3 1  3 10 10 0 1 − 0 1 −    −−−−→ 11 11 11 11 11 R3 + 17 5 2 8 1 2 R2 −7 0 − 17 0 0 − − − − − − → 2 2 2 11 11 This matrix is in row echelon form, but we can continue, putting it in reduced row 1  R1 − 3 R2   R1 + 11  R3  1 7 1 4 2 1 23 − 12 2 −−−− −→ 1 0 − 11 11 11 −−−−−−→ 1 2 −   3 3 3 10 1  10 1  R2 + 11 R3  ··· 0 1 − 11 0 1 − 11 11 11  11 11  −−−−−−→ 0 11 R 3 2 1 4 7 0 0 1 4 7 0 −− −→ 0 0



1 2 1  11  14 11

echelon form:  0 0 1 1  1 0 2 2 0 1 4 7

Using z = t as the parameter gives w = 1 − t, x = 2 − 2t, y = 7 − 4t, z = t for the solution, or         −1 1 1−t w −2  x  2 − 2t 2       =  y  7 − 4t = 7 + t −4 . 1 0 t z 29. First row-reduce the augmented matrix of the system:     1 R1    1 1 3 3 2 2 1 1 1 3 −− −→ 1 2 2 2 2     R2 −4R1   −R2  − − − − − → 4 1 7 4 1 7 0 −1 1       −−−→ 0 R3 −2R1 2 5 −1 2 5 −1 −−−−−→ 0 4 −4 0     x 2 The solution is 1 = . x2 −1

1 2

1 4

R1 − 21 R2  −−−−− −→ 1   −1 0 R −4R 2 −4 −−3−−−→ 0 3 2



0 1 0

 2  −1 0

−3 0 0 0 1 −2 0 0 0

 −2 1 0

30. First row-reduce the augmented matrix of the system:  −1  2 1

3 −2 4 −6 1 −2 −3 4 −8

 −R1  −−−→ 1 0 R2 +2R1  −3 − −−−−→ 0 R3 +R1 2 − −−−−→ 0

 1  0 −−−−→ 0 − 13 R2

−3 2 −4 0 1 −2 0 2 −4

−3 2 −4 0 −3 6 0 2 −4

 0 −3 2

  −2 1 0 1 R3 −2R2 2 − −−−−→ 0

−3 2 −4 0 1 −2 0 0 0

 −2 1 0

 R −2R2 1 −−1−−−→ 0 0 The system has been reduced to x1 − 3x2 = −2 x3 − 2x4 =

1.

x2 and x4 are free variables; setting x2 = s and x4 = t and back-substituting gives for the solution           x1 3s − 2 −2 3 0 x2   s   0 1 0  =        x3   2t + 1  =  1 + s 0 + t 2 . x4 t 0 0 1

64

CHAPTER 2. SYSTEMS OF LINEAR EQUATIONS

31. First row-reduce the augmented matrix of the system: 

1 2 1 6 1 3

1 1 2

0

−1 0 −2

−6 0 −3 1 0 −4

 2R1  2 −−→ 1  6R2  −1− −→ 1 3R3 8 − −→ 1

2 3 0

 1 −−−−−→ 0 R3 −R1 −−− −−→ 0  1 0 R +2R2 0 −−3−−−→  R1 −2R2 −−−−−→ 1 0 0 R2 −R1

−2 −12 0 0 −18 6 −6 0 −12

 4 −6 24

 4 −10 20  0 4 6 −10 0 0  −12 24 6 −10 0 0

2 −2 −12 0 1 2 −6 6 −2 −4 12 −12 2 1 0

−2 −12 2 −6 0 0

0 1 0

−6 0 2 −6 0 0

Since the matrix has rank 2 but there are 5 variables, there are three free variables, say x3 = r, x4 = s, and x5 = t. Back-substituting gives x1 − 6r − 12t = 24 and x2 + 2r − 6s + 6t = −10, so that x1 = 24 + 6r + 12t and x2 = −10 − 2r + 6s − 6t. Thus the solution is           12 0 x1 6 24 −6 6 x2  −10 −2           x3  =  0 + r  1 + s 0 + t  0 .            0 1 x4   0  0 1 0 0 0 x5 32. First row-reduce the augmented matrix of the system: √

2   0 0

1 √ 2 −1

2 −3 √ 2

 √12 R1  1 −−−−→ 1 √  1  − 2 √2 R2 0 −−−−→ 1 0





2 2

1 −1

 1 0  0 1  √ 2 R3 −−−−→ 0 0



R1 −

√ 2 2 R2

 −−−−−−−→ 1   −1 0 R +R 3 2 1 −−−−−→ 0 2 2

2 √

−322 √ 2 √

3 2+ √ 2 3 2 − 2

1



0 1 0



3 2+ √ 2 −√3 2 2 2 2

√  2  −1 0

√ √  R1 −( 2+ 23 )R3  2 −−−−−−−−−−→ 1 0 0  R2 + 3√2 R3  0 1 0 −1 − −−−−2−−→  0 0 1 0

√  2  −1 0

  1 1  −1 R2 ↔R3 0 − −1 −−−−→ 0 1 0   1 1 0 −1   0 −3 R4 −2R3 1 −−−−−→ 0

 1 −1  −1 1  1 −1  −3 7

33. First row-reduce the augmented matrix of the system:  1 1  0 1

1 2 1 −1 −1 1 1 1 0 1 0 1

  1 1 2 1 R −R 2 1  0 −−−−−→ 0 −2 −3 0 −1 1 1 R −R 4 1 2 −−−−−→ 0 0 −2  1 1 2 0 1 1  R +2R2  0 0 −1 −−3−−−→ 0 0 −2

1 0 0 0 1 0 0 0

1 2 1 1 1 0 −2 −3 0 0 −2 0 1 1 0 0

2 1 1 0 −1 0 0 0

We need not proceed any further — since the last row says 0 = 7, we know the system is inconsistent and has no solution.

2.2. DIRECT METHODS FOR SOLVING LINEAR SYSTEMS

65

34. First row-reduce the augmented matrix of the system:  1 1 1 1 1 2 3 4  1 3 6 10 1 4 10 20

  1 1 1 1 4 R −R 2 1 0 1 2 3 −−− −−→ 10  R −R  3 1  20 − 0 2 5 9 −− −−→ R4 −R1 35 − 0 3 9 19 −−−−→

 1 1 1 1 0 1 2 3  0 0 1 3 R4 −3R3 −−−−−→ 0 0 0 1

  1 1 1 4 0 1 2 6  R −2R2  0 0 1 16 −−3−−−→ R4 −3R2 31 − −−−−→ 0 0 3

 R1 −R4  4 −−−−−→ 1 R2 −3R4  6  −−−−−→ 0 R3 −3R4 0 4 − −−−−→ 1 0

1 1 0 0

1 2 1 0

0 0 0 1

1 3 3 10

 4 6  4 13

 R1 −R3  −−→ 1 1 0 3 −−− R2 −2R3 0 1 0 3  −−−−−→  0 0 1 1 1 0 0 0  R1 −R2 −−− −−→ 1 0  0 0

 2 1  1 1

0 0 0 1 0 1 0 0

0 0 1 0

0 0 0 1

 1 1  1 1

    1 a  b  1    so that   c  = 1 is the unique solution. 1 d 35. If we reverse the first and third rows, we get a system in row echelon form with a leading 1 in every row, so rank A = 3. Thus this is a system of three equations in three variables with 3 − 3 = 0 free variables, so it has a unique solution. 36. Note that R3 −2R2 results in the row 0 0 0 0 | 2, so that the system is inconsistent and has no solutions. 37. This system is a homogeneous system with four variables and only three equations, so the rank of the matrix is at most 3 and thus there is at least one free variable. So this system has infinitely many solutions. 38. Note that R3 ← R3 − R1 − R2 gives a zero row, and then R2 ← R2 − 6R1 gives a matrix in row echelon form. So there are three free variables, and this system has infinitely many solutions. 39. It suffices to show that the row-reduced matrix has no zero rows, since it then has rank 2, so there are no free variables and no possibility of an inconsistent system. First, if a = 0, then since ad − bc = −bc 6= 0, we see that both b and c are nonzero. Then row-reduce the augmented matrix:     0 b r R1 ↔R2 c d s . c d s −−−−−→ 0 b r Since b and c are both nonzero, this is a consistent system with a unique solution. Second, if c = 0, then since ad − bc = ad 6= 0, we see that both a and d are nonzero. Then the augmented matrix is   a b r . 0 d s This is row-reduced since both a and d are nonzero, and it clearly has rank 2, so this is a consistent system with a unique solution. Finally, if a 6= 0 and c 6= 0, then row-reducing gives  a c

b d

 cR1  r −−→ ac bc aR2 s − ac ad −→

  rc ac bc R2 −R1 as − 0 ad − bc −−−−→

 rc . as

Since ac 6= 0 and also ad − bc 6= 0, this is a rank 2 matrix, so it has a unique solution.

66

CHAPTER 2. SYSTEMS OF LINEAR EQUATIONS

40. First reduce the augmented matrix of the given system to row echelon form:  k 2

2 −4

  3 R1 ↔R2 2 −6 −−−−−→ k

−4 2

  −6 2 1 R2 − 2 kR1 3 − −−−−−→ 0

−4 2 + 2k

−6 3 + 3k



(a) The system has no solution when this matrix has a zero row with a corresponding nonzero constant. Now, 2 + 2k = 0 for k = −1, in which case 3 + 3k = 0 as well. Thus there is no value of k for which the system has no solutions. (b) The system has a unique solution when rank A = 2, which happens when the bottom row is nonzero, i.e., for k = 6 −1. (c) The system has an infinite number of solutions when A has a zero row with a zero constant. From part (a), this happens when k = −1. 41. First row-reduce the augmented matrix of the system: 

1 k

k 1

  1 1 R2 −kR1 1 − −−−−→ 0

k 1 − k2

1 1−k



(a) If 1 − k 2 = 0 but 1 − k 6= 0, this system will be inconsistent and have no solution. 1 − k 2 = 0 for k = 1 and k = −1; of these, only k = −1 produces 1 − k 6= 0. Thus the system has no solution for k = −1. (b) The system has a unique solution when 1 − k 2 6= 0, so for k 6= ±1. (c) The system has infinitely many solutions when the last row is zero, so when 1 − k 2 = 1 − k = 0. This happens for k = 1. 42. First reduce the augmented matrix of the given system to row echelon form:  1 1 2

−2 3 1 1 −1 4

  2 1 R2 −R1 −−→ 0 k  −−− R3 −2R1 k2 − −−−−→ 0

−2 3 3 −2 3 −2

  1 2 0 k − 2 R3 −R2 k2 − 4 − −−−−→ 0

−2 3 3 −2 0 0

 2 k − 2 k2 − k − 2

(a) The system has no solution when the matrix has a zero row with a corresponding nonzero constant. This occurs for k 2 − k − 2 6= 0, i.e. for k 6= 2, −1. (b) Since the augmented matrix has a zero row, the system does not have a unique solution for any value of k; z is a free variable. (c) The system has an infinite number of solutions when the augmented matrix has a zero row with a zero constant. This occurs for k 2 − k − 2 = 0, i.e. for k = 2, −1. 43. First reduce the augmented matrix of the given system to row echelon form: 

1 1 k

1 k 1

k 1 1

  1 1 1 R2 −R1 −−→ 0 k − 1 1  −−− R3 −kR1 −2 − −−−−→ 0 1 − k

k 1−k 1 − k2

  1 1 0 0  R3 +R2 −2 − k − −−−−→ 0

1 k k−1 1−k 0 2 − k − k2

 1 0  −2 − k

(a) The system has no solution when the matrix has a zero row with a corresponding nonzero constant. This occurs when 2 − k − k 2 = (1 − k)(2 + k) = 0 but −2 − k 6= 0. For k = 1, −2 − k = −3 6= 0, but for k = −2, −2 − k = 0. Thus the system has no solution only when k = 1. (b) The system has a unique solution when 2 − k − k 2 6= 0; that is, when k 6= 1 and k 6= −2. (c) The system has an infinite number of solutions when the augmented matrix has a zero row with a zero constant. This occurs when 2 − k − k 2 = 0 and −2 − k = 0. From part (a), we see that this is exactly when k = −2.

2.2. DIRECT METHODS FOR SOLVING LINEAR SYSTEMS

67

44. There are many examples. For instance, (a) The following system of n equations in n variables has infinitely many solutions (for example, choose k and set x1 = k and x2 = −k, with the other xi = 0): x1 + x2 + · · · + xn = 0 2x1 + 2x2 + · · · + 2xn = 0 .. . nx1 + nx2 + · · · + nxn = 0 . Let m be any integer greater than n. Then the following system of m equations in n variables has infinitely many solutions (for example, choose k and set x1 = k and x2 = −k, with the other xi = 0): x1 + x2 + · · · + xn = 0 2x1 + 2x2 + · · · + 2xn = 0 .. . mx1 + mx2 + · · · + mxn = 0 . (b) The following system of n equations in n variables has a unique solution: x1 = 0 x2 = 0 .. . xn = 0 . Let m be any integer greater than n. Then a system whose first n equations are as above, and which continues with equations x1 + x2 = 0, x1 + 2x2 = 0, and so forth until there are m equations clearly also has a unique solution. 45. Observe that the normal vectors to the planes, which are [3, 2, 1] and [2, −1, 4], are not parallel, so the two planes will indeed intersect in a line. The line of intersection is the points in the solution set of the system 3x + 2y + z = −1 2x − y + 4z =

5

Reduce the corresponding augmented matrix to row echelon form: " 3 2

2 1 −1 4

# 1 " 3 R1 −1 −− −→ 1 5 2

2 3

1 3

−1

4

# " 2 − 13 1 3 R −2R1 5 −−2−−−→ 0 − 73 " 1 1 23 3 3 − 7 R2 10 0 1 − −−−−→ 7

1 3 10 3

− 13

#

17 3

# " R − 23 R2 − 13 −−1−−− −→ 1 0 − 17 0 1 7

Thus z is a free variable; we set z = 7t to eliminate some fractions. Then 9 x = −9t + , 7

y = 10t −

17 , 7

so that the line of intersection is       9 x −9 7    17    y  = − 7  + t  10 . z 0 7

9 7 − 10 7

9 7 − 17 7

#

68

CHAPTER 2. SYSTEMS OF LINEAR EQUATIONS

46. Observe that the normal vectors to the planes, which are [4, 1, 1] and [2, −1, 3], are not parallel, so the two planes will indeed intersect in a line. The line of intersection is the points in the solution set of the system 4x + y + z = 0 2x − y + 3z = 2 Reduce the corresponding augmented matrix to row echelon form: " 4 2

1 1 −1 3

# 1 " 4 R1 0 −− −→ 1 2 2

1 4

1 4

−1

3

" # 1 0 R −2R1 2 −−2−−−→ 0

1 4 − 32

" 1 − 32 R2 −−−−→ 0

1 4 5 2 1 4

1

# 0 2 1 4 − 35

# " R − 14 R2 0 −−1−−− −→ 1 − 34 0

0 1

2 3 − 53

1 3 − 43

#

Letting the parameter be z = t, we get the parametric form of the solution: x=

1 2 − t, 3 3

4 5 y = − + t, 3 3

z = t.

47. We are looking for examples, so we should keep in mind simple planes that we know, such as x = 0, y = 0, and z = 0. (a) Starting with x = 0 and y = 0, these planes intersect when x = y = 0, so they intersect on the z-axis. So we must find another plane that contains the z-axis. Since the line y = x in R2 passes through the origin, the plane y = x in R3 will have z as a free variable, and it contains (0, 0, 0), so it contains the z-axis. Thus the three planes x = 0, y = 0, and x = y intersect in a line, the z-axis. A sketch of these three planes is:

(b) Again start with x = 0 and y = 0; we are looking for a plane that intersects both of these, but not on the z-axis. Looking down from the top, it seems that any plane whose projection onto the xy plane is a line in the first quadrant will suffice. Since x + y = 1 is such a line, the plane x + y = 0 should intersect each of the other planes, but there will be no common points of intersection. To find the line of intersection of x = 0 and x + y = 1, solve that pair of equations. Using

2.2. DIRECT METHODS FOR SOLVING LINEAR SYSTEMS

69

forward substitution we get y = 1, so the line of intersection is the line x = 0, y = 1, which is the line [0, 1, t]. To find the line of intersection of y = 0 and x + y = 1, again solve using forward substitution, giving the line x = 1, y = 0, which is the line [1, 0, t]. Clearly these two lines have no common points. A sketch of these three planes is:

(c) Clearly x = 0 and x = 1 are parallel, while y = 0 intersects each of them in a line. A sketch of these three planes is:

70

CHAPTER 2. SYSTEMS OF LINEAR EQUATIONS (d) The three planes x = 0, y = 0, and z = 0 intersect only at the origin:

48. The two lines have parametric forms p + su and q + tv, where s and t are the parameters. These two are equal when p + su = q + tv, or su − tv = q − p. Substitute the given values of the four vectors to get           1 −1 2 −1 3 s+t= 3 s  2 − t  1 = 2 −  2 =  0 ⇒ 2s − t = 0 −1 0 0 1 −1 −s = −1 . From the third equation, s = 1; substituting into the first equation gives t = 2. Since s = 1, t = 2 satisfies the second equation, this is the solution. Thus the point of intersection is     0 x y  = p + 1u = 4 . z 0 49. The two lines have parametric forms p + su and q + tv, where s and t are the parameters. These two are equal when p + su = q + tv, or su − tv = q − p. Substitute the given values of the four vectors to get           1 2 −1 3 −4 s − 2t = −4 s 0 − t 3 =  1 − 1 =  0 ⇒ − 3t = 0 1 1 −1 0 −1 s − t = −1 . From the second equation, t = 0; setting t = 0 in the other two equations gives s = −4 and s = −1. This is impossible, so the system has no solution and the lines do not intersect. 50. Let Q = (a, b, c); then we want to find values of Q such that the line q + sv intersects the line p + tu. The two lines intersect when p + su = q + tv, or su − tv = q − p.

2.2. DIRECT METHODS FOR SOLVING LINEAR SYSTEMS

71

Substitute the values of the vectors to get           1 a−1 s − 2t = a − 1 1 2 a s  1 − t 1 =  b  −  2 =  b − 2  ⇒ s − t = b − 2 −1 0 c −3 c+3 −s = c+3. Solving those equations for a, b, and c gives Q = (a, b, c) = (s − 2t + 1, s − t + 2, −s − 3). Thus any pair of numbers s and t determine a point Q at which the two given lines intersect. 51. If x = [x1 , x2 , x3 ] satisfies u · x = v · x = 0, then the components of x satisfy the system u1 x1 + u2 x2 + u3 x3 = 0 v1 x1 + v2 x2 + v3 x3 = 0. There are several cases. First, if either vector is zero, then the cross product is zero, and the statement does not hold: any vector that is orthogonal to the other vector is orthogonal to both, but is not a multiple of the zero vector. Otherwise, both vectors are nonzero. Assume that u is the vector with a lowest-numbered nonzero component (that is, if u1 = 0 but u2 6= 0, while v1 6= 0, then reverse u and v). The augmented matrix of the system is   u1 u2 u3 0 . v1 v2 v3 0 If u1 = u2 = 0 but u3 6= 0, then v1 = v2 = 0 and v3 6= 0 as well, so that u and v are parallel and thus u × v = 0. However, any vector of the form   s t 0 is orthogonal to both u and v, so the result does not hold in this case either. Next, if u1 = 0 but u2 6= 0, then v1 = 0 as well, so the augmented matrix has the form       0 u2 u3 0 0 0 0 u2 u3 0 u2 u3 . u2 R2 R −v2 R1 0 v2 v3 0 − −−→ 0 u2 v2 u2 v3 0 −−2−−− −→ 0 0 u2 v3 − u3 v2 0     0 0 Then x1 is a free variable. If u2 v3 − u3 v2 = 0, then u = u2  and v = v2  are again parallel, so u3 v3 their dot product is zero and again the result does not hold. But if u2 v3 − u3 v2 6= 0, then x3 = 0, which implies that x2 = 0 as well. So any vector that is orthogonal to both u and v in this case is of the form     x1 s x2  = 0 , x3 0 and



       0 0 u2 v3 − u3 v2 u2 v3 − u3 v2 u2  × v2  = u3 · 0 − 0 · v3  =  , 0 u3 v3 0 · v2 − 0 · u 2 0

and since u2 v3 − u3 v2 6= 0, the orthogonal vectors are multiples of the cross product. Finally, assume that u1 6= 0. Then the augmented matrix reduces as follows:      u1 u2 u3 0 u1 u2 u1 u2 u3 0 u1 R2 R2 −v1 R1 v1 v2 v3 0 − u v u v u v 0 0 u v 1 2 1 3 1 2 − u2 v1 −−→ 1 1 −−−−−−→

u3 u1 v3 − u3 v1

 0 . 0

72

CHAPTER 2. SYSTEMS OF LINEAR EQUATIONS Now consider the last row of this matrix. If it consists of all zeros, then u1 v2 = u2 v1 and u1 v3 = u3 v1 . If v1 6= 0, then uu12 = vv21 and also uu13 = vv13 , so that again u and v are parallel and the result does not hold. Otherwise, if v1 = 0, then u1 v2 = u1 v3 = 0, and u1 was assumed nonzero, so that v2 = v3 = 0 and thus v is the zero vector, which it was assumed not to be. So at least one of u1 v2 − u2 v1 and u1 v3 − u3 v1 is nonzero. If they are both nonzero, then x3 = t is free, and (u1 v2 − u2 v1 )x2 = (u3 v1 − u1 v3 )t,

so thatx2 =

u3 v1 − u1 v3 t. u1 v2 − u2 v1

But then u1 x1 + u2 x2 + u3 x3 = 0, so that   u1 (u3 v2 − u2 v3 ) u3 v1 − u1 v3 t + u3 t = u1 x1 + t = 0, u1 x1 + u2 u1 v2 − u2 v1 u1 v2 − u2 v1 and thus x1 =

u2 v3 − u3 v2 t. u1 v2 − u2 v1

Thus the orthogonal vectors are     u2 v3 −u3 v2 x1    u1 v2 −u2 v1  x2  = t  u3 v1 −u1 v3  ,    u1 v2 −u2 v1  1 x3 or, clearing fractions,     x1 u2 v3 − u3 v2     x2  = t u3 v1 − u1 v3  ,     x3 u1 v2 − u2 v1 so that orthogonal vectors are multiples of the cross product. If u1 v2 − u2 v1 6= 0 but u1 v3 − u3 v1 = 0, then again v1 = 0 forces v = 0, which is impossible. So v1 = 6 0 and thus uu13 = vv31 . Now, x3 = t is free and x2 = 0, so that u1 x1 + u3 t = 0 and then x1 = − uu31 t. Let t = (u1 v2 − u2 v1 )s; then x1 = −

u3 (u1 v2 − u2 v1 ) u3 u1 v2 u3 v3 s=− s + (u2 v1 )s = −u3 v2 s + (u2 v1 )s = (u2 v3 − u3 v2 )s. u1 u1 u1 v1

Thus orthogonal vectors are of the form     x1 u2 v3 − u3 v2 x2  = s  , 0 x3 u1 v2 − u2 v1 and since u1 v3 − u3 v1 = 0, this column vector is u × v, and again the result holds. The final case is that u1 v3 − u3 v1 6= 0 but u1 v2 − u2 v1 = 0. Again v1 = 0 forces v = 0, which is impossible, so that v1 6= 0 and thus uu12 = vv21 . Now, x2 = t is free and x3 = 0, so that u1 x1 + u2 t = 0 and then x1 = − uu21 t. Let t = (u1 v3 − u3 v1 )s; then x1 = −

u2 u1 v3 u2 v2 u2 (u1 v3 − u3 v1 ) s=− s + u3 v1 s = −u2 v3 s + u3 v1 s = (u3 v2 − u2 v3 )s. u1 u1 u1 v1

Thus orthogonal vectors are of the form     x1 u2 v3 − u3 v2 x2  = s u1 v3 − u3 v1  , x3 0 and since u1 v2 − u2 v1 = 0, this column vector is u × v, and again the result holds. Whew.

2.2. DIRECT METHODS FOR SOLVING LINEAR SYSTEMS

73

52. To show that the lines are skew, note first that they have nonparallel direction vectors, so that they are not parallel. Suppose there is some x such that x = p + su = q + tv. Then su − tv = q − p. Substituting the given values into this equation gives           1 −1 2s = −1 2 0 0 s −3 − t  6 =  1 − 1 =  0 ⇒ −3s − 6t = 0 1 −1 −1 0 −1 s + t = −1 . But the first equation gives s = − 21 , so that from the second equation t = 14 . But this pair of values does not satisfy the third equation. So the system has no solution and thus the lines do not intersect. Since they are not parallel, they are skew. Next, we wish to find a pair of parallel planes, one containing each line. Since u and v are direction vectors for the lines, those direction vectors will lie in both planes since the planes are required to be parallel. So a normal vector is given by     −3 · (−1) − 1 · 6 −3 u × v =  1 · 0 − 2 · (−1)  =  2 . 2 · 6 − (−3) · 0 12 So the planes we are looking for will have equations of the form −3x + 2y + 12z = d. The plane that passes through P = (1, 1, 0) must satisfy −3 · 1 + 2 · 1 + 12 · 0 = d, so that d = −1. The plane that passes through Q = (0, 1, −1) must satisfy −3 · 0 + 2 · 1 + 12 · (−1) = d, so that d = −10. Hence the two planes are −3x + 2y + 12z = −1 containing p + su,

−3x + 2y + 12z = −10 containing q + tv.

53. Over Z3 ,  R2 +2R1   1 2 1 − −−−−→ 1 2 1 1 2 0 2     x 0 Thus the solution is = . y 2

  1 1 2R2 1 − −→ 0

 R1 +R2  1 − −−−−→ 1 0 2 0 1

2 1

 0 . 2

54. Over Z2 ,  1 1 0 0 1 1 1 0 1

  1 1 0 0 R3 +R1 1 − −−−−→ 0

1 1 1

0 1 1

  1 1 0 0 R3 +R2 0 − −−−−→ 0

1 1 0

0 1 0

 R1 +R2  1 − −−−−→ 1 0 1 0 1 1 0 0 0 0 0

 1 0 0

Thus z = t is a free variable, and we have x + t = 1, so that x = 1 − t = 1 + t and also y + t = 0, so that y − t = t. Thus the solution is         x 1+t 0   1 y  =  t  = 0 , 1 ,   z t 0 1 since the only possible values for t in Z2 are 0 and 1. 55. Over Z3 ,  1 1 0 0 1 1 1 0 1

  1 1 1 0 0 1 1 0 R3 +2R1 1 − −−−−→ 0 2 1

 R +2R2  1 −−1−−−→ 1 0 0 R3 +R2 0 − −−−−→ 0

0 1 0

 1 0 ··· 2R3 0 − −→   R1 +R3  1 0 2 1 −−− −−→ 1 R2 +2R3 0 · · · 0 1 1 0 − −−−−→ 0 0 1 0 0 2 1 2

0 1 0

0 0 1

 1 0 0

74

CHAPTER 2. SYSTEMS OF LINEAR EQUATIONS Thus the unique solution is     x 1 y  = 0 . z 0

56. Over Z5 ,  3 2 1 4

 2R1  1 − −→ 1 1 1

4 4

  2 1 R2 +4R1 1 − −−−−→ 0

4 0

 1 . 4

Since the bottom row says 0 = 4, this system is inconsistent and has no solutions. 57. Over Z7 ,  5R1  2 1 − −→ 1 4 1 1     x 3 . Thus the solution is = y 3  3 1

  5 R2 +6R1 1 3 1 −−−−−→ 0 1

3 4

 R1 +4R2  5 − −−−−→ 1 0 3 0 1

 3 . 3

58. Over Z5 ,  1 1  2 1

0 2 2 0

0 4 0 3

4 0 1 0

  1 0 0 4 1 R +4R 2 1 −−−−−→ 0 2 4 1 3  R +3R  1  1 − 0 2 0 3 −3−−−→ R +4R 1 2 −−4−−−→ 0 0 3 1  1 0 ··· 0 0

0 2 0 0

0 4 1 3

4 1 2 1

 1 2  R +4R · · · 2 4 − −3−−−→ 1

  1 0 0 4 1 3R 2 0 1 2 3 − − → 2   0 0 1 2 2 R +2R 4 3 1 −−−−−→ 0 0 0 0

  1 0 0 4 1 R2 +3R3  1 − − − − − →  0 1 0 4 0 0 1 2 2 0 0 0 0 0

 1 2  2 0

Thus x4 = t is a free variable, and       1+t 1 − 4t x1 x2  2 − 4t  2 + t      = x3  2 − 2t = 2 + 3t . x4 t t Since t = 0, 1, 2, 3, or 4, there are five solutions:       1 2 3 2 3 4  ,  ,  , 2 0 3 0 1 2

  4 0  , 1 3

  0 1  and  4 . 4

59. The Rank Theorem, which holds over Zp as well as over Rn , says that if A is the matrix of a consistent system, then the number of free variables is n − rank(A), where n is the number of variables. If there is one free variable, then it can take on any of the p values in Zp ; since 1 = n − rank(A), the system has exactly p = p1 = pn−rank(A) solutions. More generally, suppose there are k free variables. Then k = n − rank(A). Each of the free variables can take any of p possible values, so altogether there are pk choices for the values of the free variables. Each of these yields a different solution, so the total number of solutions is pk = pn−rank(A) . 60. The first step in row-reducing the augmented matrix of this system would be to multiply by the appropriate number to get a 1 in the first column. However, neither 2 nor 3 has a multiplicative inverse in Z6 , so this is impossible. So the best we can do in terms of row-reduction is     2 3 4 2 3 4 R2 +R1 4 3 2 − −−−−→ 0 0 0

Exploration: Lies My Computer Told Me

75

Thus y is a free variable, and 2x + 3y = 4. To solve this equation, choose each possible value for y in turn: • When y = 0, y = 2, or y = 4 we get 2x = 4, so that x = 2 or x = 5. • When y = 1 or y = 3, we get 2x + 3 = 4, or 2x = 1, which has no solution. So altogether there are six solutions:     2 5 , , 0 0

  2 , 2

  5 , 2

  2 , 4

  5 . 4

Exploration: Lies My Computer Told Me 1. x+ x+

y=0 =0

−800x − 800y = 0 800x + 801y = 800



801 800 y

2. x+ y=0 x + 1.0012y = 0





x = −800 y = 800 .

−x − y=0 x + 1.0012y = 1



x = −833.33 y = 833.33 .

−x − y=0 x + 1.001y = 1



x = −1000 y = 1000 ,

3. At four significant digits, we get x+ y=0 x + 1.001y = 0



while at three significant digits, we get x+ y=0 x + 1.00y = 0



−x − y=0 x + 1.00y = 1



Inconsistent system; no solution.

4. For the equations we just looked at, the slopes of the two lines were very close (the angle between the lines was very small). That means that changing one of the slopes even a little bit will result in a large change in their intersection point. That is the cause of the wild changes in the computed output values above — rounding to five significant digits is indeed the same as a small change in the slope of the second line, and it caused a large change in the computed solution. For the second example, note that the two slopes are 7.083 ≈ 1.55602, 4.552

2.693 ≈ 1.55575, 1.731

so that again the slopes are close, so we would expect the same kind of behavior.

Exploration: Partial Pivoting 1. (a) Solving 0.00021x = 1 to five significant digits gives x =

1 0.00021

≈ 4761.9.

1 (b) If only four significant digits are kept, we have 0.0002x = 1, so that x = 0.0002 = 5000. Thus the effect of an error of 0.00001 in the input is an error of 238.1 in the result. 1 2. (a) Start by pivoting on 0.4. Multiply the first row by 0.4 = 250 to get [1, 249, 250]. Now we want to subtract 75.3 times that row from the second row. Multiplying by 75.3 and keeping three significant digits gives

75.3[1, 249, 250] = [75.3, 18749.7, 18825.0] ≈ [75.3, 18700, 18800]. Then the second row becomes (again keeping only three significant digits) [75.3, 45.3, 30] − [75.3, 18700, 18800] = [0, 18654.7, 18770] ≈ [0, 18700, 18800].

76

CHAPTER 2. SYSTEMS OF LINEAR EQUATIONS This gives y = 18800 18700 ≈ 1.00535 ≈ 1.01. Substituting back into the modified first row, which says x + 249y = 250 gives x = 250 − 251.49 ≈ 250 − 251 = −1. Substituting x = y = 1 into the original system, however, yields two true equations, so this is in fact the solution. The problem was that the large numbers in the second row, combined with the fact that we were keeping only three significant digits, overwhelmed the information from adding in the first row, so all that information was lost. (b) First interchanging the rows gives the matrix  75.3 −45.3 0.4 99.6

30 100



1 to get (after reducing to three significant digits) [1, −0.602, 0.398]. Multiply the first row by 75.3 Then subtract 0.4 times that row from the second row:

[0.4, 99.6, 100] − 0.4[1, −0.602, 0.398] ≈ [0, 99.8, 99.8]. This gives 99.8y = 99.8, so that y = 1; from the first equation, we have x − 0.602y = 0.398, so that x − 0.602 · 1 = 0.398 and thus x = 1. We get the correct answer. 3. (a) Without partial pivoting, we pivot on 0.001 to three significant digits, so we divide the first row by 0.001, giving [1, 995, 1000], then multiply it by 10.2 and add the result to row 2, giving the matrix   1 995 1000 ⇒ y ≈ 1.01. 0 −10100 −10200 Back-substituting gives x = 1.00−1.00 = 0 (again remembering to keep only three significant 2 digits). This error was introduced because 1.00 and −50.0 were ignored when reducing to three significant digits in the original row-reduction, being overwhelmed by numbers in the range of 10000. Using partial pivoting, we pivot on −10.2 to three significant digits: the second row becomes [1, 0.098, 4.9]. Then multiply it by 0.001 and subtract it from the first row, giving [0, 0.995, 1.00]. Thus 0.995y = 1.00, so that to three significant digits y = 1.00. Back-substituting gives x = −50.0−1.00 = 51.0 −10.2 10.2 = 5. Pivoting on the largest absolute value reduced the error in the solution. (b) Start by pivoting on the 10.0 at upper left:    10.0 −7.00 0.00 7.00 10.0 −3.0 2.09 6.00 3.91 → −  0.0 5.0 1.00 5.00 6.00 0.0

−7.00 0.00 −0.01 6.00 2.50 5.00

 7.00 6.01 . 2.50

If we continue, pivoting on −0.01 to three significant digits, the second row then becomes [0, 1, −600, −601]. Multiply it by 2.5 and add it to the third row:     10.0 −7.00 0.00 7.00 10.0 −7.00 0.00 7.00  0.0 −0.01 6.00 6.01 → −  0.0 −0.01 6.00 6.01  . 0.0 2.50 5.00 2.50 0.0 0 −1500 −1500 This gives z = 1.00, y = −1.00, and x = 0.00. If instead we reversed the second and third rows and pivoted on the 2.5, we would first divide that row by 2.5, giving [0, 1, 2, 1], then multiply it by 0.01 and add it to the third row:       10.0 −7.00 0.00 7.00 10.0 −7.00 0.00 7.00 10.0 −7.00 0.00 7.00  0.0 2.50 5.00 2.50 → −  0.0 1 2 1 → −  0.0 1 2 1 . 0.0 −0.01 6.00 6.01 0.0 −0.01 6.00 6.01 0.0 0 6.02 6.02 This also gives z = 1.00, y = −1.00, and x = 0.00.

Exploration: An Introduction to the Analysis of Algorithms

77

Exploration: An Introduction to the Analysis of Algorithms 1. We count the number of operations, one step at a time.   1 R1  2 2 4 6 8 −− 1 −→  3 9  3 6 12 1 −1 1 −1 −1

2 9 1

3 6 −1

 4 12 1

There were three operations here: 21 · 4 = 2, 12 · 6 = 3, and 21 · 8 = 4. We don’t count 12 · 2 = 1 since we don’t actually perform that operation — once we chose 2, we knew we were going to turn it into 1.     1 2 3 4 4 1 2 3 R −3R 1   3 9 6 12 −−2−−−→ 0 3 −3 0 1 −1 1 −1 −1 1 −1 1 There were three operations here: 3 · 2 = 6, 3 · 3 = 9, and 3 · 4 = 12, fr a total of six operations so far. Note that we are counting only multiplications and divisions, and that again we don’t count 3 · 1 = 3 since we didn’t actually have to perform it — we just replace the 3 in the second row by a 0.     1 2 3 4 1 2 3 4 0 3 −3 0  0 3 −3 0 R3 +R1 −1 1 −1 1 − 2 5 −−−−→ 0 3 There were 3 operations here: 1 · 2 = 2, 1 · 3 = 3, and 1 · 4 = 4, for    1 2 3 1 2 3 4 1 0 3 −3 0 −− 3 R2 0 1 −1 −→ 0 3 2 5 0 3 2 In this step, there were 2 operations:  1 0 0

2 1 3

1 3

· (−3) and

3 −1 2

1 3

a total of nine operations so far.  4 0 5

· 0 = 0, for a total of eleven so far.

  1 2 4 0 1 0 R3 −3R2 5 − −−−−→ 0 0

3 −1 5

 4 0 5

In this step, there were 2 operations: −3 · (−1) = 3 and −3 · 0 = 0, for a total of 13 so far.  1 0 0

2 1 0

In this step, there was one operation,

3 −1 5 1 5

  4 1 2 0 1 0 1 5 R3 5 −− −→ 0 0

3 −1 1

 4 0 1

· 5 = 1, for a total of 14.

Finally, to complete the back-substitution, we need three more operations: −1 · x3 to find x2 , and 2 · x2 and 3 · x3 to find x1 . So in total 17 operations were required. 2. To compute the reduced row echelon form, we first compute the row echelon form above, which took 14 operations. Then:   R1 −2R2   1 2 3 4 − 5 4 −−−−→ 1 0 0 1 −1 0 0 1 −1 0 0 0 1 1 0 0 1 1 This required two operations, −2 · (−1) and −2 · 0, for a total of 16. Finally,  1 0 0 1 0 0

5 −1 1

 R −5R3  4 −−1−−−→ 1 0 0 R2 +R3 0 1 0 0 − −−−−→ 1 0 0 1

 −1 1 1

78

CHAPTER 2. SYSTEMS OF LINEAR EQUATIONS This required two operations, both in the constant column: 1 · 1 and −5 · 1, for a total of 18. So for this example, Gauss-Jordan required one more operation than Gaussian elimination. We would probably need more data to conclude that it is generally less efficient. 3. (a) There are n operations required to create the first leading 1 because we have to divide every entry in the first row, except a11 , by a11 (we don’t have to divide a11 by itself because the 1 results from the choice of pivot. Next, there are n operations required to create the first zero in column 1, because we have to multiply every entry in the first row, except the 1 in column one, by a21 . Again, we don’t have to multiply the 1 by itself, since we know we are simply replacing a21 by zero. Similarly, there are n operations required to create each zero in column 1. Since there are n − 1 rows excluding the first row, introducing the 1 in row one and the zeros below it requires n + (n − 1)n = n2 operations. (b) We can think of the resulting matrix, below and to the right of a22 , as an (n − 1) × n augmented matrix. Applying the process in part (a) to this matrix, we see that turning a22 into a 1 and creating zeros below it requires (n − 1) + (n − 2)(n − 1) = (n − 1)2 operations. Continuing to the matrix starting at row 3, column 3; this requires (n − 2)2 operations. So altogether, to reduce the matrix to row echelon form requires n2 + (n − 1)2 + · · · + 22 + 12 operations. (c) To find xn−1 requires one operation: we must multiply xn by the entry an−1,n . Then there are two operations required to find xn−2 , and so forth, until we require n − 1 operations to find x1 . So the total number required for the back-substitution is 1 + 2 + · · · + (n − 1). (d) From Appendix B, we see that k(k + 1)(2k + 1) 6 k(k + 1) 1 + 2 + ··· + k = . 2

12 + 22 + · · · + k 2 =

Thus the total number of operations required is n(n + 1)(2n + 1) n(n − 1) + 6 2 1 2 1 1 3 1 2 1 = n + n + n+ n − n 3 2 6 2 2 1 3 1 = n + n2 − n. 3 3

12 + 22 + · · · + n2 + 1 + 2 + · · · + (n − 1) =

For large values of n, the cubic term dominates, so the total number of operations required is ≈ 13 n3 . 4. As we saw in Exercise 2 in this section, we get the same n2 + (n − 1)2 + · · · + 22 + 12 operations to get the matrix into row echelon form. Then to create the 1 zero above row 2 requires 1(n − 1) operations, since each of the remaining n − 1 entries in row 2 must be multiplied. To create the 2 zeros above row 3 requires 2(n − 2) operations, since each of the remaining entries in row 3 must be multiplied twice. Continuing, we see that the total number of operations is (n − 1) + 2(n − 2) + · · · +(n − 1)(n − (n − 1)) = (1 · n + 2 · n + · · · + (n − 1) · n) − (12 + 22 + · · · + (n − 1)2 ) = n(1 + 2 + · · · + (n − 1)) − (12 + 22 + · · · + (n − 1)2 ). Adding that to the operations required to get the matrix into row echelon form, and using the formulas

2.3. SPANNING SETS AND LINEAR INDEPENDENCE

79

from Appendix B, we get for the total number of operations n2 + (n − 1)2 + · · · + 22 + 12 + n(1 + 2 + · · · + (n − 1))−(12 + 22 + · · · + (n − 1))2 n(n − 1) 2 1 3 1 2 2 =n + n − n 2 2 1 3 1 2 = n + n . 2 2 Again, for large values of n, the cubic term dominates, so the total number of operations required is ≈ 21 n3 . = n2 + n ·

2.3

Spanning Sets and Linear Independence

1. Yes, it is. We want to write  x

     1 2 1 , +y = −1 −1 2

so we want to solve the system of linear equations x + 2y = 1 −x − y = 2. Row reducing the augmented matrix for the system gives     R1 −2R2  1 2 1 1 2 1 − −−−−→ 1 0 R2 +R1 −1 −1 2 − 0 1 −−−−→ 0 1 3

 −5 . 3

So the solution is x = −5 and y = 3, and the linear combination is v = −5u1 + 3u2 . 2. No, it is not. We want to write 

     4 −2 2 x +y = , −2 1 1 so we want to solve the system of linear equations 4x − 2y = 2 −2x + y = 1 Row-reducing the augmented matrix for the system gives # 1 # " " " 4 R1 1 − 21 12 1 4 −2 2 −− −→ R2 +2R1 −2 1 1 −2 1 1 −−−−−→ 0

− 12 0

1 2

#

2

So the system is inconsistent, and the answer follows from Theorem 2.4. 3. No, it is not. We want to write       1 0 1 x 1 + y 1 = 2 , 0 1 3 so we want to solve the system of linear equations x

=1

x+y =2 y = 3. Rather than constructing the matrix and reducing it, note that the first and third equations force x = 1 and y = 3, and that these values of x and y do not satisfy the second equation. Thus the system is inconsistent, so that v is not a linear combination of u1 and u2 .

80

CHAPTER 2. SYSTEMS OF LINEAR EQUATIONS 4. Yes, it is. We want to write       1 0 3 x 1 + y 1 =  2 , 0 1 −1 so we want to solve the system of linear equations x

=

3

x+y =

2

y = −1 Rather than constructing the matrix and reducing it, note that the first and third equations force x = 3 and y = −1, and that these values of x and y also satisfy the second equation. Thus v = 3u1 − u2 . 5. Yes, it is. We want to write         1 0 1 1 x 1 + y 1 + z 0 = 2 , 0 1 1 3 so we want to solve the system of linear equations x

+z = 1

x+y

=2

y+z = 3. Row-reducing the associated augmented matrix gives  1 0 1 1 1 0 0 1 1

  1 0 1 R2 −R1  2 − −−−−→ 0 1 3 0 1

1 −1 1

  1 0 1 0 1 1 R3 −R2 3 − −−−−→ 0 0

1 −1 2  1 · · · 0 0

 1 1 1 ··· 2 R3 2 −− −→ 0 1 0

1 −1 1

 R1 −R3  1 −−− −−→ 1 R2 +R3 0 1 − −−−−→ 1 0

0 1 0

0 0 1

 0 2 1

This gives the solution x = 0, y = 2, and z = 1, so that v = 2u2 + u3 . 6. Yes, it is. The associated linear system is 1.0x + 3.4y − 1.2z =

3.2

0.4x + 1.4y + 0.2z =

2.0

4.8x − 6.4y − 1.0z = −2.6 Solving this system with a CAS gives x = 1.0, y = 1.0, z = 1.0, so that v = u1 + u2 + u3 . 7. Yes, it is. b is in the span of the columns of A if and only if the system Ax = b has a solution, which is to say if and only if the linear system x + 2y = 5 3x + 4y = 6 has a solution. Row-reducing the augmented matrix for the system gives # " # " # " " R −2R2 1 2 5 1 2 5 −−1−−−→ 1 0 1 2 5 − 21 R2 R2 −3R1 9 3 4 6 −−−−−→ 0 −2 −9 −−−−→ 0 1 2 0 1 Thus x = −4 and y = 29 , so that A

  −4 9 2

=

  5 . 6

−4 9 2

#

2.3. SPANNING SETS AND LINEAR INDEPENDENCE

81

8. Yes, it is. b is in the span of the columns of A if and only if the system Ax = b has a solution, which is to say if and only if the linear system x + 2y + 3z = 4 5x + 6y + 7z = 8 9x + 10y + 11z = 12 Row-reducing the augmented matrix for the system gives  1 2 5 6 9 10

3 7 11

  1 4 R −5R1 0 8  −−2−−−→ R3 −9R1 12 − −−−−→ 0

2 3 −4 −8 −8 −16

  4 1 − 41 R2  −12 − 0 −−−→ −24 0

 4 3 −24

2 3 1 2 −8 −16

 1 0 R +8R2 0 −−3−−−→

2 1 0

3 2 0

 4 3 0

This system has an infinite number of solutions, so that b is indeed in the span of the columns of A.   a 9. We must show that for any vector , we can write b       1 1 a x +y = 1 −1 b for some x, y. Row-reduce the associated augmented matrix: # " # " " 1 1 a 1 1 a 1 1 − 12 R2 R2 −R1 1 −1 b −−− −−→ 0 −2 b − a −−− −→ 0 1

a

#

a−b 2

" R1 −R2 −−− −−→ 1 0 0 1

a+b 2 a−b 2

#

So given a and b, we have       a+b 1 a−b 1 a · + · = . 1 −1 b 2 2   a 10. We want to show that for any vector , we can write b  x

     3 0 a +y = −2 1 b

for some x, y. Row-reduce the associated augmented matrix: " " # 1 # " 3 R1 3 0 a −− 1 0 a3 1 −→ R2 +2R1 −2 1 b −2 1 b −−−−−→ 0

0 1

#

a 3

b+

2a 3

which shows that the vector can be written         a 2a a 3 0 = · + b+ · . b 1 3 −2 3 11. We want to show that any vector can be written as a linear combination of the three given vectors, i.e. that         a 1 1 0  b  = x 0 + y 1 + z 1 c 1 0 1

82

CHAPTER 2. SYSTEMS OF LINEAR EQUATIONS for some x, y, z. Row-reduce the associated augmented matrix:  1 1 0 0 1 1 1 0 1

  1 a 0 b R3 −R1 c − −−−−→ 0  1 1  0 1 1 2 R3 −−−→ 0 0  R1 −R2 −−− −−→ 1  0 0

1 1 −1 0 1 1 0 0 1 0 0 1

  1 a 0 b  R3 +R2 c−a − −−−−→ 0   1 1 a  R2 −R3  b  −−−−−→ 0 1 b+c−a 0 0 2  0 1 1

 a  b b+c−a  a a+b−c   2

1 0 1 1 0 2 0 0 1

b+c−a 2

a−b+c 2 a+b−c   2 b+c−a 2

Thus         a 1 1 0 1 1 1  b  = (a − b + c) · 0 + (a + b − c) · 1 + (−a + b + c) · 1 . 2 2 2 c 1 0 1 12. We want to show that any vector can be written as a linear combination of the three given vectors, i.e. that         a 1 1 2  b  = x 1 + y 2 + z  1 c 0 3 −1 for some x, y, z. Row-reduce the associated augmented matrix:  1 1 2 1 2 1 0 3 −1

     a 1 1 2 a 1 1 2 a R −R 2 1  0 1 −1 b −−− 0 1 −1 b − a b−a  −−→ R3 −3R2 c 0 3 −1 c 0 0 2 c + 3a − 3b −−−−−→     a 3b − 2a − c 1 1 2 1 1 0 R1 −2R3   −−−−−→   1 b−a 0 1 −1  R2 +R3 0 1 0 2 (c + a − b)  1 −−−−−→ 1 2 R3 0 0 1 12 (c + 3a − 3b) −− −→ 0 0 1 2 (c + 3a − 3b)   R1 −R2 −−− −−→ 1 0 0 12 (7b − 5a − 3c)   1 0 1 0 2 (c + a − b)  0 0 1 12 (c + 3a − 3b)

Thus         a 1 1 2  b  = 1 (7b − 5a − 3c) 1 + 1 (c + a − b) 2 + 1 (c + 3a − 3b)  1 . 2 2 2 c 0 3 −1 

   2 −1 = −2 , these vectors are parallel, so their span is the line through the origin with −4 2   −1 direction vector . 2     x −1 (b) The vector equation of this line is =t . Note that this is just another way of saying y 2 that the line is the span of the given vector. To derive the general equation of the line, expand the above, giving the parametric equations x = −t and y = 2t. Substitute −x for t in the second equation, giving y = −2x, or 2x + y = 0.

13. (a) Since

2.3. SPANNING SETS AND LINEAR INDEPENDENCE

83

    0 3 is just the origin, and the origin is in the span of , the span of these two 0 4     3 3 vectors is just the span of , which is the line through the origin with direction vector . 4 4     x 3 (b) The vector equation of this line is =t . Note that this is just another way of saying that y 4 the line is the span of the given vector. To derive the general equation of the line, expand the above, giving the parametric equations x = 3t and y = 4t. Substitute 31 x for t in the second equation, giving y = 43 x. Clearing fractions gives 3y = 4x, or 4x − 3y = 0.

14. (a) Since the span of

15. (a) Since the given vectors are not parallel, their span is a plane through the origin containing these two vectors as direction vectors. (b) The vector equation of the plane is       x 1 3 y  = s 2 + t  2 . z 0 −1 One way of deriving the general equation of the plane is to solve the system of equations arising from the above vector equation. Row-reducing the corresponding augmented system gives       1 3 x 1 3 x 1 3 x R −2R 2  0 −4  2 2 y  −−2−−−→ y − 4x 0 −4 y − 2x R3 − 41 R2 1 0 −1 z 0 −1 z 0 0 z + (2x − y) −−−−−−→ 4 Since we know the system is consistent, the bottom row must consist entirely of zeros. Thus the points on the plane must satisfy z + 41 (2x − y) = 0, or 4z + 2x − y = 0. This is the plane 2x − y + 4z = 0. An alternative method is to find a normal vector to the plane by taking the cross-product of the two given vectors:         1 3 0 · 2 − 2 · (−1) 2 2 ×  2 = 1 · (−1) − 0 · 3 = −1 . 0 −1 2·3−1·2 4 So the equation of the plane is 2x − y + 4z = d; since it passes through the origin, we again get 2x − y + 4z = 0. 16. (a) First, note that 

     0 1 −1 −1 = −  0 −  1 , 1 −1 0 so that the third vector is in the span of the first two. So we need onlyconsider the   first  two 1 −1 vectors. Their span is the plane through the origin with direction vectors  0 and  1. −1 0 (b) The vector equation of this plane is       1 −1 x s  0 + t  1 = y  . −1 0 z To find the general equation of the plane, we can solve the corresponding system of equations for s and t. Row-reducing the augmented matrix gives       1 −1 x 1 −1 x 1 −1 x       1 y y  y 0 1 0 1  0 R3 +R1 R3 +R2 −1 0 z −−− 0 −1 x + z 0 0 x + y + z −−→ −−−−−→

84

CHAPTER 2. SYSTEMS OF LINEAR EQUATIONS Since we know the system is consistent, the bottom row must consist of all zeros, so that points on the plane satisfy the equation x + y + z = 0. Alternatively, we could find a normal vector to the plane by taking the cross product of the direction vectors:         1 −1 0 · 0 − (−1) · 1 1  0 ×  1 = −1 · −1 − 1 · 0 = 1 . −1 0 1 · 1 − 0 · (−1) 1 Thus the equation of the plane is x + y + z = d; since it passes through the origin, we again get x + y + z = 0.

17. Substituting the two points (1, 0, 3) and (−1, 1, −3) into the equation for the plane gives a linear system in a, b, and c: a

+ 3c = 0

−a + b − 3c = 0. Row-reducing the associated augmented matrix gives 

1 0 −1 1

3 −3

  0 1 0 3 R2 +R1 0 − −−−−→ 0 1 0

 0 , 0

so that b = 0 and a = −3c. Thus the equation of the plane is −3cx + cz = 0. Clearly we must have c 6= 0, since otherwise we do not get a plane. So dividing through by c gives −3x + z = 0, or 3x − z = 0, for the general equation of the plane. 18. Note that u = 1u + 0v + 0w,

v = 0u + 1v + 0w,

w = 0u + 0v + 1w.

Or, if one adopts the usual convention of not writing terms with coefficients of zero, u is in the span because u = u. 19. Note that u = u, v = (u + v) − u, w = (u + v + w) − (u + v), and the terms in parentheses on the right, together with u, are the elements of the given set. 20. (a) To prove the statement, we must show that anything that can be written as a linear combination of the elements of S can also be written as a linear combination of the elements of T . But if x is a linear combination of elements of S, then x = a1 u1 + a2 u2 + · · · + ak uk = a1 u1 + a2 u2 + · · · + ak uk + 0uk+1 + 0uk+1 + · · · + 0um . This shows that x is also a linear combination of elements of T , proving the result. (b) We know that span(S) ⊆ span(T ) from part (a). Since T is a set of vectors in Rn , we know that span(T ) ⊆ Rn . But then Rn = span(S) ⊆ span(T ) ⊆ Rn , so that span(T ) = Rn .

2.3. SPANNING SETS AND LINEAR INDEPENDENCE

85

21. (a) Suppose that ui = ui1 v1 + ui2 v2 + · · · + uim vm ,

i = 1, 2, . . . , k.

Suppose that w is a linear combination of the ui . Then w = w1 u1 + w2 u2 + · · · + wk uk = w1 (u11 v1 + u12 v2 + · · · + u1m vm ) + w2 (u21 v1 + u22 v2 + · · · + u2m vm ) + · · · wk (uk1 v1 + uk2 v2 + · · · + ukm vm ) = (w1 u11 + w2 u21 + · · · + wk uk1 )v1 + (w1 u12 + w2 u22 + · · · + wk uk2 )v2 + · · · (w1 u1m + w2 u2m + · · · + wk ukm )vm =

w10 v1

0 + w20 v2 + · · · + wm vm .

This shows that w is a linear combination of the vj . Thus span(u1 , . . . , uk ) ⊆ span(v1 , . . . , vk ). (b) Reversing the roles of the u’s and the v’s in part (a) gives span(v1 , . . . , vk ) ⊆ span(u1 , . . . , uk ). Combining that with part (a) shows that each of the spans is contained in the other, so they are equal. (c) Let   1 u1 = 0 , 0

  1 u2 = 1 , 0

  1 u3 = 1 . 1

We want to show that each ui is a linear combination of e1 , e2 , and e3 , and the reverse. The first of these is obvious: since e1 , e2 , and e3 span R3 , any vector is R3 is a linear combination of those, so in particular u1 , u2 , and u3 are. Next, note that e1 = u1 e2 = u2 − u1 e3 = u3 − u2 , so that the ej are linear combinations of the ui . Part (b) then implies that span(u1 , u2 , u3 ) = span(e1 , e2 , e3 ) = R3 .     −1 2 22. The vectors −1 and  2 are not scalar multiples of one another, so are linearly independent. 3 3 23. There is no obvious dependence relation, so follow Example 2.23. We want scalars c1 , c2 , c3 so that         1 1 1 0 c1 1 + c2 2 + c3 −1 = 0 . 1 3 2 0 Row-reduce the associated augmented matrix:  1 1 1 2 1 3

1 −1 2

  0 1 1 R2 −R1 0 −−−−−→ 0 1 R3 −R1 0 − −−−−→ 0 2

1 −2 1  1 · · · 0 0

  0 1 1 1 0 1 −2 0 R3 −2R2 0 − 5 −−−−→ 0 0  R1 −R3  1 1 0 −−−−−→ 1 R2 +2R3 0 1 −2 0 − −−−−→ 0 1 0 0

 0 0 1 ··· 5 R3 0 −− −→  R1 −R3  1 0 0 − −−−−→ 1 0 0 0 1 0 1 0 0 0 1 0 0 0 1

Thus the system has only the solution c1 = c2 = c3 = 0, so the vectors are linearly independent.

 0 0 0

86

CHAPTER 2. SYSTEMS OF LINEAR EQUATIONS

24. There is no obvious dependence relation, so follow Example 2.23. We want scalars c1 , c2 , c3 so that         2 3 1 0 c1 2 + c2 1 + c3 −5 = 0 . 1 2 2 0 Row-reduce the associated augmented matrix:  2 3 1 2 1 −5 1 2 2

 R3  0 −−→ 1 R1  0 − −→ 2 R2 0 − −→ 2

2 3 1

2 1 −5  1 −R2  0 −−−→ 0

  1 2 2 0 R −2R1 0 −1 −3 0 −−2−−−→ R3 −2R1 0 − −−−−→ 0 −3 −9   1 2 2 0 0 1 3 0 R3 +3R2 −3 −9 0 − −−−−→ 0

 0 0 0 2 1 0

2 3 0

 R1 −2R2  0 − −−−−→ 1 0 0 1 0 0 0 0

−4 3 0

 0 0 0

Thus the system has a nontrivial solution, and the vectors are linearly dependent. From the reduced matrix, the general solution is c1 = 4t, c2 = −3t, c3 = t. Setting t = 1, for example, we get the relation         2 3 1 0 4 2 − 3 1 + −5 = 0 . 1 2 2 0 25. Since by inspection       2 2 0 v1 − v2 + v3 = 1 − 1 + 0 = 0, 3 1 2 the vectors are linearly dependent. 26. Since we are given 4 vectors in R3 , we know they are linearly dependent vectors by Theorem 2.8. To find a dependence relation, we want to find scalars c1 , c2 , c3 , c4 such that           −2 4 3 5 0 c1  3 + c2 −1 + c3 1 + c4 0 = 0 . 7 5 3 2 0 Row-reduce the associated augmented matrix:  1     − 2 R1 −2 4 3 5 0 −−− 1 −2 − 23 − 52 0 1 −2 − 32 − 25 − →   R2 −3R1    15 3 −1   3 −1 1 0 0 1 0 0 5 11   −−−−−→ 0   2 2 R −7R1 39 7 5 3 2 0 7 5 3 2 0 −−3−−−→ 0 19 27 2 2    1 −2 − 23 − 25 0 1 −2 − 23 − 25    1 R 3 11 3 5 2  0 1 11 0 1 −− −→ 0   10 2 10 2 R −19R 2 39 0 19 27 0 −−3−−−−→ 0 0 − 37 −9 2 2 5   R1 + 3 R3  2 1 −2 − 23 − 52 0 −−−−− −→ 1 −2 0 − 25 37    11 3 11 6 0  R2 − 10 R3 0 1 0 1 0 − − − − − − →    10 2 37 5 − 37 R3 45 0 1 45 0 0 0 1 −−−−→ 0 37 37   R +2R2 13 1 0 0 − 37 0 −−1−−−→   6 0 1 0 0   37 45 0 0 1 0 37

0



 0  0

 0  0  0  0  0  0

2.3. SPANNING SETS AND LINEAR INDEPENDENCE

87

Thus the system has a nontrivial solution, and the vectors are linearly dependent. From the reduced 13 6 matrix, the general solution is c1 = 37 t, c2 = − 37 t, c3 = − 45 37 t, c4 = t. Setting t = 37, for example, we get the relation           −2 4 3 5 0 13  3 − 6 −1 − 45 1 + 37 0 = 0 . 7 5 3 2 0

  0 27. Since v3 = 0 is the zero vector, these vectors are linearly dependent. For example, 0       3 6 0 0 · 4 + 0 · 7 + 137 0 = 0. 5 8 0 Any set of vectors containing the zero vector is linearly dependent. 28. There is no obvious dependence relation, so follow Example 2.23. We want scalars c1 , c2 , c3 so that         0 2 3 −1  3 0 2  1        c1   2 + c2 2 + c3  1 = 0 0 −1 4 1 Row-reduce the associated augmented matrix:  −1 3  1 2   2 2 1 4

2 3 1 −1

 −R1  0 − −−→ 1 1 0    2 0 0 1

  1 0 R −R 2 1 −−−−−→ 0 0  R −2R  2  0 − 0 −3−−−→ R4 −R1 0 − −−−−→ 0

−3 −2 2 3 2 1 4 −1

 1 −3 0 1 R −6R2  0 −−3−−−→ 0 R4 −7R2 0 0 −−−−−→  1 −3 R +2R3 −−1−−−→ 0 1 R2 −R3  −−− −−→ 0 0 0 0

−2 1 −1 −6 0 0 1 0

  1 0 0 0  −R  3  0 − 0 −−→ 0 0  R1 +3R2  0 −−−−−→ 1 0 0    0 0 0 0

−3 −2 5 5 6 5 7 1 −3 −2 1 1 0 1 0 −6 0 0 1 0 0 1 0 0

  1 0 1 5 R2 0 0  −−−→  0 0 0 0

−3 −2 1 1 6 5 7 1

  1 0 0 0   0 0 R4 +6R3 0 −−−−−→ 0 

 0 0  0 0

−3 −2 1 1 0 1 0 0

 0 0  0 0

0 0  0 0

Since the system has only the trivial solution, the given vectors are linearly independent. 29. There is no obvious dependence relation, so follow Example 2.23. We want scalars c1 , c2 , c3 , c4 so that 

         1 −1 1 0 0 −1  1  0  1 0          c1   1 + c2  0 + c3  1 + c4 −1 = 0 0 1 −1 1 0

88

CHAPTER 2. SYSTEMS OF LINEAR EQUATIONS Row-reduce the associated augmented matrix:    1 −1 1 1 −1 1 0 0 R +R 2 1  −1 1 0 1 0 0 1  −−−−−→ 0 R3 −R1 0  1 0 1 −1 0− 1 0 −−−−→ 0 1 −1 1 0 0 1 −1  1 −1 1 0 1 0  0 0 1 R4 −R2 0 −1 −−− −−→ 0  1 −1 1 0 0 1 0 −1  0 0 1 1 1 3 R4 0 0 0 1 −−−→  R +R2 −R3 −−1−−− −−→ 1 0 0 0 1 0  0 0 1 0 0 0

   1 −1 1 0 0 0  0 1 0 −1 0 R2 ↔R3 0 −  − − − − →   0 0 0 1 1 0 0 0 1 −1 1 0    0 0 1 −1 1 0 0 0 −1 0 1 0 −1 0      1 0 0 0 1 1 0 R4 +R3 2 0 − 0 0 3 0 −−−−→ 0    1 −1 1 0 0 0 R2 +R4  − − − − − → 0 1 0 0 0 0   R −R   3 4 0 −−−−−→ 0 0 1 0 0 0 0 0 0 1 0  0 0 0 0  0 0 1 0 0 1 −1 1

Since the system has only the trivial solution, the given vectors are linearly independent. 30. The vectors

  0 0  v1 =  0 , 1

  0 0  v2 =  2 , 1

  0 3  v3 =  2 , 1

  4 3  v4 =  2 1

are linearly dependent. For suppose c1 v1 + c2 v2 + c3 v3 + c4 v4 = 0. Then c4 must be zero in order for the first component of the sum to be zero. But then the only vector containing a nonzero entry in the second component is v3 , so that c3 = 0 as well. Now for the same reason c2 = 0, looking at the third component. Finally, that implies that c1 = 0. So the vectors are linearly independent. 31. By inspection        −1 1 −1 3 −1  3 1 −1        v1 + v2 − v3 + v4 =   1 +  1 − 3 +  1 = 0, 1 3 −1 −1 

so the vectors are linearly dependent. 32. Construct a matrix A with these vectors as its rows and try to produce a zero row:   R0 =R1 +R2     2 −1 3 − 1 1 6 −1−−−−−→ 1 1 6 0 0 R2 =R2 +2R1 −1 2 3 −1 2 3 − −−−−−−−→ 0 3 9 This matrix is in row echelon form and has no zero rows, so the matrix has rank 2 and the vectors are linearly independent by Theorem 2.7. 33. Construct a matrix A with these vectors as its rows and try to produce a zero row:  1 1 1

  1 1 1 0 R =R2 −R1 2 3 −−2−−−−−→ 0 −1 2 R30 =R3 −R1 0 −−−−−−−→

  1 1 1 0 1 2 00 0 R3 =R3 +2R20 −2 1 − −−−−−−−→ 0

1 1 0

 1 2 5

This matrix is in row echelon form and has no zero rows, so the matrix has rank 3 and the vectors are linearly independent by Theorem 2.7.

2.3. SPANNING SETS AND LINEAR INDEPENDENCE

89

34. Construct a matrix A with these vectors as its rows and try to produce a zero row:  2 3 1

  2 1 1 R1 ↔R3  1 2 − −−−−→ 3 −5 2 2

  −5 2 1 R20 =R2 −3R1 1 2 −−−−−−−−→ 0 2 1 R30 =R3 −2R1 0 −−−−−−−−→

  1 −5 2 16 −4 00 0 3 0 0 R3 =R3 − 4 R2 12 −3 − −−−−−−−−→ 0

 −5 2 12 −3 0 0

Since there is a zero row, the matrix has rank less than 3, so the vectors are linearly dependent by Theorem 2.7. Working backwards, we have 3 3 1 3 [0, 0, 0] = R300 = R30 − R20 = R3 − 2R1 − (R2 − 3R1 ) = R1 − R2 + R3 4 4 4 4 

     1 3 2 3 1 = −5 − 1 + 2 . 4 4 2 2 1

35. Construct a matrix A with these vectors as its rows and try to produce a zero row:  0 2 2

1 1 0

 R10 =R2  2 −−− −−→ 2 3 R20 =R1 0 −−−−−→ 1 2

  3 2 0 2 0 R3 =R3 −R10 1 − −−−−−−→ 0

1 1 0

  1 3 2 1 2 00 0 0 0 R3 =R3 +R2 −1 −2 − −−−−−−−→ 0

1 1 0

 3 1 0

This matrix is in row echelon form and has a zero row, so the vectors are linearly dependent by Theorem 2.7. Working backwards, we have       0 2 2 [0, 0, 0] = R300 = R30 + R20 = R3 − R10 + R20 = R3 − R2 + R1 = R1 − R2 + R3 = 1 − 1 + 0 . 2 3 1 36. Construct a matrix A with these vectors and try to produce a zero row:  −2  4    3 5

  −2 3 7 R20 =R2 +2R1  0 − − − − − − − − → −1 5   0  R =R3 + 23 R1  1 3 −−3−−− −−−→  0 0 0 2 R40 =R4 + 25 R1 −−−−−−−−→

3 5 11 2 15 2

  7 −2  0 19  00  R3 =2R30  27    0 − − − − − → 2 39 2

R00 =2R0

4 −−4−−−→

 −2 3  0 5  0 · · · R3000 =R300 − 11  5 R2 −−−−−−−−−−→  0 0 R000 =R400 −3R20 0 0 −−4−−−− −−−→

0

3 5 11 15

 7 19  ··· 27 39

 −2 3  0 5    0 0 0000 000 5·18 000 R4 =R4 − 74 R3 −18 −−−−−−−−−−−−→ 0 0

 7 19  74  −5

 7 19  74  −5 0

Since this matrix has a zero row, it has rank less than 2, so the vectors are linearly dependent. Working backwards, we have [0, 0, 0] = R40000 = R4000 −

90 000 R 74 3

  45 11 R300 − R20 37 5 90 99 = 2R40 − 3R20 − R30 + R20 37 37     5 90 3 99 = 2 R4 + R1 − 3(R2 + 2R1 ) − R3 + R1 + (R2 + 2R1 ) 2 37 2 37 26 12 90 = R1 − R2 − R3 + 2R4 . 37 37 37 = R400 − 3R20 −

90

CHAPTER 2. SYSTEMS OF LINEAR EQUATIONS Clearing fractions gives [0, 0, 0] = 26R1 − 12R2 − 90R3 + 74R4 . Thus

        −2 4 3 5 26  3 − 12 −1 − 90 1 + 74 0 = 0. 7 5 3 2

Note that instead of doing all this work, we could just have applied Theorem 2.8. 37. Construct a matrix A with these vectors and try  3 6 0

to produce a zero row:  4 5 7 8 0 0

This matrix already has a zero row, so we know by Theorem 2.7 that the vectors are linearly dependent, and we have an obvious dependence relation: let the coefficients of each of the first two vectors be zero, and the coefficient of the zero vector be any nonzero number. 38. Construct a matrix A with these vectors and try to produce a zero row:  −1  3 2

1 2 3

2 2 1

  1 −1 R20 =R2 +3R1 4 −−−−−−−−→  0 −1 R30 =R3 +2R1 0 −−−−−−−−→

1 5 5

2 8 5

  1 −1  0 7 00 R3 =R3 −R2 1 − 0 −−−−−−−→

1 5 0

 2 1 8 7 . −3 −6

This matrix is in row echelon form, and it has no zero row, so it has rank 4 and thus the vectors are linearly independent by Theorem 2.7. 39. Construct a matrix A with these vectors and try to produce a zero row:   −1 1 0 1 R20 =R2 +R1 0 1 0 1 − − − − − − − →   0 1 −1 R30 =R3 −R1 0 −−−−−−−→ 1 −1 1 0



1 −1   1 0

 1 0  0 0

 −1 1 0 0 1 1  1 0 −1 0 R4 =R4 −R30 1 −1 1 − −−−−−−→

  −1 1 0 1 R200 =R30 0 0 1 1 − − − − − →   1 0 −1 R300 =R20 0 −−−−−→ 0 −1 2 0

  −1 1 0 1 0 1 0 −1   0 1 1 00 0 0 0 R4 =R4 +R3 0 −1 2 − −−−−−−−→ 0

 −1 1 0 1 0 −1 . 0 1 1 0 0 3

This matrix is in row echelon form, and it has no zero row, so it has rank 4 and thus the vectors are linearly independent by Theorem 2.7. 40. Construct a matrix A with these vectors and try to produce a zero row:  0 0  0 4

0 0 3 3

0 2 2 2

  1 R20 =R2 −R1 0 −−−−−−−→  1  R0 =R −R 0 3 1  1 −−3−−− −−→ 0 0 1 R4 =R4 −R1 4 −−−−−−−→

0 0 3 3

0 2 2 2

  0 1  0  R300 =R30 −R20 0 0 −−−−−−−−→ 0 0 R400 =R40 −R20 4 −−−−−−−−→

0 0 3 3

0 2 0 0

 1 0  0 000 00 00 R4 =R4 −R3 0 − −−−−−−−→ 0

 0 0  0 4

0 0 3 0

0 2 0 0

000

R1 =R4 − −−−−→  4 1 R200 =R300 0 0 − − − − − →   0 R3000 =R20 0 −−−−−→ 0 R0000 =R1 0 4 −−− −−−→

0 3 0 0

0 0 2 0

 0 0 . 0 1

2.3. SPANNING SETS AND LINEAR INDEPENDENCE

91

This matrix is in row echelon form, and it has no zero row, so it has rank 4 and thus the vectors are linearly independent by Theorem 2.7. 41. Construct a matrix A with these vectors and try to produce a zero row: 

3 −1   1 −1

 R10 =R3  1 −1 1 −1 −−− −−→ −1 3 1 −1   1 3 1 R30 =R1  3 −−−−−→ −1 −1 1 3

 1 3 1 R20 =R2 +R10 −−−−−−−→ 3 1 −1  R00 =R0 −3R0 1  −1 1 −1 −−3−−−3−−−→ 0 0 −1 1 3 R4 =R4 +R1 −−−−−−−→    1 1 3 1 1 0  0 4 4 0   R000 =R00 +R0 +R0  3 2 4 0 0 −4 −8 −4 − −3−−−− −−− −−→ 0 0 4 4 0

1 4 0 0

3 4 0 4

 1 0 . 0 4

Since this matrix has a zero row, it has rank less than 4, so the vectors are linearly dependent. Working backwards, we have [0, 0, 0, 0] = R3000 = R300 + R20 + R40 = R30 − 3R10 + R2 + R10 + R4 + R10 = R30 − R10 + R2 + R4 = R1 − R3 + R2 + R4 . Thus

       −1 1 −1 3 −1  3 1 −1   +   −   +   = 0.  1  1 3  1 3 1 −1 −1 

42. (a) In this case, rank(A) = n. To see this, recall that v1 , v2 , . . . , vn are linearly independent if and only if the only solution of [A | 0] = [v1 v2 . . . vn | 0] is the trivial solution. Since the trivial solution is always a solution, it is the only one if and only if the number of free variables in the associated system is zero. Then the Rank Theorem (Theorem 2.2 in Section 2.2) says that the number of free variables is n − rank(A), so that n = rank(A). (b) Let 

 v1  v2    A =  . .  ..  vn Then by Theorem 2.7, the vectors, which are the rows of A, are linearly independent if and only if the rank of A is greater than or equal to n. But A has n rows, so its rank is at most n. Thus, the rows of A are linearly independent if and only if the rank of A is exactly n. 43. (a) Yes, they are. Suppose that c1 (u + v) + c2 (v + w) + c3 (u + w) = 0. Expanding gives c1 u + c1 v + c2 v + c2 w + c3 u + c3 w = (c1 + c3 )u + (c1 + c2 )v + (c2 + c3 )w = 0. Since u, v, and w are linearly independent, all three of these coefficients are zero. So to see what values of c1 , c2 , and c3 satisfy this equation, we must solve the linear system c1

+ c3 = 0

c1 + c2

=0

c2 + c3 = 0.

92

CHAPTER 2. SYSTEMS OF LINEAR EQUATIONS Row-reducing the coefficient matrix gives    1 0 1 1 1 1 0 → − 0 0 1 1 0

 0 1 1 −1 . 0 2

Since the rank of the reduced matrix is 3, the only solution is the trivial solution, so c1 = c2 = c3 = 0. Thus the original vectors u + v, v + w, and u + w are linearly independent. (b) No, they are not. Suppose that c1 (u − v) + c2 (v − w) + c3 (u − w) = 0. Expanding gives c1 u − c1 v + c2 v − c2 w + c3 u − c3 w = (c1 + c3 )u + (−c1 + c2 )v + (−c2 − c3 )w = 0. Since u, v, and w are linearly independent, all three of these coefficients are zero. So to see what values of c1 , c2 , and c3 satisfy this equation, we must solve the linear system c1

+ c3 = 0

−c1 + c2

=0

− c2 − c3 = 0. Row-reducing the coefficient matrix gives    1 1 0 1 −1 1 0 → − 0 0 0 −1 −1

0 1 0

 1 1 . 0

So c3 is a free variable, and a solution to the original system is c1 = c2 = −c3 . Taking c3 = −1, for example, we have (u − v) + (v − w) − (u − w) = 0. 44. Let v1 and v2 be the two vectors. Suppose first that at least one of them, say v1 , is zero. Then they are linearly dependent since v1 + 0v2 = 0 + 0 = 0; further, v1 = 0v2 , so one is a scalar multiple of the other. Now suppose that neither vector is the zero vector. We want to show that one is a multiple of the other if and only if they are linearly dependent. If one is a multiple of the other, say v1 = kv2 , then clearly 1v1 − kv2 = 0, so that they are linearly dependent. To go the other way, suppose they are linearly dependent. Then c1 v1 + c2 v2 = 0, where c1 and c2 are not both zero. Suppose that c1 6= 0 (if instead c2 6= 0, reverse the roles of the first and second subscripts). Then c1 v1 + c2 v2 = 0



c1 v1 = −c2 v2



v1 = −

c2 v2 , c1

so that v1 is a multiple of v2 , as was to be shown. 45. Suppose we have m row vectors v1 , v2 , . . . , vm in Rn with m > n. Let   v1  v2    A =  . .  ..  vm By Theorem 2.7, the rows (the vi ) are linearly dependent if and only if rank(A) < m. By definition, rank(A) is the number of nonzero rows in its row echelon form. But since there can be at most one leading 1 in each column, the number of nonzero rows is at most n < m. Thus rank(A) ≤ n < m, so that by Theorem 2.7, the vectors are linearly dependent.

2.4. APPLICATIONS

93

46. Suppose that A = {v1 , v2 , . . . , vn } is a set of linearly independent vectors, and that B is a subset of A, say B = {vi1 , vi2 , . . . , vik } where 1 ≤ i1 < i1 < · · · < ik ≤ n for j = 1, 2, . . . , k. We want to show that B is a linearly independent set. Suppose that ci1 vi1 + ci2 vi2 + · · · + cik vik = 0. Set cr = 0 if r is not in {i1 , i2 , . . . , ik }, and consider c1 v1 + c2 v2 + · · · + cn vn . This sum is zero since the sum of the i1 , i2 , . . . , ik elements is zero, and the other coefficients are zero. But A forms a linearly independent set, so all the coefficients must in fact be zero. Thus the linear combination of the vij above must be the trivial linear combination, so B is a linearly independent set. 47. Since S 0 = {v1 , . . . , vk } ⊂ S = {v1 , . . . , vk , v}, Exercise 20(a) proves that span(S 0 ) ⊆ span(S). Using Exercise 21(a), to show that span(S) ⊆ span(S 0 ), we must show that each element of S is a linear combination of elements of S 0 . But clearly each vi is a linear combination of elements of S 0 , since vi ∈ S 0 . Also, we are given that v is linear combination of the vi , so it too is a linear combination of elements of S 0 . Thus span(S) ⊆ span(S 0 ). Since we have proven both inclusions, the two sets are equal. 48. Suppose bv + b2 v2 + · · · + bk vk = 0. Substitute for v from the given equation, to get b(c1 v1 + c2 v2 + · · · + ck vk ) + b2 v2 + · · · + bk vk = bc1 v1 + (bc2 + b2 )v2 + · · · + (bck + bk )vk = 0. Since the vi are a linearly independent set, all of the coefficients must be zero: bc1 = 0,

bc2 + b2 = 0,

bc3 + b3 = 0,

...

bck + bk = 0.

In particular, bc1 = 0; since we are given that c1 6= 0, it follows that b = 0. Then substituting 0 for b in the remaining equations above b2 = 0,

b3 = 0,

...

bk = 0.

Thus b and all of the bk are zero, so the original linear combination was the trivial one. It follows that {v, v2 , . . . , vk } is a linearly independent set.

2.4

Applications

1. Let x1 , x2 , and x3 be the number of bacteria of strains I, II, and consumption of A, B, and C, we get the following system:    1 x1 + 2x2 = 400 1 2 0 400 − 0 2x1 + x2 + x3 = 600 ⇒ 2 1 1 600 → 1 1 2 600 0 x1 + x2 + 2x3 = 600

III, respectively. Then from the

0 0 1 0 0 1

 160 120 . 160

So 160 bacteria of strain I, 120 bacteria of strain II, and 160 bacteria of strain III can coexist. 2. Let x1 , x2 , and x3 be the number of bacteria of strains I, II, and consumption of A, B, and C, we get the following system:    x1 + 2x2 = 400 1 2 0 400 1 2x1 + x2 + 3x3 = 500 ⇒ 2 1 3 500 → − 0 1 1 1 600 0 x1 + x2 + x3 = 600 This is an inconsistent system, so the bacteria cannot coexist.

III, respectively. Then from the

0 1 0

2 −1 0

 0 0 . 1

94

CHAPTER 2. SYSTEMS OF LINEAR EQUATIONS 3. Let x1 , x2 , and x3 be the number of small, medium, and large arrangements. Then from the consumption of flowers in each arrangement we get:     x1 + 2x2 + 4x3 = 24 1 0 0 2 1 2 4 24 − 0 1 0 3 . 3x1 + 4x2 + 8x3 = 50 ⇒ 3 4 8 50 → 3x1 + 6x2 + 6x3 = 48 3 6 6 48 0 0 1 4 So 2 small, 3 medium, and 4 large arrangements were sold that day. 4. (a) Let x1 , x2 , and x3 be the number of nickels, dimes, and quarters. Then we get     x1 + x2 + x3 = 20 20 1 0 0 4 1 1 1 0 → 2x1 − x2 = 0 ⇒ 2 −1 0 − 0 1 0 8 . 5 10 25 300 0 0 1 8 5x1 + 10x2 + 25x3 = 300 So there are 4 nickels, 8 dimes, and 8 quarters. (b) This is similar to part (a), except that the constraint 2x1 − x2 no longer applies, so we get     1 1 1 20 1 0 −3 −20 → − . 40 5 10 25 300 0 1 4 The general solution is x1 = −20 + 3t, x2 = 40 − 4t, and x3 = t. From the first equation, we must have 3t − 20 ≥ 0, so that t ≥ 7. From the second, we have t ≤ 10. So the possible values for t are 7, 8, 9, and 10, which give the following combinations of nickels, dimes, and quarters: [x1 , x2 , x3 ] = [1, 12, 7] or [4, 8, 8] or [7, 4, 9] or [10, 0, 10]. 5. Let x1 , x2 , and x3 be the number of house, special, and gourmet blends. Then from the consumption of beans in each blend we get 300x1 + 200x2 + 100x3 = 30,000 200x2 + 200x3 = 15,000 200x1 + 100x2 + 200x3 = 25,000 . Forming the augmented matrix and reducing  300 200 100  0 200 200 200 100 200

it gives   30000 1 0 0 15000 → − 0 1 0 25000 0 0 1

 65 30 . 45

The merchant should make 65 house blend, 30 special blend, and 45 gourmet blend. 6. Let x1 , x2 , and x3 be the number of house, special, and gourmet blends. Then from the consumption of beans in each blend we get 300x1 + 200x2 + 100x3 = 30,000 50x1 200x2 + 350x3 = 15,000 150x1 + 100x2 + 50x3 = 15,000 . Forming the augmented matrix and reducing  300 200 100  50 200 350 150 100 50

it gives   30000 1 0 15000 → − 0 1 15000 0 0

−1 2 0

 60 60 . 0

Thus x1 = 60 + t, x2 = 60 − 2t, and x3 = t. From the first equation, we get t ≥ −60; from the second, we get t ≤ 30, and the third gives us t ≥ 0. So the range of possible values is 0 ≤ t ≤ 30. We wish to maximize the profit P =

1 3 1 3 1 x1 + x2 + 2x3 = (60 + t) + (60 − t) + 2t = 120 − t. 2 2 2 2 2

Since 0 ≤ t ≤ 30, the profit is maximized if t = 0, in which case x1 = x + 2 = 60 and x3 = 0. Therefore the merchant should make 60 house and special blends, and no gourmet blends.

2.4. APPLICATIONS

95

7. Let x, y, z, and w be the number of FeS2 , O2 , Fe2 O3 , and SO2 molecules respectively. Then compare the number of iron, sulfur, and oxygen atoms in reactants and products: Iron:

x =

2z

Sulfur:

2x =

w

Oxygen:

2y = 3z + 2w,

so we get the matrix  1 0  2 0 0 2

−2 0 0 −1 −3 −2

  1 0 0 0   0 → − 0 1 0 0 0 1 0

− 21 − 11 8 − 14

 0  0 . 0

1 Thus z = 14 w, y = 11 8 w, and x = 2 w. The smallest positive value of w that will produce integer values for all four variables is 8, so letting w = 8 gives x = 4, y = 11, z = 2, and w = 8. The balanced equation is 4FeS2 + 11O2 −→ 2Fe2 O3 + 8SO2 .

8. Let x, y, z, and w be the number of CO2 , H2 O, C6 H12 O6 , and O2 molecules respectively. The compare the number of carbon, oxygen, and hydrogen atoms in reactants and products: Carbon:

x

=

6z

Oxygen:

2x + y = 6z + 2w

Hydrogen:

2y

=

12z.

This system is easy to solve using back-substitution: the first equation gives x = 6z and the third gives y = 6z. Substituting 6z for both x and y in the second equation gives 2 · 6z + 6z = 6z + 2w, so that w = 6z as well. The smallest positive value of z giving integers is z = 1, so that x = y = w = 6 and z = 1. The balanced equation is 6CO2 + 6H2 O −→ C6 H12 O6 + 6O2 . 9. Let x, y, z, and w be the number of C4 H10 , O2 , CO2 , and H2 O molecules respectively. The compare the number of carbon, oxygen, and hydrogen atoms in reactants and products: Carbon:

4x =

z

Oxygen:

2y = 2z + w

Hydrogen:

10x =

2w.

This system is easy to solve using back-substitution: the first equation gives z = 4x and the third gives w = 5x. Substituting these values for z and w in the second equation gives 2y = 2 · 4x + 5x = 13x, so that y = 13 2 x. The smallest positive value of x giving integers for all four variables is x = 2, resulting in x = 2, y = 13, z = 8, and w = 10. The balanced equation is 2C4 H10 + 13O2 −→ 8CO2 + 10H2 O. 10. Let x, y, z, and w be the number of C7 H6 O2 , O2 , H2 O, and CO2 molecules respectively. The compare the number of carbon, oxygen, and hydrogen atoms in reactants and products: Carbon:

7x

=

Oxygen:

2x + 2y = z + 2w

Hydrogen:

6x

=

w 2z.

This system is easy to solve using back-substitution: the first equation gives w = 7x and the third gives z = 3x. Substituting these values for w and z in the second equation gives 2x + 2y = 3x + 2 · 7x, so that y = 15 2 x. The smallest positive value for x giving integer values for all four variables is x = 2, so that the solution is x = 2, y = 15, z = 6, and w = 14. The balanced equation is 2C7 H6 O2 + 15O2 −→ 6H2 O+14CO2 .

96

CHAPTER 2. SYSTEMS OF LINEAR EQUATIONS

11. Let x, y, z, and w be the number of C5 H11 OH, O2 , H2 O, and CO2 molecules respectively. The compare the number of carbon, oxygen, and hydrogen atoms in reactants and products: Carbon:

5x

Oxygen:

x + 2y = z + 2w

=

w

Hydrogen:

12x

=

2z.

This system is easy to solve using back-substitution: the first equation gives w = 5x and the third gives z = 6x. Substituting these values for w and z in the second equation gives x + 2y = 6x + 2 · 5x, so that y = 15 2 x. The smallest positive value for x giving integer values for all four variables is x = 2, so that the solution is x = 2, y = 15, z = 12, and w = 10. The balanced equation is 2C5 H11 OH+15O2 −→ 12H2 O+10CO2 . 12. Let x, y, z, and w be the number of HClO4 , P4 O10 , H3 PO4 , and Cl2 O7 molecules respectively. The compare the number of hydrogen, chlorine, oxygen, and phosphorus atoms in reactants and products: Hydrogen:

x

=

3z

Chlorine:

x

=

2w

Oxygen:

4x + 10y = 4z + 7w

Phosphorus:

4y

=

z.

This gives the matrix  1 0 −3 0 1 0 0 −2   4 10 −4 −7 0 4 −1 0

  1 0 0 0   0 0 1 0 − → 0 0 0 1 0 0 0 0

 0 0  . 0

−2 − 16 − 23 0

0

Thus the general solution is x = 2t, y = 16 t, z = 32 t, w = t. The smallest positive value of t giving all integer answers is t = 6, so the desired solution is x = 12, y = 1, z = 4, and w = 6. The balanced equation is 12HClO4 +P4 O10 −→ 4H3 PO4 + 6Cl2 O7 . 13. Let x1 , x2 , x3 , x4 , and x5 be the number of Na2 CO3 , C, N2 , NaCN, and CO molecules respectively. The compare the number of sodium, carbon, oxygen, and nitrogen atoms in reactants and products: Sodium:

2x1

=

x4

Carbon:

x1 + x2 = x4 + x5

Oxygen:

3x1

=

x5

Nitrogen:

2x3

=

x4 .

This gives the matrix  2 0 0 1 1 0   3 0 0 0 0 2

−1 0 −1 −1 0 −1 −1 0

  0 1   0 0 − → 0 0 0

0

4 x5 , 3

x3 =

0 1 0 0

0 0 1 0

0 0 0 1

− 31 − 34 − 31 − 32

 0 0   0 0

Thus x5 is a free variable, and x1 =

1 x5 , 3

x2 =

1 x5 , 3

x4 =

2 x5 . 3

The smallest value of x5 that produces integer solutions is 3, so the desired solution is x1 = 1, x2 = 4, x3 = 1, x4 = 2, and x5 = 3. The balanced equation is Na2 CO3 + 4C+N2 −→ 2NaCN+3CO.

2.4. APPLICATIONS

97

14. Let x1 , x2 , x3 , x4 , and x5 be the number of C2 H2 Cl4 , Ca(OH)2 , C2 HCl3 , CaCl2 , and H2 O molecules respectively. The compare the number of carbon, hydrogen, chlorine, calcium, and oxygen atoms in reactants and products: Carbon:

2x1

Hydrogen:

2x1 + 2x2 = x3 + 2x5

=

2x3

Chlorine:

4x1

= 3x3 + 2x4

Calcium:

x2

=

x4

Oxygen:

2x2

=

x5 .

This gives the matrix  2  2  4   0 0

0 2 0 1 2

−2 0 0 −1 0 −2 −3 −2 0 0 −1 0 0 0 −1

  1 0   0 0   − 0 0 →   0 0 0 0

0 1 0 0 0

0 0 1 0 0

0 0 0 1 0

−1 − 21 −1 − 21 0

 0  0  0   0 0

Thus x5 is a free variable, and 1 1 x5 , x3 = x5 , x4 = x5 . 2 2 The smallest value of x5 that produces integer solutions is 2, so the desired solution is x1 = 2, x2 = 1, x3 = 2, x4 = 1, and x5 = 2. The balanced equation is 2C2 H2 Cl4 +Ca(OH)2 −→ 2C2 HCl3 +CaCl2 + 2H2 O. x1 = x5 ,

x2 =

15. (a) By applying the conservation of flow rule to each node we obtain the system of equations f1 + f2 = 20

f2 − f3 = −10

f1 + f3 = 30.

Form the augmented matrix and row-reduce it:    1 1 0 1 0 20 0 1 −1 −10 → − 0 1 1 0 1 30 0 0

1 −1 0

 30 −10 0

Letting f3 = t, the possible flows are f1 = 30 − t, f2 = t − 10, f3 = t. (b) The flow through AB is f2 , so in this case 5 = t − 10 and t = 15. So the other flows are f1 = f3 = 15. (c) Since each flow must be nonnegative, the first equation gives t ≤ 30, the second gives t ≥ 10, and the third gives t ≥ 0. So we have 10 ≤ t ≤ 30, which means that 0 ≤ f1 ≤ 20

0 ≤ f2 ≤ 20

10 ≤ f3 ≤ 30.

(d) If a negative flow were allowed, it would mean a flow in the opposite direction, so that a negative flow on f2 would mean a flow from B into A. 16. (a) By applying the conservation of flow rule to each node we obtain the system of equations f1 + f2 = 20 Form the augmented matrix and  1 1 1 0  0 1 0 0

f1 + f3 = 25

f2 + f4 = 25

row-reduce it:   0 0 20 1 0 0 0 1 0 1 0 25 → − 0 1 25 0 0 1 1 1 30 0 0 0

−1 1 1 0

f3 + f4 = 30.

 −5 0  0 0

Letting f4 = t, the possible flows are f1 = t − 5, f2 = 25 − t, f3 = 30 − t, f4 = t.

98

CHAPTER 2. SYSTEMS OF LINEAR EQUATIONS (b) If f4 = t = 10, then the average flows on the other streets will be f1 = 10−5 = 5, f2 = 25−10 = 15, and f3 = 30 − 10 = 20. (c) Each flow must be nonnegative, so we get the constraints t ≥ 5, t ≤ 25, t ≤ 30, and t ≥ 0. Thus 5 ≤ t ≤ 25. This gives the constraints 0 ≤ f1 ≤ 20

0 ≤ f2 ≤ 20

5 ≤ f3 ≤ 25

5 ≤ f4 ≤ 25.

(d) Reversing all of the directions would have no effect on the solution. This would mean multiplying all rows of the augmented matrix by −1. However, multiplying a row by −1 is an elementary row operation, so the new matrix would have the same reduced row echelon form and thus the same solution. 17. (a) By applying the conservation of flow rule to each node we obtain the system of equations f1 + f2 = 100

f2 + f3 = f4 + 150

f4 + f5 = 150

f1 + 200 = f3 + f5 .

Rearrange the equations, set up the augmented matrix, and row reduce it:     100 1 0 −1 0 −1 −200 1 1 0 0 0 0 1 0 1 1 −1 0 150 1 0 1 300  →  − 0 0 0 1 1 0 1 1 150 0 0 150 1 0 −1 0 −1 −200 0 0 0 0 0 Then both f3 = s and f5 = t are free variables, so the flows are f1 = −200 + s + t,

f2 = 300 − s − t,

f3 = s,

f4 = 150 − t,

f5 = t.

(b) If DC is closed, then f5 = t = 0, so that f1 = −200 + s and f2 = 300 − s. Thus s, which is the flow through DB, must satisfy 200 ≤ s ≤ 300 liters per day. (c) If DB = f3 = S were closed, then f1 = −200 + t, so that t = f5 ≥ 200. But that would mean that f4 = 150 − t ≤ −50 < 0, which is impossible. (d) Since f4 = 150 − t, we must have 0 ≤ t ≤ 150, so that 0 ≤ f4 ≤ 150. Note that all of these values for t are consistent with the first three equations as well: if t = 0, then s = 250 produces positive values for all flows, while if t = 150, then s = 75 produces all positive values. So the flow through DB is between 0 and 150 liters per day. 18. (a) By applying the conservation of flow rule to each node we obtain: f3 + 200 = f1 + 100

f1 + 150 = f2 + f4

f2 + f5 = 300

f6 + 100 = f3 + 200

f4 + f7 = f6 + 100

f5 + f7 = 250.

Create and row-reduce the augmented matrix for the system:    1 0 −1 0 0 0 0 100 1 0 0 1 −1  0 1 0 0 −1 0 0 0 −150    0 0 0 1 1 0 0 1 0 0 300  → − 0  0 1 0 0 −1 0 −100   0 0 0 0 0 0 1 0 −1 1 100 0 0 0 0 0 0 0 1 0 1 250 0 0 0

0 0 0 1 0 0

0 0 0 0 1 0

−1 0 0 −1 −1 0 −1 1 0 1 0 0

 0 50  −100  100  250 0

Let f7 = t and f6 = s, so the flows are f1 = s

f2 = 50 + t

f3 = −100 + s

f4 = 100 + s − t

f5 = 250 − t

f6 = s

f7 = t.

(b) From the solution in part (a), since f1 = s = f6 , it is not possible for f1 and f6 to have different values.

2.4. APPLICATIONS

99

(c) If f4 = 0, then s − t = 100, so that t = 100 + s. Substituting this for t everywhere gives f1 = s

f2 = 150 + s

f3 = −100 + s

f4 = 0 f5 = 150 − s

f6 = s

f7 = 100 + s.

The constraints imposed by the requirement that all the fi be nonnegative means that 100 ≤ s ≤ 150, so that 100 ≤ f1 ≤ 150 50 ≤ f5 ≤ 100

250 ≤ f2 ≤ 300

0 ≤ f3 ≤ 50 f4 = 0

100 ≤ f6 ≤ 150

200 ≤ f7 ≤ 250.

19. Applying the current law to node A gives I1 + I3 = I2 or I1 − I2 + I3 = 0. Applying the voltage law to the top circuit, ABCA, gives −I2 − I1 + 8 = 0. Similarly, for the circuit ABDA we obtain −I2 + 13 − 4I3 = 0. This gives the system  I1 − I2 + I3 = 0 1 I1 + I2 = 8 ⇒ 1 I2 + 4I3 = 13 0

  1 0 0 0 8 → − 0 1 0 0 0 1 13

−1 1 1 0 1 4

 3 5 2

Thus I1 = 3 amps, I2 = 5 amps, and I3 = 2 amps. 20. Applying the current law to node B and the voltage law to the upper and lower circuits gives  1 I1 − I2 + I3 = 0 I1 + 2I2 = 5 ⇒ 1 0 2I2 + 4I3 = 8

  0 1 5 → − 0 0 8

−1 1 2 0 2 4

0 1 0

 1 2 1

0 0 1

Thus I1 = 1 amps, I2 = 2 amps, and I3 = 1 amp. 21. (a) Applying the current and voltage laws to the circuit gives the system: Node B:

I = I1 + I4

Node C:

I1 = I2 + I3

Node D:

I2 + I5 = I

Node E:

I3 + I4 = I5

Circuit ABEDA: −2I4 − I5 + 14 = 0 Circuit BCEB:

−I1 − I3 + 2I4 = 0

Circuit CDEC: −2I2 + I3 + I5 = 0 Construct the augmented matrix and row-reduce it:  1 0  1  0  0  0 0

−1 0 0 −1 0 1 −1 −1 0 0 0 −1 0 0 −1 0 0 1 1 −1 0 0 0 2 1 1 0 1 −2 0 0 2 −1 0 −1

  1 0  0  0  0  0  0 → − 0  14  0 0 0 0 0

0 1 0 0 0 0 0

0 0 1 0 0 0 0

0 0 0 1 0 0 0

0 0 0 0 1 0 0

0 0 0 0 0 1 0

 10 6  4  2  4  6 0

Thus the currents are I = 10, I1 = 6, I2 = 4, I3 = 2, I4 = 4, and I5 = 6 amps. (b) From Ohm’s Law, the effective resistance is Reff =

V I

=

14 10

=

7 5

ohms.

(c) To force the current in CE to be zero, we force I3 = 0 and let r be the resistance in BC. This

100

CHAPTER 2. SYSTEMS OF LINEAR EQUATIONS gives the following system of equations: Node B:

I = I1 + I4

Node C:

I1 = I2

Node D:

I2 + I5 = I

Node E:

I4 = I5

Circuit ABEDA: −2I4 − I5 + 14 = 0 −rI1

+ 2I4 = 0

Circuit CDEC: −2I2

+ I5 = 0

Circuit BCEB:

It is easier to solve this system by substitution. Substitute I4 = I5 in the circuit equation for 14 7 ABEDA to get −3I5 + 14 = 0, so that I4 = I5 = 14 3 . Thus −2I2 + 3 = 0 so that I1 = I2 = 3 . Then the circuit equation for BCED gives us −r ·

7 14 +2· =0 3 3

⇒ r = 4.

So if we use a 4 ohm resistor in BC, the current in CE will be zero. 22. (a) Applying the voltage law to Figure 2.23(a) gives IR1 +IR2 = E. But from Ohm’s Law, E = IReff , so that I(R1 + R2 ) = IReff and thus Reff = R1 + R2 . (b) Applying the current and voltage laws to Figure 2.23(b) gives the system of equations I = I1 + I2 , −I1 R1 + I2 R2 = 0, −I1 R1 + E = 0, and −I2 R2 + E = 0. Gauss-Jordan elimination gives     2 )E 1 0 0 (R1R+R 0 1 −1 −1 1 R2   0 −R E 0 R2  0 1 0   1 R1 − → .  E  0 −R1 0 −E  0 0 1 R 2

0 −R2

0 2 )E Thus I = (R1R+R , I1 = 1 R2 equation for I gives

I=

E R1 ,

I2 =

E R2 .

(R1 + R2 )IReff R1 R2

−E

0

0

0

0

But E = IReff ; substituting this value of E into the



Reff =

1 R1 +R2 R1 R2

=

1 R1

1 +

1 R2

.

23. The matrix for this system, with inputs on the left, outputs along the top, is

Farming Manufacturing

Farming

Manufacturing

1 2 1 2

1 3 2 3

Let the output of the farm sector be f and the output of the manufacturing sector be m. Then 1 1 f+ m= f 2 3 1 2 f + m = m. 2 3 Simplifying and row-reducing the corresponding augmented matrix gives " # " 1 − 12 f + 13 m = 0 − 12 0 1 − 23 3 ⇒ → − 1 1 1 − 13 0 0 0 2f − 3m = 0 2 Thus f = 23 m, so that farming and manufacturing must be in a 2 : 3 ratio.

# 0 0

2.4. APPLICATIONS

101

24. The matrix for this system, with needed resources on the left, outputs along the top, is Coal Steel

Coal Steel 0.3 0.8 0.7 0.2

Let the coal output be c and the steel output be s. Then 0.3c + 0.8s = c 0.7c + 0.2s = s. Simplifying and row-reducing the corresponding augmented matrix gives    −0.7c + 0.8s = 0 −0.7 0.8 0 1 − 87 → − ⇒ 0.7c − 0.8s = 0 0.7 −0.8 0 0 0

 0 0

Thus c = 87 s, so that coal and steel must be in an 8 : 7 ratio. For a total output of $20 million, coal 8 7 28 output must be 8+7 · 20 = 32 3 ≈ $10.667 million, while steel output is 8+7 · 20 = 3 ≈ $9.333 million. 25. Let x be the painter’s rate, y the plumber’s rate, and z the electrician’s rate. From the given schedule, we get the linear system 2x + y + 5z = 10x 4x + 5y + z = 10y 4x + 4y + 4z = 10z. Simplifying and row-reducing the corresponding augmented matrix gives    −8x + y + 5z = 0 −8 1 5 0 1 0 − 13 18    4x − 5y + z = 0 ⇒  4 −5 1 0 → − 0 1 − 79 4x + 4y − 6z = 0 4 4 −6 0 0 0 0

 0  0 . 0

13 Thus x = 18 z and y = 79 z. Let z = 54; this is a whole number in the required range such that the others will be whole numbers as well. Then x = 39 and y = 42. The painter should charge $39 per hour, the plumber $42 per hour, and the electrician $54 per hour.

26. From the given matrix, we get the linear system 1 L+ 4 1 1 B+ L+ 2 4 1 1 B+ L+ 4 4 1 1 B+ L+ 4 4

1 T 8 1 T 4 1 T 2 1 T 8

1 + Z 6 1 + Z 6 1 + Z 3 1 + Z 3

= B = L = T = Z.

Simplify and row-reduce the associated augmented matrix:  1 1 1 −B + 14 L + 81 T + 16 Z = 0 −1 4 8 6  1 3 1 1 1 1 1 − 34  2B − 4L + 4T + 6Z = 0 4 6 ⇒  21 1 1 1 1 1 1 1  4 −2 4B + 2L − 2T + 3Z = 0 2 3 1 1 1 2 1 1 1 − 32 4B + 8L + 8T − 3Z = 0 4 8 8

  0 1 0 0   0 0 1 0 − → 0 0 0 1 0 0 0 0

− 23 − 65 − 85 0

 0 0  . 0 0

Thus B = 23 Z, L = 56 Z, and T = 85 Z. Furthermore, B is the smallest price of the four, so B = 23 Z must be at least 50, in which case Z = 75. With that value of Z, we get B = 50, L = 90, and T = 120.

102

CHAPTER 2. SYSTEMS OF LINEAR EQUATIONS

27. The matrix for this system, with needed resources on the left, outputs along the top, is Coal Steel

Coal Steel 0.15 0.2 0.1 0.1

(a) Let the coal output be c and the steel output be s. Then since 45 million of external demand is required for coal and 124 million for steel, we have c = 0.15c + 0.25s + 45 s = 0.2c + 0.1s + 124. Simplify and row-reduce the associated matrix:  0.85c − 0.25s = 45 0.85 ⇒ −0.2c + 0.9s + 124 −0.2

−0.25 0.9

  1 45 → − 124 0

0 1

100 160



So the coal industry should produce $100 million and the steel industry should produce $160 million. (b) The new external demands for coal and steel are 40 and 130 million respectively. The only thing that changes in the equations is the constant terms, so we have     0.85c − 0.25s = 40 0.85 −0.25 40 1 0 95.8 → − ⇒ −0.2c + 0.9s + 130 −0.2 0.9 130 0 1 165.73 So the coal industry should reduce production by 100−95.8 = $4.2 million, while the steel industry should increase production by 165.73 − 160 = $5.73 million. 28. This is an open system with external demand given by the other city departments. From the given matrix, we get the system 1 + 0.2A + 0.1H + 0.2T = A 1.2 + 0.1A + 0.1H + 0.2T = H 0.8 + 0.2A + 0.4H + 0.3T = T . Simplifying and row-reducing the associated augmented matrix gives    0.8 −0.1 −0.2 0.8A − 0.1H − 0.2T = 1 1 1 −0.1A + 0.9H − 0.2T = 1.2 ⇒ −0.1 0.9 −0.2 1.2 → − 0 −0.2A − 0.4H + 0.7T = 0.8 −0.2 −0.4 0.7 0.8 0

0 1 0

0 0 1

 2.31 2.28 . 3.11

So the Administrative department should produce $2.31 million of services, the Health department $2.28 million, and the Transportation department $3.11 million. 29. (a) Over Z2 , we need to solve x1 a + x2 b + · · · + x5 e = t + s:      0 0 1 1 0 0 0 0 1 1 1 1 0 0          s= 0 , t = 0 , ⇒ 0 1 1 1 0 0 1 0 0 1 1 1 0 0 0 1 1 0 0

  0 1 0 0 0 1 0 1 0 0 1 1   0 − → 0 0 1 0 0  0 0 0 1 1 1 0 0 0 0 0 0

 1 1  1  0 0

Hence x5 is a free variable, so there are exactly two solutions (with x5 = 0 and with x5 = 1). Solving for the other variables, we get x1 = 1 + x5 , x2 = 1 + x5 , x3 = 1, and x4 = x5 . So the two solutions are         x1 1 x1 0 x2  1 x2  0         x3  = 1 , x3  = 1 .         x4  0 x4  1 x5 0 x5 1 So we can push switches 1, 2, and 3 or switches 3, 4, and 5.

2.4. APPLICATIONS (b) In this case, t = e2 over Z2 ,  1 1  0  0 0

103 so the matrix becomes   1 0 0 0 0 1 0 0 0 1 0 1 0 0 1 1 1 0 0 1   − 1 1 1 0 0 0 0 1 0 0 → 0 1 1 1 0 0 0 0 1 1 0 0 1 1 0 0 0 0 0 0

 0 0  0 . 0 1

This is an inconsistent system, so there is no solution. Thus it is impossible to start with all the lights off and turn on only the second light. 30. (a) With s = e4 and t = e2 + e4  1 1  0  0 0

over Z2 , so that s + t = e2   1 0 0 0 0 1 0 0 1 1 1 0 0 1   − 1 1 1 0 0 0 0 → 0 0  0 1 1 1 0 0 0 1 1 0 0 0

and we get the matrix  0 0 1 0 0 0 1 0  1 0 0 0 . 0 1 1 0 0 0 0 1

Since the system is inconsistent, there is no solution in this case: it is impossible to start with only the fourth light on and end up with only the second and fourth lights on. (b) In this case t = e2 over Z2 , so that s + t = e2 + e4 , and we get     1 1 0 0 0 0 1 0 0 0 1 1 1 1 1 0 0 1 0 1 0 0 1 1     0 1 1 1 0 0 →     − 0 0 1 0 0 1 . 0 0 1 1 1 1 0 0 0 1 1 0 0 0 0 1 1 0 0 0 0 0 0 0 So there are two solutions:

        x1 x1 1 0 x2  1 x2  0         x3  = 1 , x3  = 1 .         x4  0 x4  1 0 x5 1 x5 So we can push switches 1, 2, and 3 or switches 3, 4, and 5.

31. Recall from Example 2.35 that it does not matter what order we push the switches in, and that pushing a switch twice is the same as not pushing it at all. So with           0 0 0 1 1 0 0 1 1 1                    a= 0 , b = 1 , c = 1 , d = 1 , e = 0 , 1 1 1 0 0 1 1 0 0 0 there are 25 = 32 possible choices, since there are 5 buttons and we have the choice of pushing or not pushing each one. If we make a list of the resulting 32 resulting light states and eliminate duplicates, we see that the list of possible states is                 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1                 0 , 0 , 1 , 1 , 1 , 1 , 0 , 0 ,                 0 1 1 0 1 0 0 1 0 1 1 0 0 1 1 0                 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0                 1 , 1 , 0 , 0 , 0 , 0 , 1 , 1 .                 0 1 1 0 1 0 0 1 0 1 1 0 0 1 1 0

104

CHAPTER 2. SYSTEMS OF LINEAR EQUATIONS

32. (a) Over Z3 , we need to solve x1 a + x2 b + x3 c = t, where      0 1 1 1 0 0 1 1 1 2 → − 0 t = 2 , so that 0 1 1 1 0 1

0 1 0

0 0 1

 1 2 . 2

Thus x1 = 1, x2 = 2, and x3 = 2, so we must push switch A once and switches B and C twice each. (b) Over Z3 , we need to solve x1 a + x2 b + x3 c = t, where      1 1 1 1 0 1 1 1 1 0 → − 0 t = 0 , so that 0 1 1 1 0 1

0 1 0

0 0 1

 2 2 . 2

In other words, we must push each switch twice. (c) Let x, y, and z be the final states  1 1 1 1 0 1

of the three lights. Then we want to solve    1 0 0 y + 2z 0 x − 0 1 0 x + 2y + z  . 1 y → 1 z 0 0 1 2x + y

So we can achieve this configuration if we push switch A y + 2z times, switch B x + 2y + z times, and switch C 2x + y times (adding in Z3 ). 33. The switches here correspond to the same vectors as in Example 2.35, but now the numbers are interpreted in Z3 rather than in Z2 . So in Z53 , with s = 0 (all lights initially off), we need to solve x1 a + x2 b + · · · + x5 e = t, where       1 1 0 0 0 2 2 1 0 0 0 1 1 1 1 1 0 0 1 0 1 0 0 2 1 1        , so that 0 1 1 1 0 2 →   2 t=     − 0 0 1 0 0 2 0 0 1 1 1 1 0 0 0 1 1 2 1 2 0 0 0 1 1 2 0 0 0 0 0 0 So x5 is a free variable, so there are three solutions, corresponding to x5 = 0, 1, or 2. Solving for the other variables gives x1 = 1 − x5 ,

x2 = 1 − 2x5 ,

x3 = 2,

so that the solutions are (remembering that 1 − 2 = 2 and       x1 1 0 x2  1 2       x3  = 2 , 2 ,       x4  2 1 x5 0 1 34. The possible configurations are  1 1  0  0 0

1 1 1 0 0

0 1 1 1 0

0 0 1 1 1

x4 = 2 − x5 ,

2 · 2 = 1 in Z3 )   2 0   2 .   0 2

    0 s1 s1 + s2     0  s2  s1 + s2 + s3      0 s3  = s2 + s3 + s4   1 s4  s3 + s4 + s5  1 s5 s4 + s5

where si ∈ {0, 1, 2} is the number of times that switch i is thrown.

2.4. APPLICATIONS

105

35. (a) Let a value of 1 represent white, and 0 represent black. Let ai correspond to touching the number i; it has a 1 in a given entry if the value of that square changes when i is pressed, and a 0 if it does not (thus the 1’s correspond to the squares with the blue dots in them in the diagram). Since each square has two states, white and black, we need to solve, over Z2 , the equation s + x1 a1 + x2 a2 + · · · + x9 a9 = t. Here

    0 0 0 1     0 1     0 1      , t = 0 , 0 s=     0 1     0 1     0 1 0 0

Row-reducing the  1 1 1 1  0 1  1 0  1 0  0 0  0 0  0 0 0 0

so that

x1 a1 + x2 a2 + · · · + x9 a9 = 0 − s = s.

corresponding augmented matrix gives   0 1 0 0 0 0 0 0 1 0 0 1 1 0 1 0 0 0 0 1    1 0 0 1 0 0 0 1  0 0  0 1 1 0 1 0 0 1  0 0  → − 1 0 1 0 1 0 1 0  0 0  1 0 1 1 0 0 1 1  0 0  0 1 0 0 1 1 0 1  0 0 0 0 1 0 1 1 1 1 0 0 0 0 0 1 0 1 1 0 0 0

0 0 1 0 0 0 0 0 0

0 0 0 1 0 0 0 0 0

0 0 0 0 1 0 0 0 0

0 0 0 0 0 1 0 0 0

0 0 0 0 0 0 1 0 0

0 0 0 0 0 0 0 1 0

0 0 0 0 0 0 0 0 1

 0 0  1  0  0 . 0  1  0 0

So touching the third and seventh squares will turn all the squares black. (b) In part (a), note that the row-reduced matrix consists of the identity matrix and a column vector corresponding to the constants. Changing the constants cannot produce an inconsistent system, since the matrix will still reduce to the identity matrix. So any configuration is reachable, since we can solve x1 a1 + x2 a2 + · · · + x9 a9 = s for any s. 36. In this version of the puzzle, the squares have three states, so we work over Z3 , with 2 corresponding to black, 1 to gray, and 0 to white. Then we need to go from     2 2 2 2     0 2     1 2      to t = 2 . 1 s=     0 2     2 2     0 2 1 2 Thus we want to solve

  0 0   2   1    x1 a1 + x2 a2 + · · · + x9 a9 = t − s =  1 . 2   0   2 1

106

CHAPTER 2. SYSTEMS OF LINEAR EQUATIONS Row-reducing the  1 1  0  1  1  0  0  0 0

corresponding augmented matrix gives   1 0 1 0 1 0 0 0 0 0 0 0 1 1 1 0 1 0 0 0 0 0    1 1 0 0 1 0 0 0 2  0 0  0 0 1 1 0 1 0 0 1  0 0  → − 0 1 0 1 0 1 0 1 1  0 0  0 1 0 1 1 0 0 1 2  0 0  0 0 1 0 0 1 1 0 0  0 0 0 0  0 0 0 1 0 1 1 1 2 0 0 0 0 1 0 1 1 1 0 0

0 0 1 0 0 0 0 0 0

0 0 0 1 0 0 0 0 0

0 0 0 0 1 0 0 0 0

0 0 0 0 0 1 0 0 0

0 0 0 0 0 0 1 0 0

0 0 0 0 0 0 0 1 0

0 0 0 0 0 0 0 0 1

 0 1  0  2  2 , 1  0  1 2

showing that a solution exists. 37. Let Grace and Hans’ ages be g and h. Then g = 3h and g + 5 = 2(h + 5), or g = 2(h + 5) − 5 = 2h + 5. It follows that g = 3h = 2h + 5, so that h = 5 and g = 15. So Hans is 5 and Grace is 15. 38. Let the ages be a, b, and c. Then the first two conditions give us a + b + c = 60 and a − b = b − c. For the third condition, Bert will be as old as Annie is now in a − b years; at that time, Annie will be a + (a − b) = 2a − b years, and we are given that 2a − b = 3c. So we get the system     31 0 0 28 31 1 1 60 a + b + c = 60 1 0 → −  0 1 0 20 . a − 2b + c = 0 ⇒  1 −2 0 0 1 12 2 −1 −3 0 2a − b − 3c = 0 Annie is 28, Bert is 20, and Chris is 12. 39. Let the areas of the fields be a and b. We have the two equations a + b = 1800 2 1 a + b = 1100. 3 2 Multiply the first equation by two and subtract the first equation from the result to get 13 a = 400, so that a = 1200 square yards and thus b = 1800 − a = 600 square yards. 40. Let x1 , x2 , and x3 be the number of bundles of the first, second, and third types of corn. Then from the given conditions we get 3x1 + 2x2 + x3 = 39 2x1 + 3x2 + x3 = 34 x1 + 2x2 + 3x3 = 26. Row-reduce the augmented matrix to get  3 2 1  2 3 1 1 2 3

  1 0 0 39   34 → − 0 1 0 26 0 0 1



37 4 17  . 4  11 4

Therefore there are 9.25 measures of corn in a bundle of the first type, 4.25 measures of corn in a bundle of the second type, and 2.75 measures of corn in a bundle of the third type. 41. (a) The addition table gives a + c = 2, a + d = 4, b + c = 3, b + d = 5. Construct and row-reduce the corresponding augmented matrix:     1 0 1 0 2 1 0 0 1 4 1 0 0 1 4 0 1 0 1 5  −   0 1 1 0 3 → 0 0 1 −1 −2 . 0 1 0 1 5 0 0 0 0 0

2.4. APPLICATIONS

107

So d is a free variable. Solving for the other variables in terms of d gives a = 4 − d, b = 5 − d, and c = d − 2. So there are an infinite number of solutions; for example, with d = 1, we get a = 3, b = 4, and c = −1. (b) We have a + c = 3, a + d = 4, b + c = 6, b + d = 5. Construct and row-reduce the corresponding augmented matrix:     1 0 0 1 0 1 0 1 0 3 1 0 0 1 4 0 1 0 1 0  −   0 0 1 −1 0 . 0 1 1 0 6 → 0 1 0 1 5 0 0 0 0 1 So this is an inconsistent system, and there are no such values of a, b, c, and d. 42. The addition table gives a + c = w, a + d = y, corresponding augmented matrix:    1 1 0 1 0 w 1 0 0 1 y  0 −   0 0 1 1 0 x  → 0 1 0 1 z 0

b + c = x, b + d = z. Construct and row-reduce the 0 0 1 0 0 1 0 0

1 1 −1 0

 w  x .  y−w z−x−y+w

In order for this to be a consistent system, we must have z − x − y + w = 0, or z − y = x − w. Another way of phrasing this is x + y = z + w; in other words, the sum of the diagonal entries must equal the sum of the off-diagonal entries. 43. (a) The addition table gives the following system of equations: a+d=3

b+d=2

c+d=1

a+e=5

b+e=4

c+e=3

a+f =4

b+f =3

c + f = 1.

Construct and row-reduce the corresponding augmented matrix:    1 0 0 1 0 0 3 1 0 0 0 0 1 1 0 0 0 1 0 5 0 1 0 0 0 1    1 0 0 0 0 1 4 0 0 1 0 0 1    0 1 0 1 0 0 2 0 0 0 1 0 −1    0 1 0 0 1 0 4 →    − 0 0 0 0 1 −1 0 1 0 0 0 1 3 0 0 0 0 0 0    0 0 1 1 0 0 1 0 0 0 0 0 0    0 0 1 0 1 0 3 0 0 0 0 0 0 0 0 1 0 0 1 1 0 0 0 0 0 0

 0 0  0  0  0 . 1  0  0 0

The sixth row shows that this is an inconsistent system. (b) The addition table gives the following system of equations: a+d=1

Construct and row-reduce  1 0 1 0  1 0  0 1  0 1  0 1  0 0  0 0 0 0

b+d=2

c+d=3

a+e=3

b+e=4

c+e=5

a+f =4

b+f =5

c + f = 6.

the corresponding augmented matrix:   0 1 0 0 1 1 0 0 0 0 1 0 1 0 0 0 0 0 1 0 3 1    0 0 0 1 4 1  0 0 1 0 0  0 1 0 0 2  0 0 0 1 0 −1  0 0 1 0 4 − → 0 0 0 0 1 −1  0 0 0 1 5 0  0 0 0 0 0  1 1 0 0 3  0 0 0 0 0 0  1 0 1 0 5 0 0 0 0 0 0 1 0 0 1 6 0 0 0 0 0 0

 4 5  6  −3  −1 . 0  0  0 0

108

CHAPTER 2. SYSTEMS OF LINEAR EQUATIONS So f is a free variable, and there are an infinite number of solutions of the form a = 4 − f , b = 5 − f , c = 6, d = f − 3, e = f − 1.

44. Generalize the addition table to + d e f

a m p s

b n q t

c o r u

The addition table gives the following system of equations:

Construct and row-reduce  1 0 0 1 0 0  1 0 0  0 1 0  0 1 0  0 1 0  0 0 1  0 0 1 0 0 1

a+d=m

b+d=n

c+d=o

a+e=p

b+e=q

c+e=r

a+f =s

b+f =t

c + f = u.

the corresponding augmented matrix:   1 0 0 m 1 0 0 0 0 1 0 1 0 0 0 1 0 1 0 p    0 0 1 s  0 0 1 0 0 1  1 0 0 n  0 0 0 1 0 −1  0 1 0 q → −  0 0 0 0 1 −1  0 0 1 t  0 0 0 0 0 0  1 0 0 o  0 0 0 0 0 0 0 1 0 r  0 0 0 0 0 0 0 0 1 u 0 0 0 0 0 0

 m  n   o   m−p  n + p − m − t . m − n − p + q  n−o−q+r  p−q−s+t  q−r−t+u

For the table to be valid, the last entry in the zero rows must also be zero, so we need the conditions m + q = n + p, n + r = o + q, p + t = q + s, and q + u = r + t. In terms of the addition table, this means that for each 2 × 2 submatrix, the sum of the diagonal entries must equal the sum of the off-diagonal entries (compare with Exercise 42). 45. (a) Since the three points (0, 1), (−1, 4), and (2, 1) must satisfy the equation y = ax2 + bx + c, substitute those values to get the three equations c = 1, a − b + c = 4, and 4a + 2b + c = 1. Construct and row-reduce the corresponding augmented matrix:     0 0 1 1 1 0 0 1 1 −1 1 4 → − 0 1 0 −2 4 2 1 1 0 0 1 1 Thus a = 1, b = −2, and c = 1, so that the equation of the parabola is y = x2 − 2x + 1. (b) Since the three points (−3, 1), (−2, 2), and (−1, 5) must satisfy the equation y = ax2 + bx + c, substitute those values to get the three equations 9a − 3b + c = 1, 4a − 2b + c = 2, and a − b + c = 5. Construct and row-reduce the corresponding augmented matrix:     9 −3 1 1 1 0 0 1 4 −2 1 2 → − 0 1 0 6  1 1 1 5 0 0 1 10 Thus a = 1, b = 6, and c = 10, so that the equation of the parabola is y = x2 + 6x + 10. 46. (a) Since the three points (0, 1), (−1, 4), and (2, 1) must satisfy the equation x2 + y 2 + ax + by + c = 0, substitute those values to get the three equations 1 + b + c = 0 1 + 16 − a + 4b + c = 0 4 + 1 + 2a + b + c = 0



b + c = −1 a − 4b − c = 17 2a + b + c = −5 .

2.4. APPLICATIONS

109

Construct and row-reduce the corresponding augmented matrix:  0 1 2

1 1 −4 −1 1 1

  −1 1 17 → − 0 −5 0

0 1 0

0 0 1

 −2 −6 5

Thus a = −2, b = −6, and c = 5, so that the equation of the circle is x2 + y 2 − 2x − 6y + 5 = 0. Completing the square gives (x − 1)2 + (y − 3)2 +√5 − 1 − 9 = 0, so that (x − 1)2 + (y − 3)2 = 5. This is a circle whose center is (1, 3), with radius 5. A sketch of the circle follows part (b). (b) Since the three points (−3, 1), (−2, 2), and (−1, 5) must satisfy the equation x2 +y 2 +ax+by+c = 0, substitute those values to get the three equations 9 + 1 − 3a + b + c = 0 4 + 4 − 2a + 2b + c = 0 1 + 25 − a + 5b + c = 0



3a − b − c = 10 2a − 2b − c = 8 a − 5b − c = 26 .

Construct and row-reduce the corresponding augmented matrix:  3 2 1

−1 −2 −5

−1 −1 −1

  10 1 0 0 8 → − 0 1 0 26 0 0 1

 12 −10 36

Thus a = 12, b = −10, and c = 36, so that the equation of the circle is x2 +y 2 +12x−10y +36 = 0. Completing the square gives (x + 6)2 + (y − 5)2 + 36 − 36 − 25 = 0, so that (x + 6)2 + (y − 5)2 = 25. This is a circle whose center is (−6, 5), with radius 5. Sketches of both circles are: 6 10 5

9 8

H-1, 4L

4

7 6

3

H1, 3L H-6, 5L

H-1, 5L

5 4

2 3 1

H2, 1L

H0, 1L

H-2, 2L

2

H-3, 1L

-2

-1

1

2

3

4

-11 -10 -9

-8

-7

-6

-5

-4

1

-3

-2

-1

3x+1 A B 47. We have x23x+1 +2x−3 = (x−1)(x+3) = x−1 + x+3 , so that clearing fractions, we get (x + 3)A + (x − 1)B = 3x + 1. Collecting terms gives (A + B)x + (3A − B) = 3x + 1. This gives the linear system A + B = 3, 3A − B = 1. Construct and row-reduce the corresponding augmented matrix:

 1 3 Thus A = 1 and B = 2, so that

3x+1 x2 +2x−3

1 −1 =

1 x−1

  3 1 0 → − 1 0 1 +

2 x+3 .

 1 . 2

110

CHAPTER 2. SYSTEMS OF LINEAR EQUATIONS

48. We have x2 − 3x + 3 A x3 − 3x + 3 B C = + = + x3 + 2x2 + x x(x + 1)2 x x + 1 (x + 1)2 x2 − 3x + 3 = A(x + 1)2 + Bx(x + 1) + Cx 2



2

x − 3x + 3 = (A + B)x + (2A + B + C)x + A A + B = 1,

2A + B + C = −3,





A = 3.

This is most easily solved by substitution: A = 3, so that A + B = 1 gives B = −2. Substituting into the remaining equation gives 2 · 3 − 2 + C = −3, so that C = −7. Thus x2 − 3x + 3 3 2 7 = − − x3 + 2x2 + x x x + 1 (x + 1)2 49. We have x−1 A Bx + C Dx + E = + 2 + 2 (x + 1)(x2 + 1)(x2 + 4) x+1 x +1 x +4



x − 1 = A(x2 + 1)(x2 + 4) + (Bx + C)(x + 1)(x2 + 4) + (Dx + E)(x + 1)(x2 + 1). Expand the right-hand side and collect terms to get x − 1 = (A + B + D)x4 + (B + C + D + E)x3 + (5A + 4B + C + D + E)x2 + (4B + 4C + D + E)x + (4A + 4C + E). This gives the linear system A+ B

=

0

B+ C +D+E =

0

5A + 4B + C + D + E =

0

4B + 4C + D + E =

1

4A

+D

+ 4C

+ E = −1.

Construct and row-reduce the corresponding augmented matrix:    0 1 1 0 1 0 1 0 0 0    0 1 1 1 1 0 0 1 0 0    5 4 1 1 1 0 − → 0 0 1 0     1 0 0 0 1 0 4 4 1 1 4 0 4 0 1 −1 0 0 0 0

0 0 0 0 1

− 15



1  3

0 .

2  − 15  − 15

2 Thus A = − 15 , B = 13 , C = 0, D = − 15 , and E = − 15 , giving 2 1 x−1 1 x 15 x + 5 = − + − (x + 1)(x2 + 1)(x2 + 4) 5(x + 1) 3(x2 + 1) x2 + 4 1 x 2x + 3 =− + − . 5(x + 1) 3(x2 + 1) 15(x2 + 4)

50. We have x3 + x + 1 A B Cx + D Ex + F Gx + H Ix + J = + + 2 + 2 + 2 + 2 . 2 2 3 2 x(x − 1)(x + x + 1)(x + 1) x x−1 x +x+1 x +1 (x + 1) (x + 1)3

2.4. APPLICATIONS

111

Multiply both sides of the equation by x(x − 1)(x2 + x + 1)(x2 + 1)3 , simplify, and equate coefficients of xn to get the system A + B + C + E = 0 −3A + 3B − 3C + 3D − 2E + F − G + H + J = 0 B−C +D+F =0

A + 4B + C − 3D − 2F − H = 1

3A + 4B + 3C − D + 2E + G = 0

−3A + B − C + D − E − G − I = 0

−A + 3B − 3C + 3D − E + 2F + H = 0

B−D−F −H −J =1

3A + 6B + 3C − 3D + E − F + G + I = 0

−A = 1.

Construct  1 1   0 1   3 4   −1 3   3 6  −3 3    1 4  −3 1    0 1 −1 0

and row-reduce the corresponding augmented matrix:   1 0 1 0 1 0 0 0 0 0 0   −1 1 0 1 0 0 0 0 0 0 1    3 −1 2 0 1 0 0 0 0  0 0   −3 3 −1 2 0 1 0 0 0 0 0   0 0 3 −3 1 −1 1 0 1 0 0 → −  −3 3 −2 1 −1 1 0 1 0  0 0   1 −3 0 −2 0 −1 0 0 1 0 0    −1 1 −1 0 −1 0 −1 0 0  0 0   0 −1 0 −1 0 −1 0 −1 1 0 0 0 0 0 0 0 0 0 0 1 0 0

0 0 1 0 0 0 0 0 0 0

0 0 0 1 0 0 0 0 0 0

0 0 0 0 1 0 0 0 0 0

0 0 0 0 0 1 0 0 0 0

0 0 0 0 0 0 1 0 0 0

0 0 0 0 0 0 0 1 0 0

0 0 0 0 0 0 0 0 1 0

0 0 0 0 0 0 0 0 0 1

 −1 1  8 −1   0  15  8 . − 98   7  4 1 −4 1 2 1 2

This gives the partial fraction decomposition 1 15 7 1 9 1 1 x3 + x + 1 1 x 8 8 x− 8 4x − 4 2x + 2 = − + − + + + x(x − 1)(x2 + x + 1)(x2 + 1)3 x x − 1 x2 + x + 1 x2 + 1 (x2 + 1)2 (x2 + 1)3 1 1 x 15x − 9 7x − 1 x+1 =− + − + + + . x 8(x − 1) x2 + 1 8(x2 + 1) 4(x2 + 1)2 2(x2 + 1)3

51. Suppose that 1 + 2 + · · · + n = an2 + bn + c, and let n = 0, 1, and 2 to get the system c=0 a+ b+c=1 4a + 2b + c = 3. Construct and row-reduce the associated  0 0  1 1 4 2 Thus a = b =

1 2

augmented matrix:   1 0 1 0 0   1 1 → − 0 1 0 1 3 0 0 1



1 2 1 2

0

and c = 0, so that 1 + 2 + ··· + n =

1 2 1 1 n + n = n(n + 1). 2 2 2

52. Suppose that 12 + 22 + · · · + n2 = an3 + bn2 + cn + d, and let n = 0, 1, 2, and 3 to get the system d= 0 a+ b+ c+d= 1 8a + 4b + 2c + d = 5 27a + 9b + 3c + d = 14.

112

CHAPTER 2. SYSTEMS OF LINEAR EQUATIONS Construct and row-reduce the associated  0 0 0 1 1 1   8 4 2 27 9 3

augmented matrix:   1 0 1 0 0 1 1 1   − → 1 5  0 0 0 0 1 14

0 0 1 0

1 3 1 2 . 1  6



0 0 0 1

0

Thus a = 31 , b = 12 , c = 16 , and d = 0, so that 12 + 22 + · · · + n2 =

1 3 1 2 1 1 1 n + n + n = (2n3 + 3n2 + n) = n(n + 1)(2n + 1). 3 2 6 6 6

53. Suppose that 13 + 23 + · · · + n3 = an4 + bn3 + cn2 + dn + e, and let n = 0, 1, 2, 3, and 4 to get the system e=

0

c+ d+e=

1

16a + 8b + 4c + 2d + e =

9

a+

b+

81a + 27b + 9c + 3d + e = 36 256a + 64b + 16c + 4d + e = 100. Construct and row-reduce the  0   1   16    81 256

associated augmented matrix:   0 0 0 1 0 1 0     1 1 1 1 1 0 1   0 0 9  → − 8 4 2 1     27 9 3 1 36  0 0 64 16 4 1 100 0 0

0 0 1 0 0

0 0 0 1 0

0 0 0 0 1



1 4 1  2 1 . 4

 0 0

Thus a = 14 , b = 12 , c = 14 , and d = e = 0, so that 1 1 1 1 1 1 + 2 + · · · + n = n4 + n3 + n2 = n2 (n2 + 2n + 1) = n2 (n + 1)2 = 4 2 4 4 4 3

2.5

3

3



n(n + 1) 2

2 .

Iterative Methods for Solving Linear Systems

1. Start by solving the first equation for x1 and the second for x2 : 6 x1 = + 7 4 x2 = + 5

1 x2 7 1 x1 . 5

Using the initial vector [x1 , x2 ] = [0, 0], we get a sequence of approximations: n x1 x2

0 0 0

1 0.857 0.800

2 0.971 0.971

3 0.996 0.994

4 0.999 0.999

5 1.000 1.000

6 1.000 1.000

Solving the first equation for x2 gives x2 = 7x1 − 6; substituting into the second equation gives x1 − 5(7x1 − 6) = −4

so that

− 34x1 = −34



x1 = 1.

Substituting this value back into either equation gives x2 = 1, so the exact solution is [x1 , x2 ] = [1, 1].

2.5. ITERATIVE METHODS FOR SOLVING LINEAR SYSTEMS

113

2. Solving for x1 and x2 gives 5 − 2 x2 = −1 +

x1 =

1 x2 2 x1 .

Using the initial vector [x1 , x2 ] = [0, 0], we get a sequence of approximations: n x1 x2 n x1 x2

0 0 0

1 2.5 −1.0

13 2.008 0.969

14 2.016 1.008

2 3 1.5

3 1.75 2.0

15 1.996 1.016

4 1.4 0.75

16 1.992 0.996

5 2.125 0.5

17 2.002 0.992

6 2.25 1.125

18 2.004 1.002

7 1.938 1.25

19 1.999 1.004

Using x2 = x1 − 1 in the first equation gives x1 = exact solution is [x1 , x2 ] = [2, 1].

5 2

8 1.875 0.938

20 1.998 0.999

9 2.031 0.875

21 2.000 0.998

22 2.001 1.000

10 2.062 1.031 23 2.000 1.001

11 1.984 1.062 24 2.000 1.000

12 1.969 0.984 25 2.000 1.000

− 12 (x1 − 1), so that x1 = 2 and x2 = 1. So the

3. Solving for x1 and x2 gives 1 x2 + 9 2 x2 = x2 + 7 x1 =

2 9 2 . 7

Using the initial vector [x1 , x2 ] = [0, 0], we get a sequence of approximations: n x1 x2

0 0 0

1 0.222 0.286

2 0.254 0.349

3 0.261 0.358

4 0.262 0.360

5 0.262 0.361

6 0.262 0.361

Solving the first equation for x2 gives x2 = 9x1 − 2. Substitute this into the second equation to get 8 8 x1 − 3.5(9x2 − 2) = −1, so that 30.5x1 = 8 and thus x1 = 30.5 ≈ 0.262295. This gives x2 = 9 · 30.5 −2 ≈ 0.360656. 4. Solving for x1 , x2 , and x3 gives x1 = − x2 = x3 =

1 x2 + 20 1 x1 + 10 1 x1 − 10

1 x3 + 20 1 x3 − 10 1 x2 + 10

17 20 13 10 9 . 5

Using the initial vector [x1 , x2 , x3 ] = [0, 0, 0], we get a sequence of approximations: n x1 x2 x3

0 0 0 0

1 0.85 −1.3 1.8

2 1.005 −1.035 2.015

3 1.002 −0.998 2.004

4 1.000 −0.999 2.000

To solve the system directly, row-reduce the augmented matrix:    20 1 −1 17 1 0 0  1 −10 1 13 → − 0 1 0 −1 1 10 18 0 0 1 giving the exact solution [x1 , x2 , x3 ] = [1, −1, 2].

5 1.000 −1.000 2.000

 1 −1 2

6 1.000 −1.000 2.000

114

CHAPTER 2. SYSTEMS OF LINEAR EQUATIONS

5. Solve to get 1 1 x2 + 3 3 1 1 1 x2 = − x1 − x3 + 4 4 4 1 1 x3 = − x2 + . 3 3 x1 =

Using the initial vector [x1 , x2 , x3 ] = [0, 0, 0], we get a sequence of approximations n x1 x2 x3

0 0 0 0

1 0.333 0.250 0.333

2 0.250 0.083 0.250

3 0.306 0.125 0.306

4 0.292 0.097 0.292

5 0.301 0.104 0.301

To solve the system directly, row-reduce the augmented    1 3 1 0 1    − 0 1 4 1 1 → 0 1 3 1 0 3 1 3 giving the exact solution [x1 , x2 , x3 ] = 10 , 10 , 10 .

6 0.299 0.100 0.299

7 0.300 0.101 0.300

8 0.300 0.100 0.300

9 0.300 0.100 0.300

matrix: 0 0 1 0 0 1



3 10 1  10  3 10

6. Solve to get 1 x2 + 3 1 x2 = x1 + 3 1 x3 = x2 + 3 1 x4 = x3 + 3 x1 =

1 3 1 x3 3 1 1 x4 + 3 3 1 . 3

Using the initial vector [x1 , x2 , x3 , x4 ] = [0, 0, 0, 0], we get a sequence of approximations n x1 x2 x3 x4

0 0 0 0 0

1 0.333 0 0.333 0.333

2 3 4 0.333 0.407 0.420 0.222 0.259 0.321 0.444 0.556 0.580 0.444 0.481 0.519

5 6 7 8 9 10 11 12 0.440 0.444 0.450 0.452 0.453 0.454 0.454 0.454 0.333 0.351 0.355 0.360 0.361 0.363 0.363 0.363 0.613 0.620 0.630 0.632 0.634 0.635 0.636 0.636 0.527 0.538 0.540 0.543 0.544 0.545 0.545 0.545

To solve the system directly, row-reduce the augmented matrix:    3 −1 0 0 1 1 0 0 0   −1 3 −1 0 0 0 1 0 0  − →   0 −1 3 −1 1 0 0 1 0 0 0 −1 3 1 0 0 0 1 5 4 7 6 giving the exact solution [x1 , x2 , x3 , x4 ] = 11 , 11 , 11 , 11 .

5 11 4  11  7   11 6 11



7. Using the equations from Exercise 1, we get (to three decimal places, but keeping four during the calculations) n 0 1 2 3 4 x1 0 0.857 0.996 1.000 1.000 x2 0 0.971 0.999 1.000 1.000 The Gauss-Seidel method takes 3 iterations, while the Jacobi method takes 5.

2.5. ITERATIVE METHODS FOR SOLVING LINEAR SYSTEMS

115

8. Using the equations from Exercise 2, we get (to three decimal places, but keeping four during the calculations) n x1 x2

0 0 0

1 2.5 1.5

2 1.75 0.75

3 2.125 1.125

4 1.938 0.938

5 2.031 1.031

6 1.984 0.984

7 2.008 1.008

8 1.996 0.996

9 2.002 1.002

10 1.999 0.999

11 2.000 1.000

12 2.000 1.000

The Gauss-Seidel method takes 11 iterations, while the Jacobi method takes 24. 9. Using the equations from Exercise 3, we get (to three decimal places, but keeping four during the calculations) n 0 1 2 3 4 x1 0 0.222 0.261 0.262 0.262 x2 0 0.349 0.360 0.361 0.361 The Gauss-Seidel method takes 3 iterations, while the Jacobi method takes 5. 10. Using the equations from Exercise 4, we get (to three decimal places, but keeping four during the calculations) n 0 1 2 3 4 x1 0 0.850 1.011 1.000 1.000 x2 0 −1.215 −0.998 −1.000 −1.000 x3 0 2.006 2.001 2.000 2.000 The Gauss-Seidel method takes 3 iterations, while the Jacobi method takes 5. 11. Using the equations from Exercise 5, we get (to three decimal places, but keeping four during the calculations) 1 2 3 4 5 6 n 0 x1 0 0.333 0.278 0.296 0.299 0.300 0.300 x2 0 0.167 0.111 0.102 0.100 0.100 0.100 x3 0 0.278 0.296 0.299 0.300 0.300 0.300 The Gauss-Seidel method takes 5 iterations, while the Jacobi method takes 8. 12. Using the equations from Exercise calculations) n 0 1 x1 0 0.333 x2 0 0.111 x3 0 0.370 x4 0 0.457

6, we get (to three decimal places, but keeping four during the 2 0.370 0.247 0.568 0.523

3 0.416 0.328 0.617 0.539

4 0.433 0.353 0.631 0.544

5 0.451 0.361 0.635 0.545

6 0.454 0.363 0.636 0.545

7 0.454 0.363 0.636 0.545

The Gauss-Seidel method takes 6 iterations, while the Jacobi method takes 11. 1.

13. 0.8

0.6

0.4

0.2

0.2

0.4

0.6

0.8

1.

116 14.

CHAPTER 2. SYSTEMS OF LINEAR EQUATIONS 1.5

1.

0.5

1.

0.5

2.

1.5

2.5

15. Applying the Gauss-Seidel method to x1 − 2x2 = 3, 3x1 + 2x2 = 1 with an initial estimate of [0, 0] gives n 0 1 2 3 4 x1 0 3 −5 19 −53 x2 0 −4 8 −28 80 which evidently diverges. If, however, we reverse the two equations to get 3x1 + 2x2 = 1, x1 − 2x2 = 3, this is strictly diagonally dominant since |3| > |2| and |−2| > |1|. Applying the Gauss-Seidel method gives n x1 x2

0 0 0

1 2 0.333 1.222 −1.333 −0.889

3 0.926 −1.037

4 1.025 −0.988

5 0.992 −1.004

6 1.003 −0.999

7 8 9 0.999 1.000 1.000 −1.000 −1.000 −1.000

The exact solution is [x1 , x2 ] = [1, −1]. 16. Applying the Gauss-Seidel method to x1 − 4x2 + 2x3 = 2, 2x2 + 4x3 = 1, 6x1 − x2 − 2x3 = 1 with an initial estimate of [0, 0, 0] gives n x1 x2 x3

0 0 0 0

1 2 2 −6.5 0.5 −10 5.25 −15

3 −8 30.5 −39.75

4 203.5 80 570

which evidently diverges. If, however, we arrange the equations as 6x1 − x2 − 2x3 = 1 x1 − 4x2 + 2x3 = 2 2x2 + 4x3 = 1, we get a strictly diagonally dominant system. Applying the Gauss-Seidel method gives n x1 x2 x3

0 0 0 0

1 2 3 0.167 0.250 0.250 −0.458 −0.198 −0.263 0.479 0.349 0.382   The exact solution is [x1 , x2 , x3 ] = 14 , − 41 , 38 . 17.

75

50

25

-50

-40

-30

-20

10

-10

-25

20

4 0.250 −0.247 0.373

5 0.250 −0.251 0.375

6 7 0.250 0.250 −0.250 −0.250 0.375 0.375

2.5. ITERATIVE METHODS FOR SOLVING LINEAR SYSTEMS

117

18. Applying the Gauss-Seidel method to the equations −4x1 + 5x2 = 14 x1 − 3x2 = −7 gives n x1 x2

0 0 0

1 2 3 4 5 −3.5 −2.04 −1.43 −1.18 −1.08 1.17 1.65 1.86 1.94 1.97

6 7 8 −1.03 −1.01 −1.01 1.99 2.00 2.00

This gives an approximate solution (to 0.01) of x1 = −1.01, x2 = 2.00. The exact solution is [x1 , x2 ] = [−1, 2].

19. Applying the Gauss-Seidel method to the equations 5x1 − 2x2 + 3x3 = −8 x1 + 4x2 − 4x3 = 102 −2x1 − 2x2 + 4x3 = 90 gives n x1 x2 x3

0 0 0 0

1 2 3 4 5 6 −1.6 14.97 8.55 10.74 9.84 10.12 25.9 11.41 14.05 11.62 11.72 11.25 −10.35 −9.31 −11.2 −11.32 −11.72 −11.82 n x1 x2 x3

9 10.00 11.05 −11.97

10 10.00 11.03 −11.98

11 10.00 11.01 −11.99

7 9.99 11.19 −11.90

8 10.02 11.08 −11.95

12 13 14 10.00 10.00 10.00 11.01 11.00 11.00 −12.00 −12.00 −12.00

This gives an approximate solution (to 0.01) of x1 = 10.00, x2 = 11.00, x3 = −12.00. The exact solution is [x1 , x2 , x3 ] = [10, 11, −12].

20. Redoing the computation from Exercise 18, showing three decimal places, and computing to an accuracy of 0.001, gives n x1 x2

0 0 0

1 2 −3.5 −2.042 1.167 1.653 n x1 x2

3 −1.434 1.855 9 −1.002 1.999

4 −1.181 1.940 10 −1.001 2.000

5 −1.075 1.975 11 −1.000 2.000

6 −1.031 1.990

7 8 −1.013 −1.005 1.996 1.998

12 −1.000 2.000

This gives an approximate solution (to 0.001) of x1 = −1.000, x2 = 2.000. The exact solution is [x1 , x2 ] = [−1, 2].

118

CHAPTER 2. SYSTEMS OF LINEAR EQUATIONS

21. Redoing the computation from Exercise 19, showing three decimal places, and computing to an accuracy of 0.001, gives n x1 x2 x3

0 0 0 0

1 −1.6 25.9 −10.35

2 3 14.97 8.55 11.408 14.051 −9.311 −11.199

4 5 10.74 9.839 11.615 11.718 −11.322 −11.721

6 10.120 11.249 −11.816

7 9.989 11.187 −11.912

n x1 x2 x3

9 10.002 11.052 −11.973

10 10.005 11.026 −11.985

11 10.001 11.015 −11.992

12 10.001 11.008 −11.996

13 10.000 11.005 −11.998

n x1 x2 x3

14 10.000 11.002 −11.999

15 10.000 11.001 −11.999

16 10.000 11.001 −12.000

17 10.000 11.000 −12.000

18 10.000 11.000 −12.000

8 10.022 11.082 −11.948

This gives an approximate solution (to 0.001) of x1 = 10.000, x2 = 11.000, x3 = −12.000. The exact solution is [x1 , x2 , x3 ] = [10, 11, −12]. 22. With the interior points labeled as shown, we use the temperature-averaging property to get the following system: 1 1 (0 + 40 + 40 + t2 ) = 20 + t2 4 4 1 5 1 1 t2 = (0 + t1 + t3 + 5) = + t1 + t3 4 4 4 4 85 1 1 + t2 . t3 = (t2 + 40 + 40 + 5) = 4 4 4

t1 =

The Gauss-Seidel method gives the following with initial approximation [0, 0, 0]: n t1 t2 t3

0 0 0 0

1 20.000 6.250 22.812

2 21.562 12.344 24.336

3 23.086 13.105 24.526

4 23.276 13.201 24.550

5 23.300 13.213 24.553

6 23.303 13.214 24.554

7 23.304 13.214 24.554

8 23.304 13.214 24.554

This gives equilibrium temperatures (to 0.001) of t1 = 23.304, t2 = 13.214, and t3 = 24.554. 23. With the interior points labeled as shown, we use the temperature-averaging property to get the following system: 1 (t2 + t3 ) 4 1 t2 = (t1 + t4 ) 4 1 t3 = (t1 + t4 + 200) 4 1 t4 = (t2 + t3 + 200) 4

t1 =

The Gauss-Seidel method gives the following with initial approximation [0, 0, 0]: n t1 t2 t3 t4

0 0 0 0 0

1 0 0 50 62.5

2 12.5 18.75 68.75 71.875

3 21.875 23.438 73.438 74.219

4 5 6 7 8 9 10 24.219 24.805 24.951 24.988 24.997 24.999 25.000 24.609 24.902 24.976 24.994 24.998 25.000 25.000 74.609 74.902 74.976 74.994 74.998 75.000 75.000 74.805 74.951 74.988 74.997 74.998 75.000 75.000

This gives equilibrium temperatures (to 0.001) of t1 = t2 = 25 and t3 = t4 = 75.

2.5. ITERATIVE METHODS FOR SOLVING LINEAR SYSTEMS

119

24. With the interior points labeled as shown, we use the temperature-averaging property to get the following system: 1 (t2 + t3 ) 4 1 t2 = (t1 + t4 + 40) 4 1 t3 = (t1 + t4 + 80) 4 1 t4 = (t2 + t3 + 200) 4 t1 =

The Gauss-Seidel method gives the following with initial approximation [0, 0, 0]: n t1 t2 t3 t4

0 0 0 0 0

1 0 10 20 57.5

2 7.5 26.25 36.25 65.625

3 15.625 30.312 40.312 67.656

4 5 6 7 8 9 10 17.656 18.164 18.291 18.323 18.331 18.333 18.333 31.328 31.582 31.646 31.661 31.665 31.666 31.667 41.328 41.582 41.646 41.661 41.665 41.666 41.667 68.164 68.291 68.323 68.331 68.333 68.333 68.333

This gives equilibrium temperatures (to 0.001) of t1 = 18.333, t2 = 31.667, t3 = 41.667, and t4 = 68.333. 25. With the interior points labeled as shown, we use the temperature-averaging property to get the following system: t1 = t2 = t3 = t4 = t5 = t6 =

1 (t2 + 80) 4 1 (t1 + t3 + t4 ) 4 1 (t2 + t5 + 80) 4 1 (t2 + t5 + 5) 4 1 (t3 + t4 + t6 + 5) 4 1 (t5 + 85). 4

The Gauss-Seidel method gives the following with initial approximation [0, 0, 0]: n t1 t2 t3 t4 t5 t6 n t1 t2 t3 t4 t5 t6

0 0 0 0 0 0 0

1 2 20 21.25 5 11.25 21.25 24.609 2.5 5.859 7.188 14.629 23.047 24.097 7 23.809 15.282 28.058 9.308 16.963 25.491

8 23.821 15.297 28.065 9.315 16.968 25.492

3 22.812 13.32 26.987 8.237 16.283 25.321 9 23.824 15.301 28.067 9.317 16.969 25.492

4 23.33 14.639 27.730 8.980 16.758 25.439 10 23.825 15.302 28.068 9.318 16.970 25.492

5 23.66 15.093 27.963 9.213 16.904 25.476 11 23.826 15.303 28.068 9.318 16.970 25.492

6 23.773 15.237 28.035 9.285 16.949 25.487 12 23.826 15.303 28.068 9.318 16.970 25.492

This gives equilibrium temperatures (to 0.001) of t1 = 23.826, t2 = 15.303, t3 = 28.068, t4 = 9.318, t5 = 16.970, and t6 = 25.492.

120

CHAPTER 2. SYSTEMS OF LINEAR EQUATIONS

26. With the interior points labeled as shown, we use the temperature-averaging property to get the following system: t1 = t2 = t3 = t4 = t5 = t6 = t7 = t8 = n t1 t2 t3 t4 t5 t6 t7 t8 t9 t10 t11 t12 t13 t14 t15 t16 n t1 t2 t3 t4 t5 t6 t7 t8 t9 t10 t11 t12 t13 t14 t15 t16

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

1 (t2 + t5 ) 4 1 (t1 + t3 + t6 ) 4 1 (t2 + t4 + t7 + 20) 4 1 (t3 + t8 + 40) 4 1 (t1 + t6 + t9 ) 4 1 (t2 + t5 + t7 + t10 ) 4 1 (t3 + t6 + t8 + t11 ) 4 1 (t4 + t7 + t12 + 20) 4

1 2 0 0 0 1.25 5. 8.438 11.25 14.141 0 2.5 0 1.875 1.25 4.844 8.125 16.562 10. 16.875 2.5 8.984 0.938 17.598 27.266 49.575 22.5 28.281 16.25 26.641 29.297 52.095 64.141 75.417

10 7.274 13.794 23.179 24.998 16.522 26.029 34.83 37.142 34.088 40.284 53.83 69.006 40.452 48.051 71.733 85.185

11 7.579 14.197 23.506 25.162 16.924 26.559 35.259 37.357 34.415 40.714 54.178 69.18 40.617 48.266 71.907 85.272

3 0.938 2.812 10.449 16.753 4.922 5.391 12.5 24.707 20.547 17.544 32.928 58.263 31.797 35.359 60.926 79.797

12 7.78 14.461 23.721 25.269 17.189 26.906 35.54 37.497 34.63 40.995 54.406 69.294 40.724 48.406 72.021 85.329

t9 = t10 = t11 = t12 = t13 = t14 = t15 = t16 = 4 1.934 4.443 13.424 19.533 6.968 10.364 20.356 29.538 24.077 25.682 41.307 62.661 34.859 40.367 65.368 82.007

13 7.912 14.635 23.861 25.34 17.362 27.133 35.724 37.589 34.77 41.179 54.554 69.368 40.794 48.498 72.095 85.366

5 2.853 6.66 16.637 21.544 9.323 15.505 25.747 32.488 27.466 31.161 46.234 65.182 36.958 43.372 67.903 83.271

14 7.999 14.748 23.953 25.386 17.476 27.282 35.845 37.65 34.862 41.299 54.652 69.417 40.84 48.559 72.144 85.39

1 (t5 + t10 + t13 + 40) 4 1 (t6 + t9 + t11 + t14 ) 4 1 (t7 + t10 + t12 + t15 ) 4 1 (t8 + t11 + t16 + 100) 4 1 (t9 + t14 + 80) 4 1 (t10 + t13 + t15 + 40) 4 1 (t11 + t14 + t16 + 100) 4 1 (t12 + t15 + 200). 4

6 3.996 9.035 19.081 22.892 11.742 19.421 29.306 34.345 29.965 34.748 49.285 66.725 38.334 45.246 69.451 84.044

15 8.056 14.823 24.013 25.416 17.55 27.379 35.923 37.689 34.922 41.378 54.716 69.449 40.87 48.598 72.176 85.406

7 5.194 10.924 20.781 23.781 13.645 22.156 31.642 35.537 31.682 37.092 51.227 67.702 39.232 46.444 70.429 84.533

16 8.093 14.871 24.053 25.435 17.599 27.443 35.975 37.715 34.962 41.43 54.757 69.47 40.89 48.624 72.197 85.417

8 6.142 12.27 21.923 24.365 14.995 24. 33.172 36.31 32.83 38.625 52.482 68.331 39.818 47.218 71.058 84.847

17 8.118 14.903 24.078 25.448 17.631 27.485 36.009 37.732 34.988 41.463 54.785 69.483 40.903 48.641 72.21 85.423

9 6.816 13.185 22.68 24.748 15.911 25.223 34.174 36.813 33.589 39.628 53.298 68.74 40.202 47.722 71.467 85.052

18 8.133 14.924 24.095 25.457 17.651 27.512 36.031 37.743 35.004 41.485 54.803 69.492 40.911 48.652 72.219 85.428

19 8.144 14.938 24.106 25.462 17.665 27.53 36.045 37.75 35.015 41.5 54.814 69.498 40.917 48.659 72.225 85.431

2.5. ITERATIVE METHODS FOR SOLVING LINEAR SYSTEMS n t1 t2 t3 t4 t5 t6 t7 t8 t9 t10 t11 t12 t13 t14 t15 t16

20 8.151 14.947 24.114 25.466 17.674 27.541 36.055 37.755 35.023 41.509 54.822 69.502 40.92 48.664 72.229 85.433

21 8.155 14.953 24.118 25.468 17.68 27.549 36.061 37.758 35.027 41.516 54.827 69.504 40.923 48.667 72.232 85.434

22 8.158 14.956 24.121 25.47 17.684 27.554 36.065 37.76 35.03 41.52 54.83 69.506 40.924 48.669 72.233 85.435 n t1 t2 t3 t4 t5 t6 t7 t8 t9 t10 t11 t12 t13 t14 t15 t16

23 8.16 14.959 24.123 25.471 17.686 27.557 36.068 37.761 35.033 41.522 54.832 69.507 40.925 48.67 72.234 85.435

30 8.163 14.963 24.127 25.473 17.691 27.563 36.072 37.764 35.036 41.527 54.836 69.509 40.927 48.673 72.236 85.436

24 8.161 14.961 24.125 25.471 17.688 27.56 36.069 37.762 35.034 41.524 54.834 69.508 40.926 48.671 72.235 85.436

31 8.164 14.963 24.127 25.473 17.691 27.563 36.073 37.764 35.036 41.527 54.836 69.509 40.927 48.673 72.236 85.436

25 8.162 14.962 24.126 25.472 17.689 27.561 36.071 37.763 35.035 41.525 54.835 69.508 40.926 48.672 72.235 85.436

32 8.164 14.964 24.127 25.473 17.691 27.563 36.073 37.764 35.036 41.527 54.836 69.509 40.927 48.673 72.236 85.436

121 26 8.163 14.962 24.126 25.472 17.69 27.562 36.071 37.763 35.035 41.526 54.835 69.509 40.927 48.672 72.236 85.436

33 8.164 14.964 24.127 25.473 17.691 27.564 36.073 37.764 35.036 41.527 54.836 69.509 40.927 48.673 72.236 85.436

27 8.163 14.963 24.127 25.472 17.69 27.562 36.072 37.763 35.036 41.526 54.836 69.509 40.927 48.672 72.236 85.436

28 8.163 14.963 24.127 25.472 17.69 27.563 36.072 37.763 35.036 41.527 54.836 69.509 40.927 48.672 72.236 85.436

29 8.163 14.963 24.127 25.473 17.691 27.563 36.072 37.763 35.036 41.527 54.836 69.509 40.927 48.673 72.236 85.436

34 8.164 14.964 24.127 25.473 17.691 27.564 36.073 37.764 35.036 41.527 54.836 69.509 40.927 48.673 72.236 85.436

Column 34 gives the equilibrium temperature to an accuracy of 0.001. 27. (a) Let x1 correspond to the left end of the paper and x2 to the right end, and let n be the number of folds. Then the first six values of [x1 , x2 ] are n x1 x2

0 0 1

1 0 1 2

2

3

4

5

6

1 4 1 2

1 4 3 8

5 16 3 8

5 16 11 32

21 64 11 32

The graph below shows these points, together with the two lines determined in part (b): 1. 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0. 0.

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1.

122

CHAPTER 2. SYSTEMS OF LINEAR EQUATIONS (b) From the table, we see that x1 = 0 gives x2 = 12 in the next iteration, and x1 = 41 gives x2 = 38 in the next iteration. Similarly, x2 = 1 gives x1 = 0 in the next iteration, and x2 = 21 gives x1 = 14 in the next iteration. Thus x2 −

3 8 1 4 1 4 1 2

1 = 2

x1 − 0 =

− 12 1 1 (x1 − 0), or x2 = − x1 + 2 2 −0 −0 1 1 (x2 − 1), or x1 = − x2 + 2 2 −1

(c) Switching to decimal representations, we apply Gauss-Seidel to these equations after n = 6, 21 and x2 = 11 starting with the estimates x1 = 64 32 , to obtain an approximate point of convergence: n x1 x2

0 0 1

1 0 1 2

2

3

4

5

6

1 4 1 2

1 4 3 8

5 16 3 8

5 16 11 32

21 64 11 32

7 0.328 0.336

8 0.332 0.334

9 0.333 0.333

10 0.333 0.333

(d) Row-reduce the augmented matrix for this pair of equations, giving " # " # 1 1 12 1 0 13 2 . → − 1 21 12 0 1 13   The exact solution to the system of equations is [x1 , x2 ] = 13 , 13 . The ends of the piece of paper converge to 13 . 28. (a) Let x1 record the positions of the left endpoints and x2 the positions of the right endpoints at the end of each walks. Then the first six values of [x1 , x2 ] are 0 0

n x1 x2

1 2

1

2

3

4

5

6

1 4 1 2

1 4 5 8

5 16 5 8

5 16 21 32

21 64 21 32

21 64 85 128

The graph below shows these points, together with the two lines determined in part (b): 1. 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0. 0.

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1.

1 2

(b) From the table, we see that x1 = 0 gives x2 = in the next iteration, and x1 = 41 gives x2 = 58 in 5 the next iteration. Similarly, x2 = 12 gives x1 = 14 in the next iteration, and x2 = 85 gives x1 = 16 in the next iteration. Thus x2 −

1 = 2

x1 −

1 = 4

5 1 1 1 8 − 2 (x1 − 0), or x2 = x1 + 1 2 2 4 −0  5 1  1 1 8 − 2 x − , or x1 = x2 . 2 5 1 2 2 − 16 4

(c) Switching to decimal representations, we apply Gauss-Seidel to these equations after n = 6, 21 85 starting with the estimates x1 = 64 and x2 = 128 , to obtain an approximate point of convergence: n x1 x2

0 0 1 2

1

2

3

4

5

6

1 4 1 2

1 4 5 8

5 16 5 8

5 16 21 32

21 64 21 32

21 64 85 128

7 8 9 0.332 0.333 0.333 0.666 0.667 0.667

Chapter Review

123

(d) Row-reduce the augmented matrix for this pair of equations, giving " # " # 1 − 21 1 1 0 13 2 . → − 1 − 12 0 0 1 23   The exact solution to the system of equations is [x1 , x2 ] = 13 , 23 . So the ant eventually oscillates between 13 and 32 .

Chapter Review 1. (a) False. Some linear systems are inconsistent, and have no solutions. See Section 2.1. One example is x + y = 0 and x + y = 1. (b) True. See Section 2.2. Since in a homogeneous system, the lines defined by the equations all pass through the origin, the system has at least one solution; namely, the solution in which all variables are zero. (c) False. See Theorem 2.6. This is guaranteed only if the system is homogeneous, but not if it is not (for example, x + y = 0, x + y = 1 is inconsistent in R3 ). (d) False. See the discussion of homogeneous systems in Section 2.2. It can have either a unique solution, infinitely many solutions, or no solutions. (e) True. See Theorem 2.4 in Section 2.3. (f ) False. If u and v are linearly dependent (i.e., one is a multiple of the other), we get a line, not a plane. And if u and v are both zero, we get only the origin. If the two vectors are linearly independent, then their span is indeed a plane through the origin. (g) True. If u and v are parallel, then v = cu, so that −cu + v = 0, so that u and v are linearly dependent. (h) True. If they can be drawn that way, then the sum of the vectors is 0, since tracing the vectors in turn returns you to the starting point. (i) False. For example, u = [1, 0, 0], v = [0, 1, 0] and w = [1, 1, 0] are linearly dependent since u + v − w = 0, yet none of these is a multiple of another. (j) True. See Theorem 2.8 in Section 2.3. m vectors in Rn are linearly dependent if m > n. 2. The rank  1 3  3 0

is the number of nonzero rows in row echelon form:     1 1 −2 0 3 2 −2 0 3 2 0  0 −5 −1 6 2 −1 1 3 4 R ↔R  R −3R1 +2R2   2 4  −−−→ 0 4 2 −3 2 −−3−−−− 4 2 −3 2 −−−−−→ 3 R −3R 1 +R2 3 −1 1 3 4 −−4−−−− −5 −1 6 2 −−→ 0

 −2 0 3 2 −5 −1 6 2  0 0 0 0 0 0 0 0

This row echelon form of A has two nonzero rows, so that rank(A) = 2. 3. Construct  1 1 1 3 2 1

the augmented matrix of the system and row-reduce it:       −2 4 1 1 −2 4 1 1 −2 4 1 R2 −R1 0 2 0 −1 7 −−− −−→ 0 2 1 3 1 3 −2R3 R3 −R2 R3 −2R1 −5 7 − 2 2 − −−−−→ 0 −−−−→ 0 −1 −1 −1 −−−→ 0 2    R1 −R2  R +2R3  −−1−−−→ 1 1 0 2 1 1 1 0 2 − −−−−→ 1 0 R2 +R3 0 2 0  −−  0 1 2 R2 0 4 1 0 2 −→ −−−−−→ 0 0 1 −1 0 0 1 −1 0 0

Thus the solution is

    x1 0 x2  =  2 . x3 −1

1 2 0 0 0 1

 −2 4 1 3 1 −1  0 2 . −1

124

CHAPTER 2. SYSTEMS OF LINEAR EQUATIONS

4. Construct the augmented matrix and row-reduce it:  3 8 1 2 1 3

−18 1 −4 0 −7 1

 R2  35 −−→ 1 2 R3  11− −→ 1 3 R1 10 − −→ 3 8  1 0 R −2R2 0 −−3−−−→  1 R2 −R3  −−− −−→ 0 0

−4 0 −7 1 −18 1 2 1 0 2 1 0

−4 −3 0 −4 −3 0

  1 2 11 R2 −R1 −−→ 0 1 10 −−− R3 −3R1 35 − −−−−→ 0 2   0 11 1 0 1 −1 −R3 −1 4 − −−→ 0  R1 −2R2  0 11 −−−−−→ 1 0 3 0 1 −4 0

−4 0 −3 1 −6 1 2 1 0

 11 −1 2

 11 −1 −4  2 0 5 3 −3 0 0 1 −4

−4 0 −3 1 0 1

0 1 0

Then y = t is a free variable, and the general solution is w = 5 − 2t, x = 3 + 3t, y = t, z = −4:         −2 5 5 − 2t w  3  x  3 + 3t  3       =  y   t  =  0 + t  1 . 0 −4 −4 z 5. Construct the augmented matrix and row-reduce it over Z7 :   4R1     R1 +4R2  2 3 4 − 1 5 2 − −→ 1 5 2 −−−−→ 1 R2 +6R1 1 2 3 1 2 3 − 0 −−−−→ 0 4 1     x 6 Thus the solution is = . y 2 6. Construct the augmented matrix and row-reduce over Z5 :    3 2 3 2 1 R2 +3R1 1 4 2 − −−−−→ 0 0

0 4

 6 . 2

 1 0

So the solutions are all of the form 3x + 2y = 1. Setting y = 0, 1, 2, 3, 4 in turn, we get     x 2 y = 0 ⇒ 3x = 1 ⇒ x = 2 ⇒ = y 0     x 3 y = 1 ⇒ 3x + 2 = 1 ⇒ 3x = 4 ⇒ x = 3 ⇒ = y 1     x 4 y = 2 ⇒ 3x + 4 = 1 ⇒ 3x = 2 ⇒ x = 4 ⇒ = y 2     x 0 y = 3 ⇒ 3x + 1 = 1 ⇒ x = 0 ⇒ = y 3     x 1 y = 4 ⇒ 3x + 3 = 1 ⇒ 3x = 3 ⇒ x = 1 ⇒ = . y 4 7. Construct the augmented matrix and row-reduce it:      k 2 1 R2 ↔R1 1 2k 1 1 R2 −kR1 1 2k 1 −−−−−→ k 2 1 − −−−−→ 0

2k 2 − 2k 2

1 1−k



The system is inconsistent when 2 − 2k 2 = −2(k + 1)(k − 1) = 0 but 1 − k 6= 0. The first of these is zero for k = −1 and for k = 1; when k = −1 we have 1 − k = 2 6= 0. So the system is inconsistent when k = −1.

Chapter Review

125

8. Form the augmented matrix of the system and row reduce it (see Example 2.14 in Section 2.2):  1 5

2 6

3 7

  4 1 R2 −5R1 8 − 0 −−−−→

2 3 −4 −8

  4 1 2 3 1 − 4 R2 −12 − 0 1 2 −−−→

 R1 −2R2  4 − −−−−→ 1 0 3 0 1

−1 2

 −2 3

Thus z = t is a free variable, and x = −2 + t, y = 3 − 2t, so the parametric equation of the line is         x −2 + t −2 1 y  =  3 − 2t  =  3 + t −2 . z t 0 1 9. As in Example 2.15 in Section 2.2, we wish to find x = [x, y, z] that satisfies both equations simultaneously; that is, x = p + su = q + tv. This means su − tv = q − p. Substituting the given vectors into this equation gives           1 −1 5 1 4 s + t = 4 s −1 − t  1 = −2 − 2 = −4 ⇒ −s − t = −4 2 1 −4 3 −7 2s − t = −7 Form the augmented matrix of this system and row-reduce it to determine s and t: 

1 −1 2

1 −1 −1

  4 1 R2 ↔R3  −4 − 2 −−−−→ −7 −1

1 −1 −1

  4 1 R −2R1 0 7 −−2−−−→ R3 +R1 −4 − −−−−→ 0

 4 −15 0  1 1 − 3 R2  −−− −→ 0 0

1 −3 0

1 1 0

 R1 −R2  4 − −−−−→ 1 0 0 1 5 0 0 0

 −1 5 0

Thus s = −1 and t = 5, so the point of intersection is         x 1 1 0 y  = 2 − −1 = 3 . z 3 2 1 (We could equally well have substituted t = 5 into the equation for the other line; check that this gives the same answer.) 10. As in Example 2.18 in Section 2.3, we want to find scalars x and y such that       1 1 3 x + y = 3 x + 2y = 5 x 1 + y  2 =  5 ⇒ 3 −2 −1 3x − 2y = −1 Form the augmented matrix of this system and row-reduce it to determine x and y:  1 1 3

1 2 −2

  3 1 R2 −R1 5 −−− −−→ 0 R3 −3R1 −1 − −−−−→ 0

1 1 −5

  3 1 0 R1 −R2 2 −−− −−→ 0 1 R3 +5R3 −10 − −−−−→ 0 0

 1 2 0

So this system has a solution, and thus so the vector is in the span of the other two vectors:       3 1 1  5 = 1 + 2  2  . −1 3 −2

126

CHAPTER 2. SYSTEMS OF LINEAR EQUATIONS

11. As in Example 2.21 in Section 2.3, the equation of the plane we are looking for is       x 1 3 s + 3t = x y  = s 1 + t 2 ⇒ s + 2t = y z 1 1 s + t = z Form the augmented matrix and row-reduce it to find conditions for x, y, and z:  1 3 1 2 1 1

  1 x R2 −R1 y  −−− −−→ 0 R3 −R2 z − −−−−→ 0

3 −1 −2

  1 x −R2  y − x − 0 −−→ 0 z−x

3 1 −2

  x 1 3 0 1 x − y R3 +2R2 z−x − −−−−→ 0 0

 x x−y  x − 2y + z

We know the system is consistent, since the two given vectors span some plane. So we must have x − 2y + z = 0, and this is the general equation of the plane we were seeking. (An alternative method would be to take the cross product of the two given vectors, giving [−1, 2, −1], so that the plane is of the form −x + 2y − z = d; plugging in x = y = z = 0 gives −x + 2y − z = 0, or x − 2y + z = 0.) 12. As in Example 2.23 in Section 2.3, we want to find scalars c1 , c2 , and c3 so that         2 1 3 0 c1  1 + c2 −1 + c3  9 = 0 . −3 −2 −2 0 Form the augmented matrix of the resulting system and row-reduce it:     1 0 4 0 2 1 3 0  1 −1 9 0 → − 0 1 −5 0 −3 −2 −2 0 0 0 0 0 Since c1 = −4c3 , c2 = 5c3 is a solution, the vectors are linearly dependent. For example, setting c3 = −1 gives         2 1 3 0 4  1 − 5 −1 −  9 = 0 . −3 −2 −2 0 13. By part (a) of Exercise 21 in Section 2.3, it is sufficient to prove that ei is in span(u, v, w) ⊂ R3 , since then we have R3 = span(e1 , e2 , e3 ) ⊂ span(u, v, w) ⊂ R3 . (a) Begin with e1 . We want to find scalars x, y, and z such that         1 1 0 1 x 1 + 0 + 1 = e1 = 0 . 0 1 1 0 Form the augmented matrix and row    1 1 1 0 1    − 0 1 0 1 0 → 0 1 1 0 0 Similarly, for e2 ,  1 1 0  1 0 1 0 1 1

  0 1   1 → − 0 0 0

reduce: 

0 1 0

0 0 1

1 2 1  2  − 21

0 1 0

0 0 1

1 2  − 21  1 2



e1 =

1 1 1 u + v − w. 2 2 2



e1 =

1 1 1 u − v + w. 2 2 2



Chapter Review

127

Similarly, for e3 ,  1 1 0  1 0 1 0 1 1

  1 0 0 0   0 → − 0 1 0 1 0 0 1

− 21



1 2 1 2

 



1 1 1 e1 = − u + v + w. 2 2 2

Thus R3 = span(u, v, w). (b) Since w = u + v, these vectors are not linearly independent. But any set of vectors that spans R3 must contain 3 linearly independent vectors, so span(u, v, w) 6= R3 . 14. All three statements are true, so (d) is the correct answer. If the vectors are linearly independent, and b is any vector in R3 , then the augmented matrix [A | b], when reduced to row echelon form, cannot have a zero row (since otherwise, the portion of the matrix corresponding to A would have a zero row, so the three vectors would be linearly dependent). Thus [A | b] has a unique solution, which can be read off from the row-reduced matrix. This proves statement (c). Next, since there is not a zero row, continuing and putting the system in reduced row-echelon form results in a matrix with 3 leading ones, so it is the identity matrix. Hence the reduced row echelon form of A is I3 , and thus A has rank 3. 15. Since A is a 3 × 3 matrix, rank(A) ≤ 3. Since the vectors are linearly dependent, rank(A) cannot be 3; since they are not all zero, rank(A) cannot be zero. Thus rank(A) = 1 or rank(A) = 2. 16. By Theorem 2.8 in Section 5.3, the maximum rank of a 5 × 3 matrix is 3: since the rows are vectors in R3 , any set of four or more of them are linearly dependent. Thus when we row reduce the matrix, we must introduce at least two zero rows. The minimum rank is zero, achieved when all entries are zero. 17. Suppose that c1 (u + v) + c2 (u − v) = 0. Rearranging gives (c1 + c2 )u + (c1 − c2 )v = 0. Since u and v are linearly independent, it follows that c1 + c2 = 0 and c1 − c2 = 0. Adding the two equations gives 2c1 = 0, so that c1 = 0; then c2 = 0 as well. It follows that u + v and u − v are linearly independent. 18. Using Exercise 21 in Section 2.3, we show first that u and v are linear combinations of u and u + v. But this is clear: u = u, and v = (u + v) − u. This shows that span(u, v) ⊆ span(u, u + v). Next, we show that u and u + v are linear combinations of u and v. But this is also clear. Thus span(u, u + v) ⊆ span(u, v). Hence the two spans are equal. 19. Note that rank(A) ≤ rank([A | b]), since any zero rows in an echelon form of [A | b] are also zero rows in a row echelon form of A, so that a row echelon form of A has at least as many zero rows as a row echelon form of [A | b]. If rank(A) < rank([A | b]), then a row echelon form of [A | b] will contain at least one row that is zero in the columns corresponding to A but nonzero in the last column on that row, so that the system will be inconsistent. Hence the system is consistent only if rank(A) = rank([A | b]). 20. Row-reduce both A and B:  1  2 −1  1 1 0

1 3 4 0 1 1

  1 1 R −2R1 0 −1−−2−−−→ R3 +R1 1 − −−−−→ 0   −1 1 R2 −R1  1− −−−−→ 0 3 0

1 1 5 0 1 1

   1 1 1 1 0 1 −3 −3 R3 −5R2 2 −−−−−→ 0 0 17    −1 1 0 −1 0 1 2 2 . R3 −R2 3 − 0 0 1 −−−−→

Since in row echelon form each of these matrices has no zero rows, it follows that both matrices have rank 3, so that their reduced row echelon forms are both I3 . Thus the matrices are row-equivalent. The sequence of row operations required to derive B from A is the sequence of row operations required to reduce A to I3 , followed by the sequence of row operations, in reverse, required to reduce B to I3 .

Chapter 3

Matrices 3.1

Matrix Operations

1. Since A and D have the same shape, this operation makes sense.            3 0 0 −3 3 0 0 −6 3+0 0−6 3 A + 2D = +2 = + = = −1 5 −2 1 −1 5 −4 2 −1 − 4 5 + 2 −5

 −6 7

2. Since D and A have the same shape, this operation makes sense.            0 −3 3 0 0 −9 6 0 0−6 −9 − 0 −6 3D − 2A = 3 −2 = − = = −2 1 −1 5 −6 3 −2 10 −6 − (−2) 3 − 10 −4

 −9 −7

3. Since B is a 2 × 3 matrix and C is a 3 × 2 matrix, they do not have the same shape, so they cannot be subtracted. 4. Since both C and B T are 3 × 2 matrices, this     T 1 2 1 4 −2 1 C − B T = 3 4 − = 3 0 2 3 5 6 5

operation makes sense.     2 4 0 1−4 4 − −2 2 = 3 − (−2) 6 1 3 5−1

  2−0 −3 4 − 2 =  5 6−3 4

5. Since A has two columns and B has two rows, the product makes sense, and       3 0 4 −2 1 3·4+0·0 3 · (−2) + 0 · 2 3·1+0·3 12 −6 AB = = = −1 5 0 2 3 −1 · 4 + 5 · 0 −1 · (−2) + 5 · 2 −1 · 1 + 5 · 3 −4 12

 2 2 3

 3 . 14

6. Since the number of columns of B (3) does not equal the number of rows of D (2), this matrix product does not make sense. 7. Since B has three columns and C has three rows, the multiplication BC makes sense, and results in a 2 × 2 matrix (since B has 2 rows and C has 2 columns). Since D is also a 2 × 2 matrix, they can be subtracted, so this formula makes sense. First,     1 2     4 −2 1  4·1−2·3+1·5 4·2−2·4+1·6 3 6  3 4 = BC = = . 0 2 3 0·1+2·3+3·5 0·2+2·4+3·6 21 26 5 6 Then

 D + BC =

0 −2

  −3 3 + 1 21

  6 0+3 = 26 −2 + 21 129

  −3 + 6 3 = 1 + 26 19

 3 . 27

130

CHAPTER 3. MATRICES

8. Since B has three columns and B T has three rows, the product makes sense, and   T 4 −2 1 4 −2 1 BB = 0 2 3 0 2 3     4 0 4 −2 1  −2 2 = 0 2 3 1 3     4 · 4 − 2 · (−2) + 1 · 1 4 · 0 − 2 · 2 + 1 · 3 21 −1 = = 0 · 4 + 2 · (−2) + 3 · 1 0 · 0 + 2 · 2 + 3 · 3 −1 13 T

9. Since A has two columns and F has two rows, AF makes sense; since A has two rows and F has one column, the result is a 2 × 1 matrix. So this matrix has two rows, and E has two columns, so we can multiply by E on the left. Thus this product makes sense. Then 

      3 0 −1 3 · (−1) + 0 · 2 −3 AF = = = and then −1 5 2 −1 · (−1) + 5 · 2 11     −3     E(AF ) = 4 2 = 4 · (−3) + 2 · 11 = 10 . 11 10. F has shape 2 × 1. D has shape 2 × 2, so DF has shape 2 × 1. Since the number of columns (1) of F does not equal the number of rows (2) of DF , the product does not make sense. 11. Since F has one column and E has one row, the product makes sense, and FE =

  −1  4 2

  −1 · 4 2 = 2·4

  −1 · 2 −4 = 2·2 8

 −2 . 4

12. Since E has two columns and F has one row, the product makes sense, and  EF = 4

   −1     2 = 4 · (−1) + 2 · 2 = 0 . 2

(Note that F E from Exercise 11 and EF are not the same matrix. In fact, they don’t even have the same shape! In general, matrix multiplication is not commutative.) 13. Since B has two rows, B T has two columns; since C has two columns, C T has two rows. Thus B T C T makes sense, and the result is a 3 × 3 matrix (since B T has three rows and C T has three columns). Next, CB is defined (3 × 2 multiplied by 2 × 3) and also yields a 3 × 3 matrix, so its transpose is also a 3 × 3 matrix. Thus B T C T − (BC)T is defined. 

      4 0  4·1+0·2 4·3+0·4 4·5+0·6 4 12 20 1 3 5 B T C T = −2 2 = −2 · 1 + 2 · 2 −2 · 3 + 2 · 4 −2 · 5 + 2 · 6 = 2 2 2  2 4 6 1 3 1·1+3·2 1·3+3·4 1·5+3·6 7 15 23         1 2 1 · 4 + 2 · 0 1 · (−2) + 2 · 2 1 · 1 + 2 · 3 4 2 7 4 −2 1 CB = 3 4 = 3 · 4 + 4 · 0 3 · (−2) + 4 · 2 3 · 1 + 4 · 3 = 12 2 15 0 2 3 5 6 5 · 4 + 6 · 0 5 · (−2) + 6 · 2 5 · 1 + 6 · 3 20 2 23 T    4 12 20 4 2 7 (CB)T = 12 2 15 = 2 2 2  . 20 2 23 7 15 23 Thus B T C T = (CB)T , so their difference is the zero matrix.

3.1. MATRIX OPERATIONS

131

14. Since A and D are both 2 × 2 matrices, the products in both orders are defined, and each results in a 2 × 2 matrix, so their difference is defined as well.        0 −3 3 0 0 · 3 − 3 · (−1) 0·0−3·5 3 −15 DA = = = −2 1 −1 5 −2 · 3 + 1 · (−1) −2 · 0 + 1 · 5 −7 5        3 0 0 −3 3 · 0 + 0 · (−2) 3 · (−3) + 0 · 1 0 −9 AD = = = −1 5 −2 1 −1 · 0 + 5 · (−2) −1 · (−3) + 5 · 1 −10 8         3 −15 0 −9 3−0 −15 − (−9) 3 −6 DA − AD = − = = . −7 5 −10 8 −7 − (−10) 5−8 3 −3 15. Since A is 2 × 2, A2 is defined and   3 0 3 2 A = −1 5 −1   3 0 9 3 2 A = AA = −1 5 −8

is also a 2 × 2 matrix, so we can multiply by A again to get A3 .      0 3 · 3 + 0 · (−1) 3·0+0·5 9 0 = = 5 −1 · 3 + 5 · (−1) −1 · 0 + 5 · 5 −8 25      0 3 · 9 + 0 · (−8) 3 · 0 + 0 · 25 27 0 = = . 25 −1 · 9 + 5 · (−8) −1 · 0 + 5 · 25 −49 125

16. Since D is a 2 × 2 matrix, I2 − A makes sense and is also a 2 × 2 matrix. Since it has the same number of rows as columns, squaring it (which is multiplying it by itself) also makes sense.       1 0 0 −3 1 3 . I2 − D = − = 0 1 −2 1 2 0 Thus  1 (I2 − D) = 2 2

 a 17. Suppose that A = c

2  3 1·1+3·2 = 0 2·1+0·2

  1·3+3·0 7 = 2·3+0·0 2

3 6



 b . Then if A2 = 0, d A2 =

 a c

 b a d c

  2 b a + bc = d ac + cd

  ab + bd 0 = bc + d2 0

 0 0

If c = 0, so that the upper right-hand entry is zero, then we must also have a = 0 and d = 0 so that   0 0 the diagonal entries are zero. So is one solution. Alternatively, if c 6= 0, then ac = −cd and 1 0 thus a = −d. This will make the lower left entry zero as well, and the two diagonal entries will both be equal to a2 + bc. So any values of a, b, and c 6= 0 with a2 = −bc will work. For example, a = b = 1, c = −1, d = −1:        1 1 1 1 1−1 1−1 0 0 = = . −1 −1 −1 −1 −1 + 1 −1 + 1 0 0 18. Note that the first column of A is twice its second column. Thus       1 0 0 0 0 0 A = =A . −2 0 0 0 0 0   1.50 1.00 2.00 , where the first row 1.75 1.50 1.00 gives the costs of shipping each product by truck, and the second row the costs when shipping by train. Then BA gives the costs of shipping all the products to each of the warehouses by truck or train:     200 75   1.50 1.00 2.00  650.00 462.50  150 100 = BA = . 1.75 1.50 1.00 675.00 406.25 100 125

19. The cost of shipping one unit of each product is given by B =

132

CHAPTER 3. MATRICES

For example, the upper left entry in the product is the sum of 1.50 · 200, the cost of shipping doohickies to warehouse 1 by truck, 1.00 · 150, the cost of shipping gizmos to warehouse 1 by truck, and 2.00 · 100, the cost of shipping widgets to warehouse 1 by truck. So 650.00 is the cost of shipping all of the items to warehouse 1 by truck. Comparing, we see that it is cheaper to ship items to warehouse 1 by truck ($650.00 vs $675.00), but to warehouse 2 by train ($406.25 vs $462.50).   0.75 0.75 0.75 20. The cost of distributing one unit of the product is given by C = , where the entry 1.00 1.00 1.00 in row i, column j is the cost of shipping one unit of product j from warehouse i. Then the cost of distribution is     200 75   0.75 0.75 0.75  337.50 225.00 150 100 = CA = 1.00 1.00 1.00 450.00 300.00 100 125 In this product, the entry in row i, column i represents the total cost of shipping all the products from warehouse i. The off-diagonal entries do not represent anything in particular, since they are a result of multiplying shipping costs from one warehouse with products from the other warehouse.     x1   1 −2 3   0 x2 = 21. , so that Ax = b, where 2 1 −5 4 x3       x1 1 −2 3 0   . A= , x = x2 , b = 2 1 −5 4 x3  −1 22.  1 0

    1 0 2 x1 −1 0 x2  = −2, so that Ax = b, where −1 1 1 x3     −1 0 2 x1 A =  1 −1 0 , x = x2  , 0 1 1 x3



 1 b = −2 . −1

23. The column vectors of B are 

 2 b1 =  1 , −1



 3 b2 = −1 , 6

  0 b3 = 1 , 4

so the matrix-column representation is 

       1 0 −2 4 Ab1 = 2 −3 + 1 1 − 1  1 = −6 2 0 −1 5         1 0 −2 −9 Ab2 = 3 −3 − 1 1 + 6  1 = −4 2 0 −1 0         1 0 −2 −8 Ab3 = 0 −3 + 1 1 + 4  1 =  5 . 2 0 −1 −4

Thus



4 AB = −6 5

 −9 −8 −4 5 . 0 −4

3.1. MATRIX OPERATIONS

133

24. The row vectors of A are  A1 = 1

0

 −2 ,

 A2 = −3

1

 1 ,

 A3 = 2

0

 −1 ,

so the row-matrix representation is         A1 B = 1 2 3 0 + 0 1 −1 1 − 2 −1 6 4 = 4 −9 −8         A2 B = −3 2 3 0 + 1 1 −1 1 + 1 −1 6 4 = −6 −4 5         A3 B = 2 2 3 0 + 0 1 −1 1 − 1 −1 6 4 = 5 0 −4 . Thus



4 AB = −6 5

 −9 −8 −4 5 . 0 −4

25. The outer product expansion is 

 1  a1 B1 + a2 B2 + a3 B3 = −3 2 2  2 3 = −6 −9 4 6  4 −9 = −6 −4 5 0

    0  −2    3 0 + 1 1 −1 1 +  1 −1 6 0 −1      0 0 0 0 2 −12 −8 0 + 1 −1 1 + −1 6 4 0 0 0 0 1 −6 −4  −8 5 . −4

26. We have 

       2 3 0 −7 Ba1 = 1  1 − 3 −1 + 2 1 =  6 −1 6 4 −11         2 3 0 3 Ba2 = 0  1 + 1 −1 + 0 1 = −1 −1 6 4 6         2 3 0 −1 Ba3 = −2  1 + 1 −1 − 1 1 = −4 . −1 6 4 4 Thus



 −7 3 −1 BA =  6 −1 −4 . −11 6 4

27. We have         B1 A = 2 1 0 −2 + 3 −3 1 1 + 0 2 0 −1 = −7 3 −1         B2 A = 1 1 0 −2 − 1 −3 1 1 + 1 2 0 −1 = 6 −1 −4         B3 A = −1 1 0 −2 + 6 −3 1 1 + 4 2 0 −1 = −11 6 4 . Thus



 −7 3 −1 BA =  6 −1 −4 . −11 6 4

 4

134

CHAPTER 3. MATRICES

28. The outer product expansion is 

     2  3  0    b1 A1 + b2 A2 + b3 A3 =  1 1 0 −2 + −1 −3 1 1 + 1 2 0 −1 6 4       2 0 −4 −9 3 3 0 0 0 =  1 0 −2 +  3 −1 −1 + 2 0 −1 −1 0 2 −18 6 6 8 0 −4   −7 3 −1 =  6 −1 −4 . −11 6 4

 −1

  29. Assume that the columns of B = b1 b2 . . . bn are linearly dependent. Then there exists a solution to x1 b1 + x2 b2 + · · · + xn bn = 0 with at least one xi 6= 0. Now, we have     AB = A b1 b2 . . . bn = Ab1 Ab2 . . . Abn , so that A(x1 b1 + x2 b2 + · · · + xn bn ) = x1 (Ab1 ) + x2 (Ab2 ) + · · · + xn (Abn ) = 0, so that the columns of AB are linearly dependent. 30. Let



 a1  a2    A =  . ,  .. 



 a1 B  a2 B    AB =  .  .  .. 

so that

an

an B

Since the rows of A are linearly dependent, there are xi , not all zero, such that x1 a1 +x2 a2 +· · ·+xn an = 0. But then (x1 a1 + x2 a2 + · · · + xn an )B = x1 (a1 B) + x2 (a2 B) + · · · + xn (an B) = 0, so that the rows of AB are linearly dependent. 31. We have the block structure   A11 A12 B11 AB = A21 A22 B21

  B12 A11 B11 + A12 B21 = B22 A21 B11 + A22 B21

 A11 B12 + A12 B22 . A21 B12 + A22 B22

Now, the blocks A12 , A21 , B12 , and B21 are all zero blocks, so any products involving those block are zero. Thus AB reduces to   A11 B11 0 AB = . 0 A22 B22 But     1 −1 2 3 1 · 2 − 1 · (−1) = 0 1 −1 1 0 · 2 + 1 · (−1)     1     = 2 3 = 2·1+3·1 = 5 . 1

A11 B11 = A22 B22

  1·3−1·1 3 = 0·3+1·1 −1

 2 1

So finally,  A11 B11 AB = 0 32. We have  AB = A1

  B11 A2 B21

0 A22 B22





3 = −1 0

  B12 = A1 B11 + A2 B21 B22

2 1 0

 0 0 . 5  A1 B12 + A2 B22 .

But B11 is the zero matrix, and both A2 and B12 are the identity matrix I2 , so this expression simplifies to       1 7 7 AB = I2 B21 A1 I2 + I2 B22 = B21 A1 + B22 = . −2 7 7

3.1. MATRIX OPERATIONS

135

33. We have the block structure   A11 A12 B11 AB = A21 A22 B21

  B12 A11 B11 + A12 B21 = B22 A21 B11 + A22 B21

 A11 B12 + A12 B22 . A21 B12 + A22 B22

Now, B21 is the zero matrix, so products involving that block are zero. Further, A12 , A21 , and B11 are all the identity matrix I2 . So this product simplifies to     A11 I2 A11 B12 + I2 B22 A11 A11 B12 + B22 AB = = . I2 I2 I2 B12 + A22 B22 I2 B12 + A22 B22 The remaining products are

B12 + A22 B22

 1 3

        0 1 2 1 0 −1 2 0 + B22 = + = 1 0 4 3 1 0 5 3         −1 1 0 −1 0 1 1 1 1 = B12 + = + = 1 −1 1 0 1 0 −1 −1 0

A11 B12 + B22 =

2 4

Thus

 1 3 AB =  1 0

34. We have the block structure   A11 A12 B11 AB = A21 A22 B21

2 4 0 1

2 5 1 0

 2 . −1

 0 3 . 2 −1

  B12 A11 B11 + A12 B21 = B22 A21 B11 + A22 B21

 A11 B12 + A12 B22 . A21 B12 + A22 B22

Now, A11 is the identity matrix I3 , and A21 is the zero matrix. Further, B22 is a 1 × 1 matrix, as is A22 , so they both behave just like scalar multiplication. Thus this product reduces to     I3 B11 + A12 B21 I3 B12 − A12 B11 + A12 B21 B12 − A12 AB = = . 4B21 4 · (−1) 4B21 −4 Compute the matrices in the first row:    1  1  B11 + A12 B21 = B11 + 2 1 1 1 = 0 3 0       1 1 0 B12 − A12 = 1 − 2 = −1 . 1 3 −2 So

 2 2 AB =  3 4

3 3 3 4

4 6 4 4

2 1 0

  3 1 4 + 2 1 3

1 2 3

  1 2 2 = 2 3 3

 0 −1 . −2 −4

35. (a) Computing the powers of A as required, we get       −1 1 −1 0 0 −1 2 3 4 A = , A = , A = , −1 0 0 −1 1 −1       1 −1 1 0 0 1 A5 = , A6 = = I2 , A7 = = A. 1 0 0 1 −1 1 Note that A6 = I2 and A7 = A.

3 3 3

 4 6 4

136

CHAPTER 3. MATRICES (b) Since A6 = I2 , we know that A6k = A6 2015 = 2010 + 5 = 6 · 370 + 5, we have A

36. We have " B2 =

2015

=A

#

0

−1

1

0

B3 = B2B =

,

6·370+5

k

=A

" − √12

− √12

√1 2

− √12

= I2k = I2 for any positive integer value of k. Since

6·370

 1 A = I2 A = A = 1 5

5

#

5

" ,

B 4 = (B 2 )2 =

−1 0

 −1 0

# 0 −1

" ,

B8 =

1

# 0

0

1

= I2

Thus any power of B whose exponent is divisible by 8 is equal to I2 . Now, 2015 = 8 · 251 + 7, so that # " # " # " √1 √1 − √12 − √12 0 −1 2015 8·251 7 7 4 3 2 2 = B =B ·B =B =B ·B = · √1 − √12 √12 − √12 1 0 2   1 n 37. Using induction, we show that A = for n ≥ 1. The base case, n = 1, is just the definition of 0 1 1 A = A. Now assume that this equation holds for k. Then n

A

k+1

1

=A A

inductive k hypothesis

=

 1 0

1 1

 1 0

  k 1 = 1 0

 k+1 . 1

So by induction, the formula holds for all n ≥ 1. 38. (a) 

cos θ A = sin θ 2

− sin θ cos θ

 cos θ sin θ

  2 − sin θ cos θ − sin2 θ = cos θ 2 cos θ sin θ

 −2 cos θ sin θ . cos2 θ − sin2 θ

But cos2 θ − sin2 θ = cos 2θ and 2 cos θ sin θ = sin 2θ, so that we get   cos 2θ − sin 2θ 2 A = . sin 2θ cos 2θ (b) We have already proven the cases n = 1 and n = 2, which serve as the basis for the induction. Now assume that the given formula holds for k. Then   inductive  hypothesis cos θ − sin θ cos kθ − sin kθ Ak+1 = A1 Ak = sin θ cos θ sin kθ cos kθ   cos θ cos kθ − sin θ sin kθ −(cos θ sin kθ + sin θ cos kθ) . = sin θ cos kθ + cos θ sin kθ cos θ cos kθ − sin θ sin kθ But cos θ cos kθ − sin θ sin kθ = cos(θ + kθ) = cos(k + 1)θ, and sin θ cos kθ + cos θ sin kθ = sin(θ + kθ) = sin(k + 1)θ, so this matrix becomes   cos(k + 1)θ − sin(k + 1)θ k+1 A = . sin(k + 1)θ cos(k + 1)θ So by induction, the formula holds for all k ≥ 1. 39. (a) aij = (−1)i+j means that if i + j is even, then the corresponding entry is 1; otherwise, it is −1.   1 −1 1 −1 −1 1 −1 1 . A=  1 −1 1 −1 −1 1 −1 1

3.1. MATRIX OPERATIONS

137

(b) Each entry is its row number minus its column number:   0 −1 −2 −3 1 0 −1 −2 . A= 2 1 0 −1 3 2 1 0 (c)  (1 − 1)1 (2 − 1)1 A= (3 − 1)1 (4 − 1)1

(1 − 1)2 (2 − 1)2 (3 − 1)2 (4 − 1)2

(1 − 1)3 (2 − 1)3 (3 − 1)3 (4 − 1)3

  0 (1 − 1)4 1 (2 − 1)4  = (3 − 1)4  2 3 (4 − 1)4

0 1 4 9

0 1 8 27

 0 1 . 16 81

(d)  sin (1+1−1)π sin (1+2−1)π sin (1+3−1)π 4 4 4  (2+1−1)π (2+2−1)π (2+3−1)π sin sin sin 4 4 4 A= sin (3+1−1)π sin (3+2−1)π sin (3+3−1)π  4 4 4 (4+2−1)π (4+3−1)π sin (4+1−1)π sin sin 4 4 4   π π 3π sin 4 sin 2 sin 4 sin π   π 3π  sin 2 sin 4  sin π sin 5π 4   = 3π 5π 3π  sin π sin 4 sin 2  sin 4 sin π sin 5π sin 3π sin 7π 4 2 4 √ √   2 2 1 0 2 √ √   2 2 2 1 0 − 2 2  √ = .  √2 0 − 22 −1   2  √ √ 0 − 22 −1 − 22

sin (1+4−1)π 4



  sin (2+4−1)π 4  (3+4−1)π  sin  4 sin (4+4−1)π 4

40. (a) This matrix has zeros whenever the row number exceeds the column number, and the nonzero entries are the sum of the row and column numbers:   2 3 4 5 6 7 0 4 5 6 7 8    0 0 6 7 8 9  .  A=  0 0 0 8 9 10 0 0 0 0 10 11 0 0 0 0 0 12 (b) This matrix has 1’s on the main diagonal zeros elsewhere:  1 1  0 A= 0  0 0

and one diagonal away from the main diagonal, and  1 0 0 0 0 1 1 0 0 0  1 1 1 0 0 . 0 1 1 1 0  0 0 1 1 1 0 0 0 1 1

(c) This matrix is zero except where the row number  0 0 0 0 0 0  0 0 1 A= 0 1 1  1 1 1 1 1 0

plus the column number is 6, 7, or 8:  0 1 1 1 1 1  1 1 0 . 1 0 0  0 0 0 0 0 0

138

CHAPTER 3. MATRICES

41. Let A be an m × n matrix with rows a1 , a2 , . . . , am . Since ei has a 1 in the ith position and zeros elsewhere, we get ei A = 0 · a1 + 0 · a2 + · · · + 1 · ai + · · · + 0 · am = ai , which is the ith row of A.

3.2

Matrix Algebra

1. Since X − 2A + 3B = O, we have X = 2A − 3B, so      1 2 −1 0 2 · 1 − 3 · (−1) X=2 −3 = 3 4 1 1 2·3−3·1 2. Since 2X = A − B, we have X = 21 (A − B),  1 1 1 X = (A − B) = 3 2 2

so   2 −1 − 4 1

  2·2−3·0 5 = 2·4−3·1 3

  1 2 0 = 1 2 2

  1 2 = 3 1

 4 . 5

 1 3 2

3. Solving for X gives 2 2 X = (A + 2B) = 3 3

" 1 3

# " 2 −1 +2 4 1

#! " 0 2 −1 = 3 1 5

# " 2 −2 = 103 6 3

4 3

#

4

.

4. The given equation is 2A − 2B + 2X = 3X − 3A; solving for    1 2 −1 −2 X = 5A − 2B = 5 3 4 1

X gives X = 5A − 2B.    0 7 10 = 1 13 18

5. As in Example 3.16, we form a matrix whose columns consist and row-reduce it:    1 0 2 1 0  2 1 5 0 1  −  −1 2 0 → 0 0 1 1 3 0 0

of the entries of the matrices in question  2 1  0 0

Thus

 2 0

  5 1 =2 3 −1

  2 0 + 1 2

6. As in Example 3.16, we form a matrix whose columns consist and row-reduce it:    1 0 1 2 1 0 0 −1 1  0 1 3  → 0 0 0 1 0 −4 1 0 1 2 0 0 so that



2 −4

  3 1 =3 2 0

  0 0 −4 1 1

 1 . 1 of the entries of the matrices in question  0 3 0 −4  1 −1 0 0

  −1 1 − 0 0

7. As in Example 3.16, we form a matrix whose columns consist and row-reduce it:    1 −1 1 3 1 0  0 0 1 2 1 1     −1 0 1 1   → 0 0  0  0 0 0 0 0    1  0 0 1 0 1 0 0 0 0 0 0

 1 1

of the entries of the matrices in question  0 0 0 0  1 0 . 0 1  0 0 0 0

So the system is inconsistent, and B is not a linear combination of A1 , A2 , and A3 .

3.2. MATRIX ALGEBRA

139

8. As in Example 3.16, we form a matrix whose columns consist of the entries of the matrices in question and row-reduce it:     1 0 0 0 1 1 0 −1 1 2 0 1 0 0 2 0 1 0 −1 −2     0 0 1 0 3 0 1 −1  1 3     0 0 0 1 4 0 0 0 0 0      1 0  0 1 −1   → 0 0 0 0 0 0 0 0 0 0 0 1  0 −1 −2     0 0 0 0 0 0 0 0 0 0     0 0 0 0 0 0 0 0 0 0 2 1 0 −1 1 0 0 0 0 0 Since the linear system is consistent, we conclude that B = A1 + 2A2 + 3A3 + 4A4 .   w x 9. As in Example 3.17, let be an element in the span of the matrices. Then form a matrix whose y z columns consist of the entries of the matrices in question and row-reduce it:     1 0 w 1 0 w  2 1 x 0 1 x − 2w   −   −1 2 y  → 0 0 y + 5w − 2x . 1 1 z 0 0 z+w−x This gives the restrictions y + 5w − 2x = 0, so that y = 2x − 5w, and z + w − x = 0, so that z = x − w. So the span consists of matrices of the form   w x . 2x − 5w x − w   w x 10. As in Example 3.17, let be an element in the span of the matrices. Then form a matrix whose y z columns consist of the entries of the matrices in question and row-reduce it:     1 0 0 w + x−y 1 0 1 w 2 0 1 0  0 −1 1 x x+y     2 →   y−x 0 0 1  0 1 0 y 2 1 0 1 z 0 0 0 z−w This gives the single restriction z − w = 0, so that z = w. So the span consists of matrices of the form   w x . y w 

 w x 11. As in Example 3.17, let be an element in the span of the matrices. Then form a matrix whose y z columns consist of the entries of the matrices in question and row-reduce it:     1 −1 1 a 1 0 0 e  0 0 1 0  2 1 b c+e     −1  0 0 1  0 1 c b − 2c − 2e →    0  0 0 0 a + 3b − 4c − 5e . 0 0 d       1 0 0 0 1 0 e d f 0 0 0 f 0 0 0 We get the restrictions a + 3b − 4c − 5e = 0, so that a = −3b + 4c + 5e, as well as d = f = 0. Thus the span consists of matrices of the form   −3b + 4c + 5e b c . 0 e 0

140

CHAPTER 3. MATRICES

  x1 x2 x3 12. As in Example 3.17, let x4 x5 x6  be an element in the span of the matrices. Then form a matrix x7 x8 x9 whose columns consist of the entries of the matrices in question and row-reduce it:     x1 +x5 1 0 0 0 1 0 −1 1 x1 2 0 1 0 0 0 1  1 0 −1 x2  x3 + x5 −x     2      0 1 −1  1 x 0 0 1 0 x + x − x − x 3 3 5 1 2    0 0 0 1 x3 − x2 + x5 −x1  0 0 0 0 x4      2     1 −1 x5  → 0 0 0 0 x6 − x2 1 0       0 1   0 −1 x 0 0 0 0 x 6 4     0 0   x7 0 0 x7  0 0 0 0       0 0 0 0 0 0  0 0 x8  x8 1

0

−1

1

0

x9

0

0

x9 − x1

0

We get the restrictions x6 = x2 ,

x9 = x1 ,

Thus the span consists of matrices of the form  x1 0 0

x2 x5 0

x4 = x7 = x8 = 0.  x3 x6  . x1

13. Following Example 3.18, we construct an augmented matrix whose columns are the entries of the given matrices and with zeros in the constant column, and row-reduce it:     1 4 0 1 0 0 2 3 0 0 1 0     3 2 0 → 0 0 0 4 1 0 0 0 0 Since the only solution is the trivial solution, we conclude that these matrices are linearly independent. 14. Following Example 3.18, we construct an augmented matrix whose columns are the entries of the given matrices and with zeros in the constant column, and row-reduce it:     1 2 1 0 1 0 13 0 2 0 1 1 0 1 1 0     3  →  4 −1 1 0 0 0 0 0 3

0

1

0

0

0

0

Since the system has a nontrivial solution, the original matrices from the solutions we get from the row-reduced matrix,      1 1 2 1 2 1 1 + = 1 3 4 3 3 −1 0

0 are not linearly independent; in fact, 1 1



15. Following Example 3.18, we construct an augmented matrix whose columns are the entries of the given matrices and with zeros in the constant column, and row-reduce it:     1 0 0 0 0 0 1 −2 −1 0 0 1 0 0 0  1 0 −1 −3 0         5 2 0 1 0  → 0 0 1 0 0   2 3  0 0 0 1 0 1 9 0     0 0 0 0 0 −1 1 0 4 0 0 1 2 5 0 0 0 0 0 0 Since the only solution is the trivial solution, we conclude that these matrices are linearly independent.

3.2. MATRIX ALGEBRA

141

16. Following Example 3.18, we construct an augmented matrix whose columns are the entries of the given matrices and with zeros in the constant column, and row-reduce it:



1 −1   0   0   2   0   0   2 6

2 1 0 0 3 0 0 4 9

1 2 0 0 1 0 0 3 5

−1 1 0 0 −1 0 0 0 −4

  1 0 0 0   0 0   0  0   0 →  0 0 0   0  0  0  0 0 0

0 1 0 0 0 0 0 0 0

0 0 1 0 0 0 0 0 0

0 0 0 1 0 0 0 0 0

 0 0  0  0  0  0  0  0 0

The only solution is the trivial solution, so the matrices are linearly independent.

17. Let A, B, and C be m × n matrices.

(a)



a11  .. A+B = .

... .. . ...

  a1n b11 ..  +  .. .   .

 . . . b1n ..  .. . .  . . . bmn 

am1 amn bm1  a11 + b11 . . . a1n + b1n   .. .. .. =  . . . am1 + bm1 . . . amn + bmn   b11 + a11 . . . b1n + a1n   .. .. .. =  . . . bm1 + am1 . . . bmn + amn    b11 . . . b1n a11 . . .  ..   . . .. .. ..  +  ... = . . bm1 . . . bmn am1 . . . = B + A.

 a1n ..  .  amn

142

CHAPTER 3. MATRICES (b) 

a11  .. (A + B) + C =  . am1   =

  b11 a1n ..  +  .. .   .

... .. . ...

bm1

amn

a11 + b11 .. .

... .. . ...

a1n + b1n .. .

  c11 . . . b1n ..  +  .. .. . .   . cm1 . . . bmn    c11 . . . c1n   .. ..  .. + . . . 

... .. . ...

 c1n ..  .  cmn

cm1 . . . cmn am1 + bm1 amn + bmn   a11 + b11 + c11 . . . a1n + b1n + c1n   .. .. .. =  . . . am1 + bm1 + cm1 . . . amn + bmn + cmn     a11 . . . a1n b11 + c11 . . . b1n + c1n   ..  +  .. .. .. .. =  ...  . . .   . . am1 . . . amn  b11 . . .  .. .. = A +  . . bm1 . . .

bm1 + cm1 . . . bmn + cmn    b1n c11 . . . c1n ..  +  .. ..  .. . .   . .  bmn cm1 . . . cmn

= A + (B + C). (c) 

a11  .. A+O = . am1

  . . . a1n 0 ..  +  .. .. . .  . . . . amn 0

  0 a11 + 0 ..  =  .. .  .

... .. . ...

0

... .. . ...

am1 + 0

 a1n + 0  ..  . amn + 0  a11  .. = . am1

(d) 

a11  .. A + (−A) =  . am1  a11  .. = .

... .. . ... ... .. . ...

   a1n a11 ..  + −  .. .    . amn

am1 

a1n −a11 ..  +  .. .   .

am1 amn  a11 − a11 . . .  .. .. = . . am1 − am1 . . .   0 ... 0   =  ... . . . ...  0 = O.

...

0



−am1

... .. . ...

... .. . ... 

a1n − a1n  ..  . amn − amn

 a1n ..  .  amn  −a1n ..  . 

−amn

... .. . ...

 a1n ..  = A. .  amn

3.2. MATRIX ALGEBRA

143

18. Let A and B be m × n matrices, and c and d be scalars. (a) 

a11  .. c(A + B) = c  .

  a1n b11 ..  +  .. .   .

... .. . ...

am1 amn  a11 + b11 . . .  .. .. = c  . . am1 + bm1 

...

bm1

a1n + b1n  ..  . amn + bmn  c(a1n + b1n )  ..  .

... .. . c(am1 + bm1 ) . . . c(amn + bmn )   ca11 + cb11 . . . ca1n + cb1n   .. .. .. =  . . .

 =

c(a11 + b11 ) .. .

 . . . b1n ..  .. . .  . . . bmn 

cam1 + cbm1 . . . camn + cbmn    ca11 . . . ca1n cb11 . . .  ..   .. . . .. . . = . . . . + . cam1  a11  .. = c .

 cb1n ..  . 

...

camn cbm1 . . . cbmn    . . . a1n b11 . . . b1n ..  + c  .. ..  .. ..  . . . .  .  . . . amn bm1 . . . bmn

am1 = cA + cB. (b) 

a11  (c + d)A = (c + d)  ... am1 

(c + d)a11  .. = . (c + d)am1  ca11 + da11  .. = .

... .. . ... ... .. . ...

 a1n ..  .  amn  (c + d)a1n  ..  . (c + d)amn

 ... ca1n + d1n  .. ..  . . cam1 + dam1 . . . camn + damn     ca11 . . . ca1n da11 . . . da1n  ..  +  .. ..  .. .. =  ... . . .   . .  cam1 . . . camn dam1 . . . damn     a11 . . . a1n a11 . . . a1n  ..  + d  .. ..  .. .. = c  ...  . . . .  .  am1

...

= cA + dA.

amn

am1

...

amn

144

CHAPTER 3. MATRICES (c) 

da11  .. c(dA) = c  .

... .. . ...

dam1

  da1n cda11 ..  =  .. .   . damn

cdam1

... .. . ...

  cda1n a11 ..  = cd  ..  . .  cdamn

am1

... .. . ...

 a1n ..  . .  amn

(d) 

a11  .. 1A = 1  . am1

  1a11 . . . a1n ..  =  .. .. . .   . 1am1 . . . amn

... .. . ...

  a11 1a1n ..  =  .. .   . 1amn

am1

... .. . ...

 a1n ..  = A. .  amn

19. Let A and B be r × n matrices, and C be an n × s matrix, so that the product makes sense. Then       c11 . . . c1s b11 . . . b1n a11 . . . a1n  ..  ..   .. ..  +  .. .. .. .. (A + B)C =  ... . . . .  .   . .   . cn1 . . . cns br1 . . . brn ar1 . . . arn    a11 + b11 . . . a1n + b1n c11 . . . c1s    .. .. .. ..  .. .. =  . . . . . .  ar1 + br1 . . . arn + brn cn1 . . . cns   (a11 + b11 )c11 + · · · + (a1n + b1n )cn1 . . . (a11 + b11 )c1s + · · · + (a1n + b1n )cns   .. .. .. =  . . . (ar1 + br1 )c11 + · · · + (arn + brn )cn1 . . . (ar1 + br1 )c1s + · · · + (ar1 + br1 )c1s   a11 c11 + · · · + a1n cn1 . . . a11 c1s + · · · + a1n cns   .. .. .. = + . . . ar1 c11 + · · · + arn cn1 . . . ar1 c1s + · · · + ar1 cns   b11 c11 + · · · + b1n cn1 . . . b11 c1s + · · · + b1n cns   .. .. ..   . . . br1 c11 + · · · + brn cn1 . . . br1 c1s + · · · + br1 cns = AC + BC. 20. Let A be an r × n matrix and C be an n × s matrix, so that the product makes sense. Let k be a scalar. Then   a11 c11 + · · · + a1n cn1 · · · a11 c1s + · · · + a1n cns   .. .. .. k(AB) = k   . . . ar1 c11 + · · · + arn cn1 · · · ar1 c1s + · · · + arn cns   k(a11 c11 + · · · + a1n cn1 ) · · · k(a11 c1s + · · · + a1n cns )   .. .. .. =  . . . k(ar1 c11 + · · · + arn cn1 ) · · · k(ar1 c1s + · · · + arn cns )   (ka11 )c11 + · · · + (ka1n )cn1 · · · (ka11 )c1s + · · · + (ka1n )cns   .. .. .. =  . . . (kar1 )c11 + · · · + (karn )cn1 · · · (kar1 )c1s + · · · + (karn )cns    ka11 . . . ka1n c11 . . . c1s  ..   .. ..  .. .. =  ... . . .  . .  kar1 . . . karn cn1 . . . cns = (kA)B.

3.2. MATRIX ALGEBRA

145

That proves the first equality. Regrouping the expression on the second line above differently, we get   a11 (kc11 ) + · · · + a1n (kcn1 ) · · · a11 (kc1s ) + · · · + a1n (kcns )   .. .. .. k(AB) =   . . . ar1 (kc11 ) + · · · + arn (kcn1 ) · · · ar1 (kc1s ) + · · · + arn (kcns )    a11 . . . a1n kc11 . . . kc1s  ..   .. ..  .. .. =  ... . . .  . .  ar1 . . . arn kcn1 . . . kcns = A(kB). 21. If A is m × n, then to prove Im A = A, note that 

Im

 e1  e2    =  . ,  ..  em

where ei is a standard (row) unit vector; then       A1 e1 e1 A  e2   e2 A   A2        Im A =  .  A =  .  =  .  = A.  ..   ..   ..  em

em A

Am

22. Note that (A − B)(A + B) = A(A + B) − B(A + B) = A2 + AB − BA + B 2 = (A2 + B 2 ) + (AB − BA), so that (A − B)(A + B) = A2 + B 2 if and only if AB − BA = O; that is, if and only if AB = BA. 23. Compute both AB and BA:      1 1 a b a+c b+d = 0 1 c d c d      a b 1 1 a a+b BA = = . c d 0 1 c c+d

AB =

So if AB = BA, then a + c = a, b + d = a + b, c = c, and d = c + d. The first equation gives c = 0 and the second gives d = a. So the matrices that commute with A are matrices of the form   a b . 0 a 24. Compute both AB and BA: 

    1 −1 a b a−c b−d AB = = −1 1 c d −a + c −b + d      a b 1 −1 a − b −a + b BA = = . c d −1 1 c − d −c + d So if AB = BA, then a − c = a − b, b − d = −a + b, −a + c = c − d, and −b + d = −c + d. The first (or fourth) equation gives b = c and the second (or third) gives d = a. So the matrices that commute with A are matrices of the form   a b . b a

146

CHAPTER 3. MATRICES

25. Compute both AB and BA:      1 2 a b a + 2c b + 2d = 3 4 c d 3a + 4c 3b + 4d      a b 1 2 a + 3b 2a + 4b BA = = . c d 3 4 c + 3d 2c + 4d

AB =

So if AB = BA, then a + 2c = a + 3b, b + 2d = 2a + 4b, 3a + 4c = c + 3d, and 3b + 4d = 2c + 4d. The first (or fourth) equation gives 3b = 2c and the third gives a = d − c. Substituting a = d − c into the second equation gives 3b = 2c again. Thus those are the only two conditions, and the matrices that commute with A are those of the form # " d − c 32 c . c d 26. Let A1 be the first matrix and A4 be the second. Then     1 0 a b a A1 B = = 0 0 c d 0     a b 1 0 a BA1 = = c d 0 0 c     0 0 a b 0 A4 B = = 0 1 c d c     a b 0 0 0 BA4 = = c d 0 1 0

b 0



 0 0  0 d  b . d

So the conditions imposed by requiring that A1 B = BA1 and A4 B = BA4 are b = c = 0. Thus the matrices that commute with both A1 and A4 are those of the form   a 0 0 d 27. Since we want B to commute with every2 × 2matrix, in particular it must commute with A1 and A4 a 0 from the previous exercise, so that B = . In addition, it must commute with 0 d A2 =

 0 0

 1 0

and A3 =

 0 1

 0 . 0

Computing the products, we get  0 0  a BA2 = 0  0 A3 B = 1  a BA3 = 0

A2 B =

 1 a 0 0  0 0 d 0  0 a 0 0  0 0 d 1

  0 0 = d 0   1 0 = 0 0   0 0 = d a   0 0 = 0 d

The condition that arises from these equalities is a = d. Thus   a 0 B= . 0 a

 d 0  a 0  0 0  0 . 0

3.2. MATRIX ALGEBRA

147

Finally, note that for any 2 × 2 matrix M ,   x y M= = xA1 + yA2 + zA3 + wA4 , z w so that if B commutes with A1 , A2 , A3 , and A4 , then it will commute with M . So the matrices B that satisfy this condition are the matrices above: those where the off-diagonal entries are zero and the diagonal entries are the same. 28. Suppose A is an m × n matrix and B is an a × b matrix. Since AB is defined, the number of columns (n) of A must equal the number of rows (a) of B; since BA is defined, the number of columns (b) of B must equal the number of rows (m) of A. Since m = b and n = a, we see that A is m × n and B is n × m. But then AB is m × m and BA is n × n, so both are square. 29. Suppose A = (aij ) and B = (bij ) are both upper triangular n × n matrices. This means that aij = bij = 0 if i > j. But if C = (cij ) = AB, then ckl = ak1 b1l + ak2 b2l + · · · + akl bll + ak(l+1) b(l+1)l + · · · + akn bnl . If k > l, then ak1 b1l + ak2 b2l + · · · + akl bll = 0 since each of the akj is zero. But also ak(l+1) b(l+1)l + · · · + akn bnl = 0 since each of the bil is zero. Thus ckl = 0 for k > l, so that C is upper triangular. 30. Let A and B whose sizes permit the indicated operations, and let k be a scalar. Denote the ith row of a matrix X by rowi (X) and its j th column by colj (X). Then Theorem 3.4(a): For any i and j, h     T i = AT ji = A ij , so that (AT )T = A. AT ij

Theorem 3.4(b): For any i and j,             (A + B)T ij = A + B ji = A ji + B ji = AT ij + B ij , so that (A + B)T = AT + B T . Theorem 3.4(c): For any i and j,         (kA)T ij = kA ji = k A ji = k A ij , so that (kA)T = k(AT ). 31. We prove that (Ar )T = (AT )r by induction on r. For r = 1, this says that (A1 )T = (AT )1 , which is clear. Now assume that (Ak )T = (AT )k . Then (Ak+1 )T = (A · Ak )T = (Ak )T AT = (AT )k AT = (AT )k+1 . (The second equality is from Theorem 3.4(d), and the third is the inductive assumption). 32. We prove that (A1 + A2 + · · · + An )T = AT1 + AT2 + · · · + ATn by induction on n. For n = 1, this says that AT1 = AT1 , which is clear. Now assume that (A1 + A2 + · · · + Ak )T = AT1 + AT2 + · · · + ATk . Then (A1 + A2 + · · · + Ak + Ak+1 )T = ((A1 + A2 + · · · + Ak ) + Ak+1 )T = (A1 + A2 + · · · + Ak )T + ATk+1 =

AT1

+

AT2

+ ··· +

ATk

+

ATk+1

(Thm. 3.4(b)) (Inductive assumption).

33. We prove that (A1 A2 · · · An )T = ATn ATn−1 · · · AT2 AT1 by induction on n. For n = 1, this says that AT1 = AT1 , which is clear. Now assume that (A1 A2 · · · Ak )T = ATk ATk−1 · · · AT2 AT1 . Then (A1 A2 · · · Ak Ak+1 )T = ((A1 A2 · · · Ak )Ak+1 )T = ATk+1 (A1 A2 · · · Ak )T = ATk+1 ATk · · · AT2 AT1

(Thm. 3.4(d)) (Inductive assumption).

148

CHAPTER 3. MATRICES

34. Using Theorem 3.4, we have (AAT )T = (AT )T AT = AAT and (AT A)T = AT (AT )T = AT A. So each of AAT and AT A is equal to its own transpose, so both matrices are symmetric. 35. Let A and B be symmetric n × n matrices, and let k be a scalar. (a) By Theorem 3.4(b), (A + B)T = AT + B T ; since both A and B are symmetric, this is equal to A + B. Thus A + B is equal to its transpose, so is symmetric. (b) By Theorem 3.4(c), (kA)T = kAT ; since A is symmetric, this is equal to kA. Thus kA is equal to its own transpose, so is symmetric     1 1 0 1 36. (a) For example, let n = 2, with A = and B = . Then both A and B are symmetric, 1 0 1 0 but      1 1 0 1 1 1 AB = = , 1 0 1 0 0 1 which is not symmetric. (b) Suppose A and B are symmetric. Then if AB is symmetric, we have AB = (AB)T = B T AT = BA using Theorem 3.4(d) and the fact that A and B are symmetric. In the other direction, if A and B are symmetric and AB = BA, then (AB)T = B T AT = BA = AB,

37. (a) (b)

(c)

(d)

so that AB is symmetric.     1 −2 −1 −2 T A = while −A = , so AT 6= −A and A is not skew-symmetric. 2 3 2 −3   0 1 T A = = −A, so A is skew-symmetric. −1 0   0 −3 1 0 −2 = −A, so A is skew-symmetric. AT =  3 −1 2 0   0 −1 2 0 5 6= −A, so A is not skew-symmetric. AT = 1 2 5 0

38. A matrix A is skew-symmetric if AT = −A. But this means that for every i and j, (AT )ij = Aji = −Aij . Thus the components must satisfy aij = −aji for every i and j. 39. By the previous exercise, we must have aij = −aji for every i and j. So when i = j, we get aii = −aii , which implies that aii = 0 for all i. 40. If A and B are skew-symmetric, so that AT = −A and B T = −B, then (A + B)T = AT + B T = −A − B = −(A + B), so that A + B is also skew-symmetric.     0 a 0 b 41. Suppose that A = and B = are skew-symmetric. Then −a 0 −b 0     −ab 0 −ab 0 T AB = , and thus (AB) = . 0 −ab 0 −ab So AB = −(AB)T only if ab = −ab, which happens if either a or b is zero. Thus either A or B must be the zero matrix.

3.2. MATRIX ALGEBRA

149

42. We must show that A−AT is skew-symmetric, so we must show that (A−AT )T = −(A−AT ) = AT −A. But (A − AT )T = AT − (AT )T = AT − A as desired, proving the result. 43. (a) If A is a square matrix, let S = A + AT and S 0 = A − AT . Then clearly A = 12 S + 12 S 0 . S is symmetric by Theorem 3.5, so 21 S is as well. S 0 is symmetric by Exercise 42, so 12 S 0 is as well. Thus A is a sum of a symmetric and a skew-symmetric matrix. (b) We have       1 2 3 1 4 7 2 6 10 S = A + AT = 4 5 6 + 2 5 8 =  6 10 14 7 8 9 3 6 9 10 14 18       1 2 3 1 4 7 0 −2 −4 S = A − AT = 4 5 6 − 2 5 8 = 2 0 −2 . 7 8 9 3 6 9 4 2 0 Thus

 1 1 0  1 A= S+ S = 3 2 2 5

3 5 7

  5 0 7 + 1 9 2

  −1 −2 1 0 −1 = 4 1 0 7

2 5 8

 3 6 . 9

44. Let A and B be n × n matrices, and k be a scalar. Then (a) tr(A + B) = (a11 + b11 ) + (a22 + b22 ) + · · · + (ann + bnn ) = (a11 + a22 + · · · + ann ) + (b11 + b22 + · · · + bnn ) = tr(A) + tr(B). (b) tr(kA) = ka11 + ka22 + · · · + kann = k(a11 + a22 + · · · + ann ) = k tr(A). 45. Let A and B be n × n matrices. Then tr(AB) = (AB)11 + (AB)22 + · · · + (AB)nn = (a11 b11 + a12 b21 + · · · + a1n bn1 ) + (a21 b12 + a22 b22 + · · · + a2n bn2 ) + · · · (an1 b1n + an2 b2n + · · · + ann bnn ) = (b11 a11 + b12 a21 + · · · + b1n an1 ) + (b21 a12 + b22 a22 + · · · + b2n an2 ) + · · · (bn1 a1n + bn2 a2n + · · · + bnn ann ) = tr(BA). 46. Note that the ii entry of AAT is equal to (AAT )ii = (A)i1 (AT )1i + (A)i2 (AT )2i + · · · (A)in (AT )ni = a2i1 + a2i2 + · · · + a2in , so that tr(AAT ) = (a211 + a212 + · · · + a21n ) + (a221 + a222 + · · · + a22n ) + (a2n1 + a2n2 + · · · + a2nn ). That is, it is the sum of the squares of the entries of A. 47. Note that by Exercise 45, tr(AB) = tr(BA). By Exercise 44, then, tr(AB − BA) = tr(AB) − tr(BA) = tr(AB) − tr(AB) = 0. Since tr(I2 ) = 2, it cannot be the case that AB − BA = I2 .

150

3.3

CHAPTER 3. MATRICES

The Inverse of a Matrix

1. Using Theorem 3.8, note that  4 det 1

 7 = ad − bc = 4 · 2 − 1 · 7 = 1, 2

so that  −1 1 7 2 = 2 −1 1

 4 1

  −7 2 = 4 −1

 −7 . 4

As a check,  4 1

 7 2 2 −1

    −7 4 · 2 + 7 · (−1) 4 · (−7) + 7 · 4 1 = = 4 1 · 2 + 2 · (−1) 1 · (−7) + 2 · 4 0

 0 . 1

2. Using Theorem 3.8, note that  4 2

 −2 = ad − bc = 4 · 0 − (−2) · 2 = 4, 0

" 4

−2

2

0

det so that

#−1

# 2

" 0 1 = 4 −2

4

" =

#

0

1 2

− 12

1

As a check, "

4 2

# " 0 −2 · 1 0 −2

1 2

#

1

" =

4 · 0 − 2 · − 21





2 · 0 + 0 · − 21





1 2 1 2

−2·1

#

+0·1

=

" 1

0

0

1

#

3. Using Theorem 3.8, note that  3 det 6

 4 = ad − bc = 3 · 8 − 4 · 6 = 0. 8

Since the determinant is zero, the matrix is not invertible. 4. Using Theorem 3.8, note that 

 1 = ad − bc = 0 · 0 − 1 · (−1) = 1, 0



1 0

0 det −1 so that 0 −1

−1

 0 =1· 1

  −1 0 = 0 1

 −1 0

As a check, 

0 −1

  1 0 · 0 1

5. Using Theorem 3.8, note that " det

3 4 5 6

  −1 0·0+1·1 = 0 −1 · 0 + 0 · 1

3 5 2 3

# = ad − bc =

  0 · (−1) + 1 · 0 1 = −1 · (−1) + 0 · 0 0

3 2 3 5 1 1 · − · = − = 0. 4 3 5 6 2 2

Since the determinant is zero, the matrix is not invertible.

 0 1

3.3. THE INVERSE OF A MATRIX 6. Using Theorem 3.8, note that " √1 2 − √12

det

151

√1 2 √1 2

#

1 −1 1 1 1 1 = ad − bc = √ · √ − √ · √ = + = 1 2 2 2 2 2 2

so that "

As a check, "

√1 2 √1 2

− √12

√1 2 − √12

# "

√1 2 − √12

·

√1 2

√1 2 √1 2

√1 2 √1 2

#

#−1 =1·

" =

"

√1 2 √1 2

− √12

#

√1 2

" =

√1 · √1 + − √1 · − √1 2 2 2 2 √1 · √1 + √1 · − √1 2 2 2 2

√1 2 √1 2

− √12

#

√1 2

√1 · √1 + − √1 · √1 2 2 2 2 √1 · √1 + √1 · √1 2 2 2 2

# =

" 1

# 0

0

1

7. Using Theorem 3.8, note that   −1.5 −4.2 = ad − bc = −1.5 · 2.4 − (−4.2) · 0.5 = −3.6 + 2.1 = −1.5. det 0.5 2.4 Then

 −1 1 2.4 −4.2 · =− 2.4 1.5 −0.5

 −1.5 0.5

  4.2 −1.6 = −1.5 0.333

As a check,     −1.5 −4.2 −1.6 −2.8 −1.5 · (−1.6) − 4.2 · 0.333 = 0.5 2.4 0.333 1 0.5 · (−1.6) + 2.4 · 0.333

 −2.8 1

  −1.5 · (−2.8) − 4.2 · 1 1 ≈ 0.5 · (−2.8) + 2.4 · 1 0

 0 . 1

8. Using Theorem 3.8, note that   3.55 0.25 det = ad − bc = 3.55 · 0.5 − 0.25 · 8.52 = 1.775 − 2.13 = −0.335. 8.52 0.50 Then

 −1   1 0.25 0.5 −0.25 −1.408 =− · ≈ 0.50 24 0.335 −8.52 3.55

 3.55 8.52 As a check,   3.55 0.25 −1.408 8.52 0.50 24

0.704 −10



    0.704 3.55 · (−1.408) + 0.25 · 24 3.55 · 0.704 + 0.25 · (−10) 1 = ≈ −10 8.52 · (−1.408) + 0.50 · 24 8.52 · 0.704 + 0.50 · (−10) 0

 0 . 1

9. Using Theorem 3.8, note that   a −b det = a · a − (−b) · b = a2 + b2 . b a Then "

" #−1 a a −b 1 · = 2 a + b2 −b b a

# " a b 2 2 = a +bb a − a2 +b2

#

b a2 +b2 a a2 +b2

As a check, " #" a a −b a2 +b2 b b a − a2 +b 2

b a2 +b2 a a2 +b2

#

   b a − a2 +b a · a2 +b 2 − b · 2   = a b b · a2 +b − a2 +b 2 + a · 2

b



a2 +b2



b a2 +b2

−b· +a· " =

a



a2 +b2 a a2 +b2

a2 +b2 a2 +b2 ab−ab a2 +b2

 ab−ab a2 +b2 a2 +b2 a2 +b2

#

"

1 = 0

# 0 . 1

152

CHAPTER 3. MATRICES

10. Using Theorem 3.8, note that " det Then "

1 a 1 c

1 b 1 d

1 a 1 c

#−1

1 b 1 d

# =

1 1 1 1 1 1 bc − ad · − · = − = a d b c ad bc abcd

" 1 abcd = · d1 bc − ad − c

− 1b 1 a

#

" =

abc bc−ad abd − bc−ad

acd − bc−ad

#

bcd bc−ad

.

As a check, "

1 a 1 c

1 b 1 d

#"

abc bc−ad abd − bc−ad

acd − bc−ad bcd bc−ad

#

" =

bc bc−ad ab bc−ad

− −

ad bc−ad ab bc−ad

cd − bc−ad + ad − bc−ad +

cd bc−ad bc bc−ad

#

" 1 = 0

# 0 . 1

11. Following Example 3.25, we first invert the coefficient matrix   2 1 . 5 3 The determinant of this matrix is 2 · 3 − 1 · 5 = 1, so its inverse is     1 3 −1 3 −1 = . 2 −5 2 1 −5 Thus the solution to the system is given by    −1    x 2 1 −1 3 = · = y 5 3 2 −5

       −1 −1 3 · (−1) − 1 · 2 −5 · = = , 2 2 −5 · (−1) + 2 · 2 9

so the solution is x = −5, y = 9. 12. Following Example 3.25, we first invert the coefficient matrix   1 −1 . 2 1 The determinant of this matrix is 1 · 1 − (−1) · 2 = 3, so its inverse is # " " # 1 1 1 1 1 3 3 · = 3 −2 1 −2 1 3

3

and thus the solution to the system is given by " # x y

=

" 1 2

#−1 " # " 1 −1 1 3 · = − 32 1 2

1 3 1 3

# " # " # " # 1 1 1 1 3 ·1+ 3 ·2 · = = 2 − 23 · 1 + 31 · 2 0

and the solution is x = 1, y = 0. 13. (a) Following Example 3.25, we first invert the coefficient matrix   1 2 . 2 6 The determinant of this matrix is 1 · 6 − 2 · 2 = 2, so its inverse is " # " # 6 −2 3 −1 1 · = . 1 2 −2 1 −1 2

3.3. THE INVERSE OF A MATRIX

153

Then the solutions for the three vectors given are      3 −1 3 4 x1 = A−1 b1 = = 1 5 −1 − 12 2      3 −1 −1 −5 x2 = A−1 b2 = = 1 2 2 −1 2      3 −1 2 6 = . x3 = A−1 b3 = 1 0 −2 −1 2  (b) To solve all three systems at once, form the augmented matrix A | b1 reduce to solve.     1 0 4 −5 6 1 2 3 −1 2 → 2 0 2 6 5 0 1 − 12 2 −2

b2

 b3 and row

(c) The method of constructing A−1 requires 7 multiplications, while row reductions requires only 6. 14. To show the result, we must show that (cA) · 1c A−1 = I. But 1 (cA) · A−1 = c



 1 · c AA−1 = AA−1 = I. c

15. To show the result, we must show that AT (A−1 )T = I. But AT (A−1 )T = (A−1 A)T = I T = I, where the first equality follows from Theorem 3.4(d). 16. First we show that In−1 = In . To see that, we need to prove that In · In = In , but that is true. Thus In−1 = In , and it also follows that In is invertible, since we found its inverse. 17. (a) There are many possible counterexamples. For instance, let  1 2

0 1

 1 AB = 2

0 1

A=



 B=

1 1

 0 . −1

Then  1 1

  0 1 = −1 3

 0 . −1

Then det(AB) = 1 · (−1) − 0 · 3 = −1, det A = 1 · 1 − 0 · 2 = 1, and det B = 1 · (−1) − 0 · 1 = −1, so that     1 −1 0 1 0 −1 (AB) = = 3 −1 −1 −3 1     1 1 0 1 0 −1 A = = −2 1 1 −2 1     1 −1 0 1 0 −1 B = = . 1 −1 −1 −1 1 Finally, A and (AB)−1 6= A−1 B −1 .

−1

B

−1



1 = −2

 0 1 1 1

  0 1 = −1 −1

 0 , −1

154

CHAPTER 3. MATRICES (b) Claim that (AB)−1 = A−1 B −1 if and only if AB = BA. First, if (AB)−1 = A−1 B −1 , then AB = ((AB)−1 )−1 = (A−1 B −1 )−1 = (B −1 )−1 (A−1 )−1 = BA. To prove the reverse, suppose that AB = BA. Then (AB)−1 = (BA)−1 = A−1 B −1 . This proves the claim.

18. Use induction on n. For n = 1, the statement says that (A1 )−1 = A−1 1 , which is clear. Now assume that the statement holds for n = k. Then −1 (A1 A2 · · · Ak Ak+1 )−1 = ((A1 A2 · · · Ak )Ak+1 )−1 = A−1 k+1 (A1 A2 · · · Ak )

by Theorem 3.9(c). But then by the inductive hypothesis, this becomes −1 −1 −1 −1 −1 −1 −1 −1 A−1 = A−1 k+1 (A1 A2 · · · Ak ) k+1 (Ak · · · A2 A1 ) = Ak+1 Ak · · · A2 A1 .

So by induction, the statement holds for all k ≥ 1. 19. There are many examples. For instance, let   1 0 A= , 0 1

B = −A =

 −1 0

 0 . −1

Then A−1 = A and B −1 = B = −A, so that A−1 + B −1 = O. But A + B = O, so (A + B)−1 does not exist. 20. XA2 = A−1 ⇒ (XA2 )A−2 = A−1 A−2 ⇒ X(A2 A−2 ) = A−1+(−2) ⇒ XI = A−3 ⇒ X = A−3 . 21. AXB = (BA)2 ⇒ A−1 (AXB)B −1 = A−1 (BA)2 B −1 ⇒ (A−1 A)X(BB −1 ) = A−1 (BA)2 B −1 . Thus X = A−1 (BA)2 B −1 . 22. By Theorem 3.9, (A−1 X)−1 = X −1 (A−1 )−1 = X −1 A and (B −2 A)−1 = A−1 (B −2 )−1 = A−1 B 2 , so the original equation becomes X −1 A = AA−1 B 2 = B 2 . Then X −1 A = B 2 ⇒ XX −1 A = XB 2 ⇒ A = XB 2 ⇒ AB −2 = XB 2 B −2 = X,

so that

X = AB −2 .

23. ABXA−1 B −1 = I + A, so that B −1 A−1 (ABXA−1 B −1 )BA = B −1 A−1 (I + A)BA = B −1 A−1 BA + B −1 A−1 ABA But B −1 A−1 (ABXA−1 B −1 )BA = (B −1 A−1 AB)X(A−1 B −1 BA) = X, and B −1 A−1 BA + B −1 A−1 ABA = B −1 A−1 BA + A. Thus X = B −1 A−1 BA + A = (AB)−1 BA + A.  0 0 24. To get from A to B, reverse the first and third rows. Thus E = 0 1 1 0  0 0 25. To get from B to A, reverse the first and third rows. Thus E = 0 1 1 0  1 26. To get from A to C, add the first row to the third row. Thus E = 0 1

 1 0. 0  1 0. 0 0 1 0

 0 0. 1

3.3. THE INVERSE OF A MATRIX

155 

1 27. To get from C to A, subtract the first row from the third row. Thus E =  0 −1

 0 0. 1  1 0 = 0 1 0 0  0 0 1 2. 0 1 0 1 0

28. To get from C to D, subtract twice the third row from the second row. Thus E  1 29. To get from D to C, add twice the third row to the second row. Thus E = 0 0

 0 −2. 1

30. The row operations R2 − 2R1 − 2R3 → R2 , R3 + R1 → R3 will turn matrix A into matrix D. Thus        1 0 0 1 0 0 1 2 −1 1 2 −1 1 1 = −3 −1 3 = D. E = −2 1 −2 satisfies EA = −2 1 −2 1 1 0 1 1 0 1 1 −1 0 2 1 −1 However, E is not an elementary matrix since it includes several row operations, not just one. 31. This matrix multiplies the first row by 3, so its inverse multiplies the first row by 31 :  3 0

−1  1 0 = 3 1 0

 0 . 1

32. This matrix adds twice the second row to the first row, so its inverse subtracts twice the second row from the first row:  −1   1 2 1 −2 = . 0 1 0 1 33. This matrix reverses rows 1 and 2, so its inverse does the same thing. Hence this matrix is its own inverse. 34. This matrix adds − 21 times the first row to the second row, so its inverse adds the second row:  −1   1 0 1 0 = . 1 1 − 12 1 2

1 2

times the first row to

35. This matrix subtracts twice the third row from the second row, so its inverse adds twice the third row to the second row:  −1   1 0 0 1 0 0 0 1 −2 = 0 1 2 . 0 0 1 0 0 1 36. This matrix reverses rows 1 and 3, so its inverse does the same thing. Hence this matrix is its own inverse. 37. This matrix multiplies row 2 by c. Since c 6= 0, its inverse multiplies the second row by 1c :  1 0 0

0 c 0

38. This matrix adds c times the third row to row from the second row:  1 0 0 1 0 0

−1  1 0 0 = 0 1 0

0 1 c

0

 0 0 . 1

the second row, so its inverse subtracts c times the third −1  0 1 c  = 0 1 0

0 1 0

 0 −c . 1

156

CHAPTER 3. MATRICES

39. As in Example 3.29, we first row-reduce A:    1 0 1 A= R2 +R1 −1 −2 − −−−−→ 0

  0 1 1 − 2 R2 −2 − −−−→ 0

 0 . 1

The first row operation adds the first row to the second row, and the second row operation multiplies the second row by − 12 . These correspond to the elementary matrices     1 0 1 0 E1 = and E2 = . 1 1 0 − 12 So E2 E1 A = I and thus A = E1−1 E2−1 . But E1−1 is R2 − R1 → R2 , and E2−1 is −2R2 → R2 , so     1 0 1 0 −1 −1 E1 = and E2 = , −1 1 0 −2 so that A=

E1−1 E2−1



 0 1 1 0

1 = −1

 0 . −2

Taking the inverse of both sides of the equation A = E1−1 E2−1 gives A−1 = (E1−1 E2−1 )−1 = (E2−1 )−1 (E1−1 )−1 = E2 E1 , so that A−1

" 1 = E2 E1 = 0

#" 1 − 21 1

# " 0 1 = 1 − 12

0

40. As in Example 3.29, we first row-reduce A:      2 4 R1 ↔R2 1 1 1 A= R2 −2R1 1 1 −−−−−→ 2 4 − −−−−→ 0

  1 1 1 2 R2 2 −− −→ 0

Converting each of these steps to an elementary matrix gives       1 0 0 1 1 0 , E1 = , E2 = , E3 = 1 0 −2 1 0 12

0

# .

− 21

 R1 −R2  1 − −−−−→ 1 1 0

E4 =

 1 0

 0 . 1

 −1 . 1

So E4 E3 E2 E1 A = I and thus A = E1−1 E2−1 E3−1 E4−1 . We can invert each of the elementary matrices above by considering the row operation which reverses its action, giving         0 1 1 0 1 0 1 1 −1 −1 −1 −1 E1 = , E2 = , E3 = , E4 = , 1 0 2 1 0 2 0 1 so that A=

E1−1 E2−1 E4−1

 0 = 1

 1 1 0 2

 0 1 1 0

 0 1 2 0

 1 . 1

Taking the inverse of both sides of the equation A = E1−1 E2−1 E3−1 E4−1 gives A−1 = E4 E3 E2 E1 , so that #" #" # " # " #" 1 0 0 1 − 12 2 1 −1 1 0 −1 A = E4 E3 E2 E1 = = . 1 −1 0 1 0 21 −2 1 1 0 2 41. Suppose AB = I and consider the equation Bx = 0. Left-multiplying by A gives ABx = A0 = 0. But AB = I, so ABx = Ix = x. Thus x = 0. So the system represented by Bx = 0 has the unique solution x = 0. From the equivalence of (c) and (a) in the Fundamental Theorem, we know that B is invertible (i.e., B −1 exists and satisfies BB −1 = I = B −1 B.) So multiply both sides of the equation AB = I on the right by B −1 , giving ABB −1 = IB −1 ⇒ AI = IB −1 ⇒ A = B −1 . This proves the theorem.

3.3. THE INVERSE OF A MATRIX

157

42. (a) Let A be invertible and suppose that AB = O. Left-multiply both sides of this equation by A−1 , giving A−1 AB = A−1 O. But A−1 A = I and A−1 O = O, so we get IB = B = O. Thus B is the zero matrix. (b) For example, let A= Then



1 AB = 2

 1 2

 2 , 4

 2 2 4 −1

 B=

 −2 . 1

2 −1

  −2 0 = 1 0

 0 , but B 6= O. 0

43. (a) Since A is invertible, we can multiply both sides of BA = CA on the right by A−1 , giving BAA−1 = CAA−1 , so that BI = CI and thus B = C. (b) For example, let 

1 A= 2 Then

 1 BA = 1

 1 1 1 2

 2 , 4

  2 3 = 4 3

44. (a) There are many possibilities. For  1 A= 0

 1 B= 1   6 3 = 6 1

example:   0 1 = I, B = 1 1

 1 , 1

 3 C= 1

 0 1 1 2

 0 . 1

 2 = CA, but B 6= C. 4

 0 , 0

 1 C= 0

  0 1 = 0 1   1 1 = 0 0

 0 =B 0  1 = C, 0

 1 . 0

Then A2 = I 2 = I = A   1 0 1 2 B = 1 0 1   1 1 1 2 C = 0 0 0 so that A, B, and C are idempotent. (b) Suppose that A and n × n matrix that is both invertible and idempotent. Then A2 = A since A is idempotent. Multiply both sides of this equation on the right by A−1 , giving AAA−1 = AA−1 = I. Then AAA−1 = A, so we get A = I. Thus A is the identity matrix. 45. To show that A−1 = 2I − A, it suffices to show that A(2I − A) = I. But A2 − 2A + I = O means that I = 2A − A2 = A(2I − A). This equation says that A multiplied by 2I − A is the identity, which is what we wanted to prove. 46. Let A be an invertible symmetric matrix. Then AA−1 = I. Take the transpose of both sides of this equation: (AA−1 )T = I T ⇒ (A−1 )T AT = I ⇒ (A−1 )T A = I ⇒ (A−1 )T AA−1 = IA−1 = A−1 . Thus (A−1 )T AA−1 = A−1 . But also clearly (A−1 )T AA−1 = (A−1 )T I = (A−1 )T , so that (A−1 )T = A−1 . This says that A−1 is its own transpose, so that A−1 is symmetric. 47. Let A and B be square matrices and assume that AB is invertible. Then AB(AB)−1 = A(B(AB)−1 ) = I, so that A is invertible with A−1 = B(AB)−1 . But this means that   B (AB)−1 A = B(AB)−1 A = I, so that also B is invertible, with inverse (AB)−1 A.

158

CHAPTER 3. MATRICES

 48. As in Example 3.30, we adjoin the identity matrix to A then row-reduce A  1 5 1 4

1 0

  1 0 R2 −R1 1 − −−−−→ 0

5 −1

  1 0 −R2 1 − −−→ 0

1 −1

5 1

  I to I

 R1 −5R2  0 − −−−−→ 1 −1 0

1 1

0 1

 A−1 .  5 −1

−4 1

Thus the inverse of the matrix is  −4 1

 5 . −1

 49. As in Example 3.30, we adjoin the identity matrix to A then row-reduce A " −2 3

4 −1

1 0

# 1 " − 2 R1 0 −−− −→ 1 1 3

−2 −1

− 21 0

  I to I

" # # 1 −2 − 21 0 0 1 ··· R −3R1 3 5 R2 1 −−2−−−→ 1 −− 0 5 −→ 2 # " " R +2R2 1 1 −2 − 21 0 −−1−−−→ 3 1 0 1 0 10 5

 A−1 .

1 10 3 10

0 1

2 5 1 5

#

Thus the inverse of the matrix is "

1 10 3 10

2 5 1 5

# .

 50. As in Example 3.30, we adjoin the identity matrix to A then row-reduce A "

4 2

−2 0

1 0

" # 1 4 R1 0 −− −→ 1 2 1

− 21 0

1 4

0

# " 0 1 R −2R1 0 1 −−2−−−→

− 12 1

1 4 − 21

  I to I

# " R1 + 12 R2 0 −−− −−−→ 1 1 0

0 1

 A−1 . #

0

1 2

− 21

1

Thus the inverse of the matrix is "

#

0

1 2

− 12

1

.

 51. As in Example 3.30, we adjoin the identity matrix to A then row-reduce A "

1 −a

a 1

1 0

# " 0 1 aR1 +R2 1 −−− −−→ 0

a a2 + 1

# 1 0 1 R ··· +1 2 a 1 −a−2− −−→ " 1 a 1 0 1 a2a+1

0 1 a2 +1

Thus the inverse of the matrix is "

a2 +1

− a2a+1

a a2 +1

1 a2 +1

1

# .

#

  I to I

" R −aR2 1 0 −−1−−−→ 0 1

1 a2 +1 a a2 +1

 A−1 .

− a2a+1 1 a2 +1

# .

3.3. THE INVERSE OF A MATRIX

159

52. As in Example 3.30, we adjoin the identity matrix    1 −2 −1 2 3 0 1 0 0   R1 ↔R2  1 −2 −1 0 1 0 −−−−−→ 2 3 0    2 0 −1 0 0 1 2 0 −1  1 −2 −1 0  1 1 2 7 R2  1 −−−→ 0 7 7 0 4 1 0  1 −2 −1 0  2 1 0 1  7 7 −7R3 0 0 1 4 −−−→  R +2R2 1 0 0 2 −−1−−−→  0 1 0 −1  0 0 1 4 The inverse of the matrix is



2 −1 4

  to A then row-reduce A I to   1 −2 −1 0 1 0  R2 −2R1   1 0 0 7 2  −−−−−→ 0 R3 −2R1 0 0 1 −−−−−→ 0 4 1   1 −2 −1 1 0   2 2   − 7 0 1 0 7 R3 −4R2 −2 1 −−−−−→ 0 0 − 71  R +R  1 3 1 0 −−−−−→ 1 −2 0  R2 − 2 R3   7 − 72 0 1 0  −−−−−−→ 0 6 −7 0 0 1  3 −3  −2 2  6 −7

−1 2 1 2 3 −1

1 0 0

0 1 0

  1 0 R2 −3R1 0 −−−−−→ 0 R3 −2R1 1 − −−−−→ 0

−1 2 4 −4 5 −5

 0  1 −2 0  0 −2 1  0 1 0  1 − 27 0  7 − 47 − 67 1 0

1

 7 −7  −2 2  6 −7

4 −1 4

 3 −3 −2 2 . 6 −7

 53. As in Example 3.30, we adjoin the identity matrix to A then row-reduce A  1 3 2

 A−1 .

 I

1 −3 −2

  1 0 0 0 4R3 −5R2 1 − −−−−−→ 0

0 1 0

  I to I −1 2 4 −4 0 0

 A−1 . 1 −3 7

 0 0 1 0 . −5 4

Since the matrix on the left has a zero row, it cannot be reduced to the identity matrix, so that the given matrix is not invertible.     54. As in Example 3.30, we adjoin the identity matrix to A then row-reduce A I to I A−1 .       1 1 0 1 1 0 1 1 1 0 1 0 0 1 0 0 0 0  R2 −R1   −R2    1 0 1 0 1 0−       −−−−→ 0 −1 1 −1 1 0 −−−→ 0 1 −1 1 −1 0 0 1 1 0 0 1 0 1 1 0 0 1 0 1 1 0 0 1     1 1 0 1 0 0 1 1 0 1 0 0     0 1 −1 0 1 −1 1 −1 0 1 −1 0   1   R3 −R2 1 1 1 2 R3 1 1 −− − 2 −1 0 0 1 −−− −−→ 0 0 −→ 2 2 2     1 1 1 1 1 0 1 0 0 R1 −R2 1 0 0 − 2 2 2   −−−−−→  R2 +R3  1 1 1 0 1 0 − 12 12  − 21 −−− −−→ 0 1 0   2 2 2 1 1 1 1 0 0 1 − 12 0 0 1 − 12 2 2 2 2 The inverse of the matrix is



1  2  1  2 − 12

1 2 − 12 1 2

− 21

 

1 2 1 2

160

CHAPTER 3. MATRICES

 55. As in Example 3.30, we adjoin the identity matrix to A then row-reduce A  a  1 0

0 a 1

0 0 a

1 0 0

0 1 0

 a1 R1  0 −−−→ 1  a1 R2  1 0 −− −→  a 1 a1 R3 0 −−−→

0 1

0 0 1

1 a

1 a

0

0 0

1 a

  1 0  R2 − a1 R1  0  −−−−−−→ 0 1 0 a

0

0 1

0 0 1

1 a

1 a − a12

0

0

0

 1 0 0  0 1 0 1 R3 − a R2 −−−−−−→ 0 0 1 Thus the inverse of the matrix is



1 a  1 − a2 1 a3

0 1 a − a12

 A−1 .

  I to I

1 a

 0  0

1 a − a12 1 a3

1 a

0 1 a − a12

 0  0 . 1 a

 0  0 . 1 a

Note that these calculations assume a 6= 0. But if a = 0, then the original matrix is not invertible since the first row is zero.     56. As in Example 3.30, we adjoin the identity matrix to A then row-reduce A I to I A−1 . Note that if a = 0, then the first row is zero, so the matrix is clearly not invertible. So assume a 6= 0. Then     0 a 0 1 0 0 0 a 0 1 0 0     0 1 0 .  b 0 c 0 1 0 . b 0 c d R − R1 0 d 0 0 0 1 −−3−−a−−→ 0 0 0 − ad 0 1 Since the matrix on the left has a zero row, it cannot be reduced to the identity matrix, so the original matrix is not invertible.     57. As in Example 3.30, we adjoin the identity matrix to A then row-reduce A I to I A−1 .     1 0 0 0 −11 −2 5 −4 0 −1 1 0 1 0 0 0  2 4 1 −2 2 1 0 2 0 1 0 0  → 0 1 0 0 .    1 −1 3 0 0 0 1 0 0 0 1 0 5 1 −2 2 0 1 1 −1 0 0 0 1 0 0 0 1 9 2 −4 3 Thus the inverse of the matrix is

 −11  4   5 9

 −2 5 −4 1 −2 2 . 1 −2 2 2 −4 3

 58. As in Example 3.30, we adjoin the identity matrix to A then row-reduce A  √ 2 −4√2    0 0

√ 0 2 2 0 √ 2 0 0 0 1 0 0 3 1



1 0 0 0

0 1 0 0

Thus the inverse of the matrix is

0 0 1 0 √



2

2  √ 2 2   0 0

  0 1 0 0 0  0 0   1 0 0 → 0 0 1 0 0 1 0 0 0 1 0



2 2

0 0

 −2 0  −8 0 . 1 0 −3 1

2

2 √

2 2 0 0

  I to I 0



2 2

0 0

 A−1 .

 −2 0  −8 0 . 1 0 −3 1

3.3. THE INVERSE OF A MATRIX 59. As in Example 3.30,  1 0   0 a

161

we adjoin the identity matrix to A   0 0 0 1 0 0 0 1 0   1 0 0 0 1 0 0 0 1 → 0 0  0 1 0 0 0 1 0 b c d 0 0 0 1 0 0

Thus the inverse of the matrix is



1  0   0 − ad

0 1 0 − db

 then row-reduce A 0 0 0 0 1 0 0 1

1 0 0 − ad

0 1 0 − db

  I to I  0 0 0 0  . 1 0 − dc d1

 A−1 .

 0 0 . 0

0 0 1 − dc

1 d

   60. As in Example 3.30, we adjoin the identity matrix to A then row-reduce A I over Z2 to I   R1 +R2     0 1 1 0 − 1 0 1 1 −−−−→ 1 0 1 1 . R2 +R1 1 1 0 1 1 1 0 1 − −−−−→ 0 1 1 0 Thus the inverse of the matrix is

 1 1

 A−1 .

 1 . 0

   61. As in Example 3.30, we adjoin the identity matrix to A then row-reduce A I over Z5 to I  4R1      1 3 4 0 4 2 1 0 − −→ 1 3 4 0 . R2 +2R1 3 4 0 1 3 4 0 1 − −−−−→ 0 0 3 1

 A−1 .

Since the left-hand matrix has a zero row, it cannot be reduced to the identity matrix, so the original matrix is not invertible.     62. As in Example 3.30, we adjoin the identity matrix to A then row-reduce A I over Z3 to I A−1 .  2 1 0 1 1 2 0 2 1

1 0 0

0 1 0

 2R1  0 −−→ 1 R2 +R1 0 0− −−−−→ 0 1  1 0 R3 +R2 −−− −−→ 0  1 R2 +2R3  0 −−−−−→ 0

Thus the inverse of the matrix is

63. As in Example 3.30, we adjoin  1 1 3

  0 1 2 2R2  0 1 0 − −→ 0 2 1   1 2 2 0 2 0 0 0 1 1 1 2 2 0 2R3 0 2 2 2 1 − −→ 0 0  R1 +R2  2 0 2 0 0 −−−−−→ 1 0 1 0 1 1 1 0 1 1 1 2 0 2 2 2

0 2 1

 0 1 1

2 1 0

1 1 1

0 1 0

 4 5 0

6 3 6

2 2 0

0 2 0

0 1 1

2 2 1

0 2 1

 0 0 1  0 0 2

0 1 0

0 0 1

0 1 1

1 1 1

 1 1 . 2

 1 1 . 2

the identity matrix to A   5 0 1 0 0 1 2 4 0 1 0 → 0 6 1 0 0 1 0

Thus the inverse of the matrix is

0 1 1

 then row-reduce A  0 0 4 6 4 1 0 5 3 2 . 0 1 0 6 5

 4 2 . 5

  I over Z7 to I

 A−1 .

162

CHAPTER 3. MATRICES

64. Using block multiplication gives 

A O

B D

  −1 A O

   −A−1 BD−1 AA−1 + BO −AA−1 BD−1 + BD−1 = D−1 OA−1 + DO −OA−1 BD−1 + DD−1  −1  AA −BD−1 + BD−1 = O DD−1   I O = . O I

65. Using block multiplication gives  O C

B I

 −(BC)−1 (BC)−1 B C(BC)−1 I − C(BC)−1 B   −O(BC)−1 + BC(BC)−1 O(BC)−1 B + B(I − C(BC)−1 B) = −C(BC)−1 + IC(BC)−1 C(BC)−1 B + I(I − C(BC)−1 B)   BC(BC)−1 BI − BC(BC)−1 B = −C(BC)−1 + C(BC)−1 C(BC)−1 B + I − C(BC)−1 B   I O = . O I



66. Using block multiplication gives 

I C

B I

 (I − BC)−1 −(I − BC)−1 B −C(I − BC)−1 I + C(I − BC)−1 B   I(I − BC)−1 − BC(I − BC)−1 −I(I − BC)−1 B + B(I + C(I − BC)−1 B) = C(I − BC)−1 − I(C(I − BC)−1 ) −C(I − BC)−1 B + I(I + C(I − BC)−1 B)   (I − BC)(I − BC)−1 (BC − I)(I − BC)−1 B + B = C(I − BC)−1 − C(I − BC)−1 −C(I − BC)−1 B + I + C(I − BC)−1 B   I −IB + B = O I   I O = . O I



67. Using block multiplication gives  O C

B D

 −(BD−1 C)−1 (BD−1 C)−1 BD−1 D−1 C(BD−1 C)−1 D−1 − D−1 C(BD−1 C)−1 BD−1   BD−1 C(BD−1 C)−1 D−1 − D−1 CBD−1 C)−1 BD−1  C(BD−1 C)−1 BD−1 = −C(BD−1 C)−1 + C(BD−1 C)−1 −1 −1 −1 −1 −1 +D(D − D C(BD C) BD )   −1 −1 −1 I D − D CC DB −1 BD−1 = O C(BD−1 C)−1 BD−1 + I − DD−1 C(BD−1 C)−1 BD−1   I D−1 − D−1 = O C(BD−1 C)−1 BD−1 + I − C(BD−1 C)−1 BD−1   I O = . O I



3.3. THE INVERSE OF A MATRIX

163

68. Using block multiplication gives      A B P Q = AP + BR AQ + BS CP + DR DQ + DS C D R S   AP + B(−D−1 CP ) A(−P BD−1 + B(D−1 + D−1 CP BD−1 ) = CP + D(−D−1 CP ) C(−P BD−1 ) + D(D−1 + D−1 CP BD−1 )   (A − BD−1 C)P −(A − BD−1 C)P BD−1 + BD−1 = CP − CP −CP BD−1 + I + CP BD−1  −1  P P −P −1 P BD−1 + P BD−1 = O I     I −BD−1 + BD−1 I O = = . O I O I 69. Partition the matrix

 1 0 A= 2 1

0 1 3 2

 0  0 = I 0 C 1

0 0 1 0

 B , where B = O. I

Then using Exercise 66, (I − BC)−1 = (I − OC)−1 = I −1 = I, so that      (I − BC)−1 −(I − BC)−1 B I −IB I −1 A = = = −C(I − BC)−1 I + C(I − BC)−1 B −CI I + CIB −C Substituting back in for C gives 

A−1

70. Partition the matrix

1  0  = −2 −1

 0 0 0 1 0 0 . −3 1 0 −2 0 1

√   √0 2 2 0 2 0 0 = A O 0 1 0 0 3 1

 √ 2 √ −4 2 M =  0 0

B D



and use Exercise 64. Then #  √ −1 " √2 2 0 0 2 √ √ √ A = = √ −4 2 2 2 2 22  −1   1 0 1 0 D−1 = = 3 1 −3 1 "√ # √    2 0 1 0 −2 2 2 0 2 √ −A−1 BD = √ = 2 −8 0 0 −3 1 2 2 −1

2

Then



 M −1

71. Partition the matrix

" A−1 = O

−A−1 BD−1 D−1

 0 0 A= 0 1

#

2

2  √ 2 2 =  0 0

 0 1 1  0 1 0 = O −1 1 0 C 1 0 1

0



2 2

0 0

B I



 0 . 0

 −2 0  −8 0 . 1 0 −3 1

 O . I

164

CHAPTER 3. MATRICES and use Exercise 65. Then   −1  1 1 0 −1 1 −(BC)−1 = − =− 1 0 1 1 0      1 0 1 1 1 1 (BC −1 )B = = 0 −1 1 0 −1 0      0 −1 1 0 0 1 C(BC)−1 = = 1 1 0 −1 1 −1      0 1 1 1 0 0 I − C(BC)−1 B = I − = . 1 −1 1 0 0 0 Thus A−1 =

 −1  0 (BC)−1 B =  0 I − C(BC)−1 B 1

−(BC)−1 C(BC)−1





72. Partition the matrix



0 A= 1 −1 where  B= 1

 1 ,

1 3 5

  1 O  1 = C 2 

 1 C= , −1

  0 −1 = −1 0

 0 1

 0 1 1 1 −1 0 . 1 0 0 −1 0 0

 B , D

 3 D= 5

 1 , 2

and use Exercise 67. Then −(BD−1 C)−1

(BD−1 C)−1 BD−1 D−1 C(BD−1 C)−1 D−1 − D−1 C(BD−1 C)−1 BD−1

" # " #−1 i 3 1 −1 1 h i−1 h i  = − −5 = 51 1 5 2 −1 " # i 3 1 −1 h i 1h =− 1 1 = 35 − 52 5 5 2 " #−1 " #   " 3# −5 3 1 1 1 = = − 8 5 5 2 −1 5 " #−1 " # " # " # i 3 1 −1 1 1 3 1 − 53 h 5 5 . = − = 1 1 8 5 2 5 2 − 15 − 51 5  h = − 1

Thus " A−1 =

3.4

−1

−1

−(BD C) D C(BD−1 C)−1 −1

−1

D

−1

−1

−1

#

(BD C) BD = − D−1 C(BD−1 C)−1 BD−1



1 5  3 − 5 8 5

3 5 1 5 − 51

− 25



1 . 5 1 −5

The LU Factorization

1. As in the method outlined after Theorem 3.15, we want to solve Ax = (LU )x = b. Let y = U x and first solve Ly = b for y by forward substitution. Then solve U x = y by backward substitution. We have        −2 1 1 0 −2 1 5 A= = = LU, and b = . 2 5 −1 1 0 6 1 Then 

1 Ly = b ⇒ −1

      y1 = 5 0 y1 5 y1 = 5 5 = ⇒ ⇒ ⇒y= . 1 y2 1 −y1 + y2 = 1 6 y2 = y1 + 1

3.4. THE LU FACTORIZATION

165

Likewise, Ux = y ⇒

        x2 = 1 1 x1 5 −2x1 + x2 = 5 x −2 = ⇒ ⇒ ⇒ 1 = . 6 x2 6 6x2 = 6 x2 1 −2x1 = 5 − x2

 −2 0

2. As in the method outlined after Theorem 3.15, we want to solve Ax = (LU )x = b. Let y = U x and first solve Ly = b for y by forward substitution. Then solve U x = y by backward substitution. We have " # " #" # " # 4 −2 1 0 4 −2 0 A= = 1 = LU, and b = . 2 3 1 0 4 8 2 Then

 Ly = b ⇒

1 1 2

      0 y1 y = 0 0 0 ⇒y= . = ⇒1 1 8 8 1 y2 y + y = 8 1 2 2

Likewise, 

4 Ux = y ⇒ 0

        x2 = 2 −2 x1 0 4x1 − 2x2 = 0 x 1 = ⇒ ⇒ ⇒ 1 = . 4 x2 8 4x2 = 8 x 2 4x1 = 2x2 2

3. As in the method outlined after Theorem 3.15, we want to solve Ax = (LU )x = b. Let y = U x and first solve Ly = b for y by forward substitution. Then solve U x = y by backward substitution. We have        2 1 −2 −3 1 0 0 2 1 −2        A = −2 1 0 0 4 −6 = LU, and b =  1 . 3 −4 = −1 5 0 2 − 4 1 0 0 − 72 4 −3 0 Then 

1 Ly = b ⇒ −1 2

0 1 − 45

    y1 0 −3 y1 0 y2  =  1 ⇒ −y1 + y3 0 2y1 − 1

= −3 y2 = 1 5 y + y = 0 2 3 4 y1 = −3

    −3 y1 ⇒ y2 = y1 + 1 ⇒ y2  = −2 . 7 5 y3 2 y3 = −2y1 + y2 4 Likewise,  2  U x = y ⇒ 0 0

1 4 0

    −2 x1 −3 2x1 + x2 − 2x3 = −3     −6 x2  = −2 ⇒ 4x2 − 6x3 = −2 7 7 −2 x3 − 27 x3 = 27 2     x1 − 23     ⇒ 4x2 = 6x3 − 2 ⇒ x2  =  −2 . x3 −1 2x1 = −x2 + 2x3 − 3 x3 = −1

4. As in the method outlined after Theorem 3.15, we want to solve Ax = (LU )x = b. Let y = U x and first solve Ly = b for y by forward substitution. Then solve U x = y by backward substitution. We have        2 −4 0 1 0 0 2 −4 0 2     3    A =  3 −1 4 =  2 1 0 0 5 4 = LU, and b =  0 . 1 −1 2 2 −2 0 1 0 0 2 −5

166

CHAPTER 3. MATRICES Then 

1

 Ly = b ⇒  32 − 12

    y1 = 2 2 0 0 y1     1 0 y2  =  0 ⇒ 32 y1 + y2 = 0 1 −5 y3 0 1 − 2 y1 + y3 = −5 y1 = 2 3 ⇒ y2 = − 2 y1 1 y3 = y1 − 5 2

    2 y1     ⇒ y2  = −3 . −4 y3

Likewise,  2 U x = y ⇒ 0 0

    −4 0 x1 2 2x1 − 4x2 = 2 5 4 x2  = −3 ⇒ 5x2 + 4x3 = −3 0 2 x3 −4 2x3 = −4     x1 3     ⇒ 5x2 = −4x3 − 3 ⇒ x2  =  1 . x3 −2 2x1 = 4x2 + 2 x3 = −2

5. As in the method outlined after Theorem 3.15, we want to solve Ax = (LU )x = b. Let y = U x and first solve Ly = b for y by forward substitution. Then solve U x = y by backward substitution. We have        1 0 1 0 0 0 2 −1 0 2 −1 0 0 2  0 −1 5 −3 6 −4 5 −3 3 1 0 0  = LU, and b =   .  = A= 2 8 −4 1 0 1 0 0 1 0 0 0 4 1 0 0 0 4 2 −1 5 1 4 −1 0 7 Then  1 3 Ly = b ⇒  4 2

    y1 0 0 0 y1 1 y2  2 3y 1 0 0    =   ⇒ 1 + y2 4y1 + y3 0 1 0 y3  2 2y1 − y2 + 5y3 + y4 −1 5 1 y4 1

= = = =

1 2 2 1

y1 = 1

    y1 1 y2  −1 y2 = −3y1 + 2    ⇒ ⇒ y3  = −2 . y3 = −4y1 + 2 8 y4 y4 = −2y1 + y2 − 5y3 + 1 Likewise,  2 0 Ux = y ⇒  0 0

    −1 0 0 x1 1 2x1 − x2 x2  −1 −1 5 −3 − x2 + 5x3 − 3x4   =   ⇒ 0 1 0 x3  −2 x3 0 0 4 x4 8 4x4

= 1 = −1 = −2 = 8

x4 = 2

    x1 −7 x2  −15 x3 = −2    ⇒ ⇒ x3  =  −2 . x2 = 5x3 − 3x4 + 1 x4 2 2x1 = x2 + 1 6. As in the method outlined after Theorem 3.15, we want to solve Ax = (LU )x = b. Let y = U x and first solve Ly = b for y by forward substitution. Then solve U x = y by backward substitution. We

3.4. THE LU FACTORIZATION

167

have 

1 A = −2 3

  1 4 3 0 −2 −5 −1 2 =   3 6 −3 −4 −5

 0 0 0 1 4  1 0 0  0 3  −2 1 0 0 0 0 0 4 −2 1

   3 0 1 −3 5 2  = LU, and b =   . −1 −2 0 0 1 0

Then     1 y1 0 0 0 y1 y2  −3 −2y 1 0 0 1 + y2   =   ⇒ 3y1 − 2y2 + y3 −2 1 0 y3  −1 0 −5y1 + 4y2 − 2y3 + y4 4 −2 1 y4



1 −2 Ly = b ⇒   3 −5

= 1 = −3 = −1 = 0

y1 = 1

    1 y1 y2  −1 y2 = 2y1 − 3    ⇒ ⇒ y3  = −6 . y3 = −3y1 + 2y2 − 1 −3 y4 y4 = 5y1 − 4y2 + 2y3 Likewise,  1 0 Ux = y ⇒  0 0

4 3 0 0

    x1 + 4x2 + 3x3 1 3 0 x1 x2  −1 3x2 + 5x3 + 2x4 5 2   =   ⇒ − 2x3 −2 0 x3  −6 0 1 x4 x4 −3

= 1 = −1 = −6 = −3

x4 = −3

   16  x1 3 x   − 10  x3 = 3  2  3  ⇒ ⇒ = . x3   3x2 = −5x3 − 2x4 − 1 3 x4 −3 x1 = −4x2 − 3x3 + 1 7. We use the multiplier method from Example 3.35:    1 2 1 A= R3 −(−3R1 ) −3 −1 − −−−−−−−→ 0

 2 = U. 5

The multiplier was −3, so 

1 L= −3

 0 . 1

8. We use the multiplier method from Example 3.35:    2 −4 2 3 A= R3 − 2 R1 3 1 − −−−−−→ 0 The multiplier was 23 , so

 L=

1 3 2

 −4 = U. 7

 0 . 1

9. We use the multiplier method from Example 3.35:      1 2 3 1 2 3 1 R −4R1 0 −3 −6 0 A = 4 5 6 −−2−−−→ R3 −3R2 R3 −8R1 8 7 9 − −−−−→ 0 −9 −15 −−−−−→ 0 The multipliers were `21 = 4, `31 = 8, and `32 = 3,  1 L = 4 8

so we get  0 0 1 0 3 1

 2 3 −3 −6 = U. 0 3

168

CHAPTER 3. MATRICES

10. We use the multiplier  2 A = 4 3

method from Example 3.35:     2 2 −1 2 −1 R −2R 2 2 1 0 6 0 4 −−−−−→ 0 −4 R3 −(− 41 R2 ) R3 − 23 R1 4 4 − 1 11 2 −−−−−−−−→ 0 −−−−−→ 0

The multipliers were `21 = 2, `31 = 23 , and `32 = − 14 , so  1 0 1 L = 2 1 3 − 2 4

 2 −1 −4 6 = U. 0 7

we get  0 0 1

11. We use the multiplier method from Example 3.35: 

1  2 A=  0 −1

  1 2 3 −1 R2 −2R1 −−−−−→ 0 6 3 0  R3 −0R1  6 −6 7 − −−−−→ 0 −2 −9 0 R4 −(−1R1 ) 0 −−−−−−−−→

2 2 6 0

  1 2 3 −1 0 2 −3 2  R −3R2  0 0 −6 7 −−3−−−→ R −0R 4 2 −6 −1 −−−−−→ 0 0

 3 −1 −3 2  3 1 −6 −1

 1 2 0 2  0 0 R4 −(−2R3 ) −−−−−−−−→ 0 0

 3 −1 −3 2  = U. 3 1 0 1

The multipliers were `21 = 2, `31 = 0, `41 = −1, `32 = 3, `42 = 0, and `43 = −2, so   1 0 0 0  2 1 0 0 . L=  0 3 1 0 −1 0 −2 1 12. We use the multiplier method from Example 3.35: 

2 −2  A= 4 6

2 4 4 9

  2 1 R −(−R ) 2 2 1 −−−→  −1 2  −−R−− 0 3 −2R1  7 3 −−−−−→ 0 R −3R1 0 5 8 −−4−−−→

2 6 0 3

  2 2 1  1 3  R3 −0R2 0 3 1 −−−−−→ 0 R4 − 21 R2 −1 5 − −−−−−→ 0

2 6 0 0

2 1 3 − 23

 1 3  1

7 2

 2 0  0 1 R4 −(− 2 R3 ) −−−−−−−−→ 0

2 6 0 0

2 1 3 0

 1 3  = U. 1 4

The multipliers were `21 = −1, `31 = 2, `41 = 3, `32 = 0, `42 = 21 , and `43 = − 12 , so   1 0 0 0 −1 1 0 0 . L=  2 0 1 0 3 21 − 12 1 13. We can use the multiplier method as before to make the matrix “upper triangular”, i.e., in row echelon form. In this case, however, the matrix is already in row echelon form, so it is equal to U , and L = I3 :      1 0 1 −2 1 0 0 1 0 1 −2 0 3 3 1 = 0 1 0 0 3 3 1 . 0 0 0 5 0 0 1 0 0 0 5

3.4. THE LU FACTORIZATION

169

14. We can use the multiplier method as before to make the matrix “upper triangular”, i.e., in row echelon form. 

1 −2   1 0

  2 0 −1 1 1 0 −3 3 6 0  R3 − 31 R2  −1 3 6 1 −−−−− −→ 0 3 −3 −6 0 R4 −(−1R2 ) 0 −−−−−−−−→

  2 0 −1 1 R −(−2R ) 1 2 1 −−−−−→  −7 3 8 −2 0  −−− R −1R 1 0 1 3 5 2 −−3−−−→ R4 −0R1 0 3 −3 −6 0 −−−−−→

 2 0 −1 1 −3 3 6 0  = U. 0 2 4 1 0 0 0 0

The multipliers were `21 = −2, `31 = 1, `41 = 0, `32 = 31 , and `42 = −1, so 

1 −2 L=  1 0

0 1 1 3

−1

 0 0 0 0 . 1 0 0 1

15. In Exercise 1, we have  −2 A= 2

  1 1 = LU = 5 −1

0 1

 −2 0

 1 . 6

Then " 1 1 = 1 1

−1

L

# " 0 1 = 1 1

# 0 , 1

U

−1

" 1 6 =− 12 0

# " −1 − 21 = −2 0

1 12 1 6

# ,

so that A−1 = (LU )−1 = U −1 L−1

" − 12 = 0

1 12 1 6

#" 1 1

# " 5 0 − 12 = 1 1 6

1 12 1 6

# .

16. In Exercise 4, we have 

2  A= 3 −1

  1 −4 0   −1 4 = LU =  32 2 2 − 21

To compute L−1 , we adjoin the identity matrix    1 0 0 1 0 0 1 0  3   → 1 0 0 1 0 0  2   1 1 −2 0 1 0 0 1 0 0 To compute U −1 , we  2 −4  5 0 0 0

 0 0 2  1 0 0 0 1 0

 −4 0  5 4 . 0 2

to L and row-reduce:   0 1 0 0 1   0 − 32 1 0 ⇒ L−1 = − 32 1 1 1 0 1 2 2

adjoin the identity matrix to U and row-reduce:     1 0 1 0 0 1 0 0 12 25 − 54 2     −1 4 0 1 0 → 0 1 0 0 15 − 52  ⇒ U =  0 1 2 0 0 1 0 0 1 0 0 0 2

 0 0  1 0 . 0 1

2 5 1 5

 − 54  − 52  .

0

1 2

2 5 1 5

 − 45  − 25  .

Then 

1 2

 A−1 = (LU )−1 = U −1 L−1 =  0 0

2 5 1 5

0

 − 45 1 2  3 − 5  − 2 1 2

1 2

  0 0 − 12   1 1 0 = − 2 1 0 1 4

0

1 2

17. Using the LU decomposition from Exercise 1, we compute A−1 one column at a time using the method outlined in the text. That is, we first solve Lyi = ei , and then solve U xi = yi . Column 1 of A−1 is

170

CHAPTER 3. MATRICES found as follows: "

#" # " # 1 0 y11 1 y11 = Ly1 = e1 ⇒ = ⇒ −1 1 y21 0 −y11 + y21 = " #" # " # −2 1 x11 1 −2x11 + x21 U x1 = y1 ⇒ = ⇒ 0 6 x21 1 6x21 Use the same procedure to find " 1 Ly2 = e2 ⇒ −1 " −2 U x2 = y2 ⇒ 0

" # 1 1 ⇒ y1 = 1 0 " # h i 5 = 1 − 12 ⇒ x = . 1 1 = 1 6

the second column: #" # " # 0 y12 0 y12 = = ⇒ 1 y22 1 −y12 + y22 = #" # " # 1 x12 0 −2x12 + x22 = ⇒ 6 x22 1 6x21

So A−1

h = x1

i

x2 =

"

5 − 12 1 6

1 12 1 6

" # 0 0 ⇒ y2 = 1 1 h i = 0 ⇒ x = 1 = 1

"

1 12 1 6

# .

# .

This matches what we found in Exercise 15. 18. Using the LU decomposition from Exercise 4, we compute A−1 one column at a time using the method outlined in the text. That is, we first solve Lyi = ei , and then solve U xi = yi . Column 1 of A−1 is found as follows:          1 0 0 y11 1 y11 = 1 y11 1  3        3 3 Ly1 = e1 =  2 1 0 y21  = 0 ⇒ 2 y11 + y21 = 0 ⇒ y21  = − 2  1 − 12 0 1 y31 0 − 21 y11 + y31 = 0 y31 2          2 −4 0 x11 1 x11 − 21 2x11 − 4x21 = 1      3     U x1 = y1 = 0 5 4 x21  = − 2  ⇒ 5x21 + 4x31 = − 23 ⇒ x21  = − 12  1 1 0 0 2 x31 x31 2x31 = 12 2 4 Repeat this procedure to find  1 0  3 Ly2 = e2 =  2 1 − 12 0  2 −4  U x2 = y2 = 0 5 0 0

the second column:         y12 = 0 0 y12 0 y12 0         3 0 y22  = 1 ⇒ 2 y12 + y22 = 1 ⇒ y22  = 1 0 0 1 y32 − 21 y12 + y32 = 0 y32         2 0 x12 0 2x12 − 4x22 = 0 x12 5       1 4 x22  = 1 ⇒ 5x22 + 4x32 = 1 ⇒ x22  =  5  2 x32 0 0 2x32 = 0 x32

Do this again to find the third column:          1 0 0 y13 0 y13 = 0 y13 0          Ly3 = e3 =  32 1 0 y23  = 0 ⇒ 32 y13 + y23 = 0 ⇒ y23  = 0 − 12 0 1 y33 1 − 12 y13 + y33 = 1 y33 1          2 −4 0 x13 0 2x13 − 4x23 = 0 x13 − 54         2 U x3 = y3 = 0 5 4 x23  = 0 ⇒ 5x23 + 4x33 = 0 ⇒ x23  = − 5  1 0 0 2 x33 1 2x33 = 1 x33 2   − 12 25 − 54 h i  1 1  −1 Thus A = x1 x2 x3 = − 2 5 − 52 ; this matches what we found in Exercise 16. 1 1 0 4 2

3.4. THE LU FACTORIZATION

171

19. This matrix results from the identity matrix by first interchanging rows 1 and 2, and then interchanging the (new) row 1 with row 3, so it is      0 0 1 0 1 0 0 0 1 E13 E12 = 0 1 0 1 0 0 = 1 0 0 . 1 0 0 0 0 1 0 1 0 Other products of elementary matrices are possible. 20. This matrix results from the identity matrix by first interchanging rows 1 and 4, and then interchanging rows 2 and 3, so it is      1 0 0 0 0 0 0 1 0 0 0 1 0 1 0 0 0 0 1 0 0 0 1 0     E14 E23 =  0 0 1 0 0 1 0 0 = 0 1 0 0 1 0 0 0 0 0 0 1 1 0 0 0 Other products of elementary matrices are possible; for example, also P = E13 E23 E34 . 21. To get from  0 0  1 0

the given matrix to the   1 1 0 0 0 0 0 1 R ↔R  1 3  0 0 0 −−−−−→ 0 0 0 1 0

identity matrix, perform the   1 0 0 0 0 0 0 1 0 0 0 1 R ↔R  2 3  1 0 0 −−−−−→ 0 0 0 0 0 1 0 1 0

following exchanges:   1 0 0 0 0 1 0 0 R ↔R  3 4  1 −−−−−→ 0 0 1 0 0 0 0

 0 0  0 1

Thus we can construct the given matrix P by multiplying these elementary matrices together in the reverse order: P = E13 E23 E34 . Other combinations are possible. 22. To get from the given matrix to the identity matrix, perform the following exchanges:  0 1  0  0 0

0 0 0 0 1

1 0 0 0 0

0 0 1 0 0

  1 0 0 0  R ↔R  1 2  0  −−−−−→ 0 0  1 0 0

0 0 0 0 1

0 1 0 0 0

0 0 1 0 0

  1 0 0 0  R ↔R  2 5  0  −−−−−→ 0 0  1 0 0  1 0  0  0 0

0 1 0 0 0

0 0 0 0 1

0 0 1 0 0

0 1 0 0 0

0 0 1 0 0

0 0 0 0 1

 0 0  R ↔R 3 5 0  −−−−−→  1 0   0 1 0 0  R ↔R  4 5  0  −−−−−→ 0  0 1 0 0

0 1 0 0 0

0 0 1 0 0

0 0 0 1 0

 0 0  0  0 1

Thus we can construct the given matrix P by multiplying these elementary matrices together in the reverse order: P = E12 E25 E35 E45 . Other combinations are possible. 23. To find a P T LU factorization of A, we begin by permuting the rows of A to get it will result in a small number of operations to reduce to row echelon form. Since       0 1 4 0 1 0 0 1 4 −1 2 A = −1 2 1 , we have P A = 1 0 0 −1 2 1 =  0 1 1 3 3 0 0 1 1 3 3 1 3

into a form that  1 4 . 3

172

CHAPTER 3. MATRICES Now follow the multiplier method from Example 3.35 to find the LU factorization of this matrix:  −1 PA =  0 1

2 1 3

  1 −1  0 4 R3 −(−R1 ) 3 − 0 −−−−−−→

  1 −1  0 4 R3 −5R2 4 − 0 −−−−→

2 1 5

2 1 0

 1 4 = U. −16

Then `21 = 0, `31 = −1, and `32 = 5, so that 

1 L= 0 −1 Thus

 0 A = P T LU = 1 0

1 0 0

 0 1 0  0 1 −1

 0 0 1

0 1 5 0 1 5

 0 −1 0  0 1 0

2 1 0

 1 4 . −16

24. To find a P T LU factorization of A, we begin by permuting the rows of A to get it will result in a small number of operations to reduce to row echelon form. Since       1 0 0 1 2 0 0 0 1 0 0 1 2   0  0 0 1 0 −1 1 −1 1 3 2 3 2 =   , we have P A =  A= 1 0 0 0  0 2  0 2 1 1  0 1 1 −1 1 1 −1 0 0 1 0 0 1 1 −1 0

into a form that 1 2 0 1

 −1 0 1 1 . 1 2 3 2

Now follow the multiplier method from Example 3.35 to find the LU factorization of this matrix: 

1  0 PA =   0 −1

1 2 0 1

  1 −1 0 0 1 1   1 2 R4 −(−R1 ) 0 3 2 −−−−−−−→ 0

1 2 0 2

  1 −1 0 0 1 1   1 2 R4 −R2 0 2 2 −−−−−→ 0

Then `41 = −1, `42 = 1, `42 = 1, and the other `ij  1  0 L=  0 −1 Thus

 0 0 T A = P LU =  0 1

0 0 1 0

1 0 0 0

 −1 0 1 1  1 2 1 1  1 1 0 2  0 0 R4 −R3 −−−−−→ 0 0

1 2 0 0

 −1 0 1 1  = U. 1 2 0 −1

are zero, so that  0 0 0 1 0 0 . 0 1 0 1 1 1

 0 1  0 1  0  0 0 −1

0 1 0 1

0 0 1 1

 0 1 0 0  0 0 1 0

1 2 0 0

 −1 0 1 1 . 1 2 0 −1

25. To find a P T LU factorization of A, we begin by permuting the rows of A to get it into a form that will result in a small number of operations to reduce to row echelon form. Since        0 −1 1 3 0 1 0 0 0 −1 1 3 −1 1 1 2 −1    1 1 2 1 1 2 1 3  , we have P A = 1 0 0 0 −1  =  0 −1 . A=  0       1 −1 1 0 0 0 1 0 1 −1 1 0 0 1 1 0 0 1 1 0 0 1 0 0 0 1 1 0 1 −1 1

3.4. THE LU FACTORIZATION

173

Now follow the multiplier method from Example 3.35 to find the LU    −1 1 −1 1 1 2  0 −1  0 −1 1 3    PA =  0 0 0 1 1 R4 −(−R2 )  0 0 0 0 1 −1 1 −−−−−−−→

factorization of this matrix:  1 2 1 3  = U. 1 1 0 4

Then `42 = −1 and the other `ij are zero, so that  1 0 L= 0 0

 0 0 0 1 0 0 . 0 1 0 −1 0 1

Thus  0  1 A = P T LU =  0 0

1 0 0 0

0 0 0 1

 1 0 0 0  1 0 0 0

 0 0 0 −1  1 0 0  0 0 1 0  0 0 −1 0 1

 1 1 2 −1 1 3 . 0 1 1 0 0 4

26. From the text, a permutation matrix consists of the rows of the identity matrix written in some order. So each row and each column of a permutation matrix has exactly one 1 in it, and the remaining entries in that row or column are zero. Thus there are n choices for the column containing the 1 in the first row; once that is chosen, there are n − 1 choices for the column to contain the 1 in the second row. Next, there are n − 2 choices in the third row. Continuing, we decrease the number of choices by one each time, until there is just one choice in the last row. Multiplying all these together gives the total number of choices, i.e., the number of n×n permutation matrices, which is n·(n−1)·(n−2) · · · 2·1 = n!. 27. To solve Ax = P T LU x = b, multiply both sides by P to get LU x = P b. Now compute P b, which is a permutation of the rows of b, and solve the resulting equation by the method outlined after Theorem 3.15:          1 1 0 1 0 0 1 0 0 1 0 P T = 1 0 0 ⇒ P = 1 0 0 ⇒ P b = 1 0 0 1 = 1 . 0 0 1 0 0 1 0 0 1 5 5 Then 

1  Ly = P b ⇒  0

0 1 − 12

1 2

    0 y1 1 y1     0 y2  = 1 ⇒ 1 1 y3 5 2 y1 −

= 1 y2 = 1 1 2 y2 + y3 = 5 y1 = 1

    y1 1     ⇒ y2 = 1 ⇒ y2  = 1 . 1 1 y3 5 y3 = 5 − y1 + y2 2 2 Likewise,  2 U x = y ⇒ 0 0

3 1 0

    2 2x1 + 3x2 + 2x3 = 1 x1 1 −1 x2  = 1 ⇒ x2 − x3 = 1 x3 5 − 25 − 52 x3 = 5     x1 4     ⇒ x2 = x3 + 1 ⇒ x = x2  = −1 . x3 −2 2x1 = 1 − 3x2 − 2x3 x3 = −2

174

CHAPTER 3. MATRICES

28. To solve Ax = P T LU x = b, multiply both sides by P to get LU x = P b. Now compute P b, which is a permutation of the rows of b, and solve the resulting equation by the method outlined after Theorem 3.15:          0 0 1 0 1 0 0 1 0 16 −4 P T = 1 0 0 ⇒ P = 0 0 1 ⇒ P b = 0 0 1 −4 =  4 . 0 1 0 1 0 0 1 0 0 4 16 Then  1 Ly = P b ⇒ 1 2

    0 0 y1 −4 y1 = −4 1 0 y2  =  4 ⇒ y1 + y2 = 4 −1 1 y3 16 2y1 − y2 + y3 = 16     −4 y1     ⇒ y2  =  8 . ⇒ y2 = 4 − y1 32 y3 y3 = 16 − 2y1 + y2 y1 = −4

Likewise,  4 U x = y ⇒ 0 0

    4x1 + x2 + 2x3 = −4 −4 1 2 x1 − x2 + x3 = 8 −1 1 x2  =  8 ⇒ 2x3 = 32 32 0 2 x3     x1 −11     ⇒ x2 = x3 − 8 ⇒ x = x2  =  8 . x3 16 4x1 = −x2 − 2x3 − 4 x3 = 16

29. Let A = (aij ) and B = (bij ) be unit lower triangular n × n matrices, so that   0 i < j aij = bij = 1 i = j   ∗ i>j (where of course the ∗’s maybe different for A and B). Then if C = (cij ) = AB, we have cij = ai1 b1j + ai2 b2j + · · · + ai(j−1) b(j−1)j + aij bjj + · · · + ain bnj . If i < j, then all the terms up through ai(j−1) b(j−1)j since the b’s are zero, and all terms from the next term to the end are zero since the a’s are zero. If i = j, then all the terms up through ai(j−1) b(j−1)j since the b’s are zero, the aij bjj = ajj bjj term is 1, and all terms from the next term to the end are zero since the a’s are zero. Finally, if i > j, multiple terms in the sum may be nonzero. Thus we get   0 cij = 1   ∗ so that C is lower unit triangular as well.   30. To invert L, we row-reduce L I :  1 0 ··· ∗ 1 · · ·   .. .. . . . . . ∗ ∗ ···

ij

0 0 .. .

1 0 .. .

0 1 .. .

··· ··· .. .

1

0

0

···

 0 0  ..  . . 1

3.4. THE LU FACTORIZATION

175

To do this, we first add multiples of the first row to each of the lower rows to make the first column zero except for the 1 in row 1. Note that the result of this is a unit lower triangular matrix on the right (since we added multiples of the leading 1 in the first row to the lower rows). Now do the same with row 2: add multiples of that row to each of the lower rows, clearing column 2 on the left except for the 1 in row 2. Again, for the same reason, the result on the right remains unit lower triangular. Continue this process for each row. The result is the identity matrix on the left, and a unit lower triangular matrix on the right. Thus L is invertible and its inverse is unit lower triangular. 31. We start by finding D. First, suppose D is a diagonal 2 × 2 matrix and A is any 2 × 2 matrix with 1’s on the diagonal. Then   a 0 DA = A 0 b has the effect of multiplying the first row of A by a and the second row of A by b, so results in a matrix with a and b in the diagonal entries. In our example, we want to write   −2 1 U= 0 6 as unit upper triangular, say U = DU1 . So we must turn the −2 and 6 into 1’s, and thus    −2 0 1 − 21 = DU1 . U= 0 6 0 1 Thus

 A = LU =

1 −1

 0 −2 1 0

  0 1 = 6 −1

 0 −2 1 0

 0 1 6 0

 − 21 = LDU1 . 1

32. We start by finding D. First, suppose D is a diagonal 3 × 3 matrix and A is any 3 × 3 matrix with 1’s on the diagonal. Then   a 0 0 DA = 0 b 0 A 0 0 c has the effect of multiplying the first row of A by a, the second row of A by b, and the third row of A by c, so the result is a matrix with a, b, and c on the diagonal. In our example, we want to write   2 −4 0 5 4 U = 0 0 0 2 as unit upper triangular, say U = DU1 . So we must turn the 2, 5, and 2 into 1’s, and thus    1 −2 0 2 0 0 1 54  = DU1 . U = 0 5 0 0 0 0 2 0 0 1 Thus 

1

 A = LU =  23 − 12

 0 0 2  1 0 0 0 1 0

  −4 0 1   3 5 4 =  2 0 2 − 21

 0 0 2  1 0 0 0 1 0

0 5 0

 0 1  0 0 2 0

−2 1 0

 0 4 = LDU1 . 5 1

33. We must show that if A = LDU is both symmetric and invertible, then U = LT . Since A is symmetric, L(DU ) = LDU = A = AT = (LDU )T = U T DT LT = U T (DT LT ) = U T (DLT ) since the diagonal matrix D is symmetric. Since L is lower triangular, it follows that LT is upper triangular. Since any diagonal matrix is also upper triangular, Exercise 29 in Section 3.2 tells us that

176

CHAPTER 3. MATRICES DLT and DU are both upper triangular. Finally, since U is upper triangular, it follows that U T is lower triangular. Thus we have A = U T (DLT ) = L(DU ), giving two LU -decompositions of A. But since A is invertible, Theorem 3.16 tells us that the LU -decomposition is unique, so that U T = L. Taking transposes gives U = LT .

34. Suppose A = L1 D1 LT1 = L2 D2 LT2 where L1 and L2 are unit lower triangular, D1 and D2 are diagonal, and A is symmetric and invertible. We want to show that L1 = L2 and D1 = D2 . First, since L2 (D2 LT2 ) = A = L1 (D1 LT1 ) gives two LU -factorizations of an invertible matrix A, it follows from Theorem 3.16 that these two factorizations are the same, so that L1 = L2 and D2 LT2 = D1 LT1 . Since L1 = L2 , this equation becomes D2 LT2 = D1 LT2 . From Exercise 47 in Section 3.3, since A = L2 (D2 LT2 ) and A is invertible, T we see that both L2 and D2 LT2 are invertible; thus LT2 is invertible as well ((LT2 )−1 = (L−1 2 ) ). So T T T −1 multiply D2 L2 = D1 L2 on the right by (L2 ) , giving D2 = D1 , and we are done.

3.5

Subspaces, Basis, Dimension, and Rank

1. Geometrically, x = 0 is a line through the origin in R2 , so you might conclude from Example 3.37 that x this is a subspace. And in fact, the set of vectors with x = 0 is y     0 0 =y . y 1   0 Since y is arbitrary, S = span , and this is a subspace of R2 by Theorem 3.19. 1       1 1 −1 2. This is not a subspace. For example, is in S, but − = is not in S. This violates property 0 0 0 3 of the definition of a subspace. 2 3. Geometrically, y = 2x is a line through the origin in R  , so you might conclude from Example 3.37 x that this is a subspace. And in fact, the set of vectors with y = 2x is y     x 1 =x . 2x 2   1 Since y is arbitrary, S = span , and this is a subspace of R2 by Theorem 3.19. 2     −3 2 4. This is not a subspace. For example, both and are in S, since −3 · (−1) and 2 · 2 are both −1 2 nonnegative. But their sum,       −3 2 −1 + = , −1 2 1

is not in S since −1 · 1 < 0. So S is not a subspace (violates property 2 of the definition). 5. Geometrically, x = y = z is a line through the origin in R3 , so youcould  conclude from Exercise 9 in x this section that this is a subspace. And in fact, the set of vectors y  with x = y = z is z     x 1 x = x 1 . x 1

3.5. SUBSPACES, BASIS, DIMENSION, AND RANK

177

  1 Since x is arbitrary, S = span 1, and this is a subspace of R3 by Theorem 3.19. 1 6. Geometrically, this is a line in R3 lying in the xz-plane and passing   through the origin, so Exercise 9 x in this section implies that it is a subspace. The set of vectors y  satisfying these conditions is z 

   x 1  0  = x 0 . 2x 2   1 Since x is arbitrary, S = span 0, and this is a subspace of R3 by Theorem 3.19. 2 7. This is a plane in R3 , but it does not pass through the origin since (0, 0, 0) does not satisfy the equation. So condition 1 in the definition is violated, and this is not a subspace. 8. This is not a subspace. For example     0 1 0 ∈ S and 1 ∈ S 2 1 since for the first vector |1 − 0| = |0 − 1| and for the second |0 − 1| = |1 − 2|. But their sum,       0 1 1 0 + 1 = 1 ∈ /S 2 3 1 since |1 − 1| = 6 |1 − 3|. 9. If a line ` passes through the origin, then it has the equation x = 0 + td = td, so that the set of points on ` is just span(d). Then Theorem 3.19 implies that ` is a subspace of R3 . (Note that we did not use the assumption that the vectors lay in R3 ; in fact this conclusion holds in any Rn for n ≥ 1.) 10. This is not a subspace of R2 . For example, (1, 0) ∈ S and (0, 1) ∈ S, but their sum, (1, 1), is not in S since it does not lie on either axis. 11. As in Example 3.41, b is in col(A) if and only if the augmented matrix     1 0 −1 3 A b = 1 1 1 2 is consistent as a linear system. So we row-reduce that augmented matrix:  1 0 1 1

−1 1

  3 1 0 R2 −R1 2 − −−−−→ 0 1

−1 2

 3 . −1

This matrix is in row echelon form and has no zero rows, so the system is consistent and thus b ∈ col(A). Using Example 3.41, w is in row(A) if and only if the matrix     1 0 −1 A 1  = 1 1 w −1 1 1

178

CHAPTER 3. MATRICES can be row-reduced to a matrix whose last row is zero by operations excluding row interchanges involving the last row. Thus 

1  1 −1

0 1 1

  1 0 −1 R2 −R1 1  −−− −−→  0 1 R3 +R1 1 − −−−−→ 0 1

  −1 1  0 2  R3 −R2 0 − −−−−→ 0

0 1 0

 −1 2  −2

This matrix is in row echelon form and has no zero rows, so we cannot make the last row zero. Thus w∈ / row(A). 12. Using Example 3.41, b is in col(A) if and only if the augmented matrix   1 1 −3 1   2 1 1 A b = 0 1 −1 −4 0 is consistent as a linear system. So we row-reduce that augmented matrix:  1  0 1

  1 1   1 0 R3 −R1 0 −−−−−→ 0

1 −3 2 1 −1 −4

1 −3 2 1 −2 −1

  1 1   1 0 R3 +R2 −1 −−−−−→ 0

1 2 0

  1 1  21 R2  1 −−−→ 0 0 0

−3 1 0

1 1 0

−3 1 2

0

 1 1 2 0

The linear system has a solution (in fact, an infinite number of them), so is consistent. Thus b is in col(A). Using Example 3.41, w is in row(A) if and only if the matrix  1 1 −3    0 A 2 1 =  1 −1 −4 w 2 4 −5

   

can be row-reduced to a matrix whose last row is zero by operations excluding row interchanges involving the last row. Thus 

1  0   1 2

  1 −3  2 1   R3 −R1   −1 −4 −−−−−→  R4 −2R1 4 −5 − −−−−→

1 0 0 0

  1 −3  2 1      −2 −1 R4 +R3 2 1 −−−−−→

1 0 0 0

 1 −3 2 1   −2 −1  0 0

Thus w is in row(A). 13. As in the remarks following Example 3.41, we determine whether w is in row(A) using col(AT ). To say w is in row(A) means that wT is in col(A), so we form the augmented matrix and row-reduce to see if the system has a solution: 

 T A

w

 T

1 1 = 0 1 −1 1

  −1 1 1 0 1 1 R3 +R2 1 − −−−−→ 0 2

  −1 1 1 1 R2 − 12 R3 0 0 0 −−−−−−→ 0 2

  −1 1 1 R2 ↔R3  1 − −−−−→ 0 2 0 0 0

 −1 0 1

This matrix has a row that is all zeros except for the final column, so the system is inconsistent. Thus wT ∈ / col(AT ) and thus w ∈ / row(A). 14. As in the remarks following Example 3.41, we determine whether w is in row(A) using col(AT ). To say w is in row(A) means that wT is in col(A), so we form the augmented matrix and row-reduce to

3.5. SUBSPACES, BASIS, DIMENSION, AND RANK

179

see if the system has a solution: 

 T A

w

 T

1 = 1 −3

0 2 1

1 −1 −4

  1 0 2 R2 −R1 4 −−− −−→ 0 2 R3 +3R1 −5 − −−−−→ 0 1

1 −2 −1

 2 2 1

 1 0 1 2 R2 0 1 −−−→ 0 1

1 −1 −1

  2 1 0 1 R3 −R2 1 − −−−−→ 0

0 1 0

1 −1 0

 2 1 . 0

This is in row reduced form, and the final row consists of all zeros. This is a consistent system, so that wT ∈ col(AT ) and thus w ∈ row(A). 15. To check if v is in null(A), we need only multiply A by v:     −1   1 0 −1   0 3 = Av = 6= 0, 1 1 1 1 −1 so that v ∈ / null(A). 16. To check if v is in null(A), we need only multiply A by v:      0 7 1 1 −3 2 1 −1 = 0 = 0, Av = 0 0 2 1 −1 −4 so that v ∈ null(A). 17. We follow the comment after Example 3.48. Start by finding the reduced row echelon form of A:  1 1

0 1

  −1 1 R2 −R1 1 − −−−−→ 0

0 1

 −1 2

Thus a basis for row(A) is given by  { 1

0

  −1 , 0

1

 2 }.

A basis for col(A) is given by the column vectors of the original matrix A corresponding to the leading 1s in the row-reduced matrix, so a basis for col(A) is     1 0 , . 1 1 Finally, from the row-reduced form, the solutions to Ax = 0 where   x1 x = x2  x3 can be found by noting that x3 is a free variable; setting it to t and then solving for the other two variables gives x1 = t and x2 = −2t. Thus a basis for the nullspace of A is given by   1 x = −2 . 1

180

CHAPTER 3. MATRICES

18. We follow the comment after Example 3.48. Start      1 1 1 −3 1 1 −3         0  2 1 2 1 0 0  R3 +R2 R3 −R1 0 −2 −1 1 −1 −4 −−− −−−−−→ 0 −−→ Thus a basis for row(A) is given by     1 0 − 72 , 0 1 12

by finding the reduced row echelon form     R1 −R2 1 −3 1 1 −3 −−− −−→ 1  1 R2    1 2 0  2 1  −−−→ 0 1  2 0 0 0 0 0 0

or, clearing fractions,



2

0

  −7 , 0

2

of A: 0

− 72

1 0

 

1 2

0

 1 .

Since the row-reduction did not involve any row exchanges, a basis for row(A) is also given by the rows of A corresponding to nonzero rows in the reduced matrix:     1 1 −3 , 0 2 1 . A basis for col(A) is then given by the original column vectors of A corresponding to the columns containing the leading 1s in the row-reduced matrix; thus a basis for col(A) is given by     1   1 0 ,  2   1 −1 Finally, from the row-reduced form, the solutions of Ax = 0 where   x1 x = x2  x3 can be found by noting that x3 is a free variable; we set it to t, and then solving for the other two variables in terms of t to get x1 = 27 t and x2 = − 21 t. Thus a basis for the nullspace of A is given by     7 7    2 1    x = − 2  or −1  2

1

19. We follow the comment after Example 3.48. Start by finding the reduced row echelon form of A:  1 0 0

1 1 1

  1 1 0 1 0 1 −1 1 R3 −R2 −1 −1 − −−−−→ 0 0

Thus a basis for row(A) is given by  1 0 1

   R1 −R3 0 1 1 1 0 1 −−− −−→ 0 1 −1 1 R2 −R3 −1 1 1 −−−−−→ − 2 R3 0 −2 − 0 1 −−−→ 0 0   R1 −R2  1 1 0 0 − −−−−→ 1 0 0 1 −1 0 0 1 0 0 0 1 0 0

  0 , 0

1

−1

  0 , 0

0

0

1



 1 0 −1 0 0 1

.

Since the row-reduction did not involve any row exchanges, a basis for row(A) is also given by the rows of A corresponding to nonzero rows in the reduced matrix; since all the rows are nonzero, the rows of A form a basis for row(A). A basis for col(A) is then given by the original column vectors of A corresponding to the columns containing the leading 1s in the row-reduced matrix; thus a basis for col(A) is given by       1 1   1 0 , 1 ,  1   0 1 −1

3.5. SUBSPACES, BASIS, DIMENSION, AND RANK

181

Finally, from the row-reduced form, the solutions of Ax = 0 where   x1 x2   x= x3  x4 can be found by noting that x3 is a free variable; we set it to t. Also, x4 = 0; solving for the other two variables in terms of t gives x1 = −t and x2 = t. Thus a basis for the nullspace of A is given by   −1  1  x=  1 . 0 20. We follow the  2 −4  2 −1 1 −2

comment after Example   1 0 2 1  R ↔R3  1 2 3 −−1−−−→ −1 2 1 4 4  1 −2 1  0 2 0 0 0 0

3.48. Start by finding the reduced row echelon form of A:    1 −2 1 4 4 −2 1 4 4   R2 +R1  0 2 6 7 −−→ 0 2 1 2 3 −−− R3 +R2 R3 −2R1 0 −2 −6 −7 −−− −4 0 2 1 − −−→ −−−−→ 0    R −R   1 2 4 4 1 −2 1 4 4 −−− 1 −2 0 1 12 −−→  21 R2     6 7 −− 0 1 3 72  0 1 3 72  0 −→ 0 0 0 0 0 0 0 0 0 0 0 0 0

Thus a basis for row(A) is given by the nonzero rows of the reduced matrix, which after clearing fractions are     2 −4 0 2 1 , 0 0 2 6 7 . A basis for col(A) is then given by the original column vectors of A corresponding to the columns containing the leading 1s in the row-reduced matrix; thus a basis for col(A) is given by     0   2 −1 , 1 .   1 1 Finally, from the row-reduced form, the solutions of Ax = 0 where   x1 x2     x= x3  x4  x5 can be found by noting that x2 = s, x4 = t, and x5 = u are free variables. Solving for the other two variables gives x1 = 2s − t − 12 u and x3 = −3t − 27 u. Thus the nullspace of A is given by           1     1  2s − t + 12 u  2 2 −1 −2  −1 −2             1  1  0  0  0  0 s                   −3t − 7 u  = s 0 + t −3 + u − 7  = span 0 , −3 , − 7  . 2 2          2               0  1  0  1  0 0 t             0 0 0 0 u 1 1 Clearing fractions, we get for a basis for the nullspace of A       −1 −1    2      0  0  1        0 , −3 , −7 .            1  0    0    0 0 2

182

CHAPTER 3. MATRICES

21. Since row(A) = col(AT ) and col(A) = row(AT ), Then  1 AT =  0 −1

it suffices to find the row and column spaces of AT .    1 1 0 1 → 0 1 , 1 0 0

so that the column space of AT is given by the columns of AT corresponding to columns of this matrix containing a leading 1, so     1   1     col(AT ) =  0 , 1 ⇒ row(A) = 1 0 −1 , 1 1 1 .   −1 1 The row space of AT is given by the nonzero rows of the reduced matrix, so         1 0 T row(A ) = 1 0 , 0 1 ⇒ col(A) = , . 0 1 22. Since row(A) = col(AT ) and col(A) = row(AT ), it suffices to find the row and column spaces of AT . Then     1 0 1 1 0 1 AT =  1 2 −1 → 0 1 −1 , −3 1 −4 0 0 0 so that the column space of AT is given by the columns of AT corresponding to columns of the reduced matrix containing a leading 1, so     0   1     T    1 , 2 ⇒ row(A) = 1 1 −3 , 0 2 1 . col(A ) =   1 −3 The row space of AT is given by the nonzero rows of the reduced matrix, so     0   1     row(AT ) = 1 0 1 , 0 1 −1 ⇒ col(A) = 0 ,  1 .   −1 1 23. Since row(A) = col(AT ) and col(A) = row(AT ), it suffices to find the row and column spaces of AT . Then     1 0 1 1 0 0 1 0 1 0 1 1    AT =  0 −1 −1 → 0 0 1 , 1 1 −1 0 0 0 so that the column space of AT is given by the columns of AT corresponding to columns of the reduced matrix containing a leading 1, so       1 0 0                 1  1  1 T  col(A ) =   ,   ,   ⇒ row(A) = 1 1 0 1 , 0 1 −1 1 , 0 1 −1 −1 . 0 −1 −1       1 1 −1 The row space of AT is given by the nonzero rows of the reduced matrix, so       0 0   1       row(AT ) = 1 0 0 , 0 1 0 , 0 0 1 ⇒ col(A) = 0 , 1 , 0 .   0 0 1

3.5. SUBSPACES, BASIS, DIMENSION, AND RANK

183

24. Since row(A) = col(AT ) and col(A) = row(AT ), it suffices to Then    1 2 −1 1 0  −4 2 −2     1 1 AT =   → 0  0 0  2 2 4 0 1 3 4

find the row and column spaces of AT .  0 1 1 1  0 0 , 0 0 0 0

so that the column space of AT is given by the columns of AT corresponding to columns of the reduced matrix containing a leading 1, so     −1  2           −4     2      T  ,  1 ⇒ row(A) = 2 −4 0 2 1 , −1 2 1 2 3 . 0 col(A ) =          2  2       3 1 The row space of AT is given by the nonzero rows of the reduced matrix, so     0   1     row(AT ) = 1 0 1 , 0 1 1 ⇒ col(A) = 0 , 1 .   1 1 25. There is no reason to expect the bases for the row and column spaces found in Exercises 17 and 21 to be the same, since we used different methods to find them. What is importantis that they span  should   1 0 −1 , 0 1 2 }, the same subspaces. The basis for the row space found in Exercise 17 was {     and Exercise 21 gave { 1 0 −1 , 1 1 1 }. Since the vector is the same, in order to  first basis  0 1 2 is in the span of the second basis, show the subspaces are the same it suffices to see that  and that 1 1 1 is in the span of the first basis. But             0 1 2 = − 1 0 −1 + 1 1 1 , 1 1 1 = 1 0 −1 + 0 1 2 . So the two bases span the same row space. Similarly, the basis for the two column spaces were         1 0 1 0 Exercise 17: , , Exercise 21: , . 1 1 0 1 Here the second basis vectors are the same, and since             1 1 0 1 1 0 = + , = − , 1 0 1 0 1 1 each basis vector is a linear combination of the vectors in the other basis, so they span the same column space. 26. There is no reason to expect the bases for the row and column spaces found in Exercises 18 and 22 to be the same, since we used different methods to find them. What is importantis that they span  should   2 0 −7 0 2 1 the same subspaces. The basis for the row space found in Exercise 18 was { , },     and Exercise 22 gave { 1 1 −3 , 0 2 1 }. Since the vector is the same, in order to  second basis  show the subspaces are the same it suffices to see that 2 0 −7 is in the span of the second basis, and that 1 1 −3 is in the span of the first basis. But 

2

0

  −7 = 2 1

1

  −3 − 0

2

 1 ,

 1

1

 1 −3 = 2 2

0

 1 −7 + 0 2

2

 1 .

So the two bases span the same row space. Similarly, the basis for the two column spaces were         1  0   1  1 0 ,  1 . 0 ,  2 , Exercise 18: Exercise 22:     1 −1 1 −1

184

CHAPTER 3. MATRICES Here the first basis vectors are the same, and since 

     1 0 1  2 = 0 + 2  1 , −1 1 −1

     0 1 1  1 = 1  2 − 1 0 , 2 2 −1 1 −1 

each basis vector is a linear combination of the vectors in the other basis, so they span the same column space. 27. As in Example 3.46, a basis for the span of the given set of vectors is the same as a basis for the row space of   the matrixthat has  rows equal  to these vectors. Thus we form a matrix whose rows are 1 −1 0 , −1 0 1 , and 0 1 −1 and row-reduce it: 

1 −1 0

  −1 0 1 R2 +R1  0 1 − −−−−→ 0 1 −1 0

  −1 0 1 −R2  −1 1 − −−→ 0 1 −1 0

 R1 +R2  −1 0 −−− −−→ 1 0 0 1 1 −1 R −R 3 2 1 −1 −−−−−→ 0 0

 −1 −1 0

A basis for the row space of the matrix, and thus a basis for the span of the given vectors, is therefore given by the nonzero rows of the reduced matrix:     0   1  0 ,  1 .   −1 −1 Note that since no row interchanges were involved, the rows of the original matrix corresponding to nonzero rows in the reduced matrix also form a basis for the span of the given vectors. 28. As in Example 3.46, a basis for the span of the given set of vectors is the same as a basis for the row space of to these vectors. Thus we form a matrix whose rows are   the matrix  that has rows equal   1 −1 1 , 1 2 0 , 0 1 1 , and 2 1 2 and row-reduce it:  1 1  0 2

  1 −1 1 R −R 2 1 0 − − − − − → 2 0   0 1 1 R −2R 4 1 1 2 −−−−−→ 0  1 0 R3 −R2  −−− −−→ 0 R4 −2R2 −−−−−→ 0

  1 −1 −1 1 0 3 3 −1 R ↔R 2 4  −−−−−→  0 1 1 1 3 0 0 3   −1 1 1 −1 0 1 0 1   0 0 1 0 R4 +R3 0 −1 − 0 0 −−−−→

  1 1 1 3 R2 0 0  −−−→  0 1 −1 0  1 0  1 0

 −1 1 1 0  1 1 3 −1

A basis for the row space of the matrix, and thus a basis for the span of the given vectors, is therefore given by       0 0   1 −1 , 1 , 0 .   1 0 1 (Note that this matrix could be further reduced, and we would get a different basis). 29. As in Example 3.46, a basis for the span of the given set of vectors is the same as a basis for the row space of the matrix that has rows equal to these vectors. Thus we form a matrix whose rows are

3.5. SUBSPACES, BASIS, DIMENSION, AND RANK  2

−3

185

     1 , 1 −1 0 , 4 −4 1 and row-reduce it:         1 −1 0 1 −1 0 2 −3 1 1 −1 0 R2 −2R1 −R2  R1 ↔R2  1 −1 0− 0 1 −1 −−−−→ 2 −3 1 −−−−−→ 0 −1 1 −−−→ R3 −4R1 0 −2 1 0 −2 1 4 −4 1 4 −4 1 − −−−−→       1 −1 0 1 −1 0 1 −1 0 R2 +R3  0 0 1 −1 1 −1 − 1 0 −−−−→ 0 R3 +2R2 −R3 0 −1 −−−→ 0 0 1 0 0 1 −−−−−→ 0   R1 +R2 −−−−−→ 1 0 0 0 1 0 0 0 1

A basis for the row space of the matrix, and thus a basis for the span of the given vectors, is therefore given by       { 1 0 0 , 0 1 0 , 0 0 1 } (Note that as soon as we got to the first matrix on the second line above, we could have concluded that the reduced matrix would have no zero rows; since it was 3 × 3, it would thus be the identity matrix, so that we could immediately write down the basis above, or simply use the basis given by the rows of that matrix). 30. As in Example 3.46, a basis for the span of the given set of vectors is the same as a basis for the row space of the   matrix that has  rows equal to  these vectors. Thus we form a matrix whose rows are 0 1 −2 1 , 3 1 −1 0 , 2 1 5 1 and row-reduce it:  1      1 1 3 R1 3 1 −1 0 −− 1 0 1 −2 1 − 0 3 3  −→   R1 ↔R2    0 1 −2 1 0 1 −2 1 3 1 −1 0 − − − − − →       2

1

5

1

2 1 5 1 2 1    1 31 − 31 0 1    0 1 −2 1 0    R3 − 31 R2 R −2R1 17 0 13 1 −−3−−−→ −−−−−−→ 0 3   1 13 − 31 0   0 1 −2 1   3 R3 2 1 19 −19 −−→ 0 0

5

1

1 3

− 13

1

−2

0

19 3

 0  1 

2 3

Although this matrix could be reduced further, we already have a basis for the row space:       2 0 1 −2 1 , 1 13 − 13 0 , 0 0 1 19 , or  3

1

−1

 0 ,

 0

1

−2

 1 ,

 0

0

19

 2 .

31. If, as in Exercise 29, we form the matrix whose rows are those vectors and row-reduce it, we get the identity matrix (see Exercise 29). Because there were no row exchanges, the rows of the original matrix corresponding to nonzero rows in the row-reduced matrix form a basis for the span of the vectors (because there were no row exchanges). Thus a basis consists of all three given vectors:       2 −3 1 , 1 −1 0 , 4 −4 1 . 32. If, as in Exercise 30, we form the matrix whose rows are those vectors and row-reduce it, we get a matrix with no zero rows. Thus the rank of the matrix is 3, so the rows are linearly independent, so a basis consists of all three row vectors:       0 1 −2 1 , 3 1 −1 0 , 2 1 5 1 .

186

CHAPTER 3. MATRICES

33. If R is a matrix in echelon form, then row(R) is the span of the rows of R, by definition. But nonzero rows of R are linearly independent, since their first entries are in different columns, so those nonzero rows form a basis for row(R) (they span, and are linearly independent). 34. col(A) is the span of the columns of A by definition. If the columns of A are also linearly independent, then they form a basis for col(A) by definition of “basis”. 35. The rank of A is the number of vectors in a basis for either row(A) or col(A), by definition. From Exercise 17, row(A) has two vectors in its basis, so rank(A) = 2. Then nullity(A) is the number of columns minus the rank, or 3 − 2 = 1. 36. The rank of A is the number of vectors in a basis for either row(A) or col(A), by definition. From Exercise 18, row(A) has two vectors in its basis, so rank(A) = 2. Then nullity(A) is the number of columns minus the rank, or 3 − 2 = 1. 37. The rank of A is the number of vectors in a basis for either row(A) or col(A), by definition. From Exercise 19, row(A) has three vectors in its basis, so rank(A) = 3. Then nullity(A) is the number of columns minus the rank, or 4 − 3 = 1. 38. The rank of A is the number of vectors in a basis for either row(A) or col(A), by definition. From Exercise 20, row(A) has two vectors in its basis, so rank(A) = 2. Then nullity(A) is the number of columns minus the rank, or 5 − 2 = 3.   T 39. Since nullity(A) > 0, the equation P Ax = 0 has a nontrivial solution x = c1 c2 · · · cn , where at least one ci 6= 0. Then Ax = ci ai = 0. Since at least one ci 6= 0, this shows that the columns of A are linearly dependent. 40. Since rank(A) is at most the number of columns of A and also at most the number of rows of A, in this case rank(A) ≤ 2. Thus any row-reduced form of A will have at least 4 − 2 = 2 zero rows, which means that some row of A is a linear combination of the other rows, so that the rows of A are linearly dependent. 41. Since rank(A) is at most the number of columns of A and also at most the number of rows of A, in this case rank(A) ≤ 5 and rank(A) ≤ 3, so that rank(A) ≤ 3 and its possible values are 0, 1, 2, and 3. Using the Rank Theorem, we know that nullity(A) = n − rank(A) = 5 − rank(A), so that nullity(A) = 5, 4, 3, or 2. 42. Since rank(A) is at most the number of columns of A and also at most the number of rows of A, in this case rank(A) ≤ 2 and rank(A) ≤ 4, so that rank(A) ≤ 2 and its possible values are 0, 1, and 2. Using the Rank Theorem, we know that nullity(A) = 2 − rank(A) = 1 − rank(A), so that nullity(A) = 2, 1, or 0. 43. First row-reduce A:    1 2 a 1 R2 +2R1 −2 4a 2 −−−−−→ 0 R3 −aR1 a −2 1 − −−−−→ 0

  1 2 a 0 4a + 4 2a + 2 R2 + 21 R2 −2a − 2 1 − a2 − −−−−−→ 0

2 4a + 4 0

If a = −1, then we have  1 0 0

2 0 0

 1 0 0

2 12 0

 −1 0 so that rank(A) = 1. 0

If a = 2, then we have  2 6 so that rank(A) = 2. 0

Otherwise, the row-reduced matrix has all rows nonzero, so that rank(A) = 3.

 a 2a + 2  −a2 + a + 2

3.5. SUBSPACES, BASIS, DIMENSION, AND RANK

187

44. First row-reduce A:     R1 +R2   a 2 −1 −2 −1 a − −−−−→ 1 2 a − 2 R1 ↔R3   3 3 3 3 −2− 3 3 −2 −2 −−−−→ −2 −1 a a 2 −1 a 2 −1    1 2 1 2 a−2 1 R2 −3R1 − 3 R2  −3 4 − 3a  − −−−−−→ 0 1 −−−→ 0 2 R3 −aR2 0 2 − 2a −a + 2a − 1 0 2 − 2a −−−−−→   1 2 a−2 0 1  a − 34  R3 −(2−2a)R2 5 0 0 (a − 1) a − −−−−−−−−−→ 3

 a−2  a − 34 2 −a + 2a − 1

The first two rows are always nonzero. If a = 1 or a = 53 , then the bottom row is zero, so that rank(A) = 2. Otherwise, all three rows are nonzero and rank(A) = 3. 45. As in example 3.52,   1 1 , 0 form a basis for R3 if and only that matrix:  1 1 1 0 0 1

  1 0 , 1

  0 1 1

if the matrix with those vectors as its rows has rank 3. Row-reduce   0 1 R2 −R1  1 − −−−−→ 0 1 0

  1 1 0 0 −1 1 R3 +R2 1 1 − −−−−→ 0

 1 0 −1 1 0 2

Since the matrix has rank 3 (it has no nonzero rows and is in echelon form), the vectors are linearly independent; since there are three of them, they form a basis for R3 . 46. As in example 3.52,  1 −1 , 3 

  −1  5 , 1



 1 −3 1

form a basis for R3 if and only if the matrix with those vectors as its rows has rank 3. Row-reduce that matrix:       1 −1 3 1 −1 3 1 −1 3 R +2R1 R2 +2R3  −1 0 5 1 −−2−−−→ 4 4 − 0 0 −−−−→ 0 R −R 3 1 1 −3 1 −−− 0 −2 −2 0 −2 −2 −−→ Since the matrix does not have rank 3, it follows that these vectors do not form a basis for R3 . (In fact,         1 −1 1 0 − −1 +  5 + 2 −3 = 0 , 3 1 1 0 so the vectors are linearly dependent.) 47. As in example 3.52,   1 1  , 1 0

  1 1  , 0 1

  1 0  , 1 1

  0 1   1 1

188

CHAPTER 3. MATRICES form a basis for R4 if and only if the matrix with those vectors as its rows has rank 4. Row-reduce that matrix:  1 1  1 0

1 1 0 1

1 0 1 1

  1 0 R −R 2 1  − − − − − → 1 0  R3 −R1 0 1 − −−−−→ 0 1

  1 1 1 0 0 0 −1 1 R ↔R 4 2   −1 0 1 −−−−−→ 0 0 1 1 1

1 1 1 1 −1 0 0 −1  1 1 0 1  R3 +R2  −−− −−→ 0 0 0 0

 0 1  1 1   1 1 0 0 1 1   0 1 2 R4 +R3 −1 1 −−−−−→ 0

1 1 0 0

1 1 1 0

 0 1  2 3

This matrix is in echelon form and has no zero rows, so it has rank 4 and thus these vectors form a basis for R4 . 48. As in example 3.52, 

 1 −1  ,  0 0



 0  1  ,  0 −1



 0  0  , −1 1

  −1  0    1 0

form a basis for R4 if and only if the matrix with those vectors as its rows has rank 4. Row-reduce that matrix:       1 −1 0 0 1 −1 0 0 1 −1 0 0 0 0  0 1 0 −1 1 0 −1 1 0 −1            0 0 0 −1 1 0 0 −1 1 0 −1 1 R4 +R2 R4 +R1 0 1 −1 1 0 − −1 0 1 0 − −−−−→ 0 −−−−→ 0 −1   1 −1 0 0 0 1 0 −1   0 0 −1 1 R4 +R3 0 0 0 −−− −−→ 0 Since the matrix does not have rank 4, these vectors do not form a basis for R4 . (In fact, the sum of the four vectors is zero, so they are linearly dependent.) 49. As in example 3.52,   1 1 , 0 form a basis for Z32 if and only if that matrix over Z2 :  1 1 0 1 1 0

  0 1 , 1

  1 0 1

the matrix with those vectors as its rows has rank 3. Row-reduce   0 1 0 1 R3 +R1 1 − −−−−→ 0

1 1 1

  0 1 0 1 R3 +R2 1 − −−−−→ 0

1 1 0

 0 1 0

Since the matrix does not have rank 3, these vectors do not form a basis for Z32 . 50. As in example 3.52,   1 1 , 0

  0 1 , 1

  1 0 1

3.5. SUBSPACES, BASIS, DIMENSION, AND RANK

189

form a basis for Z33 if and only if the matrix with those vectors as its rows has rank 3. Row-reduce that matrix over Z3 :       1 1 0 1 1 0 1 1 0 0 1 1 0 1 1 0 1 1 R3 +2R1 R3 +R2 1 0 1 − −−−−→ 0 2 1 −−− −−→ 0 0 2 Since the matrix has rank 3, these vectors form a basis for Z33 . 51. We want to solve

      1 1 1 c1 2 + c2  0 = 6 , 0 −1 2

so set up the augmented matrix of the linear system and row-reduce to solve it:      R1 −R2   1 1 1 −−→ 1 0 1 1 1 −−− 1 1 1 1 − 2 R2  R2 −2R1  0 1 2 0 6 − 1 −2 −−−−→ 0 −2 4 −−− −→ 0 R +R 3 2 0 −1 2 0 −1 2 0 −1 2 −−−−−→ 0 0

 3 −2 0

Thus c1 = 3 and c2 = −2, so that       1 1 1 3 2 − 2  0 = 6 . 2 −1 0   3 Then the coordinate vector [w]B = . −2 52. We want to solve

      1 5 3 c1 1 + c2 1 = 3 , 4 6 4

so set up the augmented matrix of the linear system and row-reduce to solve it:        R1 −R2   3 5 1 1 1 1 1 3 1 1 3 1 3 −−− −−→ 1 0 R −3R 2 1 R R ↔R 2  1 1 3 −−1−−−→  0 1 2 2 0 3 5 1 −−−−−→ 0 2 −8 −− 1 −4 −→ R3 −4R1 R3 −2R2 4 6 4 4 6 4 − 0 2 −8 0 2 −8 −−−−→ −−−−−→ 0 0

 7 −4 0

Thus c1 = 7 and c2 = −4, so that       3 5 1 7 1 − 4 1 = 3 , 4 6 4   7 Then the coordinate vector [w]B = . −4 53. Row-reduce A over Z2 :  1 A = 0 1

1 1 0

  0 1 0 1 R3 +R1 1 − −−−−→ 0

1 1 1

  0 1 0 1 R3 +R2 1 − −−−−→ 0

1 1 0

 0 1 . 0

This matrix is in echelon form with one zero row, so it has rank 2. Thus nullity(A) = 3 − rank(A) = 3 − 2 = 1. 54. Row-reduce A over Z3 :  1 A = 2 2

1 1 0

  2 1 R2 +R1 2 −−− −−→ 0 R3 +R1 0 − −−−−→ 0

1 2 1

  2 1 0 1 R3 +R2 2 − −−−−→ 0

1 2 0

 2 1 . 0

190

CHAPTER 3. MATRICES This matrix is in echelon form with one zero row, so it has rank 2. Thus nullity(A) = 3 − rank(A) = 3 − 2 = 1.

55. Row-reduce A over Z5 :  1 A = 2 1

3 3 0

1 0 4

  1 4 R +3R1 0 1 −−2−−−→ R3 +4R1 0 − −−−−→ 0

3 2 2

1 3 3

  4 1 0 3 R3 +4R2 1 − −−−−→ 0

3 2 0

1 3 0

 4 3 3

This matrix is in echelon form with no zero rows, so it has rank 3. Thus nullity(A) = 4 − rank(A) = 4 − 3 = 1. 56. Row-reduce A over Z7 :  2 6 A= 1 1

4 3 0 1

0 5 2 1

0 1 2 1

  2 1 R +4R 2 1 −−−−−→ 0 0  R +3R  1  0 5 − −3−−−→ R +3R 1 0 1 −−4−−−→

4 5 5 6

0 5 2 1

0 1 2 1

 1 4  R +6R2 1 −−3−−−→ R4 +3R2 4 − −−−−→  2 4 0 5  0 0 0 0

0 5 4 2

0 1 1 4

  2 1 0 4   0 4 R4 +3R3 2 − −−−−→ 0

4 5 0 0

0 5 4 0

0 1 1 0

 1 4 . 4 0

This matrix is in echelon form with one zero row, so it has rank 3. Thus nullity(A) = 5 − rank(A) = 5 − 3 = 2. 57. Since x is in null(A), we know that Ax = 0. But if Ai is the ith row of A, then       0 a11 x1 + a12 x2 + · · · + a1n xn A1 · x  a21 x1 + a22 x2 + · · · + a2n xn   A2 · x  0       Ax =   =  ..  =  ..  . ..    .  . . am1 x1 + am2 x2 + · · · + amn xn

Am · x

0

Thus Ai · x = 0 for all i, so that every row of A is orthogonal to x. 58. By the Fundamental Theorem, if A and B have rank n, then they are both invertible. Therefore, by Theorem 3.9(c), AB is invertible as well. Applying the Fundamental Theorem again shows that AB also has rank n. 59. (a) Let bk be the k th column of B. Then Abk is the k th column of AB. Now suppose that some set of columns of AB, say Abk1 , Abk2 , . . . , Abkr , are linearly independent. Then ck1 bk1 + ck2 bk2 + · · · + ckr bkr = 0 ⇒ A (ck1 bk1 + ck2 bk2 + · · · + ckr bkr ) = 0 ⇒ ck1 (Abk1 ) + ck2 (Abk2 ) + · · · + ckr (Abkr = 0). But the Abki are linearly independent, so all the cki are zero and thus the bki are linearly independent. So the number of linearly independent columns in B is at least as large as the number of linearly independent columns of AB; it follows that rank(AB) ≤ rank(B). (b) There are many examples. For instance,     1 1 1 0 A= , B= . 0 0 1 1 Then B has rank 2, and AB =

 2 0

 1 has rank 1. 0

3.5. SUBSPACES, BASIS, DIMENSION, AND RANK

191

60. (a) Recall that by Theorem 3.25, if C is any matrix, then rank(C T ) = rank(C). Thus rank(AB) = rank((AB)T ) = rank(B T AT ) ≤ rank(AT ) = rank(A), using Exercise 59(a) with AT in place of B and B T in place of A. (b) For example, we can use the transposes of the matrices in part (b) of Exercise 59:     1 1 1 0 A= , B= . 0 1 1 0 Then A has rank 2 and

 2 AB = 1

 0 has rank 1. 0

61. (a) Since U is invertible, we have A = U −1 (U A), so that using Exercise 59(a) twice,  rank(A) = rank U −1 (U A) ≤ rank(U A) ≤ rank(A). Thus rank(A) = rank(U A). (b) Since V is invertible, we have A = (AV )V −1 , so that using Exercise 60(a) twice,  rank(A) = rank (AV )V −1 ≤ rank(AV ) ≤ rank(A). Thus rank(A) = rank(AV ). 62. Suppose first that A has rank 1. Then col(A) = span(u) for some  vector u. It follows that if ai is the ith column of A, then for each i, we have ai = ci u. Let vT = c1 c2 · · ·  cn . Then A = uvT . For the converse, suppose that A = uvT . Suppose that vT = c1 c2 · · · cn . Then   A = uvT = c1 u · · · cn u . Thus every column of A is a scalar multiple of u, so that col(A) = span(u) and thus rank(A) = 1. 63. If rank(A) = r, then a basis for col(A) consists of r vectors, say u1 , u2 , . . . , ur ∈ Rm . So if ai is the ith column of A, then for each i we have ai = ci1 u1 + ci2 u2 + · · · + cir ur , i = 1, 2, . . . , n.   Now let viT = c1i c2i · · · cni , for i = 1, 2, . . . , r. Then so that     A = a1 a2 · · · an = c11 u1 + c12 u2 + · · · + c1r ur · · · cn1 u1 + cn2 u2 + · · · + cnr ur       = c11 u1 · · · cn1 u1 + c12 u2 · · · cn2 u2 + c1r ur · · · cnr ur = u1 · v1T + u2 · v2T + · · · + ur · vrT . By exercise 62, each ui · viT is a matrix of rank 1, so A is a sum of r matrices of rank 1. 64. Let A = {u1 , . . . , ur } be a basis for col(A) and B = {v1 , . . . , vs } be a basis for col(B). Then rank(A) = r and rank(B) = s. Let ai and bi be the columns of A and B respectively, for i = 1, 2, . . . , n. Then if x ∈ col(A + B),   ! ! n n r s r n s n X X X X X X X X  x= c1 (ai + bi ) = αij uj + βik vk  = ci αij uj + ci βik vk , i=1

i=1

j=1

k=1

j=1

i=1

k=1

i=1

so that x is a linear combination of A and B. Thus A ∪ B spans col(A + B). But this implies that col(A + B) ⊆ span(A ∪ B) = span(A) + span(B) = col(A) + col(B). Since rank(A + B) is the dimension of col(A + B) and rank(A) + rank(B) is the dimension of col(A) plus the dimension of col(B), we are done.

192

CHAPTER 3. MATRICES

P 65. We show that col(A) ⊆ null(A) and then use the Rank Theorem. Suppose x = ci ai ∈ col(A). Let ai be the ith column of A; then ai = Aei where the ei are the standard basis vectors for Rn . Then X  X X X  X Ax = A ci ai = ci (Aai ) = ci (A (Aei )) = ci A2 ei = ci (Oei ) = 0. Thus any vector in the column space of A is taken to zero by A, so it is in the null space of A. Thus col(A) ⊆ null(A). But then rank(A) = dim(col(A)) ≤ dim(null(A)) = nullity(A), so by the Rank Theorem, rank(A) + rank(A) ≤ rank(A) + nullity(A) = n



2 rank(A) ≤ n



rank(A) ≤

n . 2

T 66. (a) Note first that since x is n × 1, then xT is 1 × n and thus xT Ax is 1 × 1. Thus xT Ax = xT Ax. Now since A is skew-symmetric, xT Ax = xT Ax

T

= x T AT x T

T

= xT (−A) x = −xT Ax.

So xT Ax is equal to its own negation, so it must be zero. (b) By Theorem 3.27, to show I + A is invertible it suffices to show that null(I + A) = 0. Choose x with (I + A)x = 0. Then xT (I + A)x = xT 0 = x · 0 = 0, so that 0 = xT (I + A)x = xT Ix + xT Ax = xT Ix + xT 0 = xT Ix = xT x = x · x since Ax = 0. But x · x = 0 implies that x = 0. So any vector in null(I + A) is the zero vector and thus I + A has a zero null space, so it is invertible.

3.6

Introduction to Linear Transformations

1. We must compute T (u) and T (v), so we take the product of the matrix A of T and each of these two vectors:      2 −1 1 0 T (u) = Au = = 3 4 2 11      2 −1 1 8 T (v) = Av = = . 3 4 2 1 2. We must compute T (u) and T (v), so we take the product of the matrix A of T and each of these two vectors:       3 −1   3·1−1·2 1 1 2 T (u) = Au = 1 = 1 · 1 + 2 · 2 = 5 2 1 4 1·1+4·2 9         3 −1 3 · 3 − 1 · (−2) 11 3 2 T (v) = Av = 1 = 1 · 3 + 2 · (−2) = −1 −2 1 4 1 · 3 + 4 · (−2) −5 3. Use the remark following Example 3.55. T is a linear transformation if and only if     x x T (c1 v1 + c2 v2 ) = c1 T (v1 ) + c2 T (v2 ), where v1 = 1 , v2 = 2 . y1 y2

3.6. INTRODUCTION TO LINEAR TRANSFORMATIONS

193

Then 

    x1 x + c2 2 y1 y2   c1 x1 + c2 x2 =T c1 y1 + c2 y2   (c1 x1 + c2 x2 ) + (c1 y1 + c2 y2 ) = (c1 x1 + c2 x2 ) − (c1 y1 + c2 y2 )   (c1 x1 + c1 y1 ) + (c2 x2 + c2 y2 ) = (c1 x1 − c1 y1 ) + (c2 x2 − c2 y2 )     c1 x1 + c1 y1 c2 x2 + c2 y2 = + c1 x1 − c1 y1 c2 x2 − c2 y2     x1 + y1 x2 + y2 = c1 + c2 x1 − y1 x2 − y2

T (c1 v1 + c2 v2 ) = T

c1

= c1 T (v1 ) + c2 T (v2 ) and thus T is linear. 4. Use the remark following Example 3.55. T is a linear transformation if and only if

T (c1 v1 + c2 v2 ) = c1 T (v1 ) + c2 T (v2 ), where v1 =

    x1 x , v2 = 2 . y1 y2

Then 

    x1 x T (c1 v1 + c2 v2 ) = T c1 + c2 2 y1 y2   c x + c2 x2 =T 1 1 c1 y1 + c2 y2   −(c1 y1 + c2 y2 ) =  (c1 x1 + c2 x2 ) + 2(c1 y1 + c2 y2 )  3(c1 x1 + c2 x2 ) − 4(c1 y1 + c2 y2 )     −c1 y1 −c2 y2 =  c1 x1 + 2c1 y1  +  c2 x2 + 2c2 y2  3c1 x1 − 4c1 y1 3c2 x2 − 4c2 y2     −y1 −y2 = c1  x1 + 2y1  + c2  x2 + 2y2  3x1 − 4y1 3x2 − 4y2 = c1 T (v1 ) + c2 T (v2 ) and thus T is linear. 5. Use the remark following Example 3.55. T is a linear transformation if and only if     x1 x2 T (c1 v1 + c2 v2 ) = c1 T (v1 ) + c2 T (v2 ), where v1 =  y1  , v2 =  y2  . z1 z2

194

CHAPTER 3. MATRICES Then 

    x1 x2 T (c1 v1 + c2 v2 ) = T c1  y1  + c2  y2  z1 z2   c1 x1 + c2 x2 = T  c1 y1 + c2 y2  c1 z1 + c2 z2   (c1 x1 + c2 x2 ) − (c1 y1 + c2 y2 ) + (c1 z1 + c2 z2 ) = 2(c1 x1 + c2 x2 ) + (c1 y1 + c2 y2 ) − 3(c1 z1 + c2 z2 )   (c1 x1 − c1 y1 + c1 z1 ) + (c2 x2 − c2 y2 + c2 z2 ) = (2c1 x1 + c1 y1 − 3c1 z1 ) + (2c2 x2 + c2 y2 − 3c2 z2 )     c1 x1 − c1 y1 + c1 z1 c2 x2 − c2 y2 + c2 z2 = + 2c1 x1 + c1 y1 − 3c1 z1 2c2 x2 + c2 y2 − 3c2 z2     x1 − y1 + z1 x2 − y2 + z2 = c1 + c2 2x1 + y1 − 3z1 2x2 + y2 − 3z2 = c1 T (v1 ) + c2 T (v2 ) and thus T is linear. 6. Use the remark following Example 3.55. T is a linear transformation if and only if     x1 x2 T (c1 v1 + c2 v2 ) = c1 T (v1 ) + c2 T (v2 ), where v1 =  y1  , v2 =  y2  . z1 z2 Then 

    x1 x2 T (c1 v1 + c2 v2 ) = T c1  y1  + c2  y2  z1 z2   c1 x1 + c2 x2 = T  c1 y1 + c2 y2  c1 z1 + c2 z2   (c1 x1 + c2 x2 ) + (c1 z1 + c2 z2 ) =  (c1 y1 + c2 y2 ) + (c1 z1 + c2 z2 )  (c1 x1 + c2 x2 ) + (c1 y1 + c2 y2 )     c1 x1 + c1 z1 c2 x2 + c2 z2 =  c1 y1 + c1 z1  +  c2 y2 + c2 z2  c1 x1 + c1 y1 c2 x2 + c2 y2     x1 + z1 x2 + z2 = c1  y1 + z1  + c2  y2 + z2  x1 + y1 x2 + y2 = c1 T (v1 ) + c2 T (v2 ) and thus T is linear. 7. For example,                         1 0 0 2 0 0 1+2 3 0 0 0 0 T = 2 = , T = 2 = , but T =T = 2 = 6= + . 0 1 1 0 2 4 0 0 3 9 1 4 8. For example,                         −1 |−1| 1 1 |1| 1 −1 + 1 0 |0| 0 1 1 T = = , T = = , but T = = = 6= + . 0 |0| 0 0 |0| 0 0 0 |0| 0 0 0

3.6. INTRODUCTION TO LINEAR TRANSFORMATIONS 9. For example T

                  2 2·2 4 2+2 4 4·4 16 4 4 , but T =T = = 6= + . = = 2 2+2 4 2+2 4 4+4 8 4 4

10. For example,                   1 1+1 2 1+1 2 2+1 3 2 2 T = = , but T =T = = 6= + . 0 0−1 −1 0+0 0 0−1 −1 −1 −1 11. By Theorem 3.31, the standard matrix of T is the matrix A whose ith column is T (ei ). Since             1 1+0 1 0 0+1 1 T (e1 ) = T = = , T (e2 ) = T = = , 0 1−0 1 1 0−1 −1 the matrix of T is

 1 A= 1

 1 . −1

12. By Theorem 3.31, the standard matrix of T is the matrix A whose ith column is T (ei ). Since             −0 0 −1 −1 1 0 T (e1 ) = T =  1 + 2 · 0  = 1 , T (e2 ) = T =  0 + 2 · 1  =  2 , 0 1 3·1−4·0 3 3·0−4·1 −4 the matrix of T is

 0 A = 1 3

 −1 2 . −4

13. By Theorem 3.31, the standard matrix of T is the matrix A whose ith column is T (ei ). Since       1 1−0+0 1 T (e1 ) = T 0 = = , 2·1+0−3·0 2 0       0 0−1+0 −1 T (e2 ) = T 1 = = , 2·0+1−3·0 1 0       0 0−0+1 1 T (e3 ) = T 0 = = , 2·0+0−3·1 −3 1 the matrix of T is

 1 A= 2

 −1 1 . 1 −3

14. By Theorem 3.31, the standard matrix of T is the matrix A whose ith column is T (ei ). Since       1 1+0 1 T (e1 ) = T 0 = 0 + 0 = 0 , 0 1+0 1       0 0+0 0 T (e2 ) = T 1 = 1 + 0 = 1 , 0 0+1 1       0 0+1 1 T (e3 ) = T 0 = 0 + 1 = 1 , 1 0+0 0

195

196

CHAPTER 3. MATRICES the matrix of T is  1 A = 0 1

 1 1 . 0

0 1 1

15. Reflecting a vector in the x-axis means negating the x-coordinate. So using the method of Example 3.56,         x −x −1 0 , F = =x +y y y 0 1 and thus F is a matrix transformation with matrix  −1 F = 0

 0 . 1

It follows that F is a linear transformation. 16. Example 3.58 shows that a " # 1 to 0

rotation of 45◦ about the origin sends " # " # " √ # " # "√ # 2 ◦ 0 cos 135 − 2 cos 45◦ to = √22 , = √22 and ◦ ◦ 1 sin 135 sin 45 2 2

so that R is a matrix transformation with matrix "√ R=



2 √2 2 2

−√ 22

#

2 2

.

It follows that R is a linear transformation. 17. Since D stretches by a factor of 2 in the x-direction and a factor of 3 in the y-direction, we see that        x 2x 2 0 x D = = . y 3y 0 3 y Thus D is a matrix transformation with matrix  2 0

 0 . 3

It follows that D is a linear transformation. 18. Since P = (x, y) projects onto the line whose direction vector is h1, 1i, it follows that the projection of P onto that line is     x+y x+y hx, yi · h1, 1i h1, 1i = , . projh1,1i hx, yi = h1, 1i · h1, 1i 2 2 Thus " # " # " # " # " x+y 1 1 1 x 2 2 T = x+y = x 1 + y 21 = 21 y 2 2 2 2

1 2 1 2

#" # x . y

Thus T is a matrix transformation with the matrix above, and it follows that T is a linear transformation. 19. Let A1 =

 k 0

 0 , 1

A2 =

 1 0

 0 , k

B=

 0 1

 1 , 0

C1 =

 1 0

 k , 1

 C2 =

1 k

0 1



3.6. INTRODUCTION TO LINEAR TRANSFORMATIONS

197

Then    x k A1 = y 0    x 1 A2 = y 0    x 0 B = y 1    x 1 C1 = y 0    x 1 C2 = y k

    0 x kx : A1 stretches vectors horizontally by a factor of k = 1 y y     0 x x = : A2 stretches vectors vertically by a factor of k k y ky     1 x y = : B reflects vectors in the line y = x 0 y x     k x x + ky = : C1 extends vectors horizontally by ky 1 y y     0 x x = : C1 extends vectors vertically by kx 1 y y + kx

(C1 and C2 are called shears.) The effect of each of these transformations, with k = 2, on the unit square is Original

A1

3

3

2

2

2

1

1

1

0

0

1

2

0

3

0

1

B

2

0

3

C1

3

3

2

2

2

1

1

1

0

0

1

2

0

3

0

1

A2

3

0

1

0

3

3

2

3

C2

3

2

2

0

1

Note that reflection across the line y = x leaves the square unchanged, since it is symmetric about that line. 20. Using the formula from Example 3.58, we compute the matrix for a counterclockwise rotation of 120◦ : " R120◦ (e1 ) =

cos 120◦ sin 120◦

#

" =

− 21 √

3 2

#

" and R120◦ (e2 ) =

− sin 120◦ cos 120◦

So by Theorem 3.31, we have " h i −1 A = R120◦ (e1 ) R120◦ (e2 ) = √32 2





3 2 1 2

# .

#

" =



3 2 − 12



# .

198

CHAPTER 3. MATRICES

21. A clockwise rotation of 30◦ is a rotation of −30◦ . Then using the formula from Example 3.58, we compute the matrix for that rotation: R−30◦ (e1 ) =

" # cos(−30◦ ) sin(−30◦ )

"√ # =

3 2 − 12

" and R120◦ (e2 ) =

# − sin(−30◦ ) cos(−30◦ )

" =

1 √2 3 2

# .

So by Theorem 3.31, we have h

"√

i

3 2 − 21

A = R−30◦ (e1 ) R−30◦ (e2 ) =

22. A direction vector of the line y = 2x is d =  P` (e1 ) =

d · e1 d·d



1 √2 3 2

# .

  1 , so 2

" # " # " # " #   1 2 1 1 1 2 d · e2 5 d= = 2 and P` (e2 ) = d= = 54 . 1+4 2 d·d 1+4 2 5 5

Thus by Theorem 3.31, we have h

i A = P` (e1 ) P` (e2 ) =

"

1 5 2 5

2 5 4 5

# .

 23. A direction vector of the line y = −x is d =  P` (e1 ) =

d · e1 d·d



 1 , so −1

" # " # " # " #   1 1 1 − 21 1 d · e2 −1 2 d= = and P (e ) = d = = . ` 2 1 1 + 1 −1 d·d 1 + 2 −1 − 12 2

Thus by Theorem 3.31, we have h

i A = P` (e1 ) P` (e2 ) =

"

1 2 − 21

24. Reflection in the line y = x is given by          x y 0 1 0 R = =x +y = y x 1 0 1

− 12 1 2

# .

  1 x , 0 y

so that the matrix of R is  0 1

 1 . 0

25. Reflection in the line y = −x is given by          x −y 0 −1 0 R = =x +y = y −x −1 0 −1 so that the matrix of R is 

0 −1

 −1 . 0

  −1 x , 0 y

3.6. INTRODUCTION TO LINEAR TRANSFORMATIONS

199

26. (a) Consider the following diagrams:

a+b

{ c F{ HbL

a

{ F{ HbL

F{ HbL cb

b

F{ HaL F{ Ha + bL

b F{ HaL

From the left-hand graph, we see that F` (cb) = cF` (b), and the right-hand graph shows that F` (a + b) = F` (a) + F` (b). (b) Let P` (x) be the projection of x on the line `; this is the image of x under the linear transformation P` . Then from the figure in the text, since the diagonals of the parallelogram shown bisect each other, and the point furthest from the origin on ` is x + F` (x), we have x + F` (x) = 2P` (x), so that F` (x) = 2P` (x) − x. Now, Example 3.59 gives us the matrix of P` , where ` has direction vector d = F` (x) = 2P` (x) − x " # " # d21 d1 d2 1 0 2 = 2 x− x d1 + d22 d1 d2 d22 0 1 " " # d21 d1 d2 d21 + d22 2 1 = − d21 + d22 d1 d2 d21 + d22 d22 0 " # d21 − d22 2d1 d2 1 x, = 2 2 d1 + d2 2d1 d2 −d21 + d22

0 2 d1 + d22

  d1 , so we have d2

#! x

which shows that the matrix of F` is as claimed in the exercise statement. (c) If θ is the angle that ` forms with the positive x-axis, we may take     d cos θ d= 1 = d2 sin θ as the direction vector. Substituting those values into the formula for F` from part (b) gives  2  1 d1 − d22 2d1 d2 F` = 2 d1 + d22 2d1 d2 −d21 + d22  2  1 cos θ − sin2 θ 2 cos θ sin θ = − cos2 θ + sin2 θ cos2 θ + sin2 θ 2 cos θ sin θ  2  cos θ − sin2 θ 2 cos θ sin θ = , 2 cos θ sin θ − cos2 θ + sin2 θ

200

CHAPTER 3. MATRICES

since cos2 θ + sin2 θ = 1. Now, cos2 θ − sin2 θ = cos 2θ, and 2 cos θ sin θ = sin 2θ, so the matrix above simplifies to   cos 2θ sin 2θ F` = . sin 2θ − cos 2θ   1 27. By part (b) of the previous exercise, since the line y = 2x has direction vector d = , we have for 2 the standard matrix of this reflection # " # " 1−4 2·2 − 53 54 1 . F` (x) = = 4 3 1 + 4 2 · 2 −1 + 4 5 5   √ √ 1 3 sin θ 28. The direction vector of this line is √ , so that cos = 3, and thus θ = π3 = 60◦ . So by part θ 1 = 3 (c) of the previous exercise, the standard matrix of this reflection is " # " # " √ # 1 cos 2θ sin 2θ cos 120◦ sin 120◦ − 23 √2 F` (x) = = = . 3 1 sin 2θ − cos 2θ sin 120◦ − cos 120◦ 2 2 29. Since     x1 x T 1 =  2x1 − x2  , x2 3x1 + 4x2

    2y1 + y3 y1  3y2 − y3   S y2  =   y1 − y2  , y3 y1 + y2 + y3

we substitute the result of T into the equation for S:     x1 x (S ◦ T ) 1 = S  2x1 − x2  x2 3x1 + 4x2   2x1 + (3x1 + 4x2 )  3(2x1 − x2 ) − (3x1 + 4x2 )   =   x1 − (2x1 − x2 ) x1 + (2x1 − x2 ) + (3x1 + 4x2 )   5x1 + 4x2 3x1 − 7x2   =  −x1 + x2  6x1 + 3x2   5 4    3 −7 x1  = . −1 1 x2 6 3 Thus the standard matrix of S ◦ T is



 4 −7 . 1 3

5  3 A= −1 6

30. (a) By direct substitution, we have            x x x − x2 2(x1 − x2 ) 2x1 − 2x2 (S ◦ T ) 1 = S T 1 =S 1 = = . x2 x2 x1 + x2 −(x1 + x2 ) −x1 − x2 Thus

 [S ◦ T ] =

2 −1

 −2 . −1

3.6. INTRODUCTION TO LINEAR TRANSFORMATIONS

201

(b) Using matrix multiplication, we first find the standard matrices for S and T , which are     2 0 1 −1 [S] = , [T ] = . 0 −1 1 1 Then

 2 [S][T ] = 0

 0 1 −1 1

  −1 2 = 1 −1

 −2 . −1

Then [S ◦ T ] = [S][T ]. 31. (a) By direct substitution, we have            x1 x1 x1 + 2x2 (x1 + 2x2 ) + 3(−3x1 + x2 ) −8x1 + 5x2 (S ◦ T ) =S T =S = = . x2 x2 −3x1 + x2 (x1 + 2x2 ) − (−3x1 + x2 ) 4x1 + x1 Thus [S ◦ T ] =

 −8 4

 5 . 1

(b) Using matrix multiplication, we first find the standard matrices for S and T , which are     1 3 1 2 [S] = , [T ] = . 1 −1 −3 1 Then [S][T ] =

 1 1

 3 1 −1 −3

  2 −8 = 1 4

 5 . 1

Then [S ◦ T ] = [S][T ]. 32. (a) By direct substitution, we have            x2 + 3(−x1 ) −3x1 + x2 x1 x1 x2 (S ◦ T ) =S T =S =  2x2 − x1  = −x1 + 2x2  . x2 x2 −x1 x2 − (−x1 ) x1 + x2 Thus

 −3 [S ◦ T ] = −1 1

 1 2 . 1

(b) Using matrix multiplication, we first find the standard matrices for S and T , which are     1 3 0 1   1 , [T ] = [S] = 2 . −1 0 1 −1 Then

 1 [S][T ] = 2 1

 3  0 1 −1 −1

  −3 1 = −1 0 1

 1 2 . 1

Then [S ◦ T ] = [S][T ]. 33. (a) By direct substitution, we have    x1 (S ◦ T ) x2  = S T x3

    x1 x2  = S x1 + x2 − x3 2x1 − x2 + x3 x3     4(x1 + x2 − x3 ) − 2(2x1 − x2 + x3 ) 6x2 − 6x3 = = −(x1 + x2 − x3 ) + (2x1 − x2 + x3 ) x1 − 2x2 + 2x3

202

CHAPTER 3. MATRICES Thus

 0 [S ◦ T ] = 1

 6 −6 . −2 2

(b) Using matrix multiplication, we first find the standard matrices for S and T , which are     4 −2 1 1 −1 [S] = , [T ] = . −1 1 2 −1 1 Then

 [S][T ] =

 −2 1 1 2

4 −1

  1 −1 0 = −1 1 1

 6 −6 . −2 2

Then [S ◦ T ] = [S][T ]. 34. (a) By direct substitution, we have            x1 x1 (x1 + 2x2 ) − (2x2 − x3 ) x1 + x3 x + 2x2 (S ◦ T ) x2  = S T x2  = S 1 =  (x1 + 2x2 ) + (2x2 − x3 )  = x1 + 4x2 − x3  2x2 − x3 x3 x3 −(x1 + 2x2 ) + (2x2 − x3 ) −x1 − x3 Thus



1 [S ◦ T ] =  1 −1

0 4 0

 1 −1 . −1

(b) Using matrix multiplication, we first find the standard matrices for S and T , which are     1 −1 1 2 0   1 1 , [T ] = . [S] = 0 2 −1 −1 1 Then



1 [S][T ] =  1 −1

 −1  1 1 0 1

2 2

  1 0 = 1 −1 −1

0 4 0

 1 −1 . −1

Then [S ◦ T ] = [S][T ]. 35. (a) By direct substitution, we have            x1 x1 x1 + x2 (x1 + x2 ) − (x2 + x3 ) x1 − x3 (S ◦ T ) x2  = S T x2  = S x2 + x3  =  (x2 + x3 ) − (x1 + x3 )  = −x1 + x2  . x3 x3 x1 + x3 −(x1 + x2 ) + (x1 + x3 ) −x2 + x3 Thus



1 [S ◦ T ] = −1 0

 0 −1 1 0 . −1 1

(b) Using matrix multiplication, we first find the standard matrices for S and T , which are     1 −1 0 1 1 0 1 −1 , [T ] = 0 1 1 . [S] =  0 −1 0 1 1 0 1 Then



1 [S][T ] =  0 −1 Then [S ◦ T ] = [S][T ].

 −1 0 1 1 −1 0 0 1 1

1 1 0

  0 1 1 = −1 1 0

 0 −1 1 0 . −1 1

3.6. INTRODUCTION TO LINEAR TRANSFORMATIONS

203

36. A counterclockwise rotation T through 60◦ is given by (see Example 3.58) " # " √ # 1 cos 60◦ − sin 60◦ − 23 2 [T ] = = √3 , 1 sin 60◦ cos 60◦ 2 2   1 and reflection S in y = x is reflection in a line with direction vector d = , which has the matrix 1 (see Exercise 26(b))     1 0 2 0 1 [S] = = . 1 0 2 2 0 Thus by Theorem 3.32, the composite transformation is given by "√ " #" √ # 3 3 1 0 1 − 2 2 √2 = [S ◦ T ] = [S][T ] = 3 1 1 1 0 2 2 2 37. Reflection T in the y-axis is taking the negative of the  −1 [T ] = 0

1 √2 − 23

# .

x-coordinate, so it has matrix  0 , 1

while clockwise rotation S through 30◦ has matrix (see Example 3.58) # "√ " # 3 1 cos(−30◦ ) − sin(−30◦ ) 2 2 √ = [S] = . 3 sin(−30◦ ) cos(−30◦ ) − 12 2 Thus by Theorem 3.32, the composite transformation is given by "√ # " √ #" 3 1 − 23 −1 0 [S ◦ T ] = [S][T ] = 21 √23 = 1 −2 0 1 2 2 38. Clockwise rotation T through 45◦ has matrix (see Example 3.58) # " √ " 2 cos 45◦ sin 45◦ √2 = [T ] = − sin 45◦ cos 45◦ − 22



2 √2 2 2

1 √2 3 2

# .

# ,

while projection S onto the y-axis has matrix (see Example 3.59, or note that this transformation simply sets the x-coordinate to zero)   0 0 [S] = . 0 1 Thus by Theorem 3.32, the composite transformation is given by " √ #" √ √ #" 2 2 2 0 0 2 2 √ √ √2 [T ◦ S ◦ T ] = [T ][S][T ] = 2 − 22 0 1 − 22 2



2 √2 2 2

#

− 12

1 2 1 2

# .

  1 , is given by (see Exercise 26(b)) 1    2 0 1 = . 0 1 0

39. Reflection T in the line y = x, with direction vector d =  1 0 [T ] = 2 2

=

" − 12

Counterclockwise rotation S through 30◦ has matrix (see Example 3.58) " # "√ # 3 1 cos 30◦ sin 30◦ [S] = = 21 √23 . − sin 30◦ cos 30◦ −2 2

204

CHAPTER 3. MATRICES  Finally, reflection R in the line y = −x, with direction vector d =

 1 , is given by (see Exercise −1

26(b))     1 0 −2 0 −1 = . 0 −1 0 2 −2 Then by Theorem 3.32, the composite transformation is given by #" # " √ " #" √ 3 1 0 1 − 23 0 −1 2 2 √ = [R ◦ S ◦ T ] = [R][S][T ] = 3 1 1 0 −1 0 − 12 2 2 [R] =

− 12 √



3 2

# .

40. Using the formula for rotation through an angle θ, from Example 3.58, we have    cos α − sin α cos β − sin β [Rα ◦ Rβ ] = [Rα ][Rβ ] = sin α cos α sin β cos β   cos α cos β − sin α sin β − cos α sin β − sin α cos β = sin α cos β + cos α sin β − sin α sin β + cos α cos β   cos(α + β) − sin(α + β) = sin(α + β) cos(α + β) = Rα+β .     cos α cos β 41. Assume line ` has direction vector l = and line m has direction vector . Let θ = β − α sin α sin β be the angle from ` to m (note that if α > β, then θ is negative). Then by Exercise 26(c),    cos 2β sin 2β cos 2α sin 2α Fm ◦ F` = sin 2β − cos 2β sin 2α − cos 2α   cos 2β cos 2α + sin 2β sin 2α cos 2β sin 2α − sin 2β cos 2α = sin 2β cos 2α − cos 2β sin 2α cos 2β cos 2α + sin 2β sin 2α   cos(2(α − β)) sin(2(α − β)) = − sin(2(α − β)) cos(2(α − β))   cos(2(β − α)) − sin(2(β − α)) = sin(2(β − α)) cos(2(β − α))   cos 2θ − sin 2θ = sin 2θ cos 2θ = R2θ . 42. (a) Let P be the projection onto a line with direction vector d = P is P =

 2 1 d1 d21 + d22 d1 d2

  d1 . Then the standard matrix of d2

 d1 d2 , d22

and thus  2   2  1 1 d1 d1 d2 d1 d1 d2 d22 d21 + d22 d1 d2 d22 d21 + d22 d1 d2   4 1 d31 d2 + d1 d32 d1 + d21 d22 = 2 3 3 2 2 d21 d22 + d42 (d1 + d2 ) d1 d2 + d1 d2  2 2  1 d1 (d1 + d22 ) d1 d2 (d21 + d22 ) = 2 (d1 + d22 )2 d1 d2 (d21 + d22 ) d22 (d21 + d22 )  2  1 d1 d1 d2 = 2 d22 d1 + d22 d1 d2

P ◦ P = P2 =

= P.

3.6. INTRODUCTION TO LINEAR TRANSFORMATIONS

205

Note that we have proven this fact before, in a different form. Since P (v) = projd v, then (P ◦ P )(v) = projd (projd v) = projd v = P (v) by Exercise 70(a) in Section 1.2. (b) Let P be any projection matrix. Using the form of P from the previous exercise, we first prove that P 6= I. Suppose it were. Then d1 d2 = 0, so that either d1 = 0 or d2 = 0. But then either d21 or d22 would be zero, so that P could not be the identity matrix. So P is a non-identity matrix such that P 2 = P . Then P 2 − P = P (P − I) = O. Since P 6= I, we see that P − I 6= O, so there is some k such that (P − I)ek 6= 0 (otherwise, every row of P − I would be orthogonal to every standard basis vector, so would be zero, so that P − I would be the zero matrix). Let x = (P − I)ek . But then P x = P (P − I)ek = Oek = 0, so that x ∈ null(P ). Since the null space of P contains a nonzero vector, P is not invertible by Theorem 3.27. 43. Let `, m, and n be lines through the origin with angles from the x-axis of θ, α, and β respectively. Then     cos 2θ sin 2θ cos 2α sin 2α cos 2β sin 2β Fn ◦ Fm ◦ F` = sin 2θ − cos 2θ sin 2α − cos 2α sin 2β − cos 2β    cos 2θ cos 2α + sin 2θ sin 2α cos 2θ sin 2α − sin 2θ cos 2α cos 2β sin 2β = sin 2θ cos 2α − cos 2θ sin 2α sin 2θ sin 2α + cos 2θ cos 2α sin 2β − cos 2β    cos 2(θ − α) − sin 2(θ − α) cos 2β sin 2β = sin 2(θ − α) cos 2(θ − α) sin 2β − cos 2β   cos 2(θ − α) cos 2β − sin 2(θ − α) sin 2β cos 2(θ − α) sin 2β + sin 2(θ − α) cos 2β = sin 2(θ − α) cos 2β + cos 2(θ − α) sin 2β sin 2(θ − α) sin 2β − cos 2(θ − α) cos 2β   cos 2(θ − α + β) sin 2(θ − α + β) = . sin 2(θ − α + β) − cos 2(θ − α + β) This is a reflection through a line making an angle of θ − α + β with the x-axis. 44. If ` is a line with direction vector d through the point P , then it has equation x = p + td. Since T is linear, we have T (x) = T (p + td) = T (p) + tT (d), so that the equation of the resulting graph has equation T (x) = T (p) + tT (d). If T (d) = 0, then the image of ` has equation T (x) = T (p) so that it is the single point T (p). Otherwise, the image of ` is the line `0 with direction vector T (d) passing through T (p). 45. Suppose that we start with two parallel lines ` : x = p + td,

m : x = q + sd.

Then the images of these lines have equations `0 : T (p) + tT (d),

m0 : T (q) + sT (d).

Then • If T (d) 6= 0 and T (p) 6= T (q), then `0 and m0 are distinct parallel lines. • If T (d) 6= 0 and T (p) = T (q), then `0 and m0 are the same line. • If T (d) = 0 and T (p) 6= T (q), then `0 and m0 are distinct points T (p) and T (q). • If T (d) = 0 and T (p) 6= T (q), then `0 and m0 are the single point T (p).

206

CHAPTER 3. MATRICES

46. Using the linear transformation T from Exercise 3, we have     −1 0 T = , 1 −2

    1 2 T = , 1 0



   1 0 T = , −1 2

    −1 −2 T = . −1 0

The image is the square below: 2

C

1

D

B

-2

1

-1

2

-1

A -2

47. Using the linear transformation D from Exercise 17, we have D

    −1 −2 = , 1 3

D

    1 2 = , 1 3



   1 2 = , −1 −3

D

D

    −1 −2 = . −1 −3

The image is the rectangle below: 3

A

B

2

1

-2

1

-1

2

-1

-2

D

C

48. Using the linear transformation P from Exercise 18, we have     −1 0 P = , 1 0 The image is the line below:

    1 1 P = , 1 1



   1 0 P = , −1 0

    −1 −1 P = . −1 −1

3.6. INTRODUCTION TO LINEAR TRANSFORMATIONS

207

1.

B

0.5

A -1.

1.

0.5

C

-0.5

-0.5

D -1.

49. Using the projection P from Exercise 22, we have " # " # " # " # " # " # 1 3 −1 1 1 − 51 5 5 P = 2 , P = 6 , P = , 1 1 −1 − 25 5 5

" # " # −1 − 35 P = . −1 − 65

The image is the line below: B 1.

0.5

A

-1.

1.

0.5

-0.5

C -0.5

-1.

D



 1 2 50. Using the transformation T from Exercise 31, with matrix , we have −3 1                 −1 1 1 3 1 −1 −1 −3 T = , T = , T = , T = . 1 4 1 −2 −1 −4 −1 2 The image is the parallelogram below: 4

A

3

2

D 1

-4

-3

-2

1

-1

2

3

4

-1

B -2

-3

C -4

51. Using the transformation resulting from Exercise 37, which we call Q, we have " # "√ # " # " √ # " # " √ # " # " √ # 3+1 − 3+1 − 3−1 3−1 −1 1 1 −1 2 2 √2 Q = √3+1 , Q = , Q = −√23−1 , Q = −√3+1 . 3−1 1 1 −1 −1 2 2 2 2 The image is the parallelogram below:

208

CHAPTER 3. MATRICES A

1

D

1

-1

B

-1

C

52. If ` has direction vector d, then using standard properties of the dot product and of scalar multiplication, we get        d · cv c(d · v) d·v P` (cv) = projd (cv) = d= d=c d = cP` (v). d·d d·d d·d 53. First suppose that T is linear. Then applying property 1 first, followed by property 2, we have T (c1 v1 + c2 v2 ) = T (c1 v1 ) + T (c2 v2 ) = c1 T (v1 ) + c2 T (v2 ). For the reverse direction, assume that T (c1 v1 + c2 v2 ) = c1 T (v1 ) + c2 T (v2 ). To prove property 1, set c1 = c2 = 1; this gives T (v1 + v2 ) = T (v1 ) + T (v2 ). For property 2, set c1 = c, v1 = v, and c2 = 0; this gives T (cv) = T (cv + 0v2 ) = cT (v) + 0T (v2 ) = cT (v). Thus T is linear. 54. Let range(T ) be the range of T and col([T ]) be the column space of [T ]. We show that range(T ) = col([T ]) by showing that they are both equal to span({T (ei )}). To see P that range(T ) = span({T (ei )}), note that x ∈ range(T ) if and only if x = T (v) for some v = ci ei , so that X X x = T (v) = T ( ci e i ) = ci T (ei ), which is the case if and only if x ∈ span({T (ei )}). Thus range(T ) = span({T (ei )}).  Next, we show that col([T ]) = span({T (ei )}). By Theorem 2, [T ] = T (e1 ) T (e2 ) · · ·  Thus if vT = v1 v2 · · · vn , then

 T (en ) .

T (v) = v1 T (e1 ) + v2 T (e2 ) + · · · + vn T (en ). Then x ∈ col([T ]) ⇐⇒ x = T (v) = T (

X

vi T (ei )) =

X

vi T (ei ),

so that col([T ]) = span({T (ei )}). 55. The Fundamental Theorem of Invertible Matrices implies that any invertible 2 × 2 matrix is a product of elementary matrices. The matrices in Exercise 19 represent the three types of elementary 2 × 2 matrices, so TA must be a composition of the three types of transformations represented in Exercise 19.

3.7. APPLICATIONS

3.7

209

Applications

1. x1 = P x0 =

     0.5 0.3 0.5 0.4 = , 0.5 0.7 0.5 0.6

x2 = P x1 =

 0.5 0.5

    0.3 0.4 0.38 = . 0.7 0.6 0.62

2. Since there are two steps involved, the proportion we seek will be [P 2 ]21 — the probability of a state 1 population member being in state 2 after two steps:  0.5 P = 0.5 2

0.3 0.7

2

 0.4 = 0.6

 0.36 . 0.64

The proportion of the state 1 population in state 2 after two steps is 0.6. 3. Since there are two steps involved, the proportion we seek will be [P 2 ]22 — the probability of a state 2 population member remaining in state 2 after two steps:  0.5 P = 0.5 2

0.3 0.7

2

 0.4 = 0.6

 0.36 . 0.64

The proportion of the state 2 population in state 2 after two steps is 0.64.   x 4. The steady-state vector x = 1 has the property that P x = x, or (I − P )x = 0. So we form the x2 augmented matrix of that system and row-reduce it:         1 − 0.5 −0.3 0 0.5 −0.3 0 1 −0.6 0 I −P 0 : = → −0.5 1 − 0.7 0 0.5 0.3 0 0 0 0 Thus x2 = t is free and x1 = 35 t. Since we also want x to be a probability vector, we also have x1 + x2 = 53 t + t = 85 t = 1, so that t = 85 . Thus the steady-state vector is " # x=



1 2

 5. x1 = P x0 =  0 1 2

1 3 1 3 1 3





1 120 3  2  180   3

0

90

  150   = 120 , 120

3 8 5 8

.



1 3 1 3 1 3

1 2

 x2 = P x1 =  0 1 2





  155   = 120 . 120 115

1 150 3  2  120   3

0

6. Since there are two steps involved, the proportion we seek will be [P 2 ]11 — the probability of a state 1 population member being in state 1 after two steps: 

1 2

 P 2 = 0 1 2

1 3 1 3 1 3

2 1 3 2 3

 =

0

5 12 1 3 1 4

7 18 1 3 5 18



7 18 2  . 9  7 18

The proportion of the state 1 population in state 1 after two steps is

5 12 .

7. Since there are two steps involved, the proportion we seek will be [P 2 ]32 — the probability of a state 2 population member being in state 3 after two steps: 

1 2

 P 2 = 0 1 2

1 3 1 3 1 3

2 1 3 2 3 0

 =

5 12 1 3 1 4

7 18 1 3 5 18



7 18 2  . 9  7 18

The proportion of the state 2 population in state 3 after two steps is

5 18 .

210

CHAPTER 3. MATRICES

  x1 8. The steady-state vector x = x2  has the property that P x = x, or (I − P )x = 0. So we form the x3 augmented matrix of that system and row-reduce it:

h

I −P

 1 − 12  0 :  0 − 12 i

− 13 1 − 13 − 13

− 13 − 23 1

  1 0 2   0 =  0 − 21 0

− 13 2 3 − 13

− 31 − 32 1

  1 0 0   0 → 0 1 0 0 0

− 43 −1 0

 0  0 0

Thus x3 = t is free and x1 = 43 t, x2 = t. Since we want x to be a probability vector, we also have x1 + x2 + x3 =

10 3 4 t+t+t= t = 1, so that t = . 3 3 10

Thus the steady-state vector is  x=



2 5 3  10  . 3 10

9. Let the first column and row represent dry days and the second column and row represent wet days. (a) Given the data in the exercise, the transition matrix is  0.750 P = 0.250

 0.338 , 0.662

with

  x x= 1 , x2

where x1 and x2 denote the probabilities of dry and wet days respectively.   1 (b) Since Monday is dry, we have x0 = . Since Wednesday is two days later, the probability vector 0 for Wednesday is given by       0.750 0.338 0.750 0.338 1 0.647 x2 = P (P x0 ) = = . 0.250 0.662 0.250 0.662 0 0.353 Thus the probability that Wednesday is wet is 0.353. (c) We wish to find the steady state probability vector; that is, over that a given day will be dry or wet. Thus we need to solve P x = the augmented matrix and row-reduce:      1 − 0.750 −0.338 0 0.250 −0.338 I −P 0 : = −0.750 1 − 0.662 0 −0.750 0.338

the long run, the probability x, or (I − P )x = 0. So form   0 1 → 0 0

−1.352 0

Let x2 = t, so that x1 = 1.352t; then we also want x1 + x2 = 1, so that x1 + x2 = 2.352t = 1, so that t =

1 ≈ 0.425. 2.352

Thus the steady state probability vector is     1.352 · 0.425 0.575 x= ≈ . 0.425 0.425 So in the long run, about 57.5% of the days will be dry and 42.5% will be wet.

 0 . 0

3.7. APPLICATIONS

211

10. Let the three rows (and columns) correspond to, in order, tall, medium, or short. (a) From the given data, the transition matrix is  0.6 P = 0.2 0.2

0.1 0.7 0.2

 0.2 0.4 . 0.4

  0 (b) Starting with a short person, we have x0 = 0. That person’s grandchild is two generations 1 later, so that       0.6 0.1 0.2 0.6 0.1 0.2 0 0.24 x2 = P (P x0 ) = 0.2 0.7 0.4 0.2 0.7 0.4 0 = 0.48 . 0.2 0.2 0.4 0.2 0.2 0.4 1 0.28 So the probability that a grandchild will be tall is 0.24.   0.2 (c) The current distribution of heights is x0 = 0.5, and we want to find x3 : 0.3   3    0.2457 0.2 0.6 0.1 0.2 x2 = P 3 x0 = 0.2 0.7 0.4 0.5 = 0.5039 . 0.2504 0.3 0.2 0.2 0.4 So roughly 24.6% of the population is tall, 50.4% medium, and 25.0% short. (d) We want to solve (I − P )x = 0, so form the augmented matrix and row-reduce:      0.4 −0.1 −0.2 0 1 0 1 − 0.6 −0.1 −0.2 0   0 = −0.2 0.3 −0.4 0 → 0 1 I − P 0 :  −0.2 1 − 0.7 −0.4 −0.2 −0.2 1 − 0.4 0 −0.2 −0.2 0.6 0 0 0

−1 −2 0

 0 0 . 0

So x3 = t is free, and x1 = t, x2 = 2t. Since x1 + x2 + x3 = t + 2t + t = 4t = 1, we get t =

1 , 4

so that the steady state vector is   1 4

  x =  12  . 1 4

So in the long run, half the population is medium and one quarter each are tall and short. 11. Let the rows (and columns) represent good, fair, and poor crops, in order. (a) From the data given, the transition matrix is  0.08 P = 0.07 0.85

0.09 0.11 0.80

 0.11 0.05 . 0.84

(b) We want to find the probability that starting from a good crop in 1940 we end up with a good crop in each of the five succeeding years, so we want (P i )11 for i = 1, 2, 3, 4, 5. Computing P i and

212

CHAPTER 3. MATRICES looking at the upper left entries, we get 1941 : P11

= 0.0800

2

1942 :

(P )11 = 0.1062

1943 :

(P 3 )11 ≈ 0.1057

1944 :

(P 4 )11 ≈ 0.1057

1945 :

(P 5 )11 ≈ 0.1057.

 (c) To find the long-run proportion, we want to solve P x = x. So row-reduce I − P    1 − 0.08 −0.09 −0.11 0 1 0 −0.126   0 → 0 1 −0.066 I − P 0 =  −0.07 1 − 0.11 −0.05 −0.85 −0.80 1 − 0.84 0 0 0 0

 0 :  0 0 . 0

So x3 = t is free, and x1 = 0.126t, x2 = 0.066t. Since x1 + x2 + x3 = 1, we have x1 + x2 + x3 = 0.126t + 0.066t + t = 1.192t = 1, so that t =

1 ≈ 0.839. 1.192

So the steady state vector is       x1 0.126 · 0.839 0.106 x = x2  ≈ 0.066 · 0.839 ≈ 0.055 . x3 0.839 0.839 In the long run, about 10.6% of the crops will be good, 5.5% will be fair, and 83.9% will be poor. 12. (a) Looking at the connectivity of the graph, we get for the transition matrix 0

1 3

1  P =  12 2

0



0

1 4 1 4

1 3 1 3

0



0

1 3 . 2  3

1 2

0

Thus for example from state 2, there is a 13 probability of getting to state 3, since there are three paths out of state 2 one of which leads to state 3. Thus P32 = 13 . (b) First find the steady-state probability distribution by solving P x = x:  h I −P

1

i − 1  0 =  21 − 2 0

− 31 1 1 −3 − 31

− 41 − 41 1 1 −2

0 − 13 − 23 1

  0 1 0 0   → 0 0 0

0

0 1 0 0

0 0 1 0

− 32 −1 − 34 0

 0 0  . 0 0

So x4 is free; setting x4 = 3t to avoid fractions gives x1 = 2t, x2 = 3t, x3 = 4t, and x4 = 3t. Since this is a probability vector, we also have x1 + x2 + x3 + x3 = 2t + 3t + 4t + 3t = 12t = 1, so that t = So the long run probability vector is 1 6

1   x =  41  . 3 1 4

1 . 12

3.7. APPLICATIONS

213

Since we initially started with 60 total robots, the steady-state distribution will be   10 15  60x =  20 . 15   13. First suppose that P = p1 · · · pn is a stochastic matrix (so that the elements of each pi sum to 1). Let j be a row vector consisting entirely of 1’s. Then   P    P (p1 )i · · · (pn )i = 1 · · · 1 = j. jP = j · p1 · · · j · pn = P Conversely, suppose that jP = j. Then j · pi = 1 for each i, so that j (pi )j = 1 for all i, showing that the column sums of P are 1 so that P is stochastic.     x1 y1 w1 z1 14. Let P = and Q = be stochastic matrices. Then x2 y2 w2 z2      x y1 w1 z1 x w + y1 w2 x1 z1 + y1 z2 PQ = 1 = 1 1 . x2 y2 w2 z2 x2 w1 + y2 w2 x2 z1 + y2 z2 Now, both P and Q are stochastic, so that x1 + x2 = y1 + y2 = z1 + z2 = w1 + w2 = 1. Then summing each column of P Q gives x1 w1 + y1 w2 + x2 w1 + y2 w2 = (x1 + x2 )w1 + (y1 + y2 )w2 = w1 + w2 = 1 x1 z1 + y1 z2 + x2 z1 + y2 z2 = (x1 + x2 )z1 + (y1 + y2 )z2 = z1 + z2 = 1. Thus each column of P Q sums to 1 so that P Q is also stochastic. 15. The transition matrix from Exercise 9 is P =

  0.750 0.338 . 0.250 0.662

To find the expected number of days until a wet day, we delete the second row and column of P  (corresponding to wet days), giving Q = 0.750 . Then the expected number of days until a wet day is (I − Q)−1 = (1 − 0.750)−1 = 4. We expect four days until the next wet day. 16. The transition matrix from Exercise 10 is  0.6 P = 0.2 0.2

0.1 0.7 0.2

 0.2 0.4 . 0.4

We delete the row and column of this matrix corresponding to “tall”, which is the first row and column, giving   0.7 0.4 Q= . 0.2 0.4 Then (I − Q)

−1



0.3 = −0.2

−0.4 0.6

−1

  6.0 4.0 = . 2.0 3.0

The column of this matrix corresponding to “short” is the second column, since this was the third column in the original matrix. Summing the values in the second column shows that the expected number of generations before a short person has a tall descendant is 7.

214

CHAPTER 3. MATRICES

17. The transition matrix from Exercise 11 is  0.08 P = 0.07 0.85

0.09 0.11 0.80

 0.11 0.05 . 0.84

Since we want to know how long we expect it to be before a good crop, we remove the first row and column, leaving   0.11 0.05 Q= . 0.80 0.84 Then −1

(I − Q)



−0.05 0.16

0.89 = −0.80

−1

  1.562 0.488 = . 7.812 8.691

Since we are starting from a fair crop, we sum the entries in the first column (which was the “fair” column in the original matrix), giving 1.562 + 7.812, or 9 or 10 years expected until the next good crop. 18. The transition matrix from Exercise 12 is 0

1 3

1  P =  21 2

0



0

1 3 1 3

1 4 1 4

0



0

1 3 . 2  3

1 2

0

The expected number of moves from any junction other than 4 to reach junction 4 is found by removing the fourth row and column of P , giving   0 31 14   Q =  12 0 14  . 1 1 0 2 3 Then 

1

 (I − Q)−1 = − 21 − 21

− 31 1 − 13

−1  22 − 41 13   − 14  =  15 13 16 1 13

10 13 21 13 12 13



8 13 9  , 13  20 13

15 16 53 so that a robot is expected to reach junction 4 starting from junction 1 in 22 13 + 13 + 13 = 13 ≈ 4.08 moves, 10 21 12 43 8 9 37 from junction 2 in 13 + 13 + 13 = 13 ≈ 3.31 moves, and from junction 3 in 13 + 13 + 20 13 = 13 ≈ 2.85 moves.

19. Since the sums of the columns of the matrix E are both 1, and all numbers are nonnegative, this matrix is an exchange matrix. To find a solution to Ex = x, or equivalently (E − I)x = 0, we row-reduce the augmented matrix of that system: " # " # " # −2R1 i h 1 1 1 1 − 12 0 − 0 1 − 0 − − − → 4 2 4 2 . E−I 0 : R2 +R1 1 − 41 0 −−− 0 0 0 0 0 0 −−→ 2 1 Thus   the solutions are of the form x1 = 2 t, x2 = t. Set t = 2 to avoid fractions, giving a price vector 1 . Of course, other values of t lead to equally valid price vectors. 2

20. Since neither column sum is 1, this is not an exchange matrix. 21. While the first column sums to 1, the second does not, so this is not an exchange matrix.

3.7. APPLICATIONS

215

22. Since the sums of the columns of the matrix E are both 1, and all numbers are nonnegative, this matrix is an exchange matrix. To find a solution to Ex = x, or equivalently (E − I)x = 0, we row-reduce the augmented matrix of that system:  0 :

 E−I

 −0.9 0.9

0.6 −0.6

 − 10 R1  9 0 − −− −→ 1 0 0

  0 −0.9 0.6 R2 +R1 0 − 0 0 −−−−→

− 23 0

 0 . 0

2 Thus   the solutions are of the form x1 = 3 t, x2 = t. Set t = 3 to avoid fractions, giving a price vector 2 . Of course, other values of t lead to equally valid price vectors. 3

23. Since the matrix contains a negative entry, it is not an exchange matrix. 24. Since the sums of the columns of the matrix E are both 1, and all numbers are nonnegative, this matrix is an exchange matrix. To find a solution to Ex = x, or equivalently (E − I)x = 0, we row-reduce the augmented matrix of that system:

h E−I

 − 12 i  0 :  0 1 2

1 −1 0

0 1 3 1 −3

  0 − 21  −R2  0 −−−→  0 R3 +R1 0 − 0 −−−−→

1 1 1

0 − 13 − 13  − 12   0 0

 R1 −R2 −−→ 0 −−−  0 R3 −R2 0 − −−−−→ 0 1 0

1 3 − 13

0

 −2R  1 1 0 0 −−−→   0 0 1 0 0 0

− 23 − 13 0

 0  0 . 0

Thus the solutions are of the form x1 = 32 t, x2 = 13 t, x3 = t. Set t = 3 to avoid fractions, giving a price   2 vector 1. Of course, other values of t lead to equally valid price vectors. 3 25. Since the sums of the columns of the matrix E are both 1, and all numbers are nonnegative, this matrix is an exchange matrix. To find a solution to Ex = x, or equivalently (E − I)x = 0, we row-reduce the augmented matrix of that system:

h E−I

 − 10 R1   7 0 0.2 0 −−− 1 0 − 27 0 −→    −0.5 0.3 0 0.3 0 0.3 −0.5 0.5 −0.5 0 0.4 0.5 −0.5 0      1 0 − 72 0 1 0 − 27 0 1 0    −2R2  R2 −0.3R1  1 27 1 27 −−−−−−→ 0 − 2 0 0 −−−→ 0 1 0 − 2 70 70 R3 +R2 R3 −0.4R1 1 27 − 0 0 0 0 0 0 0 0 − − − − − → −−−−−−→ 2 70

 −0.7 i  0 :  0.3 0.4

− 27 − 27 35 0

 0  0 . 0

2 27 Thus the solutions   are of the form x1 = 7 t, x2 = 35 t, x3 = t. Set t = 35 to avoid fractions, giving a 10 price vector 27. Of course, other values of t lead to equally valid price vectors. 35

26. Since the sums of the columns of the matrix E are both 1, and all numbers are nonnegative, this matrix is an exchange matrix. To find a solution to Ex = x, or equivalently (E − I)x = 0, we row-reduce the

216

CHAPTER 3. MATRICES augmented matrix of that system:   −2R  1 1 −1.40 −0.50 0.70 0.35 0 −−−→ i    0 :  0.25 −0.70 0.25 0 0.25 −0.70 0.25 0 −0.6 0 0.25 0   R1 −4R2  0 −2.40 1 −1.40 −0.70 0 −−−−−→ 1 0 −0.35 0.425 0 −0.35 0.425 0 R3 +R2 0 0.35 −0.425 0 − 0 0 −−−−→ 0   2.4 So one possible solution is 1.214. Multiplying through by 70 1   168 ≈  85  if an integral solution is desired. 70 h E−I

 −0.70 0  R −0.25R1 0.25 0 −−2−−−−−→ R3 −0.25R1 −0.6 0 − −−−−−−→   0 1 0 −2.40 1 − 3.5 R2  0 − −−−−→ 0 1 −1.214 0 0 0 0

 0 0 0

will roughly clear fractions, giving

27. This matrix is a consumption matrix since all of its entries are nonnegative. By Corollary 3.36, it is productive since the sum of the entries in each column is strictly less than 1. (Note that Corollary 3.35 does not apply since the sum of the entries in the second row exceeds 1). 28. This matrix is a consumption matrix since all of its entries are nonnegative. By Corollary 3.35, it is productive since the sum of the entries in each row is strictly less than 1. (Note that Corollary 3.36 does not apply since the sum of the entries in the third column exceeds 1). 29. This matrix is a consumption matrix since all of its entries are nonnegative. Since the entries in the second row as well as those in the second column sum to a number greater than 1, neither Corollary 3.35 nor 3.36 applies. So to determine if C is productive, we must use the definition, which involves seeing if I − C is invertible and if its inverse has all positive entries.       1 0 0 0.35 0.25 0 0.65 −0.25 0 0.45 −0.35 . I − C = 0 1 0 − 0.15 0.55 0.35 = −0.15 0 0 1 0.45 0.30 0.60 −0.45 −0.30 0.40 Using technology, (I − C)−1

  −13.33 −17.78 −15.56 ≈ −38.67 −46.22 −40.44 . −44.00 −54.67 −45.33

Since this is not a positive matrix, it follows that C is not productive. 30. This matrix is a consumption matrix since all of its entries are nonnegative. Since the entries in the first row as well as those in the each column sum to a number at least equal to 1, neither Corollary 3.35 nor 3.36 applies. So to determine if C is productive, we must use the definition, which involves seeing if I − C is invertible and if its inverse has all positive entries.       1 0 0 0 0.2 0.4 0.1 0.4 0.8 −0.4 −0.1 −0.4 0 1 0 0 0.3 0.2 0.2 0.1 −0.3 0.8 −0.2 −0.1     . I −C = 0 0 1 0 −  0 0.4 0.5 0.3 =  0 −0.4 0.5 −0.3 0 0 0 1 0.5 0 0.2 0.2 −0.5 0 −0.2 0.8 Using technology reveals that I − C is not invertible, so that C is not productive. 31. To find a solution x to (I − C)x = d, we invert I − C and compute x = (I − C)−1 d. Here # " # " # " 1 1 1 1 − 1 0 2 4 . I −C = − 21 41 = 1 − 21 0 1 2 2 2

3.7. APPLICATIONS Then det(I − C) =

217 1 2

·

1 2

− − 14



 − 12 = 18 , so that " 1 12 −1 (I − C) = 1/8 12

1 4 1 2

#

"

4 = 4

# 2 . 4

Thus a feasible production vector is −1

x = (I − C)

 4 d= 4

    2 1 10 = . 4 3 16

32. To find a solution x to (I − C)x = d, we invert I − C and compute x = (I − C)−1 d. Here       1 0 0.1 0.4 0.9 −0.4 I −C = − = . 0 1 0.3 0.2 −0.3 0.8 Then det(I − C) = 0.9 · 0.8 − (−0.4)(−0.3) = 0.60, so that " # " 4 0.8 0.4 1 −1 = 31 (I − C) = 0.60 0.3 0.9 2

2 3 3 2

# .

Thus a feasible production vector is " x = (I − C)−1 d =

4 3 1 2

2 3 3 2

#" # " # 10 2 = 35 . 1 2

33. To find a solution x to (I − C)x = d, we invert I − C and compute x = (I − C)−1 d. Here       1 0 0 0.5 0.2 0.1 0.5 −0.2 −0.1 0.6 −0.8 . I − C = 0 1 0 −  0 0.4 0.2 =  0 0 0 1 0 0 0.5 0 0 0.5   Then row-reduce I − C I to compute (I − C)−1 :     1 0 0 2 23 23 0.5 −0.2 −0.1 1 0 0 h i     I −C I = 0 0.6 −0.8 0 1 0 → 0 1 0 0 53 23  . 0 0 0.5 0 0 1 0 0 1 0 0 2 Thus a feasible production vector is  2  −1 x = (I − C) d = 0 0

 

  10   =  6 . 4 8

2 3 5 3

2 3 3 2   3  2

0

2

34. To find a solution x to (I − C)x = d, we invert I − C and compute x = (I − C)−1 d. Here       1 0 0 0.1 0.4 0.1 0.9 −0.4 −0.1 0 0.8 −0.2 . I − C = 0 1 0 −  0 0.2 0.2 =  0 0 1 0.3 0.2 0.3 −0.3 −0.2 0.7 Compute (I − C)−1 using technology to get (I − C)−1

  1.238 0.714 0.381 ≈ 0.143 1.429 0.429 . 0.571 0.714 1.714

Thus a feasible production vector is      1.238 0.714 0.381 1.1 4.624      x = (I − C)−1 d = 0.143 1.429 0.429 3.5 = 6.014 . 0.571 0.714 1.714 2.0 6.557

218

CHAPTER 3. MATRICES

35. Since x ≥ 0 and A ≥ O, then Ax has all nonnegative entries as well, so that Ax ≥ Ox = 0. But then x > Ax ≥ Ox = 0, so that x > 0. 36. (a) Since all entries of all matrices are positive, it is clear that AC ≥ O and BD ≥ O. To see that AC ≥ BD, let A = (aij ), B = (bij ), C = (cij ), and D = (dij ). Then each aij ≥ bij and each cij ≥ dij . So n n n X X X (AC)ij = aik ckj ≥ bik ckj ≥ bik dkj = (BD)ij . k=1

k=1

k=1

So every entry of AC is greater than or equal to the corresponding entry of BD, and thus AC ≥ BD.   (b) Let xT = x1 x2 · · · xn . Then for any i and j, we know that aij xj ≥ bij xj since aij > bij and xj ≥ 0. Since x 6= 0, we know that some component of x, say xk , is strictly positive. Then aik xk > bik xk since xk > 0. But then (Ax)i = ai1 x1 + · · · + aik xk + · · · + ain xn > bi1 x1 + · · · + bik xk + · · · + bin xn = (Bx)i . 37. Multiplying, we get 

    2 5 10 45 x1 = Lx0 = = 0.6 0 5 6      2 5 45 120 x2 = Lx1 = = 0.6 0 6 22.5      2 5 120 375 x3 = Lx2 = = 0.6 0 27 72 38. Multiplying, we get 

0 x1 = Lx0 = 0.2 0  0 x2 = Lx1 = 0.2 0  0 x3 = Lx2 = 0.2 0

    10 2 10 0  4  =  2  2 3 0     6 1 2 10 0 0  2  = 2 1 2 0.5 0     1 2 6 4 0 0 2 = 1.2 0.5 0 1 1 1 0 0.5

39. Multiplying, we get 

1 x1 = Lx0 = 0.7 0  1 x2 = Lx1 = 0.7 0  1 x3 = Lx2 = 0.7 0

    3 100 500 0 100 =  70  0 100 50     1 3 500 720 0 0  70  = 350 0.5 0 50 35     1 3 720 1175 0 0 350 =  504  . 0.5 0 35 175 1 0 0.5

3.7. APPLICATIONS

219

40. Multiplying, we get 

0 0.5 x1 = Lx0 =  0 0  0 0.5 x2 = Lx1 =  0 0  0 0.5 x3 = Lx2 =  0 0

1 0 0.7 0 1 0 0.7 0 1 0 0.7 0

    80 5 10 10  5  0   =   0 10  7  3 10 0     34 2 5 80  5   40  0 0   =   0 0  7  3.5 3 2.1 0.3 0     57.5 34 2 5     0 0   40  =  17  . 0 0 3.5  28  1.05 2.1 0.3 0 2 0 0 0.3

41. (a) The first ten generations of the species using each of the two models are as follows:

L1

x0 10 10

x1 50 8

x2 40 40

L2

  10 10

  50 8

  208 40

L1

 x6  640 640

L2

  64184.3 12252.2

 x3  200 32 

 872 166.4

 x4  160 160

 x5  800 128

  3654.4 697.6

  15315.2 2923.52

 x7  3200 512

 x8  2560 2560

 x9  12800 2048

 x10  10240 10240

 268989 51347.5

  1.127 × 106 215192

  4.724 × 106 901844

  1.980 × 107 . 3.780 × 106



(b) Graphs of the first ten generations using each model are:

Model L1

Percentage 1.0

Class 1

0.8

0.8

0.6

0.6

0.4

0.4

0.2

0.2

Class 2 0

2

4

6

Model L2

Percentage 1.0

8

10

TimeHyrsL

Class 2

Class 1 0

2

4

6

8

10

TimeHyrsL

The graphs suggest that the first model does not have a steady state, while the second Leslie matrix quickly approaches a steady state ratio of class 1 to class 2 population.

220

CHAPTER 3. MATRICES

  20 42. Start with x0 = 20. Then 20 

0 x1 = 0.1 0  0 x2 = 0.1 0  0 x3 = 0.1 0

0 0 0.5 0 0 0.5 0 0 0.5

    20 20 400 0  20 =  2  0 20 10     20 400 200 0   2  =  40  0 10 1     20 200 20 0   40  = 20 . 0 1 20

Thus after three generations the population is back to where it started, so that the population is cyclic.   20 43. Start with x0 = 20. Then 20      0 0 20 20 400 0  20 = 20s x1 =  s 0 0 0.5 0 20 10      200 0 0 20 400 0  20s = 400s x2 =  s 0 10 0 0.5 0 10s      200s 200 0 0 20 0  400s = 200s . x3 =  s 0 200s 10s 0 0.5 0 Thus after three generations the population is still evenly distributed among the three classes, but the total population has been multiplied by 10s. So • If s < 0.1, then the overall population declines after three years; • If s = 0.1, then the overall population is cyclic every three years; • If s > 0.1, then the overall population increases after three years. 44. The table gives us seven separate age classes, each of duration two years; from the table, the Leslie matrix of the system is   0 0.4 1.8 1.8 1.8 1.6 0.6 0.3 0 0 0 0 0 0    0 0.7 0 0 0 0 0   0 0.9 0 0 0 0 L= 0 . 0  0 0 0.9 0 0 0   0 0 0 0 0.9 0 0 0 0 0 0 0 0.6 0 Since in 1990,       46.4 42.06 10 13.92  3  2        2.1  8  1.4             x0 =   5  , so that x1 = Lx0 =  7.2  , and x2 = Lx1 =  1.26  .  6.48   4.5  12       10.8  4.05  0 0 6.48 1

3.7. APPLICATIONS

221

Then x1 and x2 represent the population distribution in 1991 and 1992. To project the population for the years 2000 and 2010, note that xn = Ln x0 . Thus     167.80 69.12  46.08  18.61      29.54  12.24     20    x10 = L10 x0 ≈  10.59 , x20 = L x0 ≈  24.35  .  20.06   8.29       16.48   6.51  9.05 3.57 45. The adjacency matrix contains a 1 in row i, column j  0 1 0 1 0 1  0 1 0 1 0 1

if there is an edge between vi and vj :  1 0 . 1 0

46. The adjacency matrix contains a 1 in row i, column j  1 0 0 0 1 1  0 1 0 1 0 0

if there is an edge between vi and vj :  1 0 . 0 0

47. The adjacency matrix contains a 1 in row i, column  0 1 1 1 0 1  1 1 0  1 0 1 1 0 0

j if there is an edge between vi and vj :  1 1 0 0  1 0 . 0 1 1 0

48. The adjacency matrix contains a 1 in row i, column  0 0 0 0 0 0  0 0 0  1 1 1 1 0 0

j if there is an edge between vi and vj :  1 1 1 0  1 0 . 0 0 0 0

49. v2

v4

v1

v3

222

CHAPTER 3. MATRICES

50. v4

v3

v1

v2

51. v4

v1

v2

v3

v5

52. v2 v4

v1

v5 v3

53. The adjacency matrix contains a 1 in row i, column j if there is an edge from vi to vj :  0 0  0 1

1 0 1 0

1 0 0 0

 0 0 . 1 0

54. The adjacency matrix contains a 1 in row i, column j if there is an edge from vi to vj :  0 0  1 0

0 0 1 1

1 0 0 1

 1 1 . 0 0

3.7. APPLICATIONS

223

55. The adjacency matrix contains a 1 in row i, column  0 1 0 1 0 0  1 1 0  1 0 0 1 0 0

j if there is an edge from vi to vj :  1 0 1 0  0 0 . 0 1 0 0

56. The adjacency matrix contains a 1 in row i, column  0 1 0 0 0 0  0 1 0  0 0 0 0 0 1

j if there is an edge from vi to vj :  1 1 0 1  1 0 . 0 1 0 0

57. v4

v3

v1

v2

58. v3

v4

v1

v2

59. v1

v5

v3

v4

v2

224

CHAPTER 3. MATRICES

60. v5

v2

v1

v3

v4

61. In Exercise 50,  0 1 A= 0 1

1 1 1 1

0 1 0 1

  2 1 2 1 2  , so A =  2 1 1 0

2 4 2 3

 1 3 . 1 3

2 2 2 1

The number of paths of length 2 between v1 and v2 is given by A2 length 2 between v1 and v2 .

 12

= 2. So there are two paths of

62. In Exercise 52,  0 0  A= 0 1 1

0 0 0 1 1

0 0 0 1 1

1 1 1 0 0

  2 1 2 1   2  1  , so A = 2 0  0 0 0

2 2 2 0 0

2 2 2 0 0

The number of paths of length 2 between v1 and v2 is given by A2 length 2 between v1 and v2 .

0 0 0 3 3  12

 0 0  0 . 3 3 = 2. So there are two paths of

63. In Exercise 50,  0 1 A= 0 1

 3 6 7 8 . 3 6 6 5  The number of paths of length 3 between v1 and v3 is given by A3 13 = 3. So there are three paths of length 3 between v1 and v3 . 1 1 1 1

0 1 0 1

  1 3  1  , so A3 = 7 3 1 0 6

7 11 7 8

64. In Exercise 52,  0 0  A= 0 1 1

 12 0 0 12 0 0   12 0 0  . 0 18 18 0 18 18  The number of paths of length 4 between v2 and v2 is given by A4 22 = 12. So there are twelve paths of length 4 between v2 and itself. 0 0 0 1 1

0 0 0 1 1

1 1 1 0 0

  1 12 12 1   4  1  , so A = 12 0 0 0 0

12 12 12 0 0

3.7. APPLICATIONS

225

65. In Exercise 57,  0 1 A= 0 1

  0 1  1  , so A2 = 1 1 0 1 1  The number of paths of length 2 from v1 to v3 is A2 13 = 0. to v3 . 1 0 1 0

0 0 0 1

0 1 0 2

0 1 0 1

 1 1 . 1 1

There are no paths of length 2 from v1

66. In Exercise 57,  0 1 A= 0 1

  0 1  1  , so A3 = 2 1 0 1 3  The number of paths of length 3 from v4 to v1 is A3 41 = 3. v4 to v1 . 1 0 1 0

0 0 0 1

1 2 1 2

1 1 1 1

 1 2 . 1 3

There are three paths of length 3 from

67. In Exercise 60,  0 0  A= 1 1 1

  1 1 1 0   3  1  , so A = 2 3  0 2 0  The number of paths of length 3 from v4 to v1 is A3 41 = 3. v4 to v1 . 1 0 0 0 1

0 0 0 1 0

0 1 1 0 0

1 1 3 3 1

1 0 0 1 1

1 1 3 1 1

 1 2  3 . 1 0

There are three paths of length 3 from

68. In Exercise 60,   3 1 3 0   4  1  , so A = 6 3 0 2 0  The number of paths of length 4 from v1 to v4 is A3 14 = 2. to v4 .  0 0  A= 1 1 1

1 0 0 0 1

0 0 0 1 0

0 1 1 0 0

2 3 5 4 2

1 1 3 1 1

2 1 3 4 2

 2 1  2 . 4 3

There are two paths of length 4 from v1

69. (a) If there are no ones in row i, then no edges join vertex i to any of the other vertices. Thus G is a disconnected graph, and vertex i is an isolated vertex. (b) If there are no ones in column j, then no edges join vertex j to any of the other vertices. Thus G is a disconnected graph, and vertex j is an isolated vertex. 70. (a) If row i of A2 is all zeros, then there are no paths of length 2 from vertex i to any other vertex. (b) If column j of A2 is all zeros, then there are no paths of length 2 from any vertex to vertex j. 71. The adjacency matrix for the digraph is  0 1  1 A= 1  0 1

0 0 0 1 0 0

0 1 0 0 0 1

0 0 1 0 0 1

1 1 1 1 0 0

 0 1  0 . 0  1 0

226

CHAPTER 3. MATRICES The number of wins that each player had is given by     1 1 1 4     1 3    A 1 = 3 ,     1 1 3 1 so the ranking is: first, P2 ; second, P3 , P4 , P6 (tie); wins and indirect wins, we compute    0 0 0 1 1 3 0 2    1 2 1 0   (A + A2 )  1 = 2 1 1    1 1 0 1 3 1 1 1

third, P1 , P5 (tie). Ranking instead by combined 0 2 1 0 1 2

1 3 3 3 0 3

    2 1 1 1 12 2         1  1 =  8  ,     2  1  9    1  4  1 10 1 0

so the ranking is: first, P2 ; second, P6 ; third, P4 ; fourth, P3 ; fifth, P5 ; sixth, P1 . 72. Let vertices 1 through 7 correspond to, in order, rodent, plant, insect, bird, fish, fox, and bear. Then the adjacency matrix is   0 1 0 0 0 0 0 0 0 0 0 0 0 0   0 1 0 0 0 0 0    A= 0 1 1 0 1 0 0 . 0 1 1 0 0 0 0   1 0 0 1 0 0 0 1 0 0 0 1 1 0 (a) The number of direct sources of food is the number of ones in each row, so it is     1 1 1 0     1 1        A 1 = 3 . 1 2     1 2 3 1 Thus birds and bears have the most direct sources of food, 3. (b) The number of species for which a given species is a direct source of food is the number of ones in its column. Clearly plants, with a value of 4, are a direct source of food for the greatest number of species. Note the analogy of parts (a) and (b) with the previous exercise, where eating a given species is equivalent to winning that head-to-head match, and being eaten is equivalent to losing it.   (c) Since A2 ij gives the number of paths of length 2 from vertex i to vertex j, it follows that A2 ij is the number of ways in which species j is an indirect food source for species i. So summing the entries of the ith row of A2 gives the total number of indirect food sources:     1 0 1 0     1 0     2   A 1 =  3 . 1 1     1 4 1 5

3.7. APPLICATIONS

227

Thus bears have the most indirect food sources. Combined direct and indirect food sources are given by     1 1 1 0     1 1     2    (A + A ) 1 =  6 . 1 3     1 6 8 1 (d) The matrix A∗ is the matrix A after removing  0 0 0 0  0 1 A∗ =  0 1  1 0 1 0 Then we have

row 2 and column 2:  0 0 0 0 0 0 0 0  0 1 0 0 . 0 0 0 0  1 0 0 0 0 1 1 0

    0 1 1 0       2 ∗ 1  A  =  , 1 1 1 2 3 1

so the species with the most direct food sources is the bear, with three. The species that is a direct food source for the most species is the species for which the column sum is the largest; three species, rodents, insects, and birds, each are prey for two species. Note that the column sums can be computed as   1 1   1    1 1 1 1 1 1 A∗ or (A∗ )T  1 .   1 1 The number of indirect food sources each species has is     1 0 1 0       2 1  1 (A∗ )  1 = 0 ,     1 2 1 3 and the number of combined direct and indirect food sources is     1 0 1 0        2 1  = 3 . A∗ + (A∗ )  1 1     1 4 1 6 Thus birds have lost the most combined direct and indirect sources of food (5). The bear, fox, and fish species have each lost two combined sources of food, and rodents and insects have each lost one combined food source.

228

CHAPTER 3. MATRICES 5

(e) Since (A∗ ) is the zero matrix, after a while, no species will have a direct food source, so that the ecosystem will be wiped out. 73. (a) Using the relationships given, we get the digraph

Bert

Ehaz

Carla

Dana

Ann

Letting vertices 1 through 5 correspond to the people in alphabetical order, the adjacency matrix is   0 0 1 0 1 0 0 1 1 0    A= 0 0 0 0 1 . 1 0 1 0 0 0 1 0 0 0  k (b) The element A ij is 1 if there is a path of length k from i to j, so there is a path of length at   most k from i to j if A + A2 + · · · + Ak ij ≥ 1. If Bert hears a rumor and we want to know how many steps it takes before everyone else has heard the rumor, then we must calculate A, A + A2 , A + A2 + A3 , and so forth, until every entry in row 2 (except for the one in column 2, which corresponds to Bert himself) is nonzero. This is not true for A, since for example A21 = 0. Next,   0 1 1 0 2 1 0 2 1 1    A + A2 =  0 1 0 0 1 , 1 0 2 0 2 0 1 1 1 0 so everyone else will have heard the rumor, either directly or indirectly, from Bert after two steps (Carla will have heard it twice). (c) Using the same logic as in part (b), we must compute successive sums of powers of A until every entry in the first row, except for the entry in the first column corresponding to Ann herself, is  nonzero. From part (b), we see that this is not the case for A + A2 since A + A2 14 = 0. Next,   0 2 2 1 2 1 1 3 1 3   2 3  A+A +A = 0 1 1 1 1 , 1 2 2 0 3 1 1 2 1 1 so everyone else will have heard the rumor from Ann after three steps. In fact, everyone but Dana will have heard it twice.   (d) j is reachable from i by a path of length k if Ak ij is nonzero. Now, if j is reachable from i at all, it is reachable by a path of length at most n, where n is the number of vertices in the digraph, so we must compute   A + A2 + A3 + · · · + An ij and see if it is nonzero. If it is, the two vertices are connected.

3.7. APPLICATIONS

229

74. Let G be a graph with m vertices, and A = (aij ) its adjacency matrix. (a) The basis for the induction is n = 1, where the statement says that aij is the number of direct paths between vertices i and j. But this is just the definition of the adjacency matrix, since aij = 1 if i and j have an edge between them and is zero otherwise. Next, assume [Ak ]ij is the Pm k+1 k number of paths of length k between i and j. Then [A ]ij = l=1 [A ]il [A]lj . For a given vertex l, [Ak ]il gives the number of paths of length k from i to l, and [A]lj gives the number of paths of length 1 from l to j (this is either zero or one). So the product gives the number of paths of length k + 1 from i to j passing through l. Summing over all vertices l in the graph gives all paths of length k + 1 from i to j, proving the inductive step. (b) If G is a digraph, the statement of the result must be modified to say that the (i, j) entry of An is equal to the number of n-paths from i to j; the proof is valid since the adjacency matrix of a digraph takes arrow directions into account. Pm 75. [AAT ]ij = k=1 aik ajk . So this matrix entry is the number of vertices k that have edges from both vertex i and vertex j. 76. From the diagram in Exercise 49, this graph is bipartite, with U = {v1 } and V = {v2 , v3 , v4 }. 77. From the diagram in Exercise 52, this graph is bipartite, with U = {v1 , v2 , v3 } and V = {v4 , v5 }. 78. The graph in Exercise 51 is not bipartite. For suppose v1 ∈ U . Then since there are edges from v1 to both v3 and v4 , we must have v3 , v4 ∈ V . Next, there are edges from v3 to v5 , and from v4 to v2 , so that v2 , v5 ∈ U . But there is an edge from v2 to v5 . So this graph cannot be bipartite. 79. Examining the matrix, we see that vertices 1, 2, and 4 are connected to vertices 3, 5, and 6 only, so that if we define U = {v1 , v2 , v4 } and V = {v3 , v5 , v6 }, we have shown that this graph is bipartite. 80. Suppose that G has n vertices. (a) First suppose that the adjacency matrix of G is of the form shown. Suppose that the zero blocks have p and q = n − p rows respectively. Then any edges at vertices v1 , v2 , . . . , vp have their other vertex in vertices vp+1 , vp+2 , . . . , vn , since the upper left p × p matrix is O. Similarly, any edges at vertices vp+1 , vp+2 , . . . , vn have their other vertex in vertices v1 , v2 , . . . , vp , since the lower right q × q matrix is O. So G is bipartite, with U = {v1 , v2 , . . . , vp } and V = {vp+1 , vp+2 , . . . , vn }. Next suppose that G is bipartite. Label the vertices so that U = {v1 , v2 , . . . , vp } and V = {vp+1 , vp+2 , . . . , vn }. Now consider the adjacency matrix A of G. aij = 0 if i, j ≤ p, since then both vi and vj are in U ; similarly, aij = 0 if i, j > p, since then both vi and vj are in V . So the upper left p × p submatrix as well as the bottom right q × q submatrix are the zero matrix. The remaining two submatrices are transposes of each other since this is an undirected graph, so that aij = 1 if and only if aji = 1. Thus A is of the form shown. (b) Suppose G is bipartite and A its adjacency matrix, of the form given in part (a). It suffices to show that odd powers of A have zero blocks at upper left of size p × p and at lower right of size q × q, since then there are no paths of odd length between any vertex and itself. We do this by induction. Clearly A1 = A is of that form. Now,       OO + BB T OB + BO BB T O D1 O A2 = = = O D2 B T O + OB T B T B + OO O BT B where D1 = BB T and D2 = B T B. Suppose that A2k−1 has the required zero blocks. Then    O C1 D1 O A2k+1 = A2k−1 A2 = C2 O O D2   OD1 + C1 O OO + C1 D2 = C2 D1 + OO D1 O + OD2   O C 1 D2 = . C2 D1 O

230

CHAPTER 3. MATRICES Thus A2k+1 is of this form as well. So no odd power of A has any entries in the upper left or lower right block, so there are no odd paths beginning and ending at the same vertex. Thus all circuits must be of even length. (Note that in the matrix A2k+1 above, we must in fact have (C1 D2 )T = C2 D1 since powers of a symmetric matrix are symmetric, but we do not need this fact.)

Chapter Review 1. (a) True. See Exercise 34 in Section 3.2. If A is an m × n matrix, then AT is an n × m matrix. Then AAT is defined and is an m × m matrix, and AT A is also defined and is an n × n matrix. (b) False. This is true only when A is invertible. For example, if      1 0 0 0 0 A= , B= , then AB = 0 1 0 1 0

 0 , 0

but A 6= O. (c) False. What is true is that XAA−1 = BA−1 so that X = BA−1 . Matrix multiplication is not commutative, so that BA−1 6= A−1 B in general. (d) True. This is Theorem 3.11 in Section 3.3. (e) True. If E performs kRi , then E is a matrix with all zero entries off the diagonal, so it is its own transpose and thus E T is also the elementary matrix that performs kRi . If E performs Ri + kRj , then the only off-diagonal entry in E is [E]ij = k, so the only off-diagonal entry of E T is [E T ]ji = k and thus E T is the elementary matrix performing Rj + kRi . Finally, if E performs Ri ↔ Rj , then [E]ii = [E]jj = 0 and [E]ij = [E]ji = 1, so that E T = E and thus E T is also an elementary matrix exchanging Ri and Rj . (f ) False. Elementary matrices perform a single row operation, while their products can perform multiple row operations. There are many examples. For instance, consider the elementary matrices corresponding to R1 ↔ R2 and 2R1 :      0 1 2 0 0 1 = , 1 0 0 1 2 0 which is not an elementary matrix. (g) True. See Theorem 3.21 in Section 3.5. (h) False. For this to be true, the plane must pass through the origin. (i) True, since T (c1 u + c2 v) = −(c1 u + c2 v) = −c1 u − c2 v = c1 (−u) + c2 (−v) = c1 T (u) + c2 T (v). (j) False; the matrix is 5 × 4, not 4 × 5. See Theorems 3.30 and 3.31 in Section 3.6. 2. Since A is 2 × 2, so is A2 . Since B is 2 × 3, the product makes sense. Then          1 2 1 2 7 12 7 12 2 0 −1 50 −36 2 2 A = = , so that A B = = 3 5 3 5 18 31 18 31 3 −3 4 129 −93

 41 . 106

3. B 2 is not defined, since the number of columns of B is not equal to the number of rows of B. So this matrix product does not make sense. 4. Since B T is 3 × 2, A−1 is 2 × 2, and B is 2 × 3, this matrix product makes sense as long as A is invertible. Since det A = 1 · 5 − 2 · 3 = −1, we have     1 5 −2 −5 2 = , A−1 = 1 3 −1 −1 −3

Chapter Review

231

so that 

2 B T A−1 B =  0 −1

 3  −5 −3 3 4

 −1 0 −1 = −9 −3 4 17

 2 2 −1 3

 1  2 3 3 −6



   1 −3 5 0 −1 21 . = −9 −9 −3 4 16 18 −41

5. Since BB T is always defined (see exercise 34 in section 3.2) and is square, (BB T )−1 makes sense if BB T is invertible. We have       2 3 2 0 −1  5 2 0 −3 = BB T = . 3 −3 4 2 34 −1 4 Then det(BB T ) = 5 · 34 − 22˙ = 166, so that (BB T )−1 =

  1 34 −2 . 5 166 −2

6. Since B T B is always defined (see exercise 34 in section 3.2) and is square, (B T B)−1 makes sense if B T B is invertible. We have      2 3  13 −9 10 2 0 −1 9 −12 . B T B =  0 −3 = −9 3 −3 4 −1 4 10 −12 17   To compute the inverse, we row-reduce B T B I : 

h BT B

13 −9 10  I = −9 9 −12 10 −12 17 i

1 0 0

0 1 0

  13 0 9  R2 + 13 R1  0 −−−−−−→  0 R3 − 10 R1 1 − −−−13 −−→ 0

−9

10

1

36 13 66 − 13

− 66 13

9 13 10 − 13

121 13

 0 0  1 0 0 1

 13 −9 10  36 66  0 13 − 13 R3 + 66 R 2 0 0 −−−−36 −−→ 0

1 9 13 1 2

 0 0  1 0 . 11 1 6

Since the matrix on the left has a zero row, B T B is not invertible, so (B T B)−1 does not exist. 7. The outer product expansion of AAT is   1  T T T 1 AA = a1 A1 + a2 A2 = 3

   2  3 + 2 5

  1 5 = 3

    3 4 10 5 + = 9 10 25 13

 13 . 34

8. By Theorem 3.9 in Section 3.3, we know that (A−1 )−1 = A, so we must find the inverse of the given matrix. Now, det A−1 = 12 · 4 − (−1) − 32 = 12 , so that " # " # 8 2 1 4 1 −1 −1 A = (A ) = = . 1/2 23 12 3 1  −1 9. Since AX = B =  5 3

 −3 0, we have −2  1 0 −1 1  2 3 −1 0 0 1 1 0

X = A−1 B. To find A−1 , we compute

0 1 0

  0 1 0 0   0 → 0 1 0 1 0 0 1

2 −1 1

− 12 1 2 − 12



3 2  − 21  . 3 2

232

CHAPTER 3. MATRICES Then





− 21

2  X = A−1 B = −1 1

3 −1 2  − 21   5 3 3 2

1 2 − 12

  0 −3   0 = 2 1 −2

 −9  4 . −6

10. Since det A = 1 · 6 − 2 · 4 = −2 6= 0, A is invertible, so by Theorem 3.12 in Section 3.3, it can be written as a product of elementary matrices. As in Example 3.29 in Section 3.3, we start by row-reducing A: A=

 1 4

  2 1 R2 −4R1 6 −−−−−→ 0

  1 2 − 21 R2 −2 −−−−→ 0

 R1 −2R2  2 − −−−−→ 1 0 1

 0 . 1

So by applying "

1 E1 = 4

# 0 , 1

" 1 E2 = 0

0

#

− 12

,

h E3 = 1

−2

0

i 1

in that order to A we get the identity matrix, so that E3 E2 E1 A = I. Since elementary matrices are invertible, this means that A = (E3 E2 E1 )−1 = E1−1 E2−1 E3−1 . Since E1 is R2 − 4R1 , it follows that E1−1 is R2 + 4R1 . Since E2 is − 12 R2 , it follows that E2−1 is −2R2 . Since E3 is R1 − 2R2 , it follows that E3−1 is R1 + 2R2 . Thus     1 0 1 0 1 2 −1 −1 −1 A = E1 E2 E3 = . −1 1 0 −2 0 1 11. To show that (I − A)−1 = I + A + A2 , it suffices to show that (I + A + A2 )(I − A) = I. But (I + A + A2 )(I − A) = II − IA + AI − A2 + A2 I − A3 = I − A + A − A2 + A2 − A3 = I + A3 = I + O = I. 12. Use the multiplier method. We want to reduce A to an upper triangular matrix:  1 A = 3 2

  1 1 1 R2 −3R1 1 1 −−−−−→ 0 R3 −2R1 −1 1 − −−−−→ 0

  1 1 1 0 −2 −2 R3 − 23 R2 −3 −1 − −−−−−→ 0

 1 1 −2 −2 = U. 0 2

Then `21 = 3, `31 = 2, and `32 = 23 , so that  1  L = 3 2 13. We follow the form of A:  2 1 4

0 1 3 2

 0  0 . 1

comment after Example 3.48 in Section 3.5. Start by finding the reduced row echelon   −4 5 8 5 1 R1 ↔R2  −2 2 3 1− −−−−→ 2 −8 3 2 6 4  1 0 R +5R2 0 −−3−−−→ R1 −R3  −−− −−→ 1 R2 −3R3 0 −−−−−→ 0

  −2 2 3 1 1 −2 R2 −2R1 −4 5 8 5 −−−−−→ 0 0 R3 −4R1 −8 3 2 6 − 0 0 −−−−→   1 −2 2 −2 2 3 1 0 0 1 0 1 2 3 1 8 R3 0 0 0 8 −− 0 0 0 −→  R1 −2R2  −2 2 3 0 − −−−−→ 1 −2 0 0 1 2 0 0 0 0 0 1 0 0

 2 3 1 1 2 3 −5 −10 2  3 1 2 3 0 1  0 −1 0 1 2 0 . 0 0 1

Chapter Review

233

Thus a basis for row(A) is given by the nonzero rows of the matrix, or      { 1 −2 0 −1 0 , 0 0 1 2 0 , 0 0 0

0

 1 }.

Note that since the matrix is rank 3, so all three rows are nonzero, the rows of the original matrix also provide a basis for row(A). A basis for col(A) is given by the column vectors of the original matrix A corresponding to the leading 1s in the row-reduced matrix, so a basis for col(A) is       5 5   2 1 , 2 , 1 .   4 3 6 Finally, from the row-reduced form, the solutions to Ax = 0 where   x1 x2     x= x3  x4  x5 can be found by noting that x2 = s and x4 = t are free variables and that x1 = 2s + t, x3 = −2t, and x5 = 0. Thus the nullspace of A is       1 2 2s + t  0 1  s         −2t  = s 0 + t −2 ,        1 0  t  0 0 0 so that a basis for the nullspace is given by   2 1    { 0 , 0 0

 1  0   −2}.    1 0 

14. The row spaces will be the same since by Theorem 3.20 in Section 3.5, if A and B are row-equivalent, they have the same row spaces. However, the column spaces need not be the same. For example, let     1 0 0 1 0 0 0 1 0 = B. A = 0 1 0 R3 −R1 1 0 0 − −−−−→ 0 0 0 Then A and B are row-equivalent, but         1 0 1 0 col(A) = span 0 , 1 but col(B) = span 0 , 1 . 1 0 0 0 15. If A is invertible, then so is AT . But by Theorem 3.27 in Section 3.5, invertible matrices have trivial null spaces, so that null(A) = null(AT ) = 0. If A is not invertible, then the null spaces need not be the same. For example, let     1 1 1 1 0 0 A = 0 0 0 , so that AT = 1 0 0 . 0 0 0 1 0 0

234

CHAPTER 3. MATRICES Then AT row reduces to  1 0 0 But

0 0 0

     0 0 0 0 , so that null(AT ) = span 1 , 0 . 0 0 1     −1 −1 null(A) = span  1 ,  0 . 0 1

Since every vector in null(AT ) has a zero first component, neither of the basis vectors for null(A) is in null(AT ), so the spaces cannot be equal. 16. If the rows of A add up to zero, then the rows are linearly dependent, so by Theorem 3.27 in Section 3.5, A is not invertible. 17. Since A has n linearly independent columns, rank(A) = n. Theorem 3.28 of Section 3.5 tells us that rank(AT A) = rank(A) = n, so AT A is invertible since it is an n × n matrix. For the case of AAT , since m may be greater than n, there is no reason to believe that this matrix must be invertible. One counterexample is            1 0  1 0 1 1 0 1 0 1 1 0 1 1 0 1 , AAT = 0 1 A = 0 1 , AT = = 0 1 0 → 0 1 0 . 0 1 0 0 1 0 1 0 0 0 0 1 0 1 0 1 Thus AAT is not invertible, since it row-reduces to a matrix with a zero row.   18. The matrix of T is [T ] = T (e1 ) T (e2 ) . From the given data, we compute              1 1 2 1 0 1 1 1 1 1 + = + = T =T 4 0 2 1 2 −1 2 3 2 5              1 1 1 1 2 1 0 0 1 1 T =T − = − = . 1 −1 2 1 2 −1 2 3 2 5 Thus [T ] =

 1 4

 1 . −1 

19. Let R be the rotation and P the projection. The given line has direction vector Examples 3.58 and 3.59 in Section 3.6, " cos 45◦ [R] = sin 45◦

− sin 45◦ cos 45◦

" 12 1 [P ] = 2 1 + 22 1 · (−2)

"√

# =



2 √2 2 2

1 · (−2)



#

" =

(−2)2

2 √2 2 2

 1 . Then from −2

#

1 5 − 25

− 25 4 5

# .

So R followed by P is " [P ◦ R] = [P ][R] =

1 5 − 25

− 52 4 5

# "√

2 √2 2 2





2 √2 2 2

# =

" √ − 102 √

2 5



− 3102 √ 3 2 5

# .

20. Suppose that cv + dT (v) = 0. We want to show that c = d = 0. Since T is linear and T 2 (v) = 0, applying T to both sides of this equation gives T (cv + T (dT (v))) = cT (v) + dT (T (v)) = cT (v) + dT 2 (v) = cT (v) = 0. Since T (v) 6= 0, it follows that c = 0. But then cv + dT (v) = dT (v) = 0. Again since T (v) 6= 0, it follows that d = 0. Thus c = d = 0 and v and T (v) are linearly independent.

Chapter 4

Eigenvalues and Eigenvectors 4.1

Introduction to Eigenvalues and Eigenvectors

1. Following example 4.1, to see that v is an eigenvector of A, we show that Av is a multiple of v:          0 3 1 0·1+3·1 3 1 Av = = = =3 = 3v. 3 0 1 3·1+0·1 3 1 Thus v is an eigenvector of A with eigenvalue 3. 2. Following example 4.1, to see that  1 Av = 2

v is an eigenvector of A, we show that Av is a multiple of v:       2 3 1 · 3 + 2 · (−3) −3 = = = −v 1 −3 2 · 3 + 1 · (−3) 3

Thus v is an eigenvector of A with eigenvalue −1. 3. Following example 4.1, to see that v is an eigenvector of A, we show that Av is a multiple of v:          −1 1 1 −1 · 1 + 1 · (−2) −3 1 Av = = = = −3 = −3v. 6 0 −2 6 · 1 + 0 · (−2) 6 −2 Thus v is an eigenvector of A with eigenvalue −3. 4. Following example 4.1, to see that v is an eigenvector of A, we show that Av is a multiple of v:        4 −2 4 4·4−2·2 12 Av = = = = 3v 5 −7 2 5·4−7·2 6 Thus v is an eigenvector of A with eigenvalue 3. 5. Following example 4.1, to see that v is an eigenvector of A, we show that Av is a multiple of v:          6 2 3 0 0 2 3 · 2 + 0 · (−1) + 0 · 1 Av = 0 1 −2 −1 = 0 · 2 + 1 · (−1) − 2 · 1 = −3 = 3 −1 = 3v. 1 0 1 1 1 · 2 + 0 · (−1) + 1 · 1 3 1 Thus v is an eigenvector of A with eigenvalue 3. 6. Following example 4.1, to see that v is an eigenvector of A, we show that Av is a multiple of v:        0 1 −1 −2 0 · (−2) + 1 · 1 − 1 · 1 0 1  1 = 1 · (−2) + 1 · 1 + 1 · 1 = 0 = 0v Av = 1 1 1 2 0 1 1 · (−2) + 2 · 1 + 0 · 1 0 Thus v is an eigenvector of A with eigenvalue 0. 235

236

CHAPTER 4. EIGENVALUES AND EIGENVECTORS

7. We want to show that λ = 3 is an eigenvalue of  2 A= 2

 2 −1

Following Example 4.2, that means we wish to find a vector v with Av = 3v, i.e. with (A − 3I)v = 0. So we show that (A − 3I)v = 0 has nontrivial solutions.       2 2 3 0 −1 2 A − 3I = − = . 2 −1 0 3 2 −4 Row-reducing, we get   −R1     −1 2 − 1 −2 −−→ 1 −2 . R2 −2R1 2 −4 2 −4 − 0 −−−−→ 0   x It follows that any vector with x = 2y is an eigenvector, so for example one eigenvector correy   2 sponding to λ = 3 is . 1 8. We want to show that λ = −1 is an eigenvalue of  A=

2 3

 3 2

Following Example 4.2, that means we wish to find a vector v with Av = −v, i.e. with (A − (−I))v = (A + I)v = 0. So we show that (A + I)v = 0 has nontrivial solutions.       2 3 1 0 3 3 A+I = + = . 3 2 0 1 3 3 Row-reducing, we get  3 3

  3 3 R2 −R1 3 − −−−−→ 0

 1 R1  3 3 −− −→ 1 0 0

1 0



  x It follows that any vector with x = −y is an eigenvector, so for example one eigenvector corre y 1 sponding to λ = −1 is . −1 9. We want to show that λ = 1 is an eigenvalue of 

0 A= −1

 4 5

Following Example 4.2, that means we wish to find a vector v with Av = 1v, i.e. with (A − I)v = 0. So we show that (A − I)v = 0 has nontrivial solutions.       0 4 1 0 −1 4 A−I = − = . −1 5 0 1 −1 4 Row-reducing, we get   −R1     −1 4 − 1 −4 −−→ 1 −4 . R2 +R1 −1 4 −1 4 − 0 −−−−→ 0   x It follows that any vector with x = 4y is an eigenvector, so for example one eigenvector correy   4 sponding to λ = 1 is . 1

4.1. INTRODUCTION TO EIGENVALUES AND EIGENVECTORS 10. We want to show that λ = −6 is an eigenvalue of  4 A= 5

237

 −2 −7

Following Example 4.2, that means we wish to find a vector v with Av = −6v, i.e. with (A−(−6I))v = (A + 6I)v = 0. So we show that (A + 6I)v = 0 has nontrivial solutions.       4 −2 6 0 10 −2 + = . A + 6I = 5 −7 0 6 5 −1 Row-reducing, we get " 10 5

# −2

" # 1R " 1 10 −2 −10 −−→ 1

− 15

0

0

R −1R

2 1 −1 −−2−−− −→

  x It follows that any vector with x = y   1 sponding to λ = −6 is . 5

1 5y

0

0

#

is an eigenvector, so for example one eigenvector corre-

11. We want to show that λ = −1 is an eigenvalue of 

1 A = −1 2

0 1 0

 2 1 1

Following Example 4.2, that means we wish to find a vector v with Av = −1v, i.e. with (A + I)v = 0. So we show that (A + I)v = 0 has nontrivial solutions.       2 0 2 1 0 0 1 0 2 A + I = −1 1 1 + 0 1 0 = −1 2 1 . 2 0 2 0 0 1 2 0 1 Row-reducing, we get 

2 −1 2

0 2 0

  2 2 −1 1 R3 −R1 2 − 0 −−−−→

0 2 0

   −R1   2 −1 2 1 − −−→ 1 −2 −1 R1 ↔R2  R2 −2R1 2 1 − 2 0 2 0 2 − −−−−→ −−−−→ 0 0 0 0 0 0 0   R1 +2R2    1 −2 −1 1 1 −2 −1 − −−−−→ 1 0  0 4 R2 0 4 4 −− 1 1 −→ 0 0 0 0 0 0 0

0 1 0

 1 1 0

  x It follows that any vector y  with x = y = −z is an eigenvector, so for example one eigenvector z  −1 corresponding to λ = −1 is −1. 1 12. We want to show that λ = 2 is an eigenvalue of  3 A = 1 4

1 1 2

 −1 1 0

238

CHAPTER 4. EIGENVALUES AND EIGENVECTORS Following example 4.2, that means we wish to find a vector v with Av = 2v, i.e. with (A − 2I)v = 0. So we show that (A − 2I)v = 0 has nontrivial solutions.       3 1 −1 2 0 0 1 1 −1 1 − 0 2 0 = 1 −1 1 A − 2I = 1 1 4 2 0 0 0 2 4 2 −2 Row-reducing, we have  1 1 4

  1 −1 1 R2 −R1 −1 1 −−− −−→ 0 R3 −4R1 2 −2 − −−−−→ 0

It follows that any vector

  1 −1 1 0 −2 2 R3 −R2 −2 2 − −−−−→ 0 hxi y z

  1 1 −1 − 21 R2  −2 2 − −−−→ 0 0 0 0

1 1 0

 R1 −R2  −1 − −−−−→ 1 0 −1 0 0

0 1 0

 0 −1 0

  0 with x = 0 and y = z is an eigenvector. Thus for example 1 is an 1

eigenvector for λ = 2. 13. Since A is the reflection F in the y-axis,   the only vectors that F maps parallel to themselves are vectors 1 parallel to the x-axis (multiples of ), which are reversed (eigenvalue λ = −1) and vectors parallel 0   0 to the y-axis (multiples of ), which are sent to themselves (eigenvalue λ = 1). So λ = ±1 are the 1 eigenvalues of A, with eigenspaces     1 0 E−1 = span E1 = span . 0 1

F He1 L = e1

e-1

F He-1 L =- e-1

x

F HxL

14. The only vectors that the transformation F corresponding to A maps to themselves are vectors perpendicular to y = x, which are reversed (so correspond to the eigenvalue λ = −1) and vectors parallel to y = x, which are sent to (so correspond to the eigenvalue λ=1. Vectors perpendicular  themselves  1 1 to y = x are multiples of ; vectors parallel to y = x are multiples of , so the eigenspaces are −1 1  E−1 = span

1 −1

 E1 = span

  1 . 1

4.1. INTRODUCTION TO EIGENVALUES AND EIGENVECTORS

239

F He1 L = e1 F He-1 L =- e-1

F HxL

e-1 x

15. If x is an eigenvalue of A, then Ax is a multiple  1 Ax = 0

of x:     0 x x = . 0 y 0

  x Then clearly any vector parallel to the x-axis, i.e., of the form , is an eigenvector for A with 0 eigenvalue 1. But this projection also transforms vectors parallel to the y-axis  to the zero vector, since 0 the projection of such a vector on the x-axis is 0. So vectors of the form are eigenvectors for A y   x with eigenvalue 0. Finally, a vector of the form where neither x nor y is zero is not an eigenvector: y   x it is transformed into , and comparing the x-coordinates shows that λ would have to be 1, while 0 comparing y-coordinates shows that it would have to be zero. Thus there is no such λ. So in summary,   1 λ = 1 is an eigenvalue with eigenspace E1 = span 0   0 λ = 0 is an eigenvalue with eigenspace E0 = span . 1 16. This projection sends a vector to its projection on the given line. Then a vector parallel to the direction vector of that line will be sent to itself, since its projection on the line is the vector itself. A vector orthogonal to the direction vector of the line will be "sent zero vector, since its projection on the # to the   4 4 line is the zero vector. Since the direction vector is 53 , or , there are two eigenspaces 3 5

  4 span , λ = 1; 3

 span

 −3 , λ = −1. 4

If a vector v is neither parallel to nor orthogonal to the direction vector of the line, then the projection transforms v to a vector parallel to the direction vector, thus not parallel to v. Thus the image of v cannot be a multiple of v. Hence the eigenvalues and eigenspaces above are the only ones.   x 17. Consider the action of A on a vector x = . If y = 0, then A multiplies x by 2, so this is an y eigenspace for λ = 2. If x = 0, then A multiplies x by 3, so this is an eigenspace for λ = 3. If neither x

240

CHAPTER 4. EIGENVALUES AND EIGENVECTORS nor y is zero, then

      x 2x x is mapped to , which is not a multiple of . So the only eigenspaces are y 3y y     1 0 span , λ = 2; span , λ = 3. 0 1

18. Since A rotates the plane 90◦ counterclockwise around the origin, any nonzero vector is taken to a vector orthogonal to itself. So there are no nonzero vectors that are taken to vectors parallel to themselves (that is, to multiples of themselves). The only vector taken to a multiple of itself is the zero vector; this is not an eigenvector since eigenvectors are by definition nonzero. So this matrix has no eigenvectors and no eigenvalues. Analytically, the matrix of A is     cos θ − sin θ 0 −1 A= = , sin θ cos θ 1 0 so that

 0 Ax = 1

    −1 x −y = . 0 y x

If this is a multiple of the original vector, then there is a constant a such that x = −ay and y = ax. Substituting ax for y in the first equation gives x = −a2 x, which is impossible unless x = 0. But then the second equation forces y = 0. So again we see that there are no eigenvectors. 19. Since eigenvectors x are those taken to a multiple of themselves, i.e., to a vector parallel to themselves, by A, we look for unit vectors whose image under A is in the same direction. It appears that these    vec 1 0 tors are vectors along the x and y-axes, so the eigenspaces are E1 = span and E2 = span . 0 1 The length of the resultant vectors in the first eigenspace appears are same length as the original vector, so that λ1 = 1. The length of the resultant vectors in the second eigenspace are twice the original length, so that λ2 = 2. 20. Since eigenvectors x are those taken to a multiple of themselves, i.e., to a vector parallel to themselves, by A, we look for unit vectors whose image under A is in the same direction. The only vectors satisfying 1 this criterion are those parallel to the line y = x, i.e., span , so this is the only eigenspace. The 1   1 terminal point of the line in the direction appears to be the point (2, 5), so that it has a length of 1 √ √ 29. Since the original vector is a unit vector, the eigenvalue is 29. 21. Since eigenvectors x are those taken to a multiple of themselves, i.e., to a vector parallel to themselves, by A, we look for unit vectors  whose image under A is in the same direction. The vectors parallel to 1 the line y = x, i.e., span satisfy this criterion, so this is an eigenspace E1 . The vectors parallel 1   1 to the line y = −x, i.e. span , also satisfy this criterion, so this is another eigenspace E2 . For −1 √ E1 , the terminal point of the line along y = x appears to be (2, 2),√so its length is 2 2. Since the original vector is a unit vector, the corresponding eigenvalue is λ1 = 2 2. For E2 , the resultant vectors are zero length, so λ2 = 0. 22. Since eigenvectors x are those taken to a multiple of themselves, i.e., to a vector parallel to themselves, by A, we look for unit vectors whose image under A is in the same direction. There are none — every vector is “bent” to the left by the transformation. So this transformation has no eigenvectors. 23. Using the method of Example 4.5, we wish to find solutions to the equation       4 −1 λ 0 4−λ −1 0 = det(A − λI) = det − = det 2 1 0 λ 2 1−λ = (4 − λ)(1 − λ) − 2 · (−1) = λ2 − 5λ + 6 = (λ − 2)(λ − 3).

4.1. INTRODUCTION TO EIGENVALUES AND EIGENVECTORS

241

Thus the eigenvalues are 2 and 3. To find the eigenspace corresponding to λ = 2, we must find the null space of A − 2I. " # " # h i 2 −1 0 1 − 21 0 → A − 2I 0 = 2 −1 0 0 0 0 so that eigenvectors corresponding to this eigenvalue have the form 1    t 2t , or . 2t t   1 A basis for this eigenspace is given by choosing for example t = 1 to get . As a check, 2    1 4 A = 2 2

        −1 1 4·1−1·2 2 1 = = =2 . 1 2 2·1+1·2 4 2

To find the eigenspace corresponding to λ = 3, we must find the null space of A − 3I.       1 −1 0 1 −1 0 A − 3I 0 = → 2 −2 0 0 0 0 so that eigenvectors corresponding to this eigenvalue have the form  t . t A basis for this eigenspace is given by choosing for example t = 1 to get

A

   1 4 = 1 2

        −1 1 4·1−1·1 3 1 = = =3 1 1 2·1+1·1 3 1

The plot below shows the effect of A on the eigenvectors     1 1 v= w= . 2 1 4

3

2v

2

3w

1

  1 . As a check, 1

v

w 1

2

3

242

CHAPTER 4. EIGENVALUES AND EIGENVECTORS

24. Using the method of Example 4.5, we wish to find solutions to the equation  0 = det(A − λI) = det

  4 λ − 0 0

2 6

  0 2−λ = det λ 6

 4 −λ

= (2 − λ)(−λ) − 4 · 6 = λ2 − 2λ − 24 = (λ − 6)(λ + 4). Thus the eigenvalues are −4 and 6. To find the eigenspace corresponding to λ = −4, we must find the null space of A + 4I.

h

" i 6 0 = 6

A + 4I

4

0

4

0

# →

" 1

2 3

# 0

0

0

0

so that eigenvectors corresponding to this eigenvalue have the form  2  −3t , t

 or

 −2t 3t

  −2 A basis for this eigenspace is given by choosing for example t = 1 to get . As a check, 3

A

   −2 2 = 3 6

4 0

        −2 2 · (−2) + 4 · 3 8 −2 = = = −4 . 3 6 · (−2) + 0 · 3 −12 3

To find the eigenspace corresponding to λ = 6, we must find the null space of A − 6I.

 A − 6I

 −4 0 = 6 

4 −6

  0 1 → 0 0

−1 0

0 0



so that eigenvectors corresponding to this eigenvalue have the form  t t   1 A basis for this eigenspace is given by choosing for example t = 1 to get . As a check, 1    1 2 A = 1 6

4 0

        −1 2·1+4·1 6 1 = = =6 1 6·1+0·1 6 1

The plot below shows the effect of A on the eigenvectors

v=

  −2 3

w=

  1 . 1

4.1. INTRODUCTION TO EIGENVALUES AND EIGENVECTORS

243

6

6w 4

v 2

w 2

-2

4

6

8

-2

-4

-6

-8

-10

-4 v -12

25. Using the method of Example 4.5, we wish to find solutions to the equation       2 5 λ 0 2−λ 5 0 = det(A − λI) = det − = det 0 2 0 λ 0 2−λ = (2 − λ)(2 − λ) − 5 · 0 = (λ − 2)2 Thus the only eigenvalue is λ = 2. To find the eigenspace corresponding to λ = 2, we must find the null space of A − 2I. # " # " i h 0 1 0 0 5 0 → A − 2I 0 = 0 0 0 0 0 0 so that eigenvectors corresponding to this eigenvalue have the form   t . 0   1 A basis for this eigenspace is given by choosing for example t = 1, to get . As a check, 0            1 2 5 1 2·1+5·0 2 1 A = = = =2 . 0 0 2 0 0·1+2·0 0 0   1 The plot below shows the effect of A on the eigenvector v = : 0 v

2v 1

2

26. Using the method of Example 4.5, we wish to find solutions to the equation       1 2 λ 0 1−λ 2 0 = det(A − λI) = det − = det −2 3 0 λ −2 3−λ = (1 − λ)(3 − λ) + 4 = λ2 − 4λ + 7 Since this quadratic has no (real) roots, A has no (real) eigenvalues.

244

CHAPTER 4. EIGENVALUES AND EIGENVECTORS

27. Using the method of Example 4.5, we wish to find solutions to the equation  0 = det(A − λI) = det

1 −1

  1 λ − 1 0

  0 1−λ = det λ −1

 1 1−λ

= (1 − λ)(1 − λ) − 1 · (−1) = λ2 − 2λ + 2. Using the quadratic formula, this quadratic has roots √ 2 ± 2i 2 ± 22 − 4 · 2 = = 1 ± i. λ= 2 2 Thus the eigenvalues are 1 + i and 1 − i. To find the eigenspace corresponding to λ = 1 + i, we must find the null space of A − (1 + i)I.   iR1       −i 1 0 − i 0 1 i 0 −→ 1 A − (1 + i)I 0 = R2 +R1 −1 −i 0 −1 −i 0 − −−−−→ 0 0 0 so that eigenvectors corresponding to this eigenvalue have the form   −it . t A basis for this eigenspace is given by choosing for example t = 1, to get

  −i . 1

To find the eigenspace corresponding to λ = 1 − i, we must find the null space of A − (1 − i)I.   −iR1       i 1 0 − 1 −i 0 −−→ 1 −i 0 A − (1 − i)I 0 = R2 +R1 −1 i 0 −1 i 0 −−−−−→ 0 0 0 so that eigenvectors corresponding to this eigenvalue have the form   it . t A basis for this eigenspace is given by choosing for example t = 1, to get

  i . 1

28. We wish to find solutions to the equation  0 = det(A − λI) = det

2 1

  −3 λ − 0 0

  0 2−λ = det λ 1

 −3 −λ

= (2 − λ)(−λ) − (−3) · 1 = λ2 − 2λ + 3. √ = 1 ± i 2, so these are the eigenvalues. √ √ To find the eigenspace corresponding to λ = 1 + i 2, we must find the null space of A − (1 + i 2)I:

Using the quadratic formula, this has roots λ =

 √ A − (1 + i 2)I

√   1−i 2 0 = 1

√ 2± 4−4·3 2

 −3 √ 0 R1 ↔R2 −−−−−→ −1 − i 2 0 √    1√ −1 − i 2 0 1 √ R2 −(1−i 2)R1 1−i 2 −3 0 − −−−−−−−−−→ 0

so that eigenvectors corresponding to this eigenvalue have the form  √  (1 + i 2)t . t

√ −1 − i 2 0

 0 , 0

4.1. INTRODUCTION TO EIGENVALUES AND EIGENVECTORS

245

 √  1+i 2 . As a check, 1 √  √      √  √   √   √ −1 + 2i 2 −3 1 + i 2 2(1 + i√ 2) − 3 · 1 1+i 2 √ 2 = (1 + i 2) 1 + i 2 = = = A 1 0 1 1 1 (1 + i 2) + 0 · 1 1+i 2 √ √ To find the eigenspace corresponding to λ = 1 − i 2, we must find the null space of A − (1 − i 2)I. √     √ −3 √ 0 R1 ↔R2 1+i 2 −−−−−→ A − (1 − i 2)I 0 = 1 −1 + i 2 0 √     √ 1√ −1 + i 2 0 1 −1 + i 2 0 √ R2 −(1+i 2)R1 0 0 −3 0 − 1+i 2 −−−−−−−−−→ 0

A basis for this eigenspace is given by choosing for example t = 1, to get

so that eigenvectors corresponding to this eigenvalue have the form  √  (1 − i 2)t . t  √  1−i 2 A basis for this eigenspace is given by choosing for example t = 1, to get . As a check, 1 √ √       √   √  √   √ 2 2 −3 1 − i 2 2(1 − i√ 2) − 3 · 1 −1 − 2i 1−i 2 1 − i 2 √ A = = = = (1 − i 2) 1 0 1 1 1 (1 − i 2) + 0 · 1 1−i 2 29. Using the method of Example 4.5, we wish to find solutions to the equation       1 i λ 0 1−λ i 0 = det(A − λI) = det − = det i 1 0 λ i 1−λ = (1 − λ)(1 − λ) − i · i = λ2 − 2λ + 2. Using the quadratic formula, this quadratic has roots √ 2 ± 22 − 4 · 2 2 ± 2i λ= = = 1 ± i. 2 2 Thus the eigenvalues are 1 + i and 1 − i. To find the eigenspace corresponding to λ = 1 + i, we must find the null space of A − (1 + i)I.   iR1       −i i 0 − 1 −1 0 1 −1 0 − → A − (1 + i)I 0 = R2 −iR1 i −i 0 i −i 0 − 0 0 −−−−→ 0 so that eigenvectors corresponding to this eigenvalue have the form  t . t A basis for this eigenspace is given by choosing for example t = 1, to get

  1 . 1

To find the eigenspace corresponding to λ = 1 − i, we must find the null space of   −iR1      i i 0 − 1 1 −−→ 1 1 0 A − (1 − i)I 0 = R2 −iR1 i i 0 i i 0 − −−−−→ 0 0 so that eigenvectors corresponding to this eigenvalue have the form   −t . t A basis for this eigenspace is given by choosing for example t = 1, to get

  −1 . 1

A − (1 − i)I.  0 0

246

CHAPTER 4. EIGENVALUES AND EIGENVECTORS

30. Using the method of Example 4.5, we wish to find solutions to the equation  0 = det(A − λI) = det

  0 1+i λ − 1−i 1 0

0 λ



 = det

−λ 1−i

1+i 1−λ



= (−λ)(1 − λ) − (1 + i)(1 − i) = λ2 − λ − 2 = (λ − 2)(λ + 1) Thus the eigenvalues are 2 and −1. To find the eigenspace corresponding to λ = 2, we must find the null space of A − 2I. " # 1 " # " − 2 R1 i h −2 1 + i 0 −−− 1 − 12 (1 + i) 0 1 − 12 (1 + i) − → A − 2I 0 = R2 −(1−i)R1 1 − i −1 0 1−i −1 0 −−−−−−−−→ 0 0

# 0 0

so that eigenvectors corresponding to this eigenvalue have the form " # " # 1 (1 + i)t 2 (1 + i)t , or, clearing fractions, . t 2t A basis for this eigenspace is given by choosing for example t = 1, to get

  1+i . 2

To find the eigenspace corresponding to λ = −1, we must find the null space of A − (−I) = A + I.  A+I

 0 =



1 1+i 1−i 2

  1 1+i 0 R2 −(1−i)R1 0 − 0 −−−−−−−→ 0

 0 0

so that eigenvectors corresponding to this eigenvalue have the form   −(1 + i)t . t A basis for this eigenspace is given by choosing for example t = 1, to get

  −1 − i . 1

31. Using the method of Example 4.5, we wish to find solutions in Z3 to the equation  0 = det(A − λI) = det

1 1

  0 λ − 2 0

  0 1−λ = det λ 1

 0 2−λ

= (1 − λ)(2 − λ) − 0 · 1 = (λ − 1)(λ − 2). Thus the eigenvalues are 1 and 2. 32. Using the method of Example 4.5, we wish to find solutions in Z3 to the equation  0 = det(A − λI) = det

2 1

  1 λ − 2 0

  0 2−λ = det λ 1

 1 2−λ

= (2 − λ)(2 − λ) − 1 · 1 = λ2 − 4λ + 4 − 1 = λ2 + 2λ = λ(λ + 2) = λ(λ − 1). Thus the eigenvalues are 0 and 1. 33. Using the method of Example 4.5, we wish to find solutions in Z5 to the equation  0 = det(A − λI) = det

3 4

  1 λ − 0 0

  0 3−λ = det λ 4

 1 −λ

= (3 − λ)(−λ) − 1 · 4 = λ2 − 3λ − 4 = λ2 + 2λ + 1 = (λ + 1)2 = (λ − 4)2 . Thus the only eigenvalue is 4.

4.1. INTRODUCTION TO EIGENVALUES AND EIGENVECTORS

247

34. Using the method of Example 4.5, we wish to find solutions in Z5 to the equation  0 = det(A − λI) = det

1 4

  4 λ − 0 0

  0 1−λ = det λ 4

 4 −λ

= (1 − λ)(−λ) − 4 · 4 = λ2 − λ − 16 = λ2 + 4λ + 4 = (λ + 2)2 = (λ − 3)2 . Thus the only eigenvalue is 3. 35. (a) To find the eigenvalues of A, find solutions to the equation 0 = det(A − λI) = det

 a c

  b λ − d 0

  0 a−λ = det λ c

 b d−λ

= (a − λ)(d − λ) − bc = λ2 − (a + d)λ + (ad − bc) = λ2 − tr(A)λ + det(A). (b) Using the quadratic formula to find the roots of this quadratic gives p tr(A) ± (tr(A))2 − 4 det(A) λ= 2 √ a + d ± a2 + 2ad + d2 − 4ad + 4bc = 2  p 1 2 = a + d ± a − 2ad + d2 + 4bc 2  p 1 = a + d ± (a − d)2 + 4bc . 2 (c) Let  p 1 a + d + (a − d)2 + 4bc , 2 be the two eigenvalues. Then λ1 =

λ2 =

 p 1 a + d − (a − d)2 + 4bc 2

 1  p p 1 a + d + (a − d)2 + 4bc + a + d − (a − d)2 + 4bc 2 2 1 = (a + d + a + d) 2 = a + d = tr(A),     1   p p 1 2 2 λ1 λ2 = a + d − (a − d) + 4bc a + d − (a − d) + 4bc 2 2  1 (a + d)2 − (a − d)2 + 4bc = 4 1 = (a2 + 2ad + d2 − a2 + 2ad − d2 − 4bc) 4 1 = (4ad − 4bc) = det(A). 4

λ1 + λ2 =

36. A quadratic equation will have two distinct real roots when its discriminant (the quantity under the square root sign in the quadratic formula) is positive, one real root when it is zero, and no real roots when it is negative. In Exercise 35, the discriminant is (a − d)2 + 4bc. (a) For A to have to distinct real eigenvalues, we must have (a − d)2 + 4bc > 0. (b) For A to have a single real eigenvalue, it must be the case that (a − d)2 + 4bc = 0, so that (a − d)2 = −4bc. Note that this implies that either a = d and at least one of b or c is zero, or that a 6= d and either b or c, but not both, is negative. (c) For A to have no real eigenvalues, we must have (a − c)2 + 4bc < 0.

248

CHAPTER 4. EIGENVALUES AND EIGENVECTORS

37. From Exercise 35, since c = 0, the eigenvalues are  1 p 1 a + d ± (a − d)2 + 4b · 0 = (a + d ± |a − d|) 2 2   1 1 (a + d + a − d), (a + d + d − a) = {a, d}. = 2 2 To find the eigenspace corresponding to λ = a, we must find the null space of A − aI.     0 b 0 A − aI 0 = . 0 d−a 0 If b = 0 and a = d, then this is the zero matrix, so that any vector is an eigenvector; a basis for the eigenspace is     1 0 , . 0 1 The reason for this is that under these conditions, the matrix A is a multiple of the identity matrix, so it simply stretches any vector by a factor of a = d. Otherwise, either b 6= 0 or d − a 6= 0, so the above matrix row-reduces to   0 1 0 0 0 0 so that eigenvectors corresponding to this eigenvalue have the form   t . 0   1 A basis for this eigenspace is given by choosing for example t = 1, to get . As a check, 0         1 a·1+b·0 a 1 A = = =a . 0 0·1+d·0 0 0 To find the eigenspace corresponding to λ = d, we must find the null space of A − dI.     a−d b 0 A − dI 0 = . 0 0 0 If b = 0 and a = d, then this is again the zero matrix, which we dealt with above. Otherwise, either b 6= 0 or d − a 6= 0, so that eigenvectors corresponding to this eigenvalue have the form     −bt bt , or . (a − d)t (d − a)t 

 b A basis for this eigenspace is given by choosing for example t = 1, to get . As a check, d−a 

       b a · b + b · (d − a) bd b = = =d . A d−a 0 · b + d · (d − a) d(d − a) d−a 38. Using the method of Example 4.5, we wish to find solutions to the equation  0 = det(A − λI) = det

  a b λ − −b a 0

  0 a−λ = det λ −b

b a−λ



= (a − λ)(a − λ) − 4 · (−b) = λ2 − 2aλ + (a2 + 4b).

4.1. INTRODUCTION TO EIGENVALUES AND EIGENVECTORS

249

Using Exercise 35, the eigenvalues are  p 1 1p a + a ± (a − a)2 + 4b · (−b) = a ± −4b2 = a ± bi. 2 2 To find the eigenspace corresponding to λ = a + bi, we must find the null space of A − (a + bi)I.  A − (a + bi)I

  −bi b 0 = −b −bi

 0 . 0

If b = 0 then this is the zero matrix, so that any vector is an eigenvector; a basis for the eigenspace is     1 0 , . 0 1 The reason for this is that under these conditions, the matrix A is a times the identity matrix, so it simply stretches any vector by a factor of a. Otherwise, b 6= 0, so we can row-reduce the matrix:  −bi b −b −bi

 1 1  0 −b−iR i −→ 1 −b −bi 0

  0 1 R2 +bR1 0 − −−−−→ 0

i 0

 0 . 0

Thus eigenvectors corresponding to this eigenvalue have the form     −it t = . t it   1 A basis for this eigenspace is given by choosing for example t = 1, to get . As a check, i       1 a + bi 1 A = = (a + bi) . i −b + ai i To find the eigenspace corresponding to λ = a − bi, we must find the null space of A − (a − bi)I.  A − (a − bi)I



bi 0 = −b 

b bi

 0 . 0

If b = 0 then this is the zero matrix, which we dealt with above. Otherwise, b 6= 0, so we can row-reduce the matrix:   − 1 iR1     bi b 0 − −b−−→ 1 −i 0 R +bR 1 −i 0 . 1 −b bi 0 −b bi 0 −−2−−−→ 0 0 0 Thus eigenvectors corresponding to this eigenvalue have the form   it . t   i A basis for this eigenspace is given by choosing for example t = 1, to get . As a check, 1 A

        i ai + b b + ai i = = = (a − bi) . 1 −bi + a a − bi 1

250

4.2

CHAPTER 4. EIGENVALUES AND EIGENVECTORS

Determinants

1. Expanding along the first row gives 1 0 3 5 1 1 = 1 1 1 − 0 5 1 + 3 5 0 0 2 1 2 0 1 2 Expanding along the first column gives 1 0 3 5 1 1 = 1 1 1 − 5 0 3 + 0 0 1 1 2 1 2 0 1 2 2. Expanding along the first row gives 0 1 −1 2 3 −2 = 0 3 3 −1 3 0

1 = 1(1 · 2 − 1 · 1) − 0 + 3(5 · 1 − 1 · 0) = 16. 1

3 = 1(1 · 2 − 1 · 1) − 5(0 · 2 − 3 · 1) + 0 = 16. 1

2 −2 + (−1) −1 0

2 −2 − 1 −1 0

3 3

= 0 − 1(2 · 0 − (−2) · (−1)) − 1(2 · 3 − 3 · (−1)) = −7. Expanding along the first column gives 0 1 −1 2 3 −2 = 0 3 −2 − 2 1 3 3 0 −1 3 0

1 −1 + (−1) 3 0

−1 −2

= 0 − 2(1 · 0 − (−1) · 3) − 1(1 · (−2) − (−1) · 3) = −7. 3. Expanding along the first row gives 1 −1 0 −1 1 = 1 0 −1 0 1 −(−1) 0 1 −1 0 1 −1

−1 1 +0 0 −1

Expanding along the first column gives 1 −1 0 −1 −1 0 0 1 −1 0 1 = 1 −(−1) 1 −1 +0 0 1 −1 0 1 −1 4. Expanding 1 1 0

along the first row gives 1 0 0 1 − 1 1 1 + 0 1 0 1 = 1 0 1 1 0 1 1 1

Expanding 1 1 0

along the first column gives 1 0 0 1 − 1 1 0 + 0 1 0 1 = 1 0 1 1 1 1 1 1

5. Expanding along the first row gives 1 2 3 2 3 1 = 1 3 1 − 2 2 1 + 3 2 1 2 3 2 3 3 1 2

0 = 1(0·(−1)−1·1)+1(−1·(−1)−1·0)+0 = 0. 1

0 = 1(0·(−1)−1·1)+1(−1·(−1)−0·1)+0 = 0. 1

0 = 1(0 · 1 − 1 · 1) − 1(1 · 1 − 1 · 0) + 0 = −2. 1

0 = 1(0 · 1 − 1 · 1) − 1(1 · 1 − 1 · 0) + 0 = −2. 1

3 = 1(3 · 2 − 1 · 1) − 2(2 · 2 − 1 · 3) + 3(2 · 1 − 3 · 3) = −18. 1

4.2. DETERMINANTS

251

Expanding along the first column gives 1 2 3 2 3 1 = 1 3 1 − 2 2 1 + 3 2 3 = 1(3 · 2 − 1 · 1) − 2(2 · 2 − 1 · 3) + 3(2 · 1 − 3 · 3) = −18. 3 1 3 2 1 2 3 1 2 6. Expanding along the 1 2 3 4 5 6 = 1 5 8 7 8 9 Expanding along the 1 2 3 4 5 6 = 1 5 8 7 8 9 7. Noting the pair of 5 −1 3

first row gives 4 6 + 3 7 9

4 6 − 2 7 9

5 = 1(5 · 9 − 6 · 8) − 2(4 · 9 − 6 · 7) + 3(4 · 8 − 5 · 7) = 0. 8

first column gives 2 3 + 7 5 9

2 6 − 4 8 9

3 = 1(5 · 9 − 6 · 8) − 4(2 · 9 − 3 · 8) + 7(2 · 6 − 3 · 5) = 0. 6

zeros in the third row, use cofactor expansion along the third row to get 2 2 5 2 5 2 2 2 = 3(2 · 2 − 2 · 1) − 0 + 0 = 6. 1 2 = 3 + 0 − 0 −1 1 −1 2 1 2 0 0

8. Noting the zero in the second row, use cofactor expansion along the second row to get 1 1 −1 1 1 −1 1 2 0 1 = −2 = −2(1 · 1 − (−1) · (−2)) − 1(1 · (−2) − 1 · 3) = 7. − 1 3 −2 −2 1 3 −2 1 9. Noting the zero in the bottom right, use the third row, since the multipliers there are smaller: −4 1 3 2 −2 4 = 1 1 3 − (−1) −4 3 = 1(1 · 4 − 3 · (−2)) + 1(−4 · 4 − 3 · 2) = −12. 2 4 −2 4 1 −1 0 10. Noting that cos θ 0 0

the first column has only one nonzero entry, we expand along the first column: sin θ tan θ cos θ − sin θ = cos θ(cos θ · cos θ − (− sin θ) · sin θ) = cos θ(cos2 θ + sin2 θ) = cos θ. sin θ cos θ

11. Use the first column: a b 0 0 a b = a a 0 a 0 b

b b + a b a

12. Use the first row (other choices 0 b 0

0 = a(a · b − b · 0) + a(b · b − 0 · a) = a2 b + ab2 . b

are equally easy): a 0 b d = −a(b · 0 − d · 0) = 0. c d = −a 0 0 e 0

252

CHAPTER 4. EIGENVALUES AND EIGENVECTORS

13. Use the third row, since it contains only one nonzero entry: 1 −1 0 3 1 0 2 5 2 6 = (−1)3+2 · 1 2 2 0 1 0 0 1 2 1 4 2 1

1 3 6 = − 2 1 1

0 2 2

3 6 . 1

To compute this determinant, use cofactor expansion along the first row: 1 0 3 2 6 − 3 2 2 = −1(2 · 1 − 6 · 2) − 3(2 · 2 − 2 · 1) = 4. − 2 2 6 = −1 1 2 2 1 1 2 1 14. Noting that the second column has only one nonzero entry, we expand along that column. The nonzero entry is in row 3, column 2, so it gets a minus sign, since (−1)3+2 = −1: 2 0 3 −1 2 3 −1 2 3 −1 1 0 2 2 2 . 2 = 1 2 = −(−1) 1 2 0 −1 1 4 2 1 −3 2 1 −3 2 0 1 −3 To evaluate the remaining determinant, we use expansion along the first row: 2 3 −1 1 2 1 2 2 2 1 2 2 = 2 − 1 − 3 2 1 2 −3 1 −3 2 1 −3 = 2(2 · (−3) − 2 · 1) − 3(1 · (−3) − 2 · 2) − 1(1 · 1 − 2 · 2) = 8. 15. Expand along the first row: 0 0 0 g

0 0 d h

0 a 0 b c 0 = −a e f g i j

Now expand along the first row again: 0 0 b 0 −a 0 d e = −ab g g h i

0 d h

b e . i

d = −ab(0 · h − dg) = abdg. h

16. Using the method of Example 4.9, we get 105

1 4 7

2 3 5 6 8 9

1 4 7

48

72

2 5 8

45

84

96

so that the determinant is −(105 + 48 + 72) + (45 + 84 + 96) = 0. 17. Using the method of Example 4.9, 1 2 3

1 −1 0 1 −2 1

0

−2

1 2 3

1 0 −2

0

3

so that the determinant is −(0 + (−2) + 2) + (0 + 3 + 4) = 7.

2

4

4.2. DETERMINANTS

253

18. Using the method of Example 4.9, we get 0

a 0 a

b a 0

0

0 a b 0 b a

b a 0

a2 b 2

0

ab2

0

2

so that the determinant is −(0 + 0 + 0) + (a b + ab + 0) = a2 b + ab2 . 19. Consider (2) in the text. The three upward-pointing diagonals have product a31 a22 a13 , a32 a23 a11 , and a33 a21 a12 , while the downward-pointing diagonals have product a11 a22 a33 , a12 a23 a31 , and a13 a21 a32 , so the determinant by that method is det A = −(a31 a22 a13 + a32 a23 a11 + a33 a21 a12 ) + (a11 a22 a33 + a12 a23 a31 + a13 a21 a32 ). Expanding along the first row gives a21 a a23 − a det A = a11 22 12 a31 a32 a33

a21 a23 + a 13 a31 a33

a22 a32

= a11 a22 a33 − a11 a32 a23 − a12 a21 a33 + a12 a23 a31 + a13 a21 a32 − a13 a22 a31 , which, after rearranging, is the same as the first answer.   a11 a12 20. Consider the 2 × 2 matrix A = . Then a21 a22 C11 = (−1)1+1 det[a22 ] = a22 ,

C12 = (−1)1+2 det[a21 ] = −a21 ,

so that from definition (4) det A =

2 X

a1j C1j = a11 C11 + a12 C12 = a11 a12 + a12 · (−a21 ) = a11 a22 − a12 a21 ,

j=1

which agrees with the definition of a 2 × 2 determinant. 21. If A is lower triangular, then AT is upper triangular. Since by Theorem 4.10 det A = det AT , if we prove the result for upper triangular matrices, we will have proven it for lower triangular matrices as well. So assume A is upper triangular. The basis for the induction is the case n = 1, where A = [a11 ] and det A = a11 , which is clearly the “product” of the entries on the diagonal. Now assume that the statement holds for n × n upper triangular matrices, and let A be an (n + 1) × (n + 1) upper triangular matrix. Using cofactor expansion along the last row, we get det A = (−1)(n+1)+(n+1) an+1,n+1 An+1,n+1 = an+1,n+1 An+1,n+1 . But An+1,n+1 is an n × n upper triangular matrix with diagonal entries a1 , a2 , . . . , an , so applying the inductive hypothesis we get det A = an+1,n+1 An+1,n+1 = an+1,n+1 (a1 a2 · · · an ) = a1 a2 · · · an an+1 , proving the result for (n + 1) × (n + 1) matrices. So by induction, the statement is true for all n ≥ 1. 22. Row-reducing the matrix in gives 1 5 0

Exercise 1, and modifying the determinant as required by Theorem 4.3, 0 1 1

1 0 3 R2 −5R1 1 − −−−−→ 0 1 0 1 2

1 3 0 −14 R3 −R2 2 − −−−−→ 0

0 1 0

3 −14 16

Note that we did not introduce any factors in front of the determinant since all of the row operations were of the form Ri + kRj , which does not change the determinant. The final determinant is easy to evaluate since the matrix is upper triangular; the determinant of the original matrix is 1 · 1 · 16 = 16.

254

CHAPTER 4. EIGENVALUES AND EIGENVECTORS

23. Row-reducing the matrix in Exercise 9, and gives 1 −1 −4 1 3 R1 ↔R3 2 −2 4 − −−−−→ − 2 −2 −4 1 −1 0 1

modifying the determinant as required by Theorem 4.3, 1 0 R −2R 2 1 4 −−−−−→ − 0 R3 +4R1 0 3 − −−−−→

1 −1 0 R2 ↔R3 0 4 −−−−−→ 0 0 −3 3

−1 0 −3 3 . 0 4

There were two row interchanges, each of which introduced a factor of −1 in the determinant. The other row operations were of the form Ri + kRj , which does not change the determinant. The final determinant is easy to evaluate since the matrix is upper triangular; the determinant of the original matrix is 1 · (−3) · 4 = −12. 24. Row-reducing the matrix in Exercise 13, and modifying the determinant as required by Theorem 4.3, gives 1 2 0 1

1 −1 0 −1 0 3 0 1 0 7 2 0 R2 ↔R3 − 7 2 1 0 0 −−−−−→ 0 0 5 2 5 2 −2 1 −1 1 −1 0 3 0 0 1 1 0 0 − R3 −7R2 − 0 0 −−−−−→ 0 0 2 0 R4 −R3 R4 −5R2 − 0 0 0 0 2 −2 − − − − → −−−−−→

1 −1 0 3 R −2R1 5 2 6 −−2−−−→ 0 0 1 0 0 R4 −R1 0 4 2 1 − −−−−→

3 0 0 −2 0 3 0 0 . 2 0 0 −2

There was one row interchange, which introduced a factor of −1 in the determinant. The other row operations were of the form Ri + kRj , which does not change the determinant. The final determinant is easy to evaluate since the matrix is upper triangular; the determinant of the original matrix is −(1 · 1 · 2 · (−2)) = 4. 25. Row-reducing the matrix in Exercise 14, and modifying the determinant as required by Theorem 4.3, gives 2 1 0 2

1 0 2 0 3 −1 2 0 3 0 2 2 R1 ↔R2 − −1 1 4 −−−−−→ 0 −1 1 2 0 1 0 1 −3 1 0 2 0 −1 1 R ↔R3 −−2−−−→ 0 0 −1 0 0 −3

1 0 2 2 2 R −2R1 0 0 −1 −5 −1 −−2−−−→ − 0 −1 1 4 4 R2 −2R1 0 0 −3 −7 −3 − −−−−→ 1 2 0 2 2 0 −1 4 1 4 0 . −5 0 −1 −5 R −3R 3 −7 −−4−−−→ 0 0 0 8

There were two row interchanges, each of which introduced a factor of −1 in the determinant. The other row operations were of the form Ri + kRj , which does not change the determinant. The final determinant is easy to evaluate since the matrix is upper triangular; the determinant of the original matrix is 1 · (−1) · (−1) · 8 = 8. 26. Since the third row is a multiple of the first row, the determinant is zero since we can use the row operation R3 − 2R1 to zero the third row, and doing so does not change the determinant. 27. This matrix is upper triangular, so its determinant is the product of the diagonal entries, which is 3 · (−2) · 4 = −24. 28. If we interchange the first and third rows, the result is an upper triangular matrix with diagonal entries 3, 5, and 1. Since interchanging the rows negates the determinant, the determinant of the original matrix is −(3 · 5 · 1) = −15.

4.2. DETERMINANTS

255

29. The third column is −2 times the first column, so the columns are not linearly independent and thus the matrix does not have rank 3, so is not invertible. By Theorem 4.6, its determinant is zero. 30. Since the third row is the sum of the first and second rows, the rows are not linearly independent and thus the matrix does not have rank 3, so is not invertible. By Theorem 4.6, its determinant is zero. 31. Since the first column is the sum of the second and third columns, the columns are not linearly independent and thus the matrix does not have rank 3, so is not invertible. By Theorem 4.6, its determinant is zero. 32. Interchanging the second and third rows gives the identity matrix, which has determinant 1. Since interchanging the rows introduces a minus sign, the determinant of the original matrix is −1. 33. Interchange the first and second rows, and also the third and fourth rows, to give a diagonal matrix with diagonal entries −3, 2, 1, and 4. Performing two row interchanges does not change the determinant, so the determinant of the original matrix is −3 · 2 · 1 · 4 = −24. 34. The sum of the first and second rows is equal to the sum of the third and fourth rows, so the rows are not linearly independent and thus the matrix does not have rank 4, so is not invertible. By Theorem 4.6, its determinant is zero. 35. This matrix results from the original matrix by multiplying the first row by 2, so its determinant is 2 · 4 = 8. 36. This matrix results from the original matrix by multiplying the first column by 2, the second column by 31 , and the third column by −1. Thus the determinant of this matrix is 2 · 13 · (−1) = − 32 times the original determinant, so its value is − 32 · 4 = − 83 . 37. This matrix results from the original matrix by interchanging the first and second rows. This multiplies the determinant by −1, so the determinant of this matrix is −4. 38. This matrix results from the original matrix by subtracting the third column from the first. This operation does not change the determinant, so the determinant of this matrix is also 4. 39. This matrix results from the original matrix by exchanging the first and third columns (which multiplies the determinant by −1) and then multiplying the first column by 2 (which multiplies the determinant by 2). So the determinant of this matrix is 2 · (−1) · 4 = −8. 40. To get from the original matrix to this one, row 2 was multiplied by 3, and then twice row 3 was added to each of rows 1 and 2. The first of these operations multiplies the determinant by 3, while the additions do not change the determinant. Thus the determinant of this matrix is 3 · 4 = 12. 41. Suppose first that A has a zero row, say the ith row is zero. Using cofactor expansion along that row gives n n X X det A = aij Cij = 0Cij = 0, j=1

j=1

so the determinant is zero. Similarly, if A has a zero column, say the j th column, then using cofactor expansion along that column gives det A =

n X i=1

aij Cij =

n X

0Cij = 0,

i=1

and again the determinant is zero. Alternatively, the column case can be proved from the row case by noting that if A has a zero column, then AT has a zero row, and det A = det AT by Theorem 4.10.

256

CHAPTER 4. EIGENVALUES AND EIGENVECTORS Ri +kRj

42. Dealing first with rows, we want to show that if A −−−−−→ B where i 6= j then det B = det A. Let Ai be the ith row of A. Then     A1 A1  A2    A2      ..    ..     .  .   B= Ai + kAj  ; let C = kAj  .      .    .. .  .    . An An Then A, B, and C are identical except that the ith row of B is the sum of the ith rows of A and C, so that by part (a), det B = det A + det C. However, since i 6= j, the ith row of C, which is kAj , is a multiple of the j th row, which is Aj . By Theorem 4.3(d), multiplying the j th row by k multiplies the determinant of C by k. However, the result is a matrix with two identical rows; both the ith and j th rows are kAj . So det C = 0 by Theorem 4.3(c). Hence det B = det A as desired. Ci +kCj

Ri +kRj

If instead A −−−−−→ B where i 6= j, then AT −−−−−→ B T , so that by the first part and Theorem 4.10 det A = det AT = det B T = det B. 43. If E interchanges two rows, then det E = −1 by Theorem 4.4. Also, EB is the same as B but with two rows interchanged, so by Theorem 4.3(b), det(EB) = − det B = det(E) det(B). Next, if E results from multiplying a row by a constant k, then det E = k by Theorem 4.4. But EB is the same as B except that one row has been multiplied by k, so that by Theorem 4.3(d), det(EB) = k det(B) = det(E) det(B). Finally, if E results from adding a multiple of one row to another row, then det E = 1 by Theorem 4.4. But in this case EB is obtained from B by adding a multiple of one row to another, so by Theorem 4.3(f), det(EB) = det(B) = det(E) det(B). 44. Let the row vectors of A be A1 , A2 , . . . , An . Then the row vectors of kA are kA1 , kA2 , . . . , kAn . Using Theorem 4.3(d) n times, kA1 A1 A1 A1 kA2 kA2 A2 A2 det(kA) = . = k . = k 2 . = · · · = k n . = k n det A. .. .. .. .. kAn kAn kAn An 45. By Theorem 4.6, A is invertible if and only if det A 6= 0. To find det A, expand along the first column: k + 1 det A = k −8

−k 1 + k k−1 k+1

3 1

= k((k + 1)(k − 1) − 1 · (−8)) + k(−k · 1 − 3(k + 1)) = k 3 − 4k 2 + 4k = k(k − 2)2 . Thus det A = 0 when k = 0 or k = 2, so A is invertible for all k except k = 0 and k = 2. 46. By Theorem 4.6, A is invertible if and only if det A 6= 0. To find det A, expand along the first row: k k 0 2 2 k = k 2 k − k k k 2 k k k 0 k 0 k k = k(2k − k 2 ) − k(k 3 − 0) = k(2k − k 2 − k 3 ) = k 2 (2 − k − k 2 ) = −k 2 (k + 2)(k − 1). Since the matrix is invertible precisely when its determinant is nonzero, and since the determinant is zero when k = 0, k = 1, or k = −2, we see that the matrix is invertible for all values of k except for 0, 1, and −2. 47. By Theorem 4.8, det(AB) = det(A) det(B) = 3 · (−2) = −6.

4.2. DETERMINANTS

257

48. By Theorem 4.8, det(A2 ) = det(A) det(A) = 3 · 3 = 9. 49. By Theorem 4.9, det(B −1 ) =

1 det(B) ,

so that by Theorem 4.8 det(A) 3 =− . det(B) 2

det(B −1 A) = det(B −1 ) det(A) = 50. By Theorem 4.7, det(2A) = 2n det(A) = 3 · 2n .

51. By Theorem 4.10, det(B T ) = det(B), so that by Theorem 4.7, det(3B T ) = 3n det(B T ) = −2 · 3n . 52. By Theorem 4.10, det(AT ) = det(A), so that by Theorem 4.8, det(AAT ) = det(A) det(AT ) = det(A)2 = 9. 53. By Theorem 4.8, det(AB) = det(A) det(B) = det(B) det(A) = det(BA). 54. If B is invertible, then det(B) 6= 0 and det(B −1 ) =

1 det(B) ,

so that using Theorem 4.8

det(B −1 AB) = det(B −1 (AB)) = det(B −1 ) det(AB) = det(B −1 ) det(A) det(B) =

1 det(A) det(B) = det(A). det(B)

55. If A2 = A, then det(A2 ) = det(A). But det(A2 ) = det(A) det(A) = det(A)2 , so that det(A) = det(A)2 . Thus det(A) is either 0 or 1. 56. We first show that Theorem 4.8 extends to prove that if m ≥ 1, then det(Am ) = det(A)m . The case m = 1 is obvious, and m = 2 is Theorem 4.8. Now assume the statement is true for m = k. Then using Theorem 4.8 and the truth of the statement for m = k, det(Ak+1 ) = det(AAk ) = det(A) det(Ak ) = det(A) det(A)k = det(A)k+1 . So the statement holds for all m ≥ 1. Now, if Am = O, then det(Am ) = det(A)m = det(O) = 0, so that det(A) = 0. 57. The coefficient matrix and constant vector are   1 1 A= , 1 −1

b=

  1 . 2

To apply Cramer’s Rule, we compute 1 det A = 1 1 det A1 (b) = 2 1 det A2 (b) = 1

1 = −2 −1 1 = −3 −1 1 = 1. 2

Then by Cramer’s Rule, x=

3 det A1 (b) = , det A 2

58. The coefficient matrix and constant vector are   2 −1 A= , 1 3

y=

det A2 (b) 1 =− . det A 2

 b=

 5 . −1

258

CHAPTER 4. EIGENVALUES AND EIGENVECTORS To apply Cramer’s Rule, we compute 2 −1 =7 det A = 1 3 5 −1 = 14 det A1 (b) = −1 3 2 5 = −7. det A2 (b) = 1 −1 Then by Cramer’s Rule,

det A1 (b) det A2 (b) = 2, y = = −1. det A det A 59. The coefficient matrix and constant vector are     2 1 3 1 A = 0 1 1 , b = 1 . 0 0 1 1 x=

To apply Cramer’s Rule, we compute 2 det A = 0 0 1 det A1 (b) = 1 1 2 det A2 (b) = 0 0 2 det A3 (b) = 0 0

1 1 0 1 1 0 1 1 1 1 1 0

3 1 = 2 · 1 · 1 = 2 1 3 1 1 3 1 = 1 + 1 1 1 1 1 3 1 1 =0 1 = 2 1 1 1 1 1 1 = 2. 1 = 2 0 1 1

1 = −2 1

Then by Cramer’s Rule, x=

det A1 (b) = −1, det A

y=

det A2 (b) = 0, det A

60. The coefficient matrix and constant vector are   1 1 −1 1 1 , A = 1 1 −1 0

z=

det A3 (b) = 1. det A

  1 b = 2 . 3

To apply Cramer’s Rule, we compute 1 1 −1 1 −1 1 −1 =4 1 1 = 1 det A = 1 − (−1) 1 1 1 1 1 −1 0 1 1 −1 1 −1 1 −1 =9 1 1 = 3 det A1 (b) = 2 − (−1) 1 1 2 1 3 −1 0 1 1 −1 1 −1 1 −1 = −3 1 = 1 det A2 (b) = 1 2 − 3 2 1 1 1 1 3 0 1 1 1 1 2 1 2 1 1 1 2 = 1 det A3 (b) = 1 − 1 + 1 = 2. −1 3 1 3 1 −1 1 −1 3

4.2. DETERMINANTS

259

Then by Cramer’s Rule, x= 1 61. We have det A = 1

det A1 (b) 9 det A2 (b) 3 = , y= =− , det A 4 det A 4 1 = −2. The four cofactors are −1 C11 = (−1)1+1 · (−1) = −1, C21 = (−1)2+1 · 1 = −1,

z=

det A3 (b) 1 = . det A 2

C12 = (−1)1+2 · 1 = −1, C22 = (−1)2+2 · 1 = 1.

Then "

−1 adj A = −1 2 62. We have det A = 1

#T " −1 −1 = 1 −1

# " −1 1 1 −1 −1 , so A = adj A = − det A 2 −1 1

# " 1 −1 = 21 1 2

1 2 − 12

−1 = 7. The four cofactors are 3 C11 = (−1)1+1 · 3 = 3, C21 = (−1)2+1 · (−1) = 1,

C12 = (−1)1+2 · 1 = −1, C22 = (−1)2+2 · 2 = 2.

Then " 3 adj A = 1

#T " −1 3 = 2 −1

# " 1 3 1 1 −1 , so A = adj A = det A 7 −1 2

# " 3 1 7 = 1 2 −7

1 7 2 7

#

63. We have det A = 2 from Exercise 59. The nine cofactors are C11 = (−1)1+1 · 1 = 1,

C12 = (−1)1+2 · 0 = 0, C13 = (−1)1+3 · 0 = 0,

C21 = (−1)2+1 · 1 = −1,

C22 = (−1)2+2 · 2 = 2, C23 = (−1)2+3 · 0 = 0,

C31 = (−1)3+1 · (−2) = −2, Then

C32 = (−1)3+2 · 2 = −2, C33 = (−1)3+3 · 2 = 2.

T   1 −1 −2 0 0 2 −2 , 2 0 = 0 0 0 2 −2 2     1 1 −1 −2 − 12 −1 2 1 1    = adj A = 0 2 −2 =  0 1 −1 . det A 2 0 0 2 0 0 1 

1 adj A = −1 −2

so A−1

64. We have det A = 4 from Exercise 60. The nine cofactors are C11 = (−1)1+1 · 1 = 1,

C12 = (−1)1+2 · (−1) = 1, C13 = (−1)1+3 · (−2) = −2,

C21 = (−1)2+1 · (−1) = 1, C31 = (−1)3+1 · 2 = 2, Then

C22 = (−1)2+2 · 1 = 1, C23 = (−1)2+3 · (−2) = 2, C32 = (−1)3+2 · 2 = −2, C33 = (−1)3+3 · 0 = 0.

T   1 −2 1 1 2 1 2 =  1 1 −2 , −2 0 −2 2 0     1 1 1 1 1 2 4 4 2 1 1    adj A =  1 1 −2 =  14 14 − 12  . = det A 4 −2 2 0 − 12 12 0  1 adj A = 1 2

so A−1

.

# .

260

CHAPTER 4. EIGENVALUES AND EIGENVECTORS

65. First, since A−1 =

1 det A

adj A, we have adj A = (det A)A−1 . Then

(adj A) · Thus adj A times

1 det A A

 1 1 1 A = (det A)A−1 · A = det A · A−1 A = I. det A det A det A is the identity, so that adj A is invertible and (adj A)−1 =

1 A. det A

This proves the first equality. Finally, substitute A−1 for A in the equation adj A = (det A)A−1 ; this gives adj(A−1 ) = (det A−1 )A = det1 A A, proving the second equality. 66. By Exercise 65, det(adj A) = det((det A)A−1 ); by Theorem 4.7, this is equal to (det A)n det(A−1 ). But det(A−1 ) = det1 A , so that det(adj A) = (det A)n det1 A = (det A)n−1 . 67. Note that exchanging Ri−1 and Ri results in a matrix in which row i has been moved “up” one row and all other rows remain the same. Refer to the diagram at the end of the solution when reading what follows. Starting with Rs ↔ Rs−1 , perform s − r adjacent interchanges, each one moving the row that started as row s up one row. At the end of these interchanges, row s will have been moved “up” to row r, and all other rows will remain in the same order. The original row r is therefore at row r + 1. We wish to move that row down to row s. Note that exchanging Ri and Ri+1 results in a matrix in which row i has been moved “down” one row and all other rows remain the same. So starting with Rr+1 ↔ Rr+2 , perform s − r − 1 adjacent interchanges, each one moving the row that started as row r + 1 “down” one row. At the end of these interchanges, row r + 1 will have been moved down s − r − 1 rows, so it will be at row s − r − 1 + r + 1 = s, and all other rows will be as they were. Now, row r + 1 originally started life as row r, so the end result is that row s has been moved to row r, and row r to row s. We used s − r adjacent interchanges to move row s, and another s − r − 1 adjacent interchanges to move row r, for a total of 2(s − r) − 1 adjacent interchanges.   R1  R2     ..   .    Rr−1     Rr    Rr+1    Original:  .   ..    Rs−1     Rs    Rs+1     .   ..  Rn     R1 R1  R2   R2       ..   ..   .   .      Rr−1  Rr−1       Rs   Rs       Rr  Rr+1      After initial s − r exchanges: Rr+1  After next s − r − 1 exchanges:  .  .   ..    ..     .  Rs−1      Rs−1   Rr      Rs+1  Rs+1       .   .   ..   ..  Rn Rn

4.2. DETERMINANTS

261

68. The proof is very similar to the proof given be the matrix obtained by moving column j  a11 a12 · · ·  a21 a22 · · ·  A= . .. ..  .. . . an2

in the text for the case of expansion along a row. Let B of matrix A to column 1:  a1,j−1 a1j a1,j+1 · · · a1n a2,j−1 a2j a2,j+1 · · · a2n   .. .. .. ..  .. . . . . .  · · · an,j−1 anj an,j+1 · · · ann

a1j  a2j  B= .  ..

a11 a21 .. .

a12 a22 .. .

··· ··· .. .

a1,j−1 a2,j−1 .. .

a1,j+1 a2,j+1 .. .

··· ··· .. .

 a1n a2n   ..  . 

anj

an1

an2

···

an,j−1

an,j+1

···

ann

an1 

By the discussion in Exercise 67, it requires j − 1 adjacent column interchanges to move column j to column 1, so by Lemma 4.14, det B = (−1)j−1 det A, so that det A = (−1)j−1 det B. Then by Lemma 4.13, evaluating det B using cofactor expansion along the first column gives det A = (−1)j−1 det B = (−1)j−1

n X

bi1 Bi1 = (−1)j−1

i=1

n X

aij Bi1 ,

i=1

where Aij and Bij are the cofactors of A and B. It remains only to understand how the cofactors of A and B are related. Clearly the matrix formed by removing row i and column j of A is the same matrix as is formed by removing row i and column 1 of B, so the only difference is the sign that we put on the determinant of that matrix. In A, the sign is (−1)i+j ; in B, it is (−1)i+1 . Thus Bi1 = (−1)j−1 Aij . Substituting that in the equation for det A above gives det A = (−1)j−1

n X

aij Bi1 = (−1)j−1

i=1

n X

(−1)j−1 aij Aij = (−1)2j−2

i=1

n X i=1

aij Aij =

n X

aij Aij ,

i=1

proving Laplace expansion along columns. 69. Since P and S are both square, suppose P is n × n and S is m × m. Then O is m × n and Q is n × m. Proceed by induction on n. If n = 1, then   p Q A = 11 , O S and Q is 1 × n while O is n × 1. Compute the determinant by cofactor expansion along the first column. The only nonzero entry in the first column is p11 , so that det A = p11 A11 . Since O has one column and Q one row, removing the first row and column of A leaves only S, so that A11 = det S and thus det A = p11 det S = (det P ) det S). This proves the case n = 1 and provides the basis for the induction. Now suppose that the statement is true when P is k × k; we want to prove it when P is (k + 1) × (k + 1). So let P be a (k + 1) × (k + 1) matrix. Again compute det A using column 1: det A =

k+1+m X

 ai1 (−1)i+1 det Ai1 .

i=1

Now, ai1 = pi1 for i ≤ k + 1, and ai1 = 0 for i > k + 1, so that det A =

k+1 X i=1

 pi1 (−1)i+1 det Ai1 .

262

CHAPTER 4. EIGENVALUES AND EIGENVECTORS Ai1 is the submatrix formed by removing row i and column 1 of A; since i ranges from 1 to k + 1, we see that this process removes row i and column 1 of P , but does not change S. So we are left with a matrix   Pi1 Qi Ai1 = . O S Now Pi1 is k × k, so we can apply the inductive hypothesis to get det Ai1 = (det Pi1 )(det S). Substitute that into the equation for det A above to get det A =

k+1 X

pi1 (−1)i+1 det Ai1



i=1

=

k+1 X

 pi1 (−1)i+1 (det Pi1 )(det S)

i=1

= (det S)

k+1 X

! pi1 (−1)

i+1

det Pi1



.

i=1

But the sum is just cofactor expansion of P along the first column, so its value is det P , and finally we get det A = (det S)(det P ) = (det P )(det S) as desired. 70. (a) There are many examples. For instance, let    1 0 1 P =S= , Q= 0 1 1 Then  A=

P R

 1  0 Q = 0 S 0

 0 , 0 0 1 1 1

1 1 1 0

R=

 0 0

 1 . 1

 0 0 . 0 1

Since the second and third rows of A are identical, det A = 0, but (det P )(det S) − (det Q)(det R) = 1 · 1 − 0 · 0 = 1, and the two are unequal. (b) We have   P −1 O P Q BA = −RP −1 I R S  −1  P P + OR P −1 Q + OS = −RP −1 P + IR −RP −1 Q + IS   I P −1 Q = −R + R −RP −1 Q + S   I P −1 Q . = O S − RP −1 Q 

It is clear that the proof in Exercise 69 can easily be modified to prove the same result for a block lower triangular matrix. That is, if A, P , and S are square and   P O A= , Q S

Exploration: Geometric Applications of Determinants

263

then det A = (det P )(det S). Applying this to the matrix B gives det B = (det P −1 )(det I) = det P −1 =

1 . det P

Further, applying Exercise 69 to BA gives det(BA) = (det I)(det(S − RP −1 Q)) = det(−SRP −1 Q). So finally det A =

1 det(BA) = (det P )(det(S − RP −1 Q)). det B

(c) If P R = RP , then det A = (det P )(det(S − RP −1 Q)) = det(P (S − RP −1 Q)) = det(P S − (P R)P −1 Q)) = det(P S − RP P −1 Q) = det(P S − RQ).

Exploration: Geometric Applications of Determinants 1. (a) We have i u × v = 0 3

j k 1 1 1 = i −1 −1 2

0 1 − j 3 2

1 = 3i + 3j − 3k. −1

0 1 + k 3 2

(b) We have i u × v = 3 0

j k −1 −1 2 = i 1 1 1

3 2 − j 0 1

3 2 + k 0 1

−1 = −3i − 3j + 3k. 1

(c) We have j k 2 2 3 = i −4 −4 −6

i u × v = −1 2

−1 3 − j 2 −6

−1 3 + k 2 −6

2 = 0. −4

(d) We have i u × v = 1 1

j 1 2

k 1 1 = i 2 3

1 1 − j 1 3

1 1 + k 1 3

1 = i − 2j + k. 2

2. First, i u · (v × w) = u · v1 w1

j v2 w2

k v3 w3

= (u1 i + u2 j + u3 k) · (i(v2 w3 − v3 w2 ) − j(v1 w3 − v3 w1 ) + k(v1 w2 − v2 w1 )) = (u1 v2 w3 − u1 v3 w2 ) − (u2 v1 w3 − u2 v3 w1 ) + (u3 v1 w2 − u3 v2 w1 ). Computing the u1 v1 w1

determinant, we get u2 u3 v2 v3 v v2 v3 = u 1 − u2 1 w w w1 2 3 w2 w3

v v3 + u3 1 w3 w1

v2 w2

= u1 (v2 w3 − v3 w2 ) − u2 (v1 w3 − v3 w1 ) + u3 (v1 w2 − v2 w1 ) = (u1 v2 w3 − u1 v3 w2 ) − (u2 v1 w3 − u2 v3 w1 ) + (u3 v1 w2 − u3 v2 w1 ), and the two expressions are equal.

264

CHAPTER 4. EIGENVALUES AND EIGENVECTORS

3. With u, v, and w as in the previous exercise, (a) We have i v × u = v1 u1

j v2 u2

i k R2 ↔R3 v3 −−−−−→ = − u1 v1 u3

j u2 v2

k u3 = −(u × v). v3

(b) We have i u × 0 = u1 0

k u3 . 0

j u2 0

This matrix has a zero row, so its determinant is zero by Theorem 4.3. Thus u × 0 = 0. (c) We have i j k u × 0 = u1 u2 u3 . u1 u2 u3 This matrix has two equal rows, so its determinant is zero by Theorem 4.3. Thus u × u = 0. (d) We have by Theorem 4.3 i i j k j k u2 u3 = k u1 u2 u3 = k(u × v). u × kv = u1 kv1 kv2 kv3 v1 v2 v3 (e) We have by Theorem 4.3 i j u2 u × (v + w) = u1 v1 + w1 v2 + w2

k i u3 = u1 v3 + w3 v1

j u2 v2

k i u3 + u1 v3 w1

u1 u · (u × v) = u1 v1 v1 v · (u × v) = u1 v1

u2 u2 v2

u3 u3 v3 v3 u3 . v3

j u2 w2

k u3 = u × v + u × w. w3

(f ) From Exercise 2 above,

v2 u2 v2

But each of these matrices has two equal rows, so each has zero determinant. Thus u · (u × v) = v · (u × v) = 0. (g) From Exercise 2 above, u1 u · (v × w) = v1 w 1

u2 v2 w2

u3 v3 , w3

w1 (u × v) · w = w · (u × v) = u1 v1

w2 u2 v2

w3 u3 . v3

The second matrix is obtained from the first by exchanging rows 1 and 2 and then exchanging rows 1 and 3. Since each row exchange multiplies the determinant by −1, it remains unchanged so that the two products are equal. 4. The vectors u and v, considered as vectors in R3 , have their third coordinates equal to zero, so the area of the parallelogram with these edges is

 

      i j k

u1 u2 u1 u2 u1 v1



  A = ku × vk = det u1 u2 0 = k det = det = det . v1 v2 v1 v2 u2 v2

v1 v2 0

Exploration: Geometric Applications of Determinants

265

5. The area of the rectangle is (a + c)(b + d). The small rectangles at top left and bottom right each have area bc. The triangles at top and bottom each have base a and height b, so the area of each is 1 2 ab; altogether they have area ab. Finally, the smaller triangles at left and right each have base d and height c, so the area of each is 12 cd; altogether they have area cd. So in total, the area in the big rectangle that is not in the parallelogram is bc + ab + cd, so the area of the parallelogram is (a + c)(b + d) − 2bc − ab − cd = ab + ad + bc + cd − 2bc − ab − cd = ad − bc.     a c For this parallelogram, u = and v = , and indeed b d   det u v = ad − bc. As for the absolute value sign, note that the formula A = ku × vk does not depend on which of the two sides we choose for u and which for v, but the sign of the determinant does. In our particular case, when we choose u and v as above, we don’t need the absolute value sign, as the area works out properly. It is present in the norm formula to represent the fact that we need not make particular choices for u and v. 6. (a) By Exercise 5, the area is  2 A = det 3

 −1 = |2 · 4 − (−1) · 3| = 11. 4

(b) By Exercise 5, the area is  3 A = det 4

 5 = |3 · 5 − 4 · 5| = 5. 5

7. A parallelepiped of height h with base area a has volume ha. This can be seen using calculus; the Rh h volume is 0 a dz = [az]0 = ah. Alternatively, it can be seen intuitively, since the parallelepiped consists of “slices” each of area a, and the height of those slices is h. Now, for the parallelepiped in Figure 4.11, the height is clearly kuk cos θ. The base is a parallelogram with sides v and w, so its area is kv × wk. Thus the volume of the parallelepiped is kuk kv × wk cos θ. But u · (v × w) = kuk kv × wk cos θ, so that the volume  is u · (v × w) which, by Exercise 2, is equal to the absolute value of the determinant u of the matrix  v , which is equal to the absolute value of the determinant of the transpose of that w   matrix, or u v w . 8. Compare Figure 4.12 to Figure 4.11. The base of the tetrahedron is a triangle that is one half of the area of the parallelogram in Figure 4.11. So from the hint, we have 1 (area of the base)(height) 3 1 = (area of the parallelogram)(height) 6 1 = (volume of the parallelepiped) 6 1 = |u · (v × w)| . 6

V=

9. By problem 4, the area of the parallelogram determined by Au and Av is       TA (P ) = det Au Av = det(A u v ) = (det A)(det u v ) = |det A| (area of P ).

266

CHAPTER 4. EIGENVALUES AND EIGENVECTORS

10. By Problem 7, the volume of the parallelepiped determined by Au, Av, and Aw is  TA (P ) = det Au

Av

  Aw = det(A u

v

 w )

 = (det A)(det u 11. (a) The line through (2, 3) and (−1, 0) is given by x y 2 3 −1 0

v

 w ) = |det A| (volume of P ).

1 1 . 1

Evaluate by cofactor expansion along the third row: x y y 1 = −1(y − 3) + (3x − 2y) = 3x − 3y + 3. + 1 −1 2 3 3 1 The equation of the line is 3x − 3y + 3 = 0, or x − y + 1 = 0. (b) The line through (1, 2) and (4, 3) is given by x 1 4 Evaluate by cofactor expansion 2 x 3

y 2 3

1 1 . 1

along the first row: 1 1 1 + 1 1 2 = −x + 3y − 5. − y 4 3 4 1 1

The equation of the line is −x + 3y − 5 = 0, or x − 3y + 5 = 0. 12. Suppose the three points satisfy ax1 + by1 + c = 0 ax2 + by2 + c = 0 ax3 + by3 + c = 0, which can be written as the matrix equation  x1 XA = x2 x3

y1 y2 y3

  1 a 1  b  = O. c 1

Then det X 6= 0 if and only if X is invertible, which implies that X −1 XA = X −1 O = O, so that A = O. Thus A 6= O if and only if det X = O; this says precisely that the three points lie on the line ax + by + c = 0 with a and b not both zero if and only if det X = 0. 13. Following the discussion at the beginning of the section, since the three points are noncollinear, they determine a unique plane, say ax + by + cz + d = 0. Since all three points lie on the plane, their coordinates satisfy this equation, so that ax1 + by1 + cz1 + d = 0 ax2 + by2 + cz2 + d = 0 ax3 + by3 + cz3 + d = 0. These three equations together with the equation of the plane then form a system of linear equations in the variables a, b, c, and d. Since we know that the plane exists, it follows that there are values

Exploration: Geometric Applications of Determinants

267

of a, b, c, and d, not all zero, satisfying these equations. therefore its coefficient matrix, which is  x y z x1 y1 z1  x2 y2 z2 x3 y3 z3

So the system has a nontrivial solution, and  1 1  1 1

cannot be invertible, by the Fundamental Theorem of Invertible Matrices. Thus its determinant must be zero by Theorem 4.6. To see what happens if the three points are collinear, try row reducing the above coefficient matrix: 

x x1  x2 x3

y y1 y2 y3

z z1 z2 z3

  x 1  x1 1  R3 −R2  −−→ x2 − x1 1 −−− R4 −R2 1 − −−−−→ x3 − x1

y y1 y2 − y1 y3 − y1

z z1 z2 − z1 z3 − z1

 1 1 . 0 0

If the points are collinear, then     x3 − x1 x2 − x1  y3 − y1  = k  y2 − y1  z3 − z1 z2 − z1 for some k 6= 0. Then performing R4 − kR3 gives a matrix with a zero row. So its determinant is trivially zero and the system has an infinite number of solutions. This corresponds to the fact that there are an infinite number of planes passing through the line determined by the three collinear points. 14. Suppose the four points satisfy ax1 + by1 + cz1 + d = 0 ax2 + by2 + cz2 + d = 0 ax3 + by3 + cz3 + d = 0 ax4 + by4 + cz4 + d = 0, which can be written as the matrix equation  x1 x2 XA =  x3 x4

y1 y2 y3 y4

z1 z2 z3 z4

  1 a   1   b  = O.  1 c d 1

Then det X 6= 0 if and only if X is invertible, which implies that X −1 XA = X −1 O = O, so that A = O. Thus A 6= O if and only if det X = O; this says precisely that the four points lie on the plane ax + by + cz + d = 0 with a, b, and c not all zero if and only if det X = 0. 15. Substituting the values (x, y) = (−1, 10), (0, 5) and (3, 2) in turn into the equation gives the linear system in a, b, and c a − b + c = 10 a

= 5

a + 3b + 9c = 2. The determinant of the coefficient matrix of 1 −1 0 X = 1 1 3

this system is 1 −1 0 = −1 3 9

1 = 12 6= 0, 9

268

CHAPTER 4. EIGENVALUES AND EIGENVECTORS Since X is has nonzero determinant, it is invertible by Theorem 4.6, so we have         a 10 a 10 −1         b 5 b 5 , X = ⇒ =X c 2 c 2 so that the system has a unique solution. To find that solution, we can either invert X or we can row-reduce the augmented matrix. Using the second method, and writing the second equation first in the augmented matrix, we have  1 0 0 1 −1 1 1 3 9

  1 5 R2 −R1 10 −−− −−→ 0 R3 −R1 2 − −−−−→ 0

   1 0 0 5 5 −R2  5 − −−→ 0 1 −1 −5 R3 −3R2 −3 0 3 9 −3 − −−−−→     1 0 5 1 0 0 5 R +R 2 3  0 1 −1 −5 −−− −1 −5 1 −−→ 0 R3 12 12 −12 0 0 1 1 0 −−→

0 0 −1 1 3 9  1 0 0 1 0 0

0 1 0

0 0 1

 5 −4 . 1

So the system has the unique solution a = 5, b = −4, c = 1 and the parabola through the three points is y = 5 − 4x + x2 . 16. (a) The linear system in a, b, and c is a + b + c = −1 a + 2b + 4c =

4

a + 3b + 9c =

3,

so row-reducing the augmented matrix gives  1 1 1 1 2 4 1 3 9

  −1 1 R2 −R1 −−→ 0 4 −−− R3 −R1 3 − −−−−→ 0  1 0 0

1 1 1 3 2 8 1 1 0

1 3 1

  1 −1 0 5 R3 −2R2 4 − −−−−→ 0  R1 −R3  −1 −−− −−→ 1 R2 −3R3 0 5 − −−−−→ 0 −3

1 1 1 3 0 2 1 0 1 0 0 1

 −1 5 1 2 R3 −6 −− −→  R1 −R2  2 −−−−−→ 1 0 0 0 1 0 14 0 0 1 −3

 −12 14 . −3

So the system has the unique solution a = −12, b = 14, c = −3 and the parabola through the three points is y = −12 + 14x − 3x2 . (b) The linear system in a, b, and c is a − b + c = −3 a + b + c = −1 a + 3b + 9c =

1,

so row-reducing the augmented matrix gives  1 1 1

−1 1 1 1 3 9

  −3 1 R2 −R1 −1 −−− −−→ 0 R3 −R1 1 − −−−−→ 0  1 0 0

   −3 1 R 1 −1 1 −3 2 2 2 −− −→ 0 1 0 1 R3 −R2 4 14 R3 0 1 2 1 − −−−−→ −−−→    R1 +R2 −R3  −3 1 −1 1 −3 −−−−−−−→ 1 0 0 0 0 1 0 1 1 1 0 1 2 R3 0 −− 0 0 0 1 0 0 1 −→

−1 1 2 0 4 8

−1 1 1 0 0 2

 −2 1 . 0

So the system has the unique solution a = −2, b = 1, c = 0. Thus the equation of the curve passing through these points is in fact a line, since c = 0, and its equation is y = −2 + x.

Exploration: Geometric Applications of Determinants

269

17. Computing the determinant, we get 1 1 1

a1 a2 a3

a21 2 = 1 a2 a2 a3 a23

a1 a22 2 − 1 a3 a3

a1 a21 2 + 1 a2 a3

a21 a22

= a2 a23 − a22 a3 − a1 a23 + a21 a3 + a1 a22 − a21 a2 . Now compute the product on the right: (a2 − a1 )(a3 − a1 )(a3 − a2 ) = (a2 a3 − a1 a3 − a1 a2 + a21 )(a3 − a2 ) = a2 a23 − a22 a3 − a1 a23 + a1 a2 a3 − a1 a2 a3 + a1 a22 + a21 a3 − a21 a2 = a2 a23 − a22 a3 − a1 a23 + a1 a22 + a21 a3 − a21 a2 . The two expressions are equal. Furthermore, the determinant is nonzero since it is the product of the three differences a2 − a1 , a3 − a1 , and a3 − a2 , which are all nonzero because we are assuming that the ai are distinct real numbers. 18. Recall that adding a multiple of one column to another does not change the determinant (see Theorem 4.3). So given the matrix 1 a1 a21 a31 1 a a2 a3 2 2 2 , 1 a3 a23 a33 1 a4 a2 a3 4 4 apply in order the column operations C4 − a1 C3 , C3 − a1 C2 , C2 − a1 C1 to get 1 1 1 1

a1 − a1 a2 − a1 a3 − a1 a4 − a1

a21 − a21 a22 − a2 a1 a23 − a3 a1 a24 − a4 a1

1 0 a31 − a31 1 1 a32 − a22 a1 = (a − a )(a − a )(a − a ) 4 1 3 1 2 1 1 1 a33 − a23 a1 3 2 1 1 a4 − a4 a1

0 a2 a3 a4

0 a22 . a23 a2 4

Then expand along the first row, giving 1 (a4 − a1 )(a3 − a1 )(a2 − a1 ) 1 1

a2 a3 a4

a22 a23 . a24

But we know how to evaluate this determinant from Exercise 17; doing so, we get for the original determinant (a4 − a1 )(a3 − a1 )(a2 − a1 )(a3 − a2 )(a4 − a2 )(a4 − a3 ) which is the same as the expression given in the problem statement. For the second part, we can interpret the matrix above as the coefficient matrix formed by starting with the cubic y = a + bx + cx2 + dx3 , substituting the four given points for x, and regarding the resulting equations as a linear system in a, b, c, and d, just as in Exercise 15. Since the determinant is nonzero if the ai are all distinct, it follows that the system has a unique solution, so that there is a unique polynomial of degree at most 3 (it may be less than a cubic; for example, see Exercise 16(b)) passing through the four points. 19. For the n × n determinant given, use induction on n (note that Exercise 18 was essentially an inductive computation, since it invoked Exercise 17 at the end). We have already proven the cases n = 1, n = 2,

270

CHAPTER 4. EIGENVALUES AND EIGENVECTORS and n = 3. So assume that the statement holds for n = k, and consider the determinant 1 1 . . . 1 1

··· ··· .. .

a1 a2 .. . ak ak+1

··· ···

ak−1 1 a2k−1 .. . k−1 ak ak−1 k+1

ak1 ak2 .. . . akk akk+1

apply in order the column operations Ck+1 − a1 Ck , Ck − a1 Ck−1 , . . . , C2 − a1 C1 to get 1 1 . . . 1 1

a1 − a1 a2 − a1 .. . ak − a1 ak+1 − a1

··· ··· .. . ··· ···

ak−1 − a1k−1 1 ak−1 − a1 ak−2 2 2 .. . ak−1 − a1 akk−2 k k−2 ak−1 k+1 − a1 ak+1

1 0 ak1 − ak1 k−1 1 1 k a2 − a1 a2 k+1 Y . . .. = (aj − a1 ) .. .. . j=2 1 1 akk − a1 ak−1 k k−1 1 1 k ak+1 − a1 ak+1

··· ··· .. . ··· ···

0 ak−2 2 .. . ak−2 k ak−2 k+1

0 ak−1 2 .. . . k−1 ak ak−1 k+1

Now expand along the first row, giving 1 k+1 .. Y (aj − a1 ) . 1 j=2 1

··· .. . ··· ···

ak−2 2 .. . ak−2 k ak−2 k+1

ak−1 2 .. . . k−1 ak ak−1 k+1

But the remaining determinant is again a Vandermonde determinant in the k variables a2 , a3 , . . . , ak+1 , so we know its value from the inductive hypothesis. Substituting that value, we get for the determinant of the original matrix k+1 Y

Y

j=2

2≤i 0 is an integer. When n = 0, the statement says that λ0 = 1 is an eigenvalue of A0 = I, which is true. So it remains to prove the statement for n < 0. We use induction on n. For n = −1, this is the statement that if Ax = λx, then A−1 x = λ−1 x, which is Theorem 4.18(b). Now assume the statement holds for n = −k, and prove it for n = −(k + 1). Note that by the remark in Section 3.3, A−(k+1) = (Ak+1 )−1 = (Ak A)−1 = A−1 (Ak )−1 = A−1 A−k . Now A−(k+1) x = A−1 (A−k x)

inductive hypothesis

=

n=−1

A−1 (λ−k x) = λ−k A−1 x = λ−k λ−1 x = λ−(k+1) x

and we are done. 15. If we can express x as a linear combination of the eigenvectors, it will be easy to computeA10  x since     c1 5 10 we know what A does to the eigenvectors. To so express x, we solve the system v1 v2 = , c2 1 which is c1 + c2 = 5 −c1 + c2 = 1.

4.3. EIGENVALUES AND EIGENVECTORS OF N × N MATRICES

283

This has the solution c1 = 2 and c2 = 3, so that x = 2v1 + 3v2 . But by Theorem 4.4(a), v1 and v2 10 are eigenvectors of A10 with eigenvalues λ10 1 and λ2 . Thus 10 A10 x = A10 (2v1 + 3v2 ) = 2A10 v1 + 3A10 v2 = 2λ10 1 v1 + 3λ2 v2 =

1 v1 + 3072v2 . 512

Substituting for v1 and v2 gives "

3072 + A x= 3072 − 10

1 512 1 512

# .

16. As in Exercise 15,  k   1 3 · 2k + 21−k k v1 + 3 · 2 v2 = A x=2· . 3 · 2k − 21−k 2 k

As k → ∞, the 21−k terms approach zero, so that Ak x ≈ 3 · 2k v2 . 17. We want to express x as a linear combination of the eigenvectors, say x = c1 v1 + c2 v2 + c3 v3 , so we want a solution of the linear system c1 + c2 + c3 = 2 c2 + c3 = 1 c3 = 2. Working from the bottom and substituting gives c3 = 2, c2 = −1, and c1 = 1, so that x = v1 −v2 +2v3 . Then A20 x = A20 (v1 − v2 + 2v3 ) = A20 v1 − A20 v2 + 2A20 v3 20 20 = λ20 1 v1 − λ2 v2 + 2λ3 v3        20 1  20 1 1 1 1 0 − 1 + 2 · 120 1 = − 3 3 0 0 1   2 = 2 − 3120  . 2

18. As in Exercise 17, Ak x = Ak (v1 − v2 + 2v3 ) = Ak v1 − Ak v2 + 2Ak v3 = λk1 v1 − λk2 v2 + 2λk3 v3        k 1  k 1 1 1 1 0 − 1 + 2 · 1k 1 = − 3 3 0 0 1  1  1 ± 3k − 3k + 2 =  − 31k + 2  . 2

As k → ∞, the

1 3k

  2 terms approach zero, and Ak x → 2. 2

284

CHAPTER 4. EIGENVALUES AND EIGENVECTORS

19. (a) Since (λI)T = λI T = λI, we have AT − λI = AT − (λI)T = (A − λI)T , so that using Theorem 4.10,  det(AT − λI) = det (A − λI)T = det(A − λI). But the left-hand side is the characteristic polynomial of AT , and the right-hand side is the characteristic polynomial of A. Thus the two matrices have the same characteristic polynomial and thus the same eigenvalues. (b) There are many examples. For instance A=

 1 1

 0 2

has eigenvalues λ1 = 1 and λ2 = 2, with eigenspaces     −1 0 E1 = span , E2 = span . 1 1 AT has the same eigenvalues by part (a), but the eigenspaces are     1 1 E1 = span , E2 = span . 0 1 20. By Theorem 4.18(b), if λ is an eigenvalue of A, then λm is an eigenvalue of Am = O. But the only eigenvalue of O is zero, since Am takes any vector to zero. Thus λm = 0, so that λ = 0. 21. By Theorem 4.18(a), if λ is an eigenvalue of A with eigenvector x, then λ2 is an eigenvalue of A2 = A with eigenvector x. Since different eigenvalues have linearly independent eigenvectors by Theorem 4.20, we must have λ2 = λ, so that λ = 0 or λ = 1. 22. If Av = λv, then Av − cIv = λv − cIv, so that (A − cI)v = (λ − c)v. But this equation means exactly that v is an eigenvector of A − cI with eigenvalues λ − c. 23. (a) The characteristic polynomial of A is A − λI = 3 − λ 5

2 = λ2 − 3λ − 10 = (λ − 5)(λ + 2). −λ

So the eigenvalues of A are λ1 = 5 and λ2 = −2. For λ1 , we have     −2 2 1 −1 A − 5I = → , 5 −5 0 0 and for λ2 , we have  5 A + 2I = 5

  2 1 → 2 0

2 5

0

 .

It follows that   1 E5 = span , 1

 E−2 = span

   −2 − 25 = span . 5 1

(b) By Theorem 4.4(b), A−1 has eigenvalues 15 and − 21 , with     1 −2 E1/5 = span , E−1/2 = span . 1 5

4.3. EIGENVALUES AND EIGENVECTORS OF N × N MATRICES

285

By Exercise 22, A − 2I has eigenvalues 5 − 2 = 3 and −2 − 2 = −4 with     1 −2 E3 = span , E−4 = span . 1 5 Also by Exercise 22, A + 2I = A − (−2I) has eigenvalues 5 + 2 = 7 and −2 + 2 = 0 with   1 E7 = span , 1

  −2 E0 = span . 5

24. (a) There are many examples. For instance, let  0 A= 1

 1 , 0

 1 B= 0

 1 . 1

Then the characteristic equation of A is λ2 − 1, so that A has eigenvalues λ = ±1. The characteristic equation of B is (µ − 1)2 , so that B has eigenvalue µ = 1. But A+B =

 1 1

 2 1

has characteristic polynomial λ2 − 2λ − 1, so it has eigenvalues 1 ± λ + µ are 0 and 2.



2. The possible values of

(b) Use the same A and B as above. Then  0 AB = 1

 1 , 1

√ with characteristic polynomial λ2 − λ − 1 and eigenvalues 21 (1 ± 5), which again does not match λ + µ. (c) If λ and µ both have eigenvector x, then (A + B)x = Ax + Bx = λx + µx = (λ + µ)x, which shows that λ + µ is an eigenvalue of A + B with eigenvector x. For the product, we have (AB)x = A(Bx) = A(µx) = µAx = µλx = λµx, showing that λµ is an eigenvalue of AB with eigenvector x. 25. No, they do not. For example, take B = I2and let A be any 2 × 2 rank 2 matrix that has an eigenvalue 0 2 other than 1. For example let A = , which has eigenvalues 2 and −2. Since A has rank 2, it 2 0 row-reduces to I, so A and I are row-equivalent but do not have the same eigenvalues. 26. By definition, the companion matrix of p(x) = x2 − 7x + 12 is    −(−7) −12 7 C(p) = = 1 0 1

 −12 , 0

which has characteristic polynomial 7 − λ −12 = (7 − λ)(−λ) + 12 = λ2 − 7λ + 12. 1 −λ

286

CHAPTER 4. EIGENVALUES AND EIGENVECTORS

27. By definition, the companion matrix of p(x) = x3 + 3x2 − 4x + 12 is     −3 −(−4) −12 −3 4 −12 0 0 = 1 0 0 , C(p) =  1 0 1 0 0 1 0 which has characteristic polynomial −3 − λ 4 −12 = −1 −3 − λ 1 −λ 0 1 0 1 −λ

−3 − λ −12 − λ 1 0

4 −λ

= −12 − λ((−3 − λ)(−λ) − 4) = −λ3 − 3λ2 + 4λ − 12. 28. (a) If p(x) = x2 + ax + b, then C(p) =

 −a 1

 −b , 0

so the characteristic polynomial of C(p) is −a − λ −b = (−a − λ)(−λ) + b = λ2 + aλ + b. 1 −λ   x (b) Suppose that λ is an eigenvalue of C(p), and that a corresponding eigenvector is x = 1 . Then x2     x x λ 1 = C(p) 1 means that x2 x2        λx1 −a −b x1 −ax1 − bx2 = = , λx2 1 0 x2 x1   λ so that x1 = λx2 . Setting x2 = 1 gives an eigenvector . 1 29. (a) If p(x) = x3 + ax2 + bx + c, then  −a C(p) =  1 0

 −b −c 0 0 , 1 0

so the characteristic polynomial of C(p) is −a − λ −b −c 1 = −1 −a − λ −λ 0 1 0 1 −λ

−a − λ −c − λ 1 0

−b −λ

= −c − λ((−a − λ)(−λ) + b) = −λ3 − aλ2 − bλ − c.   x1 (b) Suppose that λ is an eigenvalue of C(p), and that a corresponding eigenvector is x = x2 . Then x3     x1 x1 λ x2  = C(p) x2  means that x3 x3        λx1 −a −b −c x1 −ax1 − bx2 − cx3 λx2  =  1 , 0 0 x2  =  x1 λx3 0 1 0 x3 x2

4.3. EIGENVALUES AND EIGENVECTORS OF N × N MATRICES

287

 2 λ so that x1 = λx2 and x2 = λx3 . Setting x3 = 1 gives an eigenvector  λ . 1 30. Using Exercise 28, the characteristic polynomial of the matrix   −a −b C(p) = 1 0 is λ2 + aλ + b. If we want this matrix to have eigenvalues 2 and 5, then λ2 + aλ + b = (λ − 2)(λ − 5) = λ2 − 7λ + 10, so we can take   7 −10 A= . 1 0 31. Using Exercise 28, the characteristic polynomial of the matrix   −a −b −c 0 0 C(p) =  1 0 1 0 is −(λ3 + aλ2 + bλ + c). If we want this matrix to have eigenvalues −2, 1, and 3, then −(λ3 + aλ2 + bλ + c) = −(λ + 2)(λ − 1)(λ − 3) = λ3 − 2λ2 − 5λ + 6, so we can take

 2 A = 1 0

5 0 1

 −6 0 . 0

32. (a) Exercises 28 and 29 provide a basis for the induction, proving the cases n = 2 and n = 3. So assume that the statement holds for n = k; that is, that −ak−1 − λ −ak−2 · · · −a1 −a0 1 −λ ··· 0 0 k .. . 0 0 = (−1) p(λ), 0 1 0 0 · · · −λ 0 0 0 ··· 1 −λ where p(x) = xk + ak−1 xk−1 + · · · + a1 x + a0 . Now suppose that q(x) = xk+1 + bk xk + · · · + b1 x + b0 has degree k + 1; then −bk − λ −bk−1 · · · −b1 −b0 1 −λ ··· 0 0 .. C(q) = . 0 1 0 0 . 0 0 · · · −λ 0 0 0 ··· 1 −λ Evaluate this determinant by expanding along the last column, giving −bk − λ −bk−1 1 −λ · · · 0 1 −λ . . k . 0 1 0 C(q) = (−1) (−b0 ) + (−λ) 0 1 0 0 · · · −λ 0 0 0 0 ··· 1 0 0

··· ··· .. . ··· ···

−b2 0 0 −λ 1

−b1 0 0 . 0 −λ

The first determinant is of an upper triangular matrix, so it is the product of its diagonal entries, which is 1. The second determinant is the characteristic polynomial of xk +bk xk−1 +· · ·+b2 x+b1 ,

288

CHAPTER 4. EIGENVALUES AND EIGENVECTORS which is a k th degree polynomial. So by the inductive hypothesis, this determinant is (−1)k (λk + bk λk−1 + · · · + b2 λ + b1 ). So the entire expression becomes  C(q) = (−1)k (−b0 ) − λ (−1)k (λk + bk λk−1 + · · · + b2 λ + b1 ) = (−1)k+1 b0 + (−1)k+1 (λk+1 + bk λk + · · · + b2 λ2 + b1 λ) = (−1)k+1 (λk+1 + bk λk + · · · + b2 λ2 + b1 λ + b0 ) = (−1)k+1 q(λ). (b) The method is quite similar to Exercises 28(b) and 29(b). Suppose that λ is an eigenvalue of C(p) with eigenvector   x1  x2      x =  ...  .   xn−1  xn Then λx = C(p)x, so that      λx1 x1 −an−1 −an−2 · · · −a1 −a0  λx2   1   0 ··· 0 0     x2     ..   0   ..  1 · · · 0 0  . =      . .. .. ..   .  .. λxn−1   ..   . xn−1  . . . λxn 0 0 ··· 1 0 xn   −an−1 x1 − an−2 x2 − · · · − a1 xn−1 − a0 xn   x1     x2 = .   ..   . xn−1 But this means that xi = λxi+1 for i = 1, 2, . . . , n − 1. Setting xn = 1 gives an eigenvector  n−1  λ λn−2      x =  ...  .    λ  1

33. The characteristic polynomial of A is 1 − λ cA (λ) = det(A − λI) = 2

−1 = (1 − λ)(3 − λ) + 2 = λ2 − 4λ + 5. 3 − λ

Now, A2 =

 1 2

2  −1 −1 = 3 8

 −4 , 7

so that cA (A) = A2 − 4A + 5I2    −1 −4 1 = −4 8 7 2 = O.

  −1 1 +5 3 0

 0 1

4.3. EIGENVALUES AND EIGENVECTORS OF N × N MATRICES

289

34. The characteristic polynomial of A is 1 − λ cA (λ) = det(A − λI) = 1 0

1 −λ 1

0 1 = −λ3 + 2λ2 + λ − 2. 1 − λ

Now,  1 A2 = 1 0

2  0 2 1 = 1 1 1

1 0 1

1 2 1

 1 1 , 2

 1 A3 = 1 0

1 0 1

3  0 3 1 = 3 1 2

3 2 3

 2 3 , 3

so that cA (A) = −A3 + 2A2 + A − 2I3        3 3 2 2 1 1 1 1 0 1 0 = − 3 2 3 + 2 1 2 1 + 1 0 1 − 2 0 1 2 3 3 1 1 2 0 1 1 0 0   −3 + 4 + 1 − 2 −3 + 2 + 1 − 0 −2 + 2 + 0 − 0 = −3 + 2 + 1 − 0 −2 + 4 + 0 − 2 −3 + 2 + 1 − 0 −2 + 2 + 0 − 0 −3 + 2 + 1 − 0 −3 + 4 + 1 − 2

 0 0 1

= O. 35. Since A2 − 4A + 5I = 0, we have A2 = 4A − 5I, and A3 = A(4A − 5I) = 4A2 − 5A = 4(4A − 5I) − 5A = 11A − 20I A4 = (A2 )(A2 ) = (4A − 5I)(4A − 5I) = 16A2 − 40A + 25I = 16(4A − 5I) − 40A + 25I = 24A − 55I. 36. Since −A3 + 2A2 + A − 2I = 0, we have A3 = 2A2 + A − 2I, A4 = A(A3 ) = A(2A2 + A − 2I) = 2A3 + A2 − 2A = 2(2A2 + A − 2I) + A2 − 2A = 5A2 − 4I. 37. Since A2 − 4A + 5I = 0, we get A(A − 4I) = −5I. Then solving for A−1 yields −4 1 4 1 I =− A+ I A−1 = − A − 5 5 5 5  2  1 4 1 2 8 16 1 8 16 4 11 2 A−2 = A−1 = − A + I = A − A + I2 = (4A − 5I) − A + I = − A + I. 5 5 25 25 25 25 25 25 25 25 38. Since −A3 + 2A2 + A − 2I = 0, rearranging terms gives A(−A2 + 2A + I) = 2I. Then solving for A−1 yields (using Exercise 36) 1 1 1 I(−A2 + 2A + I) = − A2 + A + I 2 2 2 2   1 1 2 = A−1 = − A2 + A + I 2 2 1 4 1 1 = A − A3 + A2 + A + I 2 4 2 4 1 1 1 2 2 = (5A − 4I) − (2A + A − 2I) + A2 + A + I 4 2 4 1 2 5 = − A + I. 4 4

A−1 = A−2

290

CHAPTER 4. EIGENVALUES AND EIGENVECTORS

39. Apply Exercise 69 in Section 4.2 to the matrix A − λI:   P − λI Q cA (λ) = det(A − λI) = det = (det(P − λI))(det(S − λI)) = cP (λ)cS (λ). O S − λI (Note that neither O or Q is affected by subtracting an identity matrix, since all the 1’s are along the diagonal, in either P or S.) 40. Start with the equation given in the problem statement: det(A − λI) = (−1)n (λ − λ1 )(λ − λ2 ) · · · (λ − λn ). This is an equation, so in particular it holds for λ = 0, and it becomes det A = det(A − 0I) = (−1)n (−λ1 )(−λ2 ) · · · (−λn ) = (−1)n (−1)n λ1 λ2 · · · λn = λ1 λ2 · · · λn , proving the first result. For the second result, we will find the coefficients of λn−1 on the left and right sides. On the left side, the easiest method given the tools we have is induction on n. For n = 2,   a11 − λ a12 det(A − λI) = det = (a11 − λ)(a22 − λ) − a12 a21 = λ2 − (a11 + a22 )λ − a12 a21 , a21 a22 − λ so that the coefficient of λn−1 = λ is − tr(A). Claim that in general, for n ≥ 1, the coefficient of λn−1 in det(A − λI) is (−1)n−1 tr(A). Suppose this statement holds for n = k. Then if A is (k + 1) × (k + 1), we have (expanding along the first column of A − λI) det A = (a11 − λ) det(A11 − λI) +

k+1 X

(−1)i+1 ai1 det(Ai1 − λI).

i=2

Now, the terms in the sum on the right each consist of a constant times the determinant of a k × k matrix derived by removing row i and column 1 from A. Since i > 1, this means we have removed both the entries a11 − λI and ai1 − λI from A, so that there are only k − 1 entries in A that contain λ. So the sum on the right cannot produce any terms with λk . For the other term, note that A11 − λI is a k × k matrix; using the inductive hypothesis we see that the coefficient of λk−1 in det(A11 − λI) is (−1)k−1 tr(A11 ). Now det A = (a11 − λ)((−1)k λk + (−1)k−1 tr(A11 )λk−1 + . . . ) +

X

...,

so that the coefficient of λk in det A is (−1)k a11 − (−1)k−1 tr(A11 ) = (−1)k (a11 + tr(A11 )) = (−1)k (a11 + a22 + · · · + ak+1,k+1 ) = (−1)k tr(A). Now look at the product on the right in the original expression det(A − λI) = (−1)n (λ − λ1 )(λ − λ2 ) · · · (λ − λn ). We can get a constant times λn−1 by expanding the right-hand side when exactly n − 1 of the factors use λ and one uses λi for some i; that contribution to the λn−1 term in the characteristic polynomial is −λi λn−1 . Summing over all possible choices for i shows that the coefficient of the λn−1 term is (−1)n (−λ1 − λ2 − · · · − λn ) = (−1)n+1 (λ1 + λ2 + · · · + λn ) = (−1)n−1 (λ1 + λ2 + · · · + λn ). Since the coefficient of λn−1 on the left must be the same as that on the right, we have (−1)n−1 tr(A) = (−1)n−1 (λ1 + λ2 + · · · + λn ),

or

tr(A) = λ1 + λ2 + · · · + λn .

4.4. SIMILARITY AND DIAGONALIZATION

291

41. Suppose that the eigenvalues of A are α1 , α2 , . . . , αn (repetitions included) and for B are β1 , β2 , . . . , βn . Let the eigenvalues of A + B be λ1 , λ2 , . . . , λn . Then by Exercise 40, tr(A + B) = λ1 + λ2 + · · · + λn tr(A) + tr(B) = α1 + · · · + αn + β1 + · · · + βn . Since tr(A + B) = tr(A) + tr(B) by Exercise 44(a) in Section 3.2, we get λ1 + λ2 + · · · + λ2 = α1 + · · · + αn + β1 + · · · + βn as desired. Let µ1 , µ2 , . . . , µn be the eigenvalues of AB. Then again by Exercise 40 we have µ1 µ2 · · · µn = det(AB) = det(A) det(B) = α1 α2 · · · αn β1 β2 · · · βn as desired. Pn 42. Suppose x = i=1 ci vi , where the vi are eigenvectors of A with corresponding eigenvalues λi . Then Theorem 4.18 tells us that vi are eigenvectors of Ak with corresponding eigenvalues λki , so that k

A x=A

k

n X i=1

4.4

! ci vi

=

n X i=1

ci Ak vi =

n X

ci λki vi .

i=1

Similarity and Diagonalization

1. The two characteristic polynomials are 4 − λ cA (λ) = 3 1 − λ cB (λ) = 0

1 = (4 − λ)(1 − λ) − 3 = λ2 − 5λ + 1 1 − λ 0 = (1 − λ)2 = λ2 − 2λ + 1. 1 − λ

Since the two characteristic polynomials are unequal, the two matrices are not similar by Theorem 4.22. 2. The two characteristic polynomials are 2 − λ cA (λ) = −4 3 − λ cB (λ) = −5

1 = (2 − λ)(6 − λ) + 4 = λ2 − 8λ + 16 6 − λ −1 = (3 − λ)(7 − λ) − 5 = λ2 − 10λ + 16. 7 − λ

Since the two characteristic polynomials are unequal, the two matrices are not similar by Theorem 4.22. 3. Both A and B are triangular matrices, so by Theorem 4.15, A has eigenvalues 2 and 4, while B has eigenvalues 1 and 4. By Theorem 4.22, the two matrices are not similar.

292

CHAPTER 4. EIGENVALUES AND EIGENVECTORS

4. The two characteristic polynomials are 1 − λ 2 0 1−λ −1 cA (λ) = 0 0 −1 1 − λ 0 1 − λ −1 − 2 = (1 − λ) 0 −1 1 − λ

−1 1 − λ

= (1 − λ)((1 − λ)2 − 1) = (1 − λ)(λ2 − 2λ) = −λ3 + 3λ2 − 2λ 2 − λ 1 1 1−λ 0 cB (λ) = 0 2 0 1 − λ 2 − λ 1 = (1 − λ) 2 1 − λ = (1 − λ)((2 − λ)(1 − λ) − 2) = (1 − λ)(λ2 − 3λ) = −λ3 + 4λ2 − 3λ. Since the two characteristic polynomials are unequal, the two matrices are not similar by Theorem 4.22. 5. By Theorem 4.23, the columns of P are the eigenvectors of A corresponding to the eigenvalues, which are the diagonal entries of D. Thus   1 λ1 = 4 with eigenspace span 1   1 λ2 = 3 with eigenspace span . 2 6. By Theorem 4.23, the columns of P are the eigenvectors of A corresponding to the eigenvalues, which are the diagonal entries of D. Thus   3 λ1 = 2 with eigenspace span 1 2   1 λ2 = 0 with eigenspace span −1 0   0 λ3 = −1 with eigenspace span  1 . −1 7. By Theorem 4.23, the columns of P are the eigenvectors of A corresponding to the eigenvalues, which are the diagonal entries of D. Thus   3 λ1 = 6 with eigenspace span 2 3     0 1 λ2 = −2 with eigenspace span  1 ,  0 . −1 −1 8. To determine whether A is diagonalizable, we first compute its eigenvalues: 5 − λ 2 det(A − λI) = = (5 − λ)2 − 4 = λ2 − 10λ + 21 = (λ − 3)(λ − 7). 2 5 − λ

4.4. SIMILARITY AND DIAGONALIZATION

293

Since A is a 2 × 2 matrix with 2 distinct eigenvalues, it is diagonalizable by Theorem 4.25. To find the diagonalizing matrix P , we find the eigenvectors of A. To find the eigenspace corresponding to λ1 = 3 we must find the null space of   2 2 A − 3I = . 2 2 Row reduce this matrix:

 2 0 = 2 

 A − 3I

2 2

  0 1 1 −→ 0 0 0

 0 . 0

Thus eigenvectors corresponding to λ1 are of the form       −t −1 −1 =t = span . t 1 1 To find the eigenspace corresponding to λ2 = 7 we must find the null space of   −2 2 A − 7I = . 2 −2 Row reduce this matrix:   −2 0 = 2

 A − 7I

  0 1 −→ 0 0

2 −2

 0 . 0

−1 0

Thus eigenvectors corresponding to λ1 are of the form      t 1 1 =t = span . t 1 1 Thus

" P =

#

−1

1

1

1

"

− 21

,

so that

P

−1

=

" − 12

1 2 1 2

1 2

#

and so by Theorem 4.23, P

−1

AP =

1 2

1 2 1 2

#" 5

2

#" −1

2

5

1

# 1 1

" =

3

# 0

0

7

.

9. To determine whether A is diagonalizable, we first compute its eigenvalues: −3 − λ 4 det(A − λI) = = (−3 − λ)(1 − λ) − 4 · (−1) = λ2 + 2λ + 1 = (λ + 1)2 . −1 1 − λ So the only eigenvalue is λ = −1, with algebraic multiplicity 2. To see if A is diagonalizable, we must see if its geometric multiplicity is also 2 by finding its eigenvectors. To find the eigenspace corresponding to λ = −1, we must find the null space of   −2 4 A+I = −1 2 by row-reducing it:  A+I

  −2 4 0 = −1 2

  0 1 −→ 0 0

−2 0

 0 . 0

Thus eigenvectors corresponding to λ1 are of the form     2t 2 , so a basis for the eigenspace is . t 1 Thus λ = −1 has geometric multiplicity 1, which is less than the algebraic multiplicity, so that A is not diagonalizable by Theorem 4.27.

294

CHAPTER 4. EIGENVALUES AND EIGENVECTORS

10. To determine whether A is diagonalizable, we first compute its eigenvalues: 3 − λ 1 0 3−λ 1 = (3 − λ)3 det(A − λI) = 0 0 0 3 − λ since the matrix is a triangular matrix. Thus the only eigenvalue is λ = 3, with algebraic multiplicity 3. To see if A is diagonalizable, we must compute the eigenspace corresponding to this eigenvalue. To   find the eigenspace corresponding to λ = 3 we row-reduce A − 3I 0 :  A − 3I

 0 0 = 0 0 

1 0 0

0 1 0

 0 0 . 0

But this matrix is already row-reduced, and thus the eigenspace is       s 1 1 0 = s 0 = span 0 . 0 0 0 Thus the geometric multiplicity is only 1 while the algebraic multiplicity is 3. So by Theorem 4.27, the matrix is not diagonalizable. 11. To determine whether A is diagonalizable, we first compute its eigenvalues: 1 − λ 0 1 1 − λ 1 = −λ3 + 2λ2 + λ − 2 = −(λ + 1)(λ − 1)(λ − 2). det(A − λI) = 0 1 1 −λ Since A is a 3 × 3 matrix with 3 distinct eigenvalues, it is diagonalizable by Theorem 4.25. To find the diagonalizing matrix P , we find the eigenvectors of A. To find the eigenspace corresponding to λ = −1, we must find the null space of   2 0 1 A + I = 0 2 1 1 1 1 by row-reducing it: h A+I

 2 0 1 i  0 = 0 2 1 1 1 1

  0 1 0   0 −→ 0 1 0 0 0

1 2 1 2

0

 0  0 0

Thus eigenvectors corresponding to λ1 are of the form     − 12 t −t  1    − 2 t or, clearing fractions −t . t 2t   −1 Thus a basis for this eigenspace is −1. 2 To find the eigenspace corresponding to λ = 1, we must find the null space of   0 0 1 1 A − I = 0 0 1 1 −1

4.4. SIMILARITY AND DIAGONALIZATION

295

by row-reducing it:  A−I

 0 0  0 = 0 0 1 1

1 1 −1

  0 1 1 0 0 −→ 0 0 1 0 0 0 0

 0 0 0

Thus eigenvectors corresponding to λ1 are of the form     −t −1  t , so a basis for the eigenspace is  1 . 0 0 To find the eigenspace corresponding to λ = 2, we must find the null space of   −1 0 1 1 A − 2I =  0 −1 1 1 −2 by row-reducing it:  −1 0 = 0 1 

 A − 2I

0 1 −1 1 1 −2

  0 1 0 −→ 0 0 0

0 1 0

−1 −1 0

 0 0 0

Thus eigenvectors corresponding to λ1 are of the form    1 t t , so a basis for the eigenspace is 1 . 1 t Then a diagonalizing matrix P is the matrix whose columns are the eigenvectors of A:     − 61 − 16 13 −1 −1 1     1 P = −1 1 1 , so that P −1 = − 12 0 2 1 1 1 2 0 1 3 3 3 and so by Theorem 4.23,  − 16  P −1 AP = − 21

− 16 1 2 1 3

1 3

 1  0 0 1 1 3 1 3

0 1 1

 1 −1  1 −1 0 2

  −1 1 −1   1 1 =  0 0 1 0

0 1 0

 0  0 . 2

12. To determine whether A is diagonalizable, we first compute its eigenvalues: 1 − λ 0 0 2 − λ 1 2−λ 1 = (1 − λ) det(A − λI) = 2 = (1 − λ)2 (2 − λ). 0 1 − λ 3 0 1−λ So the eigenvalues are λ1 = 1 with algebraic multiplicity 2, and  λ2 = To find the eigenspace corresponding to λ1 = 1 we row-reduce A − I    1 0 0 0 1 0 0   A − I 0 = 2 2 1 0 → 0 1 1 3 0 1 0 0 0 0

2 with algebraic multiplicity 1.  0 :  0 0 . 0

Thus the eigenspace is 

     0 0 0 −t = t −1 = span −1 . t 1 1 Thus the geometric multiplicity of this eigenvalue is only 1 while the algebraic multiplicity is 2. So by Theorem 4.27, the matrix is not diagonalizable.

296

CHAPTER 4. EIGENVALUES AND EIGENVECTORS

13. To determine whether A is diagonalizable, we first compute its eigenvalues: 1 − λ 2 1 −λ 1 = λ2 − λ3 = −λ2 (λ − 1). det(A − λI) = −1 1 1 −λ So A has eigenvalues 0 and 1. Considering λ = 0, we row-reduce    1 2 1 0 1 0 −1 −1 0 1 0 → 0 1 1 1 1 0 0 0 0 0

A − 0I = A:  0 0 , 0

so that the eigenspace is 

 1 span −1 . 1 Since λ = 0 has algebraic multiplicity 2 but geometric multiplicity 1, the matrix is not diagonalizable. 14. To determine whether A is diagonalizable, we first compute its eigenvalues: 2 − λ 0 0 2 0 3−λ 2 1 = (1 − λ)(2 − λ)(3 − λ)2 det(A − λI) = 0 3−λ 0 0 0 0 0 1 − λ since the matrix is upper triangular. So the eigenvalues are λ1 = 1 with algebraic multiplicity 1, λ2 = 2 with algebraic multiplicity 1, and λ3 = 3 with  algebraic multiplicity 2. To find the eigenspace corresponding to λ3 = 3 we row-reduce A − 3I 0 :     −1 0 0 2 0 1 0 0 0 0     0 0 2  1 0  → 0 0 1 0 0 . A − 3I 0 =   0 0 0   0 0 0 0 0 1 0 0 0 0 −2 0 0 0 0 0 0 Thus the eigenspace is       0 0 0 1 1 t   = t   = span   . 0 0 0 0 0 0 Thus the geometric multiplicity of this eigenvalue is only 1 while the algebraic multiplicity is 2. So by Theorem 4.27, the matrix is not diagonalizable. 15. Since A is triangular, we can read off its eigenvalues, which are λ1 = 2 and λ2 = −2, each of which has algebraic multiplicity 2. Computing eigenspaces, we have         0 0 0 4 0 0 0 1 0 0 1 0          0 0 0 1 0   0 0 0 0 0 0 →  , so E2 has basis   , 1 A − 2I 0 =  0 0 −4        0 0 0 0 0 0 0 0 0       0 0 0 −4 0 0 0 0 0 0 0 0         4 0 0 4 0 1 0 0 1 0 −1 0         0 1 0 0 0  0 4 0 0 0  0      0 A + 2I 0  0 0 0 0 0 → 0 0 0 0 0 , so E−2 has basis  0 , 1 .     0 0 0 0 0 0 0 0 0 0 1 0 Since each eigenvalue also has geometric multiplicity 2, the matrix is diagonalizable, and     1 0 −1 0 2 0 0 0 0 1  0 0 0 0  , D = 0 2  P = 0 0   0 1 0 0 −2 0 0 0 1 0 0 0 0 −2

4.4. SIMILARITY AND DIAGONALIZATION provides a solution to  1  0 P −1 AP =  0 0

0 1 0 0

0 0 0 1

 2 1 0 0  1 0 0 0

0 2 0 0

297

 0 4 1  0 0  0 −2 0 0 0 −2 0

0 1 0 0

16. As in Example 4.29, we must find a matrix P such that  −1 −1 −4 P AP = P −3

  2 −1 0 0 0 0 = 0 1 0 0 1 0

0 2 0 0

 0 0 0 0  = D. −2 0 0 −2

 6 P 5

is diagonal. To do this, we find the eigenvalues and a basis for the eigenspaces of A. The eigenvalues of A are the roots of its characteristic polynomial −4 − λ 6 = (−4 − λ)(5 − λ) + 18 = λ2 − λ − 2. det(A − λI) = −3 5 − λ This polynomial factors as (λ − 2)(λ + 1), so the eigenvalues are λ1 = −1 and λ2 = 2. The eigenspace associated with λ1 can be found by finding the null space of A + I:       −3 6 0 1 −2 0 A+I 0 = → −3 6 0 0 0 0   2 Thus a basis for this eigenspace is given by . 1 The eigenspace associated with λ2 can be found by finding the null space of A − 2I:       −6 6 0 1 −1 0 A − 2I 0 = → . −3 3 0 0 0 0 Thus a basis for this eigenspace is given by Set

 1 P = 1

 2 , 1

  1 . 1 so that P

−1

 −1 = 1

 2 −1

Then D = P −1 AP =

 2 0

  0 1 , so that A5 = P D5 P −1 = −1 1

2 1

 32 0

 0 −1 −1 1

  2 −34 = −1 −33

66 65



17. As in Example 4.29, we must first diagonalize A. To do this, we find the eigenvalues and a basis for the eigenspaces of A. The eigenvalues of A are the roots of its characteristic polynomial −1 − λ 6 = (−1 − λ)(−λ) − 6 = λ2 + λ − 6 = (λ + 3)(λ − 2), det(A − λI) = 1 −λ  so the  are λ1 = −3 and λ2 = 2. To compute the eigenspaces, we row-reduce A + 3I  eigenvalues and A − 2I 0 :  A + 3I  A − 2I

       2 6 0 1 3 0 −3 0 = → , so E−3 has basis 1 3 0 0 0 0 1        −3 6 0 1 −2 0 2 0 = → so E2 has basis . 1 −2 0 0 0 0 1

 0

298

CHAPTER 4. EIGENVALUES AND EIGENVECTORS Thus

" −3 P = 1

# 2 1

" P −1 =

1 5 1 5

− 52 3 5

#

"

−3 D= 0

0 2

#

satisfy P −1 AP = D, so that "

#10 6 = (P DP −1 )10 = P D10 P =1 0 " #" #10 " # 1 −3 2 −3 0 − 52 5 = 1 3 1 1 0 2 5 5 " #" #10 " # 1 −3 2 (−3)10 0 − 25 5 = 1 3 1 1 0 210 5 5 " # 35,839 −69,630 = . −11,605 24,234

−1 1

18. As in Example 4.29, we must first diagonalize A. To do this, we find the eigenvalues and a basis for the eigenspaces of A. The eigenvalues of A are the roots of its characteristic polynomial 4 − λ −3 = (4 − λ)(2 − λ) − 3 = λ2 − 6λ + 5 = (λ − 5)(λ − 1), det(A − λI) = −1 2 − λ  A−I so the eigenvalues are λ = 1 and λ = 5. To compute the eigenspaces, we row-reduce 1 2   A − 5I 0 : 

A−I

 A − 5I



  0 1 → 0 0   0 1 → 0 0

−3 1

3 0 = −1   −1 0 = −1 

−3 −3

−1 0 3 0

 0 and

   0 1 , so that E1 has basis , 0 1    0 −3 , so that E5 has basis . 0 1

Set " P =

1

# −3

1

1

" ,

so that P −1 =

#

1 4 − 14

3 4 1 4,

" ,

D=

1

# 0

0

5

.

Then " A

−6

= PD

−6

P

−1

=

1 1

#" −3 1 1

0

0 5−6

#"

1 4 − 14

3 4 1 4

#

" =

3907 56 3906 56

11718 56 11719 56

# .

19. As in Example 4.29, we must first diagonalize A. To do this, we find the eigenvalues and a basis for the eigenspaces of A. The eigenvalues of A are the roots of its characteristic polynomial −λ 3 det(A − λI) = = (−λ)(2 − λ) − 3 = λ2 − 2λ − 3 = (λ − 3)(λ + 1), 1 2 − λ  so the  are λ1 = 3 and λ2 = −1. To compute the eigenspaces, we row-reduce A − 3I  eigenvalues and A + I 0 :        −3 3 0 1 −1 0 1 0 = → , so that E3 has basis , 1 −1 0 0 0 0 1         1 3 0 1 3 0 −3 A+I 0 = → , so that E−1 has basis . 1 3 0 0 0 0 1

 A − 3I

 0

4.4. SIMILARITY AND DIAGONALIZATION Set

" P =

299

"

#

1

−3

1

1

so that P −1 =

,

#

1 4 − 41

3 4 1 4,

,

D=

" 3

# 0 −1

0

.

Then k

k

A = PD P " =

−1

=

3k +3(−1)k 4 3k −(−1)k 4

" 1 1

#" −3 3k 1

1 4 − 41

(−1)k

0

3k+1 −3(−1)k 4 3k+1 +(−1)k 4

#"

0

3 4 1 4

#

# .

20. As in Example 4.29, we must first diagonalize A. To do this, we find the eigenvalues and a basis for the eigenspaces of A. The eigenvalues of A are the roots of its characteristic polynomial 2 − λ 1 2 1−λ 2 = 5λ2 − λ3 = −λ2 (λ − 5), det(A − λI) = 2 2 1 2 − λ   are λ1 = 0 and λ2 = 5. To compute the eigenspaces we row-reduce A − 0I 0 and so the eigenvalues  A − 5I 0 :          −1  1 12 1 0   −1  h i 2 1 2 0        → 0 0 0 0 , so E0 has basis  2 ,  0 A 0 = 2 1 2 0             0 1  2 1 2 0 0 0 0 0       1 0 −1 0 −3 1 2 0  1    A − 5I 0 =  2 −4 2 0 → 0 1 −1 0 , so E5 has basis 1 .   1 2 1 −3 0 0 0 0 0 Set

 −1   P = 2 0

Then

−1 0 1

1



 1 , 1

so that P −1

 −1  5 2 = − 5 2 5

 0 D = P −1 AP = 0 0

0 0 0

 0 0 , 5

2 5 − 51 1 5

− 51

2 5 − 15 1 5

− 15

 

3 . 5 2 5

so that A8 = P D8 P −1

 −1   = 2 0

 1 0   0 1  0 1 1 0

−1

0 0 0

 0 − 51   2 0  − 5 2 58 5



 156250   3 =  156250 5 2 156250 5



78125

156250

78125

 156250 .

78125

156250

21. As in Example 4.29, we must first diagonalize A. To do this, we find the eigenvalues and a basis for the eigenspaces of A. Since A is upper triangular, its eigenvalues are the diagonal are   entries, which  λ1 = −1 and λ2 = 1. To compute the eigenspaces, we row-reduce A + I 0 and A − I 0 :          2 1 1 0 1 21 12 0 −1   −1  h i         A + I 0 = 0 0 0 0 → 0 0 0 0 , so E−1 has basis  2 ,  0 ,     0 0 0 0 0 0 0 0 0 2       0 1 1 0 0 1 0 0  1    A − I 0 = 0 −2 0 0 → 0 0 1 0 , so E1 has basis 0 .   0 0 −2 0 0 0 0 0 0

300

CHAPTER 4. EIGENVALUES AND EIGENVECTORS Set  1  2 0 , 0 0

 −1  P =  0 2

−1

so that P −1

 0  = 0 1

0 1 2 1 2



1 2

0 .

1 2

Then  −1 D = P −1 AP =  0 0

 0 0 −1 0 , 0 1

so that

A2015 = P D2015 P −1

 −1   = 0 2  −1   = 0

−1 2 0 −1 2

2

 1   = 0 0

0 1 −1

  0 0 0 1 (−1)2015   2015     0 (−1) 0  0 0  2015 0 0 1 0 1    0 0 0 0 21 1 −1    1    0   0 −1 0 0 2 0 0 0 1 1 12 12 0  1  0  = A.

0 1 2 1 2



1 2

0 

1 2

0 −1

22. As in Example 4.29, we must first diagonalize A. To do this, we find the eigenvalues and a basis for the eigenspaces of A. The eigenvalues of A are the roots of its characteristic polynomial 0 1 1−λ 1 0 2 − λ 2 − λ 1 = (1 − λ) 1 2 − λ

2 − λ det(A − λI) = 1 1

= (1 − λ)((2 − λ)2 − 1) = (1 − λ)(λ2 − 4λ + 3) = −(λ − 1)2 (λ − 3).  Thus the eigenvalues are λ1 = 1 and λ2 = 3. To compute the eigenspaces, we row-reduce A − I  and A − 3I 0 .

 A−I

 A − 3I

 1 0 1  0 = 1 0 1 1 0 1  −1 0  0 =  1 −2 1 0

       0 1 0 1 0 0   −1 0 → 0 0 0 0 , so E1 has basis  0 , 1   0 0 0 0 0 1 0      1 0 1 0 −1 0  1  1 0 → 0 1 −1 0 , so E3 has basis 1 .   −1 0 0 0 0 0 1

Set  1  P = 1 1

 0  0 1 , 1 0



−1

so that P −1 =

1  2 − 1  2 − 12

0 0 1



1 2 1 , 2 − 21

 3  D= 0 0

0 1 0

 0  0 , 1

 0

4.4. SIMILARITY AND DIAGONALIZATION

301

so that

Ak = P Dk P −1

 1  = 1 1  1  = 1 1 

 0 3k   1  0 0 0  0 3k   1  0 0 0

−1 0 1 −1 0 1

1 k (3 + 1) 0 2 1  =  2 (3k − 1) 1 1 k 2 (3 − 1) 0

0 1k 0



1  2  1 0  − 2 1k − 12

0

 1 0 0  2  1 1 0  − 2 0 1 − 12  1 k 2 (3 − 1) 1 k  2 (3 − 1) 1 k 2 (3 + 1)



1 2 1 2 − 21

0 0 1 0 0 1



1 2 1 2 − 21

23. As in Example 4.29, we must first diagonalize A. To do this, we find the eigenvalues and a basis for the eigenspaces of A. The eigenvalues of A are the roots of its characteristic polynomial 1 − λ 1 1 −2 − λ 2 = −(λ − 1)(λ − 2)(λ + 3). det(A − λI) = 2 0 1 1 − λ Thus the eigenvalues are λ1 =1, λ2 = 2, and λ2 = −3. To compute the eigenspaces, we row-reduce    A − I 0 , A − 2I 0 , and A + 3I 0 .       1 0 1 0 0 1 0 0  −1    A − I 0 = 2 −3 2 0 → 0 1 0 0 , so E1 has basis  0   1 0 1 0 0 0 0 0 0        1 0 −1 1  −1 1 0 0 0    A − 2I 0 =  2 −4 2 0 → 0 1 −1 0 , so E2 has basis 1   1 0 1 −1 0 0 0 0 0         1 0 −1 0 4 1 0 0  1    A + 3I 0 = 2 1 2 0 → 0 1 4 0 , so E3 has basis −4 .   1 0 1 4 0 0 0 0 0 Set

 −1  P = 0 1

1 1 1

 1  −4 , 1

so that P −1

 − 12  =  52 1 10

0 1 5 − 15



1 2 2 , 5 1 10

 1  D = 0 0

0 2 0

 0  0 , −3

so that Ak = P Dk P −1

   −1 1 1 1 0 0 − 21 0    2 1 k =  0 1 −4 0 2 0  5 5 1 1 1 1 0 0 (−3)k − 15 10  1 1 k k k+2 k ) 10 (5 + (−3) + 2 5 (2 − (−3) )  1 k k = − 25 ((−3)k − 2k ) 5 (4(−3) + 2 ) 1 1 k k k+2 k ) 5 (2 − (−3) ) 10 (−5 + (−3) + 2



1 2 2 5 1 10

24. The characteristic polynomial of A is 1 − λ det(A − λI) = 0



1 k k+2 ) 10 (−5 + (−3) + 2  2 k k − 5 ((−3) − 2 ) . 1 k k+2 ) 10 (5 + (−3) + 2

1 = (λ − 1)(λ − k), k − λ

302

CHAPTER 4. EIGENVALUES AND EIGENVECTORS so that if k 6= 1, then A is a 2 × 2 matrix with two distinct eigenvalues, so it is diagonalizable. If k = 1, then   0 1 A−I = , 0 0   1 which gives a one-dimensional eigenspace with basis . Since λ = 1 has algebraic multiplicity 2 but 0 geometric multiplicity 1, A is not diagonalizable. So A is diagonalizable exactly when k 6= 1.

25. The characteristic polynomial of A is 1 − λ det(A − λI) = 0

k = (λ − 1)2 . 1 − λ

The only eigenvalue is 1, and then A−I =

 0 0

 k . 0

If k = 0, then   A is diagonal to start with, while if k 6= 0, then we get a one-dimensional eigenspace 1 with basis . Since λ = 1 then has algebraic multiplicity 2 but geometric multiplicity 1, A is not 0 diagonalizable. So A is diagonalizable exactly when k = 0. 26. The characteristic polynomial of A is k − λ det(A − λI) = 1 √

1 = λ(λ − k) − 1 = λ2 − kλ − 1, −λ

2

so that A has the eigenvalues λ = k± 2k +4 . Since k 2 + 4 is never zero, and always positive, the eigenvalues of A are real and distinct, so A is diagonalizable for all k. 27. Since A is triangular, the only eigenvalue is 1, with algebraic multiplicity 3. Then   0 0 k A − I = 0 0 0 . 0 0 0 If k = 0, then to start with, while if k 6= 0, then we get a one-dimensional eigenspace  A  isdiagonal  0   1 with basis 0 , 1 . Since λ = 1 then has algebraic multiplicity 3 but geometric multiplicity 2,   0 0 A is not diagonalizable. So A is diagonalizable exactly when k = 0. 28. Since A is triangular, the eigenvalues are 1, with plicity 1. Then  0 k A − I = 0 1 0 0

algebraic multiplicity 2, and 2, with algebraic multi  0 0 0 → 0 0 0

1 0 0

 0 0 0

    0   1 regardless of the value of k, so λ = 1 has a two-dimensional eigenspace with basis 0 , 0 , i.e.,   0 1 with geometric multiplicity 2. Since this is equal to the algebraic multiplicity of λ, we see that A is diagonalizable for all k. 29. The characteristic polynomial of A is 1 − λ 1 1−λ det(A − λI) = 1 1 1

k k = −λ3 + 2λ2 + kλ2 = −λ2 (λ − (k + 2)). k − λ

4.4. SIMILARITY AND DIAGONALIZATION If k = −2, then A has only the eigenvalue  1 A = 1 1

303

0, with algebraic multiplicity 3, and then    1 −2 1 1 −2 1 −2 → 0 0 0 , 1 −2 0 0 0

so the eigenspace is two-dimensional and therefore the geometric multiplicity is not equal to the algebraic multiplicity, so that A is not diagonalizable. If k 6= −2, then 0 has algebraic multiplicity 2, and then     1 1 −k 1 1 −k 0 , A = 1 1 −k  → 0 0 1 1 −k 0 0 0 and the eigenspace is again two-dimensional, so the geometric multiplicity equals the algebraic multiplicity and A is diagonalizable. Thus A is diagonalizable exactly when k 6= −2. 30. We are given that A ∼ B, so that B = P −1 AP for some invertible matrix P , and that B ∼ C, so that C = Q−1 BQ for some invertible matrix Q. But then C = Q−1 BQ = Q−1 P −1 AP Q = (P Q)−1 AP Q, so that C is R−1 AR for the invertible matrix R = P Q. 31. A is invertible if and only if det A 6= 0. But since A and B are similar, det A = det B by part (a) of this theorem. Thus A is invertible if and only if det B 6= 0, which happens if and only if B is invertible. 32. By Exercise 61 in Section 3.5, if U is invertible then rank A = rank(U A) = rank(AU ). Now, since A and B are similar, we have B = P −1 AP for some invertible matrix P , so that rank B = rank(P −1 (AP )) = rank(AP ) = rank A, using the result above twice. 33. Suppose that B = P −1 AP where P is invertible. This can be rewritten as BP −1 = P −1 A. Let λ be an eigenvalue of A with eigenvector x. Then B(P −1 x) = (BP −1 )x = (P −1 A)x = P −1 (Ax) = P −1 (λx) = λ(P −1 x). Thus P −1 x is an eigenvector for B, corresponding to λ. So every eigenvalue of A is also an eigenvalue of B. By writing the similarity equation as A = Q−1 BQ, we see in exactly the same way that every eigenvalue of B is an eigenvalue of A. So A and B have the same eigenvalues. (Note that we have also proven a relationship between the eigenvectors corresponding to a given eigenvalue). 34. Suppose that A ∼ B, say A = P −1 BP . Then if m = 0, we have Am = B m = I, so that A0 ∼ B 0 . If m > 0, then m Am = P −1 BP = |P −1 BP · P −1 BP ·{z · · P −1 BP · P −1 BP} = P −1 B m P. m times

But A

m

=P

−1

m

B P means that A

m

m

∼B .

35. From part (f), Am ∼ B m for m ≥ 0. If m < 0, then −1 −1 Am = A−m = P −1 B −m P −1 −1 −1 since A−m and B −m are similar because −m > 0. But P −1 B −m P = P −1 (B −m ) P −1 = P −1 B m P . Thus Am ∼ B m for m < 0, so they are similar for all integers m. 36. Let P = B −1 . Then BA = BABB −1 = B(AB)B −1 = P −1 (AB)P, showing that AB ∼ BA.

304

CHAPTER 4. EIGENVALUES AND EIGENVECTORS

37. Exercise 45 in Section 3.2 shows that if A and B are n × n, then tr(AB) = tr(BA). But then if B = P −1 AP , we have tr B = tr(P −1 AP ) = tr((P −1 A)P ) = tr(P (P −1 A)) = tr((P P −1 )A) = tr A. 38. A is triangular, so it has eigenvalues −1 and 3. For B, we have 1 − λ 2 = (1 − λ)2 − 4 = λ2 − 2λ − 3 = (λ + 1)(λ − 3). det(B − λI) = 2 1 − λ So the eigenvalues of B are also −1 and 3, and thus  −1 A∼B∼ 0

 0 = D. 3



   1 1 Reducing A + I and A − 3I shows that E−1 = span and E3 = span . Reducing B + I −4 0     1 1 and B − 3I shows that E−1 = span and E3 = span . Thus −1 1 "

−1

D=Q

#" #" # 3 1 1 1 0 −1 −4 0 #" #" # 1 1 − 12 1 2 , 1 2 1 −1 1 2

1 4 − 41

0 AQ = 1 "

D = R−1 BR =

1 2 1 2

so that finally B = RDR−1 = RQ−1 AQR−1 = (QR−1 )−1 A(QR−1 ). Thus P = QR

−1



1 = −2

 0 2

is an invertible matrix such that B = P −1 AP . 39. We have 5 − λ −3 = (5 − λ)(−2 − λ) + 12 = λ2 − 3λ + 2 = (λ − 1)(λ − 2) det(A − λI) = 4 −2 − λ −1 − λ 1 = (−1 − λ)(4 − λ) + 6 = λ2 − 3λ + 2 = (λ − 1)(λ − 2). det(B − λI) = −6 4 − λ So A and B both have eigenvalues 1 and 2, so that  1 A∼B∼ 0

 0 = D. 2

    3 1 Reducing A − I and A − 2I shows that E1 = span and E2 = span . Reducing B − I and 4 1     1 1 B − 2I shows that E1 = span and E3 = span . Thus 2 3  −1 4  3 D = R−1 BR = −2 D = Q−1 AQ =

   1 5 −3 3 1 −3 4 −2 4 1    −1 −1 1 1 1 , 1 −6 4 2 3

4.4. SIMILARITY AND DIAGONALIZATION

305

so that finally B = RDR−1 = RQ−1 AQR−1 = (QR−1 )−1 A(QR−1 ). Thus P = QR−1 =



 7 −2 10 −3

is an invertible matrix such that B = P −1 AP . 40. A is upper triangular, so it has eigenvalues −2, 1, and 2. For B, we have 3 − λ 2 −5 2−λ −1 = −(λ + 2)(λ − 1)(λ − 2), det(B − λI) = 1 2 2 −4 − λ So B also has eigenvalues −2, 1, and 2, so that  −2 A∼B∼ 0 0

0 1 0

 0 0 = D. 2

Reducing A + 2I, A − I, and A − 2I gives     1 1 E−2 = span −4 , E1 = span −1 , −3 0 Reducing B + 2I, B − I, and B − 2I gives     1 1 E−2 = span 0 , E1 = span −1 , 0 1

  1 E2 = span 0 . 0

  1 E2 = span 2 . 1

Therefore    1 0 − 41 1 1 2 1 0 12    D = Q−1 AQ = 0 0 − 13  0 −2 1 −4 −1 1 1 0 −3 0 0 1 1 4 4    1 1 3 −2 −2 1 3 2 −5 1 2    −1 D = R BR =  1 0 −1 1 2 −1 0 −1 1 1 0 − 21 2 2 −4 1 2 2

 1  0 0  1  2 , 1

so that finally B = RDR−1 = RQ−1 AQR−1 = (QR−1 )−1 A(QR−1 ). Thus



P = QR−1

1 = 1 −3

0 2 0

 0 −5 3

is an invertible matrix such that B = P −1 AP . 41. We have 1 − λ 0 2 −1 − λ 1 = −(λ + 1)2 (λ − 3) det(A − λI) = 1 2 0 1 − λ −3 − λ −2 0 5−λ 0 = −(λ + 1)2 (λ − 3), det(B − λI) = 6 4 4 −1 − λ

306

CHAPTER 4. EIGENVALUES AND EIGENVECTORS so that A and B both have eigenvalues −1 (of algebraic  −1 0 A ∼ B ∼  0 −1 0 0

multiplicity 2) and 3. Therefore  0 0 = D. 3

Reducing A + I and A − 3I gives 

E−1

 1 = span  0 , −1

  0 1 , 0

  2 E3 = span 1 . 2

Reducing B + I and B − 3I gives 

E−1

 1 = span −1 , 0

  0 0 , 1



 1 E3 = span −3 . −2

Therefore  D = Q−1 AQ =

1 2  1 − 4 1 4



3 2

 D = R−1 BR =  −1 − 21

  − 12 1 0 2 1 0  1  − 4  1 −1 1  0 1 1 2 0 1 −1 0 4   1 1 0 0 −3 −2 2   5 0 −1 −1 1  6 0 4 4 −1 − 12 0 0 1 0

 2  1 2 0 0 1

 1  −3 , −2

so that finally B = RDR−1 = RQ−1 AQR−1 = (QR−1 )−1 A(QR−1 ). Thus

 P = QR−1 =

1 2  3 − 2 − 25

− 12 − 32 − 32

 0  1 0

is an invertible matrix such that B = P −1 AP . 42. Suppose that B = P −1 AP . This gives P B = AP , so that (P B)T = (AP )T . Expanding yields B T P T = P T AT , so that B T ∼ AT and thus AT ∼ B T . 43. Suppose that P −1 AP = D, where D is diagonal. Then D = DT = (P −1 AP )T = P T AT (P T )−1 . Now let Q = (P T )−1 ; the equation above becomes D = Q−1 AT Q, so that AT is diagonalizable. 44. If A is invertible, it has no zero eigenvalues. So if it is diagonalizable, the diagonal matrix D is also invertible since none of the diagonal entries are zero. Then D−1 is also diagonal, consisting of the inverses of each of the diagonal entries of D. Suppose that D = P −1 AP . Then −1 −1 D−1 = P −1 AP = P −1 A−1 P −1 = P −1 A−1 P, showing that A−1 is also diagonalizable (and, incidentally, showing again that the eigenvalues of A−1 are the inverses of the eigenvalues of A, by looking at the entries of D−1 .) 45. Suppose that A is diagonalizable, and that λ is its only eigenvalue. Then if D = P −1 AP is diagonal, all of its diagonal entries must equal λ, so that D = λI. Then λI = P −1 AP implies that P (λI)P −1 = P P −1 AP P −1 = A. However, P (λI)P −1 = λP IP −1 = λI. Thus A = λI.

4.4. SIMILARITY AND DIAGONALIZATION

307

46. Since A and B each have n distinct eigenvalues, they are both diagonalizable by Theorem 4.25. Recall that diagonal matrices commute: if D and E are both diagonal, then DE = ED. Now, if A and B have the same eigenvectors, then since the diagonalizing matrix P has columns equal to the eigenvectors, we see that for the same invertible matrix P , both DA = P −1 AP and DB = P −1 BP are diagonal. But then AB = (P DA P −1 )(P DB P −1 ) = P DA DB P −1 = P DB DA P −1 = (P DB P −1 )(P DA P −1 ) = BA. Conversely, if AB = BA, let α1 , α2 , . . . , αn be the distinct eigenvalues of A, and pi the corresponding eigenvectors, which are also distinct since they are linearly independent. Then A(Bpi ) = (AB)pi = (BA)pi = B(Api ) = B(αi Pi ) = αi Bpi . Thus Bpi is also an eigenvector of A corresponding to αi , so it must be a multiple of pi . That is, Bpi = βi pi for all i, so that the eigenvectors of B are also pi . So every eigenvector of A is an eigenvector of B. Since B also has n distinct eigenvectors, the set of eigenvectors must be the same. 47. If A ∼ B, then A and B have the same characteristic polynomial, so the algebraic multiplicities of their eigenvalues are the same. 48. From the solution to Exercise 33, we know that if B = P −1 AP and x is an eigenvector of A with eigenvalue λ, then P −1 x is an eigenvalue of B with eigenvalue λ. Therefore, if λ is an eigenvalue of A with multiplicity m, its eigenspace has a basis consisting of m distinct vectors v1 , v2 , . . . , vm , so that the vectors P −1 v1 , P −1 v2 , . . . , P −1 vm are in the eigenspace of B corresponding to λ. These vectors are linearly independent since P −1 is an invertible linear transformation. So each of the n eigenvectors of A corresponds to a distinct eigenvector of B; since each matrix has n eigenvectors, the reverse is true as well. Therefore the m eigenvalues of B found above are all of the eigenvalues of B corresponding to λ, so that the dimension of the eigenspace of λ for B is m as well. So the geometric multiplicities of the eigenvalues of A and B are the same. 49. If D = P −1 AP and D is diagonal, then every entry of D is either 0 or 1, since every eigenvalue of A is either 0 or 1 by assumption. Since D2 is computed by squaring each diagonal entry, we see that D2 = D. But then P DP −1 = A, and A2 = (P DP −1 )2 = (P DP −1 )(P DP −1 ) = P −1 D2 P = P −1 DP = A, so that A is idempotent. 50. Suppose that A is nilpotent and diagonalizable, so that D = P −1 AP (so that A = P DP −1 ) and Am = O for some m > 1. Then Dm = (P −1 AP )m = P −1 Am P = P −1 OP = O. Since D is diagonal, Dm is found by taking the mth power of each diagonal entry. If this is the zero matrix, then all the diagonal entries must be zero, so that D = O. But then A = P OP −1 = O, so that A is the zero matrix. 51. (a) Suppose we have three vectors v1 , v2 , and v3 with Av1 = v1 ,

Av2 = v2 ,

Av3 = v3 .

Then all three vectors must lie in the eigenspace E1 corresponding to λ = 1, since all three vectors are multiplied by 1 when A is applied. However, since the algebraic multiplicity of λ = 1 is 2, the geometric multiplicity is no greater than 2, by Lemma 4.26. But any three vectors in a twodimensional space must be linearly independent, so that v1 , v2 , and v3 are linearly dependent. (b) By Theorem 4.27, A is diagonalizable if and only if the algebraic multiplicity of each eigenvalue equals its geometric multiplicity. So the dimensions of the eigenspaces must be dim E−1 = 1, dim E1 = 2, dim E2 = 3, since the algebraic multiplicity of −1 is 1, that of 1 is 2, and that of 2 is 3.

308

CHAPTER 4. EIGENVALUES AND EIGENVECTORS

52. (a) The characteristic polynomial of A is a − λ b = λ2 − (a + d)λ + (ad − bc) = λ2 − tr(A) + det(A). det(A − λI) = c d − λ Therefore the eigenvalues are p p a + d ± (a + d)2 − 4(ad − bc) a + d ± (a − d)2 + 4bc λ= = . 2 2 So if (a − d)2 + 4bc > 0, then the two eigenvalues of A are real and distinct and therefore A is diagonalizable. If (a − d)2 + 4bc < 0, then A has no real eigenvalues and so is not diagonalizable. (b) If (a − d)2 + 4bc = 0, then A has one real eigenvalue, and A may or may not be diagonalizable depending on the geometric multiplicity of that eigenvalue. For example, the zero matrix O has (a − d)2 + 4bc = 0, and it is already a diagonal matrix, so is diagonalizable. However,   1 −1 A= 1 −1 satisfies (a − d)2 + 4bc = 0; from the formula above,   its only real eigenvalue is λ = 0, and it is 1 easy to see by row-reducing A that E0 = span . So the geometric multiplicity of λ = 0 is 1 only 1 and thus A is not diagonalizable, by Theorem 4.27.

4.5

Iterative Methods for Computing Eigenvalues

1. (a) For the given value of x5 , we estimate a dominant eigenvector of A to be # " # " 1 1 . 11,109 ≈ 2.500 4443 The corresponding dominant eigenvalue is computed by finding " #" # " # 1 2 4443 26,661 x6 = Ax5 = = , 5 4 11,109 66,651 ≈ 6.001, 66,651 11,109 ≈ 6.000. 1 − λ 2 (b) det(A − λI) = = (1 − λ)(4 − λ) − 10 = λ2 − 5λ − 6 = (λ + 1)(λ − 6), so the dominant 5 4 − λ eigenvalue is 6. and

26,661 4443

2. (a) For the given value of x5 , we estimate a dominant eigenvector of A to be # " # " 1 1 ≈ . 3904 −0.500 − 7811 The corresponding dominant eigenvalue is computed by finding " #" # " # 7 4 7811 39,061 x6 = Ax5 = = , −3 −1 −3904 −19,529 and

39,061 7811

≈ 5.001,

−19,529 −3904

≈ 5.002.

4.5. ITERATIVE METHODS FOR COMPUTING EIGENVALUES

309

7 − λ 4 (b) det(A − λI) = = (7 − λ)(−1 − λ) + 12 = λ2 − 6λ + 5 = (λ − 1)(λ − 5), so the −3 −1 − λ dominant eigenvalue is 5. 3. (a) For the given value of x5 , we estimate a dominant eigenvector of A to be " # " # 1 1 ≈ . 89 0.618 144 The corresponding dominant eigenvalue is computed by finding " #" # " # 2 1 144 377 x6 = Ax5 = = , 1 1 89 233 ≈ 2.618, 233 ≈ 2.618. 89 2 − λ √ 1 (b) det(A − λI) = = (2 − λ)(1 − λ) − 1 = λ2 − 3λ + 1, which has roots λ = 12 (3 ± 5). 1 1 − λ √ So the dominant eigenvalue is 12 (3 + 5) ≈ 2.618. and

377 144

4. (a) For the given value of x5 , we estimate a dominant eigenvector of A to be # " # " 1 1 ≈ . 239.500 3.951 60.625 The corresponding dominant eigenvalue is computed by finding # # " #" " 210.688 60.625 1.5 0.5 , = x6 = Ax5 = 839.75 2.0 3.0 239.500 210.688 60.625

839.75 ≈ 3.475, 239.500 ≈ 3.506. 1.5 − λ 0.5 (b) det(A − λI) = = (1.5 − λ)(3.0 − λ) − 1.0 = λ2 − 4.5λ + 3.5, which has roots 2.0 3.0 − λ λ = 1 and λ = 3.5. So the dominant eigenvalue is 3.5.

and

5. (a) The entry of largest magnitude in x5 is m5 = 11.001, so that " # " # 3.667 − 11.001 −0.333 1 y5 = x5 = = . 11.001 1 1 So the dominant eigenvalue is about 11 and y5 approximates the dominant eigenvector. (b) We have " #" # " # 2 −3 −0.333 −3.666 Ay5 = = −3 10 1 10.999 " # " # −0.333 −3.663 λ1 y5 = 11.001 = , 1 11.001 so we have indeed approximated an eigenvalue and an eigenvector of A. 6. (a) The entry of largest magnitude in x10 is m10 = 5.530, so that " # " # 1 1 1 y10 = x10 = 1.470 ≈ . 5.530 0.266 5.530 So the dominant eigenvalue is about 5.53 and y10 approximates the dominant eigenvector.

310

CHAPTER 4. EIGENVALUES AND EIGENVECTORS (b) We have Ay10 λ1 y10

" 5 = 2

#" # " # 2 1 5.532 = −2 0.266 1.468 " # " # 1 5.530 = 5.530 = , 0.266 1.471

so we have indeed approximated an eigenvalue and an eigenvector of A. 7. (a) The entry of largest magnitude in x8 is m8 = 10, so that     1 1 1     y8 = ≈ 0.0001 . x8 =  0.001 10.0  10 1 1 So the dominant eigenvalue is about 10 (b) We have  4  Ay8 = −1 6 

and y8 approximates the dominant eigenvector.

    6 1 10     1 0.0001 = 0.0003 4 1 10    1 10     λ1 y8 = 10 0.0001 = 0.001 , 1 10 0 3 0

so we have indeed approximated an eigenvalue and an eigenvector of A, though the approximation in the middle component is not great. 8. (a) The entry of largest magnitude in x10 is m10 = 3.415, so that     1 1 1     y10 = x10 =  2.914 ≈  0.853  . 3.415  3.415 −0.353 − 1.207 3.415 So the dominant eigenvalue is about 3.415 and y10 approximates the dominant eigenvector. (b) We have      3.412 1 1 2 −2      Ay10 = 1 1 −3  0.853  =  2.912  −1.206 0 −1 1 −0.353     1 3.415     λ1 y10 = 3.415  0.853  =  2.913  , −0.353 −1.205 so we have indeed approximated an eigenvalue and an eigenvector of A. 9. With the given A and x0 , we get the values below: k

yk

0   1 1   1 1

mk

1

xk

1   26 8   1.000 0.308

2   17.692 5.923   1.000 0.335

3   18.017 6.004   1.000 0.333

4   17.999 6.000   1.000 0.333

5   18.000 6.000   1.000 0.333

26

17.692

18.017

17.999

18

4.5. ITERATIVE METHODS FOR COMPUTING EIGENVALUES

311   3 . 1

We conclude that the dominant eigenvalue of A is 18, with eigenvector 10. With the given A and x0 , we get the values below: k

yk

0   1 0   1 0

mk

1

xk

1   −6 8   −0.75 1.00

2   8.500 8.000   1.000 −0.941

3   −9.765 9.882   −0.988 1.000

8

8.5

9.882

4 9.929 −9.905   1.000 −0.998

5   −9.990 9.995   −1.000 1.000

9.929

9.995





6 

 9.997 −9.996   1.000 −1.000 9.997

We conclude that the dominant eigenvalue of A is −10 (not 10, since at each step the sign of each 1 component of yk is reversed) , with eigenvector . −1 11. With the given A and x0 , we get the values below: k

yk

0   1 0   1 0

mk

1

xk

1   7 2   1.000 0.286

2   7.571 2.857   1.000 0.377

3   7.755 3.132   1.000 0.404

4   7.808 3.212   1.000 0.411

5   7.823 3.234   −1.000 0.412

6   7.827 3.240   1.000 0.414

7

7.571

7.755

7.808

7.823

7.827

  1 We conclude that the dominant eigenvalue of A is about 7.83, with eigenvector about . The 0.414   √ 1 . true values are λ = 5 + 2 2 with eigenvector √ 2−1 12. With the given A and x0 , we get the values below: k

yk

0   1 0   1 0

mk

1

xk

1 2     3.5 4.143 1.5 1.286     1.000 1.000 0.429 0.310 3.5

3   3.966 1.345   1.000 0.339

4   4.009 1.330   1.000 0.332

5   3.998 1.334   −1.000 0.334

6   4.001 1.333   1.000 0.333

3.966

4.009

3.998

4.001

4.143

  3 We conclude that the dominant eigenvalue of A is 4, with eigenvector . 1 13. With the given A and x0 , we get the values below: k xk

yk mk

0 1 2       1 21 16.810 1 15 12.238 1 13 10.714       1 1.000 1.000 1 0.714 0.728 1 0.619 0.637 1

21

16.81

3 4 5       17.011 16.999 17.000 12.371 12.363 12.264 10.824 10.818 10.818       1.000 1.000 1.000 0.727 0.727 0.727 0.636 0.636 0.636 17.011

16.999

17

312

CHAPTER 4. EIGENVALUES AND EIGENVECTORS 

 1 We conclude that the dominant eigenvalue of A is 17, with corresponding eigenvector about 0.727. 0.636     1 0 In fact the eigenspace corresponding to this eigenvalue is span 2 , −2, and we have found one 0 1 eigenvector in this space. 14. With the given A and x0 , we get the values below: 0 1 2       1 4 3.4 1 5 4.6 1 4 3.4       1 0.8 0.739 1 1.0 1.000 1 0.8 0.739

k xk

yk

1

mk

5

4.6

3   3.217 4.478 3.217   0.718 1.000 0.718

4   3.155 4.437 3.155   0.711 1.000 0.711

5   3.133 4.422 3.133   0.709 1.000 0.709

6   3.126 4.417 3.126   0.708 1.000 0.708

4.478

4.437

4.422

4.417

  0.708 We conclude that the dominant eigenvalue of A is ≈ 4.4, with corresponding eigenvector ≈ 1.000. 0.708 √  2 √  2  In fact the dominant eigenvalue is 3 + 2 with eigenspace spanned by  √1 . 2 2

  1 15. With the given matrix, and using x0 = 1, we get 1 k xk

yk mk

0 1     8 1 2 1 4 1     1 1 1 0.25 0.50 1 1

2   5.75 0.50 2.25   1 0.09 0.39

3   5.26 0.17 1.87   1 0.03 0.36

4   5.10 0.07 1.74   1 0.01 0.34

5   5.04 0.03 1.70   1 0.01 0.34

6   5.02 0.01 1.68   1  0  0.33

7   5.01  0  1.67   1  0  0.33

5.75

5.26

5.10

5.04

5.02

5.01

8

8 

9  

 5 5  0   0  1.67 1.67     1 1  0   0  0.33 0.33

5   3 We conclude that the dominant eigenvalue of A is 5, with corresponding eigenvector 0. 1   1 16. With the given matrix, and using x0 = 1, we get 1 k xk

yk mk

0   2 1 1   2 1 1 2

1 2     24 11.0 2  1.5 6 −2.5     1 1 0.08  0.14  0.25 −0.23 24

11

3 4    14.18 16.38  2.45  3.12 −7.91 −11.65     1 1  0.17   0.19  −0.56 −0.71 

14.18

16.38

5  16.38  3.12 −11.65   1  0.20  −0.77 

17.41

5

4.5. ITERATIVE METHODS FOR COMPUTING EIGENVALUES k

6 7 8      17.80 17.93 17.98  3.54  3.58  3.59 −14.05 −14.28 −14.36       1 1 1  0.20   0.2   0.2  −0.79 −0.8 −0.8

9  17.99  3.60 −14.39   1  0.2  −0.8

 xk

yk mk

17.8

17.93

313



17.98

10 11    18.00 18.00  3.60  3.60 −14.40 −14.40     1 1  0.2   0.2  −0.8 −0.8 

17.99

18

18 

 1 We conclude that the dominant eigenvalue of A is 18, with corresponding eigenvector  0.2 , or −0.8   5  1. −4 17. With the given A and x0 , we get the values below:

xk

0   1 0

1   7 2

R(xk )

7

7.755

k

2 

7.571 2.857



7.823

3   7.755 3.132 7.828

The Rayleigh quotient converges to the dominant eigenvalue, to three decimal places, after three iterations. 18. With the given A and x0 , we get the values below:

xk

0   1 0

1   3.5 1.5

2   4.143 1.286

3   3.966 1.345

R(xk )

3.5

3.966

3.998

4

k

The Rayleigh quotient converges to the dominant eigenvalue after three iterations. 19. With the given A and x0 , we get the values below: k xk R(xk )

0   1 1 1

1   21 15 13

2   16.810 12.238 10.714

16.333

16.998

17

The Rayleigh quotient converges to the dominant eigenvalue after two iterations. 20. With the given A and x0 , we get the values below: k xk R(xk )

0   1 1 1

1   4 5 4

2   3.4 4.6 3.4

3   3.217 4.478 3.217

4.333

4.404

4.413

4.414

The Rayleigh quotient converges to the dominant eigenvalue, to three decimal places, after three iterations.

314

CHAPTER 4. EIGENVALUES AND EIGENVECTORS

21. With the given matrix, and using x0 =

yk

0   1 1   1 1

1   5 4   1 0.8

mk

4.5

4.49

k xk

  1 , we get 1

2 3 4       4.8 4.67 4.57 3.2 2.67 2.29       1 1 1 0.67 0.57 0.5 4.46

4.43

5 6 7 8         4.50 4.44 4.40 4.36 2.00 1.78 1.60 1.45         1 1 1 1 0.44 0.40 0.36 0.33

4.4

4.37

4.34

4.32

4.3

The characteristic polynomial is  4−λ det(A − λI) = 0

 1 = (λ − 4)2 , 4−λ

so that A has the double eigenvalue 4. To find the corresponding eigenspace,   0 1 0 = 0 0

 A − 4I

 0 , 0

  1 . The values mk appear to be converging to 4, but very slowly; 0 the vectors yk may also be converging to the eigenvector, but very slowly if at all.

so that the eigenspace has basis

  1 22. With the given matrix, and using x0 = , we get 1

yk

0   1 1   1 1

1   4 0   1 0

mk

2

3

k xk

2

3



3 −1





1 −0.33

4





2.67 −1.33   1 −0.5



2.8

5





2.50 −1.50   1 −0.6

2.6

2.47

6



2.40 −1.60





1 −0.67





7 

8

2.33 −1.67   1 −0.71

2.29 −1.71   1 −0.75

 2.25 −1.70   1 −0.78

2.32

2.28

2.25

2.38







The characteristic polynomial is  3−λ det(A − λI) = −1

 1 = λ2 − 4λ + 4 = (λ − 2)2 , 1−λ

so that A has the double eigenvalue 2. To find the corresponding eigenspace, 

A − 2I 

 0 =



1 −1

1 −1

  0 1 1 → 0 0 0

 0 0

 1 . The values mk appear to be converging to 2, but very slowly; −1 the vectors yk may also be converging to the eigenvector, but very slowly. so that the eigenspace has basis

4.5. ITERATIVE METHODS FOR COMPUTING EIGENVALUES

315

  1 23. With the given matrix, and using x0 = 1, we get 1 k xk

yk mk

0   1 1 1   1 1 1 3.33

1 2     5 4.2 4 3.2 1 0.2     1 1 0.8 0.76 0.2 0.05 4.05

4.03

3   4.05 3.05 0.05   1 0.75 0.01

4   4.01 3.01 0.01   1 0.75 0

5   4 3 0   1 0.75 0

4.01

4

4

6 7     4 4 3 3 0 0     1 1 0.75 0.75 0 0 4

The matrix is upper triangular, so that A has an eigenvalue 1 and a eigenspace for λ = 4,    0 0 0 0 1 0   0 0 → 0 0 A − 4I 0 = 0 0 0 0 −3 0 0 0

8   4 3 0   1 0.75 0

4

4

double eigenvalue 4. To find the  0 0 0

1 0 0

    0   1 so that the eigenspace has basis 0 , 1 . The values mk converge to 4, while yk converge   0 0   1 apparently to 0.75, which is in the eigenspace. 0   1 24. With the given matrix, and using x0 = 1, we get 1 k xk

yk mk

0 1     0 1 6 1 5 1     1 0 1  1  1 0.83 3.67

5.49

2 

3  

4 



5  

6  

7 

0 0 0 0 0 5.83 5.71 5.62 5.56 5.50 4.17 3.57 3.12 2.78 2.50           0 0 0 0 0  1   1   1   1   1  0.71 0.62 0.56 0.50 0.45 5.47

5.45

The matrix is upper triangular, so that A has eigenspace for λ = 5,  −5   A − 5I 0 =  0 0

5.42

5.4

5.38



8

 0 0 5.45 5.42 2.27 2.08     0 0  1   1  0.42 0.38 5.36

 

5.34

an eigenvalue 0 and a double eigenvalue 5. To find the 0 0 0 1 0 0

  0 1 0 → 0 0 0

0 0 0

0 1 0

 0 0 0

  0 so that the eigenspace has basis 1. The values mk appear to converge slowly to 5, while yk also 0   0 appear to converge slowly to 1. 0

316

CHAPTER 4. EIGENVALUES AND EIGENVECTORS   1 , we get 1

25. With the given matrix, and using x0 =

yk

0   1 1   1 1

1   1 0   1 0

2   −1 −1   1 1

mk

1

1

−1

k xk

3 4     1 −1 0 −1     1 1 0 1 −1

1

The characteristic polynomial is det(A − λI) =

 −1 − λ −1

 2 = λ2 + 1, 1−λ

so that A has no real eigenvalues, so the power method cannot succeed.   1 26. With the given matrix, and using x0 = , we get 1

yk

0   1 1   1 1

1   3 3   1 1

2   3 3   1 1

3   3 3   1 1

mk

1

3

3

3

k xk

The characteristic polynomial is 

2−λ det(A − λI) = −2

 1 = λ2 − 7λ + 12 = (λ − 3)(λ − 4) 5−λ

so  that the power method converges to a non-dominant eigenvalue. This happened because we chose 1 as the initial value, which is in fact an eigenvector for λ = 3: 1        2 1 1 3 1 = =3 . −2 5 1 3 1 27. With the given matrix, and using x0 = k xk

yk mk

0 1     1 3 1 4 1 3     1 0.75 1  1  1 0.75 1

  1 , we get 1

2   2.5 4.0 2.5   0.62  1  0.62

3   2.25  4  2.25   0.56  1  0.56

4   2.12  4  2.12   0.53  1  0.53

5   2.06  4  2.06   0.52  1  0.52

6   2.03  4  2.03   0.51  1  0.51

4

4

4

4

4

4

7 8     2.02 2.01  4   4  2.02 2.01     0.5 0.5 1 1 0.5 0.5 4

The characteristic polynomial is  −5 − λ det(A − λI) =  0 7

1 4−λ 1

 7 0  = (λ + 12)(λ − 4)(λ − 2) −5 − λ

4

4.5. ITERATIVE METHODS FOR COMPUTING EIGENVALUES

317

so that the power method converges to a non-dominant eigenvalue. This happened because we chose   1 1 as the initial value. This vector is orthogonal to the eigenvector corresponding to the dominant 1   1 eigenvalue λ = −12, which is  0: −1        −5 1 7 1 −12 1  0 4 0  0 =  0 = −12  0 . 7 1 −5 −1 12 −1   1 28. With the given matrix, and using x0 = , we get 1 k xk

yk mk

0   1 1 1   1 1 1

1   0 2 1   0 1 0.5

1

0.6

2

3

4   −1 −2 0.8  1   0  0.8 −0.5 −2.5 1.8       −1 0.8 0.44  1   0  0.44 −0.5 1 1 

 

1.44



1.49

5

6    0 −0.89 0.89  0.89 1 0.11     0 −1 0.89  1  1 0.12 

1

0.5

7 

8

−2  0  −1.88   1  0  0.94

 1  1  1.94   0.52 0.52 1

1.5

1

0.88





Neither mk nor yk shows clear signs of converging to anything. The characteristic polynomial is   1−λ −1 0 1−λ 0  = (λ − 1)(λ2 − 2x + 2) det(A − λI) =  1 1 −1 1−λ so that A has eigenvalues 1 and 1 ± i. The absolute value of each of the complex eigenvectors is 2, so 1 is not the eigenvalue of largest magnitude, and the power method cannot succeed. 29. In Exercise 9, we found that λ1 = 18. To find λ2 , we apply the power method to     −4 12 1 A − 18I = with x0 = . 5 −15 1 k

yk

0   1 1   1 1

 8 −10   −0.8 1

mk

1

−10

xk

1 

2  15.2 −19.0   −0.8 1



3  15.2 −19.0   −0.8 1



−19

−19

Thus λ2 − λ1 = −19, so that λ2 = −1 is the second eigenvalue of A. 30. In Exercise 10, we found that λ1 = −10, so we apply the power method to k  4 A + 10I = 8

   4 1 with x0 = : 8 1

yk

0   1 1   1 1

1   8 16   0.5 1

2   6 12   0.5 1

3   6 12   0.5 1

mk

1

16

12

12

xk

Thus λ2 − λ1 = 12, so that λ2 = 2 is the second eigenvalue of A.

318

CHAPTER 4. EIGENVALUES AND EIGENVECTORS

31. In Exercise 13, we found that λ1 = 17, so we apply the power method to     −8 4 8 1 A − 17I =  4 −2 −4 with x0 = 1 . 8 −4 −8 1 k xk

yk

0   1 1 1   1 1 1

1 

3

 4 18 −2  −9 −4 −18     −1 −1 0.5 0.5 1 1 −4

1

mk

2  



 18  −9 −18   −1 0.5 1

−18

−18

Thus λ2 − λ1 = −18, so that λ2 = −1 is the second eigenvalue of A. 32. In Exercise 14, we found that λ1 ≈ 4.4, so we apply the power method to     −1.4 1 0 1 −1.4 1  with x0 = 1 . A − 4.4I =  1 0 1 −1.4 1 k xk

yk

0 1     −0.4 1 1  0.6 −0.4 1     −0.667 1 1  1  −0.667 1

mk

1

2

3

4

5  1.990 −2.814 1.990   −0.707  1  −0.707



 

 





1.990 1.990 1.933 −2.733 −2.815 −2.814 1.990 1.990 1.933       −0.707 −0.707 −0.707  1   1   1  −0.707 −0.707 −0.707 −2.733

0.6

−2.815

−2.814

−2.814

Thus λ2 − λ1 ≈ −2.81, so that λ2 ≈ 1.59 is the second eigenvalue of A (in fact, λ2 = 3 −



2).

33. With A and x0 as in Exercise 9, proceed to calculate xk by solving Axk = yk−1 at each step to apply the inverse power method. We get k

yk

0   1 1   1 1

mk

1

xk

1 

 0.5 −0.5   −1 1 −0.5

2

3





0.833 −1.056   −0.789 1





0.798 −0.997   −0.801 1

−1.056

4   0.8 −1   −0.8 1

5   0.8 −1   −0.8 1

−1

−1

−0.997

We conclude that the eigenvalue of A of smallest magnitude is

1 −1

= −1.

34. With A and x0 as in Exercise 10, proceed to calculate xk by solving Axk = yk−1 at each step to apply the inverse power method. We get k

yk

0   1 1   1 1

mk

1

xk

1 2 3 4 5           0.1 0.225 0.256 0.249 0.250 0.4 0.400 0.525 0.495 0.501           0.25 0.562 0.488 0.502 0.5 1 1 1 1 1 0.4

0.4

0.525

0.495

We conclude that the eigenvalue of A of smallest magnitude is

1 0.5

0.501 = 2.

6   0.25 0.50   0.5 1 0.5

4.5. ITERATIVE METHODS FOR COMPUTING EIGENVALUES

319

35. With A as in Exercise 7, and x0 as given, proceed to calculate xk by solving Axk = yk−1 at each step to apply the inverse power method. We get k

0

1    1 −0.5  1  0  −1 0.5     1 −1  1  0 −1 1 

xk

yk

−1

mk

2  0.500  0.333 −0.500   −1 −0.667 1 

3  0.500  0.111 −0.500   −1 −0.222 1 

−0.5

0.5

4 5 6    0.500 0.50  0.259  0.16 −0.500 −0.50     −1 −1 −0.519 −0.321 1 1 

−0.5

−0.5

We conclude that the eigenvalue of A of smallest magnitude is

1 −0.5

−0.5

= −2.

36. With A and x0 as in Exercise 14, proceed to calculate xk by solving Axk = yk−1 at each step to apply the inverse power method. We get k xk

yk

0   1 1 1   1 1 1

1   0.286 0.143 0.286   1 0.5 1

1

0.286

mk

2 

3  

4  



0.357 0.457 0.545 −0.071 −0.371 −0.634 0.357 0.457 0.545       1 1 −0.859 −0.2 −0.812  1  1 1 −0.859 0.357

0.457

5   −0.511  0.674 −0.511   −0.758  1  −0.758

6   −0.468  0.645 −0.468   −0.725  1  −0.725

0.674

0.645

−0.634

We conclude that the eigenvalue of A of smallest magnitude is ≈ √ is 3 − 2.

1 0.645

≈ 1.55. In fact the eigenvalue

37. With α = 0, we are just using the inverse power method, so the solution to Exercise 33 provides the eigenvalue closest to zero.   1 38. Since α = 0, there is no shift required, so this is the inverse power method. Take x0 = y0 = , and 1 proceed to calculate xk by solving Axk = yk−1 at each step to apply the inverse power method. We get k 0 1 2 3 4 5 6               1 0.125 0.417 0.306 0.340 0.332 0.334 xk 1 0.375 −0.750 −1.083 −0.981 −1.005 −0.999               1 0.333 −0.556 −0.282 −0.346 −0.33 −0.334 yk 1 1 1 1 1 1 1 mk

1

0.375

−0.75

−1.083

−0.981

−1.005

−0.999

We conclude that the eigenvalue of A closest to 0 (i.e., the eigenvalue smallest in magnitude) is

1 −1

= −1.

39. We first shift A:  −1 A − 5I = −1 6

 0 6 −2 1 . 0 −1

  1 Now take x0 = y0 = 1, and proceed to calculate xk by solving Axk = yk−1 at each step to apply 1

320

CHAPTER 4. EIGENVALUES AND EIGENVECTORS the inverse power method. We get k

0   1 1 1   1 1 1

xk

yk

1

2 3      0.2 −0.08 0.32 −0.5 −0.50 −0.50 0.2 −0.08 0.32       −0.4 0.16 −0.064  1   1   1  −0.4 0.16 −0.064 

−0.5

1

mk

−0.5

We conclude that the eigenvalue of A closest to 5 is 5 + 40. We first shift A:

−0.5

1 −0.5

= 3.

  11 4 7 A + 2I =  4 17 −4 . 8 −4 11

  1 Now take x0 = y0 = 1, and proceed to calculate xk by solving Axk = yk−1 at each step to apply 1 the inverse power method. We get k xk

yk mk

0 1     −0.158 1 1  0.158 0.263 1     −0.6 1 1  0.6  1 1 1

0.263

2   −0.832  0.432 0.853   −0.975  0.506  1

3 4     −0.999 −0.990  0.496  0.500 1.000 0.991     −0.999 −1  0.5  0.5 1 1

0.853

0.991

We conclude that the eigenvalue of A closest to −2 is −2 +

1 1

5   −1 0.5 1   −1 0.5 1

1

1

= −1.

41. The companion matrix of p(x) = x2 + 2x − 2 is  2 , 0   1 so we apply the inverse power method with x0 = y0 = , calculating xk by solving Axk = yk−1 at 1 each step. We get C(p) =

yk

0   1 1   1 1

  0.667 1

mk

1

1.5

k xk

1 

1 1.5



So the root of p(x) closest to 0 is ≈

 −2 1

2 3 4 5 6          1 1 1 1 1 1.333 1.375 1.364 1.367 1.366           0.75 0.727 0.733 0.732 0.732 1 1 1 1 1



1.333 1 1.366

1.375

1.364

1.367

≈ 0.732. The exact value is −1 +

42. The companion matrix of p(x) = x2 − x − 3 is C(p) =

 1 1

 3 , 0

1.366 √

3.

4.5. ITERATIVE METHODS FOR COMPUTING EIGENVALUES

321

so we apply the inverse power method to  −1 1

C(p) − 2I = with x0 = y0 =

 3 −2

  1 , calculating xk by solving Axk = yk−1 at each step. We get 1

yk

0   1 1   1 1

1   5 2   1 0.4

2   3.2 1.4   1 0.438

mk

1

5

3.2

k xk

So the root of p(x) closest to 2 is ≈ 2 +

3 4 5       3.312 3.302 3.303 1.438 1.434 1.434       1 1 1 0.434 0.434 0.434 3.312

1 3.303

3.302

6   3.303 1.434   1 0.434

3.303 √ ≈ 2.303. The exact value is 12 (1 + 13).

43. The companion matrix of p(x) = x3 − 2x2 + 1 is  2 C(p) = 1 0

3.303

 −1 0 , 0

0 0 1

  1 so we apply the inverse power method to C(p). If we first start with y0 = 1, we find that this is an 1 eigenvector of C(p) with eigenvalue 1, but this may not be the closest root to 0, so we start instead   1 with with x0 = y0 = 0, calculating xk by solving Axk = yk−1 at each step. We get 1 k xk

yk mk

0   1 0 1   1 0 1 1 k xk

yk mk

1  0  1 −1   0 −1 1

2   −1  1 −2   0.5 −0.5 1

−1

−2



3 4 5       −0.5 −0.667 −0.6  1   1   1  0.5 −1.667 −1.6       0.333 0.4 0.375 −0.667 −0.6 −0.625 1 1 1 −1.6

−1.667

−1.6

6   −0.625  1  −1.625   0.385 −0.615 1

7   −0.615  1  −1.615   0.381 −0.619 1

8   −0.619  1  1.619   0.382 −0.618 1

9   −0.618  1  −1.618   0.382 −0.618 1

−1.625

−1.615

−1.619

−1.618

We conclude that the root of p(x) closest to zero is 44. The companion matrix of p(x) = x3 − 2x2 + 1 is  5 C(p) = 1 0

1 −1.618

≈= −0.618. The actual root is 12 (1 −

 −1 −1 0 0 , 1 0



5).

322

CHAPTER 4. EIGENVALUES AND EIGENVECTORS   1 so we apply the inverse power method to C(p) − 5I starting with x0 = y0 = 1, calculating xk by 1 solving Axk = yk−1 at each step. We get k xk

yk mk

0 1     1 −2.333 1 −0.667 1 −0.333     1 1 1 0.286 1 0.143 1

−2.333

2 3 4       −3.762 −3.909 −3.918 −0.810 −0.825 −0.826 −0.190 −0.175 −0.174       1 1 1 0.215 0.211 0.211 0.051 0.045 0.044 −3.762

−3.909

We conclude that the root of p(x) closest to zero is 5 +

−3.918 1 −3.919

5   −3.919 −0.826 −0.174   1 0.211 0.044

6   −3.919 −0.826 −0.174   1 0.211 0.044

−3.919

−3.919

≈= 4.745.

45. First, A−αI is invertible, since α is an eigenvalue of A precisely when A−αI is not invertible. Let x be an eigenvector for λ. Then Ax = λx, so that Ax − αx = λx − αx, and therefore (A − αI)x = (λ − α)x. Now left multiply both sides of this equation by (A − αI)−1 , to get x = (A − αI)−1 (λ − α)x, so that

1 x = (A − αI)−1 x. λ−α

But the last equation says exactly that λ − α is an eigenvalue of (A − αI)−1 with eigenvector x. 46. If λ is dominant, then |λ| > |γ| for all other eigenvalues γ. But this means that the algebraic multiplicity of λ is 1, since it appears only once in the list of eigenvalues listed with multiplicity, so its geometric multiplicity is 1 and thus its eigenspace is one-dimensional. 47. Using rows, the Gerschgorin disks are: Center 1, radius 1 + 0 = 1, 1 1 Center 4, radius + = 1, 2 2 Center 5, radius 1 + 0 = 1. Using columns, they are 1 3 +1= , 2 2 Center 4, radius 1 + 0 = 1, 1 1 Center 5, radius 0 + = : 2 2 Center 1, radius

From this diagram, we see that A has a real eigenvalue between 0 and 2.

4.5. ITERATIVE METHODS FOR COMPUTING EIGENVALUES

323

48. Using rows, the Gerschgorin disks are: Center 2, radius |−i| + 0 = 1, Center 2i, radius |1| + |1 + i| = 1 +



2,

Center − 2i, radius 0 + 1 = 1. Using columns, they are Center 2, radius 1 + 0 = 1, Center 2i, radius |−i| + 1 = 2, √ Center − 2i, radius 0 + |1 + i| = 2 :

From this diagram, we conclude that A has an eigenvalue within one unit of −2i. 49. Using rows, the Gerschgorin disks are: Center 4 − 3i, radius |i| + 2 + 2 = 5, Center − 1 + i, radius |i| + 0 + 0 = 1, Center 5 + 6i, radius |1 + i| + |−i| + |2i| = 3 + Center − 5 − 5i, radius 1 + |−2i| + |2i| = 5.



Using columns, they are Center 4 − 3i,

radius |i| + |1 + i| + 1 = 2 +

Center − 1 + i, Center 5 + 6i,

radius |i| + |−i| + |−2i| = 4, radius 2 + 0 + |2i| = 4

Center − 5 − 5i,

radius |−2| + 0 + |2i| = 4.



2,

2

324

CHAPTER 4. EIGENVALUES AND EIGENVECTORS

From this diagram, we conclude that A has an eigenvalue within 1 unit of −1 + i. 50. Using rows, the Gerschgorin disks are: 1 , 2 1 Center 4, radius + 4 1 Center 6, radius + 6 1 Center 8, radius . 8

1 1 = , 4 2 1 2 = 6 3

1 , 4 1 Center 4, radius + 2 1 Center 6, radius + 4 1 Center 8, radius . 6

1 1 = , 6 3 1 3 = 8 8

Center 2, radius

Using columns, they are Center 2, radius

Since the circles are all disjoint, we conclude that A has four real eigenvalues, near 2, 4, 6, and 8. 51. Let A be a strictly diagonally dominant square matrix. Then inP particular, none of the diagonal entries of A can be zero, since each row has the property that |aii | > j6=i |aij |. Therefore each Gerschgorin disk is centered at a nonzero point, and its radius is less than the absolute value of its center, so the disk does not contain the origin. Since every eigenvalue is contained in some Gerschgorin disk, it follows that 0 is not an eigenvalue, so that the matrix is invertible by Theorem 4.17.

4.5. ITERATIVE METHODS FOR COMPUTING EIGENVALUES

325

52. Let λ be any eigenvalue of A. Then by Theorem 4.29, λ is contained in the Gerschgorin disk for some row i, so that X |λ − aii | ≤ |aij | . j6=i

By the Triangle Inequality, |λ| ≤ |λ − aii | + |aii |, so that |λ| ≤ |λ − aii | + |aii | ≤ |aii | +

X

|aij | =

j6=i

n X

|aij | ≤ kAk ,

j=1

since kAk is the maximum of the possible row sums. 53. A stochastic matrix is a square matrix whose entries are probability vectors, which means that the T sum of the entries in any column of A is 1, so the sum of the entries of any

row in A is 1, so that

AT = 1. By Exercise 52, any eigenvalue of AT must satisfy |λ| ≤ AT = 1. But by Exercise 19 in Section 4.3, the eigenvalues of A and of AT are the same, so that if λ is any eigenvalue of A, then |λ| ≤ 1. 54. Using rows, the Gerschgorin disks are: Center 0, radius 1, Center 5, radius 2, 1 1 Center 3, radius + = 1 2 2 3 Center 7, radius . 4 Using columns, they are Center 0, radius 2 +

1 5 = , 2 2

Center 5, radius 1, 3 Center 3, radius 4 1 Center 7, radius . 2

From the row diagram, we see that there is exactly one eigenvalue in the circle of radius 1 around 0, so it must be real and therefore there is exactly one eigenvalue in the interval [−1, 1]. From the column diagram, conclude that there is exactly one eigenvalue in the interval [4, 6] and exactly   we15similarly one in 13 2 , 2 . Since the matrix has real entries, the fourth eigenvalue must be real as well. From the row diagram, since the circle around zero contains exactly one eigenvalue, the fourth eigenvalue must be in the interval 2, 7 34 . But it cannot be larger than 3 43 since then, from the column diagram, it would either

326

CHAPTER 4. EIGENVALUES AND EIGENVECTORS not bein a disk  or would be in an isolated disk already containing an eigenvalue. So it must lie in the range 2, 3 34 . In summary, the four eigenvalues are each in a different one of the four intervals [−1, 1],

[2, 3.75],

[4, 6],

[6.5, 7.5].

The actual values are ≈ −0.372, ≈ 2.908, ≈ 5.372, and ≈ 7.092.

4.6

Applications and the Perron-Frobenius Theorem  0 1 regular.

1. Since

2  1 1 = 0 0

  0 0 , it follows that any power of 1 1

 1 contains zeroes. So this matrix is not 0

2. Since the given matrix is upper triangular, any of its powers is upper triangular as well, so will contain a zero. Thus this matrix is not regular. 3. Since "

#2 " 7 1 = 92 0 9

1 3 2 3

1 3 2 3

# ,

the square of the given matrix has all positive entries, so this matrix is regular. 4. We have 

1 2 1 2

0 

1 2 1 2

0 

1 2 1 2

0

2  1 0 1 4  1 0 0 =  4 1 1 0 2 3  1 0 1 2  1 0 0 =  2 1 0 0 4  1 0 1 2  1 0 0 =  2 1 0 0

1 0 0 0 0 1 0 0 1



1 2 1 2

0  1 1 4  1 0  4 1 0 2  5 1 8  1 0  8 1 0 4

1 0 0 1 2 1 2

0



1 2 1 2

 =

0 

1 4 1 4 1 2

5 8 1 8 1 4

1 4 1 4 1 2

0

 =



1 2 1 2

9 16 5  16 1 8

1 4 1 4 1 2



5 8 1 8 1 4

so that the fourth power of this matrix contains all positive entries, so this matrix is regular. 5. If A and B are any 3 × 3 matrices with a12 = a32 = b12 = b32 = 0, then (AB)12 = a11 b12 + a12 b22 + a13 b32 = 0 + 0 + 0 = 0,

(AB)32 = a31 b12 + a32 b22 + a33 b32 = 0 + 0 + 0 = 0,

so that AB has the same property. Thus any power of the given matrix A has this property, so no power of A consists of positive entries, so A is not regular. 6. Since the third row of the matrix is all zero, the third row of any power will also be all zero, so this matrix is not regular. 7. The transition matrix P has characteristic equation 1 − λ 1 3 6 = 1 (λ − 1)(6λ − 1), det(P − λI) = 2 5 3 6 6 −λ so its eigenvalues are λ1 = 1 and λ2 = 16 . Thus P has two distinct eigenvalues, so is diagonalizable. Now, this matrix is regular, so the remark at the end of the proof of Theorem 4.33 shows that the

4.6. APPLICATIONS AND THE PERRON-FROBENIUS THEOREM

327

vector x which forms the columns of L is a probability vector parallel to an eigenvector for λ = 1. To compute this, we row-reduce P − I: " # " # h i 1 1 − 23 0 1 − 0 6 4 → P −I 0 = 2 − 16 0 0 0 0 3 so that the corresponding eigenspace is   1 E1 = span . 4   t Thus the columns of L are of the form where t + 4t = 5t = 1, so that t = 15 . Therefore 4t # " L=

1 5 4 5

1 5 4 5

.

8. The transition matrix P has characteristic equation 1 1 1 − λ 2 3 6 1 1 1 det(P − λI) = 21 = − (λ − 1)(36λ2 − 18λ + 1). 2 −λ 3 36 1 1 0 − λ 6

2

So one eigenvalue is 1; the roots of the quadratic are distinct, so that P has three distinct eigenvalues, so is diagonalizable. Now, this matrix is regular (its square has all positive entries), so the remark at the end of the proof of Theorem 4.33 shows that the vector x which forms the columns of L is a probability vector parallel to an eigenvector for λ = 1. To compute this, we row-reduce P − I:     1 1 0 − 12 1 0 − 73 0 h i 3 6     1 P − I 0 =  12 − 21 0 → 0 1 −3 0 3 1 − 12 0 0 0 0 0 0 6 so that the corresponding eigenspace is   7 E1 = span 9 . 3   7t Thus the columns of L are of the form 9t where 7t + 9t + 3t = 19t = 1, so that t = 3t   7 7 7 1  9 9 9 . L= 19 3 3 3

1 19 .

Therefore

9. The transition matrix P has characteristic equation 0.2 − λ 0.3 0.4 det(P − λI) = 0.6 0.1 − λ 0.4 = −(λ − 1)(λ2 + 0.5λ + 0.08) 0.2 0.6 0.2 − λ So one eigenvalue is 1; the roots of the quadratic are distinct (and complex), so that P has three distinct eigenvalues, so is diagonalizable. Now, this matrix is regular, so the remark at the end of the

328

CHAPTER 4. EIGENVALUES AND EIGENVECTORS proof of Theorem 4.33 shows that the vector x which forms the columns of L parallel to an eigenvector for λ = 1. To compute this, we row-reduce P − I:    1 0 −0.889 −0.8 0.3 0.4 0 h i    0.4 0 → 0 1 −1.037 P − I 0 =  0.6 −0.9 0.2 0.6 −0.8 0 0 0 0

is a probability vector  0  0 0

so that the corresponding eigenspace is   0.889 E1 = span 1.037 . 1   0.889t Thus the columns of L are of the form 1.037t where 0.889t + 1.037t + t = 2.926t = 1, so that 1t 1 t = 2.926 . Therefore   0.304 0.304 0.304 L = 0.354 0.354 0.354 . 0.341 0.341 0.341 (If the matrix   is converted to fractions and the computation done that way, the resulting eigenvector 24 will be 28, which must then be converted to a unit vector.) 27 10. Suppose that the sequence of iterates xk approach both u and v, and choose  > 0. Choose k such that |xk − u| < 2 and |xk − v| < 2 . Then |u − v| = |(u − xk ) + (xk − v)| ≤ |u − xk | + |xk − v| < , so that |u − v| is less than any positive . So it must be zero; that is, we must have u = v. 11. The characteristic polynomial is −λ det(L − λI) = 0.5

2 = λ2 − 1 = (λ + 1)(λ − 1), −λ

so the positive eigenvalue is 1. To find a corresponding eigenvector, row-reduce       −1 2 0 1 −2 0 L−I 0 = → . 0.5 −1 0 0 0 0   2 Thus E1 = span . 1 12. The characteristic polynomial is 1 − λ det(L − λI) = 0.5

1.5 = λ2 − λ − 0.75 = (λ − 1.5)(λ + 0.5), −λ

so the positive eigenvalue is 1.5. To find a corresponding eigenvector, row-reduce       −0.5 1.5 0 1 −3 0 L − 1.5I 0 = → . 0.5 −1.5 0 0 0 0   3 Thus E1.5 = span . 1

4.6. APPLICATIONS AND THE PERRON-FROBENIUS THEOREM

329

13. The characteristic polynomial is −λ det(L − λI) = 0.5 0

7 −λ 0.5

4 1 0 = −λ3 + 3.5λ + 1 = − (λ − 2)(2λ2 + 4λ + 1), 2 −λ

so 2 is a positive eigenvalue. The other two roots are −4 ± 4



8

,

both of which are negative. To find an eigenvector for 2, row-reduce

 L − 2I

 −2  0 = 0.5 0

7 4 −2 0 0.5 −2

  0 1 0 → 0 0 0

0 1 0

−16 −4 0

 0 0 . 0

  16 Thus E2 = span  4 . 1 14. The characteristic polynomial is 1 − λ det(L − λI) = 13 0

5 −λ 2 3

3 2 1 5 0 = −x3 + x2 + x + = − (x − 2)(3x2 + 3x + 1) 3 3 3 −λ

so 2 is a positive eigenvalue. The other two eigenvalues are complex. To row-reduce    1 0 −18 −1 5 3 0 h i    1 L − 2I 0 =  3 −2 0 0 → 0 1 −3 2 0 0 −2 0 0 0 3   18 Thus E2 = span  3 . 1

find an eigenvector for 2,  0  0 . 0

15. Since the eigenvector corresponding to the positive eigenvalue λ1 for the Leslie matrix represents a stable ratio of population classes, one that proportion is reached, if λ1 > 1, the population will grow without bound; if λ1 = 1, it will be stable, and if λ1 < 1 it will decrease and eventually die out. 16. From the Leslie matrix in Equation (3), we have b1 − λ s1 0 det(L − λI) = 0 .. . 0

b2 −λ s2 0 .. .

b3 0 −λ s3 .. .

··· ··· ··· ··· .. .

bn−1 0 0 0 .. .

0

0

···

sn−1

For n = 1, this is just b1 − λ, which is equal to cL (λ) = (−1)1 (λ1 − b1 λ0 ) = b1 − λ.

bn 0 0 0 . .. . −λ

330

CHAPTER 4. EIGENVALUES AND EIGENVECTORS This forms the basis for the induction. Now assume the statement holds for n. Expand the above determinant along the last column, giving b1 − λ b2 s1 −λ 0 · · · 0 s1 −λ 0 s2 −λ · · · 0 0 s2 0 s3 · · · 0 + (−λ) det(L − λI) = (−1)n+1 bn 0 0 0 .. .. .. .. .. . . .. . . . . .. . 0 0 0 · · · sn−1 0 0

n − 1, and we prove it for b3 0 −λ s3 .. .

··· ··· ··· ··· .. .

bn−2 0 0 0 .. .

0

···

sn−2

bn−1 0 0 0 . .. . 0

The first matrix is upper triangular, so its determinant is the product of its diagonal entries; for the second, we use the inductive hypothesis. So det(L − λI) = (−1)n+1 bn (s1 s2 · · · sn−1 ) − λ (−1)n−1 (λn−1 − b1 λn−2 − b2 s1 λn−3 − · · · −bn−2 s1 s2 · · · sn−3 λ − bn−1 s1 s2 · · · sn−2 ) n+1

= (−1)

bn (s1 s2 · · · sn−1 )

+ (−1)n (λn − b1 λn−1 − b2 s1 λn−2 − · · · − bn−2 s1 s2 · · · sn−3 λ2 − bn−1 s1 s2 · · · sn−2 λ) = (−1)n (λn − b1 λn−1 − b2 s1 λn−2 − · · · − bn−2 s1 s2 · · · sn−3 λ2 − bn−1 s1 s2 · · · sn−2 λ − bn s1 s2 · · · sn−1 ), proving the result. 17. P is a diagonal matrix, so inverting it consists of inverting each of the diagonal entries. Now, multiplying any matrix on the left by a diagonal matrix multiplies each row by the corresponding diagonal entry, and multiplying on the right by a diagonal matrix multiplies each column by the corresponding diagonal entry, so   b1 b2 b3 · · · bn−1 bn   0 ··· 0 0 1 0   0 1 0 ··· 0 0 s1   −1 P L = 0 0 1 ··· 0 0   s1 s2 . .. .. .. ..  .. .  . . . . . . 1 0 0 0 · · · s1 s2 ···s 0 n−2 and then

 b1  1  0  −1 P LP =  0 . . . 0

b2 s1 0 1 0 .. . 0

b3 s1 s2 0 0 1 .. . 0

··· ··· ··· ··· .. . ···

bn−1 s1 s2 · · · sn−2 0 0 0 .. . 1

 bn s1 s2 · · · sn−1  0    0  . 0   ..  .  0

But this is the companion matrix of the polynomial p(x) = xn − b1 xn−1 − b2 s1 xn−2 − · · · − bn−1 s1 s2 · · · sn−2 x − bn s1 s2 · · · sn−1 , and by Exercise 32 in Section 4.3, the characteristic polynomial of P −1 LP is (−1)n p(λ). Finally, the characteristic polynomials of L and P −1 LP are the same since they have the same eigenvalues, so the characteristic polynomial of L is det(L − λI) = det(P −1 LP − λI) = (−1)n p(λ) = (−1)n (λn − b1 λn−1 − b2 s1 λn−2 − · · · − bn−1 s1 s2 · · · sn−2 λ − bn s1 s2 · · · sn−1 )

4.6. APPLICATIONS AND THE PERRON-FROBENIUS THEOREM

331

18. By Exercise 46 in Section 4.4, the eigenvectors of P −1 LP are of the form P −1 x where x is an eigenvalue of L, and by Exercise 32 in Section 4.3, an eigenvector of P −1 LP for λ1 is  n−1  λ1 λn−2   1n−3  λ   1   . .  ..     λ1  1 So an eigenvector of L corresponding to λ1 is   n−1   λ1n−1 λ1  λn−2   s1 λn−2 1   1n−3   n−3  λ   s1 s2 λ1   1   P . = . . ..   ..        λ1   s1 s2 · · · sn−2 λ1  s1 s2 · · · sn−2 sn−1 1 Finally, we can scale this vector by dividing each entry by λn−1 to get 1   1   s /λ 1 1   2   s s /λ 1 2 1    . ..   .    s1 s2 · · · sn−2 /λn−2  1 n−1 s1 s2 · · · sn−2 sn−1 /λ1 19. Using a CAS, we find the positive eigenvalue of 

1 L = 0.7 0

1 0 0.5

 3 0 0

to be λ ≈ 1.7456, so this is the steady state growth rate of the population with this Leslie matrix. From Exercise 18, a corresponding eigenvector is     1 1   ≈ 0.401 . 0.7/1.7456 0.115 0.7 · 0.5/(1.7456)2 So the long-term proportions of the age classes are in the ratios 1 : 0.401 : 0.115. 20. Using a CAS, we find the positive eigenvalue of  0 0.5 L= 0 0

1 0 0.7 0

 2 5 0 0  0 0 0.3 0

to be λ ≈ 1.2023, so this is the steady state growth rate of the population with this Leslie matrix. From Exercise 18, a corresponding eigenvector is     1 1   0.416 0.5/1.2023      0.5 · 0.7/(1.2023)2  ≈ 0.242 0.5 · 0.7 · 0.3/(1.2023)3 0.060 So the long-term proportions of the age classes are in the ratios 1 : 0.416 : 0.242 : 0.06.

332

CHAPTER 4. EIGENVALUES AND EIGENVECTORS

21. Using a CAS, we find the positive eigenvalue  0 0.4 0.3 0   0 0.7  0 L= 0 0 0  0 0 0 0

of 1.8 0 0 0.9 0 0 0

1.8 1.8 1.6 0 0 0 0 0 0 0 0 0 0.9 0 0 0 0.9 0 0 0 0.6

 0.6 0  0  0 . 0  0 0

to be λ ≈ 1.0924, so this is the steady state growth rate of the population with this Leslie matrix. From Exercise 18, a corresponding eigenvector is     1 1  0.275  0.3/1.0924     2   0.176 0.3 · 0.7/(1.0924)       ≈ 0.145 . 0.3 · 0.7 · 0.9/(1.0924)3       0.119 0.3 · 0.7 · 0.9 · 0.9/(1.0924)4      0.3 · 0.7 · 0.9 · 0.9 · 0.9/(1.0924)5  0.098 0.054 0.3 · 0.7 · 0.9 · 0.9 · 0.9 · 0.6/(1.0924)6 The entries of this vector give the long-term proportions of the age classes. 22. (a) The table gives us 13 separate age classes, each of duration two years; from the table, the Leslie matrix of the system is   0 0.02 0.70 1.53 1.67 1.65 1.56 1.45 1.22 0.91 0.70 0.22 0 0.91 0 0 0 0 0 0 0 0 0 0 0 0    0 0.88 0 0 0 0 0 0 0 0 0 0 0    0 0 0.85 0 0 0 0 0 0 0 0 0 0    0 0 0 0.80 0 0 0 0 0 0 0 0 0    0 0 0 0 0.74 0 0 0 0 0 0 0 0   0 0 0 0 0.67 0 0 0 0 0 0 0 L= .  0   0 0 0 0 0 0 0.59 0 0 0 0 0 0    0 0 0 0 0 0 0 0.49 0 0 0 0 0    0 0 0 0 0 0 0 0 0.38 0 0 0 0    0 0 0 0 0 0 0 0 0 0.27 0 0 0    0 0 0 0 0 0 0 0 0 0 0.17 0 0 0 0 0 0 0 0 0 0 0 0 0 0.15 0 Using a CAS, this matrix has positive eigenvalue ≈ 1.333, with a corresponding eigenvector (normalized to add to 100)   36.1  24.65     16.27     10.38     6.23     3.46     1.74  .    0.77     0.28     0.08     0.016     0.0021  0.00023

4.6. APPLICATIONS AND THE PERRON-FROBENIUS THEOREM

333

(b) In the long run, the percentage of seals in each age group is given by the entries in the eigenvector in part (a), and the growth rate will be about 1.333. 23. (a) Each term in r is the number of daughters born to a female in a particular age class times the probability of a given individual surviving to be in that age class, so it represents the expected number of daughters produced by a given individual while she is in that age class. So the sum of the terms gives the expected number of daughters ever born to a particular female over her lifetime. (b) Define g(λ) as suggested. Then g(λ) = 1 if and only if λn g(λ) = b1 λn−1 + b2 s1 λn−2 + · · · + bn s1 s2 · · · sn−1 = λn n

λ − b1 λ

n−1

n−2

− b2 s1 λ



− · · · − bn s1 s2 · · · sn−1 = 0.

But from Exercise 16, this is true if and only if λ is a zero of the characteristic polynomial of the Leslie matrix, indicating that λ is an eigenvalue of L. So r = g(1) = 1 if and only if λ = 1 is an eigenvalue of L, as desired. (c) For given values of bi and si , it is clear that g(λ) is a decreasing function of λ. Also, r = g(1) and g(λ1 ) = 1. So if λ1 > 1, then r = g(1) > g(λ1 ) = 1 so that r > 1. On the other hand, if λ1 < 1, then r = g(1) < g(λ1 ) = 1, so that r < 1. So we have shown that if the population is decreasing, then r < 1; if the population is increasing, then r > 1, and (part (b)) if the population is stable, then r = 1. Since this exhausts the possibilities, we have proven the converses as well. Thus r < 1 if and only if the population is decreasing and r > 1 if and only if the population is increasing. Note that this corresponds to our intuition, since if the net reproduction rate is less than one, we would expect the population in the next generation to be smaller. 24. If λ1 is the unique positive eigenvalue, then λ1 x = Lx for the corresponding eigenvector x. If h is the 1 1 x. Thus λ1 = 1−h , so that sustainable harvest ratio, then (1 − h)Lx = x, which means that Lx = 1−h 1 h = 1 − λ1 . 25. (a) Exercise 21 found the positive eigenvalue of L to be λ1 ≈ 1.0924, so from Exercise 24, h = 1− λ11 ≈ 1 ≈ 0.0846. 1 − 1.0924 (b) Reducing the initial population levels by 0.0846 gives     10 9.15  2   1.831       8   7.323         x00 = (1 − 0.0846)   5  =  4.577  . 12 10.985     0  0  1 0.915 Then 

0 0.3  0  0 0 x1 = Lx0 =  0 0  0 0

0.4 0 0.7 0 0 0 0

1.8 1.8 1.8 1.6 0 0 0 0 0 0 0 0 0.9 0 0 0 0 0.9 0 0 0 0 0.9 0 0 0 0 0.6

    0.6 9.15 42.47     0   1.831   2.75      0   7.323   1.28       0   4.577  =  6.59  .     0  10.985  4.12   0   0   9.89  0 0.915 0

The population has still grown substantially, because we did not start with a stable population

334

CHAPTER 4. EIGENVALUES AND EIGENVECTORS vector. If we repeat with the stable population ratios computed in Exercise 21, we get     0.915 1 0.275 0.252     0.176 0.161        x00 = (1 − 0.0846)  0.145 = 0.133 , 0.119 0.109     0.098 0.090 0.049 0.054 and



0 0.3  0  x01 = Lx00 =  0 0  0 0

0.4 0 0.7 0 0 0 0

1.8 1.8 1.8 1.6 0 0 0 0 0 0 0 0 0.9 0 0 0 0 0.9 0 0 0 0 0.9 0 0 0 0 0.6

    0.999 0.6 0.915     0  0.252 0.275 0.161 0.176 0         0  0.133 = 0.145 , 0.109 0.119 0     0  0.090 0.098 0.053 0.049 0

which is essentially x0 . 1 ≈ 0.25, which is about one quarter of the population. 26. Since λ1 ≈ 1.333, we have h = 1 − λ11 = 1 − 1.333

27. Suppose λ1 = r1 (cos 0 + i sin 0); then r1 = λ1 > 0 since λ1 is a positive eigenvalue. Let λ = r(cos θ + i sin θ) be some other eigenvalue. Then g(λ) = g(λ1 ) = 1. Computing g(λ) gives b1 b2 s1 bn s1 s2 · · · sn−1 + + ··· + n r(cos θ + i sin θ) r2 (cos θ + i sin θ)2 r (cos θ + i sin θ)n b2 s1 b2 s1 s2 · · · sn−1 b1 (cos θ − i sin θ)n , = (cos θ − i sin θ) + 2 (cos θ − i sin θ)2 + · · · + r r rn

1 = g(λ) =

where we get to the final equation by multiplying each fraction by

(cos θ−i sin θ)k (cos θ−i sin θ)k

= 1 and using the

identity (cos θ + i sin θ)(cos θ − i sin θ) = cos2 θ + sin2 θ = 1. Now apply DeMoivre’s theorem to get 1 = g(λ) =

b2 s1 b2 s1 s2 · · · sn−1 b1 (cos θ − i sin θ) + 2 (cos 2θ − i sin 2θ) + · · · + (cos nθ − i sin nθ). r r rn

Since g(λ) = 1, the imaginary parts must all cancel, so we are left with 1 = g(λ) =

b1 b2 s1 b2 s1 s2 · · · sn−1 cos θ + 2 cos 2θ + · · · + cos nθ. r r rn

But cos kθ ≤ 1 for all k, and bi , si , and r are all nonnegative, so 1 = g(λ) ≤

b1 b2 s1 b2 s1 s2 · · · sn−1 + 2 + ··· + . r r rn

Since g(λ1 ) = 1 as well, we get b1 b2 s1 b2 s1 s2 · · · sn−1 b1 b2 s1 b2 s1 s2 · · · sn−1 + 2 + ··· + ≤ + 2 + ··· + . n r1 r1 r1 r r rn Then since r1 > 0, we get r1 ≥ r. 28. The characteristic polynomial of A is 2 − λ det(A − λI) = 1

0 = (λ − 2)(λ − 1), 1 − λ

4.6. APPLICATIONS AND THE PERRON-FROBENIUS THEOREM

335

so that the Perron root is λ1 = 2. To find its Perron eigenvector,       0 0 0 1 −1 0 A − 2I 0 = → , 1 −1 0 0 0 0   1 so that is an eigenvector for λ1 = 2. The entries of the Perron eigenvector must sum to 1, however, 1 " # so dividing through by 2 gives for the Perron eigenvector

1 2 1 2

.

29. The characteristic polynomial of A is 1 − λ 3 = (1 − λ)(−λ) − 6 = λ2 − λ − 6 = (λ − 3)(λ + 2), det(A − λI) = 2 −λ so that the Perron root is λ1 = 3. To find its Perron eigenvector, " # " h i −2 3 0 1 − 23 → A − 3I 0 = 2 −3 0 0 0

# 0 , 0

  3 so that is an eigenvector for λ1 = 3. The entries of the Perron eigenvector must sum to 1, however, 2 " # so dividing through by 5 gives for the Perron eigenvector

3 5 2 5

.

30. The characteristic polynomial of A is −λ det(A − λI) = 1 1

1 −λ 1

1 1 = −(λ + 1)2 (λ − 2), −λ

so that the Perron root is λ1 = 2. To find its Perron eigenvector,    1 0 −2 1 1 0 h i    1 0 → 0 1 A − 2I 0 =  1 −2 1 1 −2 0 0 0

−1 −1 0

 0  0 , 0

  1 so that 1 is an eigenvector for λ1 = 2. The entries of the Perron eigenvector must sum to 1, however, 1   1 3

  so dividing through by 3 gives for the Perron eigenvector  13 . 1 3

31. The characteristic polynomial of A is 2 − λ det(A − λI) = 1 1

1 1 1−λ 0 = −λ(λ − 1)(λ − 3), 0 1 − λ

so that the Perron root is λ1 = 3. To find its Perron eigenvector,    −1 1 1 0 1 0 h i    A − 3I 0 =  1 −2 0 0 → 0 1 1 0 −2 0 0 0

−2 −1 0

 0  0 , 0

336

CHAPTER 4. EIGENVALUES AND EIGENVECTORS   2 so that 1 is an eigenvector for λ1 = 3. The entries of the Perron eigenvector must sum to 1, however, 1   1 2

  so dividing through by 4 gives for the Perron eigenvector  14 . 1 4

32. Here n = 4, and (I + A)n−1

 1 0 = 0 1

0 1 1 0

3  1 0 3 1  = 1 0 3 1

1 0 1 0

3 1 3 1

 1 3  > O, 3 1

3 1 1 3

so A is irreducible. 33. Here n = 4, and (I + A)n−1

so A is reducible. Interchanging required form, so that  0 0 0 1  0 0 1 0

 1 0 = 1 1

0 1 0 1

3  0 4 6 1  = 4 0 1 6

1 1 1 0

0 4 0 4

4 6 4 6

 0 4 , 0 4

rows 1 and 4 and then columns 1 and 4 produces a matrix in the 0 0 1 0

  0 1 0 0 A 0 0 1 0

0 1 0 0

  0 1 1 0 = 0 0 0 0

0 0 1 0

1 0 0 0

 1 0 . 1 0

0 1 0 1

34. Here n = 5, and

(I + A)n−1

 1 0  = 1 0 1

1 1 0 0 0

0 1 2 1 0

0 0 0 2 0

so that A is reducible. Exchanging rows 1 and matrix in the required form, so that    0 0 0 1 0 0 0 0 1 0 0 0 0 1    0 0 1 0 0 A 0 0    1 0 0 0 0 1 0 0 0 0 0 1 0 0

4  11 0 22 1    1  = 28 23  0 6 1

6 11 16 7 6

11 17 23 33 5

0 0 0 16 0

 11 17  22 , 18 6

4 and then exchanging columns 1 and 4 produces a 0 0 1 0 0

1 0 0 0 0

  0 1 0 0    0  = 0  0 0 1 0

0 0 0 1 0

1 1 1 0 0

0 0 1 0 1

 0 1  1 , 0 0

and B is a 1 × 1 matrix while D is 4 × 4. 35. Here n = 5, and

(I + A)n−1

so that A is irreducible.

 1 0  = 1 0 0

1 1 0 0 0

0 0 1 1 0

0 0 0 1 1

4  0 1 1 1    1  = 5  6 0 2 5

4 1 6 4 1

1 5 6 5 11

 5 11 11 16  12 21 , 6 12 16 22

4.6. APPLICATIONS AND THE PERRON-FROBENIUS THEOREM

337

36. (a) Let G be a graph and A its adjacency matrix. Let G0 be the graph obtained from G by adding an edge from each vertex to itself if no such edge exists in G. Then G0 is connected if and only if G is, since every vertex is always connected to itself. The adjacency matrix A0 of G results from A by replacing every 0 on the diagonal by a 1. Now consider the matrix A + I. This matrix is identical to A0 except that some diagonal entries of A0 equal to 1 may be equal to 2 in A + I (if the vertex in question had an edge from it to itself in G). But since all entries in A + I and A0 are nonnegative, (A0 )n−1 > O if and only (A + I)n−1 > O — whether those diagonal entries are 1 or 2 does not affect whether a particular entry in the power is zero or not. So G is connected if and only if G0 is connected, which is the case if and only if (A0 )n−1 of its adjacency matrix is > O, so that there is a path between any two vertices of length n − 1 (note that any shorter path can be made into a path of length n − 1 by adding traverses of the edge from the vertex to itself). But (A0 )n−1 > O then means that (A + I)n−1 > O, so that A is irreducible. (b) Since all the graphs in Section 4.0 are connected, they all have irreducible adjacency matrices. The graph in Figure 4.1 has a primitive adjacency matrix, since 2    3 2 2 2 0 1 1 1 2 3 2 2 1 0 1 1    A2 =  1 1 0 1 = 2 2 3 2 > O. 2 2 2 3 1 1 1 0 So does the graph in Figure 4.2:  0 1 0 0 1 0 1 0 1 0 1 0 0 0 0 1  0 1 0 1 0 0 0 0  0 0 1 0 1 0 0 0  1 0 0 1 0 1 0 0 4 A = 0 0 0 0 1 0 0 1  1 0 0 0 0 0 0 0  0 1 0 0 0 1 0 0  0 0 1 0 0 1 1 0 0 0 0 1 0 0 1 1

0 0 1 0 0 1 1 0 0 0

4  15 0 4 0   9 0   9 1    0  =4 9 0   4 1   9 1   9 0 9 0

4 15 4 9 9 9 9 4 9 9

9 4 15 4 9 9 9 9 4 9

9 9 4 15 4 9 9 9 9 4

The graph in Figure 4.3 has a primitive adjacency matrix: 4   6 1 4 0 1 0 0 1 1 6 1 1 0 1 0 0      A4 =  0 1 0 1 0 = 4 1 6 4 4 1 0 0 1 0 1 1 4 4 1 0 0 1 0 However, the adjacency matrix for the graph is  0 0 0 0  0 0 A= 1 1  1 1 1 1

4 4 1 6 1

4 9 9 4 15 4 9 9 9 9

9 9 9 9 4 15 9 4 4 9

4 9 9 9 9 9 15 9 4 4

9 4 9 9 9 4 9 15 9 4

9 9 4 9 9 4 4 9 15 9

 9 9  9  4  9 >O 9  4  4  9 15

 1 4  4  > O. 1 6

Figure 4.4 is  0 1 1 1 0 1 1 1  0 1 1 1 , 1 0 0 0  1 0 0 0 1 0 0 0

and any power of A will always have a 3 × 3 block of zeroes in either the upper left or upper right, so A is not primitive. 37. (a) Suppose G is bipartite with disjoint vertex sets U and V and A is its adjacency matrix. Recall that (Ak )ij gives the number of paths between vertices i and j. Also, from Exercise 80 in Section 3.7, if u ∈ U and v ∈ V , then any path from u to itself will have even length, and any path from

338

CHAPTER 4. EIGENVALUES AND EIGENVECTORS u to v will have odd length, since in any path, each successive edge moves you from U to V and back. So odd powers of A will have zero entries along the diagonal, while even powers of A will have zero entries in any cell corresponding to a vertex in U and a vertex in V . Since every power of A has a zero entry, A is not primitive. (b) Number the vertices so that v1 , v2 , . . . , vk are in U and vk+1 , vk+2 , . . . , vn are in V . Then the adjacency matrix of G is   O B A= . BT O   v Write the eigenvector v corresponding to λ as 1 , where v1 is a k-element column vector and v2 v2 is an (n − k)-element column vector. Then     λv1 T Av = λv ⇐⇒ Bv2 B v1 = , λv2 so that Bv2 = λv1 and B T v1 = λv2 . Then         −Bv2 v1 −λv1 −v1 −λ = = = A . −v2 λv2 v2 B T v1 This shows that −λ is an eigenvalue of A with corresponding eigenvector

  −v1 . v2

38. (a) Since each vertex of G connects to k other vertices, we see that each column of the adjacency matrix A of G sums to k. Thus A = kP , where P is a matrix each of whose columns sums to 1. So P is a transition matrix, and by Theorem 4.30, 1 is an eigenvalue of P , so that k is an eigenvalue of kP = A. (b) If A is primitive, then Ar > O for some r, and therefore P r > O as well. Therefore P is regular, so by Theorem 4.31, every other eigenvalue of P has absolute value strictly less than 1, so that every other eigenvalue of A = kP has absolute value strictly less than k. 39. This exercise is left to the reader after doing the exploration suggested in Section 4.0. 40. (a) Let A = [aij ]. Then cA = [caij ], so that |cA| = [|caij |] = [|c| · |aij |] = |c| [|aij |] = |c| · |A| . (b) Using the Triangle Inequality, |A + B| = [|aij + bij |] ≤ [|aij | + |bij |] = [|aij |] + [|bij |] = |A| + |B| . (c) Ax is the n × 1 matrix whose ith entry is ai1 x1 + ai2 x2 + · · · + ain xn . Thus the i

th

entry of |Ax| is |(Ax)i | = |ai1 x1 + ai2 x2 + · · · + ain xn | ≤ |ai1 x1 | + |ai2 x2 | + · · · + |ain xn | = |ai1 | |x1 | + |ai2 | |x2 | + · · · + |ain | |xn | .

But this is just the ith entry of |A| |x|. (d) The ij entry of |AB| is |AB|ij = |ai1 b1j + ai2 b2j + · · · + ain bnj | ≤ |ai1 b1j | + |ai2 b2j | + · · · + |ain bnj | = |ai1 | |b1j | + |ai2 | |b2j | + · · · + |ain | |bnj | , which is the ij entry of |A| |B|.

4.6. APPLICATIONS AND THE PERRON-FROBENIUS THEOREM

339

41. A matrix is reducible if, after permuting rows and columns according to the same permutation, it can be written   B C , O D where B and D are square and O is the zero matrix. If A is 2 × 2, then B, C, O, and D must all be 1 × 1 matrices (which are therefore just the entries of A). If A is in the above form after applying no permutations, leaving A alone, then a21 = 0. If A is in this form after applying the only other possible permutation — exchanging the two rows and the two columns — then the resulting matrix is   a22 a21 , a12 a11 so that a12 = 0. This proves the result. 42. (a) If λ1 and v1 are the Perron eigenvalue and eigenvector respectively, then by Theorem 4.37, λ1 > 0 and v1 > 0. By Exercise 22 in Section 4.3, λ1 − 1 is an eigenvalue, with eigenvector v1 , for A − I, and therefore 1 − λ1 is an eigenvalue for I − A, again with eigenvector v1 . Since I − A is 1 is an eigenvalue for (I − A)−1 , again with eigenvector v1 . invertible, Theorem 4.18 says that 1−λ 1 But (I − A)−1 ≥ 0 and v1 ≥ 0, so that 1 v1 = (I − A)−1 v1 ≥ 0. 1 − λ1 1 ≥ 0 and thus λ1 < 1. Since we also have λ1 > 0, the result follows: 0 < λ1 < 1. It follows that 1−λ 1 (b) Since 0 < λ1 < 1, v1 > λ1 v1 = Av1 .

43. x0 = 1, x1 = 2x0 = 2, x2 = 2x1 = 4, x3 = 2x2 = 8, x4 = 2x3 = 16, x5 = 2x4 = 32. 44. a1 = 128, a2 =

a1 2

= 64, a3 =

a2 2

= 32, a4 =

a3 2

= 16, a5 =

a4 2

= 8.

45. y0 = 0, y1 = 1, y2 = y1 − y0 = 1, y3 = y2 − y1 = 0, y4 = y3 − y2 = −1, y5 = y4 − y3 = −1. 46. b0 = 1, b1 = 1, b2 = 2b1 + b0 = 3, b3 = 2b2 + b1 = 7, b4 = 2b3 + b2 = 17, b5 = 2b4 + b3 = 41. 47. The recurrence is xn −3xn−1 −4xn−2 = 0, so the characteristic equation is λ2 −3λ−4 = (λ−4)(λ+1) = 0. Thus the system has eigenvalues −1 and 4, so xn = c1 (−1)n + c2 · 4n , where c1 and c2 are constants. Using the initial conditions, we have 0 = x0 = c1 (−1)0 + c2 · 40 = c1 + c2 , 5 = x1 = c1 (−1)1 + c2 · 41 = −c1 + 4c2 . Add the two equations to solve for c2 , then substitute back in to get c1 = −1 and c2 = 1, so that xn = −(−1)n + 4n = 4n + (−1)n−1 . 48. The recurrence is xn −4xn−1 +3xn−2 = 0, so the characteristic equation is λ2 −4λ+3 = (λ−3)(λ−1) = 0. Thus the system has eigenvalues 1 and 3, so xn = c1 · 1n + c2 · 3n , where c1 and c2 are constants. Using the initial conditions, we have 0 = x0 = c1 · 10 + c2 · 30 = c1 + c2 , 1 = x1 = c1 · 11 + c2 · 31 = c1 + 3c2 . Subtract the two equations to solve for c2 , then substitute back in to get c1 = − 21 and c2 = 21 , so that xn = − 12 · 1n + 12 · 3n = 12 (3n − 1). 49. The recurrence is yn − 4yn−1 + 4yn−2 = 0, so the characteristic equation is λ2 − 4λ + 4 = (λ − 2)2 = 0. Thus the system has the double eigenvalue 2, so yn = c1 · 2n + c2 · n2n , where c1 and c2 are constants. Using the initial conditions, we have 1 = y1 = c1 · 21 + c2 · 1 · 21 = 2c1 + 2c2 , 6 = y2 = c1 · 22 + c2 · 2 · 22 = 4c1 + 8c2 . Solving the system gives c1 = − 12 and c2 = 1, so that yn = − 21 · 2n + n · 2n = n2n − 2n−1 .

340

CHAPTER 4. EIGENVALUES AND EIGENVECTORS

50. The recurrence is an −an−1 + 14 an−2 = 0, so the characteristic equation is 14 (4λ2 −4λ+1) = 14 (2λ−1)2 = 0. Thus the system has the double eigenvalue 12 = 2−1 , so an = c1 · 2−n + c2 · n2−n , where c1 and c2 are constants. Using the initial conditions, we have 4 = a0 = c1 · 2−0 + c2 · 0 · 2−0 = c1 , 1 1 1 = a1 = c1 · 2−1 + c2 · 1 · 2−1 = c1 + c2 . 2 2 Solving the system gives c1 = 4 and c2 = −2, so that an = 4 · 2−n − 2n · 2−n = 22−n − n21−n . 51. The recurrence is bn − 2bn−1 − 2bn−2 = 0, so the characteristic equation is λ2 − 2λ − 2 = 0. Using √ 3, which are therefore the eigenvalues. So the quadratic√formula, this equation has roots λ = 1 ± √ bn = c1 · (1 + 3)n + c2 · (1 − 3)n , where c1 and c2 are constants. Using the initial conditions, we have √ √ 0 = b0 = c1 · (1 + 3)0 + c2 · (1 − 3)0 = c1 + c2 , √ √ √ 1 = b1 = c1 · (1 + 3)1 + c2 · (1 − 3)1 = c1 + c2 + (c1 − c2 ) 3. √ From the first equation, c1 + c2 = 0, so the second becomes 1 = (c1 − c2 ) 3, so that c1 − c2 = √ √ Since c2 = −c1 , we get c1 = 63 and c2 = − 63 , so that the recurrence is √ √ √  √ n √ n √ √  3 3 3 (1 + 3) − (1 − 3) = (1 + 3)n − (1 − 3)n . bn = 6 6 6



3 3 .

52. The recurrence is yn − yn−1 + yn−2 = 0, so the characteristic equation is λ2 − λ + 1 = 0. Using the √ 1 quadratic formula, this equation has roots λ = 2 (1 ± 3 i), which are therefore the eigenvalues. So √ n √ n yn = c1 · 21 (1 + 3 i) +c2 · 12 (1 − 3 i) , where c1 and c2 are constants. Using the initial conditions, we have 0  0  √ √ 1 1 0 = y0 = c1 · (1 + 3 i) + c2 · (1 − 3 i) = c1 + c2 , 2 2  1  1 √ √ √ √ 1 1 1 1 1 = y1 = c1 · (1 + 3 i) + c2 · (1 − 3 i) = c1 (1 + 3 i) + c2 (1 − 3 i). 2 2 2 2 √ √ From the first equation, c1 + c2 = 0, so the second becomes 2 = (c1 − c2 ) 3 i, so that c1 − c2 = − 2 3 3 i. √ √ Since c2 = −c1 , we get c1 = − 33 i and c2 = 63 , so that the recurrence is √  n √  n √ √ 3 1 3 1 i (1 + 3 i) + i (1 − 3 i) . yn = − 3 2 3 2 53. Working with the matrix A from the Theorem, since it has distinct eigenvalues λ1 6= λ2 , it can be diagonalized, so that there is some invertible P such that     e f λ1 0 −1 P = , P AP = D = . g h 0 λ2 First consider the case where one of the eigenvalues, say λ2 , is zero. In this case the matrix must be of the form   a 0 A= , 1 0 so that the other eigenvalue is a. In this case, the recurrence relation is xn = axn−1 + 0xn−2 = axn−1 , so if we are given xj and xj−1 as initial conditions, we have xn = an−j xj . So take c1 = xj a−j and c2 = 0; then xn = c1 λn1 + c2 λn2 as desired. Now suppose that neither eigenvalue is zero. Then       1 1 e f λk1 0 h −f ehλk1 − f gλk2 −ef λk1 + ef λk2 Ak = P Dk P −1 = = . 0 λk2 −g e eh − f g g h eh − f g ghλk1 − ghλk2 −f gλk1 + ehλk2

4.6. APPLICATIONS AND THE PERRON-FROBENIUS THEOREM

341

If we are given two terms of the sequence, say xj−1 and xj , then for any n we have   xn xn = = An−j xj xn−1    1 xj + ef λn−j −ef λn−j − f gλn−j ehλn−j 2 1 2 1 = xj−1 eh − f g ghλ1n−j − ghλn−j + ehλn−j −f gλn−j 2 1 2      xj−1 xj + −ef λn−j + ef λn−j ehλn−j − f gλn−j 1 1 2 1 2      = n−j n−j n−j eh − f g x x + −f gλ + ehλ ghλn−j − ghλ j−1 j 1 2 1 2      −j λn (ehxj − ef xj−1 ) λ1 λn1 + (−f gxj + ef xj−1 ) λ−j 1 2  2.    = eh − f g λn2 λn1 + (−ghxj + ehxj−1 ) λ−j (ghxj − f gxj−1 ) λ−j 2 1 Looking at the first row, we see that for c1 =

ehxj − ef xj−1 −j λ1 , eh − f g

c2 =

−f gxj + ef xj−1 −j λ2 , eh − f g

c1 and c2 are independent of n, and we have xn = c1 λn1 + c2 λn2 , as required. 54. (a) First suppose that the eigenvalues are distinct, i.e., that λ1 6= λ2 . Then Exercise 53 gives one formula for c1 and c2 involving the diagonalizing matrix P . But since we now know that c1 and c2 exist, we can derive a simpler formula. We know the solutions are of the form xn = c1 λn1 + c2 λn2 , valid for all n. Since x0 = r and x1 = s, we get the following linear system in c1 and c2 : λ01 c1 + λ02 c2 = r λ11 c1

+

λ12 c2

or

c1 +

=s

c2 = r

λ1 c1 + λ2 c2 = s.

Solving this system, for example by setting up the augmented matrix and row-reducing, gives a unique solution s − λ1 r s − λ1 r c1 = r − , c2 = . λ2 − λ1 λ2 − λ1 So the system has a unique solution in terms of x0 = r, x1 = s, λ1 , and λ2 . Note that the system has a unique solution (and these expressions make sense) since λ1 6= λ2 . Next suppose that the eigenvalues are the same, say λ1 = λ2 = λ. Then the solutions are of the form xn = c1 λn + c2 nλn . Note that we must have λ 6= 0, since otherwise the characteristic polynomial of the recurrence matrix is λ2 , so the recurrence relation is xn = 0. With x0 = r and x1 = s, we get the following linear system in c1 and c2 : λ0 c1 + 0 · λ0 c2 = r 1

1

λ c1 + 1 · λ c2 = s

or

c1

=r

λc1 + λc2 = s.

Thus c1 = r, and substituting into the second equation gives (since λ 6= 0) c2 = s−λr λ . So in this case as well we can find the scalars c1 and c2 uniquely. Finally, note that 0 is an eigenvalue of the recurrence matrix only when b = 0 in the recurrence relation xn = axn−1 + bxn−2 . (b) From part (a), the general solution to the recurrence is     s − λ1 r s − λ1 r n n n x n = c1 λ 1 + c2 λ 2 = r − λ1 + λn2 . λ2 − λ1 λ2 − λ1 Substituting x0 = r = 0 and x1 = s = 1 gives     1 − λ1 · 0 1 − λ1 · 0 1 1 xn = 0 − λn1 + λn2 = (−λn1 + λn2 ) = (λn − λn2 ). λ2 − λ1 λ2 − λ1 λ2 − λ1 λ1 − λ2 1

342

CHAPTER 4. EIGENVALUES AND EIGENVECTORS

55. (a) For n = 1, since f0 = 0 and f1 = f2 = 1, the equation holds:     1 1 f f1 A1 = A = = 2 . 1 0 f1 f0 Now assume that the equation is true for n = k, and consider Ak+1 :       f fk 1 1 f + fk fk+1 f Ak+1 = Ak A = k+1 = k+1 = k+2 fk fk−1 1 0 fk + fk−1 fk fk+1

fk+1 fk



as desired. n (b) Since det A = −1, it follows that det(An ) = (det A) = (−1)n . But from part (a), 2 det(An ) = fn+2 fn − fn+1 = (−1)n .

(c) If we look at the 5 × 13 “rectangle” as being composed of two “triangles”, we see that in fact they are not triangles: the “diagonal” of the “rectangle” is not straight because the 8 × 3 triangular 5 6= 38 . So there is empty space along the pieces are not similar to the entire triangles, since 13 diagonal of the figure that adds up to the missing 1 square unit. 56. (a) t1 = 1 (the only way is using the 1 × 1 rectangle); t2 = 3 (two 1 × 1 rectangles or either of the 1 × 2 rectangles); t3 = 5; t4 = 11; t5 = 21. For t0 , we can think of that as the number of ways to tile an empty rectangle; there is only one way — using no rectangles. (b) The rightmost rectangle in a tiling of a 1 × n rectangle is either the 1 × 1 or one of the 1 × 2 rectangles. So the number of tilings of a 1 × n rectangle is the number of tilings of a 1 × (n − 1) rectangle (adding the 1 × 1) plus twice the number of tilings of a 1 × (n − 2) rectangle (adding either 1 × 2). So the recurrence relation is tn = tn−1 + 2tn−2 . (c) The characteristic equation is λ2 − λ − 2 = (λ − 2)(λ + 1), so the eigenvalues are −1 and 2. To find the constants, we solve t1 = 1 = c1 (−1)1 + c2 · 21 = −c1 + 2c2 t2 = 3 = c1 (−1)2 + c2 · 22 = c1 + 4c2 . This gives c1 =

1 3

and c2 = 23 , so that tn =

 2 1 1 (−1)n + · 2n = (−1)n + 2n+1 . 3 3 3

Evaluating for various values of t gives  1 (−1)0 + 2 = 1, 3  1 t2 = (−1)2 + 23 = 3, 3  1 t4 = (−1)4 + 25 = 11, 3 t0 =

 1 (−1)1 + 22 = 1, 3  1 t3 = (−1)3 + 24 = 5, 3  1 t5 = (−1)5 + 26 = 21. 3

t1 =

These agree with part (a). 57. (a) d1 = 1, d2 = 2, d3 = 3, d4 = 5, d5 = 8. As in Exercise 56, d0 can be interpreted as the number of ways to cover a 2 × 0 rectangle, which is one — use no dominos. (b) The right end of any tiling of a 2 × n rectangle either consists of a domino laid vertically, or of two dominos each laid horizontally. Removing those leaves either a 2 × (n − 1) or a 2 × (n − 2) rectangle. So the number of tilings of a 2 × n rectangle is equal to the number of tilings of a 2 × (n − 1) rectangle plus the number of tilings of a 2 × (n − 2) rectangle, or dn = dn−1 + dn−2 .

4.6. APPLICATIONS AND THE PERRON-FROBENIUS THEOREM

343

(c) We recognize the recurrence as the recurrence for the Fibonacci sequence; the initial values shift it by 1, so we get √ !n+1 √ !n+1 1 1 1+ 5 1− 5 dn = √ −√ . 2 2 5 5 Computing values using this formula gives the same values as in part (a), including the value for d0 . 58. Row-reducing gives h

h

A−

A−

√ 1+ 5 2 I √ 1− 5 2 I

√ 1+ 5 2

" i 1− 0 =

1 √

"

1− 5 2

i 1− 0 =

So the eigenvectors are " v1 =

1

√ # 1+ 5 2

1

" 1 √ → 1+ 5 0 − 2 " # 0√ 1 → 1− 5 0 − 2 #

0

" and v2 =

√ # −1− 5 2

0 √ # −1+ 5 2 .

0

√ # 1− 5 2 .

1

Then √ k 1+ 5 2    √ k−1 1+ 5 1 √ 2 5



xk = lim k→∞ λk k→∞ 1 lim

√1 5

  = lim  k→∞

√1 5

·



√ k 1− 5 2   √ k−1  1− 5 1 − √5 2  √ k 1+ 5 2



√1 5



√1 5



2√ 1+ 5



√1 5

·

√1 5





√ k 1−√5  1+ 5 √ k  . √  1− 5 1−√5 2 1+ 5



√ √5 Since 1− ≈ 0.381 < 1, the terms involving k th powers of this fraction go to zero as k → ∞, leaving 1+ 5 us with (in the limit) " # " √ # √ 1+ 5 √1 2 2 5− 5 5 2 √ √ √ √ = = v = v1 , 1 √ 2 √ 10 1 5(1 + 5) 5(1 + 5) 5(1+ 5) which is a constant times v1 . 59. The coefficient matrix has characteristic polynomial   1−λ 3 = (1 − λ)(2 − λ) − 6 = λ2 − 3λ − 4 = (λ − 4)(λ + 1), 2 2−λ so that the eigenvalues are λ1 = 4 and λ2 = −1. The corresponding eigenvectors are found by rowreduction:         −3 3 0 1 −1 0 1 A − 4I 0 = → ⇒ v1 = 2 −2 0 0 0 0 1       3   2 3 0 3 1 2 0 A+I 0 = → ⇒ v2 = . 2 3 0 −2 0 0 0 Then by Theorem 4.40, x = C1 e4t

    1 3 + C2 e−t . 1 −2

344

CHAPTER 4. EIGENVALUES AND EIGENVECTORS Substitute t = 0 to determine the constants:         0 1 3 C1 + 3C2 x(0) = = C1 e0 + C 2 e0 = . 5 1 −2 C1 − 2C2 These two equations give C1 = 3 and C2 = −1, so that the solution is x = 3e4t

    x(t) = 3e4t − 3e−t 1 3 − e−t , or 1 −2 y(t) = 3e4t + 2e−t .

60. The coefficient matrix has characteristic polynomial   2−λ −1 = (2 − λ)2 − 1 = λ2 − 4λ + 3 = (λ − 1)(λ − 3), −1 2−λ so that the eigenvalues are λ1 = 1 and reduction:    1 A−I 0 = −1    −1 A − 3I 0 = −1

λ2 = 3. The corresponding eigenvectors are found by row  0 1 → 0 0   0 1 → 0 0

−1 1 −1 −1

−1 0 1 0

   0 1 ⇒ v1 = 0 1    0 1 ⇒ v2 = . 0 −1

Then by Theorem 4.40,     1 1 3t x = C1 e + C2 e . 1 −1 t

Substitute t = 0 to determine the constants:         1 1 C1 + C2 0 1 0 x(0) = = C1 e + C2 e = . 1 1 −1 C1 − C2 These two equations give C1 = 1 and C2 = 0, so that the solution is     x(t) = et 1 1 3t x=e +0·e , or 1 −1 y(t) = et . t

61. The coefficient matrix has characteristic polynomial   √ √ 1−λ 1 = (1 − λ)(−1 − λ) − 1 = λ2 − 2 = (λ − 2 )(λ + 2 ), 1 −1 − λ √ so that the eigenvalues are λ1 = 2 and λ2 row-reduction: √    √ 1− 2 1√ A − 2I 0 = 1 −1 − 2 √    √ 1+ 2 1√ A + 2I 0 = 1 −1 + 2 Then by Theorem 4.40, x = C1 e

√ 2t

√ = − 2. The corresponding eigenvectors are found by   0 1 → 0 0   0 1 → 0 0

−1 − 0 −1 + 0

√ √

2

0 0



2

0 0



⇒ ⇒

 √  1+ 2 1  √  1− 2 v2 = . 1 v1 =

  √  √  √ 1+ 2 1− 2 + C 2 e− 2 t . 1 1

Substitute t = 0 to determine the constants:     √  √   √ √  1 1+ 2 1− 2 C1 (1 + 2) + C2 (1 − 2) x(0) = = C1 e0 + C 2 e0 = . 0 1 1 C1 + C2

4.6. APPLICATIONS AND THE PERRON-FROBENIUS THEOREM

345

√ The second equation gives C1 + C2 = 0; substituting into √the first equation gives (C1 − C2 ) 2 = 1, so √ √ that C1 − C2 = 22 . It follows that C1 = 42 and C2 = − 42 , so that the solution is √ √ 2 + 2 √2 t 2 − 2 −√2 t √ √     √ √ x1 (t) = e + e 2 √2 t 1 + 2 2 −√2 t 1 − 2 4 x= − , or e e √ 4  √ 1 1 4 4 2 √2 t − e− 2 t . x2 (t) = e 4 62. The coefficient matrix has characteristic polynomial   1−λ −1 = (1 − λ)2 + 1 = λ2 − 2λ + 2, 1 1−λ so that the eigenvalues are λ1 = 1 + i and λ2 = 1 − i. The corresponding eigenvectors are found by row-reduction:         −i −1 0 1 −i 0 i A − (1 + i)I 0 = → ⇒ v1 = 1 −i 0 0 0 0 1         i −1 0 1 i 0 −i A − (1 − i)I 0 = → ⇒ v2 = . 1 i 0 0 0 0 1 Then by Theorem 4.40, y = C1 e(1+i)t

    i −i + C2 e(1−i)t . 1 1

Substitute t = 0 to determine the constants:         1 (C1 − C2 )i 0 i 0 −i y(0) = = C1 e + C2 e = . 1 1 1 C1 + C2 Solving gives C1 =

1−i 2 ,

C1 =

1+i 2 ,

so that the solution is     1 − i (1+i)t i 1 + i (1−i)t −i y= e + e , 1 1 2 2

or 1 + i (1−i)t i + 1 (1+i)t 1 − i (1−i)t 1 − i (1+i)t e i+ e · (−i) = e + e 2 2 2 2 1 − i (1+i)t 1 + i (1−i)t y2 (t) = e + e . 2 2

y1 (t) =

Now substitute cos t + i sin t for eit , giving i + 1 (1+i)t 1 − i (1−i)t e + e 2 2 i+1 t 1−i t = e (cos t + i sin t) + e (cos t − i sin t) 2 2  i − 1 −i − 1 t = et cos t + + e sin t 2 2

y1 (t) =

= et (cos t − sin t) 1 − i (1+i)t 1 + i (1−i)t y2 (t) = e + e 2 2 1−i t 1+i t = e (cos t + i sin t) + e (cos t − i sin t) 2  2 1+i 1−i t = et cos t + + e sin t 2 2 = et (cos t + sin t).

346

CHAPTER 4. EIGENVALUES AND EIGENVECTORS

63. The coefficient matrix has characteristic polynomial   −λ 1 −1  1 −λ 1  = −λ3 + λ = −λ(1 − λ)(1 + λ), 1 1 −λ so that the eigenvalues are λ1 = 0, λ2 = 1, and λ3 = −1. row-reduction:    0 1 −1 0 1 0   1 0 → 0 1 A − 0I 0 = 1 0 1 1 0 0 0 0    1 −1 1 −1 0   1 0 → 0 A − I 0 =  1 −1 1 1 −1 0 0    1 1 1 1 −1 0   1 0 → 0 0 A + I 0 = 1 1 1 1 1 0 0 0

The corresponding eigenvectors are found by 1 −1 0 0 1 0 0 1 0

 0 0 0 0 −1 0  0 0 0

  −1 ⇒ v1 =  1 1    0 0 0 ⇒ v2 = 1 0 1   −1 ⇒ v2 =  1 . 0

Then by Theorem 4.40,       0 −1 −1 x = C1 e0t  1 + C2 et 1 + C3 e−t  1 . 1 0 1 Substitute t = 0 to determine the constants:           1 −C1 − C3 −1 0 −1 x(0) =  0 = C1 e0  1 + C2 e0 1 + C3 e0  1 = C1 + C2 + C3  −1 1 1 0 C1 + C2 Solving gives C1 = −2, C3 = 1, C2 = 1, so that the solution is       −1 0 −1 x = −2  1 + et 1 + e−t  1 1 1 0 or x(t) = 2 − e−t y(t) = −2 + et + e−t z(t) = −2 + et . 64. The coefficient matrix has characteristic polynomial   1−λ 0 3  1 −2 − λ 1  = −(λ − 4)(λ + 2)2 3 0 −λ so that the eigenvalues are λ1 = 4 and λ2 = −2 (with algebraic multiplicity 2). The corresponding eigenvectors are found by row-reduction:       1 0 −1 0 −3 0 3 0 3   1 0 → 0 1 − 13 0 ⇒ v1 = 1 A − 4I 0 =  1 −6 3 0 −3 0 3 0 0 0 0         3 0 3 0 1 0 1 0 0 −1   A + 2I 0 = 1 0 1 0 → 0 0 0 0 ⇒ v2 = 1 , v3 =  0 . 3 0 3 0 0 0 0 0 0 1

4.6. APPLICATIONS AND THE PERRON-FROBENIUS THEOREM

347

Then by Theorem 4.40,       3 0 −1 x = C1 e4t 1 + C2 e−2t 1 + C3 e−2t  0 . 3 0 1 Substitute t = 0 to determine the constants:           2 3 0 −1 3C1 − C3 x(0) = 3 = C1 e0 1 + C2 e0 1 + C3 e0  0 =  C1 + C2  4 3 0 1 3C1 + C3 Solving gives C1 = 1, C2 = 2, C3 = 1, so that the solution is       3 0 −1 x = e4t 1 + 2e−2t 1 + e−2t  0 3 0 1 or x(t) = 3e4t − e−2t y(t) = e4t + 2e−2t z(t) = 3e4t + e−2t . 65. (a) The coefficient matrix has characteristic polynomial   1.2 − λ −0.2 = (1.2 − λ)(1.5 − λ) − 0.04 = λ2 − 2.7λ + 1.76 = (λ − 1.1)(λ − 1.6) −0.2 1.5 − λ so that the eigenvalues are λ1 = 1.1 and λ2 = 1.6. The corresponding eigenvectors are found by row-reduction:         0.1 −0.2 0 1 −2 0 2 A − 1.1I 0 = → ⇒ v1 = 0.2 0.4 0 0 0 0 1       1   −0.4 −0.2 0 −1 1 2 0 A − 1.6I 0 = → ⇒ v2 = . −0.2 −0.1 0 2 0 0 0 Then by Theorem 4.40, x = C1 e

1.1t

    2 1.6t −1 + C2 e . 1 2

Substitute t = 0 to determine the constants:         400 2 −1 2C1 − C2 x(0) = = C 1 e0 + C2 e0 = . 500 1 2 C1 + 2C2 Solving gives C1 = 260, C2 = 120, so that the solution is     1.1t 2 1.6t −1 x = 260e + 120e 1 2 or x(t) = 520e1.1t − 120e1.6t y(t) = 260e1.1t + 240e1.6t . A plot of the populations over time is

348

CHAPTER 4. EIGENVALUES AND EIGENVECTORS Population

15 000

X 10 000

5000

Y 0

1.0

0.5

1.5

2.0

2.5

3.0

Time HdaysL

From the graph, the population of Y dies out after about three days. (b) Substituting a and b for 400 and 500 in the initial value computation above gives         a 2 −1 2C1 − C2 x(0) = = C 1 e0 + C2 e0 = . b 1 2 C1 + 2C2 Solving gives C1 = 51 (2a + b), C2 = 15 (−a + 2b), so that the solution is     1 1 1.1t 2 1.6t −1 x = (2a + b)e + (−a + 2b)e 1 2 5 5 or 1 2 x(t) = (2a + b)e1.1t − (−a + 2b)e1.6t 5 5 1 2 y(t) = (2a + b)e1.1t + (−a + 2b)e1.6t . 5 5 In the long run, the e1.6t term will dominate, so if a > 2b, then −a + 2b < 0, so the coefficient of e1.6t in x(t) is positive and in y(t) is negative, so that Y will die out and X will dominate. If a < 2b, then −a + 2b > 0, so the reverse will happen — X will die out and Y will dominate. Finally, if a = 2b, then the e1.6t term disappears in the solutions, and both populations will grow at a rate proportional to e1.1t , though one will grow twice as fast as the other. 66. The coefficient matrix has characteristic polynomial   −0.8 − λ 0.4 = (−0.8 − λ)(−0.2 − λ) − 0.16 = λ2 + λ = λ(λ + 1) 0.4 −0.2 − λ so that the eigenvalues are λ1 = 0 and λ2 = −1. The corresponding eigenvectors are found by rowreduction:         −0.8 0.4 0 1 1 − 12 0 A − 0I 0 = ⇒ v1 = → 2 0.4 −0.2 0 0 0 0         0.2 0.4 0 1 2 0 −2 A+I 0 = → ⇒ v2 = . 0.4 0.8 0 0 0 0 1 Then by Theorem 4.40,     1 −t −2 x = C1 e + C2 e . 2 1 0t

Substitute t = 0 to determine the constants:         15 C1 − 2C2 0 1 0 −2 x(0) = = C1 e + C2 e = . 10 2 1 2C1 + C2 Solving gives C1 = 7, C2 = −4, so that the solution is     1 −2 x=7 − 4e−t 2 1 or x(t) = 7 + 8e−t y(t) = 14 − 4e−t . As t → ∞, the exponential terms go to zero, and the populations stabilize at 7 of X and 14 of Y .

4.6. APPLICATIONS AND THE PERRON-FROBENIUS THEOREM

349

67. We want to convert x0 = Ax + b into an equation u0 = Au. If we could write b = Ab0 , then letting u = x + b0 would do the trick. To find b0 , we invert A: # #" # " " 1 −10 − 12 −30 0 −1 2 = . b =A b= 1 1 −20 −10 2 2   −10 Then let u = x + , so that −20    0     −10 −10 −10 u0 = x + = x0 = Ax + b = Ax + AA−1 = A x + A−1 = Au. −20 −20 −20 So now we want to solve this equation. A has characteristic polynomial   1−λ 1 = (1 − λ)2 + 1 = λ2 − 2λ + 2, −1 1−λ so the eigenvalues are 1 + i and 1 − i. Eigenvectors are found      −i 1 0 1 i A − (1 + i)I 0 = → −1 −i 0 0 0      1 −i i 1 0 A − (1 − i)I 0 = → −1 i 0 0 0

by row-reducing:    0 1 ⇒ v1 = 0 i    0 i ⇒ v2 = . 0 1

Then by Theorem 4.40, u = C1 e

(1+i)t

    1 (1−i)t i + C2 e . i 1

Substitute t = 0 to determine the constants:           −10 10 C1 + iC2 0 1 0 i u(0) = x(0) + = = C1 e + C2 e = . −20 10 i 1 iC1 + C2 Solving gives C1 = C2 = 5 − 5i, so that the solution is     1 i u = (5 − 5i)e(1+i)t + (5 − 5i)e(1−i)t i 1     1 i = (5 − 5i)et (cos t + i sin t) + (5 − 5i)et (cos t − i sin t) i 1   (5 − 5i) cos t + (5 + 5i) sin t + (5 + 5i) cos t + (5 − 5i) sin t = et (5 + 5i) cos t + (−5 + 5i) sin t + (5 − 5i) cos t + (−5 − 5i) sin t   10 cos t + 10 sin t = et . 10 cos t − 10 sin t Rewriting in terms of the original variables gives         −10 10 t 10 cos t + 10 sin t t 10 cos t + 10 sin t x=e − =e + . 10 cos t − 10 sin t −20 10 cos t − 10 sin t 20 A plot of the populations is below. Both populations die out; species Y after about 1.2 time units and species X after about 2.4. Population 60

X

50 40 30 20

Y

10 0 0.0

0.5

1.0

1.5

2.0

2.5

3.0

Time HdaysL

350

CHAPTER 4. EIGENVALUES AND EIGENVECTORS

68. We want to convert x0 = Ax + b into an equation u0 = Au. If we could write b = Ab0 , then letting u = x + b0 would do the trick. To find b0 , we invert A: # #" # " " −20 0 − 12 − 12 0 −1 = . b =A b= 1 −20 − 12 40 2   −20 Then let u = x + , so that −20     0    −20 0 0 0 0 −1 −1 u = x+ = x = Ax + b = Ax + AA =A x+A = Au. −20 40 40 So now we want to solve this equation. A has characteristic polynomial   −1 − λ 1 = (−1 − λ)2 + 1 = λ2 + 2λ + 2, −1 −1 − λ so the eigenvalues are −1 + i and −1 − i. Eigenvectors are found by row-reducing:         −i 1 0 1 i 0 1 A − (−1 + i)I 0 = → ⇒ v1 = −1 −i 0 0 0 0 i         i 1 0 1 −i 0 i A − (−1 − i)I 0 = → ⇒ v2 = . −1 i 0 0 0 0 1 Then by Theorem 4.40, u = C1 e

(−1+i)t

    1 (−1−i)t i + C2 e . i 1

Substitute t = 0 to determine the constants:           −20 −10 1 i C1 + iC2 u(0) = x(0) + = = C1 e0 + C 2 e0 = . −20 10 i 1 iC1 + C2 Solving gives C1 = −5 − 5i and C2 = 5 + 5i, so that the solution is     1 i u = (−5 − 5i)e(−1+i)t + (5 + 5i)e(−1−i)t i 1     1 i = (−5 − 5i)e−t (cos t + i sin t) + (5 + 5i)e−t (cos t − i sin t) i 1   (−5 − 5i) cos t + (5 − 5i) sin t + (−5 + 5i) cos t + (5 + 5i) sin t = e−t (5 − 5i) cos t + (5 + 5i) sin t + (5 + 5i) cos t + (5 − 5i) sin t   −10 cos t + 10 sin t = e−t . 10 cos t + 10 sin t Rewriting in terms of the original variables gives         −20 20 −t −10 cos t + 10 sin t −t −10 cos t + 10 sin t x=e − =e + . 10 cos t + 10 sin t −20 10 cos t + 10 sin t 20 A plot of the populations is below; both populations stabilize at 20 as the e−t term goes to zero. Population 30

Y 25 20

X 15 10 5 0

0

1

2

3

4

Time HdaysL 5

4.6. APPLICATIONS AND THE PERRON-FROBENIUS THEOREM

351

69. (a) Making the change of variables y = x0 and z = x, the original equation x00 + ax0 + bx = 0 becomes y 0 + ay + bz = 0; since z = x, we have z 0 = x0 = y. So we get the system y 0 = −ay − bz z0 =

y

(b) The coefficient matrix of the system in (a) is  −a 1

.

 −b , 0

whose characteristic polynomial is (−a − λ)(−λ) + b = λ2 + aλ + b. 70. Make the substitutions y1 = x(n−1) , y2 = x(n−2) , . . . , yn−1 = x0 , yn = x. Then the original equation x(n) + an−1 x(n−1) + · · · + a1 x0 + a0 = 0 can be written as y10 + an−1 y1 + an−2 y2 + · · · + a2 yn−2 + a1 yn−1 + a0 yn = 0, and for i = 2, 3, . . . , n, we have yi0 = yi−1 . So we get the n equations y10 = an−1 y1 − an−2 y2 − · · · − a2 yn−2 − a1 yn−1 − a0 yn y20 = y1 y30 = y2 .. . yn0 = yn−1 whose coefficient matrix is

 −an−1  1   0   ..  .

−an−2 0 1 .. .

0

0

· · · −a1 ··· 0 ··· 0 .. .. . . ··· 1

 −a0 0   0  . ..  .  0

This is the companion matrix of p(λ). 71. From Exercise 69, we know that the characteristic polynomial is λ2 − 5λ + 6, so the eigenvalues are 2 and 3. Therefore the general solution is x(t) = C1 e2t + C2 e3t . 72. From Exercise 69, we know that the characteristic polynomial is λ2 + 4λ + 3, so the eigenvalues are −1 and −3. Therefore the general solution is x(t) = C1 e−t + C2 e−3t . 73. From   Exercise   59, the matrix of the system has eigenvalues 4 and −1 with corresponding eigenvectors 1 3 and . So by Theorem 4.41, the system x0 = Ax has solution x = eAt x(0). In this case 1 −2   0 x(0) = , we get 5 " #" #" #−1 " # 2 4t 3 −t 3 4t 3 −t 4t e + e e − e 1 3 e 0 1 3 eAT = P (et )D P −1 = = 52 4t 25 −t 53 4t 52 −t . 1 −2 0 e−t 1 −2 5e − 5e 5e + 5e Thus the solution is

"

2 4t 5e 2 4t 5e

+ 35 e−t − 25 e−t

3 4t 5e 3 4t 5e

which matches the result from Exercise 59.

− 35 e−t + 25 e−t

#" # " # 0 3e4t − 3e−t = , 5 3e4t + 2e−t

352

CHAPTER 4. EIGENVALUES AND EIGENVECTORS

74. From   Exercise   60, the matrix of the system has eigenvalues 1 and 3 with corresponding eigenvectors 1 1 and . So by Theorem 4.41, the system x0 = Ax has solution x = eAt x(0). In this case 1 −1   1 x(0) = , we get 1 "

e

AT

t D

= P (e ) P

−1

1 = 1

#" 1 et −1 0

0 e3t

#−1 " 1 t e + 1 e3t 1 = 21 t 21 3t −1 2e − 2e

#" 1 1

1 t 2e 1 t 2e

# − 12 e3t . + 12 e3t

Thus the solution is "

1 t 2e 1 t 2e

+ 12 e3t − 12 e3t

1 t 2e 1 t 2e

− 12 e3t + 12 e3t

#" # " # 1 et = t , 1 e

which matches the result from Exercise 60. 75. From of the system has eigenvalues 0, 1, and −1 with corresponding eigenvectors  Exercise   63, the matrix  −1 0 −1  1, 1, and  1. So by Theorem 4.41, the system x0 = Ax has solution x = eAt x(0). In this 1 1 0   1 case x(0) =  0, and we get −1

eAT = P (et )D P −1

Thus the solution is 

1 −1 + e−t −1 + et

   −1 −1 0 −1 1 0 0 −1 0 −1 1 0 et 0  1 1 1 = 1 1 −t 1 1 0 0 0 e 1 1 0   −t −t 1 1−e −1 + e = −1 + e−t −1 + e−t + et 1 − e−t  . −1 + et −1 + et 1

1 − e−t −1 + e−t + et −1 + et

    −1 + e−t 1 2 − e−t 1 − e−t   0 = −2 + et + e−t  , −1 1 −2 + et

which matches the result from Exercise 63. 76. From Exercise 64, the matrix of the system has  eigenvalues   4 and −2 (with algebraic multiplicity 2) 0 −1  3  with corresponding eigenvectors 1 and 1 ,  0 . So by Theorem 4.41, the system x0 = Ax   3 0  1 2 has solution x = eAt x(0). In this case x(0) = 3, and we get 4

eAT = P (et )D P −1

 3  = 1 3  =

0 1 0

1 4t 2e  1 4t 6e 1 4t 2e

 −1 e4t  0  0 1 0 + 12 e−2t − 16 e−2t − 12 e−2t

0 e−2t 0 0

e−2t 0

 0 3  0  1 −2t e 3

1 4t 2e 1 4t 6e 1 4t 2e

0 1 0

 − 12 e−2t  − 16 e−2t  . + 12 e−2t

−1 −1  0 1

4.6. APPLICATIONS AND THE PERRON-FROBENIUS THEOREM

353

Thus the solution is 

1 4t 2e  1 4t 6e 1 4t 2e

+ 21 e−2t − 61 e−2t − 21 e−2t

    3e4t − e−2t 2 − 12 e−2t     − 16 e−2t  3 = e4t + 2e−2t  , 3e4t + e−2t 4 + 12 e−2t

1 4t 2e 1 4t 6e 1 4t 2e

0 e−2t 0

which matches the result from Exercise 64.   2 1 77. (a) With A = , we have 0 3       1 3 9 x1 = A = , x2 = Ax1 = , 1 3 9

x3 = Ax2 =

  27 . 27

25

20

15

10

5

10

5

 2 (b) With A = 0

 1 , we have 3     1 2 x1 = A = , 0 0

20

15

  4 x2 = Ax1 = , 0

25

  8 x3 = Ax2 = . 0

1.0

0.5

2

4

6

8

-0.5

-1.0

(c) Since the matrix is upper triangular, we see that its eigenvalues are 2 and 3; since these are both greater than 1 in magnitude, the origin is a repeller. (d) Some other trajectories are 10

5

5

-5

-5

-10

10

354

CHAPTER 4. EIGENVALUES AND EIGENVECTORS

78. (a) With A =

 0.5 0

 −0.5 , we have 0.5

    1 0 x1 = A = , 1 0.5

  −0.25 x2 = Ax1 = , 0.25

  −0.25 x3 = Ax2 = . 0.125

1.0

0.8

0.6

0.4

0.2

0.2

-0.2

(b) With A =

 0.5 0

0.4

0.6

0.8

1.0

 −0.5 , we have 0.5

x1 = A

    1 0.5 = , 0 0

x2 = Ax1 =

  0.25 , 0

x3 = Ax2 =

  0.125 . 0

0.1

0.2

0.4

0.6

0.8

1.0

1.2

1.4

-0.1

(c) Since the matrix is upper triangular, we see that its only eigenvalue is 0.5; since this is less than 1 in magnitude, the origin is an attractor. (d) Some other trajectories are 1.0

0.5

-1.0

0.5

-0.5

1.0

-0.5

-1.0

-1.5

-2.0



2 79. (a) With A = −1

 −1 , we have 2 x1 = A

so that

  1 is a fixed point: 1

    1 1 = , 1 1

1.5

2.0

4.6. APPLICATIONS AND THE PERRON-FROBENIUS THEOREM

355

2.0

1.5

1.0

0.5

1.0

0.5

 (b) With A =

2 −1

2.0

1.5

 −1 , we have 2     1 2 x1 = A = , 0 −1 2



 5 x2 = Ax1 = , −4 4

6

8

10



 14 x3 = Ax2 = . −13 12

14

-2

-4

-6

-8

-10

-12

(c) The characteristic polynomial of the matrix is   2−λ −1 det(A − λI) = = λ2 − 4λ + 3 = (λ − 3)(λ − 1). −1 2−λ Thus one eigenvalue is greater than 1 in magnitude and the other is equal to 1. So the origin is not a repeller, an attractor, or a saddle point; there are no nonzero values of x0 for which the trajectory of x0 approaches the origin. (d) Some other trajectories are 10

5

5

-5

10

-5

-10

 −4 80. (a) With A = 1

 2 , we have −3 x1 = A

    1 −2 = , 1 −2

x2 = Ax1 =

  4 , 4

x3 = Ax2 =

  −8 . −8

356

CHAPTER 4. EIGENVALUES AND EIGENVECTORS 4

H4, 4L

2 H1, 1L -8

-6

-4

2

-2

H-2, -2L

4

-2

-4

-6

H-8, -8L

 −4 (b) With A = 1

-8

 2 , we have −3     1 −4 x1 = A = , 0 1

 x2 = Ax1 =

 18 , −7

x3 = Ax2 =

  −86 . 39

40

H-86, 39L

30

20

10

H-4, 1L -80

-60

-40

H1, 0L 20

-20 H18, -7L

(c) The characteristic polynomial of the matrix is   −4 − λ 2 det(A − λI) = = λ2 + 7λ + 10 = (λ + 5)(λ + 2). 1 −3 − λ Since both eigenvalues, −5 and −2, are greater than 1 in magnitude, the origin is a repeller. (d) Some other trajectories are x3

x2 20

30

x0 =H-1, 1L 50

20

100

150

x1 -20

10

-40

x0 =H2, 1L -80

-60

-40

-20

-60 20 x2

x1

x3

-80 x3

x3

50 100

40 30 20

50

10

x1

x1

x0 =H2, -1L -250

-200

-150

-100

-50

50 x2

100

-60

-40

-20 x0 =H-1, -2L -10

x2

4.6. APPLICATIONS AND THE PERRON-FROBENIUS THEOREM 81. (a) With A =

 1.5 −1

357

 −1 , we have 0

    1 −2 x1 = A = , 1 −2

  4 x2 = Ax1 = , 4

1.0

  −8 x3 = Ax2 = . −8

H1, 1L

0.5

0.5

1.0

2.0

1.5

3.0

2.5

H1.75, -0.5L -0.5

-1.0 H0.5, -1L -1.5 H3.125, -1.75L

(b) With A =

 1.5 −1

 −1 , we have 0

x1 = A

    1 −4 = , 0 1

H1, 0L

1

 x2 = Ax1 =

2

3

 18 , −7

4

x3 = Ax2 =

5

  −86 . 39

6

-0.5

H1.5, -1L

-1.0

H3.25, -1.5L

-1.5

-2.0

-2.5

-3.0 H6.375, -3.25L

(c) The characteristic polynomial of the matrix is



1.5 − λ det(A − λI) = −1

 −1 = λ2 − 1.5λ + 1 = (λ − 2)(λ + 0.5) −λ

Since one eigenvalue is greater than 1 in magnitude and the other is less, the origin is a saddle point. Eigenvectors for λ = −0.5 are attracted to the origin, and other vectors are repelled.

358

CHAPTER 4. EIGENVALUES AND EIGENVECTORS (d) Some other trajectories are 2.0

5 x3

x0 =H1, 2L 1.5

4

1.0

3

0.5

x2

x2

2 -0.4

0.2

-0.2

0.4

0.6

0.8

1.0 x0 =H-1, 1L

x3 x1

1

x1 -0.5 -1.0

-8

-6

-4

-2 1.0

10

5 x0 =H2, -1L

15

x1 0.5 x3

x1

-2

-1.0

-0.8

-0.6

x2

-4

-0.4

0.2

-0.2

x2

-0.5 -1.0

-6 -1.5

 0.1 82. (a) With A = 0.5

-2.0

 0.9 , we have 0.5 x1 = A

so

x0 =H-1, -2L

x3

-8

    1 1 = , 1 1

  1 is a fixed point - it is an eigenvector for the eigenvalue λ = 1. 1 2.0

1.5

1.0

H1, 1L

0.5

0.5

(b) With A =

 0.1 0.5

1.0

1.5

2.0

 0.9 , we have 0.5

x1 = A

    1 0.1 = , 0 0.5

x2 = Ax1 =

  0.46 , 0.30

x3 = Ax2 =

  0.316 . 0.38

0.4

4.6. APPLICATIONS AND THE PERRON-FROBENIUS THEOREM

359

0.5 H0.1, 0.5L 0.4 H0.316, 0.38L

0.3

H0.46, 0.3L

0.2

0.1

H1, 0L 0.2

0.4

0.6

0.8

1.0

(c) The characteristic polynomial of the matrix is

det(A − λI) =

 0.1 − λ 0.5

 0.9 = λ2 − 0.6λ − 0.45 = (λ − 1)(λ + 0.4). 0.5 − λ

Since one eigenvalue is less than 1 in magnitude and the other is equal to 1, the origin is neither an attractor, a repeller, nor a saddle point. (d) Some other trajectories are

1.5

5

x0 =H-9, 5L

x1

4

x3 x2

3

1.0 x0 =H2, 1L

2 x2

1

0.5 -8

-6

-4

2

-2 x3

-1 0.0

0.5

-2.0

-1.5

1.0

1.5

-1.0

-0.5

x1

2.0

-2

x1

0.5

-0.5 -0.2

-0.5 x3 -0.4

x2

-1.0 -0.6

x1 x3

x2

-1.5 -0.8 x0 =H-1, -2L -2.0

-1.0

x0 =H1, -1L



 0.2 0.4 83. (a) With A = , we have −0.2 0.8

x1 = A

    1 0.6 = , 1 0.6

x2 = Ax1 =

  0.36 , 0.36

x2 = Ax2 =

  0.216 . 0.216

1.0

360

CHAPTER 4. EIGENVALUES AND EIGENVECTORS 1.0

0.8

0.6

0.4

0.4

0.6

0.8

1.0



 0.2 0.4 (b) With A = , we have −0.2 0.8 x1 = A

    1 0.2 = , 0 −0.2

x2 = Ax1 =

  −0.04 , −0.20

x3 = Ax2 =

  −0.088 . −0.152

0.2

0.1

0.2

-0.2

0.4

0.6

0.8

1.0

-0.1

-0.2

-0.3

(c) The characteristic polynomial of the matrix is  det(A − λI) =

 0.4 = λ2 − λ + 0.24 = (λ − 0.4)(λ − 0.6). 0.8 − λ

0.2 − λ −0.2

Since both eigenvalues are smaller than 1 in magnitude, the origin is an attractor. (d) Some other trajectories are 4

2

-2

1

-1

2

-2

-4

 84. (a) With A =

0 1.2

 −1.5 , we have 3.6

x1 = A

    1 −1.5 = , 1 4.8

 x2 = Ax1 =

 −7.2 , 15.48

x2 = Ax2 =

  −23.22 . 47.088

4.6. APPLICATIONS AND THE PERRON-FROBENIUS THEOREM

361

40

30

20

10

-20

 (b) With A =

0 1.2

-10

-15

-5

 −1.5 , we have 3.6

x1 = A

    1 0 = , 0 1.2

x2 = Ax1 =

  −1.8 , 4.32



 −6.48 . 13.392

x3 = Ax2 =

12

10

8

6

4

2

-6

-4

-5

-3

-2

1

-1

(c) The characteristic polynomial of the matrix is   −λ −1.5 det(A − λI) = = λ2 − 3.6λ + 1.8 = (λ − 3)(λ − 0.6). 1.2 3.6 − λ Since one eigenvalue is smaller than 1 in magnitude and the other is greater, the origin is a saddle point; the origin attracts some initial points and repels others. (d) Some other trajectories are 6

4

2

-4

2

-2

4

-2

-4

85. With the given matrix A, we have r = "

r A= 0



#" 0 cos θ r sin θ

a 2 + b2 = #



π 4,

2 and θ =

− sin θ = cos θ

"√

2 0

so that

#"

0 √ 2

√1 2 √1 2

− √12 √1 2

# .

362

CHAPTER 4. EIGENVALUES AND EIGENVECTORS Since r = |λ| =



2 > 1, the origin is a spiral repeller: 2.0

1.5

1.0

0.5

-4

-3

-2

1

-1

√ 86. With the given matrix A, we have r = a2 + b2 = 0.5 and θ = 3π 4 , so that " #" # " #" # r 0 cos θ − sin θ 0.5 0 0 1 . A= = 0 r sin θ cos θ 0 0.5 −1 0 Since r = |λ| = 0.5 < 1, the origin is a spiral attractor: 1.0

0.5

0.2

-0.2

0.4

0.6

0.8

1.0

1.2

-0.5

√ 87. With the given matrix A, we have r = a2 + b2 = 2 and θ = 5π 3 , so that #" # " #" " √ # 3 1 2 0 r 0 cos θ − sin θ 2√ 2 = A= . 3 1 0 2 − 2 cos θ 0 r sin θ 2 Since r = |λ| = 2 > 1, the origin is a spiral repeller:

-8

-6

-4

2

-2

-2

-4

-6

-8

88. With the given matrix A, we have r = "

r A= 0

0 r



a 2 + b2 =

#" cos θ sin θ

r





# " − sin θ 1 = cos θ 0

3 2

2

+

 1 2 2

#" √ 0 − 23 1 1 2

= 1 and θ = # − 12 √ . − 23

5π 6 ,

so that

4.6. APPLICATIONS AND THE PERRON-FROBENIUS THEOREM

363

Since r = |λ| = 1, the origin is the orbital center:

1.0

x3

x0

0.5

-1.0

1.0

0.5

-0.5

x1

x2 -0.5

-1.0

89. The characteristic polynomial of A is  0.1 − λ det(A − λI) = 0.1

 −0.2 = λ2 − 0.4λ + 0.05; 0.3 − λ

solving this quadratic gives λ = 0.2 ± 0.1i for the eigenvalues. Row-reducing gives     −1 −1 eigenvector. Thus P = Re x Im x = , and so 1 0       0 1 0.1 −0.2 −1 −1 0.2 −0.1 −1 C = P AP = = . −1 −1 0.1 0.3 1 0 0.1 0.2 √ Since |λ| = 0.05 < 1, the origin is a spiral attractor. The first six points are       1 −0.1 −0.09 x0 = , x1 = , x2 = , 1 0.4 0.11       −0.031 −0.0079 −0.00161 x3 = , x4 = , x5 = . 0.024 0.0041 0.00044 1.0

0.8

0.6

0.4

0.2

0.2

0.4

0.6

0.8

1.0

90. The characteristic polynomial of A is det(A − λI) =

 2−λ −2

 1 = λ2 − 2λ + 2; −λ

  −1 − i for an 1

364

CHAPTER 4. EIGENVALUES AND EIGENVECTORS   −1 − i solving this quadratic gives λ = 1 ± i for the eigenvalues. Row-reducing gives for an eigen2     −1 −1 vector. Thus P = Re x Im x = , and so 2 0 # #" #" # " " 1 1 1 2 1 −1 −1 0 2 = . C = P −1 AP = −1 1 −1 − 21 −2 0 2 0 Since |λ| =



2 > 1, the origin is a spiral repeller. The first six points are           1 3 4 2 −4 x0 = , x1 = , x2 = , x3 = , x4 = , 1 −2 −6 −8 −4

x5 =

  −12 . 8

5

-10

-5

-5

91. The characteristic polynomial of A is  det(A − λI) =

solving this quadratic gives λ =

1−λ 1

 −1 = λ2 − λ + 1; −λ



1 2

±

3 2

 i for the eigenvalues. Row-reducing gives

eigenvector. Thus  P = Re x

 Im x =



1 2



3 2



1

0

#"



and so " C=P

−1

AP =

0

1

− √23

√1 3

#" 1

−1

1

0

1 2



1

, #

" =

1 √2 3 2





Since |λ| = 1, the origin is an orbital center. The first six points are           1 0 −1 −1 0 x0 = , x1 = , x2 = , x3 = , x4 = , 1 1 0 −1 −1 1.0

0.5

-1.0

0.5

-0.5

-0.5

-1.0



− 1



3 2

0

1 2

1.0

3 2 1 2

# .

  1 x5 = . 0

3 2 i

 for an

Chapter Review

365

92. The characteristic polynomial of A is  −λ det(A − λI) = 1 √

solving this quadratic gives λ =

3 2

±

 √  − 23 − 12 i i for the eigenvalues. Row-reducing gives for an 1

1 2

eigenvector. Thus

 √ 3 Im x = − 2 1

 P = Re x and so

" C=P

−1

AP =

0 −2

 √ √ −1 = λ2 − 3λ + 1; 3−λ



 − 12 , 0

#" √ −1 − 23 √ 1 3

#" 1 0 √ − 3 1

− 12

"√

# =

0

3 2 1 2

− 21 √

3 2

# .

Since |λ| = 1, the origin is an orbital center. The first six points are √       −1√ 1 −1 −√ 3 x0 = , x1 = , x2 = , 1 1+ 3 2+ 3 √  √     √  −2 −√ 3 −2 −√ 3 −1 − 3 , x4 = , x5 = x3 = . 1 2+ 3 1+ 3

3 2 1

-3

-2

1

-1

2

3

-1

-2

-3

Chapter Review 1. (a) False. What is true is that if A is an n × n matrix, then det(−A) = (−1)n det A. This is because multiplying any row by −1 multiplies the determinant by −1, and −A multiplies every row by −1. See Theorem 4.7. (b) True, since det(AB) = det(A) det(B) = det(B) det(A) = det(BA). See Theorem 4.8. (c) False. It depends on what order the columns are in. Swapping any two columns of A multiplies the determinant by −1. See Theorem 4.4. (d) False. While det AT = det A by Theorem 4.10, det A−1 = det1 A by Theorem 4.9. So unless det A = ±1, the statement is false.   0 1 (e) False. For example, the matrix has a characteristic polynomial of 0, so 0 is its only 0 0 eigenvalue. (f ) False. For example, the identity matrix has only λ = 1 as an eigenvalue, but ei is an eigenvector for every i. (g) True. See Theorem 4.25 in Section 4.4. (h) False. Any diagonal matrix with two identical entries on the diagonal (such as the identity matrix) provides a counterexample.

366

CHAPTER 4. EIGENVALUES AND EIGENVECTORS (i) False. If A and B are similar, then they have the same eigenvalues, but not necessarily the same eigenvectors. Exercise 48 in Section 4.4 shows that if B = P −1 AP and v is an eigenvector of A, then P −1 v is an eigenvector for B for the same eigenvalue. (j) False. For example, if A is invertible, then its reduced row echelon form is the identity matrix, which has only the eigenvalue 1. So if A has an eigenvalue other than 1, it cannot be similar to the identity matrix.

2. (a) For example, expand along the first row: 1 3 5 3 5 7 = 1 5 7 − 3 3 7 9 11 7 9 11

3 7 + 5 7 11

5 9

= 1(5 · 11 − 7 · 9) − 3(3 · 11 − 7 · 7) + 5(3 · 9 − 5 · 7) = 0. (b) Apply row operations to diagonalize the matrix:  1 3 7

3 5 9

  1 5 R2 −3R1 7  −−−−−→ 0 R3 −7R1 11 − −−−−→ 0

  1 3 5 0 −4 −8 R3 −3R2 −12 −24 − −−−−→ 0

 3 5 −4 −8 . 0 0

Then det A is the product of the diagonal entries, which is zero. 3. To get from the first matrix to the second: • Multiply the first column by 3; this multiplies the determinant by 3. • Multiply the second column by 2; this multiplies the determinant by 2. • Subtract 4 times the third column from the second; this does not change the determinant. • Interchange the first and second rows; this multiplies the determinant by −1. So the determinant of the resulting matrix is 3 · 3 · 2 · (−1) = −18. 4. (a) We have  det C = det (AB)−1 =

1 1 1  = −2. = = det(AB) det A det B 2 · − 14

(b) We have  det C = det A2 B(3AT ) = det A · det A · det B · det(3AT ) = − det(3AT ) = −34 det(AT ) = −34 det A = −162. 5. For any n × n matrix, det A = det(AT ), and det(−A) = (−1)n det A. So if A is skew-symmetric, then AT = −A and det A = det(AT ) = det(−A) = (−1)n det A = − det A since n is odd. But det A = − det A means that det A = 0. 6. Compute the determinant by expanding along the first row: 1 −1 2 1 = 1 1 k2 − (−1) 1 1 k 2 4 k 2 4 k2

1 k + 2 2 2 k

1 4

= 1(k 2 − 4k) + 1(k 2 − 2k) + 2(4 − 2) = 2k 2 − 6k + 4 = 2(k − 2)(k − 1). So the determinant is zero when k = 1 or k = 2.

Chapter Review

367

7. Since

 3 Ax = 4

1 3

      1 5 1 = =5 , 2 10 2

we see that x is indeed an eigenvector, with corresponding eigenvalue 5. 8. Since 

        13 −60 −45 3 3 · 13 − 1 · (−60) + 2 · (−45) 9 3 18 15 −1 =  3 · (−5) − 1 · 18 + 2 · 15  = −3 = 3 −1 , Ax = −5 10 −40 −32 2 3 · 10 − 1 · (−40) + 2 · (−32) 6 2 we see that x is indeed an eigenvector, with corresponding eigenvalue 3. 9. (a) The characteristic polynomial of A is −5 − λ det(A − λI) = 3 0

3 −3 −2 − λ −5 − λ −6 = (−2 − λ) 3 4 − λ −6 4−λ 0

= −(λ + 2)(λ2 + λ − 2) = −(λ + 2)(λ + 2)(λ − 1). (b) The eigenvalues of A are the roots of the characteristic polynomial, which are λ1 = 1 and λ2 = −2 (with algebraic multiplicity 2). (c) We row-reduce A − λI for each eigenvalue λ:

 A−I

 −6  0 = 3 0

−6 3 3 −3 0 −3

 1 R +6R1  0 −−2−−−→ 0  R1 +R2 −−− −−→ 1 0 R3 +3R1 −−−−−→ 0

 A + 2I

 −3  0 = 3 0

1 0 0

−1 −3 1

1 0 0 1 0 0

−6 3 6 −3 0 0

Thus the eigenspaces are   −1 E1 = span  1 , 0

  3 0 R1 ↔R2  0 − −6 −−−−→ 0 0

3 −3 −6 3 0 −3

  1 1 0 R2 ↔R3  0 − −−−−→ 0 0 0 0 0  0 0 0

  0 −3 R2 +R2  0 − 0 −−−−→ 0 0

E−2

−1 1 −3

−6 3 0 0 0 0

 13 R1  1 0 −− −→ −6 0 0 − 13 R3 0 −−−−→  0 0 0

 − 1 R1  3 0 − −− −→ 1 2 0 0 0 0 0 0

    −2  −2s + t  =  s  = span  1 ,   t 0

1 −1 −6 3 0 1

−1 0 0

 0 0 0

 0 0 . 0

  1 0 . 1

(d) Even though A does not have three distinct eigenvalues, it is diagonalizable since the eigenvalue of algebraic multiplicity 2, which is λ2 = −2, also has geometric multiplicity 2. A diagonalizing matrix is a matrix P whose columns are the eigenvectors above, so   −1 −2 1 1 0 , P = 1 0 0 1

368

CHAPTER 4. EIGENVALUES AND EIGENVECTORS and 

1 P −1 AP = −1 0

 2 −1 −5 −1 1  3 0 1 0

 −6 3 −1 4 −3  1 0 −2 0

  −2 1 1 1 0 = 0 0 1 0

 0 0 −2 0 = D. 0 −2

10. Since A is diagonalizable, A ∼ D where D is a diagonal matrix whose diagonal entries are its eigenvalues. Since A ∼ D, we know that A and D have the same eigenvalues, so the diagonal entries of D are −2, 3, and 4 and thus det D = −24. But A ∼ D also implies that det D = det(P −1 AP ) = det(P −1 ) det A det P = det A, so that det A = −24. 11. Since

      3 1 1 =5 −2 , 7 1 −1

we know by Theorem 4.19 that            −5   1 1 1 160 2 162 3 − 2(−1)−5 = + = . A−5 =5 1 −1 160 −2 158 7 2 12. Since A is diagonalizable, it has n linearly independent eigenvectors, which therefore form a basis for P Rn . So if x is any vector, then x = ci vi for some constants ci , so that by Theorem 4.19, and using the Triangle Inequality,

! n

X

X



n X n n ci λni vi ≤ |ci | |λi | kvi k . ci v i = kA xk = A

i=1

n

Then |λi | → 0 as n → ∞ since |λi | < 1, so each term of this sum goes to zero, so that kAn xk → 0 as n → ∞. Since this holds for all vectors x, it follows that An → O as n → ∞. 13. The two characteristic polynomials are 4 − λ 2 = (4 − λ)(1 − λ) − 6 = λ2 − 5λ − 2 pA = 3 1 − λ 2 − λ 2 = (2 − λ)(2 − λ) − 6 = λ2 − 4λ − 2. pB = 3 2 − λ Since the characteristic polynomials are different, the matrices are not similar. 14. Since both matrices are diagonal, we can read off their eigenvalues; both matrices have eigenvalues 2 and 3. To convert A to B, we want to reverse the rows, and reverse the columns, so choose   0 1 . P = P −1 = 1 0 Then P −1 AP =

 0 1

1 0

 2 0

 0 0 3 1

15. These matrices are not similar. If there were, then  a P = d g such that P −1 AP = B, or AP = P B. But  a+d b+e AP = d + g e + h g h

  1 3 = 0 0

 0 = B. 2

there would be some 3 × 3 matrix  b c e f h i

 c+f f + i i

 a P B = d g

 a+b c d + e f . g+h i

Chapter Review

369

If these matrices are to be equal, then comparing entries in the last row and column gives f = i = g = 0. The upper left entries also give d = 0. The two middle entries are e + h and d + e = e, so that h = 0 as well. Thus the entire bottom row of P is zero, so P cannot be invertible and A and B are not similar. Note that this is a counterexample to the converse of Theorem 4.22. These two matrices satisfy parts (a) through (g) of the conclusion, yet they are not similar. 16. The characteristic polynomial of A is 2 − λ det(A − λI) = 1

k = (2 − λ)(−λ) − k = λ2 − 2λ − k. −λ

(a) A has eigenvalues 3 and −1 if λ2 − 2λ − k = (λ − 3)(λ + 1), so when k = 3. (b) A has an eigenvalue with algebraic multiplicity 2 when its characteristic polynomial has a double root, which occurs when its discriminant is 0. But its discriminant is (−2)2 − 4 · 1 · (−k) = 4 + 4k, so we must have k = −1. (c) A has no real eigenvalues when the discriminant 4 + 4k of its characteristic polynomial is negative, so for k < −1. 17. If λ is an eigenvalue of A with eigenvector x, then Ax = λx, so that A3 x = A(A(A(x))) = A(A(λx)) = A(λ2 x) = λ3 x. If A3 = A, then Ax = λ3 x, so that λ = λ3 and then λ = 0 or λ = ±1. 18. If A has two identical rows, then Theorem 4.3(c) says that det A = 0, so that A is not invertible. But then by Theorem 4.17, 0 must be an eigenvalue of A. 19. If x is an eigenvalue of A with eigenvalue 3, then (A2 − 5A + 2I)x = A2 x − 5Ax + 2Ix = 9x − 5 · 3x + 2x = −4x. This shows that x is also an eigenvector of A2 − 5A + 2I, with eigenvalue −4. 20. Suppose that B = P −1 AP where P is invertible. This can be rewritten as BP −1 = P −1 A. Let λ be an eigenvalue of A with eigenvector x. Then B(P −1 x) = (BP −1 )x = (P −1 A)x = P −1 (Ax) = P −1 (λx) = λ(P −1 x). Thus P −1 x is an eigenvector for B, corresponding to λ.

Chapter 5

Orthogonality 5.1

Orthogonality in Rn

1. Since v1 · v2 = (−3) · 2 + 1 · 4 + 2 · 1 = 0, v1 · v3 = (−3) · 1 + 1 · (−1) + 2 · 2 = 0, v2 · v3 = 2 · 1 + 4 · (−1) + 1 · 2 = 0, this set of vectors is orthogonal. 2. Since v1 · v2 = 4 · (−1) + 2 · 2 − 5 · 0 = 0, v1 · v3 = 4 · 2 + 2 · 1 − 5 · 2 = 0, v2 · v3 = −1 · 2 + 2 · 1 + 0 · 2 = 0, this set of vectors is orthogonal. 3. Since v1 · v2 = 3 · (−1) + 1 · 2 − 1 · 1 = −2 6= 0, this set of vectors is not orthogonal. There is no need to check the other two dot products; all three must be zero for the set to be orthogonal. 4. Since v1 · v3 = 5 · 3 + 3 · 1 + 1 · (−1) = 17 6= 0, this set of vectors is not orthogonal. There is no need to check the other two dot products; all three must be zero for the set to be orthogonal. 5. Since v1 · v2 = 2 · (−2) + 3 · 1 − 1 · (−1) + 4 · 0 = 0, v1 · v3 = 2 · (−4) + 3 · (−6) − 1 · 2 + 4 · 7 = 0, v2 · v3 = −2 · (−4) + 1 · (−6) − 1 · 2 + 0 · 7 = 0, this set of vectors is orthogonal. 6. Since v2 · v4 = −1 · 0 + 0 · (−1) + 1 · 1 + 2 · 1 = 3 6= 0, this set of vectors is not orthogonal. There is no need to check the other five dot products (they are all zero); all dot products must be zero for the set to be orthogonal. 7. Since the vectors are orthogonal: 

   4 1 · = 4 · 1 − 2 · 2 = 0, −2 2 371

372

CHAPTER 5. ORTHOGONALITY they are linearly independent by Theorem 5.1. Since there are two vectors, they form an orthogonal basis for R2 . By Theorem 5.2, we have v1 · w v2 · w 4 · 1 − 2 · (−3) 1 · 1 + 2 · (−3) 1 v1 + v2 = v1 + v2 = v1 − v2 , v1 · v1 v2 · v2 4 · 4 − 2 · (−2) 1·1+2·2 2

w= so that

 [w]B =

1 2



−1

.

8. Since the vectors are orthogonal:     1 −6 · = 1 · (−6) + 3 · 2 = 0, 3 2 they are linearly independent by Theorem 5.1. Since there are two vectors, they form an orthogonal basis for R2 . By Theorem 5.2, we have w=

v1 · w −6 · 1 + 2 · 1 2 1 v2 · w 1·1+3·1 v1 + v2 = v1 − v2 , v1 + v2 = v1 · v1 v2 · v2 1·1+3·3 −6 · (−6) + 2 · 2 5 10

so that " [w]B =

2 5 1 − 10

# .

9. Since the vectors are orthogonal: 

   1 1  0 · 2 = 1 · 1 + 0 · 2 − 1 · 1 = 0 −1 1     1 1  0 · −1 = 1 · 1 + 0 · (−1) − 1 · 1 = 0 −1 1     1 1 2 · −1 = 1 · 1 + 2 · (−1) + 1 · 1 = 0, 1 1 they are linearly independent by Theorem 5.1. Since there are three vectors, they form an orthogonal basis for R3 . By Theorem 5.2, we have v1 · w v2 · w v3 · w v1 + v2 + v3 v1 · v1 v2 · v2 v3 · v3 1·1+0·1−1·1 1·1+2·1+1·1 1·1−1·1+1·1 = v1 + v2 + v3 1 · 1 + 0 · 0 − 1 · (−1) 1·1+2·2+1·1 1 · 1 − 1 · (−1) + 1 · 1 2 1 = v2 + v3 , 3 3

w=

so that   0 2 [w]B =  3  . 1 3

5.1. ORTHOGONALITY IN RN

373

10. Since the vectors are orthogonal:     1 1 1 · −1 = 1 · 1 + 1 · (−1) + 1 · 0 = 0 1 0     1 1 1 ·  1 = 1 · 1 + 1 · 1 + 1 · (−2) = 0 1 −2     1 1 −1 ·  1 = 1 · 1 − 1 · 1 + 0 · (−2) = 0, 0 −2 they are linearly independent by Theorem 5.1. Since there are three vectors, they form an orthogonal basis for R3 . By Theorem 5.2, we have v2 · w v3 · w v1 · w v1 + v2 + v3 v1 · v1 v2 · v2 v3 · v3 1·1−1·2+0·3 1·1+1·2−2·3 1·1+1·2+1·3 v1 + v2 + v3 = 1·1+1·1+1·1 1 · 1 − 1 · (−1) + 0 · 0 1 · 1 + 1 · 1 − 2 · (−2) 1 1 = 2v1 − v2 − v3 , 2 2

w=

so that 

 2   [w]B = − 21  . − 12 11. Since kv1 k = mal.

q

 3 2 5

+

 4 2 5

= 1 and kv2 k =

q

− 45

2

+

 3 2 5

= 1, the given set of vectors is orthonor-

q  q    1 2 1 2 1 2 1 2 √1 6= 1 and kv2 k = + = + − = √12 6= 1, the given set of vectors 12. Since kv1 k = 2 2 2 2 2 is not orthonormal. To normalize the vectors, we divide each by its norm to get " # " #) (" # " #) ( 1 1 √1 √1 1 1 2 2 , 2 √ 2 , √ = . √1 √1 − 1/ 2 12 1/ 2 − 12 2 2 13. Taking norms, we get s   2  2 2 1 2 2 kv1 k = + + =1 3 3 3 s  √  2 2 2 1 5 kv2 k = + − + 02 = 3 3 3 s √  2 5 3 5 2 2 kv3 k = 1 + 2 + = . 2 2 So the first vector is already normalized; we must normalize the other two by dividing by their norms, to get          2   2  2 √ √  13  1   1  1  31  1    32   15   3 4 5  2 √ √ √ √ , , = , , − 2 − 3  3   3    3 5 . 5     5/3 3 5/2  2  2  5  5 −2 0 − √ 0 3 3 3 5

374

CHAPTER 5. ORTHOGONALITY

14. Taking norms, we get s   2  2  2 2 1 1 1 1 kv1 k = + + − + =1 2 2 2 2 s √  2  2  2 2 1 1 6 2 2 kv2 k = 0 + + + +0 = 3 3 3 3 s  √  2  2  2 2 1 1 1 3 1 + − + + − = kv3 k = . 2 6 6 6 3 So the first vector is already normalized; we must normalize the other two to get  1     1   1     0 0   2 2  2            1    1 1  1   √1  1 3 1 − 6     2 6 , 2 , √  1  =  21  ,   1 , √ √2  − 2   3  6  − 2   6/3 3/3     6       1 1 1  √1 − 16 2 3 2 6

by dividing by their norms, 





3    √23   −   √6  .  3   √6    − 63 

15. Taking norms, we get s   2  2  2 2 1 1 1 1 + + − + =1 kv1 k = 2 2 2 2 v u √ !2  2  2 u 1 6 1 t 2 + −√ =1 + √ kv2 k = 0 + 3 6 6 v u √ !2 √ !2 √ !2 √ !2 u 3 3 3 3 kv3 k = t + − + + − =1 2 6 6 6 s 2  2  1 1 2 2 + √ = 1. kv4 k = 0 + 0 + √ 2 2 So these vectors are already orthonormal. 16. The columns of the matrix are orthogonal:     0 −1 · = 0 · (−1) + 1 · 0 = 0. 1 0 Further, they are orthonormal since     0 0 · = 0 · 0 + 1 · 1 = 1, 1 1



   −1 −1 · = −1 · (−1) + 0 · 0 = 1. 0 0

Thus the matrix is an orthogonal matrix, so by Theorem 5.5, the inverse of this matrix is its transpose, or   0 1 . −1 0 17. The columns of the matrix are orthogonal: "

√1 2 − √12

# " ·

√1 2 √1 2

#

1 1 1 1 = √ · √ − √ · √ = 0. 2 2 2 2

5.1. ORTHOGONALITY IN RN

375

Further, they are orthonormal since " # " √1 2 − √12

"

·

√1 2 √1 2

√1 2 − √12

# " ·

√1 2 √1 2

#

  1 1 1 1 = √ · √ − √ · −√ =1 2 2 2 2

#

1 1 1 1 = √ · √ + √ · √ = 1. 2 2 2 2

Thus the matrix is an orthogonal matrix, so by Theorem 5.5, the inverse of this matrix is its transpose, or # " √1 √1 − 2 2 . 1 1 √



2

2

18. This is not an orthogonal matrix since none of the columns have norm 1: 1 1 1 1 + + = , 9 9 9 3

1 1 1 + +0= , 4 4 2

q3 · q3 =

19. This matrix Q is orthogonal by Theorem 5.4 since QQT = I:   cos θ sin θ cos θ sin θ − cos θ − sin2 θ QQT =  cos2 θ sin θ − cos θ sin θ  − cos θ sin θ 0 cos θ − sin2 θ  2 2 2 4 3

cos2 θ sin θ − cos θ sin θ

q1 · q1 =

q2 · q2 =

cos θ sin θ−sin θ cos θ+cos θ sin3 θ

cos θ sin θ+cos θ+sin θ

= cos3 θ sin θ−sin θ cos θ+cos θ sin3 θ 2

cos4 θ+sin2 θ+cos2 θ sin2 θ

2

2

sin θ cos θ−sin θ cos θ

 1 = 0 0

0 1 0

1 1 4 6 + + = . 25 25 25 25

 sin θ 0  cos θ cos θ sin2 θ−cos θ sin2 θ



cos2 θ sin θ−cos2 θ sin θ 

2

sin θ cos θ−cos θ sin θ

sin2 θ+cos2 θ

 0 0 . 1

Thus by Theorem 5.5 Q−1 = QT . 20. The columns are all orthogonal since each dot product consists of two terms of The columns are orthonormal since the norm of each column is s 2  2  2  2 1 1 1 1 + ± + ± + ± = 1. ± 2 2 2 2

1 2

· 12 and two of − 12 · 21 .

So the matrix is orthonormal, so by Theorem 5.5 its inverse is equal to its transpose,  1  1 1 − 12 2 2 2 − 1 1 1 1  2 2 2 2 .  1  1 1  2 − 21  2 2 1 2

− 21

1 2

1 2

21. The dot product of the first and fourth columns is    1  √ 1 6    √1  0  1  6 = 1 · √  · 6 0, = 0 − √1   6    6 √1 0 2

so the matrix is not orthonormal.

376

CHAPTER 5. ORTHOGONALITY

T −1 22. By Theorem 5.5, Q−1 is orthogonal if and only if Q−1 = Q−1 . But since Q is orthogonal, Q−1 = QT , so that T −1 −1 , Q−1 = QT = Q−1 and Q−1 is orthogonal. 23. Since det Q = det QT , we have 1 = det(I) = det(QQ−1 ) = det(QQT ) = (det Q)(det QT ) = (det Q)2 . Since (det Q)2 = 1, we must have det Q = ±1. T 24. By Theorem 5.5, it suffices to show that (Q1 Q2 )−1 = (Q1 Q2 )T . But using the fact that Q−1 1 = Q1 −1 T and Q2 = Q2 , we have −1 T T T (Q1 Q2 )−1 = Q−1 2 Q1 = Q2 Q1 = (Q1 Q2 ) .

Thus Q1 Q2 is orthogonal. 25. By Theorem 3.17, if P is a permutation matrix, then P −1 = P T . So by Theorem 5.5, P is orthogonal. 26. Reordering the rows of Q is the same as multiplying it on the left by a permutation matrix P . By the previous exercise, P is orthogonal, and then by Exercise 24, P Q is also orthogonal. So reordering the rows of an orthogonal matrix results in an orthogonal matrix. 27. Let θ be the angle between x and y, and φ the angle between Qx and Qy. By Theorem 5.6, kQxk = kxk, kQyk = kyk, and Qx · Qy = x · y, so that cos θ =

Qx · Qy x·y = = cos φ. kxk kyk kQxk kQyk

Since 0 ≤ θ, φ ≤ π, the fact that their cosines are equal implies that θ = φ.   a c 28. (a) Suppose that Q = is an orthogonal matrix. Then b d T

T

QQ = Q Q = I



 a c

 b a d b

  2   c a + b2 ac + bd 1 = = d ac + bd c2 + d2 0

 0 . 1

So a2 + b2 = 1 and c2 + d2 = 1, and thus both columns of Q are unit vectors. Also, ac + bd = 0, which implies that ac = −bd and therefore a2 c2 = b2 d2 . But c2 = 1 − d2 and b2 = 1 − a2 ; substituting gives a2 (1 − d2 ) = (1 − a2 )d2



a2 − a2 d2 = d2 − a2 d2



If a = d = 0, then b = c = ±1, so we get one of the matrices        0 1 0 −1 0 1 0 , , , 1 0 1 0 −1 0 −1

a2 = d2



a = ±d.

 −1 , 0

all of which are of one of the required forms. Otherwise, if a = d 6= 0, then c2 = 1−d2 = 1−a2 = b2 , so that b = ±c and we get   a ±b , b a and we must choose the minus sign in order that the columns be orthogonal. So the result is   a −b , b a

5.1. ORTHOGONALITY IN RN

377

which is also of the required form. Finally, if a = −d 6= 0, then again b2 = c2 and we get   a ±b ; b −a this time we must choose the plus sign for the columns to be orthogonal, resulting in   a b , b −a which is again of the required form.   a (b) Since is a unit vector, we have a2 +b2 = 1, so that (a, b) is a point on the unit circle. Therefore b there is some θ with (a, b) = (cos θ, sin θ). Substituting into the forms from part (a) gives the desired matrices. (c) Let  cos θ Q= sin θ

 − sin θ , cos θ



cos θ R= sin θ

 sin θ . − cos θ

From Example 3.58 in Section 3.6, Q is the standard matrix of a rotation through angle θ. From Exercise 26(c) in Section 3.6, R is a reflection through the line ` that makes an angle of θ2 with  the x-axis, which is the line y = tan θ2 x (or x = 0 if θ = π). (d) From part (c), we have det Q = cos2 θ + sin2 θ = 1 and det R = − cos2 θ − sin2 θ = −1.   29. Since det Q = √12 · √12 − − √12 · √12 = 1, this matrix represents a rotation. To find the angle, we have so that θ = π4 = 45◦ .  √  √  30. Since det Q = − 21 · − 12 − 23 · − 23 = 1, this matrix represents a rotation. To find the angle, we cos θ =

√1 2

and sin θ =

√1 , 2

31. Since det Q = − 12 · √

1 2







3 4π ◦ 2 , so that θ = 3 = 240 . √  √ 3 3 = −1, this matrix · 2 2

have cos θ = − 12 and sin θ = −

represents a reflection. If cos θ = − 21 and

, so that by part (d) of the previous exercise the reflection occurs through the sin θ = 23 , then θ = 2π √3  line y = tan π3 x = 3 x.   32. Since det Q = − 35 · 53 − − 45 · − 45 = −1, this matrix represents a reflection. Let θ be the (third quadrant) angle with cos θ = − 35 and sin θ = − 54 ; then by part (d) of the previous exercise, the  reflection occurs through the line y = tan θ2 x. 33. (a) Since A and B are orthogonal matrices, we know that AT = A−1 and B T = B −1 , so that  A(AT + B T )B = A A−1 + B −1 B = AA−1 B + AB −1 B = B + A = A + B. (b) Since A and B are orthogonal matrices, we have det A = ±1 and det B = ±1. If det A+det B = 0, then one of them must be 1 and the other must be −1, so that (det A)(det B) = −1. But by part (a),   det(A + B) = det A(AT + B T )B = (det A) det(AT + B T ) (det B)  = (det A) det((A + B)T ) (det B) = (det A)(det B) (det(A + B)) = − det(A + B). Since det(A+B) = − det(A+B), we conclude that det(A+B) = 0 so that A+B is not invertible. 34. It suffices to show that QQT = I. Now, " #" x1 yT  x1 T  QQ = 1 t y I − 1−x1 yy y

I−

T  y  1 1−x1

# yyt

.

378

CHAPTER 5. ORTHOGONALITY Also, x21 + yyT = x21 + x22 + · · · + x2n = 1, since x is a unit vector. Rearranging gives yyT = 1 − x21 . Further, we have       1 1 T T T y yy yT (1 − x1 )2 = yT − (1 + x1 )yT I− =y − 1 − x1 1 − x1 = −x1 yT       1 1 I− yyT y = y − (1 − x21 )y = y − (1 + x1 )y 1 − x1 1 − x1 = −x1 y 

 I−

1 1 − x1



  yyT I−

1 1 − x1



yyT



  2 1 1 yyT + yyT yyT 1 − x1 1 − x1 2(1 − x1 ) T 1 − x21 =I− yy + yyT (1 − x1 )2 (1 − x1 )2 1 − 2x1 + x21 T =I− yy (1 − x1 )2 = I2 − 2



= I − yyT . Then evaluating QQT gives       1 2 T T T T x + y y x y + y I − yy 1     1    1−x  1   QQT =  1 1 1 T T T T yy x1 y + I − 1−x1 yy y yy + I − 1−x1 yy I − 1−x 1   1 x1 y T − x1 y T = x1 y − x1 y yyT + (I − yyT )   1 O = O I = I. Thus Q is orthogonal. 35. Let Q be orthogonal and upper triangular. Then the columns qi of Q are orthonormal; that is, ( n X 0 i 6= j qi · qj = qki qkj = . 1 i=j k=1 Since Q is upper triangular, we know that qij = 0 when i > j. We prove by induction on the rows that each row has only one nonzero entry, along the diagonal. Note that q11 6= 0, since all other entries in the first column are zero and the norm of the first column vector must be 1. Now the dot product of q1 with qj for j > 1 is n X q1 · qj = qk1 qkj = q11 q1j . k=1

Since these columns must be orthogonal, this product is zero, so that q1j = 0 for j > 1 and therefore q11 is the only nonzero entry in the first row. Now assume that q11 , q22 , . . . , qii are the only nonzero entries in their rows, and consider qi+1,i+1 . It is the only nonzero entry in its column, since entries in lower-numbered rows are zero by the inductive hypothesis and entries in higher-numbered rows are zero since Q is upper triangular. Therefore, qi+1,i+1 6= 0 since the (i + 1)st column must have norm 1. Now consider the dot product of qi+1 with qj for j > i + 1: qi+1 · qj =

n X k=1

qk,i+1 qkj = qi+1,i+1 qi+1,j .

5.2. ORTHOGONAL COMPLEMENTS AND ORTHOGONAL PROJECTIONS

379

Since these columns must be orthogonal, this product is zero, so that qi+1,j = 0 for j > i + 1. So all entries to the right of the (i + 1)st column are zero. But also entries to the left of the (i + 1)st column are zero since Q is upper triangular. Thus qi+1,i+1 is the only nonzero entry in its row. So inductively, qkk is the only nonzero entry in the k th row for k = 1, 2, . . . , n, so that Q is diagonal. 36. If n > m, then the maximum rank of A is m. By the Rank Theorem (Theorem 3.26, in Section 3.5), we know that rank(A) + nullity(A) = n; since n > m, we have nullity(A) = n − m > 0. So there is some nonzero vector x such that Ax = 0. But then kxk = 6 0 while kAxk = k0k = 0. 37. (a) Since the {vi } are a basis, we may choose αi , βi such that x=

n X

α k vk ,

k=1

y=

n X

βk v k .

k=1

Because {vi } form an orthonormal basis, we know that vi · vi = 1 and vi · vj = 0 for i 6= j. Then ! ! n n n n n X X X X X x·y = αk vk · βk v k = αi βj vi · vj = αi βi vi · vi = αi βi . k=1

i,j=1

k=1

i=1

i=1

Pn

But x · vi = k=1 αk vk · vi = αi and similarly y · vi = βi . Therefore the sum at the end of the displayed equation above is n X i=1

αi βi =

n X

(x · vi )(y · vi ) = (x · v1 )(y · v1 ) + (x · v2 )(y · v2 ) + · · · + (x · vn )(y · vn ),

i=1

proving Parseval’s Identity. (b) Since [x]B = [αk ] and [y]B = [βk ], Parseval’s Identity says that x·y =

n X

αk βk = [x]B · [y]B .

k=1

In other words, the dot product is invariant under a change of orthonormal basis.

5.2

Orthogonal Complements and Orthogonal Projections

   1 1. W is the column space of A = , so that W ⊥ is the null space of AT = 1 2 matrix is already row-reduced:   1 2 0 ,

 2 . The augmented

so that a basis for the null space, and thus for W ⊥ , is given by   −2 . 1    −4 2. W is the column space of A = , so that W ⊥ is the null space of AT = −4 3 gives h i h i −4 3 0 → 1 − 43 0 , so that a basis for the null space, and thus for W ⊥ , is given by " # " # 3 3 4 , or . 1 4

 3 . Row-reducing

380

CHAPTER 5. ORTHOGONALITY

3. W consists of        x 1 0  1  y  = x 0 + y 1 , so that 0 ,  x+y 1 1 1 

  0  1 is a basis for W .  1

Then W is the column space of  1 A = 0 1

 0 1 , 1

so that W ⊥ is the null space of AT . The augmented matrix is already row-reduced:   1 0 1 0 0 1 1 0 so that the null space, which is W ⊥ , is the set       −t −1 −1 −t = t −1 = span −1 . t 1 1 This vector forms a basis for W ⊥ . 4. W consists of        x 1 0  1 2x + 3z  = x 2 + y 3 , so that 2 ,  z 0 1 0 

  0  3 is a basis for W .  1

Then W is the column space of  1 A = 2 0

 0 3 , 1

so that W ⊥ is the null space of AT . Row-reduce the augmented matrix: " # " # 1 2 0 0 1 0 − 32 0 → 1 0 3 1 0 0 1 0 3 so that the null space, which is W ⊥ , is the set       2 2 2 t 3 3  1      − 3 t = t − 13  = span −1 . 3 t 1 This vector forms a basis for W ⊥ . 

 1  5. W is the column space of A = −1, so that W ⊥ is the null space of AT = 1 3 augmented matrix is already row-reduced:   1 −1 3 0 , so that the null space, which is W ⊥ , is the set           1 −3 s − 3t 1 −3  s  = s 1 + t  0 = span 1 ,  0 t 0 1 0 1 These two vectors form a basis for W ⊥ .

−1

 3 . The

5.2. ORTHOGONAL COMPLEMENTS AND ORTHOGONAL PROJECTIONS  6. W is the column space of A =

381



1 2  1 − 2 ,

so that W ⊥ is the null space of AT =

1 2

− 12

 2 . Row-reducing

2 gives 1 2

− 12

  0 → 1

2

−1

4

 0 ,

so the null space, which is W ⊥ , is the set           s − 4t 1 −4 1 −4  s  = s 1 + t  0 = span 1 ,  0 t 0 1 0 1 These two vectors form a basis for W ⊥ .   7. To find the required bases, we row-reduce A 0 :    1 −1 3 0 1  5 0  2 1 0  →  0 0 1 −2 0 −1 −1 1 0 0

0 1 0 0

1 −2 0 0

 0 0 . 0 0

Therefore, a basis for the row space of A is the nonzero rows of the reduced matrix, or     u1 = 1 0 1 , u2 = 0 1 −2 . The null space null(A) is the set of solutions of the augmented matrix, which are given by   −s  2s . s Thus a basis for null(A) is   −1 v =  2 . 1 Every vector in row(A) is orthogonal to every vector in null(A): u1 · v = 1 · (−1) + 0 · 2 + 1 · 1 = 0, 8. To find the required bases, we row-reduce  1 1 −1 0 −2 0 2 4   2 2 −2 0 −3 −1 3 4

 A 2 4 1 5

u2 · v = 0 · (−1) + 1 · 2 − 2 · 1 = 0.

 0 :   0 1  0  → 0 0 0 0 0

0 1 0 0

−1 −2 0 0 2 0 0 0 1 0 0 0

 0 0 . 0 0

Thus a basis for the row space of A is the nonzero rows of the reduced matrix, or      u1 = 1 0 −1 −2 0 , u2 = 0 1 0 2 0 , u3 = 0 0 0

0

 1 .

The null space null(A) is the set of solutions of the augmented matrix, which are given by           s + 2t 1 2 1 2 0 −2 0 −2  −2t             s  = s 1 + t  0 = span 1 ,  0 .            t  0  1 0  1 0 0 0 0 0

382

CHAPTER 5. ORTHOGONALITY Thus a basis for null(A) is   1 0    v1 =  1 , 0 0

 2 −2    v2 =   0 .  1 0 

Every vector in row(A) is orthogonal to every vector in null(A): u1 · v1 = 1 · 1 + 0 · 0 − 1 · 1 − 2 · 0 + 0 · 0 = 0, u1 · v2 = 1 · 2 + 0 · (−2) − 1 · 0 − 2 · 1 + 0 · 0 = 0, u2 · v1 = 0 · 1 + 1 · 0 + 0 · 1 + 2 · 0 + 0 · 0 = 0, u2 · v2 = 0 · 2 + 1 · (−2) + 0 · 0 + 2 · 1 + 0 · 0 = 0, u3 · v1 = 0 · 1 + 0 · 0 + 0 · 1 + 0 · 0 + 1 · 0 = 0, u3 · v2 = 0 · 2 + 0 · (−2) + 0 · 0 + 0 · 1 + 1 · 0 = 0. 9. Using the row reduction from Exercise 7, we see that a basis for the column space of A is     1 −1  5  2   u1 =  u2 =   0 ,  1 . −1 −1   To find the null space of AT , we must row-reduce AT 0 :     3 1 0 − 75 1 5 0 −1 0 0 7     1 1 −1 0 → 0 1 − 27 0 . −1 2 7 3 1 −2 1 0 0 0 0 0 0 The null space null(AT ) is the set of solutions of the augmented matrix, which are given by  5  5  3      3 −3 −7 5 7s − 7t 7 − 1 s + 2 t − 1   2 −1  2  7    7  = s  7  + t  7  = span         ,   .    1  0  7  0  s t

0

0

1

7

Thus a basis for null(AT ) is  5 −1  v1 =   7 , 0 

  −3  2  v2 =   0 . 7

Every vector in col(A) is orthogonal to every vector in null(AT ): u1 · v1 = 1 · 5 + 5 · (−1) + 0 · 7 − 1 · 0 = 0, u1 · v2 = 1 · (−3) + 5 · 2 + 0 · 0 − 1 · 7 = 0, u2 · v1 = −1 · 5 + 2 · (−1) + 1 · 7 − 1 · 0 = 0, u2 · v2 = −1 · (−3) + 2 · 2 + 1 · 0 − 1 · 7 = 0. 10. Using the row reduction from Exercise 8, we see that a basis for the column space of A is       1 1 2 −2  0 4    u1 =  u2 =  u3 =   2 ,  2 , 1 −3 −1 5   To find the null space of AT , we must row-reduce AT 0 :     1 −2 2 −3 0 1 0 0 1 0  1 0 1 0 0 2 −1 0 1 0     −1   2 −2 3 0 → 0 0 1 −1 0  .  0 0 0 0 4 0 4 0 0 0 2 4 1 5 0 0 0 0 0 0

5.2. ORTHOGONAL COMPLEMENTS AND ORTHOGONAL PROJECTIONS

383

The null space null(AT ) is the set of solutions of the augmented matrix, which are given by     −s −1 −s −1   = s . s  1 s 1 Thus a basis for null(AT ) is   −1 −1  v1 =   1 . 1 Every vector in col(A) is orthogonal to every vector in null(AT ): u1 · v1 = 1 · (−1) − 2 · (−1) + 2 · 1 − 3 · 1 = 0,

u2 · v1 = 1 · (−1) + 0 · (−1) + 2 · 1 − 1 · 1 = 0,

u3 · v1 = 2 · (−1) + 4 · (−1) + 1 · 1 + 5 · 1 = 0. 11. Let



 A = w1

w2



2 = 1 −2

 4 0 . 1

Then the subspace W spanned by w1 and w2 is the column space of A, so that by Theorem 5.10, W ⊥ = (col(A))⊥ = null(AT ). To compute null(AT ), we row-reduce " # " # i h 1 2 1 −2 0 1 0 0 4 → . AT 0 = 4 0 1 0 0 1 − 52 0 Thus null(AT ) is the set of solutions of this augmented matrix, which is       −1 − 41 t − 14  5   5   = t = span  2t   2  10 4 t 1 so this vector forms a basis for W ⊥ . 12. Let



 A = w1

1  −1 w2 =   3 −2

 0 1 . −2 1

Then the subspace W spanned by w1 and w2 is the column space of A, so that by Theorem 5.10, W ⊥ = (col(A))⊥ = null(AT ). To compute null(AT ), we row-reduce       T 1 −1 3 −2 0 1 0 1 −1 0 0 = A → . 0 1 −2 1 0 0 1 −2 1 0 Thus null(AT ) is the set of solutions of this augmented matrix, which is           −s + t −1 1 −1 1  2s − t   2 −1  2 −1            s  = s  1 + t  0 = span  1 ,  0 , t 0 1 0 1 so these two vectors are a basis for W ⊥ .

384

CHAPTER 5. ORTHOGONALITY

13. Let



 A = w1

w2

2  −1 w3 =   6 3

 −1 2 2 5 . −3 6 −2 1

Then the subspace W spanned by w1 , w2 , and w3 is the column space of A, W ⊥ = (col(A))⊥ = null(AT ). To compute null(AT ), we row-reduce    4 2 −1 6 3 0 1 0 3 h i 3    AT 0 = −1 2 −3 −2 0 → 0 1 0 − 13 2 5 6 1 0 0 0 0 0

so that by Theorem 5.10,  0  0 . 0

Thus null(AT ) is the set of solutions of this augmented matrix, which is  4         −4 −3 −3 −3s − 43 t −3           1 1    0  1   0  3t   = s   + t  3  = span   ,   ,  0  1  0   1  s 0

t

1

0



 1 2 2 2  0 2 . 1 −1 −3 2

3

so these two vectors form a basis for W ⊥ . 14. Let

4  6   v2 =  −1  1 −1

 A = v1

Then the subspace W spanned by w1 , w2 , and w3 is the column space of A, so that by Theorem 5.10, W ⊥ = (col(A))⊥ = null(AT ). To compute null(AT ), we row-reduce     4 6 −1 1 −1 0 1 0 0 −2 7 0 h i     3  0 1 −3 0 AT 0 =  −5 0 1 2  → 0 1 0 . 2 2 2 2 −1 2 0 0 0 1 0 −1 0 Thus null(AT ) is the set of solutions of this augmented matrix, which is       2s − 7t 2 −7  3   3   − s + 5t −   5  2   2           = s  0 + t  1 t          1  0 s       t 0 1 Clearing fractions, we get 

 4 −3    0 ,    2 0

  −7  5    1    0 1

as a basis for W ⊥ . 15. We have  projW (v) =

u1 · v u1 · u1



1 · 7 + 1 · (−4) 3 u1 = u1 = u1 = 1·1+1·1 2

" # 3 2 3 2

5.2. ORTHOGONAL COMPLEMENTS AND ORTHOGONAL PROJECTIONS 16. We have  projW (v) = = =

=

   u1 · v u2 · v u1 + u2 u1 · u1 u2 · u2 1 · 3 + 1 · 1 + 1 · (−2) 1 · 3 − 1 · 1 + 0 · (−2) u1 + u2 1·1+1·1+1·1 1 · 1 − 1 · (−1) + 0 · 0 2 u1 + u2 3       5 1 1 3 2     1 + −1 =  − 13    3 1 0 2 3

17. We have 

   u1 · v u2 · v u1 + u2 u1 · u1 u2 · u2 −1 · 1 + 2 · 2 + 4 · 3 2·1−2·2+1·3 u1 + u2 = 2 · 2 − 2 · (−2) + 1 · 1 −1 · (−1) + 1 · 1 + 4 · 4 13 1 = u1 + u2 9   18   

projW (v) =

2 − 13 −1  9   18   2  2  13   1  = − 9  +  18  =  2 . 1 52 3 9 18

18. We have      u1 · v u2 · v u3 · v u1 + u2 + u3 u1 · u1 u2 · u2 u3 · u3 1 · 3 + 1 · (−2) + 0 · 4 + 0 · (−3) 1 · 3 − 1 · (−2) − 1 · 4 + 1 · (−3) u1 + u2 + 1·1+1·1+0·0+0·0 1 · 1 − 1 · (−1) − 1 · (−1) + 1 · 1 0 · 3 + 0 · (−2) + 1 · 4 + 1 · (−3) u3 0·0+0·0+1·1+1·1 1 1 1 u1 − u2 + u3 2   2  2     1 1 0 0  1 −1 1 0 1 1 1   −   +   =  . 2 0 2 −1 2 1 1 0 1 1 0

 projW (v) = =

=

=

19. Since W is spanned by the vector u1 =  projW (v) =

u1 · v u1 · u1

  1 , we have 3



  " 2# −5 1 · 2 + 3 · (−2) 2 1 u1 = u1 = − = . 1·1+3·3 5 3 − 65

The component of v orthogonal to W is thus " projW ⊥ (v) = v − projW (v) =

12 5 − 45

# .

385

386

CHAPTER 5. ORTHOGONALITY

  1 20. Since W is spanned by the vector u1 = 1, we have 1  projW (v) =

u1 · v u1 · u1



    4 1 3 1 · 3 + 1 · 2 + 1 · (−1) 4   4 1 = u1 = u1 =  3 . 1·1+1·1+1·1 3 1 4 3

The component of v orthogonal to W is thus  projW ⊥ (v) = v − projW (v) =



5  3  2 .  3 − 73

21. Since W is spanned by   1 u1 = 2 1



 1 and u2 = −1 , 1

we have 

   u1 · v u2 · v projW (v) = u1 + u2 u1 · u1 u2 · u2 1 · 4 − 1 · (−2) + 1 · 3 1 · 4 + 2 · (−2) + 1 · 3 u1 + u2 = 1·1+2·2+1·1 1 · 1 − 1 · (−1) + 1 · 1 1 = u1 + 3u2 2       7 1 3 2 2       =  1  + −3 = −2. 1 7 3 2 2 The component of v orthogonal to W is thus 

1 2



  projW ⊥ (v) = v − projW (v) =  0 . − 12 22. Since W is spanned by   1 1  u1 =   1 0



 1  0  and u2 =  −1 , 1

we have 

   u1 · v u2 · v u1 + u2 u1 · u1 u2 · u2 1 · 2 + 1 · (−1) + 1 · 5 + 0 · 6 1 · 2 + 0 · (−1) − 1 · 5 + 1 · 6 = u1 + u2 1·1+1·1+1·1+0·0 1 · 1 + 0 · 0 − 1 · (−1) + 1 · 1 = 2u1 + u2       1 1 3 1  0 2      = 2 1 + −1 = 1 . 0 1 1

projW (v) =

5.2. ORTHOGONAL COMPLEMENTS AND ORTHOGONAL PROJECTIONS

387

The component of v orthogonal to W is thus   −1 −3  projW ⊥ (v) = v − projW (v) =   4 . 5 23. W ⊥ is defined to be the set of all vectors in Rn that are orthogonal to every vector in W . So if w ∈ W ∩ W ⊥ , then w · w = 0. But this means that w = 0. Thus 0 is the only element of W ∩ W ⊥ . 24. If v ∈ W ⊥ , then since v is orthogonal to every vector in W , it is certainly orthogonal to wi , so that v · wi = 0 for i = 1, 2, . . . , k. For the converse, suppose w ∈ W ; then since W = span(w1 , . . . , wk ), Pk there are scalars ai such that w = i=1 ai wi . Then v·w =v·

k X

ai wi =

k X

i=1

ai v · wi = 0

i=1

since v is orthogonal to each of the wi . But this means that v is orthogonal to every vector in W , so that v ∈ W ⊥ .     0 1 25. No, it is not true. For example, let W ⊂ R3 be the xy-plane, spanned by e1 = 0 and e2 = 1. 0 0   0 / W ⊥. Then W ⊥ = span(e3 ) = span 0. Now let w = e1 and w0 = e2 ; then w · w0 = 0 but w0 ∈ 1 26. Yes, this is true. Let V = span(vk+1 , . . . , vn ); we want to show that V = W ⊥ . P By Theorem 5.9(d), n all we must show is that for every v ∈ V , vi · v = 0 for i = 1, 2, . . . , k. But v = j=k+1 cj vj , so that if i ≤ k, since the vi are an orthogonal basis for Rn we get n X

vi ·

cj vj =

j=k+1

n X

cj vj · vi = 0.

j=k+1

27. Let {uk } be an orthonormal basis for W . If  n n  X X uk · x uk = (uk · x)uk , x = projW (x) = uk · uk k=1

k=1

then x is in the span of the uk , which is W . For the converse, suppose that x ∈ W . Then x = P n k=1 ak uk . Since the uk are an orthogonal basis for W , we get x · ui =

n X

ak uk · ui = ai ui · ui = ai .

k=1

Therefore x=

n X

ak uk =

k=1

n X

(x · uk )uk = projW (x).

k=1

28. Let {uk } be an orthonormal basis for W . Then  n n  X X uk · x uk = (uk · x)uk . projW (x) = uk · uk k=1

k=1

So if projW (x) = 0, then uk · x = 0 for all k since the uk are linearly independent. But then by Theorem 4.9(d), x ∈ W ⊥ , so that x is orthogonal to W . Conversely, if x is orthogonal to W , then uk · x = 0 for all k, so the sum above is zero and thus projW (x) = 0.

388

CHAPTER 5. ORTHOGONALITY

29. If uk is an orthonormal basis for W , then projW (x) =

n X

(uk · x)uk ∈ W.

k=1

By Exercise 27, x ∈ W if and only if projW (x) = x. Applying this to projW (x) ∈ W gives projW (projW (x)) = projW (x) as desired. 30. (a) Let W = span(S) = span{v1 , . . . , vk }, and let T = {b1 , . . . , bj } be an orthonormal basis for W ⊥ . By Theorem 5.13, dim W + dim W ⊥ = k + j = n. Therefore S ∪ T is a basis for Rn . Further, it is an orthonormal basis: any two distinct members of S are orthogonal since S is an orthonormal set; similarly, any two distinct members of T are orthogonal. Also, vi · bj = 0 since one of the vectors is in W and the other is in W ⊥ . Finally, vi · vi = bj · bj = 1 since both sets are orthonormal sets. Choose x ∈ Rn ; then there are αr , βs such that x=

k X

αr vr +

r=1

j X

βs bs .

s=1

Then using orthonormality of S ∪ T , we have ! j k X X x · vi = αr vr + βs bs · vi = αi and x · bi = r=1

s=1

k X

αr vr +

r=1

j X

! βs bs

· bi = βi .

s=1

But then (again using orthonormality) 2

kxk = x · x =

k X

α r vr +

r=1

=

k X r=1

=

k X

αr2 +



!

k X

βs bs

s=1 j X

r=1

αr vr +

j X

! βs bs

s=1

βs2

s=1 2

|x · vr | +

r=1 k X

j X

j X

2

|x · bs |

s=1 2

|x · vr | ,

r=1

as required. (b) Looking at the string of equalities and inequalities above, we see that equality holds if and only Pj 2 if s=1 |x · bs | = 0. But this means that x · bs = 0 for all s, which in turn means that x ∈ span{vj } = W .

5.3

The Gram-Schmidt Process and the QR Factorization

1. First obtain an orthogonal basis: " # 1 v1 = x1 = 1 " # " # " # " # " # 3 1 − 21 1 1·1+1·2 1 v2 = x2 − projv1 (x2 )v1 = − = − 23 = . 1 1·1+1·1 1 2 2 2 2

5.3. THE GRAM-SCHMIDT PROCESS AND THE QR FACTORIZATION

389

Normalizing each of the vectors gives " # " # √1 v1 1 1 q1 = =√ = 12 , kv1 k √ 2 1 2

" # " # − √12 − 12 v2 1 q2 = = . = √ 1 kv2 k √1 1/ 2 2 2

2. First obtain an orthogonal basis:  v1 = x1 =

 3 −3

    3·3−3·1 3 3 − v2 = x2 − projv1 (x2 )v1 = 1 3 · 3 − 3 · (−3) −3           1 3 3 3 1 2 = = − − = . 1 1 −1 2 3 −3 Normalizing each of the vectors gives " # " # √1 3 1 v1 2 = =√ , q1 = kv1 k 18 −3 − √12

" # " # √1 v2 1 2 = 12 q2 = =√ kv2 k √ 8 2 2

3. First obtain an orthogonal basis: 

 1 v1 = x1 = −1 −1           2 0 1 0 2 1 · 0 − 1 · 3 − 1 · 3 −1 = 3 + −2 = 1 v2 = x2 − projv1 (x2 )v1 = 3 − 1 · 1 − 1 · (−1) − 1 · (−1) 3 −1 3 −2 1 v3 = x3 − projv1 (x3 )v1 − projv2 (x3 )v2               3 3 1 2 0 1 2 1 · 3 − 1 · 2 − 1 · 4 2 · 3 + 1 · 2 + 1 · 4 −1 − 1 = 2 + −1 − 2 1 = −1 . = 2 − 1 · 1 − 1 · (−1) − 1 · (−1) 2·2+1·1+1·1 4 4 −1 1 1 −1 1 Normalizing each of the vectors gives 

   √1 1 3  v1 1    = − √1  , q1 = =√  −1 kv1 k 3    13  −1 − √3     √2 2   16  v2 1  q2 = =√  1 =  √  , kv2 k 6    16  √ 1 6     0 0  1  v3 1   √  q3 = =√  = −1   − 2  . kv3 k 2 √1 1 2

390

CHAPTER 5. ORTHOGONALITY

4. First obtain an orthogonal basis:

  1 v1 = x1 = 1 1         1 1 1 1  3 1 · 1 + 1 · 1 + 1 · 0 2 1 v1 = 1 − 1 =  v2 = x2 − projv1 (x2 )v1 = 1 −  3 1·1+1·1+1·1 3 1 0 0 −2 3

v3 = x3 − projv1 (x3 )v1 − projv2 (x3 )v2   1 1 ·1+ 1 ·0− 2 ·0 1·1+1·0+1·0 = 0 − v1 − 1 31 1 3 1 2 3 2  v2 1·1+1·1+1·1 3 · 3 + 3 · 3 − 3 · −3 0       1 1 1   1   1  3     1 = 0 − 3 1 − 2  3  0 1 − 23   1

 2 1 = − 2 . 0

Normalizing each of the vectors gives

    √1 1   13  v1 1  q1 = =√  1 =  √  , kv1 k 3    13  √ 1 3

 q2 =

v2 1 =p kv2 k 2/3 

 q3 =

v3 1 = √ kv3 k 1/ 2

1  2 − 1   2

0

 =



1  3  1  2 − 23

 =



√1  16   √ ,  6 − √26



√1 2  − √1   2

0

5. Using Gram-Schmidt, we get

  1   v1 = x1 = 1 0           3 1 3 1 − 21   1 · 3 + 1 · 4 + 0 · 2     7    1 v2 = x2 − projv1 (x2 )v1 = 4 − 1 = 4 − 1 =  2  . 1·1+1·1+0·0 2 2 0 2 0 2

5.3. THE GRAM-SCHMIDT PROCESS AND THE QR FACTORIZATION 6. Using Gram-Schmidt, we get 

 2 −1  v1 = x1 =   1 2 

 3 −1 2 · 3 − 1 · (−1) + 1 · 0 + 2 · 4  v2 = x2 − projv1 (x2 )v1 =   0 − 2 · 2 − 1 · (−1) + 1 · 1 + 2 · 2 v1 4       0 2 3   −1 3 −1  12       =  0 − 2  1 = − 3   2 2 4 1 v3 = x3 − projv1 (x3 )v1 − projv2 (x3 )v2   1 1 3 1  − 2 · 1 − 1 · 1 + 1 · 1 + 2 · 1 v1 − 0 · 1 + 2 · 1 − 2 · 1 + 1 · 1 v2 = 1 2 · 2 − 1 · (−1) + 1 · 1 + 2 · 2 0 · 0 + 12 · 21 − 32 · − 32 + 1 · 1 1     1     0 2 1  1   57   2 5 1 2 −1        = 1 − 5  1 + 0 − 3  =  3 .  2 5 2 1 1 1 5

7. First compute the projection of v on each of the vectors v1 and v2 from Exercise 5:     1 0 v1 · v 1 · 4 + 1 · (−4) + 0 · 3     projv1 v = = v1 = 1 0 v1 · v1 1·1+1·1+0·0 0 0       −2 − 12 −1 1 1 − 2 · 4 + 2 · (−4) + 2 · 3  1  4  21   92  v2 · v projv2 v = v2 =  2 =  2 =  9 . 2 2 v2 · v2 9 − 12 + 12 + 22 8 2 2 9 Then the component of v orthogonal to both v1 and v2 is 

     38 4 − 92 9    2   38  −4 −  9  = − 9  , 8 19 3 9 9 so that  v=



38 9  38  − 9  19 9

4 + v2 . 9

391

392

CHAPTER 5. ORTHOGONALITY

8. First compute the projection of v on each of the vectors v1 and v2 from Exercise 6:      2 2 2 5  1 −1 − 1  −1 v1 · v 2·1−1·4+1·0+2·2      5 projv1 v = v1 =   =   =  1 v1 · v1 22 + 12 + 12 + 22  1 5  1  5  2 2 5      0 0 0  1  4 1 0 · 1 + f rac12 · 4 − 32 · 0 + 1 · 2  v2 · v  2 8  2  7 projv2 v = v2 =  3  =  3  =  12    2 2 v2 · v2 − 2  7 − 2  − 7  02 + 1 + − 3 + 1 2

2



2

2

1

projv3 v =

v3 · v v3 = v3 · v3

1 5 ·1+  1 2 + 5

7 5 ·4  7 2 5

+ +

1

1

5 7 3 1 · 0 + · 2   5 5   5 3 2 1 2 3 + 5 5 5 1 5

5 7 5 3 5 1 5

=

31 12

1  =

8 7

31 60  217   60   93  .  60  31 60



Then the component of v orthogonal to both v1 and v2 is  2    31   1    0 5 60 12 4  1   4   217   1    − 5   7   60   84  −4 −  1  −  12  −  93  =  1  ,  5  − 7   60  − 28  3 2 8 31 5 − 84 5 7 60 so that 1 12  1  84   1 − 28  5 − 84

 v=

 1 8 31 + v1 + v2 + v3 . 5 7 12

9. To find the column space of the matrix, row-reduce it:    0 1 1 1 1 0 1 → 0 1 1 0 0

0 1 0

 0 0 . 1

It follows that, since there are 1s in all three columns, the columns of the original matrix are linearly independent, so they form a basis for the column space. We must orthogonalize those three vectors.   0   v1 = x1 = 1 1           0 1 0 1 1   0 · 1 + 1 · 0 + 1 · 1     1    1 v2 = x2 − projv1 (x2 )v1 = 0 − 1 = 0 − 1 = − 2  0·0+1·1+1·1 2 1 1 1 1 1 2 v3 = x3 − projv1 (x3 )v1 − projv2 (x3 )v2     1 0 1 · 1 − 21 · 1 + 12 · 0   0·1+1·1+1·0   = 1 − 1 − 0·0+1·1+1·1 1 · 1 − 21 · − 12 + 12 · 0 1         2 1 0 1 3   1   1  1  2 = 1 − 1 − − 2  =  3  . 2 3 1 0 1 − 23 2

 1 2

1



 1 − 2  1 2

5.3. THE GRAM-SCHMIDT PROCESS AND THE QR FACTORIZATION 10. To find the column space of the matrix, row-reduce it:    1 1 1 1 0  1 −1 2  → 0 −1 1 0 0 1 5 1

0 1 0 0

393

 0 0 . 1 0

It follows that, since there are 1s in all three columns, the columns of the original matrix are linearly independent, so they form a basis for the column space. We must orthogonalize those three vectors.   1  1  v1 = x1 =  −1 1           0 1 1 1 1 −1 1 · 1 + 1 · (−1) − 1 · 1 + 1 · 5  1 −1  1 −2          v2 = x2 − projv1 (x2 )v1 =   1 − 1 · 1 + 1 · 1 − 1 · (−1) + 1 · 1 −1 =  1 − −1 =  2 4 5 1 5 1 v3 = x3 − projv1 (x3 )v1 − projv2 (x3 )v2       1 1 0 2  1 −2 1 · 1 + 1 · 2 − 1 · 0 + 1 · 1 0 · 1 − 2 · 2 + 2 · 0 + 4 · 1      = 0 − 1 · 1 + 1 · 1 − 1 · (−1) + 1 · 1 −1 − 0 · 0 − 2 · (−2) + 2 · 2 + 4 · 4  2 1 1 4         0 0 1 1 −2 1 2  1        = 0 − −1 + 0  2 = 1 . 0 4 1 1     0 0 11. The given vector, x1 , together with x2 = 1 and x3 = 0 form a basis for R3 ; we want to 0 1 orthogonalize that basis.   3   v1 = x1 = 1 5           3 0 − 35 3 0 3 1    34    3·0+1·1+5·0    v2 = x2 − projv1 (x2 )v1 = 1 − 1 = 1 − 1 =  35  2 2 2 3 +1 +5 35 0 − 71 5 0 5 v3 = x3 − projv1 (x3 )v1 − projv2 (x3 )v2     0 3 3 − 35 ·0+   3·0+1·0+5·1  = 0 − − 1    3 2 32 + 12 + 52 − 35 + 1 5         3 0 3 − 35 − 15 34 5  34     1   = 0 − 1 +  35  =  0 . 7 34 9 1 5 − 17 34 12. Let



 1  2  v1 =  −1 , 0

34 1 35 · 0 − 7 · 1  2 2 34 + 17 35

  1 0  v2 =  1 . 3

  3 − 35  34   35  − 71

394

CHAPTER 5. ORTHOGONALITY Note that v1 · v2 = 0. Now let   0 0  x4 =  0 . 1

  0 0  x3 =  1 , 0

To see that v1 , v2 , x3 , and x4 form a basis for R4 , row-reduce the matrix that has these vectors as its columns:     1 1 0 0 1 0 0 0  2 0 0 0 0 1 0 0     −1 1 1 0 → 0 0 1 0 . 0 3 0 1 0 0 0 1 Since there are 1s in all four columns, the column vectors are linearly independent, so they form a basis for R4 . Next we need to orthogonalize that basis. Since v1 and v2 are already orthogonal, we start with x3 :   5       66 0 1 1  1      0 1  2  3  v1 · x3 1  v2 · x3 0        v3 = x3 − = v1 − v2 =   +   −  49    1 −1 1 v1 · v1 v2 · v2 6 11  66  0 0 3 3 − 11       v1 · x4 v2 · x4 v3 · x4 v4 = x4 − v1 − v2 − v3 v1 · v1 v2 · v2 v3 · v3           5 12 1 − 49 1 0 66    1   6          2 0 0 18  3      3  =  49 .    + =   + 0  −       49 −1 11 1 49  66   0  0 3 4 3 0 − 11 1 49   0 13. To start, we let x3 = 0 be the third column of the matrix. Then the determinant of this matrix 1 is (for example, expanding along the second row) √16 6= 0, so the resulting matrix is invertible and therefore the three column vectors form a basis for R3 . The first two vectors are orthogonal and each is of unit length, so we must orthogonalize the third vector as well:  v3 = x3 −

q1 · x3 q1 · q1



 q1 −

q2 · x3 q2 · q2

 q2

 1   1    √ √ 0 3 2 √1 √1 −   √1     2 3 = 0 −  2 0 −  2  2  2  3   2  √1 √1 + 02 + − √12 + √13 + √13 √1 1 − √12 2 3 3    1   1    1 √ √ 0 2 1  1  13   61     = 0 + √  0 − √  √3  = − 3  . 2 3 1 1 1 √ √ 1 − 2 6 3

Then v3 is orthogonal to the other two columns of the matrix, but we must normalize v3 to convert it to a unit vector. s   2  2 2 1 1 1 1 kv3 k = + − + =√ , 6 3 6 6

5.3. THE GRAM-SCHMIDT PROCESS AND THE QR FACTORIZATION

395

so that

 √1 6  √2  − 6  , √1 6 

q3 =

1 √ v3 = 1/ 6

so that the completed matrix Q is



√1 2

 Q=

0

− √12

√1 3 √1 3 √1 3

 √1 6  − √26  . √1 6

    2 1  1 1    14. To simplify the calculations, we will start with the multiples v1 =  1 and v2 =  0 of the given −3 1 basis vectors (column vectors of Q). We will find an orthogonal basis containing those two vectors and then normalize all four at the very end. Note that v1 and v2 are already orthogonal. If we choose

    0 0 0 0    x3 =  1 and x4 = 0 , 1 0

  then calculating the determinant of v1 v2 x3 x4 , for example by expanding along the last column, shows that the matrix is invertible, so that these four vectors form a basis for R4 . We now need to orthogonalize the last two vectors:

 p3 = x3 −

v1 · x3 v1 · v1



 v1 −

v2 · x3 v2 · v2

 v2

      0 1 2 0 1 · 0 + 1 · 0 + 1 · 1 + 1 · 0 1  2 · 0 + 1 · 0 + 0 · 1 − 3 · 0  1      = −  −   1 1 · 1 + 1 · 1 + 1 · 1 + 1 · 1 1 2 · 2 + 1 · 1 + 0 · 0 − 3 · (−3)  0 0 1 −3       0 1 −1 0 1 1 1 −1       =   −   =  . 1 4 1 4  3 0

1

−1

396

CHAPTER 5. ORTHOGONALITY   −1 −1   Now let v3 = 4p3 =  , again avoiding fractions. Next,  3 −1  v4 = x4 −

v1 · x4 v1 · v1



 v1 −

v2 · x4 v2 · v2



 v2 −

v3 · x4 v3 · v3

 v3

      1 2 0  0 1 · 0 + 1 · 0 + 1 · 0 + 1 · 1 1 2 · 0 + 1 · 0 + 0 · 0 − 3 · 1  1      = −  −   0 1 · 1 + 1 · 1 + 1 · 1 + 1 · 1 1 2 · 2 + 1 · 1 + 0 · 0 − 3 · (−3)  0 1 −3 1   −1   −1 · 0 − 1 · 0 + 3 · 0 − 1 · 1 −1 −   −1 · (−1) − 1 · (−1) + 3 · 3 − 1 · (−1)  3 −1         0 1 2 −1 0 1 1  1 −1 3 1         = −  +  +   0 4 1 14  0 12  3 1 

−3

1

−1

2 21 − 5   42  .



= 

 0

1 42

We now normalize v1 through v4 ; note that we already have normalized forms for the first two. p √ kv3 k = 12 + 12 + 32 + 12 = 2 3 s   2  2 2 5 1 1 2 + − + 02 + =√ , kv4 k = 21 42 42 42 so that the matrix is 1 Q=

15. From Exercise 9, kv1 k =



√2 14 √1 14

2 1 2  1 2 1 2

0 − √314





2, kv2 k =

6 2 ,

and kv3 k = 

h

Q = q1

q2

 R = QT A =  √26 √1 3

√1 2 − √16 √1 3

 q3 =  √12

 √1 0 2  √1  1 6 1 − √13

0  

√1 42

so that

√2 6 − √16 √1 6

0

i

Therefore 0

√4 42  − √542  .

√ 2 3 3 ,

√1 2





3 √6 − 63 √ 3 2√ − 63



1 0 1

 √1 3  √1  . 3 − √13

 √ 2 1   1 =  0 0 0

√1 2 √3 6

0

 √1 2  √1  . 6 √2 3

5.3. THE GRAM-SCHMIDT PROCESS AND THE QR FACTORIZATION

397

√ √ 16. From Exercise 10, kv1 k = 2, kv2 k = 2 6, and kv3 k = 2, so that  h Q = q1

i q3 =

q2

1 2  1  2  − 1  2 1 2





 0  √1  2 .  √1  2 0

0

6 √6 6 √6 6 3

Therefore 

1 2

R = QT A =  0

0

17. We have

 R = QT A =



− 21 √ 6 6 √1 2

1 √2 − 66 √1 2

2 3 1 3 2 3

1 3 2 3 − 32



1

1  √2   6  3 

0

− 23

1

−1 1

  1 2   −1 2  = 0  1 0  0 5 1 1

2 √ 2 6 0

  3 9 8 2   7 −1 = 0 6 −2 1 0 0



2 2  3  1 1 −2 3

2



 . 0  √  2



1 3 2 3 7 3

18. We have " T

R=Q A=

√1 6 √1 3

√2 6

− √16

0

√1 3

 1 # 0   2  1 √ −1 3

 3 "√ 4 6  = −1 0

0

√ # 2 6 √ . 3

1

19. If A is already orthogonal, let Q = A and R = I, which is upper triangular; then A = QR. 20. First, if A is invertible, then A has linearly independent columns, so that by Theorem 5.16, A = QR where R is invertible. Since R is upper triangular, its determinant is the product of its diagonal entries, so that R invertible implies that all of its diagonal entries are nonzero. For the converse, if R is upper triangular with all diagonal entries nonzero, then it has nonzero determinant; since Q is orthogonal, it is invertible so has nonzero determinant as well. But then det A = det(QR) = (det Q)(det R) 6= 0, so that A is invertible. Note that when this occurs, A−1 = (QR)−1 = R−1 Q−1 = R−1 QT .   21. From Exercise 20, A−1 = R−1 QT . To compute R−1 , we row-reduce R I using Exercise 15: √   √ √ √  2 √12 √12 1 0 0 1 0 0 22 − 66 − 63 √ √     6 3 .  0 √3  → 0 1 0 √1 0 1 0 − 0    3 6 6 √6  3 0 0 1 0 0 0 0 √23 0 0 1 2 Thus

√

2 2

 A−1 = R−1 QT =   0 0





6 √6 6 3

0





22. From Exercise 20, A−1 = R−1 QT . To compute R−1 ,    3 9 13 1 0 0 1    2 0 6 3 0 1 0 → 0 0 0 73 0 0 1 0



√1 2 − √16 √1 3

3 0 √6   3   √2 − 6  6 √ 3 √1 2 3



√1 2 √1  6 − √13

 −1  2 1 =  2 1 2

 we row-reduce R 0 1 0

0 0 1

1 3

− 12

0 0

1 6

0

1 2 − 12 1 2



1 2 1 . 2 − 21

 I using Exercise 17: 

2 21 1  − 21 . 3 7

398

CHAPTER 5. ORTHOGONALITY Thus 

1 3



− 12

 A−1 = R−1 QT =  0 0

1 6

0

2 2 21 3   1 − 21   31 3 2 7 3

1 3 2 3 − 23

− 23





2 3 1 3

=

5 42 1  42 2 7

− 27

11 − 21

1 7 − 27



2  . 21  1 7

23. If A = QR is a QR-factorization of A, then we want to show that Rx = 0 implies that x = 0 so that R is invertible by part (c) of the Fundamental Theorem. Note that since Q is orthogonal, so is QT , so that by Theorem 5.6(b), QT x = kxk for any vector x. Then



Rx = 0 ⇒ kRxk = 0 ⇒ (QT A)x = QT (Ax) = kAxk = 0 ⇒ Ax = 0. P If ai are the columns of A, then Ax = 0 means that ai xi = 0. But this means that all of the xi are zero since the columns of A are linearly independent. Thus x = 0. By part (c) of the Fundamental Theorem, we conclude that R is invertible. 24. Recall that row(A) = row(B) if and only if there is an invertible matrix M such that A = M B. Then A = QR implies that AT = RT QT ; since A has linearly independent columns, R is invertible, so that RT is as well. Thus row(AT ) = row(QT ), so that col(A) = col(Q).

Exploration: The Modified QR Process   d  1. I − 2uu = I − 2 1 d1 d2 T

d2



2d21 =I− 2d1 d2 

  2d1 d2 1 − 2d21 = 2 2d2 −2d1 d2

 −2d1 d2 . 1 − 2d22

2. (a) From Exercise 1, "

1 − 2d21 Q= −2d1 d2 (b) Since kx − yk =

p

# " 9 −2d1 d2 1 − 2 · 25 = 1 − 2d22 −2 · 53 · 45

# " 7 −2 · 53 · 45 25 = 16 24 1 − 2 · 25 − 25

# − 24 25 . 7 − 25

√ (5 − 1)2 + (5 − 7)2 = 2 5, we have " # " # 4 √ √2 1 2 5 5 u = √ (x − y) = = . 2 2 5 − 2√ − √15 5

Then from Exercise 1, " Q=

1 − 2d21

−2d1 d2

−2d1 d2

1 − 2d22

#

3. (a) QT = (I − 2uuT )T = I − 2 uuT



  " 1 − 2 · 45 −2 · √25 · − √15 − 35   = = 4 −2 · √25 · − √15 1 − 2 · 15 5 T

= I − 2 uT

T

4 5 3 5

# .

uT = I − 2uuT = Q, so that Q is symmetric.

(b) Start with QQT = QQ = (I − 2uuT )(I − 2uuT ) = I − 4uuT + −2uuT

2

= I − 4uuT + 4uuT uuT .

Now, uT u = u · u = 1 since u is a unit vector, so the equation above becomes QQT = I − 4uuT + 4uuT uuT = I − 4uuT + 4uuT = I. Thus QT = Q−1 and it follows that Q is orthogonal. (c) This follows from parts (a) and (b): Q2 = QQ = QQT = QQ−1 = I.

Exploration: The Modified QR Process

399

4. First suppose that v is in the span of u; say v = cu. Then again using the fact that uT u = u · u = 1, Qv = Q(cu) = (I − 2uuT )(cu) = c(u − 2uuT u) = c(u − 2u) = −cu = −v. If v · u = uT v = 0, then Qv = (I − 2uuT )v = v − 2uuT v = v. 5. First normalize the given vector to get  u=

Then

 uuT =

1 6  1 − 6 1 3

− 16 1 6 − 13



√1  16  − √  .  6 √2 6





1 3  − 13  , 2 3

so that Q = I − 2uuT =

2 3  1  3 − 23

1 3 2 3 2 3

− 32



2 . 3 1 −3

For Exercise 3, clearly Q is symmetric, and each column has norm 1. A short computation shows that each pair of columns is orthogonal. Therefore Q is orthogonal. Finally,    2 2 1 1 − 32 − 23 3 3 3 3  2  1 2 2 Q2 =  13 23 = I. 3  3 3 3 2 1 2 1 2 2 −3 3 −3 −3 3 −3 For Exercise 4, first suppose that v = cu. Then  Qv = Q(cu) = cQu =

2  3 1 c  3 − 23

1 3 2 3 2 3





√1   16  2  √  − 6 3  1 √2 −3 6

− 23

  − √16  1   = c  √6  = −cu = −v. 2 − √6

If v · u = uT v = 0, then h

√1 6

− √16

  i x1 √2 x2  = √1 x1 − √1 x2 + √2 x3 = 0, 6 6 6 6 x3

    1 0 so that x1 − x2 + 2x3 = 0 and therefore x2 = x1 + 2x3 . Thus if v · u = 0, then v = s 1 + t 2. But 0 1 then      1 0      Qv = Q s 1 + t 2 0 1     1 0     = sQ 1 + tQ 2 0 1       2 1 2 1 − 23 1 − 23 0 3 3 3 3  1 2  1 2 2   2   + t 3 3 = s 3 3 3  1 3  2 2 2 1 2 2 −3 3 −3 0 − 3 3 − 13 1     1 0     = s 1 + t 2 = v. 0 1

400

CHAPTER 5. ORTHOGONALITY

6. x − y is a multiple of u, so that Q(x − y) = −(x − y) by Exercise 4. Next, (x + y) · u =

1 (x + y) · (x − y). kx − yk

But by Exercise 61 in Section 1.2, this dot product is zero because kxk = kyk. Thus x+y is orthogonal to u, so that Q(x + y) = x + y. So we have Q(x − y) = −(x − y) Q(x + y) = x + y 7. We have

Qx + Qy = x + y

⇒ 2Qx = 2y ⇒ Qx = y.

  −2 1 1 u= (x − y) = √  2 , kx − yk 2 3 2

so that

 uuT =

1 3  1 − 3 − 13

− 13

− 31

1 3 1 3

Then

q



1 3 1 3

 Qx =

8. Since kyk =

Qx − Qy = −x + y



1 3 2 3 2 3

 ⇒ Q = I − 2uuT =

2 3 1 3 − 32

 

2 1 3 2   − 3  2 1 2 3

1 3 2 3 2 3

2 3 1 3 − 32



2 3  − 23  . 1 3

  3   = 0 = y. 0

2

kxk = kxk, if ai is the ith column vector of A, we can apply Exercise 6 to get

 Q1 A = Q1 a1

Q1 a2

...

  Q1 an = Q1 x

Q1 a2

...

 Q1 an  = y Q1 a2

...

Q1 an





 ∗ ∗ = , 0 A1

since the last m − 1 entries in y are all zero. Here A1 is (m − 1) × (n − 1), the left-hand ∗ is 1 × 1, and the right-hand ∗ is 1 × (n − 1). 9. Since P2 is a Householder matrix, it is orthogonal, so that    1 0 1 0 Q2 QT2 = = I, 0 P2 0 P2T so that Q2 is orthogonal as well. Further, 

1 0 Q2 Q1 A = 0 P2

    ∗ ∗ ∗ ∗ ∗ = = 0 0 A1 0 P2 A 1 0



 ∗ ∗ ∗ ∗ . 0 A2

10. Using induction, suppose that 

∗ Qk Qk−1 · · · Q1 A = 0 0

∗ ∗ 0

 ∗ ∗ . Ak

Repeat Exercise 8 on the matrix Ak , giving a Householder matrix Pk+1 such that   ∗ ∗ Pk+1 Ak = . 0 Ak+1

Exploration: The Modified QR Process

401



 1 0 . Then using the fact that Pk+1 is orthogonal, we again conclude that Qk+1 is 0 Pk+1 orthogonal, and that   ∗ ∗ ∗ ∗ . Qk+1 Qk Qk−1 · · · Q1 A = 0 ∗ 0 0 Ak+1

Let Qk+1 =

The result then follows by induction. The resulting matrix R = Qm−1 · · · Q2 Q1 A is upper triangular by construction. 11. If Q = Q1 Q2 · · · Qm−1 , then Q is orthogonal since each Qi is orthogonal, and T

QT = (Q1 Q2 · · · Qm−1 ) = QTm−1 · · · QT2 QT1 = Qm−1 · · · Q2 Q1 . Thus QT A = Qm−1 · · · Q2 Q1 A = R, so that QQT A = A = QR, since QQT = I.    p 3 32 + (−4)2 = 5 . Then 12. (a) We start with x = , so that y = −4 0 " # " # 2 1 − √15 T 5 5 u= ⇒ uu = 2 4 . − √25 5 5 Then

"

− 54 − 53

3 5 − 45

T

Q1 = I − 2uu =

#

" 5 ⇒ Q1 A = 0

# 3 −1 = R, −9 −2

and A = QT1 R.     3 1 (b) We start with x = 2, so that y = 0. Then 0 2    1 − √13 3  1   √  ⇒ uuT = − 1 u=   3 3 √1 − 31 3 Thus

 Q1 = I − 2uuT =

1 3 2 3 2 3



2 3 1 3 − 32

Then

2 3  − 23  1 3

"

4 A1 = 3

3 1

− 31

− 13



1 3 1 3

1 . 3 1 3

 3  ⇒ Q1 A = 0 0 1 3 4 3



−5 1 4 3 3 1



8 3 1 . 3 4 3

# ,

so that in the next iteration x=

" # 4



3

y=

Then

" # 5

" T

uu =

0 1 10 3 − 10

" ⇒

u=

3 − 10 9 10

√3 10

P2 = I − 2uuT =

4 5 3 5

3 5 − 54

#

" ⇒ Q2 =

# .

# .

So "

− √110

1 0 0 P2

#

 1  = 0 0



0

0

4 5 3 5

3 . 5 4 −5

402

CHAPTER 5. ORTHOGONALITY Finally,  1  Q2 Q1 A = 0 0

0 4 5 3 5

 3 3  5  0 0 − 45 0

−5 1 4 3 3 1



8 3 1 3 4 3

 3  = 0 0

 8 −5 1 3 16  = R, 5 3 15  0 1 − 13 15

and Q1 Q2 R = A.

Exploration: Approximating Eigenvalues with the QR Algorithm 1. We have A1 = RQ = (Q−1 Q)RQ = Q−1 (QR)Q = Q−1 AQ, so that A1 ∼ A. Thus by Theorem 4.22(e), A1 and A have the same eigenvalues. 2. Using Gram-Schmidt, we have " # 1 v1 = 1 " # " # " # " # " # 0 0 − 23 1·0+1·3 1 3 1 v2 = − . = − = 3 1·1+1·1 1 2 1 3 3 2 Normalizing these vectors gives " Q=

√1 2 √1 2

√1 2 − √12

Thus

"√ A1 = RQ =

"√

#

2 0

T

, so that R = Q A = √ #" 3 2 √1 2 2 √ √1 −322 2

√1 2 − √12

2 0

#

" =

5 2 − 32

√ # 3 2 2 √ . −322

− 21 3 2

# .

Then 1 − λ 0 det(A − λI) = = (1 − λ)(3 − λ) 1 3 − λ    5 − λ − 12 3 3 5 2 det(A1 − λI) = 3 −λ − λ − = λ2 − 4λ + 3 = (λ − 1)(λ − 3). = 3 −2 2 2 4 − λ 2 So A and A1 do indeed have the same eigenvalues. 3. The proof is by induction on k. We know that A1 ∼ A. Now assume An−1 ∼ A. Then −1 −1 An = Rn−1 Qn−1 = (Q−1 n−1 Qn−1 )Rn−1 Qn−1 = Qn−1 (Qn−1 Rn−1 )Qn−1 = Qn−1 An−1 Qn−1 ,

so that An ∼ An−1 ∼ A. 4. Converting A1 = Q1 R1 to decimals gives, using technology      0.86 0.51 2.92 −1.20 3.12 0.47 A1 = Q1 R1 ≈ ⇒ A2 = R1 Q1 ≈ −0.51 0.86 0 1.04 −0.53 0.88      −0.99 0.17 −3.16 −0.32 3.06 −0.84 A2 = Q2 R2 ≈ ⇒ A3 = R2 Q2 ≈ 0.17 0.99 0 0.95 0.16 0.93      −1.00 −0.05 −3.06 0.79 3.02 0.94 ⇒ A4 = R3 Q3 ≈ A3 = Q3 R3 ≈ −0.05 1.00 0 0.97 −0.05 0.97      −1.00 0.02 −3.02 −0.92 3.00 −0.98 A4 = Q4 R4 ≈ ⇒ A5 = R4 Q4 ≈ . 0.02 1.00 0 0.99 0.02 0.99 The Ak are approaching an upper triangular matrix U .

Exploration: Approximating Eigenvalues with the QR Algorithm

403

5. Since Ak ∼ A for all k, the eigenvalues of A and Ak are the same, so in the limit, the upper triangular matrix U will have eigenvalues equal to those of A; but the eigenvalues of a triangular matrix are its diagonal entries. Thus the diagonal entries of U will be the eigenvalues of A. 6. (a) Follow the same process as in Exercise 4.      0.71 0.71 2.83 2.83 4.02 0.00 A = QR ≈ ⇒ A1 = RQ ≈ 0.71 −0.71 0 1.41 1.00 −1.00      −0.97 −0.24 −4.14 0.24 3.96 1.23 A1 = Q1 R1 ≈ ⇒ A2 = R1 Q1 ≈ −0.24 0.97 0 −0.97 0.23 −0.94      −1.00 −0.06 −3.97 −1.17 4.04 −0.93 A2 = Q2 R2 ≈ ⇒ A3 = R2 Q2 ≈ −0.06 1.00 0 −1.01 0.06 −1.01      −1.00 −0.01 −4.04 0.94 4.03 0.98 A3 = Q3 R3 ≈ ⇒ A4 = R3 Q3 ≈ −0.01 1.00 0 −1.00 0.01 −1.00      −1.00 0.00 −4.03 −0.98 4.03 −0.98 A4 = Q4 R4 ≈ ⇒ A5 = R4 Q4 ≈ . 0.00 1.00 0 −1.00 0.00 −1.00 The eigenvalues of A are approximately 4.03 and −1.00. (b) Follow the same process as in Exercise 4.      0.45 0.89 2.24 1.34 2.20 1.40 A = QR ≈ ⇒ A1 = RQ ≈ 0.89 −0.45 0 0.45 0.40 −0.20      −0.98 −0.18 −2.24 −1.34 2.44 −0.92 A1 = Q1 R1 ≈ ⇒ A2 = R1 Q1 ≈ −0.18 0.98 0 −0.45 0.08 −0.44      −1.00 −0.03 −2.44 0.93 2.41 1.01 A2 = Q2 R2 ≈ ⇒ A3 = R2 Q2 ≈ −0.03 1.00 0 −0.41 0.01 −0.41      −1 0 −2.41 −1.01 2.42 −1 A3 = Q3 R3 ≈ ⇒ A4 = R3 Q3 ≈ 0 1 0 −0.42 0 −0.42      −1.00 0.00 −2.42 1 2.41 1 A4 = Q4 R4 ≈ ⇒ A5 = R4 Q4 ≈ . 0.00 1.00 0 −0.41 0 −0.41 √ The eigenvalues of A are approximately 2.41 and −0.41 (the true values are 1 ± 2). (c) Follow the same process as in Exercise 4.    4.24 0.47 −0.94 0.24 −0.06 −0.97 1.26 ⇒ 0.97 0  0 1.94 A = QR ≈  0.24 −0.94 0.23 −0.24 0 0 0.73   2.00 0 −3.89 A1 = RQ ≈ −0.73 2.18 −0.31 −0.69 0.17 −0.18    −0.89 −0.33 0.30 −2.24 0.76 3.32 0 −2.05 1.57 ⇒ A1 = Q1 R1 ≈  0.33 −0.94 −0.07  0.31 0.03 0.95 0 0 −1.31   3.27 0.13 2.44 1.98 1.64 A2 = R1 Q1 ≈ −0.18 −0.40 −0.04 −1.25    −0.99 −0.05 0.12 −3.3 −0.03 −2.47 0 −1.99 −1.8 ⇒ A2 = Q2 R2 ≈  0.06 −1.00 0.01  0.12 0.02 0.99 0 0 −1.31   2.96 0.16 −2.86 1.95 −1.81 A3 = R2 Q2 ≈ −0.33 −0.11 −0.02 −0.91

404

CHAPTER 5. ORTHOGONALITY

 −0.99 A3 = Q3 R3 ≈  0.11 0.04

  −0.11 0.04 −2.98 0.06 2.61 −0.99 0.01  0 −1.95 2.10 ⇒ 0.01 1.00 0 0 −1.03   3.07 0.30 2.49 1.96 2.09 A4 = R3 Q3 ≈ −0.14 −0.04 −0.01 −1.03    −1 −0.04 0.01 −3.07 −0.21 −2.41 −1 0  0 −1.97 −2.20 ⇒ A4 = Q4 R4 ≈ 0.04 0.01 0 1 0 0 −0.99   3.03 0.34 −2.45 A5 = R4 Q4 ≈ −0.12 1.96 −2.21 . −0.01 0 −0.99

The eigenvalues of A are approximately 3.03, 1.96, and −0.99. (The true values are 3, 2, and −1). (d) Follow the same process as  0.45 0 A = QR ≈  −0.89  1.00 A1 = Q1 R1 ≈ 0.00

in Exercise 4.     0.72  2.24 −3.13 −2.24 3.00 −1.07 0.60 ⇒ A1 = RQ ≈ 0 3.35 0 0 2.00 0.36     0.00 3.00 −1.07 3.00 −1.07 ⇒ A2 = R1 Q1 ≈ . 1.00 0 2.00 0 2.00

Since Q1 is the identity matrix, we get Q2 = I and R2 = R1 , so that successive Ai are equal to A1 . Thus the eigenvalues of A are approximately 3 and 2 (in fact, they are 3, 2, and 0). 7. Following the process in Exercise 4, " A = QR =

√2 5 − √15

" A1 = Q1 R1 =

√2 5 − √15

− √15

# "√

− √25 − √15 − √25

√8 5 √1 5

5

0 #"

√1 5

0

#

" ⇒ A1 = RQ =

2 5 − 15

#

− √85 √ 5

− 21 5 − 25 "

⇒ A2 = R1 Q1 ≈

#

2 −1

# 3 −2

= A.

Here A2 = A (Q1 = Q−1 and R1 = R−1 ); the algorithm fails because the eigenvalues of A, which are ±1, do not have distinct absolute values. 8. Set B = A + 0.9I =

 2.9 −1

 3 . −1.1

Then      −0.95 0.33 −3.07 −3.19 1.86 −4.02 ⇒ B1 = RQ ≈ B = QR ≈ 0.33 0.95 0 −0.06 −0.02 −0.06      −1 0.01 −1.86 4.02 1.9 4 B1 = Q1 R1 ≈ ⇒ B2 = R1 Q1 ≈ 0.01 1 0 −0.1 0 −0.1      −1.00 0 −1.9 −4 1.9 −4 B2 = Q2 R2 ≈ ⇒ B3 = R2 Q2 ≈ . 0 1.00 0 −0.1 0 −0.1 Further iterations will simply reverse the sign of the upper right entry of Bi . The eigenvalues of B are approximately 1.9 and −0.1, so the eigenvalues of A are approximately 1.9 − 0.9 = 1.0 and −0.1 − 0.9 = −1.0. These are correct (see Exercise 7).

5.4. ORTHOGONAL DIAGONALIZATION OF SYMMETRIC MATRICES

405

9. We prove the first equality using induction on k. For k = 1, it says that Q0 A1 = AQ0 , or QA1 = AQ. Now, A1 = RQ and A = QR, so that QA1 = QRQ = (QR)Q = AQ. This proves the basis of the induction. Assume the equality holds for k = n; we prove it for k = n + 1: Q0 Q1 Q2 · · · Qn An+1 = Q0 Q1 Q2 · · · Qn−1 (Qn An+1 ) = Q0 Q1 Q2 · · · Qn−1 (Qn (Rn Qn )) = Q0 Q1 Q2 · · · Qn−1 (Qn Rn )Qn = (Q0 Q1 Q2 · · · Qn−1 An )Qn = (AQ0 Q1 Q2 · · · Qn−1 )Qn = AQ0 Q1 Q2 · · · Qn . For the second equality, note that Ak = Qk Rk , so that Q0 Q1 Q2 · · · Qk−1 Ak = Q0 Q1 Q2 · · · Qk−1 Qk Rk = AQ0 Q1 Q2 · · · Qk−1 . Multiply both sides by Rk−1 · · · R2 R1 to get (Q0 Q1 Q2 · · · Qk−1 Qk )(Rk Rk−1 · · · R2 R1 ) = A(Q0 Q1 Q2 · · · Qk−1 )(Rk−1 · · · R2 R1 ), as desired. For the QR-decomposition of Ak+1 , we again use induction. For k = 0, the equation says that Q0 R0 = A0+1 = A, which is true. Now assume the statement holds for k = n. Then using the first equation in this exercise together with the fact that Ak = Qk Rk and the inductive hypothesis, we get An+1 = A · An = A(Q0 Q1 · · · Qn−1 )(Rn−1 · · · R1 R0 ) = Q0 Q1 · · · Qn−1 An Rn−1 · · · R1 R0 = Q0 Q1 · · · Qn−1 Qn Rn Rn−1 · · · R1 R0 = (Q0 Q1 · · · Qn )(Rn Rn−1 · · · R1 R0 ), as desired.

5.4

Orthogonal Diagonalization of Symmetric Matrices

1. The characteristic polynomial of A is 4 − λ det(A − λI) = 1

1 = λ2 − 8λ + 15 = (λ − 5)(λ − 3) 4 − λ

Thus the eigenvalues of A are λ1 = 3 and λ2 = 5. To find the eigenspace corresponding to λ1 = 3, we row-reduce       1 1 0 1 1 0 A − 3I 0 = → . 1 1 0 0 0 0   −1 A basis for the eigenspace is given by . 1 To find the eigenspace corresponding to λ2 = 5, we row-reduce      −1 1 0 1 A − 5I 0 = → 1 −1 0 0 A basis for the eigenspace is given by

  1 . 1

−1 0

 0 . 0

406

CHAPTER 5. ORTHOGONALITY These eigenvectors are orthogonal, as is guaranteed by Theorem 5.19, so to turn them into orthonormal √ vectors, we normalize them. The norm of each vector is 2, so setting " # − √12 √12 Q= 1 1 √

2



2

we see that Q is an orthonormal matrix whose columns are the normalized eigenvectors for A, so that   3 0 −1 T Q AQ = Q AQ = = D. 0 5 2. The characteristic polynomial of A is −1 − λ det(A − λI) = 3

3 = λ2 + 2λ − 8 = (λ + 4)(λ − 2). −1 − λ

Thus the eigenvalues of A are λ1 = −4 and λ2 = 2. To find the eigenspace corresponding to λ1 = −4, we row-reduce       3 3 0 1 1 0 A + 4I 0 = → . 3 3 0 0 0 0   −1 A basis for the eigenspace is given by . 1 To find the eigenspace corresponding to λ2 = 2, we row-reduce      −3 3 0 1 A − 2I 0 = → 3 −1 0 0   1 A basis for the eigenspace is given by . 1

−1 0

 0 . 0

These eigenvectors are orthogonal, as is guaranteed by Theorem 5.19, so to turn them into orthonormal √ vectors, we normalize them. The norm of each vector is 2, so setting " # − √12 √12 Q= 1 1 √

2



2

we see that Q is an orthonormal matrix whose columns are the normalized eigenvectors for A, so that   −4 0 Q−1 AQ = QT AQ = = D. 0 2 3. The characteristic polynomial of A is 1 − λ det(A − λI) = √ 2

√ 2 = λ2 − λ − 2 = (λ − 2)(λ + 1) −λ

Thus the eigenvalues of A are λ1 = −1 and λ2 = 2. To find the eigenspace corresponding to λ1 = −1, we row-reduce " # " # √ √ h i 2 2 0 1 22 0 → . A+I 0 = √ 2 1 0 0 0 0   −1 A basis for the eigenspace is given by √ . 2 To find the eigenspace corresponding to λ2 = 2, we row-reduce √    √   −1 2 0 1 − 2 A − 2I 0 = √ → 0 0 2 −1 0

 0 . 0

5.4. ORTHOGONAL DIAGONALIZATION OF SYMMETRIC MATRICES A basis for the eigenspace is given by

407

√  2 . 1

These eigenvectors are orthogonal, as is guaranteed by Theorem 5.19, so to turn them into orthonormal √ vectors, we normalize them. The norm of each vector is 3, so setting √ # " 6 − √13 3 √ Q= 6 1 3



3

we see that Q is an orthonormal matrix whose columns are the normalized eigenvectors for A, so that   −1 0 −1 T Q AQ = Q AQ = = D. 0 2 4. The characteristic polynomial of A is 9 − λ det(A − λI) = −2

−2 = λ2 − 15λ + 50 = (λ − 5)(λ − 10). 6 − λ

Thus the eigenvalues of A are λ1 = 5 and λ2 = 10. To find the eigenspace corresponding to λ1 = 5, we row-reduce " # " # h i 4 −2 0 1 − 21 0 → . A − 5I 0 = −2 1 0 0 0 0   1 A basis for the eigenspace is given by . 2 To find the eigenspace corresponding to λ2 = 10, we row-reduce      −1 −2 0 1 A − 10I 0 = → −2 −4 0 0   −2 A basis for the eigenspace is given by . 1

2 0

 0 . 0

These eigenvectors are orthogonal, as is guaranteed by Theorem 5.19, so to turn them into orthonormal √ vectors, we normalize them. The norm of each vector is 5, so setting # " √1 √2 − 5 Q = 25 1 √

5



5

we see that Q is an orthonormal matrix whose columns are the normalized eigenvectors for A, so that   5 0 −1 T Q AQ = Q AQ = = D. 0 10 5. The characteristic polynomial of A is 5 − λ 0 0 1−λ 3 = −λ3 + 7λ2 − 2λ − 40 = −(λ − 4)(λ − 5)(λ + 2). det(A − λI) = 0 0 3 1 − λ Thus the eigenvalues of A are λ1 = 4, λ2 = 5, and λ3 λ1 = 4, we row-reduce  1 0 0 i h  3 A − 4I 0 = 0 −3 0 3 −3

= −2. To find the eigenspace corresponding to   0 1 0   0 → 0 1 0 0 0

0 −1 0

 0  0 . 0

408

CHAPTER 5. ORTHOGONALITY   0 A basis for the eigenspace is given by 1. 1 To find the eigenspace corresponding to λ2 = 5, we row-reduce    0 0 0 0 0 1 0   3 0 → 0 0 1 A − 5I 0 = 0 −4 0 3 −4 0 0 0 0   1 A basis for the eigenspace is given by 0. 0 To find the eigenspace corresponding to λ3 = −2, we row-reduce    7 0 0 0 1 0 0   A + 2I 0 = 0 3 3 0 → 0 1 1 0 3 3 0 0 0 0   0 A basis for the eigenspace is given by −1. 1

 0 0 . 0

 0 0 . 0

These eigenvectors are orthogonal, as is guaranteed by Theorem 5.19, so√to turn them into orthonormal vectors, we normalize them. The norm of the first and third vectors is 2, while the second is already a unit vector, so setting   0 1 0   1 √ √1  Q=  2 0 − 2 √1 √1 0 2 2 we see that Q is an orthonormal matrix whose columns  4 Q−1 AQ = QT AQ = 0 0

are the normalized eigenvectors for A, so that  0 0 5 0 = D. 0 −2

6. The characteristic polynomial of A is 2 − λ 3 0 2−λ 4 = −λ3 + 6λ2 + 13λ − 42 = −(λ + 3)(λ − 2)(λ − 7). det(A − λI) = 3 0 4 2 − λ Thus the eigenvalues of A are λ1 = −3, λ2 = 2, and λ3 = 7. To find the eigenspace corresponding to λ1 = −3, we row-reduce     3 5 3 0 0 1 0 − 0 4 h i     5  → 0 1 A + 3I 0 =  0 0 3 5 4    . 4 0 4 5 0 0 0 0 0   3 A basis for the eigenspace is given (after clearing fractions) by −5. 4 To find the eigenspace corresponding to λ2 = 2,  h i 0 3 A − 2I 0 =  3 0 0 4

we row-reduce   0 0 1 0     → 4 0 0 1 0 0 0 0

4 3

0 0

 0  0 . 0

5.4. ORTHOGONAL DIAGONALIZATION OF SYMMETRIC MATRICES

409

  −4 A basis for the eigenspace is given (after clearing fractions) by  0. 3 To find the eigenspace corresponding to λ3 = 7, we row-reduce    1 0 0 −5 3 0 i  h      4 0 → 0 1 A − 7I 0 =  3 −5 4 −5

0



− 43

0

− 54

 0 .

0

0 0   3 A basis for the eigenspace is given (after clearing fractions) by 5. 4

0

0

These eigenvectors are orthogonal, as is guaranteed by Theorem 5.19, so to √ turn them into √ orthonormal vectors, we normalize them. The norms of these vectors are respectively 5 2, 5, and 5 2, so setting   3 4 3 √ √ − 5 5 2  5 12 √ √1  Q= − 0  2 2  4 √ 5 2

3 5

4 √ 5 2

we see that Q is an orthonormal matrix whose columns are the normalized eigenvectors for A, so that   −3 0 0 Q−1 AQ = QT AQ =  0 2 0 = D. 0 0 7 7. The characteristic polynomial of A is 1 − λ 0 0 1−λ det(A − λI) = −1 0

−1 0 = −λ3 + 3λ2 − 2λ = −λ(λ − 1)(λ − 2). 1 − λ

Thus the eigenvalues of A are λ1 = 0, λ2 = λ1 = 0, we row-reduce  1 h i  A − 0I 0 =  0 −1   1 A basis for the eigenspace is given by 0. 1

1, and λ3 = 2. To find the eigenspace corresponding to 0 1 0

−1 0 1

  0 1 0   0 → 0 1 0 0 0

−1 0 0

To find the eigenspace corresponding to λ2 = 1, we row-reduce    0 0 −1 0 1 0 0   0 0 → 0 0 1 A−I 0 = 0 0 −1 0 0 0 0 0 0   0 A basis for the eigenspace is given by 1. 0 To find the eigenspace corresponding to λ3 = 2, we row-reduce    −1 0 −1 0 1 0 1   0 0 → 0 1 0 A − 2I 0 =  0 −1 −1 0 −1 0 0 0 0

 0  0 . 0

 0 0 . 0

 0 0 . 0

410

CHAPTER 5. ORTHOGONALITY   −1 A basis for the eigenspace is given by  0. 1 These eigenvectors are orthogonal, as is guaranteed by Theorem 5.19, so√to turn them into orthonormal vectors, we normalize them. The norm of the first and third vectors is 2, while the second is already a unit vector, so setting   √1 √1 0 − 2  2 Q= 0 1 0   √1 √1 0 2 2 we see that Q is an orthonormal matrix whose columns are  0 0 Q−1 AQ = QT AQ = 0 1 0 0

the normalized eigenvectors for A, so that  0 0 = D. 2

8. The characteristic polynomial of A is 1 − λ 2 2 1−λ 2 = −λ3 + 3λ2 + 9λ + 5 = −(λ + 1)2 (λ − 5) det(A − λI) = 2 2 2 1 − λ Thus the eigenvalues of A are λ1 = −1 and λ1 = −1, we row-reduce  2   A + I 0 = 2 2

λ2 = 5. To find the eigenspace E−1 corresponding to 2 2 2 2 2 2

  0 1 0 → 0 0 0

1 0 0

 0 0 . 0

1 0 0

Thus the eigenspace E−1 is        −s − t −1 −1  s  = span x1 =  1 , x2 =  0 . t 0 1 To find the eigenspace E5 corresponding to λ2 = 5, we  −4 2 2   A − 5I 0 =  2 −4 2 2 2 −4   1 Thus a basis for E5 is v3 = 1. 1

row-reduce   1 0 0 0 → 0 1 0 0 0

−1 −1 0

 0 0 . 0

E5 is orthogonal to E−1 as guaranteed by Theorem 10.19, so to turn this basis into an orthogonal basis we must orthogonalize the basis for E−1 above. Use Gram-Schmidt:         −1   −1 −1 −1  2 v · x 1 1 2 1 v1 =  1 , v2 = x2 − v1 =  0 −  1 =  − 2 . v1 · v1 2 0 1 0 1 We double v2 to remove fractions, so we get the orthogonal basis       −1 −1 1 v1 =  1 , v2 = −1 , v3 = 1 0 2 1

5.4. ORTHOGONAL DIAGONALIZATION OF SYMMETRIC MATRICES

411

√ √ √ The norms of these vectors are, respectively, 2, 6, and 3, so normalizing each vector and setting Q equal to the matrix whose columns are the normalized eigenvectors gives   − √12 − √16 √13   1 Q= − √16 √13  ,  √2 1 2 √ √ 0 6 3 so that

 −1 Q−1 AQ = QT AQ =  0 0

9. The characteristic polynomial of A is 1 − λ 1 0 1 1 − λ 0 det(A − λI) = 0 1−λ 0 0 0 1 Thus the eigenvalues of A are λ1 = 0 and we row-reduce  1   1 A + 0I 0 =  0 0

 0 0 −1 0 = D. 0 5

0 0 = λ4 − 4λ3 + 4λ2 = λ2 (λ − 2)2 . 1 1 − λ

λ2 = 2. To find the eigenspace E0 corresponding to λ1 = 0,    1 1 0 0 0 1 0 0 0   1 0 0 0  → 0 0 1 1 0 .   0 1 1 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0

Thus the eigenspace E0 is        0 −1 −s  0   1  s   = span x1 =   , x2 =   . −1   0  −t 1 0 t To find the eigenspace E2 corresponding to λ2 = 2, we  −1 1 0 0    1 −1 0 0 A − 2I 0 =   0 0 −1 1 0 0 1 −1

row-reduce   0 1 −1 0 0 0 0 1 −1 0 → 0 0 0 0 0 0 0 0 0 0

 0 0 . 0 0

Thus the eigenspace E2 is        s 1 0 s  1 0   = span x3 =   , x4 =   .  t  0 1 t 0 1 E0 is orthogonal to E2 as guaranteed by Theorem 10.19, and x1 and x2 are also orthogonal, as are x3 √ and x4 . So these four vectors form an orthogonal basis. The norm of each vector is 2, so normalizing each vector and setting Q equal to the matrix whose columns are the normalized eigenvectors gives   − √12 0 √12 0    √1 0 √12 0   2 Q=   √1 √1  0 − 0  2 2 √1 0 0 √12 2

412

CHAPTER 5. ORTHOGONALITY so that

 0  0 Q−1 AQ = QT AQ =  0 0

10. The characteristic polynomial of A is 2 − λ 0 0 0 1−λ 0 det(A − λI) = 0 1−λ 0 1 0 0

0 0 0 0

0 0 2 0

 0 0  = D. 0 2

1 0 = λ4 − 6λ3 + 12λ2 − 10λ + 3 = (λ − 1)3 (λ − 3) 0 2 − λ

Thus the eigenvalues of A are λ1 = 1 and λ2 = 3. we row-reduce  1 0 0 1   0 0 0 0 AI 0 =  0 0 0 0 1 0 0 1

To find the   1 0 0 0 → 0 0 0 0

eigenspace E1 corresponding to λ1 = 1,  0 0 1 0 0 0 0 0 . 0 0 0 0 0 0 0 0

Thus the eigenspace E1 is     −1 −s  0  t    = span   ,  0 u 1 s

  0 1  , 0 0

  0 0   . 1 0

Note that these vectors are mutually orthogonal. To find the eigenspace E3 corresponding to λ2 = 3, we  −1 0 0 1    0 −2 0 0 A − 3I 0 =   0 0 −2 0 1 0 0 −1   1 0  . Thus a basis for E3 is   0 1

row-reduce   1 0 0 0 → 0 0 0 0

0 0 1 0 0 1 0 0

−1 0 0 0

 0 0 . 0 0

Since the eigenvectors of E1 are orthogonal to those of E3 by Theorem 10.19, and the eigenvectors of E3 are already all we need to do is to normalize the vectors. The norms are, √ mutually orthogonal, √ respectively, 2, 1, 1, and 2, so setting Q equal to the matrix whose columns are the normalized eigenvectors gives   1 − √2 0 0 √12    0 1 0 0  , Q= 0 1 0  0  1 √1 √ 0 0 2 2 so that

 1 0 −1 T Q AQ = Q AQ =  0 0

0 1 0 0

0 0 1 0

 0 0  = D. 0 3

11. The characteristic polynomial is a − λ b det(A − λI) = = (a − λ)2 − b2 . b a − λ

5.4. ORTHOGONAL DIAGONALIZATION OF SYMMETRIC MATRICES This is zero when a − λ = ±b, so the eigenvalues are a + b and eigenspaces, row-reduce:      −b b 0 1 A − (a + b)I 0 = → b −b 0 0      1 1 b b 0 A − (a − b)I 0 = → b b 0 0 0

413

a − b. To find the corresponding −1 0 0 0  0 . 0



    1 −1 So bases for the two eigenspaces (which are orthogonal by Theorem 5.19) are and . Each of 1 1 √ these vectors has norm 2, so normalizing each gives # " √1 √1 − 2 , Q = 12 1 √



2

so that Q−1 AQ = QT AQ =

2

  a+b 0 . 0 a−b

12. The characteristic polynomial is (expanding along the second row) a − λ 0 b  a−λ 0 = (a − λ) (a − λ)2 − b2 . det(A − λI) = 0 b 0 a − λ This is zero when a = λ or when a − λ = ±b, so the eigenvalues are a, a + b and a − b. To find the corresponding eigenspaces, row-reduce:     0 0 b 0 1 0 0 0   A − aI 0 = 0 0 0 0 → 0 0 1 0 b 0 0 0 0 0 0 0     −b 0 b 0 1 0 −1 0   0 0 → 0 1 0 0 A − (a + b)I 0 =  0 −b b 0 −b 0 0 0 0 0     1 0 1 0 b 0 b 0   A − (a − b)I 0 = 0 b 0 0 → 0 1 0 0 . b 0 b 0 0 0 0 0       0 1 −1 So bases for the three eigenspaces (which are orthogonal by Theorem 5.19) are 1, 0 and  0. 0 1 1 Normalizing the second and third vectors gives   0 √12 − √12   Q= 0 0 1 , 1 1 √ 0 √2 2 so that

 a Q−1 AQ = QT AQ = 0 0

0 a+b 0

 0 0 . a−b

13. (a) Since A and B are each orthogonally diagonalizable, they are each symmetric by the Spectral Theorem. Therefore A + B is also symmetric, so it too is orthogonally diagonalizable by the Spectral Theorem.

414

CHAPTER 5. ORTHOGONALITY (b) Since A is orthogonally diagonalizable, it is symmetric by the Spectral Theorem. Therefore cA is also symmetric, so it too is orthogonally diagonalizable by the Spectral Theorem. (c) Since A is orthogonally diagonalizable, it is symmetric by the Spectral Theorem. Therefore A2 = AA is also symmetric, so it too is orthogonally diagonalizable by the Spectral Theorem.

14. If A is invertible and orthogonally diagonalizable, then QT AQ = D for some invertible diagonal matrix D and orthogonal matrix Q, so that −1 −1 T D−1 = QT AQ = Q−1 A−1 QT = QT A−1 QT = QT A−1 Q. Thus A−1 is also orthogonally diagonalizable (and by the same orthogonal matrix). 15. Since A and B are orthogonally diagonalizable, they are both symmetric. Since AB = BA, Exercise 36 in Section 3.2 implies that AB is also symmetric, so it is orthogonally diagonalizable as well. 16. Let A be a symmetric matrix. Then it is orthogonally diagonalizable, say D = QT AQ. Then the diagonal entries of D are the eigenvalues of A. First, if every eigenvalue of A is nonnegative, let √    λ1 · · · 0 λ1 · · · 0 √    ..  . .. D =  ... . . . ...  , and define D =  ...  . √. 0 · · · λn λn 0 ··· √ Let B = Q DQT . Then B is diagonally orthogonalizable by construction, so it is symmetric. Further, since Q is orthogonal, √ √ √ √ √ √ B 2 = Q DQT Q DQT = Q DQ−1 Q DQT = Q D DQT = QDQT = A. Conversely, suppose A = B 2 for some symmetric matrix B. Then B is orthogonally diagonalizable, say QT BQ = D, so that QT AQ = QT B 2 Q = (QT BQ)(QT BQ) = D2 . But D2 is a diagonal matrix whose diagonal entries are the squares of the diagonal entries of D, so they are all nonnegative. Thus the eigenvalues of A are all nonnegative. 17. From Exercise 1 we have " λ1 = 3,

q1 =

− √12

#

" ,

√1 2

λ2 = 5,

q2 =

√1 2 √1 2

# .

Therefore " A = λ1 q1 qT1 + λ2 q2 qT2 = 3

1 2 − 21

− 12

#

1 2

" +5

1 2 1 2

1 2 1 2

#

√1 2 √1 2

#

.

18. From Exercise 2 we have " λ1 = −4,

q1 =

− √12 √1 2

#

" ,

λ2 = 2,

q2 =

.

Therefore " A=

λ1 q1 qT1

+

λ2 q2 qT2

= −4

1 2 − 12

− 12 1 2

#

" +2

1 2 1 2

1 2 1 2

# .

5.4. ORTHOGONAL DIAGONALIZATION OF SYMMETRIC MATRICES

415

19. From Exercise 5 we have  λ1 = 4,

0

 1  √  q1 =   2 ,



  1    q2 =  0 , 0

 λ2 = 5,

√1 2

λ3 = −2,

0



 1  √  q2 =  − 2  . √1 2

Therefore  0  T T T  A = λ1 q1 q1 + λ2 q2 q2 + λ3 q3 q3 = 4 0 0

 1   1 + 5  0 2 1 0 2

0

0

1 2 1 2



0 0 0

  0 0     0 − 2 0 0 0

1 2 − 12





 0  − 21  .

0

1 2

20. From Exercise 8 we have λ1 = −1,

  − √16  1   q2 =  − √6  ,

  − √12   1  q1 =   √2  , 0

λ2 = 5,

q3 =

√2 6

√1  13  √  .  3 √1 3

Therefore 

A = λ1 q1 qT1 + λ1 q2 qT2 + λ2 q3 qT3 =

1  2 1 − − 2

− 12 1 2

0

0

  1 0   6  1 0 − 6 0 − 13

1 6 1 6 − 31

− 13



 − 13  + 2 3



1 3 1 5 3 1 3



1 3 1 3 1 3

1 3 1 . 3 1 3

− 32

#

21. Since v1 and v2 are already orthogonal, normalize each of them to get " # " # q1 =

√1 2 √1 2

,

q2 =

√1 2 − √12

.

Now let Q be the matrix whose columns are q1 and q2 ; then the desired matrix is # " # " " #" #" # " 1 √1 √1 √1 √1 −1 0 −1 0 −1 0 T −1 2 2 2 2 Q = 12 Q =Q Q = √ 0 2 0 2 0 2 √12 − √12 − 32 − √12 2

.

1 2

22. Since v1 and v2 are already orthogonal, normalize each of them to get " # " # √1 − √25 5 q1 = 2 , q2 = . 1 √



5

5

Now let Q be the matrix whose columns are q1 and q2 ; then the desired matrix is " # " # " #" #" # " √1 √1 √2 − √25 3 3 0 3 0 0 − 59 −1 T 5 5 5 Q Q =Q Q = 2 = 12 √ √1 0 −3 0 −3 0 −3 − √25 √15 5 5 5

12 5 9 5

# .

√ √ √ 23. Note that v1 , v2 , and v3 are mutually orthogonal, with norms 2, 3, and 6 respectively. Let Q be the matrix whose columns are q1 = kvv11 k , q2 = kvv22 k , and q3 = kvv33 k ; then the desired matrix is     1 0 0 1 0 0   −1   T    Q 0 2 0 Q = Q 0 2 0 Q 0

0

3

0

 =

√1  12 √  2

0

0

3

√1 3 − √13 √1 3

 1  1   √ 0 6  2 √ 0 6

− √16

0 2 0

 √1 0   12  0   √3 3 − √16

√1 2 − √13 √1 6

  5 0 3   √1  = − 2  3 3 √2 − 13 6

− 32 5 3 1 3

− 13

 

1 . 3 8 3

416

CHAPTER 5. ORTHOGONALITY

√ √ √ 24. Note that v1 , v2 , and v3 are mutually orthogonal, with norms 42, 3, and 14 respectively. Let Q be the matrix whose columns are q1 = kvv11 k , q2 = kvv22 k , and q3 = kvv33 k ; then the desired matrix is  1   Q 0 0

  1 0   −1   −4 0 Q = Q 0 0 0 −4 

 0  T −4 0 Q 0 −4 0

0

=

√4  542  √  42 − √142

 − 44  21 50 =  21 10 − 21



− √13

√2 14  1 1  − √14   0 3 √ 0 14

√1 3 √1 3 50 21 43 − 42 25 − 42

− 10 21

 √4 0   42  1 −4 0  − √3 √2 0 −4 14 0

√5 42 √1 3 − √114

− √142

 

√1  3 √3 14



 . − 25 42 

− 163 42

25. Since q is a unit vector, q · q = 1, so that  projW v =

q·v q·q



q = (q · v)q = q(q · v) = q(qT v) = (qqT )v.

26. (a) By definition (see Section 5.2), the projection of v onto W is  projW v =

q1 · v q1 · q1



 q1 + · · · +

qk · v qk · qk

 qk

= (q1 · v)q1 + · · · + (qk · v)qk = q1 (q1 · v) + · · · + qk (qk · v) = q1 (qT1 v) + · · · + qk (qTk · v) = (q1 qT1 + · · · + qk qTk )v. Therefore the matrix of the projection is indeed q1 qT1 + · · · + qk qTk . (b) First, T + · · · + qk qTk T T = qT1 qT1 + · · · + qTk qk = q1 qT1 + · · · + qk qTk = P.

P T = (q1 qT1 + · · · + qk qTk )T = q1 qT1

T

Next, since qi · qj = 0 for i 6= j, and qi · qi = qTi qi = 1, we have P 2 = (q1 qT1 )(q1 qT1 ) + · · · + (qk qTk )(qk qTk )   = q1 qT1 q1 qT1 + · · · + qk qTk qk qTk = q1 qT1 + · · · qk qTk = P. (c) We have  QQT = q1

···

 T q   .1  qk  ..  = q1 qT1 + · · · qk qTk = P. qTk

Thus rank(P ) = rank(QQT ) = rank(Q) by Theorem 3.28 in Section 3.5. But if q1 , . . . , qk is an orthonormal basis, then rank(Q) = k, so that rank(P ) = k as well.

5.5. APPLICATIONS

417

27. Suppose the eigenvalues are λ1 , . . . , λn and that v1 , . . . , vn are a corresponding orthonormal set of eigenvectors. Let Q1 = v1 · · · vn . Then  T  T v1 v1        QT1 AQ1 =  ...  A v1 · · · vn =  ...  Av1 · · · Avn vnT

vnT

 T v1  ..   =  .  λ1 v1

···

vnT

  λ λn vn = 1 0

 ∗ = T, A1

where A1 is an (n − 1) × (n − 1) matrix. The result now follows by induction since T is block upper triangular. 28. We know that if A is nilpotent, then its only eigenvalue is 0. So if A is nilpotent, then all of its eigenvalues are real, so that by Exercise 27 there is an orthogonal matrix Q such that QT AQ is upper triangular. But A and QT AQ have the same eigenvalues; since the eigenvalues of QT AQ are the diagonal entries, all the diagonal entries must be zero.

5.5

Applications

    x 2 3  x y = 2x2 + 6xy + 4y 2 . y 3 4     x1 5 1  T x1 x2 = 5x21 + 2x1 x2 − x22 . 2. f (x) = x Ax = x2 1 −1     1 3 −2  T 1 6 = 3 · 12 − 4 · 1 · 6 + 4 · 62 = 123. 3. f (x) = x Ax = 6 −2 4    1 0 −3  x  1 x y z = 1x2 + 2y 2 + 3z 2 + 0xy − 6xz + 2yz = x2 + 2y 2 + 3z 2 − 6xz + 2yz. 4. f (x) = y   0 2 −3 1 3 z

1. f (x) = xT Ax =

5. From Exercise 4, 

6.

7. 8. 9.

 2 f −1 = 22 + 2 · (−1)2 + 3 · 12 − 6 · 2 · 1 + 2 · (−1) · 1 = −5. 1    1 2 2 0   f (x) = xT Ax = 2 2 0 1 1 2 3 = 2 · 12 + 0 · 22 + 1 · 32 + 4 · 1 · 2 + 0 · 1 · 3 + 2 · 2 · 3 = 31. 3 0 1 1   1 3 The diagonal entries are 1 and 2, and the corners are each 21 · 6 = 3, so A = . 3 2   0 1 The diagonal entries are both zero, and the corners are each 21 · 1 = 12 , so A = 1 2 . 0 2   3 − 32 1 3 The diagonal entries are 3 and −1, and the corners are each 2 · (−3) = − 2 , so A = . − 23 −1

10. The diagonal entries are 1, 0, and −1, while the off-diagonal entries are Thus   1 4 0 0 −3 . A = 4 0 −3 −1

1 2

· 8 = 4, 0, and

1 2

· (−6) = −3.

418

CHAPTER 5. ORTHOGONALITY 1 2

· 2 = 1,

1 2

· (−4) = −2, and

12. The diagonal entries are 2, −3, and 1, while the off-diagonal entries are 21 · 0 = 0, 1 2 · 0 = 0. Thus   2 0 −2 0 . A =  0 −3 −2 0 1   2 −2 13. The matrix of this form is A = , so that its characteristic polynomial is −2 5 2 − λ −2 = λ2 − 7λ + 6 = (λ − 1)(λ − 6). −2 5 − λ

1 2

· (−4) = −2, and

11. The diagonal entries are 5, −1, and 2, while the off-diagonal entries are 1 2 · 4 = 2. Thus   5 1 −2 2 . A =  1 −1 −2 2 2

So the eigenvalues are the eigenvectors by the usual method gives corresponding  1 and 6; finding  2 1 eigenvectors v1 = and v2 = . Normalizing these and constructing Q gives 1 −2 " Q=

√2 5 √1 5

√1 5 − √25

# .

The new quadratic form is   1 y2 0

T

 f (y) = y Dy = y1

14. The matrix of this form is A =

 1 4

1 − λ 4

0 6

  y1 = y12 + 6y22 . y2

 4 , so that its characteristic polynomial is 1 4 = λ2 − 2λ − 15 = (λ − 5)(λ + 3). 1 − λ

So the eigenvaluesare the eigenvectors by the usual method gives corresponding  5 and −3;  finding  1 1 eigenvectors v1 = and v2 = . Normalizing these and constructing Q gives 1 −1 " Q=

√1 2 √1 2

√1 2 − √12

# .

The new quadratic form is  f (y) = yT Dy = y1

y2

  5 0

  0 y1 = 5y12 − 3y22 . −3 y2

  7 4 4 1 −8, so that its characteristic polynomial is 15. The matrix of this form is A = 4 4 −8 1 7 − λ 4 4 4 1−λ −8 = (λ − 9)2 (λ + 9). 4 −8 1 − λ

5.5. APPLICATIONS

419

So the eigenvalues are 9 (of algebraic multiplicity the eigenvectors  2) and −9; finding   by the usual 2 2 −1 method gives corresponding eigenvectors v1 = 0 and v2 = 1 for λ = 9 and  2 for λ = −9. 1 0 2 Orthogonalizing v1 and v2 gives    2     2 2 5 4 v2 · v1 v1 = 1 − 0 =  1 . v20 = v2 − v1 · v1 5 1 0 − 45 Normalizing these and constructing Q gives 

√2  5

Q=  0 √1 5

2 √ 3 5 5 √ 3 5 4 − 3√ 5

− 31

 

2 . 3 2 3

The new quadratic form is  f (y) = yT Dy = y1

y2

  9 y3 0 0

0 9 0

  0 y1 0 y2  = 9y12 + 9y22 − 9y32 . −9 y3

 −2 0 1 0, so that its characteristic polynomial is 0 3



1 16. The matrix of this form is A = −2 0

−2 0 1−λ 0 = (λ − 3)2 (λ + 1). 0 3 − λ

1 − λ −2 0

So the eigenvalues are 3 (of algebraic multiplicity  and −1; finding   the eigenvectors  by the usual  2) 0 1 1 method gives corresponding eigenvectors v1 = −1 and v2 = 0 for λ = 3 and 1 for λ = −1. 1 0 0 Since v1 and v2 are already orthogonal, we normalize all three vectors and construct Q to get   √1 √1 0 2 2  1 1  Q= − √2 0 √2  . 0 1 0 The new quadratic form is  f (y) = yT Dy = y1



1 17. The matrix of this form is A = −1 1 1 − λ −1 1

y2

  3 y3 0 0

0 3 0

  0 y1 0 y2  = 3y12 + 3y22 − y32 . −1 y3

 −1 0 0 1, so that its characteristic polynomial is 1 1 −1 −λ 1

0 1 = (λ − 2)(λ − 1)(λ + 1). 1 − λ

420

CHAPTER 5. ORTHOGONALITY So the eigenvalues are  2, 1, and−1;  finding  the  eigenvectors by the usual method gives corresponding 1 1 1 eigenvectors v1 = −1, v2 = 0, and  2. Normalize all three vectors and construct Q to get −1 1 −1   √1

 13 Q= − √3 − √13

√1 2

0 √1 2

√1 6 √2  . 6 − √16

The new quadratic form is f (x0 ) = (x0 )T Dx0 = x0 

y0

 2  z 0 0 0

0 1 0

  0 x0 0 y 0  = 2(x0 )2 + (y 0 )2 − (z 0 )2 . −1 z0

  0 1 1 18. The matrix of this form is A = 1 0 1, so that its characteristic polynomial is 1 1 0 −λ 1 1 1 −λ 1 = (λ − 2)(λ + 1)2 . 1 1 −λ So the eigenvalues are 2 and −1; finding the eigenvectors by the  usual method gives corresponding  0 −2 1 eigenvectors v1 = 1 for λ = 2 and v2 =  1 and v3 =  1 for λ = −1. The latter two are −1 1 1 already orthogonal, so simply normalize all three vectors and construct Q to get   √1 √2 − 0 6   13 √ √1 √1  . Q=  3 6 2 √1 √1 − √12 3 6 The new quadratic form is  f (x0 ) = (x0 )T Dx0 = x0

 1 19. The matrix of this form is A = 0

y0

 2  z 0 0 0

  0 0 x0 −1 0 y 0  = 2(x0 )2 − (y 0 )2 − (z 0 )2 . 0 −1 z0

 0 , so that its characteristic polynomial is 2 1 − λ 0 = (1 − λ)(2 − λ). 0 2 − λ

Since the eigenvalues, 1 and 2, are both positive, this is a positive definite form.   1 −1 20. The matrix of this form is A = , so that its characteristic polynomial is −1 1 1 − λ −1 = λ(λ − 2). −1 1 − λ Since the eigenvalues, 0 and 2, are both nonnegative, but one is zero, this is a positive semidefinite form.

5.5. APPLICATIONS

421

21. The matrix of this form is A =

 −2 1

 1 , so that its characteristic polynomial is −2

−2 − λ 1

1 = (λ + 3)(λ + 1). −2 − λ

Since the eigenvalues, −1 and −3, are both negative, this is a negative definite form.   1 2 22. The matrix of this form is A = , so that its characteristic polynomial is 2 1 1 − λ 2

2 = (λ − 3)(λ + 1). 1 − λ

Since the eigenvalues, −1 and 3, have opposite signs, this is an indefinite form.   2 1 1 23. The matrix of this form is A = 1 2 1, so that its characteristic polynomial is 1 1 2 2 − λ 1 1 Since both eigenvalues, 1 and 4, are  1 24. The matrix of this form is A = 0 1

1 1 2−λ 1 = −(λ − 1)2 (λ − 4). 1 2 − λ positive, this is a positive definite form.  0 1 1 0, so that its characteristic polynomial is 0 1 0 1 1−λ 0 = −λ(λ − 1)(λ − 2). 0 1 − λ

1 − λ 0 1 Since the eigenvalues, 0, 1, and 2, semidefinite form.  1 25. The matrix of this form is A = 2 0 1 − λ 2 0

are all nonnegative, but one of them is zero, this is a positive 2 1 0

 0 0, so that its characteristic polynomial is −1

2 1−λ 0

Since the eigenvalues, −1 and 3, have  −1 26. The matrix of this form is A = −1 −1 −1 − λ −1 −1

0 0 = −(λ + 1)2 (λ − 3). −1 − λ

opposite signs, this is an indefinite form.  −1 −1 −1 −1, so that its characteristic polynomial is −1 −1 −1 −1 − λ −1

−1 −1 = −λ2 (λ + 3) −1 − λ

Since the eigenvalues, 0 and −3, are both nonpositive but one of them is zero, this is a negative semidefinite form.

422

CHAPTER 5. ORTHOGONALITY

27. By the Principal Axes Theorem, if A is the matrix of f (x), then there is an orthogonal matrix Q such that QT AQ = D is a diagonal matrix, and then if y = QT x, we have f (x) = xT Ax = yT Dy = λ1 y12 + · · · + λn yn2 . • For 5.22(a), f (x) is positive definite if and only if λ1 y12 + · · · + λn yn2 > 0 for all y1 , . . . , yn . But this happens if and only if λi > 0 for all i, since if some λi ≤ 0, take yi = 1 and all other yj = 0 to get a value for f (x) that is not positive. • For 5.22(b), f (x) is positive semidefinite if and only if λ1 y12 + · · · + λn yn2 ≥ 0 for all y1 , . . . , yn . But this happens if and only if λi ≥ 0 for all i, since if some λi < 0, take yi = 1 and all other yj = 0 to get a value for f (x) that is negative. • For 5.22(c), f (x) is negative semidefinite if and only if λ1 y12 + · · · + λn yn2 < 0 for all y1 , . . . , yn . But this happens if and only if λi < 0 for all i, since if some λi ≥ 0, take yi = 1 and all other yj = 0 to get a value for f (x) that is nonnegative. • For 5.22(d), f (x) is negative semidefinite if and only if λ1 y12 + · · · + λn yn2 ≤ 0 for all y1 , . . . , yn . But this happens if and only if λi ≤ 0 for all i, since if some λi > 0, take yi = 1 and all other yj = 0 to get a value for f (x) that is positive. • For 5.22(e), f (x) is indefinite if and only if λ1 y12 + · · · + λn yn2 can be either negative or positive. But from the arguments above, we see that this can happen if and only if some λi is positive and some λj is negative. 28. The given matrix represents the form ax2 + 2bxy + dy 2 , which can be written in the form given in the hint: 2    b2 b y2 . ax2 + 2bxy + dy 2 = a x + y + d − a a This is a positive definite form if and only if the right-hand side is positive for all values of x and y not 2 both zero. First suppose that a > 0 and det A = ad − b2 > 0. Then ad > b2 so that d > ba (using the 2 fact that a > 0), and thus d − ba > 0. Therefore the first term on the right-hand side is always positive if either x or y is nonzero since a > 0 and nonzero squares are always positive, while the second term is always nonnegative since the constant factor is positive. Thus the form is positive definite. Next assume the form is positive definite; we must show a > 0 and det A > 0. By way of contradiction, assume a ≤ 0; then setting x = 1 and y = 0 we get a ≤ 0 for the value of the form, which contradicts the assumption that it is positive definite. Thus a > 0. Next assume that det A = ad − b2 ≤ 0; then 2 2 (since a > 0) d ≤ ba . Now setting x = − ab and y = 1 gives d − ba ≤ 0 for the value of the form, which is again a contradiction. Thus a > 0 and det A > 0. 2

29. Since A = B T B, we have xT Ax = xT B T Bx = (Bx)T (Bx) = kBxk ≥ 0. Now, if xT Ax = 0, then 2 kBxk = 0. But since B is invertible, this implies that x = 0. Therefore xT Ax > 0 for all x 6= 0, so that A is positive definite. 30. Since A is symmetric and positive definite, we can find an orthogonal matrix Q and a diagonal matrix D such that A = QDQT , where   λ1 · · · 0   D =  ... . . . ...  , where λi > 0 for all i. 0 Let

···

λn √

λ1  .. T C=C = . 0

··· .. . ···

 0 ..  ;  √. λn

then D = C 2 = C T C. This gives A = QDQT = QC T CQT = (CQT )T CQT .

5.5. APPLICATIONS

423

Since all the λi are positive, C is invertible; since Q is orthogonal, it is invertible. Thus B = CQT is invertible and we are done.  31. (a) xT (cA)x = c xT Ax > 0 because c > 0 and A is positive definite.   2 (b) xT A2 x = xT Ax xT Ax = xT Ax > 0.   (c) xT (A + B)x = xT Ax + xT Bx > 0 since both A and B are positive definite. (d) First note that A is invertible by Exercise 30, since it factors as A = B T B where B is invertible. n o Since A is positive definite, its eigenvalues {λi } are positive; the eigenvalues of A−1 are λ11 , which are also positive. Thus A−1 is positive definite. 32. From Exercise 30, we have A = QC T CQT , where Q is orthogonal and C is a diagonal matrix. But then C T = C, so that A = QCCQT = QCQT QCQT = (QCQT )(QCQT ) = B 2 . B is symmetric since B = QCQT is orthogonally diagonalizable, and it is positive definite since its eigenvalues are the diagonal entries of C, which are all positive. 33. The eigenvalues of this form, from Exercise 20, are λ1 = 2 and λ2 = 0, so that by Theorem 5.23, the maximum value of f subject to the constraint kxk = 1 is 2 and the minimum value is 0. These occur at the unit eigenvectors of A, which are found by row-reducing:       −1 −1 0 1 1 0 A − 2I 0 = → −1 −1 0 0 0 0       1 −1 0 1 −1 0 A − 0I 0 = → . −1 1 0 0 0 0 So normalized eigenvectors are " # " # √1 1 1 2 v1 = √ = , 2 −1 − √12

" # " # √1 1 1 = 12 . v2 = √ √ 2 1 2

Therefore the maximum value of 2 occurs at x = ±v1 , and the minimum of 0 occurs at x = ±v2 . 34. The eigenvalues of this form, from Exercise 22, are λ1 = 3 and λ2 = −1, so that by Theorem 5.23, the maximum value of f subject to the constraint kxk = 1 is 3 and the minimum value is −1. These occur at the unit eigenvectors of A, which are found by row-reducing:       −2 2 0 1 −1 0 A − 3I 0 = → 2 −2 0 0 0 0       2 2 0 1 1 0 A+I 0 = → . 2 2 0 0 0 0 So normalized eigenvectors are " # " # √1 1 1 v1 = √ = 12 , √ 2 1 2

" # " # √1 1 1 2 . v2 = √ = 2 −1 − √12

Therefore the maximum value of 3 occurs at x = ±v1 , and the minimum of −1 occurs at x = ±v2 . 35. The eigenvalues of this form, from Exercise 23, are λ1 = 4 and λ2 = 1, so that by Theorem 5.23, the maximum value of f subject to the constraint kxk = 1 is 4 and the minimum value is 1. These occur at the unit eigenvectors of A, which are found by row-reducing:     −2 1 1 0 1 0 −1 0   A − 4I 0 =  1 −2 1 0 → 0 1 −1 0 1 1 −2 0 0 0 0 0     1 1 1 0 1 1 1 0   A − I 0 = 1 1 1 0 → 0 0 0 0 . 1 1 1 0 0 0 0 0

424

CHAPTER 5. ORTHOGONALITY So normalized eigenvectors are



    √1 1   13  1  v1 = √  1 =  √  , 3    13  √ 1 3   

 √1 1 2   1   = − √1  , v2 = √  −1 2 2   0 0

1







√1 2

 1   v3 = √  = 0 0  .   2 1 − √2 −1

Therefore the maximum value of 4 occurs at x = ±v1 , and the minimum of 1 occurs at x = ±v2 as well as at x = ±v3 . 36. The eigenvalues of this form, from Exercise 24, are λ1 = 2, λ2 = 1, and λ3 = 0, so that by Theorem 5.23, the maximum value of f subject to the constraint kxk = 1 is 2 and the minimum value is 0. These occur at the corresponding unit eigenvectors of A, which are found by row-reducing:     −1 0 1 0 1 0 −1 0   A − 2I 0 =  0 −1 0 0 → 0 1 0 0 1 0 −1 0 0 0 0 0     1 0 1 0 1 0 1 0   A − 0I 0 = 0 1 0 0 → 0 1 0 0 . 1 0 1 0 0 0 0 0 So normalized eigenvectors are     √1 1   2 1  v1 = √  0 =  0  , 2   1  √ 1 2



1





√1 2



  1   v2 = √  = 0 0     . 2 1 √ −1 − 2

Therefore the maximum value of 2 occurs at x = ±v1 , and the minimum of 0 occurs at x = ±v2 . 37. Since λ1 ≥ λ2 ≥ · · · ≥ λn , we have 2

f (x) = λ1 y12 + λ2 y22 + · · · + λn yn2 ≥ λn (y12 + y22 + · · · + yn2 ) = λn kyk = λn since we are assuming the constraint kyk = 1. 38. If qn is a unit eigenvector corresponding to λn , then Aqn = λn qn , so that  f (qn ) = qTn Aqn = qTn λn qn = λn qTn qn = λn . Thus the quadratic form actually takes the value λn , so by property (a) of Theorem 5.23, it must be the minimum value of f (x), and it occurs when x = qn . 39. Divide this equation through by 25 to get x2 y2 + = 1, 25 5 which is an ellipse, from Figure 5.15. 40. Divide through by 4 and rearrange terms to get x2 y2 − = 1, 4 4 which is a hyperbola, from Figure 5.15.

5.5. APPLICATIONS

425

41. Rearrange terms to get y = x2 − 1, which is a parabola, from Figure 5.15. 42. Divide through by 8 and rearrange terms to get y2 x2 + = 1, 4 8 which is an ellipse, from Figure 5.15. 43. Rearrange terms to get y 2 − 3x2 = 1, which is a hyperbola, from Figure 5.15. 44. This is a parabola, from Figure 5.15. 45. Group the x and y terms and then complete the square, giving (x2 − 4x + 4) + (y 2 − 4x + 4) = −4 + 4 + 4

⇒ (x − 2)2 + (y − 2)2 = 4. 2

2

This is a circle of radius 2 centered at (2, 2). Letting x0 = x−2 and y 0 = y −2, we have (x0 ) +(y 0 ) = 4. y 4

3



x¢ 2

1

1

2

3

4

x

46. Group the x and y terms and simplify: (4x2 − 8x) + (2y 2 + 12y) = −6



4(x2 − 2x) + 2(y 2 + 6y) = −6.

Now complete the square and simplify again: 4(x2 − 2x + 1) + 2(y 2 + 6y + 9) = −6 + 4 + 18 2

2

4(x − 1) + 2(y + 3) = 16





(x − 1)2 (y + 3)2 + = 1. 4 8 This is an ellipse centered at (1, −3). Letting x0 = x − 1 and y 0 = y + 3, we have

(x0 )2 4

+

(y 0 )2 8

= 1.

426

CHAPTER 5. ORTHOGONALITY y 1

-1

2

3

x

-1 y¢ -2

x¢ -3

-4

-5

-6

47. Group the x and y terms to get 9x2 − 4(y 2 + y) = 37. Now complete the square and simplify: 2    2 y + 21 1 1 x2 2 2 2 − = 1. 9x − 4 y + y + = 37 − 1 ⇒ 9x − 4 y + = 36 ⇒ 4 2 4 9 2 2  (x0 ) (y 0 ) This is a hyperbola centered at 0, − 21 . Letting x0 = x and y 0 = y + 21 , we have 4 − 9 = 1.

y y¢ 3

2

1

-3

-2

1

-1

2

3

x

x¢ -1

-2

-3

48. Rearrange terms to get 3y − 13 = x2 + 10x; then complete the square: 3y − 13 + 25 = x2 + 10x + 25 = (x + 5)2



3y + 12 = (x + 5)2



y+4=

1 (x + 5)2 . 3

5.5. APPLICATIONS

427

This is a parabola. Letting x0 = x + 5 and y 0 = y + 4, we have y 0 =

1 3

2

(x0 ) . y x

-9

-8

-7

-6

-5

-4

-3

-2

-1 -1



-2 -3 x¢ -4

49. Rearrange terms to get 4x = −2y 2 − 8y; then divide by 2, giving 2x = −y 2 − 4y. Finally, complete the square: 1 2x − 4 = −(y 2 + 4y + 4) = −(y + 2)2 ⇒ x − 2 = − (y + 2)2 . 2 2 This is a parabola. Letting x0 = x − 2 and y 0 = y + 2, we have x0 = − 21 (y 0 ) . y 1 -3

-2

y¢ 1

-1

2

3

x

-1 x¢ -2 -3 -4

50. Complete the square and simplify: 2y 2 − 3x2 − 18x − 20y + 11 = 0 2



2

2(y − 10y + 25) − 3(x + 6x + 9) + 11 − 50 + 27 = 0 2

2

2(y − 5) − 3(x + 3) = 12 2





2

(y − 5) (x + 3) − = 1. 6 4 This is a hyperbola. Letting x0 = x + 3 and y 0 = y − 5, we have

(y0 ) 6

2



( x0 ) 4

y y¢

12

10

8

6 x¢ 4

2

-12

-10

-8

-6

-4

2

-2 -2

x

2

= 1.

428

CHAPTER 5. ORTHOGONALITY

51. The matrix of this form is

" A=

#

1

1 2

1 2

1

,

which has characteristic polynomial 1 3 (1 − λ) − = λ2 − 2λ + = 4 4 2

   1 3 λ− λ− . 2 2

Therefore the associated diagonal matrix is " D=

3 2

0

0

1 2

# ,

so that the equation becomes T

xT Ax = (x0 ) Dx0 =

3 0 2 1 0 2 (x ) + (y ) = 6. 2 2

Dividing through by 6 to simplify gives 2

2

(y 0 ) (x0 ) + = 1, 4 12 which is an ellipse. To determine the rotation matrix Q, find unit eigenvectors corresponding to λ1 = and λ2 = 21 by row-reduction to get " # √1 − √12 2 Q= 1 . 1 √

2



2

Then " e1 → Qe1 =

√1 2 √1 2

# e2 → Qe2 =

,

" # − √12 √1 2

,

so that the axis rotation is through a 45◦ angle. y 3 y¢

x¢ 2

1

-3

-2

1

-1

-1

-2

-3

52. The matrix of this form is A=

 4 5

 5 , 4

2

3

x

3 2

5.5. APPLICATIONS

429

which has characteristic polynomial (4 − λ)2 − 25 = λ2 − 8λ − 9 = (λ − 9)(λ + 1). Therefore the associated diagonal matrix is D=

 9 0

 0 , −1

so that the equation becomes T

2

2

xT Ax = (x0 ) Dx0 = 9 (x0 ) − (y 0 ) = 9. Dividing through by 9 to simplify gives 2

2

(x0 ) −

(y 0 ) = 1, 9

which is a hyperbola. To determine the rotation matrix Q, find unit eigenvectors corresponding to λ1 = 9 and λ2 = 1 by row-reduction to get " Q=

√1 2 √1 2

− √12

#

√1 2

.

Then " e1 → Qe1 =

√1 2 √1 2

# e2 → Qe2 =

,

" # − √12 √1 2

,

so that the axis rotation is through a 45◦ angle. y 3 y¢

x¢ 2

1

-3

-2

1

-1

2

3

x

-1

-2

-3

53. The matrix of this form is  4 A= 3

 3 , −4

which has characteristic polynomial (4 − λ)(−4 − λ) − 9 = λ2 − 25 = (λ − 5)(λ + 5)

430

CHAPTER 5. ORTHOGONALITY Therefore the associated diagonal matrix is 

5 D= 0

 0 , −5

so that the equation becomes T

2

2

xT Ax = (x0 ) Dx0 = 5 (x0 ) − 5 (y 0 ) = 5. Dividing through by 5 to simplify gives 2

2

(x0 ) − (y 0 ) = 1, which is a hyperbola. To determine the rotation matrix Q, find unit eigenvectors corresponding to λ1 = 9 and λ2 = 1 by row-reduction to get " # √3 √1 − 10 10 . Q= 1 3 √ 10



10

Then " e1 → Qe1 =

√3 10 √1 10

#

h e2 → Qe2 = − √110

,

so that the axis rotation is through an angle whose tangent is

√ 1/ 10 √ 3/ 10

√3 10

i

= 13 , or about 18.435◦ .

y 3



2 x¢

1

-3

-2

1

-1

2

3

x

-1

-2

-3

54. The matrix of this form is

 A=

3 −1

 −1 , 3

which has characteristic polynomial (3 − λ)(3 − λ) − 1 = λ2 − 6λ + 8 = (λ − 4)(λ − 2). Therefore the associated diagonal matrix is D=

 4 0

 0 , 2

,

5.5. APPLICATIONS

431

so that the equation becomes T

2

2

xT Ax = (x0 ) Dx0 = 4 (x0 ) + 2 (y 0 ) = 8. Dividing through by 8 to simplify gives 2

2

(x0 ) (y 0 ) + = 1, 2 4 which is an ellipse. To determine the rotation matrix Q, find unit eigenvectors corresponding to λ1 = 4 and λ2 = 2 by row-reduction to get " # Q=

√1 2 − √12

√1 2 √1 2

.

Then " e1 → Qe1 =

√1 2 − √12

#

" e2 → Qe2 =

,

√1 2 √1 2

# ,

so that the axis rotation is through a −45◦ angle. y 2 y¢

1

-2

1

-1

2

x

-1

x¢ -2

55. Writing the equation as xT Ax + Bx + C = 0, we have    √ 3 −2 A= , B = −28 2 −2 3

√  22 2 ,

C = 84.

A has characteristic polynomial (3 − λ)(3 − λ) − 4 = λ2 − 6λ + 5 = (λ − 5)(λ − 1). Therefore the associated diagonal matrix is D=

 5 0

 0 . 1

To determine the rotation matrix Q, find unit eigenvectors corresponding to λ1 = 5 and λ2 = 1 by row-reduction to get " # Q=

√1 2 − √12

√1 2 √1 2

.

432

CHAPTER 5. ORTHOGONALITY Thus h



√ i 22 2

BQ = −28 2

"

√1 2 − √12

√1 2 √1 2

#

h i = −50 −6 ,

so the equation becomes 2

2

5 (x0 ) − 50x0 + (y 0 ) − 6y 0 = −84 ⇒     2 2 5 (x0 ) − 10x0 + 25 + (y 0 ) − 6y 0 + 9 = −84 + 125 + 9 0

0

2

2

5(x − 5) + (y − 3) = 50 0

0

2





2

(y − 3) (x − 5) + = 1. 10 50 Letting x00 = x0 − 5 and y 00 = y 0 − 3 gives 2

2

(x00 ) (y 00 ) + = 1, 10 50 which is an ellipse. 56. Writing the equation as xT Ax + Bx + C = 0, we have     6 −2 A= , B = −20 −10 , −2 9

C = −5.

A has characteristic polynomial (6 − λ)(9 − λ) − 4 = λ2 − 15λ + 50 = (λ − 10)(λ − 15). Therefore the associated diagonal matrix is D=

 10 0

 0 . 5

To determine the rotation matrix Q, find unit eigenvectors corresponding to λ1 = 10 and λ2 = 5 by row-reduction to get # " − √15 √25 . Q= 2 1 √

5



5

Thus " h i − √1 5 BQ = −20 −10 2 √ 5

√2 5 √1 5

#

h = 0

√ i −10 5 ,

so the equation becomes √ 2 2 10 (x0 ) + 5 (y 0 ) − 10 5y 0 = 5 ⇒   √ 2 2 10 (x0 ) + 5 (y 0 ) − 2 5 y 0 + 5 = 5 + 25  √ 2 2 10 (x0 ) + 5 y 0 − 5 = 30 ⇒ √ 2 2 y0 − 5 (x0 ) + = 1. 3 6 √ Letting x00 = x0 and y 00 = y 0 − 5 gives 2

2

(x00 ) (y 00 ) + = 1, 3 6 which is an ellipse.



5.5. APPLICATIONS

433

57. Writing the equation as xT Ax + Bx + C = 0, we have    √ 0 1 A= , B= 2 2 1 0

 0 ,

C = −1.

A has characteristic polynomial (−λ)(−λ) − 1 = λ2 − 1 = (λ − 1)(λ + 1) Therefore the associated diagonal matrix is D=

 1 0

 0 . −1

To determine the rotation matrix Q, find unit eigenvectors corresponding to λ1 = 1 and λ2 = −1 by row-reduction to get # " √1 2 √1 2

Q=

√1 2 − √12

.

Thus h √ BQ = 2 2

i 0

"

√1 2 √1 2

√1 2 − √12

#

h = 2

i 2 ,

so the equation becomes 2



2

(x0 ) − (y 0 ) + 2x0 + 2y 0 = 1 ⇒    2 2 (x0 ) + 2x0 + 1 − (y 0 ) − 2y 0 + 1 = 1 + 1 − 1 2



2

(x0 + 1) − (y 0 − 1) = 1. Letting x00 = x0 + 1 and y 00 = y 0 − 1 gives 2

2

(x00 ) − (y 00 ) = 1, which is a hyperbola. 58. Writing the equation as xT Ax + Bx + C = 0, we have    √ 1 −1 A= , B= 4 2 −1 1

 0 ,

C = −4.

A has characteristic polynomial (1 − λ)(1 − λ) − 1 = λ2 − 2λ = λ(λ − 2). Therefore the associated diagonal matrix is  2 D= 0

 0 . 0

To determine the rotation matrix Q, find unit eigenvectors corresponding to λ1 = 2 and λ2 = 0 by row-reduction to get " # Q=

√1 2 − √12

√1 2 √1 2

.

Thus h √ BQ = 4 2

i 0

"

√1 2 − √12

√1 2 √1 2

#

h = 4

i 4 ,

434

CHAPTER 5. ORTHOGONALITY so the equation becomes 2

2 (x0 ) + 4x0 + 4y 0 = 4



0 2

2y 0 = − (x ) − 2x0 + 2 ⇒   2 2y 0 − 1 = − (x0 ) + 2x0 + 1 ⇒   1 2 = − (x0 + 1) ⇒ 2 y0 − 2 1 1 2 y 0 − = − (x0 + 1) . 2 2 Letting x00 = x0 + 1 and y 00 = y 0 −

1 2

gives 1 2 y 00 = − (x00 ) , 2

which is a parabola. 59. x2 − y 2 = (x + y)(x − y) = 0 means that x = y or x = −y, so the graph of this degenerate conic is the pair of lines y = x and y = −x. 2

1

-2

1

-1

2

-1

-2

60. Rearranging terms gives x2 + 2y 2 = −2, which has no solutions since the left-hand side is a sum of two squares, so is always nonnegative. Therefore this is an imaginary conic. 61. The left-hand side is a sum of two squares, so it equals zero if and only if x = y = 0. The graph of this degenerate conic is the single point (0, 0). 1.

0.5

-1.

0.5

-0.5

-0.5

-1.

1.

5.5. APPLICATIONS

435

62. x2 + 2xy + y 2 = (x + y)2 . Then (x + y)2 = 0 is satisfied only when x + y = 0, so the graph of this degenerate conic is the line y = −x. 2

1

-2

1

-1

2

-1

-2

63. Rearranging terms gives √ √ x2 − 2xy + y 2 = 2 2(y − x), or (x − y)2 = 2 2(y − x). This is satisfied when y = x. If y 6= x, then we can divide through by y − x to get √ √ y − x = 2 2, or y = x + 2 2. √ So the graph of this degenerate conic is the union of the lines y = x and y = x + 2 2.

4

3

2

1

-2

1

-1

2

-1

-2

64. Writing the equation as xT Ax + Bx + C = 0, we have    √ √  2 1 A= , B = 2 2 −2 2 , 1 2

C = 6.

436

CHAPTER 5. ORTHOGONALITY A has characteristic polynomial (2 − λ)(2 − λ) − 1 = λ2 − 4λ + 3 = (λ − 3)(λ − 1). Therefore the associated diagonal matrix is  3 D= 0

 0 . 1

To determine the rotation matrix Q, find unit eigenvectors corresponding to λ1 = 3 and λ2 = 1 by row-reduction to get # " √1 − √12 2 . Q= 1 1 √



2

2

Thus h √ BQ = 2 2

√ i −2 2

"

− √12

√1 2 √1 2

#

√1 2

h = 0

i 4 ,

so the equation becomes 2

2

3 (x0 ) + (y 0 ) + 4y 0 = −6 ⇒   2 2 3 (x0 ) + (y 0 ) + 4y 0 + 4 = −6 + 4 2



2

3 (x0 ) + (y 0 + 2) = −2. The left-hand side is always nonnegative, so this equation has no solutions and this is an imaginary conic. 65. Let the eigenvalues of A be λ1 and λ2 . Then there is an orthogonal matrix Q such that   λ 0 QT AQ = D = 1 . 0 λ2 Then by the Principal Axes Theorem, xT Ax = x0T Dx0 = λ1 (x0 )2 + λ2 (y 0 )2 = k. If k 6= 0, then we may divide through by k to get the standard equation λ1 0 2 λ2 0 2 (x ) + (y ) = 1. k k (a) Since det A = det D = λ1 λ2 < 0, the coefficients of (x0 )2 and (y 0 )2 in the standard equation have opposite signs, so this curve is a hyperbola. (b) Since det A = det D = λ1 λ2 > 0, the coefficients of (x0 )2 and (y 0 )2 in the standard equation have the same sign. If they are both positive, the curve is an ellipse; if they are positive and equal, then the ellipse is in fact a circle. If they are both negative, then the equation has no solutions so is an imaginary conic. (c) Since det A = det D = λ1 λ2 = 0, then at least one of the coefficients of (x0 )2 and (y 0 )2 in the standard equation is zero. If the remaining coefficient is positive, then we get the two parallel q lines x0 = ± λki , while if it is zero or negative, the equation has no solutions and we get an imaginary conic. (d) If k = 0 but det A = det D = λ1 λ2 6= 0, then the equation is λ1 (x0 )2 + λ2 (y 0 )2 = 0, and both λ1 and λ2 are nonzero. Then solving for y 0 gives r λ1 0 y = ± − x0 . λ2 If λ1 and λ2 have the same sign, then this equation has only the solution x0 = y 0 = 0, so the graph is just the origin. If they have opposite signs, then the graph is two straight lines whose equations are above.

5.5. APPLICATIONS

437

(e) If k = 0 and det A = det D = λ1 λ2 = 0, then the equation is λ1 (x0 )2 + λ2 (y 0 )2 = 0, and at least one of λ1 and λ2 is zero. If λ1 = 0, we get x0 = 0, so the graph is the y 0 -axis, a straight line. Similarly, if λ2 = 0, we get y 0 = 0, so the graph is the x0 -axis, also a straight line (if both λ1 and λ2 are zero, then A was the zero matrix to start, so it does not correspond to a quadratic form). 66. The matrix of this form is

 4 A = 2 2

 2 2 , 4

2 4 2

which has characteristic polynomial −λ3 + 12λ2 − 36λ + 32 = −(λ − 2)2 (λ − 8) Therefore the associated diagonal matrix is  8 D = 0 0

 0 0 , 2

0 2 0

so that the equation becomes 2

T

2

2

2

2

xT Ax = (x0 ) Dx0 = 8 (x0 ) + 2 (y 0 ) + 2 (z 0 ) = 8, or (x0 ) + This is an ellipsoid. 67. The matrix of this form is

 1 A = 0 0

 0 0 1 −2 , −2 1

which has characteristic polynomial −λ3 + 3λ2 + λ − 3 = −(x + 1)(x − 1)(x − 3) Therefore the associated diagonal matrix is  −1 D= 0 0

0 1 0

 0 0 , 3

so that the equation becomes T

2

2

2

xT Ax = (x0 ) Dx0 = − (x0 ) + (y 0 ) + 3 (z 0 ) = 1, which is a hyperboloid of one sheet. 68. The matrix of this form is

 −1 A= 2 2

 2 2 −1 2 , 2 −1

which has characteristic polynomial −λ3 − 3λ2 + 9λ + 27 = −(x + 3)2 (x − 3)

2

(z 0 ) (y 0 ) + = 1. 4 4

438

CHAPTER 5. ORTHOGONALITY Therefore the associated diagonal matrix is  −3 D= 0 0

 0 0 −3 0 , 0 3

so that the equation becomes 2

2

T

2

2

2

(y 0 ) (z 0 ) (x0 ) + − = −1. 4 4 4

2

xT Ax = (x0 ) Dx0 = −3 (x0 ) − 3 (y 0 ) + 3 (z 0 ) = 12, or This is a hyperboloid of two sheets. 69. Writing the form as xT Ax + Bx + C  0 A = 1 0

= 0, we have  1 0  0 0 , B = 0 0 0

0

 1 ,

C = 0.

The characteristic polynomial of A is λ − λ3 = −λ(λ − 1)(λ + 1). Computing unit eigenvectors for each eigenspace gives       √1 √1 − 0 2   2   1  1   v−1 =  v1 =  v0 =   √ . √  , 0 , 2

1

2

0

0

Then setting  0  Q= 0 1

√1 2 √1 2

− √12

0

0

√1 2

 0  D= 0 0

  , 

0 1 0

 0  0  −1

we get  BQ = 1

0

 0 .

Then the form becomes 2

2

2

2

xT Ax + Bx + C = 0 ⇒ x0T Dx0 + BQx0 = 0 ⇒ (y 0 ) − (z 0 ) + x0 = 0 ⇒ x0 = (z 0 ) − (y 0 ) . This is a hyperbolic paraboloid. 70. Writing the form as xT Ax + Bx + C = 0, we have   16 0 −12  0 , B = −60 A =  0 100 −12 0 9

0

 −80 ,

C = 0.

The characteristic polynomial of A is −λ3 + 125λ2 − 2500λ = −λ(λ − 25)(λ − 100). Computing unit eigenvectors for each eigenspace gives     3 − 54 5     v0 =  0  , v25 =  0 , 4 5

v100

  0   = 1 . 0

0 25 0

 0  0  100

3 5

Then setting 

3 5

 Q = 0 4 5

− 45 0 3 5

 0  1 , 0

 0  D = 0 0

5.5. APPLICATIONS

439

we get  BQ = −100

 0 .

0

Then the form becomes 2

2

2

xT Ax + Bx + C = 0 ⇒ x0T Dx0 + BQx0 = 0 ⇒ 25 (y 0 ) + 100 (z 0 ) − 100x0 = 0 ⇒ x0 =

(y 0 ) 2 + (z 0 ) . 4

This is an elliptic paraboloid. 71. Writing the form as xT Ax + Bx + C = 0, we have   1 2 −1  1 , B = −1 A= 2 1 −1 1 −2

 1 ,

1

C = 0.

The characteristic polynomial of A is −λ3 + 9λ = −λ(λ − 3)(λ + 3). Computing unit eigenvectors for each eigenspace gives     √1 − √13  2  1  1   v3 =  v0 =   √2  ,  √3  , √1 0 3

 v−3 =



√1  16  − √  .  6 √2 6

Then setting  − √1  13 Q=  √3 √1 3

√1 2 √1 2

0



 0   D = 0 0

√1 6  − √16  , √2 6

0 3 0

 0  0  −3

we get BQ =

√

3

 0 .

0

Then the form becomes 2

2

xT Ax + Bx + C = 0 ⇒ x0T Dx0 + BQx0 = 0 ⇒ 3 (y 0 ) − 3 (z 0 ) +



2

3 x0 = 0 ⇒ x0 =

This is a hyperbolic paraboloid. 72. Writing the form as xT Ax + Bx = 15, we have   10 0 −20 0 , A =  0 25 −20 0 10

 √ B = 20 2

50

√  20 2 .

The characteristic polynomial of A is −λ3 + 45λ2 − 200λ − 7500 = −(λ + 10)(λ − 25)(λ − 30). Computing unit eigenvectors for each eigenspace gives     √1 0  2      v−10 =  , v = 25 0 1 , 1 √ 0 2

v30

  − √12   = 0  . √1 2

2

(y 0 ) (z 0 ) √ − √ 1/ 3 1/ 3

440

CHAPTER 5. ORTHOGONALITY Then setting 

√1  2

0

√1 2

0

Q= 0

− √12

  −10 0 0    D=  0 25 0 0 0 30



 0  ,

1

√1 2

we get  BQ = 40

 0 .

50

Then the form becomes xT Ax + Bx = 15 ⇒ x0T Dx0 + BQx0 = 15 ⇒ 2

2

2

−10 (x0 ) + 25 (y 0 ) + 30 (z 0 ) + 40x0 + 50y 0 = 15 ⇒     2 2 2 −10 (x0 ) + 4x0 + 25 (y 0 ) + 2y 0 + 30 (z 0 ) = 15 ⇒ 2

2

2

−10 (x0 + 2) + 25 (y 0 + 1) + 30 (z 0 ) = 0 ⇒ 2

2

2

(x0 + 2) =

(z 0 ) (y 0 + 1) + . 2/5 1/3

Letting x00 = x0 + 2, y 00 = y 0 + 1, z 00 = z 0 gives 2

2

(x00 ) =

2

(y 00 ) (z 00 ) + . 2/5 1/3

This is an elliptic cone. 73. Writing the form as xT Ax + Bx = 6, we have   11 1 4 A =  1 11 −4 , 4 −4 14

 B = −12

12

 12 .

The characteristic polynomial of A is −λ3 + 36λ2 − 396λ + 1296 = −(λ − 6)(λ − 12)(λ − 18). Computing unit eigenvectors for each eigenspace gives   − √13  1   v6 =   √3  ,

 v12 =

√1 3





√1  2  √1  ,  2

v18 =

0



√1  16  − √  .  6 √2 6

Then setting  − √1  13 Q=  √3 √1 3

we get

√1 2 √1 2

0



√1 6  − √16  , 2 √ 6

 √ BQ = 12 3

 6   D = 0 0

0

 0 .



0

0

12

 0 

0

18

5.5. APPLICATIONS

441

Then the form becomes xT Ax + Bx = 6 ⇒ x0T Dx0 + BQx0 = 6 ⇒ √ 2 2 2 6 (x0 ) + 12 (y 0 ) + 18 (z 0 ) + 12 3x0 6 ⇒   √ 2 2 2 6 (x0 ) + 2 3 x0 + 12 (y 0 ) + 18 (z 0 ) = 6 ⇒   √ 2 2 2 6 (x0 ) + 2 3 x0 + 3 + 12 (y 0 ) + 18 (z 0 ) = 24 ⇒ √ 2 2 2 x0 + 3 (y 0 ) (z 0 ) + + = 1. 4 2 4/3 Letting x00 = x0 +



3, y 00 = y 0 , z 00 = z 0 gives 2

2

2

(x00 ) (y 00 ) (z 00 ) + + = 1. 4 2 4/3 This is an ellipsoid. 74. The strategy is to reduce the problem to one in which we can apply part (b) of Exercise 65. Using the   −1 hint, let v be an eigenvector corresponding to λ = a−bi, and let P = Re v Im v . Set B = P P T . Now, B is a real matrix; it is also symmetric since  −1 T  T −1  T T T −1 −1 P = PPT = B. BT = P P T = PPT = P Also, det B = det



PPT

−1 

=

1 1 1 = = 2 > 0. det (P P T ) det P det P T (det P )

Thus Exercise 65(b) applies to B, so that xT Bx = k is an ellipse if both eigenvalues of B are positive (since k > 0). But by Exercise 29 in this section, B −1 is positive definite, so that B is as well (Exercise 31(d)), and therefore has positive eigenvalues. Hence the trajectories of xT B are ellipses. Next, let   a −b C= , b a so that by Theorem 4.43, A = P CP −1 . Note that     2   2 a b a −b a + b2 ab − ab |λ| CT C = = = −b a b a ab − ab a2 + b2 0

  1 0 = 0 |λ|

 0 = I. 1

Now, (Ax)T B(Ax) = xT AT P P T = xT = xT = xT = xT = xT

−1

(Ax) −1 −1 P CP PT P P CP −1 x T −1 P −1 C T P T P T CP −1 x  −1 T PT C CP −1 x −1 −1 PT P x  −1 PPT x  −1 T

= xT Bx. Thus the quadratics xT Bx = k and (Ax)T B(Ax) = k are the same, so that the trajectories of Ax are ellipses as well.

442

CHAPTER 5. ORTHOGONALITY

Chapter Review 1. (a) True. See Theorem 5.1 and the definition preceding it. (b) True. See Theorem 5.15; the Gram-Schmidt process allows us to turn any basis into an orthonormal basis, showing that any subspace has such a basis. (c) True. See the definition preceding Theorem 5.5 in Section 5.1. (d) True. See Theorem 5.5. (e) False. If Q is an orthogonal matrix, then |det Q| = 1. However, there are many matrices with determinant 1 that are not orthogonal. For example,     1 1 2 5 , . 0 1 1 3 ⊥



(f ) True. By Theorem 5.10, (row(A)) = null(A). But if (row(A)) = Rn , then null(A) = Rn so that Ax = 0 for every x, and therefore A = O (since Aei = 0 for every i, showing that every row of A is zero). (g) False. For example, let W be the subset of R2 spanned by e1 , and let v = e2 . Then projW (v) = 0, but v 6= 0. (h) True. If A is orthogonal, then A−1 = AT ; but if A is also symmetric, then AT = A. Thus A−1 = A, so that A2 = AA−1 = I. (i) False. For example, any diagonal matrix is orthogonally diagonalizable, since it is already diagonal. But if it has a zero entry on the diagonal, it is not invertible. (j) True. For example, the diagonal matrix with λ1 , λ2 , . . . , λn on the diagonal has this property. 2. The first pair of vectors is already orthogonal, since 1 · 4 + 2 · 1 + 3 · (−2) = 0. The other two pairs are orthogonal if and only if 1·a+2·b+3·3=0



4·a+1·b−2·3=0

a + 2b = −9 4a + b = 6 .

Computing and row-reducing the augmented matrix gives    1 0 1 2 −9 → 4 1 6 0 1

3 −6



So the unique solution is a = 3 and b = −6; these are the only values of a and b for which these vectors form an orthogonal set. 3. Let

  1 u1 = 0 , 1



 1 u2 =  1 , −1

  −1 u3 =  2 . 1

Note that these vectors form an orthogonal set. Therefore using Theorem 5.2, we write v = c1 u1 + c2 u2 + c3 u3       v · u1 v · u2 v · u3 = u1 + u2 + u3 u1 · u1 u2 · u2 u3 · u3 7·1−3·0+2·1 7 · 1 − 3 · 1 + 2 · (−1) 7 · (−1) − 3 · 2 + 2 · 1 = u1 + u2 + u3 1·1+0·0+1·1 1 · 1 + 1 · 1 − 1 · (−1) −1 · (−1) + 2 · 2 + 1 · 1 9 2 11 = u1 + u2 − u3 . 2 3 6 So the coordinate vector is   9 2

  [v]B =  23  . − 11 6

Chapter Review

443

4. We first find v2 . Since v1 and v2 =

  x are orthogonal, we have y 3 4 x + y = 0, 5 5

But also v2 is a unit vector, so that x2 + y 2 = h i • If v2 = − 45 35 , then

• If v2 =

h

4 or x = − y. 3

16 2 9 y

+ y2 =

25 2 9 y

= 1. Therefore y = ± 35 and x = ∓ 45 .

1 v = −3v1 + v2 = −3 2

" # 3 5 4 5

" # " # − 11 1 − 54 5 + = . 3 2 − 21 5 10

1 v = −3v1 + v2 = −3 2

" #

# " # " 4 − 57 1 5 + = . 2 − 35 − 27 10

i − 35 , then

4 5

3 5 4 5

5. Using Theorem 5.5, we show that the matrix is orthogonal by showing that     6 4 2 3 6 √ − √15 1 0 7 7 7 7 7 5    1  2 T 2 15  √ √  0 QQ =  0 − 7√5   = 0 1 − 5 5  7 4 15 2 3 2 2 √ √ √ 0 0 − 7√5 7√5 7 7 5 5 7 5

that QQT = I:  0  0  = I. 1

6. Let Q be the given matrix. If Q is to be orthogonal, then by Theorem 5.5, we must have " #" # " # " # 1 1 a 12 b + a2 21 b + ac 1 0 T 2 4 QQ = = 1 = . b c a c b2 + c2 0 1 2 b + ac Therefore we must have a2 +

1 = 1, 4

1 ac + b = 0, 2

b2 + c2 = 1.



The first equation gives a = ±

3 2 .

Substituting this into the second equation gives

√ ±

3 1 c + b = 0, 2 2

√ b = ∓ 3c



b2 = 3c2 .



But b2 + c2 = 1 and b2 = 3c2 means that 4c2 = 1, so that c = ± 21 . So there are two possibilities: √

• If a =

3 2 ,

√ then b = − 3 c, so we get the two matrices " −



• If a = −

3 2 ,

then b =





3 2 1 2

1 2√

3 2

#

" ,

1 √2 3 2



3 2 − 12

# .

3 c, so we get the two matrices "

1 √2 3 2





3 2 1 2

#

" ,

1 2√ − 23



3 2 − 21



# .

444

CHAPTER 5. ORTHOGONALITY

7. We must show that Qvi · Qvj = 0 for i 6= j and that Qvi · Qvi = 1. Since Q is orthogonal, we know that QQT = QT Q = I. Then for any i and j (possibly equal) we have  T Qvi · Qvj = (Qvi ) Qvj = viT QT Q vj = viT vj = vi · vj . Thus if i 6= j, the dot product is zero since vi · vj = 0 for i 6= j, and if i = j, then the dot product is 1 since vi · vi = 1. 8. The given condition means that for all vectors x and y, x·y Qx · Qy = . kxk kyk kQxk kQyk Let qi be the ith column of Q. Then qi = Qei . Since the ei are orthonormal and the qi are unit vectors, we have ( 0 i 6= j qi · qj ei · ej qi · qj = = = ei · ej = kqi k kqj k kei k kej k 1 i = j. It follows that Q is an orthogonal matrix.    5 9. W is the column space of A = , so that W ⊥ is the null space of AT = 5 2 augmented matrix:     5 2 0 → 1 52 0 ,

 2 . Row-reduce the

so that a basis for the null space, and thus for W ⊥ , is given by " # " # − 25 −2 , or . 1 5  1  10. W is the column space of A =  2, so that W ⊥ is the null space of AT = 1 −1 augmented matrix is already row-reduced:   1 2 −1 0 , 

2

 −1 . The

so that the null space, which is W ⊥ , is the set           −2s + t −2 1 −2 1  s  = s  1 + t 0 = span  1 , 0 . t 0 1 0 1 These two vectors form a basis for W ⊥ . 11. Let A be the matrix whose columns are the given basis of W . Then by Theorem 5.10, W ⊥ = ⊥ (col(A)) = null(AT ). Row-reducing AT , we get     1 −1 4 0 1 0 1 0 → , 0 1 −3 0 0 1 −3 0 so that a basis for W ⊥ is   −1  3 . 1

Chapter Review

445

12. Let A be the matrix whose columns are the given basis of W . Then by Theorem 5.10, W ⊥ = ⊥ (col(A)) = null(AT ). Row-reducing AT , we get " 1 1

# " 0 1 0 1 → 0 1 0 0

1 1 1 −1 1 2

# 0 , 0

3 2 − 12

so that W⊥

        3  −s − 32 t  −2  −1 −1          1      0  1  t 2   2  0   =   s  = s  1 + t  0 = span  1 ,            0 0 1 t

  −3    1   .  0   2

13. Row-reducing A gives  1 0 A→ 0 0

0 1 0 0

2 0 0 0

3 2 0 0

 4 1 , 0 0

so that a basis for row(A) is given by the nonzero rows of the reduced matrix, or 

1

0

2

3

  4 , 0

1

0

2

1



.

A basis for col(A) is given by the columns of A corresponding to leading 1s in the reduced matrix, so one basis is     −1  1        −1   ,  2 .  2  1      −5 3 A basis for null(A) is given by the solution set of the matrix, which, from the reduced form, is     −2 −2s − 3t − 4u          0     −2t − u         = span  1 ,  s           0   t          0 u

  −3 −2    0 ,    1 0

  −4    −1    0 .    0    1

Finally, to determine null(AT ), we row-reduce AT : 

1 −1   2   1 3

  −1 2 3 1 0 2 1 −5    −2 4 6  → 0 0 1 8 −1 −2 9 7 0

0 1 0 0 0

5 3 0 0 0

 1 −2  0 . 0 0

Then     −5s − t  −5           −3s + 2t  = span −3 , null(AT ) =     1 s         t 0 14. The orthogonal decomposition of v with respect to W is v = projW (v) + (v − projW (v)),

  −1    2   .  0   1

446

CHAPTER 5. ORTHOGONALITY so we must find projW (v). Let the three given vectors in W be w1 , w2 , and w3 . Then       w1 · v w2 · v w3 · v projW (v) = w1 + w2 + w3 w1 · w1 w2 · w2 w3 · w3     0 1  1 · 1 + 0 · 0 + 1 · (−1) − 1 · 2  0 0 · 1 + 1 · 0 + 1 · (−1) + 1 · 2  1  +   = 0 · 0 + 1 · 1 + 1 · 1 + 1 · 1 1 1 · 1 + 0 · 0 + 1 · 1 − 1 · (−1)  1 1 −1   3  3 · 1 + 1 · 0 − 2 · (−1) + 1 · 2   1 +   −2 3 · 3 + 1 · 1 − 2 · (−2) + 1 · 1 1       0 1 3  2  0  1 7 1 1 −  +   =  3 1 3  1 15 −2 1 −1 1  11  15

 4   =  195 . − 15  22 15

Then   11   4  1 15 15  0   4  − 4     5  5 v − projW (v) =   −  19  =  4  . −1 − 15   15  

2

22 15

8 15

15. (a) Using Gram-Schmidt, we get   1 1  v1 = x1 =  1 1     1 1 1 1 1 · 1 + 1 · 1 + 1 · 1 + 1 · 0    v2 = x2 − projv1 (x2 )v1 =  1 − − 1 · 1 + 1 · 1 + 1 · 1 + 1 · 1 1 1 0      1 1 1 4 1 1 3 1    −   =  4 = 1 4 1  1   4 0 1 3 − 4

v3 = x3 − projv1 (x3 )v1 − projv2 (x3 )v2   0 1 1 · 0 + 1 · 1 + 1 · 1 + 1 · 1  = 1 − 1 · 1 + 1 · 1 + 1 · 1 + 1 · 1 v1 − 1  1  2     −3 0 1 4  1 1 1 3 1 1   4  3    =  =  . 1 − 4 1 + 3   14   13  1 1 −3 0 4

1 4

·0+ 1 16

1 4

·1+ 1 + 16 +

1 3 4 ·1− 4 1 9 16 + 16

·1

v2

Chapter Review

447

(b) To find a QR factorization, we normalize the vectors from (a) and use those as the column vectors of Q. Then we set R = QT A. Normalizing v1 , v2 , and v3 yields     1 1    21    v1 1 1  = 2 q1 = =    1 kv1 k 2 1  2 1 1 2    √ v2 2 3 q2 = = kv2 k 3





3  √63   6   =  √3   √6  − 23  √  − 6  √36   6   =  √6  .  6 

1  41   4    1  4 − 34

  −2 √  31   v3 6  3 q3 = =  1 kv3 k 2   3 0

0

Then  Q=

1  12 2  1 2 1 2



3 √6 3 √6 3 6√ − 23





6 √3  6  √6  , 6  6 



0

and  R = QT A =

1  √2  3  6√ − 36

1 √2 3 √6 6 6

1 √2 3 √6 6 6

  1 1 2√   1  − 23   1  0 1

1 1 1 0

  0 2   1  = 0  1  0 1

3 √2 3 2

0

3 2√

 

3 . √6  6 3



Then A = QR is a QR factorization of A.

16. Note that the two given vectors,     1 0 0  1    v1 =  2 and v2 =  1 , 2 −1 are indeed orthogonal. We extend these to a basis by adding e3 and e4 . We can verify that this is a basis by computing the determinant of the matrix whose column vectors are the vi ; this determinant is 1, so the matrix is invertible and therefore the vectors are linearly independent, so they do in fact

448

CHAPTER 5. ORTHOGONALITY form a basis for R4 . We apply Gram-Schmidt to v3 and v4 : v3 = x3 − projv1 (x3 )v1 − projv2 (x3 )v2   0 0 0 · 1 + 0 · 0 + 1 · 2 + 0 · 2 0 · 0 + 0 · 1 + 1 · 1 + 0 · (−1)   = − v1 − v2 12 + 02 + 22 + 22 02 + 12 + 12 + (−1)2 1 0        2 −9 0 1 0 0 2 0 1  1 − 1          =   −   −   =  32  1 9 2 3  1  9  0 2 −1 − 19 v4 = x4 − projv1 (x4 )v1 − projv2 (x4 )v2 − projv3 (x4 )v3   0 0 0 · 1 + 0 · 0 + 0 · 2 + 1 · 2 0 · 0 + 0 · 1 + 0 · 1 + 0 · (−1)   = − v1 − v2 2 2 2 2 1 +0 +2 +2 02 + 12 + 12 + (−1)2 0 1    0 · − 92 + 0 · − 13 + 0 · 29 + 1 · − 19 v3 4 1 4 1 81 + 9 + 81 + 81        2  1 0 −3 1 0 −9 0 2 0 1  1 1 − 1   1            =   −   +   +  32  =  6  . 0 9 2 3  1 2  9   0 1 1 2 −1 − 19 6 −

17. If x1 + x2 + x3 + x4 = 0, then x4 = −x1 − x2 − x3 , so that a basis for W is given by      0  0 1   0  1  0  , x2 =   , x3 =   . B = x1 =   1  0  0      −1 −1 −1    



Then using Gram-Schmidt,    v1 = x1 =  

 1 0   0

−1    v2 = x2 − projv1 (x2 )v1 =   





   0 1   1 1 · 0 + 0 · 1 + 0 · 0 − 1 · (−1)  0  −−   12 + 02 + 02 + (−1)2  0 0

−1  1 1 −2  1 0    =  0  0

0  1 1     = −   0 2  −1 −1



− 12

−1

Chapter Review

449

v3 = x3 − projv1 (x3 )v1 − projv2 (x3 )v2     0 1  0 1 · 0 + 0 · 0 + 0 · 1 − 1 · (−1)  0 − 1 · 0 + 1 · 0 + 0 · 1 −     = −  − 2 1 1 12 + 02 + 02 + (−1)2  1  0 4 +1+0+ 4 −1 −1      1  1 −3 0 1 −2  0 1  0 1  1 − 1          =   −   −   =  3 .  1 2  0 3  0  1 −1

−1

− 12

1 2

 1 −2  · (−1)   1    0 − 12

− 13

18. (a) The characteristic polynomial of A is 2 − λ 1 −1 2−λ 1 = −λ3 + 6λ2 − 9λ = −λ(λ − 3)2 , det(A − λI) = 1 −1 1 2 − λ   so the eigenvalues are 0 and 3. To find the eigenspaces, we row-reduce A − λI 0 :     2 1 −1 0 1 0 −1 0   A − 0I 0 =  1 2 1 0 → 0 1 1 0 −1 1 2 0 0 0 0 0     1 −1 1 0 −1 1 −1 0   1 0 → 0 0 0 0 . A − 3I 0 =  1 −1 −1 1 −1 0 0 0 0 0 Thus the eigenspaces are    1   E0 = span v1 = −1 ,   1

     1 −1   E3 = span v2 = 1 , x3 =  0 .   0 1

The vectors in E0 are orthogonal to those in E3 since they correspond to different eigenvalues, but the basis vectors above in E3 are not orthogonal, so we use Gram-Schmidt to convert the second basis vector into one orthogonal to the first:       −1 1 − 12   1 · (−1) + 1 · 0 + 0 · 1    1  v3 =  0 − 1 =  2  . 1·1+1·1+0·0 1 0 1 Finally, normalize v1 , v2 , and v3 and let Q be the matrix whose columns are those vectors; then Q is the orthogonal matrix   √1 √1 − √16 3 2  1  1 √1  . Q= − √3 √2 6  √1 √2 0 3 6 Finally, an orthogonal diagonalization of A is given by  0 0 QT AQ = D = 0 3 0 0

 0 0 . 3

450

CHAPTER 5. ORTHOGONALITY (b) From part (a), we have λ1 = 0 and λ2 = λ3 = 3, with     q1 =

√1  13  − √  ,  3 √1 3

q2 =

  − √16  1   q3 =   √6  .

√1  2  √1  ,  2

√2 6

0

The spectral decomposition of A is A = λ1 q1 qT1 + λ2 q2 qT2 + λ3 q3 qT3 = 3q2 qT2 + 3q3 qT3     " # √1 − √16 1 1 2 √   √  h 2 2 + 3  √1  − √1 √1  = 3  2 0  6 6 √2 0 6     1 1 1 1 0 − 6 − 13 2 2 6     1 1 1 . 1  1  = 3  2 2 0 + 3 − 6 6 3 0

0

− 31

0

1 3

√1 6

√2 6

i

2 3

19. See Example   5.20, or Exercises 23 and 24 in Section 5.4. The basis vector for E−2 , which we name 1 v3 = −1, is orthogonal to the other two since they correspond to different eigenvalues. However, 0 we must orthogonalize the basis for E0 using Gram-Schmidt:         1 1 1 0     1·1+1·1+1·0    v1 = 1 , v2 = 1 − = 1 0 . 12 + 1 2 + 0 2 0 1 0 1 √ √ Now v1 , v2 , and v3 are mutually orthogonal, with norms 2, 2, and 1 respectively. Let Q be the matrix whose columns are q1 = kvv11 k , q2 = kvv22 k , and q3 = kvv33 k ; then the desired matrix is  1  Q 0 0

0 1 0

 0  T 1 0 Q 0 −2  √1 0 √12 1 2   1 1  = − √2 0 √2  0 0 1 0 0

  1 0  −1   0  Q = Q 0 −2 0 

0

20. vi viT is symmetric since vi viT

T

= viT

0 1 0

T

 √1 0  2  0  0 √1 −2 2

− √12 0 √1 2

  0 −1   2  3 1  = − 2 0 0

− 23 − 21 0

 0  0 . 1

viT = viT vi ,

so it is equal to its own transpose. Since a scalar times a symmetric matrix is a symmetric matrix, and the sum of symmetric matrices is symmetric, it follows that A is symmetric. Next, ! n n n X X  X T Avj = ci vi vi vj = ci vi viT vj = ci vi (vi · vj ) = cj vj i=1

i=1

i=1

since {vi } is a set of orthonormal vectors. This shows that each cj is an eigenvalue for A with corresponding eigenvector vj . Since there are n of them, these are all the eigenvalues and eigenvectors.

Chapter 6

Vector Spaces 6.1

Vector Spaces and Subspaces

  x . V is the set of all vectors in R2 whose first and second components are the same. x We verify all ten axioms of a vector space:           x y x y x+y 1. If , ∈ V then + = ∈ V since its first and second components are the same. x y x y x+y             x y x+y y+x y x 2. + = = = + . x y x+y y+x y x                       x y z x+y z x+y+z x y+z x y z 3. + + = + = = + = + + . x y z x+y z x+y+z x y+z x y z           0 x 0 x+0 x 4. ∈ V , and + = = ∈V. 0 x 0 x+0 x             x −x x −x x−x 0 5. If ∈ V , then ∈ V , and + = = . x −x x −x x−x 0       x x cx 6. If ∈ V , then c = ∈ V since its first and second components are the same. x x cx                 x y x+y c(x + y) c(y + x) y+x y x 7. c + =c = = =c =c + . x y x+y c(x + y) c(y + x) y+x y x               x (c + d)x cx + dx cx dx x x = = = + =c +d . 8. (c + d) x (c + d)x cx + dx cx dx x x            x dx c(dx) (cd)x x =c = = = (cd) . 9. c d x dx c(dx) (cd)x x       x 1x x = = . 10. 1 x 1x x    x 2. V = : x, y ≥ 0 fails to satisfy Axioms 5 and 6, so it is not a vector space: y         x 0 x cx 5. If 6= ∈ V , choose c < 0; then c = ∈ / V since at least one of cx and cy is negative. y 0 y cy       x 0 −x 6. If 6= ∈ V , then ∈ / V since −x < 0. x 0 −x

1. Let V =

451

452

CHAPTER 6. VECTOR SPACES    x : xy ≥ 0 fails to satisfy Axiom 1, so it is not a vector space: y     x −y 1. Choose x 6= y with xy ≥ 0. Then and are both in V , but y −x         x −y x−y x−y + = = ∈ /V y −x y−x −(x − y)

3. V =

since (x − y)(−(x − y)) = −(x − y)2 < 0 because x 6= y.    x 4. V = : x ≥ y fails to satisfy Axioms 5 and 6, so it is not a vector space: y     x −x 5. If ∈ V and x 6= y, then x > y, so that −x < −y. Therefore ∈ / V. y −y       x cx −x 6. Again let ∈ V and x 6= y, and choose c = −1. Then = ∈ / V as above. y cy −y . 5. R2 with the given scalar multiplication fails to satisfy Axiom 8, so it is not a vector space:   x 8. Choose ∈ R2 with y 6= 0, and let c and d be nonzero scalars. Then y                 x (c + d)x cx + dx x x cx dx cx + dx (c + d) = = , while c +d = + = . y y y y y y y 2y These two are unequal since we chose y 6= 0. 6. R2 with the given addition fails to satisfy Axiom 7, so it is not a vector space:     x x 8. Choose 1 , 2 ∈ R2 and let c be a scalar not equal 1. Then y1 y2         x1 x x + x2 + 1 cx1 + cx2 + c c + 2 =c 1 = y1 y2 y1 + y2 + 1 cy1 + cy2 + c           x x cx1 cx2 cx1 + cx2 + 1 c 1 +c 2 = + = . y1 y2 cy1 cy2 cy1 + cy2 + 1 These two are unequal since we chose c 6= 1. 7. All axioms apply, so that this set is a vector space. Let x, y, and z by positive real numbers, so that x, y, z ∈ V , and c and d be any real numbers. Then 1. x ⊕ y = xy > 0, so that x ⊕ y ∈ V . 2. x ⊕ y = xy = yx = y ⊕ x. 3. (x ⊕ y) ⊕ z = (xy) ⊕ z = (xy)z = x(yz) = x ⊕ (yz) = x ⊕ (y ⊕ z). 4. 0 = 1, since x ⊕ 0 = x1 = 1x = 0 ⊕ x = x. 5. −x is the positive real number

1 x,

since x ⊕ (−x) = x ·

1 x

= 1 = 0.

c

6. c x = x > 0, so that c x ∈ V . 7. c (x ⊕ y) = (xy)c = xc y c = xc ⊕ y c = (c x) ⊕ (c y). 8. (c + d) x = xc+d = xc xd = xc ⊕ xd = (c x) ⊕ (c y). c d 9. c (d x) = c xd = xd = xcd = (xc ) = d xc = d (c x). 10. 1 x = xx = x.

6.1. VECTOR SPACES AND SUBSPACES

453

8. This is not a vector space, since Axiom 6 fails. Choose any rational number q, and let c = π. Then qπ is not rational, so that V is not closed under scalar multiplication. 9. All apply, so  axioms    that this is a vector space. Let x, y, z, u, v, w, c, and d be real numbers. Then x y u v and ∈ V , and 0 z 0 w 1. The sum of upper triangular matrices is again upper triangular. 2. Matrix addition is commutative. 3. Matrix addition is associative.    x y x 4. The zero matrix O is upper triangular, and +O = 0 z 0     x y −x −y . 5. − is the matrix 0 z 0 −z     x y cx cy 6. c = ∈ V since it is upper triangular. 0 z 0 cz

 y . z

7.  c

x 0

     y u v x+u y+v + =c z 0 w 0 z+w   c(x + u) c(y + v) = 0 c(z + w)   cx + cu cy + cv = 0 cz + cw     cx cy cu cv + = 0 cz 0 cw     x y u v =c +c . 0 z 0 w

8. (c + d)

    x y dx 9. c d =c 0 z 0     x y x y 10. 1 = . 0 z 0 z

 x 0

   y (c + d)x (c + d)y = z 0 (c + d)z   cx + dx cy + dy = 0 cz + dz     cx cy dx dy = + 0 cz 0 dz     x y x y =c +d . 0 z 0 z

      dy c(dx) c(dy) (cd)x (cd)y x = = = (cd) dz 0 c(dz) 0 (cd)z 0

 y . z

10. This set V is not a vector space, since it is not closed under addition. For example,           1 0 0 0 1 0 0 0 1 0 , ∈ V, but + = ∈ /V 0 0 0 1 0 0 0 1 0 1 since the product of the diagonal entries is not zero.

454

CHAPTER 6. VECTOR SPACES

11. This set V is a vector space. Let A be a skew-symmetric matrix and c a constant. Then 1. The sum of skew-symmetric matrices is skew-symmetric. 2. Matrix addition is commutative. 3. Matrix addition is associative. 4. The zero matrix O is skew-symmetric, and A + O = A. 5. If A is skew-symmetric, then so is −A, since (−A)ij = −aij = aji = (A)ji = −(−A)ji . 6. If A is skew-symmetric, then so is cA, since (cA)ij = c(A)ij = −c(A)ji = −(cA)ji . 7. Scalar multiplication distributes over matrix addition. 8, 9. These properties hold for general matrices, and skew-symmetric matrices are closed under these operations, so then in V . 10. 1A = A is skew-symmetric. 12. Axioms 1, 2, and 6 were verified in Example 6.3. For the remaining examples, using the notation from Example 3, and letting r(x) = c0 + c1 x + c2 x2 , 3. (p(x) + q(x)) + r(x) = ((a0 + b0 ) + (a1 + b1 )x + (a2 + b2 )x2 ) + r(x) = (a0 + b0 + c0 ) + (a1 + b1 + c1 )x + (a2 + b2 + c2 )x2 = (a0 + a1 x + a2 x2 ) + ((b0 + c0 ) + (b1 + c1 )x + (b2 + c2 )x2 ) = p(x) + (q(x) + r(x)). 4. The zero polynomial 0 + 0x + 0x2 is the zero vector, since 0 + p(x) + (0 + a0 ) + (0 + a1 )x + (0 + a2 )x2 = a0 + a1 x + a2 x2 = p(x). 5. −p(x) is the polynomial −a0 − a1 x − a2 x2 . 7.  c(p(x) + q(x)) = c (a0 + a1 x + a2 x2 ) + (b0 + b1 x + b2 x2 )  = c (a0 + b0 ) + (a1 + b1 )x + (a2 + b2 )x2 = c(a0 + b0 ) + c(a1 + b1 )x + c(a2 + b2 )x2 = (ca0 + cb0 ) + (ca1 + cb1 )x + (ca2 + cb2 )x2   = ca0 + ca1 x + ca2 x2 + cb0 + cb1 x + cb2 x2   = c a0 + a1 x + a2 x2 + c b0 + b1 x + b2 x2 = cp(x) + cq(x). 8. (c + d)p(x) = (c + d) a0 + a1 x + a2 x2



= (c + d)a0 + (c + d)a1 x + (c + d)a2 x2 = (ca0 + da0 ) + (ca1 + da1 )x + (ca2 + da2 )x2 = (ca0 + ca1 x + ca2 x2 ) + (da0 + da1 x + da2 x2 )   = c a0 + a1 x + a2 x2 + d a0 + a1 x + a2 x2 = cp(x) + dp(x).

6.1. VECTOR SPACES AND SUBSPACES

455

9. c(dp(x)) = c da0 + da1 x + da2 x2



= c(da0 ) + c(da1 )x + c(da2 )x2 = (cd)a0 + (cd)a1 x + (cd)a2 x2  = (cd) a0 + a1 x + a2 x2 = (cd)p(x). 10. 1p(x) = 1a0 + 1a1 x + 1a2 x2 = a0 + a1 x + a2 x2 = p(x). 13. Axioms 1 and 6 were verified in Example 6.4. For the rest, using the notation from Example 4, 2, 3. Addition of functions is commutative and associative. 4. The zero function f (x) = 0 for all x is the zero vector, since (0 + f )(x) = 0(x) + f (x) = f (x). 5. The function −f such that (−f )(x) = −f (x) is such that (−f + f )(x) = (−f )(x) + f (x) = −f (x) + f (x) = 0, so their sum is the zero function. 7. The function c(f + g) is defined by (c(f + g))(x) = c((f + g)(x)) by the definition of scalar multiplication. But (f + g)(x) = f (x) + g(x), so we get (c(f + g))(x) = c(f (x) + g(x)) = cf (x) + cg(x) = (cf + cg)(x). 8. Again using the definition of scalar multiplication, the function (c+d)f is defined by ((c+d)f )(x) = (c + d)(f (x)) = cf (x) + df (x) = (cf + df )(x). 9. The function df is defined by (df )(x) = df (x), so c(df ) is defined by (c(df ))(x) = c((df )(x)) = cdf (x) = ((cd)f )(x). 10. (1f )(x) = 1f (x) = f (x). 14. This is not a complex vector space, since Axiom 6 fails. Let z ∈ C be any nonzero complex number, and let c ∈ C be any complex number with nonzero real and imaginary parts. Then       z cz cz c = . 6= z¯ cz c¯ z 15. This is a vector space over C; the proof is identical to the proof that Mmn (R) is a real vector space.     z w 16. This is a complex vector space. Let 1 , 1 ∈ C2 , and let c, d ∈ C. Then z2 w2 1, 2, 3, 4, 5. These follow since addition is the same as in the usual vector space C2 .     z c¯z1 6. c 1 = ∈ C2 . z2 c¯z2 7.           z1 w z +w+1 c¯(z1 + w1 ) c¯z1 + c¯w1 c + 1 =c 1 = = z2 w2 z2 + w2 c¯(z2 + w2 ) c¯z2 + c¯w2         c¯z1 c¯w1 z w = + =c 1 +c 1 . c¯z2 c¯w2 z2 w2               (c + d)z1 cz1 dz1 (c + d)z1 z z z1 = = + =c 1 +d 1 . = cz2 z2 z2 z2 (c + d)z2 dz2 (c + d)z2            ¯ ¯ (cd)z1 c¯dz dz z z = ¯1 =c ¯1 =c d 1 . 9. (cd) 1 = z2 c ¯ dz dz z2 (cd)z2 2 2         ¯ z 1z 1z1 z 10. 1 1 = ¯ 1 = = 1 . z2 1z2 1z2 z2

8. (c + d)

456

CHAPTER 6. VECTOR SPACES

17. This is not a complex vector space, since Axiom 6 fails. Let x ∈ Rn be any nonzero vector, and let c ∈ C be any complex number with nonzero imaginary part. Then cx ∈ / Rn , so that Rn is not closed under scalar multiplication. 18. This set V is a vector space over Z2 . Let x, y ∈ Zn2 each have an even number of 1s, say x has 2r ones and y has 2s ones. 1. x + y has a 1 where either x or y has a 1, except that if both x and y have a 1, x + y has a zero. Say this happens in k different places. Then the total number of 1s in x+y is 2r+2s−2k = 2(r+s−k), which is even, so V is closed under addition. 2. Since addition in Z2 is commutative, we see that x + y = y + x since we are simply adding corresponding components. 3. Since addition in Z2 is associative, we see that (x + y) + z = x + (y + z) since we are simply adding corresponding components. 4. 0 ∈ Zn2 has an even number (zero) of ones, so 0 ∈ V , and x + 0 = x. 5. Since 1 = −1 in Z2 , we can take −x = x; then −x + x = x + x = 0. 6. If c ∈ Z2 , then cx is equal to either 0 or x depending on whether c = 0 or c = 1. In either case, x ∈ V implies that cx ∈ V . 7, 8. Since multiplication distributes over addition in Z2 , these axioms follow since operations are component by component. 9. Since multiplication in Z2 is distributive, these axioms follow since operations are component by component. 10. 1x = x since multiplication is component by component. 19. This set V is not a vector space: 1. For example, let x and y be vectors with exactly one 1. Then x + y has either 0 or 2 ones (it has zero if the locations of the 1s in x and y are the same), so x + y ∈ / V and V is not closed under addition. 4. 0 ∈ / V since 0 does not have an odd number of 1s. 6. If c = 0, then cx = 0 ∈ / V if x ∈ V , so that V is not closed under scalar multiplication. 20. This is a vector space over Zp ; the proof is identical to the proof that Mmn (R) is a real vector space. 21. This is not a vector space. Axioms 1 through 7 are all satisfied, as is Axiom 10, but Axioms 8 and 9 fail since addition and multiplication are not the same in Z6 as they are in Z3 . For example: 8. Let c = 2, d = 1, and u = 1. Then (c + d)u = (2 + 1)u = 0u = 0 since addition of scalars is performed in Z3 , while cu + du = 2 · 1 + 1 · 1 = 2 + 1 = 3 since here the addition is performed in Z6 . 9. Let c = d = 2 and u = 1. Then (cd)u = 1 · 1 = 1 since the scalar multiplication is done in Z3 , while c(du) = 2(2 · 1) = 2 · 2 = 4 since here the multiplication is done in Z6 . 22. Since 0 = 0 + 0, we have 0u = (0 + 0)u = 0u + 0u by Axiom 8. By Axiom 5, we add −0u to both sides of this equation, giving −0u + 0u = (−0u + 0u) + 0u, so that by Axiom 5 0 = 0 + 0u. Finally, 0 + 0u = 0u by Axiom 4, so that 0 = 0u. 23. To show that (−1)u = −u, it suffices to show that (−1)u + u = 0, by Axiom 5. But (−1)u + u = (−1)u + 1u by Axiom 10. Then use Axiom 8 to rewrite this as (−1 + 1)u = 0u. By part (a) of Theorem 6.1, 0u = 0 and we are done.

6.1. VECTOR SPACES AND SUBSPACES

457

24. This is a subspace. We check each of the conditions in Theorem 6.2:       a b a+b 1. 0 + 0 =  0  ∈ W since its first and third components are equal and its second component a b a+b is zero.     a ca 2. c 0 =  0  ∈ W . a cz 25. This is a subspace. We check each of the conditions in Theorem 6.2:         a b a+b (a + b) 1. −a + −b =  −a − b  = −(a + b) ∈ W . 2a 2b 2a + 2b 2(a + b)       a ac (ac) 2. c −a = −ac = −(ac) ∈ W . 2a 2ac 2(ac) 26. This is not a subspace. It fails both conditions in Theorem 6.2:       c a+c a + = , which is not in W . d b+d b 1.  c+d+1 (a + c) + (b + d) + 2 a+b+1       ca ca a = = , which is not in W if c 6= 1. cb cb b 2. c  c(a + b + 1) ca + cb + c a+b+1 27. This is not a subspace. It fails both conditions in Theorem 6.2:       c a+c a 6 |a + c|, so that 1.  b  +  d  =  b + d . If a and c do not have the same sign, then |a| + |c| = |c| |a| + |c| |a| the sum is not in W .     a ca 2. c  b  =  cb . If c < 0, then c |a| = 6 |ca|, so that the result is not in W . |a| c |a| 28. This is a subspace. We check each of the conditions in Theorem 6.2:         a b c d a+c b+d a+c b+d 1. + = = ∈ W. b 2a d 2c b + d 2a + 2c b + d 2(a + c)       a b ca cb ac bc 2. c = = ∈ W. b 2a cb 2ac bc 2(ac) 29. This is not a subspace; condition 1 in Theorem 6.2 fails. For example, if a, d ≥ 0, then     a 0 −a a , ∈ W, 0 d d −d but  a 0 since 0 < ad.

     0 −a a 0 a + = ∈ /W d d −d d 0

458

CHAPTER 6. VECTOR SPACES

30. This is not a subspace; both conditions of Theorem 6.2 fail. For example: Let A = B = I; then det A = det B = 1, so that A, B ∈ W , but det(A + B) = det 2I = 2, so A+B ∈ / W. Let A = I and c be any constant other than ±1. Then det(cA) = cn det A 6= det A = 1, so that cA ∈ / W. 31. Since the sum of two diagonal matrices is again diagonal, and a scalar times a diagonal matrix is diagonal, this subset satisfies both conditions of Theorem 6.2, so it is a subspace. 32. This is not a subspace. For example, it fails the second condition: let A be idempotent and c be any constant other than 0 or 1. Then (cA)2 = c2 A2 = c2 A 6= cA so that cA is not idempotent, so is not in W . 33. This is a subspace. We check each of the conditions in Theorem 6.2: 1. Suppose that A, C ∈ W . Then (A + C)B = AB + CB = BA + BC = B(A + C), so that A + C ∈ W . 2. Suppose that A ∈ W and c is any scalar. Then (cA)B = c(AB) = c(BA) = B(cA), so that cA ∈ W . 34. This is a subspace. We check each of the conditions in Theorem 6.2: 1. Let v = bx + cx2 , v0 = b0 x + c0 x2 . Then v + v0 = bx + cx2 + b0 x + c0 x2 = (b + b0 )x + (c + c0 )x2 ∈ W. 2. Let v = bx + cx2 and d be any scalar. Then dv = d(bx + cx2 ) = (db)x + (dc)x2 ∈ W . 35. This is a subspace. We check each of the conditions in Theorem 6.2: 1. Let v = a + bx + cx2 , v0 = a0 + b0 x + c0 x2 ∈ W . Then v + v0 = a + bx + cx2 + a0 + b0 x + c0 x2 = (a + a0 ) + (b + b0 )x + (c + c0 )x2 . Then (a + a0 ) + (b + b0 ) + (c + c0 ) = (a + b + c) + (a0 + b0 + c0 ) = 0, so that v + v0 ∈ W . 2. Let v = a + bx + cx2 and d be any scalar. Then dv = d(a + bx + cx2 ) = (da) + (db)x + (dc)x2 , and da + db + dc = d(a + b + c) = d · 0 = 0, so that dv ∈ W . 36. This is not a subspace; it fails condition 1 in Theorem 6.2. For example, let v = 1 + x and v0 = x2 . Then both v and v0 are in V , but v + v0 = 1 + x + x2 ∈ / V since the product of its coefficients is not zero. 37. This is not a subspace; it fails both conditions of Theorem 6.2. 1. For example let v = x3 and v0 = −x3 . Then v + v0 = 0, which is not a degree 3 polynomial. 2. Let c = 0; then cv = 0 for any v, but 0 ∈ / W , so that this condition fails. 38. This is a subspace. We check each of the conditions in Theorem 6.2: 1. Let f, g ∈ W . Then (f + g)(−x) = f (−x) + g(−x) = f (x) + g(x) = (f + g)(x), so that f + g ∈ W . 2. Let f ∈ W and c be any scalar. Then (cf )(−x) = cf (−x) = cf (x) = (cf )(x), so that cf ∈ W . 39. This is a subspace. We check each of the conditions in Theorem 6.2: 1. Let f, g ∈ W . Then (f + g)(−x) = f (−x) + g(−x) = −f (x) − g(x) = −(f (x) + g(x)) = −(f + g)(x), so that f + g ∈ W . 2. Let f ∈ W and c be any scalar. Then (cf )(−x) = cf (−x) = −cf (x) = −(cf )(x), so that cf ∈ W .

6.1. VECTOR SPACES AND SUBSPACES

459

40. This is not a subspace; it fails both conditions in Theorem 6.2. For example, if f, g ∈ W , then (f + g)(0) = f (0) + g(0) = 1 + 1 6= 1, so that f + g ∈ / W . Similarly, if c is a scalar not equal to 1, then (cf )(0) = cf (0) = c 6= 1, so that cf ∈ / W. 41. This is a subspace. We check each of the conditions in Theorem 6.2: 1. Let f, g ∈ W . Then (f + g)(0) = f (0) + g(0) = 0 + 0 = 0, so that f + g ∈ W . 2. Let f ∈ W and c be any scalar. Then (cf )(0) = cf (0) = c · 0 = 0, so that cf ∈ W . 42. If f and g ∈ W are integrable and c is a constant, then Z Z Z (f + g) = f + g,

Z

Z (cf ) = c

f,

so that both f + g and cf ∈ W . Thus W is a subspace. 43. This is not a subspace; it fails condition 2 in Theorem 6.2. For example, let f ∈ W be such that f 0 (x) > 0 everywhere, and choose c = −1; then (cf )0 (x) = cf 0 (x) < 0, so that cf ∈ / W. 44. If f, g ∈ W and c is a constant, then (f + g)00 = f 00 + g 00 ,

(cf )00 = cf 00 ,

so that both f + g and cf have continuous second derivatives, so they are both in W . 45. This is not a subspace; it fails both conditions in Theorem 6.2. 1. f (x) =

1 x2

∈ W , but −f (x) = − x12 ∈ / W , since limx→0 (−f (x)) = −∞.

2. Take c = 0 and f ∈ W ; then limx→0 (cf )(x) = limx→0 0 = 0, so that cf ∈ / W. 46. Suppose that v, v0 ∈ U ∩ W , and let c be a scalar. Then v, v0 ∈ U , so that v + v0 and cv ∈ U . Similarly, v, v0 ∈ W , so that v + v0 and cv ∈ W . Therefore v + v0 ∈ U ∩ W and cv ∈ U ∩ W , so that U ∩ W is a subspace.     x 0 47. For example, let U = , the x-axis, and let W = , the y-axis. Then U ∪ W consists of all 0 y 2 vectors in R in which at least one component is zero. Choose a, b 6= 0; then           a 0 a 0 a ∈ U, ∈ W, but + = ∈ / U ∪W 0 b 0 b b since neither a nor b is zero. 48. (a) U + W is the xy-plane, since     x 0 U =  0  , W = y  , 0 0 so that any point in the xy-plane is       a a 0  b  = 0 +  b  ∈ U + W. 0 0 0 However, a point not in the xy-plane has nonzero z-coordinate, so it cannot be a sum of an element of U and an element of W . Thus U + W is exactly the xy-plane.

460

CHAPTER 6. VECTOR SPACES (b) Suppose x, x0 ∈ U + W ; say x = u + w and x0 = u0 + w0 , where u, u0 ∈ U and w, w0 ∈ W . Then x + x0 = u + w + u0 + w0 = (u + u0 ) + (w + w0 ) ∈ U + W since both U and W are subspaces. If c is a scalar, then cx = c(u + w) = cu + cw ∈ U + W, again since both U and W are subspaces.

49. We verify all ten axioms of a vector space. Note that both U and V are vector spaces, so they satisfy all the vector space axioms. These facts will be used without comment below. Let (u1 , v1 ), (u2 , v2 ), and (u3 , v3 ) be elements of U × V , and let c and d be scalars. Then 1. (u1 , v1 ) + (u2 , v2 ) = (u1 + u2 , v1 + v2 ) ∈ U × V . 2. (u1 , v1 ) + (u2 , v2 ) = (u1 + u2 , v1 + v2 ) = (u2 + u1 , v2 + v1 ) = (u2 , v2 ) + (u1 , v1 ). 3. ((u1 , v1 ) + (u2 , v2 )) + (u3 , v3 ) = (u1 + u2 , v1 + v2 ) + (u3 , v3 ) = ((u1 + u2 ) + u3 , (v1 + v2 ) + v3 ) = (u1 + (u2 + u3 ), v1 + (v2 + v3 )) = (u1 , v1 ) + (u2 + u3 , v2 + v3 ) = (u1 , v1 ) + ((u2 , v2 ) + (u3 , v3 )). 4. (0, 0) ∈ U × V , and (u1 , v1 ) + (0, 0) = (u1 + 0, v1 + 0) = (u1 , v1 ). 5. (u1 , v1 ) + (−u1 , −v1 ) = (u1 − u1 , v1 − v1 ) = (0, 0). 6. c(u1 , v1 ) = (cu1 , cv1 ) ∈ U × V . 7. c ((u1 , v1 ) + (u2 , v2 )) = c(u1 + u2 , v1 + v2 ) = (c(u1 + u2 ), c(v1 + v2 )) = (cu1 + cu2 , cv1 + cv2 ) = (cu1 , cv1 ) + (cu2 , cv2 ) = c(u1 , v1 ) + c(u2 , v2 ). 8. (c + d)(u1 , v1 ) = ((c + d)u1 , (c + d)v1 ) = (cu1 + du1 , cv1 + dv1 ) = (cu1 , cv1 ) + (du1 , dv1 ) = c(u1 , v1 ) + d(u1 , v1 ). 9. c (d(u1 , v1 )) = c(du1 , dv1 ) = (c(du1 ), c(du2 )) = ((cd)u1 , (cd)v1 ) = (cd)(u1 , v1 ). 10. 1(u1 , v1 ) = (1u1 , 1v1 ) = (u1 , v1 ). 50. Suppose that (w, w), (w0 , w0 ) ∈ ∆. Then (w, w) + (w0 , w0 ) = (w + w0 , w + w0 ), which is in ∆ since its two components are the same. Let c be a scalar. Then c(w, w) = (cw, cw), which is also in ∆ since its components are the same. Therefore ∆ satisfies both conditions of Theorem 6.2, so is a subspace.

6.1. VECTOR SPACES AND SUBSPACES

461

51. If aA + bB = C, then 

1 a −1

  1 1 +b 1 1

  −1 a+b = 0 b−a

  a−b 1 = a 3

 2 . 4

Then a − b = 2 and b − a = 3, which is impossible. Therefore C is not in the span of A and B. 52. If aA + bB = C, then  a

1 −1

  1 1 +b 1 1

    −1 a+b a−b 3 = = 0 b−a a 5

 −5 . −1

Then a = −1 from the lower right. The upper left then gives −1+b = 3, so that b = 4. Then a−b = −5 and b − a = 5, so that −A + 4B = C. 53. Suppose that a(1 − 2x) + b(x − x2 ) + c(−2 + 3x + x2 ) = (a − 2c) + (−2a + b + 3c)x + (−b + c)x2 = 3 − 5x − x2 . Then − 2c =

a

3

−2a + b + 3c = −5 − b + c = −1. The third equation gives b = c + 1; substituting this in the second equation gives −2a + 4c = −6, which is −2 times the first equation. So there is a family of solutions, which can be parametrized as a = 2t + 3, b = t + 1, c = t. For example, with t = 0 we get s(x) = 3p(x) + q(x), and s(x) ∈ span(p(x), q(x), r(x)). 54. Suppose that a(1 − 2x) + b(x − x2 ) + c(−2 + 3x + x2 ) = (a − 2c) + (−2a + b + 3c)x + (−b + c)x2 = 1 + x + x2 . Then a

− 2c = 1

−2a + b + 3c = 1 − b + c = 1. The third equation gives b = c − 1; substituting into the second equation gives −2a + 4c − 1 = 1, or a − 2c = −1. This conflicts with the first equation, so the system has no solution. Thus s(x) ∈ / span(p(x), q(x), r(x)). 55. Since sin2 x + cos2 x = 1, we have 1 = f (x) + g(x) ∈ span(sin2 x, cos2 x). 56. Since cos 2x = cos2 x − sin2 x = g(x) − f (x), we have cos 2x = −f (x) + g(x) ∈ span(sin2 x, cos2 x). 57. Suppose that sin 2x = af (x) + bg(x) = a sin2 x + b cos2 x for all x. Then Let x = 0; then 0 = a sin2 0 + b cos2 0 = b, so that b = 0.  π π π π = sin π = 0 = a sin2 + b cos2 = a, so that a = 0. Let x = ; then sin 2 · 2 2 2 2 Thus the only solution is a = b = 0, but this is not a valid solution since sin 2x 6= 0. So sin 2x ∈ / span(sin2 x, cos2 x).

462

CHAPTER 6. VECTOR SPACES

58. Suppose that sin x = af (x) + bg(x) = a sin2 x + b cos2 x for all x. Then Let x = 0; then 0 = a sin2 0 + b cos2 0 = b, so that b = 0. π π π π Let x = ; then sin = 1 = a sin2 + b cos2 = a, so that a = 1. 2 2 2 2 Thus the only possibility is sin x = sin2 x. But this is not a valid equation; for example, if x = then the left-hand side is −1 while the right-hand side is 1. So sin x ∈ / span(sin2 x, cos2 x). 59. Let



1 V1 = 0

 1 , 1

 0 V2 = 1

 1 , 0

 1 V3 = 1

 0 , 1



0 V4 = 1

3π 2 ,

 −1 . 0

Now, V4 = V3 − V1 , so that span(V1 , V2 , V3 , V4 ) = span(V1 , V2 , V3 ). If span(V1 , V2 , V3 ) = M22 , then every matrix is in the span, so in particular E11 is. But     a+c a+b 1 0 aV1 + bV2 + cV3 = E11 ⇒ = . b+c a+c 0 0 Now the diagonal entries give a + c = 0 and a + c = 1, which has no solution. So E11 is not in the span and thus span(V1 , V2 , V3 ) 6= M22 . 60. Let V1 =

 1 1

 0 , 0

V2 =

 1 1

 1 , 0

V3 =

 1 1

 1 , 1

 V4 =

0 1

 −1 . 0

Solving the four independent systems E11 = aV1 + bV2 + cV3 + dV4

E12 = aV1 + bV2 + cV3 + dV4

E21 = aV1 + bV2 + cV3 + dV4

E22 = aV1 + bV2 + cV3 + dV4

gives E11 = 2V1 − V2 − V4

E12 = −V1 + V2

E21 = −V1 + V2 + V4

E22 = −V2 + V3 .

Therefore span(V1 , V2 , V3 , V4 ) ⊇ span(E11 , E12 , E21 , E22 ) = M22 . 61. Let p(x) = 1 + x, q(x) = x + x2 , and r(x) = 1 + x2 . Then a + c = 1 1 1 1 = 0 ⇒ a= , b=− , c= 1 = ap(x) + bq(x) + cs(x) ⇒ a + b 2 2 2 b + c = 0 a + c = 0 1 1 1 = 1 ⇒ a= , b= , c=− x = ap(x) + bq(x) + cs(x) ⇒ a + b 2 2 2 b + c = 0 a + c = 0 1 1 1 = 0 ⇒ a=− , b= , c= . x2 = ap(x) + bq(x) + cs(x) ⇒ a + b 2 2 2 b + c = 1 Thus span(p(x), q(x), r(x)) ⊇ span(1, x, x2 ) = P2 . 62. Let p(x) = 1 + x + 2x2 , q(x) = 2 + x + 2x2 , and r(x) = −1 + x + 2x2 . Then x = ap(x) + bq(x) + cs(x) ⇒

a + 2b − c = 0 a + b + c = 1 2a + 2b + 2c = 0 .

But the second pair of equations are inconsistent, so that x ∈ / span(p(x), q(x), r(x)). span(p(x), q(x), r(x)) 6= P2 .

Therefore

6.2. LINEAR INDEPENDENCE, BASIS, AND DIMENSION

463

63. Suppose 0 and 00 satisfy Axiom 4. Then 0 + 00 = 0 since 00 satisfies Axiom 4, and also 0 + 00 = 00 since 0 satisfies Axiom 4. Putting these two equalities together gives 0 = 00 , so that 0 is unique. 64. Suppose v0 and v00 satisfy Axiom 5 for v. Then using Axiom 5, v00 = v00 + 0 = v00 + (v + v0 ) = (v00 + v) + v0 = 0 + v0 = v0 .

6.2

Linear Independence, Basis, and Dimension

1. We wish to determine if there are constants c1 , c2 , and c3 , not all zero, such that       1 1 1 −1 1 0 c1 + c2 + c3 = 0. 0 −1 1 0 3 2 This is equivalent to asking for solutions to the linear system c1 + c2 + c3 = 0 c1 − c2

=0

c2 + 3c3 = 0 −c1

+ 2c3 = 0.

Row-reducing the corresponding augmented matrix, we get    1 0 1 1 1 0 0 1  1 −1 0 0 →  0 0  0 1 3 0 −1 0 2 0 0 0

0 0 1 0

 0 0 . 0 0

This system has only the trivial solution c1 = c2 = c3 = 0, so that these matrices are linearly independent. 2. We wish to determine if there are constants c1 , c2 , and c3 , not all zero, such that       2 −3 1 −1 −1 3 c1 + c2 + c3 = 0. 4 2 3 3 1 5 This is equivalent to asking for solutions to the linear system 2c1 + c2 − c3 = 0 −3c1 − c2 + 3c3 = 0 4c1 + 3c2 + c3 = 0 2c1 + 3c2 + 5c3 = 0. Row-reducing the corresponding augmented matrix, we get    2 1 −1 0 1 0  −3 −1 3 0  → 0 1   4 0 0 3 1 0 2 3 5 0 0 0

−2 3 0 0

 0 0 . 0 0

This system has a nontrivial solution c1 = 2t, c2 = −3t, c3 = t; setting t = 1 for example gives       2 −3 1 −1 −1 3 2 −3 + = 0. 4 2 3 3 1 5 so that the matrices are linearly dependent.

464

CHAPTER 6. VECTOR SPACES

3. We wish to determine if there are constants c1 , c2 , c3 , and c4 , not all zero, such that         −1 1 3 0 0 2 −1 0 c1 + c2 + c3 + c4 = 0. −2 2 1 1 −3 1 −1 7 This is equivalent to asking for solutions to the linear system −c1 + 3c2 c1

− c4 = 0 + 2c3

=0

−2c1 + c2 − 3c3 − c4 = 0 2c1 + c2 + c3 + 7c4 = 0. Row-reducing the corresponding  −1  1  −2 2

augmented matrix, we get   1 0 3 0 −1 0 0 1 0 2 0 0 → 0 0 1 −3 −1 0 1 1 7 0 0 0

0 0 1 0

4 1 −2 0

 0 0 . 0 0

This system has a nontrivial solution c1 = −4t, c2 = −t, c3 = 2t, c4 = t; setting t = 1 for example gives         −1 1 3 0 0 2 −1 0 = 0. −4 − +2 + −2 2 1 1 −3 1 −1 7 so that the matrices are linearly dependent. 4. We wish to determine if there are constants    1 1 1 c1 + c2 0 1 1

c1 , c2 , c3 , and c4 , not all zero, such that      0 0 1 1 1 + c3 + c4 = 0. 1 1 1 1 0

This is equivalent to asking for solutions to the linear system c1 + c2 c1

+ c4 = 0 + c3 + c4 = 0

c2 + c3 + c4 = 0 c1 + c2 + c3 Row-reducing the corresponding augmented  1 1 0 1 1 0 1 1  0 1 1 1 1 1 1 0

matrix, we   1 0 0 0 → 0 0 0 0

= 0. get 0 0 0 1 0 0 0 1 0 0 0 1

 0 0 . 0 0

Since the only solution is the trivial solution, it follows that these four matrices are linearly independent over M22 . 5. Following, for instance, Example 6.26, we wish to determine if there are constants c1 and c2 , not both zero, such that c1 (x) + c2 (1 + x) = 0. This is equivalent to asking for solutions to the linear system c2 = 0 c1 + c2 = 0. By forward substitution, we see that the only solution is the trivial solution, c1 = c2 = 0. It follows that these polynomials are linearly independent over P2 .

6.2. LINEAR INDEPENDENCE, BASIS, AND DIMENSION

465

6. Following, for instance, Example 6.26, we wish to determine if there are constants c1 , c2 , and c3 , not all zero, such that c1 (1 + x) + c2 (1 + x2 ) + c3 (1 − x + x2 ) = (c1 + c2 + c3 ) + (c1 − c3 )x + (c2 + c3 )x2 = 0. This is equivalent to asking for solutions to the linear system c1 + c2 + c3 = 0 − c3 = 0

c1

c2 + c3 = 0. Row-reducing the corresponding augmented matrix, we get    1 1 1 0 1 0 0 1 0 −1 0 → 0 1 0 0 1 1 0 0 0 1

 0 0 . 0

Since the only solution is the trivial solution, it follows that these three polynomials are linearly independent over P2 . 7. Following, for instance, Example 6.26, we wish to determine if there are constants c1 , c2 , and c3 , not all zero, such that c1 (x) + c2 (2x − x2 ) + c3 (3x + 2x2 ) = (c1 + 2c2 + 3c3 )x + (−c2 + 2c3 )x2 = 0. This is a homogeneous system of two equations in three unknowns, so we know it has a nontrivial solution and these polynomials are linearly dependent. To find an expression showing the linear dependence, we must find solutions to the linear system c1 + 2c2 + 3c3 = 0 − c2 + 2c3 = 0. Row-reducing the corresponding augmented matrix, we get    1 2 3 0 1 0 7 → 0 −1 2 0 0 1 −2

 0 . 0

The general solution is c1 = −7t, c2 = 2t, c3 = t. Setting t = 1, for example, we get −7(x) + 2(2x − x2 ) + (3x + 2x2 ) = 0. 8. Following, for instance, Example 6.26, we wish to determine if there are constants c1 , c2 , c3 , and c4 , not all zero, such that c1 (2x)+c2 (x−x2 )+c3 (1+x3 )+c4 (2−x2 +x3 ) = (c3 +2c4 )+(2c1 +c2 )x+(−c2 −c4 )x2 +(c3 +c4 )x3 = 0. This is equivalent to asking for solutions to the linear system c3 + 2c4 = 0 2c1 + c2

=0

− c2

− c4 = 0 c3 + c4 = 0.

Row-reducing the corresponding augmented matrix, we get    0 0 1 2 0 1 0 2  0 1 1 0 0 0    0 −1 0 −1 0 → 0 0 0 0 1 1 0 0 0

0 0 1 0

0 0 0 1

 0 0 . 0 0

Since the only solution is the trivial solution, it follows that these four polynomials are linearly independent over P3 .

466

CHAPTER 6. VECTOR SPACES

9. Following, for instance, Example 6.26, we wish to determine if there are constants c1 , c2 , c3 , and c4 , not all zero, such that c1 (1 − 2x) + c2 (3x + x2 − x3 ) + c3 (1 + x2 + 2x3 ) + c4 (3 + 2x + 3x3 ) = (c1 + c3 + 3c4 ) + (−2c1 + 3c2 + 2c4 )x + (c2 + c3 )x2 + (−c2 + 2c3 + 3c4 )x3 = 0. This is equivalent to asking for solutions to the linear system c1

+ c3 + 3c4 = 0

−2c1 + 3c2

+ 2c4 = 0

c2 + c3

=0

− c2 + 2c3 + 3c4 = 0. Row-reducing the corresponding augmented matrix, we get    1 0 1 3 0 1 0 0 1  −2 3 0 2 0  →  0 0 0 1 1 0 0 0 −1 2 3 0 0 0

0 0 1 0

0 0 0 1

 0 0 . 0 0

Since the only solution is the trivial solution, it follows that these four polynomials are linearly independent over P3 . 10. We wish to find solutions to a + b sin x + c cos x = 0 that hold for all x. Setting x = 0 gives a + c = 0, so that a = −c; setting x = π gives a − c = 0, so that a = c. This forces a = c = 0; then setting x = π2 gives b = 0. So the only solution is the trivial solution, and these functions are linearly independent in F. 11. Since sin2 x + cos2 x = 1, the functions are linearly dependent. 12. Suppose that aex + be−x = 0 for all x. Setting x = 0 gives ae0 + be0 = a + b = 0, while setting x = ln 2 gives 2a + 12 b = 0. Since the coefficient matrix, " # 1 1 , 2 12 has nonzero determinant, the only solution to the system is the trivial solution, so that these functions are linearly independent in F . 13. Using properties of the logarithm, we have ln(2x) = ln 2 + ln x and ln(x2 ) = 2 ln x, so we want to see if {1, ln 2 + ln x, 2 ln x} are linearly independent over F . But clearly they are not, since by inspection (−2 ln 2) · 1 + 2 · (ln 2 + ln x) − 2 ln x = 0. So these functions are linearly dependent in F . 14. We wish to find solutions to a sin x + b sin 2x + c sin 3x = 0 that hold for all x. Setting π 2 π x= 6 π x= 3

a−c=0 √ 1 3 ⇒ a+ b=0 2√ 2√ 3 3 ⇒ a− b = 0. 2 2  √  Adding the second and third equations gives 12 + 23 a = 0, so that a = 0; then the first and third equations force b = c = 0. So the only solution is the trivial solution, and thus these functions are linearly independent in F . x=



6.2. LINEAR INDEPENDENCE, BASIS, AND DIMENSION

467

15. We want to prove that if the Wronskian is not identically zero, then f and g are linearly independent. We do this by proving the contrapositive: assume f and g are linearly dependent. If either f or g is zero, then clearly the Wronskian is zero as well. So assume that neither f nor g is zero; since they are linearly dependent, each is a nonzero multiple of the other. Say f (x) = ag(x), so that f 0 (x) = ag 0 (x). Now, the Wronskian is f (x) g(x) 0 0 0 0 0 0 0 f (x) g 0 (x) = f (x)g (x) − g(x)f (x) = ag(x)g (x) − g(x)(ag (x)) = ag(x)g (x) − ag(x)g (x) = 0. Thus the Wronskian is identically zero, proving the result. 1 sin x cos x cos x − sin x 2 2 16. Exercise 10: W (x) = 0 cos x − sin x = = − cos x − sin x = −1 6= 0. Since − sin x − cos x 0 − sin x − cos x the Wronskian is not zero, the functions are linearly independent. Exercise 11: 1 sin2 x cos2 x 2 sin x cos x −2 sin x cos x W (x) = 0 0 2(cos2 x − sin2 x) 2(sin2 x − cos2 x) 1 sin2 x cos2 x = 0 sin 2x − sin 2x 0 2 cos 2x −2 cos 2x sin 2x − sin 2x = 2 cos 2x −2 cos 2x = 0. Since the Wronskian is zero, the functions are linearly dependent. ex e−x = −1 − 1 = −2 6= 0. Since the Wronskian is not zero, the Exercise 12: W (x) = x e −e−x functions are linearly independent. Exercise 13: 1 ln(2x) ln(x2 ) 1 2 1 2 x = 0. W (x) = 0 = x1 x x − x2 − x22 0 − 12 − 22 x

x

Since the Wronskian is zero, the functions are linearly dependent. sin x sin 2x sin 3x 2 cos 2x 3 cos 3x . Rather than computing the determinant and Exercise 14: W (x) = cos x − sin x −4 sin 2x −9 sin 3x showing it is not identically zero, it suffices to find some value of x such that W (x) 6= 0. Let x = π3 ; then π 2π sin sin sin π 3 3 π 2π W = cos π3 2 cos 3 3 cos π 3 − sin π3 −4 sin 2π −9 sin π 3 √ √ 3 3 0 2 2 = 21 −1 −3 √ √3 − 2 −2 3 0 √ √ 3 3 2√ 2 = 3 √ − 3 −2 3 2   3 27 = 3 −3 + =− 6= 0. 4 4

468

CHAPTER 6. VECTOR SPACES So the Wronskian is not identically zero and therefore the functions are linearly independent.

17. (a) Yes, it is. Suppose a(u + v) + b(v + w) + c(u + w) = (a + c)u + (a + b)v + (b + c)w = 0. Since {u, v, w} are linearly independent, it follows that a + c = 0, a + b = 0, and b + c = 0. Subtracting the latter two from each other gives a − c = 0; combining this with the first equation shows that a = c = 0, and then b = 0. So the only solution to the system is the trivial one, so these three vectors are linearly independent. (b) No, they are not, since (u − v) + (v − w) − (u − w) = 0. 18. Since dim M22 = 4 but B has only three elements, it cannot be a basis. 19. If a

 1 0

  0 0 +b 1 1

  −1 1 +c 0 1

  1 1 +d 1 1

  1 0 = −1 0

 0 , 0

then a

+c+d= 0 −b+c+d= 0 b+c+d= 0

a

+ c − d = 0.

Subtracting the second and third equations shows that b = 0; subtracting the first and fourth shows that d = 0. Then from the second equation, c = 0 as well, which forces a = 0. So the only solution is the trivial solution, and these matrices are linearly independent. Since there are four of them, and dim M22 = 4, it follows that they form a basis for M22 . 20. Since dim M22 = 4, if these vectors are linearly independent then they span and thus form a basis. To see if they are linearly independent, we want to solve         1 0 0 1 1 1 1 0 c1 + c2 + c3 + c4 = 0. 0 1 1 0 0 1 1 1 Solving this linear system gives c1 = −2c4 , c2    1 0 0 −2 − 0 1 1

= −c4 , and c3 = c4 . For example, letting c4 = 1 gives      1 1 1 1 0 + + = 0. 0 0 1 1 1

Thus the matrices are linearly dependent so do not form a basis (see Theorem 6.10). 21. Since dim M22 = 4 but B has five elements, it cannot be a basis. 22. Since dim P2 = 3, if these vectors are linearly independent then they span and thus form a basis. To see if they are linearly independent, we want to solve c1 x + c2 (1 + x) + c3 (x − x2 ) = c2 + (c1 + c2 + c3 )x − c3 x2 = 0. Rather than setting up the augmented matrix and row-reducing it, simply note that the first and third terms force c2 = c3 = 0; then the second term forces c1 = 0. So the only solution is the trivial solution, and these three polynomials are linearly independent, so they form a basis for P2 .

6.2. LINEAR INDEPENDENCE, BASIS, AND DIMENSION

469

23. Since dim P2 = 3, if these vectors are linearly independent then they span and thus form a basis. To see if they are linearly independent, we want to solve c1 (1 − x) + c2 (1 − x2 ) + c3 (x − x2 ) = (c1 + c2 ) + (−c1 + c3 )x + (−c2 − c3 )x2 = 0. To solve the resulting linear system, we row-reduce its augmented    1 1 0 0 1 0 −1 −1 0 1 0 → 0 1 1 0 −1 −1 0 0 0 0

matrix:  0 0 . 0

This has general solution c1 = t, c2 = −t, c3 = t, so setting t = 1 we get (1 − x) − (1 − x2 ) + (x − x2 ) = 0. Therefore the vectors are not linearly independent, so they cannot form a basis for P2 , which has dimension 3. 24. Since dim P2 = 3 but there are only two vectors in B, they cannot form a basis. 25. Since dim P2 = 3 but there are four vectors in B, they must be linearly dependent, so cannot form a basis. 26. We want to find constants c1 , c2 , c3 , and c4 such that  c1 E22 + c2 E21 + c3 E12 + c4 E11 = A =

1 3

 2 . 4

Setting up the equivalent linear system gives c1

=4 c2

=3 c3

=2 c4 = 1.

This system obviously has the solution c1 = 4, c2 = 3, c3 = 2, c4 = 1, so the coordinate vector is   4 3  [A]B =  2 . 1 27. We want to find constants c1 , c2 , c3 , and c4 such that        1 0 1 1 1 1 1 c1 + c2 + c3 + c4 0 0 0 0 1 0 1

  1 1 = 1 3

 2 . 4

Setting up the equivalent linear system gives c1 + c2 + c3 + c4 = 1 c2 + c3 + c4 = 2 c3 + c4 = 3 c4 = 4. By backwards substitution, the solution is c4 = 4, c3 = −1, c2 = −1, c1 = −1, so the coordinate vector is   −1 −1  [A]B =  −1 . 4

470

CHAPTER 6. VECTOR SPACES

28. We want to find constants c1 , c2 , and c3 such that p(x) = 1 + 2x + 3x2 = c1 (1 + x) + c2 (1 − x) + c3 (x2 ) = (c1 + c2 ) + (c1 − c2 )x + c3 x2 . The corresponding linear system is c1 + c2

=1

c1 − c2

=2 c3 = 3.

Row-reducing the corresponding augmented  1 1 0  1 −1 0  0 0 1

matrix gives   1 0 0 1     2 → 0 1 0 3

0

0

1



3 2 − 12  ,

3

so that c1 = 23 , c2 = − 12 , and c3 = 3. Thus the coordinate vector is   3 2

  [p]B = − 12  . 3 29. We want to find constants c1 , c2 , and c3 such that p(x) = 2 − x + 3x2 = c1 (1) + c2 (1 + x) + c3 (−1 + x2 ) = (c1 + c2 − c3 ) + c2 x + c3 x2 . Equating coefficients shows that c3 = 3 and c2 = −1; substituting into the equation c1 + c2 − c3 = 2 gives c1 = 6, so that the coordinate vector is   6 [p]B = −1 . 3 30. To show B is a basis, we must show that it spans V and that the elements of B are linearly independent. B spans V , since if v ∈ V , then it can be written as a linear P combination of vectors in B by hypothesis. To show they are linearly independent, note that 0 = P 0ui ; since the representation of 0 is unique by assumption, it follows that the only solution to 0 = ci ui is the trivial solution, so that B = {ui } is a linearly independent set. Thus B is a basis for V . 31. [c1 u1 + · · · + ck uk ]B = [c1 u1 ]B + · · · + [ck uk ]B = c1 [u1 ]B + · · · + ck [uk ]B . 32. Suppose c1 u1 + · · · + cn un = 0. Then clearly 0 = [c1 u1 + · · · + cn un ]B . But by Exercise 31, the latter expression is equal to c1 [u1 ]B + · · · + cn [un ]B ; since [ui ]B are linearly independent in Rn , we get ci = 0 for all i, so that the only solution to the original equation is the trivial solution. Thus the ui are linearly independent in V . 33. First suppose that span(u1 , . . . , um ) = V . Choose   c1  c2    x =  .  ∈ Rn .  ..  cn

6.2. LINEAR INDEPENDENCE, BASIS, AND DIMENSION Then

P

471

ci ui ∈ V since V = span(ui ). Therefore by Exercise 31 x = [c1 u1 + · · · + cm um ]B = c1 [u1 ]B + · · · + cm [um ]B ,

which shows that x ∈ span(S). For the converse, suppose that span(S) = Rn . Choose v ∈ V ; then [v]B ∈ Rn , and using Exercise 31 again, hX i X [v]B = ci [ui ]B = ci ui . B P But since B is a basis, and the two elements v and c u of V have the same coordinate vector, they i i P must be the same element. Thus v = ci ui , so that {ui } spans V . 34. Using the standard basis {x2 , x, 1} for P2 , if p(x) = ax2 + bx + c ∈ V , then p(0) = c = 0, so that c = 0 and therefore p(x) = ax2 + bx. Since x2 and x are linearly independent and span V , {x, x2 } is a basis, so that dim V = 2. 35. Two polynomials in V are p(x) = 1 − x and q(x) = 1 − x2 , since p(1) = q(1) = 0. If we show these polynomials are linearly independent and span V , we are done. They are linearly independent, since if a(1 − x) + b(1 − x2 ) = (a + b) − ax − bx2 = 0, then equating coefficients shows that a = b = 0. To see that they span, suppose r(x) = c+bx+ax2 ∈ V . Then r(1) = a + b + c = 0, so that c = −b − a and therefore r(x) = −(b + a) + bx + ax2 = −b(1 − x) − a(1 − x2 ), so that r is a linear combination of p and q. Thus {p(x), q(x)} forms a basis for V , which has dimension 2. 36. Suppose p(x) = c + bx + ax2 ; then xp0 (x) = bx + 2ax2 . So p(x) = xp0 (x) means that c = 0 and a = 2a, so that a = 0. Thus p(x) = bx. Then clearly V = span(x), so that {x} forms a basis, and dim V = 1. 37. Upper triangular matrices in M22 have the form   a b = aE11 + bE12 + dE22 . 0 d Since E11 , E12 , and E22 span V , and we know they are linearly independent, it follows that they form a basis for V , which therefore has dimension 3. 38. Skew-symmetric matrices in M22 have the form    0 b 0 =b −b 0 −1

 1 = bA. 0

Since A spans V , and {A} is a linearly independent set, we see that {A} is a basis for V , which therefore has dimension 1. 39. Suppose that A= Then

 a AB = c

 a+b , c+d

 a c

 b . d

 a+c BA = c

 b+d . d

Then AB = BA means that a = a + c, so that c = 0, and also a + b = b + d, so that a = d. So the matrix A has the form     a b 1 0 A= =a + bE12 = aI + bE12 . 0 a 0 1 I and E12 are linearly independent since I = E11 + E22 , and {E11 , E22 , E12 } is a linearly independent set. Since they span, we see that {I, E12 } forms a basis for V , which therefore has dimension 2.

472

CHAPTER 6. VECTOR SPACES

40. dim Mnn = n2 . For symmetric matrices, we have aij = aji . Therefore specifying the entries on or above the diagonal determines the matrix, and these entries may take any value. Since there are n + (n − 1) + · · · + 2 + 1 =

n(n + 1) 2

entries on or above the diagonal, this is the dimension of the space of symmetric matrices. 41. dim Mnn = n2 . For skew-symmetric matrices, we have aij = −aji . Therefore all the diagonal entries are zero; further, specifying the entries above the diagonal determines the matrix, and these entries may take any value. Since there are (n − 1) + · · · + 2 + 1 =

n(n − 1) 2

entries above the diagonal, this is the dimension of the space of symmetric matrices. 42. Following the hint, Let B = {v1 , . . . , vk } be a basis for U ∩ W . By Theorem 6.10(e), this can be extended to a basis C for U and to a basis D for W . Consider the set C ∪ D. Clearly it spans U + W , since an element of U + W is u + w for u ∈ U , w ∈ W ; since C and D are bases for U and W , the result follows. Next we must show that this set is linearly independent. Suppose dim U = m and dim V = n; then C = {uk+1 , . . . , um } ∪ B, D = {wk+1 , . . . , wn } ∪ B. Suppose that a linear combination of C ∪ D is the zero vector. This means that a1 v1 + · · · + ak vk + ck+1 uk+1 + · · · + cm um + dk+1 wk+1 + dn wn = A + C + D = 0. This means that C = −A − D. Since A ∈ U ∩ W and D ∈ W , we see that −A − D ∈ W so that C ∈ W . But also C ∈ U , since it is a combination of basis vectors for U . Thus C ∈ U ∩ W , which is impossible unless all of the ci are zero, since the uk+1 are elements of U not in the subspace U ∩ W . We are left with A + D = 0; this is a linear combination of basis elements of W , so that all the ai and all the dj must be zero as well. Therefore C ∪ D is linearly independent. It follows that C ∪ D forms a basis for U + W , so it remains to count the elements in this set. Clearly C ∩ D = B, for any basis element in common between C and D must be in U ∩ W , so it must be in B. Write #X for the number of elements in the set X. Then dim(U + W ) = #(C ∪ D) = #C + #D − #B = n + m − k = dim U + dim W − dim(U ∩ W ). 43. Let dim U = m, with basis B = {u1 , . . . , um }, and dim V = n, with basis C = {v1 , . . . , vn }. (a) Claim that a basis for U × V is D = {(ui , 0), (0, vj )}. This will also show that dim(U × V ) = mn, by counting the basis vectors. First we want to show that D spans. Let x = (u, v) ∈ U × V . Then   m n m n X X X X x = (u, v) =  ci ui , dj vj  = ci (ui , 0) + dj (0, vj ), i=1

j=1

i=1

j=1

proving that D spans U × V . To see that D is linearly independent, suppose that 0 is a linear combination of the elements of D. This means that   m n m n X X X X 0 = (0, 0) = ci (ui , 0) + dj (0, vj ) =  ci ui , dj vj  . i=1

j=1

i=1

j=1

Pm But the ui are linearly independent, so that P i=1 ci ui = 0 means that all of the ci are zero; n similarly the vj are linearly independent, so that j=1 dj vj = 0 means that all of the dj are zero. Thus D is linearly independent. So D is a basis for U × W , and therefore dim(U × W ) = mn.

6.2. LINEAR INDEPENDENCE, BASIS, AND DIMENSION

473

(b) Let dim W = k, with basis {w1 , . . . , wk }. Claim that D = {(wi , wi )} is a basis for ∆; proving this claim also shows, by countingPthe basis vectors, that dim ∆ = dim W . To see that D spans ∆, choose (w, w) ∈ ∆; then w = ci wi , so that X  X X X (w, w) = ci wi , ci wi = (ci wi , ci wi ) = ci (wi , wi ). P Thus D spans ∆. To see that D is linearly independent, suppose that ci (wi , wi ) = 0. Then X  X X X 0= ci (wi , wi ) = (ci wi , ci wi ) = ci wi , ci wi . P This means that ci wi = 0; since the wi are linearly independent, we see that all the ci are zero. So D is linearly independent, showing that it is a basis for ∆ and thus that dim ∆ = dim W . 44. Suppose that P is finite dimensional, and that {p1 (x), . . . , pn (x)} is a basis. Let m be the largest power of x that appears in any of the basis elements, and consider q(x) = xm+1 . Since any linear combination of the pi has powers of x at most equal to m, it follows that q ∈ / span{pi }, so that P cannot have a finite basis. 45. Start by adding all three standard basis vectors for P2 to the set, yielding {1 + x, 1 + x + x2 , 1, x, x2 }. Since (1 + x) + x2 = 1 + x + x2 , we see that {1 + x, 1 + x + x2 , x2 } is linearly dependent, so discard x2 , leaving {1 + x, 1 + x + x2 , 1, x}. Next, since (1 + x) = 1 + x, the set {1 + x, 1, x} is linearly dependent, so discard x, leaving B = {1 + x, 1 + x + x2 , 1}. This is a set of three vectors, so to prove that it is a basis, it suffices to show that it spans. But by construction, the standard basis vectors not in B are linear combinations of elements of B. Hence B is a basis for P2 . 46. Start by adding the standard basis vectors Eij to the set, giving       0 1 1 1 B= C= ,D= , E11 , E12 , E21 , E22 . 0 1 0 1 Since E11 = D − C is the difference of the two given matrices, we discard it from B, leaving us with five elements. Next, C = E12 + E22 , so discard E22 , leaving us with B = {C, D, E12 , E21 }. This has four elements, and dim M22 = 4, so to prove that this is a basis, it suffices to prove that it spans. But by construction, the standard basis vectors not in B are linear combinations of elements of B. So B spans and therefore forms a basis for M22 . 47. Start by adding the standard basis vectors Eij to the set, giving         1 0 0 1 0 −1 B= I= ,C= ,D= , E11 , E12 , E21 , E22 . 0 1 1 0 1 0 Since E21 = 12 (C + D), we discard it from B. Next, I = E11 + E22 , so discard E22 from B. Finally, E12 = 21 (C − D), so we discard it from B. This leaves us with B = {I, C, D, E11 }. This has four elements, and dim M22 = 4, so to prove that this is a basis, it suffices to prove that it spans. But by construction, the standard basis vectors not in B are linear combinations of elements of B. So B spans and therefore forms a basis for M22 . 48. Start by adding the standard basis vectors Eij to the set, giving       1 0 0 1 B= I= ,C= , E11 , E12 , E21 , E22 . 0 1 1 0 Since I + C = E11 + E12 + E21 + E22 , these are a linearly dependent set; discard E22 from B. Next, C = E12 + E21 , so discard E21 from B. This leaves us with B = {I, C, E11 , E12 }. This has four elements, and dim M22 = 4, so to prove that this is a basis, it suffices to prove that it spans. But by construction, the standard basis vectors not in B are linear combinations of elements of B. So B spans and therefore forms a basis for M22 .

474

CHAPTER 6. VECTOR SPACES

49. The standard basis for P1 is {1, x}. If we show that each of these vectors is in the span of the given vectors, then those vectors span P1 . But 1 ∈ span(1, 1 + x, 2x), and x = (1 + x) − 1 ∈ span(1, 1 + x, 2x). Thus span(1, 1 + x, 2x) ⊇ span(1, x) = P1 , so that a basis for the span is {1, x}. 50. Since (1 − 2x) + (2x − x2 ) = 1 − x2 , these three elements are linearly independent, so we can discard 1 − x2 from the set without changing the span, leaving {1 − 2x, 2x − x2 , 1 + x2 }. Claim these are linearly independent. Suppose that a(1 − 2x) + b(2x − x2 ) + c(1 + x2 ) = (a + c) + (−2a + 2b)x + (−b + c)x2 = 0. Then a = −c and a = b from the constant term and the coefficient of x, so that b = −c. But then b = −c together with −b + c = 0 shows that b = c = 0, so that a = 0. Hence these three polynomials are linearly independent, so they form a basis for the span of the original set. 51. Since (1 − x) + (x − x2 ) = 1 − x2 and (1 − x) − (x − x2 ) = 1 − 2x + x2 , we can remove both 1 − x2 and 1 − 2x + x2 from the set without changing the span. This leaves us with {1 − x, x − x2 }. Claim these are linearly independent. Suppose that a(1 − x) + b(x − x2 ) = a + (−a + b)x − bx2 = 0. Then clearly a = b = 0. Hence these two polynomials are linearly independent, so they form a basis for the span of the original set. 52. Let

 1 V1 = 0

 0 , 1

 0 V2 = 1

 1 , 0

 −1 V3 = 1

 1 , −1



1 V4 = −1

 −1 . 1

Then V3 + V4 = 0, so we can discard V4 from the set without changing the span. Next, V3 = V2 − V1 , so we can discard V3 from the set without changing the span. So we are left with {V1 , V2 }; these are linearly independent since they are not scalar multiples of one another. So this set forms a basis for the span of the given set. 53. Since cos 2x = cos2 x − sin2 x, we can discard cos 2x from the set without changing the span, leaving us with {sin2 x, cos2 x}. Claim that this set is linearly independent. For suppose that a sin2 x+b cos2 x = 0 holds for all x. Setting x = 0 gives b = 0, while setting x = π2 gives a = 0. Thus a = b = 0 and these functions are linearly independent. So {sin2 x, cos2 x} forms a basis for the span of the original set. 54. Suppose that av + c1 v1 + · · · + cn vn = 0, so that c1 v1 + · · · + cn vn = −av. If a 6= 0, then dividing through by −a shows that v ∈ span(S), a contradiction. Therefore we must have a = 0, and we are left with c1 v1 + · · · + cn vn = 0, which implies (since the vi are linearly independent) that all of the ci are zero. Therefore all coefficients must be zero, so that S 0 is linearly independent. 55. If vn ∈ span(v1 , . . . , vn−1 ), then there are constants such that vn = c1 v1 + · · · + cn−1 vn−1 . Now choose x ∈ V = span(S). Then x = a1 v1 + · · · + an−1 vn−1 + an vn = a1 v1 + · · · + an−1 vn−1 + an (c1 v1 + · · · + cn−1 vn−1 ) = (a1 + an c1 )v1 + · · · + (an−1 + an cn−1 )vn−1 , showing that x ∈ span(S 0 ). Then V ⊆ span(S 0 ) ⊆ span(S) = V , so that span(S 0 ) = V .

6.2. LINEAR INDEPENDENCE, BASIS, AND DIMENSION

475

56. Use induction on n. If S = {v1 }, then S is linearly independent so is a basis. Next, suppose the statement is true for all sets that span V and have no more than n − 1 elements. Let S = {v1 , . . . , vn }. If S is linearly independent, then it is a basis and we are done. Otherwise it is a linearly dependent set, so we can write some element of S (say, vn after a possible reordering) as a linear combination of the other, so that vn ∈ span{v1 , · · · , vn−1 }. Then by Exercise 55, we can remove vn from S, giving S 0 = {v1 , . . . , vn−1 } that also spans V . By the inductive assumption, S 0 can be made into a basis for V by removing some (possibly no) elements, so S can. 57. Suppose x ∈ V . Since S = {v1 , . . . , vn } is a basis, there are scalars a1 , . . . , an such that x = a1 v1 + a2 v2 + · · · + an vn . Since the ci are nonzero, we also have a2 an a1 (cn vn ), x = (c1 v1 ) + v2 + · · · + c1 c2 cn so that T = {c1 v1 , . . . , cn vn } spans V . To show T is linearly independent, the argument is much the same. Suppose that b1 (c1 v1 ) + b2 (c2 v2 ) + · · · + bn (cn vn ) = (b1 c1 )v1 + (b2 c2 )v2 + · · · + (bn cn )vn = v0 . Since S is linearly independent, it follows that bi ci = 0 for all i. Since all of the ci are nonzero, all of the bi must be zero, so that T is linearly independent. Therefore T is also a basis for V . 58. Let S = {v1 , v2 , . . . , vn },

T = {v1 , v1 + v2 , v1 + v2 + v3 , . . . , v1 + v2 + · · · + vn }.

By Exercise 21 in Section 2.3, to show that span(T ) = span(S) = V it suffices to show that every element of S is in span(T ) and vice versa. But vi = (v1 + v2 + · · · + vi−1 + vi ) − (v1 + v2 + · · · + vi−1 ) ∈ span(T ) and v1 + · · · + vi = (v1 ) + (v2 ) + · · · + (vi ) ∈ span(S). Thus the two spans are equal. Since S is a basis, and T has the same number of elements as S, it must also be a basis, by Theorem 6.10(d). 59. (a) (x − a1 )(x − a2 ) (x − 2)(x − 3) x2 − 5x + 6 1 5 = = = x2 − x + 3 (a0 − a1 )(a0 − a2 ) (1 − 2)(1 − 3) 2 2 2 (x − a0 )(x − a2 ) (x − 1)(x − 3) x2 − 4x + 3 p1 (x) = = = = −x2 + 4x − 3 (a1 − a0 )(a1 − a2 ) (2 − 1)(2 − 3) −1 (x − a0 )(x − a1 ) (x − 1)(x − 2) x2 − 3x + 2 1 3 p2 (x) = = = = x2 − x + 1. (a2 − a0 )(a2 − a1 ) (3 − 1)(3 − 2) 2 2 2 p0 (x) =

(b) If i = j, then substituting aj for x in the definition of pi gives a numerator equal to the denominator, so that pi (aj ) = 1 for i = j. If i 6= j, however, then substituting aj for x in the definition of pi gives zero, since the term x − aj in the numerator of pi becomes zero. Thus ( 0 if i 6= j pi (aj ) = 1 if i = j. 60. (a) Note first that the degree of each pi is n, since there are n factors in the numerator, so that pi ∈ Pn for all i. Suppose that there are constants ci such that c0 p0 (x) + c1 p1 (x) + · · · + cn pn (x) = 0. Then this equation must hold for all x. Substitute ai for x. Then by Exercise 59(b), all pj (ai ) for j 6= i vanish, leaving only ci pi (ai ) = ci . Therefore ci = 0. Since this holds for every i, we see that all of the ci are zero, so that the pi are linearly independent.

476

CHAPTER 6. VECTOR SPACES (b) Since dim Pn = n + 1 and there are n + 1 polynomials p0 , p1 , . . . , pn that span, they must form a basis, by Theorem 6.10(d).

61. (a) This is similar to part (a) of Exercise 60. Suppose that q(x) ∈ P, and that q(x) = c0 p0 (x) + c1 p1 (x) + · · · + cn pn (x). Then substituting ai for x gives q(ai ) = ci pi (ai ) = ci , since pj (ai ) = 0 for j 6= i by Exercise 59(b). Since the pi form a basis for P, any element of P has a unique representation as a linear combination of the pi , so that unique representation must be the one we just found: q(x) = q(a0 )p0 (x) + q(a1 )p1 (x) + · · · + q(an )pn (x). (b) q(x) passes through the given n + 1 points if and only if q(ai ) = c1 . But this is exactly what we proved in part (a). Since the representation is unique in part (a), this is the only such degree n polynomial. (c) (i) Here a0 = 1, a1 = 2, a2 = 3, so we use the Lagrange polynomials determined in Exercise 59(a) and therefore q(x) = c0 p0 (x) + c1 p1 (x) + c2 p2 (x)      1 2 3 1 2 5 x − x + 3 − −x2 + 4x − 3 − 2 x − x+1 =6 2 2 2 2 = 3x2 − 16x + 19. (ii) Here a0 = −1, a1 = 0, a2 = 3. We first compute the Lagrange polynomials: (x − a1 )(x − a2 ) (x − 0)(x − 3) x2 − 3x 1 3 = = = x2 − x (a0 − a1 )(a0 − a2 ) (−1 − 0)(−1 − 3) 4 4 4 2 (x − (−1))(x − 3) x − 2x − 3 1 2 (x − a0 )(x − a2 ) = = = − x2 + x + 1 p1 (x) = (a1 − a0 )(a1 − a2 ) (0 − (−1))(0 − 3) −3 3 3 (x − a0 )(x − a1 ) (x − (−1))(x − 0) x2 + x 1 2 1 p2 (x) = = = = x + x. (a2 − a0 )(a2 − a1 ) (3 − (−1))(3 − 0) 12 12 12 p0 (x) =

Using these polynomials, we get q(x) = c0 p0 (x) + c1 p1 (x) + c2 p2 (x)       1 2 1 2 3 1 2 2 1 x − x +5 − x + x+1 +2 x + x = 10 4 4 3 3 12 12     5 5 1 15 10 1 = − + + x2 + − + x+5 2 3 6 2 3 6 = x2 − 4x + 5. 62. Suppose that (a0 , 0), (a1 , 0), . . . , (an , 0) be the n + 1 distinct zeros. Then by Exercise 61(a), the only polynomial passing through these n + 1 points is q(x) = 0p0 (x) + 0p1 (x) + · · · + 0pn (x) = 0, so it is the zero polynomial. 63. An n × n matrix is invertible if and only if it row-reduces to the identity matrix, so that its columns are linearly independent. Since n linearly independent columns in Znp form a basis for Znp , finding the number of invertible matrices in Mnn (Zp ) is the same as finding the number of ways to construct a basis for Znp . To construct a basis, we can start by choosing any nonzero vector v1 ∈ Znp . There are pn − 1 ways to choose this vector (since Znp has pn elements, and we are excluding only one). For the second basis

Exploration: Magic Squares

477

element v2 , we can choose any vector not in span(v1 ) = {a1 v1 : a1 ∈ Zp }. There are therefore pn − p choices for v2 . Likewise, suppose we have chosen v1 , . . . , vk−1 . Then we can choose for vk any vector not in the span of the previous vectors, so any vector not in span(v1 , . . . , vk−1 ) = {a1 v1 + a2 v2 + · · · + ak−1 vk−1 : a1 , a2 , . . . , ak−1 ∈ Zp }. There are pn − pk−1 ways of choosing vk . Thus the total number of bases is (pn − 1)(pn − p)(pn − p2 )(· · · )(pn − pn−1 ).

Exploration: Magic Squares 1. A classical magic square M contains each of the numbers 1, 2, . . . , n2 exactly once. Since wt(M ) is the common sum of any row (or column), and there are n rows, we have, using the comments preceding Exercise 51 in Section 2.4) wt(M ) =

1 n2 (n2 + 1) n(n2 + 1) 1 (1 + 2 + · · · + n2 ) = · = . n n 2 2 2

2. A classical 3 × 3 magic square will have common sum 3(3 2+1) = 15. Therefore that no two of 7, 8, and 9 can be in the same row, column, or diagonal, since the sum of any two of these numbers is at least 15. The central square cannot be 1, since then 1 + 2 + k = 15 is impossible. Similarly, it cannot be any of 2 through 4, since then some row, column, or diagonal contains the numbers 1 and 2, which is impossible. The central square cannot be 7, 8, or 9, since then some line will contain two of those numbers. Finally, it cannot be 6, since then some line will contain both 6 and 9. So the central square must be 5. Also, 7 and 1 cannot be in the same line, since the third number would have to be 7. Finally, 9 cannot be in a corner, since it would then participate in three lines adding to 15. But there are only two possible sets of numbers, one of which is 9, adding to 15: 9, 5, 1 and 9, 4, and 2. Thus 9 must be on a side. Keeping these constraints in mind and trying a few configurations gives for example     2 7 6 8 1 6 9 5 1 3 5 7 4 3 8 4 9 2 These examples are essentially the same — the second one is a rotated and reflected version of the first. 3. Since the magic squares in Exercise 2 have a weight of 15, subtracting 14 3 from each entry will subtract 14 from each row and column while maintaining common sums, so the result will have weight 1. For example, using the second magic square above, we get   7 4 − 38 3 3  13  1 . − 11  3 3 3  10 − 23 − 53 3 In general, starting from any 3 × 3 classical magic square, to construct a magic square of weight w, add w−15 to each entry in the magic square. This clearly preserves the common sum property, and 3 the new sum of each row is 15 + 3 w−15 = w. 3 4. (a) We must show that if A and B are magic squares, then so is A + B, and if c is any scalar, then cA is also a magic square. Suppose wt(A) = a and wt(B) = b. Then the sum of the elements in any row of A + B is the sum of the elements in that row of A plus the sum of the elements in that row of B, which is therefore a + b. The same argument shows that the sum of the elements in any column or either diagonal of A + B is also a + b, so that A + B ∈ Mag 3 with weight a + b. In the same way, the sum of the elements in any row or column, or either diagonal, of cA is c times the sum of the elements in that line of A, so it is equal to ca. Thus cA ∈ Mag 3 with weight ca.

478

CHAPTER 6. VECTOR SPACES (b) If A, B ∈ Mag 03 , then wt(A) = wt(B) = 0. From part (a), wt(A + B) = 0 as well, and also wt(cA) = c wt(A) = 0. Thus A + B, cA ∈ Mag 03 , so by Theorem 6.2, Mag 03 is a subspace of Mag 3 .

5. Let M be any 3 × 3 magic square, with wt(M ) = w. Using Exercise 3, we define M0 to be the magic square of weight 0 derived from M by adding −w 3 to every element of M . This is the same as adding the matrix   − w3 − w3 − w3 w   w − 3 − w3 − w3  = − J 3 − w3 − w3 − w3 to M . Thus M0 = M − Thus k =

w w J, so that M = M0 + J. 3 3

w 3.

6. The linear equations express the fact that every row and column sum each diagonal sum is zero: a+b+c

=0 d+e+f

=0 g+h+i=0

a

+d b

+g +e

c a

+h +f

+e

=0 +i=0

+e c

Row-reducing the associated  1 1 1 0 0 0 0 0 1 1  0 0 0 0 0  1 0 0 1 0  0 1 0 0 1  0 0 1 0 0  1 0 0 0 1 0 0 1 0 1

=0

+i=0 +g

=0

matrix gives 0 0 0 0 1 0 0 0 0 1 1 1 0 1 0 0 0 0 1 0 1 0 0 1 0 0 0 1 0 1 0 0

  0 1 0 0   0 0    0 0 →  0 0   0 0  0 0 0 0

0 1 0 0 0 0 0 0

0 0 1 0 0 0 0 0

0 0 0 1 0 0 0 0

0 0 0 0 1 0 0 0

0 0 0 0 0 1 0 0

0 0 0 0 0 0 1 0

0 1 1 0 −1 −1 −1 −2 0 0 1 2 1 1 0 0

 0 0  0  0 . 0  0  0 0

So there are two free variables in the general solution, which gives the magic square   −s −t s+t  t + 2s 0 −t − 2s . −s − t t s 7. From Exercise 6, a basis for Mag 03 is given by    −1 0 1 0 A =  2 0 −2 , B =  1 −1 0 1 −1

 −1 1 0 −1 , 1 0

so that Mag 03 has dimension 2. Note that these matrices are linearly independent since they are not scalar multiples of each other. (The matrix in Exercise 6 is not the same as that given in the hint. To show that the two are just different forms of one another, make the substitutions u = −s, v = s + t; then s = −v and t = v − s = u + v, so the matrix becomes   u −u − v v −u + v 0 u − v , −v u+v −u which is the matrix given in the hint.)

Exploration: Magic Squares

479

8. From Exercise 5, an arbitrary element M ∈ Mag 3 can be written

M = M0 + kJ,

M0 ∈ Mag 3 ,

 1 J = 1 1

 1 1 . 1

1 1 1

Since M0 has basis {A, B} (see Exercise 7), it follows that M3 is spanned by S = {A, B, J}. To see that this is a basis, we must prove that these matrices are linearly independent. Now, we have seen that A and B are linearly independent, so if S is linearly dependent, then span(S) = span(A, B) = Mag 03 . But this is impossible, since S spans Mag 3 , which contains matrices not in Mag 03 . Thus S is linearly independent, so it forms a basis for Mag 3 , which therefore has dimension 3. 9. Using the matrix M from exercise 5, adding both diagonals and the middle column gives (a + e + i) + (c + e + g) + (b + e + h) = (a + b + c) + (g + h + i) + 3e, which is the sum of the first and third rows, plus 3 times the central entry. Since each row, column, and diagonal sum is w, we get w + w + w = w + w + 3e, so that e = w3 . 10. The sum of the squares in the entries of M are s2 + (−s − t)2 + t2 + (−s + t)2 + 02 + (s − t)2 + (−t)2 + (s + t)2 + (−s)2 = s2 + (s2 + 2st + t2 ) + t2 + (s2 − 2st + t2 ) + (s2 − 2st + t2 ) + t2 + (s2 + 2st + t2 ) + s2 = 6s2 + 6t2 . We have another way of computing this sum, however. Since M was obtained from a classical 3 × 3 magic square by subtracting 5 from each entry (see Exercise 5), the sum of the squares of the entries of M is also 2

2

2

(1 − 5) + (2 − 5) + · · · + (9 − 5) = 60. √ So the equation is 6s2 + 6t2 = 60, or s2 + t2 = 10, which is a circle of radius 10 centered at the origin. The only way to write 10 as a sum of two integral squares is (±1)2 + (±3)2 , so this equation has the eight integral solutions (s, t) = (−3, −1), (−3, 1), (−1, −3), (−1, 3), (3, −1), (3, 1), (1, −3), (1, 3). Substituting those values of s and t into 

s+5 M + 5J = −s + t + 5 −t + 5

−s − t + 5 5 s+t+5

 t+5 s − t + 5 −s + 5

gives the eight classical magic squares   2 9 4 7 5 3 , 6 1 8   8 3 4 1 5 9 , 6 7 2

  2 7 6 9 5 1 , 4 3 8   8 1 6 3 5 7 , 4 9 2

  4 9 2 3 5 7 , 8 1 6   6 7 2 1 5 9 , 8 3 4

  4 3 8 9 5 1 , 2 7 6   6 1 8 7 5 3 . 2 9 4

These matrices can all be obtained from one another by some combination of row and column interchanges and matrix transposition. A plot of the circle, with these eight points marked, is

480

CHAPTER 6. VECTOR SPACES t

3

2

1

-3

-2

1

-1

2

s

3

-1

-2

-3

6.3

Change of Basis

1. (a)         2 1 0 b1 = 2 2 = b1 + b2 ⇒ ⇒ [x]B = 3 0 1 b2 = 3 3 " #       5 5 c1 = 2 2 1 1 2 ⇒ [x] = = c1 + c2 ⇒ C 3 1 −1 c2 = − 21 − 12 (b) By Theorem 6.13, [ C | B ] → [ I | PC←B ]: " # " 1 1 1 0 1 0 → 1 −1 0 1 0 1 " (c) [x]C = PC←B [x]B =

1 2 1 2

1 2 1 −2

1 2 1 2

1 2 − 12

#

" ⇒

PC←B =

1 2 1 2

1 2 − 21

#

#" # " # 5 2 2 , which is the same as the answer from part (a). = 3 − 12

(d) By Theorem 6.13, [ B | C ] → [ I | PB←C ]:     1 0 1 1 1 1 ⇒ PB←C = 0 1 1 −1 1 −1 " #" # " # 5 1 1 2 = 2 , which is the same as the answer from part (a). (e) [x]B = PB←C [x]C = 1 −1 − 12 3 2. (a) 

     4 1 1 = b1 + b2 −1 0 1       4 0 2 = c1 + c2 −1 1 3





b1 = 5 b2 = −1





c1 = −7 c2 = 2



(b) By Theorem 6.13, [ C | B ] → [ I | PC←B ]: # " " 1 0 − 32 0 2 1 1 → 1 1 3 0 1 0 1 2

− 21 1 2

 5 −1   −7 [x]C = 2

[x]B =

#

" ⇒

PC←B =

− 32

− 12

1 2

1 2

#

6.3. CHANGE OF BASIS

481

(c) [x]C = PC←B [x]B =

" − 23

− 12

1 2

1 2

#"

5

−1

# =

" # −7 2

, which is the same as the answer from part (a).

(d) By Theorem 6.13, [ B | C ] → [ I | PB←C ]:       1 1 0 2 1 0 −1 −1 −1 −1 → ⇒ PB←C = 0 1 1 3 0 1 1 3 1 3      −1 −1 −7 5 (e) [x]B = PB←C [x]C = = , which is the same as the answer from part (a). 1 3 2 −1 3. (a)        0 0 1 1         + b + b = b  0 3 0 2 1 1 0 1 0 0 −1         0 0 1 1          0 = c1 1 + c2 1 + c3 0 1 1 1 −1 

(b) By Theorem 6.13,  1 0 0  1 1 0 1 1 1

[ C | B ] → [ I | PC←B ]:   1 0 0 1 0 0 1   0 1 0 → 0 1 0 −1 0 0 1 0 0 1 0







c1 = 1 c2 = −1 c3 = −1



 0 0  1 0 −1 1

 1   [x]B =  0 −1   1   [x]C = −1 . −1 

b1 = 1 b2 = 0 b3 = −1





PC←B

1  = −1 0

 0 0  1 0 . −1 1

(c) 

    1 1 0      0 = −1 1 0     −1 −1 1 −1

1

0

 [x]C = PC←B [x]B =  −1 0 which is the same as the answer from part (a). (d) By Theorem 6.13, [ B | C ] → [ I | PB←C ]:   1 0 0 1 0 0 0 1 0 1 1 0 ⇒ 0 0 1 1 1 1

PB←C

 1 = 1 1

 0 0 . 1

0 1 1

(e)  1   [x]C = PB←C [x]C = 1 1





1



0

1

        0  −1 =  0 , 1 −1 −1

1

1



0

which is the same as the answer from part (a). 4. (a)         3 0 0 1         1 = b1 1 + b2 0 + b3 0         5 0 1 0         3 1 0 1         1 = c1 1 + c2 1 + c3 0         5 0 1 1

  1    b2 = 5 ⇒ [x]B =  5 b3 = 3 3   c1 = − 12 −1  2 3 ⇒ [x]C =  c2 = 32  2 . 7 c3 = 72 2 b1 = 1





482

CHAPTER 6. VECTOR SPACES (b) By Theorem 6.13,  1 0 1 0  1 1 0 1  0 1 1 0

[C |B ]→[I   1 0 1    0 0  → 0 0 1 0

| PC←B ]: 0

0

1

0

0

1



− 12

1 2 1 2 − 12



1 2 − 12   1 2

1 2 1 2

1  2  1  2 − 12



PC←B =

 

  −1  2 3  =  2 ,

− 21 1 2 1 2



1 2 − 12  . 1 2

(c)  [x]C = PC←B [x]B =

1  2  1  2 − 12

− 12 1 2 1 2

1 2  1   − 21   5 1 3 2

1 1 0

 0 1 1

7 2

which is the same as the answer from part (a). (d) By Theorem 6.13,  0 0 1 0 0 1

[ B | C ] → [ I | PB←C   1 1 0 1 1 0 1 1 0 → 0 0 0 1 1 0

]: 0 0 1 0 0 1

1 0 1



PB←C

 1 = 0 1

1 1 0

 0 1 1

(e)  1   [x]C = PB←C [x]C = 0 1

1 1 0

    −1 0 1   2   3 =    1  2  5 7 1 3 2

which is the same as the answer from part (a). 5. Note that in terms of the standard basis,     1 0 B= , , 0 1

C=

    0 1 , . 1 1

(a) 2 − x = b1 (1) + b2 (x)

b1 = 2 b2 = −1



2 − x = c1 (x) + c2 (1 + x)

c1 = −3 c2 = 2



(b) By Theorem 6.13, [ C | B ] → [ I | PC←B ]:    0 1 1 0 1 0 −1 → 1 1 0 1 0 1 1  −1 (c) [2 − x]C = PC←B [2 − x]B = 1

 0 1

 1 0



 2 −1   −3 [2 − x]C = . 2

[2 − x]B = ⇒

PC←B =

 −1 1

 1 . 0

    1 2 −3 = , which agrees with the answer in part (a). 0 −1 2

(d) By Theorem 6.13, [ B | C ] → [ I | PB←C ]:   1 0 0 1 0 1 1 1 (e) [2 − x]B = PB←C [2 − x]C =







PC←B

 0 = 1

1 1



    1 −3 2 = , which agrees with the answer in part (a). 1 2 −1

6.3. CHANGE OF BASIS

483

6. Note that in terms of the standard basis,     1 1 B= , , 1 −1

    0 4 C= , . 2 0

(a) 1 + 3x = b1 (1 + x) + b2 (1 − x)



(c) [1 + 3x]C = PC←B [1 + 3x]B =

1 2 1 4

− 21

#"

1 4

1 2 1 4

2

" (e) [1 + 3x]B = PB←C [1 + 3x]C =

−1

1 4

2

" ⇒

3 2 1 4

=

2 2

" =

7. Note that in terms of the standard basis,       0 0   1 B = 1 , 1 , 0 ,   1 1 1

−1

3 2 1 4

[1 + 3x]C =

#

1 4

1 −1

#" # 2 32

1

[1 + 3x]B =

#

1 2 1 4

PC←B =

− 12

#

1 4

.

" #

−1

(d) By Theorem 6.13, [ B | C ] → [ I | PB←C ]:    1 1 0 4 1 0 → 1 −1 2 0 0 1



− 21

#



2

" #

3 2 1 4

c2 =

(b) By Theorem 6.13, [ C | B ] → [ I | PC←B ]: " " # 1 0 0 4 1 1 → 0 1 2 0 1 −1 "

b2 = −1

c1 =



1 + 3x = c1 (2x) + c2 (4)

"

b1 = 2

2

−1

, which agrees with the answer in part (a).



 ⇒

PC←B =

 2 2

1 −1

# , which agrees with the answer in part (a).

      0 0   1 C = 0 , 1 , 0 .   0 0 1

(a) 1 + x2 = b1 (1 + x + x2 ) + b2 (x + x2 ) + b3 (x2 )

1 + x2 = c1 (1) + c2 (x) + c3 (x2 )

(b) By Theorem 6.13, [ C | B ] → [  1 0  0 1  0 0

c1 = 1 c2 = 0 c3 = 1





b1 = 1 b2 = −1 b3 = 1







 1 [1 + x2 ]B = −1 1

  1 [1 + x2 ]C = 0 . 1

I | PC←B ]: 0

1

0

0

1

1

1

1

1

 1 (c) [1 + x2 ]C = PC←B [1 + x2 ]B = 1 1

0 1 1

 0  0  1



PC←B

 1  = 1 1

0 1 1

 0  0 . 1

    0 1 1 0 −1 = 0, which agrees with the answer in part (a). 1 1 1

484

CHAPTER 6. VECTOR SPACES [ B | C ] → [ I | PB←C ]:      1 0 0 1 0 0 1 0 0 1 0 0 0 1 0 → 0 1 0 −1 1 0 ⇒ PC←B = −1 1 0 0 0 1 0 −1 1 0 0 1 0 −1 1      1 0 0 1 1 1 0 0 = −1, which agrees with the answer in part (e) [1 + x2 ]B = PB←C [1 + x2 ]C = −1 0 −1 1 1 1 (a).

(d) By Theorem 6.13,  1 0 0 1 1 0 1 1 1

8. Note that in terms of the standard basis,       1 0   0 B = 1 , 0 , 1 ,   0 1 1

      1 0   1 C = 0 , 1 , 0 .   0 0 1

(a) b1 = 3 b2 = 4 b3 = −5

4 − 2x − x2 = b1 (x) + b2 (1 + x2 ) + b3 (x + x2 )



4 − 2x − x2 = c1 (1) + c2 (1 + x) + c3 (x2 )

c1 = 6 c2 = −2 c3 = −1

(b) By Theorem 6.13,  1 1 0  0 1 0  0 0 1







 3 ⇒ [4 − 2x − x2 ]B =  4 −5   6 [4 − 2x − x2 ]C = −2 −1

[ C | B ] → [ I | PC←B ]:   0 1 0 1 0 0 −1 1     → 1 0 1 1 0 0 1 0

   −1 −1 1 −1     1 1 .  ⇒ PC←B =  1 0 0 1 1 0 1 1 0 0 1 0 1 1      −1 1 −1 3 6 1  4 = −2, which agrees with the (c) [4 − 2x − x2 ]C = PC←B [4 − 2x − x2 ]B =  1 0 0 1 1 −5 −1 answer in part (a). (d) By Theorem 6.13, [ B | C ] → [ I | PB←C ]:       0 1 0 1 1 0 1 0 0 1 2 −1 1 2 −1 1 0 1 0 1 0 → 0 1 0 1 1 0 ⇒ PC←B =  1 1 0 0 1 1 0 0 1 0 0 1 −1 −1 1 −1 −1 1      1 2 −1 6 3 1 0 −2 =  4, which agrees with the (e) [4 − 2x − x2 ]B = PB←C [4 − 2x − x2 ]C =  1 −1 −1 1 −1 −5 answer in part (a). 9. (a) 

         4 1 0 0 0  2 0 1 0 0   = b1   + b2   + b3   + b4   ⇒  0 0 0 1 0 −1 0 0 0 1           4 1 2 1 1  2  2 1 1 0   = b1   + b2   + b3   + b4   ⇒  0  0 1 0 0 −1 −1 0 1 1

b1 b2 b3 b4

= 4 = 2 = 0 = −1

b1 b2 b3 b4

= 25 = 0 = −3 = 92





 4  2  [A]B =   0 −1  5 2



 0   [A]C =  . −3 9 2

6.3. CHANGE OF BASIS

485

(b) By Theorem 6.13, [ C | B ] → [ I | PC←B ]: 

1

2

1

1

1

0

0

  2   0  −1

1

1

0

0

1

0

1

0

0

0

0

1

0

1

1

0

0

0

 1 0 0 0   0 1 0 0 0 → 0 0 1 0 0   1 0 0 0 1 0



1 2

0

−1

0

0

1

−1

1

1

3 2

−1

−2

− 21



 0  1 

− 21

 ⇒

PC←B

1 2

0

−1

  0 = −1 

0

1

1

1

−1

−2

3 2

(c) 

1 2

 0  [A]C = PC←B [A]B =  −1 3 2

    5 0 −1 − 21 4 2  2  0 0 1 0       =  , 1 1 1  0 −3 9 −1 −1 −2 − 21 2

which agrees with the answer in part (a). (d) By Theorem 6.13, [ B | C ] → [ I | PB←C ]:  1 0 0 0  0 1 0 0  0 0 1 0  0 0 0 1



1

2

1

1

2

1

1

0

1

0

−1

0

1

 0  0  1

 ⇒



1

2

1

1

  2 PB←C =   0  −1

1

1

1

0

0

1

 0 . 0  1

(e) 

1

2

1

  2 [A]B = PB←C [A]C =   0  −1

1

1

1

0

0

1





 4         0   0 =  2 ,     0  −3  0 9 1 −1 2 1

5 2



which agrees with the result in part (a). 10. (a)           1 1 0 1 1           1           = b1 0 + b2 1 + b3 1 + b4 0 1 0 1 0 1           1 1 0 0 0           1 1 1 1 0           1 1 1 0 1   = b1   + b2   + b3   + b4   1 0 1 1 1           1 1 0 1 1

b1 = 1 ⇒

b2 = 1 b3 = 0



b4 = 0 b1 = ⇒

b2 = b3 = b4 =

1 3 1 3 1 3 1 3

  1   1  [A]B =  0   0   1



 31  3  [A]C =  1 . 3 1 3

− 12



 0  1 

− 12

486

CHAPTER 6. VECTOR SPACES (b) By Theorem 6.13, [ C | B ] → [ I | PC←B ]:  1 1 1 0  1 1 0 1  0 1 1 1  1 0 1 1

1

0

1

0

1

1

0

1

0

1

0

0

  1 0 0 0 1    0  → 0 1 0 0 0 0 1 0  1  0 0 0 0 1

2 3 − 13 2 3 − 13

− 13

2 3 2 3 − 13 − 13

2 3 − 13 2 3

− 31



2  3 2 3 − 31

 ⇒

PC←B =

2  31 − 3   2  3 − 13

− 13

− 13

2 3 − 13 2 3

2 3 2 3 − 31 − 31

1

0

1

1

 1  2 =  1  2 − 12

1 2 1 2 1 2

1 2 − 21 1 2



2  3 . 2 3 − 13

(c)  [A]C = PC←B [A]B =

2  13 − 3   2  3 − 13

− 13

    1 1 3     2   1   1 3   3  =     2 1 3 3  0 1 0 − 13 3

2 3 2 3 − 31 − 31

− 13

1

0

1

1 2 1 2 − 12

1 2 1 2 1 2

1 2 − 21 1 2

2 3 − 13 2 3

which agrees with the answer in part (a). (d) By Theorem 6.13, [ B | C ] → [ I | PB←C ]:  1 0 1 1  0 1 1 0  0 1 0 1  1 0 0 0

1

1

1

1

1

0

0

1

1

1

0

1

  1 0 0 0 0    1  → 0 1 0 0 0 0 1 0  1  1 0 0 0 1

1



3  2 1 −2 − 12

 ⇒

PB←C

(e) 

1

0

1

 1  2 [A]B = PB←C [A]C =   1  2 − 12

1 2 1 2 1 2

1 2 − 12 1 2

 

1 3   3 1  2 3 1 1 −2 3 1 − 21 3

1

which agrees with the result in part (a). 11. Note that in terms of the standard basis {sin x, cos x}, B=

    1 0 , , 1 1

C=

    1 1 , . 1 −1

  1   1  = 0   0



3  2 . 1 −2 − 12

6.3. CHANGE OF BASIS

487

(a) b1 = 2 b2 = −5 " # 2 [2 sin x − 3 cos x]B = −5

2 sin x − 3 cos x = b1 (sin x + cos x) + b2 (cos x) ⇒



2 sin x − 3 cos x = c1 (sin x + cos x) + c2 (sin x − cos x) ⇒

[2 sin x − 3 cos x]C =

(b) By Theorem 6.13, [ C | B ] → [ I | PC←B ]: " # " 1 0 1 1 1 1 0 → 0 1 0 1 −1 1 1

1 2 − 21

"

1 2 1 −2

1 (c) [2 sin x − 3 cos x]C = PC←B [2 sin x − 3 cos x]B = 0 answer in part (a). (d) By Theorem 6.13, [ B | C ] → [ I | PB←C ]:    1 0 1 1 1 0 → 1 1 1 −1 0 1

 1 −2

1 0

5 2

PC←B

" 1 = 0

.

1 2 − 12

# .

#"

# " # 2 − 12 = , which agrees with the 5 −5 2



PC←B

 1 = 0

 1 −2

#" # " # 1 − 21 2 , which agrees with the = 5 −2 −5 2

"

1 (e) [2 sin x − 3 cos x]B = PB←C [2 sin x − 3 cos x]C = 0 answer in part (a). 12. Note that in terms of the standard basis   1 B= , 1

" # − 21

# ⇒

c1 = − 21 c2 = 52



{sin x, cos x},       0 −1 1 , C= , . 1 1 1

(a) ⇒

sin x = b1 (sin x + cos x) + b2 (cos x)

"

b1 = 1 b2 = −1

sin x = c1 (cos x − sin x) + c2 (sin x + cos x)





[sin x]B =

c1 = − 21 c2 =

1 2



1

#

−1

[sin x]C =

" # − 21 1 2

(b) By Theorem 6.13, [ C | B ] → [ I | PC←B ]: "

−1

1

1

0

1

1

1

1

(c) [sin x]C = PC←B [sin x]B =

" 0 1

1 2 1 2

#

" →

#"

# 1

−1

1 2 1 2

1

0

0

0

1

1

"

#

=

− 12 1 2

# ⇒

PC←B =

" 0 1

1 2 1 2

# .

, which agrees with the answer in part (a).

488

CHAPTER 6. VECTOR SPACES (d) By Theorem 6.13, [ B | C ] → [ I | PB←C ]:    1 0 −1 1 1 0 → 1 1 1 1 0 1

(e) [sin x]B = PB←C [sin x]C =

" −1 2

#" # 1 − 12 0

1 2

−1 2 "

=

1

1 0

 ⇒

PC←B =

 −1 2

 1 0

#

−1

, which agrees with the answer in part (a).

13. From Section 3.6, Example 3.58, if B is the usual basis and C is the basis whose basis vectors are the rotations of the usual basis vectors through 60◦ , the matrix of the rotation is " # " √ # 3 1 cos 60◦ sin 60◦ 2 2 √ PC←B = = . 1 − sin 60◦ cos 60◦ − 23 2 Then (a) " [x]C = PC←B [x]B =

1 √2 − 23



√ # 3 = √ . 3 2 1− 2 3

3 2 1 2

#" # 3

"

1 √2 3 2

"

3 2

+

(b) Since PC←B is an orthogonal matrix, we have [x]B = (PC←B )

−1

T

[x]C = (PC←B ) [x]C =





3 2 1 2

" √ # 2 3+2 = √ . −4 2 3−2

#"

4

#

14. From Section 3.6, Example 3.58, if B is the usual basis and C is the basis whose basis vectors are the rotations of the usual basis vectors through 135◦ , the matrix of the rotation is # " √ " √ # 2 − 22 cos 135◦ sin 135◦ √ √2 = . PC←B = − 22 − 22 − sin 135◦ cos 135◦ Then (a) " [x]C = PC←B [x]B =



2 √2 − 22





2 √2 − 22

#" # 3 2

" =





#

#"

# 4

2 2 √ −522

.

(b) Since PC←B is an orthogonal matrix, we have " −1

[x]B = (PC←B )

T

[x]C = (PC←B ) [x]C =





2 √2 2 2



2 √2 − 22



"

# 0 = √ . −4 4 2

15. By definition, the columns of PC←B are the coordinates of B in the basis C, so we have         1 1 2 −1 [u1 ]C = , so that u1 = 1 −1 = −1 2 3 −1         −1 1 2 3 [u2 ]C = , so that u2 = −1 +2 = . 2 2 3 4 Therefore

 B = {u1 , u2 } =

   −1 3 , . −1 4

6.3. CHANGE OF BASIS

489

16. Let C = {p1 (x), p2 (x), p3 (x)}. By definition, the columns of PC←B are the coordinates of B in the basis C, so that   1 [x]C =  0 ⇒ x = p1 (x) − p3 (x), −1   0 [1 + x]C = 2 ⇒ 1 + x = 2p2 (x) + p3 (x), 1   0 [1 − x + x2 ]C = 1 ⇒ 1 − x + x2 = p2 (x) + p3 (x). 1 Subtract the third equation from the second, giving p2 (x) = 2x − x2 ; substituting back into the third equation gives p3 (x) = 1 − 3x + 2x2 ; then from the first equation, p1 (x) = 1 − 2x + 2x2 . So C = {p1 (x), p2 (x), p3 (x)} = {1 − 2x + 2x2 , 2x − x2 , 1 − 3x + 2x2 }. 17. Since a = 1 in this case and we are considering a quadratic, we have for the Taylor basis of P2 B = {1, x − 1, (x − 1)2 } = {1, x − 1, x2 − 2x + 1}. Let C = {1, x, x2 } be the standard basis for P2 . Then 

To find the change-of-basis matrix  1   B C = 0 0 Therefore

 1 [p(x)]C =  2 . −5     PB←C , we reduce B C → I PB←C :   −1 1 1 0 0 1 0 0 1 1 1 −2 0 1 0 → 0 1 0 0 1 0 1 0 0 1 0 0 1 0 0

 1 [p(x)]C = PB←C [p(x)]B = 0 0

1 1 0

 1 2 . 1

    −2 1 1 2  2 = −8 , −5 −5 1

so that the Taylor polynomial centered at 1 is −2 − 8(x − 1) − 5(x − 1)2 . 18. Since a = −2 in this case and we are considering a quadratic, we have for the Taylor basis of P2 B = {1, x + 2, (x + 2)2 } = {1, x + 2, x2 + 4x + 4}. Let C = {1, x, x2 } be the standard basis for P2 . Then 

To find the change-of-basis matrix  1   B C = 0 0

PB←C , 2 1 0

4 4 1

 1 [p(x)]C =  2 . −5     we reduce B C → I PB←C :    1 0 0 1 0 0 1 −2 4 0 1 0 → 0 1 0 0 1 −4 . 0 0 1 0 0 1 0 0 1

490

CHAPTER 6. VECTOR SPACES Therefore

 1 [p(x)]C = PB←C [p(x)]B = 0 0

    −2 4 1 −23 1 −4  2 =  22 , 0 1 −5 −5

so that the Taylor polynomial centered at −2 is −23 + 22(x + 2) − 5(x + 2)2 . 19. Since a = −1 in this case and we are considering a cubic, we have for the Taylor basis of P3 B = {1, x + 1, (x + 1)2 , (x + 1)3 } = {1, x + 1, x2 + 2x + 1, x3 + 3x2 + 3x + 1}. Let C = {1, x, x2 , x3 } be the standard basis for P2 . Then   0 0  [p(x)]C =  0 . 1  To find the change-of-basis matrix PB←C , we reduce B

 B

Therefore

 1 1 1 1  0 1 2 3 C = 0 0 1 3 0 0 0 1

1 0 0 0

0 1 0 0

  C → I

 PB←C :

  1 0 0 0 0 0 1 0 0 0 → 0 0 1 0 0 1 0 0 0 1

0 0 1 0

 1 0 [p(x)]C = PB←C [p(x)]B =  0 0

 −1 1 −1 1 −2 3 . 0 1 −3 0 0 1

1 0 0 0

    −1 0 −1 1 −1 0  3 1 −2 3   =  , 0 1 −3 0 −3 1 1 0 0 1

so that the Taylor polynomial centered at −1 is −1 + 3(x + 1) − 3(x + 1)2 + (x + 1)3 . 20. Since a =

1 2

in this case and we are considering a cubic, we have for the Taylor basis of P3 (

B=

 2  3 )   1 1 1 1 1 3 3 1 1, x − , x − , x− = 1, x − , x2 − x + , x3 − x2 + x − . 2 2 2 2 4 2 4 8

Let C = {1, x, x2 , x3 } be the standard basis for P2 . Then   0 0  [p(x)]C =  0 . 1  To find the change-of-basis matrix PB←C , we reduce B

h

B

 1 i 0  C = 0 0

− 12 1 0 0

1 4

− 18

−1 1 0

3 4 − 32

1

1 0 0 0

0 1 0 0

0 0 1 0

  C → I

  0 1 0 0   → 0 0 1

0

0 1 0 0

 PB←C : 0 0 1 0

0 0 0 1

1 0 0 0

1 2

1 4

1 0 0

1 1 0

1 8 3 4 . 3  2



1

6.4. LINEAR TRANSFORMATIONS

491

Therefore  1 0  [p(x)]C = PB←C [p(x)]B =  0 0

1 2

1 4

1 0 0

1 1 0

1 0 8 3   4  0   3  0 2

 

1

1 8

3   =  43  , 2

1

1

1 2

so that the Taylor polynomial centered at is    2  3 1 1 3 1 1 3 + x− . + x− + x− 8 4 2 2 2 2 21. Since [x]D = PD←C [x]C and [x]C = PC←B [x]B , we get [x]D = PD←C [x]C = PD←C (PC←B [x]B ) = (PD←C PC←B )[x]B . Since the change-of-basis matrix is unique (its columns are determined by the two bases), it follows that PD←C PC←B = PD←B . 22. By Theorem 6.10 (c), to show that C is a basis it suffices to show that it is linearly independent, since C has n elements and dim V = n. Since P is invertible, the columns of P are linearly independent. But the given equation says that [ui ]B = pi , since the elements of that column are the coordinates of ui with respect to the basis B. Since the columns of P are linearly independent, so is {[u1 ]B , [u2 ]B , . . . , [un ]B }. But then by Theorem 6.7 in Section 6.2, it follows that {u1 , u2 , . . . , un } is linearly independent, so that C is a basis for V . To show that P = PB←C , recall that the columns of PB←C are the coordinate vectors of C with respect to the basis B. But looking at the given equation, we see that this is exactly what the columns of P are. Thus P = PB←C .

6.4

Linear Transformations

1. T is a linear transformation:  T

a c

 0 b a + 0 d c 

  b0 a + a0 = T 0 d c + c0

  a + a0 + b + b0    0 b + b0   0 =  0 d+d 0 0 c+c +d+d    0  a + b0 a+b   0   0  =T a + =  0   0  c c+d c0 + d0

  0 b a +T 0 d c

b0 d0

and T

  a α c

  b αa =T d αc

αb αd

 =

   αa + αb 0 a+b =α 0 αc + αd 0

2. T is not a linear transformation:    0   w + w0 w x w x0 =T T + 0 0 y z y + y0 y z

x + x0 z + z0





  0 a = αT c+d c

1 = x + x0 − y − y 0

w + w0 − z − z 0 1

while  T

w y

  0 x w +T z y0

x0 z0



   1 w−z 1 w0 − z 0 = + 0 x−y 1 x − y0 1   0 2 w + w − z − z0 = x + x0 − y − y 0 2 

Since the two are not equal, T is not a linear transformation.

 b . d 



492

CHAPTER 6. VECTOR SPACES

3. T is a linear transformation. If A, C ∈ Mnn and α is a scalar, then T (A + C) = (A + C)B = AB + CB = T (A) + T (C) T (αA) = (αA)B = α(AB) = αT (A). 4. T is a linear transformation. If A, C ∈ Mnn and α is a scalar, then T (A + C) = (A + C)B − B(A + C) = AB + CB − BA − BC = (AB − BA) + (CB − BC) = T (A) + T (C) T (αA) = (αA)B − B(αA) = α(AB) − α(BA) = α(AB − BA) = αT (A). 5. T is a linear transformation, since by Exercise 44 in Section 3.2, tr(A + B) = tr(A) + tr(B) and tr(αA) = α tr(A). 6. T is not a linear transformation. For example, in M22 , let A = I; then T (2A) = T (2I) = 2 · 2 = 4, while 2T (A) = 2T (I) = 2(1 · 1) = 2. 7. T is not a linear transformation. For example, let A be any invertible matrix in Mnn ; then −A is invertible as well, so that rank(A) = rank(−A) = n. Then T (A) + T (−A) = rank(A) + rank(−A) = 2n while T (A + (−A)) = T (O) = rank(O) = 0. 8. Since T (0) = 1 + x + x2 6= 0, T is not a linear transformation by Theorem 6.14(a). 9. T is a linear transformation. Let p(x) = a + bx + cx2 and p0 (x) = a0 + b0 x + c0 x2 be elements of P2 , and let α be a scalar. Then (T (p + p0 ))(d) = (T ((a + a0 ) + (b + b0 )x + (c + c0 )x2 ))(d) = (a + a0 ) + (b + b0 )(d + 1) + (b + b0 )(d + 1)2   = a + b(d + 1) + b(d + 1)2 + a0 + b0 (d + 1) + b0 (d + 1)2 = (T (p))(d) + (T (p0 ))(d), and T (αp)(d) = (T (αa + αbx + αcx2 ))(d) = αa + αb(d + 1) + αb(d + 1)2  = α a + b(d + 1) + b(d + 1)2 = α(T (p))(d). 10. T is a linear transformation. Let f, g ∈ F and α be a scalar. Then (T (f + g))(x) = (f + g)(x2 ) = f (x2 ) + g(x2 ) = (T (f ))(x) + (T (g))(x) (T (αf ))(x) = (αf )(x2 ) = αf (x2 ) = α(T (f ))(x). 11. T is not a linear transformation. For example, let f be any function in F that takes on a value other than 0 or 1. Then 2

2

(T (2f ))(x) = ((2f )(x)) = (2f (x)) = 4f (x)2 6= 2f (x)2 = 2(T (f ))(x), where the inequality holds because of the conditions on f . 12. This is similar to Exercise 10; T is a linear transformation. Let f, g ∈ F and α be a scalar. Then (T (f + g))(c) = (f + g)(c) = f (c) + g(c) = (T (f ))(c) + (T (g))(c) (T (αf ))(c) = (αf )(c) = αf (c) = α(T (f ))(c)

6.4. LINEAR TRANSFORMATIONS

493

13. For T , we have (with α a scalar)       a c a+c T + =T = (a + c) + (a + c + b + d)x = (a + (a + b)x) + (c + (c + d)x) b d b+d     a c =T +T b d        a αa a T α =T = αa + (αa + αb)x = α(a + (a + b)x) = αT . b αb b For S, suppose that p(x), q(x) ∈ P1 and α is a scalar; then T (p(x) + q(x)) = x(p(x) + q(x)) = xp(x) + xq(x) = T (p(x)) + T (q(x)) T (αp(x)) = x(αp(x)) = α(xp(x)) = αT (p(x)). 14. Using the comment following the definition of a linear transformation, we have                      1 3 5 6 11 5 1 0 1 0 T =T 5 +2 = 5T + 2T = 5  2 + 2 0 =  10 + 0 = 10 2 0 1 0 1 −1 4 −5 8 3 In a similar way,                      3 a 3b a + 3b 1 a 1 0 1 0 T =T a +b = aT + bT = a  2 + b 0 =  2a  +  0 =  2a  b 0 1 0 1 4 −a 4b −a + 4b −1       −7 1 3 15. We must first find the coordinates of in the basis , . So we want to find c, d such 9 1 −1 that       −7 1 3 c + 3d = −7 c = 5 =c +d ⇒ ⇒ 9 1 −1 c − d = 9 d = −4. Then using the comment following the definition of a linear transformation, we have            −7 1 3 1 3 T =T 5 −4 = 5T − 4T = 5(1 − 2x) − 4(x + 2x2 ) = −8x2 − 14x + 5. 9 1 −1 1 −1 The second part is much the same, except that we need to find c, d such that       c = a+3b a 1 3 c + 3d = a 4 =c +d ⇒ ⇒ b 1 −1 c − d = b d = a−b 4 . Then using the comment following the definition of a linear transformation, we have        a−b a + 3b 1 a 3 T + =T b 1 −1 4 4     a + 3b a−b 1 3 = T T + 1 −1 4 4 a−b a + 3b = (1 − 2x) + (x + 2x2 ) 4 4 a − b 2 a + 7b a + 3b = x − x+ . 2 4 4 16. Since T is linear, T (6 + x − 4x2 ) = T (6) + T (x) + T (−4x2 ) = 6T (1) + T (x) − 4T (x2 ) = 6(3 − 2x) + (4x − x2 ) − 4(2 + 2x2 ) = 10 − 8x − 9x2 .

494

CHAPTER 6. VECTOR SPACES Similarly, T (a + bx + cx2 ) = T (a) + T (bx) + T (cx2 ) = aT (1) + bT (x) + cT (x2 ) = a(3 − 2x) + b(4x − x2 ) + c(2 + 2x2 ) = (3a + 2c) + (4b − 2a)x + (2c − b)x2 .

17. We do the second part first, then substitute for a, b, and c to get the first part. We must find the  coordinates of a + bx + cx2 in the basis 1 + x, x + x2 , 1 + x2 . So we want to find r, s, t such that a + bx + cx2 = r(1 + x) + s(x + x2 ) + t(1 + x2 ) = (r + t) + (r + s)x + (s + t)x2 r + t = a r + s = b s + t = c

⇒ ⇒

r = s = t =

a+b−c 2 −a+b+c 2 a−b+c . 2

Then using the comment following the definition of a linear transformation, we have   a+b−c −a + b + c a−b+c 2 2 2 T (a + bx + cx ) = T (1 + x) + (x + x ) + (1 + x ) 2 2 2 a+b−c −a + b + c a−b+c = T (1 + x) + T (x + x2 ) + T (1 + x2 ) 2 2 2 a+b−c −a + b + c a−b+c = (1 + x2 ) + (x − x2 ) + (1 + x + x2 ) 2 2 2 3a − b − c 2 = a + cx + x 2 For the second part, substitute a = 4, b = −1, and c = 3 to get T (4 − x + 3x2 ) = 4 + 3x +

12 + 1 − 3 2 x = 4 + 3x + 5x2 . 2

18. We do the second part first, then substitute for a, b, c, and d to get the first part. We must find the coordinates of           a b 1 0 1 1 1 1 1 1 in the basis , , , . c d 0 0 0 0 1 0 1 1 So we want to find r, s, t, u such that  a c

  b 1 =r d 0

  0 1 +s 0 0

  1 1 +t 0 1

  1 1 +u 0 1

Then using the comment following the definition of a       a b 1 0 1 T = T (a − b) + (b − c) c d 0 0 0    1 0 1 = (a − b)T + (b − c)T 0 0 0

r + s + t + u = a r = a−b  1 s + t + u = b s = b−c ⇒ ⇒ 1 t + u = c t = c−d u = d u = d. linear transformation, we have      1 1 1 1 1 + (c − d) + +d 0 1 0 1 1      1 1 1 1 1 + (c − d)T + +dT 0 1 0 1 1

= (a − b) · 1 + (b − c) · 2 + (c − d) · 3 + d · 4 = a + b + c + d. For the second part, substitute a = 1, b = 3, c = 4, d = 2 to get   1 3 T = 1 + 3 + 4 + 2 = 10. 4 2

6.4. LINEAR TRANSFORMATIONS

495

19. Let T (E11 ) = a,

T (E12 ) = b,

T (E21 ) = c,

T (E22 ) = d.

Then   w x T = t (wE11 + xE12 + yE21 + zE22 ) = wT E11 + xT E12 + yT E21 + zT E22 = aw + bx + cy + dz. y z 20. Writing 

     0 2 3  6 = a 1 + b 0 −8 0 2 means that we must have b = −4, and then a = 6, so that if T were a linear transformation, then       0 2 3 T  6 = 6T 1 − 4 0 = 6(1 + x) − 4(2 − x + x2 ) = −2 + 10x − 4x2 6= −2 + 2x2 . −8 0 2 This is a contradiction, so T cannot be linear. 21. Choose v ∈ V . Then −v = 0 − v by definition, so using part (a) of Theorem 6.14 and the definition of a linear transformation, T (−v) = T (0 − v) = T (0) − T (v) = 0 − T (v) = −T (v). Pn 22. Choose v ∈ V . Then there are scalars c1 , . . . , cn such that v = i=1 ci vi since the vi are a basis. Since T is linear, using the comment following the definition of a linear transformation we get ! n n n X X X T (v) = T ci v i = ci T (vi ) = ci vi = v. i=1

i=1

i=1

Since T (v) = v for every vector v ∈ V , T is the identity transformation. Pn 23. We must show that if p(x) ∈ Pn , then T p(x) = p0 (x). Suppose that p(x) = k=0 ak xk . Then since T is linear, using the comment following the definition of a linear transformation we get ! n n n n X X X X k kak xk−1 = p0 (x). ak T (xk ) = ak T (xk ) = a0 T (1) + T p(x) = T ak x = k=0

k=1

k=0

k=1

24. (a) We prove the contrapositive: suppose that {v1 , . . . , vn } is linearly dependent in V ; then we have c1 v1 + · · · + cn vn = 0, where not all of the ci are zero. Then T (c1 v1 + · · · + cn vn ) = c1 T (v1 ) + · · · + cn T (vn ) = T (0) = 0, showing that {T (v1 ), . . . , T (vn )} is linearly dependent in W . Therefore if {T (v1 ), . . . , T (vn )} is linearly independent in W , we must also have that {v1 , . . . , vn } is linearly independent in V . (b) For example, the zero transformation from V to W , which takes every element of V to 0, shows that the converse is false. There are many other examples as well, however. For instance, let     x x 2 2 T :R →R : 7→ . y 0 Then e1 and e2 are linearly independent in V , but     1 0 T (e1 ) = and T (e2 ) = 0 0 are linearly dependent.

496

CHAPTER 6. VECTOR SPACES

25. We have    2 =S T 1    x (S ◦ T ) =S T y (S ◦ T )

      2 5 4 −1 =S = 1 −1 0 6       x 2x + y 2x −y =S = . y −y 0 2x + 2y

Since the domain of T is R2 but the codomain of S is M22 , T ◦ S does not make sense and we cannot compute it. 26. We have (S ◦ T )(3 + 2x − x2 ) = S(T (3 + 2x − x2 )) = S(2 + 2 · (−1)x) = S(2 − 2x) = 2 + (2 − 2)x + 2 · (−2)x2 = 2 − 4x2 (S ◦ T )(a + bx + cx2 ) = S(T (a + bx + cx2 )) = S(b + 2cx) = b + (b + 2c)x + 2(2c)x2 = b + (b + 2c)x + 4cx2 . Since the domain of S is P1 , which equals the codomain of T , we can compute T ◦ S: (T ◦ S)(a + bx) = T (S(a + bx)) = T (a + (a + b)x + 2bx2 ) = (a + b) + 2 · 2bx = (a + b) + 4bx. 27. The Chain Rule says that (f ◦ g)0 (x) = g 0 (x)f 0 (g(x)). Let g(x) = x + 1 and f (x) = p(x). Then (S ◦ T )(p(x)) = S(T (p(x))) = S(p0 (x)) = p0 (x + 1) (T ◦ S)(p(x)) = T (S(p(x))) = T (p(x + 1)) = T ((p ◦ g)(x)) = (p ◦ g)0 (x) = g 0 (x)p0 (g(x)) = p0 (x + 1) since the derivative of g(x) is 1. 28. Use the Chain Rule from the previous exercise, and let g(x) = x + 1: (S ◦ T )(p(x)) = S(T (p(x))) = S(xp0 (x)) = (x + 1)p0 (x + 1) (T ◦ S)(p(x)) = T (S(p(x))) = T (p(x + 1)) = T ((p ◦ g)(x)) = x(p ◦ g)0 (x) = xg 0 (x)p0 (g(x)) = xp0 (x + 1) since the derivative of g(x) is 1. 29. We must verify that S ◦ T = IR2 and T ◦ S = IR2 .            x x x−y 4(x − y) + (−3x + 4y) x (S ◦ T ) =S T =S = = y y −3x + 4y 3(x − y) + (−3x + 4y) y            x x 4x + y (4x + y) − (3x + y) x (T ◦ S) =T S =T = = . y y 3x + y −3(4x + y) + 4(3x + y) y         x x x x Thus (S ◦ T ) = (T ◦ S) = for any vector , so both compositions are the identity and y y y y we are done. 30. We must verify that S ◦ T = IP1 and T ◦ S = IP1 .     b b b + (a + 2b)x = −4 · + (a + 2b) + 2 · x = a + bx (S ◦ T )(a + bx) = S(T (a + bx)) = S 2 2 2 2a (T ◦ S)(a + bx) = T (S(a + bx)) = T ((−4a + b) + 2ax) = + ((−4a + b) + 2 · 2a)x = a + bx. 2 Thus (S ◦ T )(a + bx) = (T ◦ S)(a + bx) = a + bx for any a + bx ∈ P1 , so both compositions are the identity and we are done.

6.4. LINEAR TRANSFORMATIONS

497

31. Suppose T 0 and T 00 are such that T 0 ◦ T = T 00 ◦ T = IV and T ◦ T 0 = T ◦ T 00 = IW . To show that T 0 = T 00 , it suffices to show that T 0 (w) = T 00 (w) for any vector w ∈ W . But T 0 (w) = IV ◦ T 0 (w) = T 00 ◦ T ◦ T 0 (w) = T 00 ◦ ((T ◦ T 0 )w) = T 00 ◦ IW (w) = T 00 (w). 32. (a) First, if T (v) = ±v, then clearly {v, T v} is linearly dependent. For the converse, if {v, T (v)} is linearly dependent, then for some c, d not both zero cv + dT v = 0. Then T (cv + dT v) = cT v + d(T ◦ T )v = cT v + dv = T 0 = 0 since T ◦ T = IV . Thus cv + dT v = 0 cT v + dv = 0. Subtracting these two equations gives c(T v − v) − d(T v − v) = (c − d)(T v − v) = 0. But this means that either c = d 6= 0 or that T v = v. If c = d 6= 0, then divide through by c in the equation cv + dT v = 0 to get v + T v = 0, so that T v = −v. Hence T v = ±v.     x −x (b) Let T = . Then y −y (T ◦ T )

so that T ◦ T = IV , and

           x x −x −(−x) x =T T =T = = , y y −y −(−y) y

    x −x and are linearly dependent. y −y

33. (a) First, if T (v) = v, then clearly {v, T v} = {v, v} is linearly dependent, and if T (v) = 0, then {v, T v} = {v, 0} is linearly dependent. For the converse, suppose that {v, T v} is linearly dependent. Then there exist c, d, not both zero, such that cv + dT v = 0. As in Exercise 32, we get T (cv + dT v) = cT v + d(T ◦ T )v = cT v + dT v = T 0 = 0 since T ◦ T = T . Thus (c + d)T v = 0 cv +

dT v = 0.

Subtracting the two equations gives c(v − T v) = 0, so that either T v = v or c = 0. But if c = 0, then cv + dT v = dT v = 0, so that since c and d were not both zero, d cannot be zero, so that T v = 0. Thus either T v = v or T v = 0.     x x (b) For example, let T = . This is a projection operator, so we already know that T ◦ T = T ; y 0 to show this via computation,            x x x x x (T ◦ T ) =T T =T = =T . y y 0 0 y Then for example T

      1 1 0 = while T = 0. 0 0 1

498

CHAPTER 6. VECTOR SPACES

34. To show that S+T is a linear transformation, we must show it obeys the rules of a linear transformation. Using the definition of S + T and of cT , as well as the fact that both S and T are linear, we have (S + T )(u + v) = S(u + v) + T (u + v) = Su + Sv + T u + T v = (Su + T u) + (Sv + T v) = (S + T )u + (S + T )v (c(S + T ))v = (cS + cT )v = (cS)v + (cT )v = c(Sv) + c(T v) = c(Sv + T v) = c(S + T )v. 35. We show that L = L (V, W ) satisfies the ten axioms of a vector space. We use freely below the fact that V and W are both vector spaces. Also, note that S = T in L (V, W ) simply means that Sv = T v for all v ∈ V . 1. From Exercise 34, S, T ∈ L implies that S + T ∈ L . 2. (S + T )v = Sv + T v = T v + Sv = (T + S)v. 3. ((R + S) + T )v = (R + S)v + T v = Rv + Sv + T v = Rv + (S + T )v = (R + (S + T ))v. 4. Let Z : V → W be the map Z(v) = 0 for all v ∈ V. Then Z is the zero map, so it is linear and thus in L , and (S + Z)v = Sv + Zv = Sv + 0 = Sv. 5. If S ∈ L , let −S : V → W be the map (−S)v = −Sv. By Exercise 34, −S ∈ L , and (S + (−S))v = Sv + (−S)v = Sv − Sv = 0. 6. By Exercise 34, if S ∈ L , then cS ∈ L . 7. (c(S + T ))v = c(S + T )v = c(Sv + T v) = cSv + cT v = (cS)v + (cT )v = ((cS) + (cT ))v. 8. ((c + d)S)v = (c + d)Sv = cSv + dSv = (cS)v + (cT )v = ((cS) + (cT ))v. 9. (c(dS))v = c(dS)v = c(dSv) = (cd)Sv = ((cd)S)v. 10. By Exercise 34, (1S)v = 1Sv = Sv. 36. To show that these linear transformations are equal, it suffices to show that they have the same value on every element of V . (a) (R ◦ (S + T ))v = R((S + T )v) = R(Sv + T v) = R(Sv) + R(T v) = (R ◦ S)v + (R ◦ T )v. (b) (c(R ◦ S))v = c(R ◦ S)v = cR(Sv) = (cR)(Sv) = ((cR) ◦ S)v (c(R ◦ S))v = c(R ◦ S)v = cR(Sv) = (cR)(Sv) = R((cS)v) = (R ◦ (cS))v.

6.5

The Kernel and Range of a Linear Transformation

1. (a) Only (ii) is in ker(T ), since     1 2 1 0 (i) T = 6= O, −1 3 0 3     0 4 0 0 (ii) T = = O, 2 0 0 0     3 0 3 0 (iii) T = 6= O. 0 −3 0 −3 (b) From the definition of T , a matrix is in range(T ) if and only if its two off-diagonal elements are zero. Thus only (iii) is in range(T ). 2. (a) T is explicitly defined by  a T c Thus

 b =a+d d

6.5. THE KERNEL AND RANGE OF A LINEAR TRANSFORMATION

499



 1 2 =1+3=4 −1 3   0 4 (ii) T =0+0=0 2 0   1 3 (iii) T = 1 + (−1) = 0, 0 −1 (i) T

and therefore (ii) and (iii) are in ker(T ) but (i) is not. (b) All three of these are in range(T ). For example,   0 0 (i) T = 0 + 0 = 0, 0 0   2 1 (ii) T = 2 + 0 = 2, 4 0  √ √ √ 2 π = 22 + 0 = 22 . (iii) T 2 12 0 (c) ker(T ) is the set of all matrices in M22 whose diagonal elements are negatives of each other. range(T ) is all real numbers. 3. (a) We have     1−1 0 (i) T (1 + x) = T (1 + 1x + 0x ) = = 6= 0, 1+0 1     0−1 −1 (ii) T (x − x2 ) = T (0 + 1x − 1x2 ) = = 6= 0, 1 + (−1) 0     1−1 0 (iii) T (1 + x − x2 ) = T (1 + 1x − 1x2 ) = = = 0, 1 + (−1) 0 2

so only (iii) is in range(T ). (b) All of them are. For example,   0 2 (i) T (1 + x − x ) = , from part (a), 0     1−0 1 (ii) T (1) = = , 0+0 0     0−0 0 (iii) T (x2 ) = = . 0+1 1 (c) ker(T ) consists of all polynomials a + bx + cx2 such that a = b and b = −c; parametrizing via a = t gives ker(T ) = {a + ax − ax2 }, so that ker(T ) consists of all multiples of 1 + x − x2 . range(T ) is all of R2 , since it is a subspace of R2 containing both e1 and e2 (see items (ii) and (iii) of part (b)). 4. (a) Since xp0 (x) = 0 if and only if p0 (x) = 0, we see that only (i) is in ker(T ). For the other two, we have T (x) = xx0 = x and T (x2 ) = x(x2 )0 = 2x2 . (b) A polynomial in range(T ) must be divisible by x; that is, it must have no constant term. Thus (i) is not  in range(T ). However, x = T (x) from part (a), and also from part (a), we see that T 21 x2 = x2 . Thus (ii) and (iii) are in range(T ). (c) ker(T ) consists of all polynomials whose derivative vanishes, so it consists of constant polynomials. range(T ) consists of all polynomials with zero constant terms, sinceR given any  such polynomial, we may write it as xg(x) for some polynomial g(x); then clearly T g(x) dx = xg(x).   0 b 5. Since ker(T ) = , a basis for ker(T ) is c 0     0 1 0 0 , . 0 0 1 0

500

CHAPTER 6. VECTOR SPACES  Similarly, since range(T ) =

a 0 0 d

 , a basis for range(T ) is  1 0

  0 0 , 0 0

 0 . 1

Thus dim ker(T ) = 2 and dim range(T ) = 2. The Rank Theorem says that 4 = dim M22 = nullity(T ) + rank(T ) = dim ker(T ) + dim range(T ) = 2 + 2, which is true. 6. Since   a b ker(T ) = , c −a a basis for ker(T ) is  1 0

  0 0 , −1 0

  1 0 , 0 1

0 0

 .

From Exercise 2, range(T ) = R. Thus dim ker(T ) = 3 and dim range(T ) = 1. The Rank Theorem says that 4 = dim M22 = nullity(T ) + rank(T ) = dim ker(T ) + dim range(T ) = 3 + 1, which is true.  7. Since ker(T ) = a + ax − ax2 , a basis for ker(T ) is {1 + x − x2 }. From Exercise 3, range(T ) = R2 . Thus dim ker(T ) = 1 and dim range(T ) = 2. The Rank Theorem says that 3 = dim P2 = nullity(T ) + rank(T ) = dim ker(T ) + dim range(T ) = 1 + 2, which is true. 8. Since ker(T ) is the constant polynomials, it has a basis {1}. From Exercise 4, range(T ) is the set of polynomials with nonzero constant terms, or {bx + cx2 }, so it has a basis {x, x2 }. Thus dim ker(T ) = 1 and dim range(T ) = 2. The Rank Theorem says that 3 = dim P2 = nullity(T ) + rank(T ) = dim ker(T ) + dim range(T ) = 1 + 2, which is true. 9. Since  1 T 0

     0 1−0 1 = = , 0 0−0 0



0 T 1

     0 0−0 0 = = , 0 1−0 1

it follows that range(T ) = R2 , since it is a subspace of R2 containing both e1 and e2 . Thus rank(T ) = dim range(T ) = 2, so that by the Rank Theorem, nullity(T ) = dim M22 − rank(T ) = 4 − 2 = 2. 10. Since     1−0 1 T (1 − x) = = , 1−1 0

  0 T (x) = , 1

it follows that range(T ) = R2 , since it is a subspace of R2 containing both e1 and e2 . Thus rank(T ) = dim range(T ) = 2, so that by the Rank Theorem, nullity(T ) = dim P2 − rank(T ) = 3 − 2 = 1.

6.5. THE KERNEL AND RANGE OF A LINEAR TRANSFORMATION 11. If A =

 a c

501

 b , then d  a T (A) = AB = c

 b 1 d −1

     −1 a − b −a + b a − b −(a − b) = = . 1 c − d −c + d c − d −(c − d)

Thus T (A) = O if and only if a = b and c = d, so that   a a ker(T ) = , c c so that a basis for ker(T ) is  1 0

  1 0 , 0 1

 0 . 1

Hence nullity(T ) = dim ker(T ) = 2, so that by the Rank Theorem rank(T ) = dim M22 − nullity(T ) = 4 − 2 = 2.  a 12. If A = c

 b , then d

T (A) = AB − BA =

 a c

 b 1 d 0

  1 1 − 1 0

 1 a 1 c

  b a = d c

  a+b a+c − c+d c

So A ∈ ker(T ) if and only if c = 0 and a = d, so that      a b 1 0 0 A= =a +b 0 a 0 1 0

  b+d −c = d 0

 a−d . c

 1 . 0

These two matrices are linearly independent since they are not multiples of one another, so that nullity(T ) = 2. Since dim M22 = 4, we see that by the Rank Theorem rank(T ) = dim M22 − nullity(T ) = 4 − 2 = 2. 13. The derivative of p(x) = a + bx + cx2 is b + 2cx; then p0 (0) = b, so this is zero if b = 0. Thus ker(T ) = {a + cx2 }, so that a basis for ker(T ) is {1, x2 }. Then nullity(T ) = dim ker(T ) = 2, so that by the Rank Theorem, rank(T ) = dim P2 − nullity(T ) = 3 − 2 = 1. 14. We have



0 T (A) = A − AT = a12 − a21 a31 − a13

a12 − a21 0 a32 − a23

 a13 − a31 a23 − a32  , 0

so that A ∈ ker(T ) if and only if a12 = a21 , a13 = a31 , and a23 = a32 ; that is, if and only if A is symmetric. By Exercise 40 in Section 6.2, nullity(T ) = dim ker(T ) = 3·4 2 = 6. Then by the Rank Theorem, rank(T ) = dim M33 − nullity(T ) = 9 − 6 = 3. 15. (a) Yes, T is one-to-one. If T

      a 2a − b 0 = = , b a + 2b 0

then 2a − b = 0 and a + 2b = 0.  The  only solution to this pair of equations is a = b = 0, so that 0 the only element of the kernel is and thus T is one-to-one. 0 (b) Since T is one-to-one and the dimension of the domain and codomain are both 2, T must be onto by Theorem 6.21.

502

CHAPTER 6. VECTOR SPACES

16. (a) Yes, T is one-to-one. If T

  a = (a − 2b) + (3a + b)x + (a + b)x2 = 0, b

then a + b = 0 from the third term. Since 3a + b= 0 as well, subtracting gives 2a = 0, so that a a = 0. The third term then forces b = 0. Thus = 0 and T is one-to-one. b (b) T cannot be onto. R2 has dimension 2, so by the rank theorem 2 = dim R2 = rank(T ) + nullity(T ) ≥ rank(T ). It follows that the dimension of the range of T is at most 2 (in fact, since from part (a) nullity(T ) = 0, the dimension of the range is equal to 2). But dim P2 = 3, so T is not onto. 17. (a) If T (a + bx + cx2 ) = 0, then 2a − b

=0

a + b − 3c = 0 −a Row-reducing the augmented matrix gives  2 −1 0  1 1 −3 −1 0 1

+ c = 0.

  0 1 0 → 0 0 0

0 1 0

−1 −2 0

 0 0 . 0

This has nontrivial solutions a = t, b = 2t, c = t, so that ker(T ) = {t + 2tx + tx2 } = {t(1 + 2x + x2 )}. So T is not one-to-one. (b) Since dim P2 = dim R3 = 3, the fact that T is not one-to-one implies, by Theorem 6.21, that T is not onto. 18. (a) By Exercise 10, nullity(T ) = 1, so that ker(T ) 6= {0}, so that T is not one-to-one. (b) By Exercise 10, range(T ) = R2 , so that T is onto by definition.   a 19. (a) If T  b  = O, then a − b = a + b = 0 and b − c = b + c = 0; these equations give a = b = c = 0 as c the only solution. Thus T is one-to-one. (b) By the Rank Theorem, dim range(T ) = dim R3 − dim ker(T ) = 3 − 0 = 3. However, dim M22 = 4 > 3 = dim range(T ), so that T cannot be onto.   a 20. (a) If T  b  = O, then a + b + c = 0, b − 2c = 0, and a − c = 0. The latter two equations give a = c c and b = 2c, so that from the first equation 4c = 0 and thus c = 0. It follows that a = b = 0. Thus T is one-to-one. 3 (b) By Exercise 40 in Section 6.2, dim W = 2·3 2 = 3 = dim R . Since the dimensions of the domain and codomain are equal and T is one-to-one, Theorem 6.21 implies that T is also onto.

6.5. THE KERNEL AND RANGE OF A LINEAR TRANSFORMATION

503

21. These vector spaces are isomorphic. Bases for the two vector spaces are D3 : {E11 , E22 , E33 },

R3 : {e1 , e2 , e3 }.

Since both have dimension 3, we know that D3 ∼ = R3 . Define T (aE11 + bE22 + cE33 ) = a e1 + b e2 + c e3 . Then T is a linear transformation from the way we have defined it. Since the ei are linearly independent, it follows that a e1 + b e2 + c e3 = 0 if and only if a = b = c = 0. Thus ker(T ) = {0} so that T is one-to-one. Since dim D3 = dim R3 , we know that T is onto as well, so it is an isomorphism. 22. By Exercise 40 in Section 6.2, dim S3 = 3·4 2 = 6. Since an upper triangular matrix has three entries that are forced to be zero (those below the diagonal), and the other six may take any value, we see that dim U3 = 6 as well. Thus S3 ∼ = U3 . Define     a11 a12 a13 a11 a12 a13 T : S3 → D3 : a21 a22 a23  7→  0 a22 a23  . a31 a32 a33 0 0 a33 Then any matrix in ker(T ) must have a11 = a12 = a13 = a22 = a23 = a33 = 0, so that it must itself be the zero matrix. Thus ker(T ) = {0} so that T is one-to-one. Since T is one-to-one and dim S3 = dim U3 = 6, it follows that T is onto and is therefore an isomorphism. 23. By Exercises 40 and 41 in Section 6.2, dim S3 = are unequal, S3 ∼ 6 S30 . =

3·4 2

= 6 while dim S30 =

2·3 2

= 3. Since their dimensions

24. Since a basis for P2 is {1, x, x2 }, we have dim P2 = 3. If p(x) ∈ W , say p(x) = a + bx + cx2 + dx3 , then p(0) = 0 implies that a = 0, so that p(x) = bx + cx2 + dx3 ; a basis for this subspace is {x, x2 , x3 } so that dim W = 3 as well. Therefore P2 ∼ = W . Define T : P2 → W : a + bx + cx2 7→ ax + bx2 + cx3 . Then T (a + bx + cx2 ) = ax + bx2 + cx3 = 0 implies that a = b = c = 0. Thus T is one-to-one; since the dimensions of domain and codomain are equal, T is also onto, so it is an isomorphism. 25. A basis for C as a vector space over R is {1, i}, so that dim C = 2. A basis for R2 is {e1 , e2 }, so that dim R2 = 2. Thus C ∼ = R2 . Define   a 2 T : C → R : a + bi 7→ . b Then T (a + bi) = 0 means that a = b = 0. Thus T is one-to-one; since the dimensions of domain and codomain are equal, T is also onto, so it is an isomorphism. 26. V consists of all 2 × 2 matrices whose upper-left and lower-right entries are negatives of each other, so that   a b V = . c −a Thus a basis for V is

 1 0

  0 0 , −1 0

  1 0 , 0 1

 0 , 0

and therefore dim V = 3. Since dim W = dim R2 = 2, these vector spaces are not isomorphic. 27. Since the dimensions of the domain and codomain are the same, it suffices to show that T is one-to-one. Let p(x) = a0 + a1 x + · · · + an xn . Then p(x) + p0 (x) = (a0 + a1 x + · · · + an−1 xn−1 + an xn ) + (a1 + 2a2 x + · · · + nan xn−1 ) = (a0 + a1 ) + (a1 + 2a2 )x + · · · + (an−1 + nan )xn−1 + an xn .

504

CHAPTER 6. VECTOR SPACES If p(x) + p0 (x) = 0, then the coefficient of xn is zero, so that an = 0. But then the coefficient of xn−1 is 0 = an−1 + nan = an−1 , so that an−1 = 0. We continue this process, determining that ai = 0 for all i. Thus p(x) = 0, so that T is one-to-one and is thus an isomorphism.

28. Since the dimensions of the domain and codomain are the same, it suffices to show that T is one-to-one. Now, T (xk ) = (x − 2)k . Suppose that T (a0 + a1 x + · · · + an xn ) = a0 + a1 (x − 2) + · · · + an (x − 2)n = 0. If we show that {1, x − 2, (x − 2)2 , . . . , (x − 2)n } is a basis for Pn , then it follows that all the ai are zero, so that ker(T ) = {0} and T is one-to-one. To prove they form a basis, it suffices to show they are linearly independent, since there are n + 1 elements in the set and dim Pn = n + 1. We do this by induction. For n = 0, this is clear. For n = 1, suppose that a + b(x − 2) = (a − 2b) + bx = 0. Then b = 0 and thus a = 0, so that linear independence for n = 1 is established. Now suppose that {1, x − 2, . . . , (x − 2)k } is a linearly independent set, and suppose that a0 + a1 (x − 2) + · · · + ak−1 (x − 2)k−1 + ak (x − 2)k = 0. If this polynomial were expanded, the only term involving xk comes from the last term of the sum, and it is ak xk . Therefor ak = 0, and then the other ai = 0 by induction. Thus this set forms a basis for Pn , so that T is one-to-one and thus is an isomorphism. 29. Since the dimensions of the domain and  codomain are the same, it suffices to show that T is one-to-one. That is, we want to show that xn p x1 = 0 implies that p(x) = 0. Suppose p(x) = a0 + a1 x + · · · + an xn . Then

       1 1 1 n x p = x a0 + a1 + · · · + an = a0 xn + a1 xn−1 + · · · + an . x x xn  Therefore if xn p x1 = 0, all of the ai are zero so that p(x) = 0. Thus T is one-to-one and is an isomorphism. n

30. (a) As the hint suggests, define a map T : C [0, 1] → C [2, 3] by (T (f ))(x) = f (x − 2) for f ∈ C [0, 1]. Thus for example (T (f ))(2.5) = f (2.5 − 2) = f (0.5). T is linear, since (T (f + g))(x) = (f + g)(x − 2) = f (x − 2) + g(x − 2) = (T (f ))(x) + (T (g))(x) (T (cf ))(x) = (cf )(x − 2) = cf (x − 2) = c(T (f ))(x). We must show that T is one-to-one and onto. First suppose that T (f ) = 0; that is, suppose that f ∈ C [0, 1] and that f (x − 2) = 0 for all x ∈ [2, 3]. Let y = x − 2; then x ∈ [2, 3] means y ∈ [0, 1], and we get f (y) = 0 for all y ∈ [0, 1]. But this means that f = 0 in C [0, 1]. Thus T is one-to-one. To see that it is onto, choose f ∈ C [2, 3], and let g(x) = f (x + 2). Then (T (g))(x) = g(x − 2) = f (x + 2 − 2) = f (x), so that T (g) = f and thus T is onto. Since T is one-to-one and onto, it is an isomorphism. (b) This is similar to part (a). Define a map T : C [0, 1] → C [a, a + 1] by (T (f ))(x) = f (x − a) for f ∈ C [0, 1]. T is proved linear in a similar fashion to part (a). We must show that T is one-to-one and onto. First suppose that T (f ) = 0; that is, suppose that f ∈ C [0, 1] and that f (x − a) = 0 for all x ∈ [a, a + 1]. Let y = x − a; then x ∈ [a, a + 1] means y ∈ [0, 1], and we get f (y) = 0 for all y ∈ [0, 1]. But this means that f = 0 in C [0, 1]. Thus T is one-to-one. To see that it is onto, choose f ∈ C [a, a + 1], and let g(x) = f (x + a). Then (T (g))(x) = g(x − a) = f (x + a − a) = f (x), so that T (g) = f and thus T is onto. Since T is one-to-one and onto, it is an isomorphism.

6.5. THE KERNEL AND RANGE OF A LINEAR TRANSFORMATION

505

 31. This is similar to Exercise 30. Define a map T : C [0, 1] → C [0, 2] by (T (f ))(x) = f 21 x for f ∈ C [0, 1].  Thus for example (T (f ))(1) = f 12 · 1 = f 21 . T is proved linear in a similar fashion to part (a) of Exercise 30. We must show that T is one-to-one and onto. First suppose that T (f ) = 0; that is, suppose that f ∈ C [0, 1] and that f 12 x = 0 for all x ∈ [0, 2]. Let y = 12 x; then x ∈ [0, 2] means y ∈ [0, 1], and we get f (y) = 0 for all y ∈ [0, 1]. But this means that f = 0 in C [0, 1]. Thus T is one-to-one. To see that it is onto, choose f ∈ C [0, 2], and let g(x) = f (2x). Then     1 1 (T (g))(x) = g x = f 2 · x = f (x), 2 2 so that T (g) = f and thus T is onto. Since T is one-to-one and onto, it is an isomorphism. 32. This is again similar to Exercises 30 and 31,   and is the most general form. Define a map T : C [a, b] → b−a C [c, d] by (T (f ))(x) = f d−c (x − c) + a for f ∈ C [a, b]. Then (T (f ))(c) = f (a) and (T (f ))(d) = b − a + a = b. T is proved linear in a similar fashion to part (a) of Exercise 30. We must show that T is one-to-one and  onto. First suppose that T (f ) = 0; that is, suppose that f ∈ C [a, b] and that f

b−a d−c (x

− c) + a = 0 for all x ∈ [c, d]. Let y =

b−a d−c (x

− c) + a; then x ∈ [c, d] means y ∈ [a, b], and

we get f (y) = 0 for all y ∈ [a, b]. But this means that  f = 0 in C [a,  b]. Thus T is one-to-one. To see d−c that it is onto, choose f ∈ C [c, d], and let g(x) = f b−a (x − a) + c . Then  (T (g))(x) = g

      d−c b−a b−a (x − c) + a = f (x − c) + a − a + c = f (x), d−c b−a d−c

so that T (g) = f and thus T is onto. Since T is one-to-one and onto, it is an isomorphism. 33. (a) Suppose that (S ◦ T )(u) = 0. Then (S ◦ T )(u) = S(T u) = 0. Since S is one-to-one, we must have T u = 0. But then since T is one-to-one, it follows that u = 0. Thus S ◦ T is one-to-one. (b) Choose w ∈ W . Since S is onto, we can find some v ∈ V such that Sv = w. But T is onto as well, so there is some u ∈ U such that T u = v. Then (S ◦ T )(u) = S(T u) = Sv = w, so that S ◦ T is onto. 34. (a) Suppose that T u = 0. Then since S0 = 0, we have (S ◦ T )u = S(T u) = S0 = 0. But S ◦ T is one-to-one, so we must have u = 0, so that T is one-to-one as well. Intuitively, if T takes two elements of U to the same element of V , then certainly S ◦ T takes those two elements of U to the same element of W . (b) Choose w ∈ W . Since S ◦ T is onto, there is some u ∈ U such that (S ◦ T )u = w. Let v = T u. Then Sv = S(T u) = (S ◦ T )u = w, so that S is onto. Intuitively, if S ◦ T “hits” every element of W , then surely S must. 35. (a) Suppose that T : V → W and that dim V < dim W . Then range(T ) is a subspace of W , and the Rank Theorem asserts that dim range(T ) = dim V − nullity(T ) ≤ dim V < dim W. Theorem 6.11(b) in Section 6.2 states that if U is a subspace of W , then dim U = dim W if and only if U = W . But here dim range(T ) 6= dim W , and range(T ) is a subspace of W , so that range(T ) 6= W and T cannot be onto.

506

CHAPTER 6. VECTOR SPACES (b) We want to show that ker(T ) 6= {0}. Since range(T ) is a subspace of W , then dim W ≥ dim range(T ) = rank(T ). Then the Rank Theorem asserts that dim W + nullity(T ) ≥ rank(T ) + nullity(T ) = dim V, so that dim ker(T ) = nullity(T ) ≥ dim V − dim W > 0. Thus ker(T ) 6= {0}, so that T is not one-to-one.

36. Since dim Pn = dim Rn+1 = n + 1, we need only show that T is a one-to-one linear transformation. It is linear since         q(a0 ) p(a0 ) p(a0 ) + q(a0 ) (p + q)(a0 )  (p + q)(a1 )   p(a1 ) + q(a1 )   p(a1 )   q(a1 )          (T (p + q))(x) =   =  ..  +  ..  = (T (p))(x) + (T (q))(x) = .. ..   .   .     . . q(an ) p(an ) p(an ) + q(an ) (p + q)(an )       p(a0 ) cp(a0 ) (cp)(a0 )  p(a1 )   (cp)(a1 )   cp(a1 )        (T (cp))(x) =   =  ..  = c  ..  = c(T (p))(x). ..      .  . . p(an ) cp(an ) (cp)(an ) Now suppose T (p) = 0. That means that p(a0 ) = p(a1 ) = · · · = p(an ) = 0, so that p has n + 1 zeroes since the ai are distinct. But Exercise 62 in Section 6.2 implies that p(x) must be the zero polynomial, so that ker(T ) = {0} and thus T is one-to-one. 37. By the Rank Theorem, since T : V → V and T 2 : V → V , we have rank(T ) + nullity(T ) = dim V = rank(T 2 ) + nullity(T 2 ). Since rank(T ) = rank(T 2 ), it follows that nullity(T ) = nullity(T 2 ), and thus that dim ker(T ) = dim ker(T 2 ). So if we show ker(T ) ⊆ ker(T 2 ), we can conclude that ker(T ) = ker(T 2 ). Choose v ∈ ker(T ); then T 2 v = T (T v) = T 0 = 0. Hence ker(T ) = ker(T 2 ). Now choose v ∈ range(T ) ∩ ker(T ); we want to show v = 0. Since v ∈ range(T ), there is w ∈ V such that v = T w. Then since v ∈ ker(T ), 0 = T v = T (T w) = T 2 w, so that w ∈ ker(T 2 ) = ker(T ). So w is also in ker(T ). As a result, v = T w = 0, as we were to show. 38. (a) We have T ((u1 , w1 ) + (u2 , w2 )) = T (u1 + u2 , w1 + w2 ) = u1 + u2 − w1 − w2 = (u1 − w1 ) + (u2 − w2 ) = T (u1 , w1 ) + T (u2 , w2 ) T (c(u, w)) = T (cu, cw) = cu − cw = c(u − w) = cT (u, w). (b) By definition, U + W = {u + w : u ∈ U, w ∈ W }. So if u + w ∈ U + W , we have T (u, −w) = u + w, so that U + W ⊂ range(T ). But since T (u, w) = u − w ∈ U + W , we also have range(T ) ⊂ U + W . Thus range(T ) = U + W .

6.6. THE MATRIX OF A LINEAR TRANSFORMATION

507

(c) Suppose v = (u, w) ∈ ker(T ). Then u − w = 0, so that u = w and thus u ∈ U ∩ W , and v = (u, u). So ker(T ) = {(u, u) : u ∈ U } ⊂ U × W is a subspace of U × W . Then S : U ∩ W → ker(T ) : u 7→ (u, u) is clearly a linear transformation that is one-to-one and onto. Thus S is an isomorphism between U ∩ W and ker T . (d) From Exercise 43(b) in Section 6.2, dim(U × W ) = dim U + dim W . Then the Rank Theorem gives rank(T ) + nullity(T ) = dim(U × W ) = dim U + dim W. But from parts (a) and (b), we have rank(T ) = dim range(T ) = dim(U + W ),

nullity(T ) = dim ker(T ) = dim(U ∩ W ).

Thus rank(T ) + nullity(T ) = dim(U + W ) + dim(U ∩ W ) = dim U + dim W. Rearranging this equation gives dim(U + W ) = dim U + dim W − dim(U ∩ W ).

6.6

The Matrix of a Linear Transformation

1. Computing T (v) directly gives T (v) = T (4 + 2x) = 2 − 4x. To compute the matrix [T ]C←B , we compute the value of T on each basis element of B:  [T (1)]C = [0 − x]C =

 0 , −1

[T (x)]C = [1 − 0x]C =

  1 0



[T ]C←B = Then



0 [T ]C←B [4 + 2x]B = −1



[T (1)]C

 [T (x)]C =



0 −1

 1 . 0

    1 4 2 = = [2 − 4x]C = [T (4 + 2x)]C . 0 2 −4

2. Computing T (v) directly gives (as in Exercise 1) T (v) = T (4 + 2x) = 2 − 4x. In terms of the basis B, we have 4 + 2x = 3(1 + x) + (1 − x), so that   3 [4 + 2x]B = . 1 Now, 

 1 [T (1 + x)]C = [1 − x]C = , −1

Then

  −1 [T (1 − x)]C = [−1 − x]C = , ⇒ −1    1 [T ]C←B = [T (1 + x)]C [T (1 − x)]C = −1 

[T ]C←B [4 + 2x]B =

1 −1

    −1 3 2 = = [2 − 4x]C = [T (4 + 2x)]C . −1 1 −4

 −1 . −1

508

CHAPTER 6. VECTOR SPACES

3. Computing T (v) directly gives T (v) = T (a + bx + cx2 ) = a + b(x + 2) + c(x + 2)2 . In terms of the standard basis B, we have   a [a + bx + cx2 ]B =  b  . c Now,   1 [T (1)]C = [1]C = 0 , 0

  0 [T (x)]C = [x + 2]C = 1 , 0 [T ]C←B =

  0 [T (x2 )]C = [(x + 2)2 ]C = 0 1 

[T (1)]C

[T (x)]C

[T (x2 )]C



⇒  1 = 0 0

0 1 0

 0 0 . 1

Then  1 [T ]C←B [a + bx + cx2 ]B = 0 0

0 1 0

    a a 0 0  b  =  b  = [a + b(x + 2) + c(x + 2)2 ]C = [T (a + bx + cx2 )]C . c c 1

4. Computing T (v) directly gives T (v) = T (a + bx + cx2 ) = a + b(x + 2) + c(x + 2)2 = (a + 2b + 4c) + (b + 4c)x + cx2 . To compute [a + bx + cx2 ]B , we want to solve a + bx + cx2 = r + s(x + 2) + t(x + 2)2 = (r + 2s + 4t) + (s + 4t)x + tx2 for r, s, and t. The x2 terms give t = c; then from the linear term b = s + 4c, so that s = b − 4c; finally, a = r + 2(b − 4c) + 4c = r + 2b − 4c, so that r = a − 2b + 4c. Hence   a − 2b + 4c [a + bx + cx2 ]B =  b − 4c  . c Now,   1 [T (1)]C = [1]C = 0 , 0   4 [T (x + 2)]C = [x + 4]C = 1 , 0   16 [T ((x + 2)2 )]C = [(x + 4)2 ]C = [x2 + 8x + 16]C =  8  , 1 so that [T ]C←B =



[T (1)]C

[T (x + 2)]C

[T ((x + 2)2 )]C



 1 = 0 0

4 1 0

 16 8 . 1

6.6. THE MATRIX OF A LINEAR TRANSFORMATION

509

Then    1 4 16 a − 2b + 4c [T ]C←B [a + bx + cx2 ]B = 0 1 8   b − 4c  0 0 1 c   (a − 2b + 4c) + 4(b − 4c) + 16c  (b − 4c) + 8c = c   a + 2b + 4c =  b + 4c  c = [(a + 2b + 4c) + (b + 4c)x + cx2 ]C = [T (a + bx + cx2 )]C . 5. Directly, T (a + bx + cx2 ) =



 a . a+b+c

Using the standard basis B, we have   a 2  [a + bx + cx ]B = b  . c Now, [T (1)]C =

    1 1 = , 1 1 C

[T (x)]C =

    0 0 = , 1 1 C

[T (x2 )]C =

so that [T ]C←B =



[T (1)]C

[T (x)]C

2

[T (x )]C



 1 = 1

0 1

    0 0 = , 1 1 C

 0 . 1

Then  1 [T ]C←B [a + bx + cx2 ]B = 1

       a a a 0 0   b = = [T (a + bx + cx2 )]C . = a+b+c a+b+c C 1 1 c

6. Directly, as in Exercise 5, T (a + bx + cx2 ) =



 a . a+b+c

Using the basis B, we have   c [a + bx + cx2 ]B =  b  . a Now, [T (x2 )]C =

    0 −1 = , 1 C 1

[T (x)]C =

    0 −1 = , 1 C 1

so that [T ]C←B =



[T (x2 )]C

[T (x)]C

[T (1)]C =

  −1 [T (1)]C = 1

    1 0 = , 1 C 1

 −1 0 . 1 1

Then  −1 2 [T ]C←B [a + bx + cx ]B = 1

   c     −1 0   −(b + c) a b = = = [T (a + bx + cx2 )]C . a+b+c a+b+c C 1 1 a

510

CHAPTER 6. VECTOR SPACES

7. Computing directly gives       −7 + 2 · 7 7 −7 Tv = T =  −(−7)  = 7 . 7 7 7   −7 We first compute the coordinates of in B. We want to solve 7       −7 1 3 =r +s 7 2 −1 for r and s; that is, we want r and s such that −7 = r + 3s and 7 = 2r − s. Solving this pair of equations gives r = 2 and s = −3. Thus     −7 2 = . 7 B −3 Now,        5 6 1 = −1 = −3 , T 2 C 2 C 2 so that [T ]C←B Then

    1 T = 2 C

   6 −7 [T ]C←B = −3 7 B 2

       1 4 3 = −3 = −2 , T −1 C −1 C −1      6 3 T = −3 −1 C 2

 4 −2 . −1

        4   7 0 −7 2      −2 . = 0 = 7 = T 7 C −3 −1 7 C 7

8. Computing directly gives (as in the statement of Exercise 7)     a + 2b a Tv = T =  −a  . b b   a We first compute the coordinates of in B. We want to solve b       a 1 3 =r +s b 2 −1 for r and s; that is, we want r and s such that a = r + 3s and b = 2r − s. Solving this pair of equations gives r = 71 (a + 3b) and s = 17 (2a − b). Thus " # " # 1 a (a + 3b) 7 = 1 . b 7 (2a − b) B

Now,        6 5 1 T = −1 = −3 , 2 C 2 C 2 so that [T ]C←B

    1 T = 2 C

       1 4 3 T = −3 = −2 , −1 C −1 C −1      6 3 T = −3 −1 C 2

 4 −2 . −1

6.6. THE MATRIX OF A LINEAR TRANSFORMATION

511

Then   " # # 6 4 "1 a  7 (a + 3b)  [T ]C←B = −3 −2 1 b 7 (2a − b) B 2 −1   4 6 7 (a + 3b) + 7 (2a − b)   = − 37 (a + 3b) − 27 (2a − b) 1 2 7 (a + 3b) − 7 (2a − b)     " " ## a + 2b 2a + 2b a     . =  −a − b  =  −a  = T b C b b C 9. Computing directly gives T v = T A = AT =

 a b

 c . d

Since A = aE11 + bE12 + cE21 + dE22 , the coordinate vector of A in B is   a b  [A]B =  c . d Now,

T [T E11 ]C = [E11 ]C

T [T E12 ]C = [E12 ]C

T [T E21 ]C = [E21 ]C

T [T E22 ]C = [E22 ]C

    1 1 0 0    = [E11 ]C =  0 = 0 , 0 0 C   0   0 0 0 =  , = [E21 ]C = 1 0 C 1 0   0   1 0 1 =  , = [E12 ]C = 0 0 C 0 0   0   0 0 0 = [E22 ]C = =  . 0 1 C 0 1

Therefore [T ]C←B =

Then



[T E11 ]C

 1 0 [T ]C←B [A]B =  0 0

[T E12 ]C

0 0 1 0

0 1 0 0

[T E21 ]C

 1  0 [T E22 ]C =  0 0

    0 a a  b c 0   =   = a 0  c   b  b 1 d d

0 0 1 0

0 1 0 0

 0 0 . 0 1

 c = [T A]C . d C

512

CHAPTER 6. VECTOR SPACES

10. Computing directly gives  a Tv = TA = A = b T

 c . d

Since A = aE11 + bE12 + cE21 + dE22 , the coordinate vector of A in B is   d c  [A]B =  b . a Now,   0  0 0 =  , 1 C 1 0   1    0 1 0 T =  , [T E21 ]C = [E21 ]C = [E12 ]C = 0 0 C 0 0   0    0 0 1 T [T E12 ]C = [E12 ]C = [E21 ]C = =  , 1 0 C 0 0     0 1     0 T  0 [T E11 ]C = [E11 ]C = [E11 ]C =  0 = 0 . 1 0 C  0 T [T E22 ]C = [E22 ]C = [E22 ]C = 0

Therefore [T ]C←B =



[T E22 ]C

[T E21 ]C

[T E12 ]C

 0  0 [T E11 ]C =  1 0

1 0 0 0

0 1 0 0

 0 0 . 0 1

Then  0 0 [T ]C←B [A]B =  1 0

1 0 0 0

0 1 0 0

    0 d c  c b 0   =   = a 0  b   d  b 1 a a

 c = [T A]C . d C

11. Computing directly gives Tv = TA =

 a c

 b 1 d −1

  −1 1 − 1 −1

  −1 a b 1 c d  a−b = c−d

    b−a a−c b−d c−b − = d−c c−a d−b a−d

Since A = aE11 + bE12 + cE21 + dE22 , the coordinate vector of A in B is   a b  [A]B =  c . d

 d−a . b−c

6.6. THE MATRIX OF A LINEAR TRANSFORMATION

513

Now,   0  −1 −1  , = 0 C  1 0   −1    0 −1 0  , [T E12 ]C = [E12 B − BE12 ]C = = 0 1 C  0 1   1    0 1 0 [T E21 ]C = [E21 B − BE21 ]C = =  , 0 −1 C  0 −1   0    1 0 1 [T E22 ]C = [E22 B − BE22 ]C = =  . −1 0 C −1 0

 0 [T E11 ]C = [E11 B − BE11 ]C = 1

Therefore 

[T ]C←B =



[T E11 ]C

[T E12 ]C

[T E21 ]C

0  −1 [T E22 ]C =   1 0

 −1 1 0 0 0 1 . 0 0 −1 1 −1 0

Then 

0 −1  [T ]C←B [A]B =  1 0

    c−b −1 1 0 a      0 0 1   b  = d − a = c − b a−d 0 0 −1  c  a − d b−c d 1 −1 0

 d−a = [T A]C . b−c C

12. Computing directly gives  a Tv = TA = c

  b a − d b

   c 0 b−c = . d c−b 0

Using the given basis B for M22 , the coordinate vector of A is   a b  [A]B =  c . d Now,   0 0 T  [T E11 ]C = [E11 − E11 ]C =  0 , 0



 0  1 T  [T E12 ]C = [E12 − E12 ]C = [E12 − E21 ]C =  −1 , 0     0 0 −1 0 T T    [T E21 ]C = [E21 − E21 ]C = [E21 − E12 ]C =   1 , [T E22 ]C = [E22 − E22 ]C = 0 . 0 0

514

CHAPTER 6. VECTOR SPACES Therefore [T ]C←B =

Then



[T E22 ]C

[T E21 ]C

 0  0 [T E11 ]C =  0 0

[T E12 ]C

 0 0 0 1 −1 0 . −1 1 0 0 0 0

    0 0 0 0 a       0 b−c 1 −1 0   b  = b − c = = [T A]C . c−b 0 C −1 1 0  c  c − b 0 d 0 0 0

 0 0 [T ]C←B [A]B =  0 0

13. (a) If a sin x + b cos x ∈ W , then D(a sin x + b cos x) = D(a sin x) + D(b cos x) = aD sin x + bD cos x = a cos x − b sin x ∈ W. (b) Since [sin x]B =

  1 , 0

[cos x]B =

  0 , 1

then [D sin x]B = [cos x]B =

  0 , 1

[D cos x]B = [− sin x]B =

  −1 , 0

 ⇒

[D]B =

0 1

 −1 . 0



 3 (c) [f (x)]B = [3 sin x − 5 cos x]B = , so −5 [D(f (x))]B = [D]B [f (x)]B =

 0 1

    −1 3 5 = , 0 −5 3

so that D(f (x)) = f 0 (x) = 5 sin x + 3 cos x. Directly, since (sin x)0 = cos x and (cos x)0 = − sin x, we also get f 0 (x) = 3 cos x − 5(− sin x) = 5 sin x + 3 cos x. 14. (a) If ae2x + be−2x ∈ W , then D(ae2x + be−2x ) = D(ae2x ) + D(be−2x ) = aDe2x + bDe−2x = (2a)e2x + (−2b)e−2x ∈ W. (b) Since   1 [e ]B = , 0 2x

[e

−2x

  0 ]B = , 1

then [De2x ]B = [2e2x ]B =

  2 , 0

(c) [f (x)]B = [e2x − 3e−2x ]B =

[De−2x ]B = [−2e−2x ]B =



 0 , −2



[D]B =

 2 0

 0 . −2



 1 , so −3

 2 [D(f (x))]B = [D]B [f (x)]B = 0

    0 1 2 = , −2 −3 6

so that D(f (x)) = f 0 (x) = 2e2x + 6e−3x . Directly, f 0 (x) = e2x · 2 − 3e−2x · (−2) = 2e2x + 6e−3x . 15. (a) Since   1 [e2x ]B = 0 , 0

  0 [e2x cos x]B = 1 , 0

  0 [e2x sin x]B = 0 , 1

6.6. THE MATRIX OF A LINEAR TRANSFORMATION

515

then   2 [De2x ]B = [2e2x ]B = 0 , 0 

 0 [D(e2x cos x)]B = [2e2x cos x − e2x sin x]B =  2 , −1   0 [D(e2x sin x)]B = [2e2x sin x + e2x cos x]B = 1 . 2 Thus  2 [D]B = 0 0

 0 0 2 1 . −1 2

 3 (b) [f (x)]B = [3e2x − e2x cos x + 2e2x sin x]B = −1, so 2 

    3 6 0 0 2 1 −1 = 0 , −1 2 2 5

 2 [D(f (x))]B = [D]B [f (x)]B = 0 0 so that D(f (x)) = f 0 (x) = 6e2x + 5e2x sin x. Directly,

f 0 (x) = 3e2x · 2 − (2e2x cos x − e2x sin x) + (4e2x sin x + 2e2x cos x) = 6e2x + 5e2x sin x. 16. (a) Since   1 0  [cos x]B =  0 , 0

  0 1  [sin x]B =  0 , 0

  0 0  [x cos x]B =  1 , 0

  0 0  [x sin x]B =  0 , 1

then 

   0 1 −1 0    [D cos x]B = [− sin x]B =   0 , [D sin x]B = [cos x]B = 0 , 0 0     1 0  0 1    [D(x cos x)]B = [cos x − x sin x]B =   0 , [D(x sin x)]B = [sin x + x cos x]B = 1 . −1 0 Thus 

0 −1 [D]B =   0 0

1 0 0 0

 1 0 0 1 . 0 1 −1 0

516

CHAPTER 6. VECTOR SPACES   1 0  , so (b) [f (x)]B = [cos x + 2x cos x]B =   2 0 

0 −1 [D(f (x))]B = [D]B [f (x)]B =   0 0

1 0 0 0

    2 1 0 1 0 −1 0 1   =  , 0 1 2  0 −2 −1 0 0

so that D(f (x)) = f 0 (x) = 2 cos x − sin x − 2x sin x. Directly, f 0 (x) = − sin x + 2 cos x − 2x sin x = 2 cos x − sin x − 2x sin x. 17. (a) Let p(x) = a + bx; then  (S ◦ T )p(x) = S(T p(x)) = S Then

       p(0) a a − 2(a + b) −a − 2b =S = = . p(1) a+b 2a − (a + b) a−b

    −1 −1 = , [(S ◦ T )(1)]D = 1 1 D

so that



−2 [(S ◦ T )(x)]D = −1 

[S ◦ T ]D←B

−1 = 1

 D

  −2 = , −1

 −2 . −1

(b) We have [T (1)]C =

    1 1 = , 1 C 1

[T (x)]C =

so that [T ]C←B =

 1 1

    0 0 = , 1 C 1

 0 . 1

Next,        1 1 1 S = = , 2 0 D 2 D so that [S]D←C

       −2 0 −2 = , S = −1 1 D −1 D  1 = 2

Then [S ◦ T ]D←B = [S]D←C [T ]C←B =

 1 2

 −2 . −1  −2 1 −1 1

  0 −1 = 1 1

 −2 , −1

which matches the result from part (a). 18. (a) Let p(x) = a + bx; then (S ◦ T )p(x) = S(T p(x)) = S(a + b(x + 1)) = S((a + b) + bx) = (a + b) + b(x + 1) = (a + 2b) + bx. Then

    1 1 [(S ◦ T )(1)]D = 0 = 0 , 0 D 0

so that [S ◦ T ]D←B

    2 2 [(S ◦ T )(x)]D = 1 = 1 , 0 D 0  1 = 0 0

 2 1 . 0

6.6. THE MATRIX OF A LINEAR TRANSFORMATION

517

(b) We have     1 1 [T (1)]C = [1]C = 0 = 0 , 0 C 0

    1 1 [T (x)]C = [x + 1]C = 1 = 1 , 0 C 0

so that [T ]C←B

 1 = 0 0

 1 1 . 0

Next,   1 [S(1)]D = [1]D = 0 , 0

  1 [S(x)]D = [x + 1]D = 1 , 0   1 [S(x2 )]D = [(x + 1)2 ]D = [x2 + 2x + 1]D = 2 , 1

so that [S]D←C

 1 = 0 0

Then [S ◦ T ]D←B = [S]D←C [T ]C←B

 1 = 0 0

 1 2 . 1

1 1 0 1 1 0

 1 1 2 0 0 1

  1 1 1 = 0 0 0

 2 1 , 0

which matches the result from part (a). 19. In Exercise 1, the basis for P1 in both the domain and so we can use the matrix computed there:  0 [T ]E←E = −1

codomain was already the standard basis E,  1 . 0

Since this matrix has determinant 1, it is invertible, so that T is invertible as well. Then −1

[T −1 (a + bx)]E←E = ([T ]E←E )

−1    1 a 0 = 0 b 1

   a 0 = b −1

    −1 a −b = . 0 b a

So T −1 (a + bx) = −b + ax. 20. From Exercise 5, [T ]C←B

 1 = 1

0 1

 0 . 1

Since this matrix is not square, it is not invertible, so that T is not invertible. 21. We first compute [T ]E←E , where E = {1, x, x2 } is the standard basis:       1 2 4 [T (1)]E = [1]E = 0 , [T (x)]E = [x + 2]E = 1 , [T (x2 )]E = [(x + 2)2 ]E = [x2 + 4x + 4]E = 4 , 0 0 1 so that [T ]E←E

 1 = 0 0

2 1 0

 4 4 . 1

518

CHAPTER 6. VECTOR SPACES Since this is an upper triangular matrix with nonzero diagonal entries, it is invertible, so that T is invertible as well. To find T −1 , we have

[T −1 (a + bx + cx2 )]E←E

   a 1 −1 = ([T ]E←E )  b  = 0 c 0

−1   4 a 4  b  1 c      1 −2 4 a a − 2b + 4c 1 −4  b  =  b − 4c  . = 0 0 0 1 c c

2 1 0

So T −1 (a + bx) = (a − 2b + 4c) + (b − 4c)x + cx2 = a + b(x − 2) + c(x − 2)2 = p(x − 2). 22. We first compute [T ]E←E , where E = {1, x, x2 } is the standard basis:   0 [T (1)]E = [0]E = 0 , 0

  1 [T (x)]E = [1]E = 0 , 0

  0 [T (x2 )]E = [2x]E = 2 , 0

so that [T ]E←E

 0 = 0 0

1 0 0

 0 2 . 0

Since this is an upper triangular matrix with a zero entry on the diagonal, it has determinant zero, so is not invertible; therefore, T is not invertible either. 23. We first compute [T ]E←E , where E = {1, x, x2 } is the standard basis:   1 [T (x)]E = [x + 1]E = 1 , 0

  1 [T (1)]E = [1 + 0]E = [1]E = 0 , 0

  0 [T (x2 )]E = [x2 + 2x]E = 2 , 1

so that [T ]E←E

 1 = 0 0

1 1 0

 0 2 . 1

Since this is an upper triangular matrix with nonzero diagonal entries, it is invertible, so that T is invertible as well. To find T −1 , we have

[T

−1

2

−1

(a + bx + cx )]E←E = ([T ]E←E )

   a 1  b  = 0 c 0

1 1 0

So T −1 (a + bx + cx2 ) = (a − b + 2c) + (b − 2c)x + cx2 .

−1   0 a 2  b  1 c      1 −1 2 a a − b + 2c 1 −2  b  =  b − 2c  . = 0 0 0 1 c c

6.6. THE MATRIX OF A LINEAR TRANSFORMATION

519

24. We first compute [T ]E←E , where E = {E11 , E12 , E21 , E22 } is the standard basis:   3      2 1 0 3 2 3 2 [T E11 ]E = = =  , 0 0 2 1 E 0 0 E 0 0   2      1 0 1 3 2 2 1 [T E12 ]E = = =  , 0 0 2 1 E 0 0 E 0 0   0      0 0 0 3 2 0 0  [T E21 ]E = = = , 1 0 2 1 E 3 2 E 3 2   0      0 0 0 3 2 0 0  [T E22 ]E = = = , 0 1 2 1 E 2 1 E 2 1 so that [T ]E←E

 3 2 = 0 0

2 1 0 0

0 0 3 2

 0 0 . 2 1

This matrix is block triangular, and each diagonal block is invertible, so it is invertible and therefore T is invertible as well. To find T −1 , we have   −1 a T c

So T −1

 a c



b d

E←E

   3 a  2 b −1    = ([T ]E←E )   c  = 0 0 d

  b −a + 2b = d −c + 2d

2 1 0 0

0 0 3 2

−1   a 0 b 0    2  c  d 1      −a + 2b a −1 2 0 0      2 −3 0 0   b  =  2a − 3b  . =  0 0 −1 2  c  −c + 2d 2c − 3d d 0 0 2 −3

 2a − 3b . 2c − 3d

25. From Exercise 11, since the bases given there were both the standard basis,   0 −1 1 0 −1 0 0 1 . [T ]E←E =   1 0 0 −1 0 1 −1 0 This matrix has determinant zero (for example, the second and third rows are multiples of one another), so that T is not invertible. 26. From Exercise 12, since the bases given there were both the  0 0 0 0 1 −1 [T ]E←E =  0 −1 1 0 0 0

standard basis,  0 0 . 0 0

This matrix has determinant zero, so that T is not invertible.

520

CHAPTER 6. VECTOR SPACES

27. In Exercise 13, the basis is B = {sin x, cos x}. Using the method of Example 6.83, Z    a (a sin x + b cos x) dx = [D]−1 B←B b . B Using [D]B←B from Exercise 13, we get Z   −1        0 −1 1 0 1 1 −3 (sin x − 3 cos x) dx = = = . 1 0 −3 −1 0 −3 −1 B R So (sin x − 3 cos x) dx = −3 sin x − cos x + C. This is the same answer as integrating directly. 28. In Exercise 14, the basis is B = {e2x , e−2x }. Using the method of Example 6.83, Z     a ae2x + be−2x dx = [D]−1 B←B b . B Using [D]B←B from Exercise 14, we get #" # " # " #−1 " # " Z  1 0 0 0 2 0 0 −2x 2 = . 5e dx = = 1 5 0 −2 5 −2 0 −2 5 B R So 5e−2x dx = − 52 e−2x + C. This is the same answer as integrating directly. 29. In Exercise 15, the basis is B = {e2x , e2x cos x, e2x sin x}. Using the method of Example 6.83,   Z   a ae2x + be−2x dx = [D]−1 B←B b . B Using [D]B←B from Exercise 15, we get  −1        1  2 0 0 0 0 0 0 0 2           e2x cos x − 2e2x sin x dx = 0 2 1  1 =  0 52 − 15   1 =  54  . B 2 0 −1 2 0 15 − 35 −2 −2 5  R 2x So e cos x − 2e2x sin x dx = 45 e2x cos x − 53 e2x sin x + C. Checking via differentiation gives 0   3  3 2x 4 4 2x e cos x − e sin x + C = 2e2x cos x − e2x sin x − 2e2x sin x + e2x cos x 5 5 5 5 Z

= e2x cos x − 2e2x sin x. 30. In Exercise 16, the basis is B = {cos x, sin x, x cos x, x sin x}. Using the method of Example 6.83,   a Z  b −1  (a cos x + b sin x + cx cos x + dx sin x) dx = [D]B←B  c . B d Using [D]B←B from Exercise 16, we get  −1        0 1 1 0 0 0 −1 1 0 0 1 Z  −1 0  0 1  0  1 0 1 0 0 1   =   =   (x cos x + x sin x) dx =   0 0 0 1 1 0 0 0 −1 1 −1 B 0 0 −1 0 1 0 0 1 0 1 1 R So (x cos x + x sin x) dx = cos x + sin x − x cos x + x sin x + C. Checking via differentiation gives 0

(cos x + sin x − x cos x + x sin x + C) = − sin x + cos x − cos x + x sin x + sin x + x cos x = x cos x + x sin x.

6.6. THE MATRIX OF A LINEAR TRANSFORMATION 31. With E = {e1 , e2 }, we have       −4 · 0 0 0 [T e1 ]E = = = , 1+5·0 E 1 E 1 Therefore [T ]E =

521

 [T e2 ]E =  0 1

−4 · 1 0+5·1

 = E

    −4 −4 = . 5 E 5

 −4 . 5

Using the method of Example 6.86(b), the eigenvectors of [T ]E will give us a diagonalizing basis C. Using methods from Chapter 4, we find the eigenvalues and corresponding eigenvectors to be     −1 −4 λ1 = 4, v1 = , λ2 = 1, v2 = . 1 1 Thus C=

     −1 −4 −1 , , and [T ]C = 1 1 1

32. With E = {e1 , e2 }, we have       1 1 1−0 = , = [T e1 ]E = 1 1 E 1+0 E Therefore [T ]E =

 −1 −4 −1 [T ]E 1 1

[T e2 ]E =  1 1

  −4 4 = 1 0

 0 1

      −1 −1 0−1 = . = 1 1 E 0+1 E

 −1 . 1

Using the method of Example 6.86(b), the eigenvectors of [T ]E will give us a diagonalizing basis C. Using methods from Chapter 4, we find the eigenvalues and corresponding eigenvectors to be     i −i λ1 = 1 + i, v1 = , λ2 = 1 − i, v2 = . 1 1 Since the eigenvectors are not real, T is not diagonalizable. 33. With E = {1, x}, we have     4 4 = , [T (1)]E = [4 + x]E = 1 1 E Therefore

    2 2 = . [T (x)]E = [2 + 3x]E = 3 3 E

 4 [T ]E = 1

 2 . 3

Using the method of Example 6.86(b), the eigenvectors of [T ]E will give us a diagonalizing basis C. Using methods from Chapter 4, we find the eigenvalues and corresponding eigenvectors to be     2 −1 λ1 = 5, v1 = , λ2 = 2, v2 = . 1 1 Thus C=

     2 −1 2 , , and [T ]C = 1 1 1

−1  −1 2 [T ]E 1 1

  −1 5 = 1 0

 0 . 2

34. With E = {1, x, x2 }, we have     1 1 [T (1)]E = [1]E = 0 = 0 , 0 E 0

    1 1 [T (x)]E = [x + 1]E = 1 = 1 , 0 E 0     1 1 [T (x2 )]E = [(x + 1)2 ]E = [x2 + 2x + 1]E = 2 = 2 . 1 E 1

522

CHAPTER 6. VECTOR SPACES Therefore  1 [T ]E = 0 0

1 1 0

 1 2 . 1

Using the method of Example 6.86(b), the eigenvectors of [T ]E will give us a diagonalizing basis C. Since this is an upper triangular matrix with 1s on the diagonal, the only eigenvalue is 1; using methods from   1 Chapter 4 we find the eigenspace to be one-dimensional with basis 0. Thus T is not diagonalizable. 0 35. With E = {1, x}, we have     1 1 [T (1)]E = [1 + x · 0]E = [1]E = = , 0 E 0

    0 0 [T (x)]E = [x + x · 1]E = [2x]E = = . 2 E 2

Therefore [T ]E =

 1 0

 0 . 2

Thus T is already diagonal with respect to the standard basis E. 36. With E = {1, x, x2 }, we have     1 1 [T (1)]E = [1]E = 0 = 0 , 0 E 0     2 2 [T (x)]E = [3x + 2]E = 3 = 3 , 0 E 0    4 9 [T (x2 )]E = [(3x + 2)2 ]E = [9x2 + 12x + 4]E = 12 = 12 . 9 E 4 

Therefore  1 [T ]E = 0 0

2 3 0

 4 12 . 9

Using the method of Example 6.86(b), the eigenvectors of [T ]E will give us a diagonalizing basis C. Since this is an upper triangular matrix with 1s on the diagonal, the eigenvalues are 1, 3, and 4; using methods from Chapter 4 we find the corresponding eigenvectors to be       1 1 1 λ1 = 1, v1 = 0 , λ2 = 3, v2 = 1 , λ3 = 9, v3 = 2 . 0 0 1 Thus        1 1  1  1 C = 0 , 1 , 2 , and [T ]C = 0   0 0 1 0

1 1 0

−1  1 1 2 [T ]E 0 1 0

1 1 0

  1 1 2 = 0 1 0

0 3 0

 0 0 . 9

    d1 −d2 , and let T be the reflection in `. Let d0 = . As in Example d2 d1 6.85, we may as well assume for simplicity that d is a unit vector; then d0 is as well. Continuing

37. Let ` have direction vector d =

6.6. THE MATRIX OF A LINEAR TRANSFORMATION

523

to follow Example 6.85, D = {d, d0 } is a basis for R2 . Since d0 is orthogonal to d, it follows that T d0 = −d0 . Thus     1 0 [T (d)]D = [d]D = , [T (d0 )]D = [−d0 ]D = . 0 −1 Thus

 1 [T ]D = 0

 0 . −1

Now, [T ]E = PE←D [T ]D PD←E , so we must find PE←D . The columns of this matrix are the expression of the basis vectors of D in the standard basis E, so we have " # d1 d2 PE←D = . d2 −d1 Since we assumed d was a unit vector, this is an orthogonal matrix, so that −1 T [T ]E = PE←D [T ]D PD←E = PE←D [T ]D PE←D = PE←D [T ]D PE←D    d d2 1 0 d1 = 1 d2 −d1 0 −1 d2

  2 d2 d − d22 = 1 −d1 2d1 d2

 2d1 d2 . d22 − d21

Note that this is the same as the answer from Exercise 26 in Section 3.6, with the assumption that d is a unit vector. 38. Let T be the projection onto W . From Example 5.11, we have an orthogonal basis for R2 ,        1  −1 1  u1 = 1 , u2 =  1 , u3 = −1 ,   2 1 0 where {u1 , u2 } is a basis for W and {u3 } is a basis for W ⊥ . Then       0 0 1 so that [T (u1 )]D = 0 , [T (u2 )]D = 1 , [T (u3 )]D = 0 , 0 0 0

 1 [T ]D = 0 0

0 1 0

 0 0 . 0

Further, since the columns of PE←D are the expression of the basis vectors of D in the standard basis E,     1 1 0 1 −1 1 2 2    −1 1 1 . PE←D = 1 = − 13 1 −1 ⇒ PE←D 3 3 1 1 1 −6 3 0 1 2 6 Thus −1 [T ]E = PE←D [T ]D PD←E = PE←D [T ]D PE←D    1 1 1 −1 1 1 0 0 2 2    1 1 = 1 1 −1 0 1 0 − 3 3 1 0 1 2 0 0 0 − 61 6

  5 0 6  1 1 = 6 3 1 − 13 3

1 6 5 6 1 3



 3 If v = −1, then 2  projW v = [T ]E [v]E =

5 6  1  6 − 13

1 6 5 6 1 3

   5 3 3      1 =  13  , 3  −1 1 − 23 2 3

− 31



which is the same as was found in Example 11 in Section 5.2.

− 31



1 . 3 1 3

524

CHAPTER 6. VECTOR SPACES

39. Let B = {v1 , . . . , vn }, and suppose that A is such that A[v]B = [T v]C for all v ∈ V . Then A[vi ]B = [T vi ]C . But [vi ]B = ei since the vi are the basis vectors of B, so this equation says that Aei = [T vi ]C . The left-hand side of this equation is the ith column of A, while the right-hand side is the ith column of [T ]C←B . Thus those ith columns are equal; since this holds for all i, we get A = [T ]C←B . 40. Suppose that [v]B ∈ null([T ]C←B ). That means that 0 = [T ]C←B [v]B = [T v]C , so that [T v]C = 0 and thus v ∈ ker(T ). Similarly, if v ∈ ker(T ), then 0 = [T v]C = [T ]C←B [v]B , so that [v]B ∈ null([T ]C←B ). Then the map V → Rn : v 7→ [v]B is an isomorphism mapping ker(T ) to null([T ]C←B ), so the two subspaces are isomorphic and thus have the same dimension. So nullity(T ) = nullity([T ]C←B ). 41. Suppose B = {b1 , . . . , bn }. Then [T ]C←B [bi ]B = [T (bi )]C by definition of the matrix of T with respect to these bases, and denote by ai the columns of [T ]C←B . Then rank([T ]C←B ) = dim col[T ]C←B , which is the number of linearly independent ai , and rank(T ) = dim range(T ), which is the number of linearly independent T (vi ). Now, since ai = [T ]C←B [bi ]B = [T (bi )]C we see that {ai1 , ai2 , . . . , aim } are linearly independent if and only if {T (bi1 ), . . . , T (b)im } are linearly independent, so col(A) = span(ai1 , ai2 , . . . , aim ) if and only if range(T ) = span(T (bi1 ), . . . , T (b)im ) and therefore rank(A) = rank(T ). 42. T is diagonalizable if and only if there is some basis C and matrix P such that [T ]C = P −1 AP is diagonal. Thus if T is diagonalizable, so is A. For the reverse, if A is diagonalizable, then the basis C whose elements are the columns of P is a basis in which [T ]C is diagonal. 43. Let dim V = n and dim W = m. Then [T ]C←B is an m × n matrix, and by Theorem 3.26 in Section 3.5, rank([T ]C←B ) + nullity([T ]C←B ) = n. But by Exercises 40 and 41, rank([T ]C←B ) = rank(T ) and nullity([T ]C←B ) = nullity(T ); since dim V = n, we get dim V = rank(T ) + nullity(T ). 44. Claim that [T ]C 0 ←B0 = PC 0 ←C [T ]C←B PB←B0 . To prove this, it is enough to show that for all x ∈ V , [x]C 0 = PC 0 ←C [T ]C←B PB←B0 [x]B0 since the matrix of T with respect to B 0 and C 0 is unique by Exercise 39. And indeed PC 0 ←C [T ]C←B PB←B0 [x]B0 = PC 0 ←C [T ]C←B (PB←B0 [x]B0 ) = PC 0 ←C ([T ]C←B [x]B ) = PC 0 ←C [x]C = [x]C 0 . 45. For T ∈ L(V, W ), define ϕ(T ) = [T ]C←B ∈ Mmn . Claim that ϕ is an isomorphism. We must first show it is linear: ϕ(S + T ) = [(S + T )]C←B   = [(S + T )b1 ]C · · · [(S + T )bn ]C   = [Sb1 ]C + [T b1 ]C · · · [Sbn ]C + [T bn ]C     = [Sb1 ]C · · · [Sbn ]C + [T b1 ]C · · · [T bn ]C = ϕ(S) + ϕ(T ).

Exploration: Tiles, Lattices, and the Crystallographic Restriction

525

Similarly ϕ(cS) = [(cS)]C←B   = [(cS)b1 ]C · · · [(cS)bn ]C   = c[Sb1 ]C · · · c[Sbn ]C   = c [Sb1 ]C · · · [Sbn ]C = cϕ(S). Next we show that ker ϕ = 0, so that ϕ is one-to-one. Suppose that ϕ(T ) = 0. That means that [T ]C←B = [0]mn , so that nullity([T ]C←B ) = n. By Exercise 40, this is equal to nullity(T ). Thus nullity(T ) = n = dim V , so that T is the zero linear transformation, showing that ϕ is one-to-one. Finally, since dim Mmn = mn, we will have shown that ϕ is an isomorphism if we prove that dim L(V, W ) = mn. But if {v1 , . . . , vn } and {w1 , . . . , wm } are bases for V and W , then a linear map from V to W sends each vi to some linear combination of the wj , so it is a sum of maps that send vi to wk and the other vi to zero. There are mn of these maps, which are clearly linearly independent, so that dim L(V, W ) = mn. 46. From Exercise 45 with W = R, we have dim W = dim R = 1, so that M1n ∼ = V . Thus V ∗ = L(V, W ) = L(V, R) ∼ = M1n ∼ = V.

Exploration: Tiles, Lattices, and the Crystallographic Restriction 1. Let P be the pattern in Figure 6.15. We are given, from the figure, that P +u = P and that P +v = P . Therefore if a is a positive integer, then P + au = (P + u) + (a − 1)u = P + (a − 1)u, and by induction this is just P . Similarly if b is a positive integer then P + bv = (P + v) + (b − 1)v = P + (b − 1)v, which again by induction is equal to P . If a is a negative integer, then since P + (−a)u = P , it follows that P = P − (−a)u = P + au, and similarly for b < 0 and v. 2. The lattice points correspond to centers of the trefoils in Figure 6.15.

1.5 1. 0.5

-3

-2

v u

-1

1

2

3

-0.5 -1. -1.5

θ 3. Let θ be the smallest positive angle through which the lattice exhibits rotational symmetry. Let RO denote the operation of rotation through an angle θ > 0 around a center of rotation O, and let P be a pattern or lattice. Now we have  nθ θ n RO (P ) = RO (P ) = P, n ∈ Z, n > 0,

since rotation through nθ is the same as applying a rotation through θ n times. If n < 0, then rotating through nθ gives a figure which, if rotated back through (−n)θ, is P , so it must be P as well. This shows that P is rotationally symmetric under any integer multiple of θ. Next, let m be the smallest positive integer such that mθ ≥ 360◦ , and suppose that mθ = 360◦ + ϕ. ◦ 360◦ +ϕ ϕ 360◦ is an integer, so Note that RO (P ) = P , so that RO (P ) = RO (P ). We want to show that 360 θ it suffices to show that ϕ = 0. Now, since m is minimal, we know that mθ − ϕ = 360◦ > (m − 1)θ, so that mθ − ϕ > mθ − θ and thus −ϕ > −θ, so that ϕ < θ. But ◦

360 mθ P = RO (P ) = RO



ϕ (P ) = RO (P ),

and we see that the lattice is rotationally symmetric under a rotation of ψ < θ0 . Since ϕ ≥ 0 and θ is ◦ the smallest positive such angle, it follows that ϕ = 0. Thus θ = 360 m .

526

CHAPTER 6. VECTOR SPACES

4. The smallest positive angle of rotational symmetry in the lattice of Exercise 2 is 60◦ ; this rotational symmetry is also present in the original figure, in Figure 6.14 or 6.15. 5. Some examples are below: n‡1

n‡2

1 -3

-2

1 v u 1

-1

2

3

-3

-2

u 1

-1 -1

n‡4

n‡6

3

-3

-2

u 1

-1

2

3

1 u 1

-1

2

-1

-1

v u 1

v -2

1 v

v

1 -3

n‡3

2

3

-3

-2

-1

-1

2

3

-1

It is impossible to draw a lattice with 8-fold rotational symmetry, as the rest of this section will show. 6. (a) With θ = 60◦ , we have u=

" # 1 0

" ,

v=

1 √2 3 2

# ,

and [Rθ ]E =

" cos θ

− sin θ

sin θ

cos θ

#

" =

1 √2 3 2

"

#





3 2 1 2

# .

(b) From the discussion in part (a), "

1 √2 3 2

"

1 √2 3 2

Rθ (u) =

Rθ (v) = With B = {u, v}, we have   0 [Rθ (u)]B = [v]B = , 1



#" #

3 1 1 2 √2 = = v, 3 1 0 2 2 # " # √ #" 1 − 12 − 23 √2 √ = = −u 3 3 1 2 2 2



  −1 [Rθ (v)]B = [−u + v]B = 1

+ v.



 0 [Rθ ]B = 1

 −1 . 1

7. Since the lattice is invariant under Rθ , u and v are mapped to lattice points under this rotation. But each lattice point is by definition au + bv where a and b are integers. Thus       a b a b [Rθ (u)]B = [au + cv]B = , [Rθ (v)]B = [bu + dv]B = ⇒ [Rθ ]B = , c d c d and the matrix entries are integers. 8. Let E = {e1 , e2 } and B = {u, v}. Then by Theorem 6.29 in Section 6.6, [Rθ ]E = P −1 [Rθ ]B P . But this means that [Rθ ]E ∼ [Rθ ]B (that is, the two matrices are similar). By Exercise 35 in Section 4.4, this implies that 2 cos θ = tr[Rθ ]E = tr[Rθ ]B = a + d, which is an integer.

6.7. APPLICATIONS

527

9. If 2 cos θ = m is an integer, then m must be 0, ±1, or ±2, so the angles are angles whose cosines are 0, ± 21 , or ±1; these angles are θ = 60◦ , 90◦ , 120◦ , 180◦ , 240◦ , 270◦ , 300◦ , 360◦ , corresponding to n = 1, 2, 3, 4, or 6. These are the only possible n-fold rotational symmetries. 10. Left to the reader.

6.7

Applications

1. Here a = −3, so by Theorem 6.32, {e3t } is a basis for solutions, which therefore have the form y(t) = ce3t . Since y(1) = ce3 = 2, we have c = 2e−3 , so that y(t) = 2e−3 e3t = 2e3t−3 . 2. Here a = 1, so by Theorem 6.32, {e−t } is a basis for solutions, which therefore have the form x(t) = ce−t . Since x(1) = ce−1 = 1, we have c = e, so that x(t) = ee−t = e1−t . 3. y 00 − 7y 0 + 12y = 0 corresponds to the characteristic equation λ2 − 7λ + 12 = (λ − 3)(λ − 4) = 0. Then λ1 = 3 and λ2 = 4, so Theorem 6.33 tells us that the general solution has the form y(t) = c1 e3t + c2 e4t . Then 1 − e4 c = 1 y(0) = 1 c1 + c2 = 1 e3 − e4 ⇒ ⇒ 3 4 e c + e c = 1 y(1) = 1 1 2 e3 − 1 c2 = 3 e − e4 so that y(t) =

e3

1 − e4

   1 − e4 e3t + e3 − 1 e4t .

4. x00 +x0 −12x = 0 corresponds to the characteristic equation λ2 +λ−12 = (λ−3)(λ+4) = 0. Then λ1 = 3 and λ2 = −4, so Theorem 6.33 tells us that the general solution has the form x(t) = c1 e3t + c2 e−4t . Then 1 c1 = x(0) = 0 c1 + c2 = 0 7 ⇒ ⇒ 3c1 − 4c2 = 1 1 x0 (0) = 1 c2 = − 7 so that y(t) =

 1 3t e − e−4t . 7 √

5. f 00 − f 0 − f = 0 corresponds to the characteristic equation λ2 − λ − 1 = 0, so that λ1 = 1+2 5 and √ √ √ λ2 = 1−2 5 . Therefore by Theorem 6.33, the general solution is f (t) = c1 e(1+ 5)t/2 + c2 e(1− 5)t/2 . So √

f (0) = 0 f (1) = 1





e(1+

so that f (t) =

5)/2

c1 + c2 = 0 √ c1 + e(1− 5)/2 c2 = 1

√ 5−1)/2 √ e 5−1

e(

c1 =



e(1+

√ 5)t/2

⇒ c2 =



− e(1−

5)t/2



.

e(

5−1)/2 √ e 5−1 √ e( 5−1)/2 − √ e 5−1

528

CHAPTER 6. VECTOR SPACES

6. g 00 − 2g = 0 corresponds to the characteristic equation λ2 − 2√= 0, so that λ1 = √ Therefore by Theorem 6.33, the general solution is g(t) = c1 et 2 + c2 e−t 2 . So g(0) = 1



g(1) = 0



e

2

c1 + c = 1 √ 2 − 2 c1 + e c2 = 0

so that g(t) =

c1 = −

1

√ e2 2

 −1

−et



2

⇒ c2 = √

+ e2

√  2−t 2



√ 2 and λ2 = − 2.

1

√ e2 2 − √ e2 2 √ e2 2 − 1

1

.

7. y 00 − 2y 0 + y = 0 corresponds to the characteristic equation λ2 − 2λ + 1 = (λ − 1)2 = 0, so that λ1 = λ2 = 1. Therefore by Theorem 6.33, the general solution is y(t) = c1 et + c2 tet . So y(0) = 1 y(1) = 1

c1 = 1 ec1 + ec2 = 1



so that y(t) = et +

c1 = 1 1−e c2 = e



1−e t te = et + (1 − e)tet−1 . e

8. x00 + 4x0 + 4x = 0 corresponds to the characteristic equation λ2 + 4λ + 4 = (λ + 2)2 = 0, so that λ1 = λ2 = −2. Therefore by Theorem 6.33, the general solution is x(t) = c1 e−2t + c2 te−2t . So x(0) = 1

c1 = 1 −2c1 + c2 = 1



0

x (0) = 1



c1 = 1 c2 = 3

so that y(t) = e−2t + 3te−2t . 9. y 00 − k 2 y = 0 corresponds to the characteristic equation λ2 − k 2 = 0, so that λ1 = k and λ2 = −k. Therefore by Theorem 6.33, the general solution is y(t) = c1 ekt + c2 e−kt . So y(0) = 1 y 0 (0) = 1

c1 + c2 = 1 kc1 − kc2 = 1



so that y(t) =

k+1 2k k−1 c2 = 2k

c1 = ⇒

 1 (k + 1)ekt + (k − 1)e−kt . 2k

10. y 00 − 2ky 0 + k 2 y = 0 corresponds to the characteristic equation λ2 − 2kλ + k 2 = (λ − k)2 = 0, so that λ1 = λ2 = k. Therefore by Theorem 6.33, the general solution is y(t) = c1 ekt + c2 tekt . So y(0) = 1 y(1) = 0

c1 = 1 c1 + c2 = 0



c1 = 1



c2 = −1

so that y(t) = ekt − te−kt . 11. f 00 − 2f 0 + 5f = 0 corresponds to the characteristic equation λ2 − 2λ + 5 = 0, so that λ1 = 1 + 2i and λ2 = 1 − 2i. Therefore by Theorem 6.33, the general solution is f (t) = c1 et cos 2t + c2 et sin 2t. Then f (0) = 1 π f =0 4



c1

= 1 eπ/4 c2 = 0

so that y(t) = et cos 2t.



c1 = 1 c2 = 0

6.7. APPLICATIONS

529

12. h00 − 4h0 + 5h = 0 corresponds to the characteristic equation λ2 − 4λ + 5 = 0, so that λ1 = 2 + i and λ2 = 2 − i. Therefore by Theorem 6.33, the general solution is h(t) = c1 e2t cos t + c2 e2t sin t. Then h0 (t) = c1 (2e2t cos t − e2t sin t) + c2 (2e2t sin t + e2t cos t), and h(0) = 0 0

h (0) = −1



c1 = 0 2c1 + c2 = −1

c1 = 0



c2 = −1

so that y(t) = −e2t sin t. 13. (a) We start with p(t) = cekt ; then p(0) = ce0k = c = 100, so that c = 100. Next, p(3) = 100e3k = 1600, so that e3k = 16 and k = ln316 . Hence p(t) = 100e(ln 16/3)t = 100 eln 16

t/3

= 100 · 16t/3 .

(b) The population doubles when p(t) = 100e(ln 16/3)t = 200. Simplifying gives ln 16 3 ln 2 3 ln 2 3 t = ln 2, so that t = = = . 3 ln 16 4 ln 2 4 The population doubles in

3 4

of an hour, or 45 minutes.

(c) We want to find t such that p(t) = 100e(ln 16/3)t = 1000000. Simplifying gives ln 16 3 ln 10000 12 ln 10 3 ln 10 t = ln 10000, so that t = = = ≈ 9.966 hours. 3 ln 16 4 ln 2 ln 2 14. (a) Start with p(t) = cekt ; then p(0) = ce0k = c = 76. Next, p(1) = 76e1k = 92, so that k = ln 92 76 , and therefore p(t) = 76et ln(92/76) . For the year 2000, we get p(10) = 76e10 ln(92/76) ≈ 513.5. This is almost twice the actual population in 2000. (b) Start with p(t) = cekt ; then p(0) = ce0k = c = 203. Next, p(1) = 203e1k = 227, so that k = ln 227 203 , t ln(227/203) and therefore p(t) = 203e . For the year 2000, we get p(3) = 203e3 ln(227/203) ≈ 283.8. This is a pretty good approximation. (c) Since the first estimate is very high but the second estimate is close, we can conclude that population growth was lower than exponential in the early 20th century, but has been close to exponential since 1970. This could be attributed for example to three major wars during the period from 1900 to 1970. 15. (a) Let m(t) = ae−ct . Then m(0) = ae−c·0 = a = 50, so that m(t) = 50e−ct . Since the half-life is 1590 years, we know that 25 mg will remain after that time, so that m(1590) = 50e−1590c = 25. Simplifying gives −1590c = ln 21 = − ln 2, so that c = After 1000 years, the amount remaining is

ln 2 1590 .

So finally, m(t) = 50e−t ln 2/1590 .

m(1000) = 50e−1000 ln 2/1590 ≈ 32.33 mg. (b) Only 10 mg will remain when m(t) = 10. Solving 50e−t ln 2/1590 = 10 gives −t

ln 2 1 1590 ln 5 = ln = − ln 5, so that t = ≈ 3692 years. 1590 5 ln 2

530

CHAPTER 6. VECTOR SPACES

16. With m(t) = ae−ct , a half-life of 5730 years means that m(5730) = ae−5730c = a2 , so that e−5730c = ln 2 and thus c = − ln(1/2) 5730 = 5730 . To find out how old the posts are, we want to find t such that

1 2

5730 ln 0.45 ≈ 6600 years. ln 2 √ √ 17. Following Example 6.92, the general equation of the spring motion is x(t) = c1 cos K t + c2 sin K t. Then x(0) = 10 = c1 cos 0 + c2 sin 0 = c1 , so that c1 = 10. Then √ √ √ 5 − 10 cos 10 K √ x(10) = 5 = 10 cos 10 K + c2 sin 10 K , so that c2 = . sin 10 K m(t) = ae−t ln 2/5730 = 0.45a, which gives t = −

Therefore the equation of the spring, giving its length at time t, is √ √ √ 5 − 10 cos 10 K √ sin K t. c2 = 10 cos K t + sin 10 K k and m = 50, we have K = 50 , so that r r √ √ k k t + c2 sin t. x(t) = c1 cos K t + c2 sin K t = c1 cos 50 50

18. From the analysis in Example 6.92, since K =

Since the period of sin bt and cos bt are each q k so that 50 = π5 and thus k = 2π 2 .

k m

2π b ,

the fact that the period is 10 means that √2π

k/50

= 10,

19. The differential equation for the pendulum motion is the same as the equation of spring motion, so we apply the analysis from Example 6.92. Recall that the period of sin bt and cos bt are each 2π b . (a) Since θ00 +

g Lθ

= 0, we have K =

g L

and √

√ θ(t) = c1 cos

K t + c2 sin

r K t = c1 cos

Recall that the period of sin bt and cos bt are each q 2π this case, since L = 1, we get P = 2π g1 = √ g.

2π b .

g t + c2 sin L

r

g t. L

Then the period is P = √2π = 2π gL

q

L g.

In

(b) Since θ1 does not appear in the formula, the period does not depend on θ1 . It depends only on the gravitational constant g and the length of the pendulum L. (Note that it also does not depend on the weight of the pendulum). 20. We want to show that if y1 and y2 are two solutions, and c is a constant, then y1 + y2 and cy1 are also solutions. But this follows directly from properties of the derivative: (y1 + y2 )00 + a(y1 + y2 )0 + b(y1 + y2 ) = (y100 + ay10 + by1 ) + (y200 + ay20 + by2 ) = 0 + 0 = 0 (cy1 )00 + a(cy1 )0 + b(cy1 ) = cy100 + acy10 + bcy1 = c(y100 + ay10 + by1 ) = c · 0 = 0. Thus the solution set is a subspace of F . 21. Since we know that dim S = 2, it suffices to show that eλt and teλt are solutions (are in S) and that they are linearly independent. Note that λ is a solution of λ2 + aλ + b = 0. Then 00 0 eλt + a eλt + beλt = λ2 eλt + aλeλt + beλt = eλt (λ2 + aλ + b) = 0. For teλt , note that 0 teλt = eλt + λteλt , 00 0 teλt = eλt + λteλt = λeλt + λeλt + λ2 teλt = 2λeλt + λ2 teλt .

Chapter Review

531

Then teλt

00

+ a teλt

0

 + bteλt = 2λeλt + λ2 teλt + a eλt + λteλt + bteλt = (2λ + a)eλt + teλt (λ2 + aλ + b) = (2λ + a)eλt . √

2

But the solutions of λ2 + aλ + b are λ = −a± 2a −4b , so if the two roots are equal, we must have a2 − 4b = 0, so that λ = − a2 , so that 2λ = −a and then 2λ + a = 0. So the remaining term vanishes, and teλt is a solution. To see that the two are linearly independent, suppose c1 eλt + c2 teλt is zero for all values of t. Setting t = 0 gives c1 = 0, so that c2 teλt is zero. Then set t = 1 to get c2 = 0. Thus the two are linearly independent, so they form a basis for the solution set S. 22. Suppose that c1 ept cos qt + c2 ept sin qt is identically zero. Setting t = 0 gives c1 = 0, so that c2 ept sin qt is identically zero, which clearly forces c2 = 0, proving linear independence.

Chapter Review 1. (a) False. All we can say is that dim V ≤ n, so that any spanning set for V contains at most n vectors. For example, let V = R, v1 = e1 and v2 = 2e1 . Then V = span(v1 , v2 ), but V has spanning sets consisting of only one vector. (b) True. See Exercise 17(a) in Section 6.2. (c) True. For example,  1 I= 0

 0 , 1

 0 J= 1

 1 , 0



1 L= 0

 1 , 1

 1 R= 1

 1 . 0

Then R − I = E11 , L − I = E12 , J + I − L = E21 , J + I − R = E22 . Therefore this set of four invertible matrices spans; since dim M22 = 4, it is a basis. (d) False. By Exercise 2 in Section 6.5, the kernel of the map tr : M22 → R has dimension 3, so that matrices with trace 0 span a subspace of M22 of dimension 3. (e) False. In general, kx + yk = 6 kxk + kyk. For example, take x = e1 and y = e2 in R2 . (f ) True. If T is one-to-one and {ui } is a basis for V , then {T ui } is a linearly independent set by Theorem 6.22. It spans range(T ), so it is a basis for range(T ). Thus if T is onto, range(T ) = W , so that {T ui } is a basis for W and therefore dim V = dim W . (g) False. This is only true if T is onto. For example, let T : R → R be T (x) = 0. Then ker T = R but the codomain is not {0} (although the range is). (h) True. If nullity(T ) = 4, then rank T = dim M33 − nullity(T ) = 9 − 4 = 5. So range(T ) is a five-dimensional subspace of P4 , so it is the entire space. (i) True. Since dim P3 = 4, it suffices to show that dim V = 4. But polynomials in V are of the form a + bx + cx2 + dx3 + ex4 where a + b + c + d + e = 0, so they are of the form (−b − c − d − e) + bx + cx2 + dx3 + ex4 , where b, c, d, and e are arbitrary. Thus a basis for V is {x, x2 , x3 , x4 }, so dim V = 4. (j) False. See Example 6.80 in Section 6.6. 2. This is the subspace {0}, since the only vector

    x 0 . satisfying x2 + 3y 2 = 0 is y 0

3. The conditions a + b = a + c and a + c = c + d imply  a W = b

that b = c and a = d, so that  b . a

532

CHAPTER 6. VECTOR SPACES (Note that matrices of this form satisfy all of the original equalities.) To show that this is a subspace, we must show that it is closed under sums and scalar multiplication. Thus suppose    0  a b a b0 , ∈ W, c a scalar. b a b0 a0 Then  a b

  0    b a b0 a + a0 b + b0 + 0 = ∈W a b a0 b + b0 a + a0     a b ca cb c = ∈ W. b a cb ca

4. To show that this is a subspace, we must show that it is closed under sums and scalar multiplication. Suppose p(x), q(x) ∈ W . Then            1 1 1 1 1 3 3 3 3 x (p + q) =x p +q =x p +x q = p(x) + q(x) = (p + q)(x), x x x x x so that p + q ∈ W . Next, if c is a scalar, then     1 1 = cx3 p = cp(x) = (cp)(x), x3 (cp) x x so that cp ∈ W . Thus W is a subspace. 5. To show that this is a subspace, we must show that it is closed under sums and scalar multiplication. Suppose f (x), g(x) ∈ W . Then (f + g)(x + π) = f (x + π) + g(x + π) = f (x) + g(x) = (f + g)(x), so that f + g ∈ W . Next, if c is a scalar, then (cf )(x + π) = cf (x + π) = cf (x) = (cf )(x), so that cf ∈ W . Thus W is a subspace. 6. They are linearly dependent. By the double angle formula, cos 2x = 1 − 2 sin2 x, so that 1 − cos 2x − 2 sin2 x = 1 − cos 2x −

 2 3 sin2 x = 0. 3

7. Suppose that cA + dB = O. Taking the transpose of both sides gives cAT + dB T = O; since A is symmetric and B is skew-symmetric, this is just cA − dB = O. Adding these two equations gives 2cA = O. Since A is nonzero, we must have c = 0, so that dB = O. Then since B is nonzero, we must have d = 0. Thus A and B are linearly independent. 8. The condition a + d = b + c means that a = b + c − d, so that      a b b+c−d b W = : a+d=b+c = . c d c d Then define B by finding three matrices in W in each of which exactly other are zero:       1 1 1 0 −1 B= B= , C= , D= 0 0 1 0 0

one of b, c, and d is 1 while the  0 . 1

Consider the expression bB + cC + dD =

 b+c−d c

 b . d

If this is O, then clearly b = c = d = 0, so that B, C, and D are linearly independent. But also, bB + cC + dD is a general matrix in W , so that B, C, and D span W . Thus {B, C, D} is a basis for W.

Chapter Review

533

9. Suppose p(x) = a + bx + cx2 + dx3 + ex4 + f x5 ∈ P5 with p(−x) = p(x). Then p(1) = p(−1) ⇒ a + b + c + d + e + f = a − b + c − d + e − f ⇒ b + d + f = 0 p(2) = p(−2) ⇒ a + 2b + 4c + 8d + 16e + 32f = a − 2b + 4c − 8d + 16e − 32f ⇒ b + 4d + 16f = 0 p(3) = p(−3) ⇒ a + 3b + 9c + 27d + 81e + 243f = a − 3b + 9c − 27d + 81e − 243f ⇒ b + 9d + 81f = 0. These three equations in the unknowns b, d, f have only the trivial solution, so that b = d = f = 0, and p(x) = a + cx2 + dx4 . Any polynomial of this form clearly has p(x) = p(−x), since (−x)2 = x2 and (−x)4 = x4 . So {1, x2 , x4 } is a basis for W . 10. Let B = {1, 1 + x, 1 + x + x2 } = {b1 , b2 , b3 } and C = {1 + x, x + x2 , 1 + x2 } = {c1 , c2 , c3 }. Then       0 −1 1 [c1 ]B = 1 , [c2 ]B =  0 , [c3 ]B = −1 , 0 1 1 so that by Theorem 6.13 PB←C

 0 = 1 0

Then

 −1 1 0 −1 . 1 1 

−1 PC←B = PB←C =

1 2  1 − 2 1 2

1 0 0



1 2 1 . 2 1 2

11. To show that T is linear, we must show that T (x + z) = T x + T z and that T (cx) = cT x, where c is a scalar.  T (x + z) = y(x + z)T y = y xT + zT y = yxT y + yzT y = T x + T z T (cx) = y(cx)T y = cyxT y = cT x. Thus T is linear. 12. T is not a linear transformation. Let A ∈ Mnn and c be a scalar other than 0 or 1. Then T (cA) = (cA)T (cA) = c2 AT A = c2 T (A). Since c 6= 0, 1, it follows that c2 T (A) 6= cT (A), so that T is not linear. 13. To show that T is linear, we must show that (T (p + q))(x) = T p(x) + T q(x) and that (T (cp))(x) = cT p(x), where c is a scalar. (T (p + q))(x) = (p + q)(2x − 1) = p(2x − 1) + q(2x − 1) = T p(x) + T q(x) (T (cp))(x) = (cp)(2x − 1) = cp(2x − 1) = T p(x). Thus T is linear. 14. We first write 5 − 3x + 2x2 in terms of 1, 1 + x, and 1 + x + x2 : 5 − 3x + 2x2 = a(1) + b(1 + x) + c(1 + x + x2 ) = (a + b + c) + (b + c)x + cx2 . Then c = 2, so that (from the x term) b = −5, and then a = 8. Thus T (5 − 3x + 2x2 ) = 8T (1) − 5T (1 + x) + 2T (1 + x + x2 )  1 =8 0

  0 1 −5 1 0

  1 0 +2 1 1

  −1 3 = 0 2

 −7 . 3

534

CHAPTER 6. VECTOR SPACES

15. Since T : Mnn → R is onto, we know from the rank theorem that n2 = dim Mnn = rank(T ) + nullity(T ) = 1 + nullity(T ), so that nullity(T ) = n2 − 1. 16. (a) We want a map T : M22 → M22 such that T (A) = O if and only if A is upper triangular. If   a b A= , c d an obvious map to try is  0 T (A) = c

 0 . 0

Clearly the matrices sent to O by T are exactly the upper triangular matrices (together with O), which is the subspace W . To see that T is linear, note that we can rewrite T as T (A) = a21 E21 . Then T (A + B) = (A + B)21 E21 = (a21 + b21 )E21 = a21 E21 + b21 E21 = T (A) + T (B) T (cA) = (cA)21 E21 = ca21 E21 = cT (A). (b) For this map, we want the image of any matrix A to be upper triangular. With A as above, an obvious map to try is the one that sets the entry below the main diagonal to zero:   a b T (A) = . 0 d It is obvious that range T = W , since any upper triangular matrix is the image of itself under T , and every matrix of the form T (A) is upper triangular by construction. To see that T is linear, write T (A) = A − a21 E21 ; then T (A + B) = (A + B) − (A + B)21 E21 = (A + B) − (a21 + b21 )E21 = (A − a21 E21 ) + (B − b21 E21 ) = T (A) + T (B) T (cA) = (cA) − (cA)21 E21 = cA − ca21 E21 = c(A − a21 E21 ) = cT (A). 17. We have   1 0  ⇒ [T (1)]C =  0 1       1 1 1 0 0 1 T (x) = T ((x + 1) − 1) = T (x + 1) − T (1) = − = 0 1 0 1 0 0   0 1  = 0 · E11 + 1 · E12 + 0 · E21 + 0 · E22 ⇒ [T (x)]C =  0 0    0 −1 1 T (x2 ) = T ((1 + x + x2 ) − (1 + x)) = T (1 + x + x2 ) − T (1 + x) = − 1 0 0   −1 −2  = −1 · E11 − 2 · E12 + 1 · E21 − 1 · E22 ⇒ [T (1)]C =   1 . −1  1 T (1) = 0

 0 = 1 · E11 + 0 · E12 + 0 · E21 + 1 · E22 1

  1 −1 = 1 1

 −2 −1

Chapter Review

535

Since these vectors are the columns of [T ]C←B , we have   1 0 −1 0 1 −2 . [T ]C←B =  0 0 1 1 0 −1 18. We must show that S spans V and that S is linearly independent. First, since every vector can be written as a linear combination of elements of S in one way, S spans V . Next, suppose c1 v1 + c2 v2 + · · · + cn vn = 0. Since we know that we can write 0 as the linear combination of the elements of S with zero coefficients, and since the representation of 0 is unique by hypothesis, it follows that ci = 0 for all i, so that S is linearly independent. Thus S is a basis for V . 19. If u ∈ U , then T u ∈ ker(S), so that (S ◦ T )u = S(T u) = 0. Thus S ◦ T is the zero map. 20. Since {T (v1 ), T (v2 ), . . . , T (vn )} is a basis for V , it follows that range(T ) = V since any element of V can be expressed as a linear combination of elements in range(T ). By Theorem 6.30 (p) and (t), this means that T is invertible.

Chapter 7

Distance and Approximation 7.1

Inner Product Spaces

1. With hu, vi = 2u1 v1 + 3u2 v2 , we get (a) hu, vi = 2 · 1 · 4 + 3 · (−2) · 3 = −10. p p √ (b) kuk = hu, ui = 2 · 1 · 1 + 3 · (−2) · (−2) = 14.

 

−3 p √

(c) d(u, v) = ku − vk =

−5 = 2 · (−3) · (−3) + 3 · (−5) · (−5) = 93.   6 2 2. With A = , we get 2 3      6 2 4 T (a) hu, vi = u Av = 1 −2 = −4. 2 3 3 s    p √   6 2 1 1 −2 = 10. (b) kuk = hu, ui = 2 3 −2

  s   

−3 √   6 2 −3

= −3 −5 (c) d(u, v) = ku − vk = = 189.

−5 2 3 −5   a 3. We want a vector w = such that b hu, wi = 2 · 1 · a + 3 · (−2) · b = 2a − 6b = 0.     3t 3 Any vector of the form is such a vector; for example, . t 1   a 4. We want a vector w = such that b      6 2 a T hu, wi = u aw = 1 −2 = 2a − 4b = 0. 2 3 b   2 There are many such vectors, for example w = . 1 5. (a) (b) (c)

hp(x), q(x)i = 3 − 2x, 1 + x + x2 = 3 · 1 + (−2) · 1 + 0 · 1 = 1. p p √ kp(x)k = hp(x), p(x)i = 3 · 3 + (−2) · (−2) + 0 · 0 = 13. √

p d(p(x), q(x)) = kp(x) − q(x)k = 2 − 3x − x2 = 22 + (−3)2 + (−1)2 = 14. 537

538

CHAPTER 7. DISTANCE AND APPROXIMATION

6. 1

Z

(3 − 2x)(1 + x + x2 ) dx =

Z

1

(−2x3 + x2 + x + 3) dx

(a)

hp(x), q(x)i =

(b)

 1 10 1 4 1 3 1 2 = − x + x + x + 3x = . 2 3 2 3 0 s s  1 r Z 1 p 13 4 3 2 2 (3 − 2x) dx = kp(x)k = hp(x), p(x)i = . 9x − 6x + x = 3 3 0 0 s Z 1

2

(2 − 3x − x2 )2 dx d(p(x), q(x)) = kp(x) − q(x)k = 2 − 3x − x =

0

0



(c)

0

s =

 1 r 1 5 3 4 5 3 41 2 x + x + x − 6x + 4x = . 5 2 3 30 0

7. We want a polynomial r(x) = a + bx + cx2 such that hp(x), r(x)i = 3 · a − 2 · b + 0 · c = 3a − 2b = 0. There are many such polynomials; two examples are 2 − 3x and 4 − 6x + 7x2 . 8. We want a polynomial r(x) = a + bx + cx2 such that Z 1 hp(x), r(x)i = (3 − 2x)(a + bx + cx2 ) dx 0 1

Z

(3a + (3b − 2a)x + (3c − 2b)x2 − 2cx3 ) dx

= 0

 1 3b − 2a 2 3c − 2b 2 c 4 x + x − x = 3ax + 2 3 2 0 5 1 = 2a + b + c = 0. 6 2 

There are many such polynomials. Three examples are −1 + 4x2 , 5 − 12x, and 12x − 20x2 . 9. Recall the double-angle formulas sin2 x =

1 2



1 2

cos 2x and cos2 x =

1 2

+

1 2

cos 2x. Then

(a) Z hf, gi =



Z

0



= 0

R 2π 0

 sin2 x + sin x cos x dx

0 2π

Z

(b) kf k =



f (x)g(x) dx =

f (x)2 dx =

R 2π 0

1 1 − cos 2x + sin x cos x 2 2

sin2 x dx =

R 2π 0

1 2



Z



1 2



 dx =

1 1 1 x − sin 2x + sin2 x 2 4 2

  cos 2x dx = 12 x −

1 4

sin 2x

2π 0

2π = π. 0

= π.

(c) Z d(f, g) = kf − gk =



(− cos x)2 dx =

0

10. We want to find a function g such that 1 2π 2 2 sin x 0 = 0, so choose g(x) = cos x.

0

R 2π 0

cos2 x dx =

Z 0





 1 1 + cos 2x dx 2 2  2π 1 1 = = π. x + sin 2x 2 4 0

sin x · g(x) dx = 0. Note from the previous exercise that

7.1. INNER PRODUCT SPACES

539

11. We must show it satisfies all four properties of an inner product. First, hp(x), q(x)i = p(a)q(a) + p(b)q(b) + p(c)q(c) = q(a)p(a) + q(b)p(b) + q(c)p(c) = hq(x), p(x)i . Next, hp(x), q(x) + r(x)i = p(a)(q(a) + r(a)) + p(b)(q(b) + r(b)) + p(c)(q(c) + r(c)) = p(a)q(a) + p(a)r(a) + p(b)q(b) + p(b)r(b) + p(c)q(c) + p(c)r(c) = p(a)q(a) + p(b)q(b) + p(c)q(c) + p(a)r(a) + p(b)r(b) + p(c)r(c) = hp(x), q(x)i + hp(x), r(x)i . Third, hdp(x), q(x)i = dp(a)q(a) + dp(b)q(b) + dp(c)q(c) = d(p(a)q(a) + p(b)q(b) + p(c)q(c)) = d hp(x), q(x)i . Finally, hp(x), p(x)i = p(a)p(a) + p(b)q(b) + p(c)q(c) = p(a)2 + p(b)2 + p(c)2 ≥ 0, and if equality holds, then p(a) = p(b) = p(c) = 0 since squares are always nonnegative. But p ∈ P2 , so if it is nonzero, it has at most two roots. Since a, b, and c are distinct, it follows that p(x) = 0. 12. We must show it satisfies all four properties of an inner product. First, hp(x), q(x)i = p(0)q(0) + p(1)q(1) + p(2)q(2) = q(0)p(0) + q(1)p(1) + q(2)p(2) = hq(x), p(x)i . Next, hp(x), q(x) + r(x)i = p(0)(q(0) + r(0)) + p(1)(q(1) + r(1)) + p(2)(q(2) + r(2)) = p(0)q(0) + p(0)r(0) + p(1)q(1) + p(1)r(1) + p(2)q(2) + p(2)r(2) = p(0)q(0) + p(1)q(1) + p(2)q(2) + p(0)r(0) + p(1)r(1) + p(2)r(2) = hp(x), q(x)i + hp(x), r(x)i . Third, hcp(x), q(x)i = cp(0)q(0) + cp(1)q(1) + cp(2)q(2) = c(p(0)q(0) + p(1)q(1) + p(2)q(2)) = c hp(x), q(x)i . Finally, hp(x), p(x)i = p(0)p(0) + p(1)q(1) + p(2)q(2) = p(0)2 + p(1)2 + p(2)2 ≥ 0, and if equality holds, then 0, 1, and 2 are all roots of p since squares are always nonnegative. But p ∈ P2 , so if it is nonzero, it has at most two roots. It follows that p(x) = 0.   0 13. The fourth axiom does not hold. For example, let u = . Then hu, ui = 0 but u 6= 0. 1   1 14. The fourth axiom does not hold. For example, let u = . Then hu, ui = 1 · 1 − 1 · 1 = 0, but u 6= 0. 1   1 Alternatively, if u = , then u · u = −3 < 0, so the fourth axiom again fails. 2   1 15. The fourth axiom does not hold. For example, let u = . Then hu, ui = 0 · 1 + 1 · 0 = 0, but u 6= 0. 0   1 Alternatively, if u = , then u · u = −4 < 0, so the fourth axiom again fails. −2 16. The fourth axiom does not hold. For example, if p is any nonzero polynomial with zero constant term, then hp(x), p(x)i = 0 but p(x) is not the zero polynomial.

540

CHAPTER 7. DISTANCE AND APPROXIMATION

17. The fourth axiom does not hold. For example, let p(x) = 1 − x; then p(1) = 0 so that hp(x), p(x)i = 0, but p(x) is not the zero polynomial. 18. The fourth axiom does not hold.  For example E12 E22 = O, so that hE12 , E22 i = det O = 0, but 1 1 E11 6= O. Alternatively, let A = ; then A 6= O but hA, Ai = 0. 1 1 

4 19. Examining Example 7.3, the matrix should be 1  u1

  4 u2 1

   1 v1 = 4u1 + u2 4 v2

 1 ; and indeed 4

   v1 u1 + 4u2 = 4u1 v1 + u2 v1 + u1 v2 + 4u2 v2 , v2

as desired.  1 20. Examining Example 7.3, the matrix should be 2  u1

u2

  1 2

   2 v1 = u1 + 2u2 5 v2

 2 ; and indeed 5

2u1 + 5u2

   v1 = u1 v1 + 2u2 v1 + 2u1 v2 + 5u2 v2 , v2

as desired. 21. The unit circle is 

   1 2 u1 2 u= : u1 + u2 = 1 . u2 4 u2 2

1

-1.0

0.5

-0.5

-1

-2

1.0

u1

7.1. INNER PRODUCT SPACES 22. The unit circle is

541



   u1 2 2 u= : 4u1 + 2u1 u2 + 4u2 = 1 . u2 u2 0.6

0.4

0.2

-0.6

-0.4

0.2

-0.2

0.4

0.6

u1

-0.2

-0.4

-0.6

23. We have, using properties of the inner product, hu, cvi = hcv, ui

(Property 1)

= c hv, ui

(Property 3)

= c hu, vi

(Property 1).

24. Let w be any vector. Then hu, 0i = hu, 0wi which, by Theorem 7.1(b), is equal to 0 hu, wi = 0. Similarly, h0, vi = h0w, vi which, by property 3 of inner products, is equal to 0 hw, vi = 0. 25. Using properties of the inner product and Theorem 7.1 as needed, hu + w, v − wi = hu, v − wi + hw, v − wi = hu, vi + hu, −wi + hw, vi + hw, −wi = hu, vi − hu, wi + hv, wi − hw, wi 2

= 1 − 5 + 0 − kwk = −4 − 4 = −8.

26. Using properties of the inner product and Theorem 7.1 as needed, h2v − w, 3u + 2wi = h2v, 3u + 2wi + h−w, 3u + 2wi = h2v, 3ui + h2v, 2wi + h−w, 3ui + h−w, 2wi = 6 hv, ui + 4 hv, wi − 3 hw, ui − 2 hw, wi 2

= 6 hu, vi + 4 hv, wi − 3 hu, wi − 2 kwk = 6 · 1 + 4 · 0 − 3 · 5 − 2 · 22 = −17. 27. Using properties of the inner product and Theorem 7.1 as needed, p ku + vk = hu + v, u + vi p = hu, ui + hu, vi + hv, ui + hv, vi q 2 2 = kuk + 2 hu, vi + kvk √ √ = 1 + 2 · 1 + 3 = 6.

542

CHAPTER 7. DISTANCE AND APPROXIMATION

28. Using properties of the inner product and Theorem 7.1 as needed, p k2u − 3v + wk = h2u − 3v + w, 2u − 3v + wi p = 4 hu, ui − 12 hu, vi − 6 hv, wi + 4 hu, wi + 9 hv, vi + hw, wi q 2 2 2 = 4 kuk − 12 hu, vi − 6 hv, wi + 4 hu, wi + 9 kvk + kwk r √ 2 = 4 · 12 − 12 · 1 − 6 · 0 + 4 · 5 + 9 · 3 + 22 √ = 43. 29. u+v = w means that u+v−w = 0; this is true if and only if hu + v − w, u + v − wi = 0. Computing this inner product gives hu + v − w, u + v − wi = hu, ui + 2 hu, vi − 2 hu, wi − 2 hv, wi + hv, vi + h−w, −wi 2

2

2

= kuk + 2 · 1 − 2 · 5 − 2 · 0 + kvk + kwk = 1 + 2 − 10 + 3 + 2 = 0. 2

30. Since kuk = 1, we also have 1 = kuk = hu, ui. Similarly, 1 = hv, vi. But then 0 ≤ hu + v, u + vi = hu, ui + 2 hu, vi + hv, vi = 2 + 2 hu, vi . Solving this inequality gives 2 hu, vi ≥ −2, so that hu, vi ≥ −1. 31. We have hu + v, u − vi = hu, ui + hu, vi − hv, ui − hv, vi . Since inner products are commutative, the middle two terms cancel, leaving us with 2

2

hu, ui − hv, vi = kuk − kvk . 32. We have 2

ku + vk = hu + v, u + vi = hu, ui + hu, vi + hv, ui + hv, vi . Since inner products are commutative, the middle two terms are the same, and we get 2

2

hu, ui + 2 hu, vi + hv, vi = kuk + 2 hu, vi + kvk . 33. From Exercise 32, we have 2

2

2

ku + vk = kuk + 2 hu, vi + kvk .

(1)

Substituting −v for v gives another identity, 2

2

2

2

2

ku − vk = kuk + 2 hu, −vi + k−vk = kuk − 2 hu, vi + kvk . Adding these two gives 2

2

2

2

ku + vk + ku − vk = 2 kuk + 2 kvk , and dividing through by 2 gives the required identity. 34. Solve (1) and (2) in the previous exercise for hu, vi, giving  1 2 2 2 ku + vk − kuk − kvk 2  1 2 2 2 hu, vi = − ku − vk + kuk + kvk . 2 hu, vi =

(2)

7.1. INNER PRODUCT SPACES

543

Add these two equations, giving 2 hu, vi =

 1 2 2 ku + vk − ku − vk ; 2

dividing through by 2 gives hu, vi =

 1 1 1 2 2 2 2 ku + vk − ku − vk = ku + vk − ku − vk . 4 4 4

35. Use Exercise 34. First, if ku + vk = ku − vk, then Exercise 34 implies that hu, vi = 0. Second, suppose that u and v are orthogonal. Then again from Exercise 34, 0=

1 1 1 2 2 ku + vk − ku − vk = (ku + vk + ku − vk) (ku + vk − ku − vk) , 4 4 4

so that one of the two factors must be zero. If the first factor is zero, then both norms are zero, so they are equal. If the second factor is zero, then we have ku + vk − ku − vk = 0 and again the norms are equal. 36. From (2) in Exercise 33, we have d(u, v) = ku − vk = This is equal to

q

2

q

2

2

kuk − 2 hu, vi + kvk .

2

kuk + kvk if and only if hu, vi = 0, which is to say when u and v are orthogonal.

37. The inner product we are working with is hu, vi = 2u1 v1 + 3u2 v2 , so that   1 v1 = 0           2·1·1+3·1·0 1 1 1 1 0 v2 = − = − = . 1 1 0 1 2·1·1+3·0·0 0     1 0 The orthogonal basis is , . 0 1 38. The inner product we are working with is  4 hu, vi = uT Av = uT −2

 −2 = 4u1 v1 − 2u1 v2 − 2u2 v1 + 7u2 v2 . 7

Then   1 0 " # " # " # " # " # 1 1 1 4·1·1−2·1·1−2·1·0+7·0·1 1 1 1 v2 = − = − = 2 . 4·1·1−2·1·0−2·1·0+7·0·0 0 2 0 1 1 1 v1 =



39. The inner product we are working with is a0 + a1 x + a2 x2 , b0 + b1 x + b2 x2 = a0 b0 + a1 b1 + a2 b2 . Then v1 = 1 1 · (1 + x) 1 = (1 + x) − 1 = x 1·1 1 · (1 + x + x2 ) x · (1 + x + x2 ) v3 = (1 + x + x2 ) − 1− x = (1 + x + x2 ) − 1 − x = x2 . 1·1 x·x v2 = (1 + x) −

The orthogonal basis is {1, x, x2 }.

544

CHAPTER 7. DISTANCE AND APPROXIMATION

40. The inner product we are working with is hp(x), q(x)i =

R1 0

p(x)q(x) dx. Then

v1 = 1 R1

(1 + x) dx 3 1 1 = (1 + x) − 1 = − + x R1 2 2 1 dx 0    − 12 + x · (1 + x + x2 ) 1 · (1 + x + x2 ) 1   − +x 1− 1 1 1·1 2 −2 + x · −2 + x  R1 R1   2 (1 + x + x ) dx − 21 + x (1 + x + x2 ) dx 1 0 0 1− R1 − +x   R1 2 1 dx − 12 + x · − 12 + x 0 0   1/6 1 11 1− − +x 6 1/12 2   11 1 1−2 − +x 6 2

1 · (1 + x) 1 = (1 + x) − v2 = (1 + x) − 1·1 v3 = (1 + x + x2 ) − = (1 + x + x2 ) − = (1 + x + x2 ) − = (1 + x + x2 ) − =

0

1 − x + x2 . 6

41. The inner product here is hf, gi =

R1 −1

f (x)g(x) dx.

(a) To normalize the first three Legendre polynomials, we find the norm of each:

sZ

sZ

1

1

1 · 1 dx =

k1k = −1

sZ

1 dx =



2

−1

sZ

s

r 1 1 3 2 kxk = x · x dx = dx = x = 3 3 −1 −1 −1 sZ 

sZ 1     1

2 1 1 1 2 2 1 2 2 4

x − = x − x − dx = x − x + dx

3 3 3 3 9 −1 −1 s √ 1 1 5 2 3 1 2 2 = x − x + x = √ . 5 9 9 −1 3 5 1

1

x2

So the normalized Legendre polynomials are

1 1 =√ , k1k 2

√ x 6 = x, kxk 2

√  √  x2 − 13 3 5 1 5 2

√ √ (3x2 − 1). = x − =

x2 − 1 3 2 2 2 2 3

7.1. INNER PRODUCT SPACES

545

(b) For the computation, we will use the unnormalized Legendre polynomials given in Example 3.8, 1, x, and x2 − 31 , and normalize the fourth one at the end. With x4 = x3 , we have x4 · v1 x4 · v2 x4 · v3 v4 = x4 − v1 − v2 − v3 v1 · v1 v2 · v2 v3 · v3 R1 3 R 1 3 2 1 R1 3   x · x dx x x − 3 dx x · 1 dx 1 −1 −1 −1 3 2 1− R1 x− R1 = x − R1 x −   3 1 · 1 dx x · x dx x2 − 31 x2 − 13 dx −1 −1 −1  R1 4 R1 R1 3   x dx x5 − 31 x3 dx x dx 1 −1 −1 1 − x − x2 − = x3 − R−1  R R 1 1 1 1 2 2 2 4 3 1 dx x dx x − 3 x + 9 dx −1 −1 −1  1 5 1 1 6   1 4 1 1   1 4 1 4 x −1 5 x −1 6 x − 12 x −1 1−  x − x2 − = x3 −   1 1 1 1 2 1 2 3 x3 x5 − x3 + x 3

−1

5

9

9

−1

2/5 = x3 − 0 − x−0 2/3 3 = x3 − x. 5 The norm of this polynomial is sZ  sZ     1 1 3 9 3 6 x3 − x dx = kv4 k = x3 − x x6 − x4 + x2 dx 5 5 5 25 −1 −1 s √ 1 1 7 6 5 3 3 2 14 x − x + x . = = 7 25 25 35 −1 Thus the normalized Legendre polynomial is   √ 3 14 35 3 √ x − x = (5x3 − 3x). 5 4 2 14 42. (a) Starting with `0 (x) = 1, `1 (x) = x, `2 (x) = x2 − 31 , `3 (x) = x3 − 53 x, we have `0 (1) = 1



L0 (x) = 1

`1 (1) = 1



L1 (x) = x



L2 (x) =

1 = 3 3 `3 (1) = 1 − − 5

`2 (1) = 1 −

2 3 2 5



 3 x2 − 2  5 L3 (x) = x3 − 2

 1 3 1 = x2 − 3 2 2  5 3 3 3 x = x − x. 5 2 2

(b) For L2 and L3 , the given recurrence says 2·2−1 2−1 xL1 (x) − L0 (x) = 2 2 2·3−1 3−1 L3 (x) = xL2 (x) − L1 (x) = 3 3 L2 (x) =

3 2 1 x − 2  2  5 3 2 1 2 5 3 x x − − x = x3 − x, 3 2 2 3 2 2

so that the recurrence indeed works for n = 2 and n = 3. Then     2·4−1 4−1 7 5 3 3 3 3 2 1 L4 (x) = xL3 (x) − L2 (x) = x x − x − x − 4 4 4 2 2 4 2 2 35 4 15 2 3 = x − x + 8 4 8     2·5−1 5−1 9 35 4 15 2 3 4 5 3 3 L5 (x) = xL4 (x) − L3 (x) = x x − x + − x − x 5 5 5 8 4 8 5 2 2 63 5 35 3 15 = x − x + x. 8 4 8

546

CHAPTER 7. DISTANCE AND APPROXIMATION

43. Let B = {ui } be an orthogonal basis for W . Then hprojW (v), uj i =

X

hui , vi ui , uj hui , ui i



 =

huj , vi uj , uj huj , uj i

 =

huj , vi huj , uj i = huj , vi . huj , uj i

Then hperpW (v), uj i = hv − projW (v), uj i = hv, uj i − hprojW (v), uj i = hv, uj i − huj , vi = 0. Thus perpW (v) is orthogonal to every vector in B, so that perpW (v) is orthogonal to every w ∈ W . 44. (a) We have 2

2

htu + v, tu + vi = t2 hu, ui + 2t hu, vi + hv, vi = kuk t2 + 2 hu, vi t + kvk ≥ 0. 2

2

This is a quadratic at2 + bt + c in t, where a = kuk , b = 2 hu, vi, and c = kvk . 2

(b) Since a = kuk > 0, this is an upward-opening parabola, so its vertex is its lowest point. That b vertex is at − 2a , so the lowest point, which must be nonnegative, is 2    b ab2 b2 − 2b2 + 4ac 4ac − b2 b b2 +b − +c= 2 − +c= = ≥ 0. a − 2a 2a 4a 2a 4a 4a Since 4a > 0, we must have 4ac ≥ b2 , so that



ac ≥

1 2

|b|.

(c) Substituting back in the values of a, b, and c from part (a) gives √

q ac =

2

2

kuk kvk = kuk kvk



1 1 |b| = |2 hu, vi| = |hu, vi| . 2 2

Thus kuk kvk ≥ |hu, vi|, which is the Cauchy-Schwarz inequality.

Exploration: Vectors and Matrices with Complex Entries 2

1. Recall that if z ∈ C, then zz = |z| . Then for v as given, we have kvk =



v·v =



v 1 v1 + v 2 v2 + · · · + v n vn =

q

2

2

|v1 | + · · · + |vn | .

2 Finally, we must show that |z| = z 2 . But 2 2 |z| = (zz) = zzz 2 = z 2 z 2 = z 2 , so the above is in fact equal to

p |v12 | + · · · + |vn2 |.

2. (a) u · v = i · (2 − 3i) + 1 · (1 + 5i) = −i(2 − 3i) + (1 + 5i) = −2 + 3i. p p √ (b) By Exercise 1, kuk = |i2 | + |12 | = |−1| + |1| = 2. (c) By Exercise 1, kvk =

p

|(2 − 3i)2 | + |(1 + 5i)2 | =

p

(2 + 3i)(2 − 3i) + (1 − 5i)(1 + 5i) =



13 + 26 =



39.

  p

−2 + 4i q

= |−2 + 4i|2 + |−5i|2 = (−2 − 4i)(−2 + 4i) + (5i)(−5i) = (d) d(u, v) = ku − vk =

−5i √ 3 5.

Exploration: Vectors and Matrices with Complex Entries (e) If w =

547

  a is orthogonal to u, then b u · w = i · a + 1 · b = −ai + b = 0,

so that b = ai. Thus for example setting a = 1, we get   1 . i   a (f ) If w = is orthogonal to v, then b u · w = 2 − 3i · a + 1 + 5i · b = (2 + 3i)a + (1 − 5i)b = 0. 2+3i So b = − 1−5i a; setting for example a = 1 − 5i, we get b = −2 − 3i, and one possible vector is



 1 − 5i −2 − 3i

3. (a) v · u = v 1 u1 + · · · + v n un = v1 u1 + · · · + vn un = v1 u1 + · · · + vn un = u1 v1 + · · · + un vn = u · v. (b) u · (v + w) = u1 (v1 + w1 ) + u2 (v2 + w2 ) + · · · un (vn + wn ) = u1 v1 + u1 w1 + u2 v2 + u2 w2 + · · · + un vn + un wn = (u1 v1 + u2 v2 + · · · + un vn ) + (u1 w1 + u2 w2 + · · · + un wn ) = u · v + u · w. (c) First, (cu) · v = cu1 v1 + · · · + cun vn = cu1 v1 + · · · + cun vn = c(u1 v1 + · · · + nn vn ) = cu · v. For the second equality, we have, using part (a), u · (cv) = (cu) · v = cu · v = cu · v = cv · u. 2 (d) By Exercise 1, u · u = kuk = u21 + · + u2n ≥ 0. Further, equality holds if and only if each 2 u = |ui |2 = 0, so all the ui must be zero and therefore u = 0. i 4. Just apply the definition: conjugate A and take its transpose:   T −i i ∗ (a) A = A = . −2i 3   T 2 5 − 2i ∗ (b) A = A = . 5 + 2i −1   2+i 4 T 0 . (c) A∗ = A = 1 − 3i −2 3 + 4i   −3i 1 + i 1 − i T 4 0 . (d) A∗ = A =  0 1 − i −i i

548

CHAPTER 7. DISTANCE AND APPROXIMATION

5. These facts follow directly from standard properties of complex numbers together with the fact that matrix conjugation is element by element:   (a) A = aij = [aij ] = A.       (b) A + B = aij + bij = aij + bij = [aij ] + bij = A + B. (c) cA = [caij ] = [c · aij ] = c [aij ] = cA. P (d) Let C = AB; then cij = k aik bkj , so that " # " # X X   AB ij = C ij = aik bkj aik bik = = A · B ij , k

ij

k

ij

so that AB = A · B. T (e) The ij entry of A is aji . The ij entry of AT is aji , so its conjugate, which is the ij entry of AT , is aji . So the two matrices have the same entries and thus are equal. 6. These facts follow directly from the facts in Exercise 5.  ∗  T  T  T T ∗ ∗ = A = AT = A. (a) (A ) = AT = AT (b) (A + B)∗ = (A + B)T = AT + B T = A∗ + B ∗ . (c) (cA)∗ = (cA)T = cAT = cAT = cA∗ . (d) (AB)∗ = (AB)T = B T AT = B T · AT = B ∗ A∗ .   v1    P  v2  7. We have u · v = ui vi = u1 u2 · · · un  .  = uT v = u∗ v.  ..  vn 8. If A is Hermitian, then aij = aji for all i and j. In particular, when i = j, this gives aii = aii , so that aii is equal to its own conjugate and therefore must be real. 9. (a) A is not Hermitian since it does not have real diagonal entries. (b) A is not Hermitian since a12 = 2 − 3i and a21 = 2 − 3i = 2 + 3i; the two are not equal. (c) A is not Hermitian since a12 = 1 − 5i and a21 = −1 − 5i; the two are not equal. (d) A is Hermitian, since every entry is the conjugate of the same entry in its transpose. (e) A is not Hermitian. Since it is a real matrix, it is equal to its conjugate, but AT 6= A since A is not symmetric. (f ) A is Hermitian. Since it is a real matrix, it is equal to its conjugate, and it is symmetric, so that T A = AT = A. 10. Suppose that λ is an eigenvalue of a Hermitian matrix A, with corresponding eigenvector v. Now, v∗ A = v∗ A∗ = (Av)∗ = (λv)∗ since A is Hermitian. But by Exercise 6(c), (λv)∗ = λv∗ . Then  λ(v∗ v) = v∗ (λv) = v∗ (Av) = (v∗ A)v = λv∗ v = λ(v∗ v). 2

Then since v 6= 0, we have by Exercise 7 v∗ v = v · v = kvk 6= 0. So we can divide the above equation through by v∗ v to get λ = λ, so that λ is real.

Exploration: Vectors and Matrices with Complex Entries

549

11. Following the hint, adapt the proof of Theorem 5.19. Suppose that λ1 6= λ2 are eigenvalues of a Hermitian matrix A corresponding to eigenvectors v1 and v2 respectively. Then λ1 (v1 · v2 ) = (λ1 v1 ) · v2 = (Av1 ) · v2 = (Av1 )∗ v2 = (v1∗ A∗ )v2 . But A is Hermitian, so A∗ = A and we continue: (v1∗ A∗ )v2 = (v1∗ A)v2 = v1∗ (Av2 ) = v1∗ (λ2 v2 ) = λ2 (v1∗ v2 ) = λ2 (v1 · v2 ). Thus λ1 (v1 · v2 ) = λ2 (v1 · v2 ), so that (λ1 − λ2 )(v1 · v2 ) = 0. But λ1 6= λ2 by assumption, so we must have v1 · v2 = 0. " # − √i2 − √i2 ∗ −1 ∗ 12. (a) Multiplying, we get AA = I, so that A is unitary and A = A = . √i √i − 2 2   4 0 (b) Multiplying, we get AA∗ = 6= I, so that A is not unitary. 0 4 # " 3 − 4i ∗ −1 ∗ 5 5 . (c) Multiplying, we get AA = I, so that A is unitary and A = A = − 45 − 3i 5   1−i √ √ 0 −1+i 6 3   (d) Multiplying, we get AA∗ = I, so that A is unitary and A−1 = A∗ =  1 0   0 . 2 1 √ √ 0 6 3 13. First show (a) ⇔ (b). Let ui denote the ith column of a matrix U . Then the ith row of U ∗ is ui . Then by the definition of matrix multiplication, (U ∗ U )ij = ui · uj . Then the columns of U form an orthonormal set if and only if ( 0 i 6= j ui · uj = , 1 i=j which holds if and only if ( ∗

(U U )ij =

0 1

i 6= j , i=j

which is to say that U ∗ U = I. Thus the columns of U are orthogonal if and only if U is unitary. Now apply this to the matrix U ∗ . We get that U ∗ is unitary if and only if the columns of U ∗ are orthogonal. But U ∗ is unitary if and only if U is, since (U ∗ )∗ U ∗ = U U ∗ , and the columns of U ∗ are just the conjugates of the rows of U , so that (letting Ui be the ith row of U) ∗ Ui · Uj = Ui Uj = (Ui )T Uj = Uj · Ui . So the columns of U ∗ are orthogonal if and only if the rows of U are orthogonal. This proves (a) ⇔ (c). Next, we show (a) ⇒ (e) ⇒ (d) ⇒ (e) ⇒ (a). To prove (a) ⇒ (e), assume that U is unitary. Then U ∗ U = I, so that ∗ U x · U y = (U x) U y = x∗ U ∗ U y = x∗ y = x · y.

550

CHAPTER 7. DISTANCE AND APPROXIMATION Next, for (e) ⇒ (d), assume that (e) holds. Then Qx · Qx = x · x, so that kQxk = √ x · x = kxk. Next, we show (d) ⇒ (e). Let ui denote the ith column of U . Then



Qx · Qx =

 1 2 2 kx + yk − kx − yk 4  1 2 2 = kU (x + y)k − kU (x − y)k 4  1 2 2 kU x + U yk − kU x − U yk = 4 = U x · U y.

x·y =

Thus (d) ⇒ (e). Finally, to see that (e) ⇒ (a), let ei be the ith standard basis vector. Then ui = U ei , so that since (e) holds, ( 0 i 6= j ui · uj = U ei · U ej = ei · ej = 1 i = j. This shows that the columns of U form an orthonormal set, so that U is unitary. 14. (a) Since "

# " #   − √i2 i i i i −√ ·√ =0 · = −√ · −√ i √ 2 2 2 2 2 " # " # √i √i 2 · 2 = − √i · √i − √i · √i = 1 i i √ √ 2 2 2 2 2 2 # " # "   − √i2 − √i2 i i i i √ √ · = · − − √ · √ = 1, i i √ √ 2 2 2 2 √i 2 √i 2

2

2

the columns of the matrix are orthonormal, so the matrix is unitary. (b) The dot product of the first column with itself is 1 + i(1 + i) + 1 − i(1 − i) = (1 − i)(1 + i) + (1 + i)(1 − i) = 4 6= 1, so that the matrix is not unitary. (c) Since " # " 3 5 4i 5

·

− 45

#

3i 5

  3 4 4i 3i = · − − · =0 5 5 5 5

" # " # 3 5 4i 5

3 5 4i 5

3 3 4i 4i · − · =1 5 5 5 5 " # " #   − 54 − 45 4 3i 3i 4 · = − · − − · = 1, 3i 3i 5 5 5 5 5 5 ·

=

the columns of the matrix are orthonormal, so the matrix is unitary.

Exploration: Vectors and Matrices with Complex Entries

551

(d) Clearly column 2 is a unit vector, and is orthogonal to each of the other two columns, so we need only show the first and third columns are unit vectors and are orthogonal. But  1+i   2  √



6 6 −1 + i 1 2 − 2i −1 + i     1−i 2 ·√ = + =0  0 · 0 = √ · √ +0+ √ 6 3 6 6 3 3 −1−i 1 √



1+i √ 6



3

 

3

1+i √ 6



−1 + i −1 − i 2 2     1−i 1+i · √ = + =1  0 · 0 = √ · √ +0+ √ 6 3 6 6 3 3 −1−i −1−i √



3



√2 6

 

3

√2 6



2 1 1 4 1 2      0  ·  0  = √ · √ + 0 + √ · √ = + = 1, 6 3 6 6 3 3 1 1 √

3



3

and we are done. 15. (a) With A as given, the characteristic polynomial is (λ − 2)2 − i · (−i) = λ2 − 4λ + 3 = (λ − 1)(λ − 3). Thus the eigenvalues are λ1 = 1 and λ2 = 3. Corresponding eigenvectors are     1 1+i λ=1: , λ=3: . i 1−i Since these eigenvectors correspond to different eigenvalues, they are already orthogonal. Their norms are

 

  √

1 p

1+i q 2 2

= 1 · 1 + (−i) · i = 2,

i

1 − i = |1 + i| + |1 − i| = 2. So defining a matrix U whose columns are the normalized eigenvectors gives # " U=

√1 2 √i 2

1+i 2 1−i 2

,

which is unitary. Then U ∗ AU =

 1 0

 0 . 3

(b) With A as given, the characteristic polynomial is λ2 + 1. Thus the eigenvalues are λ1 = i and λ2 = −i. Corresponding eigenvectors are     i −i λ=1: , λ=3: . 1 1 Since these eigenvectors correspond to different norms are

  q √

i 2 2

1 = |i| + |1| = 2,

eigenvalues, they are already orthogonal. Their

  q √

−i 2 2

1 = |−i| + |1| = 2.

So defining a matrix U whose columns are the normalized eigenvectors gives " # √i − √i2 2 U= 1 , 1 √



2

2

which is unitary. Then U ∗ AU =



i 0

 0 . −i

552

CHAPTER 7. DISTANCE AND APPROXIMATION (c) With A as given, the characteristic polynomial is (λ+1)λ−(1+i)(1−i) = λ2 +λ−2 = (λ+2)(λ−1). Thus the eigenvalues are λ1 = 1 and λ2 = −2. Corresponding eigenvectors are     1+i 1+i λ=1: , λ=3: . 2 −1 Since these eigenvectors correspond to different eigenvalues, they are already orthogonal. Their norms are



   √ √

1+i q

1+i q 2 2

= |1 + i|2 + |2|2 = 6,

−1 = |1 + i| + |1| = 3.

2 So defining a matrix U whose columns are the normalized eigenvectors gives # " U=

1+i √ 6 √2 6

1+i √ 3 − √13

,

which is unitary. Then  1 U AU = 0 ∗

 0 . −2

(d) With A as given, the characteristic polynomial is  (λ − 1) ((λ − 2)(λ − 3) − (1 − i)(1 + i)) = (λ − 1) λ2 − 5λ + 4 = (λ − 1)2 (λ − 4) Thus the eigenvalues are λ1 = 1 and λ2 = 4. Corresponding eigenvectors are       0 1 0 λ = 4 : 1 − i . λ = 1 : −1 + i , 0 , 1 2 0 The eigenvectors corresponding to different eigenvalues are already orthogonal, and the two eigenvectors corresponding to λ = 1 are also orthogonal, so these vectors form an orthogonal set. Their norms are

 

 

q

1 0 √



−1 + i = |0|2 + |−1 + i|2 + |1|2 = 3,

0 = 1,





0 1

 

0 q √

1 − i = |0|2 + |1 − i|2 + |2|2 = 6.

2 So defining a matrix U whose columns are the normalized eigenvectors gives   0 1 0   −1+i √ √ , 0 1−i U =  3 6 √1 0 √26 3 which is unitary. Then  1 U ∗ AU = 0 0

0 1 0

 0 0 . 4

16. If A is Hermitian, then A∗ = A, so that A∗ A = AA = AA∗ . Next, if U is unitary, then U U ∗ = I. But if U is unitary, then its rows are orthonormal. Since the columns of U ∗ are the conjugates of the rows of U , they are orthonormal as well and therefore U ∗ is also unitary. Hence U U ∗ = U ∗ U = I and therefore U is normal. Finally, if A is skew-Hermitian, then A∗ = −A; then A∗ A = −AA = A(−A) = AA∗ . So all three of these types of matrices are normal. 17. By the result quoted just before Exercise 16, if A is unitarily diagonalizable, then A∗ A = AA∗ ; this is the definition of A being normal.

Exploration: Geometric Inequalities and Optimization Problems

553

Exploration: Geometric Inequalities and Optimization Problems 1. We have √ √ √ √ √ x· y+ y· x=2 x y q √ 2 √ 2 √ kuk = kvk = x + ( y) = x + y, |u · v| =



√ √ √ √ so that 2 x y ≤ x + y · x + y = x + y, and thus √

xy ≤

x+y . 2

√ √ √ √ 2 2. (a) Since x − y is a square, it is nonnegative. Expanding then gives x − 2 x y + y ≥ 0, so √ that 2 xy ≤ x + y and thus x+y √ xy ≤ . 2 (b) Let AD = a and BD = b. Then by the Pythagorean Theorem, CD2 = a2 − x2 = b2 − y 2 , √ so that 2CD2 = (a2 + b2 ) − x2 − y 2 = (x + y)2 − (x2 + y 2 ) = 2xy, so that CD = xy. But CD is clearly at most equal to a radius of the circle, since it extends from a diameter to the circle. Since the radius is x+y 2 , we again get √

xy ≤

x+y . 2

√ 3. Since the area is xy = 100, we have 10 = xy ≤ x+y 2 , so that 40 ≤ 2(x + y), which is the perimeter. By the AMGM, equality holds when x = y; that is, when x = y = 10, so the square of side 10 is the rectangle with smallest perimeter. 4. Let

1 x

play the role of y in the AMGM. Then r

Equality holds if and only if x = for x > 0.

1 x

√ 1 , or 2 1 = 2 ≤ x + . x



x+ 1 ≤ x 2

1 x,

i.e., when x = 1. Thus 1 +

1 1

= 2 is the minimum value for f (x)

5. If the dimensions of the cut-out square are x × x, then the base of the box will be (10 − 2x) × (10 − 2x), so its volume will be x(10 − 2x)2 . Then √ 3

V ≤

x + (10 − 2x) + (10 − 2x) 20 − 3x = , 3 3

with equality holding if x = 10 − 2x. Thus x = 10 10 possible volume is 10 3 × 3 × 3 .

10 3 ,

so that 10 − 2x =

10 3

as well. Then the largest

6. We have p (x + y) + (y + z) + (z + x) 2 3 f= = (x + y + z), 3 3 with equality holding if x + y = y + z = z + x. But this implies that x = y = z; since xyz = 1, we must have x = y = z = 1, so that f (x, y, z) = (1 + 1)(1 + 1)(1 + 1) = 8. This is the minimum value of f subject to the constraint xyz = 1.

554

CHAPTER 7. DISTANCE AND APPROXIMATION

7. Use the substitution u = x − y. Then the original expression becomes u+y+

8 . uy

Using the AMGM for three variables, we get r 3

u·y·

u+y+ 8 ≤ uy 3

8 uy

√ 8 3 , or 3 8 = 6 ≤ u + y + , uy

8 with equality holding if and only if u = y = uy ; this implies that u = y = 2, so that x = u + y = 4. So 8 the minimum values is 4 + 2·2 = 6, which occurs when x = 4 and y = 2.   x √ 8. Follow the method of Example 7.11. x2 + 2y 2 + z 2 is the square of the norm of v =  2 y , so that z   1 √ x + 2y + 4z is the dot product of that vector with u =  2. Then the component-wise form of the 4 Cauchy-Schwarz Inequality gives   √ 2 √ √ 2 2 2 2 (1·x+ 2· 2 y +4·z) = (x+2y +4z) ≤ 1 + 2 + 4 (x2 +2y 2 +z 2 ) = 19(x2 +2y 2 +z 2 ) = 19.

So x + 2y + 4z ≤



19, and the maximum value is



19. 

 x 9. Follow the method of Example 7.11. Here f (x, y, z) is the square of the norm of v =  y , so set √z 2





1 u = √1 . Then the component-wise form of the Cauchy-Schwarz Inequality gives 2   2   x 1 √ 2   1 2 2 2  1  ·  y  = (x + y + z)2 ≤ 12 + 12 + z . 2 x + y + √ 2 √z 2 2 

Since x + y + z = 10, we get 

 1 2 1 100le4 x + y + z , or 25 ≤ x2 + y 2 + z 2 . 2 2 2

2

Thus the minimum value of the function is 25.     1 sin θ 2 10. Follow the method of Example 7.11 with u = and v = . (This works because kvk = 1 cos θ sin2 θ + cos2 θ = 1.) Then the component-wise form of the Cauchy-Schwarz Inequality gives (u · v)2 = (sin θ + cos θ)2 ≤ (12 + 12 )(sin2 θ + cos2 θ) = 2. √ √ Therefore sin θ + cos θ ≤ 2, so the maximum value of the expression is 2.     x 1 11. We want to minimize x + y subject to the constraint x + 2y = 5. Thus set v = and u = . y 2 Then the component-wise form of the Cauchy-Schwarz Inequality gives 2

2

(x + 2y)2 ≤ (12 + 22 )(x2 + y 2 ).

Exploration: Geometric Inequalities and Optimization Problems

555

Since x + 2y = 5, we get 25 ≤ 5(x2 + y 2 ), or 5 ≤ x2 + y 2 . Thus the minimum value is 5, which occurs at equality. To find the point where equality occurs, substitution 5 − 2y for x to get 5 = (5 − 2y)2 + y 2 = 25 − 20y + 5y 2 , so that y 2 − 4y + 4 = 0. This has the solution y = 2; then x = 5 − 2y = 1. So the point on x + 2y = 5 closest to the origin is (1, 2). √ 2 , apply the AMGM to x0 = x1 and y 0 = y1 , giving 12. To show that xy ≥ 1/x+1/y p x0 + y 0 x0 y 0 ≤ , or 2

r

2 1 2 √ ≥ 0 , or xy ≥ . x0 y 0 x + y0 1/x + 1/y

Since this inequality results from the AMGM, we have equality when x0 = y 0 , so when x = y. For the first inequality, note first that (x − y)2 = x2 − 2xy + y 2 ≥ 0, so that x2 + y 2 ≥ 2xy. Then r  2 x2 + y 2 + (x2 + y 2 ) x2 + 2xy + y 2 x+y x2 + y 2 x+y x2 + y 2 = ≥ = ≥ . , so that 2 4 4 2 2 2 Equality occurs when x2 + y 2 = 2xy, which is the same as (x − y)2 = 0, so that x = y. 13. The area of the rectangle shown is 2xy, where x2 + y 2 = r2 . From Exercise 12, we have r p √ √ p p x2 + y 2 √ ≥ xy ⇒ x2 + y 2 ≥ 2xy ⇒ r2 = r ≥ 2xy = A ⇒ A ≤ r2 . 2 The maximum occurs at equality, and the maximum area is A = r2 . By Exercise 12, this happens when x = y. 14. Following the hint, we have (x + y)2 = (x + y) xy



1 1 + x y

 .

So from Exercise 12, we have 2 x+y ≥ ⇒ (x + y) 2 1/x + 1/y



1 1 + x y

 ≥ 4.

Thus the minimum value is 4; this minimum occurs when x = y, so when   1 4x 1 + = = 4, (x + x) x x x or x = y = 1. 15. Squaring the inequalities in Exercise 12 gives x2 + y 2 ≥ 2 Substituting x +

1 x

for x and y + x+

1 y

 1 2 x



x+y 2

2

 ≥

2 1/x + 1/y

2 .

for y gives from the first inequality  2 + y + y1 2

 ≥

x+

1 x



  2 + y + y1  . 2

556

CHAPTER 7. DISTANCE AND APPROXIMATION Since the constraint we are working with is x + y = 1, we get  

x+

1 x



 2   2   + y + y1 (x + y) + x1 + y1  =  = 2 2

1 + 2

1 x

+ 2

1 y

!2

 ≥

1 2 + 2 x+y

 ,

using the second and fourth terms in Exercise 12 for the final inequality. But now again since x+y = 1, we get 2  2  2 5 25 1 + = = , 2 x+y 2 4 so that f (x, y) = 2 · x = y = 21 .

7.2

25 4

=

25 2

is the minimum when x + y = 1. This equality holds when x = y, so that

Norms and Distance Functions

1. We have kukE =

p √ (−1)2 + 42 + (−5)2 = 42

kuks = |−1| + |4| + |−5| = 10 kukm = max (|−1| , |4| , |−5|) = 5. 2. We have kvkE =

p √ 22 + (−2)2 + 02 = 2 2

kvks = |2| + |−2| + |0| = 4 kvkm = max (|2| , |−2| , |0|) = 2.   −3 3. u − v =  6, so −5 dE (u, v) =

p √ (−3)2 + 62 + (−5)2 = 70

ds (u, v) = |−3| + |6| + |−5| = 14 dm (u, v) = max (|−3| , |6| , |−5|) = 6. 4. (a) ds (u, v) measures the total difference in magnitude between corresponding components. (b) dm (u, v) measure the greatest difference between any pair of corresponding components. 5. kukH = w(u), which is the number of 1s in u, or 4. kvkH = w(v), which is the number of 1s in v, or 5. 6. dH (u, v) = w(u − v) = w(u) + w(v) less twice the overlap, or 4 + 5 − 2 · 2 = 5; this is the number of 1s in u − v. pP P 2 2 7. (a) If kvkE = vi2 = max{|vi |} = kvkm , then squaring both sides gives vi = max{|vi | } = 2 2 max{vi }. Since vi ≥ 0 for all i, equality holds if and only if exactly one of the vi is nonzero. P (b) Suppose kvks = |vi | = max{|vi |} = kvkm . Since |vi | ≥ 0 for all i, equality holds if and only if exactly one of the vi is nonzero. (c) By parts (a) and (b), kvkE = kvkm = kvks if and only if exactly one of the components of v is nonzero.

7.2. NORMS AND DISTANCE FUNCTIONS

557

8. (a) Since ku + vkE =

X X X X (ui + vi )2 = u2i + vi2 + 2 ui vi = kukE + kvkE + 2u · v,

we see that ku + vkE = kukE + kvkE if and only if u · v = 0; that is, if and only if u and v are orthogonal. (b) We have ku + vks =

X

|ui + vi | ,

kuks + kvks =

X

(|ui | + |vi |).

For a given i, if the signs of ui and vi are the same, then |ui + vi | = |ui | + |vi |, while if they are different, then |ui + vi | < |ui | + |vi |. Thus the two norms are equal exactly when ui and vi have the same sign for each i, which is to say that ui vi ≥ 0 for all i. (c) We have ku + vks = max{|ui + vi |},

kuks + kvks = max{|ui |} + max{|vi |}.

Suppose that |ur | = max{|ui |} and |vt | = max{|vi |}. Then for any j, we know that |uj + vj | ≤ |uj | + |vj | ≤ |ur | + |vt | , where the first inequality is the triangle inequality and the second comes from the condition on r and t. Choose j such that |uj + vj | = ku + vks . If the two norms are equal, then |uj + vj | = ku + vks = kuks + kvks = |ur | + |vt | . Thus both of the inequalities above must be equalities. The first is an equality if uj and vj have the same sign. For the second, since |uj | ≤ |ur | and |vj | ≤ |vt |, equality means that |uj | = |ur | and |vj | = |vt |. Putting this together gives the following (somewhat complicated) condition for equality to hold in the triangle inequality for the max norm: equality holds if there is some component j such that uj and vj have the same sign and such that |uj | is a component of maximal magnitude of u and |vj | is a component of maximal magnitude of v. 9. Let vk be a component of v with maximum magnitude. Then the result follows from the string of equalities and inequalities qX q kvkm = |vk | = vk2 ≤ vi2 = kvkE . P 2 P 2 2 2 10. We have kvkE = vi ≤ ( |vi |) ≤ kvks . Taking square roots gives the desired result since norms are always nonnegative. 11. We have kvks =

P

12. We have kvkE =

pP

|vi | ≤

P

vi2 ≤

max{|vi |} = n max{|vi |} = n kvkm .

q P

2

(max{vi }) =

q √ √ 2 n max{|vi | } = n max{|vi |} = n kvkm .

13. The unit circle in R2 relative to the sum norm is {(x, y) : |x| + |y| = 1} = {(x, y) : x + y = 1 or x − y = 1 or − x + y = 1 or − x − y = 1}. The unit circle in R2 relative to the max norm is {(x, y) : max{|x| , |y|} = 1}.

558

CHAPTER 7. DISTANCE AND APPROXIMATION These unit circles are shown below:

-1.0

1.0

1.0

0.5

0.5

1.0

0.5

-0.5

-1.0

0.5

-0.5

-0.5

-0.5

-1.0

-1.0

1.0

14. For example, in R2 , let u = h2, 1i and v = h1, 2i. Then kuk = kvk = 2 + 1 = 3. Also, ku + vk = kh3, 3ik = 6 and ku − vk = kh1, −1ik = 2. But then 2

2

kuk + kvk = 32 + 32 = 18 6=

1 1 2 2 ku + vk + ku − vk = 18 + 2 = 20. 2 2

 

a

15. Clearly

b ≥ 0, and since max{|2a| , |3b|} = 0 if and only if both a and b are zero, we see that the norm is zero only for the zero vector. Thus property (1) is satisfied. For property 2,

   

a ca



c

b = cb = max{|2ac| , |3bc|} = |c| max{|2a| , |3b|}. Finally,

     

a

a+c c



b + d = b + d = max{|2(a + c)| , |3(b + d)|}. Now use the triangle inequality: max{|2(a + c)| , |3(b + d)|} ≤ max{|2a| + |2c| , |3b| + |3d|} ≤ max{|2a| , |3b|} + max{|2c| , |3d|}.

   

a c



But the last expression is just

b + d , and we are done. 16. Clearly max{|aij |} ≥ 0, and is zero only when all of the aij are zero, so when A = O. Thus property i,j

(1) is satisfied. For property (2), kcAk = max{|caij |} = max{|c| |aij |} = |c| max{|aij |} = |c| kAk . i,j

i,j

i,j

For property (3), using the triangle inequality for absolute values gives kA + Bk = max{|aij + bij |} ≤ max{|aij | + |bij |} ≤ max{|aij |} + max{|bij |} = kAk + kBk . i,j

i,j

i,j

i,j

7.2. NORMS AND DISTANCE FUNCTIONS

559

17. Clearly kf k ≥ 0, since |f | ≥ 0. Since f is continuous, |f | is also continuous, so its integral is zero if and only if it is identically zero. So property (1) holds. For property (2), use standard properties of integrals: Z 1 Z 1 Z 1 kcf k = |cf (x)| dx = |c| · |f (x)| dx = |c| |f (x)| dx = |c| kf k . 0

0

0

Finally, property (3) relies on the triangle inequality for absolute value: Z Z 1 Z 1 Z 1 |f (x)| dx + (|f (x)| + |g(x)|) dx = |f (x) + g(x)| dx ≤ kf + gk =

|g(x)| dx = kf k + kgk .

0

0

0

0

1

18. Since the norm is the maximum of a set of nonnegative numbers, it is always nonnegative, and is zero exactly when all of the numbers are zero, which means that f is the zero function on [0, 1]. So property (1) holds. For property (2), we have kcf k = max {|cf (x)|} = max {|c| |f (x)|} = |c| max {|f (x)|} = |c| kf k . 0≤x≤1

0≤x≤1

0≤x≤1

Finally, property (3) relies on the triangle inequality for absolute values: kf + gk = max {|f (x) + g(x)|} ≤ max {|f (x)|+|g(x)|} ≤ max {|f (x)|}+ max {|g(x)|} = kf k+kgk . 0≤x≤1

0≤x≤1

0≤x≤1

0≤x≤1

19. By property (2) of the definition of a norm,

20.

21.

22.

23.

24.

25.

d(u, v) = ku − vk = |−1| ku − vk = k(−1)(u − v)k = kv − uk = d(v, u). √ √ The Frobenius norm is kAkF = 22 + 32 + 42 + 12 = 30. From Theorem 7.7 and the comment following, kAk1 is the largest column sum of |A|, which is 2 + 4 = 6, while kAk∞ is the largest row sum of |A|, which is 2 + 3 = 4 + 1 = 5. p √ The Frobenius norm is kAkF = 02 + (−1)2 + (−3)2 + 32 = 19. From Theorem 7.7 and the comment following, kAk1 is the largest column sum of |A|, which is |−1| + 3 = 4, while kAk∞ is the largest row sum, which is |−3| + 3 = 6. p √ The Frobenius norm is kAkF = 12 + 52 + (−2)2 + (−1)2 = 31. From Theorem 7.7 and the comment following, kAk1 is the largest column sum of |A|, which is 5 + |−1| = 6, while kAk∞ is the largest row sum, which is 1 + 5 = 6. √ √ The Frobenius norm is kAkF = 22 + 12 + 12 + 12 + 32 + 22 + 12 + 12 + 32 = 31. From Theorem 7.7 and the comment following, kAk1 is the largest column sum of |A|, which is 1 + 2 + 3 = 6, while kAk∞ is the largest row sum, which is 1 + 3 + 2 = 6. p √ The Frobenius norm is kAkF = 02 + (−5)2 + 22 + 32 + 12 + (−3)2 + (−4)2 + (−4)2 + 32 = 89. From Theorem 7.7 and the comment following, kAk1 is the largest column sum of |A|, which is |−5| + 1 + |−4| = 10, while kAk∞ is the largest row sum, which is |−4| + |−4| + 3 = 11. p √ 2 2 2 2 2 2 2 2 2 The √ Frobenius norm is kAkF = 4 + (−2) + (−1) + 0 + (−1) + 2 + 3 + (−3) + 0 = 44 = 2 11. From Theorem 7.7 and the comment following, kAk1 is the largest column sum of |A|, which is 4 + 0 + 3 = 7, while kAk∞ is the largest row sum, which is 4 + |−2| + |−1| = 7.

26. Since kAk1 = 6, which is the sum of the first column, such a vector x is e1 , since

      

2 2 3 1 2

= , and

4 = 6. 4 1 0 4 s   1 Since kAk∞ = 5, which is the sum of (say) the first row, let y = ; then 1

      

5 2 3 1 5

= , and

5 = 5. 4 1 1 5 m

560

CHAPTER 7. DISTANCE AND APPROXIMATION

27. Since kAk1 = 4, which is the sum of the magnitudes of the second column, such a vector x is e2 , since

      

−1 0 −1 0 −1

= , and

3 = |−1| + 3 = 4. −3 3 1 3 s Since kAk∞ = 6, which is the sum of the magnitudes of the second row, let y =

  −1 since the first 1

column in that row is negative; then

      

−1 0 −1 −1 −1

= , and

6 = 6. −3 3 1 6 m 28. Since kAk1 = 6, which is the sum of the magnitudes of the second column, such a vector x is e2 , since

      

5 1 5 0 5

= , and

−1 = 5 + |−1| = 6. −2 −1 1 −1 s   1 Since kAk∞ = 6, which is the sum of the first row, let y = ; then 1

      

6 1 5 1 6

= , and

−3 = 6. −2 −1 1 −3 m 29. Since kAk1 = 6, which is the sum  2 1 1

of the magnitudes of the     1 1 0 1 3 2 0 = 2 , and 1 3 1 3

Since kAk∞ = 6, which is the sum of the second row,  2 1 1

1 3 1

    4 1 1 2 1 = 6 , 5 1 3

third column, such a vector x is e3 , since

 

1

2 = 6.

3 s   1 let y = 1; then 1

 

4

  and

6 = 6.

5 m

30. Since kAk1 = 10, which is the sum of the magnitudes of the second column, such a vector x is e2 , since

      

−5 0 −5 2 0 −5

 3   1 −3 1 =  1 , and

1 = 10.

−4 −4 −4 3 0 −4 s

  −1 Since kAk∞ = 11, which is the sum of the magnitudes of the third row, let y = −1 since the entries 1 in the first two columns are negative; then

      

7 0 −5 2 −1 7

 3   1 −3 −1 = −7 , and

−7 = 11.

11 −4 −4 3 1 11 m

31. Since kAk1 = 7, which is the sum of the magnitudes of the first column, such a vector x is e1 , since

      

4 4 −2 −1 1 4

0 −1        2 0 = 0 , and

0 = 7.

3 3 −3 0 0 3 s

7.2. NORMS AND DISTANCE FUNCTIONS

561 

 1 Since kAk∞ = 7, which is the sum of the magnitudes of the first row, let y = −1 since the entries −1 in the last two columns are negative; then

      

7 4 −2 −1 1 7

0 −1   2 −1 = −1 , and

−1 = 7.

6 3 −3 0 −1 6 m 32. Let Ai be the ith row of A, let M =

max {kAj ks } be the maximum absolute row sum, and let x

j=1,2,...,n

be any vector with kxkm = 1 (which means that |xi | ≤ 1 for all i). Then kAxkm = max{|Ai x|} i

= max{|ai1 x1 + ai2 x2 + · · · + ain xn |} i

≤ max{|ai1 | |x1 | + |ai2 | |x2 | + · · · + |ai1 | |xn |} i

≤ max{|ai1 | + |ai2 | + · · · + |ain |} i

= max{kAi ks } = M. i

Thus kAxkm is at most equal to the maximum absolute row sum. Next, we find a specific unit vector x so that it is actually equal to M . Let row k be the row of A that gives the maximum sum norm; that is, such that kAk ks = M . Let ( 1 Akj ≥ 0 yj = , j = 1, 2, . . . , n. −1 Akj < 0  T Then with y = y1 · · · yn , we have kykm = 1, and we have corrected any negative signs in row k, so that akj yj ≥ 0 for all j and therefore the k th entry of Ay is X X X X akj yj = |akj yj | = |akj | |yj | = |akj | = kAk ks = M. j

j

j

j

So we have shown that the ∞ norm of A is bounded by the maximum row sum, and then found a vector x for which kAxk ≤ kAk∞ is equal to the maximum row sum. Thus kAk∞ is equal to the maximum row sum, as desired. ˆ k = max kˆ 33. (a) If k·k is an operator norm, then kIk = max kI x xk = 1. kˆ xk=1 kˆ xk=1 pP √ 12 = n 6= 1. When n = 1, (b) No, there is not, for n > 1, since in the Frobenius norm, kIkF = the Frobenius norm is the same as the absolute value norm on the real number line. ˆ . Then Aˆ 34. Let λ be an eigenvalue, and v a corresponding unit vector. Normalize v to v v = λˆ v, so that kAk = max kAˆ xk ≥ kAˆ vk = kλˆ vk = |λ| kˆ vk = |λ| . kˆ xk=1

35. Computing the inverse gives "

A

−1

1 = −2

− 12 3 2

# .

Then

cond1 (A) = kAk1 A−1 1 = (3 + 4) (1 + |−2|) = 21  

3 cond∞ (A) = kAk∞ A−1 ∞ = (4 + 2) |−2| + = 21. 2 The matrix is well-conditioned.

562

CHAPTER 7. DISTANCE AND APPROXIMATION

36. Since det A = 0, by definition cond1 (A) = cond∞ (A) = ∞, so this matrix is ill-conditioned. 37. Computing the inverse gives A−1 =



 100 −99 . −100 100

Then

cond1 (A) = kAk1 A−1 1 = (1 + 1)(100 + |−100|) = 400

cond∞ (A) = kAk∞ A−1 ∞ = (1 + 1)(100 + |−100|) = 400. The matrix is ill-conditioned. 38. Computing the inverse gives " A−1 =

2001 50 3001 − 100

# −2 3 2

.

Then  2001 3001 + = 294266 50 100   2001 = (3001 + 4002) + 2 = 294266. 50

cond1 (A) = kAk1 A−1 1 = (200 + 4002)

cond∞ (A) = kAk∞ A−1 ∞



The matrix is ill-conditioned. 39. Computing the inverse gives 

A−1

0 = 6 −5

 0 1 −1 −1 . 1 0

Then

cond1 (A) = kAk1 A−1 1 = (1 + 5 + 1)(0 + 6 + 5) = 77

cond∞ (A) = kAk∞ A−1 ∞ = (5 + 5 + 6)(6 + 1 + 1) = 128. The matrix is moderately ill-conditioned. 40. Computing the inverse gives 

A−1

 9 −36 30 192 −180 . = −36 30 −180 180

Then

cond1 (A) = kAk1 A−1 1 =

cond∞ (A) = kAk∞ A−1 ∞

  1 1 1+ + (36 + 192 + 180) = 748 2 3   1 1 = 1+ + (36 + 192 + 180) = 748. 2 3

The matrix is ill-conditioned. 41. (a) Computing the inverse gives A−1 =

 1 1 1 − k −1

 −k . 1

Then  

k 1 2 + , . cond∞ (A) = kAk∞ A−1 ∞ = max{|k| + 1, 2} max k − 1 k − 1 k − 1

7.2. NORMS AND DISTANCE FUNCTIONS

563

(b) As k → 1, since the second factor goes to ∞, cond∞ (A) → ∞. This makes sense since when k = 1, A is not invertible. 42. Since the matrix norm is compatible, we have from the definition that kAxk ≤ kAk kxk for any x, so that



1 kAk kbk = AA−1 b ≤ kAk A−1 b , so that ≤ . kA−1 bk kbk Then

−1

−1

A ∆b

A k∆bk

k∆xk kAk

A−1 k∆bk = cond(A) k∆bk . = ≤ ≤ kxk kA−1 bk kA−1 bk kbk kbk

43. (a) Since "

A−1

9 − 10 = 1

# 1 , −1

we have cond∞ (A) = (10 + 10)(1 + 1) = 40. (b) Here  10 ∆A = 10 Therefore

  10 10 − 11 10

  10 0 = 9 0

0 2

 ⇒ k∆Ak = 2.

k∆Ak∞ k∆xkm 2 ≤ cond∞ (A) = 40 · , kx0 km kAk∞ 20

so that this can produce at most a change of a factor of 4, or a 400% relative change. (c) Row-reducing gives  A  0 A Thus

and

k∆xkm kxkm

  10 b = 10   10 b = 10

  9 x= , 1

10 9

100 99



10 11

100 99



→ →

 1 0  1 0

0 1

 9 1

0 1

 11 . −1

   11 2 0 x = , so that ∆x = x − x = , −1 −2 0



= 29 , for approximately a 22% relative error.

(d) Using Exercise 42, ∆b =

      100 100 0 − = , 101 99 2

so that k∆bkm = 2 and k∆xkm k∆bkm 2 ≤ cond∞ (A) = 40 · = 0.8, kxkm kbkm 100 so that at most an 80% change can result.   (e) Row-reducing A b gives the same result as above, while      10 10 100 1 A b0 = → 10 9 101 0 Thus

and again



     11 9 2 ∆x = − = , −1 1 −2 k∆xkm kxkm

= 29 , for about a 22% actual relative error.

0 1

 11 . −1

564

CHAPTER 7. DISTANCE AND APPROXIMATION

44. (a) Since A−1

 −10  = 4 7

 3 5  −1 −2 , −2 −3

we have cond1 (A) = (1 + 5 + |−1|)(|−10| + 4 + 7) = 147. (b) Here  1 ∆A = 2 1 Therefore

  1 1 1 5 0 − 1 −1 2 1

  1 1 0 5 0 = 1 −1 2 0

0 0 0

 0 0 ⇒ k∆Ak1 = 1. 0

k∆xks k∆Ak1 1 ≤ cond1 (A) = 147 · = 21, kx0 ks kAk1 7

so that this can produce at most a change of a factor of 21, or a 2100% relative change. (c) Row-reducing gives h

h

Thus

and

k∆xks kxks

A

A

0



 1 i  b = 2 1  1 i  b = 1 1

 11   x = −4 , −6 =

33 21

1 1 5 0 −1 2 1 1 5 0 −1 2

 1  2 3  1  2 3





 1 0 0  0 1 0 0 0 0  1 0 0  0 1 0 0 0 1

 11  −4 −6  − 11 2 3  . 2  5

    33 − 11 − 2 2     x0 =  32  , so that ∆x = x0 − x =  11 , 2  5 11

≈ 1.57, for approximately a 157% actual relative error.

(d) Using Exercise 42,       1 0 1 ∆b = 1 − 2 = −1 , 3 0 3 so that k∆bks = 1 and k∆xks k∆bks 1 ≤ cond1 (A) = 147 · = 24.5, kxks kbks 6 so that at most a 2450% relative change can result.   (e) Row-reducing A b gives the same result as above, while    1 1 1 1 1   A b0 = 1 5 0 1 → 0 1 −1 2 3 0 Thus

and

k∆xks kxks

0 1 0

      −4 11 −15 ∆x =  1 − −4 =  5 4 −6 10 =

30 21

≈ 1.43, for about a 143% actual relative error.

0 0 1

 −4 1 . 4

7.2. NORMS AND DISTANCE FUNCTIONS

565



45. From Exercise 33(a), 1 = kIk = A−1 A ≤ A−1 kAk = cond(A). 46. From Exercise 33(a),





cond(AB) = (AB)−1 kABk = B −1 A−1 kABk ≤ A−1 kAk B −1 kBk = cond(A) cond(B). 47. From Theorem 4.18 (b) in Chapter 4, if λn is an eigenvalue of A, then

Exercise 34, kAk ≥ |λ1 | and A−1 ≥ 1 , so that

1 λn

is an eigenvalue of A−1 . By

λn

|λ1 | cond(A) = A−1 kAk ≥ . |λn | 48. With

 7 A= 1

 −1 , −5

we have M = −D

−1

" 7 (L + U ) = − 0

so that kM k∞ = 15 . Then " # 6 ⇒ b= −4

#−1 " 0 0 −5 1

" c=D

−1

b=

# " −1 − 71 = 0 0

1 7

0

0

− 15

#" 0 0 1 1 5

# " # 6 6 = 74 −4 5

# " −1 0 = 1 0 5

#"

k



kckm =

1 7

0

# ,

6 . 7

k

Note that x0 = 0, so that x1 = c and then kM k∞ kx1 − x0 km = kM k∞ kckm < 0.0005 means that 4+log (kck /5) . In this case, with kM k∞ = 15 and kckm = 67 , we get k > log 101/kMm k∞ ) 10 ( 4 + log10 (6/35) ≈ 4.6, so that k ≥ 5. log10 5   0.999 = M xk + c gives x5 = , as compared with an actual solution of 0.999 k>

Computing x5 using xk+1   1 x= . 1 49. With

 4.5 A= 1

 −0.5 , −3.5

we have #−1 " 4.5 0 0 (L + U ) = − 0 −3.5 1 "

M = −D

−1

so that kM k∞ = 27 . Then " # 1 b= ⇒ −1

" c=D

−1

b=

# " −0.5 − 92 = 0 0

2 9

0

0

− 27 k

# " # 2 1 = 92 −1 7

#" 0 0 2 1 7

# " −0.5 0 = 2 0 7

#"

⇒ k

kckm =

1 9

0

# ,

2 . 7

Note that x0 = 0, so that x1 = c and then kM k∞ kx1 − x0 km = kM k∞ kckm < 0.0005 means that 4+log (kck /5) k > log 101/kMm . In this case, with kM k∞ = 27 and kckm = 27 , we get k∞ ) 10 ( k>

4 + log10 (2/35) ≈ 5.06, so that k ≥ 6. log10 (7/2)

566

CHAPTER 7. DISTANCE AND APPROXIMATION   0.2622 Computing x6 using xk+1 = M xk + c gives x6 = , as compared with an actual solution of 0.3606   0.2623 x= . 0.3606

50. With



 1 −1 −10 1 , 1 10

20 A= 1 −1 we have  20 0  −1 M = −D (L + U ) = −  0 −10 0 0 2 so that kM k∞ = 10 = 51 . Then    0 17  1   −1 b = 13 ⇒ c = D b = − 10 1 18 − 10

−1  0 1 0   0  1 0 −1 1 10

1 20

0 1 10

  0 −1  1 1 =  10 1 0 10

    17 1 17 − 20 20   1   − 10  13 = − 13 20  9 18 0 5

k



1 − 20 0 1 − 10



1 20 1  10 

0

kckm =

9 . 5

k

Note that x0 = 0, so that x1 = c and then kM k∞ kx1 − x0 km = kM k∞ kckm < 0.0005 means that 4+log (kck /5) . In this case, with kM k∞ = 15 and kckm = 95 , we get k > log 101/kMm k∞ ) 10 ( 4 + log10 (9/25) ≈ 5.08, so that k ≥ 6. log10 5   0.999 = M xk + c gives x6 = −1.000, as compared with an actual solution of 1.999 k>

Computing x6 using xk+1  1 x = −1. 2 

51. With

 3 A = 1 0

we have  3  −1 M = −D (L + U ) = − 0 0

0 4 0

= 21 . Then    1 0   1 −1 b = 1 ⇒ c = D b =  4 1 0

so that kM k∞ =

 0 1 , 3

1 4 1

−1  0 0   0 1 3 0

1 0 1

  0 0   1 1 = − 4 0 0

− 31 0 − 31

 0  − 41  0

2 4

1 3

0 1 3 k

    1 1 3 1 1   = 4 4  1 1 0 1 3 0



k

kckm =

1 . 3

Note that x0 = 0, so that x1 = c and then kM k∞ kx1 − x0 km = kM k∞ kckm < 0.0005 means that 4+log (kck /5) k > log 101/kMm . In this case, with kM k∞ = 12 and kckm = 13 , we get k∞ ) 10 ( k>

4 + log10 (1/15) ≈ 9.3, so that k ≥ 10. log10 2

7.2. NORMS AND DISTANCE FUNCTIONS

567

Computing x10 using xk+1 = M xk + c gives x10

  0.299 = 0.099, as compared with an actual solution of 0.299

  0.3 x = 0.1. 0.3

52. The sum and max norms are operator norms induced from the sum and max norms on Rn , so they are matrix norms.

(a) If C and D are any two matrices and k·k is either the sum or max norm, then kCDk ≤ kCk kDk. Now assume A is such that kAk = δ < 1. Then n

kAn k ≤ kAk = δ n . Thus limn→∞ kAn k = limn→∞ δ n = 0. So by property (1) in the definition of matrix norms, An → O. (b) A computation shows that (I − A)(I + A + · · · + An ) = I − An+1 , since all the other terms cancel in pairs. Thus (I − A)(I + A + A2 + A3 + · · · ) = (I − A) lim (I + A + · · · + An ) n→∞

= lim ((I − A)(I + A + · · · + An )) n→∞  = lim I − An+1 n→∞

= I. It follows that I − A is invertible and that its inverse is I + A + A2 + A3 + · · · . (c) Recall that a consumption matrix C is a matrix such that all entries are nonnegative and the sum of the entries in each column does not exceed 1. A consumption matrix C is productive if I − C is invertible and (I − C)−1 ≥ O; that is, if it has only nonnegative entries. To prove Corollary 3.35, use k·k∞ , the max norm. Note that since C ≥ O, the sum of the absolute values of the entries of a row is the same as the sum of the entries of the row. If the sum of each row of C is less than 1, then kCk∞ < 1, so by part (b), I − C is invertible. Since C ≥ O, it is clear that each entry of (I − C)−1 = I + C + C 2 + C 3 + · · · is nonnegative, so that (I − C)−1 ≥ O. Thus C is productive. The proof of Corollary 3.36 is similar, but it uses k·k1 , the sum norm. If the sum of each column of C is less than 1, then kCk1 < 1. The rest of the argument is identical to that in the preceding paragraph.

568

7.3

CHAPTER 7. DISTANCE AND APPROXIMATION

Least Squares Approximation

1. The corresponding data points predicted by the line y = −2 + 2x are (1, 0), (2, 2), and (3, 4), so the least squares error is p √ (0 − 0)2 + (2 − 1)2 + (4 − 5)2 = 2. A plot of the points and the line on the same set of axes is 6

4

2

0.5

1.

2.

1.5

3.

2.5

3.5

-2

2. The corresponding data points predicted by the line y = x are (1, 1), (2, 2), and (3, 3), so the least squares error is p √ (1 − 0)2 + (2 − 1)2 + (3 − 5)2 = 6. A plot of the points and the line on the same set of axes is 5

4

3

2

1

1

2

3

4

-1

  3. The corresponding data points predicted by the line y = −3 + 25 x are 1, − 12 , (2, 2), and 3, 29 , so the least squares error is s √ 2  2 1 9 6 2 − − 0 + (2 − 1) + −5 = . 2 2 2 A plot of the points and the line on the same set of axes is

4

2

0.5

-2

1.

1.5

2.

2.5

3.

3.5

7.3. LEAST SQUARES APPROXIMATION

569

4. The corresponding data points predicted by the line y = 3 − 13 x are       14 1 1 , 0, 3 − · 0 = (0, 3), −5, 3 − (−5) = −5, 3 3 3         4 1 1 1 5, 3 − · 5 = 5, , 10, 3 − · 10 = 10, − , 3 3 3 3 so the least squares error is s √   2 2  2 r 4 14 1 25 4 1 30 2 3− + (3 − 3) + 2 − + 0− − = +0+ + = . 3 3 3 9 9 9 3 A plot of the points and the line on the same set of axes is 5

4

3

2

1

10

5

-5

5. The corresponding data points predicted by the line y = 3 − 13 x are         5 5 5 5 , 0, , 5, , 10, , −5, 2 2 2 2 so the least squares error is s 2  2  2   2 r 5 1 1 1 25 √ 5 5 5 3− + 3− + 2− + 0− = + + + = 7. 2 2 2 2 4 4 4 4 A plot of the points and the line on the same set of axes is 5

4

3

2

1

10

5

-5

6. The corresponding data points predicted by the line y = 2 − 15 x are (−5, 3) ,

(0, 2) ,

(5, 1) ,

(10, 0) ,

so the least squares error is q √ 2 2 2 (3 − 3) + (3 − 2) + (2 − 1) + (0 − 0) 62 = 2.

570

CHAPTER 7. DISTANCE AND APPROXIMATION A plot of the points and the line on the same set of axes is 3.

2.5

2.

1.5

1.

0.5

10

5

-5

7. We are looking for a line y = a+bx that best fits the given points. Using the least squares approximation theorem, we want a solution to Ax = b, where     y1 1 x1       1 1 0  y2  1 x2  a      = 1 2 , x = , b =  .  = 1 . A = .  . ..  b  ..   .. 1 3 5 yn 1 xn Since "

1 A A= 1 T

1 2

 # 1 1  1 3 1

 " 1 3  2 = 6 3

# " 7 6 T −1 3 ⇒ (A A) = −1 14

−1

# ,

1 2

we get " x = (AT A)−1 AT b =

7 3

−1

#" −1 1 1 1 2

1 2

  # 0 " # 1   −3 . 1 = 5 3 2 5

Thus the least squares approximation is y = −3 + 25 x. To find the least squares error, compute    0 1    e = b − Ax = 1 − 1 5 1 Then kek =

q

 1 2 2

 1 2 2

+ (−1)2 +

   1 1 " # 2 −3    = 2 −1  . 5 1 2 3 2



=

6 2 .

8. We are looking for a line y = a+bx that best fits the given points. Using the least squares approximation theorem, we want a solution to Ax = b, where     1 x1 y1       1 1 6 1 x2   y2  a         1 2 3 . A = . = , x = , b = =  ..  ..  b  .. . .  1 3 1 1 xn yn Since "

1 A A= 1 T

1 2

 # 1 1  1 3 1

 " 1 3  2 = 6 3

# " 7 6 T −1 3 ⇒ (A A) = −1 14

−1 1 2

# ,

7.3. LEAST SQUARES APPROXIMATION

571

we get " x = (AT A)−1 AT b =

7 3

−1 25 3

Thus the least squares approximation is y =

#" −1 1 1 1 2

− 25 x. To find the least squares error, compute    1 1 " 25 # 6    2 35 = − 13  . −2 1 3 6

   1 6    e = b − Ax = 3 − 1 1 1 Then kek =

q

 1 2 6

+ − 13

2

 1 2 6

+

1 2

  " # # 6 25 1   3 . 3 = 3 − 25 1



6 6 .

=

9. We are looking for a line y = a+bx that best fits the given points. Using the least squares approximation theorem, we want a solution to Ax = b, where     y1 1 x1       4 1 0  y2  1 x2  a         1 . 1 1 = , x = , b = A = . =    . .. b  ..   .. .  0 1 2 yn 1 xn Since " 1 T A A= 0

1 1

 " 0 3  1 = 3 2

 # 1 1  1 2 1

# " 5 3 T −1 6 ⇒ (A A) = 5 − 12

− 21

# ,

1 2

we get " x = (AT A)−1 AT b =

Thus the least squares approximation is y =

5 6 1 −2 11 3

− 12 1 2

q

 1 2 3

+ − 23

2

 1 2 3

+

1 1

  # 4 " # 11 1   1 = 3 . 2 −2 0

− 2x. To find the least squares error, compute

   1 4    e = b − Ax = 1 − 1 0 1 Then kek =

#" 1 0

   1 0 " 11 # 3  3  2 = − 3  . 1 −2 1 2 3



=

6 3 .

10. We are looking for a line y = a+bx that best fits the given points. Using the least squares approximation theorem, we want a solution to Ax = b, where     1 x1 y1       1 0 3 1 x2   y2  a        1 1 A = . , x= , b =  .  = 3 . ..  = b  ..  ..  .  1 2 5 1 xn yn Since " 1 T A A= 0

1 1

 # 1 1  1 2 1

 " 0 3  1 = 3 2

# " 5 3 T −1 6 ⇒ (A A) = − 12 5

− 21 1 2

# ,

572

CHAPTER 7. DISTANCE AND APPROXIMATION we get "

5 6 − 12

x = (AT A)−1 AT b =

− 21 1 2

#" 1 0

  " # # 3 8 1   3 = 3 . 2 1 5

1 1

Thus the least squares approximation is y = 83 + x. To find the least squares error, compute       1 1 0 "8# 3 3  3     2 e = b − Ax = 3 − 1 1 = − 3  . 1 1 1 2 5 3 q  √ 2 2 1 2 + − 23 + 13 = 36 . Then kek = 3 11. We are looking for a line y = a+bx that best fits the given points. Using the least squares approximation theorem, we want a solution to Ax = b, where         1 x1 y1 1 −5 −1   1 x2    y2     1 0 a       1  A = . , x= , b =  .  =  . ..  = 1 5 b 2  ..  ..  .  1 10 4 1 xn yn Since "

1 A A= −5 T

1 0

1 5

 # 1 1  1  10 1 1

 −5 " 4 0  = 10 5

"

#"

# " 3 10 T −1 10 ⇒ (A A) = 1 150 − 50

1 − 50

# ,

1 125

10

we get 3 10 1 − 50

x = (AT A)−1 AT b =

1 − 50

1 −5

1 125

1 0

1 5

  # −1 " #  7 1   1 .   = 10 8 10  2 25 4

7 8 Thus the least squares approximation is y = 10 + 25 x. To find the least squares error, compute  1     − 10 −1 1 −5 " #  3  1 1  7 0      10  e = b − Ax =   −  .  8 =  10 3   2 1 − 10  5 25

4 Then kek =

q

1 − 10

2

+

 3 2 10

3 + − 10

2

+

 1 2 10

1

10

1 10



=

5 5 .

12. We are looking for a line y = a+bx that best fits the given points. Using the least squares approximation theorem, we want a solution to Ax = b, where         1 x1 y1 1 −5 3   1 x2    y2     0 a   1   3 A = . , x= , b =  .  =  . ..  = 1 5 b 2  ..  ..  .  1 10 0 1 xn yn Since "

1 A A= −5 T

1 0

1 5

 # 1 1  1  10 1 1

 −5 " 0 4  = 5 10 10

# " 3 10 T −1 10 ⇒ (A A) = 1 − 50 150

1 − 50 1 125

# ,

7.3. LEAST SQUARES APPROXIMATION

573

we get " x = (AT A)−1 AT b =

1 − 50

3 10 1 − 50

#"

1 125

1 −5

1 0

  " # # 3  5 1  3 2 .  = − 15 10 2

1 5

0 5 2

1 5 x.

Thus the least squares approximation is y = −    1 3 3 1    e = b − Ax =   −  2 1 1

0 Then kek =

q

− 12

2

+

 1 2 2

+

 1 2 2

 1 2 2

+

To find the least squares error, compute   1 −5 " # −2   5 1 0 2 =  2 .   1 1  2 5 − 5 10 − 12

= 1.

13. We are looking for a line y = a+bx that best fits the given points. Using the least squares approximation theorem, we want a solution to Ax = b, where         1 1 1 1 x1 y1  3   1 2 1 x2      a   y2      1 3 4 A = . , x= , b= . = ..  =    . . b .  ..   .  1 4 5 1 xn yn 1 5 7 Since "

1 A A= 1

1 2

T

1 3

1 4

 1 # 1 1  1 5   1 1

 1  2 "  5 3  = 15  4 5

# " 11 15 T −1 10 ⇒ (A A) = 3 − 10 55

3 − 10 1 10

# ,

we get " T

x = (A A)

−1

T

A b=

11 10 3 − 10

#" 1 1 1 10

3 − 10

1 2

1 3

1 4

  1  # 3 " 1 #  1  4 = − 5 .  7 5    5 5 7

Thus the least squares approximation is y = − 15 + 75 x. To find the least squares error, compute       −1 1 1 1     " #  52  3 1 2  5       −1 5      e = b − Ax = 4 − 1 3 =  0 . 7      2 5 5 1 4 − 5  1 7 1 5 5 q √     2 2 2 2 Then kek = − 15 + 25 + 02 + − 25 + 15 = 510 . 14. We are looking for a line y = a+bx that best fits the given points. Using the least squares approximation theorem, we want a solution to Ax = b, where         1 1 10 1 x1 y1  8   1 2 1 x2      a     y2    1 3 5 A = . , x= , b= . = ..  =    . . b .  ..   .  1 4 3 1 xn yn 1 5 0

574

CHAPTER 7. DISTANCE AND APPROXIMATION Since "

1 A A= 1 T

1 2

1 3

1 4

 1  # " 2 " 11  5 15 T −1 10  ⇒ (A A) = 3 = 3 15 55 − 10  4 5

 1 # 1 1  1 5   1 1

3 − 10

#

1 10

,

we get "

11 10 3 − 10

x = (AT A)−1 AT b =

#" 1 1 1 10

3 − 10

1 2

1 3

1 4

  10  #  8  " 127 #  1   5  = 10 .  − 25 5    3 0

5 Thus the least squares approximation is y = 127 10 − 2 x. To find the least squares error, compute       −1 10 1 1     " #  35   10   8  1 2 127  1     10      e = b − Ax =  5  − 1 3 = − 5  . 5  3     −2  10   3  1 4 0 1 5 − 15

Then kek =

q

− 15

2

+

 3 2 10

+ − 15

2

+

 3 2 10

+ − 15

2



30 10 .

=

15. Following the method of Example 7.28, we substitute the given values of x into the quadratic equation a + bx + cx2 = y to get a+ b+

c=

1

a + 2b + 4c = −2

which in matrix form is

 1 1  1 1

a + 3b + 9c =

3

a + 4b + 16c =

4,

1 2 3 4

   1 1   a −2 4  b =   .  3 9 c 4 16

Thus we want the least squares approximation to Ax = b, where A is the 4 × 3 matrix on the left and b is the matrix on the right in the above equation. Since     1 1 1     31 5 1 1 1 1  4 10 30 − 27  4 4 4   1 2 4      129 AT A = 1 2 3 4   − 54  ,  = 10 30 100 ⇒ (AT A)−1 = − 27 4 20 1 3 9  5 1 1 4 9 16 30 100 354 − 54 4 4 1 4 16 we get  x = (AT A)−1 AT b =

31 4  27 − 4 5 4

− 27 4 129 20 − 54

Thus the least squares approximation is y = 3 −



5 1 4  − 45  1 1 1 4 18 5 x

+ x2 .

1 2 4

1 3 9

   1 1   3  −2   . 4    = − 18 5   3 16 1 4 



7.3. LEAST SQUARES APPROXIMATION

575

16. Following the method of Example 7.28, we substitute the given values of x into the quadratic equation a + bx + cx2 = y to get a+ b+

c=6

a + 2b + 4c = 0 a + 3b + 9c = 0 a + 4b + 16c = 2, which in matrix form is

 1 1  1 1

1 2 3 4

   1   6 a 0 4  b =   . 0 9 c 16 2

Thus we want the least squares approximation to Ax = b, where A is the 4 × 3 matrix on the left and b is the matrix on the right in the above equation. Since        1 1 1  31 5 4 10 30 1 1 1 1  − 27  4 4 4     1 2 4    129 AT A = 1 2 3 4   − 54  ,  = 10 30 100 ⇒ (AT A)−1 = − 27 4 20 1 3 9  5 1 30 100 354 1 4 9 16 − 54 4 4 1 4 16 we get  x = (AT A)−1 AT b =

31 4  27 − 4 5 4

− 27 4 129 20 − 54

Thus the least squares approximation is y = 15 −



5 1 4 5  − 4  1 1 1 4 56 5 x

1 2 4

1 3 9

    6  15 1    0   . 4    = − 56 5  0 16 2 2

+ 2x2 .

17. Following the method of Example 7.28, we substitute the given values of x into the quadratic equation a + bx + cx2 = y to get a − 2b + 4c =

4

a− b+ c=

7

a

=

3

a+ b+ c=

0

a + 2b + 4c = −1, which in matrix form is

 1 1  1  1 1

   −2 4   4  7 −1 1 a        0 0  b =  3 .  0 1 1 c 2 4 −1

Thus we want the least squares approximation to Ax = b, where A is the 5 × 3 matrix on the left and b is the matrix on the right in the above equation. Since   1 −2 4        17 1 1 1 1 1 1 −1 1 5 0 10 0 − 71 35         T −1 1 AT A = −2 −1 0 1 2  0 , 0 0 1  =  0 10 0  ⇒ (A A) =  0 10   1 − 17 0 14 4 1 0 1 4 1 1 1 10 0 34 1 2 4

576

CHAPTER 7. DISTANCE AND APPROXIMATION we get  4     18 1 1 1 1  7 5      17  −1 0 1 2   3 = − 10  .   1 0 1 4  0 − 12 −1 



17 35

 x = (AT A)−1 AT b =  0 − 17

0 1 10

0

Thus the least squares approximation is y =

 − 71 1  0 −2 1 4 14 18 5



17 10 x

− 12 x2 .

18. Following the method of Example 7.28, we substitute the given values of x into the quadratic equation a + bx + cx2 = y to get a − 2b + 4c =

0

a − b + c = −11 = −10

a

a + b + c = −9 a + 2b + 4c = which in matrix form is

8,

   0 −2 4   −11 −1 1    a     0 0 b =  −10 .   −9 c 1 1 8 2 4

 1 1  1  1 1

Thus we want the least squares approximation to Ax = b, where A is the 5 × 3 matrix on the left and b is the matrix on the right in the above equation. Since   1 −2 4        17 5 0 10 1 1 1 1 1 1 −1 1 0 − 71 35        T −1 1 AT A = −2 −1 0 1 2  0 0 0 ,  =  0 10 0  ⇒ (A A) =  0 10 1   1 1 1 10 0 34 4 1 0 1 4 1 − 17 0 14 1 2 4 we get 



17 35

 x = (AT A)−1 AT b =  0 − 17

0 1 10

0

 − 17 1  0 −2 1 4 14

 0    62  1 1 1 1 −11 −    59   = . −1 0 1 2  −10   5   1 0 1 4  −9 4 8

9 2 Thus the least squares approximation is y = − 62 5 + 5 x + 4x .

19. The normal equation is AT Ax = AT b. We have  3 A A= 1 T

AT b =

1 1



1 −2

   3 1   1  11 6 1 1 = 2 6 6 1 2    1   3 2   5 1 = , −2 1 4 1

so that the normal form of this system is  11 6

   6 5 x= . 6 4

7.3. LEAST SQUARES APPROXIMATION Thus

577

#−1 " # " 1 11 6 5 5 x= = 6 6 4 − 51 "

#" # " # 1 5 5 = 11 7 4 30 15

− 15

is the least squares solution. 20. The normal equation is AT Ax = AT b. We have    1 −2  3 2  14 3 −2 = −2 1 −6 2 1    1   3 2   6 1 = , −2 1 −3 1



1 −2

AT A =



1 −2

AT b =

 −6 9

so that the normal form of this system is 

   14 −6 6 x= . −6 9 −3

Thus

#−1 " # " 1 −6 6 = 10 1 9 −3 15

"

14 x= −6

1 15 7 45

#"

# " # 2 6 15 = 1 − 15 −3

is the least squares solution. 21. The normal equation is AT Ax = AT b. We have

AT A =

AT b =

     1 −2  0 2 3  0 −3 = 14 8 8 38 5 −3 5 0 2 3 0   4     12 0 2 3   1 = , −21 −3 5 0 −2 4



1 −2



1 −2

so that the normal form of this system is  14 8 Thus "

14 8 x= 8 38

   8 12 x= . 38 −21

#−1 "

# " 19 12 234 = 2 −21 − 117

2 − 117 7 234

#"

# " # 4 12 3 = −21 − 56

is the least squares solution. 22. The normal equation is AT Ax = AT b. We have  1 AT A = 0

AT b =

 1 0

  1 0    2 −1 0  6  2 −1 = −1 1 2 −1 1 −3 0 2   1     2 −1 0   5 = 12 , −1 1 2 −1 −2 2

 −3 6

578

CHAPTER 7. DISTANCE AND APPROXIMATION so that the normal form of this system is 

6 −3

Thus "

6 x= −3

   −3 12 x= . 6 −2

#−1 " # " 2 −3 12 = 91 6 −2 9

1 9 2 9

#"

# " # 22 12 = 98 −2 9

is the least squares solution.   23. Instead of trying to invert AT A, we instead use an alternative method, row-reducing AT A AT b . We will see that AT A does not reduce to the identity matrix, so that the system does not have a unique solution. We get      3 0 2 1 1 0 0 1 1 0 1 1  1 0 −1 −1 1 3 −2 −1 0 1 1   = 0  AT A =  0 1 3 2 1 1 0 −1 1 1 2 −2 1 −1 2 2 0 1 1 0 1 −1 1 0      2 1 1 1 0 1       −5 −3 1 0 −1 −1   =  . AT b =  0 1 1 1  2  3 −1 4 0 1 1 0 Row-reducing, we have  3 0  2 1

0 2 1 3 −2 −1 −2 3 2 −1 2 2

  2 1 0 0 0 1 0 −5 → 0 0 1 3 0 0 0 −1

−1 1 2 0

 4 −5 . −5 0

So AT A is not invertible, and therefore there is not a unique solution. The general solution is       1 4 4+t −1  −5 − t  −5       −5 − 2t = −5 + t −2 . 1 0 t   24. Instead of trying to invert AT A, we instead use an alternative method, row-reducing AT A AT b . We will see that AT A does not reduce to the identity matrix, so that the system does not have a unique solution. We get      0 1 1 1 0 1 1 0 3 0 3 0 1 −1 0 1 1 −1 1 −1 0 3 1 2  =  AT A =  1 1 1 1 1 0 1 0 3 1 4 0 0 −1 0 1 1 1 1 1 0 2 0 2      0 1 1 1 5 3 1 −1 0 1  3  3 T   =  . A b= 1 1 1 1 −1  8 0 −1 0 1 1 −2 Row-reducing, we have  3 0 3 0 0 3 1 2  3 1 4 0 0 2 0 2

  3 1 0 0 1 0 1 0 3 1 → 0 0 1 −1 8 −2 0 0 0 0

 −5 −1 . 6 0

7.3. LEAST SQUARES APPROXIMATION

579

So AT A is not invertible, and therefore there is not a unique solution. The general solution is       −5 − t −5 −1 −1 − t −1     =   + t −1 .  6 + t   6  1 t 0 1 25. Solving these equations corresponds to solving Ax = b where     2 1 1 −1 6  0 −1  2   b= A= 11 .  3 2 −1 0 −1 0 1  T  To solve, we row-reduce A A AT b :       1 1 −1 1 0 3 −1  11 7 −5  0 −1 2  2 0  7 6 −5 AT A =  1 −1 =  3 2 −1 −1 2 −1 1 −5 −5 7 −1 0 1      2  35 1 0 3 −1   6   18 . 2 0  = AT b =  1 −1 11 −1 −1 2 −1 1 0 Row-reducing, we have 

11 7 −5  6 −5  7 −5 −5 7

  1 0 0 35   18 → 0 1 0 −1 0 0 1



42 11 19  . 11  42 11

Thus the best approximation is     42 x 11    19  x = y  =  11  . 42 z 11 26. Solving these equations corresponds to solving Ax = b where     21 2 3 1  7  1 1 1   A= b= 14 . −1 1 −1 0 0 2 1  T  To solve, we row-reduce A A AT b :      2 3 1 2 1 −1 0  6  1 1 1  1 2  6 AT A = 3 1 = −1 1 −1 1 1 −1 1 4 0 2 1     21   2 1 −1 0   35 7   1 2  84 . AT b = 3 1 = 14 1 1 −1 1 14 0 Row-reducing, we have  6  6 4

6 15 5

4 5 4

  35 1 0 0   84 → 0 1 0 14 0 0 1



469 66 224  . 33  133 − 11

6 15 5

 4 5 4

580

CHAPTER 7. DISTANCE AND APPROXIMATION Thus the best approximation is     469 x 66     x = y  =  224 . 33  z − 133 11

27. With Ax = QRx = b, we get " Rx = QT b,

or

# " 2 1 x = 13 1 3

3 0

2 3 − 32

1 3 2 3

#

 " # 2 3   .  3 = −2 −1 

Back-substitution gives the solution " x=

5 3

# .

−2

28. With Ax = QRx = b, we get "√ Rx = QT b,

or



6

6 2 √1 2



0

#

" x=

√1 6 √1 2

√2 6

0

  " # # 1 √2 − √16   1 = √6 .   √1 2 2 1

Back-substitution gives the solution " # x=

4 3

2

.

29. We are looking for a line y = a+bx that best fits the given points. Using the least squares approximation theorem, we want a solution to Ax = b, where     1 20 14.5     1 x1 y1 1 40   31      1 x2    y2     a 36       1 48  .  A = . , x = , b = = =  ..   ..  1 60  b  ..  .  45.5 .     1 80   59  1 xn yn 1 100 73.5 Since

"

1 A A= 20

1 40

T

1 48

1 60

1 80

 1   # 1 1  1 100  1  1 1

 20  40  "  48  = 6 60  348   80  100

# " 1519 348 T −1 1545 ⇒ (A A) = 29 − 2060 24304

29 − 2060 1 4120

we get

" T

x = (A A)

−1

T

A b=

1519 1545 29 − 2060

29 − 2060 1 4120

#"

1 20

1 40

1 48

Thus the least squares approximation is y = 0.918 + 0.730x.

1 60

1 80

  14.5   31  " # #    36 0.918 1   = .  100  0.730 45.5    59  73.5

# ,

7.3. LEAST SQUARES APPROXIMATION

581

30. (a) We are looking for a line L = a + bF that best fits the given points. Using the least squares approximation theorem, we want a solution to Ax = b, where         y1 1 x1 7.4 1 2    y2   1 x2     9.6    1 4 . , x = a , b =  = A = .  =  . .  ..  1 6 b  ..  11.5  .. 13.6 1 8 yn 1 xn Since "

1 A A= 2 T

1 4

 2 4   6

 # 1 1  1  8 1 1

1 6

"

4 = 20

# " 3 20 T −1 2 ⇒ (A A) = 120 − 14

− 41 1 20

# ,

8

we get  7.4 " #  5.4 1   9.6  . =  1.025 8 11.5 13.6 

" x = (AT A)−1 AT b =

3 2 − 41

#" − 14 1 1 2 20

1 4

1 6

#

Thus the least squares approximation is L = 5.4 + 1.025F . a = 5.4 represents the length of the spring when no force is applied (F = 0). (b) When a weight of 5 ounces is attached, the spring length will be about 5.4 + 1.025 · 5 = 10.525 inches. 31. (a) We are looking for a line y = a + bx that best fits the given points. Letting x = 0 correspond to 1920 and using the least squares approximation theorem, we want a solution to Ax = b, where     1 0 54.1         1 10 59.7 1 x1 y1 1 20 62.9     1 x2    y2   1 30 a 68.2          A = . ..  = 1 40 , x = b , b =  ..  = 69.7 .  .. .  .     1 50 70.8 1 xn yn     1 60 73.7 1 70 75.4 Since

"

8 A A= 280 T

# " 5 280 T −1 12 ⇒ (A A) = 1 14000 − 120

1 − 120 1 4200

# ,

we get

" T

x = (A A)

−1

T

A b=

5 12 1 − 120

#" 1 1 0 4200

1 − 120

1 10

1 20

1 30

1 40

1 50

1 60

  54.1   59.7     # 62.9 " #  1 68.2 56.63  .  =  70  69.7 0.291   70.8     73.7 75.4

Thus the least squares approximation is y = 56.63 + 0.291x. The model predicts that the life expectancy of someone born in 2000 is y(80) = 56.63 + 0.291 · 80 = 79.91 years.

582

CHAPTER 7. DISTANCE AND APPROXIMATION (b) To see how good the model is, compute the least squares error:       −2.53 1 0 54.1  0.16  59.7 1 10        0.45  62.9 1 20         2.84  68.2 1 30 56.63  , so kek ≈ 4.43.      = e = b − Ax =    −  1.43  69.7 1 40 0.291 −0.38 70.8 1 50       −0.39 73.7 1 60 −1.6 1 70 75.4 The model is not bad, with an error of under 10% of the initial value.

32. (a) Following the method of Example 7.28, we substitute the given values of x into the equation a + bx + cx2 = y to get a + 0.5b + 0.25c = 11 a+

b+

c = 17

a + 1.5b + 2.25c = 21

which in matrix form is

a+

2b +

4c = 23

a+

3b +

9c = 18,

    11 1 0.5 0.25   17  a 1 1 1     1 1.5 2.25  b  = 21 .     23 1 2 4  c 1 3 9 18

Thus we want the least squares approximation to Ax = b, where A is the 5 × 3 matrix on the left and b is the matrix on the right in the above equation. Since



1 AT A =  0.5 0.25

1 1 1 1.5 1 2.25

  1 1 1  1 2 3  1 4 9 1 1

0.5 1 1.5 2 3

 0.25  5 1    8 2.25 =  16.5 4  9

8 16.5 39.5

 16.5 39.5  ⇒ 103.125

 3.330 −4.082 1.031 5.735 −1.543 , = −4.082 1.031 −1.543 0.436 

(AT A)−1 we get 

3.330 x = (AT A)−1 AT b = −4.082 1.031



−4.082 1.031 1 5.735 −1.543  0.5 −1.543 0.436 0.25

   11    1 1 1 1  1.918 17    1 1.5 2 3  21 = 20.306 . 1 2.25 4 9 23 −4.972 18

Thus the least squares approximation is y = 1.918 + 20.306t − 4.972t2 . (b) The height at which the object was released is y(0) ≈ 1.918 m. Its initial velocity was y 0 (0) ≈ 2 20.306 m/sec, and its acceleration due to gravity is y 00 (t) ≈ −9.944 m/s . (c) The object hits the ground when y(t) = 0. Solving the quadratic using the quadratic formula gives y ≈ −0.092 and y ≈ 4.176 seconds. So the object hits the ground after about 4.176 seconds.

7.3. LEAST SQUARES APPROXIMATION

583

33. (a) If p(t) = cekt , then p(0) = 150 means that c = 150, so that p(t) = 150ekt . Taking logs gives ln p(t) = ln 150 + kt ≈ 5.011 + kt. We wish to find the value of k which best fits the data. Let t = 0 correspond to 1950 and t = 1 to 1960. Then kt = ln p(t) − ln 150 = ln p(t) 150 , so   1   2    A= 3 ,   4 5

h i x= k ,

    179 ln 150 0.177   203   ln 150  0.303   227      b= ln 150  ≈ 0.414 .   250   ln 150  0.511 0.628 ln 281 150

Since   1 AT A = 55 and thus (AT A)−1 = 55 , we get

x = (AT A)−1 AT b =



1 55

 1

2

3

4

  0.177    0.303  5 0.414  = 0.131. 0.511 0.628

Thus the least squares approximation is p(t) = 150e0.131t . (b) 2010 corresponds to t = 6, so the predicted population is p(6) = 150e0.131·6 ≈ 329.19 million. 34. Let t = 0 correspond to 1970. (a) For a quadratic approximation, follow the method of Example 7.28, and substitute the given values of x into the equation a + bx + cx2 = y to get a a + 5b +

=

29.3

25c =

44.7

a + 10b + 100c = 143.8 a + 15b + 225c = 371.6 a + 20b + 400c = 597.5 a + 25b + 625c = 1110.8 a + 30b + 900c = 1895.6 a + 35b + 1225c = 2476.6, which in matrix form is

 1 1  1  1  1  1  1 1

0 5 10 15 20 25 30 35

   0 29.3  44.7  25      100     143.8    a   225    b  =  371.6  .    400   597.5  c    625  1110.8 1895.6 900  1225 2476.6

Thus we want the least squares approximation to Ax = b, where A is the 8 × 3 matrix on the left and b is the matrix on the right in the above equation. Since     3 1 17 − 40 8 140 3500 24 600    3 53 1  AT A =  140 − 3000 3500 98000  ⇒ (AT A)−1 = − 40 , 4200 1 1 1 − 3000 105000 3500 98000 2922500 600

584

CHAPTER 7. DISTANCE AND APPROXIMATION we get



17 24

3 − 40

1 600

53 4200 1 − 3000

 3 x = (AT A)−1 AT b = − 40

 1  1   1 1  1 600   1 − 3000  1 1  105000  1  1 1

0 5 10 15 20 25 30 35

T 0  25   100   225    400   625    900  1225

 29.3    44.7     143.8      57.063  371.6       = −20.335 .   597.5    2.589 1110.8     1895.6 2476.6 

Thus the least squares approximation is y = 57.063 − 20.335t + 2.589t2 . (b) With y(t) = cekt , since y(0) = 29.3 we have c = 29.3 so that p(t) = 29.3ekt . Taking logs gives p(t) ln p(t) = ln 29.3 + kt, or kt = ln 29.3 . Then 

 0   5   10   15   A =  , 20   25     30 35

   0.422 ln 44.7 29.3  ln 143.8  1.591    29.3      371.6  ln 29.3  2.540   597.5      b=  ln 29.3  ≈ 3.015 .   1110.8   ln 29.3  3.635   1895.6   ln 29.3  4.170 4.437 ln 2476.6 29.3 

        Then AT A = 3500 and AT b = 487.696 , so that x = 487.696 ≈ 0.1393 . Thus y(t) = 3500 29.3e0.1393t . (c) The models produce the following predictions: Year 1970 1975 1980 1985 1990 1995 2000 2005

Actual 29.3 44.7 143.8 371.6 597.5 1110.8 1895.6 2476.6

Quadratic 57.1 20.1 112.6 334.6 686.0 1166.8 1777.1 2516.9

Exponential 29.3 58.8 118.0 236.8 475.1 953.5 1913.3 3839.4

The quadratic appears to be a better fit overall, even though it does a poor job in the first 15 years or so. The exponential model simply grows too fast. (d) Using the quadratic approximation, we get for 2010 and 2015 y(40) = 57.063 − 20.335 · 40 + 2.589 · 402 = 3386.1 y(45) = 57.063 − 20.335 · 45 + 2.589 · 452 = 4384.7. 35. Using Example 7.29, taking logs gives     t1 30     A = t2  = 60 , t3 90

    kt1 ln 172 200     b = kt2  = ln 148 . 200  128 kt2 ln 200

7.3. LEAST SQUARES APPROXIMATION

585

     62.8  Then AT A = 12600 and AT b ≈ −62.8 , so that x ≈ − 12600 ≈ −0.005. Thus k = −0.005 and p(t) = ce−0.005t . To compute the half-life, we want to solve 12 c = ce−0.005t , so that −0.005t = ln 12 = ln 2 ≈ 139 days. − ln 2; then t = 0.005 36. We want to find the best fit z = a + bx + cy to the given data; the data give the equations − 4c =

0

=

0

a + 4b − c =

1

a + b − 3c =

1

a a + 5b

a − b − 5c = −2, which in matrix form is

 1 1  1  1 1

0 5 4 1 −1

   0 −4    0 0    a     −1  b =  1 .  1  c −3 −2 −5

Thus we want the least squares approximation to Ax = b, where A is the 5 × 3 matrix on the left and b is the matrix on the right in the above equation. Since     2189 541 5 9 −13 − 433 15 15 15     86 AT A =  9 43 −2 ⇒ (AT A)−1 = − 433 , − 107 15 15 15  541 107 134 −13 −2 51 − 15 15 15 we get 

 T

x = (A A)

−1

T

A b=

2189 15  433 − 15 541 15

− 433 15 86 15 − 107 15



541 1 15 107   − 15   0 134 −4 15

1 5 0

 0     43 1 1 1  0   38    4 1 −1   1 = − 3  .   11 −1 −3 −5  1 3 −2

Thus the least squares approximation is y = 31 (43 − 8x + 11y). 37. We have A=

  1 1



 AT = 1

 1



    AT A = 2 , (AT A)−1 = 21 .

So the standard matrix of the orthogonal projection onto W is " # " i 1 1 h1i h T −1 T 2 P = A(A A) A = = 1 1 1 1 2 2 Then

" projW (v) = P v =

38. We have



 1 A= −2



 AT = 1

−2



1 2 1 2



1 2 1 2

1 2 1 2

# .

#" # " # 7 3 = 27 . 4 2     AT A = 5 , (AT A)−1 = 51 .

So the standard matrix of the orthogonal projection onto W is " # " h ih i 1 1 1 5 P = A(AT A)−1 AT = = 1 −2 − 25 −2 5

− 25 4 5

# .

586

CHAPTER 7. DISTANCE AND APPROXIMATION Then " projW (v) = P v =

#" # " # 1 − 15 = . 4 2 1 5 5

− 25

1 5 − 25

39. We have   1 A = 1 1



 AT = 1

 1

1

    AT A = 3 , (AT A)−1 = 13 .



So the standard matrix of the orthogonal projection onto W is   1 h ih   1 T −1 T P = A(A A) A = 1 3 1 1

 1 i 3 1 1 = 3

1

1 3 1 3 1 3

1 3



1 3 1 . 3 1 3

Then  projW (v) = P v =

 

1 3 1 3 1 3

1 3 1 3 1 3

1 1 3 1   3  2 1 3 3

  2   = 2 . 2

40. We have  2 A =  2 −1 



 AT = 2

 −1

2

    AT A = 9 , (AT A)−1 = 19 .



So the standard matrix of the orthogonal projection onto W is  2 h ih   P = A(AT A)−1 AT =  2 91 2 −1





i

2 −1 =

4 9  4  9 − 29

4 9 4 9 − 92

 − 92  − 92  . 1 9

Then 4 9  4  9 − 29

4 9 4 9 − 29

    4 − 92 1 9  4 2   − 9  0 =  9  . 1 0 − 29 9

 −1 1



AT A =

 projW (v) = P v =

41. We have 

1 A= 0 −1

 1 1 1



AT =

 1 1

0 1

 2 0

 1 0 , (AT A)−1 = 2 3 0

So the standard matrix of the orthogonal projection onto W is 

1  P = A(AT A)−1 AT =  0 −1

 1 "1  1 2 0 1

#" 0 1 1 1 3

0 1

 5 # 6 −1  1 = 3 1 − 16

Then  projW (v) = P v =

5 6  1  3 − 61

1 3 1 3 1 3

    5 1 6      1 =  13  . 3  0 5 0 − 16 6

− 16

1 3 1 3 1 3

− 16



1 . 3 5 6

0 1 3

 .

7.3. LEAST SQUARES APPROXIMATION

587

42. We have 

1 A = −2 1

 1 0 −1

 1 A = 1 T



 −2 1 0 −1



 1 0 T −1 , (A A) = 6 2 0

 6 A A= 0 T

So the standard matrix of the orthogonal projection onto W is    2 #" # 1 1 "1 3 1 −2 1 0    P = A(AT A)−1 AT = −2 = − 31 0 6 1 0 −1 0 2 1 1 −1 − 31 Then

 projW (v) = P v =

43. With the new basis, we have   " 1 1 1   T A = 1 3 ⇒ A = 1 0 1

1 3

2 3  1 − 3 − 31

# 0 1

2 3 − 13

 .

1 2

 − 31  − 31  . 2 3

    −1 1 − 13   1   − 3  2 =  0 . 2 1 3 3

− 13 2 3 − 13

" 2 A A= 4 T



− 13

0

# " 11 4 T −1 6 , (A A) = 11 − 32

So the standard matrix of the orthogonal projection onto W is    5 # #" 1 1 " 11 2 6   6 −3 1 1 0  1 T −1 T P = A(A A) A = 1 3 = 6 2 1 −3 1 3 1 3 0 1 − 13

1 6 5 6 1 3

− 13

− 23 1 3

# .



1 . 3 1 3

This agrees with the matrix found in Example 7.31. 44. (a) To see that P is symmetric, compute its transpose:   T −1 T T  T −1 T T −1 T T = AT A A A = A AT A A P T = A AT A A = A AT A

−1

AT = P,

−1 T recalling that B T = B −1 for any invertible matrix B. (b) To show that P is idempotent, we must show that P 2 = P P = P :   −1 T   −1 T  −1 T  T −1 T P P = A AT A A A AT A A = A AT A A A A A A −1 T = A AT A A = P. 45. We have A=

  1 2



 AT = 1

2





    AT A = 5 , (AT A)−1 = 15 .

So the pseudoinverse of A is A+ = (AT A)−1 AT = 46. We have

 1 A = −1 2

1  1 5

  2 = 15

2 5



.





 AT = 1

−1

 2



    AT A = 6 , (AT A)−1 = 16 .

−1

  2 = 16

So the pseudoinverse of A is A+ = (AT A)−1 AT =

1  1 6

− 61

1 3



.

588

CHAPTER 7. DISTANCE AND APPROXIMATION

47. We have  3  1 2



1  A = −1 0

" 1 A = 3

# −1 0 1 2

T



"

2 A A= 2 T



# " 7 2 T −1 12 , (A A) = 1 14 − 12

1 − 12

# .

1 12

So the pseudoinverse of A is " +

T

A = (A A)

−1

T

A =

#" 1 1 3 12

7 12 1 − 12

1 − 12

# " 1 −1 0 = 31 1 2 6

− 23

− 61

1 6

1 6

# .

48. We have  3  1 2

 1  A = 3 2

"

1 A = 3 T



# 2 2

3 1

" 14 A A= 10 T



# " 7 10 T −1 48 , (A A) = 5 14 − 48

5 − 48 7 48

# .

So the pseudoinverse of A is " A+ = (AT A)−1 AT =

#" 1 7 3 48

5 − 48

7 48 5 − 48

# " 2 − 61 = 1 2 3

3 1

1 3 − 16

1 12 1 12

# .

49. We have A=

 1 0

 1 1

AT =



 1 1

0 1



AT A =



 1 1

  1 2 , (AT A)−1 = 2 −1

 −1 . 1

So the pseudoinverse of A is +

T

A = (A A)

−1



 −1 1 1 1

2 A = −1 T

  0 1 = 1 0

 −1 . 1

Note that this is the inverse A−1 of A. 50. We have "

1 A= 3

# 2 4

"



# 3 4

1 AT = 2

" 10 AT A = 14



# " 14 5 , (AT A)−1 = 20 − 72

− 27

#

5 2

.

So the pseudoinverse of A is " A+ = (AT A)−1 AT =

5

#" 1 5 2 2

− 72

− 72

# " 3 −2 = 3 4 2

1

#

− 12

.

Note that this is the inverse A−1 of A. 51. We have  1 1  A= 0 1

0 0 1 1

  0 1  1  T  ⇒ A = 0 1 0 1

1 0 1

0 1 1

  1 3   T 1 ⇒ A A = 1 1 2

1 2 2

  2 2 3   2 , (AT A)−1 =  13 3 − 23

1 3 5 3 − 34

So the pseudoinverse of A is  A+ = (AT A)−1 AT =

2 3  1  3 − 23

1 3 5 3 − 34

 − 23 1 4  − 3  0 5 0 3

1 0 1

0 1 1

  2 1 3   1 1 =  3 1 − 32

0 −1 1

− 31 1 3 1 3



1 3 2 . 3 − 13

 − 32  − 34  . 5 3

7.3. LEAST SQUARES APPROXIMATION 52. We have

 1 0 A= 1 0

so that

2 1 1 0

589

  0 1 −1  ⇒ AT = 2 −2 0 2

 0 1 0 1 1 0 , −1 −2 2

  15 3 −2 7   T −1 6 −3 , (A A) = −1 1 −3 9 7



2  T A A= 3 −2

−1 2 3

0

1 7



 0 .

1 7

So the pseudoinverse of A is 

15 7

 A+ = (AT A)−1 AT = −1 1 7

−1 2 3

0

 1  0 2 1 0 7 1 7

  1 0 1 0 7  1 1 1 0 =  3 1 −1 −2 2 7

− 87 2 3 − 17

6 7 − 31 − 71

2 7



 0 .

2 7

53. (a) Since A is square with linearly independent columns, it is invertible, so that −1 T −1 T A+ = AT A A = A−1 AT A = A−1 . (b) Let the columns of A be qi ; then qi · qj = 0 if i 6= j and 1 if i = j. Then  T q1   ..   T A A =  .  q1 · · · qn = In , qTn

−1 T so that A+ = AT A A = In AT = AT . (Note that this makes sense, for if A were square, it would be an orthogonal matrix and then part (a) would hold and AT = A−1 .)  −1 T  −1 T  + A AA+ = AT A A A A = A+ . 54. A+ AA+ = AT A −1 T −1 T  55. We have A+ A = AT A A A = AT A A A = I, which is certainly a symmetric matrix.   −1 T −1 −1 56. (a) (cA)+ = cAT cA cAT = c12 AT A cAT = 1c AT A A = 1c AT . (b) By Exercise 53(a), if A is square, then A+ = A−1 . Also A−1 is square with linearly independent + −1 + + columns, so that A−1 = A−1 = A. Then(A+ ) = A−1 = A. (c) Since A is square, Exercise 53(a) applies to both A and AT , so that + −1 T T AT = AT = A−1 = A+ . 57. The fact that not all the points lie on the same vertical line means that there is at least one pair i, j such that xi 6= xj . Constructing the matrix A gives   1 x1  .. ..  . .    1 xi     ..  . A =  ...  .   1 xj    . ..   .. .  1 xn Since xi 6= xj , the columns are not multiples of one another, so must be linearly independent. Since the columns of A are linearly independent, the points have a unique least squares approximating line, by part (b) of the Least Squares Theorem.

590

CHAPTER 7. DISTANCE AND APPROXIMATION

58. Constructing the matrix A gives  1  A =  ...

x1 .. .

x21 .. .

1

xn

x2n

··· .. . ···

 xk1 ..  . . 

xkn

The solution(s) to Ax = b are the least squares approximating polynomials of degree k; by the Least Squares Theorem, there is a unique solution when the columns of A are linearly independent. Now suppose that a0 a1 + a1 a2 + · · · + ak ak+1 = 0. This means that a0 + a1 xi + · · · + ak xki = 0 for all i. But since at least k + 1 of the xi are distinct, the polynomial a0 + a1 x + · · · + ak xk has at least k + 1 distinct roots, so it must be the zero polynomial by Exercise 62 in Section 6.2. Thus all of the ai are zero, so the columns of A are linearly independent.

7.4

The Singular Value Decomposition

1. We have  2 A A= 0 T

0 3

 2 0

  0 4 = 3 0

 0 . 9

Then the singular values of A are the square roots of the eigenvalues of√this matrix which, √ since the matrix is diagonal, are 4 and 9. Thus the singular values of A are σ1 = 9 = 3 and σ2 = 4 = 2. 2. We have  2 A A= 1 T

 1 2 2 1

  1 5 = 2 4

 4 . 5

Then the singular values of A are the square roots of the eigenvalues of this matrix. 5 − λ 4 T = (5 − λ)2 − 16 = λ2 − 10λ + 9 = (λ − 9)(λ − 1), det(A A − λI) = 4 5 − λ so that the eigenvalues are 9 and 1. Thus the singular values of A are σ1 =



9 = 3 and σ2 =



3. We have AT A =

 1 1

 0 1 0 0

  1 1 = 0 1

 1 . 1

Then the singular values of A are the square roots of the eigenvalues of this matrix. 1 − λ 1 T det(A A − λI) = = (1 − λ)2 − 1 = λ2 − 2λ = λ(λ − 2), 1 1 − λ so that the singular values of A are σ1 =



2 and σ2 =



0 = 0.

4. We have AT A =

√

2 1

 √ 2 √0 2 0

  √1 = √2 2 2

√  2 . 3

Then the singular values of A are the square roots of the eigenvalues of this matrix. √ 2 − λ 2 T √ det(A A − λI) = = (2 − λ)(3 − λ) − 2 = λ2 − 5λ + 4 = (λ − 4)(λ − 1), 2 3 − λ so that the singular values of A are σ1 =



4 = 2 and σ2 =



1 = 1.

1 = 1.

7.4. THE SINGULAR VALUE DECOMPOSITION

591

5. We have  AT A = 3

4

   3   = 25 . 4

Then the singular values of A are the square roots of the eigenvalues of this matrix. det(AT A − λI) = 25 − λ = 25 − λ, so that the singular value of A is σ1 =



25 = 5.

6. We have AT A =

  3  3 4

 4 =



 9 12 . 12 16

Then the singular values of A are the square roots of the eigenvalues of this matrix. 9 − λ 12 T = (9 − λ)(16 − λ) − 144 = λ2 − 25λ = λ(λ − 25), det(A A − λI) = 12 16 − λ √

so that the singular values of A are σ1 =

25 = 5 and σ2 =



0 = 0.

7. We have  0 0 AT A = 0 3



0 −2  0 0 −2 

  0 4 3 = 0 0

 0 . 9

Then the singular values of A are the square roots of the eigenvalues of√this matrix. Since√it is diagonal, its eigenvalues are 9 and 4, so that the singular values of A are σ1 = 9 = 3 and σ2 = 4 = 2. 8. We have  0 T A A= 1

1 0

  0 2  1 −2 2

  1 5  0 = −4 −2

 −4 . 5

Then the singular values of A are the square roots of the eigenvalues of this matrix. 5 − λ −4 = (5 − λ)2 − 16 = λ2 − 10λ + 9 = (λ − 9)(λ − 1), det(AT A − λI) = −4 5 − λ so that the eigenvalues are 9 and 1. Thus the singular values of A are σ1 =



9 = 3 and σ2 =



1 = 1.

9. We have  2 AT A = 0 1

 0  2 2 0 0

0 2

  4 1 = 0 0 2

0 4 0

 2 0 1

Then the singular values of A are the square roots of the eigenvalues of this matrix. 4 − λ 0 2 4−λ 0 = (4 − λ) ((4 − λ)(1 − λ) − 4) = −λ(λ − 5)(λ − 4), det(AT A − λI) = 0 2 0 1 − λ so that √ the eigenvalues are 5, 4, and 0. Thus the singular values of A are σ1 = σ3 = 0 = 0. 10. We have  1 AT A = 0 1

 0 1 1 −3 0 0 0 1 1

  0 1 2 −3 0 = 0 0 1 2

0 9 0

 2 0 2



5, σ2 =



4 = 2, and

592

CHAPTER 7. DISTANCE AND APPROXIMATION Then the singular values of A are the square roots of the eigenvalues of this matrix. 2 − λ 0 2  9−λ 0 = (9 − λ) (2 − λ)2 − 4 = −(λ − 9)(λ − 4)λ det(AT A − λI) = 0 2 0 2 − λ so that the √eigenvalues are 9, 4, and 0. Thus the singular values of A are σ1 = and σ3 = 0 = 0.



9 = 3, σ2 =



4 = 2,

11. From Exercise 3, AT A =



1 1

 1 , 1

√ √ with eigenvalues 2 and 0, so that the singular values of A are σ1 = 2 and σ2 = 0 = 0. To find the corresponding eigenspaces, row-reduce:        T  −1 1 0 1 −1 0 1 A A − 2I 0 = → ⇒ x1 = 1 −1 0 0 0 0 1         T 1 1 0 1 1 0 1 A A − 0I 0 = → ⇒ x2 = . 1 1 0 0 0 0 −1 These vectors are already orthogonal since they correspond to distinct eigenvalues. Normalizing each of them gives " # " # " # h i √1 √1 √1 √1 2 2 2 2 v2 = , so that V = v1 v2 = 1 . v1 = 1 , √ √ − √12 − √12 2 2 The first column of the matrix U is " 1 1 1 u1 = Av1 = √ σ1 2 0

1 0

#"

√1 2 √1 2

# =

" # 1 0

.

We do not use v2 since it corresponds to an eigenvalue of zero. We must extend this to an orthonormal basis of R2 by adjoining a second vector, u2 , to get an orthogonal matrix. Clearly e2 is such a vector. Thus # "√ " #" # √1 1 0 2 0 √12 T 2 A = U ΣV = 0 1 0 0 √12 − √12 is a singular value decomposition of A. 12. We have

 1 A A= 1 T

 1 1 1 1

  1 2 = 1 2

 2 . 2

The characteristic polynomial of AT A is 2 − λ 2 T det(A A − λI) = = (2 − λ)2 − 4 = λ2 − 4λ = λ(λ − 4). 2 2 − λ Thus the eigenvalues are λ1 = 4 and λ2 = 0 (and so the singular values are σ1 = 2 and σ2 = 0). To find the corresponding eigenspaces, row-reduce:         T −2 2 0 1 −1 0 1 A A − 4I 0 = → ⇒ x1 = 2 −2 0 0 0 0 1        T  2 2 0 1 1 0 1 A A − 0I 0 = → ⇒ x2 = . 2 2 0 0 0 0 −1

7.4. THE SINGULAR VALUE DECOMPOSITION

593

To find the matrix V , we must normalize each of these vectors (they are already orthogonal since they correspond to distinct eigenvalues). This gives " v1 =

√1 2 √1 2

"

# ,

√1 2 − √12

v2 =

# ,

h V = v1

so that

"

i

v2 =

√1 2 √1 2

√1 2 − √12

# .

The first column of the matrix U is " 1 1 1 u1 = Av1 = σ1 2 1

#" # 1 √12 √1 2

1

" =

√1 2 √1 2

# .

We do not use v2 since it corresponds to an eigenvalue of zero. We must extend this to an orthonormal basis of R2 by adjoining a second " vector, # u2 , to get an orthogonal matrix. By looking at the matrix √1 2 − √12

V , we see that such a vector is

. Thus "

A = U ΣV

T

=

√1 2 √1 2

√1 2 − √12

#" 2 0

#" 0 √12 √1 2

0

√1 2 − √12

#

is a singular value decomposition of A. 13. We have 

0 A A= −2 T

 −3 0 0 −3

  −2 9 = 0 0

 0 . 4

This matrix is diagonal, so its eigenvalues are √ which are λ1 = 9 and λ2 = 4. √ its diagonal entries, Therefore the singular values of A are σ1 = 9 = 3 and σ2 = 4 = 2. To find the corresponding eigenspaces, row-reduce:  T A A − 9I  T A A − 4I

       0 0 0 0 1 0 1 0 = → ⇒ x1 = 0 −5 0 0 0 0 0        4 0 0 1 0 0 0 0 = → ⇒ x2 = . 0 0 0 0 0 0 1

These vectors are already orthonormal, so we set v1 = x1 ,

v2 = x2 ,

so that

 V = v1

  1 v2 = 0

Then U is found by " 0 1 1 u1 = Av1 = σ1 3 −3 " 0 1 1 u2 = Av2 = σ2 2 −3

−2

#" # 1

0 0 #" # −2 0 0

1

" =

=

# 0

−1 " # −1 0

Thus A = U ΣV T = is a singular value decomposition of A.



0 −1

 −1 3 0 0

 0 1 2 0

0 1



.

 0 . 1

594

CHAPTER 7. DISTANCE AND APPROXIMATION

14. We have AT A =



 1 1 1 1

1 −1

  −1 2 = 1 0

 0 . 2

This matrix is diagonal, so its eigenvalues are its diagonal entries. So the only eigenvalue is 2 and the √ singular values of A are σ1 = σ2 = 2. To find the corresponding eigenspace, row-reduce:         T 0 0 0 1 0 A A − 2I 0 = ⇒ x1 = , x2 = . 0 0 0 0 1 These vectors are already orthonormal, so we set v1 = x1 ,

v2 = x2 ,

 V = v1

so that

v2



 1 = 0

 0 . 1

Then U is found by " 1 1 1 u1 = Av1 = √ σ1 2 1 " 1 1 1 u2 = Av2 = √ σ2 2 1

#" # −1 1 1 0 #" # −1 0

Thus " A = U ΣV

T

=

√1 2 √1 2

1

1

− √12

# "√

2 0

√1 2

#

"

√1 2 √1 2

"

− √12

=

=

√1 2

#" 0 1 √ 2 0

# .

# 0 1

is a singular value decomposition of A. 15. From Exercise 5,   AT A = 25 , so that the only eigenvalue is 25 and therefore the only singular value is σ1 = corresponding eigenspace, row-reduce:  T A A − 25I

  0 = 0



25 = 5. To find the

   0 ⇒ x1 = 1 .

This single vector forms an orthonormal set; let v1 = x1 , so that     V = v1 = 1 . Then U is found by " # " # 3 1 1 3 h i u1 = Av1 = 1 = 54 . σ1 5 4 5 " We extend this to an orthonormal basis for R2 by adjoining the orthogonal unit vector

− 45 3 5

# . Since

Σ should be 2 × 1 and there is only one singular value, we adjoin a zero to the singular value. Thus " #" # 3 4 − 5 h i 5 A = U ΣV T = 54 1 3 0 5 5 is a singular value decomposition of A.

7.4. THE SINGULAR VALUE DECOMPOSITION

595

16. From Exercise 6, AT A =



9 12

 12 , 16

√ √ with eigenvalues 25 and 0, so that the singular values of A are σ1 = 25 = 5 and σ2 = 0 = 0. To find the corresponding eigenspaces, row-reduce:        T  3 −16 12 0 1 − 34 0 A A − 25I 0 = → ⇒ x1 = 12 −9 0 4 0 0 0       4  T  9 12 0 −4 1 3 0 A A − 0I 0 = → ⇒ x2 = . 12 16 0 3 0 0 0 To find the matrix V , we must normalize each of these vectors (they are already orthogonal since they correspond to distinct eigenvalues). This gives " # " # " # 3 3 h i − 45 − 45 5 5 v2 = , so that V = v1 v2 = 4 . v1 = 4 , 3 3 5

5

5

5

The first column of the matrix U is 1 1h u1 = Av1 = 3 σ1 5

i 4

" # 3 5 4 5

h i = 1 .

Then U is the 1 × 1 matrix containing 1 as its entry. Since Σ should be 1 × 2 and there is only one nonzero singular value, we adjoin a zero to the singular value, so that " # 3 4 h ih i T 5 5 A = U ΣV = 1 5 0 − 45 35 is a singular value decomposition of A. 17. From Exercise 7, AT A =



4 0

 0 , 9

with eigenvalues 9 and 4, so with singular values 3 and 2. To find the corresponding eigenspaces, row-reduce:         T −5 0 0 1 0 0 0 A A − 9I 0 = → ⇒ x1 = 0 0 0 0 0 0 1        T  0 0 0 0 1 0 1 A A − 4I 0 = → ⇒ x2 = . 0 5 0 0 0 0 0 These vectors are already orthonormal, so we set v1 = x1 ,

v2 = x2 ,

so that

 V = v1

v2



 0 = 1

Then U is found by  0 1 1 0 u1 = Av1 = σ1 3 −2  0 1 1 0 u2 = Av2 = σ2 2 −2

   0   0 0 3 = 1 1 0 0      0 0 1 3 =  0 . 0 0 −1

 1 . 0

596

CHAPTER 7. DISTANCE AND APPROXIMATION   1 We must extend u1 and u2 to an orthonormal basis for R3 . Clearly u3 = e1 = 0 works. Then U is 0 the matrix whose columns are the ui . Now, Σ should be 3 × 2, so we add a row of zeros at the bottom, giving     0 0 1 3 0  0 1 0 0 0 2 A = U ΣV T = 1 1 0 0 −1 0 0 0 for a singular value decomposition of A.

18. From Exercise 8,  5 −4 , −4 5 √ √ with eigenvalues 9 and 1 and singular values σ1 = 9 = 3 and σ2 = 1 = 1. To find the corresponding eigenspaces, row-reduce:        T  −4 −4 0 1 1 0 1 A A − 9I 0 = → ⇒ x1 = −4 −4 0 0 0 0 −1         T 4 −4 0 1 −1 0 1 A A−I 0 = → ⇒ x2 = . −4 4 0 0 0 0 1 AT A =



These vectors are already orthogonal, but we must normalize them: " # " # h √1 √1 x2 x1 2 2 , = , v = = so that V = v1 = v1 2 kx1 k kx2 k √1 − √1 2

2

i

v2 =

"

√1 2 − √12

√1 2 √1 2

#

Then U is found by  0 1 1 u1 = Av1 =  1 σ1 3 2  0 1 1 Av2 =  u2 = 1 σ2 1 2

   1 − 3√ 1 " 1 # 2 √   1  2   0  − √1 =  3√2  4 2 √ −2 3 2    √1 1 " 1 #  √  2 2  1  0  1 = √  . √

−2

2

2

0

3 We must √extend u1 and √u2 to an orthonormal basis for R . To keep computations simpler, we multiply u1 by 3 2 and v1 by 2. Then adjoin e1 , e2 , and e3 to these vectors and row-reduce:     1 −1 1 1 0 0 1 0 0 0 4     1 − 14  .  1 1 0 1 0 → 0 1 0 1 4 0 0 0 1 0 0 1 −1 2

Then a basis for the column space, which is R3 , is given by the columns of the original matrix corresponding to leading 1’s, which is u1 , u2 , and x3 = e1 . Finally, we must turn this into an orthonormal basis:             4 −1 1 1 −1 1 9 −1 · 1 + 1 · 0 + 4 · 0 1 · 1 + 1 · 0 + 0 · 0 1 1             x03 = x3 −  1 − 1 = 0 +  1 − 1 = − 49  . −1 · (−1) + 1 · 1 + 4 · 4 1·1+1·1+0·0 18 2 2 4 0 0 4 0 9 Normalizing this vector gives  u3 =

x03 kx3 k

=



2 3  2 − 3  . 1 3

7.4. THE SINGULAR VALUE DECOMPOSITION

597

Then U is the matrix whose columns are the ui . Now, Σ should be 3 × 2, so we add a row of zeros at the bottom, giving    1 2 √1 " # − 3√ 3 0 3 2 √1 √1   12   − T 2  1 2 2  A = U ΣV =   3√2 √2 − 3  0 1 √1 √1 1 4 2 2 √ 0 0 0 3 3 2 for a singular value decomposition of A. 19. From Exercise 9,  4 AT A = 0 2

 2 0 , 1

0 4 0

√ with eigenvalues 5, 4, and 0, so the singular values are σ1 = 5, σ2 = 2, and σ3 = 0. To find the corresponding eigenspaces, row-reduce:       1 0 −2 0 2 −1 0 2 0  T  0 0 → 0 1 0 0 ⇒ x1 = 0 A A − 5I 0 =  0 −1 2 0 −4 0 0 0 0 0 1       0 0 2 0 1 0 0 0 0   T 0 0 → 0 0 1 0 ⇒ x2 = 1 A A − 4I 0 = 0 0 2 0 −3 0 0 0 0 0 0       1 4 0 2 0 −1 1 0 2 0  T  A A − 0I 0 = 0 4 0 0 → 0 1 0 0 ⇒ x3 =  0 . 2 2 0 1 0 0 0 0 0 These vectors are already orthogonal, but we must normalize them:       √2 √1 − 0 5  5    x3 x2 x1 , v3 = , v2 = v1 = = = = 0 1 0    ,    kx1 k kx2 k kx3 k 1 2 √ √ 0 5 5 so that

 √2 

V = v1

v2

5



v3 =  0

√1 5

 0 − √15 1 0  √2 0 5

Then U is found by "

u1 =

1 1 2 Av1 = √ σ1 5 0

" 1 1 2 u2 = Av2 = σ2 2 0

0 2

0 2

  # √2 " # 5 1   0 = 1 0   0 √1 5

  " # # 0  1  1 = 0 . 0   1 0

We do not use v3 since it corresponds to a zero singular value. Then u1 and u2 already form a basis for R2 . Now, Σ should be 3 × 2, so we add a column of zeros on the right, giving   " # "√ # √2 √1 0 5 5 1 0 5 0 0   0 A = U ΣV T = 1 0   0 1 0 2 0 − √15 0 √25 for a singular value decomposition of A.

598

CHAPTER 7. DISTANCE AND APPROXIMATION

20. We have

     1 1  2 2 2 1 1 1 AT A = 1 1 = 2 2 2 1 1 1 1 1 2 2 2 Then the singular values of A are the square roots of the eigenvalues of this matrix. 2 − λ 2 2 2−λ 2 = −λ2 (λ − 6), det(AT A − λI) = 2 2 2 2 − λ √ √ so that the eigenvalues are 6 and 0. Thus the singular values of A are σ1 = 6 and σ2 = 0 = 0. To find the corresponding eigenspaces, row-reduce:       1 0 −1 0 1 −4 2 2 0  T  2 0 → 0 1 −1 0 ⇒ x1 = 1 A A − 6I 0 =  2 −4 2 2 −4 0 0 0 0 0 1         1 1 1 0 1 1 2 2 2 0   T A A − 0I 0 = 2 2 2 0 → 0 0 0 0 ⇒ x2 = −1 , x3 =  0 . 2 2 2 0 0 0 0 0 0 −1 We must orthogonalize x2 and x3 . Use Gram-Schmidt:       1 1 1 2 x2 · x3   1    1 0 x3 = x3 − x2 =  0 − −1 =  2  x2 · x2 2 0 −1 −1 Multiply through by 2 to clear fractions, giving 

 1 x003 =  1 . −2 Finally, we normalize the vectors:   v1 =

x1 = kx1 k

√1  13  √  ,  3 √1 3

 v2 =

h V = v1



√1 2

  x2 = 0  ,  kx2 k 1 − √2 

v2

i

v3 =

√1  13 √  3 √1 3



 v3 =

√1 2

0 − √12

x003 = kx003 k

√1  16   √   6 − √26





√1 6 √1  . 6 − √26

Then U is found by "

u1 =

1 1 1 Av1 = √ σ1 6 1

1 1

  # √1 " # 3 √1 1  1 2 . √  = √1 1  3 √1 3

2

We do not use v2 or v3 since they correspond to a zero singular " value. # Then we extend u1 to an − √12 orthonormal basis of R2 by adding the orthogonal unit vector u2 = . Finally, Σ must be 2 × 3, √1 2

so we adjoin zeros as required: " A = U ΣV T =

√1 2 √1 2

for a singular value decomposition of A.

− √12 √1 2

# "√

6

0

 # √1 3 0 0   √1 0 0  12 √

6

√1 3

0 √1 6



√1 3 − √12   − √26

7.4. THE SINGULAR VALUE DECOMPOSITION

599

21. Using the singular values and the matrices U and V with columns ui and vi from Exercise 11, we get    h i i √ 1 h 1 0 √1 1 T T √ √ − √12 . A = σ1 u1 v1 + σ2 u2 v2 = 2 +0 2 2 2 0 1 22. Using the singular values and the matrices U and V with columns ui and vi from Exercise 14, we get " # " # i √ − √1 h i √ √1 h T T 2 2 A = σ1 u1 v1 + σ2 u2 v2 = 2 1 . 1 0 + 2 0 1 1 √

2



2

23. Using the singular values and the matrices U and V with columns ui and vi from Exercise 17, we get     0 0 h i i  h   T T    . A = σ1 u1 v1 + σ2 u2 v2 = 3 1 0 1 + 2  0 1 0  −1 0 24. Using the singular values and the matrices U and V with columns ui and vi from Exercise 19, we get " # " # i i √ 1 h 0 h T T 2 1 √ √ A = σ1 u1 v1 + σ2 u2 v2 = 5 + 2 . 0 0 1 0 5 5 0 1 25. For example, in Exercise 11, we extended u1 to a basis for R2 by adjoining e2 ; we could instead have adjoined −e2 , giving a different SVD: # "√ #" " # √1 1 0 2 0 √12 T 2 . A = U ΣV = 0 −1 0 0 √12 − √12     −1 0 In Exercise 14, we could have used x1 = and x2 = as eigenvectors, which would change 0 −1 both U and V and give a different SVD: # "√ #" # " √1 − √12 2 0 −1 0 T 2 √ . A = U ΣV = − √12 − √12 0 2 0 −1 T 26. (a) If A is symmetric, then A√ A = AA = A2 . If λ is any eigenvalue of A, then λ2 is an eigenvalue of 2 T A = A A and therefore λ2 = |λ| is a singular value of A.

(b) If A is positive definite, then A is symmetric with positive eigenvalues, so that in part (a), |λ| = λ. Thus the singular values of A are just the eigenvalues of A. 27. (a) We want to show that A = U ΣV T is an orthogonal diagonalization of A. Since A is positive definite, its singular values are its eigenvalues, by Exercise 26(b), so that Σ is a diagonal matrix D whose entries are the eigenvalues of A. It remains to show that U and V T are orthogonal and that U = V = Q where Q is orthogonal, for then A = QDQT , so that we indeed have an orthogonal diagonalization of A. Now, 1 1 ui = Avi = Avi . σi λi But vi is an eigenvector corresponding to λi , so continuing, 1 1 Avi = λi vi = vi . λi λi Thus the columns of U and V are the same, so that U = V . Further, V is an orthogonal matrix by construction, since its entries are orthonormal eigenvectors for A.

600

CHAPTER 7. DISTANCE AND APPROXIMATION (b) From part (a), U = V = Q, an orthogonal matrix, and by Exercise 26, the singular values of A are the eigenvalues of A. Then the outer product form of the SVD is A = σ1 u1 v1T + · · · + σn un vnT = λ1 q1 qT1 + · · · + λn qn qTn , which is the spectral decomposition of A since the qi are a set of orthonormal eigenvectors corresponding to the λi .

28. Since U and V are both orthogonal, we know that U −1 = U T and V −1 = V T . Then A = U ΣV T means that U T AV = U T U ΣV T V = U −1 U ΣV −1 V = Σ. Since U T , A, and V are all invertible, their product, Σ, is also invertible. Then since all the matrices are invertible, −1 −1 −1 −1 T A−1 = U ΣV T = VT Σ U = V T Σ−1 U T = V Σ−1 U T , showing that V Σ−1 U T is an SVD of A−1 since Σ−1 is also a diagonal matrix and V and U T are orthogonal matrices.   T T 29. We have AAT = U ΣV T U ΣV = U Σ(V T V )ΣU T = U Σ2 U T , so that (since U and U T are  −1 T orthogonal matrices) U AA U = U T AAT U = Σ2 , which is a diagonal matrix. Therefore U T diagonalizes AA , so by Theorem 4.23 in Section 4.4, it follows that the columns of U are eigenvectors of AAT . 30. To show that A and AT have the same singular values, we must show that AT A and AAT have the same eigenvalues. But from Exercise 29, AAT = U Σ2 U T ; a similar string of equalities to that in Exercise 29 gives AT A = V Σ2 V T . Since the diagonal matrix in both cases gives the eigenvalues, and those diagonal matrices are the same, AAT and AT A have the same eigenvalues. 31. To show that A and QA have the same singular values, it suffices to show that AT A and (QA)T QA have the same eigenvalues. But (QA)T QA = AT QT QA = AT A since Q is orthogonal. So in fact the two matrices are equal so they must have the same eigenvalues. Thus A and QA have the same singular values. 32. We know that {v1 , . . . , vr } is an orthonormal set, so it is linearly independent. Next, since A = U ΣV T , T we get AT = U ΣV T = V ΣT U T , so that V ΣT = AT U . But this means that vi σi = AT ui , so 1 T that vi = σi A ui , so that vi ∈ row(A). It remains only to show that dim row(A) = r, for then {v1 , . . . , vr } is an r-element linearly independent set in row(A), so it must form a basis. But by part (e), {vr+1 , . . . , vn } forms a basis for (A), which therefore has dimension n − r. By Theorem 5.10, ⊥ (row(A)) = (A), so that dim row(A) = n − dim (A) = n − (n − r) = r. 33. Since rank(A) = 1 but A is a √ map into R2 , the image of the unit circle is a solid ellipsoid. The only nonzero singular value of A is 2, so the equation of the ellipsoid is 2  y √1 ≤ 1, or y12 ≤ 2. 2 √ √ This is the interval [− 2, 2] along the y1 -axis. 34. Since rank(A) = 2 but A is a map into R3 , the image of the unit circle is a solid ellipsoid. The nonzero singular values of A are σ1 = 3 and σ2 = 2, so the equation of the ellipsoid is  2  2 y1 y2 y2 y12 + ≤ 1, or + 2 ≤ 1. σ1 σ2 9 4 This is an ellipse in R2 with semimajor axis 3 and semiminor axis 2 together with its interior.

7.4. THE SINGULAR VALUE DECOMPOSITION

601

35. Since rank(A) = 2 and A is a √ map into R2 , the image of the unit sphere is an ellipsoid. The nonzero singular values of A are σ1 = 5 and σ2 = 2, so the equation of the ellipsoid is  2  2 y1 y2 y12 y2 + = 1, or + 2 = 1. σ1 σ2 5 4 √ This is an ellipse in R2 with semimajor axis 5 and semiminor axis 2. 36. Since rank(A) = 2 but A is a map into R3 , the image of the unit sphere is a solid ellipsoid. The nonzero singular values of A are σ1 = 3 and σ2 = 2, so the equation of the ellipsoid is  2  2 y2 y12 y1 y2 + ≤ 1, or + 2 ≤ 1. σ1 σ2 9 4 This is an ellipse in R2 with semimajor axis 3 and semiminor axis 2 lying in the y1 y2 -plane, together with its interior in that plane. √ √ 37. (a) From example 7.37, kAk2 is the largest singular value, 2. Thus kAk2 = 2. (b) Since A is not invertible, cond2 (A) = ∞. 38. (a) From example 7.37, kAk2 is the largest singular value, 3. Thus kAk2 = 3. (b) Since A is not invertible, cond2 (A) = ∞. 39. We first compute the singular values of A:   1 1 1 T A A= 0.9 1 1

  0.9 2 = 1 1.9

 1.9 . 1.81

Then the characteristic polynomial for AT A is 2 − λ 1.9 = (2 − λ)(1.81 − λ) − 3.61 = λ2 − 3.81x + 0.01. 1.9 1.81 − λ Solving using a CAS √ gives λ1 ≈ 3.807 and λ2 ≈ 0.00263, so that the singular values are σ1 = 1.951 and σ2 = 0.00263 ≈ 0.0512.



3.807 ≈

(a) From Example 7.37, kAk2 = σ1 ≈ 1.95. (b) cond2 (A) =

σ1 σ2

≈ 38.1.

40. We first compute the singular values of A:   10 100  10 AT A = 10 100 100 0 1

 10100 10 0 = 10100 100 1 100 

10100 10100 100

 100 100 . 1

Using technology, the characteristic polynomial is −λ3 + 20201λ2 − 200λ, with roots λ1 ≈ 20201, λ2 ≈ 0.0099, and λ3 = 0, so that the singular values are σ1 ≈ 142.13, σ2 ≈ 0.0995, and σ3 = 0. (a) From Example 7.37, kAk2 = σ1 ≈ 142.13. (b) Since A is not invertible, cond2 (A) = ∞. 41. From Exercise 11, " A = U ΣV

T

=

1

# "√ 0 2

#" 0 √12

0

1

0

0

√1 2

√1 2 − √12

#

is a singular value decomposition of the matrix A from Exercise 3. So " #" #" # " 1 √1 √1 √1 0 1 0 + + T 2 2 2 2 A =VΣ U = 1 = 1 √ − √12 0 0 0 1 2 2

0 0

# .

602

CHAPTER 7. DISTANCE AND APPROXIMATION

42. From Exercise 18, A = U ΣV T

 1 − √  312 =  3√2

√1 2 √1 2

4 √ 3 2

0

 3    0 0 0 0 1

 0 " 1  √ 2 1  √1 2 0

− √12

#

√1 2

is a singular value decomposition of the matrix A from Exercise 8. So   1 4 1 #" " " # − √ √ √ 3 2 3 2 3 2 1 4 √1 √1  0 0 3 2 2 = 9  √1 √1 A + = V Σ+ U T = 0  5 2 2 − √12 √12 0 1 0  9 2 2 1 − 3 3 3

5 9 4 9

2 9 − 29

# .

43. From Exercise 19, A = U ΣV T =

" 1

0

0

1

# "√

 5

0

0

#

√2 5

0   0 2 0  1 − √5

0 1 0



√1 5

0 

√2 5

for a singular value decomposition of the matrix A from Exercise 9. So     2 # " √2 √1 √1 0 − 0 5  5  5 5  1 0 + + T 1  A =VΣ U = = 0  0 1  0 0 2 0 1 2 1 1 √ √ 0 0 0 5 5 5 44. We have

 1 AT A = 0 1

 0 1 1 −3 0 0 0 1 1

  0 1 2 −3 0 = 0 0 1 2

0 9 0

0

 

1 . 2

0

 2 0 . 2

Then the characteristic polynomial of AT A is 2 − λ 0 2  0 9−λ 0 = (9 − λ) (2 − λ)2 − 4 = (9 − λ)(λ2 − 4λ) = −λ(λ − 9)(λ − 4). 2 0 2 − λ Thus the eigenvalues are λ1 = 9, λ2 = 4, and λ3 = 0, so that the singular values are σ1 = 3, σ2 = 2, and σ3 = 0. To find the corresponding eigenspaces, row-reduce:       −7 0 2 0 1 0 0 0 0  T  0 0 → 0 0 1 0 ⇒ x1 = 1 A A − 9I 0 =  0 0 2 0 −7 0 0 0 0 0 0       −2 0 2 0 1 0 −1 0 1  T  0 0 → 0 1 0 0 ⇒ x2 = 0 A A − 4I 0 =  0 5 2 0 −2 0 0 0 0 0 1       2 0 2 0 1 0 1 0 1  T  A A − 0I 0 = 0 9 0 0 → 0 1 0 0 ⇒ x3 =  0 . 2 0 2 0 0 0 0 0 −1 These vectors are already orthogonal, but we must normalize them:       √1 √1 0 2    2  x1 x2 x3 v1 = = , v2 = = , v3 = = 1 0 0      , kx1 k kx2 k kx3 k 1 1 √ 0 − √2 2

7.4. THE SINGULAR VALUE DECOMPOSITION

603

so that h V = v1

v2

 i 0 v3 =  1 0



√1 2

√1 2

0 .

0

− √12

√1 2

Then U is found by  1 1 1  u1 = Av1 = 0 σ1 3 1  1 1 1  u2 = Av2 = 0 σ2 2 1

    0 1 0         −3 0 1 = −1  0 0 1 0     √1 √1 0 1   2  2     −3 0  0  =  0 . 0

0

√1 2

1

√1 2

We must extend u1 and u2 to an orthonormal basis for R3 ; clearly   √1

2  u3 =  0   1 − √2

works. Then U is the matrix whose columns are the ui . appropriate, giving   √1 0 √12 3 2     A = U ΣV T =  0 0 0 −1 0 0 √12 − √12 for a singular value decomposition of A. Then   1 √1 0 √12 2 3   A + = V Σ+ U T =  0 0  0 1 0 √12 − √12 0



Now, Σ should be 3 × 3, so we add zeros as

0 2 0

 0 0  1 √ 0  2 √1 0 2

0

0

0

−1

1 2

 1 √ 0  2 √1 0 2

0

0

0

1 0 0

0



√1  2 − √12



0



√1  2 − √12





1 4

= 0

0 − 13

1 4

0



1 4

0 .

1 4

45. We have AT A =

 1 2

 2 1 4 2

   2 5 10 = . 4 10 20

Then the characteristic polynomial of AT A is 5 − λ 10 = (5 − λ)(20 − λ) − 100 = λ2 − 25λ = λ(λ − 25). 10 20 − λ Thus the eigenvalues are λ1 = 25 and λ2 = 0, so that the singular values are σ1 = 5 and σ2 = 0. To find the corresponding eigenspaces, row-reduce:        T  −20 10 0 1 1 − 12 0 A A − 25I 0 = → ⇒ x1 = 10 −5 0 2 0 0 0        T  5 10 0 1 2 0 −2 A A − 0I 0 = → ⇒ x2 = . 10 20 0 0 0 0 1

604

CHAPTER 7. DISTANCE AND APPROXIMATION These vectors are already orthogonal, but we must normalize them: x1 v1 = = kx1 k

"

√1 5 √2 5

#

" # − √25 x2 v2 = = , kx2 k √1

,

h V = v1

so that

"

i

v2 =

5

√1 5 √2 5

− √25 √1 5

# .

Then U is found by " 1 1 1 u1 = Av1 = σ1 5 2

#" # 2 √15 √2 5

4

" =

√1 5 √2 5

# .

We must extend u1 to an orthonormal basis for R3 ; clearly u2 =

# " − √25 √1 5

works. Then U is the matrix whose columns are the ui . Now, Σ should be 2 × 2, so we add zeros as appropriate, giving " U=

√1 5 √2 5

− √25

#

" ,

√1 5

Σ=

5

# 0

0

0

1 5

0

0

0

" ,

V =

√1 5 √2 5

− √25 √1 5

# .

for a singular value decomposition of A. Then " +

+

A =VΣ U

T

=

√1 5 √2 5

− √25 √1 5

so that

" +

x=A b= 46. We have

 3 AT A = 0 0

#"

1 25 2 25

 0  3 0 0 2

2 25 4 25

0 0

#"

√1 5 − √25

√2 5 √1 5

#

" =

1 25 2 25

2 25 4 25

# ,

#" # " # 13 3 = 25 . 26 5 25   9 0 = 0 2 0

0 0 0

 0 0 . 4

Since this matrix is diagonal, we can read off its eigenvalues, which are λ1 = 9, λ2 = 4, and λ3 = 0, so that the singular values are σ1 = 3, σ2 = 2 and σ3 = 0. To find the corresponding eigenspaces, row-reduce:       0 0 0 0 0 1 0 0 1  T  0 0 → 0 0 1 0 ⇒ x1 = 0 A A − 9I 0 = 0 −9 0 0 −5 0 0 0 0 0 0       5 0 0 0 1 0 0 0 0  T  A A − 4I 0 = 0 −4 0 0 → 0 1 0 0 ⇒ x2 = 0 0 0 0 0 0 0 0 0 1       9 0 0 0 1 0 0 0 0  T  A A − 0I 0 = 0 0 0 0 → 0 0 1 0 ⇒ x3 = 1 . 0 0 4 0 0 0 0 0 0 These vectors are already orthonormal, so setting vi = xi , we get   1 0 0   V = v1 v2 v3 = 0 0 1 . 0 1 0

7.4. THE SINGULAR VALUE DECOMPOSITION

605

Then U is found by "

u1 =

1 1 3 Av1 = σ1 3 0

0 0

  " # # 1  0  0 = 1 , 2   0 0

"

u2 =

1 3 1 Av2 = σ2 2 0

  " # # 0  0  0 = 0 . 2   1 1

0 0

These vectors form an orthonormal basis for R2 , so that U is the matrix whose columns are the ui . Now, Σ should be 2 × 3, so we add a column of zeros on the right, giving   " " # # 1 0 0   1 0 3 0 0 U= , Σ= , V = 0 0 1   0 1 0 2 0 0 1 0 for a singular value decomposition of A. Then   1 1 0 0  3  + + T  A =VΣ U = 0 0 1  0 0 0 1 0 so that



1 3

 x = A+ b =  0 0 47. We have 

1 AT A = 1

1 1

 0 "  1 1 2 0 0

 # 0 1

1 3

0

0

1 2

= 0



 0 ,

   0 " # 1  3   = 0 . 0 0 1 0 2

  1 1  1 1 1

  1 3  1 = 3 1

 3 . 3

The characteristic polynomial of AT A is 3 − λ 3 = (3 − λ)2 − 9 = λ(λ − 6), 3 3 − λ so that its eigenvalues are λ1 = 6 and λ2 = 0. Therefore the singular values of A are σ1 = σ2 = 0. To find the corresponding eigenspaces, row-reduce:        T  −3 3 0 1 −1 0 1 A A − 6I 0 = → ⇒ x1 = 3 −3 0 0 0 0 1        T  3 3 0 1 1 0 1 A A − 0I 0 = → ⇒ x2 = . 3 3 0 0 0 0 −1 These vectors are already orthogonal, but we must normalize them: " # " # √1 √1 x1 x2 2 2 v1 = = 1 , v2 = = , kx1 k kx2 k √ − √1 2

2

so that h V = v1

i

v2 =

"

√1 2 √1 2

√1 2 − √12

# .

Then U is found by  1 1 1  u1 = Av1 = √  1 σ1 6 1

   √1 1 " 1 #  √  13  2   1  √1 =  √3  . 2 √1 1 3



6 and

606

CHAPTER 7. DISTANCE AND APPROXIMATION We must extend this to an orthonormal basis for R3 . Clearly adjoining e1 and e2 forms   a basis; we 1 √ must orthonormalize it. Multiplying u1 through by 3, work instead with u01 = 1 to simplify 1 computations. Then       2 1 1 3   1    1 0 = 0 − 1 = − 3  u2 = e1 − 3 0 1 − 13               1 1 2 0 0 0 1 3 3 3 e2 · u01 0 e2 · u02 0   1   1/3  1     1   1   1  0 u3 = e2 − 0 − − u − u = + = + = . 1 − 1 − 1              3 3 6 2  u1 · u01 1 u02 · u02 2 3 2/3 1 1 1 1 0 −3 0 −6 −2 1 3 e1 · u01 0 u u01 · u01 1

Normalizing these vectors gives  u2 =

u02 = ku02 k





√2  16  − √  ,  6 1 √ − 6

u3 =

0



 1  u03 √ . = 0 ku3 k  2  − √12

These vectors form an orthonormal basis for R2 , so that U is the matrix whose columns are the ui . Now, Σ should be 3 × 2, so we add two rows of zeros on the bottom, giving  U=

√1  13 √  3 √1 3

√2 6 − √16 − √16

0

√ 6   Σ= 0 0

 

√1  , 2 − √12

 0  0 , 0

" V =

√1 2 √1 2

√1 2 − √12

#

for a singular value decomposition of A. Then " +

+

A =VΣ U

T

=

√1 2 √1 2

√1 2 − √12

#"

 # √1 0 0  23 √ 0 0  6 0

√1 6

0



√1 3 − √16 √1 2

√1 3 − √16   − √12

" =

1 6 1 6

1 6 1 6

1 6 1 6

# ,

so that " x = A+ b =

1 6 1 6

1 6 1 6

1 6 1 6

  # 1 " # 1   . 2 = 1 3

48. We have  1 AT A = 0 1

0 1 0

 1 1 0 0 1 1

0 1 0

  1 2 0 = 0 1 2

0 1 0

 2 0 . 2

The characteristic polynomial of AT A is 2 − λ 0 2

0 2  1−λ 0 = (1 − λ) (2 − λ)2 − 4 = −λ(λ − 1)(λ − 4), 0 2 − λ

so that its eigenvalues are λ1 = 4, λ2 = 1, and λ3 = 0. Therefore the singular values of A are σ1 = 2,

7.4. THE SINGULAR VALUE DECOMPOSITION

607

σ2 = 1 and σ3 = 0. To find the corresponding eigenspaces, row-reduce:       1 0 −1 0 1 −2 0 2 0  T  0 0 → 0 1 0 0 ⇒ x1 = 0 A A − 4I 0 =  0 −3 2 0 −2 0 0 0 0 0 1       1 0 2 0 1 0 0 0 0   T A A − 1I 0 = 0 0 0 0 → 0 0 1 0 ⇒ x2 = 1 2 0 1 0 0 0 0 0 0       2 0 2 0 1 0 1 0 1   T A A − 0I 0 = 0 1 0 0 → 0 1 0 0 ⇒ x3 =  0 . 2 0 2 0 0 0 0 0 −1 These vectors are already orthogonal, but we must normalize them:       √1 √1 0 2  2    x1  , v2 = x2 = 1 , v3 = x3 =  = v1 = 0 0 , kx1 k   kx2 k   kx3 k  1 √1 √ 0 − 2 2 so that

 √1  V = v1

2



v2

v3 =  0 √1 2

0 1 0

 √1 2 0 .

− √12

Then U is found by  1 1 1 u1 = Av1 =  0 σ1 2 1

0 1 0

    √1 √1 1   2  2     0  0  =  0 , √1 √1 1 2 2

 1 1 1 u2 = Av2 =  0 σ2 1 1

0 1 0

    0 0 1         0  1 = 1 0 0 1

We must extend this to an orthonormal basis for R3 ; clearly adjoining   √1

2  u3 =  0   − √12

does the trick. Then U is the matrix whose columns are the ui . Now, Σ should be 3 × 3, so we add zeros as appropriate, giving       √1 √1 √1 √1 0 0 2 0 0 2 2  2    2  , Σ = 0 1 0 , V =  0 1 U = 0 1 0 0       √1 √1 √1 √1 0 − 0 − 0 0 0 2 2 2 2 for a singular value decomposition of A. Then   1 √1 √1 0 2 2 2   A + = V Σ+ U T =  0  0  0 1 √1 0 0 − √12 2 so that



1 4

 x = A+ b =  0 1 4

√1  2

0

1

 0  0

0 0 1 0



0

0

√1 2



√1 2

0 1 0

1 4

 0  = 0

− √12

    1 1 2     0  1 =  1  . 1 1 1 4 2 1 4



1 4

0 1 0



1 4

0 , 1 4

608

CHAPTER 7. DISTANCE AND APPROXIMATION

49. (a) The normal equation is AT Ax = AT b. Now,     1 1 1 1 2 AT A = = 1 1 1 1 2      1 1 0 1 AT b = = . 1 1 1 1

 2 2

Row-reducing, we get  T A A

  2 2 AT b = 2 2

  1 1 → 1 0

1 0

1 2

0

 .

1 Therefore x = 2 , which satisfies x + y = 21 . 0 (b) Letting y = t, the general solution is "

# " # 1 −t −t 2 + = x= . t 0 t " # 1 2

The length of this solution vector is s r 2 1 1 2 −t +t = − t + 2t2 . 2 4 (c) To minimize this length, it suffices to minimize 41 − t + 2t2 . This is an upward-opening parabola, b so its minimum occurs at t = − 2a = 41 . So the x of minimum length is " # " # 1 1 1 2 − 4 = 4 , 1 1 4

4

as found in Example 7.40. 50. The definition of pseudoinverse in Section 7.3 is A+ = AT A +

T

same as V Σ U , where U ΣV

T

−1

AT ; we must show that this is the T is an SVD for A. Now, A = U ΣV T = V ΣT U T = V ΣU T , and   AT A = V ΣU T U ΣV T = V Σ2 V T , T

since Σ is diagonal, thus symmetric, and U is orthogonal. Since A has linearly independent columns, the rank of A is the number of columns, so by Theorem 7.15, A has r nonzero singular values, so AT A has only nonzero eigenvalues. Thus AT A is invertible, and −1 −1 −1 2 −1 −1 2 AT A = V Σ2 V T = VT Σ V = V Σ−1 V T . So finally, AT A

−1

  2  2 T  AT = V Σ−1 V T V ΣU T = V Σ−1 V V ΣU T = V Σ−1 U T = V Σ+ U T ,

which is the definition of A+ in this section. Note that since Σ is invertible, Exercise 28 tells us that Σ−1 = Σ+ . 51. Let A = U ΣV T be an SVD for A. Suppose that  Dr×r Σ= O

O O



where D is an invertible diagonal matrix. Then Σ+ Σ =

 Ir×r O

O O



7.4. THE SINGULAR VALUE DECOMPOSITION

609

since the nonzero entries of Σ are the inverses of the nonzero entries of Σ. Therefore     Dr×r O Ir×r O + + ΣΣ Σ = Σ Σ Σ = = Σ. O O O O Similarly   I Σ+ ΣΣ+ = Σ+ Σ Σ+ = r×r O

O O

  −1 Dr×r O

 O = Σ+ . O

(a) Just compute the product:      AA+ A = U ΣV T V Σ+ U T U ΣV T = U Σ V T V Σ+ U T U ΣV T = U ΣΣ+ ΣV T = U ΣV T = A. (b) Just compute the product:      A+ AA+ = V Σ+ U T U ΣV T V Σ+ U T = V Σ+ U T U Σ V T V Σ+ U T = V Σ+ ΣΣ+ U T = V Σ+ U T = A+ . (c) Compute the transpose of AA+ : AA+

T

= AV Σ+ U T

T

= U Σ+

T

V T AT = U Σ +

T

 V T V ΣT U T T T = U Σ+ ΣT U T = U ΣΣ+ U T .

But ΣΣ+ is a block matrix with the identity in the upper left and zeros elsewhere, so it is symmetric, and this expression becomes T   U ΣΣ+ U T = U ΣΣ+ U T = U ΣV T V Σ+ U T = AA+ , so that AA+ is symmetric. A similar argument shows that A+ A is symmetric. 52. As suggested in the hint, suppose that A0 is a matrix satisfying the Penrose conditions for A. Then using the fact that A+ and A0 each satisfy the Penrose conditions for A, we have A0 = A0 AA0

(Condition (b) for A0 )

= A0 AA+ A A0  = A0 AA+ (AA0 ) T T = A0 A+ AT (A0 ) AT T T = A0 A+ (AA0 A) T = A0 A+ AT

(Condition (a) for A0 )

= A0 AA+

(Condition (c) for A+ ).



(Condition (a) for A+ ) (Condition (c) for A0 and A+ )

Similarly, A+ = A+ AA+ = A+ (AA0 A) A+  = A+ A (A0 A) A+ T T = AT A+ AT (A0 ) A+ T T = AA+ A (A0 ) A+

(Condition (b) for A+ ) (Condition (a) for A0 ) (Condition (c) for A0 and A+ )

= AT (A0 ) A+

T

(Condition (a) for A+ )

= A0 AT A+

(Condition (c) for A0 ).

Since A0 and A+ are equal to the same thing, they are equal to each other.

610

CHAPTER 7. DISTANCE AND APPROXIMATION

53. Since A+ AA+ = A+ , the matrix A satisfies Penrose condition (a) for A+ . Similarly, since AA+ A = A, T the matrix A satisfies Penrose condition (b) for A+ . Finally, since (A+ A) = A+ A, the matrix A T satisfies the first of the Penrose conditions in (c) for A+ ; similarly, since (AA+ ) = AA+ , the matrix A satisfies the second of the Penrose conditions in (c) for A+ . Since A satisfies all of the Penrose + conditions for A+ , and since by Exercise 52 (A+ ) is the only matrix satisfying those conditions for + A+ , we must have A = (A+ ) . T

54. As in Exercise 53, we show that (A+ ) satisfies the Penrose conditions for AT ; then uniqueness tells + T us that (A+ ) = AT . T

T

(a) AT (A+ ) AT = (AA+ A) = AT . T

T

T

T

(b) (A+ ) AT (A+ ) = (A+ AA+ ) = (A+ ) . T

T

(c) We must show that both AT (A+ ) and (A+ ) AT are symmetric:   T

So (A+ ) = AT

+

AT A+ A+

T

T  T

AT

T

=



A+

= AT

T  T

T 

T

= A+ A = A+ A

T

= AT A+

T  T

= AA+ = AA+

T

= A+

AT

A+

T

T

AT .

.

55. We show that A satisfies the Penrose conditions for A; then uniqueness tells us that A+ = A. (a), (b) AAA = A2 A = AA = A2 = A since A is idempotent. (c) We must show that AA is symmetric. But AA = A2 = A, which is symmetric. So A satisfies all three Penrose conditions and thus A+ = A. 56. It suffices to show that A+ QT satisfies the Penrose conditions for QA. Using the fact that Q is orthogonal,   (a) (QA) A+ QT (QA) = Q (AA+ ) QT Q A = Q (AA+ A) = QA.    (b) A+ QT (QA) A+ QT = A+ QT Q AA+ QT = (A+ AA+ ) QT = A+ QT .   (c) We must show that both (QA) A+ QT and A+ QT (QA) are symmetric: T T T T   (QA) A+ QT = QT A+ AT QT = Q AA+ QT = Q AA+ QT = (QA) A+ QT  T T T T T A+ QT (QA) = AT QT QT A+ = AT A+ = A+ A   = A+ A = A+ QT Q A = A+ QT (QA). So A+ QT = (QA)+ . 57. Since A is positive definite, it is symmetric by definition. Then by Exercise 27(a), the SVD for A is its orthogonal diagonalization, and U = V . The computations are identical to those in Exercise 27(a). 58. Recall that ( kAk1 = max

n X i=1

) |aij | ,

kAk∞ = max

 n X 

j=1

  |aij | . 

If A is diagonal, then the sum of any row or column is just the diagonal entry in that row or column, so that   ( n ) n X  X kAk1 = max |aij | = max {|ajj |} , kAk∞ = max |aij | = max {|aii |} ,   i=1

j=1

7.4. THE SINGULAR VALUE DECOMPOSITION

611

and the two are clearly equal. For kAk2 , by Exercise 26, since A is symmetric, being diagonal, it follows that the singular values of A are the absolute values of its eigenvalues, which are the absolute values of its diagonal entries. Further, from the material in the text following Example 7.37, we know that kAk2 is the largest singular value, which is the maximum magnitude of the diagonal entries. But this is just the value of kAk1 and kAk∞ . 2

2 T 59. Let σ be the largest singular value

λT is the eigenvalue of A A of

Tof

A. Then kAk2 = σ =T|λ|, where largest magnitude. Next, since A

1 = kAk∞ , we have A A 1 = A 1 kAk1 = kAk1 kAk∞ . But by Exercise 34 in Section 7.2, AT A 1 ≥ |λ|. Thus

2 kAk2 = |λ| ≤ AT A 1 = kAk1 kAk∞ . 60. Let A = U ΣV T be an SVD for a square matrix A. Since U is orthogonal, we have    A = U ΣV T = U Σ U T U V T = U ΣU T U V T = RQ. The eigenvalues of R are the same as the eigenvalues of Σ, since R ∼ Σ; thus the eigenvalues of R are all nonnegative. Now, since A is square, Σ is an n × n symmetric matrix, since it is a diagonal matrix, so that T RT = U ΣU T = U ΣT U T = U ΣU T . Hence R is a symmetric matrix with nonnegative eigenvalues, so it is positive semidefinite. Q is orthogonal since both U and V are, so that both U and V T are; see Theorem 5.8(d) in Section 5.1. 61. From Exercise 11, " U=

1

# 0

0

1

"√ ,

Σ=

2

0

0

#

0

" ,

V =

√1 2 √1 2

#

√1 2 − √12

is a singular value decomposition of the matrix A from Exercise 3. Then # "√ # "√ #" # " 1 0 2 0 1 0 2 0 T = R = U ΣU = 0 1 0 0 0 1 0 0 #" " # " # √1 √1 √1 1 0 √12 T 2 2 2 Q = UV = = 1 , √ √1 0 1 √12 − √12 − 2 2 so that a polar decomposition is "√ A = RQ =

2

0

0

#"

0

√1 2 √1 2

√1 2 − √12

# .

62. From Exercise 14, " U=

√1 2 √1 2

− √12 √1 2

"√

# ,

Σ=

2

0

# 0 √ , 2

" V =

1

# 0

0

1

is a singular value decomposition of A. Then # "√ #" # "√ " √1 √1 √1 − √12 2 0 2 T 2 2 2 = R = U ΣU = 1 √ 1 1 1 √ √ 0 2 − √2 √2 0 2 2 " #" # " # √1 √1 − √12 1 0 − √12 2 Q = U V T = 12 = , √ √1 √1 √1 0 1 2 2 2 2

# 0 √ 2

612

CHAPTER 7. DISTANCE AND APPROXIMATION so that a polar decomposition is "√ A = RQ =

2

0 √ 2

0

#"

− √12

√1 2 √1 2

# .

√1 2

63. We have  1 A A= 2

 −3 1 −1 −3

T

  2 10 = −1 5

 5 . 5

Then the characteristic polynomial of AT A is 10 − λ 5 = λ2 − 15λ + 25. 5 5 − λ √

5 , so that these are the eigenvalues of AT A. Thus the singular values of This has roots λ1 , λ2 = 15±5 √ √2 A are σ1 = λ1 and σ2 = λ2 . Using a CAS to row-reduce A − λ1 I and A − λ2 I gives eigenvectors " √ # " √ #

x1 =

1+ 5 2

,

1

x2 =

1− 5 2

.

1

These vectors are already orthogonal, but we must normalize them. Using a CAS, we get  v1 =

x1 = kx1 k



√1+



5 √  10+2 5  , 2 √ √ 10+2 5

so that  h V = v1

i

v2 =

 v2 =

x2 = kx2 k



√1+

5 √  10+2 5 2 √ √ 10+2 5

√1−





5 √  10−2 5  , 2 √ √ 10−2 5



√1−



5 √ 10−2 5  . 2 √ √ 10−2 5

Using technology again, we get  u1 =

1 Av1 = σ1





2

√ 10+2√ 5   , 5+3 5 − √ √ 2 25+10 5

 u2 =

1 Av2 = σ2

√5−



5



√  2 25−10 √ 5 . −5+3 5 √ √ 2 25−10 5

Then U is the matrix whose columns are the ui . Then  U=



2

√ 10+2  √ 5 − √5+3 5√ 2 25+10 5

√5−

√ 5



√ 2 25−10 √ 5 , −5+3 5 √ √ 2 25−10 5

q Σ=

√ 15+5 5 2

0

0

q



√ , 15−5 5 2

 V =

gives a singular value decomposition of A. Then using technology, " # " 2 −1 0 T T R = U ΣU = Q = UV = −1 3 −1

0

,

and A = RQ. Whew. 64. We have 

4 AT A =  2 −3

 −2 4 4 2 −1 −2 6 6 4

  2 −3 36 2 6 =  0 −1 6 0

0 9 0



5 √  10+2 5 √ 2 √ 10+2 5

# 1

√1+

 0 0 . 81

√1−





5 √ 10−2 5  . √ 2 √ 10−2 5

7.4. THE SINGULAR VALUE DECOMPOSITION

613

This is a diagonal matrix, so its eigenvalues are λ1 = 81, λ2 = 36, and λ3 = 9, so that the singular values of A are σ1 = 9, σ2 = 6, and σ3 = 3. Row-reduce to find the corresponding eigenspaces:       1 0 0 0 0 −45 0 0 0  0 =  0 −72 0 0 → 0 1 0 0 ⇒ v1 = 0 0 0 0 0 0 0 0 0 1       0 0 0 0 0 1 0 0 1  0 = 0 −27 0 0 → 0 0 1 0 ⇒ v2 = 0 0 0 45 0 0 0 0 0 0       1 0 0 0 0 27 0 0 0  0 =  0 0 0 0 → 0 0 1 0 ⇒ v3 = 1 0 0 72 0 0 0 0 0 0

 T A A − 81I

 T A A − 36I



AT A − 9I

These vectors are already orthonormal, so V is the matrix

 V = v1

v2

v3



 0 = 0 1

1 0 0

 0 1 . 0

Next,     − 31 2 −3 0     2 2 6 0 =  3  , 2 −1 6 1 3     2 2 −3 1 3     1 2 6 0 = − 3  , 2 −1 6 0 3     2 2 −3 0 3     2 2 6 1 =  3  . −1 6 0 − 13



4 1 1 Av1 = −2 u1 = σ1 9 4  4 1 1 u2 = Av2 = −2 σ2 6 4  4 1 1 u3 = Av3 = −2 σ3 3 4

Then U is the matrix whose columns are the ui , so that  − 31  2 U = 3

2 3 1 −3 2 3

2 3



2 3 2 , 3 1 −3

 9  Σ = 0 0

0 6 0

 0  0 , 3

 0  V = 0 1

1 0 0

 0  1 . 0

gives a singular value decomposition of A. So

R = U ΣU T

 − 13  2 = 3 2 3

Q = UV

T

 − 31  2 = 3 2 3

and A = RQ.

2 3 1 −3 2 3 2 3 1 −3 2 3



2 9 3 2  0   3 1 −3 0



2 0 3 2  3  1 − 31 0

0 6 0 0 0 1

   2 2 0 − 13 5 3 3   2 = 0  23 − 31 −2   3 2 2 1 3 − 0 3 3 3    2 2 1 − 13 3 3   1 2 2 0 = − 3 3 3 2 1 2 0 −3 3 3

 −2 0  6 2 2 7

614

7.5

CHAPTER 7. DISTANCE AND APPROXIMATION

Applications

1. As in Example 7.41, let W = P[−1, 1] be the subspace of linear polynomials in C [−1, 1]. Then {1, x} is an orthogonal basis for W , and we want to find



 1, x2 x, x2 2 g(x) = projW x = 1+ x. h1, 1i hx, xi Computing the various inner products, we get 1 1 3 2 x x dx = = 3 3 −1 −1 Z 1 1 1 dx = [x]−1 = 2 h1, 1i =

1, x2 =

Z

1

2

1 1 4 x x dx = =0 4 −1 −1  1 Z 1 1 3 2 2 x dx = hx, xi = x = . 3 3 −1 −1



x, x2 =

−1

Z

1



3

Therefore the best linear approximation is g(x) =

2/3 0 1 1+ x= . 2 2/3 3

2. As in Example 7.41, let W = P[−1, 1] be the subspace of linear polynomials in C [−1, 1]. Then {1, x} is an orthogonal basis for W , and we want to find



 1, x2 + 2x x, x2 + 2x 2 g(x) = projW x + 2x = 1+ x. h1, 1i hx, xi Computing the various inner products, we get

1, x2 + 2x =

Z

 1  2 1 3 = x2 + 2x dx = x + x2 3 3 −1  1  1 4 2 3 4 x3 + 2x2 dx = x + x = . 4 3 3 −1

1

−1



x, x2 + 2x =

Z

1

−1

h1, 1i and hx, xi were computed in Exercise 1. Therefore the best linear approximation is g(x) =

2/3 4/3 1 1+ x = + 2x. 2 2/3 3

3. As in Example 7.41, let W = P[−1, 1] be the subspace of linear polynomials in C [−1, 1]. Then {1, x} is an orthogonal basis for W , and we want to find



 1, x3 x, x3 3 g(x) = projW x = 1+ x. h1, 1i hx, xi Computing the various inner products, we get

1, x3 =

Z

1

−1

x3 dx =



1 4 x 4

1 =0

x, x3 =

Z

1

−1

−1

x4 dx =



1 5 x 5

1 = −1

2 . 5

h1, 1i and hx, xi were computed in Exercise 1. Therefore the best linear approximation is g(x) =

0 2/5 3 1+ x = x. 2 2/3 5

7.5. APPLICATIONS

615

4. As in Example 7.41, let W = P[−1, 1] be the subspace of linear polynomials in C [−1, 1]. Then {1, x} is an orthogonal basis for W , and we want to find



 1, sin πx x, sin πx 3 2 2 g(x) = projW x = 1+ x. h1, 1i hx, xi Computing the various inner products, we get Z 1 h D πx πx i1 πx E = sin =0 dx = − cos 1, sin 2 2 2 −1 −1  1  1 Z 1 Z D πx E πx πx 2 πx 2 1 2 2 πx 8 x, sin = x sin cos dx = − x cos + dx = sin = 2. 2 2 π 2 −1 π −1 2 π π 2 −1 π −1 h1, 1i and hx, xi were computed in the previous exercises. Therefore the best linear approximation is g(x) =

0 8/π 2 12 1+ x = 2 x. 2 2/3 π

5. As in the text example, let W = P[−1, 1] be the subspace of quadratic polynomials in C [−1, 1]. Then {1, x, x2 − 13 } is an orthogonal basis for W , and we want to find g(x) = projW

2 1 x − 3 , |x| h1, |x|i hx, |x|i

(|x|) = 1+ x + 2 1 2 1 . h1, 1i hx, xi x − 3, x − 3

Computing the various inner products, we get 0  1  1 2 1 2 + x =1 h1, |x|i = |x| dx = (−x) dx + x dx = − x 2 2 −1 −1 0 −1 0 0  1  Z 1 Z 0 Z 1 1 3 1 3 2 2 + x =0 hx, |x|i = x |x| dx = (−x ) dx + x dx = − x 3 3 −1 −1 0 −1 0  Z 1   1 1 x2 − |x| dx x2 − , |x| = 3 3 −1   Z 0 Z 1 1 1 3 3 =− x − x dx + x − x dx 3 3 −1 0  0  1 1 1 1 1 = − x4 − x2 + x4 − x2 4 6 4 6 −1 0 1 = 6 Z 1 1 h1, 1i = 1 dx = [x]−1 = 2 Z

1

Z

−1 1

0

Z

1

1 1 3 2 hx, xi = x dx = x = 3 3 −1 −1   Z 1 2   1 Z 1 1 1 1 2 1 1 5 2 3 1 8 x2 − , x2 − = x2 − dx = x4 − x2 + dx = x − x + x = . 3 3 3 3 9 5 9 9 45 −1 −1 −1 Z

2



Therefore the best quadratic approximation is g(x) =

1 0 1/6 1+ x+ 2 2/3 8/45



x2 −

1 3

 =

1 15 2 5 + x − . 2 16 16

616

CHAPTER 7. DISTANCE AND APPROXIMATION

6. As in the text example, let W = P[−1, 1] be the subspace of quadratic polynomials in C [−1, 1]. Then {1, x, x2 − 13 } is an orthogonal basis for W , and we want to find

2 1

x, cos πx x − , cos πx 1, cos πx 2 2 1+ x + 2 13 2 21 . g(x) = projW (|x|) = h1, 1i hx, xi x − 3, x − 3 Computing the various inner products, we get  1 Z 1 D πx πx E 2 πx 4 = cos 1, cos dx = sin = 2 2 π 2 −1 π −1  1  1 Z 1 Z 1 D E πx πx 2 πx πx 2 πx x, cos x cos = dx = x sin sin dx = 0 − − cos − =0 2 2 π 2 −1 2 π 2 −1 −1 −1  D  πx E πx πx E 1 D 1 − 1, cos = x2 , cos x2 − , cos 3 2 2 3 2 Z 1 πx 1 4 = x2 cos dx − · 2 3 π −1 1  πx 2 2 2 πx 4 8x cos + 3 (π x − 8) sin − = 2 π 2 π 2 −1 3π   4 32 4 = − − π π3 3π 2 8π − 96 = . 3π 3

h1, 1i, hx, xi, and x2 − 13 , x2 − 13 were computed in Exercise 5. Therefore the best quadratic approximation is   4/π 0 (8π 2 − 96)/(3π 3 ) 1 g(x) = 1+ x+ x2 − 2 2/3 8/45 3 15π 2 − 180 2 5π 2 − 60 2 x − = + π π3 π3 2 2 60 − 3π 15π − 180 2 = + x . π3 π3 7. Set v1 = x1 = 1, and x2 = x; then Z

1

h1, 1i =

1 dx = 0

1 [x]0

Z h1, xi =

= 1,

0

1



1 2 x x dx = 2

1 = 0

1 , 2

so that

h1, xi 1/2 1 =x− 1=x− . h1, 1i 1 2  So an orthogonal basis for P1 [0, 1] is 1, x − 12 . v2 = x −

8. From Exercise 7, we know that we can take v1 = 1 and v2 = x − 21 . Then let x3 = x2 . Now,

1, x2 =

Z 0

1

x2 dx =



1 3 x 3

1 = 0

1 , 3

  1 Z 1

1

1 1 1 1 4 1 1 x3 dx − · = − = x − , x2 = x, x2 − 1, x2 = x 2 2 2 3 4 6 12 0 0   1 1 1 1 1 1 1 x − ,x − = hx, xi − h1, xi + h1, 1i = − + · 1 = , 2 2 4 3 2 4 12 

7.5. APPLICATIONS

617

so that, using the other inner products we computed in Exercise 7,



    1, x2 x − 21 , x2 1 1 1/12 1 1 2 2 v3 = x − x − 1−

= x − · 1 − x − = x2 − x + . 1 1 h1, 1i 2 3 1/12 2 6 x − 2, x − 2  Thus an orthogonal basis for P2 [0, 1] is 1, x − 12 , x2 − x + 16 . 9. Using the basis from Exercise 7 together with the various inner products computed in Exercises 7 and 8, the best linear approximation is



    x − 12 , x2 1, x2 1 1/3 1/12 1 1 1+

x − = 1 + x − =x− . g(x) = h1, 1i 2 1 1/12 2 6 x − 12 , x − 12 10. Use the basis from Exercise 7 together with the various inner products computed in Exercises 7 and 8. Additionally,  1 Z 1

√ √ 2 3/2 2 x dx = 1, x = x = 3 3 0 0   Z 1    1 Z 1 √ 1 1 1 2 5/2 1 3/2 1 x3/2 − x1/2 dx = = x− , x = x− x1/2 dx = x − x . 2 2 2 5 3 15 0 0 0 Then the best linear approximation is

√     √ x − 12 , x h1, xi 2/3 1/15 1 2 4 2 4 4 1 g(x) = 1+

= 1 + x − = + x− = + x. x − h1, 1i 2 1 1/12 2 3 5 5 15 5 x − 12 , x − 12 11. Use the basis from Exercise 7 together with the various inner products computed in Exercises 7 and 8. Additionally, Z 1 1 h1, ex i = ex dx = [ex ]0 = e − 1 0



1 x − , ex 2



Z = 0

1



1 x− 2



x

Z

e dx = 0

1

   1 3 1 1 x 1 x x x x = − e. xe − e dx = xe − e − e 2 2 2 2 0

Then the best linear approximation is

  x − 12 , ex 1 h1, ex i x− 1+

g(x) = h1, 1i 2 x − 21 , x − 12  3 1  − e e−1 1 = 1+ 2 2 x− 1 1/12 2 = e − 1 + (18 − 6e)x − (9 − 3e) = (18 − 6e)x + 4e − 10. 12. Use the basis from Exercise 7 together with the various inner products computed in Exercises 7 and 8. Additionally,  1 Z 1 D πx E πx 2 πx 2 1, sin = sin dx = − cos = 2 2 π 2 0 π 0   Z 1  1 πx 1 πx x − , sin = x− sin dx 2 2 2 2 0  1 4 πx 2 πx 1 πx = sin − x cos + cos π2 2 π 2 π 2 0     4 1 = −0+0 − 0−0+ π2 π 4−π = . π2

618

CHAPTER 7. DISTANCE AND APPROXIMATION Then the best linear approximation is 

 x − 21 , sin πx 1 2

x− 2 x − 12 , x − 12   2 (4 − π)/π 1 2/π 1+ x− = 1 1/12 2 24 − 6π 2 48 − 12π x− = + 2 π π π2 8π − 24 48 − 12π = + x. π2 π2

1, sin πx 2 g(x) = 1+ h1, 1i

13. Use the basis from Exercise 8 together with the various inner products computed in Exercises 7 and 8. Additionally, 

 2 1 2 = x −x+ dx 6 0  Z 1 1 1 4 dx = x4 − 2x3 + x2 − x + 3 3 36 0  1 1 5 1 4 4 3 1 2 1 = x − x + x − x + x 5 2 9 6 36 0 1 = 180  1 Z 1

1 4 1 1, x3 = x3 dx = x = 4 4 0 0  Z 1   1   Z 1 1 1 5 1 4 3 1 3 1 3 3 4 x− dx = x − x = x − ,x = x dx = x − x 2 2 2 5 8 40 0 0 0  Z 1   1 1 x2 − x + , x3 = x2 − x + x3 dx 6 6 0  Z 1 1 = x5 − x4 + x3 dx 6 0  1 1 6 1 5 1 4 = x − x + x 6 5 24 0 1 = . 120

1 1 x − x + , x2 − x + 6 6 2



Z

1

Then the best quadratic approximation is





2    1, x3 x − 21 , x3 x − x + 61 , x3 1 1 2



g(x) = 1+ x− + 2 x −x+ h1, 1i 2 6 x − 21 , x − 21 x − x + 16 , x2 − x + 61     1/4 3/40 1 1/120 1 = 1+ x− + x2 − x + 1 1/12 2 1/180 6 1 9 9 3 2 3 1 = + x− + x − x+ 4 10 20 2 2 4 1 3 3 2 = − x+ x . 20 5 2

7.5. APPLICATIONS

619

14. Use the basis from Exercise 8 together with the various inner products computed in Exercises 7, 8, and 10. Additionally, 

1 √ x2 − x + , x 6



√ √ 1 √ = x2 , x − x, x + 1, x 6 Z 1 2 1 2 x5/2 dx − + · = 5 6 3 0  1 2 7/2 2 1 = x − + 7 5 9 0 1 . =− 315

Then the best quadratic approximation is

2 √  √    √ x − 12 , x x − x + 61 , x h1, xi 1 1 2 x− x −x+ g(x) = 1+

+ 2 h1, 1i 2 6 x − 12 , x − 21 x − x + 16 , x2 − x + 16     2/3 1/15 1 −1/315 1 = 1+ x− + x2 − x + 1 1/12 2 1/180 6 2 4 2 4 2 4 2 = + x− − x + x− 3 5 5 7 7 21 6 48 4 2 = + x− x . 35 35 7

15. Use the basis from Exercise 8 together with the various inner products computed in previous exercises. Additionally, 

1 x − x + , ex 6 2



1 = x2 , ex − hx, ex i + h1, ex i 6 Z 1 1 = x2 ex x − 1 + (e − 1) 6 0  2 x 1 1 7 = x e − 2xex + 2ex 0 + e − 6 6 7 7 19 1 =e−2+ e− = e− . 6 6 6 6

Then the best quadratic approximation is



2    x − 12 , ex x − x + 61 , ex h1, ex i 1 1 2



g(x) = 1+ x− + 2 x −x+ h1, 1i 2 6 x − 12 , x − 12 x − x + 61 , x2 − x + 16  7  1  19  3 − e e− 6 e−1 1 1 = 1+ 2 2 x− + 6 x2 − x + 1 1/12 2 1/180 6     1 1 = e − 1 + (18 − 6e) x − + (210e − 570) x2 − x + 2 6 = (39e − 105) + (588 − 216e)x + (210e − 570)x2 .

620

CHAPTER 7. DISTANCE AND APPROXIMATION

16. Use the basis from Exercise 8 together with the various inner products computed in previous exercises. Additionally,   D πx E 1 D πx E πx 1 πx E D − x, sin + 1, sin x2 − x + , sin = x2 , sin 6 2 2 2 6 2 Z 1 πx 1 2 4 x2 sin = dx − 2 + · 2 π 6 π 0  1 2 2 8 πx 16 − 2π x πx 1 4 = x sin cos + − 2+ 2 3 π 2 π 2 0 π 3π 16 4 1 8 = 2− 3− 2+ π π π 3π π 2 + 12π − 48 = . 3π 3 Then the best quadratic approximation is

 

2   1, sin πx x − 21 , sin πx x − x + 61 , sin πx 1 1 2 2 2 2

g(x) = 1+

x − + x − x + h1, 1i 2 6 x − 12 , x − 12 x2 − x + 61 , x2 − x + 16     2 2 3 2/π (4 − π)/π 1 (π + 12π − 48)/(3π ) 1 = 1+ x− + x2 − x + 1 1/12 2 1/180 6 2 48 − 12π 24 − 6π 60(π 2 + 12π − 48) 2 60(π 2 + 12π − 48) = + x − + x − x π π2 π2 π3 π3 10(π 2 + 12π − 48) + π3 2 60π 2 + 720π − 2880 2 18π + 96π − 480 −72π 2 − 672π + 2880 + x + x . = π3 π3 π3 17. Assuming k is an integer, recall that cosine is an even function (that is, cos(−x) = cos x), so that π 1 1 = (sin kπ − sin(−kπ)) = 0 sin kx k k −π −π  π Z π 1 1 1 = (cos kπ − cos(−kπ)) = (cos kπ − cos kπ) = 0. h1, sin kxi = sin kx dx = − cos kx k k k −π −π Z

h1, cos kxi =

π



cos kx dx =

18. Assuming j and k are integers, use the identity cos(j + k)x + cos(j − k)x = 2 cos jx cos kx; this is easily derived by expanding the left-hand side using the formula for cos(α + β). Then if j 6= k Z

π

hcos jx, cos kxi =

cos jx cos kx dx −π

=

1 2

Z

π

(cos(j + k)x + cos(j − k)x) dx −π



1 1 1 = sin(j + k)x + sin(j − k)x 2 j+k j−k

π −π

= 0. Note that this computation fails if j = k, since then cos(j − k)x = 1 and the integral evaluates differently.

7.5. APPLICATIONS

621

19. Assuming j and k are integers, use the identity cos(j − k)x − cos(j + k)x = 2 sin jx sin kx; this is easily derived by expanding the left-hand side using the formula for cos(α + β). Then if j 6= k Z π sin jx sin kx dx hsin jx, sin kxi = −π Z 1 π = (cos(j − k)x − cos(j + k)x) dx 2 −π  π 1 1 1 = sin(j − k)x − sin(j + k)x 2 j−k j+k −π = 0. Note that this computation fails if j = k, since then cos(j − k)x = 1 and the integral evaluates differently. Rπ 2 π 20. k1k = h1, 1i = −π 12 dx = [x]−π = 2π. Also, 2

kcos kxk = hcos kx, cos kxi Z π = cos2 kx dx −π  Z π  1 1 + cos 2kx dx = 2 2 −π  π 1 1 = x+ sin 2kx 2 4k −π = π. 21. We have  Z 0   0 π ! Z π Z π 1 2 π 1 1 1 2 1 + x = |x| dx = (−x) dx + x dx = − x a0 = 2π −π 2π 2π 2 2 2 −π 0 −π 0 Z 0  Z π Z π 1 1 |x| cos x dx = (−x cos x) dx + x cos x dx a1 = π −π π −π 0  1 0 π = [−x sin x − cos x]−π + [x sin x + cos x]0 π 4 =− π Z 0  Z Z π 1 π 1 a2 = |x| cos 2x dx = (−x cos 2x) dx + x cos 2x dx π −π π −π 0  0  π ! 1 1 1 1 1 = − cos 2x + x sin 2x + cos 2x + x sin 2x π 4 2 4 2 −π 0 =0 Z 0  Z π 1 |x| cos 3x dx = (−x cos 3x) dx + x cos 3x dx π −π −π 0  0  π ! 1 1 1 1 1 = − cos 3x + x sin 3x + cos 3x + x sin 3x π 9 3 9 3 −π 0

1 a3 = π

=−

Z

π

4 . 9π

622

CHAPTER 7. DISTANCE AND APPROXIMATION For the b’s, we have 1 bk = π

Z

π

1 |x| sin kx dx = π −π

Z

0

Z

π

(−x sin kx) dx +

 x sin kx dx .

−π

0

In the first integral, substitute u = −x; then du = −dx and the integral becomes Z

0

Z

0

−π

Z

0

u sin(−ku) · (−1) du =

(−x sin kx) dx =

u sin u du. π

π

This is just the negative of the second integral, so their sum vanishes and bk = 0 for all k. Therefore the third-order Fourier approximation of f (x) = |x| on [−π, π] is a0 + a1 cos x + a2 cos 2x + a3 cos 3x =

π 4 4 − cos x − cos 3x. 2 π 9π

22. We have  π Z π π2 1 1 3 1 x = x2 dx = 2π −π 2π 3 3 −π Z π π 1 1 a1 = x2 cos x dx = 2x cos x + (x2 − 2) sin x −π = −4 π −π π  π Z 1 π 2 1 1 1 2 a2 = =1 x cos 2x dx = x cos 2x + (2x − 1) sin 2x π −π π 2 4 −π  π Z 1 π 2 1 4 1 2 a3 = x cos 3x + (9x2 − 2) sin 3x =− . x cos 3x dx = π −π π 9 27 9 −π a0 =

For the b’s, we have bk =

1 π

Z

π

x2 sin kx dx

−π

π  1 2 2 − k 2 x2 = x sin kx + cos kx π k2 k3 −π 2 − k2 π2 2 − k 2 (−π)2 cos kπ − cos(−kπ) k3 k3 = 0.

=

Thus bk = 0 for all k. Therefore the third-order Fourier approximation of f (x) = x2 on [−π, π] is a0 + a1 cos x + a2 cos 2x + a3 cos 3x =

π2 4 − 4 cos x + cos 2x − cos 3x. 3 9

23. We have Z 0  Z π Z π 1 1 1 f (x) dx = 0 dx + 1 dx = 2π −π 2π 2 −π 0 Z 0   π Z Z π 1 π 1 1 1 ak = f (x) cos kx dx = 0 dx + cos kx dx = sin kx = 0 π −π π π k −π 0 0 Z   π Z Z π 0 1 π 1 1 1 1 − (−1)k bk = f (x) sin kx dx = 0 dx + sin kx dx = − cos kx = . π −π π π k πk −π 0 0 a0 =

7.5. APPLICATIONS

623

24. We have a0 = ak = = =

 Z 0 Z π Z π 1 1 1 dx = 0 f (x) dx = (−1) dx + 2π −π 2π 0 −π Z π 1 f (x) cos kx dx π −π  Z 0 Z π 1 (− cos kx) dx + cos kx dx π −π 0  0 π !  1 1 1 − sin kx sin kx + π k k −π 0

=0 Z 1 π f (x) sin kx dx bk = π −π  Z 0 Z π 1 sin kx dx = (− sin x) dx + π 0 −π   0 π ! 1 1 1 = + − cos kx cos kx π k k −π 0   k k 1 1 − (−1) 1 − (−1) = + π k k =

2(1 − (−1)k ) . πk

25. We have π  Z π Z π 1 1 1 1 f (x) dx = (π − x) dx = =π πx − x2 2π −π 2π −π 2π 2 −π  π Z Z 1 π 1 1 1 π 1 π ak = sin kx − 2 cos kx − x sin kx f (x) cos kx dx = (π − x) cos kx dx = π −π π −π π k k k −π a0 =

=0 bk =

1 π

Z

π

f (x) sin kx dx = −π

1 π

Z

π

(π − x) sin kx dx = −π

π  1 1 π 1 − cos kx − 2 sin kx + x cos kx π k k k −π

2(−1)k = . k 26. We have a0 = ak = = = = =

Z 0   0  π ! Z π Z π 1 1 1 1 2 1 2 π |x| dx = (−x) dx + x dx = − x + x = 2π −π 2π 2π 2 2 2 −π 0 −π 0 Z 0  Z π Z π 1 1 |x| cos kx dx = (−x cos kx) dx + x cos kx dx π −π π −π 0  0  π ! 1 1 1 1 1 − x sin kx − 2 cos kx x sin kx + 2 cos kx + π k k k k −π 0         1 1 π 1 π 1 1 0− 2 − sin(−kπ) − 2 cos(−kπ) + sin kπ + 2 cos kπ − 0 + 2 π k k k k k k   1 2 2 − 2 + 2 cos kπ π k k 2((−1)k − 1) πk 2

624

CHAPTER 7. DISTANCE AND APPROXIMATION For the b’s, we have 1 bk = π

Z

π

1 |x| sin kx dx = π −π

0

Z

π

Z

 x sin kx dx .

(−x sin kx) dx + −π

0

In the first integral on the right, substitute u = −x; then du = −dx and the integral becomes Z

0

π

Z (−x sin kx) dx +

−π

0

Z

Z

0

π

u sin(−ku) · (−1 du) +

x sin kx dx = π Z 0

x sin kx dx 0

π

Z

=

u sin ku du +

x sin kx dx

π

0 π

Z =−

Z

π

u sin ku du +

x sin kx dx

0

0

= 0. Therefore bk = 0 for all k. 27. (a) For any function f (x), we have Z

π

Z

0

Z

f (x) dx =

π

f (x) dx +

−π

−π

f (x) dx. 0

In the first integral on the right, make the substitution u = −x; then du = −dx, and x = −π corresponds to u = π while x = 0 corresponds to u = 0. Then if f is odd, the sum above becomes Z

0

π

Z f (x) dx +

Z

0

−π

0

π

Z f (−u) · (− du) +

f (x) dx = π Z 0

f (x) dx 0

Z

π

(−f (u)) · (− du) +

= π 0

f (x) dx 0

Z

Z

=

π

f (u) du + π

f (x) dx 0

Z =−

π

Z

π

f (u) du + 0

f (x) dx 0

= 0. (b) Note that cos kx is even, since cos(k(−x)) = cos(−kx) = cos kx. If f (x) is odd, then f (−x) cos(k(−x)) = −f (x) cos kx, so that f (x) cos kx is an odd function. From part (a), we see that 1 ak = π

Z

π

f (x) cos kx dx = 0. −π

28. (a) For any function f (x), we have Z

π

Z

0

f (x) dx = −π

Z f (x) dx +

−π

π

f (x) dx. 0

In the first integral on the right, make the substitution u = −x; then du = −dx, and x = −π

Chapter Review

625

corresponds to u = π while x = 0 corresponds to u = 0. Then if f is even, the sum above becomes Z π Z 0 Z π Z 0 f (x) dx f (−u) · (− du) + f (x) dx = f (x) dx + −π

π 0

0

Z

Z f (u) · (− du) +

= π

0 π

f (x) dx 0

Z 0 Z π =− f (u) du + f (x) dx π 0 Z π Z π = f (u) du + f (x) dx 0 0 Z π =2 f (x) dx. 0

(b) Note that sin kx is odd, since sin(k(−x)) = sin(−kx) = − sin kx. If f (x) is even, then f (−x) sin(k(−x)) = f (x) · (− sin kx) = −f (x) sin kx, so that f (x) sin kx is an odd function. From part (a) of Exercise 27, we see that Z 1 π bk = f (x) sin kx dx = 0. π −π

Chapter Review 1. (a) True. Since 1 and π are both positive scalars, this is a weighted dot product; see Section 7.1.   4 −2 T (b) True. This corresponds to the dot product hu, vi = u Av, since A = is a positive −2 4 definite matrix (its eigenvalues are 2 and 6).   0 1 (c) False. Any matrix A of trace zero will have hA, Ai = 0; for example, take A = . 1 0 (d) True. This is a calculation: ku + vk =

p

hu + v, u + vi =

q

r 2

2

kuk + 2 hu, vi + kvk =

42 + 2 · 2 +

√ 2 √ 5 = 25 = 5.

  P (e) True. An element of R1 is a vector v1 ; then the sum norm is |vi | = |v1 |, the max norm is p pP max{|vi |} = |v1 |, and the Euclidean norm is vi2 = v12 = |v1 |. (f ) False. The condition number satisfies k∆xk k∆Ak ≤ cond(A) , 0 kx k kAk where kAk is some matrix norm. The left-hand side measures the maximum percentage change, but even if that is small, cond(A) could be large. (g) True. From the inequality in the previous part, if cond(A) is small, so is the left-hand side, so the matrix is well-conditioned. (h) False. It is not necessarily unique. See Theorems 7.9 and 7.10 in Section 7.3. (i) True. By Theorem 7.11 in Section 7.3, the standard matrix of P is P = A AT A since A has orthonormal columns, it follows that AT A = I, so that P = AAT .

−1

AT . But

(j) False. By Exercise 26 in Section 7.4, the singular values of A are the absolute value of the eigenvalues of A. If A is positive definite, then the eigenvalues are positive, so this statement is true.

626

CHAPTER 7. DISTANCE AND APPROXIMATION

2. It is not. For example, hx, xi = 0 · 1 + 1 · 0 = 0, but x 6= 0. 3. This is an inner product. We show that it satisfies all four properties for an inner product.   T 1. hA, Bi = tr AT B = tr AT B = tr B T A = hB, Ai.    2. hA, B + Ci = tr AT (B + C) = tr AT B + tr AT C = hA, Bi + hA, Ci.   3. hcA, Bi = tr cAT B = c tr AT B = c hA, Bi.   a b 4. Let A = ; then c d AT A =



  2 b a + c2 = d ab + cd

 c a d c

a b

 ab + cd . b2 + d2

 Thus hA, Ai = tr AT A = a2 + b2 + c2 + d2 ≥ 0, and hA, Ai = 0 if and only if a = b = c = d = 0; that is, if and only if A = O. 4. False. For example, let f (x) = x, g(x) = x, and h(x) = −x. Then g(x) + h(x) = 0, and    hf, g + hi = max f (x) max (g(x) + h(x)) = 1 · 0 = 0 0≤x≤1 0≤x≤1       hf, gi + hf, hi = max f (x) max g(x) + max f (x) max h(x) = 1 · 1 + 1 · 0 = 1. 0≤x≤1

0≤x≤1

0≤x≤1

0≤x≤1

The two are not equal, so property (2) is not satisfied. 5. We have



p √

1 + x + x2 = h1 + x + x2 , 1 + x + x2 i = 1 · 1 + 1 · 1 + 1 · 1 = 3.

6. We have s

p

d(x, x2 ) = x − x2 = hx − x2 , x − x2 i =

1

Z

2

(x − x2 ) dx

0

s Z

s

1

=

(x2



2x3

+

x4 )

dx =

0

7. Start with

Then

  1 x1 = , 1   1 v1 = x1 = , 1

1 3 1 4 1 5 x − x + x 3 2 5

  1 x2 = . 2

v2 = x2 −

hv1 , x2 i v1 . hv1 , v1 i

Computing the two inner products, we get     6 4 1 1 hv1 , x2 i = = 30 4 6 2      6 4 1 hv1 , v1 i = v1T Av2 = 1 1 = 20, 4 6 2 v1T Ax2

so that

 = 1

" # " # " # " # " # 3 1 − 21 1 30 1 v2 = − = − 23 = . 1 20 1 2 2 2 2

1

r =

0

1 . 30

Chapter Review

627

8. Start with x1 = 1, x2 = x, x3 = x2 . Then v1 = x1 = 1 v2 = x2 −

R1 x dx 1 hv1 , x2 i 1 1=x− ·1=x− v1 = x − R01 hv1 , v1 i 2 2 1 dx 0

hv1 , x3 i hv2 , x3 i v1 − v2 hv1 , v1 i hv2 , v2 i  R1 R1 2   x dx x − 21 x2 dx 1 2 0 0 1− R1 = x − R1 x− 2 2 1 dx x − 12 dx 0 0 R 1 3 1 2   x − 2 x dx 1 1 x− = x2 − · 1 − 0R 1  1 2 3 2 x − 2 dx 0  1 4 1 3 1   x − 6x 0 1 1 x − = x2 − − h 4  i1 3 2 1 1 2 2 x− 2 0   1 1/12 1 2 =x − − x− 3 1/12 2 1 1 = x2 − − x + 3 2 1 = x2 − x + . 6

v3 = x3 −

9. This is not a norm since it fails to satisfy property (2) of a norm (see Section 7.2): kcvk = cvT cv = c2 vT v = c2 kvk = 6 c kvk if c is a scalar other than 0 or 1. 10. This is a norm. Let p(x) = a + bx; then p(0) = a and p(1) = a + b. We show it satisfies all three properties: 1. kp(x)k = |p(0)| + |p(1) − p(0)| = |a| + |a + b − a| = |a| + |b| ≥ 0. Also, kp(x)k = 0 if and only if |a| + |b| = 0; since |a| and |b| are both nonnegative, their sum is zero if and only if they are both zero. So kp(x)k = 0 if and only if p(x) = 0. 2. We have kcp(x)k = |cp(0)| + |cp(1) − cp(0)| = |c| |p(0)| + |c(p(1) − p(0))| = |c| |p(0)| + |c| |p(1) − p(0)| = |c| (|p(0)| + |p(1) − p(0)|) = |c| kp(x)k . 3. Using the triangle inequality for the usual absolute value, we have k(p + q)(x)k = |(p + q)(0)| + |(p + q)(1) − (p + q)(0)| = |p(0) + q(0)| + |p(1) − p(0) + q(1) − q(0)| ≤ |p(0)| + |q(0)| + |p(1) − p(0)| + |q(1) − q(0)| = (|p(0)| + |p(1) − p(0)|) + (|q(0)| + |q(1) − q(0)|) = kp(x)k + kq(x)k .

11. The condition number is defined by cond(A) = A−1 kAk, where k·k is a matrix norm. Computing A−1 gives   1 −11 10 1000 . A−1 = −11 −990 10 1000 −1000

628

CHAPTER 7. DISTANCE AND APPROXIMATION Then using k·k1 gives

−1

A kAk = 2010 · 1.21 = 2432.1 1 1

and k·k∞ gives

−1

A kAk = 2010 · 1.21 = 2432.1. ∞ ∞

By either measure, A is ill-conditioned.   12. If Q = q1 q2 · · · qn is orthogonal, then each qi is a unit vector. This means that v uX √ u n 2 kqi k = t qij = 1 = 1. j=1

Then

v v v u n uX uX n uX u n X u n 2 √ 2 =t qij kQkF = t qij = t 1 = n. i,j=1

i=1 j=1

13. Using the method of Example 7.26, we have  1 1 A= 1 1

 1 AT A = 1

1 2

  2 3  b= 5 , 7

 1 2 , 3 4

so that

 1  2 = 4 10 3 4

  1 1  1 4 1 1

1 3

i=1

 10 . 30

This matrix is invertible, and T

A A

−1

" =

3 2 − 12

− 21

# .

1 5

Then x = AT A

−1

" AT b =

3 2 1 −2

#" − 12 1 1 1 5

1 2

1 3

  # 2 " #  1  0 3   = 17 . 4 5 10 7

So the least squares solution is y = 0 +

17 10 x

=

17 10 x.

14. Using the method of Example 7.25, we have   1 2 1 0  A= 2 −1 , 0 5 so that  1 T A A= 2

1 0



 1  0  b= −1 , 3

  1 2 0  1 −1 5 2 0

 2  0 = 6 −1 0 5

This matrix is invertible, and AT A

−1

" =

1 6

0

0

1 30

# .

 0 . 30

Chapter Review

629

Then x = AT A

−1

" AT b =

1 6

0

  1 " # #  − 61 2 0  0  .  = 3 −1 5 −1 5 3

#" 0 1 1 2 30

1 0

 1 1 , 0

  1 x = 2 , 3

15. See Theorem 7.11. With  1 A = 0 1 we have  1 T A A= 1

0 1

  1 1  0 0 1

  1 2  1 = 1 0

 1 , 2

so that T

A A

−1

" =

2 3 − 13

− 31 2 3

# .

Then the standard matrix of the projection onto the column space W of A is    2 1 # #" 1 1 " 2 1 3 3  −3 1 0 1 −1 T   1 T 2 3 P =A A A A = 0 1 = 3 1 2 3 −3 1 1 0 1 1 3 1 0 − 3 3



1 3  − 13  , 2 3

so that the projection of x onto W is  Px =

2 3 1 3 1 3

1 3 2 3 − 13

 

1 1 3   − 31  2 2 3 3

  7 3

  =  23  . 5 3

 T   u v . So we should try A = u v . Then the column space of  −1 T A is span(u, v), so P = A AT A A is the projection matrix onto this span. We now compute P . First, since u and v are orthonormal,

 16. Note that uuT + vvT = u

uT u = u · u = 1,

v

vT v = v · v = 1,

uT v = u · v = 0,

vT u = v · u = 0.

Then  T u  A A= T u v T

 T u u v = T v u 

  uT v 1 = 0 vT v

 0 1



T

A A

−1

 1 = 0

 0 . 1

Then the standard matrix of the projection onto the column space of A (which is the space spanned by u and v) is   −1 T   uT T T P =A A A A = AA = u v = uuT + vvT , vT as required. 17. (a) We have  1 AT A = 1

0 0

  1 1  0 −1 1

  1 2 0 = 0 −1

 0 . 2

This diagonal matrix has eigenvalues λ1 = λ2 = 2, so the singular values of A are σ1 = σ2 =



2.

630

CHAPTER 7. DISTANCE AND APPROXIMATION (b) Compute the corresponding eigenvectors of AT A by row-reduction:    T  0 0 0 A A − 2I 0 = . 0 0 0     1 0 So eigenvectors are v1 = and v2 = , and then 0 1     1 0 V = v1 v2 = . 0 1 To compute U , we have  1 1  1  Av1 = √ 0 u1 = σ1 2 1  1 1 1   u2 = Av2 = √ 0 σ2 2 1

   √1 1 " #  2  1   0  0 = 0  √1 −1 2    √1 1 " #  2   0  0 . = 0   1  −1 − √12

  0 We must extend this to an orthonormal basis of R3 ; clearly adjoining the vector u3 = 1 is 0 sufficient. Finally, since Σ must be a 2 × 3 matrix, we append a row of zeroes to the bottom. Then    √ # " √1 √1 0 2 0 2  2   √  1 0  U = 0 1 2 , V = 0 1  0 , Σ =  0 1 √1 √ − 2 0 0 0 2 give an SVD for A, and A = U ΣV T . (c) With Σ as above, we compute Σ+ by transposing Σ and inverting all the nonzero entries: " # √1 0 0 + 2 Σ = . √1 0 0 2 Then A + = V Σ+ U T =

" 1

0

0

1

#"

√1 2

0

0

√1 2

 # √1 0  12 √ 0  2 0

0 0 1



√1 2 − √12  

0

" =

1 2 1 2

0 0

1 2 − 21

# .

18. (a) We have 

1 AT A =  1 −1

 1  1 1 1 −1

1 1

  2 −1 = 2 −1 −2

 2 −2 2 −2 . −2 2

The characteristic polynomial is −λ3 +6λ2 = −λ2 (λ−6), √ so λ1 = 6 and λ2 = 0 are the eigenvalues of AT A. Therefore the singular values of A are σ1 = 6 and σ2 = 0. (b) Compute the corresponding eigenvectors of AT A by row-reduction:       −4 2 −2 0 1 0 1 0 −1   T A A − 6I 0 =  2 −4 −2 0 → 0 1 1 0 ⇒ x1 = −1 −2 −2 −2 0 0 0 0 0 1         2 2 −2 0 1 1 −1 0 −1 1  T  2 −2 0 → 0 0 0 0 ⇒ x2 =  1 , x3 = 0 . A A − 0I 0 =  2 −2 −2 2 0 0 0 0 0 0 1

Chapter Review

631

We must orthogonalize {x2 , x3 }:       1 1 −1 2 hx2 , x3 i   −1    1  0 x3 = x3 − x2 = 0 −  1 =  2  . hx2 , x2 i 2 1 0 1   1 Multiply through by 2 to clear fractions, giving x003 = 1. Finally, normalize all three of these 2 vectors:       √1 − √13 − √12 6 00  1     x x x1 2 3  √1  ,  √1  . √ , = v = = v = = v1 = − 2 3 3 2 kx1 k  kx2 k  kx003 k  6  √1 √2 0 3 6 Then h V = v1

v2

 − √13 i  1 v3 =  − √3 √1 3

− √12 √1 2

0



√1 6 √1  . 6 √2 6

To compute U , we have "

u1 =

1 1 1 Av1 = √ σ1 6 1

  # − √1 " # 3  − √12 −1  1 − √  = . 3 −1  − √1

1 1

√1 3

2

We must extend this to an orthonormal basis of R2 ; clearly adjoining the vector # " u2 =

√1 2 − √12

is sufficient. Finally, since Σ must be a 2 × 3 matrix, we append two columns of zeroes to the right. Then   √1 √1 √1 # " # "√ − − 2 6 √1  13 − √12 6 0 0 2 , − √ √1 √1  U= Σ = , V =  3 2 6 − √12 − √12 0 0 0 √1 √2 0 3 6 give an SVD for A, and A = U ΣV T . (c) With Σ as above, we compute Σ+ by transposing Σ and inverting all the nonzero entries:   √1 0  6   Σ+ =   0 0 . 0 0 Then

A + = V Σ+ U T

 − √1  13 = − √3 √1 3

− √12 √1 2

0



√1 √1 6  6 √1   0 6  √2 0 6

 0 " 1  −√ 2 0  √1 2 0

 − √12 − √12

# =

1  6  1  6 − 16



1 6 1 . 6 − 61

632

CHAPTER 7. DISTANCE AND APPROXIMATION

19. It suffices to show that the eigenvalues of AT A and those of (P AQ)T (P AQ) are the same. Now,  (P AQ)T (P AQ) = QT AT P T P AQ = QT AT A Q since P is an orthogonal matrix. Thus AT A and (P AQ)T (P AQ) are similar via the matrix Q. By Theorem 4.22 in Section 4.4, the eigenvalues of AT A are the same as those of (P AQ)T (P AQ), so that the singular values of A and P AQ are the same. 20. By the Penrose conditions, we have A+

2

= A+ AA+ = A+ = A+ = A+ = A+ = A+ = O.

 A+ AA+   AA+ A+ A A+ T T AA+ A+ A A+ T  T A+ AT AT A+ A+ T T T A+ A2 A+ A+ T T A+ O T A+ A+ 

Chapter 8

Codes 8.1

Code Vectors

1. We must have 1 · u = [1, 1, 1, 1, 1] · [1, 0, 1, 1, d] = 0 in Z2 . Since this dot product is 1 · 1 + 1 · 0 + 1 · 1 + 1 · 1 + 1 · d = 3 + d = 1 + d in Z2 , we must have d = 1, so that the parity check code vector is v = [1, 0, 1, 1, 1]. 2. We must have 1 · u = [1, 1, 1, 1, 1, 1] · [1, 1, 0, 1, 1, d] = 0 in Z2 . Since this dot product is 1 · 1 + 1 · 1 + 1 · 0 + 1 · 1 + 1 · 1 + 1 · d = 4 + d = d in Z2 , we must have d = 0, so that the parity check code vector is v = [1, 1, 0, 1, 1, 0]. 3. Compute 1 · v: 1 · v = 1 · 1 + 1 · 0 + 1 · 1 + 1 · 0 = 2 = 0 in Z2 . Since the sum is zero, a single error could not have occurred. 4. Compute 1 · v: 1 · v = 1 · 1 + 1 · 1 + 1 · 1 + 1 · 0 + 1 · 1 + 1 · 1 = 5 = 1 in Z2 . Since the sum is not zero, a single error may have occurred. 5. Compute 1 · v: 1 · v = 1 · 0 + 1 · 1 + 1 · 0 + 1 · 1 + 1 · 1 + 1 · 1 = 4 = 0 in Z2 . Since the sum is zero, a single error could not have occurred. 6. Compute 1 · v: 1 · v = 1 · 1 + 1 · 1 + 1 · 0 + 1 · 1 + 1 · 0 + 1 · 1 + 1 · 1 + 1 · 1 = 6 = 0 in Z2 . Since the sum is zero, a single error could not have occurred. 7. We want [1, 1, 1, 1, 1] · [1, 2, 2, 2, d] = 0 in Z3 . The dot product is 1 · 1 + 1 · 2 + 1 · 2 + 1 · 2 + 1 · d = 7 + d = 1 + d in Z3 , so the check digit must be d = 2, since 1 + 2 = 0 in Z3 . 8. We want [1, 1, 1, 1, 1] · [3, 4, 2, 3, d] = 0 in Z5 . The dot product is 1 · 3 + 1 · 4 + 1 · 2 + 1 · 3 + 1 · d = 12 + d = 2 + d in Z5 , so the check digit must be d = 3, since 2 + 3 = 0 in Z5 . 633

634

CHAPTER 8. CODES

9. We want [1, 1, 1, 1, 1, 1] · [1, 5, 6, 4, 5, d] = 0 in Z7 . The dot product is 1 · 1 + 1 · 5 + 1 · 6 + 1 · 4 + 1 · 5 + 1 · d = 21 + d = d in Z7 , so the check digit must be d = 0. 10. We want [1, 1, 1, 1, 1, 1, 1] · [3, 0, 7, 5, 6, 8, d] = 0 in Z9 . The dot product is 1 · 3 + 1 · 0 + 1 · 7 + 1 · 5 + 1 · 6 + 1 · 8 + 1 · d = 29 + d = 2 + d in Z9 , so the check digit must be d = 7, since 2 + 7 = 0 in Z9 . 11. Suppose u = [u1 , u2 , . . . , un ], and suppose that v differs from u in exactly one component, say the j th component. Then v = [u1 , u2 , . . . , vj , . . . , un ] where uj 6= vj . So c · (u − v) = [1, 1, . . . , 1] · ([u1 , u2 , . . . , uj , . . . , un ] − [u1 , u2 , . . . , vj , . . . , un ]) = [1, 1, . . . , 1] · [0, 0, . . . , 0, uj − vj , 0, . . . , 0] = uj − vj 6= 0. Thus c · (u − v) 6= 0 so that c · u 6= c · v. 12. With c = [3, 1, 3, 1, 3, 1, 3, 1, 3, 1, 3, 1], we want c · u = 0 in Z10 : c · u = [3, 1, 3, 1, 3, 1, 3, 1, 3, 1, 3, 1] · [0, 5, 9, 4, 6, 4, 7, 0, 0, 2, 7, d] =3·0+1·5+3·9+1·4+3·6+1·4+3·7+1·0+3·0+1·2+3·7+1·d = 102 + d = 2 + d in Z10 . So the check digit must be d = 8. 13. With c = [3, 1, 3, 1, 3, 1, 3, 1, 3, 1, 3, 1], we want c · u = 0 in Z10 : c · u = [3, 1, 3, 1, 3, 1, 3, 1, 3, 1, 3, 1] · [0, 1, 4, 0, 1, 4, 1, 8, 4, 1, 2, d] =3·0+1·1+3·4+1·0+3·1+1·4+3·1+1·8+3·4+1·1+3·2+1·d = 50 + d = d in Z10 . So the check digit must be d = 0. 14. Let u = [0, 4, 6, 9, 5, 6, 1, 8, 2, 0, 1, 5] and c = [3, 1, 3, 1, 3, 1, 3, 1, 3, 1, 3, 1]. Then (a) We check if c · u = 0: c · u = [3, 1, 3, 1, 3, 1, 3, 1, 3, 1, 3, 1] · [0, 4, 6, 9, 5, 6, 1, 8, 2, 0, 1, 5] =3·0+1·4+3·6+1·9+3·5+1·6+3·1+1·8+3·2+1·0+3·1+1·5 = 77 = 7 in Z10 . Since the dot product is nonzero, the UPC cannot be correct. (b) Substituting an unknown digit d for the 6 in the third position, we get c · u = [3, 1, 3, 1, 3, 1, 3, 1, 3, 1, 3, 1] · [0, 4, d, 9, 5, 6, 1, 8, 2, 0, 1, 5] =3·0+1·4+3·d+1·9+3·5+1·6+3·1+1·8+3·2+1·0+3·1+1·5 = 59 + 3d = 9 + 3d in Z10 . Solving 9 + 3d = 0 in Z10 gives d = 7, so the correct third digit is 7.

8.1. CODE VECTORS

635

15. Suppose u = [u1 , u2 , . . . , u12 ] is a correct UPC, and suppose that v differs from u in exactly one component, say the j th component. Then v = [u1 , u2 , . . . , vj , . . . , u12 ] where uj 6= vj . So c · (u − v) = [3, 1, . . . , 1] · ([u1 , u2 , . . . , uj , . . . , u12 ] − [u1 , u2 , . . . , vj , . . . , u12 ]) = [1, 1, . . . , 1] · [0, 0, . . . , 0, uj − vj , 0, . . . , 0] = 1(uj − vj ) or 3(uj − vj ). If 1(uj − vj ) = 0 ∈ Z10 , then uj = vj ∈ Z10 . Since both uj and vj are between 0 and 9, they must be equal. Similarly, if 3(uj − vj ) = 0 ∈ Z10 , then since 3x = 0 ∈ Z10 implies that x = 0, we see again that uj = vj ∈ Z10 . So neither of these can be zero, since uj 6= vj . Thus c · (u − v) 6= 0 so that 0 = c · u 6= c · v. Since c · v 6= 0, this error will be detected. 16. (a) Compute c · u0 , where u0 is the given UPC with the second and third components transposed: c · u0 = [3, 1, 3, 1, 3, 1, 3, 1, 3, 1, 3, 1] · [0, 4, 7, 9, 2, 7, 0, 2, 0, 9, 4, 6] =3·0+1·4+3·7+1·9+3·2+1·7+3·0+1·2+3·0+1·9+3·4+1·6 = 76 = 6 in Z10 . Since the dot product is nonzero, this error will be detected. (b) Since 3 · 4 + 9 = 21 while 3 · 9 + 4 = 31, and 21 = 31 = 1 in Z10 , transposing the third and fourth entries in the UPC will not change the dot product. So this transposition error will not be detected. (c) Suppose that transposing ui and ui+1 does not change the dot product, so that if the UPC with the entries transposed is u0 , then c · u = c · u0 = 0. Then u0 = u + ei (ui+1 − ui ) + ei+1 (ui − ui+1 ), and thus 0 = c · u0 = c · (u + ei (ui+1 − ui ) + ei+1 (ui − ui+1 )) = c · u + c · ei (ui+1 − ui ) + c · ei+1 (ui − ui+1 )) = ci (ui+1 − ui ) + ci+1 (ui − ui+1 ) = ui (ci+1 − ci ) + ui+1 (ci − ci+1 ). If i is even, then ci = 1 and ci+1 = 3, and we get 2ui − 2ui+1 = 2ui + 8ui+1 = 0 in Z10 . If i is odd, then ci = 3 and ci+1 = 1, so we get −2ui + 2ui+1 = 8ui + 2ui+1 = 0 in Z10 . Note that in part (b) above, i = 3 is odd, and 8u3 + 2u4 = 8 · 4 + 2 · 9 = 32 + 18 = 50 = 0 in Z10 , as expected. 17. Here c = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1] and u = [0, 3, 8, 7, 9, 7, 9, 9, 3, d], so that c · u = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1] · [0, 3, 8, 7, 9, 7, 9, 9, 3, d] = 10 · 0 + 9 · 3 + 8 · 8 + 7 · 7 + 6 · 9 + 5 · 7 + 4 · 9 + 3 · 9 + 2 · 3 + 1 · d = 298 + d = 1 + d in Z11 . So the check digit is d = X, since X represents 10, and 1 + 10 = 0 in Z11 . 18. Here c = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1] and u = [0, 3, 9, 4, 7, 5, 6, 8, 2, d], so that c · u = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1] · [0, 3, 9, 4, 7, 5, 6, 8, 2, d] = 10 · 0 + 9 · 3 + 8 · 9 + 7 · 4 + 6 · 7 + 5 · 5 + 4 · 6 + 3 · 8 + 2 · 2 + 1 · d = 246 + d = 4 + d in Z11 . So the check digit is d = 7, since 4 + 7 = 0 in Z11 .

636

CHAPTER 8. CODES

19. (a) Let c be the check vector as in the previous exercise, and let u = [0, 4, 4, 9, 5, 0, 8, 3, 5, 6]. Then c · u = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1] · [0, 4, 4, 9, 5, 0, 8, 3, 5, 6] = 10 · 0 + 9 · 4 + 8 · 4 + 7 · 9 + 6 · 5 + 5 · 0 + 4 · 8 + 3 · 3 + 2 · 5 + 1 · 6 = 218 = 9 in Z11 . Since the dot product is not zero, this ISBN-10 is not correct. (b) Substitute an unknown digit d for the 5 in the fifth entry and recompute the dot product: c · u = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1] · [0, 4, 4, 9, d, 0, 8, 3, 5, 6] = 10 · 0 + 9 · 4 + 8 · 4 + 7 · 9 + 6 · d + 5 · 0 + 4 · 8 + 3 · 3 + 2 · 5 + 1 · 6 = 188 + 6d = 1 + 6d in Z11 . Solving 1 + 6d = 0 in Z11 gives d = 9. The correct digit was 9. 20. (a) Let u0 be u with the fourth and fifth entries transposed: u0 = [0, 6, 7, 7, 9, 6, 2, 9, 0, 6]. Then c · (u − u0 ) = c · [0, 0, 0, 2, −2, 0, 0, 0, 0, 0] = 7 · 2 + 6 · (−2) = 2 6= 0. So this transposition error will be detected. (b) See part (c). (c) A correct ISBN-10 vector u satisfies the check constraint (in Z11 ) 10 X (11 − i)ui = 0. i=1

Suppose that uj and uj+1 are transposed, giving a new vector u0 . Then ui = u0i except for i = j or i = j + 1, so that c · u0 =

10 X (11 − i)u0i i=1

=

10 X (11 − i)ui − (11 − j)uj + (11 − j)u0j − (11 − (j + 1))uj+1 + (11 − (j + 1))u0j+1 i=1

= 0 + (11 − j)(u0j − uj ) + (11 − (j + 1))(u0j+1 − uj+1 ) = (11 − j)(uj+1 − uj ) + (10 − j)(uj − uj+1 ) = uj+1 − uj . So unless uj and uj+1 are equal, in which case the transposition does not change the vector, the resulting checksum will be nonzero, so the error will be detected. 21. (a) Let c be the ISBN-10 check vector, and let u0 = [0, 8, 3, 7, 0, 9, 9, 0, 2, 6]. Then c · u0 = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1] · [0, 8, 3, 7, 0, 9, 9, 0, 2, 6] = 10 · 0 + 9 · 8 + 8 · 3 + 7 · 7 + 6 · 0 + 5 · 9 + 4 · 9 + 3 · 0 + 2 · 2 + 1 · 6 = 236 = 5 6= 0 in Z11 . Since the dot product is not zero, this ISBN-10 is not correct. (b) From part (c) of Exercise 20, the dot product we got in part (a) is the difference (in Z11 of the two transposed entries in the correct vector u. So we are looking for two entries u0j = uj+1 and u0j+1 = uj such that uj+1 − uj = u0j − u0j+1 = 5 in Z11 . The only such pair are the second and third entries, so the correct ISBN-10 must have been [0, 3, 8, 7, 0, 9, 9, 0, 2, 6].

8.2. ERROR-CORRECTING CODES

637

(c) We were able to correct the error in the ISBN-10 in part (b) because there was only one pair of entries satisfying the required difference constraint. But if an ISBN-10 has two such pairs, we have no way of knowing which pair was actually transposed. For example, consider the vector u0 = [0, 8, 3, 7, 0, 9, 9, 6, 1, 1]. Then computing c · u0 again gives 5, but we do not know whether the transposition occurred between positions 2 and 3 or between positions 8 and 9. The correct ISBN-10 could have been either [0, 3, 8, 7, 0, 9, 9, 6, 1, 1] or [0, 8, 3, 7, 0, 9, 9, 1, 6, 1].

8.2

Error-Correcting Codes

1. For example, suppose [0, 0] is transmitted and an error occurs in the last component, so that [0, 0, 0, 1] is received. Since this is not a legal code vector, there must have been an error; however, the received vector differs from two legal code vectors, [0, 0, 0, 0] and [0, 1, 0, 1], in one position each, so we cannot tell which binary digit is in error. 2. Any single or double error resulting from a mistransmission of 0 will give a result vector u0 that has one or two 1’s. Any single double error resulting from a mistransmission of 1 will give a result vector u1 that has one or two 0’s, so that it has three or four 1’s. So by counting the number of 1’s, we can tell what vector was actually transmitted: if there are two or fewer, then a 0 was transmitted; otherwise, a 1 was transmitted. 3. With  1 0  0  G= 0 1  1 0

0 1 0 0 1 0 1

0 0 1 0 0 1 1

 0 0  0  1 , 1  1 1

we get   1   1   1   1 0    Gx = G   =  0 . 0 0   0 1 1 4. With G as in Example 8.7 and Exercise 3,   0   1   0   1 1    Gx = G   =  1 . 1 0   1 0 1

638

CHAPTER 8. CODES

5. With G as in Example 8.7 and Exercise 3,   1   1   1   1 1  = 1 . Gx = G  1   1   1 1 1 6. With  1 P = 1 0

1 0 1

0 1 1

1 1 1

1 0 0

0 1 0

 0 0 , 1

we have   0 1     0 0   0  = 0 . 0 Pc = P    1 0   0 1 Since the product is zero, no single-bit error occurred. The first four components of the received code are the original message vector, so the original x = [0, 1, 0, 0]T . 7. With  1 P = 1 0

1 0 1

0 1 1

1 1 1

1 0 0

0 1 0

 0 0 , 1

we have   1 1     0 1    = 0 . 0 P c0 = P    1 1   1 0 Since the product is nonzero, there was an error; since the result vector matches the second column of P , we conclude that the second binary digit was in error. Making that change, x is the corrected first four columns of the result, or [1, 0, 0, 0]T . 8. With  1 P = 1 0

1 0 1

0 1 1

1 1 1

1 0 0

0 1 0

we have   0 0     1 0   0  = 1 . 1 Pc = P    1 0   1 0

 0 0 , 1

8.2. ERROR-CORRECTING CODES

639

Since the product is nonzero, there was an error; since the result vector matches the sixth column of P , we conclude that the sixth binary digit was in error. The sixth digit is part of the check code, not of the message, so that the correct message is the first four digits of the result, or x = [0, 0, 1, 1]T . 9. (a) If a = [a1 , a2 , a3 , a4 , a5 , a6 ] ∈ Z62 is an arbitrary vector, the encoded vector has a seventh compoP7 nent a7 such that i=1 ai = 0 ∈ Z2 . This can be encoded using the matrix P = [1, 1, 1, 1, 1, 1, 1], since   a1 a2    a3  X 7     a ai . P  4 = a5  i=1   a6  a7 (b) Using the definitions, we have B = [1, 1, 1, 1, 1, matrix for this code is  1 0 0 1    0 0  I6 = 0 0 B 0 0  0 0 1 1

1], n = 7, and k = 6. Then a standard generator 0 0 1 0 0 0 1

0 0 0 1 0 0 1

0 0 0 0 1 0 1

 0 0  0  0 . 0  1 1

(c) By Theorem 8.1, this is an error-correcting code if and only if the columns of P are nonzero and distinct. But all the columns of P are the same, so this is not an error-correcting code. 10. (a) The code words are found by multiplying G by each possible input:

 1 (b) Here B = 0 1

[0, 0]



[0, 1]



[1, 0]



[1, 1]



  0 G = [0, 0   0 G = [0, 1   1 G = [1, 0   1 G = [1, 1

0, 0, 0, 0]T 1, 0, 1, 1]T 0, 1, 0, 1]T 1, 1, 1, 0]T .

 0 1, so that the parity check matrix is 1

 P = B

I3



 1 = 0 1

0 1 1

1 0 0

0 1 0

 0 0 . 1

Since the columns of P are nonzero and distinct, this is a single error-correcting code.

640

CHAPTER 8. CODES

11. (a) The code words are found by multiplying G by each possible input:   0 [0, 0, 0] → G = [0, 0, 0, 0, 0, 0]T 0   0 [0, 0, 1] → G = [0, 0, 1, 0, 0, 1]T 1   1 [0, 1, 0] → G = [0, 1, 0, 0, 1, 1]T 0   1 [0, 1, 1] → G = [0, 1, 1, 0, 1, 0]T 1   0 [1, 0, 0] → G = [1, 0, 0, 1, 1, 1]T 0   0 [1, 0, 1] → G = [1, 0, 1, 1, 1, 0]T 1   1 [1, 1, 0] → G = [1, 1, 0, 1, 0, 0]T 0   1 [1, 1, 1] → G = [1, 1, 1, 1, 0, 1]T . 1  1 (b) Here B = 1 1

0 1 1

 0 0, so that the parity check matrix is 1  1 0 0 1   P = B I3 = 1 1 0 0 1 1 1 0

0 1 0

 0 0 . 1

The columns of P are nonzero, but the third and sixth columns are the same, so this is not a single error-correcting code. 12. The code in Example 8.6 has generating matrix   1 G = 1 . 1     1 1 1 0 So we can identify A = and then P = , appending a copy of the identity matrix to A. 1 1 0 1 From P we see that k = 1 and n − k = 2, so that n = 3. Therefore this is a (3, 1) Hamming code. 13. With n = 15 and k = 11, there are n − k = 4 parity check equations, and P has 15 columns. The four parity check equations give rise to the last four columns of P as a 4 × 4 identity matrix; the other eleven columns must be the other 11 nonzero elements of Z42 in some order. One possible P is   1 1 1 1 0 1 1 1 0 0 0 1 0 0 0 1 1 1 0 1 1 0 0 1 1 0 0 1 0 0  P = 1 1 0 1 1 0 1 0 1 0 1 0 0 1 0 . 1 0 1 1 1 0 0 1 0 1 1 0 0 0 1 Using Theorem 8.1, we get  1 1 A=B= 1 1

1 1 1 0

1 1 0 1

1 0 1 1

0 1 1 1

1 1 0 0

1 0 1 0

1 0 0 1

0 1 1 0

0 1 0 1

 0 0 . 1 1

8.3. DUAL CODES

641

Therefore the generating matrix for the Hamming (15,11)  1 0 0 0 0 0 0 0 1 0 0 0 0 0  0 0 1 0 0 0 0  0 0 0 1 0 0 0  0 0 0 0 1 0 0  0 0 0 0 0 1 0  0 0 0 0 0 0 1  G= 0 0 0 0 0 0 0 0 0 0 0 0 0 0  0 0 0 0 0 0 0  0 0 0 0 0 0 0  1 1 1 1 0 1 1  1 1 1 0 1 1 0  1 1 0 1 1 0 1 1 0 1 1 1 0 0

code is 0 0 0 0 0 0 0 1 0 0 0 1 0 0 1

0 0 0 0 0 0 0 0 1 0 0 0 1 1 0

0 0 0 0 0 0 0 0 0 1 0 0 1 0 1

 0 0  0  0  0  0  0  0 . 0  0  1  0  0  1 1

14. Suppose that B = A; then G=

  Ik , A

 P = A

 In−k .

Note that A is (n − k) × k. Choose x ∈ Zn2 . Then     Ik P Gx = A In−k x = (AIk + In−k A)x = 2Ax. A However, in Z2 , 2 = 0, so that 2Ax = 0Ax = 0. 15. Suppose   I G= k , A

 P = A

In−k



are standard generator and parity check matrices for the same code. Choose x ∈ Zk2 , and let c = Gx be the encoding of x. If there is a transmission error in the ith column of c, receiving c0 instead, then P c0 = pi . Similarly, if there is a transmission error in the j th column of c, receiving c00 instead, then P c00 = pj . Thus if pi = pj , we cannot tell whether the received vector corresponds to an error in the ith or j th column.

8.3

Dual Codes

1. Add column 2 to column 1, giving  1 G0 = 0 1

 0 1 , 0

which is in standard form. Since we did not use R1, this is the same code as C. 2. Perform the following set of  1 1  0  1 0

column operations:   0 1 1 0 0 0  C1 ↔C3   1 1  −−−−−→ 1  0 1 0 0 1 1

0 0 1 1 0

  1 1 0 1  C2 ↔C3   0  −−−−−→ 1  0 1 0 1

1 1 0 1 0

 0 0  1 . 1 0

642

CHAPTER 8. CODES Continue, adding columns to each other:  1 0  1  0 1

1 1 0 1 0

  1 0 0 0  C +C →C  1 2 2  1  −−−−−−−−→ 1 0 1 1 0

0 1 1 1 1

  1 0 C3 +C1 →C1  − − − − − − − − → 0 0  0 1   C +C →C 2 1 1 −−3−−−2−−−→ 1 0

0 1 0 0 1

 0 0  0 1 =G. 1 0

This matrix G0 is in standard form. Since we did not use R1, the code C 0 is the same as the code C. 3. Since the first row of G is zero, we will need to use the operation R1, so the resulting code C 0 will be equivalent to but not equal to C. First exchange the first and fourth rows, then apply the following column operations:         1 0 0 1 0 0 1 1 1 C1 +C2 →C2 1 0 0 − − − − − − − − → 1 1 0 C +C →C 1 1 0 C +C →C 0 1 0 1 0 1 0   2 1 1    3 2 2    0 1 1 −−−−−−−−→ 0 0 1 −−−−−−−−→ 0 0 1 = G . 0 1 1 C1 +C3 →C3 0 0 0 0 0 0 0 0 0 −−−−−−−−→ 0 0 0 This matrix G0 is in standard form. 4. Since the first and second rows of G are identical, we will need to use the operation R1, so the resulting code C 0 will be equivalent to but not equal to C. First exchange the first and third rows, then apply the following column operation:     1 0 1 0   1 1  C +C →C 0 1  0 2 1 1   1 1 −  −−−−−−−→ 0 1 = G .  0 0 0 0 1 0 1 0 This matrix G0 is in standard form. 5. Since there is only one row, and the last entry is not a 1 (the one-dimensional identity matrix), we must use C1. Interchange the second and third columns, giving   P0 = 1 0 1 . This is in standard form; since we used C1, the corresponding code is equivalent to but not equal to C. 6. Perform the following row operations:    1 1 0 1 R1 ↔R2 1 −−−−−→ 1 1 1 1 1

1 1

1 0

 R2 +R1 →R1  1 − −−−−−−−→ 0 1 1

0 1

1 0

 0 = P 0. 1

This matrix P 0 is in standard form. Since we did not use C1, the corresponding code is equal to C. 7. Since the third and fourth columns of P are identical, we cannot get a 3 × 3 identity matrix in columns three through five without using C1. So the resulting code C 0 will be equivalent to but not equal to C. First exchange columns 1 and 4, and then perform the following row operations:       1 1 1 0 0 1 1 1 0 0 1 1 1 0 0 0 1 0 1 1 0 1 0 1 1 R3 +R2 →R2 0 0 0 1 0 = P 0 . R1 +R3 →R3 1 0 1 0 1 −−−−−−−−→ 0 1 0 0 1 −−−−−−−−→ 0 1 0 0 1 This matrix P 0 is in standard form.

8.3. DUAL CODES

643

8. Since the third column of P is zero, we cannot get a 2 × 2 identity matrix in the last two columns without using C1. So the resulting code C 0 will be equivalent to but not equal to C. First exchange columns 2 and 3, and then perform the following row operation:  0 1

0 0

1 0

 R2 +R1 →R1  1 − −−−−−−−→ 1 1 1

0 0

1 0

 0 = P 0. 1

This matrix P 0 is in standard form.  T 9. To find the dual code, we must find the set of vectors x = x1 x2 x3 such that c · x = 0 for every c ∈ C. This is the same as finding the null space of the matrix A whose rows are the codewords in C, so that   0 0 0 A= . 0 1 0 Then Ax = 0 gives two equations, one of which is all zeros, so we can ignore it. The second equation is x2 = 0. So the general solution is     x1 s x2  = 0 , x3 t and thus C⊥

   0 = 0 ,  0

  1 0 , 0

  0 0 , 1

  1  0 .  1

 T 10. To find the dual code, we must find the set of vectors x = x1 x2 x3 such that c · x = 0 for every c ∈ C. This is the same as finding the null space of the matrix A whose rows are the codewords in C, so that   0 0 0 1 1 0  A= 0 0 1 . 1 1 1 Then Ax = 0 gives four equations, one of which is all zeros, so we can ignore it. The other equations give the linear system x1 + x2

=0 x3 = 0

x1 + x2 + x3 = 0. The second equation forces x3 = 0, and then either x1 = x2 = 0 or x1 = x2 = 1. Thus     1   0 C ⊥ = 0 , 1 .   0 0  T 11. To find the dual code, we must find the set of vectors x = x1 x2 x3 x4 such that c · x = 0 for every c ∈ C. This is the same as finding the null space of the matrix A whose rows are the codewords in C, so that   0 0 0 0 0 1 0 0  A= 0 1 0 1 . 0 0 0 1

644

CHAPTER 8. CODES Then Ax = 0 gives four equations, one of which is all zeros, so we can ignore it. The second equation is x2 = 0, and the fourth is x4 = 0. The third is x2 + x4 = 0, which is satisfied by x2 = x4 = 0. Finally, x1 and x3 are arbitrary. Thus         0 1  1 0            0 0 0  ,   ,   , 0 . C⊥ =  0 0 1 1      0 0 0 0

 T 12. To find the dual code, we must find the set of vectors x = x1 x2 x3 x4 x5 such that c · x = 0 for every c ∈ C. This is the same as finding the null space of the matrix A whose rows are the codewords in C, so that   0 0 0 0 0 0 1 1 0 1  A= 1 0 0 1 0 . 1 1 1 1 1 Then Ax = 0 gives four equations, one of which is all zeros, so we can ignore it. The remaining three equations give the linear system x2 + x3

+ x5 = 0

x1

+ x4

=0

x1 + x2 + x3 + x4 + x5 = 0. Note that the third equation is the sum of the first two, so we need only consider the first two. The second equation gives x1 = x4 , while the first tells us that x2 , x3 , and x5 must have an even number of 1’s. Thus                 1  0 1 0 1 0 1 0       0 1 1 0 0 1 1  0                    , 0 , 1 , 1 , 1 , 1 , 0 , 0 . 0 C⊥ =                    0 1 0 1 0 1 0 1       1 1 1 1 0 0 0 0 13. Let P ⊥ = GT =

 1 1

1 1

1 0

 0 . 1

P ⊥ is in standard form, with  1 A= 1 Thus

14. Since

 1 . 1

 1   0 I ⊥ G = = 1 A 1  1 G = 0 T

0 1

1 0

 0 1 . 1 1 1 1

0 1



is not in standard form, add the first row to the second; now it is in standard form and   1 0 1 1 0 ⊥ P = . 1 1 1 0 1 Then A=

 1 1

0 1

 1 , 1

8.3. DUAL CODES

645

so that

 1   0  I G⊥ = = 0 A 1 1

 0 0  1 . 1 1

0 1 0 0 1

15. P T is not in standard form, but adding the second column of P T to the first gives   1 0 0 1  G⊥ =  1 0 , 1 1 which is in standard form with A=

 1 1

Then P



 = A

 0 . 1 

1 I = 1 

0 1

 0 . 1

1 0

16. P T is not in standard form, but adding the third column of P T to the first gives   1 0 0 0 1 0   ⊥  G = 0 0 1 , 1 0 0 1 1 1 which is in standard form with



1 A= 1 Then  P⊥ = A

0 1

  1 I = 1

 0 . 1 0 1

0 1

1 0

 0 . 1

17. By Theorem 8.2,

G⊥ = P T

 1 1  0  = 1 1  0 0

1 0 1 1 0 1 0

 0 1  1  1 , 0  0 1

P⊥

 1  0 = GT =  0 0

0 1 0 0

0 0 1 0

0 0 0 1

1 1 0 1

1 0 1 1

 0 1 . 1 1

18. (a) For E3 , the parity check  is that  the sum of the entries of the codeword is zero, so the parity check matrix must be P = A I = 1 1 1 . Then A = 1 1 , so that     1 0 I G= = 0 1 . A 1 1 For Rep3 , 0 maps to 0 and 1 maps to 1, so the generator matrix must be       1 I 1 0 0   G = = 1 , so that A = . A0 1 1

646

CHAPTER 8. CODES Then  P 0 = A0

  1 I = 1

1 0

 0 . 1

(b) Note that G0 = P T from part (a). While P 0 6= GT , it can be transformed into GT by adding the second row to the first row and then exchanging the first and second rows, so these are parity check matrices for the same code. 19. The argument is much the same as in Exercise 18. For En , the parity check is that the sum  of the A I = entries of the codeword is zero, so the parity check matrix must be the 1 × n matrix P =     1 1 · · · 1 . Then A is the 1 × (n − 1) matrix 1 1 · · · 1 , so that G is the n × (n − 1) matrix  1 0 0    0 1 0 In−1  G= =  ... ... ... A  0 0 0 1 1 1

··· ··· .. . ··· ···

 0 0  ..  . .  1 1

For Repn , 0 maps to 0 and 1 maps to 1, so the generator matrix must be the n × 1 matrix   1   1 I   G0 = = . A0  ..  1 so that A0 is the (n − 1) × 1 matrix   1 1   A0 =  .  = AT .  ..  1 Then P 0 is the (n − 1) × n matrix

 P 0 = A0

 1   1 In−1 =  .  ..

1 0 .. .

0 1 .. .

0 0 .. .

··· ··· .. .

 0 0  ..  . .

1

0

0

0

···

1

We see that G0 = P T and that P 0 can be transformed into GT by adding the last row to each of the other rows and then moving the last row to the first row (this is the same as several row exchanges). Therefore En and Repn are dual codes. 20. We must show that if x ∈ D⊥ , then x ∈ C ⊥ . But x ∈ D⊥ means that d · x = 0 for all d ∈ D. Since C ⊂ D, certainly also c · x = 0 for all c ∈ C, so that x ∈ C ⊥ . ⊥ 21. It suffices to show that if G is a generator matrix for C, then it is also a generator matrix for C ⊥ . If G is a generator matrix for C, then GT is a parity check matrix for C ⊥ by Theorem 8.2(a). Then T ⊥ by Theorem 8.2(b), we see that GT = G is a generator matrix for C ⊥ . 22. Let G and P be generator and parity check matrices for C. We want G⊥ = G. But G⊥ = P T , so we want G = P T . Since n = 6, it follows from G = P T that n − k = k, so that k = 3. So one such code is     I G= 3 , P = I3 I3 . I3

8.4. LINEAR CODES

8.4

647

Linear Codes

1. Since 0 ∈ / C, this is not a subspace, hence is not a linear code.     1 1 2. Since 0 ∈ C and + = 0 ∈ C, this subset is closed under addition, so is a subspace and thus is 1 1 a linear code.       1 1 0 3. Since 0 + 1 = 1 ∈ / C, this subset is not closed under addition, so is not a subspace and thus 1 0 1 is not a linear code.       1 0 1 4. Since 0 + 1 = 1 ∈ / C, this subset is not closed under addition, so is not a subspace and thus 0 0 0 is not a linear code.       0 0 0 5. Since 0 ∈ C and 1 + 1 = 0 ∈ C, and similarly for other pairs of vectors, this subset is closed 0 1 1 under addition, so is a subspace and thus is a linear code. 6. Since

      0 1 1 0 1 1  + = ∈ 1 0 1 / C, 0 0 0

this subset is not closed under addition, so is not a subspace and thus is not a linear code. 7. If w(x) = m and w(y) = n, then consider x + y. It has a 1 anywhere that either x or y does, but not where they both do. In the overlap, where they both have 1’s, all of these 1’s become 0’s in the sum. So w(x + y) = w(x) + w(y) − 2 overlap. So if w(x) and w(y) are both even, so is w(x + y). Therefore En is closed under addition since En consists of all vectors with even weight, so it is a linear code. 8. Since w(0) = 0, we see that 0 ∈ / On , so that On is not a subspace and thus not a linear code. 9. The other two bases are the other two pairs of nonzero vectors:            0 1  1 0 1         1 0 1 0 1   ,   note that   =   +     1  0 1 1   1    1 1 0 1 1            1 1  0 1 1         0 1 1 1 1   ,   note that   =   +   .   1  1 0 1   0    0 1 1 0 1 10. (a) Since G is n × k, G is 9 × 4. Also, P is (n − k) × n = 5 × 9. (b) G is n × k; P is (n − k) × n. 11. If c ∈ C and c0 ∈ C ⊥ , then c ∈ C ⊥

⊥

by definition of the dual code. Therefore C ⊂ C ⊥

 ⊥ ⊥



⊥

. But by

the proof of Theorem 8.4(a), dim C = n − dim C = n − (n − dim C) = dim C. So the dimensions  ⊥ ⊥ ⊥ of C and C are equal; since the first is a subset of the second, it must be the case that C = C ⊥ .

648

CHAPTER 8. CODES

12. By Theorem 8.4(b), C has 2k vectors and C ⊥ has 2n−k vectors. If C = C ⊥ , then 2k = 2n−k ; multiply through by 2k to get 22k = 2n , so that n = 2k is even. 13. Starting with the basis for R2 given in the text,        1 0 0      1 1 0       r21 =   , r22 =   , r23 =   , 1 0 1       1 1 1 a basis for R3 is, by the definitions,   1     1      1             1 r21 r22 r23 0  , , , =   , r21 r22 r23 1  1   1        1    1

  0 1   0   1  , 0   1   0 1

  0 0   1   1  , 0   0   1 1

  0    0     0    0   . 1   1      1    1

So the set of vectors in R3 is the subspace generated by this basis, which there are 4 basis elements):                         1 1 0 0 0 1 1 1 0 0 0 1    1 1 0 0 0 1 1 1 1 0 0 0                            1 0 1 0 1 0 1 1 0 1 0 1                           1 1 1 0 0 0 1 0 1 1 1 0  ,  ,  ,  ,  ,  ,  ,  ,  ,  ,  ,  , 1 0 0 1 1 1 0 0 1 1 1 0                            1 1 0 1 0 1 0 1 0 1 0 1                            1 0 1 1 1 0 0 1 1 0 0 0    1 1 0 0 0 0 0 0 1 1 1 1

will have 16 vectors (since   1 1   0   0  , 0   0   1 1

  0 1   1   0  , 1   0   0 1

  1 0   0   1  , 0   1   1 0

  0    0     0    0   . 0    0     0    0

14. (a) Since G0 = 1, we get  G0 G1 = G0  G1 G2 = G1

 G2 G3 = G2

  0 1 = 1 1  1  1 0 = 1 1 1  1 1    1 1 0 = 1 1  1  1 1

 0 1 0 1 0 1

 0 0  1 1

0 1 0 1 0 1 0 1

0 0 1 1 0 0 1 1

 0 0  0  0 . 1  1  1 1

(b) Use induction on n. For n = 1, since the columns of G1 form a basis for R1 , we see that G1 generates R1 . Now assume Gn−1 generates Rn−1 . Then we must show that Gn generates all vectors of the form     u 0 , , u 1

8.4. LINEAR CODES

649

where u is a basis vector for Rn−1 . Let u be a basis vector for Rn−1 , and choose x such that Gn−1 x = u; such an x exists since Gn−1 generates Rn−1 . Then          x Gn−1 0 x Gn−1 x u Gn = = = 0 Gn−1 1 0 Gn−1 x u          0 Gn−1 0 0 Gn−1 0 + 0 · 1 0 Gn = = = , 1 Gn−1 1 1 Gn−1 0 + 1 · 1 1 as required. 15. G2 can be put into standard form by the following sequence of column operations:  1 1 G2 =  1 1

0 1 0 1

  1 0 1 0 C +C →C 2 1 2   1 −−−−−−−−→ 1 1 1

1 0 1 0

 C +C →C1  0 0 −−2−−−1−−−→ 1 0   C3 +C2 →C3 0 1 − −−−−−−−→ 1 1

  1 1 1 0 C +C →C 3 1 1   0 −−−−−−−−→ 0 0 1

1 0 1 0

1 0 1 0

 1   0 = A . 0 I 1

This matrix is in standard form, and  since we did not use the operation R1, it represents the same code, namely R2 . So A = 1 1 1 , and thus the associated parity check matrix is  P = I

  A = 1

1

1

 1 .

16. G cannot be put into standard form without using a row operation, since the 4 × 4 matrix consisting of the bottom four rows of G has determinant zero. Instead, we use Theorem 8.2: compute a basis T for C ⊥ and then a generator matrix G⊥ for C ⊥ ; then G⊥ is a parity check matrix for C, since  ⊥ ⊥ 7 C = C. Now, C is a four-dimensional subspace of Z2 , so its complement is three-dimensional. Examining the basis of C given in Exercise 13, clearly a basis for C ⊥ is       1  1 1      0 1 1              1 0 1                  1 0 0  ,  ,   , 1 1 0           0 1 0            1 0 0       0 0 0 so that G⊥ is the matrix whose columns are these vectors. (Alternatively, if desired one can construct the matrix whose rows are the entries of the basis for R3 and determine its nullspace, as in the text.) Then   1 0 1 0 1 0 1 0  T P = G⊥ = 1 1 0 0 1 1 0 0 . 1 1 1 1 0 0 0 0 17. Let E ⊆ C be the set of even vectors and O ⊆ C the set of odd vectors. Suppose that O is nonempty, and choose c0 ∈ O. Consider the map ψ : E → C : e 7→ c0 + e, and let O0 = {c0 +e : e ∈ E} be the image of this map. Note that ψ is one-to-one, since c0 +e1 = c0 +e2 implies that e1 = e2 . Claim O0 = O. First, for any e ∈ E, w(c0 + e) = w(c0 ) + w(e) − 2 overlap,

650

CHAPTER 8. CODES from Exercise 7. But w(c0 ) is odd and w(e) is even, so that w(c0 + e) is odd and therefore c0 + e ∈ O. Thus O0 ⊆ O. Next, choose o ∈ O. Note that c0 + c0 = 0 since we are working in Z2 , so that o = 0 + o = (c0 + c0 ) + o = c0 + (c0 + o). But c0 + o ∈ E, since its weight is w(c0 ) + w(o) − 2 overlap, and all three terms are even. Thus o = c0 + e0 for e0 ∈ E, so that o ∈ O0 . Hence O ⊂ O0 , so that the O = O0 . Finally, we see that ψ is a one-to-one map from E onto O, so that the two sets have the same number of elements. Thus exactly have of the code vectors have odd weight.

8.5

The Minimum Distance of a Code

1. Since x − y = x + y in Z2 , we have dH (x, y) = kx − ykH = kx + ykH = w(x + y). So the distance between two vectors is just the weight of their sum. Denoting by c0 , c1 , and c2 the vectors in C, we have       0 1 1 w(c0 + c1 ) = w 1 = 1, w(c0 + c2 ) = w 1 = 2, w(c1 + c2 ) = w 0 = 1. 0 0 0 So the minimum distance is d(C) = min(1, 2, 1) = 1. 2. Since x − y = x + y in Z2 , we have dH (x, y) = kx − ykH = kx + ykH = w(x + y). So the distance between two vectors is just the weight of their sum. Denoting by c0 , c1 , c2 , and c3 the vectors in C, we have       0 1 1 1 0 1       w(c0 + c3 ) = w  w(c0 + c2 ) = w  w(c0 + c1 ) = w  1 = 2, 0 = 2, 1 = 4, 0 1 1       0 1 1 1 0 1            = 4. w(c1 + c2 ) = w   = 2, w(c1 + c3 ) = w   = 2, w(c2 + c3 ) = w   1 0 1 0 1 1 So the minimum distance is d(C) = min(2, 4) = 2. 3. Note that   1 1     dH 0 ,  ..   . 

    0 = 2,  

0 so that d(C) ≤ 2. Now choose x, y ∈ En , neither of which is the zero vector, with x 6= y. Then since x + y ∈ En and is nonzero, we know that x + y has at least two ones, so its weight is at least 2. Therefore no pair of nonzero unequal vectors have a distance less than 2, so that d(C) = 2. 4. Since Repn = {0, 1}, we see that d(C) = w(0 + 1) = n.

8.5. THE MINIMUM DISTANCE OF A CODE

651

  A form a basis for the code vectors of C. Since no two columns of A are I identical, and the weight of the sum of any two columns of the identity matrix is 2, it follows that the weight of the sum of any two columns of G is at least 3. Thus d(C) ≥ 3. But P = I A , and a1 + a2 + a3 = 0, so by Theorem 8.7, d(C) ≤ 3. Therefore d(C) = 3.

5. The columns of G =

6. It is easy to see that the rows of P are linearly independent (if they were not, the sum of two or three of them would be zero, and it is not). Therefore rank(P ) = 3, so that the smallest d for which P has d linearly dependent columns is d = 4, and it follows from Theorem 8.7 that d(C) = 4. 7. Let C = {c1 , c2 , c3 , c4 } respectively. Then   0 1     dH (c1 , c2 ) = w  0 = 3, 1 1   1 1     dH (c2 , c3 ) = w  1 = 3, 0 0

  1 0     dH (c1 , c3 ) = w  1 = 4, 1 1   1 0     dH (c2 , c4 ) = w  1 = 4, 1 1

  1 1     dH (c1 , c4 ) = w  1 = 3 0 0   0 1     dH (c3 , c4 ) = w  0 = 3. 1 1

Thus d(C) = 3. Since   0 1     dH (c1 , u) = w  0 = 2, 1 0   1 1     dH (c3 , u) = w  1 = 4, 0 1

  0 0     dH (c2 , u) = w  0 = 1, 0 1   1 0     dH (c4 , u) = w  1 = 3, 1 0

we conclude that u should decode as c2 . Since   1 1     dH (c1 , v) = w  0 = 3, 1 0   0 1     dH (c3 , v) = w  1 = 3, 0 1

  1 0     dH (c2 , v) = w  0 = 2, 0 1   0 0     dH (c4 , v) = w  1 = 2, 1 0

the identity of v is not determined. It could be decoded as either c2 or c4 using nearest neighbor

652

CHAPTER 8. CODES decoding. Since   1 0     dH (c1 , w) = w  0 = 3, 1 1   0 0     dH (c3 , w) = w  1 = 1, 0 0

  1 1     dH (c2 , w) = w  0 = 2, 0 0   0 1     dH (c4 , w) = w  1 = 4, 1 1

we conclude that w should decode as c3 . 8. The code generated by (the columns of) G is   0    0          0  C = 0 ,  0        0    0

  1 0   0   1 ,   1   0 1

  0 1   0   1 ,   0   1 1

  0 0   1   0 ,   1   1 1

  1 1   0   0 ,   1   1 0

  1 0   1   1 ,   0   1 0

  0 1   1   1 ,   1   0 0

  1    1     1   0 .   0     0    1

Letting the ith element of C be ci , we have: • min{dH (ci , u)} = 1 only for c5 , so that u decodes as c5 . • min{dH (ci , v)} = 1 only for c8 , so that v decodes as c8 . • min{dH (ci , w)} = 1 only for c4 , so that w decodes as c4 . 9. From Exercise 4, d(Repn ) = 1, so that Rep8 = {0, 1} ⊂ Z82 is such a code. 10. If such a code exists, then its parity check matrix P is (8 − 2) × 8 = 6 × 8. The rank of such a matrix can be at most 6, so that the maximum number of linearly independent columns is 6. But that means that the minimum number of linearly dependent columns is at most 7. Thus we cannot have d = 8, and such a code cannot exist. 11. If such a code exists, then its parity check matrix P is (8 − 5) × 8 = 3 × 8. The rank of such a matrix can be at most 3, so that the maximum number of linearly independent columns is 3. But that means that the minimum number of linearly dependent columns is at most 4. Thus we cannot have d = 5, and such a code cannot exist. 12. By Theorem 8.5, R3 is an (8, 4) linear code, and from Example 8.16, it has minimum distance d = 23−1 = 4, so it is an (8, 4, 4) linear code. 13. Model the code vectors as four vertices of an octagon. With         0 0 1 1 0 1 0 1        c0 =  0 , c1 = 0 , c2 = 1 , c3 = 1 , 0 1 0 1 we have

8.5. THE MINIMUM DISTANCE OF A CODE

653

c1

c3

c0

c2

14. If c, d ∈ C, then since we are working in Z2 , we have c − d = c + d, and then dH (c, d) = kc − dkH = kc + dkH = w(c + d). But c + d ∈ C. Thus dH (C) = min{dH (c, d) : c, d ∈ C} = min{w(c + d) : c, d ∈ C} = min{w(c) : c ∈ C, c 6= 0}. 15. A parity check matrix for any linear (n, k) code is (n − k) × n, so that rank(P ) 6= n − k. Then the number of linearly independent columns of P is at most n − k, so that the minimum number of linearly dependent columns is at most n − k + 1. Thus d ≤ n − k + 1, so that d − 1 ≤ n − k. 16. From the previous exercise, we know that d ≤ n − k + 1, so that n − k ≥ d − 1. Now, if every set of n − k columns of P are linearly independent, then n − k < d. Together, these two inequalities force n − k = d − 1, or d = n − k + 1. Conversely, if d = n − k + 1, then Theorem 8.7 says that n − k + 1 is the smallest integer for which P has that many linearly dependent columns. But if this is true, then it must be the case that every set of n − k columns is linearly independent.