
Solutions for Applied Numerical Linear Algebra

Zheng Jinchang January 25, 2014

This document contains solutions for Applied Numerical Linear Algebra by James W. Demmel. It covers the majority of the exercises from Chapter I to Chapter VI, except the programming exercises and a few others. Since I prepared it myself, it will surely contain errors; if you find any, or have other comments, you can email me at [email protected]. I hope these solutions are helpful :)

CONTENTS

1 Solutions for Chapter I: Introduction

2 Solutions for Chapter II: Linear Equation Solving

3 Solutions for Chapter III: Linear Least Squares Problems

4 Solutions for Chapter IV: Nonsymmetric Eigenvalue Problems

5 Solutions for Chapter V: The Symmetric Eigenproblem and Singular Value Decomposition

6 Solutions for Chapter VI: Iterative Methods for Linear Systems

1 SOLUTIONS FOR CHAPTER I: INTRODUCTION

Question 1.1. Let A be an orthogonal matrix. Show that det(A) = ±1. Show that if B is also orthogonal and det(A) = −det(B), then A + B is singular.

Solution. By the definition of an orthogonal matrix, A A^T = I. Taking determinants on both sides and using det(A) = det(A^T), we get

(det(A))^2 = 1,

so det(A) = ±1.

To prove that A + B is singular, consider det(A + B) = det(A^T + B^T). Since A + B = (I + B A^T)A and A^T + B^T = B^T(B A^T + I), this yields

det(I + B A^T) det(A) = det(A + B) = det(A^T + B^T) = det(B^T) det(B A^T + I).

Since det(A) = −det(B), it follows that det(B A^T + I) = 0. Thus det(A + B) = det(I + B A^T) det(A) = 0, i.e., A + B is singular. □

Question 1.2. The rank of a matrix is the dimension of the space spanned by its columns. Show that A has rank one if and only if A = ab^T for some column vectors a and b.

Remark. For this question to hold, we must assume that both a and b are nonzero.

Proof. If A = ab^T, write b = (b_1, b_2, ..., b_n)^T and partition A by columns as A = (α_1, α_2, ..., α_n). Then α_i = b_i a, so every column of A is a multiple of a. Since a, b ≠ 0, we have A ≠ 0, and hence dim span(α_1, α_2, ..., α_n) = 1, i.e., rank(A) = 1.

Conversely, if rank(A) = 1, partition A by columns as A = (α_1, α_2, ..., α_n). Without loss of generality suppose α_1 ≠ 0. Since rank(A) = 1, there is only one linearly independent column of A, so α_i = b_i α_1 for suitable scalars b_i (with b_1 = 1). Setting a = α_1 and b = (b_1, b_2, ..., b_n)^T, we have a, b ≠ 0 and A = ab^T. □

Question 1.3. Show that if a matrix is orthogonal and triangular, then it is diagonal. What are its diagonal elements?


Proof. Without loss of generality, suppose A = (a_ij) is orthogonal and upper triangular. Since A is upper triangular, its first column is a_11 e_1. The condition A^T A = I says the columns of A are orthonormal, so a_11^2 = 1 and a_11 a_1i = 0 for i > 1; that is, a_1i = ±δ_1i. The condition A^T A = I then reduces to

A(2:n, 2:n)^T A(2:n, 2:n) = I,

so A(2:n, 2:n) is again orthogonal and upper triangular. The result follows by induction, and the diagonal entries are ±1. □

Question 1.4. A matrix is strictly upper triangular if it is upper triangular with zero diagonal elements. Show that if A is strictly upper triangular and n-by-n, then A^n = 0.

Proof. Suppose B is an n-by-n matrix, and partition B by its columns as B = (b_1, b_2, ..., b_n). Since the i-th column of A has nonzero entries only in its first i − 1 positions, the i-th column of BA is

(BA)(:, i) = Σ_{j=1}^{i−1} A(j, i) b_j.

Thus, the i-th column of BA is a linear combination of columns 1 through i − 1 of B. Substituting B = A, the diagonal and the first superdiagonal of A^2 become zero, because only the first i − 1 entries of the i-th column of A are nonzero. In the same way, the second superdiagonal of A^3 becomes zero as well, and so on: after n multiplications all of the diagonal and the n − 1 superdiagonals are zero, i.e., A^n = 0. □

Question 1.5. Let ‖·‖ be a vector norm on R^m and assume that C ∈ R^{m×n}. Show that if rank(C) = n, then ‖x‖_C ≡ ‖Cx‖ is a vector norm.

Proof. 1. Positive definiteness. For any x ∈ R^n, since ‖·‖ is a vector norm, we have ‖Cx‖ ≥ 0, with ‖Cx‖ = 0 if and only if Cx = 0. Because C has full column rank, the system Cx = 0 has only the zero solution. This proves positive definiteness.


2. Homogeneity. For any x ∈ R^n and α ∈ R, ‖αx‖_C = ‖αCx‖ = |α| ‖Cx‖ = |α| ‖x‖_C.

3. Triangle inequality. For any x, y ∈ R^n, ‖x + y‖_C = ‖Cx + Cy‖ ≤ ‖Cx‖ + ‖Cy‖ = ‖x‖_C + ‖y‖_C.

Consequently, ‖x‖_C ≡ ‖Cx‖ is a vector norm.



Question 1.6. Show that if 0 ≠ s ∈ R^n and E ∈ R^{m×n}, then

‖E (I − s s^T / s^T s)‖_F^2 = ‖E‖_F^2 − ‖E s‖_2^2 / (s^T s).

Proof. Since I − s s^T / s^T s is symmetric and idempotent,

E (I − s s^T / s^T s) (E (I − s s^T / s^T s))^T = E (I − s s^T / s^T s) E^T = E E^T − (E s)(E s)^T / (s^T s),

it follows that

‖E (I − s s^T / s^T s)‖_F^2 = tr(E E^T) − tr((E s)(E s)^T) / (s^T s) = ‖E‖_F^2 − ‖E s‖_2^2 / (s^T s).

Therefore, the question is proved. □

Remark. It could also be proved via the projection property: for any u ∈ R^n, P = u u^T / (u^T u) is the orthogonal projection onto the subspace spanned by u, and I − P is the complementary projection.

Question 1.7. Verify that ‖x y^*‖_F = ‖x y^*‖_2 = ‖x‖_2 ‖y‖_2 for any x, y ∈ C^n.

Proof. Because the Frobenius norm can be evaluated as ‖A‖_F^2 = tr(A A^*), it follows that

‖x y^*‖_F^2 = tr(x y^* y x^*) = (y^* y) tr(x x^*) = ‖y‖_2^2 ‖x‖_2^2.

For the two-norm, we use ‖A‖_2^2 = λ_max(A A^*) with A = x y^*:

‖x y^*‖_2^2 = λ_max(x y^* y x^*) = (y^* y) λ_max(x x^*) = ‖y‖_2^2 λ_max(x x^*).

It remains to determine λ_max(x x^*). Because x x^* is Hermitian,

λ_max(x x^*) = max_{v ≠ 0} (v^* x x^* v) / (v^* v) = max_{v^* v = 1} |v^* x|^2.


By the Cauchy–Schwarz inequality, |v^* x| is largest exactly when v is proportional to x; hence λ_max(x x^*) ≤ ‖x‖_2^2, with equality for v = x/‖x‖_2. Consequently, ‖x y^*‖_2^2 = ‖y‖_2^2 λ_max(x x^*) = ‖x‖_2^2 ‖y‖_2^2, i.e., ‖x y^*‖_2 = ‖x‖_2 ‖y‖_2. Combining the two equations proves the question. □

Remark. If we regard the vectors x and y^* as n-by-1 and 1-by-n matrices, then the consistency of the two-norm gives ‖x y^*‖_2 ≤ ‖x‖_2 ‖y‖_2, and this bound is attained by applying x y^* to the vector y. Since we have not yet proved consistency, I use the method above instead.

Question 1.8. One can identify the degree-d polynomials p(x) = Σ_{i=0}^{d} a_i x^i with R^{d+1} via the vector of coefficients. Let x be fixed. Let S_x be the set of polynomials with an infinite relative condition number with respect to evaluating them at x (i.e., they are zero at x). In a few words, describe S_x geometrically as a subset of R^{d+1}. Let S_x(κ) be the set of polynomials whose relative condition number is κ or greater. Describe S_x(κ) geometrically in a few words. Describe how S_x(κ) changes geometrically as κ → ∞.

Solution. Collect the coefficients of a degree-d polynomial in a vector a ∈ R^{d+1}, and let x = (1, x, ..., x^d)^T be the (fixed) vector of powers of x. S_x consists of the polynomials that are zero at x, i.e., those with

a^T x = 0.

Therefore S_x is the hyperplane through the origin perpendicular to the fixed vector x (its normal vector is x).

As for S_x(κ), set |a| = (|a_1|, ..., |a_{d+1}|) and |x| = (|x_1|, ..., |x_{d+1}|). By the definition of the relative condition number,

|a|^T |x| / |a^T x| ≥ κ.

Using the Cauchy–Schwarz inequality, this implies

‖a‖_2 ‖x‖_2 / |a^T x| ≥ κ,

i.e., |cos ∠(a, x)| ≤ 1/κ, so that

∠(a, x) ≥ arccos(1/κ).

So S_x(κ) consists of coefficient vectors whose angle with the fixed vector x is at least arccos(1/κ); geometrically it is a slab-like neighborhood of the hyperplane S_x. As κ → ∞, arccos(1/κ) → π/2, i.e., the set collapses toward the hyperplane perpendicular to x. □

Question 1.9. This question is too long, so I omit it. For details, please refer to the textbook.


Proof. For the first algorithm, insert a roundoff term (1 + δ_i) at each floating point operation:

ŷ = (log((1 + x)(1 + δ_1)) / x)(1 + δ_2) = ((log(1 + x) + log(1 + δ_1)) / x)(1 + δ_2).

Since x, δ_i ≈ 0, the relative error is

|ŷ − y| / |y| = |log(1 + δ_1) / log(1 + x)| |1 + δ_2| ≈ |δ_1 / x|.

As δ_1 is bounded only by the machine precision, this relative error grows without bound as x → 0. Applying the same technique to the second algorithm, with d = fl(1 + x),

|ŷ − y| / |y| = |((d − 1)/log d) · (log d/(d − 1)) · ((1 + δ_2)/(1 + δ_1)) − 1| ≈ |δ_2 − δ_1| = |δ|,  |δ| ≤ 2ε.



Therefore, the second algorithm remains accurate near x = 0. □
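The difference between the two algorithms is easy to observe numerically. The short Python sketch below is an illustration added here (not part of the original solution); it assumes only the standard math module and uses math.log1p as an accurate reference value.

    # Sketch: compare the two algorithms for log(1+x)/x near x = 0.
    # Algorithm 1: log(1+x)/x computed directly.
    # Algorithm 2: d = 1+x; if d == 1 return 1, else return log(d)/(d-1).
    import math

    def alg1(x):
        return math.log(1.0 + x) / x

    def alg2(x):
        d = 1.0 + x
        if d == 1.0:
            return 1.0
        return math.log(d) / (d - 1.0)

    for k in range(8, 17, 2):
        x = 10.0 ** (-k)
        exact = math.log1p(x) / x          # accurate reference value
        e1 = abs(alg1(x) - exact) / exact
        e2 = abs(alg2(x) - exact) / exact
        print(f"x = 1e-{k:2d}   rel.err alg1 = {e1:.2e}   rel.err alg2 = {e2:.2e}")

Typically the first algorithm's relative error grows as x shrinks, while the second stays near machine precision, matching the analysis above.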

Question 1.10. Show that, barring overflow or underflow,

fl(Σ_{i=1}^{d} x_i y_i) = Σ_{i=1}^{d} x_i y_i (1 + δ_i),  where |δ_i| ≤ d·ε.

Use this to prove the following fact. Let A_{m×n} and B_{n×p} be matrices, and compute their product in the usual way. Barring overflow or underflow, show that

|fl(A · B) − A · B| ≤ n · ε · |A| · |B|.

Here the absolute value of a matrix |A| means the matrix with entries (|A|)_{ij} = |a_{ij}|, and the inequality is meant componentwise.

Proof. Denote by δ_i the multiplication roundoff errors and by δ̂_j the addition roundoff errors. Expanding the formula with the roundoff terms (accumulating the partial sums from i = 1 to d), we get

fl(Σ_{i=1}^{d} x_i y_i) = Σ_{i=2}^{d} (1 + δ_i) (Π_{j=i−1}^{d−1} (1 + δ̂_j)) x_i y_i + (1 + δ_1)(Π_{j=1}^{d−1} (1 + δ̂_j)) x_1 y_1.

Since

1 − (d − i + 1)ε + O(ε^2) ≤ (1 − ε)^{d−i+1} ≤ Π_{j=i−1}^{d−1} (1 + δ̂_j) ≤ (1 + ε)^{d−i+1} ≤ 1 + (d − i + 1)ε + O(ε^2),

we have the following two inequalities:

fl(Σ_{i=1}^{d} x_i y_i) ≤ Σ_{i=2}^{d} (1 + (d − i + 2)ε) x_i y_i + (1 + d·ε) x_1 y_1 ≤ Σ_{i=1}^{d} (1 + d·ε) x_i y_i,

Σ_{i=1}^{d} (1 − d·ε) x_i y_i ≤ Σ_{i=2}^{d} (1 − (d − i + 2)ε) x_i y_i + (1 − d·ε) x_1 y_1 ≤ fl(Σ_{i=1}^{d} x_i y_i).

Therefore fl(Σ_{i=1}^{d} x_i y_i) = Σ_{i=1}^{d} x_i y_i (1 + δ̄_i) with |δ̄_i| ≤ d·ε.

Now turn to the matrix inequality. Since the absolute value of a matrix is defined componentwise, it suffices to prove the bound entry by entry. Let c_{ij} = (A·B)(i, j) = Σ_{k=1}^{n} A(i, k) B(k, j). Applying what we have just proved,

|fl(c_{ij}) − c_{ij}| = |Σ_{k=1}^{n} A(i, k) B(k, j) δ_k| ≤ n·ε Σ_{k=1}^{n} |A(i, k)| |B(k, j)| = n·ε (|A| · |B|)(i, j).

This proves the question.




Question 1.11. Let L be a lower triangular matrix and solve Lx = b by forward substitution. Show that, barring overflow or underflow, the computed solution x̂ satisfies (L + δL)x̂ = b, where |δl_ij| ≤ n·ε·|l_ij| and ε is the machine precision. This means that forward substitution is backward stable. Argue that backward substitution for solving upper triangular systems satisfies the same bound.

Proof. Insert a roundoff term (1 + δ_i) at each floating point operation in the forward substitution formula, and use the result of the last question. We get

x_i = ((b_i − Σ_{k=1}^{i−1} a_ik x_k (1 + δ_k)) / a_ii) (1 + δ̂_i)(1 + δ̄_i),  where |δ_k| ≤ (i − 1)ε, |δ̂_i| ≤ ε, |δ̄_i| ≤ ε.

Solving this equation for b_i, and using 1/((1 + δ̂_i)(1 + δ̄_i)) = 1 + δ_i with |δ_i| ≤ 2ε, we can absorb the term with x_i into the sum:

b_i = Σ_{k=1}^{i} a_ik x_k (1 + δ_ik),  where |δ_ik| ≤ max{2, i − 1}·ε.

Therefore, setting (δL)_ij = a_ij δ_ij, we have proved (L + δL)x̂ = b with |δl_ij| ≤ n·ε·|l_ij|. The same analysis applied to backward substitution for an upper triangular system gives the same bound. □

Remark. The result is slightly different if b_i − Σ_{k=1}^{i−1} a_ik x_k is evaluated as

(((b_i − a_i1 x_1) − a_i2 x_2) − a_i3 x_3) ...,

which may in fact be the evaluation order the question intended. My result is based on the order

b_i − (Σ_{k=1}^{i−1} a_ik x_k).

For this question, there is not major difference between the two. Question 1.12. The question is too long, so I omit it. For details, please refer to the textbook. Remark. In my point of view, the question is intended to prove that complex arithmetics implemented in real arithmetics are backward stable. Thus, it may be sufficient to prove |fl(a ¯ b) − ab| ≤ O(²)ab. My proof tries to find out the expressions of the δ’s. Solution. First, consider the complex addition. Suppose there is such δ = δ1 + δ2 i . Then, as complex addition is two real addition, we have two ways to compute the floating point operations as follows: ˆ + (b 1 + b 2 )(1 + δ)i ¯ = (a 1 + a 2 + b 1 i + b 2 i )(1 + δ1 + δ2 i ). fl (a 1 + b 1 i + a 2 + b 2 i ) = (a 1 + a 2 )(1 + δ)


Solving δ1 , δ2 from the above equations, we get (a 1 + a 2 )2 δˆ + (b 1 + b 2 )2 δ¯ , (a 1 + a 2 )2 + (b 1 + b 2 )2 ˆ (a 1 + a 2 )(b 1 + b 2 )(δ¯ − δ) δ2 = . 2 2 (a 1 + a 2 ) + (b 1 + b 2 )

δ1 =

¯ 2 + |δ¯ − δ| ˆ 2 ≤ 4²2 , we have proved for the complex addition. As Since |δ1 |2 + |δ2 |2 ≤ |δˆ + δ| subtraction is essentially the same as addition, it is the same for complex subtraction . As for mutiplication, use the same technique and the above result, we have the equation as fl ((a 1 + b 1 i )(a 2 + b 2 i )) = (a 1 + b 1 i )(a 2 + b 2 i )(1 + δ1 + δ2 i ) ¡ ¢ = a 1 a 2 (1 + δ¯1 ) − b 1 b 2 (1 + δ¯4 ) + a 2 b 1 (1 + δ¯2 )i + a 1 b 2 (1 + δ¯3 )i (1 + δˆ1 + δˆ2 i ). For simplicity, we ignore the higher order epsilons as they are not the major part of the error. Then solving δ1 , δ2 from the above equations, we get ³ 1 δ1 = (a 12 a 22 − a 1 a 2 b 1 b 2 )δ¯1 + (a 22 b 12 + a 1 a 2 b 1 b 2 )δ¯2 + (a 12 b 22 + (a 1 a 2 − b 1 b 2 )2 + (a 1 b 2 + a 2 b 1 )2 ´ a 1 a 2 b 1 b 2 )δ¯3 − (a 1 a 2 b 1 b 2 + b 12 b 22 )δˆ4 + (a 12 a 22 − b 12 b 22 + a 22 b 12 + a 12 b 22 )δˆ1 − (a 1 b 1 b 22 + a 1 a 22 b 1 )δˆ2 , ³ 1 − (a 12 a 2 b 2 + a 1 a 22 b 1 )δ¯1 + (a 1 a 22 b 1 − a 2 b 12 b 2 )δ¯2 + (a 12 a 2 b 2 − δ2 = (a 1 a 2 − b 1 b 2 )2 + (a 1 b 2 + a 2 b 1 )2 ´ a 1 b 1 b 22 )δ¯3 + (a 1 b 1 b 22 + a 2 b 12 b 2 )δ¯4 + ((a 1 a 2 − b 1 b 2 )2 + (a 1 b 2 + a 2 b 1 )2 )δˆ2 .

We can claim that all the coefficient of each δi is bounded, because of the mean value inequality. Therefore, the question holds for the complex multiplication. Now, we can check whether both the real and imaginary parts of the complex product are always compute accurately. Since we have fl(a 1 + b 1 i )(a 2 + b 2 i ) = (a 1 + b 1 i )(a 2 + b 2 i )(1 + δ1 + δ2 i ), separate the real and imaginary parts as ¯ ¯ ¯ (a 1 a 2 − b 1 b 2 )δ1 − (a 1 b 2 + a 2 b 1 )δ2 ¯ ¯, Errorr = ¯¯ ¯ a1 a2 − b1 b2 ¯ ¯ ¯ (a 1 a 2 − b 1 b 2 )δ2 − (a 1 b 2 + a 2 b 1 )δ1 ¯ ¯. Errori = ¯¯ ¯ a1 b2 + a2 b1 Thus, if a 1 , b 1 → 0, the real part may go wrong; and if b 1 , b 2 → 0, the imaginary part may go wrong. So both parts of the complex product may not always be computed accurately. At last, we check the complex division with the knowledge of complex mutiplication, and my algorithm for complex division is like the normal complex division just without actual complex mutiplication in realizing the denominator. Therefore, we have µ ¶ a1 + b1 i a1 + b1 i (a 1 + b 1 i )(a 2 − b 2 i )(1 + δ¯1 + δ¯2 i ) fl = (1 + δ1 + δ2 i ) = ¡ (1 + δˆ4 ).1 ¢ 2 2 ˆ ˆ ˆ a2 + b2 i a2 + b2 i a 2 (1 + δ1 ) + b 2 (1 + δ2 ) (1 + δ3 ) 1 In reality, there are two division δ’s. For simplicity I suppose they are the same, becasue there are no differ-

ence after the following merging δ’s. Otherwise, the procedure remains the same, but will be slightly more complicated.


Ignoring the higher order epsilons, merging (1 + δ̄_1 + δ̄_2 i)(1 + δ̂_4), (1 + δ̂_1)(1 + δ̂_3) and (1 + δ̂_2)(1 + δ̂_3), and keeping the old notation, we can solve for δ_1, δ_2 as

δ_1 = δ̄_1 − (a_2^2 δ̂_1 + b_2^2 δ̂_2) / (a_2^2 + b_2^2),
δ_2 = δ̄_2.

As all the coefficients of the δ's are at most one in absolute value, this proves the result for complex division. What is more, no matter whether |a| is large or small, a/a evaluates to approximately 1 by the formulas above. □

Question 1.13. Prove Lemma 1.3.

Remark. Since there is no major difference between the real and complex cases, for simplicity I prove the real case.

Proof. In one direction, if A is s.p.d., then for any x, y ∈ R^n define the inner product 〈x, y〉 = x^T A y. Verifying the four criteria of an inner product:

1. 〈x, y〉 = x^T A y = (y^T A x)^T = y^T A x = 〈y, x〉.
2. 〈x, y + z〉 = x^T A (y + z) = x^T A y + x^T A z = 〈x, y〉 + 〈x, z〉.
3. 〈αx, y〉 = α x^T A y = α 〈x, y〉.
4. 〈x, x〉 = x^T A x ≥ 0, and 〈x, x〉 = 0 if and only if x = 0.

Therefore it is an inner product. In the other direction, if 〈·, ·〉 is an inner product, consider its values on the canonical basis vectors e_i of R^n and set A(i, j) = 〈e_i, e_j〉. Then A is symmetric because of the symmetry of the inner product. For any x, y ∈ R^n, expanding in the canonical basis as x = Σ_{i=1}^{n} α_i e_i and y = Σ_{i=1}^{n} β_i e_i, bilinearity gives

〈x, y〉 = Σ_{i=1}^{n} Σ_{j=1}^{n} α_i β_j 〈e_i, e_j〉 = x^T A y.

This means A is symmetric positive definite (positive definiteness of A follows from 〈x, x〉 > 0 for x ≠ 0), and A is uniquely determined by the inner product since the basis is fixed. Thus the question is proved. □

Question 1.14. Prove Lemma 1.5.

Proof. First, since √(a^2 + b^2) ≤ |a| + |b|, we have

‖x‖_2 = √(Σ_{i=1}^{n} x_i^2) ≤ |x_1| + √(Σ_{i=2}^{n} x_i^2) ≤ |x_1| + |x_2| + √(Σ_{i=3}^{n} x_i^2) ≤ ... ≤ Σ_{i=1}^{n} |x_i| = ‖x‖_1.

On the other hand, using 2|x_i x_j| ≤ x_i^2 + x_j^2,

‖x‖_1^2 = (Σ_{i=1}^{n} |x_i|)^2 = Σ_{i=1}^{n} x_i^2 + 2 Σ_{i<j} |x_i x_j| ≤ Σ_{i=1}^{n} x_i^2 + Σ_{i<j} (x_i^2 + x_j^2) = n Σ_{i=1}^{n} x_i^2 = n ‖x‖_2^2.


Therefore we get the first inequality, ‖x‖_2 ≤ ‖x‖_1 ≤ √n ‖x‖_2. For the inequalities involving the infinity-norm, it is clear that

max_i |x_i| ≤ √(Σ_{i=1}^{n} x_i^2) ≤ max_i |x_i| + Σ_{other i} |x_i| = Σ_{i=1}^{n} |x_i|,

and

n (max_i |x_i|)^2 ≥ Σ_{i=1}^{n} x_i^2,   n max_i |x_i| ≥ Σ_{i=1}^{n} |x_i|,

so the second inequality ‖x‖_∞ ≤ ‖x‖_2 ≤ √n ‖x‖_∞ and the third inequality ‖x‖_∞ ≤ ‖x‖_1 ≤ n ‖x‖_∞ follow.



Question 1.15. Prove Lemma 1.6.

Proof. Let A ∈ R^{m×n}, let ‖·‖_{mn} be the operator norm and ‖·‖_m, ‖·‖_n the corresponding vector norms. We just need to verify the three criteria.

1. Positive definiteness. Since ‖A‖_{mn} = max_{‖x‖_n = 1} ‖Ax‖_m, certainly ‖A‖_{mn} ≥ 0. If ‖A‖_{mn} = 0, then Ax = 0 for all x ∈ R^n, i.e., all of R^n solves Ax = 0, and hence A = 0.

2. Homogeneity. By the homogeneity of the underlying vector norm,

‖αA‖_{mn} = max_{‖x‖_n = 1} ‖αAx‖_m = |α| max_{‖x‖_n = 1} ‖Ax‖_m = |α| ‖A‖_{mn}.

3. Triangle inequality. By the triangle inequality of the underlying vector norm,

‖A + B‖_{mn} = max_{‖x‖_n = 1} ‖(A + B)x‖_m ≤ max_{‖x‖_n = 1} ‖Ax‖_m + max_{‖x‖_n = 1} ‖Bx‖_m = ‖A‖_{mn} + ‖B‖_{mn}.

In all, an operator norm is a matrix norm. □

Question 1.16. Prove all parts except 7 of Lemma 1.7.

Remark. Part 14 of Lemma 1.7 seems wrong, so I make a modification there.

Proof.

1. kAxk ≤ kAk · kxk for a vector norm and its corresponding operator norm, or the vector two-norm and matrix Frobenius norm. For a vector norm and its corresponding operator norm, the inequality is obvious, because operator norm is defined as smallest upbound for which kAk · kxk ≥ kAxk holds. We just need to prove it for the two-norm and Frobenius norm. ¡ ¢T Partition matrix A by its rows as A = αT1 , αT2 , . . . , αTm . We have kAxk22 =

m X i =1

(αTi x)2 ≤

m X i =1

kαi k22 kxk22 = kxk22

m X i =1

kαi k22 = kxk22 kAk2F .

Thus, it also holds for the two-norm and Frobenius norm.


2. kAB k ≤ kAk · kB k for any operator norm or for the Frobenius norm. Using the above result, for any operator norm, we have kA(B x)k kAk · kB xk ≤ max ≤ kAk · kB k. x6=0 x6=0 kxk kxk ¡ ¢T For Frobenius norm, parition A by its rows as A = αT1 , αT2 , . . . , αTm , and split B by its ¡ ¢ columns as B = β1 , β2 , . . . , βs . We get kAB k = max

kAB k2F =

m,s X i =1 j =1

(αTi β j )2 ≤

m,s X

m X

(kαi k22 · kβ j k22 ) =

kαi k22 ·

i =1

i =1 j =1

s X j =1

kβi k22 = kAk2F kB k2F .

Thus, it holds also for the two-norm and Frobenius norm. 3. The max norm and Frobenius norm are not operator norms. Consider a matrix A = ααT . We have a lower bound of kAk as follows, kAk = max x6=0

kAxk kAαk kααT αk kαk ≥ = = kαk22 = kαk22 . kxk kαk kαk kαk

¡ ¢ Let α = 1, 1, . . . , 1 , then kAkmax = 1. However, according to the above inequality, if kAkmax is an operator norm, we have kAkmax ≥ n. Such conflict means that k · kmax is not an operator norm. For Frobenius norm, consider any operator norm of identity matrix, we have

kI k = max x6=0

kI xk kxk = max = 1. x6=0 kxk kxk

While the Frobenius norm of I is n, it means that Frobenius norm is not an operator norm. 4. kQ AZ k = kAk if Q and Z are orthogonal or unitary for the Frobenius norm and for the operator norm induced by k · k2 .2 Separate the target equation as two equations: kQ Ak = kAk, kAZ k = kAk. Since kA T k = kAk holds for Frobenius and two-norm (it is obvious for Frobenius norm and later in this question we will prove this p for two-norm), we just need to prove the first equation. For two-norm, since kAk2 = λmax (A T A), we get q q kQ Ak2 = λmax (A T Q T Q A) = λmax (A T A) = kAk2 . Next, for the Frobenius norm, we claim that for any vector x, kQxk2 = kxk2 , because ¡ ¢ kQxk22 = x T Q T Qx. Then, parition A by its columns as A = α1 , α2 , . . . , αn . We have kQ Ak2F =

n X i =1

kQαi k22 =

n X i =1

kαi k22 = kAk2F .

2 This result can be extended into the case of the rectangular matrix. The proof is nearly the same, except that we

use the equality kAk2 = σmax (A).


∞ 5. kAk∞ ≡ maxx6=0 kAxk kxk∞ = maxi

P

j |a i j | = maximum absolute row sum. ¢T ¡ T T Partition A by its rows as A = α1 , α2 , . . . , αTn . We have

kAk∞ ≡ max x6=0

n ¯ X ¯ kAxk∞ ¯a i j ¯ . = max |αTi x| ≤ max kxk∞ =1,i i j =1 kxk∞

¡ Denote i the number of row of the maximum absolute row sum, and set y = sign(a i 1 ), ¢ P sign(a i 2 ), . . . , sign(a i n ) . The up bound is attainable, as kyk∞ = 1 and kAyk = maxi j |a i j |. P T 1 maximum absolute column sum. 6. kAk1 ≡ maxx6=0 kAxk kxk 1 = kA k∞ = ¡max j i |a i j | = ¢ Partition A by its columns as A = α1 , α2 , . . . , αn . We have ° ° °X ° n n X ° ° kAk1 = max kAxk1 = max ° α1 x 1 ° ≤ max |x i | kαi k1 . ° kxk1 =1 kxk1 =1 °i =1 kxk1 =1 i =1 1

Denote j the number of column of the maximum absolute column sum. We have ! Ã Ã ! n X X X |x i | kαi k1 = max |x i | kαi k1 + 1 − |x i | kα j k1 max kxk1 =1 i =1

kxk1 =1 i 6= j

i 6= j

Ã

!

X¡ ¢ = max kα j k1 − kα j k1 − kαi k1 |x i | kxk1 =1

i 6= j

≤ kα j k1 . Set x = e j , and the up bound is attainable. 7. kAk2 = kA T k2p .3 Since kAk2 = λmax (A T A), it suffice to prove that λmax (A T A) = λmax (A A T ). Denote λ is as a nonzero eigenvalue of A T A and x its corresponding eigenvector. We have (A A T )Ax = A(A T A)x = λAx. Since A T Ax = λx, and λ 6= 0, we have Ax 6= 0. Therefore, λ is an eigenvalue of A A T and Ax is its corresponding eigenvector. As both A T A and A A T are nonnegative symmetric matrices, their eigenvalues must at least as large as zero. If one of the matrices has all its eigenvalues zero, it follows A = 0. As a result the other will has all its eigenvalues zero, too. Thus, λmax (A T A) = λmax (A A T ). 8. kAk2 = maxi |λi (A)| if A is normal, i.e., A A T = A T A. ¡ ¢ Since A is normal, there exist an orthgonal matrix P and a diagonal matrix Λ = diag λ1 , λ2 , . . . , λn such that P −1 AP = Λ. Thus, q kAk2 = kP −1 AP k2 = kΛk2 = λmax (ΛT Λ) = max |λi (Λ)| = max |λi (A)|. i

i

3 There are several ways to prove that AB and B A have the same eigenvalues. Among them, I present an analytic

version when A and B are square here. First assume A is invertable, then det(λI − AB ) = det(λI −B A) is easy to prove. For singular A, we perturb A as A˜ = A + t I . A˜ is invertable when t is small enough, and operator det() is continuous. Thus, we have the result by taking limit. This method is inspired by my friend, Zhou Zeren. There is another method in Chapter IV.


9. If A is n-by-n, then n −1/2 kAk2 ≤ kAk1 ≤ n 1/2 kAk2 . p Use the inequalities kxk2 ≤ kxk1 ≤ nkxk2 . Suppose kxk 6= 0. Invert the inequalities as 1 1 1 ≤ ≤ . p kxk kxk nkxk2 1 2 As kAkp ≡ maxx6=0

kAxkp kxkp , combine the two inequalities.

kAxk2 kAxk1 ≤ ≤ p kxk1 nkxk2

We have

p nkAxk2 . kxk2

Take max, it follows n −1/2 kAk2 ≤ kAk1 ≤ n 1/2 kAk2 . 10. If A is n-by-n, then n −1/2 kAk2 ≤ kAk∞ ≤ n 1/2 kAk2 . Like the previous question, we use the inequalities n −1/2 kxk2 ≤ kxk∞ ≤ kxk2 . Invert the inequalities as 1 1 n 1/2 ≤ ≤ . kxk2 kxk∞ kxk2 Combine the two inequalities. We have kAxk2 kAxk∞ ≤ ≤ p kxk∞ nkxk2

p nkAxk2 . kxk2

Take max, we get n −1/2 kAk2 ≤ kAk∞ ≤ n 1/2 kAk2 . 11. If A is n-by-n, then n −1 kAk∞ ≤ kAk1 ≤ nkAk∞ . Like the previous question, we use the inequalities kxk∞ ≤ kxk1 ≤ nkxk∞ . Still, invert the inequalities as 1 1 1 ≤ ≤ . nkxk∞ kxk1 kxk∞ Combine the two inequalities. We have kAxk2 kAxk1 nkAxk∞ ≤ ≤ . nkxk∞ kxk1 kxk∞ Take max, we get n −1 kAk∞ ≤ kAk1 ≤ nkAk∞ . 12. If A is n-by-n, then kAk2 ≤ kAkF ≤ n 1/2 kAk2 . As A T A is symmetric nonnegative. Therefore, the eigenvalues of A T A are all nonnegaP tive. As λ(A T A) = tr(A T A), we get λmax (A T A) ≤ tr(A T A) ≤ nλmax (A T A). i.e., kAk2 ≤ kAkF ≤ n 1/2 kAk2


□

Question 1.17. We mentioned that on a Cray machine the expression arccos(x/√(x^2 + y^2)) caused an error, because roundoff caused x/√(x^2 + y^2) to exceed 1. Show that this is impossible using IEEE arithmetic, barring overflow or underflow. Extra credit: Prove the same result using correctly rounded decimal arithmetic.

Remark. I am not sure about the correctness of my proof, since there is no major difference between decimal and binary arithmetic in it; that may be because I use a different method than the author intended. Whether or not it is correct, it may serve as a reference.

Proof. If we can prove that fl(√(x^2)) = x exactly on a machine using IEEE arithmetic, we will get

fl(x/√(x^2 + y^2)) ≤ fl(x/√(x^2)) = fl(x/x) = 1,

which means arccos(x/√(x^2 + y^2)) cannot cause an error. Thus we only need to prove fl(√(x^2)) = x. Write fl(x^2) = x^2(1 + δ) with |δ| ≤ ε, where x is already a floating point number. Then √(fl(x^2)) ≈ x(1 + δ/2), and since |δ/2| ≤ ε/2 this deviation of at most half an ulp is removed when the square root is correctly rounded, resulting in fl(√(x^2)) = x. □

Question 1.18. Suppose that a and b are normalized IEEE double precision floating point numbers, and consider the following algorithm, running with IEEE arithmetic:

if |a| < |b|, swap a and b
s_1 = a + b
s_2 = (a − s_1) + b

Prove the following facts:

1. Barring overflow or underflow, the only roundoff error committed in running the algorithm is computing s_1 = fl(a + b). In other words, both subtractions a − s_1 and (a − s_1) + b are computed exactly.

2. s_1 + s_2 = a + b, exactly. This means that s_2 is actually the roundoff error committed when rounding the exact value of a + b to get s_1.

Proof. To perform addition or subtraction, a computer first equalizes the exponents of the two operands. Suppose, for instance,

a = (1.a_1 a_2 a_3 ... a_50 a_51)_2 × 2^{e_1},   b = (1.b_1 b_2 b_3 ... b_50 b_51)_2 × 2^{e_2}.

To add them, the computer shifts the two fractions to a common exponent and then adds them. Therefore, if a is much larger than b — that is, even the lowest-order bit of a lies above the highest-order bit of b — then s_1 = fl(a + b) = a, and a − s_1 = a − a = 0 is exact, since a and b are already floating point numbers. Then (a − s_1) + b = 0 + b = b is also exact.

If a is not that much larger than b, we can suppose without loss of generality that the leading bit of b lines up with the 50th bit of a. Keeping the notation above,

s_1 = fl(a + b) = (1.a_1 a_2 ... (a_50 + 1)(a_51 + b_1))_2 × 2^{e_1}

(up to carries and the final rounding). Computing a − s_1 then reproduces exactly the low-order part of b that was rounded away:

a − s_1 = (0.0 ... 0 (−1)(−b_1))_2 × 2^{e_1} = −(1.b_1 ...)_2 × 2^{e_2},

and this subtraction is exact because s_1 has the same leading bit position as a, so a and s_1 are close. Moreover, since a − s_1 has the same leading bit position as b, the sum (a − s_1) + b is exact as well. The situation with the roles of a and b exchanged is the same. In all, in both situations s_1 + s_2 retains all the bits of a and b, and therefore s_1 + s_2 = a + b exactly. □
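As a quick illustration of the claim, the following sketch is added here (not part of the original proof); it assumes IEEE double precision as provided by Python floats and checks s_1 + s_2 = a + b exactly by comparing against exact rational arithmetic from the fractions module.

    # Sketch: the "fast two-sum" trick of Question 1.18.
    from fractions import Fraction

    def two_sum(a, b):
        if abs(a) < abs(b):
            a, b = b, a
        s1 = a + b              # the only rounded operation
        s2 = (a - s1) + b       # computed exactly in IEEE arithmetic
        return s1, s2

    for a, b in [(1.0, 1e-20), (0.1, 0.3), (1e16, 3.14159), (-2.5, 2.4999999999999996)]:
        s1, s2 = two_sum(a, b)
        exact = Fraction(a) + Fraction(b)
        assert Fraction(s1) + Fraction(s2) == exact   # s1 + s2 equals a + b exactly
        print(a, b, "->", s1, s2)

Here s_2 is exactly the rounding error of the addition, which is the content of part 2.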

2 SOLUTIONS FOR CHAPTER II: LINEAR EQUATION SOLVING

Question 2.1. Programming question. Skipped.

Question 2.2. Consider solving AX = B for X, where A is n-by-n, and X and B are n-by-m. There are two obvious algorithms. The first algorithm factorizes A = PLU using Gaussian elimination and then solves for each column of X by forward and back substitution. The second algorithm computes A^{-1} using Gaussian elimination and then multiplies X = A^{-1}B. Count the number of flops required by each algorithm, and show that the first one requires fewer flops.

Solution. For the first algorithm, since permutation costs no flops, we may suppose that A = (a_ij) is already permuted. Determining the first column of L requires n − 1 divisions, and updating the rest of the matrix costs n − 1 multiplications and n − 1 subtractions for each of the n − 1 remaining rows, i.e., 2(n − 1)^2 flops. Summing over all elimination steps, Gaussian elimination requires

Σ_{i=2}^{n} (2(i − 1)^2 + (i − 1)) = (2/3) n^3 + O(n^2)

flops. A single forward substitution requires

Σ_{i=1}^{n} (2i − 1) = n^2 + O(n)

flops.


It is the same for back substitution. Thus the first method costs one LU decomposition plus m forward and back substitutions, in total

(2/3) n^3 + 2mn^2 + O(n^2) + O(mn)

flops.

For the second algorithm, using Σ_{i=1}^{n} (2(i − 1)i + 2(n − i + 1)(i − 1)) = n^3 + O(n^2) flops we reduce the augmented matrix

(A | I) =
[ a_11 a_12 ... a_1n | 1             ]
[ a_21 a_22 ... a_2n |     1         ]
[  :    :        :   |       ...     ]
[ a_n1 a_n2 ... a_nn |            1  ]

to the form

[ a_11 a'_12 ... a'_1n | 1                ]
[      a'_22 ... a'_2n | b_21  1          ]
[               ...    |  :        ...    ]
[              a'_nn   | b_n1  b_n2 ... 1 ],

i.e., an upper triangular matrix on the left and a unit lower triangular matrix on the right. Completing the elimination to turn the left block into I costs another Σ_{i=1}^{n} ((i − 1) + 2n(i − 1)) = n^3 + O(n^2) flops, which finishes computing A^{-1}. Since the matrix multiplication A^{-1}B costs a further nm(2n − 1) flops, the total is

2n^3 + 2mn^2 + O(n^2) + O(mn)

flops. Therefore, for n and m large enough, the first algorithm costs fewer flops.


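To see the practical effect of these flop counts, here is a small timing and accuracy sketch added for illustration (not part of the original solution); it assumes NumPy and SciPy are available and uses scipy.linalg.lu_factor / lu_solve for the first algorithm and an explicit inverse for the second.

    # Sketch: solve AX = B by (1) one LU factorization + triangular solves,
    # and (2) forming inv(A) and multiplying.  Method (1) needs ~2/3 n^3 + 2mn^2
    # flops, method (2) about 2n^3 + 2mn^2.
    import time
    import numpy as np
    from scipy.linalg import lu_factor, lu_solve

    n, m = 1500, 200
    rng = np.random.default_rng(0)
    A = rng.standard_normal((n, n))
    B = rng.standard_normal((n, m))

    t0 = time.perf_counter()
    lu, piv = lu_factor(A)          # P L U = A, done once
    X1 = lu_solve((lu, piv), B)     # forward/back substitution for all columns
    t1 = time.perf_counter()
    X2 = np.linalg.inv(A) @ B       # explicit inverse, then a matrix multiply
    t2 = time.perf_counter()

    print("LU + solves: %.3fs, residual %.2e" % (t1 - t0, np.linalg.norm(A @ X1 - B)))
    print("inverse:     %.3fs, residual %.2e" % (t2 - t1, np.linalg.norm(A @ X2 - B)))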

Question 2.3. Let k · k be the two-norm. Given a nonsingular matrix A and a vector b, show that for sufficiently small kδAk, there are nonzero δA and δb such that inequality kδxk ≤ kA −1 k(kδAk · kxˆ k + kδbk) is an equality. This justifies calling κ(A) = kA −1 k · kAk the condition number of A. Proof. To find the suitable δx and δA, let us have a glance at how the inequality comes. The original equation is δx = A −1 (−δA · xˆ + δb). Using the inequality kB xk ≤ kB k·kxk twice, and kδA xˆ +δbk ≤ kδA xˆ k+kδbk, we get the target inequality. If we denote y as the vector which satisfy kA −1 k · kyk = kA −1 yk, set δA satisfy k−δAk·kxˆ k = k−δA xˆ k, and δA xˆ proportional to δb. The three inequalities will both become equalities. This suggests how we can construct δA and δb. Assume the δA has the form δA = c ab T , where c is a positive constant and a, b are vectors to be determined. According to Question 1.7., k − δAk = ckbk · kak. Then, as the δA must satisfy k − δAk · kxˆ k = k − δA xˆ k, we get ckbk · kak · kxˆ k = ck − ab T xˆ k = ckb T xˆ k · kak ≤ ckbk · kak · kxˆ k.


Therefore, for the equation to hold, b must be proportional to xˆ , while a can be free of choice. Then, for δA xˆ to be proportional to δb, we can set b = xˆ and a = d δb, where d is another constant to be determined. As the final condistion requires −δA · xˆ + δb = (cd xˆ T xˆ + 1)δb = y, we can choose c, d such that cd xˆ T xˆ + 1 6= 0. Therefore, we have δA = d δb xˆ T , δb = (cd xˆ T xˆ + 1)−1 y. We can see that both δA and δb are determined by known value, and due to the conditions we use, the target inequality becomes equality under our choice of δA and δb, whose norm can be modified by the choice of c, d and kyk.  Question 2.4. Show that bounds ˆ + |b|)k, kδxk ≤ ²k|A −1 (|A| · |x| kδxk ≤ ²k|A −1 | · |A|k. kxk are attainable. Remark. I could only prove that the bounds can be attainable for certain A, whose inverse satisfy the following condition: ¡ ¢ ¡ ¢ sgn(b i 1 ), sgn(b i 2 ), . . . , sgn(b i n ) = ± sgn(b j 1 ), sgn(b j 2 ), . . . , sgn(b j n ) , for any i , j.

Also, I believe that second inequality should be kδxk ≤ ²k|A −1 | · |A|k. ˆ kxk Proof. For the first inequality, since it comes from ˆ + |δb|) ≤ |A −1 |(²|A| · |x| ˆ + ²|b|) = ²(|A −1 |(|A| · |x| ˆ + |b|)), |δx| = |A −1 (−δA xˆ + δb)| ≤ |A −1 |(|δA| · |x| it is inevitable that |δA| = ²|A|, |δb| = ²|b|, which means we could only choose the signs of entries of δA and δb. From inside, we have |δA xˆ | = |δA||xˆ |. To make the inequality become equality, for each row, δA(i , j ) and xˆ j should have the same sign. Therefore, ¡ ¢ ¡ ¢ sgn(δA(i , 1)), sgn(δA(i , 2)), . . . , sgn(δA(i , n)) = ± sgn(xˆ 1 ), sgn(xˆ 2 ), . . . , sgn(xˆ n ) , for any i ,

which reduces the degree of freedom of δA to n. Next, to make | − δA xˆ + δb| = | − δA xˆ | + |δb|, we have n X ˆ = sgn(δb), sgn( −δA x) i =1


which reduces the degree of freedom of δb to 0. For simplicity, denote z = −δA xˆ + δb. Use the last n degree of freedom of δA, we could determine the sign of each z i . According to the condition of A −1 , set sgn(z i ) = sgn(A −1 (i , j )), for any i , j. We will get |A −1 z| = |A −1 ||z|. In all, we get |δx| = ²|A −1 |(|A| · |xˆ | + |b|). Take norm on both sides, and the result follows. For the second inequality, the procedure is the same. Since under the assumption of δb = 0, the only difference is to divide xˆ in both sides.  Question 2.5. Prove Theorem 2.3. Given the residual r = A xˆ − b, use Theorem 2.3 to show that bound (2.9) is no larger than bound (2.7). This explains why LAPACK computes a bound based on (2.9), as described in section 2.4.4. Proof. To prove Theorem 2.3. Rearrange the equation (A + δA)xˆ = b + δb, yielding |A xˆ − b| = |r | = |δb − δA xˆ | ≤ |δb| + |δA xˆ | ≤ |δb| + |δA| · |xˆ | ≤ ²(|b| + |A| · |xˆ |). As the above inequality is componentwise, it follows ² ≥ max i

|r i | . (|A| · |xˆ | + |b|)i

Use the same technique describe in the previous question, it can be proved that for certain δA and δb |A xˆ − b| = |δb − δA xˆ | = |δb| + |δA| · |xˆ | ≤ ²(|b| + |A| · |xˆ |), i| which implies that ² cannot be smaller than maxi (|A|·||rxˆ |+|b|) . There proves the question. Next, i −1 −1 to prove k|A | · |r |k ≤ ²k|A |(|A| · |xˆ | + |b|)k. Because all entries of |A −1 |, |r |, and |A| · |xˆ | + |b| are nonnegative, we just need to prove that |r i | ≤ ²(|A| · |xˆ | + |b|)i . According to the smallest value of ², the targeted inequality holds. Therefore, bound (2.9) is no larger than bound (2.7).

 Question 2.6. Prove Lemma 2.2. Proof. First, consider the single permuted permutation matrix. That is, the permutation matrix with only one pair of rows permuted. It can be written as  1       P =      



..

       , where i th, j th rows are permuted.      

. 0 ··· .. . 1 ···

1 0 ..

. 1


According to the form of the single permutation matrix, it follows det(P ) = ±1. Give a matrix ¡ ¢ ¡ ¢T X = α1 , α2 , . . . , αn = βT1 , βT2 , . . . , βTn . The matrix product P X and X P have the form ¡ ¢T P X = βT1 , βT2 , . . . , βTj , . . . , βTi , . . . , βTn , ¡ ¢ X P = α1 , α2 , . . . , α j , . . . , αi , . . . , αn .

Mark this as property-one. Then, to transform P into identity matrix I , it just need to permute the permuated i th and j th rows again. Consequently, P −1 = P for single permuted permutation matrix. For any permutation matrix Pˆ , decompose it into a series of single permutation matrices as P i , i = 1, 2, . . . , m. i.e., Pˆ = P m P m−1 · · · P 1 . According to what we have proved for the single permutation matrix, it is only necessary to verify the property about P −1 . Since Pˆ T = P 1 P 2 · · · P m , we have Pˆ Pˆ T = Pˆ T Pˆ = I .



Thus, the question is proved.

Question 2.7. If A is a nonsingular symmetric matrix and has the facotization A = LD M T , where L and M are unit lower triangluar matrices and D is a diagonal matrix, show that L = M. Proof. To prove the question, partition L by its columns and M by its rows. We have ¡ ¢ L = α1 , α2 , . . . , αn , ¡ ¢T M = βT1 , βT2 , . . . , βTn . Then, we can decompose A as A = LD M T =

n X i =1

D i i αi βTi .

Recall that both L and M is unit lower triangular matrices. Therefore, D i i αi βTi only has nonzero entries in i : n submatrix, and αi (i ) = βi (i ) = 1. i.e., D i i αi βTi should have form as   0  .   .  ¡ ¢  .   0 . . . 0 D i i αi ∗ =   0 .   D i i βi  ∗ As A is nonsingular symmetric, we know D i i 6= 0 and D 11 α1 = D 11 β1 , which means α1 = β1 . Then, denote A 1 = A − D 11 α1 β1 . Apply the same technique to A(2 : n, 2 : n), which is also nonsingular and symmetric. As D i i αi βTi only has nonzero entries in i : n submatrix. We can prove that α2 = β2 . In this analogy, it proves L = M . 


Question 2.8. Consider the following two ways of solving a 2-by-2 linear system of equations: ·

a 11 Ax = a 21

¸ · ¸ · ¸ b1 x1 a 12 = b. = · b2 x2 a 22

1. Algorithm 1. Gauusian elimination with partial pivoting (GEPP). 2. Algorithm 2. Cramer’s rule: det = a 11 ∗ a 22 − a 12 ∗ a 21 , x 1 = (a 22 ∗ b 1 − a 12 ∗ b 2 )/ det, x 2 = (−a 21 ∗ b 1 + a 11 ∗ b 2 )/ det . Show by means of a numerical example that Cramer’s rule is not backward stable. Proof. To varify whether these two algorithms are backward stable or not, it requires to verify whether ˆ = O(²). kδxk/kxk Consider the following example: · ¸ · ¸ 1.112 1.999 2.003 ·x = . 3.398 6.123 6.122 ¡ ¢ The approximate answer should be x T = 1.63788, 0.0908866 . However, under four-decimaldigit floating point arthimetic, for the first algorithm, it yields ·

1 0.3273 1

¸·

3.398

¸ · ¸ 6.123 6.122 x= . −0.005 2.003

Solving it, we have · ¸ 1.441 x= . 0.2

For the second algorithm, we get fl(det) = fl(fl(1.112 ∗ 6.123) − fl(1.999 ∗ 3.398)) = fl(6.809 − 6.793) = 0.016, fl(x 1 ) = fl(fl(fl(6.123 ∗ 2.003) − fl(1.999 ∗ 6.122))/det) = 1.875, fl(x 2 ) = fl(fl(fl(1.112 ∗ 6.122) − fl(3.398 ∗ 2.003))/det) = 0.125. Thus, ·

¸ · ¸ 0.19688 −0.23712 δx 1 = , δx 2 = . −0.1091134 −0.0341134

It follows that kδx 2 k1 /kxk1 ≈ 0.156894 ≈ 156² = O(1), which suggest that the second algorithm is not backward stable. 


Question 2.9. Let B be an n-by-n upper bidiagonal matrix, i.e., nonzero only on the main diagonal and first superdiagonal. Derive an algorithm for computing κ_∞(B) ≡ ‖B‖_∞ ‖B^{-1}‖_∞ exactly (ignoring roundoff). In other words, you should not use an iterative algorithm such as Hager's estimator. Your algorithm should be as cheap as possible; it should be possible to do it using no more than 2n − 2 additions, n multiplications, n divisions, 4n − 2 absolute values, and 2n − 2 comparisons.

Proof. For a bidiagonal matrix to be invertible, all of its diagonal entries must be nonzero. Write b_{i,i} for the diagonal and b_{i,i+1} for the superdiagonal entries of B. Its inverse is the upper triangular matrix with entries

(B^{-1})_{ij} = (−1)^{j−i} (Π_{k=i}^{j−1} b_{k,k+1}) / (Π_{k=i}^{j} b_{k,k}),  j ≥ i,

so the diagonal of B^{-1} is 1/b_{i,i}, the first superdiagonal is −b_{i,i+1}/(b_{i,i} b_{i+1,i+1}), and so on up to the (1, n) entry (−1)^{n−1} (Π_{k=1}^{n−1} b_{k,k+1}) / (Π_{k=1}^{n} b_{k,k}).

Thus the absolute row sums s_i of B^{-1} can be computed from row i + 1 by the recurrence

s_n = |1/b_{n,n}|,   s_i = s_{i+1} |b_{i,i+1}/b_{i,i}| + |1/b_{i,i}|.

Assume that the 2n − 1 absolute values of the entries of B have already been taken. The condition number can then be computed as:

1:  MAX1 = B(n, n), MAX2 = 1/B(n, n), LASTSUM = MAX2
2:  for i = n − 1 downto 1 do
3:      TEMP = B(i, i) + B(i, i + 1)
4:      if TEMP > MAX1 then MAX1 = TEMP end if
5:      LASTSUM = (LASTSUM * B(i, i + 1) + 1)/B(i, i)
6:      if LASTSUM > MAX2 then MAX2 = LASTSUM end if
7:  end for
8:  return MAX1 * MAX2
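A direct Python transcription of this procedure follows as an added illustration (not part of the original solution). It assumes NumPy and represents B by its two diagonals; the helper name cond_inf_bidiagonal is made up for this sketch. The operation count of the pseudocode is tallied just below.

    # Sketch: kappa_inf(B) for an upper bidiagonal B given by its main diagonal d
    # (length n) and superdiagonal e (length n-1), using the row-sum recurrence
    # s_i = s_{i+1}*|e_i/d_i| + |1/d_i| for inv(B).
    import numpy as np

    def cond_inf_bidiagonal(d, e):
        n = len(d)
        ad, ae = np.abs(d), np.abs(e)
        max_row_B = ad[-1]                    # row sums of |B| are |d_i| + |e_i|
        max_row_Binv = s = 1.0 / ad[-1]
        for i in range(n - 2, -1, -1):
            max_row_B = max(max_row_B, ad[i] + ae[i])
            s = (s * ae[i] + 1.0) / ad[i]     # row-sum recurrence for inv(B)
            max_row_Binv = max(max_row_Binv, s)
        return max_row_B * max_row_Binv

    # Quick check against an explicit inverse:
    rng = np.random.default_rng(1)
    d = rng.uniform(0.5, 2.0, 6)
    e = rng.uniform(-1.0, 1.0, 5)
    B = np.diag(d) + np.diag(e, 1)
    ref = np.linalg.norm(B, np.inf) * np.linalg.norm(np.linalg.inv(B), np.inf)
    print(cond_inf_bidiagonal(d, e), ref)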

In total, the pseudocode above uses 2n − 2 additions, n multiplications (counting the final product MAX1 · MAX2), n divisions, 2n − 1 absolute values, and 2n − 2 comparisons, which meets the requirement. □

Question 2.10. Let A be n-by-m with n ≥ m. Show that ‖A^T A‖_2 = ‖A‖_2^2 and κ_2(A^T A) = κ_2(A)^2. Let M be n-by-n and positive definite and L be its Cholesky factor, so that M = LL^T. Show that ‖M‖_2 = ‖L‖_2^2 and κ_2(M) = κ_2(L)^2.


Proof. As A may be rectangular, denote A = U ΣV T as the SVD decomposition of A. Since kA T Ak2 = kV Σ2V T k2 = kΣ2 k2 = kΣk22 = kU ΣV T k22 = kAk22 , kA T Ak2 = kAk22 is proved. For equality κ2 (A T A) = κ2 (A)2 , it also follows from A T A = V Σ2V T , which implies σmin (A T A) = σmin (A)2 . Since the Cholesky factor matrix L is square and invertible, apply the above result and we prove the question.  Question 2.11. Let A be symmetic and positive definite. Show that |a i j | ≤ (a i i a j j )1/2 . Proof. Since A is s.p.d., all its diagonal entries are positive. Consider p p x i j = {0, 0, . . . , a j j , . . . , − a i i , . . . , 0}, p p y i j = {0, 0, . . . , a j j , . . . , a i i , . . . , 0}, yielding x Tij Ax i j = a j j a i i −

p

y Tij Ay i j = a j j a i i +

p

ai i a j j ai j −

p

ai i a j j ai j +

p

a i i a j j a i j + a i i a j j = 2a i i a j j − 2

p

a i i a j j a i j ≥ 0,

a i i a j j a i j + a i i a j j = 2a i i a j j + 2

p

a i i a j j a i j ≥ 0,

Combine them, and it follows (a i i a j j )1/2 ≥ |a i j |.

 Question 2.12. Show that if µ

I Y = 0

¶ Z , I

where I is an n-by-n identity matrix, then κF (Y ) = kY kF kY −1 kF = (2n + kZ k2F ). Proof. Consider the following matrix µ

X=

¶ I −Z , 0 I

which satisfies X Y = Y X = I . Thus, κF (Y ) = kY kF kY −1 kF =

q q (2n + kZ k2F ) (2n + kZ k2F ) = (2n + kZ k2F ).

 Question 2.13. In this question we will ask how to solve B y = c given a fast way to solve Ax = b, where A − B is "small" in some sense. 1. Prove the Sherman-Morrison formula: Let A be nonsingular, u and v be column vectors, and A + uv T be nonsingular. Then (A + uv T )−1 = A −1 − (A −1 uv T A −1 )/(1 + v T A −1 u). More generally, prove the Sherman-Morrison-Woodbury formula: Let U and V be n-byk rectangular matrices, where k ≤ n and A is n-by-n. Then T = I + V T A −1U is nonsingular if and only if A + UV T is nonsingular, in which case (A + UV T )−1 = A −1 − A −1U T −1V T A −1 .


2. If you have a fast algorithm to solve Ax = b, show how to build a fast solver for B y = c, where B = A + uv T . 3. Suppose that kA − B k is "small" and you have a fast algorithm for solving Ax = b. Describe an iterative scheme for solving B y = c. How fast do you expect your algorithm to converge? Proof. 1. First, it needs to prove that det(A + uv T ) = det(A) det(1 + v T A −1 u). Consider µ ¶ µ −1 ¶µ ¶µ ¶µ ¶ µ ¶ I A A + uv T u I A A A −1 u A = vT 1 1 1 −v T 1 1 1 + v T A −1 u Take determinant on both sides, and the result follows. Then, we can verify ShermanMorrison formula as follows: (A + uv T )(A −1 − (A −1 uv T A −1 )/(1 + v T A −1 u)) = I + uv T A −1 − (uv T A −1 − uv T A −1 uv T A −1 )/(1 + v T A −1 u) = I + uv T A −1 − (uv T A −1 )(1 + v T A −1 u)/(1 + v T A −1 u) = I, (A −1 − (A −1 uv T A −1 )/(1+v T A −1 u))(A + uv T ) = I + A −1 uv T − (A −1 uv T − A −1 uv T A −1 uv T )/(1 + v T A −1 u) = I + A −1 uv T − (A −1 uv T )(1 + v T A −1 u)/(1 + v T A −1 u) = I. Thus, we prove the firs part of part I. For the Sherman-Morrison-Woodbury formula, we first prove that det(A+UV T ) = det(A) det(I + V T A −1U ). Consider µ ¶ µ −1 ¶µ ¶µ ¶µ ¶ µ ¶ I A A +UV T U I A A A −1U A = VT I I I −V T I I I + V T A −1U Take determinant on both sides, and the result follows. As A is nonsingular, we prove they are mutually necessary and sufficient. Then, verify the matrix equation as follows: (A +UV T )(A −1 − A −1U T −1V T A −1 ) = I −U T −1V T A −1 +UV T A −1 −UV T A −1U T −1V T A −1 = I +UV T A −1 − (U +UV T A −1U )T −1V T A −1 = I +UV T A −1 −U (I + V T A −1U )T −1V T A −1 = I, (A

−1

−1

− A UT

−1

T

V A

−1

T

)(A +UV ) = I − A −1U T −1V T + A −1UV T − A −1U T −1V T A −1UV T = I + A −1UV T − A −1U T −1 (V T + V T A −1UV T ) = I + A −1UV T − A −1U T −1 (I + V T A −1U )V T = I.

In all, we prove part I.


2. Denote by x_1, x_2 the solutions of Ax_1 = u and Ax_2 = c. By the Sherman–Morrison formula, the solution of By = c with B = A + uv^T is

y = B^{-1}c = (A^{-1} − (A^{-1}uv^T A^{-1})/(1 + v^T A^{-1}u)) c = x_2 − ((v^T x_2) x_1)/(1 + v^T x_1).

Therefore the solution can be obtained from two solves with A plus a few inner products, all of which are fast.

3. For the last part, writing By = (A − (A − B))y = c gives the equation

y = A^{-1}c + A^{-1}(A − B)y,

whose fixed point we seek, so we can use the fixed point iteration y_{i+1} = A^{-1}c + A^{-1}(A − B)y_i, or in pseudocode (with y initialized to 0 and r to c):

1:  while ‖r‖ > tol do
2:      solve Ax = r
3:      y = y + x
4:      r = c − By
5:  end while

For the fixed point iteration to converge, the norm of the iteration matrix A^{-1}(A − B) must be smaller than one. Equivalently, from the iterative-refinement perspective,

‖r_{i+1}‖ = ‖r_i − B δy‖ = ‖(A − B) δy‖ ≤ ‖A − B‖·‖δy‖ ≤ ‖A − B‖·‖A^{-1}‖·‖r_i‖.

Therefore, if ‖A − B‖·‖A^{-1}‖ < 1, the algorithm converges linearly.

 Question 2.14. Programming question. Skipped. Question 2.15. Programming question. Skipped. Question 2.16. Show how to reorganize the Cholesky algorithm (Algorithm 2.11.) to do most of its operations using Level 3 BLAS. Mimic Algorithm 2.10. Solution. As cholesky factorization does not need pivot, we first modify Algorithm 2.11 to fit rectangular matrices. In details, for a m-by-b rectangular matrix, use original Algorithm 2.11 to deal with the upper b-by-b submatrix, and use level 2 BLAS to update the lower (m − n)by-n matrix, yielding 1:
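Putting parts 2 and 3 together in code, here is a sketch added for illustration (not part of the original solution); it assumes SciPy's LU routines play the role of the "fast solver" for A.

    # Sketch: solve (A + u v^T) y = c via Sherman-Morrison (part 2), and
    # solve B y = c by the fixed-point / iterative-refinement scheme (part 3),
    # reusing a single LU factorization of A.
    import numpy as np
    from scipy.linalg import lu_factor, lu_solve

    rng = np.random.default_rng(0)
    n = 200
    A = rng.standard_normal((n, n)) + n * np.eye(n)   # well conditioned
    u, v, c = rng.standard_normal((3, n))
    lu_piv = lu_factor(A)                             # factor A once

    # Part 2: rank-one update by the Sherman-Morrison formula.
    x1 = lu_solve(lu_piv, u)
    x2 = lu_solve(lu_piv, c)
    y_sm = x2 - (v @ x2) * x1 / (1.0 + v @ x1)
    print("SM residual:", np.linalg.norm((A + np.outer(u, v)) @ y_sm - c))

    # Part 3: B close to A; iterate  y <- y + A^{-1}(c - B y).
    B = A + 1e-3 * rng.standard_normal((n, n))
    y = np.zeros(n)
    for it in range(20):
        r = c - B @ y
        if np.linalg.norm(r) <= 1e-12 * np.linalg.norm(c):
            break
        y += lu_solve(lu_piv, r)
    print("refinement iterations:", it, "residual:", np.linalg.norm(B @ y - c))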

for i = 1 to b do

24

2: 3: 4: 5: 6: 7:

p A(i : m, i ) = A(i : m, i )/ A(i , i ) for j = i + 1 to b do A( j : b, j ) = A( j : b, j ) − A( j , i )A( j : b, i ) end for A(b + 1 : m, i + 1 : b) = A(b + 1 : m, i : b) − A(b + 1 : m, i ) · A(i + 1 : b, i )T end for

Then, by delaying the update of submatrix by b steps, we get level 3 BLAS implemented Cholesky. In details, µ

A=

A 11 A 21

¶ µ L 11 A 12 = L 12 A 22

¶µ

I

¶µ

I A¯

L T11

¶ µ L T12 L 11 L T11 = I L 12 L T11

¶ L 11 L T12 . A¯ + L 12 L T12

Thus, A¯ = A 22 −L 12 L T12 , a symmetric rank-b update which is level 3 BLAS. 1:

for i = 1 to n − 1 step b do

µ ¶ L 22 2: Use the above Algorithm to factorize A(i : n, i : i + b − 1) = L 23 3: A(i + b : n, i + b : n) = A(i + b : n, i + b : n) − L 23 L T23 4: end for
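A compact NumPy version of this blocked scheme is sketched below as an added illustration (not part of the original solution); it assumes the panel is factored by the unblocked loop and the trailing matrix by one rank-b update, whereas a production code would call the corresponding LAPACK/BLAS routines directly.

    # Sketch: right-looking blocked Cholesky, A = L L^T with L lower triangular.
    import numpy as np

    def panel_cholesky(P):
        # Unblocked Cholesky of an m-by-b panel whose top b-by-b block is the
        # current diagonal block; overwrites P with this block column of L.
        m, b = P.shape
        for i in range(b):
            P[i, i] = np.sqrt(P[i, i])
            P[i+1:, i] /= P[i, i]
            for j in range(i + 1, b):
                P[j:, j] -= P[j, i] * P[j:, i]

    def blocked_cholesky(A, b=64):
        A = np.array(A, dtype=float)                # work on a copy
        n = A.shape[0]
        for k in range(0, n, b):
            kb = min(b, n - k)
            panel_cholesky(A[k:, k:k+kb])           # Level-2 work on a narrow panel
            L21 = A[k+kb:, k:k+kb]
            A[k+kb:, k+kb:] -= L21 @ L21.T          # Level-3 symmetric rank-b update
        return np.tril(A)

    rng = np.random.default_rng(0)
    M = rng.standard_normal((300, 300))
    A0 = M @ M.T + 300 * np.eye(300)                # an s.p.d. test matrix
    L = blocked_cholesky(A0, b=64)
    print(np.allclose(L @ L.T, A0))

Most of the arithmetic is in the rank-b update, which is a single Level 3 BLAS-style operation, as intended.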

 Question 2.17. Suppose that, in Matlab, you have an n-by-n matrix A and an n-by-1 matrix b. What do A\b, b’/A, and A/b mean in Matlab? How does A\b differ from inv(A)*b? 4 Solution. A\b: equals to solve the linear systems of equations: Ax = b; b’/A: equals to solve x T A = b T ; A/b: ?; Although both A\b and inv(A)*b get the solution of the linear systems of equations: Ax = b, yet to solve Ax = b does not need to compute the inverse of A. It can be based on matrix factorization like LU, LL.  Question 2.18. Let ·

A 11 A= A 21

A 12 A 22

¸

where A 11 is k-by-k and nonsingular. Then S = A 22 − A 21 A −1 11 A 12 is called the Schur complement of A 11 in A, or just Schur complement for short. 1. Show that after k steps of Gaussian elimination without pivoting, A 22 has been overwritten by S. 2. Suppose A = A T , A 11 is positive definite, and A 22 is negative definite. Show that A is nonsingular, that Gaussian elimination without pivoting will work in exact arithmetic, but that Gaussian elimination without pivoting may be numerically unstable. Proof. 4 I don’t use matlab much, so the answer may be modefied.


1. As we know, one step of Gaussian elimination overwrite the n − 1 lower submatrix by Schur complement. If it holds for (k − 1)th step of Gaussian elimination, it suffices to prove it is also true for the k steps of Gaussian elimination. Denote   A 11 v k−1 A 12 A = a Tk−1 a kk b Tn−k  , where A 11 is (k − 1)-by-(k − 1). A 21 u n−k A 22 As kth step of Gaussian elimination equals to (k − 1)th step of Gaussian elimination followed by one more Gaussian elimination, and the first k − 1 steps will update A(k : n, k : n) as · ¸ · T ¸ ¸ · T T −1 £ ¤ a a kk − a Tk−1 A −1 a kk b Tn−k 11 v k−1 b n−k − a k−1 A 11 A 12 , v A − k−1 A −1 = 12 k−1 11 A 21 u n−k A 22 u n−k − A 21 A −1 A 22 − A 21 A −1 11 v k−1 11 A 12 the next step will update A 22 as −1 T T −1 T −1 A 022 = A 22 −A 21 A −1 11 A 12 −(u n−k −A 21 A 11 v k−1 )(b n−k −a k−1 A 11 A 12 )/(a kk −a k−1 A 11 v k−1 ). µ ¶ A 11 v k−1 ¯ On the other hand, since the inverse of A = T is a k−1 a kk · ¸ 1 (a kk − a Tk−1 A −1 v k−1 )A −1 + A −1 v k−1 a Tk−1 A −1 −A −1 v k−1 −1 11 11 11 11 11 ¯ A = , −a Tk−1 A −1 1 a kk − a Tk−1 A −1 11 11 v k−1

the corresponding kth step Schur complement is · ¸ £ ¤ 0 A 12 −1 T −1 A 22 − A 21 u n−k A = A 22 − A 21 A −1 11 A 12 − (A 21 A 11 v k−1 a k−1 A 11 A 12 − b Tn−k T T T −1 0 u n−k a Tk−1 A −1 A 12 − A 21 A −1 11 v k−1 b n−k + u n−k b n−k )/(a kk − a k−1 A 11 v k−1 ) = A 22 .

Therefore, it also holds for kth step of Gaussian elimination. 2. According to the condition, A 11 , A 22 is invertible. Thus, to show that A is nonsingular, consider its determinant as ¯ ¯ ¯ ¯ ¯ A 11 A 12 ¯ ¯ A 11 ¯ A 12 ¯ ¯ ¯ ¯ = det(A 11 ) det(A 22 − A 21 A −1 A 12 ). = det(A) = ¯ 11 ¯ A 21 A 22 ¯ ¯ 0 A 22 − A 21 A −1 A 11 12 For any vector x ∈ Rn−k , we have T T x T (A 22 − A 21 A −1 11 A 12 )x = x A 22 x − (A 12 x) A 11 (A 12 x) ≥ 0.

The equality holds if and only if x = 0. So, A 22 − A 21 A −1 11 A 12 is s.p.d., which means −1 det(A 22 − A 21 A 11 A 12 ) > 0. As a result, A is nonsingular. According to Question 2.11., all the diagonal entries of A are nonzero. Consequently the Gaussian elimination will work in exact arthimetic. To show it is numerical unstable, consider the following counterexample: · ¸ · ¸· ¸ 1 1010 1 1 1010 A= = . 1010 −1 1010 1 −1 − 1020


 Question 2.19. Matrix A is called strictly column diagonally dominant, or diagonally dominant for short, if n X |a j i |. |a i i | > j =1, j 6=i

1. Show that A is nonsingular. 2. Show that Gaussian elimination with partial pivoting does not actually permute any rows, i.e., that it is identical to Gaussian elimination without pivoting. Proof. 1. According to Gershgorin’s theorem, the eigenvalues λ of A T satisfy |λ − a i i | ≤

X

|a j i |.

j 6=i

Thus, we have ai i −

X

|a j i | ≤ λ ≤ a i i +

j 6=i

X

|a j i |.

j 6=i

When a i i > 0, we have λ > 0. Otherwise, a i i < 0 will result in λ < 0. Since A is strictly column diagonally dominant, a i i 6= 0. Thus, we prove λ 6= 0, which means det(A) = Q det(A T ) = ni=1 λ 6= 0. i.e., A is nonsingular. 2. Since |a 11 | > |a j 1 |, it holds for the base case. If we could prove that after updating the submatrix still has the diagonally dominant property, we would prove the question. Denote a i0 j = a i j − a 1 j ∗ a i 1 /a 11 . Sum them up, we get n X i =2,i 6= j

|a i0 j | ≤

n X

(|a i j | + |a 1 j ∗ a i 1 /a 11 |)

i =2,i 6= j

≤ |a i i | − |a 1i | − |a 1 j | + |a 1 j /a 11 |

n X

|a i 1 |

i =1,i 6= j

≤ |a i i | − |a 1i | ≤ |a i i | − |a 1i ∗ a i 1 /a 11 | ≤ |a i0 i |. Therefore, the diagonally dominant property holds. By mathematic induction, we prove the quesion.

 Question 2.20. Given an n-by-n nonsingular matrix A, how do you efficiently solve the following problems, using Gaussian elimination with partial pivoting? (a) Solve the linear system A k x = b, where k is a positive integer.


(b) Compute α = c T A −1 b. (c) Solve the matrix equation AX = B , where B is n-by-m. You should (1) descirbe your algorithms, (2) present them in pseudocode and (3) give the require flops. Solution. (a) Since A k x = b equals A k−1 x = A −1 b, set A −1 b = b 1 , and we have A k−1 x = b 1 , which is the same kind of equation with smaller exponent. Thus, we have the algorithm for i = k to 1 do solve Ax = b 3: b=x 4: end for 1:

2:

Totally, it requires 23 n 3 + 2kn 2 + O(n 2 ) flops. (b) First solve Ax = b, then do the inner product. The pseudocode is shown below. 1: 2:

solve Ax = b α = cT x

Totally, it require 23 n 3 + O(n 2 ) flops. (c) Solve AX = B equals to solve m vectors. for i = 1 to m do 2: solve Ax = b i 3: end for

1:

Totally, it require 23 n 3 + 2mn 2 + O(n 2 ) flops.

 Question 2.21. Prove that Strassen’s algorithm (Algorithm 2.8) correctly multiplies n-by-n matrices, where n is a power of 2. Proof. Since n is a power of 2, all matrix partitions and matrix multiplication will be ok. To prove the result, it suffice to prove the correctness of the recursive step. According to Strassen’s algorithm, we have P 1 = A 12 B 21 + A 12 B 22 − A 22 B 21 − A 22 B 22 , P 4 = A 11 B 22 + A 12 B 22 , P 2 = A 11 B 11 + A 11 B 22 + A 22 B 11 + A 22 B 22 , P 5 = A 11 B 12 − A 11 B 22 , P 3 = A 11 B 11 + A 11 B 22 − A 21 B 11 − A 21 B 12 , P 6 = A 22 B 21 − A 22 B 11 , P 7 = A 21 B 11 + A 22 B 11 , C 11 = A 12 B 21 + A 11 B 11 , C 12 = A 12 B 22 + A 11 B 12 , C 21 = A 22 B 21 + A 21 B 11 , C 22 = A 22 B 22 + A 21 B 12 .
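All three parts reuse a single factorization of A. In Python this looks like the following sketch, added for illustration (not part of the original solution) and assuming SciPy's LU routines.

    # Sketch: one LU factorization of A serves (a), (b) and (c).
    import numpy as np
    from scipy.linalg import lu_factor, lu_solve

    rng = np.random.default_rng(0)
    n, m, k = 100, 5, 3
    A = rng.standard_normal((n, n))
    b = rng.standard_normal(n)
    c = rng.standard_normal(n)
    B = rng.standard_normal((n, m))

    lu_piv = lu_factor(A)                 # ~2/3 n^3 flops, done once

    # (a) A^k x = b: apply the solver k times.
    x = b.copy()
    for _ in range(k):
        x = lu_solve(lu_piv, x)           # ~2 n^2 flops each
    print("(a) residual:", np.linalg.norm(np.linalg.matrix_power(A, k) @ x - b))

    # (b) alpha = c^T A^{-1} b: one solve, then one dot product.
    alpha = c @ lu_solve(lu_piv, b)

    # (c) AX = B: one forward/back substitution pair per right-hand side.
    X = lu_solve(lu_piv, B)
    print("(c) residual:", np.linalg.norm(A @ X - B))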

28

Therefore, ·

A=

A 11 B 11 + A 12 B 21 A 21 B 11 + A 22 B 21

¸ · C 11 A 11 B 12 + A 12 B 22 = C 21 A 21 B 12 + A 22 B 22

¸ C 12 . C 22

So the recursive yield the correct answer, and the basic case for the Strassen’s algorithm is just the scalar arithmtic operations. So, Strassen’s algorithm correctly multiplies n-by-n matrices, where n is a power of 2. 
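The recursion is easy to test numerically. Below is a sketch added for illustration (not part of the original proof), assuming NumPy; it uses the factored forms of P_1, ..., P_7 together with the C_ij combinations from the proof above. Note that in the expanded P_3 listed above, the term A_11 B_22 appears to be a typo for A_11 B_12, which is what makes the C_ij combinations come out right; the factored form below uses B_12. The small-block cutoff is only for speed; the proof's base case is the 1-by-1 scalar product.

    # Sketch of Strassen's recursion for n a power of 2.
    import numpy as np

    def strassen(A, B):
        n = A.shape[0]
        if n <= 8:
            return A @ B                      # fall back for small blocks
        h = n // 2
        A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
        B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
        P1 = strassen(A12 - A22, B21 + B22)
        P2 = strassen(A11 + A22, B11 + B22)
        P3 = strassen(A11 - A21, B11 + B12)
        P4 = strassen(A11 + A12, B22)
        P5 = strassen(A11, B12 - B22)
        P6 = strassen(A22, B21 - B11)
        P7 = strassen(A21 + A22, B11)
        C11 = P1 + P2 - P4 + P6
        C12 = P4 + P5
        C21 = P6 + P7
        C22 = P2 - P3 + P5 - P7
        return np.block([[C11, C12], [C21, C22]])

    rng = np.random.default_rng(0)
    A = rng.standard_normal((64, 64))
    B = rng.standard_normal((64, 64))
    print(np.allclose(strassen(A, B), A @ B))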

3 S OLUTIONS FOR C HAPTER III: L INEAR L EAST S QUARES P ROBLEMS Question 3.1. Show that the two variations of Algorithm 3.1, CGS and MGS, are mathematically equvialent by showing that the two formulas for r i j yield the same results in exact arithmetic. Proof. The differences between CGS and MGS are the different ways to calculate coefficients r i j . For CGS, it maintains to use the original vectors a i , while the MGS uses the updated vectors q i , which after j steps of updates becomes q i = ai −

j X

r ki q k .

k=1

Thus, as for any j < i , q j are mutually orthogonal, it is obvious that r ( j +1)i = q Tj+1 q i = q Tj+1 (a i −

j X k=1

r ki q j ) = q Tj+1 a i .

So, by induction, CGS and MGS are mathematically equivalent.


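Although mathematically equivalent, the two variants behave very differently in floating point. The sketch below is an added illustration (not part of the original solution); it assumes NumPy, runs both variants on a badly conditioned matrix, and measures the loss of orthogonality of the computed Q.

    # Sketch: classical vs. modified Gram-Schmidt on an ill-conditioned matrix.
    import numpy as np

    def gram_schmidt(A, modified):
        m, n = A.shape
        Q = np.zeros((m, n))
        R = np.zeros((n, n))
        for i in range(n):
            q = A[:, i].copy()
            for j in range(i):
                # CGS projects the original column a_i, MGS the updated q.
                R[j, i] = Q[:, j] @ (q if modified else A[:, i])
                q -= R[j, i] * Q[:, j]
            R[i, i] = np.linalg.norm(q)
            Q[:, i] = q / R[i, i]
        return Q, R

    rng = np.random.default_rng(0)
    U, _ = np.linalg.qr(rng.standard_normal((50, 50)))
    V, _ = np.linalg.qr(rng.standard_normal((50, 50)))
    A = U @ np.diag(np.logspace(0, -10, 50)) @ V      # condition number ~1e10
    for modified in (False, True):
        Q, _ = gram_schmidt(A, modified)
        err = np.linalg.norm(Q.T @ Q - np.eye(50))
        print("MGS" if modified else "CGS", "orthogonality error:", err)

In exact arithmetic the two r_ij formulas coincide, as proved above, but in floating point CGS typically loses far more orthogonality than MGS.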

Question 3.2. Programming question. Skipped. Question 3.3. Let A be m-by-n, m ≥ n, and have full rank. · ¸ · ¸ · ¸ b I A r has a solution where x minimizes kAx − bk2 . 1. Show that T · = 0 A 0 x 2. What is the conition number of the coefficient matrix in terms of the singular values of A? 3. Give an explicit expression for the inverse of the coefficient matrix, as a block 2-by-2 matrix. 4. Show how to use the QR decomposition of A to implement an iterative refinement algorithm to improve the accuracy of x. Solution.


1. We can decompose the equation [ I  A ; A^T  0 ] [ r ; x ] = [ b ; 0 ] into the two equations

r + Ax = b,    A^T r = 0.

Eliminating r gives A^T(b − Ax) = 0, i.e. A^T A x = A^T b, so x satisfies the normal equations and therefore minimizes ‖Ax − b‖_2.

2. Since A has full column rank, let A = U Σ V^T be the full SVD of A, where U (m-by-m) and V (n-by-n) are orthogonal and Σ is m-by-n with the singular values σ_1 ≥ ... ≥ σ_n > 0 on its diagonal. Applying the orthogonal similarity transformation diag(U, V)^T [ · ] diag(U, V) to the coefficient matrix gives [ I_m  Σ ; Σ^T  0 ]. Block elimination with unit-determinant transformations shows that its characteristic polynomial is

det [ (1 − λ)I_m   Σ ; Σ^T   −λ I_n ] = (1 − λ)^{m−n} ∏_{i=1}^{n} (λ^2 − λ − σ_i^2),

so the eigenvalues of the coefficient matrix are 1 (with multiplicity m − n) and (1 ± sqrt(1 + 4σ_i^2))/2 for i = 1, ..., n. Since the matrix is symmetric, its condition number is |λ_max| / |λ_min|, i.e.

κ = [ (1 + sqrt(1 + 4σ_max^2))/2 ] / min{ (sqrt(1 + 4σ_min^2) − 1)/2 , 1 }   when m > n,
κ = [ (1 + sqrt(1 + 4σ_max^2))/2 ] / [ (sqrt(1 + 4σ_min^2) − 1)/2 ]           when m = n.

3. To find the inverse, we can reduce the coefficient matrix to the identity by invertible block transformations; collecting the factors gives the candidate

B = [ I_m − A(A^T A)^{-1} A^T     A(A^T A)^{-1} ;  (A^T A)^{-1} A^T     −(A^T A)^{-1} ].

It is easy to check that [ I  A ; A^T  0 ] B = B [ I  A ; A^T  0 ] = I_{m+n}, so B is the inverse of the coefficient matrix.

4. Let A = QR be the full QR factorization of A, where Q is m-by-m orthogonal and R is m-by-n upper triangular. Since

[ I  A ; A^T  0 ] = [ Q  0 ; 0  I_n ] [ I  R ; R^T  0 ] [ Q^T  0 ; 0  I_n ],

the main work in solving the augmented system lies in solving

[ I  R ; R^T  0 ] [ x_1 ; x_2 ] = [ b_1 ; b_2 ],   i.e.   x_1 + R x_2 = b_1  and  R^T x_1 = b_2,

which requires only triangular solves. We can therefore run iterative refinement on the augmented system, using the same QR factors for every correction step, to improve the accuracy of x. 
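The following sketch (my own illustration of part 4, with arbitrary test data) solves the augmented system through the QR factors and performs a few refinement sweeps on the augmented residual:

    import numpy as np

    def augmented_refinement(A, b, sweeps=2):
        # Solve min ||Ax - b||_2 through the augmented system [I A; A^T 0],
        # doing every solve with one full QR factorization A = QR.
        m, n = A.shape
        Q, R = np.linalg.qr(A, mode='complete')   # Q: m-by-m, R: m-by-n
        R1 = R[:n, :]                             # invertible n-by-n triangle
        r, x = np.zeros(m), np.zeros(n)
        for _ in range(sweeps + 1):
            f = b - r - A @ x                     # residual of the first block row
            g = -(A.T @ r)                        # residual of the second block row
            c = Q.T @ f
            y_top = np.linalg.solve(R1.T, g)      # R^T y1 = g
            dx = np.linalg.solve(R1, c[:n] - y_top)
            dr = Q @ np.concatenate([y_top, c[n:]])
            r, x = r + dr, x + dx
        return x

    rng = np.random.default_rng(2)
    A = rng.standard_normal((30, 5))
    b = rng.standard_normal(30)
    x = augmented_refinement(A, b)
    print(np.allclose(x, np.linalg.lstsq(A, b, rcond=None)[0]))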

Question 3.4. Weighted least squares: If some components of Ax − b are more important than others, we can weight them with a scale factor d_i and solve the weighted least squares problem min ‖D(Ax − b)‖_2 instead, where D has diagonal entries d_i. More generally, recall that if C is symmetric positive definite, then ‖x‖_C ≡ (x^T C x)^{1/2} is a norm, and we can consider minimizing ‖Ax − b‖_C. Derive the normal equations for this problem, as well as the formulation corresponding to the previous question.

Solution. To derive the normal equations, we look for the x at which the gradient of ‖Ax − b‖_C^2 = (Ax − b)^T C (Ax − b) vanishes. For a small perturbation e of x,

(A(x + e) − b)^T C (A(x + e) − b) − (Ax − b)^T C (Ax − b) = 2 e^T (A^T C A x − A^T C b) + e^T A^T C A e,

and the last term is O(‖e‖^2), so the gradient vanishes exactly when

A^T C A x = A^T C b,

which are the normal equations for this problem. To derive the formulation corresponding to the previous question, consider the symmetric nonsingular system

[ I   C^{1/2} A ; A^T C^{1/2}   0 ] [ r ; x ] = [ C^{1/2} b ; 0 ].

Taking Ã = C^{1/2} A and b̃ = C^{1/2} b, this is exactly the system of the previous question, so the previous results apply. 
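A quick numerical check of the weighted normal equations (the Cholesky-based reference solution below is my own way of generating a comparison, not from the text):

    import numpy as np

    rng = np.random.default_rng(3)
    m, n = 20, 4
    A = rng.standard_normal((m, n))
    b = rng.standard_normal(m)
    M = rng.standard_normal((m, m))
    C = M @ M.T + m * np.eye(m)            # symmetric positive definite weight

    # Solve A^T C A x = A^T C b directly ...
    x_ne = np.linalg.solve(A.T @ C @ A, A.T @ C @ b)

    # ... and compare with ordinary least squares on C^{1/2}A, C^{1/2}b,
    # using C = L L^T so that ||v||_C = ||L^T v||_2.
    L = np.linalg.cholesky(C)
    x_ls = np.linalg.lstsq(L.T @ A, L.T @ b, rcond=None)[0]
    print(np.allclose(x_ne, x_ls))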


Question 3.5. Let A ∈ R^{n×n} be positive definite. Two vectors u_1 and u_2 are called A-orthogonal if u_1^T A u_2 = 0. If U ∈ R^{n×r} and U^T A U = I, then the columns of U are said to be A-orthonormal. Show that every subspace has an A-orthogonal basis.

Proof. Given any subspace, choose linearly independent vectors (α_1, α_2, ..., α_n) that span it. We turn them into an A-orthonormal basis (β_1, β_2, ..., β_n) by Gram-Schmidt in the A-inner product:

1: for i = 1 to n do
2:   β_i = α_i − Σ_{j=1}^{i−1} (β_j^T A α_i) β_j
3:   β_i = β_i / sqrt(β_i^T A β_i)
4: end for

Since the α_i are linearly independent and A is positive definite, no β_i vanishes before normalization, so the process runs for all n steps. The first step already gives A-orthogonality, since after normalizing β_1 (so β_1^T A β_1 = 1)

β_1^T A β_2 ∝ β_1^T A ( α_2 − (β_1^T A α_2) β_1 ) = β_1^T A α_2 − β_1^T A α_2 = 0.

In the same way, by mathematical induction, for any j < i

β_j^T A β_i ∝ β_j^T A α_i − (β_j^T A α_i)(β_j^T A β_j) = 0,

so (β_1, β_2, ..., β_n) is an A-orthogonal set. Finally, the β_i are linearly independent: if some β_{i_1}, ..., β_{i_s} satisfied c_1 β_{i_1} + c_2 β_{i_2} + ... + c_s β_{i_s} = 0, then applying β_{i_k}^T A to both sides gives c_k = 0 for every k. Since the α_i span the subspace, the β_i also form a basis of it, and this basis is A-orthonormal. 
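The construction above is just Gram-Schmidt in the inner product ⟨u, v⟩ = u^T A v; a minimal NumPy sketch (function name and test data are mine):

    import numpy as np

    def a_orthonormalize(A, alphas):
        # Gram-Schmidt in the A-inner product <u, v> = u^T A v.
        betas = []
        for a in alphas.T:
            v = a.copy()
            for b in betas:
                v -= (b @ A @ a) * b
            betas.append(v / np.sqrt(v @ A @ v))
        return np.column_stack(betas)

    rng = np.random.default_rng(4)
    n, r = 6, 3
    M = rng.standard_normal((n, n))
    A = M @ M.T + n * np.eye(n)            # positive definite
    V = rng.standard_normal((n, r))        # basis of a subspace
    U = a_orthonormalize(A, V)
    print(np.allclose(U.T @ A @ U, np.eye(r)))   # U^T A U = I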

Question 3.6. Let A have the form

A = [ R ; S ],

where R is n-by-n and upper triangular, and S is m-by-n and dense. Describe an algorithm using Householder transformations for reducing A to upper triangular form. Your algorithm should not "fill in" the zeros in R and thus require fewer operations than would Algorithm 3.2 applied to A.

Solution. Appending R(1,1) to the first column of S gives the vector x_1 = (R(1,1), S(1,1), S(2,1), ..., S(m,1))^T. If ‖x_1‖_2 = 0, nothing needs to be done in the first column and we move on to the submatrix A(2 : m+n, 2 : n). Otherwise there is an (m+1)-by-(m+1) Householder transformation P with P x_1 = (‖x_1‖_2, 0, ..., 0)^T. Embedding P in the rows and columns {1, n+1, ..., n+m} gives the symmetric orthogonal matrix

P̂ = [ P(1,1)        0         P(1, 2:m+1) ;
      0             I_{n−1}   0            ;
      P(2:m+1, 1)   0         P(2:m+1, 2:m+1) ].

Applying P̂ to A yields

P̂ A = A_1 = [ ‖x_1‖_2  * ; 0  R(2:n, 2:n) ; 0  * ],

so the zeros already present in R are never filled in, and the problem is reduced to a smaller one of the same form. Continuing in this way reduces A to upper triangular form. 

Question 3.7. If A = R + uv^T, where R is an upper triangular matrix, and u and v are column vectors, describe an efficient algorithm to compute the QR decomposition of A.

Solution. When either u or v is 0 the answer is trivial, so suppose u_n ≠ 0 (otherwise the last row of uv^T is 0 and we can start from the last nonzero row of uv^T). Since uv^T has rank one and its last row is nonzero, every other row of uv^T is a multiple of the last one. Thus there exist n − 1 Givens rotations G_1, G_2, ..., G_{n−1}, where G_i rotates rows i and n, that annihilate rows 1, ..., n − 1 of uv^T. Applied to the upper triangular R, each such rotation creates fill only in the last row, so

Ā = G_{n−1} G_{n−2} ... G_1 A = G_{n−1} ... G_1 R + G_{n−1} ... G_1 uv^T

is upper triangular except for its last row. We then eliminate that last row: if Ā(1,1) ≠ 0, a rotation of rows 1 and n cancels Ā(n,1) (otherwise we first swap rows 1 and n, which is also orthogonal); continuing column by column reduces Ā to upper triangular form. The product of all the orthogonal transformations applied to A forms the orthogonal factor Q, and the whole update costs only O(n^2) work. 

Question 3.8. Let x ∈ R^n and let P be a Householder matrix such that P x = ±‖x‖_2 e_1. Let G_{1,2}, ..., G_{n−1,n} be Givens rotations, and let Q = G_{1,2} ... G_{n−1,n}. Suppose Q x = ±‖x‖_2 e_1. Must P equal Q?

Solution. P need not equal Q. Consider the two-dimensional counterexample with x = (3, 4)^T:

P x = (1/5) [ −3  −4 ; −4  3 ] [ 3 ; 4 ] = [ −5 ; 0 ] = −‖x‖_2 e_1.

This P is a Householder matrix with generating vector u = (2, 1)^T. At the same time, the Givens rotation

Q = (1/5) [ −3  −4 ; 4  −3 ],   Q x = [ −5 ; 0 ] = −‖x‖_2 e_1,

also satisfies the condition, yet Q ≠ P. 
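These two matrices are easy to check numerically (a short verification of my own; note the determinants already show a reflection can never equal a rotation):

    import numpy as np

    x = np.array([3.0, 4.0])
    P = np.array([[-3, -4], [-4, 3]]) / 5.0     # Householder reflection
    Q = np.array([[-3, -4], [4, -3]]) / 5.0     # Givens rotation
    print(P @ x, Q @ x)                          # both give [-5, 0]
    print(np.allclose(P, P.T), np.isclose(np.linalg.det(P), -1.0))
    print(np.isclose(np.linalg.det(Q), 1.0), np.allclose(P, Q))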

Question 3.9. Let A be m-by-n, with SVD A = U Σ V^T. Compute the SVDs of the following matrices in terms of U, Σ, and V: 1. (A^T A)^{-1}, 2. (A^T A)^{-1} A^T, 3. A(A^T A)^{-1}, 4. A(A^T A)^{-1} A^T.

Remark. For the formulas to hold, we have to assume that A has full column rank, and we work with the reduced SVD (U is m-by-n, Σ and V are n-by-n). For simplicity we assume m ≥ n; the case n ≥ m is treated similarly.

Proof.

1. (A^T A)^{-1} = (V Σ^T U^T U Σ V^T)^{-1} = (V Σ^2 V^T)^{-1} = V Σ^{-2} V^T.
2. (A^T A)^{-1} A^T = V Σ^{-2} V^T V Σ U^T = V Σ^{-1} U^T.
3. A (A^T A)^{-1} = U Σ V^T V Σ^{-2} V^T = U Σ^{-1} V^T.
4. A (A^T A)^{-1} A^T = U Σ^{-1} V^T V Σ U^T = U U^T, the orthogonal projector onto the range of A; its nonzero singular values are all equal to 1 (and it equals I only when m = n).

In each case the right-hand side already displays the SVD, up to reordering the singular values 1/σ_i (or 1) in decreasing order. 
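A quick verification with a random full-column-rank matrix (sizes are my own choice); NumPy's reduced SVD plays the role of UΣV^T above:

    import numpy as np

    rng = np.random.default_rng(5)
    m, n = 7, 4
    A = rng.standard_normal((m, n))
    U, s, Vt = np.linalg.svd(A, full_matrices=False)   # reduced SVD
    V = Vt.T
    AtA_inv = np.linalg.inv(A.T @ A)

    print(np.allclose(AtA_inv, V @ np.diag(s**-2) @ V.T))           # item 1
    print(np.allclose(AtA_inv @ A.T, V @ np.diag(1/s) @ U.T))       # item 2
    print(np.allclose(A @ AtA_inv, U @ np.diag(1/s) @ V.T))         # item 3
    print(np.allclose(A @ AtA_inv @ A.T, U @ U.T))                  # item 4: projector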

Question 3.10. Let A_k be a best rank-k approximation of the matrix A, as defined in Part 9 of Theorem 3.3. Let σ_i be the i-th singular value of A. Show that A_k is unique if σ_k > σ_{k+1}.

Proof. I believe the claim of the question is not correct as stated. According to the textbook, a best rank-k approximation of A in the two-norm is any rank-k matrix Ã satisfying ‖A − Ã‖_2 = σ_{k+1}. Here is a counterexample to uniqueness under the assumption of the question. Consider

A = diag(4, 3, 2, 1),

and any matrix of the form

Ã = diag(4 − ε, 3, 2, 0)   with 0 < ε < 1.

Then rank(Ã) = 3 and ‖A − Ã‖_2 = max(ε, 1) = 1 = σ_4, so every such Ã is a best rank-3 approximation of A; the best rank-3 approximation is therefore not unique. The uniqueness statement does hold if we measure the error in the Frobenius norm, and it is trivial if the best rank-k approximation is required to have the form Σ_{i=1}^{k} σ_i u_i v_i^T.

Remark. Under the assumption of the question, all one can say is that a matrix B with ‖A − B‖_2 = σ_{k+1} is constrained only in part of U^T B V (where U and V are the orthogonal factors of the SVD of A); the two-norm equality alone does not provide enough information about the rest of B, which is the source of the nonuniqueness. 

Question 3.11. Let A be m-by-n. Show that X = A^+ (the Moore-Penrose pseudoinverse) minimizes ‖AX − I‖_F over all n-by-m matrices X. What is the value of this minimum?

Proof. Let A = U Σ V^T be the full SVD of A and let r = rank(A). Since X ↦ X̂ = V^T X U is a one-to-one correspondence (we can recover X = V X̂ U^T), we may work with X̂ instead. Then

‖AX − I‖_F^2 = ‖Σ X̂ − I_m‖_F^2 = Σ_{i=1}^{r} [ (σ_i X̂(i,i) − 1)^2 + σ_i^2 Σ_{j≠i} X̂(i,j)^2 ] + (m − r).

To minimize this we must take X̂(i,i) = σ_i^{-1} and X̂(i,j) = 0 for j ≠ i, for every i ∈ {1, ..., r}; the remaining rows of X̂ are free. So the minimizer is in general not unique, but X̂ = Σ^+ always satisfies these conditions. Consequently X = A^+ is a minimizer, and the minimum is sqrt(m − r). 

Question 3.12. Let A, B, and C be matrices with dimensions such that the product A^T C B^T is well defined. Let X be the set of matrices X minimizing ‖AXB − C‖_F, and let X_0 be the unique member of X minimizing ‖X‖_F. Show that X_0 = A^+ C B^+.

Remark. This solution is unique only under additional assumptions, for example that A and B have full rank; the argument below shows that X_0 = A^+ C B^+ is always a minimizer.

Proof. Let A be m-by-s, B be t-by-n, and C be m-by-n, and let A = U_1 Σ_1 V_1^T and B = U_2 Σ_2 V_2^T be full SVDs. Since X ↦ X̂ = V_1^T X U_2 is a one-to-one correspondence, we may substitute X̂ for X. Then

‖AXB − C‖_F^2 = ‖Σ_1 X̂ Σ_2 − U_1^T C V_2‖_F^2 = ‖(Σ_1 X̂ Σ_2 − U_1^T C V_2)(1:r_1, 1:r_2)‖_F^2 + ‖(U_1^T C V_2) outside that leading block‖_F^2,


where r_1 is the rank of A and r_2 is the rank of B. The matrix U_1^T C V_2 is fixed once A, B, and C are given, so X̂ minimizes the expression if and only if it minimizes ‖(Σ_1 X̂ Σ_2 − U_1^T C V_2)(1:r_1, 1:r_2)‖_F^2. Thus any X̂ with

Σ_1 X̂ Σ_2 (1:r_1, 1:r_2) = (U_1^T C V_2)(1:r_1, 1:r_2),

or equivalently any X with Σ_1 V_1^T X U_2 Σ_2 (1:r_1, 1:r_2) = (U_1^T C V_2)(1:r_1, 1:r_2), is a minimizer. We claim that X_0 = A^+ C B^+ is such a minimizer, since

Σ_1 V_1^T X_0 U_2 Σ_2 = Σ_1 Σ_1^+ U_1^T C V_2 Σ_2^+ Σ_2 = [ I_{r_1}  0 ; 0  0 ] U_1^T C V_2 [ I_{r_2}  0 ; 0  0 ],

whose leading r_1-by-r_2 block is exactly (U_1^T C V_2)(1:r_1, 1:r_2).
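A small numerical sanity check of the claim (random sizes chosen by me, with A of full column rank and B of full row rank): A^+ C B^+ agrees with the minimum-norm solution of the vectorized problem min ‖(B^T ⊗ A) vec(X) − vec(C)‖_2, which NumPy's lstsq returns.

    import numpy as np

    rng = np.random.default_rng(6)
    m, s, t, n = 6, 4, 3, 5
    A = rng.standard_normal((m, s))
    B = rng.standard_normal((t, n))
    C = rng.standard_normal((m, n))

    X0 = np.linalg.pinv(A) @ C @ np.linalg.pinv(B)

    # vec(AXB) = (B^T kron A) vec(X) with column-major vec.
    K = np.kron(B.T, A)
    x_vec = np.linalg.lstsq(K, C.flatten(order='F'), rcond=None)[0]
    X_ref = x_vec.reshape((s, t), order='F')

    print(np.allclose(X0, X_ref))
    print(np.linalg.norm(A @ X0 @ B - C))   # minimal residual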

 Question 3.13. Show that the Moore-Penrose pseudoinverse of A satisfies the following identities: A A + A = A, A + A A + = A + , A + A = (A + A)T , A A + = (A A + )T . Proof. Denote the SVD decomposition of A as U ΣV T , r as the rank of A. We prove these identities one by one. 1. T

+

T

+

A A A = U ΣV V Σ U U ΣV

T

·

Ir =U 0

¸ 0 ΣV T = U ΣV T = A. 0

2. ·

Ir A A A = V Σ U U ΣV V Σ U = V 0 +

+

+

T

T

+

T

¸ 0 + T Σ U = V Σ+U T = A + . 0

3. +

T

+

A A = V Σ U U ΣV

T

·

¸ · 0 T Ir V = (V 0 0

¸ 0 T T V ) = (A + A)T . 0

·

¸ · 0 T Ir U = (U 0 0

¸ 0 T T U ) = (A A + )T . 0

Ir =V 0

4. Ir A A = U ΣV V Σ U = U 0 +

T

+

T

¸ 0 AT Question 3.14. Prove part 4 of Theorem 3.3: Let H = , where A is square and A = A 0 U ΣV T is its SVD. Let Σ = diag(σ1 , σ2 . . . , σn ),U = [u 1 , u 2 , . . . , u n ], and V = [v 1 , v 2 , . . . , v n ]. Prove · ¸ vi that the 2n eigenvalues of H are ±σi , with corresponding unit eigenvectors p1 . Extend 2 ±u i to the case of retangular A. ·


Proof. We prove the rectangular case directly; the square case then follows without special treatment. Write the (reduced) SVD of A as A = U Σ V^T = Σ_{i=1}^{n} σ_i u_i v_i^T, with U = [u_1, ..., u_n], V = [v_1, ..., v_n], and Σ = diag(σ_1, ..., σ_n). Then

H [ v_i ; u_i ] = [ A^T u_i ; A v_i ] = [ σ_i v_i ; σ_i u_i ] = σ_i [ v_i ; u_i ],
H [ v_i ; −u_i ] = [ −A^T u_i ; A v_i ] = [ −σ_i v_i ; σ_i u_i ] = −σ_i [ v_i ; −u_i ].

Since ‖[ v_i ; ±u_i ]‖_2 = sqrt(2) and these 2n vectors are linearly independent, the ±σ_i are 2n eigenvalues of H with corresponding unit eigenvectors (1/sqrt(2)) [ v_i ; ±u_i ]. For the remaining m − n eigenvalues, let Ũ = [u_{n+1}, ..., u_m] be an orthonormal complement of U. The remaining m − n eigenvalues of H are zero, with eigenvectors [ 0 ; u_i ] for i > n, since

H [ 0 ; u_i ] = [ A^T u_i ; 0 ] = [ 0 ; 0 ].

 Question 3.15. Let A be m-by-n, m < n, and of full rank. Then min kAx − bk2 is called an underdetermined least squares problem. Show that the solution is an (n − m)-dimensional set. Show how to compute the unique minimum norm solution using appropriately modified normal equations, QR decomposition, and SVD. Solution. We explain these three techniques one by one, and prove the properties of the solution in the first technique. 1. For modified normal equation, partition any vector x ∈ Rn into x = x 1 + x 2 , where x 2 is in the null space of A and x 1 is in the range of A T . Thus, kAx − bk2 = kAx 1 − bk2 = kA A T y − bk2 . As A A T has full rank, the normal equation A AT y = b has unique solution. Every vector in the form A T y + x 2 is the minimizer, which forms an (n − m)-dimensional set. The unique minimum norm solution is x = A T (A A T )−1 b .


2. For QR decompostion, since A T has the normal QR decomposition as A T = QR where Q is n-by-m and R is m-by-m, we have that A = R T Q T , and the least square problem equals the equation R T Q T x = b. Let x = Q(R T )−1 b + x˜ , where x˜ can be any vector. We can transform the above equation into Q T x˜ = 0. Thus, x˜ belong to the null space of Q T . Evaluate the norm of our solution, we have kxk22 = kQ(R T )−1 b + x˜ k22 = kQ(R T )−1 bk22 + kx˜ k22 ≥ kQ(R T )−1 bk22 . Consequently, the unique minimum norm solution is Q(R T )−1 b. 3. For SVD, we can have the SVD decomposition of A as A = V ΣU T , where V is m-by-m, Σ is m-by-m full rank diagonal matrix and U T is m-by-n. Like QR decomposition, the least square problem equals to U T x = Σ−1V T b. Let x = U Σ−1V T b + x˜ and we have the solution of above linear equations as U T x˜ = 0. Thus, x˜ belong to the null space of U T . Evaluate the norm of our solution, we have kxk22 = kU Σ−1V T b + x˜ k22 = kU Σ−1V T bk22 + kx˜ k22 ≥ kU Σ−1V T bk22 . Consequently, the unique minimum norm solution is U Σ−1V T b.
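For the underdetermined problem of Question 3.15, all three routes give the same minimum-norm solution; here is a small check (sizes are mine) comparing the modified normal-equations formula A^T(AA^T)^{-1}b, the QR route, and NumPy's SVD-based lstsq:

    import numpy as np

    rng = np.random.default_rng(7)
    m, n = 3, 7                                  # m < n, full row rank
    A = rng.standard_normal((m, n))
    b = rng.standard_normal(m)

    x_ne = A.T @ np.linalg.solve(A @ A.T, b)     # modified normal equations
    Q, R = np.linalg.qr(A.T)                     # A^T = Q R, Q: n-by-m
    x_qr = Q @ np.linalg.solve(R.T, b)           # x = Q R^{-T} b
    x_svd = np.linalg.lstsq(A, b, rcond=None)[0] # minimum-norm solution via SVD

    print(np.allclose(x_ne, x_qr), np.allclose(x_ne, x_svd))
    print(np.allclose(A @ x_ne, b))              # it solves Ax = b exactly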

 Question 3.16. Prove Lemma 3.1: Let P be an exact Householder (or Givens) transformation, and P̃ be its floating point approximation. Then

fl(P̃ A) = P(A + E)   with ‖E‖_2 = O(ε)‖A‖_2,   and   fl(A P̃) = (A + F)P   with ‖F‖_2 = O(ε)‖A‖_2.

Proof. Since fl(a ⊙ b) = (a ⊙ b)(1 + δ) with |δ| ≤ ε for each scalar operation, the same argument as in Question 1.10 shows that fl(P̃ A) = P A + Δ with ‖Δ‖_2 = O(ε)‖A‖_2 (note that P̃ = P + O(ε) and ‖P‖_2 = 1). Hence, with E = P^{-1}(fl(P̃ A) − P A),

‖E‖_2 ≤ ‖P^{-1}‖_2 ‖fl(P̃ A) − P A‖_2 = O(ε)‖A‖_2,

and fl(P̃ A) = P(A + E). The second identity is proved in the same way.




Question 3.17. The question is too long, so I omit it. For details, please refer to the textbook. Solution. 1. We prove it by induction. First, consider the base case as P = P 2 P 1 . We have P = P 2 P 1 = (I − u 2 u T2 )(I − u 1 u T1 ) = I − 2u 1 u T1 − 2u 2 u T2 − 4u T2 u 1 u 2 u T1 · ¸· T ¸ 2 u1 . = I − [u 1 , u 2 ] 4u T2 u 1 2 u T2 Thus, the base case holds. Then, assume that P 0 = P k P k−1 · · · P 1 = I −Uk Tk UkT , then we have P = P k+1 P 0 = (I − 2u k+1 u Tk+1 )(I −Uk Tk UkT ) = I −Uk Tk UkT − 2u k+1 u Tk+1 + 2u k+1 u Tk+1Uk Tk UkT ¸· T ¸ · Uk Tk = I − [Uk , u k+1 ] 2u Tk+1Uk Tk 2 u Tk+1 T = I −Uk+1 Tk+1Uk+1 .

Therefore, by induction, we have proved part 1 of the question. As for the algorithm, we can use the recurrence obtained in the induction,

T_{k+1} = [ T_k   0 ; 2 u_{k+1}^T U_k T_k   2 ],

with base case T_1 = 2 (for the convention P_i = I − 2 u_i u_i^T with unit u_i). The pseudocode is shown below:

1: T_1 = 2
2: for i = 2 to k do
3:   b_i = 2 u_i^T U_{i−1} T_{i−1}
4:   T_i = [ T_{i−1}  0 ; b_i  2 ]
5: end for
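A numerical check of this compact representation (my own sketch, assuming the convention P_i = I − 2 u_i u_i^T with unit vectors u_i, so that P_k ... P_1 = I − U T U^T):

    import numpy as np

    def accumulate(us):
        # Build T with P_k ... P_1 = I - U T U^T, for P_i = I - 2 u_i u_i^T.
        k = len(us)
        U = np.column_stack(us)
        T = np.zeros((k, k))
        T[0, 0] = 2.0
        for i in range(1, k):
            T[i, i] = 2.0
            T[i, :i] = 2.0 * us[i] @ U[:, :i] @ T[:i, :i]
        return U, T

    rng = np.random.default_rng(8)
    n, k = 6, 3
    us = [u / np.linalg.norm(u) for u in rng.standard_normal((k, n))]
    U, T = accumulate(us)

    P = np.eye(n)
    for u in us:
        P = (np.eye(n) - 2.0 * np.outer(u, u)) @ P   # P = P_k ... P_1
    print(np.allclose(P, np.eye(n) - U @ T @ U.T))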

2. The algorithm that the textbook introduces is already organized in this way:

1: for i = 1 to min(m − 1, n) do
2:   u_i = House(A(i : m, i))
3:   v^T = u_i^T A(i : m, i : n)
4:   A(i : m, i : n) = A(i : m, i : n) − 2 u_i v^T
5: end for

Here u_i^T A(i : m, i : n) is a matrix-vector multiplication and A(i : m, i : n) − 2 u_i v^T is a rank-1 update, both Level 2 BLAS operations. Since House(A(i : m, i)) costs about 2(m − i) flops, the total cost is

Σ_{i=1}^{min(m−1,n)} [ (2m − 2i − 1)(n − i) + 2(m − i)(n − i) + 2(m − i) ] ≈ 2mn^2 − (2/3)n^3.


3. Let b denote a small block size. As with Level 3 BLAS Gaussian elimination, we use the technique of delayed updates. In detail, assume the first k − 1 columns have already been processed, yielding

A = [ Q_11  Q_12 ; Q_21  Q_22 ] · [ R_11  R_12  R_13 ; 0  Ã_22  Ã_23 ; 0  Ã_32  Ã_33 ],

where the block columns of the triangular factor have widths k − 1, b, and n − k − b + 1. Then apply the Level 2 BLAS QR implementation to the panel [ Ã_22 ; Ã_32 ], obtaining

[ Ã_22 ; Ã_32 ] = Q̄ [ R_22 ; 0 ]   and   Q̄^T [ Ã_23 ; Ã_33 ] = [ R_23 ; Ā_33 ].

Thus

A = [ Q_11  Q_12 ; Q_21  Q_22 ] [ I ; Q̄ ] [ R_11  R_12  R_13 ; 0  R_22  R_23 ; 0  0  Ā_33 ]
  = [ Q_11  Q_12 Q̄ ; Q_21  Q_22 Q̄ ] [ R_11  R_12  R_13 ; 0  R_22  R_23 ; 0  0  Ā_33 ].

 R 13 R 23  A¯ 33

The Q can be computed by part I, and R 23 , A¯ 33 can be computed from matrix product, which is level 3 BLAS.

 Question 3.18. It is often of interest to solve constrained least squares problems, where the solution x must satisfy a linear or nonlinear constraint in addition to minimizing kAx − bk2 . We consider one such problem here. Suppose that we want to choose x to minimize kAx − bk2 subject to the linear constraint C x = d . Suppose also that A is m-by-n, C is p-by-n, and C has full rank. We also assume that p ≤ n (so C x = d is guaranteed to be consisitent) and n ≤ m + p (so the system is not underdetermined). Show that there is a unique solution under the assump· ¸ A tion that has full column rank. Show how to compute x using two QR decompositions and C some matrix-vector multiplications and solving some triangular systems of equations. Solution. First, compute a full QR decomposition of C as C = R T Q T , where Q T = [Q 1 ,Q 2 ]T is n-by-n and R T = [L, 0] is p-by-n. Then the constraint C x = d can be transformed into R T Q T x = [L, 0]

· T¸ Q1 x = LQ 1T x = d . Q 2T

40

Like Question 3.15., the solutions of the above linear system are Q 1 L −1 d +Q 2 y, where y is any vector in Rn−p . Therefore, the original least square problem equals to kAx − bk2 = kA(Q 1 L −1 d +Q 2 y) − bk2 = kAQ 2 y − (b − AQ 1 L −1 d )k2 . As AQ 2 is m-by-(n − p) and m + p ≥ n, we have m ≥ n − p. If AQ 2 has full rank, we will · ¸ A get the unique solution. Since has full column rank, we claim that the intersection of C the null space of A and the null space of B is empty. If not, suppose x is one vector of the · ¸ · ¸ A A intersection, then x = 0, which means the columns of are linear dependent. Thus, C C the null spaces of A and C do not intersect. Consequently, AQ 2 has full rank. 5 If we denote ˆ we have the unique solution of y as the QR decomposition of AQ 2 as AQ 2 = Qˆ R, y = Rˆ −1Qˆ T (b − AQ 1 L −1 d ). Then the final answer should be x = AQ 2 y +Q 1 L −1 d .

 Question 3.19. Programming question. Skipped. Question 3.20. Prove Theorem 3.4. Proof. First, since (A T A)−1 A T (A + δA) = I + (A T A)−1 A T δA and kA T A −1 A T δAk2 = 1/σn , I + (A T A)−1 A T δA is invertible accroding to Lemma 2.1. Consequently, (A + δA) has full rank. As x˜ minimizes the problem k(A + δA)x˜ − (b + δb)k2 , it can be expressed as ¡ ¢−1 x˜ = (A + δA)T (A + δA) (A + δA)T (b + δb) ¡ ¢−1 = A T A(I + (A T A)−1 A T δA + (A T A)−1 (δA)T A + (A T A)−1 (δA)T (δA) (A + δA)T (b + δb) ∞ X ¡ ¢i = (−1)i (A T A)−1 A T δA + (A T A)−1 (δA)T A + (A T A)−1 (δA)T (δA) (A T A)−1 (A + δA)T (b + δb) i =0

¢ ¡ ≈ I − (A T A)−1 (δA)T A − (A T A)−1 A T δA (A T A)−1 (A T b + (δA)T b + A T δb)

≈ (I − (A T A)−1 (δA)T A − (A T A)−1 A T δA)x + (A T A)−1 (δA)T b + (A T A)−1 A T δb = (I − (A T A)−1 A T δA)x + (A T A)−1 (δA)T r + (A T A)−1 A T δb. 5 Denote Q = q , q , . . . , q 2 1 2 n−p . If there exist b 1 , b 2 , . . . , b n−p such that

¡

¢

b 1 Aq 1 + b 2 Aq 2 + . . . + b n−p Aq n−p = A(b 1 q 1 + b 2 q 2 + . . . + b n−p q n−p ) = 0, it follows b 1 q 1 + b 2 q 2 + . . . + b n−p q n−p is in the null space of A, which is also in the null space of C .

41

Also, since sin θ =

kr k2 kbk2

and θ is acute, we can evaluate cos θ and tan θ as following: r p kbk22 −kr k22 2 cos θ = 1 − sin θ = = kbk2 2

tan θ =

sin θ cos θ

kAxk2 kbk2 ,

= (kr k2 /kbk2 )/(kAxk2 /kbk) =

kr k2 kAxk2 .

Thus, kx˜ − xk2 ≤ ²k(A T A)−1 A T k2 kAk2 kxk2 + ²k(A T A)−1 k2 kAk2 kr k2 + ²k(A T A)−1 A T k2 kbk2 ≤ ²κ2 (A)kxk2 + ²κ22 (A)kxk2 tan θ + ²κ2 (A)kxk2 / cos θ µ ¶ 2κ2 (A) ≤² + tan θ · κ22 (A) . cos θ



4 S OLUTIONS FOR C HAPTER IV: N ONSYMMETRIC E IGENVALUE P ROBLEMS Q Question 4.1. Let A be defined as in equation (4.1). Show that det(A) = bi=1 det(A i i ) and then Q that det(A − λI ) = bi=1 det(A i i − λI ). Conclude that the set of eigenvalues of A is the union of the sets of eigenvalues of A 11 through A bb .

Proof. Denote A i i as m i -by-m i and the whole matrix A as n-by-n. From the definition of P determinant, we have det(A) = (−1)σi a i 1 ,1 a i 2 ,2 , . . . , a i n ,n , where σi is the inverse number of the permutation (i 1 , i 2 , . . . , i n ). Then we can group them as X det(A) = (−1)σi (a i 1 ,1 , . . . , a i m1 ,m1 ) . . . (a i mb−1 +1 ,mb−1 +1 , . . . , a i mb ,mb ). For the first group, a i j , j = 0 when i j > m 1 . Thus, we can delete them from the sum, and get X det(A) = (−1)σ1 (a i 1 ,1 , . . . , a i m1 ,m1 )(−1)σi 0 (a i m1 +1 ,m1 +1 , . . . , a i m2 ,m2 ) . . . (a i mb−1 +1 ,mb−1 +1 , . . . , a i mb ,mb ) X = det(A 11 ) (−1)σi 0 (a i m1 +1 ,m1 +1 , . . . , a i m2 ,m2 ) . . . (a i mb−1 +1 ,mb−1 +1 , . . . , a i mb ,mb ). Therefore, we reduce our problem into smaller subproblem. Continue, and we have det(A) = det(A 11 ) det(A 22 ) . . . det(A bb ) =

b Y

det(A i i ).

i =1

Consequently, det(A − λI ) =

b Y

det(A i i − λI ).

i =1

Thus λ is an eigenvalue of A if and only if it is an eigenvalue of some A i i .



Question 4.2. Suppose that A is normal; i.e., A A ∗ = A ∗ A. Show that if A is also triangular, it must be diagonal. Use this to show that an n-by-n matrix is normal if and only if it has n orthonormal eigenvectors.

42

Proof. Since A is normal and triangular, we have     a¯11 a¯11 a 11 a 12 . . . a 1n  ¯    a 22 . . . a 2n   a 12 a¯22   a¯12    = . ..  .. .. ..   ..   .  . . . .  .   .  a nn a¯1n a¯2n . . . a¯nn a¯1n



a¯22 .. . a¯2n

..

. ...

a¯nn

    

a 11

a 12 a 22

... ... .. .

 a 1n  a 2n  ..  . .  a nn

Equating (1, 1) entry, we have |a 11 |2 =

n X

|a 1n |2 .

i =1

Thus, for i 6= 1 we have |a 1i | = 0. Then we reduce the problem into smaller subproblem. Continue, and we will have A is diagonal. For any n-by-n normal matrix A, we have a unitary matrix U such that U ∗ AU = T , where T is triangular. Then, we claim that T is normal because T ∗ T = U ∗ A ∗UU ∗ AU = U ∗ A ∗ AU = U ∗ A A ∗U = U ∗ AUU ∗ A ∗U = T T ∗ . Thus, T is diagonal and denote it as diag(λ1 , λ2 , . . . , λn ), we have AU = U diag(λ1 , λ2 , . . . , λn ). i.e., each column of U is the right eigenvector of A. On the other hand, if A has n orthonormal eigenvectors, denote them as U . We have A = U diag(λ1 , λ2 , . . . , λn )U ∗ , and it is easy to verify that A is normal.  Question 4.3. Let λ and µ be distinct eigenvalues of A, let x be a right eigenvector for λ, and let y be a left eigenvector for µ. Show that x and y are orthogonal. Proof. Consider y ∗ Ax, and we have y ∗ Ax = µy ∗ x = λy ∗ x. Thus, (λ − µ)y ∗ x = 0. As λ 6= µ, we get y ∗ x = 0, i.e., x and y is orthogonal.  P i Question 4.4. Suppose A has distinct eigenvalues. Let f (z) = ∞ i =−∞ a i z be a function which ∗ is defined at the eigenvalues of A. Let Q AQ = T be the Schur form of A (so Q is unitary and T upper triangular). 1. show that f (A) = Q f (T )Q ∗ . Thus to compute f (A) it suffices to be able to compute f (T ). In the rest of the problem you will derive a simple recurrence formula for f (T ). 2. Show that ( f (T ))i i = f (Ti i ) so that the diagonal of f (T ) can be computed from the diagonal of T . 3. Show that T f (T ) = f (T )T . 4. From the last result, show that the i th superdiagonal of f (T ) can be computed from the (i-1)st and earlier subdiagonals. Thus, starting at the diagonal of f (T ), we can compute the first superdiagonal, second superdiagonal, and so on.

43

Proof. 1. It is suffices to prove that A i = QT i Q ∗ , and it can be proved as A i = QT Q ∗QT Q ∗ . . .QT Q ∗ = QT i Q ∗ . Thus, f (A) =

∞ X

ai A i =

∞ X

a i QT i Q ∗ = Q f (T )Q ∗ .

i =−∞

i =−∞

2. It is suffices to prove that (T i )i i = (Ti i )i , and it can be proved by mathematic induction. As the multiplication of upper triangular is also an upper triangular matrix, the diagonal entries are the products of the corresponding diagonal entries. For the base case, we have (T 2 )i i = Ti2i . Then, suppose it is correct for k, we can deduce for k + 1 as (T k+1 )i i = (T k T )i i = Tiki Ti i = Tik+1 i . Thus, ( f (T )) j j =

∞ X

∞ X

a i (T i ) j j =

i =−∞

i =−∞

a i T ji j = f (T j j ).

3. It is self evident that any matrix is commutative with itself. Thus T f (T ) =

∞ X

ai T T i =

∞ X

a i T i T = f (T )T.

i =−∞

i =−∞

4. It is suffices to deduce the result for T i , and it can be prove by mathematic induction. First, for T 2 , we have iX +k T 2 (i , i + k) = T (i , t )T (t , i + k). t =i

Thus, the kth superdiagonal can be computed from 0th to (k − 1)th superdiagonals. P j −i +1 Then, suppose it is correct for T n , so we can express T n (i , j ) = t =1 c i , j ,t T (i , i + t −1), where c i , j ,t are consts. Therefore, we can prove for T n+1 as T k+1 (i , i + k) =

iX +k s=i

T k (i , s)T (s, i + k) =

iX +k s=i

T (s, i + k)

s−i X+1

c i ,s,t T (i , i + t − 1).

t =1

Thus, the kth superdiagonal of T n+1 can still be computed from 0th to (k − 1)th superdiagonals.

 Question 4.5. Let A be a square matrix. Apply either Question 4.4 to the Schur form of A or equation (4.6) to the Jordan form of A to conclude that the eigenvalues of f (A) are f (λi ), where the λi are the eigenvalues of A. This result is called the spectral mapping theorem.

44

Proof. Let U T U ∗ be the Schur form of A. According to part II of the previous question, we know that the eigenvalues of f (T ) and f (A) are the same. Since ( f (T ))i i = f (Ti i ) = f (λi ), the eigenvalues of f (T ) are f (λi ). Thus, the eigenvalues of f (A) are f (λi ) with the same multiplicity as λi .  Question 4.6. In this problem we will show how to solve the Sylvester or Lyapunov equation AX − X B = C , where X and C are m-by-n, A is m-by-m, and B is n-by-n. This is a system of mn linear equations for the entries of X . 1. Given the Schur decompositions of A and B , show how AX − X B = C can be transformed into a similar system A 0 Y − Y B 0 = C 0 , where A 0 and B 0 are upper triangular. 2. Show how to solve for the entries of Y one at a time by a process analogous to back substitution. what condition on the eigenvalues of A and B guarantees that the system of equations is nonsingular? 3. Show how to transform Y to get the solution X . Solution. 1. Suppose U1 T1U1∗ and U2 T2U2∗ are the Schur form of A and B repectively. Then the equation can be expressed as T1U1∗ X U2 −U1∗ X U2 T2 = U1∗CU2 . Denote Y = U1∗ X U2 and C 0 = U1∗CU2 , we have T1 Y − Y T2 = C 0 , which satisfies the condition. 2. For illustration, we rewrite the left side of equation A 0 Y − Y B 0 = C 0 in matrix form as      a 11 a 12 . . . a 1n y 11 y 12 . . . y 1n y 11 y 12 . . . y 1n b 11 b 12 . . .      a 22 . . . a 2n   y 21 y 22 . . . y 2n   y 21 y 22 . . . y 2n   b 22 . . .      ..  .. ..  .. ..  .. .. .. ..    .. − ..  . . . . .  . . .   . . .   a nn y n1 y n2 . . . y nn y n1 y n2 . . . y nn Equating its corresponding entry of C 0 (n, :) one by one, we have C 0 (n, 1) = a nn y n1 − b 11 y n1 , C 0 (n, 2) = a nn y n2 −

2 X

b i 2 y ni ,

i =1

...... C 0 (n, k) = a nn y nk −

k X

b i k y ni ,

i =1

...... C 0 (n, n) = a nn y nk −

n X

b i n y ni .

i =1

45

 b 1n  b 2n  ..  . .  b nn

Thus, if a nn 6= b i i , we can solve the last row of Y one at a time. Then continue like this with the requisition that all the eigenvalues of A and B do not coincide. We can solve Y. 3. As what we have deduced, Y = U1∗ X U2 , where U1 ,U2 are the unitary matrices which transform A and B into Schur form separately. We can get X = U1 Y U2∗ .

 ·

¸

A C Question 4.7. Suppose that T = is in Schur form. We want to find a matrix S so that 0 B · ¸ · ¸ A 0 I R −1 S TS = . It turns out we can choose S of the form . Show how to solve for R. 0 B 0 I Solution. As the question does not tell us the dimension of each block and matrix, we can suppose that the matrix multiplication which the question requires is compatible. Since · ¸ I −R −1 S = , we have 0 I S −1 T S =

·

I −R 0 I

¸·

A 0

C B

¸·

¸ · A I R = 0 0 I

AR +C − RB B

¸

Thus, the question is equal to solve AR − RB = −C ,



which is the type of equation discussed in the previous question. Question 4.8. Let A be m-by-n and B be n-by-m. show that the matrices ¶ ¶ µ µ 0 0 AB 0 and B BA B 0 are similar. conclude that the nonzero eigenvalues of AB are the same as those of B A. ¸ · I −A Proof. Consider the nonsingular matrix S = . We have 0 I ·

AB S B

· ¸· ¸ I −A AB 0 −1 S = 0 I B 0

0 0

¸·

I 0

¸ · A 0 = I B

¸ 0 . BA

¶ µ ¶ AB 0 0 0 Thus, and are similar. Consequently, they have the same eigenvalues, and B 0 B BA according to Question 4.1., we have µ

λn det(λI − AB ) = λm det(λI − B A) Thus, AB and B A have the same nonzero eigenvalues.



46

Question 4.9. Let A be n-by-n with eigenvalues λ1 , . . . , λn . Show that n X i =1

|λi |2 = min kS −1 ASk2F . det(S)6=0

Remark. The question is not correct unless we change the operation from "min" into "inf". I present a counterexample below to illustrate why. Consider A as · ¸ 1 1 A= . 0 1 ·

For any nonsingular matrix S =

kS −1 ASk2F = =

a c

¸ b , we have ad 6= bc and d ¸°2 ° a2 ° ac + ad − bc °F

°· ° ad − ac − bc 1 ° 2 −c 2 (ad − bc) °

c 4 + a4 + 2. (ad − bc)2

Since a and c cannot equals zero simultaneously, we have kS −1 ASk2F >

n X

|λi |2 .

i =1

Thus, the lower bound cannot be reached in this situation. Proof. First, denote the Schur form of A as U ∗ AU = T . As there is an one to one coresspondent from S to U S, it means that the set is not changed. Thus, we we can transform the minimum problem into inf

det(U S)6=0

kS −1U ∗ AU Sk2F =

inf

det(S)6=0

kS −1 T Sk2F .

Then, we can transform the set from nonsingular matrices to nonsingular upper triangular matrices, because every nonsingular matrix S −1 has a QR decomposition as S −1 = QR. What’s more, since unitary matrices do not change Frobenius norm, we have inf

det(S)6=0

kS −1 T Sk2F =

inf

det(QR)6=0

kQRT R −1Q −1 k2F =

inf

det(R)6=0

kRT R −1 k2F .

As the inverse of an upper triangular matrix is also upper triangular, we have      −1 λ1 r 11 r 11 λ1 * *  *     −1 r 22 λ2 r 22        =   RT R −1 =  .. .. ..      . . .      λn

r nn

Thus,

n X i =1

|λi |2 ≤

inf

det(R)6=0

kRT R −1 k2F =

−1 r nn

inf

det(S)6=0



*

λ2 ..

. λn

kS −1 ASk2F .

47

    

Then, we should that the lower bound can be approximated to any precision. Denote c = ¡ ¢ maxi 6= j |T (i , j )|. For any ² < 1, set a = ²/ max(c, 1). Consider S = diag a n−1 , a n−2 , . . . , a, 1  λ1    ST S T =    

aT12 λ2

a 2 T13 aT23 λ3

... ... ... .. .

 a n−1 T1n a n−2 T2n    a n−3 T3n  .  ..   . λn

Since |a i Tk j | < ², the square sum of the off-diagnoal entries is smaller than proves the question.

n(n−1) ², 2

which



Question 4.10. Let A be an n-by-n matrix with eigenvalues λ1 , . . . , λn . 1. Show that A can be written A = H + S, where H = H ∗ is Hermitian and S = −S ∗ is skewHermitian. Give explicit formulas for H and S in terms of A. P 2. Show that ni=1 |Rλi |2 ≤ kH k2F . P 3. Show that ni=1 |Jλi |2 ≤ kSk2F . P 4. Show that A is normal (A A ∗ = A ∗ A) if and only if ni=1 |λi |2 = kAk2F . Solution. 1. Suppose A = H + S, then we have A ∗ = H − S. Thus, we can solve H and S as 1 H = (A + A ∗ ), 2 1 S = (A − A ∗ ), 2 where A = H + S, H is Hermitian and S is skew-Hermitian. 2. Let U ∗ AU = T be the Schur form of A. Then, for H , we have ° ° Rλ 1 ° * ° Rλ2 °  1 2 ∗ 2 ∗ 2  kH kF = kU HU kF = k (T + T )kF = ° .. ° 2 . ° ° * °

°2 ° ° ° n X ° ° ≥ |Rλi |2 . ° ° i =1 ° Rλn °F

3. Again, let U ∗ AU = T be the Schur form of A. Then, for S, we have ° ° Jλ 1 ° ° Jλ2 °  1  kSk2F = kU ∗ SU k2F = k (T − T ∗ )k2F = ° °  2 ° ° * °

°2 ° ° * ° n X ° ° ≥ |Jλi |2 . .. ° . °  i =1 ° Jλn °F

48

4. If A is normal, according to Question 4.2., there exists unitary matrix U such that U AU ∗ = diag(λ1 , . . . , λn ). Thus, we have kAk2F = kU AU ∗ k2F =

n X

|λi |2 .

i =1

On the other hand, if A is not normal, denote its Schur form as U AU ∗ = T . We can claim that T is not diagonal, otherwise T will be normal and consequently A is normal, which contradicts our assumption. So, we get kAk2F = kU AU ∗ k2F = kT k2F > Thus, if

Pn

i =1 |λi |

2

n X

λi .

i =1

= kAk2F , A must be normal.

Question 4.11. Let λ be a simple eigenvalue, and let x and y be right and left eigenvectors. We define the spectral projection P corresponding to λ as P = x y ∗ /(y ∗ x). Prove that P has the following properties. 1. P is uniquely defined, even though we could use any nonzero scalar multiples of x and y in its definition. 2. P 2 = P. (Any matrix satisfying P 2 = P is called a projection matrix.) 3. AP = P A = λP . (These properties motivate the name spectral projection, since P "contains" the left and right invariant subspaces of λ.) 4. kP k2 is the condition number of λ. Remark. Part IV of the question follows directly from Question 1.7. However, I present a similar proof as shown below to remain us of the result. Proof. 1. Since λ is a simple eigenvalue, any other right and left eigenvectors can be expressed as xˆ = c x and yˆ = d y. Thus, for the spectral projection defined by xˆ and yˆ , we have Pˆ = xˆ yˆ ∗ /( yˆ ∗ xˆ ) = cd x y ∗ /(cd y ∗ x) = P, which suggests that P is uniquely defined. 2. We can verify P 2 = P as follows: P 2 = x y ∗ x y ∗ /(y ∗ x)2 = x y ∗ /(y ∗ x) = P. 3. We can verify AP = P A = λP as follows: AP = Ax y ∗ /(y ∗ x) = λx y ∗ /(y ∗ x) = λP = x y ∗ A/(y ∗ x) = P A.

49

4. According to the result of part I, we can assume that kxk2 = kyk2 = 1. For any nonzero vector α in Cn , we can decompose it into α = c y + d z, where y ∗ z = 0 and kzk2 = 1. Then we have kP k2 = max α6=0

kP αk2 kP (y + z)k2 |c| |c| = max = max = max . p ∗ ∗ α6=0 α6=0 kαk2 |y x| |c|+|d |6=0 |y x| |c| + |d | kαk2 kαk2

It is easy to find that c = 0 cannot be the maximizer. Thus, we can divide |c| on the fraction and have kP k2 = max

|c|+|d |6=0 |y ∗ x|

|c| 1 1 = max ≤ ∗ . p q |c|6 = 0 |c| + |d | |y ∗ x| 1 + |d | |y x| |c|

The maximum is obtained when d = 0, and consequently kP k2 is the condition number of λ.

 ·

¸ c . Show that the condition numbers of the eigenvalues of A are b

a Question 4.12. Let A = 0 ¡ c ¢2 1/2 both equal to (1+ a−b ) . Thus, the condition number is large if the difference a −b between the eigenvalues is small compared to c, the offdiagonal part of the matrix.

Proof. As what the question suggests, we assume that all the variables appearing are real. It is easy to verify that x 1 = [1, 0]T , c T y 1 = [1, a−b ]

are the right and left eigenvectors of eigenvalue λ1 = a, and T x 2 = [1, a−b c ] ,

y 2 = [0, 1]T are the right and left eigenvectors of eigenvalue λ2 = b. Thus, the condition number of λ1 is kx 1 k2 ky 1 k2 |y T1 x 1 | and that of λ2 is

kx 2 k2 ky 2 k2 |y T2 x 2 |

Thus, the result is proved.

= (1 +

³ c ´2 )1/2 , a −b

= (1 +

³ c ´2 )1/2 . a −b



Question 4.13. Let A be a matrix, x be a unit vector (kxk2 = 1), µ be a scalar, and r = Ax − µx. Show that there is a matrix E with kE kF = kr k2 such that A + E has eigenvalue µ and eigenvector x.

50

Proof. Suppose such E exists. Since we have the condition (A + E )x = µx, we have the equation for E as −r = E x. Set E = −r x ∗ , which satisfies the condition. What’s more, we can evaluate the Frobenius norm of E as kE k2F = tr(xr ∗ r x ∗ ) = kr k22 tr(x x ∗ ) = kr k22 . So, there is a matrix E satisfies the conditions of the question.



5 S OLUTIONS FOR C HAPTER V: T HE S YMMETRIC E IGENPROBLEM AND S INGULAR VALUE D ECOMPOSITION Question 5.1. Show that A = B + iC is hermitian if and only if · ¸ B −C M= C B is symmetric. Express the eigenvalues and eigenvectors of M in terms of those of A. Remark. I believe the question is correct under the assumption of both B and C are real. Thus, it is what my proof will assume. Solution. Denote a i∗j asthe (i , j ) entry of matrix A ∗ . As B and C are both real and a i∗j = a j i =b j i + i c j i =b j i − ic j i = b j i − i c j i . A is hermitian if and only if A ∗ = B − iC = B + iC = A, which is equivalent to the symmetry of M. As for eigenvalues and eigenvectors, suppose α = β+i γ is the eigenvector with corresponding eigenvalue λ ∈ R of A. Because of the definition of eigenvalues and eigenvectors, Aα = (B + iC )(β + i γ) = λ(β + i γ). Thus, we have the equation B β −C γ = λβ, B γ +C β = λγ. Then, considering ·

B C

·

B C

¸· ¸ · ¸ · ¸ · ¸ −C β B β −C γ λβ β = = =λ , B γ Cβ + Bγ λγ γ ¸· ¸ · ¸ · ¸ · ¸ −C −γ −B γ −C β −λγ −γ = = =λ . B β −C γ + B β λβ β

Thus, as we can see, from the above equation, we know that [β, γ]T , [−γ, β]T are the eigenvectors with corresponding eigenvalue λ of M . 

51

Question 5.2. Prove Corollary 5.1, using Weyl’s theorem and part 4 of Theorem 3.3. Proof. According to the hint provided, denote matrices H1 , H2 , H as · ¸ · ¸ · ¸ 0 GT 0 GT + F T 0 FT H1 = , H2 = , H= . G 0 G +F 0 F 0 We know that σ1 ≥ σ2 . . . ≥ σn ≥ −σn ≥ −σn−1 ≥ . . . ≥ −σ1 are the eigenvalues of H1 , and σ01 ≥ σ02 . . . ≥ σ0n ≥ −σ0n ≥ −σ0n−1 ≥ . . . ≥ −σ01 are the eigenvalues of H2 . Using Weyl’s theorem, we have |σi − σ0i | ≤ kH k2 . Thus, what remains to prove is kH k2 = kF k2 . To do this, consider ¸ ¸ · T · ¸ · 0 FT 0 FT F F 0 . HT H = · = F 0 F 0 0 FFT This matrix has the eigenvalues as F F T and F T F , which have the same nonzero eigenvalues. Thus λmax (H T H ) = λmax (F T F ). i.e., kH k2 = kF k2 .  Question 5.3. Consider Figure 5.1. Consider the corresponding contour plot for an arbitrary 3-by-3 matrix A with eigenvalues α3 ≤ α2 ≤ α1 . Let C 1 and C 2 be the two great cirles along which ρ(u, A) = α2 . At what angle do they intersect? Remark. Since A is arbitary, we may assume A is real. Otherwise, x ∗ Ax may not be real number. What’s more, assume A is diagnoalizable and the eigenvalues of A are not identical. If not, I don’t think it will behave the same as what the textbook describes. Solution. Suppose x 1 , x 2 , x 3 are three unit orthogonal eigenvectors of A with corresponding eigenvalues as λ1 , λ2 , λ3 repectively. Thus, all vectors y ∈ R3 can be decomposed as y = b 1 x 1 + b 2 x 2 + b 3 x 3 . If a unit vector y satisfies ρ(y, A) = α2 , we have ρ(y, A) = λ1 b 12 + λ2 b 22 + λ3 b 32 = λ2 . Rearranging the above equation and using b 12 + b 22 + b 32 = 1 yields p p λ1 − λ2 b 1 = ± λ2 − λ3 b 3 . As we have p passume not all eigenvalues are equal, we can suppose λ2 − λ3 6= 0 and denote λ1 − λ2 / λ2 − λ3 = c without loss of generality. Thus, we express the two great ’circles’ by parametric equations as p b 1 = sin(θ)/ 1 + c 2 , b2

= cos(θ),

b3

p = ±c sin(θ)/ 1 + c 2 .

Then, their insection points are the points with θ = 0 and θ = π. For θ = 0, their tangent vectors are p p t 1 = 1/ 1 + c 2 x 1 + c/ 1 + c 2 x 3 , p p t 2 = 1/ 1 + c 2 x 1 − c/ 1 + c 2 x 3 .

52

2

Consequently, the corresponding intersection angle is arccos(t T1 t 2 /(kt 1 k2 ·kt 2 k2 )) = arccos( 1−c ).For 1+c 2 θ = π, by the same computation, the intersection angle is the same as θ = 0.  Question 5.4. Use the Courant-Fischer minimax theorem (Theorem 5.2) to prove the Cauchy interlace theorem: µ ¶ H b 1. Suppose that A = T is an n-by-n symmetric matrix and H is (n −1)-by-(n −1). Let b u αn ≤ . . . ≤ α1 be the eigenvalues of A and θn−1 ≤ . . . ≤ θ1 be the eigenvalues of H . Show that these two sets of eigenvalues interlace: αn ≤ θn−1 ≤ . . . ≤ θi ≤ αi ≤ θi −1 ≤ αi −1 ≤ . . . ≤ θ1 ≤ α1 . µ

¶ H B be n-by-n and H be m-by-m, with eigenvalues θm ≤ . . . ≤ θ1 . Show that BT U the eigenvalues of A and H interlace in the sense that α j +(n−m) ≤ θ j ≤ α j (or equivalently α j ≤ θ j −(n−m) ≤ α j −(n−m) ).

2. Let A =

Proof. ·

¸ I n−1 1. Denote the n-by-(n − 1) matrix P as P = . Thus, H = P T AP and for any vector 0 x ∈ Rn−1 , x T x = (P x)T (P x) holds because P T P = I n−1 . Noting that for linear independent vectors x 1 , . . . .x n−1 ∈ Rn−1 , P x 1 , . . . .P x n−1 ∈ Rn are also linear independent. Therefore, if S is i -dimensional space in Rn−1 , P S is i -dimensional space in Rn . Since the minimum of the subset is not less than the minimum of whole set, we have

θi = min max

S n−i 06=x∈S n−i

= min max

S n−i 06=x∈S n−i

≥ min max

S n−i 06= y∈S n−i

xT H x xT x (P x)T A(P x) (P x)T (P x) y T Ay yT y

= αi +1 .

On the other hand, we have θi = max min Si

06=x∈S i

= max min Si

06=x∈S i

≤ max min Si

06= y∈S i

xT H x xT x (P x)T A(P x) (P x)T (P x) y T Ay yT y

= αi .

Combine the results, and we have αi +1 ≤ θi ≤ αi , which is equivalent to what we should prove.

53

·

¸ Im 2. It is similar to what we have proved. Denote the n-by-m matrix P as P = , and we 0 have H = P T AP . Using Courant-Fischer minimax theorem, we have

θi = min

max

= min

max

≥ min

max

xT H x xT x

S m+1−i 06=x∈S m+1−i

(P x)T A(P x) (P x)T (P x)

S m+1−i 06=x∈S m+1−i

y T Ay yT y

S m+1−i 06= y∈S m+1−i

= αi +n−m .

On the other hand, we have xT H x

θi = max min

xT x

06=x∈S i

Si

(P x)T A(P x)

= max min

(P x)T (P x)

06=x∈S i

Si

y T Ay

≤ max min

yT y

06= y∈S i

Si

= αi .

Combine the results, and we have αi +n−m ≤ θi ≤ αi , which is equivalent to what we should prove.

 Question 5.5. Let A = A T with eigenvalues α1 ≥ . . . ≥ αn . Let H = H T with eigenvalues θ1 ≥ . . . ≥ θn . Let A + H have eigenvalues λ1 ≥ . . . λn . Use Courant-Fischer minimax theorem (Theorem) to show that α j + θn ≤ λ j ≤ α j + θ1 . If H is positive definite, conclude that λ j > α j . In other words, adding a symmetric positive definite matrix H to another symmetric matrix A can only increase its eigenvalues. Proof. According to Courant-Fischer minimax theorem, we have λi = min

x T (A + H )x

max

xT x

S n+1−i 06=x∈S n+1−i T

≤ min (

max

≤ min (

max

x Ax

S n+1−i 06=x∈S n+1−i

xT x x T Ax

S n+1−i 06=x∈S n+1−i

xT x

+

max

xT H x

06=x∈S n+1−i

xT x

)

+ θ1 ) = αi + θ1 .

And, λi = max min Si

06=x∈S i

≥ max( min Si

06=x∈S i

≥ max( min Si

06=x∈S i

x T (A + H )x xT x x T Ax xT x x T Ax xT x

+ min

06=x∈S i

xT H x xT x

)

+ θn ) = αi + θn .

54

Combine the results, and we have a j + θn ≤ λ j ≤ α j + θ1 . As for positive definite H , we know that all its eigenvalues are positive. Thus, α j < α j + θn ≤ λ j .

 Question 5.6. Let A = [A 1 , A 2 ] be n-by-n, where A 1 is n-by-m and A 2 is n-by-(n − m). Let σ1 ≥ σ2 ≥ . . . ≥ σn be the singular values of A and τ1 ≥ τ2 . . . ≥ τm be the singular values of A 1 . Use the Cauchy interlace theorem from Question 5.4 and part 4 of Theorem 3.3 to prove that σ j ≥ τ j ≥ σ j +n−m . Proof. Denote the matrix H as A1 0 0

0 H =  A T1 A T2 

 A2 0 . 0

According to part 4 of Theorem 3.3, the absolute values of eigenvalues of H are the singular values of A T , which are equal to those of A. Thus, we can express the eigenvalues of H as σ1 ≥ σ2 ≥ . . . ≥ σn ≥ −σn ≥ . . . ≥ −σ1 . Consider the (n + m)-by-(n + m) principal submatrix · ¸ 0 A1 H1 = T . A1 0 As in Question 3.14, we have already know its eigenvalues are τ1 ≥ . . . τm ≥ 0 = . . . = 0 ≥ −τm ≥ . . . ≥ −τ1 . Thus, by Cauchy interlace theorem, we have σi +(n−m) ≤ τi ≤ σi ,



which is what we need to prove. Question 5.7. Let q be a unit vector and d be any vector orthogonal to q. Show that k(q + d )q T − I k2 = kq + d k2 .

Remark. My proof suppose the dimension of the vector space is greater than 2. If the dimension is only 2, the proof is similar. Proof. Denote z as a unit vector which is orthogonal to both q and d . Therefore, any vector x can be express as x = aq + bd + c z, and we have ¡ ¢ k (q + d )q T − I xk22 T 2 k(q + d )q − I k2 = max x6=0 kxk22 =

(a 2 − 2ab + b 2 )kd k22 + c 2

max

a 2 + b 2 kd k22 + c 2

a 2 +b 2 +c 2 6=

Ã

=

max

a 2 +b 2 +c 2 6=0

1+

Ã

≤ max

a 2 +b 2 6=0

1+

(kd k22 − 1)a 2 − 2abkd k22

!

a 2 + b 2 kd k22 + c 2

(kd k22 − 1)a 2 − 2abkd k22 a 2 + b 2 kd k22

!

.

55

To determine the maximum of the above formula, we need to consider whether a = 0. If a = 0, the above formula yields 1; If not, set s = b/a, c = kd k22 , and define f (s) =

s 2 − (1 − 1c )s − 1c c − 1 − 2c s 0 2 , and f (s) = 2c . 1 + cs 2 (1 + c s 2 )2

Thus, f (s) has two local maximum at s = − 1c , s = +∞. Since f (− 1c ) =

c 2 +c c+1 , f

(+∞) = 0, we have

k(q + d )q T − I k22 = c + 1 = kq + d k22 .

 Question 5.8. Formulate and prove a theorem for singular vectors analogous to Theorem 5.4. Remark. My theorem has been shown below, and I assume that the target matrix A is m-byn, where m ≥ n. However, I am not sure whether it is the one that the question requests. Theorem 5.4’. Let A = U ΣV T , which is the reduced SVD of an m-by-n matrix A. Let A + E = ˆ Vˆ T be the perturbed SVD. Write U = [u 1 . . . , u n ],V = [v 1 . . . , v n ], Uˆ = [uˆ 1 . . . , uˆ n ], and Aˆ = Uˆ Σ Vˆ = [vˆ 1 . . . , vˆ n ]. Let θ, ϕ denote the acute angles between u i , uˆ i and v i , vˆ i repectively. Define gap(i , A) as min j 6=i |σi − σ j | if i 6= n; and gap(n, A) as min j 6=n (σn , |σn − σ j |) when i = n. Then we have p 2 2kE k2 max(sin 2θ, sin 2ϕ) ≤ . gap(i , A) Proof. Denote ωi =

p1 2

¶ vi ˆi = ,ω ui

µ

consider the matrix ·

0 H= A

p1 2

µ

¶ vˆ i ˆ i . Then, , and θˆ the acute angle between ωi and ω uˆ i

¸ AT , 0

Hˆ =

·

0 A +E

¸ AT + E T , 0

and δH = Hˆ − H . By Theorem 3.3 and Question 5.2, we know that kδH k2 = kE k2 and σ1 ≥ ˆ1 ≥ ...σ ˆ n ≥ 0 ≥ . . . ≥ 0 ≥ −σ ˆn ≥ . . . σn ≥ 0 ≥ . . . ≥ 0 ≥ −σn ≥ . . . ≥ −σ1 are the eigenvalues of H, σ ˆ 1 are the eigenvalues of Hˆ . Thus, According to and Theorem 5.4, we know that . . . ≥ −σ 2kE k2 . sin 2θˆ ≤ gap(i , A) As all angles are acute and 1 1 ˆ i = (u T uˆ + v T vˆ ) = (θ + ϕ), cos θˆ = ωTi ω 2 2 ˆ For the smaller angle, consider ku i + uˆ i k2 we just prove for the angle, which is larger than θ. 2 instead of ku − uˆ i k2 . Without loss of generality, we may assume that θ is the larger one. Since 2 sin2

θ θˆ 1 ˆ i k22 = 4 sin2 , = 1 − cos θi = ku i − uˆ i k22 ≤ kω − ω 2 2 2

56

ˆ we have and θ ≥ θ, sin θ = 2 sin Then,

p θ θˆ p θˆ θ ˆ cos ≤ 2 2 sin cos = 2 sin θ. 2 2 2 2

p p 2kE k2 . sin 2θ = 2 sin θ cos θ ≤ 2 2 sin θˆ cos θˆ = 2 sin 2θˆ ≤ gap(i , A)

 Question 5.9. Prove bound (5.6) from Theorem 5.5. Proof. Since A is symmetric, it has a set of orthonormal vectors [q 1 , . . . , q n ], which span the P whole space. Thus, suppose the unit vector x = nj=1 b j q j . Since Ax − ρ(x, A)x = r , b i Aq i = b i αi q i , subtract the second from the first, and we have X (A − ρ(x, A)I ) b j q j = r + (ρ(x, A) − αi )b i q i . j 6=i

As we can see that the left side does not contain q i , the right side does not contain q i , either. Thus, the part of q i of r cancelled by (ρ(x, A) − αi )b i q i , and kr + (ρ(x, A) − αi )b i q i k2 ≤ kr k2 . P Denote the right side as j 6=i c j q j . We have X X X (A − ρ(x, A)I ) b j q j = (αi − ρ(x, A)) bj q j = cj q j. j 6=i

j 6=i

j 6=i

Thus, b j = c j /(αi − ρ(x, A)) as we have assume the gap0 is not zero. Then sin θ = k

X

b j q j k2 = (

j 6=i

X¡ ¢2 1 c j /(αi − ρ(x, A)) ) 2 ≤ j 6=i

1 X 2 1 kr k2 ( c )2 ≤ . gap0 j 6=i j gap0

 Question 5.10. A harder question. Skipped. Question 5.11. Suppose θ = θ1 + θ2 , where all three angles lie between 0 and π/2. Prove that 1 1 1 2 sin θ ≤ 2 sin 2θ1 + 2 sin 2θ2 . Proof. Consider the function f (θ1 ) = sin 2θ − sin 2θ1 − sin 2(θ − θ1 ). Its derivative is f 0 (θ1 ) = −2 cos 2θ1 + 2 cos 2(θ − θ1 ). As 2θ − 2θ1 , 2θ1 ∈ [0, π], between which the function cos is monotonus, f 0 (θ1 ) has only one root as θ1 = θ/2. Thus, f (θ1 ) decreases in [0, θ/2], and increases in [θ/2, θ]. It can only reach its maximum at the extremity θ1 = 0 and θ1 = θ. i.e., f (θ1 ) ≤ max( f (0), f (θ)) = 0. Therefore, 21 sin θ ≤ 12 sin 2θ1 + 12 sin 2θ2 .



57

Question 5.12. Prove Corollary 5.2. ¸ · ¸ 0 GT 0 Gˆ T ˆ ˆ Proof. Denote the matrix H as H = , and H as H = ˆ . According to TheoG 0 G 0 ˆ1 ≥ ...σ ˆn ≥ 0 ≥ rem 3.3, σ1 ≥ . . . σn ≥ 0 ≥ . . . ≥ 0 ≥ −σn ≥ . . . ≥ −σ1 are the eigenvalues of H, σ ˆ n ≥ . . . ≥ −σ ˆ 1 are the eigenvalues of Hˆ . Since . . . ≥ 0 ≥ −σ ¸ · ¸ · T X 0 X 0 ˆ H , H= 0 Y 0 YT °· T ¸· ¸ ° ° ° X X 0 0 ° if we can prove ° ° 0 Y T 0 Y − I ° = ², by applying Theorem 5.6, we will prove the °· 2T ¸° ° X X −I ° 0 ° and result. As the target two-norm equals ° T ° 0 Y Y − I °2 °· T ¸° ° X X −I ° ¡ ¢ 0 ° ° = max kX T X − I k2 , kY T Y − I k2 , 6 T ° ° 0 Y Y −I 2 ·

the equation holds. Thus, we prove the result.



Question 5.13. Let A be a symmetric matrix. Consider running shifted QR iteration (Algorithm 4.5) with a Rayleigh quotient shift (σi = a nn ) at every iteration, yielding a sequence σ1 , σ2 , . . . of shifts. Also run Rayleigh quotient iteration (Algorithm 5.1), starting with x 0 = [0, 0, . . . , 1]T , yielding a sequence of Rayleigh quotients ρ 1 , ρ 2 , . . .. show that these sequences are identical: σi = ρ i for all i . This justifies the claim in section 5.3.2 that shifted QR iteration enjoys local cubic convergence. Proof. We prove it by induction. At the begining, for the shifted QR iteration, σ1 = a nn . For Rayleigh quotient iteration start with x 0 = [0, 0, . . . , 1], ρ 1 = x T0 Ax 0 = a nn . Thus, σ1 = ρ 1 . Then, suppose it holds until the kth step. For (k + 1)th step, denote P k = Q k Q k−1 . . .Q 1 . Since A is symmetric, the orthogonal matrix Q i are symmetric, too7 . We kown that for the shifted QR iteration, P k (A k+1 − σk I ) = (A − σk )P k . What’s more, P k R k = P k−1Q k R k = P k−1 (A k − σk I ) = (A − σk )P k−1 . Transpose on both sides, premultiply by P k−1 and postmultiply by P k , yielding P k−1 R kT = (A − σk )P k . Equate the last column, and we have P k−1 e n = (A − σk )P k e n /r k , or equivalently P k e n = (A − σk )−1 P k−1 e n /r k . where r k represents R k (n, n). As P 0 e n = e n and kP i e n k2 = 1, it is a inverse iteration sequence with the same start and same shifts as Rayleigh quotient iteration, by assumption. Thus, P k e n = x k , which suggest σk = A(n, n) = e Tn A k e n = e Tn P k AP k e n = x Tk Ax k = ρ k .  6 It is because that such a block diagonal matrix has the eigenvalues as the union of those of its diagonal blocks. 7 It is because (QR)T = R T Q T = QR, yielding Q = R T Q T R the right side of which is symmetric.

58

Question 5.14. Prove Lemma 5.1. Proof. The prove will be trivial if x or y are zero vector. Thus, we may assume that both of them are nonzero. Then, there exists a row of x y T that can linearly represent the other rows. Without loss of generality, suppose it is the first row. Otherwise, we may change it to the first row by exchange rows and columns. Denote the i th entry of x and y as x 1 and y 1 repectively. Then, by assumption, x 1 6= 0. We have   1 + x1 y 1 x1 y 2 ... x1 y n   1 + x2 y 2 . . . x2 y n   x2 y 1 T   det(I + x y ) = det  .. .. .. ..  . . . .   xn y 1 xn y 2 . . . 1 + xn y n   1 + x1 y 1 x1 y 2 . . . x1 y n   1 ... 0   −x 2 /x 1  = det  .. .. ..  ..  . . . .   −x n /x 1 0 ... 1   Pn 1 + i =1 x i y i x 1 y 2 . . . x 1 y n   0 1 ... 0    = det  .. .. ..  ..  . . . .   0 0 ... 1 n X = 1+ x i y i = 1 + x T y. i =1

 T

T

Remark. There is a simpler way to prove the question. First, observe that x y x = (y x)x, which means x is an eigenvector of x y T corresponding to y T x. As the rank of x y T is one, x y T has only one nonzero eigenvalue. Then by Question 4.5, the eigenvalues of I + x y T are 1’s and 1 + y T x. The result follows. Question 5.15. Prove that if t (n) = 2t (n/2) + cn 3 + O(n 2 ), then t (n) ≈ c 34 n 3 . This justifies the complexity analysis of the divide-and-conquer algorithm (Algorithm 5.2). Proof. Since we have t (n) = 2t (n/2) + cn 3 + O(n 2 ), 1 2t (n/2) = 4t (n/4) + cn 3 + O(n 2 ), 4 ......... 1 2log2 n−1 t (2) = 2log2 n t (1) + 2 log n−2 cn 3 + O(n 2 ), 2 2 by adding them together, we get 4 4 t (n) = 2log2 n t (1) + c (1 − n −2 )n 3 + O(n 2 ) ≈ cn 3 . 3 3



59

Question 5.16. Let A = D + ρuu T , where D = diag(d 1 , . . . , d n ) and u = [u 1 , . . . , u n ]T . Show that if d i = d i +1 or u i = 0, then d i is an eigenvalues of A. If u i = 0, show that the eigenvector corresponding to d i is e i , the i th column of the identity matrix. Derive a similarly simple expression when d i = d i +1 . This shows how to handle deflation in the divide-and-conquer algorithm, Algorithm 5.2. Proof. When u i = 0, then the i th column of uu T is zero vector. Thus, (D + ρuu T )e i = d i e i . As for d i = d i +1 , without loss of generality, we can assume that u i 6= 0 and d i −1 6= d i , d i +1 6= d i +2 ; otherwise, such case will either be the previous one or similar to the following. We claim that e i +1 − u i +1 /u i e i is the eigenvector corresponding to d i , because   2  0 u1 + d1 − di . . . u1 ui u 1 u i +1 . . . u1 un    .. .. .. .. .. .. ..    . . . . . . .        2 u i u i +1 . . . ui un u1 ui ... ui  −u i +1 /u i   (A − d i I )(e i +1 − u i +1 /u i e i ) =    2  u 1 u i +1  ... u i +1 u n   . . . u i u i +1 u i +1 1    .. .. .. .. ..    .. ..    . . . . . . . 2 u1 un . . . u i u n u i +1 u n . . . u n + d n − d i 0     1 . . . u1 0 . . . 0 d1 − di . . . 0 0 0 ... 0   . . .. . .. ..  .. .. . . . .. .. ..  ..    .. . .. . . ..  . . .  . .    .      0 . . . . . . u i u i +1 . . . u n  −u i +1 /u i  ui 0 . . . 0  u 1   =       0 . . . u i +1 1 . . . 0  0 ... 0 0 ... 0 1     . . . . .. .. . . ..   ..  .. . .    .. .. .. .. ..    . . .. . . . . . . . 0 0 ... 0 0 . . . dn − di 0 . . . un 0 . . . 1 = 0.

 Remark. My proof is based on solving (A − λI )x = 0. There is another way based on deflation. First, as we can use a Givens rotation to cancel an entry of a vector, if d i = d i +1 , denote G as the Givens rotation which satisfies     u1 u1  .   .   ..   ..       u  u   i   i G  =  . u i +1   0       ..   ..   .   .  un un Then, G AG T = GDG T + ρGuu T G T = D + ρ uˆ uˆ T , where uˆ i = 0.

60

Question 5.17. Let ψ and ψ0 be given scalars. Show how to compute scalars c and cˆ in the c function definition h(λ) = cˆ + d −λ so that at λ = ξ, h(ξ) = ψ and h 0 (ξ) = ψ0 . This result is needed to derive the secular equation solver in section 5.3.3. Solution. As h 0 (λ) =

c , (d − λ)2

the conditions h(ξ) = ψ and h 0 (ξ) = ψ0 become ψ = cˆ +

c , d −ξ

ψ0 =

c , (d − ξ)2

whose solution is c = ψ0 (d − ξ)2 ,

cˆ = ψ − ψ0 (d − ξ).

 Question 5.18. Use the SVD to show that if A is an m-by-n real matrix with m ≥ n, then there exists an m-by-n matrix Q with orthonormal columns (Q T Q = I ) and an n-by-n positive semidefinite matrix P such that A = QP . This decomposition is called the polar decomposition of A, because it is analogous to the polar form of a complex number z = e i arg(z) · |z|. Show that if A is nonsingular, then the polar decomposition is unique. Proof. Suppose A = U ΣV T , which is the reduced SVD of A. To show how to do the polar decomposition, consider A = U ΣV T = UV T V Σ1/2 Σ1/2V T = (UV T )(Σ1/2V T )T (Σ1/2V T ). The UV T is m-by-n orthonormal and (Σ1/2V T )T (Σ1/2V T ) is a positive semidefinite matrix, which satisfy the condition of polar decomposition. For the uniqueness, suppose there are two polar decomposition of A as A = QP = Qˆ Pˆ . Consider A T A = P 2 = Pˆ 2 . As P and Pˆ are positive definite matrices when A has full rank, P and Pˆ have the same eigenvalues. Suppose P = W T ΛW , Pˆ = S T ΛS. We get SW T ΛW S T = Λ. It follows that Pˆ = S T ΛS = S T SW T ΛW S T S = W T ΛW = P, and AP −1 = Q = Qˆ = A Pˆ −1 .  p T Remark. In fact, the uniquenesspfollows from the uniqueness of A A. Standard textbooks about linear algebra often define A T A = W T ΛW without checking whether it is well defined as the orthogonal matrix W may not be unique. Thus, I present the proof here. Question 5.19. Prove Lemma 5.5 Proof.

61

1. Denote x as any vector in R2n . Then, P T x = [e T1 , . . . , e T2n ]T x = [x 1 , x n+1 , x 2 , . . . .x n , x 2n ]T , x T P = x T [e 1 , . . . , e 2n ] = [x 1 , x n+1 , x 2 , . . . , x n , x 2n ]. Thus, denote a i , j as the i th row, j th column entry of A. We have 

a 1,1 a  1,n+1   a 1,2 T P AP =   ..  .   a 1,n a 1,2n

a 2,1 a 2,n+1 a 2,2 .. . a 2,n a 2,2n

... ... ... .. . ... ...

  a 2n,1 a 1,1 a a 2n,n+1    1,n+1   a 2n,2   a 1,2  ..  P =   ..  . .      a 1,n a 2n,n a 2n,2n a 1,2n

a n+1,1 a n+1,n+1 a n+1,2 .. . a n+1,n a n+1,2n

... ... ... .. . ... ...

a n,1 a n,n+1 a n,2 .. . a n,n a n,2n

 a 2n,1 a 2n,n+1    a 2n,2  ..  . .   a 2n,n  a 2n,2n

i.e., 

0 a  1  0 T P AP =   ..  .  0 0

a1 0 b1 .. . 0 0

2. We can verify by simple computations:   a1 a1 b1    a2 b2   b 1 a 2   .. .. .. ..   . . . .      b n−2 a n−1 b n−1   an

... ... ... .. . ... ...

 0 0   0 ..  . .   an  0

0 0 0 .. . 0 an



a n−1 b n−1

 a 2 + b 12    1   a2 b1 =     

a2 b1 a 22 + b 22 .. .

an



a3 b2 .. . a n b n−1

a n2

  .  

As for the remaining result, suppose B = U ΣV T , which is the SVD of B . Then, B B T = U Σ2U T . Therefore, the singular value are the square roots of the eigenvalues of TB B T , and the left singular vectors of B are the eigenvectors of TB B T . 3. We can verify by simple computations:   a1 b1 a1   b a2 b2  1 a2    .. .. ..    . . .       b n−2 a n−1 b n−1 a n



..

.

a n−1

 a2    1  a 1 b 1 =     b n−1  an

a1 b1 a 22 + b 12 .. .



a2 b2 .. . a n−1 b n−1

As for the remaining result, suppose B = U ΣV T , which is the SVD of B . Then, B T B = V Σ2V T . Therefore, the singular value are the square roots of the eigenvalues of TB T B , and the right singular vectors of B are the eigenvectors of TB T B .

62

a n−1 b n−1 2 2 a n−1 + b n−1

  .  

 Question 5.20. Prove Lemma 5.7 Proof. We can verify it by simple computations: 

χ1

 χ2 χ1 ζ1

   D 1B D 2 =    



χ1 a 1

   =        =   

χ1 a 1

χ 3 χ2 χ1 ζ2 ζ1

χ1 b 1 χ2 χ1 ζ1 a 2

ζ1 b 1 χ2 a 2

..

.

χn ...χ1 ζn−1 ...ζ1

χ 2 χ1 ζ1 b 2 χ3 χ2 χ1 ζ 2 ζ1 a 3

a1

b1 a2

      

b3 .. .

..

. an

χ3 χ 2 χ1 ζ2 ζ1 b 3

..

b2 a3

.

..

.

χn ...χ1 ζn−1 ...ζ1 a n

 1        

 1        

 ζ1 χ1

ζ2 ζ 1 χ2 χ 1

..

. ζn−1 ...ζ1 χn−1 ...χ1

 ζ1 χ1

ζ2 ζ 1 χ2 χ1

..

. ζn−1 ...ζ1 χn−1 ...χ1

       



ζ2 b 2 χ3 a 3

ζ3 b 3 .. .

..

. χn a n

   .   

 Question 5.21. Prove Theorem 5.13. Also, reduce the exponent 4n −2 in Theorem 5.13 to 2n −1. Proof. First, we can evaluate kD 1 D 1 − I k2 as follows: ° ° ° ° χ22 χ21 χ2n . . . χ21 ° ° 2 kD 1 D 1 − I k2 = °diag{χ1 − 1, 2 − 1, . . . , 2 − 1} ° ° ° ζ1 ζn−1 . . . ζ21 2 Q ¯ ¯ Qi 2¯ ¯ j =1 χ2 − ij−1 =1 ζ j ¯ j ¯ = max ¯ ¯ Q i −1 2 ¯ 1≤i ≤n ¯ j =1 ζ j ¯ ¯ ¯Y i iY −1 ¯ 2 2¯ 2i −2 ¯ ζj ¯ ≤ max τ ¯ χj − ¯ j =1 ¯ 1≤i ≤n j =1 ≤ max τ2i −2 (τ2i − τ−2i +2 ) 1≤i ≤n

≤ max τ4i −2 − 1 = τ4n−2 − 1. 1≤i ≤n

For kD 2 D 2 − I k2 , using the same techinique, we can prove kD 2 D 2 − I k2 = τ4n−4 − 1. Thus, τ4n−2 − 1 ≡ max(kD 1 D 1 − I k2 , kD 2 D 2 − I k2 ), and by Corollary 5.2, we have ˆ i − σi | ≤ (τ4n−2 − 1)σi = ((² + 1)4n−2 − 1)σi = (4n − 2)²σi + O(²2 ). |σ

63

       

To improve the bound, it is necessary to improve max(kD 1 D 1 − I k2 , kD 2 D 2 − I k2 ). Consider, Q p Q ζ j /χ j )D 1 , Dˆ 2 = ( χn [n/2] χ j /ζ j )D 2 8 . Then, Dˆ 1 = ( p1χ [n/2] j =1 j =1 n

Ã

! n−1 Y Y Y χ1 [n/2] χ2 [n/2] p Dˆ 1 = diag p χ j /ζ j , ζ j /χ j , p ζ j /χ j , . . . , χn χn j =1 χn j =2 j =[n/2]+1 Ã ! [n/2] [n/2] n−1 Y Y Y p p p Dˆ 2 = diag χn χ j /ζ j , χn χ j /ζ j , . . . , χn ζ j /χ j j =1

j =2

j =[n/2]+1

It is easy to verify that Bˆ = Dˆ 1 B Dˆ 2 = D 1 B D 2 , and ¯ ¯! ¯ ¯ Ã ¯ χ2 [n/2] ¯ ¯ ¯ χ2 iY −1 Y 2 2 ¯ ¯ ¯ ¯ i i 2 2 kDˆ 1 Dˆ 1 − I k2 ≤ max max ¯ χ j /ζ j − 1¯ ≤ τ2n−1 − 1, ζ j /χ j − 1¯ , max ¯ ¯ ¯ [n/2]+1≤i ≤n ¯ χn j =[n/2]+1 1≤i ≤[n/2] ¯ χn j =i ¯ ¯ ¯ ¯! Ã ¯ [n/2] ¯ ¯ ¯ iY −1 Y 2 2 ¯ ¯ ¯ ¯ kDˆ 2 Dˆ 2 − I k2 ≤ max max ¯χn χ j /ζ j − 1¯ , max ¯χn ζ2j /χ2j − 1¯ ≤ τ2n−1 − 1. ¯ ¯ ¯ 1≤i ≤[n/2] ¯ [n/2]+1≤i ≤n j =i j =[n/2]+1



Thus, we improve that bound from 4n − 2 to 2n − 1.

Question 5.22. Prove that Algorithm 5.13 computes the SVD of G, assuming that G T G converges to a diagonal matrix. Remark. I am not sure whether those σi are ordered by their corresponding magnitude, and it seems that G must have full rank. Otherwise, U = [G(:, 1)/σ1 ,G(:, 2)/σ2 , . . . ,G(:, n)/σn ] cannot be orthonormal as G(:, i ) are linear dependent or some σi = 0. ¡ ¢ ˆ it Proof. Denote Gˆ as the resulted matrix such that Gˆ T Gˆ = diag σ21 , σ22 , . . . , σ2n = Λ. As Gˆ T G, follows n X ˆ , j )2 = kG(:, ˆ j )k2 . σ2 = G(i j

2

i =1

ˆ 1)/σ1 , G(:, ˆ 2)/σ2 , . . . , G(:, ˆ n)/σn ] is orthonormal, since Then, we claim that U = [G(:,   ˆ 1)T /σ1 G(:,  ˆ   G(:, 2)T /σ2  £ ¤ T   G(:, ˆ 1)/σ1 , G(:, ˆ 2)/σ2 , . . . , G(:, ˆ n)/σn = I , U U = ..  .   ˆ n)T /σn G(:,

 ˆ 1)T /σ1 G(:,  ˆ  2)T /σ2  £ ¤  G(:, T   ˆ ˆ ˆ UU = G(:, 1)/σ1 , G(:, 2)/σ2 , . . . , G(:, n)/σn  ..  .   T ˆ n) /σn G(:, n X ˆ i )G(:, ˆ i )T /σ2 = GΛ ˆ −1Gˆ T = G(Λ ˆ −1Gˆ T G) ˆ Gˆ −1 = I . = G(:, i 

i =1

8 [a] denotes the largest integer that smaller than a. Another feasible const is pχ Q(n/2) ζ χ , where (a) means 1 j =1 j j

the smallest ineger that larger than a.

64

Finally, as p ˆ 1)/σ1 , G(:, ˆ 2)/σ2 , . . . , G(:, ˆ n)/σn ] ΛJ T = Gˆ J T = Gˆ J −1 = G, U ΣV T = [G(:,



it follows that Algorithm 5.13 computes the SVD of G. Question 5.23. A harder question. Skipped. Question 5.24. A harder question. Skipped. Question 5.25. A harder question. Skipped.

Question 5.26. Suppose that x is an n-vector. Define the matrix C by c i j = |x i |+|x j |−|x i − x j |. Show that C (x) is positive semidefinite. Proof. As ½

c i j = |x i | + |x j | − |x i − x j | =

2 min(|x i |, |x j |), 0,

x i x j ≥ 0, x i x j < 0,

we may assume that x 1 ≥ x 2 ≥ . . . ≥ x k ≥ 0 ≥ x k+1 . . . ≥ x n . Otherwise, we can use permutation matrix to transform other situation into this. Then, the matrix C has the form   x1 x2 . . . xk    x2 x2 . . . xk   .  .. . . ..  .  . .  .  .   · ¸ x k x k . . . x k   = C1 . C = 2  −x k+1 −x k+2 . . . −x n  C2    −x k+2 −x k+2 . . . −x n     .. .. ..  ..   . . . .   −x n −x n . . . −x n Thus, it suffices to prove for the x which satisfies |x| = x, and consequently C takes the form   x1 x2 . . . xk    x2 x2 . . . xk   C = . .. . . .  . . ..  .  .. xk

xk

...

xk

For simlicity, call the matrices which have this form as A-matrices. We prove that A-matrices are semipositive by induction. First, as x i ≥ 0 by assumption, the base case holds. Then, suppose that all (k − 1)-by-(k − 1) A-matrices are semipositive. For any k-by-k A-matrix C , its determinant can be computed as     x2 − x1 x3 − x1 . . . xk − x1 x1 x2 . . . xk  x3 − x2 . . . xk − x2      x 2 x 2 . . . x k    . .     .. .. C = xk  . .. . . ..  = x k   . . .    . .  x k − x k−1  1 1 ... 1 1 1 1 ... 1 Y 2k = (−1) x k (x i − x i +1 ) ≥ 0.

65

Thus, all the principal submatrices of C has nonnegative determinant, which implies C is semipositive.  Question 5.27. Let µ

A=

I B∗

with kB k2 < 1. Show that kAk2 kA −1 k2 =

B I



1 + kB k2 . 1 − kB k2

Proof. To prove the equation, we need to compute the kAk2 and kA −1 k2 exactly. Denote B µ ¶ y n+m as n-by-m matrix, and partion any vector x ∈ R as x = where y ∈ Rn and z ∈ Rm . Then z we can evaluate the largest and smallest eigenvalues of A as follows: µ ¶µ ¶ ¡ ∗ ¢ I B y ∗ ∗ y z λmax = max x Ax = max = 1 + 2 max Re(y ∗ B z), B∗ I z kxk2 =1 kyk22 +kzk22 =1 kyk22 +kzk22 =1 λmin = 1 + 2

min

kyk22 +kzk22 =1

Re(y ∗ B z).

As 2|Re(y ∗ B z)| = ky ∗ B z + z ∗ B ∗ yk2 ≤ 2kyk2 kB k2 kzk2 ≤ (kyk22 + kzk22 )kB k2 = kB k2 , the above formulas has an upper bound as 1 + 12 kB k2 , and a lower bound as 1 − 12 kB k2 . To see both of them are attainable, set zˆ as the vector which satisfies kB zˆ k2 = kB k2 , and yˆ which is a unit vector and parallel to zˆ ∗ B ∗ . Thus, y ∗B z kyk22 + kzk22

=

±1 ±1 ∗ ∗ ±1 zˆ B B z = kB zk22 = kB k2 . 2kB k2 2kB k2 2

What’s more, since we know kB k2 < 1, the matrix A are positive definite. Thus, the largest eigenvalue of A −1 is the inverse of smallest eigenvalue of A. Therefore, kAk2 kA −1 k2 =

1 + kB k2 . 1 − kB k2

 Question 5.28. A square matrix A is said to be skew Hermitian if A ∗ = −A. Prove that 1. the eigenvalues of a skew Hermitian are purely imaginary. 2. I − A is nonsingular. 3. C = (I − A)−1 (I + A) is unitary, C is called the Cayley transform of A. Proof. 1. Denote λ as an eigenvalue of matrix A, and x as the corresponding unit eigenvector. Thus, Ax = λx and −x ∗ A =λx ∗ . Consequently, x ∗ Ax = λx ∗ x = λ, −x ∗ Ax =λx ∗ x =λ. Add them together, and we have λ +λ= 0, which means λ is pure imaginary.

66

2. According to Question 4.5, the eigenvalues of I − A are 1 − λi , where λi are the eigenvalues of A. Since all the eigenvalues of A are pure imaginary, Y

|1 − λi | =

Yq

1 + λ2i ,

which suggests that I − A is invertible. 3. Since C ∗ = (I − A)(I + A)−1 , we have C ∗C = (I − A)(I + A)−1 (I − A)−1 (I + A) = (I − A)(I − A + A − A A)−1 (1 + A) = (I − A)(I − A)−1 (I + A)−1 (1 + A) = I, ∗

CC = (I − A)−1 (I + A)(I − A)(I + A)−1 = (I − A)−1 (I − A)(I + A)(I + A)−1 = I. Thus, C is unitary.



6 S OLUTIONS FOR C HAPTER VI: I TERATIVE M ETHODS FOR L INEAR S YSTEMS Question 6.1. Prove Lemma 6.1. Proof. It suffices to verify T N z j = λ j z j as follows: 

2 −1 −1 2 −1   .  −1 . .    .. . 

  jπ 2jπ 2 sin( ) − sin( )   N +1 N +1 jπ    sin( N +1 ) ..     2jπ   .  sin( )     ..  N +1   (k+1) j π (k−1) j π  k jπ . =  2 sin( N +1 ) − sin( N +1 ) − sin( N +1 ) .    ..    ..  ..   . −1 j Nπ .   sin( N +1 ) N jπ (N −1) j π −1 2 2 sin( N +1 ) − sin( N +1 ) 

A−B As sin(A) + sin(A) = 2 sin( A+B 2 ) cos( 2 ), it follows

2 sin(

(k + 1) j π (k − 1) j π k jπ k jπ jπ k jπ ) − sin( ) − sin( ) = 2 sin( ) − 2 sin( ) cos( ). N +1 N +1 N +1 N +1 N +1 N +1 0· j π

As for the first and last entries, it also holds because sin( N +1 ) = sin(

(N +1)· j π N +1 ) = 0.



Question 6.2. Prove the following formulas for triangular factorizations of T N .

67

T 1. The Cholesky factorization T N = B N B N has a upper bidiagonal Cholesky factor B N with

s

B N (i , i ) =

i +1 , B N (i , i + 1) = i

s

i . i +1

2. The result of Gaussian elimination with partial pivoting on T N is T N = L N U N , where the triangular factors are bidiagonal: L N (i , i ) = 1, L N (i + 1, i ) = − U N (i , i ) =

i , i +1

i +1 , U N (i , i + 1) = −1. i

3. T N = D N D TN , where D N is the N -by-(N + 1) upper bidiagonal matrix with 1 on the main diagonal and −1 on the superdiagonal. Proof. T 1. It suffices to verify that T N = B N B N . According to Question 5.19, the diagonal entries of q i −1 i +1 i T T BN B N are i +1 + = 2, and the offdiagonal entries are i i i i +1 = 1. Thus, T N = B N B N .

2. As T N is band matrix, T (i , i ) would be updated only at (i − 1)th step. For the first step, L N (1, 1) = 1, L N (2, 1) = − 12 and U N (1, 1) = 2, U N (1, 2) = −1. Suppose it holds for (k −1)th step and does not require swapping rows. Then, for kth step, we have      . .. .. .. .. ... . .     .   . .       ..  k k . 1  . −1 −1  = L k−1  . k−1 k−1 T N = L k−1    k−1     k+1   1    −1 2 −1 −1 k   k    .. .. .. . . .. .. . . . Therefore, it also holds for kth step and does not require swapping rows. 3. It suffices to verify that T N = D TN D N as follows:  



1 −1  ..  . D N D TN =   

..

. 1



1

  −1    −1   1 −1 

..

.

..

.

 2 −1     −1 . . . . . .   = .. 1   . 2 −1    −1 1 −1 2 −1

     = TN  

 Question 6.3. Confirm equation (6.13).

68

∂ ∂ 2 2 2 2 Proof. It suffices to verify that (− ∂x 2 − ∂y 2 ) sin(i πx) sin( j πy) = (i π + j π ) sin(i πx) sin( j πy) as follows: 2

(−

2

∂2 ∂2 ∂2 ∂2 − ) sin(i πx) sin( j πy) = − sin(i πx) sin( j πy) − sin(i πx) sin( j πy) ∂x 2 ∂y 2 ∂x 2 ∂y 2 = i 2 π2 sin(i πx) sin( j πy) + j 2 π2 sin(i πx) sin( j πy).

 Question 6.4.

1. Prove Lemma 6.2.

2. Prove Lemma 6.3. 3. Prove that the Sylvester equation AX − X B = C is equivalent to (I n ⊗ A −B T ⊗I m )vec(X ) = vec(C ). 4. Prove that vec(AX B ) = (B T ⊗ A) · vec(X ). Proof. 1. As the textbook has already proved part III of Lemma 6.2, we only need to prove part I and part II. For part I, partition X by columns as X = [x 1 , . . . , x n ]. Then      Ax 1 A x1  ..     .  . .. vec(AX ) =  .  =    ..  = (I n ⊗ A) · vec(X ). Ax n

A

xn

P The proof for part II is similar. It suffices to verify that X B (i , j ) = nk=1 B (k, j )X (i , k), which holds by the definition of matrix product. P 2. Since the (i , j ) block of (A ⊗ B )(C ⊗ D) is nk=1 A(i , k)C (k, j )B D = (AC ) ⊗ (B D), part I follows. For part II, use the result of part I, yielding

(A ⊗ B )(A −1 ⊗ B −1 ) = I m ⊗ I n = I m+n . Thus, (A ⊗ B )−1 = (A −1 ⊗ B −1 ). As for part III, we can verify it as follows   A(1, 1)B T A(2, 1)B T . . . A(n, 1)B T    A(1, 2)B T A(2, 2)B T . . . A(n, 2)B T  T   = AT ⊗ B T . (A ⊗ B ) =  .. .. .. ..  . . . .   T T T A(1, n)B A(2, n)B . . . A(n, n)B 3. Take operator "vec" on both side of the Sylvester equation, yielding vec(AX − X B ) = vec(AX ) − vec(X B ) = (I ⊗ A)vec(X ) − (B T ⊗ I )vec(X ) = vec(C ). 4. It can be verified as vec(AX B ) = (I ⊗ A)vec(X B ) = (I ⊗ A) · (B T ⊗ I )vec(X ) = (B T ⊗ A)vec(X ).

69

 Question 6.5. Suppose that A n×n is diagonalizable, so A has n independent eigenvectors: Ax i = αi x i , or AX = X Λ A , where X = [x 1 , x 2 , . . . , x n ] and Λ A = diag(αi ). Similarly, suppose that B m×m is diagonalizable, so B has m independent eigenvectors: B y i = βi y i , or B Y = Y ΛB , where Y = [y 1 , y 2 , . . . , y m ] and ΛB = diag(β j ). Prove the following results. 1. The mn eigenvalues of I m ⊗ A + B ⊗ I n are λi j = αi + β j , i.e., all possible sums of pairs of eigenvalues of A and B . The corresponding eigenvectors are z i j , where z i j = x i ⊗ y j , whose (km + l )th entry is x i (k)y j (l ). Written another way, (I m ⊗ A + B ⊗ I n )(Y ⊗ X ) = (Y ⊗ X )(I m ⊗ Λ A + ΛB ⊗ I n ). 2. The Sylvester equation AX + X B T = C is nonsingular if and only if the sum αi + β j = 0 for all eigenvalues αi of A and β j of B . The same is true for the slightly different Sylvester equation AX + X B = C . 3. The mn eigenvalues of A ⊗ B are λi j = αi β j , i.e., all possible products of pairs of eigenvalues of A and B . The corresponding eigenvectors are z i j , where z i j = x i ⊗ y j , whose (km + l )th entry is x i (k)y j (l ). Written another way, (B ⊗ A)(Y ⊗ X ) = (Y ⊗ X )(ΛB ⊗ Λ A ). Remark. It seems that part II of the question is wrong. The sum αi + β j should not equal zero. It has been proved in Question 4.6. Also, I do not know why the author swapped A and B in the matrix form, athough they are equivalent. Proof. 1. It suffices to verify the equation as follows: (I m ⊗ A + B ⊗ I n )(Y ⊗ X ) = Y ⊗ AX + B Y ⊗ X = Y ⊗ X Λ A + Y ΛB ⊗ X = (Y ⊗ X )(I m + Λ A ) + (Y ⊗ X )(ΛB + I n ) = (Y ⊗ X )(I m ⊗ Λ A + ΛB ⊗ I n ). 2. According to Question 6.4, the first Sylvester equation equals to solve (I m ⊗ A + B ⊗ I n )vec(X ) = vec(C ), which is a linear equation. To solve it, it is necessary and sufficient det(I m ⊗ A+B ⊗I n ) 6= 0. As what part I has proved, the eigenvalues of the above matrix is αi + β j . Thus, for all i , j , αi + β j 6= 0. For the second Sylvester equation, it suffices to prove that the eigenvalues of I m ⊗ A + B T ⊗ I n are also αi + β j , which is true if we subsititue B as B T in part I. 3. It suffices to verify the equation as follows: (B ⊗ A)(Y ⊗ X ) = (B Y ⊗ AX ) = (Y ΛB ) ⊗ (X Λ A ) = (Y ⊗ X )(ΛB ⊗ Λ A ).

70

Question 6.6. Programming question. Skipped. Question 6.7. Prove Lemma 6.7. Proof. 1. Since the mth Chebyshev polynomial is defined by Tm (x) = 2xTm−1 (x) − Tm−2 (x) with initial value T0 (x) = 1, T1 (x) = x, we can prove it by induction. As T0 (1) = 1, T1 (1) = 1 and assume it holds for (m − 1)th Chebyshev polynomial, it follows Tm (1) = 2Tm−1 (1) − Tm−2 (1) = 1. 2. As T1 (x) = x = 21−1 x, T0 (x) = 19 , suppose it holds for (m − 1)th Chebyshev polynomial. It follows Tm (x) = 2xTm−1 (x) − Tm−2 (x) = 2m−1 x m + O(x m−1 ). 3. The recursive formula has unique solution when initial value is given, though it may take many forms. Thus, it suffices to verify such Tm satisfies the recursive formula by induction, since it holds for base case. When |x| ≤ 1, it follows10 ¡ ¢ Tm = cos(m · arccos x) = cos (m − 1) arccos x + arccos x ¡ ¢ ¡ ¢ = x cos( m − 1) arccos x − sin (m − 2) arccos x + arccos x sin(arccos x) ¡ ¢ = xTm−1 − x sin (m − 2) arccos x sin(arccos x) − Tm−2 sin(arccos x)2 ³ ´ = xTm−1 − Tm−2 + Tm−2 cos(arccos x)2 + x/2 Tm−1 − Tm−3 3 x = xTm−1 − Tm−2 + (2xTm−2 − Tm−3 ) 2 2 = 2xTm−1 − Tm−2 . When |x| ≥ 1, it follows11 ¡ ¢ ¡ ¢ Tm = cosh(marccoshx) = xTm−1 + sinh (m − 1)arccoshx sinh arccoshx ¡ ¢ ¡ ¢ ¡ ¢2 = xTm−1 + x sinh (m − 2)arccoshx sinh arccoshx + Tm−2 sinh arccoshx

= xTm−1 − Tm−2 + x 2 Tm−2 + x/2Tm−1 − x/2Tm−3 3 = Tm−1 − Tm−2 + x/2(2xTm−2 − Tm−3 ) 2 = 2Tm−1 − Tm−2 . ¡ ¢ 4. |Tm (x)| ≤ 1 because |Tm (x)| = | cos m arccos x | when |x| ≤ 1.

5. As cosh x > 0, the zeros of Tm (x) must lie in segment [−1, 1]. When |x| ≤ 1, T (x) = ¡ ¢ ¡ ¢ cos m arccos x , which has zero at x i = cos (2i − 1)π/(2m) . Since Tm (x) is polynomial ¡ ¢ with degree m, all its zeros are x i = cos (2i − 1)π/(2m) . 9 Although T does not satisfy the formula, yet it does not matter. 0 10 cos(arccos x) = x holds because |x| ≤ 1. 11 cosh(arccoshx) = x holds because |x| ≥ 1.

71

p 6. Since arccoshx = ln(x ± x 2 − 1) when |x| > 1, it follows p p ¢ 1¡ exp(m ln(x ± x 2 − 1)) + exp(−m ln(x ± x 2 − 1)) 2 p p ¢ 1¡ = (x ± x 2 − 1)m + (x ± x 2 − 1)−m . 2 p p As x + x 2 − 1 = 1/(x − x 2 − 1), it can be united as one formula as

Tm =

p p ¢ 1¡ (x + x 2 − 1)m + (x + x 2 − 1)−m . 2

7. Substitute x as 1 + ² into the above formula, yielding12 p p p p ¢ 1 1¡ (1 + ² + ²2 + 2²)m + (1 + ² + ²2 + 2²)−m ≥ (1 + 2²)m ≥ .5(1 + m 2²). 2 2

Question 6.8. Programming question. Skipped. Question 6.9. Confirm that evaluating the formula in (6.47) by performing the matrix-vector multiplications from right to left is mathematically the same as Algorithm 6.13. Proof. We prove it step by step. First, according to Question 6.4., (Z T ⊗ Z T ) · vec(h 2 F ) = h 2 vec(Z F Z T ) = h 2 vec(Z T F Z ).13 The first step is correct. Since I ⊗ Λ + Λ ⊗ I = diag(λ1 I + Λ, . . . , λn Λ), it follows (I ⊗ Λ + Λ ⊗ I )−1 = diag(

1 1 1 1 , ,..., ,..., ). 2λ1 λ1 + λ2 λ2 + λ1 2λn

Consequently, −1

2 0

(I ⊗ Λ + Λ ⊗ I ) vec(h F )( j n + i ) =

h 2 f i0, j +1 λ j +1 + λi

.

The second step is correct. Finally, use part IV of Question 6.4., again, yielding vec(V ) = (Z ⊗ Z )vec(V 0 ) = vec(Z V 0 Z ) = vec(Z V 0 Z T ).

 Question 6.10. The question is too long, so I omit it. For details, please refer to the textbook. Proof. 12 The last one holds by Bernoulli inequality. 13 Z = Z T .

72

1. First, we prove that when two matrices commute they have at least one common eigenvector. Suppose λ is an eigenvalue of matrix A with corresponding eigenvectors x 1 , x 2 , . . . , x s . Since AB x i = B Ax i = λB x i , each B x i is in the eigenspace with value λ of A. ConseP quently, B x i = sj =1 a j i x j . Or, in matrix form as  a 11 ¡ ¢ ¡ ¢ . B x 1 , . . . , B x s = x 1 , . . . , x s  ..

a 12 .. . a s2

a s1

... .. . ...

 a 1s ..  . .

a ss

Denote the right side matrix as S. It has at least one eigenvector y corresponding to µ. Thus, a 11 £ ¤ £ ¤ . B x 1 , . . . , x s y = x 1 , . . . , x s  .. a s1

a 12 .. . a s2



... .. . ...

 a 1s £ ¤ ..  .  y = µ x 1 , . . . , x s y.

a ss

£ ¤ £ ¤ As x i are linear independent, x 1 , . . . , x s y 6= 0. Therefore, x 1 , . . . , x s y is a common eigenvector. Then, we prove the question by induction. The base case holds because it involves only scalar operations. Suppose it holds for (n − 1)-by-(n − 1) matrices. Then, for n-by-n matrices, since A and B have one common unit eigenvector x, expand it into an orthonormal matrix Q. We have

Q T AQ =

µ λ



A n−1

,Q T BQ =

µ µ



B n−1

.

Since Q T AQQ T BQ = Q T ABQ = Q T B AQ − Q T BQQ T AQ, it follows that A n−1 and B n−1 commute. Thus, by induction, there exists Q˜ such that Q˜ T A n−1Q˜ = Λ A n−1 , Q˜ T B n−1Q˜ = ΛB n−1 . Then Q

T

µ

1

¶ µ ¶ µ ¶ µ ¶ µ ¶ µ ¶ 1 λ 1 λ T 1 A Q= ,Q B Q= . ΛB n−1 ΛB n−1 Q˜ T Q˜ Q˜ T Q˜

2. First, consider the most trivial case, when  −1  −1  T˜ =  ..  . 



..

.

   .  −1

−1 jπ Since T˜ = T N − 2I , by Question 4.5, the eigenvalues of T˜ are 2 cos( N +1 ) with correj Nπ jπ ), . . . , sin( )]. Then, as Tˆ = αI − θ T˜ , it follows sponding eigenvector z j = [sin( N +1

N +1

jπ that the eigenvalues of Tˆ are α − 2θ cos( N +1 ) with corresponding eigenvectors z j = jπ

j Nπ

[sin( N +1 ), . . . , sin( N +1 )].

73

3. Since T = I ⊗ A + T˜ ⊗ H , it follows (I ⊗Q)T (I ⊗Q T ) = (I ⊗Q)(I ⊗ A)(I ⊗Q T ) + (I ⊗Q)(T˜ ⊗ H )(I ⊗Q T ) = I ⊗Q AQ T + T˜ ⊗Q HQ T = I ⊗ Λ A + T˜ ⊗ ΛH . jπ

By Question 6.5, the eigenvalues of T˜ ⊗ ΛH are λi j = 2 cos( N +1 )θi with corresponding eigenvector x i j = z i ⊗ e j . Then, (I ⊗ Λ A )x i j = (I ⊗ Λ A )(z i ⊗ e j ) = z i ⊗ (α j )e j = α j (z i ⊗ e j ) = α j x i j . Denote X = [x 11 , x 12 , . . . , x nn ] = Z ⊗ I . It follows (I ⊗ Λ A + T˜ ⊗ ΛH )X = X diag(α1 + λ11 , . . . , αn + λnn ). The eigenvectors are (I ⊗Q)(Z ⊗ I ) = Z ⊗Q. 4. Denote ST S −1 = Λ as the eigendecomposition of T . Then, to solve T x = b yields x = S −1 Λ−1 Sb. As we have the explicit form of Λ and S, we claim by this method we can solve T x = b in O(n 3 ) while dense LU costs O(n 6 ) and band LU costs O(n 4 ). As Z is symmetric orthonormal and finding Q and Q −1 costs O(n 3 ) time, if we can bound the matrix vector product (Z ⊗ Q)b and (Z −1 ⊗ Q −1 )c, the claim holds. Since the two matrix vector product is similar, we use (Z ⊗ Q)b as illustration. Partition b by every n entries, and form it into a n-by-n matrix B . Then, the corresponding matrix vector product becomes matrix matrix product as (Z ⊗Q)b = (Z ⊗Q)vec(B ) = vec(QB Z T ), which costs O(n 3 ) time. 5. If A and B are symmetric tridiagonal Toeplitz matrices, Q = Z and the cost of matrix matrix product can be reduced to O(n 2 log n) by fast sine transformation (or FFT). Thus, the total cost reduces to O(n 2 log n).

 Question 6.11. Suppose that R is upper triangular and nonsingular and that C is upper hessenberg. confirm that RC R −1 is upper Hessenberg. Proof. As the inverse of an upper triangular matrix is upper triangular, it suffices to verify multiplication by upper triangular preserve Hessenberg. Since C (i , j ) = 0 when i + 1 > j and R(i , j ) = 0 when i > j , it follows RC (i , j ) =

n X

R(i , k)C (k, j ) =

k=1

C R(i , j ) =

n X k=1

iX −1

R(i , k)C (k, j ) +

k=1

C (i , k)R(k, j ) =

iX −2 k=1

n X

R(i , k)C (k, j ) = 0, when i > j + 1,

k=i

C (i , k)R(k, j ) +

n X

C (i , k)R(k, j ) = 0, when i > j + 1.

k=i −1



74

Question 6.12. Confirm that the Krylov subspace Kk (A, y 1 ) has dimension k if and only if the Arnoldi algorithm (Algorithm 6.9) or the Lanczos algorithm (Algorithm 6.10) can compute q k without quitting first. Proof. The proofs for Arnoldi algorithm and Lanczos algorithm are similar. Thus, it suffices to prove the result for Arnoldi algorithm. If we can prove that at i th step of Arnoldi algorithm, z i = Aq i lies in the space span(A[y 1 , . . . .y i ]). The question is proved, because the criteria of quitting is whether h j +1, j = 0 occurs for any j , which equals to span(A[y 1 , . . . , y j ]) ⊂ span[y 1 , . . . , y j ], i.e., Krylov subspace Kk (A, y 1 ) has dimension less that k. We prove z i = Aq i ∈ span(A[y 1 , . . . .y i +1 ]) by induction. For the base case, it holds because q j = b/kbk2 . If it holds for the ( j − 1)th step, we can conclude q i ∈ [y 1 , . . . .y j −1 ] for all i ≤ j , Pi −1 since q i = z i −1 − k=1 q k . Then for the j th step, z = Aq j ∈ span(A[y 1 , . . . .y j ]).  Question 6.13. Confirm that when A n×n is symmetric positive definite and Q n×k has full column rank, then T = Q T AQ is also symmetric positive definite. Proof. For any vector x ∈ Rk , it follows x T Q T AQx = (Qx)T AQx ≥ 0. As A is s.p.d, only when Qx = 0 the equality holds. Then, since Q has full column rank, the only solution of Qx = 0 is x = 0. Thus, T is also s.p.d.  Question 6.14. Prove Theorem 6.9. Remark. The Theorem has a little error. That is, Aˆ = M −1/2 AM 1/2 . The correct one should be Aˆ = M −1/2 AM −1/2 Proof. It suffices to prove the relationship bewtween the variables and we can do it by induction. For the base case, we can verify as follows: zˆ = Aˆ · pˆ 1 = M −1/2 AM −1 b = M −1/2 Ap 1 = M −1/2 z, νˆ 1 = (ˆr T0 rˆ 0 )/(pˆ T1 zˆ ) = (b T M −1 b)/(b T M −1 z) = ν1 , xˆ 1 = xˆ 0 + νˆ 1 pˆ 1 = ν1 M 1/2 p 1 = M 1/2 x 1 , rˆ1 = rˆ0 − νˆ 1 zˆ = M −1/2 r 0 − ν1 M −1/2 z = M −1/2 r 1 , µˆ 2 = (ˆr T1 rˆ 1 )/(ˆr T0 rˆ 0 ) = (r T1 M −1 r 1 )/(r T0 M −1 r 0 ) = µ2 , pˆ 2 = rˆ1 + µˆ 2 pˆ 1 = M −1/2 r 1 + M −1/2 µ2 p 1 = M 1/2 p 2 . Suppose it holds for (k − 1)th step. Then, for kth step, it follows zˆ = Aˆ · pˆ k = M −1/2 Ap k = M −1/2 z, νˆ k = (ˆr Tk−1 rˆ k−1 )/(pˆ Tk zˆ ) = (r k−1 T M −1 r k−1 )/(p Tk z) = νk , xˆ k = xˆ k−1 + νˆ k pˆ k = M 1/2 x k−1 + νk M 1/2 p k = M 1/2 x k , rˆk = rˆk−1 − νˆ k zˆ = M −1/2 r k−1 − νk M −1/2 z = M −1/2 r k , µˆ k+1 = (ˆr Tk rˆ k )/(ˆr Tk−1 rˆ k−1 ) = (r Tk M −1 r k )/(r Tk−1 M −1 r k−1 ) = µk+1 , pˆ k+1 = rˆk + µˆ k+1 pˆ k = M −1/2 r k + M −1/2 µk+1 p k = M 1/2 p k+1 .



75