Exercises in Numerical Linear Algebra and Matrix Factorizations

Table of contents:
Preface
Acknowledgments
Contents
Listings
Chapter 1 A Short Review of Linear Algebra
Exercises section 1.1
Exercise 1.1: Strassen multiplication (Exam exercise 2017-1)
Exercises section 1.3
Exercise 1.2: The inverse of a general 2 × 2 matrix
Exercise 1.3: The inverse of a special 2 × 2 matrix
Exercise 1.4: Sherman-Morrison formula
Exercise 1.5: Inverse update (Exam exercise 1977-1)
Exercise 1.6: Matrix products (Exam exercise 2009-1)
Exercises section 1.4
Exercise 1.7: Cramer’s rule; special case
Exercise 1.8: Adjoint matrix; special case
Exercise 1.9: Determinant equation for a plane
Exercise 1.10: Signed area of a triangle
Exercise 1.11: Vandermonde matrix
Exercise 1.12: Cauchy determinant (1842)
Exercise 1.13: Inverse of the Hilbert matrix
Chapter 2 Diagonally Dominant Tridiagonal Matrices; Three Examples
Exercises section 2.1
Exercise 2.1: The shifted power basis is a basis
Exercise 2.2: The natural spline, n = 1
Exercise 2.3: Bounding the moments
Exercise 2.4: Moment equations for 1. derivative boundary conditions
Exercise 2.5: Minimal norm property of the natural spline
Theorem 2.5 (Minimal norm property of a cubic spline)
Exercise 2.6: Computing the D2-spline
Exercise 2.7: Spline evaluation
Exercises section 2.2
Exercise 2.8: Central difference approximation of 2. derivative
Exercise 2.9: Two point boundary value problem
Exercise 2.10: Two point boundary value problem; computation
Exercises section 2.3
Exercise 2.11: Approximate force
Exercise 2.12: Symmetrize matrix (Exam exercise 1977-3)
Exercises section 2.4
Exercise 2.13: Eigenpairs T of order 2
Exercise 2.14: LU factorization of 2. derivative matrix
Exercise 2.15: Inverse of the 2. derivative matrix
Exercises section 2.5
Exercise 2.16: Matrix element as a quadratic form
Exercise 2.17: Outer product expansion of a matrix
Exercise 2.18: The product AT A
Exercise 2.19: Outer product expansion
Exercise 2.20: System with many right hand sides; compact form
Exercise 2.21: Block multiplication example
Exercise 2.22: Another block multiplication example
Chapter 3 Gaussian Elimination and LU Factorizations
Exercises section 3.3
Exercise 3.1: Column oriented backsolve
Exercise 3.2: Computing the inverse of a triangular matrix
Exercise 3.3: Finite sums of integers
Exercise 3.4: Multiplying triangular matrices
Exercises section 3.4
Exercise 3.5: Using PLU for A*
Exercise 3.6: Using PLU for determinant
Exercise 3.7: Using PLU for A−1
Exercise 3.8: Upper Hessenberg system (Exam exercise 1994-2)
Exercises section 3.5
Exercise 3.9: # operations for banded triangular systems
Exercise 3.10: L1U and LU1
Exercise 3.11: LU of nonsingular matrix
Exercise 3.12: Row interchange
Exercise 3.13: LU and determinant
Exercise 3.14: Diagonal elements in U
Exercise 3.15: Proof of LDU theorem
Exercise 3.16: Proof of LU1 theorem
Exercise 3.17: Computing the inverse (Exam exercise 1978-1)
Exercise 3.18: Solving T Hx = b (Exam exercise 1981-3)
Exercise 3.19: L1U factorization update (Exam exercise 1983-1)
Exercise 3.20: U1L factorization (Exam exercise 1990-1)
Exercises section 3.6
Exercise 3.21: Making block LU into LU
Chapter 4 LDL* Factorization and Positive Definite Matrices
Exercises section 4.2
Exercise 4.1: Positive definite characterizations
Exercise 4.2: L1U factorization (Exam exercise 1982-1)
Exercise 4.3: A counterexample
Exercise 4.4: Cholesky update (Exam exercise 2015-2)
Exercise 4.5: Cholesky update (Exam exercise 2016-2)
Chapter 5 Orthonormal and Unitary Transformations
Exercises section 5.1
Exercise 5.1: The A* A inner product
Exercise 5.2: Angle between vectors in complex case
Exercise 5.3: xT Ay inequality (Exam exercise 1979-3)
Exercises section 5.2
Exercise 5.4: What does algorithm housegen do when x = e1?
Exercise 5.5: Examples of Householder transformations
Exercise 5.6: 2 × 2 Householder transformation
Exercise 5.7: Householder transformation (Exam exercise 2010-1)
Exercises section 5.4
Exercise 5.8: QR decomposition
Exercise 5.9: Householder triangulation
Exercise 5.10: Hadamard’s inequality
Exercise 5.11: QL factorization (Exam exercise 1982-2)
Exercise 5.12: QL-factorization (Exam exercise 1982-3)
Exercise 5.13: QR Fact. of band matrices (Exam exercise 2006-2)
Exercise 5.14: Find QR factorization (Exam exercise 2008-2)
Exercises section 5.5
Exercise 5.15: QR using Gram-Schmidt, II
Exercises section 5.6
Exercise 5.16: Plane rotation
Exercise 5.17: Updating the QR decomposition
Exercise 5.18: Solving upper Hessenberg system using rotations
Exercise 5.19: A Givens transformation (Exam exercise 2013-2)
Exercise 5.20: Givens transformations (Exam exercise 2016-3)
Exercise 5.21: Cholesky and Givens (Exam exercise 2018-2)
Chapter 6 Eigenpairs and Similarity Transformations
Exercises section 6.1
Exercise 6.1: Eigenvalues of a block triangular matrix
Exercise 6.2: Characteristic polynomial of transpose
Exercise 6.3: Characteristic polynomial of inverse
Exercise 6.4: The power of the eigenvector expansion
Exercise 6.5: Eigenvalues of an idempotent matrix
Exercise 6.6: Eigenvalues of a nilpotent matrix
Exercise 6.7: Eigenvalues of a unitary matrix
Exercise 6.8: Nonsingular approximation of a singular matrix
Exercise 6.9: Companion matrix
Exercise 6.10: Find eigenpair example
Exercise 6.11: Right or wrong? (Exam exercise 2005-1)
Exercise 6.12: Eigenvalues of tridiagonal matrix (Exam exercise 2009-3)
Exercises section 6.2
Exercise 6.13: Jordan example
Exercise 6.14: A nilpotent matrix
Exercise 6.15: Properties of the Jordan factorization
Exercise 6.16: Powers of a Jordan block
Exercise 6.17: The minimal polynomial
Exercise 6.18: Cayley Hamilton Theorem (Exam exercise 1996-3)
Exercises section 6.3
Exercise 6.19: Schur factorization example
Exercise 6.20: Skew-Hermitian matrix
Exercise 6.21: Eigenvalues of a skew-Hermitian matrix
Exercise 6.22: Eigenvector expansion using orthogonal eigenvectors
Exercise 6.23: Rayleigh quotient (Exam exercise 2015-3)
Exercises section 6.4
Exercise 6.24: Eigenvalue perturbation for Hermitian matrices
Exercise 6.25: Hoffman-Wielandt
Exercise 6.26: Biorthogonal expansion
Exercise 6.27: Generalized Rayleigh quotient
Chapter 7 The Singular Value Decomposition
Exercises section 7.1
Exercise 7.1: SVD1
Exercise 7.2: SVD2
Exercise 7.3: SVD examples
Exercise 7.4: More SVD examples
Exercise 7.5: Singular values of a normal matrix
Exercise 7.6: The matrices A* A, AA* and SVD
Exercise 7.7: Singular values (Exam exercise 2005-2)
Exercises section 7.2
Exercise 7.8: Nonsingular matrix
Exercise 7.9: Full row rank
Exercise 7.10: Counting dimensions of fundamental subspaces
Exercise 7.11: Rank and nullity relations
Exercise 7.12: Orthonormal bases example
Exercise 7.13: Some spanning sets
Exercise 7.14: Singular values and eigenpairs of composite matrix
Exercise 7.15: Polar decomposition (Exam exercise 2011-2)
Exercise 7.16: Underdetermined system (Exam exercise 2015-1)
Exercises section 7.4
Exercise 7.17: Rank example
Exercise 7.18: Another rank example
Exercise 7.19: Norms, Cholesky and SVD (Exam exercise 2016-1)
Chapter 8 Matrix Norms and Perturbation Theory for Linear Systems
Exercises section 8.1
Exercise 8.1: An A-norm inequality (Exam exercise 1982-4)
Exercise 8.2: A-orthogonal bases (Exam exercise 1995-4)
Exercises section 8.2
Exercise 8.3: Consistency of sum norm?
Exercise 8.4: Consistency of max norm?
Exercise 8.5: Consistency of modified max norm
Exercise 8.6: What is the sum norm subordinate to?
Exercise 8.7: What is the max norm subordinate to?
Exercise 8.8: Spectral norm
Exercise 8.9: Spectral norm of the inverse
Exercise 8.10: p-norm example
Exercise 8.11: Unitary invariance of the spectral norm
Exercise 8.12: ‖AU‖2 rectangular A
Exercise 8.13: p-norm of diagonal matrix
Exercise 8.14: Spectral norm of a column vector
Exercise 8.15: Norm of absolute value matrix
Exercise 8.16: An iterative method (Exam exercise 2017-3)
Exercises section 8.3
Exercise 8.17: Perturbed linear equation (Exam exercise 1981-2)
Exercise 8.18: Sharpness of perturbation bounds
Exercise 8.19: Condition number of 2. derivative matrix
Exercise 8.20: Perturbation of the Identity matrix
Exercise 8.21: Lower bounds in Equations (8.27), (8.29)
Exercise 8.22: Periodic spline interpolation (Exam exercise 1993-2)
Exercise 8.23: LSQ MATLAB program (Exam exercise 2013-4)
Exercises section 8.4
Exercise 8.24: When is a complex norm an inner product norm?
Exercise 8.25: p norm for p = 1 and p = ∞
Exercise 8.26: The p-norm unit sphere
Exercise 8.27: Sharpness of p-norm inequality
Exercise 8.28: p-norm inequalities for arbitrary p
Chapter 9 Least Squares
Exercises section 9.1
Exercise 9.1: Fitting a circle to points
Exercise 9.2: Least square fit (Exam exercise 2018-1)
Exercises section 9.2
Exercise 9.3: A least squares problem (Exam exercise 1983-2)
Exercise 9.4: Weighted least squares (Exam exercise 1977-2)
Exercises section 9.3
Exercise 9.5: Uniqueness of generalized inverse
Exercise 9.6: Verify that a matrix is a generalized inverse
Exercise 9.7: Linearly independent columns and generalized inverse
Exercise 9.8: More orthogonal projections
Exercise 9.9: The generalized inverse of a vector
Exercise 9.10: The generalized inverse of an outer product
Exercise 9.11: The generalized inverse of a diagonal matrix
Exercise 9.12: Properties of the generalized inverse
Exercise 9.13: The generalized inverse of a product
Exercise 9.14: The generalized inverse of the conjugate transpose
Exercise 9.15: Linearly independent columns
Exercise 9.16: Analysis of the general linear system
Exercise 9.17: Fredholm’s alternative
Exercise 9.18: SVD (Exam exercise 2017-2)
Exercises section 9.4
Exercise 9.19: Condition number
Exercise 9.20: Equality in perturbation bound
Exercise 9.21: Problem using normal equations
Exercises section 9.5
Exercise 9.22: Singular values perturbation (Exam exercise 1980-2)
Chapter 10 The Kronecker Product
Exercises sections 10.1 and 10.2
Exercise 10.1: 4 × 4 Poisson matrix
Exercise 10.2: Properties of Kronecker products
Exercise 10.3: Eigenpairs of Kronecker products (Exam exercise 2008-3)
Exercises section 10.3
Exercise 10.4: 2. derivative matrix is positive definite
Exercise 10.5: 1D test matrix is positive definite?
Exercise 10.6: Eigenvalues for 2D test matrix of order 4
Exercise 10.7: Nine point scheme for Poisson problem
Exercise 10.8: Matrix equation for nine point scheme
Exercise 10.9: Biharmonic equation
Chapter 11 Fast Direct Solution of a Large Linear System
Exercises section 11.3
Exercise 11.1: Fourier matrix
Exercise 11.2: Sine transform as Fourier transform
Exercise 11.3: Explicit solution of the discrete Poisson equation
Exercise 11.4: Improved version of Algorithm 11.1
Exercise 11.5: Fast solution of 9 point scheme
Exercise 11.6: Algorithm for fast solution of 9 point scheme
Exercise 11.7: Fast solution of biharmonic equation
Exercise 11.8: Algorithm for fast solution of biharmonic equation
Exercise 11.9: Check algorithm for fast solution of biharmonic equation
Exercise 11.10: Fast solution of biharmonic equation using 9 point rule
Chapter 12 The Classical Iterative Methods
Exercises section 12.3
Exercise 12.1: Richardson and Jacobi
Exercise 12.2: R-method when eigenvalues have positive real part
Exercise 12.3: Divergence example for J and GS
Exercise 12.4: 2 by 2 matrix
Exercise 12.5: Example: GS converges, J diverges
Exercise 12.6: Example: GS diverges, J converges
Exercise 12.7: Strictly diagonally dominance; the J method
Exercise 12.8: Strictly diagonally dominance; the GS method
Exercise 12.9: Convergence example for fix point iteration
Exercise 12.10: Estimate in Lemma 12.1 can be exact
Exercise 12.11: Iterative method (Exam exercise 1991-3)
Exercise 12.12: Gauss-Seidel method (Exam exercise 2008-1)
Exercises section 12.4
Exercise 12.13: A special norm
Exercise 12.14: Is A + E nonsingular?
Exercise 12.15: Slow spectral radius convergence
Chapter 13 The Conjugate Gradient Method
Exercises section 13.1
Exercise 13.1: A-norm
Exercise 13.2: Paraboloid
Exercise 13.3: Steepest descent iteration
Exercise 13.4: Steepest descent (Exam exercise 2011-1)
Exercises section 13.2
Exercise 13.5: Conjugate gradient iteration, II
Exercise 13.6: Conjugate gradient iteration, III
Exercise 13.7: The cg step length is optimal
Exercise 13.8: Starting value in cg
Exercise 13.9: Program code for testing steepest descent
Exercise 13.10: Using cg to solve normal equations
Exercise 13.11: AT A inner product (Exam exercise 2018-3)
Exercises section 13.3
Exercise 13.12: Krylov space and cg iterations
Exercise 13.13: Antisymmetric system (Exam exercise 1983-3)
Exercise 13.14: cg antisymmetric system (Exam exercise 1983-4)
Exercises section 13.4
Exercise 13.15: Another explicit formula for Chebyshev polynomials
Exercise 13.16: Maximum of a convex function
Exercises section 13.5
Exercise 13.17: Variable coefficient
Chapter 14 Numerical Eigenvalue Problems
Exercises section 14.1
Exercise 14.1: Yes or No (Exam exercise 2006-1)
Exercises section 14.2
Exercise 14.2: Nonsingularity using Gershgorin
Exercise 14.3: Gershgorin, strictly diagonally dominant matrix
Exercise 14.4: Gershgorin disks (Exam exercise 2009-2)
Exercises section 14.3
Exercise 14.5: Continuity of eigenvalues
Exercise 14.6: ∞-norm of a diagonal matrix
Exercise 14.7: Eigenvalue perturbations (Exam exercise 2010-2)
Exercises section 14.4
Exercise 14.8: Number of arithmetic operations, Hessenberg reduction
Exercise 14.9: Assemble Householder transformations
Exercise 14.10: Tridiagonalize a symmetric matrix
Exercises section 14.5
Exercise 14.11: Counting eigenvalues
Exercise 14.12: Overflow in LDL* factorization
Exercise 14.13: Simultaneous diagonalization
Exercise 14.14: Program code for one eigenvalue
Exercise 14.15: Determinant of upper Hessenberg matrix
Chapter 15 The QR Algorithm
Exercise 15.1: Orthogonal vectors


23

Tom Lyche · Georg Muntingh · Øyvind Ryan

Exercises in Numerical Linear Algebra and Matrix Factorizations

Editorial Board: T. J. Barth, M. Griebel, D. E. Keyes, R. M. Nieminen, D. Roose, T. Schlick

Texts in Computational Science and Engineering
Editors: Timothy J. Barth, Michael Griebel, David E. Keyes, Risto M. Nieminen, Dirk Roose, Tamar Schlick

23

More information about this series at http://www.springer.com/series/5151

Tom Lyche • Georg Muntingh • Øyvind Ryan

Exercises in Numerical Linear Algebra and Matrix Factorizations

Tom Lyche Blindern University of Oslo Oslo, Norway

Georg Muntingh SINTEF ICT Oslo, Norway

Øyvind Ryan Blindern University of Oslo Oslo, Norway

ISSN 1611-0994 ISSN 2197-179X (electronic)
Texts in Computational Science and Engineering
ISBN 978-3-030-59788-7 ISBN 978-3-030-59789-4 (eBook)
https://doi.org/10.1007/978-3-030-59789-4
Mathematics Subject Classification (2010): 15-XX, 65-XX

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Cover illustration: A steepest descent iteration (left) and a conjugate gradient iteration (right) to find the minimum of a quadratic surface shown by some level curves.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland.

To our families:

Annett, Max, Ella, and Mina;

Natalia, Katrina, and Alina.

Preface

The starting point for these solutions is a course in numerical linear algebra at the University of Oslo. Tom Lyche wrote and gradually expanded a text for this course over a period of more than 30 years. He also extended the text with many new exercises, and many exam questions were written over the years as well. This work ultimately resulted in the book “Numerical Linear Algebra and Matrix Factorizations” (Springer, 2020). Several assistants worked on the exercise sessions for the course over the years, and a number of professors taught the course. Their combined effort formed the foundation for this solution manual, which the authors prepared over a period of one and a half years prior to publication.

The solutions repeat the exercise statements as they appear in Tom Lyche’s book. Since these solutions repeatedly refer to this book, it will simply be called “the book” in the following. Several exercises refer to labeled equations in the book. To distinguish between labels, roman numbering is used for equations in this solution manual (the book uses non-roman numbering). Otherwise there is little room for confusion, since objects like theorems, lemmas, and definitions appear only in the book (algorithms and figures specify whether they appear in the book or not, as confusion is possible there).

There are some small differences from the book. The exercise text is shown here within a grey box, so that one can easily distinguish between the exercise text from the book and its solution. Also, hints are shown in their own type of grey box, rather than as footnotes as in the book. The grey hint and exercise boxes also have their own icons.

Many solutions contain code listings. The corresponding files can be found online, together with the code from the book, at http://folk.uio.no/tom/numlinalg/code. The listings also include the corresponding file names. All code listed in the solutions is MATLAB code, but the code directory also contains a Python module numlinalg.py, which contains the main functions from the first five chapters translated to Python. The code is very similar, but the Python versions naturally take advantage of several features of the Python language. As an example, parameters in Python are passed by reference, not by value.


The reader may notice that some exercises, in particular older exam exercises, are more difficult than others. A partial explanation for this may be that all aids were permitted in older exams. The reader may also notice that some of those older exams use pseudocode, and are formulated in a way that encourages students to write loops, where one nowadays would prefer vectorized code. Again, time may be part of the explanation: the oldest exercises originated in the infancy of languages like MATLAB, when modern programming standards were not yet established.

Acknowledgments

Many people contributed to this material. Are Magnus Bruaset and Njål Foldnes contributed solutions to some of the exercises. Several others contributed exam exercises, including Nils Henrik Risebro, Ragnar Winther, Aslak Tveito, and Michael Floater. A special thanks goes to Christian Schulz, who guided the exercise sessions in the course for several years and provided many of the solutions that form the basis for this solution manual.

Oslo, Norway February 2020

Tom Lyche, Georg Muntingh, and Øyvind Ryan.

Contents

1 A Short Review of Linear Algebra (Exercises 1.1–1.13)
2 Diagonally Dominant Tridiagonal Matrices; Three Examples (Exercises 2.1–2.22)
3 Gaussian Elimination and LU Factorizations (Exercises 3.1–3.21)
4 LDL* Factorization and Positive Definite Matrices (Exercises 4.1–4.5)
5 Orthonormal and Unitary Transformations (Exercises 5.1–5.21)
6 Eigenpairs and Similarity Transformations (Exercises 6.1–6.27)
7 The Singular Value Decomposition (Exercises 7.1–7.19)
8 Matrix Norms and Perturbation Theory for Linear Systems (Exercises 8.1–8.28)
9 Least Squares (Exercises 9.1–9.22)
10 The Kronecker Product (Exercises 10.1–10.9)
11 Fast Direct Solution of a Large Linear System (Exercises 11.1–11.10)
12 The Classical Iterative Methods (Exercises 12.1–12.15)
13 The Conjugate Gradient Method (Exercises 13.1–13.17)
14 Numerical Eigenvalue Problems (Exercises 14.1–14.15)
15 The QR Algorithm (Exercise 15.1)

Listings

1.1 Recursive algorithm based on Strassen’s formulas for multiplication of two m × m matrices, with m = 2^k for some k ≥ 0.
1.2 Compute the inverse C̄ of the matrix Ā obtained from the matrix A by replacing the ith row of A by a^T.
1.3 Test Listing 1.2 by computing the maximum error for a random 4 × 4 matrix.
2.1 Compute the knot vector x and coefficient matrix C for the D2-spline expressed as piecewise polynomials in the shifted monomial basis.
2.2 Use Listing 2.1 to compute the coefficients c_{i,j} in Example 2.2.
2.3 Determine the index vector i, for which the jth component gives the location of the jth component of x.
2.4 Find the values G of the spline with knot vector x and coefficient matrix C at the points X.
2.5 Use Listing 2.1 to reproduce Figure 2.5 in the book.
2.6 Solve, for h = 0.1, 0.05, 0.025, 0.0125, the difference equation (2.36) with r = 0, f = q = 1 and boundary conditions u(0) = 1, u(1) = 0, and compute the “error” max_{1≤j≤m} |u(x_j) − v_j| for each h.
2.7 Plot the exact solution and estimated solution for the two-point boundary value problem.
3.1 Compute the inverse of a triangular matrix.
3.2 Solving a linear system with row pivoting.
3.3 Solving a linear system with row pivoting, assuming the matrix has the Hessenberg form.
3.4 Test code for Listings 3.2 and 3.3.
3.5 Computing the inverse of a matrix.
3.6 Test computing the inverse of a matrix.
3.7 Multiplying an upper triangular and a Hessenberg matrix — row-based version.
3.8 Multiplying an upper triangular and a Hessenberg matrix — column-based version.
3.9 LU1 factorization of a Hessenberg matrix.
3.10 Reversing the order in a matrix in both axes.
5.1 Compute the matrix L in the L^T L factorization of a symmetric and positive definite matrix B.
5.2 For a symmetric and positive definite matrix B, solve the system Bx = c using the L^T L factorization of B.
5.3 Compute the product B = A^T A, for a nonsingular symmetric band matrix A.
5.4 Solve the upper Hessenberg system Ax = b using rotations.
7.1 Compute the kth iterative approximation of the polar factorization A = QP of a nonsingular matrix A.
8.1 Simplify a symbolic sum using the Symbolic Math Toolbox in MATLAB.
8.2 For a full rank matrix A, use its singular value factorization to compute the least square solution to Ax = b and its spectral condition number K.
11.1 Solving the Poisson problem using a nine-point scheme.
11.2 A simple, fast solution to the biharmonic equation.
11.3 A direct solution to the biharmonic equation.
11.4 Solving the biharmonic equation and plotting the result.
11.5 A simple, fast solution to the biharmonic equation using the nine-point scheme.
11.6 A direct solution to the biharmonic equation using the nine-point scheme.
11.7 Solving the biharmonic equation and plotting the result using the nine-point scheme.
12.1 Solve Ax = b using the Gauss-Seidel method.
12.2 For n = 5, λ = 0.9 and a = 10, plot the (1, n) element f(k) of the matrix A^k for n − 1 ≤ k ≤ 200.
12.3 Find the first integer k for which f(k) < 10^−8.
13.1 Testing the steepest descent method.
13.2 Compare the steepest descent and conjugate gradient methods.
13.3 Conjugate gradient method for least squares.
13.4 A conjugate gradient-like method for antisymmetric systems.
13.5 Testing the conjugate gradient-like method for antisymmetric systems.
14.1 Given a matrix A, compute the centers s and radii r and c of the row and column Gershgorin disks — for-loops.
14.2 Given a matrix A, compute the centers s and radii r and c of the row and column Gershgorin disks — vectorized implementation.
14.3 Count the number k of eigenvalues strictly less than x of a tridiagonal matrix A = tridiag(c, d, c).
14.4 Compute a small interval around the mth eigenvalue λ_m of a tridiagonal matrix A = tridiag(c, d, c) and return the point λ in the middle of this interval.

Chapter 1

A Short Review of Linear Algebra

Exercises section 1.1

Exercise 1.1: Strassen multiplication (Exam exercise 2017-1)
(By arithmetic operations we mean additions, subtractions, multiplications and divisions.) Let A and B be n × n real matrices.

a) With A, B ∈ R^{n×n}, how many arithmetic operations are required to form the product AB?

Solution. Each entry in AB requires n multiplications and n − 1 additions, i.e., a total of 2n − 1 operations. Since there are n^2 entries to compute, the total number of operations is n^2(2n − 1) = 2n^3 − n^2.

b) Consider the 2n × 2n block matrix

[W X; Y Z] = [A B; C D] [E F; G H],

where all matrices A, ..., Z are in R^{n×n}. How many operations does it take to compute W, X, Y and Z by the obvious algorithm?

Solution. We have that W = AE + BG, so it takes 2(2n^3 − n^2) + n^2 = 4n^3 − n^2 operations to compute this. We must compute 4 such matrices, hence the total operation cost is 16n^3 − 4n^2. Note that this equals 2(2n)^3 − (2n)^2, so the number of operations in block matrix multiplication is the same as that of ordinary matrix multiplication. This should not be surprising, since block matrix multiplication computes the same multiplications as ordinary multiplication; the difference is simply that the summations are done in a different order.

c) An alternative method to compute W, X, Y and Z is to use Strassen’s formulas:

P1 = (A + D)(E + H),
P2 = (C + D)E,
P3 = A(F − H),
P4 = D(G − E),
P5 = (A + B)H,
P6 = (C − A)(E + F),
P7 = (B − D)(G + H),

W = P1 + P4 − P5 + P7,
X = P3 + P5,
Y = P2 + P4,
Z = P1 + P3 − P2 + P6.

You do not have to verify these formulas. What is the operation count for this method?

Solution. We assume here that the n × n matrix multiplications are computed directly, i.e., that Strassen multiplication is not used further recursively. P1, P6 and P7 each need 2n^3 + n^2 operations. P2, P3, P4 and P5 each need 2n^3 operations. Hence forming the P’s needs 3(2n^3 + n^2) + 4 · 2n^3 = 14n^3 + 3n^2 operations. To find the final result demands 8n^2 operations. Thus the total cost is 14n^3 + 11n^2. For large n this is clearly less than what we obtained in b).

d) Describe a recursive algorithm, based on Strassen’s formulas, which given two matrices A and B of size m × m, with m = 2^k for some k ≥ 0, calculates the product AB.

Solution.

code/strassen.m
function Z=strassen(A,B)
  [m,~]=size(A);
  if m==1
    Z=A*B; return;
  end
  one=1:m/2; two=m/2+1:m;
  P1=strassen(A(one,one)+A(two,two),B(one,one)+B(two,two));
  P2=strassen(A(two,one)+A(two,two),B(one,one));
  P3=strassen(A(one,one),B(one,two)-B(two,two));
  P4=strassen(A(two,two),B(two,one)-B(one,one));
  P5=strassen(A(one,one)+A(one,two),B(two,two));
  P6=strassen(A(two,one)-A(one,one),B(one,one)+B(one,two));
  P7=strassen(A(one,two)-A(two,two),B(two,one)+B(two,two));
  Z=[P1+P4-P5+P7,P3+P5; P2+P4,P1+P3-P2+P6];
Listing 1.1: Recursive algorithm based on Strassen’s formulas for multiplication of two m × m matrices, with m = 2^k for some k ≥ 0.
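As a quick sanity check (not part of the original exam solution), the recursive implementation can be compared with MATLAB’s built-in product, assuming strassen.m from Listing 1.1 is on the path:

A = rand(8); B = rand(8);                  % m = 2^3
err = max(max(abs(strassen(A,B) - A*B)))   % should be of the order of machine precision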


e) Show that the operation count of the recursive algorithm is O(m^{log2(7)}). Note that log2(7) ≈ 2.8 < 3, so this is less costly than straightforward matrix multiplication.

Solution. Let s_k be the number of operations in recursive Strassen multiplication of two matrices of size 2^k. Then s_{k+1} = 7 s_k + 18 · 2^{2k}, since one Strassen multiplication is split into 7 Strassen multiplications of size 2^k × 2^k, and 18 additions/subtractions of matrices of size 2^k × 2^k. A particular solution to this difference equation is easily found as s_k^{(p)} = −6 · 4^k, and the general solution to the homogeneous equation is clearly s_k^{(h)} = γ · 7^k, where γ is a constant. We thus obtain the general solution s_k = γ · 7^k − 6 · 4^k. Here γ · 7^k is the dominating term (note that γ > 0 is clear, since we otherwise would have that s_k < 0). Since m = 2^k, 7^k = 2^{k log2 7} = 2^{(log2 m)(log2 7)} = m^{log2(7)}. The result follows.
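For illustration (this is not part of the book’s solution), one can iterate the recurrence from the solution above and compare it with the classical count 2m^3 − m^2; the fully recursive Strassen count only drops below the classical one for fairly large m (around m = 1024 in this sketch):

s = 1;                              % s_0: one multiplication for a 1 x 1 product
for k = 0:9
  m = 2^(k+1);
  s = 7*s + 18*4^k;                 % s_{k+1} = 7*s_k + 18*2^(2k)
  fprintf('m = %5d   Strassen: %12d   classical: %12d\n', m, s, 2*m^3 - m^2);
end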

Exercises section 1.3

Exercise 1.2: The inverse of a general 2 × 2 matrix
Show that

[a b; c d]^{-1} = α [d −b; −c a],  α = 1/(ad − bc),

for any a, b, c, d such that ad − bc ≠ 0.

Solution. A straightforward computation yields

(1/(ad − bc)) [d −b; −c a] [a b; c d] = (1/(ad − bc)) [ad − bc 0; 0 ad − bc] = [1 0; 0 1],

showing that the two matrices are inverse to each other.
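As a quick numerical sanity check (not from the book), the formula can be compared with MATLAB’s inv for random entries:

a = randn; b = randn; c = randn; d = randn;   % ad - bc is nonzero with probability 1
alpha = 1/(a*d - b*c);
err = norm(inv([a b; c d]) - alpha*[d -b; -c a])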


Exercise 1.3: The inverse of a special 2 × 2 matrix
Find the inverse of

A = [cos θ  −sin θ; sin θ  cos θ].

Solution. By Exercise 1.2, and using that cos^2 θ + sin^2 θ = 1, the inverse is given by

A^{-1} = [cos θ  sin θ; −sin θ  cos θ].

Exercise 1.4: Sherman-Morrison formula
Suppose A ∈ C^{n×n}, and B, C ∈ R^{n×m} for some n, m ∈ N. If (I + C^T A^{-1} B)^{-1} exists then

(A + BC^T)^{-1} = A^{-1} − A^{-1} B(I + C^T A^{-1} B)^{-1} C^T A^{-1}.

Solution. A direct computation yields

(A + BC^T)(A^{-1} − A^{-1} B(I + C^T A^{-1} B)^{-1} C^T A^{-1})
= I − B(I + C^T A^{-1} B)^{-1} C^T A^{-1} + BC^T A^{-1} − BC^T A^{-1} B(I + C^T A^{-1} B)^{-1} C^T A^{-1}
= I + BC^T A^{-1} − B(I + C^T A^{-1} B)(I + C^T A^{-1} B)^{-1} C^T A^{-1}
= I + BC^T A^{-1} − BC^T A^{-1}
= I,

showing that the two matrices are inverse to each other.

Exercise 1.5: Inverse update (Exam exercise 1977-1)
a) Let u, v ∈ R^n and suppose v^T u ≠ 1. Show that I − uv^T has an inverse given by I − τ uv^T, where τ := 1/(v^T u − 1).

Solution. We have (I − uv^T)^{-1} = I − τ uv^T since

(I − uv^T)(I − τ uv^T) = I − uv^T − τ uv^T + τ uv^T uv^T = I − (1 + τ − τ v^T u)uv^T = I.
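The identity in a) (and likewise the Sherman-Morrison formula of Exercise 1.4) is easy to verify numerically; a minimal check with randomly chosen vectors:

u = rand(5,1); v = rand(5,1);       % v'*u differs from 1 with probability 1
tau = 1/(v'*u - 1);
err = norm(inv(eye(5) - u*v') - (eye(5) - tau*(u*v')))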



b) Let $A \in \mathbb{R}^{n\times n}$ be nonsingular with inverse $C := A^{-1}$, and let $a \in \mathbb{R}^n$. Let $\bar A$ be the matrix which differs from $A$ by replacing the $i$th row of $A$ with $a^T$, i.e., $\bar A = A - e_i(e_i^TA - a^T)$, where $e_i$ is the $i$th column in the identity matrix $I$. Show that if
\[
\lambda := a^TCe_i \neq 0 \tag{1.20}
\]
then $\bar A$ has an inverse $\bar C = \bar A^{-1}$ given by
\[
\bar C = C\Bigl(I + \frac{1}{\lambda}e_i(e_i^T - a^TC)\Bigr). \tag{1.21}
\]

Solution. One has
\[
\bar A = A - e_i(e_i^TA - a^T) = \bigl(I - e_i(e_i^T - a^TC)\bigr)A = (I - uv^T)A,
\]
where $u := e_i$ and $v := e_i - C^Ta$. We find
\[
v^Tu = 1 - a^TCe_i = 1 - \lambda \neq 1 \iff \lambda \neq 0,
\qquad
\tau = 1/(v^Tu - 1) = 1/(1 - \lambda - 1) = -1/\lambda,
\]
\[
\bar C = A^{-1}(I - uv^T)^{-1} = C\Bigl(I + \frac{1}{\lambda}uv^T\Bigr),
\]
and (1.21) follows.

c) Write an algorithm which to given $C$ and $a$ checks if (1.20) holds and computes $\bar C$ provided $\lambda \neq 0$.
Hint. Use (1.21) to find formulas for computing each column in $\bar C$.

Solution. Let $c_j$ and $\bar c_j$, $j = 1, \ldots, n$, be the columns of $C$ and $\bar C$, respectively. We find
\[
\bar c_i = \bar Ce_i = C\Bigl(I + \frac{1}{\lambda}uv^T\Bigr)e_i = C\Bigl(e_i + \frac{1}{\lambda}e_i(1 - \lambda)\Bigr) = \frac{1}{\lambda}c_i.
\]
If $j \neq i$ then $e_i^Te_j = 0$ and so
\[
\bar c_j = \bar Ce_j = C\Bigl(I + \frac{1}{\lambda}e_i(e_i^T - a^TC)\Bigr)e_j = C\Bigl(e_j - \frac{1}{\lambda}e_ia^TCe_j\Bigr).
\]
In other words, all columns in $\bar C$ except the $i$th column coincide with those of
\[
C\Bigl(I - \frac{1}{\lambda}e_ia^TC\Bigr) = C - \frac{1}{\lambda}Ce_ia^TC = C - \frac{1}{\lambda}c_ia^TC.
\]



When computing the product $c_ia^TC$, it is more efficient to compute $a^TC$ first, and afterwards multiply with $c_i$ from the left. It is easily checked that the other multiplication order yields an operation count of order $n^3$. This gives no benefit for a rank one update of the inverse over a direct computation of the inverse of $\bar A$: such a direct implementation is typically based on Gaussian elimination, which we will see in Chapter 3 also has an operation count of order $n^3$. Computing $a^TC$ first actually gives an operation count of order $n^2$, which we will now show. An implementation needs to compute and store $a^TC$, and this is the only vector requiring extra storage. The rest of the algorithm stores the result directly into the input memory (i.e., $C$). This is possible since the computation of $\bar c_j$ with $j \neq i$ does not affect $\bar c_i$. Note that the $i$th entry in $a^TC$ is $\lambda$. As shown above, the $i$th column in $C$ simply needs to be divided by this number. In fact, $\bar c_i$ can overwrite $c_i$ at the start of the algorithm. The operations we need are as follows.

1. Computing $a^TC$ needs $n(2n - 1) = 2n^2 - n$ operations.
2. Computing $\frac{1}{\lambda}c_i$ needs $n$ divisions.
3. Computing $n - 1$ columns in $c_ia^TC$ needs $n(n - 1) = n^2 - n$ multiplications.
4. Computing $n - 1$ columns in $C - c_ia^TC$ needs $n(n - 1) = n^2 - n$ subtractions.

Adding these, the total number of operations is $4n^2 - 2n$. An implementation can be found in Listing 1.2 (code/inverseupdate.m).

function C=inverseupdate(C,a,i)
  res = a'*C;
  if res(i)~=0
    C(:,i) = C(:,i)/res(i);
    inds = 1:size(C,1); inds(i)=[];
    C(:,inds) = C(:,inds) - C(:,i)*res(inds);
  end
end

Listing 1.2: Compute the inverse C̄ of the matrix Ā obtained from the matrix A by replacing the ith row by aᵀ.

Listing 1.3 tests this implementation on a random 4 × 4 matrix (a random matrix is nonsingular with high probability. Why?). The code compares the result with a direct computation of the inverse (code/test_inverseupdate.m):

A=rand(4);
C=inv(A);
i=2;
a=rand(4,1);
Abar=A; Abar(i,:)=a';
max(max(abs(inverseupdate(C,a,i)-inv(Abar))))

Listing 1.3: Test Listing 1.2 by computing the maximum error for a random 4 × 4 matrix.

Exercise 1.6: Matrix products (Exam exercise 2009-1)
Let $A, B, C, E \in \mathbb{R}^{n\times n}$ be matrices where $A^T = A$. In this exercise an (arithmetic) operation is an addition or a multiplication. We ask about exact numbers of operations.

a) How many operations are required to compute the matrix product BC? How many operations are required if B is lower triangular?

Solution. For each of the $n^2$ elements in $BC$ we have to compute an inner product of length $n$. This requires $n$ multiplications and $n - 1$ additions. Therefore computing $BC$ requires $n^2(2n - 1) = 2n^3 - n^2$ operations. If $B$ is lower triangular then row $k$ of $B$ contains $k$ non-zero elements, $k = 1, \ldots, n$. Therefore, computing an element in the $k$th row of $BC$ requires $k$ multiplications and $k - 1$ additions. Hence in total we need $n\sum_{k=1}^n (2k - 1) = n^3$ operations.

b) Show that there exists a lower triangular matrix $L \in \mathbb{R}^{n\times n}$ such that $A = L + L^T$.

Solution. We have $A = A_L + A_D + A_R$, where $A_L$ is lower triangular with 0 on the diagonal, $A_D = \mathrm{diag}(a_{11}, \ldots, a_{nn})$, and $A_R$ is upper triangular with 0 on the diagonal. Since $A^T = A$, we have $A_R = A_L^T$. If we let $L := A_L + \frac{1}{2}A_D$ we obtain $A = L + L^T$.

c) We have $E^TAE = S + S^T$ where $S = E^TLE$. How many operations are required to compute $E^TAE$ in this way?

Solution. We need $n$ operations to compute the diagonal in $L$. We need $n^3$ operations to compute $LE$ and after that $2n^3 - n^2$ operations to compute $E^T(LE)$. Finally we need $n^2$ operations to compute the sum $S + S^T$. This totals about $3n^3$ operations. Direct computation of $E^TAE$ requires $4n^3 - 2n^2$ operations.
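A small numerical check of the splitting in b) and c) (a minimal sketch with arbitrary test matrices):

n = 5;
A = rand(n); A = A + A';            % symmetric test matrix
E = rand(n);
L = tril(A,-1) + diag(diag(A))/2;   % A = L + L'
S = E'*(L*E);
max(max(abs((S + S') - E'*A*E)))    % should be of the order of machine precision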



Exercises section 1.4

Exercise 1.7: Cramer's rule; special case
Solve the following system by Cramer's rule:
\[
\begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 3 \\ 6 \end{pmatrix}.
\]

Solution. Cramer's rule yields
\[
x_1 = \begin{vmatrix} 3 & 2 \\ 6 & 1 \end{vmatrix} \Big/ \begin{vmatrix} 1 & 2 \\ 2 & 1 \end{vmatrix} = 3,
\qquad
x_2 = \begin{vmatrix} 1 & 3 \\ 2 & 6 \end{vmatrix} \Big/ \begin{vmatrix} 1 & 2 \\ 2 & 1 \end{vmatrix} = 0.
\]
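The same computation in MATLAB (a minimal check of the answer above):

A = [1 2; 2 1]; b = [3; 6];
x1 = det([b A(:,2)])/det(A)     % = 3
x2 = det([A(:,1) b])/det(A)     % = 0
A\b                             % agrees with [x1; x2]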

Exercise 1.8: Adjoint matrix; special case
Show that if
\[
A = \begin{pmatrix} 2 & -6 & 3 \\ 3 & -2 & -6 \\ 6 & 3 & 2 \end{pmatrix},
\quad\text{then}\quad
\mathrm{adj}(A) = \begin{pmatrix} 14 & 21 & 42 \\ -42 & -14 & 21 \\ 21 & -42 & 14 \end{pmatrix}.
\]
Moreover,
\[
\mathrm{adj}(A)A = \begin{pmatrix} 343 & 0 & 0 \\ 0 & 343 & 0 \\ 0 & 0 & 343 \end{pmatrix} = \det(A)I.
\]

Solution. Computing the cofactors of A gives
\[
\mathrm{adj}(A)^T =
\begin{pmatrix}
(-1)^{1+1}\begin{vmatrix} -2 & -6 \\ 3 & 2 \end{vmatrix} &
(-1)^{1+2}\begin{vmatrix} 3 & -6 \\ 6 & 2 \end{vmatrix} &
(-1)^{1+3}\begin{vmatrix} 3 & -2 \\ 6 & 3 \end{vmatrix} \\[2mm]
(-1)^{2+1}\begin{vmatrix} -6 & 3 \\ 3 & 2 \end{vmatrix} &
(-1)^{2+2}\begin{vmatrix} 2 & 3 \\ 6 & 2 \end{vmatrix} &
(-1)^{2+3}\begin{vmatrix} 2 & -6 \\ 6 & 3 \end{vmatrix} \\[2mm]
(-1)^{3+1}\begin{vmatrix} -6 & 3 \\ -2 & -6 \end{vmatrix} &
(-1)^{3+2}\begin{vmatrix} 2 & 3 \\ 3 & -6 \end{vmatrix} &
(-1)^{3+3}\begin{vmatrix} 2 & -6 \\ 3 & -2 \end{vmatrix}
\end{pmatrix}
=
\begin{pmatrix} 14 & 21 & 42 \\ -42 & -14 & 21 \\ 21 & -42 & 14 \end{pmatrix}^T.
\]
One checks directly that $\mathrm{adj}(A)A = \det(A)I$, with $\det(A) = 343$.



Exercise 1.9: Determinant equation for a plane
Show that
\[
\begin{vmatrix}
x & y & z & 1 \\
x_1 & y_1 & z_1 & 1 \\
x_2 & y_2 & z_2 & 1 \\
x_3 & y_3 & z_3 & 1
\end{vmatrix} = 0
\]
is the equation for a plane through three points $(x_1, y_1, z_1)$, $(x_2, y_2, z_2)$ and $(x_3, y_3, z_3)$ in space.

Solution. Let $ax + by + cz + d = 0$ be an equation for a plane through the points $(x_i, y_i, z_i)$, $i = 1, 2, 3$. There is precisely one such plane if and only if the points are not collinear. Then $ax_i + by_i + cz_i + d = 0$ for $i = 1, 2, 3$, so that
\[
\begin{pmatrix}
x & y & z & 1 \\
x_1 & y_1 & z_1 & 1 \\
x_2 & y_2 & z_2 & 1 \\
x_3 & y_3 & z_3 & 1
\end{pmatrix}
\begin{pmatrix} a \\ b \\ c \\ d \end{pmatrix}
=
\begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}.
\]
Since the coefficients $a, b, c, d$ of the plane are not all zero, the above matrix is singular, implying that its determinant is zero. Computing this determinant by cofactor expansion of the first row gives the equation
\[
\begin{vmatrix} y_1 & z_1 & 1 \\ y_2 & z_2 & 1 \\ y_3 & z_3 & 1 \end{vmatrix} x
- \begin{vmatrix} x_1 & z_1 & 1 \\ x_2 & z_2 & 1 \\ x_3 & z_3 & 1 \end{vmatrix} y
+ \begin{vmatrix} x_1 & y_1 & 1 \\ x_2 & y_2 & 1 \\ x_3 & y_3 & 1 \end{vmatrix} z
- \begin{vmatrix} x_1 & y_1 & z_1 \\ x_2 & y_2 & z_2 \\ x_3 & y_3 & z_3 \end{vmatrix} = 0
\]
of the plane.

Exercise 1.10: Signed area of a triangle
Let $P_i = (x_i, y_i)$, $i = 1, 2, 3$, be three points in the plane defining a triangle $T$. Show that the area of $T$ is
\[
A(T) = \frac{1}{2}\begin{vmatrix} 1 & 1 & 1 \\ x_1 & x_2 & x_3 \\ y_1 & y_2 & y_3 \end{vmatrix}.
\]
The area is positive if we traverse the vertices in counterclockwise order.
Hint. One has $A(T) = A(ABP_3P_1) + A(P_3BCP_2) - A(P_1ACP_2)$; cf. Figure 1.1.

Solution. Let $T$ denote the triangle with vertices $P_1, P_2, P_3$. Since the area of a triangle is invariant under translation, we can assume $P_1 = A = (0, 0)$, $P_2 = (x_2, y_2)$, $P_3 = (x_3, y_3)$, $B = (x_3, 0)$, and $C = (x_2, 0)$. As is clear from Figure 1.1,



the area $A(T)$ can be expressed as
\begin{align*}
A(T) &= A(ABP_3) + A(P_3BCP_2) - A(ACP_2) \\
&= \frac{1}{2}x_3y_3 + (x_2 - x_3)y_2 + \frac{1}{2}(x_2 - x_3)(y_3 - y_2) - \frac{1}{2}x_2y_2 \\
&= \frac{1}{2}\begin{vmatrix} 1 & 1 & 1 \\ 0 & x_2 & x_3 \\ 0 & y_2 & y_3 \end{vmatrix},
\end{align*}
which is what needed to be shown.

Figure 1.1: The triangle T defined by the three points P1, P2 and P3 (with the auxiliary points A, B and C on the x-axis).
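A small numerical illustration of the formula (a sketch with arbitrary counterclockwise test points):

P1 = [0 0]; P2 = [4 0]; P3 = [0 3];               % counterclockwise ordering
T = [1 1 1; P1(1) P2(1) P3(1); P1(2) P2(2) P3(2)];
det(T)/2                                          % = 6, positive for this ordering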

Exercise 1.11: Vandermonde matrix
Show that
\[
\begin{vmatrix}
1 & x_1 & x_1^2 & \cdots & x_1^{n-1} \\
1 & x_2 & x_2^2 & \cdots & x_2^{n-1} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & x_n & x_n^2 & \cdots & x_n^{n-1}
\end{vmatrix}
= \prod_{i>j}(x_i - x_j),
\]
where $\prod_{i>j}(x_i - x_j) = \prod_{i=2}^n (x_i - x_1)(x_i - x_2)\cdots(x_i - x_{i-1})$. This determinant is called the Vandermonde determinant.
Hint. Subtract $x_n$ times column $k$ from column $k+1$ for $k = n-1, n-2, \ldots, 1$.

Solution. For any $n = 1, 2, \ldots$, let
\[
D_n := \begin{vmatrix}
1 & x_1 & x_1^2 & \cdots & x_1^{n-1} \\
1 & x_2 & x_2^2 & \cdots & x_2^{n-1} \\
1 & x_3 & x_3^2 & \cdots & x_3^{n-1} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & x_n & x_n^2 & \cdots & x_n^{n-1}
\end{vmatrix}
\]
be the determinant of the Vandermonde matrix in the exercise. Clearly the formula
\[
D_n = \prod_{1 \le j < i \le n}(x_i - x_j) \tag{1.i}
\]

[...] for $i > j + 1$, $j = p, \ldots, n-1$, we see that $h_{i,j} = 0$ for $i > j + 1$, so that $H$ is upper Hessenberg.

b) Describe briefly how many multiplications/divisions are required to find the L1U factorization of $H$?

Solution. The number of operations is $O(n^2)$, both to find $L^{-1}b$ and for Gaussian elimination on $H$.

c) Suppose we have found the L1U factorization $H = L_HU_H$ of $H$. Explain how we can find the L1U factorization of $B$ from $L_H$ and $U_H$.

Solution. Since $B = LH = LL_HU_H$ we only need to compute the product $LL_H$.



Exercise 3.20: U1L factorization (Exam exercise 1990-1)
We say that $A \in \mathbb{R}^{n\times n}$ has a U1L factorization if $A = UL$ for an upper triangular matrix $U \in \mathbb{R}^{n\times n}$ with ones on the diagonal and a lower triangular $L \in \mathbb{R}^{n\times n}$. A UL and the more common LU factorization are analogous, but normally not the same.

a) Find a U1L factorization of the matrix
\[
A := \begin{pmatrix} -3 & -2 \\ 4 & 2 \end{pmatrix}.
\]

Solution. Comparing elements in
\[
\begin{pmatrix} -3 & -2 \\ 4 & 2 \end{pmatrix}
= \begin{pmatrix} 1 & u_{12} \\ 0 & 1 \end{pmatrix}\begin{pmatrix} l_{11} & 0 \\ l_{21} & l_{22} \end{pmatrix}
= \begin{pmatrix} l_{11} + u_{12}l_{21} & u_{12}l_{22} \\ l_{21} & l_{22} \end{pmatrix}
\]
we find $l_{21} = a_{21} = 4$, $l_{22} = a_{22} = 2$, $u_{12} = a_{12}/l_{22} = -1$ and $l_{11} = a_{11} - u_{12}l_{21} = 1$. We obtain
\[
\begin{pmatrix} -3 & -2 \\ 4 & 2 \end{pmatrix}
= \begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 4 & 2 \end{pmatrix}.
\]

b) Let the columns of $P \in \mathbb{R}^{n\times n}$ be the unit vectors in reverse order, i.e., $P := [e_n, e_{n-1}, \ldots, e_1]$. Show that $P^T = P$ and $P^2 = I$. What is the connection between the elements in $A$ and $PA$?

Solution. We have $p_{i,n-i+1} = 1$ for $i = 1, \ldots, n$ and $p_{i,j} = 0$ otherwise. Clearly, $p_{i,j} = p_{j,i}$ for all $i, j$, so $P^T = P$. Now
\[
P^2 = P^TP = \begin{pmatrix} e_n^T \\ \vdots \\ e_1^T \end{pmatrix}\bigl[e_n \cdots e_1\bigr]
= \bigl[e_{n-i+1}^Te_{n-j+1}\bigr]_{i,j} = [\delta_{ij}]_{i,j} = I.
\]
The rows of $PA$ consist of the rows in $A$ in reverse order.

c) Let $B := PAP$. Find integers $r, s$, depending on $i, j, n$, such that $b_{i,j} = a_{r,s}$.

Solution. We find
\[
(PAP)(i,j) = e_{n-i+1}^TAe_{n-j+1} = a_{n-i+1,n-j+1},
\]
so that $r = n - i + 1$ and $s = n - j + 1$. If $A$ is lower triangular then $B$ is upper triangular.



d) Make a detailed algorithm which to given $A \in \mathbb{R}^{n\times n}$ determines $B := PAP$. The elements $b_{i,j}$ in $B$ should be stored in position $i, j$ in $A$. You should not use other matrices than $A$ and a scalar $w \in \mathbb{R}$.

Solution. This is shown in Listing 3.10 (code/pap_compute.m); a more detailed, element-by-element version that only uses the scalar w is sketched after the listing.

A = A(end:(-1):1,end:(-1):1)

Listing 3.10: Reversing the order in a matrix in both axes.
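Since the exercise asks for a detailed algorithm using only A and one scalar w, one could also swap the entries explicitly in place; the following is a sketch (the function name pap_inplace is not from the book):

function A = pap_inplace(A)
% In-place computation of B = P*A*P: entry (i,j) is swapped with entry
% (n-i+1, n-j+1), using only the scalar w as temporary storage.
n = size(A,1);
for i = 1:floor(n/2)
  for j = 1:n
    w = A(i,j);
    A(i,j) = A(n-i+1,n-j+1);
    A(n-i+1,n-j+1) = w;
  end
end
if mod(n,2) == 1                 % for odd n, reverse the first half of the middle row
  i = (n+1)/2;
  for j = 1:floor(n/2)
    w = A(i,j);
    A(i,j) = A(i,n-j+1);
    A(i,n-j+1) = w;
  end
end
end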

e) Let $PAP = MR$ be an L1U factorization of $PAP$, i.e., $M$ is lower triangular with ones on the diagonal and $R$ is upper triangular. Express the matrices $U$ and $L$ in a U1L factorization of $A$ in terms of $M$, $R$ and $P$.

Solution. Since $PAP = MR$ and $P^{-1} = P$ we obtain $A = (PMP)(PRP) =: UL$, where $U$ is upper triangular with ones on the diagonal and $L$ is lower triangular.

f) Give necessary and sufficient conditions for a matrix to have a unique U1L factorization.

Solution. We first show that $A$ has a unique U1L factorization if and only if $B := PAP$ has a unique L1U factorization. Suppose $A$ has a unique U1L factorization $A = UL$. Then $B = MR$, where $M := PUP$ and $R := PLP$, is an L1U factorization of $B$. We need to show that $M$ and $R$ are unique. Suppose $B = M_1R_1 = M_2R_2$ are two L1U factorizations of $B$. Then
\[
A = U_1L_1 = U_2L_2, \qquad U_i = PM_iP, \quad L_i = PR_iP, \quad i = 1, 2.
\]
Since $A$ has a unique U1L factorization and $P^{-1} = P$ we have
\[
M_1 = PU_1P = PU_2P = M_2, \qquad R_1 = PL_1P = PL_2P = R_2,
\]
and uniqueness follows. The converse is analogous.
We can now show that a matrix has a unique U1L factorization if and only if the matrices
\[
P_kB^{[k]}P_k :=
\begin{pmatrix}
a_{n-k+1,n-k+1} & \cdots & a_{n-k+1,n} \\
\vdots & & \vdots \\
a_{n,n-k+1} & \cdots & a_{n,n}
\end{pmatrix}
\]
are nonsingular for $k = 1, \ldots, n-1$. Here $P_k := [e_k, \ldots, e_1]$ with $e_j$ the unit vectors in $\mathbb{R}^k$, $j = 1, \ldots, k$, and
\[
B^{[k]} := \begin{pmatrix} b_{11} & \cdots & b_{1k} \\ \vdots & & \vdots \\ b_{k1} & \cdots & b_{kk} \end{pmatrix}.
\]
Indeed, $A$ has a unique U1L factorization if and only if $B = PAP$ has a unique L1U factorization, which by the LU theorem holds if and only if $B^{[k]}$ is nonsingular for $k = 1, \ldots, n-1$, which holds if and only if $P_kB^{[k]}P_k$ is nonsingular for $k = 1, \ldots, n-1$.

Exercises section 3.6

Exercise 3.21: Making block LU into LU
Show that $\hat L$ is unit lower triangular and $\hat U$ is upper triangular.

Solution. We can write a block LU factorization of A as
\[
A = LU = \begin{pmatrix} I & 0 & \cdots & 0 \\ L_{21} & I & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ L_{m1} & L_{m2} & \cdots & I \end{pmatrix}
\begin{pmatrix} U_{11} & U_{12} & \cdots & U_{1m} \\ 0 & U_{22} & \cdots & U_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & U_{mm} \end{pmatrix}
\]
(i.e., the blocks are denoted by $L_{ij}$, $U_{ij}$). We now assume that each $U_{ii}$ has an LU factorization $U_{ii} = \tilde L_{ii}\tilde U_{ii}$ ($\tilde L_{ii}$ unit lower triangular, $\tilde U_{ii}$ upper triangular), and define $\hat L := L\,\mathrm{diag}(\tilde L_{ii})$, $\hat U := \mathrm{diag}(\tilde L_{ii}^{-1})U$. We get that
\[
\hat L = \begin{pmatrix} I & 0 & \cdots & 0 \\ L_{21} & I & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ L_{m1} & L_{m2} & \cdots & I \end{pmatrix}
\begin{pmatrix} \tilde L_{11} & 0 & \cdots & 0 \\ 0 & \tilde L_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \tilde L_{mm} \end{pmatrix}
= \begin{pmatrix} \tilde L_{11} & 0 & \cdots & 0 \\ L_{21}\tilde L_{11} & \tilde L_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ L_{m1}\tilde L_{11} & L_{m2}\tilde L_{22} & \cdots & \tilde L_{mm} \end{pmatrix}.
\]
This shows that $\hat L$ has the blocks $\tilde L_{ii}$ on the diagonal, and since these are unit lower triangular, it follows that also $\hat L$ is unit lower triangular. Also,
\[
\hat U := \mathrm{diag}(\tilde L_{ii}^{-1})U
= \begin{pmatrix} \tilde L_{11}^{-1}U_{11} & \tilde L_{11}^{-1}U_{12} & \cdots & \tilde L_{11}^{-1}U_{1m} \\ 0 & \tilde L_{22}^{-1}U_{22} & \cdots & \tilde L_{22}^{-1}U_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \tilde L_{mm}^{-1}U_{mm} \end{pmatrix}
= \begin{pmatrix} \tilde U_{11} & \tilde L_{11}^{-1}U_{12} & \cdots & \tilde L_{11}^{-1}U_{1m} \\ 0 & \tilde U_{22} & \cdots & \tilde L_{22}^{-1}U_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \tilde U_{mm} \end{pmatrix},
\]
where we inserted $U_{ii} = \tilde L_{ii}\tilde U_{ii}$. This shows that $\hat U$ has the blocks $\tilde U_{ii}$ on the diagonal, and since these are upper triangular, it follows that also $\hat U$ is upper triangular.

Chapter 4

LDL* Factorization and Positive Definite Matrices

Exercises section 4.2

Exercise 4.1: Positive definite characterizations
Show directly that all 4 characterizations in Theorem 4.4 hold for the matrix
\[
A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}.
\]

Solution. We check the equivalent statements of Theorem 4.4 for the matrix A.
1. Obviously $A$ is symmetric. In addition $A$ is positive definite, because
\[
\begin{pmatrix} x & y \end{pmatrix}\begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix}
= 2x^2 + 2xy + 2y^2 = (x + y)^2 + x^2 + y^2 > 0
\]
for any nonzero vector $[x, y]^T \in \mathbb{R}^2$.
2. The eigenvalues of $A$ are the roots of the characteristic equation $0 = \det(A - \lambda I) = (2 - \lambda)^2 - 1 = (\lambda - 1)(\lambda - 3)$. Hence the eigenvalues are $\lambda = 1$ and $\lambda = 3$, which are both positive.
3. The leading principal submatrices of $A$ are $[2]$ and $A$ itself, which have positive determinants 2 and 3.
4. If we assume as in a Cholesky factorization that $B$ is lower triangular, then
\[
BB^T = \begin{pmatrix} b_{11} & 0 \\ b_{21} & b_{22} \end{pmatrix}\begin{pmatrix} b_{11} & b_{21} \\ 0 & b_{22} \end{pmatrix}
= \begin{pmatrix} b_{11}^2 & b_{11}b_{21} \\ b_{11}b_{21} & b_{21}^2 + b_{22}^2 \end{pmatrix}
= \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}.
\]
Since $b_{11}^2 = 2$ we can choose $b_{11} = \sqrt{2}$. Using this, $b_{11}b_{21} = 1$ gives $b_{21} = 1/\sqrt{2}$, and $b_{21}^2 + b_{22}^2 = 2$ finally gives $b_{22} = \sqrt{2 - 1/2} = \sqrt{3/2}$ (we chose the positive square root). This means that we can choose
\[
B = \begin{pmatrix} \sqrt{2} & 0 \\ 1/\sqrt{2} & \sqrt{3/2} \end{pmatrix}.
\]
This could also have been obtained by writing down an LDL* factorization (as in the proof of its existence), and then multiplying in the square root of the diagonal matrix.
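The characterizations can also be checked numerically (a small sketch using built-in MATLAB functions):

A = [2 1; 1 2];
eig(A)                  % eigenvalues 1 and 3, both positive
B = chol(A,'lower');    % lower triangular Cholesky factor
B*B' - A                % zero up to rounding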

Exercise 4.2: L1U factorization (Exam exercise 1982-1)
Find the L1U factorization of the following matrix $A \in \mathbb{R}^{n\times n}$
\[
A = \begin{pmatrix}
1 & -1 & 0 & \cdots & 0 \\
-1 & 2 & -1 & \ddots & \vdots \\
0 & \ddots & \ddots & \ddots & 0 \\
\vdots & \ddots & -1 & 2 & -1 \\
0 & \cdots & 0 & -1 & 2
\end{pmatrix}.
\]
Is $A$ positive definite?

Solution. By (2.16) we find
\[
u_1 = 1, \qquad l_k = -1, \qquad u_{k+1} = 1, \qquad k = 1, \ldots, n-1,
\]
so that
\[
L = \begin{pmatrix} 1 & & & \\ -1 & 1 & & \\ & \ddots & \ddots & \\ & & -1 & 1 \end{pmatrix},
\qquad
U = \begin{pmatrix} 1 & -1 & & \\ & \ddots & \ddots & \\ & & 1 & -1 \\ & & & 1 \end{pmatrix}.
\]
Since $U^T = L$ and $L$ has positive diagonal elements, the L1U factorization is also a Cholesky factorization. Then it follows from Theorem 4.2 that $A$ is positive definite.

Exercise 4.3: A counterexample
In the non-symmetric case a nonsingular positive semidefinite matrix is not necessarily positive definite. Show this by considering the matrix
\[
A := \begin{pmatrix} 1 & 0 \\ -2 & 1 \end{pmatrix}.
\]

Solution. For any vector $x = [x, y]^T \in \mathbb{C}^2$, one obtains
\[
x^TAx = \begin{pmatrix} x & y \end{pmatrix}\begin{pmatrix} 1 & 0 \\ -2 & 1 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix}
= \begin{pmatrix} x & y \end{pmatrix}\begin{pmatrix} x \\ y - 2x \end{pmatrix}
= x^2 - 2xy + y^2 = (x - y)^2 \geq 0,
\]



showing that $A$ is positive semi-definite. Equality holds precisely when $x = y$, and hence there exist nonzero vectors $x$ satisfying $x^TAx = 0$, showing that $A$ is not positive definite.

Exercise 4.4: Cholesky update (Exam exercise 2015-2)
a) Let $E \in \mathbb{R}^{n\times n}$ be of the form $E = I + uu^T$, where $u \in \mathbb{R}^n$. Show that $E$ is symmetric and positive definite, and find an expression for $E^{-1}$.
Hint. The matrix $E^{-1}$ is of the form $E^{-1} = I + auu^T$ for some $a \in \mathbb{R}$.

Solution. We have that $E^T = (I + uu^T)^T = I^T + (uu^T)^T = I + uu^T = E$, and
\[
x^TEx = x^T(I + uu^T)x = x^Tx + x^Tuu^Tx = \|x\|^2 + (x^Tu)^2 > 0 \quad\text{for } x \neq 0,
\]
so that $E$ is symmetric and positive definite. Using the hint we compute
\[
(I + auu^T)(I + uu^T) = I + (1 + a)uu^T + auu^Tuu^T = I + (1 + a + a\|u\|^2)uu^T.
\]
This equals $I$ if $1 + a + a\|u\|^2 = 0$, i.e., if $a = -1/(1 + \|u\|^2)$. This shows that
\[
E^{-1} = I - \frac{1}{1 + \|u\|^2}uu^T.
\]

b) Let $A \in \mathbb{R}^{n\times n}$ be of the form $A = B + uu^T$, where $B \in \mathbb{R}^{n\times n}$ is symmetric and positive definite, and $u \in \mathbb{R}^n$. Show that $A$ can be decomposed as $A = L(I + vv^T)L^T$, where $L$ is nonsingular and lower triangular, and $v \in \mathbb{R}^n$.

Solution. Since $B$ is symmetric and positive definite it has a Cholesky factorization $B = LL^T$. We have that
\[
L(I + vv^T)L^T = LL^T + Lvv^TL^T = B + Lv(Lv)^T.
\]
If we now choose $v$ so that $Lv = u$ (this is possible since $L$ is nonsingular), this equals $B + uu^T = A$, and this shows that $A$ can be written in the desired form.

c) Assume that the Cholesky decomposition of $B$ is already computed. Outline a procedure to solve the system $Ax = b$, where $A$ is of the form above.



Solution. We first find a $v$ so that $A = L(I + vv^T)L^T$ (by solving $Lv = u$, which is a lower triangular system). Then we solve $Lz = b$ (lower triangular system), then $(I + vv^T)w = z$ (where we can use a), where we found an expression for $(I + vv^T)^{-1}$), and finally $L^Tx = w$ (upper triangular system).

Exercise 4.5: Cholesky update (Exam exercise 2016-2)
Let $A \in \mathbb{R}^{n\times n}$ be a symmetric positive definite matrix with a known Cholesky factorization $A = LL^T$. Furthermore, let $A_+$ be a corresponding $(n+1) \times (n+1)$ matrix of the form
\[
A_+ = \begin{pmatrix} A & a \\ a^T & \alpha \end{pmatrix},
\]
where $a$ is a vector in $\mathbb{R}^n$, and $\alpha$ is a real number. We assume that the matrix $A_+$ is symmetric positive definite.

a) Show that if $A_+ = L_+L_+^T$ is the Cholesky factorization of $A_+$, then $L_+$ is of the form
\[
L_+ = \begin{pmatrix} L & 0 \\ y^T & \lambda \end{pmatrix},
\]
i.e., that the leading principal $n \times n$ submatrix of $L_+$ is $L$.

Solution. Write $L_+ = \begin{pmatrix} L_0 & 0 \\ y^T & \lambda \end{pmatrix}$. We have that
\[
A_+ = L_+L_+^T = \begin{pmatrix} L_0 & 0 \\ y^T & \lambda \end{pmatrix}\begin{pmatrix} L_0^T & y \\ 0 & \lambda \end{pmatrix}
= \begin{pmatrix} L_0L_0^T & L_0y \\ y^TL_0^T & y^Ty + \lambda^2 \end{pmatrix}.
\]
Since $L_0L_0^T = A$ we must have that $L_0 = L$ by uniqueness of the Cholesky factorization, and it follows that $L_+$ is of the given form.

b) Explain why $\alpha > \|L^{-1}a\|_2^2$.

Solution. From what we computed above it follows that $y$ and $\alpha$ must satisfy $Ly = a$ (i.e., $y = L^{-1}a$), and $y^Ty + \lambda^2 = \|L^{-1}a\|_2^2 + \lambda^2 = \alpha$. Since $L_+$ has positive diagonal elements, we must have that $\lambda > 0$, and it follows that $\alpha > \|L^{-1}a\|_2^2$.
Let us show that $\alpha > \|L^{-1}a\|_2^2$ implies that $A_+$ is positive definite. This will be the case if $\begin{pmatrix} z^T & 1 \end{pmatrix}A_+\begin{pmatrix} z \\ 1 \end{pmatrix} > 0$ for any vector $z \in \mathbb{R}^n$. We have that
\[
\begin{pmatrix} z^T & 1 \end{pmatrix}A_+\begin{pmatrix} z \\ 1 \end{pmatrix}
= \begin{pmatrix} z^T & 1 \end{pmatrix}\begin{pmatrix} LL^T & a \\ a^T & \alpha \end{pmatrix}\begin{pmatrix} z \\ 1 \end{pmatrix}
= \begin{pmatrix} z^TLL^T + a^T, & z^Ta + \alpha \end{pmatrix}\begin{pmatrix} z \\ 1 \end{pmatrix}
= z^TLL^Tz + 2a^Tz + \alpha = z^TAz + 2a^Tz + \alpha.
\]
The minimum of this (as a function of $z$) is attained when $2Az + 2a = 0$, i.e., when $z = -A^{-1}a$. Inserting this in the expression above we get
\[
z^TAz + 2a^Tz + \alpha = a^T(A^{-1})^Ta - 2a^TA^{-1}a + \alpha = -a^TA^{-1}a + \alpha
= -a^T(LL^T)^{-1}a + \alpha = -a^T(L^T)^{-1}L^{-1}a + \alpha = -\|L^{-1}a\|_2^2 + \alpha,
\]
and this is $> 0$ whenever $\alpha > \|L^{-1}a\|_2^2$.

c) Explain how you can compute $L_+$ when $L$ is known.

Solution. The vector $y$ must first be found by solving $Ly = a$. Then one obtains
\[
\lambda = \sqrt{\alpha - \|y\|_2^2} = \sqrt{\alpha - \|L^{-1}a\|_2^2}.
\]
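A compact MATLAB sketch of this procedure (cholappend is just an illustrative name, not from the book):

function Lp = cholappend(L, a, alpha)
% Extend the Cholesky factor L of A to the factor L+ of
% A+ = [A a; a' alpha], assuming A+ is symmetric positive definite.
y = L \ a;                          % lower triangular solve L*y = a
lambda = sqrt(alpha - y'*y);        % positive since alpha > ||L^{-1}a||^2
Lp = [L zeros(size(L,1),1); y' lambda];
end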

Chapter 5

Orthonormal and Unitary Transformations

Exercises section 5.1

Exercise 5.1: The A∗A inner product
Suppose $A \in \mathbb{C}^{m\times n}$ has linearly independent columns. Show that $\langle x, y\rangle := y^*A^*Ax$ defines an inner product on $\mathbb{C}^n$.

Solution. Assume that $A \in \mathbb{C}^{m\times n}$ has linearly independent columns. We show that $\langle\cdot,\cdot\rangle_A : (x, y) \mapsto y^*A^*Ax$ satisfies the axioms of an inner product on a complex vector space $\mathcal{V}$, as described in Definition 5.1. Let $x, y, z \in \mathcal{V}$ and $a, b \in \mathbb{C}$, and let $\langle\cdot,\cdot\rangle$ be the standard inner product on $\mathcal{V}$.
Positivity. One has $\langle x, x\rangle_A = x^*A^*Ax = (Ax)^*Ax = \langle Ax, Ax\rangle \geq 0$, with equality holding if and only if $Ax = 0$. Since $Ax$ is a linear combination of the columns of $A$ with coefficients the entries of $x$, and since the columns of $A$ are assumed to be linearly independent, one has $Ax = 0$ if and only if $x = 0$.
Skew symmetry. One has $\overline{\langle x, y\rangle_A} = (y^*A^*Ax)^* = x^*A^*Ay = \langle y, x\rangle_A$.
Linearity. One has $\langle ax + by, z\rangle_A = z^*A^*A(ax + by) = az^*A^*Ax + bz^*A^*Ay = a\langle x, z\rangle_A + b\langle y, z\rangle_A$.





Exercise 5.2: Angle between vectors in complex case
Show that in the complex case there is a unique angle $\theta$ in $[0, \pi/2]$ such that
\[
\cos\theta = \frac{|\langle x, y\rangle|}{\|x\|\,\|y\|}.
\]

Solution. By the Cauchy-Schwarz inequality for a complex inner product space,
\[
0 \leq \frac{|\langle x, y\rangle|}{\|x\|\,\|y\|} \leq 1.
\]
Note that taking $x$ and $y$ perpendicular yields zero, taking $x$ and $y$ equal yields one, and any value in between can be obtained by picking an appropriate combination of these two cases. Since the cosine decreases monotonically from one to zero on the interval $[0, \pi/2]$, there is a unique angle $\theta \in [0, \pi/2]$ such that
\[
\cos\theta = \frac{|\langle x, y\rangle|}{\|x\|\,\|y\|}. \tag{5.i}
\]

Figure 5.1: There is a unique angle θ ∈ [0, π/2] satisfying (5.i).

Exercise 5.3: xT Ay inequality (Exam exercise 1979-3) Suppose A ∈ Rn×n is symmetric positive definite. Show that |xT Ay|2 ≤ xT Ax y T Ay for all x, y ∈ Rn , with equality if and only if x and y are linearly dependent.



Solution. Since $A$ is symmetric positive definite, $\langle x, y\rangle := x^TAy$ is an inner product on $\mathbb{R}^n$. Indeed, positivity, symmetry and linearity hold. The result then follows from the Cauchy-Schwarz inequality.

Exercises section 5.2

Exercise 5.4: What does algorithm housegen do when x = e1 ? Determine H in Algorithm 5.1 when x = e1 . Solution. If x = e1 , then the algorithm yields ρ = 1, and a = −ke1 k2 = −1. We then get z = e1 , and √ z + e1 2e1 u= √ = √ = 2e1 1 + z1 2 and

 −1 0  H = I − uuT =  .  ..

0 ··· 1 ··· .. . . . .

 0 0  ..  . .

0 0 ··· 1

Exercise 5.5: Examples of Householder transformations If x, y ∈ Rn with kxk2 = kyk2 and v := x − y 6= 0 then it follows from T Example 5.1 that I − 2 vv v T v x = y. Use this to construct a Householder transformation H such that Hx = y in the following cases.     3 5 a) x = , y= . 4 0 Solution. Let x and y be as in the exercise. As kxk2 = kyk2 , we can apply what we did in Example 5.1 to obtain a vector v and a matrix H,     vv T 1 3 4 −2 v =x−y = , H =I −2 T = , 4 v v 5 4 −3 such that Hx = y. As explained in the text above Example 5.1, H is a Householder √ transformation with u := 2v/kvk2 .   2 b) x =  2  , 1

  0 y =  3 . 0



Solution. Let x and y be as in the exercise. As kxk2 = kyk2 , we can apply what we did in Example 5.1 to obtain a vector v and a Householder transformation H,     2 −1 2 −2 T vv 1 v = x − y = −1 , H = I − 2 T =  2 2 1 , v v 3 1 −2 1 2 such that Hx = y. Exercise 5.6: 2 × 2 Householder transformation Show that a real 2 × 2 Householder transformation can be written in the form   − cos φ sin φ H= . sin φ cos φ Find Hx if x = [cos φ, sin φ]T . Solution. Let H = I − uuT ∈ R2,2 be any Householder transformation. Then u = [u1 , u2 ]T ∈ R2 is a vector satisfying u21 + u22 = kuk22 = 2, implying that the components of u are related via u21 − 1 = 1 − u22 . Moreover, as 0 ≤ u21 , u22 ≤ kuk2 = 2, one has −1 ≤ u21 −1 = 1−u22 ≤ 1, and there exists an angle φ0 ∈ [0, 2π) such that cos(φ0 ) = u21 − 1 = 1 − u22 . For such an angle φ0 , one has p p p −u1 u2 = ± 1 + cos φ0 1 − cos φ0 = ± 1 − cos2 φ0 = sin(±φ0 ). We thus find an angle φ := ±φ0 for which       1 − u21 −u1 u1 − cos(φ0 ) sin(±φ0 ) − cos(φ) sin(φ) H= = = . −u1 u2 1 − u22 sin(±φ0 ) cos(φ0 ) sin(φ) cos(φ) Furthermore, we find       2    cos φ − cos φ sin φ cos φ − cos(2φ) sin φ − cos2 φ H = = = . sin φ sin φ cos φ sin φ sin(2φ) 2 sin φ cos φ When applied to the vector [cos φ, sin φ]T , therefore, H doubles the angle and reflects the result in the y-axis. Exercise 5.7: Householder transformation (Exam exercise 2010-1) a) Suppose x, y ∈ Rn with kxk2 = kyk2 and v := x − y 6= 0. Show that Hx = y, Solution. Since

where H := I − 2

vv T . vT v



v T v = (x − y)T (x − y) = xT x − 2xT y + y T y = 2xT x − 2xT y = 2xT v = 2v T x, we find  Hx =

vv T I −2 T v v

 x=x−

2v T x v = x − (x − y) = y. 2v T x

b) Let B ∈ R4,4 be given by   0100 0 0 1 0  B :=  0 0 0 1 , 000 where 0 <  < 1. Compute a Householder transformation H and a matrix B 1 such that the first column of B 1 := HBH has a zero in the last two positions. Solution. Let V ∈ R3,3 be a Householder transformation mapping x := [0, 0, ]T into y = [, 0, 0]T . With v = x − y = [−1, 0, 1]T we find     −1  001  vv T V = I − 2 T = I − 22  0  −1 0 1 /(22 ) = 0 1 0 v v 1 100 Let

 1  0 1 0T H= = 0 0 V 0 

Then

0 0 0 1

0 0 1 0

 0 1 . 0 0

  0001   0 0 0  B 1 = HBH =  0 1 0 0 . 0010

This matrix is upper Hessenberg and has the same eigenvalues as B since it is similar to B.



Exercises section 5.4

Exercise 5.8: QR decomposition  1 1 A= 1 1

 2 2 , 0 0

  1 1 1 1 1 1 1 −1 −1 , Q=  2 1 −1 −1 1  1 −1 1 −1

 2 0 R= 0 0

 2 2 . 0 0

Show that Q is orthonormal and that QR is a QR decomposition of A. Find a QR factorization of A. Solution. That Q is orthonormal, and therefore unitary, can be shown directly by verifying that QT Q = I. A direct computation shows that QR = A. Moreover,   22   0 2 R1   R =   =: , 00 02,2 00 where R1 is upper triangular. It follows that A = QR is a QR decomposition. A QR factorization is obtained by removing the parts of Q and R that don’t contribute anything to the product QR. Thus we find a QR factorization   1 1   1 1 1  22 , A = Q1 R 1 , Q1 :=  R := . 1 02 2 1 −1 1 −1

Exercise 5.9: Householder triangulation a) Let 

1 0 A :=  −2 −1 2 2

 1 0. 1

Find Householder transformations H 1 , H 2 ∈ R3×3 such that H 2 H 1 A is upper triangular. Solution. Write A = [a1 , a2 , a3 ] as in the exercise. We wish to find Householder transformations H 1 , H 2 that produce zeros in the columns a1 , a2 of A. Applying Algorithm 5.1 in the book to the first column of A, we find first that a = −3, z = [1/3, −2/3, 2/3]T , and then






2 1 u1 = √ −1 , 3 1


  −3 −2 −1  0 0 1 . H 1 A := (I − u1 uT 1 )A = 0 1 0

Next we need to map the bottom element (H 1 A)3,2 of the second column to zero, without changing the first row of H 1 A. For this, we apply Algorithm 5.1 to the vector [0, 1]T to find a = −1, z = [0, 1]T , and then     1 0 −1 T ˆ u2 = and H 2 := I − u2 u2 = , 1 −1 0 which is a Householder transformation of size 2 × 2. Since     −3 −2 −1 1 0   0 −1 0 H 2H 1A = , H 2 := ˆ2 , 0H 0 0 −1 it follows that the Householder transformations H 1 and H 2 bring A into upper triangular form. b) Find the QR factorization of A, when R has positive diagonal elements. Solution. Clearly the matrix H 3 := −I is orthogonal and R := H 3 H 2 H 1 A is upper triangular with positive diagonal elements. It follows that A = QR,

T T Q := H T 1 H 2 H 3 = H 1H 2H 3,

is a QR factorization of A of the required form. Exercise 5.10: Hadamard’s inequality In this exercise we use the QR factorization to prove a classical determinant inequality. For any A = [a1 , . . . , an ] ∈ Cn×n we have |det(A)| ≤

n Y

kaj k2 .

j=1

Equality holds if and only if A has a zero column or the columns of A are orthogonal. a) Show that, if Q is unitary, then |det(Q)| = 1. Solution. Since 1 = det(I) = det(Q∗ Q) = det(Q∗ ) det(Q) = det(Q)∗ det(Q) = |det(Q)|2 , we have |det(Q)| = 1.



b) Let A = QR be a QR factorization of A and let R = [r 1 , . . . , r n ]. Show that (A∗ A)jj = kaj k22 = (R∗ R)jj = kr j k22 . Solution. Since Q is orthogonal, kaj k22 = =

n X k=1 n X

akj akj = (A∗ A)jj = (R∗ Q∗ QR)jj = (R∗ R)jj rkj rkj = kr j k22 .

k=1

c) Show that |det(A)| =

Qn

j=1 |rjj |



Qn

j=1 kaj k2 .

Solution. From b) follows | det(A)| = | det(Q) det(R)| = | det(R)| =

n Y

|rjj | ≤

j=1

n Y j=1

kr j k2 =

n Y

kaj k2 .

j=1

d) Show that we have equality if A has a zero column. Solution. If one of the columns aj of A is zero, then both the left hand side and right hand side in Hadamard’s inequality (5.22) are zero, and equality holds. e) Suppose the columns of A are nonzero. Show that we have equality if and only if the columns of A are orthogonal Hint. Show that we have equality ⇐⇒ R is diagonal ⇐⇒ A∗ A is diagonal. Solution. Suppose the columns are nonzero. We have equality if and only if |rjj | = kr j k2 for j = 1, . . . , n. This happens if and only if R is diagonal. But then A∗ A = R∗ R is diagonal, which means that the columns of A are orthogonal. Exercise 5.11: QL factorization (Exam exercise 1982-2) Suppose B ∈ Rn×n is symmetric and positive definite. It can be shown that B has a factorization of the form B = LT L, where L is lower triangular with positive diagonal elements (you should not show this). Note that this is different from the Cholesky factorization B = LLT . a) Suppose B = LT L. Write down the equations to determine the elements li,j of L, in the order i = n, n − 1, . . . , 1 and j = i, 1, 2 . . . , i − 1.



Solution. We have bi,j =

n X

lk,i lk,j ,

i, j = 1, . . . , n.

k=max(i,j)

Computing the entries in L in the order described in the exercise gives the following algorithm when we isolate li,j on the left hand side. code/ltlfact.m 1 2 3 4 5 6 7 8

function L=ltlfact(B) n=size(B,1); L=zeros(n); for i=n:(-1):1 L(i,i) = sqrt(B(i,i)- sum(L((i+1):n,i).ˆ2)); L(i,1:(i-1)) = (B(i,1:(i-1))-(L((i+1):n,i))’*L((i+1):n ,1:(i-1)))/L(i,i); end end Listing 5.1: Compute the matrix L in the LT L factorization of a symmetric and positive definite matrix B.

b) Explain (without making a detailed algorithm) how the LT L factorization can be used to solve the linear system Bx = c. Compute kLkF . Is the algorithm stable? Solution. First solve the upper triangular system LT y = c for y. Then solve the lower triangular system Lx = y for x. We can use the rforwardsolve and rbacksolve algorithms for this. The following code tests that we find the solution to the system in this way. The code also tests that the factorization from a) is correct. A random positive definite matrix is generated. code/ltlfact_solve.m 1 2 3 4 5 6 7 8 9

n=6; C=rand(6); B=C’*C; L=ltlfact(B); L’*L-B c=rand(n,1); y = rbacksolve(L’,c,n); x = rforwardsolve(L,y,n);



B*x-c Listing 5.2: For a symmetric and positive definite matrix B, solve the system Bx = c using the LT L factorization of B.

Pn 2 Pn 2 We find bi,i = k=i lk,i . By summing we find kLkF = i=1 bi,i . Thus the elements in L cannot become too large compared to the diagonal elements of B, and we conclude that the algorithm is stable. c) Show that every nonsingular matrix A ∈ Rn×n can be factored in the form A = QL, where Q ∈ Rn×n is orthogonal and L ∈ Rn×n is lower triangular with positive diagonal elements. Solution. The matrix AT A is symmetric and positive definite since A is nonsingular. Let LT L be the corresponding factorization of AT A and define Q := AL−1 . Then Q ∈ Rn×n and QT Q = L−T AT AL−1 = L−T LT LL−1 = I. Thus Q is orthogonal and A = QL. d) Show that the QL factorization in c) is unique. Solution. Every LT L factorization of B where L has positive diagonal elements must be given by the formulas in a). It follows that L in the LT L factorization of B is unique. If A = QL then AT A = LT QT QL = LT L. Thus L is the same as the L in the LT L factorization of AT A and therefore unique. Since A = QL implies Q = AL−1 it follows that Q must also be unique. Exercise 5.12: QL-factorization (Exam exercise 1982-3) In this exercise we will develop an algorithm to find a QL-factorization of A ∈ Rn×n (cf. Exercise 5.11) using Householder transformations. a) Given vectors a := [a1 , . . . , an ]T ∈ Rn and en := [0, . . . , 0, 1]T . Find ∗ v ∈ Rn such that the Householder transformation H := I − 2 vv v ∗ v satisfies Ha = −sen , where |s| = kak2 . How should we choose the sign of s? ∗

Solution. By Lemma 5.2, if v = a+sen , where |s| = kak2 , then H := I −2 vv v ∗ v is a Householder transformation so that Ha = −sen . From the proof of Theorem 5.8 we see that we can avoid a cancellation error in v if we choose s to have the same sign as the last component of a.



b) Let 1 ≤ r ≤ n, v r ∈ Rr , v r 6= 0, and √ vr v r v ∗r = I r − ur u∗r , with ur := 2 . ∗ vr vr kv r k2 h i 0 Show that H := V0r I n−r is a Householder transformation. Show also that, if ai,j = 0 for i = 1, . . . , r and j = r + 1, . . . , n then the last n − r columns of A and HA are the same. V r := I r − 2

√ ur ∗ n Solution. Let U := √ I − uu , where u := [ 0 ] ∈ R and kur k2 = 2. Since kuk2 = kur k2 = 2 it follows that U is a Householder transformation. Moreover,           Ir 0 ur  ∗  Ir 0 ur u∗r 0 Vr 0 ur 0 = U= − − = =: H. 0 I n−r 0 0 I n−r 0 0 0 I n−r If ai,j = 0 for i = 1, . . . , r and j = r + 1, . . . , n then A = A3 ∈ R(n−r)×(n−r) . But then      Vr 0 A1 0 V r A1 0 HA = = 0 I n−r A2 A3 A2 A3

 A1

0 A2 A3

 , where

has the same last n − r columns as A. c) Explain, without making a detailed algorithm, how we to a given matrix A ∈ Rn×n can find Householder transformations H 1 , . . . , H n−1 such that H n−1 , . . . , H 1 A is lower triangular. Give a QL factorization of A. Solution. We let H 1 be a Householder transformation such that H 1 an = −sn en , with |sn | = kan k2 . The transformation H 1 is given in a) with a = an , the last column of A. Suppose H 1 , . . . , H k−1 are determined such that Ak := H k−1 · · · H 1 A is lower triangular in its k − 1 last columns. Let V k be a Householder transformation in R(n−k+1)×(n−k+1) so that the first n − k + h1 entries in i 0 column k in Ak are sent to a scalar multiple of en−k+1 . Set H k := V0k I k−1 . By b) H k is a Householder transformation such that Ak+1 := H k Ak is lower triangular in its last k columns. Continuing we obtain L := An lower triangular. Moreover, A = QL, where T Q := (H n · · · H 1 )T = H T 1 · · · Hn = H1 · · · Hn

is orthogonal.



Exercise 5.13: QR Fact. of band matrices (Exam exercise 2006-2) Let A ∈ Rn×n be a nonsingular symmetric band matrix with bandwidth d ≤ n − 1, so that aij = 0 for all i, j with |i − j| > d. We define B := AT A and let A = QR be the QR factorization of A where R has positive diagonal entries. a) Show that B is symmetric. Solution. B T = (AT A)T = AT A = B. b) Show that B has bandwidth ≤ 2d. Solution. Let 1 ≤ i ≤ j ≤ n. Since aij = 0 for all i, j with |i − j| > d we find bij =

n X k=1

min{n,i+d}

aki akj =

X

aki akj .

k=max{1,j−d}

If j > i + 2d then j − d > i + d and bij = 0. Since B is symmetric we also have bji = 0. But then bij = 0 for all i, j with |j − i| > 2d. c) Write a MATLAB function B=ata(A,d) which computes B. You shall exploit the symmetry and the function should only use O(cn2 ) flops, where c only depends on d. Solution. code/ata.m 1 2 3 4 5 6 7 8 9 10

function B=ata(A,d) [m,n]=size(A); B=zeros(n); for i=1:n for j=i:n il=max(1,j-d); iu=min(n,i+d); B(i,j)=A(il:iu,i)’*A(il:iu,j); B(j,i)=B(i,j); end end Listing 5.3: Compute the product B = AT A, for a nonsingular symmetric band matrix A.

d) Estimate the number of arithmetic operations in your algorithm.



Solution. Since $j \geq i$ we have $i_u - i_l \leq i + d - j + d \leq 2d$, so that the computation of each $B(i,j)$ requires at most $2 \cdot 2d - 1 = O(4d)$ arithmetic operations, yielding a total of $O\bigl(\int_0^n\!\int_i^n 4d\,dj\,di\bigr) = O(2dn^2)$ operations.

e) Show that $A^TA = R^TR$.

Solution. Since $Q^TQ = I$, we find $A^TA = (QR)^T(QR) = R^T(Q^TQ)R = R^TR$.

f) Explain why R has upper bandwidth 2d. Solution. Since R is upper triangular with positive diagonal elements we see that RT R is the Cholesky factorization of AT A. From Theorem 4.6 we know that R has the same upper bandwidth 2d as AT A. g) We consider 3 methods for finding the QR factorization of the band matrix A, where we assume that n is much bigger than d. The methods are based on 1. Gram-Schmidt orthogonalization, 2. Householder transformations, 3. Givens rotations. Which method would you recommend for a computer program using floating point arithmetic? Give reasons for your answer. Solution. • Stability: Method 1 is not stable, while methods 2 and 3 are stable. Method 1 can produce vectors which are not orthogonal. • Complexity: Since A is a band matrix with small bandwidth, method 2 requires more operations than method 1 or 3. • Conclusion: Method 3 is recommended.

Exercise 5.14: Find QR factorization (Exam exercise 2008-2)
Let
\[
A := \begin{pmatrix} 2 & 1 \\ 2 & -3 \\ -2 & -1 \\ -2 & 3 \end{pmatrix}.
\]

a) Find the Cholesky factorization of $A^TA$.

Solution. We find $A^TA = \begin{pmatrix} 16 & -8 \\ -8 & 20 \end{pmatrix}$ and want
\[
A^TA = R^TR, \qquad R = \begin{pmatrix} r_{11} & r_{12} \\ 0 & r_{22} \end{pmatrix}, \qquad r_{11}, r_{22} > 0.
\]
Since
\[
R^TR = \begin{pmatrix} r_{11}^2 & r_{11}r_{12} \\ r_{11}r_{12} & r_{12}^2 + r_{22}^2 \end{pmatrix},
\]
we need $r_{11}^2 = 16$, $r_{11}r_{12} = -8$ and $r_{12}^2 + r_{22}^2 = 20$. The solution is $r_{11} = 4$, $r_{12} = -2$ and $r_{22} = 4$.

b) Find the QR factorization of A.

Solution. We have already found $R = \begin{pmatrix} 4 & -2 \\ 0 & 4 \end{pmatrix}$. Then, to get $A = QR$ we need $Q = AR^{-1}$, and since $R^{-1} = \frac{1}{16}\begin{pmatrix} 4 & 2 \\ 0 & 4 \end{pmatrix}$ we find
\[
Q = AR^{-1} = \frac{1}{2}\begin{pmatrix} 1 & 1 \\ 1 & -1 \\ -1 & -1 \\ -1 & 1 \end{pmatrix}.
\]

Exercises section 5.5

Exercise 5.15: QR using Gram-Schmidt, II
Construct $Q_1$ and $R_1$ in Example 5.2 using Gram-Schmidt orthogonalization.

Solution. Let
\[
A = [a_1, a_2, a_3] = \begin{pmatrix} 1 & 3 & 1 \\ 1 & 3 & 7 \\ 1 & -1 & -4 \\ 1 & -1 & 2 \end{pmatrix}.
\]
Applying Gram-Schmidt orthogonalization, we find
\[
v_1 = a_1 = \begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix}, \qquad
q_1 = \frac{1}{2}\begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix},
\]
\[
v_2 = a_2 - \underbrace{\frac{a_2^Tv_1}{v_1^Tv_1}}_{=1}v_1 = \begin{pmatrix} 2 \\ 2 \\ -2 \\ -2 \end{pmatrix}, \qquad
q_2 = \frac{1}{2}\begin{pmatrix} 1 \\ 1 \\ -1 \\ -1 \end{pmatrix},
\]
\[
v_3 = a_3 - \underbrace{\frac{a_3^Tv_1}{v_1^Tv_1}}_{=3/2}v_1 - \underbrace{\frac{a_3^Tv_2}{v_2^Tv_2}}_{=5/4}v_2 = \begin{pmatrix} -3 \\ 3 \\ -3 \\ 3 \end{pmatrix}, \qquad
q_3 = \frac{1}{2}\begin{pmatrix} -1 \\ 1 \\ -1 \\ 1 \end{pmatrix}.
\]
Since
\[
(R_1)_{11} = \|v_1\| = 2, \qquad (R_1)_{22} = \|v_2\| = 4, \qquad (R_1)_{33} = \|v_3\| = 6,
\]
and since also
\[
(R_1)_{ij} = a_j^Tq_i = \|v_i\|\,\frac{a_j^Tv_i}{v_i^Tv_i}, \qquad i < j,
\]
we get
\[
(R_1)_{12} = 2 \cdot 1 = 2, \qquad (R_1)_{13} = 2 \cdot \frac{3}{2} = 3, \qquad (R_1)_{23} = 4 \cdot \frac{5}{4} = 5,
\]
so that $A = Q_1R_1$ with
\[
Q_1 = \bigl[q_1, q_2, q_3\bigr] = \frac{1}{2}\begin{pmatrix} 1 & 1 & -1 \\ 1 & 1 & 1 \\ 1 & -1 & -1 \\ 1 & -1 & 1 \end{pmatrix}, \qquad
R_1 = \begin{pmatrix} 2 & 2 & 3 \\ 0 & 4 & 5 \\ 0 & 0 & 6 \end{pmatrix}.
\]
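The same factorization can be computed with a short classical Gram-Schmidt routine (a minimal sketch; cgs is not a function from the book, and classical Gram-Schmidt is not numerically robust, cf. the discussion in Exercise 5.13 g)):

function [Q,R] = cgs(A)
% Classical Gram-Schmidt: A = Q*R with orthonormal columns in Q.
[m,n] = size(A);
Q = zeros(m,n); R = zeros(n);
for j = 1:n
  v = A(:,j);
  for i = 1:j-1
    R(i,j) = Q(:,i)'*A(:,j);
    v = v - R(i,j)*Q(:,i);
  end
  R(j,j) = norm(v);
  Q(:,j) = v/R(j,j);
end
end

Calling [Q1,R1] = cgs([1 3 1; 1 3 7; 1 -1 -4; 1 -1 2]) reproduces the matrices above.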

Exercises section 5.6

Exercise 5.16: Plane rotation Suppose x= Show that

  r cos α , r sin α

 P =

 cos θ sin θ . − sin θ cos θ

  r cos (α − θ) Px = . r sin (α − θ)

Solution. Using the angle difference identities for the sine and cosine functions, cos(θ − α) = cos θ cos α + sin θ sin α, sin(θ − α) = sin θ cos α − cos θ sin α, we find  Px = r

   cos θ cos α + sin θ sin α r cos(θ − α) = . − sin θ cos α + cos θ sin α −r sin(θ − α)



Exercise 5.17: Updating the QR decomposition Let H ∈ R4,4 be upper Hessenberg. Find Givens rotation matrices G1 , G2 , G3 such that G3 G2 G1 H = R is upper triangular (here each Gk = P i,j for suitable i, j, c and s, and for each k you are meant to find suitable i and j). Solution. Recall how rotations in the i, j-plane were defined, see Definition 5.6. To bring an upper Hessenberg 4×4-matrix to upper triangular form, the following three entries need to be zeroed out: 1. h2,1 : This can be zeroed out with a rotation G1 in the 12-plane. 2. h3,2 : This can be zeroed out with a rotation G2 in the 23-plane. This does not affect the zeroes in column 1. 3. h4,3 : This can be zeroed out with a rotation G3 in the 34-plane. This does not affect the zeroes in column 1 and 2. Exercise 5.18: Solving upper Hessenberg system using rotations Let A ∈ Rn×n be upper Hessenberg and nonsingular, and let b ∈ Rn . The following algorithm (Algorithm 5.3 in the book) solves the linear system Ax = b using rotations P k,k+1 for k = 1, . . . , n − 1. It uses Algorithm 3.2 (backsolve). Determine the number of arithmetic operations of this algorithm.

code/rothesstri.m 1 2 3 4 5 6 7 8 9 10 11 12

function x=rothesstri(A,b) n=length(A); A=[A b]; for k=1:n-1 r=norm([A(k,k),A(k+1,k)]); if r>0 c=A(k,k)/r; s=A(k+1,k)/r; A([k k+1],k+1:n+1) =[c s;-s c]*A([k k+1],k+1:n+1); end A(k,k)=r; A(k+1,k)=0; end x=rbacksolve(A(:,1:n),A(:,n+1),n); Listing 5.4: Solve the upper Hessenberg system Ax = b using rotations.
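A quick usage check of Listing 5.4 (a sketch; it assumes the book's rbacksolve is available on the path, and uses a random nonsingular upper Hessenberg test matrix):

n = 6;
A = triu(rand(n)) + diag(rand(n-1,1),-1);   % random upper Hessenberg matrix
b = rand(n,1);
max(abs(rothesstri(A,b) - A\b))             % should be of the order of machine precision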

Solution. To determine the number of arithmetic operations of Algorithm 5.3, we first consider the arithmetic operations in each step. Initially the algorithm stores the length of the matrix and appends the right hand side as the (n + 1)-th column to the matrix. Such copying and storing operations do not count as arithmetic operations.



The second big step is the loop. Let us consider the arithmetic operations at the k-th iteration of this loop. First we have to compute the norm of a two-dimensional vector, which comprises 4 arithmetic operations: two multiplications, one addition and one square root operation. Assuming r > 0 we compute c and s each in one division, adding 2 arithmetic operations to our count. Computing the product of the Givens rotation and A includes 2 multiplications and one addition for each entry of the result. As we have 2(n + 1 − k) entries, this amounts to 6(n + 1 − k) arithmetic operations. The last operation in the loop is just the storage of two entries of A, which again does not count as an arithmetic operation. The final step of the whole algorithm is a backward substitution, known to require O(n2 ) arithmetic operations. We conclude that the algorithm uses O(n2 ) +

n−1 X

 4 + 2 + 6(n + 1 − k)

k=1

= O(n2 ) + 6

n−1 X

(n + 2 − k)

k=1 2

= O(n2 ) + 3n + 9n − 12 = O(4n2 ) arithmetic operations. Exercise 5.19: A Givens transformation (Exam exercise 2013-2)   cs A Givens rotation of order 2 has the form G := ∈ R2×2 , where −s c s2 + c2 = 1. a) Is G symmetric and unitary? Solution. The matrix G is only symmetric for s = 0. In addition, G is unitary since  2  s + c2 0 ∗ T G G=G G= = I. 0 s2 + c2

b) Given x1 , x2 ∈ R and set r := y x y1 = y2 , where 1 = G 1 . y2 x2

p x21 + x22 . Find G and y1 , y2 so that

Solution. The goal of this exercise is to find Givens rotations in the plane that maps to the diagonal; see Figure 5.2. We find y1 = y2 if and only if cx1 + sx2 = −sx1 + cx2 and s2 + c2 = 1. Thus s and c must be solutions of



(x1 + x2 )s + (x1 − x2 )c = 0 s2 + c2 = 1. If x1 = x2 then the point is already on the diagonal and the solution is s = 0 and c = ±1, corresponding to the identity map and the half-turn. Suppose x1 6= x2 . +x2 Substituting c = xx21 −x s into 1 = s2 + c2 we find 1 2

1=s



(x1 + x2 )2 1+ (x2 − x1 )2



= s2

(x2 − x1 )2 + (x1 + x2 )2 2r2 2 = s . (x2 − x1 )2 (x2 − x1 )2

There are two solutions s1 =

x +x x2 − x1 √ , c1 = 2 √ 1 , r 2 r 2

s2 = −s1 , c2 = −c1 .

We find corresponding solutions y1 = y2 = c1 x1 + s1 x2 = and

√ (x1 + x2 )x1 + (x2 − x1 )x2 √ = r/ 2 r 2

√ y1 = y2 = c2 x1 + s2 x2 = −(c1 x1 + s1 x2 ) = −r/ 2.

Figure 5.2: The Givens transformations in Exercise 5.19 mapping generic points (left) to the diagonal, and each point on the diagonal (right) to itself and its antipodal.



Exercise 5.20: Givens transformations (Exam exercise 2016-3) Recall that a rotation in the ij-plane is an m × m-matrix, denoted P i,j , which differs from the identity matrix only in the entries ii, ij, ji, jj, which equal     pii pij cos θ sin θ = , pji pjj − sin θ cos θ i.e., these four entries are those of a Givens rotation. a) For θ ∈ R, let P be a Givens rotation of the form   cos θ sin θ P = − sin θ cos θ and let x be a fixed vector in R2 . Show that there exists a unique θ ∈ (−π/2, π/2] so that P x = ±kxk2 e1 , where e1 = (1, 0)T . Solution. The matrix P represents a clockwise rotation with angle θ, and this is unitary, i.e., it preserves length. Write x = kxk2 [cos φ, sin φ] in polar coordinates, with −π ≤ φ < π. We can always find an angle θ ∈ (−π/2, π/2] so that x is rotated onto the positive or negative x-axis, i.e., onto ±kxk2 e1 and so that P x = ±kxk2 e1 : • If φ ∈ (−π/2, π/2], a rotation with θ = φ rotates x to the positive x-axis. • If φ ∈ [−π, −π/2], a rotation with θ = π + φ rotates x to the negative x-axis. • If φ ∈ (π/2, π), a rotation with θ = φ − π rotates x to the negative x-axis. b) Show that, for any vector w ∈ Rm , one can find rotations in the 12-plane, 23-plane, . . ., (m − 1)m-plane, so that   α 0   P 1,2 P 2,3 · · · P m−2,m−1 P m−1,m w =  .  ,  ..  0 where α = ±kwk2 . Solution. A rotation P m−1,m only changes the last two entries, and by a) we can find such a rotation that zeros out component m. We now proceed by induction: Assume that we have found rotations in the corresponding planes so that P k−1,k · · · P m−1,m w is zero in components k, . . . , m. A rotation P k−2,k−1 only changes components k − 2 and k − 1, and clearly we can find such a rotation that zeros out component k − 1 as well. At the final step we find a plane rotation P 1,2 which zeroes out the second component, and we now have a vector of the form ke1 . We must have that k = ±kwk2 , since all plane rotations preserve length.



c) Assume that m ≥ n. Recall that an m×n-matrix A with entries ai,j is called upper trapezoidal if there are no nonzero entries below the main diagonal (a1,1 , a2,2 , . . . , an,n ) (for m = n, upper trapezoidal is the same as upper triangular). Recall also that an m × n-matrix is said to be in upper Hessenberg form if there are no nonzero entries below the subdiagonal (a2,1 , a3,2 , . . . , an,n−1 ). Explain that, if an m × n-matrix H is in upper Hessenberg form, one can find plane rotations so that P m−1,m P m−2,m−1 · · · P 2,3 P 1,2 H is upper trapezoidal. Solution. The plane rotation P 1,2 only changes the two first components in any column of H. Using a) and b) we can find such a plane rotation which zeros out the entry (2, 1) in H. Such a plane rotation will keep H in upper Hessenberg form. We then find a plane rotation P 2,3 which zeros out entry (3, 2). After this the first two columns will have zeros below the diagonal, and H will still be in upper Hessenberg form. We can continue in this way to find plane rotations so that P m−1,m P m−2,m−1 · · · P 2,3 P 1,2 H is upper trapezoidal. d) Let again A be an m × n-matrix with m ≥ n, and let A− be the matrix obtained by removing column k in A. Explain how you can find a QR decomposition of A− , when we already have a QR decomposition A = QR of A. Hint. Consider the matrix QT A− . Solution. Let ai be the columns of A, and let r i be the columns of R. Since QT A = R we have that   QT A− = QT a1 · · · QT ak−1 QT ak+1 · · · QT an   = r 1 · · · r k−1 r k+1 · · · r n , which clearly is in upper Hessenberg form. Due to c) we can find plane rotations so that P m−1,m P m−2,m−1 · · · P 2,3,n−1 P 1,2 QT A− = R1 ,



where R1 is upper trapezoidal. We thus have that T T T A− = QP T 1,2 P 2,3 · · · P m−2,m−1 P m−1,m R1 ,

which gives a QR decomposition of A− . Exercise 5.21: Cholesky and Givens (Exam exercise 2018-2) Assume that A is n × n symmetric positive definite, and with Cholesky factorization A = LL∗ . Assume also that z is a given column vector of length n. a) Explain why A + zz ∗ has a unique Cholesky factorization. Solution. Since A is symmetric positive definite, and zz ∗ is symmetric positive semidefinite, it follows that x 6= 0

=⇒

xT (A + zz ∗ )x = xT Ax + xT zz ∗ x > 0,

and hence that A + zz ∗ is symmetric positive definite. Any symmetric matrix is positive definite if and only if it has a (unique) Cholesky factorization (see Theorem 4.2). b) Assume that we are given a QR decomposition  ∗   L R = Q , z∗ 0 with R square and upper triangular. Explain why R is nonsingular. Explain also that, if R also has nonnegative diagonal entries, then A + zz ∗ has the Cholesky factorization R∗ R. Solution. We obtain     L∗ A + zz ∗ = LL∗ + zz ∗ = L z z∗     R = R∗ 0 QT Q 0    ∗  R = R 0 = R∗ R. 0  ∗   L R has rank n, and hence the same holds for . But this is the case if z∗ 0 and only if R is nonsingular. If the diagonal elements of R are nonnegative, they must be positive, since R is nonsingular. Thus R∗ is lower triangular with positive diagonal elements, so that R∗ R is the Cholesky factorization. Here



Recall that a plane rotation in the (i, j)-plane, denoted Pi,j , is an n × n-matrix which differs from the identity matrix only in the entries (i, i), (i, j), (j, i), (j, j), which equal those of a Givens rotation, i.e., they are     pii pij cos θ sin θ = . − sin θ cos θ pji pjj

c) Explain how one can find plane rotations Pi1 ,n+1 , Pi2 ,n+1 ,. . . ,Pin ,n+1 so that  ∗  0 L R Pi1 ,n+1 Pi2 ,n+1 · · · Pin ,n+1 ∗ = . (5.23) z 0 0 with  ∗ R upper triangular, and explain how to obtain a QR decomposition of L from this. In particular you should write down the numbers i1 , . . . , in . z∗ Is it possible to choose the plane rotations so that R0 in (5.23) also has positive diagonal entries?

Solution. In the book, rectangular matrices which are zero below the main diagonal were called upper trapezoidal; hence the matrix  ∗ L B 0 := ∗ z is upper trapezoidal except for the last row. We can clearly find a Givens rotation P 1,n+1 in the (1, n + 1)-plane so that B 1 := P 1,n+1 B 0 has a zero in entry (n + 1, 1), and a nonzero in entry (1, 1). This is because a Givens rotation with angle θ maps     r cos α r cos(α − θ) 7−→ , r sin α r sin(α − θ) so choosing θ = α maps to the positive x-axis, while choosing θ = α + π maps to the negative x-axis. The resulting matrix will still be upper trapezoidal except for the last row, since P 1,n+1 changes only rows 1 and n + 1. Assume now that we have found Givens rotations that have mapped B 0 to a matrix B k with zeroes in the first k entries of row n + 1, and upper trapezoidal with nonzero diagonal elements, except for the last row. We find a Givens rotation P k+1,n+1 so that B k+1 := P k+1,n+1 B k has a zero in entry (n + 1, k + 1). This rotation will affect only rows n + 1 and k + 1, and since the first k elements in both these rows in B k are zero, the same will be the case for B k+1 . This proves that the final matrix we obtain after n Givens rotations will be upper trapezoidal, so that  ∗  0 L R P n,n+1 P n−1,n+1 · · · P 1,n+1 ∗ = , z 0

5 Orthonormal and Unitary Transformations

89

with R0 upper triangular. In particular we can set ik := n + 1 − k for all k. We now obtain  ∗  0 R L T T , Q Q := P T = 1,n+1 P 2,n+1 · · · P n,n+1 . z∗ 0 Since all the Givens rotations are unitary, their product is also unitary, so that we have factored the matrix as a product of a unitary matrix Q and an upper trapezoidal matrix, i.e., we have a QR decomposition. If we choose the angles in the Givens rotations so that all vectors are mapped to the positive x-axis, the diagonal elements of R0 will also be positive.

Chapter 6

Eigenpairs and Similarity Transformations

Exercises section 6.1

Exercise 6.1: Eigenvalues of a block triangular matrix What are the eigenvalues of the matrix   21000000 0 2 1 0 0 0 0 0   0 0 2 0 0 0 0 0   0 0 0 2 0 0 0 0 8,8   0 0 0 1 2 0 0 0 ∈ R ?   0 0 0 0 0 2 0 0   0 0 0 0 0 0 3 0 00000013 Solution. Successive cofactor expansion along the 1st column, 2nd column, 3rd column, 4th row, 5th column, 6th row, and 7th row, yields det(A − λI) = (2 − λ) · (2 − λ) · (2 − λ) · (2 − λ) · (2 − λ) · (2 − λ) · (3 − λ) · (3 − λ). It follows that A has eigenvalue 2 (with algebraic multiplicity 6) and eigenvalue 3 (with algebraic multiplicity 2). Exercise 6.2: Characteristic polynomial of transpose We have det(B T ) = det(B) and det(B) = det(B) for any square matrix B. Use this to show that a) πAT = πA , Solution. One obtains





πAT (λ) = det(AT − λI) = det((A − λI)T ) = det(A − λI) = πA (λ).

b) πA∗ (λ) = πA (λ). Solution. One obtains ¯ = det(A∗ −λI) ¯ = det(AT − λI) = det(AT − λI) = πAT (λ) = πA (λ). πA∗ (λ)

Exercise 6.3: Characteristic polynomial of inverse Suppose (λ, x) is an eigenpair for A ∈ Cn×n . Show that a) if A is nonsingular then (λ−1 , x) is an eigenpair for A−1 . Solution. If Ax = λx we have that A−1 (λx) = x, hence A−1 x = λ−1 x, so that (λ−1 , x) is an eigenpair for A−1 . b) (λk , x) is an eigenpair for Ak for k ∈ Z. Solution. Since Ix = 1 · x, the statement is immediate for k = 0. Repeatedly multiplying by A, and using that Ax = λx, yields Ak x = Ak−1 Ax = Ak−2 Aλx = Ak−3 Aλ2 x = · · · = λk x. Applying this result to the matrix A−1 , which has eigenpair (λ−1 , x) by 1., it follows that 2. holds for any k ∈ Z. Exercise 6.4: The power of the eigenvector expansion Show that if A ∈ Cn×n is nondefective with eigenpairs (λj , xj ), j = 1, . . . , n then for any x ∈ Cn and k ∈ N Ak x =

n X

cj λkj xj ,

for some scalars c1 , . . . , cn .

j=1

Show that if A is nonsingular then (6.19) holds for all k ∈ Z. Solution. If the eigenvectors form a basis we can write x=

n X j=1

for some scalars c1 , . . . , cn . But then

cj xj

(6.19)

6 Eigenpairs and Similarity Transformations

Ax =

n X

93

cj Axj =

j=1

n X

cj λj xj .

j=1

Iterating this we obtain k

A x=

n X j=1

k

cj A xj =

n X

cj λkj xj .

j=1

Exercise 6.5: Eigenvalues of an idempotent matrix Let λ ∈ σ(A) where A2 = A ∈ Cn×n . Show that λ = 0 or λ = 1. (A matrix is called idempotent if A2 = A.) Solution. Suppose that (λ, x) is an eigenpair of a matrix A satisfying A2 = A. Then λx = Ax = A2 x = λAx = λ2 x. Since any eigenvector is nonzero, one has λ = λ2 , from which it follows that either λ = 0 or λ = 1. We conclude that the eigenvalues of any idempotent matrix can only be zero or one. Exercise 6.6: Eigenvalues of a nilpotent matrix Let λ ∈ σ(A) where Ak = 0 for some k ∈ N. Show that λ = 0. (A matrix A ∈ Cn×n such that Ak = 0 for some k ∈ N is called nilpotent.) Solution. Suppose that (λ, x) is an eigenpair of a matrix A satisfying Ak = 0 for some natural number k. Then 0 = Ak x = λAk−1 x = λ2 Ak−2 x = · · · = λk x. Since any eigenvector is nonzero, one has λk = 0, from which it follows that λ = 0. We conclude that any eigenvalue of a nilpotent matrix is zero. Exercise 6.7: Eigenvalues of a unitary matrix Let λ ∈ σ(A), where A∗ A = I. Show that |λ| = 1. Solution. Let x be an eigenvector corresponding to λ. Then Ax = λx and, as a consequence, x∗ A∗ = x∗ λ. To use that A∗ A = I, it is tempting to multiply the left hand sides of these equations, yielding |λ|2 kxk2 = x∗ λλx = x∗ A∗ Ax = x∗ Ix = kxk2 . Since x is an eigenvector, it must be nonzero. Nonzero vectors have nonzero norms, and we can therefore divide the above equation by kxk2 , which results in |λ|2 =

6 Eigenpairs and Similarity Transformations

94

1. Taking square roots we find that |λ| = 1, which is what needed to be shown. Apparently the eigenvalues of any unitary matrix reside on the unit circle in the complex plane. Exercise 6.8: Nonsingular approximation of a singular matrix Suppose A ∈ Cn×n is singular. Then we can find 0 > 0 such that A + I is nonsingular for all  ∈ C with || < 0 . Hint. Use that det(A) = λ1 λ2 · · · λn , where λi are the eigenvalues of A. Solution. Let λ1 , . . . , λn be the eigenvalues of the matrix A. As the matrix A is singular, its determinant det(A) = λ1 · · · λn is zero, implying that one of its eigenvalues is zero. If all the eigenvalues of A are zero let ε0 := 1. Otherwise, let ε0 := minλi 6=0 |λi | be the absolute value of the eigenvalue closest to zero. By definition of the eigenvalues, det(A − λI) is zero for λ = λ1 , . . . , λn , and nonzero otherwise. In particular det(A − εI) is nonzero for any ε ∈ (0, ε0 ), and A − εI will be nonsingular in this interval. Exercise 6.9: Companion matrix For q0 , . . . , qn−1 ∈ C let p(λ) = λn + qn−1 λn−1 + · · · + q0 be a polynomial of degree n in λ. We derive two matrices that have (−1)n p as its characteristic polynomial. a) Show that p = (−1)n πA where  −qn−1 −qn−2  1 0   0 1 A=  .. ..  . . 0

0

· · · −q1 ··· 0 ··· 0 . . .. . . ··· 1

 −q0 0   0  . ..  .  0

A is called a companion matrix of p. Solution. To show that (−1)n f is the characteristic polynomial πA of the matrix A, we need to compute   −qn−1 − λ −qn−2 · · · −q1 −q0  1 −λ · · · 0 0     0 1 ··· 0 0  πA (λ) = det(A − λI) = det  .  .. .. ..  . . ..  . . . . .  0

0

··· 1

−λ

6 Eigenpairs and Similarity Transformations

95

By the rules of determinant evaluation, we can subtract from any column a linear combination of the other columns without changing the value of the determinant. Multiply columns 1, 2, . . . , n − 1 by λn−1 , λn−2 , . . . , λ and adding the corresponding linear combination to the final column, we find   −qn−1 − λ −qn−2 · · · −q1 −f (λ)   0  1 −λ · · · 0    0 1 ··· 0 0  = (−1)n f (λ), πA (λ) = det   .. . .. ..  . . . ..  .  . . 0

0

··· 1

0

where the second equality follows from cofactor expansion along the final column. Multiplying this equation by (−1)n yields the statement of the Exercise. b) Show that p = (−1)n πB where  0 0 ··· 1 0 ···   B = 0 1 ···  .. .. . . . . .

 0 −q0 0 −q1   0 −q2  . .. ..  . .  0 0 · · · 1 −qn−1

Thus B can also be regarded as a companion matrix for p. Solution. Similar to a), by multiplying rows 2, 3, . . . , n by λ, λ2 , . . . , λn−1 and adding the corresponding linear combination to the first row. Exercise 6.10: Find eigenpair example 

 123 Find the eigenvalues and eigenvectors of A =  0 2 3 . Is A defective? 002 Solution. As A is a triangular matrix, its eigenvalues are the diagonal entries. One finds two eigenvalues λ1 = 1 and λ2 = 2, the latter with algebraic multiplicity two. Solving Ax1 = λ1 x1 and Ax2 = λ2 x2 , one finds (valid choices of) eigenpairs, for instance     1 2 (λ1 , x1 ) = (1, 0), (λ2 , x2 ) = (2, 1). 0 0 It follows that the eigenvectors span a space of dimension 2, and this means that A is defective.

6 Eigenpairs and Similarity Transformations

96

Exercise 6.11: Right or wrong? (Exam exercise 2005-1) Decide if the following statements are right or wrong. Give supporting arguments for your decisions. a) The matrix   1 3 4 A= 6 4 −3 is orthogonal? Solution. Wrong! Since [3, 4] [4, −3]T = 0 it follows that A has orthogonal 1 columns, but since 36 (32 + 42 ) 6= 1 the columns are not orthonormal. If we change 1 1 6 to 5 , then A becomes orthogonal. b) Let   a1 A= 0a where a ∈ R. There is a nonsingular matrix Y ∈ R2×2 and a diagonal matrix D ∈ R2×2 such that A = Y DY −1 ? Solution. Wrong! Let Y = [y 1 , y 2 ] and D = diag(λ1 , λ2 ). Since AY = Y D we have Ay j = λj y j for j = 1, 2, i.e., y 1 and y 2 are eigenvectors of A with eigenvalues λ1 and λ2 . Since A is upper triangular the eigenvalues of A are λ1 = λ2 = a. Furthermore y 1 and y 2 are linearly independent since Y is nonsingular. Let x := [x1 , x2 ]T be an eigenvector to A with eigenvalue λ so that Ax = λx or ax1 + x2 = ax1 and ax2 = ax2 . The solution is x1 arbitrary and x2 = 0. But this means that A does not have linearly independent eigenvectors and therefore cannot be diagonalized. Exercise 6.12 : Eigenvalues of tridiagonal matrix (Exam exercise 2009-3) Let A ∈ Rn,n be tridiagonal (i.e., aij = 0 when |i − j| > 1) and suppose also that ai+1,i ai,i+1 > 0 for i = 1, . . . , n − 1. Show that the eigenvalues of A are real. Hint. Show that there is a diagonal matrix D such that D −1 AD is symmetric. Solution. B := D −1 AD will be symmetric if we choose r ai+1,i d1 = 1, and di+1 = di , i = 1, . . . , n − 1. ai,i+1

6 Eigenpairs and Similarity Transformations

97

Since a real symmetric matrix has real eigenvalues and A is similar to B it follows that A has real eigenvalues.

Exercises section 6.2

Exercise 6.13: Jordan example Find S in the Jordan factorization  AS = SJ ,

 3 0 1 A = −4 1 −2 , −4 0 −1

  110 J = 0 1 0 . 001

Solution. This exercise shows that it matters in which order we solve for the columns of S. One would here need to find the second column first before solving for the other two. We are asked to find S = [s1 , s2 , s3 ] satisfying   [As1 , As2 , As3 ] = AS = SJ = [s1 , s2 , s3 ]J = s1 , s1 + s2 , s3 . The equations for the first and third columns say that s1 and s3 are eigenvectors for λ = 1, so that they can be found by row reducing A − I:     2 0 1 201 A − I = −4 0 −2 ∼ 0 0 0 . −4 0 −2 000 Hence [1, 0, −2]T and [0, 1, 0]T span the space ker(A−I) of eigenvectors for λ = 1. The vector s2 can be found by solving As2 = s1 + s2 , so that (A − I)s2 = s1 . This means that (A − I)2 s2 = (A − I)s1 = 0, so that s2 ∈ ker(A − I)2 . A simple computation shows that (A − I)2 = 0 so that any s2 will do, but we must also choose s2 so that (A − I)s2 = s1 is an eigenvector of A. Since A − I has rank one, we may choose any s2 so that (A − I)s2 is nonzero. In particular we can choose s2 = [1, 0, 0]T , and then s1 = (A − I)s2 = [2, −4, −4]T . We can also choose s3 = [0, 1, 0]T , since it is an eigenvector not spanned by the s1 and s2 which we just defined. All this means that we can choose   2 10 S = −4 0 1 . −4 0 0

6 Eigenpairs and Similarity Transformations

98

Exercise 6.14: A nilpotent matrix   0 I m−r Show that (J m (λ) − λI)r = for 1 ≤ r ≤ m − 1 and conclude 0 0 m that J m (λ) − λI = 0. Solution.  We show  this by induction. For r = 1 the statement is obvious. Define 0 I m−r Er = . We have that 0 0 (E 1 E r )i,j =

X (E 1 )i,k (E r )k,j . k

In the sum on the right hand side only one term can contribute (since any row/column in E 1 and E r contains only one nonzero entry, being a one). This occurs when there is a k so that k = i + 1, k + r = j, i.e., when j = i + r + 1. E r+1 has all nonzero entries when j = i + r + 1, and this proves that E r+1 = E 1 E r . It now follows that r+1  r J m (λ) − I = J m (λ) − I J m (λ) − I = E 1 E r = E r+1 , and the result follows. Exercise 6.15: Properties of the Jordan factorization Let J be the Jordan factorization of a matrix A ∈ Cn×n as given in Theorem 6.4. Then for r = 0, 1, 2, . . ., m = 2, 3, . . ., and any λ ∈ C, a) Ar = SJ r S −1 , b) J r = diag(U r1 , . . . , U rk ), c) U ri = diag(J mi,1 (λi )r , . . . , J mi,gi (λi )r ), Pmin{r,m−1} d) J m (λ)r = (E m + λI m )r = k=0

r k



λr−k E km .

Solution. Let J = S −1 AS be the Jordan form of the matrix A as in Theorem 6.4. Items a)–c) are easily shown by induction, making use of the rules of block multiplication in b) and c). For d), write E m := J m (λ) − λI m , with J m (λ) the Jordan block of order m. By the binomial theorem, r

r

J m (λ) = (E m + λI m ) =

r   X r k=0

k

Since E km = 0 for any k ≥ m, we obtain

E km (λI m )r−k

r   X r r−k k = λ Em. k k=0

6 Eigenpairs and Similarity Transformations

J m (λ)r =

99

min{r,m−1} 

 r r−k k Em. λ k

X

k=0

Exercise 6.16: Powers of a Jordan block Find J 100 and A100 for the matrix in Exercise 6.13. Solution. Let S be as in Exercise 6.13. J is block-diagonal so that we can write  n  n    110 1n0 11 0  = 0 1 0 , J n = 0 1 0 =  0 1 (6.i) n 001 001 0 1 where we used property d) in Exercise 6.15 on the upper left block. It follows that   −1 2 1 0 1 100 0 2 10 = (SJS −1 )100 = SJ 100 S −1 = −4 0 1 0 1 0 −4 0 1 −4 0 0 0 0 1 −4 0 0       1 2 1 0 1 100 0 0 0 −4 201 0 100 = −4 0 1 0 1 0 1 0 12  = −400 1 −200 . −4 0 0 0 0 1 −400 0 −199 0 1 −1 

A100

Exercise 6.17: The minimal polynomial Let J be the Jordan factorization of a matrix A ∈ Cn×n as given in Theorem 6.4. The polynomial µA (λ) :=

k Y

(λi − λ)mi ,

where mi := max mi,j , 1≤j≤gi

i=1

is called the minimal polynomial of A. We define the matrix polynomial µA (A) by replacing the factors λi − λ by λi I − A. Qk Qgi a) We have πA (λ) = i=1 j=1 (λi − λ)mi,j . Use this to show that the minimal polynomial divides the characteristic polynomial, i.e., πA = µA νA for some polynomial νA . P gi

Solution. For each i, (λi − λ)ai = (λi − λ) j=1 mi,j divides πA (λ). Since Pgi mi divides πA (λ). From this it j=1 mi,j ≥ max1≤j≤gi mi,j = mi , also (λi − λ) follows that also µA (λ) divides πA (λ). b) Show that µA (A) = 0 ⇐⇒ µA (J ) = 0. Solution. We have that

100

6 Eigenpairs and Similarity Transformations

µA (A) =

k Y

(λi I − A)mi =

i=1

(λi I − SJS −1 )mi

i=1 k Y

=S

k Y

! (λi I − J )

mi

S −1 = SµA (J )S −1 .

i=1

It follows that µA (A) = 0 if and only if µA (J ) = 0. c) (can be difficult) Use Exercises 6.14, 6.15 and the maximality of mi to show that µA (A) = 0. Thus a matrix satisfies its minimal equation. Finally show that the degree of any polynomial p such that p(A) = 0 is at least as large as the degree of the minimal polynomial. Solution. We have that µA (J ) =

k Y

(λi I − J )mi =

i=1

=

k Y

k Y

diag(λi I − U 1 , . . . , λi I − U k )

m i

i=1

diag (λi I − U 1 )mi , . . . , (λi I − U k )mi



i=1

= diag

k Y

mi

(λi I − U 1 )

,...,

i=1

k Y

! mi

(λi I − U k )

.

i=1

Now we also have   m i (λi I − U i )mi = λi I − diag J mi,1 (λi ), . . . , J mi,gi (λi ) m i = diag λi I − J mi,1 (λi ), . . . , λi I − J mi,gi (λi )  m i m i  = diag λi I − J mi,1 (λi ) , . . . , λi I − J mi,gi (λi ) = 0, since m i mi,j mi −mi,j λi I − J mi,j (λi ) = λi I − J mi,j (λi ) λi I − J mi,j (λi ) = 0(λi I − J mi,j (λi ))mi −mi,j = 0. We now get that k Y

(λi I − U j )mi = (λj I − U j )mj

i=1

so that

k Y i=1,i6=j

(λi I − U j )mi = 0,

6 Eigenpairs and Similarity Transformations

µA (J ) = diag

k Y

(λi I − U 1 )mi , . . . ,

i=1

101 k Y

! (λi I − U k )mi

= 0.

i=1

It follows that µA (A) = 0. Qr Suppose now that p(A) = 0. We can write p(A) = C i=1 (ki I − A)si , where ki are the zeros of p, with multiplicity si . As above it follows that p(A) = 0 if and only if p(J ) = 0. Factor p(J ) as above to obtain ! r k Y Y si si . p(J ) = diag (ki I − U 1 ) , . . . , (ki I − U k ) i=1

i=1

Note that  ki I − U j = diag ki I − J mj,1 (λj ), . . . , ki I − J mj,gj (λj ) is upper triangular with ki − λj on the diagonal. If ki 6= λj , then ki I − U j must be invertible, but then (ki I − U j )si is invertible as well. In order for p(J ) = 0, we must then have that for each j there exists a t so that kt = λj . The qth diagonal block entry in (ki I − U j )si is r Y i=1

r s t Y si ki I − J mj,q (λj ) = kt I − J mj,q (λj )

ki I − J mj,q (λj )

si

i=1,i6=t r st Y = λj I − J mj,q (λj )

si ki I − J mj,q (λj ) .

i=1,i6=t

The last matrix here is invertible (all ki 6= λj when i 6= t), so that we must have that st λj I − J mj,q (λj ) = 0 in order for p(J ) = 0. We know from the exercises that this happens only when st ≥ mj,q . Since q was arbitrary we obtain that st ≥ mj , i.e., that λj is a zero in p of multiplicity ≥ mj . Since this applied for any j, it follows that the minimal polynomial divides p, and the result follows. d) Use c) to show the Cayley-Hamilton Theorem, which says that a matrix satisfies its characteristic equation πA (A) = 0. Solution. We have that πA (A) = µA (A)νA (A) = 0.

102

6 Eigenpairs and Similarity Transformations

Exercise 6.18: Cayley Hamilton Theorem (Exam exercise 1996-3) Pr Suppose p is a polynomial given by p(t) := j=0 bj tj , where bj ∈ C and A ∈ Cn×n . We define the matrix p(A) ∈ Cn×n by p(A) :=

r X

bj A j ,

j=0

where A0 := I. From this it follows that if p(t) := (t − α1 ) · · · (t − αr ) for some α0 , . . . , αr ∈ C then p(A) = (A − α1 ) · · · (A − αr ). We accept this without proof. Let U ∗ AU = T , where U is unitary and T upper triangular with the eigenvalues of A on the diagonal.   21 a) Find the characteristic polynomial πA to . Show that π(A) = 0. −1 4 Solution. We find πA

2 − λ 1 = (2 − λ)(4 − λ) + 1 = λ2 − 6λ + 9 = det(A − λI) = −1 4 − λ

and hence πA (A) = A2 − 6A + 9I        21 21 21 10 = −6 +9 −1 4 −1 4 −1 4 01         3 6 −12 −6 90 00 = + + = . −6 15 6 −24 09 00

b) Let now A ∈ Cn×n be arbitrary. For any polynomial p show that p(A) = U p(T )U ∗ . Solution. We first show by induction that Aj = U T j U ∗ ,

j ∈ N.

(6.ii)

For j = 1 this follows form the Schur decomposition. If Aj = U T j U ∗ for some j ≥ 1, then   Aj+1 = Aj A = U T j U ∗ U T U ∗ = U T j (U ∗ U )T U ∗ = U T j+1 U ∗ , since U ∗ U =PI. Now (6.ii) follows. r If p(t) := j=0 bj tj then

6 Eigenpairs and Similarity Transformations

p(A) :=

r X j=0

(6.ii)

bj A j =

r X

bj U T j U ∗ = U

j=0

103 r X

 bj T j U ∗ = U p(T )U ∗ .

j=0

c) Let n, k ∈ N with 1 ≤ k < n. Let C, D ∈ Cn×n be upper triangular. Moreover, ci,j = 0 for i, j ≤ k and dk+1,k+1 = 0. Define E := CD, and show that ei,j = 0 for i, j ≤ k + 1. Solution. Let M [k] = (mij )ki,j=1 be the leading principal k × k-submatrix of a matrix M . Since C and D are upper triangular we find      C [k+1] C 1,2 D [k+1] D 1,2 C [k+1] D [k+1] E 1,2 E := CD = = , 0 C 2,2 0 D 2,2 0 E 2,2 and it follows that E [k+1] = C [k+1] D [k+1] . But since C [k] = 0 and dk+1,k+1 = 0 we find        E [k] e 0 c D [k] d 00 = E [k+1] = C [k+1] D [k+1] = = 0 α 0β 0 0 00 for some e, c, d ∈ Rk and α, β ∈ R. We conclude that ei,j = 0 for i, j ≤ k + 1. d) Now let p := πA be the characteristic polynomial of A. Show that p(T ) = 0. Then show that p(A) = 0. (Cayley Hamilton Theorem) Hint. Use a suitable factorization of p and use c). Solution. Since p(t) = (t − λ1 ) · · · (t − λn ) we have p(T ) = (T − λ1 I) · · · (T − λn I), where each T − λi I is upper triangular with tii = 0. Define W s := (T − λ1 I) · · · (T − λs I),

s = 1, . . . , n.

We show by induction that ws (i, j) = 0 for i, j ≤ s. This holds for s = 1. Suppose it holds for some s ≥ 1. Since W s and T − λs+1 I are upper triangular and (T − λs+1 I)(s + 1, s + 1) = 0 we can apply c) and obtain  W s+1 (i, j) = W s (T − λs+1 I) (i, j) = 0, i, j ≤ s + 1. Taking s = n we obtain W n = 0 which means that p(T ) = 0. But then also p(A) = U p(T )U ∗ = 0.

104

6 Eigenpairs and Similarity Transformations

Exercises section 6.3

Exercise 6.19: Schur factorization example T 12 Show that a Schur  factorization of A = [ 3 2 ] is U AU = 1 1 1 U = √2 −1 1 .

 −1 −1  0 4 , where

Solution. The matrix U is unitary, as U ∗ U = U T U = I. One directly verifies that   −1 −1 R := U T AU = . 0 4 Since this matrix is upper triangular, A = U RU T is a Schur decomposition of A. Exercise 6.20: Skew-Hermitian matrix Suppose C = A + iB, where A, B ∈ Rn×n . Show that C is skew-Hermitian if and only if AT = −A and B T = B. Solution. By definition, a matrix C is skew-Hermitian if C ∗ = −C. “=⇒”: Suppose that C = A+iB, with A, B ∈ Rm,m , is skew-Hermitian. Then −A − iB = −C = C ∗ = (A + iB)∗ = AT − iB T , which implies that AT = −A and B T = B (use that two complex numbers coincide if and only if their real parts coincide and their imaginary parts coincide). In other words, A is skew-Hermitian and B is real symmetric. “⇐=”: Suppose that we are given matrices A, B ∈ Rm,m such that A is skewHermitian and B is real symmetric. Let C = A + iB. Then C ∗ = (A + iB)∗ = AT − iB T = −A − iB = −(A + iB) = −C, meaning that C is skew-Hermitian. Exercise 6.21: Eigenvalues of a skew-Hermitian matrix Show that any eigenvalue of a skew-Hermitian matrix is purely imaginary. Solution. Let A be a skew-Hermitian matrix and consider a Schur triangularization A = U RU ∗ of A. Then R = U ∗ AU = U ∗ (−A∗ )U = −U ∗ A∗ U = −(U ∗ AU )∗ = −R∗ . Since R differs from A by a similarity transform, their eigenvalues coincide (use the multiplicative property of the determinant to show that

6 Eigenpairs and Similarity Transformations

105

det(A − λI) = det(U ∗ ) det(U RU ∗ − λI)) det(U ) = det(R − λI).) As R is a triangular matrix, its eigenvalues λi appear on its diagonal. From the equation R = −R∗ it then follows that λi = −λi , implying that each λi is purely imaginary. Exercise 6.22: Eigenvector expansion using orthogonal eigenvectors Show that if the eigenpairs (λ1 , u1 ), . . . , (λn , un ) of A ∈ Cn×n are orthogonal, i.e., u∗j uk = 0 for j 6= k, then the eigenvector expansions of x, Ax ∈ Cn take the form x=

n X

cj uj ,

Ax =

j=1

Solution. If x =

n X

cj λj uj , where cj =

j=1

Pn

j=1 cj uj

u∗j x . u∗j uj

and the eigenvectors are orthogonal we get that

u∗i x =

n X

cj u∗i uj = ci u∗i ui ,

j=1

so that ci = u∗i x/(u∗i ui ). Exercise 6.23: Rayleigh quotient (Exam exercise 2015-3) a) Let A ∈ Rn×n be a symmetric matrix. Explain how we can use the spectral theorem for symmetric matrices to show that λmin = min R(x) = min R(x), x6=0

kxk2 =1

where λmin is the smallest eigenvalue of A, and R(x) is the Rayleigh quotient given by xT Ax R(x) := T . x x Solution. The spectral theorem says that we can write any real symmetric matrix as A = U DU T , where U is orthogonal and D = diag(λ1 , . . . , λn ) is diagonal. We now get that R(x) = =

xT Ax xT U DU T x (U T x)T D(U T x) = = xT x xT x kxk2 (U T x)T D(U T x) = RD (U T x) kU T xk2

since U is orthogonal (RD is the Rayleigh quotient using D instead of A). We thus have that

106

6 Eigenpairs and Similarity Transformations

min R(x) = min RD (U T x) = min RD (x) = min x6=0

x6=0

x6=0

x6=0

n X

λi

i=1

xi2 = λmin , kxk2

where the minimum is attained for x = ei with λi = λmin . b) Let x, y ∈ Rn such that kxk2 = 1 and y 6= 0. Show that T R(x − ty) = R(x) − 2t Ax − R(x)x y + O(t2 ), where t > 0 is small. Hint. Use Taylor’s theorem for the function f (t) = R(x − ty). Solution. Using the hint we have that f (0) = R(x) and f (t) =

(x − ty)T A(x − ty) xT Ax − 2txT Ay + t2 y T Ay g(t) = = . (x − ty)T (x − ty) kxk2 − 2txT y + t2 kyk2 h(t)

Here g(0) = xT Ax,

g 0 (t) = −2xT Ay + 2ty T Ay,

g 0 (0) = −2xT Ay,

h(0) = kxk2 = 1,

h0 (t) = −2xT y + 2tkyk2 ,

h0 (0) = −2xT y.

We now get that g 0 (0)h(0) − g(0)h0 (0) = −2xT Ay + 2xT yxT Ax h(0)2  T = −2 (Ax)T y − R(x)xT y = −2 Ax − R(x)x y.

f 0 (0) =

Clearly the second derivative of f is bounded close to 0, so that f (t) = f (0) + tf 0 (0) + O(t2 ). Inserting f (0) = R(x) and f 0 (0) = −2(Ax − R(x)x)T y gives the desired result. c) Based on the characterisation given in a) above it is tempting to develop an algorithm for computing λmin by approximating the minimum of R(x) over the unit ball B1 := {x ∈ Rn | kxk2 = 1}. Assume that x0 ∈ B1 satisfies Ax0 − R(x0 )x0 6= 0, i.e., (R(x0 ), x0 ) is not an eigenpair for A. Explain how we can find a vector x1 ∈ B1 such that R(x1 ) < R(x0 ). Solution. If Ax0 − R(x0 )x0 6= 0 we can choose a vector y so that (Ax0 − R(x0 )x0 )T y > 0 (y can for instance be a vector pointing in the same direction

6 Eigenpairs and Similarity Transformations

107

as Ax0 − R(x0 )x0 ). But then −2t(Ax0 − R(x0 )x0 )T y < 0 (t is assumed to be positive here) and since this term dominates O(t2 ) for small t, we see that R(x0 − ty) < R(x0 ). In other words, we can reduce the Rayleigh quotient by taking a small step from x0 in the direction of Ax0 − R(x0 )x0 .

Exercises section 6.4

Exercise 6.24: Eigenvalue perturbation for Hermitian matrices Show that in Theorem 6.13, if E is symmetric positive semidefinite then βi ≥ αi . Solution. Let ε1 ≥ ε2 ≥ · · · ≥ εn be the eigenvalues of E := B − A. Since a positive semidefinite matrix has no negative eigenvalues, one has εn ≥ 0. It immediately follows from αi + εn ≤ βi that in this case βi ≥ αi . Exercise 6.25: Hoffman-Wielandt Show that Equation (6.15) does not hold for the matrices A := [ 00 04 ] and  −1 −1 B := 1 1 . Why does this not contradict the Hoffman-Wielandt theorem (Theorem 6.14)? Solution. The matrix A has eigenvalues 0 and 4, and the matrix B has eigenvalue 0 with algebraic multiplicity two. Independently of the choice of the permutation i1 , . . . , in , the Hoffman-Wielandt Theorem would yield 16 =

n X j=1

|µij − λj |2 ≤

n X n X

|aij − bij |2 = 12,

i=1 j=1

which clearly cannot be valid. However, the Hoffman-Wielandt Theorem cannot be applied to these matrices, because B is not normal,     22 2 −2 ∗ B B= 6= = BB ∗ . 22 −2 2

Exercise 6.26: Biorthogonal expansion Determine right and left eigenpairs for the matrix A := [ 32 12 ] and the two expansions in Equation (6.16) for any v ∈ R2 . Solution. The matrix A has characteristic polynomial det(A−λI) = (λ−4)(λ−1) and right eigenpairs (λ1 , x1 ) = (4, [1, 1]T ) and (λ2 , x2 ) = (1, [1, −2]T ). Since

108

6 Eigenpairs and Similarity Transformations

the right eigenvectors x1 , x2 are linearly independent, there exists vectors y 1 , y 2 satisfying hy i , xj i = δij . A vector orthogonal to x1 = [1, 1]T must be of the form y 2 = α[1, −1]T , and a vector orthogonal to x2 = [1, −2]T must be of the form y 1 = β[2, 1]T . These choices secure that hy i , xj i = 0 when i 6= j. We must also have that 1 = hy 1 , x1 i = 2β + β = 3β,

1 = hy 2 , x2 i = α + 2α = 3α,

so that α = β = 1/3, and we can choose the dual basis as y 1 = 31 [2, 1]T and y 2 = 13 [1, −1]T . Equation (6.16) then gives us the biorthogonal expansions 1 1 (2v1 + v2 )x1 + (v1 − v2 )x2 3 3 = hv, x1 iy 1 + hv, x2 iy 2 = (v1 + v2 )y 1 + (v1 − 2v2 )y 2 .

v = hv, y 1 ix1 + hv, y 2 ix2 =

Exercise 6.27: Generalized Rayleigh quotient For A ∈ Cn×n and any y, x ∈ Cn with y ∗ x 6= 0 the quantity R(y, x) = ∗ RA (y, x) := yy∗Ax x is called a generalized Rayleigh quotient for A. Show that if (λ, x) is a right eigenpair for A then R(y, x) = λ for any y with y ∗ x 6= 0. Also show that if (λ, y) is a left eigenpair for A then R(y, x) = λ for any x with y ∗ x 6= 0. Solution. Suppose (λ, x) is a right eigenpair for A, so that Ax = λx. Then the generalized Rayleigh quotient for A is R(y, x) :=

y ∗ Ax y ∗ λx = = λ, y∗ x y∗ x

which is well defined whenever y ∗ x 6= 0. On the other hand, if (λ, y) is a left eigenpair for A, then y ∗ A = λy ∗ and it follows that R(y, x) :=

y ∗ Ax λy ∗ x = = λ. y∗ x y∗ x

Chapter 7

The Singular Value Decomposition

Exercises section 7.1

Exercise 7.1: SVD1 Show that the decomposition        1 1 1 20 1 1 1 11 √ A := =√ = U DU T 11 2 1 −1 0 0 2 1 −1 is both a spectral decomposition and a singular value decomposition. Solution. The factorization is easily verified. Here   1 1 1 U =V = √ , 2 1 −1 which is clearly unitary. Since in addition the matrix in the middle is diagonal, it follows that this is both a spectral and a singular value decomposition. Exercise 7.2: SVD2 Show that the decomposition        1 1 1 1 −1 2 0 1 1 −1 √ A := =√ =: U ΣV T 1 −1 1 −1 0 0 1 1 2 2 is a singular value decomposition. Show that A is defective, so it cannot be diagonalized by any similarity transformation. Solution. Again the factorization is easily verified. Here     1 1 1 1 1 1 , V =√ , U=√ 2 1 −1 2 −1 1 © The Author(s), under exclusive license to Springer Nature Switzerland AG 2020 T. Lyche et al., Exercises in Numerical Linear Algebra and Matrix Factorizations, Texts in Computational Science and Engineering 23, https://doi.org/10.1007/978-3-030-59789-4_7

109

7 The Singular Value Decomposition

110

which are both unitary (but different). Since in addition the matrix in the middle is diagonal and nonnegative, we have a singular value decomposition. The characteristic equation of the matrix A is 0 = (1 − λ)(−1 − λ) + 1 = λ2 , so that 0 is an eigenvalue with algebraic multiplicity 2. The corresponding eigenvectors are found by row reducing     1 −1 1 −1 ∼ , 0 0 1 −1 so that [1, 1]T spans this eigenspace, which is thus one-dimensional. It follows that A is defective. Exercise 7.3: SVD examples Find the singular value decomposition of the following matrices   3 a) A = . 4 Solution. For A = [3, 4]T we find a 1 × 1 matrix AT A = 25,√which has the eigenvalue λ1 = 25. This provides us with the singular value σ1 = λ1 = 5 for A. Hence the matrix A has rank 1 and an SVD of the form     5   V1 , A = U1 U2 with U 1 , U 2 ∈ R2,1 , V = V 1 ∈ R. 0 The eigenvector of AT A that corresponds to the eigenvalue λ1 = 25 is given by   v 1 = 1, providing us with V = 1 . Using part 3 of Theorem 7.2, one finds u1 = 1 1 T 2 T 5 [3, 4] . Extending u1 to an orthonormal basis for R gives u2 = 5 [−4, 3] . An SVD of A is therefore    1 3 −4 5   1 . A= 0 5 4 3 

 11 b) A =  2 2 . 22 Solution. One has  1  A= 2 2

 1 2 , 2

  122 A = , 122 T

  99 A A= . 99 T

7 The Singular Value Decomposition

111

T The eigenvalues of AT A are the zeros of det(A A−λI) = (9−λ)2 −81, yielding √ λ1 = 18 and λ2 = 0, and therefore σ1 = 18 and σ2 = 0. Note that since there is only one nonzero singular value, the rank of A is one. Following the dimensions of A, one finds √  18 0 Σ =  0 0 . 0 0

The normalized eigenvectors v 1 , v 2 of AT A corresponding to the eigenvalues λ1 , λ2 are the columns of the matrix   1 1 −1 V = [v 1 v 2 ] = √ . 2 1 1 Using part 3 of Theorem 7.2 one finds u1 , which can be extended to an orthonormal basis {u1 , u2 , u3 } using Gram-Schmidt orthogonalization (see Theorem 5.4). The vectors u1 , u2 , u3 constitute a matrix   1 −2 −2 1 U = [u1 u2 u3 ] = 2 2 −1 . 3 2 −1 2 An SVD of A is therefore given by   √ 1 −2 −2 18 1 2 2 −1  0 A= 3 2 −1 2 0

  0 1  1 1 . 0 √ 2 −1 1 0

Exercise 7.4: More SVD examples Find the singular value decomposition of the following matrices a) A = e1 the first unit vector in Rm .   Solution. We have A = e1 and AT A = eT 1 e1 = 1 . This gives the eigenpair (λ1 , v 1 ) = (1, [1]) of AT A. Hence σ1 = 1 and Σ = e1 = A. As Σ = A and V = I 1 we must have U = I m yielding a singular value decomposition A = I m e1 I 1 . n b) A = eT n the last unit vector in R .

Solution. For A = eT n , the matrix

7 The Singular Value Decomposition

112

 0 ···  .. . .  AT A =  . . 0 · · · 0 ···

 00 .. ..  . . . 0 0 01

1,n and has eigenpairs (0, ej ) for j = 1, . . . , n − 1 and (1, en ). Then Σ = eT 1 ∈ R  n,n , . . . , e1 ∈ R . Using part 3 of Theorem 7.2 we get u1 = 1, V = en , en−1   yielding U = 1 . An SVD for A is therefore given by

    A = enT = 1 e1T en , en−1 , . . . , e1 .

c) A =

 −1 0  0 3

.

Solution. In this exercise   −1 0 A= , 0 3

T

A = A,

  10 A A= . 09 T

The eigenpairs of AT A are given by (λ1 , v 1 ) = (9, e2 ) and (λ2 , v 2 ) = (1, e1 ), from which we find     30 01 Σ= , V = . 01 10 Using part 3 of Theorem 7.2 one finds u1 = e2 and u2 = −e1 , which constitute the matrix   0 −1 U= . 1 0 An SVD of A is therefore given by     0 −1 3 0 0 1 . A= 1 0 01 10

Exercise 7.5: Singular values of a normal matrix Show that a) the singular values of a normal matrix are the absolute values of its eigenvalues, Solution. A normal matrix can be written in the form A = U ΣU ∗ , with U unitary, Σ diagonal. We now obtain that A∗ A = U Σ ∗ ΣU ∗ , so that the eigenvalues of A∗ A are |λi |2 , where λi are the eigenvalues of A. It follows that |λi | are the singular values of A.

7 The Singular Value Decomposition

113

b) the singular values of a symmetric positive semidefinite matrix are its eigenvalues. Solution. A symmetric positive definite matrix is also normal, so that the singular values are |λi |. Since the eigenvalues are also positive, the result follows. Exercise 7.6: The matrices A∗ A, AA∗ and SVD Show the following: If A = U ΣV ∗ is a singular value decomposition of A ∈ Cm×n then a) A∗ A = V diag(σ12 , . . . , σn2 )V ∗ is a spectral decomposition of A∗ A. 2 b) AA∗ = U diag(σ12 , . . . , σm )U ∗ is a spectral decomposition of AA∗ .

c) The columns of U are orthonormal eigenvectors of AA∗ . d) The columns of V are orthonormal eigenvectors of A∗ A. Solution. a) and d): We have that A∗ A = V Σ ∗ U ∗ U ΣV ∗ = V Σ ∗ ΣV ∗ . This is a spectral decomposition, and the diagonal entries of Σ ∗ Σ are σi2 . In particular it follows that the columns of V are the corresponding eigenvectors. b) and c): We have that AA∗ = U ΣV ∗ V Σ ∗ U ∗ = U Σ ∗ ΣU ∗ . This is a spectral decomposition, and the diagonal entries of Σ ∗ Σ are again σi2 . In particular it follows that the columns of U are the corresponding eigenvectors. Exercise 7.7: Singular values (Exam exercise 2005-2) Given the statement: “If A ∈ Rn×n has singular values (σ1 , . . . , σn ) then A2 has singular values (σ12 , . . . , σn2 )”. Find a class of matrices for which the statement is true. Show that the statement is not true in general. Solution. The singular values are the nonnegative square roots of the eigenvalues of AT A so the statement is true if and only if AT A and A2 have the same eigenvalues. The claim is correct  if A  is symmetric. An example where the statement is not true 11 is given by A = . Note that A2 is upper triangular with eigenvalue 1 with 01   √ 11 algebraic multiplicity 2, while AT A = has eigenvalues (3 ± 5)/2. 12 Suppose A has singular value decomposition A = U ΣV T . The statement is correct if V = U for then A2 = U ΣV T U ΣV T = U Σ 2 V T and this is the singular value decomposition of A2 . The identity V = U holds if A is normal.

7 The Singular Value Decomposition

114

Exercises section 7.2

Exercise 7.8: Nonsingular matrix   11 48 Derive the SVF and SVD of the matrix A = . Also, possibly using 48 39 a computer, find its spectral decomposition U DU T . The matrix A is normal, but the spectral decomposition is not an SVD. Why? 1 25

Hint. Answer: A =

1 5

     3 −4 3 0 1 3 4 . 4 3 0 1 5 4 −3

Solution. We have that     1 2425 2400 1 97 96 A A= = . 625 2400 3825 25 96 153   97 96 The characteristic equation of is 96 153 T

0 = (97 − λ)(153 − λ) − 962 = λ2 − 250λ + 5625. √

This gives λ = 250± 2 40000 = 125 ± 100, i.e., 25 and 225. Scaling gives the eigenvalues 1 and 9 of AT A, so that the singular values of A are 1 and 3. A corresponding eigenvector for λ = 9 for AT A is found by row reducing     5.12 −3.84 4 −3 T 9I − A A = ∼ , −3.84 2.88 0 0 which gives v 1 = 15 [3, 4]T . For λ = 1 we get     −2.88 −3.84 34 I − AT A = ∼ , −3.84 −5.12 00 which gives v 2 = 15 [4, −3]T . This gives V

T

   T 1 3 4 = v1 v2 = . 5 4 −3

We also obtain       1 1 11 48 1 3 1 3 u1 = Av 1 = = σ1 75 48 39 5 4 5 4

7 The Singular Value Decomposition

115

      1 1 11 48 1 4 1 −4 u2 = Av 2 = = . σ2 25 48 39 5 −3 5 3     3 −4 This gives U = u1 u2 = 15 . Finally we obtain 4 3 A = U ΣV T =

     1 3 −4 3 0 1 3 4 . 0 1 5 4 −3 5 4 3

The spectral decomposition is easily found to be      1 3 −4 3 0 1 3 4 A= . 0 −1 5 −4 3 5 4 3 In particular the matrix is unitarily diagonalizable, so that it is normal. The spectral decomposition is not an SVD since one of the eigenvalues was negative: In the SVD the singular values are positive, so this is something different. Exercise 7.9: Full row rank Find the SVF and SVD of A :=

  1 14 4 16 ∈ R2×3 . 15 2 22 13

Hint. Take the transpose of the matrix in (7.2). Solution. Transposing the matrix in (7.2) we obtain      1 2 2 1 3 4 200 1 2 −2 1  , A= 5 4 −3 0 1 0 3 2 1 −2 which is an SVD. An SVF is A=

     1 3 4 20 1 1 2 2 . 5 4 −3 0 1 3 2 −2 1

Exercise 7.10: Counting dimensions of fundamental subspaces Suppose A ∈ Cm×n . Show using SVD that a) rank(A) = rank(A∗ ). Solution. Let A have singular value decomposition U ΣV ∗ .

7 The Singular Value Decomposition

116

By parts 1. and 3. of Theorem 7.3, span(A) and span(A∗ ) are vector spaces of the same dimension r, implying that rank(A) = rank(A∗ ). b) rank(A) + null(A) = n, Solution. This statement is known as the rank-nullity theorem, and it follows immediately from combining parts 1. and 4. in Theorem 7.3. c) rank(A) + null(A∗ ) = m, where null(A) is defined as the dimension of N (A). Solution. As rank(A∗ ) = rank(A) by 1., this follows by replacing A by A∗ in 2. Exercise 7.11: Rank and nullity relations Use Theorem 7.1 to show that for any A ∈ Cm×n a) rank A = rank(A∗ A) = rank(AA∗ ), Solution. Let A = U ΣV ∗ be a singular value decomposition of a matrix A ∈ Cm×n . By part 5 of Theorem 7.1, rank(A) is the number of positive eigenvalues of AA∗ = U ΣV ∗ V Σ ∗ U ∗ = U DU ∗ , where D := ΣΣ ∗ is a diagonal matrix with real nonnegative elements. Since U DU ∗ is an orthogonal diagonalization of AA∗ , the number of positive eigenvalues of AA∗ is the number of nonzero diagonal elements in D. Moreover, rank(AA∗ ) is the number of positive eigenvalues of AA∗ (AA∗ )∗ = AA∗ AA∗ = U ΣΣ ∗ ΣΣ ∗ U ∗ = U D 2 U ∗ , which is the number of nonzero diagonal elements in D 2 , so that rank(A) = rank(AA∗ ). From a similar argument for rank(A∗ A), we conclude that rank(A) = rank(AA∗ ) = rank(A∗ A). b) null(A∗ A) = null A, and null(AA∗ ) = null(A∗ ). Solution. Let r := rank(A) = rank(A∗ ) = rank(AA∗ ) = rank(A∗ A). Applying Theorem 7.1, parts 3 and 4, to the singular value decompositions A = U ΣV ∗ , A∗ = V ΣU ∗ , AA∗ = U ΣΣ ∗ U ∗ , A∗ A = V Σ ∗ ΣV ∗ , one finds that {v r+1 , . . . , v n } is a basis for both ker(A) and ker(A∗ A), while {ur+1 , . . . um } is a basis for both ker(A∗ ) and ker(AA∗ ). In particular it follows

7 The Singular Value Decomposition

117

that null(A) = null(A∗ A),

null(A∗ ) = null(AA∗ ),

which is what needed to be shown. Exercise 7.12: Orthonormal bases example Let A and B be as in Example 7.2. Give orthonormal bases for R(B) and N (B). Solution. We recall the SVD     14 2 1 2 2 2 1  1 4 22 =  2 −2 1  0 A= 15 3 16 13 2 1 −2 0

   0 1 3 4  1 = U ΣV ∗ . 5 4 −3 0

We have that B = A∗ A, and that B = V Σ 2 V ∗ is an SVD for B, with rank(B) = 2. Since B has full rank N (B) = {0}, and we can choose any orthonormal basis (for instance the standard basis) for R2 as basis for R(B). Exercise 7.13: Some spanning sets Show for any A ∈ Cm×n that R(A∗ A) = R(V 1 ) = R(A∗ ) Solution. The matrices A ∈ Cm×n and A∗ A have the same rank r since they have the same number of singular values, so that the vector spaces span(A∗ A) and span(A∗ ) have the same dimension. It is immediate from the definition that span(A∗ A) ⊂ span(A∗ ), and therefore span(A∗ A) = span(A∗ ). From Theorem 7.3 we know that the columns of V 1 form an orthonormal basis for span(A∗ ), and the result follows. Exercise 7.14: Singular values and eigenpairs of composite matrix Let A ∈ Cm×n with m ≥ n have singular values σ1 , . . . , σn , left singular vectors u1 , . . . , um ∈ Cm , and right singular vectors v 1 , . . . , v n ∈ Cn . Show that the matrix   0 A C := ∈ R(m+n)×(m+n) A∗ 0 has the n + m eigenpairs {(σ1 , p1 ), . . . , (σn , pn ), (−σ1 , q 1 ), . . . , (−σn , q n ), (0, r n+1 ), . . . , (0, r m )}, where, for i = 1, . . . , n and j = n + 1, . . . , m,     ui ui pi = , qi = , vi −v i

  u rj = j . 0

7 The Singular Value Decomposition

118

Solution. Given is a singular value decomposition A = U ΣV ∗ . Let r = rank(A), so that σ1 ≥ · · · ≥ σr > 0 and σr+1 = · · · = σn = 0. Let U = [U 1 , U 2 ] and V = [V 1 , V 2 ] be partitioned accordingly and Σ1 = diag(σ1 , . . . , σr ) as in Equation (7.4), so that A = U 1 Σ1 V ∗1 forms a singular value factorization of A. By Theorem 7.3,       0 A ui Av i σi pi for i = 1, . . . , r Cpi = = = A∗ 0 v i A∗ ui 0 · pi for i = r + 1, . . . , n       ui −σi q i for i = 1, . . . , r −Av i 0 A = = Cq i = A∗ 0 −v i A∗ ui 0 · q i for i = r + 1, . . . , n        0 0 A uj 0 Cr j = = 0 · r j , for j = n + 1, . . . , m. = = 0 A∗ 0 A∗ uj 0 This gives a total of n + n + (m − n) = m + n eigenpairs. Exercise 7.15: Polar decomposition (Exam exercise 2011-2) Given n ∈ N and a singular value decomposition A = U ΣV T of a square matrix A ∈ Rn,n , consider the matrices Q := U V T ,

P := V ΣV T

of order n. a) Show that A = QP ,

(7.13)

and show that Q is orthonormal. Solution. Since V T V = I we find QP = U V T V ΣV T = U ΣV T = A. Q is orthonormal since U and V T are orthonormal and a product of orthonormal matrices is orthonormal. b) Show that P is symmetric positive semidefinite and positive definite if A is nonsingular. The factorization in (7.13) is called a polar factorization. Solution. Write Σ = diag(σ1 , . . . , σn ). For any x ∈ Rn xT P x = xT V ΣV T x = y T Σy =

n X

σj |yj |2 ,

j=1

where y = V T x. Since the singular values σj are nonnegative it follows that xT P x ≥ 0. then Σ is nonsingular and then σj > 0 for all j. But then PnIf A is nonsingular 2 σ |y | > 0 for all nonzero y ∈ Rn so that xT P x > 0 for all nonzero j=1 j j n x∈R .

7 The Singular Value Decomposition

119

c) Usep the singular value decomposition of A to give a suitable definition of B := AT A so that P = B. T T T 2 T 2 Solution. p We have A A = V ΣU U ΣV = V Σ V = P , so we can define B = AT A := P = V ΣV T .

For the rest of this exercise assume that A is nonsingular. Consider the iterative method X k+1 =

 1 X k + X −T , k 2

k = 0, 1, 2, . . . ,

X 0 := A,

(7.14)

for finding Q. d) Show that the iteration (7.14) is well defined by showing that X k = U Σ k V T , for a diagonal matrix Σ k with positive diagonal elements, k = 0, 1, 2, . . .. Solution. We have X 0 := A = U Σ 0 V T , where Σ 0 = Σ is a diagonal matrix with positive diagonal elements. Suppose by induction that X k = U Σ k V T for a diagonal matrix Σ k with positive diagonal elements. Then  1 X k + X −T = U Σ k+1 V T , k 2  where Σ k+1 := 12 Σ k + Σ −1 is a diagonal matrix, and each diagonal element in k Σ k+1 is a sum of two positive numbers and therefore positive. In particular, if X k is nonsingular then X k+1 will also be nonsingular. It follows that the iteration is well defined. X k+1 :=

e) Show that X k+1 − Q =

  1 −T T X XT Xk − Q k −Q 2 k

(7.15)

and use (7.15) and the Frobenius norm to show (quadratic convergence to Q) kX k+1 − QkF ≤

1 2 kX −1 k kF kX k − QkF . 2

(7.16)

T Solution. Since X k = U Σ k V T we have X −1 = V Σ −1 and X −T = k k U k −1 T U Σ k V . But then T −1 T T T X −T = Q. k Q X k = U Σk V V U U ΣkV

Expanding the right hand side of (7.15) and using this we find   1 −T T Xk XT Xk − Q k −Q 2

7 The Singular Value Decomposition

120

  1 T I − X −T Xk − Q k Q 2  1 T −T T = X k − X −T k Q Xk − Q + Xk Q Q 2  1 = X k + X −T −Q k 2 = X k+1 − Q.

=

This shows (7.15). We have kB T kF = kBkF for any matrix B. Moreover, the Frobenius norm is consistent on Rn,n . Therefore

1 −T   T T

kX k+1 − QkF = X k X k − Q Xk − Q

2 F 1 −T T T ≤ kX k kF kX k − Q kF kX k − QkF 2 1 = kX −1 k kF kX k − QkF kX k − QkF 2 and (7.16) follows. f) Write a MATLAB program function [Q,P,k] = polardecomp(A,tol,K) to carry out the iteration in

(7.14). The output is approximations Q and P = QT A to the polar decomposition A = QP of A and the number of iterations k such that kX k+1 − X k kF < tol · kX k+1 kF . Set k = K + 1 if convergence is not achieved in K iterations. The Frobenius norm in MATLAB is written norm(A ,’fro’). Solution. code/polardecomp.m 1 2 3 4 5 6 7 8 9 10

function [Q,P,k] = polardecomp(A,tol,K) X=A; for k=1:K Q=(X+inv(X’))/2; if norm(Q-X,’fro’) k by b) ˜i , b ˜k i = hb ˜i , hb

k X j=1

cj bj i =

k X j=1

Pk

j=1 cj bj

˜i , bj i = 0. cj hb

for some cj ∈ R.

8 Matrix Norms and Perturbation Theory for Linear Systems

131

˜k 6= 0 and b ˜k − bk ∈ span(b1 , . . . , bk−1 ) we see that b ˜1 , . . . , b ˜n is an Since b n A-orthogonal basis for R . ˜1 , . . . , b ˜n ]. Show that there is an upper triangular matrix ˜ n := [b d) Define B n×n ˜ nT . T ∈R with ones on the diagonal and satisfies B n = B ˜k − bk ∈ span(b1 , . . . , bk−1 ) we have b ˜k = bk + Pk−1 sj,k bj for Solution. Since b j=1 ˜ n = B n S, where S ∈ Rn×n is unit upper some sj,k ∈ R. But this implies that B ˜ n T , where T = S −1 is also unit upper triangular (cf. triangular. But then B n = B Lemma 2.5). e) Assume that the matrix T in d) is such that |tij | ≤ 12 for all i, j ≤ n, ˜k k2 ≤ 2kb ˜k+1 k2 for k = 1, . . . , n − 1 and that i 6= j. Assume also that kb A A det(B n ) = 1. Show that then p kb1 kA kb2 kA · · · kbn kA ≤ 2n(n−1)/4 det(A). Hint. ˜1 k2 · · · kb ˜n k2 = det(A) Show that kb A A ˜1 k2 · · · kb ˜n k2 = det(A). By A-orthogonality for Solution. We first show that kb A A all i, j  ˜ T Ab ˜j = δi,j b ˜T Ab ˜i . ˜ T AB ˜ n (i, j) = b B n i i But then, by construction of T and assumption of B n ,    det T −1 = det (T −1 )T = 1 = det(B n ) = det (B n )T ˜ n = B n T −1 , it follows that and B   ˜1 k2 · · · kb ˜n k2 = det diag(b ˜ T Ab ˜1 , . . . , b ˜T Ab ˜n ) kb A A 1 n  T     ˜ AB ˜ n = det T −1 T B T AB n T −1 = det(A). = det B n n ˜k + Pk−1 tj,k b ˜j , and by A-orthogonality this ˜ n T ek = b Now bk = B n ek = B j=1 implies k−1 X ˜k k2 + ˜j k2 . kbk k2 = kb |tj,k |2 kb A

A

A

j=1

˜k k2 ≤ 2kb ˜k+1 k2 we obtain kb ˜j k2 ≤ 2k−j kb ˜k k2 for j = 1, . . . , k − 1. Since kb A A A A Moreover, since |tj,k | ≤ 1/2 we find

8 Matrix Norms and Perturbation Theory for Linear Systems

132 k−1

X 2 ˜k k2 + 1 ˜k k2 ≤ kb ˜k k2 1 + 2k−2 ) ≤ kb ˜k k2 2k−1 . kbk kA ≤ kb 2k−j kb A A A A 4 j=1 But then kb1 k2A

· · · kbn k2A



n Y

˜k k2 2k−1 = det(A) kb A

k=1

n Y

2k−1 = det(A)2n(n−1)/2 .

k=1

Taking square roots the result follows.

Exercises section 8.2

Exercise 8.3: Consistency of sum norm? Show that the sum norm is consistent. Solution. Observe that the sum norm is a matrix norm. This follows since it is equal to the l1 -norm of the vector v = vec(A) obtained by stacking the columns of a matrix A on top of each other. Let A = (aij )ij and B = (bij )ij be matrices for which the product AB is defined. Then X X X kABkS = aik bkj ≤ |aik | · |bkj | i,j k i,j,k X X X ≤ |aik | · |blj | = |aik | |blj | = kAkS kBkS , i,j,k,l

i,k

l,j

where the first inequality follows from the triangle inequality and multiplicative property of the absolute value | · |. Since A and B were arbitrary, this proves that the sum norm is consistent. Exercise 8.4: Consistency of max norm? Show that the max norm is not consistent by considering [ 11 11 ]. Solution. Observe that the max norm is a matrix norm. This follows since it is equal to the l∞ -norm of the vector v = vec(A) obtained by stacking the columns of a matrix A on top of each other. To show that the max norm is not consistent we use a counter example. Let 11 A=B= . Then 11

8 Matrix Norms and Perturbation Theory for Linear Systems

133

          1 1 1 1 = 2 2 = 2 > 1 = 1 1 1 1 , 1 1 1 1 1 1 1 1 2 2 M M M M contradicting kAB||M ≤ kAkM kBkM . Exercise 8.5: Consistency of modified max norm Exercise 8.4 shows that the max norm is not consistent. In this exercise we show that the max norm can be modified so as to define a consistent matrix norm. a) Show that the norm kAk :=



mnkAkM ,

A ∈ Cm×n

is a consistent matrix norm. Solution. To show that k · k defines a consistent matrix norm, we have to show that it fulfills the three matrix norm properties and that it is submultiplicative. Let A, B ∈ Cm,n be any matrices and α any scalar. √ 1. Positivity. Clearly kAk := mnkAkM ≥ 0. Moreover, kAk = 0 ⇐⇒ ai,j = 0 ∀i, j ⇐⇒ A = 0. √ √ 2. Homogeneity. kαAk = mnkαAkM = |α| mnkAkM = |α|kAk. 3. Subadditivity. One has  √ √  kA + Bk = nmkA + BkM ≤ nm kAkM + kBkM = kAk + kBk. 4. Submultiplicativity. One has kABk = ≤



√ √

q X mn max ai,k bk,j 1≤i≤m 1≤j≤n mn max

1≤i≤m 1≤j≤n



k=1 q X

|ai,k ||bk,j |

k=1

mn max

1≤i≤m

! |ai,k |

k=1

!



≤ q mn

max |bk,j |

1≤k≤q 1≤j≤n

q X

max |ai,k |

1≤i≤m 1≤k≤q

! max |bk,j |

1≤k≤q 1≤j≤n

= kAkkBk.

b) Show that the constant



mn can be replaced by m and by n.

8 Matrix Norms and Perturbation Theory for Linear Systems

134

Solution. For any A ∈ Cm,n , let kAk(1) := mkAkM

and

kAk(2) := nkAkM .

Comparing with the solution of part a) we see, that the points of positivity, homogeneity and subadditivity are fulfilled here as well, making kAk(1) and kAk(2) valid matrix norms. Furthermore, for any A ∈ Cm,q , B ∈ Cq,n , q ! ! X (1) kABk = m max ai,k bk,j ≤ m max |ai,k | q max |bk,j | 1≤k≤q 1≤i≤m 1≤i≤m 1≤k≤q 1≤j≤n 1≤j≤n (1)

= kAk (2)

kABk

k=1 (1)

kBk

= n max | 1≤i≤m 1≤j≤n

q X

!

! ai,k bk,j | ≤ q

k=1 (2)

= kAk(2) kBk

, max |ai,k | n

1≤i≤m 1≤k≤q

max |bk,j |

1≤k≤q 1≤j≤n

,

which proves the submultiplicativity of both norms. Exercise 8.6: What is the sum norm subordinate to? Show that the sum norm is subordinate to the l1 -norm. Solution. For any matrix A = [aij ]ij ∈ Cm,n and column vector x = [xj ]j ∈ Cn , one has X m X m X n m X n n X X X n ≤ kAxk1 = a x |a | · |x | ≤ |a | |xk | ij j ij j ij i=1 j=1 i=1 j=1 i=1 j=1 k=1 = kAkS kxk1 , which shows that the matrix norm k · kS is subordinate to the vector norm k · k1 . Exercise 8.7: What is the max norm subordinate to? Let A = [aij ]ij ∈ Cm,n be a matrix and x = [xj ]j ∈ Cn a column vector. a) Show that the max norm is subordinate to the ∞ and 1 norm, i.e., kAxk∞ ≤ kAkM kxk1 holds for all A ∈ Cm×n and all x ∈ Cn . Solution. One has kAxk∞

X n n X X n = max aij xj ≤ max |aij | · |xj | ≤ max |aij | |xj | i=1,...,m i=1,...,m i=1,...,m j=1 j=1 j=1 j=1,...,n = kAkM kxk1 .

8 Matrix Norms and Perturbation Theory for Linear Systems

135

b) Show that if kAkM = |akl |, then kAel k∞ = kAkM kel k1 . Solution. Assume that the maximum in the definition of kAkM is attained in column l, implying that kAkM = |ak,l | for some k. Let el be the lth standard basis vector. Then kel k1 = 1 and kAel k∞ = max |ai,l | = |ak,l | = |ak,l | · 1 = kAkM · kel k1 , i=1,...,m

which is what needed to be shown. c) Show that kAkM = maxx6=0

kAxk∞ kxk1 .

Solution. By a), kAkM ≥ kAxk∞ /kxk1 for all nonzero vectors x, implying that kAkM ≥ max x6=0

kAxk∞ . kxk1

By b), equality is attained for any standard basis vector el for which there exists a k such that kAkM = |ak,l |. We conclude that kAkM = max x6=0

kAxk∞ , kxk1

which means that k · kM is the (∞, 1)-operator norm (see Definition 8.6). Exercise 8.8: Spectral norm Let m, n ∈ N and A ∈ Cm×n . Show that kAk2 =

max

kxk2 =kyk2 =1

|y ∗ Ax|.

Solution. Let A = U ΣV ∗ be a singular value decomposition of A, and write σ1 := kAk2 for the biggest singular value of A. Since the orthogonal matrices U and V leave the Euclidean norm invariant, |y ∗ Ax| = max |y ∗ U ΣV ∗ x| = max |y ∗ Σx| kxk2 =1=kyk2 kxk2 =1=kyk2 n n X X = max σ x y ≤ max σi |xi y i | i i i kxk2 =1=kyk2 kxk2 =1=kyk2 max

kxk2 =1=kyk2

≤ σ1 = σ1

max

i=1 n X

kxk2 =1=kyk2

max

kxk2 =1=kyk2

i=1

i=1

|xi y i | ≤ σ1

max

kxk2 =1=kyk2

kyk2 kxk2 = σ1 ,

k|y|k2 k|x|k2

8 Matrix Norms and Perturbation Theory for Linear Systems

136

where |x| = (|x1 |, . . . , |xn |). Moreover, this maximum is achieved for x = v 1 and y = u1 , and we conclude kAk2 = σ1 =

max

kxk2 =1=kyk2

|y ∗ Ax|.

Exercise 8.9: Spectral norm of the inverse Suppose A ∈ Cn×n is nonsingular. Show that kAxk2 ≥ σn for all x ∈ Cn with kxk2 = 1. Show also that kA−1 k2 = max x6=0

kxk2 . kAxk2

Solution. Since A is nonsingular we find kA−1 k2 = max x6=0

kA−1 xk2 kxk2 = max . x6=0 kAxk2 kxk2

This proves the second claim. For the first claim, let σ1 ≥ · · · ≥ σn be the singular values of A. Again since A is nonsingular, σn must be nonzero, and Equation (8.19) states that σ1n = kA−1 k2 . From this and what we just proved we have that σ1n ≥ 1 kAxk2 for any x so that kxk2 = 1, so that also kAxk2 ≥ σn for such x. Exercise 8.10: p-norm example Let



 2 −1 A= . −1 2

Compute kAkp and kA−1 kp for p = 1, 2, ∞. Solution. We have  A=

 2 −1 , −1 2

A−1 =

  1 21 . 3 12

Using Theorem 8.3, one finds kAk1 = kAk∞ = 3 and kA−1 k1 = kA−1 k∞ = 1. The singular values σ1 ≥ σ2 of A are the square roots of the zeros of 0 = det(AT A − λI) = (5 − λ)2 − 16 = λ2 − 10λ + 9 = (λ − 9)(λ − 1). Using Theorem 8.4, we find kAk2 = σ1 = 3 and kA−1 k2 = σ2−1 = 1. Alternatively, since A is symmetric positive definite, we know from (8.20) that kAk2 = λ1 and kA−1 k2 = 1/λ2 , where λ1 = 3 is the biggest eigenvalue of A and λ2 = 1 is the smallest.

8 Matrix Norms and Perturbation Theory for Linear Systems

137

Exercise 8.11: Unitary invariance of the spectral norm Show that kV Ak2 = kAk2 holds even for a rectangular V as long as V ∗ V = I. Solution. Suppose V is a rectangular matrix satisfying V ∗ V = I. Then kV Ak22 = max kV Axk22 = max x∗ A∗ V ∗ V Ax kxk2 =1

kxk2 =1





= max x A Ax = max kAxk22 = kAk22 . kxk2 =1

kxk2 =1

The result follows by taking square roots. Exercise 8.12: kAU k2 rectangular A Find A ∈ R2×2 and U ∈ R2×1 with U T U = 1 such that kAU k2 < kAk2 . Thus, in general, kAU k2 = kAk2 does not hold for a rectangular U even if U ∗ U = I. Solution. Let U = [u1 , u2 ]T be any 2×1 matrix satisfying 1 = U T U . Then AU is a 2×1-matrix, and clearly the operator 2-norm of a 2×1-matrix equals its euclidean norm (when viewed as a vector):

 



  

a1  

a1 x

a1





a2 x = a2 x = |x| a2 . 2 2 2 In order for kAU k2 < kAk2 to hold, we need to find a vector v with kvk2 = 1 so that kAU k2 < kAvk2 . In other words, we need to pick a matrix A that scales more in the direction v than in the direction U . For instance, if       20 0 1 A= , U= , v= , 01 1 0 then kAk2 = max kAxk2 ≥ kAvk2 = 2 > 1 = kAU k2 . kxk2 =1

Exercise 8.13: p-norm of diagonal matrix Show that kAkp = ρ(A) := max |λi | (the largest eigenvalue of A), 1 ≤ p ≤ ∞, when A is a diagonal matrix. Solution. A = diag(λ1 , . . . , λn ) has eigenpairs (λ1 , e1 ), . . . , (λn , en ). For ρ(A) = max{|λ1 |, . . . , |λn |}, one has kAkp =

(|λ1 x1 |p + · · · + |λn xn |p )1/p (x1 ,...,xn )6=0 (|x1 |p + · · · + |xn |p )1/p max

8 Matrix Norms and Perturbation Theory for Linear Systems

138

1/p



(ρ(A)p |x1 |p + · · · + ρ(A)p |xn |p ) (x1 ,...,xn )6=0 (|x1 |p + · · · + |xn |p )1/p max

= ρ(A).

On the other hand, for j such that ρ(A) = |λj |, one finds kAkp = max x6=0

kAxkp kAej kp ≥ = ρ(A). kxkp kej kp

Together, the above two statements imply that kAkp = ρ(A) for any diagonal matrix A and any p satisfying 1 ≤ p ≤ ∞. Exercise 8.14: Spectral norm of a column vector A vector a ∈ Cm can also be considered as a matrix A ∈ Cm,1 . a) Show that the spectral matrix norm (2-norm) of A equals the Euclidean vector norm of a. b) Show that kAkp = kakp for 1 ≤ p ≤ ∞. Solution. We write A ∈ Cm,1 for the matrix corresponding to the column vector a ∈ Cm . Write kAkp for the operator p-norm of A and kakp for the vector p-norm of a. In particular kAk2 is the spectral norm of A and kak2 is the Euclidean norm of a. Then |x|kakp kAxkp kAkp = max = max = kakp , x6=0 x6=0 |x| |x| proving b). Note that a) follows as the special case p = 2. Exercise 8.15: Norm of absolute value matrix If A ∈ Cm×n has elements aij , let |A| ∈ Rm×n be the matrix with elements |aij |.   √ 1+i −2 a) Compute |A| if A = , i = −1. 1 1−i Solution. One finds  |A| =

 √  |1 + i| | − 2| 2 √2 = . |1| |1 − i| 1 2

b) Show that for any A ∈ Cm×n , kAkF = k |A| kF ,

kAkp = k |A| kp ,

for p = 1, ∞.

Solution. Let bi,j denote the entries of |A|. Observe that bi,j = |ai,j | = |bi,j |. Together with Theorem 8.3, these relations yield

8 Matrix Norms and Perturbation Theory for Linear Systems

139

  21   12 n m X m X n X X |ai,j |2  =  kAkF =  |bi,j |2  = k |A| kF , i=1 j=1

kAk1 = max

1≤j≤n

m X

|ai,j |

kAk∞ = max 

n X

m X

= max

1≤j≤n

i=1

 1≤i≤m

i=1 j=1

!



|bi,j |

= k |A| k1 ,

i=1



|ai,j | = max 

j=1

!

1≤i≤m

n X

 |bi,j | = k |A| k∞ ,

j=1

which is what needed to be shown. c) Show that for any A ∈ Cm×n kAk2 ≤ k |A| k2 . Solution. To show this relation between the operator 2-norms of A and |A|, we first examine the connection between the l2 -norms of Ax and |A||x|, where x = (x1 , . . . , xn ) and |x| = (|x1 |, . . . , |xn |). We find kAxk2 =

n 2 ! 12 m X X ≤ ai,j xj i=1

j=1

m X

n X

i=1

j=1

!2 ! 12 |ai,j ||xj |

= k |A||x| k2 .

Now let x∗ with kx∗ k2 = 1 be a vector for which kAk2 = kAx∗ k2 . That is, let x∗ be a unit vector for which the maximum in the definition of the 2-norm is attained. Observe that |x∗ | is then a unit vector as well, k |x∗ | k2 = 1. Then, by the above estimate of l2 -norms and definition of the 2-norm, kAk2 = kAx∗ k2 ≤ k |A||x∗ | k2 ≤ k |A| k2 .

d) Find a real symmetric 2 × 2 matrix A such that kAk2 < k |A| k2 . Solution. By Theorem 8.3, we can solve this exercise by finding a matrix A for which A and |A| have different largest singular values. As A is real and symmetric, there exist a, b, c ∈ R such that     ab |a| |b| A= , |A| = , bc |b| |c|  2    a + b2 ab + bc a2 + b2 |ab| + |bc| T T A A= , |A| |A| = . ab + bc b2 + c2 |ab| + |bc| b2 + c2 To simplify these equations we first try the case a + c = 0. Eliminating c we get  2   2  a + b2 0 a + b2 2|ab| T T A A= , |A| |A| = . 0 a2 + b2 2|ab| a2 + b2

8 Matrix Norms and Perturbation Theory for Linear Systems

140

To get different norms we have to choose a, b in such a way that the maximal eigenvalues of AT A and |A|T |A| are different. Clearly AT A has a unique eigenvalue λ := a2 +b2 and putting the characteristic polynomial π(µ) = (a2 +b2 −µ)2 −4|ab|2 of |A|T |A| to zero yields eigenvalues µ± := a2 + b2 ± 2|ab|. Hence |A|T |A| has maximal eigenvalue µ+ = a2 + b2 + 2|ab| = λ + 2|ab|. The spectral norms of A and |A| therefore differ whenever both a and b are nonzero. For example, when a = b = −c = 1 we find   √ 1 1 , kAk2 = 2, A= k |A| k2 = 2. 1 −1

Exercise 8.16: An iterative method (Exam exercise 2017-3) Assume that A ∈ Cn×n is non-singular and nondefective (the eigenvectors of A form a basis for Cn ). We wish to solve Ax = b. Assume that we have a list of the eigenvalues {λ1 , λ2 , . . . , λm }, in no particular order. We have that m ≤ n, since some of the eigenvalues may have multiplicity larger than one. m−1 Given x0 ∈ Cn , and k ≥ 0, we define the sequence {xk }k=0 by xk+1 = xk +

1 r k , where r k = b − Axk . λk+1

a) Let the coefficients cik be defined by rk =

n X

cik ui ,

i=1 n

where {(σi , ui )}i=1 are the eigenpairs of A. Show that ( ci,k+1 =

0  ci,k 1 −

 if σi = λk+1 , otherwise. λk+1 σi

Solution. Observe that Auj = σj uj , where σj ∈ {λ1 , . . . , λm }. We have that r k+1 = b − Axk+1  = b − A xk +

1 λk+1

 rk

1 Ar k λk+1   X σi ui . = cik 1 − λk+1 i = rk −

Hence

8 Matrix Norms and Perturbation Theory for Linear Systems

ci,k+1

141

  σi = ci,k 1 − , λk+1

which is zero when σi = λk+1 . b) Show that for some l ≤ m, we have that xl = xl+1 = · · · = xm = x, where Ax = b. Solution. After at most m iterations we will have ci,k = 0 for all i, and thus xk = x. c) Consider this iteration for the n × n matrix T = tridiag(c, d, c), where d and c are positive real numbers and d > 2c. The eigenvalues of T are   jπ λj = d + 2c cos , j = 1, . . . , n. n+1 What is the operation count for solving T x = b using the iterative algorithm above? Solution. Each iterative step consists in finding x + (b − T x)/λ. The matrix multiplication T x requires O(5n) arithmetic operations, we have to add b (n operations), divide by λ (n operations), and add x (n operations). Altogether O(5n)+n+n+n = O(8n) operations. The iteration reaches a fixed point in at most n steps, so the total count is O(8n2 ). d) Let now B be a symmetric n × n matrix which is zero on the “tridiagonal”, i.e., bij = 0 if |i − j| ≤ 1. Set A = T + B, where T is the tridiagonal matrix above. We wish to solve Ax = b by the iterative scheme T xk+1 = b − Bxk .

(8.41)

Recall that if E ∈ Rn×n has eigenvalues λ1 , . . . , λn then ρ(E) := maxi |λi | is the spectral radius of E. Show that ρ(T −1 B) ≤ ρ(T −1 )ρ(B). Solution. Choose x ∈ Rn so that T −1 Bx = λx,

|λ| = ρ(T −1 B),

x∗ x = 1.

Multiplying T −1 Bx = λx by x∗ T and taking absolute values we find ρ(T −1 B) =

|x∗ Bx| . |x∗ T x|

(8.i)

Since T is symmetric positive definite it has positive eigenvalues and orthonormal eigenvectors T uj = λj uj ,

j = 1, . . . , n,

λ1 ≥ · · · ≥ λn > 0,

u∗j uk = δjk ,

8 Matrix Norms and Perturbation Theory for Linear Systems

142

Since B is symmetric it has real eigenvalues and orthonormal eigenvectors Bv j = µj v j ,

0 < |µ1 | ≤ · · · ≤ |µn |,

j = 1, . . . , n,

Moreover, ρ(T −1 ) =

1 , λn

ρ(B) = |µn |.

v j∗ v k = δjk .

(8.ii)

For some cj , dj ∈ R x=

n X

cj uj =

j=1

n X

n X

dj v j ,

j=1

c2j

=

j=1

n X

d2j = 1.

j=1

We find by (8.ii) x∗ T x =

n X

c2j λj ≥ λn

j=1

n X

c2j = λn = 1/ρ(T −1 ),

j=1

X n n X ∗ 2 |x Bx| = dj µj ≤ |µn | d2j = ρ(B). j=1 j=1 But then ρ(T −1 B) ≤ ρ(T −1 )ρ(B) follows from (8.i). e) Show that the iteration (8.41) will converge if   n n   X X min max |bij |, max |bij | < d − 2c. j  i  j=1

i=1

Hint. Use Gershgorin's theorem.

Solution. The iterative scheme can be written $x_{k+1} = -T^{-1}Bx_k + T^{-1}b =: Gx_k + c$. The iteration will converge if $\rho(G) < 1$ (see Chapter 12). By d) we have that $\rho(G) \le \rho(T^{-1})\rho(B)$. The eigenvalues of $T$ satisfy
$$\lambda_j > d - 2c, \qquad \frac{1}{\lambda_j} < \frac{1}{d - 2c}.$$
Hence $\rho(T^{-1}) < 1/(d - 2c) < \infty$. Regarding the eigenvalues $\mu_j$ of $B$, by Gershgorin's circle theorem, since $b_{ii} = 0$,
$$|\mu_j| \le \min\Bigl\{\max_i \sum_{|i-j|>1} |b_{ij}|,\; \max_j \sum_{|i-j|>1} |b_{ij}|\Bigr\} =: C_B.$$
Hence the algorithm will converge if $C_B \le d - 2c$.
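For illustration, the iteration of parts a)-c) is easy to try out. The following MATLAB sketch (added here, not part of the original exam solution; the values of n, c, d are arbitrary) applies it to T = tridiag(c, d, c) and confirms that the residual vanishes once every eigenvalue has been used.

n = 10; c = 1; d = 3;                               % d > 2c
T = diag(d*ones(n,1)) + diag(c*ones(n-1,1),1) + diag(c*ones(n-1,1),-1);
lambda = d + 2*c*cos((1:n)*pi/(n+1));               % eigenvalues of T
b = ones(n,1); x = zeros(n,1);
for k = 1:n
    r = b - T*x;                                    % residual r_k
    x = x + r/lambda(k);                            % x_{k+1} = x_k + r_k/lambda_{k+1}
end
norm(b - T*x)                                       % essentially zero after n steps

In exact arithmetic the residual is zero after at most n steps, in agreement with b); when T is stored in tridiagonal form each step costs O(n) operations, giving the O(n^2) total of c).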

Exercises section 8.3

Exercise 8.17: Perturbed linear equation (Exam exercise 1981-2)
Given the systems $Ax = b$, $Ay = b + e$, where
$$A := \begin{bmatrix} 1.1 & 1 \\ 1 & 1 \end{bmatrix}, \qquad b := \begin{bmatrix} b_1 \\ b_2 \end{bmatrix} = \begin{bmatrix} 2.1 \\ 2.0 \end{bmatrix}, \qquad e := \begin{bmatrix} e_1 \\ e_2 \end{bmatrix}, \qquad \|e\|_2 = 0.1.$$
We define $\delta = \|x - y\|_2/\|x\|_2$.

a) Determine $K_2(A) = \|A\|_2 \|A^{-1}\|_2$. Give an upper bound and a positive lower bound for $\delta$ without computing $x$ and $y$.

Solution. Since $A^T = A$ we have $K_2(A) = |\lambda_{\max}|/|\lambda_{\min}|$. We find $\det(A - \lambda I) = \lambda^2 - 2.1\lambda + 0.1$, giving $\lambda = \frac{2.1 \pm \sqrt{4.01}}{2} = \frac{21 \pm \sqrt{401}}{20}$. We then get
$$\frac{\lambda_{\max}}{\lambda_{\min}} = \frac{21 + \sqrt{401}}{21 - \sqrt{401}} = \frac{(21 + \sqrt{401})^2}{40} \approx 42.08.$$
Thus $K_2(A) \approx 42.08$ and by Theorem 8.7
$$\delta \le K_2(A)\frac{\|e\|_2}{\|b\|_2} \le 42.08\cdot\frac{0.1}{2.9} < 1.46, \qquad \delta \ge \frac{1}{K_2(A)}\frac{\|e\|_2}{\|b\|_2} \ge \frac{1}{42.08}\cdot\frac{0.1}{2.9} > 0.0008.$$
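These bounds are easy to confirm numerically. The following small MATLAB check (an added illustration, not part of the exam solution) evaluates $K_2(A)$ and the two bounds from Theorem 8.7.

A = [1.1 1; 1 1];
b = [2.1; 2.0];
K2 = cond(A);                          % spectral condition number, about 42.08
upper = K2 * 0.1 / norm(b)             % upper bound for delta, about 1.45
lower = (1/K2) * 0.1 / norm(b)         % lower bound for delta, about 0.00082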

b) Suppose as before that $b_2 = 2.0$ and $\|e\|_2 = 0.1$. Determine $b_1$ and $e$ which maximize $\delta$.

Solution. We have
$$\delta_{\max} = \frac{\max_{\|e\|_2 = 0.1} \|A^{-1}e\|_2}{\min_{b_1 \in \mathbb{R}} \|A^{-1}b\|_2}.$$
Now
$$A^{-1} = \begin{bmatrix} 10 & -10 \\ -10 & 11 \end{bmatrix},$$
so that
$$\|A^{-1}b\|_2^2 = (10b_1 - 10b_2)^2 + (10b_1 - 11b_2)^2 = (10b_1 - 20)^2 + (10b_1 - 22)^2,$$
which achieves its minimum $\|A^{-1}b\|_2^2 = 2$ at $b_1 = 2.1$. Since $A^{-1}$ is symmetric, it has an orthonormal basis of eigenvectors $\{v_1, v_2\}$. If $e = e_1 v_1 + e_2 v_2$ and the corresponding eigenvalues $\lambda_1, \lambda_2$ of $A$ are ordered such that $|\lambda_1| \ge |\lambda_2|$, we have that
$$\|A^{-1}e\|_2^2 = \Bigl\|\frac{e_1}{\lambda_1}v_1 + \frac{e_2}{\lambda_2}v_2\Bigr\|_2^2 = \frac{e_1^2}{\lambda_1^2} + \frac{e_2^2}{\lambda_2^2} \le \frac{0.01}{\lambda_2^2},$$
with equality if $e_1 = 0$, $e_2 = 0.1$. The maximum of $\|A^{-1}e\|_2$ is therefore obtained for $e = 0.1\,v_2$, and is $0.1/\lambda_2 = \frac{2}{21 - \sqrt{401}}$. The maximum for $\delta$ is thus $\frac{2}{(21 - \sqrt{401})\sqrt{2}} \approx 1.45$. To find $e$ we need to find an eigenvector for $\lambda_2$:
$$\begin{bmatrix} 1.1 & 1 \\ 1 & 1 \end{bmatrix} - \frac{21 - \sqrt{401}}{20}\, I = \begin{bmatrix} 0.05 + \sqrt{401}/20 & 1 \\ 1 & -0.05 + \sqrt{401}/20 \end{bmatrix} \sim \begin{bmatrix} 20 & \sqrt{401} - 1 \\ 0 & 0 \end{bmatrix}.$$
Hence $\begin{bmatrix} 1 - \sqrt{401} \\ 20 \end{bmatrix}$ is an eigenvector for $\lambda_2$. Normalizing this to a vector of length $0.1$ we get $e \approx \begin{bmatrix} -0.069 \\ 0.072 \end{bmatrix}$.

Exercise 8.18: Sharpness of perturbation bounds
The upper and lower bounds for $\|y - x\|/\|x\|$ given by Equation (8.22), i.e.,
$$\frac{1}{K(A)}\frac{\|e\|}{\|b\|} \le \frac{\|y - x\|}{\|x\|} \le K(A)\frac{\|e\|}{\|b\|},$$
can be attained for any matrix $A$, but only for special choices of $b$. Suppose $y_A$ and $y_{A^{-1}}$ are vectors with $\|y_A\| = \|y_{A^{-1}}\| = 1$ and $\|A\| = \|Ay_A\|$ and $\|A^{-1}\| = \|A^{-1}y_{A^{-1}}\|$.

a) Show that the upper bound in Equation (8.22) is attained if $b = Ay_A$ and $e = y_{A^{-1}}$.

b) Show that the lower bound is attained if $b = y_{A^{-1}}$ and $e = Ay_A$.

Solution. Suppose $Ax = b$ and $Ay = b + e$. Let $K = K(A) = \|A\|\|A^{-1}\|$ be the condition number of $A$. Let $y_A$ and $y_{A^{-1}}$ be unit vectors for which the maxima in the definitions of the operator norms of $A$ and $A^{-1}$ are attained; that is, $\|y_A\| = 1 = \|y_{A^{-1}}\|$, $\|A\| = \|Ay_A\|$, and $\|A^{-1}\| = \|A^{-1}y_{A^{-1}}\|$.

If $b = Ay_A$ and $e = y_{A^{-1}}$, then
$$\frac{\|y - x\|}{\|x\|} = \frac{\|A^{-1}e\|}{\|A^{-1}b\|} = \frac{\|A^{-1}y_{A^{-1}}\|}{\|y_A\|} = \|A^{-1}\| = \|A\|\|A^{-1}\|\,\frac{\|y_{A^{-1}}\|}{\|Ay_A\|} = K\,\frac{\|e\|}{\|b\|},$$
showing that the upper bound is sharp. If $b = y_{A^{-1}}$ and $e = Ay_A$, then
$$\frac{\|y - x\|}{\|x\|} = \frac{\|A^{-1}e\|}{\|A^{-1}b\|} = \frac{\|y_A\|}{\|A^{-1}y_{A^{-1}}\|} = \frac{1}{\|A^{-1}\|} = \frac{1}{\|A\|\|A^{-1}\|}\,\frac{\|Ay_A\|}{\|y_{A^{-1}}\|} = \frac{1}{K}\,\frac{\|e\|}{\|b\|},$$
showing that the lower bound is sharp.

Exercise 8.19: Condition number of 2. derivative matrix
In this exercise we will show that for $m \ge 1$
$$\frac{4}{\pi^2}(m+1)^2 - 2/3 < \mathrm{cond}_p(T) \le \frac{1}{2}(m+1)^2, \qquad p = 1, 2, \infty, \qquad \text{(8.42)}$$

where T := tridiag(−1, 2, −1) ∈ Rm×m and condp (T ) := kT kp kT −1 kp is the p-norm condition number of T . The p matrix norm is given by (8.17). You will need the explicit inverse of T given by Equation (2.39) and the eigenvalues given in Lemma 2.2. As usual we define h := 1/(m + 1). a) Show that for m ≥ 3 1 cond1 (T ) = cond∞ (T ) = × 2



h−2 , m odd, h−2 − 1, m even.

and that cond1 (T ) = cond∞ (T ) = 3 for m = 2. Solution. Equation (2.39) said that T −1 has components T −1

 ij

= T −1

 ji

= (1 − ih)j > 0,

1 ≤ j ≤ i ≤ m,

h=

1 . m+1

From Theorems 8.3 and 8.4, we have the following explicit expressions for the 1-, 2- and ∞-norms kAk1 = max

1≤j≤n

m X i=1

|ai,j |, kAk2 = σ1 , kA−1 k2 =

n X 1 , kAk∞ = max |ai,j | 1≤i≤m σm j=1

8 Matrix Norms and Perturbation Theory for Linear Systems

146

for any matrix A ∈ Cm,n , where σ1 is the largest singular value of A, σm the smallest singular value of A, and we assumed A to be nonsingular in the third equation. For the matrix T one obtains kT k1 = kT k∞ = m + 1 for m = 1, 2 and kT k1 = kT k∞ = 4 for m ≥ 3. For the inverse we get kT −1 k1 = kT −1 k∞ = 21 = 18 h−2 for m = 1 and     1 2 1 1 2 1 −1 = kT −1 k∞ kT k1 = = 1 = 1 2 3 3 1 2 ∞ 1 for m = 2. For m > 2, one obtains m m j−1 X X −1  X (1 − jh)i + (1 − ih)j T = ij i=1

i=1

i=j

It is easy to commit an error when simplifying this sum. It can be computed symbolically using the Symbolic Math Toolbox with the following code.

code/symbolic_example.m
syms i j m
simplify(symsum((1-j/(m+1))*i,i,1,j-1) + symsum((1-i/(m+1))*j,i,j,m))
Listing 8.1: Simplify a symbolic sum using the Symbolic Math Toolbox in MATLAB.
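The symbolic result can also be cross-checked numerically. The following short MATLAB snippet (an added illustration, not part of the original solution) compares the column sums of the explicit inverse of T with the simplified expression for a small m.

m = 7; h = 1/(m+1);
T = diag(2*ones(m,1)) - diag(ones(m-1,1),1) - diag(ones(m-1,1),-1);
colsums = sum(inv(T),1);              % j-th entry is the sum over i of (T^{-1})_{ij}
j = 1:m;
max(abs(colsums - j.*(m+1-j)/2))      % essentially zero
norm(inv(T),1)                        % max column sum; equals (1/8)*h^(-2) here (m odd)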

Here we have substituted h = 1/(m + 1). This produces the result 2j (m + 1 − j). To arrive at this ourselves, we can first rewrite the expression as j−1 X

(1 − jh)i +

i=1

m X

(1 − ih)j −

i=1

j−1 X

(1 − ih)j.

i=1

The first sum here equals (1 − jh) (j−1)j . The second sum equals 2 m X

j − jh

i=1

m X

i = mj − jh

i=1

m(m + 1) = mj − mj/2 = mj/2. 2

The third sum equals j−1 X i=1

j − jh

j−1 X

i = j(j − 1) − hj 2

i=1

j−1 = j(j − 1)(2 − hj)/2. 2

Combining the three sums we get j j ((j − 1)(1 − jh) + m − (j − 1)(2 − hj)) = (m + 1 − j), 2 2

8 Matrix Norms and Perturbation Theory for Linear Systems

147

1 which we also arrived at above. This can also be written as 2h j − 12 j 2 , which is 1 a quadratic function in j that attains its maximum at j = 2h = m+1 2 . For odd m > 1, this function takes its maximum at integral j, yielding kT −1 k1 = 18 h−2 . For even m > 2, on the other hand, the maximum over all integral j is attained at −1 1−h m+2 j=m = 1+h k1 = 18 (h−2 − 1). 2 = 2h or j = 2 2h , which both give kT −1 Since T is symmetric, the row sums equal the column sums, so that kT −1 k∞ = kT −1 k1 . We conclude that the 1- and ∞-condition numbers of T are

 2, m = 1,     m = 2, 1 6, cond1 (T ) = cond∞ (T ) = −2 2 h , m odd, m > 1,    −2 h − 1, m even, m > 2.

b) Show that for p = 2 and m ≥ 1 we have cond2 (T ) = cot2

 πh  2

 πh 

= 1/ tan2

2

.

Solution. Since the matrix T is symmetric, T T T = T 2 and the eigenvalues of T T T are the squares of the eigenvalues λ1 , . . . , λn of T . As all eigenvalues of T are positive, each singular value of T is equal to an eigenvalue. Using that λi = 2 − 2 cos(iπh), we find σ1 = |λm | = 2 − 2 cos(mπh) = 2 + 2 cos(πh), σm = |λ1 | = 2 − 2 cos(πh). It follows that σ1 1 + cos(πh) cond2 (T ) = = = cot2 σm 1 − cos(πh)



πh 2

 .

c) Show the bounds 4 −2 2 4 h − < cond2 (T ) < 2 h−2 . π2 3 π Hint. For the upper bound use the inequality tan x > x, which is valid for 0 < x < π/2. For the lower bound we use (without proof) the inequality cot2 x > 1 2 x2 − 3 for x > 0. Solution. From tan x > x we obtain cot2 x = x−2 − 23 we find

1 tan2 x




8 Matrix Norms and Perturbation Theory for Linear Systems

148

4 2 4 − < cond2 (T ) < 2 2 . π 2 h2 3 π h d) Prove Equation (8.42). Solution. For p = 2, substitute h = 1/(m + 1) in c) and use that 4/π 2 < 1/2. For p = 1, ∞ we need to show due to a) that 4 −2 1 1 h − 2/3 < h−2 ≤ h−2 . 2 π 2 2 when m is odd, and that 4 −2 1 1 h − 2/3 < (h−2 − 1) ≤ h−2 . π2 2 2 when m is even. The right hand sides in these equations are obvious. The left equation for m odd is also obvious since 4/π 2 < 1/2. The left equation for m even is also obvious since −2/3 < −1/2. Exercise 8.20: Perturbation of the Identity matrix Let E be a square matrix. a) Show that if I − E is nonsingular then k(I − E)−1 − Ik ≤ kEk. k(I − E)−1 k  Solution. Since I − E is nonsingular, we can write E = (I − E) (I − E)−1 − I . Multiplying this equation from the left by (I − E)−1 gives (I − E)−1 E = (I −  E)−1 − I . Consistency now gives k((I − E)−1 − I)k ≤ k(I − E)−1 k · kEk, from which the result follows. b) If kEk < 1 then (I − E)−1 exists and 1 1 ≤ k(I − E)−1 k ≤ 1 + kEk 1 − kEk Show the lower bound. Show the upper bound if kIk = 1. In general for a consistent matrix norm (such as the Frobenius norm) the upper bound follows from Theorem 12.14 using Neumann series. −1

−Ik ≤ k(I −E)−1 k. On the left side now Solution. From a) it follows that k(I−E) kEk multiply with kI − Ek in the numerator and the denominator. For the numerator we have that

8 Matrix Norms and Perturbation Theory for Linear Systems

149

 kEk = k(I − E) (I − E)−1 − I k ≤ kI − Ekk(I − E)−1 − Ik, and it follows from the triangle inequality that 1 1 kEk k(I − E)−1 − Ik ≤ = ≤ ≤ k(I − E)−1 k, 1 + kEk kI − Ek kI − EkkEk kEk where the final inequality follows from a). For the upper bound, recall the inverse triangle inequality, kak−kbk ≤ ka−bk. With a = (I − E)−1 and b = I, and applied to a), this yields 1−

kIk ≤ kEk. k(I − E)−1 k

kIk Reorganizing this we obtain 1 − kEk ≤ k(I−E) −1 k . Setting kIk = 1 and moving things around again gives the upper bound.

c) Show that if kEk < 1 then k(I − E)−1 − Ik ≤

kEk . 1 − kEk

Solution. From a) and b) it follows that k(I − E)−1 − Ik ≤ k(I − E)−1 kkEk ≤

kEk . 1 − kEk

Exercise 8.21: Lower bounds in Equations (8.27), (8.29) a) Solve for E in B −1 − A−1 = −A−1 EB −1 = −B −1 EA−1 , and show that K(B)−1

(8.30)

kEk kB −1 − A−1 k ≤ . kAk kB −1 k

Solution. The statement is the same as kEk ≤ kB −1 − A−1 kkAkkBk. Solving for E in Equation (8.30) gives E = −B(B −1 − A−1 )A, so the result follows by applying norms on both sides, and using consistency on the right hand side. −1

k 1 1 b) Equation (8.28) said that 1+r ≤ kB ≤ 1−r . Use this and a) to show kA−1 k that K(B)−1 kEk kB −1 − A−1 k ≤ . 1 + r kAk kA−1 k

8 Matrix Norms and Perturbation Theory for Linear Systems

150

Solution. Dividing with 1 + r on both sides in the expression obtained in a), and using the left inequality in Equation (8.28),we obtain K(B)−1 kEk 1 kB −1 − A−1 k ≤ 1 + r kAk 1+r kB −1 k

which proves the claim.



kB −1 k kB −1 − A−1 k kA−1 k kB −1 k

=

kB −1 − A−1 k , kA−1 k

8 Matrix Norms and Perturbation Theory for Linear Systems

151

Exercise 8.22: Periodic spline interpolation (Exam exercise 1993-2) Let the components of x = [x0 , . . . , xn ]T ∈ Rn+1 define a partition of the interval [a, b], a = x0 < x1 < · · · < xn = b, and given a dataset y := [y0 , . . . , yn ]T ∈ Rn+1 , where we assume y0 = yn . The periodic cubic spline interpolation problem is defined by finding a cubic spline function g satisfying the conditions g(xi ) = yi ,

i = 0, 1, . . . , n,

g 0 (a) = g 0 (b),

g 00 (a) = g 00 (b).

(Recall that g is a cubic polynomial on each interval (xi−1 , xi ), for i = 1, . . . , n with smoothness C 2 [a, b].) We define si := g 0 (xi ), i = 0, . . . , n. It can be shown that the vector s := [s1 , . . . , sn ]T is determined from a linear system As = b,

(8.45)

where b ∈ Rn is a given vector determined by x and y. The matrix A ∈ Rn×n is given by   2 µ1 0 · · · 0 λ1   .  λ2 2 µ2 . . 0     .   0 . . . . . . . . . . . . ..  , A :=   . . .   .. . . . . . . . . . . 0      .. 0  .λ 2 µ n−1

µn 0 · · ·

0

n−1

λn

2

where λi :=

hi , hi−1 + hi

µi :=

hi−1 , hi−1 + hi

, i = 1, . . . , n,

and hi = xi+1 − xi ,

i = 0, . . . , n − 1, and hn = h0 .

You shall not argue or prove the system (8.45). Throughout this exercise we assume that 1 hi ≤ ≤ 2, i = 1, . . . , n. 2 hi−1 a) Show that kAk∞ = 3

and that

kAk1 ≤

10 . 3

8 Matrix Norms and Perturbation Theory for Linear Systems

152

Solution. We have kAk∞ = max i

X |aij | = max(2 + µi + λi ) = 2 + 1 = 3. i

j

We have µi =

1 1 ≤ = 2/3, 1 + hi /hi−1 1 + 1/2

i = 1, . . . , n,

and since also 1/2 ≤ hi−1 /hi ≤ 2 we find λi =

1 1 ≤ = 2/3, 1 + hi−1 /hi 1 + 1/2

i = 1, . . . , n.

But then with µ0 := µn and λn+1 := λ1 kAk1 = max j

X

|aij | = max(2 + µj−1 + λj+1 ) ≤ 2 + j

i

2 2 10 + = . 3 3 3

b) Show that kA−1 k∞ ≤ 1. Solution. We have σi := |aii | −

X |aij | = 2 − λi − µi = 1,

i = 1, . . . , n.

j6=i

Hence A is strictly diagonally dominant, so that kA−1 bk∞ ≤ kbk∞ for all b ∈ Rn by Theorem 2.2. But then kA−1 k∞ = max b6=0

kA−1 bk∞ kbk∞ ≤ max = 1. b6=0 kbk∞ kbk∞

c) Show that kA−1 k1 ≤ 32 . Solution. Since X 2 4 |ajj | − |aij | = 2 − λi+1 − µi−1 ≥ 2 − = , 3 3

i = 1, . . . , n,

i6=j

it follows that AT also is strictly diagonally dominant, so that kA−T bk∞ ≤ 23 kbk∞ for all b ∈ Rn by Theorem 2.2. But then kA−1 k1 = kA−T k∞ = max b6=0

kA−T bk∞ 3 kbk∞ 3 ≤ max = . kbk∞ 2 b6=0 kbk∞ 2

8 Matrix Norms and Perturbation Theory for Linear Systems

153

d) Let s and b be as in (8.45), where we assume b 6= 0. Let e ∈ Rn be such ˆ satisfies that kekp /kbkp ≤ 0.01, p = 1, ∞. Suppose s Aˆ s = b + e. Give estimates for

kˆ s − sk∞ ksk∞

kˆ s − sk1 . ksk1

and

Solution. By Theorem 8.7 for p = 1 and p = ∞, kˆ s − skp kekp ≤ Kp (A) ≤ 0.01Kp (A), kskp kbkp where

( Kp (A) := kAkp kA

We obtain

kˆ s − sk1 ≤ 0.05 ksk1

−1

kp ≤

and

10 3

· 32 = 5, if p = 1, 3 · 1 = 3, if p = ∞. kˆ s − sk∞ ≤ 0.03. ksk∞

Exercise 8.23: LSQ MATLAB program (Exam exercise 2013-4)
Suppose $A \in \mathbb{R}^{m\times n}$, $b \in \mathbb{R}^m$, where $A$ has rank $n$ and let $A = U\Sigma V^T$ be a singular value factorization of $A$. Thus $U \in \mathbb{R}^{m\times n}$ and $\Sigma, V \in \mathbb{R}^{n\times n}$. Write a MATLAB function [x,K]=lsq(A,b) that uses the singular value factorization of $A$ to calculate a least squares solution $x = V\Sigma^{-1}U^Tb$ to the system $Ax = b$ and the spectral (2-norm) condition number of $A$. The MATLAB command [U,Sigma,V]=svd(A,0) computes the singular value factorization of $A$.

Solution. The matrix $\Sigma$ is a diagonal matrix with the singular values on the diagonal ordered so that $\sigma_1 \ge \cdots \ge \sigma_n$. Moreover, $\sigma_n > 0$ since $A$ has rank $n$. The spectral condition number is $K = \sigma_1/\sigma_n$. We also use the MATLAB function diag(Sigma) that extracts the diagonal of $\Sigma$. This leads to the following program.

code/lsq.m
function [x,K]=lsq(A,b)
% Least squares solution and spectral condition number via the SVD.
[U,Sigma,V]=svd(A,0);
s=diag(Sigma);          % singular values sigma_1 >= ... >= sigma_n > 0
x=V*((U'*b)./s);        % x = V*Sigma^{-1}*U^T*b
K=s(1)/s(length(s));    % K = sigma_1/sigma_n
Listing 8.2: For a full rank matrix A, use its singular value factorization to compute the least squares solution to Ax = b and its spectral condition number K.
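As a small usage illustration (added here, not part of the exam solution; the data are made up), the function can be compared with MATLAB's backslash operator, which also returns the least squares solution when A has full column rank:

A = [1 2; 3 4; 5 6]; b = [1; 2; 3];
[x,K] = lsq(A,b);
norm(x - A\b)          % essentially zero
K - cond(A)            % agrees with MATLAB's 2-norm condition number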

Exercises section 8.4

Exercise 8.24: When is a complex norm an inner product norm? Given a vector norm in a complex vector space V, and suppose (8.35) holds for all x, y ∈ V. Show that  1 kx + yk2 − kx − yk2 + ikx + iyk2 − ikx − iyk2 , 4 √ defines an inner product on V, where i = −1. This identity is called the polarization identity. hx, yi :=

Hint. We have hx, yi = s(x, y) + is(x, iy), where s(x, y) :=  yk2 .

1 4

kx + yk2 − kx −

Solution. As in the exercise, we let hx, yi = s(x, y) + is(x, iy),

s(x, y) =

kx + yk2 − kx − yk2 . 4

We need to verify the three properties that define an inner product. Let x, y, z be arbitrary vectors in Cm and a ∈ C be an arbitrary scalar. 1. Positive-definiteness. One has s(x, x) = kxk2 ≥ 0 and kx + ixk2 − kx − ixk2 k(1 + i)xk2 − k(1 − i)xk2 = 4 4 (|1 + i| − |1 − i|)kxk2 = = 0, 4

s(x, ix) =

so that hx, xi = kxk2 ≥ 0, with equality holding precisely when x = 0. 2. Conjugate symmetry. Since s(x, y) is real, s(x, y) = s(y, x), s(ax, ay) = |a|2 s(x, y), and s(x, −y) = −s(x, y), hy, xi = s(y, x) − is(y, ix) = s(x, y) − is(ix, y) = s(x, y) − is(x, −iy) = hx, yi. 3. Linearity in the first argument. Assuming the parallelogram identity,

8 Matrix Norms and Perturbation Theory for Linear Systems

155

2s(x, z) + 2s(y, z) 1 1 1 1 = kx + zk2 − kz − xk2 + ky + zk2 − kz − yk2 2 2 2 2

2

2

1 x + y x − y 1 x + y x − y



+ = z + + − z − − 2 2 2 2 2 2

2

2

1

z + x + y − x − y − 1 z − x + y + x − y

2 2 2 2 2 2

2

2

2

x − y

x − y 2 x + y x + y





= z + + − z − − 2 2 2 2

2

2

x + y x + y



= z + − z − 2 2   x+y = 4s ,z , 2  so that s(x, z) + s(y, z) = 2s x+y 2 , z . For y  = 0 we have clearly that x s(y, z) = 0, and it follows that s(x, z) = 2s 2 , z . It follows that s(x+y, z) = s(x, z) + s(y, z). Finally hx + y, zi = s(x + y, z) + is(x + y, iz) = s(x, z) + s(y, z) + is(x, iz) + is(y, iz) = s(x, z) + is(x, iz) + s(y, z) + is(y, iz) = hx, zi + hy, zi. That hax, yi = ahx, yi follows, mutatis mutandis, from the proof of Theorem 8.15.

Exercise 8.25: p norm for p = 1 and p = ∞ Show that k·kp is a vector norm in Rn for p = 1, p = ∞. Solution. We need to verify the three properties that define a norm. Consider arbitrary vectors x = [x1 , . . . , xn ]T and y = [y1 , . . . , yn ]T in Rn and a scalar a ∈ R. First we verify that k · k1 is a norm. 1. Positivity. Clearly kxk1 = |x1 | + · · · + |xn | ≥ 0, with equality holding precisely when |x1 | = · · · = |xn | = 0, which happens if and only if x is the zero vector. 2. Homogeneity. One has kaxk1 = |ax1 | + · · · + |axn | = |a|(|x1 | + · · · + |xn |) = |a|kxk1 . 3. Subadditivity. Using the triangle inequality for the absolute value, kx + yk1 = |x1 + y1 | + · · · + |xn + yn |



≤ |x1 | + |y1 | + · · · + |xn | + |yn | = kxk1 + kyk1 . Next we verify that k · k∞ is a norm. 1. Positivity. Clearly kxk∞ := max{|x1 |, . . . , |xn |} ≥ 0, with equality holding precisely when |x1 | = · · · = |xn | = 0, which happens if and only if x is the zero vector. 2. Homogeneity. One has kaxk∞ = max{|a||x1 |, . . . , |a||xn |} = |a| max{|x1 |, . . . , |xn |} = |a|kxk∞ . 3. Subadditivity. Using the triangle inequality for the absolute value, kx + yk∞ = max{|x1 + y1 |, . . . , |xn + yn |} ≤ max{|x1 | + |y1 |, . . . , |xn | + |yn |} ≤ max{|x1 |, . . . , |xn |} + max{|y1 |, . . . , |yn |} = kxk∞ + kyk∞ .

Exercise 8.26: The p-norm unit sphere The set Sp = {x ∈ Rn : kxkp = 1} is called the unit sphere in Rn with respect to p. Draw Sp for p = 1, 2, ∞ for n = 2. Solution. In the plane, unit spheres for the 1-norm, 2-norm, and ∞-norm are

as follows: the 1-norm unit sphere is the diamond with vertices $(\pm 1, 0)$ and $(0, \pm 1)$, the 2-norm unit sphere is the unit circle, and the ∞-norm unit sphere is the axis-aligned square with vertices $(\pm 1, \pm 1)$. (The original figure showing the three sets side by side is omitted here.)

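If one wants to reproduce the sketch, the three unit spheres can be drawn with a few lines of MATLAB (an added illustration, not part of the original solution):

theta = linspace(0, 2*pi, 400);
plot(cos(theta), sin(theta)); hold on          % 2-norm: unit circle
plot([1 0 -1 0 1], [0 1 0 -1 0])               % 1-norm: diamond
plot([1 -1 -1 1 1], [1 1 -1 -1 1])             % infinity-norm: square
axis equal; hold off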

Exercise 8.27: Sharpness of p-norm inequality For p ≥ 1, and any x ∈ Cn we have kxk∞ ≤ kxkp ≤ n1/p kxk∞ (cf. (8.5)). Produce a vector xl such that kxl k∞ = kxl kp and another vector xu such that kxu kp = n1/p kxu k∞ . Thus, these inequalities are sharp. Solution. Let 1 ≤ p ≤ ∞. The vector xl = [1, 0, . . . , 0]T ∈ Rn satisfies kxl kp = (|1|p + |0|p + · · · + |0|p )1/p = 1 = max{|1|, |0|, . . . , |0|} = kxl k∞ ,



and the vector xu = [1, 1, . . . , 1]T ∈ Rn satisfies kxu kp = (|1|p + · · · + |1|p )1/p = n1/p = n1/p max{|1|, . . . , |1|} = n1/p kxu k∞ .

Exercise 8.28: p-norm inequalities for arbitrary p If 1 ≤ q ≤ p ≤ ∞ then kxkp ≤ kxkq ≤ n1/q−1/p kxkp ,

x ∈ Cn .

Hint. For the rightmost inequality use Jensen’s inequality. Cf. Theorem 8.13 with f (z) = z p/q and zi = |xi |q . For the left inequality consider first yi = xi /kxk∞ , i = 1, 2, . . . , n. Solution. Let p and q be integers satisfying 1 ≤ q ≤ p, and let x = [x1 , . . . , xn ]T ∈ Cn . Since p/q ≥ 1, the function f (z) = z p/q is convex on [0, ∞). For any z1 , . . . , zn ∈ [0, ∞) and λ1 , . . . , λn ≥ 0 satisfying λ1 + · · · + λn = 1, Jensen’s inequality gives n X

!p/q λi zi

n X

=f

i=1

! λi zi



i=1

n X

λi f (zi ) =

i=1

n X

p/q

λ i zi

.

i=1

In particular for zi = |xi |q and λ1 = · · · = λn = 1/n, n

−p/q

n X

!p/q q

|xi |

=

i=1

n X 1 |xi |q n i=1

!p/q ≤

n n X X p/q 1 |xi |q = n−1 |xi |p . n i=1 i=1

Since the function x 7−→ x1/p is monotone, we obtain n

−1/q

kxkq = n

n X

−1/q

!1/q |xi |

q

≤n

−1/p

i=1

n X

!1/p = n−1/p kxkp ,

p

|xi |

i=1

from which the right inequality in the exercise follows. The left inequality clearly holds for x = 0, so assume x 6= 0. Without loss of generality we can then assume kxk∞ = 1, since kaxkp ≤ kaxkq if and only if kxkp ≤ kxkq for any nonzero scalar a. Then, for any i = 1, . . . , n, one has |xi | ≤ 1, implying that |xi |p ≤ |xi |q . Moreover, since |xi | = 1 for some i, one has |x1 |q + · · · + |xn |q ≥ 1, so that kxkp =

n X i=1

!1/p p

|xi |



n X i=1

!1/p |xi |

q



n X i=1

!1/q q

|xi |

= kxkq .

8 Matrix Norms and Perturbation Theory for Linear Systems

158

Finally we consider the case p = ∞. The statement is obvious for q = p, so assume that q is an integer. Then kxkq =

n X

!1/q |xi |

q



n X

!1/q kxkq∞

= n1/q kxk∞ ,

i=1

i=1

proving the right inequality. Using that the map x 7−→ x1/q is monotone, the left inequality follows from kxkq∞ = (max |xi |)q ≤ i

n X i=1

|xi |q = kxkqq .

Chapter 9

Least Squares

Exercises section 9.1

Exercise 9.1: Fitting a circle to points In this exercise we derive an algorithm to fit a circle (t − c1 )2 + (y − c2 )2 = r2 to m ≥ 3 given points (ti , yi )m i=1 in the (t, y)-plane. We obtain the overdetermined system (ti − c1 )2 + (yi − c2 )2 = r2 ,

i = 1, . . . , m,

(System (9.22)) of m equations in the three unknowns c1 , c2 and r. This system is nonlinear, but it can be solved from the linear system ti x1 + yi x2 + x3 = t2i + yi2 ,

i = 1, . . . , m,

(System (9.23)), and then setting c1 = x1 /2, c2 = x2 /2 and r2 = c21 +c22 +x3 . a) Derive (9.23) from (9.22). Explain how we can find c1 , c2 , r once [x1 , x2 , x3 ] is determined. Solution. Let c1 = x1 /2, c2 = x2 /2, and r2 = x3 + c21 + c22 as in the exercise. Then, for i = 1, . . . , m, 0 = (ti − c1 )2 + (yi − c2 )2 − r2   x 2  x 2 x 1 2  x 2 2 1 2 = ti − + yi − − x3 − − 2 2 2 2 = t2i + yi2 − ti x1 − yi x2 − x3 , from which Equation (9.23) follows immediately. Once x1 , x2 , and x3 are determined, we can compute





c1 =

x1 , 2

c2 =

x2 , 2

r r=

1 2 1 2 x + x + x3 . 4 1 4 2

b) Formulate (9.23) as a linear least squares problem for suitable A and b. Solution. The linear least square problem is to minimize kAx − bk22 , with    2    t1 y1 1 t1 + y12 x1     .. x2  . A =  ... ... ...  , b= , x =  . x3 tm ym 1 t2 + y 2 m

m

c) Does the matrix A in b) have linearly independent columns? Solution. Whether or not A has independent columns depends on the data ti , yi . For instance, if ti = yi = 1 for all i, then the columns of A are clearly dependent. In general, A has independent columns whenever we can find three points (ti , yi ) not on a straight line. d) Use (9.23) to find the circle passing through the three points (1, 4), (3, 2), and (1, 0). Solution. For these points the matrix A becomes   141 A = 3 2 1 , 101 which clearly is invertible. We find    −1     x1 141 17 2 x = x2  = 3 2 1 13 =  4  . x3 101 1 −1 It follows that c1 = 1, c2 = 2, and r = 2. The points (t, y) = (1, 4), (3, 2), (1, 0) therefore all lie on the circle (t − 1)2 + (y − 2)2 = 4, as shown in Figure 9.1.




Figure 9.1: The circle fit to the points in Exercise 9.1.
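A compact way to carry out the computation in d), and the general fit of a) and b), is sketched below in MATLAB. This is only an added illustration, not part of the original exercise text; the variable names are chosen freely.

t = [1; 3; 1]; y = [4; 2; 0];                % the three data points
A = [t y ones(size(t))];
x = A \ (t.^2 + y.^2);                       % solves (9.23); least squares if more points are given
c1 = x(1)/2; c2 = x(2)/2;
r  = sqrt(c1^2 + c2^2 + x(3));               % gives c1 = 1, c2 = 2, r = 2 here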

Exercise 9.2: Least square fit (Exam exercise 2018-1) √ √  2 √2 a) Let A be the matrix . Find the singular values of A, and compute 0 3 kAk2 .  22 , which has eigenvalues 6 and 1. The 25 √ corresponding singular values are thus σ1 = 6 and σ2 = 1. In particular kAk2 = √ σ1 = 6. Solution. We first compute A∗ A =





 3α , where α is a real number. For which α1 values of α is A positive definite? b) Consider the matrix A =

Solution. Since A is symmetric and a11 = 3√> √ 0 the matrix is positive definite if and only if det(A) = 3 − α2 > 0 or α ∈ (− 3, 3). Alternatively, the matrix is positive definite if and only if the eigenvalues are positive. The eigenvalues of A are solutions to 0 = det(A − λI) = (3 − λ)(1 − λ) − α2 = λ2 − 4λ + 3 − α2 . We find the eigenvalues λ± =



p p 16 − 4(3 − α2 ) = 2 ± 1 + α2 , 2

2 and, since α is real, these values are both √ bigger √ than zero if and only if 1 + α < 4, which happens precisely when α ∈ (− 3, 3).



c) We would like to fit the points p1 = (0, 1), p2 = (1, 0), p3 = (2, 1) to a straight line in the plane. Find a line p(x) = mx + b which minimizes 3 X

kp(xi ) − yi k2 ,

i=1

where pi = (xi , yi ). Is this solution unique? Solution. We would like to find a least squares solution to     10   1 1 1 b = 0 . m 12 1 Denote the coefficient matrix by A, and the right hand side by b. We have that     33 2 T T A A= , A b= . 35 2 Applying row reduction to the augmented matrix yields     332 1 0 2/3 [AT A|AT b] = ∼ , 352 01 0 so that b = 2/3, m = 0. It follows that the horizontal line y = p(x) = 2/3 is the least squares fit. The least squares solution is unique since A has linearly independent columns.

Exercises section 9.2

Exercise 9.3: A least squares problem (Exam exercise 1983-2) Suppose A ∈ Rm×n and let I ∈ Rn×n be the identity matrix. We define F : Rn → R by F (x) := kAx − bk22 + kxk22 . a) Show that the matrix B := I + AT A is symmetric and positive definite. Solution. For any x ∈ Rn we have xT (I + AT A)x = xT x + xT AT Ax = kxk22 + kAxk22 ≥ 0. Moreover, if equality holds then kxk2 = 0 so that x = 0. Finally, B T = (I + AT A)T = I T + (AT A)T = B

9 Least Squares

163

so that B is symmetric. b) Show that F (x) = xT Bx − 2cT x + bT b,

where c = AT b.

Solution. We find F (x) = (Ax − b)T (Ax − b) + xT x = xT AT Ax − (Ax)T b − bT Ax + bT b + xT Ix = xT (I + AT A)x − xT AT b − (AT b)T x + bT b = xT Bx − 2cT x + bT b.

c) Show that to every b ∈ Rm there is a unique x which minimizes F . Moreover, x is the unique solution of the linear system (I + AT A)x = AT b. Solution. Since B is positive definite it follows from Lemma 13.1 that there is a unique x ∈ Rn which minimizes F . Moreover, by the same lemma x is the unique solution of Bx = c or (I + AT A)x = AT b. Exercise 9.4: Weighted least squares (Exam exercise 1977-2) For m ≥ n we are given A ∈ Rm×n with linearly independent columns, b ∈ Rm , and D := diag(d1 , d2 , . . . , dm ) ∈ Rm×m , where di > 0, i = 1, 2, . . . , m. We want to minimize kr(x)k2D :=

m X

ri (x)2 di ,

x ∈ Rn ,

(9.24)

i=1

where ri = ri (x), i = 1, 2, . . . , m are the components of the vector r = r(x) = b − Ax. a) Show that kr(x)k2D in (9.24) obtains a unique minimum when x = xmin is the solution of the system

Solution. With D 1/2

AT DAx = AT Db. √  √ √ := diag d1 , d2 , . . . , dm we find

kr(x)k2D = kD 1/2 r(x)k22 = kD 1/2 Ax − D 1/2 bk22 . We obtain an ordinary least squares problem ˆ 2, ˆ − bk minkAx

ˆ := D 1/2 A, A

ˆ := D 1/2 b. b

9 Least Squares

164

ˆ has linearly independent columns, and by Theorems 9.1, Since D is nonsingular A ˆ which is the same ˆ =A ˆ T Ax ˆ T b, 9.2, 9.3, x = xmin is unique and the solution of A T T as A DAx = A Db. b) Show that K2 (AT DA) ≤ K2 (AT A)K2 (D), where for any nonsingular matrix K2 (B) := kBk2 kB −1 k2 . ˆ TA ˆ and A ˆ has linearly independent columns it folSolution. Since AT DA = A T lows from Lemma 4.2 that both A A and AT DA are positive definite. Let µ1 , µn and λ1 , λn be the largest and smallest eigenvalues of AT A and AT DA, respectively. Suppose (λ, v) is an eigenpair of AT DA, AT DAv = λv,

kvk2 = 1.

Then with w := Av λ = λv T v = v T AT DAv = wT Dw =

n X

di wi2 .

i=1

With dmax and dmin the largest and smallest diagonal element in D, it follows that dmin kAvk2 ≤ λ ≤ dmax kAvk2 . Since also µn ≤ hAT Av, vi = kAvk22 and kAvk22 = hAT Av, vi ≤ µ1 , it follows that dmin µn ≤ dmin kAvk22 ≤ λ ≤ dmax kAvk22 ≤ dmax µ1 . By Exercise 8.13 we have kDk2 = dmax and kD −1 k2 = 1/dmin . But then K2 (AT DA) =

dmax µ1 λ1 ≤ = K2 (D)K2 (AT A). λn dmin µn

9 Least Squares

165

Exercises section 9.3

Exercise 9.5: Uniqueness of generalized inverse Given A ∈ Cm×n , and suppose B, C ∈ Cn×m satisfy ABA BAB (AB)∗ (BA)∗

=A =B = AB = BA

(1) (2) (3) (4)

ACA = A, CAC = C, (AC)∗ = AC, (CA)∗ = CA.

Verify the following proof that B = C. B = (BA)B = (A∗ )B ∗ B = (A∗ C ∗ )A∗ B ∗ B = CA(A∗ B ∗ )B = CA(BAB) = (C)AB = C(AC)AB = CC ∗ A∗ (AB) = CC ∗ (A∗ B ∗ A∗ ) = C(C ∗ A∗ ) = CAC = C. Solution. Denote the properties to the left by (1B ), (2B ), (3B ), (4B ) and the properties to the right by (1C ), (2C ), (3C ), (4C ). Then one uses, in order, (2B ), (4B ), (1C ), (4C ), (4B ), (2B ), (2C ), (3C ), (3B ), (1B ), (3C ), and (2C ). Exercise 9.6: Verify that a matrix is a generalized inverse h1 1i Show that the generalized inverse of A = 1 1 is A† = 14 [ 11 11 00 ] without 00 using the singular value decomposition of A. Solution. Let

 1 A = 1 0

 1 1 , 0

B=

  1 110 4 110

be as in the exercise. One finds     11 1 110   AB = 1 1 = 4 110 00     11 1 110   11 = BA = 4 110 00

  110 1 1 1 0 , 2 000   1 11 , 2 11

so that (AB)∗ = AB and (BA)∗ = BA. Moreover,      11 1 1 11 ABA = A(BA) = 1 1 = 1 2 11 00 0

 1 1 = A, 0

9 Least Squares

166

      1 11 1 110 1 110 BAB = (BA)B = = = B. 2 11 4 110 4 110 By Theorem 9.5 we conclude that B must be the pseudoinverse of A. Exercise 9.7: Linearly independent columns and generalized inverse Suppose A ∈ Cm×n has linearly independent columns. Show that A∗ A is nonsingular and A† = (A∗ A)−1 A∗ . If A has linearly independent rows, then show that AA∗ is nonsingular and A† = A∗ (AA∗ )−1 . Solution. If A ∈ Cm,n has independent columns then both A and A∗ have rank n ≤ m. Then, by Exercise 7.11, A∗ A must have rank n as well. Since A∗ A is an n × nmatrix of maximal rank, it is nonsingular and we can define B := (A∗ A)−1 A∗ . Let A = U 1 Σ 1 V ∗1 be a singular value factorization of A. Note that V 1 is n × n, so that it is unitary. We have that −1

(A∗ A)−1 A∗ = ((U 1 Σ 1 V ∗1 )∗ U 1 Σ 1 V ∗1 )

(U 1 Σ 1 V ∗1 )∗

−1

= (V 1 Σ 1 U ∗1 U 1 Σ 1 V ∗1 ) (U 1 Σ 1 V ∗1 )∗ −1 ∗ ∗ = V 1 Σ 21 V ∗1 V 1 Σ 1 U ∗1 = V 1 Σ −2 1 V 1V 1Σ1U 1 ∗ † = V 1 Σ −1 1 U1 = A ,

−1 ∗ ∗ where we used that V 1 Σ 21 V ∗1 = V 1 Σ −2 1 V 1 . V 1 V 1 is m × m (V 1 has full rank). Alternatively, with B := (A∗ A)−1 A∗ , we can verify that B satisfies the four axioms of Exercise 9.5. 1. 2. 3. 4.

ABA = A(A∗ A)−1 A∗ A = A BAB = (A∗ A)−1 A∗ A(A∗ A)−1 A∗ = (A∗ A)−1 A∗ = B ∗ (BA)∗ = (A∗ A)−1 A∗ A = I ∗n = I n = (A∗ A)−1 A∗ A = BA  ∗ ∗ (AB)∗ = A(A∗ A)−1 A∗ = A (A∗ A)−1 A∗ = A(A∗ A)−1 A∗ = AB

It follows that B = A† . The second claim follows similarly. Alternatively, one can use the fact that the unique solution of the least squares problem is A† b and compare this with the solution of the normal equation.

9 Least Squares

167

Exercise 9.8: More orthogonal projections Given m, n ∈ N, A ∈ Cm×n of rank r, and let S be one of the subspaces R(A∗ ), N (A). Show that the orthogonal projection of v ∈ Cn into S can be written as a matrix P S times the vector v in the form P S v, where P R(A∗ ) = A† A = V 1 V ∗1 =

r X

v j v ∗j ∈ Cn×n ,

j=1 †

∗ 2

P N (A) = I − A A = V 2 V =

n X

(9.28) v j v ∗j

n×n

∈C

,

j=r+1

where A† is the generalized inverse of A and A = U ΣV ∗ ∈ Cm×n , is a singular value decomposition of A (cf. (9.7)). Thus (9.12) and (9.28) give the orthogonal projections into the 4 fundamental subspaces. Hint. ⊥

By Theorem 7.3 we have the orthogonal sum Cn = R(A∗ )⊕N (A). Solution. We know that {v j }rj=1 is an orthonormal basis for R(A∗ ), and that {v j }nj=r+1 is an orthonormal basis for P N (A) . From this the right hand side exPr Pn pressions V 1 V ∗1 = j=1 v j v ∗j ∈ Cn×n and V 2 V ∗2 = j=r+1 v j v ∗j ∈ Cn×n follow immediately. We also have that A† A = V 1 Σ −1 U ∗1 U 1 ΣV ∗1 = V 1 V ∗1 . Also I − A† A = I − P R(A∗ ) = P N (A) , and the result follows. Exercise 9.9: The generalized inverse of a vector Show that u† = (u∗ u)−1 u∗ if u ∈ Cn,1 is nonzero. Solution. This is a special case of Exercise 9.7. In particular, if u is a nonzero vector, then u∗ u = hu, ui = kuk2 is a nonzero number and (u∗ u)−1 u∗ is defined. One can again check the axioms of Exercise 9.5 to show that this vector must be the pseudoinverse of u∗ . Exercise 9.10: The generalized inverse of an outer product If A = uv ∗ where u ∈ Cm , v ∈ Cn are nonzero, show that A† =

1 ∗ A , α

α = kuk22 kvk22 .

9 Least Squares

168

Solution. Let A = uv ∗ be as in the exercise. Since u and v are nonzero, A = U 1 Σ 1 V 1∗ =

 v∗ u  kuk2 kvk2 kuk2 kvk2

is a singular value factorization of A. But then ∗ A† = V 1 Σ −1 1 U1 =

i u∗ v h 1 A∗ 1 ∗ = vu = . kuk2 kvk2 kvk2 kuk2 kuk22 kvk22 α

Exercise 9.11: The generalized inverse of a diagonal matrix Show that diag(λ1 , . . . , λn )† = diag(λ†1 , . . . , λ†n ) where  1/λi , λi 6= 0, † λi = 0, λi = 0. Solution. Let A := diag(λ1 , . . . , λn ) and B := diag(λ†1 , . . . , λ†n ) as in the exercise. Note that, by definition, λ†j indeed represents the pseudoinverse of the number λj for any j. It therefore satisfies the axioms of Exercise 9.5, which we shall use below. We now verify the axioms for B to show that B must be the pseudoinverse of A. 1. 2. 3. 4.

ABA = diag(λ1 λ†1 λ1 , . . . , λn λ†n λn ) = diag(λ1 , . . . , λn ) = A; BAB = diag(λ†1 λ1 λ†1 , . . . , λ†n λn λ†n ) = diag(λ†1 , . . . , λ†n ) = B; (BA)∗ = (diag(λ†1 λ1 , . . . , λ†n λn ))∗ = diag(λ†1 λ1 , . . . , λ†n λn ) = BA; (AB)∗ = (diag(λ1 λ†1 , . . . , λn λ†n ))∗ = diag(λ1 λ†1 , . . . , λn λ†n ) = AB.

This proves that B is the pseudoinverse of A. Exercise 9.12: Properties of the generalized inverse Suppose A ∈ Cm×n . Show that a) (A∗ )† = (A† )∗ . Solution. Let A = U ΣV ∗ be a singular value decomposition of A and A = U 1 Σ 1 V ∗1 the corresponding singular value factorization. By definition of the pseu∗ doinverse, A† := V 1 Σ −1 1 U 1. † ∗ −1 ∗ ∗ ∗ One has (A ) = (V 1 Σ 1 U 1 ) = U 1 Σ −∗ 1 V 1 . On the other hand, the matrix ∗ ∗ ∗ ∗ A has singular value factorization A = V 1 Σ 1 U 1 , so that its pseudoinverse is ∗ † ∗ ∗ † (A∗ )† := U 1 Σ −∗ 1 V 1 as well. We conclude that (A ) = (A ) . b) (A† )† = A.

9 Least Squares

169

Solution. Since A† := V 1 Σ 1−1 U 1∗ is a singular value factorization, it has pseudoinverse (A† )† = (U 1∗ )∗ (Σ 1−1 )−1 V ∗1 = U 1 Σ 1 V ∗1 = A. c) (αA)† =

† 1 αA ,

α 6= 0.

Solution. Since the matrix αA has singular value factorization U 1 (αΣ 1 )V ∗1 , it has pseudoinverse ∗ −1 † (αA)† = V 1 (αΣ 1 )−1 U ∗1 = α−1 V 1 Σ −1 A . 1 U1 = α

Exercise 9.13: The generalized inverse of a product Suppose k, m, n ∈ N, A ∈ Cm×n , B ∈ Cn×k . Suppose A has linearly independent columns and B has linearly independent rows. a) Show that (AB)† = B † A† .

Hint. Let E = AB, F = B † A† . Show by using A† A = BB † = I that F is the generalized inverse of E. Solution. Let A and B have singular value factorizations A = U 1 Σ 1 V ∗1 and B = U 2 Σ 2 V ∗2 . Since A has full column rank and B has full row rank, V 1 and U 2 are unitary. We have that AB = U 1 Σ 1 V ∗1 U 2 Σ 2 V ∗2 . Now, let U 3 Σ 3 V ∗3 be a singular value factorization of Σ 1 V ∗1 U 2 Σ 2 . This matrix is nonsingular, U 3 ∗ −1 ∗ −1 and V 3 are unitary, and inversion gives that V 3 Σ −1 3 U 3 = Σ 2 U 2 V 1 Σ 1 . We ∗ ∗ ∗ then have that AB = U 1 U 3 Σ 3 V 3 V 2 = U 1 U 3 Σ 3 (V 2 V 3 ) is a singular value factorization of AB, so that ∗ ∗ −1 ∗ −1 ∗ † † (AB)† = V 2 V 3 Σ −1 3 U 3U 1 = V 2Σ2 U 2V 1Σ1 U 1 = B A .

Alternatively we can verify the properties from Exercise 9.5. First, when A has linearly independent columns, B has linearly independent rows, it follows immediately from Exercise 9.7 that A† A = BB † = I. We know also from Exercise 9.5 that (AA† )∗ = AA† and (B † B)∗ = B † B. We now let E := AB and F := B † A† . Hence we want to show that E † = F , i.e., that F satisfies the four properties EF E = ABB † A† AB = AB = E, F EF = B † A† ABB † A† = B † A† = F , (F E)∗ = (B † A† AB)∗ = (B † B)∗ = B † B = B † A† AB = F E, (EF )∗ = (ABB † A† )∗ = (AA† )∗ = AA† = ABB † A† = EF .
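Both the identity in a) and the failure mode in b) are easy to sanity-check numerically. The sketch below (added as an illustration, not part of the original solution) uses MATLAB's pinv; the random test matrices have full column and row rank with probability one.

A = randn(5,3);                          % full column rank
B = randn(3,4);                          % full row rank
norm(pinv(A*B) - pinv(B)*pinv(A))        % essentially zero
u = [1;0]; v = [1;1];                    % u and v not parallel
pinv(u'*v) - pinv(v)*pinv(u')            % nonzero: the identity can fail, cf. b)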

9 Least Squares

170

b) Find A ∈ R1,2 , B ∈ R2,1 such that (AB)† 6= B † A† . Solution. Let A = u∗ and B = v, where u and v are column vectors. From exercises 9.9 and 9.10 we have that A† = u/kuk22 , and B † = v ∗ /kvk22 . We have that 1 v∗ u (AB)† = (u∗ v)† = ∗ , B † A† = . u v kvk22 kuk22 If these are to be equal we must have that (u∗ v)2 = kvk22 kuk22 . We must thus have equality in the Cauchy-Schwarz inequality, and this can happen only if u and v are parallel. It is thus enough to find u and v which are not parallel, in order to produce a counterexample. Exercise 9.14: The generalized inverse of the conjugate transpose Show that A∗ = A† if and only if all singular values of A are either zero or one. Solution. Let A have singular value factorization A = U 1 Σ 1 V ∗1 , so that A∗ = ∗ ∗ † ∗ −1 V 1 Σ ∗1 U ∗1 and A† = V 1 Σ −1 1 U 1 . Then A = A if and only if Σ 1 = Σ 1 , which, since the singular values are nonnegative, happens precisely when all nonzero singular values of A are one. Exercise 9.15: Linearly independent columns Show that if A has rank n then A(A∗ A)−1 A∗ b is the projection of b into R(A) (cf. Exercise 9.8). Solution. By Exercise 9.7, if A ∈ Cm×n has rank n, then A† = (A∗ A)−1 A∗ . Then A(A∗ A)−1 A∗ b = AA† b, which is the orthogonal projection of b into span(A) by Theorem 9.6. Exercise 9.16: Analysis of the general linear system Consider the linear system Ax = b where A ∈ Cn×n has rank r > 0 and b ∈ Cn . Let   Σ1 0 U ∗ AV = 0 0 represent a singular value decomposition of A. Solution. In this exercise, we can write   Σ1 0 Σ= , Σ1 = diag(σ1 , . . . , σr ), 0 0

σ1 ≥ · · · ≥ σr > 0.

9 Least Squares

171

a) Let c = [c1 , . . . , cn ]∗ = U ∗ b and y = [y1 , . . . , yn ]∗ = V ∗ x. Show that Ax = b if and only if   Σ1 0 y = c. 0 0 Solution. As U is unitary, we have U ∗ U = I. We find the following sequence of equivalences. Ax = b ⇐⇒ U ΣV ∗ x = b ⇐⇒ U ∗ U Σ(V ∗ x) = U ∗ b ⇐⇒ Σy = c, which is what needed to be shown. b) Show that Ax = b has a solution x if and only if cr+1 = · · · = cn = 0. Solution. By a), the linear system Ax = b has a solution if and only if the system     σ1 y1 c1  ..   ..   .   .        σr yr   cr  Σ1 0 =  y=  0  cr+1  = c 0 0      .   .   ..   ..  0

cn

has a solution y. Since σ1 , . . . , σr 6= 0, this system has a solution if and only if cr+1 = · · · = cn = 0. We conclude that Ax = b has a solution if and only if cr+1 = · · · = cn = 0. c) Deduce that a linear system Ax = b has either no solution, one solution or infinitely many solutions. Solution. By a), the linear system Ax = b has a solution if and only if the system Σy = c has a solution. Hence we have the following three cases. r=n Here yi = ci /σi for i = 1, . . . , n provides the only solution to the system Σy = b, and therefore x = V y is the only solution to Ax = b. It follows that the system has exactly one solution. r < n, ci = 0 for i = r + 1, . . . , n Here each solution y must satisfy yi = ci /σi for i = 1, . . . , r. The remaining yr+1 , . . . , yn , however, can be chosen arbitrarily. Hence we have infinitely many solutions to Σy = b as well as to Ax = b. r < n, ci 6= 0 for some i with r + 1 ≤ i ≤ n In this case it is impossible to find a y that satisfies Σy = b, and therefore the system Ax = b has no solution at all.

9 Least Squares

172

Exercise 9.17: Fredholm’s alternative For any A ∈ Cm×n , b ∈ Cn show that one and only one of the following systems has a solution (1)

Ax = b,

(2)

A∗ y = 0, y ∗ b 6= 0.

In other words either b ∈ R(A), or we can find y ∈ N (A∗ ) such that y ∗ b 6= 0. This is called Fredholm’s alternative. Solution. Suppose that the system Ax = b has a solution, i.e., b ∈ span(A). Suppose in addition that A∗ y = 0 has a solution, i.e., y ∈ ker(A∗ ). Since span(A)⊥ = ker(A∗ ), one has hy, bi = y ∗ b = 0. Thus if the system Ax = b has a solution, then we can not find a solution to A∗ y = 0, y ∗ b 6= 0. Conversely if y ∈ ker(A∗ ) and y ∗ b 6= 0, then b ∈ / ker(A∗ )⊥ = span(A), implying that the system Ax = b does not have a solution. Exercise 9.18: SVD (Exam exercise 2017-2) Let A ∈ Cm×n , with m ≥ n, be a matrix on the form   B A= C where B is a non-singular n × n matrix and C is in C(m−n)×n . Let A† denote the pseudoinverse of A. Show that kA† k2 ≤ kB −1 k2 . Solution. Since B is nonsingular the matrix A has rank n and singular values σ1 ≥ · · · ≥ σn > 0. The SVD of A can be written   Σ1 A=U V ∗, 0 where U ∈ Cm×m , V ∈ Cn×n are unitary and Σ 1 := diag(σ1 , . . . , σn ). Moreover, by the Courant-Fischer theorem for singular values (cf. (9.17)) σn = minn kAxk2 . x∈C kxk2 =1

But then, using also (8.19) ∗ A† = V [Σ −1 1 , 0]U ,

kB −1 k2 =

1 , σB

where σB is the smallest singular value of B. Now, for any x ∈ Cn kAxk22 = x∗ A∗ Ax = x∗ (B ∗ B + C ∗ C)x ≥ x∗ B ∗ Bx = kBxk22 .

(9.i)

9 Least Squares

173

But then kAxk2 ≥ kBxk2 ≥ σB and since x is arbitrary we obtain σn ≥ σB . For x ∈ Cm , using (9.i) kA† xk2 = kV[Σ−1 , 0]U∗ xk2 = k[Σ−1 , 0]U∗ xk2 ≤ k[Σ

−1

(V is unitary)



, 0]k2 kU xk2

(k·k2 subordinate)

−1

= kΣ k2 kxk2 1 1 = kxk2 ≤ kxk2 = kB −1 k2 kxk2 . σn σB

(U∗ unitary)

Taking the maximum over all x with kxk2 = 1 yields kA† k2 ≤ kB −1 k2 .

Exercises section 9.4

Exercise 9.19: Condition number Let



1 A = 1 1

 2 1, 1



 b1 b =  b2  . b3

a) Determine the projections b1 and b2 of b on R(A) and N (A∗ ). Solution. By Exercise 9.7, the pseudoinverse of A is   −1 1 1 A† = (AT A)−1 AT = . 1 − 12 − 12 Theorem 9.6 tells us that the orthogonal projection of b into span(A) is      10 0 b1 2b1 1 b1 := AA† b = 0 12 12  b2  = b2 + b3  , 2 1 1 b3 b2 + b3 0 2 2 so that the orthogonal projection of b into ker(AT ) is      0 0 0 b1 0 1 b2 := (I − AA† )b = 0 12 − 12  b2  = b2 − b3  , 2 b3 b3 − b2 0 − 12 12 where we used that b = b1 + b2 . b) Compute K(A) = kAk2 kA† k2 .

9 Least Squares

174

Solution. By Theorem 8.3, the 2-norms kAk2 and kA† k2 can be found by computing the largest singular values of the matrices A and A† . The largest singular value σ1 of A is the square root of the largest eigenvalue λ1 of AT A, which satisfies   3 − λ1 4 = λ21 − 9λ1 + 2. 0 = det(AT A − λ1 I) = det 4 6 − λ1 √ p √ It follows that σ1 = 21 2 9 + 73. Similarly, the largest singular value σ2 of A† is the square root of the largest eigenvalue λ2 of A†T A† , which satisfies     8 −6 −6 1 0 = det(A†T A† − λ2 I) = det  −6 5 5  − λ2 I  4 −6 5 5  1 = − λ2 2λ22 − 9λ2 + 1 . 2 Alternatively, we could have used that the largest singular value of A† is the inverse of the smallest singular value from the singular value factorizapof A√(this follows √ p √ tion). It follows that σ2 = 12 9 + 73 = 2/ 9 − 73. We conclude s †

K(A) = kAk2 · kA k2 =

√ √  9 + 73 1  √ = √ 9 + 73 ≈ 6.203. 9 − 73 2 2

Exercise 9.20: Equality in perturbation bound Let A ∈ Cm×n . Suppose y A and y A† are vectors with ky A k = ky A† k = 1 and kAk = kAy A k and kA† k = kA† y A† k. a) Show that we have equality to the right in (9.13) if b = Ay A , e1 = y A† . Solution. By assumption on y A and y A† , kA† y A† k = kA† k = kA† kky A† k.

kAy A k = kAk = kAkky A k,

We have here that e1 = y A† , and since A has linearly independent columns, x = yA . We have equality in the right hand side if and only if kAxk = kAkkxk

and

kA† e1 k = kA† kke1 k.

Combining this with the observations above, the result follows. b) Show that we have equality to the left if we switch b and e1 in a).

9 Least Squares

175

Solution. We have equality in the left hand side if and only if kAkkx − yk = ke1 k, and kxk = kA† kkb1 k. This is the same as kAkkA† e1 k = ke1 k, and kxk = kA† kkAxk. This is the same as ke1 k = kAkkA† e1 k

and

kA† bk = kA† kkbk.

The first corresponding statement in a) can also be written as kbk = kAkkA† bk, so that the statements in a) can be written as kbk = kAkkA† bk

and

kA† e1 k = kA† kke1 k.

We now see that b and e1 simply have swapped roles. c) Let A be as in Example 9.7. Find extremal b and e when the l∞ norm is used. This generalizes the sharpness results in Exercise 8.18. For if m = n and A is nonsingular then A† = A−1 and e1 = e. Solution. It is straightforward to check that y A = [1, 1]T and y A† = [1, −1, 0]T are vectors with infinity norm one, which satisfy the conditions in a) (they are not unique, however). For right hand side equality, a) now gives     11   2 1 b = Ay A = 0 1 = 1 . 1 00 0 One particular solution for e is     1   1 −1 0   2 −1 = A† e1 = . 0 1 0 −1 0 For left hand side equality, b) gives b = y A† = [1, −1, 0]T . One particular solution for e is       11     1 1 −1 0 0 1 1 = 1 . A† e1 = A† A = 1 0 1 0 1 1 00

9 Least Squares

176

Exercise 9.21: Problem using normal equations Consider the least squares problems where     1 1 2 A = 1 1 , b = 3, 1 1+ 2

 ∈ R.

a) Find the normal equations and the exact least squares solution. Solution. Let A, b, and  be as in the exercise. The normal equations AT Ax = AT b are then      3 3+ x1 7 = . 3 +  ( + 1)2 + 2 x2 7 + 2 If  6= 0, inverting the matrix AT A yields the unique solution      5  1 ( + 1)2 + 2 −3 −  x1 7 + 1 = 2 = 2 12 . x2 −3 −  3 7 + 2 − 2 2 If  = 0, on the other hand, then any vector x = [x1 , x2 ]T with x1 + x2 = 7/3 is a solution. ∗ 2 b) Suppose  is small and we replace the (2, 2) entry √ 3 + 2 +  in A A by 3 + 2. (This will be done in a computer √ if  < u, u being the round-off unit). For example, if u = 10−16 then u = 10−8 . Solve A∗ Ax = A∗ b for x and compare with the x found in a) (we will get a much more accurate result using the QR factorization or the singular value decomposition on this problem).

Solution. For  = 0, we get the same solution as in a). For  6= 0, however, the solution to the system      3 3 +  x1 7 = 3 +  3 + 2 x2 7 + 2 is

 0      1 3 + 2 −3 −  x1 7 2 − 1 = − = . 1 x02 3 7 + 2 2 −3 −  

We can compare this to the solution of a) by comparing the residuals,  1   5  1 2 + A 2 12 − b = − 1  = √1 2 − 2 2 0 2  2  0   √ 2 − 1   ≤ 2 = −1 = A − b , 1  1 2 2

9 Least Squares

177

which shows that the solution from a) is more accurate.

Exercises section 9.5

Exercise 9.22: Singular values perturbation (Exam exercise 1980-2) Let A() ∈ Rn×n be bidiagonal with ai,j = 0 for i, j = 1, . . . , n and j 6= i, i + 1. Moreover, for some 1 ≤ k ≤ n − 1 we have ak,k+1 =  ∈ R. Show that |σi () − σi (0)| ≤ ||, i = 1, . . . , n, where σi (), i = 1, . . . , n are the singular values of A(). Solution. Let E := A() − A(0) and F := E T E. According to Theorem 9.11 we have |σi () − σi (0)| ≤ kEk2 , i = 1, . . . , n. Now fk+1,k+1 = 2 , and all other elements of F are zero. It follows that the spectral radius of F is 2 and hence kEk2 = ||. As an alternative proof, we note that only one p element of E is nonzero and therefore kEk1 = kEk∞ = ||. But then kEk2 = kEk1 kEk∞ ≤ ||.

Chapter 10

The Kronecker Product

Exercises sections 10.1 and 10.2 Exercise 10.1: 4 × 4 Poisson matrix Write down the Poisson matrix for m = 2 and show that it is strictly diagonally dominant. Solution. For m = 2, the Poisson matrix A is the 22 × 22 matrix given by   4 −1 −1 0  −1 4 0 −1     −1 0 4 −1  . 0 −1 −1 4 In every row i, one has |aii | = 4 > 2 = | − 1| + | − 1| + |0| =

X

|aij |.

j6=i

In other words, A is strictly diagonally dominant. Exercise 10.2: Properties of Kronecker products Prove (10.13), i.e., that    λA ⊗ µB = λµ A ⊗ B ,  A1 + A2 ⊗ B = A1 ⊗ B + A2 ⊗ B,  A ⊗ B1 + B2 = A ⊗ B1 + A ⊗ B2, (A ⊗ B) ⊗ C = A ⊗ (B ⊗ C).



10 The Kronecker Product

180

Solution. Let be given matrices A, A1 , A2 ∈ Rp×q , B, B 1 , B 2 ∈ Rr×s , and C ∈ Rt×u . Then (λA) ⊗ (µB) = λµ(A ⊗ B) by definition of the Kronecker product and since     Ab11 Ab12 · · · Ab1s (λA)µb11 (λA)µb12 · · · (λA)µb1s (λA)µb21 (λA)µb22 · · · (λA)µb2s      Ab21 Ab22 · · · Ab2s  = λµ  .   . .. . . .. . . . .. . . ..  .. ..   ..  . . Abr1 Abr2 · · · Abrs (λA)µbr1 (λA)µbr2 · · · (λA)µbrs The identity (A1 + A2 ) ⊗ B = (A1 ⊗ B) + (A2 ⊗ B) follows from   (A1 + A2 )b11 (A1 + A2 )b12 · · · (A1 + A2 )b1s (A1 + A2 )b21 (A1 + A2 )b22 · · · (A1 + A2 )b2s      .. .. .. ...   . . . (A1 + A2 )br1 (A1 + A2 )br2 · · · (A1 + A2 )brs   A1 b11 + A2 b11 A1 b12 + A2 b12 · · · A1 b1s + A2 b1s A1 b21 + A2 b21 A1 b22 + A2 b22 · · · A1 b2s + A2 b2s    =  . .. .. ...   . . .. A1 br1 + A2 br1 A1 br2 + A2 br2 · · · A1 brs + A2 brs     A1 b11 A1 b12 · · · A1 b1s A2 b11 A2 b12 · · · A2 b1s     A1 b21 A1 b22 · · · A1 b2s  A2 b21 A2 b22 · · · A2 b2s  = . + . .   . . . . . . .. ..   .. .. ..  ..  .. .. A1 br1 A1 br2 · · · A1 brs

A2 br1 A2 br2 · · · A2 brs

A similar argument proves A ⊗ (B 1 + B 2 ) = (A ⊗ B 1 ) + (A ⊗ B 2 ), and therefore the bilinearity of the Kronecker product. The associativity (A ⊗ B) ⊗ C = A ⊗ (B ⊗ C) follows from 

 Ab11  ..  . Abr1

Ab11 c11 ..   .    Ab · · · Ab1s  r1 c11 ..  ⊗ C =   .    Ab11 ct1 · · · Abrs   ..  . Abr1 ct1

· · · Ab1s c11 Ab11 c1u .. .. . ··· . · · · Abrs c11 Abr1 c1u .. . · · · Ab1s ct1 Ab11 ctu .. .. . ··· . · · · Abrs ct1 Abr1 ctu

 · · · Ab1s c1u ..   .  · · · Abrs c1u    ..  .  · · · Ab1s ctu    ..  . · · · Abrs ctu

10 The Kronecker Product



b11 c11  ..  .   br1 c11   =A⊗   b11 ct1    ... br1 ct1

181

· · · b1s c11 b11 c1u .. .. . ··· . · · · brs c11 br1 c1u .. . · · · b1s ct1 b11 ctu .. .. . ··· . · · · brs ct1 br1 ctu

 · · · b1s c1u ..  .     · · · brs c1u  Bc11 · · · Bc1u    . .. ..  .  = A ⊗  .. . .    Bct1 · · · Bctu · · · b1s ctu  ..  .  · · · brs ctu

Exercise 10.3 : Eigenpairs of Kronecker products (Exam exercise 2008-3) Let A, B ∈ Rn×n . Show that the eigenvalues of the Kronecker product A⊗B are products of the eigenvalues of A and B and that the eigenvectors of A⊗B are Kronecker products of the eigenvectors of A and B. Solution. Suppose Au = λu and Bv = µv. Then    Ab11 · · · Ab1n uv1  ..   ..  (A ⊗ B)(u ⊗ v) =  ... .  .  Abn1 · · · Abnn uvn     Ab11 uv1 + · · · + Ab1n uvn Au(Bv)1     .. .. = =  . . Abn1 uv1 + · · · + Abnn uvn

Au(Bv)n

= (Au) ⊗ (Bv) = (λu) ⊗ (µv) = (λµ)(u ⊗ v).
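This eigenpair relation is easy to confirm numerically with MATLAB's kron (whose block ordering may differ from the text's definition of the Kronecker product, but which satisfies the same mixed-product rule). The check below is an added illustration only.

A = randn(4); B = randn(4);
[VA,DA] = eig(A); [VB,DB] = eig(B);
u = VA(:,1); v = VB(:,1);
lambda = DA(1,1); mu = DB(1,1);
norm(kron(A,B)*kron(u,v) - lambda*mu*kron(u,v))   % essentially zero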

Exercises section 10.3

Exercise 10.4: 2. derivative matrix is positive definite Write down the eigenvalues of T = tridiag(−1, 2, −1) using (10.15) and conclude that T is symmetric positive definite. Solution. Applying Lemma 2.2 to the case that a = −1 and d = 2, one finds that the eigenvalues λj of the matrix tridiag(−1, 2, −1) ∈ Rm,m are      jπ jπ λj = d + 2a cos = 2 1 − cos , m+1 m+1

10 The Kronecker Product

182

for j = 1, . . . , m. Moreover, as |cos(x)| < 1 for any x ∈ (0, π), it follows that λj > 0 for j = 1, . . . , m. Since, in addition, tridiag(−1, 2, −1) is symmetric, Lemma 4.5 implies that the matrix tridiag(−1, 2, −1) is symmetric positive definite. Exercise 10.5: 1D test matrix is positive definite? Show that the matrix T 1 is symmetric positive definite if d > 0 and d ≥ 2|a|. Solution. The statement of this exercise is a generalization of the statement of Exercise 10.4. Consider a matrix M = tridiag(a, d, a) ∈ Rm,m for which d > 0 and d ≥ 2|a|. By Lemma 2.2, the eigenvalues λj , with j = 1, . . . , m, of the matrix M are   jπ λj = d + 2a cos . m+1 If a = 0, then all these eigenvalues are equal to d and therefore positive. If a 6= 0, write sgn(a) for the sign of a. Then       jπ jπ a λj ≥ 2|a| 1 + cos = 2|a| 1 + sgn(a) cos > 0, |a| m+1 m+1 again because |cos(x)| < 1 for any x ∈ (0, π). Since, in addition, M is symmetric, Lemma 4.5 again implies that M is symmetric positive definite. Exercise 10.6: Eigenvalues for 2D test matrix of order 4 For m = 2 the matrix (10.10) is given by   2d a a 0  a 2d 0 a   A=  a 0 2d a  . 0 a a 2d Show that λ = 2a + 2d is an eigenvalue corresponding to the eigenvector x = [1, 1, 1, 1]T . Verify that apart from a scaling of the eigenvector this agrees with 1 , m+1 sj = [sin(jπh), sin(2jπh), . . . , sin(mjπh)]T .

λj = d + 2a cos(jπh),

h :=

(i.e. (10.15) and (10.16)) for j = k = 1 and m = 2. Solution. One has

10 The Kronecker Product

183

       2d + 2a 2d a a 0 1 1  a 2d 0 a  1 2d + 2a 1       Ax =   a 0 2d a  1 = 2d + 2a = (2d + 2a) 1 = λx, 2d + 2a 0 a a 2d 1 1 which means that (λ, x) is an eigenpair of A. We also have √for m  = 2 that λ1 = 3/2 d + 2a cos(jπh) = d + 2a cos(π/3) = d + a, and s1 = √ . For j = k = 1 3/2 we now get λj + λk = 2λ1 = 2d + 2a   1 √  √  3 3/2 3/2 1  3 x. sj ⊗ sk = s1 ⊗ s1 = √ ⊗ √ =  3/2 3/2 2 1 2 1 Theorem 10.2 states that A(sj ⊗ sk ) = (λj + λk )(sj ⊗ sk ), i.e.,   3 3 A x = (2d + 2a) x 2 2 This is thus just a scaling with 3/2 of what we computed above. Exercise 10.7: Nine point scheme for Poisson problem Consider the following 9 point difference approximation to the Poisson problem −∆u = f , u = 0 on the boundary of the unit square (cf. (10.1)) (a) −(h v)j,k = (µf )j,k j, k = 1, . . . , m (b) 0 = v0,k = vm+1,k = vj,0 = vj,m+1 , j, k = 0, 1, . . . , m + 1, (c) −(h v)j,k := [20vj,k − 4vj−1,k − 4vj,k−1 − 4vj+1,k − 4vj,k+1 − vj−1,k−1 − vj+1,k−1 − vj−1,k+1 − vj+1,k+1 ]/(6h2 ), (d)

(µf )j,k := [8fj,k + fj−1,k + fj,k−1 + fj+1,k + fj,k+1 ]/12. (10.21)

a) Write down the 4-by-4 system we obtain for m = 2. Solution. If m = 2, the boundary condition yields     v00 v01 v02 v03 0000 v10  v13  0   = 0 , v20   v23 0 0 v30 v31 v32 v33 0000 leaving four equations to determine the interior points v11 , v12 , v21 , v22 . Since we have that 6h2 /12 = 1/(2(m + 1)2 ) = 1/18 for m = 2, we obtain 20v11 − 4v01 − 4v10 − 4v21 − 4v12 − v00 − v20 − v02 − v22

10 The Kronecker Product

184

1 (8f11 + f01 + f10 + f21 + f12 ), 18 20v21 − 4v11 − 4v20 − 4v31 − 4v22 − v10 − v30 − v12 − v32 1 = (8f21 + f11 + f20 + f31 + f22 ), 18 20v12 − 4v02 − 4v11 − 4v22 − 4v13 − v01 − v21 − v03 − v23 1 = (8f12 + f02 + f11 + f22 + f13 ), 18 20v22 − 4v12 − 4v21 − 4v32 − 4v23 − v11 − v31 − v13 − v33 1 = (8f22 + f12 + f21 + f32 + f23 ). 18 Using the values known from the boundary condition, these equations can be simplified to =

20v11 − 4v21 − 4v12 − v22 =

1 (8f11 + f01 + f10 + f21 + f12 ), 18

1 (8f21 + f11 + f20 + f31 + f22 ), 18 1 = (8f12 + f02 + f11 + f22 + f13 ), 18 1 = (8f22 + f12 + f21 + f32 + f23 ). 18

20v21 − 4v11 − 4v22 − v12 = 20v12 − 4v11 − 4v22 − v21 20v22 − 4v12 − 4v21 − v11

b) Find vj,k for j, k = 1, 2, if f (x, y) = 2π 2 sin (πx) sin (πy) and m = 2. Solution. For f (x, y) = 2π 2 sin(πx) sin(πy), one finds     f00 f01 f02 f03 0 0 0 0 f10 f11 f12 f13  0 3π 2 /2 3π 2 /2 0     f20 f21 f22 f23  = 0 3π 2 /2 3π 2 /2 0 . f30 f31 f32 f33 0 0 0 0 Substituting these values in our linear system, we obtain    2     1 5π /6 20 −4 −4 −1 v11 −4 20 −1 −4 v21  8 + 1 + 1 3π 2 1 5π 2 /6   =  2 .    = −4 −1 20 −4 v12  18 2 1 5π /6 1 5π 2 /6 −1 −4 −4 20 v22 Solving this system we find that v11 = v12 = v21 = v22 = 5π 2 /66. It can be shown that (10.21) defines an O(h4 ) approximation to (10.1).

10 The Kronecker Product

185

Exercise 10.8: Matrix equation for nine point scheme Consider the nine point difference approximation to (10.1) given by Equation (10.21) in Exercise 10.7. a) Show that Equation (10.21) is equivalent to the matrix equation 1 T V + V T − T V T = h2 µF . 6

(10.22)

Here µF has elements (µf )j,k given by (10.21d) and T = tridiag(−1, 2, −1). Solution. Let 

2 −1 −1 2   ..  . T = 0   

0 −1 .. .. . .



    , 0  −1 2 −1 0 −1 2



 v11 · · · v1m   V =  ... . . . ...  vm1 · · · vmm

be of equal dimensions. Implicitly assuming the boundary condition v0,k = vm+1,k = vj,0 = vj,m+1 = 0,

for j, k = 0, . . . , m + 1,

(10.i)

the (j, k)-th entry of T V + V T can be written as 4vj,k − vj−1,k − vj+1,k − vj,k−1 − vj,k+1 . Similarly, writing out two matrix products, the (j, k)-th entry of T V T = T (V T ) is found to be −1(−1vj−1,k−1 +2vj−1,k −1vj−1,k+1 ) +vj−1,k−1 −2vj−1,k +vj−1,k+1 +2(−1vj,k−1 +2vj,k −1vj,k+1 ) = −2vj,k−1 +4vj,k −2vj,k+1 . −1(−1vj+1,k−1 +2vj+1,k −1vj+1,k+1 ) +vj+1,k−1 −2vj+1,k +vj+1,k+1 Together, these observations yield that the System (10.21) is equivalent to (10.i) and 1 T V + V T − T V T = h2 µF . 6 b) Show that the standard form of the matrix equation (10.22) is Ax = b, where A = T ⊗ I + I ⊗ T − 16 T ⊗ T , x = vec(V ), and b = h2 vec(µF ). Solution. It is a direct consequence of properties 7 and 8 of Theorem 10.1 that this equation can be rewritten as one of the form Ax = b, where

10 The Kronecker Product

186

1 A = T ⊗ I + I ⊗ T − T ⊗ T, 6

x = vec(V ),

b = h2 vec(µF ).

Exercise 10.9: Biharmonic equation
Consider the biharmonic equation

∆^2 u(s, t) := ∆(∆u(s, t)) = f(s, t),   (s, t) ∈ Ω,
u(s, t) = 0, ∆u(s, t) = 0,              (s, t) ∈ ∂Ω.    (10.23)

Here Ω is the open unit square. The condition ∆u = 0 is called the Navier boundary condition. Moreover, ∆^2 u = u_xxxx + 2u_xxyy + u_yyyy.

a) Let v := −∆u. Show that (10.23) can be written as a system

−∆v(s, t) = f(s, t),    (s, t) ∈ Ω,
−∆u(s, t) = v(s, t),    (s, t) ∈ Ω,
u(s, t) = v(s, t) = 0,  (s, t) ∈ ∂Ω.

Solution. Writing v = −∆u, the second line in (10.23) is equivalent to

u(s, t) = v(s, t) = 0, for (s, t) ∈ ∂Ω,

while the first line is equivalent to

f(s, t) = ∆^2 u(s, t) = −∆(−∆u(s, t)) = −∆v(s, t), for (s, t) ∈ Ω.

b) Discretizing, using (10.4), with T = tridiag(−1, 2, −1) ∈ R^{m×m}, h = 1/(m + 1), and F = (f(jh, kh))_{j,k=1}^{m}, we get two matrix equations

T V + V T = h^2 F,   T U + U T = h^2 V.

Show that

(T ⊗ I + I ⊗ T) vec(V) = h^2 vec(F),   (T ⊗ I + I ⊗ T) vec(U) = h^2 vec(V),

and hence A = (T ⊗ I + I ⊗ T)^2 is the matrix for the standard form of the discrete biharmonic equation.

Solution. By property 8 of Theorem 10.1, (A ⊗ I + I ⊗ B) vec(V) = vec(F) ⟺ AV + V B^T = F, whenever A ∈ R^{r,r}, B ∈ R^{s,s}, F, V ∈ R^{r,s} (the identity matrices are assumed to be of the appropriate dimensions). Using T = T^T, this implies

T V + V T = h^2 F ⟺ (T ⊗ I + I ⊗ T) vec(V) = h^2 vec(F),
T U + U T = h^2 V ⟺ (T ⊗ I + I ⊗ T) vec(U) = h^2 vec(V).

Substituting the equation for vec(V) into the equation for vec(F), one obtains the equation

A vec(U) = h^4 vec(F),   where A := (T ⊗ I + I ⊗ T)^2,

which is a linear system of m^2 equations.

c) Show that with n = m^2 the vector form and standard form of the systems in b) can be written

T^2 U + 2T U T + U T^2 = h^4 F   and   Ax = b,

where A = T^2 ⊗ I + 2T ⊗ T + I ⊗ T^2 ∈ R^{n×n}, x = vec(U), and b = h^4 vec(F).

Solution. The equations h^2 V = T U + U T and T V + V T = h^2 F together yield the vector form

T(T U + U T) + (T U + U T)T = T^2 U + 2T U T + U T^2 = h^4 F.

Using the distributive property of matrix multiplication and the mixed product rule of Lemma 10.1, the matrix A = (T ⊗ I + I ⊗ T)^2 can be rewritten as

A = (T ⊗ I)(T ⊗ I) + (T ⊗ I)(I ⊗ T) + (I ⊗ T)(T ⊗ I) + (I ⊗ T)(I ⊗ T) = T^2 ⊗ I + 2T ⊗ T + I ⊗ T^2.

Writing x := vec(U) and b := h^4 vec(F), the linear system of b) can be written as Ax = b.

d) Determine the eigenvalues and eigenvectors of the matrix A in c) and show that it is positive definite. Also determine the bandwidth of A.

Solution. Property 6 of Theorem 10.1 says that A ⊗ I + I ⊗ B is positive definite when one of A and B is positive definite and the other is positive semidefinite. Since T is positive definite, it follows that M := T ⊗ I + I ⊗ T is positive definite as well. The square of any symmetric positive definite matrix is symmetric positive definite as well, implying that A = M^2 is symmetric positive definite.

Let us now show this more directly by calculating the eigenvalues of A. By Lemma 2.2, we know the eigenpairs (λ_i, s_i), i = 1, . . . , m, of the matrix T. By property 5 of Theorem 10.1, it follows that the eigenpairs of M are (λ_i + λ_j, s_i ⊗ s_j), for i, j = 1, . . . , m. If B is any matrix with eigenpairs (µ_i, v_i), i = 1, . . . , m, then B^2 has eigenpairs (µ_i^2, v_i), since

B^2 v_i = B(B v_i) = B(µ_i v_i) = µ_i(B v_i) = µ_i^2 v_i, for i = 1, . . . , m.

It follows that A = M^2 has eigenpairs ((λ_i + λ_j)^2, s_i ⊗ s_j), for i, j = 1, . . . , m. (Note that we can also verify this directly by multiplying A by s_i ⊗ s_j and using the mixed product rule.) Since the λ_i are positive, the eigenvalues of A are positive. We conclude that A is symmetric positive definite. Writing A = T^2 ⊗ I + 2T ⊗ T + I ⊗ T^2 and computing the block structure of each of these terms, one finds that A has bandwidth 2m, in the sense that any row has at most 4m + 1 nonzero elements.

e) Suppose we want to solve the standard form equation Ax = b. We have two representations for the matrix A, the product one in b) and the one in c). Which one would you prefer for the basis of an algorithm? Why?

Solution. The product form in b) is preferable: it lets us solve two Poisson-type systems with the matrix T ⊗ I + I ⊗ T, which has bandwidth m, instead of one system with the expanded matrix of c), which has bandwidth 2m. It is typically quicker to solve two simple systems than one more complicated one.
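A short numerical check of d) (an added sketch for a small, illustrative m; it relies on MATLAB implicit expansion):

% Sketch: eigenvalues of A = (kron(T,I) + kron(I,T))^2 are (lambda_i + lambda_j)^2,
% with lambda_j = 2 - 2*cos(j*pi*h) the eigenvalues of T
m = 6; h = 1/(m+1);
T = full(gallery('tridiag', m, -1, 2, -1));
M = kron(T, eye(m)) + kron(eye(m), T);
A = M^2;
lam = 2 - 2*cos((1:m)*pi*h);
ev_formula = sort(reshape((lam' + lam).^2, [], 1));
ev_direct  = sort(eig(A));
max(abs(ev_formula - ev_direct))     % tiny; all eigenvalues are positive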

Chapter 11

Fast Direct Solution of a Large Linear System

Exercises section 11.3

Exercise 11.1: Fourier matrix
Show that the Fourier matrix F_4 is symmetric, but not Hermitian.

Solution. The Fourier matrix F_N has entries

(F_N)_{j,k} = ω_N^{(j−1)(k−1)},   ω_N := e^{−2πi/N} = cos(2π/N) − i sin(2π/N).

In particular for N = 4, this implies that ω_4 = −i and

F_4 = [1  1  1  1
       1 −i −1  i
       1 −1  1 −1
       1  i −1 −i].

Computing the transpose and Hermitian transpose gives

F_4^T = [1  1  1  1; 1 −i −1  i; 1 −1  1 −1; 1  i −1 −i] = F_4,
F_4^H = [1  1  1  1; 1  i −1 −i; 1 −1  1 −1; 1 −i −1  i] ≠ F_4,

which is what we needed to show.

Exercise 11.2: Sine transform as Fourier transform
Verify Lemma 11.1 directly when m = 1.


Solution. According to Lemma 11.1, the Discrete Sine Transform can be computed from the Discrete Fourier Transform by (S_m x)_k = (i/2)(F_{2m+2} z)_{k+1}, where z = [0, x_1, . . . , x_m, 0, −x_m, . . . , −x_1]^T. For m = 1 this means that

z = [0, x_1, 0, −x_1]^T   and   (S_1 x)_1 = (i/2)(F_4 z)_2.

Since h = 1/(m + 1) = 1/2 for m = 1, computing the DST directly gives

(S_1 x)_1 = sin(πh) x_1 = sin(π/2) x_1 = x_1,

while computing the Fourier transform gives

F_4 z = [1 1 1 1; 1 −i −1 i; 1 −1 1 −1; 1 i −1 −i][0; x_1; 0; −x_1] = [0; −2i x_1; 0; 2i x_1] = −2i z.

Multiplying the Fourier transform by i/2, one finds (i/2) F_4 z = z, so that (i/2)(F_4 z)_2 = x_1 = (S_1 x)_1, which is what we needed to show.
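The same identity can be confirmed numerically (an added sketch, not part of the printed solution):

% Sketch: check (S_1 x)_1 = (i/2)*(F_4 z)_2 for a random x_1
x1 = randn;
z  = [0; x1; 0; -x1];
w  = exp(-2i*pi/4);              % omega_4 = -i
F4 = w.^((0:3)'*(0:3));          % Fourier matrix with exponents (j-1)(k-1)
dst = sin(pi/2)*x1;              % direct DST for m = 1, h = 1/2
fft_based = (1i/2)*(F4*z);
abs(dst - fft_based(2))          % should be ~0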

Exercise 11.3: Explicit solution of the discrete Poisson equation
Show that the exact solution of the discrete Poisson equation (Equation (10.5)) can be written V = (v_{ij})_{i,j=1}^{m}, where, with h := 1/(m + 1),

v_{ij} = h^4 Σ_{p=1}^{m} Σ_{r=1}^{m} Σ_{k=1}^{m} Σ_{l=1}^{m} [ sin(ipπ/(m+1)) sin(jrπ/(m+1)) sin(kpπ/(m+1)) sin(lrπ/(m+1)) ] / [ sin^2(pπ/(2(m+1))) + sin^2(rπ/(2(m+1))) ] · f_{kl}.

Solution. For j = 1, . . . , m, let λ_j = 4 sin^2(jπh/2), D = diag(λ_1, . . . , λ_m), and S = (s_{jk}) = (sin(jkπh)). By the results derived in Section 11.2, the solution to the discrete Poisson equation is V = SXS, where X is found by solving DX + XD = 4h^4 SFS. Since D is diagonal, one has (DX + XD)_{pr} = (λ_p + λ_r) x_{pr}, so that

x_{pr} = 4h^4 (SFS)_{pr} / (λ_p + λ_r) = 4h^4 Σ_{k=1}^{m} Σ_{l=1}^{m} s_{pk} f_{kl} s_{lr} / (λ_p + λ_r),

and hence

v_{ij} = Σ_{p=1}^{m} Σ_{r=1}^{m} s_{ip} x_{pr} s_{rj}
       = 4h^4 Σ_{p=1}^{m} Σ_{r=1}^{m} Σ_{k=1}^{m} Σ_{l=1}^{m} s_{ip} s_{pk} s_{lr} s_{rj} f_{kl} / (λ_p + λ_r)
       = h^4 Σ_{p=1}^{m} Σ_{r=1}^{m} Σ_{k=1}^{m} Σ_{l=1}^{m} [ sin(ipπ/(m+1)) sin(kpπ/(m+1)) sin(lrπ/(m+1)) sin(rjπ/(m+1)) ] / [ sin^2(pπ/(2(m+1))) + sin^2(rπ/(2(m+1))) ] · f_{kl},

which is what we needed to show.

Exercise 11.4: Improved version of Algorithm 11.1
Algorithm 11.1 involves multiplying a matrix by S four times. In this exercise we show that it is enough to multiply by S two times. We achieve this by diagonalizing only the second T in T V + V T = h^2 F. Let D = diag(λ_1, . . . , λ_m), where λ_j = 4 sin^2(jπh/2), j = 1, . . . , m.

a) Show that T X + XD = C, where X = V S and C = h^2 F S.

Solution. Recall that

T V + V T = h^2 F,    (11.i)

with T = SDS^{−1} the orthogonal diagonalization of T from Equation (11.4). We also write X = V S and C = h^2 F S. Multiplying Equation (11.i) from the right by S, one obtains

T X + XD = T V S + V SD = T V S + V T S = h^2 F S = C.

b) Show that

(T + λ_j I) x_j = c_j,   j = 1, . . . , m

(System (11.9)), where X = [x_1, . . . , x_m] and C = [c_1, . . . , c_m]. Thus we can find X by solving m linear systems, one for each of the columns of X. Recall that a tridiagonal m × m system can be solved by Algorithms 2.1 and 2.2 in 8m − 7 arithmetic operations. Give an algorithm to find X which only requires O(δm^2) arithmetic operations for some constant δ independent of m.

Solution. Writing C = [c_1, . . . , c_m], X = [x_1, . . . , x_m] and applying the rules of block multiplication, we find

[c_1, . . . , c_m] = C = T X + XD
 = T [x_1, . . . , x_m] + X[λ_1 e_1, . . . , λ_m e_m]
 = [T x_1 + λ_1 X e_1, . . . , T x_m + λ_m X e_m]
 = [T x_1 + λ_1 x_1, . . . , T x_m + λ_m x_m]
 = [(T + λ_1 I) x_1, . . . , (T + λ_m I) x_m],

which is equivalent to System (11.9). To find X, we therefore need to solve the m tridiagonal linear systems of (11.9). Since the eigenvalues λ_1, . . . , λ_m are positive, each matrix T + λ_j I is diagonally dominant. By Theorem 2.4, every such matrix is nonsingular and has a unique LU factorization. Algorithms 2.1 and 2.2 then solve the corresponding system (T + λ_j I) x_j = c_j in O(δm) operations for some constant δ. Doing this for all m columns x_1, . . . , x_m, one finds the matrix X in O(δm^2) operations.

c) Describe a method to compute V which only requires O(4m^3) = O(4n^{3/2}) arithmetic operations.

Solution. To find V, we first find C = h^2 F S by performing O(2m^3) operations. Next we find X as in step b) by performing O(δm^2) operations. Finally we compute V = 2hXS by performing O(2m^3) operations. In total, this amounts to O(4m^3) operations.

d) Describe a method based on the fast Fourier transform which requires O(2γn log_2 n), where γ is the same constant as mentioned at the end of the last section.

Solution. As explained in Section 11.3, multiplying by the matrix S can be done in O(2γm^2 log_2 m) operations by using the Fourier transform. The two matrix multiplications in c) can therefore be carried out in O(4γm^2 log_2 m) = O(4γn log_2 n^{1/2}) = O(2γn log_2 n) operations.
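A minimal MATLAB sketch of the procedure described in b) and c). The function name is illustrative (this is not the book's Algorithm 11.2), and the tridiagonal solves are delegated to backslash on sparse matrices:

% Sketch: Poisson solve with two sine multiplications and m tridiagonal solves
function V = poisson_two_sine_mults(F)
m = length(F); h = 1/(m+1);
S = sin(pi*h*(1:m)'*(1:m));               % sine matrix
lambda = 4*sin(pi*h*(1:m)/2).^2;          % eigenvalues of T
T = gallery('tridiag', m, -1, 2, -1);     % sparse tridiagonal T
C = h^2*F*S;                              % first multiplication by S
X = zeros(m);
for j = 1:m
    X(:,j) = (T + lambda(j)*speye(m)) \ C(:,j);   % O(m) per tridiagonal solve
end
V = 2*h*X*S;                              % second multiplication by S
end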


Exercise 11.5: Fast solution of 9 point scheme
Consider the equation

T V + V T − (1/6) T V T = h^2 µF,    (10.22)

that was derived in Exercise 10.8 for the 9-point scheme. Define the matrix X = (x_{j,k}) by V = SXS, where V is the solution of (10.22). Show that

DX + XD − (1/6) DXD = 4h^4 G,   where G = SµF S,

where D = diag(λ_1, . . . , λ_m), with λ_j = 4 sin^2(jπh/2), j = 1, . . . , m, and that

x_{j,k} = h^4 g_{j,k} / (σ_j + σ_k − (2/3) σ_j σ_k),   σ_j := sin^2(jπh/2),   j, k = 1, 2, . . . , m.

Show that σ_j + σ_k − (2/3) σ_j σ_k > 0 for j, k = 1, 2, . . . , m. Conclude that the matrix A in Exercise 10.8 b) is symmetric positive definite and that (10.21) always has a solution V.

Solution. Analogously to Section 11.2, we use the relations between the matrices T, S, X, D to rewrite Equation (10.22):

h^2 µF = T V + V T − (1/6) T V T
 ⟺ h^2 µF = T SXS + SXST − (1/6) T SXST
 ⟺ h^2 SµF S = ST SXS^2 + S^2 XST S − (1/6) ST SXST S
 ⟺ h^2 SµF S = S^2 DXS^2 + S^2 XS^2 D − (1/6) S^2 DXS^2 D
 ⟺ 4h^4 G = 4h^2 · h^2 SµF S = DX + XD − (1/6) DXD.

Writing D = diag(λ_1, . . . , λ_m), the (j, k)-th entry of DX + XD − (1/6) DXD is equal to λ_j x_{jk} + x_{jk} λ_k − (1/6) λ_j x_{jk} λ_k. Isolating x_{jk} and writing λ_j = 4σ_j = 4 sin^2(jπh/2) then yields

x_{jk} = 4h^4 g_{jk} / (λ_j + λ_k − (1/6) λ_j λ_k) = h^4 g_{jk} / (σ_j + σ_k − (2/3) σ_j σ_k),   σ_j = sin^2(jπh/2).

Defining α := jπh/2 and β := kπh/2, one has 0 < α, β < π/2. Note that

σ_j + σ_k − (2/3) σ_j σ_k > σ_j + σ_k − σ_j σ_k = 2 − cos^2 α − cos^2 β − (1 − cos^2 α)(1 − cos^2 β) = 1 − cos^2 α cos^2 β ≥ 1 − cos^2 β ≥ 0.

Let A = T ⊗ I + I ⊗ T − (1/6) T ⊗ T be as in Exercise 10.8 b) and s_i as in Section 11.2. Applying the mixed-product rule, one obtains

A(s_i ⊗ s_j) = (T ⊗ I + I ⊗ T)(s_i ⊗ s_j) − (1/6)(T ⊗ T)(s_i ⊗ s_j)
 = (λ_i + λ_j)(s_i ⊗ s_j) − (1/6) λ_i λ_j (s_i ⊗ s_j)
 = (λ_i + λ_j − (1/6) λ_i λ_j)(s_i ⊗ s_j).

The matrix A therefore has eigenvectors s_i ⊗ s_j, and counting them shows that these must be all of them. As shown above, the corresponding eigenvalues λ_i + λ_j − (1/6) λ_i λ_j = 4(σ_i + σ_j − (2/3) σ_i σ_j) are positive, implying that the matrix A is positive definite. It follows that the System (10.21) always has a (unique) solution.

Exercise 11.6: Algorithm for fast solution of 9 point scheme
Derive an algorithm for solving System (10.21) which for large m requires essentially the same number of operations as in Algorithm 11.1. (We assume that µF already has been formed.)

Solution. The following adaptation of Algorithm 11.1, given in Listing 11.1, solves System (10.21).

code/ninepointscheme.m

function U = ninepointscheme(F)
m = length(F); h = 1/(m+1); hv = pi*h*(1:m)';
sigma = sin(hv/2).^2;
S = sin(hv*(1:m));
G = S*F*S;
X = (h^4)*G./( sigma*ones(1,m) + ones(m,1)*sigma' - (2/3)*sigma.*sigma' );
U = zeros(m+2,m+2);
U(2:m+1,2:m+1) = S*X*S;
end
Listing 11.1: Solving the Poisson problem using a nine-point scheme.

Only two steps here are of order m^3: the ones which compute SFS and SXS. Hence the overall complexity is determined by these four matrix-matrix multiplications and is O(m^3).
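As a quick usage check (added here as a sketch), one can verify that the interior of the grid returned by ninepointscheme indeed satisfies the matrix equation (10.22) for an arbitrary right-hand side:

% Sketch: residual of the matrix equation solved by ninepointscheme
m = 30; h = 1/(m+1);
muF = rand(m);                     % any right-hand side, standing in for mu F
U = ninepointscheme(muF);
V = U(2:m+1, 2:m+1);
T = full(gallery('tridiag', m, -1, 2, -1));
residual = T*V + V*T - (1/6)*T*V*T - h^2*muF;
max(abs(residual(:)))              % should be at rounding-error level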


Exercise 11.7: Fast solution of biharmonic equation
For the biharmonic problem we derived in Exercise 10.9 the equation T^2 U + 2T U T + U T^2 = h^4 F. Define the matrix X = (x_{j,k}) by U = SXS, where U is the solution of (10.25). Show that

D^2 X + 2DXD + XD^2 = 4h^6 G,   where G = SF S,

and that

x_{j,k} = h^6 g_{j,k} / (4(σ_j + σ_k)^2),   σ_j := sin^2(jπh/2),   j, k = 1, 2, . . . , m.

Solution. From Exercise 10.9 we know that T ∈ R^{m×m} is the second derivative matrix. According to Lemma 2.2, the eigenpairs (λ_j, s_j), with j = 1, . . . , m, of T are given by

s_j = [sin(jπh), sin(2jπh), . . . , sin(mjπh)]^T,   λ_j = 2 − 2 cos(jπh) = 4 sin^2(jπh/2),

and satisfy s_j^T s_k = δ_{j,k}/(2h) for all j, k, where h := 1/(m + 1). Using, in order, that U = SXS, T S = SD, and S^2 = I/(2h), one finds that

h^4 F = T^2 U + 2T U T + U T^2
 ⟺ h^4 F = T^2 SXS + 2T SXST + SXST^2
 ⟺ h^4 SF S = ST^2 SXS^2 + 2ST SXST S + S^2 XST^2 S
 ⟺ h^4 SF S = S^2 D^2 XS^2 + 2S^2 DXS^2 D + S^2 XS^2 D^2
 ⟺ h^4 SF S = (D^2 X + 2DXD + XD^2)/(4h^2)
 ⟺ 4h^6 G = D^2 X + 2DXD + XD^2,

where G := SF S. The (j, k)-th entry of the latter matrix equation is 4h^6 g_{jk} = λ_j^2 x_{jk} + 2λ_j x_{jk} λ_k + x_{jk} λ_k^2 = x_{jk}(λ_j + λ_k)^2. Writing σ_j := sin^2(jπh/2) = λ_j/4, one obtains

x_{jk} = 4h^6 g_{jk}/(λ_j + λ_k)^2 = 4h^6 g_{jk}/(4 sin^2(jπh/2) + 4 sin^2(kπh/2))^2 = h^6 g_{jk}/(4(σ_j + σ_k)^2).


Exercise 11.8: Algorithm for fast solution of biharmonic equation
Use Exercise 11.7 to derive an algorithm function U=simplefastbiharmonic(F) which requires only O(δn^{3/2}) operations to find U in Exercise 10.9. Here δ is some constant independent of n.

Solution. In order to derive an algorithm that computes U in Exercise 10.9, we can adjust Algorithm 11.1 by replacing the computation of the matrix X by the formula from Exercise 11.7. This adjustment does not change the complexity of Algorithm 11.1, which therefore remains O(δn^{3/2}). The new algorithm can be implemented in MATLAB as in Listing 11.2.

code/simplefastbiharmonic.m

function U = simplefastbiharmonic(F)
m = length(F); h = 1/(m+1); hv = pi*h*(1:m)';
sigma = sin(hv/2).^2;
S = sin(hv*(1:m));
G = S*F*S;
X = (h^6)*G./( 4*(sigma*ones(1,m) + ones(m,1)*sigma').^2 );
U = zeros(m+2,m+2);
U(2:m+1,2:m+1) = S*X*S;
end
Listing 11.2: A simple, fast solution to the biharmonic equation.

Exercise 11.9: Check algorithm for fast solution of biharmonic equation
In Exercise 11.8 compute the solution U corresponding to F = ones(m,m). For some small m's check that you get the same solution obtained by solving the standard form Ax = b in (10.25). You can use x = A\b for solving Ax = b. Use F(:) to vectorize a matrix and reshape(x,m,m) to turn a vector x ∈ R^{m^2} into an m × m matrix. Use the MATLAB command surf(U) for plotting U for, say, m = 50. Compare the result with Exercise 11.8 by plotting the difference between both matrices.

Solution. The MATLAB function from Listing 11.3 directly solves the standard form Ax = b of Equation (10.25), making sure to return a matrix of the same dimension as the implementation from Listing 11.2.

code/standardbiharmonic.m

function V = standardbiharmonic(F)
m = length(F); h = 1/(m+1);
T = gallery('tridiag', m, -1, 2, -1);
A = kron(T^2, eye(m)) + 2*kron(T,T) + kron(eye(m),T^2);
b = h.^4*F(:);
x = A\b;
V = zeros(m+2, m+2);
V(2:m+1,2:m+1) = reshape(x,m,m);
end
Listing 11.3: A direct solution to the biharmonic equation.

After specifying m = 4 by issuing the command F = ones(4,4), the commands simplefastbiharmonic(F) and standardbiharmonic(F) both return the matrix

[ 0      0      0      0      0      0
  0      0.0015 0.0024 0.0024 0.0015 0
  0      0.0024 0.0037 0.0037 0.0024 0
  0      0.0024 0.0037 0.0037 0.0024 0
  0      0.0015 0.0024 0.0024 0.0015 0
  0      0      0      0      0      0 ].

For large m, it is more insightful to plot the data returned by our MATLAB functions. For m = 50, we solve and plot our system with the commands in Listing 11.4, resulting in Figure 11.1.

code/biharmonic_compare.m

F = ones(50, 50);
U = simplefastbiharmonic(F);
V = standardbiharmonic(F);
surf(U);
figure()
surf(V);
Listing 11.4: Solving the biharmonic equation and plotting the result.

On the face of it, these plots seem to be virtually identical. But exactly how close are they? We investigate this by plotting the difference with the command surf(U-V), which yields Figure 11.2. We conclude that their maximal difference is of the order of 10^{-14}, which makes them indeed very similar.

Exercise 11.10: Fast solution of biharmonic equation using 9 point rule
Repeat Exercises 10.9, 11.8 and 11.9 using the nine point rule (10.21) to solve the system (10.24).

Solution. We here need to compute


[Figure 11.1: Solution of the biharmonic equation using the functions simplefastbiharmonic (left) and standardbiharmonic (right).]

[Figure 11.2: Difference of the solutions of the biharmonic equation using the functions simplefastbiharmonic and standardbiharmonic.]

A = (T ⊗ I + I ⊗ T − (1/6) T ⊗ T)^2
  = T^2 ⊗ I + I ⊗ T^2 + 2T ⊗ T − (1/3) T ⊗ T^2 − (1/3) T^2 ⊗ T + (1/36) T^2 ⊗ T^2.

In vector form this can be written as

T^2 U + 2T U T + U T^2 − (1/3) T^2 U T − (1/3) T U T^2 + (1/36) T^2 U T^2 = h^4 F.

Following the deductions in Exercise 11.7 we write

h^4 F = T^2 U + 2T U T + U T^2 − (1/3) T^2 U T − (1/3) T U T^2 + (1/36) T^2 U T^2
h^4 F = T^2 SXS + 2T SXST + SXST^2 − (1/3) T^2 SXST − (1/3) T SXST^2 + (1/36) T^2 SXST^2
h^4 SF S = ST^2 SXS^2 + 2ST SXST S + S^2 XST^2 S − (1/3) ST^2 SXST S − (1/3) ST SXST^2 S + (1/36) ST^2 SXST^2 S
h^4 SF S = S^2 D^2 XS^2 + 2S^2 DXS^2 D + S^2 XS^2 D^2 − (1/3) S^2 D^2 XS^2 D − (1/3) S^2 DXS^2 D^2 + (1/36) S^2 D^2 XS^2 D^2
h^4 SF S = (D^2 X + 2DXD + XD^2 − (1/3) D^2 XD − (1/3) DXD^2 + (1/36) D^2 XD^2)/(4h^2)
4h^6 G = D^2 X + 2DXD + XD^2 − (1/3) D^2 XD − (1/3) DXD^2 + (1/36) D^2 XD^2.

Note that these deductions are very general: if we had an even more complex expression for A, the same deductions apply, so that T can be replaced in terms of D everywhere using that T S = SD. The (j, k)-th entry of the latter is

4h^6 g_{jk} = (λ_j^2 + 2λ_j λ_k + λ_k^2 − (1/3) λ_j^2 λ_k − (1/3) λ_j λ_k^2 + (1/36) λ_j^2 λ_k^2) x_{jk}.

Writing σ_j := sin^2(jπh/2) = λ_j/4, one obtains

x_{jk} = 4h^6 g_{jk} / ((λ_j + λ_k)^2 − (1/3) λ_j^2 λ_k − (1/3) λ_j λ_k^2 + (1/36) λ_j^2 λ_k^2)
       = h^6 g_{jk} / (4(σ_j + σ_k)^2 − (16/3) σ_j^2 σ_k − (16/3) σ_j σ_k^2 + (16/9) σ_j^2 σ_k^2).

To adapt the fast solution of the biharmonic equation to the nine-point scheme, we simply replace the line computing X in simplefastbiharmonic.m with the formula above. Otherwise the code is unchanged. This is shown in Listing 11.5 below.

code/simplefastbiharmonic_ninepointscheme.m
function U = simplefastbiharmonic_ninepointscheme(F)
m = length(F); h = 1/(m+1); hv = pi*h*(1:m)';
sigma = sin(hv/2).^2;
S = sin(hv*(1:m));
G = S*F*S;
X = (h^6)*G./( 4*(sigma*ones(1,m)+ones(m,1)*sigma').^2 ...
    - (16/3)*sigma.*((sigma').^2) - (16/3)*(sigma.^2).*sigma' ...
    + (16/9)*(sigma.^2).*((sigma').^2) );
U = zeros(m+2,m+2);
U(2:m+1,2:m+1) = S*X*S;
end
Listing 11.5: A simple, fast solution to the biharmonic equation using the nine-point scheme.


In the code for solving the biharmonic equation directly, one simply replaces T^2 ⊗ I + I ⊗ T^2 + 2T ⊗ T with

T^2 ⊗ I + I ⊗ T^2 + 2T ⊗ T − (1/3) T ⊗ T^2 − (1/3) T^2 ⊗ T + (1/36) T^2 ⊗ T^2,

so that only one line is changed here as well. This is shown in Listing 11.6 below.

code/standardbiharmonic_ninepointscheme.m
function V = standardbiharmonic_ninepointscheme(F)
m = length(F); h = 1/(m+1);
T = gallery('tridiag', m, -1, 2, -1);
A = kron(T^2, eye(m)) + 2*kron(T,T) + kron(eye(m),T^2) ...
    - (1/3)*kron(T^2,T) - (1/3)*kron(T,T^2) + (1/36)*kron(T^2,T^2);
b = h.^4*F(:);
x = A\b;
V = zeros(m+2, m+2);
V(2:m+1,2:m+1) = reshape(x,m,m);
end
Listing 11.6: A direct solution to the biharmonic equation using the nine-point scheme.

The code where the two approaches for the nine-point scheme are compared is shown in Listing 11.7.

code/biharmonic_compare_ninepointscheme.m
F = ones(50, 50);
U = simplefastbiharmonic_ninepointscheme(F);
V = standardbiharmonic_ninepointscheme(F);
surf(U);
figure()
surf(V);
Listing 11.7: Solving the biharmonic equation and plotting the result using the nine-point scheme.

The plots look similar to those from the standard scheme.
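To quantify the agreement, one can also print the largest entrywise difference after running the commands in Listing 11.7 (this extra check is an added sketch, mirroring the comparison in Exercise 11.9):

% Sketch: largest difference between the fast and the direct nine-point solutions
disp(max(max(abs(U - V))))     % expected to be at rounding-error level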

Chapter 12

The Classical Iterative Methods

Exercises section 12.3

Exercise 12.1: Richardson and Jacobi
Show that if a_ii = d ≠ 0 for all i then Richardson's method with α := 1/d is the same as Jacobi's method.

Solution. If a_11 = · · · = a_nn = d ≠ 0 and α = 1/d, Richardson's method (12.18) yields, for i = 1, . . . , n,

x_{k+1}(i) = x_k(i) + (1/d)(b_i − Σ_{j=1}^{n} a_ij x_k(j))
           = (1/d)(d x_k(i) − Σ_{j=1}^{n} a_ij x_k(j) + b_i)
           = (1/a_ii)(a_ii x_k(i) − Σ_{j=1}^{n} a_ij x_k(j) + b_i)
           = (1/a_ii)(− Σ_{j=1}^{i−1} a_ij x_k(j) − Σ_{j=i+1}^{n} a_ij x_k(j) + b_i),

which is identical to Jacobi's method (12.2).


Exercise 12.2: R-method when eigenvalues have positive real part
Suppose all eigenvalues λ_j of A have positive real parts u_j for j = 1, . . . , n and that α is real. Show that the R-method converges if and only if 0 < α < min_j(2u_j/|λ_j|^2).

Solution. We can write Richardson's method as x_{k+1} = Gx_k + c, with G = I − αA, c = αb. We know from Theorem 12.4 that the method converges if and only if ρ(G) < 1. The eigenvalues of I − αA are 1 − αλ_j, and we have that

|1 − αλ_j|^2 = 1 + α^2|λ_j|^2 − 2αu_j.

Thus ρ(G) < 1 if and only if α^2|λ_j|^2 − 2αu_j < 0 for all j. Since u_j > 0 this forces α > 0. Dividing by α we get that α|λ_j|^2 < 2u_j, so that α < 2u_j/|λ_j|^2 (note that |λ_j| > 0 since u_j ≠ 0). We thus have that ρ(G) < 1 if and only if 0 < α < min_j(2u_j/|λ_j|^2), and the result follows.

Exercise 12.3: Divergence example for J and GS
Show that both Jacobi's method and Gauss-Seidel's method diverge for A = [1 2; 3 4].

Solution. We compute the matrices G_J and G_1 from A and show that the spectral radii satisfy ρ(G_J), ρ(G_1) ≥ 1. Once this is shown, Theorem 12.4 implies that the Jacobi method and Gauss-Seidel's method diverge. Write A = D − A_L − A_R, with D the diagonal part of A, A_L lower triangular, and A_R upper triangular. From Equation (12.12), we find

G_J = I − M_J^{-1}A = I − D^{-1}A = [1 0; 0 1] − [1 2; 3/4 1] = [0 −2; −3/4 0],
G_1 = I − M_1^{-1}A = I − (D − A_L)^{-1}A = [1 0; 0 1] − [1 2; 0 −1/2] = [0 −2; 0 3/2].

From this, we find ρ(G_J) = √(3/2) and ρ(G_1) = 3/2, both of which are greater than 1.
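A two-line numerical confirmation (an added sketch, not part of the printed solution):

% Sketch: spectral radii of the Jacobi and Gauss-Seidel iteration matrices for A = [1 2; 3 4]
A = [1 2; 3 4]; D = diag(diag(A)); L = tril(A);
rhoJ  = max(abs(eig(eye(2) - D\A)))   % sqrt(3/2) ~ 1.2247
rhoGS = max(abs(eig(eye(2) - L\A)))   % 1.5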

Exercise 12.4: 2 by 2 matrix
We want to show that Gauss-Seidel converges if and only if Jacobi converges for a 2 by 2 matrix A := [a_11 a_12; a_21 a_22] ∈ R^{2×2}.

a) Show that the spectral radius for the Jacobi method is ρ(G_J) = √|a_21 a_12/(a_11 a_22)|.

Solution. The splitting matrix for the Jacobi method, and its inverse, are

M_J = D = [a_11 0; 0 a_22],   M_J^{-1} = [a_11^{-1} 0; 0 a_22^{-1}].

It follows that

G_J = I − M_J^{-1}A = [1 − a_11^{-1}a_11, −a_11^{-1}a_12; −a_22^{-1}a_21, 1 − a_22^{-1}a_22] = [0, −a_12/a_11; −a_21/a_22, 0],

which has eigenvalues ±√(a_21 a_12/(a_11 a_22)), and therefore spectral radius ρ(G_J) = √|a_21 a_12/(a_11 a_22)|.

b) Show that the spectral radius for the Gauss-Seidel method is ρ(G_1) = |a_21 a_12/(a_11 a_22)|.

Solution. The splitting matrix for the Gauss-Seidel method, and its inverse, are

M_1 = D − A_L = [a_11 0; a_21 a_22],   M_1^{-1} = (1/(a_11 a_22)) [a_22 0; −a_21 a_11].

It follows that

G_1 = I − M_1^{-1}A = I − [1, a_12/a_11; 0, (a_11 a_22 − a_12 a_21)/(a_11 a_22)] = [0, −a_12/a_11; 0, a_12 a_21/(a_11 a_22)],

which has eigenvalues 0 and a_12 a_21/(a_11 a_22), and therefore spectral radius

ρ(G_1) = |a_21 a_12/(a_11 a_22)|.
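A quick numerical illustration of a) and b) for a concrete 2-by-2 matrix (an added sketch):

% Sketch: rho(GJ)^2 equals rho(G1) for a 2-by-2 matrix with nonzero diagonal
A = [2 1; -3 5]; D = diag(diag(A)); L = tril(A);
rhoJ  = max(abs(eig(eye(2) - D\A)));
rhoGS = max(abs(eig(eye(2) - L\A)));
[rhoJ^2, rhoGS]                       % the two numbers agree (0.3 and 0.3)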

c) Conclude that Gauss-Seidel converges if and only if Jacobi converges.

Solution. From a) and b) it follows that ρ(G_J) = √(ρ(G_1)), so that ρ(G_J) < 1 if and only if ρ(G_1) < 1. Since an iterative method converges if and only if ρ(G) < 1, the result follows.

Exercise 12.5: Example: GS converges, J diverges
Show (by finding its eigenvalues) that the matrix

[1 a a; a 1 a; a a 1]

is positive definite for −1/2 < a < 1. Thus, GS converges for these values of a. Show that the J method does not converge for 1/2 < a < 1.


Solution. The eigenvalues of A are the zeros of det(A − λI) = (2a + 1 − λ)(1 − a − λ)^2. We find eigenvalues λ_1 := 2a + 1 and λ_2 := 1 − a, the latter having algebraic multiplicity two. Whenever −1/2 < a < 1 these eigenvalues are positive, implying that A is positive definite for such a. Let's compute the spectral radius of G_J = I − D^{-1}A, where D is the diagonal part of A. The eigenvalues of G_J are the zeros of the characteristic polynomial

det(G_J − λI) = det([−λ −a −a; −a −λ −a; −a −a −λ]) = (−λ − 2a)(a − λ)^2,

and we find spectral radius ρ(G_J) = max{|a|, |2a|} = 2|a|. It follows that ρ(G_J) > 1 whenever 1/2 < a < 1, in which case Theorem 12.4 implies that the Jacobi method does not converge (even though A is symmetric positive definite).

Exercise 12.6: Example: GS diverges, J converges
Let G_J and G_1 be the iteration matrices for the Jacobi and Gauss-Seidel methods applied to the matrix

A := [1 0 1/2; 1 1 0; −1 1 1].ᵃ

a) Show that

G_1 = [0 0 −1/2; 0 0 1/2; 0 0 −1]

and conclude that GS diverges.

ᵃ S. Venit, "The convergence of Jacobi and Gauss-Seidel iteration", Mathematics Magazine 48 (1975), 163-167.

Solution. The splitting matrix, and its inverse, are

M_1 = D − A_L = [1 0 0; 1 1 0; −1 1 1],   M_1^{-1} = [1 0 0; −1 1 0; 2 −1 1],

so that

G_1 = I − M_1^{-1}A = I − [1 0 1/2; 0 1 −1/2; 0 0 2] = [0 0 −1/2; 0 0 1/2; 0 0 −1].

This matrix has eigenvalues 0 and −1, and therefore ρ(G_1) = 1, implying that Gauss-Seidel does not converge for this matrix.

b) Show that p(λ) := det(λI − G_J) = λ^3 + (1/2)λ + 1/2.

Solution. The splitting matrix is M_J = D = I, so that also M_J^{-1} = I, and

G_J = I − M_J^{-1}A = I − A = [0 0 −1/2; −1 0 0; 1 −1 0],


so that

det(λI − G_J) = det([λ 0 1/2; 1 λ 0; −1 1 λ]) = λ · det([λ 0; 1 λ]) + (1/2) · det([1 λ; −1 1]) = λ^3 + (1/2)λ + 1/2.

c) Show that if |λ| ≥ 1 then p(λ) ≠ 0. Conclude that J converges.

Solution. Since p′(λ) = 3λ^2 + 1/2 > 0 on the real line, p is increasing there. Since p(1) = 2, p(λ) > 0 for real λ ≥ 1, and since p(−1) = −1, p(λ) ≤ −1 for real λ ≤ −1. Hence p has no real zero with |λ| ≥ 1. The same holds for complex zeros: if p(λ) = 0 with |λ| ≥ 1, then λ^3 = −(λ + 1)/2, so |λ|^3 = |λ + 1|/2 ≤ (|λ| + 1)/2 ≤ |λ| ≤ |λ|^3, forcing |λ| = 1 and λ = 1, contradicting p(1) = 2. It follows that all eigenvalues of G_J lie strictly inside the unit circle, so ρ(G_J) < 1 and the Jacobi method converges.

Exercise 12.7: Strictly diagonally dominance; the J method
Show that the J method converges if |a_ii| > Σ_{j≠i} |a_ij| for i = 1, . . . , n.

Solution. If, as assumed in the exercise, A = (a_ij) is strictly diagonally dominant, then it is nonsingular and a_11, . . . , a_nn ≠ 0. For the Jacobi method, one finds

G_J = I − diag(a_11, . . . , a_nn)^{-1}A,   i.e. (G_J)_{ii} = 0 and (G_J)_{ij} = −a_ij/a_ii for j ≠ i.

By Theorem 8.3, the ∞-norm can be expressed as the maximum, over all rows, of the sum of absolute values of the entries in a row. Using that A is strictly diagonally dominant, one finds

‖G_J‖_∞ = max_{1≤i≤n} Σ_{j≠i} |−a_ij/a_ii| = max_{1≤i≤n} (1/|a_ii|) Σ_{j≠i} |a_ij| < 1.

As by Lemma 8.1 the ∞-norm is consistent, Corollary 12.3 implies that the Jacobi method converges for any strictly diagonally dominant matrix A.

Exercise 12.8: Strictly diagonally dominance; the GS method
Consider the GS method. Let x be the exact solution to Ax = b, and let ε_k := x_k − x be its difference to the approximate solution x_k at iteration k. Suppose r := max_i r_i < 1, where r_i = Σ_{j≠i} |a_ij|/|a_ii|. Show using induction on i that |ε_{k+1}(j)| ≤ r‖ε_k‖_∞ for j = 1, . . . , i. Conclude that Gauss-Seidel's method is convergent when A is strictly diagonally dominant.


Solution. Let A = −A_L + D − A_R be decomposed as a sum of a lower triangular, a diagonal, and an upper triangular part. By Equation (12.3), the approximate solutions x_k in the Gauss-Seidel method are related by Dx_{k+1} = A_L x_{k+1} + A_R x_k + b. Let x be the exact solution of Ax = b. It follows that the errors ε_k := x_k − x are related by Dε_{k+1} = A_L ε_{k+1} + A_R ε_k. Let r and r_i be as in the exercise. Let k ≥ 0 be arbitrary. We show by induction that

|ε_{k+1}(j)| ≤ r‖ε_k‖_∞, for j = 1, 2, . . . , n.    (12.i)

For j = 1, the relation between the errors translates to

|ε_{k+1}(1)| = |a_11|^{-1} |−a_12 ε_k(2) − · · · − a_1n ε_k(n)| ≤ r_1 ‖ε_k‖_∞ ≤ r‖ε_k‖_∞.

Assume that (12.i) holds for 1, . . . , j − 1. The relation between the errors then yields the bound

|ε_{k+1}(j)| ≤ |a_jj|^{-1} | −Σ_{i=1}^{j−1} a_{j,i} ε_{k+1}(i) − Σ_{i=j+1}^{n} a_{j,i} ε_k(i) |
             ≤ r_j max{r‖ε_k‖_∞, ‖ε_k‖_∞} = r_j ‖ε_k‖_∞ ≤ r‖ε_k‖_∞.

Equation (12.i) then follows by induction, and it also follows that ‖ε_{k+1}‖_∞ ≤ r‖ε_k‖_∞. The matrix A is strictly diagonally dominant precisely when r < 1, implying

lim_{k→∞} ‖ε_k‖_∞ ≤ ‖ε_0‖_∞ lim_{k→∞} r^k = 0.

We conclude that the Gauss-Seidel method converges for strictly diagonally dominant matrices.


Exercise 12.9: Convergence example for fix point iteration
Consider for a ∈ C

x := [x_1; x_2] = [0 a; a 0][x_1; x_2] + [1 − a; 1 − a] =: Gx + c.

Starting with x_0 = 0 show by induction

x_k(1) = x_k(2) = 1 − a^k,   k ≥ 0,

and conclude that the iteration converges to the fixed-point x = [1, 1]^T for |a| < 1 and diverges for |a| > 1. Show that ρ(G) = 1 − η with η = 1 − |a|. Compute the estimate (12.31) for the rate of convergence for a = 0.9 and s = 16 and compare with the true number of iterations determined from |a|^k ≤ 10^{-16}.

Solution. We show by induction that x_k(1) = x_k(2) = 1 − a^k for every k ≥ 0. Clearly the formula holds for k = 0. Assume the formula holds for some fixed k. Then

x_{k+1} = Gx_k + c = [0 a; a 0][1 − a^k; 1 − a^k] + [1 − a; 1 − a] = [1 − a^{k+1}; 1 − a^{k+1}].

It follows that the formula holds for any k ≥ 0. When |a| < 1 we can evaluate the limit

lim_{k→∞} x_k(i) = lim_{k→∞} (1 − a^k) = 1 − lim_{k→∞} a^k = 1, for i = 1, 2.

When |a| > 1, however, |x_k(1)| = |x_k(2)| = |1 − a^k| becomes arbitrarily large with k and lim_{k→∞} x_k(i) diverges. The eigenvalues of G are the zeros of the characteristic polynomial λ^2 − a^2 = (λ − a)(λ + a), and we find that G has spectral radius ρ(G) = 1 − η, where η := 1 − |a|. Equation (12.31) yields an estimate k̃ = log(10)s/(1 − |a|) for the smallest number of iterations k so that ρ(G)^k ≤ 10^{-s}. In particular, taking a = 0.9 and s = 16, one expects at least k̃ = 160 log(10) ≈ 368 iterations before ρ(G)^k ≤ 10^{-16}. On the other hand, 0.9^k = |a|^k = 10^{-s} = 10^{-16} when k ≈ 350, so in this case the estimate is fairly accurate.

Exercise 12.10: Estimate in Lemma 12.1 can be exact
Consider the iteration in Example 12.2. Show that ρ(G_J) = 1/2. Then show that x_k(1) = x_k(2) = 1 − 2^{-k} for k ≥ 0. Thus the estimate in Lemma 12.1 is exact in this case.

Solution. As the eigenvalues of the matrix G_J are the zeros of λ^2 − 1/4 = (λ − 1/2)(λ + 1/2) = 0, one finds the spectral radius ρ(G_J) = 1/2. In this example, the


Jacobi iteration process is described by

x_{k+1} = G_J x_k + c,   G_J = [0 1/2; 1/2 0],   c = [2 0; 0 2]^{-1}[1; 1] = [1/2; 1/2].

The initial guess x_0 = [0; 0] satisfies the formula x_k(1) = x_k(2) = 1 − 2^{-k} for k = 0. Moreover, if this formula holds for some k ≥ 0, one finds

x_{k+1} = G_J x_k + c = [0 1/2; 1/2 0][1 − 2^{-k}; 1 − 2^{-k}] + [1/2; 1/2] = [1 − 2^{-(k+1)}; 1 − 2^{-(k+1)}],

which means that it must then hold for k + 1 as well. By induction we can conclude that the formula holds for all k ≥ 0. At iteration k, each entry of the approximation x_k differs by 2^{-k} from the fixed point, implying that ‖ε_k‖_∞ = 2^{-k}. Therefore, for given s, the error satisfies ‖ε_k‖_∞ ≤ 10^{-s} for the first time at k = ⌈s log(10)/log(2)⌉. The bound −s log(10)/log(ρ(G)) gives the same.

Exercise 12.11: Iterative method (Exam exercise 1991-3)
Let A ∈ R^{n×n} be a symmetric positive definite matrix with ones on the diagonal and let b ∈ R^n. We will consider an iterative method for the solution of Ax = b. Observe that A may be written A = I − L − L^T, where L is lower triangular with zeros on the diagonal, l_{i,j} = 0 when j ≥ i. The method is defined by

M x_{k+1} = N x_k + b,    (12.45)

where M and N are given by the splitting A = M − N,

M = (I − L)(I − L^T),   N = LL^T.    (12.46)

a) Let x ≠ 0 be an eigenvector of M^{-1}N with eigenvalue λ. Show that

λ = x^T N x / (x^T A x + x^T N x).    (12.47)

Solution. If M^{-1}N x = λx then N x = λM x = λ(A + N)x. We multiply by x^T and obtain x^T N x = λ x^T(A + N)x = λ(x^T A x + x^T N x), which after a division gives the result.

b) Show that the sequence {x_k} generated by (12.45) converges to the solution x of Ax = b for any starting vector x_0.


Solution. It is enough to show that all eigenvalues of M^{-1}N are less than one in magnitude. Let (λ, x) be an eigenpair of M^{-1}N. It satisfies (12.47). Since A is positive definite, x^T A x > 0. Also, x^T N x = x^T LL^T x = ‖L^T x‖_2^2 ≥ 0. Due to this, (12.47) implies λ ∈ [0, 1).

c) Consider the following algorithm
1. Choose x = [x(1), x(2), . . . , x(n)]^T.
2. for k = 1, 2, 3, . . .
     for i = 1, 2, . . . , n − 1, n, n, n − 1, n − 2, . . . , 1
       x(i) = b(i) − Σ_{j≠i} a(i, j)x(j)

(i.e., Algorithm (12.48) in the book). Is there a connection between this algorithm and the method of Gauss-Seidel? Show that the algorithm (12.48) leads to the splitting (12.46).

Solution. The algorithm consists of two Gauss-Seidel iterations in each outer iteration: one in the order 1, 2, . . . , n (forward) and then one in the order n, n − 1, . . . , 1 (backward). We write the solution of Ax = b as x = Lx + L^T x + b. Forward and backward Gauss-Seidel can be written y_{k+1} = Ly_{k+1} + L^T x_k + b and x_{k+1} = Ly_{k+1} + L^T x_{k+1} + b. We write these in the form

(I − L) y_{k+1} = L^T x_k + b,    (12.ii)
(I − L^T) x_{k+1} = L y_{k+1} + b.    (12.iii)

Multiplying (12.iii) by I − L we obtain M x_{k+1} = (I − L)L y_{k+1} + (I − L)b. Since (I − L)L = L(I − L) we find, using (12.ii),

M x_{k+1} = L(I − L) y_{k+1} + (I − L)b = LL^T x_k + Lb + (I − L)b = N x_k + b.

This is the splitting (12.46).
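A small MATLAB sketch of the forward/backward sweep described in c). The function name and the fixed iteration count are illustrative choices, not the book's Algorithm (12.48):

% Sketch: symmetric (forward/backward) Gauss-Seidel for A = I - L - L', ones on the diagonal
function x = sym_gauss_seidel(A, b, x, nit)
n = size(A,1);
L = -tril(A, -1);                          % strictly lower part, so that A = I - L - L'
for k = 1:nit
    y = (eye(n) - L)  \ (L'*x + b);        % forward sweep,  cf. (12.ii)
    x = (eye(n) - L') \ (L*y  + b);        % backward sweep, cf. (12.iii)
end
end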


Exercise 12.12: Gauss-Seidel method (Exam exercise 2008-1)
Consider the linear system Ax = b in which

A := [3 0 1; 0 7 2; 1 2 4]

and b := [1, 9, −2]^T.

a) With x_0 = [1, 1, 1]^T, carry out one iteration of the Gauss-Seidel method to find x_1 ∈ R^3.

Solution. We obtain

x_1(1) = (b(1) − x_0(3))/3 = (1 − 1)/3 = 0,
x_1(2) = (b(2) − 2x_0(3))/7 = (9 − 2·1)/7 = 1,
x_1(3) = (b(3) − x_1(1) − 2x_1(2))/4 = (−2 − 0 − 2·1)/4 = −1.
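This hand computation can be confirmed with the same update used in the listing of part c) below (an added sketch):

% Sketch: one Gauss-Seidel step as a correction with the lower triangular part of A
A = [3 0 1; 0 7 2; 1 2 4]; b = [1; 9; -2]; x0 = [1; 1; 1];
x1 = x0 + tril(A) \ (b - A*x0)      % returns [0; 1; -1]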

b) If we continue the iteration, will the method converge? Why?

Solution. Yes. Since A is strictly diagonally dominant, i.e.

a_ii > Σ_{j≠i} |a_ij|,   i = 1, 2, 3,

all its eigenvalues are positive by Gershgorin's circle theorem. Therefore, since A is also symmetric, it is symmetric positive definite, which guarantees convergence. That A is positive definite also follows since the three leading principal submatrices have positive determinants.

c) Write a MATLAB program for the Gauss-Seidel method applied to a matrix A ∈ R^{n×n} and right-hand side b ∈ R^n. Use the ratio of the current residual to the initial residual as the stopping criterion, as well as a maximum number of iterations.
Hint. The function C=tril(A) extracts the lower part of A into a lower triangular matrix C.

Solution. The following code solves the system Ax = b using the Gauss-Seidel method.


code/gs.m
function [x,it]=gs(A,b,x,tol,maxit)
nr = norm(b-A*x,2); C = tril(A);
for k=1:maxit
    r = b - A*x;
    x = x + C\r;
    if norm(r,2)/nr < tol        % stopping test on the relative residual
        it = k; return           % assumed termination handling for this listing
    end
end
it = maxit;
end

while f(k) >= 10^(-8)
    k = k + 1;
end
k
Listing 12.3: Find the first integer k for which f(k) < 10^{-8}.

finds that f(k) drops below 10^{-8} for the first time at k = 470. We conclude that the matrix A^k is close to zero only for a very high power k.

b) A^k can be found explicitly for any k. Let E := (A − λI)/a. Show by induction that

E^k = [0 I_{n−k}; 0 0] for 1 ≤ k ≤ n − 1,   and that E^n = 0.

Solution. Let E = E_1 := (A − λI)/a be the n × n matrix in the exercise, and write


E_k := [0 I_{n−k}; 0 0] ∈ R^{n,n}.

Clearly E^k = E_k for k = 1. Suppose that E^k = E_k for some k satisfying 1 ≤ k ≤ n − 1. Using the rules of block multiplication, E^{k+1} = E^k E^1 = E_k E_1 = E_{k+1}, since multiplying the blocks shifts the identity block one further position off the diagonal. Alternatively, since

(E_k)_{ij} = 1 if j = i + k, and 0 otherwise,

one has

(E^{k+1})_{ij} = (E^k E^1)_{ij} = Σ_{ℓ} (E^k)_{iℓ}(E^1)_{ℓj} = (E^k)_{i,i+k}(E^1)_{i+k,j} = 1 · (E^1)_{i+k,j} = 1 if j = i + k + 1, and 0 otherwise.

By induction we conclude that E^k = E_k for any k satisfying 1 ≤ k ≤ n, with the convention that E^n = E_n = 0_{n,n}. We summarize that the matrix E is nilpotent of degree n.

c) We have A^k = (aE + λI)^k = Σ_{j=0}^{min{k,n−1}} binom(k, j) a^j λ^{k−j} E^j and conclude that the (1, n) element is given by f(k) for k ≥ n − 1.

Solution. Since the matrices E and I commute, the binomial theorem and b) yield

A^k = (aE + λI)^k = Σ_{j=0}^{min{k,n−1}} binom(k, j) λ^{k−j} a^j E^j.

Since (E^j)_{1,n} = 0 for 0 ≤ j ≤ n − 2 and (E^{n−1})_{1,n} = 1, it follows that

(A^k)_{1,n} = Σ_{j=0}^{min{k,n−1}} binom(k, j) λ^{k−j} a^j (E^j)_{1,n} = binom(k, n−1) λ^{k−n+1} a^{n−1} = f(k),

which is what we needed to show.

which is what needed to be shown.

Chapter 13

The Conjugate Gradient Method

Exercises section 13.1

Exercise 13.1: A-norm One can show that the A-norm is a vector norm on Rn without using the fact that it is an inner product norm. Show this with the help of the Cholesky factorization of A. Solution. Let A = LL∗ be a Cholesky factorization of A, i.e., L is lower triangular with positive diagonal elements. The A-norm then takes the form kxkA = √ T x LL∗ x = kL∗ xk. Let us verify the three properties of a vector norm: 1. Positivity: Clearly kxkA = kL∗ xk ≥ 0. Since L∗ is nonsingular, kxkA = kL∗ xk = 0 if and only if L∗ x = 0 if and only if x = 0. 2. Homogeneity: kaxkA = kL∗ (ax)k = kaL∗ xk = |a|kL∗ (x)k = |a|kxkA . 3. Subadditivity: kx+ykA = kL∗ (x+y)k = kL∗ x+L∗ yk ≤ kL∗ xk+kL∗ yk = kxkA +kykA .

Exercise 13.2: Paraboloid Let A = U DU T be the spectral decomposition of A, i.e., U is orthonormal and D = diag(λ1 , . . . , λn ) is diagonal. Define new variables v = [v1 , . . . , vn ]T := U T y, and set c := U T b = [c1 , . . . , cn ]T . Show that Q(y) :=

n n X 1 T 1X y Ay − bT y = λj vj2 − cj vj . 2 2 j=1 j=1

Solution. One has

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2020 T. Lyche et al., Exercises in Numerical Linear Algebra and Matrix Factorizations, Texts in Computational Science and Engineering 23, https://doi.org/10.1007/978-3-030-59789-4_13

215

13 The Conjugate Gradient Method

216

Q(y) =

n n X 1 T 1 1X y U DU T y − bT y = v T Dv − cT v = λj vj2 − c j vj . 2 2 2 j=1 j=1

Exercise 13.3: Steepest descent iteration Verify the numbers in Example 13.1, i.e., show that       1 1−k −k 1 −k 0 t2k−2 = 3 · 4 , x2k−1 = −4 , r 2k−1 = 3 · 4 , −1/2 2 1 (13.i)       −1 1 1/2 t2k−1 = 3 · 4−k , x2k = −4−k , r 2k = 3 · 4−k . (13.ii) 2 1/2 0 Solution. In the steepest descent method we choose, at the kth iteration, the search direction pk = r k = b − Axk and optimal step length αk :=

rT k rk . T r k Ar k

Given is a quadratic function     1  x T x xy A Q(x, y) = −b , y y 2



 2 −1 A= , −1 2

  0 b= , 0

and an initial guess x0 = [−1, −1/2]T of its minimum. The corresponding residual is        0 2 −1 −1 3/2 r 0 = b − Ax0 = − = . 0 −1 2 −1/2 0 Performing the steps in Equation (13.8) twice yields      rT r0 9/4 1 2 −1 3/2 3 t0 = Ar 0 = = , α0 = 0T = = , −1 2 0 −3/2 9/2 2 r 0 t0             1 3/2 1 −1 −1/4 3/2 3 0 x1 = + = , r1 = − = , −1/2 −1/2 0 3/4 2 0 2 −3/2  t1 = Ar 1 =

x2 =

rT 9/16 1 1 r1 = = , 9/8 2 rT t 1 1       1 −3/4 0 3/8 r2 = − = . 3/4 0 2 3/2

    2 −1 0 −3/4 = , −1 2 3/4 3/2

      1 0 −1/4 −1/4 + = , −1/2 −1/8 2 3/4

α1 =

Moreover, assume that for some k ≥ 1, (13.i) and (13.ii) hold. Then

13 The Conjugate Gradient Method

t2k = 3 · 4−k

217

    1 2 −1 1/2 1−(k+1) , =3·4 −1 2 0 −1/2



9 · 4−2k · ( 12 )2 rT 1 2k r 2k = = , 1 T −2k 2 r 2k t2k 9·4 ·2       1 1 1/2 1 = −4−k + · 3 · 4−k = −4−(k+1) , 1/2 0 2 2       1 1/2 1 0 = 3 · 4−k − · 3 · 41−(k+1) = 3 · 4−(k+1) , 0 −1/2 1 2      2 −1 0 −1 = 3 · 4−(k+1) = 3 · 4−(k+1) , −1 2 1 2

α2k = x2k+1 r 2k+1 t2k+1

rT 9 · 4−2(k+1) 1 2k+1 r 2k+1 = = , −2(k+1) · 2 2 9 · 4 rT t 2k+1 2k+1       1 1 −(k+1) 1 −(k+1) 0 −(k+1) = −4 + ·3·4 = −4 , 2 1 1/2 2       1 −(k+1) 0 −(k+1) −1 −(k+1) 1/2 =3·4 − ·3·4 =3·4 . 1 2 0 2

α2k+1 = x2k+2 r 2k+2

Using the method of induction, we conclude that (13.i), (13.ii), and αk = 1/2 hold for any k ≥ 1. Exercise 13.4: Steepest descent (Exam exercise 2011-1) The steepest descent method can be used to solve a linear system Ax = b for x ∈ Rn , where A ∈ Rn,n is symmetric and positive definite, and b ∈ Rn . With x0 ∈ Rn an initial guess, the iteration is xk+1 = xk + αk r k , where r k rT r is the residual, r k = b − Axk , and αk = rTkArkk . k   2 −1 a) Compute x1 if A = , b = [1 1]T and x0 = 0. −1 2 Solution. We find r 0 = [1 1]T , α0 = 1, and x1 = [1 1]T , the exact solution. b) If the k-th error, ek = xk − x, is an eigenvector of A, what can you say about xk+1 ? Solution. If Aek = λek for some λ ∈ R then r k = −λek , and Ar k = −λAek = 2 T T 3 T −λ2 ek . We therefore have r T k r k = λ ek ek and r k Ar k = λ ek ek , so that αk = 1/λ, and therefore, xk+1 = xk + αk r k = xk − ek = x, which is the solution.

13 The Conjugate Gradient Method

218

Exercises section 13.2

Exercise 13.5: Conjugate gradient iteration, II Do one iteration with the conjugate gradient method when x0 = 0. Solution. Using x0 = 0, one finds x1 = x0 +

(b − Ax0 )T (b − Ax0 ) bT b (b − Ax ) = b. 0 2 (b − Ax0 )T A(b − A x0 ) bT Ab

Exercise 13.6: Conjugate gradient iteration, III Do two conjugate gradient iterations for the system      2 −1 x1 0 = , −1 2 x2 3 starting with x0 = 0. Solution. By Exercise 13.5, x1 =

    bT b 9 0 0 b = = . 3/2 18 3 bT Ab

We find, in order, 3 1 α0 = , r 1 = 2 , 0 2 3   2 1 2 p1 = 3 , α1 = , x2 = . 2 3 4

  0 p0 = r 0 = , 3 β0 =

1 , 4

Since the residual vectors r 0 , r 1 , r 2 must be orthogonal, it follows that r 2 = 0 and x2 must be an exact solution. This can be verified directly by hand. Exercise 13.7: The cg step length is optimal Show that the step length αk in the conjugate gradient method is optimal. Hint. Pk−1 Use induction on k to show that pk = r k + j=0 ak,j r j for some constants ak,j . Solution. For any fixed search direction pk , the step length αk is optimal if Q(xk+1 ) is as small as possible, that is

13 The Conjugate Gradient Method

219

Q(xk+1 ) = Q(xk + αk pk ) = min f (α), α∈R

where, by (13.5), 1 f (α) := Q(xk + αpk ) = Q(xk ) − αpkT r k + α2 pT k Apk 2 is a quadratic polynomial in α. Since A is assumed to be positive definite, necessarily pT k Apk > 0. Therefore f has a minimum, which it attains at α=

pT k rk . T pk Apk

Applying (13.17) repeatedly, one finds that the search direction pk for the conjugate gradient method satisfies ! T T r r rT r r r k−1 k k k−1 pk = r k + T k p = rk + T k r k−1 + T p = ··· r k−1 r k−1 k−1 r k−1 r k−1 r k−2 r k−2 k−2 As p0 = r 0 , the difference pk − r k is a linear combination of the vectors T r k−1 , . . . , r 0 , each of which is orthogonal to r k . It follows that pT k rk = rk rk and that the step length α is optimal for α=

rT k rk = αk . T pk Apk

Exercise 13.8: Starting value in cg Show that the conjugate gradient method (13.18) for Ax = b starting with x0 is the same as applying the method to the system Ay = r 0 := b − Ax0 starting with y 0 = 0. Hint. The conjugate gradient method for Ay = r 0 can be written y k+1 := y k + γk q k , γk := sT k+1 sk+1 sT k sk

sT k sk , qT k Aq k

sk+1 := sk − γk Aq k , q k+1 := sk+1 + δk q k , δk :=

. Show that y k = xk −x0 , sk = r k , and q k = pk , for k = 0, 1, 2 . . ..

Solution. As in the exercise, we consider the conjugate gradient method for Ay = r 0 , with r 0 = b − Ax0 . Starting with y 0 = 0,

s0 = r 0 − Ay 0 = r 0 ,

one computes, for any k ≥ 0,

q 0 = s0 = r 0 ,

220

13 The Conjugate Gradient Method

γk :=

sT k sk , T q k Aq k δk :=

y k+1 = y k + γk q k , sT k+1 sk+1 , sT k sk

sk+1 = sk − γk Aq k ,

q k+1 = sk+1 + δk q k .

How are the iterates y k and xk related? As remarked above, s0 = r 0 and q 0 = r 0 = p0 . Suppose sk = r k and q k = pk for some k ≥ 0. Then sk+1 = sk − γk Aq k = r k −

rT k rk Apk = r k − αk Apk = r k+1 , pT k Apk

q k+1 = sk+1 + δk q k = r k+1 +

rT k+1 r k+1 pk = pk+1 . rT k rk

It follows by induction that sk = r k and q k = pk for all k ≥ 0. In addition, y k+1 − y k = γk q k =

rT k rk pk = xk+1 − xk , pT k Apk

for any k ≥ 0,

so that y k = xk − x0 . Exercise 13.9: Program code for testing steepest descent Write a function K=sdtest(m,a,d,tol,itmax) to test the steepest descent method on the matrix T 2 . Make the analogues of Table 13.1 and Table 13.2. For Table 13.2 it is enough√to test for say n = 100, 400, 1600, 2500, and tabulate K/n instead of K/ n in the last row. Conclude that the upper bound (13.19) is realistic. Compare also with the number of iterations for the J and GS method in Table 12.1. Solution. Replacing the steps in (13.18) by those in (13.8), Algorithm 13.2 changes into the following algorithm for testing the steepest descent method. code/sdtest.m 1 2 3 4 5 6 7 8 9 10 11 12

function [V,K] = sdtest(m, a, d, tol, itmax) R = ones(m)/(m+1)ˆ2; rho = sum(sum(R.*R)); rho0 = rho; V = zeros(m,m); T1=sparse(toeplitz([d, a, zeros(1,m-2)])); for k=1:itmax if sqrt(rho/rho0) 0 for k = 0, . . . , m we clearly have 0 < ω1 < 1. −1 Suppose 0 < ωk−1 < 1 for some k ≥ 2. Now ωk−1 ρk /ρk−1 > 0 and therefore 0 < ωk < 1. The result follows by induction.

13 The Conjugate Gradient Method

233

b) Explain briefly how to define an iterative algorithm for determining xk using the formulas (13.68), (13.69), (13.70) and estimate the number of arithmetic operations in each iteration. Solution. The following code runs the algorithm a fixed number of iterations, for a given matrix B and vector b. code/cgantisymm.m 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

function x=cgantisymm(B,b) m = 10; % Number of iterations n = size(B,1); x = zeros(n,m+2); r = zeros(n,m+2); r(:,1) = b; r(:,2) = b; rho = zeros(m+2,1); rho(2) = norm(b)ˆ2; omega = 1; for k=2:(m+1) x(:,k+1) = (1-omega)*x(:,k-1) + omega*(x(:,k)+r(:,k)); r(:,k+1) = (1-omega)*r(:,k-1) + omega*B*r(:,k); rho(k+1) = norm(r(:,k+1))ˆ2; omega=1/(1+rho(k+1)/(rho(k)*omega)); end x = x(:,m+2); end Listing 13.4: A conjugate gradient-like method for antisymmetric systems.

The number of operations if B is a full matrix are O(n) for the first and third lines in the for-loop, O(n2 ) for the second line and O(1) for the fourth line. The following code tests the algorithm. code/test_cgantisymm.m 1 2 3 4 5 6 7

n=8; C=tril(rand(n)); B = C-C’; b=rand(n,1); x=cgantisymm(B,b); norm( (eye(n)-B)*x-b ) Listing 13.5: Testing the conjugate gradient-like method for antisymmetric systems.

The last line here tests if the solution found by cgantisymm solves the system. Ideally it would return zero, but due to roundoff errors a small number is returned. If space is at a premium we note that it is enough to use two arrays for x, two arrays for r and two real numbers for ρ. The code above, however, stores all iterations of these.

13 The Conjugate Gradient Method

234

c) Show that hr k , r j i = 0 for j = 0, 1, . . . , k − 1. Solution. We first note that r j ∈ Wj+1 since xj ∈ Wj and r j = b − Axj = b − xj + Bxj . Since the exercise imposes that hAxk , wi = hb, wi for all w ∈ Wk , it follows that hr k , wi = 0 for all w ∈ Wk . The orthogonality follows by taking w = r j ∈ Wj+1 for j = 0, 1, . . . , k − 1. d) Show that if k ≤ m + 1 then Wk = span(r 0 , r 1 , . . . , r k−1 ) and dim Wk = k. Solution. Let Vk := span(r 0 , r 1 , . . . , r k−1 ). Since r 0 , r 1 , . . . , r k−1 are orthogonal Pk−1 and nonzero they are linearly independent. Indeed, If i=0 ci r i = 0 for some c0 , . . . , ck−1 , then *k−1 + X ci r i , r j = cj hr j , r j i = 0, j = 0, . . . , k − 1, i=0

so that cj = 0. It follows that dim Vk = k. Also r j ∈ Wj+1 ⊂ Wk , j = 0, . . . , k−1 implies that Vk ⊂ Wk . Since dim Wk ≤ k (since Wk is spanned by k vectors), it follows that actually dim Wk = k. It follows that {r 0 , r 1 , . . . , r k−1 } also must be a basis for Wk , so that Vk = Wk . e) Show that if 1 ≤ k ≤ m − 1 then Br k = αk r k+1 + βk r k−1 ,

(13.71)

where αk := hBr k , r k+1 i/ρk+1 and βk := hBr k , r k−1 i/ρk−1 . Solution. Since Br k ∈ Wk+2 = span(r 0 , r 1 , . . . , r k+1 ) and the r j are nonzero and orthogonal we have Br k =

k+1 X j=0

hBr k , r j i rj . ρj

By c) hBr k , r j i = −hr k , Br j i = 0,

j = 0, . . . , k − 2

and hBr k , r k i = 0 by Exercise 13.13a). Thus we obtain (13.71), since only the indices k − 1 and k + 1 contribute. f) Define α0 := hBr 0 , r 1 i/ρ1 and show that α0 = 1. Solution. Since x1 ∈ W1 we have x1 = γb for some γ ∈ R. By (13.64) and definition of x1 ,

13 The Conjugate Gradient Method

γhb, bi = γhAb, bi = hAx1 , bi = hb, bi

235

=⇒

γ = 1,

showing that x1 = b. Since r 0 = b we find α0 =

hBb, r 1 i hb − Ab, r 1 i hr 1 , r 1 i = = = 1. ρ1 ρ1 ρ1

g) Show that if 1 ≤ k ≤ m − 1 then βk = −αk−1 ρk /ρk−1 . Solution. Since B T = −B and xT y = y T x for any x, y ∈ Rn βk ρk−1 = hBr k , r k−1 i = hr k , B T r k−1 i = −hr k , Br k−1 i = = −hBr k−1 , r k i = −αk−1 ρk . The result follows after dividing by ρk−1 . h) Show that hr k+1 , A−1 r k+1 i = hr k+1 , A−1 r j i,

j = 0, 1, . . . , k.

(13.72)

Hint. Show that A−1 (r k+1 − r j ) ∈ Wk+1 . Solution. We have A−1 (r k+1 − r j ) = A−1 (b − Axk+1 − b + Axj ) = xj − xk+1 ∈ Wk+1 But since hr k+1 , wi = 0 for all w ∈ Wk+1 hr k+1 , A−1 r k+1 i = hr k+1 , A−1 (r k+1 − r j )i + hr k+1 , A−1 r j i = hr k+1 , A−1 r j i, where the first inner product is zero since r k+1 ∈ Wk+2 , A−1 (r k+1 −r j ) ∈ Wk+1 . i) Use (13.71) and (13.72) to show that αk + βk = 1 for k = 1, 2, . . . , m − 1. Solution. By (13.71) Ar k = r k − Br k = r k − αk r k+1 − βk r k−1 , so that r k = A−1 r k − αk A−1 r k+1 − βk A−1 r k−1 By (13.72) we have hr k+1 , A−1 r k−1 i = hr k+1 , A−1 r k+1 i . By orthogonality of the residuals

13 The Conjugate Gradient Method

236

0 = hr k+1 , r k i = hr k+1 , A−1 r k i − (αk + βk )hr k+1 , A−1 r k+1 i = hr k+1 , x∗ − xk i − (αk + βk )hr k+1 , x∗ − xk+1 i = hr k+1 , x∗ i − (αk + βk )hr k+1 , x∗ i. 6 0 we obtain αk + βk = 1. Since hr k+1 , x∗ i = j) Show that αk ≥ 1 for k = 1, 2, . . . , m − 1. Solution. We have αk = 1 − βk = 1 + αk−1 ρk /ρk−1 . Since α0 = 1 and ρj > 0 for j = 1, . . . , m − 1, the result follows by induction. k) Show that xk , r k and ω k satisfy the recurrence relations (13.68), (13.69) and (13.70). Solution. Proof of (13.70): By (13.71) we have   βk 1 1 1 r k+1 = − r k−1 + Br k = 1 − r k−1 + Br k . αk αk αk αk With ωk := 1/αk we obtain (13.70) for k ≥ 1. For k = 0 we obtain from (13.71) that r 1 = Br 0 = Bb = b − Ab, and this is consistent since x1 = b. Proof of (13.68): We have ω0 = 1/α0 = 1. Moreover, the proof of j) implies (13.68) directly. Proof of (13.69): Using (13.70) we find b − Axk+1 = (1 − ωk )(b − Axk−1 ) + ωk (b − Axk ) − ωk Ar k = b − (1 − ωk )Axk−1 − ωk Axk − ωk Ar k , from which (13.69) follows after canceling b and multiplying with A−1 on both sides. This holds for k = 0 since (13.70) holds for k = 0. Moreover, since r 0 = r −1 = b we have x0 = x−1 = 0.

Exercises section 13.4

Exercise 13.15: Another explicit formula for Chebyshev polynomials Show that Tn (t) = cosh(narccosh t) for t ≥ 1, where arccosh is the inverse function of cosh x := (ex + e−x )/2. Solution. For any integer n ≥ 0, write  Pn (t) := cosh nφ(t) , φ(t) := arccosh(t),

t ∈ [1, ∞).

13 The Conjugate Gradient Method

237

It is well known, and easily verified, that cosh(x + y) = cosh(x) cosh(y) + sinh(x) sinh(y). Using this and that cosh is even and sinh is odd, one finds   Pn+1 (t) + Pn−1 (t) = cosh (n + 1)φ + cosh (n − 1)φ = cosh(nφ) cosh(φ) + sinh(nφ) sinh(φ)+ cosh(nφ) cosh(φ) − sinh(nφ) sinh(φ) = 2 cosh(φ) cosh(nφ) = 2tPn (t). It follows that Pn and Tn satisfy the same recurrence relation. Since they also share initial terms P0 (t) = 1 = T0 (t) and P1 (t) = t = T1 (t), necessarily Pn = Tn for any n ≥ 0. Exercise 13.16: Maximum of a convex function Show that if f : [a, b] → R is convex then maxa≤x≤b f (x) ≤ max{f (a), f (b)}. Solution. This is a special case of the maximum principle in convex analysis, which states that a convex function, defined on a compact convex set Ω, attains its maximum on the boundary of Ω. Let f : [a, b] → R be a convex function. Consider an arbitrary point x = (1 − λ)a + λb ∈ [a, b], with 0 ≤ λ ≤ 1. Since f is convex,  f (x) = f (1 − λ)a + λb ≤ (1 − λ)f (a) + λf (b) ≤ (1 − λ) max{f (a), f (b)} + λ max{f (a), f (b)} = max{f (a), f (b)}, see Figure 13.2. It follows that f (x) ≤ max{f (a), f (b)} and that f attains its maximum on the boundary of its domain of definition.

13 The Conjugate Gradient Method

238

f(b)

(1

)f(a) + f(b) f(x) f(a) a

x = (1

)a + b

b

Figure 13.2: A convex function f is bounded from above by convex combinations.

Exercises section 13.5

Exercise 13.17: Variable coefficient For m = 2, show that (13.57) takes the form      a1,1 −c3/2,1 −c1,3/2 0 v1,1 (dv)1,1  −c3/2,1 a2,2     0 −c2,3/2   v2,1  = (dv)2,1  , Ax =   −c1,3/2     0 a3,3 −c3/2,2 v1,2 (dv)1,2  0 −c2,3/2 −c3/2,2 a4,4 v2,2 (dv)2,2 where



   a1,1 c1/2,1 + c1,1/2 + c1,3/2 + c3/2,1  a2,2   c3/2,1 + c2,1/2 + c2,3/2 + c5/2,1       a3,3  =  c1/2,2 + c1,3/2 + c1,5/2 + c3/2,2  . a4,4 c3/2,2 + c2,3/2 + c2,5/2 + c5/2,2

Show that the matrix A is symmetric, and if c(x, y) > 0 for all (x, y) ∈ Ω then it is strictly diagonally dominant. Solution. In the equation for entry (j, k), vj,k contributes with cj−1/2,k +cj,k−1/2 − cj+1/2,k − cj,k+1/2 . This means that the contributions in entries (1, 1), (2, 1), (1, 2), and (2, 2) (i.e., in the order obtained by stacking the columns) are   c1/2,1 + c1,1/2 + c3/2,1 + c1,3/2 c3/2,1 + c2,1/2 + c5/2,1 + c2,3/2    c1/2,2 + c1,3/2 + c3/2,2 + c1,5/2  . c3/2,2 + c2,3/2 + c5/2,2 + c2,5/2

13 The Conjugate Gradient Method

239

These are the expressions for ai,i , and they are placed on the diagonal. Also, in the equation for entry (j, k), • vj−1,k contributes with −cj−1/2,k . This tribute in the first subdiagonal. • vj+1,k contributes with −cj+1/2,k . This tribute in the first superdiagonal. • vj,k−1 contributes with −cj,k−1/2 . This tribute in the second subdiagonal. • vj,k+1 contributes with −cj,k+1/2 . This tribute in the second superdiagonal.

means that −c3/2,1 and −c3/2,2 conmeans that −c3/2,1 and −c3/2,2 conmeans that −c1,3/2 and −c2,3/2 conmeans that −c1,3/2 and −c2,3/2 con-

This accounts for all entries in the matrix. The matrix is clearly symmetric. To check diagonal dominance one goes through all four rows. For the first row, |a1,1 | = c1/2,1 + c1,1/2 + c3/2,1 + c1,3/2 , and the off-diagonal contribution is c3/2,1 +c1,3/2 . The difference is c1/2,1 +c1,1/2 > 0. For all other rows one sees in the same way that the two nonzero off-diagonal entries appear in ai,i , so that terms cancel in the same way. It follows that the matrix is strictly diagonally dominant.

Chapter 14

Numerical Eigenvalue Problems

Exercises section 14.1

Exercise 14.1: Yes or No (Exam exercise 2006-1) Answer simply yes or no to the following questions: a) Every matrix A ∈ Cm×n has a singular value decomposition? Solution. Yes, we have seen that this is the case. b) The algebraic multiplicity of an eigenvalue is always less than or equal to the geometric multiplicity? Solution. No, we have seen that it is the other way around. c) The QR factorization of a matrix A ∈ Rn×n can be determined by Householder transformations in O(n2 ) arithmetic operations? Solution. No, we have seen that 4n3 /3 operations are required in order to bring a matrix to upper triangular form using Householder transformations. d) Let ρ(A) be the spectral radius of A ∈ Cn×n . Then limk→∞ Ak = 0 if and only if ρ(A) < 1? Solution. Yes, we have seen in Theorem 12.10 that this is the case.



Exercises section 14.2

Exercise 14.2: Nonsingularity using Gershgorin
Consider the matrix
$$ A = \begin{pmatrix} 4 & 1 & 0 & 0 \\ 1 & 4 & 1 & 0 \\ 0 & 1 & 4 & 1 \\ 0 & 0 & 1 & 4 \end{pmatrix}. $$
Show using the Gershgorin circle theorem (Theorem 14.1) that A is nonsingular.

Solution. We compute the Gershgorin disks R_1 = R_4 = C_1 = C_4 = {z ∈ C : |z − 4| ≤ 1} and R_2 = R_3 = C_2 = C_3 = {z ∈ C : |z − 4| ≤ 2}, see Figure 14.1. Then, by the Gershgorin circle theorem, each eigenvalue of A lies in (R_1 ∪ ⋯ ∪ R_4) ∩ (C_1 ∪ ⋯ ∪ C_4) = {z ∈ C : |z − 4| ≤ 2}. In particular A has only nonzero eigenvalues, implying that A must be nonsingular.

Figure 14.1: Gershgorin disks in Exercise 14.2.
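As a quick numerical illustration of this conclusion (a sketch, not part of the original solution), the eigenvalues of A can be compared with the Gershgorin bound directly:

A = [4 1 0 0; 1 4 1 0; 0 1 4 1; 0 0 1 4];
lambda = eig(A);
disp(max(abs(lambda - 4)))   % <= 2: all eigenvalues lie in the largest disk
disp(min(abs(lambda)))       % > 0: no zero eigenvalue, so A is nonsingular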


Exercise 14.3: Gershgorin, strictly diagonally dominant matrix
Show using the Gershgorin circle theorem (Theorem 14.1) that a strictly diagonally dominant matrix A (|a_{i,i}| > Σ_{j≠i} |a_{i,j}| for all i) is nonsingular.

Solution. Suppose A is a strictly diagonally dominant matrix. For such a matrix, one finds Gershgorin disks
$$ R_i = \Bigl\{ z \in \mathbb{C} : |z - a_{ii}| \le \sum_{j \ne i} |a_{ij}| \Bigr\}. $$
Since |a_{ii}| > Σ_{j≠i} |a_{ij}| for all i, the origin is not an element of any of the disks R_i, and therefore neither of the union ∪_i R_i, nor of the intersection (∪_i R_i) ∩ (∪_i C_i). Then, by the Gershgorin circle theorem, A only has nonzero eigenvalues, implying that det(A) = det(A − 0·I) ≠ 0 and A is nonsingular.

Exercise 14.4: Gershgorin disks (Exam exercise 2009-2)
The eigenvalues of A ∈ R^{n,n} lie inside R ∩ C, where R := R_1 ∪ ⋯ ∪ R_n is the union of the row disks R_i of A, and C := C_1 ∪ ⋯ ∪ C_n is the union of the column disks C_j. You do not need to prove this. Write a MATLAB function [s,r,c]=gershgorin(A) that computes the centers s = [s_1, ..., s_n] ∈ R^n of the row and column disks, and their radii r = [r_1, ..., r_n] ∈ R^n and c = [c_1, ..., c_n] ∈ R^n, respectively.

Solution.
code/gershgorin.m

function [s,r,c] = gershgorin(A)
n=length(A); s=diag(A);
r=zeros(n,1); c=r;
for i=1:n
  for j=1:n
    r(i)=r(i)+abs(A(i,j));
    c(i)=c(i)+abs(A(j,i));
  end
  r(i)=r(i)-abs(s(i));
  c(i)=c(i)-abs(s(i));
end

Listing 14.1: Given a matrix A, compute the centers s and radii r and c of the row and column Gershgorin disks — for-loops

code/gershgorinv.m


function [s,r,c] = gershgorinv(A)
n=length(A); s=diag(A); e=ones(n,1);
r=abs(A)*e-abs(s);
c=(abs(A))'*e-abs(s);

Listing 14.2: Given a matrix A, compute the centers s and radii r and c of the row and column Gershgorin disks — vectorized implementation
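A possible way to call the two functions above (a usage sketch on the matrix from Exercise 14.2):

A = [4 1 0 0; 1 4 1 0; 0 1 4 1; 0 0 1 4];
[s, r, c] = gershgorinv(A);   % gershgorin(A) gives the same result
disp([s r c])                 % centers 4, row and column radii 1 or 2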

Exercises section 14.3

Exercise 14.5: Continuity of eigenvalues
Suppose
$$ A(t) := D + t(A - D), \qquad D := \operatorname{diag}(a_{11}, \dots, a_{nn}), \qquad t \in \mathbb{R}, $$
0 ≤ t_1 < t_2 ≤ 1 and that μ is an eigenvalue of A(t_2). Show, using Theorem 14.2 with A = A(t_1) and E = A(t_2) − A(t_1), that A(t_1) has an eigenvalue λ such that
$$ |\lambda - \mu| \le C\,(t_2 - t_1)^{1/n}, \qquad C \le 2\bigl(\|D\|_2 + \|A - D\|_2\bigr). \qquad (\star) $$
Thus, as a function of t, every eigenvalue of A(t) is a continuous function of t.

Solution. Applying Theorem 14.2¹ to the matrix A(t_1) with perturbation E := A(t_2) − A(t_1), one finds that A(t_1) has an eigenvalue λ such that
$$ |\lambda - \mu| \le \bigl(\|A(t_1)\|_2 + \|A(t_2)\|_2\bigr)^{1-1/n}\, \|A(t_2) - A(t_1)\|_2^{1/n}. $$
Applying the triangle inequality to the definition of A(t_1) and A(t_2), and using that the function x ↦ x^{1−1/n} is monotone increasing,
$$ |\lambda - \mu| \le \bigl(2\|D\|_2 + (t_1 + t_2)\|A - D\|_2\bigr)^{1-1/n}\, \|A - D\|_2^{1/n}\, (t_2 - t_1)^{1/n}. $$
Finally, using that t_1 + t_2 ≤ 2, that the function x ↦ x^{1/n} is monotone increasing, and that ‖A − D‖_2 ≤ 2‖D‖_2 + 2‖A − D‖_2, one obtains (⋆).

Exercise 14.6: ∞-norm of a diagonal matrix
Give a direct proof that ‖A‖_∞ = ρ(A) if A is diagonal.

¹ L. Elsner, "An optimal bound for the spectral variation of two matrices", Linear Algebra and its Applications 71 (1985), 77–80.


Solution. Let A = diag(λ_1, ..., λ_n) be a diagonal matrix. The spectral radius ρ(A) is the largest eigenvalue in absolute value, say |λ_i| = ρ(A). One has
$$ \|A\|_\infty = \max_{\|x\|_\infty = 1} \|Ax\|_\infty = \max_{\|x\|_\infty = 1} \max\{|\lambda_1 x_1|, \dots, |\lambda_n x_n|\} \le \rho(A), $$
as |λ_1|, ..., |λ_n| ≤ |λ_i| = ρ(A) and since the components of any vector x satisfy |x_1|, ..., |x_n| ≤ ‖x‖_∞. Moreover, this bound is attained for the standard basis vector x = e_i, since ‖Ae_i‖_∞ = |λ_i| = ρ(A).

Exercise 14.7: Eigenvalue perturbations (Exam exercise 2010-2)
Let A = [a_{kj}], E = [e_{kj}], and B = [b_{kj}] be matrices in R^{n,n} with
$$ a_{kj} = \begin{cases} 1, & j = k+1, \\ 0, & \text{otherwise}, \end{cases} \qquad e_{kj} = \begin{cases} \varepsilon, & k = n,\ j = 1, \\ 0, & \text{otherwise}, \end{cases} \qquad (14.10) $$

and B = A + E, where 0 < ε < 1. Thus for n = 4,
$$ A := \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix}, \quad E := \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ \varepsilon & 0 & 0 & 0 \end{pmatrix}, \quad B := \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ \varepsilon & 0 & 0 & 0 \end{pmatrix}. $$

a) Find the eigenvalues of A and B.

Solution. Since A is triangular its eigenvalues λ_j are its diagonal elements. Thus the eigenvalues of A are λ_j = 0, j = 1, 2, ..., n. For B the equation Bx = λx gives λx_k = x_{k+1}, k = 1, ..., n − 1, and λx_n = εx_1. Thus
$$ \lambda^n x_n = \lambda^{n-1}\varepsilon x_1 = \lambda^{n-2}\varepsilon x_2 = \cdots = \varepsilon x_n. $$
We must have x_n ≠ 0 since otherwise x = 0. Canceling x_n we find λ^n = ε, and the eigenvalues of B are
$$ \mu_j = \mu\, e^{2\pi i j/n}, \qquad j = 1, \dots, n, $$
where μ = ε^{1/n} and i = √−1. Alternatively, expanding the determinant of B by its first column yields its characteristic polynomial, π_B(λ) = (−1)^n(λ^n − ε), leading to the same eigenvalues.

b) Show that ‖A‖_2 = ‖B‖_2 = 1 for arbitrary n ∈ N.

Solution. The spectral norm of A is the largest singular value, which is the square root of the largest eigenvalue of A^T A. Now A^T A = diag(0, 1, ..., 1) is diagonal with eigenvalues 0 and 1. Thus ‖A‖_2 = 1. For B we find B^T B = diag(ε², 1, ..., 1) and, since 0 < ε < 1, ‖B‖_2 = 1.

c) Recall Elsner's Theorem (Theorem 14.2). Let A, E, B be given by (14.10). What upper bound does (14.1) in Elsner's theorem give for the eigenvalue μ = ε^{1/n} of B? How sharp is this upper bound?

Solution. We have ‖A‖_2 = ‖B‖_2 = 1. Now E^T E = diag(ε², 0, ..., 0), so ‖E‖_2 = ε. Elsner's theorem says that there is an eigenvalue λ of A such that
$$ |\mu - \lambda| \le \bigl(\|A\|_2 + \|B\|_2\bigr)^{1-1/n} \|E\|_2^{1/n} = 2^{1-1/n}\varepsilon^{1/n} \le 2\varepsilon^{1/n}. $$
Since λ = 0 the exact difference is |μ − λ| = μ = ε^{1/n}. The upper bound differs from the exact value by a factor less than 2, so it is quite sharp; in particular it captures the ε^{1/n} behavior.
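The eigenvalues found in a) and the norms in b) can be checked numerically; the sketch below (not part of the exam solution) uses n = 4 and a sample value of ε:

n = 4; ep = 0.1;                       % sample epsilon (0 < ep < 1)
A = diag(ones(n-1,1), 1);
B = A; B(n,1) = ep;
mu = eig(B);
disp(max(abs(mu.^n - ep)))             % ~0: every eigenvalue of B satisfies mu^n = ep
disp(abs(mu.'))                        % all eigenvalues have modulus ep^(1/n)
disp([norm(A) norm(B)])                % both spectral norms equal 1, as in b)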

Exercises section 14.4

Exercise 14.8: Number of arithmetic operations, Hessenberg reduction
Show that the number of arithmetic operations for Algorithm 14.1 in the book is of the order (10/3)n³ = 5G_n.

Solution. An arithmetic operation is a floating point operation, so we need not bother with any integer operations, like the computation of k + 1 in the indices. As we are only interested in the overall complexity, we count only terms that can contribute to this. For the first line involving C, the multiplication v'*C involves (n − k)² floating point multiplications and about (n − k)² floating point additions. Next, computing the outer product v*(v'*C) involves (n − k)² floating point multiplications, and subtracting C - v*(v'*C) needs (n − k)² subtractions. This line therefore involves (almost) 4(n − k)² arithmetic operations. Similarly we find 4n(n − k) arithmetic operations for the line after that. These 4(n − k)² + 4n(n − k) arithmetic operations need to be carried out for k = 1, ..., n − 2, meaning that the algorithm requires of the order
$$ N := \sum_{k=1}^{n-2} \bigl( 4(n-k)^2 + 4n(n-k) \bigr) $$


arithmetic operations. This sum can be computed either by using the formulas for Σ_{k=1}^{n−2} k and Σ_{k=1}^{n−2} k², or by using that the highest order term can be found by evaluating an associated integral. One finds that the algorithm requires of the order
$$ N \sim \int_0^n \bigl( 4(n-k)^2 + 4n(n-k) \bigr)\, dk = \frac{10}{3}\, n^3 $$
arithmetic operations.

Exercise 14.9: Assemble Householder transformations
Show that the number of arithmetic operations required by Algorithm 14.2 in the book is of the order (4/3)n³ = 2G_n.

Solution. The multiplication v'*C involves (n − k)² floating point multiplications and about (n − k)² floating point additions. Next, computing the outer product v*(v'*C) involves (n − k)² floating point multiplications, and subtracting C - v*(v'*C) needs (n − k)² subtractions. In total we find (almost) 4(n − k)² arithmetic operations, which have to be carried out for k = 1, ..., n − 2, meaning that the algorithm requires of the order
$$ N := \sum_{k=1}^{n-2} 4(n-k)^2 $$

arithmetic operations. This sum can be computed either by using the formulae for Σ_{k=1}^{n−2} k and Σ_{k=1}^{n−2} k², or by using that the highest order term can be found by evaluating an associated integral. One finds that the algorithm requires of the order
$$ N \sim \int_0^n 4(n-k)^2\, dk = \frac{4}{3}\, n^3 $$
arithmetic operations.

Exercise 14.10: Tridiagonalize a symmetric matrix
If A is real and symmetric we can modify Algorithm 14.1 as follows. To find A_{k+1} from A_k we have to compute V_k E_k V_k where E_k is symmetric. Dropping subscripts we have to compute a product of the form G = (I − vv^T)E(I − vv^T). Let w := Ev, β := ½ v^T w and z := w − βv. Show that G = E − vz^T − zv^T. Since G is symmetric, only the sub- or superdiagonal elements of G need to be computed. Computing G in this way, it can be shown that we need O(4n³/3) operations to tridiagonalize a symmetric matrix by orthonormal similarity transformations. This is less than half the work to reduce a nonsymmetric matrix to upper Hessenberg form.^a

^a We refer to G. W. Stewart, "Matrix Algorithms Volume II: Eigensystems", SIAM, Philadelphia, 2001, for a detailed algorithm.


Solution. We get z = w − βv = Ev − ½ vv^T Ev and z^T = v^T E − ½ v^T Evv^T, which yields
$$ G = (I - vv^T)E(I - vv^T) = E - vv^T E - Evv^T + vv^T Evv^T = E - v\Bigl(v^T E - \tfrac12 v^T E v v^T\Bigr) - \Bigl(Ev - \tfrac12 v v^T E v\Bigr)v^T = E - vz^T - zv^T. $$
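The identity can also be confirmed numerically; the following sketch (an illustration only, not the book's Algorithm 14.1) checks G = E − vz^T − zv^T for a random symmetric E and a random v:

n = 6;
E = randn(n); E = E + E';              % random symmetric matrix
v = randn(n,1);
w = E*v; beta = 0.5*(v'*w); z = w - beta*v;
G1 = (eye(n) - v*v')*E*(eye(n) - v*v');
G2 = E - v*z' - z*v';
disp(norm(G1 - G2, inf))               % zero up to rounding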

Exercises section 14.5

Exercise 14.11: Counting eigenvalues
Consider the matrix A in Exercise 14.2. Determine the number of eigenvalues greater than 4.5.

Solution. Let
$$ A = \begin{pmatrix} 4 & 1 & 0 & 0 \\ 1 & 4 & 1 & 0 \\ 0 & 1 & 4 & 1 \\ 0 & 0 & 1 & 4 \end{pmatrix}, \qquad \alpha = 4.5. $$

Applying the recursive procedure described in Corollary 14.2, we find the diagonal elements d_1(α), d_2(α), d_3(α), d_4(α) of the matrix D in the factorization A − αI = LDL^T:
$$ d_1(\alpha) = 4 - \tfrac{9}{2} = -\tfrac{1}{2}, \quad d_2(\alpha) = 4 - \tfrac{9}{2} - \frac{1^2}{-1/2} = +\tfrac{3}{2}, \quad d_3(\alpha) = 4 - \tfrac{9}{2} - \frac{1^2}{+3/2} = -\tfrac{7}{6}, \quad d_4(\alpha) = 4 - \tfrac{9}{2} - \frac{1^2}{-7/6} = +\tfrac{5}{14}. $$
As precisely two of these are negative, Corollary 14.2 implies that there are precisely two eigenvalues of A strictly smaller than α = 4.5. As det(A − 4.5I) = det(LDL^T) = d_1(α)d_2(α)d_3(α)d_4(α) ≠ 0, the matrix A does not have an eigenvalue equal to 4.5. We conclude that the remaining two eigenvalues must be bigger than 4.5.
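The same numbers can be produced by a few lines of MATLAB (a sketch of the tridiagonal LDL^T recursion used in this solution, together with a check against eig):

A = [4 1 0 0; 1 4 1 0; 0 1 4 1; 0 0 1 4]; alpha = 4.5;
d = zeros(4,1);
d(1) = A(1,1) - alpha;
for k = 2:4
  d(k) = A(k,k) - alpha - A(k,k-1)^2/d(k-1);   % diagonal of D in A - alpha*I = L*D*L'
end
disp(d.')                    % -0.5000  1.5000  -1.1667  0.3571
disp(sum(d < 0))             % 2 eigenvalues smaller than 4.5
disp(sum(eig(A) > alpha))    % agrees: 2 eigenvalues greater than 4.5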


Exercise 14.12: Overflow in LDL* factorization
Let for n ∈ N
$$ A_n = \begin{pmatrix} 10 & 1 & 0 & \cdots & 0 \\ 1 & 10 & 1 & \ddots & \vdots \\ 0 & \ddots & \ddots & \ddots & 0 \\ \vdots & \ddots & 1 & 10 & 1 \\ 0 & \cdots & 0 & 1 & 10 \end{pmatrix} \in \mathbb{R}^{n \times n}. $$

a) Let d_k be the diagonal elements of D in an LDL* factorization of A_n. Show that 5 + √24 < d_k ≤ 10, k = 1, 2, ..., n.

Solution. Since A_n is tridiagonal and strictly diagonally dominant, it has a unique LU factorization by Theorem 2.3. From Equations (2.16), one can determine the corresponding LDL* factorization. For n = 1, 2, ..., let d_{n,k}, with k = 1, ..., n, be the diagonal elements of the diagonal matrix D_n in a symmetric factorization of A_n. We proceed by induction. Let n ≥ 1 be any positive integer. For the first diagonal element, corresponding to k = 1, Equations (2.16) immediately yield 5 + √24 < d_{n,1} = 10 ≤ 10. Next, assume that 5 + √24 < d_{n,k} ≤ 10 for some 1 ≤ k < n. We show that this implies that 5 + √24 < d_{n,k+1} ≤ 10. First observe that
$$ \bigl(5 + \sqrt{24}\bigr)^2 = 25 + 10\sqrt{24} + 24 = 49 + 10\sqrt{24}. $$
From Equations (2.16) we know that d_{n,k+1} = 10 − 1/d_{n,k}, which yields d_{n,k+1} < 10 since d_{n,k} > 0. Moreover, 5 + √24 < d_{n,k} implies
$$ d_{n,k+1} = 10 - \frac{1}{d_{n,k}} > 10 - \frac{1}{5+\sqrt{24}} = \frac{50 + 10\sqrt{24} - 1}{5+\sqrt{24}} = \frac{\bigl(5+\sqrt{24}\bigr)^2}{5+\sqrt{24}} = 5 + \sqrt{24}. $$
Hence 5 + √24 < d_{n,k+1} ≤ 10, and we conclude that 5 + √24 < d_{n,k} ≤ 10 for any n ≥ 1 and 1 ≤ k ≤ n.

b) Show that D_n := det(A_n) > (5 + √24)^n. Give n_0 ∈ N such that your computer gives an overflow when D_{n_0} is computed in floating point arithmetic.

Solution. We have A_n = LDL^T with L triangular and with ones on the diagonal. As a consequence,
$$ \det(A_n) = \det(L)\det(D)\det(L^T) = \det(D) = \prod_{i=1}^n d_i > \bigl(5 + \sqrt{24}\bigr)^n. $$
An overflow in MATLAB is indicated by a return of Inf. Since 5 + √24 ≈ 9.899 and the largest double precision number is realmax ≈ 1.80 · 10^{308}, the lower bound (5 + √24)^n already exceeds realmax for n ≥ 310, so n_0 = 310 gives an overflow in IEEE double precision arithmetic; the exact threshold depends on your platform.
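The following sketch runs the recursion d_{k+1} = 10 − 1/d_k and accumulates the determinant until the product overflows, which locates n_0 on the machine at hand:

d = 10; detA = 10; n = 1;
while ~isinf(detA)
  d = 10 - 1/d;              % next diagonal element, lies in (5+sqrt(24), 10]
  detA = detA*d;             % det(A_n) = d_1*d_2*...*d_n
  n = n + 1;
end
fprintf('overflow at n = %d\n', n)   % about 310 in IEEE double precision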


Exercise 14.13: Simultaneous diagonalization
(Simultaneous diagonalization of two symmetric matrices by a congruence transformation.) Let A, B ∈ R^{n×n} where A^T = A and B is symmetric positive definite. Then B = U^T D U for some orthogonal matrix U and a diagonal matrix D = diag(d_1, ..., d_n) with positive diagonal elements. Let Â = D^{−1/2} U A U^T D^{−1/2}, where D^{−1/2} := diag(d_1^{−1/2}, ..., d_n^{−1/2}).

a) Show that Â is symmetric.

Solution. Since D^{−1/2}, like any diagonal matrix, and A are symmetric, one has
$$ \hat{A}^T = \bigl(D^{-1/2} U A U^T D^{-1/2}\bigr)^T = D^{-1/2} U A^T U^T D^{-1/2} = D^{-1/2} U A U^T D^{-1/2} = \hat{A}. $$

b) Write Â = Û^T D̂ Û where Û is orthogonal and D̂ is diagonal. Set E := U^T D^{−1/2} Û^T. Show that E is nonsingular and that
$$ E^T A E = \hat{D}, \qquad E^T B E = I. $$

Solution. Since Â is symmetric, it admits an orthogonal diagonalization Â = Û^T D̂ Û. Let E := U^T D^{−1/2} Û^T. Then E, as the product of three nonsingular matrices, is nonsingular. Its inverse is given by F := Û D^{1/2} U, since
$$ F E = \hat{U} D^{1/2} U U^T D^{-1/2} \hat{U}^T = \hat{U} D^{1/2} D^{-1/2} \hat{U}^T = \hat{U} \hat{U}^T = I, $$
and similarly E F = I. Moreover, from Â = Û^T D̂ Û it follows that Û Â Û^T = D̂, which gives
$$ E^T A E = \hat{U} D^{-1/2} U A U^T D^{-1/2} \hat{U}^T = \hat{U} \hat{A} \hat{U}^T = \hat{D}. $$
Similarly B = U^T D U implies U B U^T = D, which yields
$$ E^T B E = \hat{U} D^{-1/2} U B U^T D^{-1/2} \hat{U}^T = \hat{U} D^{-1/2} D^{1/2} D^{1/2} D^{-1/2} \hat{U}^T = I. $$
We conclude that for a symmetric matrix A and symmetric positive definite matrix B, the congruence transformation X ↦ E^T X E simultaneously diagonalizes the matrices A and B, and even maps B to the identity matrix.²

² For a more general result see Theorem 10.1 in P. Lancaster and L. Rodman, "Canonical forms for Hermitian matrix pairs under strict equivalence and congruence", SIAM Review 47 (2005), 407–443.
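A numerical illustration of b) (a sketch; any symmetric A and symmetric positive definite B will do, and eig is used for both orthogonal diagonalizations):

n = 5;
A = randn(n); A = A + A';                 % symmetric
B = randn(n); B = B*B' + n*eye(n);        % symmetric positive definite
[V, D] = eig(B); U = V';                  % B = U'*D*U with U orthogonal, D > 0
Dm = diag(1./sqrt(diag(D)));              % D^(-1/2)
Ahat = Dm*U*A*U'*Dm;
[Vh, Dhat] = eig((Ahat + Ahat')/2);       % Ahat = Uhat'*Dhat*Uhat
Uhat = Vh';
E = U'*Dm*Uhat';
disp(norm(E'*A*E - Dhat, inf))            % ~0: E'*A*E is the diagonal matrix Dhat
disp(norm(E'*B*E - eye(n), inf))          % ~0: E'*B*E = I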


Exercise 14.14: Program code for one eigenvalue
Suppose A = tridiag(c, d, c) is symmetric and tridiagonal with elements d_1, ..., d_n on the diagonal and c_1, ..., c_{n−1} on the neighboring subdiagonals. Let λ_1 ≥ λ_2 ≥ ⋯ ≥ λ_n be the eigenvalues of A. We shall write a program to compute one eigenvalue λ_m for a given m using bisection and the method outlined in (14.9).

a) Write a function k=counting(c,d,x) which for given x counts the number of eigenvalues of A strictly greater than x. Use the replacement described above Exercise 14.14 in the book if one of the d_j(x) is close to zero.

Solution. Let A = tridiag(c, d, c) and x be as in the exercise. If d_j(x) is close to zero, e.g. smaller in absolute value than δ_j := |c_j| ε_M with ε_M the machine epsilon in MATLAB, it is suggested to replace d_j(x) by δ_j. The following MATLAB program counts the number k of eigenvalues of A strictly less than x.

code/count.m

function k=count(c,d,x)
n = length(d); k = 0;
u = d(1)-x;
if u < 0
  k = k+1;
end
for i = 2:n
  umin = abs(c(i-1))*eps;
  if abs(u) < umin
    if u < 0
      u = -umin;
    else
      u = umin;
    end
  end
  u = d(i)-x-c(i-1)^2/u;
  if u < 0
    k = k+1;
  end
end

Listing 14.3: Count the number k of eigenvalues strictly less than x of a tridiagonal matrix A = tridiag(c, d, c).

b) Write a function lambda=findeigv(c,d,m) which first estimates an interval (a, b] containing all eigenvalues of A and then generates a sequence {(a_j, b_j]} of intervals, each containing λ_m. Iterate until b_j − a_j ≤ (b − a)ε_M, where ε_M is MATLAB's machine epsilon eps. Typically ε_M ≈ 2.22 × 10^{−16}.


Solution. Let A = tridiag(c, d, c) and m be as in the exercise. The following MATLAB program computes a small interval [a, b] around the mth eigenvalue λ_m of A and returns the point λ in the middle of this interval.

code/findeigv.m

function lambda = findeigv(c,d,m)
n = length(d);
a = d(1)-abs(c(1)); b = d(1)+abs(c(1));
for i = 2:n-1
  a = min(a, d(i)-abs(c(i-1))-abs(c(i)));
  b = max(b, d(i)+abs(c(i-1))+abs(c(i)));
end
a = min(a, d(n)-abs(c(n-1)));
b = max(b, d(n)+abs(c(n-1)));
h = b-a;
while abs(b-a) > eps*h
  c0 = (a+b)/2;
  k = count(c,d,c0);
  if k < m
    a = c0;
  else
    b = c0;
  end
end
lambda = (a+b)/2;

Listing 14.4: Compute a small interval around the mth eigenvalue λ_m of a tridiagonal matrix A = tridiag(c, d, c) and return the point λ in the middle of this interval.
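A possible way to run the test asked for in part c) below (a sketch; the closed-form eigenvalues 2 − 2cos(jπ/(n + 1)) of tridiag(−1, 2, −1) serve as the exact reference, and with the counting convention of count.m the function returns the mth smallest eigenvalue):

n = 100; m = 5;
c = -ones(n-1,1); d = 2*ones(n,1);
lambda = findeigv(c, d, m);
T = diag(d) + diag(c,1) + diag(c,-1);
ev = sort(eig(T));
exact = 2 - 2*cos(m*pi/(n+1));            % exact eigenvalue of tridiag(-1,2,-1)
fprintf('%.17f %.17f %.17f\n', exact, lambda, ev(m))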

c) Test the program on T := tridiag(−1, 2, −1) of size 100. Compare the exact value of λ_5 with your result and the result obtained by using MATLAB's builtin function eig.

Solution. A comparison between the values and errors obtained by the different methods is shown in Table 14.1.

method        value                  error
exact         0.02413912051848666    0
findeigv      0.02413912051848621    4.44 · 10^{−16}
MATLAB eig    0.02413912051848647    1.84 · 10^{−16}

Table 14.1: A comparison between the exact value of λ_5 and the values returned by the functions findeigv and eig, as well as the errors.

Exercise 14.15: Determinant of upper Hessenberg matrix
Suppose A ∈ C^{n×n} is upper Hessenberg and x ∈ C. We will study two algorithms to compute f(x) = det(A − xI).

a) Show that Gaussian elimination without pivoting requires O(n²) arithmetic operations.

Solution. Scaling row k requires n − k multiplications. Zeroing the (k + 1, k)-entry (and correspondingly updating the remaining entries in the row) requires 2(n − k) additions/multiplications.

The total number of operations in bringing the matrix to upper triangular form is thus
$$ \sum_{k=1}^{n-1} 3(n-k) = \frac{3}{2}\, n(n-1), $$
which is O(n²).

b) Show that the number of arithmetic operations is the same if partial pivoting is used.

Solution. There is only one possible row interchange at each step in Gaussian elimination of an upper Hessenberg matrix, and this does not affect the upper Hessenberg structure of the lower right submatrix. The number of operations is therefore the same.

c) Estimate the number of arithmetic operations if Givens rotations are used.

Solution. According to Algorithm 5.3 and the solution of Exercise 5.18, the number of operations to bring the matrix to upper triangular form is O(n²).

d) Compare the two methods discussing advantages and disadvantages.

Solution. We have seen that the two methods are comparable in complexity. Givens rotations, however, give better guarantees when it comes to stability, as pivoting can be unstable.
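As an illustration of part a), the sketch below computes f(x) = det(A − xI) for an upper Hessenberg A by eliminating only the single subdiagonal entry in each column, at O(n²) cost. It is a hedged illustration, not the book's algorithm, and it assumes nonzero pivots (part b) would add the row interchange).

% Sketch: det(A - x*I) for upper Hessenberg A via Gaussian elimination
% without pivoting; O(n^2) operations since only one entry per column is zeroed.
function f = hessdet(A, x)
n = size(A,1);
T = A - x*eye(n);
for k = 1:n-1
  if T(k,k) == 0
    error('zero pivot: use the row interchange discussed in part b)')
  end
  mult = T(k+1,k)/T(k,k);
  T(k+1,k:n) = T(k+1,k:n) - mult*T(k,k:n);   % zero the (k+1,k) entry
end
f = prod(diag(T));                           % determinant of the triangular factor
end

For a random upper Hessenberg matrix H = triu(randn(6),-1), hessdet(H, 0.3) agrees with det(H - 0.3*eye(6)) up to rounding.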

Chapter 15

The QR Algorithm

Exercise 15.1: Orthogonal vectors
Show that u and Au − λu are orthogonal when λ = u∗Au/(u∗u).

Solution. In the exercise it is implicitly assumed that u∗u ≠ 0 and therefore u ≠ 0. The vectors u and Au − λu are orthogonal precisely when
$$ 0 = \langle u, Au - \lambda u \rangle = u^*(Au - \lambda u) = u^*Au - \lambda u^*u. $$
Dividing by u∗u yields
$$ \lambda = \frac{u^*Au}{u^*u}. $$
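A short numerical confirmation (a sketch with a random complex matrix and vector):

n = 5; A = randn(n) + 1i*randn(n); u = randn(n,1) + 1i*randn(n,1);
lambda = (u'*A*u)/(u'*u);              % Rayleigh quotient; ' is the conjugate transpose
disp(abs(u'*(A*u - lambda*u)))         % ~0: u and A*u - lambda*u are orthogonal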

