Linear Algebra to Differential Equations [1 ed.] 0815361467, 9780815361466

Linear Algebra to Differential Equations concentrates on the essential topics necessary for all engineering students in


English Pages 416 [412] Year 2021


Table of contents :
Cover
Half Title
Title Page
Copyright Page
Dedication
Contents
Preface
1. Vectors and Matrices
1.1. Introduction
1.2. Scalars and Vectors
1.3. Introduction to Matrices
1.4. Types of Matrices
1.5. Elementary Operations and Elementary Matrices
1.6. Determinants
1.7. Inverse of a Matrix
1.8. Partitioning of Matrices
1.9. Advanced Topics: Pseudo Inverse and Congruent Inverse
1.10. Conclusion
2. Linear System of Equations
2.1. Introduction
2.2. Linear System of Equations
2.3. Rank of a Matrix
2.4. Echelon Form and Normal Form
2.5. Solutions of a Linear System of Equations
2.6. Cayley-Hamilton Theorem
2.7. Eigen-values and Eigen-vectors
2.8. Singular Values and Singular Vectors
2.9. Quadratic Forms
2.10. Conclusion
3. Vector Spaces
3.1. Introduction
3.2. Vector Space and Subspaces
3.3. Linear Independence, Basis and Dimension
3.4. Change of Basis-Matrix
3.5. Linear Transformations
3.6. Matrices of Linear Transformations
3.7. Inner Product Space
3.8. Gram-Schmidt Orthogonalization
3.9. Linking Linear Algebra to Differential Equations
3.10. Conclusion
4. Numerical Methods in Linear Algebra
4.1. Introduction
4.2. Elements of Computation and Errors
4.3. Direct Methods for Solving a Linear System of Equations
4.4. Iterative Methods
4.5. Householder Transformation
4.6. Tridiagonalization of a Symmetric Matrix by Plane Rotation
4.7. QR Decomposition
4.8. Eigen-values: Bounds and Power Method
4.9. Krylov Subspace Methods
4.10. Conclusion
5. Applications
5.1. Introduction
5.2. Finding Curves through Given Points
5.3. Markov Chains
5.4. Leontief's Models
5.5. Cryptology
5.6. Application to Computer Graphics
5.7. Application to Robotics
5.8. Bioinformatics
5.9. Principal Component Analysis (PCA)
5.10. Big Data
5.11. Conclusion
6. Kronecker Product
6.1. Introduction
6.2. Primary Matrices
6.3. Kronecker Products
6.4. Further Properties of Kronecker Products
6.5. Kronecker Product of Two Linear Transformations
6.6. Kronecker Product and Vector Operators
6.7. Permutation Matrices and Kronecker Products
6.8. Analytical Functions and Kronecker Product
6.9. Kronecker Sum
6.10. Lyapunov Function
6.11. Conclusion
7. Calculus of Matrices
7.1. Introduction
7.2. Derivative of a Matrix Valued Function with Respect to a Scalar
7.3. Derivative of a Vector-Valued Function w.r.t. a Vector
7.4. Derivative of a Scalar-Valued Function w.r.t. a Matrix
7.5. Derivative of a Matrix Valued Function w.r.t. its Entries and Vice versa
7.6. The Matrix Differential
7.7. Derivative of a Matrix w.r.t. a Matrix
7.8. Derivative Formula using Kronecker Products
7.9. Another Definition for Derivative of a Matrix w.r.t. a Matrix
7.10. Conclusion
8. Linear Systems of Differential Equations
8.1. Introduction
8.2. Linear Systems
8.3. Fundamental Matrix
8.4. Method of Successive Approximations
8.5. Nonhomogeneous Systems
8.6. Linear Systems with Constant Coefficients
8.7. Stability Analysis of a System
8.8. Election Mathematics
8.9. Conclusion
9. Linear Matrix Differential Equations
9.1. Introduction
9.2. Initial-value-problems of LMDEs
9.3. LMDE X' = A(t)X
9.4. The LMDE X' = AXB
9.5. More General LMDE
9.6. A Class of LMDE of Higher Order
9.7. Boundary Value Problem of LMDE
9.8. Trigonometric and Hyperbolic Matrix Functions
9.9. Conclusion
Bibliography
Answers
Index


Linear Algebra to Differential Equations

J. Vasundhara Devi

Sadashiv G. Deo
Ramakrishna Khandeparkar

First edition published 2022 by CRC Press 6000 Broken Sound Parkway NW, Suite 300, Boca Raton, FL 33487-2742 and by CRC Press 2 Park Square, Milton Park, Abingdon, Oxon, OX14 4RN © 2022 Taylor & Francis Group, LLC CRC Press is an imprint of Taylor & Francis Group, LLC Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint. Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers. For permission to photocopy or use material electronically from this work, access www.copyright. com or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. For works that are not available on CCC please contact [email protected] Trademark notice: Product or corporate names may be trademarks or registered trademarks and are used only for identification and explanation without intent to infringe. Library of Congress Cataloging-in-Publication Data Names: Vasundhara Devi, J., author. | Deo, Sadashiv G., 1933- author. | Khandeparkar, Ramakrishna, author. Title: Linear algebra to differential equations / J. Vasundhara Devi, Sadashiv G. Deo, Ramakrishna Khandeparkar. Description: First edition. | Boca Raton, FL : Chapman & Hall, CRC Press, 2021. | Includes bibliographical references and index. Identifiers: LCCN 2021019504 (print) | LCCN 2021019505 (ebook) | ISBN 9780815361466 (hardback) | ISBN 9781032067988 (paperback) | ISBN 9781351014953 (ebook) Subjects: LCSH: Algebras, Linear. | Differential equations. Classification: LCC QA184.2 .V388 2021 (print) | LCC QA184.2 (ebook) | DDC 512/.5--dc23 LC record available at https://lccn.loc.gov/2021019504 LC ebook record available at https://lccn.loc.gov/2021019505

ISBN: 978-0-815-36146-6(hbk) ISBN: 978-1-032-06798-8(pbk) ISBN: 978-1-351-01495-3(ebk) DOI: 10.1201/9781351014953 Typeset in Computer Modern by KnowledgeWorks Global Ltd.

To our teacher, guide and philosopher Professor V. Lakshmikantham, world citizen and teacher beyond comparison.


Preface

The authors feel the necessity to answer an important question: what is the need for another book on matrices, linear algebra and differential equations when there are hundreds of books available and some of them are excellent? An answer in a nutshell is: racing time, enhanced computational literacy and facilities, expanding knowledge on all frontiers, similarity among various disciplines, and complexities evolving due to changes in social, physical, technical and engineering domains giving rise to new problems – all call for a new book. A book, in fact a textbook, that (i) is complete in the basic knowledge of the subject, (ii) has a lucid exposition of the necessary topics, (iii) introduces the abstraction and at the same time is down to earth, (iv) highlights the methods and approaches that are more useful, (v) brings out the essential techniques, (vi) points towards the similarities in various disciplines, (vii) has illustrative examples, (viii) introduces applications both classical and novel, and (ix) gives an overview of techniques that may be useful in the future could go a long way in serving the needs of students and applied scientists alike. With this view the authors conceived this book.

This book has many new features. The fundamental knowledge essential for a student of linear algebra is given in the first two chapters. The authors quickly go through the basics and also include concepts like the pseudo inverse and congruent inverse, which are useful in applications. The eigen-value decomposition theorem and the introduction of singular values and singular vectors in the second chapter give the student the time needed to assimilate these ideas before they are used later.

Chapter 3 begins with abstract concepts but quickly shifts to down-to-earth topics: the change of basis matrix, linear transformations and matrices of linear transformations. These are some of the topics frequently used by applied scientists.


The singular value decomposition theorem and the Gram–Schmidt orthogonalization process are given, since these are essential tools for applied scientists, engineers and technocrats.

In Chapter 4, after highlighting the natural existence of errors in computation, the basic methods for solving a linear system of equations are given, using both direct and iterative methods. As orthogonal transformations play a vital role in iterative techniques, the Householder method and the plane rotation method are given. Bounds for the eigen-values, the power method for finding the largest eigen-value and the QR method are also described. A brief description of Krylov subspace methods introduces a broad spectrum of solvers to be discovered by the reader. As problems in nearly all the topics involving matrices can be solved by just giving a command to a computer, not much stress has been placed on solving problems by hand. In fact, in the opinion of the authors, the content of this chapter is more tuned to assignments than to examinations. Further, in this chapter the basic techniques have been given and the refinements can be learnt by the reader whenever required.

Chapter 5 brings forth the use of linear algebra in applied areas. An attempt has been made to introduce and showcase some of them, both classical applications and the latest topics. The reader will recognize that these applications are only the tip of an iceberg given the wide range and scope of applications of linear algebra.

A section has been included in Chapter 3 linking linear algebra to differential equations. It can be considered a prologue to Chapters 6 to 9. In Chapter 6, Kronecker products and sums are introduced. These are important tools for obtaining eigen-values and eigen-vectors of higher-order matrices. In Chapter 7, various types of derivatives available in the literature are given, though most of these derivatives are seldom used. There is a lot of potential for both mathematicians and applied researchers to develop new types of differential equations and obtain new results. Chapter 8 contains the study of a linear system of differential equations. This chapter is essential for engineering students working with control systems and mechanical systems, among others. In Chapter 9, linear matrix differential equations, an extension of the systems studied in Chapter 8, are described.

Throughout the book the concepts have been illustrated through examples, while routine mathematical proofs have been avoided. While a number of examples and problems are illustrative, the readers are encouraged to create their own problems, as the process of creating a textbook problem is a backward process starting from the solution.

The authors would like to place on record their gratitude to all the scientists and mathematicians who have taken the subject to the level where applications of the subject have become innumerable. Next, the authors would like to record their regards to the management of Gayatri Vidya Parishad College of Engineering (Autonomous) (GVPCE(A)), Visakhapatnam, India, for all the support rendered.


Special mention must be made of Dr A. B. Koteswara Rao, Principal, GVPCE(A), for giving all the necessary support while the writing of the book was in progress, including comments and suggestions on the topic of Robotics.

In developing the sections relating to Cryptography, Computer Graphics, Big Data and Bioinformatics, the authors wish to acknowledge the guidance provided by Professor R. G. Marshall, (Retired) Professor of Computer Science, Plymouth State University, New Hampshire, USA. Dr G. Sudheer and Dr K. L. Sai Prasad of the Dept. of Mathematics, GVPCE (Women), have given timely suggestions for the betterment of the book and the authors acknowledge it.

The Dept. of Mathematics of GVPCE(A) has given its unstinted support. The authors feel it a privilege to express their appreciation to the department in general and in particular to Dr R.V.G. Ravi Kumar, Associate Professor and Head, Department of Mathematics, Dr A.R.J. Srikanth, Dr Ch. V. Sreedhar, Dr Ch. Appala Naidu, Dr J. Satish, Dr D. Ravi Kumar, Mr I.S.N.R.G. Bharat and Mr S. Srinivasa Rao, all Assistant Professors of the department, and Dr P. Sarada Varma, WOS(A), SERB, DST. Dr K.V. Varalakshmi, Assistant Professor, Dept. of Mechanical Engineering, GVPCE(A), drew the figures in the book, and Dr D. N. D. Harini, Associate Professor, Dept. of Computer Science and Engineering, GVPCE(A), and Mr S. Kanthi Kiran, Assistant Professor, Dept. of Information Technology, GVPCE(A), helped with problems related to Computer Graphics and PCA. Professor D. Varadaraju, Professor, Dept. of Mechanical Engineering, GVPCE(A), has given immense support in reading the proofs. They are all sincerely acknowledged.

Dr N. Deepika Rani, Dept. of Electronics and Communications Engineering, GVPCE(A), has given timely help with her team of students, Ms P. Mayuri, Ms B. Prema and Ms V. Shyamala, all M.Tech. ECE students, in typesetting the book in LaTeX, and the authors record their appreciation. The contribution of Dr Lymuel McRae, Consultant, McLean, Virginia, USA, Dr F. A. McRae of Catholic University of America, USA, Dr Z. Drici of Illinois Wesleyan University, USA, and Mr J. V. S. Dattatreya of Affine Tech Systems Private Limited, India, in ensuring the correctness of the final version of the book is sincerely acknowledged.

The authors extend heartfelt thanks to Janvi Enterprises at Dombivili (E), Maharashtra State, India, for providing computer facilities for quick communication between the authors while the book writing was in progress. Dr Aastha Sharma and Ms Shikha Garg of the Taylor & Francis group have demonstrated utmost patience and gave complete support in a very turbulent phase of the writing of the book. The authors express their appreciation.

Last but not the least, the authors record their deep sense of gratitude to their respective families, teachers and friends for being with them throughout.

J. Vasundhara Devi
Sadashiv G. Deo
Ramakrishna Khandeparkar

Condolences

It is with a heavy heart that we express our sorrow at the loss of our co-author, Dr Ramakrishna Khandeparkar, while the book was still in progress. His contribution to the book is sincerely acknowledged.

J. Vasundhara Devi
Sadashiv G. Deo

Chapter 1 Vectors and Matrices

1.1 Introduction

This is the foundation chapter for the book; it contains the building blocks of all the topics covered later. Section 1.2 begins with the definitions of a scalar and a vector and proceeds with the study of their properties. These properties provide a base for the theory developed in later chapters. Matrices are introduced in terms of vectors in Section 1.3. This definition is useful because in nearly all applications a matrix consisting of linearly independent vectors or ortho-normal vectors is utilized to transform a vector into a new vector in a given situation. Various types of matrices which will be useful later are introduced in Section 1.4, while Section 1.5 deals with elementary operations on matrices. These elementary operations are the only operations that can be done within a matrix, and their importance lies in developing algorithms for numerical techniques. Elementary matrices resulting from elementary operations are introduced in this section. Determinants, which attribute a single numerical value to a matrix, are given along with their properties in Section 1.6. The contents of Section 1.7 consist of the inverse of a matrix together with its properties. Section 1.8 deals with the partitioning of a matrix, wherein basic algebraic properties involving the partition are introduced. Section 1.9 introduces the concepts of the pseudo inverse and congruent inverse.

1.2 Scalars and Vectors

The concepts of a scalar and a vector are essential, fundamental notions for describing the physical world. A scalar is used to denote quantification or measurement and to express magnitude. For example, mass, temperature, length, volume of a body and time are all scalars.


Definition 1.2.1 A scalar is a quantity completely specified by a real or a complex number. A scalar is denoted by small letters a, b, c, x, y, z for real numbers and a + ib, x + iy and so on for complex numbers.

Velocity and acceleration are typical examples of a vector. In physical terms, a vector has both magnitude and direction. On the real line R, a vector describes the distance to the left or to the right of zero. Here, zero is the point of reference. In the plane R^2, the vector (a b) is the position vector of the point (a b). It gives the distance of the point from the origin (0 0) as well as the direction. In the space R^3, corresponding to any point (x y z), a position vector is associated with it. This vector gives both the distance and direction from (0 0 0), the origin, which is the frame of reference in R^3.

Definition 1.2.2 An n-dimensional vector a is an ordered set of 'n' scalars written in ordinary parentheses as a = (a1 a2 · · · ai · · · an). The scalar 'ai' is called the ith component of the vector. A vector having 'n' components is called an 'n'-tuple.

Example 1.2.1 In the 3-dimensional space, a point (x y z) can be considered as a vector, called the position vector of the point (x y z), and is represented as in Fig 1.1.

Observation. In a 2-dimensional or 3-dimensional space, a vector is a directed line segment, having both magnitude and direction.


Definition 1.2.3 Two vectors a = (a1 a2 · · · an) and b = (b1 b2 · · · bn) are equal if and only if ai = bi, i = 1, 2, · · · , n.

Definition 1.2.4 A zero vector is a vector all of whose components are zero, a = (0 0 · · · 0), and is designated as 0.

Definition 1.2.5 A vector w = (w1 w2 ... wn) is the negative vector of v = (v1 v2 ... vn) if and only if w = (−v1 −v2 ... −vn) = −v.

Definition 1.2.6 The sum of two vectors a = (a1 a2 · · · an) and b = (b1 b2 · · · bn) is denoted by a + b and is defined as a + b = (a1 + b1  a2 + b2  ...  an + bn). That is, the sum of the two vectors a and b is a vector whose ith component is the sum ai + bi.

Geometrically, if a and b are any two vectors in a plane, then the sum a + b is obtained from the triangle formed by vectors a and b or from the parallelogram with a and b as adjacent sides. The diagrams in Fig 1.2 and Fig 1.3 are self-explanatory. Properties of vector addition in a plane are given in the following figures. Fig. 1.4 shows that vector addition is commutative, that is, a + b = b + a. Fig. 1.5 shows that vector addition is associative, that is, (a + b) + c = a + (b + c). Fig. 1.6 explains the concept of the difference of two vectors a and b.
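These properties are easy to confirm numerically. The following minimal sketch, assuming the NumPy library is available (it is not used elsewhere in the text), checks commutativity and associativity for particular vectors.

    import numpy as np

    a = np.array([1.0, 2.0, 3.0])
    b = np.array([-2.0, 0.5, 4.0])
    c = np.array([0.0, 1.0, -1.0])

    # commutativity: a + b equals b + a
    print(np.allclose(a + b, b + a))                  # True
    # associativity: (a + b) + c equals a + (b + c)
    print(np.allclose((a + b) + c, a + (b + c)))      # True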


Note. The properties of associativity and commutativity of addition hold good for n-tuples, n ≥ 1.

Definition 1.2.7 Multiplication of a vector a by a scalar λ ∈ R is defined as a vector λa = (λa1 λa2 · · · λan), that is, the vector obtained by multiplying each component ai of a with λ.

Observation. If λ > 0, then the vector λa is in the direction of a. If λ < 0, then the vector λa is in the opposite direction of a. If λ = 0, then the vector λa is the zero vector. If |λ| > 1, then λa is said to be a magnification or dilation of a, and if |λ| < 1, then λa is called a contraction of a.

A combination of the above definitions gives rise to an important notion that plays a pivotal role in linear algebra.

Definition 1.2.8 If v1, v2, · · · , vn are n vectors and α1, α2, ..., αn are n scalars, then the linear combination of the n vectors is defined as a vector v = α1 v1 + α2 v2 + ... + αn vn.

Example 1.2.2 If v1 = (1 2 3), v2 = (3 2 4) and v3 = (2 1 8), then 3v1 + 2v2 + v3 = (11 11 25).

Example 1.2.3 In R^2, let i = (1 0) and j = (0 1). Then any position vector r = (x y) can be written as r = x(1 0) + y(0 1).

Example 1.2.4 In R^3, let i = (1 0 0), j = (0 1 0) and k = (0 0 1). Then any position vector r = (x y z) can be written as r = x(1 0 0) + y(0 1 0) + z(0 0 1) = xi + yj + zk.

Result 1.2.1 To extend this idea to n-tuples, denote e1 = (1 0 ... 0), e2 = (0 1 0 ... 0), ..., en = (0 0 ... 0 1), where ei has 1 in the ith component and 0 elsewhere. Then any vector v = (x1 x2 ... xn) = x1 e1 + x2 e2 + ... + xn en.
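The linear combination of Example 1.2.2 and the expansion in the vectors ei can be checked with a few lines of code; a minimal sketch assuming the NumPy library:

    import numpy as np

    v1 = np.array([1, 2, 3])
    v2 = np.array([3, 2, 4])
    v3 = np.array([2, 1, 8])

    # the linear combination 3*v1 + 2*v2 + v3 of Example 1.2.2
    print(3 * v1 + 2 * v2 + v3)            # [11 11 25]

    # any vector is a combination of the standard vectors e1, e2, e3
    e = np.eye(3)
    x = np.array([5, -1, 2])
    print(np.allclose(x, x[0] * e[0] + x[1] * e[1] + x[2] * e[2]))   # True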


Definition 1.2.9 The vectors v1, v2, ..., vn are said to be linearly independent if and only if the linear combination α1 v1 + α2 v2 + ... + αn vn = 0, with αi ∈ R, implies αi = 0 for all i = 1, 2, ..., n.

Example 1.2.5 Show that v1 = (1 1 0), v2 = (1 0 1) and v3 = (0 1 1) are linearly independent.
Consider α1(1 1 0) + α2(1 0 1) + α3(0 1 1) = 0; then α1 + α2 = 0, α1 + α3 = 0 and α2 + α3 = 0, which gives α1 = α2 = α3 = 0.

Definition 1.2.10 The vectors v1, v2, ..., vn are said to be linearly dependent if they are not linearly independent. Alternatively, the vectors v1, v2, ..., vn are linearly dependent if there exist scalars α1, α2, ..., αn, not all zero, such that the linear combination α1 v1 + α2 v2 + ... + αn vn = 0.

Example 1.2.6 Show that v1 = (1 3 4), v2 = (2 9 5) and v3 = (4 15 13) are linearly dependent.
Consider α1(1 3 4) + α2(2 9 5) + α3(4 15 13) = 0. Then simplifying gives α1 + 2α2 + 4α3 = 0, 3α1 + 9α2 + 15α3 = 0 and 4α1 + 5α2 + 13α3 = 0, which yields upon solving α1 = 2, α2 = 1 and α3 = −1. Hence the result.

The length or magnitude of a vector is a very important notion and is as follows.

Definition 1.2.11 The magnitude or the length of a vector v = (v1 v2 ... vn) is given by the Euclidean distance as |v| = l(v) = √(v1^2 + v2^2 + ... + vn^2).

Example 1.2.7 If v = (2 5 7) then the Euclidean distance is |v| = l(v) = √(2^2 + 5^2 + 7^2) = √78.

Definition 1.2.12 A unit vector is a vector whose magnitude or length is 1.

Example 1.2.8 Show that (1 0), (0 1) and (1/√2  1/√2) are all unit vectors in a plane.
|(1 0)| = √1 = 1, |(0 1)| = √1 = 1 and |(1/√2  1/√2)| = √(1/2 + 1/2) = 1.
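Linear independence can also be tested by computing the rank of the matrix whose columns are the given vectors, and lengths can be computed directly; a minimal sketch assuming the NumPy library:

    import numpy as np

    # vectors of Example 1.2.5: linearly independent
    V = np.column_stack(([1, 1, 0], [1, 0, 1], [0, 1, 1]))
    print(np.linalg.matrix_rank(V))    # 3, so the three vectors are independent

    # vectors of Example 1.2.6: linearly dependent
    W = np.column_stack(([1, 3, 4], [2, 9, 5], [4, 15, 13]))
    print(np.linalg.matrix_rank(W))    # 2, so the three vectors are dependent

    # length of the vector in Example 1.2.7 and a unit vector in its direction
    v = np.array([2.0, 5.0, 7.0])
    print(np.linalg.norm(v))           # sqrt(78), about 8.832
    u = v / np.linalg.norm(v)
    print(np.linalg.norm(u))           # 1.0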

Normalization of a vector is a process of obtaining a unit vector by dividing a given vector by its length.


Example 1.2.9 If v = (10 5 4), then u = v/|v| = (1/√141)(10 5 4) is a unit vector.

The product of vectors can be defined in many ways. The dot product (also called scalar product or inner product) and the vector product, which are widely used in science and engineering, are considered below.

Definition 1.2.13 The dot product of two vectors a = (a1 a2 ... an) and b = (b1 b2 ... bn) is defined as the sum of the products of corresponding components,
a · b = a1 b1 + a2 b2 + ... + an bn.

The dot product in a plane is defined as follows. Let OA = a and OB = b be vectors in the XY-plane making angles α and β with the X-axis. Let θ be the angle between a and b measured in the anti-clockwise direction, so that θ = α − β. Then, from Fig 1.7,

cos θ = cos(α − β) = cos α cos β + sin α sin β = (a1 b1)/(|a||b|) + (a2 b2)/(|a||b|) = (a1 b1 + a2 b2)/(|a||b|),

so that |a||b| cos θ = a1 b1 + a2 b2 = a · b, i.e. a · b = |a||b| cos θ.

Thus, in a plane or in space, a · b = |a||b| cos θ, where θ is the angle between a and b. Some of the important properties of the dot product are:

1. The dot product of two nonzero vectors a and b is positive, zero or negative according to whether the angle θ between the vectors is less than, equal to or greater than 90° (= π/2), respectively.
2. The dot product is commutative: a · b = b · a = |a||b| cos θ.
3. If a · b = 0 then either a = 0 or b = 0; if a ≠ 0 and b ≠ 0 then a is perpendicular to b.
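A quick numerical check of the relation a · b = |a||b| cos θ; a minimal sketch assuming the NumPy library:

    import numpy as np

    a = np.array([2.0, 3.0, 4.0])
    b = np.array([0.0, 1.0, 5.0])

    dot = np.dot(a, b)                                         # 23.0
    cos_theta = dot / (np.linalg.norm(a) * np.linalg.norm(b))  # cosine of the angle
    theta = np.arccos(cos_theta)                               # angle in radians
    print(dot, np.degrees(theta))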

Example 1.2.10 (a) If a = (2 3 4) and b = (0 1 5) then a · b = 23. (b) If e1 = (1 0) and e2 = (0 1) then e1 · e2 = 0, that is, e1 is perpendicular to e2, and e1 · e1 = 1 and e2 · e2 = 1.

Definition 1.2.14 The projection of a vector b on to a vector a is a scalar and is defined as (b · a)/|a|.

Definition 1.2.15 The projection vector of b on to a vector a is given by ((b · a)/|a|^2) a.

Another kind of product between two vectors yields a vector and is defined as follows.

Definition 1.2.16 The cross product of any two vectors a = (a1 a2 a3) and b = (b1 b2 b3) is defined by a × b = (a2 b3 − a3 b2,  −(a1 b3 − a3 b1),  a1 b2 − a2 b1). The following definition is in terms of the angle θ.

Aliter definition. The cross product or the vector product of two vectors a and b is defined as a × b = |a||b| sin θ e, where θ is the angle between a and b such that 0 ≤ θ ≤ π and e is a unit vector perpendicular to both a and b, with its sense given by the right-hand rule. The vector product is explained in Fig. 1.8.


Important properties of the cross product
1. a × b is a vector in the direction perpendicular to both a and b.
2. The magnitude |a × b| gives the area of the parallelogram enclosed by the two vectors a and b.
3. The area of the triangle whose sides are a and b is given by (1/2)|a × b|.

Note. Problems on vector products can be easily done once the concept of determinants is introduced in Section 1.6.

Example 1.2.11 If i, j, k are mutually perpendicular unit vectors in R^3 along the coordinate axes, then the following relations hold:
i × j = k,  j × k = i,  k × i = j;
j × i = −k,  k × j = −i,  i × k = −j;
i × i = j × j = k × k = 0.
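The cross product and the associated areas can be computed directly; a minimal sketch assuming the NumPy library:

    import numpy as np

    a = np.array([4.0, 3.0, 1.0])
    b = np.array([2.0, -1.0, 2.0])

    c = np.cross(a, b)                      # a vector perpendicular to both a and b
    print(np.dot(c, a), np.dot(c, b))       # both 0.0

    area_parallelogram = np.linalg.norm(c)  # |a x b|
    area_triangle = 0.5 * area_parallelogram
    print(area_parallelogram, area_triangle)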

EXERCISE 1.2

1. If a = (3 −4 1) and b = (1 0 2) find (i) a + b (ii) 5b (iii) 3a + 2b (iv) 3a − 4b.
2. Find a + b and a · b for the following vectors: (i) a = (2 1 3), b = (−3 0 5) (ii) a = (2 7 0), b = (−1 4 3).
3. Find the cosine of the angle between the vectors a = (−4 3 4) and b = (−4 5 −8).
4. Find a unit vector in the direction of the vector (4 5 2).
5. Using vectors, prove that a parallelogram whose diagonals are equal is a rectangle.
6. Using vectors, prove that the median to the base of an isosceles triangle is perpendicular to the base.
7. If a = 2i − 3j + 6k and b = 2i − j − 4k, then verify that |a|^2 − |b|^2 = (a + b) · (a − b).
8. If |a + b| = |a − b|, then prove that the vectors a and b are perpendicular.
9. Prove the identity a = (a · i)i + (a · j)j + (a · k)k, where a is any vector in the space.
10. Find a unit vector perpendicular to each of the vectors 4i + 3j + k and 2i − j + 2k. Also, find the cosine of the angle between the given vectors.
11. If a + b + c = 0, prove that the angle θ between the vectors a and b is given by cos θ = (c^2 − a^2 − b^2)/(2|a||b|).
12. Find the work done in moving an object acted upon by a force (4 −3 2) along a vector (−1 −3 5).

1.3 Introduction to Matrices

In this section, following the approach used to define a vector, a new concept called a matrix will be introduced. A few types of matrices will be defined and the basic operations of algebra will be given.

Definition 1.3.1 A matrix of order m × n is an arrangement of n m-vectors, each written as a column.

If x1 = (a11 a21 ... am1), x2 = (a12 a22 ... am2), ..., xn = (a1n a2n ... amn), then

A = (x1 x2 ... xn) = [a11 a12 · · · a1n; a21 a22 · · · a2n; · · · ; am1 am2 · · · amn]

is an m × n matrix (here and below, semicolons separate the rows of an array written in brackets).

Notation. For the sake of brevity and clarity, a matrix of order m × n, Am×n, is written as A = [aij]m×n, where i = 1, 2, ..., m, j = 1, 2, ..., n and aij is the ith row, jth column entry of the matrix.

Special cases.
(i) If m = n = 1 then A = [a11] is a single entry matrix and is considered as a scalar.
(ii) If n = 1, then A = [a11; a21; ...; am1] = [ai1], i = 1, 2, ..., m, is a column matrix and is called a column vector. Thus, a column vector is represented using a square bracket or a parenthesis.
(iii) If m = 1, then A = [a11 a12 ... a1n] = [a1j], j = 1, 2, ..., n, is a row matrix and is known as a row vector. Thus, a row vector is represented using a square bracket or a parenthesis.
(iv) If m = n then A = [aij]n×n is a square matrix of order n.
(v) If m ≠ n, then A = [aij]m×n is a rectangular matrix.

Example 1.3.1
(i) A = [3] is a single entry matrix.
(ii) A = [1; 5; 3; 2], of order 4 × 1, is a column vector or a column matrix.
(iii) A = [2 3 6 4 8], of order 1 × 5, is a row matrix or a row vector.
(iv) A = [1 3; 4 5], of order 2 × 2, is a square matrix.
(v) A = [2 1 4 8; 5 2 1 3], of order 2 × 4, is a rectangular matrix.
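A matrix can be assembled from its column vectors in the sense of Definition 1.3.1; a minimal sketch assuming the NumPy library:

    import numpy as np

    # the 2 x 4 matrix of Example 1.3.1(v) built from its four column vectors
    x1, x2, x3, x4 = [2, 5], [1, 2], [4, 1], [8, 3]
    A = np.column_stack((x1, x2, x3, x4))
    print(A.shape)    # (2, 4)
    print(A)          # [[2 1 4 8]
                      #  [5 2 1 3]]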

Some important matrices are defined below.

Definition 1.3.2 Zero matrix, denoted by O or by 0, is a matrix of the form A = [aij]m×n = [0]m×n, that is, aij = 0 for all i = 1, 2, ..., m and j = 1, 2, ..., n.

Definition 1.3.3 Elements of the leading diagonal of a square matrix of order n are the entries aii, i = 1, 2, ..., n.

Example 1.3.2 Let A = [a 3 4 1; 2 b 5 2; 3 1 c 0; 1 2 0 d]; then the leading diagonal consists of the entries a, b, c and d.

Definition 1.3.4 Diagonal matrix is a square matrix of order n, all of whose off-diagonal elements are zero, that is, aij = 0 for i ≠ j in A = [aij]n×n.

Definition 1.3.5 The trace of A = [aij]n×n is the sum of the diagonal elements of A and is written as tr(A) = a11 + a22 + · · · + ann.

Definition 1.3.6 Identity matrix is a diagonal matrix whose diagonal elements aii = 1, for i = 1, 2, ..., n. It is denoted by In.

Example 1.3.3 (i) A = [2 0 0; 0 8 0; 0 0 5] is a diagonal matrix of order 3. (ii) For the matrix A in (i), tr A = 15. (iii) [1 0 0 0; 0 1 0 0; 0 0 1 0; 0 0 0 1] is an identity matrix of order 4.

Definition 1.3.7 Lower triangular matrix is a square matrix whose entries above the leading diagonal are all zero, that is, aij = 0 for all i < j, i, j = 1, 2, ..., n.

Definition 1.3.8 Upper triangular matrix is a square matrix whose entries below the leading diagonal are all zero, that is, aij = 0 for all i > j, i, j = 1, 2, ..., n.

Example 1.3.4 (i) [1 0 0 0; −2 3 0 0; 1 −1 1 0; 2 0 1 −1] is a lower triangular matrix of order 4. (ii) [5 4 −1 2; 0 3 2 −1; 0 0 4 1; 0 0 0 2] is an upper triangular matrix of order 4.

With these few definitions of matrices, it is essential to introduce the algebra of matrices, which involves defining the sum, scalar product and product of matrices. In order to define the sum, difference and equality of two matrices, it is necessary to take matrices of the same order. Let A = [aij]m×n and B = [bij]m×n.

Definition 1.3.9 The sum of two matrices A and B is a matrix C = [cij]m×n, where cij = aij + bij for i = 1, 2, ..., m and j = 1, 2, ..., n.

Definition 1.3.10 Scalar multiplication of a matrix A with λ ∈ R or λ ∈ C is the matrix λA = [λ aij]m×n.

Definition 1.3.11 The difference of two matrices A and B is a matrix C = [cij]m×n defined by C = A − B = A + (−1)(B) = [aij − bij]m×n.

Definition 1.3.12 Two matrices A and B are said to be equal if and only if aij = bij for all i = 1, 2, ..., m and j = 1, 2, ..., n.

Example 1.3.5 (i) If A = [a b c; d e f] and B = [p q r; x y z], then
A + B = [a+p b+q c+r; d+x e+y f+z].
Note that the orders of A, B and A + B are the same.
(ii) The following matrices are not compatible for matrix addition or difference (subtraction):
A = [2; −3; 2],  B = [4 1; −7 0; −1 3],  C = [1 5 1; 4 2 2],  D = [5 7 1; 2 4 3; 3 3 2].
(iii) If A = [1 4 5; 6 −3 2] and B = [3 −1 3; 0 −2 4], then A − B = [−2 5 2; 6 −1 −2].
(iv) If A = [4 −5 −3; 3 2 1], then 4A = [16 −20 −12; 12 8 4].

Definition 1.3.13 Multiplication of two matrices A = [aij]m×p and B = [bij]p×n is a product matrix C = AB, where C = [cij]m×n and cij is defined as
cij = ai1 b1j + ai2 b2j + · · · + aip bpj (the sum of aik bkj over k = 1, 2, ..., p).
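The entry formula can be checked against a computed product; a minimal sketch assuming the NumPy library:

    import numpy as np

    A = np.array([[1, 2, 0],
                  [2, 3, 1]])        # 2 x 3
    B = np.array([[0, 2],
                  [2, 2],
                  [2, 0]])           # 3 x 2

    C = A @ B                        # the 2 x 2 product AB
    i, j = 1, 0
    # entry c_ij as the sum of a_ik * b_kj over k
    print(C[i, j], sum(A[i, k] * B[k, j] for k in range(A.shape[1])))   # 8 8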

Remark 1.3.1 (i) The product of two matrices A and B is defined if and only if the number of columns of matrix A is equal to the number of rows of matrix B. (ii) Unlike scalars and vectors, the product AB may not be equal to the product BA. (iii) The product AB may exist but the product BA need not exist. (iv) If AB = BA, then A and B are said to commute.

Example 1.3.6 Find the products AB and BA for
(i) A = [2 4], B = [2; 3]
(ii) A = [1 2 3], B = [−2; 1; 5]
(iii) A = [1 2 0; 2 3 1; 3 4 2], B = [0 2 1; 2 2 0; 2 0 1]
(iv) A = [2 4 8], B = [1 2; 0 1; 2 3].

Observe that not only are A and B compatible but B and A are also compatible in cases (i) to (iii). Hence both products AB and BA are defined in all cases except case (iv).

(i) AB = [2 4][2; 3] = [2 × 2 + 4 × 3] = [16], and BA = [2; 3][2 4] = [4 8; 6 12].
(ii) AB = [1 2 3][−2; 1; 5] = [1 × (−2) + 2 × 1 + 3 × 5] = [15], and BA = [−2; 1; 5][1 2 3] = [−2 −4 −6; 1 2 3; 5 10 15].
(iii) AB = [1 2 0; 2 3 1; 3 4 2][0 2 1; 2 2 0; 2 0 1] = [4 6 1; 8 10 3; 12 14 5], and BA = [0 2 1; 2 2 0; 2 0 1][1 2 0; 2 3 1; 3 4 2] = [7 10 4; 6 10 2; 5 8 2].
(iv) AB = [2 4 8][1 2; 0 1; 2 3] = [18 32], but BA is not defined.

As is the case with real numbers and vectors, addition of matrices is associative and commutative. Below are given results relating to the algebra of matrices in terms of addition, scalar multiplication and product of matrices. The readers are advised to prove them. The proofs are simple and can be obtained by using the properties of real numbers for each entry of the matrix.

1. Properties of addition and scalar multiplication. Let A, B, C be matrices each of order m × n and λ and µ be any scalars. Then
(a) A + B = B + A.
(b) A + (B + C) = (A + B) + C.
(c) (λ + µ)A = λA + µA.
(d) λ(A + B) = λA + λB.
(e) 0·A = Om×n.
(f) λ Om×n = Om×n.
(g) λ(µA) = (λµ)A = µ(λA).

2. Properties of matrix multiplication. Let A = [aij]m×p, B = [bij]p×q, C = [cij]p×q and D = [dij]q×n. Then
(a) A(BD) = (AB)D.
(b) A(B ± C) = AB ± AC.
(c) (B ± C)D = BD ± CD.
(d) Im×m Am×p = Am×p.
(e) Am×p Ip×p = Am×p.
(f) Am×p Op×p = Om×p.
(g) Ok×m Am×p = Ok×p.

Example 1.3.7 Suppose A = [1 4 −1; 3 0 2], B = [3 0 1; 2 1 3; 4 1 2] and C = [2 1; 0 2; −3 1]. Then show that (AB)C = A(BC).
Consider (AB)C = [7 3 11; 17 2 7][2 1; 0 2; −3 1] = [−19 24; 13 28].
Similarly, A(BC) = [1 4 −1; 3 0 2][3 4; −5 7; 2 8] = [−19 24; 13 28].
Hence (AB)C = A(BC) = ABC.

Example 1.3.8 Consider A = [0 2; 0 0] and B = [2 0; 0 0]; then AB = [0 0; 0 0]. Thus A ≠ 0, B ≠ 0 but AB = 0.

Example 1.3.9 Show that A^3 = 4A, where A = [1 −1; −1 1].
Consider A^2 = [1 −1; −1 1][1 −1; −1 1] = [2 −2; −2 2], and hence
A^3 = A^2 A = [2 −2; −2 2][1 −1; −1 1] = [4 −4; −4 4] = 4A.
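The associativity and distributivity rules above, and the particular products of Example 1.3.7, can be confirmed numerically; a minimal sketch assuming the NumPy library:

    import numpy as np

    A = np.array([[1, 4, -1],
                  [3, 0, 2]])
    B = np.array([[3, 0, 1],
                  [2, 1, 3],
                  [4, 1, 2]])
    C = np.array([[2, 1],
                  [0, 2],
                  [-3, 1]])

    # associativity of the product, as in Example 1.3.7
    print(np.array_equal((A @ B) @ C, A @ (B @ C)))      # True
    print((A @ B) @ C)                                    # [[-19  24] [ 13  28]]

    # distributivity over addition: A(B + B) = AB + AB
    print(np.array_equal(A @ (B + B), A @ B + A @ B))     # True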

Example 1.3.10 Show that (A + kB^2)^3 = kI3, where A = [0 1 0; 0 0 1; 0 0 0] and B = [0 0 0; 1 0 0; 0 1 0], and k is any scalar.
On multiplication, B^2 = [0 0 0; 0 0 0; 1 0 0], A + kB^2 = [0 1 0; 0 0 1; k 0 0] and
(A + kB^2)^2 = [0 0 1; k 0 0; 0 k 0].
Further,
(A + kB^2)^3 = (A + kB^2)^2 (A + kB^2) = [k 0 0; 0 k 0; 0 0 k] = kI3,
hence the result.

 63 46 A3 − 23A − 40I3 = 69 −6 92 46  0 0 = 0 0 0 0 thus proving the result.

2 −2 2

where

46 −6 46

 69 23 and 63

  1 3 1 − 40 0 0 1

0 1 0

 0 0 , 1

 0 0 , 0

 3 Example 1.3.12 If A = 1

  −4 1 + 2n n , prove that A = −1 n

The proof is given by the method of induction. Let n = 2, then     5 −8 5 −8 2 L.H.S. = A = and R.H.S. is . 2 −3 2 −3 Hence the result is true for n = 2. Assume that the result holds for n = k.  1 + 2k Ak = k

 −4k . 1 − 2k

 −4n . 1 − 2n

18

Linear Algebra to Differential Equations

To prove the result for n = k + 1,    1 + 2k −4k 3 −4 k L.H.S. = A A = k 1 − 2k 1 −1   1 + 2k + 2 −4k − 4 = k+1 1 − 2k − 2   1 + 2(k + 1) −4(k + 1) = k+1 1 − 2(k + 1) = Ak+1 . As the result is true for n = k + 1, by the principle of mathematical induction, the result holds good for all natural numbers. Example 1.3.13 Suppose that P and Q are square matrices of order n, show that P and Q commute if and only if P − kI and Q − kI commute for every real number k. If P and Q commute, i.e. PQ = QP

(1.1)

Then (P − kI)(Q − kI) = PQ − PkI − kIQ + k 2 I = PQ − kP − kQ + k 2 I; (P − kI)(Q − kI) = PQ − kP − kQ + k 2 I;

(1.2)

(Q − kI)(P − kI) = QP − QkI − kIP + k 2 I = QP − kQ − kP + k 2 I; (Q − kI)(P − kI) = QP − kQ − kP + k 2 I;

(1.3)

Since PQ = QP, expressions (1.2) and (1.3) are equal, and hence P − kI and Q−kI commute. Conversely, expressions (1.2) and (1.3) are equal only if (1.1) holds.

EXERCISE 1.3 1. State the orders of the following matrices:     1 3 0 −2 −4 4 2 3  3 4   (i)  (ii)   7 6 −5 8  6 −9 7 1 9 0

(iii)

 −1 5

(v)

 0 0

 2 0 3 −7 0 0

0 0

0 0

Vectors and Matrices 

2 (iv)  6 −1

19 −4 6 7



0 11 3 −5 8 2

 0 . 0

2. A matrix has nine elements. Find the possible number of matrices that can be formed. State the possible orders if matrix has 11 elements. 3. Construct 4 × 3 matrices [aij ] whose elements are (iii) aij = (i) aij = i + 2j (ii) aij = 3i − jj     x − 2z 3x − z −5 0 4. If = , find the values of x, y, z. y+z x+z 5 4

(i−j)2 . 3

5. Find the values of α, β, γ, δ from the matrix equation     2α − δ α + 2δ 3 4 = . 5β − γ 4β + 3γ 11 24  1 6. If A = 5 1

2 0 −1

  −3 4 2  , B = 1 1 2

0 3 2

  1 3 −2 and C = −1 3 2

 4 2 2 0. 0 3

Then compute (i)

−3B, 4C

(ii)

A2 − 2B2 + C2

(iii) A + (B − C) Also verify A + (B − C) = (A + B) − C.       3 1 1 2 3 2 0 −2 1     . 7. If A = 1 0 −1 , B = 2 0 and C = 1 2 3 −4 4 −1 −1 3 2 Verify that A(BC) = (AB)C.     p q r s 8. Show that for all p, q, r, s the matrices A = and B = −q p −s r commute.   1 2 9. If A = , find A2 + 3A + 5I where I is second-order identity −3 0 matrix.     cos nx sin nx cos x sin x 10. If Ax = , then show that (Ax )n = , − sin x cos x − sin nx cos nx where n is a positive integer.

20 11.

12.

13.

14.

Linear Algebra to Differential Equations       1 −3 2 2 1 −2 1 4 0 If A = 2 1 −3, B = 3 −2 −1, C = 2 1 1. 4 −3 −1 2 −5 0 1 −2 2 Verify that AB = AC. (Note that the cancellation law in general does not hold good for matrix multiplication.)   p q If A = , prove that A2 − (p + s)A + (ps − qr)I = 0 where I is −r s second-order-identity matrix.     3 5 −1 4 2 −5 If A = ,B= , then solve the matrix equations 1 2 5 −3 5 3 for X (i) 3A + 5X = 2B , (ii) αA + βB = γX, where α, β and γ are scalars.     cos2 x sin x cos x cos2 y sin y cos y Prove that A = is a sin x cos x sin2 x sin y cos y sin2 y zero matrix, where x and y differ by an odd multiple of π2 .

15. Let A be a square matrix. Ak is defined by the relation Ak = AAk−1 , where k is a positive integer. Show that

1.4

(i)

Am An = Am+n

(ii)

[Am ]n = Amn .

Types of Matrices

In this section, various types of matrices are introduced. These matrices are important and are useful in studying physical phenomena. Definition 1.4.1 Transpose of a matrix A = [aij ]m×n is a matrix AT = [a0ij ]n×m where a0ij = aji , i = 1, 2, . . . , m and j = 1, 2, . . . , n.  1 Example 1.4.1 If A = 1 4

  −1 1 2  then AT = −1 3

 1 4 . 2 3

The following properties are satisfied by the transpose of a matrix. (i)

(AT )T = A

(ii)

(A + B)T = AT + BT .

(iii) (kA)T = kAT , where k is any scalar. (iv) (AB)T = BT AT , where the product AB is well defined.

Vectors and Matrices    3 2 −2 −1 Example 1.4.2 If A = −4 0, B = 5 −3 −1 5

21  0 then −2



   4 −9 −4 4 8 27 4 0  and (AB)T = −9 4 −14 . AB =  8 27 −14 −10 −4 0 −10      −2 5  4 8 27 3 −4 −1 Further, BT AT = −1 −3 = −9 4 −14 . 2 0 5 0 −2 −4 0 −10 Thus, (AB)T = BT AT .   ¯ of a matrix A = aij Definition 1.4.2 The conjugate matrix, A, is m×n   ¯ a matrix A = bij m×n where bij = a¯ij that is bij = conjugate of aij for i=1,2,...m, j=1,2,...,n. Example 1.4.3 Let A =  1 − 2i ¯ then A = −i

 1 + 2i i

 1 1 + 3i , −2 + i 5 + 3i

 1 1 − 3i . −2 − i 5 − 3i

The following properties are satisfied by conjugate matrices. Let Am×n , Bm×n be two matrices. Then 1. (A) = A. 2. A + B = A + B. 3. AB = A B, here m = n. Combining the above two definitions gives rise to a new definition. Definition 1.4.3 The conjugate transpose of a matrix, Ac , of a given matrix Am×n is defined as Ac = AT. The conjugate transpose of matrix satisfies (i)

(Ac )c = A

(ii)

T

Ac = AT = A .

Example 1.4.4 Verify the above statements for the matrix   2 + i 3 + 2i 4 + 3i A = 5 + 4i 6 + 5i 7 + 6i. 1 2 2 + 3i

22

Linear Algebra to Differential Equations



2+i Then AT = 3 + 2i 4 + 3i

5 + 4i 6 + 5i 7 + 6i

  1 2−i 2  and Ac = 3 − 2i 2 + 3i 4 − 3i

5 − 4i 6 − 5i 7 − 6i

 1 2 . 2 − 3i

The conclusion is left to the reader. Definition 1.4.4 A symmetric matrix An×n is a square matrix satisfying the criterion A = AT.     a h g 3 −2 4 6 Example 1.4.5 A = h b f  and −2 5 g f c 4 6 −1 are symmetric matrices.

Aliter definition Am×n is a symmetric matrix if and only if aij = aji , i, j = 1, 2, 3, . . . , n. Definition 1.4.5 A skew-symmetric matrix Am×n is a square matrix satisfying aij = −aji , i, j = 1, 2, 3, . . . , n. In other words A = −AT. 

0 h Example 1.4.6 A = −h 0 −g −f

  g 0 f  and B =  2 0 −4

−2 0 −5

 4 5 0

are skew-symmetric matrices. Observation. For skew-symmetric matrices, the diagonal elements are zero. The following are some important properties of symmetric and skewsymmetric matrices. 1. If A and B are symmetric matrices (skew-symmetric) then A + B, A − B are also symmetric matrices (skew-symmetric) matrices. 2. For any square matrix A, A + AT is symmetric and A − AT is skewsymmetric.

Result 1.4.1 A square matrix A can be expressed as a sum of a symmetric matrix and a skew-symmetric matrix in a unique way.   2 −3 −1 4  as the sum of symExample 1.4.7 Express the matrix A = 3 6 5 1 0 metric and skew-symmetric matrices.

Vectors and Matrices    4 0 4 0 Solution. Consider A + AT = 0 12 5 and A − AT = 6 4 5 0 6

−6 0 −3

23  −6 3 0

Set P = 21 (A + AT ) and Q = 21 (A − AT );     2 0 2 0 −3 −3 3  hence A = 0 6 52  + 3 0 is in the required fashion. 2 5 −3 2 2 0 3 2 0 Definition 1.4.6 A Hermitian matrix is a square matrix A = [aij ]n×n satisfying the condition aij = aji . Observation: By the definition, aij = aji means that the diagonal elements of a Hermitian matrix are real.   3 5 − 2i Example 1.4.8 A = . 5 + 2i 8 Definition 1.4.7 A skew-Hermitian matrix is a square matrix A = [aij ]n×n satisfying the condition aij = −aji . Observation: The definition implies that aij = −aji , which means the diagonal elements of a skew-Hermitian matrix are either zero or purely imaginary.     0 −3 − 2i 2i 4 − 4i Example 1.4.9 A = , B= . 3 − 2i 0 −4 − 4i −i Result 1.4.2 If A is a Hermitian matrix, then B = iA is a skew-Hermitian matrix. Proof. Let A be a Hermitian matrix. Then aij = aji . Let aji = a + ib. Then aij = a − ib. Set B = iA =⇒ bij = iaij = i(a − ib) = ai + b, and bji = iaji = i(a + ib) = ai − b = −(b − ia) = −(b + ai) = −bij . Hence B is a skew-Hermitian matrix. Definition 1.4.8 An orthogonal matrix is a real square matrix satisfying AAT = I.   cos x − sin x Example 1.4.10 Let A = . Then A is an orthogonal matrix. sin x cos x      cos x − sin x cos x sin x 1 0 Solution. Since AAT = = . sin x cos x − sin x cos x 0 1 Result 1.4.3 The product of any two orthogonal matrices of same order is orthogonal.

24

Linear Algebra to Differential Equations

Proof. Let Am×m and Bm×m be orthogonal. Thus, AAT = I and BBT = I. Consider (AB)(AB)T = (AB)(BT AT ) = I. Definition 1.4.9 An Idempotent matrix is a square matrix satisfying A2 = A. Example 1.4.11 Show that A is an Idempotent matrix where   −1 3 5 A =  1 −3 −5 . −1 3 5 Solution. Left as an exercise. Definition 1.4.10 A Nilpotent matrix is a square matrix for which An = 0, where n is a positive integer. 

1 Example 1.4.12 Show that A = −1 1 the value of n.

 −4 4  is a Nilpotent matrix. Give −4

−3 3 −3

Solution. A2 = O. Thus n = 2. Definition 1.4.11 A periodic matrix of order n is a square matrix satisfying the relation An+1 = A, for some least positive integer n. 

1 Example 1.4.13 Show that A = −3 2 period 2.

−2 2 0

 −6 9  is a periodic matrix of −3

Solution. Left as an exercise. Definition 1.4.12 An involutory matrix is lation A2 = I.  0 1 Example 1.4.14 Show that A = 4 −3 3 −3

a square matrix satisfying the re −1 4  is an involutory matrix. 4

Solution. Left as an exercise. Definition 1.4.13 An Unitary matrix is a square matrix satisfying the relation AAc = I.

Vectors and Matrices

25

EXERCISE 1.4 1. Let A and B be two matrices that are compatible. Then prove that (i)

(A + B)T = AT + BT

(ii)

(kA)T = kAT

(iii) (AB)T = BT AT . 2. If A is a square matrix, show that (i)

A + AT is symmetric;

(ii)

A − AT is skew-symmetric.

3. Prove that, if A is a square matrix, then AAT and AT A are both symmetric. 4. If A, B are the conjugates of the matrices A and B, respectively, and are compatible then show that (i)

(A + B) = A + B

(ii)

(kA) = kA, if k is complex

(iii) AB = A B. 5. Prove that every Hermitian matrix H can be expressed as H = P + Q, where P is real and skew-symmetric and Q is real and symmetric. 6. Show that the following matrix is a  1 A = −1 1

Nilpotent matrix of the order 2.  −1 −2 1 2 . −1 −2

7. Find a necessary and sufficient condition  a −b A = −a b a −b (i)

to be Nilpotent of order 2;

(ii)

to be Idempotent.

8. Complete Example 1.4.11. 9. Do Examples 1.4.13 and 1.4.14.

for the matrix  −c c −c

26

1.5

Linear Algebra to Differential Equations

Elementary Operations and Elementary Matrices

Sometimes it is convenient to transform a given matrix into a diagonal matrix or a triangular matrix. This can be done by applying certain operations on the given matrix A. It is important to note that these operations are done within a matrix and these are the only operations that can be performed within a matrix. Definition 1.5.1 The elementary row (column) operations on a given matrix are as follows. Let Ri and Rj be the ith row and jth row, respectively of the matrix A and Ck , Cl be the kth column and lth column, respectively of the matrix A. (i)

Interchange of two rows (columns), that is Ri ↔ Rj (Ck ↔ Cl ).

(ii)

Multiplying a row (column) with a constant, that is Ri → αRi (Ck → αCk ), where α is a constant.

(iii) Multiplying a row (column) with a constant and adding it to another row (column), that is Rj → Rj + αRi (Ck → Ck + αCl ).

Example 1.5.1 Consider A4×4

(i)

 1 3 = 4 8

8 1 32 4

 4 1  16 3

Interchanging of two rows: R3 ↔ R2 gives  1 4 ∼ 3 8

(ii)

3 2 12 5

3 12 2 5

8 32 1 4

 4 16  1 3

Multiplying a row with a constant: multiply R2 with 1 4 R2   1 3 8 4 1 3 8 4  ∼ 3 2 1 1 8 5 4 3

1 4,

that is, R2 →

Vectors and Matrices

27

(iii) Multiplying a row with a constant and adding it to another row: R2 → R2 + (−1)R1 (multiplying R1 with (−1) and adding it to R2 ).   1 3 8 4 0 0 0 0  ∼ 3 2 1 1 . 8 5 4 3 Note that similar operations can be done on columns. It is important to note that these row (column) operations do not change the properties of the matrix. Definition 1.5.2 Elementary matrix is obtained by doing a single row (column) operation on an identity matrix.   1 0 0 Example 1.5.2 (i) 0 0 1 interchanging R2 ↔ R3 . 0 1 0  3 (ii) 0

 0 , R1 → 3R1 . 1

 1 (ii) 3

 0 , R2 → R2 + 3R1 . 1

Result 1.5.1 A row operation done on matrix A is equivalent to premultiplying A with the corresponding elementary matrix.   3 4 2 Example 1.5.3 A = −1 0 1 4 2 2  −1 R1 ↔ R2 gives  3 4

 0 1 4 2. 2 2

Doing the same operation on Identity yields   0 1 0 3 4 1 0 0 −1 0 0 0 1 4 2

matrix I3 and premultiplying with A   2 −1 1 =  3 2 4

 0 1 4 2 . 2 2

Similarly, other operations can be verified. Result 1.5.2 A column operation performed on matrix A is equivalent to post-multiplying A with the corresponding elementary matrix.

28

Linear Algebra to Differential Equations   6 3 2 Example 1.5.4 Let A = 2 1 0 4 2 2   3 3 2 1  Then C1 → 2 C1 gives A = 1 1 0. 2 2 2    1 6 3 2 0 0 2 Now consider 2 1 0  0 1 0 4 2 2 0 0 1  3 = 1 2

3 1 2

 2 0. 2

Definition 1.5.3 Equivalent Matrices. Two matrices A and B are said to be equivalent if one can be obtained from the other by a sequence of elementary row and/or column operations. Also, two matrices are said to be equivalent if one is obtained from the other by pre-multiplying and/or post-multiplying with a sequence of elementary matrices. Notation. A ∼ B means that A is equivalent to B.

EXERCISE 1.5 1. Verify Result 1.5.1 for the following matrices by (i) R2 ↔ R3 (ii) R2 → 2R2 (iii) R3 → R3 − 2R3 .     2 3 1 0 8 6 3 (i) A = −1 2 1 1 (ii) 9 2 1 −2 4 2 2 2 4 5     1 3 4 1 2 3 2 4  (iii)  (iv) −2 5 6 4. 4 4 3 2 2 3 2 1 2. Verify Result 1.5.2 for the following matrices by (i) C1 ↔ C3 (ii) C2 → 2C2 (iii) C3 → C3 + 2C2 .   1 0 2 1   3 1 2 1 4 3 11  (i) B =  (ii) 1 2 4 2 2 5 10 1 1 −2 −1



2 (iii) −1 3

1.6

Vectors and Matrices   3 1 −3 2 −2 (iv)  2 −2 5 8

29 8 5 1 3 6 −1



4 2 . −2

Determinants

In this section, the notion of a determinant is introduced and its properties are discussed. Definition 1.6.1 A determinant is a numerical value associated with every square matrix A and is denoted by detA or |A|. The determinant of a matrixA =  [a]1×1 is defined as detA = a. a b is defined as detA = ad − bc. The determinant of A = c d 2×2 In order to define the determinant of higher-order matrices, the following concepts are needed. Definition 1.6.2 A submatrix B of a given matrix Am×n is obtained by deleting one or more rows and (or) one or more columns of A. Definition 1.6.3 The (i, j)th minor Mij of a matrix An×n is the determinant of the matrix obtained by removing the ith row and the jth column of A. Definition 1.6.4 The (i, j)th cofactor of a matrix An×n is given by Aij = (−1)i+j Mij . To find the determinant of a An×n matrix, the following procedure given by Laplace is useful.

Laplace's rule to find the determinant of a matrix of order n
Consider the elements of the ith row, aij, 1 ≤ j ≤ n, where i is fixed. Corresponding to each element ai1, ai2, · · · , ain find the cofactors Ai1, Ai2, · · · , Ain; then the determinant of A is given by
|A| = detA = ai1 Ai1 + ai2 Ai2 + · · · + ain Ain.

Observation
(i) The detA can be found using any row i, i = 1, 2, 3, · · · , n, as fixed.
(ii) The detA can be found similarly using any fixed column j = 1, 2, 3, · · · , n.


Geometrical representation
Consider two position vectors x = (a11, a21) and y = (a12, a22). To find the area of the parallelogram formed by these vectors, write
            [a11 a12]
A = [x y] = [a21 a22];
then |A| = a11 a22 − a12 a21 gives the required area.

If x = (a11, a21, a31), y = (a12, a22, a32) and z = (a13, a23, a33), then
      |a11 a12 a13|
|A| = |a21 a22 a23|
      |a31 a32 a33|
is the volume of the parallelepiped formed by the three vectors x, y and z. Similarly, the determinant of an n × n matrix gives the volume of an n-parallelepiped formed by n vectors.

Example 1.6.1 Using the second row, find the det of the matrix
    [3 5  −7]
A = [4 1 −12].
    [2 9  −3]
Then
|A| = 4(−1)^(2+1) |5 −7|  + 1(−1)^(2+2) |3 −7|  − 12(−1)^(2+3) |3 5|
                  |9 −3|               |2 −3|                  |2 9|
    = (−4)(48) + (1)(5) + (12)(17) = 17.
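As a quick numerical illustration of Laplace's rule (this sketch is not part of the original text), the following Python function expands a determinant along the first row by recursion on cofactors; NumPy's det is used only as a cross-check on the matrix of Example 1.6.1.

```python
import numpy as np

def det_laplace(A):
    """Determinant by cofactor (Laplace) expansion along the first row."""
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    total = 0.0
    for j in range(n):
        minor = np.delete(np.delete(A, 0, axis=0), j, axis=1)  # remove row 0 and column j
        total += (-1) ** j * A[0, j] * det_laplace(minor)      # cofactor expansion term
    return total

A = np.array([[3, 5, -7], [4, 1, -12], [2, 9, -3]], dtype=float)
print(det_laplace(A), np.linalg.det(A))   # both give 17, as in Example 1.6.1
```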

Properties of determinants
Some useful properties of the determinant are stated below without proof. An example is given to illustrate each property. Let A be an n × n matrix. Then

1. |A| = |AT|.

Example 1.6.2 Let
    [ 2 −1 1]              [ 2 3 −1]
A = [ 3  2 4];  then AT =  [−1 2  0]
    [−1  0 3]              [ 1 4  3]
and |A| = |AT| = 27.

2. Let B be the matrix obtained by interchanging two rows (columns) of A. Then |B| = −|A|.

Example 1.6.3 Let
    [3 −2  2]          [1  2 −3]
A = [1  2 −3]  and B = [3 −2  2];
    [4  1  2]          [4  1  2]
then |A| = 35 and |B| = −35 = −|A|.

3. If any two rows (columns) of the matrix A are identical or proportional (that is, Rj = aRi or Ck = bCl), then |A| = 0.

Example 1.6.4 (i) Let
    [ 2 −1 −2]
A = [−3  2  3],  then |A| = 0.
    [−1  0  1]

(ii) Consider
    [ 5 1  2]
A = [ 2 4  6],  then |A| = 0.
    [25 5 10]

4. Let Bn×n be the matrix obtained by multiplying a row (column) of the matrix A by a scalar a, (i.e.) Ri → aRi [Ck → aCk]; then |B| = a|A|.

Example 1.6.5 Let
    [1 −2  3]          [1 −2  6]
A = [3 −1  1]  and B = [3 −1  2]  (C3 → 2C3);
    [4  2 −3]          [4  2 −6]
then |A| = 5 and |B| = 10 = 2|A|.

5. Let Bn×n be the matrix obtained by doing the third elementary row operation on A. Then |B| = |A|.

Example 1.6.6 Let
    [2 1 1]
A = [3 2 4],  then |A| = 5.
    [1 0 3]

Let R1 → R1 + 5R2 and write B as
    [2 + 15  1 + 10  1 + 20]
B = [  3       2       4   ].  Then |B| = 5.
    [  1       0       3   ]

6. If in any row (column) of a matrix every entry is a sum of two numbers, then the determinant can be written as a sum of two determinants: |A| = |B| + |C|, where B and C are matrices of the same order that agree with A except in that row (column).


Example 1.6.7 Let
    [2 + b  1 + q  4 + r]
A = [  4      1      3  ].
    [  7      2      1  ]
Then |A| = |B| + |C|, where
    [2 1 4]          [b q r]
B = [4 1 3]  and C = [4 1 3]
    [7 2 1]          [7 2 1]
agree with A except in the first row (Prove!).

Example 1.6.8 If
    [1 a a²]
A = [1 b b²],
    [1 c c²]
then |A| = (a − b)(b − c)(c − a).

Applying row operations R1 → R1 − R2 and R2 → R2 − R3,
      |0  a − b  a² − b²|                  |0  1  a + b|
|A| = |0  b − c  b² − c²| = (a − b)(b − c) |0  1  b + c| = (a − b)(b − c)(c − a).
      |1    c      c²   |                  |1  c    c² |

1 ω Example 1.6.9 Evaluate the determinant of A =  ω ω 2 ω2 1 where ω is complex and is one of the cube roots of unity. Applying the column operations C1 → C1 + C2 + C3   1 + ω + ω2 ω ω2 A = 1 + ω + ω 2 ω 2 1 , 1 + ω + ω2 1 ω since 1 + ω + ω 2 = 0, detA = 0. Example 1.6.10 Show that the det of A = 4a2 b2 c2 for   a2 bc ac + c2 b2 ac  . A = a2 + ab 2 ab b + bc c2

 ω2 1 ω

Vectors and Matrices

33

Applying row operations R1 → R1 + R2 + R3   2a(a + b) 2b(b + c) 2c(c + a) b2 ac  A =  a2 + ab 2 ab b + bc c2 a + b b + c c + a b a |A| = 2abc a + b b b+c c Applying row operations again on the matrix obtained R2 → R2 − R1 gives a + b b + c c + a −c −c |A| = 2abc 0 b b+c c = 2abc[(a + b)(−c2 + c(b + c)) + b[(b + c)(−c) + c(c + a)]] = 4a2 b2 c2 .

Result 1.6.1 If A is an orthogonal matrix, then |A| = ±1. Since |A| = |AT |, |AAT | = |A|2 = |I| = 1. Hence the result |A| = ±1.

EXERCISE 1.6

1. If Aij is the cofactor of aij in the matrix A = [aij]3×3 and |A| = D, show that
|a11 a12 a13| |A11 A12 A13|
|a21 a22 a23| |A21 A22 A23| = D³.
|a31 a32 a33| |A31 A32 A33|

2. Find the product of the determinants of the matrices
[1 2 3]       [2 3 4]
[2 3 1]  and  [3 4 2].
[3 1 2]       [4 2 3]

3. If a, b, c have all different values and
    [a  a²  a³ − 1]
A = [b  b²  b³ − 1],
    [c  c²  c³ − 1]
then prove that abc = 1 using |A| = 0.

4. Let
    [1 + p² − q²      2pq          −2q     ]
A = [    2pq       1 − p² + q²      2p     ],
    [    2q           −2p       1 − p² − q²]
then show that |A| = (1 + p² + q²)³.

5. Find the square of the determinant
|0 x y 0|
|x 0 0 a|
|y 0 0 b|
|0 a b 0|
in a determinant form.

6. Prove that
|sin(x + π/4)   cos x    sin x|
|sin(x + π/4)   sin x    cos x|  =  (1/√2)(√2 − 1) cos 2x
|     1           a      1 − a|
for any value of a.

1.7

Inverse of a Matrix

In Section 1.2, the sum and scalar multiplication of vectors and in Section 1.3, the sum, scalar multiplication and product of matrices were introduced parallel to the definitions of real numbers. Now comes the turn of the reciprocal or the inverse of a matrix. Observing that the reciprocal of a real number a ≠ 0 satisfies the relation
a · (1/a) = 1 = (1/a) · a,
the question that arises is whether there is a similar concept for matrices. The problem is, given a matrix Am×n, is it possible to find matrices Bn×m and Cn×m such that the matrix product is compatible and Am×n Bn×m = Im and Cn×m Am×n = In. The answer leads to the following notion.

Definition 1.7.1 Let a matrix Am×n be given. (i)

A matrix Bn×m is called the left inverse of A, if Bn×m Am×n = In .

(ii) A matrix Cn×m is called the right inverse of A, if Am×n Cn×m = Im .

These inverses may or may not exist and this fact is illustrated in the following example.

Example 1.7.1 Consider
    [1 −1 2]
A = [1  0 1].

Find the left inverse and right inverse for A, if they exist.
Solution: The left inverse B3×2, if it exists, satisfies the equation B3×2 A2×3 = I3. This means that there are nine equations to determine the six entries of B3×2; moreover, B3×2 A2×3 has rank at most 2 and so can never equal I3. Hence B cannot be found.
So consider
       [a d]
C3×2 = [b e];
       [c f]
then AC = I2 gives
          [a d]
[1 −1 2]  [b e]   [1 0]
[1  0 1]  [c f] = [0 1],
which gives the equations
a − b + 2c = 1,  a + c = 0,                                   (1.4)
and
d − e + 2f = 0,  d + f = 1,                                   (1.5)
giving the solution matrix
[ −α     1 − β]
[α − 1   1 + β],
[  α       β  ]
where α and β are arbitrary. Thus A has infinitely many right inverses but not a single left inverse.
The process of finding inverses of rectangular matrices needs more information and is considered later in Section 1.9. The rest of this section deals with the inverse of a square matrix.

Definition 1.7.2 A square matrix Bn×n is said to be the inverse of An×n if and only if AB = BA = In. Then A is said to be invertible and B is called the inverse of A and is denoted by A−1.

Result 1.7.1 |A−1| = 1/|A|.


Proof. From the definition of A−1, AA−1 = I = A−1A. Taking determinants and using the fact that |AB| = |A||B| yields |A||A−1| = 1. Hence, |A−1| = 1/|A|.

Definition 1.7.3 (i) A nonsingular matrix A is a matrix whose |A| ≠ 0. (ii) A singular matrix A is a matrix whose |A| = 0.

Example 1.7.2
    [1 2 1]
A = [2 1 0] is a singular matrix, since |A| = 0.
    [3 6 3]

    [1 0 1]
B = [2 1 0] is a nonsingular matrix,
    [0 3 2]

A12 A22 .. .

··· ··· .. .

T A1n A2n   ..  . 

Am1

Am2

···

Amn



m×n

where Aij is the cofactor of the (i, j)th element of A.

Result 1.7.2 For any square matrix of order n, A (adjA) = (adjA) A = |A|In . This yields A−1 = Thus A−1 =

1 |A| adjA.

1 |A| adjA

is a working rule.

,


Theorem 1.7.1 A square matrix An×n possesses an inverse if and only if it is nonsingular.

Proof. Suppose A possesses an inverse, i.e., A−1 = B exists and AB = BA = In. Then |A||B| = |In| = 1, which gives |A| ≠ 0, and hence A is nonsingular.
Conversely, suppose A is nonsingular; then |A| ≠ 0. Consider
(adjA)(A) = (A)(adjA) = |A|In.
Since |A| ≠ 0,
((adjA)A)/|A| = ((A)adjA)/|A| = In,
which means A−1 = adjA/|A|, thus completing the proof.

Result 1.7.3 If An×n and Bn×n are compatible and invertible, then (AB)−1 = B−1A−1.
Consider (AB)(B−1A−1) = AInA−1 = In. Similarly, (B−1A−1)(AB) = B−1InB = In. Hence (AB)−1 = B−1A−1.

Result 1.7.4 If A is a nonsingular matrix, then the following properties hold.
(i) (A−1)−1 = A.
(ii) (AT)−1 = (A−1)T.
(iii) The inverse of the conjugate of A is the conjugate of A−1.
(iv) The inverse of the conjugate transpose of A is the conjugate transpose of A−1.


Proof.
(i) Since A−1A = I, consider (A−1)−1 = (A−1)−1 I = (A−1)−1 A−1 A = A.

(ii) Consider AA−1 = I = A−1A. Taking the transpose and applying the product rule for transposes gives
(AA−1)T = IT = (A−1A)T, that is, (A−1)T AT = I = AT (A−1)T,
which means (AT)−1 = (A−1)T.

(iii) Consider AA−1 = I = A−1A. Taking the conjugate and applying the property for conjugates of a product gives that the conjugate of A times the conjugate of A−1 equals I on both sides, which means the inverse of the conjugate of A is the conjugate of A−1.

(iv) Applying relations (ii) and (iii) on AA−1 = I and then taking the transpose gives the result.

Result 1.7.5 If A is an involutory matrix, then A is its own inverse. That is, the inverse of an involutory matrix A is itself, since A² = I implies AA = I or A = A−1.

The following are some methods of finding the inverse of a matrix. The inverse of a matrix An×n can be calculated using the adjoint matrix.

(iv) Applying relations (ii) and (iii) on AA−1 = I gives, AA−1 = I = A−1 A =⇒ A A−1 = I = A−1 A. Taking transpose gives the result. Result 1.7.5 If A is an involutory matrix, then A is its own inverse. That is, the inverse of an involutory matrix A is itself, since A2 = I implies AA = I or A = A−1 . The following are some methods of finding the inverse of a matrix. The inverse of a matrix An×n can be calculated using the adjoint matrix. 

2 Example 1.7.3 Find the inverse of A = −7

 −6 using adjA. 3

Clearly |A| = −36 6= 0 is nonsingular and A−1 exists. Further, A11 = 3,

A12 = 6

Vectors and Matrices A21 = 7,

39

A22 = 2.

Hence by definition, using adjA, A

−1

 1 3 =− 36 7

 1  − 12 6 = 2 7 − 36

− 61 1 − 18

 .

Remark 1.7.1 For any square matrix of order two, one can obtain the inverse by the following rule. Rule Interchange the position of the main diagonal elements, change the signs of second diagonal elements and divide by |A|. Verify this rule for the above example.  0 Example 1.7.4 Find the inverse of the matrix A = 1 3

1 2 1

 2 3 . 1

Here |A| = −2 and hence A−1 exists. Further verify that A11 = −1, A12 = 8, A13 = −5, A21 = 1, A22 = −6, A23 = 3, A31 = −1, A32 = 2, A33 = −1, T  −1 8 5 A−1 = −1/2  1 −6 3  then, −1 2 −1   1 −1 1 A−1 = 1/2 −8 6 −2 . −5 −3 1

Gauss-Jordan method
The inverse of a given square matrix of order n can also be found by applying elementary row (column) operations. This method is called the Gauss-Jordan method. The method consists of writing the given matrix as A = In A, and applying row operations to A on the left and to In on the right until the left side becomes In; the right-hand factor is then A−1. Alternatively, the matrix can be written as A = A In, and column operations are applied in the same manner. The following examples illustrate the method.

Example 1.7.5 Find the inverse of the matrix
    [−2  4]
A = [ 3 −5]
using the Gauss-Jordan method.


Here, |A| = −2 ≠ 0, A is nonsingular and A−1 exists. Now perform elementary row operations as follows:
[−2  4]   [1 0]
[ 3 −5] = [0 1] A

R1 → −(1/2)R1:
[1 −2]   [−1/2 0]
[3 −5] = [  0  1] A

R2 → R2 − 3R1:
[1 −2]   [−1/2 0]
[0  1] = [ 3/2 1] A

R1 → R1 + 2R2:
[1 0]   [5/2 2]
[0 1] = [3/2 1] A

Hence
       [5/2 2]
A−1 =  [3/2 1].

Example 1.7.6 Find the inverse of the matrix
    [ 1  2 −1]
A = [−1  1  2]
    [ 2 −1  1]
using row operations.

Here |A| ≠ 0, and hence A−1 exists. Now,
[ 1  2 −1]   [1 0 0]
[−1  1  2] = [0 1 0] A
[ 2 −1  1]   [0 0 1]

R2 → R2 + R1, R3 → R3 − 2R1:
[1  2 −1]   [ 1 0 0]
[0  3  1] = [ 1 1 0] A
[0 −5  3]   [−2 0 1]

R2 → (1/3)R2:
[1  2  −1 ]   [ 1   0  0]
[0  1  1/3] = [1/3 1/3 0] A
[0 −5   3 ]   [−2   0  1]

R1 → R1 − 2R2, R3 → R3 + 5R2:
[1 0 −5/3 ]   [ 1/3 −2/3 0]
[0 1  1/3 ] = [ 1/3  1/3 0] A
[0 0 14/3 ]   [−1/3  5/3 1]

R3 → (3/14)R3:
[1 0 −5/3]   [  1/3  −2/3    0  ]
[0 1  1/3] = [  1/3   1/3    0  ] A
[0 0   1 ]   [−1/14  5/14  3/14 ]

R1 → R1 + (5/3)R3, R2 → R2 − (1/3)R3:
[1 0 0]   [ 3/14 −1/14  5/14]
[0 1 0] = [ 5/14  3/14 −1/14] A
[0 0 1]   [−1/14  5/14  3/14]

Hence,
       [ 3/14 −1/14  5/14]
A−1 =  [ 5/14  3/14 −1/14].
       [−1/14  5/14  3/14]

Example 1.7.7 Find the inverse of the matrix
    [3 −3 4]
A = [2 −3 4]
    [0 −1 1]
using column operations.

Clearly, A is nonsingular and A−1 exists. Now,
[3 −3 4]     [1 0 0]
[2 −3 4] = A [0 1 0]
[0 −1 1]     [0 0 1]

C1 → C1 − C3, C1 → −C1:
[1 −3 4]     [−1 0 0]
[2 −3 4] = A [ 0 1 0]
[1 −1 1]     [ 1 0 1]

C2 → C2 + 3C1, C3 → C3 − 4C1:
[1 0  0]     [−1 −3  4]
[2 3 −4] = A [ 0  1  0]
[1 2 −3]     [ 1  3 −3]

C2 → (1/3)C2:
[1  0   0]     [−1 −1  4]
[2  1  −4] = A [ 0 1/3 0]
[1 2/3 −3]     [ 1  1 −3]

C1 → C1 − 2C2, C3 → C3 + 4C2:
[  1   0    0 ]     [  1   −1   0 ]
[  0   1    0 ] = A [−2/3  1/3 4/3]
[−1/3 2/3 −1/3]     [ −1    1   1 ]

C3 → −3C3:
[  1   0   0]     [  1   −1   0]
[  0   1   0] = A [−2/3  1/3 −4]
[−1/3 2/3  1]     [ −1    1  −3]

C1 → C1 + (1/3)C3, C2 → C2 − (2/3)C3:
[1 0 0]     [ 1 −1  0]
[0 1 0] = A [−2  3 −4]
[0 0 1]     [−2  3 −3]

Finally
       [ 1 −1  0]
A−1 =  [−2  3 −4].
       [−2  3 −3]

EXERCISE 1.7 1. If k is scalar, prove that (kA)−1 = k1 A−1 . 2. If A is nonsingular then prove that AB = AC implies B = C. Is the converse true? If not, give an example. 3. Find the  5 (i) 3  3 (iv) 2 6

inverse of the following matrices:    3 1 0 (ii) 2 −1 2    1 1 13 1 2 1 0 (v)  2 1 0. 3 1 5 0 1

4. Find the  1 0 (i)  2 3

inverse of the following matrices using row transformations   T 0 1 2 2 0 −1 1 3 4  (ii) 5 1 0  . 1 6 8 0 1 3 2 9 15

5. Find the inverse of the following matrices using    1 2 −1 −7 (i) (ii) 2 3 7 −1 0 1   1 0 0 (iii) 2 2 −1. 1 −1 1

(iii)

 1 3

 2 4

column operations  3 0 2

1.8 Partitioning of Matrices

Handling matrices of a very large order will be difficult in terms of algebraic operations. Thus it will be convenient to break these large matrices into submatrices in such a way that they will be compatible for performing the chosen algebraic operation. This division of submatrices is done in different ways depending on the operation that is to be executed. This procedure of breaking up a given matrix into submatrices is called partitioning of matrices. Definition 1.8.1 A matrix is said to be partitioned if the matrix is divided into non-overlapping rectangular/square blocks with each block forming a reduced dimension matrix, a sub- matrix.   1 2 3 0 4 5 6 0 0  0 1 2 0 4 3 5 6 0     0 0 1 1 0 2 4 3 2     Example 1.8.1 A =     1 0 0 4 3 2 1 5 4     0 1 1 5 2 2 4 3 2  1 1 0 6 1 2 1 1 1   2 4 3  1 2 1   . B=    4 2 8 1 4 3 A has been divided into six submatrices of the order 3 × 3, B has been divided into four submatrices, two of the order 2×2 and two are of the order 2×1. This partitioning of matrices is done depending upon the operations to be performed.

1. Addition of matrices using partitions Let Am×n and Bm×n be two matrices of the same order. Then, in order to define addition, they have to be partitioned in such a way that their submatrices are also of the same order. The partitioning may be as follows:     Er×k Fr×n−k Pr×k Qr×n−k A= and B = . Gm−r×k Hm−r×n−k Rm−r×k Sm−r×n−k Then,  A+B=

E+P G+R

 F+Q , H+S

44

Linear Algebra to Differential Equations

that is, the corresponding submatrices are added. Example 1.8.2 Let  E3×3 A= G1×3  1 where, E = 0 3   G= q q c ,

F3×2 H1×2



 and B =

4×5

P3×3 R1×3

Q3×2 S1×2

 , 4×5

      3 1 1 2 a b d 2 , P = 2 1 4, F =  c d , Q = r 3 2 0 1 e f b       R = c b c , H = 2 1 and S = 3 3 .

 m s v

1 4 2

   Then, A + B =   

2 2 5

2 5 2

5 6 4

a+d c+r e+b

b+m d+s f +v

g+c

q+b

2c

5

4

   .  

2. Product of two matrices Let Am×p and Bp×n be two matrices such that their product (AB)m×n exists. To find the product using the partitioning of matrices, care should be taken while doing the partitioning. The partitioning must be done in such a way that the product of each of the submatrices exists. In other words, the partitioning of p-columns of A and the p-rows of B must be done in the same way.     Er×k Fr×p−k Pk×l Qk×n−l Let A = and B = Gm−r×k Hm−r×p−k m×p Rp−k×l Sp−k×n−l p×n 

(EP + FR)r×l AB = (GP + HR)m−r×l

(EQ + FS)r×n−l (GQ + HS)m−r×n−l

  2 1 0 −1  −2 3 0   2 B= Example 1.8.3 Let A =     −1 0 1 3 

 . m×n

1 −1 1 1 2

Find the product of matrices using the partition method.     E F P Q A= and B = , G H R S

2

 0 0   1

Vectors and Matrices hence  AB =

EP + FR GP + HR

 0 3 EQ + FS = 8 1 GQ + HS 4 1 

−1 5 3

45  0 0. 1

3. Inverse of a matrix using partitions Consider a matrix An×n which can be partitioned as   P Q A= , R S where P is an r × r matrix, Q is an r × s matrix, Rs×r matrix and Ss×s matrix such that r + s = n. Assume that A−1 exists and can be partitioned in a similar way as   X Y A−1 = , Z W where Xr×r , Yr×s , Zs×r and Ws×s matrices are to be found. Observe that X, Y, Z and W are of the same order as P, Q, R and S respectively. Since AA−1 = I,      X Y I1 O P Q  =  A= R S Z W O I2 where I1 and I2 are identity matrices of the order r and s, respectively. Using the definition of the product of matrices for partitioned matrices gives the following equations. PX + QZ = I1 PY + QW = O RX + SZ = O RY + SW = I2 . Solving for X, Y, Z and W gives W = [S − RP−1 Q]−1 Y = −P−1 QW Z = −WRP−1 X = P−1 − P−1 QZ. Example 1.8.4 Find the inverse using partitioning method for the matrix   1 1 1  2 3 −1  . A=   3 5 2

46

Linear Algebra to Differential Equations

Partitioning A as 

 1 1 1  2 3 −1   A=   3 5 2        1 1 , Q= , R= 3 5 , S= 2 3 −1

 1 P= 2

P

−1

 1 = 2

1 3

−1



3 = −2

   S − RP−1 Q = 2 − 3

 5



3 −2

 −1 1 −1 1



 1 −1

 = 2 − [−1 2]

 1 = 2 − [−3] = 5 −1 W = [S − RP−1 Q]−1 =

Y = −P−1 QW = − Z = −WRP

−1

 1 3 5 −2

1 =− 3 5

5



3 −2



   1 4 1 =− −1 5 −3   1 −1 = − −1 2 1 5

  1 3 3 −1 + −2 1 5 −2    1 3 3 −1 = + −2 1 5 −2    1 4 3 −1 = + −2 1 5 −3   2.2 0.6 = . −1.4 −0.2

X = P−1 − P−1 QZ =





−1 1

1 5

  1  −1 2 −1   −1 −1 2 1 1 −2  −8 6 −1 1



Thus A−1

   3.8 X Y = = −2.6 Z W 0.2

−2.6 2.2 −0.4

  2.2 0.8 −0.6 = −1.4 0.2 0.2

0.6 −0.2 −0.4

 −0.8 0.6  . 0.2



EXERCISE 1.8 1. Find the sum and difference of the following matrices using the partitioning method.     1 0 5 2 0 3 a b c  1 1 1    C= −6 12 8 and D = 0 2 5 . 0 8 9 1 1 3 2. Find the product of the matrices using the partition method     3 1 4 2 2 4 1 2 3 3 2  1 3 2     P= 1 2 2 3 and Q = 1 2 1 . 5 3 2 3 2 3 5 4 1 2 1 3. Find the inverse of the following  4 3 A= 1 2

matrix using the partitioning method  0 2 1 2 2 0 . 3 1 −2 −2 1 1

4. Find the inverse of the following matrix A using the partitioning method and solve the system Ax = b, where     4 2 1 1 −2  8 3 0 2 1    A= 9 4 −1 1  and b = 16 . 8 1 1 2 1
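The block-inverse formulas of this section (W = [S − RP−1Q]−1, Y = −P−1QW, Z = −WRP−1, X = P−1 − P−1QZ) can be checked numerically. The following sketch is illustrative only, using the matrix of Example 1.8.4 and NumPy's inverse as a cross-check.

```python
import numpy as np

A = np.array([[1., 1., 1.], [2., 3., -1.], [3., 5., 2.]])   # matrix of Example 1.8.4
P, Q = A[:2, :2], A[:2, 2:]
R, S = A[2:, :2], A[2:, 2:]

Pinv = np.linalg.inv(P)
W = np.linalg.inv(S - R @ Pinv @ Q)      # W = (S - R P^{-1} Q)^{-1}
Y = -Pinv @ Q @ W                        # Y = -P^{-1} Q W
Z = -W @ R @ Pinv                        # Z = -W R P^{-1}
X = Pinv - Pinv @ Q @ Z                  # X = P^{-1} - P^{-1} Q Z

A_inv = np.block([[X, Y], [Z, W]])
print(np.allclose(A_inv, np.linalg.inv(A)))   # True
```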

1.9

Advanced Topics: Pseudo Inverse and Congruent Inverse

In Section 1.7, the notion of a left inverse and a right inverse was introduced. Depending on the size, an m × n matrix may possess an infinite number of left inverses or right inverses. There are many types of inverses defined for a rectangular matrix, which are called a pseudo inverse or a generalized inverse. In this section, the Moore-Penrose inverse for rectangular matrices is defined. The Moore-Penrose inverse, if it exists, is unique. Next, using the concept of a congruence relation, modular inverse of a matrix is defined. This matrix is used to encrypt messages in cryptography.



Generalized inverse of a matrix
Definition 1.9.1 Moore-Penrose Inverse. Let Am×n be a matrix of rank r. The Moore-Penrose inverse of A is a matrix of order n × m, denoted by A+, satisfying the following criteria.
1. AA+A = A.
2. A+AA+ = A+.
3. (AA+) = (AA+)T.
4. (A+A) = (A+A)T.

Some of the important properties of the Moore-Penrose inverse are
(i) The Moore-Penrose inverse is unique, if it exists.

(ii)

(A+ )T = (AT )+

(iii) (A+ )+ = A (iv) Rank(A) = Rank(A+ ) (v)

If A and B are two nonsquare matrices and if AB = 0, then B+ A+ = 0

(vi) AT = A+ if and only if AT A is idempotent.

Result 1.9.1 Let A be a real matrix.
(i) If A has linearly independent columns, then A+ = (AT A)−1 AT is a left inverse of A (since A+A = I).
(ii) If A has linearly independent rows, then A+ = AT (AAT)−1 is a right inverse of A (since AA+ = I).

Example 1.9.1 If a = (a1, · · · , an) is a nonzero row vector, then
a+ = (1/(aaT)) aT.
Verification: aa+ = a aT/(aaT) = 1, so aa+a = a and a+aa+ = a+. Further, aa+ and a+a are symmetric. A systematic procedure to find A+ is given in later chapters after the singular value decomposition of a matrix is introduced.
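For larger matrices, the Moore-Penrose inverse can be obtained numerically. The sketch below is illustrative only; the matrix A is an assumed example with linearly independent columns, chosen to match the hypotheses of Result 1.9.1(i).

```python
import numpy as np

A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])      # assumed example, independent columns
A_plus = np.linalg.pinv(A)                               # Moore-Penrose inverse
left = np.linalg.inv(A.T @ A) @ A.T                      # Result 1.9.1(i): (A^T A)^{-1} A^T
print(np.allclose(A_plus, left))                         # True
print(np.allclose(A_plus @ A, np.eye(2)))                # A^+ is a left inverse

# the four Penrose criteria can be checked directly
print(np.allclose(A @ A_plus @ A, A), np.allclose(A_plus @ A @ A_plus, A_plus))
```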



Congruent inverse of a matrix
In order to introduce the congruent inverse or the modular inverse of a matrix, the following preliminary definitions are needed.

Definition 1.9.2 Let p, r, s ∈ N. Then r is said to be congruent to s mod p (denoted by r mod p = s) if and only if p | (r − s), that is, p divides r − s.

Definition 1.9.3 (Modular inverse of k ∈ N) The modular inverse of k is l if and only if kl mod p = 1.

Example 1.9.2 (11)−1 mod 19 = 7 since (11)7 = 77 and 77 mod 19 = 1. Result 1.9.2 If congruent inverse or modular inverse of k is l (i.e.) k −1 mod p = l, 0 ≤ l ≤ p − 1, then (−k)−1 mod p = p − l. Example 1.9.3 (−11)−1 mod 19 = (19 − 7) = 12, from the above example and result.

Procedure to find the congruent or modular inverse
The following is the procedure to find the congruent or modular inverse of a 2 × 2 matrix
    [a b]
A = [c d].

Step 1. Find the conventional inverse of A. Then
             [ d −b]
A−1 = (1/k)  [−c  a],
where k = ad − bc.

Step 2. Find l such that kl mod p = 1; then k−1 mod p = l.

Step 3. Replace 1/k in A−1 with l and multiply the matrix by l, (i.e.)
       [ ld  −bl]
A−1 =  [−cl   al].

Step 4. Find the mod p value of each entry:
[s t]
[u v],  where

ld mod p = s, −bl mod p = t, −cl mod p = u, al mod p = v. This is the modular or congruent inverse of A.


Example 1.9.4 Let
    [ 8 1]
M = [13 3];
find the modular inverse of M with p = 19.

Step 1. det(M) = 11 and
              [  3 −1]
M−1 = (1/11)  [−13  8].

Step 2. 11−1 mod 19 = 7.

Step 3. Now
       [ 21 −7]
M−1 =  [−91 56].

Step 4. (i) 21 mod 19 = 2 (iii) −91 mod 19 = 4

(ii) −7 mod 19 = 12 (using result) (iv) 56 mod 19 = 18.

Hence, congruent inverse or modular inverse of M is   2 12 M−1 = . c 4 18     20 114 1 0 −1 Now one can verify that MMc = = . 38 210 0 1
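The four steps above are easy to automate. The sketch below is illustrative only (the function name is my own); it reproduces the congruent inverse of Example 1.9.4 and checks that the product reduces to the identity modulo 19.

```python
import numpy as np

def modular_inverse_2x2(M, p):
    """Congruent (modular) inverse of a 2x2 integer matrix modulo p."""
    a, b, c, d = (int(x) for x in (M[0, 0], M[0, 1], M[1, 0], M[1, 1]))
    k = (a * d - b * c) % p
    l = pow(k, -1, p)                      # Step 2: l with k*l = 1 (mod p)
    adj = np.array([[d, -b], [-c, a]])     # Step 1: the adjugate of M
    return (l * adj) % p                   # Steps 3-4: multiply by l and reduce mod p

M = np.array([[8, 1], [13, 3]])            # matrix of Example 1.9.4, p = 19
Minv = modular_inverse_2x2(M, 19)
print(Minv)                                # [[ 2 12], [ 4 18]]
print((M @ Minv) % 19)                     # identity mod 19
```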

EXERCISE 1.9
1. Verify that the MP inverse of a column vector b is b+ = (1/(bTb)) bT.
2. Verify that if A = (1, 2), then A+ = (1/5)(1, 2)T.
3. Verify that if a is a nonzero 1 × n row vector, then (aTa)+ = (1/(aaT)²) (aTa).
4. Find the congruent inverse of
    [ 8 −1]
N = [13 −3]  with p = 19.
5. Find the congruent inverse of
    [ 8 −1]
N = [12 −3]  with p = 11.

1.10 Conclusion

This chapter contains the fundamental blocks of the contents of the book. Care has been taken to introduce all the necessary notions that are needed for the smooth flow of results in the following chapters. Determinants and congruent inverse are defined in this chapter and their applications to conics and cryptography respectively are showcased in Chapter 5. The chapter is self-contained and the treatment of the basics relative to vectors and matrices is complete. The contents of this chapter, which are the basics of linear algebra, are presented in a concise manner. For additional information, the reader can refer to [19, 14]. For information on pseudo inverses refer to [21].

Chapter 2 Linear System of Equations

2.1

Introduction

Advances in technology and sophisticated computers opened up a galaxy of problems that can be studied numerically. Any challenging problem in space exploration, missile technology, genetic coding or the age-old transportation problem involve a whole bunch of equations (obtained through prior knowledge) involving a large number of variables. These equations, if expressed adroitly as a linear system, can be easily studied using matrices. In this chapter, a linear system of equations is introduced in Section 2.2 and the concept of rank of a matrix which gives a very good understanding of the system of equations is discussed in Section 2.3. The row echelon form and the normal form that help finding the rank of a matrix are described in Section 2.4. The theory of linear system of homogeneous and nonhomogeneous equations forms the content of Section 2.5. Cayley–Hamilton theorem is stated and its use in finding the inverse of a matrix is dealt in Section 2.6. The eigen-values and eigen-vectors, along with diagonalization of a matrix are studied in Section 2.7. Section 2.8 deals with singular values and singular vectors and the quadratic forms are dealt in Section 2.9.

2.2

Linear System of Equations

In this section, a linear system of equations is defined and the problems that can arise in solving the system are given by considering equations involving two unknowns. This section provides the necessary motivation for studying the succeeding sections.

Definition 2.2.1 A linear equation in two unknowns x and y is an equation of the form ax + by = 0, where a and b are constants.

If the number of equations involving x and y is more than one, then they can


be considered as a system of equations. In general, a system of m equations in n unknowns is as follows. Definition 2.2.2 A linear system of equations or a system of linear equations is a bunch of m linear equations in n unknowns given by a11 x1 + a12 x2 + · · · + a1n xn = b1 a21 x1 + a22 x2 + · · · + a2n xn = b2 .. . am1 x1 + am2 x2 + · · · + amn xn = bm

(2.1)

It can be observed that in system (2.1) the unknowns x1 , x2 , . . . , xn are repeated in every equation. Now their coefficients without changing the order can be written as row vectors R1 = (a11 a12 . . . a1n ), R2 = (a21 a22 . . . a2n )... and Rm = (am1 am2 . . . amn ). Forming a matrix with the m-rows gives   a11 a12 · · · a1n  a21 a22 · · · a2n    A= . ..   . am1 am2 ... amn   x1  x2    By writing x =  .  as the unknown n × 1 column vector and using matrix  ..  xn multiplication, system (2.1) can be written as Ax = b, 

(2.2)



b1  b2    where b =  .  is an m × 1 column vector.  ..  bm Equation (2.2) describes system (2.1) in an efficient, compact and an elegant manner. Definition   2.2.3 A solution of a linear system (2.2) is an n×1 column vector s1  s2    x =  .  satisfying the equation (2.2).  ..  sn



Remark 2.2.1 The ordered set of n numbers s1 , s2 , . . . , sn satisfy all the equations in the system (2.1). This can be verified by putting x1 = s1 , x2 = s2 , . . . , xn = sn in (2.1). It is quite obvious that solving a large system of equations could be complicated and various situations may crop up. Most of the situations that need to be understood are illustrated in the following examples. Example 2.2.1 Solve the system of equations 2x + 3y = 5 and x + y = 2. Solution. From the second equation, y = 2 − x and substituting in the first equation gives −x + 6 = 5 or x = 1 and y = 1. Here the number of unknowns and the number of equations are equal and there is exactly one solution. Example 2.2.2 Solve x + y = 1, x − y = 0 and 2x + y = 3. Solution. It can be noted that no values of x and y will satisfy all the 3 equations. Hence there is no solution. These equations are said to be inconsistent. Observe that there are two unknowns and three equations. Example 2.2.3 Solve x + y = 1, x − y = 0, and 2x + 2y = 2. Solution. The 3 equations have a solution x = 1/2 and y = 1/2. Here there are 3 equations but still, there is a solution. Discover the reason (Hint. Redundant equation). Example 2.2.4 Solve 2x + 3y = 8. Solution. There is only one equation and there are two unknowns. Writing (8 − 3y) , it can be seen that for every choice of y, there is a solution. In x= 2 other words, there are infinite number of solutions. All the fore mentioned examples replicate the scenario for the linear system (2.2) however large the values of m and n are! In order to work with large and very large systems of linear equations, new concepts and tools are introduced in the next two sections.

EXERCISE 2.2
1. Show that the equations
x + y + z = 0
−3y + 3z = 0
y − z = 0
have infinitely many solutions.

2. Show that the equations
x + y + z = 0
2x + 5y + z = 0
x − y + z = 0
have only the zero solution (this is called the trivial solution).

2.3

Rank of a Matrix

In this section, the concept of rank of a matrix is introduced and its properties are given. The rank of a matrix helps to determine the solution structure of the system (2.2). Definition 2.3.1 The rank of a matrix Am×n is denoted by ρ(A) = r and is defined as the number of independent row vectors (column vectors) of the matrix A. Alternatively, the following definition of rank of a matrix could be easily verified for matrices of small order. Definition 2.3.2 The rank of a matrix Am×n is said to be r if and only if (i) Every minor of order (r + 1) and above is zero; (ii) There exists at least one minor of order r which is non zero. Example 2.3.1 Find the rank of the matrix   −2 1 2 A =  1 3 0. 2 0 5 Solution. |A| = (−2)(15) − 1(5) + 2(−6) = −30 − 5 − 12 = −47. Thus |A| = 6 0 which gives ρ(A) = 3. Example 2.3.2 Find the rank of the matrix   1 4 2 A = 3 3 4. 2 −1 2



Solution. Here |A| = 1[6 + 4] − 4[6 − 8] + 2[−3 − 6] = 10 + 8 − 18 = 0. Next consider, 1 4 3 3 = 3 − 12 = −9 6= 0. Hence, ρ(A) = 2. Some of the important properties of the rank of a matrix are stated below. The proofs of these properties are avoided at this stage. Using the material in the next section all the properties mentioned below can be proved easily.

Properties of the rank of a matrix (i) Rank of a matrix is unique. (ii) Rank of zero matrix O = [0]m×n is defined to be zero. (iii) For a Am×n matrix, the rank, ρ(A) ≤ min{m, n}. (iv) Rank of the identity matrix, In is n. (v) If An×n is nonsingular then, ρ(A) = n. (vi) ρ(A) = r implies that there are ‘r’ linearly independent rows in the matrix. (vii) ρ(A) = r implies that every minor of order (r + 1) and above is zero. (viii) If ρ(Am×n ) = r and Bn×n and Cm×m are nonsingular matrices then ρ(A) = ρ(CA) and ρ(A) = ρ(AB). This means that rank of a matrix remains unchanged by multiplication by nonsingular matrices. (ix) if ρ(A) = r1 , and ρ(B) = r2 Then ρ(AB) ≤ min{r1 , r2 }. 

1 Example 2.3.3 Find the rank of the matrix A =  x x3

1 y y3

 1 z , z3

where x, y and z are real numbers. Solution. By performing elementary column operations it follows that 1 0 0 y−x z − x det(A) = x x3 y 3 − x3 z 3 − x3 = 1[(y − x)(z 3 − x3 ) − (z − x)(y 3 − x3 )] = (y − x)(z − x)[(z 2 + xz + x2 ) − (y 2 + xy + x2 )] = (y − x)(z − x)[z 2 − y 2 + x(z − y)] = (y − x)(z − x)(z − y)[x + y + z] = (x − y)(y − z)(z − x)(x + y + z). Then the following cases arise:

58

Linear Algebra to Differential Equations 1. x + y + z 6= 0 and x 6= y 6= z. Then |A| = 6 0 and ρ(A) = 3 (by definition). 2. x + y + z = 0 and x 6= y 6= z. Then |A| = 0 hence ρ(A) 6= 3. Then, 1 1 = y − x 6= 0. Hence ρ(A) = 2. consider the 2 × 2 matrix, x y 3. If x 6= y and y = z or x = z and z 6= y or x 6= z and x = y (i.e. Two 1 1 = 6 0. Hence of x, y, z are equal and the third is different). Then x z ρ(A) = 2. 4. If x + y + z = 0 and x = y = z 6= 0. Then |A| = 0 and all 2 × 2 minors have their determinant zero. Hence ρ(A) = 1.

Example 2.3.4 Prove that the points (x1 , y1 ), (x  2 , y2 ), and (x3 , y3 ) are x1 y1 1 collinear if the rank of the matrix A = x2 y2 1 is less than 3. x3 y3 1 Solution. ρ(A) < 3 implies |A| = 0, which implies that the area of the triangle formed by the given points is zero. This means that the points are collinear.

EXERCISE 2.3 1. If A is a nonzero column matrix of order m × 1 and B is a nonzero row matrix of order 1 × n. Show that ρ(AB) = 1. 2. Determine the rank of the following matrices.    1 2 3 1 2 3 2 2 4 3 (i) 2 3 5 1 (ii)  3 2 1 1 3 4 5 6 8 7  2   1 22 32 42 1 1+i 22 32 42 52    0 i (iii)  (iv) 32 42 52 62  1 1 + 2i 42 52 62 72 3. Find the rank of the matrix  0  −c   b −a0

c 0 −a −b0

−b a 0 −c0

 a0 b0  , c0  0

 0 2  3 5  −i 1 + 2i 1+i



where aa0 + bb0 + cc0 = 0 and a, b, c, a0 , b0 and c0 are all positive numbers. 4. Prove that the rank of a matrix A is the number of linearly independent columns (rows) of a matrix. 5. Show that the rank of a matrix of order n × n having every element as 1 has rank 1. 6. Let A = [aij ] be a matrix of order n such that ai,i+1 = 1, aij = 0, j − i 6= 1. Show that the rank of Ak is n − k, k < n.
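Ranks such as those found in Examples 2.3.1 and 2.3.2 can be checked numerically. The following sketch is illustrative only and is not part of the text's development.

```python
import numpy as np

A = np.array([[-2, 1, 2], [1, 3, 0], [2, 0, 5]])    # Example 2.3.1: nonsingular
B = np.array([[1, 4, 2], [3, 3, 4], [2, -1, 2]])    # Example 2.3.2: |B| = 0
print(np.linalg.matrix_rank(A), np.linalg.matrix_rank(B))   # 3 2
```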

2.4

Echelon Form and Normal Form

In the previous section, it has been observed that the rank of a matrix gives a decent picture about the solutions of the linear system (2.2). A couple of examples were given to find the rank of a matrix. It is quite clear that such an approach would be extremely unwieldy if the order of the matrix is very large. In this situation, it would be more practical to use the first definition of a rank of a matrix. But the glitch is that just by looking at a matrix it will not be possible to state the rank of a matrix. Thus in this section, two forms of a matrix are given which provide the rank of a matrix on observation. The first one is called row echelon form and is defined as follows. Definition 2.4.1 A matrix is said to be in a row echelon form if and only if (i) All zero rows (rows having all zero entries) are below any row with a non zero entry. (ii) In every nonzero row the first nonzero entry (called leading entry) is one. (iii) In any two successive nonzero rows, the leading entry (one) in the succeeding row is to the right of the leading entry (one) in the preceding row. Definition 2.4.2 A matrix is said to be in a reduced row echelon form if and only if it is in row echelon form and (iv) The leading entry (one) in each nonzero row is the only nonzero entry in that column.

60

Linear Algebra to Differential Equations

Example 2.4.1 The following matrices are in row echelon   1 0 0 2 0 1 2 1 3 0 1 0 3  0 0 0 1 2  (ii) B =  (i) A =  0 0 1 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0

form  1 5 2 1  4 2  0 0 0 0

Example 2.4.2 The following matrices are in reduced row echelon form   0 1 0 0 0 0 1 0  (i) A =  0 0 0 1 0 0 0 0   1 0 0 0 1 0  (ii) B =  0 0 1 0 0 0 Result 2.4.1 Given any Am×n matrix it can be changed into a row echelon form or a reduced row echelon form by a series of elementary row operations. Result 2.4.2 The number of nonzero rows in an echelon form is the rank of the matrix. Example 2.4.3 Reduce the given matrix into echelon form and hence find its rank.   2 1 5 3 8 4 13 7  A= 4 2 3 1 −8 −4 1 3 Solution. Consider the following row operations: R2 → R2 − 4R1 , R3 → R3 − 2R1 , R4 → R4 + 4R1 It follows that

 2 0 A∼ 0 0

1 0 0 0

5 −7 −7 21

 3 −5  −5 15

now, considering the row operations R4 → R4 + 3R3 , R3 → R3 + (−1)R2 , then,   2 1 5 3 0 0 −7 −5  A∼ 0 0 0 0 0 0 0 0

Linear System of Equations 61   1 1/2 5/2 3/2//0 0 1 5/7  0 0 R1 → 1/2R1 , R2 → −1/7R2 then A 0 0 0 0 0 0 Clearly, A has been reduced to a matrix in echelon form and ρ(A) = 2. Example 2.4.4 Find the value of q, such that the rank of matrix   0 1 1 −1 2 0 2 2  is 2. A= 1 2 q −1 2 −1 −3 −1   2 0 2 2 0 1 1 −1  Solution. Consider R1 ↔ R2 , then, A ∼  1 2 q −1 2 −1 −3 −1 R1 → R1 /2   1 0 1 1 0 1 1 −1  ∼ 1 2 q −1 2 −1 −3 −1 R3 → R3 − R1 ; R4 → R4 − 2R1   1 0 1 1 0 1 1 −1  ∼ 0 2 q − 1 −2 0 −1 −5 −3 Since rank of A = 2, determinant of every 3 by 3 matrix = 0. 1 1 −1 Thus, 2 q − 1 −2 = 0, −1 −5 −3 1[(q − 1)(−3) − 10] + (−1)[−6 − 2] − 1[−10 + q − 1] = 0 −3q − 7 + 8 + 11 − q = 0 −4q + 12 = 0 ⇒ q = 3. Definition 2.4.3 The  normal  formof a matrix is any of the following ma  Ir I O trices: Ir , Ir O , or r , where Ir is an identity matrix of rank O O O r and O is any p × q zero matrix. Example 2.4.5 The following matrices are in normal form

62  1 0  (i)  0 0 0  1 0  (ii)  0 0  1 0 (iii)  0 0  1 (iv) 0

0 1 0 0 0

0 0 1 0 0

Linear Algebra to Differential Equations  0 0 0   0 0 0  I3 O3×3 0 0 0 =  O2×3 O2×3 0 0 0 0 0 0  0 0  = I4 0 1

0 0 1 0 0 1 0 0  0   I2 1 = 0 O2×2 0   0 0 0 0 = I2 1 0 0 0

O2×3



Result 2.4.3 Every matrix Am×n of rank r can be reduced to one of the fore-mentioned normal forms by a sequence of elementary row or column operations. Example 2.4.6 Find the rank of the following matrix by reducing it to the normal form   2 1 5 3 8 4 13 7  A= 4 2 3 1 −8 −4 1 3 Solution. Consider the column operation  1 2 4 8 A∼ 2 4 −4 −8

C1 ↔ C2 ,  5 3 13 7 . 3 1 1 3

Next perform the following sequence of operations: R2 → −R2 + 4R1 , R3 → R3 − 2R1 , R4 → R4 + 4R1   1 2 5 3 0 0 7 5  ∼ 0 0 −7 −5 0 0 21 15 R3 → R3 + R2 ;

R4 → R4 − 3R2

 1 0 ∼ 0 0

2 0 0 0

C2 → C2 − 2C1 , C3  1 0 0 0 ∼ 0 0 0 0 C2 ↔ C3  1 0 0 7 ∼ 0 0 0 0

5 7 0 0

Linear System of Equations  3 5  0 0

63

→ C3 − 5C1 , C4 → C4 − 3C1  0 0 7 5  0 0 0 0  0 0 0 5  0 0 0 0

C2 → (1/7)C2  1 0 ∼ 0 0

0 1 0 0

0 0 0 0

 0 5  0 0

C4 → C4 − 5C2  1 0 ∼ 0 0

0 1 0 0

0 0 0 0

 0 0  0 0 

I2 Thus the above matrix is in the normal form O2×2 The rank of the matrix is ρ(A) = 2.

O2×2 O2×2



Result 2.4.4 Existence of nonsingular matrices P and Q. To reduce a given matrix to its normal form a sequence of row operations R1 , R2 , . . . , Ri and a sequence of column operations C1 , C2 , . . . , Cj are performed. This is equivalent to pre- multiplying A with a sequence of elementary matrices Ei , Ei−1 , ..., E1 and post multiplying A with a sequence of elementary matrices Ej , Ej−1 , ..., E1 . Set P = Ei Ei−1 ...E1 and Q = Ej Ej−1 ...E1 . Then the matrix PAQ will be in normal form with P and Q being nonsingular matrices. This is illustrated in the following example. Example 2.4.7 Find non-singular matrices P and Q such that PAQ is in the normal form of the matrix and hence deduce the rank of   1 3 6 −1 A = 2 8 10 2  . 1 5 4 3

64

Linear Algebra to Differential Equations

Set A=I3 AI4 . Any row operations done on LHS matrix A will be done on the matrix I3 and column operations executed on LHS matrix A will be operated on the matrix I4 .       1 0 0 0 1 3 6 −1 1 0 0   2 8 10 2 =0 1 0 A 0 1 0 0 . 0 0 1 0 1 5 4 3 0 0 1 0 0 0 1

 1 0 0

R2 → R2 − 2R1 , R3 → R3 − R1     1 3 6 −1 1 0 0 0  2 −2 4 =−2 1 0A  0 2 −2 4 −1 0 1 0

R3 → R3 − R2    1 1 3 6 −1 0 2 −2 4 =2 1 0 0 0 0 R2 → 1/2R2    1 1 3 6 −1 0 1 −1 2 =−1 0 0 0 0 1

 1 0 0

 1 0 0

0 1 −1

0 1 2

−1

 0 0A 1

 1 0  0 0

0 1 0 0

0 0 1 0

 0 0  0 1

0 1 0 0

0 0 1 0

 0 0  0 1

  1 0 0 0 A  0 1 0

0 1 0 0

0 0 1 0

 1 0  0 1

C2 → C2 − 3C1 , C3 → C3 − 6C1 , C4 → C4 + C1     1 −3 −6 3 6 −1 1 0 0 0 1 0 0 −2 0 =−1 1 −1A  0 0 1 2 0 0 −1 0 1 0 0 0 C3 → C3 + C2 , C4 → C4 − 2C2     1 1 0 0 0 0 0  0 1 0 0=−1 12 0A  0 0 0 0 1 −1 1 0

−3 1 0 0

−9 1 1 0

 7 −2  0 1

Thus the nonsingular matrices P and Q are     1 −3 −9 7 1 0 0 0 1 1 −2 , P=−1 21 0 and Q= 0 0 1 0 1 −1 1 0 0 0 1 and the rank of the matrix, ρ(A) = 2.

 7 −2  0 1

Linear System of Equations

65

A fundamental question that crops up is that whether the row echelon form and the normal form of the matrix depend on the sequence of elementary row operations/column operations performed on the matrix A. The following theorem deals with this question. Theorem 2.4.1 The number of nonzero rows in an echelon form and the Ir obtained in a normal form are independent of the sequence of row/column operations performed on a given matrix Am×n . In other words, the row echelon form and normal form of the given matrix A are independent of the nonsingular matrices P and Q obtained.

EXERCISE 2.4 1. Reduce the following matrices to echelon form and hence find their rank  1 2 3 −2 2 4 6 −1   −1 −3 −2 2  2 5 6 −4  (a)

2. For what value of  3 1 2 2  8 4 5 3

 1 −1 4 1 (b) 1 3 1 1

 2 −1 1 2 . 0 4 0 2

p will the matrix  p 8 6 −1  have rank = 2. 12 15  9 7

3. Determine the nonsingular matrices P and Q such that PAQ is in normal form for     1 1 2 4 1 −1 −2 1 2  3 −4  1 . (a)  (b)2 3 0 −1 −1 2  1 −1 1 2 1 3 1
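Row echelon reduction is also convenient to do symbolically. The sketch below is illustrative only; it uses SymPy's exact reduced row echelon form on the matrix of Example 2.4.3 and reads the rank off the pivot columns.

```python
from sympy import Matrix

A = Matrix([[2, 1, 5, 3], [8, 4, 13, 7], [4, 2, 3, 1], [-8, -4, 1, 3]])   # Example 2.4.3
R, pivots = A.rref()        # reduced row echelon form and pivot columns
print(R)                    # two nonzero rows
print(len(pivots))          # rank = 2, as found in the example
```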

2.5

Solutions of a Linear System of Equations

A linear system of equations was introduced in Section 2.2 and various possibilities that occur while solving these types of systems have been discussed as special cases. In this section an arbitrary linear system of ‘m’ equations in

66

Linear Algebra to Differential Equations

‘n’ unknowns is again considered. The famous Gauss-elimination method and Gauss-Jordan method will be mentioned in brief (and again will be discussed in detail in Chapter 4) and then theory pertaining to the solutions of the system will be given. Consider a linear system of ‘m’ equations in ‘n’ unknowns given by



a11 a12 · · ·  a21 a22 · · ·  where A =  . .. ..  .. . . am1 am2 · · · x = [x1 , x2 , . . . , xn ]T is

Ax = b, 

a1n a2n   ..  . 

(2.3)

is the coefficient matrix,

amn m×n the unknown column vector(matrix)

and b = [b1 , b2 , . . . , bm ]T is a column vector(matrix).

Definition 2.5.1 The system (2.3) is said to be a nonhomogeneous linear system. If b = [0, . . . , 0]T in the system (2.3) then Ax = 0

(2.4)

is said to be a homogeneous linear system. Definition 2.5.2 Augmented matrix of the system (2.3) is the matrix K = [A : b]. In order to solve the system (2.3), Gauss elimination method is a very natural and a systematic procedure and it is the basis of all other elimination procedures. In this method, the augmented matrix K is reduced to row echelon form or an upper triangular matrix and the vector x is determined by back substitution. This method is illustrated through the following example.

Example 2.5.1 Solve the system  2 1 A = 6 2 4 2

Ax = b, where    5 3 7 and b = 1. 3 4

Solution. Consider the augmented matrix K = [A : b],
    [2 1 5 : 3]
K = [6 2 7 : 1].
    [4 2 3 : 4]
R2 → R2 − 3R1, R3 → R3 − 2R1. It follows that
    [2  1  5 :  3]
K ∼ [0 −1 −8 : −8].
    [0  0 −7 : −2]
The matrix is in an upper triangular form. On doing back substitution, the value of
    [−29/14]
x = [ 40/7 ].
    [  2/7 ]

The Gauss-Jordan method also aims at simplifying the given matrix so as to make it easier to find the solutions of a given linear system. The first stage of the Gauss-Jordan method is the same as that of the Gauss elimination method. Instead of doing back substitution once the matrix is available in an upper triangular form, a second set of elementary operations are employed so as to reduce it to an identity matrix. The following theorem relates the solutions of original system and the simplified system.
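For completeness, here is an illustrative sketch (not from the text) of Gauss elimination with back substitution applied to the system of Example 2.5.1, cross-checked against NumPy's library solver.

```python
import numpy as np

A = np.array([[2., 1., 5.], [6., 2., 7.], [4., 2., 3.]])
b = np.array([3., 1., 4.])                  # system of Example 2.5.1

# forward elimination to an upper triangular form
M = np.hstack([A, b[:, None]])
for i in range(2):
    for j in range(i + 1, 3):
        M[j] -= (M[j, i] / M[i, i]) * M[i]

# back substitution
x = np.zeros(3)
for i in range(2, -1, -1):
    x[i] = (M[i, 3] - M[i, i + 1:3] @ x[i + 1:]) / M[i, i]

print(x)                                    # [-29/14, 40/7, 2/7]
print(np.linalg.solve(A, b))                # same result from the library solver
```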

Theorem 2.5.1 Suppose the system (2.3) or the augmented matrix K is transformed by a sequence of elementary row operations into the system A0 x = b0

(2.5)

or the augmented matrix K0 = [A0 : b0 ]. Then x is a solution of the system (2.5) if and only if x is a solution of the system Ax = b. Proof. It is known, from Section 1.5, that every elementary row operation on K is equivalent to pre multiplication of K by an elementary matrix. Let p be the number of row operations done on K. Then there exist p elementary nonsingular matrices Ep , Ep−1 , ...E1 successively pre multiplied on K. Let H = Ep Ep−1 ...E1 . Note that H is non-singular. Thus [A0 : b0 ] = H[A : b] = [HA : Hb] by the rules of multiplication of partitioned matrices. This implies that A0 = HA and b0 = Hb. Let x be a solution of Ax = b. Pre-multiplying with H gives

68

Linear Algebra to Differential Equations

HAx = Hb that is, A0 x = b0 . Thus x is a solution of A0 x = b0 . Next, suppose x is a solution of A0 x = b0 then HAx = Hb. Pre-multiplying with H−1 gives Ax = b which means x is a solution of the system (2.3). The proof is complete. Result 2.5.1 If the linear system (2.3) has two distinct solutions x1 and x2 then it has an infinite number of solutions. Proof.

Let x1 and x2 be two distinct solutions of the linear system (2.3). Then Ax1 = b and Ax2 = b Further (λA)(x1 ) = A(λx1 ) = λb and (1 − λ)(Ax2 ) = A(1 − λ)x2 = (1 − λ)b which gives A[λx1 + (1 − λ)x2 )] = b. Thus λx1 + (1 − λ)x2 is a solution of the linear system (2.3) for any λ ∈ R. Thus if the system (2.3) has two distinct solutions it will have infinitely many solutions.

Theorem 2.5.2 Rouche’s Theorem The nonhomogeneous linear system (2.3) is consistent if and only if the coefficient matrix Am×n and the augmented matrix K have the same rank. Otherwise, the system is inconsistent. Theorem 2.5.3 Let the rank of the coefficient matrix A, ρ(A) = p and the rank of the augmented matrix K, ρ(K) = q. Then exactly one and only one of the following possibilities hold: (i) p 6= q, in which case the system (2.3) is inconsistent and there are no solutions. (ii) p = q = n (number of unknowns), then there exists a unique solution for the system (2.3). (iii) p = q 6= n(number of unknowns), then there exist infinitely many solutions for the system (2.3). Proof. Case (i) p 6= q since the coefficient matrix A is a part of the augmented matrix K, the rank A = p 6= q = rank K. The Gauss elimination method or the echelon form of K implies that the augmented matrix has an extra nonzero row more than the coefficient matrix. In other words there is a row having the equation 0.x1 + 0.x2 + ... + 0.xn = bp+1 6= 0. Since this can never happen, there is no solution for the system (2.3). Case (ii) p = q = n. This means that the coefficient matrix has n- linearly independent rows and that the inverse A−1 exists. Hence x = A−1 b is the only solution of the system (2.3). Case (iii) p = q < n. This means that there are p independent rows and

Linear System of Equations

69

the remaining rows are zero rows in the echelon form. Since there are p non zero rows there are p equations and, hence p-unknowns can be found. The remaining n − p variables have to be chosen arbitrarily and each choice gives a different solution for the system. Since the choices can be infinite, there exist an infinite number of solutions. If the coefficient matrix is a square matrix An×n then the following theorem holds. Theorem 2.5.4 Let An×n be the coefficient matrix of the system (2.3). Then the following conditions are equivalent. (i) The corresponding homogeneous system (2.4) has only the zero solution. (ii) rank A = n. (iii) For each bn×1 column vector, the system Ax = b has exactly one solution. (iv) A is nonsingular. Proof. Assume (i), that is the system Ax = 0 has only one solution. Then by the previous theorem (taking b = 0) implies that rank A = n, that is condition (ii) holds. Next assume that (ii) holds, then rank A = n. Consider the system Ax = b and form the augmented matrix K = [A : b].Then clearly rank A=rank K, which implies from the previous theorem that there exists a unique solution for the system Ax = b. Assume (iii), that is, there exists a unique solution for the system Ax = b. If possible, suppose A is singular, then clearly |A| = 0 which implies that rankA < n and that there are infinitely many solutions for the system Ax = b, a contradiction. Hence A is nonsingular. Suppose A is nonsingular, then clearly A−1 exists and Ax = 0 =⇒ x = 0, hence x = 0 is the only solution of the system Ax = 0, completing the proof. Result 2.5.2 If x1 and x2 are two distinct solutions of the nonhomogeneous linear system (2.3), then x1 −x2 is a solution of the homogeneous linear system (2.4) Proof. Suppose x1 and x2 are solutions of the nonhomogeneous linear system (2.3). Then Ax1 = b and Ax2 = b hold, Consider A(x1 − x2 ) = Ax1 − Ax2 = b − b = 0. Hence the conclusion. Result 2.5.3 If x is any solution of the homogeneous linear system (2.4) and x0 is a solution of the nonhomogeneous linear system(2.3), then every solution of the nonhomogeneous linear system(2.3) is of the form y = x + x0 .

70

Linear Algebra to Differential Equations

Proof. Given Ax0 = b and Ax = 0. Thus Ay = A(x0 + x) = Ax0 + Ax = b. This means any solution of system (2.3) is of the form y = x0 + x = x + x0 . Cramer’s Rule. Let A be a nonsingular matrix then the unique solution x = (x1 , x2 , . . . , xn ) of the nonhomogeneous linear system (2.3) is given by i] xi = det[A det(A) , i = 1, 2, . . . , n where Ai is the matrix obtained by replacing ith column of A by the column vector b Example 2.5.2 Solve the system x + y + 2z = 1 2x − 3y − z = −3

 1 Solution. A = 2 2

2x + 2y + 2z = 4.      x 1 2 −1, x =  y  and b = −3. z 4 2

1 −3 2

Here |A| = 1(−6 + 2) − 1(4 + 2) + 2(4 + 6) = 10. 1 1 2 |A1 | = −3 −3 −1 4 2 2 = 1(−6 + 2) − 1(−6 + 4) + 2(−6 + 12) = −4 + 2 + 12 = 10. Hence by Cramer’s Rule, x1 = Next, 1 1 |A2 | = 2 −3 2 4

|A1 | |A|

= 1.

2 −1 2

= 1(−6 + 4) − 1(4 + 2) + 2(8 + 6) = −2 − 6 + 28 = 20. It follows that 2| y = |A |A| =

Also, 1 |A3 | = 2 2

1 −3 2

20 10

1 −3 4

= 2.

Linear System of Equations = 1(−12 + 6) − 1(8 + 6) + 1(4 + 6) = −6 − 14 + 10 = −10. 3| z = |A |A| =

−10 10

= −1.



 1 Hence x =  2  . −1

EXERCISE 2.5 1. Find the values of k for which the equations x + 2y + z = k 3x + y − z = 1 7x + 4y − z = k 2 have a solution. Solve the system for each k. 2. For what values of a and b the equations 2x + ay + 2z = 6 2x + 2y + z = b 3x + 5y + z = 9 have a solution. 3. Check the consistency of the following systems and solve them. (a)

3x + y − 3z = 13 26x + 20y − 6z = 3 6y − 18z + 1 = 0.

(b)

2x + 6y + 11 = 0 x + 2y + z = 3 3x + 9y − z = 4.

4. Discuss for what values of a and b in the equations x + 3y + 2z = 5 3x + 2y + az = b x+y+z =8 have a unique solution, infinitely many solutions and no solution.

71


5. Solve the homogeneous system
x + y − 2z + w = 0
x + 2y − z − w = 0
2x + 3y − z + 2w = 0
3x + y + 4z + 3w = 0

6. Solve the following systems of linear equations using Cramer's rule.
(a) x + y + z = 7, 2x + 3y + z = 16, 3x + 4y + z = 22
(b) x + y + z = −1, 3x + 2y + z = −4, 4x + 3y + z = −6
(c) 3x + y + 2z = 6, x + 2y + 4z = 7, 9x + 3y + 2z = 14
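A short script along the following lines (an illustrative sketch, not part of the text; the function name is my own) can be used to check Cramer's-rule computations such as Example 2.5.2 or Exercise 6.

```python
import numpy as np

def cramer(A, b):
    """Solve Ax = b by Cramer's rule, assuming A is nonsingular."""
    d = np.linalg.det(A)
    x = np.empty(len(b))
    for i in range(len(b)):
        Ai = A.copy()
        Ai[:, i] = b                 # replace the i-th column of A by b
        x[i] = np.linalg.det(Ai) / d
    return x

A = np.array([[1., 1., 2.], [2., -3., -1.], [2., 2., 2.]])
b = np.array([1., -3., 4.])          # system of Example 2.5.2
print(cramer(A, b))                  # [ 1.  2. -1.]
```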

2.6

Cayley–Hamilton Theorem

In this section, the Cayley–Hamilton theorem is given along with proof. Further, application of the theorem in finding the powers of a matrix is presented. Theorem 2.6.1 Let An×n be a square matrix and p(λ) = det[λI − A] = det[A − λI] be the characteristic polynomial of A. Then p(A) = 0. (or) Every square matrix satisfies its own characteristic equation. Proof. Let the characteristic polynomial of A be p(λ) = λn + cn−1 λn−1 + cn−2 λn−2 + ... + c2 λ2 + c1 λ + c0 . Set E = λI−A. The maximum degree of the element of adj(E) is n−1. Hence we can find unique matrices Ei such that polynomial of (adj(E)) = E0 + E1 λ + E2 λ2 + ... + En−1 λn−1 . Since E I = (adj(E))E and p(λ) = E , (adj(E))E = p(λ)I = c0 I + c1 Iλ + c2 Iλ2 + ... + cn−1 Iλn−1 + Iλn

(2.6)

On the other hand (adjE)(E) = adjE(λI − A) = [E0 + E1 λ + ... + En−1 λn−1 ][λI − A] = E0 λ + E1 λ2 + ... + En−1 λn − E0 A − E1 Aλ − E2 Aλ2 − E3 Aλ3 + .... + En−1 Aλn−1 = −E0 A + [E0 − E1 A]λ + (E1 − E2 A)λ2 + ... + [En−2 − En−1 A]λn−1 + En λn

(2.7)

Linear System of Equations

73

Since the representations are unique, the equations (2.6) and (2.7) give −E0 A = c0 I =⇒ −E0 A = c0 I E0 − E1 A = c1 I =⇒ E0 A − E1 A2 = c1 A E1 − E2 A = c2 I =⇒ E1 A2 − E2 A3 = c2 A2 E2 − E3 A = c3 I =⇒ E2 A3 − E3 A4 = c3 A3 ···

···

···

En−2 − En−1 A =⇒ En−2 A

n−1

n

− En−1 A = cn−1 An−1

En−1 = I =⇒ En−1 An = An Now adding the equations on both sides, it follows that 0 = p(A) = c0 I + c1 A + · · · + cn−1 An−1 + An . The proof is complete.  1 2 Example 2.6.1 Verify Cayley–Hamilton theorem for A =  0 0

3 0 0 0

0 0 2 3

 0 0  3 2

and find its inverse.  1 2 Solution. Given matrix A =  0 0

3 0 0 0

0 0 2 3

 0 0  3 2

 1−λ  2 The characteristic matrix is A − λI =   0 0 1 − λ 2 Characteristic equation is A − λI = 0 = 0 0 −λ 0 0 0 2 3 − 3 0 2 − λ =⇒ (1 − λ) 0 2 − λ 0 0 3 2 − λ 3

3 −λ 0 0

 0 0 0 0   2−λ 3  3 2−λ

0 0 0 0 =0 2−λ 3 3 2 − λ 0 3 = 0 2 − λ

3 −λ 0 0

=⇒ (1 − λ)[(−λ)[(2 − λ)2 − 9]] − 3[2[(2 − λ)2 − 9]] = 0

74

Linear Algebra to Differential Equations λ4 − 5λ3 − 7λ2 + 29λ + 30 = 0.

By Cayley–Hamilton theorem, A satisfies its characteristic equation. Therefore A4 − 5A3 − 7A2 + 29A + 30I = 0. To verify this consider      7 3 0 0 1 3 0 0 1 3 0 0 2 0 0 02 0 0 0 2 6 0 0      A2 = AA =  0 0 2 30 0 2 3= 0 0 13 12, 0 0 12 13 0 0 3 2 0 0 3 2      7 3 0 0 1 3 0 0 13 21 0 0 2 6 0 0 2 0 0 0 14 6 0 0      A3 = A2 A =  0 0 13 120 0 2 3 =  0 0 62 63, 0 0 12 13 0 0 3 2 0 0 63 62      55 39 0 0 7 3 0 0 7 3 0 0    2 6 0 0  0  2 6 0 0  26 42 0 A4 =  0 0 13 120 0 13 12 =  0 0 313 312 0 0 312 313 0 0 12 13 0 0 12 13 Now, A4 − 5A3 − 7A2 + 29A + 30I      7 3 0 13 21 0 0 55 39 0 0 2 6 0 14 6 0 0   26 42 0 0    − 5 =  0 0 62 63 − 7 0 0 13 0 0 313 312 0 0 312 313 0 0 12 0 0 63 62       0 0 0 0 1 0 0 0 1 3 0 0 0 1 0 0 0 0 0 0 2 0 0 0      +29 0 0 2 3 + 30 0 0 1 0 = 0 0 0 0 = 0. 0 0 0 0 0 0 0 1 0 0 3 2

 0 0  12 13

Hence Cayley–Hamilton theorem is verified. To find the inverse of A, consider A4 − 5A3 − 7A2 + 29A + 30I = 0. Now, A−1 = −

1 {A3 − 5A2 − 7A + 29I} 30



0

1 3 = 0 0

1 2 − 16

0 0

0 0 − 52 3 5

 0 0   3 . 5 − 25


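The computations of Example 2.6.1 can be verified with a few lines of NumPy. The sketch below is illustrative only; it recovers the characteristic polynomial, checks that the matrix satisfies it, and obtains the inverse from the Cayley-Hamilton relation.

```python
import numpy as np

A = np.array([[1., 3., 0., 0.], [2., 0., 0., 0.],
              [0., 0., 2., 3.], [0., 0., 3., 2.]])      # matrix of Example 2.6.1
c = np.poly(A)                                          # characteristic polynomial coefficients
print(c)                                                # [1, -5, -7, 29, 30]

# Cayley-Hamilton: p(A) = A^4 - 5A^3 - 7A^2 + 29A + 30I = 0
p_A = sum(ci * np.linalg.matrix_power(A, 4 - i) for i, ci in enumerate(c))
print(np.allclose(p_A, np.zeros((4, 4))))               # True

# inverse from the theorem: A^{-1} = -(A^3 - 5A^2 - 7A + 29I)/30
A_inv = -(np.linalg.matrix_power(A, 3) - 5 * np.linalg.matrix_power(A, 2)
          - 7 * A + 29 * np.eye(4)) / 30
print(np.allclose(A_inv, np.linalg.inv(A)))             # True
```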

EXERCISE 2.6
Verify Cayley–Hamilton theorem for each of the following matrices and hence find its inverse.

        [1 4]             [ 7  2 −2]             [2 1 1]             [1  2 −1]
(i) A = [2 3]    (ii) A = [−6 −1  2]   (iii) A = [0 1 0]    (iv) A = [0  1 −1].
                          [ 6  2 −1]             [1 1 2]             [3 −1  1]

2.7 Eigen-values and Eigen-vectors

This section deals with only square matrices, say of, order n. Let An×n be any square matrix and x ∈ Rn be an n × 1 vector. Then Ax = y ∈ Rn is a vector with a direction and magnitude different from x. Taking cue from functions like f (x) = x2 , x3 , x4 and observing that 0 and 1 remain as fixed points the following question arises. Are there any vectors x ∈ Rn such that Ax remains in the direction of x? Further, in modeling of many physical problems matrices arise naturally. These matrices normally depend upon an unknown parameter which is very important. These parameters must be chosen so as to make the matrix singular. Such type of situations can be found in systems exhibiting oscillatory behavior. Some of the applications include vibration of a string, stretching of an elastic membrane, etc. Another important area is understanding the behavior of the solutions of linear homogeneous differential equations. All the a forementioned topics are studied using the concepts introduced in this section. Definition 2.7.1 An eigen-value and eigen-vector of a square matrix A are a scalar value λ and a nonzero vector x satisfying the relation Ax = λx,

x 6= 0.

(2.8)

Note. Characteristic values, latent roots, proper values and spectral values are some of the synonyms to eigen-values. The eigen-vectors are also called as characteristic vectors. From the relation (2.8) it is clear that finding the eigen-values and

76

Linear Algebra to Differential Equations

eigen-vectors means solving the homogeneous linear system (A − λI)x = 0, x 6= 0. Since x 6= 0, this yields that det (A − λI) = 0, from which the eigen-values λ can be determined. The above discussion yields the following theorem. Theorem 2.7.1 λ is an eigen-value of A if and only if det (A − λI)=0. Proof. By definition, λ is an eigen-value of A if and only if Ax = λx,

x 6= 0

⇐⇒ (A − λI)x = 0, x 6= 0 ⇐⇒ det(A − λI) = 0 ⇐⇒ λ is a root of the equation |A − λI| = 0. Note. |A−λI| is a polynomial of nth degree. Hence the equation |A−λI| = 0 yields ‘n’ eigen-values including multiplicities. Once the eigen-values are found, the eigen-vector corresponding to the eigen-value can be found by solving the homogeneous linear system (A − λI)x = 0, x 6= 0. The eigen-values together with their corresponding eigen-vectors form the eigensystem of A. Example 2.7.1 Find the eigen-values and the associated eigen-vectors for     1 3 2 5 4 the following matrices. (i) A = and (ii) A = 0 2 6. 1 2 0 0 3        1 5 4 1 1 Solution. It is observed that choosing v = yields = . −1 1 2 −1 −1   1 Hence λ = 1 is an eigen-value and is an eigen-vector corresponding to −1 the eigen-value λ = 1.   1 (ii) Similarly, x = 0 is an eigen-vector corresponding to the eigen-value 0 λ = 1,     1 3 2 1 1 since 0 2 6 0 = 0. 0 0 3 0 0 A systematic approach is required to find the eigen-values and their corresponding eigen-vectors. Taking the idea from Theorem 2.7.1 steps for obtaining the eigen-value and the corresponding eigen-vector are given. A procedure for finding the eigen-values and related eigen-vectors of a given square matrix A of order ‘n’ is as follows.

Linear System of Equations

77

Algorithm Step 1. Given a square matrix A of order n, consider identity matrix, In and write down the matrix A − λI, λ unknown; Step 2. Write down the equation given by det(A − λI) = 0, which is called as the characteristic equation and the corresponding polynomial is called as the characteristic polynomial; Step 3. Find the roots of the characteristic equation, which are the eigen values of A. There will be ‘n’ roots including multiplicities if any, when order of A is n; Step 4. Corresponding to each eigen-value λ of A, consider the homogeneous system (A − λI)x = 0 and find the nontrivial solution x. This x is the eigenvector corresponding to the eigen-value λ. Note. If the eigen-values λ are repeated, sometimes distinct eigen-vectors cannot be found. This possibility results in the following definition. Definition 2.7.2 The algebraic multiplicity of an eigen-value λ is the number of times λ is a repeated root. The geometric multiplicity of λ is the number of linearly independent eigen-vectors associated with λ. Clearly, geometric multiplicity of λ ≤ algebraic multiplicity of λ. Example 2.7.2 Find   the eigen-values and the corresponding eigen-vectors of 1 2 0 the matrix 2 1 0. 0 0 2  1 2 0 Solution Set A = 2 1 0 . 0 0 2 The characteristic equation is given by 1 − λ 2 0 1−λ 0 = 0. |A − λI| = 0 i.e. 2 0 0 2 − λ Finding the determinant along the 3rd column yields, (2 − λ)[(1 − λ)2 − 4] = 0. (2 − λ)(λ + 1)(λ − 3) = 0. Thus the eigen-values are λ = 2, λ = −1, and λ = 3 with both geometric and algebraic multiplicity of each eigen-value being 1. To find the eigen-vector corresponding to eigen-valueλ = −1.      2 2 0 x1 0 Set λ = −1 in the matrix (A − λI)x = 0 which gives 2 2 0 x2  = 0. 0 0 3 x3 0

78

Linear Algebra to Differential Equations

Hence x1 + x2 = 0, 3x3 = 0 gives x2 = −x1 , x3 = 0. Choose x1 =  k,

 k then corresponding to the eigen-value λ = −1, the eigen-vector is −k  or 0   1 −1, choosing k = 1. 0 Now consider the eigen-value λ = 2, then the matrix (A − 2I)x = 0 yields, −x1 + 2x2 = 0, 2x1 − x2 = 0 ⇐⇒ x1 = 2x2 , x2 = 2x1 ⇐⇒ x1 = x2 = 0 and x3 = k. Hence  the eigen-vector corresponding to the eigen-value λ = 2, 0 choosing k=1 is 0. 1 Corresponding to the eigen-value λ = 3, the matrix (A − 3I)x = 0 gives, −2x1 + 2x2 = 0, 2x1 − 2x2 = 0, −x3 = 0 ⇐⇒ x1 = x2 , 1 Hence the eigen-vector corresponding to the eigen-value λ = 3 is 1. 0 Thus theeigen-values are λ = −1, 2 and 3, and the corresponding eigen-vectors     1 0 1 are −1, 0 and 1, respectively. 1 0 0 Example 2.7.3 Find  the eigen-values and the corresponding eigen-vectors of  1 1 0 the matrix 1 1 0 . 0 0 2 Solution. The characteristic equation is given by 1 − λ 1 0 1−λ 0 = 0. |A − λI| = 0 i.e., 1 0 0 2 − λ =⇒ λ(2 − λ)2 = 0. Hence the eigen-values are λ = 2, 2, and 0. To find the eigen-vector corresponding to eigen-value λ = 2. Set λ = 2 in the matrix (A − λI)x = 0 which gives      −1 1 0 x1 0  1 −1 0 x2  = 0. 0 0 0 x3 0 x1 − x2 = 0, let x1 = x2 = k and x3 = l,         x1 k 1 0 then x2  = k  = k 1 + l 0. x3 l 0 1

Linear System of Equations

79

    1 0 Hence the eigen-vectors corresponding to λ = 2 are 1, 0. 0 1 Now the eigen-vector corresponding to λ = 0 is given by,      1 1 0 x1 0 1 1 0 x2  = 0. 0 0 2 x3 0 x1 + x2 = 0, x3 = 0 =⇒ x1 = −x2 , x3 = 0. Set x1 = k, then x2 = −k. Choosing k = 1, theeigen-vector to the eigen-value λ = 0 is     corresponding  1 1 0 1 −1. Thus 1, 0 and −1 are the eigen-vectors of the considered 0 0 1 0 system.

Properties of the Characteristic Polynomial The eigen-values of a square matrix A of order n depend on the characteristic polynomial |A−λI| in λ. The structure of the characteristic polynomial f (λ) = |A − λI| is as follows. (i) f (λ) is a polynomial of degree n, with the nth degree term given by (−λ)n . Thus the coefficient of (λ)n is (−1)n ; (ii) The coefficient of the (n−1)th degree term, (λ)n−1 is the trace of A, tr(A)=sum of diagonal elements of A; (iii) The constant term in the polynomial f (λ) is given by |A|. The next question that arises is that if the eigen-values of a given square matrix A of order ‘n’ are known can one determine the eigen-values of its related matrices like AT , A−1 , etc.? The answer is in affirmative.

I Properties of Eigen-values Let A be a square matrix of order n and α be and an eigen-value of A and x be its corresponding eigen-vector. Then, the following statements hold. (1) α is also an eigen-value of AT ; Proof. Consider |AT − αI| = |(A − αI)T | = |A − αI| = 0 since α is an eigen-value of A. Thus |AT − αI| = 0 and this implies that α satisfies the characteristic equation of AT . Hence α is an eigen-value of AT . (2) kα, k ± α are the eigen-values of kA, k ± A, respectively. Proof. Since α is an eigen-value of A, Ax = αx, x 6= 0. Consider (kA)x =

80

Linear Algebra to Differential Equations

k(Ax) = k(αx) = (kα)x, x 6= 0 and (k ± A)x = kx ± Ax = kx ± αx = (k ± α)x, x 6= 0. (3) αn is an eigen-value of An , n = 1, 2, ...; Proof. (3) proof is simple and follows like the earlier proof. (4) α−1 is an eigen-value of A−1 . −1

Proof. (4) Ax = αx, x 6= 0 Now multiply throughout with Aα A−1 x, x 6= 0, α 6= 0 from which the result can be concluded.

⇐⇒

1 αx

=

Note. (i) The eigen-vector corresponding to α−1 is same as the eigen-vector corresponding to α. (ii) A−1 exists implies that zero is not an eigen-value of A; (5)

|A| α

is an eigen-value of adjA.

Proof. From (4), A−1 x = adjAx =

|A| α x,

1 α x,

x 6= 0 ⇐⇒

adjA |A| x

=

1 α x,

x 6= 0 ⇐⇒

x 6= 0;

(6) The eigen-value of B = c0 An + c1 An−1 + ... + cn−1 A + cn I is c0 αn + c1 αn−1 + ... + cn−1 α + cn . Proof. Follows from the application of the properties (2) and (3); (7) if α is a complex number and x is its corresponding complex eigenvector, then α ¯ is also an eigen-value and x ¯ is its corresponding eigen-vector. (A is a real matrix.) Proof. Given Ax = αx, x 6= 0, ¯x = α Taking conjugates yields, A¯ ¯x ¯, x ¯ 6= 0, =⇒ A¯ x=α ¯x ¯.

II Properties of Eigen-vectors (1) Let α1 = 6 α2 be any two eigen-values of A then their corresponding eigenvectors x1 and x2 are distinct. Proof. If possible let x be an eigen-vector for both α1 and α2 . Then, Ax = α1 x and Ax = α2 x x 6= 0, this implies (α1 − α2 )x = 0, x 6= 0 ⇐⇒ α1 − α2 = 0 ⇐⇒ α1 = α2 . Thus x1 and x2 are distinct. (2) Let α1 6= α2 be any two eigen-values of A then their corresponding eigen-vectors, x1 and x2 are linearly independent. Proof. To show x1 and x2 are linearly independent, consider the linear

Linear System of Equations

81

combination c1 x1 + c2 x2 = 0. Then

c1 α1 x1 + c2 α1 x2 = 0.

(2.9)

Consider A(c1 x1 + c2 x2 ) = c1 Ax1 + c2 Ax2 = c1 α1 x1 + c2 α2 x2 = 0.

(2.10)

Subtracting the relation (2.10) from (2.9) gives c2 (α1 − α2 )x2 = 0 implies c2 = 0. Substituting the value of c2 in (2.9) provides c1 = 0. Hence x1 and x2 are linearly independent.

III Eigen-values and eigen-vectors of various matrices (1) Eigen-values of an upper triangular matrix are its diagonal elements. Proof. Consider the characteristic equation |A − αI| = 0. The determinant gives (a11 − α)(a22 − α)...(ann − α) = 0 and the result follows. (2) Let A be a real symmetric matrix, then (a) The eigen-values of A are real. (b) The eigen-vectors of two distinct eigen-values of A are orthogonal. Proof. If possible let α ∈ C be an eigen-value and x be its corresponding eigen-vector of a real symmetric matrix A. Then Ax = αx

(2.11)

A¯ x=α ¯x ¯

(2.12)

¯x = α and A¯ ¯x ¯ which implies Taking the transpose of (2.12) yields x ¯T A = x ¯T α ¯.

(2.13)

Here, pre-multiplying (2.11) with x ¯T and post multiplying (2.13) with x yields T T T T x ¯ Ax = α¯ x x, x ¯ Ax = x ¯ α ¯ x. On subtraction, the above two relations give (α− α ¯ )¯ xT x = 0 implies α = α ¯ as x 6= 0. Hence α is real. (b) Let α1 and α2 be two distinct eigen-values of A and x1 and x2 be their corresponding eigen-vectors. Then Ax1 = α1 x1

(2.14)

Ax2 = α2 x2

(2.15)

Taking transpose for the equation (2.14) gives x1 T A = x1 T α 1

(2.16)

82

Linear Algebra to Differential Equations

pre-multiplying (2.15) with x1 T and post multiplying (2.16) with x2 and subtracting yields x1 T (α1 − α2 )x2 = 0. Thus x1 T x2 = 0 since α1 6= α2 . Hence x1 and x2 are orthogonal. (3)(a) A Hermitian matrix has only real eigen-values. (b) The eigen-vectors of distinct eigen-values of a Hermitian matrix are orthogonal. Proof. Follows similar pattern as in (2) and is left as an exercise. (4) Let A be a skew-Hermitian matrix. Its eigen-values are purely imaginary or zero. Proof. Follows similar pattern as in (2) and is left as an exercise. (5) If α is an eigen-value of a Unitary matrix, then |α| = 1. Proof. By definition Ax = αx, x 6= 0. Taking conjugate transpose leads to ¯T = α x ¯T A ¯x ¯T ¯ T Ax = α ⇐⇒ x ¯T A ¯ α¯ xT x x ¯T x = |α|2 x ¯T x =⇒ |α|2 = 1 ⇐⇒ |α| = 1, completing the proof. 6) Both α and

1 α

are eigen-values of an orthogonal matrix.

Proof. A is an orthogonal matrix implies AT = A−1 . The proof follows from I(1) and I(4). 7) Let P be any nonsingular matrix such that A = PBP−1 . Then A and B have the same eigen-values but their eigen-vectors are different. Proof. Let A = PBP−1 , where P is a nonsingular matrix and let α be an eigen-value of A. Then Ax = αx, x 6= 0. Now suppose there is a vector y such that x = Py. Then Ax = APy and By = P−1 APy ⇐⇒ P−1 Ax = P−1 αx = αP−1 x = αy. Hence α is an eigen-value for B but its corresponding eigen-vector is y. Observation. A similarity transformation preserves the eigen-values. This leads to the eigen-value decomposition theorem–which transforms a given matrix to a diagonal form.

Linear System of Equations

83

Definition 2.7.3 Modal matrix is a matrix consisting of all the eigen-vectors of A which are linearly independent. Theorem 2.7.2 Suppose A, a square matrix of order n, has n linearly independent eigen-vectors and P is the matrix having these eigen-vectors as columns. Then D = P−1 AP is a diagonal matrix. Proof. Let α1 , α2 , α3 , ..., αn be the eigen-values of A and x1 , x2 , x3 , ..., xn be the corresponding linearly independent eigen-vectors. then Axi = αi xi , i = 1, 2, . . . , n. where xi is the ith column of P. Clearly, by multiplication of partition matrices, AP = A[x1 , ..., xn ] = [Ax1 , Ax2 , ..., Axn ] = [α1 x1 , α2 x2 , ..., αn xn ] = [x1 , x2 , ..., xn ]Diag[α1 , α2 , ..., αn ] = PD, where D = diag [α1 , α2 , ..., αn ] is a diagonal matrix consisting of eigen-values of A. Thus AP=PD gives A = PDP−1 , concluding the proof. Observation. 1. Diagonal matrix D exists when all the eigen-values are real and distinct. 2. The modal matrix consisting of the eigen-vectors is not unique. 3. The order of the eigen-values and the eigen-vectors must be maintained. 4. If the eigen-values are repeated then the modal matrix may not exist. 5. If the rank of the matrix is less than the order of the matrix then the modal matrix does not exist. Definition 2.7.4 A square matrix A of order n is said to be diagonalizable if there exists a modal matrix P such that D = P−1 AP−1 , where D consists of the eigen-values of A and is a diagonal matrix. Example 2.7.4 Reduce the matrix A =

 5 2

 2 into diagonal form by using 2

a modal matrix.  5−λ Solution. The characteristic matrix is A − λI = 2 5 − λ The characteristic equation A − λI = 2 (5 − λ)(2 − λ) − 4 = 0

2 =0 2 − λ

2 2−λ



84

Linear Algebra to Differential Equations

λ2 − 7λ + 6 = 0 (λ − 6)(λ − 1) = 0 λ = 1, 6 are eigen-values. (i) To find the eigen-vector corresponding to λ=1. Let x be a nonzero vector such that (A -I )x = 0    4 2 x1 0 = 2 1 x2 0 R2 → R2 −

R1 2

 4 0

    2 x1 0 = 0 x2 0

4x1 + 2x2 = 0 Take x2 =k, then 4x1 + 2k = 0 ⇒ 4x1 = −2k ⇒ x1 = − k2      k x1 −2 k −1 = 2 Hence = 2 x2 k   −1 x= is an eigen-vector corresponding to λ=1 2 (ii) eigen-vector corresponding to λ=6. Let x be a nonzero vector such that (A − 6I)x = 0     −1 2 x1 0 = 2 −4 x2 0 R2 → R2 + 2R1      −1 2 x1 0 = 0 0 x2 0 -x1 + 2x2 =0 Take x2 =k, then x1 =2k       x 2k 2 Hence 1 = =k x2 k 1   2 x2 = is an eigen-vector corresponding to λ = 6 1   −1 2 Take P = . 2 1   1 −2 Then adjP = , P = −1 − 4 = −5, and −2 −1   1 −2 = −1 P−1 = adjP 5 −2 −1 P Then P diagonalizes A and the diagonal form is given by P−1 A P =D

Linear System of Equations

−1 5



1 −2

 −2 5 −1 2

 2 −1 2 2

  2 1 = 1 0

85

 0 6 

7 Example 2.7.5 Reduce the matrix A = 4 −4 using a modal matrix.  7 4 −4 Given matrix is A = 4 1 8  −4 8 1 The characteristic equation is A − λI = 0 7 − λ 4 −4 1−λ 8 = 0 ⇒ 4 −4 8 1 − λ

 4 −4 1 8  into diagonal form by 8 1

⇒ (7 − λ)[(1 − λ)2 − 64] − 4[4(1 − λ) + 32] − 4[32 + 4[(1 − λ)] = 0 ⇒ (λ − 9)(81 − λ2 ) = 0 ⇒ λ = −9, 9, 9. Therefore the eigen-values of A are 9, 9, −9 To find an eigen-vector corresponding to the eigen-value λ = 9 Consider the system of homogeneous    equation [A − λI]x = 0   0 x 7−9 4 −4  4 1−9 8  y  = 0 0 z −4 8 1−9      0 −2 4 −4 x  4 −8 8 y  =0 0 −4 8 −8 z R2 → R2 + 2R1 , R3 → R3 − 2R1      0 −2 4 −4 x  0 0 0 y  = 0 0 z 0 0 0 Let z = k, y = l(parameters) −2x + 4y − 4z = 0 −x + 2y − 2z = 0 x  =(2y  − 2z) =(2l −2k)    x (2l − 2k) 2 −2 y    = l 1 + k  0  l z k 0 1     2 −2 Therefore 1,  0  are the eigen-vectors corresponding to the eigen-value, 0 1 i.e. λ = 9 repeated twice.

86

Linear Algebra to Differential Equations

To find an eigen-vector corresponding the eigen-value λ = −9 Consider the system of homogeneous equation [A − λI]x = 0 [A  + 9I]x = 0      16 4 −4 x 0  4 10 8  y  =0 −4 8 10 z 0 R1 → 41 R1 , R2 → R2 − R1 , R3 → R3 + R1  4 0 0

1 9 9

    −1 x 0 9  y  =0 9 z 0

R3 → R3 − R2 , R2 → 1/9R2  4 0 0

1 1 0

    x 0 −1 1  y  =0 z 0 0

Let z = k(parameter) y+z =0 y = −z = −k 4x = −y + z, x = 14 (−y + z) = 14 (k + k) = 21 k   1    x 1 2k y  = −k  = 1 k −2 2 z 2 k   1 Therefore −2 is an eigen-vector corresponding to λ = −9 2 Writing the three eigen-vectors as the three columns,   2 −2 1 the modal matrix is K = 1 0 −2 . 0 1 2 

 2 5 4 Now, K−1 = 19 −2 4 5 and 1 −2 2    2 5 4 7 4 −4 2 D = 19 −2 4 5  4 1 8  1 1 −2 2 −4 8 1 0   9 0 0 D = 0 9 0 = diag(9, 9, −9). 0 0 −9

−2 0 1

 1 −2 2

Linear System of Equations

87

Note. Theorem 2.7.2 reduces to a special form if A is a symmetric matrix and this is given below. Theorem 2.7.3 If A is a symmetric matrix then there exists an orthogonal matrix S such that A = SDS T , where D is a diagonal matrix consisting of the distinct eigen-values of A and S is the matrix whose columns are eigen-vectors corresponding to eigen-values of A. Proof. In property III(2), it has been shown that the eigen-vectors of a symmetric matrix are orthogonal. Since for an orthogonal matrix P, P−1 = PT , the result follows from Theorem 2.7.2. When the rank A < n and when there are only s linearly independent eigen-vectors, with s < n, then in place of diagonal matrix, there will be s blocks each having one eigen-value. Each block is called a Jordan block and is as follows. Definition 2.7.5 Jordan block is a square matrix having an eigen-value α on the main diagonal and 1 on the super diagonal. All other elements are zero.  2 Example 2.7.6 0 0

1 2 0

 0 0, 2

 2 0 0

0 −4 0

 0 1  are Jordan blocks. −4

The following theorem describes the Jordan canonical form. Theorem 2.7.4 Jordan Canonical form: If a square matrix of order ‘n’ has a characteristic polynomial C(x) = det(A − xI)) = Πsi=1 (x − αi )ri Then A is similar to a matrix with αi on the main diagonal and one’s on the super diagonal and zeros elsewhere. Observation. 1. The αi ’s are the eigen-values of A and ri are its algebraic multiplicities.

EXERCISE 2.7 1. Find the eigen-values and the corresponding eigen-vectors for the following matrices

88

Linear Algebra to Differential Equations    4 1 −5 0 1 (i) A=−3 0 5  (ii) A= −6 5 3 3 −2     2 1 6 2 3 (iii) A=0 −5 3 (iv) A= 2 1 0 0 4     1 2 3 7 1 (v) A= (vi) A=0 4 5 10 4 0 0 6 

2. Find the modal matrix and hence reduce the following matrices into diagonal from   2 2 2 (i) A=2 2 2 2 2 2   7 4 −4 (iii) A= 4 1 8  −4 8 1

2.8

 3 2 −3 (ii) A=−3 −4 9  −1 −2 5   1 0 1 (iv) A=0 −1 3 0 0 6 

Singular Values and Singular Vectors

This section mainly deals with rectangular matrices. A matrix A of order m × n can be considered as a mapping from Rn to Rm . Systems of algebraic equations in n unknowns and m equations or algebraic equations with m unknowns and n equations (m 6= n) give rise to matrices of order m × n or n × m. In this case in place of eigen-values, singular values are introduced. Their corresponding vectors are called as singular vectors. A formal definition is as follows. Definition 2.8.1 A singular value and a pair of singular vectors of a square or a rectangular matrix A are a nonnegative scalar σ and two nonzero vectors, u and v such that Av = σu and AT u = σv, where AT is the transpose of A. The following is the method to find singular values and singular vectors.

Linear System of Equations

89

Method to find singular values and singular vectors. Step 1 Given a matrix A (square or rectangular), find AT A. Step 2 Find the eigen-values α of AT A and√the corresponding eigen-vectors, v. Then the singular value of A, σ = α or σ 2 = α. Step 3 Find the eigen-values of AAT and the corresponding eigen-vectors, u. Step 4 Then Av = σu. Step 5 σ is the singular value and v and u are the corresponding singular vectors of A. Example 2.8.1 Find the singular values and singular vectors for the matrix  1 1 A = 2 2 1 1       1 1 6 6 1 2 1  2 2 = Step 1. AT A = 6 6 1 2 1 1 1 T T A A − λI = 0 Step 2. The characteristic equation of A A is 6 − λ 6 6 6 − λ = 0 (6 − λ)2 − 36 = 0 T λ= √ 12, 0 are √ the eigen-values of A A. Hence σ1 = 12 = 2 3, σ2 = 0 are the singular values of A.   1 The eigen-vectors of AT A corresponding to λ = 12 and λ = 0 are and 1 # "   √1 −1 , respectively and the orthonormal vectors are v1 = √12 and v2 = 1 2 " # −1 √ 2 √1 2

, respectively.  1 Step 3. AAT = 2 1 has eigen-values 12,

    1  2 4 2 1 2 1 2 = 4 8 4 1 2 1 1 2 4 2 0 and 0.

  1 The eigen-vector associated to the eigen-value λ1 = 2 3 of AAT is 2 and 1  1  √



 6 the corresponding normalized vector u1 is  √26 . √1 6

90

Linear Algebra to Differential Equations √ −2  5

Normalized eigen-vectors corresponding to λ = 0 are u2 =  √15  and u3 = 0 √ −2  5

 0 . √1 5

 1    √ 1 1 " √1 # 6 √   It can be easily shown that Av1 = 2 2 √12 = 2 3  √26  2 √1 1 1 6 = σ u . 1 1 √ Similarly, Av2 = σ2 u2 . Now σ1 = 2 3 and σ2 = 0 are the singular values of A and the corresponding singular vectors are u1 , v1 and u2 , v2 .

EXERCISE 2.8 Find the singular values and singular vectors for the following matrices     2 1 2 1 0 1 0 (i) A = (ii) A = 2 1 2 0 1 0 1

2.9

Quadratic Forms

In this section, a bilinear form is introduced. A special case of the bilinear form is a quadratic form and is discussed in detail. Quadratic forms arise naturally in problems in physics, mechanics, statistics and in problems involving maxima and minima, that is, optimization problems. Quadratic forms reduced to canonical forms give insights to the conics considered. The nature of quadratic forms is also discussed. Definition 2.9.1 A bilinear form is a function f : Rn × Rm → R defined by f (y, x) = yT Ax, where A is an n × m matrix. Definition 2.9.2 A quadratic form is a homogeneous expression of second degree in n variables x1 , x2 , · · · , xn of the form Q = xT Ax =

n X n X j=1 i=1

where aij , i, j = 1, 2, · · · , n are constants.

aij xi xj ,

(2.17)

Linear System of Equations

91

Let x = (x1 , x2 , · · · , xn )T .  Then xT = (x1 ,  x2, · · ·, xn ) x1 a11 a12 ... a1n  a21 a22 ... a2n   x2       and xT Ax = x1 x2 ... xn    ..  ..   .  . xn an1 an2 ... ann Pn Pn = j=1 i=1 aij xi xj = a11 x21 + a12 x1 x2 + · · · + a1n x1 xn + a21 x2 x1 + a22 x22 + · · · a2n x2 xn + · · · + an1 xn x1 + · · · + ann x2n = a11 x21 + (a12 + a21 )x1 x2 + · · · + (a1n + an1 )x1 xn + · · · + ann x2n . The matrix A need not be symmetric but a nonsymmetric matrix can be replaced by a symmetric matrix with identical results. Example 2.9.1 1. x2 + 6xy − 4y 2 is a quadratic form in 2 variables and the matrix representation is      1 3 x x y 3 −4 y 2. 4x2 + 2y 2 − 8z 2 + 2xy + 5yz − 4zx matrix representation is as follows:  4   x y z  1 −2

is a quadratic form in 3 variables. The   1 −2 x 2 5/2  y  5/2 − 8 z

3. 2x2 + y 2 + z 2 + 5w2 + 2xy + 8zw is The matrix representation is  2   1 x y z w  0 0

a quadratic expression in 4 variables. 1 1 0 0

0 0 1 4

  x 0 y  0   4  z  5 w

The observation that a real quadratic form can be expressed as a matrix is stated as the following theorem. Theorem 2.9.1 Every real quadratic form in n variables x1 , x2 , · · · , xn can be expressed in the form xT Ax, where x = (x1 x2 · · · xn )T is a column matrix and A is a real symmetric square matrix of order n. This matrix A is called as the matrix of the quadratic form xT Ax. Definition 2.9.3 Rank of a Quadratic form Let xT Ax be a real quadratic form. The rank r of A is called the rank of the quadratic form. If ρ(A) = r < n or A = 0 then the quadratic form is said to be singular; otherwise it is nonsingular.

92

Linear Algebra to Differential Equations

Definition 2.9.4 Canonical Form or Normal Form of a Quadratic form Let xT Ax be a quadratic form in n variables. Then there exists a real nonsingular orthogonal matrix, P (can also be treated as an orthogonal linear transformation) such that x = Py transforms xT Ax to the form yT By. The form yT By = λ1 y12 + λ2 y22 + · · · + λn yn2 is called the canonical form of xT Ax, where B = diag[λ1 λ2 · · · λn ]. Procedure to reduce the given quadratic form to the canonical form when all the n eigen-vectors can be found Step 1. Write the matrix A Step 2. Find the eigen-values of A Step 3. Find the corresponding eigen-vectors if all the n (order of the matrix) eigen-vectors are found proceed further. Otherwise canonical form is not possible   x1 Step 4. Normalize them. (i.e.) If x2  is an eigen-vector then x = x3   x1 √ 2 1 2 2 x2 , is normalized eigen-vector. x1 +x2 +x3 x3 Step 5. Form the matrix P by the eigen-vectors. Step 6. Find P−1 Step 7. Calculate B = P−1 AP Then B = diag[λ1 · · · λn ], where λ1 , λ2 , · · · λn are the eigen-values of A and B is in canonical form. Example 2.9.2 Reduce the Quadratic form 7x2 + 6y 2 + 5z 2 − 4xy − 4yz to the canonical form.   7 −2 0 Solution: Its matrix is A = −2 6 − 2 0 −2 5

Linear System of Equations

93

A − λI = 0 7−λ −2 0 =⇒ −2 6 − λ − 2 = 0 0 −2 5−λ (7 − λ)[(6 − λ)(5 − λ) − 4] + 2[2(λ − 5)] = 0 =⇒ (7 − λ)[30 − 11λ + λ2 − 4] + 4(λ − 5) = 0 (7 − λ)[λ2 − 11λ + 26] + 4λ − 20 = 0 7λ2 − 77λ + 182 − λ3 + 11λ2 − 26λ + 4λ − 20 = 0 =⇒ −λ3 + 18λ2 − 99λ + 162 =⇒ λ3 − 18λ2 + 99λ − 162 =⇒ (λ − 3)(λ − 6)(λ − 9) =⇒ λ = 3, 6, 9

are the eigen-values.

The canonical form is 3u2 + 6v 2 + 9w2

Index and signature of a quadratic form Let the quadratic form xT Ax be transformed into the canonical form yT By. The canonical form has only r terms, if the rank of A = ρ(A) = r. The terms can be either positive or negative. Definition 2.9.5 The number of positive terms in the normal form of the quadratic form is called the index (denoted by i) of the quadratic form. Example 2.9.3 1. Suppose the canonical form of a quadratic form is given by 2x2 + 3y 2 − z 2 then the index of the quadratic form is 2. 2. Suppose the canonical form of a quadratic form is given by 4x2 +6y 2 +z 2 +w2 then the index of the quadratic form is 4. Theorem 2.9.2 The number of positive terms (negative terms) in any two canonical forms are the same. Definition 2.9.6 Signature of a Quadratic Form Let r be the rank of the quadratic form and i be the index of the quadratic form (i.e.) the number of positive terms in the canonical form. Then the signature of the quadratic form (denoted by s) is defined as the excess number of positive terms over the number of negative terms, that is, s = i − (r − i) = 2i − r.

94

Linear Algebra to Differential Equations

Nature of a quadratic form The nature of the quadratic form can be determined by the index or rank of the matrix A or in terms of the eigen-values of the matrix A. Let ρ(A) = r and index of the quadratic form be i. Then, the real quadratic form is said to be 1. Positive Definite if r = i = n or in terms of eigen-values, all the eigenvalues of A are positive. 2. Negative Definite if i = 0 and r = n or, all the eigen-values of A are negative. 3. Positive Semi Definite if i = r < n or if the eigen-values of A are positive and at least one eigen-value is 0. 4. Negative Semi Definite if i = 0 and r < n or the eigen-values of A are negative and at least one of them is 0. 5. Indefinite in all other cases that is, the eigen-values are both positive and negative. Theorem 2.9.3 (Sylvester’s Law of Inertia) The signature of a quadratic form is independent of all canonical forms, (i.e.) the signature does not change with different canonical forms. Example 2.9.4 Find the index, signature and the nature of the quadratic form x2 + y 2 + 2z 2 + 4xy.  T Solution. Let x = x y z   1 2 0 then the matrix associated with the quadratic form is A = 2 1 0 0 0 2 From Example 2.7.2 the eigen-values are λ = −1, 2, 3 and the corresponding       1 0 1 eigen-vectors are −1 , 0 and 1, respectively. 1 0 0       1 1 0 Now normalizing the vectors gives √12 −1 , 0 and √12 1 0 1 0  √1    √1 0 −1 0 0 2 2 −1 0 √12  , P−1 = PT and P−1 AP =  0 2 0 Let P =  √ 2 0 0 3 0 1 0 Now the canonical form of the quadratic form is −u2 + 2v 2 + 3w2 Hence the index is 2 and the signature is 1. The nature of the quadratic form is indefinite.

Linear System of Equations

95

Example 2.9.5 Find the nature of the quadratic form 4x21 +x22 +4x23 +2x1 x2 + 4x1 x3 .   4 1 2 Solution. The matrix corresponding to the quadratic form is 1 1 0 2 0 4 The eigen-values are λ1 ≈ 0.56, λ2 ≈ 2.33, λ3 ≈ 6.10 Since all the eigen-values of A are positive the quadratic form is positive definite.

EXERCISE 2.9 1. Find the index, signature and nature of the quadratic form by finding the normal form. 1. x2 + y 2 + 2z 2 + 2xy. 2. 5x2 + 2y 2 + 4xy. 3. 7x2 + y 2 + z 2 + 8xy − 8xz + 16yz.

2.10

Conclusion

In this chapter, the various setups of linear system of equations are described and related theoretical concepts are given. Various approaches to reducing a given matrix to a diagonal, upper or lower triangular form are given and Jordan blocks are introduced. Eigen-values and eigen-vectors along with their properties are discussed. Singular values and singular vectors are introduced. These concepts are used in studying principal component analysis in Section 5.9. A study of quadratic forms is initiated. This chapter is a continuation of Chapter 1 and deals with the fundamentals of linear algebra. For further reading, the readers are suggested to refer to [14] and [19].

Chapter 3 Vector Spaces

3.1

Introduction

In this chapter, the theoretical foundation for the concepts introduced in earlier chapters is given. Section 3.2 deals with a vector space and its subspace. In Section 3.3 the linear independence of vectors, the basis and the dimension of a vector space are discussed. In Section 3.4, the change of basis matrix is described. Linear transformation is introduced in Section 3.5. The matrix of linear transformation is described in Section 3.6. In Section 3.7, an inner product space and its important properties are given. Section 3.8 deals with the Gram-Schmidt orthogonalization process and singular value decomposition theorem. Finally, an attempt is made to link linear algebra with differential equations in Section 3.9.

3.2

Vector Space and Subspaces

In this section, the theoretical foundation for the structure of vectors and matrices developed in Chapter 1 is given. Let V be any nonempty set of elements and let two operations called vector addition, ‘+’ and scalar multiplication, ‘.’ on elements of V be defined such that, (i) if x ∈ V and y ∈ V then x + y ∈ V (an operation ‘+’ satisfying this property is called a binary operation); and (ii) for any real number α ∈ R, αx ∈ V.

Definition 3.2.1 The triad (V, +, .) is a real vector space (or a vector space in short) if the operations of ‘+’ and ‘.’ satisfy the following properties. Let x, y, z ∈ V and α, β ∈ R. Then (i) x + y ∈ V (Closure under addition); (ii) x + y = y + x (Commutative ); DOI: 10.1201/9781351014953-3

97

98

Linear Algebra to Differential Equations

(iii) (x + y) + z = x + (y + z) (Associative); (iv) There exists an additive identity 0 ∈ V such that x + 0 = 0 + x = x, for all x ∈ V ; (v) To each x ∈ V , there exists a y ∈ V such that x + y = y + x = 0, which is called the additive inverse of x and is denoted by −x; (vi) α(x + y) = αx + αy; (vii) (α + β)x = αx + βx; (viii) (αβ)x = α(βx); (ix) There exists 1 ∈ R such that 1 x = x ∈ V . Example 3.2.1 1. Let V = {x = (x1 x2 ... xn ) : xi ∈ R}, be the set of all vectors in Rn . Let the vector addition + and scalar multiplication . be defined componentwise. Then (V, +, .) is a vector space. 2. Let V = {(aij )m×n : aij ∈ R} be the set of all m × n matrices with entries in R. Let the binary operations + and . be defined entrywise. Then (V, +, .) is a vector space. 3. Let V = {p(x) = a0 + a1 x + a2 x2 + · · · + an xn : a0 , a1 , . . . , an ∈ R}, be set of all polynomials of degree ≤ n, with real constants. Let the operations + and . be defined as follows: for any two polynomials p, q ∈ V, (p+q)(x) = p(x)+q(x) and (αp)(x) = αp(x). Then (V, +, .) is a vector space. Sometimes it may not be necessary to work with the whole space. For example, suppose one is working in R3 , but the information given is a subset of R3 . Then the question is whether this subset of R3 is a vector space or not and what is the criteria for verification. The following definition answers this question. Definition 3.2.2 Subspace Let (V, +, .) be a vector space and W be a subset of V. Then (W, +, .) is a subspace of V if and only if (W, +, .) is a vector space in its own right. Example 3.2.2 V = {(x1 x2 x3 ) : xi ∈ R, i = 1, 2, 3} = R3 . Then V = R3 . Let W = {(x1 x2 0)/xi ∈ R, i = 1, 2}. Then clearly W ⊆ V and (W, +, .) is a subspace of V.    a b Example 3.2.3 Consider V = : a, b, c, d ∈ R with the definitions c d of matrix addition multiplication. Then V is a vector space. and scalar   a 0 Now set W = : a, b ∈ R . Then W is a subspace of V with the 0 b operations of matrix addition and scalar multiplication of V.

Vector Spaces

99

The following theorem gives the necessary and sufficient condition for a subset W of V to be a subspace. Theorem 3.2.1 A nonempty subset W of a vector space V over R is a subspace of V iff the following conditions are satisfied. (i) w1 , w2 ∈ W =⇒ w1 + w2 ∈ W (ii) α ∈ R, w ∈ W =⇒ αw ∈ W . Proof. Suppose W is a subspace of V. Then by definition of a subspace, W is a vector space with respect to the operations of + and . defined on V. Thus for w1 , w2 ∈ W , w1 + w2 ∈ W and αw ∈ W for α ∈ R. Conversely suppose (i) and (ii) are satisfied then consider w1 , w2 , w3 ∈ W =⇒ w1 , w2 , w3 ∈ V. Since V is a vector space, w1 + (w2 + w3 ) = (w1 + w2 ) + w3 and w1 + w2 = w2 + w1 , and (αβ)w1 = α(βw1 ) α(w1 + w2 ) = αw1 + αw2 (α + β)w1 = αw1 + βw1 . Using (i) and (ii) in combination, all the above elements are in W . To show that additive inverse and additive identity exist, taking α = −1 in (ii), yields (−1)w ∈ W by (ii) and by (i) w − w ∈ W =⇒ 0 ∈ W . Hence the proof. Observation. The two conditions in the above theorem can be written as a single one as αw1 + w2 ∈ W. Theorem 3.2.2 Suppose W1 and W2 are subspaces of V. Then W1 ∩ W2 is also a subspace of V. Proof is given as exercise. Definition 3.2.3 If W1 and W2 are any two subspaces of V then the sum of the two subspaces is denoted by W1 + W2 and is defined as W1 + W2 = {w1 + w2 /w1 ∈ W1 and w2 ∈ W2 }. Theorem 3.2.3 W1 + W2 is a subspace of V. Proof is given as exercise. Definition 3.2.4 A vector space V is said to be a direct sum of two subspaces W1 and W2 , denoted by W1 ⊕ W2 , if and only if V = W1 + W2 and W1 ∩ W2 = {0}.       L a 0 0 c a c Example 3.2.4 Let W1 = , W2 = then W1 W2 = is 0 b d 0 d b the space R2×2 .

100

Linear Algebra to Differential Equations

EXERCISE 3.2 1. Show that the set of all 3 column vectors S = {(x sin x cos x) : x ∈ R} is not closed under vector addition. 2. Show that the line {(0 0 z) : z ∈ R} and the plane {(0 y z) : y, z ∈ R} are subspaces of R3 . 3. Show that the solutions of the system Ax = 0 where A is an m×n matrix form a vector space under vector addition and scalar multiplication.   a b 4. Does the set of all 2 × 2 matrices of the form , where a ≥ 0 with c d usual operations in R2×2 form a vector space? If not explain why? L 5. J Show that the set of all positive real numbers operations and L J V with defined by x, y ∈ V, x y = xy and c x = xc is a vector space.   a b 6. Show that { such that a + b = c} is a subspace of R2×2 . c d       a 0 0 c 7. Let W1 = : a, b ∈ R and W2 = : c, d ∈ R then 0 b d 0 L 2×2 show that W1 W2 = R .       a 0 0 c 8. Let W1 = : a, b ∈ R and W2 = : c, d ∈ R then 0 b d 0   L a 0 show that W1 W2 = is a subspace of R2×2 . c b 9. Show that the space R3 is a direct sum of two planes xy-plane and z-plane, W1 = {(x y 0) : x, y ∈ R} and W2 = {(0 0 z) : z ∈ R}. 10. Show that M = {(x 0) : x ∈ R} is a subspace of R2 . L 11. Find two subspaces H and K of R2 such that H K = R2 and find two L 3 3 subspaces S1 and S2 of R such that S1 S2 = R . 12. Show that the following sets are subspaces of 3rd order matrices, R3 ×R3 . (i) all diagonal matrices of order 3 × 3 (ii) all symmetric matrices of order 3 × 3. S 13. Are H = {(x y) : x ≥ 0, y ≥ 0, x, y ∈ R} {{x y} S : x ≤ 0, y ≤ 0, x, y ∈ R} and K = {(x y) : x ≥ 0, y ≤ 0, x, y ∈ R} {{x y} : x ≤ 0, y ≥ 0, x, y ∈ R} are subspaces of R2 . Give reasons.

Vector Spaces

101

14. If W1 and W2 are subspaces of V then show that T (i) W1 W2 is a subspace of V. (ii) W1 + W2 is also a subspace of V. S (iii) W1 W2 need not be a subspace of V by an example.

3.3

Linear Independence, Basis and Dimension

A vector space may have a finite number of elements or an infinite number of elements. In the latter situation, it is quite complex and complicated to verify the properties or study the vector space. Thus it would be convenient if there is a set of finite number of elements that represent the vector space. In this section, this set is discovered. The following definitions are a step in that direction. Let (V, +, .) be a real vector space. Definition 3.3.1 Linear Combination. Let C = {v1 , v2 , . . . , vn } be a subset of the vector space V. Then a linear combination of the vectors in set C is a vector v = α1 v1 + α2 v2 + · · · + αn vn ∈ V where α1 , α2 , . . . , αn ∈ R. Example 3.3.1 Let C = {(1 1), (3 0)}, then v = 3(1 1) + 4(3 0) = (15 3) is a linear combination of the vectors in C. Definition 3.3.2 Linear span is the collection of all possible linear combinations of the set of all vectors in C and is denoted by Sc . Example 3.3.2 C={(1 0 0), (0 1 0), (0 0 1)}. Then Sc = R3 , verify! Example 3.3.3 Let C = {(1 1 0), (1 0 1), (0 1 1)}. Then Sc = R3 . Verify! Definition 3.3.3 Linearly independent set. A subset S = {v1 , v2 , . . . , vn } of V is said to be linearly independent if and only if for αi ∈ R for all i = 1, 2, . . . , n and α1 v1 +α2 v2 +· · ·+αn vn = 0 imply αi = 0 for all i = 1, 2, . . . , n. Example 3.3.4 S1 = {(1 0 0), (0 1 0), (0 0 1)} and S2 {(0 1 1), (1 0 1), (1 1 0)} are linearly independent sets. Verify!

=

Definition 3.3.4 Linearly dependent set. A set S is said to be linearly dependent if it is not linearly independent.

102

Linear Algebra to Differential Equations

Aliter definition A set S is said to be linearly dependent if there exist scalars αi , i = 1, 2, . . . , n, not all zero, such that α1 v1 + α2 v2 + · · · + αn vn = 0. Example 3.3.5 C={(1 1), (2 0), (5 3)} is a linearly dependent set. Verify! Result. Linear span Sc is a subspace of V. Proof. Let C = {v1 , v2 , . . . , vn } be any set of vectors and Sc = {v/v = α1 v1 + α2 v2 + · · · + αn vn where vi ∈ V, i = 1, 2, . . . , n} Consider u, Pvn∈ Sc . Pn Then u = i=1 αi vi and v = i=1 βi vi where αi , βi ∈ R, i=1,2,. . . ,n. Then u+v =

n X

(αi + βi )vi =

i=1

n X

ri vi ,

i=1

where ri = αi + βi ∈ R which means that u + v ∈ Sc . Consider v ∈ Sc then for any α ∈ R. αv = α

n X i=1

αi vi =

n X

(ααi )vi ∈ Sc .

i=1

Hence by Theorem 3.2.1, Sc is a subspace of V. A combination of the above two notions gives rise to one of the most important and fundamental concepts in linear algebra and is defined below. Definition 3.3.5 A subset B = {v1 , v2 , . . . , vn } of a vector space V is said to be a basis for V if and only if (i) B is a linearly independent set and (ii) The space V is a linear span of B. Example 3.3.6 1. The sets S1 and S2 given in Example 3.3.4 are two different bases for R3 . 2. B = {1, x, x2 , . . . , xn } forms a basis for the vector space of all polynomials of degree ≤ ‘n’ with the definition of ‘+’ and ‘.’ defined as in Example 3.2.1 (3).         1 0 0 1 0 0 0 0 3. B = , , , forms a basis for the vector 0 0 0 0 1 0 0 1 space consisting of all 2 × 2 matrices with ‘+’ and ‘.’ defining matrix addition and scalar multiplication of a matrix. 4. The eigen-vectors of a given n×n matrix A with 0 n0 distinct eigen-values form a basis for Rn .

Vector Spaces

103

5. Every nonsingular matrix of order 3 forms a basis for using its 3 column vectors or its 3 row vectors R3 . Observation. From the above examples it can be observed that the basis of a vector space is not unique. In fact, there can be infinitely many bases, but the number of elements in any basis is unique. This leads to the following definition. Definition 3.3.6 The number of elements in any basis of a vector space is called the dimension of V. Example 3.3.7

(i) The dimension of Rn is n.

(ii) The dimension of the vector space of all 2 × 2 matrices is 4. (iii) The dimension of the vector space of all polynomials of degree ≤ n is n+1 from Example 3.3.6 (2). Observation. 1. The number of elements in any basis of the vector space of all polynomials, for every degree cannot be finite based on earlier presentation. Hence a basis can have finite, or infinite number of elements. 2. If the basis has a finite number of elements, the basis is said to be finite and the vector space is called as a finite dimensional vector space. Otherwise, the basis is infinite and the vector space is said to be infinite dimensional. The following properties relative to a basis are stated without proof. Result. Suppose that dimV = n. 1. Let B = {v1 , v2 , . . . , vm } be any subset of V having m linearly independent vectors, m < n. Then B can be extended to form a basis, that is there exist vectors {vm+1 , vm+2 , . . . , vn } such that {v1 , v2 , . . . , vm , vm+1 , vm+2 , . . . , vn } is a basis for V. 2. If Sc = {v1 , v2 , . . . , vp } is a linear span of V, p > n then there exist B ⊆ Sc such that B= {vi1 , vi2 , . . . , vin } is a basis for V where i1 , i2 , ... in are distinct elements from {1, 2, ..., p}. Theorem 3.3.1 If W1 and W2 are any two finite dimensional subspaces of a vector space V, then W1 + W2 is also a finite dimensional subspace and dim(W1 + W2 ) = dimW1 + dimW2 − dim(W1 ∩ W2 ). Proof. Let dimW1 = m and dimW2 = n. Now Theorem 3.2.2 and Theorem 3.2.3 guarantee that W1 ∩ W2 and W1 + W2 are subspaces of V. Let

104

Linear Algebra to Differential Equations

dim(W1 ∩ W2 ) = p. Then clearly p ≤ m and p ≤ n (why?). Let B 0 = {v1 , v2 , . . . , vp } be a basis for W1 ∩ W2 . Since B 0 is a linearly independent subset of W1 and W2 , it follows from the above observation that B 0 can be extended to B 1 = {v1 , v2 , . . . , , vp , w1p+1 , w1p+2 , . . . , w1m } and B 2 = {v1 , v2 , . . . , , vp , w2 p+1 , w2 p+2 , . . . , w2 n } to form a basis for W1 and W2 , respectively. To show the vectors in B 0 , B 1 and B 2 are linearly independent. Consider p X

ai vi +

i=1

m X

n X

bi w1i +

i=p+1

ci w2i = 0

(3.1)

i=p+1

and where a1 , a1 , . . . , ap , bp+1 , . . . , bm , cp+1 , cp+2 , . . . , cn are constants. That gives p n m X X X ci w2i . bi w1i = − ai vi − i=p+1

i=p+1

i=1

Now left-hand side is a vector in W1 and right-hand side is a vector in W2 . Thus m X bi w1i ∈ W1 ∩ W2 . i=p+1

Hence there exist constants δ1 , δ2 , . . . , δp such that m X

bi w1i =

i=p+1

=⇒

p X

δi vi

i=1

δi vi +

i=1

p X

m X

b1 w1i = 0.

i=p+1

Since v1 , v2 · · · vp and w1p+1 , . . . , w1m is a basis for W1 , it implies that δi = 0, bi = 0, i = 1, 2, . . . , m. Substituting in relation (3.1), bi = 0, i = 1, 2, . . . , m, yield p X i=1

ai vi +

n X

ci w2i = 0.

i=p+1

Now B 2 = {v1 , v2 , . . . , vp , w2 p+1 , w2 p+2 , . . . , w2 n } is a basis for W2 which gives ai = 0 and ci = 0 for i=1,2,. . . ,p, p+1,. . . ,n. Set B = B0 ∪ B1 ∪ B2 then clearly every vector in W1 + W2 can be expressed as a linear combination of elements in B. Thus, B = {v1 , v2 , . . . , vp , w1p+1 , . . . , w1m , w2 p+1 , w2 p+2 , . . . , w2 n }

Vector Spaces

105

forms a basis for W1 + W2 . Also, dim(W1 + W2 ) = p + (m − p) + (n − p) =m+n−p = dimW1 + dimW2 − dim(W1 ∩ W2 ). Hence, the proof is complete.

EXERCISE 3.3 1. Show that {(1 − 2 1), (2 3 − 4), (7 6 5), (3 − 2 5)} is a linearly dependent set. Find a basis for R3 that has the vector (2 0 1)T. 2. Show that {1, t, t2 } is a linearly independent set. 3. Verify {1, t, t2 , . . . , tn } is a basis for a vector space of all polynomials of degree ≤ n. 4. Add a polynomial to the set {1, 1 + x2 } so that it forms a basis for P2 , all polynomials of degree ≤ 2. 5. Find a basis for each of these subspaces of 3 × 3 matrices (i) All diagonal matrices (ii) All symmetric matrices (iii) All skew symmetric matrices. 6. Find the dimensions of the subspaces of R2 spanned by the vectors (a) (0 0), (3 4), (9 16) (b) (2 4), (7 3). 7. Find the dimensions of the subspaces of R4 spanned by the vectors (a) (1 0 0 1), (0 1 0 0), (1 1 1 1), (0 1 1 1) (b) (1 − 1 0 2), (3 − 1 2 1), (1 0 0 1) (c) (−2 4 6 4), (0 1 2 0), (−1 2 3 2), (−3 2 5 6), (−2 − 1 0 4). 8. Find the dimension of the solution space W of 4 × 5 matrix       x 0 1 2 3 −1 1  1    x 2 0 2 4   0 1 3   x3  = 0 . 3 1 −2 0 5     x4  0 4 3 1 2 3 x5 0

106

3.4

Linear Algebra to Differential Equations

Change of Basis–Matrix

It has been discussed in the previous section that a basis can be considered as a representative of any vector in a vector space. A basis can also be considered as a coordinate system for a vector space by putting an additional condition which is given below. Definition 3.4.1 An ordered basis B is a basis whose vectors are arranged in a definite order. Example 3.4.1 B1 = {(1 0 0), (0 1 0), (0 0 1)} is an ordered basis and B2 = {(0 1 0), (1 0 0), (0 0 1)} is another ordered basis. In fact, there are four more different ordered bases with these elements. Let B = {v1 , v2 , · · · , vn } be an ordered basis. Then every vector can be written as a unique linear combination of the vectors in B, say v = α1 v1 + α2 v2 + · · · + αn vn (verify!). The coefficients α1 , α2 , . . . , αn are written as a column and are called as the coordinates of the vector v with respect to the basis B and is written as   α1  α2    [v]B =  . . (3.2)  ..  αn As a vector space can have more than one basis it would be convenient if there exists a tool that would link the coordinates of a vector represented in one basis to the coordinates of the vector obtained by using a different basis. In other words, the question is the availability of a tool that can give the coordinates of a vector with respect to a basis B 0 when its coordinates with respect to a basis B are known. The answer is affirmative and the procedure is given below. Let B = {v1 , v2 , . . . , vn } and B 0 ={w1 , w2 , · · · , wn } be two different ordered bases of an n-dimensional vector space V. Then each vector in B 0 can be written as a linear combination of vectors in B as n X wj = cij vi , j = 1, 2, 3, . . . , n. (3.3) i=1

Let v ∈ V. Then v can be expressed in terms of vectors of both bases B and B 0 as follows. v=

n X i=1

αi vi

=

n X j=1

β j wj

(3.4)

Vector Spaces

107

using (3.3) in (3.4) gives v=

n X

β j wj =

j=1

n X j=1

βj

n X

cij vi =

i=1

n X i=1

vi

n X

βj cij =

j=1

n X n X ( cij βj )vi . i=1 j=1

Comparing the coefficients yields αi =

n X

cij βj

j=1

or using vector notation for v ∈ V [v]B = C[v]B0,

(3.5)

where C = [cij ]n×n . This procedure results in the following theorem. Theorem 3.4.1 Let V be a vector space of finite dimension n. Let B and B 0 be two bases for V. Then there exists a unique nonsingular matrix C of order n such that, for any vector v ∈ V, [v]B = C [v]B0. Example 3.4.2 Let B = {(1 1 0), (1 0 1), (0 1 1)} be a basis for R3 . If v = (2 4 8), find the coordinates of v with respect to B. Solution. Let v = a(1 1 0) + b(1 0 1) + c(0 1 1), then this is equivalent to solving  1 1 0 the system Ax = b where x = (a b c), b = (2 4 8) and A = 1 0 1. 0 1 1 Using the augmented matrix       1 1 0:2 a −1 1 0 1 : 4 and reducing to echelon form yields  b  =  3 . 0 1 1:8 c B 5 Example 3.4.3 Let B 0 be the basis in the above example and B ={(1 0 0), (0 1 0), (0 0 1)} be the standard basis for R3 . Verify Theorem 3.4.1 by finding the nonsingular matrix C such that [v]B0 = C[v]B for v = [2, 4, 8]. Solution. 0 Consider the  augmented matrix K = [B : B] 1 1 0:1 0 0 1 0 1 : 0 1 0 0 1 1:0 0 1 R2 → R2 − R1

108

Linear Algebra to Differential Equations  1 0 0

1 −1 1

 0 0 1 0 0 1

0: 1 1 : −1 1: 0

R3 → R3 + R2 , R1 → R1 + R2   1 0 1: 0 1 0 0 −1 1 : −1 1 0 0 0 2 : −1 1 1 R2 → (−1)R2 ,  1 0 0

R3 → 12 R3 0 1 0

1: 0 −1 : 1 1 : −1 2

R2 → R2 + R3  1 0  0 1 0

0:

0

R1 → R1 − R3  1 0  0 1 0

1:0

0

1:

1 2 0 : 12 1 : −1 2

0:

1 2  1  2 −1 2

 Hence

C=

  2 [v]B = 4 8  1 [v]B0 =

2  1  2 −1 2

1 2

1

1 2 −1 2

1 2 −1 2 1 2

1 2 −1 2 1 2

 0 0

1 −1

−1 2 1 2

1 2 −1 2 1 2

1 2

0



1 2 1 2

−1  2 1  2 . 1 2

−1  2 1  2 . 1 2

−1    2 2 1   2  4 8 1 2

  −1 =  3 , verifying the theorem. 5

Vector Spaces

109

EXERCISE 3.4 1. Find the change of basis matrix, that is, a nonsingular matrix C such that [v]B0 = C[v]B for the bases B = {(1 0), (1 1)} and B 0 = {(1 3), (−2 6)}. 2. Let B 0 = {(1 3), (2 8)}. If v = (4 2), find the coordinates of v with respect to basis of B 0 . 3. Let B = {(1 1 0), (2 0 1), (0 1 1)} and B 0 = {(0 2 0), (1 4 3), (−1 2 1)} find the change of basis matrix. 4. B = {(1 0 0 0), (0 1 0 0), (0 0 1 0), (0 0 0 1)} and B 0 = {(1 0 0 1), (0 1 0 0), (1 1 1 1), (0 1 1 1)}, Find change of basis matrix. 5. B = {(1 0 0 1), (0 1 0 0), (1 1 1 1), (0 1 1 1)} and B 0 = {(2 2 0 0), (−1 1 1 2), (0 0 2 2), (2 1 2 1)}, Find the change of basis matrix.

3.5

Linear Transformations

In Section 1.3, matrix multiplication was described and it can be observed that an n × 1 column vector, (i.e.) x, when multiplied by an m × n matrix A, yields an m × 1-vector, (i.e.) y, that is, Ax = y. In other words, a matrix behaves like a mapping taking a vector in Rn to a vector in Rm . Further, matrix addition and scalar multiplication are distributive in nature. Taking cue from matrices the idea is to introduce a mapping from one vector space into another such that the properties or rules satisfied by a matrix are preserved. This leads to the following definition of a mapping which generalizes the concept of a matrix. Definition 3.5.1 A linear transformation T is a mapping from a vector space, V, into another vector space, W, denoted by T : V → W satisfying (i) T (v1 + v2 ) = T v1 + T v2 for all v1 , v2 ∈ V ; (ii) (αT )(v) = (α)(T v), for all α ∈ R, v ∈ V. Observation. The addition and scalar multiplication on left-hand side are that in V and on right-hand side is that in W and hence may be different. Example 3.5.1 Let T : R2 → R2 be such that

110

Linear Algebra to Differential Equations

(a) T(x y)=(y x) (b) T(x y) = T(x 0). Both the examples are linear transformations. Verify. Example 3.5.2 T : R3 → R3 defined by (a) T(x y z)=(ax by cz) (b) T(x y z) = (x+y y+z z+x) are linear transformations. Verify. Example 3.5.3 T : R3 → R2 defined by T (x y z) = (x y) is a linear transformation. Example 3.5.4 T : R2×2 → R defined by   a b (a) T = a+b+c+d is a linear transformations. c d     a1 a2 b1 b2 Solution. Consider A = and B = a3 a4 b3 b4   a1 + b1 a2 + b2 Then, T (A + B) = T a3 + b3 a4 + b4 = a1 + b1 + a2 + b2 + a3 + b3 + a4 + b4 = (a1 + a2 + a3 + a4 ) + (b1 + b2 + b3 + b4 ) = T (A) + T (B).     a1 a2 αa1 αa2 Also, (αT )(A) = (αT ) =T a3 a4 αa3 αa4 = αa1 + αa2 + αa3 + αa4 ) = α(a1 + a2 + a3 + a4 ) = αT (A). Observations. (1) A linear transformation T : V → V is called a linear operator. (2) The identity mapping I : V → V defined by I(v) = v is a linear operator. (3) The zero mapping O : V → V defined by O(v) = 0 is a linear operator. (4) T(0)=0, for any linear transformation T. Example 3.5.5 T : R → R defined by T(x) =x+c is not a linear transformation since T (x + y) = x + y + c 6= T x + T y = (x + c) + (y + c).

Vector Spaces

111

The above example leads to a very interesting definition which plays a vital role in applications relating to computer graphics. Definition 3.5.2 An affine transformation is a mapping T : V → V defined by T v = v + u0 , where u0 is a fixed vector. Observation. (1) In R, a linear transformation is a straight line passing through the origin (T (0) = 0 holds). (2) In R, an affine transformation is a straight line not passing through the origin (T (0) 6= 0 holds). With the definition of a linear transformation in place, the next question that arises is about the domain, range and null space of a linear transmission T. These are given below. Let T : V → W be a linear transformation defined from a vector space V to another vector space W (with underlying elements in R). Definition 3.5.3 The null space or kernel of T is defined as the set of all vectors, v ∈ V such that T v = 0, and is denoted by N (T ) = {v : T v = 0}. Definition 3.5.4 The range of T is the set of all w ∈ W such that T v = w and is denoted by R(T ) = {T v : v ∈ V }. Observation. N (T ) ⊆ V and R(T ) ⊆ W. Theorem 3.5.1 If T : V → W is a linear transformation, then N(T) is a subspace of V and R(T) is a subspace of W. Proof. By Theorem 3.2.1, it is sufficient to show that αv1 + v2 ∈ N (T ) whenever v1 , v2 ∈ N (T ). Let v1 , v2 ∈ N (T ) then T (v1 ) = T (v2 ) = 0. Consider T (αv1 + v2 ) = αT (v1 ) + T (v2 ) = α(0) + 0 =0 which yields αv1 + v2 ∈ N (T ) and thus N(T) is a subspace of V. Let w1 , w2 ∈ R(T ) then there exist v1 and v2 ∈ V such that T v1 = w1 and T v2 = w2 . Consider αw1 + w2 = αT v1 + T v2 = T (αv1 ) + T (v2 ) = T (αv1 + v2 ) ∈ R(T ).

112

Linear Algebra to Differential Equations

The above relations follow from the fact that αv1 + v2 ∈ V. From the observation following Theorem 3.2.1, R(T) is a subspace of W and the proof is complete. The importance of a basis studied in Section 3.3 can be observed when studying linear transformations. A linear transformation must map every element in V to an element in W. But because of the basis it is enough for a linear transformation to map every element in the basis to an element in W (Why?). The following theorem characterizes the importance of the basis. It is stated without proof. Theorem 3.5.2 Let V be a vector space with dimV=n, and {v1 , v2 , . . . , vn } be an ordered basis for V. Further, let W be another vector space and w1 , w2 , . . . , wn ∈ W be any given n vectors. Then there exists a unique linear transformation T from V to W such that T vi = wi , i = 1, 2, 3, . . . , n. The result given below will link up the concepts of a basis and a linear transformation. In order to do so the following definitions are in order. Definition 3.5.5 The dimension of the null space of a linear transformation T is called nullity of T and is denoted by dim N (T ) or n(T ). Definition 3.5.6 The dimension of the range space of a linear transformation T is called the rank of T and is denoted by dim R(T ) or r(T ). Theorem 3.5.3 Let V and W be vector spaces with dimV = n < ∞ and T : V → W be a linear transformation. Then r(T ) + n(T ) = dimV. Proof. From Theorem 3.5.1, N(T) is a subspace of V, hence dim N(T)=m (say) ≤ n. Let B = {v1 , v2 , . . . , vm } is a basis for N(T). Extend B to form a basis for V by adding n − m vectors say vm+1 , vm+2 , . . . , vn . Then B 0 = {v1 , v2 , . . . , vm , vm+1 , . . . , vn } is a basis for V. To conclude the result of the theorem it is sufficient to show that r(T ) = n−m. In other words, it is enough to show that the set Bˆ = {T vm+1 , T vm+2 , . . . , T vn } is a basis for R(T). The proof of linear independence of the set Bˆ is as follows. Let α1 , α2 , . . . , αn−m be n − m scalars such that α1 T vm+1 + α2 T vm+2 + · · · + αn−m T vn = 0. Then T (α1 vm+1 + α2 vm+2 + · · · + αn−m vn ) = 0. This implies that v = α1 vm+1 + α2 vm+2 + · · · + αn−m vn ∈ N (T ), from which v can be written as a linear combination of elements of the basis B of N (T ).

Vector Spaces

113

Hence v = α1 vm+1 + α2 vm+2 + · · · + αn−m vn = β1 v1 + β2 v2 + · · · + βm vm for some scalars β1 , β2 , . . . , βm , which gives, −{β1 v1 + β2 v2 + · · · + βm vm } + α1 vm+1 + α2 vm+2 + · · · + αn−m vn = 0 which implies β1 = β2 = ... = βm = α1 = α2 = · · · = αn−m = 0 as {v1 , v2 , . . . , vn } is a basis for V . Next, to show that {T vm+1 , T vm+2 , . . . , T vn } spans R(T). Consider w ∈ R(T ) then w = T v, for some v ∈ V. Let v = γ1 v1 + γ2 v2 + · · · + γn vn . Then w = T (γ1 v1 + γ2 v2 + · · · + γn vn ) = γ1 T (v1 ) + γ2 T (v2 ) + · · · + γm T (vm ) + γm+1 T (vm+1 ) + · · · + γn T (vn ) = γm+1 T (vm+1 ) + · · · + γn T (vn ), since T (v1 ) = T (v2 ) = · · · = T (vm ) = 0, and the proof is complete. Result. 1. Let T be a linear transformation from a vector space V to a vector space W whose zero elements are given by 0v and 0w , respectively. Then T (0v ) = 0w . Proof. T (0v ) + 0w = T (0v ) = T (0v + 0v ) = T (0v ) + T (0v ) Thus, T (0v ) = 0w , since T (0v ) ∈ W . In other words, 2. If T (0v ) 6= 0w , then T is not a linear transformation.

Minimal polynomial Consider a linear operator T (which is a linear transformation from V to V) defined on a finite dimensional vector space V with dim V = n. Corresponding to the linear operator T there exists a matrix A of order n × n. From Cayley–Hamilton theorem, every square matrix A satisfies its own characteristic equation. Thus if p(λ) is the characteristic polynomial of A then p(A) = 0, (also p(T ) = 0). The question that arises is whether there is a polynomial of lesser degree q(A) that divides the characteristic polynomial and is such that q(A) = 0. The answer is yes and leads to the following definition. Definition 3.5.7 Minimal polynomial for a linear operator T(and to the corresponding matrix A) is a polynomial q(λ) of smallest degree such that 1. q(T ) = 0 and q(A) = 0;

114

Linear Algebra to Differential Equations

2. The coefficient of its highest degree term is ‘1’; 3. q divides the characteristic polynomial (i.e.) q(λ) divides p(λ). Observation. 1. If p(λ) is the characteristic polynomial of A and all the eigen-values of A are distinct then the minimal polynomial is the characteristic polynomial. 2. If p(λ) is the characteristic polynomial of A and some or all of the eigenvalues of A are repeated then there exists a minimal polynomial q(λ) such that q(A) = 0 and q divides p. This result leads to the Jordan blocks described in Chapter 2. Example 3.5.6 Find the minimal polynomial of   1 0 0 A = 2 3 0 . 3 4 5 Solution. The characteristic equation of A is

 1−λ  2 3

|A − λI| = 0  0 0 =0  3−λ 0 4 5−λ

(1 − λ)(3 − λ)(5 − λ) = 0 The characteristic polynomial is p(λ) = (1 − λ)(3 − λ)(5 − λ). Here all the eigen-values different. So the minimal polynomial is the characteristic polynomial. Thus the minimal polynomial is q(λ) = (1 − λ)(3 − λ)(5 − λ). Example 3.5.7 Find the minimal polynomial of   1 0 0 A = 1 −2 0  . 1 0 −2 Solution. The characteristic equation of A is |A − λI|=0   1−λ 0 0  1 =0  −2 − λ 0 1 0 −2 − λ

Vector Spaces

115

(λ − 1)(λ + 2)(λ + 2) = 0 The eigen-values are 1, −2, −2. The characteristic polynomial is p(λ) = (λ − 1)(λ + 2)(λ + 2) and the minimal polynomial is q(λ) = (λ − 1)(λ + 2).

EXERCISE 3.5 1. Let T : R2 → R3 be a linear transformation given by T (x1 x2 ) = (x1 + x2 x1 − x2 x2 ) Then find Rank T. 2. If T : R2 → R2 is defined by T (x1 x2 ) = (x21 x2 ). Is T is a linear transformation? if not why? 3. If T : R2 → R2 is defined by T (x1 x2 ) = (x1 − x2 0). Show that T is a linear transformation. 4. If T : R3 → R2 is defined by T (x y z) = (x + 1 y + z). Show that T is not a linear transformation. 5. Let T : R3 → R3 be a linear transformation defined by T (x y z) = (x + 2y − z y + z x − 2z) then find the rank of T. 6. Which of the following are linear transformations? (a) T : R2 → R2 such that T (x y) = (x x); (b) T : R2 → R2 such that T (x y) = (x + y x − y); (c) T : R2 → R2 such that T (x y) = (x + y siny).

3.6

Matrices of Linear Transformations

In this section, corresponding to every linear transformation, a matrix associated with it is described. Specific examples of linear transformations like projection, rotation, dilation, etc. are discussed. As observed in previous sections, bases of both the domain and the codomain vector spaces play a vital role in obtaining the correspondence between a linear transformation and a matrix. Procedures to obtain the matrix corresponding to a linear transformation are given. In Section 1.2, vectors of various types have been defined. These specific examples are considered to obtain the desired matrix.

Example 3.6.1 The projection map T : R3 → R2 is defined by T (x y z) = (x  y). The matrix corresponding to this transformation is given by
A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}.

Example 3.6.2 The dilation map T : R2 → R2 is defined by T (x y) = (αx  αy), α > 1, and the associated matrix is
\begin{bmatrix} α & 0 \\ 0 & α \end{bmatrix}.

Example 3.6.3 The rotation map T : R2 → R2 is defined by
T (x y) = (x cos θ + y sin θ   −x sin θ + y cos θ),
where θ is the angle of rotation of the vector (x y) about the origin. The corresponding matrix is given by
\begin{bmatrix} cos θ & sin θ \\ −sin θ & cos θ \end{bmatrix}.

Example 3.6.4 The reflection map with respect to the x-axis, T : R2 → R2 , is defined by T (x y) = (x  −y) and the corresponding matrix is
A = \begin{bmatrix} 1 & 0 \\ 0 & −1 \end{bmatrix}.

Example 3.6.5 The shear in the x-direction, L : R2 → R2 , is defined as L(x y) = (x + ky  y) and the associated matrix is
A = \begin{bmatrix} 1 & k \\ 0 & 1 \end{bmatrix},
where k is any scalar.


Observation. All the above examples are considered with respect to the standard basis in R2 or R3 . The question that arises naturally is the role of a basis in finding the matrix corresponding to a linear transformation. The following theorem is a useful result in this direction.

Theorem 3.6.1 Let V and W be finite dimensional vector spaces with dim V = n, and let B = {v1 , v2 , . . . , vn } be a basis of V. Further, let T : V → W be a linear transformation. Then, for any v ∈ V,
T v = c1 T (v1 ) + c2 T (v2 ) + · · · + cn T (vn ),
where c1 , c2 , . . . , cn ∈ R are unique scalars.

Observation. The theorem states that the linear transformation is completely determined for every vector v ∈ V once the values T (v1 ), T (v2 ), . . . , T (vn ) on the basis B are known. The proof is simple and is left as an exercise. The unique matrix corresponding to a linear transformation from a vector space V to a vector space W is given below.

Theorem 3.6.2 Let V and W be finite dimensional vector spaces with dim V = n and dim W = m. Further let B = {v1 , v2 , . . . , vn } and B′ = {w1 , w2 , . . . , wm } be bases for V and W , respectively. Let T : V → W be a linear transformation. Then, there exists a unique matrix Am×n whose jth column is the coordinate vector of T (vj ) with respect to the basis B′ of W, and the matrix A is such that, for v ∈ V,
[T (v)]B′ = A[v]B ,
where [v]B and [T v]B′ are the coordinate vectors of v and T v with respect to the bases B and B′ , respectively. This matrix is unique and its jth column is
Aj = (c1j  c2j  · · ·  cmj )^T (with respect to B′),
where c1j , c2j , . . . , cmj are the constants obtained from the relation T (vj ) = c1j w1 + c2j w2 + · · · + cmj wm .

Definition 3.6.1 The matrix of the linear transformation T : V → W is given


by
A = \begin{bmatrix} c_{11} & c_{12} & \cdots & c_{1n} \\ c_{21} & c_{22} & \cdots & c_{2n} \\ \vdots & \vdots & & \vdots \\ c_{m1} & c_{m2} & \cdots & c_{mn} \end{bmatrix},
where
Aj = (c1j  c2j  · · ·  cmj )^T = [T vj ]B′ .

The following is a procedure for finding the matrix of a linear transformation.
Step 1 Given T : V → W, a basis B = {v1 , v2 , . . . , vn } ⊆ V and a basis B′ = {w1 , w2 , . . . , wm } ⊆ W.
Step 2 Compute T (vj ), j = 1, 2, . . . , n.
Step 3 Find the coordinate vector [T (vj )]B′ . That is, write T (vj ) = c1j w1 + c2j w2 + · · · + cmj wm and find c1j , c2j , . . . , cmj .
Step 4 The matrix of the linear transformation is A = [A1  A2  · · ·  An ] with [T (vj )]B′ = Aj = (c1j  c2j  · · ·  cmj )^T .

Below is a simple procedure to find the matrix A when T : Rn → Rm . To find c1j , c2j , . . . , cmj , j = 1, 2, . . . , n, write the augmented matrix
[ w1  w2  · · ·  wm  :  T (v1 )  T (v2 )  · · ·  T (vn ) ].
Next transform the augmented matrix to reduced row echelon form [Im : A]. Then A is the required matrix; a computational sketch of this step follows the observation below.
Observation. Finding the coordinate vector of T (vj ) means solving the equation
[w1  w2  · · ·  wm ] xj = T (vj ), j = 1, 2, . . . , n,
where xj is the jth column of A (to be found). This can be done in one go by considering the augmented matrix as above.
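The following sketch (illustrative only, assuming the SymPy library is available; the names W, TV and A are introduced here) carries out exactly this row-reduction step on the data of Example 3.6.7 below.

from sympy import Matrix

# Codomain basis vectors w1 = (1, 2)^T and w2 = (-1, 1)^T as columns
W = Matrix([[1, -1],
            [2,  1]])
# Images T(v1), T(v2), T(v3) as columns
TV = Matrix([[1, 1, 2],
             [-1, 0, 0]])

augmented = W.row_join(TV)         # [ w1 w2 : T(v1) T(v2) T(v3) ]
R, _ = augmented.rref()            # reduced row echelon form [ I : A ]
A = R[:, W.cols:]                  # matrix of T with respect to the chosen bases
print(A)                           # Matrix([[0, 1/3, 2/3], [-1, -2/3, -4/3]])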

Example 3.6.6 Let L : R3 → R2 be defined by L(x y z)^T = (y + z  x − y)^T . Let S = {v1 , v2 , v3 } and T = {w1 , w2 } be the bases for R3 and R2 , respectively, where v1 = (1 0 0)^T , v2 = (0 1 0)^T , v3 = (0 0 1)^T and w1 = (1 0)^T , w2 = (0 1)^T . Find the matrix of L with respect to S and T.
Solution. Calculating,
L(v1 ) = (0 + 0  1 − 0)^T = (0  1)^T , L(v2 ) = (1 + 0  0 − 1)^T = (1  −1)^T , L(v3 ) = (0 + 1  0 − 0)^T = (1  0)^T ,
so [L(v1 )]T = (0  1)^T , [L(v2 )]T = (1  −1)^T , [L(v3 )]T = (1  0)^T . Hence,
A = \begin{bmatrix} 0 & 1 & 1 \\ 1 & −1 & 0 \end{bmatrix}.

Example 3.6.7 Let L : R3 → R2 be defined by L(x y z)^T = (x + y  y − z)^T . Let S = {v1 , v2 , v3 } and T = {w1 , w2 } be the bases for R3 and R2 , respectively, where v1 = (1 0 1)^T , v2 = (0 1 1)^T , v3 = (1 1 1)^T and w1 = (1 2)^T , w2 = (−1 1)^T . Find the matrix of L with respect to S and T.
Solution. Calculating,
L(v1 ) = (1  −1)^T , L(v2 ) = (1  0)^T , L(v3 ) = (2  0)^T .
To find the coordinate vectors, write
L(v1 ) = (1  −1)^T = a1 w1 + a2 w2 , L(v2 ) = (1  0)^T = b1 w1 + b2 w2 , L(v3 ) = (2  0)^T = c1 w1 + c2 w2 .
This requires solving three linear systems. Thus, form the augmented matrix
\begin{bmatrix} 1 & −1 & : & 1 & : & 1 & : & 2 \\ 2 & 1 & : & −1 & : & 0 & : & 0 \end{bmatrix}
R2 → R2 − 2R1 :
\begin{bmatrix} 1 & −1 & : & 1 & : & 1 & : & 2 \\ 0 & 3 & : & −3 & : & −2 & : & −4 \end{bmatrix}
R1 → 3R1 + R2 :
\begin{bmatrix} 3 & 0 & : & 0 & : & 1 & : & 2 \\ 0 & 3 & : & −3 & : & −2 & : & −4 \end{bmatrix}
R1 → R1 /3, R2 → R2 /3 :
\begin{bmatrix} 1 & 0 & : & 0 & : & 1/3 & : & 2/3 \\ 0 & 1 & : & −1 & : & −2/3 & : & −4/3 \end{bmatrix},
so that [L(x)]T = A[x]S . Therefore, the matrix of L with respect to S and T is
A = \begin{bmatrix} 0 & 1/3 & 2/3 \\ −1 & −2/3 & −4/3 \end{bmatrix}.

Example 3.6.8 Let p be any vector in R3 and let B = {v1 , v2 , v3 } and B′ = {w1 , w2 , w3 } be any two orthonormal bases of R3 . Find the coordinates of p with respect to B and B′ and find the change of basis matrix.
Solution. To find the coordinates of p with respect to the basis B, write p as p = av1 + bv2 + cv3 . Then p·v1 = a, p·v2 = b, p·v3 = c. Thus the coordinate vector of p with respect to the basis B is
[p]B = (p·v1  p·v2  p·v3 )^T .
Similarly, the coordinate vector of p with respect to the basis B′ is
[p]B′ = (p·w1  p·w2  p·w3 )^T .
Working in a similar manner, the change of basis matrix from B to B′ is
\begin{bmatrix} v1 ·w1 & v1 ·w2 & v1 ·w3 \\ v2 ·w1 & v2 ·w2 & v2 ·w3 \\ v3 ·w1 & v3 ·w2 & v3 ·w3 \end{bmatrix}.

EXERCISE 3.6
1. Find the matrix corresponding to the linear transformation T : R2 → R2 of the reflection map with respect to the y-axis.
2. Find the matrix corresponding to the shear in the y-direction.
3. Prove Theorem 3.6.1.
4. The linear transformation T on R3 is defined as T (x y z) = (2y + z  x − 4y  3x). With respect to the basis B = {(1 1 1), (1 1 0), (1 0 0)}, find [T ]B .
5. Let T be a linear transformation from R3 → R2 defined by T (x y z) = (x + y  y − z). Find the matrix of T with respect to the ordered bases {(1 1 1), (1 −1 0), (0 1 0)} and {(1 1), (1 0)}.
6. Let T be a linear transformation on R3 given by T (x y z) = (2x  4x − y  2x + 3y − z). Show that the matrix of the linear transformation is invertible.
7. The linear transformation T : R3 → R2 satisfies T (1 0 0)^T = (6 9)^T , T (0 1 0)^T = (−13 17)^T and T (0 0 1)^T = (7 11)^T . Find the matrix A of T with respect to the basis {(1 0), (1 1)}.
8. T : R2 → R2 is such that T (x y) = (x − y  2x + y). Find the matrix which represents T
(a) relative to the basis {(1 0), (0 1)};
(b) relative to the basis {(1 2), (1 1)}.

3.7

Inner Product Space

In Section 1.2, the dot product of two vectors is given and it can be observed that in R2 every orthogonal basis B = {v1 , v2 } satisfies the property v1 ·v2 = 0. In other words, these basis vectors are perpendicular (orthogonal) to each other in R2 . This property holds in R3 and also in Rn , for any n < ∞. The main idea of this section is to describe a notion that characterizes this concept of orthogonality in a vector space. The following definition of an inner product suits the bill and is useful for defining a basis in a vector space.
Definition 3.7.1 An inner product on a vector space V is a mapping from V × V into R, denoted by ⟨x, y⟩ for every pair of vectors x and y, satisfying the following properties:
(i) ⟨x, x⟩ ≥ 0 and ⟨x, x⟩ = 0 if and only if x = 0, x ∈ V ;


(ii) ⟨αx, y⟩ = α⟨x, y⟩ for all x, y ∈ V ; and
(iii) ⟨x + y, z⟩ = ⟨x, z⟩ + ⟨y, z⟩ for all x, y, z ∈ V.
Observation. Note that as the underlying space is R, the definition of the inner product has been restricted to R. In general, an inner product caters to complex numbers with the following extra condition:
(iv) ⟨x, y⟩ = the complex conjugate of ⟨y, x⟩, for all x, y ∈ V .
In this situation, the inner product is a mapping from V × V → C.

Example 3.7.1 Let V = Rn . Then for x = (x1  x2  . . .  xn ), y = (y1  y2  . . .  yn ), ⟨x, y⟩ = x·y = Σ_{i=1}^{n} xi yi .

Definition 3.7.2 A vector space V with an inner product defined on it is called an inner product space.

Example 3.7.2 (i) Rn , 0 < n < ∞, is an inner product space with the inner product defined as above.
(ii) Let K = {f : [a, b] → R such that f is continuous}. For f, g ∈ K, define
⟨f, g⟩ = ∫_a^b f (x)g(x) dx.
Then K is an inner product space.
Observation. 1. Note that the inner product induces a distance called a norm (see Section 1.2), as ⟨x, x⟩ = ‖x‖² .
2. In Example 3.7.2 (i), it can be seen that the inner product in Rn is nothing but the dot product in Rn ; hence the concept of angle comes into the picture and the idea of perpendicular vectors can be considered. This notion is extended to a vector space as follows. Let the vector space (V, +, ·) with an associated inner product ⟨·, ·⟩ be an inner product space.

x, y ∈ S.


Definition 3.7.5 The orthogonal complement of S is the set (denoted by S⊥ ) of vectors that are perpendicular to every vector in S, i.e.,
S⊥ = {x ∈ V : ⟨x, y⟩ = 0 for all y ∈ S}.
Example 3.7.3 {0}⊥ = V, V⊥ = {0}.
Definition 3.7.6 Two sets S1 , S2 are said to be mutually orthogonal, denoted by S1 ⊥ S2 , if for all x ∈ S1 and y ∈ S2 , ⟨x, y⟩ = 0.
Example 3.7.4 Let S1 = the xy-plane and S2 = the z-axis. Then S1⊥ = S2 and S2⊥ = S1 . Also R3 = S1 ⊕ S2 . (See Section 3.2 for the definition of direct sum.)
Definition 3.7.7 An orthogonal set S ⊆ V is said to be orthonormal if every vector in S is a unit vector.
Example 3.7.5 S = {(1 0 0), (0 1 0), (0 0 1)} is an orthonormal set in R3 .
Definition 3.7.8 A subset S (⊆ V) is said to be an orthonormal basis for V if it is an ordered basis and is also an orthonormal set.
Example 3.7.6 The preceding example forms an orthonormal basis.

Diagonalizing a rectangular matrix In Section 1.3, a matrix has been introduced as an arrangement of vectors in columns. In Section 2.7, it was shown that a matrix consisting of linearly independent eigen-vectors diagonalizes the matrix. These eigen-vectors form a basis. If A is symmetric, the eigen-vectors can be made orthonormal and the matrix consisting of orthonormal eigen-vectors is an orthogonal matrix, Q, and An×n can be diagonalized by Λ = QT AQ. In general, if one wants an orthonormal basis for diagonalizing a rectangular matrix Am×n , it is essential to consider singular values and singular vectors. Consider {v1 , v2 , . . . , vn } eigen-vectors of AT A and {u1 , u2 , u3 , . . . , un } eigen-vectors of AAT with the eigen-values λi = σi2 . Then the vi ’s and ui ’s are linked by the relation, Av1 = σ1 u1 , Av2 = σ2 u2 , . . . , Avn = σn un which can be written in the matrix notation AV = UΣ or A = UΣVT as U and V are orthogonal matrices. Any matrix Am×n represents a linear transformation from a vector space V of dimension n to a vector space W of dimension m. To diagonalize A the following theorem is essential.


Theorem 3.7.1 (Singular Value Decomposition) Let {v1 , v2 , . . . , vn } be the eigen-vectors of (A^T A)n×n and {u1 , u2 , . . . , um } be the eigen-vectors of (AA^T )m×m . Let λi = σi² be the eigen-values of A^T A. Then the σi are called the singular values of A, and {v1 , v2 , . . . , vn } and {u1 , u2 , . . . , um } are the singular vectors of A. Further,
A = U ΣV^T = [u1 , u2 , . . . , um ]m×m  Σ  [v1 , v2 , . . . , vn ]^T ,
where Σ is the m × n matrix whose diagonal entries are the singular values σ1 , σ2 , . . . and whose remaining entries are zero.

Example 3.7.7 Let A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 1 \end{bmatrix}. Then
A^T = \begin{bmatrix} 1 & 0 \\ 0 & 2 \\ 0 & 1 \end{bmatrix},   AA^T = \begin{bmatrix} 1 & 0 \\ 0 & 5 \end{bmatrix}.
The eigen-values of AA^T are 5 and 1. The eigen-vector corresponding to λ = 5 is u1 = (0  1)^T and the eigen-vector corresponding to λ = 1 is u2 = (1  0)^T . The normalized vectors are u1 = (0  1)^T and u2 = (1  0)^T , hence set
U = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}.
Next consider
A^T A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 4 & 2 \\ 0 & 2 & 1 \end{bmatrix}.
The eigen-values are 5, 1 and 0. The corresponding eigen-vectors are, for λ = 5, v1 = (0  2  1)^T ; for λ = 1, v2 = (1  0  0)^T ; and for λ = 0, v3 = (0  −0.5  1)^T . Normalizing the vectors gives v1 = (0  0.8944  0.4472)^T , v2 = (1  0  0)^T , and v3 = (0  −0.4472  0.8944)^T . Then the matrix is
V = \begin{bmatrix} 0 & 1 & 0 \\ 0.8944 & 0 & −0.4472 \\ 0.4472 & 0 & 0.8944 \end{bmatrix}.
Further,
Σ = \begin{bmatrix} √5 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} = \begin{bmatrix} 2.2361 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}.
Verification:
U ΣV^T = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 2.2361 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} 0 & 0.8944 & 0.4472 \\ 1 & 0 & 0 \\ 0 & −0.4472 & 0.8944 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 1 \end{bmatrix} = A.
Observation. The matrices U and V are orthogonal matrices.
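The same decomposition can be checked numerically. The sketch below (an illustration only, assuming NumPy is available) applies numpy.linalg.svd to the matrix of Example 3.7.7; note that the columns of U and V returned by the routine are unique only up to sign.

import numpy as np

A = np.array([[1.0, 0.0, 0.0],
              [0.0, 2.0, 1.0]])

U, s, Vt = np.linalg.svd(A)              # s holds the singular values
print(np.round(s, 4))                    # [2.2361 1.    ], i.e. sqrt(5) and 1

Sigma = np.zeros_like(A)                 # build the 2 x 3 matrix Sigma
Sigma[:, :len(s)] = np.diag(s)
print(np.allclose(U @ Sigma @ Vt, A))    # True: U Sigma V^T reproduces A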

EXERCISE 3.7
1. Show that K in Example 3.7.2(ii) is an inner product space with the given definition. (Verify that the definition in Example 3.7.2(ii) is an inner product.)
2. Find the value of the inner product (i) ⟨3x + 4y, 2x + 9y⟩ in R; (ii) ⟨(6 − 8i)z, 5i + 2x⟩ in C.
3. Show that x ⊥ y implies y ⊥ x.
4. Show that if ⟨x, y⟩ = 0 for all y ∈ V, then x = 0.
5. If S = {v1 , v2 , . . . , vn } is a set of mutually orthogonal vectors in Rn , show that S is a basis for Rn .
6. Find the singular value decomposition for A = \begin{bmatrix} 3 & 3 \\ 2 & −2 \end{bmatrix}.

3.8

Gram–Schmidt Orthogonalization

In this section, a process to obtain an orthonormal basis is described. This procedure is known as Gram–Schmidt orthogonalization process and is useful to represent functions in terms of a basis. Let V be an inner product space and {x1 , x2 , . . . , xn } be a linearly independent subset of V. Then, there exists an orthonormal subset of nonzero vectors {e1 , e2 , . . . , en } such that span {e1 , e2 , . . . , en } = span {x1 , x2 , . . . , xn }. The procedure is as follows.


Step 1: Set e1 = (1/‖x1‖) x1 .
Step 2: Write v2 = x2 − ⟨x2 , e1⟩e1 . Consider ⟨v2 , e1⟩ = ⟨x2 , e1⟩ − ⟨x2 , e1⟩⟨e1 , e1⟩ = 0. Hence v2 is perpendicular to e1 and v2 ≠ 0. Set
e2 = (1/‖v2‖) v2 .
Step 3: Write v3 = x3 − ⟨x3 , e1⟩e1 − ⟨x3 , e2⟩e2 . Then ⟨v3 , e1⟩ = ⟨v3 , e2⟩ = 0 and v3 ≠ 0. Thus v3 ⊥ e1 and v3 ⊥ e2 . Now set
e3 = (1/‖v3‖) v3 .
Proceeding in a similar fashion,
Step n: vn = xn − Σ_{k=1}^{n−1} ⟨xn , ek⟩ek ,
with vn ≠ 0 and vn orthogonal to e1 , e2 , . . . , en−1 , and
en = (1/‖vn‖) vn .
Thus {e1 , e2 , e3 , . . . , en } is an orthonormal set in V.
Observation. 1. The proof of the equivalence of the linear spans is given as an exercise. It follows by induction.
2. If V is of finite dimension n, then {e1 , e2 , . . . , en } is an orthonormal basis for V.
3. The procedure carries on for any n; hence it holds for a countably infinite set {x1 , x2 , . . . , xn , . . .} (such a set is called a sequence).
4. Looking at the vectors {x1 , x2 , x3 , . . . , xn } as representing n directions, it can be stated that at each step i, the components in the previous i − 1 orthogonal directions are subtracted from the vector xi .
5. At any point of the procedure, vi ≠ 0.


Result. Let V be an inner product space and {e1 , e2 , e3 , . . . , en } be an orthonormal subset of non-zero vectors. If x ∈ span{e1 , e2 , . . . , en }, then
x = Σ_{i=1}^{n} ⟨x, ei⟩ei .
The coefficients ⟨x, ei⟩ in the above linear combination are called the Fourier coefficients.
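A minimal computational sketch of the procedure (assuming NumPy is available; the function name gram_schmidt is introduced here) is given below; it implements the classical form of the steps above.

import numpy as np

def gram_schmidt(vectors):
    # Orthonormalize linearly independent x1, ..., xn (classical Gram-Schmidt)
    basis = []
    for x in vectors:
        v = np.array(x, dtype=float)
        for e in basis:
            v = v - np.dot(x, e) * e     # subtract the component of x along e
        basis.append(v / np.linalg.norm(v))
    return basis

e1, e2, e3 = gram_schmidt([[1, 1, 0], [1, 0, 1], [0, 1, 1]])
print(np.round(e1, 4), np.round(e2, 4), np.round(e3, 4))
print(round(float(np.dot(e1, e2)), 10), round(float(np.dot(e1, e3)), 10))   # both (numerically) zero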

EXERCISE 3.8 1. In Gram–Schmidt procedure prove that span{x1 , x2 , . . . , xn } = span{e1 , e2 , . . . , en }. 2. Consider any 3 × 3 nonsingular matrix and construct the orthonormal set of vectors. 3. Let S = {(1 − 2), (2 3)} and W = span S. Construct the orthonormal set of vectors. Find the Fourier coefficients for the vectors x = (10 − 6).

3.9

Linking Linear Algebra to Differential Equations

This section is meant for readers having knowledge of differential equations up to the undergraduate level. In this section, a brief note is given relating a couple of topics in linear algebra to differential equations. In Chapter 8, a differential system x′ = Ax, where A is an n × n constant matrix and x is an n × 1 column vector, is considered. It will be shown that the properties of this system are studied using the eigen-values of A. The same can be observed for systems dealt with in other chapters also. Differential equations involving variable coefficients do not fit into the above setup. Their study involves the notion of orthogonal sets. Some results dealing with them are given below.
Definition 3.9.1 Let L2 [a, b] = {x : [a, b] → R such that x is continuous} with the inner product defined by
⟨x, y⟩ = ∫_a^b x(t)y(t) dt.

Then L2 [a, b] is an inner product space.


It is called a Hilbert space on addition of a new concept called completeness, which is beyond the scope of this book.
Definition 3.9.2 A power series is a linear combination of the countably infinite set of functions S = {1, x, x², . . . , xⁿ , . . .}. Thus a power series is given by
Σ_{n=0}^{∞} an xⁿ .
Many linear differential equations with variable coefficients, like y′′ + xy′ + xy = 0, are assumed to have a power series solution and the coefficients of the power series are found. There are many differential equations whose solutions exhibit specific behavior. A couple of them are discussed below.
Definition 3.9.3 The Legendre differential equation is given by (1 − x²)y′′ − 2xy′ + n(n + 1)y = 0, where n is a whole number. There is an orthogonal sequence of functions, denoted by Pn (x), which are solutions of the Legendre differential equation. The Pn (x) are defined by
Pn (x) = Σ_{k=0}^{N} (−1)^k [(2n − 2k)! / (2ⁿ k!(n − k)!(n − 2k)!)] x^{n−2k} ,        (3.6)
where N = n/2 if n is even and N = (n − 1)/2 if n is odd.
Then P0 (x) = 1, P1 (x) = x, P2 (x) = (1/2)(3x² − 1), P3 (x) = (1/2)(5x³ − 3x), and so on. Pn (x) is called the Legendre polynomial of order n. The space in which Pn (x) is defined for each n is L2 [−1, 1] and the inner product is defined by
⟨x, y⟩ = ∫_{−1}^{1} x(t)y(t) dt.
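As a small numerical illustration (a sketch assuming NumPy is available, not part of the text's development), the orthogonality of the first few Legendre polynomials under this inner product can be checked with a simple quadrature.

import numpy as np

x = np.linspace(-1.0, 1.0, 200001)
P1 = x
P2 = 0.5 * (3 * x**2 - 1)
P3 = 0.5 * (5 * x**3 - 3 * x)

def inner(f, g):                       # crude quadrature for the L2[-1,1] inner product
    return float(np.sum(f * g) * (x[1] - x[0]))

print(round(inner(P1, P2), 6))         # ~0.0   (P1 and P2 are orthogonal)
print(round(inner(P2, P3), 6))         # ~0.0   (P2 and P3 are orthogonal)
print(round(inner(P2, P2), 6))         # ~0.4, i.e. 2/(2n+1) with n = 2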

Definition 3.9.4 A Bessel's differential equation is of the form
x² (d²y/dx²) + x (dy/dx) + (x² − n²)y = 0,  n a real number.
There is an orthogonal sequence of functions, denoted by Jn (x), which are solutions of this equation and are defined by
Jn (x) = Σ_{k=0}^{∞} (−1)^k (x/2)^{n+2k} / [k! Γ(n + k + 1)] .
These functions are in L2 [0, 1] and the inner product is defined by
∫_0^1 x Jn (αx) Jn (βx) dx.

Each Jn (x) is called a Bessel function of the first kind of order n. Similarly there exist orthogonal sequences called Hermite polynomials and Laguerre polynomials, which are solutions of the Hermite differential equation and the Laguerre differential equation, respectively. The readers are suggested to read books on special functions and also on differential equations for more information.

3.10

Conclusion

This chapter begins with fundamental concepts like a vector space, subspace, linear independence, basis and dimension. The concept of the change-of-basis matrix, widely used in applications, is given in detail. Various linear transformations used in applications, for example in topics relating to computer graphics and robotics, are discussed. The concept of a basis is extended from a finite dimensional space to an infinite dimensional space and orthonormal sets are introduced. These sets are then linked to special functions and differential equations. The singular value decomposition theorem, an important result used in principal component analysis, is given. For further reading the readers are suggested to refer to [3], [7], [10], [14] and [19].

Chapter 4 Numerical Methods in Linear Algebra

4.1

Introduction

Numerical methods are essential for handling large and very large systems of equations. This chapter concentrates on the well-known and useful techniques that are needed to solve these linear systems. Numerical computation induces error and the elements of these concepts form the content of Section 4.2. Section 4.3 deals with the standard techniques that reduce a given system to a diagonal or an upper or a lower triangular system. Section 4.4 introduces the iterative methods used to solve a system of equations. In Section 4.5, Householder transformation, which is an orthogonal transformation, is given. This technique mitigates the propagation of errors. In Section 4.6, another orthogonal transformation, plane rotation is utilized to reduce a symmetric matrix to a tridiagonal form. Section 4.7 deals with QR method. The bounds of eigen-values and a method of finding the largest eigen-value form the content of Section 4.8. Krylov subspace methods, which are solvers, are briefly touched in Section 4.9.

4.2

Elements of Computation and Errors

As this chapter concentrates on numerical techniques, it is essential to understand how digital computers handle computation and its impact on the problems considered.

Real numbers representation The representation of a real number on a digital computer has a specific format called floating-point number. Here the number is expressed as a fraction or an integer or an exponent.


Definition 4.2.1 A floating-point number is written as
r = ± 0.a1 a2 . . . am × 10ⁿ,        (4.1)
where a1 , a2 , . . . , am are integers such that 0 < a1 ≤ 9 and 0 ≤ ai ≤ 9, i = 2, 3, . . . , m, and n is an integer with −99 ≤ n ≤ 99. The equation (4.1) can be rewritten as r = a × 10ⁿ , where a is called the mantissa and n is called the characteristic or exponent. The number of digits in a, that is, m, is called the precision of the floating-point number.
Definition 4.2.2 m-digit arithmetic is the arithmetic involving m-digit floating-point numbers.
Observation. High-speed scientific computation uses 15-digit precision arithmetic.
Example 4.2.1 If r = 0.29345675 × 10⁻⁶ then the mantissa is a = 0.29345675, n = −6 is the exponent and m = 8 is the precision.
Example 4.2.2 1/374 = 0.0026737968 = 0.26737968 × 10⁻² in floating-point arithmetic. Here, the mantissa is a = 0.26737968, the exponent is n = −2 and m = 8 is the precision.

Every digital computer has a certain range of numbers that can be handled. Suppose a system has the capacity to work with numbers n ranging from 10⁻⁷ to 10⁸ , i.e. 10⁻⁷ < n < 10⁸ . Then any number m > 10⁸ is beyond the capacity of the system and is termed an overflow, while all numbers less than 10⁻⁷ are termed underflow. These numbers cannot be handled by the considered system. When arithmetic operations are performed the machine restricts the numbers to be within its prescribed range either by chopping off the digits or by rounding off the number. Both these operations lead to errors and together are termed round-off errors. Thus solving problems numerically generates errors automatically and they pile up at every stage. It becomes essential to understand the errors, hence the following definitions.
Definition 4.2.3 The absolute error is the difference between the actual quantity Q and the calculated quantity Q̄, that is, ε = Q − Q̄.
Definition 4.2.4 The relative error is the ratio of the absolute error and the actual quantity,
RE = (Q − Q̄)/Q.


Example 4.2.3 Suppose a company has a turnover of Rs 20 crores and during calculations Rs 10 lakhs is missing. Find the absolute error and the relative error.
Solution. Here the absolute error is Rs 10 lakhs, which appears like a significant amount. But the relative error is RE = 1000000/200000000 = 1/200 = 0.005.
In order to understand the effect of the errors the following notions are defined.
Definition 4.2.5 An ill-conditioned problem is a numerical problem wherein small changes in the data result in a large relative change in the computed solution.
Definition 4.2.6 A well-conditioned problem is a numerical problem whose computed solution is insensitive to small changes in the data of the problem.
To understand the sensitivity of the problem certain results related to matrices and norms are required. These are stated without proof and are given as problems in the exercise.
Result 4.2.1 If A is a square matrix of order n and ||A|| < 1, where ||A|| is a matrix norm, then the infinite series I + A + · · · + Aⁿ + · · · converges to the sum (I − A)⁻¹ .
Result 4.2.2 If ||A|| < 1 and the matrix (I − A) is nonsingular, then
(i) ||(I − A)⁻¹|| ≤ 1/(||I|| − ||A||), whenever ||I|| = 1;
(ii) ||I − (I − A)⁻¹|| ≤ ||A||/(||I|| − ||A||), whenever ||I|| = 1.

Observation. ||I|| = 1 in 1-,2- and ∞− norms. The conditioning number of a matrix in case of perturbations involved when solving a linear system of equations is discussed below. Linear system of equations Consider a linear system of equations given by Ax = b,

(4.2)

where A is an n × n matrix and x and b are n × 1 column vectors. While solving the system (4.2), errors can perturb the matrix A or the vector b.
Case (i) Suppose A is perturbed and the new system is
(A + A′)x = b.        (4.3)
Let x + x′ be the corresponding solution. Then,
(A + A′)(x + x′) = b        (4.4)
is satisfied. Now subtracting (4.2) from (4.4) and simplifying gives Ax′ + A′(x + x′) = 0, or x′ = −(I + A⁻¹A′)⁻¹A⁻¹A′x. This gives
||x′||/||x|| ≤ ||A⁻¹A′|| ||(I + A⁻¹A′)⁻¹|| ≤ ||A⁻¹A′|| / (1 − ||A⁻¹A′||).
Now set μ = ||A′||/||A||, ε = ||x′||/||x|| and k(A) = ||A|| ||A⁻¹||. Multiplying and dividing the right-hand side by ||A|| gives an estimate of the relative error,
ε ≤ k(A)μ(1 − k(A)μ)⁻¹.
Observation. μ = ||A′||/||A|| gives the relative change in A.

Definition 4.2.7 k(A) = ||A|| ||A⁻¹|| is called the condition number of A. It is the deciding factor in determining whether the relative error ε is large or small.
Case (ii) Suppose the perturbation occurs in b and the new system is
Ax = b + b′,        (4.5)
and the solution of the system (4.5) is given by x + x′. Then,
A(x + x′) = b + b′.        (4.6)
Subtracting (4.2) from (4.6) gives
Ax′ = b′, or x′ = A⁻¹b′.        (4.7)
Taking norms on both sides of (4.7), ||x′|| ≤ ||A⁻¹|| ||b′||; also, ||b|| ≤ ||A|| ||x||. Now the relative error is
ε = ||x′||/||x|| ≤ ||A|| ||A⁻¹|| ||b′||/||b||.
Set
γ = ||b′||/||b||, the relative change in b,
and k(A) = ||A|| ||A⁻¹||. Then the relative error ε ≤ k(A)γ.

Example 4.2.4 Suppose A = \begin{bmatrix} 1 & 1 \\ 2 & 1 \end{bmatrix} and b = \begin{bmatrix} 3.5 \\ 4 \end{bmatrix}.
(i) Solve the system Ax = b.
(ii) Suppose A is perturbed by A′ = \begin{bmatrix} 0.001 & −0.002 \\ 0.003 & −0.003 \end{bmatrix}. Solve the system (A + A′)x = b and state whether the system Ax = b is well-conditioned or not.
Solution. (i) Consider Ax = b. Then x = (0.5  3)^T .
(ii) The solution of the system (A + A′)x̃ = b, that is,
\begin{bmatrix} 1.001 & 0.998 \\ 2.003 & 0.997 \end{bmatrix} x̃ = \begin{bmatrix} 3.5 \\ 4 \end{bmatrix},
is x̃ = (0.502  3.0035)^T . Since the difference between x and x̃ is a very small vector, the problem is well-conditioned.

Example 4.2.5 Let A = \begin{bmatrix} 2 & 2 \\ 2 & 2.001 \end{bmatrix} and b = \begin{bmatrix} 3.5 \\ 4 \end{bmatrix}.
(i) Solve the system Ax = b.
(ii) Suppose A′ is as in the above example. Then solve the system (A + A′)x̃ = b.
(iii) State the conditionality of the problem.
Solution. (i) The system Ax = b has the solution x = (−498.25  500)^T .
(ii) The system (A + A′)x̃ = b has the solution x̃ = (250  −248.6236)^T .
(iii) Since the difference between x and x̃ is large, the problem is ill-conditioned.
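The two examples can be reproduced with a few lines of NumPy (a sketch under the assumption that NumPy is available); the condition number computed by numpy.linalg.cond makes the contrast explicit.

import numpy as np

b  = np.array([3.5, 4.0])
dA = np.array([[0.001, -0.002],
               [0.003, -0.003]])

A_good = np.array([[1.0, 1.0], [2.0, 1.0]])      # Example 4.2.4
A_bad  = np.array([[2.0, 2.0], [2.0, 2.001]])    # Example 4.2.5

for A in (A_good, A_bad):
    x       = np.linalg.solve(A, b)
    x_tilde = np.linalg.solve(A + dA, b)
    print("k(A) =", round(float(np.linalg.cond(A)), 1),
          " x =", np.round(x, 4),
          " perturbed x =", np.round(x_tilde, 4))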

EXERCISE 4.2 1. Verify Result 1. 2. Verify Result 2.

3. Find the condition number in the 2-norm for the matrix (i) A = \begin{bmatrix} 3 & 1 \\ 2 & 2 \end{bmatrix} and (ii) A = \begin{bmatrix} 2 & 1 & 3 \\ 2 & 2 & 1 \\ 1 & 2 & 3 \end{bmatrix}.
4. Consider A = \begin{bmatrix} 1 & 3 \\ 2 & 4 \end{bmatrix} and b = \begin{bmatrix} 5 \\ 8 \end{bmatrix}. Solve the system Ax = b. State the conditioning of the system Ax = b when perturbed with A′ = \begin{bmatrix} 0.002 & −0.002 \\ 0.003 & −0.004 \end{bmatrix}.
5. Suppose the matrix A = \begin{bmatrix} 3 & 3 \\ 3 & 3.003 \end{bmatrix} and b = \begin{bmatrix} 5 \\ 8 \end{bmatrix}. Taking A′ as in problem 4, find the conditionality of the problem.
6. In problem 4 change b to b′ = \begin{bmatrix} 5.02 \\ 8.01 \end{bmatrix} and find the conditionality of the problem.

4.3

Direct Methods for Solving a Linear System of Equations

In this section, a few numerical procedures are given to solve a linear system of equations,
Ax = b,        (4.8)
where A is an m × n matrix, x is an n × 1 vector and b is an m × 1 vector. The Gauss elimination method imitates the systematic approach taught at the elementary level for solving a system of equations. As mentioned in Section 4.2, arithmetic operations propagate errors, and to reduce their impact the following notions are defined. Consider the matrix A = (aij )m×n .
Definition 4.3.1 The pivot of the jth column is that element, denoted by p, which is the largest in magnitude among all elements aij of the jth column with i ≥ j. That is, |p| = max{|aij | : i ≥ j}.
Observation. After fixing p, elementary row operations are performed to shift the element p to the (j, j) position.

Let
A = \begin{bmatrix} 1 & 1 & 0 \\ −1 & 2 & 3 \\ −4 & 3 & 1 \\ 1 & 5 & −2 \end{bmatrix}.
To find the pivot p of the 1st column, consider |p| = max{1, |−1|, |−4|, 1} = 4, so the pivot of the 1st column is p = −4. For the pivot of the 2nd column, consider |p| = max{2, 3, 5} = 5, so the pivot of the 2nd column is p = 5. Similarly, the pivot of the 3rd column is p = −2.
Definition 4.3.2 Partial pivoting is the process wherein the search for the pivot is restricted to each column.
Definition 4.3.3 Complete pivoting is the approach where the pivot is chosen as the largest element in magnitude among all the mn elements of the matrix.
Observation. The pivot p is such that |p| = max{|aij |, i, j = 1, 2, . . . , n} = |akl | (say).
1. The pivot p is shifted to the (1,1) position by applying both the elementary row operation of interchanging rows R1 and Rk and the elementary column operation of interchanging columns C1 and Cl .
2. At every stage of the method, complete pivoting is done on a submatrix whose order is reduced by 1 each time.

Gauss Elimination Method
The idea in the Gauss elimination method is to consider the augmented matrix K = [A : b] and perform elementary row operations to transform K to K′ = [U : c], where U is an upper triangular matrix.
Algorithm 1 The algorithm of the Gauss elimination method is as follows.
Step 1. Consider the augmented matrix K = [A : b].
Step 2. Find the pivot p for the 1st column and perform an elementary row transformation so that p is in the (1,1) position.
Step 3. Perform the row operations Ri → Ri − (ai1 /p) R1 , i = 2, 3, . . . , m, giving the matrix
K^{(1)} = \begin{bmatrix} a_{11}^{(1)} & a_{12}^{(1)} & \cdots & a_{1n}^{(1)} & : & b_1^{(1)} \\ 0 & a_{22}^{(1)} & \cdots & a_{2n}^{(1)} & : & b_2^{(1)} \\ \vdots & \vdots & & \vdots & : & \vdots \\ 0 & a_{m2}^{(1)} & \cdots & a_{mn}^{(1)} & : & b_m^{(1)} \end{bmatrix}.

Step 4. Repeat steps 2 and 3 for column 2, excluding row 1. This gives
K^{(2)} = \begin{bmatrix} a_{11}^{(2)} & a_{12}^{(2)} & \cdots & a_{1n}^{(2)} & : & b_1^{(2)} \\ 0 & a_{22}^{(2)} & \cdots & a_{2n}^{(2)} & : & b_2^{(2)} \\ \vdots & \vdots & & \vdots & : & \vdots \\ 0 & 0 & \cdots & a_{mn}^{(2)} & : & b_m^{(2)} \end{bmatrix}.
Step 5. Repeat steps 2 and 3 for column 3, excluding rows 1 and 2.
Step 6. Repeat step 5, excluding one more row with each step, until K′ = [U : c].
Now the system Ux = c is solved by back substitution and x can be found, thus solving the system Ax = b. A computational sketch of this algorithm is given after the observations below.
Observation. (1) No pivoting is required if the matrix is diagonally dominant (the largest number in magnitude of each column is on the main diagonal) or if the order of the matrix is small.
(2) If the pivot is very small, it leads to errors, the problem may be ill-conditioned, and complete pivoting may have to be used.
(3) Only elementary row operations must be performed, as each column represents the coefficients of one variable.
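A compact sketch of Algorithm 1 in NumPy is given below (an illustration only, assuming NumPy is available; gauss_solve is a name introduced here). It performs partial pivoting and back substitution, and reproduces the solution of Example 4.3.1 that follows.

import numpy as np

def gauss_solve(A, b):
    # Gauss elimination with partial pivoting followed by back substitution
    n = len(b)
    K = np.hstack([np.array(A, float), np.array(b, float).reshape(n, 1)])
    for j in range(n - 1):
        p = j + np.argmax(np.abs(K[j:, j]))   # row index of the pivot in column j
        K[[j, p]] = K[[p, j]]                 # bring the pivot to position (j, j)
        for i in range(j + 1, n):
            K[i, j:] -= (K[i, j] / K[j, j]) * K[j, j:]
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):            # back substitution on [U : c]
        x[i] = (K[i, -1] - K[i, i + 1:n] @ x[i + 1:]) / K[i, i]
    return x

A = [[1, 0, 2, 1], [2, 1, 2, 2], [3, 2, 1, 3], [1, 1, 2, 2]]
b = [3, 7, 11, 6]
print(gauss_solve(A, b))                      # [1. 1. 0. 2.], as in Example 4.3.1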

Example 4.3.1 where  1 0 2 1 A= 3 2 1 1

Solve the system Ax = B by Gauss elimination method, 2 2 1 2

 1  2 , b = 3 3 2

7

11

T  6 , x = x1

x2

x3

Solution. Let   1 0 2 1 p 3 2 1 2 2 p 7   K= 3 2 1 3 p 11 1 1 2 2 p 6 R2 → R2 − 2R1 ,  1 0  0 0

R3 → R3 − 3R1 , R4 → R4 − R1  0 2 1 p 3 1 −2 0 p 1  2 −5 0 p 2 1 0 1 p 3

x4

T

.

Numerical Methods in Linear Algebra R3 → R3 − 2R2 , R4  1 0 2 1 0 1 −2 0  0 0 −1 0 0 0 2 1


→ R4 − R2  p 3 p 1  p 0 p 2

R3 → −R3  1 0 2 1 p 0 1 −2 0 p  0 0 1 0 p 0 0 2 1 p

 3 1  0 2

R4 → R4 − 2R3  1 0 2 1 p 0 1 −2 0 p  0 0 1 0 p 0 0 0 1 p

 3 1  0 2

by back substitution, x4 = 2, x3 = 0, x2 = 1, x1 = 1. Example 4.3.2 Solve by Gauss elimination method the linear system of equations 2x1 + 4x2 + x3

=

4

x1 + 3x2 + 2x3

=

4

3x1 + 5x2 + 4x3

=

8

Solution. Let  2 K = 1 3

4 3 5

1 2 4

p p p

 4 4 8

R2 ↔ R1  1 2 3

3 4 5

2 1 4

 p 4 p 4 p 8

R2 → R2 − 2R1 , R3 → R3 − 3R1   1 3 2 p 4 0 −2 −3 p −4 0 −4 −2 p −4


Linear Algebra to Differential Equations R3 → R3 − 2R2  1 0 0

3 −2 0

2 −3 4

 p 4 p −4 p 4

R2 → −1 2 R2 , R3  1 3 2 p 0 1 3 p 2 0 0 1 p

→ 14 R3  4 2 1

by back substitution, x3 = 1, x2 =

1 1 , x1 = . 2 2

Gauss Jordan method is an extension of the Gauss elimination method. In place of reducing the augmented matrix K = [A : b] to an upper triangular matrix K0 = [U : c] additional elementary row operations are performed to e = [D : d]. The Algorithm 1 is transfer K = [A : b] to a diagonal matrix K extended by repeating steps 2 and 3 without excluding any row. Result 4.3.1 The Gauss-Jordan method is well utilized to find the inverse of a matrix. Procedure. Replace the column vector b in the augmented matrix K by an identity matrix of same order as An×n , that is, set K = [A : In ]. Perform steps 2 and 3 in Algorithm 1 for all rows i = 1, 2, . . . , n so that e = [In : B], K = [A : In ] → K where AB = In = BA. Example 4.3.3 Solve the system Ax = b by Gauss-Jordan method, where, 

   2 2 4 18 A = 11 12 2, b = 40. 9 8 3 35   x Solution. Let x = y  and z   2 2 4 p 18 K = 11 12 2 p 40 9 8 3 p 35 R1 → 21 R1

Numerical Methods in Linear Algebra   1 1 2 p 9 11 12 2 p 40 9 8 3 p 35 R2 → R2 − 11R1 , R3 → R3 − 9R1   1 1 2 p 9 0 1 −20 p −59 0 −1 −15 p −46 R1 → R1 − R2 ,  1 0 22 0 1 −20 0 0 −35

−1 35 R3

R3 →  1 0 0 1 0 0

R 3 → R3 + R2  p 68 p −59  p −105

 68 −59 3

22 p −20 p 1 p

R1 → R1 − 22R3 , R2  1 0 0 p 0 1 0 p 0 0 1 p

→ R2 + 20R3  2 1. 3

Hence, x = 2, y = 1, z = 3. Example 4.3.4 Solve by Gauss-Jordan method x + y + 2z + 2w

=

13,

y + z + 3w

=

12,

2x + 2z + w

=

11,

x + 2y + z + w

=

9.

Solution. The augmented matrix  1 0 K= 2 1

is given by 1 1 0 2

2 1 2 1

R3 → R3 − 2R1 ,  1 1 2 0 1 1  0 −2 −2 0 1 −1

2 3 1 1

 p 13 p 12  p 11 p 9

R4 → R4 − R1  2 p 13 3 p 12   −3 p −15 −1 p −4


Linear Algebra to Differential Equations R3 → R3 + 2R2 , R4  1 1 2 2 0 1 1 3  0 0 0 3 0 0 −2 −4

 1 0  0 0

→ R4 − R2  p 13 p 12   p 9  p −16

R4 →

−1 2 R4

1 1 0 0

2 3 3 2

2 1 0 1

 p 13 p 12  p 9 p 8

R4 ↔ R3  1 0  0 0

1 1 0 0

2 1 1 0

2 3 2 3

 p 13 p 12  p 8 p 9

R4 → 31 R4  1 1 2 2 0 1 1 3  0 0 1 2 0 0 0 1

p p p p

 13 12 . 8 3

Hence, by back substitution, w = 3, z = 2, y = 1, x = 2. From the Gauss elimination method, it can be easily concluded that a triangular system can be solved with much less effort. This led to the notion of splitting a given matrix A into a product of a lower and an upper triangular matrix. This technique is called as the decomposition method or the factorization method. Factorization method-LU decomposition method The matrix A in (4.8) is to be decomposed as A = LU where A = [aij ]n×n , This yields  a11 a12 · · ·  a21 a22 · · ·   .. .. ..  . . . an2

an2

···

L = (lij ), lij = 0, i < j and U = (uij ), uij = 0, i > j.   a1n l11  l21 a2n    ..  =  .. .   . ann

ln1

l22 .. .

··· ··· .. .

0 0 .. .

ln2

···

lnn

0

 u11  0    ..  . 0

u12 u22 .. .

··· ··· .. .

 u1n u2n   ..  . 

0

···

unn

Numerical Methods in Linear Algebra

143

This gives a system of n² equations in the n² + n unknowns lij and uij ,
li1 u1j + li2 u2j + · · · + lin unj = aij ,   i, j = 1, 2, . . . , n,        (4.9)
where lij = 0 for i < j and uij = 0 for i > j. To obtain a unique solution, the n diagonal elements of one of the factors may be fixed; for instance, set lii = 1, i = 1, 2, . . . , n, and solve (4.9) for the remaining entries. After computing the matrices L and U using (4.9), Ax = b reduces to LUx = b. Set
Ux = y.        (4.10)
Then
Ly = b        (4.11)
is a lower triangular system. Equation (4.11) can be easily solved by forward substitution and y can be found, say y = (c1  c2  · · ·  cn )^T . Now substituting y in equation (4.10) gives
Ux = (c1  c2  · · ·  cn )^T .        (4.12)
The upper triangular system (4.12) can be solved by back substitution and x, the solution, is easily found. A short computational sketch follows the observations below. Before working out some examples, the following points must be noted.
Observation. (1) The LU-factorization is guaranteed if A is a positive definite matrix. (This is only a sufficient condition, not a necessary one.)
(2) LU decomposition is possible if all the leading principal minors of A are different from zero, that is, a11 ≠ 0, \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} ≠ 0, . . . , |A| ≠ 0.
(3) If lii = 1, i = 1, 2, . . . , n, the method is called Doolittle's method.
(4) If uii = 1, i = 1, 2, . . . , n, the method is called Crout's method.
(5) If uii = 0 for some i, Doolittle's method fails, and if lii = 0 for some i, Crout's method fails.
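The sketch below (illustrative only, assuming NumPy; the names lu_doolittle and lu_solve are introduced here) implements the Doolittle factorization together with the forward and back substitutions of (4.10)–(4.12), using the data of Example 4.3.5 below.

import numpy as np

def lu_doolittle(A):
    # Doolittle factorization: L has unit diagonal, A = L U (no pivoting)
    A = np.array(A, float)
    n = A.shape[0]
    L, U = np.eye(n), np.zeros((n, n))
    for i in range(n):
        for j in range(i, n):                           # row i of U
            U[i, j] = A[i, j] - L[i, :i] @ U[:i, j]
        for j in range(i + 1, n):                       # column i of L
            L[j, i] = (A[j, i] - L[j, :i] @ U[:i, i]) / U[i, i]
    return L, U

def lu_solve(L, U, b):
    n = len(b)
    y = np.zeros(n)
    for i in range(n):                                  # forward substitution, L y = b
        y[i] = b[i] - L[i, :i] @ y[:i]
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):                      # back substitution, U x = y
        x[i] = (y[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x

A = [[2, 6, 0, 8], [1, 1, 3, 1], [4, 2, 8, 11], [3, 1, 2, 6]]
b = [10, 5, 13, 5]
L, U = lu_doolittle(A)
print(lu_solve(L, U, b))        # [0.5 1.5 1. 0.], the solution of Example 4.3.5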


Example 4.3.5 Solve the following system of equations using LU decomposition.

Solution. Let that is,  2 6 1 1  4 2 3 1

2x + 6y + 8w

=

10,

x + y + 3z + w

=

5,

4x + 2y + 8z + 11w

=

13,

3x + y + 2z + 6w

=

5.

A = LU 0 3 8 2

  1 8 l21 1 = 11 l31 l41 6

0 1

0 0 1

l32 l42

l43

 u11 0  0 0  0  0 0 1

u12 u22 0 0

u13 u23 u33 0

 u14 u24  . u34  u44

Then simplifying, u11

=

2

u12

=

6

u13

=

0

u14

=

8

l21 u11 = 1

1 2 =2

l21 =

l31 u11 = 4

l31

l41 u11 = 3

l41 =

l21 u12 + u22 = 1 =⇒ 21 · 6 + u22 = 1 =⇒ u22 l21 u13 + u23 = 3 =⇒ 21 · 0 + u23 = 3 =⇒ u23 l21 u14 + u24 = 1 =⇒ 12 · 8 + u24 = 1 =⇒ u24 Thus,    1 0 0 0 2 0.5 1 0 0 0  L= U=  2 5 1 0 0 3 10 0 4 1 2 7

3 , 2

= −2 =3 = −3. 6 −2 0 0

0 3 −7 0

 8 −3  , 10 

−58 7

that is solving Ly = b, 

1 0.5 Ly =  2 3 2

0 1 5 4

0 0 1 10 7



    0 y1 10 y2   5  0   =   0 y3  13 y4 5 1

 10 0  by forward substitution gives y =  −7. Next substituting for Ux = y 0      2 6 0 8 x 10  y   0  0 −2 3 −3      0 0 −7 10   z  = −7 −58 w 0 0 0 0 7


gives by backward substitution,  x = 0.5

1.5

1

T 0 ,

the solution of the given system of equations. Algorithm 2. The LU decomposition can also be obtained by using Gauss elimination method. Procedure. Given a matrix An×n reduce it to an upper triangular matrix U. In the process, the lower triangular matrix is obtained as follows. Set lii = 1, i=1,2,. . . ,n. lij = 0, for i > j and lij = cij for i < j. where cij is obtained in the process of making uij = 0 for i < j, i.e. Ri → Ri − cij Rj where cij is chosen in such a way that uij = 0.  1 Example 4.3.6 Find the LU decomposition of the matrix A = 3 1 using the approach of Gauss elimination method  (without pivoting). 1 2 3 Solution. Consider set l11 = 1, A = 3 2 1 1 1 1 Using Gaussian elimination method R2 → R2 − 3R1 which gives l21 = 3   1 2 3 0 −4 −8 1 1 1 next R3 → R3 − R1 thus l31 = 1   1 2 3 0 −4 −8 0 −1 −2 next

1 1 R3 → R3 − R1 hence l32 = 4 4   1 2 3 0 −4 −8. 0 0 0

2 2 1

 3 1 1

146  1 L = 3 1 A.

0 1 1 4

Linear Algebra to Differential Equations    0 1 2 3 0 and U = 0 −4 −8 which is the LU decomposition of 0 0 0 1

Example 4.3.7 Solve the system x + 2y + z + w

=

4,

x − 2y + z − w

=

0,

3x + 2y + z − 4w

= −4,

x + 2y − z − 2w

= −4.

Solution. In matrix notation  1 2 1 1 −2 1  3 2 1 1 2 −1

    4 x 1     −1  y  =  0  −4  z  −4 −4 −2 w

To find LU decomposition ofA 1 2 1 1 1 −2 1 −1  and apply Gauss elimination method without Consider  3 2 1 −4 1 2 −1 −2 pivoting. Initially set l11 = 1 and l21 = 1 R2 → R2 − R1   1 2 1 1 0 −4 0 −2   3 2 1 −4 1 2 −1 −2 next R3 → R3 − 3R1 that gives l31 = 3   1 2 1 1 0 −4 0 −2   0 −4 −2 −7 1 2 −1 −2 setting R4 → R4 − R1 that gives l41 = 1   1 2 1 1 0 −4 0 −2   0 −4 −2 −7 0 0 −2 −3


next R3 → R3 − R2 gives l32 = 1   1 2 1 1 0 −4 0 −2   0 0 −2 −5 0 0 −2 −3 Again

 1 0 Thus U =  0 0

R4 → R4 − R1 that gives l43   1 2 1 1 0 −4 0 −2   0 0 −2 −5 0 0 0 2   1 0 0 1 1 1 1 0 0 −2  and L =  3 1 1 −2 −5 1 0 1 0 2

2 −4 0 0

=1

 0 0  0 1

Now solving for Ax = B reduces to LUx = b, set y = Ux Now solving for Ly = b      y1 1 0 0 0 4 1 1 0 0 y2   0       3 1 1 0 y3  = −4 1 0 1 1 y4 −4   4  −4   gives y =  −12. 4 Solving for Ux = b  1 0  0 0  gives x

y

z

w

T

2 −4 0 0

 = 1

0

1 0 −2 0 1

    1 x 4     −2   y  =  −4  −5  z  −12 2 w 4  2 .

In case A is a symmetric and a positive definite matrix, the LU decomposition becomes simplified and is described in the following method. Cholesky Method. If A is a symmetric and a positive definite matrix, then A = LLT where L is a lower triangular matrix.

(4.13)


The same procedure as LU decomposition method is used to solve the system Ax = b. Observation. (1) It is possible to write A = UUT and solve for U. (2) This method is also called as square root method. (3) Since A = LLT one can find A−1 after finding L. Thus A−1 = (LT )−1 L−1 .
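A short sketch of the method follows (assuming NumPy; cholesky is a name introduced here, and the test matrix is the one from Exercise 4.3, problem 3, which has an exact integer factor).

import numpy as np

def cholesky(A):
    # A = L L^T for a symmetric positive definite matrix A
    A = np.array(A, float)
    n = A.shape[0]
    L = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1):
            s = A[i, j] - L[i, :j] @ L[j, :j]
            L[i, j] = np.sqrt(s) if i == j else s / L[j, j]
    return L

A = [[1, 4, 5], [4, 25, 26], [5, 26, 45]]     # Exercise 4.3, problem 3
L = cholesky(A)
print(L)                                      # rows [1 0 0], [4 3 0], [5 2 4]
print(np.allclose(L @ L.T, np.array(A)))      # True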

Computational complexity
The number of operations required to obtain the solution in the afore-mentioned methods is given here without detail. In the Gauss elimination method, the number of operations required is (n/3)(n² + 3n − 1). Thus for large n, the computational complexity is approximately n³/3. For large n, the computational complexity for
(1) the Gauss-Jordan method is n³/2,
(2) the LU decomposition method is n³/3,
(3) the Cholesky method is n³/6.

EXERCISE 4.3 1. Use Gauss elimination method to solve the system 3x + 2y + 2z = 3 x + y + 2z = 1 2x + 3y + 4z = 5 2. Find the inverse of A using Gauss Jordan method where   1 2 3 A= 4 5 4  7 8 7 3. Find a decomposition for A using Cholesky method where   1 4 5 A =  4 25 26  5 26 45

Numerical Methods in Linear Algebra 4. Find LU decomposition for A using where  1 A= 3 −1

4.4


Gauss elimination method for A,  4 3 5 1  1 3

Iterative Methods

In this section, an attempt is made to solve a linear system of equations using iterative techniques. All iterative techniques start with an initial approximation. Successive approximations are found using the previous approximation. The convergence of the sequence of approximations is important and criteria are given for their convergence. Of the many iterative methods available, the Gauss-Jacobi and Gauss-Seidel methods are widely used and are discussed first, and criteria for when these sequences of approximations converge are stated. The successive over relaxation (SOR) technique, which is built on the Gauss-Seidel method, is then given.
Gauss-Jacobi method
In the Gauss-Jacobi method, the system Ax = b is rewritten as
Ax = (D + L + U)x = b,

(4.14)

where D, L and U are diagonal, lower triangular and upper triangular matrices respectively. The system (4.14) is grouped as Dx = −(L + U)x + b Then x = −D−1 (L + U)x + D−1 b Now replace x on rhs with an initial approximation x(0) to give x(1) = −D−1 (L + U)x(0) + D−1 b, replacing x(0) with the new estimate x(1) , the next iteration is x(2) = −D−1 (L + U)x(1) + D−1 b proceeding in a similar fashion gives x(k+1) = −D−1 (L + U)x(k) + D−1 b.

(4.15)


Observation. (i) The sequence of iterates converges if the matrix A is strictly diagonally dominant. (ii) This method is called the method of simultaneous displacement, as the approximation x^(k) is used in its entirety to find the new approximation x^(k+1) .
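A sketch of the scheme (4.15) in NumPy follows (illustrative only; the function name jacobi is introduced here). It is applied to the system of Example 4.4.1 below, whose coefficient matrix is strictly diagonally dominant.

import numpy as np

def jacobi(A, b, x0, iterations):
    A, b = np.array(A, float), np.array(b, float)
    D  = np.diag(np.diag(A))
    LU = A - D                                   # L + U, everything off the diagonal
    x  = np.array(x0, float)
    for _ in range(iterations):
        x = np.linalg.solve(D, b - LU @ x)       # x_new = -D^{-1}(L+U) x_old + D^{-1} b
    return x

A = [[2, -1, 0], [-1, 6, -2], [4, -3, 8]]
b = [1, 3, 9]
print(np.round(jacobi(A, b, [0, 0, 0], 25), 4))  # converges to [1. 1. 1.]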

=

1

−x + 6y − 2z

=

3

4x − 3y + 8z

=

9.

Solution. Let Ax = b 

2 −1 4

−1 6 −3

    x 1 0 −2 y  = 3 z 9 8

(D + L + U)x = b         2 0 0 0 0 0 0 −1 0 1 0 6 0 + −1 0 0 + 0 0 −2 x = 3 0 0 8 4 −3 0 0 0 0 9         1 0 −1 0 0 0 0 2 0 0 0 6 0 x = − −1 0 0 + 0 0 −2 x + 3 9 0 0 0 4 −3 0 0 0 8 x = −D−1 (L + U)x + D−1 b   1 1 0 0 0 −1 0 2 2 x = −  0 16 0  −1 0 −2 · x +  0 4 −3 0 0 0 81 0   T x(0) = 0 0 0   0.5 x(1) =  0.5  1.125 1    0 0 0 −1 0 0.5 2 x(2) = −  0 16 0  −1 0 −2  0.5  + 4 −3 0 1.125 0 0 18   0.75 0.9583 1.0625

  0 1 0  3. 1 9 8

0 1 6

0

1 2

0

0 0

0

1 6

  0 1 0  3 1 9 8

=

Numerical Methods  0 0 0 −1 2 −  0 16 0  −1 0 4 −3 0 0 18 1

x(3)

=

  0.9791 0.9792 1.1093

1 x(4)

2

=

0 1 6

− 0 0

0

1

0

  0.9791 1.0329 1.0025 x(5)

2

=

1 6

− 0 0

0

1

0

  0.9896  1.033  1.0027 x(6) 

2

=

− 0 0 

1 6

0

in Linear Algebra   1 0 0.75 2 −2 0.9583 +  0 0 1.0625 0

151 0 1 6

0

 0 0 0  −1 1 4 8

−1 0 −3

  1 0 0.9791 2 −2 0.9792 +  0 0 1.1093 0

0

 0 0 0  −1 1 4 8

−1 0 −3

  1 0.9791 0 2 −2 1.0329 +  0 1.0025 0 0

0

 0 0 0  −1 1 4 8

−1 0 −3

  1 0.9896 0 2 −2  1.033  +  0 1.0027 0 0

0

1 6

0

1 6

0

1 6

0

  0 1 0  3 1 9 8

=

  0 1 0  3 1 9 8

=

  0 1 0  3 1 9 8

=

  0 1 0  3 1 9 8

=

1.0165 0.99916. 1.0176

Hence the solution is x = (1  1  1)^T .
Gauss-Seidel Method. The difference between the Gauss-Jacobi method and the Gauss-Seidel method is that in the Gauss-Seidel method the new information obtained is used immediately. It can be observed that in solving the system (4.8) using the relation (4.15), the first component of x^(k+1) is found in the first step, the 2nd component of x^(k+1) is found in the second step, and so on. This observation is fruitfully utilized by considering the setup Ax = (D + L + U)x = b and rearranging it as (D + L)x = −Ux + b, giving the iterative scheme
(D + L)x^(k+1) = −Ux^(k) + b,


or x(k+1) = −(D + L)−1 Ux(k) + (D + L)−1 b.

(4.16)

The relation (4.16) is the iterative scheme for the Gauss-Seidel method.
Observation. (1) The Gauss-Seidel method is known as the method of successive displacement. (2) This method converges if A is strictly diagonally dominant. A computational sketch of the scheme is given below, followed by two worked examples.
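The sketch below (illustrative only, assuming NumPy; gauss_seidel is a name introduced here) implements the scheme (4.16) by a triangular solve at each step and is applied to the system of Example 4.4.2 that follows.

import numpy as np

def gauss_seidel(A, b, x0, iterations):
    A, b = np.array(A, float), np.array(b, float)
    DL = np.tril(A)                              # D + L, the lower triangle of A
    U  = A - DL                                  # strictly upper triangular part
    x  = np.array(x0, float)
    for _ in range(iterations):
        x = np.linalg.solve(DL, b - U @ x)       # (D + L) x_new = -U x_old + b
    return x

A = [[5, 3, 2, 1], [1, 5, 2, 2], [1, 2, 6, 1], [2, 1, 1, 5]]
b = [5, 6, 7, 8]
print(np.round(gauss_seidel(A, b, [0, 0, 0, 0], 30), 4))
# -> [0.2295 0.3173 0.8089 1.283 ], the values obtained in Example 4.4.2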

Example 4.4.2 Solve by the Gauss-Seidel method:
5x1 + 3x2 + 2x3 + x4 = 5
x1 + 5x2 + 2x3 + 2x4 = 6
x1 + 2x2 + 6x3 + x4 = 7
2x1 + x2 + x3 + 5x4 = 8.

Solution. Here  5 1 D+L= 1 2

0 5 2 1

0 0 6 1

 0 0 , 0 5

 0.2 0 0 0  −0.04 0.2 0 0 , (D + L)−1 =   −0.02 −0.0667 0.1667 0 −0.068 −0.0267 −0.0333 0.2   0 3 2 1 0 0 2 2  U= 0 0 0 1, 0 0 0 0   0 −0.6 −0.4 −0.2 0 0.12 −0.32 −0.36   −(D + L)−1 U =  0 0.06 0.1733 −0.0133, 0 0.204 0.1893 0.1547   1  T  1   (D + L)−1 b = (D + L)−1 5 6 7 8 =   0.667 , 0.8667 


substituting these values in formula (4.16) with initial approximation x(0) = (D + L)−1 b. Suppose theinitial approximation is  1  1   x(0) =   0.667  0.8667        −0.04 1 1 0 −0.6 −0.4 −0.2       0 0.12 −0.32 −0.36    1   1  0.5946 x(1) =  0 0.06 0.1733 −0.0133  0.667  +  0.667  = 0.8307, 1.331 0.8667 0.8667 0 0.204 0.1893 0.1547        0 −0.6 −0.4 −0.2 −0.04 1 0.0448 0 0.12 −0.32       −0.36   0.5946  1  0.3264 x(2) =  0 0.06 0.1733 −0.0133 0.8307 +  0.667  = 0.8286, 0 0.204 0.1893 0.1547 1.331 0.8667 1.3512        0.2025 1 0.0448 0 −0.6 −0.4 −0.2       0 0.12 −0.32 −0.36   0.3264  1  0.2876 x(3) =  0 0.06 0.1733 −0.0133 0.8286 +  0.667  = 0.8119, 1.2992 0.8667 1.3512 0 0.204 0.1893 0.1547        0.2428 1 0.2025 0 −0.6 −0.4 −0.2       0 0.12 −0.32 −0.36   0.2876  1   0.307  x(4) =  0 0.06 0.1733 −0.0133 0.8119 +  0.667  = 0.8074, 1.28 0.8667 1.2992 0 0.204 0.1893 0.1547        0.2368 1 0.2428 0 −0.6 −0.4 −0.2       0 0.12 −0.32 −0.36    0.307   1  0.3177 x(5) =  0 0.06 0.1733 −0.0133 0.8074 +  0.667  =  0.808 , 1.2802 0.8667 1.28 0 0.204 0.1893 0.1547        0 −0.6 −0.4 −0.2 0.2368 1 0.2301 0 0.12 −0.32       −0.36   0.3177  1  0.3187 x(6) =  0 0.06 0.1733 −0.0133  0.808  +  0.667  = 0.8088, 0 0.204 0.1893 0.1547 1.2802 0.8667 1.2825        0 −0.6 −0.4 −0.2 0.2301 1 0.228 0 0.12 −0.32       −0.36   0.3187  1  0.3277 x(7) =  0 0.06 0.1733 −0.0133 0.8088 +  0.667  = 0.8089, 0 0.204 0.1893 0.1547 1.2825 0.8667 1.2832        0 −0.6 −0.4 −0.2 0.228 1 0.2292 0 0.12 −0.32       −0.36   0.3277 +  1  = 0.3173, x(8) =  0 0.06 0.1733 −0.0133 0.8089  0.667  0.8089 0 0.204 0.1893 0.1547 1.2832 0.8667 1.2831


Linear  0 −0.6 −0.4  0 0.12 −0.32 x(9) =  0 0.06 0.1733 0 0.204 0.1893  0 −0.6 −0.4 0 0.12 −0.32 (10) x = 0 0.06 0.1733 0 0.204 0.1893  0 −0.6 −0.4  0 0.12 −0.32 x(11) =  0 0.06 0.1733 0 0.204 0.1893 Thus, x1 = 0.2295, x2 = 0.3173, Verification

Algebra to Differential Equations       0.2294 1 0.2292 −0.2       −0.36   0.3173 +  1  = 0.3173, −0.0133 0.8089  0.667  0.8089 1.283 0.8667 1.2831 0.1547       0.2295 1 0.2294 −0.2       −0.36   0.3173 +  1  = 0.3173,       0.8089 0.667 −0.0133 0.8089 1.283 0.8667 1.283 0.1547       0.2295 1 0.2295 −0.2       −0.36   0.3173 +  1  = 0.3173, −0.0133 0.8089  0.667  0.8089 1.283 0.1547 0.8667 1.283 x3 = 0.8089, x4 = 1.283.   0.2295 0.3173  x = A−1 b =  0.8089. 1.283

Example 4.4.3 Solve the system Gauss-Siedel method 2x1 + x2 + x3 + x4

=

5

x1 + 3x2 + x3 + x4

=

6

x1 + x2 + 4x3 + x4

=

7

x1 + x2 + x3 + 5x4

=

8.


Solution. Here  2 1  D+L= 1 1 

0 3 1 1

0 0 4 1

 0 0 , 0 5

 0.5 0 0 0 −0.1667 0.3333 0 0 , and (D + L)−1 =   0.0833 −0.0833 0.25 0 −0.05 −0.05 −0.05 0.2   0 1 1 1 0 0 1 1  U= 0 0 0 1, 0 0 0 0   0 −0.5 −0.5 −0.5 0 0.1667 −0.1667 −0.1667  −(D + L)−1 U =  0 0.0833 0.1667 −0.0833, 0 0.05 0.1 0.15   2.5   1.1667  (D + L)−1 b =  0.8333. 0.7 Now substituting these values in formula (4.16) with initial approximation x(0) = (D + L)−1 b. Suppose theinitial approximation is  2.5 1.1667  x(0) =  0.8333, 0.7        0 −0.5 −0.5 −0.5 2.5 2.5 1.15 0 0.1667 −0.1667 −0.1667 1.1667 1.1667 1.1056       x(1) =  0 0.0833 0.1667 −0.0833 0.8333 + 0.8333 =  1.011 , 0 0.05 0.1 0.15 0.7 0.7 0.9467        0 −0.5 −0.5 −0.5 1.15 2.5 0.9683 0 0.1667 −0.1667 −0.1667 1.1056 1.1667 1.0246       x(2) =  0 0.0833 0.1667 −0.0833  1.011  + 0.8333 = 1.0151, 0 0.05 0.1 0.15 0.9467 0.7 0.9984        0 −0.5 −0.5 −0.5 0.9683 2.5 0.981 0 0.1667 −0.1667 −0.1667 1.0246 1.1667 1.0019       x(3) =  0 0.0833 0.1667 −0.0833 1.0151 + 0.8333 = 1.0047, 0 0.05 0.1 0.15 0.9984 0.7 1.0025

156  0 −0.5  0 0.1667 x(4) =  0 0.0833 0 0.05  0 −0.5 0 0.1667 (5) x = 0 0.0833 0 0.05  0 −0.5  0 0.1667 x(6) =  0 0.0833 0 0.05  0 −0.5 0 0.1667 (7) x = 0 0.0833 0 0.05  0 −0.5  0 0.1667 x(8) =  0 0.0833 0 0.05  0 −0.5 0 0.1667 (9)  x = 0 0.0833 0 0.05 Therefore x1 = 1, Verification

Linear Algebra to Differential Equations       0.9955 2.5 0.981 −0.5 −0.5       −0.1667 −0.1667  1.0019 + 1.1667 = 0.99991, 0.1667 −0.0833 1.0047 0.8333  1.0007  1.0009 0.7 1.0025 0.1 0.15       0.9997 2.5 0.9955 −0.5 −0.5       −0.1667 −0.1667  0.99991 + 1.1667 = 0.9996,       1  0.8333 1.0007 0.1667 −0.0833 1.0002 0.7 1.0009 0.1 0.15       0.9997 2.5 1.0001 −0.5 −0.5       −0.1667 −0.1667  0.9996 + 1.1667 = 0.9999, 0.1667 −0.0833  1  0.8333  1  1.0002 0.1 0.15 0.7 1       1.0001 2.5 1.0001 −0.5 −0.5       −0.1667 −0.1667  0.9999 + 1.1667 =  1 ,       1  0.8333 1 0.1667 −0.0833 1 0.7 1 0.1 0.15       1 2.5 1.0001 −0.5 −0.5  1  1.1667 1 −0.1667 −0.1667  =  , +  0.1667 −0.0833  1  0.8333 1 1 0.7 1 0.1 0.15       1 2.5 1 −0.5 −0.5       −0.1667 −0.1667  1 + 1.1667 = 1. 0.1667 −0.0833 1 0.8333 1 1 0.7 1 0.1 0.15 x2 = 1, x3 = 1, x4 = 1.   1  1  x = A−1 b =  1. 1

The next method described in this section needs the following definitions.
Definition 4.4.1 A permutation matrix U is a matrix having exactly one 1 in each row and each column, with all other entries zero.
Definition 4.4.2 Property A. A real matrix B is said to have property A iff there exists a permutation matrix U such that
UBU^T = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix},
where A11 and A22 are diagonal matrices.
Note. Property A helps in determining the parameter w required in the following successive over relaxation method.


Successive over relaxation technique
The successive over relaxation (SOR) technique is a generalization of the Gauss-Seidel method and can be successfully applied if the given matrix A is symmetric and has property A. Given the system Ax = b, the Jacobi method is first developed and its iteration matrix is J = D⁻¹(L + U). Consider |J − λI| = 0 and let λ1 , λ2 , . . . , λn be the eigen-values of the iteration matrix J. Let η = ρ(J) = max{|λ1 |, |λ2 |, . . . , |λn |} be the spectral radius of J. The successive over relaxation method is developed as follows. Introduce a new vector at each iteration,
x̃^(k+1) = −D⁻¹Lx^(k+1) − D⁻¹Ux^(k) + D⁻¹b,        (4.17)
(observe that the right-hand side has the form of the Jacobi iterate, but with the already-updated components used in the L term, as in the Gauss-Seidel method) and define the (k + 1)th approximation as
x^(k+1) = (1 − w)x^(k) + w x̃^(k+1) ,
where w is called the relaxation factor. Here the (k + 1)th iterate is the weighted mean of the kth iterate and the constructed iterate x̃^(k+1) . Substituting the value of x̃^(k+1) from (4.17),
x^(k+1) = (1 − w)x^(k) + w(−D⁻¹Lx^(k+1) − D⁻¹Ux^(k) + D⁻¹b),
(I + wD⁻¹L)x^(k+1) = [(1 − w)I − wD⁻¹U]x^(k) + wD⁻¹b,        (4.18)
D⁻¹(D + wL)x^(k+1) = D⁻¹[(1 − w)D − wU]x^(k) + wD⁻¹b.
Thus x^(k+1) = (D + wL)⁻¹[(1 − w)D − wU]x^(k) + w(D + wL)⁻¹b. Set J = (D + wL)⁻¹[(1 − w)D − wU] and c = w(D + wL)⁻¹b. Then the iteration scheme is
x^(k+1) = Jx^(k) + c,  k = 0, 1, 2, . . . .

(4.19)

Observation. 1. If w = 1, the iteration scheme (4.18) reduces to the Gauss-Seidel method.
2. The weights are non-negative for 0 ≤ w ≤ 1, and if w < 1, the method is called a successive under relaxation method.
3. If w > 1, the method is called a successive over relaxation method.
Note. To find w, let η = ρ(J), the spectral radius of the iteration matrix J of the Jacobi method. Then w = (2/η²)(1 ± √(1 − η²)). Also, observe that one needs η < 1, else w becomes a complex number and SOR cannot be used. A computational sketch is given below, and Example 4.4.4 then works the same steps by hand.
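The following sketch (illustrative only, assuming NumPy; sor is a name introduced here) computes w from the spectral radius of the Jacobi iteration matrix as in the Note above and then applies the scheme (4.19) to the system of Example 4.4.4 below.

import numpy as np

def sor(A, b, w, x0, iterations):
    A, b = np.array(A, float), np.array(b, float)
    D = np.diag(np.diag(A))
    L = np.tril(A, -1)
    U = np.triu(A, 1)
    x = np.array(x0, float)
    for _ in range(iterations):                  # (D + wL) x_new = [(1-w)D - wU] x_old + w b
        x = np.linalg.solve(D + w * L, ((1 - w) * D - w * U) @ x + w * b)
    return x

A = [[3, -1, 0], [-1, 3, -1], [0, -1, 3]]
b = [5, 4, 11]

J   = np.linalg.solve(np.diag(np.diag(A)), np.tril(A, -1) + np.triu(A, 1))   # D^{-1}(L+U)
eta = max(abs(np.linalg.eigvals(J)))             # spectral radius of the Jacobi matrix
w   = (2 / eta**2) * (1 - np.sqrt(1 - eta**2))   # ~1.0627, as in Example 4.4.4
print(round(float(w), 5))
print(np.round(sor(A, b, w, [0, 0, 0], 10), 4))  # converges to [3. 4. 5.]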



2/3 wopt

=

 p 2  2 1 − 1 − η = η2

The recurrence formula is given by x(k+1) = Jx(k) + c, where J = (D + wL)−1 [(1 − w)D − wU]  −1  1/3 0 0 3 0 0  1/3 0  3 0 = w/9 (D + wL)−1 = −w  w2 0 −w 3 w/9 1/3 27   3(1 − w) w 0  0 3(1 − w) w ((1 − w)D − wU] =  0 0 3(1 − w) 

Numerical Methods in Linear Algebra J = (D + wL)−1 ((1 − w)D − wU]   w 1−w 0 3    w(1 − w)  w2 − 9w + 9) w  =   9 3  2 3  3 2 w (1 − w) w + 9w(1 − w) w − 9w + 9 9 27 9   5w   1/3 0 0     3    5  w(5w + 12) w/9 1/3 0      c = w  4 =     w2  9 1 11  2 w w(5w + 12w + 99)  9 27 3 27 substituting the value of wopt = w = 1.062746   −0.0627 0.35423 0 J =  −0.02221 0.06278 0.35423 −0.007867 0.02224 0.06278 and

  1.77117 c = 2.04434 4.62074

Substituting for J and c in x(k+1) = Jx(k) + c gives, by taking x(0) = [0

0

0]T

  1.77117 x(1) = c = 2.04434, 4.62074   2.38428 x(2) = 0.77011, 4.94236 substituting x(2) in the recursive formula gives   2.95716 x(3) = 3.97875 4.99611 Similarly x(4)

  2.99514 = 3.99816 4.99961


which is approximately equal to the exact solution
\[
x = \begin{pmatrix} x\\ y\\ z\end{pmatrix} = \begin{pmatrix} 3\\ 4\\ 5\end{pmatrix}.
\]
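The SOR iteration (4.19) is easy to check numerically. The following Python sketch is not part of the text; it assumes NumPy is available and reproduces Example 4.4.4 (the function name sor_solve and the tolerance are illustrative choices).

import numpy as np

def sor_solve(A, b, w, x0=None, tol=1e-6, max_iter=100):
    """Successive over relaxation for Ax = b, following (4.17)-(4.19)."""
    n = len(b)
    D = np.diag(np.diag(A))                    # diagonal part of A
    L = np.tril(A, -1)                         # strictly lower triangular part
    U = np.triu(A, 1)                          # strictly upper triangular part
    M = D + w * L
    Jw = np.linalg.solve(M, (1 - w) * D - w * U)   # iteration matrix of (4.19)
    c = w * np.linalg.solve(M, b)
    x = np.zeros(n) if x0 is None else np.array(x0, dtype=float)
    for k in range(max_iter):
        x_new = Jw @ x + c
        if np.linalg.norm(x_new - x, np.inf) < tol:
            return x_new, k + 1
        x = x_new
    return x, max_iter

A = np.array([[3., -1., 0.], [-1., 3., -1.], [0., -1., 3.]])
b = np.array([5., 4., 11.])

# optimal w from the spectral radius of the Jacobi iteration matrix
Jac = np.linalg.solve(np.diag(np.diag(A)), -(np.tril(A, -1) + np.triu(A, 1)))
eta = max(abs(np.linalg.eigvals(Jac)))
w_opt = (2 / eta**2) * (1 - np.sqrt(1 - eta**2))

x, iters = sor_solve(A, b, w_opt)
print(w_opt, x, iters)        # w_opt ≈ 1.0627 and x ≈ (3, 4, 5)

Running the sketch with w = 1 reproduces the Gauss-Seidel iterates, as noted in Observation 1.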

EXERCISE 4.4

1. Solve the following systems using (i) the Gauss-Jacobi method and (ii) the Gauss-Seidel method.
(a) 3x + y + z = 4, x + 4y + 2z = 4, 2x + y + 5z = 5
(b) 2x + y + 2z = 15, 2x + 9y + 3z = 25, 2x + 4y + 8z = 30

2. Find the solution of the following system using (i) the Gauss-elimination method, (ii) the Gauss-Jacobi method and (iii) the Gauss-Seidel method. What is your inference?
2x + y + 2z = 4, x + 3y + 4z = 6, 4x + 2y + 2z = 7.

3. Using the SOR technique solve the system Ax = b, where
\[
A = \begin{pmatrix} 3 & -2 & 1\\ -2 & 3 & -2\\ 1 & -2 & 3\end{pmatrix},\qquad
x = \begin{pmatrix} x\\ y\\ z\end{pmatrix},\qquad
b = \begin{pmatrix} -1\\ 7\\ -7\end{pmatrix}.
\]

4.5 Householder Transformation

In Chapter 1, it has been shown that every elementary operation can be represented by an elementary matrix, and that the product of these elementary matrices is a nonsingular matrix P which, when applied to a given matrix, reduces it to an upper triangular or a diagonal matrix. As arithmetic operations lead to round-off errors, an orthogonal matrix is an appropriate transformation matrix to mitigate the propagation of errors. This is because for an orthogonal matrix H, ‖x‖ = ‖Hx‖, that is, norms are preserved under an orthogonal transformation.


In this section, of the many approaches known, the Householder transformation method is discussed.

Definition 4.5.1 A Householder transformation is a matrix H ∈ R^{n×n} defined by H = I − 2ww^T, where w ∈ R^n is such that w^Tw = 1 and I is the identity matrix of order n.

Observation. The Householder transformation H satisfies the following properties: (1) H^TH = I, (2) H^T = H^{−1}.

Theorem 4.5.1 For any nonzero vector x ∈ R^n there exists a Householder transformation H such that Hx = ae₁, where e₁ = (1, 0, ..., 0)^T ∈ R^n, for some constant a.

Proof. Given x ∈ R^n, finding the Householder transformation H means finding a vector w such that Hx = (I − 2ww^T)x = ae₁, where I is the n × n identity matrix and e₁ = (1, 0, ..., 0)^T ∈ R^n. Let x = (x₁, x₂, ..., xₙ)^T. Then ‖x‖₂² = x₁² + x₂² + ··· + xₙ² = k² (say). Also, k² = x^Tx = x^TH^THx = a²e₁^Te₁ = a². Thus a = ±k = ±‖x‖₂.

To find w, consider Hx = ae₁, that is, (I − 2ww^T)x = ae₁, where w = (w₁, w₂, ..., wₙ)^T:
\[
\begin{pmatrix}
1-2w_1^2 & -2w_1w_2 & \cdots & -2w_1w_n\\
-2w_2w_1 & 1-2w_2^2 & \cdots & -2w_2w_n\\
\vdots & \vdots & \ddots & \vdots\\
-2w_nw_1 & -2w_nw_2 & \cdots & 1-2w_n^2
\end{pmatrix}
\begin{pmatrix} x_1\\ x_2\\ \vdots\\ x_n\end{pmatrix}
= a\begin{pmatrix}1\\ 0\\ \vdots\\ 0\end{pmatrix}.
\]
Carrying out the multiplication, the first entry gives x₁ − 2w₁(w₁x₁ + w₂x₂ + ··· + wₙxₙ) = a, and the ith entry, i ≥ 2, gives xᵢ − 2wᵢ(w₁x₁ + ··· + wₙxₙ) = 0.


Thus,
\[
x_1 - 2w_1(w^Tx) = a = \pm k,\qquad x_i - 2w_i(w^Tx) = 0,\ i = 2, \ldots, n, \tag{4.20}
\]
where w^Tx = w₁x₁ + w₂x₂ + ··· + wₙxₙ. To solve for w from the n equations in (4.20), set w^Tx = l. Then
\[
2w_1 l = x_1 \mp k \tag{4.21}
\]
and
\[
2w_i l = x_i,\qquad i = 2, \ldots, n. \tag{4.22}
\]
Squaring and adding the equations in (4.21) and (4.22) gives
\[
4(w_1^2 + \cdots + w_n^2)l^2 = x_1^2 + \cdots + x_n^2 + k^2 \mp 2x_1k = 2k^2 \mp 2x_1k,
\]
or
\[
2l^2 = k(k \mp x_1) = k\big(k + (\operatorname{sgn} x_1)x_1\big),
\]
where sgn x₁ = 1 or −1 according as x₁ ≥ 0 or x₁ < 0, and the sign in a = ±k is chosen as a = −(sgn x₁)k so that k ∓ x₁ involves no cancellation. Set w = u/(2l), where u = (x₁ + (sgn x₁)k, x₂, ..., xₙ)^T, so that u = x − ae₁ and ‖u‖² = 4l². Thus,
\[
Hx = (I - 2ww^T)x = x - 2(w^Tx)w = x - 2\,\frac{(u^Tx)\,u}{\|u\|^2}
= (x_1, x_2, \ldots, x_n)^T - \big(x_1 + (\operatorname{sgn} x_1)k,\ x_2, \ldots, x_n\big)^T,
\]
since 2u^Tx = ‖u‖², and hence
\[
Hx = \big(-(\operatorname{sgn} x_1)k,\ 0, \ldots, 0\big)^T = -(\operatorname{sgn} x_1)\|x\|_2\, e_1.
\]
Therefore Hx = ae₁, where a = −(sgn x₁)‖x‖₂.

Algorithm for the Householder method

Step 1. Given x ∈ R^n, find ‖x‖₂² = x₁² + ··· + xₙ² = k² (say).
Step 2. Write 2l² = k(k + (sgn x₁)x₁).
Step 3. Write u = (x₁ + (sgn x₁)k, x₂, ..., xₙ)^T and w = u/(2l).
Step 4. H = I − 2uu^T/‖u‖².

Observation. The above theorem can be utilized for any m × n matrix by considering each column of the matrix, or a column of one of its submatrices.
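As an illustration (not part of the text), the four steps of the algorithm translate directly into Python with NumPy; the function name householder below is an illustrative choice. The function returns H and the constant a, so that Hx equals a e₁ up to rounding.

import numpy as np

def householder(x):
    """Steps 1-4: return (H, a) with H x = a e1 and a = -sgn(x1)*||x||2."""
    x = np.asarray(x, dtype=float)
    k = np.linalg.norm(x)                        # Step 1: k = ||x||_2
    sgn = 1.0 if x[0] >= 0 else -1.0
    u = x.copy()
    u[0] += sgn * k                              # Step 3: u = (x1 + sgn(x1) k, x2, ..., xn)
    H = np.eye(len(x)) - 2.0 * np.outer(u, u) / (u @ u)   # Step 4: H = I - 2 u u^T / ||u||^2
    return H, -sgn * k

H, a = householder([2.0, 2.0, 1.0])              # first column of A in Example 4.5.1 below
print(a, np.round(H @ [2.0, 2.0, 1.0], 10))      # a = -3 and H x ≈ (-3, 0, 0) = a e1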


Householder method to reduce a given matrix to an upper triangular form

Consider a matrix A_{n×n} = (a_{ij}), i, j = 1, 2, ..., n. The goal is to find a transformation matrix H such that HA = C = (c_{ij}), i, j = 1, 2, ..., n, with c_{ij} = 0 for i > j.

Step 1. Set x = (a₁₁, a₂₁, ..., a_{n1})^T, the 1st column of A, and find a transformation Hₙ such that Hₙx = c₁₁e₁, e₁ ∈ R^n, using the algorithm for the Householder transformation, with c₁₁ = −(sgn a₁₁)‖x‖₂. Applying the algorithm gives
\[
H_nA = \begin{pmatrix}
c_{11} & c_{12} & \cdots & c_{1n}\\
0 & a_{22}^{(1)} & \cdots & a_{2n}^{(1)}\\
\vdots & \vdots & & \vdots\\
0 & a_{n2}^{(1)} & \cdots & a_{nn}^{(1)}
\end{pmatrix};

An−1,n−1

 (1) a  22  (1) a32 =  .   . (1) an2

.

.

.

(1)



(1)

    .,  .

a2n

. . . a3n . . . . . . . . (1) . . . ann

Again set h x = a(1) 22

.

.

(1)

.an1

iT

∈ Rn−1

Applying Householder Algorithm Hn−1 x = c22 e1 , e1 ∈ Rn−1 and (1)

c22 = (−sgna22 )kxk2 This gives, after applying the algorithm  c22 c23   0 a(2) 33  Hn−1 An−1,n−1 =  . .   . . (2) 0 an3

.

.

.

. . . . . . . . . . . .

c2n



 a3n   . ,  .  (2) ann

164

Linear Algebra to Differential Equations

This step is equivalent to considering the  c11 0    1 0 Hn A =   . 0 Hn−1  . 0

matrix c12 c22 . . (2) an2

. . . . . . . . . . . . . . .

 c1n c2n   .  , .  (2)

ann

Now set (2)

(2)

x = [a33 ......an3 ]T ∈ Rn−2 the goal is to find the matrix Hn−2 ∈ Rn−2×n−2 such that Hn−2 An−2,n−2 = c33 e1 , e1 ∈ Rn−2 (2)

with c33 = −sgna33 kxk2 . Using Householder transformation gives  c33 c34  0 a(3) 44  . Hn−2 An−2,n−2 =   .  . . (3) 0 an4

. . . . .

. . . . .

 . c3n (4)  . a4n  . .  , . .  (3) . ann

The matrix A is transformed as

 I2 0

 1 Hn−2 0 0

 c11 0   0 Hn A =   . Hn−1  . 0

c12 c22 . . 0

c13 c23 c33 . 0

. . . . (3)

an4

 . c1n . c2n   . .   . .  (3) . ann

For simplicity sake set Qk+1 =  Ik 0

0 Hn−k

 ,

where Hn−k is a Householder transformation of order n − k. Then H = Qn−1 Qn−2 ......Q1 is the orthogonal transformation reducing A to a triangular form. Example 4.5.1 Reduce the matrix A to an upper triangular matrix using the Householder method   2 2 3 A = 2 3 0 . 1 −2 1


Solution. Write A₁ = A and consider x = (2, 2, 1)^T, the first column of A₁. Then ‖x‖ = √9 = 3 and
\[
u = x + (\operatorname{sgn} x_1)\|x\|e_1 = \begin{pmatrix}2\\2\\1\end{pmatrix} + 3\begin{pmatrix}1\\0\\0\end{pmatrix} = \begin{pmatrix}5\\2\\1\end{pmatrix},
\]
\[
H_3 = I - 2\,\frac{uu^T}{\|u\|^2}
= \begin{pmatrix}1&0&0\\0&1&0\\0&0&1\end{pmatrix} - \frac{2}{30}\begin{pmatrix}25&10&5\\10&4&2\\5&2&1\end{pmatrix}
= \begin{pmatrix}-2/3 & -2/3 & -1/3\\ -2/3 & 11/15 & -2/15\\ -1/3 & -2/15 & 14/15\end{pmatrix},
\]
\[
H_3A_1 = \begin{pmatrix}-2/3 & -2/3 & -1/3\\ -2/3 & 11/15 & -2/15\\ -1/3 & -2/15 & 14/15\end{pmatrix}
\begin{pmatrix}2&2&3\\2&3&0\\1&-2&1\end{pmatrix}
= \begin{pmatrix}-3 & -8/3 & -7/3\\ 0 & 17/15 & -32/15\\ 0 & -44/15 & -1/15\end{pmatrix}.
\]
Now removing the 1st row and 1st column gives
\[
A_2 = \begin{pmatrix}17/15 & -32/15\\ -44/15 & -1/15\end{pmatrix},\qquad
x = \begin{pmatrix}17/15\\ -44/15\end{pmatrix},\qquad
\|x\| = \sqrt{9.8889} = 3.1447,
\]
\[
u = x + (\operatorname{sgn} x_1)\|x\|e_1 = \begin{pmatrix}17/15\\ -44/15\end{pmatrix} + 3.1447\begin{pmatrix}1\\0\end{pmatrix} = \begin{pmatrix}4.278\\ -44/15\end{pmatrix},
\]
\[
H_2 = I - 2\,\frac{uu^T}{\|u\|^2}
= \begin{pmatrix}1&0\\0&1\end{pmatrix} - \frac{2}{26.9057}\begin{pmatrix}4.278\\ -44/15\end{pmatrix}\begin{pmatrix}4.278 & -44/15\end{pmatrix}
= \begin{pmatrix}-0.3604 & 0.9328\\ 0.9328 & 0.3604\end{pmatrix},
\]
\[
H_2A_2 = \begin{pmatrix}-0.3604 & 0.9328\\ 0.9328 & 0.3604\end{pmatrix}
\begin{pmatrix}17/15 & -32/15\\ -44/15 & -1/15\end{pmatrix}
= \begin{pmatrix}-3.1447 & 0.7067\\ 0 & -2.014\end{pmatrix}.
\]
Now removing the 1st row and 1st column from A₂ gives A₃ = (−2.014), so x = (−2.014), ‖x‖ = 2.014,
\[
u = x + (\operatorname{sgn} x_1)\|x\|e_1 = (-4.028),\qquad
H_1 = I - 2\,\frac{uu^T}{\|u\|^2} = 1 - 2 = -1.
\]
Now, embedding H₂ and H₁ in identity matrices of order 3,
\[
\begin{pmatrix}1&0&0\\0&1&0\\0&0&-1\end{pmatrix}
\begin{pmatrix}1&0&0\\0&-0.3604&0.9328\\0&0.9328&0.3604\end{pmatrix}
H_3A =
\begin{pmatrix}-3 & -8/3 & -7/3\\ 0 & -3.1447 & 0.7067\\ 0 & 0 & 2.014\end{pmatrix} = R,
\]
which is an upper triangular matrix. Now
\[
Q = (H_1H_2H_3)^T = H_3H_2H_1 =
\begin{pmatrix}-2/3 & -0.0707 & 0.742\\ -2/3 & -0.3887 & -0.636\\ -1/3 & 0.9187 & -0.212\end{pmatrix}
\]
is an orthogonal matrix.

 0 −0.848 −0.53

Verify QR = A. Observation. 1. For any square matrix of order n, the total number of operations of multiplications and square roots required to reduce A to a triangular matrix is 2/n3 + 3/2n2 + 11/6n − 4 = 2/3n3 if n is very large. 2. This approach can be applied for any matrix of the form Amxn .

EXERCISE 4.5

1. Transform the matrix A = \begin{pmatrix}1&2&1\\2&2&3\\1&3&3\end{pmatrix} to an upper triangular matrix using the Householder method. Find the orthogonal transformation.

2. Solve the system \begin{pmatrix}3&2&2\\1&1&2\\2&3&4\end{pmatrix}\begin{pmatrix}x\\y\\z\end{pmatrix} = \begin{pmatrix}10\\5\\11\end{pmatrix} by reducing the matrix to an upper triangular form using the Householder method. Also find the orthogonal transformation.

4.6 Tridiagonalization of a Symmetric Matrix by Plane Rotation

Definition 4.6.1 An orthogonal matrix R_{ij} of order n is said to be a plane rotation (or plane transformation) in the (i, j)th plane if (i) R_{ij} is an identity matrix except in columns i and j, and (ii) r_{ii} = cos θ, r_{ij} = −sin θ, r_{ji} = sin θ, r_{jj} = cos θ.

Example 4.6.1
\[
P_{13} = \begin{pmatrix}\cos\theta & 0 & -\sin\theta\\ 0 & 1 & 0\\ \sin\theta & 0 & \cos\theta\end{pmatrix}
\]
is a plane transformation in the (1, 3) plane.

Observation. 1. A plane rotation matrix is an orthogonal matrix.

Algorithm for tridiagonalization of a symmetric matrix of order n

Step 1. Given A = (a_{ij})_{n×n}, the aim is to reduce A to a matrix C = (c_{ij})_{n×n} with c_{ij} = 0 whenever i does not belong to {j − 1, j, j + 1}.

Step 2. Consider R₂₃ = (r_{ij})_{n×n} with r₂₂ = cos θ, r₂₃ = −sin θ, r₃₂ = sin θ, r₃₃ = cos θ, and equal to the identity elsewhere. Then
\[
R_{23}^TAR_{23} = \begin{pmatrix}
b_{11} & b_{12} & \cdots & b_{1n}\\
b_{12} & b_{22} & \cdots & b_{2n}\\
f(\theta) & b_{32} & \cdots & b_{3n}\\
\vdots & \vdots & & \vdots\\
b_{n1} & b_{n2} & \cdots & b_{nn}
\end{pmatrix}.
\]

Step 3. Find θ by setting the (3, 1) element f(θ) = 0.


Step 4. Repeat Step 2 with the matrices R₂₄, ..., R₂ₙ and their transposes to get
\[
R_{2n}^TR_{2,n-1}^T\cdots\big(R_{24}^TR_{23}^TAR_{23}R_{24}\big)\cdots R_{2,n-1}R_{2n} =
\begin{pmatrix}
c_1 & \alpha_1 & 0 & \cdots & 0\\
\alpha_1 & \alpha_{22} & \alpha_{23} & \cdots & \alpha_{2n}\\
0 & \alpha_{32} & \alpha_{33} & \cdots & \alpha_{3n}\\
\vdots & \vdots & \vdots & & \vdots\\
0 & \alpha_{n2} & \alpha_{n3} & \cdots & \alpha_{nn}
\end{pmatrix} = B\ \text{(say)}.
\]

Step 5. Repeat Steps 3 and 4 to eliminate the elements α₄₂, ..., αₙ₂ and α₂₄, ..., α₂ₙ by premultiplying B with R₃₄^T, R₃₅^T, ..., R₃ₙ^T successively and postmultiplying with R₃₄, R₃₅, ..., R₃ₙ successively.

Step 6. Repeat the procedure to obtain C in the form
\[
C = \begin{pmatrix}
c_1 & \alpha_1 & 0 & \cdots & 0\\
\alpha_1 & c_2 & \alpha_2 & \cdots & 0\\
0 & \alpha_2 & c_3 & \ddots & \vdots\\
\vdots & & \ddots & \ddots & \alpha_{n-1}\\
0 & \cdots & 0 & \alpha_{n-1} & c_n
\end{pmatrix}.
\]

The following example will illustrate the procedure.

Example 4.6.2 Reduce the following matrix to a tridiagonal matrix using plane rotations:
\[
A = \begin{pmatrix}1 & 1 & 1 & 1\\ 1 & 2 & 3 & 4\\ 1 & 3 & 6 & 10\\ 1 & 4 & 10 & 20\end{pmatrix}.
\]
Solution. Consider
\[
R_{23} = \begin{pmatrix}1 & 0 & 0 & 0\\ 0 & \cos\theta & -\sin\theta & 0\\ 0 & \sin\theta & \cos\theta & 0\\ 0 & 0 & 0 & 1\end{pmatrix}.
\]
Then
\[
A_1 = R_{23}^TAR_{23} = \begin{pmatrix}
1 & \cos\theta+\sin\theta & \cos\theta-\sin\theta & 1\\
\cos\theta+\sin\theta & 2\cos^2\theta+6\sin^2\theta+3\sin2\theta & 3\cos^2\theta+4\cos\theta\sin\theta-3\sin^2\theta & 4\cos\theta+10\sin\theta\\
\cos\theta-\sin\theta & 3\cos^2\theta+4\cos\theta\sin\theta-3\sin^2\theta & 2\sin^2\theta+6\cos^2\theta-6\cos\theta\sin\theta & -4\sin\theta+10\cos\theta\\
1 & 4\cos\theta+10\sin\theta & -4\sin\theta+10\cos\theta & 20
\end{pmatrix}.
\]
Equating the element in the (1, 3) position to zero gives tan θ = 1, that is θ = π/4, and
\[
A_1 = R_{23}^TAR_{23} = \begin{pmatrix}
1 & \sqrt2 & 0 & 1\\
\sqrt2 & 7 & 2 & 7\sqrt2\\
0 & 2 & 1 & 3\sqrt2\\
1 & 7\sqrt2 & 3\sqrt2 & 20
\end{pmatrix}.
\]

As the matrix is symmetric, the (3, 1) element is also reduced to zero. Now, to reduce the element in the (1, 4) position to zero, take
\[
R_{24} = \begin{pmatrix}1 & 0 & 0 & 0\\ 0 & \cos\theta & 0 & -\sin\theta\\ 0 & 0 & 1 & 0\\ 0 & \sin\theta & 0 & \cos\theta\end{pmatrix}
\]
and compute A₂ = R₂₄^{-1}A₁R₂₄ = R₂₄^TA₁R₂₄. Equating the entry in the (1, 4) position of A₂ to zero gives sin θ = 0.5774 and cos θ = 0.8165; substituting these values in A₂ yields
\[
A_2 = \begin{pmatrix}
1 & 1.7321 & 0 & 0\\
1.7321 & 20.6681 & 4.0827 & 9.4281\\
0 & 4.0821 & 1 & 2.3093\\
0 & 9.4281 & 2.3093 & 6.333
\end{pmatrix}.
\]

position of A2 0 1 0 0

0 0 cos θ sin θ

 0 0   − sin θ cos θ

Computing A3 = RT34 A2 R34 and equating the entry in (2, 4) position to zero gives sin θ = 0.9177 and cos θ = 0.3974 substituting in A3 gives the required


matrix in tridiagonal form,
\[
A_3 = R_{34}^TR_{24}^TR_{23}^T A\, R_{23}R_{24}R_{34} = \begin{pmatrix}
1 & 1.7321 & 0 & 0\\
1.7321 & 20.6667 & 10.2740 & 0\\
0 & 10.2740 & 7.1754 & 0.3646\\
0 & 0 & 0.3646 & 0.1580
\end{pmatrix}.
\]

Observation.
1. If A is a symmetric matrix, then the Householder transformation can also be applied in a similar manner to obtain a tridiagonal matrix.
2. If A is any square matrix, then A can be reduced to a triangular form using plane rotations.
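The whole procedure is easy to automate. The Python sketch below is not from the text; it assumes NumPy, and it takes the (4, 3) entry of the matrix in Example 4.6.2 to be 10 (as the symmetric worked numbers above require), so that A is symmetric. The off-diagonal signs of the computed tridiagonal matrix may differ from the worked example depending on the rotation angles chosen, but the matrix is similar to A.

import numpy as np

def plane_rotation(n, p, q, c, s):
    """R as in Definition 4.6.1: identity except r_pp = c, r_pq = -s, r_qp = s, r_qq = c."""
    R = np.eye(n)
    R[p, p] = c; R[q, q] = c
    R[p, q] = -s; R[q, p] = s
    return R

def tridiagonalize(A):
    """Reduce a symmetric matrix to tridiagonal form by successive plane rotations."""
    A = np.array(A, dtype=float)
    n = A.shape[0]
    for j in range(n - 2):                      # column (and, by symmetry, row) being cleaned
        for i in range(j + 2, n):               # annihilate the (i, j) entry
            if abs(A[i, j]) < 1e-15:
                continue
            r = np.hypot(A[j + 1, j], A[i, j])
            c, s = A[j + 1, j] / r, A[i, j] / r
            R = plane_rotation(n, j + 1, i, c, s)
            A = R.T @ A @ R                     # similarity transform, eigenvalues unchanged
    return A

A = np.array([[1., 1., 1., 1.],
              [1., 2., 3., 4.],
              [1., 3., 6., 10.],
              [1., 4., 10., 20.]])              # symmetric matrix assumed in Example 4.6.2
print(np.round(tridiagonalize(A), 4))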

EXERCISE 4.6 1. Show that the plane rotation matrix P24 -4 × 4 matrix is an orthogonal matrix. 2. Using plane rotation reduce the matrix √  √2 √1  2 − 2 √ A=  2 −1 √ 2 2



2 −1 √ √2 2

 √2  √2  2 −3

to a tridiagonal form.

4.7

QR Decomposition

In this section, the QR decomposition is described and using the Householder method the decomposition is obtained for an example illustrating the technique. Definition 4.7.1 The QR decomposition of a given matrix A is the product of an orthogonal matrix Q and an upper triangular matrix R, that is A = QR. Observation. The orthogonal matrix Q in this section is obtained using Householder method.

Numerical Methods in Linear Algebra

171

An upper triangular matrix R is in any of the following forms for an m × n matrix A. 1. If A is an m × n matrix and m > n then the triangular form holds for an n × n matrix and all rows n + 1, n + 2, ..., m become zero rows.   c11 c12 . . . c1n  0 c22 . . . c2n     . . . . . .     . . . . . .    0 0 . . . cnn    0 0 . . . 0     .. .. ..   . . . . . .  0

0

.

.

.

0

2. If m < n, then there exists an m × m triangular matrix and rest of the columns exist.  c11 0   .   .   . 0

c12 c22 . . . 0

. . . . . .

. . . . . .

. c1m . c2m . . . . . . . cmm

. . . . . . . . . . . . . . . . . .

 c1n c2n   .   .   .  cnn

Example 4.7.1 Obtain QR decomposition for   3 5 A= 4 3 using Householder’s Algorithm Solution. ||x|| =

p

x21 + x22 =



9 + 16 = 5

||x||(sgnx1 )x1 = 25 + 5.3 = 40       3 1 −2 u = x − (sgnx1 )||x||e1 = −5 = 4 0 4       1 3 2 −2  1 0 u·uT −2 4 = − H2 = I2 − 2 ||u||2 = 0 1 10 4 5 4  27     5 1 3 4 3 5  5 Now H2 A = =  11 5 4 −3 4 3 0 5

 4 . −3

172

Linear Algebra to Differential Equations

Now set Q = H₂; then QA = R and, since Q is symmetric and orthogonal, Q^TR = QR = A. Verify this.
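The same factorization can be produced programmatically. The following Python sketch (not from the text; the function name householder_qr is an illustrative choice) accumulates the Householder reflections column by column. With the sign convention a = −(sgn x₁)‖x‖₂ used in Section 4.5, the rows of R may differ in sign from the worked example above; both are valid QR decompositions.

import numpy as np

def householder_qr(A):
    """QR by Householder reflections: returns orthogonal Q and upper triangular R with A = QR."""
    A = np.array(A, dtype=float)
    m, n = A.shape
    Q = np.eye(m)
    R = A.copy()
    for j in range(min(m - 1, n)):
        x = R[j:, j]
        sgn = 1.0 if x[0] >= 0 else -1.0
        u = x.copy()
        u[0] += sgn * np.linalg.norm(x)          # u = x + sgn(x1) ||x|| e1
        if np.linalg.norm(u) < 1e-15:
            continue
        H = np.eye(m)
        H[j:, j:] -= 2.0 * np.outer(u, u) / (u @ u)
        R = H @ R                                # zero the jth column below the diagonal
        Q = Q @ H                                # each H is symmetric and orthogonal
    return Q, R

A = np.array([[3., 5.], [4., 3.]])
Q, R = householder_qr(A)
print(np.round(R, 4))                            # upper triangular
print(np.allclose(Q @ R, A))                     # True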

EXERCISE 4.7

1. Obtain the QR decomposition of the matrix A = \begin{pmatrix}2&1&2\\2&3&1\\1&4&3\end{pmatrix} using Householder transformations.

2. Obtain the QR decomposition of the matrix A = \begin{pmatrix}1&2&1\\2&2&3\\1&3&3\end{pmatrix} using plane rotations.

4.8 Eigen-values: Bounds and Power Method

Finding eigen-values of a matrix is important but it is difficult to find them for matrices of a large order. This is because the problem reduces to finding the roots of characteristic polynomials and finding roots of a polynomial of degree ≥ 4 is not easy. Hence numerical approach becomes essential. This calls for choosing the initial approximation. Gershgorin gave a simple technique to find the bounds of the eigen-values of a matrix A by using the definition of an eigen-value. This method is described below.

Gershgorin Method Let A be a square matrix of order n and λ be an eigen-value of A and x = (x1 x2 xn )T be the eigen-vector corresponding to λ then Ax = λx, x 6= 0. Writing down the system of equations yields (a11 − λ)x1 + a12 x2 + ... + a1n xn = 0 a21 x1 + (a22 − λ)x2 + ... + a2n xn = 0 .. . al1 x1 + al2 x2 + ... + (all − λ)xl + ... + aln xn = 0 .. . an1 x1 + an2 x2 + ... + anl xl + ... + (ann − λ)xn = 0

(4.23)

Numerical Methods in Linear Algebra

173

without loss of generality, assume that |xl | = M ax{|x1 |, |x2 |, ..., |xn |} Then considering the lth equation in the system (4.23) and taking the absolute value gives |λ||xl | = |al1 x1 + al2 x2 + ... + aln xn | ≤ |al1 ||x1 | + |al2 ||x2 | + ... + |aln ||xn | dividing by |xl | and observing that |xi |/|xl | ≤ 1, yields, |λ| ≤ |al1 | + |al2 | + ...|aln | =

n X

|alj |

j=1

Thus an eigen-value λ of a matrix A is bounded by the sum of the absolute values of its row elements. This holds for all rows l = 1, 2, ..., n. n P Since A and AT have the same eigen-values |λ| ≤ |aik | for each k = i=1

1, 2, ..., n. By rewriting lth equation in the system (4.23) in a slightly different fashion one can obtain Gershgorin circles or discs as follows. (all −λ)|xl | = |al1 ||x1 |+|al2 ||x2 |+...+|al(l−1) ||xl−1 |+|al(l+1) ||xl+1 |+...+|aln ||xn | since |xi |/|xl | < 1, |all − λ| ≤

n X

|alj | = rl

j=1,j6=l

Thus an eigen-value λ lies in a circle of radius rl and center all . As the λ is arbitrary and both A and AT have the same eigen-values the following result can be safely concluded. This result is given by Braver. Result 4.8.1 (1) The eigen-values of a matrix A are always less than or equal to the sum of its row elements in absolute values or the sum of its column elements in absolute values. (2) The eigen-values of a matrix A lie in discs with center all and radius rl = n n P P |alj | , l = 1, 2, ..., n and radius rk = |aik |, k = 1, 2, ..., n. j=1,j6=l

i=1,i6=k

Let Cl be the circle with radius rl and center all , l = 1, 2, ..., n and Ck be n S the circle with radius rk and center akk , k = 1, 2, ..., n. Write CR = Cl and CC =

n S

l=1

Ck

k=1

Then all eigen-values of all the matrix A will lie in the circle C = CC ∩ CR .

174

Linear Algebra to Differential Equations

Example 4.8.1 (i) Find the bounds for  −1 A= 1 −1

the eigen-values of the matrix  1 3 2 1 1 2

(ii) Find the Gershgorin circles. (iii) Find the region of existence of the eigenvalues of A. Solution.

Fig. 4.1 The union of the Gershgorin discs of A (C_R = C₁ ∪ C₂ ∪ C₃), the union of the Gershgorin discs of A^T (C_C = D₁ ∪ D₂ ∪ D₃); the eigen-values of A lie inside C_R ∩ C_C.

(i) (a) By considering the row sums of A, |λ| ≤ max{5, 4, 4} = 5.
(b) By considering the column sums of A, |λ| ≤ max{3, 4, 6} = 6.

(ii) (a) The Gershgorin circles taking rows are C₁ = {λ : |λ + 1| ≤ 4}, C₂ = {λ : |λ − 2| ≤ 2}, C₃ = {λ : |λ − 2| ≤ 2}.
(b) The Gershgorin circles taking columns are D₁ = {λ : |λ + 1| ≤ 2}, D₂ = {λ : |λ − 2| ≤ 2}, D₃ = {λ : |λ − 2| ≤ 4}.

(iii) C_R = C₁ ∪ C₂ ∪ C₃ and C_C = D₁ ∪ D₂ ∪ D₃; C_R ∩ C_C (see Figure 4.1) is the region in which the eigen-values of A lie. The eigen-values of A are 3, ±i.
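The disc centres and radii are trivial to compute. The short Python sketch below is not from the text; it assumes NumPy and reproduces the discs of Example 4.8.1 (the row discs of A^T are the column discs of A).

import numpy as np

def gershgorin_discs(A):
    """Return a list of (centre, radius) pairs, one disc per row of A."""
    A = np.asarray(A, dtype=complex)
    discs = []
    for i in range(A.shape[0]):
        radius = np.sum(np.abs(A[i])) - np.abs(A[i, i])
        discs.append((A[i, i], radius))
    return discs

A = np.array([[-1, 1, 3], [1, 2, 1], [-1, 1, 2]])
print("row discs   :", gershgorin_discs(A))       # C1, C2, C3
print("column discs:", gershgorin_discs(A.T))     # D1, D2, D3
print("eigenvalues :", np.round(np.linalg.eigvals(A), 6))   # lie inside the intersection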

Numerical Methods in Linear Algebra

175

Power method. The power method is used to find the largest eigen-value (in magnitude) of a square matrix of order n. In order to use this method certain criteria are necessary. These are assumed and are as follows:
(i) all n eigen-vectors can be determined even if some of the eigen-values are repeated;
(ii) all n eigen-vectors are linearly independent;
(iii) vᵢ denotes the eigen-vector corresponding to the eigen-value λᵢ.
The above assumptions guarantee that the n eigen-vectors S = {v₁, v₂, ..., vₙ} form a basis. Without loss of generality assume that |λ₁| > |λ₂| ≥ ··· ≥ |λₙ|. Let v = α₁v₁ + α₂v₂ + ··· + αₙvₙ be any vector in R^n. Then
\[
Av = \alpha_1Av_1 + \alpha_2Av_2 + \cdots + \alpha_nAv_n = \alpha_1\lambda_1v_1 + \alpha_2\lambda_2v_2 + \cdots + \alpha_n\lambda_nv_n
= \lambda_1\big(\alpha_1v_1 + \alpha_2(\lambda_2/\lambda_1)v_2 + \cdots + \alpha_n(\lambda_n/\lambda_1)v_n\big).
\]
Also,
\[
A^2v = \lambda_1^2\big(\alpha_1v_1 + \alpha_2(\lambda_2/\lambda_1)^2v_2 + \cdots + \alpha_n(\lambda_n/\lambda_1)^2v_n\big),
\]
and proceeding in the same way,
\[
A^kv = \lambda_1^k\big(\alpha_1v_1 + \alpha_2(\lambda_2/\lambda_1)^kv_2 + \cdots + \alpha_n(\lambda_n/\lambda_1)^kv_n\big),\qquad
A^{k+1}v = \lambda_1^{k+1}\big(\alpha_1v_1 + \alpha_2(\lambda_2/\lambda_1)^{k+1}v_2 + \cdots + \alpha_n(\lambda_n/\lambda_1)^{k+1}v_n\big).
\]
Since |λⱼ/λ₁| < 1 for j ≥ 2, as k → ∞,
\[
A^kv \to \lambda_1^k\alpha_1v_1,\qquad A^{k+1}v \to \lambda_1^{k+1}\alpha_1v_1,
\]
and hence
\[
\lambda_1 = \lim_{k\to\infty}\frac{(A^{k+1}v)_i}{(A^kv)_i},\qquad i = 1, 2, \ldots, n.
\]

Observation. Ak v, Ak+1 v are vectors, hence (Av)i , (Ak+1 v)i are the ith components of the vectors. The iteration procedure is stopped when the error of the magnitude of successive ratios is very small or within the prescribed limits. The algorithm of the power method is as follows. Step 1. Given a square matrix A of order n, choose any vector v0 (which is not in the standard basis). (1)

(1)

Step 2. Consider Av0 = v1 = λ1 y(1) where λ1 = max(v11 , v12 , ...v1n )

176

Linear Algebra to Differential Equations (2)

(2)

Step 3. Calculate Ay(1) = v2 = λ1 y(2) where λ1 = max(v21 , v22 , ...v2n ) (k)

(k−1)

Repeat Step 3, k times till λ1 − λ1 the given tolerance limit. Example 4.8.2 Using power method lowing matrix  2 A = 1 9

<  and ||y(k) − y(k−1) || < ,

find the largest eigen-value of the fol4 3 4

8 5 3

 .

Solution. Let the initial approximation be   1 v0 = 1 . 1 Then calculating  2 Av0 = 1 9

4 3 4

8 5 3

    1 14  1 =  9  1 16

Now taking maximum of components of Av0 , and dividing the vector,     14 0.875 1    9 = 0.5625 v1 = 16 16 1 proceeding similarly,  2 Av1 = 1 9

4 3 4

    8 0.875 12 5  0.5625 = 7.5625. 3 1 13.125

Now,     12 0.9143 1  7.5625 = 0.5762 v2 = 13.125 13.125 1 and

 2 Av2 = 1 9

4 3 4

8 5 3

    0.9143 12.1333  0.5762 =  7.6429 . 1 13.5333

again,     12.1333 0.8966 1  7.6429  = 0.5647 v3 = 13.5333 13.5333 1

Numerical  2 Av3 = 1 9

Methods in Linear Algebra     4 8 0.8966 12.0521 3 5  0.5647 =  7.5908  4 3 1 13.3279     12.0521 0.9043 1  7.5908  = 0.5695 v4 = 13.3279 13.3279 1      2 4 8 0.9043 12.0867 Av4 = 1 3 5  0.5695 =  7.6129 . 9 4 3 1 13.4166 Again,     12.0867 0.9009 1  7.6129  = 0.5674 v5 = 13.4166 13.4166 1 and

 2 Av5 = 1 9

4 3 4

8 5 3

   0.9009 12.0714  0.5674  7.6031 . 1 13.3776

Next,     0.9024 12.0714 1  7.6031  = 0.5683 v6 = 13.3776 1 13.3776 and

    12.0781 0.9024 8 5  0.5683 =  7.6074  13.3947 1 3     0.9017 12.0781 1  7.6074  = 0.5679 v7 = 13.3947 1 13.3947      2 4 8 0.9017 12.0752 Av7 = 1 3 5  0.5679 =  7.6055 . 9 4 3 1 13.3872     12.0752 0.902 1  7.6055  = 0.5681 v8 = 13.3872 13.3872 1      2 4 8 0.902 12.0765 Av8 = 1 3 5  0.5681 =  7.6064 . 9 4 3 1 13.3905  2 Av6 = 1 9

4 3 4

Again     12.0765 0.9019 1  7.6064  =  0.568  v9 = 13.3905 13.3905 1

177

178

Linear Algebra to Differential Equations

and

 2 Av9 = 1 9 v10

    8 0.9019 12.0759 5   0.568  =  7.606  3 1 13.389     12.0759 0.9019 1  7.606  = 0.5681. = 13.389 13.389 1 4 3 4

Since v₉ and v₁₀ agree to the digits retained, the procedure is stopped. The largest eigen-value is approximately 13.39 and the corresponding eigen-vector is approximately (0.9, 0.57, 1)^T.
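The iteration of Example 4.8.2 can be packaged as a small routine. The Python sketch below is not from the text; it assumes NumPy, scales each iterate by its largest component as in Steps 1-3, and stops when successive eigen-value estimates agree to a tolerance.

import numpy as np

def power_method(A, v0, tol=1e-4, max_iter=100):
    """Power method with max-component scaling; returns (eigenvalue estimate, eigenvector)."""
    A = np.asarray(A, dtype=float)
    y = np.asarray(v0, dtype=float)
    lam_old = 0.0
    for _ in range(max_iter):
        v = A @ y
        lam = v[np.argmax(np.abs(v))]            # dominant component is the current estimate
        y = v / lam                              # rescale so the largest component is 1
        if abs(lam - lam_old) < tol:
            break
        lam_old = lam
    return lam, y

A = np.array([[2., 4., 8.], [1., 3., 5.], [9., 4., 3.]])
lam, y = power_method(A, [1., 1., 1.])
print(round(lam, 2), np.round(y, 2))             # ≈ 13.39 and (0.9, 0.57, 1)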

Exercise 4.8 1. Obtain the bounds for the eigen-values for    2 2 5 2 (i) 2 2 3 2 1 3 2 2 2. Obtain the Gershgorin circles  −2 1 (i)  4 −3 −2 1

the matrices  4 1 2 1 3 1

for the matrices.   5 1 3 1 (ii) −1 −2 4 1 0

3. Using power method find the largest eigen-value the corresponding eigen-vector.    2 4 1 3 2 1 0 0 2 (i) 4 2 1 (ii)  1 0 2 2 1 2 0 4 2

4.9

 2 3 −1 for the matrices. Find  2 3  1 2

Krylov Subspace Methods

It has been described in literature that one of the ten most important classes of numerical methods is Krylov Space methods. They are used to solve large sparse systems of linear equations and also eigen-value problems of large sparse matrices. Following is a definition of a Krylov subspace.

Numerical Methods in Linear Algebra

179

Definition 4.9.1 Given a nonsingular matrix A ∈ C^{n×n} and a nonzero vector y ∈ C^n, the nth Krylov subspace Kₙ(A, y) generated by A from y is Kₙ = Kₙ(A, y) = span(y, Ay, ..., A^{n−1}y).

Observation. K₁ ⊆ K₂ ⊆ K₃ ⊆ ··· ⊆ Kₙ.

Idea behind Krylov space methods. Consider a linear system Ax = b. Choose an initial approximation x₀, so that r₀ = b − Ax₀ is the initial error vector, or initial residual vector. The sequence of approximations is generated from r₀ by considering the nth Krylov subspace Kₙ(A, r₀) and finding xₙ such that xₙ ∈ x₀ + Kₙ(A, r₀), with residual rₙ ∈ K_{n+1}(A, r₀) and rₙ → 0 as n → ∞. Thus, it is expected that in a finite number of steps rₙ = 0 and xₙ = x* is the solution of the linear system Ax = b. In general, a definition of a Krylov subspace solver that covers all cases is difficult; the following definition is given in a simplified form.

Definition 4.9.2 A Krylov space solver is an iterative method for solving a linear system Ax = b, starting from an initial approximation x₀ and the corresponding residual vector r₀ = b − Ax₀, and generating, for all or at least most n, until it possibly finds the exact solution, iterates xₙ such that xₙ − x₀ = p_{n−1}(A)r₀ ∈ Kₙ(A, r₀) with a polynomial p_{n−1} of exact degree n − 1. For some n, xₙ may not exist or p_{n−1} may be of lower degree.

Krylov space solvers converge very slowly (if they converge at all) for very large real-world problems. One approach to handle this situation is to precondition the given system, that is, to premultiply by an approximate inverse C and consider the system CAx = Cb in place of Ax = b. There are many methods that come under this umbrella. Some of them are:
(i) the conjugate gradient (CG) method, applied when A is a symmetric positive definite matrix;
(ii) the conjugate direction method, a refined version of the CG method, used when A is a symmetric positive definite matrix;
(iii) the Orthomin form of the CG method, again a refinement of the CG method, used when A is a symmetric positive definite matrix;
(iv) the biconjugate gradient (BiCG) method, for nonsymmetric matrices.
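As a concrete instance of a Krylov space solver, the following Python sketch (not from the text; it assumes NumPy and a symmetric positive definite A) implements the conjugate gradient method; each iterate xₙ lies in x₀ + Kₙ(A, r₀), and in exact arithmetic the exact solution is reached in at most n steps.

import numpy as np

def conjugate_gradient(A, b, x0=None, tol=1e-10, max_iter=None):
    """Conjugate gradient for symmetric positive definite A (a Krylov space solver)."""
    n = len(b)
    x = np.zeros(n) if x0 is None else np.array(x0, dtype=float)
    r = b - A @ x                    # initial residual r0
    p = r.copy()                     # first search direction
    rs_old = r @ r
    for _ in range(max_iter or n):
        Ap = A @ p
        alpha = rs_old / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x

A = np.array([[3., -1., 0.], [-1., 3., -1.], [0., -1., 3.]])   # SPD matrix of Example 4.4.4
b = np.array([5., 4., 11.])
print(conjugate_gradient(A, b))      # ≈ (3, 4, 5)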

4.10 Conclusion

Elements of computational errors are introduced in this chapter. Care has been taken to introduce all the direct and iterative methods to solve a linear system of equations. As an orthogonal transformation mitigates errors, Householder transformation and plane rotation method are given to transform ordinary and symmetric matrices respectively to diagonal or tridiagonal form. The QR method is described and the bounds for eigen-values and power method for finding the largest eigen-value and its corresponding eigen-vector are given. Further, Krylov subspace solvers are introduced. For further reading the readers are suggested to refer to [5], [10], [13], [14] and [18].

Chapter 5 Applications

5.1 Introduction

The focus of this chapter is to use the tools developed in earlier chapters and apply them to diverse topics. Section 5.2 highlights the use of determinants in finding equations of conics. Section 5.3 deals with Markov chains that deal with a mathematical system having a finite number of states. In Section 5.4 economic models are discussed using a linear system of equations. The concept of congruent inverse introduced in Section 1.9 is used for encoding and decoding messages in the field of cryptography in Section 5.5. Linear transformations described in Chapter 3 are used to introduce the basics of computer graphics in Section 5.6. The notion of change of basis matrix is widely used in robotics to pass on information from one frame of reference to another and this forms the content of Section 5.7. Matrices play a key role in bioinformatics and this is showcased in Section 5.8. Section 5.9 deals with principal component analysis wherein the eigen-value decomposition and singular value decomposition theorems are well utilized. In Section 5.10, the concept of big data is introduced and an example is given to illustrate it.

5.2 Finding Curves through Given Points

This section deals with application of determinants. It is shown that equation of curves and surfaces can be found using the theorems of analytic geometry along with determinants. The main theorem that is used to obtain our results is from Chapter 2, which states that ‘A homogeneous system of linear equations has a nontrivial solution if and only if the coefficient matrix is singular, that is, its determinant is zero’.

Result 1. Finding the equation of a circle passing through 3 points. Let (x₁, y₁), (x₂, y₂), (x₃, y₃) be any three points through which the


circle must pass. Then clearly these 3 points must satisfy the equation of the circle a(x² + y²) + bx + cy + d = 0. Hence,
\[
a(x_1^2 + y_1^2) + bx_1 + cy_1 + d = 0,\qquad
a(x_2^2 + y_2^2) + bx_2 + cy_2 + d = 0,\qquad
a(x_3^2 + y_3^2) + bx_3 + cy_3 + d = 0.
\]
All the above equations form a homogeneous linear system and possess a nontrivial solution if
\[
\begin{vmatrix}
x^2 + y^2 & x & y & 1\\
x_1^2 + y_1^2 & x_1 & y_1 & 1\\
x_2^2 + y_2^2 & x_2 & y_2 & 1\\
x_3^2 + y_3^2 & x_3 & y_3 & 1
\end{vmatrix} = 0,
\]
which is also the equation of the circle passing through the points (x₁, y₁), (x₂, y₂), (x₃, y₃).

Example 5.2.1 Construct a circle passing through the points (1, 0), (0, 1), (2, 3). The equation of the circle passing through the given points is
\[
\begin{vmatrix}
x^2 + y^2 & x & y & 1\\
1 & 1 & 0 & 1\\
1 & 0 & 1 & 1\\
13 & 2 & 3 & 1
\end{vmatrix} = 0;
\]
expanding the determinant gives 4(x² + y²) − 12(x + y) + 8 = 0, which is the equation of the circle passing through the points (1, 0), (0, 1), (2, 3).
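The cofactor expansion of the determinant can be carried out numerically. The Python sketch below is not from the text; it assumes NumPy, and the function name circle_through is an illustrative choice. It returns the coefficients (a, b, c, d), normalised so that a > 0, and reproduces Example 5.2.1.

import numpy as np

def circle_through(p1, p2, p3):
    """Coefficients (a, b, c, d) of a(x^2 + y^2) + bx + cy + d = 0 through three points,
    obtained from the cofactors of the first row of the 4x4 determinant above."""
    rows = np.array([[x * x + y * y, x, y, 1.0] for (x, y) in (p1, p2, p3)])
    coeffs = []
    for j in range(4):
        minor = np.delete(rows, j, axis=1)       # 3x3 minor obtained by deleting column j
        coeffs.append((-1) ** j * np.linalg.det(minor))
    coeffs = np.array(coeffs)
    if coeffs[0] < 0:
        coeffs = -coeffs                          # normalise so the x^2 + y^2 coefficient is positive
    return coeffs

print(np.round(circle_through((1, 0), (0, 1), (2, 3)), 6))
# ≈ (4, -12, -12, 8), i.e. 4(x^2 + y^2) - 12x - 12y + 8 = 0, as in Example 5.2.1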

Result 2 Finding the equation of a conic section. From analytic geometry it is known that the general equation of a conic section is ax2 + by 2 + cxy + dx + ey + f = 0. Since there are six unknowns given 5 points (x1 y1 ), (x2 y2 ), (x3 y3 ), (x4 y4 ), (x5 y5 ), the equation of the conic section can be determined by forming a homogeneous linear system as before (one can eliminate the coefficient of x2 ). The system possesses a nontrivial solution if the determinant of the system is zero, that is, 2 x y 2 xy x y 1 2 x1 y12 x1 y1 x1 y1 1 2 x2 y22 x2 y2 x2 y2 1 2 x3 y32 x3 y3 x3 y3 1 = 0, 2 x4 y42 x4 y4 x4 y4 1 2 x5 y52 x5 y5 x5 y5 1

Applications

183

which also gives the equation of a conic section. Using this determinant the equations of curves like ellipse, parabola and hyperbola can be determined. This has applications in determining specified orbits of asteroid, etc.

EXERCISE 5.2 1. Construct a conic passing through the points (0 0), (1 1), (−1 1), (2 0), (3 − 2). 2. Find the determinant to find the equation of a sphere passing through four points. and hence find the equation of the sphere passing through the points, (0 0 0), (0 1 1), (1 1 0), (0 2 0). 3. Find the determinant to find the equation of a plane in a 3-D space and hence find the equation of the plane passing through the points, (1 1 1), (0 0 3), (2 5 2).

5.3 Markov Chains

Markov chain technique was developed by the Russian mathematician Andrei Markov. This technique has a wide range of applications which include forecasting weather patterns, understanding commodity price movements, administration of plantations, studying behavior of animals, analysis of experiments in physics, etc. Andrei Markov was the first one to study matrices involving stochastic matrices and Markov chains. Consider a physical or a mathematical system having a finite number of states and is such that at any moment of time the system occupies only one of the finite states. The behavior of the system as it changes from one state to another leads to the following definitions. Definition 5.3.1 Stochastic process is a process in which changing of the system from one state to another is not predetermined but can only be specified in terms of certain probabilities which are obtained from the past behavior or history of the system. Definition 5.3.2 First-order Markov process or a Markov Chain is a stochastic process where in the transition probability of a state depends exclusively on its immediately preceding state.

184

Linear Algebra to Differential Equations

A more explicit definition can be given as follows. Definition 5.3.3 A first-order Markov chain is a sequence of trials of an experiment such that (1) The outcome of the nth trial depends only on the outcome of the (n − 1)th trial and not on the outcome of the earlier trials; and (2) In two successive trials of the experiment, the probability of going from state Si to state Sj remains the same or does not change. To understand a system described by a Markov process or a Markov chain the following concepts are necessary. Definition 5.3.4 Probability vector is a column vector, p, whose entries are nonnegative and their sum is one, that is, n X

pi = 1.

i=1

Definition 5.3.5 Stochastic matrix is a matrix P=[pij ] of order n × n where each of its columns is a probability vector. Definition 5.3.6 Transition probability is the probability of a system moving from one state to another. Definition 5.3.7 Transition matrix of a Markov chain is a probability matrix T = [tij ] of order n × n where tij is the transition probability of the system going from state Si to state Sj on successive trials of the experiment. Observation. The sum of entries in each column is one. In other words n X

pij = 1, f or j = 1, 2, ..., n,

i=1

which implies that a transition matrix is a stochastic matrix. The following problem involves these concepts. Example 5.3.1 Best Furniture Decorators (BFD) has 40% of the business in a certain city. It’s only competitor Excellent Enterprises (EE) has the other 60%. To become more competitive BFD advertised to improve its image. During the advertisement campaign monthly sales figures were compiled. It was found that 80% of BFD’s customers returned to BFD the following month, while 25% of EE’s customers switched to BFD. (a) What percentage of customers use each service after one month? (b) What percentage of customers use each service after two months?

Applications

185

(c) What is the long-term effect in case of each service? Solution. From the given information the decorating services in a city is a system in which there are two states BFD and EE and the probability of the customers choosing each state is given by P (BF D/BF D) = 0.8 P (BF D/EE) = 0.25

P (EE/BF D) = 0.2 P (EE/EE) = 0.75

From the given information, the initial probability vector p0 = [0.4 0.6]T. Observe that the initial vector is a probability vector and the transition matrix T is a stochastic matrix given by,   0.8 0.25 T= 0.2 0.75 Observation. The matrix T gives the probabilities of passing from one state to another. For example, the element a21 of the matrix T is the probability of moving from state 2 to state 1 in one month. To find the number of customers availing each service, consider      0.8 0.25 0.4 0.47 Tp0 = = = p1 , 0.2 0.75 0.6 0.53 where p1 is the vector of proportions after a month. It can be easily seen that the vector of proportions after two months, p2 is given by      0.8 0.25 0.47 0.5085 p2 = Tp1 = = 0.2 0.75 0.53 0.4915 Note that 0.5085 + 0.4915 = 1 and p1 = Tp0 , p2 = Tp1 = T(Tp0 ) = T2 p0 , similarly, 2 3 p3 = Tp2 = T(T   p0 )=T p  0.  0.8 0.25 0.5085 0.529675 Thus, p3 = = 0.2 0.75 0.4915 0.470325 It can be seen that as the number of months increase, the proportions tend to a fixed probability vector,   5 0.555... p= = 94 . 0.444... 9 p is called fixed vector for Transition matrix T because    0.8 0.25 59 Tp = 0.2 0.75 49 T  = 59 94 = p

186

Linear Algebra to Differential Equations

In the long run, 56% of the market is favorable to BFD and 44% of the market to EE. The following table provides information regarding the tendency of customers’ choice after n months. n Probability vector after n months 0 (0.4, 0.6)T 1 (0.47, 0.53)T 2 (0.5085, 0.4915)T 3 (0.529675, 0.470325)T 4 (0.54132125, 0.45867875)T 5 (0.547726688, 0.452273313)T 10 (0.555161541, 0.444838489)T 15 (0.555535725, 0.444464275)T 20 (0.555554558, 0.444445442)T 25 (0.555555505, 0.444444495)T In order to understand Markov Processes the following concepts are necessary. Definition 5.3.8 A transition matrix T is said to be regular if all the entries of at least one of its powers Tn are strictly positive. A Markov chain is regular if its transition matrix is regular. Result 5.3.1 If T is a regular probability matrix then there exists a unique probability vector t such that Tt = t Observation. 1. This unique probability vector is also called as a steady state vector. 2. If p is any probability vector, then, Tn p tends to the fixed vector t as n increases. The following example illustrates the concepts given above. Example 5.3.2 In state elections, it has been determined that voters will switch votes according to the following probabilities. From Socialist Capitalist Independent To Socialist 0.7 0.35 0.4 Capitalist 0.2 0.6 0.3 Independent 0.1 0.05 0.3

Applications

187

Over the long run, what percentages of voters will vote for a socialist, a capitalist and an independent? Solution. The transition matrix for the Markov chain is   0.7 0.35 0.4 T = 0.2 0.6 0.3 0.1 0.05 0.3 Since T is regular from the above result there exists a unique fixed probability vector t such that Tt = t. Let t = (x y z)T. Since t is a probability vector, x + y + z = 1 and further,      0.7 0.35 0.4 x x 0.2 0.6 0.3 y  = y  . 0.1 0.05 0.3 z z yielding a system of equations given by, x+y+z =1 −0.3x + 0.35y + 0.4z = 0 0.2x − 0.4y + 0.3z = 0 0.1x + 0.05y − 0.7z = 0. To solve the system of equations, consider the  1 1 1 −0.3 0.35 0.4   0.2 −0.4 0.3 0.1 0.05 −0.7

augmented matrix given by  p 1 p 0  p 0 p 0

Now R2 → R2 + 0.3R1 , R3 → R3 − 0.2R1 , R4 → R4 − 0.1R1 gives   1 1 1 p 1 0 0.65 0.7 p 0.3   ∼ 0 −0.6 0.1 p −0.2 0 −0.05 −0.8 p −0.1 R2 → (1/0.65)R2 gives   1 1 1 p 1 0 1 1.076 p 0.4615  ∼ 0 −0.6 0.1 p −0.2  0 −0.05 −0.8 p −0.1 R1 → R1 − R2 , R3 → R3 + 0.6R2 , R4 → R4 + 0.05R2 gives   1 0 −0.076 p 0.5385 0 1 1.076 p 0.4615   ∼ 0 0 0.7456 p 0.0769  0 0 −0.7462 p −0.0769

188

Linear Algebra to Differential Equations R3  1 0  ∼ 0 0

→ (1/0.7456)R3 gives 0 1 0 0

−0.076 1.076 1 −0.7462

p p p p

 0.5385 0.4615   0.1031  −0.0769

and R1 → R1 + 0.076R3 , R2 → R2 − 1.076R3 ,  1 0 0 p 0.5463 0 1 0 p 0.3506 ∼ 0 0 1 p 0.1031 0 0 0 p 0

R4 → R4 + 0.7462R3 gives    

Thus, x = 0.5463, y = 0.3506, z = 0.1031 and therefore t = (0.5463 0.3506 0.1031)T. Hence, the percentages of voters that will vote for socialist is 54.63%, capitalist is 35.06% and an independent is 10.31%. Note that if the given information is close to reality, then the conclusion will also be close to reality. Age of trees in a forest. Assume that trees in a forest are divided into four age-groups. Let FB (k) denote the number of baby trees in the forest with age – group 0–20 years at a given time period k. Similarly, FY (k), FM (k) and FO (k) denote the number of young trees (21–40 years of age), middle-aged trees (41–60 years) and old trees (older than 60 years of age), respectively. The length of one time period is 20 years. This simple model needs the following assumptions: (1) In each group, certain percentage of trees die. (2) Surviving trees are placed in the next age group and old trees remain old. (3) Trees lost are replaced by baby trees. Note that total tree population remains constant over a given period of time. This problem is expressed as a system of difference equations as follows: Let the loss rates in percentage in each age group be LB , LY , LM , LO , each being a nonnegative quantity less than 1, then FB (k+1)= LB FB (k)+LY FY (k)+LM FM (k)+LO FO (k) FY (k+1)=(1 − LB )FB (k) FM (k+1)=(1 − LY )FY (k) FO (k+1)=(1 − LM )FM (k)+(1 − LO ) FO (k). Age distribution vector t is given by t(k) = (FB (k) FY (k) FM (k) FO (k))T.

Applications

189

Transition matrix A from the above system of difference equations is given by 

LB 1 − LB A=  0 0

LY 0 1 − LY 0

LM 0 0 1 − LM

 LO 0  . 0  1 − LO

Then we have, At(k) = t(k + 1) Now, choose LB = 0.2, LY = 0.1, LM = 0.4, LO = 0.3. Then,   0.2 0.1 0.4 0.3 0.8 0 0 0 . A=  0 0.9 0 0 0 0 0.6 0.7 Then, the fixed probability vector t is given by 1 t= (1 0.8 0.72 1.44)T. 3.96 Suppose that a forest is newly planted with a total tree population of 70,000 trees. Initial age distribution of the forest f (0) is given by f (0) = (70000 0 0 0)T After 20 years, f (1) = At(0) = 70,000 (0.2 0.8 0 0]T =(14000 56000 0 0)T After 40 years, f (2) = At(1) = 70,000 (0.12 0.16 0.72 0)T = (8400 11200 50400 0)T After 60 years, f (3) = At(2) = 70,000 (0.328 0.096 0.144 0.432)T = (22960 6720 10080 30240)T Clearly for nth period, the number of trees in a forest is given by f (n) = Af (n − 1) = A2 f (n − 2) = ... = An f (0).

EXERCISE 5.3 1. Let the transition matrix be T=

  0.55 0.4 0.45 0.6

Calculate pn for n= 1,2,3,4 given p0  0.6 2. Given the transition matrix T = 0.2 0.2   0 p0 = 1. Find p1 , p2 and p3 . 0

= [1 0]T.  0.5 0.7 0.4 0.1 and the initial vector 0.1 0.2

190

Linear Algebra to Differential Equations

3. Check whether the following transition matrices are regular and find the steady t.  state vector    1/2 1/4 0.78 0.67 (a) (b) 1/2 3/4 0.12 0.33 4. A person either works from home or goes to office. If he works from home one day, he works from home the following day, 1 time out of three. If he goes to office one day, he goes to office the next day 4 times out of five. Over the long run what are the chances that a person goes to office. 5. A city administration wanted to plan its services for next few years. So they have divided the city onto 3 parts: old city, residential area and suburbs. It is found that 15% of old city people move to suburbs and 10% move to residential area. 5% of people in residential area move to old city and 10% move to suburbs. 5% of residents of suburbs move to old city and 5% of residents move to residential area after a long time what percentage of people live in each region.

5.4 Leontief's Models

In this section, two economic models are discussed. The Nobel-Laureate Wassily Leontief proposed two different mathematical models describing interrelations between the constituents of an economic system. These mathematical models are linear in nature and hence the theory of matrices can be utilized to analyze them. The mathematical models are: (1) the closed or input-output model and (2) the open or production model. In a comprehensive economic system, all industries are related to each other. A mathematical model is used for studying the interrelations between industries through some defined parameters. Using this information certain other parameters must be obtained such that they satisfy a given economic objective. For example, suppose that for its production of goods each industry buys products of other industries as input. In some cases, buying can be only from primary producing sector. Thus industries, in general, are interdependent and study of this interdependence of industries in any economy is described in the following models.

Closed Model. A closed or an input-output model deals with an economic system consisting of ‘n’ industries having an ‘output’ that is completely utilized in a predetermined manner by all the ‘n’ industries themselves. The objective of the closed or the

Applications

191

input-output model is to find prices for the outputs so as to balance the income and expenditure of each industry during a given time period. Let xi , i = 1, 2, 3, . . . , n be the price of the total output of the ith industry and aij , i, j = 1, 2, . . . , n be the fraction of the total output of the jth industry purchased by the ith industry. Pn Then clearly (i) xi > 0, (ii) aij > 0 and (iii) i=1 aij = 1, j = 1, 2, ..., n. Let x = (x1 x2 ... xn ) be the cost vector and the input-output matrix be A = (aij )n×n . According to the model, since expenditure of each industry must be equal to its income, x1 = a11 x1 + ... + a1n xn . . . xn = an1 x1 + ... + ann xn or x = Ax or (I − A)x = 0. This is a linear system of homogeneous equations and hence has a nontrivial solution, if (I − A) is a singular matrix. The following example illustrates the model. Example 5.4.1 Three factories X,Y,Z manufacture products and whatever they produce is consumed among themselves. Let X consume 20 % of what it produces and Y 60% of X’s production and remaining is consumed by Z. The production of Y is shared by X, Y, Z in the ratio 3 : 4 : 3. Also X consumes 10% of Z’s production, Y consumes 20% of Z’s production and Z the remaining 70%. Find the ratios of incomes of X, Y, and Z. Solution. Assume that the total production is unity and let x, y, and z be the incomes of X, Y and Z respectively. Let x be the amount obtained from the sale of X’s z product. Since X spends x5 of his product and spends 3y 10 on Y’s product and 10 z on Z’s product, the total expenditure of X is x5 + 3y 10 + 10 . The system being closed, total expenditure equals total income. Using the same reasoning for the incomes of Y and Z, yields, x 3y z + + 5 10 10 3x 2y z y= + + 5 5 5 x 3y 7z z= + + . 5 10 10

x=

192

Linear Algebra to Differential Equations The above mathematical model can be expressed as a linear system,  4   3 1   − 10 − 10 5 x 0  3 3 1    y 0. − − =  5 5 5  z 0 1 3 3 − − 5

10

10

Performing a sequence of elementary row operations: (i) R1 → 10R1 , (ii) R2 → 5R2 , (iii) R3 → 10R3 (iv) R2 → R2 − R1 , (v) R3 → R3 + 3R1 and (vi) R3 → R3 + 2R2 to get      8 −3 −1 x 0 −11 6 0  y  = 0 0 0 0 z 0 This gives, 8x − 3y − z = 0, 6 x = , y 11

−11x + 6y = 0 and

y 11 = or x : y : z = 6 : 11 : 15, z 15

which are the required ratio of incomes of X, Y and Z, respectively.
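Since the closed model requires x = Ax, the income vector is an eigenvector of the input-output matrix for the eigenvalue 1. The Python sketch below is not from the text; it assumes NumPy and recovers the ratio 6 : 11 : 15 of Example 5.4.1.

import numpy as np

# Input-output matrix of Example 5.4.1: entry (i, j) is the fraction of industry j's
# output purchased by industry i, so each column sums to one.
A = np.array([[1/5,  3/10, 1/10],
              [3/5,  2/5,  1/5 ],
              [1/5,  3/10, 7/10]])

vals, vecs = np.linalg.eig(A)
x = vecs[:, np.argmin(np.abs(vals - 1.0))].real   # eigenvector for eigenvalue 1
x = x / x[0]                                      # scale so the first component is 1
print(np.round(6 * x, 4))                         # ≈ (6, 11, 15), the income ratio found above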

Open Model In an open or a production model, the industries satisfy demand from the consumers in addition to the requirement among themselves. Thus in this model, the prices are fixed and the output to be produced by each industry to meet a given demand is to be determined. Making an assumption that the output is measured in terms of the price of the product, the model is developed as follows. Let xi be financial value of the total output of the ith industry and di be the financial value of the output of the ith industry to satisfy the consumer demand. Further, let aij be the financial value of the output of the ith industry needed by the jth industry to produce one unit of financial value of its own output. This gives rise to the production model. x1 = a11 x1 + a12 x2 + ... + a1n xn + d1 x2 = a21 x1 + a22 x2 + ... + a2n xn + d2 .. .. .. .. .. . . . . . xn = an1 x1 + an2 x2

+ ... + ann xn + dn .

In matrix notation, x=Ax + d or (I − A)x = d, where   a11 a12 · · · a1n  a21 a22 · · · a2n    A= . .. .. .. ,  .. . . .  an1 an2 · · · ann

(5.1)

Applications

193

d = (d1 d2 ... dn )T and x = (x1 x2 ... xn )T . The matrix A is called as a transaction matrix. To have a realistic model it is assumed that x ≥ 0, d ≥ 0 and A ≥ 0 and that (I − A) is invertible, then the output vector x is given by x = (I − A)−1 d. The following example illustrates the model. Example 5.4.2 An economy produces products A and B. The two commodities serve as intermediate input in each other’s production. 0.4 units of product B and 0.7 units of product A are needed to produce one unit of product B. 0.1 unit of product B and 0.6 units of product A are required to produce one unit of product A. No capital inputs are required. Verify whether the system is viable. 5 and 4 labor days are required to produce one unit of product A and B respectively. If the economy needs 100 units of product A and 50 units of product B, calculate the gross output of the two commodities and the total labor required. Consumer Product A Producer Product A(units) 0.6 Product B(units) 0.1  0.6 Given, A = 0.1

Product B Final Demand 0.7 0.4

100 50

     0.7 100 x1 , d= , x= . 0.4 50 x2

Under the product B, 0.7 + 0.4 = 1.1 > 1 and therefore the economy is not viable. The balance equation is x = Ax + d i.e. (I − A)x = d or −1

x = (I − A) d. 0.4 −0.7 = 0.17 6= 0 and |I − A| = −0.1 0.6   0.6 0.7 1 (I − A)−1 = 0.17 . Hence 0.1 0.4        0.6 0.7 100 95 558.8 1 1 x = 0.17 = 0.17 = . 0.1 0.4 50 30 176.5 The gross production required is 558.8 units of product A and 176.5 units of product B.   558.8 Total labor required = [5 4] = 3500 man days 176.5

194

Linear Algebra to Differential Equations

Example 5.4.3 The interrelationship between the production of three industries A, B, C in a year is given below ( in crores of rupees). Users A Producers A 40 B 40 C 40

B

C

Final demand

Output

40 60 20

160 200 150

50 50 100 30 50 50

Determine the output if the final demand changes are 60 for A, 40 for B and 60 for C. Solution. Input coefficients of A, B, C are given by A

B

C

40/160 50/200 40/160 100/200 40/160 50/200 The input-coefficient matrix A is given  1 1 4 4   1 1 A= 4 2   1 1 4

50/150 30/150 50/150 by

4

 1 3   1  5   1 3

Further, xT = [x y z], and dT = [60 40 60] The Input-Output balance equation is x = Ax + d or (I − A)x = d i.e. −1 x=(I − A) d. Hence   3 1 1 − −   4 4 3  23 1 1  1 I − A = − 6= 0. −  and |I − A| =  4 240 2 5  1  1 2 − − 4 4 3 It follows that

Applications  68 60 52    23 23 23  60       52 100 56    −1   x=(I − A) d =   23 23 23  40       45 60 75  60 23 23 23   417.39       455.65 =    401.74

195



Thus the final output for A, B and C are x = 417.39, y = 455.65, z = 401.74.

EXERCISE 5.4 1. A medical expert advised a patient to consume 75 g protein, 90 g fats and 100 g of carbohydrates. The patient consumes three types of food items A, B, C which when mixed will provide him desired food values. The table below presents the food analysis of A, B, C.

Type of food A B C

Food value 100 g Proteins Fats Carbohydrates 8 5 5 4 20 2.5 25 5 40

What quantities of items A, B, C patient should consume so that he will receive needed food values? 2. One unit of commodity A is produced by combining 5 units of land, 3 units of labor and 1 unit of capital. One unit of B is produced by 2 units of land, 1 unit of labor and 3 units of capital. One unit of C is produced by 1 unit of land, 2 units of labor and 4 units of capital. If the cost price of A, B, C are Rs.16, Rs. 19 and Rs. 25, respectively, find the rent R, wages W and rate of interest I. 3. An amount of Rs. 5000/- is put into three investments at rates of 6, 7 and 8 percent per annum, respectively. the total annual income is Rs.358/- The combined income from the first two investments is Rs.70/-

196

Linear Algebra to Differential Equations more than income from the third investment. Find the amount of each investment.

4. A Salesman has the following record of sales during three months for three items X, Y and Z which have different rates of commission.

Months

Sale of units X Y Z October 60 100 30 November 130 50 40 December 90 100 20

Total commission (in rupees) 850 900 800

Find the rates of commission on items X, Y, Z. 5. A farmer feeds cattle on a matrix of three types of food namely A, B and C. Type A contains 10% proteins, 10% of the calories and 5% of carbohydrates which needs each day. Similarly, type B contains 5% proteins, 10% of calories but no carbohydrates and type C has 5% proteins, 5% of the calories and 10% of carbohydrates. How many units of each type of food should the farmer give a steer each day so that it gives 100% of the amount of proteins, calories and carbohydrates it requires?

5.5 Cryptology

Cryptography which is both an art and a science deals with techniques for encoding and decoding messages which are exchanged between two parties – one of whom is the sender and the other is the receiver. The message created by the sender is called ‘plain text’. The sender encodes the plain text (‘encrypted text’) using one or more ‘keys’ and sends the encrypted text to the recipient. The recipient uses one or more keys to decrypt the text (‘decrypted text’) so as to obtain the original message, i.e. plain text. In symmetric cryptography, both the sender and the recipient use the same key for encoding and decoding purposes. This obviously requires the sender to share the key with the recipient. A third party can obtain the plain text by intercepting both the encrypted text and the shared key transmitted by the sender. In asymmetric cryptography, both sender and receiver employ different keys for encoding and decoding purposes. An example of asymmetric cryptography is the RSA [Rivest–Shamir–Adleman] algorithm where two sets of keys are used. One key is the public key known and shared by both sender and receiver. The third party can access the public key which is commonly available.

Applications

197

There are two other keys – a private key used by the sender and known only to him/her, and a second private key used by the receiver and known only by him/her. If a third party is able to intercept the encrypted text, they will need to compute the private keys in order to decode the encrypted text. The private keys are two very large prime numbers. Computation of these values is extremely time intensive. Irrespective of which cryptographic techniques are employed, in order to foil hackers all of these techniques must meet two basic requirements which are as follows: (a) the time spent by a hacker in trying to find the key(s) far exceeds the time during which the plain text is seen as useful, i.e. the message becomes useless after a certain time period, and (b) the costs involved in trying to decrypt a message far exceeds the value of the plain text. In this section, we focus on two symmetric cryptographic techniques which employ matrices for encryption/decryption purposes. In the first technique, the plain text is encoded and multiplied by any nonsingular matrix of appropriate dimensions to generate the encrypted text. The encrypted text is then multiplied by the inverse of the nonsingular matrix to yield the original text. The second technique, called the Hill method, is also like the first one but here modular arithmetic and modular matrices are involved. The following two examples show how the two techniques are employed. Assume that the plain text is created from an alphabet drawn from letters A through S (a total of 19 letters) where the letters are encoded by the numerals 0 through 18 as shown below: A B C D E F G H I J K L M N 0

1

2

3

4

5

6

7

O

P

Q R S

8 9 10 11 12 13 14 15 16 17 18

Example 5.5.1 Using matrix inverse. Consider a message line ‘CLEAR PLAN’ to be sent. Then from association between alphabet and numbers, ‘CLEAR PLAN’ is equivalent to ‘2 11 4 0 17 15 11 0 13’. Writing the above nine numbers as three vectors in serial order gives       2 0 11 11 17  0 . 4 15 13   1 2 3 Next consider any arbitrary nonsingular matrix A = 1 0 2 (det(A) 6= 0) 0 1 2

198

Linear Algebra to Differential Equations

and send the message as 3 vectors x, y, z given by   2 x = A 11 4   0 y = A 17 15   11 z = A 0  13 The receiver on obtaining the vectors x, y, z multiplies them with   1 4 2 −  3 3 3   2 1  2 −1 A = − −   3 3 3  1 1 2  − 3 3 3 to get the original vectors and hence retrieves the message. This method can be easily hacked and hence there is a better technique using modular inverse of a matrix from Section 1.9 which is as follows. Example 5.5.2 Using modular inverse of a matrix. Suppose a plain text message ‘EI’ must be sent.  The letters reduce to the 8 1 vector x = (4 8)T. Now consider the matrix M = in Example 1.9.4. 13 3        8 1 4 40 2 Then M x = = = , which is obtained using 13 3 8 76 0     2 C a mod b ≡ 19. The vector in alphabet reduces to and this mes0 A sage is sent to the  receiver. The receiver picks the message CA transforms 2 it to vector z = uses the modular inverse of M (from Example 1.9.4),   0      2 12 2 12 2 4 and finds M−1 z = = and gets the original M−1 = 4 18 4 18 0 8 message. Observation. A 2 × 2 matrix is required to send an information containing 2 letters. Thus a 3 × 3 matrix is essential to send information containing 3 letters. So sending big messages becomes difficult if one were to use large matrices, but the messages can be sent in segments of 3 using a 3 × 3 matrix.
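Encoding and decoding with the modular matrix is easily automated. The Python sketch below is not from the text; it assumes NumPy, uses the 19-letter alphabet A-S introduced above, and takes the matrix M of Example 5.5.2 together with its modular inverse quoted from Example 1.9.4.

import numpy as np

P = 19                                     # size of the alphabet A..S
M = np.array([[8, 1], [13, 3]])            # encoding matrix of Example 5.5.2
M_inv = np.array([[2, 12], [4, 18]])       # modular inverse of M mod 19 (Example 1.9.4)

def encode(text):
    v = np.array([ord(ch) - ord('A') for ch in text])
    return (M @ v) % P

def decode(nums):
    v = (M_inv @ np.array(nums)) % P
    return ''.join(chr(int(k) + ord('A')) for k in v)

cipher = encode("EI")                      # E = 4, I = 8  ->  (2, 0), i.e. 'CA'
print(cipher, decode(cipher))              # [2 0] EI
print((M @ M_inv) % P)                     # identity mod 19, confirming the inverse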

Applications

199

EXERCISE 5.5 

 8 −1 1. Consider N = (Problem 4, Exercise 1.9) send message ‘OM’ 13 18 by encrypting using modular matrix Nc decrypt p=19 it using its modular inverse. 2. Send the message ‘CLEAR PLAN’ by using a 3 × 3 modular matrix. Verify the result using the modular inverse of the matrix.

5.6 Application to Computer Graphics

In this section, the basic concepts of computer graphics are discussed. Visualization is an important aspect of this area, as 3D images have to be projected onto a plane. These images are essentially collections of vectors. Moving an image from one point to another, magnifying or shrinking it, rotating it and projecting it all amount to multiplying every vector in the image by a suitable matrix. All of these objectives can be achieved with the following concepts involving matrices: (1) translation, (2) scaling, (3) rotation and (4) projection. One further concept is needed before describing the four matrices. While working with vectors in the plane (2D) or in space (3D), it sometimes becomes necessary to regard them as vectors in one higher dimension. This leads to the following definition.

Definition 5.6.1 Homogeneous coordinates of a 2D vector (a b)^T are (a b 1)^T, and of a 3D vector (a b c)^T are (a b c 1)^T.

1. Translation
As translation is not a linear transformation, to translate an image placed with reference to the origin (0 0 0)^T to a new frame of reference (a b c)^T, homogeneous coordinates must be used. Every vector (x y z)^T is taken in its homogeneous-coordinate form and multiplied by the following matrix.

(i) Translation matrix in 3D:

    T = [ 1  0  0  0 ]
        [ 0  1  0  0 ]
        [ 0  0  1  0 ]
        [ a  b  c  1 ]

and the vector in its homogeneous coordinates is (x y z 1)^T.


(ii) Translation matrix in 2D:

    T = [ 1  0  0 ]
        [ 0  1  0 ]
        [ a  b  1 ]

and the vector in its homogeneous coordinates is (x y 1)^T. The new position vector in 3D is (x+a  y+b  z+c)^T, or (x+a  y+b  z+c  1)^T in homogeneous coordinates. Similarly, in 2D the new position vector is (x+a  y+b)^T and, in homogeneous coordinates, (x+a  y+b  1)^T.

Observation. To change the frame of reference from the origin (0 0 0)^T to (a b c)^T, take the origin in homogeneous coordinates as x = (0 0 0 1)^T; the product xT = (a b c 1)^T is then the new origin in 3D. The 2D case is similar.

Example 5.6.1 Translate the vector v = (6 3)^T to a new point with reference to the vector (2 1)^T.
Solution. The translation matrix is

    T = [ 1  0  0 ]
        [ 0  1  0 ]
        [ 2  1  1 ]

and

    vT = (6 3 1) T = (8 4 1).

The new position of the vector is (8 4)^T.

2. Scaling
Scaling is an important notion used to magnify or diminish a picture. Depending on how much the picture must be scaled in the x, y and z directions, constants k1, k2 and k3 are chosen and the scaling matrix is denoted by S.

(i) Scaling matrix in 3D:

    S = [ k1  0   0   0 ]
        [ 0   k2  0   0 ]
        [ 0   0   k3  0 ]
        [ 0   0   0   1 ]

(ii) Scaling matrix in 2D:

    S = [ k1  0   0 ]
        [ 0   k2  0 ]
        [ 0   0   1 ]


Example 5.6.2 Scale the vector w = (9 0.4 8) by 0.3 units in the x-direction, 2 units in the y-direction and 0.2 units in the z-direction.
Solution. The scaling matrix is

    S = [ 0.3  0  0    0 ]
        [ 0    2  0    0 ]
        [ 0    0  0.2  0 ]
        [ 0    0  0    1 ]

The new position of the vector w in homogeneous coordinates is

    wS = (9 0.4 8 1) S = (2.7 0.8 1.6 1).

3. Rotation
Rotating a picture or an image involves a rotation matrix. In 2D the rotation of a vector is about the origin. In order to turn a vector w = (a b) through an angle α in homogeneous coordinates, consider the 3 × 3 matrix

    R = [ cos α   sin α  0 ]
        [ -sin α  cos α  0 ]
        [ 0       0      1 ]

Example 5.6.3 Rotate the vector w = (3 3) through an angle α = π/6 about another vector v = (2 1). What is the new position of w?
Solution. To rotate w = (3 3) through α = π/6 about v = (2 1), translation must be combined with rotation. First the vector (2 1), about which the rotation is taken, is translated to the origin by the translation matrix

    T^- = [ 1   0   0 ]
          [ 0   1   0 ]
          [ -2  -1  1 ]

Next the rotation matrix

    R = [ cos π/6   sin π/6  0 ]
        [ -sin π/6  cos π/6  0 ]
        [ 0         0        1 ]

is applied, and finally the frame is translated back to the position vector (2 1) by the translation matrix T of Example 5.6.1. Thus the vector w = (3 3), rotated through α = π/6 about the vector (2 1), is given by the product

    w T^- R T = (1.866 3.232 1).


Hence the new position of w is (1.866 3.232).

In 3D, the rotation of a vector is about an axis. For example, a rotation matrix about the y-axis in 3D homogeneous coordinates is

    R = [ cos β   0  sin β  0 ]
        [ 0       1  0      0 ]
        [ -sin β  0  cos β  0 ]
        [ 0       0  0      1 ]

Example 5.6.4 Suppose a camera is focused on the point w = (5 3) of an image in the xy-plane. It is moved by an angle π/4 in the xy-plane. On what point will the camera focus?
Solution. Taking α = π/4 in the rotation matrix R, the camera will focus on the point given by the product

    (5 3 1) [ cos π/4   sin π/4  0 ]  =  ( 5/√2 - 3/√2,  5/√2 + 3/√2,  1 ).
            [ -sin π/4  cos π/4  0 ]
            [ 0         0        1 ]

Thus the camera will focus, approximately, on the point (1.41 5.66).

4. Projection
To project 3D vectors onto a plane there are two cases: (i) planes passing through the origin and (ii) planes not passing through the origin.

Case (i). Let K be a plane passing through the origin and n be a unit vector normal to the plane. Then the projection matrix onto the plane is I - nn^T, and in homogeneous coordinates the projection matrix is

    PM = [ I - nn^T  0 ]
         [ 0         1 ]

Case (ii). Let K be a plane not passing through the origin. Then the projection onto the affine plane (or flat, as it is called) is done in 3 steps.
Step 1. Consider any point on the plane represented by the vector v0 and translate v0 to the origin by the matrix T^-.
Step 2. Apply the projection matrix along the direction of n.
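A quick numerical check of Example 5.6.4 (a minimal sketch, assuming NumPy; the row vector is simply multiplied by the homogeneous rotation matrix above):

    import numpy as np

    a = np.pi / 4
    R = np.array([[ np.cos(a), np.sin(a), 0],
                  [-np.sin(a), np.cos(a), 0],
                  [ 0,         0,         1]])

    w = np.array([5, 3, 1])      # the point (5, 3) in homogeneous coordinates
    print(w @ R)                 # approx [1.414  5.657  1.0]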


Step 3. Translate back along the vector v0.

In this case, the projection matrix is

    T^- PM T = [ I    0 ] [ I - nn^T  0 ] [ I   0 ]
               [ -v0  1 ] [ 0         1 ] [ v0  1 ]

where I is the 3 × 3 identity matrix and v0 = (v01 v02 v03) is written in the bottom row of the 4 × 4 translation matrices.

Observation.
1. The projection matrix satisfies PM n = (I - nn^T) n = 0.
2. The projection matrix satisfies PM v = v for any vector v on the plane K.
3. A reflection moves each point twice as far as the projection does; the reflection goes through the plane to the other side. So the mirror (reflection) matrix is I - 2nn^T.
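The translation, rotation and back-translation of Example 5.6.3 can also be verified numerically with a short sketch (a minimal sketch, assuming NumPy; the helper names translation and rotation are illustrative, not from the text):

    import numpy as np

    def translation(a, b):
        """2D translation matrix in homogeneous coordinates (row-vector convention)."""
        return np.array([[1, 0, 0],
                         [0, 1, 0],
                         [a, b, 1]], dtype=float)

    def rotation(alpha):
        """2D rotation about the origin in homogeneous coordinates (row-vector convention)."""
        c, s = np.cos(alpha), np.sin(alpha)
        return np.array([[ c, s, 0],
                         [-s, c, 0],
                         [ 0, 0, 1]])

    w = np.array([3, 3, 1])                        # the vector (3, 3)
    T_minus, T = translation(-2, -1), translation(2, 1)
    print(w @ T_minus @ rotation(np.pi / 6) @ T)   # approx [1.866  3.232  1.0]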

EXERCISE 5.6

1. Consider a quadrilateral with coordinates A(2 3), B(3 3), C(2 0) and D(0 0). Apply the translation with a distance of 1 unit from the origin on the x-axis and 1 unit from the origin on the y-axis. Obtain the new coordinates of the quadrilateral.

2. Consider the coordinates A(1 4 2), B(3 3 1) and C(5 2 3) of a 3D object. Apply the translation with a distance of 1 unit from the origin on the x-axis, 1 unit from the origin on the y-axis and 2 units from the origin on the z-axis. Obtain the new coordinates of the object.

3. A picture represented by a matrix A = (aij)3×3 must be reflected onto the x-axis and then projected onto the y-axis. Find the reflection and projection matrices and the matrix obtained after the operations are performed.

4. Rotate the vectors (2 1 2), (1 3 4) and (5 4 1) by an angle π/4 in the yz-plane.

5. Project the vectors (x1 y1 z1), (x2 y2 z2) and (x3 y3 z3) onto the xy-plane by using a suitable projection matrix.

5.7 Application to Robotics

Robotics is a discipline that includes design and development of robots. The robots are expected to replicate human working patterns, especially in areas that are dangerous, difficult, dirty and dull for humans. A scientific definition of a robot is as follows. Definition 5.7.1 A robot is a programmed mechanical device that is software controllable and uses sensors to guide one or more end-effectors to manipulate physical objects in a workspace. There are many types of robots. In this section, an industrial robot is considered. This is also called as a robotic manipulator or a robotic arm. It is modeled by a chain of rigid links interconnected by flexible joints. There is an end effector, called a tool, gripper or a hand, connected to a robotic arm which is used to handle physical objects. It is roughly similar to a human arm. The tool of a robotic manipulator is mounted on a plate and secured to the wrist of a robot. The joints of a robot are of six types. Of these following two are widely used: (i) Revolute joint (R), which allows a rotary motion about an axis; (ii) Prismatic joint (P), which has a sliding or a linear motion along an axis. A robot needs a frame of reference to work with. The axes of the first three joints of a robot, which dictate the movement of the wrist, are considered as the major axes and these determine the coordinate system adopted by the robot.

Types of Robots 1. Cartesian coordinate robot consists of all prismatic joints as shown in Fig. 5.1. The 3 major axes correspond to 3 sliding joints allowing the movement of wrist up and down, in and out, and back and forth. 2. Cylindrical coordinate robot is obtained by replacing the 1st prismatic joint in Cartesian coordinate robot with a revolute joint. The other joints are left untouched. The first revolute joint swings the arm back and forth about a vertical axis. The two prismatic joints then move the wrist up and down along the vertical axis and in and out along the radial axis as can be seen from, Fig. 5.2. 3. Spherical coordinate robot is the result of replacing both the 1st and 2nd


Fig. 5.1 Cartesian robotic arm

Fig. 5.2 Cylindrical robotic arm

prismatic joints in Cartesian coordinate robot with revolute joints. The arm of this robot swings back and forth about a vertical base axis due to the first revolute joint. By the second revolute joint the arm moves up and down about a horizontal shoulder axis and the wrist moves in and out due to the prismatic joint, Fig. 5.3. The other important part of the robot is the tool. The tool is fixed to the wrist and has its own movements. These movements are of many types and they specify the tool orientation. The tool orientation called yaw-pitch-roll (YPR) system adopted by the aeronautics industry is described here. The tool orientation consists of two reference frames (i) wrist coordinate frame B1 = [v1 , v2 , v3 ] and (ii) tool coordinate frame B2 = [w1 , w2 , w3 ].


Fig. 5.3 Spherical or polar robotic arm

The wrist frame is attached to the end of the forearm and the tool frame is attached to the tool and moves with it. At the outset both coordinate frames B1 and B2 are coincident. The yaw motion is performed first; this is done by rotating the tool about the wrist axis v1. Then the pitch motion is done by rotating the tool about the wrist axis v2 and, finally, the roll motion is performed by rotating the tool about the wrist axis v3. These operations can also be specified with reference to the tool frame B2 (see Fig. 5.4).

Fig. 5.4 Roll, pitch, yaw motions

With the frame of reference in place, consider the direct kinematics problem, which will help in understanding the movements of a robot and the application of matrices to this topic.

Definition 5.7.2 Direct kinematics problem is to determine the position and orientation of the tool with respect to (w.r.t.) a coordinate frame attached to the robot base when the joint variables of a robotic manipulator are given.

Consider a single-axis robot as in Fig. 5.4. The robot consists of a single revolute joint connecting a fixed link to a mobile link, whose frames of reference are given by B1 = [v1, v2, v3] and B2 = [w1, w2, w3], respectively. Both B1 and B2 form orthonormal bases. Now consider any point P on the body; let the coordinates of P w.r.t. the basis B1 be [P]_B1 = (a1, a2, a3) and the coordinates of P w.r.t. the basis B2 be [P]_B2 = (b1, b2, b3). Then the vectors in the basis B1 can be written as linear combinations of the vectors in B2 as

    vj = Σ_{i=1}^{3} cij wi ,    j = 1, 2, ..., 3.        (5.2)

Now using the change of basis matrix (or coordinate transformation matrix),


That is, using the relation (3.5) gives [P]_B1 = C [P]_B2, where C = [cij]3×3.

Observation. If the coordinates of P w.r.t. B1 are known and the coordinates of P w.r.t. B2 are to be found, then [P]_B2 = C^T [P]_B1. This follows from the fact that B1 and B2 are orthonormal bases and thus C^{-1} = C^T.

Representation of the fundamental rotations in terms of matrices is essential in using the robot manipulator. These are described below. Suppose the mobile frame (basis) B1 is rotated about one of the unit vectors of the fixed frame (basis) B2; then the change of basis matrix (or coordinate transformation matrix) is called a fundamental rotation matrix. The 3 fundamental rotation matrices that arise are as follows.

(i) Let the mobile frame B1 move about the unit vector v1 of the fixed frame B2 by an angle θ. Then the vectors v1 and w1 coincide and the pairs v2, w2 and v3, w3 differ by θ, as in Fig. 5.5. The resulting change of basis matrix (or coordinate transformation matrix) is

    M1(θ) = [ v1·w1  v1·w2  v1·w3 ]  =  [ 1  0      0      ]
            [ v2·w1  v2·w2  v2·w3 ]     [ 0  cos θ  -sin θ ]
            [ v3·w1  v3·w2  v3·w3 ]     [ 0  sin θ   cos θ ]

Similarly, the fundamental rotation matrix about v2 by an angle φ is

    M2(φ) = [ cos φ   0  sin φ ]
            [ 0       1  0     ]
            [ -sin φ  0  cos φ ]


Fig. 5.5 Rotation of a body

The fundamental rotation matrix about v3 by an angle ψ is

    M3(ψ) = [ cos ψ  -sin ψ  0 ]
            [ sin ψ   cos ψ  0 ]
            [ 0       0      1 ]

These three fundamental rotations correspond to the yaw, pitch and roll of the tool and can be represented by matrices. Since matrix multiplication is not commutative, the order in which the rotations are done plays an important role. In order to avoid ambiguity the following algorithm is used.

Algorithm for Multiple Rotations
(1) Set the mobile frame basis and the fixed frame basis to coincide; then the rotation matrix is M = I.
(2) If the mobile frame (B2) is rotated by an angle θ about the unit vector vi of the fixed frame (B1), premultiply M by Mi(θ).
(3) If the mobile frame (B2) is rotated by an angle θ about its own unit vector wi, postmultiply M by Mi(θ).
(4) If there are more rotations, repeat from step (2); otherwise the process is complete.

The end result is a matrix M that maps mobile frame (B2) coordinates into fixed frame (B1) coordinates. The composite yaw, pitch, roll transformation matrix is as follows. Suppose the mobile frame B2 is rotated about the unit vector v1 with a yaw of θ, next about the unit vector v2 with a pitch of φ and then about the

unit vector v3 with a roll of ψ. Then

    YPR(θ, φ, ψ) = [ cos ψ  -sin ψ  0 ] [ cos φ   0  sin φ ] [ 1  0      0      ]
                   [ sin ψ   cos ψ  0 ] [ 0       1  0     ] [ 0  cos θ  -sin θ ]
                   [ 0       0      1 ] [ -sin φ  0  cos φ ] [ 0  sin θ   cos θ ]

Homogeneous coordinates
The basic movements of a robot are rotation and translation. While rotation is with respect to the given origin, translation implies a change of origin. In order to accommodate both transformations in one mathematical operation, homogeneous coordinates are introduced, just as in computer graphics. A brief description follows. Let p be a vector (point) in R^3, let B be an orthonormal coordinate frame (basis) in R^3, and let γ be any nonzero scaling factor.

Definition. The homogeneous coordinates of p with respect to B are defined by [p]_B = γ (p1, p2, p3, 1)^T. The original vector p can be recovered as p = Hγ [p]_B, where Hγ is the 3 × 4 homogeneous coordinate conversion matrix given by Hγ = (1/γ) [I, 0].

Observation.
1. The homogeneous coordinates of a point p are not unique because of the scale factor γ.
2. In robotics the scale factor is normally taken as γ = 1.

The transformations of rotation and translation are performed by changing from one coordinate frame to another, giving rise to a transformation matrix. Thus a homogeneous transformation matrix T is a 4 × 4 matrix that can be partitioned into 4 submatrices as

    T = [ R    p ]
        [ η^T  γ ]

where R is a 3 × 3 rotation matrix, p is a 3 × 1 vector representing the translation, and η is a 3 × 1 perspective vector with η = 0 here. The situation η ≠ 0 arises when an overhead camera is used and is beyond the scope of this section.

Example 5.7.1 A vector v = 3i + 2j + 7k is rotated by 60° about the z-axis of the reference frame. It is then rotated by 60° about the x-axis of the reference


frame. Find the rotation transformation.
Solution.

    H1 = M3(60°) = [ cos 60°  -sin 60°  0  0 ]
                   [ sin 60°   cos 60°  0  0 ]
                   [ 0         0        1  0 ]
                   [ 0         0        0  1 ]

    H2 = M1(60°) = [ 1  0         0         0 ]
                   [ 0  cos 60°  -sin 60°   0 ]
                   [ 0  sin 60°   cos 60°   0 ]
                   [ 0  0         0         1 ]

    H2 H1 = [ 0.5    -0.866   0      0 ]
            [ 0.433   0.25   -0.866  0 ]
            [ 0.75    0.433   0.5    0 ]
            [ 0       0       0      1 ]

thus,

    v' = H2 H1 v = H2 H1 (3 2 7 1)^T = (-0.23  -4.26  6.62  1)^T.
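Example 5.7.1 can be reproduced in a few lines (a minimal sketch, assuming NumPy; M1 and M3 are the fundamental rotation matrices written as 4 × 4 homogeneous matrices, and the column vector is premultiplied):

    import numpy as np

    def M3(angle):  # rotation about the z-axis, homogeneous 4x4
        c, s = np.cos(angle), np.sin(angle)
        return np.array([[c, -s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]])

    def M1(angle):  # rotation about the x-axis, homogeneous 4x4
        c, s = np.cos(angle), np.sin(angle)
        return np.array([[1, 0, 0, 0], [0, c, -s, 0], [0, s, c, 0], [0, 0, 0, 1]])

    v = np.array([3, 2, 7, 1])                 # v = 3i + 2j + 7k in homogeneous coordinates
    H1, H2 = M3(np.radians(60)), M1(np.radians(60))
    print(H2 @ H1 @ v)                         # approx [-0.23  -4.26  6.62  1.0]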

EXERCISE 5.7

1. Find the transformation matrices for the following operations on the vector 2i + 5j + 4k:
   (i) rotate 60° about the x-axis and then translate 4 units along the y-axis;
   (ii) translate -5 units along the y-axis and rotate 30° about the y-axis.

2. Find the matrix in homogeneous coordinates that is required to move a robot from the position (1 1 3) to the position (4 -5 2).

3. If B1 = {(0 1 1), (1 0 1), (1 1 0)} is the frame of reference of the rigid axis of the robot and B2 = {(1 0 2), (2 3 4), (1 4 1)} is the frame of reference of the robotic arm, find the transformation matrix (that is, the change of basis matrix from B2 to B1) and represent the vector (4 4 1) in the bases B1 and B2.

4. A point P = (2 3 -4)^T is attached to a rotating frame. The frame rotates through an angle θ about the x-axis of the frame of reference. The coordinates of the point relative to the frame of reference after the rotation are P' = (2 -4 3). Find the rotation matrix.

5. Find the axis of rotation, and the angle of rotation about that axis, for the matrix

    R = [ √3/2   0  0.5  ]
        [ 0      1  0    ]
        [ -0.5   0  √3/2 ]

5.8 Bioinformatics

Bioinformatics deals with the sequencing of DNA molecules, protein molecules and genes, etc. The protein molecule and the DNA molecule consist of amino acids and nucleic acids respectively which are represented by the alphabet and are called characters. To study the similarity and comparison between different protein sequences, gene sequences or DNA sequences is a vital component of bioinformatics. The investigation deals with finding the alignment between any two given sequences. The given sequences can be considered as vectors (called strings) and the problem reduces to a mathematical one. Placing one string vertically and another string row wise immediately gives rise to a matrix. To obtain a best possible alignment between the two sequences Needleman–Wunsch algorithm is used. The Needleman–Wunsch algorithm is a global alignment technique versus the Smith–Waterman algorithm which is a local alignment technique. The following concepts are necessary to introduce the Needleman–Wunsch algorithm. Definition 5.8.1 Scoring matrix is a matrix consisting of a preassigned value (score) given to each pair of characters depending on their structure. The value can be positive or negative. Let n be the size of the alphabet. Then the matrix consists of n2 entries. But due to symmetry the number of entries sufficient for our study is n(n−1) . 2 Example 5.8.1 (a) BLOSUM 50 is the scoring matrix for proteins, whose values come from biology. The protein molecule consists of 21-letter alphabet and the matrix consists of 210 entries. (b) The DNA molecule consists of 4 letter alphabet and the number of entries in the scoring matrix is 6. The sequences under study may have undergone some insertions and deletions. These are taken care of by introducing gaps which are defined below.


Definition 5.8.2 Gaps are blank characters. Given two strings x = (x1 x2 . . . xn ) and y = (y1 y2 . . . ym ) blank characters called gaps are introduced into the strings. These gaps can be individual blanks or a string of blanks. Definition 5.8.3 (i) Gap penalty (gp) is a numerical constant ‘a’ given for a single blank. (ii) Gap extension penalty is a numerical constant ‘b’ given for a string of blanks. If l is the length of the string gp(l) = −a − (l − 1)b, for some constants a and b with b < a. Definition 5.8.4 Optimal match between two sequences occurs when the score is maximum. This match is achieved by adding blanks to the given sequences in a systematic fashion so that the resulting strings are aligned optimally. The procedure for optimal alignment known as Needleman–Wunsch algorithm is given below.

Needleman–Wunsch Algorithm
Given two strings x = (x1, x2, ..., xn) and y = (y1, y2, ..., ym), let x be placed along the rows and y along the columns, giving rise to an m × n relational matrix R. This matrix gives the scores between xi and yj, denoted by r(xi, yj). These scores are preassigned values determined from the relation that exists between the elements of the sequences [see Example 5.8.1]. Now the alignment matrix A, of order (m+1) × (n+1), is constructed as follows. A zero row and a zero column are added. Set a00 = 0, ai0 = -ia, i = 1, 2, ..., m, and a0j = -ja, j = 1, 2, ..., n, where a is the gap penalty. Assume that ai-1,j, ai,j-1 and ai-1,j-1 are known [i.e. to find aij, the entries above, to the left and diagonally above-left must be available]. Then the (i, j)th element of the alignment matrix is

    aij = max { ai-1,j-1 + r(xi, yj),   ...(i)
                ai-1,j - a,             ...(ii)
                ai,j-1 - a }.           ...(iii)

In order to follow the computation, a pointer indicates the path taken in filling the alignment matrix: the choice made to obtain aij is denoted by a diagonal arrow if (i) is chosen, a left-to-right arrow if (ii) is chosen and a down arrow if (iii) is chosen. Note that one of the options will surely be taken. Starting at the top, this procedure is followed until the bottom-right corner amn is reached.


Thus the matrix A, along with the arrows, gives the path for the optimal alignment. If two of the paths yield the same value for aij, either one can be chosen; but to get the optimal value, all such paths must be considered. Once the alignment matrix is completed, the optimal alignment is given by two new strings u and v traced back along the arrows with the following choices. If aij was obtained via (i), then xi is aligned with yj. If aij was obtained via (ii), then xi is aligned with a new gap. If aij was obtained via (iii), then yj is aligned with a new gap. Proceeding in this fashion, the two constructed sequences are in alignment with one another. The following example gives a better understanding of the Needleman–Wunsch algorithm.

Example 5.8.2 Using the scoring matrix of Fig. 5.6 given below, find the optimal alignment between the strings TAPES and TAMPS.

         A    E    M    P    S    T
    A    4
    E   -1    5
    M   -1   -2    5
    P    0   -1   -2    7
    S    1    0   -1   -1    4
    T    0   -1   -1   -1    1    5

Fig. 5.6 Scoring matrix

Then the relational matrix is given by the scores r(xi, yj) read off Fig. 5.6 for the two strings, and the alignment matrix is filled in entry by entry as described above; the completed array, together with the traceback arrows, is shown in the path-tracing figure.

Fig. 5.6 Path tracing


Now the alignment matrix is obtained by using the formula

    aij = max { ai-1,j-1 + r(xi, yj),   ...(i)
                ai-1,j - a,             ...(ii)
                ai,j-1 - a }.           ...(iii)

Thus the alignment matrix begins with a first row and a first column consisting of the gap penalties -ia and -ja, respectively; here the first row and column are [0 -3 -6 -9 -12 -15] and [0 -3 -6 -9 -12 -15]^T. Then

    a22 = max{ 5, -6, -6 } = 5,
    a23 = max{ -3, -9, 2 } = 2,
    a33 = max{ 9, -1, -5 } = 9,

and proceeding in this fashion the following matrix is obtained.

Fig. 5.7 The global alignment by traceback

Now, starting from the last corner, the (6, 6) position a66 = 14, the path is chosen by taking the maximum among the elements above, diagonally above-left and to the left of the current position. The optimal path, indicated by the dark arrows, is 14 → 10 → 13 → 6 → 9 → 5 → 0.


Thus

    a66 = (i)    thus x6 and y6 are aligned,
    a55 = (iii)  thus y5 is aligned with a gap,
    a45 = (i)    thus x4 is aligned with y4,
    a34 = (ii)   thus x3 is aligned with a gap,
    a33 = (i)    thus x2 is aligned with y2,
    a22 = (i)    thus x1 is aligned with y1.

Thus the new alignment sequences are

    T A M P - S
    T A - P E S.
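The construction of the alignment matrix and the traceback of Example 5.8.2 can be sketched in Python (a minimal sketch; the scores are those of Fig. 5.6, the gap penalty is a = 3, and the function name needleman_wunsch is illustrative):

    # Scores from Fig. 5.6 (symmetric), gap penalty a = 3.
    score = {("A","A"): 4, ("E","E"): 5, ("M","M"): 5, ("P","P"): 7, ("S","S"): 4, ("T","T"): 5,
             ("A","E"): -1, ("A","M"): -1, ("A","P"): 0, ("A","S"): 1, ("A","T"): 0,
             ("E","M"): -2, ("E","P"): -1, ("E","S"): 0, ("E","T"): -1,
             ("M","P"): -2, ("M","S"): -1, ("M","T"): -1,
             ("P","S"): -1, ("P","T"): -1, ("S","T"): 1}

    def r(c1, c2):
        return score.get((c1, c2), score.get((c2, c1)))

    def needleman_wunsch(x, y, a=3):
        m, n = len(x), len(y)
        A = [[0] * (n + 1) for _ in range(m + 1)]
        for i in range(1, m + 1):
            A[i][0] = -i * a
        for j in range(1, n + 1):
            A[0][j] = -j * a
        for i in range(1, m + 1):
            for j in range(1, n + 1):
                A[i][j] = max(A[i-1][j-1] + r(x[i-1], y[j-1]),   # (i)   diagonal
                              A[i-1][j] - a,                     # (ii)  gap in y
                              A[i][j-1] - a)                     # (iii) gap in x
        # Traceback from the bottom-right corner.
        u, v, i, j = "", "", m, n
        while i > 0 or j > 0:
            if i > 0 and j > 0 and A[i][j] == A[i-1][j-1] + r(x[i-1], y[j-1]):
                u, v, i, j = x[i-1] + u, y[j-1] + v, i - 1, j - 1
            elif i > 0 and A[i][j] == A[i-1][j] - a:
                u, v, i = x[i-1] + u, "-" + v, i - 1
            else:
                u, v, j = "-" + u, y[j-1] + v, j - 1
        return A[m][n], u, v

    print(needleman_wunsch("TAMPS", "TAPES"))   # (14, 'TAMP-S', 'TA-PES')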

EXERCISE 5.8

1. Using the scoring matrix given in Fig. 5.6, find the optimal alignment between the strings APES and AMPS.

2. Using the scoring matrix given in Fig. 5.6, find the optimal alignment between the strings SPAM and PAAM.

5.9 Principal Component Analysis (PCA)

PCA is considered as one of the most valuable results derived from linear algebra. It is used to extract relevant information from a confusing data set. Further, it is used to reduce the dimension of the data set and also to obtain hidden and simplified dynamics of the data set. In order to understand any system, some parameters (say, n) are chosen and a data set is constructed through experimentation. A relation between the data is obtained by finding the covariance matrix, (say X), which is a symmetric matrix. The problem is to (i) simplify the data set by finding parameters that contain the complete information; (ii) uncorrelate the important parameters; (iii) filter out noise and reveal the hidden dynamics;


(iv) identify the parameters (or directions) in which most of the data is concentrated. This problem is handled using tools from linear algebra. As the data set is a matrix Xn×n , the rank of the matrix (r, say) gives the number of linearly independent parameters in the system. The system can be considered as an n-dimensional space and there are r linearly independent vectors (since rank X = r). These r vectors can be extended to form a basis containing n vectors. As there are infinitely many bases, the choice of the basis is vital. The basis consisting of the unit eigen-vectors corresponding to the eigenvalues of the covariance matrix X is considered. An orthogonal matrix P of order n × n is constructed using the r unit eigen-vectors and adding n − r orthonormal vectors. The idea is to introduce a new matrix Y, such that X = PY so that the matrix Y contains columns that are uncorrelated and are arranged in decreasing order of variance. xk = Pyk or yk = PT xk

(as P is an orthogonal matrix, P−1 = PT ).

From Chapter 2, the matrix P diagonalizes X so that X = PDPT or D = PT XP, where D is the diagonal matrix consisting of eigen-values λ1 , λ2 , λ3 , . . . , λr of X arranged in decreasing order, that is, λ1 ≥ λ2 ≥ λ3 ≥ . . . ≥ λr . Suppose u1 , u2 , u3 , . . . , ur are the eigen-vectors corresponding to the eigen-values λ1 , λ2 , . . . , λr . Then Xui = λi ui , i = 1, 2, 3, . . . , r. This means that the data concentration is in the direction of ui with λi as magnitude, i = 1, 2, 3, . . . , r. As λ1 = max{λi : i = 1, 2, . . . , r], thus one can conclude most of the data is concentrated in the direction of u1 with magnitude λ1 . Thus u1 is called the first principal component. Consider the next maximum eigen-value λ2 , excluding λ1 , then λ2 gives the magnitude of data concentrated in the direction u2 , which is the second principal component. Depending on the required accuracy of the problem the number of principal components are considered and the required information is obtained. The new matrix Y consisting of the columns yi is obtained by yi = uT i X = c1 x1 + c2 x2 + ... + cn xn , where ui = (ci ). Thus yi is the linear combination of the original variables with weights as the entries of ui . Similarly, all the significant ‘r’ columns can be obtained. The yi0 s, i = 1, 2, ..., r are the new uncorrelated variables.


Procedure to find the principal components of a data set
Step 1. From the given data set find the covariance matrix (or consider the given matrix).
Step 2. Find the unit eigen-vectors corresponding to the eigen-values.
Step 3. Arrange the eigen-values in descending order.
Step 4. Then u1 is the first principal component, u2 is the second principal component, and so on. This process can be continued till the required degree of accuracy is reached.
Step 5. yi = ui^T X, i = 1, 2, ..., r, gives the necessary uncorrelated variables, which form the new data set.

Another procedure to obtain the PCA is the singular value decomposition. The singular value decomposition (SVD) discussed in Chapter 3 is used in many applications. It is considered a practical tool for PCA as SVD caters to any m × n matrix.

Procedure for PCA using SVD
Suppose A is an m × n matrix of observations having rank r.
Step 1. Calculate A^T A and find its eigen-values. Let λ1, ..., λn be the eigen-values of A^T A and v1, ..., vn be the orthonormal eigen-vectors corresponding to λ1 ≥ λ2 ≥ ... ≥ λn.
Step 2. Since rank A = r, there exist σi > 0, i = 1, 2, ..., r, such that σi = √λi, i = 1, 2, ..., r. Set ui = (1/σi) A vi, so that σi ui = A vi.
Then ui, i = 1, 2, ..., r, are called the left singular vectors of A and vi, i = 1, 2, ..., r, are called the right singular vectors of A; the right singular vectors of A are called the principal components of the matrix A.

Observation. Compared to the eigen-value decomposition of A, SVD is faster in calculation and is more accurate.

Example 5.9.1 Consider the Iris flower data set of samples from each of 3 species of Iris, wherein four features were measured from each sample. The notation used is SL: sepal length, SW: sepal width, PL: petal length, PW: petal width. The problem is to reduce the dimension of the given data.


Solution.
Step 1. Consider the data set, which is given by

         SL    SW    PL    PW
    1    5.5   3.5   1.4   0.2
    2    4.9   3     1.4   0.2
    3    4.7   3.2   1.3   0.2
    4    4.6   3.1   1.5   0.2
    5    7     3.2   4.7   1.4
    6    6.4   3.2   4.5   1.5
    7    6.9   3.1   4.9   1.5
    8    5.5   2.3   4     1.3

Step 2. Calculate the covariance matrix using the formula

    cov(x, y) = Σ (xi - x̄)(yi - ȳ) / (N - 1)        [N is the number of rows],

the variance of x being the case y = x. The covariance matrix is

    X = [ 0.9355   0.0368   1.4694   0.5530 ]
        [ 0.0368   0.1193  -0.1625  -0.0725 ]
        [ 1.4694  -0.1625   2.857    1.1062 ]
        [ 0.5530  -0.0725   1.1062   0.4327 ]

Step 3. Calculate eigen-values and eigen-vectors considering the characteristic equation det(X − λI) = 0. The eigen-values are λ1 = 4.077, λ2 = 0.2468, λ3 = 0.0184, λ4 = 0.0023. Since the eigen-vector satisfies the equation Xx = λx or (X − λI)x = 0 on substituting λ values in the above equation yields the following eigenvectors corresponding to the eigen-value λi written in the descending order x1 = [0.4464 0.6893 0.5571 0.1233]T x2 = [−0.036 0.6473 −0.7591 −0.0584]T x3 = [0.8342 −0.2722 −0.2397 −0.4153]T x4 = [0.3217 −0.1782 −0.2364 0.8994]T


Step 4. Select the number of principal components. One can select any number from 1 to n [n is the number of eigen-values]; in this example choose 2. Take the eigen-vectors corresponding to the top 2 eigen-values (arranged in descending order) and then transpose the result. Thus, considering the eigen-vectors x1 and x2 corresponding to the top 2 eigen-values, the resultant matrix is

    A = [ 0.4464   0.6893   0.5571   0.1233 ]
        [ -0.036   0.6473  -0.7591  -0.0584 ]

Transposing this matrix yields

    A^T = [ 0.4464  -0.036  ]
          [ 0.6893   0.6473 ]
          [ 0.5571  -0.7591 ]
          [ 0.1233  -0.0584 ]      (4 × 2)

Step 5. The required new dataset is given by the product of the covariance matrix and the transposed matrix of the chosen eigen-vectors:

    X A^T = [  1.3298  -1.1577 ]
            [ -0.0008   0.2035 ]
            [  2.272   -2.3915 ]
            [  0.8666  -0.9319 ]      (4 × 2)

Thus the resultant dataset has 2 principal components.
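The computation of Example 5.9.1 can be repeated with a short NumPy sketch (a minimal sketch under the same conventions as above; note that eigen-vectors are determined only up to sign, so the signs of the resulting columns may differ from those printed above):

    import numpy as np

    data = np.array([[5.5, 3.5, 1.4, 0.2], [4.9, 3.0, 1.4, 0.2],
                     [4.7, 3.2, 1.3, 0.2], [4.6, 3.1, 1.5, 0.2],
                     [7.0, 3.2, 4.7, 1.4], [6.4, 3.2, 4.5, 1.5],
                     [6.9, 3.1, 4.9, 1.5], [5.5, 2.3, 4.0, 1.3]])

    X = np.cov(data, rowvar=False)            # 4 x 4 covariance matrix (Step 2)
    eigvals, eigvecs = np.linalg.eigh(X)      # eigh: for symmetric matrices
    order = np.argsort(eigvals)[::-1]         # descending order of eigen-values
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    k = 2                                     # number of principal components kept
    A = eigvecs[:, :k]                        # columns are the first two principal directions
    new_data = X @ A                          # same product as X A^T in Step 5
    print(eigvals[:k])                        # compare with 4.077 and 0.2468 above
    print(new_data)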

EXERCISE 5.9

1. Consider the matrix

    A = [ 70  120  30  50 ]
        [  4    5   1   4 ]
        [  1    5   1   0 ]
        [  5    8   7   8 ]

and find the first three principal components and build the new dataset.

2. Consider the matrix

    A = [ 4  -3  4 ]
        [ 5  -7  7 ]
        [ 4  -1  2 ]

and find the first two principal components and build the new dataset.

5.10 Big Data

Big data mainly deals with unstructured and semi-structured data in terms of pictures, messages, etc., created by people all over the world. It is characterized by the 4 Vs: Volume, Variety, Veracity and Velocity–which are briefly described below.

Volume The main characteristic that makes data “big” is its sheer volume. The total amount of information is growing exponentially every year. For example, Google has a 30 billion x 30 billion matrix and everyday 250,000 pages of new data is being added to it. Variety Variety is one of the most interesting developments in technology as more and more information is digitized. A picture, a voice recording, a tweet – they all can be different but express ideas and thoughts based on human understanding and all these are identified as unstructured data, a fundamental concept in big data. One of the goals of big data is to use technology to consider this unstructured data and make sense of it. Veracity Veracity refers to the trustworthiness of the data. There will be inherent discrepancies in all the data collected. Velocity Velocity is the frequency of the incoming data that needs to be processed. The many SMS messages, Facebook status updates or credit card swipes that are being sent on a particular telecom carrier every minute of every day indicate the high velocity of data that is being transmitted daily. Analyzing this data and coming up with meaningful interpretations, that is required by companies and institutions for their benefit and profit, is the big challenge. In terms of understanding the data and making it structured so as to come up with useful interpretations it is essential to think in terms of mathematical language. A brief description of the math used is given below for a better understanding of the data. In order to analyze the data, mathematicians use curve fitting in one dimension, surface fitting in two dimensions and volume fitting in three dimensions. Big data can be characterized as data having too many data points in n-dimensions where the dimension of the data crops up from the number of


parameters involved. Thus attempts have to be made to do n-manifold fitting by extending the least squares method used for curve fitting to n-dimensions. Further it may be necessary to use nonlinear optimization for analyzing the data. Big data is by nature very dynamic. The number of parameters and their character keeps on changing. In essence there are two types of parametersstatic and dynamic- for both of which the values keep on varying. Further a parameter involving probability must be included so as to make predictions using the data. Since parameters represent dimension a big issue is which parameters to keep and which to toss so as to reduce the dimension. As stated earlier, big data consists of mainly unstructured data, which must be transformed to structured data, so that it can be understood and meaningful interpretations could be made. The principal component analysis (PCA) and singular value decomposition are used for data compression. In what follows a simple example of groceries is considered and changed to structured data so as to analyze it. Example 5.10.1 Consider a store which keeps track of 4 customers [A, B, C, D] and which/how many of 16 items the store carries that a customer purchases during any given month. Assume the 16 items are eggs, butter, milk, soda, candy, beer, utensils, soap, paper towels, fruits, vegetables, rice, lentils, coffee, tea and incense. Suppose the store has collected the following data for each of the four customers during a given month: what did the customer buy, in what quantities, total bill, top two items purchased, and how many times the customer has shopped at the store. The data can be represented using a 4 by 35 matrix, one row per customer. Thirty-five columns are needed because we need 16 columns to store which items were purchased, 16 columns to indicate which two of the 16 items are a customer’s top items (these columns will have only two nonzero values), one column for the date, one column for the total bill, and one column indicating how many times the customer has shopped at the store during that month. However, one can reduce the size of the matrix by introducing, say, 4 categories to group the 16 items. For example, eggs, butter and milk can be placed in the dairy category, the beverage category would contain beer, soda, coffee and tea, the non-food category will contain soap, utensils, incense and paper towels, and the ‘other’ category will account for the remaining items. Under this approach, only 4 by 11 matrix is required, since there are only four categories to represent items purchased, four category columns to indicate top choices, plus one column each for the date, total bill and store visit frequency. A simple illustration is as follows. Representation in a matrix form using categories is as follows (food, beverages, non-food, and other)


    Customer   Date     Items                                      Top 2 items    Total bill   Store visits
    A          18-Jan   Beer, eggs, milk, coffee, rice             eggs, milk     150          2
    B          13-Jan   Butter, fruits, soap                       fruits, soap    85          4
    C          5-Jan    Milk, soda, beer                           soda, beer      62          1
    D          26-Jan   Coffee, tea, incense, utensils, lentils    tea, incense   212          2

    [ 18  1 1 0 1  1 0 0 0  150  2 ]
    [ 13  1 0 1 1  0 0 1 1   85  4 ]
    [  5  1 1 0 0  0 1 0 0   62  1 ]
    [ 26  0 1 0 1  0 1 1 0  212  2 ]

Based on the above matrix, we can now compute the pseudo inverse, eigen-values, singular values, eigen-vectors, singular vectors, etc. We obtain

    P = [ -0.53515   0.09607   0.12532   0.82986 ]
        [ -0.30462  -0.87462   0.34705   0.34705 ]
        [ -0.22017   0.44976   0.80593  -0.31576 ]
        [ -0.75653   0.15332  -0.46295  -0.43569 ]

    Q = [ 282.34  0     0     0     0  ...  0 ]
        [ 0       4.66  0     0     0  ...  0 ]      (4 × 11)
        [ 0       0     2.46  0     0  ...  0 ]
        [ 0       0     0     1.36  0  ...  0 ]

and the corresponding 11 × 11 matrix R of singular vectors.

In order to understand and analyze the patterns, the analyst changes one category and studies the corresponding effect on the matrices P, Q and R. Once the eigen-values, eigen-vectors, singular values and singular vectors have been computed, the big data analyst will now have to perform a perturbation/sensitivity analysis. Such an analysis is done by changing one or


more values in the original data matrix and recomputing the eigen-values, singular values, etc. This is a tedious and an iterative process but needs to be carried out in order to extract any patterns that may emerge so as to be able to predict consumer behavior. Once these patterns have been extracted, the interesting challenge becomes how best to interpret the patterns and give them real-world significance which will be of use to the store. Different stores may have different criteria for pattern interpretation.
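One way such a perturbation study can be organized is sketched below (a minimal sketch, assuming NumPy; the 4 × 11 matrix is the customer matrix above and the perturbed entry is an arbitrary illustrative choice):

    import numpy as np

    store = np.array([
        [18, 1, 1, 0, 1, 1, 0, 0, 0, 150, 2],
        [13, 1, 0, 1, 1, 0, 0, 1, 1,  85, 4],
        [ 5, 1, 1, 0, 0, 0, 1, 0, 0,  62, 1],
        [26, 0, 1, 0, 1, 0, 1, 1, 0, 212, 2],
    ], dtype=float)

    def singular_values(M):
        return np.linalg.svd(M, compute_uv=False)

    base = singular_values(store)      # compare with 282.34, 4.66, 2.46, 1.36 above

    # Perturbation/sensitivity: change one entry and recompute.
    perturbed = store.copy()
    perturbed[0, 9] = 180.0            # e.g. customer A's bill changes from 150 to 180
    print(base)
    print(singular_values(perturbed))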

EXERCISE 5.10

1. Find the first two principal components of the data matrix obtained by considering the first 4 columns given in the above example, using eigen-value decomposition.

2. Consider columns 7 through 11 of the matrix in Example 5.10.1. Form a matrix and find the first two principal components using eigen-value decomposition and SVD.

5.11 Conclusion

This chapter showcased some of the applications of various aspects of matrices to diverse areas ranging from mathematics and economics to computer graphics, robotics and big data. This chapter just touches the tip of the iceberg of the vast range of applications of matrices in the field of knowledge in mathematical, financial, physical, social and biological sciences, engineering and technology. For further reading the readers are suggested to refer to [1], [7], [9],[11], [15] and [17]

Chapter 6 Kronecker Product

6.1 Introduction

The present chapter is devoted to a concept which has raised the theory of matrices to new heights. Further, it enriched the scope of matrix applications in basic sciences, engineering, technology and social sciences. Very few text books on matrix theory of linear algebra include the concept of Kronecker product. The present text includes Kronecker products and interlinks all the subsequent chapters by using it as a tool. Kronecker product (also known as ‘direct product’ or ‘tensor product’) has its origin in group theory and stability analysis. Conversion of a matrix into a vector form and vice versa using a Kronecker product provides a way for solving several problems in matrices. Further, the knowledge of eigen-values and their related eigen-vectors of matrices of lower orders such as n = 2, 3, 4 were utilized to obtain the eigen-values and the corresponding eigen-vectors of matrices of higher order, thus enhancing the scope of applications of matrices of higher order. These concepts are presented in this chapter. Following is the plan which sequentially brings home the ideas concerning Kronecker product. In Section 6.2 the concepts of primary matrices are introduced and converting a matrix into a vector form is discussed. Kronecker products are introduced in Section 6.3 and some basic properties are given. Section 6.4 deals with some more useful properties of Kronecker product. In Section 6.5 the Kronecker product of the linear transformations is given. Section 6.6 deals with the utility of combining Kronecker products and vector operators. Permutation matrices and Kronecker products form the content of Section 6.7. In Section 6.8 analytic functions of Kronecker products are studied. Kronecker sum is defined in Section 6.9 and a few important properties are obtained.

6.2 Primary Matrices

In this section, the topics considered in the first few chapters are revisited and are formulated in a manner useful for the development of the material in the upcoming sections. The following is a beginning in this direction.


Definition 6.2.1 A primary matrix Pkl is an m × n matrix having '1' in the (k, l) entry and zero elsewhere, that is,

    Pkl = (pij)m×n,   where  pij = 1 if i = k, j = l, and pij = 0 otherwise.

Example 6.2.1 The primary matrices Pkl of order 2 × 2 are

    P11 = [1 0; 0 0],   P12 = [0 1; 0 0],   P21 = [0 0; 1 0]   and   P22 = [0 0; 0 1].

Observation.
1. I2 = P11 + P22.
2. S = {P11, P12, P21, P22} is a basis for the space of all 2 × 2 real matrices, R^{2×2}.

Example 6.2.2 Show that any matrix Am×n can be written as a linear combination of primary matrices Pkl of order m × n (i.e. of the same order as A).
Solution. Let Pkl = (pij)m×n, where pij = 1 if i = k, j = l, and pij = 0 otherwise. Consider the set S = {Pkl : k = 1, 2, ..., m; l = 1, 2, ..., n} of mn matrices. Then any matrix A = (aij)m×n can be written as a linear combination of the elements of S as follows:

    A = (aij)m×n = Σ_{j=1}^{n} Σ_{i=1}^{m} aij Pij.
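A small numerical check of this decomposition (a minimal sketch, assuming NumPy; the helper primary_matrix and the particular 3 × 3 matrix A are illustrative):

    import numpy as np

    def primary_matrix(k, l, m, n):
        """P_kl: the m x n matrix with 1 in entry (k, l) and 0 elsewhere (1-based indices)."""
        P = np.zeros((m, n))
        P[k - 1, l - 1] = 1
        return P

    A = np.array([[1, 2, 3], [3, 4, 5], [5, 6, 7]])
    m, n = A.shape
    S = sum(A[i - 1, j - 1] * primary_matrix(i, j, m, n)
            for i in range(1, m + 1) for j in range(1, n + 1))
    print(np.array_equal(S, A))   # True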

The concept of unit vectors introduced in Chapter 1 is linked to primary matrices as follows. Definition 6.2.2 Primary column matrix, Pk! is a unit column vector having m rows and 1 column, such that Pk! = (pi1 )m×1 with pi1 =  1 if i = k 0 otherwise. Definition 6.2.3 Primary row matrix, P!l is a unit rowvector having 1 row 1 if j = l and n columns, such that, P!l = (p1j )1×n with p1j = 0 otherwise. Example 6.2.3 P1! = e1

P2! = e2 . . . Pn! = en

P!1 = eT1

P!2 = eT2 · · · P!m = eTm .

Kronecker Product

227

Relation between unit vectors and primary matrices The primary matrix Pkl of order m × n is the product of the m-unit vector ek and the transpose of the n-unit vector el , eTl ,  (Pkl )m×n = (ek )m×1 eTl 1×n which in terms of primary matrices can be written as  (Pkl )m×n = (Pk! )m×1 PT!l 1×n

Thus the matrix Pkl

  0 0   .   .    = Pk! P!l =  1 0 0 . .  % . th  k place   0 0  0 ··· 0 ··· 0 · · · 0 ···   0 · · · 1 ··· = kl 0 · · · 0 ··· 0 ··· 0 ···

. 1 l

th

(6.1)

.

. 0

 0

x  place

 0 0  0  0 0

Example 6.2.4 The primary matrices of order 2 × 2 can be written using primary column and row matrices of order 2 × 2 as follows (P11 )2×2 = (P1! )2×1 (P!1 )1×2 ,

(P12 )2×2 = (P1! )2×1 (P!2 )1×2

(P21 )2×2 = (P!1 )2×1 (P!1 )1×2

and

(P22 )2×2 = (P2! )2×1 (P!2 )1×2 .

Partitions of a matrix Consider an m × n matrix A. Represent its ‘n’ columns by A∗1 , A∗2 , . . . , A∗n and its ‘m’ rows by A1∗ , A2∗ , . . . , Am∗ . Definition 6.2.4 Column partition of matrix A is the matrix written using its columns as submatrices and is as   A = A∗1 A∗2 · · · A∗n Definition 6.2.5 Row partition of a matrix A is the matrix consisting of its rows as submatrices and is written as  T A = A1∗ A2∗ A3∗ · · · Am∗

228

Linear Algebra to Differential Equations  T Observation. Ai∗ is the ith row of A and is simply represented by ATi∗ . Here Ai∗ is the column of the ith row of A for i = 1, 2, . . . , m.   1 3 Example 6.2.5 Let A = 2 4        1 3 1 Then A∗1 = and A∗2 = and A1∗ = which gives AT1∗ = 1 4 3  2   2 and A2∗ = this gives AT2∗ = 2 4 . 4

3



Relation between partitioned matrices and primary matrices (1) The lth column of A, A∗l is given by

 APl! = A∗1

A∗2

···

  0 0   .   .      0  A∗n  1 → lth place 0   .   .   0 0

= A∗l Thus A∗l = APl!

(6.2)

(2) the kth row of A, Ak∗ is represented as a column by nomenclature. So to obtain the kth row of A, the kth row of AT is used.   T 0 . . . 0 1 0 . . . 0 A1∗ A2∗ A3∗ · · · Am∗ x kth place = ATk∗ thus, the kth row of A is ATk∗ = P!k A

(6.3)

  4 5 Example 6.2.6 Let A = 2 3. Express the 2nd column of A and 3rd row 1 8 of A using primary matrices.

Kronecker Product

229

Solution. A∗2

 4 = AP2! = 2 1

   5   5 0 3 = 3 1 8 8

A3∗ = the column matrix of the 3rd row of A. To get the 3rd row of A consider AT3∗ AT3∗

 = P13 A = 0

  4 1 2 1

0

 5  3 = 1 8

 8 .

To obtain the matrix A using row partitioned matrices and column partitioned matrices (1) From (6.2) A=

n X

A∗l P!l

(6.4)

Pk! ATk∗

(6.5)

l=1

From (6.3) A=

m X k=1

Now in Example 6.2.6, using R.H.S of the relation (6.4) gives     2 5  4  X   A∗l P!l = 2 1 0 + 3 0 1 8 1 l=1     0 5 4 0 = 2 0 + 0 3 0 8 1 0   4 5 = 2 3 = A. 1 8 To obtain A using the rows of A, consider the R.H.S of (6.5),       3 0  0  1  X    Pk! ATk∗ = 0 4 5 + 1 2 3 + 0 1 8 0 0 1 k=1       4 5 0 0 0 0 = 0 0 + 2 3 + 0 0 0 0 0 0 1 8   4 5 = 2 3 = A. 1 8

230

Linear Algebra to Differential Equations   a1 a2 a3 Example 6.2.7 Let A = . Verify the relations (6.4) and (6.5) a4 a5 a6 Solution. Verification of relation (6.4) Consider       3 X    a  a  a  A∗l P!l = 1 1 0 0 + 2 0 1 0 + 3 0 0 1 a4 a5 a6 l=1       a 0 0 0 a2 0 0 0 a3 = 1 + + a4 0 0 0 a5 0 0 0 a6   a a2 a3 = 1 , hence (6.4) holds a4 a5 a6 Consider 2 X

      1  0  a1 a2 a3 + a4 a5 a6 0 1       a1 a2 a3 0 0 0 a1 a2 a3 = + = , hence (6.5) holds. 0 0 0 a4 a5 a6 a4 a5 a6

Pk! ATk∗ =

k=1

Further, from the relation (6.1) one can obtain that the (i, j)th entry of A = aij = P!i APj! and that A=

n X m X

aij Pij ,

(6.6)

(6.7)

j=1 i=1

where Pij is an m × n matrix (same order as A) such that the (i, j)th entry of Pij = 1 and elsewhere ‘0’.  1 Example 6.2.8 Given A = 3 5

2 4 6

 3 5 7

(i) Obtain the 2nd column of A using the required primary matrix (ii) Obtain the 3rd row of A using the required primary matrix (iii) Obtain the (1, 3) entry of A using the required primary matrices. (iv) Express A as a sum of (a) column vectors of A (b) row vectors of A (c) entries of A. Solution.  1 (i) Consider AP2! = 3 5

2 4 6

    3 0 2 5 1 = 4 7 0 6

Kronecker Product 231     1 2 3   (ii) Consider P!3 A = 0 0 1 3 4 5 = 5 6 7 5 6 7      0   1 2 3   0 (iii) Consider P!1 AP3! = 1 0 0 3 4 5 0 = 1 2 3 0 5 6 7 1 1 = [3] P3 (iv) (a) Consider ∗2 P l=1 A∗l P   !l = A∗1 P!1 + A !2 + A∗3 P!3 1  2 3      = 3 1 0 0 + 4 0 1 0 + 5 0 0 1 5 6 7   1 2 3 = 3 4 5 = A 5 6 7 (b) Consider          3 1  0  0  X 1 2 3 3 4 5 5 6 7 T       Pk1 Ak∗ = 0 + 1 + 0 0 0 1 k=1   1 2 3 = 3 4 5 = A 5 6 7 (c) Consider 3 X 3 X

akl Pkl

l=1 k=1

a11 P11 + a21 P21 + a31 P31 = a12 P12 + a22 P22 + a32 P32 a13 P13 + a23 P23 + a33 P33      1 0 0 0 0 0 0 0 = 1 0 0 0 + 3 1 0 0 + 5 0 0 0 0 0 0 0 0 1 0      0 1 0 0 0 0 0 + 2 0 0 0 + 4 0 1 0 + 6 0 0 0 0 0 0 0 0      0 0 1 0 0 0 0 + 3 0 0 0 + 5 0 0 1 + 7 0 0 0 0 0 0 0 0   1 2 3 = 3 4 5 = A 5 6 7

 0 0 0 0 0 1 0 0 0

 0 0 0  0 0 1

232

Linear Algebra to Differential Equations

Next the concept of writing a matrix in terms of a vector is introduced.

Converting a matrix to a vector Let P be an m × n matrix represented by its columns as   P = P∗1 · · · P∗l · · · P∗n , where P∗l is a column with m rows, for l = 1, 2, . . . , n. Definition 6.2.6 The vector operator of a matrix P, denoted by Vec P, is defined as a column vector consisting of columns of P. Thus,   P∗1  P∗2    Vec P =  .   ..  P∗n Example 6.2.9 Let P =

 p1 p3

p2 p4

p5 p6



Then,  Vec P = p1

p3

  2 1 Example 6.2.10 Let Q = 4 8 5 3 Then,  Vec Q = 2 4

p2

p4

5

1

p5

8

p6

3

T

T

EXERCISE 6.2 1. Write the primary matrices P11 , P12 , P31 of order 3 using primary column matrices (unit vectors) of order 3. 2. Write I3 in terms of primary matrices  1 if i = j 3. Define Kronecker delta δij = . Show that 0 if i 6= j (i) δij = P1i Pj1 = P1j Pi1 (ii) Pij Pkl = δjk Pil (iii) Pij Pjl Pkl = Pil Pkl = Pil

Kronecker Product

233

4. Let A = (aij )3×3 . Write A as a sum of (i) column partitions of A (ii) row partitions of A. (iii) as entry sum of A using primary matrices (iv) express a32 using primary column 1 matrices. 5. Let P and Q be matrices of order n. Then tr(PQ) = (VecPT )T VecQ. Verify this property for P = (pij )2×2 and Q = (qij )2×2 .

6.3

Kronecker Products

In this section, Kronecker product is introduced and some of its properties are discussed. This concept is also known as a direct product or a tensor product. Consider two matrices P = (pij )m×n and Q = (qij )r×s Definition 6.3.1 The Kronecker product of two matrices P and Q, denoted by P ⊗ Q is defined as a matrix,   p11 Q p12 Q · · · p1n Q  p21 Q p22 Q · · · p2n Q    P⊗Q= .   ..  pm1 Q · · · · · · pmn Q Observation. (i) the (i, j)th entry of P ⊗ Q is the matrix pij Q of order r × s. (ii) The matrix P ⊗ Q has mn blocks and each block is a matrix of the form pij Q. (iii) The order of the matrix P ⊗ Q is mr × ns. This follows as each block has r rows and s columns and there are m rows and n columns each of such blocks. Example 6.3.1 Let A =

 2 3

 4 , 5

B=

 6 7

 8 then 9

 12   14 2B 4B  A⊗B= = 3B 5B 18 21 Observation. A ⊗ B 6= B ⊗ A.

16 18 24 27

24 28 30 35

 32 36 , 40 45

234

Linear Algebra to Differential Equations

Example 6.3.2 Let  A=

 A⊗B=

a11 B a21 B

a11 a21

a12 a22



 and

 a11 b11 a11 b21 a12 B = a21 b11 a22 B a21 b21 

B=

b11 b21

a11 b12 a11 b22 a21 b12 a21 b22

b12 b22

a11 b13 a11 b23 a21 b13 a21 b23

 b13 , b23

then

a12 b11 a12 b21 a22 b11 a22 b21

a12 b12 a12 b22 a22 b12 a22 b22

 a12 b13 a12 b23  . a22 b13  a22 b23

Some of the properties of Kronecker products that are useful in later are stated and proved below. (i) aP ⊗ R = P ⊗ aR = a(P ⊗ R) (ii) Kronecker product is associative P ⊗ (Q ⊗ R) = (P ⊗ Q) ⊗ R (iii) Zero element involving a Kronecker product is given by Omn = Om ⊗ On , where Om is a zero square matrix of order m. (iv) The unit matrix associated with a Kronecker product is given by Imn = Im ⊗ In , where Im is an Identity matrix of order m. (v) Kronecker product is distributive with respect to addition (a) (P + Q) ⊗ R = P ⊗ R + Q ⊗ R (b) P ⊗ (Q + R) = P ⊗ Q + P ⊗ R Proof. (i) By the definition of the Kronecker product   ap11 R ap12 R · · · ap1n R  ap21 R ap22 R · · · ap2n R    aP ⊗ R =  . .. ..  .  . . .  apm1 R apm2 R · · · apmn R   p11 R p12 R · · · p1n R  = a pm1 R pm2 R · · · pmn R = a(P ⊗ R) again by definition of Kronecker product.

Kronecker Product

235

Also 

p12 aR p22 aR .. .

··· ···

pm1 aR pm2 aR  p11 R p12 R  p21 R p22 R  = a . ..  .. . pm1 R pm2 R

···

p11 aR  p21 aR  P ⊗ aR =  .  ..

··· ··· ···

 p1n aR p2n aR   ..  .  pmn aR  p1n R p2n R   ..  .  pmn R

= a(P ⊗ R) (ii) This proof is done by considering the (i, j)th block of the Kronecker products. Consider L.H.S P ⊗ (Q ⊗ R) the (i, j)th block of P ⊗ (Q ⊗ R)  q11 R  q21 R  = pij (Q ⊗ R) = pij  .  ..

q12 R q22 R .. .

··· ···

 q1n R q2n R   ..  . 

qm1 R qm2 R · · · qmn R  pij q11 R pij q12 R · · · pij q1n R  pij q21 R pij q22 R · · · pij q2n R    =  .. .. ..   . . . pij qm1 R pij qm2 R · · · pij qmn R 

(6.8)

Consider R.H.S (P ⊗ Q) ⊗ R the (i, j)th block of (P ⊗ Q) ⊗ R = (i, j)th element of (P ⊗ Q) ⊗ R   pij q11 pij q12 · · · pij q1n  pij q21 pij q22 · · · pij q2n    = . .. ..  ⊗ R  .. .  . pij qm1 pij qm2 · · · pij qmn   pij q11 R pij q12 R · · · pij q1n R  pij q21 R pij q22 R · · · pij q2n R    =  .. .. ..   . . . pij qm1 R

pij qm2 R · · ·

pij qmn R

Since (6.8) and (6.9) are equal the result follows.

(6.9)

236

Linear Algebra to Differential Equations

(iii) Property (iii) is clear and so the proof is not given. (iv) Property (iv) can be easily proved and hence is avoided. (v) Consider the (i, j)th block of (P + Q) ⊗ R = (pij + qij )R = pij R + qij R = (i, j)th block of P ⊗ R + (i, j)th block of Q ⊗ R This holds for every i = 1, 2, . . . , m and j = 1, 2, . . . , n. Hence the proof. The other distributive property can be proved in a similar fashion.       p1 p2 q1 q2 r Example 6.3.3 Let P = ,Q= and R = 1 p3 p4 q3 q4 r2 Verify the properties (i), (ii), (iv)   ap1 r1 ap2 r1     ap1 r2 ap2 r2  ap1 ap2 r  (i) Let a ∈ R aP ⊗ R = ⊗ 1 = ap3 r1 ap4 r1  ap3 ap4 r2 ap3 r2 ap4 r2   p1 R p2 R =a = a (P ⊗ R) p3 R p4 R Similarly     p p2 ar1 P ⊗ aR = 1 ⊗ p3 p4 ar2   p1 ar1 p2 ar1 p1 ar2 p2 ar2   = p3 ar1 p4 ar1  p3 ar2 p4 ar2   p R p2 R =a 1 = a (P ⊗ R) p3 R p4 R       p p2 q1 q2 r L.H.S = P ⊗ (Q ⊗ R) = 1 ⊗ ⊗ 1 (ii) p3 p4 q3 q4 r2   q1 r1 q2 r1   q1 r2 q2 r2  p1 p2  = ⊗ q3 r1 q4 r1  p3 p4 q3 r2 q4 r2   p1 q1 r1 p1 q2 r1 p2 q1 r1 p2 q2 r1 p1 q1 r2 p1 q2 r2 p2 q1 r2 p2 q2 r2    p1 q3 r1 p1 q4 r1 p2 q3 r1 p2 q4 r1    p1 q3 r2 p1 q4 r2 p2 q3 r2 p2 q4 r2    =  p3 q1 r1 p3 q2 r1 p4 q1 r1 p4 q2 r1  p3 q1 r2 p3 q2 r2 p4 q1 r2 p4 q2 r2    p3 q3 r1 p3 q4 r1 p4 q3 r1 p4 q4 r1  p3 q3 r2 p3 q4 r2 p4 q3 r2 p4 q4 r2

Kronecker Product

237

R.H.S = (P ⊗ Q) ⊗ R       p1 p2 q1 q2 r = ⊗ ⊗ 1 p3 p4 q3 q4 r2     p1 q1 p1 q2 p2 q1 p2 q2 r1 p1 q3 p1 q4 p2 q3 p2 q4  ⊗  = p3 q1 p3 q2 p4 q1 p4 q2  r2 p3 q3 p3 q4 p4 q3 p4 q4   p1 q1 r1 p1 q2 r1 p2 q1 r1 p2 q2 r1 p1 q1 r2 p1 q2 r2 p2 q1 r2 p2 q2 r2    p1 q3 r1 p1 q4 r1 p2 q3 r1 p2 q4 r1    p1 q3 r2 p1 q4 r2 p2 q3 r2 p2 q4 r2    =  p3 q1 r1 p3 q2 r1 p4 q1 r1 p4 q2 r1  p3 q1 r2 p3 q2 r2 p4 q1 r2 p4 q2 r2    p3 q3 r1 p3 q4 r1 p4 q3 r1 p4 q4 r1  p3 q3 r2 p3 q4 r2 p4 q3 r2 p4 q4 r2 Since L.H.S = R.H.S, the associative law P ⊗ (Q ⊗ R) = (P ⊗ Q) ⊗ R is verified. (v) (a) (P + Q) ⊗ R = (P ⊗ R) + (Q ⊗ R) Since P and Q are of same order the addition on both sides holds.   p1 + q1 p2 + q2 P+Q= p3 + q3 p4 + q4     p1 + q1 p2 + q2 r (P + Q) ⊗ R = ⊗ 1 p3 + q3 p4 + q4 r2   (p1 + q1 )r1 (p2 + q2 )r1 (p1 + q1 )r2 (p2 + q2 )r2   = (p3 + q3 )r1 (p4 + q4 )r1  (p3 + q3 )r2 (p4 + q4 )r2     p1 r1 p2 r1 q1 r1 q2 r1 p1 r2 p2 r2  q1 r2 q2 r2     = p3 r1 p4 r1  + q3 r1 q4 r1  p3 r2 p4 r2 q3 r2 q4 r2     p1 R p2 R q1 R q2 R = + p3 R p4 R q3 R q4 R = (P ⊗ R) + (Q ⊗ R) (b) The order of the matrices Q and R is not compatible for addition. Hence, the second distributive law does not exist.

238

Linear Algebra to Differential Equations

EXERCISE 6.3 

   2 3 3 2 1. Let A = , B= then verify −1 4 −4 1 A ⊗ αB = α(A ⊗ B), where α ∈ R.     1 4 2 3 2. If A = , B= then 3 2 −1 5 (i) calculate 3A ⊗ 4B and verify that (ii) 3A ⊗ 4B = 12(A ⊗ B).     1 4 2 1 3. Given A = ,B= find the Kronecker product of A ⊗ B 2 3 3 2 and B ⊗ A     2 1 −1 1 4. If A = , B= then calculate A ⊗ B and B ⊗ A; 0 1 2 0       1 −1 2 3 3 4 5. Let A = , B= ,C = . Verify that 2 3 −1 2 2 1 (i) (A ⊗ B) ⊗ C = A ⊗ (B ⊗ C); (ii) (A + B) ⊗ C = (A ⊗ C) + (B ⊗ C); (iii) A ⊗ (B + C) = (A ⊗ B) + (A ⊗ C).

6.4

Further Properties of Kronecker Products

In this section, the transpose of a Kronecker product and the mixed product rule will be obtained. Further, using the mixed product rule relations between eigen-values and eigen-vectors of a Kronecker product and the inverse of a Kronecker product will be established. Consider the matrices P = (pij )m×n , Q = (qij )p×q , R = (rij )n×k and S = (sij )q×l in the given order. (1) Transpose property of Kronecker product (P ⊗ Q)T = PT ⊗ QT.


Proof.
\[
P \otimes Q = \begin{pmatrix}
p_{11}Q & p_{12}Q & \cdots & p_{1n}Q \\
p_{21}Q & p_{22}Q & \cdots & p_{2n}Q \\
\vdots & \vdots & & \vdots \\
p_{m1}Q & p_{m2}Q & \cdots & p_{mn}Q
\end{pmatrix},
\quad\text{hence}\quad
(P \otimes Q)^T = \begin{pmatrix}
p_{11}Q^T & p_{21}Q^T & \cdots & p_{m1}Q^T \\
p_{12}Q^T & p_{22}Q^T & \cdots & p_{m2}Q^T \\
\vdots & \vdots & & \vdots \\
p_{1n}Q^T & p_{2n}Q^T & \cdots & p_{mn}Q^T
\end{pmatrix}
= P^T \otimes Q^T,
\]
by the definition of the Kronecker product.

Observation.
1. The order of $P \otimes Q$ is $mp \times nq$, so the order of $(P \otimes Q)^T$ is $nq \times mp$. The order of $P^T$ is $n \times m$, that of $Q^T$ is $q \times p$, and that of $P^T \otimes Q^T$ is $nq \times mp$.
2. In ordinary multiplication of matrices, $(PQ)^T = Q^T P^T$.

Example 6.4.1 Verify the transpose property for
\[
P = \begin{pmatrix} p_1 & p_2 \\ p_3 & p_4 \end{pmatrix} \quad\text{and}\quad Q = \begin{pmatrix} q_1 \\ q_2 \end{pmatrix}.
\]
Solution. Given
\[
P \otimes Q = \begin{pmatrix} p_1q_1 & p_2q_1 \\ p_1q_2 & p_2q_2 \\ p_3q_1 & p_4q_1 \\ p_3q_2 & p_4q_2 \end{pmatrix},
\qquad
(P \otimes Q)^T = \begin{pmatrix} p_1q_1 & p_1q_2 & p_3q_1 & p_3q_2 \\ p_2q_1 & p_2q_2 & p_4q_1 & p_4q_2 \end{pmatrix}.
\]
Also
\[
P^T = \begin{pmatrix} p_1 & p_3 \\ p_2 & p_4 \end{pmatrix}, \quad Q^T = \begin{pmatrix} q_1 & q_2 \end{pmatrix},
\qquad
P^T \otimes Q^T = \begin{pmatrix} p_1q_1 & p_1q_2 & p_3q_1 & p_3q_2 \\ p_2q_1 & p_2q_2 & p_4q_1 & p_4q_2 \end{pmatrix}.
\]

The result is verified.

Example 6.4.2 Verify the transpose property for
\[
A = \begin{pmatrix} 2 & 3 \\ 1 & 4 \end{pmatrix}, \qquad B = \begin{pmatrix} 1 & 2 \\ 2 & 3 \end{pmatrix}.
\]


Solution:
\[
A \otimes B = \begin{pmatrix}
2 & 4 & 3 & 6 \\
4 & 6 & 6 & 9 \\
1 & 2 & 4 & 8 \\
2 & 3 & 8 & 12
\end{pmatrix},
\qquad
(A \otimes B)^T = \begin{pmatrix}
2 & 4 & 1 & 2 \\
4 & 6 & 2 & 3 \\
3 & 6 & 4 & 8 \\
6 & 9 & 8 & 12
\end{pmatrix}.
\]
Further,
\[
A^T = \begin{pmatrix} 2 & 1 \\ 3 & 4 \end{pmatrix}, \qquad
B^T = \begin{pmatrix} 1 & 2 \\ 2 & 3 \end{pmatrix},
\qquad\text{and}\qquad
A^T \otimes B^T = \begin{pmatrix}
2 & 4 & 1 & 2 \\
4 & 6 & 2 & 3 \\
3 & 6 & 4 & 8 \\
6 & 9 & 8 & 12
\end{pmatrix}.
\]

The result is verified.

(2) Mixed product rule: $(P \otimes Q)(R \otimes S) = PR \otimes QS$.

Proof. First observe that the matrix products are compatible and that the L.H.S. and R.H.S. give rise to matrices of the same order:
L.H.S.: $(P \otimes Q)_{mp\times nq}(R \otimes S)_{nq\times kl} = \big((P \otimes Q)(R \otimes S)\big)_{mp\times kl}$;
R.H.S.: $(PR)_{m\times k} \otimes (QS)_{p\times l} = (PR \otimes QS)_{mp\times kl}$.
The proof of the rule is as follows. Consider the L.H.S.:
\[
(P \otimes Q)(R \otimes S) =
\begin{pmatrix}
p_{11}Q & p_{12}Q & \cdots & p_{1n}Q \\
p_{21}Q & p_{22}Q & \cdots & p_{2n}Q \\
\vdots & \vdots & & \vdots \\
p_{m1}Q & p_{m2}Q & \cdots & p_{mn}Q
\end{pmatrix}
\begin{pmatrix}
r_{11}S & r_{12}S & \cdots & r_{1k}S \\
r_{21}S & r_{22}S & \cdots & r_{2k}S \\
\vdots & \vdots & & \vdots \\
r_{n1}S & r_{n2}S & \cdots & r_{nk}S
\end{pmatrix}.
\]
Now the $(i, j)$th block of $(P \otimes Q)(R \otimes S)$ is
\[
p_{i1}Qr_{1j}S + p_{i2}Qr_{2j}S + \cdots + p_{in}Qr_{nj}S = \sum_{t=1}^{n} p_{it}r_{tj}\,QS. \tag{6.10}
\]
Next consider the R.H.S. product $PR \otimes QS$.


The $(i, j)$th element of the product $PR$ is given by $\sum_{t=1}^{n} p_{it}r_{tj}$, and hence the $(i, j)$th block of the Kronecker product $PR \otimes QS$ is
\[
\sum_{t=1}^{n} p_{it}r_{tj}\,QS. \tag{6.11}
\]
The result follows from (6.10) and (6.11).

Example 6.4.3 Verify the mixed product rule for
\[
P = \begin{pmatrix} 2 \\ 3 \end{pmatrix}_{2\times 1}, \qquad
Q = \begin{pmatrix} 1 & 2 & 3 \end{pmatrix}_{1\times 3}, \qquad
R = \begin{pmatrix} 1 & 4 \end{pmatrix}_{1\times 2}
\]

and $S = \begin{pmatrix} 1 & 0 & 1 \end{pmatrix}^T_{3\times 1}$.

Solution.
\[
P \otimes Q = \begin{pmatrix} 2 \\ 3 \end{pmatrix} \otimes \begin{pmatrix} 1 & 2 & 3 \end{pmatrix}
= \begin{pmatrix} 2 & 4 & 6 \\ 3 & 6 & 9 \end{pmatrix},
\qquad
R \otimes S = \begin{pmatrix} 1 & 4 \end{pmatrix} \otimes \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix}
= \begin{pmatrix} 1 & 4 \\ 0 & 0 \\ 1 & 4 \end{pmatrix}.
\]
\[
\text{L.H.S.} = (P \otimes Q)(R \otimes S)
= \begin{pmatrix} 2 & 4 & 6 \\ 3 & 6 & 9 \end{pmatrix}
\begin{pmatrix} 1 & 4 \\ 0 & 0 \\ 1 & 4 \end{pmatrix}
= \begin{pmatrix} 8 & 32 \\ 12 & 48 \end{pmatrix}.
\]
Now
\[
PR = \begin{pmatrix} 2 \\ 3 \end{pmatrix}\begin{pmatrix} 1 & 4 \end{pmatrix} = \begin{pmatrix} 2 & 8 \\ 3 & 12 \end{pmatrix},
\qquad
QS = \begin{pmatrix} 1 & 2 & 3 \end{pmatrix}\begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 4 \end{pmatrix},
\]
\[
PR \otimes QS = \begin{pmatrix} 2 & 8 \\ 3 & 12 \end{pmatrix} \otimes \begin{pmatrix} 4 \end{pmatrix}
= \begin{pmatrix} 8 & 32 \\ 12 & 48 \end{pmatrix}.
\]
The result is verified.
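Both rules established in this section can be confirmed numerically in a few lines. A minimal sketch, assuming NumPy; the matrices below are arbitrary illustrative choices, sized only so that the products PR and QS are defined.

```python
import numpy as np

# Illustrative matrices: P (2x3), Q (2x2), R (3x2), S (2x4).
P = np.arange(6.0).reshape(2, 3)
Q = np.array([[1., 2.], [0., 3.]])
R = np.arange(6.0, 12.0).reshape(3, 2)
S = np.arange(8.0).reshape(2, 4)

# (1) transpose property: (P ⊗ Q)^T = P^T ⊗ Q^T
assert np.allclose(np.kron(P, Q).T, np.kron(P.T, Q.T))

# (2) mixed product rule: (P ⊗ Q)(R ⊗ S) = (PR) ⊗ (QS)
assert np.allclose(np.kron(P, Q) @ np.kron(R, S), np.kron(P @ R, Q @ S))

print("transpose and mixed product rules verified")
```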


(3) The inverse of the Kronecker product of nonsingular matrices $P_{m\times m}$ and $Q_{n\times n}$ is
\[
(P \otimes Q)^{-1} = P^{-1} \otimes Q^{-1}.
\]
The result follows from the mixed product rule by considering
$(P \otimes Q)(P^{-1} \otimes Q^{-1}) = PP^{-1} \otimes QQ^{-1} = I_m \otimes I_n$.

Example 6.4.4 Let $P = \begin{pmatrix} 1 & 2 \\ 2 & 3 \end{pmatrix}$ and $Q = \begin{pmatrix} 1 & 1 \\ 3 & 4 \end{pmatrix}$. Verify the inverse property.

Solution.
\[
P^{-1} = \begin{pmatrix} -3 & 2 \\ 2 & -1 \end{pmatrix}
\quad\text{and}\quad
Q^{-1} = \begin{pmatrix} 4 & -1 \\ -3 & 1 \end{pmatrix}.
\]
Now
\[
P \otimes Q = \begin{pmatrix}
1 & 1 & 2 & 2 \\
3 & 4 & 6 & 8 \\
2 & 2 & 3 & 3 \\
6 & 8 & 9 & 12
\end{pmatrix},
\qquad
P^{-1} \otimes Q^{-1} = \begin{pmatrix}
-12 & 3 & 8 & -2 \\
9 & -3 & -6 & 2 \\
8 & -2 & -4 & 1 \\
-6 & 2 & 3 & -1
\end{pmatrix},
\]
\[
(P \otimes Q)(P^{-1} \otimes Q^{-1}) = \begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
= I_2 \otimes I_2.
\]
Hence the result is verified.

(4) The eigen-values and eigen-vectors of a Kronecker product. Let $P_{m\times m}$ and $Q_{n\times n}$ be given. Further, suppose that $\{\lambda_i\}$, $i = 1, 2, \ldots, m$, and $\{\mu_j\}$, $j = 1, 2, \ldots, n$, are the eigen-values of $P$ and $Q$, with corresponding eigen-vectors $\{x_i\}$ and $\{y_j\}$, respectively. Then the eigen-values and eigen-vectors of the Kronecker product $P \otimes Q$ are $\{\lambda_i\mu_j\}$ and $x_i \otimes y_j$, $i = 1, 2, \ldots, m$, $j = 1, 2, \ldots, n$, respectively.

Proof. Consider $(P \otimes Q)(x_i \otimes y_j)$ for some arbitrary fixed $i$ and $j$. Using the mixed product rule,
\[
(P \otimes Q)(x_i \otimes y_j) = Px_i \otimes Qy_j = \lambda_i x_i \otimes \mu_j y_j = (\lambda_i\mu_j)\,x_i \otimes y_j,
\]


which implies that $\lambda_i\mu_j$ is an eigen-value of $P \otimes Q$ with corresponding eigen-vector $x_i \otimes y_j$. Since the result holds for all $i$ and all $j$, the property is proved.

Example 6.4.5 Let $P = \begin{pmatrix} 5 & 2 \\ 2 & 2 \end{pmatrix}$ and $Q = \begin{pmatrix} 2 & 1 \\ 0 & 1 \end{pmatrix}$. Verify property (4).

Solution. The eigen-values and eigen-vectors of $P$ are $\lambda_1 = 1$, $\lambda_2 = 6$ with $x_1 = \begin{pmatrix} -1 \\ 2 \end{pmatrix}$ and $x_2 = \begin{pmatrix} 2 \\ 1 \end{pmatrix}$. The eigen-values and eigen-vectors of $Q$ are $\mu_1 = 1$, $\mu_2 = 2$ with $y_1 = \begin{pmatrix} -1 \\ 1 \end{pmatrix}$ and $y_2 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$. Now
\[
P \otimes Q = \begin{pmatrix}
10 & 5 & 4 & 2 \\
0 & 5 & 0 & 2 \\
4 & 2 & 4 & 2 \\
0 & 2 & 0 & 2
\end{pmatrix},
\]
whose eigen-values are $1, 2, 6, 12$ and whose corresponding eigen-vectors are
\[
\begin{pmatrix} 1 \\ -1 \\ -2 \\ 2 \end{pmatrix},\quad
\begin{pmatrix} -1 \\ 0 \\ 2 \\ 0 \end{pmatrix},\quad
\begin{pmatrix} -2 \\ 2 \\ -1 \\ 1 \end{pmatrix},\quad
\begin{pmatrix} 2 \\ 0 \\ 1 \\ 0 \end{pmatrix}.
\]
Property (4) can now be verified directly:
\[
\lambda_1\mu_1 = 1,\quad x_1 \otimes y_1 = \begin{pmatrix} -1 \\ 2 \end{pmatrix} \otimes \begin{pmatrix} -1 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ -1 \\ -2 \\ 2 \end{pmatrix};
\qquad
\lambda_1\mu_2 = 2,\quad x_1 \otimes y_2 = \begin{pmatrix} -1 \\ 2 \end{pmatrix} \otimes \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} -1 \\ 0 \\ 2 \\ 0 \end{pmatrix};
\]
\[
\lambda_2\mu_1 = 6,\quad x_2 \otimes y_1 = \begin{pmatrix} 2 \\ 1 \end{pmatrix} \otimes \begin{pmatrix} -1 \\ 1 \end{pmatrix} = \begin{pmatrix} -2 \\ 2 \\ -1 \\ 1 \end{pmatrix};
\qquad
\lambda_2\mu_2 = 12,\quad x_2 \otimes y_2 = \begin{pmatrix} 2 \\ 1 \end{pmatrix} \otimes \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 2 \\ 0 \\ 1 \\ 0 \end{pmatrix}.
\]

(5) Suppose $P$ and $Q$ are arbitrary square matrices of sizes $m$ and $n$, respectively. Then
\[
\det(P \otimes Q) = (\det P)^n (\det Q)^m.
\]
Proof. Let $\lambda_1, \lambda_2, \ldots, \lambda_m$ and $\mu_1, \mu_2, \ldots, \mu_n$ be the eigen-values of $P$ and $Q$, respectively.


Then $\det P = \lambda_1\lambda_2\cdots\lambda_m$ and $\det Q = \mu_1\mu_2\cdots\mu_n$. Further,
\[
\det(P \otimes Q) = \prod_{i,j} \lambda_i\mu_j
= \big[(\lambda_1\mu_1)(\lambda_1\mu_2)\cdots(\lambda_1\mu_n)\big]\big[(\lambda_2\mu_1)\cdots(\lambda_2\mu_n)\big]\cdots\big[(\lambda_m\mu_1)\cdots(\lambda_m\mu_n)\big]
\]
\[
= (\lambda_1)^n(\mu_1\mu_2\cdots\mu_n)\,(\lambda_2)^n(\mu_1\mu_2\cdots\mu_n)\cdots(\lambda_m)^n(\mu_1\mu_2\cdots\mu_n)
= (\lambda_1\cdots\lambda_m)^n(\mu_1\cdots\mu_n)^m
= (\det P)^n(\det Q)^m,
\]
proving the result.

Example 6.4.6 Verify property (5) for the matrices
\[
P = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}, \qquad Q = \begin{pmatrix} 3 & 3 \\ 1 & 2 \end{pmatrix},
\]
for which $\det P = 3$ and $\det Q = 3$. Here
\[
P \otimes Q = \begin{pmatrix}
6 & 6 & 3 & 3 \\
2 & 4 & 1 & 2 \\
3 & 3 & 6 & 6 \\
1 & 2 & 2 & 4
\end{pmatrix}.
\]
Interchanging $R_1 \leftrightarrow R_4$ (which changes the sign of the determinant) gives
\[
\begin{pmatrix}
1 & 2 & 2 & 4 \\
2 & 4 & 1 & 2 \\
3 & 3 & 6 & 6 \\
6 & 6 & 3 & 3
\end{pmatrix}.
\]
Applying $R_2 \to R_2 - 2R_1$, $R_3 \to R_3 - 3R_1$, $R_4 \to R_4 - 6R_1$, and then $R_4 \to R_4 - 2R_3$, $R_4 \to R_4 - 3R_2$, gives
\[
\begin{pmatrix}
1 & 2 & 2 & 4 \\
0 & 0 & -3 & -6 \\
0 & -3 & 0 & -6 \\
0 & 0 & 0 & 9
\end{pmatrix}.
\]
Expanding along the first column, and then along the last row of the resulting $3 \times 3$ minor, the determinant of this matrix is
\[
1 \cdot \det\begin{pmatrix} 0 & -3 & -6 \\ -3 & 0 & -6 \\ 0 & 0 & 9 \end{pmatrix}
= 9\,\det\begin{pmatrix} 0 & -3 \\ -3 & 0 \end{pmatrix} = 9(-9) = -81,
\]
so that, accounting for the sign change from the row interchange,
\[
\det(P \otimes Q) = -(-81) = 81 = 3^2 \cdot 3^2 = (\det P)^2(\det Q)^2.
\]
Hence the property is verified.
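Properties (3), (4) and (5) also admit a quick numerical check. A minimal sketch, assuming NumPy, with the matrices of Examples 6.4.4–6.4.6:

```python
import numpy as np

# (3) inverse: (P ⊗ Q)^{-1} = P^{-1} ⊗ Q^{-1}   (Example 6.4.4)
P = np.array([[1., 2.], [2., 3.]])
Q = np.array([[1., 1.], [3., 4.]])
assert np.allclose(np.linalg.inv(np.kron(P, Q)),
                   np.kron(np.linalg.inv(P), np.linalg.inv(Q)))

# (4) eigen-values of P ⊗ Q are the products λ_i μ_j   (Example 6.4.5)
P2 = np.array([[5., 2.], [2., 2.]])
Q2 = np.array([[2., 1.], [0., 1.]])
lam = np.linalg.eigvals(P2)
mu = np.linalg.eigvals(Q2)
products = np.sort_complex(np.outer(lam, mu).ravel())
assert np.allclose(products, np.sort_complex(np.linalg.eigvals(np.kron(P2, Q2))))

# (5) determinant: det(P ⊗ Q) = (det P)^n (det Q)^m, here m = n = 2   (Example 6.4.6)
P3 = np.array([[2., 1.], [1., 2.]])
Q3 = np.array([[3., 3.], [1., 2.]])
assert np.isclose(np.linalg.det(np.kron(P3, Q3)),
                  np.linalg.det(P3)**2 * np.linalg.det(Q3)**2)

print("inverse, eigen-value and determinant properties verified")
```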


EXERCISE 6.4

1. Let $A = \begin{pmatrix} 2 & 1 \\ 0 & 1 \end{pmatrix}$ and $B = \begin{pmatrix} -1 & 1 \\ 2 & 0 \end{pmatrix}$. Verify
   (i) $(A \otimes B)^T = A^T \otimes B^T$;
   (ii) $(A \otimes B)^{-1} = A^{-1} \otimes B^{-1}$.
2. Verify the mixed product rule using the matrices
   $A = \begin{pmatrix} 5 & 4 \\ 3 & 2 \end{pmatrix}$, $B = \begin{pmatrix} 3 & 2 \\ -4 & -1 \end{pmatrix}$, $C = \begin{pmatrix} 1 & 2 \\ 4 & 3 \end{pmatrix}$ and $D = \begin{pmatrix} 3 & 5 \\ 6 & 1 \end{pmatrix}$.
3. Verify the transpose property of the Kronecker product for
   $A = \begin{pmatrix} 1 & 4 \\ 3 & 2 \end{pmatrix}$ and $B = \begin{pmatrix} 2 & -1 \\ 3 & 5 \end{pmatrix}$.
4. Verify the inverse property of the Kronecker product using
   $A = \begin{pmatrix} 2 & 5 \\ 1 & 3 \end{pmatrix}$ and $B = \begin{pmatrix} 2 & 1 \\ 3 & 2 \end{pmatrix}$.
5. Let $A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$ and $B = \begin{pmatrix} 2 & 1 \\ -4 & 3 \end{pmatrix}$. Verify the determinant property.
6. Let $A = \begin{pmatrix} 1 & 0 \\ 1 & 2 \end{pmatrix}$ and $B = \begin{pmatrix} 2 & 1 \\ 0 & 3 \end{pmatrix}$. Verify the eigen-value property for $A \otimes B$.

6.5 Kronecker Product of Two Linear Transformations

In this section, two linear transformations are considered and their Kronecker product is described. Let $P = (p_{ij})_{m\times n}$ and $Q = (q_{ij})_{k\times l}$ be two rectangular matrices and let $x = Py$ and $z = Qw$ be the linear transformations defined by them, so that $x = (x_1\ x_2\ \ldots\ x_m)^T$, $y = (y_1\ y_2\ \ldots\ y_n)^T$, $z = (z_1\ z_2\ \ldots\ z_k)^T$ and $w = (w_1\ w_2\ \ldots\ w_l)^T$. The Kronecker product of the two linear transformations is
\[
x \otimes z = Py \otimes Qw = (P \otimes Q)(y \otimes w),
\]
by the mixed product rule.

Observation. $x \otimes z$ is an $mk \times 1$ vector,


$P \otimes Q$ is an $mk \times nl$ matrix, and $y \otimes w$ is an $nl \times 1$ vector. Hence the Kronecker product of the linear transformations is compatible.

Example 6.5.1 Find the Kronecker product of the linear transformations involving the matrices
\[
P = \begin{pmatrix} p_1 & p_2 \\ p_3 & p_4 \end{pmatrix} \quad\text{and}\quad Q = \begin{pmatrix} q_1 & q_2 \\ q_3 & q_4 \end{pmatrix}
\]
given by $x = Py$ and $z = Qw$, where $x, y, z, w$ are suitable vectors.

Solution. The Kronecker product of the linear transformations is $x \otimes z = Py \otimes Qw$, where $x = (x_1\ x_2)^T$, $y = (y_1\ y_2)^T$, $z = (z_1\ z_2)^T$ and $w = (w_1\ w_2)^T$. Using the mixed product rule, $x \otimes z = Py \otimes Qw = (P \otimes Q)(y \otimes w)$. Thus
\[
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \otimes \begin{pmatrix} z_1 \\ z_2 \end{pmatrix}
= \left[\begin{pmatrix} p_1 & p_2 \\ p_3 & p_4 \end{pmatrix} \otimes \begin{pmatrix} q_1 & q_2 \\ q_3 & q_4 \end{pmatrix}\right]
\left[\begin{pmatrix} y_1 \\ y_2 \end{pmatrix} \otimes \begin{pmatrix} w_1 \\ w_2 \end{pmatrix}\right],
\]
that is,
\[
\begin{pmatrix} x_1z_1 \\ x_1z_2 \\ x_2z_1 \\ x_2z_2 \end{pmatrix}
= \begin{pmatrix}
p_1q_1 & p_1q_2 & p_2q_1 & p_2q_2 \\
p_1q_3 & p_1q_4 & p_2q_3 & p_2q_4 \\
p_3q_1 & p_3q_2 & p_4q_1 & p_4q_2 \\
p_3q_3 & p_3q_4 & p_4q_3 & p_4q_4
\end{pmatrix}
\begin{pmatrix} y_1w_1 \\ y_1w_2 \\ y_2w_1 \\ y_2w_2 \end{pmatrix}.
\]

Example 6.5.2 Let
\[
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 3 & 0 & 0 \\ 1 & 2 & 0 \\ 2 & 0 & 1 \end{pmatrix}\begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix}
\quad\text{and}\quad
\begin{pmatrix} z_1 \\ z_2 \end{pmatrix} = \begin{pmatrix} 1 & 2 \\ 2 & 3 \end{pmatrix}\begin{pmatrix} w_1 \\ w_2 \end{pmatrix}.
\]
Find the Kronecker product of the above linear transformations.

Solution. The Kronecker product of the linear transformations is
\[
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} \otimes \begin{pmatrix} z_1 \\ z_2 \end{pmatrix}
= \left[\begin{pmatrix} 3 & 0 & 0 \\ 1 & 2 & 0 \\ 2 & 0 & 1 \end{pmatrix} \otimes \begin{pmatrix} 1 & 2 \\ 2 & 3 \end{pmatrix}\right]
\left[\begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} \otimes \begin{pmatrix} w_1 \\ w_2 \end{pmatrix}\right],
\]
by the mixed product rule. Thus
\[
\begin{pmatrix} x_1z_1 \\ x_1z_2 \\ x_2z_1 \\ x_2z_2 \\ x_3z_1 \\ x_3z_2 \end{pmatrix}
= \begin{pmatrix}
3 & 6 & 0 & 0 & 0 & 0 \\
6 & 9 & 0 & 0 & 0 & 0 \\
1 & 2 & 2 & 4 & 0 & 0 \\
2 & 3 & 4 & 6 & 0 & 0 \\
2 & 4 & 0 & 0 & 1 & 2 \\
4 & 6 & 0 & 0 & 2 & 3
\end{pmatrix}
\begin{pmatrix} y_1w_1 \\ y_1w_2 \\ y_2w_1 \\ y_2w_2 \\ y_3w_1 \\ y_3w_2 \end{pmatrix}.
\]
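Example 6.5.2 can be reproduced numerically: if $x = Ay$ and $z = Bw$, then $x \otimes z$ must equal $(A \otimes B)(y \otimes w)$. A minimal sketch, assuming NumPy; the vectors y and w below are arbitrary test data.

```python
import numpy as np

A = np.array([[3., 0., 0.], [1., 2., 0.], [2., 0., 1.]])
B = np.array([[1., 2.], [2., 3.]])

rng = np.random.default_rng(0)
y = rng.standard_normal(3)
w = rng.standard_normal(2)

x = A @ y
z = B @ w

lhs = np.kron(x, z)                      # Kronecker product of the images
rhs = np.kron(A, B) @ np.kron(y, w)      # (A ⊗ B) acting on y ⊗ w
assert np.allclose(lhs, rhs)

print(np.kron(A, B))   # the 6 x 6 coefficient matrix displayed above
```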


EXERCISE 6.5

1. Consider the linear transformations $x = Ay$, $z = Bw$ with $A = (a_{ij})_{2\times 2}$ and $B = (b_{ij})_{2\times 2}$ and the vectors $x = [x_1, x_2]^T$, $y = [y_1, y_2]^T$, $z = [z_1, z_2]^T$ and $w = [w_1, w_2]^T$. Find the Kronecker product of the two linear transformations $x$ and $z$.
2. Let $x = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} y$ and $z = \begin{pmatrix} -1 & 0 \\ 2 & -1 \end{pmatrix} w$. Set up the vectors $x, y, z$ and $w$ and then find the Kronecker product of the two linear transformations $x$ and $z$.
3. Consider $A = \begin{pmatrix} 1 & 4 \\ 2 & 3 \end{pmatrix}$ and $B = \begin{pmatrix} 2 & 1 \\ 3 & 2 \end{pmatrix}$. Define linear transformations using $A$ and $B$ as in Exercise 1 and find the Kronecker product of the two linear transformations so obtained.

6.6 Kronecker Product and Vector Operators

An important result in this section deals with expressing a vector operator of a product of 3 matrices as a Kronecker product involving the concerned matrices. This requires the kth column of the product of the 3 matrices. The following sequence of simple problems are useful in reaching the goal. Let Pkl , P!k , Pl! be primary matrices defined in Section 6.2. Example 6.6.1 Given any matrix Pm×n , show that PPkl = P∗k P!l and hence deduce PPll = P∗l P!l . Solution. Consider PPkl = PPk1 P1l = P∗k P!l , using (6.2). Set k = l Then PP!l = P∗l P!l . Example 6.6.2 Show that PPkl Q = P∗k QTl∗ . Solution. Consider PPkl Q = (PPk1 ) (P!l Q) = P∗k QTl∗ ,

using (6.2) and (6.3).

Example 6.6.3 Show that Pkl PPrs = plr Pks and hence deduce Pll PPrr = Plr Plr .


Solution. Consider
\[
P_{kl}PP_{rs} = P_{k1}\,(P_{1l}PP_{r1})\,P_{1s} = P_{k1}\,(p_{lr})\,P_{1s} = p_{lr}\,P_{k1}P_{1s} = p_{lr}\,P_{ks},
\]
on using (6.6) and (6.1), respectively. Setting $k = l$ and $s = r$ gives $P_{ll}PP_{rr} = p_{lr}P_{lr}$.

Example 6.6.4 Express the product PQ using primary matrices.

Solution. Assuming that P and Q are compatible so that the product PQ is well defined, using (6.4) and the relation in Example 6.6.1,
\[
P = \sum_j P_{*j}P_{1j} = \sum_j PP_{jj}.
\]
Thus
\[
PQ = \sum_j (PP_{jj})Q = \sum_j (PP_{j!})(P_{!j}Q) = \sum_j P_{*j}Q^T_{j*}.
\]

Example 6.6.5 Find the kth column of the product PQ.

Solution. Consider
\[
(PQ)_{*k} = PQP_{k1} = P(QP_{k1}) = PQ_{*k}
= \sum_j (PP_{jj})Q_{*k}
= \sum_j PP_{j1}P_{1j}Q_{*k}
= \sum_j P_{*j}\,q_{jk}
= \sum_j (q_{jk})P_{*j},
\]
on using the definition of the kth column of a matrix, Example 6.6.1 and (6.1).


Example 6.6.6 Find the kth column of the product of three matrices P, Y and Q.

Solution. The result can be obtained from the above examples. Consider
\[
(PYQ)_{*k} = (PYQ)P_{k1} = (PY)Q_{*k} = \sum_j (PY)_{*j}\,q_{jk} = \sum_j (q_{jk}P)\,Y_{*j}.
\]

Now the main result of this section is as follows.

Result 6.6.1 $\mathrm{Vec}(PYQ) = (Q^T \otimes P)\,\mathrm{Vec}\,Y$.

Proof. Suppose that $P_{m\times n}$, $Y_{n\times r}$ and $Q_{r\times s}$ are matrices, so that the product $(PYQ)_{m\times s}$ exists. The kth column of PYQ is obtained from Example 6.6.6 as
\[
(PYQ)_{*k} = \sum_j (q_{jk}P)\,Y_{*j}
= \begin{pmatrix} q_{1k}P & q_{2k}P & \cdots & q_{rk}P \end{pmatrix}
\begin{pmatrix} Y_{*1} \\ Y_{*2} \\ \vdots \\ Y_{*r} \end{pmatrix}
= \big([Q_{*k}]^T \otimes P\big)\,\mathrm{Vec}\,Y.
\]
Hence
\[
\mathrm{Vec}(PYQ) = \begin{pmatrix}
[Q_{*1}]^T \otimes P \\
[Q_{*2}]^T \otimes P \\
\vdots \\
[Q_{*s}]^T \otimes P
\end{pmatrix} \mathrm{Vec}\,Y
= (Q^T \otimes P)\,\mathrm{Vec}\,Y.
\]

Example 6.6.7 Verify Result 6.6.1 for the matrices
\[
P = \begin{pmatrix} p_1 & p_2 \\ p_3 & p_4 \end{pmatrix}, \qquad
Y = \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}, \qquad
Q = \begin{pmatrix} q_1 & q_2 \end{pmatrix}.
\]


Solution. The matrix multiplication is compatible and
\[
PYQ = \begin{pmatrix} p_1 & p_2 \\ p_3 & p_4 \end{pmatrix}\begin{pmatrix} y_1 \\ y_2 \end{pmatrix}\begin{pmatrix} q_1 & q_2 \end{pmatrix}
= \begin{pmatrix} p_1y_1 + p_2y_2 \\ p_3y_1 + p_4y_2 \end{pmatrix}\begin{pmatrix} q_1 & q_2 \end{pmatrix}
= \begin{pmatrix}
(p_1y_1 + p_2y_2)q_1 & (p_1y_1 + p_2y_2)q_2 \\
(p_3y_1 + p_4y_2)q_1 & (p_3y_1 + p_4y_2)q_2
\end{pmatrix},
\]
hence
\[
\mathrm{Vec}(PYQ) = \begin{pmatrix}
(p_1y_1 + p_2y_2)q_1 \\
(p_3y_1 + p_4y_2)q_1 \\
(p_1y_1 + p_2y_2)q_2 \\
(p_3y_1 + p_4y_2)q_2
\end{pmatrix} = \text{L.H.S. of Result 6.6.1}.
\]
Now $Q^T = \begin{pmatrix} q_1 \\ q_2 \end{pmatrix}$, and the first block of the R.H.S. is
\[
\big([Q_{*1}]^T \otimes P\big)\,\mathrm{Vec}\,Y
= \begin{pmatrix} q_1p_1 & q_1p_2 \\ q_1p_3 & q_1p_4 \end{pmatrix}\begin{pmatrix} y_1 \\ y_2 \end{pmatrix}
= \begin{pmatrix} q_1p_1y_1 + q_1p_2y_2 \\ q_1p_3y_1 + q_1p_4y_2 \end{pmatrix},
\]
while the second block is
\[
\big([Q_{*2}]^T \otimes P\big)\,\mathrm{Vec}\,Y
= \begin{pmatrix} q_2p_1 & q_2p_2 \\ q_2p_3 & q_2p_4 \end{pmatrix}\begin{pmatrix} y_1 \\ y_2 \end{pmatrix}
= \begin{pmatrix} q_2p_1y_1 + q_2p_2y_2 \\ q_2p_3y_1 + q_2p_4y_2 \end{pmatrix}.
\]
Hence the R.H.S. is
\[
\begin{pmatrix}
q_1(p_1y_1 + p_2y_2) \\
q_1(p_3y_1 + p_4y_2) \\
q_2(p_1y_1 + p_2y_2) \\
q_2(p_3y_1 + p_4y_2)
\end{pmatrix}.
\]
Since L.H.S. = R.H.S., Result 6.6.1 is verified.
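Result 6.6.1 is also easy to check numerically. The only point to note is that the Vec operator stacks columns, which in NumPy corresponds to flattening in column-major (Fortran) order. A minimal sketch with arbitrary illustrative matrices:

```python
import numpy as np

def vec(M):
    # column-stacking Vec operator
    return M.flatten(order="F")

rng = np.random.default_rng(1)
P = rng.standard_normal((2, 3))   # m x n
Y = rng.standard_normal((3, 4))   # n x r
Q = rng.standard_normal((4, 2))   # r x s

lhs = vec(P @ Y @ Q)
rhs = np.kron(Q.T, P) @ vec(Y)    # (Q^T ⊗ P) Vec Y
assert np.allclose(lhs, rhs)
print("Result 6.6.1 verified numerically")
```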


EXERCISE 6.6

1. Write the equation $AY = B$,
\[
\begin{pmatrix} a_1 & a_3 \\ a_2 & a_4 \end{pmatrix}
\begin{pmatrix} y_1 & y_3 \\ y_2 & y_4 \end{pmatrix}
= \begin{pmatrix} b_1 & b_3 \\ b_2 & b_4 \end{pmatrix},
\]
in a vector-matrix form. Hint: consider $AYI_2 = B$.
2. Show that $\mathrm{tr}(AB) = (\mathrm{Vec}\,A^T)^T\,\mathrm{Vec}\,B$.
3. Let $A = \begin{pmatrix} 2 & -1 \\ 3 & 4 \end{pmatrix}$, $B = \begin{pmatrix} 3 & -4 \\ 2 & 1 \end{pmatrix}$, $Y = \begin{pmatrix} y_1 & y_3 \\ y_2 & y_4 \end{pmatrix}$. Verify that $\mathrm{Vec}(AYB) = (B^T \otimes A)\,\mathrm{Vec}\,Y$.
4. Let $A = \begin{pmatrix} 1 & 1 \\ -2 & 3 \end{pmatrix}$, $B = \begin{pmatrix} 2 & 7 \\ 11 & 9 \end{pmatrix}$, $Z = \begin{pmatrix} z_1 & z_3 \\ z_2 & z_4 \end{pmatrix}$. Verify that $\mathrm{Vec}(AZB) = (B^T \otimes A)\,\mathrm{Vec}\,Z$.

6.7 Permutation Matrices and Kronecker Products

In this section, the concept of a permutation matrix is introduced and its usefulness in expressing Kronecker products is given. The primary column matrices are the basic ingredients of a permutation matrix.

Definition 6.7.1 A permutation matrix of order $n$ is an $n \times n$ square matrix consisting of $n$ primary column matrices of order $n \times 1$.

Example 6.7.1 The permutation matrices formed using the primary column matrices of order 2, $\begin{pmatrix} 1 & 0 \end{pmatrix}^T$ and $\begin{pmatrix} 0 & 1 \end{pmatrix}^T$, are
\[
\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad
\begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}, \quad
\begin{pmatrix} 0 & 0 \\ 1 & 1 \end{pmatrix}, \quad
\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.
\]

Example 6.7.2 Let $P_{ij}$ be the primary matrices of order 2, $i, j = 1, 2$. Find
\[
U = \sum_{j=1}^{2}\sum_{i=1}^{2} P_{ij} \otimes P_{ji}.
\]


Solution.
\[
U = P_{11} \otimes P_{11} + P_{12} \otimes P_{21} + P_{21} \otimes P_{12} + P_{22} \otimes P_{22}
\]
\[
= \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \otimes \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}
+ \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \otimes \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}
+ \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} \otimes \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}
+ \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \otimes \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}
= \begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}.
\]

Example 6.7.3 Find the permutation matrix obtained by $I_3 \otimes I_2$.

Solution.
\[
I_3 \otimes I_2 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \otimes \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
= I_6,
\]
the $6 \times 6$ identity matrix.

Result 6.7.1 Given a matrix A there exists a permutation matrix U such that $\mathrm{Vec}\,A^T = U\,\mathrm{Vec}\,A$.

Proof. Let $A = (a_{ij})_{m\times n}$. The permutation matrix can be obtained either by considering primary column matrices of order $mn \times 1$ or by considering the primary matrices of order $m \times n$ and using the corresponding Vec operator. The proof using primary matrices of order $m \times n$ is as follows. The matrix
\[
A = \sum_j\sum_i a_{ij}P_{ij}, \qquad\text{hence}\qquad A^T = \sum_j\sum_i a_{ij}P^T_{ij},
\]
so
\[
\mathrm{Vec}\,A^T = \begin{pmatrix} \mathrm{Vec}\,P^T_{11} & \mathrm{Vec}\,P^T_{21} & \cdots & \mathrm{Vec}\,P^T_{m1} & \mathrm{Vec}\,P^T_{12} & \cdots & \mathrm{Vec}\,P^T_{m2} & \cdots & \mathrm{Vec}\,P^T_{mn} \end{pmatrix} \mathrm{Vec}\,A.
\]
Thus
\[
U = \begin{pmatrix} \mathrm{Vec}\,P^T_{11} & \mathrm{Vec}\,P^T_{21} & \cdots & \mathrm{Vec}\,P^T_{m1} & \mathrm{Vec}\,P^T_{12} & \cdots & \mathrm{Vec}\,P^T_{m2} & \cdots & \mathrm{Vec}\,P^T_{mn} \end{pmatrix}.
\]
In case primary column matrices are used, each such matrix is of order $mn \times 1$ and
\[
U = \begin{pmatrix} P_{1\,1} & P_{m+1\,1} & P_{2m+1\,1} & \cdots & P_{(n-1)m+1\,1} & P_{2\,1} & P_{m+2\,1} & P_{2m+2\,1} & \cdots & P_{(n-1)m+2\,1} & \cdots & P_{m\,1} & P_{m+m\,1} & P_{2m+m\,1} & \cdots & P_{mn\,1} \end{pmatrix}.
\]
Then $\mathrm{Vec}\,A = U^T\,\mathrm{Vec}\,A^T$.
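Result 6.7.1 can be illustrated numerically by assembling U column by column from the primary matrices, exactly as in the proof above. A minimal sketch, assuming NumPy; the 3 × 2 matrix A below is an arbitrary illustrative choice.

```python
import numpy as np

def vec(M):
    return M.flatten(order="F")          # column-stacking Vec operator

def permutation_U(m, n):
    # Columns of U are Vec(P_ij^T) taken in the order P11, P21, ..., Pm1,
    # P12, ..., Pmn, following the construction in Result 6.7.1.
    cols = []
    for j in range(n):
        for i in range(m):
            Pij = np.zeros((m, n))
            Pij[i, j] = 1.0
            cols.append(vec(Pij.T))
    return np.column_stack(cols)

m, n = 3, 2
A = np.arange(1.0, m * n + 1).reshape(m, n)
U = permutation_U(m, n)
assert np.allclose(vec(A.T), U @ vec(A))    # Vec A^T = U Vec A
assert np.allclose(vec(A), U.T @ vec(A.T))  # Vec A   = U^T Vec A^T
print(U)
```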

Kronecker Product

253

Example 6.7.4 Find  the permutation  matrix using primary column matrices a11 a12 a13 for the matrix A = a21 a22 a23  so that Vec A = UT Vec AT. a31 a32 a33 Solution. Using the above formula with m = 3, n = 3  U = P11 P41  1 0 0 0 0 0  0 0 0  0 1 0  U= 0 0 0 0 0 0  0 0 1  0 0 0 0 0 0

P71

P21

P51

P81 T

0 1 0 0 0 0 0 0 0

0 0 0 0 1 0 0 0 0

0 0 0 0 0 0 0 1 0

0 0 1 0 0 0 0 0 0

0 0 0 0 0 1 0 0 0

0 0  0  0  0  0  0  0 1

 1 0  0  0  T T  Consider U Vec A =0 0  0  0 0

0 0 0 1 0 0 0 0 0

0 0 0 0 0 0 1 0 0

0 1 0 0 0 0 0 0 0

0 0 0 0 1 0 0 0 0

0 0 0 0 0 0 0 1 0

0 0 1 0 0 0 0 0 0

P31

0 0 0 0 0 1 0 0 0

P61

P91

T

    a11 0 a11 a12  a21  0         0 a13  a31      0 a21  a12      0a22  =a22  = Vec A.     0a23   a32      0a31  a13   0a32  a23  1 a33 a33

Example 6.7.5 Verify the Result 6.7.1 for the matrix   a11 a12 A= a21 a22 Consider the vector formed by using the primary matrices P11 P12 P21 P22         1 0 0 0 0 0 1 0        Vec P11 =  0 Vec P12 = 1 Vec P21 = 0 Vec P22 = 0 0 0 0 1

254

Linear Algebra  1 0 0 0 0 1 U= 0 1 0 0 0 0  1 0 0  0 0 1 UT Vec A =  0 1 0 0 0 0

to Differential Equations  0 0  0 1     a11 a11 0 a21  a12  0    =   = Vec AT. 0 a12  a21  a22 a22 1

Observation. The permutation matrix is not an identity matrix.     Result 6.7.2 Let P = pij m×n and Q = qij k×l be matrices of order m × n and k × l, respectively. Then there exist permutation matrices U1 and U2 such that P ⊗ Q = U1 (Q ⊗ P)U2 . Proof. The proof follows from the Result 6.6.1, which gives Vec (PYQ) = (QT ⊗ P)Vec Y. Set

PYQT = X

Then Vec (PYQT ) = Vec X =⇒ (Q ⊗ P)Vec Y = Vec X. Using (PYQT )T = QYT PT = XT applying the Result 6.6.1, gives (P ⊗ Q)Vec YT = Vec XT Now from Result 6.7.1, there exists permutation matrices U1 and U2 such that Vec XT = U1 Vec X and Vec Y = U2 Vec YT multiplying the relation with U1 both sides gives U1 (Q ⊗ P)U2 Vec YT = U1 VecX. Also, (P ⊗ Q)Vec YT = U1 Vec X comparing the above relations give P ⊗ Q = U1 (Q ⊗ P)U2 , where U1 and U2 are permutation matrices.     1 2 2 1 and Q = . Find permutation 3 4 −4 3 matrices U1 and U2 such that P ⊗ Q = U1 (Q ⊗ P)U2 Example 6.7.6 Given P =

Kronecker Product

255

Solution.  1 0 0 0 0 1 From Example 6.7.5, U1 = U2 =  0 1 0 0 0 0 Verification     1 2 2 1 L.H.S = P ⊗ Q = ⊗ 3 4 −4 3   2 1 4 2  −4 3 −8 6   =  6 3 8 4 −12 9 −16 12  Q⊗P=

 2   6 2 =  −4 4 −12  2 4 0 0  6 8 1 0  0 0  −4 −8 −12 −16 0 1  4 1 2 −8 3 6   8 3 4 −16 9 12

  1 1 ⊗ 3 3

2 −4

 1 0 U1 (Q ⊗ P) =  0 0 

 0 0  0 1

0 0 1 0

2  −4 =  6 −12

 1 2 4 1 2  −4 −8 3 6  0  U1 (Q ⊗ P)U2 =   6 8 3 4  0 0 −12 −16 9 12   2 1 4 2  −4 3 −8 6   =  6 3 8 4 −12 9 −16 12 

=P⊗Q Hence the example verifies the Result 6.7.2.

4 8 −8 −16

 1 2 3 4  3 6 9 12 

1 2 3 4  3 6 9 12

0 0 1 0

0 1 0 0

 0 0  0 1

256

Linear Algebra to Differential Equations

EXERCISE 6.7 1. Write down the primary matrices of order 3 and find the permutation matrix XX U= Pij ⊗ Pji What is the order of the matrix U?   2 5 , find a permutation matrix, U such that Vec AT = 2. Given A = 4 3 U Vec A.     2 3 2 5 3. Let A = and B = . Find permutation matrices U1 and 4 2 3 3 U2 such that U1 (B ⊗ A)U2 .

6.8

Analytical Functions and Kronecker Product

In this section, an analytical function is considered and a relation is obtained using the Kronecker product. Let h be an analytical function of z. Then from the theory of analytical functions, h can be expressed as a convergent series, h(z) = c0 + c1 z + c2 z 2 + · · · + cn z n + · · · , where c0 , c1 , c2 , . . . , cn , · · · are all real constants. The following result involves Kronecker products. Result 6.8.1 Let P be a square matrix of order m and h be an analytical function. Then, (a)

h(In ⊗ P) = In ⊗ h(P)

and (b) h(P ⊗ In ) = h(P) ⊗ In . Proof. Consider h(In ⊗ P), since h is an analytical function X h(In ⊗ P) = ck (In ⊗ P)k . Now using the mixed product rule (In ⊗ P)2 = (In ⊗ P)(In ⊗ P) = In ⊗ P2 (In ⊗ P)3 = (In ⊗ P2 )(I ⊗ P) = In ⊗ P3

Kronecker Product

257

In general, (In ⊗ P)k = In ⊗ Pk Thus

h(In ⊗ P) =

X

ck (In ⊗ P)k

=

X

ck (In ⊗ Pk )

X

I n ⊗ ck P k X = In ⊗ ck Pk = In ⊗ h(P)k .

=

Similarly, (b) can be proved by considering X h(P ⊗ In ) = cn (P ⊗ I)n . Example 6.8.1 VerifyResult 1 2 3x + 1 for I2 and P = 0 1 0 0

6.8.1 (a) and (b)  0 2 1   1 0 Let In = I2 = 0 1    1 1 0 Then I2 ⊗ P = ⊗ 0 0 1 0  1 2 0 0 0 1 2 0  0 0 1 0 = 0 0 0 1  0 0 0 0 0 0 0 0

h(I2 ⊗ P) = 3(I2 ⊗ P) + I  1 2 0 0 0 1 2 0  0 0 1 0 = 3 0 0 0 1  0 0 0 0 0 0 0 0  4 6 0 0 0 4 6 0  0 0 4 0 = 0 0 0 4  0 0 0 0 0 0 0 0

0 0 0 2 1 0 0 0 0 6 4 0

  0 1 0 0    0  + 0  0  0 2 0 1 0 

0 0  0  0  6 4

for the polynomial h(x) =

0 0 0 2 1 0

 0 2 1  0 0  0  0  2 1

0 1 0 0 0 0

0 0 1 0 0 0

2 1 0

0 0 0 1 0 0

0 0 0 0 1 0

 0 0  0  0  0 1

258

Linear Algebra  1 2 Further, h(P) = 3 0 1 0 0

 1 Now, I2 ⊗ h(P) = 0

to Differential   0 1 0 2 + 0 1 1 0 0

  4 0 ⊗ 0 1 0

6 4 0

Equations   0 4 6 0 = 0 4 1 0 0

 4  0  0 0  6 = 0  4 0 0

6 4 0 0 0 0

0 6 4 0 0 0

0 0 0 4 0 0

 0 6 4  0 0  0  0  6 4

0 0 0 6 4 0

hence the Result 6.8.1(a) is verified. 6.8.1(b) is left as an exercise. Example 6.8.2 For h(z) = ez verify the Result 6.8.1(a) and 6.8.1(b) with I2 and P as in above example. Solution. Consider h(z) = ez . Since ez is an analytical function 6.8.1(a) to show eI2 ⊗P = I2 ⊗ eP 1 (I2 ⊗ P)2 + · · · Consider eI2 ⊗P = I6 + I2 ⊗ P + 2!      1 2 0 0 0 0 1 0 0 0 0 0 1 4 4 0 1 0 0 0 0 0 1 2 0 0 0 0 1 4      0 0 1 0 0 0 0 0 1 0 0 0 1 0 0 1     = 0 0 0 1 0 0 + 0 0 0 1 2 0 + 2 0 0 0      0 0 0 0 1 0 0 0 0 0 1 2 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0   5/2 4 2 0 0 0   0 5/2 4 0 0 0     0 0 5/2 0 0 0  + ···  =  0 0 0 5/2 4 2    0 0 0 0 5/2 4  0 0 0 0 0 5/2 Now eP = I3 + P + P2 /2! + · · ·    1 0 0 1 2 = 0 1 0 + 0 1 0 0 1 0 0   5/2 4 2 5/2 4 . = 0 0 0 5/2

  0 1 1 2 + 0 2 1 0

4 1 0

 4 4 + · · · 1

from Result

0 0 0 1 0 0

0 0 0 4 1 0

 0 0  0  4  4 1

Now

Kronecker Product     5/2 4 2 1 0 I2 ⊗ e P = ⊗  0 5/2 4  0 1 0 0 5/2   5/2 4 2 0 0 0  0 5/2 4 0 0 0     0 0 5/2 0 0 0    = 0 0 5/2 4 2   0   0 0 0 0 5/2 4  0 0 0 0 0 5/2

259

Hence the Result 6.8.1(a) is verified. Observation. 1. Note that in eI2 ⊗P the 1st three terms have been considered and also in eP , the 1st three terms were considered. The same number of terms in eI2 ⊗P and eP must be chosen. 2. Result 6.8.1(b) is left as exercise. The following example deals with using the tools of linear algebra to verify the Result 6.8.1(a). 

3 Example 6.8.3 Given A = −2 (i) (iii)

exp(A)

(ii)

 4 calculate −3

exp(A ⊗ I)

verify the result exp(A) ⊗ I = exp(A ⊗ I).

(i) Clearly the eigen-values of A are 1 and −1 and eigen-vectors are x1 = [2 − 1]T and x2 = [1 − 1]T , respectively.   Next matrix P = x1 x2 is given by,    2 1 1 P= and hence P−1 = −1 −1 −1   1 0 −1 Observe that P AP = = D and 0 −1     2 1 e 0 1 1 exp A = PeD P−1 = , −1 −1 0 e−1 −1 −2   2e − e−1 2(e − e−1 ) or exp A = e−1 − e 2e−1 − e

1 −2



260

Linear Algebra to Differential Equations

(ii) Clearly, 

3 0 A ⊗ I2 =  −2 0

0 3 0 −2

4 0 −3 0

 0 4 , 0 −3

The characteristic equation for matrix A ⊗ I2 is  3−λ 0 4  0 3 − λ 0 (A ⊗ I2 ) − λI4 =   −2 0 −3 − λ 0 −2 0

 0 4   0  −3 − λ

Verify that the eigen-values are 1 and −1 with multiplicity 2 in each case. The eigen-vectors associated to 1, are given by the system 2x + 4z = 0, 2y + 4w = 0, −2x − 4z = 0, −2y − 4w = 0. This gives x1 = α[0 − 2 0 1]T + β[−2 0 1 0]T. The eigen-vectors associated to −1 are given by the system 4x + 4z = 0 4y + 4w = 0 −2x − 2z = 0 −2y − 2w = 0 This gives x2 = γ[1 0 − 1 0]T + δ[0 1 0 − 1]T. Note that the geometric multiplicity of 1 and −1 is 2, the same as its algebraic multiplicity. Now, set     0 −2 1 0 0 −1 0 −1 −2 0   0 1  then P−1 = −1 0 −1 0  P= 0   1 −1 0 −1 0 −2 0  1 0 0 −1 0 −1 0 −2   1 0 0 0 0 1 0 0 −1   and P AP =  0 0 −1 0  0 0 0 −1

Kronecker Product

261

Hence exp(A ⊗ I2 ) = PeD P−1  2e − e−1  0 =  e−1 − e 0

0 2e − e−1 0 e−1 − e

2e − 2e−1 0 2e−1 − e 0

 0 2e − 2e−1    0 2e−1 − e .

(iii) It follows that (expA) ⊗ I2 = exp(A ⊗ I2 ).
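The same conclusion can be reached numerically with a matrix-exponential routine. A minimal sketch, assuming NumPy and SciPy are available, with the matrix A of Example 6.8.3:

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[3., 4.], [-2., -3.]])
I2 = np.eye(2)

# Result 6.8.1 for h(z) = e^z:
#   exp(A ⊗ I) = exp(A) ⊗ I   and   exp(I ⊗ A) = I ⊗ exp(A)
assert np.allclose(expm(np.kron(A, I2)), np.kron(expm(A), I2))
assert np.allclose(expm(np.kron(I2, A)), np.kron(I2, expm(A)))

print(expm(A))   # compare with the closed form obtained in Example 6.8.3
```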

EXERCISE 6.8  3 1. If A = 1

 2 , calculate 4

(i) expA (ii) exp(A ⊗ I) (iii) verify the property (expA) ⊗ I = exp(A ⊗ I)     1 1 5 4 2. Let A = ,B= −2 4 1 2 (i) find the eigen-values and eigen-vectors of A and B. (ii) find expA and expB. (iii) verify the property (expA) ⊗ I = exp(A ⊗ I) (iv) verify the property (expB) ⊗ I = exp(B ⊗ I)     0 1 cosh 1 sinh 1 3. If A = , show that expA = . 1 0 sinh 1 cosh 1     0 x cosh x sinh x 4. If A = , show that expAx = . x 0 sinh x cosh x     0 x cos x sin x 5. If A = , show that expAx = . −x 0 sin x cos x

6.9

Kronecker Sum

In this section, the Kronecker sum (⊕) of two matrices is defined and some of its properties are obtained. The symbol ⊕ is different from the symbol for Kronecker product, ⊗. Let P and Q be square matrices of orders m and n, respectively.

262

Linear Algebra to Differential Equations

Definition 6.9.1 The Kronecker sum of two matrices P and Q, denoted by P ⊕ Q, defined as P ⊕ Q = P ⊗ In + Im ⊗ Q.  1 Example 6.9.1 Let P = 2

  1 2 and Q = 1 −1

Solution. By definition      1 1 1 0 1 P⊕Q= ⊗ + 2 1 0 1 0    1 0 1 0 2 0 1 0 1 −1   = 2 0 1 0 +  0 0 2 0 1 0

 1 . Find P ⊕ Q. 3

  0 2 × 1 −1 1 0 3 0 0 2 0 −1

 1 3

  0 3 −1 0 = 1  2 3 0

1 1 4 0 0 3 2 −1

 0 1 . 1 4

Result 6.9.1 The following are some of the properties of a Kronecker sum (i) P ⊕ Q = Omn =⇒ P = Om and Q = On , where Om and On are zero matrices of order m and n, respectively (ii) Im ⊕ In = 2Imn (iii) Mixed addition rule: (P ⊕ Q) + (R ⊕ S) = (P + R) ⊕ (Q + S), where P and R are square matrices of order m and Q and S are square matrices of order n (iv) Associative law holds P ⊕ (Q ⊕ R) = (P ⊕ Q) ⊕ R (v) If {λk } and {µl } are the eigen-values and {xk } and {yl } are the corresponding eigen-vectors of P and Q, respectively then λk + µl are the eigen-values and xk ⊗ yl are the corresponding eigen-vectors of P ⊕ Q. Proof. (i) Clearly if P 6= O and Q 6= O then P ⊕ Q = P ⊕ In + Im ⊕ Q 6= O. Suppose if possible P 6= O and Q = O, then P ⊕ Q = O =⇒ P ⊗ In + Im ⊗ Q = P ⊗ In + O = O =⇒ P ⊗ In = O =⇒ P = O. Similarly, if P = O and if possible Q 6= O, then it can be shown that Q = O. (ii) Im ⊕ In = Im ⊗ In + Im ⊗ In = 2(Im ⊗ In ) = 2Imn .

Kronecker Product

263

(iii) Consider P, R and Q, S square matrices of order m and n, respectively. Then (P ⊕ Q) + (R ⊕ S) = (P ⊗ In + Im ⊗ Q) + (R ⊗ In + Im ⊗ S) = (P ⊗ In + R ⊗ In ) + (Im ⊗ Q + Im ⊗ S) = (P + R) ⊗ In + Im ⊗ (Q + S) = (P + R) ⊕ (Q + S), by definition of Kronecker sum. (iv) Let P, Q and R be square matrices of order m. Then L.H.S = P ⊕ (Q ⊕ R) = P ⊕ (Q ⊗ Im + Im ⊗ R) = P ⊗ Im2 + Im ⊗ Q ⊗ Im + Im2 ⊗ R R.H.S = (P ⊕ Q) ⊕ R = (P ⊗ Im + Im ⊗ Q) ⊕ R = (P ⊗ Im + Im ⊗ Q) ⊗ Im + Im2 ⊗ R = P ⊗ Im2 + Im ⊗ Q ⊗ Im + Im2 ⊗ R Since L.H.S = R.H.S the result holds. (v) Let Pm×m and Qn×n be square matrices. Consider (P ⊕ Q)(xk ⊗ yl ) = (P ⊗ In + Im ⊗ Q)(xk ⊗ yl ) = (P ⊗ In )(xk ⊗ yl ) + (Im ⊗ Q)(xk ⊗ yl ) = (Pxk ⊗ yl ) + (xk ⊗ Qyl ) = λk xk ⊗ yl + xk ⊗ µl yl = (λk + µl )(xk ⊗ yl ), which implies by the definition of eigen-values and eigen-vectors that (λk + µl ) is an eigen-value of (P ⊕ Q) with corresponding eigen-vector xk ⊗ yl .  1 Example 6.9.2 Let P = 0 P ⊕ Q (ii) P ⊕ R

  0 −1 ,Q= 1 0

  0 0 and R = 0 0

 0 . Find (i) 0

264

Linear Algebra to Differential Equations

(i) By definition  1 P⊕Q= 0  1 0 = 0 0  0 0 = 0 0

  0 1 ⊗ 0 0

0 1

0 1 0 0

0 0 0 0 0 −1 0 0

0 1 0 0

  0 1 + 1 0   −1 0 0 0 0 0 + 0 0  0 0 0 0  0 0 0 0  −1 0 0 0

 −1 0

 0 0  0 0  0 0

(ii)         1 0 1 0 1 0 0 0 P⊕R= ⊗ + ⊗ 0 0 0 1 0 1 0 0     0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0    = 0 0 0 0 + 0 0 0 0 0 0 0 0 0 0 0 0   1 0 0 0 0 1 0 0  = 0 0 0 0 0 0 0 0 Observation. (1) Though P + Q = O, P ⊕ Q 6= O (2) Even if at least one entry of P 6= O and Q 6= O, P ⊕ Q 6= O Example 6.9.3 Let m = 3 and n = 2. Find Im ⊕ In Consider I3 ⊕ I2 = I3 ⊗ I2 + I3 ⊗ I2 = 2I3 ⊗ I2 = 2I6 . Example 6.9.4 Verify associative law for Kronecker sum by taking       p1 p2 q1 q2 r1 r2 P= Q= and R = p3 p4 q3 q4 r3 r4 Consider P ⊕ (Q ⊕ R) = P ⊕ [Q ⊗ I2 + I2 ⊗ R]

Kronecker Product

= (P ⊗ I4 ) + I2 ⊗ [Q ⊗ I2 + I2 ⊗ R]    1 0 0 0 q1        0 1 0 0 p p2 + 1 0 ⊗ 0 ⊗ = 1 0 0 1 0 q3 p3 p4 0 1    0 0 0 0 1   p1 0 0 0 p2 0 0 0 p1 0 0 0 p2 0 0 0   0 p1 0 0 0 p2 0 0   0 0 p1 0 0 0 p2  0 =  0 0 0 p4 0 0 0  p3   p3 0 0 0 p4 0 0 0 0 0 p3 0 0 0 p4 0 0 0 0 p3 0 0 0 p4  q1 + r1 r2 q2 0 0 q1 + r4 0 q2 0  r3  0 q4 + r1 r2 0  q3  q3 r3 q4 + r4 0  0 + 0 0 0 q1 + r1  0  0 0 0 r3  0  0 0 0 0 q3 0 0 0 0 0 p + q + r r q 0 1

  =  

1 r3 q3 0 p3 0 0 0

1

0 q1 0 q3

265

q2 0 q4 0

0 0 0 0 r2 q1 + r4 0 q3

  0 r1  q2   + r3 0 0 q4 0

0 0 0 0 q2 0 q4 + r1 r3

r2 r4 0 0

0 0 r1 r3

 0    0  r2    r4

 0 0   0   0   0   q2  r2  q4 + r4

p2 0 0 0 2 2 p1 + q1 + r4 0 q2 0 p2 0 0 0 p1 + q4 + r1 r2 0 0 p2 0 q3 r3 p1 + q4 + r4 0 0 0 p2 0 0 0 p4 + q1 + r1 r2 q2 0 p3 0 0 r3 p 4 + q 1 + r4 0 q2 0 p3 0 q3 0 p4 + q4 + r1 r2 0 0 p3 0 q3 r3 p4 + q4 + r4

Consider R.H.S = (P ⊕ Q) ⊕ R = (P ⊕ Q) ⊗ I2 + I4 ⊗ R 

p1 p3  p1  0  = p3 0  1 0 0 1 + 0 0 0 0 =

   p2 q q ⊗ I2 + I2 × 1 2 ⊗ I2 + I4 ⊗ R p4 q3 q4    0 p2 0 q1 q2 0 0   p1 0 p2   + q3 q4 0 0  ⊗ I2   0 p4 0 0 0 q1 q2  p3 0 p4 0 0 q3 q4  0 0   0 0  ⊗ r1 r2 1 0 r3 r4 0 1

     

266  p1 + q1  0   q3   0 =  p3   0   0 0  r1 r2 r3 r4  0 0  0 0 + 0 0  0 0  0 0 0 0 p + q + r 1

 = 

1 r3 q3 0 p3 0 0 0

Linear Algebra to Differential Equations 0 p1 + q1 0 q3 0 p3 0 0 0 0 r1 r3 0 0 0 0 1

0 0 r2 r4 0 0 0 0

q2 0 p1 + q4 0 0 0 p3 0 0 0 0 0 r1 r3 0 0

0 0 0 0 r2 r4 0 0

0 q2 0 p1 + q4 0 0 0 p3  0 0 0 0  0 0  0 0  0 0  0 0  r1 r2  r3 r4

p2 0 0 0 p4 + q1 0 q3 0

0 p2 0 0 0 p4 + q1 0 q3

0 0 p2 0 q2 0 p1 + q4 0

 0 0   0   p2   0   q2   0  p4 + q4

r2 q2 0 p2 0 0 0 p1 + q1 + r4 0 q2 0 p2 0 0 0 p1 + q4 + r1 r2 0 0 p2 0 q3 r3 p1 + q4 + r4 0 0 0 p2 0 0 0 p4 + q1 + r1 r2 q2 0 p3 0 0 r3 p 4 + q 1 + r4 0 q2 0 p3 0 q3 0 p4 + q4 + r1 r2 0 0 p3 0 q3 r3 p4 + q4 + r4

Thus L.H.S = R.H.S. Hence P ⊕ (Q ⊕ R) = (P ⊕ Q) ⊕ R is verified. Example 6.9.5 Given P =

 1 2

  2 1 , and Q = 1 0

 −1 , Find 2

(i) the eigen-values and eigen-vectors of P and Q. (ii) the eigen-values and eigen-vectors of P ⊕ Q. (iii) Verify the property concerning the eigen-values, eigen-vectors and Kronecker sum. Solution: The characteristic equation for the matrix P is |P − λI2 | = 0. It is seen that the eigen-values of P are −1 and 3 and corresponding eigen-vectors are respectively x1 = [1 − 1]T and x2 = [1 1]T. The characteristic equation for matrix Q is |Q−µI2 | = 0. It can be verified that the eigen-values of Q are 1 and 2 and corresponding eigen-vectors are y1 = [1 0]T and y2 = [1 − 1]T. Now,     1 2 1 −2 P⊕Q= ⊕ 2 1 0 1   2 −1 2 0 0 3 0 2   =  2 0 2 −1  . 0 2 0 3

   

Kronecker Product

267

The characteristic equation for matrix P ⊕ Q is |P ⊕ Q − αI4 | = 0 or 2−α −1 2 0 0 3−α 0 2 =0 2 0 2−α −1 0 2 0 3−α Expanding this determinant, we obtain α (α − 1)(α − 4)(α − 5) = 0 It follows that eigen-values of P ⊕ Q are 0, 1, 4, 5. The associated eigen-vectors are given by the linear system (P ⊕ Q = −λI)x = 0 Corresponding to eigen-value 0, the associated eigen-vectors are given by the linear system 2x − y + 2z = 0 3y + 2w = 0 2x + 2z − w = 0 2y + 3w = 0 An eigen-vector is [1 0 − 1 0]T. The following table is illustrative. Eigen-value 1

4

5

System of equations x − y + 2z = 0 2y + 2w = 0 2x + z − w = 0 2y + 2w = 0 −2x − y + 2z = 0 −y + 2w = 0 2x − 2z − w = 0 2y − w = 0 −3x − y + 2z = 0 −2y + 2w = 0 2x − 3z − w = 0 2y − 2w = 0

Eigen-vector

[ 1 − 1 − 1 1 ]T.

[1

0

1

[1

−1

0 ]T.

1

− 1 ]T.

Observe that, α = 0 = λ1 + µ1 α = 1 = λ1 + µ2 α = 4 = λ2 + µ1 α = 5 = λ2 + µ2

and and and and

x1 ⊕ y1 x1 ⊕ y2 x2 ⊕ y1 x2 ⊕ y2

=[ =[ =[ =[

1 1 1 1

0 − 1 0 ]T − 1 − 1 1 ]T 0 1 0 ]T − 1 1 − 1 ]T

This example suggests that Kronecker sum helps to get eigen-values and eigen-vectors of matrices of higher order.
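This observation is easily tested numerically: the Kronecker sum is assembled from two Kronecker products, and its eigen-values are compared with all pairwise sums λk + µl. A minimal sketch, assuming NumPy, with the matrices of Example 6.9.5:

```python
import numpy as np

P = np.array([[1., 2.], [2., 1.]])
Q = np.array([[1., -1.], [0., 2.]])
m, n = P.shape[0], Q.shape[0]

# Kronecker sum: P ⊕ Q = P ⊗ I_n + I_m ⊗ Q
kron_sum = np.kron(P, np.eye(n)) + np.kron(np.eye(m), Q)

lam = np.linalg.eigvals(P)
mu = np.linalg.eigvals(Q)
pairwise = np.sort_complex((lam[:, None] + mu[None, :]).ravel())
assert np.allclose(pairwise, np.sort_complex(np.linalg.eigvals(kron_sum)))

print(kron_sum)   # compare with P ⊕ Q computed in Example 6.9.5
```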

268

Linear Algebra to Differential Equations

EXERCISE 6.9  2 1. Find the Kronecker sum of the matrices A = 1

  4 8 and B = 3 5

 1 . 6

2. Find the eigen-values andeigen-vectors corresponding to the Kronecker    1 3 2 3 sum of the matrices A = and B = . 2 2 4 3

6.10

Lyapunov Function

In this section. Kronecker products are used to reduce a specific matrix equation to a matrix-vector form and solve it. Definition 6.10.1 Lyapunov Equation is a linear matrix equation of the form PX + XQ = R,

(6.12)

where P, Q and R are given matrices of order n and X is an unknown matrix of order n. The following are particular cases of (6.12) are of interest. (i) Q = O, then (6.12) reduces to PX = R (ii) When P = Q and R = O, i.e. PX ± XP = O (iii) PX ± XPT = O (iv) PX ± XP−1 = O The solution for the equation (6.12) is given by Z ∞ X=− ePt ReQt dt 0

provided the integral exists for an arbitrary constant matrix C [see 4]. Result. A necessary and a sufficient condition for the equation (6.12) to have a unique solution for each value of R is that λi + µj 6= 0 for all values of i and j, where λi is an eigen-value of P and µj is an eigen-value of Q for each i and j. The aim in this section is to solve the Lyapunov equation using Kronecker products.

Kronecker Product

269

The equation (6.12) can be written in vector matrix form as follows using the relation (6.1) and Kronecker sum. Vec(PX + XQ) = Vec(PXIn + In XQ) = Vec(PXIn ) + Vec(In XQ)  = (In ⊗ P) + QT ⊗ In VecX = (QT ⊕ P)VecX or the equation (6.12) reduces to the form Hx = c,

(6.13)

where H = QT ⊕ P, x = VecX and c = VecR. The equation (6.13) has a unique solution iff H is nonsingular, that is, the eigen-values of QT ⊕ P are all nonzero. Thus if λi + µj 6= 0 for all i and j then the equation (6.12) has a unique solution. Example 6.10.1 A medical expert advised two patients Usha and Nisha of different ages to consume (150, 145) g of proteins and (330, 270) g of carbohydrates, respectively. The patients consume the two types of food A and B and when mixed provide them approximately the desired food values. Noticing that in course of time food values of A and B decrease, the expert advised them to add vitamins C and D containing proteins and carbohydrates as boosters. The following tables present food analysis of A and B and vitamins C and D. Type of food A B

Food value 100 g Proteins Carbohydrates 5 15 30 25

The food items are boosted by vitamins C and D having potency as follows Type of food C D

Food value 100 g Proteins Carbohydrates 5 20 10 5

What amount of items A and B Usha and Nisha consume so that they receive needed food values. Solution. Let x11 and x21 be the quantities in units in 100 g of A and B for Usha and x12 , x22 the quantities for Nisha. These quantities are then subjected to foods containing vitamins C and D. The resulting model is then given by a matrix equation         5 15 x11 x12 x x12 5 20 150 145 + 11 = 30 25 x21 x22 x21 x22 10 5 330 270

270

Linear Algebra to Differential Equations

i.e. PX + XQ = R. Then (QT ⊕ P) = [(QT  5 = 20  5 0 = 20 0  10 30 = 20 0

⊗ I2 ) + (I2 ⊗ P)]        10 1 0 1 0 5 15 ⊗ + ⊗ 5 0 1 0 1 30 25    5 15 0 0 0 10 0   5 0 10  + 30 25 0 0  0 5 0   0 0 5 15 0 0 30 25 20 0 50  15 10 0 30 0 10  0 10 15 20 30 30

Using equation (6.2),      x11 10 15 10 0 150 30 30 0 10 x21  330      20 0 10 15 x12  = 145 . 0 20 30 30 270 x22 Thus the system of equations obtained are 10x11 + 15x21 + 10x12 = 150 30x11 + 30x21 + 10x22 = 330 20x11 + 10x12 + 15x22 = 145 20x21 + 30x12 + 30x22 = 270 Thus, solving the above system yields that Usha consumes 4 units and 6 units and Nisha consumes 2 units and 3 units of food A and B, respectively.

EXERCISE 6.10 1. Solve the system AX + XB = C,      1 −1 −3 4 1 where A = , B = , C = 0 2 1 0 −2 known matrix of order 2.

 3 and X is an un2

2. Solve the system AX + XB = C,      1 −1 −3 4 0 where A = , B = , C = 0 2 1 0 2 known matrix of order 2.

 5 and X is an un−9

Kronecker Product

271

3. Determine the solution of the system

 1 where A = 2

6.11

AX + XA = µX,  0 . µ = −2 and X is an unknown matrix of order 2. 3

Conclusion

The concepts related to Kronecker product and sum have been studied in this chapter. The main feature was to convert a matrix into a vector and employ it effectively in the matrix analysis. This fact has been successfully utilized in the subsequent chapters, Chapters 8 and 9. Another useful technique given is obtaining the eigen-values and eigen-vectors of higher-order matrices through the Kronecker product and Kronecker sum of lower-order matrices. An application involving Lyapunov equation is given. For further reading the readers are suggested to refer to [4] and [8].

Chapter 7 Calculus of Matrices

7.1

Introduction

One of the interesting developments in the theory of matrices is to club it with elements of calculus and introduce differentiation and integration of matrices. As calculus deals with rate of change of functions with respect to its independent variable, matrix calculus deals with the study of rate of change of matrix functions with respect to independent variables (which may be entries of a matrix) and the study of various concepts related to integration and differentiation of matrix functions. It is natural to expect that applications of matrix calculus will be far reaching in sciences, engineering and technology. The present chapter introduces the necessary concepts which can be extrapolated to an entire gamut of applications that arise in modeling of real-life situations using matrices. In Section 7.2, the concept of a matrix-valued function w.r.t. a scalar, (i.e.) with respect to an independent variable is given. Section 7.3 deals with the derivative of a vector-valued function (of a vector) w.r.t. a vector. Section 7.4 introduces the notion of a derivative of a scalar-valued function (of a matrix) w.r.t. a matrix. In Section 7.5, the notion of the derivative of a matrix-valued function w.r.t. its entries and the converse are discussed. Section 7.6 deals with the matrix differential. The derivative of a matrix-valued function w.r.t. a matrix form the content of Section 7.7. Section 7.8 deals with the rules obtained for the derivative defined in Section 7.7 using Kronecker products. Section 7.9 introduces another notion of a derivative of a matrix-valued function w.r.t. a matrix.

7.2

Derivative of a Matrix Valued Function with Respect to a Scalar

Consider an object flying in a space. The position of the flying object can be determined in a 3-dimensional coordinate system w.r.t. a frame of reference. The position of the object at any time is given by a 3-vector valued function DOI: 10.1201/9781351014953-7

273

274

Linear Algebra to Differential Equations

v(t) = (x(t) y(t) z(t)) t ∈ R and v ∈ R3 . Naturally, the functions x, y, z : R → R are scalar-valued functions of t and are assumed to be continuous (as the object is flying in space). If the object is flying gracefully or smoothly, the derivatives of all the functions x(t), y(t), z(t) exist and the derivative of a vector with respect to the time variable is given by  0  0  x(t) x (t) y(t) = y 0 (t) . z(t) z 0 (t) The definition in precise terms is as follows. Let v : Rn → Rn be an n-vector valued function v(t) = (x1 (t) x2 (t) ··· xn (t))T , t ∈ R. Assume that the functions x1 (t), x2 (t), · · · , xn (t) are differentiable w.r.t. t. Definition 7.2.1 The derivative of an n-vector valued function is a vector formed by the derivatives x01 (t), x02 (t), · · · , x0n (t), as follows    0  x1 (t) x1 (t) 0     x (t) 2 d    x2 (t)  v 0 (t) =  ..  =  .. , t ∈ R. dt  .   .  x0n (t) xn (t) 2 Example 7.2.1  Let v = (t t 1), t ∈ R. d d 2 d Then v 0 (t) = t t 1 = (1 2t 0). dt dt dt

In order to define the integral of a vector valued function it is sufficient for the real-valued functions x1 , x2 , · · · , xn : R → R be bounded. Let v : Rn → Rn be a bounded function on an interval I. Definition 7.2.2 The integral of an n-vector valued function lows, Z   x1 (t)dt  I     Z x1 (t)    x2 (t)dt  Z Z  x (t)    2   . vdt =  .  dt =  I    ..    . .. I I   xn (t) Z     xn (t)dt I

Example 7.2.2 Let v =

( t, 3

0≤t 0 for i, j  x11 where y = det x21 x31  x11 where y = det x21 x31

= 1, 2. x12 x22 x32 x12 x22 x32

 x13 x23  is found over 2nd row. x33  x13 x23  is found over 2nd column. x33

288

Linear Algebra to Differential Equations

7.5

Derivative of a Matrix Valued Function w.r.t. its Entries and Vice versa

In Section 7.4 the derivative of a scalar valued function of a matrix w.r.t. a matrix was introduced. This idea will be extended in this section. The derivative of a matrix w.r.t. its entries and the derivative of an entry w.r.t. the matrix form the content of this section. Let X = (xij )m×n be any arbitrary matrix. Definition 7.5.1 The derivative of the matrix X w.r.t. one of its entries xkl ∂X and is given by is denoted by ∂x kl ∂X = (aij )m×n , where aij = ∂xij In other words,

∂X ∂xkl

 1, if i = k, j = l 0, otherwise.

= Pkl , where Pkl is a primary matrix of order m × n.

Observation. The derivative of XT w.r.t. xkl is

∂XT ∂xkl

= PTkl .

  x x12 Example 7.5.1 Let X = 11 x21 x22     0 1 0 0 ∂X ∂X and ∂x21 = Then ∂x12 = 0 0 1 0     T T 0 0 0 1 ∂X ∂X and ∂x12 = and ∂x21 = . 1 0 0 0 Definition 7.5.2 Let Zr×s be a matrix function of X, then, h i ∂zij . ∂xkl r×s

Example 7.5.2 Let Z= ∂Z = ∂x11

" 2 x11

x11 x12

#

x11 x222 x11 # " 2x11 x12

" 0 ∂Z = ∂x22 0

1

x222 0

# .

2x22 x11

∂Z ∂xkl

=

Calculus of Matrices

289

Product rule for the derivative of a matrix w.r.t. its Entries Let Pr×m and Qm×s be matrix functions of the matrix Xm×n such that (i) each entry pij and qij are functions of the matrix X (ii) the partial derivatives of pij and qij w.r.t. xkl exist for all i, j, k, l. Pm Set Zr×s = PQ. Then zij = u=1 piu quj , for all i, j. Now, for each fixed k, l, m

(i, j)th entry of

m

X ∂piu X ∂Z ∂zij ∂quj = = quj + . piu ∂xkl ∂xkl ∂x ∂xkl kl u=1 u=1

Hence the natural extension ∂P ∂Q ∂PQ = Q+P holds. ∂xkl ∂xkl ∂xkl The derivative of the (k, l) entry of the product Z = PQ w.r.t. the matrix X is given by m m X X ∂pku ∂zkl ∂qul = qul + pku ∂X ∂X ∂X u=1 u=1 Example 7.5.3 Verify the product rule for  2   x11 P= Q = x12 x11 x12

x11 x22



by differentiating w.r.t. x11 . Solution.

Hence,

 2  x11 x12 x311 x22 PQ = x11 x212 x211 x12 x22   ∂PQ 2x11 x12 3x211 x22 = . Hence, x212 2x11 x12 x22 ∂x11     ∂P ∂Q 2x11 Now = and = 0 x22 x ∂x11 ∂x11 12    2   ∂P ∂Q 2x11  x11  x12 x11 x22 + 0 Q+P = x x ∂x11 ∂x11 12 11 x12   2x11 x12 3x211 x22 . = 2 x12 2x11 x12 x22

Hence, the result is verified.

x22



290

Linear Algebra to Differential Equations

Derivative of the product Z = PXQ Consider the matrices P, X and Q of orders r×m, m×n and n×s, respectively. Then the matrix Z = PXQ is of order r × s. Clearly, Z is a function of the matrix X. ∂z ∂Z The problem is: given Z = PXQ, find ∂Xij and ∂x , and to solve this kl problem, using results from Chapter 6, set. Vec(Z) = Vec(PXQ) = (QT ⊗ P)Vec X. Now Vec Z is an rs × 1 column matrix. QT ⊗ P is a matrix of order rs × mn and Vec X is of order mn × 1. Hence (QT ⊗ P)Vec X is of order rs × 1. Consider zi1 element of Vec Z. Then zi1 = the product of ith row element of (QT ⊗ P) and Vec X. = product of the ith row of the matrix (q11 P, q21 P, . . . , qn1 P) and Vec X   x11  x21     ..   .   = (q11 pi1 q11 pi2 · · · q11 pim q21 pi1 q21 pi2 · · · q21 pim qn1 pi1 · · · qn1 pim )  xm1     .   ..  xmn m n X X piu qv1 xuv = v=1 u=1

Thus

∂zi1 ∂xkl

= pik ql1 . Hence, ∂zi1 = ∂X



∂zi1 ∂xkl





pi1 q11  pi2 q11  = .  ..

m×n

pi1 q21 pi2 q21 .. .

··· ··· .. .

 pi1 qn1 pi2 qn1   ..  . 

pim qi1 pim q21 · · · pim qn1   pi1  pi2      =  .  q11 q21 · · · qn1 .  .  pim = Pi∗ QT∗1 = PT Pi1 P11 QT = PT Pi1 QT .

Calculus of Matrices

291

where Pij is the primary matrix compatible with PT and QT. Hence Pij is of T T i1 order r × s. Thus ∂z ∂X = P Pij Q . Next,   ∂Z ∂zi1 = ∂xkl ∂xkl r×s   p1k ql1 p1k ql2 · · · p1k qls p2k ql1 p2k ql2 · · · p2k qls    = . .. .. ..   .. . . .  prk ql1 prk ql2 · · ·   p1k p2k     =  .  ql1 ql2 · · ·  .. 

prk qls

qlk



prk = P∗k Q∗l = PPk! P!l Q = PPkl Q, where Pkl is the primary matrix of order m × n. Example 7.5.4 Let Z = PXQ, where      z z12 p p3 x Z = 11 P= 1 X = 11 z21 z22 p2 p4 x21 Find

(i)

∂Z ∂xij

(ii)

  x12 q Q= 1 x22 q2

q3 q4



∂zij ∂X

Solution. Converting the above matrix relation into a vector form gives,     z11 p1 x11 q1 + p3 x21 q1 + p1 x12 q2 + p3 x22 q2 z21  p2 x11 q1 + p4 x21 q1 + p2 x12 q2 + p4 x22 q2     Vec (Z) =  z12  = p1 x11 q3 + p3 x21 q3 + p1 x12 q4 + p3 x22 q4  z22 p2 x11 q3 + p4 x21 q3 + p2 x12 q4 + p4 x22 q4 Differentiate the column vector w.r.t. x11 , x12 , x21 and x22 and covert the resulting vector in each case into a matrix. Hence it follows that     ∂Z ∂Z p1 q1 p1 q3 p1 q2 p1 q4 = , = (i) p2 q1 p2 q3 p2 q2 p2 q4 ∂x11 ∂x12     ∂Z ∂Z p3 q1 p3 q3 p3 q2 p3 q4 = , = p4 q1 p4 q3 p4 q2 p4 q4 ∂x21 ∂x22

292

Linear Algebra to Differential Equations Further, 

(ii)

∂z11 ∂X

∂z12 ∂X

∂z21 ∂X

∂z22 ∂X

∂z11  ∂x11  =  ∂z11 ∂x  21 ∂z12  ∂x11  =  ∂z12 ∂x  21 ∂z21  ∂x11  =  ∂z21 ∂x  21 ∂z21  ∂x11  =  ∂z22 ∂x21

 ∂z11  ∂x12  p1 q1  = p3 q1  ∂z11 ∂x22  ∂z12  ∂x12  p1 q3  = p3 q3  ∂z12 ∂x22  ∂z21  ∂x12  p2 q1  = p4 q1 ∂z21  ∂x22  ∂z22  ∂x12  p2 q3  = p4 q3  ∂z22 ∂x22

 p1 q2 ; p3 q2

p1 q4 p3 q4;



 p2 q2 ; p4 q2

p2 q4 p4 q4



Example 7.5.5 Let Z = P X−1 Q, where P, X, Q are invertible matrices of ∂Z orders r × n, n × n and n × s, respectively. Find ∂x . kl Solution. Differentiating the identity

Z Z−1 = I,

∂Z −1 ∂Z−1 Z +Z = O ∂xkl ∂xkl ∂Z−1 ∂Z = −Z Z ⇒ ∂xkl ∂xkl But



∂P X Q ∂xkl

= P Pkl Q gives

∂Z ∂ = ∂xkl ∂xkl

 ∂Z−1 ∂ = Q−1 X P−1 = Q−1 Pkl P−1 ∂xkl ∂xkl  P X−1 Q = −P X−1 Q Q−1 Pkl P−1 P X−1 Q = −P X−1 Pkl X−1 Q

Example 7.5.6 Let Z = P X−1 Q, where P, X, Q are of orders r × n, ∂zi1 n × n and n × s, respectively and X be an invertible matrix. Determine . ∂xkl

Calculus of Matrices

293

Solution. From Example 7.5.5,

Let

∂Z ∂xkl P X−1 ∂Z ∂xkl ∂zi1 ∂X

= − P X−1 Pkl X−1 Q = P1 ,

X−1 Q = Q1

= − P1 Pkl Q1 = − PT1 Pij QT1 T T = − X−1 P Pij QT

X−1

T

Example 7.5.7 Let Z = XT P X, where P, X, Q are of orders r × n, n × n ∂zi1 ∂Z . , (ii) and n × s, respectively. Find (i) ∂xkl ∂X Solution. (i) Consider Z = XT PX and applying chain rule, ∂Z ∂ ∂(PX) = ( XT ) P X + XT ∂xkl ∂xkl ∂xkl = PTkl P X + XT P Pkl . (ii) Let P1 = I, Q1 = P X, P2 = XT P, Q2 = I. ∂Z Then, = P1 PTkl Q1 + P2 Pkl Q2 ∂xkl ∂zi1 = Q1 PTij P1 + PT2 Pij QT2 ∂X = P X PTij + PT X Pij Example 7.5.8 Let Z = X2 , where X is a square matrix of order n. Find (i)

∂Z ∂xkl

(ii)

∂zi1 . ∂X

Solution. (i) Consider Z = X2 , then ∂Z ∂ ∂ ∂X ∂X = (X2 ) = (X X) = X+X ∂xkl ∂xkl ∂xkl ∂xkl ∂xkl

294

Linear Algebra to Differential Equations Now,  ∂X ∂ = ∂xkl ∂xkl



0 · · ·   = 0  ..  . 0

x11 · · ·   xr1   ..  .

x12 ··· xr2 .. .

···

···

···

x1s ··· xrs .. .

xn1

xn2

···

xns

···  0 · · ·  0  ..  .  0

0 ··· 0 .. .

···

0

···

···

0 ··· 1 .. .

···

0

···

···

···

 x1n ···   xrn   ..  .  xnn

= Pkl , where Pkl is the n × n matrix having 1 in the (k, l) place and zero elsewhere. Hence ∂Z = Pkl X + XPkl ∂xkl Observation. If Z = X3 , then set Z = X2 · X and employ the derivative of X2 obtained earlier. The procedure continues for higher orders of X. Example 7.5.9 Let X =

 x11 x21

 x12 . Then x22

x211 + x12 x21 Z=X = x11 x21 + x22 x21 2



x11 x12 + x12 x22 x21 x12 + x222



z11 = x211 + x12 x21  ∂z ∂z11   11  ∂z11 2x11 x21  ∂x11 ∂x12  = = ∂z11 ∂z11  x12 0 ∂X ∂x21 ∂x22      1 0 x11 x21 x x21 1 = + 11 0 0 x12 x22 x12 x22 0

 0 0

= P11 XT + XT P11 . generalizing this expression for a matrix X of order n × n. gives, ∂zij = Pij X T + XT Pij ∂X

Calculus of Matrices

295

EXERCISE 7.5       2 4 1 3 x1 x3 1. Let P = , Q = , X = and Z = PXQ. Find 5 6 1 4 x2 x4 ∂z12 ∂Z ∂Z ∂x1 , ∂x2 , and ∂X .   x1 x3 ∂Z ∂Z and ∂x . 2. Let X = . Set Z = X2 find ∂x 3 4 x2 x4     x1 x3 y1 (x1 , x2 , x3 , x4 ) y3 (x1 , x2 , x3 , x4 ) 3. Let X = , Y= . Set x2 x4 y2 (x1 , x2 , x3 , x4 ) y4 (x1 , x2 , x3 , x4 ) ∂Z ∂Z ∂Z ∂Z Z = XY find ∂x , ∂x , ∂x and ∂x . 1 2 3 4

7.6

The Matrix Differential

In this section, the concept of a matrix differential is introduced parallel to the corresponding notion of a scalar valued function h(x), where x = (x1 x2 . . . xn )T . Definition 7.6.1 Let Z = [zij ]m×n . The matrix differential dZ of the matrix Z is defined as   dz11 dz12 · · · dz1n    dz21 dz22 · · · dz2n    dZ =  . .. ..   .. . .  dzm1 dzm2 · · · dzmn =

[ dzij ]m×n . x211

x11 x12

x21 x12

x222

" Example 7.6.1 Let Z =



# d (x11 x12 ) dZ =  d (x21 x12 ) d x222 " # 2x11 dx11 x12 dx11 + x11 dx12 = x21 dx12 + x12 dx21 2x22 dx22 "

Then

d x211

#

Result 7.6.1 The matrix differential is a linear transformation

296

Linear Algebra to Differential Equations

1. d(αZ) = αdZ, where α ∈ R 2. d(Z1 + Z2 ) = d(Z1 ) + dZ2 . Example 7.6.2 Let Z =

 z11 z13

z12 dz14



   d (αz11 ) d (αz12 ) dz11 d (αZ) = =α d (αz13 ) d (αz14 ) dz13

Example 7.6.3 Let Z1 =

 z11 z13

z12 z14

 Z2 =

 z21 z23

 dz12 = αdZ. dz14

z22 z24



  d (z11 + z21 ) d (z12 + z22 ) d (z13 + z23 ) d (z14 + z24 )   dz11 + dz21 dz12 + dz22 = dz13 + dz23 dz14 + dz24     dz11 dz12 dz21 dz22 = + dz13 dz14 dz23 dz24

d (Z1 + Z2 ) =

= dZ1 + dZ2 .

Product rule for differentials Let Z1 and Z2 be two matrices of orders r × s and s × t, respectively. Then the product Z1 Z2 is a matrix of order r × t and d(Z1 Z2 )=(dZ1 )Z2 + Z1 dZ2 . Proof. Follows a similar pattern to derivative of matrices w.r.t. a variable. Example 7.6.4 Given A = (aij )2×2 , X = (xij )2×2 . Find the matrix differential (i) Z = A X (ii) Z = XT X. Solution. (i) Given

Z = AX =

 a11 x11 + a12 x21  a21 x11 + a22 x21

a11 x12 + a12 x22

 

a21 x12 + a22 x22

Calculus of Matrices

297

then, dZ =

=

  a11 dx11 + a12 dx21 a11 dx12 + a12 dx22   a21 dx11 + a22 dx21 a21 dx12 + a22 dx22    a11 a12 dx11 dx12    a21 a22 dx21 dx22

= AdX (ii) Given Z = XT X    x11 x21 x11 x12   = x12 x22 x21 x22   x11 x11 + x21 x21 x11 x12 + x21 x22  = x12 x11 + x22 x21 x12 x12 + x22 x22   P2 P2 i=1 2 xi1 dxi1 i=1 xi1 dxi2 + xi2 dxi1  T hen dZ =  P2 P2 xi1 dxi2 + xi2 dxi1 i=1 2 xi2 dxi2  i=1      dx11 dx21 x11 x12 x11 x12 dx11 dx12  +   = dx12 dx22 x21 x22 x21 x22 dx21 dx22 = ( dX )T X + XT dX

EXERCISE 7.6    x2 y 2 1 2x Z = find (i) d(3Z1 ) (ii) d(Z1 + 4Z2 ). 2 xy x2 y x2 y 2  2    3x 4x 2xy 3x2 2. Let Z1 = Z2 = , find d(Z1 Z2 ). 8xy 6y 2 1 x2 

1. Let Z1 =

7.7

Derivative of a Matrix w.r.t. a Matrix

In this section, the concepts introduced earlier are further extended to obtain a new notion, namely, the derivative of a matrix w.r.t. a matrix. From the earlier sections, derivative of a matrix Z = [zij ]r×s w.r.t. an element xkl is given by   ∂Z ∂zij = . ∂xkl ∂xkl r×s

298

Linear Algebra to Differential Equations

Definition 7.7.1 Let Z be a function of a matrix X = [xij ]m×n , then the ∂Z derivative of the matrix Z w.r.t the matrix X, ∂X is the partitioned matrix ∂Z whose (kl)th partition is ∂xkl and ∂Z = ∂X

=



∂Z ∂xkl  ∂Z  ∂x11   ∂Z   ∂x21   .  .  .  ∂Z ∂xm1

 mr×ns

∂Z ∂x12 ∂Z ∂x22 .. . ∂Z ∂xm2

··· ···

···

 ∂Z ∂x1n   ∂Z   ∂x2n  .  ..   .  ∂Z  ∂xmn mr×ns

Observation. The derivative of the matrix Z w.r.t. the matrix X can also be written as follows XX ∂Z ∂Z = Pkl ⊗ . ∂X ∂xkl k

l

The derivative of a determinant of a matrix w.r.t a matrix is an interesting concept and is given below. Example 7.7.1 Let Z = [zij ] be a matrix whose entriess are functions of a det Z . matrix X = [xkl ] or zij = fij (xkl ). To find ∂ ∂x kl Solution. For any fixed row p, let det Z = the element zpj in det Z.

From Chain rule,

P

j

zpj Zpj where Zpj is the cofactor of

X X ∂ det Z ∂zij ∂ det Z = . ∂xkl ∂zij ∂xkl i j

It has been 7.4 that,  shown in Section P P ∂zij  for i = p i j Zpj ∂xkl ∂|Z| = ∂xkl  P P (Pr z ∂ (Z )) ∂zij for i 6= p ps ∂xkl i j s=1 ps ∂zis    Zpj f or i = p  Next, define, A = [aij ], where aij =  Pr z ∂ (Z ) f or i = 6 p  ps s=1 ps ∂zis and B = (bij ) where bij =

∂zij ∂xkl .

Then

Calculus of Matrices

then

299

∂ det Z | X X = aij bij ∂xkl i j XX = aij P!j bij Pj! i

=

X

j

ATi Bi

i

= tr(A BT ) = tr(BT A) this is alternative form of the result.  x12 ∂ . Find ∂X (X−1 ). x22   1 x22 −x12 Solution. By definition, X−1 = . x11 x22 − x12 x21 −x21 x11  ∂  −1 ) ∂x∂12 (X−1 ) ∂ −1 ∂x11 (X (X ) = ∂ −1 ) ∂x∂22 (X−1 ) ∂X ∂x21 (X   −x222 x22 x12 −x22 x21 −x21 x12 − d 1  x21 x22 −x11 x22 + d −x221 x11 x21   = 2 2  x22 x12 −x12 −x12 x11 + d x11 x12  d −x12 x21 − d x12 x11 x21 x11 −x211 Example 7.7.2 Let X =

 x11 x21

where d = (x11 x22 − x12 x21 )  x12 and x22   x211 x12 `n(x11 + x12 ) Z= cos (x21 + x22 ) exp(x12 x22 )

Example 7.7.3 Find

∂Z ∂X ,

given that X =

Solution. By definition, " ∂Z ∂Z # ∂Z ∂x11 ∂x12 = ∂Z ∂Z ∂X ∂x21 ∂x22  2x11 x12   0  =  0  − sin(x21 + x22 )  a11 Example 7.7.4 If A =  a21

 x11 x21

1 x11 +x12

x211

0

0

0

0

0

− sin(x21 + x22 )

 x11 , X =  a22 x21 a12



1 x11 +x12



 x22 exp(x12 x22 )    0 

x12

x12 exp(x12 x22 )

  and Z = A XT. Use

x22

300

Linear Algebra to Differential Equations

the direct method to evaluate Solution. Given

∂ Vec Z ∂ Vec X .

Z = A XT   a11 a12  . =  a21 a22  a11 x11 + a12 = a21 x11 + a22

 x11  x12

x21

 

x22

x12

a11 x21 + a12 x22

x12

a21 x21 + a22 x22

 

then   z11 z21   = z12  z22

  a11 x11 + a12 x12 a21 x11 + a22 x12    a11 x21 + a12 x22  a21 x21 + a22 x22 ∂ z ∂ z21 ∂ z12 11  ∂ x11 ∂ x11 ∂ x11   ∂ z11 ∂ z21 ∂ z12  ∂ x ∂ x21 ∂ x21 ∂ Vec Z 21  =   ∂ z11 ∂ z21 ∂ z12 ∂ Vec X   ∂ x12 ∂ x12 ∂ x12  ∂ z ∂ z21 ∂ z12 11 ∂ x ∂ x22 ∂ x22  22  a11 a21 0 0      0 0 a11 a21     =    a12 a22 0 0      0 0 a12 a22

∂ ∂ ∂ ∂ ∂ ∂ ∂ ∂

z22  x11   z22   x21    z22   x12   z  22

x22

EXERCISE 7.7   x11 x12 + x211 sin(x21 + x22 ) x 1. Let Z = and X = 11 sin(x12 + x11 ) x211 + x21 x22 x21 ∂Z . ∂X 

 x12 . Find x22

 2. Let Z =

x11 x21 x11

Calculus of Matrices 301  ∂Z x11 x12 x11 x21 . Find with X as in problem x222 x12 x22 x11 + 3 ∂X

1.    ∂Z ln x222 + x212 exp x211 + x222 . Find with the same X 3. Let Z = x211 + x222 x222 + x12 ∂X as in problem 1.   ∂Z cos(x11 ) + sin(x22 ) cos(x12 ) + sin(x21 ) 4. Find , where Z = with x211 cos(x12 ) sin(x22 )(x211 + x22 ) ∂X the same X as in problem 1.

7.8

Derivative Formula using Kronecker Products

In this section, the concept of Kronecker product will be utilized. To develop differentiation of product of two matrices w.r.t. a matrix and the chain rule w.r.t. a matrix.

Product rule for matrix differentiation Let Yp×r and Zr×s be functions of a matrix Xm×n . The product rule w.r.t. an entry, xkl gives ∂YZ ∂Y ∂Z = Z+Y ∂xkl ∂xkl ∂xkl Now using the definition of the derivative of a matrix w.r.t. a matrix   ∂YZ X ∂Y ∂Z = Pkl ⊗ Z+Y ∂X ∂xkl ∂xkl k,l =

X k,l

Pkl ⊗

X ∂Y ∂Z Z+ Pkl ⊗ Y , ∂xkl ∂xkl k,l

where Pkl is an m × n primary matrix. Now X ∂(YZ) X ∂Y ∂Z = Pkl In ⊗ Z+ Im Pkl ⊗ Y ∂X ∂xkl ∂xkl k,l k,l    X X ∂Y ∂Z = Pkl ⊗ (In ⊗ Z) + (Im ⊗ Y) Pkl ⊗ ∂xkl ∂xkl k,l

∂Y ∂Z = (In ⊗ Z) + (Im ⊗ Y) . ∂X ∂X

k,l

302

Linear Algebra to Differential Equations

Observation. 1. The mixed product rule of Kronecker products is used in the second step. 2. The order of dimension as

∂(YZ) same ∂X is pm × ns. The right-hand side also has the  ∂Z ∂Y (I ⊗Z) , (I ⊗Y) and n nr×ns m pm×mr ∂X pm×nr ∂X mr×ns .

    x1 x3 Example 7.8.1 Let X = , Y = 5x1 + x2 x3 + x4 and Z = x2 x4   cos(x1 ) x2 + x3 . Verify the product rule for differentiation w.r.t. matrices exp(x3 ) sin(x4 ) for the product YZ. Solution.     cos(x1 ) x2 + x3 YZ = 5x1 + x2 x3 + x4 exp(x3 ) sin(x4 )  = (5x1 + x2 )cos(x1 ) + (x3 + x4 )exp(x3 )  (5x1 + x2 )(x2 + x3 ) + (x3 + x4 )sin(x4 ) ∂(YZ)1×2 (∂X)2×2   (5x1 + x2 )sin(x1 ) 5(x2 + x3 ) exp(x3 )[1 + x3 + x4 ] 5x1 + x2 + sin(x4 ) = 5cos(x1 ) − cos(x 5x1 + 2x2 + x3 exp(x3 ) (x3 + x4 )cos(x4 ) + sin(x4 ) 2×4 1)    1 0  5x1 + x2 x3 + x4 I2 ⊗ Y = 0 1  5x1 + x2 x3 + x4 0 = 0 0 5x1 + x2

0 x3 + x4



   1 0 cos(x1 ) x2 + x3 I2 ⊗ Z = 0 1 exp(x3 ) sin(x4 )   cos(x1 ) x2 + x3 0 0 sin(x4 ) sin(x4 ) 0 0   =  0 0 cos(x1 ) x2 + x3  0 0 exp(x3 ) sin(x4 ) Now   ∂(Y)1×2 5 0 0 1 = 1 0 0 1 2×4 (∂X)(2×2)   −sin(x1 ) 0 0 1  ∂Z1×2 0 0 exp(x3 ) 0   =  0 1 0 0  (∂X)2×2 0 0 0 cos(x1 )

Calculus of Matrices

303

Now substituting all the above calculations in ∂(YZ) ∂X ∂Y ∂Z = (I2 ⊗ Z) + (I2 ⊗ Y) ∂X ∂X   5cos(x1 ) 5(x2 + x3 ) exp(x3 ) sin(x4 ) = cos(x1 ) (x2 + x3 ) exp(x3 ) sin(x4 )   (5x1 + x2 )sin(x1 ) 0 (x3 + x4 )exp(x3 ) 5x1 + x2 + 0 5x1 + x2 0 (x3 + x4 )cos(x4 ) 5cos(x ) − (5x + x )sin(x ) 5(x + x ) exp(x )[1 + x + x ] 5x + x + sin(x )  1 1 2 1 2 3 3 3 4 1 2 4 = cos(x1 ) 5x1 + 2x2 + x3 exp(x3 ) (x3 + x4 )cos(x4 ) + sin(x4 ) . The result is verified.

Chain rule for derivative of matrices w.r.t. a matrix Consider the matrices X, Y, Z of orders m × n, p × q and r × s, respectively. ∂Z , the chain Let Zr×s = Z(Y(X)). Then Z = [zij (Y(X))]r×s . To find ∂X rule is developed as follows. For any xkl , " q p # X X ∂zij ∂yuv ∂Z = . ∂xkl ∂yuv ∂xkl v=1 u=1 r×s

Using the above relation in, X ∂Z ∂Z = Pkl ⊗ ∂X ∂xkl k,l

=

=

X

Pkl ⊗

X

i,j k,l q p XXX

Pij

Pkl

v=1 u=1 k,l

q X p X ∂zij ∂yuv ∂y uv ∂xkl v=1 u=1

∂yuv X ∂zij ⊗ Pij ∂xkl ∂y uv i,j

∂Z ∂X ∂yuv v u   X X  ∂yuv ∂Z = ⊗ Ir In ⊗ ∂X ∂yuv v u =

X X ∂yuv

In ⊗ Ir

Observation. 1. The scalar multiplication with Kronecker product (P ⊗ αQ) = αP ⊗ Q is used and the mixed product rule is used in the calculation. 2. The order of

∂Z ∂X

is rm × ns.

304

Linear Algebra to Differential Equations

3. The order of the right side of the equality is (rm × nr)(nr × ns) = rm × ns. Example7.8.2 Verify the chain rule for the following matrices X =     x11 x12 z11 z12 Y = y11 y12 and Z = where Z = Z(Y) and x21 x22 z z22   21 Y = Y(X), Z = Z(Y(X)), Z = zij Y(X) 2×2 .  ∂z

11

 ∂x11   ∂z21   ∂x11 ∂Z = ∂X   ∂z11  ∂x  21  ∂z 21 ∂x21

∂z12 ∂x11 ∂z22 ∂x11 ∂z12 ∂x21 ∂z22 ∂x21

∂z11 ∂x12 ∂z21 ∂x12 ∂z11 ∂x22 ∂z21 ∂x22

∂z12  ∂x12   ∂z22   ∂x12   ∂z12   ∂x22   ∂z  22

∂x22

Now since zij (Y(X)) hence ∂zij ∂zij ∂y11 ∂zij ∂y12 = + ∂xkl ∂y11 ∂xkl ∂y12 ∂xkl 2 X ∂zij ∂y1v = for all i, j, k, l = 1, 2. ∂y1v ∂xkl v=1 Hence X ∂z11  ∂y 1v  X ∂z 21   ∂Z ∂y1v  = ∂X X ∂z11   ∂y1v  X ∂z21 ∂y1v

∂y1v ∂x11 ∂y1v ∂x11 ∂y1v ∂x21 ∂y1v ∂x21

X ∂z12 ∂y1v X ∂z22 ∂y1v X ∂z12 ∂y1v X ∂z22 ∂y1v

∂y1v ∂x11 ∂y1v ∂x11 ∂y1v ∂x21 ∂y1v ∂x21

X ∂z11 ∂y1v X ∂z21 ∂y1v X ∂z11 ∂y1v X ∂z21 ∂y1v

∂y1v ∂x12 ∂y1v ∂x12 ∂y1v ∂x22 ∂y1v ∂x22

X ∂z12 ∂y1v X ∂z22 ∂y1v X ∂z12 ∂y1v X ∂z22 ∂y1v

 ∂y1v ∂x12   ∂y1v    ∂x12   ∂y1v   ∂x22   ∂y1v  ∂x22

Now following the procedure described in this section, it is sufficient to calculate       ∂y11 ∂Z ∂y12 ∂Z ⊗ I2 I2 ⊗ + ⊗ I2 I2 ⊗ . ∂X ∂y11 ∂X ∂y12

Calculus of Matrices   ∂y11 ∂Z Now ⊗ I2 I2 ⊗ ∂X ∂y11      ∂z11 ∂z12 ∂y11 ∂y11       ∂x11 ∂x12      ⊗ 1 0   1 0 ⊗  ∂y11 ∂y11  =  ∂z21 ∂z22   ∂y11 ∂y11  0 1  0 1 ∂y11 ∂y11 ∂x21 ∂x22   ∂y   ∂z ∂z12 ∂y11 11 11 0 0  0 0 ∂y11 ∂y11  ∂x11  ∂x12      ∂z21 ∂z22 ∂y11 ∂y11     0  0 0  0    ∂x11 ∂x12   ∂y11 ∂y11  =    ∂y11  ∂z ∂z   ∂y 11 12 11   0  0 0 0   ∂x   ∂y ∂y ∂x 11 11   21  22    ∂z21 ∂z22  ∂y11 ∂y11  0 0 0 0 ∂y11 ∂y11 ∂x21 ∂x22   ∂z11 ∂y11 ∂z12 ∂y11 ∂z11 ∂y11 ∂z12 ∂y11  ∂y11 ∂x11 ∂y11 ∂x11 ∂y11 ∂x12 ∂y11 ∂x12     ∂z ∂y   11 11 ∂z22 ∂y11 ∂z12 ∂y11 ∂z22 ∂y11    ∂y11 ∂x11 ∂y11 ∂x12 ∂y11 ∂x12   ∂y ∂x =  11 12   ∂z11 ∂y11 ∂z12 ∂y11 ∂z11 ∂y11 ∂z12 ∂y11     ∂y11 ∂x21 ∂y11 ∂x21 ∂y11 ∂x22 ∂y11 ∂x22     ∂z21 ∂y11 ∂z22 ∂y11 ∂z21 ∂y11 ∂z22 ∂y11  ∂y11 ∂x21 ∂y11 ∂x21 ∂y11 ∂x22 ∂y11 ∂x22    ∂y12 ∂Z Similarly ⊗ I2 I2 ⊗ ∂X ∂y12   ∂z11 ∂y12 ∂z12 ∂y12 ∂z12 ∂y12 ∂z12 ∂y12  ∂y12 ∂x11 ∂y12 ∂x11 ∂y12 ∂x12 ∂y12 ∂x12      ∂z ∂y  21 12 ∂z22 ∂y12 ∂z21 ∂y12 ∂z22 ∂y12    ∂y12 ∂x11 ∂y12 ∂x12 ∂y12 ∂x12   ∂y ∂x =  12 11   ∂z11 ∂y12 ∂z12 ∂y12 ∂z11 ∂y12 ∂z12 ∂y12     ∂y12 ∂x22 ∂y12 ∂x21 ∂y12 ∂x22 ∂y12 ∂x22     ∂z21 ∂y12 ∂z22 ∂y12 ∂z21 ∂y12 ∂z22 ∂y12  ∂y12 ∂x21 ∂y12 ∂x21 ∂y12 ∂x22 ∂y12 ∂x22 

305

306  Now

Linear Algebra to Differential Equations      ∂y11 ∂Z ∂y12 ∂Z ⊗ I2 I2 ⊗ + ⊗ I2 I2 ⊗ ∂X ∂y11 ∂X ∂y12

X ∂z11  ∂y 1v  X ∂z 21   ∂y1v  = X ∂z11    ∂y 1v  X ∂z21 ∂y1v

∂y1v ∂x11 ∂y1v ∂x11 ∂y1v ∂x21 ∂y1v ∂x21

X ∂z12 ∂y1v X ∂z22 ∂y1v X ∂z12 ∂y1v X ∂z22 ∂y1v

∂y1v ∂x11 ∂y1v ∂x11 ∂y1v ∂x21 ∂y1v ∂x21

X ∂z11 ∂y1v X ∂z21 ∂y1v X ∂z11 ∂y1v X ∂z21 ∂y1v

∂y1v ∂x12 ∂y1v ∂x12 ∂y1v ∂x22 ∂y1v ∂x22

X ∂z12 ∂y1v X ∂z22 ∂y1v X ∂z12 ∂y1v X ∂z22 ∂y1v

 ∂y1v ∂x12   ∂y1v    ∂Z ∂x12  = ∂y1v  ∂X  ∂x22   ∂y1v  ∂x22

and the result is verified. Example 7.8.3 Verify the chain rule for differentiation "of matrices w.r.t. #   2 y11 y11 y12   x11 x12 matrices given X = Y = y11 y12 and Z = 2 2 x21 x22 2y11 y12 y12 2 where y11 = x11 , y12 = x22 . Solution.   2 2 ∂(y11 ) ∂(y11 y12 ) ∂(y11 ) ∂(y11 y12 )  ∂x11 ∂x11 ∂x12 ∂x12     ∂(2y 2 y ) 2 2 2 ∂(y12 ) ∂(2y11 y12 ) ∂(y12 )    11 12   ∂Z ∂x11 ∂x12 ∂x12   ∂x11 =  2 2 ∂X  ∂(y11 ∂(y11 y12 ) ∂(y11 ) ∂(y11 y12 )  )    ∂x21 ∂x21 ∂x22 ∂x22     ∂(2y 2 y ) 2 2 2 ∂(y12 ) ∂(2y11 y12 ) ∂(y12 )  11 12 ∂x21 ∂x21 ∂x22 ∂x22 Using chain rule gives,

∂zij ∂zij ∂y11 ∂zij ∂y12 = · + for i, j = 1, 2 and k, l = 1, 2 ∂xkl ∂y11 ∂xkl ∂y12 ∂xkl   2x11 x222 0 0   4x11 x222 0 0 0  ∂Z   =   ∂X  0 0 0 2x x 11 22   0

0

4x211 x22

4x322

Now, to obtain the result using the chain rule formula, consider   ∂y11 ∂y11    ∂x11 ∂x12  ∂y11  = 1 0 , since y11 = x11 =  ∂y11 ∂y11  0 0 ∂X ∂x21 ∂x22

Calculus of Matrices

307

Similarly,   ∂y12 0 0 = , since y12 = x222 . 0 2x22 ∂X     ∂Z ∂Z 2y11 y12 0 y11 = = Now and 2 4y11 y12 0 2y11 2y12 ∂y11 ∂y12 Using the formula 2

X ∂Z = ∂X j=1



∂y1j ⊗ I2 ∂X

  ∂Z I2 ⊗ ∂y1j



     ∂y11 ∂Z ∂y12 ∂Z = ⊗ I2 I2 ⊗ + ⊗ I2 I2 ⊗ ∂X ∂y11 ∂X ∂y12         1 0 1 0 1 0 2y11 y12 = ⊗ ⊗ 0 0 0 1 0 1 4y11 y12 0         0 0 1 0 1 0 0 y11 + ⊗ ⊗ 2 0 2x22 0 1 0 1 2y11 2y12    2y11 y12 0 0 1 0 0 0 0 1 0 0 4y11 y11 0 0 0    = 0 0 0 0  0 0 2y11 y12  0 0 0 0 0 0 4y11 y12 0    0 y11 0 0 0 0 0 0  2 0 0 0 0  0 0   2y11 2y12  + 0 0 2x22 0 0 y11  0  0 2 0 0 0 2x22 0 0 2y11 2y12   2x11 x222 0 0 2 4x11 x22  0 0 0  =  0 0 0 2x11 x22  0 0 4x211 x22 4x322 Verifying the formula.

EXERCISE 7.8 1. Apply for the matrices given below, the product rule for differentiation   x11 x12 of matrix w.r.t. a matrix, X = x21 x22     2 cos (x11 ) x12 2x11 (i) Y = and Z = exp(x21 ) x22 x11 cos (x22 )

308

Linear Algebra to Differential Equations    x211 x11 x12 x21 cos(x11 ) x212 x22 (ii) Y = and Z = 2 x12 x21 x211 x21 sin(x12 ) x22 

2. Apply for the matrices given below,  the chain  rule for differentiation   x11 x12 of matrices w.r.t. a matrix X = (i) Y = y11 y12 and x21 x22   2 y11 y12 y12 Z= , where y11 = x11 x12 , y12 = x21 x22 2 y11 y11 y12     y11 y12 sin(y11 ) (ii) Y = y11 y12 Z= , where y11 = x211 , y12 = 2 cos(y22 ) y22 x221 .

7.9

Another Definition for Derivative of a Matrix w.r.t. a Matrix

The derivative of a matrix, with respect to a matrix can be defined in various ways. The following is another definition in this direction. Let Z be an r × s matrix and X be an m × n matrix. Assume that Z is a function of the entries of X. Definition 7.9.1 The derivative of the matrix Z w.r.t. the matrix is defined as " # X X ∂zij ∂Z = ∂xkl ∂X k l r×s

Observation. The definition yields a matrix of the same order as Z. " Example 7.9.1 Let Z =

x211 + x212

x211 using Definition 7.9.1. Find Solution. " # X X ∂zij ∂Z = ∂xkl ∂X k l

x12 x21 + x11 x22 x222 + x21 x11

#



x and X = 11 x11

 x12 . x22

∂Z ∂X

2×2

 ∂z11 ∂z11 ∂z11 ∂z11 ∂z12 ∂z12 ∂z12 ∂z12 + + + + + +  ∂x11 ∂x12 ∂x21 ∂x22 ∂x11 ∂x12 ∂x21 ∂x22   =  ∂z21 ∂z21 ∂z21 ∂z21 ∂z22 ∂z22 ∂z22 ∂z22  + + + + + + ∂x11 ∂x12 ∂x21 ∂x22 ∂x11 ∂x12 ∂x21 ∂x22   2x11 + 2x12 x21 + x22 + x12 + x11 = 2x11 x21 + x11 + 2x22 

Calculus of Matrices " 2 # x22 + x12 x11 + x221 Example 7.9.2 Consider Z = . Find x212 + x22 x211 + x11 Definition 7.7.1 and (ii) Definition 7.9.1.   ∂(x222 + x12 ) ∂(x11 + x221 )   ∂Z ∂X ∂X  1. = ∂X  ∂(x2 + x21 ) ∂(x2 + x11 )  12 11 ∂X ∂X   0 1 1 0 0 2x22 2x21 0  = 0 2x12 1 + 2x11 0 0 1 0 0 " # 1 + 2x22 1 + 2x21 ∂Z 2. = ∂X 2x12 + 1 1 + 2x11

309 ∂Z ∂X

using (i)

EXERCISE 7.9 1. Apply definition 7.9.1 for the problem 1 in Exercise 7.7. 2. Apply definition 7.9.1 for the problem 2 in Exercise 7.7.

7.10

Conclusion

In this chapter, various types of derivatives were introduced. These methods have a rational extension to functions of matrices. Some of the derivatives introduced here are not yet utilized and hence there is a great potential for new types of differential equations. Such a study will enhance the subject in the areas of linear algebra and differential equations. For further reading the readers are suggested to refer to [8], [23] and [24].

Chapter 8 Linear Systems of Differential Equations

8.1

Introduction

The study of linear systems of differential equations (LSDEs) essentially depends on the knowledge of linear algebra included in Chapters 1 to 4 and 6 of the present text. The present chapter includes the derivatives with respect to a scalar. Basic knowledge of the theory of differential equations, particularly, the existence and uniqueness of solutions is useful in the analysis of systems of differential equations and these results are stated without proof. The study of LSDEs has been found useful in many mathematical models involving coupled effects that exist in real-life situations. The present chapter brings forth a few basic principles of the study of LSDEs. In Section 8.2, the LSDEs and their variants are described in the ndimensional space, the aim being as how to solve LSDE. For this purpose, one needs the concept of fundamental matrix (FM) which is formulated by considering eigen-values and eigen-vectors of the involved matrices and is given in Section 8.3. A general method of successive approximations is included in Section 8.4. The theory of linear homogeneous systems is then extended by considering nonhomogeneous systems for which a variation of parameters formula of Lagrange’s type is obtained. This is the content of Section 8.5. Particularly, it is shown that linear systems of differential equations involving constant coefficients are rather easy to handle by introducing the concept of an exponential matrix function and methods of linear algebra. This has been achieved in Section 8.6. In addition to the essential features of LSDE’s, namely, existence and with uniqueness, one needs to know the nature of solutions of the systems and their path and future behavior involved in the study of stability. This concept is dealt with in Section 8.7. The last section includes applications of LSDEs to some modeling problems. Throughout the chapter adequate number of examples illustrate the theory.

DOI: 10.1201/9781351014953-8

311

312

8.2

Linear Algebra to Differential Equations

Linear Systems

Consider a linear system of coupled differential equations of order n x01 = a11 (t)x1 + a12 (t)x2 + · · · + a1n (t)xn x02 = a21 (t)x1 + a22 (t)x2 + · · · + a2n (t)xn .. . x0n = an1 (t)x1 + an2 (t)x2 + · · · + ann (t)xn

t ∈ I,

(8.1)

where I is a non-empty interval of the real line R. The system (8.1) has the following vector-matrix representation x0 = A(t)x,

t ∈ I,

(8.2)

where A(t) ∈ C[I, Rn×n ], x = [x1 x2 . . . xn ] an n-vector, and is a linear system of differential equations. This chapter aims at knowing the unknown function x for t ∈ I. Now add to (8.2) the value of the unknown vector x(t0 ), called ‘initial data’ t0 ∈ I and a given n-vector x0 = [x10 x20 . . . xn0 ]T. This addition enriches (8.2) as an initial value problem (IVP) x0 = A(t)x, x(t0 ) = x0 .

(8.3)

Here (8.3) is a homogeneous system, for at x = 0, x0 = 0. It is further generalized into a nonhomogeneous linear IVP having the form x0 = A(t)x + p(t), x(t0 ) = x0 ,

(8.4)

where p ∈ C[I, Rn ], p(t) 6= 0, t ∈ I, and is an n-vector of the type [p1 (t) p2 (t) . . . pn (t)]T. In (8.4) the n-vector p(t) affects the derivative x0 i.e the rate of change of x. Hence (8.4) is called a ‘perturbed system’. Other names for p(t) are (i) a booster (ii) a control factor depending on the areas of applications of the given system. Here the aim is to solve (8.4) i.e to find x(t) for t ∈ I satisfying the relations in equation (8.4). Definition 8.2.1 x(t) is called a solution of the IVP (8.4), if its derivative is equal to the right-hand side for t ∈ I and that x(t0 ) = x0 . Remark 8.2.1 (i) The system (8.4) is nonhomogeneous because the right side of (8.4) is such that A(t)x + p(t) 6= 0 when x = 0 (ii) The system (8.4) retains its ‘system’ structure if all elements of A(t), namely aij (t) are not simultaneously zero, for t ∈ I. (iii) The system (8.4) involves an n × n matrix A(t) and n-vectors x, p, x0 and therefore is called system in vector–matrix form.

Linear Systems of Differential Equations

313

The following proposition stated without proof guarantees the existence of a unique solution of (8.4) on I. Proposition 8.2.1 The IVP (8.4) possesses a solution x(t) on I if A(t) ∈ C[I, Rn×n ] and p ∈ C[I, Rn ]. Further, x(t) is unique on I. In the above proposition the condition that A(t) is continuous on I is indeed, strong. Under some relaxed conditions, the conclusions still hold. It is claimed that the linear nonhomogeneous differential equation x(n) + a1 (t) x(n−1) + .... + an (t) x = h(t),

(8.5)

where the scalar functions a1 , a2 , . . . , an and h are all continuous functions on I can be considered as linear system. Let the initial conditions at t0 ∈ I be given as x(t0 ) = b1 , x0 (t0 ) = b2 , .., x(n−1) (t0 ) = bn , where b1 , b2 , . . . , bn are given constants. The following transformations x1 = x, x01 = x2 , . . . , x0n−1 = xn convert (8.5) in to the form of an n-system (8.4), where x = (x1 x2 . . . xn )T and   0 1 0 ··· 0  0 0 1 ··· 0     .. .. .. .. ..  . A(t) =  . . . . .     0 0 0 ··· 1  −an −an−1 −an−2 · · · −a1 p(t) = (0 0 . . . h(t))T, and x0 = x(t0 ) = (b1 b2 .... bn )T. The following example is illustrative. Example 8.2.1 Convert the linear differential equation into a system form. x000 − 8x00 + 15x0 − 4x = 0. Solution. Let x1 = x, x2 = x01 = x0 , x3 = x02 = x00 and x03 = 8x3 − 15x2 + 4x1 . Hence it follow that  0  0   0 x1 x1 x2  = x02  = 0 x3 x03 4

1 0 −15

  0 x1 1 x2  8 x3

is a system of the form (8.4) with p(t) = 0, t ∈ I.

314

Linear Algebra to Differential Equations

In case the given equation is strengthened further, i.e. x000 − 8x00 + 15x0 − 4x = h(t), t ∈ I. where h ∈ C [I, R] then the related system takes the form (8.4) where p(t) = [0 0 h(t)]T. Example 8.2.2 Convert the following system of 4 differential equations (x0 y 0 z 0 w0 ) = (4x+5y +6z +7 −3x−3z 7y −3z +w +6 x+y +z +w +1) into a system. Solution. The system form(8.4) is given by :  0  4 x −3 y    = 0 z  1 w

    5 6 0 x 7  y  0 0 −3 0    +  . 7 −3 1  z  6 1 1 1 w 1

EXERCISE 8.2 1. Express the following systems of equations into vector - matrix differential equation, (i) (x0 y 0 z 0 ) = (3x + 2y 0

0

0

0

0

0

2x + y + z

(ii) (x y z ) = (2x + 5y − 9z (iii) (x y z ) = (a1 x + b1 y + c1 z

− 3z);

7x − 2y + 3z

− x + 5y + 7z);

a2 x + b2 y + c2 z

a3 x + b3 y + c3 z).

2. Express the following scalar equations into vector–matrix form (i) x00 + ax0 + w2 x = 0; (ii) x000 + 7x00 + 9x0 − 5x = 0. (iii) x(4) + 5x00 − 6x0 + 9x = 0; 3. Transform into vector–matrix IVP, the differential equations (i) x00 + 8x0 + 25x = 3, x(0) = 0, x0 (0) = 5; (ii) x000 + 3x00 + 3x0 − 4x = t2 + 4, x(1) = 10, x0 (1) = 5, x00 (1) = 7; (iii) x00 = x0 + y 0 − z + t, y 00 = tx + y 0 − 2y + t2 + 1, z 00 = x − y + y 0 + z, x(1) = 3, x0 (1) = 10, y(1) = 1, y 0 (1) = −5, z(1) = 4, z 0 (1) = 3. 4. Express into vector–matrix form (i) (x0 y 0 z 0 w0 ) = (3x + y + w 5z + 3w − x + y + z − w 5w), x(1) = y(1) = z(1) = w(1) = 2;

Linear Systems of Differential Equations

315

(ii) x0 = 3x + y + w + t, y 0 = 5z + 3w − et , z 0 = −x + y + z − w + sin t, w0 = 5w + tet . (iii) x(5) + 9x(4) + x = 10et , x(0) = x0 (0) = x00 (0) = x000 (0) = x(4) (0) = 1.

8.3

Fundamental Matrix

The aim of this section is to develop a technique for solving the system (8.2). The necessary tool is the formation of the fundamental matrix (FM) defined below. Definition 8.3.1 The system (8.2) possesses n linearly independent solutions φ1 , φ2 , . . . , φn where each φi has its components [φ1i φ2i . . . φni ]T, for i = 1, 2, ..., n. Then solutions written as n columns form an n × n matrix, called the fundamental matrix Φ(t), such that Φ0 (t) = A(t)Φ(t), t ∈ I.

(8.6)

Firstly, there is a need to determine criteria for a matrix Φ(t) to be a FM. The following theorem provides the answer. Theorem 8.3.1 Let A(t) ∈ C[I, Rn×n ]. Suppose that the relation (8.6) holds. Then the determinant Φ(t), satisfies the scalar differential equation [det Φ(t)]0 = [tr A(t)] det[Φ(t)], t ∈ I,

(8.7)

tr [A(t)] = [a11 (t) + a22 (t) + · · · + ann (t)], t ∈ I.

(8.8)

where Here the solution of (8.7) is given by  t  Z det[Φ(t)] = detΦ(t0 ) exp  tr [A(s)] ds,

t0 , t ∈ I,

t0

where t0 is the initial point. Proof. Here the column vectors φ1 , φ2 , ..., φn are the solutions of the system (8.2). Each φi (i = 1, 2, . . . , n) has n-elements in it, denoted by φ1i , φ2i , . . . , φni . In view of (8.2) it follows that φ0ij (t) = ai1 (t) φ1j (t) + ai2 (t) φ2j (t) + · · · + ain (t)φnj (t), f or t ∈ I,

(8.9)

316

Linear Algebra to Differential Equations

where i, j = 1, 2, ..., n. Now consider the determinant function of the matrix Φ(t), t ∈ I, given by φ11 φ12 · · · φ1n φ21 φ22 · · · φ2n (8.10) det Φ(t) = . (t), t ∈ I. .. φn1 φn2 · · · φnn Hence 0 φ11 φ21 [det Φ(t)]0 = . .. φn1

φ012 φ22 .. .

··· ···

φ01n φ11 φ2n φ021 .. + .. . . φnn φn1

··· φ11 φ12 φ21 φ22 + ··· + . .. .. . 0 φ0 n1 φn2 φn2

··· ··· ···

φ12 φ022 .. .

··· ···

φ1n φ02n .. . φnn

φn2 · · · φ1n φ2n .. , t ∈ I. . 0 φnn

(8.11)

Hence it follows that, the first determinant in (8.11) is given by n n n X X X a1k φkn a1k φk1 a1k φk2 · · · k=1 k=1 k=1 . φ φ · · · φ 21 22 2n .. .. .. . . . φn1 φn2 ··· φnn The simplification yields that the first determinant contributes, in (8.11), (a11 det Φ). Continuation of the similar steps, it follows that the contribution of the second determinant, in (8.11), is (a22 det Φ). Following the same procedure in the third to n determinants, the contribution to (8.11) by the last determinant is the (ann det Φ). Hence (det Φ)0 = (a11 + a22 + · · · + ann ) det Φ = (tr Φ)det Φ, t ∈ I, which is (8.7). Observation. (8.7) is a linear scalar homogeneous, first order differential equation with the initial point at t0 ∈ I, it has a solution given in Theorem 8.3.1. Here one query needs to be answered. ’Does the system (8.2) have a unique FM or not? The answer is that the FM in (8.6) is not unique. For, if Φ is one FM then ΦC, where C is an arbitrary n × n non-singular constant matrix, is also FM. It is clear that ΦC also satisfies (8.6). The examples given below are useful.

Linear Systems of Differential Equations 317  3t  e 2t e3t Example 8.3.1 Show that Φ(t) = is the FM of the system 0 e3t   3 2 x0 = x. 0 3 Solution. (i) The first criteria is toshow of Φ(t) are linearly indepen that the3tcolumns  e3t 2te dent. Consider c1 + c2 = 0 implies that e3t (c1 + 2tc2 ] = 0 0 e3t and e3t c2 = 0, that is, c2 = 0 and hence c1 = 0.   3 2 0 (ii) To prove that Φ (t) = Φ(t) : 0 3  3t  3e 6t e3t + 2e3t 0 Φ (t) = . 0 3 e3t    3t    3 2 e 2t e3t 3 2 0 Φ (t) = = Φ(t). 0 3 0 e3t 0 3 Hence, Φ(t) is an FM. 0 T Example8.3.2 Consider  the linear system x = Ax where x = [x1 x2 x3 ] −3 1 0 and A =  0 −3 1 . 0 0 −3   t2 e−3t t e−3t e−3t  2!  Verify that Φ(t) =  0 e−3t t e−3t  is an FM.

0 0 e−3t Solution. (i) To show that the 3 columns of Φ(t) are linearly independent note that       2  t t /2!  1  e−3t c1 0 + c2 1 + c3  t  = 0,   0 0 1 implies that c1 = c2 = c3 = 0 since e−3t 6= 0. (ii) Further,    1 t t2 /2 ! 0 t  + e−3t 0 Φ0 (t) = −3e−3t 0 1 0 0 1 0  2 3t −3 1 − 3t t − −3t  2!  =e 0 −3 1 − 3t  0

0

−3

1 0 0

 t 1 0

318

Linear Algebra to Differential Equations 2   −3t −3t −3t t −3 1 0 e t e e  2!  =  0 −3 1   0 e−3t t e−3t  0 0 −3 0 0 e−3t = A Φ(t).

Hence, Φ(t) is an FM.  3t e Example 8.3.3 Verify relation (8.7) when Φ(t) = 0 is the FM in Example 8.3.1.

2 t e3t e3t



Solution. Given, A (t) =

 3 0

 2 . 3

The trace of A(t) = 6. Further, detΦ (t) = e6t which implies that [det Φ (t)]0 = 6 e6t . Clearly, the relation (8.6) holds. From Theorem 8.3.1, it is noted that there exists a close relationship between the x(t) of the differential system (8.3) and its FM. Consider the expression Φ(t)α, where α is an unknown constant n-vector. Assume that x(t) = Φ(t)α, t ∈ I,

(8.12)

is the solution of (8.3). The unknown n-vector α in (8.12) needs to be found. Clearly, x0 (t) = Φ0 (t)α = A(t) Φ(t)α = A(t) x(t), t ∈ I in view of (8.12). Now, consider IVP (8.3), with initial condition, x(t0 ) = x0 . Hence, x(t0 ) = Φ(t0 )α = x0 , which implies that α = Φ−1 (t0 ) x0 . Note that Φ−1 (t0 ) exists, it follows that x(t) = Φ(t) α = Φ(t) Φ−1 (t0 ) x0 ,

t ∈ I,

(8.13)

where x(t) is the solution of the IVP (8.3). Here (8.13) establishes the relation between the solution x(t) and the FM Φ(t). Thus, if the FM of a system is known, then the solution x(t) is given by (8.13).  3 Example 8.3.4 Find the solution of the system x = 0 0

   2 1 x, x(0) = . 3 1

Solution.  3tThe FM for this problem, as given in Example 8.3.1, is e 2 te3t Φ(t) = , t ∈ I. Hence, 0 e3t     1 0 1 0 Φ(0) = and Φ−1 (0) = . 0 1 0 1

Linear Systems of Differential Equations

319

It follows from (8.13) that x(t)

= = =

Φ(t) Φ−1 (0) x(0)  3t    e 2 te3t 1 0 1 0 e3t 0 1 1   3t (1 + 2t)e . e3t

Example 8.3.5 Find the solution of the IVP     0 1 0 1 x0 = A(t)x = 0 −2 −5 x, x(0) = 0, t ∈ I. 0 1 2 1 Solution. The FM Φ(t) is given by  1 −2 + 2 cos t + sin t cos t − 2 sin t Φ(t) = 0 0 sin t

 −5 + 5 cos t −5 sin t . cos t + 2 sin t

Verify that Φ(t) is, indeed, an FM. Here    1 1 0 0 Φ(0) = 0 1 0 and Φ−1 (0) = 0 0 0 0 1

0 1 0

 0 0. 1

Hence it follows, after substitutions, that x(t) = Φ(t) Φ−1 (0) x0   −4 + 5 cos t =  −5 sin t . cos t + 2 sin t

EXERCISE 8.3 1. Show that Φ(t) given below is an FM of the stated linear systems differential  2t   0    4 e + 2 e−4t e2t − e−4t x1 0 1 x1 (i) Φ(t) = , = , 8 e2t − 8 e−4t 2 e2t + e−4t x2 8 −2 x2   0     −t e 0 2(et − e−t ) x1 −1 0 4 x1 et − et , x2  =  0 −1 2 x2 , (ii) Φ(t) =  0 e−t 0 0 et x3 0 0 1 x3

320

Linear Algebra to Differential Equations  2t   0    e e−t 0 x1 0 1 1 x1 0 e−t , x2  = 1 0 1 x2 . (iii) Φ(t) = e2t e2t −e−t −e−t x3 1 1 0 x3

2. Suppose that Φ(t) and Ψ(t) are two FMs of the system (8.2) for t ∈ I. Show that there exists a n × n constant nonsingular matrix C such that Ψ = Φ C. Is α Φ(t) + βΨ(t) (α, β are real ) an FM? 3. Let Φ(t) be FM of (8.2) where A(t) is a real matrix. Show that d (Φ−1 (t))T = −AT [Φ−1 (t)]T. dt Is (Φ−1 )T an FM of the system x0 = −AT (t)x, t ∈ I?

8.4

Method of Successive Approximations

The aim of this section is to develop a method for finding the solution of the IVP (8.3). Definition 8.4.1 Successive approximations for the IVP (8.3) is a sequence of functions, denoted by {φ0 , φ1 , φ2 , . . . , φn , ...} and defined on I as follows: φ0 (t) = x0 Zt φ1 (t) = x0 +

A(s)φ0 (s) ds, t0

Zt φ2 (t) = x0 +

A(s)φ1 (s) ds t0

.. . Zt φn (t) = x0 +

A(s)φn−1 (s) ds n = 1, 2, . . .

(8.14)

t0

defined on I. Observation. φ0 (t) is the constant vector function x0 (given in (8.3)) and therefore is known. The next approximation φ1 (t) depends on φ0 (t) = x0 and is well defined. The process thus continues for n = 2, 3, 4, · · · . The conclusion is that all the elements of the sequence {φn (t)} exist and are well defined vector functions on I.

Linear Systems of Differential Equations

321

This method of successive approximations is explained through the following illustrative example. Example 8.4.1 Consider the IVP for t ∈ I, x00 − 4x0 + 4x = 0, x(0) = 0, x0 (0) = 1, where 0 ∈ I. The IVP has the representation  0        x1 0 1 x1 x10 0 = , = , t ∈ I. x2 −4 4 x2 x20 1

(8.15)

(8.16)

Note that the IVP (8.15) is converted into the system form (8.16) by the substitution   0 0 0 00 0 x = x1 , x = x1 = x2 and x = x2 = −4x1 +4x2 . Now, suppose that φ0 = . 1   Zt      0 0 1 0 t ds = . Then, φ1 (t) = + 1 −4 4 1 1 + 4t 0

Further, in view of (8.14), φ2 (t) is given as   Zt  0 0 φ2 (t) = + 1 −4

    1 s t + 2t2 ds = . 4 1 + 4s 1 + 4t + 6t2

0

Continuation of the process provides # " t + 2t2 + 2t3 and φ3 (t) = 3 1 + 4t + 6t2 + 16 3 t " # t + 2t2 + 2t3 + 34 t4 φ4 (t) = 10 4 3 1 + 4t + 6t2 + 16 3 t + 3 t

for t ∈ I.

[As an exercise, the readers may verify φ3 (t) and φ4 (t).] Taking into account the nature of emerging pattern of polynomials in the vector, φ5 (t) and so on, it turns out that     x1 (t) t e2t = , t ∈ I. (8.17) x2 (t) (1 + 2t) e2t The limiting column vector as n → ∞, is indeed the solution of system (8.15). During the conversion of (8.15) to (8.16), it was assumed that x1 = x, hence the component x1 (t) of the vector (8.17) is the solution of the IVP (8.15), i.e., x(t) = t e2t . It is left to the readers to verify that x(t) is the solution of differential equation (8.15).

322

Linear Algebra to Differential Equations

The method of successive approximations theoretically works well, however, it becomes tedious after a few approximations. With modern developments in computational mathematics, this method yields better results. In the above example, the matrix A(t) = A is a constant. Hence the solution x(t) is obtained smoothly. The following existence result is stated without proof. Proposition 8.4.1 Consider the IVP for the system (8.3) and the scheme of successive approximations given in (8.14), where the matrix A(t) is a continuous matrix for t ∈ I and x0 is a given constant vector. Then, the sequence {φ0 (t), φ1 (t), . . . , φn (t), ...}, t ∈ I converges to a vector function x(t), t ∈ I as n → ∞ and x(t) is the solution of (8.3) existing uniquely for t ∈ I. There is yet another theoretical approach to get the solution x(t) of the IVP (8.3). Integrating (8.3) between the limits t0 and t in I yields Zt x(t) = x0 +

t0 , t1 ∈ I.

A(t1 ) x(t1 ) dt1 ,

(8.18)

t0

Since the matrix A(t) is a continuous function the integral equation remains well defined on I. What has been achieved is that the vector-matrix IVP is converted into a new form called ‘integral equation’ (8.18). Note that (8.18) is not a solution of the IVP (8.3), since the vector function x(t) appears on both the sides of (8.18). However, this fact provides an advantage since x(t) may be replaced in (8.18) under the integral sign providing the relation   Zt Zt1 x(t) = x0 + A(t1 ) x0 + A(t2 ) x(t2 )dt2  dt1 , t0

t0

Zt =

x0 +

Zt1

Zt A(t1 ) x0 dt1 +

t0

A(t1 ) t0

A(t2 ) x(t2 ) dt2 dt1 , t0

which exists and is well defined. Further the double integral contains the term x(t2 ). As the process continues, the existence of each term is guaranteed infinite times, providing the following statement Zt Zt1

Zt x(t) = lim [x0 +

A(t1 ) x0 dt1 +

k→∞

t0 tZk−1

Zt ··· +

A(t1 ) A(t2 ) x0 dt2 dt1 t0 t0

··· t0

= Φ(t, t0 ) x0 .

A(tk ) · · · A(t1 ) x0 dtk · · · dt1 ] t0

(8.19)

Linear Systems of Differential Equations

323

Under the condition of continuity of A(t) on I, it is logically possible to show that the infinite series occurring in (8.19) converges uniformly to a limit x(t) which is, indeed, the unique solution of the IVP (8.3). The process of integration in (8.19) is tedious but it is mathematically tenable. If in the IVP (8.3), A(t) = A, t ∈ I is a constant matrix then the process of infinite times integration in (8.19) works well and provides Φ(t, t0 ) = e(t−t0 )A and x(t) = e(t−t0 )A x0 , t ∈ I. The matrix Φ(t, t0 ) is called evolution or transition matrix.

Properties of Φ(t, t0 ) The following are the properties of Φ(t, t0 ): (i) Φ(t, t0 ) Φ(t0 , s) = Φ(t, s),

t, t0 , s ∈ I;

(ii) Φ(t, t) = In , In being the n × n identity matrix and t ∈ I; (iii) Φ−1 (t, t0 ) = Φ(t0 , t); (iv)

∂ ∂ t Φ(t, t0 )

= A(t)Φ(t, t0 ), or Φ(t, t0 ) = In +

Rt

A(s) Φ(s, t0 ) ds.

t0

These properties are useful in mathematical modeling problems, particularly when A(t) = A, A being a constant matrix and Φ(t, t0 ) = e(t−t0 )A , t ∈ I. The properties (i) to (iv) of Φ(t, t0 ) are clearly verified. Let the n-vectors xi = (xi1 xi2 · · · xin ), i = 1, 2, ...n be a set of nlinearly independent solutions with initial conditions (t0 x0 ) where x0 = (x10 x20 . . . xn0 ), then it follows that xi (t) = Φ(t, t0 ) xi0 ,

i = 1, 2, . . . , n

which is X(t) = Φ(t, t0 )X(t0 ), where X(t) and X(t0 ) are n × n matrices satisfying the matrix differential equation X0 (t) = A(t) X(t), X(t0 ) = X0 . Hence, it follows that Φ(t, t0 ) = X(t) X−1 (t0 ),

t0 , t ∈ I.

Note: Matrix differential equations are treated in Chapter 9.

324

Linear Algebra to Differential Equations

EXERCISE 8.4 1. Find the first three successive approximations for the following systems of equations  0        x1 1 0 x1 x (0) 2 (i) = , 1 = x2 0 −1 x2 x2 (0) 0  0  t       x1 e 1 x1 x (1) 1 . (ii) = , 1 = x2 t e t x2 x2 (1) 1  0        x t 1 x x(0) 1 (iii) = , = y 1 t y y(0) 0

8.5

Nonhomogeneous Systems

The form of a nonhomogeneous system is given in (8.4) with initial conditions, where A ∈ C[I, Rn×n ], p(t) 6= 0, t ∈ I. Note that p(t) is defined on I. It has already been mentioned that the n-vector solution x(t), under the given assumptions, exists on I and is unique. Consider the system x0 = A(t)x, x(t0 ) = x0 ,

t, t0 ∈ I,

and its perturbed system y0 = A(t)y + p(t), y(t0 ) = x0 having unique solutions x(t) and y(t) respectively. Note that the two IVP’s have identical initial conditions. The aim of this section is to establish a relation between the solutions x(t) and y(t), t ∈ I. From (8.13), it is known that x(t) = Φ(t) Φ−1 (t0 ) x0 ,

t ∈ I.

Assume that y(t) = Φ(t) Φ−1 (t0 ) u(t), t ∈ I, where the unknown function u(t) needs to be determined. Differentiation of (8.20) yields y0 (t)

=

Φ0 (t)Φ−1 (t0 ) u(t) + Φ(t) Φ−1 (t0 )u0 (t)

=

A(t)Φ(t) Φ−1 (t0 ) u(t) + Φ(t) Φ−1 (t0 ) u0 (t)

=

A(t) y(t) + Φ(t) Φ−1 (t0 ) u0 (t) = A(t) y(t) + p(t).

(8.20)

Linear Systems of Differential Equations

325

Thus, it follows that p(t) = Φ(t) Φ−1 (t0 ) u0 (t) or equivalently u0 (t) = Φ(t0 ) Φ−1 (t) p(t), t ∈ I. Integration leads to u(t) − u(t0 ) = u(t) − x0 = Rt Φ (t0 ) Φ−1 (s) p(s)ds. t0

Substituting the function u(t) in (8.20) provides   Zt Φ(t0 ) Φ−1 (s) p(s) ds y(t) = Φ(t) Φ−1 (t0 ) x0 + t0

= Φ (t) Φ

−1

Zt (t0 ) x0 +

Φ(t) Φ−1 (s) p(s) ds, t ∈ I.

(8.21)

t0

Hence, y(t) has been obtained explicitly. Since x(t) = x(t, t0 , x0 ) = Φ(t) Φ−1 (t0 ) x0 , it follows that x(t, t0 , p(s)) = Φ(t) Φ−1 (t0 ) p(s), and

Zt y(t) = x(t, t0 , x0 ) +

x(t, s, p(s)) ds.

(8.22)

t0

Equation (8.22) clearly establishes the relation between the solutions x(t) of (8.3) and y(t) of (8.4). Observation. 1. The choice of x0 = 0 in (8.3) implies that x(t) ≡ 0 for t ∈ I. Then, Zt y(t) = Φ (t) Φ−1 (s)p(s) ds, t ∈ I. t0

2. If the perturbation p(t) = 0, then y(t) = x(t) = Φ(t) Φ−1 (t0 )x0 . 3. In (8.3) and (8.4) the initial data ((t0 , x0 )) is identical. The solution of (8.4) depends on the parameters (t0 x0 ). The parameter x0 is now replaced by an unknown function u(t), for t ∈ I. Since the parameter x0 is replaced, the relations (8.21) and (8.22) are known in the literature as variation of parameters formula (VPF). The VPF formula for scalar linear differential equations was first established by the mathematician Lagrange in the 18th century. 4. In (8.3) and (8.4) the initial conditions are identical. However, they can be different from each other. Then, what is the change in VPF (8.21) in such a situation? The generalized VPF is now available in the

326

Linear Algebra to Differential Equations mathematical literature. It is not included here since it is beyond the scope of this book.

5. The VPF (8.21) or (8.22) is important since it has several applications in the theory of DEs. The examples given below are illustrative. Example 8.5.1 Solve the IVP  0          x1 1 0 x1 1 x10 (0) 1 = + , = . x2 0 2 x2 1 x20 (0) 0 Solution. An FM is given by   t  1 e2t e 0 −1 Φ(t) = , Φ (t) = 0 2e2t e3t 0   2 0 Φ−1 (0) = . 0 1

  0 1 , Φ(0) = et 0

 0 , 2

[Note: There can be an infinite number of FM’s.] Hence,  t    e 0 1 0 1 0 e2t 0 1 0   t−s   Z t t e 0 e 0 1 + . ds 2t 2(t−s) 0 e 1 0 e 0    2s   t  Z t 1 et e2t 2e − e2s 2e − e2t . ds. = + 3s et 2e2t −es + es 2et − 2e2t 0 e

x(t) =

Example 8.5.2 Consider  0  x1 0 = x2 −e2t

1 1

        x1 0 x10 (0) 0 + , = . x2 1 x20 (0) 1

Find the solution of the system, where  sin et Φ(t) = t e cos et

cos et −et sin et



is a fundamental matrix. Solution. Φ−1 (t) =

 1 −et sin et −et −et cos et

   − sin 1 − cos 1 − cos et −1 , Φ (0) = − . sin et − cos 1 sin 1

Linear Systems of Differential Equations

327

Also,   0 x0 = . Hence, 1 cos et −et sin et   sin(et − 1) = t . e cos(et − 1) 

x(t, 0, x0 ) =

sin et et cos et



  sin 1 cos 1 0 cos 1 − sin 1 1

Observation. x(t,  0, x0 ) is the solution of the homogeneous part of the IVP. 0 Further, if p(t) = , then 1 Z x(t, s, p(s)) =

t −1

Φ(t) Φ

Z t (s) p(s)ds =

0

0

 sin(et − es ) ds. e(t−s) cos(et − es )

All terms in (8.22) are obtained. Hence,   Z t  sin(et − es ) cos(et − 1) y(t) = − + ds, −et sin(et − 1) e(t−s) cos(et − es ) 0 is the solution of the nonhomogeneous equation.

EXERCISE 8.5 1. Solve the IVPs (i) x00 + 2x0 − 8x = et , x(0) = 1, x0 (0) = −4; (ii) x00 + x = 1, x(π) = 1, x0 (π) = 2. 2. Solve the IVPs  0          x1 0 1 x1 1 x1 (0) x (i) = + , = 10 ; x2 −2 3 x2 1 x2 (0) x20  0     t      x1 −1 0 4 x1 e x10 0 (ii) x2  =  0 −1 2 x2  + e−t ; x20  = 1. x3 0 0 1 x3 0 x30 1

328

8.6

Linear Algebra to Differential Equations

Linear Systems with Constant Coefficients

Finding the solution of the system (8.3) is difficult when the matrix A(t) depends on t. The present section aims at developing a method for finding an explicit solution of the system (8.3) when the matrix A(t) is independent of t, i.e., A is an n × n constant matrix. Thus, the system under consideration is x0 = Ax, x(t0 ) = x0 , t ∈ I.

(8.23)

The method consists of finding eigen values and eigen-vectors of A which are then used to write x(t) explicitly. Definition 8.6.1 eat , where a is a scalar, t ∈ I, is defined as eat = 1 + at +

∞ X an tn an tn a2 t2 + ··· + + ... = 1 + . 2! n! n! n=1

The series is uniformly convergent on I. This series is now extended by replacing the scalar ‘a’ by a matrix A, as follows: et A = In +

∞ n X t An , t ∈ I, n! n=1

(8.24)

where In is the n × n identity matrix defined in Chapter 1. The series is uniformly convergent under a matrix norm. Hence, etA is well defined. In relation (8.24) etA is called a matrix exponential function for t ∈ I.   0 1 tA Example 8.6.1 Find e where A= . −1 0 Solution. Here    0 1 etA = exp t −1 0   n     2 1 0 0 1 tn 0 1 0 1 t2 = + ··· + + t+ −1 0 n! 0 1 −1 0 −1 0 2!   2 4 3 5 1 − t2! + t4! + ... t − t3! + t5! − ...   = t3 t5 t2 t4 −(t − 3! + 5! − ...) 1 − 2! + 4! − ...   cos t sin t = , t ∈ I. − sin t cos t

Linear Systems of Differential Equations 329     0 1 0 1 Observation. The matrix is called a circulant matrix and 1 0 −1 0 is anti-circulant.   0 −4 tA Example 8.6.2 Find e where A = . −4 0 Solution. Here  2k  2k  0 −4 4 0 = , k = 1, 2, 3, .. −4 0 0 42k    2k+1 0 −4 0 −42k+1 and = , k = 0, 1, 2, ... −4 0 −42k+1 0   0 Hence exp t −4

  1 e4t + e−4t −4 = 0 2 e−4t − e4t

 e−4t − e4t . e4t + e4t

Observation. (i) Given the initial value problem x0 = Ax, x(t0 ) = x0 , x(t) = exp[A(t − t0 ) x0 is its unique solution;    at  a 0 e 1 0 (ii) Given a diagonal matrix A = 1 , exp (At) = ; 0 a2 0 ea2 t   k 1 (iii) The matrix of the type A = , is of the form 0 k     k 0 0 1 A= + . 0 k 0 0 Hence,  

eA t = e

k 0



0 t k



0 0



e



1 t 0

 =

ekt 0

0 ekt

  1 0

  0 0 + 1 0

   1 1 t = ekt 0 0

 t ; 1

(iv) Let Φ(t) = et A , then Φ−1 (t) = e−t A ; (v) Observe that for a nonsingular matrix P, exp[P−1 X P] = P−1 [eX ]P where X and P are matrices of the order n × n; (vi) If X and Y are square matrices of order n, such that XY = YX, then exp(X)exp(Y) = exp(X + Y) holds. The system (8.2) when A is an n × n constant matrix, has the form x0 = Ax. Assume that, the solution vector is of the form x(t) = (etA c), c being a constant arbitrary vector. To verify this assumption, differentiate x(t) to get x0 (t) = (etA c)0 = A etA c = A x(t), t ∈ I.

330

Linear Algebra to Differential Equations

The claim holds. If the initial condition is given as x(t0 ) = x0 , then it follows that the unique solution x(t) is given by x(t) = x(t, t0 , x0 ) = = Φ(t) Φ−1 (t0 ) x0 = etA e−t0 A x0 = eA(t−t0 ) x0 , t, t0 ∈ I. Further, when t0 = 0, x(t) = eAt x0 , the solution x(t) is an n-vector (x1 (t) x2 (t) . . . xn (t)). Choose the initial conditions xi0 = (0 0 . . . 1 . . . 0) = ei (1 in ith place), then the fundamental matrix, FM, Φ(t) (when A is a constant matrix) is constructed using n linearly independent solutions in the form of an n × n matrix. Hence, it follows that d X(t) = AX(t), X(0) = In , X0 (t) = dt where the matrix In = [e1 e2 . . . en ]. Now, it is easy to get FM etA for a given matrix A.

Methods to find a fundamental matrix The method of finding a fundamental matrix using eigen-values and eigenvectors is explained through examples for the various cases of eigen values discussed in Chapter 2. Further, the method is extended to a square matrix of order n, for each case. (a) Method of eigen-values Given a square matrix A, the method of finding its eigen-values and eigenvectors is discussed in Chapter 2. There are three possibilities, namely (i) The eigen-values are real and distinct. (ii) The eigen-values are repeated. (iii) The eigen-values are complex, distinct and/or repeated. The method is illustrated for the three cases through examples. Case (i) Eigen-values are real and distinct. Example 8.6.3 Consider the system x0 = AX where A =

 1 4

To find the eigen-values, consider the determinant, 1 − λ 2 = λ2 − 4λ − 5 = (λ + 1)(λ − 5) = 0. 4 3 − λ

 2 Find eAt . 3

Linear Systems of Differential Equations

331

Hence, the roots λ1 = −1, λ2 = 5 are the eigen-values. Consider the equation (A − λI)x = 0.  1 The corresponding eigen-vectors are [1 − 1] and [1 2] . Thus, e−t and −1   1 5t e are the two linearly independent solutions, 2 and the FM is given by,  −t  e e5t Φ(t) = . −e−t 2e5t T

T



tA −1 To find etA , weemploy the relation  e =Φ(t) Φ (0). Here, 1 1 2 −1 Φ(0) = and Φ−1 (0) = 31 . Hence, it follows that −1 2 1 1   1 2e−t + e5t −e−t + e5t . etA = 3 −2e−t + 2e5t e−t + 2e5t

Example 8.6.4 Find an FM of the system   0 1 x0 = x. −2 3 Solution. The eigen-values are λ1 = 1, λ2 = 2. For λ1 = 1, the eigen-vector is [1 1]T and one solution vector is [1 1]T et . For λ2 = 2, the eigen-vector is [1 2]T and the other solution vector is [1 2]T e2t .  t  e e2t Hence, Φ(t) = t . e 2e2t To find etA , etA

Φ(t) Φ−1 (0)  t  e e2t 2 = et 2e2t −1

=

  t −1 2e − e2t = 1 2et − 2e2t

 −et + e2t . −et + 2e2t

Observation. The eigen-vectors obtained in these two examples are not unique. They have been chosen by inspection. There are infinite ways to choose Φ(t). (a) General method When the eigen-values of a matrix are real and distinct, find for each eigen-value an eigen-vector, Suppose that λ1 is an eigen value and [a1 a2 . . . an ]T is its eigen-vector, then a related solution–vector is [a1 a2 . . . an ]T eλ1 t . Continue the process for the remaining eigen values to get n solution-vectors. Form a matrix with each solution vector as

332

Linear Algebra to Differential Equations

a column. Denote the matrix by Φ(t). This is a fundamental matrix, etA is also an FM and hence for some nonsingular matrix C, etA = Φ(t)C, i.e., for t = 0, In = Φ(0)C, which implies that C = Φ−1 (0). Hence, etA = Φ(t)Φ−1 (0). Case (ii) Repeated eigen-values.  0  x1 1 Example 8.6.5 Solve the system x2  = 0 x3 0

1 −2 1

  1 x1 3 x2 . 0 x3

Solution. The matrix A has the eigen values −3, 1, 1 and [1 − 6 2]T e−3t , [1 0 0]T et , [t 1/2 1/2]T et are the three linearly independent solutionvectors. (The second linearly independent vector corresponding to the eigen value λ = 1 is obtained as follows. Find w such that (A − I)w = [1 0 0]T . then u = tet [1 0 0]T + et w is the required vector) The F M for the system is then given by   −3t e et tet Φ(t) = −6e−3t 0 1/2et , 2e−3t 0 1/2et To find etA , since etA

= =

=

=

Φ(t) Φ−1 (0) −1  −3t  e et tet 1 1 0 −6e−3t 0 1/2et  −6 0 1/2 2e−3t 0 1/2et 2 0 1/2    −3t t t 0 −1 1 e e te 1 −6e−3t 0 1/2et  0 6 −6 8 0 −2 2 2e−3t 0 1/2et   8 4t + 1 12t − 1 1 t 2 6 . e 0 8 0 2 6

(b) General method Find the eigen values of matrix A. Suppose that λi is an eigen-value repeated ri times. Then ri linearly independent eigen-vectors are required. For this purpose consider the system of equations (A−λi I)x = 0 to get the first eigen-vector α1 . Then consider the system of equations (A − λi I)x = α1 . Choose a second vector α2 so that (A − λi I)α2 = α1 . Continue the procedure until ri linearly independent vectors are available. Case (iii) Eigen values are complex.

Linear Systems of Differential Equations   2 −5 tA 0 Example 8.6.6 Find e for x = Ax where A = . 2 −4

333

Solution. The eigen-values are −1 ± i. As before, for λ1 = −1 ± i, the eigenvectors are [3 + i 2]T and [3 − i 2]T, respectively. Hence choose solution vectors as [3 + i 2]T e(−1+i)t and [3 − i 2]T e−(1+i)t . It follows that

Φ(t) =

Here, Φ(0) =

 (3 + i) 2

 (3 + i)e(−1+i)t 2e(−1+i)t

 (3 − i)e−(1+i)t . 2e−(1+i)t

  1 2 (3 − i) and Φ−1 (0) = 2 4i −2

 −(3 − i) , 3+i

Now, etA = Φ(t)Φ−1 (0)   −5 sin t −t 3 sin t + cos t =e 2 sin t cos t − 3 sin t   1 2 −1 1 Example 8.6.7 Find etA for x0 = Ax where A = 0 1 0 −1 1 Solution. Eigen-values are 1, 1 + i, 1 − i. The solution vector for λ1 = 1  T can be chosen as 1 0 0 et . For λ2 = 1 + i the solution vector is  T  T (1−i)t −2 + i −i 1 e(1+i)t and for λ3 = (1 − i) it is −2 − i i 1 e . Hence

 t e Φ(t) =  0 0

(−2 + i)e(1+i)t −ie(1+i)t e(1+i)t

Φ and etA

  (−2 − i)e(1−i)t 1 , Φ(0) = 0 ie(1−i)t 0 e(1−i)t −1

 1  (0) = 0 0

−2 + i −i 1

 −2 − i i . 1



1

2

i 2 −i 2

1 2 1 2

  1 1 + 2 sin t − cos t 2 − sin t − 2 cos t , cos t sin t = et 0 0 −sin t cos t

which is obtained on simplifying the entries using the fact that eit = cos t + i sin t

334

Linear Algebra to Differential Equations

(c) General method Find the eigen-values of A and follow the method in finding solution vectors for each eigen-value real or complex, distinct or repeated. Note that when eigen-values are complex, the exponential expression can also be expressed in terms of trigonometric ratios, since eit = cost+i sint. If the complex eigen-values are distinct then apply the procedure in (a) to obtain the solution. If the complex values are repeated then apply the procedure in (b) to obtain the solution. Aliter method to find etA From the Cayley-Hamilton Theorem, if p(x) is characteristic equation for matrix A, then p(A) = 0. Also any scalar function f (x) can be expressed in terms of a remainder and quotient as f (x) = b(x)p(x) + r(x), where p(x) is a nth degree polynomial and r(x) is a polynomial of degree less than n. Set r(x) = rn−1 xn−1 +rn−2 xn−2 +...+r0 . This result is also true when x is replaced by a matrix, A, that is, f (A) = b(A)p(A) + r(A). From this result it follows that etA can be reduced to a matrix polynomial of degree n − 1. Thus etA = b(A)p(A) + r(A) = r(A) where r(A) = rn−1 An−1 + rn−2 An−2 + ... + r0 In×n . The ri ‘s can be obtained from the scalar case by evaluating etλ at the roots of p(λ), which are the eigen-values of the matrix A. Let the eigen-values be given as {λ1 , λ2 , ..., λn } then (i) if the n eigen-values are distinct, consider the n simultaneous equations etλi = r(λi ), i = 1, ..., n and solving them yield the values ri ’s. (ii) If one or more of the eigen-values is repeated then for the repeated tx value, say λj = λi , the equation is dedx |x=λj = tet λj = dr(x) dx x=λj . If λk = λj = 2 tx

2

2

r(x) e dx λi the equation is d x=λ = t2 et λk = d dx . This procedure continues 2 x=λ k k on for higher derivatives if the eigen-values are repeated more number of times. The following examples are illustrative

Example 8.6.8 Find e

tA

 1 when A = 4

 2 . 3

Solution. Using the method given above, etA = α0 I2 +α1 At, r(λ) = α0 +α1 λ. For the given matrix A the eigen values are λ1 = −1, λ2 = 5. Hence e−t = α0 − α1 , e5t = α0 + 5α1 which yield 1 5t 1 (e + 5e−t ) and α2 = (e5t − et ), providing 6 6   1 2e5t + 4e−t 2e5t − 2e−t = . 6 4e5t − 4e−t 4e5t + 2e−t

α0 = etA

The following example contains repeated roots of A. Example 8.6.9 Find e

tA



0 when A = −4

 1 . 4

Linear Systems of Differential Equations

335

Solution. The eigen values of A are 2, 2. Further, etA

= α0 I2 + α1 At, r(λ) = α0 + α1 λ;

2t

e

= r(2) = α0 + 2α1 and te2t = α1 yielding

α0

=

(1 − 2t)e2t and α1 = te2t , providing   (1 − 2t)e2t te2t = . −4te2t (1 + 2t)e2t

etA

 0 Example 8.6.10 Find etA , where A = 0 0

 1 0 −i 0. 1 i

The matrix At has the eigen values 0, it, −it. Further, let etA = α0 I + α1 At + α2 A2 t2 and r(λ) = α0 + α1 λ + α2 λ2 . Hence, e0

= r(0) = α0 + α1 (0) + α2 (0)

it

= r(i) = α0 + α1 (i) + α2 (i)2

e e

−it

= r(−i) = α0 + α1 (−i) + α2 (−i)2 .

Solving these simultaneous equations for α0 , α1 and α2 it follows that α0 = 1, α1 = Hence, etA

eit + e−it − 2 eit − e−it = 1 sint, α2 = = 1 − cos t. 2i −2

     1 0 0 0 1 0 0 = 1 0 1 0 + sint 0 −i 0 + 1 − cost 0 0 0 1 0 1 i 0   1 sin t − i(1 − cos t) 0 . cos t − isint 0 = 0 0 sin t cos t + i sin t

−i −1 0

 0 0 −1

In respect of repeated roots, employ the idea involved in the method of variation of parameters. The following example illustrates the method. Here again, consider example 8.6.9. Example 8.6.11 Find e

tA



0 when A = −4

 1 4

The eigen values of A are 2 repeated twice. For λ = 2, by the usual method the solution vector is x1 (t) = [1 2]T e2t .

336

Linear Algebra to Differential Equations   c1 (t) 2t To find x2 (t), assume that x2 (t) = e , c2 (t) where c1 and c2 are unknown functions of t. To determine c1 and c2 , verify the conditions under which x2 (t) is a solution. Consider    0     c1 (t) c (t) 2t 0 1 c1 (t) 2t e = e . x02 (t) = 2e2t + 01 c2 (t) c2 (t) −4 4 c2 (t) Comparing the coefficients of e2t , c01 (t)

=

−2c1 (t) + c2 (t)

c02 (t)

=

−4c1 (t) + 2c2 (t) = 2[−2c1 (t) + c2 (t)]

which implies that c02 (t) = 2c01 (t) or c2 (t) = 2c1 (t) + α. Substituting for c2 (t) gives c01 (t) = α. Hence, c1 (t) = αt + β and c2 (t) = 2αt + 2β + α. Since α and β are arbitrary constants of integration, choose α = 1 and β = 0. Hence         c1 (t) 2t t 1 0 2t 2t 2t x2 (t) = e = e = te + e . c2 (t) 2t + 1 2 1 It follows that, x(t) = c¯1 x1 (t) + c¯2 x2 (t) = c¯1 [1 2]T e2t + c¯2 {[1 2]T te2t + [0 1]T e2t }.       1 t 1 0 1 0 2t −1 Hence, Φ(t) = e , Φ(0) = , Φ (0) = and 2 2t + 1 2 1 −2 1      1 t 1 0 2t 1 − 2t t tA −1 e = Φ(t)Φ (0) = e = e2t . 2 2t + 1 −2 1 −4t 2t + 1

EXERCISE 8.6 

 i 1 0 1. Show that the matrix A = 0 i 1 has eigen-value i repeated three 0 0 i times. Find etA .  0    x 0 2 0 x 2. Show that the solutions of y  = −2 0 0 y  are given by z 0 0 0 z         x(t) sin 2t cos 2t 0 y(t) = c1 cos 2t + c2 − sin 2t + 0. z(t) 0 0 1

Linear Systems of Differential Equations

337

3. Write the equation x00 + 3x0 − 4x = 0 into system form and show that the system, in general, is unbounded. Hint: Find etA .  0    x 8 −1 x1 4. Solve 1 = . x2 4 12 x2 5. Solve the IVP  0  x 1 = y 0 6. Find etA where  1 1 (i) A = 0 1 0 0

8.7

      2 x t 8/9 + , x(0) = . 3 y t −1/9

  0 0  0 , (ii) A = 8 1,

  1 0 , (iii) A = 2, −9

 1 . 6,

Stability Analysis of a System

The term stability is commonly employed in several life situations. The dictionary meaning of this term is: continued existence, permanence, standing firmly in place, not easily moved, shaken or overthrown etc. In a narrative form of language this term is in use on many occasions, such as economic stability, stability of health, political stability or stability of a government. These kinds of experiences take a qualitative or quantitative form when reallife situations are treated in a mathematical sense. Mathematical treatment of nonlinear or complex situations is not discussed. Note that the solution of (8.3) is of the form x(t) = Φ(t)Φ−1 (t0 )x0 , t, t0 ∈ I, where Φ(t) is a fundamental matrix. When x0 = 0, clearly x(t) ≡ 0, t ∈ I, x(t) ≡ 0 is called as the trivial or zero solution and stability of the system (8.3) is defined in terms of the the trivial solution. When x0 6= 0, the solution x(t) 6≡ 0. Definition 8.7.1 The trivial solution of (8.3), is said to be stable if any x(t) with x(t0 ) remains close to the trivial solution, as t → ∞ or lim x(t) = 0. t→∞

The present section, deals with the stability of solutions of the systems described in terms of (i) polynomial form differential equations, with constant coefficients, (ii) the systems of the form (8.3) where the matrix A is constant, (iii) the matrix A in (8.3) depends on t, for t ∈ I= [0, ∞)

338

Linear Algebra to Differential Equations

The criteria for stability are not for the cases listed above are gien below without proof. (i) Consider the nth order homogeneous linear differential equation x(n) + a1 x(n−1) + · · · + an x = 0,

t ∈ [0, ∞),

(8.25)

where a1 , a2 , ..., an are constants. Under these assumptions, it is known that the solution exists uniquely on [0, ∞). It is known that the stability of trivial solution x(t) ≡ 0 is related to the roots is also the characteristic equation (CE) given by rn + a1 rn−1 + · · · + an = 0, (8.26) The following statement provides the criteria for the stability of x(t) ≡ 0. Criteria (i) Let all the roots of the CE (8.29) have negative real parts and let ρ be a real number such that −ρ > max real (λi ) 1 ≤ i ≤ n,

(8.27)

where λ1 , λ2 , . . . , λm (m < n) are the distinct roots of (8.29), then there exists a number k > 0 having the property ||x(t)|| ≤ k exp (−ρ t), t ∈ I = [0, ∞) where ρ is given in (8.30) and ||x|| = sup{|x(t)| : t ∈ I} Observation. The right side of the above inequality suggests that lim |x(t)| = 0 indicating that the solution x(t) remains close to the ideal t→∞ stability position. The nature of the roots of the CE depends on the nature of the co-efficients a1 , a2 , . . . , an . This fact is included in the stability criteria given by E. Routh and A. Hurwitz and is stated below.

Routh-Hurwitz Criteria Assume that the coefficients a1 , a2 , ..., an in the CE (8.29) are real numbers. Define the determinants a1 a3 a5 a1 a3 , D3 = 1 a2 a4 , and Let D1 = a1 and D2 = 1 a2 0 a1 a3 a1 a3 a5 · · · a2k−1 1 a2 a4 · · · a2k−2 0 a1 a3 · · · a2k−3 ...Dk = 0 0 a4 · · · a2k−4 , k = 1, 2, . . . , n and aj = 0 when j > n. .. .. .. ...... .. . . . ... . 0 0 0 ··· ak

Linear Systems of Differential Equations

339

If the det Dk > 0 for k = 1, 2, . . . , n, then the roots of the CE (8.29) have negative real parts and the solution x(t) of (8.28) is such that lim |x(t)| = 0. t→∞ The following examples illustrate the forementioned result. Example 8.7.1 Consider the DE x00 + 9x0 + 20 x = 0. Solution. The CE is given by λ2 + 9λ + 20 = (λ + 4)(λ + 5) = 0. The roots are λ1 = −4 and λ2 = −5 are negative. Hence, the solution x(t) is stable. The Routh - Hurwitz criteria may also be applied. With 9, a2 = 20, a3 = 0. a1 = 9 0 = 180 > 0. Hence, lim ||x(t)|| = 0. D1 = 9 > 0 D2 = 1 20 n→∞ Observation. For a general second order equation x00 + a1 x0 + a2 x = 0, a1 0 = a1 a2 > 0 are the conditions required for D1 = a1 > 0 and D2 = 1 a2 stability. This is possible if a1 > 0, a2 > 0. Hence, any second order differential equation having a1 > 0, a2 > 0 is always stable. Example 8.7.2 For the DE, x000 + a1 x00 + a2 x0 + a3 x = 0 a1 a3 0 a1 a3 , D3 = 1 a2 0 D1 = a1 , D2 = 1 a2 0 a1 a3 Here D1 , D2 and D3 need to be positive. Assume that a1 > 0, a2 > 0 and a3 > 0, then the required condition holds for the given DE. Example 8.7.3 Consider the DE, z 000 + a1 z 00 + (8 − a1 ) z 0 + 7z = 0. Determine the value of a1 so that the given DE remains stable. a1 7 0 a1 7 , D = 1 8 − a1 0 Solution. D1 = a1 , D2 = 1 8 − a1 3 0 a1 7 Here D1 > 0 implies a1 > 0. Further, D2 = 8 a1 − a21 − 7 > 0 implies 1 < a1 < 7. With this choice of a1 it follows that D3 > 0. Case (ii) Consider the system (8.3) where the matrix A is a constant. The following theorem gives the stability criteria of (8.3).

Theorem 8.7.1 Suppose that the eigen-values of A are λ1 , λ2 , ..., λm and let λj , j = 1, 2, ..., m be repeated nj times, such that n1 + n2 + · · · + nm = n. Let the eigen values have a representation

340

Linear Algebra to Differential Equations

λj = αj + i βj , j = 1, 2, . . . , m, and that αj < η < 0 f or j = 1, 2, . . . , m. Then there exists a constant M > 0, such that ||etA || ≤ M exp(η t), t ∈ I = [0, ∞), and the system (8.3) is said to be stable. Consider the following examples. Example 8.7.4 Consider the system x01 = −4x1 − 6x2 x02 = 5x1 − 10 x2 .   √ −4 −6 Solution. Given A = has eigen-values −7 ± 21 i. 5 −10 The real parts of the eigen-values are negative. Here αj = −7. Hence η can be chosen a negative number < −7. Therefore the conclusion is lim ||x(t)|| = 0 t→∞ and that the system is stable.  1 Example 8.7.5 Let x0 = Ax = 0 0

 1 3 x 0

1 −2 1

The matrix A has eigen-values λ1 = 1, λ2 = 1, λ3 = −3. The first two eigen values have positive real part. Hence lim ||x(t)|| = ∞; i.e. the solution x(t) t→∞ is unstable. Case (iii) Consider x0 = A(t)x, t ∈ I = [0, ∞) where the matrix A depends on t, is real valued and continuous in t ∈ I. The stability of the given system depends on the nature of the eigen values of the symmetric matrix A(t) + AT (t), AT is the transpose of A and the eigen values are functions of t. The following are the stability criteria. Theorem 8.7.2 Let H(t) be the largest eigen-value of the matrix A(t)+AT (t) Zt and let t0 be a fixed real number. If lim H(s)ds = −∞, t ∈ I, then every t→∞

t0

solution x(t) of the given linear system is such that lim x(t) = 0. t→∞

On the other hand, let m(t) be the smallest eigen-value of the matrix A(t) + Zt T A (t) and is such that lim m(s)ds = +∞. t→∞

t0

Then every nonzero solution x(t) of the given linear system is unbounded, i.e. lim ||x(t)|| = ∞. t→∞

Linear Systems of Differential Equations   2 t3 3 0 t x Example 8.7.6 Consider the system x = −t3 −2

341

Here T

A(t) + A (t) =



2 t3

−t3

 2 t3 + t33 −2 t

 4 −t3 = t3 −2 0

has two eigen-values t43 and −4. Clearly, H(t) = Applying the stability criteria, Zt

Zt H(s)ds =

t0

4 t3

 0 −4

and m(t) = −4.

4 ds s3

t0

 −2 2 2 + 2 → 2 as t → ∞. = t2 t0 t0 

Hence solution x(t) remains bounded.

EXERCISE 8.7 1. Consider the equation z 000 + (a − α1 ) z 00 − 3(α2 − 1) z 0 + 9z = 0. Apply Routh-Hurwitz’s criteria for stability and determine values of α1 , α2 so that the system remains stable. 2. Show that all solutions of the following systems are bounded on (0, ∞). (i) x00 + et x = 0. (ii) x00 + (t − 1) x = 0. 3. Using the following matrices show that the system (8.3) is stable     −2 1 1 −1 5 6 (i)  0 −2 4  (ii) A =  0 −4 −2. 0 0 −1 0 1 −1 4. For the linear system (x1 x2 x3 x4 x5 )0 = (−3x1 − 3x2 − 3x1 − x4 − x5 6x2 − x5 ) Show that the eigen-values are −3, 0, −1 with multiplicity 2, 2, 1, respectively and that x(t) is bounded. 5. Show that the system x0 = A(t) x

342

Linear Algebra to Differential Equations   1 t2 2 t , (i) is bounded as t → ∞ when A(t) = −t2 −1   − 1t t2 + 1 (ii) is stable when A(t) = . 2 −(t + 1) −2 (Hint: Find the eigen-values of A(t) + AT (t).)

8.8

Election Mathematics

Many countries in the world have adopted democratic set up for administrating the welfare activities of villages, cities, states and the country. The method of selecting the administrators is, generally, by way of ‘elections’ i.e. adult franchise. Naturally there exists a tough competition among the members of the voting community, and also between the candidates contesting the elections. Some experts, through their experience, evaluate the abilities of the contesting members and opine about ‘who will win the elections’. For many years the evaluation process was ‘abstract’ in nature. Slowly experts decided the factors involved in a competition and finally the election process took the shape of a mathematical modeling problem which is now called as ‘election mathematics’. The aim of this section is to present such a simple model involving a system of differential equations. Suppose that in a city two members A and B wish to contest an election to become a corporator in its administration, possessing variable (winning) potential x and y, respectively depending on an independent variable t. In a simplistic and pure competitive atmosphere the rate of change of potential x of A depends on the preparation of the potential of y and vice versa. In a mathematical sense dx = x0 (t) ∝ y(t) dt dy = y 0 (t) ∝ x(t), dt x0 = ay and y 0 = bx.

(8.28)

The constants of proportion ‘a’ and ‘b’ can be time dependent but for the sake of simplicity, let ‘a’ and ‘b’ be assumed to be constants. However, the model (8.28) is not realistic since many factors retard or affect the potential of the candidates in real-life situations. For example, the rate of change of the potential of the candidate will depend on various factors like (i) shortage of funds (ii) nonavailability of manpower (iii) rules and regulations of election offices. Taking into account the affect of retardation the refined model takes the form x0 = ay − px, y 0 = bx − qy, (8.29)


where p and q are assumed to be constants. The model (8.29) needs one more refinement, since local disputes, false promises and rumors in the election competition also affect x′ and y′. The refined model is then given by

x′ = ay − px + α,  y′ = bx − qy + β,   (8.30)

where again it is assumed that α and β are constants and take into account the above-mentioned negative factors. These constants act like perturbation terms. The effectiveness of (8.30) may increase if (i) the constants a, p, α, b, q, β are chosen realistically and (ii) they are made available as functions of t. The second factor enhances the complexity of (8.30). The system (8.30) is linear in its composition, and hence matrix methods are available to find its solution.

It is observed that some expert election agencies collect useful data during the process of voting and predict the outcome of the elections as soon as the voting process is completed. If the findings of these agencies quite often differ from each other, it is because of differing presumptions about the parameters appearing in the system (8.30).

The model (8.30) includes some special cases of interest. These are stated below:

(i) Let α = 0 and β = 0. This situation suggests that local disputes among voters are absent and that false promises or rumors are ineffective. This leads to a fair-practice election; a spirit of pure competition exists between the two candidates.

(ii) The case x′ = 0, y′ = 0 is an indication that the increase or decrease in the potential of the two competing candidates is zero. It suggests that the election process is indeed peaceful. In particular in (8.30), in addition to α = 0, β = 0, suppose that ay − px = 0 and bx − qy = 0. This means that x = 0 and y = 0 (when aq ≠ bp), suggesting that an equilibrium stage is reached. In elections this is a rare phenomenon; in this case the contesting candidates must be close friends or saints.

(iii) Consider the situation when ay − px = 0 and bx − qy = 0, and suppose that local disputes and rumors prevail in the community of voters. Then, in view of (8.30), the system takes the form x′ = α and y′ = β. Assume that α > 0, β > 0. The system suggests that there is competition between the two candidates.

(iv) Suppose that the constants a and b are large numbers and dominate the values of p, α, q, β. Then the system (8.30) becomes x′ = ay and y′ = bx,

which means x′′ = ay′ = abx. Solving this equation amounts to solving the system (8.30) in this case, and the solution is x(t) = A e^{\sqrt{ab}\,t} + B e^{-\sqrt{ab}\,t} (A and B are constants). In case A > 0, then x(t) → ∞ as t → ∞. This fact suggests that the elections are fought vigorously.

(v) Suppose that one of the contesting candidates, say B, takes a lean position, i.e. y = 0. Then the system (8.30) takes the form

\frac{dx}{dt} = -px + α \quad \text{and} \quad \frac{dy}{dt} = bx + β.

Here, if b > 0 or x > 0, it implies that y increases as time t increases, implying thereby that the lean position of candidate B may not last long. This fact suggests that elections need to be fought in the right spirit.

Observations.

1. The mathematical model presented in (8.30) involves two candidates in an election. In many elections there is a multi-cornered contest for one seat; the model then becomes complex.

2. The arms race between two conflicting nations leads to a similar model. Mathematicians have studied the available data of World War II between two conflicting nations, and it was found that a model of the type (8.30) works well.

3. In an ocean, big fish rely on small fish for their food. There exists a conflicting situation which is described by the model (8.30). This model is also known as the prey-predator model in mathematics.

4. In an industry, conflicting situations may arise between management and labourers about raises in salaries, bonuses or economic problems. Model (8.30) works in such a situation also.

5. In agricultural sciences, farmers plant trees at a distance from each other, following scientific guidelines, between two plantations. It has been established that the roots of the trees face a conflicting phenomenon, competing for food and water in the soil.

In all these cases (1) to (5), the mathematical model is similar; however, the meanings of the terms involved are different.
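Once numerical values are assigned to the parameters, the linear system (8.30) can be solved directly. The sketch below is only an added illustration; the parameter values a, b, p, q, α, β and the initial potentials are invented for demonstration and are not taken from the text.

```python
# Simulate the election model (8.30): x' = a*y - p*x + alpha, y' = b*x - q*y + beta.
# All parameter values below are illustrative choices, not data from the text.
import numpy as np
from scipy.linalg import expm

a, b = 0.6, 0.5          # mutual "competition" coefficients
p, q = 0.8, 0.7          # retardation coefficients
alpha, beta = 0.1, 0.05  # perturbation terms (disputes, rumors, ...)

M = np.array([[-p, a],
              [b, -q]])
f = np.array([alpha, beta])
z0 = np.array([1.0, 0.8])   # initial potentials of candidates A and B

def potentials(t):
    """z(t) = e^{Mt} z0 + M^{-1}(e^{Mt} - I) f, the constant-coefficient solution."""
    E = expm(M * t)
    return E @ z0 + np.linalg.solve(M, (E - np.eye(2)) @ f)

for t in (0.0, 1.0, 5.0, 20.0):
    x, y = potentials(t)
    print(f"t = {t:5.1f}:  x = {x:.4f},  y = {y:.4f}")
```

With these (assumed) values the coefficient matrix has negative eigen-values, so both potentials settle toward an equilibrium, which is the behaviour described in case (ii) above.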

8.9

Conclusion

This chapter considered the study of vector-matrix differential equations which includes the existence and uniqueness of solutions as a basic feature. Methods


of solving linear vector-matrix differential equations are explained through second and third-order matrices. This approach can be extended to higher order matrices using computational techniques. The examples are simple and illustrate the technique. The application given in Section 8.8 helps to widen the scope of understanding of the subject. For further reading the readers are referred to [3], [6], [11] and [16].

Chapter 9 Linear Matrix Differential Equations

9.1

Introduction

During the last four to five decades, the study of linear matrix differential equations (LMDEs) and their properties has witnessed rapid growth because of the demands of modern engineering and technology. The emergence of areas such as control systems, the theory of communications and optimization demanded an extensive study of matrix analysis. This chapter deals with methods of obtaining explicit solutions of LMDEs. In Section 9.2, an initial-value-problem for an LMDE and other different types of LMDEs are introduced. In Sections 9.3 to 9.5, methods of obtaining explicit solutions of LMDEs of the first order occurring in various forms are presented. The linear variation of parameters formula (VPF) is developed to evaluate the effect of a perturbation, a booster or a control term on the system. In Section 9.6, an LMDE of higher order is treated and solved explicitly to get a solution matrix. The methods of solving LMDEs are then employed (i) to solve a matrix boundary value problem and (ii) to develop matrix trigonometric functions, matrix hyperbolic functions and extended trigonometric functions. The examples illustrate the methods developed in Sections 9.3 to 9.7.

9.2

Initial-value-problems of LMDEs

The simplest form of a linear matrix differential equation (LMDE) is given by X′ = A(t)X,

t ∈ I,

(9.1)

where X is an n × n matrix defined on an interval I ⊂ R, the elements of which are differentiable functions on I, and A(t) is an n × n matrix whose elements are continuous functions of t ∈ I. Here (9.1) is a homogeneous first-order LMDE. Definition 9.2.1 A solution of (9.1) is defined as an n × n matrix function


Φ(t) defined on I, which is differentiable on I and is such that Φ′(t) = A(t)Φ(t), t ∈ I.

(9.2)

Observation. To get a specific solution X(t) of (9.1), it is necessary to set initial data X(t₀) = X₀, t₀ ∈ I, where X₀ is a given initial matrix belonging to R^{n×n}. Now (9.1) together with its initial data (t₀, X₀) is an IVP X′ = A(t)X, X(t₀) = X₀, t₀, t ∈ I.

(9.3)

It is known that the solution of this IVP (9.3) exists on I and is unique. Types of LMDEs 1. The IVP (9.3) may be further strengthened by adding a perturbation term H(t) to get a linear perturbed matrix differential equation Y0 = A(t)Y + H(t), Y(t0 ) = Y0 ,

(9.4)

where H(t) is an n × n matrix, continuous on I. The existence and uniqueness property of (9.4), as in (9.3), continues to hold for all t ∈ I. Observation. H(t) depends only on the scalar t ∈ I. In case, the matrix H(t) is replaced by H(t, y) then the LMDE(9.4) becomes nonlinear. In such a situation a solution of (9.4) may exist uniquely under some conditions but it becomes difficult to obtain explicit solutions. Such nonlinear situations are avoided in the chapter. 2. There is yet another possibility to strengthen the linearity in (9.1) by considering a linear matrix differential equation of the form X0 = A(t)X + X B(t), t ∈ I,

(9.5)

where A and B ∈ C[I, Rn×n ] and its perturbed format is Y0 = A(t) Y + Y B(t) + H(t),

(9.6)

where H ∈ C[I, R^{n×n}]. Under these hypotheses, (9.6) possesses a unique solution on I. Generally, the equations (9.5) and (9.6) need the additional condition AB ≠ BA for t ∈ I.

(9.7)

3. One more LMDE format that invites attention is X0 = A(t) X B(t), t ∈ I.

(9.8)

and its perturbed equation format is Y0 = A(t) Y B(t) + H(t), where H(t) ∈ C [I, Rn×n ].

(9.9)


The matrix systems noted in (9.1) to (9.9) are linear and are of the first order. In a subsequent section, linear systems of higher order are also considered.
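Although the rest of the chapter derives closed-form solutions, each of the LMDEs (9.1)-(9.9) can also be integrated numerically by flattening the unknown matrix into a vector. The sketch below is an added illustration for a small instance of the form (9.6); the matrices A, B, H and the initial matrix are arbitrary choices.

```python
# Numerically integrate the LMDE X' = A X + X B + H(t) (form (9.6)) by flattening X.
# The matrices below are arbitrary illustrative data.
import numpy as np
from scipy.integrate import solve_ivp

A = np.array([[0.0, 1.0],
              [-1.0, 0.0]])
B = np.array([[-0.5, 0.0],
              [0.0, -0.2]])

def H(t):
    return np.array([[np.sin(t), 0.0],
                     [0.0,       1.0]])

X0 = np.eye(2)

def rhs(t, x_flat):
    X = x_flat.reshape(2, 2)
    return (A @ X + X @ B + H(t)).ravel()

sol = solve_ivp(rhs, (0.0, 5.0), X0.ravel(), rtol=1e-8, atol=1e-10)
X_final = sol.y[:, -1].reshape(2, 2)
print("X(5) =\n", X_final)
```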

9.3

LMDE X′ = A(t)X

Given a linear system of the vector-matrix form x′ = A(t)x (discussed in Chapter 8), it has been noted that a fundamental matrix (FM) Φ(t) needs to be developed. Further, the FM satisfies

Φ′(t) = A(t)Φ(t), t ∈ I.

Observe that Φ(t) is an n × n matrix. Clearly, it is a solution of the matrix differential equation (9.1). Further, given the FM Φ(t), matrices of the form Ψ(t) = Φ(t)C, where C is an n × n constant, nonsingular matrix, are all FMs. Hence, it follows that

Φ′(t)C = A(t)Φ(t)C, t ∈ I, or Ψ′(t) = A(t)Ψ(t).

Hence, to solve the LMDE (9.1), it is necessary to find an FM Φ(t). It can be easily verified that X(t) = Φ(t)Φ^{-1}(t₀)X₀ (so that X(t₀) = X₀)

(9.10)

is a solution of the IVP (9.1). The following examples are illustrative.

Example 9.3.1 Solve the LMDE X′ = \begin{pmatrix} 1 & 2 \\ 4 & 3 \end{pmatrix} X.

Solution. In Example 8.6.3, it was shown that an FM of x′ = \begin{pmatrix} 1 & 2 \\ 4 & 3 \end{pmatrix} x is given by

\exp\left(\begin{pmatrix} 1 & 2 \\ 4 & 3 \end{pmatrix} t\right) = \frac{1}{3}\begin{pmatrix} 2e^{-t} + e^{5t} & -e^{-t} + e^{5t} \\ -2e^{-t} + 2e^{5t} & e^{-t} + 2e^{5t} \end{pmatrix} = \Phi(t),

for t ∈ I. It follows that Φ(t) is also a solution of the given LMDE, Φ′(t) = \begin{pmatrix} 1 & 2 \\ 4 & 3 \end{pmatrix} Φ(t), t ∈ I.
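The fundamental matrix quoted above can be reproduced with a matrix exponential routine; the following check is an added illustration comparing SciPy's expm against the closed form at one sample time.

```python
# Check of Example 9.3.1: expm(A t) for A = [[1, 2], [4, 3]] matches the closed form.
import numpy as np
from scipy.linalg import expm

A = np.array([[1.0, 2.0],
              [4.0, 3.0]])

def phi_closed(t):
    # (1/3) [[2e^{-t} + e^{5t}, -e^{-t} + e^{5t}], [-2e^{-t} + 2e^{5t}, e^{-t} + 2e^{5t}]]
    em, ep = np.exp(-t), np.exp(5 * t)
    return np.array([[2 * em + ep, -em + ep],
                     [-2 * em + 2 * ep, em + 2 * ep]]) / 3.0

t = 0.3
print(np.allclose(expm(A * t), phi_closed(t)))   # True
```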

Example 9.3.2 For the IVP

X′ = \begin{pmatrix} 0 & 1 \\ -4 & 4 \end{pmatrix} X, \qquad X(0) = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix},

obtain the solution of the given IVP when an FM is given by Φ(t) = e^{2t}\begin{pmatrix} 1 & t \\ 2 & 2t+1 \end{pmatrix}.

Solution. From the given information, Φ(0) = \begin{pmatrix} 1 & 0 \\ 2 & 1 \end{pmatrix} and Φ^{-1}(0) = \begin{pmatrix} 1 & 0 \\ -2 & 1 \end{pmatrix}. By (9.10),

X(t) = Φ(t)Φ^{-1}(0)X(0) = e^{2t}\begin{pmatrix} 1-2t & t \\ -4t & 2t+1 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = e^{2t}\begin{pmatrix} 1-2t & t \\ -4t & 2t+1 \end{pmatrix}

is a solution to the considered problem.

Example 9.3.3 Solve the IVP X′ = A(t)X, X(π) = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}, where A(t) = \begin{pmatrix} i & 1 & 0 \\ 0 & i & 1 \\ 0 & 0 & i \end{pmatrix}, given that

Φ(t) = \begin{pmatrix} \sin 2t & \cos 2t & 0 \\ \cos 2t & -\sin 2t & 0 \\ 0 & 0 & 1 \end{pmatrix}

is a fundamental matrix.

Solution. By (9.10), it follows that

X(t) = Φ(t)Φ^{-1}(π)X(π) = \begin{pmatrix} \sin 2t & \cos 2t & 0 \\ \cos 2t & -\sin 2t & 0 \\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} \sin 2t & \cos 2t & 0 \\ \cos 2t & -\sin 2t & 0 \\ 0 & 0 & 1 \end{pmatrix}.

The method of solving a perturbed LMDE of the type (9.4)

The method of solving the vector-matrix differential equation was already discussed in Section 8.5. Therefore, it follows that, given Φ(t) satisfying (9.1), the solution of (9.4) is given by

Y(t) = \Phi(t)\Phi^{-1}(t_0)X_0 + \int_{t_0}^{t} \Phi(t)\Phi^{-1}(s)H(s)\,ds, \qquad t_0, t \in I, \qquad (9.11)

which is equivalent to

Y(t) = Y(t, t_0, X_0) = X(t, t_0, X_0) + \int_{t_0}^{t} X(t, s, H(s))\,ds, \qquad t_0, t \in I. \qquad (9.12)

Here X(t, t_0, X_0) is the solution of (9.3), which has the form X(t) = \Phi(t)\Phi^{-1}(t_0)X_0. The steps in obtaining (9.11) are similar to those given in


Section 8.5 and hence are omitted. As a particular case, let A(t) = A, a constant matrix; then

Y(t) = Y(t, t_0, X_0) = e^{A(t-t_0)}X_0 + \int_{t_0}^{t} e^{A(t-s)}H(s)\,ds, \qquad t, t_0 \in I. \qquad (9.13)

The relations (9.11), (9.12) and (9.13) are all called the 'variation of parameters formula (VPF)'.

Example 9.3.4 Solve the LMDE Y′(t) = \begin{pmatrix} 1 & -1 \\ 0 & 2 \end{pmatrix} Y(t) + \begin{pmatrix} -t & 1 \\ 0 & 0 \end{pmatrix}, Y(t₀) = X₀, t₀, t ∈ I.

Solution. Here A = \begin{pmatrix} 1 & -1 \\ 0 & 2 \end{pmatrix} and H(t) = \begin{pmatrix} -t & 1 \\ 0 & 0 \end{pmatrix}. By the VPF,

Y(t) = \begin{pmatrix} e^{t-t_0} & e^{t-t_0} - e^{2(t-t_0)} \\ 0 & e^{2(t-t_0)} \end{pmatrix} X_0 + \int_{t_0}^{t} \begin{pmatrix} e^{t-s} & e^{t-s} - e^{2(t-s)} \\ 0 & e^{2(t-s)} \end{pmatrix} \begin{pmatrix} -s & 1 \\ 0 & 0 \end{pmatrix} ds.

All the terms under the integral sign are integrable. The simplification is left to the reader. Choose X₀ = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} and complete the calculation.
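When a closed form of the integral is inconvenient, the VPF (9.13) can be evaluated numerically. The sketch below is an added illustration using the data of Example 9.3.4, with t₀ = 0 (an assumed choice) and the X₀ suggested at the end of the example; a simple midpoint rule is used for the matrix-valued integral.

```python
# Evaluate the VPF (9.13): Y(t) = e^{A(t-t0)} X0 + \int_{t0}^{t} e^{A(t-s)} H(s) ds  (constant A).
import numpy as np
from scipy.linalg import expm

A = np.array([[1.0, -1.0],
              [0.0, 2.0]])
X0 = np.array([[1.0, 1.0],
               [0.0, 1.0]])

def H(s):
    return np.array([[-s, 1.0],
                     [0.0, 0.0]])

def Y(t, t0=0.0, n=2000):
    # composite midpoint rule for the matrix-valued integral
    h = (t - t0) / n
    mids = t0 + (np.arange(n) + 0.5) * h
    integral = sum(expm(A * (t - s)) @ H(s) for s in mids) * h
    return expm(A * (t - t0)) @ X0 + integral

print(Y(1.0))
```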

EXERCISE 9.3 1. Solve the LMDEs   0 1 0 (i) X = X, −4 4 2. Solve the IVPs  0 (i) X0 = 0 0  3 (ii) X0 = 0

(ii)

  1 0 1 −2 −5 X, X(0) = 1 1 2 0    2 1 0 X, X(0) = . 3 0 3

3. Solve the perturbed LMDEs    1 −1 −t 0 (i) X = X+ 0 2 0    1 0 0 (ii) X0 = X+ 0 −1 0

 −1 1 X0 =  0 −2 0 1

0 2 0

 0 1, 1

   1 1 0 , X(0) = , 0 0 1    t 1 0 , X(0) = . 1 2 0

 1 3 X, 0


9.4


The LMDE X′ = AXB

In the present section, the properties of Kronecker products are used extensively. The n × n solution matrix X(t) of the LMDE (9.14) below is converted into a vector-matrix form using the properties developed in Chapter 6. Some of the properties of interest in the subsequent sections of this chapter are the following:

1. Vec(AB) = (Bᵀ ⊗ I_n) Vec A;
2. exp(Bᵀt) = [exp(Bt)]ᵀ;
3. [(exp Bᵀt) ⊗ I_n] Vec C = Vec[C exp(Bt)];
4. Vec(AXB) = (Bᵀ ⊗ A) Vec(X).

Here I_n denotes the n × n identity matrix and A, B are n × n constant matrices. Consider an LMDE of the form X′ = AXB, X(0) = X₀, t ∈ R,

(9.14)

where A and B are constant matrices of appropriate order, AB ≠ BA, and X₀ is a given constant matrix. By property 4 mentioned above, it follows that Vec(AXB) = (Bᵀ ⊗ A) Vec(X). (9.15)

(9.16)

Here (9.16) is a linear vector-matrix equation of the type X0 = AX with constant coefficients and has, therefore, a unique solution Vec(X(t)) = [exp ((BT ⊗ A)t)] Vec(X(0)), t ∈ R.

(9.17)

Observe that the RHS of (9.17) is completely known and that exp [BT ⊗ A t] needs a resolution. The following example explains the situation. Example 9.4.1 Solve the LMDE IVP      3 −1 2 2 0 0 X = X , X(0) = 4 −2 1 3 0

 0 . 1

Solution. By (9.17), it follows that Vec(X(t)) = exp

 2 1

T   !  2 3 −1 0 ⊗ t Vec 3 4 −2 0

 .

0 1

Linear Matrix Differential Equations

353

Here, exp ((BT ⊗ A)t) is a 4 × 4 matrix. To resolve this matrix, find the eigen-values and eigen-vectors of the matrices A and B. For matrix A, the eigen-values are µ1 = −1 and µ2 = 2 and the eigen    1 1 vectors respectively are , . 4 1 For the matrix B the eigen-values are λ1 = 1, λ2 = 4, and the respective  −1 1 eigen-vectors are , . Further, the matrix 1 2   6 −2 3 −1 8 −4 4 −2  BT ⊗ A =  6 −2 9 −3. 8 −4 12 −6 The eigen-values of (BT ⊗ A) are of the form λi µj (i, j = 1,2), −1 1 1 8. The respective eigen-vectors of (BT ⊗ A) = ⊗ 1 2 4         1 −1 −1 1 4 −4 −1 1  ,  ,  ,   2  1   1  2 2 1 4 8

i.e.  −4, −1, 2, 1 are 1

leading to solution vectors         1 −1 1 −1 −4 −t 4 −4t −1 2t 1 8t   e ,   e ,   e ,   e , t ∈ R. 2 1 2 1 2 1 8 4 Hence the FM is given by   −t −e −e2t e−4t e8t −4e−t − e2t 4e−4t e8t  , leading to Φ(t) =  −t 2t −4t  e e 2e 2e8t  4 e−t e2t 8 e−4t 2 e8t   −1 −1 1 1     −4 −1 4 1 −1 1 1 1   Φ(0) =  = ⊗ 1 1 2 2 1 2 4 1 . 4 1 8 2 Observe the nature of Φ(0) in the Kronecker product. Hence    −1 T −1 1 1 1 e(B ⊗A)t = Φ(t) Φ−1 (0) = Φ(t) ⊗ . 1 2 4 1 Noting that (P ⊗ Q)−1 = P−1 ⊗ Q−1 , it follows that  −1    −1  −1 1 −1 −1 2 −1 1 1 1 = and = 4 1 1 2 3 −4 1 3 −1

 −1 . −1

354

Linear Algebra to Differential Equations

Hence it follows that −1

Φ

    1 2 −1 1 −1 (0) = ⊗ . −4 1 9 −1 −1

Hence Vec(X(t))

=

=

exp{(BT ⊗ A)t}. Vec(X(0)).   −t  2 −2 −1 1 −e −e2t e−4t e8t −t  1 4 −1 − e2t 4e−4t e8t   −4e  −8 2 −t 2t −4t 8t    −1 1 −1 1 e e 2 e 2e 9 4 −1 4 −1 4 e−t e2t 8 e−4t 2 e8t

which leads to  −e−t + e−2t + e−4t − e8t 1 X(t) =  9 −4 e−t + 4 e−4t + e2t − e8t

e−t − e2t + 2e−4t − 2 e8t 4 e−t + 8 e−4t − e2t − 2 e8t

  0 0   0 1

 .

Variation of parameters formula (VPF)

The LMDE (9.8) or (9.14), after converting into the vector-matrix form, provides its solution Vec(X(t)) = exp((Bᵀ ⊗ A)t) Vec(X₀), t ∈ R. To get the solution of the perturbed equation (9.9), assume that the solution Vec(Y(t)) is such that Vec(Y(t)) = exp((Bᵀ ⊗ A)t) Vec(U(t)), t ∈ R, where U(t) ∈ C[I, R^{n×n}] is an unknown matrix which needs to be found. Following the steps in the VPF of the vector-matrix equation studied in Section 8.5, it follows that

Vec(Y(t)) = \exp((B^{T} \otimes A)t)\,Vec(X_0) + \int_{t_0}^{t} \exp\{(B^{T} \otimes A)(t-s)\}\,Vec(H(s))\,ds,

which is the same as

Vec(Y(t)) = Vec(X(t, t_0, X_0)) + \int_{t_0}^{t} Vec(X(t, s, H(s)))\,ds.

Here, Vec(Y(t)) is explicitly obtained. After setting the n² × 1 column matrix into an n × n matrix form, Y(t), the solution of the perturbed equation (9.9) is obtained.
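In practice the resolution of exp((Bᵀ ⊗ A)t) carried out by hand above can be delegated to a numerical routine. The following sketch is an added illustration using the data of Example 9.4.1; it applies property 4 and formula (9.17) directly. Note that the column-stacking Vec corresponds to flattening in Fortran (column-major) order.

```python
# Solve X' = A X B, X(0) = X0 via vec(X(t)) = exp((B^T kron A) t) vec(X0)  (formula (9.17)).
import numpy as np
from scipy.linalg import expm

A = np.array([[3.0, -1.0],
              [4.0, -2.0]])
B = np.array([[2.0, 2.0],
              [1.0, 3.0]])
X0 = np.array([[0.0, 0.0],
               [0.0, 1.0]])

def X(t):
    vec_x0 = X0.flatten(order="F")                 # column-stacking Vec
    vec_xt = expm(np.kron(B.T, A) * t) @ vec_x0
    return vec_xt.reshape(2, 2, order="F")

# cross-check against the differential equation with a small central difference
t, h = 0.4, 1e-6
lhs = (X(t + h) - X(t - h)) / (2 * h)
print(np.allclose(lhs, A @ X(t) @ B, rtol=1e-5))   # True
```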


EXERCISE 9.4 1.  Consider  the illustrative example in Section 9.4. Replace X(0) by X(0) = 1 0 . Find the solution to the given problem with the new initial 0 1 condition.     1 0 0 8 2. Let and A = and B = . 1 2 1 −2 (i)

Find the eigen-values and eigen-vectors of A and B;

(ii)

Find the eigen-values and eigen-vectors of BT ⊗ A;

(iii) Determine the matrix Φ(t) and Φ(0); (iv) Use the Kronecker product method to find Φ−1 (0); (v)

Determine exp (BT ⊗ A) t.

3. Solve the IVPs      1 0 0 8 0 0 (i) X = X , X(0) = 1 2 1 −2 0  1 (ii) X = 1 0

9.5

    0 0 8 0 X + 2 1 −2 1

 1 ; 1

  1 0 , X(0) = 0 0

 1 ; 1

More General LMDE

Consider the LMDE X0 = A(t)X + X B(t), X(t0 ) = C, t, t0 ∈ R.

(9.18)

Assume that the A(t) and B(t) are continuous noncommutative matrices of order n×n for t ∈ R and C is an n×n constant matrix. Under the continuity condition the existence and uniqueness of the solution of the LMDE (9.18) are guaranteed. To solve the LMDE (9.18), assume that Y is the solution of the IVP for LMDE Y0 = A(t)Y, Y(t0 ) = In , t, t0 ∈ R,


where In denotes n × n identity matrix. Further, assume that Z(t) satisfies Z0 = BT (t)Z, Z(t0 ) = In . It is clear that ZT (t) satisfies the LMDE [ZT (t)]0 = ZT (t)B(t), ZT (t0 ) = In . The solution of LMDE (9.18) is then given by X(t) = Y(t)CZT (t),

t ∈ R.

(9.19)

Suppose, A(t) = A and B(t) = B are constant matrices, then Y(t) = eA(t−t0 ) and Z(t) = eB

T

(t−t0 )

.

are solutions of Y0 = AY, Y(t0 ) = In and Z0 = BT Z, Z(t0 ) = In , respectively. Then the solution of (9.18) is then given by X(t) = eA(t−t0 ) C[eB

T

(t−t0 ) T

] = eA(t−t0 ) C eB(t−t0 ) .

(9.20)

Variation of parameters formula Consider the nonhomogeneous LMDE, U0 = A(t)U + U B(t) + H(t), U(t0 ) = C,

(9.21)

where H is an n × n continuous matrix on I. The solution of the linear part of (9.21) is given by (9.19). To apply the technique of variation of parameters formula, let matrix C in (9.21) depend on t, i.e. U(t) = Y(t)C(t)ZT (t),

t ∈ I.

(9.22)

Clearly, C(t0 ) = C. Determine the matrix C(t) such that (9.22) satisfies (9.21). This gives, U0 (t)

= =

Y0 (t)C(t)ZT (t) + Y(t)C0 (t)ZT (t) + Y(t)C(t)[ZT (t)]0 A(t)Y(t)C(t)ZT (t) + Y(t)C0 (t)ZT (t) + Y(t)C(t)ZT (t)B(t).

Also, U0 (t)

=

A(t)U(t) + U(t)B(t) + H(t)

=

A(t)Y(t)C(t)ZT (t) + Y(t)C(t)ZT (t)B(t) + H(t).

Now equating the above two expressions of U0 (t), gives, Y(t)C0 (t)ZT (t) = H(t) or C0 (t) = Y−1 (t) H(t)[ZT (t)]−1 . Since, C(t0 ) = C, integrating between t0 to t Z t

Y−1 (s)H(s)[ZT (s)]−1 ds

C(t) = C + t0


and substituting in (9.22), Zt U(t)

= Y(t)[C +

Y−1 (s)H(s)[ZT (s)]−1 ds]ZT (t)

t0

Zt

T

= Y(t)CZ (t) +

Y(t)Y−1 (s)H(s)[ZT (s)]−1 ZT (t)ds

t0

Zt

T

= Y(t)CZ (t) +

Y(t, s)H(s)ZT (t, s)ds,

(9.23)

t0

where Y(t, s) = Y(t)Y−1 (s) and ZT (t, s) = [ZT (s)]−1 ZT (t). In the terminology of control theory Y(t, s) is called a transition matrix of the equation X0 = A(t)X. The relation (9.23) is the desired variation of parameters formula and it is the solution of the IVP (9.21). According to the logic followed in relation (8-22) of Chapter 8, the variation of parameters formula has the form Zt U(t, t0 , C) = X(t, t0 , C) +

X(t, s, H(s))ds,

(9.24)

t0

where X(t, t0 , C) = Y(t) C ZT (t) t, t0 ∈ I. Observation. (i) VPF (9.11) is a particular case of (9.24) (ii) Let A(t) = A and B(t) = B be constant matrices in (9.5). Then Y(t, s) = eA(t−s) and ZT (t, s) = eB(t−s) . Now the relation (9.24) takes the form U(t) = e

A(t−t0 )

B(t−t0 )

Ce

Zt +

eA(t−s) H(s)eB(t−s) ds.

(9.25)

t0

(ii) The formula (9.24) and its particular form (9.25) have several applications in stability theory, control and optimization theory, and other engineering and technology problems.


Solution using Kronecker Products To solve (9.5) when A(t) and B(t) are constant matrices. Using Kronecker products, consider the equation X0 = AX + XB, X(0) = C.

(9.26) ( Vec(X0 ) = Vec(AX + XB) = (In ⊗ A + BT ⊗ In )Vec(X), Then Vec(X(0)) = Vec(C). This is a vector-matrix equation with initial condition, whose solution is given by Vec(X)

= exp[(In ⊗ A)t + (BT ⊗ In )t]Vec(C) = [exp(In ⊗ A)t][exp(BT ⊗ In )t]Vec(C) = Vec[exp(At)]Vec[C exp(BT t)].

Hence, X(t) = exp(At)C exp(BT t),

(9.27)

using the properties of Kronecker products in Section 9.4. Let Y(t, t0 ) = Y(t) Y−1 (t0 ), Z(t, t0 ) = Z(t) Z−1 (t0 ). Define the matrix Ψ(t, t0 ) = Z(t, t0 ) ⊗ Y(t, t0 ). Then, Ψ0 (t, t0 )

= = = = =

Z0 (t, t0 ) ⊗ Y(t, t0 ) + Z(t, t0 ) ⊗ Y0 (t, t0 ) [BT (t)Z(t, t0 )]Y(t, t0 ) + Z(t, t0 ) ⊗ [A(t)Y(t, t0 ) [BT (t)Z(t, t0 )] ⊗ In Y(t, t0 ) + In Z(t, t0 ) ⊗ [A(t)Y(t, t0 )] [BT ⊗ In + In ⊗ A][Z(t, t0 ) ⊗ Y(t, t0 )] [BT ⊗ In + In ⊗ A]Ψ(t, t0 ).

Hence, ψ(t, t0 ) satisfies the equation, Vec(X0 ) = [BT ⊗ In + In ⊗ A]Vec(X). Further, Ψ(t0 , t0 ) = Z(t0 , t0 ) ⊗ Y(t0 , t0 ) = In ⊗ In = In 2 . Hence, Ψ(t, t0 ) is the transition matrix for the LMDE in (9.18). Example 9.5.1 Solve the matrix differential equation     3 2 0 1 X0 = X+X , X(0) = C, 0 3 −1 0   c c where C is a 2 × 2 constant matrix 11 12 . c21 c22


Solution. For the IVP  3 X = 0  2t e3t e3t

 2 X, 3

0

 3t e the solution is 0

X(0) = I2 ,

    0 −1 cos t − sin t and for the IVP Z = Z, X(0) = I2 , it is 1 0 sin t cos t Hence, from (9.19) the solution of the given problem is  3t    e 2t e3t c11 c12 cos t − sin t X(t) = . 0 e3t c21 c22 sin t cos t 0

Example 9.5.2 Solve the IVP         1 2 0 1 1 1 2 1 0 + , X(0) = . X = X+X 4 3 −4 4 1 1 1 2   1 2 Solution. For the IVP X0 = X, X(0) = I2 the solution is 4 3 

1 4



2 t 3

  1 2e−t + e5t −e−t + e5t 3 −2e−t + 2e5t e−t + 2e5t   0 1 and for the IVP X0 = X , X(0) = I2 , the solution is −4 4 

e

 

e

0 −4

=



1 t 4

 2t e − 2t e2t = −4te2t

 te2t . (1 + 2t)e2t

By following the VPF (9.25), one gets  

X(t) = e 

Z +

t 

e

1 4



1 4

2 (t−s)  1 3 1





2  t 2 3 1

 0 1 −4 e 2 

 0 1 −4 e 1



1 t 4



1 (s−t) 4

ds    1 −e−(t−s) + e5(t−s) 1 1 2e−(t−s) + e5(t−s) = 3 0 −2e−(t−s) + 2e5(t−s) e−(t−s) + 2e5(t−s) 1 1  2(t−s)  e − 2(t − s) e2(t−s) (t − s)e2(t−s) ds (simplify!) −4(t − s)e2(t−s) (1 + 2(t − s))e2(t−s) 0

Z t


Example 9.5.3 Find the transition matrix Ψ(t, t0 ) for the LMDE     3 2 0 1 X0 = X+X . 0 3 −1 0 Solution. From the Example 9.5.1, Y(t, t0 ) Z(t, t0 )

= Y(t)Y

−1

3(t−t0 )



1 0

2(t − t0 ) 1



(t0 ) = e and   cos(t − t0 ) − sin(t − t0 ) −1 = Z(t)Z (t0 ) = . sin(t − t0 ) cos(t − t0 )

Hence,    cos(t − t0 ) − sin(t − t0 ) 3(t−t0 ) 1 Ψ(t, t0 ) = ⊗e sin(t − t0 ) cos(t − t0 ) 0

 2(t − t0 ) . 1

EXERCISE 9.5       1 −1 1 0 −2 0 1. Solve X = X+X , X(0) = . 0 2 0 −1 1 1        1 −1 1 0 t 0 −2 0 2. Solve X = X+X + , X(0) = 0 2 0 −1 0 −1 1 0

3. Find the transition matrix for the LMDE    1 −1 1 X0 = X+X 0 2 0

9.6

 0 . 1

 2 . 3

A Class of LMDE of Higher Order

From results in Section 9.5, it can be observed that there exists a close relationship between the representation of solutions of the scalar differential equation x0 = ax, x(0) = c and the LMDE X0 = AX + XB, X(0) = C. This approach can be extended for studying LMDE of higher orders. Consider an nth order homogeneous scalar differential equation with constant coefficients, y (n) + a1 y (n−1) + · · · + an−1 y 0 + an y = 0, (9.28) whose corresponding characteristic equation is λn + a1 λn−1 + · · · + an−1 λ + an = 0,

(9.29)


where a1 , a2 , . . . , an are real numbers. Let λ1 , λ2 , . . . , λn be the roots of the characteristic equation. The n linearly independent solutions of the scalar differential equation (9.28) are described in the following cases (refer to Section 2.7 and Section 8.6). (a) λi , i = 1, 2, . . . , n are real and distinct. The n solutions of (9.28) are of the form xi (t) = ci eλi t , where ci is a constant for i = 1, 2, . . . , n. (b) λ1 = λ is real and repeated r times λi , i = n − r + 1, . . . , n are real and distinct. Then the r solutions of (9.28) are given by x1 (t) = c1 eλt , x2 (t) = c2 teλt , . . . , xr (t) = cr tr−1 eλt the remaining (n − r) solutions of (9.28) are as in case (a). (c) Let λ1 = a + ib be a complex root. Then λ2 = a − ib is also a root of the considered characteristic equation and the two linearly independent solutions are given by λ1 (t) = c1 eat cos bt, λ2 (t) = c2 eat sin bt and the other (n − 2) linearly independent solutions are obtained from cases (a) and (b). Now consider a class of LMDE related to (9.26) and is given by X(n) + a1 [AX(n−1) + X(n−1) B] + a2 [A2 X(n−2) + 2AX(n−2) B + X(n−2) B2 ] + . . . n(n − 1) n−2 A XB2 + · · · + XBn ] = 0, + an [An X + nAn−1 XB + 2 (9.30) where A and B are constant matrices of order n × n. From equation (9.30), some special cases are as follows.  For n = 1, X0 + a1 (AX + XB) = 0;    For n = 2, X00 + a (AX0 + X0 B) + a (A2 X + 2AXB + XB2 ) = 0; 1 2 000 00 00  For n = 3, X + a (AX + X B) + a2 (A2 X0 + 2AX0 B + X0 B2 ) 1    +a3 (A3 X + 3A2 XB + 3AXB2 + XB3 ) = 0. All the LMDEs of the above type have explicit solutions. Listed below are the respective nature of solutions related to (a), (b) and (c). The theoretical details are omitted. (a1 ) Relative to (a) of (9.28), the solution set for equation (9.30) is given by X1 (t) = eλ1 At C1 eλ1 Bt , X2 (t) = eλ2 At C2 eλ2 Bt , . . . Xm (t) = eλm At Cm eλm Bt ,


where C1 , C2 , . . . , Cm are constant matrices of appropriate order. The sum of these solutions is also a solution. (b1 ) Relative to (b) of (9.28), the solution set for equation (9.30) is given by eλm+1 At D1 eλm+1 Bt , teλm+1 At D2 eλm+1 Bt , . . . tr−1 eλm+1 At Dr eλm+1 Bt . The sum of these solutions eλm+1 At (D1 + tD2 + · · · + tr−1 Dr )eλm+1 Bt is also a solution. Here D1 , D2 , ..., Dr are constant matrices of appropriate order. (c1 ) Relative to (c), the solution set is given by eAt cos bt and eAt sin bt where the root α = a + ib. Here eAt (C1 cos bt + C3 sin bt) where C1 and C2 are arbitrary constant matrices of appropriate order is the set of solutions of (9.30). The procedure for solving LMDE (9.30) is as follows. Given LMDE (9.30), determine its related scalar form (9.28). Solve it by using algebraic method and follow the scheme of writing solutions as given in (a1 ), (b1 ) and (c1 ). The following examples, illustrate the method. Example 9.6.1 Solve the matrix differential equation X00 + 5(AX0 + X0 B) + 6(A2 X + 2AXB + XB2 ) = 0. Solution. From (9.30), the related scalar equation (9.28) is given by y 00 +5y 0 + 6y = 0 and the characteristic equation is r2 + 5r + 6 = 0 having characteristic roots −3 and −2. The solutions of the corresponding scalar differential equation are then given by e−3t and e−2t . Now following the scheme of solutions given in (a1 ), the solution of the given LMDE is X1 (t) = e−3At C1 e−3Bt , X2 (t) = e−2At C2 e−2Bt , where C1 and C2 are arbitrary constant matrices. Hence H(t) = X1 (t)+X2 (t) is also a solution. Observation. The solution of the above LMDE is valid for any choice of matrices A and B of appropriate order. Example 9.6.2 Solve the LMDE X00 − 2(AX0 + X0 B) + (A2 X + 2AXB + XB2 ) = 0.

(9.31)

Solution. The related scalar differential equation is y 00 − 2y 0 + y = 0. Its


characteristic equation has r = 1 as a repeated root. The solutions of the corresponding scalar differential equation are given by et and tet . Hence the solution of (9.31) following the scheme in (b1 ) above, is given by X(t) = eAt (C1 + C2 t)eBt .     0 1 0 1 In particular, let A = and B = . −1 0 8 2 Observe that AB 6= BA. Further,   cos t sin t At e = and − sin t cos t  −2t  1 4e + 2e4t e4t Bt e = 6 8e4t − 8e−2t 4e4t + 2e−2t

(9.32)

Now, in view of (9.32)  

0 −1



1 t 0

 

0 8

(C1 + C2 t)e  −2t  1 4e cos t sin t + 2e4t e4t − 2e−2t = (C1 + C2 t) − sin t cos t 6 8e4t − 8e−2t 4e4t − 2e−2t X(t)



=

e



1 t −2



(9.33)

Further, suppose that equation (9.31) has the initial conditions at t = 0 given by     0 0 1 0 0 X(0) = and X (0) = . (9.34) 1 2 0 1 From (9.32) and (9.33), for t = 0  0 X(0) = C1 = 1

   0 1 0 0 and X (0) = AC1 + C2 + C1 B = . 2 0 1   −8 0 Substituting for A, C1 and B, and simplifying gives C2 = . 1 2 Hence,     −2t  1 cos t sin t −8t 0 4e + 2e4t e4t − e−2t X(t) = 1 2 + 2t 8e4t − 8e−2t 4e4t + 2e−2t 6 − sin t cos t is the required solution. Example 9.6.3 Solve the equation 2

X000 − 3(A2 X0 + 2AX0 B + X0 B ) + 2(A3 X + 3A2 XB + 3AXB2 + XB3 ) = 0. Solution. The related scalar equation is y 000 − 3y 0 + 2y = 0 whose characteristic equation has roots −2, 1 and 1. Following the scheme of solutions given in (a1 ) and (b1 ) we have


for λ = −2, X1 (t) = e−2At C0 e−2Bt ; for λ = 1, a repeated root X2 (t) = eAt (C1 + C2 t)eBt . The general solution is given by X(t) = e−2At C0 e−2Bt + eAt (C1 + C2 t)eBt .
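A quick numerical check of the solution scheme (a₁) is possible with finite differences. The sketch below is an added illustration with arbitrarily chosen test matrices A, B, C; it confirms that X(t) = e^{−3At} C e^{−3Bt}, as in Example 9.6.1, satisfies the stated second-order LMDE.

```python
# Verify that X(t) = e^{-3At} C e^{-3Bt} solves X'' + 5(AX' + X'B) + 6(A^2 X + 2AXB + XB^2) = 0.
# A, B, C are arbitrary test matrices (illustrative choices only).
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
A = rng.normal(size=(2, 2))
B = rng.normal(size=(2, 2))
C = rng.normal(size=(2, 2))

def X(t):
    return expm(-3 * A * t) @ C @ expm(-3 * B * t)

t, h = 0.7, 1e-4
X0 = X(t)
Xp = (X(t + h) - X(t - h)) / (2 * h)
Xpp = (X(t + h) - 2 * X(t) + X(t - h)) / h**2
residual = Xpp + 5 * (A @ Xp + Xp @ B) + 6 * (A @ A @ X0 + 2 * A @ X0 @ B + X0 @ B @ B)
print(np.max(np.abs(residual)))   # small, i.e. zero up to discretization error
```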

EXERCISE 9.6 1. Solve the following LMDE 2 (a) X000 − 3(AX00 + X00 B) + 3(A2 X0 + 2AX0 B + X0 B ) 2 3 3 2 − (A X + 3A XB + 3AXB + XB ) = 0; 2

(b) X000 − (AX00 + X00 B) − 12(A2 X0 + 2AX0 B + X0 B ) = 0     1 −1 1 2 where A = and B = . 0 2 0 3

2. Solve the IVP X00 − 2(AX0 + X0 B) + (A2 X + 2AXB + XB2 ) = 0,     0 0 0 1 0 X(0) = and X (0) = , 2 2 1 0     0 1 1 −1 and B = . where A = −1 0 0 2

9.7

Boundary Value Problem of LMDE

In this section an attempt will be made to apply the material developed in the earlier sections. In several real-life situations boundary value problems (BVP) arise naturally in applications in control engineering, dynamical systems and ecological systems. For example, BVP can be used to model the following: (i) The trajectory of a satellite to be placed in a geocentric orbit. (ii) The trajectory formed by shooting a flying plane with a missile can be modeled as a BVP.


When there are many parameters in a physical situation, it can be formulated as a BVP of an LMDE. Consider a BVP of an LMDE given by X′ = A(t)X + XB(t) + H(t),

(9.35)

where A(t), B(t), H(t), X ∈ C[I, Rn×n ] with I = [a, b], a≤b, satisfying the boundary condition MX(a) + NX(b) = K, (9.36) where M, N and K are given n × n constant matrices. Definition 9.7.1 A solution of the BVP of the LMDE (9.35) and (9.36) is a function X ∈ C[I, Rn×n ] satisfying both the relations (9.35) and (9.36). Procedure to find the solution of the BVP (9.35) and (9.36) By the VPF (9.25), the explicit solution of (9.35) is given by Z t X(t) = Y(t)CZT (t) + Y(t)Y−1 (s)H(s)[Z(t)Z−1 (s)]T ds, (9.37) a

where C is an arbitrary n × n constant matrix. Substituting (9.37) in (9.36) gives Z b T T MY(a)CZ (a)+NY(b)CZ (b)+N Y(b)Y−1 (s)H(s)[Z(b)Z−1 (s)]T ds = K. a

(9.38) To find the unknown matrix C, set A1 = MY(a), A2 = NY(b), Rb B1 = ZT (a), B2 = ZT (b) and P = K−N a Y(b)Y−1 (s) H(s)[Z(b)Z−1 (s)]T ds. With this notation it follows that A1 CB1 + A2 CB2 = P,

(9.39)

where C is an unknown matrix to be determined. Using the following properties of the Kronecker Product (see relation 9.15), Vec (A1 CB1 ) = (BT1 ⊗ A1 ) Vec C, and Vec (A2 CB2 ) = (BT2 ⊗ A2 ) Vec C, relation (9.39) becomes [(BT1 ⊗ A1 ) + (BT2 ⊗ A2 )] Vec C = Vec P Thus Vec C = [(BT1 ⊗ A1 ) + (BT2 ⊗ A2 )]−1 Vec P,

(9.40)

whenever the inverse in (9.40) exists, the RHS is known therefore the LHS, Vec C is known and hence C can be expressed. Replacing the value of C in (9.37) gives the explicit solution X(t) satisfying (9.35) and (9.36). The following example illustrates the methodology for solving BVPs.


Linear Algebra to Differential Equations     1 0 1 0 0 Example 9.7.1 Solve the BVP X = X+X , 0 1 0 −1  1 0

  1 0 X(0) + 1 1

  1 1 X(1) = 0 0

 0 , −1

  t ∈ 0, 1 .

Solution. Comparing the given problem (9.35)  with the  BVPof LMDE   1 0 1 0 1 1 0 (9.36), a = 0, b = 1, A = , B= , M= , N= 0 1 0 −1 0 1 1 K

=

Y(t)

=

A1

=

B2

=

and  1 0

    1 0 0 0 ,H = . Clearly, 0 −1 0 0  t   t    e 0 e 0 1 0 , Z(t) = , P=K= . 0 et 0 e−t 0 −1       1 1 0 e 1 0 M Y(0) = , A2 = , B1 = . 0 1 e 0 0 1   e 0 by (9.40). 0 e−1

Further,    1 0 1 Vec C = ⊗ 0 1 0  1 1 + e2 0 e2 1 0 =  0 0 1 0 0 1

    1 e 0 0 + ⊗ 1 0 e−1 e   1 0 0 0   . 2  0  1 −1

 e Vec P 0

Hence,  C =

1 e2

 −2 . −1

In view of (9.37) the solution of the given BVP is given by  t    1 −2 et 0 e tet X(t) = t . e tet + et e2 −1 0 e−t Observation. (i) The boundary condition (9.36) may occur in several types. Only one type is given here. (ii) In most of the cases, finding the matrix C from the relation (9.40) manually is very difficult. Using programming tools the problems can be resolved.


9.8


Trigonometric and Hyperbolic Matrix Functions

In this section an LMDE is considered and the solutions are obtained in terms of matrix trigonometric functions. Following a similar approach as in Section 9.6, a scalar linear differential equation is considered and its solutions are obtained in terms of trigonometric and hyperbolic functions. Consider the IVP of the linear scalar differential equation x00 + x = 0, x(0) = 1, x0 (0) = 1,

t ∈ R.

(9.41)

Clearly, cos t and sin t are solutions of the IVP (9.41). Transforming the IVP (9.41) to a LSDE or a linear system of differential equations gives,  0        x1 0 −1 x1 x1 (0) 1 = , = . (9.42) x2 1 0 x2 x2 (0) 0

Properties of solutions of the LSDE (9.42) Suppose x1 (t) and x2 (t) are solutions of the LSDE (9.42). d (x21 + x22 ) = 2x1 x01 + 2x2 x02 = 0. (i) Consider dt Integrating w.r.t. t between 0 to t, yields

x21 (t) + x22 (t) = 1,

(9.43)

on using (9.42) along with the initial conditions. Then the basic trigonometrical relation, x21 (t) + x22 (t) = 1, (9.44) holds for all t ∈ R Observations. 1. The solution cos t and sin t of (9.41) are also the solutions of the LSDE (9.42), which can be observed by writing x1 (t) = cos t and x2 (t) = sin t. 2. By considering x1 (t) and x2 (t) as two linearly independent solutions of the LSDE (9.41) all properties of cos t and sin t can be proved. This leads to the generalization of trigonometric functions by setting M20 (t) = x1 (t) = cos t and M21 (t) = x2 (t) = sin t.


Matrix Trigonometric Functions For any given n × n matrix T, the matrix trigonometric functions cos T and sin T are defined in terms of the power series, ∞ X

cos T =

(−1)n

n=0

and sin T =

∞ X

(−1)n

n=0

T2n (2n)!

T2n+1 (2n + 1)!

with the assumption that kTk is well defined and kTk < ∞. Further, cos T and sin T are two linearly independent solutions of the matrix IVP, X00 + X = 0, X(0) = In , X0 (0) = 0. (9.45) Corresponding to the LMDE (9.45) its related scalar differential equation is (9.27). Also, the following properties hold for any two Tn×n and Rn×n matrices. (ii) cos(T + R) = cos(T) cos(R) − sin(T) sin(R), and sin(T + R) = sin(T) cos(R) + sin(R) cos(T) (iii) ei T = cos T + i sin T and e−iT = cos T − i sin T; (the above relations are Euler’s formulae for matrix trigonometry) (iv) cos T =

ei T − e−i T ei T + e−i T , sin T = ; 2 2i

(v) For any rational number n DeMoivre’s Theorem for matrix trigonometry is given by, (eiT )n = cos nT + i sin nT (9.46) The properties listed in (i) to (v) are well defined and well-established for matrix trigonometry.  0 Example 9.8.1 Let T = 0 0 cos2 T + sin2 T = I3 .

1 0 0

 0 1 . Find cosT, sinT and show that 0

Solution. Observe that T3 = 0. Hence Tn = 0 for all n≥3. 2 Thus, cosT = I3 − T2! , sinT = T and

Linear Matrix Differential Equations  2 2 cos2 (T) + sin2 (T) = I3 − T2! + T2 = I3 − T2 +

369 T4 4

+ T2 = I3 , since

T 4 = 0. Example 9.8.2 Prove relation (9.46), for n = 2, using the T in Example 9.8.1. Solution. Consider (cos (T) + isin (T))2

= =

 0 Example 9.8.3 For A = 0 0

2  T2 = (I2 − 2T2 ) + i2 T. I2 + i T − 2! cos (2T) + i sin (2T). 1 0 0

  1 0 1 and B =  5 −2 0

1 2 −1

 3 6 −3

Verify that AB 6= BA and M20 (A + B)

= M20 (A) M20 (B) − M21 (A) M21 (B)

M21 (A + B)

= M21 (A) M20 (B) + M21 (B) M20 (A).

Solution. Both A and B are nilpotent matrices An = 0, Bn = 0 for n ≥ 3. Hence, it follows that A2 , sin A = A and 2! B2 cos B = I3 − , sin B = B. 2!

cos A = I3 −

(A + B)4 (A + B)2 + 2! 4! A2 + AB + BA + B2 A2 B2 + B2 A2 = I3 − + 2 8

The LHS = M20 (A + B) = cos(A + B) = I3 −

Here, RHS = M20 (A)M20 (B) + M21 (A)M21 (B)    B2 AB BA A2 I3 − − − = I3 − 2 2 2 2 2 2 2 2 2 2 A B A B B A AB BA = I3 − − + + − − . 2 2 8 8 2 2 Hence, the first trigonometrical addition formula is proved. The second follows on similar lines.


Extended trigonometric functions Consider a third-order IVP of the form x000 + x = 0, x(0) = 1, x0 (0) = 0, x00 (0) = 0.

(9.47)

Then the three linearly independent solutions of the IVP are given by M30 (t) = 1 −

∞ X t3 t6 t9 t3n + − + ··· = (−1)n 3! 6! 9! (3n)! n=0

M31 (t) = t −

∞ X t4 t7 t10 t3n+1 + − + ··· = (−1)n 4! 7! 10! (3n + 1)! n=0

M32 (t) =

∞ X t2 t5 t8 t11 t3n+2 − + − + ··· = (−1)n 2 5! 8! 11! (3n + 2)! n=0

The system representation of the IVP (9.47) is obtained by writing x01 = −x3 , x02 = x1 , x03 = x2 with x1 (0) = 1, x2 (0) = x3 (0) = 0 and is  0       x1 0 0 −1 x1 1 x1 (0) x2  = 1 0 0  x2  x2 (0) = 0. x3 0 1 0 0 x3 x3 (0) The solutions are given by x1 (t) = M30 (t), x2 (t) = M31 (t) and x3 (t) = M32 (t). Following are some of the properties satisfied by these three functions. M30 (t) M31 (t) M32 (t) 0 0 0 (t) M32 (t) (i) W (M30 , M31 , M32 )(t) = M30 (t) M31 00 00 00 M30 (t) M31 (t) M32 (t) M30 (t) M31 (t) M32 (t) = −M32 (t) M30 (t) M31 (t) −M31 (t) −M32 (t) M30 (t) 3 3 3 = M30 (t) − M31 (t) + M32 + 3M30 M31 M32

=

1 for each t.

This is a basic relation [similar to (9.43)] for the third order. This fact suggests that one can obtain similar basic relations for differential equations of higher


orders n = 4, 5, .. (ii) By following the method discussed earlier, it can be proved that addition formulae for any t ∈ R, M30 (t + α)

= M30 (t) M30 (α) − M31 (t) M32 (α) − M32 (t) M31 (α)

M31 (t + α)

= M30 (t) M31 (α) + M31 (t) M30 (α) − M32 (t) M32 (α)

M32 (t + α)

= M30 (t) M32 (α) + M31 (t) M31 (α) + M32 (t) M30 (α) (9.48)

Substituting α = t, gives formulae for M30 (2t), M31 (2t) and M32 (2t). (iii) The characteristic equation of the IVP (9.47) is r3 + 1 = 0. The charac√ 1 √ 1 teristic roots are −1, (1 − i 3), (1 + i 3), which are denoted by −1, w and 2 2 −w2 , respectively. Then, e−t

= M30 (t) − M31 (t) + M32 (t)

ewt

= M30 (t) + wM31 (t) + w2 M32 (t)

e−w

2

t

= −M30 (t) − wM31 (t) − w2 M22 (t)

(9.49)

Observation. These equations can be solved to get M30 , M31 and M32 in 2 terms of e−t , ewt and e−w t . Extended matrix trigonometric functions for any n × n matrix T can now be defined as follows. M30 (T) M31 (T) M32 (T)

= = =

∞ X

(−1)n

n=0 ∞ X

(−1)n

n=0 ∞ X

(−1)n

n=0

T3n , (3n)! T3n+1 , (3n + 1)! T3n+2 . (3n + 2)!

(9.50)

These infinite series are convergent for all T with ||T|| < 1. All the aforementioned formulae can be obtained by replacing both t and α by T, an n × n matrix.
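The three series in (9.50) terminate after finitely many terms whenever T is nilpotent, so they are easy to evaluate directly. The sketch below is an added illustration: it computes M₃₀, M₃₁, M₃₂ by truncated power series for the matrix T of Example 9.8.4 and checks the basic third-order relation.

```python
# Evaluate the extended matrix trigonometric functions M30, M31, M32 by truncated series (9.50)
# and check the identity M30^3 - M31^3 + M32^3 + 3*M30*M31*M32 = I.
import numpy as np
from math import factorial

def M3(j, T, terms=30):
    out = np.zeros_like(T, dtype=float)
    for k in range(terms):
        out += (-1) ** k * np.linalg.matrix_power(T, 3 * k + j) / factorial(3 * k + j)
    return out

T = np.diag([1.0, 1.0, 1.0], k=1)        # 4x4 shift matrix of Example 9.8.4 (T^4 = 0)
M30, M31, M32 = (M3(j, T) for j in range(3))

lhs = (np.linalg.matrix_power(M30, 3) - np.linalg.matrix_power(M31, 3)
       + np.linalg.matrix_power(M32, 3) + 3 * M30 @ M31 @ M32)
print(np.allclose(lhs, np.eye(4)))   # True
```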


n Solution. Here, it is observed  that T =10for n ≥ 4. Hence from the defini1 0 0 −3 0 1 0 0  T3  , tion, M30 (T) = I4 − 3! =  0 0 1 0  0 0 0 1  0 1 0 0 0 0 1 0  M31 (T) = T =  0 0 0 1, 0 0 0 0   0 0 12 0 0 0 0 1  2 2 M32 (T) = T2! =  0 0 0 0 . 0 0 0 0 To verify the basic relation in (i) for the matrix T, 3 3 3 M30 − M31 + M32 + 3M30 M31 M32      2 3 T T6 T3 T = I4 − − T3 + + 3 I − (T) 4 3 3! (2!) 3! (2!)

Consider

= I4 ,

proving the result in (i)

Example 9.8.5 For the matrix T as given in Example (9.8.3), find M30 (2T), M31 (2T), M32 (2T). Solution. Replacing t by T and α by T in the relations (9.48) gives, M30 (2T)

2 = M30 (T) − 2M31 (T)M32 (T)  2  2 T3 T = I4 − − 2T 3! 2!

 1 3  T6 4T T3 0 + − T 3 = I4 − = = I4 − 2 0 3! (3!)2 3 0 M31 (2T)

=

2M30 (T)M31 (T) − [M32 (T)]2

=

 0  0 T3 T4 T− 2 I4 − = 2T =  0 3! 4 0

=

2M30 (T)M32 (T) + [M32 (T)]2

=

 0  0 T4 T3 T2 2 + =T = 2 I4 − 0 3! 2 4 0



M32 (2T)



2 0 0 0

0 0 1 0

 0 0 ; 2 0

0 2 0 0 0 0 0 0

0 1 0 0

1 0 0 0

 0 1 . 0 0

 − 43 0  ; 0  1


Hyperbolic and extended hyperbolic functions The solutions of the IVP x00 − x = 0, with initial conditions x(0) = 1, x0 (0) = 0

(9.51)

are the two linearly independent functions cosh t and sinh t writing N20 (t) = and N21 (t) =

∞ X t2n (2n)! n=0 ∞ X

t2n+1 (2n + 1)! n=0

These functions can be extended as solutions of higher-order differential equations parallel to those of extended trigonometric functions. Now replacing t with T, an n × n matrix, the matrix hyperbolic functions are defined as follows N20 (T) = cosh T =

∞ X T2n (2n)! n=0

N21 (T) = sinh T =

∞ X T2n+1 . (2n + 1)! n=0

In a similar fashion extended matrix hyperbolic functions can be obtained.

EXERCISE 9.8 1. Find M30 (T), M31 (T), M32 (T), M30 (2T), M31 (2T), M32 (2T) and verify the relation when   1 1 3 2 6 . T= 5 −2 −1 −3 2. Show that the matrix differential equation X000 + (A3 X + 3A2 XB + 3AXB2 + XB3 ) = 0 has solution set {M30 (At)C1 M30 (Bt), M31 (At)C2 M31 (Bt), M31 (At) C3 M32 (Bt)}, where C1 , C2 , C3 are arbitrary constant matrices of appropriate order.


3. Let T be an n × n matrix. Define N30 (T)

∞ ∞ X X T3n T3n+1 = , N31 (T) = and (3n)! (3n + 1)! n=0 n=0

N32 (T)

=

∞ X T3n+2 . (3n + 2)! n=0

These are extended hyperbolic functions. Show that 0 0 0 N30 (T) = N32 (T), N31 (T) = N30 (T), N32 (T) = N31 (T).   0 1 0 0 0 0 1 0  For A =  0 0 0 1 , 0 0 0 0 find N30 , N31 , N32 and show that 3 3 3 + N32 − 3N30 N31 N32 = I4 . N30 + N31

4. For any n × n matrices T and R such that (TR = RT), show that N30 (T + R) = N30 (T)N30 (R) + N31 (T)N32 (R) + N32 (T)N31 (R); N31 (T + R) = N30 (T)N31 (R) + N31 (T)N30 (R) + N32 (T)N32 (R); N32 (T + R) = N30 (T)N32 (R) + N31 (T)N31 (R) + N32 (T)N30 (R); Find N30 (2T), N31 (2T) and N32 (2T). 5. Show that the matrix differential equation X000 + (A3 X + 3A2 XB + 3AXB 2 + XB3 ) = 0; X(0) = 1, X 0 (0) = 0, X 00 (0) = 0 has a solution set {N30 (At)C1 N30 (Bt), N31 (At)C2 N31 (Bt), N32 (At) C3 N32 (Bt)}, where C2 , C2 , C3 are arbitrary n × n constant matrices.

9.9

Conclusion

The contents of this chapter are natural extensions of the results obtained in Chapter 8. The theory is based on the study of vectors and matrices included in Chapters 1 and 3 and Kronecker product results in Chapter 6 and techniques in calculus of matrices given in Chapter 7. The methods of explicitly solving the LMDEs are elegant, smooth and help to accommodate the wide


applications in linear algebra. Applications are provided to new topics such as boundary value problems involving matrices and matrix trigonometry. For further reading the readers are referred to [2], [4], [7], [11] and [22].

Bibliography

[1] H. Anton, C. Rorres, Elementary Linear Algebra with Applications, (10th Edition), Wiley Publication, USA. [2] S. Barnet, Matrix Differential Equations and Kronecker Products, SIAM J. Appl. Math. 24, No 1(1973) pp1–5. [3] Richard Bellman, Modern Elementary Differential Equations, Addison Wesley, Reading, Massachusetts (1968). [4] Richard Bellman, Introduction to Matrix Analysis, SIAM, PA, USA (1997). [5] K. B. Datta, Matrix and Linear Algebra aided with MATLAB, (2nd Edition), PHI Learning PVT Ltd., New Delhi, India. (2008). [6] S. G. Deo, V. Raghavendra, Rasmita Kar and V. Lakshmikantham, Text Book of Ordinary Differential Equations (3rd Edition), McGraw-Hill Education, India , New Delhi (2015). [7] J. Foreman, Data Smart: Using Data Science to Transform Information into Insight, Wiley New York (2013). [8] A. Graham, Kronecker Products and Matrix Calculus: With Applications (1st Edition), Ellis Horwood Limited, West Sussex, England (1981). [9] F. S. Hill, Computer Graphics using Open GL, Pearson, New York (2000). [10] K. Hoffman and R. Kunze, Linear Algebra (2nd Edition), Prentice Hall Inc. NJ, USA (1971). [11] V. Lakshmikantham and S. G. Deo, Variation of parameters Methods in Dynamic System, Gordon & Breach Science Publishers, Amsterdam (1998). [12] A. Lesk, Introduction to Bioinformatics (5th Edition), OUP Oxford (2019). [13] J. Liesen and Z. D. Strakos, Krylov Subspace Methods – Principles and Analysis, Oxford University Press, UK (2013).


Bibliography

[14] B. Noble, Applied Linear Algebra, Prentice-Hall, Inc. New Jersey, USA (1969). [15] S. K. Saha, Introduction to Robotics, McGraw-Hill Education, India (2014). [16] D. A. Sanchez, Ordinary Differential Equations and Stability Theory: An Introduction, W. H Freeman and Company, San Francisco (1968). [17] W. F. Stallings, Cryptography and Network Security, Pearson-Prentice Hall, NJ, USA (2005). [18] R. G. Stanton, Numerical Methods for Science and Engineering, Prentice Hall Inc., NJ, USA (1964). [19] G. Strang, Introduction to Linear Algebra (5th Edition), Wellesley Cambridge Press, MA, USA (2016). [20] R. Schilling, Fundamentals of Robotics Analysis and Control, Prentice Hall, India (1996). [21] F. E. Udwadia and R. E. Kalaba, Analytical Dynamics: A New Approach, Cambridge University Press, UK (1996). [22] J. Vasundhara Devi, S. N. R. G. Bharat Iragavarapu and S. Srinivasa Rao, Quasilinearization Technique Periodic Boundary Value Problem for Graph Differential Equations and Its Associated Matrix Differential Equations with Dynamics of Continuous, Discrete and Impulsive Systems, Series B: Applications & Algorithms 23 (2016) pp. 287–300. [23] W. J. Vetter, Derivative Operations on Matrices, IEEE Trans. Auto. Control. AC-15, 241–244 (1970). [24] W. J. Vetter, Matrix Calculus Operations and Taylor Expansions, SIAM Rev. 2, 352–369 (1973).

Answers

Exercise 1.2 1. (i) (4 −4 3) (ii)(5 0 10) (iii) (11 −12 7) (iv)(9 −12 3) −4 √ 2. (i) (−1, 1, 8), 9 (ii) (1, 11, 3), 26 3. √41−1 105 4.

(4 5 2) √ 45

10.

¯ ¯ 7¯ i−6 √j−10k 185

12. 15.

Exercise 1.3. 1. (i) 4 × 4; (ii) 4 × 1;

(iii) 2 × 3, (iv) 3 × 4, (v) 2 × 5.  2 −1 3 5 7 5 4 6 8  2  (ii)  2. 3 , 2 (1 × 11 and 11 × 1) 3. (i)  8 5 7 9  5 11 8 6 8 10   0 31 43 1 0 1 3 3 (iii)   4 1 0 3 3 3 43 13

 −6 −3  0 3

4. x=1, y=2, z=3 5. α = 2, β = 3, γ = 4, δ = 1       12 16 8 −19 21 −4 −12 0 −3 7 (ii)  −4 −2 6. (i)  −3 −9 6  , −4 8 0  8 0 12 −23 −15 −5 −6 −6 −9  2 −2 −4 0 (iii) 7 1 1 1 1  3 8 9. −12  −1    5α + 2β −α − 5β 1 −1 −11 −7 1 3α + 4β 13. (i) 5 (ii) γ −9 4 −9 α − 3β 2α + 5β 5α + 3β Exercise 1.4. 7. (i) a+b−c = 0 (ii) a+b−c = 1.

379

380

Answers

Exercise 1.6.  2 x + y2  0 2. 486 5.   0 ax + by

0 x2 + a2 xy + ab 0

Exercise 1.7.

0 xy + ab y 2 + b2 0

 ax + by 0   0  a2 + b2



     2 −3 2 1 2 0 1 −4 2. Take A = O 3. (i) (ii) 2 (iii) 2 −3 5 1 1 3 −1       9 5 −1 −2 1 2 −1 1 −1 −2  18 12 −3 −4  4 (iv) −2 −3 2  (v). −2 3 4. (i)  −2 −1 1 0 0 −3 1 −5 5 11 −3 −2 0 1   3 −15 5 6 −2 (ii) −1 1 −5 2     3 −1 −9   1 0 0 2 4 4 7 3  1 −1 (ii) −1 21 (iii) −3 1 1 5. (i) 50 2 −7 −1 1 −1 −1 −4 1 2 2 4 4 Exercise 1.8.  3  1+a  1.    −6 1  15 29 14 29  2.  12 23 21 42 13 26

0 1+b 14 9  19 21  22  28 13 5×3

  8 1 0  1+c   1 − a 1 − b ,    −6 10 13  1 −7 12   8 −7 2 −4  1 1 −1 −3  3. 51  −13 12 −2 9  −1 4 −4 −2

 −2 1−c   3  −6   1 2  4.  2 1

Exercise 1.9     2 12 3 −1 4. 5. 15 1 12 −8 Exercise 2.3 2. (i) 2

(ii) 3

(iii) 3

(iv) 2

3. 3

Exercise 2.4 

1. (a) 4

(b) 4

2. 3

1 −1 3 (a)  −1 −3

0 1 1 −9

  0 0 1 0 0 1 0 0 ,  1 0 0 0 −15 6 0 0

2 −8/6 0 −1/6

 −1 −1  0 0

Answers 

1 (b) −2/5 −1/3

0 1/5 0

  0 1 1 0  , 0 1 1/3 0 0

381

 1 −1 1

Exercise 2.5 k = 2, x = 35 z, y = −4 1. k = −1 x = 35 (z + 1), y = −4 5 (z + 1); 5 z+1 2. a = −2, b = 0, Infinitely many solutions; a 6= −2, unique solution; a = −2, b 6= 0, inconsistent. −997 −485 −69 −8 3. (a) x = 79 , y = , z = (b) x = 251 18 162 243 . 35 , y = 35 , z = 35 . 51 5 4. a = 2 , b = 2 , Infinitely many no. of solutions; a 6= 52 , unique solution; a = 52 , b 6= 51 2 , no solution 5. x = 0 6. (a) x = 3, y = 3, z = 1 (b) x = −1, y = −1, z = 1 (c) x = 1, y = 1, z = 1 Exercise 2.6   −3 4 5 1. (i) 52 −1 5

 0 (iv) 1 1

 −1 (ii)  2 −2

5 1 3 −4 3 −7 3

Exercise 2.7

1 3 −1  3 −1 3



−2 3 5 3 −2 3

2 3 −2  3 5 3





2 3

(iii)  0

−1 3

−1 3

1 −1 3

−1 3



0 2 3

         1 −1 2 1 1 , 1.(i) λ = 1, −2, 3; −1 , −1 ,  1  (ii) λ = 3, 2; 3 2  1  1  0     1 19 −1 3 −1 (iii) λ = 2, 4, −5; 0 ,  2  ,  7  (iv) λ = 4, −1; , 2 1 0   0  6 1 −1 (v) λ = 9, 2; , , 2 5      1 2 16 (vi) λ = 1, 4, 6; 0 , 3 , 25 0 0 10         −1 −1 1 0 0 0 −1 −2 3 0 0 0 0 1 , 0 0 0 (ii)  3 1 0 , 0 2 0 2. (i)  1 0 1 1 0 0 6 1      0 1  0 0 2  2 −2 1 9 0 0 7 0 1 6 0 0 (iii) 1 0 −2 , 0 9 0  (iv) 15 1 0 , 0 −1 0 0 1 2 0 0 −9 35 0 0 0 0 1 

382

Answers

Exercise 2.8         2 −1 1 −1 √1   1. (i) √32 ; 0, √12 , 13 1 and √12 , 5 2 1 1 2 0   √ −1  0 2      √1  1 0 0 √ 2 ,  1 , and (ii) 2, 0;  0 0 √  1 2 √1 0 2 Exercise 2.9 1.

normal form index (i) 2u2 + 2v 2 2 (ii) u2 + 6v 2 2 (iii) −9u2 + 9v 2 + 9w2 2

signature 0 0 1

nature positive semi definite. positive definite. indefinite.

Exercise 3.2   −a −b 4. ∈ / S. S is not a vector space. −c −d    11. (i) H =  (x, 0) x ∈ R and K = (0, y) y ∈ R (ii) S1 = (x, 0, 0) x ∈ R & S2 = (0, y, z) y, z ∈ R 13. H and K are not subspaces of R2 15. (iii) W1 = {(x, 0)/x ∈ R} , W2 = {(0, y)/y ∈ R} Exercise 3.3       0 0   2 1. 0 , 1 , 0 4. {x}   1 0 1        0 0 0  0 0 0   a 0 0 5. (i) M1 = 0 0 0 , 0 b 0 , 0 0 0 a, b, c ∈ R   0 0 c 0 0 0 0 0 0        0 0 e 0 0 0   0 d 0  (ii) & (iii) M2 = M1 ∪ 0 0 0 , 0 0 0 , 0 0 f  d, e, f ∈ R   0 0 0 0 0 0 0 0 0 6. (i) 2 (ii) 2 7. (a) 4 (b) 3 (c) 3 8. 4

Answers Exercise 3.4  1  2 2 3 1. −1 −1 4

 2.

6

14 −5

−1 5 4 5 −7 5

3 5  1 10 −3 10



 3.

383 

0 0 4.  1 −1

−1 5 3  10 1 10



0 −1 1 −1 0 1 0 0

 1 0  −1 1



 0 1 0 −1 2 0 −2 −1  5.   2 −2 0 3 −2 3 2 −1 Exercise 3.5 1. 3 2. x2 is not linear 5. rank T=3 6. (c) is not a linear transformation.  (a), (b) are linear transformations,     7. −4 −3 −30//11 9 17 8 (a) −1 4//0 3 (b) 1 9// − 2 −5 Exercise 3.6      3 2 −1 0 1 0 1. 2. 4. −3 −3 0 1 k 1 3 3   −4 −3 −30 (7) 11 9 17   −1 0 5 7 (8) (a) (b) 4 3 −6 −3 Exercise 3.7 2. (i) 6||x||2 + 36||y||2 + 35hx, yi 2 (ii) 40x  − 30y  + 12kxk  + 16hx, yi +i 1 0 0.707 0.707 6. U = , V= 0 1 0.707 −0.707

 0 1 3

5.

 0 2

−1 1

 1 0

 30x + 40y − 16kxk2 − 12hx, yi

Exercise 3.8  1 −1 1 1 0 (2) A =  1 −1 0 1     1 1 −1 2 √1  1  , √1  1  , √1  1  2 3 2 6 −1 0 1 14 22 (1 − 2) + √ (2 1) 3. x = √ 5 5 Exercise 4.2

    2 2.0046 , , system is well 1 0.99899 conditioned.       −998.3333 −2247250 1.975 5. , , ill conditioned. 6. , well-conditioned. 1000 2250250 1.015

3. (i)k(A) = 4.2652,

(ii) k(A) = 5.795

Exercise 4.3 1. x = − 12 , y = 3, z =

 −1 −3 4

2

2.  0 1 2

4.

−5 3 7 3

−1

7 6 −4  3 1 2



384  1 3. 4 5

Answers 

0 3 2



1 4.  3 −1

0 0 , U = LT . 4

0 1 −5 7

Exercise 4.4 1. (i) (a) (1 0.5 0.5); (b) (1 0.5 0.5) 2. (i) z = 0.5, y = 1, x = 1

  1 0 0 , 0 1 0

4 −7 0

 3 −8 2 7

(ii) (a) (5 1 2); (b) (5 1 2).

(ii) & (iii) The iteration scheme diverges as the equations are not strictly dominant.   2 3.  3  −1 Exercise 4.5  −2.4495 −3.6742 0 1.8708 1.  0 0

 −4.0825 1.669  1.0911

  2 2. x = 1 1

Exercise 4.6 

1 √

2 2 2.   0 0

√ 2 2 √0 2 0

√0 2 −1 √ 2 5 2

Exercise 4.7  −0.3714 −0.2682 1. −0.5571 −0.7015 −0.7428 0.6603

 0 0  5 

2 −5 2

8 9 −4  9 −1 9





1 2. 0.88 2.05

0.88 −0.54 0

 2.05 0  5.5

Exercise 4.8  2.(i) (−8, 7), (−7, 4) 3. (i) λ = 6.1, 0.86 1 1 0.52  (ii) λ = 5.98,  0.44 0.74 Exercise 5.2. 2. x2 + y 2 + z 2 − 2y = 0 1. x2 + 76 y 2 + 2xy − 2x − 13 6 y =0 3. 9x − 3y + 3z − 9 = 0 Exercise   5.3.       0.55 0.4825 0.472375 0.47085625 1. , , , 0.45 0.5175 0.527625 0.52914375       1  67  0.5 0.57 0.589 5 3       2. 0.4 , 0.27 , 0.238 3. (i) 2 (ii) 89 4. 11 22 3 89 0.1 0.16 0.173

1. (i) 9, 7; (ii) 9, 7

0.66

T

.

Answers

385

5. x = 16.67%, y = 29.17% and z = 54.16%. Exercise 5.4. 103400 8500 1. x = 6100 2. r = 1, w = 2, i = 5 90 , y = 273 , z = 39 3. x = 1000, y = 2200, z = 1800 4. 2%, 4%, 11% 5. 20 3 , 0, Exercise 5.5. 

5 1. encrypted message 13



20 3



     0 11 3 2. encrypted message 5  , 17 , 10 12 16 8

Exercise 5.6. 1. (3 (3 5 4), (5 4 3), (7 3 5)  5), (4 5), (3 2),(1 2) 2.     a11 −a12 0 0 2 − √12 √32 1 x1 y1 0 0 a21 −a22 0 0   √1 √7 1 3.   5. x2 y2 0 0 2 a31 −a32 0 0 4. 1 − 2 √3 √5 x3 y3 0 0 5 1 2 2 0 0 0 1 Exercise 5.7.        1  1   1 0 0 3 3.732 2 5 2 13 2 2 2  0  0 1 0 −6 3.036 3  3.  3  2.  ; (ii)  ,  1 , −8, −1 1. (i)  2 2 2 2.464 0 0 1 −1  6.33  7 7 − 12 12 2 2 0 0 1 1 1 1   1 0 0 4. 0 0 −1 5. 30◦ or π6 0 1 0 Exercise 5.8. A M P - S A - P E S 1. S P A - M - P A A M 2. Exercise 5.9.     −245.51929243 −1743.16677514 −0.61732479 1.50020466 −420.98861985 −2976.93615271  2. −3.31444678 −8.36360509 1.   −99.93679855 −708.55908509  −2.69712199 6.8340043 −169.13971335 −1206.7523688 Exercise 5.10.     −3.57366847e0 4.31847612e0 77.62148014 −3.76833911 −8.48478342e0 9.71681260e0  −3.49809178 0.20034556    0 0   1.  2. (i)   7.84596841e 3 −9.41043273e3  0.82372516 −0.27093975 −2.64511790e 3.10073903e  −0.82372516 0.27093945 0 2.9348546e −4.01209624e0 Exercise 6.2. 1. (P1! )(3×1) (P!1 )(1×3) , (P1! )(3×1) (P!2 )(1×3) , (P3! )(3×1) (P!1 )(1×3) . 2. (P11 )(3×3) + (P22 )(3×3) + (P33 )(3×3) 4. (i) Σ31 A∗l P!l , (ii) Σ31 Pk! ATk∗ , (iii) Σ3i=1 Σ3j=1 aij Pij , (iv) P!3 AP2!

386 Exercise 6.3.  −12 12  24 0 2. (i)  −36 36 72 0  2 8 1 4 6 2 (ii)  3 12 2 6 9 4

Answers

   2 1 8 4 48   0  3. (i) 3 2 12 8,   4 2 6 3 24 6 4 9 6 0     4 −2 2 −1 1 −2 −1 2 1     3  4. (i)  4 0 2 0, (ii)  0 −1 0 1  0 0 −1 1 4 8 2 0 0 6 0 0 2 0 0 2 0 0     −2 4 0 0 0 1 0 −1  2 0 0 0   1 2 1 −2 −1  Exercise 6.4. 1. (i)  −1 2 −1 2 , (ii) 4 0 0 0 2 1 0 1 0 0 0 4 2     2 3 6 9 147 42 154 44 −126 −21 −132 −22 −1 5 −3 15   3.  2.   77 6 22 84 24   8 12 4 −4 20 −2 10 −66 −11 −72 −12           0 0 1 0 6 −3 −10 5 0 0 1  1 −9 6 15 10 , 5. 400, 6. 2,   , 3,   , 4,   , 6,   4.  1 1 −1 −1 −2 1 4 −2 1 0 −1 0 3 −2 −6 4 −48 96 −24 48 

Exercise 6.5. 1.      x1 z1 a11 b11 a11 b12 a12 b11 a12 b12 y1 w1 x1 z2  a11 b21 a11 b22 a12 b21 a12 b22  y1 w2       x2 z1  = a21 b11 a21 b12 a22 b11 a22 b12  y2 w1  x2 z2 a21 b21 a21 b22 a22 b21 a22 b22 y2 w2      x1 z1 −1 0 0 0 y1 w1 x1 z2   2 −1 0  y1 w2  0     2.  x2 z1  =  0 0 −1 0  y2 w1  x z 0 0 2 −1 y w  2 2   2 2 x1 z1 2 1 8 4 y1 w1 x1 z2  3 2 12 8 y1 w2      3.  x2 z1  = 4 2 6 3 y2 w1  x2 z2 6 4 9 6 y2 w2    a1 y1 + a3 y2 b1 6y1 −3y2 4y3 a2 y1 + a4 y2  b2   9y1 12y2 6y3     Exercise 6.6. 1.  a1 y3 + a2 y4  = b3 , 3.  −8y1 4y2 2y3 a3 y3 + a4 y4 b4 −12y1 −16y2 3y3

 −2y4 8y4  , −y4  4y4



2z1  −4z1 4.   7z1 −14z1

2z2 6z2 7z2 21z2

11z3 −22z3 9z3 −18z3

Answers 

11z4 33z4   9z4  27z4

 1 0 Exercise 6.7. 1. U is a 9×9 matrix, for problems 2. & 3., U =  0 0 Exercise 6.8.  5 2 e +2e 2e5 −2e2 0 3 3  2   e5 +2e2 5 −2e2 + 2e5 0  0 1 2e + e 3 1. (i) 3 (ii)  e5 −e2 2e5 +e2 −e2 + e5 e2 + 2e5  3 0 3 e5 −e2 0 0 3          3 2 1 4 −1 −e + 2e2 2. (i) λ = 3, 2 , ; µ = 6, 1 , (ii) 1 1 1 1 −2e3 + 2e2 " 6 # 6 4e +e 5 e6 −e 5

387

0 0 1 0

0 1 0 0

 0 0  0 1

0



2e5 −2e2   3 

0



2e5 +e2 3  3 2

e −e , 2e3 − e2

4e −4e 5 e6 +4e 5

Exercise 6.9.         −9 −1 3 10 1 4 0 −3  1  −12  5 8 0 4        1.   1 0 11 1 2. λ = −2, 3, 5, 10; −2 , −1 ,  6  , 8 1 2 0 1 5 9 Exercise  6.10. 7    1 0 2 0 0 6 6 1. , 2. −13 −1 , 3. 1 −1 0 0 6 6 Exercise 7.2. h 4 i 2 T 1. [3t2 − cost 1]T , t4 cost t2 + [c1 c2 c3 ]T 2. [et − 2sin2t 5]

  3 4   3 4

T

3. [tet − t cost + sint sint] + [c1 c2 c3 ]T     cost − t sint 1 cost 0 1  0 sint + t cost et 4. (i) , (ii)  2t 3t2 t t 2t e + te tant + t sec 2t #  "  t2 c c sint 2 + 1 3 5. (i) t3 c2 c4 −cost  3    t2 t3 c1 c4 c7 t 2 6  + c2 c5 c8  (ii) sint −cost 0 c3 c6 c9 t ln sect ln(tant + sect)       ∂f ∂f x 2 xz , ∂f 0 2z xy 6. ∂x = y 2x yz , ∂y = ∂z = 7. (i) 0 (ii) 0 (iii) cost[et −t ln t[tet +et ]]+t2 et (tet + et )sint+tet [cost(ln t+ 3 2t 1) − t e (2sint + cost) + t2 et lnt(−sint + cost) − 3t2 e2t sint + tet cost(2lnt + 1) + et (sint − cost) + sint + cost



Exercise 7.3.
1. $\begin{bmatrix} 3x_1^{2} - 3x_2x_3 \\ -3x_1x_3 + 2x_2 \\ -3x_1x_2 \end{bmatrix}$
2. $\cos x_1$, $-e^{x_2}$; $27x_1^{2} + 1$, $2x_2$; $4x_2^{2}$, $\dfrac{1}{x_3}$

Exercise 7.4.
1. (i) $\dfrac{1}{\|X\|^{2}}\begin{bmatrix} \|X\|x_{22} - |X| & -\|X\|x_{12} - |X| \\ -\|X\|x_{21} - |X| & \|X\|x_{11} - |X| \end{bmatrix}$, (ii) $\begin{bmatrix} x_{22}\|X\| + |X| & -x_{12}\|X\| + |X| \\ -x_{21}\|X\| + |X| & x_{11}\|X\| + |X| \end{bmatrix}$
2., 3. $\dfrac{\partial y}{\partial x} = \begin{bmatrix} x_{22}x_{33} - x_{23}x_{32} & -x_{21}x_{33} + x_{23}x_{31} & x_{21}x_{32} - x_{22}x_{31} \\ -x_{12}x_{33} + x_{13}x_{32} & x_{11}x_{33} - x_{13}x_{31} & -x_{11}x_{32} + x_{12}x_{31} \\ x_{12}x_{23} - x_{13}x_{22} & -x_{11}x_{23} + x_{13}x_{21} & x_{11}x_{22} - x_{12}x_{21} \end{bmatrix}$
4. $r^{2}\cos\phi$
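The pattern behind Exercise 7.4, problems 2 and 3, namely that the derivative of det(X) with respect to the entries of a 3 × 3 matrix X is the matrix of cofactors, can be verified symbolically. A minimal sketch, assuming sympy is available:

```python
import sympy as sp

# For a 3x3 matrix X, d det(X) / d x_ij is the (i, j) cofactor, so the
# gradient matrix equals the cofactor matrix (transpose of the adjugate).
X = sp.Matrix(3, 3, lambda i, j: sp.Symbol(f"x{i+1}{j+1}"))
detX = X.det()

grad = sp.Matrix(3, 3, lambda i, j: sp.diff(detX, X[i, j]))
cofactors = X.adjugate().T

print((grad - cofactors).applyfunc(sp.simplify))   # -> zero matrix
```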

Exercise 7.5.
1. $\begin{bmatrix} 2 & 6 \\ 5 & 15 \end{bmatrix}$, $\begin{bmatrix} 4 & 12 \\ 6 & 18 \end{bmatrix}$, $\begin{bmatrix} 6 & 8 \\ 12 & 16 \end{bmatrix}$
2. $\begin{bmatrix} x_2 & x_1 + x_4 & 0 \\ 0 & x_2 & x_2 \end{bmatrix}$, $\begin{bmatrix} x_3 \\ 2x_4 \end{bmatrix}$
3. $\begin{bmatrix}
y_1 + x_1\frac{\partial y_1}{\partial x_1} + x_3\frac{\partial y_2}{\partial x_1} & y_3 + x_1\frac{\partial y_3}{\partial x_1} + x_3\frac{\partial y_4}{\partial x_1} & x_2\frac{\partial y_1}{\partial x_1} + x_4\frac{\partial y_2}{\partial x_1} & x_2\frac{\partial y_3}{\partial x_1} + x_4\frac{\partial y_4}{\partial x_1} \\
x_1\frac{\partial y_1}{\partial x_2} + x_3\frac{\partial y_2}{\partial x_2} & x_1\frac{\partial y_3}{\partial x_2} + x_3\frac{\partial y_4}{\partial x_2} & y_1 + x_2\frac{\partial y_1}{\partial x_2} + x_4\frac{\partial y_2}{\partial x_2} & y_3 + x_2\frac{\partial y_3}{\partial x_2} + x_4\frac{\partial y_4}{\partial x_2} \\
y_2 + x_1\frac{\partial y_1}{\partial x_3} + x_3\frac{\partial y_2}{\partial x_3} & y_4 + x_1\frac{\partial y_3}{\partial x_3} + x_3\frac{\partial y_4}{\partial x_3} & x_2\frac{\partial y_1}{\partial x_3} + x_4\frac{\partial y_2}{\partial x_3} & x_2\frac{\partial y_3}{\partial x_3} + x_4\frac{\partial y_4}{\partial x_3} \\
x_1\frac{\partial y_1}{\partial x_4} + x_3\frac{\partial y_2}{\partial x_4} & x_1\frac{\partial y_3}{\partial x_4} + x_3\frac{\partial y_4}{\partial x_4} & y_2 + x_2\frac{\partial y_1}{\partial x_4} + x_4\frac{\partial y_2}{\partial x_4} & y_4 + x_2\frac{\partial y_3}{\partial x_4} + x_4\frac{\partial y_4}{\partial x_4}
\end{bmatrix}$

Exercise 7.6.
1. (i) $\begin{bmatrix} 6x\,dx & 6y\,dy \\ 3y\,dx + 3x\,dy & 2xy\,dx + x^{2}\,dy \end{bmatrix}$, (ii) $\begin{bmatrix} 2x\,dx & 2y\,dy + 8\,dx \\ (y + 8x)\,dx + x\,dy & 2xy\,dx + (x^{2} + 8y)\,dy \end{bmatrix}$
2. $\begin{bmatrix} (18x^{2}y + 4)\,dx + 6x^{3}\,dy & (36x^{3} + 12x^{2})\,dx \\ 32xy^{2}\,dx + (32x^{2}y + 12y)\,dy & (72x^{2}y + 12xy^{2})\,dx + (24x^{3} + 12x^{2}y)\,dy \end{bmatrix}$

Exercise 7.7.
1. $\begin{bmatrix} x_{12} + 2x_{11} & 0 & x_{11} & 0 \\ \sin(x_{12} + x_{11}) & 2x_{11}\sin(x_{12} + x_{11}) & 0 & 0 \\ 0 & \sin(x_{21} + x_{22}) & 0 & \sin(x_{21} + x_{22}) \\ 0 & x_{22} & 0 & x_{21} \end{bmatrix}$
2. $\begin{bmatrix} 1 & 0 & x_{12} & x_{11} & x_{21} & 0 \\ 0 & 0 & 0 & 0 & x_{11} & 0 \\ 2x_{21} & 0 & 0 & x_{22} & x_{22} & 0 \\ x_{11} & 0 & 0 & 2x_{22} & x_{12} & 0 \end{bmatrix}$
3. $2x_{11}\exp(x_{11}^{2} + x_{22}^{2})$, $2x_{12}$, $\dfrac{2x_{12}}{x_{22}^{2} + x_{12}^{2}}$, $2x_{11}$, $0$, $0$, $1$, $2x_{22}\exp(x_{11}^{2} + x_{22}^{2})$, $\dfrac{x_{22}}{x_{22}^{2} + x_{12}^{2}}$, $0$, $0$, $2x_{22}$, $2x_{22}$
4. $-\sin(x_{11})$, $2x_{11}\cos(x_{12})$, $0$, $0$, $0$, $2x_{11}\sin(x_{22})$, $\cos(x_{21})$, $0$, $0$, $\sin(x_{12})$, $\cos(x_{22})$, $0$, $-x_{11}^{2}$, $-\sin(x_{12})$, $0$, $0$, $2\cos(x_{22})[x_{11} + x_{22}] + \sin(x_{22})$

Exercise 7.8.
1. (i) $\begin{bmatrix} -2x_{11}\sin(x_{11}) + 2\cos(x_{11}) & 2x_{12}\cos(x_{22}) \\ x_{22}\cos(x_{21}) + 2\exp(x_{21}) & 0 \\ 0 & -x_{12}^{2}\sin(x_{22}) \\ 2x_{11}\exp(x_{21}) & x_{11}\cos(x_{22}) - x_{22}x_{11}\sin(x_{22}) \end{bmatrix}$
(ii) $2x_{11}\cos(x_{11}) - x_{11}^{2}\sin(x_{11})$, $2x_{11}x_{12}x_{21}$, $x_{21}\sin(x_{12}) + x_{12}x_{21}\cos(x_{12})$, $2x_{11}x_{12}x_{22} + x_{21}^{2}$, $-x_{12}x_{22}\sin(x_{11})$, $0$, $x_{22}\cos(x_{11}) + x_{21}^{2}\cos(x_{12})$, $3x_{12}^{2}x_{22}^{2}$, $x_{12}\sin(x_{12})$, $2x_{12}x_{21}$, $0$, $x_{11}^{2}x_{12}^{2}$, $2x_{21}\sin(x_{12})$, $3x_{21}^{2}$, $x_{12}\cos(x_{11})$, $2x_{12}^{3}x_{22}$
2. (i) $\begin{bmatrix} x_{21}x_{12}x_{22} & 0 & x_{22}x_{11}x_{21} & 2x_{12}x_{22}^{2} \\ 2x_{11}x_{21}^{2} & x_{12}x_{21}x_{22} & 0 & x_{11}x_{21}x_{22} \\ x_{11}x_{12}x_{22} & 0 & x_{11}x_{12}x_{21} & 2x_{12}^{2}x_{22} \\ 2x_{11}^{2}x_{21} & x_{11}x_{12}x_{22} & 0 & x_{11}x_{12}x_{21} \end{bmatrix}$
(ii) $\begin{bmatrix} 2x_{11}x_{22}^{2} & 2x_{11}\cos(x_{11}^{2}) & 0 & 0 \\ 0 & 0 & 2x_{22}x_{11}^{2} & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & -2x_{22}\sin(x_{22}^{2}) & 4x_{22}^{3} \end{bmatrix}$





Exercise 7.9.
1. $\begin{bmatrix} x_{12} + 3x_{11} & 2\cos(x_{21} + x_{22}) \\ 2\cos(x_{12} + x_{11}) & 2x_{11} + x_{21} + x_{22} \end{bmatrix}$
2. $\begin{bmatrix} 1 & x_{11} + x_{12} & x_{11} + x_{21} \\ x_{11} + x_{21} & x_{22}^{2} + 2x_{12}x_{22} & x_{11} + x_{22} \end{bmatrix}$

Exercise 8.2.
1. $x' = Ax$ with $x = [x\ \ y\ \ z]^{T}$ and (i) $A = \begin{bmatrix} 3 & 2 & 0 \\ 2 & 1 & 1 \\ 0 & 0 & -3 \end{bmatrix}$, (ii) $A = \begin{bmatrix} 2 & 5 & -9 \\ 7 & -2 & 3 \\ -1 & 5 & 7 \end{bmatrix}$, (iii) $A = \begin{bmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{bmatrix}$
2. (i) $\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}' = \begin{bmatrix} 0 & 1 \\ -w^{2} & -a \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$, (ii) $\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}' = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 5 & -9 & -7 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$
3. (i) $\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}' = \begin{bmatrix} 0 & 1 \\ -25 & -8 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$; (ii), (iii) first-order systems with coefficient rows $0, 1, 0, 0$; $0, 0, 1, 0$; $0, 0, 0, 1$; $3, 5, -9, 6, -5$; $0, 0, 2, 4$; $7, 4, -3, -3$; $0, 1, 0, 0, 0, 0$; $1, -1, 0$; $0, -2, 1, 0, 0$; $0, 0, 0, 0, 0, 1$; $1, 0, -1, 1, 1, 0$; forcing terms involving $t$ and $t^{2} + 1$; $x^{0} = (10, 0, 5)^{T}$, $x(1) = (3, 10, 1, -5, 4, 3)^{T}$
4. (i) $A = \begin{bmatrix} 3 & 1 & 0 & 1 \\ 0 & 5 & 3 & 2 \\ -1 & 1 & 1 & -1 \\ 0 & 0 & 0 & 5 \end{bmatrix}$, $x(1)$ with entries $2$, $2$, $2$; (ii) $A$ with rows $3, 1, 0, 1$; $0, 0, 5, 3$; $-1, 1, 1, -1$; $0, 0, 0, 5$ and $b$ with entries $0$, $-e^{t}$, $\sin t$, $te^{t}$; (iii) $A$ with rows $0, 1, 0, 0, 0$; $0, 0, 1, 0, 0$; $-1, 0, 0, 0$; $b$ with entries $0$, $-e^{t}$, $\sin t$, $te^{t}$, $10e^{t}$; $x(0)$ with entries $1$, $1$, $0$, $9$, $1$

Exercise 8.4.
1. (i) $\begin{bmatrix} 2 \\ 0 \end{bmatrix}$, $\begin{bmatrix} 2 + 2t \\ 0 \end{bmatrix}$, $\begin{bmatrix} 2 + 2t + t^{2} \\ 0 \end{bmatrix}$
(ii) $\begin{bmatrix} 1 \\ 1 \end{bmatrix}$, $\begin{bmatrix} 1 + e^{t} + \tfrac{t^{2}}{2} \\ 1 - e^{-t} + \tfrac{t^{2}}{2} \end{bmatrix}$, $\begin{bmatrix} \tfrac12 e^{2t} + te^{t} + \tfrac{t^{3}}{6} + e^{-t} + t + 1 \\ 1 + \tfrac{t^{2}}{2} + (t - 1)e^{t} + \tfrac{t^{3}}{3} - \tfrac12(t^{2} + 2t + 2)e^{-t} + \tfrac{e^{-2t}}{2} \end{bmatrix}$
(iii) $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$, $\begin{bmatrix} 1 + t + \tfrac{t^{2}}{2} \\ 1 \end{bmatrix}$, $\begin{bmatrix} 1 + \tfrac12 t^{2} + \tfrac{t^{4}}{8} \\ 3t + t^{2} \end{bmatrix}$
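The iterates listed for Exercise 8.4 come from the method of successive approximations (Picard iteration), $x_{k+1}(t) = x_0 + \int_0^t A\,x_k(s)\,ds$. The sketch below uses an assumed coefficient matrix A together with the initial vector (2, 0); with this choice it reproduces the pattern of problem 1(i), but A itself is illustrative and is not taken from the exercise statement.

```python
import sympy as sp

t, s = sp.symbols("t s")

# Picard (successive-approximation) iteration for x' = A x, x(0) = x0.
# A is an assumed example; x0 = (2, 0) matches the first iterate listed
# for Exercise 8.4, problem 1(i).
A = sp.Matrix([[1, 1],
               [0, 2]])
x0 = sp.Matrix([2, 0])

x = x0
for _ in range(3):
    integrand = (A * x).subs(t, s)
    x = x0 + integrand.integrate((s, 0, t))
    print(x.T)
# -> (2 + 2t, 0), (2 + 2t + t^2, 0), (2 + 2t + t^2 + t^3/3, 0)
```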

Exercise 8.5.
1. (i) $\begin{bmatrix} 1 + 2\sin(t + \pi) \\ 2\cos(t - \pi) \end{bmatrix}$, (ii) $\tfrac{31}{30}e^{-4t} - 5e^{t} + \tfrac16 e^{2t}$; $\tfrac{62}{15}e^{-4t} - \tfrac15 e^{t} + \tfrac13 e^{2t} - \tfrac15$
2. (i) $\begin{bmatrix} \tfrac52 e^{t} - \tfrac52 e^{-t} \\ e^{t} + te^{-t} \\ e^{t} \end{bmatrix}$, (ii) $\begin{bmatrix} e^{2t}(-x_{10} + x_{20}) + e^{t}[2x_{10} - x_{20} - 1] + 1 \\ 2e^{2t}[-x_{10} + x_{20}] + e^{t}[2x_{10} - x_{20} - 1] + 1 \end{bmatrix}$

Exercise 8.6.
1. $\begin{bmatrix} e^{it} & te^{it} & \tfrac12 t^{2}e^{it} \\ 0 & e^{it} & te^{it} \\ 0 & 0 & e^{it} \end{bmatrix}$
3. (i) $\dfrac15\begin{bmatrix} e^{-4t} + 4e^{t} & -e^{-4t} + e^{t} \\ -4e^{-4t} + 4e^{t} & 4e^{-4t} + e^{t} \end{bmatrix}$
4. $\bar{c}_1 e^{10t} + \bar{c}_2 e^{10t}$, with the vector $(1, -2)^{T}$
5. $e^{t} - \tfrac{t}{3} - \tfrac19$; $2t + 1$; $-3t - \tfrac19$; $e^{-t}$; $-2$
6. (i) $\begin{bmatrix} e^{t} & te^{t} & 0 \\ 0 & e^{t} & 0 \\ 0 & 0 & e^{t} \end{bmatrix}$, (ii) $\dfrac16\begin{bmatrix} 2e^{4t} + 4e^{-2t} & e^{4t} - e^{-2t} \\ 8e^{4t} - 8e^{-2t} & 4e^{4t} + 2e^{-2t} \end{bmatrix}$, (iii) $\begin{bmatrix} e^{3t} - 3te^{3t} & te^{3t} \\ -9te^{3t} & e^{3t} + 3te^{3t} \end{bmatrix}$

Exercise 8.7.
1. $\alpha_1 > -a$; $\alpha_2 < 1 - \dfrac{3}{a + \alpha_1}$
3. (i) $\lambda = -2, -2, -1$; stable. (ii) $\lambda = -1, -2, -3$; stable.
5. (i) $\lambda = s_2^{2}$; bounded. (ii) $\lambda = -4$; stable.

Exercise 9.3.
1. (i) $\dfrac14\begin{bmatrix} 4e^{-t} & e^{t} - e^{-3t} & 3e^{t} - 4e^{-t} + e^{-3t} \\ 0 & e^{t} + 3e^{-3t} & 3e^{t} - 3e^{-3t} \\ 0 & e^{t} - e^{-3t} & 3e^{t} + e^{-3t} \end{bmatrix}$, (ii) $\begin{bmatrix} e^{2t} - 2te^{2t} & te^{2t} \\ -4te^{2t} & (1 + 2t)e^{2t} \end{bmatrix}$
2. (i) $\begin{bmatrix} -1 + 2\cos t + \sin t & -4 + 4\cos t + 2\sin t & -7 + 7\cos t + \sin t \\ \cos t - 2\sin t & 2\cos t - 4\sin t & \cos t - 7\sin t \\ \sin t & 2\sin t & \cos t + 3\sin t \end{bmatrix}$, (ii) $\begin{bmatrix} e^{3t} & 6te^{3t} \\ 0 & e^{3t} \end{bmatrix}$
3. (i) $\begin{bmatrix} t + 1 & 2e^{t} - e^{2t} - 1 \\ 0 & e^{3t} \end{bmatrix}$, (ii) $\begin{bmatrix} e^{t} & e^{t} - t - 1 \\ 2e^{-t} & 2(t + 1 - e^{-t}) \end{bmatrix}$

Exercise 9.4.
1. $\dfrac19\begin{bmatrix} -3e^{-t} + 9e^{2t} + 3e^{8t} & 3e^{-t} - 9e^{2t} + 6e^{8t} \\ -12e^{-t} + 9e^{2t} + 3e^{8t} & 12e^{-t} - 9e^{2t} + 6e^{8t} \end{bmatrix}$
2. (i) $\lambda = 2, -4$ and $\mu = 1, 2$, with eigenvectors $(1, 1)^{T}$, $(-1, 2)^{T}$ and $(1, 0)^{T}$, $(1, 1)^{T}$ respectively; (ii) $\lambda = 2, -4, 4, -8$, with eigenvector entries $1$, $2$, $0$, $0$, $-1$, $1$, $4$, $2$, $2$, $0$; (iii) the $4\times 4$ matrix with entries $\tfrac{e^{-8t} + 2e^{4t}}{3}$, $\tfrac{-2e^{2t} - e^{-4t} + 2e^{4t} + e^{-8t}}{3}$, $\tfrac{-e^{-2t} - 2e^{-4t} + e^{4t} + 2e^{-8t}}{3}$, $\tfrac{-4e^{-4t} + 4e^{2t}}{3}$, $\tfrac{-4e^{2t} + 4e^{-4t} + 4e^{4t} - 4e^{-8t}}{3}$, $\tfrac{-4e^{-8t} + 4e^{4t}}{3}$, $\tfrac{e^{4t} + 2e^{-8t}}{3}$, $\tfrac{-e^{-4t} + e^{2t}}{6}$, $\tfrac{e^{2t} + e^{-4t} + e^{4t} - e^{-8t}}{6}$, $\tfrac{e^{4t} - e^{-8t}}{6}$ and zeros; (iv) $I_4$; (v) same as (iii).
3. (a), (b) $4\times 4$ matrices with entries $\tfrac{6e^{2t} - e^{4t} - e^{-4t}}{12}$, $\tfrac{e^{4t} - e^{-8t}}{6}$, $\tfrac{2e^{-4t} + e^{2t}}{3}$, $\tfrac{e^{4t} + 2e^{-8t}}{3}$, $\tfrac{-4e^{-4t} + 4e^{2t}}{3}$, $\tfrac{12e^{2t} + 6e^{-4t} + 10e^{4t} - 7e^{-8t} - 5}{48}$, $\tfrac{10e^{4t} - 7e^{-8t} - 3}{48}$, $\tfrac{24e^{2t} - 15e^{-4t} - 9}{12}$, $\tfrac{-6e^{2t} - 10e^{-4t} + 5e^{4t} + 7e^{-8t}}{12}$, $\tfrac{10e^{4t} + 15e^{-8t} - 1}{24}$ and zeros.
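Closed-form matrix exponentials such as those in Exercises 8.6 and 9.3 are easy to validate against scipy.linalg.expm. In the sketch below the coefficient matrix A is inferred from the displayed answer to Exercise 8.6, problem 3(i) (it is the derivative of that matrix at t = 0), so it is an assumption rather than a quotation of the exercise data.

```python
import numpy as np
from scipy.linalg import expm

# Coefficient matrix inferred from the closed-form answer to Ex. 8.6, 3(i):
#   e^{At} = (1/5) [[e^{-4t} + 4e^t, -e^{-4t} + e^t],
#                   [-4e^{-4t} + 4e^t, 4e^{-4t} + e^t]]
A = np.array([[0.0, 1.0],
              [4.0, -3.0]])

def phi(t):
    return (np.array([[np.exp(-4*t) + 4*np.exp(t), -np.exp(-4*t) + np.exp(t)],
                      [-4*np.exp(-4*t) + 4*np.exp(t), 4*np.exp(-4*t) + np.exp(t)]])
            / 5.0)

for t in (0.0, 0.5, 1.0):
    print(np.allclose(expm(A * t), phi(t)))   # -> True for each sample t
```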

Exercise 9.5.
1. $\begin{bmatrix} -e^{2t} - e^{3t} & 1 - e^{t} \\ e^{3t} & e^{t} \end{bmatrix}$
2. $-\tfrac{t}{2} + \tfrac14 - \tfrac54 e^{2t} - e^{3t}$; $e^{3t}$
3. $\begin{bmatrix} e^{t} & -e^{3t} \\ 0 & -2e^{2t} \end{bmatrix}\begin{bmatrix} c_1 & c_3 \\ c_2 & c_4 \end{bmatrix}\begin{bmatrix} e^{t} & e^{2t} \\ 0 & e^{3t} \end{bmatrix}$

Exercise 9.6.
1. (i) $e^{At}\,[C_0 + C_1 t + C_2 t^{2}]\,e^{Bt}$, (ii) $C_0 + e^{4At}C_1 e^{4Bt} + e^{-3At}C_3 e^{-3Bt}$
2. $e^{At}\begin{bmatrix} -t & -2t \\ 2 - t & 2 - 2t \end{bmatrix}e^{Bt}$; $2(1 - e^{t})$, $2e^{t} - 1$

Exercise 9.8.
1. $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$, $\dfrac12\begin{bmatrix} 1 & 3 & 0 \\ 0 & 2 & 6 \\ -2 & -1 & -3 \end{bmatrix}$, $\dfrac12\begin{bmatrix} 3 & 3 & 9 \\ 5 & -1 & -1 \\ 0 & 9 & -3 \end{bmatrix}$, $I_3$, $\dfrac12\begin{bmatrix} 1 & 3 & 0 \\ 0 & 2 & 6 \\ -2 & -1 & -3 \end{bmatrix}$

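For the linear matrix differential equation X' = AXB treated in Exercises 9.4 and 9.5, vectorising column by column gives d/dt vec(X) = (Bᵀ ⊗ A) vec(X), so vec(X(t)) = e^{(Bᵀ⊗A)t} vec(X₀). The sketch below checks this identity numerically on illustrative matrices A, B and X₀, not on the data of the exercises.

```python
import numpy as np
from scipy.linalg import expm
from scipy.integrate import solve_ivp

# Illustrative data (assumed, not the exercise's): X' = A X B, X(0) = X0.
A = np.array([[1.0, 2.0],
              [0.0, -1.0]])
B = np.array([[0.0, 1.0],
              [1.0, 0.0]])
X0 = np.eye(2)

def x_kron(t):
    # Closed form via the Kronecker/vec identity (column-major vec).
    v = expm(np.kron(B.T, A) * t) @ X0.flatten(order="F")
    return v.reshape((2, 2), order="F")

def rhs(t, y):
    # Direct integration of the matrix equation for a cross-check.
    X = y.reshape((2, 2), order="F")
    return (A @ X @ B).flatten(order="F")

sol = solve_ivp(rhs, (0.0, 1.0), X0.flatten(order="F"), rtol=1e-10, atol=1e-12)
print(np.allclose(sol.y[:, -1].reshape((2, 2), order="F"), x_kron(1.0), atol=1e-6))
# -> True
```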
Index

Absolute error, 132
adjoint matrix, 36
affine transformation, 111
algebra of matrices, 13
algebraic multiplicity, 77
analytical function, 256
anti-circulant, 329
Augmented matrix, 66
basis, 102
big data, 220
bilinear form, 90
bioinformatics, 211
boundary value problems, 364
canonical form, 92
Cayley–Hamilton theorem, 72
chain rule, matrix derivative of, 303
characteristic equation, 77, 338
characteristic polynomial, 77
Cholesky method, 147
circulant matrix, 329
closed model, 190
cofactor, 29
complete pivoting, 137
computational complexity, 148
computer graphics, 199
condition number, 134
conic section, 182
conjugate matrix, 21
conjugate transpose, 21
contraction, 5
coupled differential equations, 312
Cramer's rule, 70
cross product, 8
Crout's method, 143
cryptography, 196
decomposition method, 142
determinant, 29
determinant, derivative of, 278
diagonal matrix, 12
diagonalizable, 83
differential, matrix of, 295
dilation, 5
dimension, 103
direct sum, 99
Doolittle's method, 143
dot product, 7
eigen-value, 75
eigen-values, 330
eigen-vector, 75
eigen-vectors, 330
elementary matrix, 27
elementary row, 26
equivalent matrices, 28
Euclidean distance, 6
factorization method, 142
floating-point, 131
fundamental matrix, 315
Gauss elimination method, 66, 137
Gauss-Jacobi method, 149
Gauss-Jordan method, 39, 67, 140
Gauss-Seidel method, 151
geometric multiplicity, 77
Gershgorin circles, 173
Gershgorin method, 172
gradient matrix, 285
Gram–Schmidt, 125
Hermitian matrix, 23
Hill method, 197
homogeneous, 312, 347
homogeneous coordinates, 199, 209
Householder transformation, 161
idempotent matrix, 24
Identity matrix, 12
ill-conditioned problem, 133
indefinite, 94
index, 93
inner product, 121
integral equation, 322
inverse, 35
involutory matrix, 24
Jacobian, 280
Jordan block, 87
Jordan Canonical form, 87
kinematics problem, 206
Kronecker product, 225, 233
Kronecker sum, 262
Krylov space solver, 179
Krylov subspace, 179
Laplace's rule, 29
leading diagonal, 12
left inverse, 34
Leontief, 190
linear combination, 5, 101
linear equation, 53
linear matrix DE, 347
linear span, 101
linear system, 54
linear transformation, 109
linearly dependent, 6, 102
linearly independent, 6, 101
Lyapunov Equation, 268
m-digit arithmetic, 132
mantissa, 132
Markov Chain, 183
Markov process, 183
matrix, 10
matrix derivative of, 298
matrix derivative w.r.t entry, 288
matrix exponential, 328
matrix, derivative of, 275, 308
matrix, integral of, 275
matrix, product rule, 289
minimal polynomial, 113
minor, 29
modal matrix, 83
modular inverse, 49, 198
Moore-Penrose inverse, 48
n-vector, derivative of, 274
n-vector, integral of, 274
negative definite, 94
negative semi definite, 94
nilpotent matrix, 24
nonhomogeneous, 66, 312
nonsingular matrix, 36
normal form, 61
open model, 192
ordered basis, 106
orthogonal, 122
orthogonal complement, 123
orthogonal matrix, 23
orthogonalization, 125
orthonormal basis, 123
partial pivoting, 137
partitioned, 43
partitioned matrix, 298
PCA, 215, 221
permutation matrix, 156, 251
perturbed system, 312
pivot, 136
positive definite, 94
positive semi definite, 94
Power method, 175
power series, 128
primary matrices, 225
Probability vector, 184
projection, 8, 202
projection vector, 8
quadratic form, 90
rank, 56
reduced row echelon form, 59
Relative error, 132
right inverse, 34
robot, 204
rotation, 115, 167, 201, 207
Rouché's theorem, 68
Routh-Hurwitz, 338
row echelon form, 59
RSA, 196
scalar, 2
Scaling, 200
scoring matrix, 211
signature, 93
singular matrix, 36
singular value, 88
singular value decomposition, 124
singular vectors, 88
skew Hermitian matrix, 23
skew-symmetric matrix, 22
stable, 337
Stochastic matrix, 184
Stochastic process, 183
submatrix, 29
subspace, 98
Successive approximations, 320
successive over relaxation, 157
Sylvester's Law of Inertia, 94
symmetric matrix, 22
trace, 12
Transition probability, 184
Transition matrix, 184
translation, 199
transpose, 20
triangular matrix, 12
tridiagonalization, 167
unit vector, 6
Unitary matrix, 24
variation of parameters, 325, 357
vector, 2
vector operator, 232
vector space, 97
vector, chain rule, 282
vector, derivative of, 280
vector-matrix, 312
well-conditioned problem, 133
zero matrix, 12