A Portrait of Linear Algebra [4 ed.]
 9781792430503

Table of contents :
Table of Contents
Chapter Zero
Part I: Set Theory and Basic Logic
Part II: Proofs
Summary
Exercises
Chapter One
1.1 The Main Subject: Euclidean Spaces
Summary
Exercises
1.2 The Span of a Set of Vectors
Summary
Exercises
1.3 Euclidean Geometry
Summary
Exercises
1.4 Systems of Linear Equations
Summary
Exercises
1.5 The Gauss-Jordan Algorithm
Summary
Exercises
1.6 Types of Linear Systems
Summary
Exercises
Chapter Two
2.1 Linear Dependence and Independence
Summary
Exercises
2.2 Introduction to Subspaces
Summary
Exercises
2.3 The Fundamental Matrix Spaces
Summary
Exercises
2.4 The Dot Product and Orthogonality
Summary
Exercises
2.5 Orthogonal Complements
Summary
Exercises
2.6 Full-Rank Systems and Dependent Systems
Summary
Exercises
Chapter Three
3.1 Mapping Spaces: Introduction to Linear Transformations
Summary
Exercises
3.2 Rotations, Projections, and Reflections
Summary
Exercises
3.3 Operations on Linear Transformations and Matrices
Summary
Exercises
3.4 Properties of Operations on Linear Transformations and Matrices
Summary
Exercises
3.5 The Kernel and Range; One-to-One and Onto Transformations
Summary
Exercises
3.6 Invertible Operators and Matrices
Summary
Exercises
3.7 Finding the Inverse of a Matrix
Summary
Exercises
3.8 Conditions for Invertibility
Summary
Exercises
Chapter Four
4.1 Axioms for a Vector Space
Summary
Exercises
4.2 Linearity Properties for Finite Sets of Vectors
Summary
Exercises
4.3 A Primer on Infinite Sets
Summary
Exercises
4.4 Linearity Properties for Infinite Sets of Vectors
Summary
Exercises
4.5 Subspaces, Basis and Dimension
Summary
Exercises
4.6 Diagonal, Triangular, and Symmetric Matrices
Summary
Exercises
Chapter Five
5.1 Introduction to General Linear Transformations
Summary
Exercises
5.2 Coordinate Vectors and Matrices for Linear Transformations
Summary
Exercises
5.3 One-to-One and Onto Linear Transformations; Compositions of Linear Transformations
Summary
Exercises
5.4 Isomorphisms
Summary
Exercises
Chapter Six
6.1 The Join and Intersection of Two Subspaces
Summary
Exercises
6.2 Restricting Linear Transformations and the Role of the Rowspace
Summary
Exercises
6.3 The Image and Preimage of Subspaces
Summary
Exercises
6.4 Cosets and Quotient Spaces
Summary
Exercises
6.5 The Three Isomorphism Theorems
Summary
Exercises
Chapter Seven
7.1 Permutations and The Determinant Concept
Summary
Exercises
7.2 A General Determinant Formula
Summary
Exercises
7.3 Properties of Determinants and Cofactor Expansion
Summary
Exercises
7.4 The Adjugate Matrix and Cramer's Rule
Summary
Exercises
7.5 The Wronskian
Summary
Exercises
Chapter Eight
8.1 The Eigentheory of Square Matrices
Summary
Exercises
8.2 The Geometry of Eigentheory and Computational Techniques
Summary
Exercises
8.3 Diagonalization of Square Matrices
Summary
Exercises
8.4 Change of Basis and Linear Transformations on Euclidean Spaces
Summary
Exercises
8.5 Change of Basis for Abstract Spaces and Determinants for Operators
Summary
Exercises
8.6 Similarity and The Eigentheory of Operators
Summary
Exercises
8.7 The Exponential of a Matrix
Summary
Exercises
Chapter Nine
9.1 Axioms for an Inner Product Space
Summary
Exercises
9.2 Geometric Constructions in Inner Product Spaces
Summary
Exercises
9.3 Orthonormal Sets and The Gram-Schmidt Algorithm
Summary
Exercises
9.4 Orthogonal Complements and Decompositions
Summary
Exercises
9.5 Orthonormal Bases and Projection Operators
Summary
Exercises
9.6 Orthogonal Matrices
Key Concepts
Exercises
9.7 Orthogonal Diagonalization of Symmetric Matrices
Summary
Exercises
Chapter Ten
10.1 The Field of Complex Numbers
Summary
Exercises
10.2 Complex Vector Spaces
Summary
Exercises
10.3 Complex Inner Products
Summary
Exercises
10.4 Complex Linear Transformations and The Adjoint
Summary
Exercises
10.5 Normal Matrices
Key Concepts
Exercises
10.6 Schur's Lemma and the Spectral Theorems
Summary
Exercises
10.7 Simultaneous Diagonalization
Summary
Exercises
Glossary of Symbols
Subject Index



A Portrait of Linear Algebra
Fourth Edition

Jude Thaddeus Socrates Pasadena City College

Kendall Hunt Publishing Company

Jude Thaddeus Socrates and A Portrait of Linear Algebra are on Facebook.

Please visit, and Like our page! To order the print or e-book version of this book, go to: https://he.kendallhunt.com/product/portrait-linear-algebra. You can also download a free copy of the Answer Key to the Exercises on this website, among other goodies.

Cover: The Walt Disney Concert Hall in Los Angeles, home of the Los Angeles Philharmonic Orchestra; Frank Gehry, Architect. Copyright: Shutterstock.com.

Kendall Hunt Publishing Company

www.kendallhunt.com
Send all inquiries to:
4050 Westmark Drive
Dubuque, IA 52004-1840

Copyright © 2011, 2013, 2016, 2020 by Jude Thaddeus Socrates

ISBN 978-1-7924-3050-3

Kendall Hunt Publishing Company has the exclusive rights to reproduce this work, to prepare derivative works from this work, to publicly distribute this work, to publicly perform this work and to publicly display this work. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the copyright owner.

Published in the United States of America

Table of Contents

Chapter Zero. The Language of Mathematics: Sets, Axioms, Theorems, and Proofs    1

Chapter One. The Canvas of Linear Algebra: Euclidean Spaces and Systems of Linear Equations    27
1.1 The Main Subject: Euclidean Spaces    28
1.2 The Span of a Set of Vectors    45
1.3 Euclidean Geometry    57
1.4 Systems of Linear Equations    71
1.5 The Gauss-Jordan Algorithm    82
1.6 Types of Linear Systems    96

Chapter Two. Peeling the Onion: Subspaces of Euclidean Spaces    111
2.1 Linear Dependence and Independence    112
2.2 Introduction to Subspaces    124
2.3 The Fundamental Matrix Spaces    139
2.4 The Dot Product and Orthogonality    154
2.5 Orthogonal Complements    169
2.6 Full-Rank Systems and Dependent Systems    183

Chapter Three. Adding Movement and Colors: Linear Transformations on Euclidean Spaces    199
3.1 Mapping Spaces: Introduction to Linear Transformations    200
3.2 Rotations, Projections and Reflections    214
3.3 Operations on Linear Transformations and Matrices    231
3.4 Properties of Operations on Linear Transformations and Matrices    245
3.5 The Kernel and Range; One-to-One and Onto Transformations    259
3.6 Invertible Operators and Matrices    276
3.7 Finding the Inverse of a Matrix    289
3.8 Conditions for Invertibility    299

Chapter Four. From The Real to The Abstract: General Vector Spaces    311
4.1 Axioms for a Vector Space    312
4.2 Linearity Properties for Finite Sets of Vectors    328
4.3 A Primer on Infinite Sets    340
4.4 Linearity Properties for Infinite Sets of Vectors    352
4.5 Subspaces, Basis, and Dimension    362
4.6 Diagonal, Triangular, and Symmetric Matrices    379

Chapter Five. Movement in the Abstract: Linear Transformations of General Vector Spaces    395
5.1 Introduction to General Linear Transformations    396
5.2 Coordinate Vectors and Matrices for Linear Transformations    410
5.3 One-to-One and Onto Linear Transformations; Compositions of Linear Transformations    429
5.4 Isomorphisms    450

Chapter Six. Operations on Subspaces: The Isomorphism Theorems    467
6.1 The Join and Intersection of Two Subspaces    468
6.2 Restricting Linear Transformations and the Role of the Rowspace    478
6.3 The Image and Preimage of Subspaces    488
6.4 Cosets and Quotient Spaces    499
6.5 The Three Isomorphism Theorems of Emmy Noether    509

Chapter Seven. From Square to Scalar: Permutation Theory and Determinants    523
7.1 Permutations and The Determinant Concept    524
7.2 A General Determinant Formula    538
7.3 Computational Tools and Properties of Determinants    555
7.4 The Adjugate Matrix and Cramer's Rule    568
7.5 The Wronskian    578

Chapter Eight. Painting the Lines: Eigentheory, Diagonalization, and Similarity    583
8.1 The Eigentheory of Square Matrices    584
8.2 Computational Techniques for Eigentheory    596
8.3 Diagonalization of Square Matrices    610
8.4 Change of Basis and Linear Transformations on Euclidean Spaces    625
8.5 Change of Basis for Abstract Spaces and Determinants for Operators    637
8.6 Similarity and The Eigentheory of Operators    646
8.7 The Exponential of a Matrix    659

Chapter Nine. Geometry in the Abstract: Inner Product Spaces    663
9.1 Axioms for an Inner Product Space    664
9.2 Geometric Constructions in Inner Product Spaces    676
9.3 Orthonormal Sets and The Gram-Schmidt Algorithm    687
9.4 Orthogonal Complements and Decompositions    703
9.5 Orthonormal Bases and Projection Operators    716
9.6 Orthogonal Matrices    730
9.7 Orthogonal Diagonalization of Symmetric Matrices    743

Chapter Ten. Imagine That: Complex Spaces and The Spectral Theorems    749
10.1 The Field of Complex Numbers    750
10.2 Complex Vector Spaces    766
10.3 Complex Inner Products    776
10.4 Complex Linear Transformations and The Adjoint    785
10.5 Normal Matrices    796
10.6 Schur's Lemma and The Spectral Theorems    810
10.7 Simultaneous Diagonalization    821

Glossary of Symbols    839
Subject Index    844

The Answer Key to the Exercises is available as a free download at:
https://he.kendallhunt.com/product/portrait-linear-algebra

Preface to the 4th Edition

Over the last four years, I taught Linear Algebra out of the 3rd edition of A Portrait of Linear Algebra almost every semester, and during the Winter and Summer intercessions at Pasadena City College. I learned a lot from my interactions with all of these students, and with the colleagues who taught out of my book. I incorporated many improvements that they suggested over these years, and I was looking forward to teaching out of the 4th edition in the Fall Semester of 2020.

And then COVID-19 happened. I last saw my students in person on March 11. As luck would have it, my College had planned a professional development day for March 12, months in advance, which my colleagues and I used to prepare for teaching remotely. It was like being told that the Titanic is sinking, here's a lifeboat (a Zoom account), jump in, and good luck. Although I would not call this situation ideal, it gave me an opportunity to think about how to explain concepts clearly and to engage my students even if they are miles away from me. Just this week, our College also decided that most classes, including all Math classes, will be taught remotely in the Fall. Many colleges and universities across the country will do the same. While it saddens me to think that I will not see my students' faces in person when we launch the 4th edition in August, I am thankful for the opportunity to keep them healthy and safe while still providing for their education.

The 4th edition would not exist without the ideas that came from many conversations with my colleagues at PCC. I want to thank all those who have taught out of Portrait over the years: John Sepikas, Lyman Chaffee, Christopher Strinden, Patricia Michel, Asher Shamam, Richard Abdelkerim, Mark Pavitch, David Matthews, Erland Weydahl, Guoqiang Song, Leif Hopkins, Jorge Basilio, Robert Mardirosian, Thomas Kowalski, and Leonid Piterbarg. A special thank you to Daniel Gallup, who taught out of the very first incarnation of this book in 2008, is currently teaching Linear Algebra out of the 3rd edition to high school seniors, and became a published author himself this year.

I thank all the students who learned Linear Algebra from this book, and gave me the motivation and inspiration to keep improving the text and coming up with challenging problems. Thank you to Beverly Kraus and Taylor Knuckey of Kendall Hunt for their valuable assistance in bringing the 4th edition to existence.

Thank you to my husband, my best friend, and biggest supporter, Juan Sanchez-Diaz. Gracias, Papi. Thank you to our standard poodle Johannes, for being my constant companion especially now that I'm home all day, and for giving me an excuse to take a long walk around the neighborhood and get some exercise in the backyard until it's safe to go back to the gym. To all the members of the Socrates and Sanchez families all over the planet, maraming salamat, y muchas gracias, for all your love and support. Thanks to all my colleagues at PCC, my friends on Facebook, and my barkada, for their camaraderie and encouragement. Thank you to my late parents, Dr. Jose Socrates and Dr. Nenita Socrates, for teaching me and all their children the love for learning. And to our Lord, for showering my life with His blessings.

Jude Thaddeus Socrates
Professor of Mathematics
Pasadena City College, California
May, 2020

What's New: How The 4th Edition is Organized

The main text from the 3rd edition has been reorganized into 11 chapters. Some sections were split in two in order to give the student an opportunity to catch their breath and try out some problems before proceeding to the next topic. Some sections were relocated in order to improve the flow of the material. The final chapter of the 3rd edition has been moved on-line, along with a couple of sections from earlier chapters. These are all available as free downloads to the public. The free material includes applications of Linear Algebra, and a development of the Singular Value Decomposition and its twin, the Fundamental Theorem of Linear Algebra.

All Theorems are now numbered, with major Theorems given full names as in previous editions. At the end of each Section, you will find a Summary containing the definitions and the Theorems, usually on a single page. The Exercises have been reorganized by type, and to more clearly show which problems are related to each other.

Chapter Zero provides an introduction to sets and set operations, logic, the field axioms for real numbers, and common proof techniques, emphasizing theorems that can be derived from the field axioms. This brief introductory chapter will prepare the student to learn how to read, understand and write basic proofs.

We base our development of the main concepts of Linear Algebra on the following definition: Linear Algebra is the study of vector spaces, their structure, and the linear transformations that map one vector space to another.

Chapter 1 rigorously constructs the archetype vector spaces: the Euclidean spaces ℝⁿ, the Span of a set of vectors, systems of linear equations, and the Gauss-Jordan Algorithm, the central tool of Linear Algebra. New to the 4th edition: a rigorous development of Euclidean geometry as it relates to Linear Algebra, motivating the Gauss-Jordan Algorithm, and proving its effectiveness and integrity.

Chapter 2 introduces the concepts of linear independence, subspaces, the four fundamental matrix spaces: rowspace, columnspace and nullspace for a matrix and its transpose, and algorithms to find a basis for each space, and see their relevance when solving systems of equations. We use the dot product as a way to generalize the concepts of angles and orthogonality in higher dimensions, and to construct and find a basis for the orthogonal complement of a subspace. New to the 4th edition: how to determine which equations can be eliminated from a dependent linear system while maintaining the original solution set.

Chapter 3 introduces linear transformations on Euclidean spaces as encoded by matrices. We will explore the geometric significance of a linear transformation, and see how a linear transformation determines special subspaces, namely the kernel and the range of the transformation. These spaces help us to investigate one-to-one and onto properties of transformations. We will define basic matrix operations, and an algorithm to find the inverse of a matrix, when this exists. New to the 4th edition: we develop the concept of invertibility for functions from one arbitrary set to another, and use it to construct the inverse of an invertible operator.

Chapter 4 generalizes the concepts from Chapters 1 and 2 to construct abstract vector spaces and prove their analogous properties. We focus most of our examples on function spaces (in particular, polynomial spaces), and special families of matrices. New to the 4th edition: a separate section has been created to introduce infinite sets and their cardinalities.

Chapter 5 generalizes the concepts from Chapter 3 to linear transformations from an arbitrary vector space to another. We will see that in the finite-dimensional case, a linear transformation can be encoded by a matrix as well. By focusing on function spaces preserved by the derivative operator, the strong relationship between Linear Algebra and Differential Equations is firmly established.

Chapter 6 investigates the subspace structure of vector spaces. We will see techniques to fully describe the join and intersection of two subspaces, the image or preimage of a subspace, and the restriction of a linear transformation to a subspace. We will create cosets and quotient spaces, and see one of the fundamental triptychs of modern mathematics: the Isomorphism Theorems of Emmy Noether as applied to vector spaces.

Chapter 7 explores the determinant function, its properties, especially its relationship to invertibility, and efficient algorithms to compute it. We will see Cramer's rule, a technique to solve invertible square systems of equations, albeit not a very practical one.

Chapter 8 introduces the eigentheory of operators both on Euclidean spaces as well as abstract vector spaces. We will see when it is possible to encode operators into the simplest possible form, that is, to diagonalize them. We will study the concept of similarity and its consequences.

Chapter 9 generalizes geometry on a vector space through an inner product. This allows us to introduce the concepts of norm and orthogonality in abstract spaces. We will explore orthonormal bases, the Gram-Schmidt Algorithm, orthogonal matrices, and the orthogonal diagonalization of symmetric matrices.


Chapter 10 applies the constructions thus far to vector spaces over arbitrary fields, especially the field of complex numbers. The main goal of this chapter is to prove the Spectral Theorem of Normal Matrices. One specific case of this Theorem tells us that symmetric matrices can indeed be diagonalized by orthogonal matrices. We also see that commuting diagonalizable matrices can be simultaneously diagonalized by the same invertible matrix, and present an algorithm to find a common diagonalizing matrix.

Special Topics and Mini-Projects

Scattered in the Exercises are multi-step problems that guide the student through various topics that probe deeper into Linear Algebra and its connections with Geometry, Calculus, Differential Equations, Set Theory, Group Theory and Number Theory.

The Medians of a Triangle: a coordinate-free proof that the three medians of any triangle intersect at a common point which is 2/3 the distance from any vertex to the opposite midpoint (Section 1.1).

The Cross Product: used to create a vector orthogonal to two vectors in ℝ³, and proves its other properties using the properties of the 3 × 3 determinant (Sections 2.4, 7.1 and 7.2).

The Uniqueness of the Reduced Row Echelon Form: uses the concepts of the rowspace of a matrix to prove that the rref of any matrix is unique (Section 2.3).

Drawing Three-Dimensional Objects: applies the concept of a projection in order to show how to draw the edges of a 3-dimensional object as perceived from any given direction (Section 3.2).

The Center of the Ring of Square Matrices: uses basic matrix products to show that the only n × n matrices that commute with all n × n matrices are the multiples of the identity matrix (Section 3.4).

The Kernel and Range of a Composition: proves that the kernel of a composition T₂ ∘ T₁ contains the kernel of T₁, and analogously, the range of T₂ ∘ T₁ is contained in the range of T₂ (Section 3.5 for Euclidean Spaces and Section 5.3 for arbitrary vector spaces).

The Direct Sum of Matrices: explores the properties of matrices in block-diagonal form (Sections 3.8, 4.6, 7.3, 8.1, 9.6, and 10.7).

The Chinese Remainder Theorem: introduced and applied to construct invertible 2 × 2 integer matrices whose inverses also have integer entries (Section 3.8).

Cantor's Diagonal Argument: proves that the set of rational numbers is countable by showing how to list its elements in a sequence (Section 4.3).

The Countability of Subintervals of the Set of Real Numbers: gives a guided proof that all subintervals of ℝ that contain at least two points have the same cardinality as ℝ, by explicitly constructing bijections among these subintervals (Section 4.3).

Bisymmetric Matrices: explores the properties and dimensions of this unusual and interesting family of square matrices (Section 4.6).

The Centralizer of a Matrix: proves that the set of matrices that commute with a given square matrix forms a vector space, and finds a basis for it in the 2 × 2 case (Sections 4.5 and 10.7).

Vector Spaces of Infinite Series: proves that the set of absolutely convergent series forms a subspace of the space of all infinite series, whereas conditionally convergent and divergent series are not closed under addition (Section 4.5). We also see a natural inner product which is well-defined on absolutely convergent series but fails for conditionally convergent series (Section 9.1).

Casting Shadows: shows that the shadow on the floor of an image on a window pane is an example of a linear transformation (Section 5.2).

The Vandermonde Determinant: applies row and column operations and cofactor expansions to find a closed formula for the Vandermonde Determinant, and applies it to some Wronskian determinants, proving that certain infinite subsets of function spaces are linearly independent (Sections 7.3 and 7.5).

The Special Linear Group of Integer Matrices: introduces the concept of a group, and proves that the set of all n × n matrices with integer entries and determinant 1 forms a group under matrix multiplication. This project also proves that SL₂(ℤ) is generated by two special matrices (Section 7.3).

Invertible Triangular Matrices: uses Cramer's rule to prove that the inverse of an invertible upper triangular matrix is again upper triangular, and analogously for lower triangular matrices (Section 7.4).

Eigenspaces of Matrices Related to Rotation Matrices: although a rotation matrix itself does not have real eigenvalues unless the rotation is by 0 or π radians, performing the reflection across the x-axis followed by a rotation matrix always leads to real eigenvalues, and a basis for the eigenspaces that involves the half-angle formula (Section 8.1).

Properties Preserved by Similarity: proves that similar matrices share attributes such as determinants, invertibility, arithmetic and geometric multiplicities, and diagonalizability (Section 8.6).

Introduction to Fourier Series: shows that the infinite family of trigonometric functions {sin(nx), cos(nx) | n ∈ ℕ} are mutually orthogonal under the inner product defined using the integral of their product over [0, 2π] (Section 9.3).

De Morgan's Laws for Subspaces: proves that (V ∩ W)⊥ = V⊥ ∨ W⊥ and (V ∨ W)⊥ = V⊥ ∩ W⊥, connecting the ideas of the intersection and join of two subspaces with their orthogonal complements (Section 9.4).

Matrix Decompositions: shows that any square matrix can be decomposed uniquely as the sum of a symmetric and a skew-symmetric matrix, and that the spaces of symmetric and skew-symmetric matrices are orthogonal complements of each other under a naturally defined inner product on all square matrices (Section 9.5).

Idempotent Matrices - New to the 4th Edition: shows that a matrix is symmetric and idempotent if and only if it is the projection matrix onto its columnspace (Section 9.5).

Right-Handed versus Left-Handed Orthonormal Bases: uses the cross product to define and create right-handed orthonormal bases for ℝ³, and relates the concepts of right-handed versus left-handed orthonormal bases to proper versus improper orthogonal matrices (Section 9.6).

Rotations in Space: explicitly constructs the matrix of the counterclockwise rotation by an angle θ about a fixed unit normal vector n in ℝ³ by elegantly connecting this operator with the concepts of a right-handed coordinate system, orthogonal matrices, and the change of basis formula (Section 9.6).

Finite Fields: introduces finite fields by constructing the addition and multiplication tables for the finite fields ℤ/(5) and ℤ/(7) (Section 10.1).

The Pauli Matrices: an introduction to normal matrices that are important in Quantum Mechanics (Section 10.6).

A Note on Technology

The calculations encountered in modern Linear Algebra would be all but impossible to perform in practice, especially on large matrices, without the advent of the computer. It would be tedious to perform calculations on these large matrices by hand. However, we encourage the student to learn the algorithms and computations first, by practicing on the homework problems by hand (with the help of a scientific calculator, at best), before using technology to perform these computations. It is easy to find free and downloadable software or apps by typing "Linear Algebra Calculator" in a search engine. The following computations and algorithms are relevant for this book:

• Matrix Arithmetic: Addition, Multiplication, Inverse, Transpose, Determinant;
• The Gauss-Jordan Algorithm and the Reduced Row Echelon Form or rref;
• Finding a basis for the Rowspace, Columnspace and Nullspace of a Matrix;
• Characteristic Polynomials, Eigenvalues and Bases for Eigenspaces.

Some graphing calculators also provide many of these routines. We leave it to the instructor to decide whether or not these will be allowed or required in the classroom, homework, or examinations.
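As one illustration (not from the text, and not an endorsement of any particular tool), the widely available Python libraries NumPy and SymPy can carry out several of the computations listed above; the snippet below is a minimal sketch using only standard library calls:

# Illustrative sketch: a few of the computations listed above, done in Python.
import numpy as np
import sympy as sp

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

print(A @ A)               # matrix multiplication
print(np.linalg.inv(A))    # inverse
print(A.T)                 # transpose
print(np.linalg.det(A))    # determinant
print(np.linalg.eig(A))    # eigenvalues and eigenvectors

# The Gauss-Jordan Algorithm / reduced row echelon form (rref), via SymPy:
M = sp.Matrix([[1, 2, 3],
               [2, 4, 7]])
print(M.rref())            # returns (rref of M, indices of the pivot columns)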

To the Student

This book was written for you. On every topic, on every page, I always wrote with one thought in mind: how would I explain this to one of my students if they were standing in front of me right now? I hope you enjoy reading this book, trying out the problems, and seeing the beauty of Linear Algebra. It will be a challenging experience, because nothing worth doing in life comes without effort and difficulty, to paraphrase Theodore Roosevelt.

I also hope that you will consider this book as an invitation to open the door to Mathematics a little bit wider, step inside, and explore this glorious subject.


Chapter Zero

The Language of Mathematics:

Sets, Axioms, Theorems, and Proofs

Mathematics is a language, and Logic is its grammar. You are taking a course in Linear Algebra because the major that you have chosen will make use of its techniques, both computational and theoretical, at some points in your career. Whether it is in engineering, computer science, chemistry, physics, economics, or of course, mathematics, you will encounter matrices, vector spaces and linear transformations. For most of you, this will be your first experience in an abstract course that emphasizes theory on an almost equal footing with computation. The purpose of this introductory Chapter is to familiarize you with the basic components of the language of mathematics, in particular, the study of sets (especially sets of numbers), subsets, operations on sets, logic, Axioms, Theorems, and basic guidelines on how to write a coherent and logically correct Proof for a Theorem.

Part I: Set Theory and Basic Logic

The set is the most basic object that we work with in mathematics:

Definition: A set is an unordered collection of objects, called the elements of the set. A set can be described using the set-builder notation:

X = {x | x possesses certain determinable qualities},

or the roster method:

X = {a, b, c, ...},

where we explicitly list the elements of X. The bar symbol "|" in set-builder notation represents the phrase "such that."

We will agree that such "objects" are known to exist. They could consist of people, letters of the alphabet, real numbers, or functions. There is also a special set, called the empty set or the null-set, that does not contain any elements. We represent the empty set symbolically as:

∅ or { }.

Early in life, we learn how to count using the set of natural numbers:

ℕ = {0, 1, 2, 3, 4, ...}.

We learn how to add, subtract, multiply and divide these numbers. Eventually, we learn about negative integers, thus completing the set of all integers:

ℤ = {..., −3, −2, −1, 0, 1, 2, 3, ...}.

We use the letter ℤ from Zahlen, the German word for "number." Later on, we learn that some integers cannot be exactly divided by others, thus producing the concept of a fraction and the set of rational numbers:

ℚ = { a/b | a and b are integers, with b ≠ 0 }.

Notice that we defined ℚ using set-builder notation. Still later on, we learn of the number π when we study the circumference and area of a circle. The number π is an irrational number, although it can be approximated by a fraction like 22/7 or as a decimal like 3.1416. When we learn to take square roots and cube roots, we encounter other examples of irrational numbers, such as √2 and ∛5. By combining the sets of rational and irrational numbers, we get the set of all real numbers ℝ. We visualize them as corresponding to points on a number line. A point is chosen to be "0," and another point to its right is chosen to be "1." The distance between these two points is the unit, and subsequent integers are marked off using the unit. Real numbers are classified into positive numbers, negative numbers, and zero (which is neither positive nor negative). They are also ordered from left to right by our number line. We show the real number line below along with some famous numbers:

[Figure: The Real Number Line, marked from −4 to 3, with some famous numbers such as π indicated]

Logical Statements and Axioms

An intelligent development of Set Theory requires us to develop in parallel a logical system. The basic component of such a system is this:

Definition: A logical statement is a complete sentence that is either true or false.



Examples: The statement:

The number 2 is an integer.

is a true logical statement. However:

The number 3/4 is an integer.

is a false logical statement. The statement:

Gustav Mahler is the greatest composer of all time.

is a sentence but it is not a logical statement, because the word "greatest" cannot be qualified. Thus, we cannot logically determine if this statement is true or false. □

In our daily lives, especially in politics, one person can judge a statement to be true while someone else might decide that it is false. Such judgments depend on one's personal biases, how credible they deem the person who is making the argument, and how they appraise the facts that are carefully chosen (or omitted) to support the case. In mathematics, though, we have a logical system by which to determine the truth or falsehood of a logical statement, so that any two persons using this system will reach the same conclusion. For the sake of sanity, we will need some starting points for our logical process:

Definition: An Axiom is a logical statement that we will accept as true, that is, as reasonable human beings, we can mutually agree that such Axioms are true. You can think of Axioms as analogous to the core beliefs of a philosophy or religion.

Examples: One of the most important Axioms of mathematics is this: The empty set ∅ exists. In geometry, we accept as Axioms that points exist. We symbolize a point with a dot, although it is not literally a dot. We accept that through two distinct points there must exist a unique line. We accept that any three non-collinear points (that is, three points through which no single line passes) determine a unique triangle. We believe in the existence of these objects axiomatically. We note, though, that these are Axioms in what we call Euclidean Geometry, but there are other geometric systems that have very different Axioms for points, lines and triangles. □

Quantifiers

Most, if not all, of the logical statements that we will encounter in Linear Algebra refer not just to numbers, but also to other objects that we will be constructing, such as vectors and matrices. We will use what are called quantifiers in order to specify precisely what kind of object we are referring to:



Definitions - Quantifiers: There are two kinds of quantifiers: universal quantifiers and existential quantifiers. Examples of universal quantifiers are the words for any, for all and for every, symbolized by ∀. They are often used in a logical statement to describe all members of a certain set. Examples of existential quantifiers are the phrases there is and there exists, or their plural forms, there are and there exist, symbolized by ∃. They are often used to claim the existence (or non-existence) of a special element or elements of a certain set.

Example: In everyday life, we can make the following statement: "Everyone has a mother." This is certainly a true logical statement. Let us express this more precisely using quantifiers:

"For every human being x, there exists another human being y, who is the mother of x." □

Some of the best examples of logical statements involving quantifiers are found in the Axioms that define the Real Number system. Linear Algebra in a sense is a generalization of the real numbers, so it is worthwhile to formally study what most of us take for granted.

The Axioms for the Real Numbers

We will assume that the set of real numbers has been constructed for us, and that this set enjoys certain properties. Furthermore, we will mainly be interested in what are called the Field Axioms:

Axioms - The Field Axioms for the Set of Real Numbers: There exists a set of Real Numbers, denoted ℝ, together with two binary operations: + (addition) and • (multiplication). Furthermore, the members of ℝ, addition, and multiplication enjoy the following properties:

1. The Closure Property of Addition: For all x, y ∈ ℝ: x + y ∈ ℝ as well.
2. The Closure Property of Multiplication: For all x, y ∈ ℝ: x • y ∈ ℝ as well.
3. The Commutative Property of Addition: For all x, y ∈ ℝ: x + y = y + x.
4. The Commutative Property of Multiplication: For all x, y ∈ ℝ: x • y = y • x.
5. The Associative Property of Addition: For all x, y, z ∈ ℝ: (x + y) + z = x + (y + z).

A natural number p > 1 is prime if the only natural numbers that exactly divide p are 1 and p itself. Consider the statement: if p is a prime number, then 2^p − 1 is also prime. If we look at the first few prime numbers p = 2, 3, 5, 7, we get:

2² − 1 = 4 − 1 = 3, which is prime,
2³ − 1 = 8 − 1 = 7, which is prime,
2⁵ − 1 = 32 − 1 = 31, which is prime, and
2⁷ − 1 = 128 − 1 = 127, which is also prime.

This might be enough to convince you that the statement is true. However, for p = 11, we get:

2¹¹ − 1 = 2048 − 1 = 2047 = 23 • 89.

Thus, we found a counterexample to the statement above, and so this statement is false. □

In fact, it turns out that the integers of the form 2^p − 1, where p is a prime number, are rarely prime, and we call such prime numbers Mersenne Primes. As of January 2020, there are only 51 known Mersenne Primes, and the largest of these is 2^82,589,933 − 1. This is also the largest known prime number. If this number were expressed in the usual decimal form, it would be more than 24.8 million digits long. Large prime numbers have important applications in cryptography, a field of mathematics which allows us to safely provide personal information such as credit card numbers on the Internet.
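The search for this counterexample is easy to reproduce; the following short Python sketch (purely illustrative, not part of the text) tests 2^p − 1 for the first few primes p by trial division:

# Illustrative check: is 2^p - 1 prime for the first few primes p?
def is_prime(n: int) -> bool:
    """Trial division; sufficient for the small numbers tested here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

for p in [2, 3, 5, 7, 11]:
    m = 2 ** p - 1
    print(p, m, "prime" if is_prime(m) else "composite")
# The last line of output reports that 2**11 - 1 = 2047 is composite (2047 = 23 * 89).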

Negations

Definition: The negation of the logical statement p is written symbolically as: not p. The statement not p is true precisely when p is false, and vice versa. When a negated logical statement is written in plain English, we put the word not in a more natural or appropriate place. We can also use related words such as never to indicate a negation.

Examples: The statement:

"An integer is not a rational number."

is a false logical statement. On the other hand, the statement:

"The function g(x) = 1/x is not continuous at x = 0."

is a true logical statement. □



Converse, Inverse, Contrapositive and Equivalence

By using negations or reversing the roles of the hypothesis and conclusion, we can construct three implications associated to an implication p ⇒ q:

Definition: For the implication p ⇒ q, we call:

q ⇒ p the converse of p ⇒ q,
not p ⇒ not q the inverse of p ⇒ q, and
not q ⇒ not p the contrapositive of p ⇒ q.

Unfortunately, even if we knew that an implication is true, its converse or inverse are not always true.

Example: We saw earlier that the following statement is true: "If f(x) is differentiable at x = a, then f(x) is also continuous at x = a." The converse of this statement is:

"If f(x) is continuous at x = a, then f(x) is also differentiable at x = a."

This statement is false, as shown by the counterexample f(x) = |x|, which is well known to be continuous at x = 0, but is not differentiable at x = 0. Similarly, the inverse of this Theorem is:

"If f(x) is not differentiable at x = a, then f(x) is also not continuous at x = a."

The inverse is also false: the same function f(x) = |x| is not differentiable at x = 0, but it is continuous there. Finally, the contrapositive of our Theorem is:

"If f(x) is not continuous at x = a, then f(x) is also not differentiable at x = a."

The contrapositive is a true statement: a function which is not continuous cannot be differentiable, because otherwise, it has to be continuous. □

If we know that p ⇒ q and q ⇒ p are both true, then we say that the conditions p and q are logically equivalent to each other, and we write the equivalence or double-implication:

p ⟺ q (pronounced as: p if and only if q).


We saw above that the contrapositive of our Theorem is also true, and in fact, this is no accident. An implication is always logically equivalent to its contrapositive (as proven in Appendix B):

(p ⇒ q) ⟺ (not q ⇒ not p).

Later, if we want to prove that the statement p ⇒ q is true, we can do so by proving its contrapositive. Similarly, the converse and the inverse of an implication are logically equivalent, and thus they are either both true or both false. We saw this demonstrated above with regards to differentiability versus continuity. The contrapositive of an equivalence p ⟺ q is also an equivalence, so we do not have to bother with changing the position of p and q. An equivalence is again equivalent to its contrapositive:

(p ⟺ q) ⟺ (not p ⟺ not q).

Logical Operations

We can combine two logical statements using the common words and and or:

Definition: If p and q are logical statements, we can form their conjunction: p and q, and their disjunction: p or q. The conjunction p and q is true precisely if both conditions p and q are true. Similarly, the disjunction p or q is true precisely if either condition p or q is true (or possibly both are true). Thus, if p and q is true, then p or q is also true.

Example: The statement:

"The real number √2 is irrational and bigger than 1."

is a true statement. However, the statement:

"Every real number is either positive or negative."

is false because the real number 0 is neither positive nor negative. □

The negation of a conjunction or a disjunction is sometimes needed in order to understand a Theorem, or more importantly, to prove it. Fortunately, the following Theorem allows us to simplify these compound negations:



Theorem 0.2 - De Morgan's Laws: For all logical statements p and q:

not (p and q) ⟺ (not p) or (not q), and
not (p or q) ⟺ (not p) and (not q).

Note that De Morgan's Laws look very similar to the Distributive Property (with a slight twist), and in fact they are precisely that in the study of Boolean Algebras. De Morgan's Laws are proven in Appendix B.
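As a quick independent check (not the proof given in Appendix B), both laws can be verified by exhausting the four possible combinations of truth values, for instance with a short Python sketch:

# Verify De Morgan's Laws over every combination of truth values of p and q.
for p in (True, False):
    for q in (True, False):
        assert (not (p and q)) == ((not p) or (not q))
        assert (not (p or q)) == ((not p) and (not q))
print("Both laws hold for all four combinations of truth values.")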

Subsets and Set Operations

We can compare two sets and perform operations on two sets to create new sets.

Definitions: We say that a set X is a subset of another set Y if every member of X is also a member of Y. We write this symbolically as:

X ⊆ Y ⟺ (x ∈ X ⇒ x ∈ Y).

If X is a subset of Y, we can also say that X is contained in Y, or Y contains X. We can visualize sets and subsets using Venn Diagrams as follows:

[Figure: a Venn diagram showing a set X inside a set Y]

We say X equals Y if X is a subset of Y and Y is a subset of X:

(X = Y) ⟺ (X ⊆ Y and Y ⊆ X).

Equivalently, every member of X is also a member of Y, and every member of Y is also a member of X:

(X = Y) ⟺ (x ∈ X ⇒ x ∈ Y and y ∈ Y ⇒ y ∈ X).

Suppose that X and Y are both subsets of another set, Z. We can combine X and Y into a single set that contains precisely all the members of the two sets using the union operation:

X ∪ Y = {z ∈ Z | z ∈ X or z ∈ Y}.

We determine all members common to both sets using the intersection operation:

X ∩ Y = {z ∈ Z | z ∈ X and z ∈ Y}.

We can also take the difference or complement of two sets:

X − Y = {z ∈ Z | z ∈ X and z ∉ Y}.

Notice the use of or and and in the definitions. We can also visualize these set operations using Venn diagrams. We first show two sets A and B, highlighted separately for clarity, then their union A ∪ B and their intersection A ∩ B, and finally the two complements, A − B and B − A:

[Figure: Venn diagrams of two sets A and B, their union A ∪ B, their intersection A ∩ B, and the complements A − B and B − A]

Example: Suppose we have the sets (expressed in roster notation):

A = {b, d, e}, B = {a, b, c, d, e, f}, C = {c, e, h, k}, and D = {d, e, g, k}.

Then, A ⊆ B because every member of A is also a member of B, and there are no other subset relationships among the four sets. Now, let us compute the following set operations:

C ∪ D = {c, e, h, k} ∪ {d, e, g, k} = {c, d, e, g, h, k},
C ∩ D = {c, e, h, k} ∩ {d, e, g, k} = {e, k},
C − D = {c, e, h, k} − {d, e, g, k} = {c, h}, and
D − C = {d, e, g, k} − {c, e, h, k} = {d, g}.

As a special bonus, notice that:

C ∪ D = {c, d, e, g, h, k} = {e, k} ∪ {c, h} ∪ {d, g} = (C ∩ D) ∪ (C − D) ∪ (D − C). □
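These computations can be checked with any tool that supports sets; for instance, the following Python sketch (purely illustrative, not part of the text) mirrors the example above:

# Illustrative check of the set operations in the example above.
C = {"c", "e", "h", "k"}
D = {"d", "e", "g", "k"}

assert C | D == {"c", "d", "e", "g", "h", "k"}   # union
assert C & D == {"e", "k"}                       # intersection
assert C - D == {"c", "h"}                       # difference C - D
assert D - C == {"d", "g"}                       # difference D - C

# The "special bonus" identity: C ∪ D = (C ∩ D) ∪ (C − D) ∪ (D − C)
assert C | D == (C & D) | (C - D) | (D - C)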

In the course of developing Linear Algebra, we will not just consider sets of real numbers, but also sets of vectors, notably the Euclidean Spaces from Chapter 1, sets of polynomials, and more generally, sets of functions (such as continuous functions and differentiable functions), and sets of matrices. We will be gradually constructing these objects over time.

Part II: Proofs

Perhaps the most challenging task that you will be asked to do in Linear Algebra is to prove a Theorem. To accomplish this, you need to know what is expected of you:

Definition: A proof for a Theorem is a sequence of true logical statements which convincingly and completely explains why a Theorem is true.

In many ways, a proof is very similar to an essay that you write for a course in Literature or History. It is also similar to a laboratory report, say in Physics or Chemistry, where you have to logically analyze your data and defend your conclusions.

The main difference, though, is that every logical statement in a proof should be true, and must either be an axiom, or follow as a conclusion from previously established true statements. The method of reasoning that we will use is a method of deductive reasoning which is formally called modus ponens. It basically works like this: Suppose you already know that an implication p ⇒ q is true. Suppose you also established that condition p is satisfied. Therefore, it is logical to conclude that condition q is also satisfied.

Example: Let us demonstrate modus ponens on the following logical argument: In Calculus, we proved that:

If f(x) is a continuous odd function on [−a, a], then ∫_{−a}^{a} f(x) dx = 0.

The function f(x) = sin⁵(x) is continuous on ℝ, because it is the composition of two continuous functions. It is an odd function on [−π/4, π/4], since: sin⁵(−x) = (sin(−x))⁵ = (−sin(x))⁵ = −sin⁵(x), where we used the odd property of both the sine function and the fifth power function. Therefore,

∫_{−π/4}^{π/4} sin⁵(x) dx = 0. □

Notice that this reasoning allows us to compute this definite integral without the inconvenience of finding an antiderivative and applying the Fundamental Theorem of Calculus!

Begin the proof by identifying what is given (the hypotheses), and what we want to show (the conclusions). Next, understand the meaning of the given conditions and the conclusion that you are supposed to reach.


It is therefore important that you can recall and state the definitions of a variety of words and phrases that you will encounter in Linear Algebra. After all, it would be impossible for you to explain how you obtained your conclusion if you do not even know what the conclusion is supposed to mean. We also use special symbols and notation, so you must know what they mean. Often, a previously proven Theorem can also be helpful to prove another Theorem. Rest assured, you will be shown examples which demonstrate proper techniques and reasoning, which you are encouraged to emulate as you learn and develop your own style. In the meantime, we present below some examples of general strategies and techniques which will be useful in the coming Chapters. These strategies are certainly not exhaustive: we sometimes combine several strategies to prove a Theorem, and the more difficult Theorems require a creative spark. For our first example, though, let us see how to prove a Theorem using only the Axioms of the Real Number System:

Example: Let us prove the following:

Theorem 0.3 - The Multiplicative Property of Zero: For all a ∈ ℝ: 0 • a = 0 = a • 0.

Proof: Suppose that a is any real number. We want to show that 0 • a = 0. If we can do this, then we can also conclude by the Commutative Property of Multiplication that a • 0 = 0 as well. All we know is that 0 • a is some real number, by the Closure Property of Multiplication. We will use a clever idea. We know the Identity Property of 0, that is, for all x ∈ ℝ:

0 + x = x = x + 0.

Since this is true for all real x, it is true in particular for x = 0, so we get:

0 + 0 = 0.

Now, if we multiply both sides of this equation by a, we get the equation:

(0 + 0) • a = 0 • a.

This equation is again a true equation because of the following Axiom:

Axiom - The Substitution Principle: If x, y ∈ ℝ and F(x) is an arithmetic expression involving x, and x = y, then: F(x) = F(y).

Simply put, if two quantities are the same, and we do the same arithmetic operations to both quantities, then the resulting quantities are still the same. Continuing now, by the Distributive Property, we get:

0 • a + 0 • a = 0 • a.

Remember that we want to know exactly what 0 • a is. Since 0 • a is a real number, it possesses an additive inverse, −(0 • a), by the Existence of Additive Inverses. Let us add this to both sides of the equation:

−(0 • a) + (0 • a + 0 • a) = −(0 • a) + 0 • a.

By the defining property of the additive inverse, −(0 • a) + 0 • a = 0, so we get:

−(0 • a) + (0 • a + 0 • a) = 0.

But now, by the Associative Property of Addition, the left side is:

(−(0 • a) + 0 • a) + 0 • a = 0.

Thus, by the additive inverse property, as above, we get:

0 + (0 • a) = 0

(we enclosed 0 • a in parentheses to emphasize that it is the quantity we are studying in the equation). Finally, by the additive property of 0 again, the left side reduces to 0 • a, so we get: 0 • a = 0. ■
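For reference, the chain of equations in the proof above can be collected into one display (the same steps, with the justifying Axiom noted on each line; this recap is an editorial condensation, not a new argument):

\begin{align*}
0 + 0 &= 0 &&\text{(Additive Identity, with } x = 0\text{)}\\
(0 + 0)\cdot a &= 0\cdot a &&\text{(Substitution Principle)}\\
0\cdot a + 0\cdot a &= 0\cdot a &&\text{(Distributive Property)}\\
-(0\cdot a) + (0\cdot a + 0\cdot a) &= -(0\cdot a) + 0\cdot a = 0 &&\text{(add } -(0\cdot a)\text{ to both sides; Additive Inverses)}\\
(-(0\cdot a) + 0\cdot a) + 0\cdot a &= 0 &&\text{(Associative Property of Addition)}\\
0 + 0\cdot a &= 0 &&\text{(Additive Inverses)}\\
0\cdot a &= 0 &&\text{(Additive Identity)}
\end{align*}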

Case-by-Case Analysis

We can prove the implication p ⇒ q if we can break down p into two or more cases, and every possibility for p is covered by at least one of the cases. If we can prove that q is true in each case, the implication is true. This is also sometimes called Proof by Exhaustion.

Example: Let us prove the following:

Theorem 0.4 - The Zero-Factors Theorem: For all a, b ∈ ℝ: a • b = 0 if and only if either a = 0 or b = 0.

Proof: Since this is an if and only if Theorem, we must prove two implications. Let us begin with the converse, which is easier:

Chapter Zero Summary

For the implication p ⇒ q: q ⇒ p is the converse of p ⇒ q, not p ⇒ not q is the inverse of p ⇒ q, and not q ⇒ not p is the contrapositive of p ⇒ q. If p ⇒ q and q ⇒ p are both true, then we say that p and q are equivalent to each other. We write the equivalence or double-implication p ⟺ q, pronounced as p if and only if q.

The implication p ⇒ q is equivalent to its contrapositive not q ⇒ not p. The conjunction p and q is true precisely if both conditions p and q are true. The disjunction p or q is true precisely if either condition p or q is true.

0.2 De Morgan's Laws: For all logical statements p and q: not (p and q) is logically equivalent to (not p) or (not q), and similarly, not (p or q) is logically equivalent to (not p) and (not q).

A set X is a subset of another set Y if every member of X is also a member of Y. We write this symbolically as X ⊆ Y. Two sets X and Y are equal if X is a subset of Y and Y is a subset of X, or equivalently, every member of X is also a member of Y, and vice versa: (X = Y) ⟺ (X ⊆ Y and Y ⊆ X) ⟺ (x ∈ X ⇒ x ∈ Y and y ∈ Y ⇒ y ∈ X).

Given two sets X and Y which are both subsets of Z, we can find:
• their union: X ∪ Y = {z ∈ Z | z ∈ X or z ∈ Y};
• their intersection: X ∩ Y = {z ∈ Z | z ∈ X and z ∈ Y}; and
• their difference or complement: X − Y = {z ∈ Z | z ∈ X and z ∉ Y}.

A proof for a Theorem is a sequence of true logical statements which convincingly and completely explains why a Theorem is true. A good way to begin a proof is by identifying the given conditions and the conclusion that we want to show. It is also a good idea to write down definitions for terms that are found in the Theorem. The main logical technique in writing proofs is modus ponens. We also use techniques such as:



• Case-by-Case Analysis
• Proof by Contrapositive
• Proof by Contradiction
• Proof by Mathematical Induction
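The last of these techniques, Mathematical Induction, is needed for Exercises 12 and 13 below. As a compact illustration on a formula that does not appear in those exercises, here is the classic induction proof that 1 + 2 + ··· + n = n(n + 1)/2, laid out in the three-step format that the reference to "Step 3" in Exercise 12(j) alludes to:

\begin{align*}
&\textbf{Step 1 (base case): } n = 1: \quad 1 = \tfrac{1\cdot 2}{2}.\\
&\textbf{Step 2 (inductive hypothesis): } \text{assume } 1 + 2 + \cdots + k = \tfrac{k(k+1)}{2} \text{ for some } k \ge 1.\\
&\textbf{Step 3 (inductive step): } 1 + 2 + \cdots + k + (k+1) = \tfrac{k(k+1)}{2} + (k+1) = \tfrac{(k+1)(k+2)}{2},\\
&\text{which is the formula for } n = k+1. \text{ Thus the formula holds for all positive integers } n.\ \blacksquare
\end{align*}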

Chapter Zero Exercises

1. Decide if the following statements are logical statements or not. If a statement is logical, classify it as True or False.
a. If x is a real number and |x| < 3, then −3 < x < 3.
b. If x and y are real numbers and x < y, then x² < y².
c. If x and y are real numbers and 0 < x < y, then 1/y < 1/x.
d. Every real number has a square root which is also a real number.
e. As of March 2020, Roger Federer holds the record for the most number of consecutive weeks as the world's number 1 tennis player.
f. The Golden State Warriors are the best team in the NBA.

2. Write the converse, inverse and contrapositive of the following:
a. If 0 ≤ x ≤ π/2, then cos(x) ≥ 0. (challenge: write the inverse and contrapositive without using the word "not")
b. If f(x) is continuous on the closed interval [a, b], then f(x) possesses both a maximum and a minimum on [a, b].

3. For the sets A and B, find A ∪ B, A ∩ B, A − B and B − A:
a. A = {a, c, f, h, i, j, m}, B = {b, c, g, h, j, p, q}.
b. A = {a, d, g, h, j, p, r, t}, B = {b, d, g, h, k, p, q, s, t, v}.

4. Prove the following Theorems concerning Real Numbers using only the 11 Field Axioms (and possibly Theorems that were proven in Chapter Zero). For every item, specify in your proof which Axiom or Theorem you are using at each step. We did (a) for you.
a. The Cancellation Law for Addition: For all x, y, c ∈ ℝ: If x + c = y + c, then x = y.
Proof: We are given that x, y, c ∈ ℝ, and x + c = y + c. We must show that x = y. Since c ∈ ℝ, there exists −c ∈ ℝ, by the Existence of Additive Inverses. We can add −c to both sides of the equation: (x + c) + (−c) = (y + c) + (−c), by applying the Substitution Principle. By the Associative Property of Addition, we get: x + (c + (−c)) = y + (c + (−c)), and from this: x + 0 = y + 0, again by the Existence of Additive Inverses. The last equation simplifies to x = y by the Existence of the Additive Identity. ■
b. The Cancellation Law for Multiplication: For all x, y, k ∈ ℝ, k ≠ 0: If k • x = k • y, then x = y.
c. The Uniqueness of Additive Inverses: Suppose x ∈ ℝ. If w ∈ ℝ is any real number with the property that x + w = 0 = w + x, then w = −x. In other words, −x is the only real number that satisfies the above equations.
d. Use the previous Exercise to show that −0 = 0. Hint: which Field Axiom tells us what 0 + 0 is?
e. Use the Uniqueness of Additive Inverses to prove that for all x ∈ ℝ: −x = (−1) • x. Hint: show step-by-step that x + (−1) • x simplifies to zero.
f. The Double Negation Property: Use some of the previous Exercises to show that: For all x ∈ ℝ: −(−x) = x.
g. The Uniqueness of Multiplicative Inverses: Suppose x ∈ ℝ and x ≠ 0. If y ∈ ℝ is any real number with the property that x • y = 1 = y • x, then y = 1/x. In other words, 1/x is the only real number that satisfies the above equations.
h. The Double Reciprocal Property: For all x ∈ ℝ, x ≠ 0: 1/(1/x) = x.
i. Solving Algebraic Equations: Prove that for all x, a, b ∈ ℝ: if x + a = b, then x = b − a.
j. If a ≠ 0 and ax = b, then x = b/a.

5. Use Proof by Contradiction to prove the following statements.
a. The real number 0 cannot have a multiplicative inverse. Hint: suppose 0 has a multiplicative inverse x. What can we say about 0 • x?
b. There is no largest real number. Note: for parts (b) and (c), the Order Axioms for Real Numbers (see Appendix A) will be useful.
c. There is no smallest positive real number.
d. Suppose that n ∈ ℕ and n factors as n = a • b, where a, b ∈ ℕ and both are positive. Prove that either a ≤ √n or b ≤ √n. You may assume that √n ∈ ℝ.

6. Primality Testing: This is a follow-up to Exercise 5(d).
a. Use Exercise 5(d) to prove: If n ∈ ℕ is not a prime number (that is, n is composite), and n > 1, then n has a prime factor which is at most √n. Note: you will not need Proof by Contradiction. Recall that n is composite if n can be factored as n = a • b, where 1 < a < n, and 1 < b < n.
b. Write the contrapositive of the statement in (a).
c. Use (b) to decide if 11303 is prime or composite.

7. Use Proof by Contrapositive to prove the following statements. You may use De Morgan's Law to simplify the contrapositive, when applicable:
a. For all a, b ∈ ℤ: if a • b is odd, then a is odd and b is odd.
b. For all a, b ∈ ℤ: if a + b is even, then either a and b are both odd or both even.
c. For all x, y ∈ ℝ: if x • y is irrational, then either x is irrational or y is irrational.
d. For all a ∈ ℤ: a² is even if and only if a is even.

Negating Statements with Quantifiers: A logical statement that begins with a universal quantifier is negated as follows: not (V x : p) is equivalent to: :3x : not (p). This should make sense: if it is not true that all x possess property p, then at least one x does not possess property p. Similarly: not (:3x : p) is equivalent to: V x : not (p ). Thus, the negation of "All of my friends are Democrats" is "One of my friends is not a Democrat." Notice that "None of my friends are Democrats" is wrong. Similarly, the negation of "One of my brothers is left-handed" is "All of my brothers are right-handed." It is not "One of my brothers is right-handed." Use the ideas above to write the negation of the following statements, and determine whether the original statement or its negation is true: a. Every real number x has a multiplicative inverse llx. b. There exists a real number x such that x 2 < 0. c. There exists a negative number x such that x 2 = 4. d. All prime numbers are odd. 9. Exploring Goldb,1ch's Conjecture: review the final Example in Chapter Zero. a. Demonstrate Goldbach's Conjecture using: 130 = ? + ? b. Rewrite Goldbach's Conjecture using the quantifiers "for every" and "there exist." 10. The Twin Prime Conjecture: Twin primes are pairs of prime numbers that differ only by 2. For example, (11, 13) are twin primes, as are (41,43). The Twin Prime Conjecture states that there are an infinite number of twin primes. What are the next years after 2020 that are twin primes? l l. The Fibonacci Prime Conjecture: The Fibonacci Numbers are those in the infinite sequence: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, ... where the next number (starting with the third) is the sum of the previous two numbers. Notice that 2, 3, 5, 13 and 89 are primes that appear in this sequence, so they are called Fibonacci Primes. The Fibonacci Prime Conjecture states that there are an infinite number of Fibonacci primes. Find the next Fibonacci prime after 89. 12. Prove the following by Mathematical Induction: For all positive integers n: 8.

a. 1² + 3² + ⋯ + (2n - 1)² = n(2n - 1)(2n + 1)/3
b. 1³ + 2³ + ⋯ + n³ = [n(n + 1)/2]²
c. 1³ + 3³ + ⋯ + (2n - 1)³ = n²(2n² - 1)
d. 1·2 + 2·3 + ⋯ + n·(n + 1) = n(n + 1)(n + 2)/3
e. 1·3 + 2·4 + 3·5 + ⋯ + n(n + 2) = n(n + 1)(2n + 7)/6
f. 1/(1·2) + 1/(2·3) + ⋯ + 1/(n(n + 1)) = n/(n + 1)
g. 1/(1·3) + 1/(3·5) + ⋯ + 1/((2n - 1)(2n + 1)) = n/(2n + 1)
h. 1·2 + 2·2² + 3·2³ + ⋯ + n·2ⁿ = 2[(n - 1)2ⁿ + 1]
i. 1·3 + 2·3² + 3·3³ + ⋯ + n·3ⁿ = (3/4)[(2n - 1)3ⁿ + 1]
j. n < 2ⁿ (this might require a little bit of creativity in Step 3).

13. Induction in Geometry: An n-gon is a polygon with n vertices (thus a triangle is a 3-gon and a quadrilateral is a 4-gon). We know from basic geometry that the sum of the angles of any triangle is 180°. Use Induction to prove that the sum of the interior angles of a convex n-gon is (n - 2)·180° (a polygon is convex if any line segment connecting two points inside the polygon is entirely within the polygon). Hint: in the inductive step, cut out a triangle using three consecutive vertices. Draw some pictures.

14. Properties of Set Operations:

a. If X and Y are two subsets of Z, write down the definition of X ∪ Y and X ∩ Y.
b. If A and B are two sets, write down the meaning of the expression A ⊆ B.
c. Similarly, what is the meaning of the equation A = B?
d. Now, use the previous parts to prove that X ⊆ X ∪ Y and Y ⊆ X ∪ Y.
e. State and prove a similar statement regarding X, Y and X ∩ Y.
f. Prove that X ⊆ Y if and only if Y = X ∪ Y.
g. Similarly, prove that X ⊆ Y if and only if X = X ∩ Y. Notice that it is now X on the left side of the equation.
h. Prove that (X - Y) ∩ Y = ∅.
i. Prove that X ∪ Y = (X ∩ Y) ∪ (Y - X) ∪ (X - Y), and each of the three sets in this union has no element in common with the other two. Hint: draw and label the diagram for X ∪ Y.
For the last two parts, assume that A and B are subsets of X.
j. Prove that A ∩ B is the largest subset of X which is contained in both A and B. In other words, prove that if C ⊆ A and C ⊆ B, then C ⊆ A ∩ B.
k. Prove that A ∪ B is the smallest subset of X which contains both A and B. In other words, prove that if A ⊆ D and B ⊆ D, then A ∪ B ⊆ D.

15. The Infinitude of Primes: Our goal in this Exercise is to show that the set of prime numbers is infinite. Thus, if the set of primes is P = {2, 3, 5, 7, 11, 13, ...}, then this list will never terminate.
a. Warm-up: prove that if the integers a and b are both divisible by the integer c, then a - b and a + b are also divisible by c. (We say that an integer x is divisible by a non-zero integer y if x/y is also an integer.)
We will use Proof by Contradiction to prove our main goal. Suppose that P above is a finite set, so the complete set of primes becomes P = {2, 3, 5, 7, 11, ..., p_L}, where p_L is the largest prime number. Let N = (2·3·5·⋯·p_L) + 1 ∈ ℕ. We will proceed with a Case-by-Case Analysis:
b. Suppose that N is prime. Show that we have a contradiction, hence we are done.
c. Now, suppose that N is not prime (thus we have considered both possibilities about N). Use Exercise 6(a) to find a prime q which is smaller than √N which is missing from the set P above. Explain why this is a contradiction and therefore our proof is finished. Hint: (a) could be useful.


16. Powersets: If X = {x₁, x₂, ..., xₙ} is a finite set, we define 𝒫(X), the powerset of X, to be the set of all subsets of X. For example, if X = {a, b}, then 𝒫(X) = {∅, {a}, {b}, {a, b}}, and thus 𝒫(X) has 4 elements; each subset is one element of 𝒫(X).
a. If X = {a, b, c}, list all the members of 𝒫(X). How many subsets does X have?
b. Separate the list that you got in part (a) into two columns. On the left, put those subsets that contain c, and on the right, put those that do not contain c.
c. Now, cross out c from each subset in the left column. What do you notice?
d. Prove by induction that if X = {x₁, x₂, ..., xₙ}, then 𝒫(X) has 2ⁿ elements. Hint: in the induction step, we want to show that the number of subsets of {x₁, x₂, ..., xₖ₊₁} is double the number of subsets of {x₁, x₂, ..., xₖ}. Think of how to generalize parts (b) and (c).
e. Show that the set of subsets of a finite set X has strictly more members than X itself. Hint: Use one of the Exercises above on Induction.

17. The purpose of this Exercise is to prove that for any real number a: √(a²) = |a|. Recall that the absolute value of a real number a ∈ ℝ is defined by:

|a| = a if a ≥ 0, and |a| = -a if a < 0.

We also know that the function f(x) = x² is not one-to-one on (-∞, ∞), but it is one-to-one if the domain is restricted to [0, ∞). In this case, the range of f(x) is also [0, ∞), and so we will define the square root of a real number b ∈ [0, ∞) as:

√b = c, where c ∈ [0, ∞) and b = c².

a. Warm-up: use the definition above to explain why for any real number a: |a| ≥ 0.
b. Again, using the definition, show that |a|² = a².
c. Our next goal is to show that √b is unique. In other words, prove that if c and d are two real numbers such that c ≥ 0, and d ≥ 0, and b = c² = d², then c = d. Hint: rewrite this equation into c² - d² = 0 and use the Zero Factors Theorem.
d. Rewrite the definition for √b to define √(a²).
e. Put together all the steps above to write a complete proof that √(a²) = |a|.

18. Positive Numbers and the Order Axioms: In some of the Exercises above, we assumed that the reader was familiar with the basic properties of positive numbers and inequalities. We can formalize these properties with these additional Axioms for Positive Numbers: There exists a non-empty subset ℝ⁺ ⊂ ℝ, consisting of the positive real numbers, such that the following properties are accepted to be true: (i) Closure under Addition and Multiplication: If x, y ∈ ℝ⁺, then x + y ∈ ℝ⁺ and x·y ∈ ℝ⁺. (ii) Zero is not positive: 0 ∉ ℝ⁺. (iii) The Dichotomy Property: If x ≠ 0, then either x ∈ ℝ⁺ or -x ∈ ℝ⁺, but not both. Using only these three Axioms, prove the following statements (as usual, an earlier Exercise can be used to prove a later Exercise, if applicable, and you may use the Field Axioms and anything else proven earlier).


a. Prove that 1 ∈ ℝ⁺. Hint: Use Proof by Contradiction. Suppose instead -1 ∈ ℝ⁺. What do the Closure properties and the Dichotomy Property tell us?
b. Use the previous Exercise to show that the set of positive integers {1, 2, 3, ...} is a subset of ℝ⁺. Hint: use the Closure property, and Induction.
c. Prove the Reciprocal Property for ℝ⁺: For all x ∈ ℝ, x ≠ 0: x ∈ ℝ⁺ if and only if 1/x ∈ ℝ⁺. See the hints in the two previous Exercises.
d. The Dichotomy Property creates another set, ℝ⁻, consisting of the negative real numbers: ℝ⁻ = {x ∈ ℝ | -x ∈ ℝ⁺}.
m. Prove the Transitive Property of Order: if x > y and y > z, then x > z.
n. Prove that if x < y and z ∈ ℝ⁺, then x·z < y·z and x·(-z) > y·(-z).
o. Prove the Order Properties for Reciprocals: For all x, y ∈ ℝ: If 0 < x and x < y, then 1/x > 1/y. If x < y and y < 0, then 1/x > 1/y.


p. Prove the Squeeze Theorem for Inequalities: For all x, y ∈ ℝ: if x ≤ y and y ≤ x, then x = y.
q. Let us define the imaginary unit i to be a number with the property that i² = i·i = -1. Prove that such a number cannot be a real number. Hint: if i ∈ ℝ, then either i ∈ ℝ⁺ or i ∈ ℝ⁻ or i = 0. Show that all cases lead to a contradiction.


Chapter One

The Canvas of Linear Algebra:

Euclidean Spaces and Systems of Equations

We study Calculus because we are interested in real numbers and functions that operate on them, such as polynomial, rational, radical, trigonometric, exponential and logarithmic functions. We want to study their graphs, derivatives, extreme values, concavity, antiderivatives, Taylor series, and so on. In the same spirit, we define:

Linear Algebra is the study of vectors, which are generalizations of numbers, vector spaces, their structure, and functions with special properties called linear transformations that map one vector space to another. In this Chapter, we will look at the basic kind of vector space, called Euclidean n-space, symbolized by ℝⁿ. Vectors in ℝ² and ℝ³ can be visualized as arrows, and the basic operations of vector addition, subtraction and scalar multiplication can be interpreted geometrically.

From these two basic operations, we will construct linear combinations of vectors, and form the Span of a set of vectors. We will use vectors to create another important object, called a matrix. A system of linear equations can be represented by an augmented matrix. We can then use the Gauss-Jordan Algorithm to transform the augmented matrix into its reduced row echelon form, from which we can easily read off the solutions of the original linear system. For example, the system

2x₁ - 3x₂ - x₃ = -1
-5x₁ + 6x₂ - 5x₃ - 3x₄ = 4
x₁ + 7x₃ + 3x₄ = -2

has augmented matrix

[ 2  -3  -1   0 |  -1 ]
[ -5  6  -5  -3 |   4 ]
[ 1   0   7   3 |  -2 ]

whose reduced row echelon form is

[ 1  0  7  3 |  -2 ]
[ 0  1  5  2 |  -1 ]
[ 0  0  0  0 |   0 ]

yielding the solutions

x = (-2 - 7x₃ - 3x₄, -1 - 5x₃ - 2x₄, x₃, x₄), where x₃, x₄ ∈ ℝ.
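As an optional illustration (my own sketch, not part of the text), a computer algebra system can carry out the Gauss-Jordan Algorithm and reproduce the reduced row echelon form shown above.

```python
from sympy import Matrix

# Augmented matrix of the linear system above (last column holds the constants).
M = Matrix([[ 2, -3, -1,  0, -1],
            [-5,  6, -5, -3,  4],
            [ 1,  0,  7,  3, -2]])

R, pivots = M.rref()
print(R)
# [1, 0, 7, 3, -2]
# [0, 1, 5, 2, -1]
# [0, 0, 0, 0,  0]
# Reading off the rref: x1 = -2 - 7*x3 - 3*x4 and x2 = -1 - 5*x3 - 2*x4,
# with x3 and x4 free, exactly the solution set displayed above.
```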


1.1 The Main Subject: Euclidean Spaces

In ordinary Algebra, we see ordered pairs of numbers such as (3, -5). Our first step will be to generalize these objects:

Definition: An ordered n-tuple or vector v is an ordered list of n real numbers: v = (v₁, v₂, ..., vₙ).

Example: (2, -1, 4) is an ordered 3-tuple (more naturally called an ordered triple), and (5, 7, -3, 0, 6, 2) is an ordered 6-tuple. □

Definition: The set of all possible n-tuples is called Euclidean n-space, denoted by the symbol ℝⁿ:

ℝⁿ = { v = (v₁, v₂, ..., vₙ) | v₁, v₂, ..., vₙ ∈ ℝ }.

Euclidean n-space is the main subject of linear algebra, and it is the fundamental example of a category of objects called vector spaces. Almost all concepts that we will encounter are related to vector spaces. The number n is called the dimension of the space, and we will refer to ℝ² as 2-dimensional space, ℝ³ as 3-dimensional space, and so on. Euclidean n-spaces are referred to collectively as Euclidean spaces. A vector v from ℝⁿ is more specifically called an n-dimensional vector, although we will simply say "vector" when we know which Euclidean space v comes from. We use an arrow on top of a letter to denote that the symbol is a vector. The entries within each vector are called the components of the vector, and they are numbered with a subscript from 1 to n. We will also agree that ℝ¹ = { v = (v₁) | v₁ ∈ ℝ } = ℝ, the set of real numbers.

Example: Let v = (7, 0, -5, 1) ∈ ℝ⁴. We say that v₁ = 7, v₂ = 0, v₃ = -5 and v₄ = 1. □

To distinguish real numbers from vectors, we will also refer to real numbers as scalars.

Definition: Two vectors u = (u₁, u₂, ..., uₙ) and v = (v₁, v₂, ..., vₙ) from ℝⁿ are equal if all of their components are pairwise equal, that is, uᵢ = vᵢ for i = 1 ... n. Two vectors from different Euclidean spaces are never equal.

Example: In ℝ³, we can say that (√4, 3², cos(π)) = (2, 9, -1), but (-2, 5, 7) ≠ (5, -2, 7). □

Many of the Axioms for Real Numbers that we saw in Chapter Zero have analogs in Euclidean spaces. Let us start by generalizing the scalar zero and the additive inverse of a real number:


Definitions: Each ℝⁿ has a special element called the zero vector, also called the additive identity, all of whose components are zero:

0ₙ = (0, 0, ..., 0).

We assume in this notation that there are n components, all of which are zero. Every vector v = (v₁, v₂, ..., vₙ) ∈ ℝⁿ has its own additive inverse or negative:

-v = (-v₁, -v₂, ..., -vₙ).

Example: In ℝ⁵, the zero vector will be written as 0₅ = (0, 0, 0, 0, 0). Notice that we do not put a subscript on the zeroes. If v = (4, -2, 0, 7, -6), then -v = (-4, 2, 0, -7, 6). □

Vector Arithmetic

Vectors in ℝⁿ are manipulated in two basic ways:

Definitions: If u = (u₁, u₂, ..., uₙ) and v = (v₁, v₂, ..., vₙ) are vectors in ℝⁿ, we define the vector sum:

u + v = (u₁ + v₁, u₂ + v₂, ..., uₙ + vₙ).

We also refer to the operation of finding the vector sum as vector addition. If r ∈ ℝ, we define the scalar product:

r·v = (rv₁, rv₂, ..., rvₙ).

The operation of finding a scalar product will be referred to as scalar multiplication. We can also define vector subtraction by:

u - v = u + (-1)·v = (u₁ - v₁, u₂ - v₂, ..., uₙ - vₙ).
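As a quick computational aside (not part of the text), these componentwise operations are exactly what array arithmetic does; the vectors below are the ones used in the Example that follows.

```python
import numpy as np

u = np.array([3, -5, 6, 7])
v = np.array([-4, 2, 3, -2])

print(u + v)   # vector sum:        [-1 -3  9  5]
print(-v)      # additive inverse:  [ 4 -2 -3  2]
print(5 * u)   # scalar product:    [ 15 -25  30  35]
print(u - v)   # vector difference: [ 7 -7  3  9]
```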

Example: Let u = (3, -5, 6, 7) and v = (-4, 2, 3, -2). Then:

u + v = (3 + (-4), -5 + 2, 6 + 3, 7 + (-2)) = (-1, -3, 9, 5),
-v = (4, -2, -3, 2), and
5u = (15, -25, 30, 35).

v₃ = 7v₁ - 4v₂, and v₅ = 8v₁ + 5v₂ - 10v₄.

Notice that the vectors that we solved for correspond to the free variables, and they are all expressed as linear combinations of the vectors corresponding to the leading variables. Furthermore, the coefficients 7 and -4 appear in column 3, and the coefficients 8, 5, and -10 appear in column 5. Consequently, any linear combination of vectors involving v₃ or v₅ can be rewritten by substituting the expressions above, resulting in a linear combination involving only v₁, v₂, and v₄. For example:

6v₂ + 4v₃ - 9v₅ = 6v₂ + 4(7v₁ - 4v₂) - 9(8v₁ + 5v₂ - 10v₄)
= 6v₂ + 28v₁ - 16v₂ - 72v₁ - 45v₂ + 90v₄
= -44v₁ - 55v₂ + 90v₄. □

This Example tells us two important things about a dependent set of vectors S: ( 1) We can solve for vectors corresponding to free variables as linear combinations of the vectors corresponding to the leading variables. (2) Consequently, any linear combination of vectors from S can be written exclusively as a linear combination of the vectors corresponding to the leading variables. These properties are encapsulated in the following central result:

The Minimizing Theorem

Given the choice of whether a person would prefer to be called "independent of others" or "dependent on others," it would be a no-brainer to pick "independent of others." Similarly, in Linear Algebra, we are not exactly thrilled with dependent sets, but as we saw above, the Gauss-Jordan Algorithm can be used to weed out redundant vectors from a dependent set:

Theorem 2.1.2 - The Minimizing Theorem: Let S = {v₁, v₂, ..., vₙ} be a non-empty set of vectors from ℝᵐ, and let A = [S] = [v₁ v₂ ⋯ vₙ]. Suppose that R is the rref of A. If R has only leading variables, then S is already independent. If R has at least one free variable, then S is dependent. In this case, let i₁, i₂, ..., iₖ be the column numbers of the columns of R that contain the leading variables. Then the set S₁ = {v_{i₁}, v_{i₂}, ..., v_{iₖ}}, that is, the subset of vectors of S consisting of the corresponding columns of A, is a linearly independent set, and:

Span(S) = Span(S₁).

Furthermore, every vᵢ ∈ S - S₁, that is, the vectors of S corresponding to the free variables of R, can be expressed as a linear combination of the vectors of S₁, using the coefficients found in the corresponding column of R.

Idea Behind the Proof: We will not give a general proof of this Theorem, but let us use the previous Example to demonstrate why this Theorem is true. Our set S was:

S = { (5, -3, 6, 5), (8, -5, 9, 6), (3, -1, 6, 11), (8, -5, 10, 7), (0, 1, -7, 0) }, and

A = [S] =
[  5   8   3   8   0 ]
[ -3  -5  -1  -5   1 ]
[  6   9   6  10  -7 ]
[  5   6  11   7   0 ]

has rref R =
[ 1  0   7  0    8 ]
[ 0  1  -4  0    5 ]
[ 0  0   0  1  -10 ]
[ 0  0   0  0    0 ]

Again, our leading variables are x₁, x₂, and x₄, and the free variables are x₃ and x₅. The Minimizing Theorem claims that the set S₁ = {v₁, v₂, v₄}, corresponding to the leading variables, is an independent set. To prove this, we must show that the only solution to the dependence test equation:

x₁v₁ + x₂v₂ + x₄v₄ = 0₄


is the trivial solution x₁ = x₂ = x₄ = 0. Notice that this test equation does not involve x₃ and x₅. Thus, we can rewrite this test equation using all five vectors by simply replacing x₃ and x₅ with zeroes. Thus, we want to solve the system Ax = 0₄, where x = (x₁, x₂, 0, x₄, 0):

x₁(5, -3, 6, 5) + x₂(8, -5, 9, 6) + 0(3, -1, 6, 11) + x₄(8, -5, 10, 7) + 0(0, 1, -7, 0) = 0₄.

However, the solutions to Ax = 0₄ are exactly the same as the solutions to Rx = 0₄, and Rx = 0₄ easily gives us:

Rx = x₁(1, 0, 0, 0) + x₂(0, 1, 0, 0) + x₄(0, 0, 1, 0) = (x₁, x₂, x₄, 0) = 0₄

if and only if x₁ = x₂ = x₄ = 0, so S₁ = {v₁, v₂, v₄} is independent.

We already saw in the previous Example that we can obtain v₃ and v₅ as linear combinations of v₁, v₂, and v₄, but we can tweak the argument above to see a different proof. This time, we will go backwards. Let us write the columns of R as c₁ to c₅. We can easily see that:

c₃ = (7, -4, 0, 0) = 7(1, 0, 0, 0) - 4(0, 1, 0, 0) = 7c₁ - 4c₂, and
c₅ = (8, 5, -10, 0) = 8(1, 0, 0, 0) + 5(0, 1, 0, 0) - 10(0, 0, 1, 0) = 8c₁ + 5c₂ - 10c₄.

Now, let us rewrite the two equations above as:

7c₁ - 4c₂ - c₃ = 0₄, and 8c₁ + 5c₂ - 10c₄ - c₅ = 0₄.

Notice these are homogeneous equations, so we can rewrite them using matrix products:

R(7, -4, -1, 0, 0)ᵀ = 0₄, and R(8, 5, 0, -10, -1)ᵀ = 0₄.

We now have two solutions x to the equation Rx = 0₄. But as we stated earlier, these vectors x are also solutions to the original equation Ax = 0₄. We can write these matrix products using the columns of A and the coefficients in x, obtaining:

7v₁ - 4v₂ - v₃ = 0₄, and 8v₁ + 5v₂ - 10v₄ - v₅ = 0₄.

These two equations can be more naturally written as:

v₃ = 7v₁ - 4v₂, and v₅ = 8v₁ + 5v₂ - 10v₄.
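As an optional check (my own sketch, not from the text), SymPy reproduces this computation: rref() returns R together with the pivot columns, and the dependence relations for v₃ and v₅ can be verified directly.

```python
from sympy import Matrix

# Columns of A are the vectors of S, in order.
A = Matrix([[ 5,  8,  3,  8,  0],
            [-3, -5, -1, -5,  1],
            [ 6,  9,  6, 10, -7],
            [ 5,  6, 11,  7,  0]])

R, pivots = A.rref()
print(pivots)   # (0, 1, 3): columns 1, 2, 4 hold the leading variables
print(R)        # columns 3 and 5 of R carry the coefficients 7, -4 and 8, 5, -10

# Verify the dependence relations read off from R:
v1, v2, v3, v4, v5 = [A.col(j) for j in range(5)]
print(v3 == 7*v1 - 4*v2)             # True
print(v5 == 8*v1 + 5*v2 - 10*v4)     # True
```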

This shows that the vectors corresponding to the free variables can be written as linear combinations of the vectors corresponding to the leading variables. Since any linear combination involving S = {v₁, v₂, ..., v₅} can be written using only vectors from S₁ = {v₁, v₂, v₄}, we have shown that Span(S) = Span(S₁). □

The Minimizing Theorem gives us an equivalent way to describe dependent sets of vectors:

Theorem 2.1.3: Suppose that S = {v₁, v₂, ..., vₙ} is a set of vectors from some ℝᵐ.

For (2), the dependence test equation would be: cv = 0ₘ. But by the Zero Factors Theorem, either c = 0 or v = 0ₘ. We already know by (1) that the second possibility yields a dependent set, but if the only possibility is c = 0, then S is independent, and v must be a non-zero vector. This proves (2). For (3), suppose that S is dependent. Then there is a non-trivial solution to the test equation:

au + bv = 0ₘ,

that is, either a or b is not zero. Without loss of generality, let's say a is not zero. Then, we can solve for u as: u = -(b/a)v. This means that u and v are parallel to each other. Conversely, suppose that u = cv, for some real number c. Then, we can rewrite this equation as: 1·u - cv = 0ₘ. This yields a dependence equation for S, because 1 ≠ 0. Thus, S is dependent. This proves (3). The second half of the proof for (3) also proves (4). We will leave the proof of (5) as an Exercise. ■


The Extension Theorem

Theorem 2.1.4 is about dependent sets and independent sets consisting of only a few vectors. For our final topic, let's just focus on independent sets of vectors. We saw that if we want {v₁} to be independent, then v₁ should be a non-zero vector.

Now, let's think about adding another vector v₂ to this set. If we want {v₁, v₂} to be independent, then Theorem 2.1.4 says that v₁ and v₂ should not be parallel to each other. However, since v₁ is a non-zero vector, we know that Span({v₁}) is a line. Thus, if we don't want v₂ to be parallel to v₁, then we should choose v₂ to be any vector which is not on the line Span({v₁}), and we get {v₁, v₂} to be an independent set of vectors.

Theorem 2.1.4 ends with a set of three vectors, so let's think about adding a third vector v₃ to {v₁, v₂}. If we want {v₁, v₂, v₃} to be linearly independent, then 2.1.4 says that v₃ should not be on the plane spanned by v₁ and v₂, but it also requires that v₁ should not be on the plane spanned by v₂ and v₃, and v₂ should not be on the plane spanned by v₁ and v₃. That's three conditions that we have to check. Fortunately, the next Theorem says that we only need to check the first condition. It also gives us a way to generalize this process to more than three vectors:

Theorem 2.1.5 - The Extension Theorem: Let S = {v₁, v₂, ..., vₙ} be a linearly independent set of vectors from some ℝᵐ, and suppose vₙ₊₁ is not a member of Span(S). Then, the extended set:

S′ = S ∪ {vₙ₊₁} = {v₁, v₂, ..., vₙ, vₙ₊₁}

is still linearly independent.

Proof: Let us construct the dependence test equation for the extended set:

c₁v₁ + c₂v₂ + ⋯ + cₙvₙ + cₙ₊₁vₙ₊₁ = 0ₘ.

We must show that we can only get the trivial solution: c₁ = 0, c₂ = 0, ..., cₙ₊₁ = 0. At this point, let us break up the analysis into two cases:

Case 1. Suppose we can find a solution where cₙ₊₁ = 0. Then we get the (shorter) dependence equation: c₁v₁ + c₂v₂ + ⋯ + cₙvₙ = 0ₘ. However, we know that S is linearly independent, so all the coefficients c₁ through cₙ of this equation must be 0. Thus c₁ = c₂ = ⋯ = cₙ₊₁ = 0, so S′ is linearly independent.

Case 2. Suppose we can find a solution where cₙ₊₁ ≠ 0. Then the dependence equation can be used to solve for vₙ₊₁: cₙ₊₁vₙ₊₁ = -c₁v₁ - c₂v₂ - ⋯ - cₙvₙ, and thus:

vₙ₊₁ = -(c₁/cₙ₊₁)v₁ - (c₂/cₙ₊₁)v₂ - ⋯ - (cₙ/cₙ₊₁)vₙ.

But this equation implies that vₙ₊₁ is a member of Span(S), and thus Case 2 leads to a contradiction. Thus, only Case 1 is possible, so S′ is linearly independent. ■

The Extension Theorem will be very important in the next Section when we construct a special set of vectors called a "basis."
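Here is a small numerical sketch of the Extension Theorem (my own illustration, with made-up vectors): appending a vector to an independent set keeps the set independent exactly when the matrix rank increases, i.e. when the new vector lies outside Span(S).

```python
import numpy as np

def extends_independently(S, w):
    """Return True if w is NOT in Span(S), so that S together with w stays independent.
    S is a list of (independent) vectors; w is a vector of the same size."""
    A = np.column_stack(S)
    B = np.column_stack(S + [w])
    return np.linalg.matrix_rank(B) > np.linalg.matrix_rank(A)

v1, v2 = np.array([1., 0., 2.]), np.array([0., 1., -1.])
print(extends_independently([v1, v2], np.array([3., -2., 8.])))  # False: (3,-2,8) = 3*v1 - 2*v2
print(extends_independently([v1, v2], np.array([3., -2., 0.])))  # True: not in Span({v1, v2})
```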


2.1 Section Summary

A set of vectors S = {v₁, v₂, ..., vₙ} from ℝᵐ is linearly dependent if the dependence test equation c₁v₁ + c₂v₂ + ⋯ + cₙvₙ = 0ₘ has a non-trivial solution; otherwise, S is linearly independent.

c₁r₁ + c₂r₂ + c₃r₃ = 0₆.

If c₁ were non-zero, we would have a non-zero entry in the 1st component, due to the leading 1 in row 1. Thus, c₁ must be zero. By the same reasoning, c₂ must also be zero because of the leading 1 in the 2nd component of the 2nd row, and c₃ must also be zero because of the leading 1 in the 5th component of the 3rd row. We can also verify that each of the original rows of A can be expressed as a linear combination of these three vectors. Again, thanks to the placement of the leading 1's, we can easily eyeball the correct coefficients. For example, row 1 of A can be written as:

(5, 7, 1, 5, 6, 0) = 5(1, 0, 3, 8, 0, 5) + 7(0, 1, -2, -5, 0, -7) + 6(0, 0, 0, 0, 1, 4).

Thus, these three rows Span rowspace(A). Since we already know that they are linearly independent, we have verified that they form a basis for rowspace(A). Now, let us find a basis for colspace(A). We will denote the original columns of A as c₁ through c₆. The rref R tells us that the leading variables correspond to columns 1, 2 and 5, and the free variables correspond to columns 3, 4, and 6.

Thus, c₁, c₂ and c₅ form a basis for colspace(A), and we write:

colspace(A) ⊆ ℝ⁴ has basis {c₁, c₂, c₅} = { (5, 3, 1, 2), (7, 4, 2, 4), (6, 3, 3, 5) }.

Notice that we wrote the columns horizontally as vectors in ℝ⁴. The Minimizing Theorem tells us that not only are these three columns linearly independent, but we can also write c₃, c₄, and c₆ in terms of c₁, c₂, and c₅, using the coefficients found in columns 3, 4, and 6 of R:

c₃ = 3c₁ - 2c₂, c₄ = 8c₁ - 5c₂, and c₆ = 5c₁ - 7c₂ + 4c₅.

This verifies that c₁, c₂ and c₅ Span colspace(A). Since we already know that they are linearly independent, we have verified that they form a basis for colspace(A). Next, to find a basis for nullspace(A), we set up the three homogeneous equations represented by R, as usual:

x₁ + 3x₃ + 8x₄ + 5x₆ = 0
x₂ - 2x₃ - 5x₄ - 7x₆ = 0
x₅ + 4x₆ = 0

The leading variables are x₁, x₂, and x₅, and the free variables are x₃, x₄, and x₆. We solve for the leading variables in terms of the free variables:

x₁ = -3x₃ - 8x₄ - 5x₆, x₂ = 2x₃ + 5x₄ + 7x₆, and x₅ = -4x₆.

Thus, our solutions to Ax = 0₄ are:

x = (-3x₃, 2x₃, x₃, 0, 0, 0) + (-8x₄, 5x₄, 0, x₄, 0, 0) + (-5x₆, 7x₆, 0, 0, -4x₆, x₆)
  = x₃(-3, 2, 1, 0, 0, 0) + x₄(-8, 5, 0, 1, 0, 0) + x₆(-5, 7, 0, 0, -4, 1),

where the free variables x₃, x₄, and x₆ can be any real numbers. Thus, the three vectors above Span nullspace(A). If we were to arrange these three vectors on top of each other in a dependence test equation, just like we did for the basis for rowspace(A), we can see that a similar pattern appears: there is a 1 in the component corresponding to the free variable xᵢ that produced that vector, and there are only zeroes above and below that 1:


x₃⟨-3, 2, 1, 0, 0, 0⟩ + x₄⟨-8, 5, 0, 1, 0, 0⟩ + x₆⟨-5, 7, 0, 0, -4, 1⟩ = 0₆.

Using the same logic as we applied to the rowspace, x₃ has to be zero, otherwise we get a non-zero entry in the 3rd component of the sum, thanks to the 1 in the 3rd component of the 1st vector. Similarly, we must also have x₄ = 0 because of the 1 in the 4th component of the 2nd vector, and x₆ = 0 because of the 1 in the 6th component of the 3rd vector. Thus, this set both Spans nullspace(A) and is linearly independent, and so we have a basis for nullspace(A), and we summarize our discussion with:

nullspace(A) ⊆ ℝ⁶ has basis { (-3, 2, 1, 0, 0, 0), (-8, 5, 0, 1, 0, 0), (-5, 7, 0, 0, -4, 1) }.

Lastly, to find a basis for nullspace(Aᵀ), we will need Aᵀ and its rref:

Aᵀ =
[ 5   3   1   2 ]
[ 7   4   2   4 ]
[ 1   1  -1  -2 ]
[ 5   4  -2  -4 ]
[ 6   3   3   5 ]
[ 0  -1   3   2 ]

with rref R₁ =
[ 1  0   2  0 ]
[ 0  1  -3  0 ]
[ 0  0   0  1 ]
[ 0  0   0  0 ]
[ 0  0   0  0 ]
[ 0  0   0  0 ]

If we refer to the four variables for the homogeneous system corresponding to Aᵀ as y₁ through y₄, we can set up the three equations:

y₁ + 2y₃ = 0
y₂ - 3y₃ = 0
y₄ = 0

The leading variables are y₁, y₂ and y₄, and the only free variable is y₃. We solve for the leading variables in terms of y₃, and we get: y₁ = -2y₃, y₂ = 3y₃ and y₄ = 0. Thus, our solutions are:

y = (-2y₃, 3y₃, y₃, 0) = y₃(-2, 3, 1, 0).

Our basis for nullspace(Aᵀ) therefore consists of a single (non-zero) vector, and we write:

nullspace(Aᵀ) ⊆ ℝ⁴ has basis { (-2, 3, 1, 0) }. □
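An optional computational check (my own sketch): with the matrix A of this Example reassembled from the column data above, SymPy confirms the four bases just found.

```python
from sympy import Matrix

# Rows of A, rebuilt from its columns c1..c6 as described in the Example.
A = Matrix([[5, 7,  1,  5, 6,  0],
            [3, 4,  1,  4, 3, -1],
            [1, 2, -1, -2, 3,  3],
            [2, 4, -2, -4, 5,  2]])

R, pivots = A.rref()
print(R)                 # nonzero rows: a basis for rowspace(A)
print(pivots)            # (0, 1, 4): columns 1, 2, 5 of A form a basis for colspace(A)
print(A.nullspace())     # three vectors spanning nullspace(A)
print(A.T.nullspace())   # one vector spanning nullspace(A^T), a multiple of (-2, 3, 1, 0)
```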


Rank and Nullity

The dimensions of the Four Fundamental Matrix Spaces go by the following names:

Definition/Theorem 2.3.4 - Rank and Nullity: Let A be an m × n matrix. The dimension of the nullspace of A is called the nullity of A. The dimension of the rowspace of A is exactly the same as the dimension of the columnspace of A, and we call this common dimension the rank of A. Furthermore, since rowspace(A) = colspace(Aᵀ), and colspace(A) = rowspace(Aᵀ), we can conclude that rank(A) = rank(Aᵀ). We write these dimensions symbolically as:

rank(A) = dim(rowspace(A)) = dim(colspace(A)) = rank(Aᵀ), and
nullity(A) = dim(nullspace(A)), nullity(Aᵀ) = dim(nullspace(Aᵀ)).

Proof: All we need to show is that dim(rowspace(A)) = dim(colspace(A)). We saw that the non-zero rows of the rref R of A form a basis for rowspace(A). Thus, the dimension of rowspace(A) is the number of leading ones of R. However, the Minimizing Theorem says that the columns of A corresponding to the leading ones of R form a basis for the columnspace of A. Thus, the dimension of colspace(A) is also the number of leading ones of R, and so these two dimensions are equal. ■

Example: For the matrix in our previous Example, we found that:

rowspace(A) has basis { (1, 0, 3, 8, 0, 5), (0, 1, -2, -5, 0, -7), (0, 0, 0, 0, 1, 4) },
colspace(A) has basis { (5, 3, 1, 2), (7, 4, 2, 4), (6, 3, 3, 5) },
nullspace(A) has basis { (-3, 2, 1, 0, 0, 0), (-8, 5, 0, 1, 0, 0), (-5, 7, 0, 0, -4, 1) }, and
nullspace(Aᵀ) has basis { (-2, 3, 1, 0) }.

We also showed in this Example that each set of vectors above is a basis for the corresponding matrix space. Thus: rank(A) = 3 = rank(Aᵀ), nullity(A) = 3 and nullity(Aᵀ) = 1. □

Simply staring at a matrix which is not in reduced row echelon form will usually not allow you to correctly guess its rank or nullity. However, we can set some bounds on these dimensions:

Theorem/Definition 2.3.5 - Bounds on Rank and Nullity: Suppose A is an m × n matrix. Then:

0 ≤ rank(A) = rank(Aᵀ) ≤ min(m, n),
n - m ≤ nullity(A) ≤ n, and
m - n ≤ nullity(Aᵀ) ≤ m.

We say that A has full-rank (or A is a full-rank matrix) if rank(A) = min(m, n).


The symbol min(m, n) means the smaller of the two values m and n. For example, min(4, 7) = 4. We leave the details of the proofs for these inequalities as an Exercise. Notice also that if m ≥ n, then n - m ≤ 0, and thus we effectively just get 0 ≤ nullity(A) ≤ n, since nullity cannot be negative. Likewise, if m ≤ n, we just get 0 ≤ nullity(Aᵀ) ≤ m.

Example: If A is a 5 × 9 matrix, then 0 ≤ rank(A) = rank(Aᵀ) ≤ 5 = min(5, 9). Since 9 - 5 = 4, nullity(A) is between 4 and 9, inclusively. However, 5 - 9 = -4, and since nullity(Aᵀ) cannot be negative, all we can conclude is that 0 ≤ nullity(Aᵀ) ≤ 5. However, we already knew this since nullspace(Aᵀ) ⊆ ℝ⁵. Thus, our bounds give us no additional useful information about nullity(Aᵀ). □

The Dimension Theorem for Matrices

We now present one of the central Theorems of Linear Algebra:

Theorem 2.3.6 - The Dimension Theorem for Matrices: For any m × n matrix A:

rank(A) + nullity(A) = n, the number of columns of A, and similarly,
rank(Aᵀ) + nullity(Aᵀ) = m, the number of rows of A.

Proof: If R is the rref of A, the rank of A is the number of leading 1's of R, and the nullity of A is the number of free variables. But since we have n variables, and every variable is either leading or free (but not both), the first equation follows. A similar argument applies to Aᵀ, since the number of columns of Aᵀ is the number of rows of A. ■

Example: In the previous Example, A is a 4 × 6 matrix with rank 3 and nullity 3. The Dimension Theorem is thus verified, since rank(A) + nullity(A) = 3 + 3 = 6 = n. A is not a full-rank matrix, since min(4, 6) = 4 ≠ 3. □
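A short numerical check of the Dimension Theorem for this same matrix (again my own sketch, using NumPy's rank routine):

```python
import numpy as np

A = np.array([[5, 7,  1,  5, 6,  0],
              [3, 4,  1,  4, 3, -1],
              [1, 2, -1, -2, 3,  3],
              [2, 4, -2, -4, 5,  2]], dtype=float)

m, n = A.shape
rank = np.linalg.matrix_rank(A)
print(rank, n - rank)                        # 3 and 3: rank(A) + nullity(A) = n = 6
print(np.linalg.matrix_rank(A.T), m - rank)  # 3 and 1: rank(A^T) + nullity(A^T) = m = 4
```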

Sight-Reading the Nullspace

We will now present a way to find a basis for the nullspace of a matrix by inspection, without having to explicitly solve for our leading variables. Let us bring back the rref R and the basis that we found for nullspace(A) in the previous Example:

R =
[ 1  0   3   8  0   5 ]
[ 0  1  -2  -5  0  -7 ]
[ 0  0   0   0  1   4 ]
[ 0  0   0   0  0   0 ]

and the basis

{ ⟨-3, 2, 1, 0, 0, 0⟩, ⟨-8, 5, 0, 1, 0, 0⟩, ⟨-5, 7, 0, 0, -4, 1⟩ }.

We begin by identifying the leading ones in the rref, which are highlighted, and the free variables, as usual. Our free variables are x₃, x₄, and x₆. This tells us that we will need three vectors in our basis, corresponding to each free variable. We create a skeleton for these three vectors and write them on separate lines so we can align the six components properly. For each vector, place a one on the entry corresponding to the free variable, and put a zero on the entries above and below this one on the other vectors, because free variables do not affect each other. Put a zero as well on all coordinates to the right of each one, because a free variable only affects a leading variable to its left. For example, x₄ only affects x₁ and x₂, but not x₅. Leave the other entries blank for now:

{ ⟨ _, _, 1, 0, 0, 0 ⟩, ⟨ _, _, 0, 1, 0, 0 ⟩, ⟨ _, _, 0, 0, _, 1 ⟩ }
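As a mechanical check (my own sketch), each sight-read basis vector must satisfy Rx = 0, which is quick to verify:

```python
import numpy as np

R = np.array([[1, 0,  3,  8, 0,  5],
              [0, 1, -2, -5, 0, -7],
              [0, 0,  0,  0, 1,  4],
              [0, 0,  0,  0, 0,  0]])

basis = [np.array([-3, 2, 1, 0,  0, 0]),
         np.array([-8, 5, 0, 1,  0, 0]),
         np.array([-5, 7, 0, 0, -4, 1])]

for x in basis:
    print(R @ x)   # each product is the zero vector, so each vector lies in nullspace(R) = nullspace(A)
```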


c.

d.

Y2

-->

rn

b.

r1

-->

r2

A=

-->

rn

Show that every row in the set SI = {r I, r2, ... , r n } is a linear combination of the rows in the set S2 = {s1, r2, ... , rn }, wheres, = C • r,, and vice versa. Now, prove in general that if the rows of A are {r1, r2, ... , r1, ... , rn} and the rows of Bare {r1, r2, ... , Si, ... , rn }, where Si = C • r1, then every row of A is a linear combination of the rows of B, and vice-versa. Suppose we exchange any two rows of A to produce B. Show that the rows of Bare the same as the rows of A, just in a different order, and use this to show that each row of A is a linear combination of the rows of B, and vice-versa. Suppose that:

A=

-->

ri

r1 + k. r2

-->

-->

r2 '

and B =

-->

r2

-->

rn

rn

for some k E !RI.(note that k could be zero). Let us writes, = r1 + k • r2. Show that every row in the set S l = {r I, is a linear combination of the rows in the set S2 = {s 1, r2, ... , r 11 }, and vice versa. Generalize your argument in part (d) and prove that if A and B have exactly the same rows, but row i of B is: _,s· I = 1l +k•r j,

r2,...,rn}

e.

then each row of Bis a linear combination of the rows of A, and vice-versa. This completes the proof that rowspace(A) = rowspace(B). Section 2.3 The Fundamental Matrix Spaces

151

f. 5.

Explain why the non-zero rows of the rref R of A form a basis for the rowspace of A. Be sure to address both issues: Spanning and linear independence.

Proof of Theorem 2.3.5: Suppose that A is an m x n matrix. Prove that: a.

0 :S rank(A) :S min(m,n)

b.

n - m :S nullity(A) :S n

c.

m - n :S nullity(AT) :S m

6.

Suppose that A is an n x n matrix. Prove that nullity(A) = nullity(AT).

7.

Prove that the only m x n matrix with rank O is the m x n matrix where all the entries are zero. Hint: use Proof By Contradiction. Think of the Gauss-Jordan Algorithm.

8.

Suppose that A is an a x n matrix, and B is a b x n matrix, where a and b could possibly be different positive integers. Prove that:

rank(A) :S rank(B) 9.

if and only if

nullity(A) 2: nullity(B).

Proof of Theorem 1.5.3: The Uniqueness of the Reduced Row Echelon Form: We are now in a position to prove that if A is an m x n matrix, and we obtain two matrices Hand J from A using a finite sequence of elementary row operations, and both Hand J are in reduced row echelon form, then H = J. Thus, the rref of A is unique. We will use the Principle of Mathematical Induction. a.

First let us take care of the trivial case: If A consists entirely of zeroes, prove that

H=A=J. Thus we can assume for the rest of the Exercise that A is a non-zero matrix. b.

Explain why rowspace(H)

c.

Explain why the number of non-zero rows of H must be the same as the number of non-zero rows of J. Hint: what does this number represent? Thus, we can conclude that both H and J have k non-zero rows, for some positive number k. We must now show that every pair of corresponding rows are equal. We will start with the last non-zero row because it has the most number of zeroes. Before we look at the general case, let us look at a numeric wann-up:

d.

Both Hand Jbelow are in rref and have rank 3:

=

rowspace(A)

=

rowspace(J).

1 0 -2

0

0 1 4

-0

0 0

0

1

: ]·

-3

Explain why the 3rd row of J cannot be expressed as a linear combination of the three rows of H. Hint: use the fact that the leading 1 is in the 4th column and every entry to its left is zero. e.

152

Now, explain in general that the leading 1 in row k of H must be in the same column as the leading 1 in row k of J. Hint: pick the matrix whose leading one in row k is further to the right.

Section 2.3 The Fundamental Matrix Spaces

f.

Both Hand Jbelow have their leading 1 in row 3 in the same column:

H

=

[

~~~

~2

:

0 0 1 5 -3

g. h.

1.



k.

l; ~ ~ ~ J

=

[

2 ~

0 0 1 5

~6

-2

l-

Explain why the 3rd row of J cannot be expressed as a linear combination of the three rows of H, and similarly, the 3rd row of H cannot be expressed as a linear combination of the three rows of J. Hint: use the fact that the leading ones in rows 1 and 2 of Hand J are above zeroes in row 3. Explain in general that row k of H must be exactly the same as row k of J. Now, let us focus on row k- 1. Both Hand J below are in rref, both have rank 3, and their 3rd rows are the same:

Explain why the 2nd row of J cannot be expressed as a linear combination of the three rows of H. Note that this includes possibly using the 3rd row of H. In the same spirit as parts (e) and (g), explain in general why row k - l of H must be exactly the same as row k - 1 of J. Notice that we are working our way up each matrix. Generalize your arguments above: show that if we already know that rows i to k of Hand J have already been shown to be equal, then row i - l of Hand J must also be equal. Since we can continue in this fashion until we reach row 1, this completes the proof that all rows of H must be exactly the same as the corresponding row of J. Epilogue: In part (e), we focused on the last non-zero row. Suppose we looked at the first rows instead. Both Hand J below are in rref and have rank 3. Show that row 1 of J is a linear combination of the three rows of H. This shows that induction cannot begin at the first row.

H-l~ I~~ J-l~ ~ II ~ 8

-:

}

l

10. True or False: Determine whether each statement is true or false, and briefly explain your answer by citing a Theorem, providing a counterexample, or a convincing argument. a. b. c. d.

If If If If

A is a 7 x 4 matrix, then A can have rank 5. A is a 4 x 7 matrix, then A can have nullity 5. A is a 7 x 4 matrix, then A can have nullity 5. A is a 7 x 4 matrix, then rank(A)

Section 2.3 The Fundamental Matrix Spaces

+ nullity(A) = 7. 153

2.4 The Dot Product and Orthogonality

We can draw points, lines and planes on the Cartesian plane and in Cartesian space, and thus we can see the geometry of ℝ² and ℝ³. However, we often want to study the angles formed by two vectors in higher dimensional spaces which we cannot visualize. In order to explore this further, we need the following general concept:

Definition: If u = (u₁, u₂, ..., uₙ) and v = (v₁, v₂, ..., vₙ) are vectors from ℝⁿ, we define their dot product:

u ∘ v = u₁v₁ + u₂v₂ + ⋯ + uₙvₙ.

Example: If u = (5, -3, 0, 2, -7) and v = (2, 5, 984, -6, -4), then:

u ∘ v = 5(2) - 3(5) + 0(984) + 2(-6) - 7(-4) = 10 - 15 + 0 - 12 + 28 = 11. □
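As a computational aside (my own sketch), the dot product and the norm defined next are one-liners in NumPy; the vectors are the ones from the Examples in this section.

```python
import numpy as np

u = np.array([5, -3, 0, 2, -7])
v = np.array([2, 5, 984, -6, -4])

print(np.dot(u, v))        # 11, matching the Example above
print(np.linalg.norm(u))   # sqrt(87) ~ 9.327, the length computed in the Example below
print(np.isclose(np.dot(u, u), np.linalg.norm(u)**2))  # True: ||u||^2 = u o u
```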

The dot product of a vector with itself has a natural geometric interpretation. The following definition generalizes the concept of length that we introduced in Section 1.1:

Definitions: We define the length or norm or magnitude of a vector v = (v₁, v₂, ..., vₙ) ∈ ℝⁿ as the non-negative number:

||v|| = √(v₁² + v₂² + ⋯ + vₙ²).

It follows directly from the definition of the dot product that:

||v||² = v ∘ v, or in other words, ||v|| = √(v ∘ v).

A vector with length 1 is called a unit vector. Notice that we get exactly the same definition for the length of a vector in ℝ² and ℝ³ as we did in Section 1.1. Similarly, we can easily prove that:

Theorem 2.4.1: For any vector v ∈ ℝⁿ and k ∈ ℝ: ||kv|| = |k| ||v||. In particular, if v ≠ 0ₙ, then u₁ = (1/||v||)v is the unit vector in the same direction as v, and u₂ = -(1/||v||)v is the unit vector in the opposite direction as v. Furthermore: ||v|| = 0 if and only if v = 0ₙ.

Example: The vector v = (5, -3, 0, 2, -7) has length:

||v|| = √(25 + 9 + 0 + 4 + 49) = √87,

and thus the two unit vectors parallel to v are:

u₁ = (1/√87)(5, -3, 0, 2, -7) and u₂ = -(1/√87)(5, -3, 0, 2, -7). □

Notice that in ℝ² and ℝ³, the length of the vector u is also the length of the directed line segment or arrow representing u, as we saw in Section 1.1. We also see that the standard basis vectors e₁, e₂, ..., eₙ are unit vectors in each n-space, since all their components are 0 except for a single "1."

The following properties of the dot product are easily proven, just like we proved the properties of vector arithmetic in Section 1.1:

Theorem 2.4.2 -Properties of the Dot Product: For any vectors wE IRl.11 and scalar k E !RI.,we have:

u,v,

-+



1. The Commutative Property ~

....

2. The Right Distributive Property

....

3. The Left Distributive Property

(

....)

U+ V



-+ =uov+uow.

.... )

uo\v+w



V =Vo U.

U o

















W = U o W + Vo W.

o

Ck.u) ov = k(uov) = z1o Ck.v).

4. The Homogeneity Property

....

5. The Zero-Vector Property

u



6. The Positivity Property

lfu

=t

--+

011 = 0.

o





011, thenuou



> 0.

The last two properties can be combined into one: ➔



uou > 0

7. The Non-Degeneracy Property

if and only if



u



=t



-+

On, and 011o On = 0.

11(which IRl. 11 Example: Suppose we are told that u and v are two vectors from some IRl. is not really important). Suppose we were provided the information that llull = 5, llvll= 29, and uo v = 24. Let us find 1!2u - 3vl!.

This problem will help us appreciate the power of the property that for all vectors

wE

11 !RI. :

2 !!wll =wow.

2u-

To find I! 3vII,we apply the formula above to homogeneous properties of the dot product. Thus:

112z1 - 3vll2

=

w= 2u- 3v, along with the distributive and

c2z1 - 3v) c2u- 3v) 0

= c2z1 - 3v) o 2z1 - c2t1- 3v) o 3v

= c2u)oc2z1)- (3v) o c2z1)- c2u)o(3v) - C-3v)o (3v) = 4(u =

om- 6(u

2 41!ull -12(u

o

v) - 6(v

O

v)

om+ 9(v

o

v)

+ 9llv1!2,

where in the last step, we again used the property above, to change Section 2.4 The Dot Product and Orthogonality

u uto !lu11and v vto o

2

o

155

llv 112 , as well as the commutative property of the dot product. Also, notice the similarity between the computation above and the formula for expanding (a - b) 2 in basic algebra. Now, since we know the values of all of these quantities, we can now compute: 2

2

ll2u - 3vll = 4llu II =

-

12(uv) + 9llvll2 O

4(5) 2 - 12(24) + 9(29)

2

=

7381.

Thus, ll2u - 3vll = J7381 · □

A Geometric Formula.tion for the Dot Product The Law of Cosines from Trigonometry says that if a triangle has sides a, b, c and the angle opposite side c is called e,then:

c2 = a 2 + b2 Now, suppose triangle in ~ 2 :

u and v are two non-zero

-

2abcos(0)

vectors of ~ 2 . The vectors

v,u- v and u form

a

y

4 3 2 1 }23456

X

The Triangle Formed by

v,u-vand u v

u.

We can easily check that this diagram is correctly labeled, because + (u - v) = Now, if we let 0 be the angle between and as shown in the diagram, then by the Law of Cosines:

u v,

lli1-vll

2

2

= llull2 + llvll -2lli1llllvllcos(0).

But recall that:

156

Section 2.4 The Dot Product and Orthogonality

11a - vii2

cu- v) eu- v) = cu - v)o u- cu- v)o v =

0

➔➔

➔➔

➔➔

(by the Right Distributive Property)

➔➔

= uou-vou-uov+vov

(by the Left Distributive Property) (by the Commutative Property).

Thus we get:

[lull2 -

2uv+ llvll2 =

llull2 + llvll2 - 2llullllvllcos(0).

O

Cancelling common terms and dividing both sides by -2, we get: Definition/Theorem 2.4.3: If

uand vare non-zero vectors in~

2,

then:

u v = llullllvllcos(0), O

u v

where 0 is the angle formed by the vectors and in standard position. Thus, we can compute the angle 0 between and using:

u

v

where O :'.S0 :'.Sre. We will use the exact same formula for two non-zero vectors in ~ 3 .

u

Example: Let us consider the vectors = the cosine of the angle 0 between them is: cos(B)

=

uv o

llullllvll

=

( 5,

3) and

v

=

(-2, 6). According to our formula,

5(-2) + 3(6) = -10+18 = 2 J52 + 32 JC-2) + 62 ./34./40

8

4/85

~0.21693

Thus, by using a scientific calculator, we find that 0 ~ 1. 3521 radians, or about 77.47°. Let us draw the two vectors together in standard position:

6 y 5 v=

u= < 5, 3 > -4-3-2-1

123456

X

Finding the Angle Between Two Vectors We can check with a protractor that this answer is reasonable. □

Section 2.4 The Dot Product and Orthogonality

157

Notice in particular that if uo v = 0, then cos(0) = 0, hence 0 = rr/2. In other words uand v --+ 2 are perpendicular to each other. Since 0 2 o = 0 for all E !RI. , from the definition of the dot --+ 2, and similarly for --+ product, we will also agree that 02 is orthogonal to all vectors in !RI. 03 in 3 !RI.• We summarize all this with the following:

v

v

u and v u v = 0.

Definition/Theorem 2.4.4: Two vectors orthogonal to each other if and only if o Example: The vectors

E

2 !RI.

3 are perpendicular or or !RI.

u= ( 5, -3, 1) and v = ( 4, 9, 7) are orthogonal, since: uov = 5(4) + (-3)(9) + 1(7) = 20- 27 + 7 = o.

Sketching these vectors to verify that they are perpendicular would be a futile task, though, because the vectors will not appear to be perpendicular, thanks to the distorted perspective of Cartesian space. □

Revisiting The Cartesian Equation of a Plane In Section 1.2, we saw in our Example that:

Span({(3,-l,2),(-4,2,3)})

=

{(x,y,z)

3 17x+ E IRl.

17y-2z

=

0},

a plane II passing through the origin. If we collect the coefficients of the variables in the Cartesian equation into a vector, we get n = (7, 17,-2). This is called a normal vector to the plane II. It is not unique, but the line L = Span( {(7, 17,-2)} ), which we call the normal line to II, is unique. Notice that if we take the two vectors defining our Span and get their dot products with we get:

n,

(3,-1, 2) o (7, 17,-2) = 3(7) + (-1)(17) + 2(-2) = 0, and (-4,2,3)o(7,

17,-2) = (-4)(7)+2(17)+3(-2)

= 0.

n

Thus is orthogonal to both vectors. However, it also follows from the equation of II that a 3 vector (x,y, z) in !RI. is on II if and only if (x, y, z) is orthogonal ton, since:

0

=

7x+ 17y-2z

=

(7, 17,-2)o(x,y,z).

This argument generalizes to any plane II passing through the origin:

u,v

3 are non-parallel vectors, and Definition/Theorem 2.4.5: Suppose that E !RI. II = Span( {u, v}) is the plane passing through the origin with Cartesian equation:

ax+ by+ cz = 0. 3 is Then: n= ( a, b, c) is a normal vector to II, which means that any vector (x, y, z) in !RI. on II if and only if (x, y, z) is orthogonal to Although is not unique, the line L = Span( {n}) is unique, and we call L the normal line to II.

n.

n

On the left, below, we show the plane II and its normal line L from our Example:

158

Section 2.4 The Dot Product and Orthogonality

,---------jn II

/•\/'

( Xo•Yo,=o )

./w

I

Q (x.y. =)

7x + 17y- 2z = 0 and Span( {(7, 17,-2)})

An Arbitrary Plane in Cartesian Space

More generally, suppose that II is the translate of Span({u,v}) by Xp=(x 0 ,y 0 ,z 0 ), corresponding to the vector from the origin to P(xo, yo, zo). This is illustrated above, on the right. In this case, IT may no longer pass through the origin. Suppose that Q( x, y, z) is any point on IT. As we saw in the previous Chapter, II will have a vector equation: (x,y,z)

= (xo,Yo,zo)+tu+sv.

We can rewrite this equation as: (x-xo,y-yo,z-zo)

= tu+sv.

w

In other words, = (x -xo, y-yo, z- zo) is a vector in the plane Span( {u, v}) which does pass through the origin. Thus, by our observation above, (x -xo, y-yo, z - zo) must be orthogonal to any normal vector n = ( a, b, c) for Span( {u, v} ). Thus: (a,b,c)

0

(x-xo,y-yo,z-zo)

= 0, or

a(x - xo) + b(y - Yo) + c(z - zo) = 0, and expanding: ax+ by+ cz = axo + byo + czo.

But the right side is a constant, since we fixed the point ( xo, yo, zo) corresponding to Xp. Thus we obtain, as in Chapter 1, the Cartesian equation of any plane in Ii 3 . We summarize this discussion in the following:

u,v

Definition/Theorem 2.4.6: Suppose that E Ii 3 are non-parallel vectors, and [1 is the plane + Span( {u, v} ), where = (xo,yo, zo) and (xo, yo, zo) is an arbitrary point in Ii 3 . Then IT has a Cartesian equation:

xp

xp

ax + by + cz = d = axo + byo + czo, where = ( a, b, c) is a normal vector to Span( {u, v} ). Thus, is also normal to all the translates of Span( {u, v} ). The plane IT passes through the origin if and only if d = 0. We will still refer to L = Span( {n}) as the normal line to II. Thus, all the translates of Span( {u, v}) have the same normal line L. Consequently, two distinct planes II1 and IT2 are parallel to each other if and only if their Cartesian equations can be written using the same normal vector:

n

n

II1: ax+by+cz

= d1 and II2: ax+by+cz

Section 2.4 The Dot Product and Orthogonality

= d2, whered1

* d2. 159

In the Exercises, we will see a more efficient way to find the Cartesian equation of a plane, by using the cross product of the vectors and This is a technique seen in any Multivariable Calculus course.

u

v.

The Cauchy-Schwarz Inequality Recall that we defined the angle 0 between two non-zero vectors in fonnula:

IR(. 2

or

3 IR(.

using the

But since cos(0) is between -1 and 1, we have: --+

1< - -

--+

uov o and v v = llvll2 > o.

Case 2: Suppose now that u 0

0

Let us construct the linear combination: --+

--+

--+

w = ru +sv,

--+

....

where rands are any two scalars, possibly 0. Thus, w could be On, so the best that we can say is that II II =:>:0, which we will write as:

w

os

2

11w11=

ww O

= (ru + sv) 0 (ru + sv) = (ru)

0

(ru) + (ru)

0

(sv) + (sv)

0

(ru) + (sv)

0

(sv)

= r2 (u Ou)+ 2rs(u v) + s2 (v v). O

Since this is true for any rands,

160

O

let us first substitutes =

uo u.Then we get:

Section 2.4 The Dot Product and Orthogonality

0 s r 2 (u Ou)+ 2r(u O u)(u O v) + (u O u)2cvo v), and since

u u= llu11is positive, 2

o

0

Finally, we substitute r = -(u

o

we can divide it out to obtain the equivalent inequality:

s r 2 + 2r(u v) + (u u) (v v). O

O

O

v), and we get:

o s cuo v)2 -

2cuv)Cu v)+ euu)(v v), o

o

o

o

which simplifies to:

Since both sides are non-negative, by taking square roots, we equivalently obtain:

lu vis llullllvll- ■ O

Angles and Orthogonality Thanks to the Cauchy-Schwarz Inequality, we can define the angle between any two vectors in ~n:

Definition: If

u,v

E !RI. n are

non-zero vectors, we define the angle 0 between

cos(O)

-+ -+ = lllli°ll;II'

where O S 0 S re. Furthermore, we will say that if and only if O = 0.

uv

uand vare orthogonal

-+

Consequently, the zero vector OII is orthogonal to all vectors in

uand vby:

to each other

!RI. n.

The definition for cos(0) makes sense because the Cauchy-Schwarz Inequality assures us that this quotient is between -1 and 1. Our convention for the zero vector is exactly the same as 2 what we had for IR/. and ~ 3 . Example: Let us find an approximation for the angle 0 between = ( 5, 2, -3, 9), even though it is impossible to visualize vectors in !RI.4 :

v

cos( 8 )

u = (3,-7,

4, 2) and

-+ -+ 3(5) - 7(2) + 4(-3) + 2(9) ll-+uull ll~v II = ---;:::::===::::::::::::::::===:::::::::::::::::--;:::::::===:::::======J9 + 49 + 16 + 4 ✓25 + 4 + 9 + 81 0

=

7

./78jTf9

= ~2

~

0.072657,

..,j7LOL.

and thus 0 ~ cos- 1(0. 072657) ~ 1. 4981 radians. 0 The following consequence of the Cauchy-Schwarz Inequality will be left as an Exercise:

Section 2.4 The Dot Product and Orthogonality

161

Theorem 2.4.8- The Triangle Inequality: For any two vectors and E rn1n:

u v

llu+ vii :s lluII+ llvll-

u,v u v

Its name comes from the fact that and + form the sides of a triangle, and this Theorem says that the sum of the lengths of two sides is at least the length of the third.

llu+ vii :s llull+ llvll

The Triangle Inequality:

We can say this in everyday language as "the shortest path between two points is along a straight line." Notice that we achieve equality if and are parallel and in the same direction, in which case the three vectors are colinear. In other words, we have a "degenerate" triangle.

u v

Distance Between Vectors The distance formula that we see in basic algebra can be generalized using the following:

Definition: lfu = (u,, u2, . .. , Un) and V = (v,, we define the distance between uand vas:

V2, ... ,

Vn) are vectors from IR

-:1=-

u,

Hint: Use the Triangle Inequality for vectors and rewrite

u,

u- was u- v+ v- w.

->

1.

Prove that On is the only vector in ~n that is orthogonal to itself. In other words, if _. v E ~nan d_.. v1sort hogona 1tov,_. t henv_. = _. On.

J.

Prove that if is orthogonal to all the vectors all the vectors in Span( {v1, v2, ... , Vk} ).

k.

Show that in any ~ n, the vectors in the standard basis {e 1, e2, ... , en} are mutually orthogonal to each other. This means that is orthogonal to if i -:1=-j. Show that if a line L is orthogonal to a plane TI1, then any plane TI2 which contains L is also orthogonal to TI1• Note that we want the two planes to be orthogonal to each other. Assume all objects are in ~ 3 .

1.

n

Section 2.4 The Dot Product and Orthogonality

ei

VJ,v2, ...,Vk,then

nis orthogonal

to

ej

167

m. n.

Show that if a line L is parallel to a plane TI, then any direction vector for L must be orthogonal to any normal vector for TI. Suppose that E ~ n are both unit vectors, and let 0 be the angle between them. Prove that llu - vii = 2 sin(0/2). Hint: review the half angle formulas from Trigonometry.

u,v

v

--+

We proved The Zero Factors Theorem in Section 1. 1, which says that k • = On --+ if and only if either k = 0 or v = 0 11• Use the dot product to prove directly that if --+ --+ -=t=011and k • = 011,then k = 0 (that is, without using Proof by Contradiction or Case-by-Case Analysis). 14. The Parallelogram Law states that the sum of the squares of the two diagonals of a parallelogram is equal to the sum of the squares of the four sides. a. On your paper, copy the Parallelogram Principle found on page 32. Be sure you include the labels of all the vectors involved. b. Rewrite The Parallelogram Law using the lengths of the vectors in the diagram. c. Prove the Law using the identity II 112 = o o.

v

v

w

d.

w w.

Bonus: use your computations in (c) to prove that UO V =

l (llii + vii2 - llu - vii2).

15. Motivating The Cross Product: In Exercise 4 above, we defined the cross product of --+ --+ 3 u = (u1, u2, u3) and v = (v1, v2, v3) E ~ as: --+

......

u xv

=

--+

(u2v3 - UJV2)

i-

---+

(u I V3 - UJV1

)j + (u1 v2

--+

- u2v1)

k.

We will see in this Exercise how this vector naturally appears in the Cartesian equation of the plane TI = Span({u, v} ), ifu and are not parallel to each other. Recall that (x, y, z) E TI (f and only if we can find two scalars, rands E ~, so that:

v

(x,y,z)

= r(u1,u2,u3)+s(v1,v2,v3).

Recall also that if a solution tor ands exist for this specific (x,y, z), then this solution will have to be unique. For the sake of simplicity, let us also assume that none of the coefficients of and are zero, and we will not be dividing by zero in any of the computations below. a. Separate the vector equation above into three separate equations. b. Use the Addition Method to eliminate r from the equations for x and y, and then solve for s. c. Use the Addition Method to eliminate r from the equations for y and z, and then solve for s. d. Set your answers to (b) and (c) equal to each other (since they are both the value of s, and s is unique), and clear the denominators of both sides of the equation. Group together the terms for x, for y, and for z. e. Every term in your answer to (d) should have u 2 as a common factor. Divide out u 2 and show that the resulting Cartesian equation for TI can be simplified as:

u v

+ (-u1V3 + U3V1 )y + (u1 V2 - U2V1 )z = Notice that the corresponding normal vector to the plane is exactly: (u2v3 - UJV2)x

n= 168

(u2v3 -

U3V2,-u1v3

+

U3V1, U1V2 -

u2v1)

=

o.

UXV.

Section 2.4 The Dot Product and Orthogonality

Section 2. 5 Orthogonal Complements A plane II through the origin in ~ 3 has equation ax + by + cz = 0, and we saw that this can be written as a dot product:

(a, b, c) o (x,y, z) = 0. Thus, every vector (x, y, z) on II is orthogonal to any vector on the normal line L = Span ( {( a, b, c)}). We will now generalize this idea:

Definition/Theorem 2.5.1: If Wis a subspace of the orthogonal complement of W, defined as:

w1-= is also a subspace of

{v

E

~n,

w1-(pronounced

then

~II IVOw = 0 for all wE W}

~ 11•

On,

--+

--+

ow

w W

Proof First, w1contains since 011 = 0 for all E (in fact, for all w1-is non-empty. Next, suppose that and are vectors in w1-. Thus:

v u v w= 0 and u w= o

We must show that

,v

so again

o

v+ uand rvare also vectors ,.!

in

0 for all

w

E

w

E ~

11 ).

Thus,

W.

w1-, for any r E

~-

Thus:

-+ -+ -+ --+ + --+) u ow = -+v o w + -+u o w = 0 + 0 = 0 £or al 1w E W,

v+ uis a vector in w1-.Similarly: (rv)0 w= r(v Ow) = r(O) = 0

Thus

"W perp"),

w1is closed under

for all

wE w.

addition and scalar multiplication, and is a subspace of~

n ■

Examples: Let us look at non-trivial subspaces of~ 2 and ~ 3 . Suppose that L, is the line in ~ 2 with equation y = ~ x. The vector w= ( 4, 3) is a member of L, so if = (x, y) E Lt, then we

v

must satisfy:

= (x, y) o ( 4, 3) = 4x + 3y,

0 =vow

or in other words, y = -

ix.

But we know from basic algebra that this is exactly the equation

of the line through the origin that is perpendicular to L. Now, let us go backwards. Suppose that L2 is the line y = line, so if (x, y)

E

j x.

The vector ( 3, -4) is on this

L2, then we must satisfy: 0

=

(x,y)

o

(3,-4)

=

3x-4y,

or in other words, y = ~ x, which is the equation of our original line L,. Thus:

Lt=

L2, and L1 = L,.

Similarly, suppose that L is the line in ~ 3 given by L = Span((-4, 3, 5)). = (x,y,z) EL\ we must satisfy:

v

Section 2.5 Orthogonal Complements

Then, if

169

0 = (x, y, z) ∘ (−4, 3, 5) = −4x + 3y + 5z,

which is the equation of the plane Π through the origin orthogonal to L. Thus, L⊥ = Π.

Now, let's go backwards. Consider the plane Π with equation −4x + 3y + 5z = 0. It seems reasonable to guess that Π⊥ = L, but let us prove this guess. The vectors:

w1 = (3, 4, 0), and w2 = (5, 0, 4)

are both on Π, so if (x, y, z) ∈ Π⊥, then we must satisfy the two equations:

0 = (x, y, z) ∘ (3, 4, 0) = 3x + 4y, and
0 = (x, y, z) ∘ (5, 0, 4) = 5x + 4z.

This is a homogeneous system of equations! Thus, (x, y, z) is in the nullspace of the matrix:

[ 3  4  0 ]
[ 5  0  4 ],

which has rref:

[ 1  0   4/5 ]
[ 0  1  −3/5 ].

We have one free variable, and so (x, y, z) ∈ Span({(−4/5, 3/5, 1)}) = Span({(−4, 3, 5)}). Thus, we have indeed proven that:

L⊥ = Π, and Π⊥ = L. □

The next Theorem generalizes these two Examples. The proof will be left as an Exercise.

Theorem 2.5.2: Suppose that L1 = Span({(a, b)}) is a line in ℝ² through the origin, and L2 = Span({(b, −a)}) is the line through the origin perpendicular to L1. Then:

L1⊥ = L2, and L2⊥ = L1.

Similarly, if L = Span({(a, b, c)}) is a line in ℝ³ through the origin, and Π is the plane through the origin with Cartesian equation ax + by + cz = 0, then:

L⊥ = Π, and Π⊥ = L.

Orthogonal Complements in ℝ² and ℝ³:
L1 = Span({(a, b)}) and L2 = Span({(b, −a)});  L = Span({(a, b, c)}) and Π : ax + by + cz = 0.
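As a quick numerical illustration of this Theorem (a sketch only, assuming NumPy; the line below is a sample choice), we can check that the direction vector (b, −a) of L2 is orthogonal to the direction vector (a, b) of L1:

    import numpy as np

    a, b = 3.0, 4.0              # sample line L1 = Span({(a, b)})
    d1 = np.array([a, b])        # direction vector of L1
    d2 = np.array([b, -a])       # direction vector of L2, as in Theorem 2.5.2

    print(np.dot(d1, d2))        # 0.0, so every multiple of (b, -a) lies in L1-perp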

The final part of our previous Example can also be generalized to give us a significant shortcut to find all the vectors in W⊥. It was proven in Exercise 13 (j) of Section 2.4:

Theorem 2.5.3: If W = Span({w1, w2, ..., wk}) ⊆ ℝⁿ, then:

W⊥ = { v ∈ ℝⁿ | v ∘ wi = 0 for all i = 1 ... k }.

Example: Suppose S = {(1, −2, −5, −4), (3, 1, −1, 9), (5, −3, −11, 1)} and W = Span(S) ⊆ ℝ⁴. Let us find a basis for W⊥. We want to find all vectors (x1, x2, x3, x4) so that:

(1, −2, −5, −4) ∘ (x1, x2, x3, x4) = 0,
(3, 1, −1, 9) ∘ (x1, x2, x3, x4) = 0, and
(5, −3, −11, 1) ∘ (x1, x2, x3, x4) = 0.

In other words, we want to solve the system:

  x1 − 2x2 −  5x3 − 4x4 = 0
 3x1 +  x2 −   x3 + 9x4 = 0
 5x1 − 3x2 − 11x3 +  x4 = 0.

This is a homogeneous system of equations, so the vectors that we want are precisely the vectors in nullspace(A), where A is the coefficient matrix:

    [ 1  −2   −5  −4 ]              [ 1  0  −1  2 ]
A = [ 3   1   −1   9 ],  with rref  [ 0  1   2  3 ].
    [ 5  −3  −11   1 ]              [ 0  0   0  0 ]

The leading variables are x1 and x2, and the free variables are x3 and x4. Sight-reading the nullspace, we need two vectors, one for x3 and one for x4, and we get as a basis for nullspace(A):

(1, −2, 1, 0) and (−2, −3, 0, 1).

We can easily check by taking dot products that both vectors are orthogonal to every vector in S = {(1, −2, −5, −4), (3, 1, −1, 9), (5, −3, −11, 1)}, so we have confidence in our basis. For example, taking the dot product of the first vectors of each set, we get:

(1, −2, 1, 0) ∘ (1, −2, −5, −4) = 1 + 4 − 5 + 0 = 0,

and similarly for the five other pairs. Thus, we can conclude that:

W⊥ is a 2-dimensional space with basis {(1, −2, 1, 0), (−2, −3, 0, 1)}. □
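If you wish to verify this Example with software, the following is a minimal sketch, assuming Python with SymPy is available (it is a numerical check only, not part of the text's hand computation):

    from sympy import Matrix

    # Coefficient matrix whose rows are the three spanning vectors of W.
    A = Matrix([
        [1, -2,  -5, -4],
        [3,  1,  -1,  9],
        [5, -3, -11,  1],
    ])

    # nullspace() returns a basis of nullspace(A), which is W-perp by Theorem 2.5.3.
    basis = A.nullspace()
    print([list(v) for v in basis])    # [[1, -2, 1, 0], [-2, -3, 0, 1]]

    # Each basis vector is orthogonal to every row of A.
    print(all(all(entry == 0 for entry in A * v) for v in basis))   # True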

We were able to describe W⊥ by finding the nullspace of a coefficient matrix A in this Example. This will be true in general, but before we can prove this, we need a new point of view:

A Dot Product Perspective of Matrix Multiplication

One of the beauties of Mathematics is that we can sometimes look at the same object in different ways. The matrix product Ax is a good example. We first defined Ax as a linear combination of the columns of A using the coefficients from x. Let us spell it out:

     [ a1,1  a1,2  ...  a1,n ] [ x1 ]        [ a1,1 ]        [ a1,2 ]              [ a1,n ]
Ax = [ a2,1  a2,2  ...  a2,n ] [  :  ]  = x1 [ a2,1 ]  + x2  [ a2,2 ]  + ... + xn  [ a2,n ]
     [  :     :          :   ] [  :  ]       [  :   ]        [  :   ]              [  :   ]
     [ am,1  am,2  ...  am,n ] [ xn ]        [ am,1 ]        [ am,2 ]              [ am,n ]

However, notice that the top entry of Ax can be written as r1 ∘ x, where r1 is the first row of A. Similarly, we can see that the second entry is r2 ∘ x. Continuing in this way, we get:

     [ r1 ∘ x ]
Ax = [ r2 ∘ x ]
     [   :    ]
     [ rm ∘ x ]

From this, we can see that:

Theorem 2.5.4: A vector x ∈ ℝⁿ is a solution to Ax = 0m if and only if x ∘ ri = 0 for each row ri of A. In other words, x is in the nullspace of A if and only if x is orthogonal to all the rows of A. Thus:

If W = rowspace(A), then W⊥ = nullspace(A). Similarly, if U = nullspace(A), then U⊥ = rowspace(A).
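Here is a minimal sketch of this row-by-row point of view, assuming NumPy (the matrix A and vector x below are made up only for the illustration):

    import numpy as np

    A = np.array([[1., -2., 0.],
                  [3.,  1., 4.]])
    x = np.array([2., 1., -1.])

    # Column view: Ax as a linear combination of the columns of A.
    col_view = x[0]*A[:, 0] + x[1]*A[:, 1] + x[2]*A[:, 2]

    # Row view: each entry of Ax is the dot product of a row of A with x.
    row_view = np.array([np.dot(A[0], x), np.dot(A[1], x)])

    print(A @ x, col_view, row_view)   # all three agree: [0. 3.]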

This last Theorem shows us the relationship between rowspace(A) and nullspace(A), and also gives us an efficient algorithm to describe the orthogonal complement of a subspace:

Theorem 2.5.5: Suppose W = Span(S) ⊆ ℝⁿ, where S = {w1, w2, ..., wk} ⊂ ℝⁿ. If we form the matrix B with rows w1, w2, ..., wk, then:

W = rowspace(B) and W⊥ = nullspace(B).

Thus, the non-zero rows of the rref of B form a basis for W, and we can obtain a basis for W⊥ exactly as we would find a basis for nullspace(B) using the rref of B.

Note: This is the only place in this book where we assemble vectors into the rows of a matrix. However, this could still be confusing. We can rephrase the previous Theorem using columns and the transpose operation, if this helps:

Theorem 2.5.6: Suppose W = Span(S) ⊆ ℝⁿ, where S = {w1, w2, ..., wk} ⊂ ℝⁿ. If we form the matrix A = [S] with columns w1, w2, ..., wk, as usual, then:

W = colspace(A) = rowspace(Aᵀ) and W⊥ = nullspace(Aᵀ).

Example: Let us consider the subspace W = Span(S) ⊆ ℝ⁵, where:

S = {w1, w2, w3, w4} = {(3, −2, 1, 5, 0), (5, −3, 2, 6, 1), (−8, 3, −5, 3, −7), (4, 1, 0, −2, 3)}.

Notice that there are four vectors in S. Our main objective is to find a basis for W⊥. To do this, we assemble the four vectors that generate W into the rows of a matrix, as prescribed by our Theorem:

    [  3  −2   1   5   0 ]                  [ 1  0  0    2/5   2/5 ]
B = [  5  −3   2   6   1 ]   with rref  R = [ 0  1  0  −18/5   7/5 ]
    [ −8   3  −5   3  −7 ]                  [ 0  0  1  −17/5   8/5 ]
    [  4   1   0  −2   3 ]                  [ 0  0  0     0     0  ].

The leading variables are x1, x2 and x3, and the free variables are x4 and x5. Again, let us sight-read the nullspace. We will need two vectors, corresponding to x4 and x5:

W⊥ = nullspace(B) = Span({(−2/5, 18/5, 17/5, 1, 0), (−2/5, −7/5, −8/5, 0, 1)}), or
W⊥ = Span({(−2, 18, 17, 5, 0), (−2, −7, −8, 0, 5)}),

by clearing the denominators, as we did in the previous Example. Thus, W⊥ is 2-dimensional.

There is, however, a bonus outcome from the rref. Since we assembled the vectors from S into the rows of B, then W = Span(S) = rowspace(B). But we saw in Section 2.3 that the non-zero rows of R form a basis for rowspace(B). But notice that there are only three non-zero rows of R. Thus, we can more efficiently say that:

W has basis {(1, 0, 0, 2/5, 2/5), (0, 1, 0, −18/5, 7/5), (0, 0, 1, −17/5, 8/5)}, or
W has basis {(5, 0, 0, 2, 2), (0, 5, 0, −18, 7), (0, 0, 5, −17, 8)},

again, by clearing denominators. Thus, W is only 3-dimensional.

Let us think about this some more: Since W is 3-dimensional, this means that the original Spanning set S, which consists of four vectors, has to be dependent, by the Dependent Sets from Spanning Sets Theorem. It is far from obvious, though, how the four vectors are related to each other. The rref R only tells us the dependency relationships of the columns of B, but unfortunately, not its rows. If we really want to know how the four original vectors depend on each other, we would need to use The Minimizing Theorem and assemble the vectors in S into the columns of a matrix, say:

          [  3   5  −8   4 ]                  [ 1  0   9  0 ]
          [ −2  −3   3   1 ]                  [ 0  1  −7  0 ]
A = Bᵀ =  [  1   2  −5   0 ]   with rref  R = [ 0  0   0  1 ]
          [  5   6   3  −2 ]                  [ 0  0   0  0 ]
          [  0   1  −7   3 ]                  [ 0  0   0  0 ].

Now we can see that w3 is a linear combination of the first two vectors: w3 = 9w1 − 7w2. Thus, by the Minimizing Theorem, S1 = {w1, w2, w4} is another basis for W. □
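The whole Example can be checked with a short SymPy sketch (assuming SymPy is installed; this is only a verification of the hand computations above):

    from sympy import Matrix

    # The four spanning vectors of W, assembled as the rows of B.
    B = Matrix([
        [ 3, -2,  1,  5,  0],
        [ 5, -3,  2,  6,  1],
        [-8,  3, -5,  3, -7],
        [ 4,  1,  0, -2,  3],
    ])

    # Basis for W: the non-zero rows of rref(B).  Basis for W-perp: nullspace(B).
    R, pivots = B.rref()
    print(R)                                 # rows (1,0,0,2/5,2/5), (0,1,0,-18/5,7/5), (0,0,1,-17/5,8/5), and a zero row
    print([list(v) for v in B.nullspace()])  # [[-2/5, 18/5, 17/5, 1, 0], [-2/5, -7/5, -8/5, 0, 1]]

    # The dependency of the original vectors appears in the rref of B-transpose:
    # its third column is 9*(column 1) - 7*(column 2), i.e. w3 = 9*w1 - 7*w2.
    print(B.T.rref()[0])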

You will prove the following properties of orthogonal complements in the Exercises.

Theorem 2.5.7 - Properties of Orthogonal Complements: For any subspace W ⊆ ℝⁿ:

a) W ∩ W⊥ = {0n}, and b) (W⊥)⊥ = W.

Thus, we can say that W and W⊥ are orthogonal complements of each other, or that W and W⊥ form an orthogonal pair of subspaces. The Dimension Theorem for Matrices can be used to prove an analogous statement concerning a subspace W and its orthogonal complement W⊥. Its proof is also left as an Exercise:

Theorem 2.5.8 - The Dimension Theorem for Orthogonal Complements: If W is a subspace of ℝⁿ with orthogonal complement W⊥, then: dim(W) + dim(W⊥) = n.

Example: In our most recent Example, we saw W = Span(S) ⊆ ℝ⁵, where:

S = {w1, w2, w3, w4} = {(3, −2, 1, 5, 0), (5, −3, 2, 6, 1), (−8, 3, −5, 3, −7), (4, 1, 0, −2, 3)}.

Although S has four vectors, we found out that dim(W) = 3 only, but we also saw that dim(W⊥) = 2. Since both are subspaces of ℝ⁵, this confirms the Dimension Theorem: dim(W) + dim(W⊥) = 3 + 2 = 5 = n. □

Using dim(W) to Find Other Bases for W

Knowing the dimension of a subspace W allows us to more efficiently check whether or not a subset B of W is a basis for W:

Theorem 2.5.9 - The "Two for the Price of One" or "Two-for-One" Theorem for Bases: Suppose W is a subspace of ℝⁿ, and dim(W) = k. Let B = {w1, w2, ..., wk} be any subset of k vectors from W. Then: B is a basis for W if and only if either B is linearly independent or B Spans W. In other words, it is necessary and sufficient to check B for only one condition without checking the other, if B already contains the correct number of vectors.

Proof: (⇒) This direction is obvious because, by definition, a basis B is linearly independent and Spans W.

If |c| > 1, this operator corresponds to a dilation operator, and if |c| < 1, to a contraction operator. Furthermore, if c < 0, the operator also produces a reflection across the y-axis in the first case, and across the x-axis in the second case.

Example: Let T be the operator with standard matrix:

[T] = [ 2  0 ]
      [ 0  1 ].

Recalling that T(i) is in column 1, and T(j) is in column 2, we get:

T(i) = (2, 0) = 2i, T(j) = (0, 1) = j, and T(i + j) = 2i + j = (2, 1),

by applying the Additivity Property. The effect on the basic box is shown below:

The Action of a Horizontal Dilation

Notice that the box has been dilated or "stretched" horizontally by a factor of 2. The vertical unit vector j is not affected by T. □

1

Example: Similarly, suppose Tis the Type 1 operator with [T] = [ --->

--->

--->

This time, T( i) = i, TU) = -

2--+

3

},

O ] . 0 -2/3

and the basic box is contracted or "shrunk" vertically by a

factor of 2/3 and reflected across the x-axis: y

y i +j

T(i)

T

j

X

1

T(j)

-2/3

X

---, ___.,

T(i+j)

The Action of A Vertical Contraction Combined with a Reflection.

0

Shear Operators

A 2 × 2 Type 3 elementary matrix has the form:

[ 1  c ]      or      [ 1  0 ]
[ 0  1 ]              [ c  1 ].

In the first case, the unit vector i is not affected, but T(j) = ci + j, so the image of j is now leaning to the right or left, depending on whether c is positive or negative. Because of this, the first kind is called a horizontal shear operator. In the second case, j is not affected, but the image of i is now tilting up or down, so the second kind is called a vertical shear operator.

Example: Let T be the operator with:

[T] = [ 1  −3/4 ]
      [ 0    1  ].

Again, we see that T(i) = i, T(j) = (−3/4, 1), and T(i + j) = (1, 0) + (−3/4, 1) = (1/4, 1). The effect on the basic box is shown below:

The Action of a Horizontal Shear Transformation. □
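A small NumPy sketch (assuming NumPy is available; the corner list below is just the basic box) makes it easy to see how these elementary matrices act on the basic box:

    import numpy as np

    # Corners of the basic box: 0, i, j, i+j, stored as columns.
    box = np.array([[0., 1., 0., 1.],
                    [0., 0., 1., 1.]])

    dilation = np.array([[2., 0.],
                         [0., 1.]])       # horizontal dilation by a factor of 2
    shear    = np.array([[1., -0.75],
                         [0.,  1.]])      # horizontal shear with c = -3/4

    print(dilation @ box)   # images: 0, 2i, j, 2i + j
    print(shear @ box)      # images: 0, i, (-3/4, 1), (1/4, 1)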

The counterclockwise rotation of ℝ² by an angle θ about the origin is a linear transformation, with:

[rotθ] = [ cos(θ)  −sin(θ) ]
         [ sin(θ)   cos(θ) ].

Example: Let us find the matrix of the counterclockwise rotation by the angle θ = cos⁻¹(3/5) ≈ 53° about the origin. We have:

cos(θ) = 3/5 and sin(θ) = 4/5, thus:

[rotθ] = [ 3/5  −4/5 ]
         [ 4/5   3/5 ].

To demonstrate its action, let us find rotθ((2, 7)):

rotθ((2, 7)) = [ 3/5  −4/5 ] [ 2 ]  =  [ −4.4 ]
               [ 4/5   3/5 ] [ 7 ]     [  5.8 ].

We graph (2, 7) and rotθ((2, 7)) = (−4.4, 5.8) below, and observe that their lengths are the same but rotθ((2, 7)) is rotated counterclockwise from (2, 7) by about 53°:

The Vector v = (2, 7) and its Rotation rotθ(v) by θ = cos⁻¹(3/5). □
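The same computation can be repeated numerically with a short NumPy sketch (a verification only, assuming NumPy is available):

    import numpy as np

    theta = np.arccos(3/5)                        # about 53 degrees
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])

    v = np.array([2., 7.])
    w = R @ v
    print(w)                                      # [-4.4  5.8]
    print(np.linalg.norm(v), np.linalg.norm(w))   # both are sqrt(53) = 7.2801..., so the length is preserved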

Basic Projections and Reflections in ℝ²

We can perform a variety of projection and reflection operators on a vector (x, y) ∈ ℝ². These operators, and their counterparts in ℝ³ which we will see later, have important applications in computer graphics.

The projection of v = (x, y) onto the x-axis, denoted projx(v), is the vector (x, 0):

projx((x, y)) = (x, 0),

and similarly, the projection of v onto the y-axis is the vector (0, y):

projy((x, y)) = (0, y).

We see their geometric interpretation below:

The Relationships Among v, projx(v), and projy(v)

The key relationship among these vectors is seen in the right triangle that they form:

v = (x, y) = (x, 0) + (0, y) = projx(v) + projy(v),

where projx(v) is orthogonal to projy(v). This is an example of what is called an orthogonal decomposition, and the rest of our examples will involve this concept as well. As with rotations, we will show that these are indeed linear transformations by finding their standard matrices. But since their actions are very simple, it is easy to check by direct multiplication that:

projx(v) = [ x ] = [ 1  0 ] [ x ]         projy(v) = [ 0 ] = [ 0  0 ] [ x ]
           [ 0 ]   [ 0  0 ] [ y ],  and              [ y ]   [ 0  1 ] [ y ].

Thus projx and projy are linear transformations, with:

[projx] = [ 1  0 ]         [projy] = [ 0  0 ]
          [ 0  0 ],  and             [ 0  1 ].

v

Similarly, we can take and reflect it across the x-axis, the y-axis, or the origin (in the same way that we reflect graphs of functions): y

X

A Vector

vand its Three Basic Reflections in IR{2.

We compute these operators, and see their standard matrices, via:

1

0

0 -1

-1 0 0

1

-1

0

0 -1

][:J

][:J

and

][;J

Notice that [ reflx] and [ refly] are both 2 x 2 Type 1 elementary matrices, with c = -1, and [ refl-02] = S_1, which represents scalar multiplication by -1. The reflection operators across the x- and y-axes can be related to the projection operators through the following diagrams: y

-projx(v)

y

projx (v)

1pro}y(V) X X

l-proj,(V)

The Geometric Relationships Among v, pro}x(v), projy(v), reflx(v) and refly(v)

Section 3.2 Rotations, Projections and Reflections

217

From these, we see that:

re.flx(v) = pro}x(v) - pro}y(v), and rejly(v) = pro}y(v) - pro}x(v).

General Projections and Reflections in ℝ²

The x-axis and y-axis are two orthogonal lines that pass through the origin. More generally, if L is any line through the origin in ℝ², there is a unique line L⊥, also passing through the origin, that is orthogonal to L (recall from Chapter 2 that we call L⊥ the orthogonal complement of L). We can define the projection operators onto L and L⊥ and the reflection operator across L by the following vector diagrams:

The Projections of v Onto a Line L and its Orthogonal Complement L⊥, and the Reflection of v Across L.

It was easy to find projx(v) and projy(v) if we knew v = (x, y), but for a random line L and its orthogonal complement L⊥, these projections are not that obvious. However, our goal is to satisfy the equation:

v = projL(v) + projL⊥(v),

where projL(v) is parallel to L, and projL⊥(v) is parallel to L⊥. Once we find these two projections, we can find the reflection across L via:

reflL(v) = projL(v) − projL⊥(v),

as seen from the diagram. Let us demonstrate how to find these three vectors.

Example: Let L be the line in ℝ² with Cartesian equation y = (2/3)x. The vector (3, 2) is parallel to L, and since the vector (−2, 3) is orthogonal to (3, 2), as we easily check with the dot product, (−2, 3) must be parallel to L⊥. Let v = (x, y) be any vector in ℝ². From the diagram, we want projL(v) to be parallel to L, and projL⊥(v) parallel to L⊥.

Thus:

projL(v) = a(3, 2), and projL⊥(v) = b(−2, 3),

for some scalar multiples a and b. However, we want:

v = projL(v) + projL⊥(v).

Using v = (x, y), we get:

(x, y) = a(3, 2) + b(−2, 3).

This vector equation is equivalent to the linear system:

3a − 2b = x
2a + 3b = y.

By eliminating b, we can solve for a:

9a − 6b = 3x
4a + 6b = 2y    ⟹    13a = 3x + 2y, or a = (3/13)x + (2/13)y.

Similarly, by eliminating a, we get b = −(2/13)x + (3/13)y. Thus, we get:

projL(v) = a(3, 2) = ((3/13)x + (2/13)y)(3, 2) = ((9/13)x + (6/13)y, (6/13)x + (4/13)y), and
projL⊥(v) = b(−2, 3) = (−(2/13)x + (3/13)y)(−2, 3) = ((4/13)x − (6/13)y, −(6/13)x + (9/13)y).

From these, we can see that these projections are indeed operators, with matrices:

[projL] = [ 9/13  6/13 ]         [projL⊥] = [  4/13  −6/13 ]
          [ 6/13  4/13 ],  and              [ −6/13   9/13 ].

Finally, we can find the reflection across L via:

reflL(v) = projL(v) − projL⊥(v)
         = ((9/13)x + (6/13)y, (6/13)x + (4/13)y) − ((4/13)x − (6/13)y, −(6/13)x + (9/13)y)
         = ((5/13)x + (12/13)y, (12/13)x − (5/13)y),

and thus reflection across L is indeed an operator, with:

[reflL] = [  5/13  12/13 ]
          [ 12/13  −5/13 ].

Let us demonstrate the result of these three operators on v = (4, 1). We get:

projL(v) = (42/13, 28/13) ≈ (3.23, 2.15),   projL⊥(v) = (10/13, −15/13) ≈ (0.77, −1.15),   and   reflL(v) = (32/13, 43/13) ≈ (2.46, 3.31).

Let us put these all together in the following diagram:

Projections and Reflections with respect to L : y = (2/3)x
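The three matrices of this Example can also be produced with a short NumPy sketch. Note that it uses the outer-product form d dᵀ/(d ∘ d) of the projection matrix (the general formula of Exercise 9 below) rather than the system-solving method of the Example, so treat it as an independent check:

    import numpy as np

    d = np.array([3., 2.])                   # direction vector of L: y = (2/3)x
    P_L    = np.outer(d, d) / np.dot(d, d)   # [projL] = d d^T / (d . d)
    P_perp = np.eye(2) - P_L                 # [projL-perp], since projL(v) + projL-perp(v) = v
    refl_L = P_L - P_perp                    # [reflL] = [projL] - [projL-perp]

    v = np.array([4., 1.])
    print(P_L @ v, P_perp @ v, refl_L @ v)   # approx. (3.23, 2.15), (0.77, -1.15), (2.46, 3.31)
    print(refl_L * 13)                       # [[ 5. 12.] [12. -5.]], matching [reflL] above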

Projections and Reflections in ℝ³

We can also define basic projection and reflection operators in ℝ³, but we now have more varieties. Notice that in ℝ², the lines L through the origin are the non-trivial subspaces of ℝ². But for ℝ³, the non-trivial subspaces are lines L through the origin as well as planes Π passing through the origin. The simplest subspaces are thus the x-, y- and z-axes, and the xy-, yz-, and xz-planes, and so for v = (x, y, z), we can define their projection operators:

projx((x, y, z)) = (x, 0, 0),      projxy((x, y, z)) = (x, y, 0),
projy((x, y, z)) = (0, y, 0),      projxz((x, y, z)) = (x, 0, z),  and
projz((x, y, z)) = (0, 0, z),      projyz((x, y, z)) = (0, y, z).

These projections are connected by the following relationships:

v = (x, y, z) = (x, 0, 0) + (0, y, z) = projx((x, y, z)) + projyz((x, y, z)),
v = (x, y, z) = (0, y, 0) + (x, 0, z) = projy((x, y, z)) + projxz((x, y, z)),  and
v = (x, y, z) = (0, 0, z) + (x, y, 0) = projz((x, y, z)) + projxy((x, y, z)).

Note that the z-axis is the orthogonal complement of the xy-plane, and so on. Once again, we obtain an orthogonal decomposition, as we verify that (x, y, 0) ∘ (0, 0, z) = 0, and so on. The relationships among v, projxy(v) and projz(v) can be visualized by imagining the sun to be directly overhead at high noon: if v is an actual arrow anchored to the origin, then projxy(v) would be the shadow that v makes on the ground:

A Basic Orthogonal Decomposition in ℝ³

Let us look next at the reflections across the coordinate planes. To do this, pretend, for instance, that the xy-plane is a mirror. The reflection of (x, y, z) across the xy-plane is thus (x, y, −z). But notice that we can write:

reflxy((x, y, z)) = (x, y, −z) = (x, y, 0) − (0, 0, z) = projxy((x, y, z)) − projz((x, y, z)).

Again, since the z-axis is the orthogonal complement of the xy-plane, this equation is analogous to the equation from ℝ²:

reflL(v) = projL(v) − projL⊥(v).

Now, by reversing the roles of the xy-plane and the z-axis, we can define the reflection across the z-axis in ℝ³ via:

reflz((x, y, z)) = projz((x, y, z)) − projxy((x, y, z)) = (0, 0, z) − (x, y, 0) = (−x, −y, z).

We can now summarize the six basic reflection operators:

reflx((x, y, z)) = (x, −y, −z),      reflxy((x, y, z)) = (x, y, −z),
refly((x, y, z)) = (−x, y, −z),      reflxz((x, y, z)) = (x, −y, z),  and
reflz((x, y, z)) = (−x, −y, z),      reflyz((x, y, z)) = (−x, y, z).

We note that the matrices of the reflections across the coordinate planes are 3 × 3 Type 1 matrices with c = −1, analogous to what we saw in ℝ². We will see in the Exercises that Type 2 elementary matrices represent reflection operators in ℝ² and ℝ³.

From our discussion above, it makes sense to investigate projections and reflections as they relate to an arbitrary plane Π together with its normal line L. If v = (x, y, z) ∈ ℝ³, we want to produce the projection operators onto Π and L in order to express v as an orthogonal decomposition:

v = projΠ(v) + projL(v), where projΠ(v) ∈ Π and projL(v) ∈ L.

We must show that this sum can be constructed in exactly one way. From this, we get the reflection operators:

reflΠ(v) = projΠ(v) − projL(v), and reflL(v) = projL(v) − projΠ(v) = −reflΠ(v).

Let us illustrate these computations:

Example: Let Π be the plane in ℝ³ with Cartesian equation:

3x − 2y + 5z = 0.

Π has normal vector n = (3, −2, 5), and normal line L = Span({n}). We will show that projΠ, projL and reflΠ are all operators by finding their standard matrices (leaving reflL as an Exercise). Let v = (x, y, z) be any vector in ℝ³. We see below the vectors that we are looking for:

The Relationships Among v, projΠ(v), projL(v) and reflΠ(v)

We will use a different strategy from that in our Example in ℝ². Let us begin with projL(v). This vector is parallel to n, so:

projL(v) = k(3, −2, 5) = (3k, −2k, 5k),

for some scalar multiple k. For now, let us assume that projΠ(v) actually exists, but we will verify later on that this assumption is justified. Since v = projΠ(v) + projL(v), we must have:

projΠ(v) = v − projL(v) = (x, y, z) − (3k, −2k, 5k) = (x − 3k, y + 2k, z − 5k).

But since projΠ(v) ∈ Π, this vector must be orthogonal to n, so the correct value of k must satisfy the equation:

0 = n ∘ projΠ(v)
  = (3, −2, 5) ∘ (x − 3k, y + 2k, z − 5k)
  = 3(x − 3k) − 2(y + 2k) + 5(z − 5k)
  = 3x − 2y + 5z − (9 + 4 + 25)k.

From this, we find k = (3x − 2y + 5z)/38 as the only possible solution. Thus, we get:

projL(v) = (3k, −2k, 5k)
         = ( 3(3x − 2y + 5z)/38, −2(3x − 2y + 5z)/38, 5(3x − 2y + 5z)/38 )
         = ( (9x − 6y + 15z)/38, (−6x + 4y − 10z)/38, (15x − 10y + 25z)/38 ).

Consequently, we also get:

projΠ(v) = v − projL(v)
         = (x, y, z) − ( (9x − 6y + 15z)/38, (−6x + 4y − 10z)/38, (15x − 10y + 25z)/38 )
         = ( (29x + 6y − 15z)/38, (6x + 34y + 10z)/38, (−15x + 10y + 13z)/38 ).

We will now check that we were justified in assuming that projΠ(v) exists by checking that the final vector above is on Π, that is:

n ∘ projΠ(v) = 3( (29x + 6y − 15z)/38 ) − 2( (6x + 34y + 10z)/38 ) + 5( (−15x + 10y + 13z)/38 )
             = (1/38)(87x + 18y − 45z − 12x − 68y − 20z − 75x + 50y + 65z) = 0,

for all x, y, and z. Thus, our solution is indeed correct. Next, we find the reflection operator across Π using our two projections:

reflΠ(v) = projΠ(v) − projL(v)
         = ( (29x + 6y − 15z)/38, (6x + 34y + 10z)/38, (−15x + 10y + 13z)/38 )
           − ( (9x − 6y + 15z)/38, (−6x + 4y − 10z)/38, (15x − 10y + 25z)/38 )
         = ( (10x + 6y − 15z)/19, (6x + 15y + 10z)/19, (−15x + 10y − 6z)/19 ).

Finally, we can assemble the matrices of the three operators:

           [  9/38   −6/38   15/38 ]              [  29/38   6/38  −15/38 ]              [  10/19   6/19  −15/19 ]
[projL] =  [ −6/38    4/38  −10/38 ],  [projΠ] =  [   6/38  34/38   10/38 ],  [reflΠ] =  [   6/19  15/19   10/19 ].
           [ 15/38  −10/38   25/38 ]              [ −15/38  10/38   13/38 ]              [ −15/19  10/19   −6/19 ]

Let us demonstrate how these matrices work with the sample vector v = (−4, −3, 5). We get:

projL(v) = (3/2, −1, 5/2),   projΠ(v) = (−11/2, −2, 5/2),   and   reflΠ(v) = (−7, −1, 0).

We show two perspectives of all these vectors in the diagrams below. The edgewise view on the right reminds us of the analogous diagram of projections and reflections in ℝ².

A Vector v, and its Projections on Π and L, and its Reflection Across Π. □
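A similar NumPy sketch (an independent check, assuming NumPy is available) reproduces the three matrices of this Example directly from the normal vector n:

    import numpy as np

    n = np.array([3., -2., 5.])              # normal vector of the plane 3x - 2y + 5z = 0
    P_L  = np.outer(n, n) / np.dot(n, n)     # projection onto the normal line L
    P_Pi = np.eye(3) - P_L                   # projection onto the plane Pi
    refl_Pi = P_Pi - P_L                     # reflection across Pi

    v = np.array([-4., -3., 5.])
    print(P_L @ v, P_Pi @ v, refl_Pi @ v)    # (1.5, -1, 2.5), (-5.5, -2, 2.5), (-7, -1, 0)
    print(refl_Pi * 19)                      # 19*[reflPi] = [[10, 6, -15], [6, 15, 10], [-15, 10, -6]]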

3.2 Section Summary

3.2.1: The rotation transformation rotθ : ℝ² → ℝ², which takes a vector v in standard position and rotates it counterclockwise by an angle of θ, is a linear transformation, with:

[rotθ] = [ cos(θ)  −sin(θ) ]
         [ sin(θ)   cos(θ) ].

In ℝ², we can define the projections of a vector v onto the x-axis and y-axis, its reflections across the x-axis and y-axis, and its reflection across the origin, as:

projx((x, y)) = (x, 0),    projy((x, y)) = (0, y),
reflx((x, y)) = (x, −y),   refly((x, y)) = (−x, y),   and   reflO((x, y)) = (−x, −y).

More generally, given any line L through the origin in ℝ² and its orthogonal complement L⊥, it is possible to take any vector v ∈ ℝ² and find its orthogonal decomposition v = projL(v) + projL⊥(v), where projL(v) is parallel to L, and projL⊥(v) is parallel to L⊥. The reflection across L can be defined using these projections as:

reflL(v) = projL(v) − projL⊥(v).

In ℝ³, we can define the six basic projection operators:

projx((x, y, z)) = (x, 0, 0),    projxy((x, y, z)) = (x, y, 0),
projy((x, y, z)) = (0, y, 0),    projxz((x, y, z)) = (x, 0, z),   and
projz((x, y, z)) = (0, 0, z),    projyz((x, y, z)) = (0, y, z).

Similarly, we can define the six basic reflection operators:

reflx((x, y, z)) = (x, −y, −z),    reflxy((x, y, z)) = (x, y, −z),
refly((x, y, z)) = (−x, y, −z),    reflxz((x, y, z)) = (x, −y, z),   and
reflz((x, y, z)) = (−x, −y, z),    reflyz((x, y, z)) = (−x, y, z).

More generally, given a random plane Π through the origin in ℝ³ and its orthogonal complement L, it is possible to take any vector v ∈ ℝ³ and find its orthogonal decomposition:

v = projΠ(v) + projL(v), where projΠ(v) ∈ Π and projL(v) ∈ L.

From this, we get the reflection operators:

reflΠ(v) = projΠ(v) − projL(v), and reflL(v) = projL(v) − projΠ(v).

3.2 Exercises

1. Rotation Matrices: For the following angles θ: (i) find the matrix of rotθ, the counterclockwise rotation by θ in ℝ²; (ii) compute rotθ(v) for v = (5, 3), providing both exact and approximate answers; and (iii) sketch v and rotθ(v) and check with a ruler and protractor that rotθ(v) has the same length as v but is rotated counterclockwise by θ.

a. rc/6

b. 2rc/3

c. sin- 1(3/5)

d. cos- 1 (-5/13) f.

2.

vbut is rotated

1

clockwise by

a. -2rc/3

b. -5rc/6

c. sin- 1(-20/29)

d. tan- 1 (-4/3)

e. sin- 1(15/17) - re

f.

1

-2cos-

C.

(-20/29).

=

vunder these three operators,

lx

b. y

5

y = _A_x

Hint: Use the Double Angle Formulas.

=

as shown in the Example in this Section.

1-x 4

d. y = _]_x

3

5

f. y = __ l_x

e. y = 3x

.ff

Projections and Reflections in ℝ³: For each of the given planes Π in ℝ³ and corresponding normal line L: (i) find the matrix of projL, projΠ, and reflΠ; (ii) compute the values of these three operators on v = (−5, 4, 7), providing both exact and approximate answers.

a. 4x + 2y − 3z = 0      c. 7x − 4y − 5z = 0      e. 2y − 7z = 0

226

101.

v

a. y

5.

Hint: Use the Half-Angle Identities.

Projections and Reflections in ℝ²: For each of the given lines L in ℝ²: (i) find the matrix of projL, projL⊥ and reflL; (ii) compute the values of these three operators on v = (3, 2), providing both exact and approximate answers; and (iii) sketch the graphs of L, L⊥, v and the images of v under these three operators, as shown in the Example in this Section.

4.

i sin- (21/29)

Clockwise Rotations: If θ < 0, the formula for [rotθ] is exactly the same, but the geometric effect is a clockwise rotation by |θ|. For the following θ: (i) find the matrix of rotθ; (ii) compute rotθ(v) for v = (5, 3), providing both exact and approximate answers; and (iii) sketch v and rotθ(v) and check with a ruler and protractor that rotθ(v) has the same length as v but is rotated clockwise by |θ|.

3.

v

v

b. 2x − 5y + 6z = 0      d. 3x + 5z = 0      f. 4x − 7y = 0

Note: (f) represents a plane in ℝ³, not a line in ℝ².

5. Find the matrix of the counterclockwise rotation in ℝ² by θ = π/2. Is this a 2 × 2 elementary matrix? Why or why not?

6. Find the matrix of reflL, the reflection across L for the normal line to the plane Π with Cartesian equation 3x − 2y + 5z = 0, as seen in the final Example of this Section. How is [reflL] related to [reflΠ]?

7. Find the matrix of reflL, the reflection across L for the normal line to the plane Π with Cartesian equation 2x − 5y + 6z = 0 from Exercise 4 (b).

8. Type 2 Elementary Matrices: We will see in this Exercise that Type 2 elementary matrices correspond to reflections in ℝ² or ℝ³. Let T be the operator with:

[T] = [ 0  1 ]
      [ 1  0 ].

Note that this is the only 2 × 2 Type 2 elementary matrix.
a. Find T(v) and T(w) for v = (5, 2) and w = (−3, 4).
b. Sketch the four vectors involved in (a), and the line y = x. Convince yourself that T(v) is a mirror-image of v across y = x, and likewise for w and T(w).
c. Now, find the standard matrices of projL, projL⊥ and reflL for the line in ℝ²: y = x. Which of these matrices corresponds to [T]?
d. Consider the 3 × 3 Type 2 elementary matrix:

[ 0  1  0 ]
[ 1  0  0 ]
[ 0  0  1 ].

Show that this is the matrix of the reflection in ℝ³, reflΠ, across the plane Π with Cartesian equation y = x, i.e. x − y = 0.
e. There are exactly two other 3 × 3 Type 2 elementary matrices. Find them, and for each, find the corresponding plane Π such that the reflection across Π has this matrix as its standard matrix.
f. Consider the 4 × 4 Type 2 elementary matrix:

[ 1  0  0  0 ]
[ 0  0  0  1 ]
[ 0  0  1  0 ]
[ 0  1  0  0 ].

If T is the corresponding operator, find an explicit formula for T((x1, x2, x3, x4)). Write a sentence describing in words what T does to any vector v ∈ ℝ⁴. (Since we cannot visualize ℝ⁴, we cannot see the effect of T on ℝ⁴, but we can still explain what T does to a vector.)

9. General Formulas in ℝ²: Suppose that v = (a, b) is a unit vector in ℝ², L is the line Span({v}), and L⊥ is the orthogonal complement of L. Prove that the matrices of projL, projL⊥ and reflL are given by:

[projL] = [ a²  ab ]      [projL⊥] = [  b²  −ab ]            [reflL] = [ a² − b²   2ab     ]
          [ ab  b² ],                [ −ab   a² ],  and                [  2ab      b² − a² ].

10. General Formulas in ℝⁿ:

In Exercise 16, you proved that if |X| = n, then |℘(X)| = 2ⁿ, which is a lot bigger than n. We can apply this same idea to an infinite set. For example, if X = ℕ, then ℘(ℕ) contains all the subsets containing one or more natural numbers, as well as the empty set. One of the great accomplishments of Georg Cantor, the father of Set Theory, is the proof that for any set X, including infinite sets:

We write |X| < |Y| and say that the cardinality of Y is strictly bigger than the cardinality of X. We say that |X| ≤ |Y| if either |X| < |Y| or |X| = |Y|. We can pronounce this as "the cardinality of X is less than or equal to the cardinality of Y," or "the cardinality of X is at most the cardinality of Y." This means that there exists a function f : X → Y which is one-to-one. Such a function may or may not be onto. In this case, we can also write |Y| ≥ |X| and say that the cardinality of Y is at least the cardinality of X.

4.3.2 - Countable and Uncountable Sets of Numbers: The sets of natural numbers, integers, and rational numbers are all countable:

|ℕ| = |ℤ| = |ℚ| = ℵ₀.

However, the set of real numbers, the set of irrational numbers, and all intervals of the real number line that contain at least two numbers are all uncountable and have cardinality c:

|ℝ| = |ℝ − ℚ| = |(a, b)| = |[a, b]| = |[a, b)| = |(a, b]| = c,

where a < b ∈ ℝ. Similarly, these infinite intervals also have cardinality c:

|(−∞, b)| = |(−∞, b]| = |(a, ∞)| = |[a, ∞)| = c.

In other words, any interval of real numbers containing at least two numbers has cardinality c.

4.3 Exercises

1. In this Section, we showed that ℕ and ℤ have the same cardinality by listing the members of ℤ as:

0, 1, −1, 2, −2, 3, −3, ..., n, −n, ...

Use this to explicitly construct a function f : ℕ → ℤ that is both one-to-one and onto. Hint: an easy way to do it would be to use a piecewise definition.

2. Show that the set of odd integers O = {..., −7, −5, −3, −1, 1, 3, 5, 7, ...} is countable by constructing an explicit bijection from ℕ to O.

3. Show that the set of even integers E = {..., −6, −4, −2, 0, 2, 4, 6, ...} is countable by constructing an explicit bijection from ℕ to E.

4. More generally, let m be a fixed positive integer. Show that the set of all multiples of m:

M = {..., −4m, −3m, −2m, −m, 0, m, 2m, 3m, 4m, ...}

is countable by constructing an explicit bijection from ℕ to M.

5. Suppose that X is a subset of Y. Prove that |X| ≤ |Y|. Hint: state the definition of this symbol and create an easy function f that satisfies the definition.

6. Suppose that X and Y are both countable sets, and assume for the sake of simplicity that X ∩ Y = ∅, that is, they have no element in common. Prove that X ∪ Y is also countable. Hint: list X and Y in a countable way and show how to list the elements of X ∪ Y also in a countable way.

7. Show that the set of irrational numbers ℝ − ℚ is also uncountable. Hint: Use Proof by Contradiction and the previous Exercise.

8. Suppose that X and Y are both finite subsets of some set Z.
a. Draw a Venn diagram of the set-up. Allow the possibility that X and Y intersect.
b. Show that you can label some parts of the diagram as X ∩ Y, X − Y, and Y − X. Recall that:

X − Y = { z ∈ Z | z ∈ X but z ∉ Y },

and analogously for Y − X.
c. Show that every element in X ∪ Y is in exactly one of the following subsets: X ∩ Y, X − Y, and Y − X.
d. Use (c) to prove that:

|X ∪ Y| = |X| + |Y| − |X ∩ Y|.

9. The Countability of the Rational Numbers: The purpose of this Exercise is to show that ℚ is countable, that is: |ℚ| = ℵ₀. Recall that every rational number a/b can be written as a fraction where b is as small as possible, i.e., in lowest terms. We will create an infinite table that will contain the members of ℚ in the order of increasing denominators. On the 1st row we see the members of ℤ (rational numbers with denominator 1), listed as a sequence as we saw in the Examples. On the 2nd row are the rational numbers with denominator 2, then on the 3rd row those with denominator 3, and so on.

     0     1    −1     2    −2     3    −3   ...
    1/2  −1/2   3/2  −3/2   5/2  −5/2   7/2  ...
    1/3  −1/3   2/3  −2/3   4/3  −4/3   5/3  ...
    1/4  −1/4   3/4  −3/4   5/4  −5/4   7/4  ...
     :     :     :     :     :     :     :

a. List the first 7 rational numbers on each of the next 3 rows of this table.

Now, in order to prove that ℚ is countable, we have to list all the rational numbers in one sequence, without repetition. The idea is to traverse this table in a diagonal or zigzag manner, following the arrows numbered 1, 2, 3 and so on, as shown in the figure below:

(Figure: the diagonal traversal of the table, with arrows numbered 1, 2, 3, ...)

Thus, the first 6 rational numbers in this sequence are 0, 1/2, 1, −1, −1/2 and 1/3.

b. List the next 25 rational numbers in this sequence according to this diagonal traversal. Be sure to include in your traversal appropriate members from the next three rows from (a).
c. Suppose the rational number a/b is found in row i, column j in the table above. This rational number will be found on the kth arrow as described in the diagonal traversal above. Find a formula for k in terms of i and j. For example, the number −2/3 in row 3 and column 4 is on the 6th arrow.
d. Explain why every rational number will be found exactly once in this sequence.
e. The Big Picture: Summarize the steps above into a proof that |ℚ| = ℵ₀.

The idea above is attributed to Georg Cantor (1845, Russia – 1918, Germany), professor at the University of Halle, and the Father of Set Theory. It is known as The Cantor Diagonalization Argument.

10. The Uncountability of Sub-intervals of the Real Numbers: We know from Algebra that there are different kinds of intervals of real numbers: finite open intervals of the form (a, b), finite closed intervals of the form [a, b], finite half-open/half-closed intervals of the form (a, b] or [a, b), infinite open intervals of the form (a,oo) and (-oo,b), and infinite closed intervals of the form [a,oo), and (-oo,b]. The purpose of this Exercise is to show that all of these eight interval types have cardinality c, and thus they are all uncountable. a. Suppose that [a, b] is any closed interval of !RI..Find a linear function:

f: [O,1] ➔

b.

[a,b], that is, of the form f(x) = mx + k, with a positive slope m, which is both one-to-one and onto [a, b]. Hint: what should the graph look like? Explain why this proves that all finite closed intervals [a, b] have the same cardinality. Show that the same linear functionfthat you found in (a) is also a function:

f:

(0, 1]

➔ (a, b ],

f:

[0,1)



[a,b), and

f:

c.

(0, 1) ➔ (a, b ), and each is one-to-one and onto the indicated range when restricted to the corresponding domain. Hint: this just means that the graph you drew in (a) will have a hole or two. Explain why this proves that all finite intervals of the form (a, b] have the same cardinality, and similarly for the intervals of the other two forms. Parts (a) and (b) show that in order to prove that a finite interval of any of the four forms seen above has cardinality c, we have to prove that the four intervals [0, 1], (0, 1], [0, 1) and (0, 1) - or any particular example of each of these four forms all have cardinality c. Sketch the graph of the restricted tangent function from Trigonometry: tan(x) :

(-1,1) ➔ !RI.,

and explain why it is both one-to-one and onto. Explain why this proves that:

I(-1 , 1) I = d.

1

!RI. 1 = C.

Use parts (b) and (c) to prove that for all finite open intervals (a, b):

I( a, b ) I = I!RI.I = c. e.

Next, let us consider infinite open intervals of the form (a, oo), starting with a = 0. Sketch the graph of the natural logarithm function: ln(x) : ( 0, oo) ➔ !RI., and explain why this function is one-to-one and onto. Explain why this proves that:

ICo,00 ) I = I!RI.I = c. f.

Let a

E

!RI..Find a (very simple) linear function:

f: (a, oo) ➔

0, oo), that is one-to-one and onto. Explain why this proves that for any a Section 4.3 A Primer on Infinite Sets

(

E !RI.:

349

ICa,00 ) I = I!Ri.I = c. g.

Show that the same function/ that you found in the previous part is also a function:

f: [a, oo) ➔

0, oo),

[

that is again one-to-one and onto. Explain why this proves that for any a

E

!Ri.:

l[a,oo)I = l[0,oo)lh.

Let b

E

!Ri..Find a linear function:

f:

(-oo, b)



(

0, oo),

that is one-to-one and onto, but with a negative slope. Explain why this proves that for any b E !Ri.:

I(- 00 , b ) I = I!Ri.I = 1.

Show that the same function/that

c.

you found in the previous part is also a function:

f:

(-oo, b]



[

0, oo),

that is again one-to-one and onto. Explain why this proves that for any b

E

!Ri.:

IC-00 , b] I = I[ O,00 ) 1J.

Show that the function: f(x)

= 1 ~x'

restricted to the domain [ 0, 1 ) is a one-to-one function. Show that its range is [ 0, oo). In other words:

f: [0, 1 ) k.



[

0, 00)

is both one-to-one and onto. Explain why this shows that I[ 0, 1 ) I = I[ 0, oo) 1Find a linear function:

f: (0, 1]



[

0, 1 ),

which is both one-to-one and onto, but with a negative slope. Explain why this proves that:

ICo,1 J I = I[ o,1 ) I1.

Explain why this shows that all finite half-open/half-closed intervals have the same cardinality. Generalize the idea from the previous part to create a linear function:

f: (a, b]



[

a, b ),

that is both one-to-one and onto. We will call this function aJUp. Now comes the hardest part. We will construct a function: f : [ 0, 1 ]



[

0, 1 ),

that is both one-to-one and onto. The idea is to break up [ 0, 1] into halves, then quarters, then eighths, and so on, and flip the new interval on the right: 350

Section 4.3 A Primer on Infinite Sets

[ 0, 1]

1Ju ( 1,1 J; flip ( 1,1 l ➔ 0,1] u [ 1,1); subdivide [ 0, i ] into two: . ( 4' 1 21 ]·. fl Ip = [ ! ] u ( ! ' 1]u [ 1' into two: ➔ 0,! ]u [ ! , 1) u [ 1,1); subdivide [ . ( 1 1 ]. = [ 0,~ ] u ( ~ ' ! ] u [ ! ' 1) u [ 1,1); fl Ip 8' 4 . ➔ ~ ~ !) i ) i ,l); subdivide [ ~ into two: = [ o, /6 Ju ( /6 , ~ J u [ ~, ! ) u [ ! , 1)u [ 1,1); flip ( / 6 , ~ l ... Notice that 1 appears twice after the n flip, but this is a tempora,y concern since the leftmost interval will be divided into two again in the next iteration and the next subinterval flipped, and so inwill appear only once after the next step. = [ o, [

0,

1);

0, } ]

[

[

0,

] U[

,

U[ } ,

U[

0,

]

th

11

m.

Write the algorithm above as a piecewise function with an infinite number of pieces in the domain. Explain why f(O) = 0 and why this function is one-to-one and its range is [ 0, 1). Hint: the previous part should be very useful. The formula should be in this form: 0

if X = 0,

(1/2, 1],

ifx

E

if X

E (

if X

E (

1/4, 1/2],

f(x) =

n. o.

2

;+!, in],

Note that we have no choice but to havef(O) = 0. Explain why this part shows that I[ 0, 1] I = I[ 0, 1) I- Note that these are almost the same interval, except the second interval is open at x = l. In other words, they are only different by exactly one point. Sketch the graph off(x) from part (m). Apply the same ideas found in parts (1) and (m) to define a function:

f:

(0, 1)



(0, 1],

which is one-to-one and onto. Hint: begin with:

( 0, 1) = ( 0, p.

1) u [ 1,1).

Summarize all the parts above to show that all of the eight interval types of real numbers as listed at the beginning of this Exercise have cardinality c.

Section 4.3 A Primer on Infinite Sets

351

4.4 Linearity Properties for Infinite Sets of Vectors

We are now in a position to define the concepts of linear combinations, Spans and linear independence for infinite sets of vectors. One good way to start is to recall the polynomial spaces ℙn = Span({1, x, x², ..., xⁿ}). For example:

ℙ3 = Span({1, x, x², x³}) = { c0 + c1x + c2x² + c3x³ | c0, c1, c2, c3 ∈ ℝ }.

Since we are only allowed to multiply vectors by a constant, and not multiply two polynomials together as we do in ordinary Algebra, in order to get polynomials of degree higher than 3, we need to expand our space to ℙ4 or ℙ5 and so on. This strategy will not work, unfortunately, if we are interested in producing any polynomial p(x), no matter how high the degree. One way to get around this is to work with the infinite set of monomials:

S = { 1, x, x², x³, ..., xⁿ, ... }.

This set is important because any polynomial p(x) can be written as a finite sum of these monomials with constant coefficients. To study these sets, though, we will first need to know how to describe them. The most convenient way to do this is by using what is called an indexing set, denoted I, which is typically a non-empty subset of ℝ. The simplest examples will have indexing sets I = ℕ, I = ℝ itself, or I could be an interval such as [0, ∞) or [−3, 2]. The vectors of S will be in a one-to-one correspondence with the elements of I through the use of set-builder notation. We will write infinite sets of vectors in general as:

S = { vi | i ∈ I } ⊂ (V, ⊕, ⊙), where I ⊂ ℝ is some non-empty indexing set.

To avoid ambiguity, we will insist that vi ≠ vj if i and j are distinct indices in I. In other words, distinct indices correspond to distinct vectors, and vice versa. Thus, if I is a countable indexing set, then S is also a countable set of vectors (and similarly for uncountable index sets). Note that if I is the finite set {1, 2, ..., n}, we get our old sets of vectors:

S = { vi | i ∈ {1, 2, ..., n} } = { v1, v2, ..., vn }.

Example: Above, we saw the infinite set of all monomials in the variable x:

S = { 1, x, x², x³, ..., xⁿ, ... }.

Each of these can be viewed as a function defined for all real numbers, and so S ⊂ F(ℝ), the vector space of all functions defined on ℝ. We wrote S in roster form above, but we can also describe S in set-builder notation as:

S = { xⁿ | n ∈ ℕ }.

The indexing set is ℕ, and the index is the variable n, which we allow to be any natural number. To see how this matches with our general notation, our vectors can be labeled as:

vn = xⁿ, where n ∈ ℕ.

Notice also that if n and m are different natural numbers, then xⁿ and xᵐ are different monomials. Thus, each n corresponds to a unique vector xⁿ. Since our indexing set is ℕ, our set of vectors S is countable. In the same way, we can construct the infinite sets of even monomials and odd monomials:

E = { x²ⁿ | n ∈ ℕ } = { 1, x², x⁴, ..., x²ⁿ, ... }, and
O = { x²ⁿ⁺¹ | n ∈ ℕ } = { x, x³, x⁵, ..., x²ⁿ⁺¹, ... }.

Notice that we used the same indexing set ℕ for these two sets. Since each monomial x²ⁿ or x²ⁿ⁺¹ is in one-to-one correspondence with ℕ, we can say that all three sets S, E and O are countably infinite sets of vectors. □

Example: We will consider several more subsets from F(~). Let us start with:

S1 = { e^(kx) | k ∈ ℤ } = { ..., e^(−3x), e^(−2x), e^(−x), 1, e^x, e^(2x), e^(3x), ... }.

The indexing set for S1 is ℤ. Since ℤ is countable, and the functions in S1 are distinct (no two of them have the same graph), S1 is also countable. Similarly, the set:

S2 = { e^(kx) | k ∈ ℚ }

has ℚ as an indexing set. S2 is also countable, since ℚ is countable, as we saw in the Exercises of Section 4.3. However, since the way we list ℚ is not very convenient, it is certainly better to describe S2 using set-builder notation instead of roster form. Since every integer is also a rational number, S1 ⊂ S2.

dim(ℙn) = n + 1. Similarly, we saw that the countable set B = {1, x, x², ..., xⁿ, ...} is a basis for ℙ, and thus:

dim(ℙ) = ℵ₀.

In Section 4.4, we also saw the linearly independent, uncountable set:

S3 = { e^(kx) | k ∈ ℝ } ⊂ F(ℝ), with |S3| = c.

Since the Span of any set of vectors is a subspace, we can construct:

W = Span(S3) ⊆ F(ℝ),

and S3 is a basis for W. Thus, dim(W) = c. □

We can also generalize the Theorem that says that any subspace of ℝⁿ is at most n-dimensional:

Theorem 4.5.8: Let (W, ⊕, ⊙) be a subspace of a finite-dimensional vector space (V, ⊕, ⊙). Then: dim(W) ≤ dim(V). Furthermore, dim(W) = dim(V) if and only if W = V.

Again, the proof carries over exactly as in Theorem 2.2.6 and Exercise 5 of Section 2.2, thanks to the Extension Theorem.

Example: Consider ℙ2, the vector space of polynomials with degree at most 2. Let:

W = { p(x) ∈ ℙ2 | p(3) = 2p(−1) }.

First let us show that W is indeed a subspace of ℙ2. First, W is not empty, because the zero polynomial z(x) satisfies the condition z(3) = 0 = 2z(−1). We must show next that W is closed under addition and scalar multiplication. Suppose p(x) and q(x) are members of W and c is a scalar. Then:

(p + q)(3) = p(3) + q(3) = 2p(−1) + 2q(−1) = 2(p + q)(−1), and
(c • p)(3) = c • p(3) = c • 2p(−1) = 2(c • p)(−1),

and thus W is indeed closed under both operations. Now, let us find a basis and the dimension of W. We begin by forming a typical linear combination from the ambient space ℙ2 and determine what restrictions are imposed on the coefficients. Using our basis {1, x, x²}, a typical member of ℙ2 can be written as a linear combination:

p(x) = c0 + c1x + c2x².

Co+

C1X

+ c2x2 .

The defining condition says that p(3) = 2p(- l ), so we must have: p(3)

=

co+3c1 +9c2

=

2(co-c1 +c2)

=

2p(-l).

We can rewrite this equation as: co - 5c1 - 7c2 = 0. This is a single homogeneous equation. We can find solutions for it by making c1 and c2 our free variables (this is because we wrote our linear combination in ascending degree). From this, we have: co = 5c1 + 7c2. Thus, we can conclude that: p(x) = 5c 1 + 7c 2 + c 1x + c 2x 2. Let us collect the terms with common coefficients: p(x) = c1(5+x)+c2(7+x

2),

and thus we see that any member of W must be a linear combination of the two polynomials 5 + x and 7 + x 2 (both of which satisfy the required condition: 5 + 3 = 2(5 - 1), and likewise for 7 + x 2 ). In other words: W = Span( {5 +x, 7 +x 2 } ).

Since these are polynomials of different degrees, they are linearly independent by our Theorem in Section 3.2. Thus, we have found a basis for W:

Section 4.5 Subspaces, Basis and Dimension

369

B = {5 + x, 7 + x²}, and from this, dim(W) = 2. As expected, it is smaller than dim(ℙ2) = 3. Let's check that p(x) = 5 + x ∈ W: p(3) = 8, and p(−1) = 4, so p(3) = 2p(−1), making p(x) a member of W. □

Let us see what we can do if we have a slightly bigger ambient space and we play with some Calculus:

Example: Let V = ℙ3, and consider:

W = { p(x) ∈ ℙ3 | p(−2) = 0 and p′(3) = 0 },

under the same addition and scalar multiplication, of course. First let us show that W is indeed a subspace of V. Again, z(−2) = 0 and z′(x) = z(x), so z′(3) = 0 also. Thus, W contains z(x), and so W is not empty. We must show next that W is closed under vector addition and scalar multiplication. If p(x) and q(x) are two vectors in W, and k is any scalar, then:

(p + q)(−2) = p(−2) + q(−2) = 0 + 0 = 0, and
(p + q)′(3) = p′(3) + q′(3) = 0 + 0 = 0.

Thus, W is closed under addition, since both defining properties are satisfied by p + q. Similarly:

(kp)(−2) = k • p(−2) = k • 0 = 0, and
(kp)′(3) = k • p′(3) = k • 0 = 0,

and thus W is closed under scalar multiplication. Notice that we used two properties of the derivative which we can call additivity and homogeneity:

(p + q)′(x) = p′(x) + q′(x), and
(k • p)′(x) = k • p′(x).

Now, let us further use our knowledge of Calculus in order to find a basis for W. We know that {1, x, x², x³} is a basis for ℙ3, so first we write:

p(x) = c0 + c1x + c2x² + c3x³,

as a typical generic member of ℙ3. In order to be a member of W, it must satisfy the two defining properties. Since the second property involves a derivative, we first compute:

p′(x) = c1 + 2c2x + 3c3x².

Now, the two properties say:

p(−2) = 0 = c0 − 2c1 + 4c2 − 8c3, and p′(3) = 0 = c1 + 6c2 + 27c3.

But this is a homogeneous system of two equations in four variables. We can solve this using our techniques from Chapter 1. We assemble the matrix:

[ 1  −2  4  −8 ]                [ 1  0  16  46 ]
[ 0   1  6  27 ]   with rref    [ 0  1   6  27 ].

The nullspace of this matrix gives us the solutions for the four coefficients. Since the leading variables are c0 and c1, and the free variables are c2 and c3, we can sight-read the basis for the nullspace:

(c0, c1, c2, c3) = c2(−16, −6, 1, 0) + c3(−46, −27, 0, 1) = (−16c2 − 46c3, −6c2 − 27c3, c2, c3).

Thus, p(x) must have the form:

p(x) = (−16c2 − 46c3) + (−6c2 − 27c3)x + c2x² + c3x³
     = −16c2 − 6c2x + c2x² − 46c3 − 27c3x + c3x³
     = c2(−16 − 6x + x²) + c3(−46 − 27x + x³).

These last two polynomials have different degrees, so they are independent, and since every member of W must be a linear combination of these two polynomials, they Span W. Thus, we conclude that W has basis:

B = {−16 − 6x + x², −46 − 27x + x³}.

We also conclude that W is 2-dimensional. Notice also that the basis {(−16, −6, 1, 0), (−46, −27, 0, 1)} for the nullspace closely corresponds to the coefficients in the two basis vectors in B. Thus, by decoding the basis for the nullspace, we can find a basis for W. We can check that both polynomials in B satisfy the defining properties of W:

p1(x) = −16 − 6x + x²;              p2(x) = −46 − 27x + x³;
p1(−2) = −16 + 12 + 4 = 0;          p2(−2) = −46 + 54 − 8 = 0;
p1′(x) = −6 + 2x;                   p2′(x) = −27 + 3x²,  and thus:
p1′(3) = −6 + 6 = 0;                p2′(3) = −27 + 27 = 0.

We also graph the two polynomials below, and indeed we see that both polynomials have −2 as an x-intercept, and the tangent line at x = 3 is horizontal for both graphs.

The Polynomials p1(x) = −16 − 6x + x² and p2(x) = −46 − 27x + x³ □
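For readers who like to double-check such computations with software, here is a minimal SymPy sketch of the same Example (assuming SymPy is installed; the variable names are only for this illustration):

    from sympy import symbols, Matrix, diff

    x = symbols('x')
    c0, c1, c2, c3 = symbols('c0 c1 c2 c3')

    p = c0 + c1*x + c2*x**2 + c3*x**3
    conditions = [p.subs(x, -2), diff(p, x).subs(x, 3)]   # p(-2) = 0 and p'(3) = 0

    # Coefficient matrix of the homogeneous system in (c0, c1, c2, c3).
    A = Matrix([[cond.coeff(c) for c in (c0, c1, c2, c3)] for cond in conditions])

    print(A.rref()[0])                       # rows (1, 0, 16, 46) and (0, 1, 6, 27)
    print([list(v) for v in A.nullspace()])  # [[-16, -6, 1, 0], [-46, -27, 0, 1]]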

The last portion of the previous Theorem, which states that:

dim(W) = dim(V) if and only if W = V,

is false in the case when W is a subspace of an infinite dimensional vector space V. This is one of the reasons why infinite dimensional vector spaces are best left to the appropriate field of study, which is called Functional Analysis.

Example: Consider the vector space of all polynomials:

ℙ = Span({1, x, x², x³, ..., xⁿ, ...}).

Since the set of monomials is countable and linearly independent, dim(ℙ) = ℵ₀. Now, consider the subspace:

ℙe = Span(E), where E = {1, x², x⁴, x⁶, ..., x²ⁿ, ...},

consisting of all even polynomials, that is, the polynomials p(x) that satisfy the equation p(x) = p(−x). As seen in the Exercises in Section 4.4, E is likewise countable and linearly independent, and so dim(ℙe) = ℵ₀ as well. But obviously ℙe is not all of ℙ, and so we see that our Theorem on subspaces of finite dimensional spaces may be false if the ambient space is infinite dimensional. □

Using dim(W) to Find Other Bases for W

Theorem 2.5.9 told us that if we knew the dimension of W, it becomes easier to verify if a subset B of W is a basis for W. This Theorem generalizes to all finite-dimensional subspaces:

Theorem 4.5.9 - The Two-for-the-Price-of-One or Two-for-One Theorem: Let (W, ⊕, ⊙) be a finite-dimensional subspace of a vector space (V, ⊕, ⊙), with dim(W) = n, and suppose that B = {w1, w2, ..., wn} is any subset of vectors from W with exactly n vectors. Then:

B is a basis for W if and only if either B is linearly independent or B Spans W.

In other words, it is necessary and sufficient to check B for only one condition (and this would more easily be the condition of linear independence) without checking the other, if B already contains the correct number of vectors. Again, this statement may be false if V is infinite dimensional, as you will see in the Exercises.

4.5 Section Summary

A non-empty subset W of a vector space (V, ⊕, ⊙) is called a subspace of V if W is closed under vector addition and scalar multiplication. In other words, for all w1 and w2 ∈ W, and k ∈ ℝ: w1 ⊕ w2 ∈ W, and k ⊙ w1 ∈ W. As before, we write W ⊆ V, and we refer to V as the ambient space of W.

4.5.1: Let W be a non-empty subset of (V, ⊕, ⊙). Then: W is a subspace of V if and only if (W, ⊕, ⊙) is itself a vector space.

4.5.2: Let (V, ⊕, ⊙) be a vector space. Then, V has two trivial subspaces: W = {0v}, and all of V itself.

4.5.3: Suppose S = {vi | i ∈ I} ⊂ (V, ⊕, ⊙), where I ⊂ ℝ is some non-empty indexing set, and let W = Span(S). Then: (W, ⊕, ⊙) is a subspace of (V, ⊕, ⊙).

A set of vectors B from a vector space (V, ⊕, ⊙) is a basis for V if B is linearly independent and Spans V. We will agree that the zero vector space V = {0v} does not have a basis, since any set containing 0v is automatically dependent.

4.5.4 - The Extension Theorem: Let S = {v1, v2, ..., vn} be a finite, linearly independent set of vectors from some vector space (V, ⊕, ⊙), and suppose vn+1 is not a member of Span(S). Then, the extended set: S1 = {v1, v2, ..., vn, vn+1} is still linearly independent.

4.5.5 - Existence of a Basis: Every non-zero vector space (V, ⊕, ⊙) has a basis B.

A non-zero vector space (V, ⊕, ⊙) is called finite dimensional if we can find a finite set B which is a basis for V. We call such a set a finite basis for V. Otherwise, we say that V is infinite dimensional. We will agree that V = {0v} has dimension 0, and is also finite-dimensional.

4.5.6 - The Dependent/Independent Sets from Spanning Sets Theorem: Suppose we have a set of n vectors: S = {w1, w2, ..., wn} ⊂ (V, ⊕, ⊙), and we form W = Span(S). Suppose now we randomly choose a set of m vectors from W to form a new set: L = {u1, u2, ..., um}. Then, we can conclude that: if m > n, then L is linearly dependent. Consequently, the contrapositive says that: if L is linearly independent, then m ≤ n.

4.5.7 - The Dimension of a Vector Space: Any two bases for a finite-dimensional vector space (V, ⊕, ⊙) have exactly the same number of elements. We call this common number the dimension of V, denoted dim(V). If dim(V) = k, we also say that V is a k-dimensional vector space.

4.5.8: Let (W, ⊕, ⊙) be a subspace of a finite-dimensional vector space (V, ⊕, ⊙). Then: dim(W) ≤ dim(V). Furthermore, dim(W) = dim(V) if and only if W = V.

4.5.9 - The Two-for-the-Price-of-One or Two-for-One Theorem: Let (W, ⊕, ⊙) be a finite-dimensional subspace of a vector space (V, ⊕, ⊙), with dim(W) = n, and suppose that B = {w1, w2, ..., wn} is any subset of n vectors from W. Then: B is a basis for W if and only if either B is linearly independent or B Spans W.

4.5 Exercises

1. Each item involves a subset W of ℙ₂ or ℙ₃. For each item: (i) show that z(x) satisfies the description of W; (ii) show that W is closed under addition and scalar multiplication; (iii) find a basis for W; (iv) state dim(W). In order to get the same basis as that in the Answer Key, let p(x) = c₀ + c₁x + c₂x² be the generic member for ℙ₂ (add c₃x³ for Exercises involving ℙ₃). Make the leftmost variables the leading variables, as we did in the Examples.
a. W = {p(x) ∈ ℙ₂ | p(-1) = p(2) and p(3) = 2p(1)}
b. W = {p(x) ∈ ℙ₂ | 2p(1) = 3p′(-1)}
c. W = {p(x) ∈ ℙ₂ | ∫ p(x) dx = 0}
d. W = {p(x) ∈ ℙ₃ | p(-2) = p(1) and p′(2) = 0}
e. W = {p(x) ∈ ℙ₃ | p(-2) = p′(3) and p(3) = -2p′(-1)}
f. W = {p(x) ∈ ℙ₃ | ∫ p(x) dx = 0}
g. W = {p(x) ∈ ℙ₃ | p(-1) + p(2) = 2p′(3), p(1) = p(2) and p″(-2) = p′(0)}

2. Show that W = {p(x) ∈ ℙ₂ | p(3) = -2} is not a subspace of ℙ₂.

3. Let V = Span({e²ˣ, e³ˣ, e⁵ˣ}).
a. Show that W₁ = {f(x) ∈ V | f(0) = 0 and f′(0) = 0} is a subspace of V. Find a basis for W₁ and its dimension.
b. Show that W₂ = {f(x) ∈ V | f(0) = f′(0)} is a subspace of V. Find a basis for W₂ and its dimension. Why is this different from (a)? Using only the descriptions of W₁ and W₂ above, would it be possible to say that W₁ is a subspace of W₂, or W₂ is a subspace of W₁, or neither?
c. Show that W₃ = {f(x) ∈ V | [f(0)]² = f′(0)} is not a subspace of V.

4. Show that the subset W of V = Span({sin(x), cos(x), tan(x)}) defined by:

W = {f(x) ∈ V | f(0) = f(π/4)}

is a subspace of V. Find a basis for W and its dimension.

5. Show that the subset W of ℙ₃ defined by:

W = {p(x) ∈ ℙ₃ | p(-1) = 0, p(1) = 0, and p(4) = 0}

is a subspace of ℙ₃. Think carefully about the next part before you start making any computations: find a basis for W and state its dimension. Again, think smartly!

6. In our Examples, we defined subspaces by listing two (or more) conditions joined by the word "and." Consider the set: W = {p(x) ∈ ℙ₂ | p(-2) = 0 or p(3) = 0}.
a. Show that z(x) is a member of W.
b. Show that W is closed under scalar multiplication.
c. Show that W is not closed under vector addition, by producing two vectors from W whose sum is not in W. Conclude that W is not a subspace.

7. Let F be the set of all functions f(x) defined on an interval I, except possibly at a specific point x = a ∈ I.
a. Show that F is a vector space under the usual addition and scalar multiplication of functions.
b. Show that W = {f(x) ∈ F | lim_{x→a} f(x) = 0} is a subspace of F.
c. Show that U = {f(x) ∈ F | lim_{x→a} f(x) = -2} is not a subspace of F.

8. Use the Two-for-One Theorem (and the answer to the corresponding Exercise) to determine if the indicated set of vectors is also a basis for the subspace W in the corresponding Exercise. In other words, check if each member of the set is actually a vector from the subspace, and check if the set is independent (it is easier to check independence instead of Spanning). For those S which are a basis, show how to express each vector in S as a linear combination of the original basis vectors.
a. Exercise 1(b): S = {x² + 8x, x² - 4}
b. Exercise 1(c): S = {x² - 2x, 2x - 1}
c. Exercise 1(d): S = {3 - 24x - 9x² + 5x³, 7 + 72x + 27x² - 15x³}
d. Exercise 1(e): S = {10x³ + 16x² - 99x + 85, 3x³ + 20x² - 43x + 16}
e. Exercise 1(f): S = {x³ - x - 8, x³ + 3x² - 23, 3x² + x - 15}
f. Exercise 1(f): S = {x³ - 5x, 13x³ - 30x², 6x² - 13x}
g. Exercise 1(g): S = {22 + 10x - x² - x³}
h. Exercise 3(b): S = {e⁵ˣ - 2e³ˣ, e⁵ˣ - 4e²ˣ}
i. Exercise 4: S = {tan(x) + cos(x) - sin(x), √2 tan(x) - 2 sin(x)}

Let b > 0 be a positive real number, with b ≠ 1. Show that {b} is a basis for ℝ⁺

If D = Diag(d1, d2, ..., dn), and A is an n × n matrix with rows r1, r2, ..., rn, then DA has rows d1r1, d2r2, ..., dnrn. If B = [ c1 c2 ⋯ cn ], then BD = [ d1c1 d2c2 ⋯ dncn ].

In other words, we can obtain DA by multiplying each row of A by the corresponding diagonal entry of D, and we can obtain BD by multiplying each column of B by the corresponding diagonal entry of D.
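As a quick numerical illustration of this row/column description (the matrices below are made up for this sketch):

# A small numerical check of the row/column scaling description above.
from sympy import Matrix, diag

D = diag(2, -1, 5)                       # D = Diag(2, -1, 5)
A = Matrix([[1, 2, 3],
            [4, 5, 6],
            [7, 8, 9]])

# D*A multiplies row i of A by the i-th diagonal entry of D;
# A*D multiplies column j of A by the j-th diagonal entry of D.
print(D * A)   # rows scaled by 2, -1, 5
print(A * D)   # columns scaled by 2, -1, 5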

4.6.3 - Closure Properties for Diagonal Matrices:

If A and B are n × n diagonal matrices and c is any scalar, then A + B, A - B, cA and AB are also n × n diagonal matrices. In particular, the positive powers of a diagonal matrix are also diagonal, and if D = Diag(d1, d2, ..., dn), then D^k = Diag(d1^k, d2^k, ..., dn^k) for all positive integers k. Consequently, the subset:

D(n) = {A ∈ Mat(n) | a_ij = 0 if i ≠ j}

is a subspace of Mat(n). Furthermore, dim(D(n)) = n.

4.6.4 - Invertibility of Diagonal Matrices:

A diagonal matrix D = Diag(d1, d2, ..., dn) is invertible if and only if di ≠ 0 for all i = 1..n. In this case: D⁻¹ = Diag(d1⁻¹, d2⁻¹, ..., dn⁻¹).
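A small sympy check of 4.6.3 and 4.6.4 for one made-up diagonal matrix:

# A quick illustration of 4.6.3 and 4.6.4 for a particular diagonal matrix
# (the entries are made up for illustration).
from sympy import Rational, diag, eye

D = diag(2, -3, Rational(1, 4))

print(D**3)                    # Diag(8, -27, 1/64): powers stay diagonal
print(D.inv())                 # Diag(1/2, -1/3, 4): reciprocals of the entries
print(D * D.inv() == eye(3))   # True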

An n × n matrix U = [u_ij] is called upper triangular if all the entries below the main diagonal are 0, that is, u_ij = 0 if i > j. Similarly, an n × n matrix L = [l_ij] is called lower triangular if all the entries above the main diagonal are 0, that is, l_ij = 0 if i < j. We also say that U is strictly upper triangular if its diagonal entries are also zeroes, that is, u_ij = 0 if i ≥ j. Similarly, L is strictly lower triangular when l_ij = 0 if i ≤ j.

4.6.5 - Closure Properties for Triangular Matrices:

If A and B are n × n upper triangular matrices and c is any scalar, then A + B, A - B, cA and AB are also n × n upper triangular matrices. In particular, the positive powers of an upper triangular matrix are also upper triangular. An analogous statement is true for lower triangular matrices. Consequently, the subset of all upper triangular matrices:

U(n) = {A ∈ Mat(n) | a_ij = 0 if i > j}

is a subspace of Mat(n). Furthermore, dim(U(n)) = n(n + 1)/2.

Similarly, the subset of all lower triangular matrices:

L(n) = {A ∈ Mat(n) | a_ij = 0 if i < j}

is a subspace of Mat(n), with the same dimension.
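A brief sanity check of these closure properties, with arbitrarily chosen upper triangular matrices:

# A quick check of 4.6.5 with made-up entries: sums, scalar multiples and
# products of upper triangular matrices stay upper triangular.
from sympy import Matrix

A = Matrix([[1, 2, 3],
            [0, 4, 5],
            [0, 0, 6]])
B = Matrix([[7, 0, -1],
            [0, 2,  8],
            [0, 0, -3]])

print(A + B)      # upper triangular
print(3 * A)      # upper triangular
print(A * B)      # upper triangular; the (i, j) entry is 0 whenever i > j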

4.6 Exercises

... Use similar ideas to prove the analogous result for lower triangular matrices.

10. Prove that the transpose of an upper triangular matrix is a lower triangular matrix, and vice versa.

11. Prove that an upper triangular matrix is invertible if and only if none of the entries on the main diagonal is 0. Use the same hint as in Exercise 6(b). State and prove a similar statement for lower triangular matrices. (Warning: the proof for upper triangular matrices does not work in exactly the same way for lower triangular matrices.)

12. Let T : ℝ³ → ℝ³, with standard matrix

[T] = A =
[ ·  ·  · ]
[ 0  ·  · ]
[ 0  0  · ]

a. Compute T(e1), T(e2), and T(e3). Write your answers in terms of e1, e2, and e3.
b. Use your computations in (a) to find three vectors v1, v2, and v3, such that T(v1) = e1, T(v2) = e2, and T(v3) = e3. Hint: fully exploit the linearity properties of T.
c. Use your answers in (b) to write A⁻¹.

13. Prove that if an upper triangular matrix is invertible, then its inverse is also upper triangular. Hints: Let T be the linear transformation corresponding to this matrix. Explicitly construct T⁻¹ by defining it on the standard basis using Induction. Generalize the ideas from the previous Exercise. Start by showing that you can define T⁻¹(e1). Assume that you can define T⁻¹(e1) through T⁻¹(ek). Finally, show that you can define T⁻¹(e_{k+1}).

14. Let A and B be m × k matrices, let C be a k × n matrix, and let r ∈ ℝ. Prove that:
a. (Aᵀ)ᵀ = A
b. (A + B)ᵀ = Aᵀ + Bᵀ
c. (rA)ᵀ = r(Aᵀ)
d. (BC)ᵀ = CᵀBᵀ (Hint: use the dot product formula for the matrix product)

15. Prove that if A is an invertible n × n matrix, then Aᵀ is also invertible, and (Aᵀ)⁻¹ = (A⁻¹)ᵀ. Hint: by the uniqueness of the inverse, all you need to show is that Aᵀ(A⁻¹)ᵀ = Iₙ. Part (d) from the previous Exercise will be useful.

16. Let A and B be symmetric n × n matrices and let c be any scalar. Prove that Aᵀ, A + B, A - B and cA are also symmetric.

17. Prove that if A is invertible and symmetric, then A⁻¹ is also symmetric.

18. Prove the converse of Exercise 5 from Section 3.8: If A is a fixed n × n matrix, and the equation yA = d is solvable (for y) for all 1 × n matrices d, then A is invertible. Hint: use the transpose operation, Exercise 14(d), and the Really Big Theorem 3.8.5.

19. Suppose that A is a strictly upper triangular n × n matrix. Prove that Aⁿ = 0_{n×n}, the zero n × n matrix. Hint: compute the powers of a strictly upper triangular 4 × 4 matrix and observe what happens to each power, and why this happens.
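The hint in Exercise 19 can be explored quickly by machine; in this sketch the nonzero entries of the 4 × 4 strictly upper triangular matrix are made up.

# Exploring the hint in Exercise 19: powers of a strictly upper triangular
# 4 x 4 matrix (the nonzero entries are made up for illustration).
from sympy import Matrix, zeros

N = Matrix([[0, 2, -1, 3],
            [0, 0,  4, 5],
            [0, 0,  0, 7],
            [0, 0,  0, 0]])

for k in range(1, 5):
    print(k, (N**k) == zeros(4, 4))
# The band of zeros widens with each power; N**4 is the zero matrix.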

20. Let A, B, and C be the 3 × 3 matrices

A = [ ⋯ ],  B = [ ⋯ ],  and  C = [ ⋯ ].

Notice that all three matrices are obviously symmetric.
a. Compute AB and BA and verify that they are equal. Look at the resulting matrix, and check that it is also symmetric.
b. On the other hand, compute AC and CA. Is either matrix symmetric? Are these two products equal to each other?
c. Suppose A and B are symmetric n × n matrices. Prove that AB is also symmetric if and only if A and B commute with each other, that is, AB = BA. Hint: Exercise 14(d) will again be very useful.

21. Show that the set:

[ 1 0 0 ]   [ 0 0 0 ]   [ 0 0 0 ]
[ 0 0 0 ],  [ 0 1 0 ],  [ 0 0 0 ]
[ 0 0 0 ]   [ 0 0 0 ]   [ 0 0 1 ]

is a basis for the vector space of diagonal 3 × 3 matrices. Hint: show that every diagonal 3 × 3 matrix can be expressed as a linear combination of these three matrices (thus proving Spanning), and that these three matrices are linearly independent.

22. Let Diag(n) be the vector space of all diagonal n × n matrices. Use the idea of the previous Exercise to find a general basis for Diag(n) and state its dimension.

23. Show that the set of six matrices

[ 1 0 0 ]   [ 0 1 0 ]   [ 0 0 1 ]   [ 0 0 0 ]   [ 0 0 0 ]   [ 0 0 0 ]
[ 0 0 0 ],  [ 0 0 0 ],  [ 0 0 0 ],  [ 0 1 0 ],  [ 0 0 1 ],  [ 0 0 0 ]
[ 0 0 0 ]   [ 0 0 0 ]   [ 0 0 0 ]   [ 0 0 0 ]   [ 0 0 0 ]   [ 0 0 1 ]

is a basis for the vector space of upper triangular 3 × 3 matrices.

24. Let Upper(n) be the vector space of all upper triangular n × n matrices. Use the idea of the previous Exercise to find a general basis for Upper(n) and state its dimension. Hint: you will need the formula 1 + 2 + 3 + ⋯ + k = k(k + 1)/2, which is usually seen in Precalculus when we study Mathematical Induction.

25. Let Lower(n) be the vector space of all lower triangular n × n matrices. Explain why the dimension of Lower(n) should be exactly the same as that of Upper(n).

26. Show that the set: … is a basis for the vector space of symmetric 3 × 3 matrices. Hint: complete the matrix below depicting the general form of a symmetric 3 × 3 matrix.

27. Let Sym(n) be the vector space of all symmetric n × n matrices. Use the idea of the previous Exercise to find a general basis for Sym(n), and find its dimension.

28. Suppose that A and B are n × n matrices. Prove that A·B = 0_{n×n} if and only if colspace(B) ⊆ nullspace(A), or equivalently rowspace(A) ⊆ nullspace(Bᵀ). Hint: think of the definition of A·B.

29. Matrices in Block Diagonal Form: Suppose that A1, A2, ..., Ak are all square matrices, not necessarily of the same size, with k ≥ 2. We defined the direct sum of these matrices,

A = A1 ⊕ A2 ⊕ ⋯ ⊕ Ak,

in Exercise 8 of Section 3.8. Prove the following statements about certain kinds of special direct sums:
a. A is diagonal if and only if every Ai is also diagonal.
b. A is upper triangular if and only if every Ai is also upper triangular.
c. A is lower triangular if and only if every Ai is also lower triangular.
d. Aᵀ = A1ᵀ ⊕ A2ᵀ ⊕ ⋯ ⊕ Akᵀ.
e. A is symmetric if and only if every Ai is also symmetric.


30. Bisymmetric Matrices: We say that an n × n matrix A is bisymmetric if the entries of A are symmetric across the main diagonal as well as the opposite diagonal, which is made of the entries a_{1,n}, a_{2,n-1}, ..., a_{n,1}. Algebraically, this means:

a_{i,j} = a_{j,i}, and a_{i,j} = a_{n+1-j, n+1-i}, for all i, j = 1, ..., n.

For example, the most general form of a bisymmetric 1 × 1, 2 × 2 and 3 × 3 matrix would be, respectively:

[ a ],

[ a b ]
[ b a ],

and

[ a b c ]
[ b d b ]
[ c b a ].

Notice that every 1 × 1 matrix is automatically bisymmetric. Let us denote by Bisym(n) the set of all n × n bisymmetric matrices.
a. Show that Bisym(2) is a subspace of Sym(2).
b. Find a basis for Bisym(2) and state dim(Bisym(2)). Hint: decompose the matrix above into two matrices which each contain only one distinct letter, with the other entries zeroes.
c. Show that Bisym(3) is a subspace of Sym(3).
d. Find a basis for Bisym(3) and state dim(Bisym(3)).
e. Find the general form of all 4 × 4 bisymmetric matrices and repeat parts (a) and (b). Replace 2 with 4 in the instructions.
f. Find the general form of all 5 × 5 bisymmetric matrices and repeat parts (a) and (b). Replace 2 with 5 in the instructions.
g. Use your answer in (e) to show that if you erase the 1st and 4th rows and 1st and 4th columns of a 4 × 4 bisymmetric matrix, you get a 2 × 2 bisymmetric matrix.
h. Use your answer in (f) to show that if you erase the 1st and 5th rows and 1st and 5th columns of a 5 × 5 bisymmetric matrix, you get a 3 × 3 bisymmetric matrix.
i. Now, let us begin to generalize: show that Bisym(n) is a subspace of Sym(n).
j. Show how to construct a basis for Bisym(n) consisting of two kinds of matrices: (1) matrices where the only non-zero entries are in rows 1 and n and in columns 1 and n, and (2) matrices where all the entries in rows 1 and n and columns 1 and n are zeroes. Show that the matrices of the 2nd kind are in one-to-one correspondence with a basis for Bisym(n - 2). Draw from your observations in parts (e) and (f).
k. Use part (j) and Induction to show that:

dim(Bisym(n)) = k²       if n = 2k - 1, an odd number, and
              = k² + k   if n = 2k, an even number.

Note: divide your proof into the case when n is odd and n is even.

Chapter Five

Movement in the Abstract:
Linear Transformations of General Vector Spaces

Now that we have some understanding of general vector spaces, we will construct linear transformations from one vector space V to another vector space W. A linear transformation T from V to W first has to be a function whose input is a vector from V, and whose output is a vector from W. Just like in Chapter 3, we write w = T(v), and we require T to possess the properties of additivity and homogeneity. We write these properties just like before, as:

T(u + v) = T(u) + T(v), and T(k · v) = k · T(v).

We should keep in mind, though, that u + v and k · v represent the vector addition and scalar multiplication in V, whereas T(u) + T(v) and k · T(v) represent the vector addition and scalar multiplication in W. An easy, but important, example would be the derivative. The additivity and homogeneity properties are the familiar derivative properties:

D(p(x) + q(x)) = p′(x) + q′(x) = D(p(x)) + D(q(x)), and
D(k · p(x)) = k · p′(x) = k · D(p(x)),

for any two differentiable functions p(x) and q(x). We know from Chapter 4 that W = Span({sin(x), cos(x)}) is a vector space. But we also know from Calculus that:

d/dx (a · sin(x) + b · cos(x)) = a · cos(x) - b · sin(x) ∈ W.

For example, p(x) = 3sin(x) + 5cos(x) is transformed by the derivative into p′(x) = 3cos(x) - 5sin(x).

But because the derivative is again a function in W, we can regard the derivative as a linear transformation D from W to W, and because W is 2-dimensional, we will construct a 2 × 2 matrix that will represent D as a linear transformation. We will also see that D is in fact both one-to-one and onto on W, which means that D is an isomorphism on W. This invertibility property will allow us to solve certain ordinary differential equations using an inverse matrix. In other words, we will be able to find, for example, a function y = f(x) such that:

3y″ - 5y′ + 2y = 4sin(x) - 7cos(x).
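As a preview of the matrix machinery developed in this chapter, here is a minimal sketch of that plan for the sample equation above, using coordinates with respect to the basis {sin(x), cos(x)} of W; the coordinate matrix of D follows from D(sin x) = cos x and D(cos x) = -sin x.

# A sketch of the matrix method previewed above, using coordinates with
# respect to the basis B = {sin(x), cos(x)} of W = Span({sin(x), cos(x)}).
from sympy import Matrix, eye, sin, cos, symbols, simplify

D = Matrix([[0, -1],
            [1,  0]])          # D(sin x) = cos x, D(cos x) = -sin x

# The operator 3*D^2 - 5*D + 2*I, applied in coordinates:
L = 3*D**2 - 5*D + 2*eye(2)

rhs = Matrix([4, -7])          # coordinates of 4*sin(x) - 7*cos(x)
a, b = L.solve(rhs)            # coordinates of the solution y = a*sin(x) + b*cos(x)

x = symbols('x')
y = a*sin(x) + b*cos(x)
print(simplify(3*y.diff(x, 2) - 5*y.diff(x) + 2*y))   # 4*sin(x) - 7*cos(x)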


5.1 Introduction to General Linear Transformations

Now we begin generalizing the terms, constructions and Theorems from Chapter 3:

Definition: Let (V, ⊕, ⊙) and (W, ⊞, ⊡) be any two vector spaces, with their indicated vector addition and scalar multiplication. A linear transformation T : V → W is a function that assigns a unique member w ∈ W to every vector v ∈ V, such that T satisfies the following conditions for all u, v ∈ V and all scalars k ∈ ℝ:

The Additivity Property:    T(u ⊕ v) = T(u) ⊞ T(v), and
The Homogeneity Property:   T(k ⊙ v) = k ⊡ T(v).

As usual, we write T(v) = w, the image of v under T.

It is of course cumbersome to explicitly specify that the addition and scalar multiplication on the left side of the equations (inside the parentheses) are those of the space V, and those on the right side of the equations are the operations in W, and indeed we will usually just write: T : V → W is a function, and T satisfies:

T(u + v) = T(u) + T(v), and T(k · v) = k · T(v),

when the addition and scalar multiplication on each side of the equations are clear. We can visualize the two linearity properties with essentially the same diagram as in Section 3.1: one panel illustrates The Additivity Property and the other The Homogeneity Property.

As before, we call V the domain of T and W the codomain of T. A linear transformation is also called a vector space homomorphism, the last word literally meaning "same form," because T essentially preserves the vector addition and scalar multiplication in both spaces. When the domain is the same space as the codomain, i.e. T : V → V, we once again call T a linear operator, as we did for Euclidean spaces.


Let us generalize the simplest kinds of linear transformations:

Definitions/Theorem: Let (V, ⊕, ⊙) and (W, ⊞, ⊡) be any two vector spaces. The following three examples are all linear transformations:

The zero transformation from V to W is the function Z : V → W, where Z(v) = 0_W for all v ∈ V.

The identity operator of V is the function I_V : V → V, where I_V(v) = v for all v ∈ V.

More generally, for any scalar k, the scaling operator of V by k is the function S_k : V → V, where S_k(v) = k ⊙ v for all v ∈ V.

Proving that these are all linear transformations is easy, and the proofs are left as Exercises. We began our study of linear transformations of Euclidean spaces by constructing the standard matrix [T] and deriving information about T from the properties of matrix multiplication. However, it is only possible to create a matrix for a linear transformation of abstract vector spaces if the domain is finite-dimensional, as we shall see in the next Section. In the meantime, we will try to be as general as possible in re-defining the terms and proving the properties from Chapter 3 in the context of abstract vector spaces, even if these spaces are infinite-dimensional, and explore examples related to Algebra and Calculus.

Evaluation Transformations

Let V = F(I) and W = ℝ. We can construct the evaluation transformation (sometimes called the evaluation homomorphism):

E_a : F(I) → ℝ, where E_a(f(x)) = f(a),

for some fixed number a ∈ I. For example, E₁(ln(x)) = 0, and E₁(sin⁻¹(x)) = …

22. Suppose that T : ℙ₂ → ℙ₂ is an operator whose matrix with respect to the basis B = {2, 5 - x, 2 + 3x - x²} is given by:

[T]_B =
[ ·  ·  · ]
[ ·  ·  · ]
[ ·  ·  · ]

a. Find [3x² + 5x - 7]_B.
b. Use (a) to compute T(3x² + 5x - 7). Don't forget to perform all three steps.
c. Use the idea of (a) and (b) to compute T(1), T(x) and T(x²). See the Hint in Exercise 27(c).
d. Use (c) to construct [T]_{B′}, the matrix of T with respect to the standard basis B′ = {1, x, x²}.
e. Use [T]_{B′} to recompute T(3x² + 5x - 7).

23. Completing the Proof of Theorem 5.2.1: If V is a vector space and B is a fixed basis for V, prove that for all vectors u ∈ V:

(c · u)_B = c · (u)_B.

Hint: what is the meaning of each side of this equation, starting with (u)_B?

24. Consider the plane through the origin, Π, with equation 3x + 7y - 8z = 0. Review the last Example in this Section.
a. Show that v1 = (7, -3, 0) and v2 = (8, 0, 3) are two linearly independent vectors on the plane Π.
b. Let n = (3, 7, -8) be the obvious normal for the plane. Show that B = {v1, v2, n} is a basis for ℝ³.
c. Use the geometric description of proj_Π to explain why [proj_Π]_B = Diag(1, 1, 0).
d. Use a single matrix (of dimension 3 × 6) to find (e1)_B, (e2)_B, and (e3)_B.
e. Compute proj_Π(e1), proj_Π(e2), and proj_Π(e3) using (c) and (d).
f. Find the standard matrix [proj_Π], using (e).
g. Similarly, show that [refl_Π]_B = Diag(1, 1, -1).
h. Find the standard matrix [refl_Π]. You may use your answers from (d).
i. Let L be the line Span({n}). Find the standard matrix [proj_L].

25. Repeat Exercise 24 with the indicated plane. For the first step, find vectors v1 and v2 on Π where either x = 0 or y = 0 or z = 0, to slightly simplify the computations.
a. Π : 5x - 3y + 7z = 0.
b. Π : 2x - y + 5z = 0.
c. Π : x = … z. What two linearly independent vectors are on Π?


26. Alternative Formulas for Projection and Reflection Matrices: Let us generalize the previous Exercise. Suppose that Π : ax + by + cz = 0 is a plane in ℝ³ passing through the origin, and form the 3 × 3 matrix C … , assuming that none of the coefficients is zero.
a. Explain the relevance of the three columns of C.
b. Explain why C is invertible.
c. Show that:

[proj_Π] = C · Diag(1, 1, 0) · C⁻¹ and [refl_Π] = C · Diag(1, 1, -1) · C⁻¹.

d. How would you modify the matrix C if Π had equation ax + cz = 0, where a ≠ 0 and c ≠ 0?

27. The Minimizing Theorem: Now that we have coordinate vectors, we can state a general version of The Minimizing Theorem: Suppose that V is a finite-dimensional vector space, with basis B = {v1, v2, ..., vn}. Suppose that S = {w1, w2, ..., wk} is any finite subset of vectors from V. Let [w1]_B, [w2]_B, ..., [wk]_B be the respective coordinate matrices. Let us assemble them in the columns of:

A = [ [w1]_B  [w2]_B  ⋯  [wk]_B ], with rref R.

Suppose that i1, i2, ..., im are the columns of R that contain the leading variables. Prove that the set S′ = {w_{i1}, w_{i2}, ..., w_{im}}, that is, the subset of vectors of S corresponding to the leading columns of A, is a linearly independent set, and:

Span(S) = Span(S′).

Furthermore, every w_j ∈ S - S′, that is, the vectors of S corresponding to the free variables of R, can be expressed as linear combinations of the vectors of S′, using the coefficients found in the corresponding column of R.

28. Use the Minimizing Theorem above to find a subset S′ of S = {w1, w2, ..., wk} which is linearly independent and such that Span(S) = Span(S′), and for every w_j ∈ S - S′, find a linear combination of the vectors from S′ that adds up to w_j. Use a convenient basis B for the ambient space (if this space is given as Span(B), use B itself as a basis). You may use the symbols w1, w2, ..., wk in your answers.
a. S = {5 - 4x + 3x², 6 - 7x + 2x², 2 + 5x + 6x², 1 + 2x + 3x²} ⊂ ℙ₂
b. S = {3 + 5x + x² + 4x³, 4 + 7x + 2x² + 3x³, x + 2x² - 7x³, 3 + 4x - x² + 2x³, 7 + 3x - 15x² + 7x³} ⊂ ℙ₃
c. S = {3 - 4x - 3x² + x³ - 5x⁴, -1 + 3x + 2x² + 4x³ + 3x⁴, 3 + 11x + 6x² + 40x³ + 7x⁴, 7 + 4x + x² + 37x³ - x⁴, -2 + 3x + 2x² + x³ + 3x⁴, 1 + 3x + 3x² + 6x³ + 6x⁴} ⊂ ℙ₄
d. S = {5eˣ + 4e³ˣ + 3e⁴ˣ, 6eˣ + 7e³ˣ + 2e⁴ˣ, 2eˣ - 5e³ˣ + 6e⁴ˣ, eˣ - 2e³ˣ + 3e⁴ˣ, 5eˣ + 11e³ˣ} ⊂ Span({eˣ, e³ˣ, e⁴ˣ})


29. Suppose that V and W are vector spaces with dim(V) = n and dim(W) = m. Let B = {v1, v2, ..., vn} be a basis for V, and let B′ = {w1, w2, ..., wm} be a basis for W. Suppose that T1 and T2 are both linear transformations with domain V and codomain W. Prove that:
a. [T1 + T2]_{B,B′} = [T1]_{B,B′} + [T2]_{B,B′}.
b. [k · T1]_{B,B′} = k · [T1]_{B,B′}.

30. Casting Shadows: Let us imagine that the yz-plane is a wall, and a window-pane is formed by the unit vectors j and k. Imagine also that the sun is located infinitely far away, in the direction of u = (a, b, c). For now, let us assume that c > 0 (i.e. the sun is above the horizon), and that light is coming from the sun in parallel rays, in the direction of -u.

If v is an arbitrary vector on the yz-plane, in standard position, let S_u(v) be its shadow on the xy-plane. Thus, we have a function S_u : ℝ² → ℝ², where the domain ℝ² refers to the yz-plane, with basis B = {j, k}, and the codomain ℝ² refers to the xy-plane, with basis B′ = {i, j}. In the figure, we show the image under S_u of the unit square, with the letter R inside.

a. Explain why our assumptions imply that the shadow of a triangle is again a triangle, and the shadows of two parallel vectors are again parallel. Thus, S_u is additive and homogeneous.
b. Find S_u(j) and S_u(k). Hint: j is obviously its own shadow. For k, think of the line passing through (0, 0, 1) with direction vector u, and where its shadow will fall on the xy-plane. You will need the equation of this line.
c. Use (b) to assemble [S_u]_{B,B′}.
d. Find [S_u]_{B,B′} if u = (3, -2, 5), and sketch the effect of S_u on B.
e. Show that [S_u]_{B,B′} is undefined if c = 0, but it is still defined if c < 0. What would be the physical interpretation of [S_u]_{B,B′} if c < 0? Hint: show that if c < 0, [S_u]_{B,B′} is exactly the same as what we would get if we replaced u with -u. Notice that the z-component of -u will now be positive. Demonstrate your answer with u = (3, -2, -5).


5.3 One-to-One and Onto Linear Transformations; Compositions of Linear Transformations

Now that we have some understanding of how to compute the action of a linear transformation with finite-dimensional domain and codomain by constructing a matrix for it, we will go into a deeper exploration of the properties of linear transformations. In particular, we will see how to generalize the idea of a linear transformation being one-to-one or onto, how to use the rref of a matrix in order to find a basis for the kernel and the range and to test for the one-to-one and onto properties, and how to compose two transformations and find a matrix for this composition.

One-to-One Transformations and Onto Transformations

We can use exactly the same definition for one-to-one functions that we saw with Euclidean spaces:

Definition: We say that a linear transformation T : V → W is one-to-one or injective if the images of different vectors from the domain are different vectors in the codomain:

if v1 ≠ v2, then T(v1) ≠ T(v2).

We again say that T is an injection or an embedding. As before, we can rephrase this definition in terms of its contrapositive:

Theorem 5.3.1: A linear transformation T : V → W is one-to-one if and only if the only way two vectors from the domain can have the same image in the codomain is for them to be the same vector to begin with: if T(v1) = T(v2), then v1 = v2. In other words, the only solution to T(v1) = T(v2) is v1 = v2.

Finally, this condition is once again intimately related to the kernel of T:

Theorem 5.3.2 - The Kernel Test for Injectivity:
A linear transformation T : V → W is one-to-one if and only if ker(T) = {0_V}.

The proof, of course, is exactly the same as in Chapter 3, thanks to the linearity properties of T. We would also like to point out that all the statements above are true even if V or W is infinite-dimensional (you will notice in the proof in Chapter 3 that there is no mention whatsoever of a matrix for T). As with one-to-one transformations, we can define onto transformations in exactly the same way as with Euclidean spaces:


Definition/Theorem 5.3.3: We say that a linear transformation T : V → W is onto or surjective if the range of T is all of W:

range(T) = W.

Since rank(T) = dim(range(T)), we can also say that T is onto if and only if rank(T) = dim(W), in the case when W is finite-dimensional.

We again say that T is a surjection or a covering. We can visualize these two concepts using essentially the same diagrams from Chapter 3, with only the spaces relabelled: T is one-to-one if and only if ker(T) = {0_V}, and T is onto if and only if range(T) = W.

Finding the Kernel and Range Using [T]_{B,B′}

Our next task is to determine if a given linear transformation is one-to-one, onto, neither or both. For this, we will need to study its kernel and range. Recall that we can find a basis for the kernel and range of a linear transformation T : ℝⁿ → ℝᵐ by examining the rref of its standard matrix [T]. The nullspace of [T] is the same as ker(T), and the original columns of [T] corresponding to the leading 1's in the rref form a basis for range(T), since this subspace is the same as colspace([T]). We can apply this idea to [T]_{B,B′} when we are dealing with a linear transformation between abstract vector spaces. We will leave the proof of the following as an Exercise:

Theorem 5.3.4: Suppose that T : V → W is a linear transformation, with dim(V) = n and dim(W) = m, both finite-dimensional vector spaces. Let B = {v1, v2, ..., vn} be a basis for V, and let B′ = {w1, w2, ..., wm} be a basis for W. Let us construct the m × n matrix [T]_{B,B′} as we did in the previous Section, and let R be the rref of [T]_{B,B′}. Suppose that:

{z1, z2, ..., zk} ⊂ ℝⁿ

is the basis that we obtain for nullspace([T]_{B,B′}) using R, as we did in Chapter 2. By the Uniqueness of Representation Property, we know that there exists u_i ∈ V so that (u_i)_B = z_i for every i = 1...k.

We conclude that the set {u1, u2, ..., uk} ⊂ V is a basis for ker(T). As usual, if there are no free variables in R, then nullspace([T]_{B,B′}) = {0_n}, and consequently ker(T) = {0_V}.

Similarly, the set of original columns {c_{i1}, c_{i2}, ..., c_{ir}} ⊂ ℝᵐ from [T]_{B,B′} corresponding to the leading 1's of R forms a basis for columnspace([T]_{B,B′}), as we found in Chapter 2, and there exists d_j ∈ W so that (d_j)_{B′} = c_{ij} for every j = 1...r.

We conclude that the set {d1, d2, ..., dr} ⊂ W is a basis for range(T). If T is the zero transformation, then range(T) = {0_W}.

In other words, the information provided by [T]_{B,B′} and R simply needs to be decoded with respect to the corresponding basis: we use B to find a basis for ker(T) from a basis for nullspace([T]_{B,B′}), and we use B′ to find a basis for range(T) from a basis for columnspace([T]_{B,B′}).

Example: Let T : ℙ₃ → ℙ₂ be given by:

T(p(x)) = p′(x) + 3x · p″(x) - 2p(-1).

We leave it to the reader to verify that this is indeed a linear transformation. Let us choose the standard bases B = {1, x, x², x³} for ℙ₃ and B′ = {1, x, x²} for ℙ₂. We compute T on the basis vectors:

T(1) = 0 + 0 - 2 · 1 = -2,
T(x) = 1 + 3x · 0 - 2(-1) = 3,
T(x²) = 2x + 3x · 2 - 2 · 1 = 8x - 2, and
T(x³) = 3x² + 3x · 6x - 2(-1) = 21x² + 2.

Now, we encode each as a column of [T]_{B,B′}:

[T]_{B,B′} = [ -2  3  -2   2 ]
             [  0  0   8   0 ]
             [  0  0   0  21 ],

with rref

R = [ 1  -3/2  0  0 ]
    [ 0   0    1  0 ]
    [ 0   0    0  1 ].

Thus, we have one free variable, and the coordinates with respect to B of the single member of the basis for our kernel are (3/2, 1, 0, 0). Clearing fractions, we can use (3, 2, 0, 0). Decoding these coordinates with respect to B, the actual polynomial is:

p(x) = 3 · 1 + 2 · x + 0 · x² + 0 · x³ = 3 + 2x.

We can check that T(p(x)) = 2 + 3x · 0 - 2(3 - 2) = 0, so p(x) is indeed in the kernel. Thus:

ker(T) = Span({3 + 2x}).

Since ker(T) ≠ {z(x)}, T is not one-to-one.

Similarly, we have leading 1's in the 1st, 3rd and 4th columns, so the coordinates with respect to B′ of the members of the basis for our range are found in the original 1st, 3rd and 4th columns of [T]_{B,B′}: (-2, 0, 0), (-2, 8, 0) and (2, 0, 21). Decoding these coordinates with respect to B′, the actual members of our basis are:

{-2 + 0 · x + 0 · x², -2 + 8 · x + 0 · x², 2 + 0 · x + 21x²} = {-2, -2 + 8x, 2 + 21x²}.

But notice that dim(ℙ₂) = 3 and our basis above has 3 members, so actually:

range(T) = ℙ₂ = Span({-2, -2 + 8x, 2 + 21x²}) = Span({1, x, x²}),

and T is onto. □
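The rref-and-decode procedure of this Example is easy to reproduce with software; the following sketch uses the matrix [T]_{B,B′} computed above.

# Reproducing the Example above with sympy: T(p) = p' + 3x*p'' - 2p(-1),
# encoded as the matrix [T]_{B,B'} with B = {1, x, x^2, x^3}, B' = {1, x, x^2}.
from sympy import Matrix

T = Matrix([[-2, 3, -2,  2],
            [ 0, 0,  8,  0],
            [ 0, 0,  0, 21]])

R, pivots = T.rref()
print(R)         # the rref; the free column gives the kernel
print(pivots)    # (0, 2, 3): original columns 1, 3, 4 give a basis for the range

# Kernel: decode each nullspace vector against B = {1, x, x^2, x^3}.
for v in T.nullspace():
    v = v * 2            # clear the fraction 3/2, as in the text
    print(v.T)           # coordinates (3, 2, 0, 0)  ->  p(x) = 3 + 2x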

The Dimension Theorem for Abstract Vector Spaces

We will now fully generalize the Dimension Theorem for a linear transformation involving abstract vector spaces:

Theorem 5.3.5 - The Dimension Theorem: Let T : V → W be a linear transformation, and suppose that V is finite-dimensional with dim(V) = n. Then, both ker(T) and range(T) are finite-dimensional, and we can define:

rank(T) = dim(range(T)), and nullity(T) = dim(ker(T)).

Furthermore, these quantities are related by the equation:

rank(T) + nullity(T) = n = dim(V) = dim(domain of T).

Proof: Since V is finite-dimensional, ker(T) ≤ V is automatically finite-dimensional, and furthermore, 0 ≤ nullity(T) ≤ n.

Let us consider the trivial case that happens if nullity(T) = n. This is equivalent to T being the zero transformation to W, that is, T(v) = 0_W for all v ∈ V. But this is equivalent to range(T) = {0_W}, which means rank(T) = 0. Thus, this case satisfies rank(T) + nullity(T) = 0 + n = n.

Now, suppose that nullity(T) = k < n. If k > 0, that is, T is not one-to-one, suppose that S = {v1, v2, ..., vk} is a basis for ker(T). However, if T is one-to-one, then k = 0, and there is no basis for ker(T), so we may just assume that S is the empty set.

By the Extension Theorem, we can enlarge S, one vector at a time, to a basis

B = {v1, v2, ..., vk, v_{k+1}, v_{k+2}, ..., vn}

for V. Thus, any vector v ∈ V can be expressed uniquely as a linear combination:

v = c1v1 + c2v2 + ⋯ + ckvk + c_{k+1}v_{k+1} + c_{k+2}v_{k+2} + ⋯ + cnvn.

From this, we get:

T(v) = T(c1v1 + c2v2 + ⋯ + ckvk + c_{k+1}v_{k+1} + c_{k+2}v_{k+2} + ⋯ + cnvn)
     = T(c1v1) + T(c2v2) + ⋯ + T(ckvk) + T(c_{k+1}v_{k+1}) + T(c_{k+2}v_{k+2}) + ⋯ + T(cnvn)
     = c1T(v1) + c2T(v2) + ⋯ + ckT(vk) + c_{k+1}T(v_{k+1}) + c_{k+2}T(v_{k+2}) + ⋯ + cnT(vn)
     = c_{k+1}T(v_{k+1}) + c_{k+2}T(v_{k+2}) + ⋯ + cnT(vn),

since T(v1) = 0_W, T(v2) = 0_W, ..., T(vk) = 0_W. This tells us that range(T) is Spanned by the set {T(v_{k+1}), T(v_{k+2}), ..., T(vn)}.

Since there are n - k vectors in this set, we will complete the Proof by showing that this set is also linearly independent, and thus rank(T) = n - k. We construct the dependence test equation:

d_{k+1}T(v_{k+1}) + d_{k+2}T(v_{k+2}) + ⋯ + dnT(vn) = 0_W.

Reversing the steps above, we get: T(d_{k+1}v_{k+1} + d_{k+2}v_{k+2} + ⋯ + dnvn) = 0_W. This shows that the vector v = d_{k+1}v_{k+1} + d_{k+2}v_{k+2} + ⋯ + dnvn is a member of ker(T). Recall, though, that {v1, v2, ..., vk} is a basis for ker(T). Thus, we can find coefficients d1, d2, ..., dk such that:

v = d1v1 + d2v2 + ⋯ + dkvk;

in other words:

-d1v1 - d2v2 - ⋯ - dkvk + d_{k+1}v_{k+1} + d_{k+2}v_{k+2} + ⋯ + dnvn = 0_V.

But now, since {v1, v2, ..., vk, v_{k+1}, v_{k+2}, ..., vn} is a basis for all of V, this set is linearly independent, and so d1 = d2 = ⋯ = dk = d_{k+1} = ⋯ = dn = 0. In particular, this shows that the coefficients d_{k+1} through dn in our dependence test equation for T(v_{k+1}), T(v_{k+2}), ..., T(vn) above are all zero, and so this set is linearly independent. This completes the Proof. ■

Example: In our previous Example, the basis for our kernel had one vector, so nullity(T) = 1. However, range(T) = ℙ₂, so rank(T) = 3. Thus, we verify the Dimension Theorem for this Example:

rank(T) + nullity(T) = 3 + 1 = 4 = dim(ℙ₃). □

The Dimension Theorem also tells us that if V is finite-dimensional, then the range of T : V → W is also finite-dimensional, even if the codomain W is infinite-dimensional. Thus, we can also regard T as a linear transformation T : V → range(T), and now both V and range(T) are finite-dimensional. This means that we can always construct the matrix of a linear transformation with respect to finite bases when the domain is finite-dimensional.

Comparing Dimensions

As before, knowledge of the relative dimensions of the domain and codomain can immediately tell us if T is not one-to-one or not onto (proven exactly as in Chapter 2):

Theorem 5.3.6: Suppose T : V → W is a linear transformation of finite-dimensional vector spaces. Then:
a) if dim(V) < dim(W), then T cannot be onto.
b) if dim(V) > dim(W), then T cannot be one-to-one.

Example: Any linear transformation T : ℝ⁴ → ℙ₇ cannot be onto, since:

dim(ℝ⁴) = 4 < 8 = dim(ℙ₇).

However, it may or may not be one-to-one. Similarly, any linear transformation T : ℙ₆ → ℝ⁶ may or may not be onto, but it cannot be one-to-one, since:

dim(ℙ₆) = 7 > 6 = dim(ℝ⁶).

Compositions of Linear Transformations

We can compose two linear transformations, as before, as long as the codomain of the first transformation is the same as the domain of the second transformation:

Definition/Theorem 5.3.7: Suppose that T1 : V → U and T2 : U → W are linear transformations. Then their composition:

T2 ∘ T1 : V → W

is also a linear transformation. Its action is given as follows: Suppose v ∈ V, T1(v) = u ∈ U, and T2(u) = w ∈ W. Then:

(T2 ∘ T1)(v) = T2(T1(v)) = T2(u) = w.

We can visualize the composition of these two transformations using the usual diagram (The Composition of Two Linear Transformations).

Again, the linearity of the composition follows from that of the individual transformations, and is left as an easy Exercise.

Example: Let T1 : ℙ₂ → ℙ₄ and T2 : ℙ₄ → ℙ₃ be given by:

T1(p(x)) = (x² - x + 3) · p(x), and T2(r(x)) = d/dx r(x) = D(r(x)).

For instance:

T1(3x² - 5x + 4) = (x² - x + 3) · (3x² - 5x + 4) = 3x⁴ - 8x³ + 18x² - 19x + 12, and
T2(5x⁴ - 3x² + 7x - 2) = 20x³ - 6x + 7.

We saw in the Exercises of Section 5.1 that multiplying a polynomial by a fixed polynomial is a linear transformation, and so is taking a derivative, so both T1 and T2 are linear transformations. Let us demonstrate the composition T2 ∘ T1 on 3x² - 5x + 4:

(T2 ∘ T1)(3x² - 5x + 4) = T2(T1(3x² - 5x + 4))
                        = T2(3x⁴ - 8x³ + 18x² - 19x + 12)   (from above)
                        = 12x³ - 24x² + 36x - 19. □

Example: Let D : C¹(I) → C(I) be the differentiation operation, where I = [a, b], and let Ind : C(I) → C¹(I) be the indefinite integral operation, Ind(f) = ∫ₐˣ f(t) dt. Then:

(D ∘ Ind)(f) = D( ∫ₐˣ f(t) dt ) = d/dx ∫ₐˣ f(t) dt = f(x),

where the last equation follows from The Fundamental Theorem of Calculus. Thus, D ∘ Ind = I_{C(I)}, the identity operator on C(I). However, let us see what happens to f(x) = x² - 5x + 3 under the reverse composition Ind ∘ D, where I = [0, 1]:

(Ind ∘ D)(x² - 5x + 3) = Ind( d/dx (x² - 5x + 3) ) = Ind(2x - 5)
                       = ∫₀ˣ (2t - 5) dt = t² - 5t |₀ˣ = x² - 5x.

Thus, in this case, Ind ∘ D ≠ I_{C¹(I)}. Notice that in particular, if f(x) = c is any constant-valued function with c ≠ 0, then (Ind ∘ D)(c) = 0 ≠ c. □
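A quick symbolic check of the two compositions on the sample function above (with I = [0, 1]):

# Checking the two compositions above on a sample function, with I = [0, 1].
from sympy import symbols, integrate, diff

x, t = symbols('x t')
f = x**2 - 5*x + 3

Ind = lambda g: integrate(g.subs(x, t), (t, 0, x))   # Ind(g) = integral from 0 to x
D   = lambda g: diff(g, x)

print(D(Ind(f)))   # x**2 - 5*x + 3  : D o Ind acts as the identity
print(Ind(D(f)))   # x**2 - 5*x      : Ind o D loses the constant term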

In Section 5.4, we will study invertible linear transformations. Recall from the Two-for-One Theorem for the Matrix Inverse (3.7.1) that if AB = Iₙ, where A and B are both n × n matrices, then A and B are inverses of each other, and so BA = Iₙ as well. However, our Example above shows that this may be false in the infinite-dimensional case. More generally, if we have a (finite) sequence of linear transformations T1, T2, ..., Tn, where the codomain of Ti is the domain of T_{i+1} for all i = 1 ... n - 1, we can once again construct the n-fold composition:

Tn ∘ T_{n-1} ∘ ⋯ ∘ T2 ∘ T1,

defined inductively in the usual manner as Tn ∘ (T_{n-1} ∘ ⋯ ∘ T2 ∘ T1).

The Matrix of a Composition

It should be no surprise that we can compute the matrix of a composition using a matrix product:

Theorem 5.3.8: Let T1 : V → U and T2 : U → W be linear transformations of finite-dimensional vector spaces. Let B be a basis for V, B′ a basis for U, and B″ a basis for W. Then:

[T2 ∘ T1]_{B,B″} = [T2]_{B′,B″} · [T1]_{B,B′}.

In particular, if V = U = W, that is, T1 and T2 are operators on V, then: [T2 ∘ T1]_B = [T2]_B · [T1]_B. Furthermore, if T1 = T2 = T, the self-composition T ∘ T = T² has matrix [T²]_B = [T]_B². We can generalize this formula for the r-fold self-composition: [Tʳ]_B = [T]_B^r.

Proof: Let B = {v1, v2, ..., vn}, B′ = {u1, u2, ..., uk} and B″ = {w1, w2, ..., wm} be the respective bases. By construction, the matrices we are interested in are:

[T2 ∘ T1]_{B,B″} = Z = [ [(T2 ∘ T1)(v1)]_{B″}  [(T2 ∘ T1)(v2)]_{B″}  ⋯  [(T2 ∘ T1)(vn)]_{B″} ],
[T2]_{B′,B″} = X = [ [T2(u1)]_{B″}  [T2(u2)]_{B″}  ⋯  [T2(uk)]_{B″} ], and
[T1]_{B,B′} = Y = [ [T1(v1)]_{B′}  [T1(v2)]_{B′}  ⋯  [T1(vn)]_{B′} ].

Our goal is to show that Z = XY. All we have to do is unravel the definitions. The first column of Z is [(T2 ∘ T1)(v1)]_{B″}. We need to show that this equals the first column of XY. But recall from the definition of general matrix products that the first column of XY is Xy1, where y1 is the first column of Y. But y1 = [T1(v1)]_{B′}. Thus, we must show that:

[(T2 ∘ T1)(v1)]_{B″} = X[T1(v1)]_{B′} = [T2]_{B′,B″}[T1(v1)]_{B′}.

But recall that in general (changing notation to avoid confusion): [T]_{S,S′}[v]_S = [T(v)]_{S′}, where S is a basis for the domain of T and S′ a basis for the codomain. Thus:

[T2]_{B′,B″}[T1(v1)]_{B′} = [T2(T1(v1))]_{B″} = [(T2 ∘ T1)(v1)]_{B″}.

Similarly, the rest of the columns of Z are equal to the corresponding columns of XY. ■

Clearly, this idea also applies to the composition of several linear transformations, as long as the compatibility criterion is satisfied.

Example: Let T1 : ℙ₃ → ℙ₂ and T2 : ℙ₂ → ℝ² be given by:

T1(p(x)) = 3p′(x) - 5x · p″(x), and T2(q(x)) = (q(-2), q(3)).

We will leave it as an Exercise to show that these are indeed linear transformations. Let us find the individual matrices and the matrix of the composition using the standard bases B = {1, x, x², x³}, B′ = {1, x, x²}, and B″ = {e1, e2} for ℙ₃, ℙ₂ and ℝ², respectively:

T1(1) = 3 · 0 - 5x · 0 = 0,
T1(x) = 3 · 1 - 5x · 0 = 3,
T1(x²) = 3 · 2x - 5x · 2 = -4x, and
T1(x³) = 3 · 3x² - 5x · 6x = -21x².

Thus we can assemble:

[T1]_{B,B′} = [ 0  3   0    0 ]
              [ 0  0  -4    0 ]
              [ 0  0   0  -21 ].

Now for T2: T2(1) = (1, 1), T2(x) = (-2, 3), and T2(x²) = (4, 9), so

[T2]_{B′,B″} = [ 1  -2  4 ]
               [ 1   3  9 ].

We get the matrix of the composition using a matrix product:

[T2 ∘ T1]_{B,B″} = [T2]_{B′,B″}[T1]_{B,B′} = [ 1  -2  4 ] [ 0  3   0    0 ]   [ 0  3    8   -84 ]
                                             [ 1   3  9 ] [ 0  0  -4    0 ] = [ 0  3  -12  -189 ].
                                                          [ 0  0   0  -21 ]

Since we computed T1 explicitly for the members of B, we can find their values under the composition directly:

T2(T1(1)) = T2(0) = (0, 0),
T2(T1(x)) = T2(3) = (3, 3),
T2(T1(x²)) = T2(-4x) = (8, -12), and
T2(T1(x³)) = T2(-21x²) = (-21 · 4, -21 · 9) = (-84, -189),

and we can see that these are indeed the four columns of the matrix product. □
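A one-line check of Theorem 5.3.8 for this Example, using the two matrices assembled above:

# Verifying the matrix of the composition in the Example above.
from sympy import Matrix

T1 = Matrix([[0, 3,  0,   0],
             [0, 0, -4,   0],
             [0, 0,  0, -21]])      # [T1]_{B,B'}

T2 = Matrix([[1, -2, 4],
             [1,  3, 9]])           # [T2]_{B',B''}, columns T2(1), T2(x), T2(x^2)

print(T2 * T1)   # [[0, 3, 8, -84], [0, 3, -12, -189]] = [T2 o T1]_{B,B''}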

Example: We saw in the previous Section that the function space W = Span(B), where B = {x²e⁻³ˣ, xe⁻³ˣ, e⁻³ˣ}, is preserved by the derivative operation D. In other words, D is an operator D : W → W. We also found the matrix of D with respect to B:

[D]_B = [ -3   0   0 ]
        [  2  -3   0 ]
        [  0   1  -3 ].

Thus, the second derivative D ∘ D = D² also preserves W, and:

[D²]_B = [D]_B² = [   9    0   0 ]
                  [ -12    9   0 ]
                  [   2   -6   9 ].

We can use this matrix to find the 2nd derivative of f(x) = 8x²e⁻³ˣ - 5xe⁻³ˣ + 9e⁻³ˣ using the matrix product:

[D²]_B [f]_B = [   9    0   0 ] [  8 ]   [   72 ]
               [ -12    9   0 ] [ -5 ] = [ -141 ],
               [   2   -6   9 ] [  9 ]   [  127 ]

so f″(x) = 72x²e⁻³ˣ - 141xe⁻³ˣ + 127e⁻³ˣ.
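The coordinate computation above can be double-checked by differentiating directly; a short sympy sketch:

# Double-checking the 2nd-derivative computation above by direct differentiation.
from sympy import symbols, exp, diff, simplify

x = symbols('x')
f = (8*x**2 - 5*x + 9) * exp(-3*x)

print(simplify(diff(f, x, 2) * exp(3*x)))   # 72*x**2 - 141*x + 127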



5.3 Section Summary

For Theorems 5.3.1 to 5.3.6, suppose that T : V → W is a linear transformation.

5.3.1: T is one-to-one if and only if the only way two vectors from the domain can have the same image in the codomain is for them to be the same vector to begin with: if T(v1) = T(v2), then v1 = v2. In other words, the only solution to T(v1) = T(v2) is v1 = v2. (This Theorem is basically just the contrapositive of the definition.)

5.3.2 - The Kernel Test for Injectivity: T is one-to-one if and only if ker(T) = {0_V}.

5.3.3: T is onto or surjective if the range of T is all of W: range(T) = W. Since rank(T) = dim(range(T)), we can also say that T is onto if and only if rank(T) = dim(W), in the case when W is finite-dimensional.

5.3.4: The rref R of [T]_{B,B′} can be used to find a basis for nullspace([T]_{B,B′}) and columnspace([T]_{B,B′}), as we did in Chapter 2. By decoding the basis for nullspace([T]_{B,B′}) using B, we can find a basis for ker(T). Similarly, by decoding the basis for columnspace([T]_{B,B′}) using B′, we can find a basis for range(T).

5.3.5 - The Dimension Theorem: Suppose that V, the domain of T, is finite-dimensional with dim(V) = n. Then, both ker(T) and range(T) are finite-dimensional, and we can define:

rank(T) = dim(range(T)), and nullity(T) = dim(ker(T)).

Furthermore, these quantities are related by the equation:

rank(T) + nullity(T) = n = dim(V) = dim(domain of T).

5.3.6: Suppose that V and W are both finite-dimensional. Then:
a) if dim(V) < dim(W), then T cannot be onto.
b) if dim(V) > dim(W), then T cannot be one-to-one.

5.3.7: Suppose that T1 : V → U and T2 : U → W are linear transformations. Then their composition, T2 ∘ T1 : V → W, is also a linear transformation. Its action is given as follows: Suppose v ∈ V, T1(v) = u ∈ U, and T2(u) = w ∈ W. Then:

(T2 ∘ T1)(v) = T2(T1(v)) = T2(u) = w.

5.3.8: Let T1 : V → U and T2 : U → W be linear transformations of finite-dimensional vector spaces. Let B be a basis for V, B′ a basis for U, and B″ a basis for W. Then:

[T2 ∘ T1]_{B,B″} = [T2]_{B′,B″} · [T1]_{B,B′}.

In particular, if V = U = W, that is, T1 and T2 are operators on V, then: [T2 ∘ T1]_B = [T2]_B · [T1]_B. Furthermore, if T1 = T2 = T, the self-composition T ∘ T = T² has matrix [T²]_B = [T]_B². We can generalize this formula for the r-fold self-composition: [Tʳ]_B = [T]_B^r.


5.3 Exercises

1. Let T : ℙ₂ → ℝ⁴ be the linear transformation from Exercise 7 of Section 5.1, and Exercise 8 of Section 5.2, given by:

T(p(x)) = ( p(-2), p′(1), p″(x), ∫ p(x) dx ).

a. Can we immediately say that T is not one-to-one? Why or why not?
b. Can we immediately say that T is not onto? Why or why not?
   Let B = {1, x, x²} and B′ = {e1, e2, e3, e4}. The matrix [T]_{B,B′} can be found in the Answer Key for Exercise 8, Section 5.2.
c. Find the rref of [T]_{B,B′}.
d. Use (c) to find a basis (if possible) for ker(T) and state nullity(T).
e. Use (c) to find a basis for range(T) and state rank(T).
f. Is T one-to-one? Is T onto?
g. Verify the Dimension Theorem for T.
h. Describe all polynomials p(x) ∈ ℙ₂ such that:

p(-2) = 38, p′(1) = 3, p″(x) = 10, and ∫ p(x) dx = 13/6.

Hint: solve an augmented system that uses [T]_{B,B′}. What should be on the rightmost column?

2. Let T : ℙ₃ → ℙ₁ be the linear transformation from Exercise 8 in Section 5.1 and Exercise 9 in Section 5.2, given by: T(p(x)) = p″(x).
a. Can we immediately say that T is not one-to-one? Why or why not?
b. Can we immediately say that T is not onto? Why or why not?
   Let B = {1, x, x², x³} and B′ = {1, x}. The matrix [T]_{B,B′} can be found in the Answer Key for Exercise 9, Section 5.2.
c. Find the rref of [T]_{B,B′}.
d. Use (c) to find a basis (if possible) for ker(T) and state nullity(T).
e. Use (c) to find a basis for range(T) and state rank(T).
f. Is T one-to-one? Is T onto?
g. Verify the Dimension Theorem for T.

3. Let T : ℙ₂ → ℙ₃ be the linear transformation from Exercise 10 in Section 5.1 and Exercise 10 in Section 5.2, given by: T(p(x)) = ∫₀ˣ p(t) dt.
a. Can we immediately say that T is not one-to-one? Why or why not?
b. Can we immediately say that T is not onto? Why or why not?
   Let B = {1, x, x²} and B′ = {1, x, x², x³}. The matrix [T]_{B,B′} can be found in the Answer Key for Exercise 10, Section 5.2.
c. Find the rref of [T]_{B,B′}.
d. Use (c) to find a basis (if possible) for ker(T) and state nullity(T).
e. Use (c) to find a basis for range(T) and state rank(T).
f. Is T one-to-one? Is T onto?
g. Verify the Dimension Theorem for T.

4. Let T : ℙ₃ → ℙ₂ be the linear transformation from Exercise 18 in Section 5.2, given by:

T(p(x)) = p′(x) + (x + 1) · p″(x) + 2p(-1).

a. Can we immediately say that T is not one-to-one? Why or why not?
b. Can we immediately say that T is not onto? Why or why not?
   Let B = {1, x, x², x³} and B′ = {1, x, x²}. The matrix [T]_{B,B′} can be found in the Answer Key for Exercise 18, Section 5.2.
c. Find the rref of [T]_{B,B′}.
d. Use (c) to find a basis (if possible) for ker(T) and state nullity(T).
e. Use (c) to find a basis for range(T) and state rank(T).
f. Is T one-to-one? Is T onto?
g. Verify the Dimension Theorem for T.

5. Let T : ℙ₂ → ℙ₃ be the linear transformation from Exercise 19 in Section 5.2, given by:

T(p(x)) = (2x - 5) · p(x) + (x² + 3) · p′(x) - p(-2) · x³.

a. Can we immediately say that T is not one-to-one? Why or why not?
b. Can we immediately say that T is not onto? Why or why not?
   Let B = {1, x, x²} and B′ = {1, x, x², x³}. The matrix [T]_{B,B′} can be found in the Answer Key for Exercise 19, Section 5.2.
c. Find the rref of [T]_{B,B′}.
d. Use (c) to find a basis (if possible) for ker(T) and state nullity(T).
e. Use (c) to find a basis for range(T) and state rank(T).
f. Is T one-to-one? Is T onto?
g. Verify the Dimension Theorem for T.

6. Let T : ℙ₂ → ℙ₃ be the linear transformation given by:

T(p(x)) = (x² - 5) · p′(x) + p″(-1) · (x³ + 2x - 4).

a. Convince yourself mentally that T is indeed a linear transformation from ℙ₂ to ℙ₃.
b. Can we immediately say that T is not one-to-one? Why or why not?
c. Can we immediately say that T is not onto? Why or why not?
d. Let B = {1, x, x²} and B′ = {1, x, x², x³}. Find [T]_{B,B′}.
e. Find the rref of [T]_{B,B′}.
f. Use (e) to find a basis (if possible) for ker(T) and state nullity(T).
g. Use (e) to find a basis for range(T) and state rank(T).
h. Is T one-to-one? Is T onto?
i. Verify the Dimension Theorem for T.

7. Let T : ℙ₃ → ℙ₂ be the linear transformation given by:

T(p(x)) = p(-2) · (2x² - 10x + 6) + p′(-1) · (3x² - 15x + 9).

a. Convince yourself mentally that T is indeed a linear transformation from ℙ₃ to ℙ₂.
b. Can we immediately say that T is not one-to-one? Why or why not?
c. Can we immediately say that T is not onto? Why or why not?
d. Let B = {1, x, x², x³} and B′ = {1, x, x²}. Find [T]_{B,B′}.
e. Find the rref of [T]_{B,B′}.
f. Use (e) to find a basis (if possible) for ker(T) and state nullity(T).
g. Use (e) to find a basis for range(T) and state rank(T).
h. Is T one-to-one? Is T onto?
i. Verify the Dimension Theorem for T.

8. Suppose that T : ℙ₂ → ℙ₁ is a linear transformation whose matrix with respect to the bases B = {1, 5 - x, 2 + 3x - x²} for ℙ₂ and B′ = {x + 3, 2} for ℙ₁ is given by:

[T]_{B,B′} = [ ⋯ ].

Note: this was the linear transformation in Section 5.2, Exercise 20.
a. Can we immediately say that T is not one-to-one? Why or why not?
b. Can we immediately say that T is not onto? Why or why not?
c. Find the rref of [T]_{B,B′}.
d. Use (c) to find a basis (if possible) for ker(T) and state nullity(T).
e. Use (c) to find a basis for range(T) and state rank(T).
f. Is T one-to-one? Is T onto?
g. Verify the Dimension Theorem for T.

9. Suppose that T : ℙ₁ → ℙ₂ is a linear transformation whose matrix with respect to the bases B = {1, 2 + x} for ℙ₁ and B′ = {x² - x, x + 1, -1} for ℙ₂ is given by:

[T]_{B,B′} = [ ⋯ ].

Note: this was the linear transformation in Section 5.2, Exercise 21.
a. Can we immediately say that T is not one-to-one? Why or why not?
b. Can we immediately say that T is not onto? Why or why not?
c. Find the rref of [T]_{B,B′}.
d. Use (c) to find a basis (if possible) for ker(T) and state nullity(T).
e. Use (c) to find a basis for range(T) and state rank(T).
f. Is T one-to-one? Is T onto?
g. Verify the Dimension Theorem for T.

10. Suppose that T : ℙ₂ → ℙ₁ is a linear transformation whose matrix with respect to the bases B = {1, 2 + x, x - x²} for ℙ₂ and B′ = {x + 3, x - 1} for ℙ₁ is given by:

[T]_{B,B′} = [ ⋯ ].

a. Can we immediately say that T is not one-to-one? Why or why not?
b. Can we immediately say that T is not onto? Why or why not?
c. Find the rref of [T]_{B,B′}.
d. Use (c) to find a basis (if possible) for ker(T) and state nullity(T).
e. Use (c) to find a basis for range(T) and state rank(T).
f. Is T one-to-one? Is T onto?
g. Verify the Dimension Theorem for T.

11. Suppose that T : ℙ₁ → ℙ₂ is a linear transformation whose matrix with respect to the bases B = {1, 1 - x} for ℙ₁ and B′ = {x² + 2x, x - 1, 1} for ℙ₂ is given by:

[T]_{B,B′} = [ ⋯ ].

a. Can we immediately say that T is not one-to-one? Why or why not?
b. Can we immediately say that T is not onto? Why or why not?
c. Find the rref of [T]_{B,B′}.
d. Use (c) to find a basis (if possible) for ker(T) and state nullity(T).
e. Use (c) to find a basis for range(T) and state rank(T).
f. Is T one-to-one? Is T onto?
g. Verify the Dimension Theorem for T.

12. Suppose that T : ℙ₂ → ℙ₂ is the operator whose matrix with respect to the standard basis B = {1, x, x²} is given by:

[T]_B = [ ⋯ ].

a. Can we immediately say that T is not one-to-one? Why or why not?
b. Can we immediately say that T is not onto? Why or why not?
c. Find the rref of [T]_B.
d. Use (c) to find a basis (if possible) for ker(T) and state nullity(T).
e. Use (c) to find a basis for range(T) and state rank(T).
f. Is T one-to-one? Is T onto?
g. Verify the Dimension Theorem for T.
h. Find all polynomials p(x) ∈ ℙ₂, if possible, such that T(p(x)) = 6 - 7x - 9x². Hint: find the rref of a certain augmented matrix.

13. Suppose that T : ℙ₃ → ℙ₂ is a linear transformation with matrix

[T]_{B,B′} = [  4  -8   7  -3 ]
             [ -2   4  -5   9 ]
             [  3  -6   6  -6 ]

with respect to the bases B = {x³ + 1, x² - 1, x + 1, x - 1} and B′ = {x² - 1, x + 2, x - 1}.
a. Can we immediately say that T is not one-to-one? Why or why not?
b. Can we immediately say that T is not onto? Why or why not?
c. Find the rref of [T]_{B,B′}.
d. Use (c) to find a basis (if possible) for ker(T) and state nullity(T).
e. Use (c) to find a basis for range(T) and state rank(T).
f. Is T one-to-one? Is T onto?
g. Verify the Dimension Theorem for T.
h. Find all polynomials p(x) ∈ ℙ₃, if possible, such that T(p(x)) = 6x² + 15x + 51. Hint: you will need the rref of two augmented matrices. The first rref is related to the Encode step. The second rref is related to the Multiply step. You also need to think very carefully about how you should interpret the second rref.

14. Let T1 : ℙ₂ → ℙ₃ and T2 : ℙ₃ → ℙ₂ be the linear transformations given by:

T1(p(x)) = (x² + 3x - 5) · p′(x) + 4p(x), and T2(q(x)) = 3q′(x) - 5q″(x).

Let B = {1, x, x²} and B′ = {1, x, x², x³} be the standard bases, respectively, for ℙ₂ and ℙ₃.
a. Convince yourself mentally that T1 and T2 are linear transformations.
b. Find [T1]_{B,B′} and [T2]_{B′,B}.
c. Explain why both compositions T2 ∘ T1 and T1 ∘ T2 are defined.
d. Find [T2 ∘ T1]_B and [T1 ∘ T2]_{B′}.

15. Let T1 : ℙ₂ → ℙ₃ and T2 : ℙ₃ → ℝ³ be the linear transformations given by:

T1(p(x)) = (2x + 3) · p(x), and T2(q(x)) = (q(-3), q′(2), q″(-1)).

Let B = {1, x, x²}, B′ = {1, x, x², x³} and B″ = {e1, e2, e3} be the standard bases, respectively, for ℙ₂, ℙ₃ and ℝ³.
a. Convince yourself mentally that T1 and T2 are linear transformations.
b. Find [T1]_{B,B′} and [T2]_{B′,B″}.
c. Explain why the composition T2 ∘ T1 is defined, and find [T2 ∘ T1]_{B,B″} using (b).
d. Is the composition T1 ∘ T2 defined? Why or why not?
e. Is the matrix product [T1]_{B,B′} · [T2]_{B′,B″} defined? Why or why not?

16. Let T 1 : IP2 ➔ IP3 and T2 : IP3 ➔ [PlI be linear transformations. SupposethatB = {l, 1-x,2x+x 2} c IP2 , B 1 = {1, 1 +x,3x-x 2,x 2 +x 3 } c IP3 , and B 11 = {-1, 1 +x} c IP1• Note that each set contains polynomials of distinct degrees, and each set contains the correct number of vectors, so all sets are bases for the corresponding space. Now, suppose we are given that: 3

5

1

-2

-4

2

1

0

7

1

2 -1

3 -2 5 1 ]· 2 4 7 -3

a. Does the composition T1 ∘ T2 make sense? Why or why not? If so, what are the domain and codomain of T1 ∘ T2?
b. Does the composition T2 ∘ T1 make sense? Why or why not? If so, what are the domain and codomain of T2 ∘ T1?
c. Compute T1(3x² − 5x + 2).
d. Use your work in (c) to compute (T2 ∘ T1)(3x² − 5x + 2).
e. Find [T2 ∘ T1]_{B,B″}.
f. Use your answer to (e) to compute (T2 ∘ T1)(3x² − 5x + 2) directly. You should get the same answer as in part (d).

17. Let T1 : ℙ¹ → ℙ² and T2 : ℙ² → ℙ¹ be linear transformations. Let B = {1, 1 − x} ⊂ ℙ¹, and B′ = {1, 1 + x, 3x − x²} ⊂ ℙ². Note that each set contains polynomials of distinct degrees, and each set contains the correct number of vectors needed to form a basis, so they are both bases for the corresponding space. Now, suppose we are given that:

[T1]_{B,B′} and [T2]_{B′,B} are given (a 3 × 2 and a 2 × 3 matrix, respectively).

a. Is the composition T1 ∘ T2 defined? Explain. If so, what are the domain and codomain of T1 ∘ T2?
b. Is the composition T2 ∘ T1 defined? Explain. If so, what are the domain and codomain of T2 ∘ T1?


c. Compute T1(5x − 7).
d. Use (c) to compute (T2 ∘ T1)(5x − 7).
e. Find [T2 ∘ T1]_B.
f. Use your answer to (e) to compute (T2 ∘ T1)(5x − 7) directly. You should get the same answer as in part (d).
g. Compute T2(6x² + 3x − 4).
h. Use (g) to compute (T1 ∘ T2)(6x² + 3x − 4).
i. Find [T1 ∘ T2]_{B′}.
j. Use your answer to (i) to compute (T1 ∘ T2)(6x² + 3x − 4) directly. You should get the same answer as in part (h).

18. Function Spaces Preserved by the Derivative: In Section 5.2, Exercise 11, we found the matrix [D]_B of the derivative operation D on the subspaces W = Span(B). (a) Use your answers in that section to find the matrices of the 2nd and 3rd derivatives, [D²]_B and [D³]_B; (b) Use these matrices to find the 2nd and 3rd derivatives of the indicated function f(x) using a matrix product; (c) Show that D is both one-to-one and onto on W by finding the rref of [D]_B and describing ker(D) and range(D).
a. W = Span(B), where B = {e⁻ˣ, e²ˣ}; f(x) = 5e⁻ˣ − 3e²ˣ.
b. W = Span(B), where B = {eˣ sin(x), eˣ cos(x)}; f(x) = 4eˣ sin(x) − 3eˣ cos(x).
c. W = Span(B), where B = {e⁻³ˣ sin(2x), e⁻³ˣ cos(2x)}; f(x) = 5e⁻³ˣ sin(2x) − 9e⁻³ˣ cos(2x).
d. W = Span(B), where B = {xe⁵ˣ, e⁵ˣ}; f(x) = −2xe⁵ˣ + 7e⁵ˣ.
e. W = Span(B), where B = {x²e⁻⁴ˣ, xe⁻⁴ˣ, e⁻⁴ˣ}; f(x) = −5x²e⁻⁴ˣ + 2xe⁻⁴ˣ − 7e⁻⁴ˣ.
f. W = Span(B), where B = {x²·5ˣ, x·5ˣ, 5ˣ}; f(x) = −4x²·5ˣ + 9x·5ˣ − 2(5ˣ).
g. W = Span(B), where B = {x sin(2x), x cos(2x), sin(2x), cos(2x)}; f(x) = 4x sin(2x) + 9x cos(2x) − 5 sin(2x) + 8 cos(2x).

19. In Exercise 14 of Section 5.2, we constructed the matrix [D]_B of the derivative operator D on W = Span(B), where B = {eᵃˣ sin(bx), eᵃˣ cos(bx)}:
a. Find [D²]_B and [D³]_B. Observe how the four entries are related to each other in two pairs.
b. Use Induction to show that for any positive integer k:

[Dᵏ]_B = [ aₖ  −bₖ ; bₖ  aₖ ],

for some real numbers aₖ and bₖ.


20. Suppose that T1 : V → U and T2 : U → W are linear transformations of vector spaces. Prove that T2 ∘ T1 is also a linear transformation. In other words, prove that T2 ∘ T1 is additive and homogeneous.

21. Prove that if T : V → W is one-to-one and S = {v1, v2, ..., vk} is a set of linearly independent vectors from V, then {T(v1), T(v2), ..., T(vk)} is a set of linearly independent vectors from W.

22. Suppose that a = (a₀, a₁, ..., aₙ), where a₀, a₁, ..., aₙ are n + 1 distinct real numbers. Consider the evaluation homomorphism:

Eₐ : ℙⁿ → ℝⁿ⁺¹, where Eₐ(p(x)) = (p(a₀), p(a₁), ..., p(aₙ)),

as constructed in Exercise 17 of Section 5.1. Prove that Eₐ is one-to-one.

23. Suppose that T : V → W is a linear transformation, with dim(V) = n and dim(W) = m. Prove the following statements:
a. If n ≤ m, then: T is one-to-one if and only if for any basis {v1, v2, ..., vn} for V, the image set {T(v1), T(v2), ..., T(vn)} is linearly independent. Hint: think of ker(T).
b. If n ≥ m, then: T is onto if and only if there exists a linearly independent subset {v1, v2, ..., vm} from V such that the image set {T(v1), T(v2), ..., T(vm)} is also linearly independent. Note that there are only m vectors in these sets. Hint: T is onto if and only if rank(T) = m.
c. Bonus: show that (a) is still true if the phrase "for any basis" is replaced with "for at least one basis."

24. Decoding the Kernel and Range: The purpose of this Exercise is to show that we can obtain a basis for ker(T) and range(T) by decoding the information found in any matrix for T. Suppose that T : V → W is a linear transformation, with dim(V) = n and dim(W) = m, both finite-dimensional vector spaces. Let B = {v1, v2, ..., vn} be a basis for V, and let B′ = {w1, w2, ..., wm} be a basis for W. Let us construct the m × n matrix [T]_{B,B′}, and let R be the rref of [T]_{B,B′}.
a. Suppose that z ∈ nullspace([T]_{B,B′}) ⊆ ℝⁿ. By the Uniqueness of Representation Property, we know that there exists u ∈ V such that (u)_B = z. Show that u ∈ ker(T).
b. Conversely, suppose that u ∈ ker(T). Show that (u)_B ∈ nullspace([T]_{B,B′}).
c. Now suppose that b ∈ colspace([T]_{B,B′}) ⊆ ℝᵐ. By the Uniqueness of Representation Property, we know that there exists d ∈ W such that (d)_{B′} = b. Show that d ∈ range(T).
d. Conversely, suppose that d ∈ range(T). Show that (d)_{B′} ∈ colspace([T]_{B,B′}).
e. Use (a) and (b) to prove that ker(T) = {0_V} if and only if nullspace([T]_{B,B′}) = {0ₙ}.
f. Similarly, use (c) and (d) to prove that range(T) = {0_W} if and only if colspace([T]_{B,B′}) = {0ₘ}.
Parts (e) and (f) handle the trivial cases where either ker(T) or range(T) is the zero-subspace. Thus, we finish the problem by assuming that neither space is the zero-subspace:
g. Let {z1, z2, ..., zk} ⊂ ℝⁿ be the basis that we obtain for nullspace([T]_{B,B′}) using R, as we did in Chapter 2. By (a), the corresponding vectors uᵢ such that (uᵢ)_B = zᵢ for every i = 1..k form a subset {u1, u2, ..., uk} of ker(T). Prove that {u1, u2, ..., uk} is a basis for ker(T). Reminder: this means you have to prove two properties: linear independence and Spanning. In particular, this tells us that dim(ker(T)) = dim(nullspace([T]_{B,B′})) for any matrix [T]_{B,B′} representing T.
h. Finally, let {c1, ..., cr} ⊂ ℝᵐ be the original columns of [T]_{B,B′} that correspond to the leading 1's found in R, so that this set forms a basis for colspace([T]_{B,B′}), as we found in Chapter 2. By (c), the corresponding vectors dⱼ ∈ W such that (dⱼ)_{B′} = cⱼ for every j = 1..r form a subset {d1, d2, ..., dr} of range(T). Prove that {d1, d2, ..., dr} is a basis for range(T). Again, you have to show both the linear independence and Spanning properties. In particular, this tells us that dim(range(T)) = dim(colspace([T]_{B,B′})) for any matrix [T]_{B,B′} representing T.

25. The Kernel and Range of a Composition: The purpose of this Exercise is to generalize Theorem 3.5.10, which is proven in Exercise 9 of Section 3.5. We will investigate the kernel and range of the composition of two linear transformations. Suppose that:

T1 : V → U, and T2 : U → W,

are linear transformations of vector spaces. We do not have to assume that any of these spaces is finite-dimensional.
a. Write down the general definition of the kernel of any linear transformation T : X → Y, where X and Y are arbitrary vector spaces. Use the symbol 0_X or 0_Y, whichever is appropriate.
b. Adapt the definition in part (a) to write down the definition of ker(T1), ker(T2) and ker(T2 ∘ T1), as set up above. There should be three separate definitions. Make sure that you precisely use the symbols V, U, W, 0_V, 0_U and 0_W, where appropriate.
c. Two out of the three subspaces that you defined in (b) are subspaces of the same vector space. Which of the two kernels live in which same vector space?
d. Use your definitions to prove that ker(T1) ≤ ker(T2 ∘ T1). Hint: this means that you must show that every member v of ker(T1) is also a member of ker(T2 ∘ T1). Note that in Section 3.5, we did not know the general concept of a subspace. We can now say that ker(T1) is a subspace of ker(T2 ∘ T1), and not just a subset.
e. Use part (d) to prove that if T2 ∘ T1 is one-to-one, then T1 is also one-to-one.
f. Write down the contrapositive of the statement in (e).
Now, we will repeat the steps above for the range:
g. Write down the general definition of the range of T : X → Y, as set up in (a).
h. Adapt the definition in part (g) to write down the definition of range(T1), range(T2), and range(T2 ∘ T1), as set up above. There should be three separate definitions. Make sure that you precisely use the symbols V, U, and W, where appropriate.
i. Two out of the three subspaces that you defined in (h) are subspaces of the same vector space. Which of the two ranges live in which same vector space?
j. Use your definitions to prove that range(T2 ∘ T1) ≤ range(T2). Hint: this means that you must show that every member w of range(T2 ∘ T1) is also a member of range(T2). Again, we can now say that range(T2 ∘ T1) is a subspace of range(T2), and not merely a subset.
k. Use part (j) to prove that if T2 ∘ T1 is onto, then T2 is also onto. Do you notice the difference with part (e)?
l. Write down the contrapositive of the statement in (k).
m. Bonus: Prove directly the two statements you wrote down in parts (e) and (k).

26. The Kernel of Commuting Operators: Suppose that T1 and T2 are operators T1, T2 : V → V, where V is a finite dimensional vector space. We know from the previous Exercise that ker(T1) ≤ ker(T2 ∘ T1). Thus, suppose that we construct a basis {v1, v2, ..., vr} for ker(T1). By the Extension Theorem, we can extend this basis, one vector at a time, to form:

{v1, v2, ..., vr, v_{r+1}, v_{r+2}, ..., v_{r+s}},

which will now be a basis for ker(T2 ∘ T1). Now, suppose we are also told that T2 ∘ T1 = T1 ∘ T2, and we know that ker(T2) is exactly s-dimensional, where s is in the notation above. Prove that the set:

{T1(v_{r+1}), T1(v_{r+2}), ..., T1(v_{r+s})}

is a basis for ker(T2). Hint: first show that each of the vectors above is actually from ker(T2). Next, show that this set is linearly independent. Finally, take advantage of the Two-For-One Theorem.


5.4 Isomorphisms

Now that we understand the nature of one-to-one and onto linear transformations, we will put them together, as we did in Section 3.6, to generalize a very special kind of linear transformation:

Definition: If V and W are vector spaces, we say that a linear transformation T : V → W is an isomorphism if T is both one-to-one and onto. We also say that T is invertible or T is bijective, and that V and W are isomorphic to each other. If V = W, an isomorphism T : V → V is also called an automorphism.

In Theorem 3.6.2, though, we said that if T : ℝⁿ → ℝᵐ is invertible, then n = m, that is, the domain and the codomain of T have to be the same. Now that we are considering linear transformations of abstract vector spaces, we can loosen this restriction a little bit:

Theorem 5.4.1: Let T : V → W be an isomorphism of finite dimensional vector spaces. Then: dim(V) = dim(W).

Proof: If dim(V) < dim(W), then T cannot be onto, and if dim(V) > dim(W), then T cannot be one-to-one. Thus we must have dim(V) = dim(W). ■

This Theorem says that we do not have to bother asking if a linear transformation is an isomorphism if the domain and the codomain do not have the same dimension (in fact, this Theorem is true even if both V and W are infinite dimensional). Notice that this Theorem is not requiring that V = W, only that their dimensions are the same. It turns out that the converse of this Theorem is also true and will be proven in the Exercises:

Theorem 5.4.2: If V and W are finite dimensional vector spaces and dim(V) = dim(W), then there exists an isomorphism T : V → W.

Again, this Theorem is true even if both spaces are infinite dimensional. Thus, the two statements above can be combined in full generality as the following:

Theorem 5.4.3: Two vector spaces V and W are isomorphic to each other if and only if dim(V) = dim(W).

Theorem 3.6.3 gave us an equivalent condition to T being an isomorphism. We can generalize this Theorem as well, taking into account the restriction we saw above:


Definition/Theorem 5.4.4: A linear transformation T1 : V → W is an isomorphism of vector spaces if and only if we can find another unique linear transformation, T2 : W → V, the inverse of T1, such that T2 is also an isomorphism, and for all v ∈ V and for all w ∈ W:

(T2 ∘ T1)(v) = v  and  (T1 ∘ T2)(w) = w.

In other words, T2 ∘ T1 = I_V and T1 ∘ T2 = I_W, the identity operators on V and on W, respectively. We also write: T2 = T1⁻¹. In particular, T2 must be the linear transformation such that for all w ∈ W:

T2(w) = v, where v is the unique vector from V such that T1(v) = w.

Furthermore, if T1 is invertible with inverse T2, then T2 is also invertible, and T2⁻¹ = T1. Thus, we can say that T1 and T2 are inverses of each other. In particular, if T1 : V → V is an automorphism, we get:

T2 ∘ T1 = I_V = T1 ∘ T2.

Recall that the bulk of the proof of Theorem 3.6.3 follows from the general scenario of Theorem 3.6.1, and in the same way, the existence and uniqueness of the one-to-one and onto function T2 also follows from Theorem 3.6.1. The proof of the additivity and homogeneity of T2 is virtually identical to the proof of Theorem 3.6.3, by simply requiring v ∈ V, the domain, instead of ℝⁿ, and requiring w ∈ W, the codomain, instead of ℝⁿ (recall that ℝⁿ is both the domain and the codomain in Theorem 3.6.3). Again, let us rewrite this Theorem in a more natural notation:

Theorem 5.4.5: A linear transformation T : V → W is invertible if and only if we can find another unique linear transformation, T⁻¹ : W → V, the inverse of T, such that for all v ∈ V and all w ∈ W:

(T⁻¹ ∘ T)(v) = v  and  (T ∘ T⁻¹)(w) = w.

In other words, T⁻¹ ∘ T = I_V and T ∘ T⁻¹ = I_W, the identity operators on V and on W. In particular, T⁻¹ must be the linear transformation such that for all w ∈ W:

T⁻¹(w) = v, where v is the unique vector from V such that T(v) = w.

Furthermore, if T is invertible with inverse T⁻¹, then T⁻¹ is also invertible, and (T⁻¹)⁻¹ = T. Thus, we can say that T and T⁻¹ are inverses of each other.

[Figure: The Composition of T with T⁻¹ — T⁻¹ ∘ T = I_V and T ∘ T⁻¹ = I_W.]

Notice that T⁻¹ exists even if V or W is infinite dimensional. In the special case that they are both finite dimensional, though, we can find a matrix for T⁻¹ in a natural way:

Theorem 5.4.6: Suppose T : V → W is an isomorphism of finite dimensional vector spaces. By the previous Theorems, we know that dim(V) = dim(W) = n, say, and there exists T⁻¹ : W → V such that T⁻¹ ∘ T = I_V and T ∘ T⁻¹ = I_W. If B is a basis for V and B′ is a basis for W, then [T]_{B,B′} is an invertible n × n matrix, and:

[T⁻¹]_{B′,B} = [T]_{B,B′}⁻¹.

In particular, if T : V → V is an automorphism, then:

[T⁻¹]_B = [T]_B⁻¹.

Proof: We know from the previous Section that the matrix of the composition of two transformations is the product of the matrices of each transformation, in the same order, using the appropriate bases. Thus:

[T⁻¹]_{B′,B} · [T]_{B,B′} = [T⁻¹ ∘ T]_B = [I_V]_B  and  [T]_{B,B′} · [T⁻¹]_{B′,B} = [T ∘ T⁻¹]_{B′} = [I_W]_{B′}.

However, the matrix of the identity operator on any finite-dimensional vector space with respect to any basis is always the identity matrix (see Exercise 29 for the proof). Since dim(V) = dim(W) = n, we get:

[T⁻¹]_{B′,B} · [T]_{B,B′} = Iₙ = [T]_{B,B′} · [T⁻¹]_{B′,B}.

This shows that [T]_{B,B′} is invertible, with inverse [T⁻¹]_{B′,B}. ■

Example: Let T : ℙ³ → ℝ⁴ be the linear transformation given by:

T(p(x)) = (p(−2), p(3), p′(−1), p″(1)).

It is easy to check that T is a linear transformation. Let us find its matrix with respect to the standard bases B = {1, x, x², x³} for ℙ³ and similarly B′ = {e1, e2, e3, e4} for ℝ⁴. Since we need to evaluate p(x), p′(x) and p″(x) at the indicated points, we first create a table:

p(x)   p′(x)   p″(x)   p(−2)   p(3)   p′(−1)   p″(1)
1      0       0        1       1      0        0
x      1       0       −2       3      1        0
x²     2x      2        4       9     −2        2
x³     3x²     6x      −8      27      3        6

We can now compute:

T(1) = (1, 1, 0, 0),  T(x) = (−2, 3, 1, 0),  T(x²) = (4, 9, −2, 2),  and  T(x³) = (−8, 27, 3, 6).

We assemble [T]_{B,B′}, column by column:

[T]_{B,B′} =
[ 1  −2   4  −8 ]
[ 1   3   9  27 ]
[ 0   1  −2   3 ]
[ 0   0   2   6 ]

Using technology, we find that this 4 × 4 matrix is invertible, and its inverse, which is [T⁻¹]_{B′,B}, is:

[T]_{B,B′}⁻¹ = [T⁻¹]_{B′,B} = (1/50)·
[  54   −4  120   30 ]
[ −18   18  −40  −85 ]
[  −6    6  −30  −20 ]
[   2   −2   10   15 ]

We can now use this inverse matrix to accomplish the following: Find a polynomial p(x), of degree at most 3, such that p(−2) = 5, p(3) = −10, p′(−1) = 2, and c = 1 is an inflection point.

This is equivalent to finding a polynomial p(x) ∈ ℙ³ such that:

T(p(x)) = (5, −10, 2, 0).

(The second derivative of a cubic is a linear polynomial, so we are guaranteed a sign change whenever there is a zero; conversely, quadratics and linear functions do not have inflection points.) To find the coordinates of p(x) with respect to B, we multiply:

(1/50)·
[  54   −4  120   30 ] [   5 ]   [ 11 ]
[ −18   18  −40  −85 ] [ −10 ] = [ −7 ]
[  −6    6  −30  −20 ] [   2 ]   [ −3 ]
[   2   −2   10   15 ] [   0 ]   [  1 ]

Decoding these coordinates with respect to B, we get: p(x) = 11 − 7x − 3x² + x³.

We can check algebraically that p(−2) = 5 and p(3) = −10. We get the derivatives: p′(x) = −7 − 6x + 3x² and p″(x) = −6 + 6x, and likewise see that p′(−1) = 2 and p″(1) = 0, and the 2nd derivative indeed experiences a sign change at c = 1, so this is in fact an inflection point. □
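For readers who want to reproduce this computation with software, here is a minimal sketch in Python using numpy (the use of numpy is only a convenience assumption; any calculator or CAS with matrix inversion works just as well). The matrix and the target vector are the ones from the Example above.

```python
import numpy as np

# Columns are T(1), T(x), T(x^2), T(x^3), where T(p(x)) = (p(-2), p(3), p'(-1), p''(1)).
T_matrix = np.array([
    [1, -2,  4, -8],
    [1,  3,  9, 27],
    [0,  1, -2,  3],
    [0,  0,  2,  6],
], dtype=float)

# Desired data: p(-2) = 5, p(3) = -10, p'(-1) = 2, p''(1) = 0.
target = np.array([5, -10, 2, 0], dtype=float)

# Solving the system is equivalent to multiplying by the inverse matrix found above.
coords = np.linalg.solve(T_matrix, target)
print(coords)  # [11. -7. -3.  1.]  ->  p(x) = 11 - 7x - 3x^2 + x^3
```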

Applications in Calculus and Ordinary Differential Equations

We saw in the two previous Sections that in some cases, we can restrict the derivative transformation D to a finite dimensional function space W = Span(B), and the derivatives are again in W. We say in this case that D preserves W, and so D is an operator:

D : W → W.

If [D]_B is an invertible matrix, then D is an invertible operator. Moreover, this means that D⁻¹ gives an antiderivative for every function f(x) from W.

Example: We saw in Sections 5.1 and 5.2 that the function space:

W = Span(B), where B = {x²e⁻³ˣ, xe⁻³ˣ, e⁻³ˣ},

is preserved by the derivative operation D, so D is an operator:

D : W → W.

We also found the matrix of D with respect to B:

[D]_B =
[ −3   0   0 ]
[  2  −3   0 ]
[  0   1  −3 ]

Since this lower triangular matrix is invertible, D is an isomorphism on W. In other words, D is an automorphism. We can find [D⁻¹]_B via:

[D⁻¹]_B = [D]_B⁻¹ =
[ −1/3     0      0  ]
[ −2/9   −1/3     0  ]
[ −2/27  −1/9   −1/3 ]

Thus, if we want to find ∫(5x²e⁻³ˣ − 8xe⁻³ˣ + 2e⁻³ˣ) dx, we encode the coefficients (5, −8, 2) in a column matrix, perform the matrix multiplication:

[ −1/3     0      0  ] [  5 ]   [ −5/3  ]
[ −2/9   −1/3     0  ] [ −8 ] = [ 14/9  ]
[ −2/27  −1/9   −1/3 ] [  2 ]   [ −4/27 ]

and decode the coordinates for our answer (adding an arbitrary constant C, as usual), to get:

∫(5x²e⁻³ˣ − 8xe⁻³ˣ + 2e⁻³ˣ) dx = −(5/3)x²e⁻³ˣ + (14/9)xe⁻³ˣ − (4/27)e⁻³ˣ + C.

You might recall that to find this antiderivative directly, we would need to apply a technique called Integration by Parts. This Example shows how to do it instead using a matrix! □

Let us extend this idea to solve a special kind of differential equation, that is, an equation involving one or more derivatives:

Definition: Let x be an independent variable, and y a variable that depends on x. An ordinary linear differential equation or o.d.e. is an equation of the form:

cₙy⁽ⁿ⁾ + ··· + c₂y″ + c₁y′ + c₀y = g(x),

for some positive integer n, scalars c₀, c₁, ..., cₙ, and function g(x). A solution to such an equation is a function y = f(x), defined on some interval I, that satisfies the differential equation.

The word "ordinary" refers to the appearance of only ordinary derivatives from basic Calculus (as opposed to partial derivatives that appear in Multi-Variable Calculus; an equation involving partial derivatives is naturally referred to as a partial differential equation or p.d.e.). STEM majors often take a separate course on Differential Equations, but we will see below how to use the concept of a linear transformation, and in particular an invertible transformation, in order to solve these kinds of differential equations.

Example: Let us consider the differential equation:

2y″ − 5y′ + 4y = 185x²e⁻³ˣ − 281xe⁻³ˣ − 188e⁻³ˣ.

We want to find one solution to this equation. The process of finding all solutions to this differential equation is more difficult, and is treated more appropriately in a full-term course in Differential Equations. Since the function on the right is a member of the vector space:

W = Span(B), where B = {x²e⁻³ˣ, xe⁻³ˣ, e⁻³ˣ},

it is natural to guess that we will find a solution to this o.d.e. also in W. Since W is preserved by D, and thus by D² as well, this further gives us hope that W contains at least one solution. Now that we have a good guess for the space to work with, we can think of the left side of this equation as the operator T : W → W given by:

T(f(x)) = 2f″(x) − 5f′(x) + 4f(x).

Notice that T = 2D² − 5D + 4I_W, a linear combination of the operators I_W, D and D², and thus we are certain that T is in fact a linear transformation. Thus, we can find [T]_B using I₃, [D]_B and [D²]_B = [D]_B², which we found in the last Example in Section 5.3:

[T]_B = 2[D²]_B − 5[D]_B + 4I₃
      = 2·[ 9 0 0 ; −12 9 0 ; 2 −6 9 ] − 5·[ −3 0 0 ; 2 −3 0 ; 0 1 −3 ] + 4·[ 1 0 0 ; 0 1 0 ; 0 0 1 ]
      = [ 37 0 0 ; −34 37 0 ; 4 −17 37 ].

Notice that the final matrix is again lower triangular. It is invertible, with inverse:

[T]_B⁻¹ = (1/50653)·
[ 1369     0      0 ]
[ 1258  1369      0 ]
[  430   629   1369 ]

Since we want T(f(x)) = 185x²e⁻³ˣ − 281xe⁻³ˣ − 188e⁻³ˣ, we get:

f(x) = T⁻¹(185x²e⁻³ˣ − 281xe⁻³ˣ − 188e⁻³ˣ).

Using the inverse matrix and the desired output encoded in a column matrix, we compute:

(1/50653)·
[ 1369     0      0 ] [  185 ]                [  253,265 ]   [  5 ]
[ 1258  1369      0 ] [ −281 ] = (1/50653)·   [ −151,959 ] = [ −3 ]
[  430   629   1369 ] [ −188 ]                [ −354,571 ]   [ −7 ]

Thus, one solution to our differential equation is:

f(x) = 5x²e⁻³ˣ − 3xe⁻³ˣ − 7e⁻³ˣ. □
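Here is a small sketch of the same computation in Python with sympy (again only a convenience assumption): it forms [T]_B = 2[D]_B² − 5[D]_B + 4I₃ and solves for the coordinates of the particular solution.

```python
from sympy import Matrix, eye

# [D]_B on W = Span{x^2 e^{-3x}, x e^{-3x}, e^{-3x}}, from the previous Example.
D = Matrix([[-3,  0,  0],
            [ 2, -3,  0],
            [ 0,  1, -3]])

# Matrix of T = 2D^2 - 5D + 4 I_W with respect to B.
T = 2*D**2 - 5*D + 4*eye(3)

# Coordinates of the right-hand side 185x^2 e^{-3x} - 281x e^{-3x} - 188 e^{-3x}.
g = Matrix([185, -281, -188])

# Coordinates of one solution f(x) of 2y'' - 5y' + 4y = g(x) inside W.
f_coords = T.inv() * g
print(f_coords.T)  # Matrix([[5, -3, -7]])  ->  f(x) = 5x^2 e^{-3x} - 3x e^{-3x} - 7 e^{-3x}
```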

You might be wondering: doesn't this computation show that the function we found is the only solution to this differential equation? The answer is no, because we looked for a solution in only one function space W. To find all solutions, we need to extend our transformation above to the entire space C²(ℝ) of twice differentiable functions defined on all real numbers, and thus there may be other solutions that are members of C²(ℝ), and not just the one from the subspace W that we considered. However, as before, any two solutions will differ by a member of the kernel of T, and finding the members of this kernel is one of the tasks of a full-term course in Differential Equations.

It is also crucial in this process that we pick the appropriate function space W to work with. The function g(x) on the right side of the differential equation should point us in the right direction. It is important in the process that W is preserved by the first, and possibly second and higher, derivatives.

Polynomial Curve Fitting

We can use the idea of a linear transformation to find polynomials (or possibly other functions for that matter, but there are no guarantees for functions that are not polynomials) that pass through certain points. We know from basic algebra that two distinct points determine a unique line. Since we want our lines to represent a polynomial function, though, we will insist that all the points that we deal with have distinct x-coordinates. Similarly, three non-collinear points will determine a unique parabolic function p(x) = ax² + bx + c. If the points are collinear, we get a "degenerate" quadratic p(x) = bx + c or a constant polynomial p(x) = c, but notice that all these polynomials are still members of ℙ². Continuing with this analogy, four points with distinct x-coordinates will determine a unique polynomial of at most third degree, in other words, a member of ℙ³, and so on. In fact, Exercise 22 of Section 5.3 says that the evaluation homomorphism:

Eₐ : ℙⁿ → ℝⁿ⁺¹, where Eₐ(p(x)) = (p(a₀), p(a₁), ..., p(aₙ)),

is a one-to-one function, if the aᵢ are distinct numbers. By the Two-for-One Theorem, and the fact that dim(ℙⁿ) = n + 1 = dim(ℝⁿ⁺¹), Eₐ is also onto, and thus is invertible.

Example: Let us find a cubic polynomial that passes through the points:

(−4, −198), (−1, 102), (2, −48) and (3, −58).

We will do this by constructing the evaluation transformation Eₐ : ℙ³ → ℝ⁴ for the vector a = (−4, −1, 2, 3), and then we look for the unique member of ℙ³ whose image under Eₐ is (−198, 102, −48, −58). We can use the standard basis B = {1, x, x², x³} for ℙ³ and similarly the standard basis B′ = {e1, e2, e3, e4} for ℝ⁴. We compute:

Eₐ(1) = (1, 1, 1, 1),  Eₐ(x) = (−4, −1, 2, 3),  Eₐ(x²) = (16, 1, 4, 9),  and  Eₐ(x³) = (−64, −1, 8, 27).

We assemble the matrix:

[Eₐ]_{B,B′} =
[ 1  −4  16  −64 ]
[ 1  −1   1   −1 ]
[ 1   2   4    8 ]
[ 1   3   9   27 ]

Our discussion preceding this Example tells us that this should be an invertible matrix. Indeed, its inverse is:

[Eₐ⁻¹]_{B′,B} = (1/252)·
[ −12  168  168  −72 ]
[  −2  −98  154  −54 ]
[   8   −7  −28   27 ]
[  −2    7  −14    9 ]

which we can find using technology. Thus, there is exactly one cubic polynomial that passes through the given four points. We can find it by assembling the desired y-coordinates in a coordinate matrix and performing the matrix product with this inverse:

(1/252)·
[ −12  168  168  −72 ] [ −198 ]   [  62 ]
[  −2  −98  154  −54 ] [  102 ] = [ −55 ]
[   8   −7  −28   27 ] [  −48 ]   [ −10 ]
[  −2    7  −14    9 ] [  −58 ]   [   5 ]

Using our ordered basis B = {1, x, x², x³} for ℙ³ to decode this coordinate matrix, the actual polynomial we are looking for is:

p(x) = 62 − 55x − 10x² + 5x³.

We can see from its graph that p(x) passes through the points (−4, −198), (−1, 102), (2, −48) and (3, −58). □
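The evaluation-homomorphism approach translates directly into a few lines of code. The sketch below (Python with numpy, an assumption of convenience) rebuilds the matrix [Eₐ]_{B,B′} from the x-coordinates of the four points in the Example and solves for the same cubic.

```python
import numpy as np

# x-coordinates and y-coordinates of the four given points.
a = np.array([-4, -1, 2, 3], dtype=float)
y = np.array([-198, 102, -48, -58], dtype=float)

# Row i of the matrix is (1, a_i, a_i^2, a_i^3): the values of the standard
# basis {1, x, x^2, x^3} at the node a_i, assembled row by row.
E = np.vander(a, 4, increasing=True)

coeffs = np.linalg.solve(E, y)
print(coeffs)  # [ 62. -55. -10.   5.]  ->  p(x) = 62 - 55x - 10x^2 + 5x^3
```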

5.4 Section Summary

If V and W are vector spaces, we say that a linear transformation T : V → W is an isomorphism if T is both one-to-one and onto. We also say that T is invertible or T is bijective, and that V and W are isomorphic to each other. If V = W, an isomorphism T : V → V is also called an automorphism.

5.4.1: Let T : V → W be an isomorphism of finite dimensional vector spaces. Then: dim(V) = dim(W).

5.4.2: If V and W are finite dimensional vector spaces and dim(V) = dim(W), then there exists an isomorphism T : V → W.

5.4.3: Two vector spaces V and W are isomorphic to each other if and only if dim(V) = dim(W).

5.4.4: A linear transformation T1 : V → W is an isomorphism of vector spaces if and only if we can find another unique linear transformation, T2 : W → V, the inverse of T1, such that T2 is also an isomorphism, and for all v ∈ V and for all w ∈ W:

(T2 ∘ T1)(v) = v  and  (T1 ∘ T2)(w) = w.

In other words, T2 ∘ T1 = I_V and T1 ∘ T2 = I_W, the identity operators on V and on W, respectively. We also write: T2 = T1⁻¹. In particular, T2 must be the linear transformation such that for all w ∈ W: T2(w) = v, where v is the unique vector from V such that T1(v) = w. Furthermore, if T1 is invertible with inverse T2, then T2 is also invertible, and T2⁻¹ = T1. Thus, we can say that T1 and T2 are inverses of each other. In particular, if T1 : V → V is an automorphism, we get: T2 ∘ T1 = I_V = T1 ∘ T2.

5.4.5 (restatement of 5.4.4): A linear transformation T : V → W is invertible if and only if we can find another unique linear transformation, T⁻¹ : W → V, the inverse of T, such that for all v ∈ V and all w ∈ W: (T⁻¹ ∘ T)(v) = v and (T ∘ T⁻¹)(w) = w. In other words, T⁻¹ ∘ T = I_V and T ∘ T⁻¹ = I_W, the identity operators on V and on W. In particular, T⁻¹ must be the linear transformation such that for all w ∈ W: T⁻¹(w) = v, where v is the unique vector from V such that T(v) = w. Furthermore, if T is invertible with inverse T⁻¹, then T⁻¹ is also invertible, and (T⁻¹)⁻¹ = T. Thus, we can say that T and T⁻¹ are inverses of each other.

5.4.6: Suppose T : V → W is an isomorphism of finite dimensional vector spaces. By the previous Theorems, we know that dim(V) = dim(W) = n, say, and there exists T⁻¹ : W → V such that T⁻¹ ∘ T = I_V and T ∘ T⁻¹ = I_W. If B is a basis for V and B′ is a basis for W, then [T]_{B,B′} is an invertible n × n matrix, and: [T⁻¹]_{B′,B} = [T]_{B,B′}⁻¹. In particular, if T : V → V is an automorphism, then: [T⁻¹]_B = [T]_B⁻¹.

Isomorphisms can be used to solve ordinary linear differential equations or to find polynomials that pass through given points or possess certain attributes (when applicable).

5.4 Exercises

1. Let T : ℙ² → ℝ³ be the linear transformation given by:

T(p(x)) = (p(−3), p(5), p′(2)),

and let B = {1, x, x²} ⊂ ℙ² and B′ = {e1, e2, e3} be the standard bases for ℙ² and ℝ³.
a. Verify that T is indeed a linear transformation and find [T]_{B,B′}.
b. Prove that T is an isomorphism by finding the inverse of this matrix.
c. Use (b) to find a polynomial p(x) of degree at most 2 that passes through (−3, 75) and (5, 99), and with p′(2) = 13.

2. Let T : ℙ³ → ℝ⁴ be the linear transformation given by:

T(p(x)) = (p(−4), p(1), p(3), p′(−1)),

and let B = {1, x, x², x³} ⊂ ℙ³ and B′ = {e1, e2, e3, e4} be the standard bases for ℙ³ and ℝ⁴.
a. Verify that T is indeed a linear transformation and find [T]_{B,B′}.
b. Prove that T is an isomorphism by finding the inverse of this matrix.
c. Use (b) to find a polynomial of degree at most 3 passing through (−4, −247), (1, −7) and (3, 19), and with p′(−1) = 23.

For Exercises (3) to (9): adapt the ideas in Exercises (1) and (2) in order to construct a linear transformation T : ℙⁿ → ℝⁿ⁺¹ (for an appropriate ℙⁿ), construct the matrix for T with respect to the standard basis of each space, find the inverse of this matrix, and use it to find the polynomials with the indicated properties. Use technology if allowed by your instructor to invert the matrices.

3. Find a polynomial p(x) of degree at most 2 that passes through the points:
a. (−2, 52) and (4, 58), and with p′(3) = 21.
b. (−2, −13) and (4, −25), and with p′(3) = −14.

4. Find a polynomial p(x) of degree at most 2 that passes through the point:
a. (3, 83), with p′(−4) = −77, and ∫ p(x) dx = 35/2.
b. (3, −106), with p′(−4) = 45, and ∫ p(x) dx = 65/6.

5. Find a polynomial p(x) of degree at most 2 that passes through the point:
a. (5, −58), with p′(−2) = 25 and p′(7) = −47.
b. (5, 324), with p′(−2) = −68, and p′(7) = 202.
c. (5, 53), with p′(−2) = 12, and p′(7) = 12 also. Explain what happened.

6. Find a polynomial p(x) of degree at most 3 that passes through the points:
a. (−5, 851), (−2, 89), and (3, −61), and with p′(2) = −31.
b. (−5, −349), (−2, −55), and (3, −85), and with p′(2) = −45.

7. Find a polynomial p(x) of degree at most 3 that passes through the points:
a. (−3, 152) and (2, 47), with p′(−4) = −269 and p′(5) = −161.
b. (−3, −532) and (2, 148), with p′(−4) = 868 and p′(5) = 1237.

8. Find a polynomial p(x) of degree at most 3 that passes through the points:
a. (−4, 815) and (7, −2474), with p′(−6) = −1133 and p″(9) = −460.
b. (−4, −188) and (7, 1275), with p′(−6) = 417 and p″(9) = 216.

9. Find a polynomial p(x) of degree at most 3 that passes through the point:
a. (6, 2185), with p′(−8) = 1616, p″(−3) = −148, and ∫ p(x) dx = −77/12.
b. (6, 2277), with p′(−8) = 2094, p″(−3) = −198, and ∫ p(x) dx = 11/4.

10. Find a polynomial p(x) of degree at most 3, with:
a. p′(0) = −11, p″(7) = 10, ∫ p(x) dx = 26/3, and ∫ p(x) dx = 8/3.
b. p′(0) = 9, p″(7) = −100/3, and ∫ p(x) dx = 86/3.

11. We saw the following sets of functions B and spaces W = Span(B) in the Exercises of Sections 5.1 and 5.2. We saw in Section 5.1 that the derivative operator D preserves W, and we constructed [D]_B in Section 5.2. The corresponding Exercise numbers are indicated for your reference. For each item: (i) Check your homework solutions and the Answer Key for [D]_B; (ii) Show that D is invertible by finding [D]_B⁻¹; (iii) Use (ii) to find the indicated general antiderivative(s).
a. B = {e⁻³ˣ sin(2x), e⁻³ˣ cos(2x)}; Exercise 11(c), Sections 5.1 and 5.2.
Find: ∫(−11e⁻³ˣ sin(2x) + 29e⁻³ˣ cos(2x)) dx.
b. B = {xe⁵ˣ, e⁵ˣ}; Exercise 11(d), Sections 5.1 and 5.2.
Find: ∫(15xe⁵ˣ + 43e⁵ˣ) dx.
c. B = {x²e⁻⁴ˣ, xe⁻⁴ˣ, e⁻⁴ˣ}; Exercise 11(e), Sections 5.1 and 5.2.
Find: ∫(−16x²e⁻⁴ˣ + 44xe⁻⁴ˣ + 3e⁻⁴ˣ) dx.
d. B = {x²·5ˣ, x·5ˣ, 5ˣ}; Exercise 11(f), Sections 5.1 and 5.2.
Find: ∫(7x²·5ˣ − 4x·5ˣ + 9·5ˣ) dx.
e. B = {x sin(2x), x cos(2x), sin(2x), cos(2x)}; Exercise 11(h), Sections 5.1 and 5.2.
Find: ∫(4x sin(2x) + 9x cos(2x) − 5 sin(2x) + 8 cos(2x)) dx.
f. B = {eᵃˣ sin(bx), eᵃˣ cos(bx)}, where a and b are non-zero scalars; Exercise 14, Sections 5.1 and 5.2.
Find: ∫eᵃˣ sin(bx) dx and ∫eᵃˣ cos(bx) dx.

x₁ = det(A⁽¹⁾)/det(A),  x₂ = det(A⁽²⁾)/det(A),  ...,  xₙ = det(A⁽ⁿ⁾)/det(A),

where A⁽ⁱ⁾ is the matrix obtained from A by replacing column i of A with b.


Proof: The unique solution x is of course given by x = A⁻¹·b, since A is invertible. However, if we use our new formula for A⁻¹, we get:

x = (1/det(A))·adj(A)·b.

Since b is a column vector, we can again use the dot product interpretation to get:

xᵢ = (1/det(A))·(row i of adj(A)) ∘ b = (1/det(A))·(b₁C₁,ᵢ + b₂C₂,ᵢ + ··· + bₙCₙ,ᵢ),

where again, we note that adj(A) is the transpose of cof(A), and thus the entries of row i of adj(A) are the corresponding cofactors for column i of A. However, in the same way that we proved the previous theorem by cleverly replacing row j of A, we will finish the proof of Cramer's Rule by replacing column i of A with the column vector b, and call the resulting matrix A⁽ⁱ⁾:

A⁽ⁱ⁾ =
[ a₁,₁  ⋯  a₁,ᵢ₋₁  b₁  a₁,ᵢ₊₁  ⋯  a₁,ₙ ]
[ a₂,₁  ⋯  a₂,ᵢ₋₁  b₂  a₂,ᵢ₊₁  ⋯  a₂,ₙ ]
[   ⋮         ⋮     ⋮      ⋮         ⋮  ]
[ aₙ,₁  ⋯  aₙ,ᵢ₋₁  bₙ  aₙ,ᵢ₊₁  ⋯  aₙ,ₙ ]

This time, naturally, we will compute det(A⁽ⁱ⁾) using a cofactor expansion along column i. We will denote the j,i-cofactor of A⁽ⁱ⁾ by C⁽ⁱ⁾ⱼ,ᵢ, where j = 1..n. Since we will delete row j and column i of A⁽ⁱ⁾, we will get exactly the same minor that we would obtain by deleting row j and column i of A itself. Thus:

C⁽ⁱ⁾ⱼ,ᵢ = Cⱼ,ᵢ.

From this, we get the determinant by expanding along column i:

det(A⁽ⁱ⁾) = b₁C⁽ⁱ⁾₁,ᵢ + b₂C⁽ⁱ⁾₂,ᵢ + ··· + bₙC⁽ⁱ⁾ₙ,ᵢ = b₁C₁,ᵢ + b₂C₂,ᵢ + ··· + bₙCₙ,ᵢ,

and thus we get:

xᵢ = (1/det(A))·(b₁C₁,ᵢ + b₂C₂,ᵢ + ··· + bₙCₙ,ᵢ) = (1/det(A))·det(A⁽ⁱ⁾),

completing our Proof. ■

Example: Let us bring back our earlier 3 × 3 matrix A. We saw that det(A) = −59, so we know that A is invertible. Let us find the unique solution to A·x = b. We will need the determinants:

det(A⁽¹⁾) = −177,  det(A⁽²⁾) = −413,  and  det(A⁽³⁾) = −118.

Thus:

x₁ = −177/−59 = 3,  x₂ = −413/−59 = 7,  and  x₃ = −118/−59 = 2,

so the unique solution is: x = (3, 7, 2). □
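As a quick illustration of how Cramer's Rule mechanizes, here is a short Python/numpy sketch. The matrix and right-hand side below are made-up stand-ins chosen only for illustration; they are not the matrix from the Example above.

```python
import numpy as np

def cramer_solve(A, b):
    """Solve A x = b by Cramer's Rule (A must be square and invertible)."""
    det_A = np.linalg.det(A)
    n = A.shape[0]
    x = np.empty(n)
    for i in range(n):
        A_i = A.copy()
        A_i[:, i] = b                    # replace column i of A with b
        x[i] = np.linalg.det(A_i) / det_A
    return x

# Hypothetical 3 x 3 system, for illustration only.
A = np.array([[2., 1., -1.],
              [1., 3.,  2.],
              [0., 1.,  4.]])
b = np.array([1., 13., 14.])
print(cramer_solve(A, b))      # [1. 2. 3.]
print(np.linalg.solve(A, b))   # agrees with the direct solver
```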


7.4 Section Summary

Let A be an n × n matrix. The cofactor matrix of A, denoted cof(A), is the n × n matrix whose entries are the corresponding cofactors of each entry of A. We recall that:

Cᵢ,ⱼ(A) = (−1)ⁱ⁺ʲ · Mᵢ,ⱼ(A),

where Mᵢ,ⱼ(A) is the determinant of the submatrix obtained from A by erasing the i-th row and j-th column of A. The adjugate matrix of A is the transpose of the cofactor matrix, and is written as: adj(A) = (cof(A))ᵀ.

7.4.1: Let A be any n × n matrix. Then:

A · adj(A) = det(A) · Iₙ.

Consequently, if A is invertible, then: A⁻¹ = (1/det(A)) · adj(A).

7.4.2 – Cramer's Rule: Let A be an invertible n × n matrix. Then: the unique solution x = (x₁, x₂, ..., xₙ) to the matrix equation A·x = b has entries:

x₁ = det(A⁽¹⁾)/det(A),  x₂ = det(A⁽²⁾)/det(A),  ...,  xₙ = det(A⁽ⁿ⁾)/det(A),

where A⁽ⁱ⁾ is the matrix obtained from A by replacing column i with b.
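To make the identity A·adj(A) = det(A)·Iₙ concrete, here is a brief Python/numpy sketch that builds the cofactor matrix entry by entry; the 3 × 3 matrix used is a made-up example, not one from the text.

```python
import numpy as np

def adjugate(A):
    """Return adj(A): the transpose of the cofactor matrix of A."""
    n = A.shape[0]
    cof = np.empty_like(A, dtype=float)
    for i in range(n):
        for j in range(n):
            # Minor M_{i,j}: delete row i and column j, then take the determinant.
            minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
            cof[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return cof.T

A = np.array([[1., 2., 3.],
              [0., 4., 5.],
              [1., 0., 6.]])
print(A @ adjugate(A))                # approximately 22 * I_3, since det(A) = 22
print(np.linalg.det(A) * np.eye(3))   # the same matrix
```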


7.4 Exercises 1.

Find the adjugate matrix of the following matrices, and use them to find the inverse of each matrix, when possible:

a. [

~

!] 4

3 -] 5 -2 0

C.

[

-3

l

-2

d.

2

1 -3

0

-4

7

6 9

3 -8 0 4 3 -2

5

f.

3

4

5 0

2

1 7 -1

[

2

-5 e.

2.

1

b. [ -3 -5 ] 12 20

l

3

7

7

1

-2

3

6

2

5

-2

4

6

-7 -1 5

2

Use Cramer's Rule to solve the following systems of linear equations, if applicable:

a.

Sy = 4

3x

2x + y

C.

e.

-7

Sy

20

-6x

+ 2y

-14 6

X

3y

+ 7z

-1

y

3z

4

z

8

7x + 3y - 4z

1

+ 6y +

h.

lLy 6

4

=

+ 3y

2z

X

2y

z

4x

y

4z

X

-3x

+ 2y -

4y -

J.

A. 3

9

=

4z =

5z

+ y + 7z =

3x + 7y + 7z

2y


2

2x

-x

2z = -5

+ z + 2w = -3 -4y + 7z 6w 5 6x + 3y 8z + 5w = 0 9x + 4y + 3z + w = -2

-Sx l.

f.

2-y = _]_

+

3

Sz

-3x

4

2-x

+ 4y

8

Sy

J_x

d.

2x

Sx + 2y +

7x + 4y = -6 6x

l5x

Sx +

g.

b.

5

-3 3

-2 4 4w =

1

+ 3y + 6z + 9w 2 Sx 2y 7z + 3w = -1 4x + 6y + Sz 5w = 2

-2x


k.

3.

w 7x Sy + 2z 4y 4z 3w -2x 4x + y 6z + 5w 3x + 2y + 9z + 2w

=

3 -7 4

=

l.

+ 2y

=

=

3 -2 5 -4

Use Cramer's Rule to solve for the variables band c only in the linear system:

3a - 2b a + b + -2a + 3b 4b Sa 4a + Sb +

4.

0

2z + 2w -4x 8y + 6z + w 4z 3x 6y 3w 4w X + y + z X

--+

Sc + d - 4e 3c 2d + 7e 2e C + 4d 2c + 3d + 3e 7c d + e

6 =

-3

=

2

=

1

=

-4

c,

--+

Suppose that {a, b, d} is a linearly independent set of vectors from Rule to show that the system represented by the augmented matrix:

a1 a2 a3 a4

b, CJ b2 C2 b3 C3 b4 C4

d,

5a 1 - 2c 1 + 7d1

d2 d3 d4

5a2 - 2c2 + 7d2 5a3 - 2c3 + 7d3 5a4 - 2c4 + 7d4

4. Use Cramer's

IR{

a

has exactly one solution, and find that solution. As usual, = ( a 1, a2, a3, a4), and similarly for the other vectors. Hint: think of the properties of determinants. 5.


Let Ebe an n x n elementary matrix, where n 2: 2. a.

Warm-up: compute the adjugate matrices of the following elementary matrices. Which of the adjoin ts is also an elementary matrix?

b.

If n = 2 and E is of Type 1 (multiply row i of In by a non-zero number c to obtain E), then adj(E) is also an elementary matrix of Type 1.

c.

Show that if n > 2 and E is of Type 1, then adj(E) is not an elementary matrix (except for the trivial case when E = In). What goes wrong?

d.

Show that if Eis of Type 2 (exchange two rows of In to obtain E), then adj(E) is never an elementary matrix.

e.

Show that if E is of Type 3 ( add a multiple of row i of In to row j to obtain E), then adj(E) is also an elementary matrix of Type 3.


6.

Prove that if A and B are invertible n x n matrices, then: adj(A • B) = adj(B) • adj(A).

7.

Note: this formula is true even if A or B is not invertible, but the proof is much more difficult. Also notice the similarity between this formula and those for the inverse and transpose of the matrix product A • B. Let A be an n x n matrix. The objective of this exercise is to prove: A is invertible if and only if adj(A) is also invertible. Note that the formula: A • adj(A) = det(A) • In

8.

is always true, whether or not A is invertible. a. The easy part: use this formula to prove that if A is invertible, then adj(A) is also invertible. b. Now for the converse: Suppose instead that adj(A) is invertible. Use Proof by Contradiction to prove that A is also invertible. Hint: Suppose A is not invertible. Use the formula above to solve for A. What happens? Be sure to actually explain what the contradiction is. This is not as obvious as it looks. Use the previous Exercise to show that if A is an n x n matrix, then: det(adj(A))

9.

= det(Af-

1



The objective of this Exercise is to give an alternative proof that the inverse of an invertible upper-triangular matrix A is again upper triangular. a. Write down the rigorous definition of what an upper triangular matrix is, from Section 4.6. The definition should mention a certain inequality. b. Consider the upper triangular matrix:

2 A=

5 -1

4

0 -3 0 0

6

0

0 -1

0

2

7 -8

Find adj(A). What kind of a matrix is it? c. Now, let A be any n x n upper-triangular matrix. Show that if none of the entries on the main diagonal of A is zero, then all of the cofactors C;,i for the main diagonal are also non-zero. d. Next, show that if i < j, then the matrix obtained by the deleting row i and column} from A is still upper triangular. e. Continuing part (d), show that additionally, a zero now appears on the main diagonal. Conclude that CiJ = 0. f. Explain why the last two parts shows that adj(A) is an upper triangular matrix also. g. Finally, explain why A- 1 is also upper triangular. 10. Mimic the outline of the previous Exercise, parts (c) to (g), to prove that if A 1s an invertible lower triangular matrix, then A- 1 is also lower triangular. Section 7.4 The Adjugate Matrix and Cramer's Rule


7.5 The Wronskian

In Section 4.2, we defined a finite set of functions S = {f₁(x), f₂(x), ..., fₙ(x)} from some function space F(I) to be linearly independent if the only solution to the dependence test equation:

c₁f₁(x) + c₂f₂(x) + ··· + cₙfₙ(x) = z(x)

is the trivial solution: c₁ = c₂ = ··· = cₙ = 0. In other words, the only way for the linear combination on the left side of this equation to be zero at all points x ∈ I is to have all coefficients zero. We saw a variety of ideas to determine whether or not S were linearly independent or dependent, such as plugging in several values of x and attempting to solve the resulting systems of equations, taking a limit as x approaches a or some infinite limit, applying the Fundamental Theorem of Algebra (if the functions happened to be polynomials, or can be converted to polynomials), and using known identities from trigonometry, again if applicable. In other words, we had no clear or obvious strategy on how to attack this question. In this Section, we present a method which uses the concepts of the derivative and the determinant in order to decide if S is independent or dependent.

To see how this method works, let us first assume that S is linearly dependent. This means that we can find a non-trivial solution to the dependence test equation, that is, where at least one cᵢ is non-zero. However, if S ⊂ C¹(I), and we apply the derivative operator to both sides of the equation, we get:

c₁f₁′(x) + c₂f₂′(x) + ··· + cₙfₙ′(x) = z(x).

Let us keep applying this idea by taking a 2nd derivative, 3rd derivative, all the way to the (n − 1)-st derivative, thus assuming that S ⊂ Cⁿ⁻¹(I). We end up with the system:

c₁f₁(x)      + c₂f₂(x)      + ··· + cₙfₙ(x)      = z(x)
c₁f₁′(x)     + c₂f₂′(x)     + ··· + cₙfₙ′(x)     = z(x)
c₁f₁″(x)     + c₂f₂″(x)     + ··· + cₙfₙ″(x)     = z(x)
  ⋮
c₁f₁⁽ⁿ⁻¹⁾(x) + c₂f₂⁽ⁿ⁻¹⁾(x) + ··· + cₙfₙ⁽ⁿ⁻¹⁾(x) = z(x)

Since we only took n − 1 derivatives, this system is square, and we can write it in the form of a matrix product:

[ f₁(x)       f₂(x)       ⋯  fₙ(x)      ] [ c₁ ]   [ z(x) ]
[ f₁′(x)      f₂′(x)      ⋯  fₙ′(x)     ] [ c₂ ]   [ z(x) ]
[   ⋮             ⋮              ⋮       ] [ ⋮  ] = [  ⋮   ]
[ f₁⁽ⁿ⁻¹⁾(x)  f₂⁽ⁿ⁻¹⁾(x)  ⋯  fₙ⁽ⁿ⁻¹⁾(x) ] [ cₙ ]   [ z(x) ]

Now, recall that the non-trivial solution c₁, c₂, ..., cₙ must be valid for all points x ∈ I. But this means that the square matrix cannot be invertible for any value of x ∈ I. Thus, its determinant must be zero for all x ∈ I. This determinant is called the Wronskian of S, denoted W(S), and is named after Józef Maria Hoene-Wroński (Poland, 1776–1853). The contrapositive of this statement tells us that if W(S) is non-zero for at least one x ∈ I, then S is an independent set. We summarize our conclusions in the following:

Definition/Theorem 7.5.1: Let S = {f₁(x), f₂(x), ..., fₙ(x)} ⊂ Cⁿ⁻¹(I) for some interval I. We define the Wronskian of S, W_S(x), as the function:

W_S(x) = W({f₁(x), f₂(x), ..., fₙ(x)}) = det
[ f₁(x)       f₂(x)       ⋯  fₙ(x)      ]
[ f₁′(x)      f₂′(x)      ⋯  fₙ′(x)     ]
[   ⋮             ⋮              ⋮       ]
[ f₁⁽ⁿ⁻¹⁾(x)  f₂⁽ⁿ⁻¹⁾(x)  ⋯  fₙ⁽ⁿ⁻¹⁾(x) ]

If S is a dependent set, then W_S(x) = z(x) for all values x ∈ I.

Thus, if W_S(x) is not zero for at least one x ∈ I, then S is an independent set.

Example: Let us consider the set S = {sin(x), sin(2x), sin(3x)}. Using the Chain Rule, the Wronskian of S is:

W_S(x) = det
[  sin(x)    sin(2x)      sin(3x)    ]
[  cos(x)    2 cos(2x)    3 cos(3x)  ]
[ −sin(x)   −4 sin(2x)   −9 sin(3x)  ]

       = det
[  sin(x)    sin(2x)      sin(3x)    ]
[  cos(x)    2 cos(2x)    3 cos(3x)  ]
[    0      −3 sin(2x)   −8 sin(3x)  ]

= sin(x)(−16 cos(2x) sin(3x) + 9 cos(3x) sin(2x)) − cos(x)(−8 sin(2x) sin(3x) + 3 sin(2x) sin(3x))
= 9 sin(x) cos(3x) sin(2x) − 16 sin(x) cos(2x) sin(3x) + 5 cos(x) sin(2x) sin(3x).

Note that we added row 1 to row 3 to produce the zero in column 1 before performing a cofactor expansion along column 1 to compute the determinant. We can attempt to simplify W(S) further using trigonometric identities, but there is really no point in doing so in this case. Let us evaluate W_S(x) at some convenient value for x to see if we obtain at least one non-zero result. Notice that sin(3x) appears in two terms. If we let x = π/3, then sin(3x) = 0, and so:

W_S(π/3) = 9 sin(π/3) cos(π) sin(2π/3) = −27/4.

Since W_S(x) is non-zero at x = π/3, we can safely conclude that S is linearly independent. □
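A computer algebra system can both build and test the Wronskian. The following sketch (Python with sympy, an assumption of convenience; sympy ships a wronskian helper) repeats the computation for S = {sin x, sin 2x, sin 3x}.

```python
from sympy import symbols, sin, wronskian, simplify, pi

x = symbols('x')
S = [sin(x), sin(2*x), sin(3*x)]

W = simplify(wronskian(S, x))
print(W)                 # a combination of sines and cosines, not identically zero
print(W.subs(x, pi/3))   # a nonzero number (-27/4), so S is linearly independent
```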

What if we had evaluated W_S(x) at several values for x and kept getting zero? Would this mean that S were dependent? We will answer this question in more depth in the next subsection, but for now, let us say that when this happens, you can either keep trying more values of x until you get a non-zero result, or try to think of identities (trigonometric or otherwise) that will show that W_S(x) = z(x) for all x ∈ I. A valid conclusion in one case will be presented in the next subsection.

What About the Converse?

Suppose that W_S(x) = z(x) for all x ∈ I. Does this also mean that S is a dependent set? Unfortunately, the answer in general is no. Giuseppe Peano, in 1889, found a counterexample. Consider the two functions:

f(x) = x², and g(x) = x|x|,

and consider the set S = {f(x), g(x)} ⊂ F(ℝ). Note that we can write the second function as:

g(x) = { x² if x ≥ 0 ; −x² if x < 0 },  and so  g′(x) = { 2x if x ≥ 0 ; −2x if x < 0 } = 2|x|.

Thus, both f(x) and g(x) are differentiable for all x ∈ ℝ, and:

W_S(x) = det [ x²  x|x| ; 2x  2|x| ],

so W_S(x) = z(x) for all x ∈ ℝ. However, S contains only two (non-zero) functions, and so S is dependent if and only if these functions are parallel as vectors, that is:

f(x) = k·g(x), for some non-zero k ∈ ℝ, for all x ∈ ℝ.

But since f(x) = x² = g(x) when x ≥ 0, the only possible solution for this equation would be k = 1. But f(x) = −g(x) when x < 0, and so k = 1 does not work for all x ∈ ℝ. Thus, S is actually an independent set, even though W_S(x) = z(x) for all x ∈ ℝ.

What went wrong with this set? Notice that g′(x) = 2|x|, and so g′(x) is not differentiable at x = 0. Peano himself also discovered that the converse will be true if we require that the functions in S are real analytic, that is, they are members of C∞(ℝ), the set of all functions whose higher derivatives all exist at all x ∈ ℝ. Since g(x) is not in C∞(ℝ), this Example also shows that the following converse could fail if our functions are not in C∞(ℝ).

Theorem 7.5.2: Let S = {f₁(x), f₂(x), ..., fₙ(x)} ⊂ C∞(ℝ). If W(S) = z(x) for all x ∈ ℝ, then S is a linearly dependent set.


Example: Let us consider the set S = {cos²(x), sin²(x), 1} ⊂ C∞(ℝ). We know that:

cos²(x) + sin²(x) = 1,

and so S is certainly a dependent set. To prepare ourselves to compute W(S), we have:

(d/dx) cos²(x) = 2cos(x)(−sin(x)) = −sin(2x),  and so:  (d²/dx²) cos²(x) = −2cos(2x).

Similarly:

(d/dx) sin²(x) = 2 sin(x) cos(x) = sin(2x),  and so:  (d²/dx²) sin²(x) = 2cos(2x).

Thus, the Wronskian of this set is:

W_S(x) = det
[  cos²(x)     sin²(x)    1 ]
[ −sin(2x)     sin(2x)    0 ]
[ −2cos(2x)   2cos(2x)    0 ]

= −2 sin(2x) cos(2x) + 2 sin(2x) cos(2x) = z(x),

where we computed the determinant using a cofactor expansion along the 3rd column. Thus, we verify that S is indeed a dependent set. □
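The same tool confirms this dependent set at a glance; again a sympy sketch offered only as a convenience:

```python
from sympy import symbols, sin, cos, wronskian, simplify, Integer

x = symbols('x')
S = [cos(x)**2, sin(x)**2, Integer(1)]

print(simplify(wronskian(S, x)))   # 0: the Wronskian vanishes identically
```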

7.5 Section Summary

7.5.1: Let S = {f₁(x), f₂(x), ..., fₙ(x)} ⊂ Cⁿ⁻¹(I) for some interval I. We define the Wronskian of S, W_S(x), as the function:

W_S(x) = W({f₁(x), f₂(x), ..., fₙ(x)}) = det
[ f₁(x)       f₂(x)       ⋯  fₙ(x)      ]
[ f₁′(x)      f₂′(x)      ⋯  fₙ′(x)     ]
[   ⋮             ⋮              ⋮       ]
[ f₁⁽ⁿ⁻¹⁾(x)  f₂⁽ⁿ⁻¹⁾(x)  ⋯  fₙ⁽ⁿ⁻¹⁾(x) ]

If S is a dependent set, then W_S(x) = z(x) for all values x ∈ I. Thus, if W_S(x) is not zero for at least one x ∈ I, then S is an independent set.

7.5.2: Let S = {f₁(x), f₂(x), ..., fₙ(x)} ⊂ C∞(ℝ). If W(S) = z(x) for all x ∈ ℝ, then S is a linearly dependent set.

7.5 Exercises

1. For each item: (i) Find the Wronskian W_S(x) of the set of functions S, and (ii) decide whether S is independent or dependent. If S is independent, give at least one value for x so that the Wronskian is non-zero at x. Approximations are acceptable.
a. S = {cos(x), cos(2x), cos(3x)}
b. S = {eˣ sin(x), eˣ cos(x)}
c. S = {eᵏˣ sin(nx), eᵏˣ cos(nx)}, where k and n are fixed non-zero real numbers.
d. S = {tan²(x), sec²(x), 1}
e. S = {cot²(x), csc²(x), 1}
f. S = {cos(x), sin(x), cos(2x), sin(2x)}
g. S = {tan(x), tan(2x), tan(3x)}
h. S = {x^(1/2), x^(3/5), x^(7/4)}
i. S = {x^(1/2), x^(1/3), x^(1/4), x^(1/5)}
j. S = {√(x − 1), √(x − 2), √(x − 3), √(x − 4)}
k. S = {3ˣ, 4ˣ, 5ˣ}
l. S = {log₃(x), log₄(x), log₅(x)}

2. In Section 4.4, we said that if S = {fᵢ(x) | i ∈ I} is an infinite set of functions, then S is a linearly independent set if every finite subset of S is linearly independent. Thus, we can apply the idea of the Wronskian on an arbitrary finite subset S′ of S to determine if S is dependent or independent.
For the following infinite sets S: (i) Write down what an arbitrary finite subset S′ of S would look like, where S′ contains n functions and the indices are in increasing order; (ii) Find the Wronskian W_{S′}(x) of the set S′ from (i); (iii) Determine if S′ is linearly independent or dependent using W_{S′}(x); (iv) Use (iii) and the introductory paragraph above to decide if S itself is dependent or independent. Hint: in all four problems, one factor of W_{S′}(x) is a Vandermonde determinant. You may need to perform some row and/or column operations before the Vandermonde determinant reveals itself.
a. S = {eᵏˣ | k ∈ ℝ} ⊂ C∞(ℝ)
b. S = {bˣ | b ∈ (0, ∞)} ⊂ C∞(ℝ)
c. S = {xᵏ | k ∈ (0, ∞)} ⊂ C∞((0, ∞))
d. S = {(x − k)ᵐ | k ∈ ℝ} ⊂ C∞(ℝ)

[Figure: Visualizing the Effect of an Operator Through its Eigenvectors — T(⟨1, 3⟩) = ⟨3/2, 9/2⟩ and T(⟨5, 2⟩) = ⟨5/2, 1⟩.]

Notice that the corresponding sides of the parallelogram are parallel to each other, although the proportions are different. □

The Kernel as an Eigenspace

Although 0ₙ is not allowed to be an eigenvector, the scalar 0 is allowed to be an eigenvalue. However, λ = 0 is an eigenvalue for A if and only if there exists a non-zero vector v such that:

Av = λ·v = 0·v = 0ₙ,

or in other words, if and only if A has a non-zero kernel. We know from The Really Big Theorem on Invertibility (Theorem 3.8.5) that this is equivalent to A being not invertible. The contrapositive of this statement thus tells us that λ = 0 is not an eigenvalue for A if and only if A is invertible. Together with Theorem 7.3.1, which said that A is invertible if and only if det(A) is non-zero, we can now formally add two more conditions to our Really Big Theorem:

Theorem 8.2.1 – Addenda to the Really Big Theorem on Invertibility: Let A be an n × n matrix. Then, the condition that A is invertible is equivalent to the following:
26. det(A) is not 0.
27. λ = 0 is not an eigenvalue for A.

Example: Let A =
[ −1  −2   2 ]
[  2   2   1 ]
[  1   0   3 ]

We point out that the third row is the sum of the first two rows, and therefore A is not invertible. Let us find its characteristic polynomial and verify:

p(λ) = det(λI₃ − A) = det
[ λ+1    2    −2  ]
[ −2    λ−2   −1  ]
[ −1     0    λ−3 ]

= (λ+1)(λ−2)(λ−3) + 2 + 0 − 2(λ−2) + 4(λ−3) − 0
= λ³ − 4λ² + λ + 6 + 2 − 2λ + 4 + 4λ − 12
= λ³ − 4λ² + 3λ = λ(λ − 1)(λ − 3).

We used the shortcut method for a 3 × 3 determinant: copy columns 1 and 2 on the right, multiply diagonally right for the positive terms, and diagonally left for the negative terms. Our p(λ) confirms that λ = 0 is indeed an eigenvalue, along with 1 and 3. The rref of A is:

[ 1   0     3  ]
[ 0   1   −5/2 ]
[ 0   0     0  ]

and from this we can see that:

Eig(A, 0) = nullspace(A) = Span({(−6, 5, 2)}).

A basis for the other two eigenspaces can be found, as usual, by looking at the nullspace of the corresponding A − λI₃. □

Finding the eigenvalues of a non-triangular 3 × 3 matrix or bigger can be a daunting task, unless one can use mathematical software or a graphing calculator. We will need to find the roots of a cubic or higher-degree polynomial, which may be irrational or imaginary. However, if the entries of the matrix are integers, hopefully there are enough integer and rational roots so that any irrational root will involve only the square root of an integer. Let us now look at Theorems and techniques that will help us to find eigenvalues.
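Before turning to the hand techniques, note that technology confirms the results of the Example above instantly. Here is a minimal Python/numpy sketch (numpy being an assumption of convenience):

```python
import numpy as np

A = np.array([[-1., -2., 2.],
              [ 2.,  2., 1.],
              [ 1.,  0., 3.]])

eigenvalues, eigenvectors = np.linalg.eig(A)
print(np.round(eigenvalues, 6))     # approximately [0, 1, 3] in some order

# The eigenvector for lambda = 0 spans the kernel; it is parallel to (-6, 5, 2).
k = eigenvectors[:, np.argmin(np.abs(eigenvalues))]
print(k / k[2] * 2)                 # approximately [-6.  5.  2.]
```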

The Integer and Rational Roots Theorems

Our basic tools are two Theorems that we usually see in Precalculus:

Theorem 8.2.2 – The Integer Roots Theorem: Let p(x) = xⁿ + c_{n−1}x^{n−1} + ··· + c₁x + c₀ be a polynomial with integer coefficients, and c₀ ≠ 0. Then, all the rational roots of p(x) are in fact integers, and if x = c is an integer root of p(x), then c is a factor of the constant coefficient c₀.

Note: If c₀ = 0, then we can factor p(x) in the form xᵏ·q(x) for some positive power k, and the constant coefficient of q(x) is now non-zero. Thus we can apply the Integer Roots Theorem to q(x).

If a matrix has entries that are non-integer rational numbers (i.e. fractions), the characteristic polynomial will still have highest term λⁿ, but some of the coefficients of the lower powers of λ may be fractions. To find the roots of such a polynomial, we would normally "clear denominators" by multiplying the polynomial by the least common denominator of the coefficients. This will give us a polynomial that will now have integer coefficients. The roots of this polynomial will be the same as the roots of the original polynomial p(λ). The highest term, though, will now have a coefficient that is not 1. To find the roots of such a polynomial, we can use:

Theorem 8.2.3 – The Rational Roots Theorem: Let q(x) = cₙxⁿ + c_{n−1}x^{n−1} + ··· + c₁x + c₀ be a polynomial with integer coefficients, with c₀ ≠ 0. Then, all the rational roots of q(x) are of the form x = c/d, where c is a factor of the constant coefficient c₀ and d is a factor of the leading coefficient cₙ.

These two Theorems give us possible candidates for the integer and rational roots of the characteristic polynomial. It is still up to us to perform the tedious task of finding the actual roots, either by plugging them directly into p(λ) or using synthetic division, a technique that we also see in Precalculus. We can likewise use the following Theorem for some additional assistance:

Theorem 8.2.4 – Descartes' Rule of Signs: Let p(x) be a polynomial with real coefficients. Then: the number of positive roots of p(x) is equal to the number of sign changes in consecutive coefficients of p(x), or it is less than this number by an even integer. Similarly, the number of negative roots of p(x) is the number of sign changes in consecutive coefficients of p(−x), or it is less than this number by an even integer.

In any case, if we are fortunate enough to find a root quickly, say λ = c, then we can factor out λ − c from the characteristic polynomial, resulting in a polynomial of degree n − 1. We repeat this process of finding roots until we are down to a quadratic factor, at which point we use the quadratic formula or factoring techniques to find the remaining roots.

Example: LetA =

8

[ -13

8

15 8 ] · First we find the characteristicpolynomial:

-~0

-4

det(?i,h -A)

=

3

Ji,+ 13

-8

-8

20

?i,-15

-8

-4

4

?i,-3

=(Ji,+ 13)(?i,- 15)(?i,- 3) + (-8)(-8)(-4) - (-8)(?i,- 15)(-4) - (Ji,+ 13)(-8)(4)

= ?i,3 - 5A2

-

+ (-8)(20)(4) - (-8)(20)(?i,-3)

189A + 585 - 256 - 640 - 32A + 480 + 32A + 416 + 160A- 480

= Ji,3 - 5A2 -29A + 105, with a little work on a calculator. Let us hope that there are integer roots for p(A ), and try the factors of 105 = 3 x 5 x 7. Thus, even though 105 is a big number, it has a small number of factors, namely:

± { 1, 3, 5, 7, 15, 21, 35, 105}. Notice that p().) has two sign changes, whereas p(-A) = -). 3 - 5A 2 + 29). + 105 has only one sign change. Therefore, we are guaranteed a unique negative real root. We keep our fingers crossed that this negative root is an integer, as we sequentially try the negative factors of 105:


p(−1) = −1 − 5 + 29 + 105 = 128. No.
p(−3) = −27 − 45 + 87 + 105 = 120. Try again.
p(−5) = −125 − 125 + 145 + 105 = 0. Success!

Now, we divide out λ + 5 from p(λ), either via long division or synthetic division, and get a quadratic:

p(λ) = (λ + 5)(λ² − 10λ + 21).

The quadratic above now easily factors as (λ − 3)(λ − 7), and so:

p(λ) = (λ + 5)(λ − 3)(λ − 7). Thus, the eigenvalues are: λ = −5, 3 and 7. □

Once the eigenvalues are found, it is now a straightforward task to find the corresponding eigenvectors using the Gauss-Jordan algorithm, as seen in the previous Section. We leave it as an Exercise to find a basis for the eigenspaces of the matrix above.

We have some further remarks about this Example. You may gamble and try to look for the positive roots instead, but unfortunately Descartes' Rule of Signs says that we have either 2 or 0 positive roots, and thus we are not guaranteed in advance that we will find positive roots, much less positive integer roots. The gamble would have paid off in this case, because λ = 3 is an eigenvalue, and we would have discovered it next after failing with λ = 1.

Notice also that we were not guaranteed that the unique negative root would be an integer. Since p(−1) and p(−3) are both positive, if we had reached a point where p(λ) became negative at some integer candidate without ever equaling zero, we would have passed over a sign change without hitting an integer root, and this unique negative root would have been irrational. If that happens, we should try to find the positive roots, hope that they exist, and hope that at least one is an integer. When all else fails, most graphing calculators and mathematical software can easily approximate the roots of polynomials. Some can also find an approximate basis for eigenspaces. Here is one thing we learned in Calculus, though, that might give us some more comfort:
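When the candidate list is long, this trial-and-error can also be scripted. Here is a minimal sketch (our own illustration, not from the text; the helper name integer_root_candidates is invented for this purpose) that tests every divisor of the constant coefficient, as the Integer Roots Theorem allows, against the characteristic polynomial of the Example above.

```python
# Illustrative sketch (not from the text): test the Integer Roots Theorem
# candidates against p(lambda) = lambda^3 - 5*lambda^2 - 29*lambda + 105.
def integer_root_candidates(c0):
    """All positive and negative divisors of the constant coefficient c0."""
    c0 = abs(c0)
    divisors = [d for d in range(1, c0 + 1) if c0 % d == 0]
    return divisors + [-d for d in divisors]

def p(lam):
    return lam**3 - 5 * lam**2 - 29 * lam + 105

print([c for c in integer_root_candidates(105) if p(c) == 0])
# prints [3, 7, -5]: the three eigenvalues found above
```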

Theorem 8.2.5: Let p(x) be a polynomial with odd degree. Then: p(x) has at least one real root. Thus, all 3 x 3 matrices, all 5 x 5 matrices, etc., are guaranteed to have at least one real eigenvalue. Unfortunately, this still does not guarantee us that the eigenvalue is an integer or a rational number. Let us next look at an example where we can approximate the eigenvalues of a matrix using an algorithm that we learn in Calculus.


Example: Consider the symmetric matrix:

A =
[  3   8  −1 ]
[  8   5   0 ]
[ −1   0   2 ].

With a little bit of work, we will find that the characteristic polynomial of A is:

p(λ) = λ³ − 10λ² − 34λ + 103.

Descartes' Rule of Signs tells us that there are two or zero positive roots for p(λ), and thus we are not guaranteed a positive root. However:

p(−λ) = −λ³ − 10λ² + 34λ + 103,

and so Descartes' Rule now tells us that there is exactly one negative root, since p(−λ) has only one sign change. Unfortunately, 103 is a prime number, and so the only possible rational roots for p(λ) are the integer roots ±1 and ±103. Directly plugging in −1 and −103 yields 126 and −1,195,212, respectively, so neither one of them is a root. This tells us that the negative root must be irrational. However, since there is a sign change between p(−1) and p(−103), the Intermediate Value Theorem of Calculus tells us that there must be a zero of p(λ) somewhere between −103 and −1, and our instincts should tell us that this root should be much closer to −1 than to −103, judging by their values under p(λ). Let us try to narrow the gap a bit:

p(−2) = 123, p(−3) = 88, p(−4) = 15, and at last: p(−5) = −102.

Thus, our irrational root is in the interval [−5, −4]. To refine our root further, we can apply Newton's Method, which needs the derivative, p′(λ) = 3λ² − 20λ − 34. Recall that this method begins with an initial guess, which we will call x₀, and each subsequent guess is inductively defined to be:

x₍ₖ₊₁₎ = xₖ − p(xₖ)/p′(xₖ).

Let us use x₀ = −4 as our initial guess, since 15 is closer to zero than −102. We will stop Newton's Method when the first four digits after the decimal point do not change between xₖ and x₍ₖ₊₁₎. We get:

x₁ = −4 − p(−4)/p′(−4) = −4 − 15/94 ≈ −4.15957,

x₂ = −4.15957 − p(−4.15957)/p′(−4.15957) ≈ −4.15957 − (−0.5638199325)/101.0974678 ≈ −4.153993, and

x₃ = −4.153993 − p(−4.153993)/p′(−4.153993) ≈ −4.153993 − (−0.0006983350646)/100.8468335 ≈ −4.153986075.
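The iteration above is easy to automate. The following sketch (ours, not the text's; the function names are chosen purely for illustration) applies Newton's Method to p(λ) with the same starting guess and the same four-decimal-place stopping rule.

```python
# Illustrative sketch (not from the text): Newton's Method for
# p(lambda) = lambda^3 - 10*lambda^2 - 34*lambda + 103, starting at x0 = -4.
def p(lam):
    return lam**3 - 10 * lam**2 - 34 * lam + 103

def dp(lam):
    # the derivative p'(lambda) = 3*lambda^2 - 20*lambda - 34
    return 3 * lam**2 - 20 * lam - 34

x = -4.0
while True:
    x_next = x - p(x) / dp(x)
    if abs(x_next - x) < 1e-4:   # first four decimal digits no longer change
        x = x_next
        break
    x = x_next

print(round(x, 6))               # approximately -4.153986
```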


Rounding this off to −4.153986, we verify that p(−4.153986) ≈ 7.59167 × 10⁻⁶, so it would appear that we have an excellent approximation for our negative root. We can now use synthetic division to factor out λ + 4.153986 from p(λ):

  −4.153986 |  1      −10           −34            103
            |         −4.153986     58.79546      −102.9999937
            |  1     −14.153986     24.79546       ≈ 0

The bottom line tells us that the other factor of p(λ) is the approximate quadratic:

λ² − 14.153986λ + 24.79546.

Applying the Quadratic Formula, we get our two other approximate roots:

λ ≈ 2.048238689 and λ ≈ 12.10574731.

Thus, all three roots are in fact irrational. Of course, a graphing calculator would also find these three approximate roots for us, with much less effort. We should point out that since these are not exact values for λ, extra care must be given in applying the Gauss-Jordan Algorithm to find the eigenspaces. Recall that each matrix A − λI₃ must not be invertible, that is, the correct rref should have at least one row of zeroes. Let us illustrate for λ ≈ 2.04824. As usual, we first find A − (2.04824)I₃:

[ 3−2.04824       8            −1       ]     [ 0.95176    8         −1       ]
[     8       5−2.04824         0       ]  =  [ 8          2.95176    0       ]
[    −1           0        2−2.04824    ]     [ −1         0         −0.04824 ]

If we were to directly use technology at this point to find the rref of this matrix, we would be disappointed to get the identity matrix I₃: this is both wrong and useless. Let us find the correct approximate rref intelligently, step by step. Let us swap the 1st and 3rd rows to get a leading 1 in the first column (after dividing the new row 1 by −1), and then clear out the other two entries of column 1, as usual:

[ 1          0          0.04824 ]      [ 1    0          0.04824 ]
[ 8          2.95176    0       ]  →   [ 0    2.95176   −0.38591 ]
[ 0.95176    8         −1       ]      [ 0    8         −1.04591 ]

To get a row of zeroes, the 2nd and 3rd rows should be parallel to each other. We can easily verify this by dividing each row by its first non-zero entry (in other words, the corresponding entry in column 2):


[ 1    0    0.04824 ]
[ 0    1   −0.13074 ]
[ 0    1   −0.13074 ]

The 2nd and 3rd rows are indeed approximately equal, to 5 decimal places. If the λ we obtained were exact, the two rows would be exactly equal. This gives us the more useful (and correct) approximate rref:

[ 1    0    0.04824 ]
[ 0    1   −0.13074 ]
[ 0    0    0       ]

Thus, our eigenspace is approximately:

Eig(A, 2.048239) ≈ Span({(−0.04824, 0.13074, 1)}).

We can check that:

A · (−0.04824, 0.13074, 1) ≈ (−0.09880, 0.26778, 2.04820), whereas
2.04824 · (−0.04824, 0.13074, 1) ≈ (−0.09881, 0.26777, 2.04824),

so the two sides agree to about four decimal places. We can apply these ideas to find an approximate basis for the other two eigenspaces:

Eig(A, −4.153986) ≈ Span({(6.153986, −5.3781913, 1)}), and
Eig(A, 12.105747) ≈ Span({(−10.105747, −11.37755, 1)}). □

This last Example illustrates an important point: technology can be very useful in performing messy computations for us, but its precision is limited, and careless use could yield misleading or outright false answers. We need to interpret intelligently whatever results we obtain, and re-do our computations if necessary to account for the lack of precision. We also remark that the matrix in the previous Example is symmetric; it will be stated in Chapter 9 and proven in Chapter 10 that all the eigenvalues of a symmetric matrix are real, and thus we can apply an algorithm such as Newton's Method to successfully find all the real roots.
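When such approximations are needed in practice, a numerical library can produce all of the eigenpairs at once. The sketch below (ours, not the text's; it assumes NumPy is available and uses the symmetric matrix reconstructed in the Example above) shows one way to cross-check the values computed by hand.

```python
# Illustrative sketch (not from the text): approximate all eigenpairs of the
# symmetric matrix from the Example with NumPy, then compare with our work.
import numpy as np

A = np.array([[ 3.0, 8.0, -1.0],
              [ 8.0, 5.0,  0.0],
              [-1.0, 0.0,  2.0]])

eigenvalues, eigenvectors = np.linalg.eig(A)
print(np.sort(eigenvalues))
# approximately [-4.153986   2.048239  12.105747]
# Each column of `eigenvectors` is a unit-length eigenvector; rescaling a
# column so that its last entry is 1 recovers the Span vectors found above.
```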


8.2 Section Summary

The shear operators only have λ = 1 as an eigenvalue; for horizontal shears, the eigenspace is Span({i}), whereas for vertical shears, the eigenspace is Span({j}).

Let Π be a plane in ℝ³ passing through the origin, with normal line L. For the projection operator onto Π, the eigenspace for λ = 1 is Π, and the eigenspace for λ = 0 is L. For the reflection operator across Π, the eigenspace for λ = 1 is again Π, and the eigenspace for λ = −1 is L.

8.2.1 - Addenda to the Really Big Theorem on Invertibility: Let A be an n x n matrix. Then, the condition that A is invertible is equivalent to each of the following:

26. det(A) is not 0.
27. λ = 0 is not an eigenvalue of A.

We can try to find integer or rational eigenvalues by using the Integer Roots Theorem or the Rational Roots Theorem, and Descartes' Rule of Signs. Technology and techniques such as Newton's Method can help us approximate irrational eigenvalues and find an approximate basis for the corresponding eigenspaces.

8.2.2 - The Integer Roots Theorem: Let p(x) = xⁿ + c₍ₙ₋₁₎x^(n−1) + ··· + c₁x + c₀ be a polynomial with integer coefficients, with c₀ ≠ 0. Then, all the rational roots of p(x) are in fact integers, and if x = c is an integer root of p(x), then c is a factor of the constant coefficient c₀.

8.2.3 - The Rational Roots Theorem: Let q(x) = cₙxⁿ + c₍ₙ₋₁₎x^(n−1) + ··· + c₁x + c₀ be a polynomial with integer coefficients, with c₀ ≠ 0. Then, all the rational roots of q(x) are of the form x = c/d, where c is a factor of the constant coefficient c₀ and d is a factor of the leading coefficient cₙ.

8.2.4 - Descartes' Rule of Signs: Let p(x) be a polynomial with real coefficients. Then: the number of positive roots of p(x) is equal to the number of sign changes in consecutive coefficients of p(x), or less than this number by an even integer; the number of negative roots of p(x) is the number of sign changes in consecutive coefficients of p(−x), or less than this number by an even integer.

8.2.5: Let p(x) be a polynomial with odd degree. Then p(x) has at least one real root. Thus, all square matrices of odd dimension, such as 3 x 3 or 5 x 5 matrices, have at least one real eigenvalue (although they can still have imaginary eigenvalues).


8.2 Exercises

1. Find a basis for the eigenspaces of each of the eigenvalues λ = −5, 3 and 7 for the matrix in the 3rd Example of this Section.

2. We saw that if A only has integer entries, then its characteristic polynomial p(λ) has only integer coefficients. According to the Integer Roots Theorem, the factors of the constant coefficient c₀ (if c₀ ≠ 0) are the only possible candidates for the integer roots of p(λ). Prove that if c₀ factors as:

c₀ = p₁^(n₁) · p₂^(n₂) · ··· · pₖ^(nₖ),

where p₁, p₂, ..., pₖ are the distinct prime factors of c₀, then there are exactly:

2(n₁ + 1)(n₂ + 1)···(nₖ + 1)

distinct factors of c₀. Hint: What are the possible choices for the power of pᵢ that appears in a factor? Don't forget that you can have positive and negative factors.

3. Use the previous Exercise to get a count of the number of candidates for the integer roots of the following polynomials, list all of them, then factor the polynomials completely and find their roots:

a. p(λ) = λ³ − 8λ² − 3λ + 90
b. p(λ) = λ³ + λ² − 30λ − 72
c. p(λ) = λ³ − 11λ² + 33λ − 15

4. Find the characteristic polynomial, eigenvalues, and a basis for each eigenspace (for the real eigenvalues only), and the dimension of each eigenspace, of the following matrices. Note: irrational eigenvalues will appear in (d), and you will need the Rational Roots Theorem in (f), (m), (n), and (o).

a. - w. [The matrices for parts (a) through (w) of this exercise are displayed in the original text.]

5. The following symmetric matrices have irrational eigenvalues, but each eigenspace is only 1-dimensional. Find p(λ) and the eigenvalues, correct to 4 decimal places. Find an approximate basis for each eigenspace, also correct to 4 decimal places. You may use technology, but heed the warnings in the final Example of this Section with regards to finding the approximate rref of A − λIₙ. [The symmetric matrices for this exercise are displayed in the original text.]

6. Suppose that A is an n x n matrix. Prove that if v is an eigenvector of A associated to the eigenvalue λ, and k ≠ 0, then k·v is again an eigenvector of A associated to λ.

7. Suppose that A is an n x n matrix. Prove that if λ is an eigenvalue for A with associated eigenvector v, then λᵏ is an eigenvalue for Aᵏ for any positive integer k, with associated eigenvector v as well. Hint: use Induction.

8. Suppose that A is an invertible n x n matrix.
a. Prove that if λ is an eigenvalue for A with associated eigenvector v, then 1/λ is an eigenvalue for A⁻¹, with associated eigenvector v as well. As part of your proof, explain why the expression 1/λ makes sense if A is invertible.
b. Use (a) to show that for every eigenvalue λ: Eig(A, λ) = Eig(A⁻¹, 1/λ). Reminder: you must show that if v ∈ Eig(A, λ), then v ∈ Eig(A⁻¹, 1/λ) as well, and vice-versa.

9. Let Π be a plane in ℝ³ passing through the origin, and let L be its normal line. Adapt the ideas in the first part of this Section to find the eigenvalues of proj_L and refl_L, and describe the eigenspaces (hint: the eigenspaces are either L or Π).

10. The Cayley-Hamilton Theorem states that: If A is an n x n matrix with characteristic polynomial p(λ), then p(A) = 0₍ₙₓₙ₎. We can think of this as saying that A is a root of its own characteristic polynomial. This Theorem is very deep, and its proof is complicated. Demonstrate that the Cayley-Hamilton Theorem is true when applied to the matrices in Exercises 4(a) and 4(b).

11. True or False: Determine if the statement is true or false. If the statement is true, cite a definition or Theorem that supports your conclusion, or give a convincing argument why the statement is true. If the statement is false, cite a definition or Theorem that supports your conclusion, or provide a counterexample, or give a convincing argument why the statement is false.
a. If A is an n x n matrix and λ is one of its eigenvalues, then 0ₙ is an eigenvector for A.
b. If A is an n x n matrix and λ is one of its eigenvalues, then 0ₙ is a member of the eigenspace Eig(A, λ).
c. The diagonal entries aᵢ,ᵢ are the eigenvalues of any n x n matrix A.
d. Every 3 x 3 matrix has at least one rational eigenvalue.
e. Every 3 x 3 matrix has at least one real eigenvalue.
f. Every 4 x 4 matrix has at least one real eigenvalue.
g. Every 5 x 5 matrix has at least one real eigenvalue.


8.3 Diagonalization of Square Matrices

One of the most elegant applications of Eigentheory is the process of diagonalizing a square matrix or a linear operator. In this Section, we will study the process for square matrices, and see the process for operators in Section 8.5. We begin by defining the main concept:

Definition: Let A be an n x n matrix. We say that A is diagonalizable if we can find an invertible matrix C such that:

C⁻¹AC = D,

where D = Diag(a₁, a₂, ..., aₙ) is a diagonal matrix, or equivalently: AC = CD or A = CDC⁻¹. We also say that C diagonalizes A. The matrix product C⁻¹AC is also referred to as conjugating A by C. A matrix which is not diagonalizable is also called defective.

The key to understanding the connection between diagonalization and Eigentheory is actually the second equation: AC = CD. If we partition C into its column vectors as:

C = [ v₁ | v₂ | ··· | vₙ ],

we can think of both sides of the equation in terms of matrix multiplication. Recall that AC is the matrix whose columns are Av₁, Av₂, ..., Avₙ. Similarly, in Theorem 4.6.2, we saw that multiplying a matrix C on the right by a diagonal matrix D has the effect of multiplying the columns of C by the corresponding diagonal entry in D. Thus, we get that AC = CD is equivalent to:

[ Av₁  Av₂  ...  Avₙ ] = [ a₁v₁  a₂v₂  ...  aₙvₙ ].

By comparing columns, we see that we must satisfy:

Avᵢ = aᵢvᵢ

for each column vᵢ. Thus, the columns of C are eigenvectors of A, and the corresponding entry aᵢ in D is its eigenvalue. Since C is invertible, these columns must be linearly independent, and consequently the set S = {v₁, v₂, ..., vₙ} is a basis for ℝⁿ by the Two-for-One Theorem. We summarize our discovery in the following:

Theorem 8.3.1 - The Basis Test for Diagonalizability: Let A be an n x n matrix. Then: A is diagonalizable if and only if we can find a basis for ℝⁿ consisting of n linearly independent eigenvectors for A, say {v₁, v₂, ..., vₙ}. If this is the case, then the diagonalizing matrix C is the matrix whose columns are v₁, v₂, ..., vₙ, and the diagonal matrix D contains the corresponding eigenvalues along the main diagonal.


Although there could be as many as n! ways to arrange D, let us agree, for the sake of uniformity, to arrange the eigenvalues in increasing order, and to arrange the columns of C using the same order in which the nullspace basis is written when we sight-read the rref.

Examples: In Section 8.1, we saw the lower triangular matrix

A =
[  4/3    0     0  ]
[ 22/3  −7/3    0  ]
[−10/3   5/3   4/3 ],

with eigenvalues λ = −7/3 and 4/3. We saw that:

Eig(A, −7/3) = Span({(0, −11, 5)}), and
Eig(A, 4/3) = Span({(1, 2, 0), (0, 0, 1)}).

We assemble the matrix:

C =
[   0   1   0 ]
[ −11   2   0 ]
[   5   0   1 ],

which is invertible, and:

C⁻¹ = (1/11) ·
[   2  −1    0 ]
[  11   0    0 ]
[ −10   5   11 ].

We verify that:

C⁻¹AC = Diag(−7/3, 4/3, 4/3) = D,

a diagonal matrix containing the eigenvalues of A (with 4/3 appearing twice, since Eig(A, 4/3) is 2-dimensional). Thus, A is diagonalizable.

In the same example, we changed the entry 22/3 in A to 23/3, yielding:

A₁ =
[  4/3    0     0  ]
[ 23/3  −7/3    0  ]
[−10/3   5/3   4/3 ].

However, this time, we got:

Eig(A₁, −7/3) = Span({(0, −11, 5)}), as before, but
Eig(A₁, 4/3) = Span({(0, 0, 1)}).

Since we only have two independent eigenvectors (and they are not parallel to each other), A₁ is not diagonalizable. □

Hopefully, you noticed that the suspicious issue is the factor λ − 4/3, which was squared in both characteristic polynomials. This will be further explained in the next sub-section.
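The contrast between A and A₁ is easy to verify numerically. The sketch below (our own check, not the text's method) computes the geometric multiplicity of each eigenvalue as the nullity of A − λI, obtained from the rank.

```python
# Illustrative sketch (not from the text): compare geometric multiplicities
# for the diagonalizable matrix A and the defective matrix A1 above.
import numpy as np

A  = np.array([[ 4/3,    0,   0],
               [22/3, -7/3,   0],
               [-10/3, 5/3, 4/3]])
A1 = np.array([[ 4/3,    0,   0],
               [23/3, -7/3,   0],
               [-10/3, 5/3, 4/3]])

def geometric_multiplicity(M, lam, tol=1e-9):
    # nullity of M - lam*I = n - rank(M - lam*I)
    n = M.shape[0]
    return n - np.linalg.matrix_rank(M - lam * np.eye(n), tol=tol)

for name, M in (("A", A), ("A1", A1)):
    print(name, geometric_multiplicity(M, 4/3), geometric_multiplicity(M, -7/3))
# A  2 1   (two independent eigenvectors for 4/3: diagonalizable)
# A1 1 1   (only one eigenvector for 4/3: defective)
```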

Example: Let A =
[ 0  −2 ]
[ 8   0 ].

Then:

det(λI₂ − A) =
|  λ   2 |
| −8   λ |  = λ² + 16.

This matrix does not have real eigenvalues, and thus does not have any eigenvectors. Thus, A is not diagonalizable, even though it almost looks like a diagonal matrix. □

In general, any matrix which has imaginary eigenvalues cannot be diagonalized. However, we will see in Chapter 9 that it will be possible to diagonalize such matrices over the set of matrices with imaginary entries, so to be more precise, we say:

Theorem 8.3.2: Let A be an n x n matrix with imaginary eigenvalues. Then: A is not diagonalizable over the set of real invertible matrices.

Geometric and Algebraic Multiplicities

We saw earlier that if the matrix A does not have n linearly independent eigenvectors, then we cannot diagonalize A. However, notice also that the eigenspace which made this impossible came from a double (i.e. repeated) eigenvalue of p(λ). Clearly this is what produces the complication, and therefore it requires further investigation. First, let us introduce some new terminology:

Definitions - Algebraic and Geometric Multiplicities: Let A be an n x n matrix with distinct (possibly imaginary) eigenvalues λ₁, λ₂, ..., λₖ. Suppose p(λ) factors as:

p(λ) = (λ − λ₁)^(n₁) · (λ − λ₂)^(n₂) · ··· · (λ − λₖ)^(nₖ),

where n₁ + n₂ + ··· + nₖ = n. We call the exponent nᵢ the algebraic multiplicity of λᵢ. We call dim(Eig(A, λᵢ)) the geometric multiplicity of λᵢ. We agree that dim(Eig(A, λᵢ)) = 0 if λᵢ is an imaginary eigenvalue (at least for now).


Note: The Fundamental Theorem of Algebra tells us that the sum of the algebraic multiplicities of the λᵢ must be the degree n of p(λ). A very deep result from a field of mathematics called Algebraic Geometry (just by coincidence) gives the connection between these two multiplicities:

Theorem 8.3.3 - The Geometric vs. Algebraic Multiplicity Theorem: For any eigenvalue λᵢ of an n x n matrix A, the geometric multiplicity of λᵢ is at most equal to the algebraic multiplicity of λᵢ. Thus, following our notation in the previous definitions:

1 ≤ dim(Eig(A, λᵢ)) ≤ nᵢ for every i = 1...k.

In particular, if λᵢ is a simple root (i.e. nᵢ = 1), then dim(Eig(A, λᵢ)) is exactly 1.

Example: Let p(λ) = (λ + 5)³ · (λ − 2)⁴ · (λ² + 4) be the characteristic polynomial of a matrix A. The degree of p(λ) is 3 + 4 + 2 = 9, and thus A must be a 9 x 9 matrix. The Geometric vs. Algebraic Multiplicity Theorem tells us that:

1 ≤ dim(Eig(A, −5)) ≤ 3,
1 ≤ dim(Eig(A, 2)) ≤ 4, and
dim(Eig(A, 2i)) = 0 = dim(Eig(A, −2i)).

The last two eigenspaces have dimension 0, by convention, because 2i and −2i are imaginary numbers. Thus, A is definitely not diagonalizable over the set of real invertible matrices. □
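The bookkeeping of algebraic multiplicities can also be delegated to a computer algebra system. Here is a small sketch (ours, not the text's, using SymPy) applied to the characteristic polynomial of this Example.

```python
# Illustrative sketch (not from the text): algebraic multiplicities via SymPy.
import sympy as sp

lam = sp.symbols('lambda')
p = (lam + 5)**3 * (lam - 2)**4 * (lam**2 + 4)

print(sp.degree(p, lam))    # 9, so A must be a 9 x 9 matrix
print(sp.roots(p, lam))     # {-5: 3, 2: 4, 2*I: 1, -2*I: 1}
```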

Independence of Distinct Eigenspaces

We saw in a previous example that it may not always be possible to get n linearly independent eigenvectors for an n x n matrix. However, since we had two eigenvalues, we were still fortunate enough to find two linearly independent eigenvectors. More generally, the number of eigenvalues gives us a lower bound for the number of linearly independent eigenvectors we can find:

Theorem 8.3.4 - Independence of Distinct Eigenspaces: Suppose that λ₁, λ₂, ..., λₖ are distinct eigenvalues for an n x n matrix A, and suppose that vᵢ is an eigenvector of A corresponding to λᵢ, for i = 1...k. Then: the set S = {v₁, v₂, ..., vₖ} is linearly independent. Thus, if A has a total of m distinct eigenvalues, we can find at least m linearly independent eigenvectors for A.

Proof: This is another proof that deserves to be called magical.

First, since v₁ ≠ 0ₙ, the set {v₁} is linearly independent. Now let's see what happens when we include the 2nd vector. Suppose {v₁, v₂} is a set of two eigenvectors with corresponding distinct eigenvalues λ₁ and λ₂. This set is dependent if and only if v₂ is parallel to v₁, that is, v₂ = c·v₁, where c ≠ 0. Since subspaces are closed under scalar multiplication, this implies that v₂ belongs to the same eigenspace as v₁. This contradicts our condition that λ₁ ≠ λ₂.

The rest of the proof proceeds by Mathematical Induction: Let us assume that the set {v₁, v₂, ..., vⱼ} is linearly independent, for our Induction Hypothesis. We must show that the extended set {v₁, v₂, ..., vⱼ, vⱼ₊₁} is still linearly independent, as our Inductive Step.

Let us consider the dependence test equation for this set:

c₁v₁ + c₂v₂ + ··· + cⱼvⱼ + cⱼ₊₁vⱼ₊₁ = 0ₙ.

By multiplying both sides of this equation by the matrix A on the left, we get:

A(c₁v₁ + c₂v₂ + ··· + cⱼvⱼ + cⱼ₊₁vⱼ₊₁) = A·0ₙ,
c₁Av₁ + c₂Av₂ + ··· + cⱼAvⱼ + cⱼ₊₁Avⱼ₊₁ = 0ₙ,
c₁λ₁v₁ + c₂λ₂v₂ + ··· + cⱼλⱼvⱼ + cⱼ₊₁λⱼ₊₁vⱼ₊₁ = 0ₙ,

where we were able to perform the last step because each vᵢ is an eigenvector with eigenvalue λᵢ. Now, starting with the original highlighted dependence equation above, we can also multiply both sides by λⱼ₊₁, thus getting:

c₁λⱼ₊₁v₁ + c₂λⱼ₊₁v₂ + ··· + cⱼλⱼ₊₁vⱼ + cⱼ₊₁λⱼ₊₁vⱼ₊₁ = 0ₙ.

Let us put our two resulting equations on top of each other:

c₁λ₁v₁ + c₂λ₂v₂ + ··· + cⱼλⱼvⱼ + cⱼ₊₁λⱼ₊₁vⱼ₊₁ = 0ₙ, and
c₁λⱼ₊₁v₁ + c₂λⱼ₊₁v₂ + ··· + cⱼλⱼ₊₁vⱼ + cⱼ₊₁λⱼ₊₁vⱼ₊₁ = 0ₙ.

Notice that the vⱼ₊₁ terms in both equations are identical. Now, here comes the magic: if we subtract the corresponding sides of these equations from each other, we get:

c₁(λ₁ − λⱼ₊₁)v₁ + c₂(λ₂ − λⱼ₊₁)v₂ + ··· + cⱼ(λⱼ − λⱼ₊₁)vⱼ = 0ₙ.

This now looks like a dependence test equation for the set of vectors {v₁, v₂, ..., vⱼ}. This set is assumed to be independent in the Induction Hypothesis, and so each cᵢ(λᵢ − λⱼ₊₁) must be zero! Since the λᵢ are distinct, each λᵢ − λⱼ₊₁ is a non-zero scalar, and so this means that cᵢ = 0 for i = 1...j. But going back to our original dependence equation, we get:

cⱼ₊₁vⱼ₊₁ = 0ₙ.

Since vⱼ₊₁ ≠ 0ₙ, the Zero-Factors Theorem tells us that cⱼ₊₁ = 0. Thus, the extended set {v₁, v₂, ..., vⱼ, vⱼ₊₁} is still linearly independent. This argument shows that we can keep adding eigenvectors to this set, as long as we are adding an eigenvector from a new eigenspace. Thus, {v₁, v₂, ..., vₖ} is linearly independent. ■


Example: Consider the matrix from one of our previous examples:

A₁ =
[  4/3    0     0  ]
[ 23/3  −7/3    0  ]
[−10/3   5/3   4/3 ].

We determined that A₁ is not diagonalizable because we could not find three linearly independent eigenvectors for A₁. But the two vectors that we did find, (0, −11, 5) and (0, 0, 1), are still linearly independent, i.e., not parallel. □

Thanks to the Geometric vs. Algebraic Multiplicity Theorem, together with the Independence of Distinct Eigenspaces Theorem, we have the following Theorem:

Theorem 8.3.5 - The Multiplicity Test for Diagonalizability: Let A be an n x n matrix. Then: A is diagonalizable if and only if, for each of its eigenvalues λᵢ, the geometric multiplicity of λᵢ is exactly equal to its algebraic multiplicity.

Idea of the Proof: We will demonstrate the ideas behind the Proof using a 6 x 6 matrix A. The ideas can be applied in general, and will be outlined in the Exercises. Let us suppose that:

p(λ) = (λ + 3)(λ − 2)³(λ − 5)².

Thus, A has three distinct eigenvalues: λ₁ = −3, λ₂ = 2, and λ₃ = 5, arranged in increasing order for convenience. The Geometric vs. Algebraic Multiplicity Theorem tells us that:

dim(Eig(A, −3)) = 1,
1 ≤ dim(Eig(A, 2)) ≤ 3, and
1 ≤ dim(Eig(A, 5)) ≤ 2.

Now, let us demonstrate both directions of the Theorem:

(⇒) Suppose we know that A is diagonalizable. Thus, there exists a basis B = {v₁, v₂, ..., v₆} for ℝ⁶ consisting of six independent eigenvectors for A. Each of these eigenvectors belongs to exactly one of the eigenspaces above. However, since B is linearly independent, every subset of B is also independent. We know that Eig(A, −3) is exactly 1-dimensional. But suppose that none of the eigenvectors in B belongs to Eig(A, −3). This means that each of these six is either in Eig(A, 2) or Eig(A, 5). But Eig(A, 2) is at most 3-dimensional, so at most three of the six can belong to Eig(A, 2). This is because 4 vectors from a 3-dimensional subspace must be linearly dependent, according to the Dependent Sets from Spanning Sets Theorem (2.2.8). Similarly, at most two of the six can belong to Eig(A, 5). Thus, at least one of the six won't belong to either eigenspace. We get a contradiction, so at least one of the six vectors is from Eig(A, −3).

Now, we are down to five remaining eigenvectors. This time, if fewer than three of these five belong to Eig(A, 2), then at least three of the five have to belong to Eig(A, 5), which is again

impossible because Eig(A, 5) is only 2-dimensional. Thus, exactly three of these five belong to Eig(A, 2). Finally, the remaining two eigenvectors must belong to Eig(A, 5) because we know they don't belong to either of the first two eigenspaces. Thus, each of the eigenspaces must have a dimension which equals its maximum allowable value, which happens to be the corresponding algebraic multiplicity.

(⇐) Conversely, suppose that the geometric multiplicity of each eigenvalue equals its algebraic multiplicity, so that dim(Eig(A, −3)) = 1, dim(Eig(A, 2)) = 3, and dim(Eig(A, 5)) = 2. Choose a basis {v₁} for Eig(A, −3), a basis {v₂, v₃, v₄} for Eig(A, 2), and a basis {v₅, v₆} for Eig(A, 5). We must show that the combined set of six eigenvectors is linearly independent, so that A is diagonalizable by the Basis Test. We start with the test equation for the first four vectors:

c₁v₁ + c₂v₂ + c₃v₃ + c₄v₄ = 0₆.

Multiplying both sides by A turns each eigenvector term into its eigenvalue multiple; multiplying the original equation by 2 instead and subtracting the two results, the Eig(A, 2) terms cancel, and we are left with:

5c₁v₁ = 0₆.

Since we know that {v₁} is independent, we must have 5c₁ = 0, and so c₁ must be zero. Thus,


we can update the original test equation to just:

c₂v₂ + c₃v₃ + c₄v₄ = 0₆.

But we also know that {v₂, v₃, v₄} is independent, and so c₂, c₃, and c₄ are zero. Thus, c₁ through c₄ are all zeroes, and the extended set {v₁, v₂, v₃, v₄} is independent.

Similarly, we can extend this set by the set {v₅, v₆}, the basis for Eig(A, 5). We set up the new test equation:

c₁v₁ + c₂v₂ + c₃v₃ + c₄v₄ + c₅v₅ + c₆v₆ = 0₆.

We emphasize at this point that you should look at this as a completely new equation. We do not know yet that c₁ through c₄ are zero. Now, let us apply our two tricks. We multiply both sides by A, distribute, and obtain:

−3c₁v₁ + 2c₂v₂ + 2c₃v₃ + 2c₄v₄ + 5c₅v₅ + 5c₆v₆ = 0₆.

Next, we multiply both sides of the original test equation by the next eigenvalue λ₃ = 5, and get:

5c₁v₁ + 5c₂v₂ + 5c₃v₃ + 5c₄v₄ + 5c₅v₅ + 5c₆v₆ = 0₆.

Subtracting the previous equation from this equation, v₅ and v₆ cancel out, and we get:

8c₁v₁ + 3c₂v₂ + 3c₃v₃ + 3c₄v₄ = 0₆.

Since we saw earlier that {v₁, v₂, v₃, v₄} is independent, now we see that c₁ through c₄ are all zeroes. The test equation thus reduces to c₅v₅ + c₆v₆ = 0₆. Since we know that {v₅, v₆} is an independent set, we conclude that c₅ = 0 = c₆ as well. Thus {v₁, v₂, v₃, v₄, v₅, v₆} is an independent set. ■

As we saw in one of our earlier Examples, it is time consuming to have to find a basis for every eigenspace of A before being able to determine whether or not A is diagonalizable. However, one important and easy consequence of this Theorem is the following:

Theorem 8.3.6: Let A be an n x n matrix with n distinct, real eigenvalues. Then: A is diagonalizable.

Proof: Since we have n real distinct eigenvalues, p(λ) factors as:

p(λ) = (λ − λ₁)(λ − λ₂)···(λ − λₙ),

for some n distinct real eigenvalues λ₁ through λₙ. As we pointed out in Theorem 8.3.3, if the algebraic multiplicity of λᵢ is exactly one (i.e. λᵢ is a simple root), as we are given, then the geometric multiplicity must also be exactly one, that is, dim(Eig(A, λᵢ)) = 1 as well. By Theorem 8.3.5, A is diagonalizable. ■


Example: Let A =
[ −1  −2   2 ]
[  2   2   1 ]
[  1   0   3 ].

We saw in the previous Section that the characteristic polynomial of this matrix is:

p(λ) = λ³ − 4λ² + 3λ = λ(λ − 1)(λ − 3),

and the eigenvalues are λ = 0, 1 and 3. Even though this matrix is not invertible, because 0 is an eigenvalue, it is diagonalizable, because we have three distinct eigenvalues for this 3 x 3 matrix. To find the eigenvectors, we must compute the matrices A − λI₃ and their rrefs, resulting in:

for λ = 0:
[ −1  −2   2 ]               [ 1   0    3   ]
[  2   2   1 ]   with rref   [ 0   1  −5/2  ]
[  1   0   3 ]               [ 0   0    0   ]

for λ = 1:
[ −2  −2   2 ]               [ 1   0    2   ]
[  2   1   1 ]   with rref   [ 0   1   −3   ]
[  1   0   2 ]               [ 0   0    0   ]

for λ = 3:
[ −4  −2   2 ]               [ 1   0    0   ]
[  2  −1   1 ]   with rref   [ 0   1   −1   ]
[  1   0   0 ]               [ 0   0    0   ]

The eigenspaces are thus:

Eig(A, 0) = Span({(−6, 5, 2)}), Eig(A, 1) = Span({(−2, 3, 1)}), and Eig(A, 3) = Span({(0, 1, 1)}).

We assemble the basis vectors into the columns of C:

C =
[ −6  −2   0 ]
[  5   3   1 ]
[  2   1   1 ].

The diagonalizing matrix is D = Diag(0, 1, 3). This time, let us verify that:


CDC⁻¹ =
[ −6  −2   0 ]   [ 0  0  0 ]   [ −1/3  −1/3   1/3 ]
[  5   3   1 ] · [ 0  1  0 ] · [  1/2    1    −1  ]
[  2   1   1 ]   [ 0  0  3 ]   [  1/6  −1/3   4/3 ]

=
[ 0  −2   0 ]   [ −1/3  −1/3   1/3 ]       [ −1  −2   2 ]
[ 0   3   3 ] · [  1/2    1    −1  ]   =   [  2   2   1 ]  =  A,
[ 0   1   3 ]   [  1/6  −1/3   4/3 ]       [  1   0   3 ]

as expected. □

Powers of Diagonalizable Matrices

One useful application of diagonalization is the ability to find powers of a matrix without a lot of effort. Suppose we have the factorization A = CDC⁻¹; then:

A² = (CDC⁻¹)(CDC⁻¹) = CD(C⁻¹C)DC⁻¹ = CD(Iₙ)DC⁻¹ = C(DIₙD)C⁻¹ = CD²C⁻¹,

where we applied the Associative Property of Matrix Multiplication in some of the steps. Proceeding by induction, we get:

Theorem 8.3.7: Let A be a diagonalizable n x n matrix with A = CDC⁻¹. Then: for all positive integers k:

Aᵏ = CDᵏC⁻¹.

Furthermore, if A is invertible, then:

A⁻¹ = CD⁻¹C⁻¹.

Since Dᵏ is easy to compute for a diagonal matrix D, this gives us an easy way to compute Aᵏ indirectly. The formula for A⁻¹ will be proven in the Exercises.


Example: We saw in the previous Example that A can be diagonalized as:

A =
[ −1  −2   2 ]     [ −6  −2   0 ]   [ 0  0  0 ]   [ −1/3  −1/3   1/3 ]
[  2   2   1 ]  =  [  5   3   1 ] · [ 0  1  0 ] · [  1/2    1    −1  ]
[  1   0   3 ]     [  2   1   1 ]   [ 0  0  3 ]   [  1/6  −1/3   4/3 ].

Thus, if we want the 8th power of this matrix, we get:

A⁸ = CD⁸C⁻¹ =
[ −6  −2   0 ]   [ 0⁸   0    0  ]   [ −1/3  −1/3   1/3 ]
[  5   3   1 ] · [ 0    1⁸   0  ] · [  1/2    1    −1  ]
[  2   1   1 ]   [ 0    0    3⁸ ]   [  1/6  −1/3   4/3 ]

=
[ −6  −2   0 ]   [ 0   0     0   ]   [ −1/3  −1/3   1/3 ]       [  −1     −2      2   ]
[  5   3   1 ] · [ 0   1     0   ] · [  1/2    1    −1  ]   =   [ 1095  −2184   8745  ]
[  2   1   1 ]   [ 0   0   6561  ]   [  1/6  −1/3   4/3 ]       [ 1094  −2186   8747  ]. □
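The computation above is easy to double-check numerically; the sketch below (ours, not the text's) compares C D⁸ C⁻¹ against a direct matrix power.

```python
# Illustrative sketch (not from the text): verify A^8 = C D^8 C^{-1}.
import numpy as np

A = np.array([[-1, -2, 2],
              [ 2,  2, 1],
              [ 1,  0, 3]], dtype=float)
C = np.array([[-6, -2, 0],
              [ 5,  3, 1],
              [ 2,  1, 1]], dtype=float)
D = np.diag([0.0, 1.0, 3.0])

A8_via_diagonalization = C @ np.linalg.matrix_power(D, 8) @ np.linalg.inv(C)
A8_direct = np.linalg.matrix_power(A, 8)

print(np.allclose(A8_via_diagonalization, A8_direct))   # True
```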


8.3 Section Summary

Let A be an n x n matrix. We say that A is diagonalizable if we can find an invertible matrix C such that C⁻¹AC = D, where D = Diag(a₁, a₂, ..., aₙ) is a diagonal matrix, or equivalently: AC = CD or A = CDC⁻¹. We also say that C diagonalizes A. The matrix product C⁻¹AC is also referred to as conjugating A by C. A matrix which is not diagonalizable is also called defective.

8.3.1 - The Basis Test for Diagonalizability: Let A be an n x n matrix. Then: A is diagonalizable if and only if we can find a basis for ℝⁿ consisting of n linearly independent eigenvectors for A.

Example: Suppose that B and B′ are bases for ℙ² and ℙ³, given by:

B = {2, 3 − x, 5 + 7x + x²} and B′ = {−3, 2 + x, 1 − x², x + x³},

respectively. Now, let T : ℙ² → ℙ³ be the linear transformation whose matrix with respect to B and B′ is:

[T]_{B,B′} =
[  7  −1   5 ]
[  2   6  −2 ]
[ −3   8   0 ]
[  4   2   3 ].

As a warm-up, let us remember how to compute T(v), where v = 6 − 4x + 3x². First, we need to encode this polynomial using the basis B. Fortunately, the degrees of the members of B are all distinct, so, starting with the quadratic member, we find the coefficients by inspection, as:

6 − 4x + 3x² = −42(2) + 25(3 − x) + 3(5 + 7x + x²), thus (v)_B = (−42, 25, 3).

Now, we multiply:

[T(v)]_{B′} = [T]_{B,B′} (v)_B =
[  7  −1   5 ]   [ −42 ]       [ −304 ]
[  2   6  −2 ] · [  25 ]   =   [   60 ]
[ −3   8   0 ]   [   3 ]       [  326 ]
[  4   2   3 ]                 [ −109 ].

Finally, we decode these coefficients using B′ to get:


T(v) = −304(−3) + 60(2 + x) + 326(1 − x²) − 109(x + x³)
     = 1358 − 49x − 326x² − 109x³.

Clearly [T]_{B,B′} is not a very convenient matrix to use. Let us therefore find the matrix of T with respect to the standard bases S = {1, x, x²} for ℙ² and S′ = {1, x, x², x³} for ℙ³. We have:

[B]_S =
[ 2   3   5 ]
[ 0  −1   7 ]
[ 0   0   1 ]

and

[B′]_{S′} =
[ −3   2   1   0 ]
[  0   1   0   1 ]
[  0   0  −1   0 ]
[  0   0   0   1 ].

The inverse of the first matrix is:

[B]_S⁻¹ =
[ 1/2   3/2  −13 ]
[  0    −1     7 ]
[  0     0     1 ].

Now, we are ready to apply the formula to obtain our standard matrix:

[T]_{S,S′} = [B′]_{S′} [T]_{B,B′} [B]_S⁻¹

=
[ −3   2   1   0 ]   [  7  −1   5 ]   [ 1/2   3/2  −13 ]
[  0   1   0   1 ]   [  2   6  −2 ]   [  0    −1     7 ]
[  0   0  −1   0 ] · [ −3   8   0 ] · [  0     0     1 ]
[  0   0   0   1 ]   [  4   2   3 ]

=
[ −10    −53    402 ]
[   3      1    −21 ]
[ 3/2   25/2    −95 ]
[   2      4    −35 ].

To check that this matrix is correct, we recompute T(6 − 4x + 3x²). This time, we encode using the standard basis S, thus producing [v]_S = (6, −4, 3), and then we multiply [T]_{S,S′} by this matrix:

[T(v)]_{S′} = [T]_{S,S′} [v]_S =
[ −10    −53    402 ]   [  6 ]       [ 1358 ]
[   3      1    −21 ]   [ −4 ]       [  −49 ]
[ 3/2   25/2    −95 ] · [  3 ]   =   [ −326 ]
[   2      4    −35 ]                [ −109 ].

Finally, we decode this result with respect to the standard basis S′ as:

T(v) = 1358 − 49x − 326x² − 109x³.

We get the same answer, so we can be fairly confident that our standard matrix is correct. □
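Since the change-of-basis formula is just a product of three matrices, it is easy to let a computer carry out the arithmetic. The sketch below (ours, not the text's) rebuilds [T]_{S,S′} from [T]_{B,B′}, [B]_S and [B′]_{S′}, and then re-checks T(v).

```python
# Illustrative sketch (not from the text):
# [T]_{S,S'} = [B']_{S'} @ [T]_{B,B'} @ inverse([B]_S).
import numpy as np

T_BBp = np.array([[ 7, -1,  5],
                  [ 2,  6, -2],
                  [-3,  8,  0],
                  [ 4,  2,  3]], dtype=float)
B_S   = np.array([[2,  3, 5],
                  [0, -1, 7],
                  [0,  0, 1]], dtype=float)      # columns encode B in S
Bp_Sp = np.array([[-3, 2,  1, 0],
                  [ 0, 1,  0, 1],
                  [ 0, 0, -1, 0],
                  [ 0, 0,  0, 1]], dtype=float)  # columns encode B' in S'

T_SSp = Bp_Sp @ T_BBp @ np.linalg.inv(B_S)
print(T_SSp)            # rows (-10,-53,402), (3,1,-21), (1.5,12.5,-95), (2,4,-35)

v_S = np.array([6, -4, 3], dtype=float)          # v = 6 - 4x + 3x^2 in basis S
print(T_SSp @ v_S)      # [1358.  -49. -326. -109.]
```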


The Determinant of an Operator

We are now in a position to extend the concepts of the determinant and eigentheory to linear operators T : V → V. This would seem like a natural thing to do, because any matrix for T with respect to some basis S would be a square matrix [T]_S. It would thus be a simple matter of computing the determinant, characteristic polynomial, and eigenvectors of this matrix. The million dollar question, though, is: Will we get the same answers regardless of the choice of the basis S? The answer, of course, is yes. The key is the equation:

[T]_S = [B]_S [T]_B [B]_S⁻¹,

where B and S are any two bases for V, and [B]_S is the matrix whose columns are the coordinate vectors of the members of B with respect to S. Notice that all of the matrices involved in this equation are n x n matrices, where n = dim(V). This equation immediately leads us to our first goal:

Definition/Theorem 8.5.3: Let S and B be bases for a finite dimensional vector space V, and let T : V → V be a linear operator. Then det([T]_B) = det([T]_S), so we may define the determinant of T as det(T) = det([T]_S), computed with respect to any basis S for V.

B.

2. Repeat Exercise 1, with B = {1 + x − x³, 2 − x + x², 5 − x, 2}, B′ = {1 − 3x² + x³, x − x² − x³, x² − 5x³, −x³}, and v = 5 − 3x + 4x² − 2x³, with ℙ² replaced by ℙ³ in part (b).

3. Let B = {1 + x − x³, 2 − x + x², 5 − x, 2} be the first basis for ℙ³ in Exercise 2, and B′ = {x − x², 1 + x, 2 − x²} the second basis for ℙ² in Exercise 1. Suppose that a linear transformation T : ℙ³ → ℙ² is given by the matrix [T]_{B,B′} displayed in the original text.

a. Compute T(v), where v = 5 − 3x + 4x² − 2x³. Express your final answer as an ordinary vector in ℙ² (which means that you should not forget to decode). Use part of your answer in Exercise 2.

643

4.

,x 3 } andS

1

2

c.

Recompute T(v), where V = 5 - 3x + 4x 2 - 2x 3 , using [T]s,sl•

=

{l,x,x

}.

Let B = { 1 + x 2 , 1 - x + x 2 , 1 + x} be the first basis for 1Jll2 in Exercise 1, and B 1 = { 1 - 3x 2 + x 3 , x - x 2 - x 3 , x 2 - 5x 3 , - x 3 } the second basis for 1Jll3 in Exercise 2. Suppose that a linear transformation T : 1Jll2 ➔ 1Jll3 is given by:

-3

0

5

-3 -1 1 2 1 -4

[T]a al =

-1

7

6

b.

Compute T(v), where v = 4 + 3x - 5x 2 • Express your final answer as an ordinary vector in 1Jll3 . Use part of your answer in Exercise 1. LetS = {1,x,x 2 } andS 1 = {1,x,x 2 ,x 3 }. Find [TJs,si-

c.

Recompute T(v), where V = 4 + 3x- 5x2 , using [T]s,sl•

a.

5.

Find [TJs,st-

LetS

=

{l,x,x

2

b.

Suppose that T :

2 is 2 ➔ 1Jll 1Jll

an operator, with: 1 -1

[T]B =

0 [

1

-1 -2

where B = { 1 + x 2 , 1 - x + x 2 , 1 + x} is the first basis from Exercise 1. a. Compute T(v) where v = 4 + 3x - 5x 2 is the vector from Exercise 1. b. Let S = { 1, x,x 2 }. Find the change of basis matrix [B]s and its inverse [BJs 1• c. d. e. f. 6.

1

Use the formula [T]s = [B]s[T ] 8 [BJs to find [T]sRecompute T(v), where v = 4 + 3x- 5x 2 , using [T]sCompute det(T). Is T invertible? If so, find [r- 1Ja.

Suppose that T :

3 ➔ 1Jll 3 is 1Jll

an operator, with:

[T]a =

-1

0 -2

-1

2

0

-1

3

1 -4

1 3

-2

1

1

1

where B = { 1 + x - x 3 , 2 - x + x 2 , 5 - x, 2} is the first basis from Exercise 2. a. Compute T(v) where v = 5 - 3x + 4x 2 - 2x 3 is the vector from Exercise 2. b. Let S = { 1, x,x 2 , x 3 }. Find the change of basis matrix [B]s and its inverse [B]51• c.

644

1

Use the formula [T]s = [B]s[T] 8 [B]s to find [TJs-

Section 8.5 Change of Basis for Abstract Spaces and

d. e. f. 7.

-

2x3 , using [TJ5 .

Let D : IP2 ➔ IP2 be the differentiation operator, and let B = { 1 + x 2 , 1 - x + x 2 , 1 + x} be the first basis from Exercise 1. a. Find [DJ8 . b. c. d.

8.

v

Recompute T(v), where = 5 - 3x + 4x 2 Compute det(T). Is T invertible? If so, find [1 1 J8 .

Let S = { 1, x, x 2 }. Use the formula [TJ5 use your work from Exercise 5. Compute det(T). Is T invertible? If so, find [r- 1J8 .

[BJ5 [TJ8 [B]s 1 to find [DJ5 . You may

=

Repeat the previous Exercise for the differentiation operator D : IP3 ➔ IP3 , where: B = { 1 + x - x 3, 2 - x

+x2, 5 -

x, 2}

is the first basis for IP3 from Exercise 2 and S = { 1, x, x 2 , x 3 }. You may use your work from Exercise 6.

9.

Let V= Span(B), whereB = {sin(x), cos(x)}, andB 1 = {sin(x+n/6), sin(x+n/3)}. 1 1 a. Show that B is also a basis for V. Reminder: you must show that B is a subset of V to begin with. b. Find the change of basis matrix from B to B 1•

➔ Vis given by:

[T] 8,

1 -3 ] . Find [ TJ . 8 3 -7

c.

Suppose that T: V

d. e.

Compute det(T). Is T invertible? If so, find [r- 1J8 •

f.

Suppose that D : V ➔ Vis the differentiation operator. Find [DJ8 .

g. h.

Compute det(D). Is D invertible? If so, find [D- 1J8 .

10. Let V=Span(B), where B= {e-2.x,x•e-2.x,x operator D : V ➔ V. a. Find [DJs. b. Compute det(D). c. Is D invertible? If so, find [D- 1J8 .

-

[

2

•e- 2r},

and D the differentiation

11. Let T : V ➔ V be a linear operator acting on a finite dimensional vector space V. Prove that Tis invertible if and only if det(T) * 0. 12. Prove that if J v : V ➔ V is the identity operator of V, and B and B 1 are any bases for V, then [J vJB,B1 = CB,B1. Note: this is analogous to Theorem 8.4.3, proven in the last Exercise from Section 8.4, and the proof is exactly the same idea there.

Section 8.5 Determinants for Operators

645

8. 6 Similarity and The Eigentheory of Operators You may have noticed that in the last few Sections, we saw two equations that have the same structure. We said that a square matrix A is

T(p(x)) = 2p(x) + (x + 5)p 1(x) + (x 2 - 3x + 7)p 11(x). W c found that its matrix with respect to the standard basis S = \r 1' x ' x 2 'f is·•

l

~ I 1:

[Tls -[

Since this is upper triangular, the characteristic polynomial is: p(A) = (A - 2)(A - 3 )(}.,- 6), with distinct eigenvalues A = 2, 3 and 6. Thus, each eigenspace must be I-dimensional, as we saw in Section 8.3. As before, let us simultaneously find the eigenvectors by finding the reduced row echelon forms of the matrices [TJs - A.13for each of the values of k

[T]s - 2h = [

~~

4

~ ],

0 0 4

-1 5 ; [T]s - 3h = [ 0 0 14 0 0

[T] 5 - 6h -[

-4 5 0 -3 14 : 0

with rref

l l

, with rref

, with rref

0

The coordinates with respect to S of the single basis vector for each of the respective eigenspaces are therefore: (1,0,0),

(5,1,0)

and (31,8,6).

Lastly, we decode these coordinates using the standard basis S = { 1, x, x 2 }, and get:

Eig(T,2) = Span{l}, Eig(T,3) = Span{S +x},

and

Eig(T,6) = Span{31 + 8x+ 6x2 }. We can compute their images under T:

Section 8.6 Similarity and the Eigentheory of Operators

653

T(1)=2=2•1, T(5 +x) = 2(5 +x) + (x+ 5) • 1 = 3 • (5 +x),

and

T(31 + 8x + 6x2 ) = 2(31 + 8x + 6x 2 ) + (x + 5)(8 + 12x) + (x2 - 3x + 7)(12) 2

= 186+48x+36x

= 6·(31+8x+6x

2 ).

and see that they are indeed eigenvectors, respectively, for λ = 2, 3 and 6. □
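Because the operator's eigentheory reduces to the eigentheory of its coordinate matrix, a computer algebra system can check the work above. The sketch below (ours, not the text's, using SymPy) starts from the coordinate matrix [T]_S found in this Example and reports each eigenvalue together with a basis vector for its eigenspace, written in S-coordinates.

```python
# Illustrative sketch (not from the text): eigenpairs of [T]_S for the
# operator T(p) = 2p + (x+5)p' + (x^2-3x+7)p'' on P^2, with S = {1, x, x^2}.
import sympy as sp

T_S = sp.Matrix([[2, 5, 14],
                 [0, 3,  4],
                 [0, 0,  6]])

for eigenvalue, multiplicity, basis in T_S.eigenvects():
    print(eigenvalue, multiplicity, list(basis[0]))
# 2 1 [1, 0, 0]        -> decodes to the polynomial 1
# 3 1 [5, 1, 0]        -> decodes to 5 + x
# 6 1 [31/6, 4/3, 1]   -> a multiple of (31, 8, 6), i.e. 31 + 8x + 6x^2
```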

Diagonalization of Operators Now that we know how to find eigenvalues and eigenspaces for linear operators, we can generalize the diagonalization process to operators. We leave the proof of the following Theorem as an Exercise:

Definition/Theorem 8.6. 6: Let T : V ➔ V be a linear operator acting on a finite dimensional vector space V. We say that Tis diagonalizable if we can find a basis B for V such that [T] 8 is a diagonal matrix. Thus, Tis diagonalizable basis S of V.

if and only if

[T]s is a diagonalizable matrix for any choice of

Example: Our previous operator T : 1Jll2 ➔ eigenvalues. Thus, with respect to the basis:

[Pl2

is diagonalizable because [TJs has 3 distinct

B = { 1, 5 + x, 31 + 8x + 6x 2 }

consisting of eigenvectors, the matrix of Tis diagonal:

2 0 0 0 3 0

[T] 8 = [

l .

0 0 6

Notice also that if we construct [BJs, where S = { 1, x, x 2 }, then:

[B]s - [

~i

31 :

J

with inverse [BJ:S: = [

1 -5 1

0 0

1/6

][

~ 0

2 5 14 3/2 -4/3

0

1

0 3

4

0 0

6

5 ~ 0

}~~3 ], and so: 1/6

31

JUi l-UHl :

= [T] 8 , as before. 0

654

Section 8.6 Similarity and the Eigentheory of Operators

8. 6 Section Summary Let P and Q be n x n matrices. We say that P is similar to Q if we can find an invertible n x n matrix R such that P = R- 1QR. A relation ~ on the members of a set Xis a function that gives a value of either true or false given an ordered pair ( x, y) of members of X If the value of ~ is true for ( x, y ), we write x ~ y, and say that "x is related to y. " If the value of ~ is false for ( x, y), we write x "f, y and say "x is not related toy." 8.6.1: The relationship "is similar to," symbolized by~, is an equivalence relation on the set of all n x n matrices. In other words, for all n x n matrices P, Q and S: (1) Similarity is Reflexive: P ~ P. (2) Similarity is Symmetric: If P ~ Q, then Q ~ P. (3) Similarity is Transitive: If P ~ Q and Q ~ S, then P ~ S. 8. 6.2: Any equivalence relation ~ acting on a set X partitions X into equivalence classes: X = X, U X 2 U · · · U Xk U · · ·, where x, y E X belong to the same equivalence class X; if and only if x ~ y. Furthermore, every element x E Xbelongs to exactly one equivalence class X;. We can thus visualize X partitioned into these equivalence classes where two distinct equivalence classes do not intersect. 8. 6.3 - Invariant Properties under Similarity: If P ~ Q, then: (1) det(P) = det(Q). (2) Pis invertible if and only if Q is invertible. (3) nullity(P) = nullity(Q). (4) rank(P) = rank(Q). (5) the characteristic polynomial of P and Qare exactly the same. (6) the eigenvalues of Pare exactly the same as the eigenvalues of Q. (7 & 8) if A is a common eigenvalue, then the algebraic and geometric multiplicities of A are the same for P and Q. (9) Pis diagonalizable if and only if Q is diagonalizable. (10) tr(P) = tr(Q), where the trace of A is: tr(A) = a1,1 + a2,2 + ... + an,n8.6.4: Let Q = k • In be a scalar matrix. Then: Q is the only member in its equivalence class under similarity. In other words, the only matrix similar to Q is Q itself. 8.6.5: Let T: V ➔ V be a linear operator acting on a vector space V (which could be infinite dimensional). We say that A is an eigenvalue of T and a non-zero vector E V is an eigenvector for T associated to A, if T(v) = Av. We will denote the corresponding eigenspace by: Eig(T, A) = {v E VIT(v) = AV} ~ V.

v





Again, 0 v E Eig( T, A), even though Ov is never an eigenvector. If Vis finite dimensional, say, dim(V) = n, then we can define the characteristic polynomial of T to be the characteristic polynomial of any matrix [TJs for T, that is, with respect to any basis S of V:pr(A) = det(Aln - [T]s)- This definition does not depend on the choice of basis S for V. In this case, A is an eigenvalue of T if and only if if A is a root of pr(A). 8.6.6: Let T : V ➔ V be a linear operator acting on a finite dimensional vector space V. We say that Tis diagonalizable if we can find a basis B for V such that [T] 8 is a diagonal matrix. Thus, Tis diagonalizable if and only if [T]s is a diagonalizable matrix for any choice of basis S of V.

Section 8.6 Similarity and the Eigentheory of Operators

655

8. 6 Exercises 1.

For each of the operators below: (i) find [TJs, (ii) find det(T), (iii) find the characteristic polynomial of T, (iv) find the eigenvalues of T, (v) find a basis for each eigenspace of T, properly decoded as vectors of V, (vi) diagonalize T, if possible, that is, find a basis B for which [T] 8 is diagonal, and find [T] 8 itself. You may assume (or convince yourself mentally) that Tis indeed linear. If Vis given as Span(S), you may safely assume that S is linearly independent and use S as the standard basis for V. T: IP2 ➔ IP2 , given by: T(p(x)) = (3x- 5)p 1(x) + (4x 2 - 7)p 11(x); S = { 1, x, x 2 }. T: IP3 ➔ IP3 , given by: T(p(x)) = 4p(x) + (2x + 5)p 1(x) + (3x 2 + 2x-4)p 11(x); S = {l,x,x 2 ,x 3 }. c. D : V ➔ V, where V = Span( { sin(5x), cos(5x)} ), and D is the differentiation operator. d. D : V ➔ V, where V = Span( { e-x, e2x, e 5x} ), and D is the differentiation operator. e. D : V ➔ V, where V = Span( { e 3x, xe 3x, x 2e 3x} ), and D is the differentiation operator. Warning: don't forget the product rule and chain rule for this problem. Let D 2 = D o D be the second derivative operator: D 2 : C2 ([R()➔ c0 ([R(),acting on the vector space of all twice differentiable functions with continuous first and second derivatives defined on all real numbers. Note that both of these spaces are infinite dimensional, so we cannot construct a matrix for D 2 , and thus we do not have a characteristic polynomial to work with either. a. b.

2.

Show that/(x) = sin(x) and g(x) = cos(x) are both eigenvectors for D 2 . What are the corresponding eigenvalues? b. Show that h (x) = ekxis an eigenvector for D 2 for all real numbers k. What is the corresponding eigenvalue? c. Show that every positive number µ is an eigenvalue for D 2 , and find at least one eigenvector. d. Show that p (x) = sin(kx) is an eigenvector for D 2 for all real numbers k. What is the corresponding eigenvalue? e. What can we conclude for q (x) = cos(kx)? f. Show that f(x) = sin(x) and g (x) = cos(x) are both eigenvectors for D 4 = D 2 o D 2 , and they have the same eigenvalue. Let Seq be the vector space consisting of all sequences of real numbers: a.

3.

V= (x1,X2,X3, ... ,Xn,••·)We note that we are not requiring the sequences to converge. We will define the Shift transformation:

Shifi(v) = (x2, X3, ... , Xn, ... ), where v is as we defined it above. In other words, we forget x I and make x2 the new first term, and so on, shifting every term up. a. Show that Shift is a linear transformation, i.e. it is both additive and homogeneous. 656

Section 8.6 Similarity and the Eigentheory of Operators

b.

Show that every real number ;L is an eigenvalue for Shift. Hint: solve the equation: Shift(v)

=

A•

v, or

(x2, X3, ... , Xn, ... ) = A• (xi, X2, X3, ... , Xn, ... ).

c.

4.

Show that Eig(Shift, ,;\,) is I-dimensional for any real ;L, and state a basis for the e1genspace. Suppose that T: IP2 ➔ IP2 is an operator whose matrix with respect to S = { 1, x, x 2 } is:

[T]s =

a. b. 5.

5 2-3] 6 .

0 -1

[0

0

4

Show that Tis diagonalizable Find a basis B for IP2 such that [T] 8 is diagonal, and find [T] 8 .

Suppose that T: IP2 ➔ IP2 is an operator whose matrix is: [T]R = Diag(4,-7,3), respect to the basis B = {3 - 5x, 2 + x 2, 1 -x - x 2 }.

with

Find [TJs, where Sis the standard basis S = { 1, x, x 2 }. Show that the relationship x ~ y among human beings, where x ~ y if x and y have the same birthday (not necessarily on the same year) is an equivalence relation. How many equivalence classes are there? 7. Fix an integer n > 1. Define a relation on I via: x ~ y if and only if x - y is a multiple of n, in other words, x - y = kn for some integer k. a. Show that this is an equivalence relation. b. If n = 2, show that the equivalence classes of I under this relation are the sets of even and odd integers, as seen in one of the Examples. c. If n = 3, show that there are 3 equivalence classes of I under this relation. d. In general, show that there are n equivalence classes of I under this relation. What is the smallest positive member in each of these equivalence classes? 8. Suppose that Sis the set of all vector spaces (both finite and infinite dimensional spaces). Define a relation on S via: V ~ W if and only if V is isomorphic to W. Recall this means that there exists a linear transformation T : V ➔ W that is both one-to-one and onto. Prove that ~ is an equivalence relation. Follow up: suppose that V has dimension n. Describe the other members of the equivalence class of V. 9. Suppose that D = Diag( d 1, d 2 , •.. , d 11 ) is a diagonal matrix, and E is another n x n diagonal matrix that contains the same entries on the main diagonal as D, except possibly in a different order. Prove that E is similar to D. Note: the entries do not have to be distinct. Hint: Think of permutations and elementary row operations. 10. Prove that if S = {;L1, A2, ... , An} is a set of n distinct real numbers, and A and B are n x n matrices whose eigenvalues are exactly the members of S, then A and Bare similar. Warning: you cannot use any of the properties that are preserved by similarity, since you may only use these properties if you already know that A and B are similar. Instead, recall what we know about matrices with distinct eigenvalues from Section 8.3. 6.

Section 8.6 Similarity and the Eigentheory of Operators

657

11. Let k P - [

12. 13.

E

!RI.,and x, y non-zero real numbers. Show that the matrices:

~

:

] and Q - [

~~

}re

similar to each other

Hint: Solve for R such that P = R- 1QR. Keep Ras simple as possible. Prove that for any two n x n matrices A and B: tr(AB) = tr(BA). Hint: all you need to do is look at each diagonal entry of both AB and BA. Properties Preserved by Similarity: We will complete the proofs of the properties stated in the main theorem for similar matrices. Suppose that P and Q are n x n matrices and P ~ Q, that is, Pis similar to Q. a. Write down the definition of P ~ Q. b. Use (a) to show that det(P) = det(Q). c. Use (b) to show that Pis invertible if and only if Q is invertible. d. Show that the eigenvalues of P are exactly the same as the eigenvalues of Q. Note: we proved in the text that P and Q have exactly the same characteristic polynomial. e. Show that if ).,is a common eigenvalue of P and Q, then the algebraic multiplicity of A with respect to Pis the same as its algebraic multiplicity with respect to Q. f. Preliminary to the next part: Suppose that {v,, v2, ... , Vk} is a linearly independent set of vectors from Eig(Q,A). Prove that the set of vectors: {R-tVt, R- 1v2, ... , R-'vk} is a linearly independent set of vectors from Eig(P, A), where P = R- 1QR. Hint: slowly compute (R- 1QR)(R- 1 and use the fact that R is invertible. State and prove an analogous statement regarding a set of vectors from Eig(P, A). g. Show that if A is a common eigenvalue of P and Q, then the J(eometric multiplicity of A with respect to Pis the same as its J(eometric multiplicity with respect to Q. Hint: use the previous Exercise to convert a basis for Eig(P, A) to a set of linearly independent vectors from Eig(Q, A), and vice versa, starting with a basis for Eig(Q, A). What does each construction imply about the relative dimensions of Eig(Q, A) and Eig(P, A)? h. Show that nullity(P) = nullity(Q). Hint: consider two cases: both P and Q are invertible and both P and Q are not invertible. Recall that the nullspace of a matrix can be viewed as an eigenspace. 1. Use the Dimension Theorem to show that rank(P) = rank(Q). J. Show that Pis diagonalizable if and only if Q is also diagonalizable. Reminder: you must show both implications. k. Show that tr(P) = tr(Q). Hint: Use Exercise 11. Suppose that A is a diagonalizable n x n matrix, with eigenvalues At, A2 , ... , An (possibly with repetitions). Prove thattr(A) = At + A2 + · · · + i,.1 • Thus, the trace of a diagonalizable matrix is the sum of all its eigenvalues, taking into account multiplicities. Let T: V ➔ Vbe an operator on a finite-dimensional vector space V. Prove that Tis diagonalizable if and only if: a. [T] 5 is a diagonalizable matrix for any choice of basis S of V. b. there exists a basis B for V consisting of eigenvectors for T.

vi),

14.

15.

658

Section 8.6 Similarity and the Eigentheory of Operators

8. 7 The Exponential of a Matrix In Pre-Calculus, we encounter the natural exponential function, ex, where Euler's number e is: e

= lim ( 1 + J_)n. n

n➔oo

Although we don't fully understand the concept of a limit at this level, we can see by experiment that as n gets bigger and bigger, the expression above converges or gets closer and closer to a fam.iliar number:

1 ) 100-- 1. 01100~ ~ 2. 7048 , ( 1 + 100 (1 + 10600) 100000 = 1.00001100000~ 2.7183, and so on. The graph of ex is of course strictly increasinl( over the real numbers, so we can define its inverse, the natural logarithmic function or ln(x). Later, we reverse this chronology by using the Fundamental Theorem of Calculus to first define ln(x) as the definite integral: ln(x) = (

+

dt, where x > 0,

and define ex as the inverse of this one-to-one function. However, it is not until we get to the study of Taylor and Maclaurin Series that we are able to compute as many digits to e as we want, using the famous formula: 00

ex =

L ~xn, n=0 n. 00

e = ~

n!1

so in particular,

1 + 1 + l + 1 + 1 + 1 + .. · ~ 2. 718281828 ... 2 6 24 120

=

We will use this Maclaurin series to extend the operation of exponentiation to square matrices: Definition - The Exponential of a Square Matrix: Let A be an m x m matrix. We define the exponential of A, denoted eA, by the infinite senes: 00

eA =

L -n!1-A n=O

11

=Im+ A+ l_A 2 + l_A 3 + - 1-A 2 6 24

4

+ .. •

This definition seems easy and straightforward enough, but there are certainly some issues that we have to deal with. First, we know how to compute a linear combination of matrices, but what is the meaning of an infinite series of matrices? We can answer this, as usual, by taking a limit. We can compute the sequence of partial sums: k

{Sk(A)}:,,,


where Sk(A) =

L ~A n=0 n.

11 •


These are just polynomial evaluations of A, so they certainly exist. The next issue is: what does it mean for a sequence of matrices to converge to a limit matrix B? We can define it in the natural way: we will say that
$$ \lim_{k \to \infty} S_k(A) = B $$
if the entry in row i, column j of $S_k(A)$ converges to the corresponding entry $B_{ij}$. Although these definitions are of course precise and quantifiable, they do not give us an easy way to compute $e^A$ for an arbitrary matrix A, or even allow us to know if we have a reasonable approximation for $e^A$. Fortunately, we will be sidestepping these technical issues by focusing on the case when A is a diagonalizable matrix.
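As a quick illustration of this entrywise convergence, here is a minimal sketch in Python with NumPy/SciPy (not part of the text; the helper name partial_sum_exp and the sample matrix are made up for the demonstration, and scipy.linalg.expm is used only as a reference value to compare against):

```python
import numpy as np
from scipy.linalg import expm   # reference value for e^A, used only for comparison

def partial_sum_exp(A, k):
    """S_k(A) = sum_{n=0}^{k} (1/n!) A^n, the k-th partial sum of the series for e^A."""
    S = np.eye(A.shape[0])          # the n = 0 term: A^0 / 0! = I_m
    term = np.eye(A.shape[0])
    for n in range(1, k + 1):
        term = term @ A / n         # A^n / n! = (A^{n-1} / (n-1)!) * A / n
        S = S + term
    return S

A = np.array([[0.0, 1.0], [-2.0, -3.0]])    # an arbitrary 2 x 2 example
for k in (2, 5, 10, 20):
    print(k, np.max(np.abs(partial_sum_exp(A, k) - expm(A))))   # entrywise error shrinks toward 0
```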

As our next step, let us see what happens when A is itself a diagonal matrix:

Theorem 8.7.1: Suppose that $D = \mathrm{Diag}(d_1, d_2, \ldots, d_m)$ is a diagonal matrix. Then:
$$ e^D = \mathrm{Diag}\left(e^{d_1}, e^{d_2}, \ldots, e^{d_m}\right). $$
Proof: We know from Section 4.6 that for a diagonal matrix, in the notation above:
$$ \frac{1}{n!}D^n = \mathrm{Diag}\left(\frac{1}{n!}d_1^{\,n}, \frac{1}{n!}d_2^{\,n}, \ldots, \frac{1}{n!}d_m^{\,n}\right), $$
and so
$$ e^D = \sum_{n=0}^{\infty}\frac{1}{n!}D^n = \mathrm{Diag}\left(\sum_{n=0}^{\infty}\frac{1}{n!}d_1^{\,n}, \ldots, \sum_{n=0}^{\infty}\frac{1}{n!}d_m^{\,n}\right) = \mathrm{Diag}\left(e^{d_1}, e^{d_2}, \ldots, e^{d_m}\right). \;\blacksquare $$
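As a one-line numerical check of Theorem 8.7.1 (a Python/SciPy sketch, not from the text; the diagonal entries are arbitrary):

```python
import numpy as np
from scipy.linalg import expm

d = np.array([2.0, -1.0, 0.5])                              # arbitrary diagonal entries
print(np.allclose(expm(np.diag(d)), np.diag(np.exp(d))))    # True: e^D = Diag(e^{d_1}, e^{d_2}, e^{d_3})
```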

This now gives us an easy way to compute the exponential of a diagonalizable matrix:

Theorem 8.7.2: Suppose that A is an $m \times m$ diagonalizable matrix, with $A = CDC^{-1}$ for some invertible matrix C and diagonal matrix $D = \mathrm{Diag}(d_1, d_2, \ldots, d_m)$. Then:
$$ e^A = C\,e^D\,C^{-1} = C \cdot \mathrm{Diag}\left(e^{d_1}, e^{d_2}, \ldots, e^{d_m}\right) \cdot C^{-1}. $$
More generally, if t is a real variable, we have:
$$ e^{tA} = C\,e^{Dt}\,C^{-1} = C \cdot \mathrm{Diag}\left(e^{d_1 t}, e^{d_2 t}, \ldots, e^{d_m t}\right) \cdot C^{-1}. $$
Proof: We know from Section 8.3 that, in the notation above, $A^n = CD^nC^{-1}$, and so $\frac{1}{n!}A^n = \frac{1}{n!}CD^nC^{-1}$. Therefore:
$$ e^A = \sum_{n=0}^{\infty}\frac{1}{n!}A^n = \sum_{n=0}^{\infty}\frac{1}{n!}CD^nC^{-1} = C\left(\sum_{n=0}^{\infty}\frac{1}{n!}D^n\right)C^{-1} = C\,e^D\,C^{-1}. $$
Replacing A with $tA = C(Dt)C^{-1}$ gives the formula for $e^{tA}$. $\blacksquare$


The expression $e^{tA}$ is important in a course in Differential Equations. In this case, the matrix A usually represents the coefficients of a system of linear ordinary differential equations in the variable t, with a given initial or boundary condition. The expression $e^{tA}$ appears in the solution to this system.
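The following sketch (Python with NumPy/SciPy, not from the text) computes $e^{tA}$ exactly as in Theorem 8.7.2, letting a numerical eigensolver supply the matrix C and the eigenvalues $d_i$; the sample matrix is an assumption chosen to be diagonalizable, and the last comment records the connection to $\vec{x}^{\,\prime}(t) = A\vec{x}(t)$.

```python
import numpy as np
from scipy.linalg import expm    # reference implementation, used only for comparison

def exp_tA(A, t):
    """e^{tA} = C Diag(e^{d_1 t}, ..., e^{d_m t}) C^{-1}, valid when A is diagonalizable."""
    d, C = np.linalg.eig(A)                        # eigenvalues d and eigenvector matrix C
    return (C * np.exp(d * t)) @ np.linalg.inv(C)  # C @ Diag(e^{d_i t}) @ C^{-1}

A = np.array([[4.0, 1.0], [2.0, 3.0]])             # eigenvalues 5 and 2, so A is diagonalizable
t = 0.5
print(np.allclose(exp_tA(A, t), expm(t * A)))      # True
# In a system x'(t) = A x(t) with x(0) = x0, the solution is x(t) = e^{tA} x0.
```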

Example: Consider the $3 \times 3$ matrix A from Section 8.3 that we saw was diagonalizable, with $A = CDC^{-1}$ for an invertible matrix C and a diagonal matrix D. Applying Theorem 8.7.2, we can find the matrix exponential $e^A = C\,e^D\,C^{-1}$, and if we wanted $e^{tA}$, we would compute $e^{tA} = C\,e^{Dt}\,C^{-1}$ in exactly the same way. Notice that we get $e^A$ by substituting $t = 1$ into our answer for $e^{tA}$. □ [The explicit matrices A, C, D, $e^A$, and $e^{tA}$ of this Example are omitted here.]


8.7 Section Summary

Let A be an $m \times m$ matrix. We define the exponential of A, denoted $e^A$, by the infinite series:
$$ e^A = \sum_{n=0}^{\infty} \frac{1}{n!}A^n = I_m + A + \frac{1}{2}A^2 + \frac{1}{6}A^3 + \frac{1}{24}A^4 + \cdots $$
8.7.1: If $D = \mathrm{Diag}(d_1, d_2, \ldots, d_m)$ is a diagonal matrix, then $e^D = \mathrm{Diag}\left(e^{d_1}, e^{d_2}, \ldots, e^{d_m}\right)$.
8.7.2: Suppose that A is an $m \times m$ diagonalizable matrix, with $A = CDC^{-1}$ for some invertible matrix C and diagonal matrix $D = \mathrm{Diag}(d_1, d_2, \ldots, d_m)$. Then:
$$ e^A = C\,e^D\,C^{-1} = C \cdot \mathrm{Diag}\left(e^{d_1}, e^{d_2}, \ldots, e^{d_m}\right) \cdot C^{-1}. $$
More generally, if t is a real variable, we have:
$$ e^{tA} = C\,e^{Dt}\,C^{-1} = C \cdot \mathrm{Diag}\left(e^{d_1 t}, e^{d_2 t}, \ldots, e^{d_m t}\right) \cdot C^{-1}. $$

8.7 Exercises

For Exercises (1) to (18): Find $e^A$ and $e^{tA}$ for the following matrices A, which were diagonalized in Section 8.3, Exercises 3 and 4. You may use the matrices C and D which are found in the Answer Key. For your reference, each item indicates all the Exercises where the matrix appears (from Section 8.1 to 8.3).
1. Section 8.1 Exercise 1 (d); Section 8.3 Exercises 1 (b) and 3 (a).
2. Section 8.1 Exercise 1 (m); Section 8.3 Exercises 1 (d) and 3 (b).
3. Section 8.1 Exercise 1 (g); Section 8.3 Exercises 1 (e) and 3 (c).
4. Section 8.1 Exercise 1 (j); Section 8.3 Exercises 1 (g) and 3 (d).
5. Section 8.1 Exercise 1 (k); Section 8.3 Exercises 1 (h) and 3 (e).
6. Section 8.2 Exercise 4 (a); Section 8.3 Exercises 1 (l) and 3 (f).
7. Section 8.2 Exercise 4 (g); Section 8.3 Exercises 1 (n) and 3 (g).
8. Section 8.2 Exercise 4 (h); Section 8.3 Exercises 1 (o) and 3 (h).
9. Section 8.2 Exercise 4 (k); Section 8.3 Exercises 1 (r) and 3 (i).
10. Section 8.2 Exercise 4 (t); Section 8.3 Exercises 1 (s) and 3 (j).
11. Section 8.2 Exercise 4 (u); Section 8.3 Exercises 1 (t) and 3 (k).
12. Section 8.2 Exercise 4 (v); Section 8.3 Exercises 1 (u) and 3 (l).
13. The matrix A in Exercise 2 (a), Section 8.1, and Exercise 4 (a), Section 8.3.
14. The matrix A in Exercise 2 (b), Section 8.1, and Exercise 4 (b), Section 8.3.
15. The matrix A in Exercise 2 (c), Section 8.1, and Exercise 4 (c), Section 8.3.
16. The matrix B in Exercise 2 (d), Section 8.1, and Exercise 4 (d), Section 8.3.
17. The matrix C in Exercise 2 (e), Section 8.1, and Exercise 4 (e), Section 8.3.
18. The matrix B in Exercise 2 (f), Section 8.1, and Exercise 4 (f), Section 8.3.


Chapter Nine

Geometry in the Abstract:

Inner Product Spaces

In this Chapter, we will look for generalizations of the dot product operation in abstract vector spaces, which are called inner products. We will require that these inner products possess four of the properties that dot products possess, and from these properties, derive other properties that are shared with the dot product. Because of these properties, we will see that the Cauchy-Schwarz Inequality from Chapter 2 is still true in such an inner product space, and thus we will be able to generalize the concepts of the length of a vector, and the angle and distance between two vectors. In particular, we will be able to decide when two vectors are orthogonal or perpendicular to each other.

We will show that for any subspace W of an inner product space, we can construct the orthogonal complement, $W^{\perp}$, such that any member of W is orthogonal to any member of $W^{\perp}$. Recall that we did this in Chapter 2 for subspaces of Euclidean space, where, to find a basis for $W^{\perp}$, we find the nullspace of a matrix whose rows form a basis for W. Unfortunately, this does not generalize well in an abstract inner product space, but the Gram-Schmidt Algorithm will accomplish this for us.

When we constructed the projection and reflection operators across lines and planes in 2- and 3-dimensional Euclidean space back in Chapter 3, we first showed that we can always decompose a vector in these spaces as the sum of a vector on the given line or plane, and a vector orthogonal to this line or plane. Similarly, we will generalize this orthogonal decomposition in terms of pairs of subspaces W and $W^{\perp}$ of an inner product space V: any vector $\vec{v} \in V$ can be expressed as a sum $\vec{v} = \vec{w}_1 + \vec{w}_2$, where $\vec{w}_1 \in W$ and $\vec{w}_2 \in W^{\perp}$. Likewise, we will generalize the construction of a projection operator onto a subspace W of V.

We will see a special family of invertible matrices, called orthogonal matrices, whose inverse is simply their transpose. This will require an orthogonality condition among the rows and columns of the matrix. We will demonstrate that symmetric matrices can always be diagonalized by an orthogonal matrix, but this property will be proven in Chapter 10 in greater generality.


9.1 Axioms for an Inner Product Space

Way back in Section 2.4, early in our voyage, we saw the dot product in $\mathbb{R}^n$:
$$ \vec{u} \cdot \vec{v} = u_1 v_1 + u_2 v_2 + \cdots + u_n v_n, $$
where $\vec{u} = (u_1, u_2, \ldots, u_n)$ and $\vec{v} = (v_1, v_2, \ldots, v_n)$, as usual. The dot product takes two vectors $\vec{u}$ and $\vec{v}$ and produces a scalar. We derived several desirable properties of the dot product, and we will use four of them to generalize it for an abstract vector space:

Definition - The Axioms of an Inner Product Space: Let V be a vector space. A bilinear form $\langle\,\cdot \mid \cdot\,\rangle$ on V is a function that takes two vectors $\vec{u}, \vec{v} \in V$ as its inputs, and produces a scalar, denoted $\langle \vec{u} \mid \vec{v} \rangle$. An inner product on V is a bilinear form on V such that the following properties are satisfied by all vectors $\vec{u}, \vec{v}, \vec{w} \in V$, and any $k \in \mathbb{R}$:

1. The Symmetric Property: $\langle \vec{u} \mid \vec{v} \rangle = \langle \vec{v} \mid \vec{u} \rangle$.
2. The Homogeneity Property: $\langle k\vec{u} \mid \vec{v} \rangle = k\,\langle \vec{u} \mid \vec{v} \rangle$.
3. The Additivity Property: $\langle \vec{u} + \vec{v} \mid \vec{w} \rangle = \langle \vec{u} \mid \vec{w} \rangle + \langle \vec{v} \mid \vec{w} \rangle$.
4. The Positivity Property: If $\vec{v} \neq \vec{0}_V$, then $\langle \vec{v} \mid \vec{v} \rangle > 0$.

We also say that V is an inner product space under the inner product $\langle\,\cdot \mid \cdot\,\rangle$. Notice that the Positivity Property deals only with non-zero vectors. It turns out that the Additivity and Symmetric Properties are enough to show that the inner product of the zero vector with any vector, as expected, is zero:

Theorem 9.1.1: Let V be an inner product space. Then, for any $\vec{v} \in V$:
$$ \langle \vec{v} \mid \vec{0}_V \rangle = \langle \vec{0}_V \mid \vec{v} \rangle = 0. $$
In particular, $\langle \vec{0}_V \mid \vec{0}_V \rangle = 0$.

Proof: Since we know that $\vec{0}_V = \vec{0}_V + \vec{0}_V$, we get:
$$ \langle \vec{v} \mid \vec{0}_V \rangle = \langle \vec{v} \mid \vec{0}_V + \vec{0}_V \rangle = \langle \vec{v} \mid \vec{0}_V \rangle + \langle \vec{v} \mid \vec{0}_V \rangle, $$
by the Additivity Property. Since $\langle \vec{v} \mid \vec{0}_V \rangle$ is some real number, it has a negative. We can add this negative to both sides of the equation and get:
$$ \langle \vec{v} \mid \vec{0}_V \rangle + \left(-\langle \vec{v} \mid \vec{0}_V \rangle\right) = \langle \vec{v} \mid \vec{0}_V \rangle + \langle \vec{v} \mid \vec{0}_V \rangle + \left(-\langle \vec{v} \mid \vec{0}_V \rangle\right), \quad \text{thus} \quad 0 = \langle \vec{v} \mid \vec{0}_V \rangle + 0 = \langle \vec{v} \mid \vec{0}_V \rangle. $$
By the Symmetric Property, $\langle \vec{0}_V \mid \vec{v} \rangle = \langle \vec{v} \mid \vec{0}_V \rangle = 0$ as well. $\blacksquare$


Note that the Symmetric Property gives us Homogeneity and Additivity properties in the right vector of the bilinear form as well, that is:
$$ \langle \vec{u} \mid k\vec{v} \rangle = \langle k\vec{v} \mid \vec{u} \rangle = k\,\langle \vec{v} \mid \vec{u} \rangle = k\,\langle \vec{u} \mid \vec{v} \rangle, \quad \text{and} \quad \langle \vec{u} \mid \vec{v} + \vec{w} \rangle = \langle \vec{v} + \vec{w} \mid \vec{u} \rangle = \langle \vec{v} \mid \vec{u} \rangle + \langle \vec{w} \mid \vec{u} \rangle = \langle \vec{u} \mid \vec{v} \rangle + \langle \vec{u} \mid \vec{w} \rangle. $$
If $k = -1$, we also have:
$$ \langle \vec{u} - \vec{v} \mid \vec{w} \rangle = \langle \vec{u} + (-1)\vec{v} \mid \vec{w} \rangle = \langle \vec{u} \mid \vec{w} \rangle + \langle (-1)\vec{v} \mid \vec{w} \rangle = \langle \vec{u} \mid \vec{w} \rangle + (-1)\,\langle \vec{v} \mid \vec{w} \rangle = \langle \vec{u} \mid \vec{w} \rangle - \langle \vec{v} \mid \vec{w} \rangle, $$
and similarly, $\langle \vec{u} \mid \vec{v} - \vec{w} \rangle = \langle \vec{u} \mid \vec{v} \rangle - \langle \vec{u} \mid \vec{w} \rangle$. $\blacksquare$

Euclidean spaces under the ordinary dot product are inner product spaces according to our definition above, where we write $\vec{u} \cdot \vec{v}$ instead of $\langle \vec{u} \mid \vec{v} \rangle$. Obviously, there are many other kinds of inner products, and we will now explore some of them.

Weighted Dot Products

The easiest way to change the dot product is to incorporate a list of weights for each term. For this purpose, let $\gamma_1, \gamma_2, \ldots, \gamma_n$ be n positive numbers. We define an inner product on $\mathbb{R}^n$ by:
$$ \langle \vec{u} \mid \vec{v} \rangle = \gamma_1 u_1 v_1 + \gamma_2 u_2 v_2 + \cdots + \gamma_n u_n v_n. $$

Example: Let us consider $\mathbb{R}^3$ under the bilinear form:
$$ \langle \vec{u} \mid \vec{v} \rangle = 3u_1v_1 + 5u_2v_2 + 2u_3v_3, $$
where $\vec{u}$ and $\vec{v}$ are written as usual. Here, our weights are 3, 5 and 2. For example, if $\vec{u} = (4, -1, 6)$ and $\vec{v} = (2, 2, -3)$, then:
$$ \langle \vec{u} \mid \vec{v} \rangle = 3 \cdot 4 \cdot 2 + 5 \cdot (-1) \cdot 2 + 2 \cdot 6 \cdot (-3) = -22. $$
Note that in contrast, $\vec{u} \cdot \vec{v} = 8 - 2 - 18 = -12$.

The bilinear form is obviously symmetric (we can reverse $u_i$ and $v_i$) and homogeneous (we can factor out k). Let us verify that it is additive:
$$ \langle \vec{u} + \vec{v} \mid \vec{w} \rangle = 3(u_1 + v_1)w_1 + 5(u_2 + v_2)w_2 + 2(u_3 + v_3)w_3 = 3u_1w_1 + 3v_1w_1 + 5u_2w_2 + 5v_2w_2 + 2u_3w_3 + 2v_3w_3 = \langle \vec{u} \mid \vec{w} \rangle + \langle \vec{v} \mid \vec{w} \rangle, $$
after some rearrangements. Now, suppose $\vec{v} \in \mathbb{R}^3$ is a non-zero vector, and consider:
$$ \langle \vec{v} \mid \vec{v} \rangle = 3v_1^2 + 5v_2^2 + 2v_3^2. $$
Clearly this is at least zero. Since at least one coordinate $v_1$, $v_2$ or $v_3$ is not zero, one of the terms is strictly positive, and thus $\langle \vec{v} \mid \vec{v} \rangle > 0$. Therefore our inner product is positive. □


The ideas behind these calculations can of course be generalized to show that this bilinear form is an inner product for all positive weights $\gamma_1, \gamma_2, \ldots, \gamma_n$.
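A short computational sketch (Python with NumPy, not from the text; the helper name weighted_inner is made up) reproduces the numbers from the example above and spot-checks the four axioms on random vectors:

```python
import numpy as np

def weighted_inner(u, v, weights):
    """<u | v> = sum_i gamma_i * u_i * v_i, an inner product when every gamma_i > 0."""
    u, v, w = map(np.asarray, (u, v, weights))
    return float(np.sum(w * u * v))

gamma = [3, 5, 2]                       # the weights from the example above
u, v = [4, -1, 6], [2, 2, -3]
print(weighted_inner(u, v, gamma))      # -22.0, matching the example
print(np.dot(u, v))                     # -12, the ordinary dot product, for contrast

# spot-check the axioms on random vectors (not a proof, just a sanity check)
rng = np.random.default_rng(0)
a, b, c = rng.standard_normal((3, 3))
k = 2.5
assert np.isclose(weighted_inner(a, b, gamma), weighted_inner(b, a, gamma))          # symmetry
assert np.isclose(weighted_inner(k * a, b, gamma), k * weighted_inner(a, b, gamma))  # homogeneity
assert np.isclose(weighted_inner(a + b, c, gamma),
                  weighted_inner(a, c, gamma) + weighted_inner(b, c, gamma))         # additivity
assert weighted_inner(a, a, gamma) > 0                                               # positivity
```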

Inner Products Generated by Isomorphisms

We can generalize the dot product in $\mathbb{R}^n$ further by considering any isomorphism $T: \mathbb{R}^n \to \mathbb{R}^n$ (that is, a one-to-one and onto operator) and defining a new inner product on $\mathbb{R}^n$ by:
$$ \langle \vec{u} \mid \vec{v} \rangle = T(\vec{u}) \cdot T(\vec{v}). $$
Recall that $[T]$ is an invertible $n \times n$ matrix, and we will use this invertibility to show that all the properties of an inner product are satisfied.

Example: Let $T: \mathbb{R}^2 \to \mathbb{R}^2$ be given by:
$$ [T] = \begin{bmatrix} 5 & 3 \\ 3 & 2 \end{bmatrix}. $$
Since $\det([T]) = 1$, T is invertible. Now, suppose $\vec{u} = (4, -7)$ and $\vec{v} = (-1, 6)$. Then:
$$ T(\vec{u}) = \begin{bmatrix} 5 & 3 \\ 3 & 2 \end{bmatrix}\begin{bmatrix} 4 \\ -7 \end{bmatrix} = \begin{bmatrix} -1 \\ -2 \end{bmatrix}, \quad \text{and} \quad T(\vec{v}) = \begin{bmatrix} 5 & 3 \\ 3 & 2 \end{bmatrix}\begin{bmatrix} -1 \\ 6 \end{bmatrix} = \begin{bmatrix} 13 \\ 9 \end{bmatrix}. $$
Thus, we have:
$$ \langle \vec{u} \mid \vec{v} \rangle = T(\vec{u}) \cdot T(\vec{v}) = (-1)(13) + (-2)(9) = -31. $$
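A minimal sketch (Python with NumPy, not from the text) of the inner product generated by this T; it reproduces the computation above and contrasts it with the ordinary dot product:

```python
import numpy as np

T = np.array([[5.0, 3.0], [3.0, 2.0]])   # the matrix [T] from the example; det([T]) = 1, so T is invertible

def iso_inner(u, v, T):
    """<u | v> = T(u) . T(v), an inner product on R^n whenever T is invertible."""
    return float((T @ u) @ (T @ v))

u, v = np.array([4.0, -7.0]), np.array([-1.0, 6.0])
print(T @ u, T @ v)          # [-1. -2.] [13.  9.]
print(iso_inner(u, v, T))    # -31.0
print(u @ v)                 # -46.0, the ordinary dot product, for contrast
```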

Thus, $[B]^{-1} = [B]^T$.

To find $[\mathrm{rot}_{\theta,\vec{n}}]$, we will need the change of basis formula from Section 8.4:
$$ [\mathrm{rot}_{\theta,\vec{n}}] = [B]\,[\mathrm{rot}_{\theta,\vec{n}}]_B\,[B]^{-1} = [B]\,[\mathrm{rot}_{\theta,\vec{z}}]\,[B]^{-1}, $$
where B is the basis from part (e).
f. Use the formula above to prove that $[\mathrm{rot}_{\theta,\vec{n}}]$ is also a proper orthogonal matrix.
g. Multiply the matrices in $[B]\,[\mathrm{rot}_{\theta,\vec{z}}]\,[B]^{-1}$ and show that:
$$ [\mathrm{rot}_{\theta,\vec{n}}] = \begin{bmatrix} a^2 + (1-a^2)\cos\theta & ab - ab\cos\theta - c\sin\theta & ac - ac\cos\theta + b\sin\theta \\ ab - ab\cos\theta + c\sin\theta & b^2 + (1-b^2)\cos\theta & bc - bc\cos\theta - a\sin\theta \\ ac - ac\cos\theta - b\sin\theta & bc - bc\cos\theta + a\sin\theta & c^2 + (1-c^2)\cos\theta \end{bmatrix}. $$
h. The formula $[\mathrm{rot}_{\theta,\vec{n}}] = [B]\,[\mathrm{rot}_{\theta,\vec{n}}]_B\,[B]^{-1}$ involves a composition of three linear operators. Write a short paragraph explaining geometrically what each of these operators does, in the correct order.
i. Find $[\mathrm{rot}_{\theta,\vec{n}}]$, where $\vec{n} = (3, -2, 6)$ and $\theta = \sin^{-1}(3/5)$.
j. Use the matrix in (i) to compute $\mathrm{rot}_{\theta,\vec{n}}(\vec{w})$, where $\vec{w} = (5, 8, -9)$.
k. Use 3-D graphing technology to plot $\vec{n}$, the plane $\Pi$ with normal $\vec{n}$, the vector $\vec{w}$ from part (j), and $\mathrm{rot}_{\theta,\vec{n}}(\vec{w})$.
To wrap up this project, we will show how to rewrite $[\mathrm{rot}_{\theta,\vec{n}}]$ in an elegant way.
l. Let $\vec{u}$ be a fixed vector from $\mathbb{R}^3$. Show that the function $\mathrm{cross}_{\vec{u}}(\vec{v}) = \vec{u} \times \vec{v}$ (cross $\vec{u}$ with $\vec{v}$) is a linear transformation (i.e., it is additive and homogeneous).
m. Find $[\mathrm{cross}_{\vec{n}}]$.
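A numerical sketch (Python with NumPy, not part of the project) of the matrix from part (g). It assumes the axis is a unit vector, so the vector (3, -2, 6) from part (i) is divided by its length 7 before use:

```python
import numpy as np

def rot_matrix(theta, n):
    """Rotation by angle theta about the unit axis n = (a, b, c), using the formula from part (g)."""
    a, b, c = n
    ct, st = np.cos(theta), np.sin(theta)
    return np.array([
        [a*a + (1 - a*a)*ct,  a*b - a*b*ct - c*st,  a*c - a*c*ct + b*st],
        [a*b - a*b*ct + c*st,  b*b + (1 - b*b)*ct,  b*c - b*c*ct - a*st],
        [a*c - a*c*ct - b*st,  b*c - b*c*ct + a*st,  c*c + (1 - c*c)*ct],
    ])

n = np.array([3.0, -2.0, 6.0]) / 7.0        # unit vector along (3, -2, 6), since its length is 7
R = rot_matrix(np.arcsin(3/5), n)
print(np.allclose(R.T @ R, np.eye(3)))      # True: R is orthogonal
print(np.isclose(np.linalg.det(R), 1.0))    # True: R is proper (a rotation)
print(np.allclose(R @ n, n))                # True: the axis n is fixed by the rotation
```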

Notice that we instinctively used some kind of a product rule for radicals, but we need to be careful:

Definition: The square root function can be extended to the negative numbers, with the property:
$$ \sqrt{ab} = \sqrt{a}\,\sqrt{b}, \quad \text{if a and b are both non-negative, or exactly one of them is negative.} $$
In other words, the product rule is invalid if both a and b are negative. Likewise, if a and b are both positive, then:

$$ \sqrt{\frac{-a}{b}} = \frac{\sqrt{-a}}{\sqrt{b}} = i\,\sqrt{\frac{a}{b}}. $$

Let us see why this product rule makes sense: we know that $\sqrt{36} = 6$. If we allow the product rule to work when both factors are negative, we will have:
$$ 6 = \sqrt{36} = \sqrt{(-4)(-9)} = \sqrt{-4}\,\sqrt{-9} = (2i)(3i) = 6i^2 = 6(-1) = -6, $$
giving us the wrong sign.
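The same wrong sign shows up if we blindly apply principal square roots with a complex-number library (a small Python sketch, not from the text):

```python
import cmath

# The "product rule" for square roots fails when both factors are negative:
print(cmath.sqrt(36))                      # (6+0j)
print(cmath.sqrt(-4) * cmath.sqrt(-9))     # (2j)*(3j) = 6*i^2 = (-6+0j), the wrong sign
print((2j) * (3j))                         # (-6+0j)
```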

We also need to be extra careful with quotients, as you will see in the Exercises. Using the set of real numbers and the imaginary unit i, we can construct the set of all complex numbers:

Definition: The field of complex numbers is denoted by:
$$ \mathbb{C} = \{\, a + bi \mid a, b \in \mathbb{R} \,\}. $$
... $> |6 + 5i|$, since $\sqrt{65} > \sqrt{61}$.

The Field Properties

The set of complex numbers and the operations of addition and multiplication in this set possess the same familiar properties as the arithmetic of real numbers:

Theorem 10.1.1 - The Field Properties for the Set of Complex Numbers: Let $z, w, u \in \mathbb{C}$. Then the following properties are true:
1. The Closure Property of Addition: $z + w \in \mathbb{C}$.
2. The Closure Property of Multiplication: $z \cdot w \in \mathbb{C}$.
3. The Commutative Property of Addition: $z + w = w + z$.
4. The Commutative Property of Multiplication: $z \cdot w = w \cdot z$.
5. The Associative Property of Addition: $z + (w + u) = (z + w) + u$.
6. The Associative Property of Multiplication: $z \cdot (w \cdot u) = (z \cdot w) \cdot u$.
7. The Distributive Property of Multiplication over Addition: $z \cdot (w + u) = (z \cdot w) + (z \cdot u)$.
8. The Existence of the Additive Identity: There exists $0_{\mathbb{C}} = 0 + 0i \in \mathbb{C}$ such that $z + 0_{\mathbb{C}} = z$.
9. The Existence of the Multiplicative Identity: There exists $1_{\mathbb{C}} = 1 + 0i \in \mathbb{C}$ such that $z \cdot 1_{\mathbb{C}} = z$.
10. The Existence of Additive Inverses: There exists $-z \in \mathbb{C}$ such that $z + (-z) = 0_{\mathbb{C}}$.
11. The Existence of Multiplicative Inverses: If $z \neq 0_{\mathbb{C}}$, there exists $z^{-1} \in \mathbb{C}$ such that $z \cdot z^{-1} = 1_{\mathbb{C}}$.
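These properties can be spot-checked with Python's built-in complex type (a small sketch, not from the text; the sample numbers are arbitrary, and the tolerances allow for floating-point round-off):

```python
z, w, u = complex(3, -2), complex(1, 5), complex(-4, 7)       # arbitrary complex numbers

assert z + w == w + z and z * w == w * z                      # commutativity
assert z + (w + u) == (z + w) + u                             # associativity of addition
assert abs(z * (w + u) - (z * w + z * u)) < 1e-12             # distributivity, up to round-off
assert z + (-z) == 0 and abs(z * (1 / z) - 1) < 1e-12         # additive and multiplicative inverses
print("all field-property checks passed")
```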