
Student Mathematical Library, Volume 76

Winding Around: The Winding Number in Topology, Geometry, and Analysis
John Roe


American Mathematical Society Mathematics Advanced Study Semesters


Editorial Board: Satyan L. Devadoss, Erica Flapan, John Stillwell (Chair), Serge Tabachnikov

2010 Mathematics Subject Classification. Primary 55M25; Secondary 55M05, 47A53, 58A10, 55N15.

For additional information and updates on this book, visit www.ams.org/bookpages/stml-76

Library of Congress Cataloging-in-Publication Data

Roe, John, 1959–
Winding around : the winding number in topology, geometry, and analysis / John Roe.
pages cm. — (Student mathematical library ; volume 76)
Includes bibliographical references and index.
ISBN 978-1-4704-2198-4 (alk. paper)
1. Mathematical analysis—Foundations. 2. Associative law (Mathematics) 3. Symmetric functions. 4. Commutative law (Mathematics) I. Title.
QA299.8.R64 2015
515—dc23
2015019246

Copying and reprinting. Individual readers of this publication, and nonprofit libraries acting for them, are permitted to make fair use of the material, such as to copy select pages for use in teaching or research. Permission is granted to quote brief passages from this publication in reviews, provided the customary acknowledgment of the source is given. Republication, systematic copying, or multiple reproduction of any material in this publication is permitted only under license from the American Mathematical Society. Permissions to reuse portions of AMS publication content are handled by Copyright Clearance Center’s RightsLink service. For more information, please visit: http:// www.ams.org/rightslink. Send requests for translation rights and licensed reprints to reprint-permission @ams.org. Excluded from these provisions is material for which the author holds copyright. In such cases, requests for permission to reuse or reprint material should be addressed directly to the author(s). Copyright ownership is indicated on the copyright page, or on the lower right-hand corner of the first page of each article within proceedings volumes.

© 2015 by the author. All rights reserved.
Printed in the United States of America.
∞ The paper used in this book is acid-free and falls within the guidelines established to ensure permanence and durability.
Visit the AMS home page at http://www.ams.org/

Contents

Foreword: MASS and REU at Penn State University
Preface

Chapter 1. Prelude: Love, Hate, and Exponentials
§1.1. Two sets of travelers
§1.2. Winding around
§1.3. The most important function in mathematics
§1.4. Exercises

Chapter 2. Paths and Homotopies
§2.1. Path connectedness
§2.2. Homotopy
§2.3. Homotopies and simple-connectivity
§2.4. Exercises

Chapter 3. The Winding Number
§3.1. Maps to the punctured plane
§3.2. The winding number
§3.3. Computing winding numbers
§3.4. Smooth paths and loops
§3.5. Counting roots via winding numbers
§3.6. Exercises

Chapter 4. Topology of the Plane
§4.1. Some classic theorems
§4.2. The Jordan curve theorem I
§4.3. The Jordan curve theorem II
§4.4. Inside the Jordan curve
§4.5. Exercises

Chapter 5. Integrals and the Winding Number
§5.1. Differential forms and integration
§5.2. Closed and exact forms
§5.3. The winding number via integration
§5.4. Homology
§5.5. Cauchy’s theorem
§5.6. A glimpse at higher dimensions
§5.7. Exercises

Chapter 6. Vector Fields and the Rotation Number
§6.1. The rotation number
§6.2. Curvature and the rotation number
§6.3. Vector fields and singularities
§6.4. Vector fields and surfaces
§6.5. Exercises

Chapter 7. The Winding Number in Functional Analysis
§7.1. The Fredholm index
§7.2. Atkinson’s theorem
§7.3. Toeplitz operators
§7.4. The Toeplitz index theorem
§7.5. Exercises

Chapter 8. Coverings and the Fundamental Group
§8.1. The fundamental group
§8.2. Covering and lifting
§8.3. Group actions
§8.4. Examples
§8.5. The Nielsen-Schreier theorem
§8.6. An application to nonassociative algebra
§8.7. Exercises

Chapter 9. Coda: The Bott Periodicity Theorem
§9.1. Homotopy groups
§9.2. The topology of the general linear group

Appendix A. Linear Algebra
§A.1. Vector spaces
§A.2. Basis and dimension
§A.3. Linear transformations
§A.4. Duality
§A.5. Norms and inner products
§A.6. Matrices and determinants

Appendix B. Metric Spaces
§B.1. Metric spaces
§B.2. Continuous functions
§B.3. Compact spaces
§B.4. Function spaces

Appendix C. Extension and Approximation Theorems
§C.1. The Stone-Weierstrass theorem
§C.2. The Tietze extension theorem

Appendix D. Measure Zero
§D.1. Measure zero subsets of R and of S¹

Appendix E. Calculus on Normed Spaces
§E.1. Normed vector spaces
§E.2. The derivative
§E.3. Properties of the derivative
§E.4. The inverse function theorem

Appendix F. Hilbert Space
§F.1. Definition and examples
§F.2. Orthogonality
§F.3. Operators

Appendix G. Groups and Graphs
§G.1. Equivalence relations
§G.2. Groups
§G.3. Homomorphisms
§G.4. Graphs

Bibliography
Index

Foreword: MASS and REU at Penn State University

This book is part of a collection published jointly by the American Mathematical Society and the MASS (Mathematics Advanced Study Semesters) program as a part of the Student Mathematical Library series. The books in the collection are based on lecture notes for advanced undergraduate topics courses taught at the MASS and/or Penn State summer REU (Research Experiences for Undergraduates). Each book presents a self-contained exposition of a nonstandard mathematical topic, often related to current research areas, accessible to undergraduate students familiar with an equivalent of two years of standard college mathematics and suitable as a text for an upper division undergraduate course.

Started in 1996, MASS is a semester-long program for advanced undergraduate students from across the USA. The program’s curriculum amounts to sixteen credit hours. It includes three core courses from the general areas of algebra/number theory, geometry/topology, and analysis/dynamical systems, custom designed every year; an interdisciplinary seminar; and a special colloquium. In addition, every participant completes three research projects, one for each core course. The participants are fully immersed into mathematics, and this, as well as intensive interaction among the students, usually leads to a dramatic increase in their mathematical enthusiasm and achievement. The program is unique for its kind in the United States.

The summer mathematical REU program is formally independent of MASS, but there is a significant interaction between the two: about half of the REU participants stay for the MASS semester in the fall. This makes it possible to offer research projects that require more than seven weeks (the length of the REU program) for completion. The summer program includes the MASS Fest, a two to three day conference at the end of the REU at which the participants present their research and that also serves as a MASS alumni reunion. A nonstandard feature of the Penn State REU is that, along with research projects, the participants are taught one or two intense topics courses.

Detailed information about the MASS and REU programs at Penn State can be found on the website www.math.psu.edu/mass.

Preface

Mathematics is an endlessly fruitful subject. One reason is its ability to make lemons into lemonade. In mathematics, the gap between what we’re hoping to prove and what is actually true can itself become something that we can measure, something we can quantify — the basis for a whole new world of mathematical theory.

Let me give an example. In Calculus II, you learn that every (reasonably smooth) function of one variable is the derivative of another function — the fundamental theorem of calculus says that integration is the reverse of differentiation. In Calculus III, you find out that in higher dimensions there is a necessary condition that must be satisfied if n given functions are to be the partial derivatives of another function. For instance, in dimension 2, if functions u and v are to be the partial derivatives (with respect to x and y) of a function f, then the integrability condition

∂u/∂y = ∂v/∂x

must be satisfied. Is this necessary condition always sufficient? For functions defined on a disc, the answer is yes (“every irrotational vector field is the gradient of a potential”). On more general domains, though, the answer is no, as is shown by the notorious example

u = y/(x² + y²),    v = −x/(x² + y²),

defined on R² \ {(0, 0)}. Most Calculus III courses treat this as a nuisance, an anomaly. What if we instead treated it as a clue, a signpost, the start of a trail that might lead to a new kind of mathematics?

In fact, this trailhead leads us up one of the many routes to the summit of Mount Winding-Number, one of the most beautiful peaks of the Mathematical Range.¹ This book is a sort of hiker’s guide to that mountain. Some guides want to get you to the top as quickly as possible so as to move on to “greater” things. I am not one of those. Rather, I want us to take our time, to explore different paths, and to get to know the shape of the mountain from various angles: “winding around” is a description of the book’s methodology as well as of its subject matter. Only in the final chapter will we begin to explore the high ridge that connects our mountain to the “greater ranges” of algebraic topology.

The book originates from a course taught in Penn State’s MASS program in the fall of 2013. MASS is a unique semester-long intensive experience that brings together a “critical mass” of highly motivated undergraduate students from colleges across the USA and elsewhere. It is a pleasure to record my thanks for the opportunity to share in this program once again and for the energetic participation of all the students in the course. This book is dedicated to you all.

Note to the reader: Our trail in this book will wind through several different parts of mathematics, parts which are often segregated in their own courses with titles like “abstract algebra” or “analysis” or “geometry” or “topology”. Probably, you will be more familiar with some of these than with others. Don’t worry! A series of appendices reviews necessary background (and gives suggestions for further reading if you want to follow up in greater depth) in these various subject areas. As you read through the main text, notes will direct you to the relevant appendix at the first point that its concepts are required. Then you can decide whether to read the appendix for a quick refresher or to continue with the main text and hope for the best. Whichever you decide, be sure to have fun! This is a beautiful mathematical journey and I want you to enjoy it. If you have any comments or suggestions for improving the book, feel free to contact me at [email protected]. The website for this book is www.ams.org/bookpages/stml-76.

John Roe

¹For an extended riff on the idea of mathematics as mountaineering, see [1].

Chapter 1

Prelude: Love, Hate, and Exponentials

1.1. Two sets of travelers

A topologist is a mathematician who can’t tell the difference between a donut and a coffee mug.

Figure 1.1. Transforming a donut into a coffee mug.

This well-known saying expresses the idea that topology studies those properties of “spaces” (we will have to say what we mean by that) which are unaffected by “continuous changes” (we will have to say what we mean by that also). Why might mathematicians be interested in such a thing? Doesn’t it seem rather, well, imprecise? Here’s a story (freely adapted from Chapter 1 of [3], where it is attributed to N. N. Konstantinov) that hints at an answer.

In a certain country there are two cities — call them Aberystwyth and Betws-y-Coed — and two roads that join them: the “low road” and the “high road”. In A dwell two lovers, Maelon and Dwynwen, who must travel to B: M by the high road, and D by the low. So great is the force of their love that if at any instant they are separated by ten miles or more, they will surely die. As well as a pair of lovers, our story contains a pair of sworn enemies, Llewelyn and John. As our story begins, L is in A, J is in B, and they must exchange places, L traveling from A to B via the high road while J travels from B to A via the low road. So great is the force of their hatred that if at any instant they are separated by ten miles or less, they will surely die. Prove that tragedy is inevitable. At least two people will end up dead.

Remark 1.1.1. The point about this story is that we are given no specific information about the travels of D, M, L, and J: how fast they go, whether they halt on the journey, whether they speed up or slow down or even backtrack. Any mathematical tool effective enough to solve the problem must not care about these kinds of “geometrical” specifics: must not care, in fact, about the difference between the donut and the coffee mug. It is wrong to suppose that topology, because it does not care about such distinctions, is somehow imprecise. On the contrary! Only a truly powerful theory can draw precise, specific conclusions from such unspecific initial data.

There are two components to solving the problem. The first is to set up a suitable graphical representation, which turns this picturesque story into a problem in topology. The second is to solve the resulting topological problem.

In the first step, we parameterize the problem by the unit square S = [0, 1] × [0, 1]. A point (x, y) ∈ S is thought of as describing the location of a pair of characters (either M and D, or L and J) along the high and low roads, respectively. So for instance the point (0, 0) represents “both characters are at A”, (1, 1) represents “both characters are at B”, (0.4, 0.7) represents “the first character is 40 percent of the way along the high road from A to B and the second character is 70 percent of the way along the low road”, and so on. The travels of a pair of characters along the high and low roads are now encoded in the movement of the single point (x, y) through S.

Figure 1.2. Parameterizing the lovers and haters problem. (The unit square, with the high road along the horizontal axis and the low road along the vertical; the lovers’ path joins (0, 0) to (1, 1) and the haters’ path joins (0, 1) to (1, 0).)

Now the terms of the problem say that the path which describes the motion of the pair (M, D) must start at (0, 0) and end at (1, 1). And the path which describes the motion of (L, J) must start at (0, 1) and end at (1, 0). So (and this is the topological bit that we’ll have to come back to), “obviously”, the two paths have to cross (Figure 1.2).

Okay, what happens at a crossing point (x₀, y₀)? This represents a pair of points — one on the high road, one on the low — which are occupied (at different times) both by M and D and by L and J. If that pair of points is 10 miles or more apart, it spells doom for M and D; 10 miles or less, curtains for L and J. Either way, tragedy is inevitable, just as the problem says. So here is the key topological fact that we have to prove.

Theorem 1.1.2. Two continuous paths in the unit square S, one joining (0, 0) to (1, 1) and the other joining (0, 1) to (1, 0), must cross somewhere.


Surprisingly (perhaps) this is not easy to prove. Let’s look at one attempted proof and critique that.

Attempted Proof #1: Let the two paths be the graphs of continuous functions f(x) and g(x). Thus f(0) = 0, f(1) = 1, g(0) = 1, g(1) = 0. Therefore if we consider the function h(x) = f(x) − g(x), we have h(0) = −1 and h(1) = 1. By the intermediate value theorem, h(x₀) = 0 for some x₀. Then f(x₀) = g(x₀) = y₀, say, so the two paths cross at (x₀, y₀). (?)

The trouble with this argument is that it assumes that our paths can be represented as the graphs of functions — in other words, that there is no “backtracking” in the x-direction. But nothing in the statement of the problem requires this, and there are many continuous paths which cannot be represented as graphs, either in this way where y is a function of x or in the reverse way where x is a function of y. In a sense, the “no backtracking assumption” has allowed us to reduce the 2-dimensional problem to a 1-dimensional one, which can then be solved using a 1-dimensional tool, the intermediate value theorem. Without making this assumption we are confronted with a situation which requires essentially 2-dimensional tools.

Attempted Proof #2: Consider the loop in the plane formed by traveling from (0, 0) to (1, 1) along the lovers’ path and then returning via the circular arc

t → (cos t, 1 + sin t),    0 ≤ t ≤ 3π/2.

The point (1, 0) is clearly outside the loop and (0, 1) is inside it, so any path — such as the haters’ path — from one to the other must cross the loop somewhere. This argument is correct, but the notions “outside” and “inside” have to be made precise, and this isn’t as easy as it may seem — especially if we consider paths that may cross themselves or self-intersect. What we will end up doing is defining whether a point p is “outside” or “inside” a loop γ by counting how many times γ “winds around” p. Of course that simply shifts the question to explaining what we mean by “winds around”, but this is a question to which it is possible to give a precise answer.
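
The intermediate value theorem step in Attempted Proof #1 is easy to make computational. The following sketch (an illustration, not from the book; the particular paths f and g are arbitrary choices of graph-type paths) locates the crossing by bisection on h(x) = f(x) − g(x):

```python
# Numerical companion to Attempted Proof #1 (a sketch): when both paths are
# graphs y = f(x) and y = g(x), the intermediate value theorem applied to
# h(x) = f(x) - g(x) locates a crossing by bisection.

def f(x):   # "lovers' path" as a graph: f(0) = 0, f(1) = 1 (illustrative)
    return x ** 2

def g(x):   # "haters' path" as a graph: g(0) = 1, g(1) = 0 (illustrative)
    return 1.0 - x

def crossing(f, g, tol=1e-12):
    """Bisect h(x) = f(x) - g(x) on [0, 1]; h(0) = -1 < 0 < 1 = h(1)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) - g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return lo

x0 = crossing(f, g)
print(x0, f(x0), g(x0))   # x0 ~ 0.618..., and f(x0) ~ g(x0) ~ 0.381...
```

Of course, this works only under the very “no backtracking” assumption that the critique above identifies as the gap in the argument.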


1.2. Winding around

Counting revolutions or “windings” is an important and familiar notion, in everyday life as well as in mathematics and science. We measure our days by revolutions of the earth, our months by revolutions of the moon around the earth, and our years by revolutions of the earth around the sun. Computing orbits and their periods is the beginning of the theory of gravitation. The metaphor of life as a “wheel of fortune” resonates through cultures ancient and modern. Jerry Garcia sang in 1972:

    The wheel is turning and you can’t slow down
    You can’t let go and you can’t hold on
    You can’t go back and you can’t stand still
    If the thunder don’t get you then the lightning will.

How many times does the wheel turn? If we stipulate that at the end of the story the wheel is in the same position as it was in the beginning, then the answer is an integer — a whole number of turns, positive (by convention) for counterclockwise revolutions and negative for clockwise ones. This integer is the winding number, the central concept of this book. Notice that to compute it, you have to know the whole continuous story of the motion of the wheel: it is not enough to look at snapshots of its beginning and end. In other words, the question “how many times around” is at root a topological one, and its answer, the winding number, is a topological notion.

We’ve already seen in the previous section an example of how any kind of continuous motion can be conceptualized as a path in a suitable abstract space, that is, a mapping from the unit interval into that space. Similarly, a continuous motion that returns to its starting point can be conceptualized as a loop, that is, a mapping of the unit circle into a space. The winding number provides a way to classify and distinguish such loops. As we hinted above, it is the key to such intuitively natural notions as the distinction between the “inside” and the “outside” of a closed curve in the plane.

Many students will first meet the winding number in a course on complex analysis, rather than topology. This is because of the
beautiful way the winding number enters into Cauchy’s residue theorem, which allows one to compute certain integrals of a function f(z) in terms of the behavior of f at certain special points, its so-called poles or singularities, and the winding numbers of loops around these singularities. That powerful subject is not emphasized here, however (in particular, one does not need any prior acquaintance with complex analysis in order to read this book). Why? Because important as complex analysis is (with its applications throughout mathematics, physics, and engineering), the notion of winding number turns out to have ramifications far beyond even that field. In fact, it’s not really too much of a stretch to see the winding number as the golden cord which guides the student through the labyrinth of classical mathematics: connecting algebra and analysis, potential theory and cohomology, complex numbers and just about everything.

In this book, we will look at some of the many ways that winding numbers show up in mathematics. The settings are quite diverse: topology, geometry, functional analysis, complex analysis, algebraic systems, and even Lie groups. However, underneath it all is a simple idea: winding around. Let’s get started.

1.3. The most important function in mathematics

We’ll begin by renewing our acquaintance with a familiar object — the function eˣ — from the viewpoint of the complex plane. As Euler discovered, the exponential and trigonometric functions are closely related in the complex domain, and in particular the exponential function can be used to describe the unit circle in C. It should therefore be no surprise that exponentials are going to be closely involved in our discussion of the winding number, which is all about continuous travel on the unit circle. The exponential function exp(z), or e^z, is defined by

exp(z) = 1 + z + z²/2! + z³/3! + ⋯ .


Rudin [34] begins his classic book Real and Complex Analysis with the statement “This is the most important function in mathematics.” Before we can start looking at its properties, though, we need to remind ourselves what kind of thing z is here. Remember that a complex number is a formal expression of the sort z = x + yi where x and y are real numbers and i² = −1. (We call x the real part of z and y the imaginary part, and we use the notation x = Re z, y = Im z.) We’ll think of x + yi as represented by the point (x, y) of the plane (sometimes called the complex plane or the Argand diagram in this context). There is no problem in adding, subtracting, or multiplying complex numbers by the usual rules. However, the following is a nontrivial fact.

Theorem 1.3.1. The complex numbers form a field; i.e., every nonzero complex number has a multiplicative inverse.

Proof. We write an explicit formula for the inverse. If z = x + yi is a complex number, then its absolute value or modulus |z| is the positive real number defined by |z|² = x² + y². The complex conjugate of z is z̄ = x − yi. One computes zz̄ = x² + y² = |z|². Thus, if z ≠ 0, one has |z| > 0, and

z̄/|z|² = x/(x² + y²) − (y/(x² + y²))i

is the multiplicative inverse of z. □
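
For readers who like to experiment, here is a quick numerical check of the formula in the proof (a sketch using Python’s built-in complex numbers; the sample value of z is an arbitrary choice):

```python
# Checking Theorem 1.3.1 numerically (sketch): the multiplicative inverse
# of z = x + yi is z̄/|z|² = x/(x²+y²) − (y/(x²+y²))i.
z = 3 + 2j
inv = z.conjugate() / abs(z) ** 2   # z̄ / |z|²
print(inv)        # (0.23076923076923078-0.15384615384615385j)
print(z * inv)    # ≈ (1+0j), up to floating-point rounding
print(1 / z)      # Python's built-in division agrees
```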



Remark 1.3.2. The field C may be considered as a vector space over R (see Appendix A to review the theory of vector spaces and linear algebra). When so considered, C is 2-dimensional: any R-basis has two elements (the canonical example is of course the basis consisting of 1 and i). The possible R-bases (z, w) of C fall into two classes, right-handed like (1, i) and left-handed like (1, −i). Formally we may say that (z, w) is a right-handed basis if Im(z̄w) > 0 and a left-handed basis if Im(z̄w) < 0. (If Im(z̄w) = 0, then z and w do not form a basis.)

The exponential series

exp(z) = ∑_{n=0}^∞ zⁿ/n!

converges for all values of z and defines a differentiable function on the whole complex plane (such a function is called an entire function). We also use the notation e^z for this function. By term-by-term multiplication and differentiation one verifies (treat these as exercises)

(a) addition law: e^{z+w} = e^z e^w;

(b) differentiation law: the function z → e^z is its own derivative.

The sine and cosine functions are defined in terms of the exponential by

sin z = (e^{iz} − e^{−iz})/2i = ∑_{n=0}^∞ (−1)ⁿ z^{2n+1}/(2n + 1)!,

cos z = (e^{iz} + e^{−iz})/2 = ∑_{n=0}^∞ (−1)ⁿ z^{2n}/(2n)!.

The exponential, sine, and cosine functions are real-valued for real arguments, and we have e^{iz} = cos z + i sin z for all z. Moreover, since the power series for the exponential function has real coefficients, the complex conjugate of e^z is e^{z̄}. It follows that

|e^z|² = e^z · e^{z̄} = e^{z+z̄} = e^{2 Re z},

so |e^z| = e^{Re z} for all complex numbers z. In particular, |e^{iy}| = 1 for all real y.
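
These identities are easy to test numerically. The following sketch (an illustration, not part of the book; the sample values of z and w are arbitrary) spot-checks the addition law, Euler’s formula, and the modulus identity using Python’s cmath module, whose exp, sin, and cos implement the power series above for complex arguments:

```python
import cmath, math

z, w = 0.3 + 1.1j, -0.7 + 2.0j

# (a) addition law: e^{z+w} = e^z e^w
print(abs(cmath.exp(z + w) - cmath.exp(z) * cmath.exp(w)))   # ~1e-16

# Euler: e^{iz} = cos z + i sin z (valid for complex z as well)
print(abs(cmath.exp(1j * z) - (cmath.cos(z) + 1j * cmath.sin(z))))  # ~0

# |e^z| = e^{Re z}; in particular |e^{iy}| = 1 for real y
print(abs(cmath.exp(z)), math.exp(z.real))   # equal
print(abs(cmath.exp(1j * 2.5)))              # 1.0
```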


The addition law for the exponential function yields the corresponding laws for sine and cosine,

sin(z + w) = sin z cos w + cos z sin w,
cos(z + w) = cos z cos w − sin z sin w.

In particular sin²z + cos²z = 1 — the special case w = −z of the second identity. One sees by computation that cos has a positive real zero; define π by letting π/2 be the smallest positive real zero of cos. We have cos(π/2) = 0 and sin(π/2) = 1. The identities now give

sin(z + π/2) = cos(z),    cos(z + π/2) = − sin(z).

Iterating these we find that cos and sin are 2π-periodic, so the exponential function is 2πi-periodic. In particular we get the famous formulae

e^{2πi} = 1,    e^{πi} = −1.

Remark 1.3.3. It’s often useful to represent a complex number z = x + iy in polar coordinate form as z = re^{iθ} = r cos θ + ir sin θ. Here r = |z|, and θ is a “residue class modulo 2π” (an equivalence class of real numbers, two numbers being considered as equivalent if they differ by a multiple of 2π). One calls r the modulus of z and θ the argument. In polar coordinates the law for multiplication of complex numbers takes the simple form

(r₁e^{iθ₁})(r₂e^{iθ₂}) = r₁r₂e^{i(θ₁+θ₂)};

you multiply the moduli and add the arguments.

The identity e^z · e^{−z} = 1 shows that the exponential function never takes the value 0. However, it can take any other value. Indeed if w = r(cos θ + i sin θ) is a nonzero complex number written in polar form, then z = log(r) + iθ has e^z = w. There are of course infinitely many such z, differing by integer multiples of 2πi. The really interesting story begins when we ask how these infinitely many possibilities for the preimage of w fit together as w varies continuously in C \ {0}. Let’s start to get a grip on this by considering only a limited range of values of w, those that lie on the unit circle (|w| = 1).

Figure 1.3. The exponential map illustrated — from the picture, you can guess that “winding around” will be involved somehow. This is the graph of x + iy = e^{it} (axes t, x, y).

Lemma 1.3.4. The complex number w = exp(z) lies on the unit circle if and only if z is purely imaginary, which is to say z = it for some t ∈ R.

Proof. This follows from the formula |exp(z)| = e^{Re z} which we observed before: |exp(z)| = 1 if and only if Re z = 0, which is to say that z is purely imaginary. □

For t ∈ R we have Euler’s formula exp(it) = cos t + i sin t. As t moves along the real axis, the point w = exp(it) rotates with unit speed around the circle. (If you like, think of t as time, measured in suitable units, and w as the position of the tip of the minute hand of a clock whose center is at the origin.¹) Many different t-values correspond to the same w, just as many different time-values all have the minute hand pointing to 6. We can think of the exponential function as “wrapping up” the imaginary axis into a spiral that projects to the unit circle, as shown in Figure 1.3.

¹Unfortunately for this illustration, mathematical convention is that the positive direction of rotation is counterclockwise, so you should think of the clock as running backwards.

It is fundamentally important that, although each point w on the unit circle corresponds to many different t-values, there is no way to choose those t-values for the whole unit circle in a “continuous” manner. More precisely,

Lemma 1.3.5. There is no continuous function θ : S¹ → R such that w = exp(iθ(w)) for all w ∈ S¹. In fact, there is no continuous function s : S¹ → S¹ such that s(w)² = w for all w ∈ S¹.

Proof. The second statement clearly implies the first, since if we could find a function θ having the required properties, we could then define s(w) = exp(iθ(w)/2) and we would have s(w)² = w. Suppose then that a continuous function s exists having s(w)² = w. Consider the function

u(t) = s(e^{it}) s(e^{−it}),    t ∈ R.

This is a continuous function on R. We have

u(t)² = s(e^{it})² s(e^{−it})² = e^{it} e^{−it} = 1.

Thus u(t) = ±1 for each t. A continuous integer-valued function on R is constant, so u is constant. But then

−1 = s(−1)² = u(π) = u(0) = s(1)² = 1,

which is an obvious contradiction.² □



²This argument is adapted from Beardon’s book [8].

Remark 1.3.6. It’s interesting to contrast the situation for square roots of complex numbers, revealed by Lemma 1.3.5, with the corresponding situation for real numbers. When we look in the real field, we find two problems: an existence problem (some numbers, the negative ones, don’t have any square roots) and a uniqueness problem (other numbers, the positive ones, have more than one, so the symbol √x can be ambiguous). In the real case it’s easy to resolve the uniqueness problem by executive order: we just decree, as is done
in high school algebra, that for x > 0, the symbol √x should stand for the positive square root of x. Lemma 1.3.5 tells us that the introduction of complex numbers, which fixes the existence problem (every complex number has a square root), at the same time makes it impossible to come up with any sort of “executive order” which resolves the uniqueness problem (the ambiguity of square roots) in a reasonable (read: continuous) way.

It follows from Lemma 1.3.5 that there is no “complex logarithm” function ℓ defined and continuous on all of C \ {0} and such that exp(ℓ(z)) = z. Functions with this property can, however, be found on some smaller domains. Here is an important example.

Lemma 1.3.7. Let S = C \ R− be the complex plane with the negative real axis removed (this is sometimes called a “slit plane”). There exists a continuous function ℓ : S → C such that exp(ℓ(z)) = z for all z ∈ S.

Proof. Each z ∈ S has a unique polar coordinate representation

z = re^{iθ} = r cos θ + ir sin θ,    −π < θ < π,

and the polar coordinates r, θ depend continuously on z ∈ S. Put ℓ(z) = log(r) + iθ, where log is the usual natural logarithm for positive real numbers. □



Remark 1.3.8. A function ℓ having the property asserted by the lemma is called a branch of the logarithm defined on the slit plane C \ R−. Notice that such branches are not unique: if z → ℓ(z) has the property of the lemma, then so does z → ℓ(z) + 2kπi for any integer k. Later, we will see that this integer ambiguity is related to the winding number in a simple way.
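
In computational terms, Python’s cmath.log, restricted to the slit plane S, is exactly such a branch ℓ (its argument θ is taken in (−π, π], with the cut along the negative real axis), and the integer ambiguity of Remark 1.3.8 is visible directly. A sketch (the sample point z and integer k are arbitrary choices):

```python
import cmath, math

z = -1 + 1j                     # a point off the negative real axis
ell = cmath.log(z)              # log|z| + i·arg(z), a branch ℓ on the slit plane
print(ell)                      # (0.3465735902799727+2.356194490192345j)
print(cmath.exp(ell))           # recovers z, up to rounding

k = 3
other = ell + 2j * math.pi * k  # another branch value ℓ(z) + 2kπi
print(cmath.exp(other))         # also recovers z
```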

1.4. Exercises

Exercise 1.4.1. Calculate the quotient (3 + 2i)/(1 − 2i). Find two complex roots of the quadratic equation 2z² − 3z − 5i = 0.


Exercise 1.4.2. Show that the modulus obeys the triangle inequality |z ± w| ≤ |z| + |w|. This allows us to make the complex plane into a metric space (see later, Definition B.1.1) and thus to introduce topological notions such as open and closed sets, continuity, etc.

Exercise 1.4.3. Let a = 1 + i and b = √3 − i. Express each of the complex numbers

a + b,    a − b,    ab,    a/b

in the form x + yi and in the form re^{iθ}, simplifying your answers as much as possible.

Exercise 1.4.4. Let z = e^{iθ} where θ = 2π/5. Prove that 1 + z + z² + z³ + z⁴ = 0. By considering the real part of this expression prove that

cos θ = (−1 + √5)/4.

Exercise 1.4.5. (a) Show that the mapping z → 1/z sends the circle |z − 1| = 1 (in the complex plane) into a straight line.

(b) Let A, B, C, and D be four points on a circle in the (Euclidean) plane, and let the symbol d(X, Y) denote the Euclidean distance between two points X and Y. Let p = d(A, B)d(C, D), let q = d(A, C)d(B, D), and let r = d(A, D)d(B, C). Show that one of p, q, r is equal to the sum of the other two. (This result is due to Ptolemy of Alexandria, nearly 2,000 years ago. To prove it using complex numbers, take the circle to be the one in the first part of the question, and take A to be the origin. Use the transformation z → 1/z to relate the theorem to the distances between points on a straight line.)

Exercise 1.4.6. In the 1840s, William Rowan Hamilton spent much effort trying to find a 3-dimensional field of “hypercomplex” numbers, i.e., of symbols of the form x + yi + zj, with x, y, z ∈ R, which can be added, subtracted, multiplied, and divided in the same way that complex numbers can. Show that his quest was hopeless: no matter how we define i² and j², we will not obtain a 3-dimensional system of the desired sort. (Hint: Use linear algebra. Let V denote the
proposed system. Fix a specific nonreal element α ∈ V and let m_α denote the operation of multiplication by α, which is an R-linear transformation from V to V. This transformation must have a real eigenvalue because V is odd-dimensional. From this, deduce that one can find two nonzero elements of V whose product is zero, which contradicts the desired existence of division in V.)

Chapter 2

Paths and Homotopies

2.1. Path connectedness

In this chapter we will explore the notions of “continuous movement” or “continuous deformation” which (as we saw in Chapter 1) are fundamental to understanding the winding number. We’ll represent these by paths in a metric space. Metric spaces (examples include the standard Euclidean spaces Rⁿ or subsets thereof) provide an abstract context in which continuity can be defined. For a review of metric space theory, see Appendix B.

Let X be a metric space. A path in X is a (continuous) map γ : [0, 1] → X. The points γ(0) and γ(1) of X are the initial and final points of the path. (This is Definition B.2.2 in Appendix B.) One should think of this as saying that a path is “the track of a continuously moving point” in X.

Definition 2.1.1. Two points p, q in a metric space X are connected by a path if there exists a path γ : [0, 1] → X with initial point γ(0) = p and final point γ(1) = q.

Proposition 2.1.2. The relation of “being connected by a path” (on points in a given metric space) is an equivalence relation.¹

¹See Section G.1.


Proof. We must check that the relation is reflexive, symmetric, and transitive.

It is reflexive: for any p ∈ X, the constant path at p (γ(t) = p for all t ∈ [0, 1]) shows that p is connected to itself.

It is symmetric: if γ is a path with initial point p and final point q, then the reverse path γ̄ defined by γ̄(t) = γ(1 − t) has initial point q and final point p.

It is transitive: let γ₁ be a path with initial point p and final point q and let γ₂ be a path with initial point q and final point r. Define a new path γ = γ₁ ∗ γ₂ (the concatenation of γ₁ and γ₂) by

γ(t) = γ₁(2t) for t ≤ 1/2,    γ(t) = γ₂(2t − 1) for t ≥ 1/2.

Then γ is continuous² and has initial point p and final point r. □

²This follows from the “gluing lemma”, Proposition B.4.2.

Definition 2.1.3. The equivalence classes for the above equivalence relation are called the path components of the space X. If X only has one path component, it is called path connected.

If a space is path connected, it is (in principle) straightforward to show that: just construct some paths. How do we show that a space is not path connected, though? Most proofs ultimately rely on the following fact.

Lemma 2.1.4. Any continuous path in a discrete space X (one in which every subset is open; see Example B.1.9) must be a constant path.

Proof. Let γ be a path in X with initial point γ(0) = p. Consider the function f : X → {0, 1} which sends p to 0 and all points of X \ {p} to 1. Since X is discrete, this function is continuous; so f ∘ γ is a continuous function from [0, 1] to {0, 1}. Such a function must be constant (by the intermediate value theorem; if it wasn’t constant it would take the value 1/2 somewhere, a contradiction), which shows that γ(t) = p for all t. □
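
The three constructions in the proof of Proposition 2.1.2 above (constant, reverse, and concatenated paths) translate directly into code. A small sketch (illustrative, not from the book), with points of the plane represented as complex numbers:

```python
def constant(p):
    """The constant path at p: witnesses reflexivity."""
    return lambda t: p

def reverse(gamma):
    """The reverse path: γ̄(t) = γ(1 − t), witnesses symmetry."""
    return lambda t: gamma(1.0 - t)

def concat(gamma1, gamma2):
    """The concatenation γ₁ ∗ γ₂, glued at t = 1/2: witnesses transitivity."""
    def gamma(t):
        return gamma1(2 * t) if t <= 0.5 else gamma2(2 * t - 1)
    return gamma

# Example: straight-line paths p = 0 to q = 1 + i, then q to r = 2i.
g1 = lambda t: (1 - t) * 0 + t * (1 + 1j)
g2 = lambda t: (1 - t) * (1 + 1j) + t * 2j
g = concat(g1, g2)
print(g(0), g(0.5), g(1))   # initial point p, via q, final point r
```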


Remark 2.1.5. Traditionally, a space X is called connected if it cannot be written as the union of two disjoint nonempty open subsets. It is easy to see that if X is path connected, it is connected (use the argument of Lemma 2.1.4). The converse is false in general. You’ll find a discussion of this in any introductory text, but I’ve chosen to avoid it here by working with path connectedness exclusively.

Example 2.1.6. We will spend a lot of time working with paths. It’s important to understand, therefore, that the behavior of paths can be very far from the “smoothness” that one becomes accustomed to by drawing pictures in the plane. A famous example of Peano [30] gives a path in the plane whose image is the whole unit square. In other words, continuous maps can raise dimension! Peano’s construction is quite explicit and we’ll review it below.

We consider real numbers between zero and one as represented by ternary (base-3) expansions; that is, the digits allowed are 0, 1, and 2, and a sequence 0·a₁a₂a₃… of digits represents the real number ∑_{j=1}^∞ aⱼ 3^{−j}. Most numbers between 0 and 1 have a unique expansion of this form, the only exceptions being triadic rational numbers (those whose denominator is a power of 3), which have two expansions, one ending in all 0s and one ending in all 2s.

For a digit a let κ(a) denote the complement of a; that is,

κ(0) = 2,    κ(1) = 1,    κ(2) = 0.

We let κⁿ(a) denote the result of applying κ n times to a: this is κ(a) if n is odd and a if n is even. Given a number x = 0·a₁a₂a₃…, we define two new numbers y = 0·b₁b₂b₃… and z = 0·c₁c₂c₃… by the relations

b_n = κ^{a₂+a₄+⋯+a_{2n−2}}(a_{2n−1}),    c_n = κ^{a₁+a₃+⋯+a_{2n−1}}(a_{2n}).

18

2. Paths and Homotopies

δ > 0 such that if |x − x| < δ, then the first n digits of (suitably chosen) expansions of x and x agree. Thus, given x and ε > 0, choose m such that 3−m < ε; then take n = 2m and δ chosen as above in terms of n. If |x − x | < δ and if y  , z  are defined in terms of x as above, then y agrees with y  up to m digits (and similarly for z and z  ), so |y − y  | < ε,

|z − z  | < ε.

Therefore x → (y, z) is a continuous map of the unit interval to the unit square. This map is surjective. For consider expansions of y, z as above; we can find an x that maps to them by writing a2n−1 = κc1 +c2 +···+cn−1 (bn ),

a2n = κb1 +b2 +···+bn (cn ).

This completes the construction of a path whose image is the unit square, usually called the Peano space-filling curve. Remark 2.1.7. The Peano map from the interval to the unit square is not injective (see Exercise 2.4.3) and therefore is not a homeomorphism. Nevertheless, the existence of such a strange example might lead one to worry about whether some more complicated construction might produce a homeomorphism from the interval to the unit square. If that were to happen, it would mean that the notion of “dimension” — in the intuitive sense of “how many parameters are needed” to describe something — would not belong to topology. In the early twentieth century, Brouwer and others showed that dimension is, in fact, a topological notion: Rn and Rm are not homeomorphic unless n = m. The general proof is quite delicate. We’ll use the winding number to address some cases of this question later in the book (see Exercise 4.5.9 for example).

2.2. Homotopy Let X and Y be metric spaces, with X compact. We recall (see Definition B.4.1 and the following discussion) that the collection of all continuous maps f : X → Y is itself a metric space (denoted Maps(X, Y )) under the uniform distance d(f0 , f1 ) = sup{d(f0 (x), f1 (x)) : x ∈ X}.

2.2. Homotopy

19

The next definition is a key one in topology. Definition 2.2.1. Let X, Y be metric spaces, with X compact, and let f0 , f1 be maps from X to Y . A homotopy from f0 to f1 is a path joining them in Maps(X, Y ). In other words, it is a continuous “one-parameter family of maps” {(fs ) : s ∈ [0, 1]} with initial point f0 and final point f1 . The path components of Maps(X, Y ) are called the homotopy classes of maps from X to Y . Two maps that are connected by a homotopy are called homotopic. Remark 2.2.2. The relation “being homotopic” is thus a special case of the relation “being connected by a path”. Since the latter is an equivalence relation (Proposition 2.1.2), so is the former. By definition, a homotopy is a one-parameter family, say fs , of maps from X to Y . But we can also consider the same data as defining a single map3 F : [0, 1] × X → Y by the formula F (s, x) = fs (x). We will use either definition as it is convenient. The fact that the two definitions are equivalent is a special case of the exponential law for function spaces, which is proved in Appendix B (Proposition B.4.5). Example 2.2.3. Let γ : [0, 1] → X be a path in a metric space X. Intuitively, a path represents the “trajectory” of a moving particle which takes position γ(t) at time t ∈ [0, 1]. One can envisage the particle tracing out the same “trajectory” but at a different speed: this corresponds to a path γ ◦ ϕ, where ϕ : [0, 1] → [0, 1] is a continuous map having ϕ(0) = 0 and ϕ(1) = 1. Such a path is called a reparameterization of γ. Now we have Proposition 2.2.4. A reparameterization of a path is homotopic to the original path. Moreover, the homotopy can be taken to fix the endpoints of γ (the meaning of this will be explained in the proof ). Proof. The required homotopy is given by   h(s, t) = γ st + (1 − s)ϕ(t) . 3 In fact, if you consult standard textbooks, you will find that this is the more common definition of “homotopy”. It has the advantage of working well even when X is not compact.

20

2. Paths and Homotopies

Note that h(s, 0) = γ(0) and h(s, 1) = γ(1) for all s. That is what we mean by “fixing the endpoints”: all of the curves h(s, ·) making up the homotopy have the same starting and ending points.  Remark 2.2.5. As is suggested by the above discussion of homotopies of paths “with endpoints fixed”, we are often interested only in those maps from a space X to Y which have some special behavior on a subspace of X (in the example, the subset {0, 1} consisting of the endpoints of the unit interval [0, 1]). For example, if A is a subset of X and B a subset of Y , the maps f : X → Y such that f (A) ⊆ B are called maps of pairs (X, A) → (Y, B). We’ll denote the space of such maps by Maps((X, A), (Y, B)). A particularly important example occurs when each of A and B consists of a single point which we may call the “basepoint” of X or Y , respectively. In that case Maps((X, A), (Y, B)) is the space of basepoint-preserving maps from X to Y , which we may also denote Maps• (X, Y ) if the choice of basepoint is clear from the context. Example 2.2.6. Let Y be a metric space, and let y0 ∈ Y . The path space of Y based at y0 is the space Py0 (Y ) = Maps(([0, 1], {0}), (Y, {y0 })); in other words, it is the space of paths in Y with initial point {y0 }. Example 2.2.7. Let Y be a metric space, and let y0 ∈ Y . A loop in Y based at y0 is a path whose initial and final points are y0 . The space of such paths Ωy0 (Y ) = Maps(([0, 1], {0, 1}), (Y, {y0 })) is called the based loop space of Y with basepoint x0 . Example 2.2.8. The free loop space Ω(Y ) is the space of all maps f : [0, 1] → Y such that f (0) = f (1). Each of these mapping spaces comes equipped with its own notion of homotopy, which is a path joining two maps in the relevant mapping space. For example, if f0 , f1 belong to Maps((X, A), (Y, B)), we can consider a path joining them in the space Maps((X, A), (Y, B)); we

2.2. Homotopy

21

could call this a homotopy of maps of pairs. Special cases are a homotopy of paths (a path in Py0 (Y ), Example 2.2.6), a homotopy of based loops (a path in Ωy0 (Y ), Example 2.2.7), and a homotopy of free loops (a path in Ω(Y ), Example 2.2.8). Note that increasing the size of the mapping space under consideration can make homotopy “easier”. For instance, it is perfectly possible for two loops to be nonhomotopic in Ωy0 (Y ) but homotopic in Py0 (Y ); geometrically, it is easier to deform a path if you don’t have to keep its final point fixed at the initial point. Remark 2.2.9. The fundamental example of a loop is given by our friend the exponential map η : t → e2πit , which maps [0, 1] continuously onto the unit circle S 1 in the complex plane, with η(0) = η(1) = 1. In fact, this is the universal example of a loop in the following precise sense. Proposition 2.2.10. Let f : [0, 1] → Y be a loop in a metric space Y . Then there is a unique map g : S 1 → Y such that f = g ◦ η; in other words, the diagram [0, 1] CC CCf CC η CC  ! g 1 /Y S is commutative. Proof. The function g is defined as follows: to find g(u), for u ∈ S 1 , choose a t such that η(t) = u, and then define g(u) to equal f (t). There is no ambiguity in this process unless u = 1, in which case t could be 0 or 1; but since f (0) = f (1) because f is a loop, it does not matter which t we pick in this case. It remains to show that g is continuous, and for this we use the characterization of continuous functions as those which pull back closed sets to closed sets (Remark B.2.4). Let C be any closed subset of Y . Then f −1 (C) is a closed subset of [0, 1] because f is continuous, hence compact because [0, 1] is compact. Therefore g −1 (C) = η(f −1 (C)) is compact (Proposition B.3.16), and therefore closed (Proposition B.3.6). Thus g is continuous. 

22

2. Paths and Homotopies

It follows that the free loop space Ω(Y ) can be identified with the space Maps(S 1 , Y ) of maps from S 1 to Y . Similarly, the based loop space Ωy0 (Y ) can be identified with Maps• (S 1 , Y ).

2.3. Homotopies and simple-connectivity We can now formulate the question that is central to this course. Key Question Let X be a path-connected space, p a point of X. Is the space of loops, Ωp (X), path connected? If so, how do we prove this? If not, how do we describe the path components of Ωp (X)? Example 2.3.1. Let X = C, the complex plane, and choose the “base point” p to be the number 1. Let γ be any loop based at p; i.e., γ : [0, 1] → X with γ(0) = γ(1) = p. Then the formula γs (t) = (1 − s)γ(t) + sp defines a homotopy from γ = γ0 to the constant loop γ1 . Thus every loop in Ωp X is homotopic to the constant loop, so Ωp X is path connected. Definition 2.3.2. A space X is called simply connected if both X and Ωp X are path connected. In other words, X is path connected and every loop in X, based at p, is homotopic to a constant loop. Thus the example above shows that the complex plane C is simply connected. (It can be shown that this definition does not depend on the choice of base point p: see Exercise 2.4.6.) Example 2.3.3. Let X = C \ R− , the complex plane slit4 along the negative real axis. The exact same homotopy as in Example 2.3.1 shows that this X is also simply connected. Remark 2.3.4. One can abstract the key property that is involved in the preceding argument. Suppose that X is a subset of C (or more generally of any Euclidean space). If there is a point p ∈ X which has the property that, for every x ∈ X, all the points on the line segment from x to p also belong to X, then X is called star-shaped about p. The argument of Example 2.3.1 works exactly for star-shaped sets. 4

We introduced this terminology in Lemma 1.3.7.

2.3. Homotopies and simple-connectivity

23

Example 2.3.5. Let X = C \ {0}, the “punctured plane”. This subset is not star-shaped and the homotopy that we used in the previous two examples will not work. Consider for instance the loop γ(t) = e2πit = cos(2πt) + i sin(2πt). This is a loop in X (it never goes through the origin). But the linear homotopy that we used before, γs (t) = (1−s)γ(t)+s, is now invalid: it would pass through the origin at s = t = 12 , and the origin is not in X. In the next chapter we will see that this is not an accident: this X is not simply connected, and this γ is not homotopic to a constant loop. Example 2.3.6. Consider the 2-sphere, which is the subset S 2 ⊆ R3 made up of points at unit distance from the origin, or more generally the (n − 1)-sphere S n−1 := {(x1 , . . . , xn ) : x21 + · · · + x2n = 1} ⊆ Rn . We would like to prove that this is simply connected when n  3. However, this will take a bit more work than anything we have done so far. Let N ∈ S n−1 be the point (0, 0, 0, . . . , 1) (the “north pole”). The stereographic projection ϕ : S n−1 \{N } → Rn−1 is the map that sends P = (x1 , . . . , xn ) ∈ S n−1 \ {N } to the point where the line through N and P intersects the plane spanned by the first (n − 1) coordinate axes (see Figure 2.1). Elementary geometry shows that ϕ is given by the formula   x1 xn−1 ,..., . ϕ(x1 , . . . , xn ) = 1 − xn 1 − xn Lemma 2.3.7. The stereographic projection ϕ : S n−1 \ {N } → Rn−1 is a homeomorphism. Proof. Calculation shows that the inverse of ϕ is given by the map   2x1 2xn−1 x 2 − 1 ,..., , x = (x1 , . . . , xn−1 ) → ,

x 2 + 1

x 2 + 1 x 2 + 1 where x 2 = x21 + · · · + x2n−1 . From these explicit formulas we see  that both ϕ and ϕ−1 are continuous.

24

Figure 2.1. Stereographic projection (the figure shows the case n = 2, with N = (0, 1), P = (x, y), and ϕ(P) = y/(1 − x)).
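
The two formulas in Lemma 2.3.7 can be checked mechanically. Here is a sketch using NumPy (an illustration, not from the book; the random test points are arbitrary):

```python
import numpy as np

def phi(p):
    """Stereographic projection from N = (0, ..., 0, 1); p on S^{n-1}, p != N."""
    return p[:-1] / (1.0 - p[-1])

def phi_inv(x):
    """Inverse map R^{n-1} -> S^{n-1} \\ {N} from the proof of Lemma 2.3.7."""
    s = np.dot(x, x)                              # ‖x‖²
    return np.append(2.0 * x, s - 1.0) / (s + 1.0)

rng = np.random.default_rng(0)
p = rng.normal(size=4)
p /= np.linalg.norm(p)                            # a random point of S³ ⊂ R⁴
print(np.allclose(phi_inv(phi(p)), p))            # True: round trip recovers p
print(np.linalg.norm(phi_inv(rng.normal(size=3))))  # 1.0: image lies on the sphere
```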

In the same manner we can define the stereographic projection whose pole is any point P ∈ S^{n−1}: it is a homeomorphism from S^{n−1} \ {P} to the hyperplane P^⊥ ⊆ Rⁿ made up of points with position vector perpendicular to P.

It follows that every loop in S^{n−1} whose image is not the whole of S^{n−1} is homotopic to a constant loop. For if there is a point P not in the image of our loop, then under stereographic projection the loop becomes a map S¹ → S^{n−1} \ {P} ≅ R^{n−1}. It is therefore homotopic to a constant loop because R^{n−1} is star-shaped (Remark 2.3.4).

Unfortunately, the example of the Peano space-filling curve (Example 2.1.6) shows that there might be loops that map S¹ onto S^{n−1} and don’t “omit” any point N that we can use in the above argument. So what we are going to do is to first make a preliminary homotopy to put our loop into “general position”; it will then omit many points, and we can use the argument above.

Definition 2.3.8. Let a, b ∈ S^{n−1}. We say that they are antipodal if a = −b as vectors in Rⁿ; it is equivalent to say that d(a, b) = 2. In this case we write b = A(a) (the antipode of a). If a and b are not antipodal, there is a unique shortest path in S^{n−1} from a to b (a great circle arc); we’ll call this the straight path from a to b. The reason for this terminology is that if ϕ denotes stereographic projection with pole A(a), then ϕ carries the straight path from a to b to the straight line in R^{n−1} from ϕ(a) to ϕ(b).


Completion of Example 2.3.6. We are going to prove that S^{n−1} is simply connected for n ≥ 3. Let us say that a path γ in S^{n−1} is piecewise straight if there are finitely many parameter values t₀ = 0, t₁, . . . , tₙ = 1 such that γ(tₖ) and γ(t_{k+1}) are not antipodal and the restriction of γ to the interval [tₖ, t_{k+1}] is the straight path from γ(tₖ) to γ(t_{k+1}).

I claim that any path in S^{n−1} is homotopic to a piecewise straight path with the same starting and ending points. Indeed, let γ be a path in S^{n−1}. Since γ is continuous, it is uniformly continuous (Proposition B.3.19), so there is some δ > 0 such that if |s − t| < δ, then d(γ(s), γ(t)) < 2. Choose t₀ = 0, t₁, . . . , tₙ = 1 such that |tₖ − t_{k+1}| < δ for all k; this ensures that for each k the range γ([tₖ, t_{k+1}]) does not contain any pair of antipodal points. Then “straighten out” the path between tₖ and t_{k+1} by first projecting stereographically with pole A(γ(tₖ)), then carrying out an endpoints-fixed homotopy to a straight path in R^{n−1}, and then projecting back again to S^{n−1}. An explicit formula for such a homotopy can be given as follows: let ϕₖ be the stereographic projection with pole A(γ(tₖ)). For t ∈ [tₖ, t_{k+1}], put

γ_s(t) = ϕₖ⁻¹( (1 − s)ϕₖ(γ(t)) + s[ ((t_{k+1} − t)/(t_{k+1} − tₖ)) ϕₖ(γ(tₖ)) + ((t − tₖ)/(t_{k+1} − tₖ)) ϕₖ(γ(t_{k+1})) ] ).

Then γ₀ = γ, and γ₁ is the piecewise straight path with vertices γ(t₀), . . . , γ(tₙ).

Now the points of a piecewise straight path are contained in the union of finitely many 2-dimensional planes in Rⁿ. Thus, if n ≥ 3, a piecewise straight path cannot fill up all of S^{n−1}. By stereographic projection, as noted above, it follows that a piecewise straight loop in S^{n−1} is homotopic to a constant loop. Since every loop is homotopic to a piecewise straight loop, the proof is completed. □

2.4. Exercises

Exercise 2.4.1. Find the path components of the rational numbers Q with their usual metric.


Exercise 2.4.2. Check the details in the construction of the Peano space-filling curve (Example 2.1.6). Also, check that the Peano map is not injective (think about the ambiguities in ternary expansions for triadic rationals).

Exercise 2.4.3. Let X be a path-connected metric space. Call a point p ∈ X a cut point if X \ {p} is not path connected. Show that the unit interval [0, 1] has some cut points but that the unit square [0, 1] × [0, 1] does not have any cut points. Explain why this shows that these two spaces cannot be homeomorphic to one another.

Exercise 2.4.4. Let X be the punctured complex plane C \ {0}. A loop in X, starting and ending at 1, is defined as follows:

γ(t) = cos(4πt) + i sin(4πt) for 0 ≤ t ≤ 1/2,
γ(t) = (32(t − 3/4)² − 1) − (t − 1/2)(t − 3/4)(t − 1)i for 1/2 ≤ t ≤ 1.

Show that the loop γ is homotopic to a constant loop.

Exercise 2.4.5. Let S be the spiral in the complex plane defined by S = {te^{it} : 0 ≤ t < ∞}. Prove that C \ S is simply connected.

Exercise 2.4.6. Show that if X is path connected and p, q ∈ X, then Ω_p(X) is path connected if and only if Ω_q(X) is path connected. (Suggestion: Consider the map Ω_p(X) → Ω_q(X) defined by sending a loop γ based at p to the concatenation θ ∗ γ ∗ θ̄, where θ is a path from q to p.) It follows that our definition of “simply connected” does not depend on the choice of base point.

Exercise 2.4.7. Let X be a nonempty compact space with a base point p. Consider the space Maps•(X, X) of maps from X to itself that send p to p. We say X is contractible if Maps•(X, X) is path connected. Prove:

(i) Every contractible space is path connected.

(ii) Every contractible space is simply connected.

(iii) X is contractible if and only if the identity map X → X is homotopic in Maps•(X, X) to a constant map.

Chapter 3

The Winding Number

3.1. Maps to the punctured plane

Let X be a compact metric space. We are going to analyze maps¹ from X to the punctured plane C \ {0}. A key player in the discussion will be the exponential map exp : C → C \ {0} which we discussed in Section 1.3.

¹By our convention, the words “map” and “mapping” refer to continuous functions; see Definition B.2.1.

Definition 3.1.1. Let X be some compact metric space. A (continuous) map f : X → C \ {0} is an exponential if there is a map g : X → C such that f = exp ∘ g; in other words, the triangle

X —g→ C —exp→ C \ {0},    exp ∘ g = f,

commutes. (Because of the way this picture looks, one also expresses this by saying that f lifts through the exponential map.)

Lemma 3.1.2. If f : X → C \ {0} is an exponential, then it is homotopic to a constant map. More generally, if f₁/f₀ is an exponential, then f₀ is homotopic to f₁.


Proof. Suppose that f1(t)/f0(t) = exp(g(t)). Then a homotopy from f0 to f1, in C \ {0}, is given by fs(t) = exp(sg(t))f0(t). □

The main result of this section is a converse to Lemma 3.1.2: a map X → C \ {0} is homotopic to a constant map only if it is an exponential. First we need

Lemma 3.1.3. Suppose that f : X → C \ {0} never takes negative real values. Then f is an exponential.

Proof. By assumption, f maps into the slit plane S = C \ R⁻. Then f(x) = exp(ℓ(f(x))), where ℓ is the branch of the logarithm defined in Lemma 1.3.7. □

Proposition 3.1.4 (Rouché’s theorem). Suppose that f0 and f1 are maps from X to C \ {0} such that |f0(x) − f1(x)| < |f0(x)| + |f1(x)|. Then f0/f1 and f1/f0 are exponentials.

Proof. The inequality shows that for each w = f0(x)/f1(x), |w − 1| < |w| + 1. It is not hard to see that w ∈ C satisfies this inequality if and only if w ∉ R⁻. Thus f0/f1 never takes negative real values and the result follows from the previous lemma. □

Proposition 3.1.5. Let X be a compact metric space and f0, f1 maps from X to C \ {0}. The maps f0 and f1 are homotopic if and only if f1/f0 is an exponential.

Proof. We already proved “if” (Lemma 3.1.2). For “only if”, consider a homotopy h(s, x) with h(0, x) = f0(x) and h(1, x) = f1(x). Because [0, 1] × X is a compact space and h never takes the value 0, there is ε > 0 such that |h(s, x)| > ε for all s ∈ [0, 1] and x ∈ X. By the uniform continuity (Proposition B.3.19) of h, there is δ > 0 such that |h(s, x) − h(s′, x)| < ε whenever s, s′ ∈ [0, 1] with |s − s′| < δ.


Thus when |s − s′| < δ the maps hs(x) = h(s, x) and hs′(x) = h(s′, x) satisfy

$$|h_s(x) - h_{s'}(x)| < \varepsilon < |h_s(x)| + |h_{s'}(x)|.$$

By Rouché’s theorem, hs′/hs is an exponential. Choose a finite sequence sj with s0 = 0, sm = 1, and |sj − sj+1| < δ. Then

$$f_1/f_0 = \prod_{j=0}^{m-1} h_{s_{j+1}}/h_{s_j}$$

is a product of exponentials, which is an exponential. □

Corollary 3.1.6. With the same notation, a map X → C \ {0} is itself an exponential if and only if it is homotopic to a constant map.

3.2. The winding number

In the previous section we discussed maps from any compact X to the punctured plane. We now specialize our attention to the cases X = [0, 1] (paths) and X = S¹ (loops). Recall that a path in a space Y is just a (continuous) map γ : [0, 1] → Y; a loop in Y is a path with the additional property that its initial point γ(0) is equal to its final point γ(1). We noted in Proposition 2.2.10 that a loop in Y can also be represented as a map from the unit circle S¹ to Y.

Proposition 3.2.1. Every path γ : [0, 1] → C \ {0} lifts through the exponential map to a path g : [0, 1] → C. Moreover, if g0 and g1 are two lifts of γ, then there is an integer n such that g0(t) − g1(t) = 2πin for all t.

Proof. The homotopy h(s, t) = γ((1 − s)t) shows that γ is homotopic to a constant path. By Corollary 3.1.6, then, γ lifts through the exponential map. Suppose now that g0, g1 are two lifts of γ. Then exp(g0(t)) = γ(t) = exp(g1(t)), and thus exp(g0(t) − g1(t)) = 1, which implies that (g0(t) − g1(t))/2πi is an integer. A continuous integer-valued function on [0, 1] is constant (Lemma 2.1.4), so g0(t) − g1(t) = 2πin for some integer n. □


Figure 3.1. A path g in C starting at 0 and ending at 2πi and the loop γ = exp ∘ g in C \ {0} obtained by exponentiating it. The winding number of γ is (2πi)^{−1}(g(1) − g(0)), which equals 1 in this case.

Now suppose that γ is a loop in C \ {0}, that is, a path that has the same initial and final points. By Proposition 3.2.1, γ is the exponential of a path, that is to say, a map g : [0, 1] → C such that

(3.2.2)  γ(t) = exp(g(t)).

But g need not be a loop! All we know about its initial and final points is that exp(g(1)) = exp(g(0)), and this tells us that g(1) − g(0) = 2πim for some integer m. Moreover, the second part of Proposition 3.2.1 tells us that m does not depend on the choice of lifting g for the loop γ. It is therefore an invariant of the loop γ.

Definition 3.2.3. If γ is a loop in C \ {0} as above, the integer m (equal to (2πi)^{−1}(g(1) − g(0)) for any g such that exp(g(t)) = γ(t)) is called the winding number of γ about the origin and is denoted wn(γ, 0) (or sometimes just wn(γ)). See Figure 3.1 for an illustration of this.

Definition 3.2.4. If γ is a loop in C \ {p}, the winding number of γ about p is defined to be the winding number of the loop t → γ(t) − p (in C \ {0}) about the origin. We denote it wn(γ, p).
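Definition 3.2.3 can be turned directly into a computation: sample the loop, accumulate the principal-value increments of the argument, and the sum telescopes to (2πi)^{−1}(g(1) − g(0)). A minimal Python sketch follows (the function name and the sampling size are ad hoc choices of mine, and it is assumed — not checked — that consecutive samples are close enough for each increment to be the true one):

    import numpy as np

    def winding_number(gamma, p=0j, n=4096):
        # Approximate wn(gamma, p) for a loop gamma : [0,1] -> C \ {p}.
        # np.angle returns the principal argument in (-pi, pi]; summing the
        # increments arg(z_{k+1}/z_k) builds a discrete lift through exp.
        t = np.linspace(0.0, 1.0, n + 1)
        z = np.array([gamma(s) for s in t]) - p
        increments = np.angle(z[1:] / z[:-1])
        return int(round(increments.sum() / (2 * np.pi)))

    # The loop t -> exp(4*pi*i*t) winds twice around the origin.
    print(winding_number(lambda t: np.exp(4j * np.pi * t)))    # 2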


Lemma 3.2.5. If a loop γ : [0, 1] → C \ {p} is homotopic through loops to a constant loop, then wn(γ, p) = 0.

It is important that the homotopy in this lemma take place through loops — in other words, if we identify the given loop with a map S¹ → C \ {p} using Proposition 2.2.10, then that map is homotopic to a constant in the free loop space Maps(S¹, C \ {p}).

Proof. Without loss of generality take p = 0. Consider the diagram

              ψ
        S¹ ————→ C
        ↑    ↘        |
        η      ϕ      | exp
        |        ↘    ↓
      [0, 1] ————→ C \ {0}.
              γ

Here η(t) = e^{2πit}. By Proposition 2.2.10, a map ϕ (the diagonal arrow) exists that makes the bottom triangle commute. By hypothesis, ϕ is homotopic to a constant in the free loop space. Hence by Proposition 3.1.5 (with X = S¹), there exists a map ψ (the top arrow) making the top triangle commute. Now g = ψ ∘ η is a lifting of γ through the exponential map, and g(0) = g(1). Therefore the winding number of γ is zero. □

Lemma 3.2.6. Let γ1 and γ2 be loops in C \ {0} and let γ(t) = γ1(t)γ2(t) be their pointwise product. Then wn(γ) = wn(γ1) + wn(γ2).

Proof. If g1 and g2 are lifts of γ1 and γ2, respectively, then g1 + g2 is a lift of γ1γ2. □

Theorem 3.2.7. A loop in C \ {0} has winding number n if and only if it is homotopic to the loop

$$e_n(t) = \exp(2\pi i n t), \qquad t \in [0, 1].$$

In particular, two loops are homotopic if and only if they have the same winding number, and a loop is homotopic to a constant if and only if it has winding number 0.


Proof. It is clear from the definition that en has winding number n since we can take a lift g for en to be g(t) = 2πint. Suppose that γ has winding number 0. Then we can write γ(t) = exp(2πig(t)) with g(0) = g(1). The homotopy h(s, t) = exp(2πi(1 − s)g(t)) is then a homotopy of loops (not just of paths) and contracts γ to a constant loop. Conversely, if γ is homotopic (through loops) to a constant loop, it has winding number zero by Lemma 3.2.5. This shows that a loop is homotopic to a constant loop if and only if it has winding number zero. Moreover, all constant loops are homotopic to one another (since C \ {0} is path connected). To prove the general case we make use of Lemma 3.2.6 (which shows that loops γ0 and γ1 have the same winding number if and only if γ0/γ1 has winding number zero), together with the observation that γ0 and γ1 are homotopic through loops if and only if γ0/γ1 is homotopic to a constant loop (if γs is a homotopy from γ0 to γ1, then γs/γ1 is a homotopy from γ0/γ1 to the constant loop 1). □

Remark 3.2.8. The loops en that appear in Theorem 3.2.7 have the property that |en(t)| = 1 for all t; in other words, they are maps whose range is contained in the unit circle S¹ ⊆ C. It follows therefore from the theorem that every loop in C \ {0} is homotopic to one whose range is contained in the unit circle. It is also easy to see this directly. In fact, if z ∈ C \ {0}, let υ(z) = z/|z| be the “unitization” of z; then for any loop γ in C \ {0} the “radial retraction” homotopy h(s, t) = (1 − s)γ(t) + sυ(γ(t)) shows that γ is homotopic to the loop υ ∘ γ, whose range lies in S¹.

Remark 3.2.9. The winding number has another additivity property besides that described in Lemma 3.2.6. Let γ1 and γ2 be loops in C \ {0}, based at the same point, and let γ = γ1 ∗ γ2 be their concatenation. That is,

$$\gamma(t) = \begin{cases} \gamma_1(2t) & (t \le \tfrac12), \\ \gamma_2(2t-1) & (t \ge \tfrac12) \end{cases}$$


(see the proof of Proposition 2.1.2). Then, once again, wn(γ) = wn(γ1) + wn(γ2). To see this, note that if g is a lift of γ through the exponential map, then g = g1 ∗ g2, where g1 and g2 are lifts of γ1 and γ2, respectively. Thus

$$\mathrm{wn}(\gamma) = (2\pi i)^{-1}(g(1)-g(0)) = (2\pi i)^{-1}\bigl(g(1)-g(\tfrac12)\bigr) + (2\pi i)^{-1}\bigl(g(\tfrac12)-g(0)\bigr) = \mathrm{wn}(\gamma_1) + \mathrm{wn}(\gamma_2)$$

as required.
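Both additivity properties are easy to spot-check numerically with the winding_number sketch above (my hypothetical helper, not part of the text); note that the two loops below are based at the same point 1, so that the concatenation is continuous:

    import numpy as np

    g1 = lambda t: np.exp(2j * np.pi * t)        # winding number  1, based at 1
    g2 = lambda t: np.exp(-4j * np.pi * t)       # winding number -2, based at 1

    pointwise = lambda t: g1(t) * g2(t)                          # Lemma 3.2.6
    concat = lambda t: g1(2 * t) if t <= 0.5 else g2(2 * t - 1)  # Remark 3.2.9

    for loop in (g1, g2, pointwise, concat):
        print(winding_number(loop))              # prints 1, -2, -1, -1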

3.3. Computing winding numbers

In this section we will learn how to compute winding numbers in practice. Let γ be a loop in C \ {0}. The image of γ (the set of points γ(t) for t ∈ [0, 1]) is a compact subset of C \ {0} and in this context is usually denoted γ*. Consider the complement of this image, C \ γ*: that is, the set of all those points p that the loop γ does not pass through. For each such p, the winding number wn(γ, p) of γ around p is well-defined.

Proposition 3.3.1. The function p → wn(γ, p) is constant on the path components of C \ γ*. It is equal to zero on the unbounded component² of this set.

Proof. Suppose p and q are in the same path component of C \ γ*. That means there is a path ψ : [0, 1] → C \ γ* that starts at p and ends at q. For each s ∈ [0, 1], the loop t → γ(t) − ψ(s) does not pass through 0. Thus the map h : (s, t) → γ(t) − ψ(s) is a homotopy of loops in C \ {0}, which must therefore preserve the winding number about 0. But wn(h(0, ·), 0) = wn(γ, p) and wn(h(1, ·), 0) = wn(γ, q).

The compact set γ* is bounded, so it is included in some ball B(0; R). All the points in the complement of this ball (i.e., those p with |p| > R) are path connected to each other outside the ball and are therefore in the same component of C \ γ*. This is called the unbounded component of C \ γ*. If |p| > R, then the loop t → γ(t) − p is contained in a half-plane. A half-plane is star-shaped (see Remark 2.3.4) so this loop is homotopic (by a linear homotopy) to a constant and has winding number zero. □

²I’ll explain what this is in the course of the proof.

The components of C \ γ* are sometimes called the cells of γ. The above proposition suggests the following approach to calculating winding numbers: to find wn(γ, p), choose a path ψ from p to the unbounded cell. We expect (with luck!) that this path should pass through only finitely many cells. The winding number wn(γ, ψ(s)) is constant on cells, and we know it is zero on the unbounded one, so “all” we have to do is to understand how the winding number changes as we pass from one cell to the next. Here is the simplest situation in which one can do that.

Lemma 3.3.2. Suppose that a loop γ has a short straight section. Suppose p0 is just to the right of the short straight section and p1 is just to the left of it. Then wn(γ, p1) − wn(γ, p0) = 1.

Of course there are lots of questions here: what is a short straight section? Which way is to the right? How far is “just”, etc.? We can make this precise as follows. We’ll suppose that there are a ball B = B(a; δ), a complex number b of absolute value δ, and a parameter interval (t0, t1) for γ, such that γ(t0) = a − b, γ(t1) = a + b, and as t runs from t0 to t1 the path γ follows the straight line (a diameter of the ball) from a − b to a + b. We assume further that γ(t) does not meet B for t ∉ (t0, t1). This is what we will mean by a “short straight section” of the path.

The complement, in B, of the diameter (a − b, a + b) has two path components. The component consisting of those z for which the R-basis (z − a, b) for C is right-handed (see Remark 1.3.2) will be called the “right-hand” component; the other will be called the “left-hand” component.


Figure 3.2. The bubble argument. (i): the original path γ including the short straight section from a − b to a + b. (ii): the additional “bubble” θ which winds around p0 but not p1. (iii): the modified path γ̃ that is homotopic to a concatenation of γ and θ.

Precisely, our claim is that if |p0 − a|, |p1 − a| < δ/2, p0 is in the right-hand component, and p1 is in the left-hand component, then wn(γ, p1) − wn(γ, p0) = 1.

Proof. Let θ be the D-shaped loop that starts at a − (3/4)b, proceeds around a semicircle of radius (3/4)δ to a + (3/4)b, and then returns via a straight line to a − (3/4)b again. Let γ̃ be the loop obtained by modifying γ by replacing the straight line segment from a − (3/4)b to a + (3/4)b with the semicircle described above. (See Figure 3.2 for illustrations of these three loops.) Observe the following.

(a) γ̃ is homotopic to the concatenation³ of θ and γ.

³Details: Parameterize both γ and θ with a − (3/4)b as the base point, so that γ(0) = γ(1) = θ(0) = θ(1) = a − (3/4)b. Then the concatenation θ ∗ γ travels around the semicircular arc of θ, then back down the straight line segment, out along the straight line segment again (exactly reversing the previous section), and then around the rest of γ. The two opposite copies of the straight line segment can be deformed by a linear homotopy to a constant path at a + (3/4)b. The result of this homotopy is a parameterization of γ̃.


(b) The winding number of θ about p0 is 1; about p1 it is 0.

(c) The points p0 and p1 belong to the same component of the complement of γ̃*.

From Remark 3.2.9 the winding number of a concatenation of paths is the sum of their winding numbers. Taken together, these facts give us the proof:

$$\mathrm{wn}(\gamma, p_1) = \mathrm{wn}(\tilde\gamma, p_1) - \mathrm{wn}(\theta, p_1) = \mathrm{wn}(\tilde\gamma, p_1) = \mathrm{wn}(\tilde\gamma, p_0) = \mathrm{wn}(\gamma, p_0) + \mathrm{wn}(\theta, p_0) = \mathrm{wn}(\gamma, p_0) + 1. \qquad \square$$

Remark 3.3.3. We refer to this style of proof as a “bubble argument” because of the way the point p1 is enclosed in a “bubble” (the loop θ) that grows out of the curve γ.

For an example where this allows a complete calculation of winding numbers, let us consider polygonal loops. Suppose that we give a list of points v0, v1, . . . , vn ∈ C. The polygonal path with these vertices is obtained by concatenating the straight line segments (“edges”) between them; i.e., it is the map

$$t \mapsto (nt - k)v_{k+1} + (k + 1 - nt)v_k \qquad \text{for } k/n \le t \le (k+1)/n.$$

(Here we have parameterized the path so that each edge takes equal “time”; but any other parameterization would yield a homotopic path.) It is a polygonal loop if v0 = vn. We’ll denote a polygonal path (or loop) by ⟨v0, . . . , vn⟩.

Suppose we have a polygonal loop ⟨v0, . . . , vn⟩ that does not pass through a point p. Choose a ray R from p to ∞ that is not parallel to any of the edges of the given loop and does not pass through any of its vertices (we’ll call this a transverse ray with respect to the loop). For each edge ek = ⟨vk, vk+1⟩ of the loop we can define an intersection number

$$(3.3.4)\qquad i(e_k, R) := \begin{cases} 0 & \text{if } e_k \text{ and } R \text{ do not meet,} \\ +1 & \text{if } R \text{ crosses } e_k \text{ from left to right,} \\ -1 & \text{if } R \text{ crosses } e_k \text{ from right to left.} \end{cases}$$

(Since ek and R are both straight, they can meet in at most one point. The notion of “right” and “left” is defined as in our discussion of Lemma 3.3.2.)


Figure 3.3. Computing the winding number of a polygonal loop.

Now we have

Proposition 3.3.5. The winding number of a polygonal loop around p is equal to the sum of its edge-intersection numbers with a transverse ray:

$$\mathrm{wn}(\langle v_0, \ldots, v_n\rangle, p) = \sum_k i(\langle v_k, v_{k+1}\rangle, R),$$

where R is a transverse ray from p to infinity. (See Figure 3.3.)

Proof. This follows immediately from Lemma 3.3.2: trace along the ray R, from the unbounded component inward to p, keeping track of the changes in the winding number. There is one small issue to deal with: our definition of a “polygonal loop” allows for different edges to overlap (for instance a polygonal loop with 6 edges that goes around the same triangle twice). Lemma 3.3.2 does not apply directly in such a case. However, it is easy to see that any polygonal loop is homotopic to a nearby one that does not have overlapping edges. So, if necessary, we may adjust the loop by a preliminary homotopy to ensure that its edges do not overlap, and then we may apply the preceding argument to the adjusted loop. □
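Proposition 3.3.5 translates into a few lines of code. The following sketch (mine, not the book’s) uses the horizontal ray from p in the +x direction; transversality of that particular ray — no vertex on it, no meeting edge parallel to it — is an assumption the caller must arrange:

    import numpy as np

    def polygon_winding(vertices, p):
        # Winding number of the polygonal loop <v0, ..., vn> (with v0 == vn)
        # about the complex point p, as the sum of the intersection numbers
        # i(e_k, R) of its edges with the horizontal ray from p.
        total = 0
        for a, b in zip(vertices[:-1], vertices[1:]):
            ya, yb = a.imag - p.imag, b.imag - p.imag
            if ya * yb < 0:                        # edge straddles the ray's line
                x = a.real + (b.real - a.real) * (-ya) / (yb - ya)
                if x > p.real:                     # ...and crosses the ray itself
                    total += 1 if yb > ya else -1  # upward = +1, downward = -1
        return total

    # A counterclockwise square winds once around its center:
    sq = [1+1j, -1+1j, -1-1j, 1-1j, 1+1j]
    print(polygon_winding(sq, 0j), polygon_winding(sq, 3+0j))   # 1 0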


values t and t having |t − t | < δ is homotopic, by a linear homotopy in C \ {0}, to the straight line path from γ(t) to γ(t ). Subdividing the whole parameter interval into subintervals of length < δ, we find that γ is homotopic to a polygonal path. We can then use the algorithm above to compute the winding number of this approximating polygonal path, which will be the same as the winding number of the original γ. This pattern — there is an algorithm to compute winding numbers, but only after approximating a general path by a “good” path — will recur several times with different meanings of “good” (polygonal, algebraic, differentiable, etc.). The next two sections give additional examples.

3.4. Smooth paths and loops

Let Ω be an open subset of C (or of any other finite-dimensional vector space over R) and let γ : [0, 1] → Ω be a path. We can differentiate it by the usual formula

$$\gamma'(t) = \lim_{h \to 0} \frac{\gamma(t+h) - \gamma(t)}{h} \in \mathbb{C}$$

provided that the limit exists. Then γ′ is also a function from [0, 1] → C, and we can ask whether it is differentiable, and so on. The path is said to be smooth if derivatives of all orders exist. If γ is a loop, we’ll also require smoothness at the basepoint, i.e., that γ^{(r)}(0) = γ^{(r)}(1) for r = 1, 2, . . .. With these conventions, the derivative of a smooth path is a smooth path, and the derivative of a smooth loop is a smooth loop. Finally, we’ll say that a smooth path (or loop) γ is regular if γ′(t) ≠ 0 for all t ∈ [0, 1].

In this section we will show how to compute the winding number for smooth loops, along the lines of the computation for polygonal loops in the previous section.

Remark 3.4.1. It follows immediately from the Stone-Weierstrass theorem (Theorem C.1.5) that the smooth loops in Ω are dense among all (continuous) loops; that is to say, for any continuous loop and any ε > 0 there is a smooth loop within ε of the given continuous one. Replicating the argument given in Remark 3.3.6 above, we can

Figure 3.4. Transverse (T) and nontransverse (N) intersections of a curve with a ray.

therefore see that every continuous loop is homotopic to a smooth one and therefore that the computation of the winding number for smooth loops gives us another “in principle” way to compute it for all loops. Using the Stone-Weierstrass theorem in a slightly more elaborate way, one can also see that if two smooth loops are continuously homotopic, then they are smoothly homotopic; that is, the homotopy h : [0, 1] × [0, 1] → Ω can be taken to be a smooth function of both variables. We will need this fact at one point later on.

Definition 3.4.2. Let γ be a smooth path in C and let R be a ray in the direction of a unit vector u ∈ C. Then R and γ are transverse if, at every point γ(t) where γ* and R intersect, the complex numbers γ′(t) and u are linearly independent over R (and thus form an R-basis of C; see Remark 1.3.2).

“Linearly independent over R” amounts to saying that γ′(t) is nonzero and not parallel to the unit vector u. Thus the ray R “cuts through” the curve at a point of transversality and does not “graze” it. See Figure 3.4.

Definition 3.4.3. Let P = γ(t) be a point where a smooth path γ meets a ray R (in direction u) transversely. The intersection number of γ and R at P, which we write i_t(γ, R), is defined to be +1 if u and γ′(t) form a right-handed basis for C (see Remark 1.3.2) and −1 if they form a left-handed one.


Notice that this definition is consistent with the one we gave for polygonal paths (equation (3.3.4)). We are going to prove a result expressing winding numbers in terms of smooth intersection numbers, which will be a counterpart of Proposition 3.3.5 in the polygonal case.

Proposition 3.4.4. Let γ be a smooth loop in C \ {p} and let R be a ray from p to ∞, transverse to γ. Then there are only finitely many parameter values t = t1, . . . , tk where γ(t) intersects R, and the winding number of γ around p is equal to the sum of its intersection numbers with R, that is,

$$\mathrm{wn}(\gamma, p) = \sum_{j=1}^{k} i_{t_j}(\gamma, R).$$

We will prove Proposition 3.4.4 by “straightening out” the smooth loop near each intersection point, without changing the winding number or any of the intersection numbers. Once this is done we can make use of Lemma 3.3.2 relating the winding number to intersection numbers for curves that have short straight sections. To be precise, we need to verify the following.

Claim 3.4.5. Let γ be a smooth loop in C \ {p} and let R be a ray from p to ∞, transverse to γ. Then R meets γ in only finitely many points. Moreover, γ is homotopic, in C \ {p}, to a loop which has the same intersection points with R and has short straight sections near each intersection point.

Proof. There is no loss of generality in assuming that p = 0 and that the ray R is the positive x-axis. If we write γ(t) = x(t) + iy(t), then intersection points t = τ have y(τ) = 0 and the transversality condition is y′(τ) ≠ 0. By the mean value theorem, there is ε > 0 such that if |t − τ| < ε, then

$$|y(t) - y(\tau) - y'(\tau)(t-\tau)| < \tfrac12 |y'(\tau)|\,|t-\tau|, \qquad |x(t)| > \tfrac12 |x(\tau)|.$$

In particular this tells us that there are no other points of intersection for t in this range; so the parameter values t for which intersections occur form a discrete subset of the compact set [0, 1], which is therefore finite. Now let

$$\lambda(t) = \bigl(x(\tau) + x'(\tau)(t-\tau)\bigr) + i\bigl(y(\tau) + y'(\tau)(t-\tau)\bigr),$$


that is, the straight line path tangent to γ at the intersection point τ. Let ϕ be a “bump function” for which ϕ(t) = 1 for |t − τ| < ε/3 and ϕ(t) = 0 for |t − τ| > 2ε/3. The homotopy

$$h(s, t) = (1-s)\gamma(t) + s\bigl((1-\varphi(t))\gamma(t) + \varphi(t)\lambda(t)\bigr)$$

now deforms γ, in C \ {0}, to a path which has a short straight section near t = τ and exactly the same intersection with R as γ has at τ. Apply this construction to each of the finitely many intersection points to complete the proof. □

Despite their formal similarity, there is an apparently significant difference between Propositions 3.3.5 and 3.4.4. In the polygonal case it is obvious that what we called transverse rays exist: there are only finitely many directions that such a ray has to avoid. In the smooth case, it is not at all obvious that there are any transverse rays in the sense of Definition 3.4.2. In fact, however, a ray “chosen at random” will almost surely be transverse. This is an (easy) consequence of a general result called Sard’s theorem, which is fundamental to the study of smooth maps in all dimensions. Sard’s theorem, which is discussed in Appendix D, says that for any smooth function f, the image under f of the set of points where f′ vanishes is “small” in the sense of measure theory. The consequence that we will need looks like this:

Proposition 3.4.6. Let γ and p be fixed as in Proposition 3.4.4. Then the nontransverse rays form a subset of measure zero⁴ in the set of all rays through p. In particular, transverse rays always exist.

⁴Notice that the rays through a fixed point are parameterized by a copy of the circle, S¹. The notion of a measure zero set of rays is defined according to Remark D.1.8.

Proof. Consider the loop γ as a path [0, 1] → C \ {p}. From (3.2.2), there is a map g : [0, 1] → C such that γ(t) − p = exp(g(t)), and the smoothness of γ implies that g is smooth too. Let us work out what the condition is for the ray through γ(t) to meet the path γ transversely there. It is Im((γ(t) − p) γ′(t)‾ ) ≠ 0; but if γ(t) − p = exp(g(t)), this expression is equal (by the chain rule) to

$$\operatorname{Im}\bigl((\gamma(t)-p)\,\overline{\gamma'(t)}\bigr) = \operatorname{Im}\bigl(\exp(g(t))\exp(\bar g(t))\,\bar g'(t)\bigr) = -\exp\bigl(2\operatorname{Re} g(t)\bigr)\operatorname{Im} g'(t).$$

So let f(t) = Im g(t), the imaginary part of g. Then nontransverse rays occur for parameter values t where f′(t) = 0, and moreover the angle for such a nontransverse ray is just exp(if(t)) ∈ S¹. Therefore, the set of angles for nontransverse rays is F = exp(if(E)), where E = {t : f′(t) = 0}. By Lemma D.1.7, f(E) is a measure zero subset of R, which implies that F is a measure zero subset of S¹. □
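Proposition 3.4.4 can likewise be checked numerically. In the sketch below (mine), transversality of the chosen ray is assumed, and crossings are detected from sign changes of the rotated imaginary part rather than by exact root-finding:

    import numpy as np

    def ray_intersection_sum(gamma, p=0j, angle=0.4, n=100000):
        # Sum of intersection numbers of a smooth loop with the ray from p
        # in direction exp(i*angle).  Rotate so the ray is the positive
        # x-axis; +1 for each upward crossing, -1 for each downward one.
        t = np.linspace(0.0, 1.0, n + 1)
        w = (np.array([gamma(s) for s in t]) - p) * np.exp(-1j * angle)
        total = 0
        for k in range(n):
            if w[k].real + w[k + 1].real > 0:      # near the ray, not behind p
                if w[k].imag < 0 <= w[k + 1].imag:
                    total += 1
                elif w[k].imag >= 0 > w[k + 1].imag:
                    total -= 1
        return total

    # wn = 2 about 0 by Rouché (the perturbing term has modulus 0.3 < 1):
    loop = lambda s: np.exp(4j * np.pi * s) + 0.3 * np.exp(10j * np.pi * s)
    print(ray_intersection_sum(loop))    # 2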

3.5. Counting roots via winding numbers

The fundamental theorem of algebra is the following statement: let p(z) = z^n + a_{n−1}z^{n−1} + · · · + a0 be a complex polynomial of degree n; then p has exactly n roots in the complex plane (counted according to multiplicity).

Remark 3.5.1. Complex numbers first entered mathematics as a corollary of Cardano’s formula (sixteenth century) for the solution of a cubic equation. Rafael Bombelli noticed that this formula, when applied to the equation x³ − 15x − 4 = 0, gave a solution involving the square root of −121; yet by inspection x = 4 is a root. Bombelli had discovered that he could compute with “imaginary” quantities like √−121 and obtain the correct “real” answer.

Leibniz (1702) claimed that the fundamental theorem of algebra is false and that x⁴ + 1 = 0 is a counterexample. He assumed that the square root of a complex imaginary quantity must be an imaginary of a still more complicated kind. In fact, as we know, the square roots of i are the complex numbers ±2^{−1/2}(1 + i).

It is usually said that the first “real proof” of the fundamental theorem was given by Gauss in his 1799 doctoral thesis. However, by modern standards that proof contains some significant gaps. It assumes various properties of real algebraic curves, which are actually


quite subtle (Gauss: “no-one has ever doubted it. . . but, if anyone desires, on another occasion I intend to give a demonstration which will leave no doubt.” Apparently he never did.) Gauss’s proof also assumes the topological lemma about paths in the unit square that already appeared in our discussion of the “lovers and haters” problem. Nowadays there are proofs of the fundamental theorem based on analysis, proofs based on algebra, and proofs based on topology. We will give a topological proof. Moreover, the proof will generalize to give another, quite different, “in principle” calculation of the winding number for any loop, this one based on approximating by rational functions and counting zeroes and singularities.

Now we will begin our proof of the fundamental theorem of algebra. Recall the definition of the multiplicity of a root of a polynomial. The division algorithm tells us that if p is a polynomial of degree n and a ∈ C, we can always write p(z) = (z − a)q(z) + p(a) for some polynomial q = q1 of degree n − 1. In particular if a is a root of p, that is p(a) = 0, we can write p(z) = (z − a)q1(z). It may or may not be the case that a is a root of q1; if it is, we can apply the division algorithm again to write p(z) = (z − a)²q2(z), and so on until we arrive at some qk with qk(a) ≠ 0.

Definition 3.5.2. The multiplicity of a root a of a complex polynomial p of degree n is the number k ∈ N such that p(z) = (z − a)^k qk(z), where qk is a polynomial of degree n − k and qk(a) ≠ 0. Note that the multiplicity is zero if p(a) ≠ 0.

Proposition 3.5.3. Let p be a complex polynomial and let a be a root of p. Let γ be a circular loop around a of radius r, small enough that B(a; r) contains no other roots of p, and traversed once in the positive direction. Consider the loop p ∘ γ : t → p(γ(t)) ∈ C \ {0}. The winding number of this loop (about 0) is equal to the multiplicity of the root of p at a.


Proof. Write p(z) = (z − a)^k qk(z) = uk(z)qk(z) as in the definition above. Then, by Lemma 3.2.6, the winding number of p ∘ γ is the sum of the winding numbers of qk ∘ γ and of uk ∘ γ. Write γ(t) = γρ(t) = a + ρe^{2πit}, with ρ = r. Since qk never vanishes in B(a; r), letting ρ vary from r to zero defines a homotopy of qk ∘ γ to a constant path. Thus wn(qk ∘ γ) = 0. We can calculate explicitly that uk ∘ γ(t) = r^k e^{2πikt}. This has winding number wn(uk ∘ γ) = k, by Theorem 3.2.7. Putting this together with the previous calculation completes the proof. □

Theorem 3.5.4. Let p be a complex polynomial. Let r > 0, and suppose that p has no roots on the circle |z| = r. Then the total number of roots inside that circle (counted with multiplicity) is the winding number of t → p(re^{2πit}) about the origin.

Proof. It is another “bubble argument” (compare Remark 3.3.3). We consider the winding number n(ρ) of t → p(ρe^{2πit}) about the origin, as ρ increases from 0 to r. When ρ is small and positive, n(ρ) is equal to the multiplicity of 0 as a zero of p. As ρ increases, the path t → p(ρe^{2πit}) varies by a homotopy in C \ {0} except for those ρ which are the absolute values of roots of p. Thus n(ρ) is piecewise constant with “jumps” at the absolute values of the roots of p.

Let a be a root of p, and suppose for a moment that p has no other roots of the same absolute value as a. Then by a bubble argument, the increase in n(ρ) when ρ passes through |a| is just the winding number of p ∘ γ about 0, where γ is a small circular path around a. By Proposition 3.5.3, this increment is the multiplicity of the zero at a. (If there are several zeroes with the same absolute value c, we consider “bubbles” for each of them separately, and we find that the increment in n(ρ) as ρ passes through c is the total multiplicity of all the roots having that absolute value.) Adding up all the increments in n(ρ), we find that n(r) is the total multiplicity of zeroes contained in the disc of radius r, as asserted. □

Corollary 3.5.5 (Fundamental theorem of algebra). A polynomial of degree n has exactly n complex roots, counted with multiplicity.


Proof. Let p(z) = z^n + a_{n−1}z^{n−1} + · · · + a0 be such a polynomial. (There is no loss of generality in rescaling to make the polynomial monic, i.e., to make the leading coefficient 1.) Write f(z) = z^n. We know that f(z) − p(z) is a polynomial of degree n − 1 at most, and it follows that there is R > 0 such that |f(z) − p(z)| < |z|^n = |f(z)| whenever |z| ≥ R. It follows in particular that p(z) has no zeroes for |z| ≥ R. Moreover, let γ denote the circular path with center 0 and radius R. By Rouché’s theorem (Proposition 3.1.4), the paths f ∘ γ and p ∘ γ have the same winding number about 0, which is to say that f and p have the same number of roots (counted with multiplicity) inside B(0; R). But f obviously has exactly n roots in B(0; R) (at the origin, with n-fold multiplicity), so p does also. □

We can generalize Theorem 3.5.4 above to rational functions, with basically the same proof. Remember that a rational function is just a quotient of two polynomials: f(z) = p(z)/q(z). As well as zeroes (where p has a root), a rational function has poles or singularities (where q has a root). If a is a pole, we have

$$f(z) = (z-a)^{-k} g(z), \qquad g(a) \ne 0, \infty,$$

for some rational function g which has neither a zero nor a pole at a. The number k is called the multiplicity of the pole at a. Comparison with Definition 3.5.2 makes it clear why we should envisage a pole as a “zero of negative multiplicity”. Theorem 3.5.4 generalizes as follows.

Theorem 3.5.6. Let f be a rational function. Let r > 0, and suppose that f has no zeroes or poles on the circle |z| = r. Then the total number of zeroes and poles inside that circle (counted with multiplicity, with zeroes counting positively and poles counting negatively) is the winding number of t → f(re^{2πit}) about the origin. □

Remark 3.5.7. In fact, every loop in C \ {0} can be approximated by (and, in particular, is homotopic to) one of the form t → f(e^{it}), where f is rational; this follows, once again, from the Stone-Weierstrass


theorem (Theorem C.1.5). Thus, Theorem 3.5.6 gives us a third “in principle” algorithm for calculating the winding number, this time in terms of algebraic invariants — roots of polynomials.

Let V be a vector space and T : V → V a linear transformation. An eigenvalue for T is a scalar λ for which the linear equation Tv = λv has a nonzero solution v ∈ V; the corresponding vectors v are called eigenvectors for the eigenvalue λ. A well-known corollary of the fundamental theorem of algebra is

Proposition 3.5.8. Let V be a finite-dimensional complex vector space. Then any linear transformation T : V → V has an eigenvalue.

Proof. Since V is finite-dimensional, so is the space End(V) of linear transformations V → V: if V has dimension n, then End(V) has dimension n². Consider the n² + 1 linear transformations

$$I, T, T^2, \ldots, T^{n^2}$$

in End(V). They cannot form a linearly independent set, so there is a linear relation between them. That is to say, there is a polynomial p (of degree m ≤ n²) such that p(T) = 0. By the fundamental theorem of algebra we may write p(λ) = c(λ − λ1) · · · (λ − λm) for some complex numbers λ1, . . . , λm and c ≠ 0. Then

$$0 = p(T) = c(T - \lambda_1 I) \cdots (T - \lambda_m I).$$

Since the composite of injective maps is injective, at least one of the linear transformations T − λj I must fail to be injective. A nonzero element of its kernel is then an eigenvector. □

Traditionally, this result is proved using determinants. But, as stressed by Axler [7], it is simpler to avoid this discussion, as we have done above.
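Theorem 3.5.4 makes root-counting a purely numerical matter. The sketch below (mine) evaluates p around the circle |z| = r and takes the winding number of the image loop; it assumes no roots lie on (or numerically too close to) the circle itself:

    import numpy as np

    def roots_inside(coeffs, r, n=100000):
        # Number of roots of the polynomial with the given coefficients
        # (highest degree first) inside |z| = r, counted with multiplicity,
        # as the winding number of t -> p(r e^{2 pi i t}) about 0.
        t = np.linspace(0.0, 1.0, n + 1)
        w = np.polyval(coeffs, r * np.exp(2j * np.pi * t))
        return int(round(np.angle(w[1:] / w[:-1]).sum() / (2 * np.pi)))

    # p has roots 1, -2, 3i; only the first lies inside |z| = 1.5.
    p = np.poly([1, -2, 3j])
    print(roots_inside(p, 1.5), roots_inside(p, 4.0))   # 1 3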

3.6. Exercises

Exercise 3.6.1. Let f(z) = z + 1/z. Determine the winding number of the loop f ∘ γk about 0 in the following cases (k = 1, 2, 3):


(i) γ1 is a circular path with center 0 and radius 1/2;

(ii) γ2 is a circular path with center 0 and radius 3;

(iii) γ3 is a circular path with center i and radius 3/2.

(A “circular path” is traversed once, in the positive direction.)

Exercise 3.6.2. Investigate whether it is possible to define a “winding number” for maps C \ {0} → C \ {0}. What properties does your definition have?

Exercise 3.6.3. Suppose that γ1 and γ2 are loops starting and ending at 1. Construct an explicit homotopy of loops from the concatenation γ1 ∗ γ2 to the pointwise product γ1γ2.

Exercise 3.6.4. Let γ : [0, 1] → C \ {0} be a loop.

(a) Let u : [0, 1] → [0, 1] be any map such that u(0) = 0 and u(1) = 1. Show that the winding number of γ ∘ u is the same as the winding number of γ. (One says that the winding number is independent of parameterization. See Example 2.2.3.)

(b) For c ∈ (0, 1) define γc by

$$\gamma_c(t) = \begin{cases} \gamma(t+c) & \text{if } t + c \le 1, \\ \gamma(t+c-1) & \text{if } t + c > 1. \end{cases}$$

Show that γc is a loop with the same winding number as γ. (One says that the winding number is independent of basepoint.)

Exercise 3.6.5. Let X denote the unit square {(x, y) : 0 ≤ x, y ≤ 1}. Let γ1 be a continuous path in X from (0, 0) to (1, 1), and let γ2 be a continuous path from (0, 1) to (1, 0). Prove that γ1 and γ2 must meet somewhere. (This is the lemma we needed for the “lovers and haters” problem. Assuming no crossing, close off one path into a loop in R², and obtain a contradiction by considering properties of the winding number.)

Exercise 3.6.6. A polygonal path traverses the vertices of a regular heptagon in the unit circle by joining every third vertex (that is, the vertices are traversed in the order e^{6kπi/7} for k = 0, 1, 2, . . . , 6). Find the winding number of the path around the origin. Generalize to paths traversing every mth vertex of a regular n-gon.


Exercise 3.6.7. Construct an example of a smooth path in C \ {0} for which there are infinitely many nontransverse rays through 0. How is this consistent with Sard’s theorem?

Exercise 3.6.8. Fill in the details of Bombelli’s calculation for the equation x³ − 15x − 4 = 0, as follows. Write x = u + v, with the auxiliary condition uv = 5. Show that the original equation now gives u³ + v³ = 4. Deduce that u³ and v³ are the two roots of the quadratic equation t² − 4t + 125 = 0 and therefore that u³, v³ are 2 ± 11i. Check that 2 + i is a complex cube root of 2 + 11i, and similarly for the minus sign. Thus the real root 4 = (2 + i) + (2 − i) can be obtained by the use of complex numbers.

Exercise 3.6.9. (a) Show (by consideration of winding numbers) that there is no sequence of complex polynomials pn(z) such that pn(z) → z̄ uniformly for |z| = 1.

(b) For those who feel more ambitious, investigate whether the same conclusion holds when we replace “uniformly” by “pointwise”. That is, is there a sequence of complex polynomials with pn(z) → z̄ for each individual z on the unit circle, but without the assumption of uniformity? You will probably need to use a result from complex analysis about polynomial approximation, such as Runge’s theorem — see Rudin [34].

Exercise 3.6.10. In the context of Lemma 3.3.2, show that the way in which the short straight section is parameterized does not matter. In other words, we get the same winding numbers if we replace γ(t), for t ∈ [t0, t1], by any expression a + bϕ(t), where ϕ : [t0, t1] → [−1, 1] is any (continuous) map sending t0 to −1 and t1 to 1.

Chapter 4

Topology of the Plane

4.1. Some classic theorems

Many natural questions about the topology of figures in the plane can be answered by using the winding number.

Definition 4.1.1. Let n be a natural number. The n-ball B^n is the set of points (x1, . . . , xn) ∈ R^n with x1² + · · · + xn² ≤ 1; the (n − 1)-sphere S^{n−1} is the boundary of the n-ball, that is, the set of points (x1, . . . , xn) ∈ R^n with x1² + · · · + xn² = 1.

A basic result of topology is the no-retraction theorem:

Theorem 4.1.2. There is no continuous map f : B^n → S^{n−1} such that f(x) = x for all x ∈ S^{n−1}.

When n = 1, the theorem follows from the fact that B¹ is path connected (Definition 2.1.1) while S⁰ is not. When n = 2, we will prove the theorem using the winding number. For n ≥ 3 one needs higher-dimensional topological methods (one possible approach will be indicated in the final chapter; see Example 9.1.7).

Proof (for n = 2). Let f : B² → S¹ be a map satisfying the hypothesis of the theorem. Define a family of maps ht : S¹ → S¹ by ht(e^{iθ}) = f(te^{iθ}).


Figure 4.1. Constructing a retraction to prove the Brouwer fixed-point theorem.

Plainly h is a homotopy, with h1 being the identity map (by assumption) and h0 being constant. But the identity map S¹ → S¹ has winding number 1, and a constant map has winding number 0, so no such homotopy can exist. □

The no-retraction theorem implies the Brouwer fixed-point theorem. This is another result which is valid in all dimensions, but techniques based on the winding number only give us a proof in dimension n = 2.

Corollary 4.1.3 (Brouwer fixed-point theorem). Let g : B^n → B^n be a continuous map. Then g has a fixed point; in other words, there exists x ∈ B^n such that g(x) = x.

Many existence questions in mathematics and its applications (does this differential equation have a solution? does this economic model have an equilibrium? does this function have a zero?) can be reformulated as fixed-point problems, and the Brouwer theorem is then often the key to a positive solution.

Proof. Suppose that g has no fixed point. Define a map f : B^n → S^{n−1} as follows. For each x ∈ B^n, since g(x) ≠ x, there is a unique ray starting at g(x) and passing through x. Let f(x) be the (unique) point where this ray intersects S^{n−1}. (See Figure 4.1.)


It is clear that the map f is continuous, and if x ∈ S^{n−1}, then f(x) = x by construction. Thus f is a retraction, contradicting Theorem 4.1.2. □

In the following discussion we will focus our attention on the winding numbers of maps S¹ → S¹ (in this context the winding number is often referred to as the degree). Since the domain and the codomain are the same, we can compose these maps: (f ∘ g)(z) = f(g(z)). What effect does composition have on the winding number?

Lemma 4.1.4. Let f, g ∈ Maps(S¹, S¹), and let h = f ∘ g denote their composition. Then wn(h) = wn(f) · wn(g).

Proof. Notice that if f varies through a homotopy fs, then h varies through the homotopy fs ∘ g. Similarly if g varies through a homotopy gs, then h varies through the homotopy f ∘ gs. Let f have winding number m and let g have winding number n. Then, by Theorem 3.2.7, f is homotopic to the map z → z^m and g is homotopic to the map z → z^n. By the previous paragraph, h is homotopic to the composite of these two maps, which is z → z^{mn}. Thus h has winding number mn, as required. □

A map f : S¹ → S¹ is called even if f(z) = f(−z) for all z, and it is called odd if f(z) = −f(−z) for all z.

Proposition 4.1.5. An even map S¹ → S¹ has even degree; an odd map S¹ → S¹ has odd degree.

Proof. Suppose that f : S¹ → S¹ is even. Define g : S¹ → S¹ by

$$g(z) = f(w), \qquad \text{where } w^2 = z.$$

Of course, given z, there are two possible choices for w (differing by sign); but because f is even, the choice doesn’t matter, and so g is well-defined. I claim that g is also continuous. Indeed, let ε > 0. Since f is (uniformly) continuous there is δ > 0 such that |w1 − w2| < δ implies


|f(w1) − f(w2)| < ε. Now suppose |z1 − z2| < δ², z1 = w1², and z2 = w2². Then |w1 − w2||w1 + w2| = |w1² − w2²| < δ², so one of |w1 − w2| and |w1 + w2| is less than δ. By an appropriate choice of sign we may arrange that |w1 − w2| < δ; so we have proved that

$$|z_1 - z_2| < \delta^2 \implies |g(z_1) - g(z_2)| < \varepsilon$$

and g is continuous, as required.

Now f = g ∘ s, where s(w) = w². Since the map s has degree 2, Lemma 4.1.4 shows that the degree of f is a multiple of 2. This proves the first part of the proposition.

To prove the second part, suppose that k is odd. Then f(z) = zk(z) is even, so it has even degree. But deg(f) = 1 + deg(k) by Lemma 3.2.6. Thus k has odd degree as required. □
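Lemma 4.1.4 and Proposition 4.1.5 are easy to spot-check with the winding_number sketch from Section 3.2 (my hypothetical helper), viewing a map of the circle as the loop t → f(e^{2πit}):

    import numpy as np

    f = lambda z: z**3        # degree 3
    g = lambda z: z**(-2)     # degree -2

    circle = lambda t: np.exp(2j * np.pi * t)
    print(winding_number(lambda t: f(g(circle(t)))))       # -6 = 3 * (-2)

    odd_map = lambda z: z**5                               # odd: (-z)^5 = -(z^5)
    print(winding_number(lambda t: odd_map(circle(t))))    # 5, odd as predicted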


Figure 4.2. Planar counterpart of the ham sandwich theorem: two areas can be bisected by a single line.

Theorem 4.1.7 (Ham sandwich theorem). Let A, B, and C be three solid bodies in R³. There is always a plane that divides all three of them exactly in half by volume.

To understand the name, think of A, B, and C as being the two slices of bread and the filling of a sandwich, which we want to divide fairly by a single knife-cut. It seems however (see [9]) that the earliest formulation involving ham is the following: “Can we place a piece of ham under a meat cutter in such a way that meat, bone, and fat are all cut in halves?” Apparently bread—forming a “sandwich”—was added to the problem later, replacing the bone and fat.

Proof. The mention of volume suggests, correctly, that one would have to do some measure theory to make a rigorous argument; we won’t sweat the details of that here.

It’s helpful to think first about the case when one or more of A, B, C are balls. A plane that bisects the volume of a ball is just a plane through its center. So if all three bodies are balls, the existence of the desired plane is obvious: just consider the plane through their centers.

If one body is a ball — say C — then for each vector v ∈ S² there is exactly one plane normal to v that bisects C, namely, the plane through the center of C. Call that plane Pv. Now define x(v) to be the volume of the part of body A on the positive v-side of


Pv, and y(v) similarly for body B. Define a map f : S² → R² by f(v) = (x(v), y(v)). By the Borsuk-Ulam theorem, there is a v ∈ S² such that f(v) = f(A(v)). But this means that the volumes of the parts of bodies A and B on the positive v-side of Pv are the same as the corresponding volumes on the negative v-side; that is, Pv bisects A and B, as well as C.

In the general case, we can carry out the same argument provided we know that for each vector v we can find a plane Pv normal to v that bisects C and that Pv depends continuously¹ on v. Here’s how to do that. Consider the family of all planes perpendicular to v. These are parameterized by a real number t, the signed distance from the plane to the origin, and as t varies from −∞ to ∞ the fraction of C on the positive v-side of the plane increases continuously from 0 to 1. Thus, by the intermediate value theorem, there is a value of t for which this number is 1/2, i.e., a plane perpendicular to v bisecting C. (If there is more than one such plane, the possible t-values for such planes form a closed interval; we choose Pv to be the plane corresponding to the midpoint of that interval.) This completes the proof of the ham sandwich theorem. □

¹This continuity is one of those details that we ought to sweat over. In fact, it is a nontrivial point: see the discussion in [6]. Exercise 4.5.5 will indicate one way of filling in the details here.
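The intermediate-value construction in the proof is easy to mimic for a discretized body. In the sketch below (mine), the body is idealized as finitely many equal point masses — a stand-in for the measure theory we declined to sweat over — so “bisecting” means taking a median of projections:

    import numpy as np

    def bisecting_offset(points, v):
        # The plane <x, v> = t bisects the point set exactly when t is a
        # median of the projections; averaging the two middle values mirrors
        # the midpoint-of-the-interval choice made in the text.
        proj = np.sort(points @ v)
        m = len(proj)
        return (proj[(m - 1) // 2] + proj[m // 2]) / 2.0

    rng = np.random.default_rng(0)
    C = rng.normal(size=(101, 3))              # a cloud of 101 unit masses
    v = np.array([0.0, 0.0, 1.0])
    t = bisecting_offset(C, v)
    print(((C @ v) > t).sum(), ((C @ v) < t).sum())   # 50 50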

4.2. The Jordan curve theorem I

Definition 4.2.1. A loop γ in the plane is called a Jordan curve if it has no self-intersection points or, to put it another way, it is given by an injective map γ : S¹ → C.

Proposition B.3.17 tells us that such an injective map γ is actually a homeomorphism onto its image, for which reason one sometimes refers to the image γ* itself as a Jordan curve. One of the most notorious theorems in topology is the following:

Theorem 4.2.2 (Jordan curve theorem). Any Jordan curve in C divides the plane into exactly two regions, of which it is the common boundary.


Figure 4.3. A complicated Jordan curve. Courtesy of Robert Bosch, Oberlin College.

We give a little clarification here. By “divides the plane into exactly two regions”, we mean that C \ γ* has two path components (or cells as we called them before). The notion of boundary is defined below:

Definition 4.2.3. Let U be a subset of a metric space X. A point x ∈ X is a boundary point of U if for every ε > 0 the ball B(x; ε) contains both points of U and points of the complement X \ U. The boundary ∂U of U is the collection of its boundary points.

Lemma 4.2.4. For any compact K ⊆ C, the boundary of each path component of C \ K is a subset of K.

The Jordan curve theorem supplies the additional information that if K is a Jordan curve, this subset is all of K.

Proof. Let U be some path component of C \ K and let W be the union of all the other path components of C \ K. Let p ∈ ∂U. Then for any ε > 0, the ball B(p; ε) contains a point z0 ∈ U and a point z1 ∉ U. The point z1 either belongs to K or to W; if it belongs to W, the line segment from z0 to z1 must contain a point of K (otherwise z0 and z1 would be in the same path component of C \ K). So, in either event, B(p; ε) contains a point of K. Since this is true for any ε > 0, p belongs to the closure of K. But K is closed since it is compact, so p ∈ K as required. □


Remark 4.2.5. Given what we know already, it is not hard to prove Theorem 4.2.2 for polygonal Jordan curves (Exercise 4.5.6). But the Peano curve (Example 2.1.6) shows that continuous paths can have unexpected, counterintuitive topological properties, which are not shared by “nice” paths like polygonal or differentiable ones. One might worry, then, whether the oh-so-intuitive Jordan curve theorem might also fail for some weird continuous path. In fact, it doesn’t; but this is not at all straightforward to prove.

Remark 4.2.6. The theorem was first stated formally by Jordan in his Cours d’analyse of 1887. It is often claimed that this proof was fallacious. However, in 2007 Thomas Hales published a reanalysis of Jordan’s original proof [19] in which he defends Jordan’s reasoning. Jordan’s argument proceeds from the case of polygonal paths to general paths.

The Jordan curve theorem is hard because it deduces something about the complement of a subset of the plane using only information internal to that subset. The first result that we need in the proof of the theorem gives a way to make that crucial step from a subset to its complement. Notice in the statement below that we give a criterion to determine when a point p lies in the unbounded component of the complement of a compact subset K using only homotopical information that is internal to K itself.

Proposition 4.2.7 (Eilenberg’s criterion). Let K be a compact subset of C. A point p ∈ C \ K belongs to the unbounded component of C \ K if and only if the function from K to C \ {0} defined by z → z − p lifts through the exponential map (in the sense of Definition 3.1.1; that is, if and only if there is a function k(z) defined on K with z − p = exp(k(z))).

Proof. (“Only if”) If p is sufficiently far from K, then z − p is contained in a fixed half-plane for all z ∈ K, so it certainly lifts through the exponential map. As p moves on a path in the unbounded component of C \ K, the map z → z − p moves by a homotopy in C \ {0}. By Proposition 3.1.5, the property of lifting through the exponential map is preserved by such a homotopy.


Figure 4.4. Illustrating the proof of the “if” direction of Eilenberg’s criterion.

(“If”) Suppose (aiming at a contradiction) that p belongs to some bounded component U of C \ K and that z − p = exp f(z) for some continuous function f : K → C. Let X be the metric space C \ U, which consists of K and all the other components (except for U) of C \ K; in particular it includes the unbounded component W. Thus we have the following set-up: a compact subset K of a metric space X, and a continuous function f : K → C. In this situation, the Tietze extension theorem (see Proposition C.2.1 in Appendix C) applies and says that there exists a continuous g : X → C that extends f; that is, we have a commutative diagram

        K ⊂————→ X
          \       |
           f      | g
            ↘     ↓
                C.

In particular, z − p = exp(g(z)) for z ∈ K. Now recall (Lemma 4.2.4) that ∂U ⊆ K and therefore Ū ⊆ U ∪ K. Consider the two continuous functions z → z − p (on Ū) and z → exp(g(z)) (on X = C \ U). These are defined on closed sets and agree where they are defined, so by Proposition B.4.2 they fit together to define a continuous map u : C \ {p} → C \ {0}.


Consider the winding number of the loop u ∘ γr, where γr is a circle with center p and radius r. Varying r gives a homotopy, so all these winding numbers must be the same. But for r small, γr lies entirely inside U and the winding number is 1; whereas for r large, γr lies entirely inside the unbounded component W, and therefore inside X, so that u ∘ γr is an exponential and the winding number is 0. This contradiction proves the result. □

Eilenberg’s criterion immediately allows us to prove that certain compact sets have connected complements (one says that they do not separate the plane). Recall (Exercise 2.4.7) that a compact space K is contractible if the identity map K → K is homotopic to a constant map.

Proposition 4.2.8. Let K be a compact contractible subset of C. Then its complement C \ K is connected.

Proof. If K is contractible, then there is a homotopy H : [0, 1] × K → K between the identity map K → K and a constant map. Now suppose that f is a map from K to another space X. Then the composite

    [0, 1] × K ——H——→ K ——f——→ X

gives a homotopy between f and a constant map K → X. In other words, any map from a contractible space to any space is homotopic to a constant map.

In particular, for p ∈ C \ K, the map ϕ : z → (z − p) is homotopic in C \ {0} to a constant map. By Proposition 3.1.5, the map ϕ lifts through the exponential function. By Eilenberg’s criterion, p is in the unbounded component of C \ K. Since p was arbitrary, every point of C \ K is in the unbounded component, which is to say that C \ K is connected. □

Remark 4.2.9. An important example is that of an arc. By definition, an arc in C is a subset that is homeomorphic to the compact interval [0, 1]. It is easy to see that [0, 1] is contractible, and thus so is any arc. From Proposition 4.2.8 we see therefore that the complement of any arc is connected (or, as it is usually expressed, no arc separates the plane): a fact that will play an important role in the proof of the Jordan curve theorem in the next section.

4.3. The Jordan curve theorem II

We continue with our proof of the Jordan curve theorem (Theorem 4.2.2).

Proposition 4.3.1. Let γ be a Jordan curve in C, and assume that C \ γ* is not path connected. Then the boundary of each connected component of C \ γ* is all of γ*.

Proof. Let z = γ(t) be a point of γ*. Let U be a bounded component of C \ γ* and let W be the unbounded component. It suffices to show that, for every ε > 0, the disc B(z; ε) contains points of U and points of W.

By continuity there is δ > 0 such that the arc B = γ([t − δ, t + δ]) lies entirely in the disc B(z; ε). Let A be the complementary arc γ(S¹ \ (t − δ, t + δ)). The arc A does not separate the plane (Remark 4.2.9). Therefore, there is a path ψ in C \ A joining a point p ∈ U to a point q ∈ W. This path must meet γ (since p, q are in different cells) and it doesn’t meet A, so it must meet B. Let s0 be the parameter value where it first meets B, i.e., s0 = inf{s ∈ [0, 1] : ψ(s) ∈ B}. Then ψ(s0) ∈ B(z; ε). By continuity, ψ(s) ∈ B(z; ε) for s < s0 sufficiently close to s0. But ψ(s) ∈ U for all s < s0, so B(z; ε) contains points of U. Arguing similarly for s1 = sup{s ∈ [0, 1] : ψ(s) ∈ B}, we see that B(z; ε) also contains points of W. □

60

4. Topology of the Plane

Arc J1 Q

P

Arc J2

Figure 4.5. Jordan curve theorem, Part 1.

Proof. Join C to A by a polygonal arc γ0 that lies outside R (except at its endpoints). The concatenation of γ0 and γ1 forms a loop γ. By Lemma 3.3.2 we can compute that the winding numbers of γ about B and about D differ by ±1. In particular, B and D cannot be joined by a path in C \ γ ∗ . Thus γ2 must meet γ somewhere. As it lies  entirely within R, it must in fact meet γ1 , as required. Now for the proof of the Jordan curve theorem. Let J be a Jordan curve. Since J is compact, there exist points P, Q ∈ J that are as far apart as possible (that is, r = d(P, Q) is greater than or equal to the distance between any other two points of J). Thus J is completely contained in the lozenge-shaped region B(P ; r) ∩ B(Q; r). In particular, it is completely contained in some rectangle R with sides perpendicular and parallel to P Q, and it meets the boundary of this rectangle only at P and at Q. The Jordan curve J is then the union of two arcs from P to Q; call these J1 and J2 . (See Figure 4.5.) Keep this set-up for the next two lemmas, which together constitute a proof of the Jordan curve theorem. This proof is based on an article by Maehara [27]. Lemma 4.3.3. The complement of J has at least two connected components. Proof. We want to find a point which we can prove does not lie in the unbounded component of the complement of J. Here’s how we

4.3. The Jordan curve theorem II

61

Figure 4.6. Jordan curve theorem, Part 2. (The vertical segment AB meets J1 at K and L and meets J2 at M and N; α is the arc of J1 from K to L, β the arc of J2 from M to N, and W is the midpoint of LM.)

proceed. Pick a point A on the top of the rectangle R and draw a vertical line AB from the top to the bottom of the rectangle. The "lovers and haters" lemma assures us that this line will pass through both arcs J1 and J2; assume, without loss of generality, that the first such intersection point (as we move downwards from A) is with J1. Let K be the highest intersection point of AB with J1, and let L be the lowest such point. (Note that the existence of such "highest" and "lowest" points — the sup and inf of certain sets — is assured by the compactness of J1.)

There are points of J2 ∩ AB lower than L. (Proof: If not, the path comprising the segment AK, the arc α of J1 from K to L, and the segment LB would connect A to B without meeting J2, contradicting the lovers and haters lemma.) Let M be the highest such point, and let N be the lowest such point. For future reference, we note that the straight line segment NB meets neither J1 nor J2, except at its endpoint N.

Finally, let W be the midpoint of LM. By construction, W belongs neither to J1 nor to J2, so it is in C \ J. We will prove that it is not in the unbounded component of this set. (See Figure 4.6.)

Suppose the contrary. Then there is a path Γ in R \ J that connects W to some point Z on the boundary of the rectangle R.


There are two possibilities: either Z lies on the boundary arc PAQ or on the boundary arc PBQ. Suppose first that Z lies in PAQ. Then concatenating Γ with the segment WB yields a path Γ ∗ WB from Z to B that does not meet J1. This contradicts the lovers and haters lemma. On the other hand, suppose that Z lies in PBQ. Then the concatenation of four paths AK ∗ α ∗ LW ∗ Γ yields a path from A to Z that does not meet J2. This is a similar contradiction. □

Lemma 4.3.4. The complement of J has at most two connected components.

Proof. Keep the notation from the previous lemma. In addition to the arc α that we already defined, let β be the arc of J2 that connects M to N. Let Δ be the path from A to B obtained by concatenating Δ = AK ∗ α ∗ LM ∗ β ∗ NB. Observe that the points of Δ all lie in either J itself, the unbounded component of the complement of J, or the bounded component of the complement that contains W. Let ε > 0 be small enough that B(P; ε) and B(Q; ε) don't meet the path Δ.

Suppose that U is another component of C \ J, bounded, but not containing W. By Proposition 4.3.1, ∂U = J. In particular, there are points P′ and Q′ of U belonging to B(P; ε) and B(Q; ε), respectively. These points are inside R (everything outside belongs to the unbounded component). Moreover, there is a path Λ in U joining P′ to Q′. Notice that, by construction, the path Δ contains no points of U, so Λ cannot meet Δ. (See Figure 4.7.)

Now we contradict the lovers and haters lemma using the paths Δ and Λ′ = PP′ ∗ Λ ∗ Q′Q. This completes the proof. □

A classically important corollary is the theorem of invariance of domain.


Figure 4.7. Jordan curve theorem, Part 3. The path Δ is emphasized.

Theorem 4.3.5 (Invariance of domain). Let U ⊆ C be an open set and let f : U → C be continuous and injective. Then f is an open map (it takes open sets to open sets). In particular, f(U) is open in C, and f is a homeomorphism from U onto f(U).

Proof. It suffices to show that whenever B(z; ε) ⊆ U, f(B(z; ε)) is an open subset of C. Let D be the closure of B = B(z; ε) and S its boundary circle. Then f(S) is a Jordan curve, so its complement has two components; on the other hand, the complement of f(D) has only one (unbounded) component by Eilenberg's criterion (Proposition 4.2.8). Since f(B) is path connected, it must be contained in one of the components of the complement of f(S), and it must in fact be the whole of the bounded component (since otherwise the complement of f(D) = f(S) ∪ f(B) would not be connected). Thus f(B) is open. □

4.4. Inside the Jordan curve

It's important to be clear about what we have not proved at this point. Let J be a Jordan curve in C. We have proved that the complement C \ J consists of exactly two components, one bounded and one unbounded. The bounded component is called the Jordan


domain ΩJ determined by J. None of the techniques that we have developed so far answer any of the following natural questions:

(a) Is the open set ΩJ simply connected?

(b) If the answer to (a) is yes, is ΩJ homeomorphic to the open disc 𝕌 := {z ∈ C : |z| < 1}?

(c) If the answer to (b) is yes, can a homeomorphism ΩJ → 𝕌 be found which extends continuously to a homeomorphism from Ω̄J = ΩJ ∪ J to 𝕌̄ = 𝕌 ∪ S¹?

Clearly these are successively stronger statements: (c) implies (b) which implies (a). Note that in (c) the equality Ω̄J = ΩJ ∪ J is a consequence of the Jordan curve theorem. It turns out that all three of these statements are true. The strongest of them, (c), is known as the Schoenflies theorem. However, they are all of a higher level of difficulty than the Jordan curve theorem itself. One way of seeing this is to notice that the natural higher-dimensional counterpart of the Jordan curve theorem is true (the Jordan-Brouwer separation theorem) but the higher-dimensional counterpart of the Schoenflies theorem is false without additional assumptions. We'll discuss this in a moment.

So how might one go about proving (a)–(c)? Modern direct proofs [35] use some kind of infinite combinatorial construction to prove the Schoenflies theorem — imagine a ramifying, treelike structure reaching out towards that (potentially very wiggly) boundary of ΩJ. But the first proofs did not work that way. Instead, they made use of analysis — specifically, the theory of conformal mapping — to prove this result.

Definition 4.4.1. Let U and V be open subsets of C. A conformal mapping from U to V is a homeomorphism f : U → V which is holomorphic, that is, differentiable as a function of a complex variable — the limit

f′(z) := lim_{h→0} (f(z + h) − f(z))/h ∈ C

must exist for each z ∈ U.

Holomorphic functions will be discussed in more detail in the next chapter (Definition 5.3.1), where we will see that the real and


imaginary parts u, v of such a function f must satisfy the partial differential equations

∂u/∂x = ∂v/∂y,  ∂v/∂x = −∂u/∂y,

known as the Cauchy-Riemann equations. The geometric significance of these equations turns out to be that f preserves the angles between infinitesimal vectors, though it may rescale their lengths. That is the reason for the word "conformal".

Theorem 4.4.2 (Riemann mapping theorem). Let U be a proper, nonempty connected open subset of C (proper means not equal to C itself) and suppose that the following holds:

• for every loop γ in U and every p ∉ U, the winding number wn(γ; p) equals zero.

Then there exists a conformal homeomorphism from U onto 𝕌. In particular, U is homeomorphic to 𝕌. □

Remark 4.4.3. Most courses on complex analysis will contain a proof of the Riemann mapping theorem. One approach, due to Koebe, proceeds as follows. First, one shows that the hypothesis about winding numbers implies that we can always find holomorphic square roots² of nowhere-zero holomorphic functions on U — in other words, if f is such a function, then one can find another such function g such that g(z)² = f(z). Now one considers the class of conformal homeomorphisms of U into 𝕌 (i.e., the range may be a proper subset of 𝕌). Using the existence of square roots, one gives an explicit construction for "improving" such a homeomorphism by making its range "bigger". Iterating this "improvement" process and passing to the limit gives the Riemann map.

Now it is easy to see that the hypothesis of the Riemann mapping theorem is satisfied for the interior region ΩJ of a Jordan curve J (Exercise 4.5.8). The theorem therefore implies a "yes" answer to questions (a) and (b) above. One can also use these analytical techniques to prove the Schoenflies theorem (that is, give a positive answer to question (c) as well).

² We will prove this in the next chapter; see Proposition 5.5.2.


To do so one must investigate the extension of the Riemann mapping to boundary points. Two examples warn us that this will be a delicate matter:

Example 4.4.4. If U is an open subset of C which is not a Jordan domain, a conformal homeomorphism U → 𝕌 need not extend to a homeomorphism ∂U → ∂𝕌; see Exercise 4.5.12.

Example 4.4.5. If ΩJ is a Jordan domain and f : ΩJ → 𝕌 is a general (not conformal) homeomorphism, then f need not extend to a homeomorphism J → ∂𝕌; see Exercise 4.5.11.

Nevertheless, in the early twentieth century Carathéodory was able to prove

Proposition 4.4.6. Let J be a Jordan curve, and let f : ΩJ → 𝕌 be a conformal homeomorphism. Then there is a unique homeomorphism from Ω̄J = ΩJ ∪ J to 𝕌̄ that extends f.

Together with the Riemann mapping theorem, this completes the proof of the Schoenflies theorem. (Full details of this argument may be found in [8].)

Remark 4.4.7. It may seem surprising that the first proof of a purely topological result such as the Schoenflies theorem should depend on the analysis of partial differential equations. But there is more than one modern parallel. One of the most sensational mathematical stories of recent years has been the proof, by Perelman, of the Poincaré conjecture, a fundamental result of 3-dimensional topology. And while the details are very different and much more sophisticated, the basic structure — introducing auxiliary partial differential equations, approaching an "optimum" solution by an iterative "improvement" process, and showing that such an "optimum" also solves the original topological problem — is quite similar to the structure of the proof of the Schoenflies theorem that is sketched above. See [28].

As mentioned earlier, the Schoenflies theorem is special to dimension 2. The Alexander horned sphere gives an example of a homeomorphic image of S² in R³ whose interior domain is not homeomorphic to the interior of a standard ball. To prove the Schoenflies theorem in higher dimensions, one needs an extra hypothesis of


"local flatness" which rules out such wild behavior. The classic reference on the Alexander horned sphere is [10, page 38] (many interesting drawings and animations can now be found online also), and for the "locally flat" version of the higher-dimensional Schoenflies theorem, see [12].

However, a "stabilized" version of the Schoenflies theorem is true (and easy to prove) in any dimension. The statement is:

Lemma 4.4.8. Let A and B be homeomorphic closed subsets of Rⁿ, and let h : A → B be a homeomorphism between them. Embed Rⁿ in R²ⁿ as a hyperplane. Then the homeomorphism h : A → B can be extended to a homeomorphism h̃ : R²ⁿ → R²ⁿ. In particular, R²ⁿ \ A is homeomorphic to R²ⁿ \ B.

In other words, if we allow ourselves some extra room, in the form of n extra dimensions, then the strongest possible Schoenflies-type statement becomes true. The ingenious proof, due to Dold [15], is outlined in Exercise 4.5.14. The key step is provided by the Tietze extension theorem. With a little extra input from algebraic topology it is possible to obtain the Jordan curve theorem, and its higher-dimensional counterpart the Jordan-Brouwer separation theorem, as corollaries of this result.

4.5. Exercises

Exercise 4.5.1. Give an example of a compact connected metric space X and a continuous map f : X → X that has no fixed point. Also, give such examples where f has a specified number n ∈ N of fixed points.

Exercise 4.5.2. Let m ∈ N. Suppose that a map f : S¹ → S¹ has the property that f(z) = f(e^{2πi/m} z) for all z ∈ S¹. What can you say about the degree of f? (Note that Proposition 4.1.5 covers the case m = 2.)

Exercise 4.5.3. It is a consequence of Lemma 4.1.4 that, for maps f and g from S¹ to itself, f ◦ g is always homotopic to g ◦ f. Investigate whether this is always true for maps from a compact metric space X to itself.


Exercise 4.5.4. Let A be an n × n square matrix all of whose entries are strictly positive. Show that A has an eigenvector with strictly positive eigenvalue and all entries strictly positive. (This is a simple version of the Perron-Frobenius theorem.) Hint: Let Δ ⊆ Rⁿ be the simplex

Δ = {(x1, . . . , xn) : xi ≥ 0, Σi xi = 1}.

Show that Δ is homeomorphic to B^{n−1}. If v ∈ Δ, show that Av has all entries strictly positive and therefore that ‖Av‖₁, the sum of the entries of Av, is also strictly positive. Apply the Brouwer fixed-point theorem to the map v ↦ ‖Av‖₁⁻¹ Av from Δ to itself.

Exercise 4.5.5. This exercise indicates one way to resolve the continuity question in the proof of the ham sandwich theorem. We will consider each of the three bodies A, B, and C to be Lebesgue measurable subsets of R³, having positive Lebesgue measure and each contained in the ball B(0; R) of center 0 and radius R in R³. Let H = [−3R, 3R] × S². To each (t, v) ∈ H assign the half-space Ω(t, v) = {x ∈ R³ : x · v > t}. Notice that B(0; 2R) is a subset of Ω(t, v) for all t < −2R and all v and is disjoint from Ω(t, v) for all t > 2R and all v. Let λ denote Lebesgue measure.

(i) Let f ∈ L¹(B(0; 2R), λ) be an integrable function and define

Φ_f(t, v) = ∫_{Ω(t,v)} f dλ.

Prove that Φ_f is a continuous function on H.

(ii) Fix ε > 0 and let f = χ_C + ε χ_{B(0;2R)}. Show that, for fixed v, the function t ↦ Φ_f(t, v) is strictly monotone decreasing for t ∈ [−R, R] and indeed that there is a constant c such that |Φ_f(t, v) − Φ_f(t′, v)| ≥ c|t − t′| for t, t′ ∈ [−R, R].

Prove that Φf is a continuous function on H. (ii) Fix ε > 0 and let f = χC + εχB(0;2R) . Show that, for fixed v, the function t → Φf (t, v) is strictly monotone decreasing for t ∈ [−R, R] and indeed that there is a constant c such that |Φf (t, v) − Φf (t , v)|  c|t − t | for t, t ∈ [−R, R].


(iii) Deduce that for each v ∈ S² there is a unique t = t_v such that

Φ_f(t, v) = ½ ∫ f dλ = ½ (λ(C) + ελ(B(0; 2R)))

and that t_v depends continuously on v.

(iv) Using the Borsuk-Ulam theorem show that there exists v ∈ S² such that Ω(t_v, v) contains exactly half of the volume of A and of B.

(v) The v and t_v constructed in the preceding steps depend on ε. Now let ε = 1/k, k = 1, 2, 3, . . ., and let (t_k, v_k) be the sequence of corresponding parameters in H. Show that we can find a subsequence (t_{k_m}, v_{k_m}) which is convergent, to (t_∞, v_∞) say.

(vi) Show that the plane represented by (t_∞, v_∞) solves the original ham sandwich problem.

Exercise 4.5.6. Complete the following outline of a proof of the Jordan curve theorem for polygonal loops.

(a) Let γ be a polygonal Jordan curve and let ε > 0. For each straight line segment of γ draw two parallel segments, at distance ε either side of γ, and join these segments at suitable points to form polygonal curves γ1 and γ2, which if ε is sufficiently small will be disjoint from γ.

(b) Now let U be the union of all those cells of γ for which the winding number wn(γ, p) is even, and let V be the union of all those cells for which the winding number is odd. Show that γ1 is contained in one of these sets, say U, and γ2 is contained in the other, say V.

(c) Show that U and V are path connected; this will finish the proof. Suppose that p, q ∈ U and draw the straight line path from p to q. If this doesn't meet γ, we are done. If it does, there is a first point at which it meets γ. Show that just before the first point at which it meets γ, it must meet γ1, say at a point p′. By the same argument, just after the last point at which it meets γ it must meet γ1, say at a point q′. But now we get a path from p to q in U by traveling first straight to p′, then along γ1 to q′, then straight on to q.


(This argument suggests that the Jordan curve theorem is relatively simple once our curve γ has an appropriate "tubular neighborhood", here constituted by the "guard rails" γ1 and γ2. A computational version of the even/odd winding idea in part (b) is sketched after Exercise 4.5.11 below.)

Exercise 4.5.7. Let A and B be compact subsets of C, and assume that A ∩ B is connected.

(i) Suppose that A and B do not separate the plane. Show that A ∪ B does not separate the plane. (Remember that a compact subset K separates the plane if C \ K has more than one path component.)

(ii) More generally, show that if p, q are two points which are in the same component of C \ A and in the same component of C \ B, then they are in the same component of C \ (A ∪ B).

(iii) Does the result in part (i) remain true if we only require that A and B are closed sets?

Exercise 4.5.8. Let J ⊆ C be a Jordan curve and let ΩJ be its interior (that is, the bounded component of the complement of J).

(i) Prove that if γ is a loop in ΩJ and q ∈ C \ ΩJ, then wn(γ, q) = 0. (Hint: Show that all points of C \ ΩJ belong to the same path component of C \ γ∗.)

(ii) Let p belong to the bounded component of C \ J. Show that the winding number of J around p is ±1. (Hint: Prove this first when J has a short straight section, using Lemma 3.3.2. Reduce the general case to this using a "cross-cut".)

Exercise 4.5.9. Show that R² is not homeomorphic to Rⁿ for n ≥ 3. Hint: Let h : Rⁿ → R² be the putative homeomorphism. Let U ⊆ Rⁿ be the set {(x1, x2, 0, . . . , 0) : x1² + x2² < 1}, which is homeomorphic to a disc in R². By applying the invariance of domain theorem to the map R² → R² defined by (x1, x2) ↦ h(x1, x2, 0, . . . , 0), show that the image h(U) is open in R². Show that this contradicts the continuity of h.

Exercise 4.5.10. What is the 1-dimensional counterpart of the invariance of domain theorem? Prove it.

Exercise 4.5.11. Construct an example of a homeomorphism 𝕌 → 𝕌 which has no continuous extension to the boundary.
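The even/odd winding dichotomy in part (b) of Exercise 4.5.6 is also the idea behind the standard computational point-in-polygon test. The following Python sketch (an illustration, not part of the exercises; the function name is invented) counts crossings of a rightward horizontal ray with the edges of a polygonal Jordan curve — an odd count means the query point lies in the bounded component:

    # Ray-crossing (even/odd) test for a polygonal Jordan curve.
    def in_polygon(p, vertices):
        x, y = p
        inside = False
        n = len(vertices)
        for i in range(n):
            (x1, y1), (x2, y2) = vertices[i], vertices[(i + 1) % n]
            # Count edges crossing the horizontal line through p,
            # strictly to the right of p; the (y1 > y) != (y2 > y)
            # test treats vertices lying on the ray consistently.
            if (y1 > y) != (y2 > y):
                x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
                if x_cross > x:
                    inside = not inside
        return inside

    square = [(0, 0), (2, 0), (2, 2), (0, 2)]
    print(in_polygon((1, 1), square))   # True: p is in the bounded cell
    print(in_polygon((3, 1), square))   # False: p is in the unbounded cell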


Exercise 4.5.12. (i) Let J be a Jordan curve in C and let U be the bounded component of C \ J. Let z0 ∈ J. Show that for each ε > 0 there exists δ > 0 having the following property: if z1, z2 ∈ U ∩ B(z0; δ), then there exists a curve in U ∩ B(z0; ε) joining z1 to z2.

(ii) Let U denote the "slit disc" in C; that is, U = {x + iy : x² + y² < 1, x ≥ 0 ⇒ y ≠ 0}. Show that U is bounded and simply connected, but its boundary is not a Jordan curve.

(iii) Give an explicit example to show that the boundary of the region U in part (ii) above does not have the property proved in part (i) for Jordan curves.

(iv) (For those who know something about conformal maps) Construct a conformal homeomorphism from the slit disc U onto 𝕌. Show that the homeomorphism you have constructed does not extend to ∂U. Can you give an explicit description of its boundary behavior?

Exercise 4.5.13. Let J be a Jordan curve in the plane. Show that there is a continuous function f : C → R such that f(z) is positive if z is in the interior of J, negative if z is in the exterior of J, and zero if z is on J. Challenge: If J is smooth (i.e., can be smoothly parameterized by a loop γ with γ′ ≠ 0), show that f can be chosen to be smooth also.

Exercise 4.5.14. Let A ⊆ Rᵐ and B ⊆ Rⁿ be closed sets and let h : A → B be a homeomorphism.

(i) Using the Tietze extension theorem, extend h to a continuous map f : Rᵐ → Rⁿ. Show that the map h1(x, y) = (x, y + f(x)) is a homeomorphism Rᵐ × Rⁿ → Rᵐ × Rⁿ.

(ii) Similarly, extend h⁻¹ to a continuous map g : Rⁿ → Rᵐ and show that h2(x, y) = (x − g(y), y) is a homeomorphism Rᵐ × Rⁿ → Rᵐ × Rⁿ.


(iii) Show that h2 ◦ h1 : Rᵐ × Rⁿ → Rᵐ × Rⁿ is a homeomorphism which sends (x, 0) to (0, h(x)) whenever x ∈ A.

(iv) Putting n = m, deduce the result of Lemma 4.4.8.

Dold's paper [15] shows how this result plus a little algebraic topology (the Mayer-Vietoris sequence) can be used to give a very concise proof of the general Jordan-Brouwer separation theorem. A detailed development of the argument, with all the required algebraic topology, can be found in [26]. The presentation uses de Rham cohomology, which we will begin to develop in the next chapter.

Chapter 5

Integrals and the Winding Number

5.1. Differential forms and integration

In Chapter 3 we gave three procedures for computing the winding number, one "polygonal", one "smooth", and one "algebraic". In various ways, these all worked by counting discrete objects, like intersection points of curves or roots of polynomials. In terms of our archetypal model of a clock face, imagine a bell chiming every time the clock hand passes 12, so that the winding number is obtained by counting the chimes.

We could also imagine a quite different approach to calculating the winding number. Instead of counting discrete "chimes" or intersections, we could try to measure the angular speed at which the clock hand is turning at each instant and thus find the total angle through which it turns (which is just 2π times the winding number) by adding up the increments "angular speed times time interval" over successive instants. In other words, we could define the winding number by integration. That is the guiding idea of this chapter.

We'll need the language of multivariable calculus to do this effectively. Traditionally, multivariable calculus is expressed using arrays of partial derivatives, but it is more in accordance with the spirit of


modern mathematics to avoid this implicit use of specific coordinate systems and instead to think of derivatives as linear maps between normed vector spaces. Appendix E gives an outline of multivariable calculus from this point of view.

5.1(a). Differential 1-forms. Let V be a finite-dimensional vector space over the real field R, and let V∗ denote its dual space (Definition A.4.1).

Definition 5.1.1. Let Ω be an open subset of V. A (differential) 1-form on Ω is a smooth map Ω → V∗.

In other words, a 1-form α is a "smoothly varying family of elements of the dual space": it is a function Ω × V → R, (x, v) ↦ α(x)[v], depending¹ smoothly on x and linearly on v.

One can add 1-forms (pointwise) and multiply them by scalars, so the 1-forms comprise a vector space. In fact, one can even multiply 1-forms pointwise by functions. If α is a 1-form and g is a smooth function, then β defined by β(x)[v] = g(x)α(x)[v] is also a 1-form; we use the natural notation β = gα.

If f is a smooth, real-valued function on Ω, its directional derivatives give rise to a 1-form in a natural way:

Definition 5.1.2. Let f : Ω → R be a smooth map. The gradient 1-form df of f is defined by

df(x)[v] = (d/dt)|_{t=0} f(x + tv)

for vectors v ∈ V. In the language of Appendix E, df is simply the derivative Df of f, thought of as a map Ω → V∗: df(x)[v] = Df(x)[v]. See Example E.2.8.

¹ To help keep track of the various dependencies, I will use square brackets in this chapter to indicate a linear dependence, and round brackets to indicate a more general functional dependence. See Convention E.2.6 for more about this.


Proposition 5.1.3. The gradient operator d satisfies the sum and product rules:

d(f + g) = df + dg,  d(fg) = f dg + g df.

Proof. We'll prove the product rule (the sum rule is easier). By definition,

d(fg)(x)[v] = (d/dt)|_{t=0} [f(x + tv) g(x + tv)]
= [f(x + tv) (d/dt) g(x + tv) + g(x + tv) (d/dt) f(x + tv)]|_{t=0}
= f(x) dg(x)[v] + g(x) df(x)[v],

using the ordinary form of the product rule. □
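As a quick numerical sanity check on Definition 5.1.2 and the product rule (a minimal Python sketch for illustration; the functions chosen are arbitrary), one can approximate df(x)[v] by a symmetric difference quotient and compare the two sides of d(fg) = f dg + g df at a point:

    import numpy as np

    def ddir(f, x, v, h=1e-6):
        # Approximates df(x)[v] = (d/dt)|_{t=0} f(x + t v).
        return (f(x + h * v) - f(x - h * v)) / (2 * h)

    f = lambda x: x[0] ** 2 * x[1]        # f(x1, x2) = x1^2 x2
    g = lambda x: np.sin(x[0]) + x[1]     # g(x1, x2) = sin x1 + x2

    x = np.array([1.0, 2.0])
    v = np.array([0.3, -0.7])

    lhs = ddir(lambda p: f(p) * g(p), x, v)            # d(fg)(x)[v]
    rhs = f(x) * ddir(g, x, v) + g(x) * ddir(f, x, v)  # f dg + g df
    print(lhs, rhs)   # agree up to discretization error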

Remark 5.1.4. Later we shall also want to consider complex-valued 1-forms, where α(x)[v] belongs to C rather than to R. There is no extra difficulty in dealing with these — their real and imaginary parts are just 1-forms in the ordinary (real-valued) sense. The laws above continue to apply.

5.1(b). Pullback forms. Let V and W be two vector spaces, and suppose that ϕ : V → W is a smooth map. Recall that for each x ∈ V, the derivative Dϕ(x) of ϕ is a linear map V → W. If β is a 1-form on W, then we may define a 1-form α on V by composing with Dϕ: that is, α(x) is the composite of the linear maps Dϕ(x) : V → W and β(ϕ(x)) : W → R. In symbols, α(x)[v] = β(ϕ(x))[Dϕ(x)[v]].

Definition 5.1.5. The above-defined 1-form α is called the pullback of β along the smooth map ϕ, and it is written α = ϕ∗β.

Remark 5.1.6. Here is an important special case. Suppose that β is in fact the gradient of a function f : W → R. Then I claim that α = ϕ∗β is the gradient of the function f ◦ ϕ : V → R, that is, ϕ∗(df) = d(f ◦ ϕ). To see this, remember that df(x)[w] = Df(x)[w]


and therefore

ϕ∗(df)(x)[v] = Df(ϕ(x))[Dϕ(x)[v]]
= (Df(ϕ(x)) ◦ Dϕ(x))[v]  (definition of composite map)
= D(f ◦ ϕ)(x)[v]  (by the chain rule, Proposition E.3.2)
= d(f ◦ ϕ)(x)[v],

as required.

Proposition 5.1.7. Pullbacks are functorial: given smooth maps ϕ : V → W and ψ : W → X and a 1-form β on X, we have (ψ ◦ ϕ)∗(β) = ϕ∗(ψ∗(β)).

Proof. Let β be a 1-form on X. Then by definition

(ψ ◦ ϕ)∗(β)(x)[v] = β(ψ(ϕ(x)))[D(ψ ◦ ϕ)(x)[v]],
ϕ∗(ψ∗β)(x)[v] = β(ψ(ϕ(x)))[Dψ(ϕ(x))[Dϕ(x)[v]]].

But these are equal by the chain rule, Proposition E.3.2. □

Example 5.1.8. Consider the special case where both V and W are 1-dimensional vector spaces. In that case they are both isomorphic to R; fix isomorphisms x : V → R and y : W → R. The function y = ϕ(x) is now a real-valued function of a real variable, and its derivative Dϕ(x) is a 1 × 1 matrix, that is, a scalar Dϕ(x) = ϕ′(x). The most general 1-form β on W is given by β = f(y) dy for some smooth function f, and Definition 5.1.5 gives

(5.1.9)  ϕ∗(f(y) dy) = f(ϕ(x)) ϕ′(x) dx.

The identity (5.1.9) in the above example should remind you of the rule for integration by substitution from elementary calculus:

∫_{ϕ(a)}^{ϕ(b)} f(y) dy = ∫_a^b f(ϕ(x)) ϕ′(x) dx,

valid where the substitution ϕ has (strictly) positive derivative on the interval [a, b] ⊆ R. In fact, this allows us to set up a close link between 1-forms and integration over intervals.
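As a numerical illustration of (5.1.9) and the substitution rule (a sketch with an arbitrarily chosen f and ϕ, not an example from the text), both sides can be approximated by the midpoint rule:

    import numpy as np

    f = lambda y: np.cos(y)
    phi = lambda x: x ** 3 + x          # strictly increasing on [0, 1]
    dphi = lambda x: 3 * x ** 2 + 1     # ϕ'(x)

    def integrate(g, a, b, n=100_000):
        # Midpoint-rule approximation of the ordinary integral.
        x = np.linspace(a, b, n, endpoint=False) + (b - a) / (2 * n)
        return np.sum(g(x)) * (b - a) / n

    lhs = integrate(f, phi(0.0), phi(1.0))                    # ∫ f(y) dy
    rhs = integrate(lambda x: f(phi(x)) * dphi(x), 0.0, 1.0)  # ∫ ϕ*(f dy)
    print(lhs, rhs)   # both ≈ sin 2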


Definition 5.1.10. Let I = [a, b] ⊆ R be a closed bounded interval and let α be a 1-form on I. The integral of α over I, written ∫_I α, is defined to be the ordinary "Calculus I" integral

∫_a^b f(x) dx,

where α = f(x) dx is the expression of the 1-form α in terms of the "standard" 1-form dx.

From (5.1.9) and the rule for integration by substitution, it then follows that

Lemma 5.1.11. Let I and J be closed bounded intervals, and let ϕ : I → J be a smooth bijection with strictly positive derivative everywhere. Let β be a 1-form on J. Then

∫_I ϕ∗β = ∫_J β.

Thus 1-forms and pullbacks automatically keep track of the derivative terms required when we integrate by substitution. We'll develop this relationship further in the next subsection.

5.1(c). Integration along a path. Let us now consider paths in V. By a smooth path (also known as a smooth curve) in V we just mean a smooth map [0, 1] → V, as before (see Section 3.4). Sometimes it is convenient to allow a general parameter interval [a, b] instead of [0, 1]; of course, this makes no essential difference.

Definition 5.1.12. Let γ be a smooth path in V and let α be a 1-form on an open subset Ω ⊆ V that contains γ∗. The integral of α along γ, written ∫_γ α, is defined by

∫_γ α = ∫_{[0,1]} γ∗α,

where γ∗α is the pullback of the 1-form α along the smooth map γ.

Using the definitions of pullback (Definition 5.1.5) and of integration over [0, 1] (Definition 5.1.10), we obtain the explicit expression

(5.1.13)  ∫_γ α = ∫_0^1 α(γ(t))[γ′(t)] dt,


where the integral on the right is now an ordinary "Calculus I" integral.

There is a version of the fundamental theorem of calculus in this context, which tells us what happens when we integrate a gradient form.

Proposition 5.1.14. Let γ be a smooth path in V, and let f be a smooth real-valued function defined on an open subset Ω ⊆ V containing γ∗. Then ∫_γ df = f(γ(1)) − f(γ(0)). So, if γ is a loop, then ∫_γ df = 0.

Proof. By the chain rule (Proposition E.3.2), (d/dt) f(γ(t)) = df(γ(t))[γ′(t)]. Therefore,

∫_γ df = ∫_0^1 df(γ(t))[γ′(t)] dt = ∫_0^1 (d/dt) f(γ(t)) dt = f(γ(1)) − f(γ(0)),

by the ordinary fundamental theorem of calculus. □
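The explicit formula (5.1.13) makes Proposition 5.1.14 easy to test numerically. In the following sketch (for illustration only; the function and path are arbitrary choices), the path integral of df is computed by the midpoint rule and compared with f(γ(1)) − f(γ(0)):

    import numpy as np

    f = lambda p: p[0] ** 2 + np.sin(p[1])
    grad_f = lambda p: np.array([2 * p[0], np.cos(p[1])])  # df(p)[v] = grad_f(p)·v

    gamma = lambda t: np.array([np.cos(np.pi * t), t ** 2])           # a smooth path
    dgamma = lambda t: np.array([-np.pi * np.sin(np.pi * t), 2 * t])  # γ'(t)

    n = 20_000
    ts = np.linspace(0.0, 1.0, n, endpoint=False) + 1.0 / (2 * n)
    integral = sum(grad_f(gamma(t)) @ dgamma(t) for t in ts) / n      # (5.1.13)

    print(integral, f(gamma(1.0)) - f(gamma(0.0)))   # the two values agree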

Definition 5.1.15. Let γ be a smooth path in V and let ϕ : [0, 1] → [0, 1] be a smooth bijection with strictly positive derivative everywhere. Then we will call the path γ ◦ ϕ a smooth reparameterization of γ.

See Example 2.2.3 for the language of "reparameterization". It is natural to regard a reparameterization of γ as "the same path traversed with different speed", and this leads to the idea that geometric concepts related to smooth paths should not be affected by smooth reparameterization. Lemma 5.1.11 tells us that this is true for the integral of a 1-form. Here are the details:

Proposition 5.1.16. Suppose that γ2 is a smooth reparameterization of the smooth path γ1 in V and that α is a 1-form on V. Then

∫_{γ1} α = ∫_{γ2} α.

Proof. Let γ2 = γ1 ◦ ϕ, where ϕ is as above. Put β = γ1∗α. By the functorial property of pullbacks (Proposition 5.1.7),

γ2∗α = (γ1 ◦ ϕ)∗α = ϕ∗γ1∗α = ϕ∗β.

Therefore,

∫_{γ1} α = ∫_{[0,1]} γ1∗α = ∫_{[0,1]} β = ∫_{[0,1]} ϕ∗β = ∫_{[0,1]} γ2∗α = ∫_{γ2} α,

using the definition of path integral and Lemma 5.1.11. □

Because of Proposition 5.1.16, when calculating the integral of a 1-form along a path we may freely choose whichever smooth parameterization of the path is most convenient. This often simplifies calculation.

5.1(d). Piecewise smooth paths. A path γ : [0, 1] → V (that is, a continuous map) is piecewise smooth if there exist a0, . . . , am ∈ [0, 1], with 0 = a0 < a1 < · · · < am−1 < am = 1, such that γ is smooth on each of the intervals [aj, aj+1] for j = 0, . . . , m − 1. In other words, a piecewise smooth path is made up by concatenating finitely many smooth segments. Polygonal paths (see Section 3.3) are examples of piecewise smooth paths.

We can extend the definition of path integral to piecewise smooth paths by defining

(5.1.17)  ∫_γ α = Σ_{j=0}^{m−1} ∫_{γj} α,

where γj is the smooth path obtained by restricting γ to the interval [aj, aj+1]. With this definition the fundamental theorem of calculus, Proposition 5.1.14, continues to hold for piecewise smooth paths (to prove this, apply the smooth version to each of the segments γj and add the results).

5.2. Closed and exact forms

Let α be a 1-form defined on an open subset Ω of a vector space V. Being (in particular) a smooth map Ω → V∗, α can be differentiated: its derivative Dα(x) is defined at each x ∈ Ω and is a linear map V → V∗ or (what is the same thing) a bilinear map V × V → R. That is, the symbol Dα(x)[v1, v2]


defines a real number for all x ∈ Ω and v1, v2 ∈ V; this real number depends smoothly on x and linearly on v1 and v2. (Here we are reproducing, in our context of 1-forms, some of the discussion leading up to the expression (E.3.3).)

Definition 5.2.1. We say that α is closed if Dα is symmetric, that is, Dα(x)[v1, v2] = Dα(x)[v2, v1] for all x, v1, v2.

Definition 5.2.2. We say that α is exact if there is a smooth function f on Ω with α = df.

The equations defining a closed form are linear, so the closed forms (on a given domain Ω) make up a vector space Z¹(Ω), a subspace of the vector space of all forms on Ω. Similarly, the exact forms make up a vector space B¹(Ω). In fact, B¹(Ω) is a subspace of Z¹(Ω); in other words,

Proposition 5.2.3. Every exact form is closed.

Proof. Bearing in mind that df(x)[v] = Df(x)[v] (Definition 5.1.2), we see that if α = df, then Dα = D²f. This is symmetric by Clairaut's theorem (Proposition E.3.4). □

Proposition 5.2.4. Let Ω be an open subset of a vector space V and let Ω′ be an open subset of a vector space V′. Let ϕ : Ω → Ω′ be a smooth map and let α be a 1-form defined on Ω′. Then:

(a) If α is closed, then its pullback ϕ∗α is closed.

(b) If α is exact, then its pullback ϕ∗α is exact.

(Briefly, closedness and exactness are preserved under pullback.)

Proof. To prove (a), we calculate using the chain rule that

D(ϕ∗α)(x)[v1, v2] = Dα(ϕ(x))[Dϕ(x)[v1], Dϕ(x)[v2]],

and clearly this is symmetric in v1, v2 if Dα(ϕ(x)) is symmetric. To prove (b), we observe from Remark 5.1.6 that if α = df, then ϕ∗α = d(f ◦ ϕ). □

Remark 5.2.5. Since every exact form is closed, it is natural to ask whether the converse holds: is every closed form exact? One


can measure the difference² between closed and exact forms by the quotient space H¹(Ω) := Z¹(Ω)/B¹(Ω), known as the first de Rham cohomology of Ω. This is a topological invariant of Ω and is not always zero.

For computational purposes it is important to understand how the definition of "closed" is expressed in coordinates. Choose a basis for the n-dimensional vector space V, and let x1, . . . , xn : V → R be the associated coordinate functions. Then³ a general 1-form α can be written as α = u1 dx1 + · · · + un dxn, where u1, . . . , un : V → R are smooth functions.

Proposition 5.2.6. A 1-form α, written in coordinates as u1 dx1 + · · · + un dxn, is closed if and only if

∂ui/∂xj = ∂uj/∂xi

for all i, j with 1 ≤ i, j ≤ n.

Proof. Fix a point y = (y1, . . . , yn)ᵀ in V. The derivative Dα(x) is a bilinear form on V, which may be written in coordinate form (with respect to the given basis of V) as a square matrix. By the chain rule, the (i, j)th entry of this matrix is the derivative at yi of the composite function τj ◦ α ◦ σi : R → R, where σi(y) = (y1, . . . , yi−1, y, yi+1, . . . , yn)ᵀ and τj(ξ1, . . . , ξn) = ξj. But this composite function is y ↦ uj(y1, . . . , yi−1, y, yi+1, . . . , yn) and its derivative at yi is just ∂uj/∂xi(y).

² Making this definition is an example of the process described in the Preface, whereby mathematicians turn lemons into lemonade. In place of an inconvenience — closed and exact are not always the same — we now have an interesting and useful invariant — the de Rham cohomology.
³ See Exercise 5.7.1.

Figure 5.1. Figure for Lemma 5.2.7: the rectangle with vertices (0, 0), (a, 0), (a, b), (0, b), whose successive sides are traversed by the paths γ1, γ2, γ3, γ4.

The form α is closed if and only if the matrix representing Dα is symmetric, which gives the required condition

∂ui/∂xj(y) = ∂uj/∂xi(y),

which must be valid at all points y ∈ Ω. □
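Proposition 5.2.6 makes closedness easy to check symbolically. As an illustrative sketch (not from the text), the following sympy computation verifies that the 1-form α = (−y dx + x dy)/(x² + y²) on the plane minus the origin is closed; this α is the imaginary part of the form dz/z that will reappear in Section 5.3, and Remark 5.3.7 will show that that form is not exact:

    import sympy as sp

    x, y = sp.symbols('x y', real=True)
    u1 = -y / (x**2 + y**2)    # coefficient of dx
    u2 = x / (x**2 + y**2)     # coefficient of dy

    # Closedness criterion of Proposition 5.2.6: ∂u1/∂y = ∂u2/∂x.
    print(sp.simplify(sp.diff(u1, y) - sp.diff(u2, x)))   # 0, so α is closed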

The next lemma gives the crucial step in understanding the relationship between closed and exact forms.

Lemma 5.2.7. Let γ be a piecewise smooth path traversing the boundary of a rectangle R, whose sides are parallel to the coordinate axes. Let α be a closed 1-form defined on R. Then ∫_γ α = 0.

Proof. There is no loss of generality in assuming that the successive vertices of the rectangle are at (0, 0), (a, 0), (a, b), and (0, b). (See Figure 5.1.) Write γ1, γ2, γ3, and γ4 for paths traversing the successive sides of the rectangle, and write α = g dx + h dy, where g and h are functions of x and y. The assumption that α is closed means that g2 = h1, where the subscripts 1 (or 2) denote partial derivative in the first (or second) variable.

We want to evaluate ∫_γ α, which is equal to the sum of the integrals along the four sides of the rectangle, Σ_{k=1}^{4} ∫_{γk} α. Let's look at these terms individually. Parameterizing the path γ1 in the obvious way⁴ gives ∫_{γ1} α = ∫_{x=0}^{a} g(x, 0) dx. Similarly we may parameterize γ3 to get


∫_{γ3} α = −∫_{x=0}^{a} g(x, b) dx, the minus sign arising because γ3 is traversed in the backward direction. Thus

∫_{γ1} α + ∫_{γ3} α = ∫_{x=0}^{a} (g(x, 0) − g(x, b)) dx = −∫_{x=0}^{a} ∫_{y=0}^{b} g2(x, y) dy dx,

using the fundamental theorem of calculus. On the other hand, similar reasoning with the vertical sides of the rectangle gives

∫_{γ2} α + ∫_{γ4} α = ∫_{y=0}^{b} (h(a, y) − h(0, y)) dy = ∫_{y=0}^{b} ∫_{x=0}^{a} h1(x, y) dx dy.

Combining both equations and changing the order of integration, we get

∫_γ α = −∫_{x=0}^{a} ∫_{y=0}^{b} (g2(x, y) − h1(x, y)) dy dx,

and this vanishes because α is closed. □

⁴ Note the implicit use of Proposition 5.1.16.

Theorem 5.2.8. Let Ω be an open subset of a vector space V and let α be a closed 1-form defined on Ω. Let γ be a smooth (or piecewise smooth) loop in Ω. If γ is homotopic to a constant loop, then

∫_γ α = 0.

More generally, for an arbitrary loop γ, the integral ∫_γ α depends only on the homotopy class of γ in the space Maps(S¹; Ω).

Proof. Let h be a homotopy of smooth loops in Ω between γ0 and γ1. Thus, h is a map (which may be assumed to be smooth; see Remark 3.4.1) from [0, 1] × [0, 1] to Ω, with h(0, t) = γ0(t) and h(1, t) = γ1(t), and moreover h(s, 0) = h(s, 1) for all s (this expresses that h is a homotopy of loops). Let us apply Lemma 5.2.7 to the form h∗α, which is closed because of Proposition 5.2.4 and is defined on the rectangle [0, 1] × [0, 1] ⊆ R². The lemma tells us that the sum of the


integrals of h∗α along the four sides of the rectangle is zero. But, by definition, the integrals along the vertical sides of the rectangle are

∫_{γ1} α and −∫_{γ0} α,

and the integrals along the horizontal sides cancel because of the loop property h(s, 0) = h(s, 1). Thus we find that

∫_{γ1} α = ∫_{γ0} α,

as required. (If the initial loop γ is only piecewise smooth, we first round off the corners to get a homotopic smooth loop without changing the integral by more than some prescribed ε; then we apply the above argument to the smooth approximation and finally let ε → 0.) □

5.3. The winding number via integration

We now focus our attention on the 2-dimensional case — the complex plane. Let f be a smooth function from an open subset Ω of C to C itself.

Definition 5.3.1. The smooth function f : Ω → C is holomorphic if it is complex differentiable at each point z ∈ Ω; in other words, if the limit

f′(z) = lim_{h→0, h∈C} (f(z + h) − f(z))/h

exists (in C) for each z ∈ Ω.

The usual arguments of elementary analysis apply to this definition and show that the sum, product, quotient, and composite (where defined) of holomorphic functions are holomorphic.

Remark 5.3.2. Suppose we write f = u + iv, z = x + iy and regard u, v as functions of the two real variables x, y. Then, in the language of Appendix E, to say that f is holomorphic is to say that at each point z of Ω the real derivative

Df = [ ∂u/∂x  ∂u/∂y
       ∂v/∂x  ∂v/∂y ]


is actually a complex linear map C → C: namely, the map given by multiplication by the complex number f′(z) = a + ib ∈ C. The 2 × 2 matrix of this multiplication map is obtained by considering the effect of multiplication by a + ib on the basis vectors 1 and i. It is

[ a  −b
  b   a ].

Comparison with the matrix of partial derivatives above shows that a holomorphic map satisfies the Cauchy-Riemann equations

∂u/∂x = ∂v/∂y,  ∂v/∂x = −∂u/∂y,

which we have already encountered in Section 4.4. Most complex analysis courses will offer a proof that, under some smoothness assumptions, the Cauchy-Riemann equations give a sufficient as well as a necessary condition for a function to be holomorphic. However, we will not need this fact.

Lemma 5.3.3. If the smooth function f is holomorphic, then the 1-form f(z) dz is closed.

Proof. The form f(z) dz is a complex 1-form (see Remark 5.1.4), which we may express in terms of real and imaginary parts as

f(z) dz = (u + iv)(dx + i dy) = (u + iv) dx + (−v + iu) dy,

where f = u + iv. The condition for this form to be closed is then

(∂/∂y)(u + iv) = (∂/∂x)(−v + iu),

and taking real and imaginary parts gives

∂u/∂x = ∂v/∂y,  ∂v/∂x = −∂u/∂y,

which are the Cauchy-Riemann equations. □
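The Cauchy-Riemann equations can be verified symbolically for any concrete holomorphic function. A minimal sympy sketch (for illustration, using f(z) = z³ as an arbitrary example):

    import sympy as sp

    x, y = sp.symbols('x y', real=True)
    f = sp.expand((x + sp.I * y) ** 3)    # f(z) = z^3, a holomorphic function
    u, v = sp.re(f), sp.im(f)             # real and imaginary parts

    # Cauchy-Riemann: u_x = v_y and v_x = -u_y.
    print(sp.simplify(sp.diff(u, x) - sp.diff(v, y)))   # 0
    print(sp.simplify(sp.diff(v, x) + sp.diff(u, y)))   # 0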

There is a close relation between holomorphic functions, integration along paths, and winding numbers. We have developed all the ingredients that we need to state and prove the fundamental result.


Theorem 5.3.4. Let p be a point of C, and let γ be a piecewise smooth loop in C \ {p}. Then

wn(γ, p) = (1/2πi) ∫_γ 1/(z − p) dz.

Remark 5.3.5. The integral here is of the general shape ∫_γ f(z) dz, where f is a complex-valued function. The "Calculus I" formula for such an integral (following (5.1.13)) is

∫_γ f(z) dz = ∫_0^1 f(γ(t)) γ′(t) dt,

where, on the right-hand side, the real and imaginary parts of the complex expression f(γ(t)) γ′(t) are integrated independently.

Proof. There is no loss of generality in taking p = 0, and we shall do that. Our proof will use a three-stage process for identifying the winding number, which is a "standard operating procedure" that we will see again in Section 7.4. Considering the right-hand side as a function ν(γ) = (2πi)⁻¹ ∫_γ z⁻¹ dz of the loop γ, we'll prove (a) that ν is multiplicative (that is, ν(γ1γ2) = ν(γ1) + ν(γ2)) and (b) that it is homotopy invariant. In light of Theorem 3.2.7 this shows that ν(γ) is some multiple of the winding number of γ, and we'll check which multiple it is by (c) computing a single example.

For multiplicativity (a), note the following consequence of the product rule (Proposition 5.1.3):

(5.3.6)  (1/fg) d(fg) = (1/f) df + (1/g) dg.

We're going to apply this as follows: let V = C ⊕ C, a 4-dimensional vector space over R, and let f and g be defined on V by f(z1, z2) = z1 and g(z1, z2) = z2. Define a path γ in V by γ(t) = (γ1(t), γ2(t)). Then it is easy to check (either by using the definition of path integral in terms of pullbacks or more prosaically by using the "Calculus I" formula of Remark 5.3.5) that

ν(γ1) = (1/2πi) ∫_γ (1/f) df,  ν(γ2) = (1/2πi) ∫_γ (1/g) dg,


and

ν(γ1γ2) = (1/2πi) ∫_γ (1/fg) d(fg).

Thus the desired additivity follows from (5.3.6).

For homotopy (b), consider the integrand (z − p)⁻¹ dz. Since p ∉ Ω = C \ {p}, the function z ↦ (z − p)⁻¹ is holomorphic on Ω. Thus, by Lemma 5.3.3, the 1-form (z − p)⁻¹ dz is closed. The desired homotopy invariance follows from Theorem 5.2.8.

As noted above, properties (a) and (b) show that ν is a multiple of the winding number. To show that this multiple is 1 and thus that ν is equal to the winding number, we compute a single example (c). Consider the standard path γ(t) = e^{2πit} with winding number 1. For this path

ν(γ) = (1/2πi) ∫_0^1 γ′(t)/γ(t) dt = (1/2πi) ∫_0^1 2πi dt = 1,

which agrees with the winding number. This completes the proof. □

Remark 5.3.7. Notice that this theorem provides us with a fundamental example of a closed 1-form (on the non-simply-connected region C \ {p}) that is not exact. Indeed, (z − p)⁻¹ dz is closed as we observed above; but if it were exact, then its integral around any loop would vanish, by the fundamental theorem of calculus (Proposition 5.1.14).
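Theorem 5.3.4 also gives a practical recipe: the winding number can be computed by numerical quadrature of (1/2πi) ∫_γ dz/(z − p). Here is a short Python sketch (for illustration; the sample loop is an arbitrary choice):

    import numpy as np

    def winding_number(gamma, dgamma, p, n=100_000):
        # Midpoint-rule approximation of (1/2πi) ∫_γ dz/(z − p).
        t = np.linspace(0.0, 1.0, n, endpoint=False) + 1.0 / (2 * n)
        return np.sum(dgamma(t) / (gamma(t) - p)) / (n * 2j * np.pi)

    # A loop winding twice around the origin: γ(t) = e^{4πit}.
    gamma = lambda t: np.exp(4j * np.pi * t)
    dgamma = lambda t: 4j * np.pi * np.exp(4j * np.pi * t)

    print(winding_number(gamma, dgamma, 0.0))   # ≈ 2
    print(winding_number(gamma, dgamma, 3.0))   # ≈ 0 (p outside the loop)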

5.4. Homology

As we further develop the theory of integration along paths, we are going to run across situations where we need to employ the following device. Let γ be a piecewise smooth loop in some region Ω ⊆ C. It may be possible to express γ as the concatenation of finitely many paths, say γ1, . . . , γn. Then, of course, we will have

∫_γ α = Σ_{k=1}^{n} ∫_{γk} α

for any 1-form α (in fact, this is how we defined the integral along a piecewise smooth path; see (5.1.17)). Now it is quite possible that concatenating the paths γk in a different order, such as γσ(1), . . . , γσ(n)


where σ is a permutation of {1, . . . , n}, might yield an entirely different loop δ. In this case we will have

∫_δ α = Σ_{k=1}^{n} ∫_{γσ(k)} α = Σ_{k=1}^{n} ∫_{γk} α = ∫_γ α,

even though the loops γ and δ may be quite different from the point of view of the topological tools we have developed thus far — for instance, they need not be homotopic (Exercise 5.7.15).

It is convenient to formalize this idea by means of the notions of chains, cycles, and homology. This machinery is called homology theory and is one of many things described in this book whose developments and generalizations are vital tools in higher-dimensional topology.

Definition 5.4.1. Let Ω be an open subset of a vector space V. A 1-chain in Ω is a finite formal linear combination, with integer coefficients, of smooth paths in Ω. In other words, it is a formal expression

Σ_{k=1}^{n} mk [γk],

where the γk are smooth paths in Ω and the mk are integers. Chains will usually be denoted by upper case Greek letters like Γ.

By introducing the integer coefficients mk, we have made the collection of chains into an abelian group: two chains can be added componentwise, and this addition operation is commutative. (Those who are familiar with the terminology will recognize that the group of chains is in fact the free abelian group generated by the smooth paths in Ω.) When mk = 1, we usually omit it and just write the corresponding term as [γk], and similarly when mk = −1, we write −[γk]. A term with a 0 coefficient can be omitted. The coefficient mk may be referred to as the multiplicity of γk in the given chain. Finally, we define the image Γ∗ of Γ to be the union of the images γk∗.

Important examples arise from piecewise smooth paths: if γ is such a path, made up by concatenating smooth subarcs γk, then we may associate to it the chain Γ = Σ_k [γk]. The following definition then generalizes (5.1.17).


Definition 5.4.2. Let Γ = Σ_{k=1}^{n} mk [γk] be a 1-chain in Ω, and let α be a 1-form on Ω. Then the integral of α along Γ is defined by

∫_Γ α = Σ_{k=1}^{n} mk ∫_{γk} α.

An equivalence relation ∼ on chains (we just call it "equivalence") is generated⁵ by the following three rules.

(a) If γ is the concatenation of γ1 and γ2, then [γ] ∼ [γ1] + [γ2].

(b) If γ′ is the reverse⁶ of γ (traversed in the backwards direction), then [γ] ∼ −[γ′].

(c) If γ1 is a reparameterization⁷ of γ2, then [γ1] ∼ [γ2].

Notice that this equivalence relation respects the operation of integration along chains: if Γ1 is equivalent to Γ2, then

∫_{Γ1} α = ∫_{Γ2} α

for all α defined in a neighborhood of Γ1∗ ∪ Γ2∗. Notice also that a constant chain is equivalent to 0 (by combining (a) and (c) above).

Definition 5.4.3. Suppose that γ is a piecewise smooth loop, made up by concatenating smooth subarcs γk. The associated chain Γ = Σ_k [γk] is called a simple cycle. More generally, a cycle is any chain which is equivalent to a finite linear combination of simple cycles.

In other words, a chain is a cycle if its component arcs (counted according to multiplicity) can be subdivided, rearranged, reoriented, and concatenated to yield finitely many piecewise smooth loops (again with associated multiplicities).

Lemma 5.4.4. If Γ is a cycle, then ∫_Γ α = 0 for any exact 1-form α.

Proof. This follows from the fundamental theorem of calculus for piecewise smooth loops: if γ is such a loop, then

∫_γ df = f(γ(1)) − f(γ(0)) = 0.

⁵ See Remark G.1.4.
⁶ See Proposition 2.1.2.
⁷ See Example 2.2.3.


Thus the result holds for simple cycles, and it extends to all cycles by linearity. □

To define the key notion of homology, we need to introduce a special class of cycles, called simple boundaries. Let R be a rectangle in R², with vertices a, b, c, d. We can define a piecewise smooth loop ∂R (the boundary of R) in R² by following the boundary edges ab, bc, cd, and da, in that order and at constant speed. Now suppose that σ : R → Ω is a smooth map. By composing σ with ∂R we obtain a piecewise smooth loop in Ω and hence a cycle which we denote ∂(R, σ) or just ∂R if the map σ is obvious from the context. Such a cycle is also called a simple boundary in Ω. By analogy with the previous definition we now say

Definition 5.4.5. A boundary in Ω is any cycle which is equivalent to a finite linear combination of simple boundaries. Two chains (or cycles) are said to be homologous in Ω if their difference is a boundary in Ω. Because of this, a boundary may also be described as nullhomologous.

Example 5.4.6. Two paths that are homotopic (keeping endpoints fixed) in Ω are homologous in Ω. (Indeed, a homotopy is simply a map from the rectangle [0, 1] × [0, 1] to Ω.)

Compare the next result with Lemma 5.4.4.

Lemma 5.4.7. Suppose that Γ is a boundary in Ω. Then ∫_Γ α = 0 for any closed 1-form α on Ω.

Proof. It suffices to consider the case of a simple boundary, Γ = ∂(R, σ). Then

∫_Γ α = ∫_{∂R} σ∗α = ∫_{∂R} β,

where β is the closed (Proposition 5.2.4) form σ∗α defined on R. The result then follows from Lemma 5.2.7. □

Thus the difference between boundaries and cycles is closely related to the difference between closed forms and exact forms.

This result completes our development of the general language of homology. In order to apply it to problems in plane topology,


however, we need one more thing: an effective criterion that tells us when a given cycle is actually a boundary. The proof of such a criterion occupies the remainder of this section.

Remark 5.4.8. Let Γ be a cycle in C, and let p ∈ C be a point not belonging to Γ∗. Then we can define the winding number of Γ around p in the natural way: if Γ = Σ_ℓ mℓ Γℓ, where Γℓ is the simple cycle associated to the piecewise smooth loop γℓ, then put

wn(Γ, p) = Σ_ℓ mℓ wn(γℓ, p).

The integral formula for the winding number, Theorem 5.3.4, extends by linearity to give

wn(Γ, p) = (1/2πi) ∫_Γ 1/(z − p) dz.

Theorem 5.4.9 (Artin's criterion). Let Ω ⊆ C be an open subset, and let Γ be a cycle in Ω. Then Γ is a boundary in Ω if and only if wn(Γ, p) = 0 for all p ∈ C \ Ω. Consequently, two cycles in Ω are homologous in Ω if and only if they have the same winding numbers around all p ∈ C \ Ω.

In order to present the proof of Artin's criterion, let's introduce some terminology. A finite collection of horizontal and vertical lines in C will be called a grid. The grid lines of a given grid G subdivide one another into line segments or edges (some finite and some not), and a cycle which is equivalent to a finite linear combination of finite edges for G will be called a grid cycle for G. Such a cycle is completely determined, up to equivalence, by the multiplicities⁸ that it assigns to each of the finite edges of the grid G; these will be called the grid multiplicities of the given grid cycle. The complement of the grid G in C is a disjoint union of open rectangles (some finite and some not), and a given grid cycle will have a well-defined winding number around each of these complementary rectangles; let us call the collection of these numbers the grid winding numbers of the grid cycle in question. We use the notation wn(Γ; R),

⁸ For definiteness, let us say that each edge is oriented in the direction of increasing the appropriate coordinate, x for horizontal edges and y for vertical ones.


Figure 5.2. A grid, a grid cycle, and some grid winding numbers (the complementary rectangles shown carry winding numbers −1, +1, +2, and 0).

where Γ is a grid cycle and R a complementary rectangle, and we notice that wn(Γ; R) = 0 if the rectangle R is not finite. See Figure 5.2.

Lemma 5.4.10. Every cycle in an open subset Ω ⊆ C is homologous to a grid cycle (for some grid G).

Proof. It suffices to consider a piecewise smooth loop γ. Since γ∗ is a compact subset of Ω, there is ε > 0 such that any disc of radius ε centered on a point of γ∗ is contained in Ω. Since γ is uniformly continuous (Proposition B.3.19), we can find δ > 0 such that if |s − t| < δ, then |γ(s) − γ(t)| < ε. Subdivide the parameter interval [0, 1] into finitely many subintervals [tj, tj+1], j = 0, . . . , m − 1, such that |tj − tj+1| < δ for each j, and denote by γj the path given by restricting γ to [tj, tj+1]. Then [γ] is equivalent to the cycle [γ0] + · · · + [γm−1]. Put γ(tj) = xj + iyj, and let hj be the horizontal line segment from xj + iyj to xj+1 + iyj, and vj the vertical line segment from xj+1 + iyj to xj+1 + iyj+1. Then γj is homotopic (with endpoints fixed) to the concatenation of hj and vj, by a linear homotopy lying inside the disc


B(γ(tj); ε) ⊆ Ω. Consequently, [γ] is homologous in Ω to

Σ_{j=0}^{m−1} ([hj] + [vj]),

which is a grid cycle. □
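The proof of Lemma 5.4.10 is effectively an algorithm: sample the loop finely and replace each short arc by a horizontal segment followed by a vertical one. The following sketch (for illustration only; it records the staircase edges and omits the equivalence bookkeeping) carries out the construction:

    import numpy as np

    def grid_cycle_edges(gamma, m=64):
        # Replace each arc γ|[t_j, t_{j+1}] by the horizontal segment h_j
        # followed by the vertical segment v_j, as in Lemma 5.4.10.
        t = np.linspace(0.0, 1.0, m + 1)
        pts = gamma(t)
        edges = []
        for z0, z1 in zip(pts[:-1], pts[1:]):
            corner = complex(z1.real, z0.imag)   # end of h_j = start of v_j
            edges.append((z0, corner))           # horizontal edge h_j
            edges.append((corner, z1))           # vertical edge v_j
        return edges

    gamma = lambda t: np.exp(2j * np.pi * t)     # the unit circle
    print(grid_cycle_edges(gamma)[:2])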

Lemma 5.4.11. A grid cycle all of whose grid winding numbers are 0 is equivalent to 0.

Proof. By Lemma 3.3.2, the multiplicity of an edge e in a grid cycle is equal to the difference of the grid winding numbers on the two sides of e. So, if all grid winding numbers are 0, then all multiplicities are 0 also. □

Proof of Artin's criterion (Theorem 5.4.9). Suppose that Γ is a boundary. Then Lemma 5.4.7 shows that the winding numbers

wn(Γ, p) = (1/2πi) ∫_Γ 1/(z − p) dz

must vanish for p ∉ Ω since the integrand is a closed form by Lemma 5.3.3.

Suppose conversely that these winding numbers all vanish. By Lemma 5.4.10, there is no loss of generality in assuming that Γ is a grid cycle based on some grid G. Define a new grid cycle

(5.4.12)  Γ′ = Σ_R wn(Γ; R) [∂R],

where the sum is extended over all finite complementary rectangles R to G. Notice that if R is a fixed finite complementary rectangle, then

wn([∂R]; R′) = 1 if R′ = R, and 0 if R′ ≠ R.

Consequently, wn(Γ; R) = wn(Γ′; R) for all R, and therefore, according to Lemma 5.4.11, Γ − Γ′ is equivalent to 0. In particular, Γ and Γ′ are homologous. It remains to show that Γ′ is a boundary in Ω, and this will follow if we can show that every R appearing with nonzero coefficient


in (5.4.12) has closure contained in Ω. Indeed, suppose the closure of R contains a point p ∉ Ω. Then wn(Γ; p) = 0 by hypothesis. There is a ball B(p; ε) that does not meet Γ∗, and thus wn(Γ; q) = 0 for all q in this ball. Some of the points q are in the interior of R, so wn(Γ; R) = 0. This completes the proof. □

This proof is adapted from Ahlfors' book [2] (which is also the source of the attribution to Artin).

5.5. Cauchy's theorem

The results of the previous section combine to prove the most general form of a key result of complex analysis, which substantially extends the homotopy argument used in part (b) of the proof of Theorem 5.3.4.

Theorem 5.5.1 (Cauchy's theorem). Let Ω be an open subset of C, and let Γ be a cycle in Ω which has the property that wn(Γ; p) = 0 for all p ∉ Ω. Then

∫_Γ f(z) dz = 0

for every holomorphic function f : Ω → C.

Proof. By Artin's criterion (Theorem 5.4.9), Γ is a boundary. By Lemma 5.3.3, the 1-form f(z) dz is closed. Combining these two facts and applying Lemma 5.4.7, we obtain the result. □

This theorem is used throughout complex analysis. We'll only give one example here (see Exercise 5.7.14 for another). In Section 4.4, discussing generalizations of the Jordan curve theorem, we promised to prove that any nowhere-zero holomorphic function defined on a Jordan domain has a holomorphic square root there. We can now redeem that promise (indeed, we can do a little better).

Proposition 5.5.2. Let U be a connected open subset of C having the property⁹ that for every loop γ in U and every p ∈ C \ U, the winding number wn(γ; p) equals 0. Then every holomorphic function f : U → C \ {0} lifts (holomorphically) through the exponential map; i.e., there exists a holomorphic g : U → C such that e^g = f.

⁹ By Exercise 4.5.8 every Jordan domain has this property.


The existence of holomorphic square roots (which is what was needed for the Riemann mapping theorem) certainly follows from this: the function e^{g/2} gives such a square root.

Proof. Let z0 be some point of U, let ℓ0 be a complex number such that exp(ℓ0) = f(z0), and define g(z) for z ∈ U by

g(z) = ℓ0 + ∫_{z0}^{z} f′(z)/f(z) dz,

where the integral is taken along an arbitrary path in U from z0 to z. (Cauchy's theorem (Theorem 5.5.1) is invoked at this point: the integrand f′(z)/f(z) is holomorphic in U, so if γ1 and γ2 are two paths from z0 to z,

∫_{γ1} f′(z)/f(z) dz − ∫_{γ2} f′(z)/f(z) dz = 0

because this is the integral of a holomorphic function around a cycle in U obtained by traversing γ1 followed by the reverse of γ2.) By differentiating the integral we find g′(z) = f′(z)/f(z) and therefore

(d/dz)(e^{−g(z)} f(z)) = e^{−g(z)} (−g′(z) f(z) + f′(z)) = 0.

Thus e^{g(z)} is a constant multiple of f(z), and evaluating at z = z0 we see that the constant is 1. □
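The construction in this proof is directly computable. The sketch below (for illustration; f, the base point, and the target point are invented for the example, and straight segments are used as the paths) approximates g(z) = ℓ0 + ∫_{z0}^{z} f′/f dz by the midpoint rule and checks that e^{g(z)} = f(z):

    import numpy as np

    f = lambda z: z ** 2 + 3      # nowhere zero near the segment used below
    df = lambda z: 2 * z

    def log_lift(z, z0, ell0, n=100_000):
        # g(z) = ℓ0 + ∫_{z0}^{z} f'(w)/f(w) dw along the straight segment.
        t = np.linspace(0.0, 1.0, n, endpoint=False) + 1.0 / (2 * n)
        w = z0 + t * (z - z0)
        return ell0 + np.sum(df(w) / f(w)) * (z - z0) / n

    z0 = 1.0 + 0.0j
    ell0 = np.log(f(z0))          # exp(ℓ0) = f(z0)
    z = 2.0 + 1.0j
    g = log_lift(z, z0, ell0)
    print(np.exp(g), f(z))        # agree: e^{g(z)} = f(z)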

5.6. A glimpse at higher dimensions

The underlying message of this chapter is that differential 1-forms are useful in studying the "winding around" aspect of topology: simple-connectedness, winding numbers, and so on. The reason for this is that the difference between the properties "closed" and "exact" for 1-forms on an open set U in a vector space V reflects the lack of simple-connectedness of U (see Exercise 5.7.10). Now, as observed in Remark 5.2.5, the "difference between closed and exact" is measured by the quotient vector space

H¹(U) := Z¹(U)/B¹(U) = (Closed 1-forms on U)/(Exact 1-forms on U),


and one may ask whether this H¹ is the beginning of a sequence of quotient spaces that express "higher-dimensional topological structures" of U. It turns out that this is indeed the case. When V is n-dimensional, there is a sequence of vector spaces and maps d between them, i.e.,

Ω⁰(U) → Ω¹(U) → Ω²(U) → · · · → Ωⁿ(U),

called the de Rham complex of U. Here Ω⁰(U) is the space of smooth functions on U, Ω¹(U) is the space of 1-forms, and the first d is the one that we have already introduced. The subsequent Ωs are more complicated "multilinear" objects: for instance, Ω²(U) can be regarded as the space of functions which assign, to each point of U, a skew-symmetric bilinear functional on V × V.

In the sequence above it turns out that the composite of any two d operators is zero, or to put it another way, the kernel of each d contains the image of the preceding one. This is a generalization of the fact that exact 1-forms are closed (Proposition 5.2.3) and its proof, like the proof in the 1-form case, relies on Clairaut's theorem (Proposition E.3.4). Thus the definition

Hᵖ(U) := (kernel of d : Ωᵖ(U) → Ωᵖ⁺¹(U)) / (image of d : Ωᵖ⁻¹(U) → Ωᵖ(U))

generalizes our H¹(U) above. These Hᵖ are called the de Rham cohomology spaces of U and they are a vital tool in higher-dimensional topology.

One way to bring the above abstractions down to earth is to consider the case when V = R³. In this situation we can avail ourselves of a useful coincidence: the number of independent components of a skew-symmetric 3 × 3 matrix

[  0   a  −b
  −a   0   c
   b  −c   0 ],

namely 3, is the same as the number of components of a vector;¹⁰ and therefore, 2-forms can be identified with vector fields in this

¹⁰ This coincidence is also responsible for the existence of the classical vector product, another uniquely 3-dimensional construction.

5.7. Exercises

97

case. When we make this identification, the three d operators in the de Rham complex become the familiar operators grad, curl, and div of classical vector calculus, and the relations between kernel and image are also classical: curl grad ϕ = 0 and div curl V = 0. Remark 5.6.1. From Lemmas 5.4.4 and 5.4.7, it follows that integration gives a pairing H 1 (U ) × H1 (U ) → R, where the first homology group H1 (U ) is defined to be the quotient of the cycles by the boundaries. This duality structure also generalizes to higher dimensions and relates the analytically defined de Rham spaces to the combinatorics of cycles and boundaries. For much more about differential forms and cohomology, I recommend the book [26].

5.7. Exercises Exercise 5.7.1. Suppose that V = Rn and that x1 , . . . , xn : V → R are the standard coordinate functions. Show that for any point x, the elements dx1 (x), . . . , dxn (x) ∈ V ∗ form a basis for V ∗ . Deduce that any 1-form α on V can be written α = f1 dx1 + · · · + fn dxn for suitably chosen smooth functions fi : V → R. Exercise 5.7.2. Suppose that V = Rn with coordinate functions x1 , . . . , xn : V → R as in the previous exercise. Let f : V → R be a smooth function. Show that n  ∂f df = dxi . ∂x i j=1 (Hint: Fix an (n − 1)-tuple of coordinates, say (x2 , . . . , xn ), and let i1 : R → V be defined by sending x ∈ R to the column vector (x, x2 , . . . , xn )T , where the superscript T denotes transpose. Compute the derivative of f ◦ i1 . Using the chain rule, identify this derivative with the coefficients of dx1 in df . Repeat for the other coordinate subscripts 2, 3, . . ..) Exercise 5.7.3. Let V = R2 with coordinate functions x and y. Show that the 1-form x dy is not the gradient of any function.

98

5. Integrals and the Winding Number

Exercise 5.7.4. Let U1 , U2 be open subsets of finite-dimensional vector spaces V1 , V2 , respectively. A diffeomorphism f : U1 → U2 is a smooth bijection whose inverse is also smooth. Show that if such a diffeomorphism exists, then dim V1 = dim V2 . (Hint: Use the chain rule to show that the derivative of a diffeomorphism (at any point) must be an invertible linear map.) This easy proof that diffeomorphisms preserve dimension should be contrasted with the difficulty of proving that homeomorphisms do so (compare Exercise 4.5.9), a difficulty which ultimately arises from nonsmooth examples such as the Peano curve (Example 2.1.6). Exercise 5.7.5. Show that a smooth map between open intervals is a diffeomorphism if and only if it is onto and the derivative f  (t) is nonzero and of constant sign (either > 0 for all t or < 0 for all t: the first case is called orientation-preserving and the second is called orientation-reversing). By considering the map t → t3 from (−1, 1) to itself, show that a smooth and bijective map need not be a diffeomorphism. Exercise 5.7.6. If the smooth path γ is regular (see the beginning of Section 3.4 for the definition), show that any smooth reparameterization of γ is regular too. Exercise 5.7.7. Write out a detailed proof of Proposition 5.1.16 without using the language of pullbacks and functionality, only using the “Calculus I” definition of path integral. (This is a good way to understand just what is “under the hood” of such language.) Exercise 5.7.8. Verify the following properties of 1-forms: (a) (Integration by parts for 1-forms) Let γ be a smooth loop in a vector space V and let f, g be smooth functions defined in a neighborhood of γ ∗ . Prove that   f dg = − g df. γ

γ

(b) (Chain rule for 1-forms) Let f be a smooth function of x1 , . . . , xk , where x1 , . . . , xk are themselves smooth functions on a vector

5.7. Exercises

99

space V . Prove that df =

k  ∂f dxi . ∂x i i=1

Exercise 5.7.9. Let Ω be an open subset of V . A 1-form α on Ω is said to have some property locally if, for every p ∈ Ω, there is an open U ⊆ Ω with p ∈ U such that the restriction of α to U has that property. Show that a locally closed form must be closed but that a locally exact form need not be exact. Exercise 5.7.10. Let Ω ⊆ V be a simply connected open set. Prove that every closed 1-form α defined on Ω is exact. (Hint: Fix a base point x0 ∈ Ω and define a smooth function  x f (x) = α x0

where the integral is taken along any path in Ω from x0 to x — use Theorem 5.2.8 to argue that the choice of path does not matter. Show that α = df .) Exercise 5.7.11. Show that the first de Rham cohomology is functorial : a smooth map ϕ : Ω → Ω induces a linear map ϕ∗ : H 1 (Ω) → H 1 (Ω ). Also show that homology is functorial: ϕ induces a homomorphism ϕ∗ : H1 (Ω ) → H1 (Ω). (Notice the opposite directions of the two maps.) Can you relate these functorial maps? Exercise 5.7.12. An entire function is a function f : C → C that is represented by an everywhere convergent power series f (z) =

∞ 

an z n ,

n=0

for some coefficients an ∈ C. Prove that  f (z) an = dz, n+1 γ 2πiz where γ is a circular path around the origin (of any given radius). Deduce that no nonconstant entire function can be bounded (Liouville’s theorem: note that the example of sin x = x−x3 /3!+· · · shows

100

5. Integrals and the Winding Number

−1

1

Figure 5.3. The loop γ for Exercise 5.7.15.

that this statement is not true if we restrict attention to real variables only). Exercise 5.7.13. We can use Liouville’s theorem to give another proof of the fundamental theorem of algebra (see Section 3.5). Let p be a polynomial and suppose, for a contradiction, that p does not have any roots. Then f (z) = 1/p(z) defines an entire function. Show that f is bounded, and therefore (by Liouville’s theorem) f , and thus p, must be constant. Exercise 5.7.14. Let f : Ω → C be a holomorphic function, and let Γ be a cycle in Ω that is nullhomologous (that is, wn(γ; p) = 0 for all p∈ / Ω). Show that for any w ∈ C,  f (z) 1 dz = wn(Γ; w)f (w). 2πi Γ z − w (Hint: Show that Γ is homologous in Ω \ {w} to the cycle Γε that travels wn(Γ; w) times around a circular loop of radius ε around w. Apply Cauchy’s theorem to the function f (z)/(z − w) and the cycle [Γ] − [Γ ]; then let ε → 0.) Exercise 5.7.15. Let Ω = C \ {−1, 1}, and let γ be the loop in Ω that is depicted in Figure 5.3. Show using the homology form of  Cauchy’s theorem that γ f (z) dz = 0 for any holomorphic function f on Ω. Nevertheless, γ is not homotopic in Ω to a constant loop (we will show this rigorously in Chapter 8), and thus a homotopy form of Cauchy’s theorem is insufficient for a proof.

Chapter 6

Vector Fields and the Rotation Number

6.1. The rotation number Let γ be a regular smooth loop in the plane — we recall from Section 3.4 that regular means γ  (t) = 0 for all t ∈ [0, 1]. Then for each t, γ  (t) ∈ C \ {0} and γ  (0) = γ  (1). In other words, γ  is itself a loop in C \ {0}. Definition 6.1.1. The rotation number of γ, denoted rot(γ), is the winding number of the loop γ  about 0. The rotation number thus measures the number of times that the tangent vector to γ turns around 0. That is not necessarily the same as the number of times that γ itself turns around 0 — see Figure 6.1 for an illustrative example. When studying the rotation number it is often convenient to consider the unit tangent vectors to γ. Recall (Remark 3.2.8) that, for any nonzero complex number w, the symbol υ(w) denotes w/|w|, the unit vector in the direction of w. (It is sometimes convenient to think of υ(w) as “measuring the angle” between w and 1.) The map υ sends C \ {0} continuously onto the unit circle S 1 , and we noted in Remark 3.2.8 that for any loop γ in C \ {0}, the winding numbers (around 0) of the loops γ and υ ◦ γ are the same. In particular we can 101

102

6. Vector Fields and the Rotation Number

Figure 6.1. A loop with winding number +1 about 0 and rotation number +2.

(and often will find it convenient to) think of the rotation number of γ as the winding number of t → υ(γ  (t)). The winding number and rotation number are not the same, as we have seen; but there are important special cases where they agree. Definition 6.1.2. The smooth loop γ is monotonic around p ∈ C if no ray through p is tangent to γ, that is, if the complex numbers γ(t) − p and γ  (t) are always linearly independent over R. Proposition 6.1.3. If the smooth loop γ is monotonic about p, then wn(γ, p) = rot(γ). Proof. Consider the linear homotopy   h(s, t) = (1 − s) γ(t) − p + sγ  (t). This joins γ − p and γ  and never passes through 0 (since h(s, t) = 0 would give an R-linear dependence between γ(t) − p and γ  (t)) so it shows that γ − p and γ  have the same winding number about 0.  Recall from Section 4.2 that a Jordan curve is a loop that does not intersect itself: the points γ(t) and γ(t ), for t < t, must always be distinct (except when t = 0 and t = 1). The beautiful proof of the following result about the rotation number of a Jordan curve is due to Heinz Hopf.

6.1. The rotation number

103

Proposition 6.1.4 (Theorem of the turning tangent). If γ is a smooth Jordan curve, then rot(γ) = ±1. Proof. By compactness, there is some point on the curve that minimizes the imaginary part of γ(t). Without loss of generality, let us take it that this minimum occurs at t = 0 and that γ(0) = 0. Thus Im γ(t)  0 for all t. Recall that for a nonzero complex number w, we define υ(w) = w/|w|. Consider the function (the secant map) defined for 0  t  t  1 by ⎧    ⎪ ⎪ ⎨υ(γ(t) − γ(t )) if t > t , (t , t) = (0, 1),  σ(t , t) = υ(γ  (t)) if t = t , ⎪ ⎪ ⎩−υ(γ  (0)) if t = 0, t = 1. Expressed in words, this means that σ(t , t) is the unit vector in the direction from γ(t ) to γ(t) (well-defined because the curve does not intersect itself), with appropriate interpretation when t = t (as the unit tangent vector at t) and when t = 0, t = 1 (now γ(t ) = γ(t) as we have gone “all the way around” the loop and the appropriate interpretation is minus the unit tangent vector at 0). I claim that σ is a continuous function on the closed triangle 0  t  t  1. Let us grant that claim for a moment, finish the proof, then come back to the claim. We construct a homotopy of loops in S 1 as follows:  σ((1−s)t,(1+s)t) (0  t  12 ), h(s, t) = σ((1−s)(1−t)+(2t−1),(1+s)(1−t)+(2t−1)) ( 21  t  1). This homotopy is illustrated in Figure 6.2: as t runs from 0 to 1, it follows the values of σ along the line segment from (0, 0) when t = 0 to ( 12 (1 − s), 12 (1 + s)) when t = 12 and then along the line segment from there to (1, 1) when t = 1. When s = 0, this is the loop of unit tangent vectors to γ, so its winding number is precisely rot(γ). When s = 1, we are traversing the two line segments from (0, 0) to (0, 1) and then to (1, 1). Now use the fact that γ lies in the upper half-plane and γ(0) = 0. This tells us first of all that γ  (0) is real, so that h(1, 0) = h(1, 1) =

104

6. Vector Fields and the Rotation Number (0, 1) (1, 1) ( 12 (1 − s), 12 (1 + s))

(0, 0) Figure 6.2. The homotopy used to prove the theorem of the turning tangents.

−h(1, 12 ) = ±1. Second, it tells us that the loop t → h(1, t) must lie in the upper half-plane for 0  t  12 and the lower half-plane for 12  t  1. But now Rouch´e’s theorem (Proposition 3.1.4) tells us that this loop is homotopic to the loop t → e2πit (if h(1, 0) = 1) or t → −e−2πit (if h(1, 0) = −1). Thus it has winding number ±1. Now the homotopy h shows that this winding number is the same as the winding number of t → h(0, t), which is the rotation number of γ, as we remarked. This completes the proof except for verifying the claimed continuity of the secant map σ. To check the continuity, define 

γ(t)−γ(t ) t−t 



g(t , t) =

γ (t)

(t < t ), (t = t ),

so that σ(t , t) = υ(g(t , t)) except for the special case (t , t) = (0, 1) which we will handle separately. It will be enough to show that g is a continuous function of (t , t) and the only points at which there is any doubt are those on the diagonal t = t . Consider then a diagonal point (a, a). Given ε > 0 there is δ > 0 such that |γ  (t) − γ  (a)| < ε for t ∈ (a − δ, a + δ). Now write  1 [γ  ((1 − u)t + ut ) − γ  (a)] du g(t , t) − g(a, a) = 0

6.2. Curvature and the rotation number

105

using the fundamental theorem of calculus. If t , t ∈ (a − δ, a + δ), then the absolute value of the quantity under the integration sign is less than ε for each u. Thus |g(t , t) − g(a, a)| < ε and this establishes continuity. At the point (0, 1) the proof uses the same idea. See Exercise 6.5.5 for some supporting details.  Remark 6.1.5. One sees from the proof that the sign (±) of the rotation number depends on whether the curve γ is traversed in the “positive” or “negative” direction, where by the “positive direction” we mean that direction for which the interior of γ is to the left of γ  .

6.2. Curvature and the rotation number The rotation number is closely related to the curvature studied in differential geometry. Let us recall a few definitions. Let γ be a smooth, regular path in the plane. For the purposes of this discussion, we allow the parameter interval to be of arbitrary length — that is, we allow γ to be a map [0, d] → C, rather than [0, 1] → C. Definition 6.2.1. The path γ is a unit speed path if |γ  (t)| = 1 for all t. It is a simple theorem that every regular smooth path can be smoothly reparameterized (Definition 5.1.15) so as to become a unit speed path. This can be achieved by using the arclength  t |γ  (t)|dt (6.2.2) s(t) = 0

as the new parameter. For this reason a unit speed path is also called a path parameterized by arc length, and we will usually use the letter s to denote the parameter for such a path. Let γ be a smooth unit speed path. Then s → γ  (s) is a smooth map [0, d] → S 1 . According to Proposition 3.2.1 there exists a function s → θ(s) such that γ  (s) = eiθ(s) ; the function θ is a “smooth choice of tangent angle” at every point. Definition 6.2.3. The curvature κ of a unit speed path γ at s is the derivative θ  (s): it measures how fast the tangent angle is turning with distance.

106

6. Vector Fields and the Rotation Number

Example 6.2.4. The curvature of a circular path of radius r is equal to ±1/r (the sign dependent on the orientation). From our definitions we have Proposition 6.2.5. If γ : [0, d] → C is a smooth loop with a unit speed parameterization, then  d 1 rot(γ) = κ(s)ds; 2π 0 the rotation number is equal to (2π)−1 times the total curvature. d Proof. Since κ(s) = dθ/ds (in the notation above), we have 0 κ(s)ds = θ(d) − θ(0). But, by Definition 3.2.4, this is just 2π times the wind ing number of γ  . Definition 6.2.3 has the disadvantage of apparently depending on a particular choice of parameterization (a unit speed parameterization). We can use the technology of 1-forms to give a more canonical approach. Definition 6.2.6. Let γ be a regular smooth path in the plane, The curvature 1-form of γ is the 1-form ωγ = Im(γ  (t)/γ  (t)) dt defined on γ. (Here γ  and γ  are complex numbers, and Im denotes the imaginary part.) Lemma 6.2.7. The curvature 1-form is unchanged by smooth reparameterization (Definition 5.1.15) of the path γ. If the parameterization is a unit speed one, then ω = κ ds, where κ is the curvature defined above for unit speed curves. Proof. Let t and t be two parameterizations for γ. We have, by the chain and product rules,  2 dγ dγ dt d2 γ dγ d2 t d2 γ dt =  , =  2 + 2 dt dt dt dt2 dt dt dt dt and so

d2 γ  dγ d2 t  dt = + dt2 dt dt2 dt



d2 γ  dγ dt 2 dt



dt . dt

6.3. Vector fields and singularities

107

The first term in the second display is real, so when we take imaginary parts it vanishes and we get  2    2   d γ dγ d γ dγ dt , Im dt = Im dt2 dt dt 2 dt which shows that the definition of ω is independent of the parameterization. For the second part of the lemma, consider a unit speed parameterization with γ  (s) = eiθ(s) ; then γ  = ieiθ θ  and the defini tion of ω becomes ω = Im(iθ  ) ds = κ ds, as required. The invariant version of Proposition 6.2.5 is then Proposition 6.2.8. Let γ be a smooth, regular loop. Then  1 rot(γ) = ωγ ; 2π γ the rotation number is (2π)−1 times the total curvature. In particular, the total curvature of a smooth Jordan curve is ±2π, the sign depending on whether the curve is traversed in the positive or negative direction. Proof. Since the right-hand side does not depend on the choice of parameterization, by Lemma 6.2.7, we may assume that the loop γ is parameterized by arc length. Then the right side is (2π)−1 κ ds, which equals the rotation number by Proposition 6.2.5. For an alternative proof see Exercise 6.5.10. 

6.3. Vector fields and singularities Imagine that we want to compute the rotation number of a large loop that is laid out on the surface of the earth. We walk around the loop, measuring at each point the angle θ that the tangent vector makes with north (let’s say) and then integrating the curvature κ = dθ/ds around the loop. To find out which direction is north, we carry a magnetic compass. What’s wrong with this picture? For some curves our calculation of the rotation number will give the “correct” answer. For others, like a meridian of latitude encircling the (magnetic) north pole, it

108

6. Vector Fields and the Rotation Number

will look as though we have a Jordan curve with rotation number zero! What is happening is that we are using a vector field to describe our “reference” direction — here, the vector field which gives the direction in which the compass needle points — and that vector field can have “singularities” (like that of the earth’s magnetic field at the north pole) which themselves contribute to the rotation number. In this section we will work out the details of this idea. Let V be a finite-dimensional vector space (we will usually be thinking about V = C, considered as a 2-dimensional vector space over R) and let Ω be an open subset of V . Definition 6.3.1. A tangent vector to Ω at p is a pair (p, v) consisting of a point p ∈ Ω and a vector v ∈ V . We think of a tangent vector at p as a little arrow whose origin is at p and which is pointing in the direction indicated by v. Definition 6.3.2. A vector field X on Ω is a continuous (usually smooth) function which assigns to each point p ∈ Ω a tangent vector at that point. To visualize a vector field, therefore, imagine attaching a little arrow to each point p ∈ Ω, varying smoothly with p. You might end up with something like a weather map indicating wind strengths and directions. It is common practice to write X(p) for the vector v by itself (rather than the ordered pair (p, v)) when this will not cause confusion. Definition 6.3.3. A singularity of the vector field X is a point z such that the vector X(z) = 0 (the little arrow has zero length). The vector field has isolated singularities if the set of singularities is discrete, and it is nonsingular if the set of singularities is empty. From now on we will focus on vector fields with isolated singularities, defined on open subsets of V = C. Let X be such a vector field, defined on Ω ⊆ C, that has an isolated singularity at a. Then, for small ε, the function t → X(a + εe2πit ), t ∈ [0, 1], is a loop in C \ {0}. Varying ε changes this loop by a homotopy, so the winding number of the loop is independent of ε (as long as ε is small enough).

6.3. Vector fields and singularities

Index 1

Another index 1 example

109

Index -1

Figure 6.3. Vector fields with various singularities. The black dot denotes the basepoint p of each vector and the line denotes its magnitude and direction. The singularity is the point in the center of each picture where the vector field vanishes.

Definition 6.3.4. The above winding number is called the index (or degree) of the isolated singularity at a and is denoted ind(X, a). In favorable cases the index can be recovered from “infinitesimal” data about the vector field near the singularity. Proposition 6.3.5. Let X := u + iv be a smooth vector field having an isolated singularity at the point z := x + iy = a (say). Suppose that the Jacobian ∂u/∂x ∂u/∂y J= ∂v/∂x ∂v/∂y is nonzero at the singularity z = a. Then the index of the singularity is ±1, according to the sign of J. Proof. Without loss of generality assume that a = 0. Let M denote the 2 × 2 matrix whose determinant is the Jacobian; that is,   ∂u/∂x ∂u/∂y M= . ∂v/∂x ∂v/∂y (x,y)=(0,0) By Definition E.2.4 we can write the path whose winding number gives the index as follows:     u(ε cos(2πt), ε sin(2πt)) cos 2πt = εM + e(t), v(ε cos(2πt), ε sin(2πt)) sin 2πt

110

6. Vector Fields and the Rotation Number

where the error term e(t) has the property that ε−1 e(t) → 0 as ε → 0. Since M is an invertible matrix, the first term on the righthand side has norm greater than Cε for all t, ε, where C is a constant (actually the norm of the matrix M −1 ). Thus for ε small enough the norm of the first term on the right-hand side is strictly greater than that of the second one, and so by Rouch´e’s theorem (Proposition 3.1.4) the index is equal to the winding number of the path   cos 2πt t → M . sin 2πt But this is just an ellipse, traversed in the positive or negative direction according to the sign of the determinant of M , and so has winding number ±1.  Remark 6.3.6. In fact one can recover the index from the infinitesimal data at any isolated singularity, even one where the Jacobian is singular. The remarkable algebraic formula which allows one to do this was discovered only in the 1970s. See [17]. Let X be a vector field, with isolated singularities, defined on an open subset Ω ⊆ C. Let γ be a loop in Ω which does not pass through any singularities of X. Then t → X(γ(t)) is a loop in C \ {0}. If γ is a small circle surrounding a singularity, then the winding number of this loop is, by definition, the index of the singularity (Definition 6.3.4). But there is no need to restrict our attention to that case, and we can make the more general definition below. Definition 6.3.7. In the situation above, we will call the winding number of X(γ(t)) the rotation number of X around γ, and we will denote it by rot(X; γ). We may extend this definition to cycles Γ in Ω as in Remark 5.4.8. Remark 6.3.8. The rotation number of a smooth loop γ, as we have defined it in Definition 6.1.1, appears in terms of the above definition to be rot(γ  ; γ), the “rotation number of the vector field γ  around γ”. This does not make literal sense without a bit of work — since γ  is defined only on γ itself, not on an open set Ω — but it is a suggestive connection.

6.3. Vector fields and singularities

111

Theorem 6.3.9. Let Ω be an open subset of C, and let Γ be a nullhomologous cycle1 in Ω. Suppose that X is a nonsingular smooth vector field defined on Ω. Then rot(X; Γ) = 0. Recall that, by Artin’s criterion (Theorem 5.4.9), Γ is nullhomologous if and only if wn(Γ; p) = 0 for all p ∈ Ω. Proof. First, observe that since X is nonsingular, we may without loss of generality assume that it is a unit vector field, that is, |X(z)| = 1 for all z ∈ Ω. Indeed, the unit vector field υ(X) is well-defined, smooth, and nonsingular and it has the same rotation number as X by Remark 3.2.8. Now let X = u + iv, u, v real, be a smooth unit vector field. Define a 1-form ω by     ∂u ∂v ∂v ∂u d(u + iv) = −v +u +u dx + −v dy. ω= i(u + iv) ∂x ∂x ∂y ∂y To see where this formula comes from, notice that for each p ∈ Ω there is a disc D = D(p; ε) ⊆ Ω. Since this disc is contractible, Corollary 3.1.6 tells us that there is a smooth real-valued function θ on D with u + iv = eiθ = cos θ + i sin θ. Then the form ω above is equal, on D, to dθ. In particular ω is closed on a neighborhood of p; and, since p was arbitrary, it follows that ω is closed2 . Moreover, by the integral formula for the winding number (Theorem 5.3.4),  1 rot(X; γ) = ω. 2π Γ Since ω is closed and Γ is a boundary in Ω, Lemma 5.4.7 implies that  ω = 0. The result now follows.  Γ Corollary 6.3.10. Suppose that X is a smooth vector field with isolated singularities on Ω, and let Γ be a nullhomologous cycle in Ω. 1 2

See Section 5.4 for the language of cycles and homology. But it need not be exact! See Exercise 5.7.9.

112

6. Vector Fields and the Rotation Number

Then rot(X; Γ) =



wn(Γ; pk ) ind(X; pk ),

k

where the sum is taken over all singularities pk . Proof. For each singularity pk , let γk denote a small circle surrounding it. Let Ω = Ω \ {pk } denote Ω with the singularities removed. Let Γ be the cycle in Ω defined by  Γ = wn(γ; pk )γk . k 

Then Γ and Γ have the same winding numbers around all points not in Ω , on which region X  is nonsingular. Thus by Theorem 6.3.9, applied to the cycle [Γ] − [Γ ],  rot(X; Γ) = rot(X; Γ ) = wn(γ; pk ) rot(X; γk ), k

and rot(X; γk ) = ind(X; pk ) by definition.



Just for fun, let’s use this to prove the Brouwer fixed-point theorem again. Theorem 6.3.11. Let D denote the closed unit disc {z ∈ C : |a|  1}. Any continuous map f : D → D must have a fixed point, that is, a point z0 such that f (z0 ) = z0 . Proof. Suppose not. Then define a vector field3 X as follows: X(z) is the vector f (z) − z, i.e., the vector that points from z to f (z). This is a vector field without singularities (by assumption) and from any boundary point z ∈ S 1 it always points inwards. Let t → γ(t) = e2πit parameterize the boundary circle. Then for all t, we have Im(γ  (t)/X(γ(t))) < 0; this just translates the statement that X(γ(t)) points inwards. By Rouch´e’s theorem (Theorem 3.1.4), the winding number of γ  (that is, the rotation number of γ) equals the winding number of X ◦ γ (that is, the rotation number of X). But the first of these is ±1 (Proposition 6.1.4) and the second is 0 (Theorem 6.3.9), so this is a contradiction.  3

Extend it to the exterior of the disc by setting X(reiθ ) = X(eiθ ) for r > 1.

6.4. Vector fields and surfaces

113

Figure 6.4. Some surfaces.

6.4. Vector fields and surfaces In this final section we are going to use the ideas of this chapter to sketch the proof of a famous theorem about vector fields on surfaces. The kinds of surfaces that we have in mind are called in mathematics closed, oriented, smooth surfaces of which standard examples are the sphere, the torus, and the double torus (Figure 6.4). One way to study the topology of a surface S is to subdivide it into polygons, or faces, meeting along edges and at vertices. For example, a standard soccer ball subdivides the surface of a sphere into 32 faces (12 pentagons and 20 hexagons), with 90 edges and 60 vertices. The quantity χ(S) = V − E + F, where V , E, and F denote the numbers of vertices, edges, and faces, respectively, is called the Euler characteristic of the surface. It does not depend on the way the surface is subdivided and is equal to 2 for the sphere, 0 for the torus, −2 for the double torus, and in general to 2 − 2g where g is the genus or “number of holes” of S. A surface may also be considered as the domain of a tangent vector field (one whose direction is everywhere tangential to the surface). When you start sketching these you soon find that a tangent vector field on the sphere seemingly must have a singularity somewhere (as the “latitudinal” vector field has singularities at the north and south poles) whereas a tangent vector field on the torus can be nonsingular

114

6. Vector Fields and the Rotation Number

Figure 6.5. Nonsingular vector field on a torus. Creative Commons Attribution-ShareAlike 3.0.

(see Figure 6.5). These observations are systematized and generalized by the following famous theorem of Hopf. Theorem 6.4.1 (Hopf index theorem). Let X be a smooth tangent vector field on a compact oriented surface S, with isolated singularities {p1 , . . . , pn }. Then the sum of the indices of the singularities, n 

ind(X, pk ),

k=1

depends only on S (and not on the vector field ); moreover, it is equal to the the Euler characteristic χ(S) of the surface S. Sketch of the proof. The basic idea is to apply Proposition 6.2.8 to the “boundary curves” of each of the faces in a subdivision and then add up the results. There are two fundamental obstacles to this plan. (a) Proposition 6.2.8 applies to smooth Jordan curves, but the boundaries of faces are only piecewise smooth — the tangent vector “jumps” at the vertices. (b) Proposition 6.2.8 involves the curvature, that is, the rate of change of the angle that γ  makes with a fixed “reference direction”. But, on a surface, there is no globally defined choice of “reference direction”.

6.4. Vector fields and surfaces

115

It is not too hard to see how one should resolve obstacle (a). Suppose that we consider regular piecewise smooth loops: a loop γ : [0, 1] → is regular piecewise smooth if it is continuous and if there exist finitely many parameter values 0 = a0 < a1 < · · · < am = 1 such that (i) the map γ is smooth on each subinterval [ai , ai+1 ], and its derivative γ  is nonzero there; (ii) (no cusps) at each breakpoint ai the tangent vectors γ  (a− i ) :=

lim

u→0,u0

γ  (ai + u)

are not in exactly opposite directions. For such a curve γ the derivative γ  (t) now traces out a series of arcs in C \ {0} as t runs over the parameter intervals [ai , ai+1 ]. The endpoint  + γ  (a− i ) of one arc need not be the same as the starting point γ (ai ) of the next arc, but because of the no-cusps condition, the straight line  + path from γ  (a− i ) to γ (ai ) lies in C \ {0}. Thus, by joining up the  arcs γ|[ai−1 ,ai ] with line segments, we obtain a (continuous) loop in C\{0}, and we can define the rotation number of the regular piecewise smooth loop γ to be the winding number of this loop. Clearly, if γ is actually smooth, this agrees with Definition 6.1.1. Remembering that the curvature is supposed to keep track of the angular change in the tangent vector γ  , it is not hard to see that Proposition 6.2.8 generalizes to the following statement for regular piecewise smooth curves: ⎛ ⎞   1 ⎝ ωγ + θj ⎠ , (6.4.2) rot(γ) = 2π γ|[a ,a ] i j i

i+1

where ω is the curvature 1-form as before and θj ∈ (−π, π) is the external angle at the jth vertex, that is, the angle between γ  (a− j ) and γ  (a+ ). In particular, for a regular piecewise smooth Jordan j curve, oriented in the positive direction, the quantity appearing in (6.4.2) is equal to 1. Now let’s turn our attention to obstacle (b) above, and here we convert the obstacle from a bug to a feature by using the vector field X to provide the missing choice of “reference direction”. In other words, we define “relative curvature” ω X and so on by measuring the

116

6. Vector Fields and the Rotation Number

rate of change of the angle between γ  (t) and X(γ(t)) (rather than the rate of change of either of these quantities individually). This makes sense provided that all the singularities of X are in the interiors of faces (which can easily be arranged by perturbing the subdivision a bit); however, it now introduces a further change into (6.4.2). By integrating the relative curvature, we will obtain, not the absolute rotation number +1 of a boundary curve γ, but the difference between this quantity and the rotation number of X around γ as computed in Corollary 6.3.10. In other words, we will have for each γ that bounds a face F (considered as a regular piecewise smooth curve), ⎛ ⎞     1 ⎝ ind(X; pk ) = ωγX + θj ⎠ , (6.4.3) 1 − 2π γ|[a ,a ] i j pk ∈F

i

i+1

with the θj denoting external angles as before. It is this last equation that we will sum over all faces F . When  we do that, we will obtain F − ind(X; pk ) on the left. What will happen on the right? Each edge will appear twice, once as the edge of its left-hand face and once as the edge of its right-hand face, and the orientations of these two occurrences will be opposite. So the sum of all the edge integral contributions, from all faces, will be zero. But the sum of the vertex contributions will not be zero. To see what we will get here, let ϕj = π − θj be the internal angle corresponding to the external angle θj . Then for face F ,   θj = e(F )π − ϕj , j

j

where e(F ) is the number of vertices (equal to the number of edges) of face F . If we sum this expression over all faces, the first term on the right sums to 2πE (because each edge appears twice in the sum, contributing π each time) and the second term on the right sums to −2πV (because the sum of all the external angles at any given vertex is 2π). Consequently, after summing over all faces we get from (6.4.3)  F− ind(X; pk ) = E − V, which is Hopf’s theorem.



6.5. Exercises

117

6.5. Exercises Exercise 6.5.1. Give a construction that produces, for each m, n ∈ Z, a regular smooth loop in C \ {0} with winding number m around 0 and rotation number n. Exercise 6.5.2. Generalize Proposition 6.1.3 by proving that the difference wn(γ, p) − rot(γ) is in general equal to the number of rays through p that are tangent to γ, counted with appropriate signs. (“In general” refers to a suitable transversality hypothesis.) Exercise 6.5.3. Let γ0 and γ1 be regular smooth loops in Ω (an open subset of C). A homotopy h between them is called a regular homotopy if, for each s ∈ [0, 1], the loop γs defined by γs (t) = h(s, t) is also regular and smooth. Give an example of two loops that are homotopic but not regularly homotopic. Exercise 6.5.4. Show that two regular smooth loops in C are regularly homotopic if and only if they have the same rotation number. This is the Whitney-Graustein theorem; see [39]. To prove the “if” part, try to integrate a homotopy on the level of derivatives (γ  ) to a homotopy on the level of the curves themselves (γ). Exercise 6.5.5. In the proof of Proposition 6.1.4 we left to the reader the verification that the secant map is continuous at (0, 1). Here is one way to approach this. Define a new smooth path θ by  γ(s + 12 ) (0  s  12 ), θ(s) = γ(s − 12 ) ( 21 < s  1) (smoothness at s = 12 follows from the regular loop condition for γ at t = 0). Then define for t > 12 and t < 12 , ⎧ ⎨ θ(t− 12 )−θ(t + 12 ) (t − t < 1),  t−t −1 h(t , t) = ⎩γ  (0) (t = 1, t = 0). Show that h is continuous and that υ(h(t , t)) = −σ(t , t) for all t, t .

118

6. Vector Fields and the Rotation Number

Exercise 6.5.6. A different proof of the theorem of turning tangents (Proposition 6.1.4) can be given by way of polygonal approximation. In this exercise we’ll develop that proof, following some lecture notes of M. Ghomi. (a) A polygon P in the plane is called whisker-free if no two successive edges are in exactly opposite directions. (This is the equivalent for polygons of the no-cusps requirement for regular piecewise smooth curves.) The external angles of a whisker-free polygon are then well-defined in (−π, π). Show that the sum of the external angles of a whisker-free Jordan polygon is ±2π, with sign dependent on orientation. (Hint: Use induction on the number of vertices. For the induction step, show that if there are more than 3 vertices, then one can always be removed in such a way that the truncated polygon is still a whisker-free Jordan one.) (b) Let γ be a regular smooth Jordan curve, parameterized by arc length: γ : [0, d] → C. Define the nth approximating polygon Pn to be the polygon with vertices γ(kd/n), k = 0, . . . , n − 1. Show that when n is large enough, Pn is a whisker-free Jordan polygon and that its kth side (the side with vertices γ((k − 1)d/n) and γ(kd/n)) is parallel to γ  (sk ) for some sk ∈ [(k − 1)d/n, kd/n]. (c) Deduce using the definition of an integral as the limit of a sum that the sum of the external angles of the polygon Pn tends to the total curvature,  d κ(s)ds, 0

as n tends to infinity. Thus obtain another proof of the theorem of the turning tangent. Exercise 6.5.7. Let γ be a regular smooth plane curve, possibly with finitely many transverse self-intersections. An uncrossing move modifies γ in the neighborhood of a self-intersection, as shown in Figure 6.6. Show that an uncrossing move does not change the rotation number. Deduce that the rotation number of γ is equal to the number of anticlockwise loops minus the number of clockwise loops obtained after we have uncrossed all self-intersections.

6.5. Exercises

119

Figure 6.6. Uncrossing move.

Exercise 6.5.8. With the notation of the previous problem, give each self-intersection point p a sign w(p) according to the following scheme: assume that γ(0) = γ(1) is not a self-intersection and, if γ(a) = γ(b) with 0 < a < b < 1, give a self-intersection point a + sign if the pair (γ  (b), γ  (a)) is a right-handed basis (i.e., γ  (a) lies counterclockwise from γ  (b)) and a − sign if not. Show that  rot(γ) = w(p) + wn(γ, q  ) + wn(γ, q  ), p

where the sum runs over the self-intersections and q  , q  are points in the two cells of C \ γ ∗ that are adjacent to the base point γ(0) = γ(1). (Use an induction argument on the number of self-intersections.) Exercise 6.5.9. Show that if a path in the plane has constant curvature 1/r, it is (part of) a circle of radius r. Exercise 6.5.10. Give an alternate proof of Proposition 6.2.8 by applying the integral formula for the winding number (Theorem 5.3.4) to the loop γ  . Exercise 6.5.11. A smooth curve γ : [0, 1] → C in the plane has the property that the distance from the origin, |γ(t)|, achieves its maximum value R at t = a, where 0 < a < 1. Prove that the absolute value of its curvature κ at t = a is at least R−1 . Is there a corresponding theorem for the point where |γ(t)| achieves its minimum value? Exercise 6.5.12. Show that the formula z → az n , where a = 0 is a constant, gives a vector field with an index n singularity at 0 and that the formula z → a¯ z n gives a vector field with an index −n singularity. Identify choices of a and n leading to the pictures in Figure 6.3. Draw similar pictures for index 2 and index −2 singularities.

120

6. Vector Fields and the Rotation Number

Exercise 6.5.13. A class is studying the Hopf index theorem (Theorem 6.4.1). (a) A misguided student argues as follows: “On the complex plane C we can consider a constant vector field, which has no singularities. Identify C with the sphere by stereographic projection; then we’ll get a vector field on the sphere with no singularities. This contradicts Hopf’s theorem since the Euler characteristic of the sphere is 2.” Find the student’s mistake, and draw a picture to show that, in fact, his example is consistent with Hopf’s theorem. (b) The Euler characteristic of the torus is 0, so it is consistent with Hopf’s theorem that there exists a vector field on the torus without singularities. Can you help the students give an example of such a vector field? (c) The students are arguing about whether there is a compact oriented surface (without boundary) that has Euler characteristic 0 but does not admit a nonsingular vector field. Some say “yes”; others say “no”. It turns out that some of them are assuming that a surface has to have an additional topological property besides those that were explicitly mentioned, and others are not. What is this property? Exercise 6.5.14. Study the paper [31] regarding the topology of “ridge patterns” (such as those appearing in human fingerprints). What is the key difference between these “ridge patterns” and the vector fields that we have investigated in this chapter? Exercise 6.5.15. Read Jules Verne’s novel Around the World in Eighty Days [37]. At the end of the book, Phineas Fogg thinks he has lost his bet, but it turns out that he has not. Explain his mistake in terms of the results of this chapter.

Chapter 7

The Winding Number in Functional Analysis

Topological ideas such as the winding number are ubiquitous in modern mathematics, often showing up in entirely unexpected contexts. For example, in this chapter the winding number will arise as the solution to a problem in functional analysis — roughly speaking, the problem is to count the number of “arbitrary constants” that appear in the general solution of a certain integral equation. This process — relating topology to “counting” the solutions of differential or integral equations — is the central theme in the index theorem of Atiyah and Singer (see [23]), one of the great unifying mathematical results of the later twentieth century. The equations that we look at are going to be defined on an infinite-dimensional vector space. We will use the theory of Hilbert spaces throughout this chapter; Hilbert spaces are simple examples of infinite-dimensional vector spaces, with a geometry that is close to the familiar Euclidean geometry of finite dimensions. The Hilbert space theory that we will need is reviewed in Appendix F.

7.1. The Fredholm index Let V be a vector space and U a subspace of V . Remember (Definition A.2.6) that the dimension of U is the number of elements in a 121

122

7. The Winding Number in Functional Analysis

basis for U . That is, the dimension of U is the smallest n for which we can find u1 , . . . , un ∈ U such that every u ∈ U can be written  u = ni=1 λi ui for some scalars λi ∈ C. We will also need the notion of codimension. By definition, the codimension (Definition A.3.10) of the subspace U of V is the dimension of an algebraic complement to U , that is, the smallest number n for which there exist v1 , . . . , vn ∈ V such that, for every v ∈ V , one can write n  μi vi + u, μi ∈ C, u ∈ U. v= i=1

When everything is finite-dimensional, the codimension of U in V is just dim V − dim U . But the codimension may be finite even if V and U are both infinite-dimensional. Now let T : V → W be a linear map between vector spaces. Recall the following definitions: (a) The kernel and image of T are defined by Ker T = {v ∈ V : T v = 0}, Im T = {w ∈ W : ∃v ∈ V, T v = w}. They are vector subspaces of V and W , respectively. (b) The nullity of T is the dimension of Ker T , and the rank of T is the dimension of Im T . (c) The corank of T is the codimension of Im T in W . Similarly, the conullity of T is the codimension of Ker T in V . One of the basic results of linear algebra is the “rank-nullity” theorem (Theorem A.3.14). One version of its statement is the following. Theorem 7.1.1. For a linear transformation T : V → W between finite-dimensional vector spaces, we have Nullity T − Corank T = dim V − dim W. Notice that the usual formulation Nullity T + Rank T = dim V is equivalent to this one, but our version is more suggestive when we come to generalize to infinite dimensions. It’s helpful to think of the rank-nullity theorem in terms of Figure A.2 on page 191, which shows

7.1. The Fredholm index

123

T as exactly matching up a piece of V (of size Conullity T ) with a piece of W (of size Rank T ), with “left-over” pieces on each side of size Nullity T and Corank T , respectively. An invertible linear map has zero nullity and zero corank. We will be interested in operators on Hilbert space that are “almost” invertible, where “almost” is expressed by saying that the nullity and corank are finite. That is, Definition 7.1.2. Let V and W be Hilbert spaces and let T : V → W be a bounded linear operator. We say T is a Fredholm operator if (a) the kernel of T has finite dimension, (b) the range of T has finite codimension. Proposition 7.1.3. The kernel and the range of a Fredholm operator on Hilbert space are closed subspaces. Proof. The kernel of any bounded linear operator is closed since it is the inverse image of a closed set, namely {0}, under a continuous map (Remark B.2.4). As for the range, let T : V → W be Fredholm. Let {w1 , . . . , wn } be a basis for a complement of Im T in W . Define a bounded linear operator n  L : (Ker T )⊥ ⊕ Cn → W, (v, λ1 , . . . , λn ) → T v + λi vi . i=1

L is bijective and therefore has a bounded inverse M = L−1 by the closed graph lemma (Lemma F.3.5). Then Im T is the inverse image M −1 ((Ker T )⊥ ) of the closed subspace (Ker T )⊥ , so it is closed.  Definition 7.1.4. The index Index(T ) of a Fredholm operator T is the difference of dimensions Nullity T − Corank T . The rank-nullity theorem can be restated as follows: if V, W are finite-dimensional and T : V → W is a linear map, then Index(T ) = dim V − dim W . In other words, the index does not depend on T at all! In particular, an operator from a finite-dimensional vector space to itself must have index zero. This statement is not true for maps from an infinite-dimensional space to itself, as the following important example shows.

124

7. The Winding Number in Functional Analysis

Example 7.1.5. Let V = W = 2 , the Hilbert space of squaresummable sequences (Example F.1.2). Let T : V → W be the linear operator defined by T (a0 , a1 , a2 , a3 , . . .) = (0, a0 , a1 , a2 , . . .), called the unilateral shift. Clearly, Nullity T = 0 while Corank T = 1, so T is a Fredholm operator of index −1. The adjoint operator T ∗ (the unilateral backward shift) defined by T ∗ (a0 , a1 , a2 , a3 , . . .) = (a1 , a2 , a3 , a4 , . . .) has index +1. The next result gives the key properties of the Fredholm index. Theorem 7.1.6. Let H be a Hilbert space. Then the space Fred(H) of Fredholm operators H → H is an open subset of B(H). The index function Index : Fred(H) → Z is constant on the path components of Fred(H), and two Fredholm operators belong to the same path component of Fred(H) if and only if they have the same index. If H is infinite-dimensional, all integers can be obtained as the indices of appropriate Fredholm operators. These properties should remind you very strongly of the key properties of the winding number, Theorem 3.2.7. We’ll prove the first two statements (the Fredholm operators form an open set and the index is constant on path components) in the next section. As for the final statement (all integers can be obtained), we have already seen examples of Fredholm operators of index ±1 (the unilateral shift and its adjoint), and by taking powers of these we can obtain Fredholm operators of any integer index. We’re not going to prove in detail that two Fredholm operators having the same index are in the same path component, as the argument requires more Hilbert space technology than I want to develop. But I wanted to state the full result (Theorem 7.1.6) so you can see the closeness of the parallel to the winding number.

7.2. Atkinson’s theorem

125

7.2. Atkinson’s theorem From Appendix F the collection B(H) of all bounded ( = continuous) linear operators on a Hilbert space H is itself a complete normed vector space. The norm is defined by

T = sup{ T x : x  1} = sup{| T x, y| : x , y  1}. Lemma 7.2.1. Let H be a Hilbert space and let S ∈ B(H) with

S < 1. Then I − S is an invertible operator. Proof. Define R by the series R = I + S + S + ··· = 2

∞ 

Sn.

n=0

If S = s < 1, then S n  sn , and so simple estimates show that the partial sums of the above series form a Cauchy sequence in the normed vector space B(H). Since B(H) is complete, this Cauchy sequence converges to a bounded operator R. We have SR = RS =

∞ 

Sn = R − I

n=1

whence R(I − S) = (I − S)R = I as required.



Corollary 7.2.2. The set of invertible operators (on a single Hilbert space or from one Hilbert space to another ) is open. Proof. Let T : H1 → H2 be invertible, with inverse S. Let ε =

S −1 . If T − T  < ε, then I − ST  < 1 and I − T  S < 1, whence ST  and T  S are invertible. It follows that T  is invertible.  Proposition 7.2.3. Let V, W be Hilbert spaces. The set Fred(V, W ) of all Fredholm operators from V to W is an open subset of B(V, W ), and the index is constant on path components of this open set. Proof. Let T be Fredholm. We are going to show that there exists ε > 0 such that if T  −T < ε, then T  is Fredholm and has the same index as T . This will clearly show that the set of Fredholm operators is open. It also implies that the index is a continuous integer-valued

126

7. The Winding Number in Functional Analysis

function on the set of Fredholm operators and hence that it is constant on path components. We consider the orthogonal direct sum decompositions (Theorem F.2.4) of V and W given by V = V0 ⊕ V1 , W = W0 ⊕ W1 ,

V0 = Ker(T ), V1 = Ker(T )⊥ , W0 = Im(T )⊥ , W1 = Im(T ).

Note that V0 and W0 are finite-dimensional and Index(T ) = dim V0 − dim W0 . Every linear transformation from V to W has a 2 × 2 matrix representation with respect to this decomposition. In particular T itself has such a representation   0 0 T = , 0 T11 where T11 : V1 → W1 is invertible. Let T  = T + L, where the perturbation L has norm smaller than −1 −1

, and write ε = T11   L10 L00 T +L= . L01 T11 + L11 By our assumption, (T11 + L11 ) is invertible (Corollary 7.2.2). Now we are going to perform “elementary row and column operations” on the matrix T + L. This may seem strange because the entries of our matrix are themselves linear transformations, not numbers as they are in a first course in linear algebra, but in fact everything works in the same way: an elementary row operation corresponds to multiplying on the left by a certain invertible matrix, and an elementary column operation corresponds to multiplying on the right by a certain invertible matrix. The operations we want to carry out are these: −1 row1; that is, multply • Add −L10 (T11 +L  11 ) times row 2 to −1 1 −L10 (T11 + L11 ) on the left by . 0 1

• Add −(T11 + L11 )−1 L01 times  column 2 to column 1; that 1 0 is, multiply on the right by . −1 −(T11 + L11 ) L01 1

7.2. Atkinson’s theorem

127

These operations are effected by invertible matrices, so they don’t change the dimension of the kernel or the codimension of the image (and in particular they don’t change the Fredholm index). Their effect is to reduce T + L to the matrix   0 L00 − L10 (T11 + L11 )−1 L01 . 0 T11 + L11 The index of this diagonal matrix is clearly the sum of the indices of the diagonal entries. But the first diagonal entry is just a linear transformation between finite-dimensional vector spaces, so its index is dim V0 − dim W0 = Index(T ), and the second entry is invertible so it has index zero. Thus T + L is Fredholm and has the same index as T , completing the proof.  Proposition 7.2.4 (Atkinson’s theorem). Let T be a bounded operator on a Hilbert space H. The following conditions are equivalent: (a) T is Fredholm. (b) T is invertible modulo finite-rank operators: there is a bounded operator S such that I − ST and I − T S are of finite rank. (c) T is invertible modulo compact operators: there is a bounded operator S such that I − ST and I − T S are compact operators. See Definition F.3.3 for the definitions of “finite rank” and “compact” operators. For those who are familiar with the terminology, Atkinson’s theorem can be expressed as follows: an operator T is Fredholm if and only if its image in the quotient algebra B(H)/K(H) is invertible. This quotient (called the Calkin algebra Q(H)) is an important object in operator algebra theory. Proof. Suppose that T is Fredholm, (a). Then T maps the orthogonal complement (Ker(T ))⊥ bijectively onto Im(T ). Let Q be the inverse map from Im(T ) to (Ker(T ))⊥ ; by the closed graph lemma (Lemma F.3.5), Q is a bounded operator. Let P be the orthogonal projection from H onto the closed subspace1 Im(T ) and let S = QP . 1

This projection exists by Theorem F.2.4.

128

7. The Winding Number in Functional Analysis

Then by construction, I − T S and I − ST are the orthogonal projections onto Im(T )⊥ and Ker(T ), respectively. Since these are finitedimensional, the associated projections have finite rank. Thus T is invertible modulo finite-rank operators, (b). It is obvious that (b) implies (c). Suppose (c), that T is invertible modulo compacts, and let S be such that I − ST ∈ K and I − T S ∈ K. There is a finite-rank operator F such that I − ST − F < 12 . By Lemma 7.2.1, this implies that ST + F is invertible. Consider now the identity map I = (ST + F )−1 (ST + F ) when restricted to the kernel of T ; the restriction of ST +F to Ker(T ) has finite rank, whence the restriction of I to Ker(T ) has finite rank, and thus Ker(T ) is finite-dimensional. Similarly there is a finite-rank operator F  such that T S + F  is invertible. The equation v = (T S + F  )(T S + F  )−1 v = T S(T S + F  )−1 v + F  (T S + F  )−1 v shows that Im(T )+Im(F  ) = H and, since Im(F  ) is finite-dimensional, this shows that Im(T ) has finite codimension. Thus T is Fredholm, (a), as required.  Corollary 7.2.5. Let T be a Fredholm operator and K a compact operator. Then T + K is Fredholm and has the same index as T . Proof. Any inverse for T modulo compacts is also an inverse for T +K modulo compacts, so T +K is Fredholm by Atkinson’s theorem. The linear path s → T + sK shows that T and T + K belong to the same path component of the space of Fredholm operators, so they have the same index.  Proposition 7.2.6. If T1 , T2 are Fredholm operators on a Hilbert space H, then so is their composite T1 T2 , and moreover Index(T1 T2 ) = Index(T1 ) + Index(T2 ). Proof. It follows from Atkinson’s theorem that the composite of Fredholm operators is Fredholm. To prove the formula for the index, choose an operator S2 that is an inverse for T2 modulo compacts.

7.3. Toeplitz operators

129

Consider the one-parameter family of 2 × 2 matrices (operators on H ⊕ H)

T2 cos(πs/2) I sin(πs/2) Vs = , s ∈ [0, 1]. −I sin(πs/2) S2 cos(πs/2) These are all invertible modulo compacts (hence Fredholm) with



T2 0 0 I , V1 = V0 = . −I 0 0 S2 Note that Index(V0 ) = Index(T2 )+Index(S2 ), whereas V1 is invertible so has index 0; therefore, Index(T2 ) = − Index(S2 ). Consider now the path of operators

T1 0 Ws = Vs . 0 I This is also a continuous path of Fredholm operators with



T 1 T2 0 0 T1 W0 = , W1 = . −I 0 0 S2 The equality Index(W0 ) = Index(W1 ) now gives Index(T1 T2 ) + Index(S2 ) = Index(T1 ), which, together with Index(S2 ) = − Index(T2 ), implies the desired result. 

7.3. Toeplitz operators There is a natural construction that gives rise to Fredholm operators related to the Hilbert space L2 (S 1 ) of square-integrable functions on the circle. We recall that this Hilbert space has an orthonormal basis given by the elementary trigonometric functions en (t) = eint ,

t ∈ [0, 2π],

for n ∈ Z. The coefficients of a function with respect to this orthonormal basis are called its Fourier coefficients; see Definition F.1.6.

130

7. The Winding Number in Functional Analysis

Suppose that g is a continuous function on the circle (i.e., a continuous map [0, 2π] → C with g(0) = g(2π)). The multiplication operator Mg on H is defined by ∀f ∈ H = L2 (S 1 ).

Mg (f ) = gf

Proposition 7.3.1. The multiplication operator Mg is bounded, with norm

Mg  sup{|g(x)| : x ∈ [0, 2π]}. Proof. Let m = sup{|g(x)| : x ∈ [0, 2π]}. If f ∈ L2 (S 1 ), we have 1

Mg f = 2π





|Mg f (x)|2 dx

2

1 = 2π  m2



0 2π

|g(x)f (x)|2 dx 0

1 2π





|f (x)|2 dx = m2 f 2 . 0

This shows that Mg f  m f as required.



Remark 7.3.2. In fact, it is not hard to prove that we have equality in Proposition 7.3.1: Mg = sup{|g(x)| : x ∈ [0, 2π]}. We will not need this, however. We now ask: what is the (infinite-by-infinite) matrix of the multiplication operator Mg with respect to the Fourier basis {en }? This question is particularly simple to answer when g is a trigonometric polynomial (that is, a finite linear combination of the en ). If g(t) is such a polynomial, of the form g(t) =

N 

cn eint ,

n=−N

then we clearly have Mg eikt =

N  n=−N

cn ei(k+n)t .

7.3. Toeplitz operators

131

This proves the special case (where g is a trigonometric polynomial) of Proposition 7.3.3. Let g be a continuous function on S 1 . The matrix of Mg with respect to the trigonometric basis is ⎡ ⎤ .. .. . . ⎢ ⎥ ⎢ c1 c0 c−1 c−2 c−3 ⎥ ⎢ ⎥ ⎢ c2 c1 c0 c−1 c−2 ⎥ , ⎢ ⎥ ⎢ c c0 c−1 ⎥ ⎣ 3 c2 c1 ⎦ .. .. . . where the {cn } are the Fourier coefficients of g. Proof. The calculations above prove this when g is a trigonometric polynomial. The general case follows by the Weierstrass approximation theorem (Theorem C.1.5): the trigonometric polynomials are (uniformly) dense among all continuous functions.  Definition 7.3.4. The Hardy space H 2 (S 1 ) is the closed subspace of L2 (S 1 ) comprised of those functions f ∈ L2 (S 1 ) all of whose negative Fourier coefficients are zero — in other words, those for which

f, en  = 0 for n < 0. The Hardy projection P is the orthogonal projection onto Hardy space; thus P f has the same Fourier coefficients as f for n  0, and zero Fourier coefficients for n < 0. Remark 7.3.5. If we think of f as an L2 function of z = eit defined on the unit circle in the complex plane, then the functions in the Hardy space are precisely those that involve only nonnegative powers of z and that therefore extend to holomorphic functions defined on the unit disc. Definition 7.3.6. Let g be a continuous function on the circle. The Toeplitz operator with symbol g is the operator on the Hardy space defined by Tg = P Mg ; in other words, to compute Tg f , we first multiply f by g and then project the result back into the Hardy space.


From our discussion above we can see that the matrix of a Toeplitz operator with symbol $g$ is the truncation of the matrix given in Proposition 7.3.3 for the corresponding multiplication operator. That is,
$$\begin{pmatrix} c_0 & c_{-1} & c_{-2} & \cdots \\ c_1 & c_0 & c_{-1} & \\ c_2 & c_1 & c_0 & \\ \vdots & & & \ddots \end{pmatrix}. \tag{7.3.7}$$
Whereas the diagonals of the “multiplication matrix” of Proposition 7.3.3 were “two-way infinite”, the diagonals of the corresponding “Toeplitz matrix” (7.3.7) are only “one-way infinite”. This gives rise to some edge effects when we multiply Toeplitz matrices. It turns out that these “edge effects” are represented by compact operators:

Proposition 7.3.8 (Symbolic calculus). Let $g_1$ and $g_2$ be continuous functions on $S^1$. Then $T_{g_1 g_2} - T_{g_1} T_{g_2} \in K$; in other words, the assignment $g \mapsto T_g$ is a homomorphism modulo compact operators.

Proof. Let $P$ denote the Hardy projection. I claim that for any continuous $g$ the commutator $[P, M_g] := PM_g - M_gP$ is compact. This will be enough, since then
$$T_{g_1} T_{g_2} = PM_{g_1}PM_{g_2} \sim PPM_{g_1}M_{g_2} = PM_{g_1g_2} = T_{g_1g_2},$$
where the notation $\sim$ denotes “equality modulo compacts”.

To prove the claim, consider the collection $C$ of all continuous functions $g$ that satisfy it. Clearly $C$ is a vector space, and the identity $[P, AB] = [P, A]B + A[P, B]$ shows that it is closed under multiplication. A direct computation shows that $[P, M_g]$ is a rank-one operator when $g(t) = e^{\pm it}$. It follows that $C$ contains all trigonometric polynomials. Now for a general $g$ let $g_n$ be a sequence of trigonometric polynomials converging uniformly to $g$ (the existence of such a sequence is guaranteed by the Weierstrass


approximation theorem (Theorem C.1.5)). Then the operators $M_{g_n}$ converge to $M_g$ by Proposition 7.3.1, and therefore the commutators $[P, M_{g_n}]$ converge to $[P, M_g]$. Thus $[P, M_g]$ is a limit of compact operators and is therefore itself compact, as required. □
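The “edge effect” is easy to see numerically. The following Python sketch (an illustration of ours, assuming numpy and scipy; `toeplitz_truncation` is a helper defined here, not part of the text) builds finite truncations of the matrix (7.3.7) and shows that $T_{g_1g_2} - T_{g_1}T_{g_2}$ is concentrated in the top-left corner — here a rank-one matrix:

    import numpy as np
    from scipy.linalg import toeplitz

    def toeplitz_truncation(g, N=6, M=512):
        """N x N truncation of the Toeplitz matrix (7.3.7) of the symbol g,
        with Fourier coefficients c_n approximated on M sample points."""
        t = 2 * np.pi * np.arange(M) / M
        samples = g(t)
        c = lambda n: np.mean(samples * np.exp(-1j * n * t))
        col = np.array([c(n) for n in range(N)])    # c_0, c_1, c_2, ...
        row = np.array([c(-n) for n in range(N)])   # c_0, c_{-1}, c_{-2}, ...
        return toeplitz(col, row)

    g1 = lambda t: np.exp(1j * t)     # symbol z
    g2 = lambda t: np.exp(-1j * t)    # symbol z^{-1}
    T1 = toeplitz_truncation(g1)
    T2 = toeplitz_truncation(g2)
    T12 = toeplitz_truncation(lambda t: g1(t) * g2(t))   # symbol 1: identity
    print(np.round(T12 - T1 @ T2, 3).real)   # nonzero only in the (0,0) corner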

7.4. The Toeplitz index theorem

Now we can put all the ingredients together to relate topology and analysis. Let $g$ be a continuous complex-valued function on $S^1$.

Proposition 7.4.1. If the function $g$ is nowhere-vanishing, then the Toeplitz operator $T_g$ is a Fredholm operator.

Proof. Since $g$ does not vanish, the function $g^{-1}$ is defined and continuous everywhere. By the symbolic calculus of Proposition 7.3.8, $T_g T_{g^{-1}}$ and $T_{g^{-1}} T_g$ are equal modulo compacts to $T_1 = I$. Thus $T_g$ is invertible modulo compacts, so it is Fredholm by Atkinson’s theorem (Proposition 7.2.4). □

Since $T_g$ is a Fredholm operator when $g$ is nowhere-vanishing, we can ask about its index.

Theorem 7.4.2 (Toeplitz index theorem). Let $g : S^1 \to \mathbb{C} \setminus \{0\}$ be a nowhere-vanishing function. Then
$$\operatorname{Index}(T_g) = -\operatorname{wn}(g, 0),$$
where we consider $g$ as a loop that does not pass through $0$.

Example 7.4.3. Suppose that $g(t) = e^{it}$. Then for each of the basis elements $e_0, e_1, e_2, \dots$ of the Hardy space we have $T_g e_n = e_{n+1}$. Thus $T_g$ is in fact the unilateral shift (Example 7.1.5) and has index $-1$. On the other hand, the path $g$ described is just the unit circle traversed once in the positive direction, so $\operatorname{wn}(g, 0) = +1$.

Proof. Just as in the proof of Theorem 5.3.4, our argument will follow the three-stage “standard operating procedure”:

(a) Show that $\operatorname{Index} T_g$ depends only on the homotopy class of $g$ (among maps $S^1 \to \mathbb{C} \setminus \{0\}$).


(b) Show that the index is multiplicative: $\operatorname{Index} T_{g_1 g_2} = \operatorname{Index} T_{g_1} + \operatorname{Index} T_{g_2}$.

(c) Deduce that the index is a multiple of the winding number; fix the multiple by computing one example.

To prove (a), we notice that $T_g$ “depends continuously on $g$”: specifically, if $\sup\{|g_1(x) - g_2(x)|\} < \varepsilon$, then
$$\|M_{g_1} - M_{g_2}\| = \|M_{g_1 - g_2}\| < \varepsilon,$$
and it follows that
$$\|T_{g_1} - T_{g_2}\| = \|P(M_{g_1} - M_{g_2})\| \le \|P\|\, \|M_{g_1} - M_{g_2}\| = \|M_{g_1-g_2}\| < \varepsilon,$$
since $\|P\| = 1$. It follows that if $s \mapsto g_s$ is a homotopy of maps $S^1 \to \mathbb{C}\setminus\{0\}$, then $s \mapsto T_{g_s}$ is a continuous path of Fredholm operators, and therefore that $\operatorname{Index} T_{g_s}$ does not depend on $s$ by Proposition 7.2.3.

To prove item (b), notice that $T_{g_1g_2}$ is equal modulo compacts to $T_{g_1}T_{g_2}$, by the symbolic calculus of Proposition 7.3.8, and these two operators therefore have the same index by Corollary 7.2.5. By Proposition 7.2.6, $\operatorname{Index}(T_{g_1}T_{g_2}) = \operatorname{Index}(T_{g_1}) + \operatorname{Index}(T_{g_2})$.

Now Theorem 3.2.7 and Lemma 3.2.6 tell us that loops in $\mathbb{C}\setminus\{0\}$ are classified up to homotopy by their winding numbers, with the pointwise product of loops corresponding to the addition of winding numbers and each loop being homotopic to some power of the basic loop $z \mapsto z$. It follows that any integer-valued, multiplicative homotopy invariant of loops in $\mathbb{C}\setminus\{0\}$ is of the form $g \mapsto k\operatorname{wn}(g, 0)$, where the constant $k$ is the value of the invariant on the basic loop. In the case of the Toeplitz index, we have already carried out the calculation of $k$ in Example 7.4.3 above. □

Instead of single (“scalar”) Toeplitz operators, as earlier, we can consider matrix Toeplitz operators, i.e., $n \times n$ matrices whose elements are Toeplitz operators. The symbol of such a Toeplitz operator is an $n \times n$ matrix of functions on the circle, i.e., a map $S^1 \to M_n(\mathbb{C})$, and the operator is Fredholm if the symbol is invertible, i.e., is a map to the group $\mathrm{GL}(n, \mathbb{C})$ of invertible matrices. To conclude this section, let’s state the index theorem for matrix Toeplitz operators.


Theorem 7.4.4. Let $\varphi : S^1 \to \mathrm{GL}(n, \mathbb{C})$ be a continuous, matrix-valued symbol, and let $T_\varphi$ be the corresponding matrix Toeplitz operator. Then $T_\varphi$ is Fredholm, and its index is given by
$$\operatorname{Index} T_\varphi = -\operatorname{wn}(\det\varphi, 0),$$
where $\det\varphi$ is the path in $\mathbb{C}\setminus\{0\}$ given by the determinant of $\varphi$ (see Definition A.6.5).

Sketch of the proof. Induction on $n$, the $n = 1$ case being the theorem that we have established already. By what we have already proved, the index depends only on the homotopy class of the symbol in the space of maps $S^1 \to \mathrm{GL}(n, \mathbb{C})$. So to give the inductive step, it suffices to prove that any map $S^1 \to \mathrm{GL}(n, \mathbb{C})$ is homotopic, in the space of such maps, to a map of the form
$$\begin{pmatrix} \varphi' & 0 \\ 0 & 1 \end{pmatrix},$$
where $\varphi'$ is a map $S^1 \to \mathrm{GL}(n - 1, \mathbb{C})$. Such a homotopy will preserve both the left- and the right-hand sides of the theorem (the index and the winding number of the determinant) and will reduce the $n$-dimensional case (for $\varphi$) to the $(n-1)$-dimensional case (for $\varphi'$), which we may assume solved by induction.

The idea is to use row and column operations again, but there is a significant difficulty. To cancel the last row and column we have to get an element in the bottom-right corner which is invertible and so can act as a “pivot”. In ordinary linear algebra this is no problem: some element is certainly nonzero and we can permute rows and columns to get it to the bottom right. But here we are doing linear algebra with matrix entries that are functions on $S^1$, and it may well be that even though the whole matrix is invertible for every parameter value, there is no individual matrix element that does not vanish somewhere. The path of rotation matrices
$$t \mapsto \begin{pmatrix} \cos t & \sin t \\ -\sin t & \cos t \end{pmatrix}$$
gives a practical example of this.

So here’s how we proceed. Take our loop $\varphi = \varphi(t)$ and let $v = v(t) \in \mathbb{C}^n$ be the last column of $\varphi$. This is a nowhere-vanishing vector (if it vanished somewhere, $\varphi$ would not be invertible at that point), and by rescaling $\varphi(t)$ by $\|v(t)\|^{-1}$ (which only changes things by a homotopy) we may assume that $v(t)$ is a unit vector for all $t$. Thus $v$ is actually a loop in the sphere $S^{2n-1}$ of unit vectors in $\mathbb{C}^n$.


This sphere is simply connected (Example 2.3.6). That is to say, there is a homotopy of loops in the sphere from the constant loop $(0, 0, \dots, 0, 1)^T$ to the loop $v(t)$. Now we appeal to a lifting property, rather like the crucial property of Proposition 3.1.5 of the exponential map. This is called the fibration property of the map $c : \mathrm{GL}(n, \mathbb{C}) \to \mathbb{C}^n \setminus \{0\}$ that sends an invertible matrix to its last column (Definition 9.1.9). Here, it allows us to “lift” the homotopy of the final column (given to us by the simple-connectedness of the sphere) to a homotopy of the whole matrix. This then can be used to show that a nonzero pivot can be found and therefore that elementary row and column operations will reduce the matrix of $\varphi$ to the block form required. More details of this argument can be found in Lemma 9.2.1. □
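As a computational aside (a numerical sketch of ours, not from the text, assuming numpy), the right-hand side of these index theorems is easy to evaluate: the winding number of a sampled loop can be accumulated from argument increments, and for a matrix symbol one applies the same routine to $t \mapsto \det\varphi(t)$:

    import numpy as np

    def winding_number(g, M=2048):
        """Approximate wn(g, 0) for a nowhere-vanishing loop g on [0, 2*pi],
        by summing the (principal-value) increments of arg g."""
        t = 2 * np.pi * np.arange(M + 1) / M
        values = g(t)
        increments = np.angle(values[1:] / values[:-1])   # each in (-pi, pi]
        return int(round(increments.sum() / (2 * np.pi)))

    print(winding_number(lambda t: np.exp(3j * t)))   # 3, so Index(T_g) = -3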

7.5. Exercises

Exercise 7.5.1. Let $H$ be a Hilbert space and suppose that $E$ and $F$ are closed subspaces with $E \subseteq F$. Show that $F^\perp \subseteq E^\perp$ and that the codimension of $E$ in $F$ is the same as the codimension of $F^\perp$ in $E^\perp$.

Exercise 7.5.2. If $T$ is a Fredholm operator on Hilbert space, show that its adjoint $T^*$ is a Fredholm operator also and $\operatorname{Index}(T^*) = -\operatorname{Index}(T)$.

Exercise 7.5.3. An operator $T$ on a Hilbert space is normal if it commutes with its adjoint, $TT^* = T^*T$.

(i) If $T$ is a normal Fredholm operator, show that its index is zero.

(ii) An operator $T$ is essentially normal if the difference $TT^* - T^*T$ is compact. Show that the sum of a normal operator and a compact operator is essentially normal.

(iii) Show that there exist essentially normal operators that cannot be expressed as the sum of a normal operator and a compact operator.

Elucidating the phenomenon in (iii) above leads one to the Brown-Douglas-Fillmore theory [11], an important link between operator theory and topology.


Exercise 7.5.4. This exercise gives an alternative proof of the result of Proposition 7.2.6 that if $S, T$ are Fredholm operators on a Hilbert space $H$, then so is their composite $ST$, and that $\operatorname{Index}(ST) = \operatorname{Index}(S) + \operatorname{Index}(T)$. It does not involve the matrix homotopies used in the text, but it does need some more ideas from linear algebra (quotient spaces and the associated isomorphism theorems).

(i) Show that the sequences
$$0 \to \operatorname{Ker}(T) \to \operatorname{Ker}(ST) \xrightarrow{\ T\ } \operatorname{Ker}(S) \cap \operatorname{Im}(T) \to 0$$
and
$$0 \to \frac{H}{\operatorname{Ker}(S) + \operatorname{Im}(T)} \xrightarrow{\ S\ } \frac{H}{\operatorname{Im}(ST)} \to \frac{H}{\operatorname{Im}(S)} \to 0$$
are exact. (See Definition A.3.17 for the terminology.)

(ii) Count dimensions in these exact sequences (Theorem A.3.18) and use the “second isomorphism theorem”
$$\frac{\operatorname{Ker}(S) + \operatorname{Im}(T)}{\operatorname{Im}(T)} \cong \frac{\operatorname{Ker}(S)}{\operatorname{Ker}(S) \cap \operatorname{Im}(T)}$$
to complete the proof of the desired result.

Exercise 7.5.5. Fredholm was originally interested in solutions to integral equations of the form $f(x) - \int k(x, y)f(y)\, dy = g(x)$: for our purposes these can be written $(I + K)f = g$, where $I$ is the identity and $K$ a compact operator (on some Hilbert space). In these circumstances he formulated what came to be called The Fredholm Alternative (which really should have been the title of a book by Robert Ludlum), namely the statement that the following conditions are equivalent:

• Solutions always exist to the inhomogeneous problem; i.e., for every $g$ there exists $f$ such that $(I + K)f = g$.

• Solutions to the homogeneous problem are unique; i.e., the only $f$ such that $(I + K)f = 0$ is $f = 0$.

Prove the Fredholm alternative as a consequence of our general results on Fredholm operators.


Exercise 7.5.6. Let $H$ denote the Hardy space $H^2(S^1)$. Consider the Toeplitz algebra $\mathcal{T}$ of operators on $H$: that is, the smallest closed subset of $B(H)$ which contains all the Toeplitz operators $T_f$ and satisfies
$$A, B \in \mathcal{T},\ \lambda, \mu \in \mathbb{C} \implies \lambda A + \mu B \in \mathcal{T},\ AB \in \mathcal{T}.$$
Show that $\mathcal{T}$ contains all the compact operators.

Exercise 7.5.7. Let $H$ be a Hilbert space. You are given the following facts: Among the compact operators on $H$ there is a subclass called the traceable operators. The traceable operators form an ideal in $B(H)$ which contains every finite-rank operator. If $T$ is a traceable operator and $\{e_n\}$ is a complete orthonormal set, the “sum of diagonal matrix entries” $\sum_n \langle Te_n, e_n \rangle$ converges absolutely to a number $\operatorname{Tr}(T)$, the trace of $T$, which depends only on $T$ (not on the choice of orthonormal set). Finally, if $A$ and $B$ are bounded operators such that $AB$ and $BA$ are both traceable, then $\operatorname{Tr}(AB) = \operatorname{Tr}(BA)$.

(a) Show that if $T$ is a Fredholm operator, there is a “parametrix” $S$ such that $I - ST$ and $I - TS$ are traceable.

(b) Show that if $S$ is such a parametrix, then $\operatorname{Index}(T) = \operatorname{Tr}(I - ST) - \operatorname{Tr}(I - TS)$.

(c) (Challenge) Let $f$ be a function on the circle and let $P$ denote the Hardy projection. It is known that if $f$ is sufficiently smooth, then the commutator $[M_f, P]$ is a traceable operator on $L^2(S^1)$. Assuming this, prove
$$\operatorname{Index}(T_f) = \operatorname{Tr}(M_f^{-1}[P, M_f]),$$
where $[A, B]$ denotes the commutator $AB - BA$.

Alain Connes noticed that if you think of the trace as a “noncommutative” version of the integral and the commutator as a “noncommutative” version of the exterior derivative $d$, the result in (c) looks a lot like the integral formula for the winding number, $(1/2\pi i)\int f^{-1}\, df$. This became one of the foundations of his theory of noncommutative geometry, developed in the book [13].

Chapter 8

Coverings and the Fundamental Group

8.1. The fundamental group

The winding number project was about understanding and classifying loops in the metric space $\mathbb{C} \setminus \{0\}$. We can generalize this to an arbitrary metric space, as follows.

Definition 8.1.1. Let $X$ be a metric space, with a basepoint $x_0$. The fundamental group $\pi_1(X, x_0)$ is the collection of homotopy classes of loops in $X$ based at $x_0$, that is, of continuous maps $\gamma : [0, 1] \to X$ having $\gamma(0) = \gamma(1) = x_0$.

Remark 8.1.2. Note that an element of the fundamental group is not a single loop, but an entire homotopy class of such loops. For example, the fundamental group of the punctured plane has one element for each integer $n$, and that element is the homotopy class consisting of all the loops (with the specified basepoint) that have winding number $n$.

The object $\pi_1(X, x_0)$ was invented by Poincaré. It is called the fundamental group because it can be equipped with a “multiplication” that satisfies the laws of an abstract group. We made use of such a multiplication (the pointwise product of loops) on several occasions


when studying the winding number, but this multiplication involved the arithmetic of complex numbers — the product on $\mathbb{C} \setminus \{0\}$ — and no analog to this is available on a general metric space $X$. However, it turns out that the concatenation of loops provides an acceptable substitute. Recall from Remark 3.2.9 the following definition:

Definition 8.1.3. Suppose that $\gamma_1$ and $\gamma_2$ are loops in $X$ based at $x_0$. Their concatenation is the loop $\gamma_1 * \gamma_2$ defined by
$$t \mapsto \begin{cases} \gamma_1(2t) & (t \le \tfrac{1}{2}), \\ \gamma_2(2t - 1) & (t > \tfrac{1}{2}). \end{cases}$$

Because homotopies between loops can be concatenated in the same way as the loops themselves, concatenation passes to homotopy classes and defines a binary operation $*$ on $\pi_1(X, x_0)$.

Proposition 8.1.4. The operation of concatenation makes $\pi_1(X, x_0)$ into a group (Definition G.2.1). That is:

(a) Concatenation is associative: $g_1 * (g_2 * g_3) = (g_1 * g_2) * g_3$.

(b) The class $e$ of the constant path at $x_0$ acts as an identity element: $e * g = g = g * e$.

(c) Inverses exist: for each $g$ there is $g^{-1}$ such that $g * g^{-1} = g^{-1} * g = e$.

Proof. The proof uses the notion of reparameterization of a loop. Recall (Example 2.2.3) that we define a reparameterization of a path to be its composition with any continuous map $u : [0, 1] \to [0, 1]$ having $u(0) = 0$ and $u(1) = 1$. If two loops are related by reparameterization, they are homotopic (via a linear homotopy).

Let $\gamma_1, \gamma_2, \gamma_3$ be loops. The loops $\gamma_1 * (\gamma_2 * \gamma_3)$ and $(\gamma_1 * \gamma_2) * \gamma_3$ are related by reparameterization using the piecewise linear map
$$u(t) = \begin{cases} \tfrac{1}{2}t & (0 \le t \le \tfrac{1}{2}), \\ t - \tfrac{1}{4} & (\tfrac{1}{2} \le t \le \tfrac{3}{4}), \\ 2t - 1 & (\tfrac{3}{4} \le t \le 1), \end{cases}$$

whose graph is shown in Figure 8.1.


Figure 8.1. Reparameterization used to prove associativity in π1 .

Similarly if $\gamma$ is a loop and $e$ is the constant loop, the loops $\gamma$ and $\gamma * e$ are related by reparameterization using the piecewise linear map
$$u(t) = \begin{cases} 2t & (0 \le t \le \tfrac{1}{2}), \\ 1 & (\tfrac{1}{2} \le t \le 1), \end{cases}$$
and $\gamma$ and $e * \gamma$ are related by a similar reparameterization.

Finally, the reverse loop to a loop $\gamma$ is defined by $\bar\gamma(t) = \gamma(1 - t)$. The concatenation $\gamma * \bar\gamma$ is homotopic to the constant loop via the homotopy
$$h(s, t) = \begin{cases} \gamma(2\min\{t, 1 - s\}) & (t \le \tfrac{1}{2}), \\ \gamma(2\min\{1 - t, 1 - s\}) & (t \ge \tfrac{1}{2}). \end{cases}$$
This completes the proof. □

Example 8.1.5. For a simply connected space $X$, $\pi_1(X, x_0)$ is the trivial group $0$ (with just one element, the identity).

Example 8.1.6. For $X = \mathbb{C} \setminus \{0\}$, our calculations of the winding number (in Theorem 3.2.7) show that $\pi_1(X, x_0) \cong \mathbb{Z}$ (we can take any basepoint $x_0$, but for definiteness let’s take $x_0 = 1$).

Remark 8.1.7. How does the group $\pi_1(X, x_0)$ depend on the choice of basepoint? We need to assume that $X$ is path connected to answer this question. Assuming this, let $x_0$ and $x_1$ be two basepoints and let $\varphi : [0, 1] \to X$ be a path connecting them, with $\varphi(0) = x_0$ and $\varphi(1) = x_1$; let $\bar\varphi$ be the reversed path ($\bar\varphi(t) = \varphi(1 - t)$). If $\gamma$ is a loop based at $x_0$, then $\bar\varphi * \gamma * \varphi$ is a loop based at $x_1$, and this process
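As a small illustration of Definition 8.1.3 (a sketch of ours, not from the text), loops can be modeled as Python callables on $[0, 1]$:

    import numpy as np

    def concatenate(g1, g2):
        """The loop g1 * g2 of Definition 8.1.3: traverse g1 on [0, 1/2],
        then g2 on [1/2, 1]."""
        return lambda t: g1(2 * t) if t <= 0.5 else g2(2 * t - 1)

    # Two loops in C \ {0} based at 1, with winding numbers 1 and 2;
    # their concatenation is again a loop based at 1 (winding number 3).
    loop1 = lambda t: np.exp(2j * np.pi * t)
    loop2 = lambda t: np.exp(4j * np.pi * t)
    both = concatenate(loop1, loop2)
    print(both(0.0), both(0.5), both(1.0))   # all equal 1+0j, up to rounding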


passes to homotopy classes and gives an isomorphism $\pi_1(X, x_0) \to \pi_1(X, x_1)$. Thus “up to isomorphism” the fundamental group of a path-connected space does not depend on the choice of basepoint. However, note that there may be many different isomorphisms between $\pi_1(X, x_0)$ and $\pi_1(X, x_1)$; the above discussion does not single out a particular one — if we choose a different path from $x_0$ to $x_1$, we could in principle obtain a different isomorphism. See Exercise 8.7.2.

Suppose that $(X, x_0)$ and $(Y, y_0)$ are spaces with basepoint and that $f : X \to Y$ is a based map — that is, a continuous map with $f(x_0) = y_0$. Then, for any loop $\gamma : [0, 1] \to X$ in $X$, the composite $f \circ \gamma : [0, 1] \to Y$ is a loop in $Y$. Moreover, this construction passes to homotopy classes and so it gives rise to a map $f_* : \pi_1(X, x_0) \to \pi_1(Y, y_0)$. It is clear that for $g_1, g_2 \in \pi_1(X, x_0)$, $f_*(g_1 * g_2) = f_*(g_1) * f_*(g_2)$; that is, $f_*$ is a homomorphism of groups (Definition G.3.1). It is called the induced homomorphism associated to the map $f$. The construction has the following obvious properties.

Proposition 8.1.8. The construction of the induced homomorphism is functorial. That is, if $f : (X, x_0) \to (Y, y_0)$ and $g : (Y, y_0) \to (Z, z_0)$ are maps of spaces with basepoints, then $g_* \circ f_* = (g \circ f)_* : \pi_1(X, x_0) \to \pi_1(Z, z_0)$. Moreover, the identity map gets transformed into the identity homomorphism. Finally, homotopic maps induce the same homomorphism on fundamental groups. □

The “functorial” language comes from category theory [25]: a functor transforms both “objects” and “morphisms” from one kind of mathematical theory to another, while preserving the formal properties expressed by composition laws. For instance, the fundamental group functor transforms topology to group theory, replacing spaces by groups and continuous mappings by homomorphisms. By the use of functors, problems in one theory (say, topology) can be transformed


into problems in another theory (say, algebra). This is the basic idea of “algebraic topology”.

Example 8.1.9. Let’s use these ideas to prove the Brouwer fixed-point theorem yet once more. As we observed before, this is equivalent to the no-retraction theorem (Theorem 4.1.2): there is no retraction of the closed disc $D^2$ onto its boundary $S^1$. Well, suppose there was. Such a retraction would amount to a commutative diagram of spaces and maps
$$\begin{array}{ccc} S^1 & \xrightarrow{\ \mathrm{id}\ } & S^1 \\ \searrow & & \nearrow \\ & D^2 & \end{array}$$
where the horizontal map is the identity, the downward diagonal is the inclusion map, and the upward diagonal is the supposed retraction. Applying the fundamental group functor would give us a commutative diagram of groups and homomorphisms
$$\begin{array}{ccc} \mathbb{Z} & \xrightarrow{\ \mathrm{id}\ } & \mathbb{Z} \\ \searrow & & \nearrow \\ & 0 & \end{array}$$
where the horizontal map is the identity. But clearly no such diagram exists (the identity map $\mathbb{Z} \to \mathbb{Z}$ cannot factor through a trivial group or else it would itself be trivial).

Remark 8.1.10. Let $f : (X, x_0) \to (Y, y_0)$ be a based map. If there is a map $g : (Y, y_0) \to (X, x_0)$ such that $f \circ g$ is homotopic to the identity map on $Y$ and $g \circ f$ is homotopic to the identity map on $X$, then we say that $f$ is a homotopy equivalence. From the functorial properties of the induced homomorphism it follows that if $f$ is a homotopy equivalence, then $f_* : \pi_1(X, x_0) \to \pi_1(Y, y_0)$ is an isomorphism.

8.2. Covering and lifting

We are going to consider some special kinds of maps between (metric) spaces. A map $f : X \to Y$ is a surjection if it maps $X$ onto $Y$, i.e., for every $y \in Y$ there is $x \in X$ with $f(x) = y$. Two surjections onto


the same space, $f_1 : X_1 \to Y$ and $f_2 : X_2 \to Y$, are called equivalent if there is a homeomorphism $h : X_1 \to X_2$ such that $f_2 \circ h = f_1$, or in other words such that the diagram
$$\begin{array}{ccc} X_1 & \xrightarrow{\ h\ } & X_2 \\ {\scriptstyle f_1}\searrow & & \swarrow{\scriptstyle f_2} \\ & Y & \end{array}$$
commutes.

Example 8.2.1. An obvious example of a surjection is a product surjection, where $X$ is just a product $Y \times F$ for some space $F$ and $f : X \to Y$ is the “coordinate” map $f(x, \xi) = x$. The space $F$ is called the fiber of the product surjection.

Example 8.2.2. We can restrict a surjection to an open subset $U \subseteq Y$: if $f : X \to Y$ is a surjection and $U \subseteq Y$ is open, then $f^{-1}(U) \subseteq X$ is open too. Considering $f^{-1}(U)$ and $U$ as metric spaces in their own right, we obtain a surjection
$$f_U : f^{-1}(U) \to U,$$
called the restriction of $f$ to $U$.

Now we give a key definition. Recall (Example B.1.9) that a metric space $F$ is discrete if every subset is open. Equivalently, for each $\xi \in F$, the infimum $\inf\{d(\xi, \xi') : \xi' \ne \xi\}$ is strictly positive.

Definition 8.2.3. A surjection $p : E \to B$ is a covering map if there are a discrete space $F$ and an open cover $\mathcal{U}$ of $B$ such that, for each $U \in \mathcal{U}$, the restriction $p_U : p^{-1}(U) \to U$ is equivalent to a product surjection with fiber $F$. Again, we call $F$ the fiber of the covering map. The cardinality of $F$ is the number of sheets of the covering.

A cover $\mathcal{U}$ having the property described in this definition will be called a trivializing cover for $p$, and its members trivializing sets. The two uses of the word “cover”, as in “open cover” and “covering

map”, are unrelated, but both are part of the standard terminology of the subject.

Figure 8.2. Reprising Figure 1.3 to illustrate the covering map from $\mathbb{R}$ to $S^1$. The figure shows the set $U \subseteq S^1$ (dashed) and three of the infinitely many components of $e^{-1}(U)$ (heavy lines).

Consider the exponential map $e(t) = \exp(2\pi i t)$, from $\mathbb{R}$ to $S^1$, a familiar friend from earlier chapters. It is a local homeomorphism but not a global homeomorphism. In fact,

Proposition 8.2.4. The exponential map $e : \mathbb{R} \to S^1$ is a covering map.

Proof. Consider the open subset $U$ of $S^1$ given by its intersection with the left-hand half-plane; that is, $U = \{x + iy : x^2 + y^2 = 1,\ x < 0\}$. Notice that a point of $U$ is completely determined by its $y$-coordinate, which lies in $(-1, 1)$, and $x = -(1 - y^2)^{1/2}$ is a continuous function of $y$; so $U$ is homeomorphic to $(-1, 1)$. We have
$$e^{-1}(U) = \bigcup_{n \in \mathbb{Z}} \left(n + \tfrac{1}{4},\ n + \tfrac{3}{4}\right),$$
and the homeomorphism $e^{-1}(U) \to U \times \mathbb{Z}$ that sends $n + t$, $t \in (\tfrac{1}{4}, \tfrac{3}{4})$, to $((\cos 2\pi t, \sin 2\pi t), n)$ implements an equivalence between the restriction of $e$ to $U$ and the product surjection $U \times \mathbb{Z} \to U$; see Figure 8.2.


Similar discussions can be carried out for the intersection of $S^1$ with the upper, lower, and right-hand half-planes, and the four open subsets so obtained cover $S^1$. This completes the proof. □

One can prove by a similar argument that the exponential map $\exp : \mathbb{C} \to \mathbb{C} \setminus \{0\}$ is a covering map.

Example 8.2.5. Fix a nonzero $n \in \mathbb{N}$. The map $S^1 \to S^1$ given by $z \mapsto z^n$ is a covering map. In this case the fiber $F$ has $n$ points.

If $p : E \to B$ is a covering map (or any surjection really) and $f : X \to B$ is a map, a lift of $f$ is a map $\tilde{f} : X \to E$ such that $p \circ \tilde{f} = f$, i.e., such that the diagram
$$\begin{array}{ccc} & & E \\ & {\nearrow}^{\tilde f} & \big\downarrow p \\ X & \xrightarrow{\ f\ } & B \end{array}$$

commutes.

Theorem 8.2.6 (Homotopy lifting theorem). Let $p : E \to B$ be a covering map. Let $X$ be a compact metric space, and let $f_s$, $s \in [0, 1]$, be a homotopy of maps from $X$ to $B$. Suppose a lift $\tilde{f}_0 : X \to E$ of $f_0$ is given. Then there is a lift $\tilde{f}_s$ of the entire homotopy $f_s$, beginning at $\tilde{f}_0$. Moreover, such a lift is unique.

The theorem can be expressed by the diagram below: the solid arrows represent the data (homotopy $f_s$ and initial lift $\tilde{f}_0$), and the dashed arrow represents the lifted homotopy that is to be constructed so as to make the diagram commute:
$$\begin{array}{ccc} X \times \{0\} & \longrightarrow & E \\ \big\downarrow & \nearrow & \big\downarrow p \\ X \times [0, 1] & \longrightarrow & B. \end{array}$$

Corollary 8.2.7. Suppose that p : E → B is a covering space with the total space E simply connected. Then a loop f : S 1 → B is homotopic to a constant loop if and only if it lifts to a loop in E.


Proof of the corollary, assuming the theorem. Suppose that $f$ lifts to $F : S^1 \to E$. By assumption, $E$ is simply connected, so the loop $F$ is homotopic (in $E$) to a constant. Composing this homotopy with the projection $p : E \to B$, we get a homotopy (in $B$) of $f$ to a constant loop.

Conversely suppose that $f$ is homotopic to a constant loop. Let $f_s$ be such a homotopy with $f_0$ being constant and $f_1 = f$. The constant loop $f_0$ can be lifted to a constant loop $F_0$ in $E$ (since $p$ is a surjection). By Theorem 8.2.6 the homotopy $f_s$ lifts to a homotopy $F_s$, which starts at $F_0$ and ends at $F_1 = F$, which is the required lift of $f$. □

Corollary 8.2.7 is the analog for general covering spaces of the lifting property of the exponential map (Proposition 3.1.5).

Proof of Theorem 8.2.6. Fix a trivializing cover $\mathcal{U}$ for $B$. If $Y$ is a subset of $X$, let us call $Y$ modest (as in, “of modest size”) if we can partition $[0, 1]$ into finitely many closed subintervals $[s_k, s_{k+1}]$, $0 = s_0 \le s_1 \le \dots \le s_m = 1$, such that the restriction of the homotopy $f$ to $Y \times [s_k, s_{k+1}]$ has image that is a subset of a member of $\mathcal{U}$.

Step (a). We show that for each $x \in X$, there is $\varepsilon_x > 0$ such that the closed ball $B(x; \varepsilon_x)$ is modest. Indeed, let $j_x : [0, 1] \to B$ be the path $j_x(s) = f(s, x)$. Let $\delta$ be a Lebesgue number (Theorem B.3.13) for the cover $j_x^*(\mathcal{U})$ of $[0, 1]$. Choose a partition of $[0, 1]$ into finitely many closed subintervals $[s_k, s_{k+1}]$ of length less than $\delta$. By construction, then, for each $k$ there is $U_k \in \mathcal{U}$ such that the compact set $f([s_k, s_{k+1}] \times \{x\})$ is contained in $U_k$. By compactness and by uniform continuity of $f$, there is $\varepsilon_k > 0$ such that $f([s_k, s_{k+1}] \times B(x; \varepsilon_k)) \subseteq U_k$. Take $\varepsilon_x = \frac{1}{2}\min\{\varepsilon_k\}$; then $B(x; \varepsilon_x)$ is modest.

Step (b). Now we prove the uniqueness statement. It is sufficient to prove this when $X$ is a point, since two liftings which agree when restricted to each point of $X$ must agree globally. So assume $X$ is a point (and then omit it from the notation). Now we are in the situation of a path-lifting problem: we are given a path $f : [0, 1] \to B$, and we want to know that two liftings $f', f'' : [0, 1] \to E$ of $f$ that start at the same place ($f'(0) = f''(0)$) must in fact agree everywhere ($f'(s) = f''(s)$ for all $s$). Let $T$ denote the collection of all those


numbers $t \in [0, 1]$ such that $f'(s) = f''(s)$ for all $s \in [0, t]$. By hypothesis, $T$ is a nonempty interval with left endpoint $0$. Let $t_0 = \sup T$. There is $\varepsilon > 0$ such that $V = (t_0 - \varepsilon, t_0 + \varepsilon) \cap [0, 1]$ is contained in $f^{-1}(U)$ for some trivializing set $U$. Now using the local trivialization, we may identify $p^{-1}(U) \subseteq E$ with $U \times F$, where $F$ is some discrete space, and under this identification $f'$ and $f''$ become maps of the form
$$f'(s) = (f(s), g'(s)), \quad f''(s) = (f(s), g''(s)), \qquad s \in V,\ g'(s), g''(s) \in F.$$
Since $g'(s) = g''(s)$ for some $s \in V$, this identity must hold for all $s \in V$ (a continuous map from an interval to a discrete space must be constant). If $t_0 < 1$, then $V$ contains $s > t_0$, which is a contradiction; so $t_0 = 1$ and moreover $1 \in V$, and this is the desired uniqueness statement.

Step (c). Next we will prove the existence statement under the assumption that $X$ itself is modest. The proof is quite similar to the proof of uniqueness. Let now $T$ be the collection of all those numbers $t \in [0, 1]$ for which a continuous lift $\tilde{f}$ of $f$, starting at the given $\tilde{f}_0$, exists on $X \times [0, t]$. Clearly $T$ is a nonempty interval with left endpoint $0$. Let $t_0 = \sup T$. By the assumption that $X$ is modest, there is $\varepsilon > 0$ such that if $V = (t_0 - \varepsilon, t_0 + \varepsilon) \cap [0, 1]$, then $X \times V$ is contained in $f^{-1}(U)$ for some trivializing set $U$. Now using the local trivialization, we may identify $p^{-1}(U) \subseteq E$ with $U \times F$, where $F$ is some discrete space, and under this identification the lifting $\tilde{f}$ becomes a map of the form
$$\tilde{f}(x, s) = (f(x, s), g(x)), \qquad s \in V \cap [0, t_0),\ x \in X.$$
The map $g$ is continuous from $X$ to $F$; it does not depend on $s \in V \cap [0, t_0)$ because a continuous map from an interval to a discrete space must be constant. However, the displayed equation makes sense for all $s \in V$ and defines a lifting of $f$. As before it follows that $t_0 = 1$ and $1 \in V$, which is the required lifting result.

Step (d). Finally we prove the general case. Since $X$ is compact, Step (a) shows that it can be covered by finitely many modest closed balls. Applying Step (c) to each of these, we get continuous liftings


over each of these modest balls. By Step (b), whenever two modest balls overlap, the corresponding liftings agree. By the gluing lemma (Proposition B.4.2), the liftings over modest balls fit together to define a continuous lifting on all of $X \times [0, 1]$, as required. □

Theorem 8.2.8. Let $p : E \to B$ be a covering, with $E$ path connected, and let basepoints $e_0 \in E$ and $b_0 \in B$ be chosen with $p(e_0) = b_0$. Let $(Y, y_0)$ be a path-connected and locally path-connected space and let $f : Y \to B$ be a based map. Then there exists a based lifting $g : Y \to E$ of $f$ if and only if $f_*(\pi_1(Y, y_0)) \subseteq p_*(\pi_1(E, e_0))$. Moreover, if it exists, such a based lifting is unique.

In this theorem, “locally path connected” means that for every point $y \in Y$ and every open subset $U$ containing $y$, there is a path-connected open subset $V$ with $y \in V \subseteq U$. This condition is satisfied for most “ordinary” spaces, though there are examples for which it fails. The lifting condition is expressed in the diagram
$$\begin{array}{ccc} & & (E, e_0) \\ & {\nearrow}^{g} & \big\downarrow p \\ (Y, y_0) & \xrightarrow{\ f\ } & (B, b_0). \end{array}$$

Proof. From the functoriality of the fundamental group, if a lifting exists, $f_* = p_* g_*$, so $\operatorname{Im}(f_*) \subseteq \operatorname{Im}(p_*)$. Thus the condition on the fundamental groups is necessary.

To see that it is sufficient, suppose that $\operatorname{Im}(f_*) \subseteq \operatorname{Im}(p_*)$. Then we attempt to define a lift $g$ as follows: for $y \in Y$, pick a path $\gamma$ from $y_0$ to $y$, consider the path $f \circ \gamma$ in $B$ from $b_0$ to $f(y)$, and lift this path (using Theorem 8.2.6) to a path in $E$ starting at $e_0$. We want to define $g(y)$ to be the endpoint of the lifted path. To do so, we must check that different choices of $\gamma$ give the same final result.

Suppose then that $\gamma$ and $\gamma'$ are two such choices. The concatenation $\gamma * \bar\gamma'$ is then a loop in $Y$, defining a class in the fundamental group $\pi_1(Y, y_0)$. By the hypothesis $\operatorname{Im}(f_*) \subseteq \operatorname{Im}(p_*)$, the loop $f \circ (\gamma * \bar\gamma')$ in $B$ is homotopic (via a homotopy $h$) to a loop $p \circ \theta$, where $\theta$ is a loop in $E$. Now apply the homotopy lifting theorem (Theorem 8.2.6) to the situation
$$\begin{array}{ccc} S^1 \times \{0\} & \xrightarrow{\ \theta\ } & E \\ \big\downarrow & \nearrow & \big\downarrow p \\ S^1 \times [0, 1] & \xrightarrow{\ h\ } & B. \end{array}$$
The existence of the lifting (the diagonal map in the diagram) shows in particular that $f \circ (\gamma * \bar\gamma')$ lifts to a loop in $E$ and therefore that the endpoints of the lifts of $\gamma$ and $\gamma'$ are the same.

The local connectedness hypothesis comes in when we try to prove that the map $g$ that we have defined is continuous. Suppose that $U$ is an open neighborhood of $g(y)$ in $E$. Then $p(U)$ is open in $B$, and, without loss of generality, we may take $U$ so small that $p^{-1}(p(U)) \cong U \times F$, where $F$ is the (discrete) fiber. By the local connectedness, there is a path-connected open neighborhood $V$ of $y$ such that $f(V) \subseteq p(U)$. For $y' \in V$, a path from $y_0$ to $y'$ may be defined as the concatenation of $\gamma$ with a path that remains entirely within $V$ (because $V$ is path connected). A lift of this path may be defined by concatenating a lift of $\gamma$ with a path that lies entirely in $f(V) \times F$ where the fiber component remains constant, i.e., which lies entirely in $U$. Thus $g(V) \subseteq U$. This proves $g$ is continuous at $y$.

Finally, the uniqueness claim follows from the uniqueness part of Theorem 8.2.6. □

8.3. Group actions

Covering spaces are closely related to group actions.

Definition 8.3.1. Let $G$ be a group and let $X$ be a metric space. An action of $G$ on $X$ is a homomorphism from $G$ to the group of isometries of $X$. In other words, to each $g \in G$ and $x \in X$ there is associated $g \cdot x \in X$ such that:

(a) $(g_1 g_2) \cdot x = g_1 \cdot (g_2 \cdot x)$.

(b) If $e \in G$ denotes the identity, then $e \cdot x = x$.


(c) $d(x_1, x_2) = d(g \cdot x_1, g \cdot x_2)$ (i.e., each $g \in G$ acts as an isometry on $X$).

One says that the action makes $X$ into a $G$-space.

Example 8.3.2. Let $G$ be the group $\mathbb{Z}$, and let $X = \mathbb{R}$. Then addition ($g \cdot x = g + x$) makes $X$ into a $G$-space.

Example 8.3.3. Let $G = S_3$, the symmetric group on 3 letters, and let $T$ be an equilateral triangle in the plane with vertices $A, B, C$, say. For each permutation $\sigma \in G$ there is one and only one isometry $I_\sigma$ of $T$ which permutes the vertices $A, B, C$ according to the permutation $\sigma$. The assignment $\sigma \cdot x = I_\sigma(x)$ gives an action of $G$ on $T$.

Definition 8.3.4. An action of $G$ on $X$ is free and properly discontinuous (FPD for short) if for each $x \in X$ there is $\varepsilon > 0$ such that $d(x, gx) > \varepsilon$ for all nonidentity $g \in G$.

The first example above is free and properly discontinuous; the second is not.

Definition 8.3.5. If $G$ acts on $X$, the orbit $Gx$ of $x \in X$ is the set $\{g \cdot x : g \in G\}$. The orbit space $G\backslash X$ is the collection of orbits of $G$ on $X$.

Proposition 8.3.6. Suppose that $G$ acts on the metric space $X$ by an FPD isometric action. Then the formula
$$d(\Omega, \Omega') = \inf\{d(x, x') : x \in \Omega,\ x' \in \Omega'\}$$
defines a metric on the orbit space $G\backslash X$.

Proof. It is clear that the formula for $d$ is positive and symmetric. We must show that it is strictly positive on distinct orbits and that it satisfies the triangle inequality. As a preliminary to both parts of the proof, notice that if $x \in \Omega$, then $d(\Omega, \Omega') = \inf\{d(x, x') : x' \in \Omega'\}$ (i.e., we only need to take the infimum over one orbit, not both). This is because $d(gx, g'x') = d(x, g^{-1}g'x')$, since $G$ acts by isometries.

Now, suppose $d(\Omega, \Omega') = 0$ with $\Omega = Gx$, $\Omega' = Gx'$, $\Omega \ne \Omega'$. By the above there exists a sequence of distinct elements $g_n \in G$ with $d(x, g_n x') < n^{-1}$. Then
$$d(x', g_n^{-1}g_m x') = d(g_n x', g_m x') < n^{-1} + m^{-1},$$


and this will contradict the FPD condition for $n, m$ sufficiently large. Thus the distance between distinct orbits is strictly positive.

To prove the triangle inequality, consider three orbits $\Omega, \Omega', \Omega''$ equal to $Gx, Gx', Gx''$, respectively. By definition of the infimum, for any $\varepsilon > 0$ there exist $g', g'' \in G$ such that
$$d(x, g'x') < d(\Omega, \Omega') + \varepsilon, \qquad d(x', g''x'') < d(\Omega', \Omega'') + \varepsilon.$$
Then
$$d(\Omega, \Omega'') \le d(x, g'g''x'') \le d(x, g'x') + d(g'x', g'g''x'') = d(x, g'x') + d(x', g''x'') \le d(\Omega, \Omega') + d(\Omega', \Omega'') + 2\varepsilon.$$
Letting $\varepsilon \to 0$ we obtain the triangle inequality. □

Proposition 8.3.7. Suppose that $G$ acts on the metric space $X$ by an FPD isometric action. Then the natural map $p : X \to G\backslash X$ is a covering map with fiber $G$ (equipped with a discrete metric).

Proof. Put $Y = G\backslash X$, let $y \in Y$, and let $x \in p^{-1}(y)$. According to the FPD condition there is $\varepsilon > 0$ such that the distances $d(x, gx)$, $g \ne e$, are all greater than $\varepsilon$. It easily follows that the balls $B(gx; \frac{1}{2}\varepsilon)$ are all disjoint. Let $U = B(y; \frac{1}{2}\varepsilon)$. By definition of the metric in $Y$, if $x' \in p^{-1}(U)$, then there is some $g \in G$ such that $d(x', gx) < \frac{1}{2}\varepsilon$. From the FPD condition, this $g$ is uniquely determined, and the restriction of $p$ to a map $B(gx; \frac{1}{2}\varepsilon) \to U$ is an isometry. Thus the map $x' \mapsto (p(x'), g)$ is a homeomorphism from $p^{-1}(U)$ to $U \times G$. Thus $p$ is a covering map, as asserted. □

The next result generalizes our use of the exponential map to compute the fundamental group of the circle $S^1$.

Theorem 8.3.8. Suppose that $G$ acts on the metric space $X$ by an FPD isometric action and in addition that $X$ is simply connected. Then the fundamental group $\pi_1(G\backslash X)$ is isomorphic to $G$.

Proof. Let $Y = G\backslash X$. Choose a basepoint $y_0 \in Y$ and a basepoint $x_0 \in X$ mapping to $y_0$ under the covering map $p : X \to Y$ which sends each point to its orbit. Define a map $\alpha : G \to \pi_1(Y, y_0)$ as


follows: given $g \in G$, join $x_0$ to $g \cdot x_0$ by a path $\gamma$ in $X$, and let $\alpha(g)$ be the homotopy class of the based loop $p \circ \gamma$ in $Y$. Since $X$ is simply connected, the path $\gamma$ exists and is unique up to (endpoint fixed) homotopy; it follows that $p \circ \gamma$ is well-defined up to homotopy of loops, so that $\alpha(g) \in \pi_1(Y, y_0)$ is well-defined.

We prove that $\alpha$ is a homomorphism. Let $g, h \in G$ and let $\gamma, \theta$ be paths from $x_0$ to $g \cdot x_0$ and $h \cdot x_0$. Let $\gamma^h(t) = h \cdot \gamma(t)$: this is a path from $h \cdot x_0$ to $hg \cdot x_0$. Notice that $p \circ \gamma^h = p \circ \gamma$. However, the concatenation $\theta * \gamma^h$ is a path from $x_0$ to $hg \cdot x_0$; thus,
$$\alpha(hg) = [p \circ (\theta * \gamma^h)] = [(p \circ \theta) * (p \circ \gamma^h)] = \alpha(h)\alpha(g),$$
where the square brackets denote homotopy classes of loops. This shows that $\alpha$ is a homomorphism.

We prove that $\alpha$ is surjective. Let $\varphi : [0, 1] \to Y$ be a based loop in $Y$ representing a class in the fundamental group. By the lifting theorem (Theorem 8.2.6), there is a path $\tilde\varphi$ in $X$ starting at $x_0$ that lifts $\varphi$. The point $\tilde\varphi(1)$ is in the same orbit as $x_0$, so there is $g \in G$ such that $g \cdot x_0 = \tilde\varphi(1)$. Then $\alpha(g) = [\varphi] \in \pi_1(Y, y_0)$.

We prove that $\alpha$ is injective. Suppose that $g \in \ker\alpha$. Then a path from $x_0$ to $g \cdot x_0$ projects to a nullhomotopic loop in $Y$. By Corollary 8.2.7, that loop in $Y$ lifts to a loop in $X$. By the uniqueness of lifting, that lift must be equal to the original path from $x_0$ to $g \cdot x_0$. Thus $g \cdot x_0 = x_0$. Since the action is free, $g$ is the identity. □

8.4. Examples

In the previous section we proved that if a group $G$ acts freely and properly discontinuously on a simply connected space $X$, then the quotient space $G\backslash X$ has fundamental group $G$. We will give several examples.

Example 8.4.1. Considering the FPD action of $\mathbb{Z}$ on $\mathbb{R}$ by translations recovers for us the calculation $\pi_1(S^1) = \mathbb{Z}$ which underlies the notion of winding number.

Example 8.4.2. Letting $\mathbb{Z}^2$ act on $\mathbb{R}^2$ by translations, we get the calculation $\pi_1(T^2) = \mathbb{Z}^2$, where $T^2$ is the torus. This also can be deduced from the case of the circle using Exercise 8.7.1.


Remark 8.4.3. The fundamental groups of closed surfaces of higher genus can also be approached in this way, but the required isometric actions are not actions on the Euclidean plane anymore. Instead, $X$ becomes 2-dimensional hyperbolic space, and the required groups $G$ are Fuchsian groups (discrete groups of hyperbolic isometries). This very classical theory was extensively developed in the nineteenth century.

Example 8.4.4. Let the group $G = \mathbb{Z}_2$ with two elements act on the sphere $S^2$ by the antipodal map (i.e., the nonzero element of $G$ sends each point of $S^2$ to its antipodal point). Clearly this action is FPD. The orbit space $G\backslash S^2$ is called the real projective plane, $\mathbb{RP}^2$. By the result above, the fundamental group of $\mathbb{RP}^2$ is $\mathbb{Z}_2$.

Now we will consider another important example, the free group. Let $S$ be a set, which in this context we call the set of generators. A word in $S$ is a formal expression (of any length $k$)
$$s_1^{n_1} s_2^{n_2} \cdots s_k^{n_k}$$
where $s_1, \dots, s_k \in S$ and $n_1, \dots, n_k \in \mathbb{Z}$. (The individual $s_i^{n_i}$ are called the terms of the word, with base $s_i$ and exponent $n_i$.) We also allow consideration of the empty word of length $0$, which is denoted $1$. Two words are said to be equivalent if one can be obtained from the other by a finite succession of the following operations (“elementary equivalences”):

(i) replacing two successive terms with the same base $s$ by a single term using the “exponential law” $s^n s^m = s^{n+m}$,

(ii) replacing a single term by two successive terms with the same base using the exponential law,

(iii) inserting a term $s^0$ in any position for any generator $s$,

(iv) deleting a term $s^0$ from any position for any generator $s$.

For example, if $S = \{a, b\}$, the following words are equivalent:
$$a^2 b a b b^{-1} a^{-1} b^2 \sim a^2 b a a^{-1} b^2 \sim a^2 b b^2 \sim a^2 b^3 \sim aabbb.$$
(In the last version we abbreviate $s^1$ as $s$.) It is clear that our relationship of equivalence is, indeed, an equivalence relation, generated by our elementary equivalences (Remark G.1.4).


If $w$ and $w'$ are words, their juxtaposition is the word defined by writing the terms of $w$ and following them by the terms of $w'$. Juxtaposition of words is an associative binary operation, for which the empty word acts as an identity. It does not, however, admit inverses.

Proposition 8.4.5. The juxtaposition operation passes to equivalence classes of words. Equipped with this operation, the collection of equivalence classes of words becomes a group (i.e., it has inverses).

Definition 8.4.6. The group so defined (of equivalence classes of words in $S$) is called the free group generated by $S$ and is denoted by $F(S)$. The free group on a finite number $n$ of generators is also denoted $F_n$.

Proof of the proposition. It is clear that if words $w, w'$ are equivalent, so are their juxtapositions with a third word $w''$. Thus juxtaposition does indeed pass to an associative binary operation on $F(S)$, with the equivalence class of the empty word as identity. If $w = s_1^{n_1} s_2^{n_2} \cdots s_k^{n_k}$, then let
$$w^{-1} = s_k^{-n_k} s_{k-1}^{-n_{k-1}} \cdots s_1^{-n_1}.$$
One checks directly that the words $ww^{-1}$ and $w^{-1}w$ are equivalent to the empty word. □

Definition 8.4.7. A word in $S$ is called reduced if no term has exponent $0$ and no two consecutive terms have the same base $s \in S$.

Proposition 8.4.8. Each word is equivalent to one and only one reduced word.

Outline proof. We describe (via pseudocode) Algorithm 1, which has the following properties:

(a) The input to the algorithm is a word; the output from the algorithm is a reduced word equivalent to the input word.

(b) If the input to the algorithm is already reduced, the output is the same as the input.

(c) Equivalent input words produce the same output.


Algorithm 1 Algorithm to reduce a word in the free group

    function Reduce-Word(W)                ▷ W is a word in the alphabet S
        Initialize two stacks, L and R. The L stack is initially empty;
        the R stack contains the input word, with the leftmost term on top.
        while R stack is non-empty do
            while both stacks are non-empty and the terms on top of L and R
                  have the same base do
                t ← combination of the top terms of L and R according to
                    the exponential law
                pop stacks L and R
                if exponent of t is nonzero then
                    push t onto stack L
                end if
            end while
            if R stack is non-empty then
                t ← top term of R
                pop stack R
                if exponent of t is nonzero then
                    push t onto stack L
                end if
            end if
        end while
        ▷ Reduced word is now contained on stack L, rightmost term on top
        return contents of stack L
    end function

The existence of this algorithm proves the proposition. One sees (and proves formally by induction) that the word on the left stack is always reduced and that the operation of the algorithm does not change the equivalence class of the word obtained by concatenating the L and R stacks. Moreover, if two words are related by an elementary equivalence (and therefore by any equivalence), the algorithm will convert them into the same word. □

Remark 8.4.9. Let $G$ be any group. A subset $S \subseteq G$ is called a generating set for $G$ if any element of $G$ can be written as a word in the members of $S$, where successive terms are multiplied using the composition law of $G$. The word problem for $G$ asks for an algorithm to decide when two different words represent the same group element. The algorithm that we have presented above solves the word problem for the free group.
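Anticipating Exercise 8.7.9, here is one possible Python rendering of Algorithm 1 (a sketch of ours; representing each term as a (base, exponent) pair is our own encoding choice):

    def reduce_word(word):
        """Algorithm 1: return the reduced word equivalent to `word`.
        A word is a list of (base, exponent) pairs, e.g. a^2 b^-1 a^3 is
        [("a", 2), ("b", -1), ("a", 3)]."""
        L = []                        # left stack; top is the end of the list
        R = list(reversed(word))      # right stack; leftmost term on top
        while R:
            # Combine matching bases by the exponential law s^n s^m = s^(n+m).
            while R and L and L[-1][0] == R[-1][0]:
                base, n = L.pop()
                _, m = R.pop()
                if n + m != 0:
                    L.append((base, n + m))
            # Move one term across, dropping terms with exponent zero.
            if R:
                base, n = R.pop()
                if n != 0:
                    L.append((base, n))
        return L

    # The worked example from Section 8.4: a^2 b a b b^-1 a^-1 b^2 ~ a^2 b^3.
    print(reduce_word([("a", 2), ("b", 1), ("a", 1), ("b", 1),
                       ("b", -1), ("a", -1), ("b", 2)]))
    # [('a', 2), ('b', 3)]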


It was proved by Novikov that there exist groups for which there is no solution to the word problem — there is no algorithm to decide whether or not a given word represents the identity. Combining these ideas with topological constructions, it can be shown that there is no algorithm to decide whether or not two explicitly given 4-dimensional manifolds are homeomorphic to one another.

8.5. The Nielsen-Schreier theorem

Now we are going to construct a simply connected space on which a free group acts freely and properly discontinuously. To keep things explicit we’ll consider the free group on two generators, $S = \{a, b\}$, but the arguments extend without difficulty to any finite number of generators (or indeed, with a bit of extra bookkeeping, to an infinite number of generators).

Our metric space $X$ is the geometric realization of the Cayley graph of $G = F(S)$. The Cayley graph has one vertex for each element of $G$, and vertices $g$ and $g'$ are joined by an edge if and only if $g^{-1}g'$ is one of $a, b, a^{-1}, b^{-1}$. Remember that each group element can be represented uniquely by a reduced word, a product of nonzero powers of the two generators $a, b$ alternately. We see that $g, g'$ are joined by an edge in the Cayley graph if and only if the reduced word for $g'$ differs from the reduced word for $g$ by “one letter on the right”. For example, the four neighbors of the reduced word $a^2b^{-1}a^3$ are $a^2b^{-1}a^2$, $a^2b^{-1}a^4$, $a^2b^{-1}a^3b$, and $a^2b^{-1}a^3b^{-1}$.

We make $X$ into a metric space by identifying each edge with a standard interval of length 1 and then measuring distances along shortest paths (see Remark G.4.5 for more details). For $g \in G$, we define its absolute value $|g|$ to be the distance from $g$ to the identity in $X$. This is the same as the length of the reduced word representing $g$, i.e., the sum $|n_1| + |n_2| + \cdots + |n_k|$ in the reduced word representation $g = s_1^{n_1} \cdots s_k^{n_k}$. By examining reduced words, one sees that of the four edges of $X$ that leave a nonidentity vertex $g$, three lead to a vertex of greater absolute value and one to a vertex of lesser absolute value. It follows that from any vertex $g$ of $X$ (and therefore from any point of $X$ at all) there is a unique radial geodesic: a path of length $|g|$ in $X$


leading from $g$ to the identity vertex. For $x \in X$ let us temporarily denote by $x_s$, $s \in [0, 1]$, the point on the radial geodesic from $e$ to $x$ that is distance $s|x|$ from the identity (so that $x_1 = x$ and $x_0 = e$ for all $x$).

Figure 8.3. Part of the Cayley graph of the free group. All edges have length 1.

Proposition 8.5.1. The Cayley graph $X$ of a free group is simply connected.

Proof. Clearly $X$ is connected. If $\gamma(t)$ is a loop (based at the identity) in $X$, then $\gamma(t)_s$, $s \in [0, 1]$, gives a homotopy of that loop to the constant loop. □

Every nonidentity $g \in G$ moves every point of $X$ by distance at least 1, so the action of $G$ on $X$ is FPD. Thus the quotient $G\backslash X$ is a space having fundamental group $G$, the free group on two generators. What is this space? All the vertices of $X$ are in the same $G$-orbit. The edges fall into two $G$-orbits, according to whether their endpoints are

related by multiplying by $a$ or by $b$. Thus the quotient space $G\backslash X$ consists of one vertex and two loops on it, as shown in Figure 8.4. This is sometimes called a “bouquet” (or, less poetically, a “wedge”) of two circles. The free group on $n$ generators can similarly be shown to be the fundamental group of a bouquet of $n$ circles.

Figure 8.4. The “figure eight” space, a bouquet of two circles.

Proposition 8.5.2. The fundamental group of any finite connected graph is free on $n$ generators, where $n = E - V + 1$, with $E$ being the number of edges and $V$ the number of vertices.

This result is also true for infinite graphs, but to keep things simple we will prove only the finite case.

Proof. A basic (and easy) theorem of graph theory says that a connected graph has a spanning tree, that is, a subgraph which contains all the vertices and has no circuits (see Lemma G.4.4). The spanning tree of our finite graph contains all $V$ vertices and $V - 1$ edges, so there are $n = E - V + 1$ edges of the graph that do not belong to the spanning tree.

Let $(Y, y_0)$ be a bouquet of $n$ circles. Define a map $f : X \to Y$ by sending every point of the spanning tree to $y_0$ and sending the $n$ remaining edges linearly to the $n$ circles of the bouquet. Define a map $g : Y \to X$ by sending each circle of the bouquet linearly to the loop in $X$ obtained by following the spanning tree outward to the beginning of the corresponding edge, traversing that edge, and then following the spanning tree inward again to the basepoint. It is not hard to see that $f \circ g$ and $g \circ f$ are homotopic to their respective identity maps. Thus the graph $X$ is homotopy equivalent (Remark 8.1.10) to a bouquet of $n$ circles, which means that its fundamental group is isomorphic to the fundamental group of such a bouquet — that is, to the free group $F_n$. □

We are going to use this to prove an important group-theoretic result about free groups, the Nielsen-Schreier theorem. First, a general lemma about covering spaces.

Lemma 8.5.3. Let $p : (Y, y_0) \to (X, x_0)$ be a covering space, where $Y$ is path connected. Then the induced map $p_* : \pi_1(Y, y_0) \to \pi_1(X, x_0)$


is injective. The index (see Remark G.2.9) of the subgroup $H = p_*(\pi_1(Y, y_0))$ in $G = \pi_1(X, x_0)$ is equal to the number of sheets of the covering.

Proof. An element of the kernel of $p_*$ would be a loop in $Y$ that maps to a nullhomotopic loop in $X$. But the homotopy lifting theorem (Theorem 8.2.6) allows us to lift such a homotopy to a nullhomotopy of the original loop. Thus $p_*$ has trivial kernel, so it is injective.

We count the number of points in the fiber $F = p^{-1}(x_0)$. For each element $g \in \pi_1(X, x_0)$ we obtain a $y \in F$ as the endpoint $\gamma(1)$, where $\gamma$ is a path in $Y$ starting at $y_0$ and lifting $g$. This gives a mapping of $\pi_1(X, x_0)$ onto $F$. Two group elements $g_1, g_2$ represent the same $y$ if and only if $g_1 g_2^{-1}$ comes from a loop in $Y$ based at $y_0$, in other words, if and only if $g_1$ and $g_2$ are in the same coset of $H$. Thus the number of points in the fiber is equal to the number of cosets, i.e., the index of $H$ in $G$. □

Theorem 8.5.4. Let $G = F_n$ be a free group on $n$ generators. Let $H$ be a subgroup of finite index $k$. Then $H$ is also free, on $1 + k(n - 1)$ generators.

Remark 8.5.5. This is the “finite” part of the Nielsen-Schreier theorem. The infinite part, which can be proved in the same way, says that an infinite-index subgroup of a free group is also free, on infinitely many generators. Notice the apparently paradoxical fact that the number of generators increases the smaller the subgroup gets.

Proof. Let $X$ be a bouquet of $n$ circles, and let $\tilde{X}$ be the Cayley graph of $G$, which is the infinite tree on which $G$ acts FPD with $X = G\backslash\tilde{X}$. Since $H$ is a subgroup of $G$, it can also be thought of as acting on $\tilde{X}$, and this action is also FPD. Let $Y = H\backslash\tilde{X}$. Then $Y$ has fundamental group $H$, and $Y \to X$ is a covering map whose fiber is the coset space $H\backslash G$ (in particular, its cardinality is the index $k$ of $H$ in $G$). Thus $Y$ is a finite graph, so $\pi_1(Y) = H$ is free. To compute the number of generators, let $V_X, E_X$ denote the number of vertices and


edges for $X$, and let $V_Y$ and $E_Y$ denote the corresponding numbers for $Y$. We have
$$E_X - V_X = n - 1, \qquad E_Y = kE_X, \qquad V_Y = kV_X.$$
From these we find that the number of generators of $H$ is
$$1 + E_Y - V_Y = 1 + k(E_X - V_X) = 1 + k(n - 1),$$
as asserted. □
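As a quick illustration of the count (a standard example, not taken from the text): the kernel of the homomorphism $F_2 \to \mathbb{Z}/2$ sending both generators to the nonzero element is a subgroup of index $k = 2$, so by the theorem it is free on $1 + 2(2 - 1) = 3$ generators.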

8.6. An application to nonassociative algebra

The complex numbers $\mathbb{C}$ form a finite-dimensional vector space $A$ over $\mathbb{R}$ equipped with a bilinear map $A \times A \to A$ (the multiplication of complex numbers). In general, a vector space equipped with such a bilinear map is called a (real) algebra: the words associative, commutative applied to an algebra refer to the corresponding properties of the multiplication map. A finite-dimensional algebra is called a division algebra if the multiplication map has no zero-divisors: that is to say, the product of two nonzero elements is again nonzero. Finally, an algebra is unital if it contains an element $1$ that acts as the identity for multiplication.

It is easy to prove that the only associative, commutative, unital real division algebras are $\mathbb{R}$ and $\mathbb{C}$, and it is not too much harder to prove that dropping the commutativity requirement yields just one more example, the quaternion algebra $\mathbb{H}$ of Hamilton. But suppose we keep commutativity and drop associativity instead? This question was investigated by Hopf in the 1940s and his answer involved topology and covering spaces in a very intriguing way. His conclusion was this:

Theorem 8.6.1. The only finite-dimensional commutative unital real division algebras are $\mathbb{R}$ and $\mathbb{C}$. (Thus, for these algebras, “the associative law is a consequence of the commutative law”.)

In what follows, $A$ will denote a finite-dimensional real commutative division algebra, with multiplication map denoted by $(x, y) \mapsto x \cdot y$. We will not need the unitality hypothesis until the very end. By choosing a basis, we may identify $A$ with $\mathbb{R}^n$ and thus introduce the norm $\|a\|$ — the usual (Euclidean) length of the vector $a$.


Lemma 8.6.2. For any (finite-dimensional) division algebra $A$ there is a constant $m > 0$ such that
$$m^{-1}\|a\|\,\|b\| \le \|a \cdot b\| \le m\|a\|\,\|b\|$$
for all $a, b \in A$.

Proof. Let $S^{n-1}$ denote the unit sphere in $\mathbb{R}^n$. Consider the set $S = \{\|a \cdot b\| : (a, b) \in S^{n-1} \times S^{n-1}\} \subseteq \mathbb{R}^+$. Since $S^{n-1} \times S^{n-1}$ is compact, $S$ is a compact set of strictly positive real numbers, so it is bounded above and below. Choose $m = \max\{\sup(S), \inf(S)^{-1}\}$ and use bilinearity to complete the proof. □

We define the quadratic map $q : A \setminus \{0\} \to A \setminus \{0\}$ by $q(x) = x \cdot x$. The key to Hopf’s proof is to consider the topological properties of the quadratic map. The image of the quadratic map may or may not be the whole of $A \setminus \{0\}$ (the examples $A = \mathbb{C}$ and $A = \mathbb{R}$ show these two possibilities). However each point of $\operatorname{Im}(q)$ is the image of exactly two points in the domain. This is because of the familiar factorization
$$q(x) - q(a) = x \cdot x - a \cdot a = (x - a) \cdot (x + a),$$
which shows, since there are no zero-divisors, that if $q(x) = q(a)$, then $x = \pm a$. Note that this factorization depends on the commutative and bilinear properties of multiplication only; it does not require the associative law.

Lemma 8.6.3. The image of $q$ is a closed subset of $A \setminus \{0\}$.

Proof. Suppose $y_n = q(x_n)$ is a sequence in $\operatorname{Im}(q)$ that converges to $y \in A \setminus \{0\}$. By omitting initial terms if necessary, we may assume that there is a constant $r > 0$ such that $r^{-1} \le \|y_n\| \le r$ for all $n$. Now using Lemma 8.6.2 we obtain a constant $m$ such that $(mr)^{-1/2} \le \|x_n\| \le (mr)^{1/2}$ for all $n$. By compactness, there is a subsequence of the $x_n$ that converges to a nonzero $x$. Since $q$ is continuous, $q(x) = y$, as required. □

We are now going to consider $q$ as a smooth map, using the higher-dimensional calculus language of Appendix E. Since $q$ is a map $A \to$


$A$, its derivative at a point $x \in A \setminus \{0\}$ is a linear map $A \to A$; in fact the calculation
$$q(x + h) = q(x) + 2x \cdot h + h \cdot h$$
(which uses commutativity) shows that the derivative of $q$ at $x$ is the linear map $h \mapsto 2x \cdot h$. By the division algebra property this is an invertible linear map. It follows from the inverse function theorem (see Theorem E.4.1) that $q$ is a local homeomorphism near each $x \in A \setminus \{0\}$; in fact, each such $x$ has a neighborhood $U$ such that $q$ maps $U$ diffeomorphically onto a neighborhood $q(U)$ of $q(x)$. In particular, the image of $q$ is an open subset of $A \setminus \{0\}$.

Lemma 8.6.4. If $\dim A \ge 2$, then the quadratic map $q$ is surjective (that is, $\operatorname{Im}(q) = A \setminus \{0\}$).

Proof. The image $\operatorname{Im}(q)$, considered as a subset of $A \setminus \{0\}$, is both open (as we have just seen) and closed (by Lemma 8.6.3). But the complement of the origin in a vector space of dimension at least 2 is connected (Remark 2.1.5), so its only open-and-closed subsets are the empty set and the whole of $A \setminus \{0\}$. Since the image of $q$ is not empty, it must be the whole space. □

At this point we have split the proof of Theorem 8.6.1 into two cases:

(i) $A$ is 1-dimensional: in this case it is easily seen that $A \cong \mathbb{R}$ as an algebra.

(ii) $A$ is higher-dimensional: in this case we now know that $q$ maps $A \setminus \{0\}$ onto $A \setminus \{0\}$.

We’ll continue the analysis to show that in this second case $\dim A = 2$.

Lemma 8.6.5. If the quadratic map $q$ is onto, then it is a (2-sheeted) covering map.

Proof. By Definition 8.2.3, we must show that every $y \in A \setminus \{0\}$ has an open neighborhood $V$ such that $q^{-1}(V)$ is the disjoint union of two open sets, on each of which $q$ restricts to a homeomorphism. Let $q^{-1}\{y\} = \{x, -x\}$. Then (by the inverse function theorem) there


is a neighborhood $U$ of $x$ such that $q$ maps $U$ homeomorphically (indeed diffeomorphically) onto $q(U)$. By shrinking $U$ if necessary, we may assume that $U \cap (-U) = \emptyset$. Take $V = q(U)$. Then $q^{-1}(V) = U \sqcup (-U)$ is the disjoint union of two open sets each of which is mapped homeomorphically onto $V$. □

A more general approach to this result is indicated in Exercise 8.7.14.

Lemma 8.6.6. If $\mathbb{R}^n \setminus \{0\}$ has a connected 2-sheeted covering, then $n = 2$.

Proof. For $n \ge 3$, $\mathbb{R}^n \setminus \{0\}$ is simply connected. (Any loop in $\mathbb{R}^n \setminus \{0\}$ is homotopic, by a radial retraction, to one in the unit sphere $S^{n-1}$. For $n \ge 3$ this unit sphere is simply connected (Example 2.3.6), so any loop in it is homotopic to a constant.) By Lemma 8.5.3, a simply connected space cannot have a connected covering with more than one sheet. □

Completion of the proof of Theorem 8.6.1. From Lemmas 8.6.5 and 8.6.6, together with the case-splitting above, we find that either $A$ is 1-dimensional (and so isomorphic to $\mathbb{R}$) or it is 2-dimensional. It remains to show that in this second case $A$ is isomorphic to $\mathbb{C}$.

So far our argument didn’t use the assumption of an identity. In fact there do exist other 2-dimensional, commutative, nonassociative division algebras without identity, for example the multiplication on $\mathbb{C}$ defined by $a \cdot b = \overline{ab}$. However, with the additional assumption of an identity $1$ we may argue as follows. Since $q$ is surjective, there exists an element $i$ such that $q(i) = -1$. Necessarily, $1$ and $i$ are linearly independent, since $q(\lambda 1) = \lambda^2 1 \ne -1$ for all real $\lambda$. Since $A$ is 2-dimensional, it is spanned by $1$ and $i$, so every element of $A$ is of the form $x1 + yi$, where $x, y \in \mathbb{R}$, and the multiplication is given by $1 \cdot 1 = 1$, $1 \cdot i = i \cdot 1 = i$, $i \cdot i = -1$. It is now evident that $A$ is isomorphic to $\mathbb{C}$. □

165

8.7. Exercises Exercise 8.7.1. Let X and Y be metric spaces (compact, if you like) with basepoints x0 and y0 . Let Z = X × Y with basepoint z0 = (x0 , y0 ). Show that there is an isomorphism ∼ π1 (X, x0 ) × π1 (Y, y0 ). π1 (Z, z0 ) = Exercise 8.7.2. Suppose that X is path connected and π1 (X, x0 ) is abelian. Show that, in this case, the isomorphisms π1 (X, x)0 ) → π1 (X, x1 ) associated to different paths from x0 to x1 are all the same. Exercise 8.7.3. Prove that homotopy equivalence is an equivalence relation (on the class of compact metric spaces with basepoint). Exercise 8.7.4. Show that the real projective plane can be obtained by gluing a disc to a M¨obius band along their boundary circles, and that in terms of this identification the generator of the fundamental group of RP2 is the “core” circle of the M¨obius band. Exercise 8.7.5. Show that there is no retraction of the M¨obius band onto its boundary circle. Exercise 8.7.6. Regard the 3-sphere S 3 as the space {(z1 , z2 ) : z1 , z2 ∈ C, |z1 |2 + |z2 |2 = 1}. Let p, q be natural numbers with p prime, q < p. Let T be the transformation S 3 → S 3 defined by T (z1 , z2 ) = (e2πi/p z1 , e2πiq/p z2 ). Describe a metric on S 3 for which T is an isometry. An action of Z on S 3 is defined by n · x = T n (x). The orbit space of this action is called the lens space L(p, q). Compute the fundamental group of L(p, q). Exercise 8.7.7. Consider the group G of isometries of R2 generated by the two transformations a(x, y) = (x + 1, y) (a translation) and b(x, y) = (−x, y + 1) (a glide reflection). Verify that ba = a−1 b (equivalently, aba = b). Deduce that any g ∈ G can be written as am bn and the multiplication law is 



(am bn )(am bn ) = am+(−1)

n

m n+n

b

.

Show that G is a nonabelian group and that its action on the plane is FPD. (The quotient G\R2 is called the Klein bottle.)

166

8. Coverings and the Fundamental Group

Exercise 8.7.8. (a) Show that every continuous map from the real projective plane to the torus is nullhomotopic. (b) Must every continuous map from the torus to the real projective plane be nullhomotopic? Exercise 8.7.9. Implement Algorithm 1 in your favorite programming language and check that it really works. Exercise 8.7.10. Verify the “not hard to see” statement in the proof of Proposition 8.5.2. Exercise 8.7.11. Let G = F2 be the free group on two generators x and y. Let C(x) denote the set of all g ∈ G whose reduced word begins with a strictly positive power of x, and let C(x−1 ) denote those whose reduced word begins with a strictly negative power of x. Similarly define C(y) and C(y −1 ). Show that G = {e} ∪ C(x) ∪ C(x−1 ) ∪ C(y) ∪ C(y −1 ) but also G = xC(x−1 ) ∪ C(x),

G = yC(y −1 ) ∪ C(y).

Deduce that there is no way to assign “probabilities” to subsets of G that satisfy the ordinary laws of probability theory (probabilities are real numbers between 0 and 1, the probability of all of G is 1, the probability of the empty set is 0, the probability of A ∪ B is the probability of A plus the probability of B minus the probability of the intersection), which is also translation invariant (the probability of A equals the probability of gA, for all g ∈ G). (This question produces a paradoxical decomposition: G is decomposed into four pieces which can be reassembled to make two copies of G. The underlying idea can be used in more complicated paradoxical decompositions such as the Banach-Tarski paradox, decomposing a solid ball in R3 into finitely many pieces which can be reassembled to make two equal-size balls. See [38].) Exercise 8.7.12. Show that C \ {−1, 1} is homotopy equivalent to the bouquet of two circles. Use this and our computations for the free group to verify that the loop in C\{−1, 1} described in Exercise 5.7.15 is not homotopic to a constant loop.

8.7. Exercises

167

Exercise 8.7.13. For an arbitrary group G, define G ⊗ Z2 to be the quotient of G by the subgroup generated by all squares and all commutators in G. Prove that Fn ⊗ Z2 is a finite group with 2n elements, and deduce that Fn and Fm are not isomorphic if n = m. Exercise 8.7.14. Give an example of a local homeomorphism which is not a covering map. Show, however, that a proper local homeomorphism is always a covering map (with finite fibers). (A continuous map p : X → Y between metric spaces is proper if p−1 (K) is compact in X whenever K is compact in Y .)

Chapter 9

Coda: The Bott Periodicity Theorem

9.1. Homotopy groups In this final chapter I want to follow a beautiful paper of Atiyah [4] to indicate one direction in which the theory that we’ve studied so far can be developed. By no means is this the only such direction! Indeed, the winding number lies at the foundation of nearly every part of modern mathematics. This will be a very brisk tour with almost no proofs. One way to generalize the fundamental group (and therefore the notion of winding number) is the following. Definition 9.1.1. Let (X, x0 ) be a pointed space. The nth homotopy group πn (X, x0 ) is [S n , X], the collection of homotopy classes of basepoint-preserving maps from the n-sphere to X. To understand the group operation, it is helpful to reformulate the definition and say that πn (X) is the collection of homotopy classes of maps from the unit n-cube I n to X which send the boundary ∂I n to the basepoint x0 . (This corresponds in the case n = 1 to our viewing loops as paths which begin and end at the basepoint.) Now we can define a group operation by concatenation on one of the I 169

170

9. Coda: The Bott Periodicity Theorem

factors in I n . It turns out not to matter, up to homotopy, which I factor we pick. This observation leads to Proposition 9.1.2. For n  2, the group πn (X) is abelian. Let’s calculate some of the groups πn (X). Proposition 9.1.3. πn (S 1 ) = 0 for n  2. Proof. In the diagram

Sn

|

|

|

R |=  / S1

the map S n → S 1 lifts to a map S n → R because of the lifting theorem (Theorem 8.2.8), since S n is simply connected for n  2. But R is contractible — the identity map R → R is homotopic to the constant map — and it follows that any map from any space to R is homotopic to the constant map. Thus any map S n → S 1 is  homotopic to a constant, and πn (S 1 ) = 0. The only thing about S 1 that we used in this argument is that it has a contractible covering space R. In general, a space with a contractible covering space is called an Eilenberg-MacLane space of type K(π, 1) and the same argument applies to show that all the higher homotopy groups of such a space vanish. Any compact manifold carrying a metric of nonpositive curvature (such as a surface of genus  1 or a compact hyperbolic 3-manifold) is an Eilenberg-MacLane space. We used the fact that π1 (S n ) = 1 for n  2 in the above proof. You’ll remember that the proof of this is a “general position” argument: a map S 1 → S n can be deformed to a piecewise linear map, which certainly misses a point of S n (say the north pole, N ), and then the map factors as S 1 → S n \ {N } → S n ; but since S n \ {N } ∼ = Rn is contractible, this map is homotopic to a constant. With a little extra care, this sort of general position

9.1. Homotopy groups

171

argument can be applied to maps S m → S n where m < n, thus giving Proposition 9.1.4. The groups πm (S n ) are trivial when m < n. What about when m = n? This is the situation with our winding number discussion, π1 (S 1 ) ∼ = Z. It turns out that πn (S n ) is also Z. There are two parts to this (just as there are with the winding number): first, defining a numerical invariant of maps S n → S n (the degree); second, proving that it is the only such invariant (i.e., two maps with the same degree are homotopic). Here’s how to define the degree. Let f : S n → S n be a map. By a homotopy, we can make it smooth. The appropriate version of Sard’s theorem (compare Proposition 3.4.6 for the case n = 1) says that for almost all p ∈ S n , the map f is transverse at p, meaning that f −1 {p} is a finite set of points and for each q ∈ f −1 {p}, the derivative Tq f of f , which is a linear map from the n-dimensional vector space T1 S n to the n-dimensional vector space Tp S n , is invertible. We count the number of points q (with appropriate sign determined by the determinant of Tq f ) and that number is the degree of f . It is not too hard to see that this is a homotopy invariant of the map f . This process gives a homomorphism πn (S n ) → Z for each n. To see that it’s an isomorphism needs some further ideas. Definition 9.1.5. Let X be a space. The suspension of X, denoted ΣX, is obtained from X × [0, 1] by identifying all of X × {0} to a single point and all of X × {1} to another single point. The suspension of S n is S n+1 . Suspension is functorial (that is, a map X → Y “suspends” to a map ΣX → ΣY ). These two facts combine to yield a suspension homomorphism Σ : πk (X) → πk+1 (ΣX) and in particular when X is a sphere, we have the suspension homomorphism πk (S n ) → πk+1 (S n+1 ). Theorem 9.1.6 (Freudenthal suspension theorem). The suspension homomorphism Σ : πk (X) → πk+1 (ΣX) is an isomorphism if X is n-connected (which means that πj (X) is trivial for j  n) and k  2n; it is an epimorphism if k = 2n + 1.

172

9. Coda: The Bott Periodicity Theorem

This is proved by transversality-type arguments (or by more elaborate techniques like spectral sequences, but I can’t go into that here). Applied to the groups πn (S n ) we find that in the sequence of suspension maps π1 (S 1 ) = Z → π2 (S 2 ) → π3 (S 3 ) → · · · the first map is an epimorphism and all the subsequent maps are isomorphisms. But the suspension map is clearly compatible with the degree, so that degree gives a right inverse to the epimorphism π1 (S 1 ) → π2 (S 2 ). Thus all suspension maps are isomorphisms in this case, and πn (S n ) = Z. Example 9.1.7. The existence of the degree allows one to prove the general case of the no-retraction theorem (Theorem 4.1.2) and therefore of the Brouwer fixed-point theorem. Indeed, we can mimic the proof given in Chapter 4 for the 2-dimensional case of Theorem 4.1.2, using the degree of a map S n−1 → S n−1 in place of the winding number of a map S 1 → S 1 . The discussion so far yields the following table of homotopy groups πk (S n ), where k labels the columns and n the rows: 1 1 Z 2 0 3 0 4 0

2 0 Z 0 0

3 0 ? Z 0

4 0 ? ? Z

This evidence is compatible with the idea that the whole table is full of zeroes except for Z’s down the diagonal. What makes homotopy theory fascinating (and difficult) is that this is very far from being the case. The first “?” in the table, π3 (S 2 ), provides an example. It turns out that this group is not zero, but Z. The generator is the Hopf map. (See [24] for a more detailed introduction.) Definition 9.1.8. The Hopf map is the map ψ : S 3 → S 2 that is given by sending each point of the unit sphere S 3 ⊆ C2 to the corresponding point of the complex projective line CP1 (which can be identified

9.1. Homotopy groups

173

with the Riemann sphere S 2 = C ∪ {∞} via stereographic projection (Lemma 2.3.7)). In explicit form, ψ(z, w) = (2z w, ¯ |z|2 − |w|2 ) ∈ C × R = R3 . For each point p ∈ S 2 , the set ψ −1 {p} is a copy of the circle S 1 sitting in S 3 . For two different points p, q ∈ S 3 the circles ψ −1 {p} and ψ −1 {q} are linked. (This is how we know that the Hopf map is homotopically nontrivial.) The Hopf map is actually a fibration. What is a fibration? It is a “nondiscrete” generalization of the notion of covering map. Definition 9.1.9. A continuous map p : E → B is a fibration if, for any space X and any maps f : X → E and g : X × [0, 1] → B with p(g(x)) = f (x, 0), the diagonal in the commutative diagram X ×  {0} _ v v X × [0, 1] 

g

v

v f

/E v; p

 /B

can be filled in by a continuous map. Compare this with Theorem 8.2.6 for covering spaces. The difference is that in the definition of fibration, we do not assert that the lift (the diagonal map) is uniquely determined by the given data. For example, any product projection B × F → B is a fibration, whether or not F is discrete. If p : E → B is a fibration and B is path connected, then all of the fibers p−1 {x}, x ∈ B, are homotopy equivalent. (Exercise: Prove this!) One usually picks one, e.g., the inverse image of the basepoint, and calls it “the” fiber F of the fibration. If G is a Lie group acting transitively by diffeomorphisms on a manifold X, then the map G → X defined by g → g·x0 (x0 being some fixed point in X) is a fibration whose fiber is the stabilizer subgroup Gx0 = {g ∈ G : g · x0 = x0 }.

174

9. Coda: The Bott Periodicity Theorem

In particular this covers the case of the Hopf fibration, which is given by an action of the Lie group S 3 = SU (2) (2 × 2 complex unitary matrices with determinant 1) on S 2 by rotations. Another important example is the path space fibration. Let B be any pointed space and let P B be the path space (Example 2.2.6) of B, that is, the collection of all maps [0, 1] → B that start at the basepoint. There is then a natural map p : P B → B sending each path in P B to its “free” end. A tautological argument shows that this map is a fibration. Its fiber is the loop space ΩB. Theorem 9.1.10. Let p : E → B be a fibration with fiber F . There is a long exact sequence of homotopy groups · · · → πk+1 (E) → πk+1 (B) → πk (F ) → πk (E) → πk (B) → · · · . When you see something like this, the most important question to ask is, how is the connecting homomorphism defined—that is, the one which shifts dimension, from πk+1 (B) → πk (F )? This homomorphism is just a generalized version of our definition of the winding number! For instance if k = 0, we start with a loop in B, lift it to a path in E starting at the basepoint, and ask where in F it ends up (π0 (F ) = F when F is discrete). The general case is just a version “with parameters” of the same argument. Example 9.1.11. For the Hopf fibration we obtain isomorphisms πk (S 3 ) = πk (S 2 ) for k  3, and also π2 (S 2 ) = π1 (S 1 ) via the connecting map (this is an alternative way to compute π2 (S 2 )). Example 9.1.12. The path space P B is always contractible, as we discussed a while ago, so the connecting map gives isomorphisms πk (B) = πk−1 (ΩB) for all k. The k = 1 case of this is our original definition of the fundamental group (as path components of the loop space).

9.2. The topology of the general linear group We are going to be interested in the topology of the general linear group GL(m, C) of m × m complex invertible matrices. When m = 1, this is our old favorite, the punctured plane.

9.2. The topology of the general linear group

175

Lemma 9.2.1. When k < 2m, the map πk (GL(m)) → πk (GL(m + 1)) (induced by adding an extra 1 at the bottom right of an invertible m × m matrix to give an invertible (m + 1) × (m + 1) matrix ) is an isomorphism. Proof. GL(m + 1) acts transitively on Cm+1 \ {0}, with stabilizer of the last basis vector being the group H of matrices   A 0 , C 1 where A ∈ GL(m) and C is an arbitrary row vector. The group H is homotopy equivalent to GL(m), via a linear homotopy on the C components. Moreover, Cm+1 \ {0} is homotopy equivalent to S 2m+1 . Thus, up to homotopy, there is a fibration GL(m) → GL(m + 1) → S 2m+1 , and the associated long exact sequence of homotopy groups gives the result.  Thus the interesting group is going to be π2m−1 (GL(m)), which we can also write as π2m−1 (GL) to indicate that there is no “penalty” for increasing the size of the matrices if we want. The case m = 1 is of course the winding number again. Theorem 9.2.2 (Bott periodicity theorem). The “odd order” homotopy group π2m−1 (GL(m)) is always isomorphic to Z. The isomorphism is given as follows: given f : S 2m−1 → GL(m), take the first column of f as a map g : S 2m−1 → Cm \ {0}  S 2m−1 . Then the Bott integer invariant of f is 1 deg(g), (m − 1)! where deg g is the usual degree of the map g. In particular, deg g is always divisible by (m − 1)!. Moreover, the “even order” groups π2m−2 (GL(m)) are always trivial. It is hard to overemphasize how surprising this theorem was when Raoul Bott found it in 1957. At that time it was assumed that the

176

9. Coda: The Bott Periodicity Theorem

homotopy groups of any “reasonable” space, like those of spheres, become successively harder and harder to compute. Bott’s work contradicted this expectation and predicted a previously unheard-of general pattern. But it turned out to be right. Like many important results, Bott periodicity has several different proofs. It turns out that one of them is based on Fredholm operators! Definition 9.2.3. Let Fred(H) denote the space of Fredholm operators on a Hilbert space H. The subset Fred0 (H) is defined to be the connected component of Fred(H) containing the identity, i.e., the Fredholm operators of index zero. Proposition 9.2.4. For k  1 there is an isomorphism πk (Fred0 ) ∼ = πk−1 (GL). To prove this proposition, what we must do is to exhibit a fibration (up to homotopy) GL → E → Fred0 , where the space E in the middle is contractible (that is, homotopy equivalent to a point). Let B(H) denote the space of all bounded operators on H, and let G be the group of invertible elements in B(H). Similarly, let Q(H) = B(H)/K(H) be the Calkin algebra, and let F be the group of invertible elements in Q(H). Let F0 be the component of the identity in F . Let ρ : B(H) → Q(H) be the quotient map. By Atkinson’s theorem (Theorem 7.2.4), F = ρ(Fred) and F0 = ρ(Fred0 ). The map G → F0 is a surjective group-homomorphism. The kernel K of this homomorphism is the group of invertibles on H which are of the form identity plus compact. Even though our groups are infinite-dimensional, it can be shown that G → F0 is a fibration with fiber K. Now Proposition 9.2.4 will follow from three assertions: (a) The fiber K is homotopy equivalent to GL. (b) The total space G is contractible (Kuiper’s theorem). (c) The projection ρ : Fred0 → F0 is a homotopy equivalence.

9.2. The topology of the general linear group

177

The most surprising of these three items is probably (b), Kuiper’s theorem: the group GL(H) of invertible bounded operators on an infinite-dimensional Hilbert space H is contractible. So I’m going to sketch the proof. Actually, I’m not going to sketch the full proof of Kuiper’s theorem, but of a weaker version, which says that the inclusion GL(H) → GL(H ⊕ H) given by a → ( a0 I0 ) is nullhomotopic. This weaker statement includes the main idea of Kuiper’s proof, and, actually, it would be enough for our purposes (because the “stabilization” idea of increasing the size of matrices is built into our discussion anyhow). Remark 9.2.5. Don’t confuse GL(H) with the smaller group GL = limn→∞ GL(n) that we discussed earlier. This latter group is a proper subgroup of the big group GL(H). Proof of the simplified Kuiper theorem. If U is any invertible bounded operator, there is a standard formula which gives a path γU of 2 × 2 invertible matrices with  γU (0) =

U 0

0 U −1



 ,

γU (1) =

I 0

0 I

 .

We constructed such a formula in our proof of Proposition 7.2.6. Now let U be any invertible on H. We map to GL(H ⊕ H) as discussed above, but now we use the fact that the second H is isomorphic to the direct sum of infinitely many copies of itself. Thus, we are looking at the invertible operator in GL(H ⊕ H ⊕ H ⊕ · · · ) which has U at the top left and I’s down the rest of the diagonal. For simplicity let’s just denote this operator as V = U ⊕ I ⊕ I ⊕ . . ., and let’s distinguish the individual Hilbert space factors as H0 , H1 , H2 , and so on. Thinking about the subspace H1 ⊕H2 , V restricts to this subspace and acts as I ⊕ I. Follow the path γU −1 backwards on this subspace to get to the operator U −1 ⊕ U . What’s more, follow that same path on all the subspaces H2k−1 ⊕ H2k (in other words, take the direct sum of all the paths). At time 1 we have deformed V to W = U ⊕ (U −1 ⊕ U ) ⊕ (U −1 ⊕ U ) ⊕ · · · .

178

9. Coda: The Bott Periodicity Theorem

(The parentheses just indicate the individual 2-space summands on which we carried out our rotations.) Now reparenthesize W as W = (U ⊕ U −1 ) ⊕ (U ⊕ U −1 ) ⊕ · · · and on each of the summands H2k ⊕ H2k+1 follow the path γU to get to (I ⊕ I) ⊕ (I ⊕ I) ⊕ · · · , which is of course the identity.



Remark 9.2.6. The key observation from functional analysis which makes this work is that if you have an operator on Hilbert space given by a diagonal matrix (or a block diagonal matrix, i.e., one which is a direct sum of “blocks”), then the norm of the operator is the supremum of the norms of the diagonal entries or blocks. Thus, if all the blocks vary in a continuous way (with some uniformity, which is certainly true here because the blocks are all basically the same), then the operator varies continuously as well. These ideas give us a fibration (up to homotopy) GL → • → Fred0 which provides isomorphisms πk (Fred) → πk−1 (GL) (morally speaking, a map Ω Fred → GL). What about the other direction to finish the proof of Bott periodicity? We need a map ΩGL → Fred. But we already constructed such a map! An element of ΩGL is just a map f : S 1 → GL(k) for some k. To this, we may associate a (matrix) Toeplitz operator Tf as in Theorem 7.4.4, which is Fredholm since its symbol is invertible. In this way we get a map ΩGL → Fred and the index theorem for matrix Toeplitz operators says that this map induces an isomorphism on the level of π0 , from π1 (GL) = π0 (ΩGL) to π0 (Fred). We need to understand why this map too is a homotopy equivalence. The reason is going to be the Toeplitz index theorem plus some extra algebraic structure. To get a better idea about this let’s express the crucial homotopy groups πk−1 (GL) in a more homogeneous manner. Suppose we have a

9.2. The topology of the general linear group

179

map ϕ : S k−1 → GL(m). We extend it to a map Rk → GL(Cm ⊕ Cm ) as follows:

0 |x|ϕ(x/|x|) Φ(x) = . 0 |x|ϕ(x/|x|)−1 In other words, we have first extended ϕ homogeneously from the unit sphere to the whole of Euclidean space, and then we have “doubled” it to act on Cm ⊕ Cm . A vector space which is split into two “halves” in this way is sometimes called a super vector space, and linear transformations on it are classified as odd or even according to whether they preserve or reverse the two factors in the decomposition. The map Φ is what is called a homogeneous supersymmetry; in other words, it is odd and satisfies Φ(x)2 = |x|2 I. The homotopy classes of such homogeneous supersymmetries on Rk can be organized into a group K(Rk ), which is of course isomorphic to πk−1 (GL) — all we have done is rewrite things in a different way. The advantage of this approach though is that it makes evident a multiplicative structure which we might not have seen before. There is a product K(Rk1 ) × K(Rk2 ) → K(Rk1 +k2 ) defined by ( + 1⊗Φ ( 2, Φ = Φ1 ⊗1 ) k and this makes the direct sum K = k K(R ) into a graded ring. What we have to prove is that this graded ring is actually a polynomial ring on b ∈ K(R2 ), where b is the Bott generator

0 z b(z) = , z¯ 0 where we have identified R2 with C. The map α : πk+1 (GL) → πk−1 (GL) defined by Toeplitz index theory now becomes a map of degree −2, α : K → K. Lemma 9.2.7. α is a map of Z[b]-modules. What this means is that once we know that α(b) = 1 — which is the basic Toeplitz index theorem — we will also know that α(bm ) = bm−1 and thus that α is a left inverse to multiplication by b. To show

180

9. Coda: The Bott Periodicity Theorem

that it is also a right inverse, Atiyah [5] used an ingenious homotopy to show Lemma 9.2.8. Let X be any space. The diagram K(X × R2 )

b

/ K(X × R2 × R2 )

b

 / K(X × R2 )

α

 K(X)

α

commutes. Because of this lemma, the fact that α is a right inverse of b for the space X follows from the fact that it is a left inverse of b for the space X × R2 . Our brief sketch of the proof of Bott periodicity is thereby completed, and with it we conclude this book. I wish you much happy winding around in the future.

Appendix A

Linear Algebra

Linear algebra is the study of linear transformations. A linear transformation is, essentially, one that can be computed on the sum of two objects by applying it to each of the objects separately and then adding the results (the formal definition is a tad more complicated: it can be found in Definition A.3.1). Such objects are common in mathematics, and linear algebra is therefore an important foundation for many mathematical theories. A classical reference for the material of this appendix is Halmos [20]. For a more modern treatment, see Axler [7]. Either of these books will provide much more detail and background about the key ideas of linear algebra.

A.1. Vector spaces We fix a number system called the field of scalars. For the purposes of this book, the scalars will be either the real numbers (R) or the complex numbers (C). (Other scalar fields are possible, and important in various parts of mathematics, but we won’t need them here.) To encompass both possibilities, we’ll use the symbol K. 181

182

A. Linear Algebra

Definition A.1.1. A vector space with scalar field K is a set V equipped with two mappings: (a) addition, which is a mapping V × V → V and is denoted (u, v) → u + v, and (b) scalar multiplication, which is a mapping K × V → V and is denoted (λ, v) → λv. These mappings are required to have the following properties: (i) Addition is associative: u + (v + w) = (u + v) + w for all u, v, w ∈ V. (ii) Addition is commutative: u + v = v + u for all u, v ∈ V . (iii) There is an identity element for addition: that is, an element 0 ∈ V such that 0 + v = v = v + 0 for all v ∈ V . (iv) There are additive inverses in V : for each v ∈ V there exists an element (−v) ∈ V such that v + (−v) = 0. (v) Scalar multiplication distributes over addition of scalars: (λ + μ)v = λv + μv for all λ, μ ∈ K and v ∈ V . (vi) Scalar multiplication distributes over addition of vectors: λ(u + v) = λu + λv for all λ ∈ K and u, v ∈ V . (vii) Scalar multiplication is associative: (λμ)v = λ(μv) for all λ, μ ∈ K and v ∈ V . (viii) The multiplicative unit 1 ∈ K acts as unit for scalar multiplication: 1v = v for all v ∈ V . The elements of a vector space are called vectors. The set consisting of a single vector 0 is a vector space, the zero space. The field K itself is an example of a vector space (with scalar multiplication being multiplication in K). More generally, for any integer n, the collection Kn of n-tuples of elements of K, with “componentwise” addition and scalar multiplication, is a vector space. Indeed, the space of functions from any set X to K is a vector space, with componentwise operations: the space Kn of n-tuples is an example of this, where X = {1, 2, . . . , n}.

A.1. Vector spaces

183

Definition A.1.2. Let S be a subset of a vector space V . An expression of the form k  λi s i , i=1

where the λi are members of K (scalars) and the si are members of S, is called a linear combination of members of S. Definition A.1.3. A subset W of a vector space V is called a subspace if it has the property that any linear combination of members of W is itself a member of W . In this case, the addition and scalar multiplication mappings send W × W into W and K × W into W , so that W may be considered as a vector space in its own right. Example A.1.4. Let X be a metric space (for example, the closed interval [0, 1]) and let V be the vector space of all functions X → K. Then the continuous functions X → K form a subspace of V . Let V be a vector space and let V1 , V2 be subspaces of V . Their intersection V1 ∩ V2 is always a subspace of V as well. Their union V1 ∪ V2 is almost never a subspace of V (think for example of V = R2 with V1 being the x-axis and V2 being the y-axis). However, the sum V1 + V2 := {v1 + v2 : v1 ∈ V1 , v2 ∈ V2 }, that is, the collection of all vectors v ∈ V that can be represented as a sum v = v1 + v2 with v1 ∈ V1 and v2 ∈ V2 , is a subspace of V and in fact is the smallest such subspace that contains both V1 and V2 . Definition A.1.5. Let V be a vector space and V1 , V2 subspaces of V . If V1 ∩ V2 = {0} and V1 + V2 = V , then we say V is the direct sum of V1 and V2 and write V = V1 ⊕ V2 . It is equivalent to say that V = V1 ⊕ V2 if every v ∈ V can be represented in a unique way as a sum of v1 ∈ V1 and v2 ∈ V2 . The uniqueness follows because if v1 + v2 = v = v1 + v2 are two such representations, then v1 − v1 = v2 − v2 belongs to the intersection V1 ∩ V2 and is therefore zero. Thus the maps P1 : V → V1 and P2 : V → V2 , sending v to its components v1 and v2 , respectively, are well-defined.

184

A. Linear Algebra V2

V2 V2 V1 Figure A.1. A subspace (V1 ) of the plane and several complementary subspaces to it (V2 , V2 , V2 ).

Definition A.1.6. The maps P1 , P2 defined above are called the projections associated to the direct sum decomposition V = V1 ⊕ V2 . (We call P1 the projection onto V1 along V2 , and we call P2 the projection onto V2 along V1 . ) Definition A.1.7. Let V be a vector space and V1 a subspace. Another subspace V2 is called a complement (or complementary) to V1 if V = V1 ⊕ V2 . Note that there can be many complements to a given subspace. See Figure A.1.

A.2. Basis and dimension Let V be a vector space and let S = {s1 , . . . , sk } be a finite subset of V . Associated to S is a map (A.2.1)

ΦS : Kk → V,

ΦS (λ1 , . . . , λk ) =

k 

λi s i ,

i=1

which sends each k-tuple of scalars (λ1 , . . . , λk ) to the corresponding linear combination of members of S. Definition A.2.2. Let S and V be as above. Then: (i) If the map ΦS is injective, we say that S is a linearly independent set. (That is to say, S is linearly independent if there is no

A.2. Basis and dimension nontrivial linear relation its members.)

185 k i=1

μi si = 0, some μj = 0, among

(ii) If the map ΦS is surjective, we say that S is a spanning set for V , or spans V . (That is to say, every vector in V is a linear combination of members of S.) (iii) If the map ΦS is bijective, we say that S is a basis for V . (That is, a basis is a linearly independent spanning set.) Definition A.2.3. If a vector space V has a finite spanning set, we say that it is finite-dimensional. Lemma A.2.4. Let V be a finite-dimensional vector space. Then: (a) Any (finite) spanning set has a subset which is a basis. (b) Any (finite) linearly independent set is a subset of a basis. Proof. (a) Let S be a finite spanning set. By removing elements of S one by one, you will eventually reach a minimal spanning set: a subset S  of S which spans V , but such that no proper subset of it spans V . I claim that such a minimal spanning set is linearly independent and is therefore a basis. Indeed, if the set S  fails to be linearly independent, there exist s1 , . . . , sk ∈ S  and scalars λ1 , . . . , λk ∈ K, not all zero, such that  k j=1 λj sj = 0. Suppose without loss of generality that λ1 = 0. Then k  s1 = − (λj /λ1 )sj , j=2

and using this relation, any linear combination of members of S  can be rewritten so as not to involve s1 . Thus S  \ {s1 } spans V , contradicting the hypothesis that S  is a minimal spanning set. The proof of (b) is similar. Suppose T is a linearly independent set and S is a (finite) spanning set for V . By including elements of S one by one to T , you will eventually reach a maximal linearly independent set: a set T  with T ⊆ T  ⊆ T ∪ S which is linearly independent, but with the property that any T  ⊆ T ∪ S that properly includes T  is not linearly independent. I claim that T  spans V . Suppose first that s ∈ S \ T  . Then the set T  ∪ {s} is not linearly independent, so

186

A. Linear Algebra

there is a linear relation among its members. Because T  is linearly independent, this linear relation must include the vector s with a nonzero coefficient. Thus (rewriting the relation as we did in the previous paragraph) s is a linear combination of members of T  . This applies to every s ∈ S \ T  , so all members of S are either in T  already or are linear combinations of members of T  . Since S spans  V , it follows that T  spans V , as required. So far we have defined “finite-dimensional” but not “dimension”. The next result allows us to give a definition of dimension. Proposition A.2.5. In a finite-dimensional vector space, the number of elements in any linearly independent set is less than or equal to the number of elements in any spanning set. In particular, any two bases have the same (finite) number of elements. Definition A.2.6. The number of elements in a basis of a vector space V is called its dimension and is denoted dim(V ). Proof. Let S span V and let T be linearly independent. We will describe a process (called the replacement algorithm) which will produce a sequence of spanning sets S = S0 , S1 , S2 such that: (a) All of the sets Sj have the same number (say m) of elements. (b) Sj consists of j elements of T and m − j elements of S. (c) Sj+1 is obtained from Sj by replacing an element of S (already contained in Sj ) by an element of T (not already contained in Sj ). The process terminates when all the elements of S have been replaced by elements of T . At that point we have obtained a spanning set Sk that has the same number of elements as the original S and has T as a subset. Clearly this implies that T has no more elements than S. Here is how the replacement algorithm works. Suppose that S originally has m elements. At the jth step, we have Sj = {t1 , . . . , tj , s1 , . . . , sm−j }, where the t’s belong to T and the s’s belong to S, and this is a spanning set. If the set {t1 , . . . , tj } is all of T , we are done. Otherwise, there is some element of T that is not one of the t1 , . . . , tj ; choose such an element and denote it by tj+1 .

A.2. Basis and dimension

187

Since Sj is a spanning set, there exist scalars λi and μi such that tj+1 =

j  i=1

λ i ti +

m−j 

μi s i .

i=1

At least one of the μ’s must be nonzero (since T is a linearly independent set): say (without loss of generality) that it is μ1 . Then we can rewrite the equation above to express s1 as a linear combination of the members of Sj+1 = {t1 , . . . , tj+1 , s2 , . . . , sm−j }. Thus every linear combination of members of Sj is also a linear combination of members of Sj+1 , so Sj+1 spans V . We have successfully  “replaced” s1 by tj+1 , as required. Proposition A.2.7. Every subspace of a finite-dimensional vector space is finite-dimensional and has a complement which is also finitedimensional. Moreover, if the finite-dimensional space V is the direct sum of subspaces X and Y (see Definition A.1.5), then dim(V ) = dim(X) + dim(Y ). Proof. First we show that any subspace of a finite-dimensional space is finite-dimensional. Let V be finite-dimensional and X a subspace of V . Every linearly independent subset of X is also a linearly independent subset of V : thus, no such subset can have more than dim V members, by Proposition A.2.5. Therefore there exist maximal linearly independent (finite) subsets of X: such a subset S ⊆ X is linearly independent, but no subset of X that properly contains S is linearly independent. The same argument as in the proof of Lemma A.2.4(b) shows that such a maximal linearly independent subset is a basis for X. Thus X is finite-dimensional (and its dimension is less than or equal to the dimension of X). Now choose a basis {x1 , . . . , xk } for X. It is linearly independent (in V ) so by Lemma A.2.4(b) there are elements {y1 , . . . , y } such that {x1 , . . . , xk , y1 , . . . , y } is a basis for V . The elements {y1 , . . . , y } form a basis for another subspace Y which is complementary to X. Conversely, if X and Y are complementary subspaces and we combine

188

A. Linear Algebra

bases {x1 , . . . , xk } for X and {y1 , . . . , y } for Y , we obtain a basis {x1 , . . . , xk , y1 , . . . , y } for V . We now have dim V = k +  = dim X + dim Y, 

as required.

Corollary A.2.8. The only n-dimensional subspace of an n-dimensional vector space V is V itself. Proof. A complementary subspace is 0-dimensional, hence trivial. 

A.3. Linear transformations Definition A.3.1. Let V and W be vector spaces (having the same scalar field K). A map T : V → W is called a linear map or linear transformation if T (λ1 v1 + λ2 v2 ) = λ1 T (v1 ) + λ2 T (v2 ) for all λ1 , λ2 ∈ K and v1 , v2 ∈ V . Example A.3.2. Suppose that V is decomposed as a direct sum V = V1 ⊕ V2 . Then the projections P1 and P2 associated to the direct sum decomposition (see Definition A.1.6) are linear maps. Example A.3.3. If S : U → V and T : V → W are linear maps, so is their composite T ◦ S (the map U → W defined by (T ◦ S)(u) = T (S(u))). Definition A.3.4. Let T : V → W be a linear map. The kernel and image of T are defined by Ker T = {v ∈ V : T v = 0}, Im T = {w ∈ W : ∃v ∈ V, T v = w}. They are vector subspaces of V and W , respectively. Remark A.3.5. By definition, a linear map T : V → W is surjective (onto) if and only if Im(T ) = W . Also, T is injective (one-to-one) if Ker(T ) = {0}: to see this, notice that if T (v1 ) = T (v2 ), then T (v1 − v2 ) = T (v1 ) − T (v2 ) = 0,

A.3. Linear transformations

189

and therefore v1 − v2 ∈ Ker(T ). If T is bijective (that is, both surjective and injective), it is called an isomorphism. Definition A.3.6. Vector spaces V and W are isomorphic if there exists an isomorphism (a bijective linear map) from V to W . Example A.3.7. Let S = {s1 , . . . , sk } be a finite subset of V . Then the associated map ΦS : Kk → V (see (A.2.1)) is linear. If S is a basis for V , then ΦS is linear and bijective, hence an isomorphism. Proposition A.3.8. Isomorphic vector spaces have the same dimension. Proof. Let T : V → W be an isomorphism and let S be a subset of V . Define T (S) to be {T s : s ∈ S} ⊆ W . Because T is surjective, if S spans V , then T (S) spans W . Because T is injective, if S is linearly independent in V , then T (S) is linearly independent in W . Thus T carries bases of V to bases of W , and the result follows.  Proposition A.3.9. Let V be a vector space and let W be a subspace of V . Then any two complementary subspaces to W are isomorphic. Proof. Let X and Y be two complementary subspaces, so that V = W ⊕ X = W ⊕ Y . Let P denote the projection onto X along W , and let Q denote the projection onto Y along W . I claim that if x ∈ X, then P (Q(x)) = x. Indeed, write x = y + w,

y ∈ Y, w ∈ W.

Then by definition P (x) = y. But then the decomposition y = x + (−w),

x ∈ X, (−w) ∈ W,

shows that, by definition, Q(y) = x. It follows that Q : X → Y and P : Y → X are inverses of one another, so they are isomorphisms.  The previous two propositions imply that the following definition is a good one. Definition A.3.10. The codimension of a subspace W of a vector space V is the dimension of any complementary subspace of W .

190

A. Linear Algebra

If everything is finite-dimensional, Proposition A.2.7 shows that the codimension of W in V is just dim(V ) − dim(W ). The idea of codimension gains importance in the infinite-dimensional context. Even if V and W are not finite-dimensional, the codimension of W in V may be finite, as the following example shows. Example A.3.11. Let V be the vector space of all continuous functions [0, 1] → K, and let W be the subspace consisting of all functions that vanish at 0 (that is, W = {f ∈ V : f (0) = 0}). Both V and W are infinite-dimensional. However, W has codimension 1 in V . To show this, we just need to construct a 1-dimensional complementary subspace; I claim that the space of constant functions does the job. The space X of constant functions is 1-dimensional (a basis is provided by the single function 1, the constant function with value 1). Clearly W ∩ X = {0}, and on the other hand the identity f = (f − f (0)1) + (f (0)1) +, - * +, * ∈W

∈X

shows that W + X = V . Lemma A.3.12. Let V be a finite-dimensional vector space and let T : V → W be a linear transformation. Let Y be a subspace of V complementary to Ker(T ). Then the restriction of T to a map Y → Im(T ) is an isomorphism. Proof. The existence of the complementary subspace Y follows from Proposition A.2.7. Let T  denote the restriction of T to a map Y → Im(T ). Then Ker T  = Ker T ∩ Y = {0}, so T  is injective. On the other hand, if w ∈ Im(T ), then w = T v for some v, and we may write v = x + y with x ∈ Ker(T ) and y ∈ Y . We have w = T (v) = T (x) + T (y) = 0 + T  (y) ∈ Im(T  ). Thus T  is surjective. Being linear, injective, and surjective, T  is an isomorphism.  It’s helpful to represent this lemma as in Figure A.2: T maps “part” of V (the kernel) to zero, and a “complementary part” (the space Y ) isomorphically to a “part” of W (the image).

A.3. Linear transformations

191

V

W

Complement to Ker T

∼ =

Complement to Im T

Ker T

T

Im T

0 Figure A.2. Schematic representation of a linear operator.

Definition A.3.13. Let T : V → W be a linear map. The nullity of T is the dimension of Ker T , and the rank of T is the dimension of Im T . Theorem A.3.14 (Rank-nullity theorem). For a linear transformation T : V → W between finite-dimensional vector spaces, we have Nullity T + Rank T = dim V. Proof. Let Y be a complement to Ker(T ) in V . According to Proposition A.2.7, dim(Y ) + dim Ker T = dim V. But by Lemma A.3.12, Y is isomorphic to Im T . The result follows from Proposition A.3.8.  A reformulation of the rank-nullity theorem will become important in the discussion of the theory of Fredholm operators in Chapter 7. Definition A.3.15. Let T : V → W be a linear map. The corank of T , Corank T , is the codimension of Im(T ) in W . Similarly, the conullity of T , Conullity T , is the codimension of Ker T in V . Proposition A.3.16. For a linear transformation T : V → W between finite-dimensional vector spaces, we have Nullity(T ) − Corank(T ) = dim V − dim W.

192

A. Linear Algebra

Proof. This follows immediately from Theorem A.3.14 and the definition of codimension.  We can generalize this result by using the idea of exact sequence. Definition A.3.17. Let V0 , V1 , . . . , Vn be a sequence of vector spaces and Tj : Vj → Vj+1 linear maps between them: V0

T0

/ V1

T1

/ ···

Tn−1

/ Vn .

The sequence is exact if Im(Tj ) = Ker(Tj+1 ) for each j = 0, . . . , n−1. An exact sequence is terminating if V0 = Vn = {0}. Theorem A.3.18. In a terminating exact sequence of finite-dimensional vector spaces, the alternating sum of the dimensions of the  spaces, j (−1)j dim(Vj ), is equal to zero. Proof. For each j let Xj be the subspace of Vj defined by Xj = Ker(Tj ) = Im(Tj−1 ), and let Yj be a complementary subspace. According to Lemma A.3.12, Tj gives rise to an isomorphism between Yj and Xj+1 , so these spaces are of the same dimension. Thus   (−1)j dim(Vj ) = (−1)j (dim(Xj ) + dim(Yj )) j

j

=



(−1)j (dim(Xj ) − dim(Xj+1 )) .

j

The expression on the right is a “telescoping” sum which gives 0, as required. 

A.4. Duality Let V be a vector space with scalar field K. A linear map ϕ : V → K is called a linear functional on V . Suppose that ϕ1 , ϕ2 are linear functionals. We can define their sum to be the linear functional ϕ given by ϕ(v) = ϕ1 (v) + ϕ2 (v).

A.4. Duality

193

We write ϕ = ϕ1 +ϕ2 in this case. In the same way, if λ is a scalar, we can define the scalar product of λ and ϕ1 to be the linear functional ψ given by ψ(v) = λϕ1 (v), and we may write ψ = λψ1 . Thus we have equipped the collection of all linear functionals with operations of addition and scalar multiplication, and it is easy to verify that (equipped with these operations) the space of linear functionals is itself a vector space. Definition A.4.1. The dual space V ∗ of V is the space of linear functionals V → K. Proposition A.4.2. If V is finite-dimensional, then V ∗ is finitedimensional also, and dim(V ) = dim(V ∗ ). Proof. Choose a basis S = {v1 , . . . , vn } for V . Then, for each k ∈ {1, . . . , n}, define a linear functional ϕk : V → K by ⎛ ⎞ n  λj vj ⎠ = λk ; ϕk ⎝ j=1

in other words, ϕk (v) is obtained by writing v as a linear combination of the basis vectors v1 , . . . , vn and then taking the coefficient of vk . I claim that the set P = {ϕ1 , . . . , ϕn } forms a basis for V ∗ . To prove this we must show two things: P spans V ∗ and P is linearly independent. For both arguments the key observation is that  1 if j = k, ϕk (vj ) = 0 if j = k, which follows from the definition of ϕk . To prove that P spans V ∗ , let ϕ ∈ V ∗ and consider the linear functional n  ψ= ϕ(vk )ϕk . k=1

From the “key observation” above, ϕ(vj ) = ψ(vj ) for each j = 1, . . . , n. Since every v ∈ V is a linear combination of v1 , . . . , vn , it follows that ϕ(v) = ψ(v) for every v, that is, ϕ = ψ. But ψ is a

194

A. Linear Algebra

linear combination of ϕ1 , . . . , ϕn , by construction. It follows that P spans V ∗ . To prove that P is linearly independent, suppose there is a linear relation of the form n  μk ϕk = 0. k=1

Applying this linear relation to vj and using the “key observation”, we find that μj = 0. Since j was arbitrary, all the μ’s are zero. So no nontrivial linear relation exists, and we have proved that P is linearly independent. Since P and S have the same number of elements, dim(V ) =  dim(V ∗ ), as asserted. Remark A.4.3. The basis P for V ∗ that appears above is called the dual basis to the originally given basis of V . The operation of forming duals (“dualization”) can be applied to linear maps between vector spaces, as well as to the vector spaces themselves. Specifically, suppose that T : V → W is a linear transformation and that ϕ ∈ W ∗ . We can define a linear functional ψ ∈ V ∗ by ψ(v) = ϕ(T (v)), or ψ = ϕ ◦ T. We denote ψ by T ∗ ϕ. Then T ∗ is a linear transformation W ∗ → V ∗ (in the “backwards” direction). It is called the dual transformation to V . It is an immediate consequence of the definition that if S : U → V and T : V → W are linear transformations, then (T ◦ S)∗ = S ∗ ◦ T ∗ ; this is expressed by saying that the process of dualization is functorial.

A.5. Norms and inner products In the previous section we have seen that if V is a finite-dimensional vector space, then V and V ∗ have the same dimension and are therefore isomorphic. However, there is no unique choice of an isomorphism between V and V ∗ . One way in which one can make such a choice is by fixing an inner product on V .

A.5. Norms and inner products

195

Definition A.5.1. Let V be a vector space. An inner product on V is a map V × V → K, denoted by (v, w) → v, w, which has the following properties: (i) (Linearity) For scalars λ1 , λ2 ∈ K and vectors v1 , v2 , w ∈ V , one has

λ1 v1 + λ2 v2 , w = λ1 v1 , w + λ2 v2 , w. (ii) (Positive definiteness) The inner product of a vector with itself,

v, v, is a nonnegative real number and is equal to 0 if and only if v is the zero vector. (iii) (Symmetry) This has two versions, depending on whether K = R or C. If K = R, we require simply that v, w = w, v for all v and w. If K = C, we introduce complex conjugation into this identity and require that v, w = w, v for all v and w. A vector space equipped with an inner product is called an inner product space. The “dot product” of multivariable calculus is an example of an inner product. Some people (following this example) also use the term “scalar product” for an inner product, but we will avoid this terminology, as it leads to confusion with the scalar multiplication operation K × V → V . Lemma A.5.2. Let V be an inner product space. Then for all vectors u, v ∈ V we have the Cauchy-Schwarz inequality | u, v|  u

v

with equality if and only if u and v are linearly dependent. Proof. Suppose, without loss of generality, that u = 0 (otherwise the inequality is trivial). Put w = u, vu − u, uv. Then, expanding and using the properties of the inner product,

w, w = | u, v|2 u, u − 2| u, v|2 u, u + u, u2 v, v. This quantity is nonnegative, so | u, v|2 u, u  u, u2 v, v, and dividing through by the (strictly positive) quantity u, u gives the

196

A. Linear Algebra

Cauchy-Schwarz inequality. If equality occurs, w = 0, and this gives the asserted linear relation between u and v.  Corollary A.5.3. In an inner product space V the expression v = ( v, v)1/2 defines a norm on V ; that is to say,

λv = |λ| v ,

u + v  u + v ,

for all λ ∈ K and u, v ∈ V . Proof. The first equality follows immediately from properties of the inner product. For the second displayed item (the triangle inequality), use Cauchy-Schwarz to write

u + v 2 = u 2 + v 2 + 2 Re u, v  u 2 + v 2 + 2 u

v . The right side is ( u + v )2 , and taking square roots gives the result.  Let V be an inner product space. For each w ∈ V the map ϕw : V → K defined by ϕw (v) = v, w belongs to V ∗ . A linear functional ϕ that arises in this way (i.e., is equal to ϕw for some w) is said to be represented by the inner product. Proposition A.5.4 (Representation theorem). If V is a finitedimensional inner product space, then every linear functional on V is represented by the inner product. Proof. Suppose first that K = R. Then the assignment w → ϕw defines a linear transformation Φ from V to V ∗ . I claim that this linear transformation is injective: indeed, if w ∈ Ker Φ, then ϕw (w) =

w, w = 0, and it follows from the definiteness of the inner product that w = 0. From the rank-nullity theorem, then, dim(Im Φ) = dim V = dim V ∗ . Since Im Φ is a subspace of V ∗ , it must be equal to V ∗ (Corollary A.2.8). This is the representation theorem. If K = C, this argument does not quite work as stated, be¯ 1 Φ(v1 ) + λ ¯ 2 Φ(v2 )) cause the map Φ is antilinear (Φ(λ1 v1 + λ2 v2 ) = λ rather than linear. It is necessary to verify that the conclusion of the rank-nullity theorem still holds for antilinear maps. This is a

A.6. Matrices and determinants

197

straightforward generalization of the earlier discussion and is left to the reader.  If V is not finite-dimensional, the representation theorem does not hold in the form that we have stated. Later we shall single out an important class of infinite-dimensional spaces (Hilbert spaces) where a version of the representation theorem does remain true. Remark A.5.5. In passing, we defined above the notion of norm on a vector space V . Recall that a norm is a map V → R+ such that

λv = |λ| v ,

u + v  u + v ,

for all λ ∈ K and u, v ∈ V . An inner product defines a norm, but there are examples of norms that do not arise in this way: for instance, the expression

(x, y) = max{|x|, |y|} defines a norm on R2 that does not arise from any inner product. The notion of norm will be important in our discussions of multivariable calculus (Appendix E) and, once again, of Hilbert space (Appendix F).

A.6. Matrices and determinants Let V be a finite-dimensional vector space, and suppose that a basis B = {v1 , . . . , vk } has been chosen for V . Then, as we remarked in Example A.3.7, the linear map Kk → V defined by sending a k-tuple  (λ1 , . . . , λk ) to the vector v = λj vj (see (A.2.1)) is an isomorphism from Kk to V . It is conventional to write the list of scalars {λj } in a vertical format, like ⎞ ⎛ λ1 ⎜ .. ⎟ ⎝ . ⎠, λk and call it the column that represents the vector v relative to the given basis B. We use the notation [v]B for this column. Now suppose that T : V → W is a linear transformation between finite-dimensional vector spaces and that bases B = {v1 , . . . , vn } and

198

A. Linear Algebra

C = {w1 , . . . , wm } are chosen in V and W , respectively. Then there are unique scalars tij ∈ K, i = 1, . . . , m, j = 1, . . . , n, such that T (vj ) =

m 

tij wi .

i=1

The array of scalars



t11 ⎜ .. B [T ]C = ⎝ . tm1

··· .. . ···

⎞ t1n .. ⎟ . ⎠ tmn

is called the matrix that represents the linear transformation T . Using linearity, one checks easily that the composite map ΦC ◦ T ◦ Φ−1 B — that is, the map that takes the column [v]B (say of scalars λj ) that represents v ∈ V to the column [T v]C (say of scalars μi ) that represents T v ∈ W — is given by the usual rule of matrix multiplication [T v]C = [T ]B C [v]B ; that is, ⎛ ⎞ ⎛ ⎞⎛ ⎞ μ1 λ1 t11 · · · t1n ⎜ .. ⎟ ⎜ .. .. ⎟ ⎜ .. ⎟ . .. ⎝ . ⎠=⎝ . . . ⎠⎝ . ⎠ μm

tm1

···

tmn

λn

Example A.6.1. We can apply this idea to the dual space V ∗ of V (Definition A.4.1). An element ϕ ∈ V ∗ is just a linear map V → K. Taking the obvious basis {1} of the 1-dimensional vector space K, we see that ϕ is represented by a row (α1 , . . . , αn ) in such a way that the result of applying ϕ to v corresponds to the matrix product ⎞ ⎛ λ1 n ⎟  ⎜ (α1 , . . . , αn ) ⎝ ... ⎠ = αj λj . λn

j=1

The {αj } are therefore the coefficients of the expression of ϕ in terms of the dual basis (Remark A.4.3) to the original basis B of V . Lemma A.6.2. Let S : U → V and T : V → W be linear transformations between finite-dimensional vector spaces, and let A, B, C be

A.6. Matrices and determinants

199

bases in U, V, W , respectively. Then the matrices of T , S, and T ◦ S are related by B A [T ◦ S]A C = [T ]C [S]B ,

where the product on the right is matrix multiplication. Proof. Let u ∈ U . By definition of T ◦ S, we have (T ◦ S)(u) = T (S(u)). This translates into matrix language as   B A [S] . [u] = [T ] [u] [T ◦ S]A A A C C B Since matrix multiplication is associative and since it is distributive over subtraction, we may rewrite this as   B A [T ◦ S]A − [T ] [S] C C B [u]A = 0. But [u]A in this equation can be an arbitrary column vector, so the matrix in large parentheses is zero, as required.  Definition A.6.3. Two square matrices M and N are similar if there is an invertible matrix U such that M = U −1 N U . Corollary A.6.4. Let T : V → V be a linear transformation from a finite-dimensional vector space V to itself, and let B and C be two C bases of V . Then the matrices [T ]B B and [T ]C are similar. Proof. We have C C B [T ]B B = [id]B [T ]C [id]C , B B where “id” stands for the identity map. But [id]C B [id]C = [id]B = I, B C the identity matrix, and similarly [id]C [id]B = I. Thus U = [id]B C is −1 C  invertible with inverse U = [id]B , proving the result.

We will conclude with a brief discussion of determinants. Let M = (mij ) be a square matrix, with n rows and n columns. Recall (see Appendix G for a fuller discussion) that Sn denotes the collection of all bijective maps {1, . . . , n} → {1, . . . , n} — the symmetric group — and that if σ ∈ Sn , then sign(σ) is ±1 according to whether σ

200

A. Linear Algebra

can be represented as the composite of an even or an odd number of transpositions (exchanges of two elements). With the above notation we have Definition A.6.5. The determinant of the n × n matrix M = (mij ) is the scalar  det(M ) = sign(σ)m1σ(1) m2σ(2) · · · mnσ(n) . σ∈Sn

Example A.6.6. Suppose n = 2. The symmetric group S2 has two elements, the identity map (sign +1) and the transposition exchanging 1 and 2 (sign −1). The formula above gives  det

a11 a21

a12 a22

 = a11 a22 − a12 a21 .

Example A.6.7. The determinant of the identity matrix is 1. Lemma A.6.8. The determinant, considered as a map from the vector space Mn (K) to K, has the following two properties: (a) It is column-multilinear: that is, det(M ) depends linearly on each column of M if we keep the other columns fixed. (b) It is alternating: that is, det(M ) changes sign if we interchange any two columns of M . Moreover, any other map Mn (K) → K having these two properties is a multiple of the determinant. Proof. Consider the formula for det(M ) given in Definition A.6.5,  sign(σ)m1σ(1) m2σ(2) · · · mnσ(n) . det(M ) = σ∈Sn

The formula is a sum of n! terms, each of which is a product of n matrix entries. Fixing a particular column, say the kth, each one of the product terms contains exactly one matrix entry from the kth column, namely the entry mσ−1 (k),k . This proves (a). As for (b), suppose that N is obtained from M by transposing two columns by

A.6. Matrices and determinants

201

a transposition τ . Then  sign(σ)m1,τ σ(1) m2,τ σ(2) · · · mn,τ σ(n) det(N ) = σ∈Sn

=



(− sign(τ σ)) m1,τ σ(1) m2,τ σ(2) · · · mn,τ σ(n) = − det(M )

σ∈Sn

since τ σ runs over Sn as σ does. Suppose now that Λ : Mn (K) → K is another map which is column-multilinear and alternating. For any map μ : {1, . . . , n} → {1, . . . , n}, let Eμ denote the matrix with a 1 in each position (μ(k), k) and zeroes elsewhere — in other words, Eμ is a matrix with exactly one “1” in each column, the positions of the “1”s being governed by the map μ. Because Λ is column-multilinear,  Λ(Eμ )mμ(1),1 mμ(2),2 · · · mμ(n),n . Λ(M ) = μ

Because Λ is alternating, it must vanish on any matrix M with two columns identical (interchanging those columns gives Λ(M ) = −Λ(M )). The only matrices Eμ that do not have two columns identical are those where μ is a permutation σ; so the sum over all μ in the display above can be replaced by a sum over permutations only. For a permutation σ, the alternating property gives Λ(Eσ ) = sign(σ)Λ(I), where I is the identity matrix. Finally, then,  Λ(M ) = Λ(I) sign(σ)mσ(1),1 mσ(2),2 · · · mσ(n),n = Λ(I) det(M ), σ∈Sn

as required.



Proposition A.6.9. For square matrices M and N (of the same size) we have det(MN) = det(M) det(N).

Proof. Fix M. By definition of matrix multiplication, each column of MN is obtained by multiplying M by the corresponding column of N. Thus, the map $M_n(K) \to K$ defined by $N \mapsto \det(MN)$ is column-multilinear and alternating, so it is a multiple of the determinant. Taking N = I shows that this multiple is det(M). □


Corollary A.6.10. Similar matrices have the same determinant.

Proof. Suppose that M and N are similar, that is, $M = U^{-1}NU$ for some invertible U. Then
$$\det(M) = \det(U^{-1})\det(N)\det(U) = \det(U)\det(U^{-1})\det(N) = \det(UU^{-1}N) = \det(N),$$

as required.

Now suppose that V is a finite-dimensional vector space and T : V → V is a linear transformation. We define the determinant of T as follows: choose a basis B for V, and define det(T) to be the determinant of the square matrix $[T]^B_B$. This is well-defined because another choice of basis C would lead to a matrix $[T]^C_C$ which is similar to $[T]^B_B$ and thus has the same determinant.

Theorem A.6.11 (Invertibility criterion). Suppose the vector space V is finite-dimensional. A linear transformation T : V → V is bijective (invertible) if and only if det(T) ≠ 0.

Proof. If T is invertible, then det(T) det(T⁻¹) = det(id) = 1, so det(T) ≠ 0. Conversely, suppose that T is not invertible. Then it fails either to be injective or to be surjective, but in this case these are equivalent by the rank-nullity theorem. Thus T is not injective, so there is a nonzero vector $v_1$ such that $Tv_1 = 0$. The set $\{v_1\}$ is linearly independent, so there is a basis B containing $v_1$ as its first member. The matrix of T with respect to this basis has its first column consisting entirely of zeroes (that column lists the coordinates of $Tv_1 = 0$). Therefore, its determinant is zero. □
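Definition A.6.5 and the invertibility criterion are easy to play with computationally. The sketch below (in Python; the function names are ours, chosen for illustration) evaluates the permutation sum directly — this does n·n! work, so it is only sensible for very small matrices — and then uses it to test invertibility:

```python
from itertools import permutations
from math import prod

def sign(perm):
    """Sign of a permutation (a tuple of 0-based indices),
    computed by counting inversions: +1 if even, -1 if odd."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm))
                     if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def det(M):
    """Determinant via the permutation sum of Definition A.6.5."""
    n = len(M)
    return sum(sign(s) * prod(M[i][s[i]] for i in range(n))
               for s in permutations(range(n)))

print(det([[1, 2], [3, 4]]))   # -2, matching a11*a22 - a12*a21
print(det([[1, 2], [2, 4]]))   # 0: by Theorem A.6.11, not invertible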

Appendix B

Metric Spaces

A metric space is a set of points for which we have a notion of distance — a measure of “how different” or “how far apart” two points are. This notion of distance can be applied in several ways, but one of the most important and natural is to say what we mean by continuous motion and therefore to serve as the foundation of topology, the study of continuous maps and the properties that they preserve. That is what we’ll review in this appendix. For a reference see Sutherland [36], or, for a more leisurely treatment, O’Searcoid [29].

B.1. Metric spaces

Definition B.1.1. A metric space is a set X equipped with a function $d : X \times X \to \mathbb{R}$ (called a metric or distance function) such that:

(i) $d(x, x') \ge 0$ for all $x, x'$; moreover, $d(x, x') = 0$ if and only if $x = x'$.

(ii) $d(x, x') = d(x', x)$ for all $x, x'$ (symmetry).

(iii) $d(x, x'') \le d(x, x') + d(x', x'')$ for all $x, x', x''$ (triangle inequality).

Here are some examples of metric spaces.


Example B.1.2. The vector space $\mathbb{R}^n$ can be made into a metric space by defining the norm of a vector $x = (x_1, \dots, x_n)$ to be
$$\|x\| = \bigl(|x_1|^2 + \cdots + |x_n|^2\bigr)^{1/2}$$
and then defining the distance between two vectors $x'$ and $x''$ by $d(x', x'') = \|x' - x''\|$. The same formula also makes $\mathbb{C}^n$ into a metric space. (These definitions come from the standard inner products on $\mathbb{R}^n$ and $\mathbb{C}^n$, respectively, so that the triangle inequality is a consequence of Corollary A.5.3.) These examples are called Euclidean spaces.

Example B.1.3. Let X be any set. A metric on X can be defined by
$$d(x, y) = \begin{cases} 0 & (x = y), \\ 1 & (x \ne y). \end{cases}$$

Example B.1.4. Let A be a finite set (the alphabet) and consider the set $A^n$ of n-tuples of elements of A, which we think of as n-letter words in the alphabet A. Define a distance on $A^n$ by
$$d(x, y) = \#\{i : 1 \le i \le n,\ x_i \ne y_i\},$$
in other words, the number of positions in which the two words differ. (The reader should check that the triangle inequality holds, so this formula does define a metric; a computational sketch appears at the end of this section.) This so-called Hamming metric was introduced in 1950 to give a technical foundation to the theory of error-correcting codes [21].

Example B.1.5. Let X be any metric space and let Y be a subset of X. Then the distance function on X, restricted to Y, makes Y into a metric space; this is called the subspace metric on Y.


Definition B.1.6. A subset U of a metric space X is an open subset of X if for every $x \in U$ there is $\varepsilon > 0$ such that the entire open ball $B(x; \varepsilon) := \{x' \in X : d(x, x') < \varepsilon\}$ is contained in U.

Lemma B.1.7. Every open ball in a metric space is an open set.

Proof. Let X be a metric space and let $U = B(p; r) = \{x \in X : d(p, x) < r\}$ be an open ball. For $x \in U$, define $\varepsilon = r - d(p, x) > 0$. Suppose that $y \in B(x; \varepsilon)$. Then by the triangle inequality,
$$d(p, y) \le d(p, x) + d(x, y) < (r - \varepsilon) + \varepsilon = r,$$
and therefore $y \in U$. It follows that $B(x; \varepsilon) \subseteq U$. Since $x \in U$ was arbitrary, this shows that U is open. □

This result shows that there are plenty of open subsets.

Definition B.1.8. Let X be a metric space. A subset $F \subseteq X$ whose complement $X \setminus F$ is open is called closed.

Note carefully that "closed" does not mean the same as "not open". Many subsets of X are neither open nor closed, and some may be both.

Example B.1.9. In a space with the metric of Example B.1.3, every subset is open (and, therefore, every subset is also closed). A space that has this property is called a discrete space. It is easy to see that a metric space X is discrete if and only if, for every $x \in X$, the infimum $\inf\{d(x, y) : y \in X,\ y \ne x\}$ is strictly positive.

Lemma B.1.10. In any metric space, the union of any collection of open subsets is an open subset. The intersection of a finite collection of open subsets is open. The empty set ∅ and the entire metric space X are open subsets of X.

Proof. Let $\mathcal{F}$ be a collection of open subsets of a metric space X and let $U = \bigcup \mathcal{F}$ be the union of the family. If $x \in U$, then there is some $V \in \mathcal{F}$ such that $x \in V$. Since V is open, there is $\varepsilon > 0$ such that $B(x; \varepsilon) \subseteq V$. Since $V \subseteq U$, we also have $B(x; \varepsilon) \subseteq U$. Thus for any $x \in U$ there exists $\varepsilon > 0$ such that $B(x; \varepsilon) \subseteq U$, which is to say that U is open.

Now let $\mathcal{F} = \{U_1, \dots, U_n\}$ be a finite collection of open sets and let $U = \bigcap \mathcal{F} = U_1 \cap \cdots \cap U_n$. If $x \in U$, then for each $i = 1, \dots, n$ there is $\varepsilon_i > 0$ such that $B(x; \varepsilon_i) \subseteq U_i$. Let $\varepsilon = \min\{\varepsilon_1, \dots, \varepsilon_n\} > 0$.


Then $B(x; \varepsilon) \subseteq U_i$ for all i, and thus $B(x; \varepsilon) \subseteq U$. Thus for any $x \in U$ there exists $\varepsilon > 0$ such that $B(x; \varepsilon) \subseteq U$, which is to say that U is open. □

Remark B.1.11. Mathematicians frequently say "U is open" as a shorthand for "U is an open subset of whatever metric space X is natural in the context", but it is important to realize that openness depends on X as well as on U. For example, the interval (0, 1) is an open subset of ℝ, but it is not an open subset of ℝ² (where we think of ℝ as the x-axis in ℝ²).

Remark B.1.12. Dually, the intersection of any collection of closed sets is closed. In particular, given any subset A of X, the intersection of all the closed subsets of X that contain A is itself a closed set. Clearly, this intersection is the smallest closed subset of X that contains A: it is called the closure of A and is denoted $\overline{A}$.

Definition B.1.13. We say that a subset of a metric space is bounded if it is contained in some open ball.
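To close this section, here is the promised computational sketch of the Hamming metric of Example B.1.4 (in Python; the sample words are arbitrary illustrations):

```python
def hamming(x, y):
    """Hamming distance of Example B.1.4: the number of positions
    in which two equal-length words differ."""
    assert len(x) == len(y)
    return sum(1 for a, b in zip(x, y) if a != b)

# A spot check of the triangle inequality: every position where
# x and z differ is one where x != y or y != z (or both).
x, y, z = "karolin", "kathrin", "kerstin"
assert hamming(x, z) <= hamming(x, y) + hamming(y, z)
print(hamming(x, y))   # 3
```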

B.2. Continuous functions

The definition of continuity is translated in the natural way to the metric space context.

Definition B.2.1. A function f : X → Y between metric spaces is continuous at $x \in X$ if for every $\varepsilon > 0$ there is $\delta > 0$ such that $d(f(x), f(x')) < \varepsilon$ whenever $d(x, x') < \delta$. It is continuous if it is continuous at every $x \in X$.

In the context of metric spaces a continuous function will also be called a map or mapping. Important examples are paths in a metric space.

Definition B.2.2. Let X be a metric space. A path in X is a continuous function from [0, 1] to X. (Recall that [0, 1] denotes the closed unit interval $\{x \in \mathbb{R} : 0 \le x \le 1\}$.)

Although we define continuity in terms of epsilons and deltas, it also has a very important alternative characterization in terms of open sets.


Theorem B.2.3. Let X and Y be metric spaces. Then f : X → Y is continuous if and only if, for every open $U \subseteq Y$, the inverse image $f^{-1}(U) := \{x \in X : f(x) \in U\}$ is open in X.

Proof. Suppose that f is continuous and let $U \subseteq Y$ be open. Let $x \in f^{-1}(U)$; then $f(x) \in U$, so by definition of "open" there is $\varepsilon > 0$ such that $B(f(x); \varepsilon) \subseteq U$. By definition of "continuous" there is $\delta > 0$ such that if $x' \in B(x; \delta)$, then $f(x') \in B(f(x); \varepsilon) \subseteq U$. But this means that $B(x; \delta) \subseteq f^{-1}(U)$. Thus the set $f^{-1}(U)$ is open.

Conversely, let $x \in X$, $\varepsilon > 0$, and suppose that f satisfies the condition in the theorem. In particular, we may consider $U = B(f(x); \varepsilon)$, an open set such that $x \in f^{-1}(U)$. Our hypothesis tells us that $f^{-1}(U)$ is open, which means that there is a $\delta > 0$ such that $B(x; \delta) \subseteq f^{-1}(U)$. Thus, whenever $x' \in B(x; \delta)$, $f(x') \in B(f(x); \varepsilon)$. This gives us continuity. □

Remark B.2.4. Because closed sets are the complements of open ones and $f^{-1}(Y \setminus A) = X \setminus f^{-1}(A)$, it is also true that f is continuous iff the inverse image of every closed set is closed.

Definition B.2.5. A map f : X → Y between metric spaces is called a homeomorphism if it is a bijection and both f and $f^{-1}$ are continuous. (Because of the preceding theorem, it is the same thing to say that f is a bijection and U is open in X if and only if f(U) is open in Y, or that f is a bijection and F is closed in X if and only if f(F) is closed in Y.) If there is a homeomorphism between X and Y, then we say that these spaces are homeomorphic.

When we are studying topology, we think of homeomorphic spaces as "essentially the same": the coffee cup is homeomorphic to the donut (compare Figure 1.1). Homeomorphism therefore plays the same role in topology as isomorphism does in linear algebra or in group theory.

Remark B.2.6. For f to be a homeomorphism it is not enough that it simply be continuous and bijective. For example, the map from the half-open interval [0, 2π) to the unit circle, given by $t \mapsto e^{it}$, is continuous and bijective, but its inverse is not continuous.
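The failure described in Remark B.2.6 is visible numerically: two points of the circle can be very close together while their parameters in [0, 2π) are far apart. A minimal sketch (in Python; the gap 1e−6 is an arbitrary illustrative choice):

```python
import cmath, math

f = lambda t: cmath.exp(1j * t)   # the continuous bijection of Remark B.2.6

s, t = 0.0, 2 * math.pi - 1e-6    # parameters at opposite ends of [0, 2*pi)
print(abs(f(s) - f(t)))           # ~1e-6: the image points are very close,
print(abs(s - t))                 # ~6.28: but the parameters are not --
                                  # so the inverse fails to be continuous at 1.
```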


A more restrictive notion than homeomorphism is that of isometry.

Definition B.2.7. Let $(X, d_X)$ and $(Y, d_Y)$ be metric spaces. An isometry from X to Y is a bijection f : X → Y such that $d_Y(f(x_1), f(x_2)) = d_X(x_1, x_2)$ for all $x_1, x_2 \in X$.

Every isometry is a homeomorphism, but the reverse is far from the case: isometric spaces are "the same" metrically, and not just topologically. For instance, all ellipses (in the plane) are homeomorphic, but they are not all isometric. (The interested reader can find necessary and sufficient conditions for two ellipses to be isometric.)

B.3. Compact spaces

The notions of sequence of points in a metric space and subsequence of a given sequence are defined in the same way as they are for real numbers. (Formally speaking, a sequence in the space X is a function ℕ → X, and a subsequence of the given sequence is the result of composing it with a strictly monotonic function ℕ → ℕ.)

Definition B.3.1. We say that a sequence $\{x_n\}$ in the metric space X is convergent to $x \in X$ if
$$\forall \varepsilon > 0\ \exists N \in \mathbb{N}\ \forall n \ge N \quad d(x_n, x) < \varepsilon.$$
Equivalently, every open set that contains x also contains all $x_n$ for n sufficiently large; the sequence "enters and remains within every neighborhood of x".

Convergent sequences characterize closed sets:

Proposition B.3.2. A subset A of a metric space X is closed if and only if, whenever $\{x_n\}$ is a sequence of points of A that converges to $x \in X$, the limit x also belongs to A.

Proof. Both directions are proofs by contradiction.

(Only if) Suppose A is closed and $x \notin A$. Since $X \setminus A$ is open, there is $\varepsilon > 0$ such that $B(x; \varepsilon) \subseteq X \setminus A$. Thus the distance from x


to any point of A is $\ge \varepsilon$, and therefore no sequence of points of A can converge to x.

(If) Suppose that A is not closed, so that $X \setminus A$ is not open. Then there exists $x \in X \setminus A$ for which no ball $B(x; \varepsilon)$ is a subset of $X \setminus A$ — that is, every such ball has nonempty intersection with A. Take $\varepsilon = 1/n$ and let $x_n$ be a point of $B(x; 1/n) \cap A$. Now $\{x_n\}$ is a sequence of points of A that converges to $x \notin A$. □

Remark B.3.3. A subset $A \subseteq X$ is called dense if every $x \in X$ is the limit of a sequence in A. In view of Proposition B.3.2, it is equivalent to say that the only closed subset of X that contains A is the set X itself: that is, the closure of A is X.

Definition B.3.4. A metric space X is compact iff every sequence of points of X has a subsequence that converges in X.

Remark B.3.5. This notion is often called sequential compactness to distinguish it from another formulation, covering compactness, which is applicable in more general contexts. For metric spaces, however, the two kinds of compactness are equivalent (as we'll see in a moment). Thus there is no risk of ambiguity if we use the word "compact" without qualification, and we will usually do that.

The notion of compactness is often applied to subsets of a metric space, which can of course be thought of as metric spaces in their own right (compare Example B.1.5). For example, ℝ is not compact, but any closed, bounded subset of ℝ is compact — this is the classical Bolzano-Weierstrass theorem.

Proposition B.3.6. If A is a subset of any metric space X and A is compact (in its own right), then A is closed and bounded in X.

Proof. We apply the criterion of Proposition B.3.2. Suppose A is compact. Let $\{x_n\}$ be a sequence in A that converges to $x \in X$. By compactness of A, $\{x_n\}$ has a subsequence that converges in A — in particular, its limit must be a point of A. But any subsequence of $\{x_n\}$ converges to x, so x belongs to A. Thus A is closed.

Fix $a \in A$ and suppose (aiming for a contradiction) that A is not bounded. Then for each $n \in \mathbb{N}$ there exists $x_n \in A$ with


$d(x_n, a) > n$. No subsequence of the $\{d(x_n, a)\}$ can be bounded, whence no subsequence of the $\{x_n\}$ can be convergent. Thus A is not compact. □

Proposition B.3.7. Let A be a closed subset of a compact metric space X. Then A is compact (in its own right).

Proof. Let $\{x_n\}$ be a sequence in A. Since X is compact, this sequence has a subsequence convergent to $x \in X$. Since A is closed, the limit x in fact belongs to A. Thus $\{x_n\}$ has a subsequence convergent in A. That is, A is compact. □

Definition B.3.8. Let X and Y be metric spaces. The product space X × Y is the set of pairs (x, y), with $x \in X$ and $y \in Y$, and with metric
$$d((x, y), (x', y')) = \bigl(d_X(x, x')^2 + d_Y(y, y')^2\bigr)^{1/2}.$$
Thus ℝⁿ is the product space ℝ × ⋯ × ℝ (n factors).

Proposition B.3.9. If X and Y are both compact, so is X × Y.

Proof. Let $(x_n, y_n)$ be a sequence in X × Y. Since X is compact, there is a subsequence $x_{n_k}$ of $x_n$ that converges in X. Now consider the sequence $y_{n_k}$ in Y. Since Y is compact, there is a subsequence $y_{n_{k_j}}$ that converges in Y. The corresponding subsequence $x_{n_{k_j}}$ also converges in X since it is a subsequence of the convergent sequence $x_{n_k}$. Thus $(x_{n_{k_j}}, y_{n_{k_j}})$ is a subsequence of the original sequence that converges in X × Y. □

Proposition B.3.10. Every closed, bounded¹ subset of ℝⁿ or ℂⁿ is compact.

Proof. We take it as known that a closed, bounded interval in ℝ is compact (the Bolzano-Weierstrass theorem). Let A be a closed bounded subset of ℝⁿ. Then A is contained in some cube, which is a product of closed bounded intervals. Proposition B.3.9 shows that the cube is compact. Therefore A is compact, as a closed subset of a compact space (Proposition B.3.7). □

¹See Definition B.1.13.


Remark B.3.11. A sequence $(x_n)$ in a metric space X is called a Cauchy sequence if $d(x_n, x_m) \to 0$ as $n, m \to \infty$, in other words if its points "get arbitrarily close to one another". A metric space X is said to be complete if every Cauchy sequence in X is convergent. It is easy to see that every compact metric space is complete. There are, however, many other examples of complete spaces (e.g., X = ℝ).

An alternate formulation of compactness makes use of the notion of open cover for a metric space.

Definition B.3.12. Let X be a metric space. An open cover $\mathcal{U}$ for X is a collection (finite or infinite) of open sets whose union is all of X. A Lebesgue number for $\mathcal{U}$ is a number $\delta > 0$ such that every open ball of radius δ is a subset of some member of $\mathcal{U}$.

Theorem B.3.13. Every open cover of a compact metric space has a Lebesgue number.

Proof. Suppose that $\mathcal{U}$ does not have a Lebesgue number. Then for every n there is $x_n \in X$ such that $B(x_n; 1/n)$ is contained in no member of $\mathcal{U}$. Since X is compact, the sequence $(x_n)$ has a subsequence that converges, say to x. Now x belongs to some member U of $\mathcal{U}$ since $\mathcal{U}$ is a cover. Thus there is $\varepsilon > 0$ such that $B(x; \varepsilon) \subseteq U$. There is $n > 2/\varepsilon$ such that $d(x_n, x) < \varepsilon/2$. But then
$$B(x_n; 1/n) \subseteq B(x_n; \varepsilon/2) \subseteq B(x; \varepsilon) \subseteq U,$$
which is a contradiction. □
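To make Definition B.3.12 concrete, the sketch below estimates a Lebesgue number for a finite cover of [0, 1] by open intervals, by sampling: for each sample point it measures the largest radius of a ball about that point that fits inside a single member of the cover, then takes the minimum over the samples. (The cover and the grid size here are illustrative choices, not data from the text.)

```python
def lebesgue_number(cover, samples=10_000):
    """Estimated Lebesgue number for a cover of [0, 1] by open
    intervals (a, b): min over sample points x of the largest r
    with (x - r, x + r) inside some member of the cover."""
    def depth(x):
        return max((min(x - a, b - x) for (a, b) in cover if a < x < b),
                   default=0.0)
    return min(depth(k / samples) for k in range(samples + 1))

cover = [(-0.1, 0.4), (0.3, 0.7), (0.6, 1.1)]   # an open cover of [0, 1]
print(lebesgue_number(cover))   # about 0.05 (tightest near x = 0.35, 0.65)
```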



Definition B.3.14. A metric space X is said to have the Heine-Borel property if, for every open cover $\mathcal{U}$ of X, there exists a finite subset of $\mathcal{U}$ that still covers X. A subset of a cover $\mathcal{U}$ of X which is itself a cover of X is called a subcover of $\mathcal{U}$. Thus, a space X has the Heine-Borel property if every open cover of X has a finite subcover.

Proposition B.3.15. A metric space has the Heine-Borel property if and only if it is compact.

As hinted in Remark B.3.5, a space with the Heine-Borel property is often called covering compact, and the above result is then


expressed by saying "a metric space is covering compact if and only if it is sequentially compact".

Proof. Suppose that X is compact. I claim that, for each $\delta > 0$, X has a finite cover by δ-balls (this is sometimes called "total boundedness" of X). Indeed, try to construct a sequence $x_1, x_2, \dots$ in X in the following way: pick any point $x_1$, pick a point $x_2$ at distance $\ge \delta$ from $x_1$, then a point $x_3$ at distance $\ge \delta$ from both $x_1$ and $x_2$, and so on. If this process continues for ever, it produces a sequence without any convergent subsequence, which is impossible. But the only way it can stop is if, for some n, the balls $B(x_1; \delta), \dots, B(x_n; \delta)$ cover X.

Now let $\mathcal{U}$ be an open cover of X. Let $\delta > 0$ be a Lebesgue number for $\mathcal{U}$ (which exists because of Theorem B.3.13). By the previous paragraph, X has a finite cover by δ-balls. But each such ball is a subset of a member of $\mathcal{U}$; so $\mathcal{U}$ has a finite subcover.

Now suppose that X has the Heine-Borel property. Suppose for a contradiction that $(x_n)$ is a sequence in X without convergent subsequence. Then for each $x \in X$ there is some $\varepsilon_x$ such that $x_n \notin B(x; \varepsilon_x)$ for all but finitely many n. The $B(x; \varepsilon_x)$ form a cover of X. Picking a finite subcover we obtain the contradiction that $x_n \notin X$ for all but finitely many n. □

The following is a typical application of the covering formulation of compactness (the reader will also be able to give an easy proof using the sequential formulation).

Proposition B.3.16. Let X and Y be metric spaces, and suppose that there exists a map f : X → Y that is continuous and surjective. If X is compact, then Y is compact also.

Proof. Let $\mathcal{U}$ be an open cover of Y. For each open set $U \in \mathcal{U}$, the set $f^{-1}(U)$ is an open subset of X, by Theorem B.2.3. The collection $\mathcal{W}$ of open sets $f^{-1}(U)$, $U \in \mathcal{U}$, covers X. As X is compact, $\mathcal{W}$ has a finite subcover, say $\{f^{-1}(U_1), \dots, f^{-1}(U_n)\}$. The corresponding finite collection of open subsets of Y, $\{U_1, \dots, U_n\}$, is a subset of $\mathcal{U}$ and (because f is surjective) covers Y. Hence Y is compact also. □

This gives us a useful result (compare Remark B.2.6).


Proposition B.3.17. Let f : X → Y be a continuous bijection of metric spaces, with X compact. Then f is a homeomorphism.

Proof. To prove that $f^{-1}$ is continuous it suffices to prove that f takes closed sets to closed sets (see Remark B.2.4). Let $A \subseteq X$ be closed. Then A is compact (Proposition B.3.7). So f(A) is compact (Proposition B.3.16). Thus f(A) is closed (Proposition B.3.6). □

There is also an important relationship between compactness and uniform continuity.

Definition B.3.18. A map f : X → Y between metric spaces is uniformly continuous if for each $\varepsilon > 0$ there is $\delta > 0$ such that, for all $x, x' \in X$, $d(f(x), f(x')) < \varepsilon$ whenever $d(x, x') < \delta$.

The extra information beyond ordinary "pointwise" continuity is that δ does not depend on x.

Proposition B.3.19. Every continuous map from a compact metric space X to a metric space Y is uniformly continuous.

Proof. Let f : X → Y be continuous. Let $\varepsilon > 0$. Let $\mathcal{U}$ be the open cover of Y by all balls of radius ε/2, and let $\mathcal{V} = f^*(\mathcal{U})$ be the pullback of this cover under f; i.e., a set V belongs to $\mathcal{V}$ if and only if it is of the form
$$V = f^{-1}(B_Y(y; \varepsilon/2)) \quad \text{for some } y \in Y.$$
(Notice that we're using Theorem B.2.3 to argue that V is open.) Let δ be a Lebesgue number for the cover $\mathcal{V}$ of X. Then, if $d(x, x') < \delta$, we have $x' \in B_X(x; \delta)$, and so both $x, x'$ belong to some $V \in \mathcal{V}$. That means that both $f(x), f(x')$ belong to some $B(y; \varepsilon/2)$, so
$$d_Y(f(x), f(x')) \le d_Y(f(x), y) + d_Y(y, f(x')) < \varepsilon,$$
and we are done. □

B.4. Function spaces

Let X be a compact metric space and let $f_0, f_1 : X \to Y$ be two maps from X to another metric space Y. The function X → ℝ defined by
$$x \mapsto d_Y(f_0(x), f_1(x))$$


is then continuous, so its range is compact (Proposition B.3.16) and therefore bounded. Thus the supremum in the definition below exists and is finite:

Definition B.4.1. With notation as above, the uniform distance (or just distance) between the maps $f_0$ and $f_1$ is
$$d(f_0, f_1) = \sup\{d(f_0(x), f_1(x)) : x \in X\}.$$
The collection of all (continuous) maps from the compact space X to Y is denoted Maps(X, Y). It is easily verified that the uniform distance satisfies the triangle inequality, so Maps(X, Y) becomes a metric space in its own right.

One way to study the topology of a space Y is to look at the connectivity properties of Maps(X, Y) for various standard spaces X. We will use two important properties of these mapping spaces. The first is the gluing property:

Proposition B.4.2. Let X be a metric space and let A, B be closed subsets whose union is X. Let f : X → Y be a function from X to another metric space Y. If the restrictions $f|_A$ and $f|_B$ of f to A and B are continuous (as maps A → Y and B → Y), then f itself is continuous (as a map X → Y).

Remark B.4.3. A diagrammatic way to say the same thing is this: in the square of restriction maps

    Maps(X, Y) ──→ Maps(A, Y)
        │                │
        ↓                ↓
    Maps(B, Y) ──→ Maps(A ∩ B, Y)

an element of the top left space Maps(X, Y ) is the same as a pair of elements of the top right and bottom left spaces whose restrictions to the bottom right space agree. One expresses this property by saying that the diagram is a pullback square. Proof. We begin by establishing the following lemma: if A is a closed subset of a metric space X and F is a closed subset of A (when A is considered as a metric space in its own right), then F is a closed


subset of X. (Compare Remark B.1.11.) To see this, let U be the complement of F in X. Let $x \in U$. There are two cases:

(a) Case $x \in X \setminus A$. Then there is $\varepsilon > 0$ such that $B(x; \varepsilon) \subseteq X \setminus A$ since A is closed in X. A fortiori, $B(x; \varepsilon) \subseteq U = X \setminus F$.

(b) Case $x \in A \setminus F$. Then there is $\varepsilon > 0$ such that the ball in A around x, of radius ε, $B_A(x; \varepsilon)$, is contained in $A \setminus F$. But $B_A(x; \varepsilon) = B(x; \varepsilon) \cap A$, so again we have
$$B(x; \varepsilon) \subseteq B_A(x; \varepsilon) \cup (X \setminus A) \subseteq U = X \setminus F.$$

In either case $B(x; \varepsilon) \subseteq U$. Thus U is open in X, so F is closed in X, as required.

Now for the proof of the proposition. Let G be an arbitrary closed subset of Y. Since $f|_A$ is continuous, $f|_A^{-1}(G) = f^{-1}(G) \cap A$ is a closed subset of A, and therefore a closed subset of X by the lemma. Similarly, $f^{-1}(G) \cap B$ is a closed subset of X. But then
$$f^{-1}(G) = (f^{-1}(G) \cap A) \cup (f^{-1}(G) \cap B)$$
is closed in X, so f is continuous by Remark B.2.4. □



Remark B.4.4. The same result holds (with a different proof) if A and B are both open rather than closed. Some condition on A and B is necessary, however, as easy examples show.

The second important property of mapping spaces is the exponential law. Suppose that X and Y are compact metric spaces and Z is any metric space. Let F be a continuous map from X × Y to Z. Then, for each fixed $x \in X$, the map $f_x(y) = F(x, y) : Y \to Z$ is continuous. Therefore, $x \mapsto f_x$ defines a function g from X to Maps(Y, Z). Moreover, this function g is itself continuous. To see why, notice that by Proposition B.3.19, the map F is uniformly continuous from X × Y to Z. In particular this implies that, given $\varepsilon > 0$, there exists $\delta > 0$ such that if $x, x' \in X$ with $d(x, x') < \delta$, then $d(F(x, y), F(x', y)) < \frac{1}{2}\varepsilon$ for all $y \in Y$. But this implies $d(g(x), g(x')) < \varepsilon$, so g is continuous.


From this discussion we conclude that the process of passing from F to g defines a function
$$\Phi : \mathrm{Maps}(X \times Y, Z) \to \mathrm{Maps}(X, \mathrm{Maps}(Y, Z)).$$
The exponential law for function spaces is then the following statement.

Proposition B.4.5. For metric spaces X, Y, and Z, with X and Y compact, the function Φ defined above is a homeomorphism from Maps(X × Y, Z) to Maps(X, Maps(Y, Z)).

Proof. It is immediate from the definitions that if $d(F, F') < \varepsilon$, then $d(\Phi(F), \Phi(F')) \le \varepsilon$, which shows that Φ is continuous.

Let us construct an inverse to Φ. Suppose that $g \in \mathrm{Maps}(X, \mathrm{Maps}(Y, Z))$. Then for each $x \in X$, g(x) is a (continuous) map Y → Z. Let us now define $\Psi(g) = F$ where $F(x, y) = [g(x)](y)$. We need to show two things: first, that this F is a continuous map from X × Y to Z and, second, that Ψ is a continuous map from Maps(X, Maps(Y, Z)) to Maps(X × Y, Z).

Considering F first, let $\varepsilon > 0$ be given. There exists $\alpha > 0$ such that $d(x, x') < \alpha$ implies $d(g(x), g(x')) < \frac{1}{2}\varepsilon$, and there exists $\beta > 0$ such that $d(y, y') < \beta$ implies $d(g(x)(y), g(x)(y')) < \frac{1}{2}\varepsilon$. Let $\delta = \min\{\alpha, \beta\}$. Then if $d((x, y), (x', y')) < \delta$, we have
$$d(F(x, y), F(x', y')) = d(g(x)(y), g(x')(y')) \le d(g(x)(y), g(x)(y')) + d(g(x)(y'), g(x')(y')) \le d(g(x)(y), g(x)(y')) + d(g(x), g(x')) < \tfrac{1}{2}\varepsilon + \tfrac{1}{2}\varepsilon = \varepsilon,$$
which shows that F is continuous. The proof that Ψ is continuous proceeds in a similar way. □

Appendix C

Extension and Approximation Theorems

In this brief section we will prove two significant results about real-valued functions on compact metric spaces, the Stone-Weierstrass theorem and the Tietze extension theorem. These results are important in many parts of analysis: in this book, they will be involved in our proof of the Jordan curve theorem (Chapter 4) and in our discussion of Fredholm operators (Chapter 7).

C.1. The Stone-Weierstrass theorem

The classical Weierstrass approximation theorem states that every continuous function on a closed bounded interval can be uniformly approximated by polynomials. The Stone-Weierstrass theorem generalizes that result to functions on other compact metric spaces.

Definition C.1.1. A collection L of real-valued functions on a set X is a lattice if, whenever it contains functions f and g, it also contains their pointwise maximum and minimum — usually written f ∨ g and f ∧ g in this context.

Proposition C.1.2. Let L be a lattice of continuous real-valued functions on a compact metric space X. If, for all $x, x' \in X$ and any


$a, a' \in \mathbb{R}$, there is a function $f \in L$ having $f(x) = a$ and $f(x') = a'$, then L is dense in Maps(X; ℝ).

The property appearing in the statement is called the two point interpolation property.

Proof. Let the continuous function h : X → ℝ be given and let $\varepsilon > 0$. We are going to approximate h within ε by elements of L. By hypothesis, for each $x, x' \in X$ there exists $f_{xx'} \in L$ such that $f_{xx'}(x) = h(x)$ and $f_{xx'}(x') = h(x')$. Fixing x for a moment, let
$$V_{x'} = \{y \in X : h(y) - \varepsilon < f_{xx'}(y)\}.$$
These sets are open, and $x' \in V_{x'}$, so they cover X. Take a finite subcover and let $g_x$ be the (pointwise) maximum of the corresponding functions $f_{xx'} \in L$. Because L is a lattice, $g_x \in L$ and by construction,
$$h(y) - \varepsilon < g_x(y) \quad \forall y, \qquad h(x) = g_x(x).$$
So we have approximated h from one side by members of L. Now we play the same trick again from the other direction: let $W_x = \{y : g_x(y) < h(y) + \varepsilon\}$. Again these form an open cover of X; take a finite subcover and let g be the (pointwise) minimum of the corresponding $g_x$. Then $g \in L$ and by construction $h - \varepsilon < g < h + \varepsilon$, as required. □

There is a more classical formulation which makes use of algebraic rather than order-theoretic operations.

Lemma C.1.3. There is a sequence of polynomials on [−1, 1] converging uniformly to |x|.

Proof (Dydak-Feldman [16]). It clearly suffices to produce a sequence $p_n(x)$ of polynomials on [−1, 1] that converge uniformly to the function $f(x) = \frac{1}{2}(x + |x|)$ that equals x for $x \ge 0$ and 0 for $x \le 0$. Define $p_1(x) = x^2$ and, inductively,
$$p_{n+1}(x) = p_n(x)\cdot\bigl(1 + \tfrac{1}{2}(x - p_n(x))\bigr).$$


A simple calculation shows that $x - p_{n+1}(x) = (x - p_n(x))(1 - \frac{1}{2}p_n(x))$. Using these facts, one easily establishes by induction the following:

(a) We have $0 \le p_n(x) \le |x|$ for all $x \in [-1, 1]$.

(b) The sequence $\{p_n(x)\}$ is monotone decreasing for $x \le 0$ and monotone increasing for $x \ge 0$.

Fix $\varepsilon > 0$. We divide the interval [−1, 1] into three parts: A = [−1, −ε], B = [−ε, ε], and C = [ε, 1]. If $x \in A$, then
$$0 \le p_{n+1}(x) = p_n(x)\cdot\bigl(1 + \tfrac{1}{2}(x - p_n(x))\bigr) \le p_n(x)\cdot(1 - \tfrac{1}{2}\varepsilon).$$
By induction we find $0 \le p_{n+1}(x) \le (1 - \frac{1}{2}\varepsilon)^n$, which is smaller than ε for n sufficiently large. If $x \in B$, then $0 \le p_n(x) \le \varepsilon$ for all n, by (a) above. If $x \in C$, then
$$0 \le x - p_{n+1}(x) = (x - p_n(x))\bigl(1 - \tfrac{1}{2}p_n(x)\bigr) \le (x - p_n(x))(1 - \tfrac{1}{2}\varepsilon^2).$$
Again by induction, $0 \le x - p_{n+1}(x) \le (1 - \frac{1}{2}\varepsilon^2)^n$, which is smaller than ε for n sufficiently large. We conclude that for n sufficiently large, $|p_n(x) - f(x)| \le \varepsilon$ for all $x \in [-1, 1]$, as required. □

One says that a subset of Maps(X; ℝ) is a subalgebra if it is closed under pointwise addition, subtraction, multiplication of functions, and multiplication by scalars.

Lemma C.1.4. A closed subalgebra of Maps(X; ℝ) that contains the constant functions is a lattice.

Proof. Let A be the given subalgebra and $f, g \in A$; let $h = f - g$. There is no loss of generality in assuming (by rescaling) that $|h| \le 1$ everywhere. Any polynomial in h belongs to A, and hence so does |h|, by closure and Lemma C.1.3. But now
$$f \wedge g = \frac{f + g - |h|}{2}, \qquad f \vee g = \frac{f + g + |h|}{2}$$
belong to A as well. □
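The recursion in the proof of Lemma C.1.3 is pleasant to run numerically. The sketch below iterates $p_{n+1} = p_n(1 + \frac{1}{2}(x - p_n))$ pointwise from $p_1(x) = x^2$ and reports the worst error against $f(x) = \frac{1}{2}(x + |x|)$ on a grid; the grid size and iteration counts are arbitrary illustrative choices:

```python
def p(n, x):
    """Value at x of the n-th polynomial of Lemma C.1.3, obtained by
    iterating p_{n+1} = p_n * (1 + (x - p_n)/2) from p_1 = x**2."""
    value = x * x
    for _ in range(n - 1):
        value *= 1 + (x - value) / 2
    return value

f = lambda x: (x + abs(x)) / 2
grid = [k / 500 - 1 for k in range(1001)]        # 1001 points of [-1, 1]
for n in (1, 10, 100, 1000):
    print(n, max(abs(p(n, x) - f(x)) for x in grid))
# The printed maxima decrease towards 0, illustrating uniform convergence
# (and, via |x| = 2 f(x) - x, the approximation of |x| itself).
```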

Theorem C.1.5 (Stone-Weierstrass theorem). Let X be a compact metric space and let M be one of the two spaces Maps(X, R) or


Maps(X, ℂ). Let E be a subset of M having the following properties:

(i) E contains the constant functions.

(ii) E is a subring of M; that is, it is closed under the operations of addition, subtraction, and multiplication.

(iii) E separates points of X; that is, for each pair of distinct points $x_0, x_1 \in X$ there is a function $f \in E$ with $f(x_0) \ne f(x_1)$.

(iv) In the complex case it is also required that E is closed under complex conjugation; that is, if the function $f \in E$, then $\bar{f} \in E$ also.

Then E is dense in M.

Proof. Consider the case of Maps(X, ℝ) first. Let $\bar{E}$ denote the closure of E. It is a closed subalgebra of Maps(X; ℝ) and is therefore a lattice, by Lemma C.1.4. Moreover, it has the two point interpolation property (by simple algebra using the fact that E separates points). Thus $\bar{E} = \mathrm{Maps}(X; \mathbb{R})$ by Proposition C.1.2, which is to say that E is dense.

In the case of Maps(X, ℂ), since E is now also supposed to be closed under complex conjugation, we see from the previous result that the collection F of real parts (or of imaginary parts) of elements of E is dense in Maps(X, ℝ). It follows that E is dense in Maps(X; ℂ). □

Notice that the algebra of (real) polynomial functions on a compact interval in ℝ satisfies the conditions of the theorem. Thus, such functions are dense among the continuous functions. This is the classical Weierstrass approximation theorem. (Of course, a special case of this, in the form of Lemma C.1.3, had to be proved by an explicit construction in order for us to get the general Stone-Weierstrass result.)

C.2. The Tietze extension theorem

Let X and Y be metric spaces, and let A be a subset of X. Then every continuous function from X to Y restricts to a continuous function from A to Y. This operation defines a restriction map between


function spaces Maps(X, Y) → Maps(A, Y) which is in a natural sense "dual" to the inclusion map A → X.

Must the restriction map Maps(X, Y) → Maps(A, Y) be surjective? In other words, can every continuous map A → Y be extended to a continuous map X → Y? In general the answer is "no", but it is "yes" when the range space Y is ℝ, the space of real numbers.

Proposition C.2.1 (Tietze extension theorem). Let A be a compact subspace of a metric space X. Any continuous function A → ℝ can be extended to a continuous function X → ℝ. That is, writing ι : A ↪ X for the inclusion, given any continuous f : A → ℝ one can fill in the triangle with a continuous g : X → ℝ making it commutative: $g \circ \iota = f$.

Proof. Consider the metric space M = Maps(A; ℝ) and the subset E consisting of maps f : A → ℝ that can be extended to maps X → ℝ (let's call these the extendible maps). Then E contains the constant functions, and it is a subring of M because an extension of the sum (or difference or product) of two extendible functions is the sum (or difference or product) of their extensions. Moreover, E separates points of A: given distinct $x_0, x_1 \in A$ the function $x \mapsto d(x, x_0)$ separates $x_0$ from $x_1$ and is extendible. By Stone-Weierstrass, then, E is dense in M. To complete the proof of the theorem (that is, to show that all $f \in M$ are extendible), it will therefore suffice to show that E is also closed.

Begin with a simple observation: if a function $f \in M$ has an extension g at all, then it has such an extension with $\|g\| = \|f\|$ (sup norms). Indeed, if g is any extension of f, then so is the function
$$h : x \mapsto \begin{cases} +\|f\| & \text{if } g(x) > \|f\|, \\ g(x) & \text{if } -\|f\| \le g(x) \le \|f\|, \\ -\|f\| & \text{if } g(x) < -\|f\|, \end{cases}$$


and this extension has $\|h\| = \|f\|$.

Now, suppose that $\{f_j\}$ is a sequence in E converging uniformly to $f \in M$. By passing to a subsequence we may assume that $\|f_{j+1} - f_j\| < 2^{-j}$. Let $g_0$ be an extension of $f_1$ and, for $j \ge 1$, let $g_j$ be an extension of $f_{j+1} - f_j$ having $\|g_j\| < 2^{-j}$. Then the series
$$\sum_{j=0}^{\infty} g_j$$
converges uniformly on X to an extension of f. Thus f belongs to E also, so E is closed. □

Remark C.2.2. The Tietze extension theorem applies also when Y is any Euclidean space ℝⁿ (just consider each component separately).
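The clamping step at the start of the proof of Proposition C.2.1 — replacing an arbitrary extension g by one whose sup norm equals that of f — is worth seeing in code. A minimal sketch (Python; the function names are ours):

```python
def clamp(g, bound):
    """The function h of the proof: g cut off at +/- bound.  Clamping
    preserves continuity, and where |g(x)| <= bound it leaves g(x)
    unchanged; so if g extends f and bound = sup|f|, the result still
    extends f and has the same sup norm as f."""
    return lambda x: max(-bound, min(bound, g(x)))
```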

Appendix D

Measure Zero

The notion of a set of measure zero (in the real line) is important in the proof of Sard's theorem (the 1-dimensional version, Proposition 3.4.6) and the "general position" results that follow from it. In this section we'll review the basic definitions.

D.1. Measure zero subsets of ℝ and of S¹

Let $I \subseteq \mathbb{R}$ be an interval (open, closed, or half-open). We define the length of I by the natural formula $\ell(I) = \sup I - \inf I$.

Definition D.1.1. Let S be a subset of ℝ. We say that S has measure zero (or is a null set) if, for every $\varepsilon > 0$, there exists a sequence $I_1, I_2, I_3, \dots$ of intervals that cover S and whose total length
$$\sum_{n=1}^{\infty} \ell(I_n)$$
is less than ε.

Example D.1.2. Clearly, any subset of a set of measure zero also has measure zero. The union of two sets of measure zero also has measure zero. A translate of a set of measure zero also has measure zero.


It is usual, in this definition, to require the intervals to be open, but this in fact does not make any difference: if we have a sequence of arbitrary intervals $I_n$ satisfying the definition, we may find open intervals $J_n \supseteq I_n$ with $\ell(J_n) < \ell(I_n) + \varepsilon 2^{-n}$. The open intervals $J_n$ then cover S and $\sum_{n=1}^{\infty} \ell(J_n) < 2\varepsilon$. In particular, taking the original intervals $I_n$ to be points, we find

Lemma D.1.3. Any finite or countable set has measure zero. □
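The covering behind Lemma D.1.3 is completely explicit: put an interval of length $\varepsilon 2^{-(n+1)}$ around the n-th point, so that the total length is at most ε by the geometric series. A sketch (Python; the sample points and ε are arbitrary):

```python
def cover(points, eps):
    """Open intervals covering an enumerated (finite or countable) set;
    the n-th interval has length eps * 2**-(n+1), so the total length
    is at most eps * (1/2 + 1/4 + ...) <= eps."""
    return [(x - eps * 2 ** -(n + 2), x + eps * 2 ** -(n + 2))
            for n, x in enumerate(points)]

intervals = cover([0.1, 0.5, 0.9], eps=0.01)
print(sum(b - a for a, b in intervals))   # 0.00875, below eps = 0.01
```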



The following shows that the notion of “measure zero” is not a vacuous one. Proposition D.1.4. An interval of positive length does not have measure zero.

Proof. Any interval of positive length contains a closed interval of positive length; it suffices therefore to prove the result for a closed interval. I claim that in any covering of a closed interval [a, b] by open intervals, the total length of the open intervals must be at least b − a; it cannot be arbitrarily small.

Any covering of [a, b] by open intervals has a finite subcovering, so it suffices to show that if a closed interval [a, b] is covered by n open intervals, the total length of those intervals must be at least b − a. This we shall do by induction on n. The base case (n = 1) is evident: if a single open interval $I_1$ covers [a, b], then $I_1$ contains both a and b, so its length is at least b − a.

Suppose now that the statement has been proved for coverings by (n − 1) open intervals, and let $I_1, \dots, I_n$ be n open intervals whose union contains [a, b]. One of the intervals, say $I_1 = (\alpha, \beta)$, must contain a. Then $\alpha < a < \beta$. The closed interval [β, b] is covered by $I_2, \dots, I_n$, so by the induction hypothesis,
$$\sum_{k=2}^{n} \ell(I_k) \ge b - \beta.$$


But $\ell(I_1) = \beta - \alpha > \beta - a$, so
$$\sum_{k=1}^{n} \ell(I_k) \ge (\beta - a) + (b - \beta) = b - a,$$
completing the induction as required. □



Note that, as a corollary of Proposition D.1.4 and Lemma D.1.3, we deduce that an interval of positive length is uncountable (i.e., not countable). This fact was originally proved by Cantor using decimal expansions and a diagonal argument.

Example D.1.5. The converse of Lemma D.1.3 is false: there exist uncountable sets of measure zero. The standard example of such an object is the Cantor ternary set C, which is the collection of all $x \in (0, 1)$ that can be written in a ternary (base-3) expansion that does not contain the digit 1. (See Example 2.1.6 for more about this idea. In cases where the expansion is ambiguous, it suffices that one of the possible expansions does not contain the digit 1.) This set is uncountable — in fact, sending a binary expansion made up of 0's and 1's to the ternary expansion made up of 0's and 2's in the corresponding places defines an injection from the whole interval [0, 1] (the standard example of an uncountable set) to C. But C has measure zero. In fact, write $C_n$ for the set of numbers that have a ternary representation which does not contain 1 among the first n digits. Then $C = \bigcap_n C_n$; but each $C_n$ is a union of (closed) intervals whose total length is $(2/3)^n$, and this tends to zero as $n \to \infty$.

Lemma D.1.6. Any bounded open subset S of ℝ can be written as the union of at most countably many disjoint open intervals: $S = \bigcup_j (a_j, b_j)$. If S is contained in a closed interval [a, b], then
$$\sum_j (b_j - a_j) \le b - a;$$
that is, the total length of the open intervals that comprise S is less than or equal to the length of [a, b].

Proof. Define an equivalence relation ∼ on S by saying that x ∼ y if $[x, y] \subseteq S$. The equivalence classes for this relation are easily seen to be open intervals contained in S; so S is a union of disjoint open

that is, the total length of the open intervals that comprise S is less than or equal to the length of [a, b]. Proof. Define an equivalence relation ∼ on S by saying that x ∼ y if [x, y] ⊆ S. The equivalence classes for this relation are easily seen to be open intervals contained in S; so S is a union of disjoint open

226

D. Measure Zero

intervals. Each open interval in R contains a rational number, and the set of rationals is countable, so there can be at most countably many open intervals in any disjoint family. For the statement about length, it suffices to show that if finitely many disjoint intervals (aj , bj ), j = 1, . . . , n, are contained in [a, b], then the sum of their lengths is less than or equal to b − a. Again, we use induction on n, the statement being apparent for n = 1. For the inductive step, assume without loss of generality that a1 = min{a1 , . . . , an }. Because the intervals are disjoint we must have aj  b1 for j = 2, . . . , n. Thus the disjoint intervals (aj , bj ) for j = 2, . . . , n are contained in the closed interval [b1 , b]. By the induction hypothesis, n  (bj − aj )  b − b1 , j=2

and therefore

n 

(bj − aj )  b − a1  b − a.

j=1

This completes the proof.



The key to the proof of Proposition 3.4.6 is Lemma D.1.7. Let f : (0, 1) → R be differentiable with continuous derivative. Let E ⊆ (0, 1) be the set of those points x for which f  (x) = 0. Then f (E) has measure zero. Note that E itself might not have measure zero — for instance, f might be a constant function! Proof. Let Em = {x ∈ (0, 1) : |f  (x)| < 1/m}. Em is an open subset of (0, 1), by the continuity of f  , and therefore it is an open subset of R. Thus by Lemma D.1.6 it is a countable union of disjoint intervals (cn , dn ), with total length at most 1. By the mean-value theorem, for x, x ∈ (cn , dn ) we must have |f (x) − f (x )|  (1/m)|x − x | < (1/m)(dn − cn ) and thus f ((cn , dn )) is a subset of some open interval (an , bn ) of length at most (2/m)(dn −cn ). It follows that f (Em ) is a subset of the union 0 n (an , bn ), which is a union of intervals with total length at most


$(2/m)\sum_n (d_n - c_n) \le 2/m$. Since $f(E) \subseteq f(E_m)$ for every m, this completes the proof. □

Remark D.1.8. When we apply this idea in Chapter 3, we will need to talk about measure zero for subsets of the unit circle S¹, the set of "directions of rays" through the origin in the complex plane. To do this, let X be a subset of S¹. Choose an interval $(a, b) \subseteq \mathbb{R}$ such that $b - a > 1$, so that the map $\eta : t \mapsto e^{2\pi i t}$ sends (a, b) onto S¹, and define X to have measure zero in S¹ if and only if $\eta^{-1}(X) \cap (a, b)$ has measure zero in ℝ. (It is not hard to see that the notion so defined is independent of (a, b): increasing the size of the parameter interval may lead to the same point of S¹ being "counted" multiple times, but using the properties noted in Example D.1.2, one proves that the notion of "measure zero" is not affected by this.)

Using Proposition D.1.4, one sees that the whole circle S¹ does not have measure zero. Thus, the complement of a measure zero subset of the circle is nonempty. That is the crucial fact we will need in applying Sard's theorem to show that there exist "transverse" intersections of a smooth path with a generic ray.

Appendix E

Calculus on Normed Spaces

E.1. Normed vector spaces

This appendix sketches the modern approach to multivariable calculus in the context of normed vector spaces. This approach has two advantages:

(a) It does away with the complicated notation of partial derivatives.

(b) It puts into prominence the central notion that the derivative of a map is the best linear approximation to that map at a particular point.

For greater detail about these ideas, the best reference is still Dieudonné [14, Chapter 8].

Let V be a (real or complex) vector space. Recall from Remark A.5.5 the following definition.

Definition E.1.1. A norm on V is a map $V \to \mathbb{R}^+$, denoted by $v \mapsto \|v\|$, such that:

(i) $\|v\| \ge 0$ for all v, with equality if and only if v = 0.

(ii) $\|\lambda v\| = |\lambda|\,\|v\|$ for all scalars λ and vectors v.

(iii) $\|v + w\| \le \|v\| + \|w\|$.


The triangle inequality (iii) above implies that a norm gives rise to a metric via the expression $d(u, v) = \|u - v\|$.

Example E.1.2. Let V be a finite-dimensional vector space and choose a basis $\{v_1, \dots, v_n\}$ for V. Each $v \in V$ can then be written uniquely as a sum $\sum_{j=1}^{n} \lambda_j v_j$. The expression
$$\|v\| = \Bigl(\sum_{j=1}^{n} |\lambda_j|^2\Bigr)^{1/2}$$

then defines a norm on V (in fact, this is the norm associated to an inner product for which the chosen basis is orthonormal; see Corollary A.5.3).

Definition E.1.3. Two norms $\|\cdot\|_1$ and $\|\cdot\|_2$ on the same vector space V are equivalent if there is a constant $m > 0$ such that
$$\|v\|_1 \le m\,\|v\|_2, \qquad \|v\|_2 \le m\,\|v\|_1$$

for all $v \in V$.

Proposition E.1.4. Any two norms on a finite-dimensional vector space are equivalent.

Proof. Let V be a finite-dimensional (real) normed space, with norm $\|\cdot\|$. Choose a basis $\{v_1, \dots, v_n\}$ and define a function from the sphere $S^{n-1} = \{(x_1, \dots, x_n) \in \mathbb{R}^n : x_1^2 + \cdots + x_n^2 = 1\}$ to $\mathbb{R}^+$ by
$$(x_1, \dots, x_n) \mapsto \Bigl\|\sum_{j=1}^{n} x_j v_j\Bigr\|.$$
The properties of the norm show that this is a continuous, nowhere-vanishing function on the compact space $S^{n-1}$, hence bounded between $m^{-1}$ and m for some $m > 0$. This shows that $\|\cdot\|$ is equivalent to the norm associated to choosing $\{v_1, \dots, v_n\}$ as an orthonormal basis. Since $\|\cdot\|$ was arbitrary, all norms are equivalent. □

It follows that all norms on a finite-dimensional vector space give rise to the same topology and that they are all complete (that is, Cauchy sequences converge). All the theory that we are going to develop in this appendix works for complete normed vector spaces


(even infinite-dimensional ones), but to keep things down to earth we will state it only in the finite-dimensional case.

Remark E.1.5. If V and W are finite-dimensional vector spaces, so is the space L(V, W) of linear transformations from V to W. If V and W have specified norms, it is natural to equip L(V, W) with the norm
$$\|T\| = \sup\{\|Tv\| : v \in V,\ \|v\| \le 1\},$$
which is called the operator norm of T.
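For a matrix acting between Euclidean spaces, the operator norm of Remark E.1.5 equals the largest singular value. The sketch below (Python with numpy, where `np.linalg.norm(T, 2)` computes exactly this matrix norm; the matrix is an arbitrary illustration) compares that value with a brute-force supremum over sampled unit vectors:

```python
import numpy as np

T = np.array([[1.0, 2.0],
              [0.0, 1.0]])

# Brute force: sup of |T v| over (a dense sample of) unit vectors v.
angles = np.linspace(0, 2 * np.pi, 10_000)
brute = max(np.linalg.norm(T @ np.array([np.cos(a), np.sin(a)]))
            for a in angles)

print(brute, np.linalg.norm(T, 2))   # the two values agree closely
```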

E.2. The derivative

For a map $f : \mathbb{R} \to \mathbb{R}$, the standard definition of the derivative is
$$f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}.$$
This definition does not generalize directly to maps between vector spaces because you cannot divide one vector by another. We must first reformulate it, as follows.

Proposition E.2.1. Let $f : \mathbb{R} \to \mathbb{R}$ be a function, and let $x \in \mathbb{R}$. The derivative $f'(x)$ (if it exists) is the unique constant $t \in \mathbb{R}$ with the property that
$$f(x + h) = f(x) + th + o(h).$$
Here the notation o(h) refers to some function of h, say g(h), with the property that $|g(h)|/|h|$ tends to 0 as h tends to 0.

Proof. Once we have decrypted the o notation, we see that the equation in the proposition tells us that
$$\frac{f(x + h) - f(x)}{h} - t \to 0 \quad \text{as } h \to 0;$$
that is, by definition, $f'(x)$ exists and equals t. □



Remark E.2.2. The o notation makes sense in finite-dimensional vector spaces too: if V, W are such spaces, the symbol o(h), for a vector variable $h \in V$, will refer to some function $g(h) \in W$ such that $\|g(h)\|/\|h\|$ tends to 0 as h tends to 0 in V. Because any two norms


in a finite-dimensional space are equivalent (Proposition E.1.4), this definition does not depend on the choice of norms.

Lemma E.2.3. Let $T : V \to W$ be a linear map between finite-dimensional vector spaces. If $T(h) = o(h)$ for $h \in V$, then T = 0.

Proof. If T is nonzero, there is some $v \in V$ with $\|v\| = 1$ and $\|T(v)\| = c > 0$. But then, putting $h = \lambda v$,
$$\|T(h)\|/\|h\| = c \quad \text{for all } \lambda \ne 0,$$
contradicting the hypothesis T(h) = o(h). □



Definition E.2.4. Let V and W be finite-dimensional vector spaces over R, let Ω be an open subset of V , and let f : Ω → W be a continuous map. We say that f is differentiable at x ∈ Ω if there is a linear map T : V → W such that (E.2.5)

f (x + h) = f (x) + T [h] + o(h)

as $h \to 0$ in V. The map T is called the derivative of f at x, and we denote it by Df(x).

The underlying idea here is that the derivative Df(x) gives the best linear approximation to f near x. Lemma E.2.3 shows that the derivative (if it exists) is uniquely determined by (E.2.5).

Convention E.2.6. Notice that Df(x), the derivative of f at x, is itself a linear map from V to W. Given $h \in V$, then, Df(x)[h] is a vector in W, depending both on x and on h. Calculations with derivatives tend to involve a plethora of parentheses like this. To help keep them straight, I will use a (nonstandard) convention. When the value of a function depends linearly on its vector argument, I will use square brackets, like [h]; when the dependence is not necessarily linear I will use parentheses, like (x).

Example E.2.7. Suppose that V = ℝ. A map f from ℝ to W is just a path in W. We defined the derivative $f'(x) \in W$ of such a path at the beginning of Section 3.4, by the usual formula
$$f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}.$$


To reconcile this with Definition E.2.4 we need only rewrite the displayed equation as
$$f(x + h) = f(x) + f'(x)h + o(h).$$
Comparing with Definition E.2.4 we see that Df(x) is the linear map $\mathbb{R} \to W$ that sends h to $f'(x)h$. In fact, every linear map $\mathbb{R} \to W$ is of the form $h \mapsto ch$, where c is a constant vector, so the two versions of the definition contain exactly the same information.

Example E.2.8. Suppose that W = ℝ, so that we are looking at a real-valued function f defined on an open subset of a higher-dimensional vector space V. The derivative Df is now a linear map $V \to \mathbb{R}$, that is, an element of the dual space $V^*$ of V (Definition A.4.1). One should think of this as a version of the directional derivative from traditional multivariable calculus, in that Df(x)[v] measures the rate of change of f at the point x in the direction of the vector v. In fact, holding x and v fixed and letting t be an auxiliary real variable, we have by definition and using linearity
$$f(x + tv) = f(x) + Df(x)[tv] + o(t) = f(x) + tDf(x)[v] + o(t).$$
Thus
$$Df(x)[v] = \left.\frac{d}{dt} f(x + tv)\right|_{t=0},$$

which is the usual definition of the directional derivative of f in the direction v.
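Example E.2.8 invites a numerical check: the difference quotient (f(x + tv) − f(x))/t should approach Df(x)[v], the gradient paired with v. A sketch (Python; the function f and the point are arbitrary illustrations):

```python
import math

def f(x, y):                         # an illustrative smooth function on R^2
    return math.sin(x) * math.exp(y)

def directional(fn, x, y, v, t=1e-6):
    """Difference-quotient approximation to Df(x)[v] of Example E.2.8."""
    return (fn(x + t * v[0], y + t * v[1]) - fn(x, y)) / t

x, y, v = 1.0, 0.5, (3.0, -2.0)
# Df(x)[v] pairs the gradient (cos x * e^y, sin x * e^y) with v:
exact = math.cos(x) * math.exp(y) * v[0] + math.sin(x) * math.exp(y) * v[1]
print(directional(f, x, y, v), exact)   # agree to about six decimal places
```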

E.3. Properties of the derivative

In this section we will briefly review some key properties of this notion of derivative. The first result is a version of the mean value theorem (expressed as an inequality rather than an equality):

Proposition E.3.1 (Increment formula). Let V, W be finite-dimensional (normed) vector spaces, and let f be a differentiable map from a convex open subset Ω of V to W. Suppose that there is a constant


$C > 0$ such that the operator norm $\|Df(x)\| \le C$ for all $x \in \Omega$. Then, for all $x, y \in \Omega$,
$$\|f(x) - f(y)\| \le C\,\|x - y\|.$$

Proof. Suppose that the result is false. Then there exist $x, y \in \Omega$ and $\delta > 0$ such that $\|f(x) - f(y)\| \ge (C + \delta)\|x - y\|$. Let us call a pair of points x, y for which this happens a δ-bad pair. Let $z = (x + y)/2$ be the midpoint of [x, y]. Then one of the pairs (x, z) and (z, y) must also be δ-bad. Let $(x_1, y_1)$ be such a pair. Repeating this argument we obtain a sequence of δ-bad pairs $(x_n, y_n)$, all lying on the line segment [x, y] and each one obtained by bisecting the previous one. Clearly there is some z on the line segment such that $x_n \to z$, $y_n \to z$. But then by construction there are points z + h arbitrarily close to z with
$$\|f(z + h) - f(z)\| \ge (C + \delta)\|h\|.$$
This contradicts the definition of the derivative since
$$\|f(z + h) - f(z)\| \le \|Df(z)[h]\| + o(h) \le C\|h\| + o(h),$$
using the definition of the operator norm. □



The most important fact about derivatives is the chain rule. In our language, the chain rule simply states that the best linear approximation to a composite map f ∘ g is the composite of the best linear approximation to f and the best linear approximation to g. (Contrast the simplicity and generality of this statement with the numerous complicated special cases that appear in the usual Calculus III course!) To put it more precisely,

Proposition E.3.2 (Chain rule). Let V, V′, and W be three finite-dimensional vector spaces over ℝ. Let Ω be an open subset of V and Ω′ an open subset of V′. Let $g : \Omega \to V'$ be continuous with $g(\Omega) \subseteq \Omega'$, let $f : \Omega' \to W$ be continuous, let g be differentiable at $p \in \Omega$, and let f be differentiable at $q = g(p) \in \Omega'$. Then f ∘ g is differentiable at p, and
$$D(f \circ g)(p) = Df(q) \circ Dg(p) = Df(g(p)) \circ Dg(p) \in L(V, W),$$
where $Dg(p) \in L(V, V')$ and $Df(q) \in L(V', W)$, so that their composite belongs to L(V, W), as does D(f ∘ g)(p).


Proof. There is no loss of generality in assuming that p = q = 0. From the definition of the derivative, we may write
$$g(x) = g(x) - g(0) = Dg(0)[x] + o_1(x), \qquad f(y) - f(0) = Df(0)[y] + o_2(y),$$
where $o_1, o_2$ denote "error terms" which satisfy the following: for each fixed $\varepsilon \in (0, 1)$ there is $\delta > 0$ such that
$$\|o_1(x)\| \le \varepsilon\|x\| \text{ provided } \|x\| < \delta, \qquad \|o_2(y)\| \le \varepsilon\|y\| \text{ provided } \|y\| < \delta.$$
Let $a = \max\{\|Dg(0)\|, \|Df(0)\|\}$ (operator norm). Then provided $\|x\| < \delta$ we have $\|g(x)\| \le (1 + a)\|x\|$. Now take y = g(x) and substitute the first equation above into the second. This gives
$$f(g(x)) - f(g(0)) = Df(0) \circ Dg(0)[x] + Df(0)[o_1(x)] + o_2(g(x)).$$
If we take $\|x\| < \delta/(1+a)$, then the two right-hand terms are bounded by $a\varepsilon\|x\| + \varepsilon(1 + a)\|x\| = (2a + 1)\varepsilon\|x\|$, and since ε was arbitrary this shows that the sum of these terms is o(x), as required. □

What about higher derivatives? The space L(V; W) of linear maps from V to W is itself a finite-dimensional vector space. Thus, if f is differentiable in Ω, then Df is a map from Ω to L(V; W) and one can ask about its continuity, differentiability, and so on. The second derivative of f, D²f, is the derivative of Df. By definition, D²f(x) is a linear map from V to L(V; W). But this is exactly the same thing as a bilinear map from V × V to W: that is to say, the expression
(E.3.3)

$$D^2 f(x)[v_1, v_2]$$

gives an element of W for all pairs of vectors $v_1, v_2 \in V$ and, moreover, depends linearly both on $v_1$ and on $v_2$. (In a similar way we can define third and higher derivatives. We say a function is smooth if it has derivatives of all orders.)

Proposition E.3.4 (Clairaut's theorem). Suppose that f is a function from an open subset Ω of V to W, twice differentiable at a


point $p \in \Omega$. Then $D^2 f(p)$ is a symmetric bilinear map $V \times V \to W$. That is to say, $D^2 f(p)[x, y] = D^2 f(p)[y, x]$ for all $x, y \in V$.

This result is familiar in Calculus III under the form "the mixed partial derivatives are symmetric" — that is, $\partial^2 f/\partial x \partial y = \partial^2 f/\partial y \partial x$.

Proof. There is no loss of generality in assuming that p = 0 and f(0) = 0; moreover, by adding a linear term to f, we may assume that Df(0) = 0 also. The proof will show that under these assumptions, $D^2 f(0)[x, y]$ is a good approximation to the expression $f(x + y) - f(x) - f(y)$. Since this expression is clearly symmetric in x and y, the result will follow.

Here are the details. We start by applying the increment formula (Proposition E.3.1) to the map $\varphi : x \mapsto f(x + y) - f(x) - Df(y)[x]$ (y fixed). The derivative of φ is
$$D\varphi(x) = Df(x + y) - Df(x) - Df(y) = D^2 f(0)[(x + y) - x - y] + o(\|x\| + \|y\|) = o(\|x\| + \|y\|).$$
Therefore by the increment formula
$$\|\varphi(x) - \varphi(0)\| = \|f(x + y) - f(x) - f(y) - Df(y)[x]\| \le \|x\|\,o(\|x\| + \|y\|) \le o((\|x\| + \|y\|)^2).$$
Moreover, by definition of the derivative, $Df(y) - D^2 f(0)[y] = o(\|y\|)$ and thus
$$\|D^2 f(0)[y, x] - Df(y)[x]\| \le o(\|y\|\,\|x\|) \le o((\|x\| + \|y\|)^2).$$
Putting these together,
$$\|f(x + y) - f(x) - f(y) - D^2 f(0)[y, x]\| \le o((\|x\| + \|y\|)^2)$$
and therefore, using the symmetry of $f(x + y) - f(x) - f(y)$ in x and y,
$$\|D^2 f(0)[x, y] - D^2 f(0)[y, x]\| \le o((\|x\| + \|y\|)^2).$$
A rescaling argument (as in the proof of Lemma E.2.3) now completes the proof. □
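The quantity at the heart of the proof, f(x + y) − f(x) − f(y), also gives a practical test of Proposition E.3.4: corrected for the value at p and divided by h², it approximates the mixed second derivative. A sketch (Python; the function f and the point are arbitrary illustrations):

```python
import math

def f(x, y):
    return math.sin(x * y) + x ** 3 * y     # an illustrative smooth function

def second_difference(fn, x, y, h=1e-4):
    """(f(p + h e1 + h e2) - f(p + h e1) - f(p + h e2) + f(p)) / h^2:
    a discrete version of the symmetric expression in the proof; it
    approximates D^2 f(p)[e1, e2], i.e. the mixed partial at p = (x, y)."""
    return (fn(x + h, y + h) - fn(x + h, y)
            - fn(x, y + h) + fn(x, y)) / h ** 2

x, y = 0.7, 0.3
# Both mixed partials equal cos(xy) - xy sin(xy) + 3x^2, as Clairaut predicts:
analytic = math.cos(x * y) - x * y * math.sin(x * y) + 3 * x ** 2
print(second_difference(f, x, y), analytic)   # agree to roughly four digits
```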


Remark E.3.5. Our definitions of derivative and of smooth have been given for functions defined on open subsets Ω of a vector space. The alert reader will have noticed, however, that we used the word “smooth” in Section 3.4 in the context of paths, that is, of maps defined on [0, 1], which is definitely not an open subset of R. Later, we shall also need to talk about smooth maps defined on the unit square [0, 1] × [0, 1] in R2 and a few similar examples of closed subsets C ⊆ Rn . The definitions of this appendix still work in this context: the key observation is that for every x ∈ C, even when x is a boundary point, the vectors h ∈ Rn such that x+h ∈ C are sufficiently abundant to span Rn . This ensures that Definition E.2.4 defines the derivative uniquely.

E.4. The inverse function theorem

At one point in Chapter 8 we will also need the inverse function theorem. We state it as follows:

Theorem E.4.1. Let f be a smooth map from an open subset Ω of V to W, and suppose that for some $x_0 \in \Omega$ the linear map $Df(x_0) : V \to W$ is invertible. Then there is an open set Ω′ contained in Ω and containing $x_0$ such that the restriction of f to Ω′ is a bijection $\Omega' \to f(\Omega')$. Moreover, if $g : f(\Omega') \to \Omega'$ denotes the inverse map, then g is smooth, and $Dg(f(x_0)) = (Df(x_0))^{-1}$ as a linear map W → V. □

Since this result is only used in one example in this book, we won't give the proof here. The argument, which may be found in [14, 10.2.5], uses an iteration which produces the inverse map by successive approximations. Because of the limiting process involved here, if we are working in the general normed space context, it is necessary to assume that the spaces involved are complete. This is automatic in the finite-dimensional case.

Appendix F

Hilbert Space

Functional analysis is the subject which arose at the beginning of the twentieth century when mathematicians began to systematize the insight that many of the standard processes of analysis — differentiation and integration are the most obvious examples — are linear operations and should therefore be considered in the context of linear algebra (where the underlying vector spaces are infinite-dimensional). The development of quantum mechanics in the 1920s and beyond underlined the importance of infinite-dimensional vector spaces, especially those equipped with an inner product. These are the Hilbert spaces whose theory we will review in this appendix. In Chapter 7 we extract an integer invariant, closely related to the winding number, from this infinite-dimensional linear algebra.

F.1. Definition and examples

Recall from Definition A.5.1 that an inner product on a complex vector space V is a complex-valued function on V × V, written $\langle\cdot,\cdot\rangle$, which has the properties that $\langle u, \lambda_1 v_1 + \lambda_2 v_2\rangle = \lambda_1 \langle u, v_1\rangle + \lambda_2 \langle u, v_2\rangle$, that $\langle u, v\rangle = \overline{\langle v, u\rangle}$, and that $\langle u, u\rangle$ is a nonnegative real number, zero only when u = 0. A vector space V equipped with an inner product is an inner product space.
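For concreteness, these axioms can be tested on C³ with the standard inner product; NumPy's vdot is conjugate-linear in its first argument and linear in its second, matching this convention. A minimal sketch (illustrative values only):

```python
# Checking the inner product axioms of Definition A.5.1 numerically on C^3.
# np.vdot conjugates its first argument, matching the convention in the text.
import numpy as np

u = np.array([1 + 2j, -1j, 3.0])
v = np.array([2 - 1j, 1 + 1j, 4j])
lam = 0.5 - 2j

print(np.isclose(np.vdot(u, lam * v), lam * np.vdot(u, v)))  # linear in 2nd slot
print(np.isclose(np.vdot(u, v), np.conj(np.vdot(v, u))))     # conjugate symmetry
print(np.vdot(u, u).real > 0)                                # positivity
```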


The norm on an inner product space V is defined by $\|v\| = \sqrt{\langle v, v\rangle}$. The norm on V defines a metric by $d(u, v) = \|u - v\|$.

Definition F.1.1. An inner product space is called a Hilbert space if it is complete as a metric space; that is, every Cauchy sequence converges. (Remark B.3.11.)

Example F.1.2. Let ℓ²(N) denote the space of square-summable sequences of complex numbers; that is, an element a ∈ ℓ²(N) is a sequence $(a_n)_{n\in\mathbb{N}}$ such that $\sum_{n=1}^{\infty} |a_n|^2 < \infty$, and the inner product of two such sequences is defined as

$\langle a, b\rangle = \sum_{n=1}^{\infty} \bar{a}_n b_n$

(the sum converges absolutely). It is a standard exercise to check that this is a Hilbert space. The space ℓ²(Z) of square-summable two-way sequences $(b_n)_{n\in\mathbb{Z}}$ of complex numbers can be defined in a completely analogous way (as indeed can the space ℓ²(S) for any set S).

One can think of the inner product in Example F.1.2 as being the most direct "infinite-dimensional" generalization of the dot product a · b = a₁b₁ + a₂b₂ + a₃b₃ of vectors in R³, obtained by allowing the subscripts n to vary over the infinite range N or Z rather than just the finite set {1, 2, 3}. Having taken that step, though, one might go further and ask whether the discrete variable n can be replaced by a continuous one. For example, suppose that we consider continuous complex-valued functions on the unit circle, equivalently, continuous 2π-periodic functions on R. The formula

(F.1.3)  $\langle u, v\rangle = \frac{1}{2\pi}\int_0^{2\pi} \bar{u}(t)\,v(t)\,dt$

defines an inner product on the space of such functions. Do we obtain a Hilbert space in this way? In other words, is this inner product space complete? The answer is no:


Example F.1.4. Let $\{u_n\}$ be the sequence of continuous functions on [0, 2π] defined by

$u_n(t) = \begin{cases} nt & (0 \le t \le 1/n), \\ 1 & (1/n \le t \le 1), \\ 1 - n(t-1) & (1 \le t \le 1 + 1/n), \\ 0 & (1 + 1/n \le t \le 2\pi). \end{cases}$

Then it is easy to see that $(u_n)$ is a Cauchy sequence (in the norm arising from the inner product (F.1.3)), but that it does not converge in this norm to any continuous function.

The situation described in the above example is analogous to one familiar from introductory analysis. Suppose for instance that we define a sequence of rational numbers inductively by $q_1 = 1$, $q_2 = \frac{3}{2}$, $q_3 = \frac{17}{12}$, and so on, with $q_{n+1} = \frac{1}{2}(q_n + 2/q_n)$. Then $\{q_n\}$ is a Cauchy sequence of rational numbers, but it does not converge to a rational limit: it is "trying" to approach the irrational number $\sqrt{2}$. Similarly, the sequence $\{u_n\}$ above is "trying" to approach the discontinuous function that equals 1 for 0 < t ≤ 1 and 0 for other values of t.

In the case of the rational numbers, we know that by enlarging the rational number system to the real number system, we can arrive at a complete space; in other words, we can ensure that every Cauchy sequence converges. We might therefore ask whether we can similarly enlarge the space of continuous functions on the circle to a larger space (including some discontinuous functions) which will be complete (in other words, a Hilbert space) with respect to the L² inner product (F.1.3).

The Lebesgue theory of integration, developed at the turn of the twentieth century, provides a positive answer. It can be shown that if $\{u_n\}$ is a Cauchy sequence of continuous functions with respect to the L² norm, then there is a subsequence $\{u_{n_k}\}$ that converges "almost everywhere": that is to say, the points x for which the sequence of real numbers $\{u_{n_k}(x)\}$ does not converge form a set of measure zero. (This result can be proved by elementary means, i.e., using only the definition of "measure zero" in Appendix D; the reader may enjoy thinking about this.)
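A quick computation supports the claim in Example F.1.4. The following Python sketch is illustrative only: it approximates the integral in (F.1.3) by a Riemann sum on a uniform grid and shows the L² distance between consecutive u_n shrinking, while the supremum distance does not.

```python
# Riemann-sum check (illustrative) that the ramps u_n of Example F.1.4 are
# Cauchy in the L^2 norm of (F.1.3), though not in the supremum norm.
import numpy as np

t = np.linspace(0.0, 2.0 * np.pi, 400001)
dt = t[1] - t[0]

def u(n):
    # nt on [0, 1/n]; 1 on [1/n, 1]; 1 - n(t-1) on [1, 1+1/n]; 0 afterwards
    return np.clip(np.minimum(n * t, 1.0) - np.maximum(n * (t - 1.0), 0.0), 0.0, 1.0)

def l2_norm(g):
    return np.sqrt(np.sum(np.abs(g) ** 2) * dt / (2.0 * np.pi))

for n in [10, 100, 1000]:
    d = u(n) - u(2 * n)
    print(n, l2_norm(d), np.max(np.abs(d)))  # L^2 distance -> 0; sup stays 1/2
```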

The functions u(x) obtained as "almost everywhere" limits of Cauchy sequences of continuous functions in this way are called square-integrable or L² functions. Lebesgue's theory allows one to interpret the expression (F.1.3) for the inner product even when u and v are merely square-integrable (and not assumed to be continuous). Moreover, the space L²(S¹) of square-integrable functions on the circle, equipped with this inner product, is a Hilbert space: it is complete.

Remark F.1.5. There is one nuance which is important here. Remember that the convergence appearing above was only "almost everywhere", and as a result we should think of an L² function as being defined only "almost everywhere". The points of L²(S¹) are therefore strictly speaking equivalence classes of functions, two functions being regarded as equivalent if they differ only on a set of measure zero.

The two basic examples that we have described above (ℓ² and L²) are even more closely related than they may at first appear. In fact, let H = L²(S¹) be the Hilbert space of square-integrable functions on the circle. The functions $e_n(t) = e^{int}$, n ∈ Z, form an orthonormal set in H: that is, $\langle e_n, e_m\rangle$ equals 1 if n = m and equals 0 if n ≠ m.

Definition F.1.6. Let f ∈ L²(S¹) be a square-integrable function. The Fourier coefficients of f are the inner products

$c_n = \langle e_n, f\rangle = \frac{1}{2\pi}\int_0^{2\pi} f(t)\,e^{-int}\,dt.$

(Some textbooks may have definitions that differ by a factor of 2π or $\sqrt{2\pi}$.)

Now we have

Proposition F.1.7. The map that sends a function f to the sequence $\{c_n := \langle e_n, f\rangle\}$ of its Fourier coefficients is an isometric isomorphism from L²(S¹) to ℓ²(Z). The inverse map is defined as follows: for a sequence $\{c_n\}$ in ℓ²(Z), the series $\sum c_n e_n$ converges in L²(S¹) to a function f whose Fourier coefficients are the given sequence.

Outline of proof. A finite linear combination of the functions e_n is called a trigonometric polynomial. If f is a trigonometric polynomial,

say $f = \sum_{n=-N}^{N} c_n e_n$, then one easily checks that

$\langle e_k, f\rangle = \begin{cases} c_k & (|k| \le N), \\ 0 & \text{otherwise}, \end{cases} \qquad \|f\|^2 = \sum_n |c_n|^2.$

It follows from the Stone-Weierstrass theorem (Theorem C.1.5) that every continuous function is a uniform limit of trigonometric polynomials and therefore that every L² function is a limit (in L²) of trigonometric polynomials. The equality above may therefore be extended to all L² functions by passing to the limit. □

This result may also be expressed by saying that the orthonormal set $\{e_n\}$ is complete. The complete orthonormal set $\{e_n\}$ is called the Fourier basis of H = L²(S¹).

Remark F.1.8. The space L²(S¹) is only one example of a wider class of Hilbert spaces that arise from measure theory. Indeed, if (X, μ) is a measure space, we may define L²(X, μ) to consist of equivalence classes (modulo equality μ-almost everywhere) of functions f : X → C such that

$\|f\|^2 = \int_X |f(x)|^2 \, d\mu(x) < \infty.$

However, the space L²(S¹) is the only example of this general construction that we will need to use.
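Proposition F.1.7 lends itself to a numerical illustration. In the sketch below (illustrative choices throughout: the square wave, the grid size, and the truncation range are arbitrary), the integral of Definition F.1.6 is approximated by a Riemann sum, and the squared ℓ² norm of the resulting coefficients approaches the squared L² norm of f, as the isometry predicts.

```python
# Riemann-sum sketch of the isometry L^2(S^1) -> l^2(Z) of Proposition F.1.7:
# compute c_n = (1/2pi) \int f(t) e^{-int} dt and compare sum |c_n|^2 to ||f||^2.
import numpy as np

M = 4096
t = np.linspace(0.0, 2.0 * np.pi, M, endpoint=False)
f = np.where(t <= np.pi, 1.0, -1.0)          # a square wave, an L^2 function

def c(n):
    return np.mean(f * np.exp(-1j * n * t))  # Riemann sum for <e_n, f>

coeffs = np.array([c(n) for n in range(-200, 201)])
print(np.sum(np.abs(coeffs) ** 2))           # approx 1, closer as the range grows
print(np.mean(np.abs(f) ** 2))               # ||f||^2 = 1
```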

F.2. Orthogonality

Two vectors u and v in an inner product space are called orthogonal if $\langle u, v\rangle = 0$.

Lemma F.2.1 (Pythagoras's theorem). If u, v ∈ H are orthogonal, then $\|u + v\|^2 = \|u\|^2 + \|v\|^2$.

Proof. Write

$\|u + v\|^2 = \langle u + v, u + v\rangle = \|u\|^2 + \langle u, v\rangle + \langle v, u\rangle + \|v\|^2.$

Since $\langle u, v\rangle = \overline{\langle v, u\rangle}$, both cross terms vanish, and this gives the result. □

A similar expansion for any two vectors u, v gives the parallelogram law

(F.2.2)  $\|u + v\|^2 + \|u - v\|^2 = 2\|u\|^2 + 2\|v\|^2$

(the "cross terms" cancel out). Geometrically, this states that the sum of the squares on the two diagonals of a parallelogram is equal to the sum of the squares on all four edges.

[Figure F.1. Proof of the projection theorem.]

Let u be a vector in some Hilbert space H. The collection u⊥ of all vectors orthogonal to u is a closed subspace of H because it is the kernel of the continuous linear map $v \mapsto \langle v, u\rangle$ from H to C. It follows that for any S ⊆ H, the collection

$S^\perp := \bigcap_{u \in S} u^\perp$

of vectors orthogonal to every u ∈ S is also a closed subspace. It is called the orthogonal of S.

Remark F.2.3. Notice that not every subspace of a Hilbert space is closed. We have already seen counterexamples: the space of continuous functions inside L²(S¹), or the space of finitely nonzero sequences inside ℓ²(Z), is not closed.

Theorem F.2.4 (Projection theorem). Let H be a Hilbert space and E a closed subspace of H. Then the orthogonal E⊥ is a complementary subspace (Definition A.1.7) to E.

Outline of the proof. It is clear that E ∩ E⊥ = {0}, since the only vector that is orthogonal to itself is the zero vector. Thus, we have to prove that every v ∈ H can be written as a sum x + y, where x ∈ E

of vectors orthogonal to every u ∈ S is also a closed subspace. It is called the orthogonal of S. Remark F.2.3. Notice that not every subspace of a Hilbert space is closed. We have already seen counterexamples: the space of continuous functions inside L2 (S 1 ), or the space of finitely nonzero sequences inside 2 (Z), is not closed. Theorem F.2.4 (Projection theorem). Let H be a Hilbert space and E a closed subspace of H. Then the orthogonal E ⊥ is a complementary subspace (Definition A.1.7) to E. Outline of the proof. It is clear that E ∩ E ⊥ = {0} since the only vector that is orthogonal to itself is the zero vector. Thus, we have to prove that every v ∈ H can be written as a sum x + y, where x ∈ E

F.2. Orthogonality

245

and y ∈ E ⊥ . It follows from Pythagoras’s theorem that if this can be done, then x is the unique point of E that is closest to v. Thus, to construct x, we try to show that there is a point of E at which the minimum distance to v is attained. In finite dimensions this would be an easy compactness argument. In infinite-dimensional spaces we must proceed more carefully, as follows. Let c = inf{ x − v : x ∈ E} be the infimal distance from v to E. We want to show that c is attained by some x ∈ E. Suppose that x , x ∈ E and let w = x + x − v, so that the points v, x , w, x form a parallelogram (see Figure F.1). Notice that the midpoint of the parallelogram, 12 w + v = 12 x + x , belongs to E, so that

12 (w − v) 2  c2 . By the parallelogram law

x − x 2 = 2( x − v 2 + x − v 2 ) − w − v 2  2( x − v 2 + x − v 2 ) − 4c2 . Now let {xn } be a sequence of points of E such that xn − v → c. Applying the above identity with x = xn and x = xn we see that

xn − xn → 0 as n , n → ∞. That is, {xn } is a Cauchy sequence. By completeness, it converges to a point x ∈ H, and since E is closed, we in fact have x ∈ E. This finishes the proof.  As a consequence of this, we find that a version of the representation theorem (Theorem A.5.4) is true for Hilbert space. Proposition F.2.5. Every continuous linear functional on a Hilbert space H is represented by the inner product. That is, if ϕ : H → C is a continuous linear map, then there exists x ∈ H such that ϕ(v) = v, x for all v ∈ H. Proof. We may assume that ϕ is not the zero functional (otherwise the result is trivial). Let K = ker(ϕ), which is a closed subspace of codimension one. By the projection theorem, K ⊥ is a complementary subspace, and it has dimension one by Proposition A.3.9. Let w be a unit vector in K ⊥ and let x = ϕ(w)w. Then w, x = ϕ(w) w 2 = ϕ(w). It follows that the equality ϕ(v) = v, x

246

F. Hilbert Space

holds for all v ∈ K ⊥ , and it also holds for all v ∈ K since both sides  are zero, so it holds for all v ∈ K ⊕ K ⊥ = H.

F.3. Operators Let H and K be Hilbert spaces and let T : H → K be a linear map. One says that T is bounded if the quantity (F.3.1)

T := sup{ T x : x  1}

is finite. In that case, T is called the norm of T . The bounded linear maps are exactly those which are continuous when we consider H and K as metric spaces. In functional analysis we usually restrict our attention to such maps. The collection of all bounded linear maps from H to K, denoted B(H; K) (or just B(H) if H = K) then becomes a normed vector space, and the completeness of K easily implies that B(H; K) is also complete. A bounded linear map is also called a linear operator. Linear maps can be composed (multiplied) as well as added; that is, B(H) is not only a normed vector space but also a ring. Notice the inequality relating the norm to the composition of linear maps

ST  S

T

which follows easily from the definition of the norm of an operator in (F.3.1) above. Suppose that T : H → K is a bounded linear map. Then, for each fixed y ∈ K, the map from H to C defined by x → T x, y is bounded and linear. According to the representation theorem (Proposition F.2.5), then, it is represented by the inner product with an element of H. Let us call this element (which of course depends on y) T ∗ y. That is, we have by definition

T x, y = x, T ∗ y for all x, y. Proposition F.3.2. The map T ∗ : K → H so defined is a bounded linear operator, with norm T ∗ = T .

F.3. Operators

247

Proof. The expression for the norm follows from the equation

T = sup{| T x, y| : x , y  1}, which in turn is a consequence of the Cauchy-Schwarz inequality.  The operator T ∗ is called the adjoint of T . Definition F.3.3. An operator T ∈ B(H) has finite rank if Im(T ) is finite-dimensional. An operator T is compact if there exists a sequence of finite rank operators Tn that converges to T in norm. The space of compact operators on H is denoted K(H). It is easy to see that the sum of two finite rank operators is finite rank and that the composite (either way around) of a finite rank operator and a bounded operator is finite rank. For compact operators, this implies: Proposition F.3.4. K(H) forms a closed, two-sided ideal in B(H); that is to say, the sum of two compact operators is compact, the product of a compact operator and a bounded operator (in either order ) is compact, and the limit of a convergent sequence of compact operators is compact.  We will need the following standard lemma at one point. Its tricky proof, which may be found in standard texts such as Rudin [33, Corollary 2.12], ultimately depends on the Baire category theorem. Lemma F.3.5 (Closed graph lemma). An algebraically invertible operator on Hilbert space is topologically invertible. That is, if T : V → W is a bijective bounded linear map between Hilbert spaces, then the linear map T −1 : W → V is also bounded. 

Appendix G

Groups and Graphs

The idea of “symmetry” is a fundamental one in mathematics. A figure such as an equilateral triangle is highly symmetrical because there are many geometric operations (two rotations and three reflections) that map the triangle back to itself. The subject of group theory originates when we focus our attention on the symmetries themselves and the effects of composing them, more than we focus on the “symmetrical” object. One of the earliest triumphs of this point of view was Galois’ understanding of the symmetries of polynomial equations and the relationship of these symmetries to the solvability of these equations by a “radical formula” (such as the famous quadratic formula

x=

−b ±



b2 − 4ac 2a

for the solution of the equation ax2 + bx + c = 0). We will use a few ideas from group theory in Chapter 8, and we’ll review them briefly in this appendix. All this material is usually covered in an undergraduate “introductory abstract algebra” course. Standard textbooks for such a course include Gallian [18], Herstein [22], or Rotman [32]; all of these provide more detail on the topics listed below, and of course they develop algebra much further as well. 249

250

G. Groups and Graphs

G.1. Equivalence relations Let X be a set. A relation R on X is a subset of X × X. We think of R as expressing a “relationship” between elements of X that may be true or false, and in keeping with this point of view we write xRy (for x, y ∈ X) in place of (x, y) ∈ R. For example, “=” (equals), “