Differential topology with a view to applications [1 ed.] 9780273002833


English Pages 291 p. ; [308] Year c1976.


Research Notes in Mathematics

D R J Chillingworth

Differential topology with a view to applications

Pitman LONDON SAN FRANCISCO-MELBOURNE

9

Titles in this series

1 Improperly posed boundary value problems, A Carasso and A P Stone
2 Lie algebras generated by finite dimensional ideals, I N Stewart
3 Bifurcation problems in nonlinear elasticity, R W Dickey
4 Partial differential equations in the complex domain, D L Colton
5 Quasilinear hyperbolic systems and waves, A Jeffrey
6 Solution of boundary value problems by the method of integral operators, D L Colton
7 Taylor expansions and catastrophes, T Poston and I N Stewart
8 Function theoretic methods in differential equations, R P Gilbert and R J Weinacht
9 Differential topology with a view to applications, D R J Chillingworth
10 Characteristic classes of foliations, H V Pittie
11 Stochastic integration and generalized martingales, A U Kussmaul
12 Zeta-functions: An introduction to algebraic geometry, A D Thomas
13 Explicit a priori inequalities with applications to boundary value problems, V G Sigillito
14 Nonlinear diffusion, W E Fitzgibbon III and H F Walker
15 Unsolved problems concerning lattice points, J Hammer
16 Edge-colourings of graphs, S Fiorini and R J Wilson
17 Nonlinear analysis and mechanics: Heriot-Watt Symposium Volume I, R J Knops
18 Actions of finite abelian groups, C Kosniowski
19 Closed graph theorems and webbed spaces, M De Wilde


Digitized by the Internet Archive in 2019 with funding from Kahle/Austin Foundation

https://archive.org/details/differentialtopoOOOOchil

D R J Chillingworth University of Southampton


PITMAN PUBLISHING LIMITED
39 Parker Street, London WC2B 5PB

FEARON-PITMAN PUBLISHERS INC.
6 Davis Drive, Belmont, California 94002, USA

Associated Companies
Copp Clark Ltd, Toronto
Pitman Publishing New Zealand Ltd, Wellington
Pitman Publishing Pty Ltd, Melbourne

First published 1976
Reprinted 1977
Reprinted 1978

AMS Subject Classifications: (main) 58A05, 34-02 (subsidiary) 26A57, 34C-, 34D30, 58F-, 70-34

Library of Congress Cataloging in Publication Data
Chillingworth, David.
Differential topology with a view to applications.
(Research notes in mathematics; 9)
Includes bibliographical references.
1. Differential topology. I. Title. II. Series.
QA613.6.C48 514'.7 76-28202
ISBN 0-273-00283-X

© D R J Chillingworth 1976

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording and/or otherwise without the prior written permission of the publishers. The paperback edition of this book may not be lent, resold, hired out or otherwise disposed of by way of trade in any form of binding or cover other than that in which it is published, without the prior consent of the publishers.

Reproduced and printed by photolithography in Great Britain at Biddles of Guildford

Preface

With the increasing use of the language and machinery of differential topology in practical applications, many research workers in the physical, biological and social sciences are eager to learn some of the background to the subject, but are frustrated by the lack of any self-contained treatment that neither is too pure in approach nor is written at too advanced a level. Few people have the time and energy to embark on a systematic study in a field other than their own, and so this book is written with the purpose of making differential topology accessible in one volume as a working tool for applied scientists. It could also serve as a guide for graduate students finding their way around the subject before plunging into a more thorough treatment of one aspect or another. The book should perhaps be read more in the spirit of a novel (with a rather diffuse ending) than as a text-book.

The particular aim is to study the global qualitative behaviour of dynamical systems, although there are numerous byways and diversions along the route. A dynamical system is some system (economic, physical, biological ...) which evolves with time. Given a starting point, the system moves within a universe of possible states according to known or hypothesized laws, often describable locally by a formula for the 'infinitesimal' evolution, namely a differential equation. The global theory is the theory of all possible evolutions from all possible initial states, together with the way these fit together and relate to each other. Qualitative theory is concerned with the existence of constant (equilibrium) behaviour, periodic or recurrent behaviour, and long-term behaviour, together with questions of local and overall stability of the system. Global qualitative techniques, mainly stemming from the work of Henri Poincaré (1854-1912), are important both because precise quantitative theoretical solutions may in general be unobtainable, and because in any case a qualitative model is the basis of a sound mental picture without which mechanical calculation is highly dangerous.

The natural universe of evolution for a dynamical system is often a differentiable manifold; the evolution itself is a flow on the manifold, and a differential equation for infinitesimal evolution becomes a vector field on the manifold.

Chapters 1-3 of the book are concerned with defining and explaining these terms, while Chapter 4 goes into the qualitative theory of flows on manifolds, ending with some discussion of bifurcation theory. There is an Appendix on basic terminology and notation for set theory.

Inevitably there are many topics which should have been included or developed but which would have expanded the volume to twice its size. Differential forms are hardly mentioned, singularity theory is only touched upon, and the fascinating terrain of general bifurcation theory for differential equations, including the Centre Manifold Theorem (one of the few really practical applications of differential topology), is left largely unexplored. I hope the tantalized reader will be able to follow up these topics via the references given.

Formal prerequisites are kept to a minimum. The ideas from topology and linear algebra that are needed are mostly developed from first principles, so that the basic requirements are hardly more than a familiarity with derivatives and partial derivatives in elementary calculus - although these, too, are defined in the text. The exceptions to this are complex numbers, which are assumed to be well-known objects to mathematically-minded scientists, and determinants and eigenvalues of matrices which may be less well-known to some but are everyday equipment for others. My excuse for this logical inconsistency is lack of space and the need to draw a line somewhere: I felt that it was more important to discuss carefully some of the fundamental ideas about linear spaces upon which the rest of the structure is built than to go on to techniques familiar to many people and in any case quite accessible elsewhere.

In the first three Chapters the complex numbers feature mainly in examples and illustrations but in their roles as eigenvalues they become crucial to the main plot in Chapter 4.

As overall references for the qualitative theory of dynamical systems, I suggest the now historic survey article by Smale [125] and the subsequent very readable lecture notes of Markus [73]. The excellent books on differential equations by Arnol'd [11] and Hirsch and Smale [55] are both directed towards the qualitative theory of flows on manifolds. For background on differential topology a recent and attractive text is Guillemin and Pollack [48]: there is also a forthcoming book by Hirsch [52].* The fascinating article on applications to fluid mechanics and relativity by Marsden, Ebin, and Fisher [75] is highly recommended (see also the introduction to differential topology by Stamm in the same volume).

* Now appeared: highly recommended.

The present book grew from a series of lectures given to a mixed audience of pure and applied mathematicians, engineers, physicists and economists at Southampton University in 1973/74. It is through the encouragement of several of these colleagues that I have expanded the lecture notes into book form, and I am grateful to them and others for helpful comments and criticisms. I am particularly indebted to Peter Stefan of the University College of North Wales, Bangor who carefully read the original notes and offered many detailed suggestions for improvement. Despite all this assistance, I claim the credit for errors. I would also like to thank Professor Umberto Mosco and Professor Nicolaas Kuiper for hospitality at the Istituto Matematico dell'Università di Roma and the I.H.E.S., Bures-sur-Yvette, respectively, during visits to which I wrote up much of the notes. I am grateful also to Pitman Publishing for their interest in the book and patience during its production, to my wife Ann for tolerating the side-effects, and especially to Cheryl Saint and Jenny Medley for spending many long hours producing such a perfect typescript. Finally, my special thanks go to Les Lander for taking upon himself the task of drawing all the figures in the book, and obtaining such professional results in a short space of time.

David Chillingworth
Southampton, August 1976.

I am grateful to Mike Irwin, Mark Mostow and my father H.R. Chillingworth for kindly pointing out a number of misprints and errors which, fortunately, I have been able to correct for the second printing of the book. Also I would like to take this opportunity to draw attention to the Appendix, which explains some set-theoretic terminology and notation that is used without comment in the text.

D.R.J.C.
May 1977.

Contents

1. Basic topological ideas
1.1 The concept of a function
1.2 Continuity
1.3 Continuity from a more general viewpoint
1.4 Further topological concepts
1.5 Homeomorphism of spaces and equivalence of maps
1.6 Compactness
1.7 Connectedness
Remarks on the literature

2. Calculus
2.1 Differentiation
2.2 Linear spaces and linear maps
2.3 Normed linear spaces
2.4 Differentiation (continued)
2.5 Properties and uses of the derivative
2.6 Higher orders of differentiation
2.7 Germs and jets
2.8 Local structure of differentiable maps I: Non-singular behaviour
2.9 Local structure of differentiable maps II: Singularities
Remarks on the literature

3. Differentiable manifolds and maps
3.1 The concept of a differentiable manifold
3.2 Remarks, comments and more examples of differentiable manifolds
3.3 The structure of differentiable maps between manifolds
3.4 Tangent bundles and tangent maps
3.5 Vector fields and differential equations
Remarks on the literature

4. Qualitative theory of dynamical systems
4.1 Flows and diffeomorphisms
4.2 Local behaviour near fixed points and periodic orbits
4.3 Some global behaviour
4.4 Generic properties of flows and diffeomorphisms
4.5 Global stability
4.6 Dynamical systems under constraint
4.7 Breakdown of stability: bifurcation theory
Remarks on the literature

APPENDIX: Terminology and notation for sets and functions

REFERENCES

Index

1 Basic topological ideas

1.1 THE CONCEPT OF A FUNCTION

We usually visualize a real-valued function of a real variable in terms of its graph. For a given number $x$ (i.e. $x \in \mathbf{R}$) the function $f$ provides another real number $f(x)$. The graph consists of all points in a plane coordinatized by $(x,y)$ which satisfy $y = f(x)$, or in formal notation

$$\operatorname{graph}(f) = \{(x,y) \in \mathbf{R} \times \mathbf{R} \mid y = f(x)\}.$$

Another 'picture' of the function, though less useful than the above, is the following:

[Figure 2: points $x$, $x'$ in one copy of $\mathbf{R}$ sent to $f(x)$, $f(x')$ in another copy of $\mathbf{R}$]

For a real function $f$ of two real variables $(x_1,x_2)$ the graph could be thought of as a landscape, with $(x_1,x_2)$ as coordinates in a 'horizontal' plane and $f(x_1,x_2)$ being measured 'vertically'. Formally, we have $(x_1,x_2) \in \mathbf{R} \times \mathbf{R} = \mathbf{R}^2$ and

$$\operatorname{graph}(f) = \{((x_1,x_2), y) \in \mathbf{R}^2 \times \mathbf{R} = \mathbf{R}^3 \mid y = f(x_1,x_2)\}.$$

Alternatively, we can picture $f$ by a 'source and target' picture:

[Figure 3: the point $(x_1,x_2)$ in $\mathbf{R}^2$ sent by $f$ to $f(x_1,x_2)$ in $\mathbf{R}$]

Now suppose we have two functions $f_1$, $f_2$ of two variables $(x_1,x_2) \in \mathbf{R} \times \mathbf{R} = \mathbf{R}^2$. We can consider them both at the same time by writing

$$f(x_1,x_2) = (f_1(x_1,x_2), f_2(x_1,x_2))$$

so that $f$ is a function from $\mathbf{R}^2$ to $\mathbf{R}^2$. The 'graph' picture in this case is harder to visualize, since by analogy with the previous definition of graph we have

$$\operatorname{graph}(f) = \{((x_1,x_2),(y_1,y_2)) \in \mathbf{R}^2 \times \mathbf{R}^2 \mid y_i = f_i(x_1,x_2),\ i = 1, 2\}$$

and so the graph is a subset of $\mathbf{R}^2 \times \mathbf{R}^2 = \mathbf{R}^4$.

In a similar way, if we are given $k$ functions of $n$ variables we can put them all together to obtain a corresponding function from $\mathbf{R}^n$ to $\mathbf{R}^k$. We write $x = (x_1,x_2,\ldots,x_n)$, let

$$f(x) = (f_1(x), f_2(x), \ldots, f_k(x)),$$

and keep in mind the 'source and target' picture (Figure 5). In formal notation this would be written as

$$f : \mathbf{R}^n \to \mathbf{R}^k.$$

The graph of $f$ would be a subset of $\mathbf{R}^n \times \mathbf{R}^k = \mathbf{R}^{n+k}$, difficult to visualize and in general not yielding as much intuitive information about the behaviour of $f$ as we might extract from Figure 5 by considering the way $f$ causes $\mathbf{R}^n$ to be folded up and twisted inside $\mathbf{R}^k$.

The techniques for analyzing such 'folds and twists' are those of topology and calculus together, or differential topology. Thus we see already how a study of differential topology may give useful insight into the ways in which $k$ functions of $n$ variables can mutually interact both in general circumstances and in particular cases. Our aim in the first two chapters will be to develop some of these basic techniques.
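As a rough illustration of putting $k$ functions of $n$ variables together into one map, here is a small Python sketch. The names `combine`, `f1`, `f2` and `graph_point` are hypothetical, and the two component functions are arbitrary choices made for the example, not taken from the text.

```python
from typing import Callable, Sequence, Tuple

Point = Tuple[float, ...]


def combine(components: Sequence[Callable[[Point], float]]) -> Callable[[Point], Point]:
    """Bundle k real-valued functions of n variables into one map R^n -> R^k."""
    def f(x: Point) -> Point:
        return tuple(f_i(x) for f_i in components)
    return f


# Two illustrative (assumed) functions of two variables.
def f1(x: Point) -> float:
    return x[0] + x[1]


def f2(x: Point) -> float:
    return x[0] * x[1]


f = combine([f1, f2])  # f : R^2 -> R^2


def graph_point(x: Point) -> Point:
    """The point (x, f(x)) of graph(f), viewed inside R^2 x R^2 = R^4."""
    return x + f(x)
```

Each point of the graph carries $n + k$ coordinates, which is why the graph lives in $\mathbf{R}^{n+k}$ and quickly becomes hard to draw.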

Remarks

1. It is frequently necessary to consider functions which are not defined for all values of the variables $(x_1,x_2,\ldots,x_n)$ but only for $x = (x_1,x_2,\ldots,x_n)$ belonging to some subset $U$, say, of $\mathbf{R}^n$. In this case we of course write

$$f : U \to \mathbf{R}^k.$$

2. The word function is usually reserved for real-valued functions only, i.e. those of the form

$$f : (\text{something}) \to \mathbf{R}.$$

In other cases (e.g. for $f : U \to \mathbf{R}^k$, $k > 1$) we tend to use the term map or mapping.

1.2 CONTINUITY

Roughly speaking, a map is continuous if by making a small perturbation in the input (i.e. the independent variables) you obtain only a small change in the output (i.e. the dependent variables). However, this is obviously far too vague a definition. For example, we would wish to think of the function $g : \mathbf{R} \to \mathbf{R}$ defined by $g(x) = 10^{23}x$ as being continuous (indeed, its graph is a straight line), but it could be argued that small changes in $x$ produce very large changes in $g(x)$. The formal definition of continuity for functions $f : \mathbf{R} \to \mathbf{R}$ is as follows:

(a) Continuity at a point $x_0$. The function $f$ is continuous at $x_0$ if, given any positive number $\varepsilon$ (thought of as admissible margin of error in the output), it is then possible to find another positive number $\delta$ such that perturbing $x_0$ by less than $\delta$ causes $f(x_0)$ to vary by less than $\varepsilon$. Symbolically,

$$|x_0 - x| < \delta \quad \text{implies} \quad |f(x_0) - f(x)| < \varepsilon.$$

(b) Continuity. If we are considering $f$ defined on some subset $U$ of $\mathbf{R}$, then we say that $f$ is continuous on $U$ if it is continuous at each point $x_0$ belonging to $U$. If $U$ is the whole of $\mathbf{R}$ or is in any case understood from the context then we simply say that $f$ is continuous.

Note that the function $g$ in the above example is continuous, since for any $x_0$ and any $\varepsilon$ we may take $\delta = 10^{-23}\varepsilon$.
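The choice $\delta = 10^{-23}\varepsilon$ for $g(x) = 10^{23}x$ can be checked on particular numbers. The sketch below is only an illustration under stated assumptions: it uses exact rational arithmetic (`fractions.Fraction`) to avoid floating-point rounding, and the helper names `delta_for` and `respects_margin` are hypothetical. Testing finitely many points does not prove continuity, it only exercises the definition.

```python
from fractions import Fraction

SCALE = Fraction(10) ** 23  # the constant 10^23 from the example


def g(x: Fraction) -> Fraction:
    """The linear function g(x) = 10^23 * x."""
    return SCALE * x


def delta_for(eps: Fraction) -> Fraction:
    """The delta proposed in the text: delta = 10^-23 * eps."""
    return eps / SCALE


def respects_margin(x0: Fraction, x: Fraction, eps: Fraction) -> bool:
    """Check, for one pair (x0, x), that |x0 - x| < delta implies |g(x0) - g(x)| < eps."""
    if abs(x0 - x) >= delta_for(eps):
        return True  # hypothesis not satisfied, so the implication holds vacuously
    return abs(g(x0) - g(x)) < eps


# One sample perturbation of size delta / 2 around x0 = 7.
eps = Fraction(1, 1000)
x0 = Fraction(7)
x = x0 + delta_for(eps) / 2
```

Here the same $\delta$ works for every $x_0$, which is special to this example; in general $\delta$ is allowed to depend on $x_0$ as well as on $\varepsilon$.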

Remarks

1. It is tempting to try to combine (a) and (b) by saying (hopefully) that $f$ is continuous on $U$ if, given any $\varepsilon > 0$, there exists a $\delta > 0$ such that

$$|x - y| < \delta \quad \text{implies} \quad |f(x) - f(y)| < \varepsilon$$

for all $x$ and $y$ in $U$. This certainly implies (b), but it does not mean the same thing. The point is that for the genuine definition of continuity we must allow $\delta$ to depend on $x_0$ (as well as on $\varepsilon$). If we do not do this, but demand that the same $\delta$ should apply everywhere (given $\varepsilon$), then we have uniform continuity - a notion which is in fact of considerable importance in contexts involving approximating functions by other functions as for example in certain techniques of numerical analysis.

2. All 'standard' functions such as polynomials, exponentials, $\sin x$ and so on can easily be proved to be continuous where they are defined. The only hazard to continuity in combining them ad libitum is the risk of dividing by a function which vanishes somewhere.

It is easy to see how the definition of continuity will go over to functions $f : \mathbf{R}^n \to \mathbf{R}^k$, since all that is necessary is to replace the modulus by the euclidean distance from the origin in $\mathbf{R}^n$ or $\mathbf{R}^k$ (see Example 3 below). However, we shall need to study the 'small change in input gives small change in output' problem in situations where the input and output may be rather more complicated than numbers or n-tuples of real numbers; for example, they may be differential equations, or perhaps collections of functions. Therefore we want to generalize the definition of continuity so that it applies to maps $f : A \to B$ where $A$ and $B$ are sets other than $\mathbf{R}^n$ or subsets of $\mathbf{R}^n$. To do this we need notions of distance in both sets $A$, $B$; then we could simply say

(a) $f : A \to B$ is continuous at $x_0 \in A$ if, given $\varepsilon > 0$, there exists $\delta > 0$ such that if the distance from $x$ to $x_0$ (in $A$) is less than $\delta$ then the distance from $f(x)$ to $f(x_0)$ (in $B$) is less than $\varepsilon$;

(b) $f : A \to B$ is continuous if it is continuous at every point in $A$.

Now it turns out that the minimal properties that a 'distance function' $d$ on a set $S$ needs to have in order to reflect adequately the basic relationships of distance in euclidean space are these:

(1) $d(s,s')$ is always $\ge 0$, and $= 0$ when and only when $s = s'$;
(2) $d(s,s') = d(s',s)$ for all $s$ and $s'$ in $S$;
(3) $d(s,s'') \le d(s,s') + d(s',s'')$ for all $s$, $s'$ and $s''$ in $S$.

(This last property is known as the triangle inequality.) A distance function satisfying (1), (2) and (3) is called a metric.

DEFINITION A function $d$ which satisfies (1), (2) and (3) above is called a metric on $S$. A set $S$ together with a particular metric on it is called a metric space.

Note that $d$ is actually a function from $S \times S$ to $\mathbf{R}$, and not a function on $S$ itself. The image of $d$ is contained in the set $\mathbf{R}_0^+$ of non-negative real numbers (by (1)), and so we could write $d : S \times S \to \mathbf{R}_0^+$.
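The three properties can be spot-checked numerically on a finite sample of points. The sketch below assumes the euclidean distance on $\mathbf{R}^n$ as the candidate $d$; the function names `euclidean` and `looks_like_metric` and the tolerance `tol` are hypothetical choices. Passing such a check is only evidence, not a proof, since a metric must satisfy the properties for all points of $S$.

```python
import itertools
import math
from typing import Callable, Sequence, Tuple

Point = Tuple[float, ...]


def euclidean(x: Point, y: Point) -> float:
    """Euclidean distance between two points of R^n."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def looks_like_metric(d: Callable[[Point, Point], float],
                      points: Sequence[Point],
                      tol: float = 1e-12) -> bool:
    """Test properties (1), (2), (3) on every pair and triple from a finite sample."""
    for s, t in itertools.product(points, repeat=2):
        if d(s, t) < 0:
            return False                      # (1) non-negativity
        if (d(s, t) == 0) != (s == t):
            return False                      # (1) zero exactly when s = t
        if abs(d(s, t) - d(t, s)) > tol:
            return False                      # (2) symmetry
    for s, t, u in itertools.product(points, repeat=3):
        if d(s, u) > d(s, t) + d(t, u) + tol:
            return False                      # (3) triangle inequality
    return True


def squared(x: Point, y: Point) -> float:
    """Squared euclidean distance: symmetric and positive, but not a metric."""
    return euclidean(x, y) ** 2
```

The squared distance is included as a contrast: on the collinear points 0, 1, 2 it gives 4 for the long side and 1 + 1 for the two short ones, so the triangle inequality fails.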

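EXAMPLES of metric spaces

1. $S = \mathbf{R}$, $d(x,y) = |x - y|$.

2. $S = \mathbf{R}^2$, $d(x,y) = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2}$.

3. $S = \mathbf{R}^n$, $d(x,y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$.

Examples 1 and 2 are special cases of Example 3, known as the euclidean metric or usual metric in $\mathbf{R}^n$.

4. $S = \{\text{bounded functions } f : [a,b] \to \mathbf{R}\}$, $d(f,g) = \sup_{a \le x \le b} |f(x) - g(x)|$.

5. $S = \{\text{differentiable functions } f : (a,b) \to \mathbf{R} \text{ with bounded derivative}\}$, $d(f,g) = \sup$ [...]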

[...] given any point $x$ in $W$ there exists $\delta > 0$ such that the ball $B_\delta(x)$ is contained in $W$. This leads to a general definition:

DEFINITION Let $S$ be a metric space. A subset $W$ of $S$ is an open set if, given any $x$ in $W$, there exists some $\delta > 0$ with $B_\delta(x)$ contained in $W$.

For example, the interval $\{x \mid 2 < x < 3\}$ is an open set in the real line $\mathbf{R}$, whereas the interval $\{x \mid 2 < x \le 3\}$ is not (because $x = 3$ fails to lie in any $\delta$-ball contained in the interval). Observe that the line segment $\{(x,y) \mid 2 < x < 3,\ y = 0\}$ is not an open set in $\mathbf{R}^2$, since every $\delta$-ball will now have to contain points with $y \ne 0$. On the other hand $\{(x,y) \mid 2$ [...]

[...] Suppose $f$ is continuous. Let $V$ be an open set in $B$; we must show that $f^{-1}(V)$ is open in $A$. Choose any $x \in f^{-1}(V)$. Then $f(x) \in V$, and so since $V$ is open we can find $\varepsilon > 0$ with $B_\varepsilon(f(x)) \subset V$. Now the continuity of $f$ guarantees the existence of some $\delta > 0$ with $f(B_\delta(x)) \subset B_\varepsilon(f(x)) \subset V$, so $B_\delta(x) \subset f^{-1}(V)$. This argument applies to each $x \in f^{-1}(V)$, so we have shown that $f^{-1}(V)$ is open.

Conversely, let us assume the property about open sets and deduce the continuity of $f$. Choose a point $x \in A$. For any $\varepsilon > 0$ the $\varepsilon$-ball $B_\varepsilon(f(x))$ is an open set in $B$ (an easy exercise, again using the triangle inequality), and so by hypothesis $f^{-1}(B_\varepsilon(f(x)))$ is open in $A$. This means that since $x \in f^{-1}(B_\varepsilon(f(x)))$ there is some $\delta > 0$ with $B_\delta(x) \subset f^{-1}(B_\varepsilon(f(x)))$ or, in other words,

$$f(B_\delta(x)) \subset B_\varepsilon(f(x)).$$

This applies for each $\varepsilon > 0$, and so proves the continuity of $f$ at $x$. Since $x$ was an arbitrary point of $A$ we have shown that $f$ is continuous.

In view of this theorem it is clear that in the study of continuity of maps between metric spaces it is the family of open sets in each space which is important, rather than the actual metric. More precisely, if two different metrics give rise to the same family of open sets then any map which is continuous using one metric will automatically be continuous using the other. The family of open sets of a metric space is called its topology.

All that is needed, therefore, in order to define the concept of continuity of maps from any set $A$ to another set $B$ (whether or not they are given as metric spaces) is the notion of a topology for each of $A$ and $B$. Naively, we could select any family $F$ of subsets of $A$ and postulate that $F$ shall be the topology for $A$. Not surprisingly this does not lead very far, unless we insist that the family $F$ obey a few simple rules which the family of open sets in a metric space does obey. The rules which turn out to be crucial are these:

(1) If $U$ and $V$ belong to $F$ then so does $U \cap V$.
(2) If each $U_\lambda$ belongs to $F$, for some family $\{U_\lambda\}_{\lambda \in \Lambda}$, then so does $\bigcup_{\lambda \in \Lambda} U_\lambda$.

For convenience we also add a formal clause:

(3) Both the whole set $A$ and the empty set $\emptyset$ belong to $F$.

Note that clause (1) implies that the intersection of any finite number of sets in $F$ also belongs to $F$, but it does not guarantee anything about the intersection of an infinite number of them. On the other hand, clause (2) asserts that the union of an arbitrary number of sets in $F$ must still belong to $F$.

It is straightforward to verify (making good use of the triangle inequality) that the family of open sets in a metric space does satisfy (1), (2) and (formally) (3). This means that we are now able to define what we mean by a topology on an arbitrary set, whether or not we are given a metric.

DEFINITION Let $S$ be a set. A family $F$ of subsets satisfying the rules (1), (2) and (3) above is called a topology for $S$. The set $S$ together with its topology is a topological space.

By analogy with the metric space situation, the sets in the topology of $S$ are referred to as open sets.
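For a finite set the three rules can be checked mechanically. The following Python sketch assumes the family is given as a list of subsets; the function name `is_topology` is hypothetical. It relies on the fact that for a finite family, closure under pairwise unions already gives closure under arbitrary unions, which is not true for infinite families.

```python
from itertools import combinations
from typing import FrozenSet, Hashable, Iterable


def is_topology(points: Iterable[Hashable], family: Iterable[Iterable[Hashable]]) -> bool:
    """Check rules (1), (2), (3) for a family of subsets of a finite set."""
    whole: FrozenSet = frozenset(points)
    fam = {frozenset(u) for u in family}
    # Rule (3): the whole set and the empty set must belong to the family.
    if whole not in fam or frozenset() not in fam:
        return False
    for u, v in combinations(fam, 2):
        if u & v not in fam:
            return False  # rule (1): intersections of two members
        if u | v not in fam:
            return False  # rule (2): unions (pairwise suffices when fam is finite)
    return True
```

For instance, on the set {1, 2, 3} the nested family consisting of the empty set, {1}, {1, 2} and {1, 2, 3} passes, while the family consisting of the empty set, {1}, {2} and {1, 2, 3} fails rule (2) because {1, 2} is missing.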

EXAMPLES of topological spaces

1. Obviously, every metric space is a topological space - since the properties of open sets in a metric space inspired the general definition of topology. The topology arising from the usual metric on $\mathbf{R}^n$ is called the usual topology.

2. If $S$ is any set then the family of all subsets of $S$ naturally satisfies (1), (2) and (3) and so is itself a topology. It is called the discrete topology. It is the topology obtained by giving $S$ the discrete metric (§1.2, Example 8), as follows easily from the observation that each point by itself forms an open set with respect to the discrete metric because $B_\delta(x)$ consists of the point $x$ alone when $\delta < 1$.

3. At the other extreme, we may take the family consisting of no open sets except those legally required by clause (3), namely $S$ itself and the empty set $\emptyset$. This is sometimes called the indiscrete topology.

4. Suppose $S$ is a set with a given topology $F$, and $T$ is some arbitrary subset of $S$ (not necessarily belonging to $F$). Then $T$ can be given a topology $F_T$ as follows: $V$ belongs to $F_T$ precisely when $V$ is of the form $T \cap U$ where $U$ is some set belonging to $F$. The verification of (1), (2), (3) for $F_T$ is easy. This topology is called the induced topology or subspace topology on $T$. If $F$ comes from a metric on $S$ then $F_T$ in fact comes from the induced metric (§1.2, Example 7) on $T$.

5. If $S$, $T$ are two topological spaces there is a natural way of constructing a topology on the cartesian product $S \times T$ from those of $S$ and $T$. It is tempting to define a set in $S \times T$ to be open if it is of the form $U \times V$ where $U$, $V$ are open in $S$, $T$ respectively, but unfortunately this fails to satisfy rule (2) in general. (For example if $S = T = \mathbf{R}$ with the usual topology then sets of the form $U \times V$ can be thought of as rectangles, and a union of rectangles may not be a rectangle.) Instead we define a set in $S \times T$ to be open if it is the union of sets of the form $U \times V$ where $U$, $V$ are open in $S$, $T$ respectively. It is straightforward to verify that this does satisfy the rules for a topology on $S \times T$. It is called the product topology on $S \times T$. It is an instructive exercise to check that the product topology on $\mathbf{R}^n \times \mathbf{R}^m$ ($\mathbf{R}^n$, $\mathbf{R}^m$ with usual topology) is the same as the usual topology on $\mathbf{R}^{n+m} = \mathbf{R}^n \times \mathbf{R}^m$.

6. Let $S$ be a topological space, and suppose that $S$ is expressed as the disjoint union of a (possibly infinite) family of sets $S_\lambda$. For example, in certain types of application the elements of $S$ may be some objects which we are attempting to classify and each $S_\lambda$ may represent a collection of objects having a certain property $P_\lambda$ in common. If we now broaden our focus and decide to regard two objects as 'the same' if they belong to the same $S_\lambda$, we are led to consider instead of $S$ a new set $\tilde{S}$, namely the set whose elements are themselves the $S_\lambda$. This set $\tilde{S}$ inherits a topology from that of $S$ in the following way. Given any set $\tilde{W} \subset \tilde{S}$ let $W$ denote the subset of $S$ consisting of the union of all those $S_\lambda$ which belong to $\tilde{W}$. Then define $\tilde{W}$ to be open in $\tilde{S}$ if $W$ is open in $S$. See Figure 7.

[Figure 7]

Again, it is routine to verify that this defines a topology on $\tilde{S}$.

There is an equivalent but neater way to express this construction. Let $S$ be a topological space and let $R$ be some equivalence relation defined on $S$. The equivalence classes of $R$ give a decomposition of $S$ into disjoint subsets $S_\lambda$ as above, and conversely such a decomposition defines an equivalence relation on $S$ - namely, that of belonging to the same $S_\lambda$. Denote the set of equivalence classes by $S/R$: this corresponds to $\tilde{S}$. We have a natural map

$$\pi : S \to S/R$$

taking each $x \in S$ to its equivalence class. Define a set $U \subset S/R$ to be open if $\pi^{-1}(U)$ is open in $S$. Then this is the same definition as above, and so it equips $S/R$ $(= \tilde{S})$ with a topology. This topology on $S/R$ is called the quotient topology.

If the topology on $S$ comes from a metric it will not necessarily follow that the quotient topology comes from a metric on $S/R$. In general, there is no such thing as a 'quotient metric'. This is one more reason for studying topologies rather than metrics.
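On a finite space the quotient construction can be carried out by brute force. The sketch below is an illustration under assumptions: the topology is a list of open subsets, the decomposition is a list of disjoint blocks playing the role of the $S_\lambda$, and the names `subfamilies` and `quotient_topology` are hypothetical. A family of blocks is declared open exactly when the union of its members is open in $S$.

```python
from itertools import combinations


def subfamilies(blocks):
    """Yield every subfamily of the given list of blocks as a frozenset of blocks."""
    blocks = list(blocks)
    for r in range(len(blocks) + 1):
        for combo in combinations(blocks, r):
            yield frozenset(combo)


def quotient_topology(topology, classes):
    """Open sets of the quotient: families of classes whose union is open in S."""
    opens = {frozenset(u) for u in topology}
    blocks = [frozenset(c) for c in classes]
    quotient_opens = set()
    for family in subfamilies(blocks):
        union = frozenset().union(*family)  # the subset W of S determined by the family
        if union in opens:
            quotient_opens.add(family)
    return quotient_opens


# Assumed example: S = {1, 2, 3, 4}, decomposed into blocks {1, 2}, {3}, {4}.
S_topology = [set(), {1, 2}, {3, 4}, {1, 2, 3, 4}]
A, B, C = frozenset({1, 2}), frozenset({3}), frozenset({4})
q = quotient_topology(S_topology, [A, B, C])
```

In this example the single class {3} is not open in the quotient because {3} is not open in $S$, whereas the pair of classes {3}, {4} together is open because their union {3, 4} is.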

Returning now to the motivating idea behind the generalization of metric space to topological space, we will formulate a definition of continuity for maps between topological spaces. In view of the above theorem and the subsequent discussion there is only one plausible definition that coincides with our previous definition for metric spaces.

DEFINITION Let $f : A \to B$ be a map between two topological spaces (topologies understood). Then $f$ is continuous if $f^{-1}(V)$ is open in $A$ whenever $V$ is open in $B$.

Again we emphasize that there is nothing here to say that if $U$ is open in $A$ then $f(U)$ should be open in $B$.
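For maps between finite topological spaces the definition can be tested directly by computing preimages. In the Python sketch below a map is represented as a dictionary and a topology as a list of open sets; the names `preimage`, `image` and `is_continuous`, and the particular two-point spaces, are assumptions made for illustration.

```python
def preimage(f: dict, subset) -> frozenset:
    """All points of the domain that f sends into the given subset."""
    return frozenset(a for a, b in f.items() if b in subset)


def image(f: dict, subset) -> frozenset:
    """The set of values f takes on the given subset of the domain."""
    return frozenset(f[a] for a in subset)


def is_continuous(f: dict, open_domain, open_target) -> bool:
    """f is continuous if the preimage of every open set of the target is open."""
    open_domain = {frozenset(u) for u in open_domain}
    return all(preimage(f, v) in open_domain for v in open_target)


# A = {0, 1} with the discrete topology, B = {'p', 'q'} with the indiscrete topology.
open_A = [set(), {0}, {1}, {0, 1}]
open_B = [set(), {'p', 'q'}]

const = {0: 'p', 1: 'p'}   # constant map A -> B
h = {'p': 0, 'q': 1}       # bijection B -> A
```

The constant map is continuous, yet it sends the open set {0} to {'p'}, which is not open in B; this is the point of the remark that images of open sets need not be open. The bijection `h` in the other direction is not continuous, since the preimage of the open set {0} is {'p'}.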

The main advantage of this topological definition of continuity is that it avoids reference to any specific metric, whether or not there is one available. It is often useful to work with topological spaces (such as function spaces) which can be given metrics but only in rather artificial ways that distract attention from the more natural topological structures. Indeed, there are contexts in which it is necessary to deal with topological spaces having no metric structure at all which is compatible with the topology.

The second advantage is the usefulness and economy of the definition for handling theoretical statements about continuous maps. For example, consider the following proof of the fact that the composition of two continuous maps is continuous. The proof for metric spaces (using $\delta$, $\varepsilon$) is of course straightforward, but not as tidy.

THEOREM If $f : A \to B$ and $g : B \to C$ are continuous then so is the composition $g \circ f : A \to C$.

Proof Choose any open set $V$ in $C$. Then $g^{-1}(V)$ is open in $B$ by the continuity of $g$, so $f^{-1}(g^{-1}(V))$ is open in $A$ by the continuity of $f$. But $f^{-1}(g^{-1}(V)) = (g \circ f)^{-1}(V)$, and this finishes the proof.

1.4 FURTHER TOPOLOGICAL CONCEPTS

(i) An open set $U$ containing the point $x$ is called an open neighbourhood of $x$. For technical reasons it can be useful to think in slightly more general terms, and to call any set $N$ simply a neighbourhood of the point $x$ if it contains an open neighbourhood of $x$. Thus for example the interval $\{x \in \mathbf{R} \mid 1 \le x \le 3\}$ (though not open) is a neighbourhood of the point 2 in $\mathbf{R}$, but it is not a neighbourhood of either of the points 1 or 3 in $\mathbf{R}$.

There is a characterization of open sets that is convenient to use in practice when checking whether a given set is open or not. The proof is straightforward from the definition of topology (page 14) with clause (2) playing an important part.

Characterization of open sets
A set $U$ is open if and only if for each point $x \in U$ there is a neighbourhood $N_x$ of $x$ with $N_x$ contained in $U$.

We see from this that we can use 'neighbourhood' in topological spaces in contexts where we would use $B_\delta$ in metric spaces. For example, we can define a map $f : A \to B$ to be continuous at a point $x_0 \in A$ (a concept which we have only used so far in metric spaces) if given any neighbourhood $N$ of the point $y_0 = f(x_0) \in B$ there is a neighbourhood $M$ of $x_0$ with $f(M) \subset N$. It is then not hard to check that a map $f : A \to B$ between two topological spaces is continuous (as defined earlier) if and only if it is continuous at $x_0$ (as defined above) for every $x_0 \in A$. This now brings us back satisfyingly in a full circle to the original definition of continuity in a metric space (page 9) with which we began.

(ii) Given any set $W$ in a topological space $S$ we can consider the union of all the subsets of $W$ which are open sets in $S$ (so open in $W$ with the induced topology - but the converse may not hold). This is open (by clause (2)) and so can be thought of as the largest open set contained in $W$. It is called the interior of $W$, and in a sense measures how 'thick' $W$ is as a subset of $S$. Thus a line or a plane has empty interior in $\mathbf{R}^3$. The interior of the closed interval $[a,b]$ in $\mathbf{R}$ is the open interval $(a,b)$, but both have empty interior in the plane $\mathbf{R}^2$. The set $Q$ of rational numbers and its complement $I$, the set of irrational numbers, each have empty interior in $\mathbf{R}$.

(iii) A subset of a topological space is called a closed set if its complement is an open set. Generally speaking, most sets are neither open nor closed; it is also possible for a set to be both open and closed simultaneously. In a discrete space, for example, all sets are both open and closed. Thus a closed set is not simply one which fails to be open. The reason for introducing this terminology is that topological arguments frequently go through much more rapidly when expressed in terms of closed sets than they would do if they dealt with open sets directly. Observe that by taking complements the useful characterization of open sets given above is transformed automatically into the following:

Characterization of closed sets A set $C$ is closed if and only if for each point $y$ not in $C$ there is a neighbourhood of $y$ entirely disjoint from $C$.

(iv) Let $W$ be any subset of the topological space $S$, and define a point $x$ (which may or may not belong to $W$) to be an accumulation point or limit point of $W$ if each neighbourhood of $x$ contains at least one point of $W$, apart from (possibly) the point $x$ itself. Thus there are points of $W$ 'arbitrarily close' to $x$ but different from $x$. It is easy to see that the characterization of closed sets above can now be converted into the statement that a set is closed if and only if it contains all its limit points. Given an arbitrary set $W$, not necessarily closed or open, we can manufacture a set $\overline{W}$ consisting of $W$ together with all its limit points. This set is called the closure of $W$. Obviously $W \subset \overline{W}$, and the set $C$ is closed if and only if $C = \overline{C}$.

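The following facts are not difficult to verify:

(1) For any set $W$ the closure $\overline{W}$ is a closed set, i.e. $\overline{\overline{W}} = \overline{W}$.

(2) $\overline{W}$ is the intersection of all closed sets that contain $W$.

(3) $\overline{W_1 \cup W_2} = \overline{W_1} \cup \overline{W_2}$.

However, note that $\overline{W_1 \cap W_2} = \overline{W_1} \cap \overline{W_2}$ does not necessarily hold. For example, take $W_1 = \{x \in \mathbf{R} \mid x < a\}$ and $W_2 = \{x \in \mathbf{R} \mid x > a\}$. Then $W_1 \cap W_2 = \emptyset$ but $\overline{W_1} \cap \overline{W_2} = \{a\}$.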
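The interior and closure constructions above can be carried out mechanically when the space is finite. The following Python sketch is an illustration only: the four-point space, its topology and the names `interior` and `closure` are hypothetical choices, not taken from the text. The closure is computed as the complement of the interior of the complement, which agrees with fact (2).

```python
# Hypothetical four-point space with a topology chosen purely for illustration.
SPACE = frozenset({1, 2, 3, 4})
OPEN_SETS = [frozenset(s) for s in ((), (1,), (1, 2), (3, 4), (1, 3, 4), (1, 2, 3, 4))]


def interior(W, open_sets):
    """Union of all open sets contained in W: the largest open set inside W."""
    W = frozenset(W)
    return frozenset().union(*(U for U in open_sets if U <= W))


def closure(W, space, open_sets):
    """Smallest closed set containing W, computed via complements."""
    space = frozenset(space)
    return space - interior(space - frozenset(W), open_sets)


W1, W2 = {1}, {2}
cl1 = closure(W1, SPACE, OPEN_SETS)
cl2 = closure(W2, SPACE, OPEN_SETS)

# Fact (3): the closure of a union is the union of the closures.
print(closure(W1 | W2, SPACE, OPEN_SETS) == cl1 | cl2)

# The analogous statement for intersections can fail, as in the text.
print(closure(W1 & W2, SPACE, OPEN_SETS), cl1 & cl2)
```

In this small space the two sets are disjoint, so the closure of their intersection is empty, while the intersection of their closures is not; this mirrors the example with the two open half-lines.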

If $A$ is a subset of $B$, and yet occupies so much of $B$ that $\overline{A}$ is equal to or contains $B$, then we say that $A$ is dense in $B$. Other equivalent ways of saying this are that every point of $B$ either is in $A$ or is an accumulation point of $A$, or that every non-empty open subset of $B$ contains points of $A$, or that the complement of $A$ in $B$ has empty interior. For example, the set of non-integer numbers is dense (and open) in $\mathbf{R}$. The set of rational numbers is dense (and not open) in $\mathbf{R}$. The concept of denseness is very important in mathematical characterizations of 'usual' as opposed to 'unusual' properties of maps, dynamical systems, and so on: we shall make considerable use of it. See §3.3 (Sard's theorem) and also §4.4.

(v)

The familiar ideas of approaching a limit and convergence can easily be fitted into the topological framework constructed so far. In a metric space context (e.g. in euclidean space) we say

$f(x)$ approaches $c$ as $x$ approaches $x_0$

if, given any $\varepsilon > 0$, there then exists some $\delta > 0$ with the property that $d(f(x), c) < \varepsilon$ for all $x \neq x_0$ with $d(x, x_0) < \delta$. In topological terms this becomes

$f(x)$ approaches $c$ as $x$ approaches $x_0$

if, given any neighbourhood $N$ of $c$ there then exists some neighbourhood $M$ of $x_0$ with the property that $f(x) \in N$ for all $x \in M$ with $x \neq x_0$. We write

$$f(x) \to c \quad \text{as} \quad x \to x_0$$

or

$$\lim_{x \to x_0} f(x) = c$$

and call $c$ the limit of $f(x)$ as $x \to x_0$.

The wording of these definitions sounds familiar, and in fact we note that it is an echo of the definition of 'continuity at a point' given earlier in this section. What, then, is the exact relationship between these ideas? Proceeding carefully from the definitions alone we soon see that to say

$$f(x) \to c \quad \text{as} \quad x \to x_0$$

is the same as to say that the map $g$ defined by

$$g(x) = f(x) \ \text{ when } x \neq x_0, \qquad g(x_0) = c$$

is continuous at $x_0$. Therefore in making the definition of 'approaching a limit' we have actually introduced nothing new.

A standard variant on the above occurs when we are dealing with a sequence of points $x_1, x_2, x_3, \ldots$ . In this case we say

$$x_n \to c \quad \text{as} \quad n \to \infty$$

or

$$\lim_{n \to \infty} x_n = c$$

to mean that given any neighbourhood $N$ of $c$ there then exists some positive integer $m$ such that $x_n \in N$ for all $n > m$. For metric spaces (in particular euclidean space) this coincides with the usual idea of a sequence approaching a limit: given any $\varepsilon > 0$ there exists some positive integer $m$ such that $d(x_n, c)$ is less than $\varepsilon$ for all $n > m$. Here, and in the general topological setting, we say that the sequence is convergent, or converges to the limit $c$.

In the special case of real or complex-valued functions or sequences the definitions can be stretched a little to include the notion of 'approaching infinity'. We replace the neighbourhood $N$ of $c$ by a set of the form $\{y \in \mathbf{R} \mid y > r\}$ for some $r$ when defining $f(x) \to +\infty$ (or $x_n \to +\infty$) in the real-valued case, or by a set of the form $\{y \in \mathbf{R} \mid y < s\}$ for some $s$ when defining $f(x) \to -\infty$ (or $x_n \to -\infty$) also in the real case, or, finally, by a set of the form $\{z \in \mathbf{C} \mid |z| > r\}$ to define $f(x) \to \infty$ (or $x_n \to \infty$) in the complex-valued case.

Remarks

1. The definitions for sequences can actually be subsumed under those for functions. The way to do this is to consider the sequence $x_1, x_2, x_3, \ldots$ of points in the space $S$ as the images of the integers $1, 2, 3, \ldots$ under the map $\mathbf{N} \to S$ taking 1 to $x_1$, 2 to $x_2$, and so on. Adjoining a further point $p$ (thought of as 'infinity') and putting a suitable topology on $\mathbf{N} \cup \{p\}$ it is possible to re-express '$x_n \to c$ as $n \to \infty$' as the statement that the image of $n$ under this map approaches $c$ as $n \to p$. We will not go into details, since they are rather uninstructive.

2. There is an irritating source of ambiguity which inevitably arises at this point. We have the notion of limit of a sequence and limit point of a set, but when we think of a sequence as a set of points these two notions may not coincide. For example, the sequence

$$-\tfrac{1}{2},\ +\tfrac{3}{4},\ -\tfrac{5}{6},\ +\tfrac{7}{8},\ \ldots \qquad \text{i.e.} \quad x_n = (-1)^n \, \frac{2n-1}{2n}$$

has no limit, although the points $\pm 1$ are both limit points of the set $\{x_n \mid n \in \mathbf{N}\}$. On the other hand, the sequence $1, 1, 1, 1, \ldots$ has the limit 1 but as a set it has no limit points since it consists of only one point. Therefore care must be taken to use these terms with precision when sequences are being treated from a topological point of view.

3. When $S$ is a topological space which also carries some notion of addition (for example a normed linear space: see §2.3), then besides the concept of convergent sequence we also have that of convergent series. A series means simply a sequence of elements of $S$ with '+' signs between them:

$$x_1 + x_2 + x_3 + \ldots$$

We say the series is convergent or otherwise according to whether the sequence $s_1, s_2, s_3, \ldots$ is convergent or otherwise, where the $s_n$ are the so-called partial sums

$$s_n = \sum_{i=1}^{n} x_i .$$

We write $\sum_{i=1}^{\infty} x_i$ to denote $\lim_{n \to \infty} s_n$. We emphasize, however, that this only makes sense to the extent that '+' makes sense in $S$.

4. Any metric space $S$ (such as $\mathbf{R}^n$) has the elementary property that given distinct points $x, y \in S$ there exist neighbourhoods $U, V$ of $x, y$ respectively such that $U$ and $V$ are disjoint. This may seem obvious, but must be verified from the definitions. In fact we may take $U = B_{d/2}(x)$ and $V = B_{d/2}(y)$ where $d = d(x,y)$: then the non-intersection of $U, V$ follows from the triangle inequality. Unfortunately topological spaces failing to have this 'separation' property can arise from quite natural considerations (e.g. in quotient spaces: see page 16, and §4.1). Spaces which do have the property are called Hausdorff spaces (F. Hausdorff, 1868-1942). Thus all metric spaces are Hausdorff.

1.5 HOMEOMORPHISM OF SPACES AND EQUIVALENCE OF MAPS

Let $S, T$ be two topological spaces, and suppose $f : S \to T$ is a bijection*. If $f$ is continuous, and at the same time its inverse $f^{-1} : T \to S$ is continuous, then $f$ is called a homeomorphism. From the definition of continuity, this means that the open sets in $S$ correspond precisely to the open sets in $T$, under $f$. Thus if there exists a homeomorphism $f : S \to T$ then as far as their topological structure is concerned $S$ and $T$ are indistinguishable. We say that $S$ and $T$ are topologically equivalent or, more usually, homeomorphic. It follows that any property possessed by $S$ and expressible purely in terms of the topology of $S$ (such as the property of being Hausdorff, for example) will also be possessed by $T$, and vice versa.

* see page 271.

In many useful contexts the fact that a bijection $f$ is continuous guarantees that its inverse $f^{-1}$ is continuous: see the following section for a particular theorem on these lines. This means that it is then not necessary to study $f^{-1}$ explicitly in order to show that $f$ is a homeomorphism, which is very useful since $f^{-1}$ may in practice be difficult to analyze. Unfortunately this result does not hold in all generality. It may fail, for example, in certain function spaces. Artificial counterexamples are also easily manufactured. Let $S = \mathbf{R}$ with the discrete topology and $T = \mathbf{R}$ with the usual topology, and let $f : S \to T$ be any bijection: then $f$ is continuous (every map with domain $S$ is continuous!) but $f^{-1}$ is not since there exist open sets $U$ in $S$ with $f(U)$ not open in $T$.

Homeomorphism is the natural notion of equivalence between two topological spaces. We next consider how to define a notion of equivalence between two maps $f, g : S \to T$ where the two topological spaces $S, T$ are given. Thinking informally of a homeomorphism from a space to itself as the result of viewing the space through distorting spectacles, we can regard two maps $f, g : S \to T$ as being equivalent if $g$ looks like $f$ when $S$ and $T$ are both viewed through distorting spectacles - but with separate lenses. Formally this is expressed by saying that $f, g$ are topologically equivalent or conjugate if there exist homeomorphisms

$$h : S \to S, \qquad k : T \to T$$

such that

$$g \cdot h = k \cdot f$$

i.e. the result of applying $g$ to $S$, after distortion by $h$, is the same as the result of applying $f$ to $S$ and then distorting the result by $k$. Note that this is the same as

$$g = k \cdot f \cdot h^{-1},$$

and can be expressed simply by saying that the following diagram

$$\begin{array}{ccc} S & \xrightarrow{\ f\ } & T \\ {\scriptstyle h}\downarrow & & \downarrow{\scriptstyle k} \\ S & \xrightarrow{\ g\ } & T \end{array}$$

commutes, i.e. either pathway from the top left hand corner to the bottom right hand corner gives the same result.

Observe that since $h, k$ have inverses the relation of topological equivalence is an equivalence relation on the set of continuous maps $S \to T$. In cases when $S = T$ we often insist that $h$ should be the same as $k$, since we wish to avoid the idea of distorting $S$ in two different ways simultaneously. The expression for topological equivalence then becomes

$$g = h \cdot f \cdot h^{-1} .$$

We shall see more of this later, when studying the qualitative theory of dynamical systems (§§4.2, 4.5).

1.6 COMPACTNESS

There is an extremely useful theorem in analysis as follows:

THEOREM A If $f$ is a continuous real-valued function on a closed interval $[a,b]$ then $f$ is bounded, i.e. there exists a constant $k$ such that $|f(x)| \leq k$ for all $x$ in $[a,b]$.

Note that the assumption that $[a,b]$ is a closed interval is vital. For example, $\tan x$ is defined and continuous on the non-closed interval $[0, \frac{\pi}{2})$ but is certainly not bounded.

Here are two standard proofs of the theorem.

Proof 1 Suppose $f$ is not bounded: then we deduce a contradiction. There must exist $x_1 \in [a,b]$ with $|f(x_1)| > 1$ (otherwise $f$ is bounded by 1). Then there exists $x_2 \neq x_1$ with $|f(x_2)| > 2$ (otherwise $f$ is bounded by the greater of 2 and $|f(x_1)|$). Continuing, we obtain distinct points $x_1, x_2, x_3, \ldots$ with $|f(x_n)| > n$ for each positive integer $n$. Now we invoke the 'classical' Weierstrass limit point theorem (or condensation theorem):

W If $A$ is a subset of a closed interval and $A$ contains an infinite number of points then there is at least one accumulation point of $A$ somewhere in the interval.

This immediately implies that there is in $[a,b]$ an accumulation point $x_*$ of $A = \{x_1, x_2, x_3, \ldots\}$. But by the continuity of $f$ at $x_*$ there is a $\delta_* > 0$ such that $|f(x)| < |f(x_*)| + 1$ for all $x$ in $B_{\delta_*}(x_*)$. Choose now an integer $m > |f(x_*)| + 1$ and, shrinking $\delta_*$ if necessary, arrange that $B_{\delta_*}(x_*)$ avoids $x_1, x_2, \ldots, x_{m-1}$ (apart possibly from $x_*$ itself). Since $x_*$ is an accumulation point of $A$, this ball still contains some $x_n \neq x_*$, necessarily with $n \geq m$, and then $|f(x_n)| > n \geq m > |f(x_*)| + 1$, which gives the contradiction. (Note: we do not claim that $x_n \to x_*$.)

Proof 2 Choose $\varepsilon > 0$. By the continuity of $f$ we know that for each $x$ in $[a,b]$ there exists $\delta_x > 0$ such that $|f(y)|$ is less than $|f(x)| + \varepsilon$ provided $y$ is within $\delta_x$ of $x$. The whole of $[a,b]$ is covered by the open intervals $(x - \delta_x, x + \delta_x)$ as $x$ runs through $[a,b]$, or equivalently

$$[a,b] = \bigcup_{x \in [a,b]} (x - \delta_x, x + \delta_x)'$$

where the $'$ denotes that part of $(x - \delta_x, x + \delta_x)$ lying in $[a,b]$. (Note in passing that $(x - \delta_x, x + \delta_x)'$ is an open subset of $[a,b]$ in the induced topology of $[a,b]$ as a subspace of $\mathbf{R}$.) This time we use the Heine-Borel theorem:

HB Suppose $[a,b]$ is a closed interval in $\mathbf{R}$ (with $a, b \neq \pm\infty$) and there is given a family $\{J_\lambda\}$ of open intervals covering $[a,b]$, i.e. $[a,b] \subset \bigcup_\lambda J_\lambda$. Then there is a finite sub-family $J_{\lambda_1}, J_{\lambda_2}, \ldots, J_{\lambda_m}$ which also covers $[a,b]$, i.e. $[a,b] \subset J_{\lambda_1} \cup J_{\lambda_2} \cup \ldots \cup J_{\lambda_m}$.

Applying this to the family $\{J_x\}$ where $J_x = (x - \delta_x, x + \delta_x)$ we deduce that there exist $x_1, x_2, \ldots, x_m$ in $[a,b]$ with

$$[a,b] \subset J_{x_1} \cup J_{x_2} \cup \ldots \cup J_{x_m} .$$

Then clearly $|f(x)|$ is everywhere less than

$$\max\{|f(x_1)|, |f(x_2)|, \ldots, |f(x_m)|\} + \varepsilon$$

which shows that $f$ is bounded.

It is straightforward to follow through the above two proofs and check that they work for functions defined on any topological space $S$ which satisfies an analogue of W or HB respectively where the closed interval $[a,b]$ is replaced by $S$ and open interval is replaced by open set of $S$. In fact topological spaces that satisfy this analogue of the Heine-Borel theorem play such a fundamental role in analysis that they are given an explicit name: compact spaces.

DEFINITION A topological space $S$ is compact if and only if every cover of $S$ by open sets has a finite subcover.

Sometimes it is convenient to talk about a compact subset $T$ of a space $S$. This simply means that $T$ is compact when regarded as a topological space in its own right (with the induced topology).

In the present more general context it is easy to deduce W for compact spaces as a consequence of HB. The argument goes as follows. Suppose $A$ is a subset of the compact topological space $S$ and $A$ has an infinite number of points. We wish to prove $A$ has an accumulation point. If not, then every point of $S$ has an open neighbourhood containing at most one point of $A$. But then the compactness of $S$ implies that a finite number of these open neighbourhoods will suffice to cover $S$, which means that $A$ can have only a finite number of points. This contradiction shows that $A$ must after all have an accumulation point.

Returning to the starting-point of this section, by mimicking either Proof 1 (using the above generalized version of W) or Proof 2 directly we can construct a proof of a more general version of the theorem on boundedness of continuous functions.

THEOREM A' If $S$ is a compact topological space and $f : S \to \mathbf{R}$ is continuous then $f$ is bounded, i.e. there exists a constant $k$ such that $|f(x)| \leq k$ for all $x \in S$.

Remarks

1. Unfortunately it is not possible to deduce compactness from the Weierstrass property in the case of an otherwise arbitrary topological space. Thus the generalized versions of W and HB are not entirely equivalent. In the case of metric spaces, however, they can be shown to be equivalent. In particular they are equivalent in their original context, i.e. for closed intervals $[a,b]$ of the real line.

2. Theorem A' has an immediate corollary which illustrates one of the main ways in which the theorem might be used in practice, namely that if $S$ is compact and $f : S \to \mathbf{R}$ is a continuous non-vanishing function then $f$ is bounded away from zero, i.e. there exists $k > 0$ such that $|f(x)| \geq k$ for all $x \in S$. The proof of the corollary comes from applying Theorem A' to $1/f$. Clearly this need not be true if $S$ is not compact: take $S = (0,1)$ and $f(x) = x$, for example. In applications $f$ may typically be a distance function from points of $S$ to some other point or set.

3. There is an alternative characterization of compactness as follows. A topological space $S$ is compact if and only if, given any family $\{C_\lambda\}$ of closed subsets such that any finite number of the $C_\lambda$ have non-empty intersection, then the intersection of them all is non-empty. At first sight this seems unrelated to the former definition, but in fact it is easy to translate one version into the other simply by considering the open complements $U_\lambda$ of the closed sets $C_\lambda$, and vice versa. Thus for example the statement $S = \bigcup_\lambda U_\lambda$ becomes $\emptyset = \bigcap_\lambda C_\lambda$, and so on. We leave this verification as an exercise. This characterization is frequently used to prove the existence of a point defined by some kind of infinite intersection process. As an illustration, observe that if $I_n$ is the closed interval $[0, \frac{1}{n}]$ then the intersection of any finite number of $I_n$'s is non-empty and in fact

$$\bigcap_{n=1}^{\infty} I_n = \{0\}$$

which is non-empty. On the other hand if $J_n = (0, \frac{1}{n}]$ then again the intersection of any finite number of $J_n$'s is non-empty but now

$$\bigcap_{n=1}^{\infty} J_n = \emptyset .$$

Here $J_1 = (0,1]$ is not compact (see below).
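The two intersections in the illustration of Remark 3 can be checked directly. The following short derivation is a sketch of that verification, written with the same $I_n$ and $J_n$; the indices $n_1, \ldots, n_k$ and $N$ are introduced here only for the purpose of the argument.

```latex
% Finite intersections: the intervals are nested, so for indices n_1, ..., n_k
% with N = \max\{n_1, \ldots, n_k\} we have
\[
  I_{n_1} \cap \cdots \cap I_{n_k} = I_N = \bigl[0, \tfrac{1}{N}\bigr] \neq \emptyset,
  \qquad
  J_{n_1} \cap \cdots \cap J_{n_k} = J_N = \bigl(0, \tfrac{1}{N}\bigr] \neq \emptyset .
\]
% Infinite intersections: 0 lies in every I_n, while any x > 0 is excluded
% as soon as n > 1/x, because then 1/n < x. Hence
\[
  \bigcap_{n=1}^{\infty} I_n = \{0\},
  \qquad
  \bigcap_{n=1}^{\infty} J_n = \emptyset ,
\]
% the second because 0 itself belongs to no J_n.
```

The only difference between the two families is the single point $0$, which is exactly the limit point that $(0,1]$ fails to contain.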

We will now give a working theorem whose geometrical usefulness justifies the rather technical definition of compactness. First we state three facts about compact sets, each of which is straightforward to verify from the definitions:

(1) Every closed subset of a compact space is compact.

(2) If $f : S \to T$ is continuous and $K \subset S$ is a compact subset, then $f(K)$ is a compact subset of $T$.

(3) In a Hausdorff space every compact set is closed.

Putting these together we obtain

THEOREM B Let $S$ and $T$ be topological spaces with $S$ compact and $T$ Hausdorff. Suppose $f : S \to T$ is a continuous map which is also a bijection. Then $f$ is automatically a homeomorphism (i.e. $f^{-1}$ is automatically continuous).

Proof To prove $f^{-1}$ continuous we have to show $(f^{-1})^{-1}(V)$ is open whenever $V$ is open, i.e. $V$ open implies $f(V)$ open. This is the same as showing $C$ closed implies $f(C)$ closed (since $f$ is a bijection). But $C$ closed implies $C$ compact (Fact (1)) which implies $f(C)$ compact (Fact (2)) which in turn implies $f(C)$ closed (Fact (3)).

Thus for example to prove that the space of real numbers modulo the integers (i.e. the quotient space $\mathbf{R}/\mathcal{R}$ where $x \mathcal{R} y$ means $x - y \in \mathbf{Z}$) is formally homeomorphic to a circle it suffices to construct a continuous bijection $f$ from $\mathbf{R}/\mathcal{R}$ to the unit circle $S^1$ in the complex plane, since $\mathbf{R}/\mathcal{R}$ is compact (being the image of $[0,1]$ under the projection $\mathbf{R} \to \mathbf{R}/\mathcal{R}$) and $S^1$ is Hausdorff. It is easy to verify that

$$f : (\text{equivalence class of } \theta) \mapsto e^{2\pi i \theta}$$

is a bijection, and is continuous by definition of the quotient topology in $\mathbf{R}/\mathcal{R}$ and the continuity of the map $\mathbf{R} \to S^1 : \theta \mapsto e^{2\pi i \theta}$. We have no need to bother with the technicalities of working with $f^{-1}$.
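The claim that the map to the unit circle is well defined on equivalence classes can be checked numerically. The sketch below is illustrative only: the function names `circle_point` and `representative` are hypothetical, and floating-point arithmetic means equality is tested up to a small tolerance.

```python
import cmath
import math


def circle_point(theta):
    """Image of the class of theta under theta -> exp(2*pi*i*theta)."""
    return cmath.exp(2j * math.pi * theta)


def representative(theta):
    """Representative of the class of theta lying in the interval [0, 1)."""
    return theta - math.floor(theta)


# Two reals differing by an integer land on the same point of the circle,
# so the map depends only on the equivalence class.
print(abs(circle_point(0.3) - circle_point(3.3)) < 1e-9)

# Every image has modulus 1, i.e. lies on the unit circle.
print(abs(abs(circle_point(0.7)) - 1.0) < 1e-12)
```

Because every class has a representative in $[0,1)$, the interval $[0,1]$ already maps onto the whole quotient, which is the observation used in the text to obtain compactness.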

Recognizing compact sets. In all the above theory we have not discussed how to recognize a compact set when we see one. After all, it is clearly impossible to check the properties of every open cover. Fortunately, in euclidean space $\mathbf{R}^n$, the most useful case, there is an immediate test:

THEOREM C A subset $K$ of $\mathbf{R}^n$ is compact precisely when it is both (a) closed and (b) bounded (i.e. there exists some number $k$ with $K \subset B_k(0)$).

The compactness of $K$ immediately implies (a) (Fact (3)) and (b) (cover $K$ by $K \cap B_k(0)$, $k = 1, 2, 3, \ldots$). However, the converse is more delicate, being a generalized Heine-Borel theorem. We will not give the proof here. Note that once we have Theorem C then the original Theorem A follows directly from Fact (2), since to say that $f$ is bounded on $[a,b]$ is the same as saying $f([a,b])$ is a bounded subset of $\mathbf{R}$.

1.7 CONNECTEDNESS

A topological space is connected if, roughly speaking, it is all in one piece. Of course, any space with more than one point in it can be written as the union of two disjoint non-empty subsets, but it turns out that the definition which best captures our intuitive notion of being 'in one piece' is the following:

DEFINITION A topological space $S$ is connected if it cannot be split into two disjoint non-empty subsets which are both open, i.e. if $S = U \cup V$ where $U, V$ are open and disjoint then one of them must be empty.

Thus in a connected space the only subsets which are both open and closed simultaneously are the empty set and the whole space. It follows that any space with the discrete topology cannot be connected if it has more than one point. A subset $Q$ of $\mathbf{R}^n$, when given the induced topology, will fail to be connected precisely when it can be written as $Q = Q_1 \cup Q_2$ where $Q_1, Q_2$ are non-empty, disjoint, and $Q_i = Q \cap U_i$ for some open set $U_i$ of $\mathbf{R}^n$ - in other words when each $Q_i$ can be included in an open set $U_i$ which does not meet the other $Q_j$. This formalizes the notion of being 'separate pieces'. For example, the union $Q$ of the two closed intervals $Q_1 = [0,1]$ and $Q_2 = [2,3]$ is not connected (as a subset of $\mathbf{R}$ with the induced topology), since we have $Q_1 = Q \cap (-2,2)$ and $Q_2 = Q \cap (1,4)$.

This still leaves the problem of deciding when a given topological space is connected, since it is impossible to check all possible attempts at splitting it into disjoint open subsets. The two most useful results in this direction are:

(1) If $S$ is connected and $f : S \to T$ is continuous then $f(S)$ (with the topology induced from $T$) is connected;

(2) every open, half-open or closed interval (possibly infinite) of the real line is connected.

Sketch proofs

(1) If $f(S) = U \cup V$ (both open and disjoint in $f(S)$) then $S = f^{-1}(U) \cup f^{-1}(V)$, both open in $S$ by the continuity of $f$ and clearly disjoint. Since $S$ is connected one of $f^{-1}(U)$, $f^{-1}(V)$ is empty, and hence so is one of $U, V$.

(2) If an interval $I$ is not connected we have $I = U \cup V$, both open in $I$, disjoint and non-empty. Choose points $a$ in $U$, $b$ in $V$, suppose $a < b$, and let $m$ = least upper bound of the set $X$ of all numbers $x$ in $I$ for which $[a,x) \cap I$ is contained in $U$. Contemplation of $m$ shows $m$ cannot lie in $U$ (since it would not be an upper bound for $X$) or in $V$ (since it would not be the least upper bound for $X$). Contradiction, so $I$ is connected.

As with compactness, we extend the definition to subsets of a topological space by regarding them as spaces in their own right with the induced topology.

Apart from its direct geometrical meaning, connectedness plays an important indirect role in the proofs of many theorems. Suppose we wish to prove that a certain property $P$ holds for all $x$ in some topological space $S$. A frequently useful approach is to prove first of all that the set $T$ of points $x$ for which $P$ does hold is an open subset of $S$, and then to prove by other means that the set $F$ of points for which $P$ fails is also an open subset of $S$. If $S$ is connected, then the existence of just one point $x$ for which $P$ holds means that $T$ is non-empty, so $F$ must be empty and therefore the validity of $P$ is guaranteed for all $x$ in $S$.

Remarks on the literature. There are many good introductory texts on metric and topological spaces, such as Mendelson [80], Patterson [94], Pitts [99]. Particularly recommended also are Sutherland [134], which is written in much the same spirit as this chapter but with all details provided, and Simmons [ll|, which contains much additional material relevant to the notes as a whole. See Willard [150] for a deeper study of topological spaces, or Hocking and Young [57] for more geometric aspects.

2 Calculus

2.1 DIFFERENTIATION

Recall from elementary calculus the definition of differentiability at a point $x$ for a real-valued function of one real variable.

DEFINITION The function $f : \mathbf{R} \to \mathbf{R}$ is differentiable at $x$ if

$$\lim_{h \to 0} \frac{f(x+h) - f(x)}{h}$$

exists ($\pm\infty$ not admitted).

This limit is called the derivative of $f$ at $x$, and is traditionally denoted by $f'(x)$ or $\frac{df}{dx}$. Strictly speaking we should write $\frac{df}{dx}(x)$ to emphasize the point $x$ for which the limit has been evaluated. Another way of expressing the existence of the limit is to say: there exists a real number $\lambda$ such that

$$f(x+h) - f(x) = \lambda h + \varepsilon |h|$$

where $\varepsilon \to 0$ as $h \to 0$. In general, $\lambda$ of course depends on $x$. The pictorial interpretation is that the graphs of the two functions $F, L$ of $h$ defined by

$$F(h) = f(x+h) - f(x), \qquad L(h) = \lambda h$$

are tangent at $h = 0$: see Figure 8.

It is reasonable to describe this by saying that $L$ is a linear approximation to $f$ at $x$, or, equivalently, that the map

$$h \mapsto f(x) + \lambda h$$

(whose graph is a straight line) is a first order approximation to the map

$$h \mapsto f(x+h) .$$

Since this idea turns out to be so useful in the study of real functions of a real variable, it is worthwhile trying to generalize it as much as possible. Obviously we would get nowhere by considering functions (maps) between arbitrary topological spaces, since the minimum necessary ingredients are:

1. concepts of addition and subtraction,

2. the idea of a map being linear,

3. a possibility of taking limits.

Now the first two of these are available in the world of linear spaces or vector spaces, while the third is most conveniently associated with metric spaces (§1.2). Therefore to generalize our simple notion of differentiation we need to develop a theory of linear spaces which are at the same time metric spaces. This will occupy us for the next two sections of the chapter.
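The first order approximation described in this section can be tested numerically. The sketch below is an illustration under stated assumptions: the function names `remainder_ratio` and `cube` and the sample point $x = 2$ are hypothetical choices, and the quantity computed is the $\varepsilon$ of the relation $f(x+h) - f(x) = \lambda h + \varepsilon|h|$.

```python
def remainder_ratio(f, x, lam, h):
    """Solve f(x+h) - f(x) = lam*h + eps*|h| for eps, given a non-zero h."""
    return (f(x + h) - f(x) - lam * h) / abs(h)


def cube(t):
    """Sample function f(t) = t^3, whose derivative at t = 2 is 12."""
    return t ** 3


# With the correct slope lam = 12 the ratio eps shrinks as h shrinks.
for h in (1e-1, 1e-2, 1e-3):
    print(h, remainder_ratio(cube, 2.0, 12.0, h))

# With a wrong slope (lam = 11) eps does not tend to zero: it approaches 1.
for h in (1e-1, 1e-2, 1e-3):
    print(h, remainder_ratio(cube, 2.0, 11.0, h))
```

Only one value of $\lambda$ makes $\varepsilon$ tend to zero, which is the numerical counterpart of the statement that the derivative, when it exists, is unique.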

2.2 LINEAR SPACES AND LINEAR MAPS

A real linear space or vector space is a set $V$ together with two operations called 'addition' and 'multiplication by scalars (i.e. real numbers)' which obey a number of reasonable rules. The standard rules for addition are these:

(i) $x + (y+z) = (x+y) + z$ for all $x, y, z$ in $V$

(ii) $x + y = y + x$ for all $x, y$ in $V$

(iii) there is a zero in $V$, i.e. an element $0$ which satisfies $0 + x = x + 0 = x$ for every $x$ in $V$

(iv) every $x$ in $V$ has a negative, i.e. an element $-x$ which satisfies $x + (-x) = 0$,

followed by rules involving scalar multiplication as well as addition:

(v) $\alpha(x+y) = \alpha x + \alpha y$ for all $x, y$ in $V$ and all scalars $\alpha$

(vi) $(\alpha+\beta)x = \alpha x + \beta x$ for all $x$ in $V$ and all scalars $\alpha, \beta$

(vii) $\alpha(\beta x) = (\alpha\beta)x$ for all $x$ in $V$ and all scalars $\alpha, \beta$

(viii) $1x = x$ for all $x$ in $V$.

Remarks

1. It can be deduced from these rules that other obvious-looking arithmetical statements also hold, such as $0x = 0$ or $(-1)x = -x$. Also, it is usual to write $x - y$ instead of $x + (-y)$.

2. The same definition could equally well be made with complex numbers rather than real numbers as scalars, giving a complex linear space.

3. When thinking geometrically we call $0$ the origin of $V$.

4. Nothing has been said about any topology for $V$.
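The two deductions mentioned in Remark 1 can be written out from rules (i)-(viii) as follows. This is one possible derivation, offered as a sketch; the text itself does not specify which route is intended.

```latex
% 0x = 0: start from rule (vi) with both scalars equal to 0.
\begin{align*}
  0x &= (0 + 0)x = 0x + 0x            && \text{by (vi)} \\
  0  &= 0x + (-(0x))                   && \text{by (iv)} \\
     &= (0x + 0x) + (-(0x))            && \text{substituting the first line} \\
     &= 0x + \bigl(0x + (-(0x))\bigr)  && \text{by (i)} \\
     &= 0x + 0 = 0x                    && \text{by (iv), (iii)}
\end{align*}
% (-1)x = -x: use (viii), (vi) and the result just proved.
\begin{align*}
  (-1)x &= (-1)x + 0 = (-1)x + \bigl(x + (-x)\bigr)  && \text{by (iii), (iv)} \\
        &= \bigl((-1)x + 1x\bigr) + (-x)              && \text{by (i), (viii)} \\
        &= \bigl((-1) + 1\bigr)x + (-x)               && \text{by (vi)} \\
        &= 0x + (-x) = 0 + (-x) = -x                  && \text{by the first result and (iii)}
\end{align*}
```

Note that the symbol $0$ denotes the scalar zero in $0x$ and the zero element of $V$ elsewhere.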

EXAMPLES of linear spaces

1. $V = \mathbf{R}$; usual addition and multiplication.

2. $V = \mathbf{R}^n$; co-ordinate-wise addition and scalar multiplication.

3. $V = \{\text{all functions } \mathbf{R} \to \mathbf{R}\}$; addition and scalar multiplication defined pointwise, i.e. $f + g$ means the function $x \mapsto f(x) + g(x)$, and $\alpha f$ means the function $x \mapsto \alpha f(x)$.

[...]

... on the left hand side denotes addition in $V$, whereas on the right hand side it denotes addition in $W$. These may be completely different in the way they are defined. A similar remark applies to the scalar multiplications on both sides. Observe that a linear map always takes zero (in $V$) to zero (in $W$).

EXAMPLES of linear maps

1. $V$ = any linear space; $L : V \to V$ defined by $L(v) = \alpha v$ for some fixed scalar $\alpha$.

2. $V = \mathbf{R}^n$, $W = \mathbf{R}^m$; $L : V \to W$ defined by

$$L(x_1, x_2, \ldots, x_n) = \Bigl( \sum_{j=1}^{n} \alpha_{1j} x_j,\ \sum_{j=1}^{n} \alpha_{2j} x_j,\ \ldots,\ \sum_{j=1}^{n} \alpha_{mj} x_j \Bigr)$$

for a collection of $mn$ fixed scalars $\alpha_{ij}$ ($1 \leq i \leq m$, $1 \leq j \leq n$). Here $L$ is traditionally represented by the matrix

$$\begin{pmatrix} \alpha_{11} & \alpha_{12} & \cdots & \alpha_{1n} \\ \alpha_{21} & \alpha_{22} & \cdots & \alpha_{2n} \\ \vdots & \vdots & & \vdots \\ \alpha_{m1} & \alpha_{m2} & \cdots & \alpha_{mn} \end{pmatrix} .$$

3. $V = S$ in Example 7(b) above, $W = V$ in Example 3 above; $L : V \to W$ defined by $L(f) = f'$.

4. Linear maps from $V$ to $W$ can be added together and multiplied by scalars pointwise to give further linear maps. Indeed, it is easy to check that the set $\mathrm{Lin}(V,W)$ of linear maps $V \to W$ itself forms a linear space, in fact a linear subspace of the linear space of all maps $V \to W$.

This construction is so important in what follows that we will isolate it for convenient reference:

If $V$ and $W$ are linear spaces, then the set of all linear maps $L : V \to W$ forms a linear space, denoted by $\mathrm{Lin}(V,W)$.
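Example 2 and the pointwise operations on linear maps can be made concrete with a small computation. The following Python sketch is illustrative only: the function names `apply`, `add_maps` and `scale_map` and the sample matrices are hypothetical, and linear maps $\mathbf{R}^n \to \mathbf{R}^m$ are assumed to be stored as lists of matrix rows.

```python
def apply(matrix, x):
    """Apply the linear map represented by `matrix` (a list of rows) to the vector x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in matrix]


def add_maps(A, B):
    """Matrix of the pointwise sum of two linear maps R^n -> R^m."""
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


def scale_map(alpha, A):
    """Matrix of the pointwise scalar multiple alpha * A."""
    return [[alpha * a for a in row] for row in A]


# Two sample maps R^3 -> R^2 and two sample vectors (arbitrary choices).
A = [[1, 2, 0], [0, 1, -1]]
B = [[0, 1, 1], [2, 0, 3]]
x = [1, 2, 3]
y = [4, 0, -1]

# Linearity of A: A(x + y) equals A(x) + A(y).
x_plus_y = [xi + yi for xi, yi in zip(x, y)]
print(apply(A, x_plus_y), [p + q for p, q in zip(apply(A, x), apply(A, y))])

# The pointwise sum (A + B)(x) equals A(x) + B(x), so A + B is again a matrix map.
print(apply(add_maps(A, B), x), [p + q for p, q in zip(apply(A, x), apply(B, x))])
```

In this representation $\mathrm{Lin}(\mathbf{R}^n, \mathbf{R}^m)$ is simply the set of $m \times n$ matrices with entrywise addition and scalar multiplication.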

Kernel and image

For a linear map $L : V \to W$ let the kernel of $L$ denote the set of all elements $v$ in $V$ which are taken to the zero element in $W$. Linearity of $L$ implies that the kernel $\ker(L)$ is a linear subspace of $V$, and that $L$ is injective precisely when $\ker(L) = \{0\}$. Thus the 'size' of $\ker(L)$ (for example, its dimension: see below) measures the failure of injectivity of $L$. The image $\mathrm{im}(L)$ is a linear subspace of $W$, and by definition $L$ is surjective precisely when $\mathrm{im}(L)$ is all of $W$.

If $L : V \to W$ is a linear map and is at the same time a bijection then it is a simple exercise to check that its inverse $L^{-1} : W \to V$ must also be a linear map. (Compare this with the analogous situation for continuous maps between topological spaces; recall §1.5 and Theorem B of §1.6.) A bijective linear map is called a linear isomorphism, or just isomorphism if the linear context is understood. If $L : V \to W$ is an isomorphism the linear structure of $V$ corresponds precisely to that of $W$, via $L$. Thus if there exists an isomorphism between two given linear spaces they are indistinguishable as linear spaces: they are said to be isomorphic.

EXAMPLE Suppose $S, T$ are two linear subspaces of $V$ such that $S + T = V$ and $S \cap T = \{0\}$. Then it turns out that every $x$ in $V$ can be written in a unique way as $x = s + t$ where $s \in S$, $t \in T$. We can regard $s, t$ as 'co-ordinates' for $x$, and it is routine to check that $V$ is isomorphic to the cartesian product $S \times T$. We write $V = S \oplus T$, the direct sum of $S$ and $T$.
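A direct sum decomposition can be computed explicitly in a simple case. The sketch below is an assumption-laden illustration: it takes $V = \mathbf{R}^2$, $S$ the line spanned by $(1,0)$ and $T$ the line spanned by $(1,1)$, none of which appear in the text, and the function name `decompose` is hypothetical.

```python
def decompose(x):
    """Split x = (a, b) in R^2 as s + t with s in S = span{(1, 0)}, t in T = span{(1, 1)}.

    The second co-ordinate of x can only come from t, which forces t = (b, b);
    then s = x - t = (a - b, 0). This is why the decomposition is unique.
    """
    a, b = x
    s = (a - b, 0)
    t = (b, b)
    return s, t


s, t = decompose((5, 2))
print(s, t)

# Adding the two 'co-ordinates' recovers the original vector.
print((s[0] + t[0], s[1] + t[1]))
```

The pair $(s, t)$ plays the role of an element of the cartesian product $S \times T$, which is the isomorphism mentioned in the example.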

Dimension

In euclidean space geometrical problems can be converted into algebraic problems by taking co-ordinates, sometimes already given and sometimes constructed artificially with the solution of a particular problem in mind. This is such a useful technique that it is worth exploiting as much as possible. It leads to the concepts of basis and dimension in general linear space theory.

A linear space $V$ is said to be finite-dimensional if it contains a finite collection $\{e_1, e_2, \ldots, e_m\}$ of elements with the property that each $x$ in $V$ can be written in the form

$$x = \alpha_1 e_1 + \alpha_2 e_2 + \ldots + \alpha_m e_m$$

for some unique choice of scalars $\alpha_1, \alpha_2, \ldots, \alpha_m$ (which of course depends on $x$). The collection $\{e_1, e_2, \ldots, e_m\}$ is called a basis for $V$, and the number $m$ is called the dimension of $V$. It can be shown to be independent of the actual basis considered.

A linear space which is not finite-dimensional is said to be infinite-dimensional. Note that this says nothing about the existence of any 'infinite basis' for $V$, whatever that may mean.

EXAMPLES

1. $V = R^n$ has a basis $\{e_1, \ldots, e_n\}$ where $e_i$ is the vector $(0, \ldots, 1, \ldots, 0)$ with $1$ in the $i$th place and $0$ elsewhere. This is called the standard basis. Hence the dimension of $R^n$ is $n$ (as we would hope).

2. Let $V$ be the linear space consisting of all maps $A \to W$, where $A$ is an arbitrary set, $W$ is some given linear space, and addition and scalar multiplication in $V$ are defined pointwise. Then $V$ is finite-dimensional precisely when $W$ is finite-dimensional and $A$ is a finite set. For example, when $W = R$ and $A = \{1, 2, \ldots, n\}$ an element of $V$ can be thought of as an n-tuple $(x_1, \ldots, x_n)$ with each $x_i$ in $R$, and $V$ is isomorphic to $R^n$. If $A = N = \{1, 2, 3, \ldots\}$ then an element of $V$ is an infinite sequence $(x_1, x_2, x_3, \ldots)$ of elements of $R$, and $V$ is infinite-dimensional.

3. Any linear space isomorphic to a finite-dimensional space is itself finite-dimensional, since the isomorphism transports a basis in one space to a basis in the other. Thus any infinite-dimensional space cannot be isomorphic to a finite-dimensional one. On the other hand, any two finite-dimensional spaces of the same dimension are isomorphic. From this and Example 1 it follows that every finite-dimensional (real) linear space of dimension $n$ is isomorphic to $R^n$ - although in practice the construction of an isomorphism may be rather artificial.

After this excursion into the theory of linear spaces and linear maps we now turn again to the task of generalizing the definition of differentiability. If $V, W$ are linear spaces and $f : V \to W$ is any map (not necessarily linear) then given $x$ and $h$ in $V$ we can certainly define

\[ F(h) = f(x+h) - f(x) \]

as an element of $W$. Now if we wish to discuss approximating $F$ by some suitable linear map $L$ we need to introduce some metric or at least topological structure. Although we could make a definition of differentiability at $x$ involving only topologies on $V$ and $W$, experience shows that in the first instance it is useful to ensure that $V$ and $W$ are each metric spaces (as well as linear spaces), and moreover that the metrics are in a certain sense compatible with the linear structures. Pursuing these ideas we arrive at the notion of a normed linear space.

2.3 NORMED LINEAR SPACES

Observe that in $R^n$ with the euclidean metric the following rules hold:

(i) $d(x+z, y+z) = d(x, y)$ for all $x, y, z$ in $R^n$ (i.e. translation does not alter relative distances).

(ii) $d(ax, ay) = |a|\,d(x, y)$ for all $x, y$ in $R^n$ and all scalars $a$ (i.e. scalar multiplication magnifies or contracts distances uniformly).

These are not consequences of the general definition of metric (§1.2), since they involve the linear structure of $R^n$, which need not exist in an arbitrary metric space.

Let us say (temporarily) that if $V$ is a linear space with some metric $d$ on it then $d$ is sensible if it satisfies (i) and (ii). If $d$ is a sensible metric then from (i) we immediately see that

\[ d(x, y) = d(x-y, 0) \]

for every $x, y$ in $V$. Denoting $d(x, 0)$ by $\|x\|$ (thought of as the length of the 'vector' $x$) we see that in view of (ii) above we have

(1) $\|ax\| = |a|\,\|x\|$ for every $x$ in $V$ and scalar $a$,

and then since $d$ is a metric we also have

(2) $\|x\| \ge 0$ for all $x$ in $V$, and $\|x\| = 0$ precisely when $x$ is the zero element in $V$;

(3) $\|x+y\| \le \|x\| + \|y\|$ for all $x, y$ in $V$.

Note that the given $d$ is recoverable from $\|\cdot\|$ by the formula $d(x, y) = \|x-y\|$.

A function $\|\cdot\| : V \to R$ satisfying (1), (2) and (3) is called a norm for $V$, and $V$ together with a particular norm is called a normed linear space.

Any normed linear space is automatically a metric space, with metric $d$ defined by $d(x, y) = \|x-y\|$: it is straightforward to check that this $d$ does obey rules (1), (2), (3) of the definition of a metric.

EXAMPLES of normed linear spaces

1. $R^n$ with

\[ \|x\| = \Bigl( \sum_{i=1}^{n} |x_i|^2 \Bigr)^{1/2}, \]

i.e. the usual 'length' function for n-vectors. This gives the euclidean metric and so it is called the euclidean norm. In particular, when $n = 1$ we have $\|x\| = |x|$.

2. (a) $R^n$ with $\|x\| = \max\{|x_1|, \ldots, |x_n|\}$.

(b) $R^n$ with $\|x\| = |x_1| + \cdots + |x_n|$.

In fact it can be proved that any norm on $R^n$ will give rise to the same topology on $R^n$ as does the euclidean norm. Of course the metric will in general be different.

3. The space of bounded functions $f : R \to R$, with

\[ \|f\| = \sup\{|f(x)| : x \in R\}. \]

This is a linear subspace of the space of all functions $R \to R$ (Example 3, §2.2) with pointwise addition and scalar multiplication, and the metric is that of Example 4, §1.2. More generally:

4. The space of bounded maps $f : A \to W$, where $W$ is a normed linear space and $A$ is any set whatsoever. (Bounded means there exists a number $K$ such that $\|f(x)\|_W$ is less than $K$ for all $x$ in $A$; equivalently, $f(A) \subset B_K(0)$.) Again the obvious norm is

\[ \|f\| = \sup\{\|f(x)\|_W : x \in A\}. \]

5. The space of continuous functions $f : [0,1] \to R$ with

\[ \|f\| = \sup\{|f(x)| : 0 \le x \le 1\}. \]

Note that this is a linear space with pointwise addition and scalar multiplication as usual, and the norm makes sense in view of Theorem A of §1.6. This space is a linear subspace of the special case of Example 4 above where $A = [0,1]$ and $W = R$, and it inherits the norm and metric from this larger space.

6. The space of continuous functions $f : [0,1] \to R$ which are differentiable at each point in $(0,1)$ and whose derivative $f' : (0,1) \to R$ is a bounded function. Here we could take

\[ \|f\|_0 = \sup\{|f(x)| : 0 \le x \le 1\} \]

or, if we wished to measure derivatives as well, we would take $\|f\|_1$ where

\[ \|f\|_1 = \sup\{|f(x)| : 0 \le x \le 1\} + \sup\{|f'(x)| : 0 < x < 1\}. \]

This is a linear subspace of the space in Example 5. The norm which it naturally inherits is $\|\cdot\|_0$, but the norm which it may be most useful to work with is $\|\cdot\|_1$. The metric corresponding to $\|\cdot\|_1$ is that of Example 5, §1.2.

7. The space of systems of differential equations

\[ x_1' = f_1(x_1, x_2), \qquad x_2' = f_2(x_1, x_2) \qquad (F) \]

defined on the closure $\bar U$ of a bounded open subset $U$ of $R^2$ and such that $f_1, f_2$ are continuous, where the linear structure is defined by pointwise addition and scalar multiplication of maps $f = (f_1, f_2) : R^2 \to R^2$, and the norm is given by

\[ \|F\|_{\bar U} = \sup\{\|(f_1(x), f_2(x))\| : x = (x_1, x_2) \in \bar U\}. \]

Theorem C of §1.6 shows that $\bar U$ is compact, and then Theorem A' guarantees that $\|\cdot\|_{\bar U}$ makes sense. It is routine to verify that $\|\cdot\|_{\bar U}$ is a norm. The corresponding metric is like that of Example 6, §1.2, although there we supposed $f_1, f_2$ to be bounded on the whole of $R^2$.

Now we have essentially everything we need to pursue the study of differentiability.

However,

it is worth going a little further at this

stage to provide some machinery that will be useful later on.

Linearity and continuity

Every linear map $L : R^n \to R^m$ (with the euclidean norms) is continuous. The proof is easy: let $\{e_1, e_2, \ldots, e_n\}$ be the standard basis for $R^n$ and observe that if $x$ and $y$ are in $R^n$ then

\[ L(x) - L(y) = L(x-y) = L\Bigl( \sum_{i=1}^{n} (x_i - y_i) e_i \Bigr) = \sum_{i=1}^{n} (x_i - y_i) L(e_i) \]

by linearity. Hence

\[ \|L(x) - L(y)\| \le \sum_{i=1}^{n} |x_i - y_i|\,\|L(e_i)\| \le \|x-y\|\,M \quad \text{where} \quad M = \sum_{i=1}^{n} \|L(e_i)\|. \]

Therefore $\|L(x) - L(y)\|$ is less than $\varepsilon$ whenever $\|x-y\|$ is less than $\delta = \varepsilon M^{-1}$.

In this proof we have used strongly the fact that the domain $R^n$ is finite-dimensional, but note that the proof would have worked with the codomain replaced by any normed linear space $F$.
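The estimate in this proof can be checked numerically. The following is a minimal sketch, assuming a hypothetical $2 \times 3$ matrix `A` chosen only for illustration: it computes the constant $M = \sum_i \|L(e_i)\|$ from the columns of the matrix and compares it with the ratio $\|L(x) - L(y)\| / \|x - y\|$ on random pairs of points.

```python
import math
import random

def apply(matrix, x):
    """Apply the linear map with the given m x n matrix (list of rows) to x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in matrix]

def euclid(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(c * c for c in v))

def lipschitz_constant(matrix):
    """M = sum_i ||L(e_i)||, where L(e_i) is the i-th column of the matrix."""
    n = len(matrix[0])
    columns = [[row[i] for row in matrix] for i in range(n)]
    return sum(euclid(col) for col in columns)

# Hypothetical example: a linear map R^3 -> R^2.
A = [[1.0, -2.0, 0.5],
     [3.0, 0.0, -1.0]]
M = lipschitz_constant(A)

random.seed(0)
worst_ratio = 0.0
for _ in range(1000):
    x = [random.uniform(-5, 5) for _ in range(3)]
    y = [random.uniform(-5, 5) for _ in range(3)]
    diff = [a - b for a, b in zip(apply(A, x), apply(A, y))]
    dist = euclid([a - b for a, b in zip(x, y)])
    if dist > 0:
        worst_ratio = max(worst_ratio, euclid(diff) / dist)

print(M, worst_ratio)  # worst_ratio never exceeds M
```

The constant $M$ is usually far from the best possible one; the proof only needs some finite bound.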

In general a linear map between two normed linear spaces need not be continuous. Here is an example with, necessarily, an infinite-dimensional domain.

EXAMPLE Let

V

be the space of all infinite sequences

— (^-^»

> a3 » . . . )

with addition and scalar multiplication defined term—wise,

the linear subspace consisting of those with

Define a norm for

E

Y lj, n=l

1

a

and let

E

be

finite.

n1

by oo

and define a linear map L(a^ »

Then

L

a2*

L a3♦

is clearly linear,

let

: E

E

•••)



by (a2>

Then

| |a^n^I I

= “

(n-1)!

but

• • •)



6

is there always exists some

less

6

but

||L(a^n^)||

= 1

in the

| |L(a^n^)| |

how small than

a^ >

but is not continuous at

denote the sequence with

elsewhere.

a3>

a^n^

0. nth

place and zero

,

and so no matter

= 1 in

To see this

E

with

| |a^n^ | |

.

The basic facts about continuity of linear maps are summarized in the following theorem. The details of the proof are straightforward, and can be found in any book (such as Simmons [118]) on functional analysis.

THEOREM

Let $L : E \to F$ be a linear map between normed linear spaces. Then the following statements are mutually equivalent:

(1) $L$ is continuous.

(2) $L$ is continuous at the origin in $E$.

(3) There exists a constant $K$ such that $\|L(x)\|_F$ is less than $K$ for every $x$ in $E$ with $\|x\|_E = 1$.

Statement (3) can be described by saying that however much $L$ distorts the 'unit sphere' $S$ in $E$, it nevertheless maps it inside some sphere of radius $K$ and centre the origin in $F$.

If $L : E \to F$ is a continuous linear map then there is a greatest lower bound for all the constants $K$ which satisfy (3). This is the radius of the smallest sphere centred at the origin in $F$ and containing $L(S)$, in other words

\[ \sup\{\|L(x)\|_F : \|x\|_E = 1\}. \]

This number is denoted by $\|L\|$, for it can be shown without much trouble that it does define a norm on the linear subspace of $\mathrm{Lin}(E,F)$ consisting of continuous linear maps. Note that for such a map $L$ we have the inequality

\[ \|Lx\|_F \le \|L\|\,\|x\|_E \]

for every $x$ in $E$. This follows immediately from the definition of $\|L\|$ by writing $\|x\|_E = a$ and applying $L$ to $a^{-1}x$ (the case $x = 0$ being trivial): since $\|a^{-1}x\|_E = 1$, we have

\[ \|L(a^{-1}x)\|_F \le \|L\|, \]

but the left hand side is $a^{-1}\|L(x)\|_F$ by the linearity of $L$ and property (1) of the norm.

The notion of the normed linear space of continuous linear maps $E \to F$, with particular properties of the norm such as the inequality above, will be important in §2.6 when we consider higher orders of differentiation. Quite apart from these uses, however, it constitutes one of the basic pieces of equipment in functional analysis.
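As an illustration of the number $\|L\|$, here is a rough sketch that estimates $\sup\{\|L(x)\| : \|x\| = 1\}$ for a linear map $R^2 \to R^2$ by sampling the unit circle, and then checks the inequality $\|Lx\| \le \|L\|\,\|x\|$ at one point. The matrix `A`, the sample count and the test point are assumptions made for the example; sampling gives only an approximation from below, not an exact value.

```python
import math

def apply(matrix, x):
    """Apply the linear map with the given matrix (list of rows) to x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in matrix]

def euclid(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(c * c for c in v))

def operator_norm_estimate(matrix, samples=3600):
    """Approximate sup{||L(x)|| : ||x|| = 1} for a map R^2 -> R^m
    by evaluating ||L(x)|| at equally spaced points of the unit circle."""
    best = 0.0
    for k in range(samples):
        theta = 2 * math.pi * k / samples
        unit = [math.cos(theta), math.sin(theta)]
        best = max(best, euclid(apply(matrix, unit)))
    return best

# Hypothetical example matrix.
A = [[2.0, 1.0],
     [0.0, 1.0]]
op_norm = operator_norm_estimate(A)

x = [3.0, -4.0]
lhs = euclid(apply(A, x))     # ||L(x)||
rhs = op_norm * euclid(x)     # (estimated) ||L|| ||x||
print(op_norm, lhs, rhs)
```

For this particular matrix the exact value is $\sqrt{3 + \sqrt{5}}$, the square root of the largest eigenvalue of $A^{T}A$, which the sampled estimate approaches as the number of samples grows.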

Completeness

Recall that in a metric space $S$ we say a sequence $x_1, x_2, \ldots$ converges to $c$ in $S$ when:

given any $\varepsilon > 0$ there exists a positive integer $m$ such that $d(x_n, c) < \varepsilon$ for all $n$ with $n > m$.

If $S$ is a normed linear space we have $d(x_n, c) = \|x_n - c\|$, and if $S = R$ this is just $|x_n - c|$.

When we are given a sequence how can we tell whether or not it converges to some $c$? If we have a plausible guess for $c$ then we can look at $d(x_n, c)$ and try to check the definition, but what methods are available if we are not sure even whether $c$ exists? In the case $S = R$ there is a very useful lemma (often known as the General Principle of Convergence) stating that a sequence of numbers $x_1, x_2, \ldots$ converges (to something) if and only if

given any $\varepsilon > 0$ there exists a positive integer $m$ such that $|x_{n_1} - x_{n_2}| < \varepsilon$ for all $n_1, n_2 > m$.

Unfortunately the generalization of this to arbitrary metric spaces with $d(x_{n_1}, x_{n_2})$ replacing $|x_{n_1} - x_{n_2}|$ turns out to be false: take, for example, $S$ to be an open interval or to be the set of rational numbers with the metric induced from $R$. In these cases a sequence as above in $S$ will converge to something in $R$ which may not belong to $S$.
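The failure in the rational numbers can be seen concretely. The sketch below is an illustration of our own choosing, not taken from the text: it runs the iteration $x_{n+1} = \tfrac12(x_n + 2/x_n)$ in exact rational arithmetic. Every term is rational and successive terms get closer and closer together, yet no term squares to $2$, the limit in $R$ being $\sqrt 2$.

```python
from fractions import Fraction

def newton_sqrt2(steps):
    """Return the first steps+1 terms of x_{n+1} = (x_n + 2/x_n)/2, starting at 2,
    computed exactly as rational numbers."""
    x = Fraction(2)
    seq = [x]
    for _ in range(steps):
        x = (x + 2 / x) / 2
        seq.append(x)
    return seq

seq = newton_sqrt2(6)

# Distances between successive terms shrink rapidly.
gaps = [abs(seq[n + 1] - seq[n]) for n in range(len(seq) - 1)]

# No rational term can be a square root of 2.
never_exact = all(x * x != 2 for x in seq)

print([str(x) for x in seq[:4]])
print(float(seq[-1]) ** 2)
```

Shrinking successive gaps do not by themselves prove the condition in the General Principle of Convergence; here it does hold, because the terms decrease towards $\sqrt 2$ from above.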

Metric spaces in which this General Principle of Convergence does hold are sufficiently important and common for it to be worthwhile identifying them with a particular name: they are called complete metric spaces. A normed linear space which is also complete when viewed as a metric space is known as a Banach space (S. Banach, 1892-1945).

The examples given above of non-complete metric spaces are, of course, not (real) linear spaces. In fact it would be impossible to construct a finite-dimensional linear example in view of the result that every finite-dimensional normed linear space is automatically a Banach space. Completeness of a normed linear space is a technical condition needed for the correct statement of some general results later on, but in finite dimensions there will be no need to worry about it explicitly.

Hilbert spaces

Apart from its linear and norm structures, $R^n$ has interesting geometrical properties that relate to the inner product

\[ \langle x, y \rangle = x_1y_1 + x_2y_2 + \cdots + x_ny_n \]

of pairs of elements $x, y$, familiar as the dot product or scalar product for vectors in $R^3$. We have $\langle x, x \rangle = \|x\|^2$, and in $R^3$ it can be shown that

\[ \langle x, y \rangle = \|x\|\,\|y\| \cos\theta \]

where $\theta$ is the angle between the vectors $x, y$. In $R^n$ we can take this as the definition of $\theta$, and mimic much of the geometry of $R^3$.

In functional analysis there are many circumstances under which a Banach space comes provided with an inner product analogous to that in $R^n$. Technically, an inner product on a real normed linear space $E$ means a map

\[ \langle\ ,\ \rangle : E \times E \to R \]

which is linear in each factor separately, is symmetric ($\langle x, y \rangle = \langle y, x \rangle$), and is such that $\langle x, x \rangle$ is non-negative for every $x$, vanishing only when $x = 0$. (In the complex case there are some amendments to these conditions.) Using such an inner product it is possible to derive much stronger results about the structure of a Banach space than would be possible without it. A Banach space which has an inner product is called a Hilbert space (D. Hilbert, 1862-1943).

Sometimes it is necessary to ensure that a Hilbert space, although possibly infinite-dimensional, is still not too large in a topological sense. A topological space is called separable if it contains a countable dense subset (as with the countable set $Q$ of rational numbers contained in $R$). It is particularly useful to deal with separable Hilbert spaces, since it can be proved that they have in a certain precise sense at most countably infinite dimensions, and many finite-dimensional techniques can be extended to this case by replacing n-tuples $(x_1, \ldots, x_n)$ by infinite sequences.

2.4 DIFFERENTIATION (continued)

Let $E$ and $F$ be two normed linear spaces with respective norms $\|\cdot\|_E$ and $\|\cdot\|_F$, suppose $f : E \to F$ is a given continuous (not necessarily linear) map, and let $x$ be a particular point of $E$. By analogy with §2.1 we make the following definition:

DEFINITION

The map $f$ is differentiable at $x$ if there is a linear map $L : E \to F$ which approximates $f$ at $x$ in the sense that for all $h$ in $E$ we have

\[ f(x+h) - f(x) = L(h) + \|h\|_E\,\eta(h) \]

where $\eta(h)$ is an element of $F$ with $\|\eta(h)\|_F \to 0$ as $\|h\|_E \to 0$.

In general $L$ of course depends on $x$, and so to be formally correct we should write something like $L_x$ instead of $L$. Here $\|h\|_E$ plays the role of $|h|$ in §2.1, while $h$ (thought of as 'small') is no longer a number but an element of $E$, and $\eta(h)$ is an element of $F$.

It is not difficult to show that if such an $L$ exists (for given $E$, $F$, $x$ and $f$) then it is unique. We call it the derivative of $f$ at $x$, and denote it by $L = Df(x)$. It is a linear map from $E$ to $F$, and not a number.

To avoid proliferation of brackets we will often write $Df(x).h$ or $Df(x)h$ instead of $Df(x)(h)$. Remember that $Df(x)$ is a linear map from $E$ to $F$, $h$ is an element of $E$, and so $Df(x)h$ is an element of $F$. Using this notation we can re-write the defining expression for the derivative as

\[ f(x+h) - f(x) = Df(x)h + \|h\|\,\eta(h) \qquad (*) \]

where $\|h\|$ means $\|h\|_E$, and $\eta(h) \to 0$ (in $F$) as $h \to 0$ (in $E$).

The pictorial interpretation is much as before, although the spaces now may not be 1-dimensional or even finite-dimensional. Instead of thinking of the graph of $f$ (which will be some kind of 'surface' in the cartesian product $E \times F$) it can often be more useful to visualize the image of $f$. This consists of some distorted version of $E$ inside $F$. The image of $Df(x)$ is a linear subspace of $F$, and when translated by the vector $f(x)$ it is 'tangent' to the image of $f$ at the point $f(x)$. See Figure 9.

[Figure 9: the image of $f$]

Equivalently, the map $h \mapsto f(x) + Df(x)h$ is a first order approximation to the map $h \mapsto f(x+h)$.
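The defining formula (*) can be tested numerically on a small example. The sketch below assumes a hypothetical map $f : R^2 \to R^2$, $f(x_1, x_2) = (x_1^2 - x_2,\ x_1x_2)$, together with a candidate derivative worked out by hand; neither comes from the text. It computes $\|\eta(h)\|$ for a shrinking sequence of vectors $h$ and shows it tending to zero.

```python
import math

def f(x):
    """Hypothetical example map R^2 -> R^2."""
    x1, x2 = x
    return [x1 * x1 - x2, x1 * x2]

def Df(x, h):
    """Candidate derivative of f at x applied to h (computed by hand for this f)."""
    x1, x2 = x
    h1, h2 = h
    return [2 * x1 * h1 - h2, x2 * h1 + x1 * h2]

def norm(v):
    """Euclidean norm."""
    return math.sqrt(sum(c * c for c in v))

def eta_norm(x, h):
    """||eta(h)|| where f(x+h) - f(x) = Df(x)h + ||h|| eta(h)."""
    fx = f(x)
    fxh = f([a + b for a, b in zip(x, h)])
    lin = Df(x, h)
    remainder = [p - q - r for p, q, r in zip(fxh, fx, lin)]
    return norm(remainder) / norm(h)

x = [1.0, 2.0]
direction = [0.6, 0.8]                     # a unit vector
sizes = [10.0 ** -k for k in range(1, 5)]  # ||h|| = 0.1, 0.01, ...
etas = [eta_norm(x, [t * d for d in direction]) for t in sizes]
print(etas)
```

For this $f$ the remainder is exactly $(h_1^2, h_1h_2)$, so $\|\eta(h)\| = |h_1|$, which is at most $\|h\|$; the printed values shrink in proportion to the size of $h$.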

Interpretation of the derivatives in familiar cases

We will now look at the meaning of the above rather abstract definition in the case of maps between euclidean spaces, where differentiation is a more familiar concept.

Case 1. $E = R$, $F = R^m$.

Here $f$ is a continuous map $R \to R^m$, which we call a path in $R^m$. (From some points of view it would be more sensible to give that name to the image of $f$ in $R^m$, but then we would have lost track of its parametrization by $R$. We shall continue to call the map a path.)

At each point $t$ in $R$ the derivative $Df(t)$, if it exists, is a linear map $R \to R^m$. Now linear maps $L$ from $R$ to any linear space $F$ are very easy to understand, since for any $s$ in $R$ we have $s = s.1$ and so by linearity of $L$ we have

\[ L(s) = sL(1). \]

In other words, to find $L(s)$ all you have to do is to find $L(1)$ as an element of $F$, and multiply it by the scalar $s$. Thus there are two possibilities: either $L(1) = 0$ in $F$, in which case $L(s) = 0$ for all $s \in R$ (i.e. $L$ is the zero map $R \to F$), or $L(1) \ne 0$, in which case the image of $L$ is the straight line through the origin and the point $L(1)$ in $F$. Returning to the case when $L = Df(t)$, we see that if $Df(t) : R \to R^m$ is not the zero map then its image is a straight line which, after the origin is translated to $f(t)$, is tangent at $f(t)$ to the image of the path $f$. See Figure 10.
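A small numerical sketch may help fix the idea that a linear map $L : R \to F$ is determined by the single vector $L(1)$. It assumes the circular path $f(t) = (\cos t, \sin t)$ as an example and approximates $L(1)$ for $L = Df(t)$ by a symmetric difference quotient; the step size `h` is an arbitrary choice.

```python
import math

def path(t):
    """Example path R -> R^2: the unit circle."""
    return [math.cos(t), math.sin(t)]

def velocity(f, t, h=1e-6):
    """Approximate L(1), where L = Df(t), by a symmetric difference quotient."""
    fp, fm = f(t + h), f(t - h)
    return [(a - b) / (2 * h) for a, b in zip(fp, fm)]

def derivative_map(f, t):
    """Return the linear map s -> s L(1) approximating Df(t)."""
    v = velocity(f, t)
    return lambda s: [s * c for c in v]

t0 = math.pi / 3
L = derivative_map(path, t0)
L1 = L(1.0)   # close to (-sin t0, cos t0)

# A point on the tangent line: the image of L translated to f(t0).
tangent_point = [p + c for p, c in zip(path(t0), L(0.5))]
print(L1, tangent_point)
```

The vector `L1` is perpendicular to the radius at `path(t0)`, as one expects of a tangent to a circle.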

[Figure 10]

The association of $L(1)$ with $L$ sets up a 1-1 correspondence between linear maps $R \to F$ and elements of $F$, and it is a simple matter to check that this is a linear isomorphism between the space of all linear maps $R \to F$ (pointwise addition and scalar multiplication) and the space $F$ itself. For this reason whenever we have a map $f : R \to F$ (in particular when $F = R^n$) it is often convenient to work not with $Df(t)$ but with $Df(t)(1)$ or $Df(t)1$, which is an element of $F$ and denoted by $f'(t)$. By linearity we have for all $s$ in $R$

\[ Df(t)s = Df(t)(s.1) = sDf(t)1 = sf'(t) \]

and so the formula (*) defining the derivative of $f : R \to F$ becomes

\[ f(t+h) - f(t) = hf'(t) + \|h\|\,\eta(h) \]

where $\eta(h) \to 0$ as $h \to 0$. Finally, returning to the specific case when $F = R^n$ we have

\[ f(t) = (f_1(t), f_2(t), \ldots, f_n(t)) \]

and the above formula shows us that $f'(t)$ is simply the n-vector

\[ (f_1'(t), f_2'(t), \ldots, f_n'(t)) \]

where the $f_i'$ are the usual derivatives of the real-valued functions $f_i$. This fits neatly into Figure 10, illustrating our familiar notion of $f'(t)$ as a vector based at $f(t)$ and tangent to the (image of the) path there. Of course, Figure 10 serves equally well as a pictorial representation of the case for general $F$, not necessarily $R^m$.

To round off this description we look at the case $n = 1$, bringing us back to the beginning of the discussion of differentiation (§2.1). We have two concepts of derivative: the old familiar $f'(t)$ as a number, and the new $Df(t)$ as a linear map $R \to R$. What is their relationship? From the above we see that it is merely that

\[ f'(t) = Df(t)1, \]

the right hand side being a number (element of $R$) since $n = 1$.

Case 2. $E = R^n$, $F = R$.

Here $f$ is a real-valued function of $n$ real variables $x = (x_1, x_2, \ldots, x_n)$, and $Df(x)$ (if it exists) is a linear map $R^n \to R$. To analyze this linear map we will take the standard basis $\{e_1, e_2, \ldots, e_n\}$ for $R^n$ (see §2.2). Then any $h$ in $R^n$ can be written uniquely as

\[ h = h_1e_1 + h_2e_2 + \cdots + h_ne_n \]

(i.e. $h = (h_1, h_2, \ldots, h_n)$) and so by linearity we have

\[ Df(x)h = h_1Df(x)e_1 + h_2Df(x)e_2 + \cdots + h_nDf(x)e_n = h_1a_1 + h_2a_2 + \cdots + h_na_n, \ \text{say}. \]

By taking $h$ to be of the form $(0, 0, \ldots, h_i, \ldots, 0)$ (i.e. $h_i$ in the $i$th place and zeros elsewhere) for each $i$ in turn, and substituting in the formula (*), we find that the $a_i$ are nothing other than the familiar partial derivatives $\partial f/\partial x_i$ at the point $x$.

If $v$ is any element of $R^n$ then the number $Df(x)v$ is called the (partial) derivative of $f$ at $x$ in the direction of $v$. Thus in this terminology the partial derivative $\frac{\partial f}{\partial x_i}(x)$ is the derivative of $f$ at $x$ in the direction of $e_i$. If $R^n$ is replaced by an arbitrary normed linear space $E$ then the notion of 'standard' partial derivatives $\partial f/\partial x_i$ will not exist, but the derivative of $f$ in the direction of $v$ (where $v$ is an element of $E$) will still make sense.

Case 3. $E = R^n$, $F = R^m$.

In this case $Df(x)$ is a linear map $R^n \to R^m$, and so if we choose bases $\{e_1, e_2, \ldots, e_n\}$, $\{e'_1, e'_2, \ldots, e'_m\}$ (not necessarily the standard bases) for $R^n$, $R^m$ respectively we can represent $Df(x)$ by a matrix. Specifically, if

\[ h = \xi_1e_1 + \xi_2e_2 + \cdots + \xi_ne_n \]

then

\[ Df(x)h = \lambda_1e'_1 + \lambda_2e'_2 + \cdots + \lambda_me'_m \]

where the $\lambda$'s are related to the $\xi$'s by

\[ \lambda_i = \sum_{j=1}^{n} \alpha_{ij}\xi_j, \qquad 1 \le i \le m, \]

for a collection of $mn$ scalars $\{\alpha_{ij}\}$ forming a matrix with $m$ rows and $n$ columns.

If we now do take the bases to be the standard bases, so that the $\xi_j$ and $\lambda_i$ are the usual coordinates, then writing

\[ f(x) = (f_1(x), f_2(x), \ldots, f_m(x)) \]

and putting together the discussions of cases 1 and 2 we find that

\[ \alpha_{ij} = \frac{\partial f_i}{\partial x_j}(x) \]

for each $i, j$. The matrix representing $Df(x)$ with respect to the standard bases is thus

\[
\begin{pmatrix}
\frac{\partial f_1}{\partial x_1}(x) & \frac{\partial f_1}{\partial x_2}(x) & \cdots & \frac{\partial f_1}{\partial x_n}(x) \\
\frac{\partial f_2}{\partial x_1}(x) & \frac{\partial f_2}{\partial x_2}(x) & \cdots & \frac{\partial f_2}{\partial x_n}(x) \\
\vdots & \vdots & & \vdots \\
\frac{\partial f_m}{\partial x_1}(x) & \frac{\partial f_m}{\partial x_2}(x) & \cdots & \frac{\partial f_m}{\partial x_n}(x)
\end{pmatrix}
\]

This is the so-called Jacobian matrix (with respect to standard bases), condensed to the notation

\[ \frac{\partial(f_1, f_2, \ldots, f_m)}{\partial(x_1, x_2, \ldots, x_n)}. \]

Observe that we could equally well construct an analogous Jacobian matrix for other choices of bases.

When $m = n$ the matrix is a square matrix, and so has a determinant. This number is often denoted by $Jf(x)$ or just $J$ when $f$ and $x$ are understood: it will be of considerable interest to us later, in §2.8.
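Cases 2 and 3 can be illustrated together. The following sketch, which assumes the hypothetical map $f(x_1, x_2) = (x_1^2 - x_2^2,\ 2x_1x_2)$ and a finite-difference step `t` of our own choosing, approximates the Jacobian matrix with respect to the standard bases, evaluates its determinant, and compares the matrix product $Jv$ with a directly computed derivative in the direction of $v$.

```python
def f(x):
    """Hypothetical example map R^2 -> R^2."""
    x1, x2 = x
    return [x1 * x1 - x2 * x2, 2.0 * x1 * x2]

def jacobian(f, x, t=1e-6):
    """Approximate the matrix of partial derivatives df_i/dx_j at x
    by symmetric difference quotients along the standard basis vectors."""
    m, n = len(f(x)), len(x)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += t
        xm[j] -= t
        fp, fm = f(xp), f(xm)
        for i in range(m):
            J[i][j] = (fp[i] - fm[i]) / (2 * t)
    return J

def det2(J):
    """Determinant of a 2 x 2 matrix."""
    return J[0][0] * J[1][1] - J[0][1] * J[1][0]

def apply(J, v):
    """Matrix times vector."""
    return [sum(a * vj for a, vj in zip(row, v)) for row in J]

def directional(f, x, v, t=1e-6):
    """Approximate Df(x)v directly, without using the partial derivatives."""
    fp = f([a + t * b for a, b in zip(x, v)])
    fm = f([a - t * b for a, b in zip(x, v)])
    return [(p - q) / (2 * t) for p, q in zip(fp, fm)]

x = [1.0, 2.0]
J = jacobian(f, x)          # close to [[2, -4], [4, 2]]
Jdet = det2(J)              # close to 20
v = [0.5, -1.0]
Jv = apply(J, v)            # Df(x)v via the Jacobian matrix
direct = directional(f, x, v)
print(J, Jdet, Jv, direct)
```

The agreement of `Jv` with `direct` is the statement that $Df(x)v = \sum_j v_j\,Df(x)e_j$, i.e. linearity of $Df(x)$.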

Remarks

1. The Jacobian matrix depends on a choice of bases, whereas the derivative $Df(x)$ does not. The derivative was defined entirely without reference to coordinates.

2. Although the differentiability of $f$ at $x$ implies the existence of the partial derivatives of the components $f_i$ of $f$, it is a standard fact that the converse does not hold. This is essentially because the partial derivatives tell us approximately how the function behaves in particular directions, whereas the derivative $Df(x)$ deals with the behaviour in a whole neighbourhood. It can be proved, however, that if the partial derivatives exist and are themselves continuous functions in some neighbourhood of $x$ then $f$ is differentiable at $x$. We shall not be concerned with technicalities of this kind in what follows.

3. Digression into the complex field. All the above ideas work just as well for maps between complex linear spaces. In particular, if $C^n$ denotes the linear space of n-tuples of complex numbers with co-ordinate-wise addition and multiplication by scalars (complex numbers) and norm given by

\[ \|z\| = \Bigl( \sum_{i=1}^{n} |z_i|^2 \Bigr)^{1/2} \]

then the derivative of a differentiable map $f : C^n \to C^m$ at $z$ is a complex-linear map $Df(z) : C^n \to C^m$. Now $C^n$ can be thought of as $R^{2n}$ by, for example, identifying $(z_1, \ldots, z_n)$ with $(x_1, y_1, x_2, y_2, \ldots, x_n, y_n)$ where $z_j = x_j + iy_j$ for each $j = 1, 2, \ldots, n$, although in $C^n$ we can multiply by complex scalars while in $R^{2n}$ we can multiply only by real scalars. A complex-linear map $C^n \to C^m$ becomes a real linear map $R^{2n} \to R^{2m}$ (since real scalars are special cases of complex scalars) but the converse is not true. At each $z$ in $C^n$ the conditions for a real-differentiable map $f : R^{2n} \to R^{2m}$ to represent a complex-differentiable map $C^n \to C^m$ at $z$ are that

\[ \frac{\partial u_k}{\partial x_j} = \frac{\partial v_k}{\partial y_j}, \qquad \frac{\partial u_k}{\partial y_j} = -\frac{\partial v_k}{\partial x_j}, \qquad 1 \le k \le m, \quad 1 \le j \le n \]

where we write

\[ f(x_1, y_1, x_2, y_2, \ldots, x_n, y_n) = (u_1, v_1, u_2, v_2, \ldots, u_m, v_m), \]

each $u_k$, $v_k$ being a function of the $x_j$ and $y_j$. These are the classical Cauchy-Riemann equations.
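For $n = m = 1$ the Cauchy-Riemann equations are easy to test numerically. The sketch below is only an illustration under assumptions of our own (the two example functions, the test point and the tolerance): it regards a function $C \to C$ as a map $R^2 \to R^2$, approximates its real Jacobian matrix, and checks whether $\partial u/\partial x = \partial v/\partial y$ and $\partial u/\partial y = -\partial v/\partial x$.

```python
def as_real_map(g):
    """Turn g : C -> C into the corresponding map R^2 -> R^2, (x, y) -> (u, v)."""
    def F(p):
        w = g(complex(p[0], p[1]))
        return [w.real, w.imag]
    return F

def jacobian(F, p, t=1e-6):
    """Approximate the 2 x 2 real Jacobian matrix of F at p."""
    J = [[0.0, 0.0], [0.0, 0.0]]
    for j in range(2):
        pp, pm = list(p), list(p)
        pp[j] += t
        pm[j] -= t
        fp, fm = F(pp), F(pm)
        for i in range(2):
            J[i][j] = (fp[i] - fm[i]) / (2 * t)
    return J

def satisfies_cauchy_riemann(g, z, tol=1e-6):
    """Check u_x = v_y and u_y = -v_x at the point z, up to the tolerance tol."""
    J = jacobian(as_real_map(g), [z.real, z.imag])
    ux, uy = J[0]
    vx, vy = J[1]
    return abs(ux - vy) < tol and abs(uy + vx) < tol

z0 = complex(1.0, 2.0)
square_ok = satisfies_cauchy_riemann(lambda z: z * z, z0)
conj_ok = satisfies_cauchy_riemann(lambda z: z.conjugate(), z0)
print(square_ok, conj_ok)
```

The map $z \mapsto z^2$ passes the check, while complex conjugation, which is real-linear but not complex-linear, fails it.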

So far we have been discussing maps $f$ whose domain $E$ and codomain $F$ are both normed linear spaces, but it is clear that for differentiability to be defined at $x$ it is not necessary that $f$ be defined on the whole of $E$. All we need is that $f(x+h)$ should be defined for all sufficiently small $h$, i.e. on some $\delta$-ball $B_\delta(x)$. Hence if the domain of $f$ is given as an open set $U$ in a normed linear space $E$ it makes sense to talk about differentiability of $f : U \to F$ at any point $x$ in $U$. Note, however, that although the formula (*) only has meaning if $h$ is such that $x + h$ lies in $U$, the map $Df(x)$ is linear and so its domain can be thought of as the whole of $E$. Thus associated to $f : U \to F$ we have

\[ Df(x) : E \to F \]

for every $x$ in $U$ at which $f$ is differentiable.

If it is the case that $f$ is differentiable at $x$ for every $x$ in $U$ then we simply say $f : U \to F$ is differentiable, or that $f$ is differentiable on $U$. For each $x$ in $U$ we have a linear map $Df(x) : E \to F$, or in other words we have a map

\[ Df : U \to \{\text{linear maps } E \to F\} = \mathrm{Lin}(E,F). \]

At this point we may wish to ask whether the derivative of $f$ at $x$ varies continuously with $x$, i.e. whether the map $Df : U \to \mathrm{Lin}(E,F)$ is continuous. To make sense of this we would have to have a topology on $\mathrm{Lin}(E,F)$, or at least on some subset of $\mathrm{Lin}(E,F)$ containing the image of $Df$. Now it so happens that if the continuous map $f : U \to F$ is differentiable at $x$ then $Df(x)$ is a continuous linear map $E \to F$ (i.e. is not merely a linear map but is continuous on $E$, for each fixed $x$). This means that the image of $Df : U \to \mathrm{Lin}(E,F)$ is contained in the subspace of $\mathrm{Lin}(E,F)$ consisting of continuous linear maps. As we saw in §2.3 this space, which we will now denote by $L(E,F)$, has a natural structure as a normed linear space and hence a metric space. Therefore if we regard $Df$ not just as a map

\[ Df : U \to \mathrm{Lin}(E,F) \]

but as a map

\[ Df : U \to L(E,F) \]

we are able to discuss whether or not it is continuous.

DEFINITION

If $Df$ is continuous we say that $f : U \to F$ is of class $C^1$ on $U$, or simply that $f$ is of class $C^1$ or $f$ is $C^1$.

This somewhat abstract formulation may seem far removed from familiar ideas of continuity of derivative in euclidean space, but the gap is easily bridged. To begin with, recall that there is no distinction between $\mathrm{Lin}(R^n,R^m)$ and $L(R^n,R^m)$ since, as we have seen, every linear map $R^n \to R^m$ is continuous. There are various reasonable ways to put a norm on $L(R^n,R^m)$. First of all we can take the norm defined above in the general context, namely

\[ \|L\| = \sup\{\|L(x)\| : \|x\| = 1\} \]

where the norms on the right hand side refer to the usual euclidean norms. Secondly, we could choose norms other than euclidean norms for $R^n$, $R^m$ and make the analogous definition of $\|L\|$. Thirdly, we could choose bases for $R^n$ and $R^m$, then represent $L$ by an $m \times n$ matrix $A = (a_{ij})$ and, regarding $A$ as an element of $R^{mn}$, take $\|L\|$ to be the euclidean norm of $A$, i.e.

\[ \|L\| = \Bigl( \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij}^2 \Bigr)^{1/2}. \]

Then there are other possible variants such as

\[ \|L\| = \max_{i,j} |a_{ij}| \]

and so on. Now the important fact to use here, which sweeps away all this confusion, is that it makes no difference which norm we choose on $L(R^n,R^m)$. More precisely, since $L(R^n,R^m)$ is finite-dimensional we can invoke a non-trivial theorem (Simmons [118]) which says that any two norms on a finite-dimensional linear space will induce the same topology, although of course the induced metrics will in general be different. Consequently, for studying the continuity (a purely topological property) of

\[ Df : U \to L(R^n,R^m) \]

we can choose whichever way we like of measuring $\|Df(x)\|$. A convenient way is to take standard bases and let

\[ \|Df(x)\| = \max_{i,j} \Bigl| \frac{\partial f_i}{\partial x_j}(x) \Bigr|. \]

It is then simple to see that continuity of $Df$ is the same as continuity of the partial derivatives $\partial f_i/\partial x_j$. Therefore $f : U \to R^m$ ($U$ an open subset of $R^n$) is $C^1$ precisely when all the $\partial f_i/\partial x_j$ are continuous functions on $U$.

This freedom in the choice of norm is another illustration of the advantages of working with topologies rather than particular metric structures.
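The claim that different norms on $L(R^n,R^m)$ give the same topology rests on each norm being bounded by a constant multiple of the other. For the two matrix norms mentioned above the comparison is elementary: the largest entry is at most the euclidean norm of the matrix, which is at most $\sqrt{mn}$ times the largest entry. The sketch below checks this on random matrices; the sizes and the random entries are arbitrary choices made for the example.

```python
import math
import random

def max_norm(A):
    """max over i, j of |a_ij|."""
    return max(abs(a) for row in A for a in row)

def frobenius_norm(A):
    """Euclidean norm of the matrix regarded as an element of R^(mn)."""
    return math.sqrt(sum(a * a for row in A for a in row))

random.seed(1)
m, n = 3, 4
checks = []
for _ in range(200):
    A = [[random.uniform(-10, 10) for _ in range(n)] for _ in range(m)]
    small, big = max_norm(A), frobenius_norm(A)
    # max_norm <= frobenius_norm <= sqrt(mn) * max_norm
    checks.append(small <= big + 1e-12 and big <= math.sqrt(m * n) * small + 1e-12)

all_ok = all(checks)
print(all_ok)
```

Because of these two-sided bounds, a sequence of matrices converges in one norm exactly when it converges in the other, which is all that continuity of $Df$ requires.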

2.5

PROPERTIES AND USES OF THE DERIVATIVE

Unless stated otherwise, $E$ and $F$ will denote normed linear spaces with $U$ an open subset of $E$.

The following properties of the derivative can be verified directly from the definition:

(i) Linear combinations. If $f, g : U \to F$ are differentiable then so is the map $\alpha f + \beta g$ for any constants $\alpha, \beta$, and

\[ D(\alpha f + \beta g) = \alpha Df + \beta Dg \]

as maps $U \to L(E,F)$. Constant maps themselves have derivative zero, of course.

(ii) Linear maps. If $f$ is the restriction to $U$ of a continuous linear map $L : E \to F$ then $Df(x) = L$ for every $x$ in $U$, i.e. $L$ is already its own linear approximation. In particular $f$ is $C^1$. (Note that we do not say $Df(x) = L(x)$, which would not make sense.) In the special case $E = F = R$ we have $f(x) = \ell x$ for some number $\ell$, and $Df(x) = \ell$ regarded as the linear map $R \to R : x \mapsto \ell x$.

(iii) Bilinear maps. If $E = E_1 \times E_2$ and $f$ is the restriction to $U$ of a continuous bilinear map $B : E_1 \times E_2 \to F$ (i.e. $B$ is linear in each factor separately) then

\[ Df(x)h = B(x_1, h_2) + B(h_1, x_2) \]

where we write $x = (x_1, x_2)$, $h = (h_1, h_2)$ with $x_1, h_1$ in $E_1$ and $x_2, h_2$ in $E_2$. In particular $f$ is $C^1$ since the map $Df : U \to L(E,F)$ taking $x$ to the linear map

\[ h \mapsto B(x_1, h_2) + B(h_1, x_2) \]

is continuous. As an example, consider $E_1 = E_2 = F = R$ and $f(x_1, x_2) = x_1x_2$ (ordinary multiplication). Then

\[ Df(x)h = x_1h_2 + h_1x_2 = (x_2, x_1)\begin{pmatrix} h_1 \\ h_2 \end{pmatrix} \]

which is another way of saying that $\dfrac{\partial f}{\partial x_1} = x_2$, $\dfrac{\partial f}{\partial x_2} = x_1$.

(iv) Cartesian products. If $f_1 : U_1 \to F_1$ and $f_2 : U_2 \to F_2$ are differentiable then so is $f = f_1 \times f_2 : U_1 \times U_2 \to F_1 \times F_2$, where $(f_1 \times f_2)(x_1, x_2)$ means $(f_1(x_1), f_2(x_2))$, and we have

\[ Df(x_1, x_2)(h_1, h_2) = (Df_1(x_1)h_1,\ Df_2(x_2)h_2). \]

In other words, derivatives operate co-ordinate-wise.

(v) Compositions. (Important) Suppose $E$, $F$ and $G$ are three normed linear spaces, and $U, V$ are open sets in $E, F$ respectively. Let $f : U \to F$ and $g : V \to G$ be continuous maps, and suppose the image of $f$ lies in $V$ so that the composition $g \circ f : U \to G$ exists: see Figure 11.

Figure 11 If g*f

f

is differentiable at is differentiable at

x x

D(g* f)(x)

and

g

is differentiable at

f(x)

then

and = Dg(f(x))*Df(x)

Observe that the right hand side is the composition of the two continuous linear maps

: E -*■ F

Df (x)

and

continuous linear map from

Dg(f(x))

E

to

:

F -> G

and is

itself a

G.

This formula for the derivative of

g*f

is known as

the Chain Rule.

It can be expressed by saying that the derivative of a composition is the composition of the derivatives, but note that care must be taken to specify at which points the derivatives are calculated. In euclidean space the derivatives can be represented by corresponding Jacobian matrices, and then we obtain the familiar chain rule for partial derivatives. Explicitly, if $E = R^n$, $F = R^m$, $G = R^p$ and we write $y = f(x)$, $z = g(y)$, then the chain rule states that

$$\frac{\partial(z_1,z_2,\ldots,z_p)}{\partial(x_1,x_2,\ldots,x_n)} = \frac{\partial(z_1,z_2,\ldots,z_p)}{\partial(y_1,y_2,\ldots,y_m)} \cdot \frac{\partial(y_1,y_2,\ldots,y_m)}{\partial(x_1,x_2,\ldots,x_n)}$$

or equivalently

$$\frac{\partial z_i}{\partial x_j} = \sum_{k=1}^{m} \frac{\partial z_i}{\partial y_k}\,\frac{\partial y_k}{\partial x_j}$$

which is the form in which the rule is most commonly expressed. This form is the one which is most useful for specific calculations, but its disadvantages are (1) it is cumbersome, (2) it depends on choices of co-ordinates in $R^n$ etc., (3) it gives (as written) no information about where to evaluate the partial derivatives.
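The matrix form of the Chain Rule can be checked numerically. The following is a minimal sketch, assuming two illustrative maps $f : R^2 \to R^3$ and $g : R^3 \to R^2$ that do not come from the text; the helper names `jacobian` and `matmul` are likewise hypothetical. It estimates the Jacobian of $g \circ f$ at $x$ by central differences and compares it with the product of the Jacobian of $g$ at $f(x)$ and the Jacobian of $f$ at $x$, which makes the points of evaluation explicit.

```python
import math

def jacobian(func, x, eps=1e-6):
    """Central-difference Jacobian of func at x, returned as a list of rows."""
    m = len(func(x))
    cols = []
    for j in range(len(x)):
        xp, xm = list(x), list(x)
        xp[j] += eps
        xm[j] -= eps
        fp, fm = func(xp), func(xm)
        # column j holds the partial derivatives with respect to x_j
        cols.append([(a - b) / (2 * eps) for a, b in zip(fp, fm)])
    return [[cols[j][i] for j in range(len(x))] for i in range(m)]

def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def f(x):  # illustrative map R^2 -> R^3 (an assumption, not from the text)
    return [x[0] * x[1], math.sin(x[0]), x[1] ** 2]

def g(y):  # illustrative map R^3 -> R^2 (an assumption, not from the text)
    return [y[0] + y[1] * y[2], math.exp(y[0])]

def g_after_f(x):
    return g(f(x))

x0 = [0.3, -0.7]
lhs = jacobian(g_after_f, x0)                        # Jacobian of g o f at x0
rhs = matmul(jacobian(g, f(x0)), jacobian(f, x0))    # Dg at f(x0) times Df at x0
max_gap = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
```

Note that the Jacobian of $g$ is evaluated at `f(x0)`, not at `x0`, which is exactly the information the classical partial-derivative notation leaves implicit.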

EXAMPLES

1. Ordinary multiplication of real numbers is a bilinear map $m : R \times R \to R$, and the Chain Rule allows us to use this to derive a rule for the derivative of the pointwise product $fg : E \times F \to R$ of two functions $f : E \to R$ and $g : F \to R$. Expressing $f(x)g(y)$ as $m(f(x),g(y))$ we can write $fg = m \circ (f \times g)$ and so by the Chain Rule

$$D(fg)(x,y) = Dm(f(x),g(y)) \circ D(f \times g)(x,y) = Dm(f(x),g(y)) \circ (Df(x) \times Dg(y))$$

by the rule for cartesian products. Applying both sides to $(h,k)$ in $E \times F$ we obtain

$$D(fg)(x,y).(h,k) = Dm(f(x),g(y)).(Df(x)h, Dg(y)k) = m(f(x),Dg(y)k) + m(Df(x)h,g(y))$$

by the rule for bilinear maps. The right hand side is simply $f(x)Dg(y)k + g(y)Df(x)h$ (multiplication is commutative), and so we can write the conclusion as

$$D(fg)(x,y) = f(x)Dg(y) + g(y)Df(x)$$

where here the right hand side is a pointwise linear combination of the two linear maps $Df(x) : E \to R$ and $Dg(y) : F \to R$. In the particular case when $E = F = R$ this of course reduces to nothing more than the usual product rule

$$(fg)' = fg' + gf'.$$

2. If $B : E \times E \to R$ is a bilinear function let $f : E \to R$ be defined by $f(x) = B(x,x)$. Such a function is often called a quadratic form. We can write $f$ as the composition $f = B \circ d$ where $d : E \to E \times E$ is the 'diagonal' linear map taking $x$ to $(x,x)$. Since $Dd(x) = d$ we have

$$Df(x)h = DB(d(x)).Dd(x)h = DB(x,x).(h,h) = B(x,h) + B(h,x).$$

If $B$ is symmetric in the two factors this becomes

$$Df(x)h = 2B(x,h) \quad\text{or}\quad Df(x) = 2B(x,\cdot).$$

A particularly common example is the case $E = R^n$ and $B$ = scalar product, i.e.

$$B(x,y) = x.y = \sum_{i=1}^{n} x_iy_i :$$

here $f(x) = \|x\|^2$ and $Df(x)$ is the linear map $h \mapsto 2x.h$. This we can represent by the $1 \times n$ matrix or row-vector $2x^T$. More generally, if $f(x)$ is the quadratic form $x^TAx$ then $Df(x)$ is the row-vector $x^T(A + A^T)$ (where $T$ denotes transpose of the vector $x$ and of the matrix $A$).
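As a quick sanity check of the row-vector formula $x^T(A + A^T)$ for the derivative of $x^TAx$, here is a small sketch. The matrix `A`, the point `x` and the increment `h` are arbitrary illustrative values, not taken from the text. The difference $f(x+h) - f(x) - Df(x)h$ should equal the second-order term $h^TAh$, which is tiny for small $h$.

```python
def quad(A, x):
    """The quadratic form x^T A x."""
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))

def derivative_row(A, x):
    """The row vector x^T (A + A^T) representing Df(x)."""
    n = len(x)
    return [sum(x[i] * (A[i][j] + A[j][i]) for i in range(n)) for j in range(n)]

# Arbitrary, deliberately non-symmetric example data (assumptions).
A = [[1.0, 2.0, 0.0], [-1.0, 3.0, 4.0], [0.5, 0.0, -2.0]]
x = [0.2, -1.0, 0.7]
h = [1e-6, -2e-6, 3e-6]

row = derivative_row(A, x)
linear_part = sum(r * hj for r, hj in zip(row, h))        # Df(x)h
actual = quad(A, [xi + hi for xi, hi in zip(x, h)]) - quad(A, x)
gap = abs(actual - linear_part)                            # should be about |h^T A h|
```

Because `A` is not symmetric here, replacing `A + A^T` by `2A` in `derivative_row` would give a visibly wrong linear part, which is why the general formula needs both terms.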

3. A superficially more complicated but in fact equally straightforward class of examples is illustrated by the following: Let $A, b$ be an $m \times n$ matrix and an $n$-vector, respectively, each varying differentiably with respect to some real parameter $t$, and let $f(t) = A(t)b(t)$ in $R^m$. What is the formula for $f'(t)$, or, more precisely $Df(t).1$, in $R^m$? Writing $f = B \circ (A \times b)$ where $B$ is the bilinear map

$$B : L(R^n,R^m) \times R^n \to R^m : (A,b) \mapsto Ab,$$

we have

$$f'(t) = Df(t).1 = DB(A(t),b(t)).(A'(t),b'(t)) = B(A(t),b'(t)) + B(A'(t),b(t)) = A(t)b'(t) + A'(t)b(t)$$

which makes sense because $b'(t) = Db(t).1$ in $R^n$ and $A'(t) = DA(t).1$ in $L(R^n,R^m)$. There are many similar examples that could be constructed using products of matrices, transposes and so on.

(vi) If $f : U \to V$ is a differentiable map which has a differentiable inverse $g : V \to U$ then at each point $x$ in $U$ the derivative $Df(x) : E \to F$ is a linear isomorphism. This is easy to prove: we have $g \circ f = \mathrm{id} : U \to U$ and so the Chain Rule gives

$$Dg(f(x)).Df(x) = D(\mathrm{id})(x) = \mathrm{id} : E \to E$$

since $\mathrm{id} : U \to U$ is (the restriction of) a linear map. Similarly $f \circ g = \mathrm{id} : V \to V$ and so by the Chain Rule at the point $f(x)$ in $V$

$$Df(g(f(x))).Dg(f(x)) = Df(x).Dg(f(x)) = \mathrm{id} : F \to F.$$

Therefore $Df(x)$ has $Dg(f(x))$ as its inverse, and so in particular $Df(x)$ is a linear isomorphism. Again, this can be casually expressed by saying the derivative of the inverse is the inverse of the derivative.

A differentiable map with differentiable inverse is called a diffeomorphism. A diffeomorphism defined on some open neighbourhood $U$ of a given point $x$ is called a local diffeomorphism at $x$, although this term is often reserved to apply to local diffeomorphisms whose domain and codomain are both open neighbourhoods of $x$ in $E$ and which keep the point $x$ itself fixed. Local diffeomorphisms at $x$ can be regarded simply as invertible (non-linear) changes of local co-ordinates near $x$.

(vii) A much-used result in elementary calculus is the Mean Value Theorem. This states, roughly,

that if a line segment is drawn joining two points on the graph of a differentiable function then there is somewhere on the graph in between these points where the slope of the curve is the same as the slope of the line segment. More accurately, if the interval $[x, x+h]$ lies in the domain of the differentiable function $f$ (although strictly speaking we do not need differentiability at the end-points: continuity there suffices) then there exists some point $\xi$ with $x < \xi < x+h$ such that

$$f'(\xi) = (f(x+h) - f(x))/h.$$

Rewriting this as

$$f(x+h) = f(x) + f'(\xi)h$$

gives another interpretation of the theorem, namely that $f(x+h)$ can be expressed exactly as $f(x)$ plus a linear map applied to $h$. The linear map is the derivative of $f$ at $\xi$, not that at $x$ which, from the definition, may give only an approximation to $f(x+h)$. This latter interpretation, rather than the 'graph' version, generalizes directly to $n$ dimensions or to any normed linear space:

THEOREM (Mean Value Theorem) Let $f : U \to R$ be a differentiable function. Let $x$ and $x+h$ lie in $U$, and suppose the whole line-segment joining $x$ to $x+h$ also lies in $U$. Then there is some point $y$ on the line-segment such that

$$f(x+h) = f(x) + Df(y)h.$$

The proof is easily obtained from the one-dimensional version above. Given $x$ and $h$ in $E$, let $\phi : R \to R$ be the function defined by

$$\phi(t) = f(x + th)$$

i.e. $f$ evaluated at the point a proportion $t$ along the line segment from $x$ to $x+h$. Then $\phi$ is differentiable, being the composition of the differentiable functions $R \to E : t \mapsto x + th$ and $f : E \to R$, and the Chain Rule gives

$$\phi'(t) = Df(x + th)h.$$

By the standard Mean Value Theorem the left hand side equals $\phi(1) - \phi(0)$ for $t$ equal to some $\xi$ between $0$ and $1$, and so taking $y = x + \xi h$ gives the result.
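The Mean Value Theorem on a line segment can be illustrated numerically. This is a minimal sketch under stated assumptions: the function `f` on $R^2$, the point `x` and the increment `h` are illustrative choices not taken from the text, and `f` is chosen so that $t \mapsto Df(x+th)h$ is increasing, which lets a plain bisection locate the intermediate point.

```python
import math

def f(p):
    # illustrative function R^2 -> R (an assumption, not from the text)
    return math.exp(p[0]) + p[1] ** 2

def Df(p, h):
    # Df(p)h for the f above, written out by hand
    return math.exp(p[0]) * h[0] + 2 * p[1] * h[1]

x = [0.0, 1.0]
h = [1.0, -0.5]

def point(t):
    """The point a proportion t along the segment from x to x+h."""
    return [x[0] + t * h[0], x[1] + t * h[1]]

target = f(point(1.0)) - f(point(0.0))   # f(x+h) - f(x)

# Bisection for t in (0, 1) with Df(x + t h)h = f(x+h) - f(x).
lo, hi = 0.0, 1.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if Df(point(mid), h) < target:
        lo = mid
    else:
        hi = mid
xi = 0.5 * (lo + hi)
y = point(xi)
residual = abs(f(point(1.0)) - (f(x) + Df(y, h)))
```

The point `y` lies strictly inside the segment, and the derivative at `x` itself (here `Df(x, h)`) would not reproduce `f(x+h)` exactly.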

To say that a differentiable map $f : U \to F$ is of class $C^1$ is to say that the map $Df : U \to L(E,F)$ is continuous, where $L(E,F)$ has the topology which comes from the norm

$$\|L\| = \sup\{\|L(x)\| : \|x\| = 1\}.$$

(Note: Do not confuse the possible continuity of $Df$ with the necessary continuity (if $f$ is continuous) of $Df(x) : E \to F$.)

We can next ask whether $Df$ itself is differentiable at a given point $x$ in $U$. If it is, then $D(Df)(x)$ will be a linear map $E \to L(E,F)$. This is called the second derivative of $f$ at $x$, abbreviated to $D^2f(x)$. Just as the continuity of $f$ implied that its derivative was a continuous linear map, so the fact that $Df$ is continuous implies that $D^2f(x)$ is a continuous linear map, i.e. $D^2f(x)$ belongs to $L(E,L(E,F))$. If $Df$ is differentiable on the whole of $U$ we thus have a map

$$D^2f : U \to L(E,L(E,F)).$$

If this is continuous, which is the same thing as saying that $Df$ is $C^1$, we say that $f$ is of class $C^2$. Although the right hand side looks complicated it is still a normed linear space, and we can therefore continue in this way to define $C^3$, $C^4$ etc. If $f$ is of class $C^r$ for all $r = 1, 2, 3, \ldots$ then we say that $f$ is $C^\infty$. Clearly, if $f$ is $C^r$ then $D^{r-1}f$ exists and is continuous, or we would not have considered differentiating it: thus $C^r$ implies $C^{r-1}$ and hence $C^r$ implies $C^s$ for all $s < r$. Conventionally we write $C^0$ to mean continuous.

Sometimes the words smooth or just differentiable are used in the literature to mean $C^\infty$. Later we shall tend to use smooth to mean either $C^\infty$ or $C^r$ for some $r$ understood from the context.

Discussion of derivatives of high order rapidly becomes very uncomfortable because we have to deal explicitly with spaces of type

$$L(E,L(E,L(E,\ldots,L(E,L(E,F))\ldots))).$$

Fortunately we can avoid the problem by making use of some linear isomorphisms which conveniently exist between these spaces and a family of other types of space which are conceptually easier to grasp. These are spaces of multilinear maps.

We have already encountered examples of bilinear maps. An n-multilinear map is a map

$$M : E_1 \times E_2 \times \ldots \times E_n \to F,$$

where all the $E_i$ and $F$ are linear spaces, which is a linear map in each factor separately (i.e. when all the components in the other factors are held fixed). For example, the multiplication map

$$(x_1,x_2,\ldots,x_n) \mapsto x_1x_2\ldots x_n$$

is an n-multilinear map $R^n = R \times R \times \ldots \times R \to R$. As another example, consider the space of all $n \times n$ matrices regarded as $R^n \times R^n \times \ldots \times R^n$ ($n$ times), each factor corresponding to one column vector. Then the determinant map is an n-multilinear map

$$R^n \times R^n \times \ldots \times R^n \to R.$$

As special cases we note that a linear map is a 1-multilinear map and a bilinear map is a 2-multilinear map. Observe that for $n > 1$ an n-multilinear map is in general not linear, since the effect of multiplying something on the left hand side by a scalar $\alpha$ is to multiply in $F$ by $\alpha^n$.
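The determinant example can be tried out directly. The sketch below uses a hypothetical helper `det` acting on a list of column vectors, with arbitrary illustrative columns. It checks linearity in the first column with the other columns held fixed, and then the $\alpha^n$ scaling that shows the determinant is not a linear map of the whole matrix when $n > 1$.

```python
def det(cols):
    """Determinant of the matrix whose columns are `cols` (cofactor expansion
    along the first row)."""
    n = len(cols)
    if n == 1:
        return cols[0][0]
    total = 0.0
    for j in range(n):
        # delete row 0 and column j
        minor = [c[1:] for k, c in enumerate(cols) if k != j]
        total += (-1) ** j * cols[j][0] * det(minor)
    return total

# Arbitrary illustrative column vectors in R^3 (assumptions).
u = [1.0, 2.0, 0.0]
w = [0.0, -1.0, 3.0]
c2 = [2.0, 1.0, 1.0]
c3 = [1.0, 0.0, 4.0]

# Linearity in the first factor, the other two columns held fixed.
a, b = 2.5, -1.5
mixed = [a * ui + b * wi for ui, wi in zip(u, w)]
linear_gap = abs(det([mixed, c2, c3])
                 - (a * det([u, c2, c3]) + b * det([w, c2, c3])))

# Scaling every column by alpha multiplies the value by alpha**n (n = 3 here).
alpha = 3.0
scaled = [[alpha * t for t in c] for c in (u, c2, c3)]
scale_gap = abs(det(scaled) - alpha ** 3 * det([u, c2, c3]))
```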

Consider now the space $L(E,L(E,F))$, or more generally $L(E_1,L(E_2,F))$. An element $L$ of this space is a map associating to every $x_1$ in $E_1$ an element $L(x_1)$ in $L(E_2,F)$. Now $L(x_1)$ associates to every $x_2$ in $E_2$ an element $L(x_1).x_2$ of $F$. Writing $\tilde L(x_1,x_2)$ instead of $L(x_1).x_2$ we have constructed a map

$$\tilde L : E_1 \times E_2 \to F$$

and since $L$ and $L(x_1)$ were linear this map is easily seen to be bilinear. Furthermore, since $L$ and $L(x_1)$ are continuous it can be shown that $\tilde L$ is continuous, where $E_1 \times E_2$ is equipped with the product topology (see page 15).

Conversely, if we start with a bilinear map $\tilde L : E_1 \times E_2 \to F$ then we can define a corresponding linear map

$$L : E_1 \to \mathrm{Lin}(E_2,F)$$

by taking $L(x_1)$ to be the linear map taking $x_2$ to $\tilde L(x_1,x_2)$. Now since we have assumed nothing about continuity of $\tilde L$ to begin with, there is no reason to expect that $L$ will be continuous, or, if it is, that $L(x_1)$ will be continuous for each $x_1$ in $E_1$. However, if we do assume that $\tilde L$ is continuous then it turns out that $L(x_1)$ and $L$ are continuous and so $L$ belongs to $L(E_1,L(E_2,F))$. Therefore we see that $L \mapsto \tilde L$ gives a natural bijection between $L(E_1,L(E_2,F))$ and the space $M_2(E_1 \times E_2;F)$ of continuous bilinear maps $E_1 \times E_2 \to F$, and moreover it is routine to verify that when $M_2$ is made into a linear space by pointwise addition and scalar multiplication this bijection is a linear isomorphism.

In a similar way, the space

$$L(E_1,L(E_2,L(E_3,\ldots,L(E_{n-1},L(E_n,F))\ldots)))$$

can be seen to be linearly isomorphic to the space

$$M_n(E_1 \times E_2 \times \ldots \times E_n;F)$$

of continuous n-multilinear maps with pointwise addition and scalar multiplication: the correspondence is $L \mapsto \tilde L$ where

$$\tilde L(x_1,x_2,\ldots,x_n) = (\ldots((L(x_1))(x_2))\ldots(x_{n-1}))x_n.$$
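Readers who program may recognise this correspondence as currying. The following sketch shows the two directions for the bilinear case, under the assumption that maps are modelled as Python functions on lists of floats; the names `curry`, `uncurry` and the matrix `Q` are illustrative only. Continuity plays no role in this finite-dimensional toy version.

```python
def uncurry(L):
    """From L : E1 -> L(E2, F) build the bilinear map (x1, x2) -> (L(x1))(x2)."""
    return lambda x1, x2: L(x1)(x2)

def curry(B):
    """From a bilinear map B build the map x1 -> (x2 -> B(x1, x2))."""
    return lambda x1: (lambda x2: B(x1, x2))

# An illustrative bilinear map on R^2 x R^2 given by a matrix Q (an assumption).
Q = [[1.0, 2.0], [0.0, -3.0]]

def B(u, v):
    return sum(u[i] * Q[i][j] * v[j] for i in range(2) for j in range(2))

L = curry(B)             # element of L(E1, L(E2, F))
B_again = uncurry(L)     # back to a bilinear map on E1 x E2

u, v = [1.0, 2.0], [-1.0, 0.5]
value_curried = L(u)(v)
value_bilinear = B_again(u, v)
```

Going from `B` to `L` and back returns a map with the same values, which is the finite-dimensional shadow of the bijection described above.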

This now gives us a much less clumsy notation for dealing with derivatives. Instead of

$$(\ldots((D^rf(x)(h_1))(h_2))\ldots(h_{r-1}))h_r$$

we simply write

$$D^rf(x)(h_1,h_2,\ldots,h_r)$$

thinking of $D^rf$ here as a map

$$U \to M_r(E \times E \times \ldots \times E;F).$$

When $E = R^n$ and $F = R$, so that we are dealing with a real-valued function of $n$ variables, the expression for $D^rf(x)(h,h,\ldots,h)$ becomes

$$\sum a_{ij\ldots p}(x)\, h_ih_j\ldots h_p \quad\text{where}\quad a_{ij\ldots p} = \frac{\partial^rf}{\partial x_i\partial x_j\ldots\partial x_p}$$

and the summation is over all r-tuples (with possible repeats) of the integers from $1$ to $n$. When $n = 1$ this reduces to

$$D^rf(x)(h,h,\ldots,h) = f^{(r)}(x)\,h^r$$

where $f^{(r)}$ is the usual $r$th derivative of a function of one variable. Formally, we see that the relationship between the interpretation of $f^{(r)}(x)$ as a number and the abstract interpretation of $D^rf(x)$ as a multilinear map is that

$$f^{(r)}(x) = D^rf(x)(1,1,\ldots,1).$$

In other words, we need only one coefficient to describe the effect of an r-multilinear map $M : R \times R \times \ldots \times R \to R$ on $(h,h,\ldots,h)$, and when $M = D^rf(x)$ this coefficient is $f^{(r)}(x)$.

We remark here that, as with linear maps, every multilinear map is automatically continuous when we work with finite-dimensional spaces. Therefore explicit references to the continuity of $D^rf(x)$ as a multilinear map can be conveniently dispensed with when $f$ is a map between euclidean spaces.

The particular case $r = 2$. Here $D^2f(x)$ is thought of as a bilinear map $E \times E \to F$. When $E = R^n$ and $F = R$ then $D^2f(x)(\cdot,\cdot)$ is a quadratic form on $R^n$. Given a choice of basis in $R^n$ any such bilinear map $B : R^n \times R^n \to R$ can be represented by a matrix $Q = (q_{ij})$, where

$$B(u,v) = u^TQv = \sum_{i,j=1}^{n} u_iq_{ij}v_j.$$

When $B = D^2f(x)$ this matrix $Q$ (now depending on $x$) is the matrix of second partial derivatives of $f$ at $x$:

$$\begin{pmatrix} \dfrac{\partial^2f}{\partial x_1\partial x_1}(x) & \dfrac{\partial^2f}{\partial x_1\partial x_2}(x) & \cdots & \dfrac{\partial^2f}{\partial x_1\partial x_n}(x) \\ \vdots & \vdots & & \vdots \\ \dfrac{\partial^2f}{\partial x_n\partial x_1}(x) & \dfrac{\partial^2f}{\partial x_n\partial x_2}(x) & \cdots & \dfrac{\partial^2f}{\partial x_n\partial x_n}(x) \end{pmatrix}$$

This is called the Hessian matrix of $f$ at $x$. It depends on the choice of co-ordinates in $R^n$, whereas the original bilinear map

$$D^2f(x) : E \times E \to R$$

does not.

It is well known that for a point $x$ to be a local maximum or minimum for $f$ it is necessary that the partial derivatives

$$\frac{\partial f}{\partial x_1}(x),\ \frac{\partial f}{\partial x_2}(x),\ \ldots,\ \frac{\partial f}{\partial x_n}(x)$$

should all vanish, and then in order to determine whether $x$ is a maximum or is a minimum it is necessary to look closely at the properties of the Hessian matrix at $x$. We shall return to this in detail in §2.9.

It is a standard fact from elementary calculus that when $f : R^n \to R$ is $C^2$ we always have

$$\frac{\partial^2f}{\partial x_i\partial x_j}(x) = \frac{\partial^2f}{\partial x_j\partial x_i}(x), \qquad i,j = 1,2,\ldots,n,$$

which means that the Hessian matrix is symmetric. This is the same as saying that the bilinear map

$$D^2f(x) : R^n \times R^n \to R$$

is symmetric, i.e.

$$D^2f(x)(u,v) = D^2f(x)(v,u)$$

for all $u,v$ in $R^n$. More generally, if $f$ is $C^r$ then each $r$th partial derivative is independent of the order of differentiation. The corresponding co-ordinate-free version of this is that the r-multilinear map $D^rf(x)$ is symmetric, i.e. independent of the relative ordering of the components of $E \times E \times \ldots \times E$. A proof of this fact can be found in Dieudonné [32], for example.

Note in passing that of the two examples of multilinear maps given above, the multiplication example is symmetric but the determinant example is not.
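The symmetry of the mixed second partial derivatives can be observed numerically. The sketch below assumes an illustrative $C^2$ function $f(x,y) = \sin(xy) + x^3y$, which is not taken from the text. Its first partial derivatives are written out by hand, and each is then differentiated once more by central differences in the other variable, so that the two orders of differentiation are genuinely computed separately.

```python
import math

# First partial derivatives of f(x, y) = sin(x*y) + x**3 * y, computed by hand.
def fx(x, y):
    return y * math.cos(x * y) + 3 * x ** 2 * y

def fy(x, y):
    return x * math.cos(x * y) + x ** 3

eps = 1e-6
x0, y0 = 0.4, -1.1

# d/dy of (df/dx) and d/dx of (df/dy), each by a central difference.
d_y_fx = (fx(x0, y0 + eps) - fx(x0, y0 - eps)) / (2 * eps)
d_x_fy = (fy(x0 + eps, y0) - fy(x0 - eps, y0)) / (2 * eps)
gap = abs(d_y_fx - d_x_fy)

# Closed form of the common value, for comparison.
exact = math.cos(x0 * y0) - x0 * y0 * math.sin(x0 * y0) + 3 * x0 ** 2
```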

Remarks

1. If $f(x) = (f_1(x),f_2(x),\ldots,f_m(x))$ where $x$ is in $R^n$ then it is straightforward to prove that $f$ is $C^r$ ($r \ge 1$) precisely when all the $r$th mixed partial derivatives of all the $f_i$ exist and are continuous. (See Remark 2, §2.4.)

2. An n-multilinear map $E \times E \times \ldots \times E \to R$ is sometimes called a covariant tensor of rank $n$ on $E$. When $E = R^m$ a covariant tensor of rank $n$ is specified by $m^n$ coefficients.

3. In the discussion of multilinear maps we have avoided mentioning topologies for $M_n(E_1 \times E_2 \times \ldots \times E_n;F)$. Strictly speaking we can manage without these since, as we showed, the spaces $M_n$ are just alternative versions of the complicated spaces of continuous linear maps for which at least in principle we already have topologies derived from norms: see §2.3. Nevertheless, it is useful to see how to make $M_n$ into a normed linear space explicitly. By analogy with the theory for continuous linear (= 1-multilinear) maps, it can be shown easily that an n-multilinear map

$$M : E_1 \times E_2 \times \ldots \times E_n \to F$$

is continuous if and only if there exists a constant $K$ such that

$$\|M(x)\| \le K\,\|x_1\|\,\|x_2\| \ldots \|x_n\|$$

for all $x = (x_1,x_2,\ldots,x_n)$. If we then take $\|M\|$ to be the least such $K$, i.e.

$$\|M\| = \sup\{\|M(x)\| \text{ with } \|x_i\| = 1 \text{ for } i = 1,2,\ldots,n\}$$

then this does define a norm for $M_n(E_1 \times E_2 \times \ldots \times E_n;F)$. What is more, this is in a sense the most reasonable norm to choose, since the natural bijection between $M_n$ and the corresponding space of linear maps $L(E_1,L(E_2,\ldots))$ etc. is then not only a linear isomorphism but is also norm-preserving.

Properties of higher derivatives

Corresponding to the properties of the derivative mentioned in §2.5 we have the following properties of higher derivatives:

(i) Linear combinations of $C^r$ maps are $C^r$, with

$$D^r(\alpha f + \beta g) = \alpha D^rf + \beta D^rg.$$

(ii) Linear maps $L$ are $C^\infty$, with $DL(x) = L$ for every $x$, and $D^rL = 0$ for all $r \ge 2$. This is because $DL(x)$ does not vary with $x$.

(iii) Bilinear maps $B$ are $C^\infty$, with $DB$ as given in §2.5 and $D^2B$ given by

$$D^2B(x)((h_1,h_2),(k_1,k_2)) = B(h_1,k_2) + B(k_1,h_2)$$

for all $x$. To see this, observe that $DB(x)$ is linear in $x$. This means that

$$D^2B(x) = D(DB)(x) = DB$$

by the above and so

$$D^2B(x)(h,k) = (D^2B(x).h).k = DB(h).k$$

by definition, as claimed. Since $D^2B(x)$ does not vary with $x$ we have $D^rB = 0$ for all $r \ge 3$.

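In a similar way it follows that n-multilinear maps $M$ are $C^\infty$, and

$$D^nM(x)(h,k,\ldots,p) = \sum_{\sigma} M(h_{\sigma(1)},k_{\sigma(2)},\ldots,p_{\sigma(n)})$$

where there are $n$ entries $h,k,\ldots,p$ and the sum is over all permutations $\sigma$ of $\{1,2,\ldots,n\}$, each argument being placed in the factor indicated by its subscript. Since this is independent of $x$ it follows that $D^rM = 0$ for all $r \ge n+1$.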

(iv) The cartesian product of two $C^r$ maps is $C^r$, and all derivatives just operate co-ordinate-wise.

(v) The composition of two $C^r$ maps is $C^r$. This comes from the Chain Rule together with induction on $r$. We have

$$D(g \circ f)(x) = Dg(f(x)).Df(x)$$

so if $f$ and $g$ are $C^r$ then $Df$ and $Dg$ are $C^{r-1}$ and the induction hypothesis (that the result holds for $r-1$) ensures that $Dg(f(x))$ is a $C^{r-1}$ function of $x$. Then we use the bilinearity of the map

$$L(E,F) \times L(F,G) \to L(E,G) : (A,B) \mapsto B \circ A$$

and the Chain Rule and induction hypothesis again to see that $D(g \circ f)$ is $C^{r-1}$ and so $g \circ f$ is $C^r$. There is a formula for $D^r(g \circ f)$ in terms of derivatives of $f$ and $g$, but it is cumbersome and not worth trying to memorize.

2.7 GERMS AND JETS

If $f$ and $g$ are two maps which agree on some neighbourhood of a point $x$ then all their derivatives at that point are the same. Therefore if we are interested in trying to deduce the local behaviour of a map from information about its derivatives at $x$ we need not be concerned with the precise nature of the map away from $x$ but could equally well consider any map coinciding with $f$ on some neighbourhood of $x$. This leads naturally to the idea of the germ of a map.

Let $T$ be a topological space and let $S$ be any set. Let $f : U \to S$ and $g : V \to S$ be maps with domains $U,V$ open sets in $T$, and suppose $x$ lies in $U \cap V$. Then $f$ and $g$ are said to be germ-equivalent at $x$ if there exists some open neighbourhood $W$ of $x$ lying inside $U \cap V$ such that $f|W = g|W$, i.e. $f$ and $g$ coincide on $W$. This is an equivalence relation on the set of all maps defined on neighbourhoods of $x$ in $T$ and with values in $S$, and the equivalence classes are called germs of maps at $x$. If $S$ is a topological space also then we can consider germs of continuous maps, and if $S,T$ are normed linear spaces $E,F$ we can consider germs at $x$ of $C^r$ maps. We denote the set of these germs by $\mathcal{E}^r_x(E,F)$.

In any of these cases, if $S$ is a linear space we can define pointwise addition and scalar multiplication of maps from a given domain into $S$, and this goes through to the germ level to give the set of germs at $x$ the structure of a linear space. If $S$ also has a multiplicative structure which interacts reasonably with addition and scalar multiplication then this, too, will go through to a similar structure on the set of germs at $x$. In particular, if $S$ is a ring or an algebra (whose definitions we will not go into here) then the set of germs at $x$ becomes a ring or an algebra.

The cases which will mainly interest us are those for which $T = R^n$ and $S = R^m$. Taking $x$ without loss of generality to be the origin in $R^n$, we see that the set $\mathcal{E}^r_0(R^n,R^m)$ forms a linear space since the codomain $R^m$ does and linear combinations of $C^r$ maps are $C^r$ (property (i) in §2.6). In the case $m = 1$ the codomain $R$ is a ring and even an algebra, and so $\mathcal{E}^r_0(R^n,R)$ forms a ring (algebra): here we use properties (iii) and (v) of §2.6 to show that the pointwise product of two $C^r$ functions is $C^r$. The ring $\mathcal{E}^\infty_0(R^n,R)$ of $C^\infty$ germs is often denoted simply $\mathcal{E}_n$ or $\mathcal{E}(n)$. When studying local properties of $C^\infty$ functions of $n$ variables we will really be studying properties of elements of $\mathcal{E}_n$.

As we have said, if two $C^\infty$ maps are germ-equivalent at $x$ then all their derivatives at $x$ are the same. This means that we can talk about the derivatives at $x$ of a given germ, rather than of a specific map representing that germ. The basic question then is: to what extent does knowledge of all the derivatives of a germ at $x$ provide knowledge about the germ itself? Can the germ be reconstructed, given all its derivatives at $x$?

The most elementary result in this direction is the Mean Value Theorem (§2.5) which for $f : R \to R$ states that

$$f(x+h) = f(x) + hf'(\xi)$$

for some $\xi$ lying between $x$ and $x+h$. Of course, this involves $f'(\xi)$ which is not a derivative at $x$. Therefore

$$f(x+h) = f(x) + hf'(x) + \text{error term}.$$

Further manipulations with the Mean Value Theorem (as in any standard book on calculus) show that the error term can be expressed as

$$\frac{h^2}{2!}f''(\xi_2)$$

for some $\xi_2$ lying between $x$ and $x+h$. Again, this involves a derivative elsewhere than at $x$, and so if we insist on working with derivatives at $x$ we must write

$$f(x+h) = f(x) + hf'(x) + \frac{h^2}{2!}f''(x) + \text{error term}.$$

Once more the Mean Value Theorem can be used to show that the error term will have the form

$$\frac{h^3}{3!}f'''(\xi_3)$$

for some $\xi_3$ between $x$ and $x+h$. This process of pushing the error term further to the right can be continued indefinitely, although there is no reason to believe that the error term will become any smaller as this happens. Indeed, it can easily get larger.
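A concrete instance of a growing error term, offered here as an illustration that is not taken from the text: for $f(x) = \log(1+x)$ expanded about $x = 0$ with $h = 1.5$, the successive Taylor polynomials move further away from $f(h)$ as more terms are included. The helper name `taylor_log1p` is hypothetical.

```python
import math

def taylor_log1p(h, n):
    """Taylor polynomial of degree n for log(1 + h) about 0:
    h - h^2/2 + h^3/3 - ... up to the h^n term."""
    return sum((-1) ** (k + 1) * h ** k / k for k in range(1, n + 1))

h = 1.5
true_value = math.log(1 + h)

# Error |f(h) - T_n(h)| for an increasing number of terms.
degrees = (5, 10, 20)
errors = [abs(true_value - taylor_log1p(h, n)) for n in degrees]
```

For $|h| < 1$ the same computation gives errors that shrink instead, so whether pushing the error term to the right helps depends on both the function and the size of $h$.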

The true state of affairs is described by Taylor's Theorem, which says that if we write

$$f(x+h) = f(x) + hf'(x) + \frac{h^2}{2!}f''(x) + \ldots + \frac{h^n}{n!}f^{(n)}(x) + \text{error term}$$

then the error term, or remainder term, has the form

$$\frac{h^{n+1}}{(n+1)!}f^{(n+1)}(\xi)$$

... the error term $R_n(h)$ has the form

$$\frac{1}{(n+1)!}D^{n+1}f(\xi)(h,h,\ldots,h)$$

for some point $\xi$ between $x$ and $x+h$.

Remember that here $D^rf(x)$ is an r-multilinear map

$$E \times E \times \ldots \times E \to R$$

and so $D^rf(x)(h,h,\ldots,h)$ is a real number. Recall also that when $E$ is euclidean space $R^n$ we can for purposes of calculations write $D^rf(x)(h,h,\ldots,h)$ as

$$\sum \frac{\partial^rf}{\partial x_i\partial x_j\ldots\partial x_p}(x)\, h_ih_j\ldots h_p$$

where the sum is over all ordered r-tuples $i,j,\ldots,p$ of the integers from $1$ to $n$.

The proof of the general version of the theorem is surprisingly simple, given the one-dimensional version. We write

$$F(t) = f(x + th)$$

and apply the theorem to $F$ at $t = 0$. First of all

$$F'(t) = DF(t).1 = Df(x + th).h$$

by the Chain Rule, and then since the map $L(E,R) \to R$ taking $A$ to $A.h$ is linear for each fixed $h$ the Chain Rule applied again gives

$$F''(t) = D^2f(x + th)(h,h)$$

and in general

$$F^{(r)}(t) = D^rf(x + th)(h,h,\ldots,h).$$

Now writing

$$F(1) = F(0) + F'(0) + \frac{1}{2!}F''(0) + \ldots + \frac{1}{n!}F^{(n)}(0) + \frac{1}{(n+1)!}F^{(n+1)}(\xi)$$

gives the result stated for $f$.

If it is the case that $R_n(h)$ does approach zero as $n$ increases then we are, by definition of convergence for infinite series, entitled to write

$$f(x+h) = f(x) + Df(x).h + \frac{1}{2!}D^2f(x)(h,h) + \ldots = \sum_{n=0}^{\infty} \frac{1}{n!}D^nf(x)(h,h,\ldots,h),$$

where $D^0f$ means $f$. The infinite series on the right is called the Taylor series for (the germ of) $f$ at $x$.

This definition can be generalized directly to maps $f : E \to F$, since the series itself (now a series of elements in $F$) does not depend on the Mean Value Theorem. If $f$ is $C^\infty$ the Taylor series always exists, but

(1) it may diverge, or

(2) it may converge, but to something other than $f(x+h)$.

If neither of these happens, and the Taylor series converges to $f(x+h)$ for all $h$ in some neighbourhood of $0$, then $f$ is said to be analytic at $x$. For $E = R^n$, $F = R^m$ this means that $f_i(x+h)$ can be expressed as a convergent power-series in the co-ordinates of $h$ for sufficiently small $h$, for each component $f_i$ of $f$. Analyticity is sometimes called class $C^\omega$.

Analytic maps are convenient to work with, since we can invoke the whole machinery of differentiation of power-series term by term, inversion of power series, etc., of which details can be found in almost any analysis text. Furthermore, powerful results from the theory of complex functions can also be used. What is perhaps even more significant for practical purposes is the fact that if we replace $f(x+h)$ by

$$f(x) + Df(x).h + \frac{1}{2!}D^2f(x)(h,h) + \ldots + \frac{1}{n!}D^nf(x)(h,h,\ldots,h)$$

which is a polynomial in $h$, then we can be confident that by choosing $n$ large enough the error in taking this approximation will be insignificantly small.

Despite these enticing advantages of analytic functions, implicitly relied upon in much mathematical modelling in physics, engineering and so on, there are two serious objections - one practical and one philosophical - to working with analytic functions. The practical objection is that functions do arise in real life which look harmless

but are in fact not analytic even though they are $C^\infty$. A standard example is the function $f : R \to R$ given by $f(x) = \ldots$ for $x > 0$ and $f(x) = 0$ for $x \le 0$. This is not analytic at $0$, ... (by induction) ...

(f(u,v),v)

(I =

Inverse Function Theorem 0

:

with

has matrix

which is non-singular

germ at

F

Rn = Rm x Rn m

Hence

f

L(u,v)

is right equivalent to

is

A,

proving

Corollary 1.

Proof of Corollary 2

Again we suppose $x = 0$ in $R^n$, $f(x) = 0$ in $R^m$ and we choose co-ordinates such that $A$ has matrix

$$\begin{pmatrix} B \\ C \end{pmatrix}$$

where now $B$ is an $n \times n$ non-singular matrix and $C$ is an $(m-n) \times n$ matrix. We write $R^m$ as $R^n \times R^{m-n}$ with corresponding co-ordinates $(x,y)$ and this time define a map $G : R^m \to R^m$ by

$$G(x,y) = f(x) + (0,y).$$

Then $DG(0) = M$ has matrix

$$\begin{pmatrix} B & 0 \\ C & I \end{pmatrix}$$

which is non-singular and so by the Inverse Function Theorem $G$ is locally a diffeomorphism. Let $\beta$ be the germ at $0$ of $M \circ G^{-1}$. Then

$$M(x,0) = \beta G(x,0) = \beta f(x)$$

giving $\beta \circ f = A$ since

$$M(x,0) = \begin{pmatrix} Bx \\ Cx \end{pmatrix} = Ax.$$

Hence $f$ is left equivalent to $A$, proving Corollary 2. See Figure 16.

Figure 16

The condition that $A$ is injective means that the kernel of $A$ is zero, i.e. the only n-vector $u$ satisfying $Au = 0$ is the zero vector. This means that the only linear combination

$$\sum_{i=1}^{n} u_ic_i$$

of the columns $c_i$ of $A$ (as a matrix) which is zero is the one with all $u_i = 0$, or in other words the columns of $A$ are linearly independent. Correspondingly, $A$ is surjective precisely when the rows of $A$ are linearly independent. The rank of a matrix is the largest number of linearly independent rows, or alternatively, columns (the answer is the same). Thus the two corollaries above can be summarized as follows:

If $Df(x)$ has maximum rank then $f$ looks near $x$ like the linear map $Df(x)$ near $0$.

Remarks

1. In infinite dimensions we cannot talk about ranks, so we must stick to the first formulation of the corollaries. We also have to add in an extra condition to ensure that the splitting of $E$ or $F$ into two suitable 'co-ordinates' $(x,y)$ is still valid: then the proofs are the same as in the finite-dimensional case. Explicitly, the condition is that in Corollary 1 we should be able to split $E$ into a direct sum $E_1 \oplus E_2$ (see page 42) where $E_1,E_2$ are closed linear subspaces of $E$ with $E_1 = \ker A$ and where $E_2$ is taken by $A$ isomorphically onto $F$; similarly in Corollary 2 we should be able to split $F$ into $F_1 \oplus F_2$ where $A$ takes $E$ isomorphically onto $F_1$. These splitting conditions can be shown to be automatically satisfied when $F$ is finite-dimensional in Corollary 1 or $E$ is finite-dimensional in Corollary 2.

2. In the context of Corollary 1 we can define a $C^r$ map (germ) $\phi : R^{n-m} \to R^m$ by

$$\phi(y) = \text{first co-ordinate of } \gamma(-B^{-1}Cy,\,y),$$

where the inspiration for the expression on the right is that the linear subspace of $R^n$ consisting of all elements of the form $(-B^{-1}Cy,\,y)$ is precisely the kernel of $A$. It then follows that

$$f(\phi(y),y) = f\gamma(-B^{-1}Cy,\,y)$$

because $\gamma$ preserves the $y$ co-ordinate (because $L$ and $F$ do),

$$= A(-B^{-1}Cy,\,y) = 0$$

by design. This version of Corollary 1 is called the Implicit Function Theorem, since it says that if

$$f : R^n = R^m \times R^{n-m} \to R^m$$

is a $C^r$ map with $f(0) = 0$ and such that the derivative of $f$ restricted to the first factor is an isomorphism (represented by the matrix $B$ above), then there exists a $C^r$ map $\phi : R^{n-m} \to R^m$ satisfying

$$f(\phi(y),y) = 0$$

for all $y$ in some neighbourhood of the origin, or in other words putting $f = 0$ gives the first co-ordinate as an implicit function of the second co-ordinate.
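A small numerical illustration of the Implicit Function Theorem in the simplest case $n = 2$, $m = 1$ may help. The function `f` below is an illustrative choice, not taken from the text: it satisfies $f(0,0) = 0$ and its derivative with respect to the first co-ordinate never vanishes. The implicit function, here called `phi` (a hypothetical name), is computed by Newton's method in the first co-ordinate for each fixed `y`.

```python
def f(x, y):
    # illustrative function with f(0, 0) = 0 (an assumption, not from the text)
    return x + x ** 3 + y - y ** 2

def df_dx(x, y):
    # derivative of f restricted to the first co-ordinate; never zero
    return 1 + 3 * x ** 2

def phi(y, steps=50):
    """Solve f(x, y) = 0 for x near 0 by Newton's method in x."""
    x = 0.0
    for _ in range(steps):
        x -= f(x, y) / df_dx(x, y)
    return x

ys = [-0.2, -0.1, 0.0, 0.1, 0.2]
residuals = [abs(f(phi(y), y)) for y in ys]   # f(phi(y), y) should vanish
```

Newton's method is only one convenient way to evaluate the implicit function; the theorem itself asserts existence and smoothness, not a particular algorithm.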

Note that when

n = 2, m = 1

and the derivative condition is simply that



Rm

Rn.

DEFINITION A singular point of $f$ is a point $x$ at which $Df(x)$ does not have maximal rank, i.e. where $Df(x)$ is neither injective nor surjective. The germ of $f$ at a singular point is called a singularity.

Thus at a singular point the derivative may not give a good local qualitative picture of the map itself.

Remark The term singularity is used in many different fields to mean a variety of different things - such as a point where $f$ is not differentiable, or not continuous, or not defined. It may also refer to a region of space with bizarre geometric properties, such as a 'black hole' in cosmology. Here we will keep the word to mean the germ of a differentiable map at a singular point.

EXAMPLES of singularities

1. In the case $n = m = 1$ the derivative $Df(x)$ is a linear map $R \to R$ which must either be surjective or be the zero map. Thus a singular point of $f : R \to R$ is a point $x$ where $f'(x) = 0$, called a critical point or stationary point. For example, $f : R \to R$ given by $f(x) = x^2$ has a singularity at $x = 0$ and nowhere else.

2. The previous example generalizes immediately to the case $m = 1$, any $n$, i.e. to functions $f : R^n \to R$. Here the derivative $Df(x)$ is the linear map $R^n \to R$ represented by the $1 \times n$ matrix

$$\left(\frac{\partial f}{\partial x_1},\ \frac{\partial f}{\partial x_2},\ \ldots,\ \frac{\partial f}{\partial x_n}\right)$$

and so the only way in which it can fail to have maximal rank (= 1) is by the vanishing of all the partial derivatives $\partial f/\partial x_i$. Again, this is called a critical point or stationary point of $f$. A standard example is $f : R^n \to R$ given by

$$f(x) = x_1^2 + x_2^2 + \ldots + x_n^2;$$

this has a singularity at $0$ in $R^n$ but nowhere else. As we shall see later, this is in a certain sense the most 'nonsingular' type of singularity.

3. A constant map $f : R^n \to R^m$ (i.e. $f(x) = c$ in $R^m$ for all $x$ in $R^n$) is the most singular kind of map imaginable. Here the derivative $Df(x)$ is the zero linear map $R^n \to R^m$ for each $x$, and the whole of the structure of $R^n$ is crushed by $f$ into a single point of $R^m$.

4. As a geometrically more interesting kind of singularity consider $f : R^2 \to R^2$ given by

$$f(x_1,x_2) = (x_1,\ x_2^3 + x_1x_2).$$

Here $Df(x)$ is represented by the $2 \times 2$ matrix

$$\begin{pmatrix} 1 & 0 \\ x_2 & 3x_2^2 + x_1 \end{pmatrix}$$

and so the rank is less than maximal precisely when

$$3x_2^2 + x_1 = 0$$

which is the equation of a parabola in the domain $R^2$. The whole map can be described pictorially as in Figure 17.
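The singular set of this map is easy to confirm by direct computation. The sketch below evaluates the determinant of the $2 \times 2$ derivative matrix at sample points on and off the parabola $3x_2^2 + x_1 = 0$; the sample points and the function name `jacobian_det` are illustrative choices.

```python
def jacobian_det(x1, x2):
    """Determinant of Df(x) = [[1, 0], [x2, 3*x2**2 + x1]] for the map
    f(x1, x2) = (x1, x2**3 + x1*x2)."""
    return 1 * (3 * x2 ** 2 + x1) - 0 * x2

# Points on the parabola x1 = -3 * x2**2, parametrised by t = x2.
on_parabola = [(-3 * t ** 2, t) for t in (-1.0, -0.5, 0.0, 0.5, 2.0)]
# A few points off the parabola.
off_parabola = [(1.0, 0.0), (0.0, 1.0), (-1.0, 2.0)]

dets_on = [jacobian_det(x1, x2) for x1, x2 in on_parabola]
dets_off = [jacobian_det(x1, x2) for x1, x2 in off_parabola]
```

On the parabola the determinant vanishes, so the rank of $Df(x)$ drops from 2 to 1; it cannot drop to 0 because the top-left entry is always 1.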

Critical points of real-valued functions

We will now return to the case when $m = 1$, in other words when $f$ is one real function of $n$ variables and singular points mean critical points, i.e. points at which all the partial derivatives of $f$ vanish.

To understand how $f$ behaves near a critical point, it is natural to look at the next available 'approximate' information about $f$, namely its second derivative. Recall that this second derivative $D^2f(x)$ can be regarded as a symmetric bilinear map

$$D^2f(x) : R^n \times R^n \to R$$

or as a quadratic form, represented by the Hessian matrix

$$H_f(x) = \left(\frac{\partial^2f}{\partial x_i\partial x_j}(x)\right)$$

of $f$ at $x$. From experience with the Inverse Function Theorem it seems reasonable that we will be able to extract the most local information about $f$ from its second derivative when this Hessian matrix is non-singular.

DEFINITION A critical point $x$ of $f : U \to R$ is called degenerate or non-degenerate according to whether the determinant of the Hessian matrix of $f$ at $x$ does or does not vanish.

Remarks

1. The Hessian matrix itself depends on a choice of co-ordinates in $R^n$, but the vanishing or otherwise of its determinant does not. This is easy to verify by applying a co-ordinate change and then using the Chain Rule twice. However, the real reason for this is that the vanishing or non-vanishing of the determinant of $H_f(x)$ corresponds to the degeneracy or non-degeneracy of the bilinear map $D^2 f(x)$. An arbitrary bilinear map $B : R^n \times R^n \to R$ is said to be degenerate if there exists some non-zero $u$ in $R^n$ with the property that $B(u,v) = 0$ for every $v$ in $R^n$; otherwise it is non-degenerate. Thus the critical point $x$ of $f$ is degenerate or non-degenerate according to whether $D^2 f(x)$ is a degenerate or non-degenerate bilinear map.

2. In view of the previous remark, we can extend the theory of critical points to functions $f : E \to R$ on any normed linear space $E$. The point $x$ is a critical point of $f$ when $Df(x)$ is the zero linear map, and is degenerate or non-degenerate according to the degeneracy or non-degeneracy of the bilinear map $D^2 f(x) : E \times E \to R$.

The Inverse Function Theorem allowed us to change co-ordinates to bring $f$ locally into linear form, provided the derivative was as well-behaved as possible. We might next hope for a theorem saying that even at a critical point we can change co-ordinates to bring $f$ locally into some agreeable standard form, provided that the critical point is non-degenerate. There is such a theorem. It is called the Morse Lemma, after Marston Morse (b. 1892) who gave it as part of a general theory of the structure of real-valued functions, a theory which has had far-reaching implications in differential topology. Proofs may be found in many books, such as Golubitsky and Guillemin [42] or Milnor [82].

THEOREM (Morse Lemma) Suppose $p$ is a non-degenerate critical point of $f$. Then the germ of $f$ at $p$ is right equivalent to the germ at $0$ of a function $\phi$ of the form

$$(x_1, x_2, \ldots, x_n) \mapsto f(p) + \epsilon_1 x_1^2 + \epsilon_2 x_2^2 + \ldots + \epsilon_n x_n^2$$

where each $\epsilon_i$ is $\pm 1$.

It is clear that $\phi$ does have a non-degenerate critical point at the origin, since $D\phi(x) = (2\epsilon_1 x_1, 2\epsilon_2 x_2, \ldots, 2\epsilon_n x_n)$ (as a $1 \times n$ matrix) which vanishes at $x = 0$, and $H_\phi(0)$ is the $n \times n$ diagonal matrix

$$\begin{pmatrix} 2\epsilon_1 & & & \\ & 2\epsilon_2 & & \\ & & \ddots & \\ & & & 2\epsilon_n \end{pmatrix}$$

which is non-singular. The Morse Lemma says that this is the archetypal example, every other function near a non-degenerate critical point being reducible to one of this type by a local change of co-ordinates.

The number of $\epsilon_i$ which are $-1$ is called the index of the critical point $p$. It depends only on $f$ near $p$, and not on the particular way of choosing new co-ordinates; it is the number of negative eigenvalues of $H_f(p)$ or of $D^2 f(p)$ as a linear map $R^n \to L(R^n, R) = R^n$. If the index of $p$ is $k$ we can rearrange co-ordinates so that $\phi$ has (apart from the constant $f(p)$) the form

$$(x_1, x_2, \ldots, x_n) \mapsto x_1^2 + \ldots + x_{n-k}^2 - x_{n-k+1}^2 - \ldots - x_n^2.$$

This allows us to gain a good geometric picture of the behaviour of $\phi$ near $0$, i.e. of $f$ near $p$. There are $n + 1$ possible values of the index (including $k = 0$) and correspondingly $n + 1$ different types of non-degenerate critical point.

n = 1
    k = 0    Local form $x^2$ : minimum
    k = 1    Local form $-x^2$ : maximum

n = 2
    k = 0    Local form $x_1^2 + x_2^2$ : minimum
    k = 1    Local form $x_1^2 - x_2^2$ : saddle point
    k = 2    Local form $-x_1^2 - x_2^2$ : maximum

See Figure 18 for pictures in the case $n = 2$. Note that a saddle point can be thought of as a minimum in one direction coupled with a maximum in another. In general there will be $(n-1)$ different types of saddle point corresponding to $k = 1, 2, \ldots, n-1$.
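As an illustration of the table for $n = 2$, the index can be read off from the signs of the eigenvalues of the Hessian matrix. The sketch below is an addition, not taken from the text; it assumes the Hessian at the critical point is given as a symmetric matrix with entries $a$, $b$, $b$, $c$, and the function names are hypothetical.

```python
import math


def eigenvalues_sym2(a, b, c):
    """Eigenvalues (smaller first) of the symmetric matrix [[a, b], [b, c]]."""
    mean = (a + c) / 2.0
    radius = math.hypot((a - c) / 2.0, b)
    return (mean - radius, mean + radius)


def classify(a, b, c, tol=1e-12):
    """Return (index, type) for a critical point with Hessian [[a, b], [b, c]].

    The index is the number of negative eigenvalues; a zero eigenvalue
    means the critical point is degenerate and the index is not assigned.
    """
    eigs = eigenvalues_sym2(a, b, c)
    if any(abs(ev) < tol for ev in eigs):
        return (None, "degenerate")
    index = sum(1 for ev in eigs if ev < 0)
    return (index, {0: "minimum", 1: "saddle point", 2: "maximum"}[index])


print(classify(2.0, 0.0, 2.0))    # Hessian of x1^2 + x2^2 at 0
print(classify(2.0, 0.0, -2.0))   # Hessian of x1^2 - x2^2 at 0
print(classify(-2.0, 0.0, -2.0))  # Hessian of -x1^2 - x2^2 at 0
print(classify(0.0, 1.0, 0.0))    # Hessian of x1*x2 at 0
print(classify(2.0, 0.0, 0.0))    # Hessian of x1^2 at 0
```

The fourth call shows that $x_1 x_2$ has a saddle point at the origin even though its Hessian is not diagonal; the last call is a degenerate case to which the Morse Lemma does not apply.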

Remarks

1. An immediate deduction from these local forms is that each non-degenerate critical point is isolated, i.e. lies in some neighbourhood containing no other critical point, degenerate or otherwise.

2. The Morse Lemma has an infinite-dimensional version, extremely useful in calculus of variations where it is necessary to consider the behaviour of a function defined on a normed linear space $E$ whose elements are themselves functions. In finite dimensions we could write the (rearranged) canonical form for $\phi$ as

$$x \mapsto f(p) + \|(I-P)x\|^2 - \|Px\|^2$$

where $P : R^n \to R^n$ is projection onto the last $k$ co-ordinates and $I$ denotes the identity map. The infinite-dimensional Morse Lemma states that, provided $E$ is a Hilbert space (page 53), in a neighbourhood of a non-degenerate critical point $p$ the function $f$ can be converted into the above form where $P$ now denotes a perpendicular projection onto some linear subspace of $E$. See Palais [92].

There do exist generalizations of this to Banach

spaces which are not Hilbert spaces:

see Tromba

|44] .

Further study of degenerate critical points

At a degenerate critical point $p$ of $f$ the Hessian matrix $H_f(p)$ is singular, which is equivalent to saying that as a linear map $R^n \to R^n$ the matrix has non-zero kernel, i.e. there exists at least one direction $u$ with $H_f(p).u = 0$. However, it would be reasonable to hope that if we keep perpendicular to such directions we could still apply a kind of Morse Lemma giving a standard quadratic form in some co-ordinates, leaving a completely degenerate (i.e. degenerate in all directions) function of the remaining variables. Such a generalized Morse Lemma, allowing us to choose co-ordinates conveniently so as to split the function locally into the sum of a non-degenerate 'Morse' part and a completely degenerate part, does exist. It was first published in an infinite-dimensional context by Gromoll and Meyer [45].

THEOREM (Splitting Lemma, or Gromoll-Meyer Lemma) Suppose $p$ is a degenerate critical point of $f$. Then the germ of $f$ at $p$ is right equivalent to the germ at $0$ of a function of the form

$$(x_1, x_2, \ldots, x_n) \mapsto f(p) + \epsilon_1 x_1^2 + \ldots + \epsilon_r x_r^2 + \psi(x_{r+1}, \ldots, x_n)$$

where each $\epsilon_i$ is $\pm 1$ and $\psi$ is completely degenerate, i.e. $D^2\psi(0) = 0$ (as well as $D\psi(0) = 0$).

Remark In the infinite-dimensional version when $E$ is a Hilbert space $H$ we let $K$ be the kernel of the second derivative $D^2 f(p)$ regarded as a linear map $H \to L(H,R) = H^*$. (In fact for a Hilbert space $H^* = H$, in much the same way that $R^{n*} = R^n$; see page 247.) We assume that $K$ is finite-dimensional, so that there is only a 'finite-dimensional amount' of degeneracy, and that the image of $D^2 f(p)$ is a closed subspace of $H^*$ (a technical necessity). Each point $x$ can be regarded as having co-ordinates $(w,y)$ where $y$ belongs to $K$ and $w$ is perpendicular to $K$. Then the standard form given by the Lemma is

$$x \mapsto f(p) + \|(I-P)w\|^2 - \|Pw\|^2 + \psi(y)$$

where this time $P$ is a projection inside the space of vectors perpendicular to $K$, and $\psi$ satisfies $D\psi(0) = 0$ and $D^2\psi(0) = 0$ as an element of $L(K,K^*)$ or as a bilinear map $K \times K \to R$.

In view of the Gromoll-Meyer Lemma, the problem of studying the behaviour of degenerate critical points has been narrowed down to the study of critical points at which the second derivative vanishes entirely, i.e. points at which the 2-jet vanishes (ignoring the initial constant $f(p)$). It would be pleasant if there were some kind of 'non-degeneracy' condition on the third derivative in the presence of which we could again convert the function locally into some standard form by a sort of third degree Morse Lemma. Essentially this is the case, but the details of the theory, which generalizes to functions with vanishing 3-jets, 4-jets, etc., are rather more subtle than one might at first suspect. Indeed, it is only in recent years with the work of Thom, Mather, Arnol'd and others that the classification of degenerate critical points under various conditions of relative non-degeneracy has become properly understood at all. The problem can be stated in two parts:

(1) To what extent does the first non-vanishing k-jet of $f$ determine the local behaviour of $f$?

(2) Can the first non-vanishing k-jet (which is a homogeneous polynomial in $x_1, x_2, \ldots, x_n$ of degree $k$) be put into some normal form?

The second problem is one of algebraic geometry, and can be solved more or less easily for a few low values of $k$ and $n$. For example, when $k = 2$ we have the classification of quadratic forms into standard non-degenerate types

$$\epsilon_1 x_1^2 + \epsilon_2 x_2^2 + \ldots + \epsilon_n x_n^2 \qquad (\epsilon_i = \pm 1)$$

which appeared in the Morse Lemma, and degenerate types

$$\epsilon_1 x_1^2 + \epsilon_2 x_2^2 + \ldots + \epsilon_r x_r^2 \qquad (\epsilon_i = \pm 1,\ 0 \le r < n)$$

which featured in the Gromoll-Meyer Lemma. When $k = 3$ and $n = 2$ a cubic form in $x_1, x_2$ factorizes into one or three real factors (which may coincide) giving without much effort the various standard forms

$$x_1^3 + x_2^3, \quad x_1^3 - 3x_1 x_2^2, \quad x_1^2 x_2, \quad x_1^3.$$

The first problem is that of determinacy of the k-jet (see §2.7).

Without going into the depths of the theory, we will give the result in the form of a finite procedure for testing the determinacy of the first non-vanishing jet. We may as well suppose that we are working at the origin in $R^n$, and we shall regard the first non-vanishing jet of a function germ as being a homogeneous polynomial of the appropriate degree $k$. Let $\xi$ be such a k-jet, and for $i = 1, 2, \ldots, n$ let $d_i$ denote $\partial \xi / \partial x_i$, itself either zero or a homogeneous polynomial of degree $k - 1$.

DETERMINACY TEST. The jet $\xi$ as above is determined if and only if every homogeneous polynomial $q$ of degree $k + 1$ in the $x_i$ can be written as

$$q = q_1 d_1 + q_2 d_2 + \ldots + q_n d_n$$

where the $q_i$ are homogeneous polynomials of degree 2.

Since the space of homogeneous polynomials of degree $k$ is a finite-dimensional real linear space of dimension $\frac{(n+k-1)!}{(n-1)!\,k!}$ (the elements $x_1^{m_1} x_2^{m_2} \ldots x_n^{m_n}$ with $m_1 + m_2 + \ldots + m_n = k$ form a basis), the Determinacy Test reduces to a finite combinatorial problem.
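To make the 'finite combinatorial problem' concrete, the test can be phrased as a rank computation: the products $m\,d_i$, with $m$ running over the monomials of degree 2, must span the space of homogeneous polynomials of degree $k + 1$. The following Python sketch is our own illustration under that reading. Polynomials are stored as dictionaries from exponent tuples to coefficients, and all function names are assumptions rather than anything in the text.

```python
from fractions import Fraction
from itertools import combinations_with_replacement


def monomials(n, degree):
    """All exponent tuples (m1, ..., mn) with m1 + ... + mn = degree."""
    result = []
    for combo in combinations_with_replacement(range(n), degree):
        exps = [0] * n
        for i in combo:
            exps[i] += 1
        result.append(tuple(exps))
    return result


def partial(poly, i):
    """Partial derivative of poly with respect to the i-th variable."""
    out = {}
    for exps, coeff in poly.items():
        if exps[i] > 0:
            new = list(exps)
            new[i] -= 1
            out[tuple(new)] = out.get(tuple(new), 0) + coeff * exps[i]
    return out


def times_monomial(poly, mono):
    """Multiply poly by the monomial with exponent tuple mono."""
    return {tuple(e + m for e, m in zip(exps, mono)): c for exps, c in poly.items()}


def rank(rows):
    """Rank of a matrix by exact Gauss-Jordan elimination over the rationals."""
    rows = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(len(rows)):
            if r != rk and rows[r][col] != 0:
                factor = rows[r][col] / rows[rk][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rk])]
        rk += 1
    return rk


def is_determined(jet, n, k):
    """Determinacy Test for a homogeneous k-jet in n variables."""
    target = monomials(n, k + 1)
    column = {m: j for j, m in enumerate(target)}
    rows = []
    for i in range(n):
        d_i = partial(jet, i)
        for mono in monomials(n, 2):
            row = [0] * len(target)
            for exps, coeff in times_monomial(d_i, mono).items():
                row[column[exps]] = coeff
            rows.append(row)
    return rank(rows) == len(target)


print(is_determined({(3, 0): 1, (1, 2): -3}, 2, 3))  # x1^3 - 3*x1*x2^2
print(is_determined({(2, 1): 1}, 2, 3))              # x1^2 * x2
print(is_determined({(2, 0): 1, (0, 2): -1}, 2, 2))  # x1^2 - x2^2
```

Exact rational arithmetic is used so that the rank is not affected by rounding. The sketch handles homogeneous jets only, as in the test stated above.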

EXAMPLES

1. The k-jet of a constant function is not determined since each $d_i = 0$. However, this fact is obvious anyway.

2. If $Df(0) \neq 0$ (i.e. $0$ is not a critical point of $f$) then the 1-jet $\xi$ of $f$ is determined since at least one of the $d_i$ (say $d_1$) is a non-zero number and then every polynomial $q$ of degree $k+1 = 2$ can be written as $d_1$ times a polynomial of degree 2. However, we already knew this fact as a consequence of the Inverse Function Theorem in §2.8.

3. If $Df(0) = 0$ then if $A$ denotes the Hessian matrix $H_f(0)$ we have

$$d_i = \sum_{j=1}^{n} A_{ij} x_j .$$

If $A$ is non-singular (i.e. $0$ is a non-degenerate critical point of $f$) there is a matrix $B = A^{-1}$ such that

$$x_j = \sum_{i=1}^{n} B_{ji} d_i .$$

Now every homogeneous polynomial $q$ of degree 3 can be written as

n

(0)

= 0,

and then take

y = x(l +

(x))

1 /k

.

It is easy to make

argument rigorous.

Observe in contrast:

5. For $k > 1$ the k-jet $(x_1, x_2) \mapsto x_1^k$ is not determined, since $x_2^{k+1}$ cannot be written as

$$q_1 \frac{\partial \xi}{\partial x_1} + q_2 \frac{\partial \xi}{\partial x_2} = q_1 k x_1^{k-1}.$$

The result is reasonable, since $x_1^k$ vanishes everywhere along a line (the $x_2$-axis) whereas no amount of local changes of co-ordinates around $0$ in $R^2$ will alter the fact that e.g. $x_1^k - x_1^{k-1} x_2^2$ vanishes also along a curve touching the $x_2$-axis at the origin. More generally, it is easy to see both from the algebra and the geometry that no k-jet of $f : R^n \to R$ which is independent of one or more of the co-ordinates can possibly be determined.

6. The 3-jet $(x_1, x_2) \mapsto x_1^3 - 3x_1 x_2^2$ is determined since $d_1 = 3x_1^2 - 3x_2^2$, $d_2 = -6x_1 x_2$ and every homogeneous polynomial of degree 4 in $x_1, x_2$ can be written as $q_1 d_1 + q_2 d_2$ with $q_1, q_2$ of degree 2. To see this, observe that $x_1^3 x_2$, $x_1^2 x_2^2$, $x_1 x_2^3$ are already multiples of $d_2$ and that

$$x_1^4 = \tfrac{1}{3} x_1^2 d_1 - \tfrac{1}{6} x_1 x_2 d_2, \qquad x_2^4 = -\tfrac{1}{3} x_2^2 d_1 - \tfrac{1}{6} x_1 x_2 d_2 .$$

However:

7. The 3-jet $(x_1, x_2) \mapsto x_1^2 x_2$ is not determined, since $d_1 = 2x_1 x_2$, $d_2 = x_1^2$ and so every combination of $d_1$ and $d_2$ would have $x_1$ as a factor. Thus $x_2^4$ is not of the form $q_1 d_1 + q_2 d_2$. This non-determinacy can be understood geometrically as follows: $x_1^2 x_2$ vanishes along two lines (the co-ordinate axes) cutting at right angles, whereas $x_1^2 x_2 + x_2^5$, for example, vanishes along the $x_1$-axis only. No local co-ordinate changes in $R^2$ can convert one of these configurations into the other.

So far we have considered determinacy only for homogeneous k-jets, which materialize as the first non-vanishing jet at $0$ of some function germ. It is important also to be able to test determinacy of non-homogeneous jets, since it frequently happens that although the first non-vanishing jet of $f$ may not be determined, the addition of the next term in the Taylor series of $f$ gives a jet which then is determined. The necessary and sufficient test for determinacy of non-homogeneous jets is a little cumbersome, and in many cases it is enough to work with two tests, one necessary and one sufficient, which do not quite meet in the middle. For full details see Poston and Stewart [101].

Let $\xi$ be any (possibly non-homogeneous) k-jet, and define $d_i = \partial \xi / \partial x_i$ as before. In this case $d_i$ may be a non-homogeneous polynomial of any degree $\le k - 1$. Let the minimum degree of a polynomial denote the lowest degree of all its terms.

For example, the minimum degree of

3 2. X2X3 + xp x3 1S four*

Sufficient test. If every homogeneous polynomial $q$ of degree $k + 1$ can be written as

$$q = \sum_{i=1}^{n} q_i d_i + r \qquad (*)$$

where the $q_i$ are polynomials of minimum degree at least 2 and $r$ is a polynomial of minimum degree at least $k + 2$, then $\xi$ is determined.

This is similar to the earlier test (for homogeneous jets), but we now allow the $q_i$ to be non-homogeneous and discard all terms of degree $> k + 1$ which arise.

EXAMPLE We saw previously that $(x_1, x_2) \mapsto x_1^2 x_2$ is not a determined 3-jet. If now we let $\xi$ denote the 4-jet $(x_1, x_2) \mapsto x_1^2 x_2 + x_2^4$, so that $d_1 = 2x_1 x_2$ and $d_2 = x_1^2 + 4x_2^3$, we find that

$$x_2^5 = \tfrac{1}{4} x_2^2 d_2 - \tfrac{1}{8} x_1 x_2 d_1$$

and the other monomials of degree 5 have $x_1$ as a factor and can be handled similarly, so $\xi$ is determined (here $r = 0$). Thus the undetermined 3-jet $x_1^2 x_2$ has been made into a determined 4-jet by adding on $x_2^4$.

Necessary test. If $\xi$ is determined then every homogeneous polynomial $q$ of degree $k + 1$ can be written as (*) where the $q_i$ are polynomials of minimum degree at least 1 and $r$ is as before.

Clearly the sufficient test implies the necessary test, as it should do. However, there is a small gap between the two through which jets can easily fall.

EXAMPLE (Siersma's example

[l 1 ^ 3

Let $\xi$ be the 4-jet

$$(x_1, x_2) \mapsto x_1^3 + x_1 x_2^3 .$$

We have $d_1 = 3x_1^2 + x_2^3$, $d_2 = 3x_1 x_2^2$. A little combinatorial work shows there is no way in which $x_2^5$ can be written as $q_1 d_1 + q_2 d_2$ (modulo terms of degree 6) with $q_1, q_2$ each of minimum degree 2. Thus the Sufficient Test fails and so we suspect that $\xi$ may not be determined. To prove this we must apply the Necessary Test: here we find that any homogeneous polynomial of degree 5 can be written as $q_1 d_1 + q_2 d_2$ with each of $q_1, q_2$ of minimum degree 1 (in particular we see $x_2^5 = x_2^2 d_1 - x_1 d_2$) and so $\xi$ does not fail the Necessary Test and we cannot be sure that it is not determined: it has fallen between the two tests. In fact this $\xi$ is determined, but to prove this needs more technicalities which we will not discuss here.

Interestingly, for this $\xi$ it turns out that every homogeneous polynomial of degree 6 can be written as $q_1 d_1 + q_2 d_2$ (modulo terms of degree 7) where $q_1, q_2$ have minimum degree 2. This means that $\xi$ regarded as a 5-jet is determined, in other words that any function germ of the form

$$(x_1, x_2) \mapsto x_1^3 + x_1 x_2^3 + (\text{terms of degree} \ge 6)$$

is right equivalent to $\xi$. It is in showing that terms of degree 5 also make no difference that the subtlety of this example lies.
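The identity $x_2^5 = x_2^2 d_1 - x_1 d_2$ quoted for the jet $x_1^3 + x_1 x_2^3$ is easily confirmed by machine. The following is a minimal sketch of our own, with polynomials in two variables stored as dictionaries from exponent pairs to integer coefficients; the helper names are hypothetical.

```python
def drop_zeros(poly):
    """Remove terms whose coefficient has cancelled to zero."""
    return {exps: c for exps, c in poly.items() if c != 0}


def add(p, q, sign=1):
    """Return p + sign*q."""
    out = dict(p)
    for exps, c in q.items():
        out[exps] = out.get(exps, 0) + sign * c
    return drop_zeros(out)


def mul(p, q):
    """Return the product p*q."""
    out = {}
    for (a1, a2), c in p.items():
        for (b1, b2), d in q.items():
            key = (a1 + b1, a2 + b2)
            out[key] = out.get(key, 0) + c * d
    return drop_zeros(out)


def partial(poly, i):
    """Partial derivative with respect to x1 (i = 0) or x2 (i = 1)."""
    out = {}
    for exps, c in poly.items():
        if exps[i] > 0:
            new = list(exps)
            new[i] -= 1
            out[tuple(new)] = out.get(tuple(new), 0) + c * exps[i]
    return drop_zeros(out)


xi = {(3, 0): 1, (1, 3): 1}   # x1^3 + x1*x2^3
d1 = partial(xi, 0)           # 3*x1^2 + x2^3
d2 = partial(xi, 1)           # 3*x1*x2^2

x1 = {(1, 0): 1}
x2_squared = {(0, 2): 1}

# x2^2 * d1 - x1 * d2
combination = add(mul(x2_squared, d1), mul(x1, d2), sign=-1)
print(combination)
```

Note that the coefficient $-x_1$ of $d_2$ has minimum degree 1, not 2, which is why this identity serves the Necessary Test but not the Sufficient Test.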

Remarks on the literature.

There are a great many books on linear algebra. A fairly random selection is Cullen [31], Halmos [49], Mirsky [84] (in which linear spaces are called linear manifolds), Nomizu [91], Shields [114]. The basic material on calculus in normed linear spaces can be found in Dieudonné [32], Lang [67] or Loomis and Sternberg [70], and for euclidean space in almost any text on advanced calculus such as Hoffman [58], Rudin [10^ or Spivak [l2| . See also Griffiths and Hilton [43] or Hirsch and Smale [55] for both linear algebra and calculus.

Some references on singularity theory are Eells [36], Golubitsky and Guillemin [42], or the survey by Arnol'd [9]. Determinacy of jets is explored and explained thoroughly in Poston and Stewart [101]. Much of the content of this chapter is also contained in books on differential topology referred to at the end of Chapter 3, and also in Field [160].


3 Differentiable manifolds and maps

3.1 THE CONCEPT OF A DIFFERENTIABLE MANIFOLD

With the topology and calculus developed so far we have assembled a mathematical toolkit for dealing with

1. differentiation in open subsets of euclidean spaces or, more generally, in normed linear spaces

2. global structures which are not necessarily linear.

The next step is to unite these two. As a natural result of the synthesis of local linearity with global non-linearity we arrive at the definition of a differentiable manifold.

Recall that in order to formulate the definition of differentiability at $x$ for a continuous map $f : U \to F$, where $U$ is open in $E$, we only needed to use the linearity of $E$, $F$ locally, i.e. in some neighbourhood of $x$ in $U$ and $f(x)$ in $F$. Therefore if $S$ and $T$ are two topological spaces we can talk about differentiability of a map $f : S \to T$ at a point $x$ in $S$ provided that $S$ and $T$ look locally like normed linear spaces in neighbourhoods of $x$ and $f(x)$. Now 'looks like' in the context of topological spaces means 'is homeomorphic to', so we can express this more formally as follows.

Suppose $S$, $T$ are topological spaces and $f : S \to T$ is a continuous map. Let $x$ be a point of $S$, and suppose that $x$ has an open neighbourhood $U$ in $S$ such that there is a homeomorphism $\phi : U \to U'$ where $U'$ is an open subset of a normed linear space $E$. Suppose also that $f(x)$ has an open neighbourhood $V$ in $T$ with a homeomorphism $\psi : V \to V'$ where $V'$ is an open subset of a normed linear space $F$; assume for simplicity that

so that

f(U) U'

a

in a normed linear space r

a

.

The same

If we put in the further reasonable assumption that $S$ is locally the same everywhere, i.e. that all the $E_\alpha$'s are really just copies of the same normed linear space $E$, we have arrived at the definition of a manifold modelled on $E$.

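DEFINITION A manifold modelled on a normed linear space $E$ is a topological space $S$ which admits a family $\{U_\alpha\}$ of open sets covering $S$, each of which is homeomorphic with an open subset of $E$. Each pair $(U_\alpha, \phi_\alpha)$, where $\phi_\alpha$ is such a homeomorphism, is called a chart.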

The collection

S. f

{(U

a

,

S {(h

... is a homeomorphism and $\{(h^{-1}U_\alpha, \phi_\alpha h)\}$ is one for $T$.

Despite the formal manoeuvring in this last example, we may still feel uncomfortable about trying to do calculus around the edge of a square. We would be right: our instincts detect the fact that manifolds as we have defined them do not yet quite provide the correct setting for global calculus. We will pursue this problem further, after ending the list above with two rather bizarre examples of manifolds.

7. $S = R^2$ with an unusual topology constructed as follows: $U$ is open in $S$ precisely when $U$ intersects every line $\ell$ parallel to the $x_1$-axis in an open subset of $\ell$ (regarded as $R$ with the usual topology). Thus every 'usual' open subset of $R^2$ is open in this topology, but certainly not conversely. For example, the $x_1$-axis is open in $S$ but not in the usual $R^2$. Every point $p$ in $S$ has a neighbourhood homeomorphic to $R$, namely the line through $p$ parallel to the $x_1$-axis. Thus $S$ is a manifold modelled on $R$, not on $R^2$. This manifold is in a certain sense embarrassingly large, as it possesses an uncountably infinite number of disjoint open sets. It needs an uncountably infinite number of charts to constitute an atlas.

8. $S = [0, \infty)$ with an unusual topology as follows: the open sets in $S$ are precisely those which are open in the usual topology (induced from $R$) and which are either disjoint from $0$ or contain $0$ together with an open interval $(1, 1+\epsilon)$ for some $\epsilon > 0$. It is easy to verify that this prescription does define a topology for $S$. Furthermore, every point has neighbourhoods homeomorphic to open intervals in $R$: this is by definition for points other than $0$, and a little checking reveals that each neighbourhood of $0$ of the form $[0,\epsilon) \cup (1, 1+\epsilon)$ is homeomorphic to $(-\epsilon, \epsilon)$ in $R$. Thus $S$ is a manifold modelled on $R$. However, $S$ is not Hausdorff, since every neighbourhood of $0$ meets every neighbourhood of $1$.

There are contexts in which the study of 'large' manifolds and non-Hausdorff manifolds is important, but we shall not be seriously concerned with them. From now on all our manifolds will be assumed automatically to be Hausdorff and possess atlases with at most a countably infinite number of charts.

Next we look a little more closely at what is involved in doing calculus on manifolds. As already described, we discuss differentiability of maps between manifolds by representing the maps locally (via charts) between open subsets of normed linear spaces. It is clear, though, that the result of this discussion may depend heavily on the choices of chart homeomorphisms: if

x

lies

d> 1 Vp

Vk V

p

difficulty,

can write

first observe that

the map r

n U ) -> V' = (JjV c: F

pap

as

a

(U

n

a

UJ

followed by ipf

V'

a composition:

^p1 = See Figure

* Vp^

One representative for

f

is

* therefore the

Figure 21 composed with an overlap map

other representative such overlap map is space)

a homeomorphism of one open subset of

with another: we have

If it happened to be the case

atlas were of class

C

^



~~

• • the ambiguity about of different

was C

r

E

(the model

that every overlap map within the given

(l£r N

differentiable manifolds is a $C^r$ map if all its local representatives via charts of $M$ and $N$ are $C^r$ maps in the sense of §2.6.

Note that to verify this in any given case it suffices to check the differentiability of the local representatives with respect to just one convenient atlas for each of $M$ and $N$. It then follows automatically for all other atlases in the same $C^r$ structure.

As a special case of the above we have:

DEFINITION A homeomorphism $f : M \to N$ which is a $C^r$ map and whose inverse map $f^{-1} : N \to M$ is also a $C^r$ map is called a $C^r$ diffeomorphism. If there exists a diffeomorphism between $M$ and $N$ they are said to be diffeomorphic (with the qualifier $C^r$ inserted according to context). Two diffeomorphic manifolds are indistinguishable as far as their topologies and differentiable structures are concerned.

EXAMPLES of differentiable manifolds

All the examples of manifolds above are in fact examples of differentiable manifolds, since the overlap maps of the charts are all $C^\infty$. For example, in 3(a) one of the overlap maps has domain the open interval $(0,1)$ (observe that the overlap of the two chart domains consists of those points of $S^1$ lying in the first quadrant), image also $(0,1)$, and is given by the formula

$$t \mapsto +\sqrt{1 - t^2}, \qquad 0 < t < 1,$$

which is $C^\infty$. The other overlap maps are equally easy to check. In fact the atlases for $S^1$ in 3(a) or 3(b) define the same $C^\infty$ structure since the charts from one overlap $C^\infty$ with the charts from the other and therefore a maximal $C^\infty$ atlas including either of them would include both.

Example 6

(the square) still seems worrying. How can a differentiable manifold have sharp corners? Yet the square is a differentiable manifold since the overlap map between $\phi_\alpha h$ and $\phi_\beta h$ is $(\phi_\beta h)(\phi_\alpha h)^{-1}$ which is nothing other than $\phi_\beta \phi_\alpha^{-1}$, which could perfectly well have been arranged to be $C^\infty$ as in the examples for $S^1$ above. The point here is that the 'differentiable' in 'differentiable manifold' refers to a relative property of overlap maps, and so if you live in the world defined everywhere locally by the charts you simply do not see sharp corners. The corners exist only as a feature of the square in something else, such as the plane $R^2$. Once the manifold is considered as part of something bigger then it is having to carry more than its own intrinsic properties as a manifold, and it is important to distinguish these extrinsic properties (whatever they may be in any given context) from the intrinsic ones which do not vary with the context.

What, then, is the real nature of the extrinsic property of the circle/square manifold that has corners, or, conversely, what is the meaning of not having corners? A straight line in $R^2$ clearly has no corners, and nor does anything which can be converted into a straight line by a diffeomorphism. Generalizing this to k-dimensional affine subspaces of $R^n$ (i.e. linear subspaces translated by some constant vector, so not necessarily passing through the origin in $R^n$) and realizing that having corners or otherwise is a local property (although extrinsic) we arrive at the definition of a subset $S$ of $R^n$ having 'no corners'. A subset of this kind is called a submanifold of $R^n$.

DEFINITION

Rn

A submanifold of for each

x

in

dif feomorphism that

$\phi : U \to R^2$ by

$$\phi(x_1, x_2) = \left(x_1 - \sqrt{1 - x_2^2},\ x_2\right).$$

This is a diffeomorphism onto its image $\{x \in R^2 \mid x_1 > -\sqrt{1 - x_2^2},\ |x_2| < 1\}$ and takes $U \cap S^1$ to the line segment $\{x \in R^2 \mid x_1 = 0,\ |x_2| < 1\}$. See Figure 23.

It is easy to find charts $(U, \phi)$ to take care of points on $S^1$ with $x_1 < 0$ or $x_1 = 0$ (use $x_2$ instead), thus showing that $S^1$ is a $C^\infty$ submanifold of $R^2$. Note that the restriction of $\phi$ to $U \cap S^1$ is simply one of the chart maps in Example 3(a), when $R$ is thought of as the $x_2$-axis in $R^2$. This means that the $C^\infty$ structure for $S^1$ arising from its submanifold property in $R^2$ is the same $C^\infty$ structure that $S^1$ inherits as defined by the atlas 3(a) (or 3(b)) above; indeed any chart for $S^1$ as a submanifold of $R^2$ would, if it were not already one of those in 3(a), certainly overlap $C^\infty$ with them.

2. Likewise it can be seen that $S^n$ is a $C^\infty$ submanifold of $R^{n+1}$ $(n \ge 1)$, although later we shall find a much easier way of showing this than by brandishing atlases explicitly. This $C^\infty$ structure for $S^n$ is usually called its standard structure.

Now we can unravel the confusion surrounding the 'square' example. The fact is that the square $S$ is not a differentiable submanifold of $R^2$, since at any corner there is no local diffeomorphism in $R^2$ which will straighten out $S$ locally to a straight line. Nevertheless $S$ can be equipped with a $C^\infty$ structure by exploiting its topological equivalence to a circle as on page 126, but this structure bears little direct relationship to the $C^\infty$ structure of $R^2$. Examples like this hardly ever arise in practical situations, since manifolds are very often encountered initially as submanifolds of $R^n$. The only reason for mentioning it was to highlight the distinction between intrinsic and extrinsic properties of manifolds.

3. Suppose we wish to study the motion of a spherical pendulum. The position is represented by a point $x$ in the unit 2-sphere $S^2$, say, and the velocity is represented by a vector $v$ in $R^3$ which is tangent to the sphere at $x$, which is the same as saying that $v$ is perpendicular to $x$ regarded as a vector in $R^3$. Thus the space we must consider if we wish to analyze the global dynamics of the pendulum is the space

$$T = \{(x,v) \in S^2 \times R^3 \mid v \text{ is perpendicular to } x\}$$

which consists of all elements $(x_1, x_2, x_3; v_1, v_2, v_3)$ of $R^6$ that satisfy

$$x_1^2 + x_2^2 + x_3^2 = 1, \qquad x_1 v_1 + x_2 v_2 + x_3 v_3 = 0 .$$

This set $T$ is a submanifold of $R^6$, as can be proved either directly by a careful choice of charts or indirectly by more general methods: see §3.4 below.
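The two equations defining $T$ are simple to test for a given position and velocity. The sketch below is an illustrative addition with hypothetical function names; it checks membership of $T$ up to a numerical tolerance and shows how an arbitrary vector in $R^3$ can be made tangent by removing its component along $x$.

```python
def in_T(x, v, tol=1e-9):
    """True when (x, v) satisfies |x|^2 = 1 and x.v = 0 up to tol."""
    norm_sq = sum(xi * xi for xi in x)
    dot = sum(xi * vi for xi, vi in zip(x, v))
    return abs(norm_sq - 1.0) < tol and abs(dot) < tol


def tangent_part(x, v):
    """For a unit vector x, subtract from v its component along x."""
    dot = sum(xi * vi for xi, vi in zip(x, v))
    return tuple(vi - dot * xi for xi, vi in zip(x, v))


north_pole = (0.0, 0.0, 1.0)

print(in_T(north_pole, (1.0, 2.0, 0.0)))                     # velocity is horizontal
print(in_T(north_pole, (1.0, 2.0, 3.0)))                     # velocity has a radial component
print(in_T((0.0, 0.0, 2.0), (1.0, 0.0, 0.0)))                # position is not on the unit sphere
print(tangent_part(north_pole, (1.0, 2.0, 3.0)))             # radial component removed
print(in_T(north_pole, tangent_part(north_pole, (1.0, 2.0, 3.0))))
```

This only tests individual points; it says nothing about the submanifold property of $T$ itself, for which the charts or the general methods mentioned above are needed.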

An obvious question arising is: can every differentiable manifold be realized as a submanifold of some $R^n$? If this is so, then we can in theory dispense with the abstract definition. The answer is Yes: see §3.2, below. However, the ways of doing this may be quite artificial, and distract from the essential properties of the manifold itself. For this reason we prefer to deal with manifolds abstractly where possible, to avoid carrying around a lot of redundant information.

3.2 REMARKS, COMMENTS, AND MORE EXAMPLES OF DIFFERENTIABLE MANIFOLDS

A manifold is n-dimensional, or an n-manifold, if the normed linear space on which it is modelled is a finite-dimensional space of dimension $n$. In this case we can regard the model space as $R^n$. An infinite-dimensional manifold is one which is modelled on an infinite-dimensional normed linear space.

If $\phi : U \to U'$ is a chart on an n-manifold we can for each point $u$ in $U$ write

$$\phi(u) = (\phi_1(u), \phi_2(u), \ldots, \phi_n(u))$$

in $R^n$. Sometimes it is formally useful to write $x_i$ to denote the function $\phi_i : U \to R$ (the $i$-th coordinate function) as well as to denote the $i$-th coordinate itself of $\phi(u)$ in $R^n$. If $u$ also lies in the domain $V$ of another chart $\psi$ then similarly writing $y_j$ to denote $\psi_j$ $(j = 1, 2, \ldots, n)$ we can express the fact that the overlap map $\psi \phi^{-1}$ is a $C^r$ diffeomorphism by: all partial derivatives of the $y$'s as functions of the $x$'s exist (and are continuous) up to order $r$, and the Jacobian matrix

$$\frac{\partial(y_1, y_2, \ldots, y_n)}{\partial(x_1, x_2, \ldots, x_n)}$$

is non-singular everywhere where it is defined. This is a consequence of the Inverse Function Theorem (§2.8).

If everything that we have done so far is re-cast in the setting of complex linear spaces, with complex differentiability (see Remark 3, page 60) we obtain complex manifolds. These arise in the global theory of functions of one or more complex variables, in particle physics for example.

Any open subset $W$ of a $C^r$ manifold $M$ is itself a manifold of the same dimension with charts the restrictions to $W$ of those in the $C^r$ structure of $M$. In particular (as already noted in §3.1) any open subset of the model space $E$ is a $C^\infty$ manifold. Observe that the meaning of diffeomorphism (page 125) is in this case the same as that originally used in §2.5.

We have so far avoided asking an obvious question: given a topological manifold, is it always possible to choose an atlas whose overlap maps are $C^1$, thus making the manifold into a differentiable manifold? The answer is No, although not surprisingly the proof involves esoteric topological methods. An example in dimension 10 was found by Kervaire [64] in 1960. However, it can be shown that if there is a $C^1$ atlas then there will also exist a $C^r$ atlas and even a $C^\infty$ atlas. Therefore from now on we will tend to work with $C^\infty$ differentiable manifolds, and call them smooth manifolds. Remarks made about smooth manifolds will usually apply equally well in the $C^r$ case for any given $r \ge 1$.

Another natural question to ask is: can a given manifold be equipped with two distinct smooth structures? It is very easy to see that the answer is Yes. For example, consider the two single-chart atlases on $R$:

(1) $\phi : R \to R : x \mapsto x$

(2) $\psi : R \to R : x \mapsto x^3$.

Officially, these define distinct $C^1$ structures since their overlap map $\psi\phi^{-1}$, although a differentiable homeomorphism, is not a diffeomorphism (its inverse is not differentiable at 0), and therefore the atlas $(R,\psi)$ does not belong to the structure determined by $(R,\phi)$. Nevertheless, the two copies of $R$ are diffeomorphic as smooth manifolds. The map $f : R(1) \to R(2)$ taking $x$ to $\sqrt[3]{x}$ is a diffeomorphism because its representative via the given charts is $\psi f \phi^{-1} : x \mapsto x$.
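The two claims in this example can be checked by a short calculation. The following is a sketch using the charts $\phi(x) = x$ and $\psi(x) = x^3$ of the text; the increment $h$ is an auxiliary symbol introduced here.

```latex
% Overlap map and its inverse:
\[
  \psi\phi^{-1}(x) = x^3, \qquad (\psi\phi^{-1})^{-1}(y) = y^{1/3}.
\]
% The inverse fails to be differentiable at 0, since the difference
% quotient is unbounded:
\[
  \frac{h^{1/3} - 0}{h} = h^{-2/3} \longrightarrow \infty
  \quad\text{as } h \to 0 .
\]
% Local representatives of f : x \mapsto x^{1/3} and of its inverse
% f^{-1} : y \mapsto y^3, taken with respect to the two charts:
\[
  \psi f \phi^{-1}(x) = \bigl(x^{1/3}\bigr)^3 = x, \qquad
  \phi f^{-1} \psi^{-1}(y) = \bigl(y^{1/3}\bigr)^3 = y .
\]
% Both representatives are the identity map of R, which is smooth,
% so f is a diffeomorphism between the two structures.
```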

A manifold $G$ which also has the algebraic structure of a group, and for which the maps of multiplication $G \times G \to G : (x,y) \mapsto xy$ and inversion $G \to G : x \mapsto x^{-1}$ are both smooth maps, is called a Lie group (Sophus Lie: 1842-1899). Examples are: the circle $S^1$ (with multiplication $(e^{i\theta}, e^{i\phi}) \mapsto e^{i(\theta+\phi)}$), the torus $S^1 \times S^1$ (with $S^1$-multiplication taking place in each factor), the group $GL(n;R)$ of invertible $n \times n$ matrices with real entries. This last is an open subset of $R^{n^2}$. Many Lie groups such as these arise naturally as groups of self-transformations of some geometric object. Conversely, one can start with an abstractly-defined Lie group, and look for objects on which it can act as a group of transformations. There is a particularly rich theory of the actions of Lie groups as groups of diffeomorphisms of manifolds. Technically, an action of $G$ on a manifold $M$ is a map

$\alpha : G \to$ {all diffeomorphisms of $M$}

which respects the rules of composition on both sides, i.e.

$\alpha(y) \cdot \alpha(x) = \alpha(xy)$

for all $x,y$ in $G$. (Remember the composition on the left hand side means 'do $\alpha(x)$ then $\alpha(y)$'.) If the map $G \times M \to M$ taking $(x,p)$ to $\alpha(x)(p)$ is a smooth map, then we say that $\alpha$ is a smooth action. For a given point $p$ in $M$, the set of all points $\{\alpha(x)(p)\}$ as $x$ runs through all of $G$ is called the orbit of $p$ under the action $\alpha$.

It is an immediate consequence of the definition of a group that 'belonging to the same orbit' is an equivalence relation on $M$ and so $M$ is partitioned into $\alpha$-orbits. In Chapter 4 we shall be studying the orbit structures of smooth actions of the 1-dimensional Lie group $R$ and the 0-dimensional Lie group $Z$ (in both of which the 'multiplication' is usual addition) on compact manifolds.

3.3 THE STRUCTURE OF DIFFERENTIABLE MAPS BETWEEN MANIFOLDS

We have now reached the stage of generalizing the local study of maps $f : R^n \to R^m$ (i.e. families of real functions of real variables) to the global study of maps between differentiable manifolds. There are two main questions to ask:

1. What are the possible geometrical (topological) forms that differentiable manifolds can take?

2. What can differentiable maps between them look like?

Question 1 was discussed briefly in §3.2, [10].

We will now turn attention to Question 2, which is more important in applications since in practice we are likely to know which manifolds we are dealing with but need to know something about maps between them as relating to inputs and outputs of dynamical systems. To investigate the structure of differentiable maps we first look at local structure and then piece together local information to give global information. As before, we will take smooth to mean either $C^\infty$ or $C^r$ for some given $r \ge 1$.

(a) Local structure

Let $M,N$ be smooth manifolds modelled on $R^n, R^m$ or, more generally, on normed linear spaces $E,F$. Any smooth map $f : M \to N$ is by definition (§3.2) represented locally by smooth maps between open sets in $E$ and open sets in $F$, and therefore the theory of the local structure of smooth maps between manifolds is precisely the theory of the local structure of smooth maps in euclidean or more general normed linear spaces.

We have already studied the local structure of such maps to some extent in §§2.8, 2.9 and we have no further results to offer at this stage. However, it is worthwhile looking again at the consequences of the Inverse Function Theorem and thinking of them from a manifold point of view.

Recall that two significant corollaries of the theorem were summarized on page 98 by saying that if $f : R^n \to R^m$ is such that the linear map $Df(x) : R^n \to R^m$ has maximal rank then near $x$ the map $f$ looks like $Df(x)$. In particular, when $n \le m$ the local image of $f$ looks like an $n$-dimensional linear subspace in $R^m$, and when $n \ge m$ the map $f$ looks locally like a linear projection of $R^n$ onto $R^m$ and so the inverse image of each point of $R^m$ near $f(x)$ looks like an $(n-m)$-dimensional linear subspace in $R^n$. Now the first of these two cases can be described in the language of §3.1 by saying that locally the image of $f$ is an $n$-dimensional submanifold of $R^m$, and in the second case the local inverse images of points near $f(x)$ are $(n-m)$-dimensional submanifolds of $R^n$. Of course to tie this down precisely, we have to check that the formal definitions of 'looks like' in the two contexts do correspond to what is required for the formal definition of submanifold in each case, but they do. Thus, remembering the extension of the idea of submanifold from a euclidean space setting to manifolds in general (§3.2), we can summarize the two corollaries of the Inverse Function Theorem this time as a theorem about maps between manifolds as follows:

THEOREM

Let $f : N \to M$ be a smooth map between smooth manifolds of dimensions $n,m$ respectively, and let $x$ be a point of $N$. Take a local representative $\tilde f = \psi f \phi^{-1}$ for $f$. If $D\tilde f(\phi(x))$ has maximal rank then in the cases (i) $n \le m$ and (ii) $n \ge m$ we have

(i) $f$ takes some neighbourhood of $x$ in $N$ to an $n$-dimensional smooth submanifold of $M$;

(ii) the inverse images under $f$ of points in some neighbourhood of $f(x)$ in $M$ are locally (i.e. near $x$) $(n-m)$-dimensional submanifolds of $N$.

Remarks

1. If the maximal rank condition holds for one such local representative $\tilde f$ then it holds for them all (at the points corresponding to the given $x$, of course). This follows immediately from the Chain Rule, using the fact that if $A : R^n \to R^m$ is a linear map and $B : R^n \to R^n$, $C : R^m \to R^m$ are isomorphisms then $CAB$ has the same rank as $A$. Arguments of this kind concerning properties of $f$ which are independent of the choice of local representative are becoming quite familiar by now.

2. The case $n = m$ is common to both (i) and (ii). Here the maximal rank condition is that the derivative should be an isomorphism; the conclusion is that $f$ is locally a diffeomorphism (the Inverse Function Theorem directly). The local image of $f$ is an open set in $M$, and thus $m$-dimensional, and locally the inverse images of points near $f(x)$ are single points and thus 0-dimensional.
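The Chain Rule argument behind Remark 1 can be written out as follows. This is a sketch only: the second pair of charts $\phi', \psi'$ and the corresponding representative $\tilde f'$ are names introduced here for illustration.

```latex
% Two local representatives of the same map f near x:
\[
  \tilde f = \psi f \phi^{-1}, \qquad \tilde f' = \psi' f \phi'^{-1}
  \quad\Longrightarrow\quad
  \tilde f' = (\psi'\psi^{-1})\,\tilde f\,(\phi\phi'^{-1}) .
\]
% Differentiate at the point \phi'(x) using the Chain Rule:
\[
  D\tilde f'(\phi'(x)) = C\,A\,B, \qquad
  \begin{aligned}
    A &= D\tilde f(\phi(x)), \\
    B &= D(\phi\phi'^{-1})(\phi'(x)), \\
    C &= D(\psi'\psi^{-1})(\psi f(x)).
  \end{aligned}
\]
% B and C are derivatives of overlap maps, which are diffeomorphisms,
% so B and C are linear isomorphisms and
\[
  \operatorname{rank}(CAB) = \operatorname{rank}(A).
\]
% Hence \tilde f' has maximal rank at \phi'(x) exactly when
% \tilde f has maximal rank at \phi(x).
```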

(b) Non-singular global structure

Since the maximal rank property at one point implies that $f$ has agreeable local properties near that point, we might expect that if $f : N \to M$ has the maximal rank property everywhere (i.e. using local representatives via charts) then $f$ would have some easily understandable global description. In fact, looking more closely at the above theorem, we could reasonably hope that under the assumption of maximal rank everywhere

(i) if $n \le m$ then $fN$ is a smooth $n$-submanifold of $M$, and

(ii) if $n \ge m$ then inverse images of points are smooth $(n-m)$-submanifolds of $N$.

A little thought shows that (i) will not hold as stated, because even though $f$ takes small open pieces of $N$ into submanifolds of $M$, different pieces may collide or otherwise interfere with each other when $f$ is applied and so $fN$, as a subset of $M$, may not have the 'submanifold' property. The maps $f : S^1 \to R^2$ and $f : R \to R^2$ in Figure 29 may well have maximal rank everywhere (in these cases the condition amounts to the non-vanishing of the 'velocity' vector) but their images are not submanifolds of $R^2$: in (a) there are places - the crossover points - where $fS^1$ does not look locally like $R^1$ in $R^2$, and in (b) the same applies at $p$ even though there is no actual crossover.

[Figure 29: (a) image of $S^1$; (b) image of $R$.]

Nevertheless there will be many instances where $fN$ is a submanifold of $M$, and when we add the extra assumption that $f : N \to fN$ is a bijection onto its image (i.e. $f$ is injective), to rule out examples such as $f : S^1 \to S^1$ taking $z$ to $z^2$ and thus winding $S^1$ twice around its image, we have the definition of an embedding, the most 'nonsingular' kind of map between manifolds with $n \le m$. Rather than recording the ingredients of this definition directly, we will pause for a moment to see what it means.

If we know that $fN$ is a submanifold of $M$ and that $f$ satisfies the maximal rank condition, then by taking charts for $M$ that exhibit $fN$ locally as $R^n$ in $R^m$ (§3.1) we see that the local representatives for $f : N \to fN$ all have non-singular derivative everywhere. Now the Inverse Function Theorem tells us immediately that $f : N \to fN$ is everywhere locally invertible as a smooth map (i.e. locally a diffeomorphism), and since smoothness is a local property this means that the inverse of the bijection $f : N \to fN$ is smooth, or in other words that $f : N \to fN$ is a diffeomorphism onto its image. Thus we can paraphrase the definition of embedding as follows:

DEFINITION

A smooth map $f : N \to M$ is an embedding if it is a diffeomorphism from $N$ to a smooth submanifold of $M$.

Thus an embedding of $N$ in $M$ can be regarded as a 'realization' of $N$ as a submanifold of $M$. If $f$ satisfies the maximal rank condition (with $n \le m$) but possibly fails to be an embedding then it is called an immersion. Its image is sometimes called an immersed submanifold of $M$, although it may not strictly speaking be a submanifold.

From the remarks in (a) we can say that an immersion is a map which is everywhere locally an embedding, while it may fail to be an embedding for certain global reasons. Two possible reasons have been illustrated in Figure 29. Here is an important example relating to almost periodic motions in dynamical systems (see §4.1), giving yet another illustration.

EXAMPLE (Irrational flow on a torus)

Let $f : R \to S^1 \times S^1$ be defined by

$f(t) = (\exp i\alpha t, \exp i\beta t)$

where $\alpha,\beta$ are chosen to be two nonzero real numbers with $\alpha/\beta$ irrational. Then $f$ satisfies the maximal rank condition, since in local angular coordinates on $S^1 \times S^1$ we have $f'(t) = (\alpha,\beta) \ne (0,0)$. Moreover, $f$ is injective, since $f(t) = f(s)$ would imply that $\alpha(t-s)$ and $\beta(t-s)$ were both integer multiples of $2\pi$, and so if $t \ne s$ we would have $\alpha/\beta$ expressible as the quotient of two integers. Thus $f$ is an immersion and is injective, but it is not an embedding since its image $fR$ is a line winding round and round the torus $S^1 \times S^1$ at rate $\alpha$ in one sense and $\beta$ in the other and never joining up with itself: it forms a dense subset of the torus and so nowhere looks simply like $R^1$ in $R^2$ locally.

Notice that if $\alpha/\beta$ were rational, then $f$ would still be an immersion but no longer injective. Its image would be (diffeomorphic to) a circle winding $p$ times around the torus one way and $q$ times the other way, where $\alpha/\beta = p/q$ in lowest terms.
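The contrast between the irrational and rational cases can be explored numerically. The sketch below is illustrative only: the function names are hypothetical, and `ratio` stands for $\beta/\alpha$. At the times $t_k = 2\pi k/\alpha$ the first coordinate of $f(t)$ is back at its starting point, so the path closes up exactly when the second angle $2\pi k\,\beta/\alpha$ is a multiple of $2\pi$. Floating-point arithmetic can only suggest, not prove, the irrational behaviour.

```python
import math
from fractions import Fraction

TWO_PI = 2.0 * math.pi


def circle_distance(angle):
    """Distance from `angle` (radians) to the nearest multiple of 2*pi."""
    r = math.fmod(angle, TWO_PI)
    if r < 0:
        r += TWO_PI
    return min(r, TWO_PI - r)


def closest_return(ratio, max_turns):
    """Smallest gap between the path and its starting point.

    After k full turns in the first circle factor (k = 1..max_turns) the
    second angle equals 2*pi*k*ratio, where ratio = beta/alpha.  The
    returned value is the least distance of that angle from 0 on the circle.
    """
    return min(circle_distance(TWO_PI * k * ratio)
               for k in range(1, max_turns + 1))


def first_exact_return(ratio):
    """For a rational ratio (a Fraction), the number of turns in the first
    factor after which the path closes up.  k*ratio is an integer exactly
    when the reduced denominator divides k, so the answer is that
    denominator."""
    return Fraction(ratio).denominator


if __name__ == "__main__":
    irrational = math.sqrt(2.0)  # beta/alpha irrational (up to rounding)
    for turns in (10, 100, 1000):
        print(turns, closest_return(irrational, turns))
    print(first_exact_return(Fraction(3, 2)))  # closed curve after 2 turns
```

With an irrational ratio the gap shrinks as more turns are allowed but never reaches zero, which is the numerical shadow of the image being dense without ever joining up with itself; with a rational ratio the path closes after finitely many turns.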

The two examples we have seen so far of injective immersions that are not embeddings fail to be embeddings because of the way the immersion behaves as points in the domain 'approach infinity': it is only the long-term behaviour that causes the trouble. Now, the topological notion of compactness (§1.6) in a sense corresponds to an intuitive idea of not being able to 'approach infinity'. Therefore it is reasonable to expect that every injective immersion of a compact manifold may have to be an embedding. This is indeed true.

THEOREM

An injective immersion of a compact manifold is an embedding.

The value of this theorem is that if we are given $f : N \to M$ with $n \le m$, then we only have to check properties of $f$ and $N$ to know that $fN$ will be a submanifold of $M$: we do not have to know in advance what $fN$ is to see whether $f : N \to fN$ is a diffeomorphism. This is typical of the kind of simplification to a problem that the use of compactness can give.

Sketch of proof: As mentioned above, the Inverse Function Theorem guarantees that $f$ takes suitably small open neighbourhoods $U$ in $N$ diffeomorphically to their images in $M$. Once we know that these images are themselves open sets of $fN$ (with its topology induced as a subset of $M$) we immediately see that $f^{-1} : fN \to N$ is smooth since smoothness is a local property. Therefore the problem is boiled down to proving that $f$ takes open sets in $N$ to open sets in $fN$, i.e. that $f$, already a bijection $N \to fN$, is actually a homeomorphism. But now we need go no further, for Theorem B in §1.6 tells us that this is true since $N$ is compact by assumption and $fN$ is Hausdorff because $M$ is.

In this discussion of global non-singular behaviour so far we have been concentrating on the case (i) $n \le m$. What about the other case (ii) $n \ge m$? Here there are no hidden traps. Under the maximal rank assumption, inverse images of points are smooth $(n-m)$-submanifolds of $N$ locally, and therefore also globally since being a submanifold of $N$ is a local property in terms of the topology of $N$. In the case $n \ge m$ a map satisfying the maximal rank condition is called a submersion, and so we have the following theorem:

THEOREM

If $f : N \to M$ is a submersion then for every point $p$ of $fN$ the inverse image $f^{-1}(p)$ is an $(n-m)$-submanifold of $N$.

Remarks

1. There may be points $p$ of $M$ not in the image of $f$, in which case $f^{-1}(p)$ will be empty.

2. If $f$ is of class $C^r$ then each $f^{-1}(p)$ is in fact a $C^r$ submanifold of $N$.

3. The simplest examples of submersions (other than diffeomorphisms, which are simultaneously submersions and immersions) are projections from a cartesian product $N = M_1 \times M_2$ onto one or other of the factors. The inverse image of a point $p_2$ of $M_2$ under the projection $N \to M_2 : (p_1,p_2) \mapsto p_2$ is the set $M_1 \times \{p_2\}$, which is of course diffeomorphic to $M_1$. Here $\dim(M_1) = \dim(N) - \dim(M_2)$, as in the theorem.

4. We know from the previous discussion that every submersion has this type of product structure locally. That it need not do so globally is easily illustrated by the Möbius strip (§3.2 [9]). First remove the boundary of the strip, leaving a non-compact 2-manifold (without boundary), and collapse this manifold down on to a circle going around the strip once. Such a map can be given by $(x,y) \mapsto (x,\tfrac12)$ in our formal definition of the Möbius strip as a quotient space obtained from the square $A = [0,1] \times [0,1]$. Locally this looks like a product projection as above, but it is not so globally since the Möbius strip is not homeomorphic to the product $S^1 \times [0,1]$.

This example, showing how a space which is not a cartesian product can nevertheless look everywhere locally like one and even have a globally defined 'projection' (= submersion) onto one 'factor', gives the germ of the topological idea of a bundle. We shall meet this again shortly in the context of tangent and cotangent bundles for smooth manifolds.

5. The purpose of removing the boundary of the Möbius strip in the preceding example was to avoid having to define submersions for manifolds with boundary. In fact, with a little care there is no real problem in defining submersions, immersions and embeddings for such manifolds.

6. In general this discussion goes over to infinite-dimensional manifolds, provided the usual precautions are taken about splitting (Remark 1, page 99).

(c) Global structure with singularities

The problem of understanding the global structure of smooth maps between manifolds, where singularities are admitted, is no less than the problem of understanding everything about the global properties of families of smooth functions of $n$ variables. Therefore it is not surprising that our knowledge is rather limited. Nevertheless, there are some useful general theorems, and in particular cases very strong results about structure can be obtained. The theory of singularities of smooth maps, and in particular the global theory, is still very young as a branch of mathematics, but it is growing rapidly with the intensive cultivation of many ideas due in large part to Thom in the 1960's.

The simplest non-trivial maps to look at are those for which at least one of the manifolds is $R$ or an interval in $R$. A map $R \to M$ is thought of as a path in $M$, and although the sets of paths and loops and deformations of loops in $M$ are interesting and important things to study from the point of view of the global topology of $M$, the actual paths do not lend themselves to much analysis which is of topological significance. After all, a singularity is simply a place on the path where in some (and hence any) local coordinates the 'velocity' vanishes. Therefore we will move immediately on to the more interesting case of maps $M \to R$, i.e. real-valued functions of many variables.

The structure of real-valued functions

Singularities of such functions are called critical points and we have already studied some local structure near critical points (using the language of germs) in §2.9. We will now consider some global implications.

EXAMPLES

1. Let $M = R^n$, with $f : M \to R$ defined by

$$f(x_1,x_2,\ldots,x_n) = a_1x_1^2 + a_2x_2^2 + \cdots + a_nx_n^2$$

where the $a_i$'s are non-zero constants. Then $Df(x)$ is the linear map $R^n \to R$ described by the $1 \times n$ matrix

$$(2a_1x_1,\ 2a_2x_2,\ \ldots,\ 2a_nx_n)$$

which fails to have maximal rank (= 1) precisely when $x_1 = x_2 = \cdots = x_n = 0$. Therefore if $k \ne 0$ and $k$ lies in the image of $f$, the set

$$f^{-1}(k) = \{x \in R^n : a_1x_1^2 + a_2x_2^2 + \cdots + a_nx_n^2 = k\}$$

will be an $(n-1)$-dimensional smooth submanifold of $R^n$. In particular, taking each $a_i = 1$ and $k = 1$ gives that the $(n-1)$-sphere $S^{n-1}$ is a smooth submanifold of $R^n$. We already know this (§3.1) but now we have proved it by general theory rather than by considering charts. If all the $a_i$ are positive and $k > 0$ then $f^{-1}(k)$ is a compact manifold diffeomorphic to $S^{n-1}$ (an ellipse when $n = 2$), while if some $a_i$ and $a_j$ have opposite signs then $f^{-1}(k)$ is noncompact (a hyperbola when $n = 2$). Note that $f^{-1}(0)$ is either just one point or is not a submanifold of $R^n$ (a pair of lines when $n = 2$), depending on the signs of the $a_i$'s. See Figure 30.

2. Let $M$ be the torus sketched in Figure 31 and let $f : M \to R$ be projection onto a coordinate axis (the 'height' function). We see that for most points $k$ in $fM$ the inverse image $f^{-1}(k)$ is a 1-dimensional submanifold of $M$, although not for all.

From the picture in Example 2 it seems reasonable to expect that in general 'most' points $k$ in the image of $f$ will have inverse images which are $(n-1)$-dimensional (i.e. codimension 1) submanifolds of $M$. This is indeed the case. It is clearly going to be true when $f$ has only a finite number of critical points, for on removing the critical points the restriction of $f$ to the remaining manifold is always a submersion, but the importance of the result is that it holds for any smooth $f$ whatsoever, no matter how large or complicated the set of critical points may be.

Define a critical value of $f : M \to R$ to be any number $k = f(p)$ where $p$ is a critical point of $f$. Thus critical points belong to $M$, critical values to $R$. Numbers which are not critical values, including those not even in $fM$, are called regular values. Assume that $f$ is $C^\infty$. The theorem that gives us what we hope for is:

THEOREM (Sard's Theorem)

The set of regular values of $f$ is a dense subset of $R$.

Remarks

1. The original theorem (see Sard [123]) is much stronger than this. It relates the measure (a non-topological concept) of the set of critical values to the degree of differentiability of $f$ and the dimension of $M$. Also, it deals with maps from $n$ to $m$ dimensions rather than just real-valued functions: see below.

2. A version of Sard's theorem applicable to infinite-dimensional manifolds has been formulated by Smale (in [111]).

3. The theorem does not say that the set of non-critical points is dense in $M$. For example, the map $R^n \to R$ taking everything to zero has all non-zero $k$ as regular values, but has no non-critical points at all.

The practical interpretation of the theorem is that if $k$ is such that $f^{-1}(k)$ is non-empty but is not a codimension-1 submanifold of $M$, then we can find $k'$ as close as we wish to $k$ with the property that $f^{-1}(k')$ is either empty or is a genuine codimension-1 submanifold of $M$. If we call a codimension-1 submanifold a hypersurface, then we can summarize the above discussion by saying that for 'most' $k$ in $R$ the set of points $x$ in $M$ satisfying the nonlinear equation $f(x) = k$ is either empty or is a hypersurface in $M$. The advantage of this conclusion is that it allows us to take local coordinate systems consistently and meaningfully everywhere on $f^{-1}(k)$, which we think of as the world in which $x$ can vary subject to the constraint $f(x) = k$. We shall use this hypersurface property when considering certain important types of dynamical system (§4.6).
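The practical interpretation can be tried out on the simplest saddle. The sketch below is an illustration, not the author's construction: it takes $f(x,y) = x^2 - y^2$ on $R^2$ (the case $n = 2$, $a_1 = 1$, $a_2 = -1$ of Example 1), whose only critical point is the origin and whose only critical value is therefore 0. All function names are hypothetical.

```python
import math


def f(x, y):
    """The saddle function f(x, y) = x^2 - y^2 on R^2."""
    return x * x - y * y


def grad_f(x, y):
    """Derivative Df(x, y), written as the 1 x 2 matrix (2x, -2y)."""
    return (2.0 * x, -2.0 * y)


def is_critical_point(x, y):
    """Df fails to have maximal rank (= 1) exactly when both entries vanish."""
    gx, gy = grad_f(x, y)
    return gx == 0.0 and gy == 0.0


# The origin is the only critical point, so 0 is the only critical value.
CRITICAL_VALUES = {f(0.0, 0.0)}


def is_regular_value(k):
    return k not in CRITICAL_VALUES


def nearby_regular_value(k, eps):
    """Return a regular value within eps of k (k itself if already regular)."""
    return k if is_regular_value(k) else k + eps / 2.0


def sample_level_set(k, ys):
    """A few points of the hyperbola f = k for k > 0: x = +/- sqrt(k + y^2)."""
    points = []
    for y in ys:
        x = math.sqrt(k + y * y)
        points.extend([(x, y), (-x, y)])
    return points


if __name__ == "__main__":
    k_prime = nearby_regular_value(0.0, 1e-6)
    pts = sample_level_set(k_prime, [-1.0, 0.0, 1.0])
    print(k_prime, all(not is_critical_point(*p) for p in pts))
```

Here $f^{-1}(0)$ is the pair of lines $y = \pm x$, which is not a hypersurface, while the level set for the nearby value `k_prime` is a hyperbola on which the derivative never vanishes at the sampled points.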

Let us look again at the illustration in Example 2, and suppose that the critical points are given to be non-degenerate. As we saw in §2.9 this definition does not depend on the choice of local coordinate chart. In a neighbourhood of each critical point $p$ the Morse lemma gives us a local model which enables us to see how the submanifold $f^{-1}(k)$ changes or even disappears as $k$ passes through the corresponding critical value $k = f(p)$. Furthermore, $f^{-1}(k)$ does not change at all (up to diffeomorphism) as $k$ moves throughout any interval in $R$ not containing any critical values. Thus from a knowledge only of the critical points of $f$ and their types it seems that we can more or less reconstruct the global geometry of the manifold $M$.

This

simple yet profound observation is

the basis of Morse

theory.

Morse theory has been used to great effect by topologists in studying the geometry of high-dimensional manifolds, for example in Smale's proof of the generalized Poincaré conjecture (§3.2, [10]). In practical applications, however, it may often be more useful to work in the opposite direction, using the fact that the global topological structure of $M$ places constraints on the possible numbers of the various types of non-degenerate critical point that a smooth function on $M$ can have.

Let us suppose that $M$ is a compact manifold. As remarked in §2.9, each non-degenerate critical point of $f$ is isolated, i.e. lies in a neighbourhood containing no other critical points. If we now suppose also that all the critical points are non-degenerate, the compactness of $M$ implies that there can only be a finite number of them (if there were an infinite number they would have to have an accumulation point (§1.6), and since all derivatives of local representatives of $f$ are continuous such an accumulation point would have to be a critical point of $f$: an impossibility if all critical points are isolated). Let $N_i$ denote the number of critical points of index $i$ (see §2.9). Then the numbers $N_i$ $(0 \le i \le n)$ are related to the topology of $M$ by the so-called Morse inequalities:

$$N_0 \ge B_0$$

$$N_1 - N_0 \ge B_1 - B_0$$

$$\cdots$$

$$N_r - N_{r-1} + N_{r-2} - \cdots \pm N_0 \ \ge\ B_r - B_{r-1} + B_{r-2} - \cdots \pm B_0$$

$$\cdots$$

$$N_n - N_{n-1} + N_{n-2} - \cdots \pm N_0 \ =\ B_n - B_{n-1} + B_{n-2} - \cdots \pm B_0$$

where the $B_i$ are the Betti numbers of $M$ (certain integers defined from the topology of $M$ via its homology groups: we will give no details here. See e.g. Maunder [79]). The alternating sum

$$B_0 - B_1 + B_2 - \cdots \pm B_n$$

is the Euler characteristic of $M$, usually written $\chi(M)$. The Betti numbers, and hence the Euler characteristic, are topological invariants, which means that homeomorphic spaces have the same Betti numbers. These invariants can in principle be computed directly for any manifold or reasonable topological space defined in some fairly concrete way, and much can be discovered about them abstractly using the machinery of homology theory, itself a small part of algebraic topology.

EXAMPLES

1. For the function on the torus in Figure 31 we have $N_0 = 2$, $N_1 = 4$ (the number of saddle-points) and $N_2 = 2$, so $\chi(\text{torus}) = 0$. Hence any $f : \text{torus} \to R$ with all critical points non-degenerate would satisfy $N_0 - N_1 + N_2 = 0$. It is worth trying several examples to verify this.

2. The n-sphere $S^n$ admits a smooth map $f : S^n \to R$ having one non-degenerate maximum (index $n$), one non-degenerate minimum (index 0), and no other critical points. For example, let $f$ be the projection of the standard $S^n$ (unit sphere in $R^{n+1}$) onto one coordinate axis in $R^{n+1}$. Hence $\chi(S^n) = 1 + (-1)^n$. In particular $\chi(S^2) = 2$, which incidentally gives a proof that $S^2$ is not homeomorphic to a torus.

3. An orientable surface of genus $g$ (see §3.2, [10]) has Euler characteristic $2-2g$, as may be seen by counting critical points of a height function based on Figure 28. Note that this is negative for all cases other than the sphere or the torus.
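The last Morse relation says that the alternating sum of the critical-point counts equals the Euler characteristic, and the three examples above can be checked mechanically. The sketch below uses hypothetical helper names; the counts for a genus-$g$ surface (one minimum, $2g$ saddles, one maximum) are an assumption about a conveniently placed height function, not data read from Figure 28.

```python
def euler_characteristic(index_counts):
    """Alternating sum N_0 - N_1 + N_2 - ... of critical-point counts,
    where index_counts[i] is the number of critical points of index i."""
    return sum((-1) ** i * n for i, n in enumerate(index_counts))


def sphere_counts(n):
    """Height function on S^n: one minimum (index 0), one maximum (index n)."""
    counts = [0] * (n + 1)
    counts[0] += 1
    counts[n] += 1
    return counts


def surface_counts(g):
    """Assumed height function on an upright orientable surface of genus g:
    one minimum, 2g saddle-points and one maximum."""
    return [1, 2 * g, 1]


if __name__ == "__main__":
    print(euler_characteristic(sphere_counts(2)))   # sphere S^2
    print(euler_characteristic(surface_counts(1)))  # torus
    print(euler_characteristic(surface_counts(3)))  # genus 3
```

Different non-degenerate functions on the same manifold may have different counts, but the alternating sum must agree; for instance the counts (2, 4, 2) and (1, 2, 1) on a torus both sum to zero.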

Maps between arbitrary manifolds

As soon as we take both manifolds to have dimension greater than 1 the global problems become much more subtle and complicated. However, one of the main tools - Sard's Theorem - is still available. Following a path parallel to that we took for functions $f : M \to R$, we define singular points of $f : M \to N$ to be points $p$ where the maximal rank condition fails (in some and therefore any local coordinates), singular values to be points $q$ of $N$ which are the images of singularities under $f$, and regular values to be points in $N$ which are not singular values (including points which are not in $fM$). Then, assuming $f$ to be $C^\infty$, Sard's Theorem says that the set of regular values is a dense subset of $N$.

Again exploiting the corollaries of the Inverse Function Theorem as in the first part of this section, we see that when $\dim N \le \dim M$ the inverse image of each regular value is either empty or is a smooth submanifold of $M$ of codimension equal to $\dim N$. Thus the practical interpretation is that if $f^{-1}(q)$ is non-empty but is not a submanifold of $M$ of codimension equal to $\dim N$, then there exist points $q'$ arbitrarily close to $q$ in $N$ for which either the submanifold condition will be true or $f^{-1}(q')$ will be empty. Much as before, we can summarize this by saying that 'usually' the set of solutions of $n$ non-linear equations in $m$ variables, where $n \le m$, is either empty or is a smooth manifold of dimension $m-n$.

Given some notion of 'nondegeneracy' for singularities in arbitrary dimensions, we would aim to mimic the Morse inequalities by relating the number and types (whatever they may be) of nondegenerate singularities of maps $f : M \to N$ to the topological structures of $M$ and $N$. This is a programme which has hardly begun, since it is only relatively recently that suitable ideas of 'nondegeneracy' in this wider context have been evolved, and in any case the difficulties seem formidable. It is likely, though, that we shall see exciting developments in these directions in the next few years.

3.4 TANGENT BUNDLES AND TANGENT MAPS

In the previous discussion of the structure of smooth maps the maximal rank condition played an important part. This is a condition on the derivative of a local representative of a map, and carries with it the remark that it does not depend on the choice of local representative. Similarly, the definitions of critical point, non-degenerate, singular point are in terms of the derivatives of a local representative, with the accompanying assurance about independence of choice of representative. It is cumbersome to have to refer to local representatives when dealing with properties independent of them, but how else can one work with derivatives? The answer is: by means of the tangent bundle, a concept which will enable us to deal with derivatives of smooth maps without mentioning local representatives explicitly.

If we think of some evolving dynamical system as being represented by a point moving on a manifold, then it is likely that we will be interested not only in the position but also in some sense the velocity of the point. The intuitive idea of the tangent bundle of $M$ is that it is the space of all positions and velocities of points moving on $M$. The subtlety of its construction stems from the need for an intrinsic definition, using information only about $M$ and not about any embedding of $M$ in some euclidean or other linear space (in which case the definition of velocity vectors would present no problem).

To formulate the definition let us go back to a local setting. Take $E,F$ to be Banach spaces (e.g. euclidean spaces $R^n$), and let $U,V$ be open sets in $E,F$ respectively. The basis of our definition of tangent bundle will be the following observation: if $f : U \to V$ is differentiable at $p$ in $U$, then the derivative $Df(p) : E \to F$ is characterized entirely by the effect of $f$ on smooth paths in $U$ based at $p$. Some explanation is needed. By a smooth path in $U$ based at $p$ we mean a smooth map

$c : J \to U$

where $J$ is some open interval $(a,b)$ with $a < 0 < b$ and $c(0) = p$. If $f : U \to V$ is a smooth map, then by the Chain Rule we see that the composition

$f \circ c : J \to V$

is a smooth path in $V$ based at $f(p)$ and that

$D(f \circ c)(0) = Df(c(0)) \circ Dc(0) : R \to F$,

i.e.

$(f \circ c)'(0) = Df(p).c'(0)$.   (*)

Thus the derivative $Df(p)$ takes the tangent to the path $c$ at $p$ to the tangent to the path $f \circ c$ at $f(p)$. See Figure 32. Now any $v$ in $E$ is the tangent at $p$ to some smooth path (e.g. $c : t \mapsto p + tv$, for $|t|$ small enough to ensure $c(t)$ lies in $U$) and so the property above describes $Df(p)$ completely.

Given two smooth paths $c_1,c_2$ based at $p$, define them to be tangent there if

$(c_1(t) - c_2(t))/t \to 0$ (in $E$) as $t \to 0$

(the usual definition of 'first order contact'). Then tangency is an equivalence relation on the set of all paths in $U$ based at $p$, and we call the equivalence classes tangency classes at $p$. This is the formal way of capturing the idea of velocities at $p$.

Denote the tangency class of $c$ by $[c]$. Then the map

$\sigma_p$ : {tangency classes at $p$} $\to E$

defined by

$\sigma_p[c] = c'(0)$

makes sense ($[c_1] = [c_2]$ implies $c_1'(0) = c_2'(0)$) and is injective ($c_1'(0) = c_2'(0)$ implies $[c_1] = [c_2]$) and we have already noted above that it is surjective (for any $v$ in $E$ there exists $c$ with $c'(0) = v$).

Therefore we have a natural bijection between the linear space E and the set T_pU of tangency classes of smooth paths in U based at p. This means that we can regard T_pU as a normed linear space isomorphic to E (just lift the structure of E over to T_pU via the bijection), and similarly think of T_{f(p)}V as an isomorphic copy of F. This linear space T_pU is called the tangent space to U at the point p. In this new guise the linear map Df(p) : E -> F becomes the linear map

    T_pf : T_pU -> T_{f(p)}V

which takes each [c] in T_pU to [f·c] in T_{f(p)}V.

Define the tangent bundle TU of U to be the union of all the linear spaces T_pU as p runs through U. The isomorphisms σ_p : T_pU -> E fit together to give a bijection

    σ : TU -> U × E

defined explicitly by

    σ([c]) = (c(0), c'(0)) = (p, c'(0))

for every smooth path c based at p. We then assign to TU that unique topology which makes σ into a homeomorphism, i.e. W is open in TU precisely when σW is open in U × E with its product topology.

Any smooth map f : U -> V induces a tangent map

    Tf : TU -> TV

defined as T_pf on each linear space T_pU. Thus Tf represents the effect of the derivative of f simultaneously all over U. If we regard TU, TV as U × E, V × F via σ and its analogue for V, then Tf is the map

    U × E -> V × F : (p,h) ↦ (f(p), Df(p).h).    (**)

From this expression we see that if f is of class C^r then Tf is of class C^{r-1}, since Df is of class C^{r-1} (cf. page 80).

Using this notation the Chain Rule has a very clean form. If we have smooth maps f,g whose composition g·f makes sense then T(g·f) takes (p,h) to

    (g·f(p), D(g·f)(p).h) = (g(f(p)), Dg(f(p))·Df(p).h) = Tg(Tf(p,h))

so in fact the Chain Rule becomes

    T(g·f) = Tg · Tf.

This is easy to see from the tangent space definition, since

    T(g·f)([c]) = [g·f·c] = Tg([f·c]) = Tg · Tf([c])

for any [c] in TU.

From the Chain Rule it follows immediately that if f is a diffeomorphism then so is Tf, since T(id_U) = id_{TU} (similarly for V) and so T(f^{-1}) is an inverse for Tf.
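The identity T(g·f) = Tg · Tf can be checked numerically in a small case. The following is a minimal sketch, assuming two illustrative maps f and g on R^2 that are not taken from the text; it applies Tg · Tf to a pair (p,h) and compares the result with a finite-difference velocity of the image path t ↦ g(f(p + th)), which is what the tangency class [g·f·c] records.

```python
import math

# Illustrative maps R^2 -> R^2 (hypothetical choices, not from the text).
def f(p):
    x, y = p
    return (x * x - y, x * y)

def Df(p, h):
    """Derivative of f at p applied to the vector h."""
    x, y = p
    hx, hy = h
    return (2 * x * hx - hy, y * hx + x * hy)

def g(q):
    u, v = q
    return (math.sin(u), u + v * v)

def Dg(q, k):
    """Derivative of g at q applied to the vector k."""
    u, v = q
    ku, kv = k
    return (math.cos(u) * ku, ku + 2 * v * kv)

def Tf(p, h):
    """Tangent map in product form: (p, h) -> (f(p), Df(p).h)."""
    return (f(p), Df(p, h))

def Tg(q, k):
    return (g(q), Dg(q, k))

def velocity_of_image_path(p, h, eps=1e-6):
    """Central-difference velocity at t = 0 of the path t -> g(f(p + t h))."""
    def path(t):
        return g(f((p[0] + t * h[0], p[1] + t * h[1])))
    a, b = path(eps), path(-eps)
    return tuple((ai - bi) / (2 * eps) for ai, bi in zip(a, b))

p, h = (0.7, -0.3), (1.0, 2.0)
base, vel = Tg(*Tf(p, h))            # Tg . Tf applied to (p, h)
approx = velocity_of_image_path(p, h)
print(base, vel, approx)
```

The two velocity vectors should agree up to the finite-difference error, and the base point is g(f(p)) in both descriptions.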

The idea of tangency classes is the one we use in order to put derivatives into a manifold context. If M is any smooth manifold modelled on E and p is a point on M, define a smooth path based at p to be a smooth map c : J -> M where J = (a,b) with a < 0 < b and c(0) = p, and define two such to be tangent if their local representatives with respect to some chart around p are tangent in the sense already defined above for maps between open sets

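in normed linear spaces. That this does not depend on the choice of chart is easily seen by applying (*) to the overlap map between two charts.

We can therefore define the tangent space T_pM to be the set of tangency classes of smooth paths in M based at p. A chart around p puts T_pM in bijection with E, and T_pM acquires via the bijection with E the structure of a linear space: the sum of two tangency classes is the class corresponding to the sum of their images in E. The same applies to scalar multiples. If we choose a different chart we may obtain a different bijection between T_pM and E. However, the important thing is that in passing from E to T_pM by the bijection corresponding to one chart, and then from T_pM back to E by the bijection corresponding to the other, the net result is a linear isomorphism: this follows again from (*). Therefore, no matter which chart we choose, we can define the same linear structure, and the rules for a linear space (§2.2) work for T_pM itself.

If we have a smooth map f : M -> N, where N is a manifold modelled on F, then by taking local representatives we see that if c_1,c_2 are tangent at p then f·c_1, f·c_2 will be tangent at f(p). In more concrete terms, this means that [c_1] = [c_2] in T_pM implies [f·c_1] = [f·c_2] in T_{f(p)}N, so that f determines a well-defined map

    T_pf : T_pM -> T_{f(p)}N : [c] ↦ [f·c].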

In terms of a local representative f̂ = ψfφ^{-1} this map T_pf is none other than Df̂(φ(p)) : E -> F. Since this is linear, the map

    T_pf : T_pM -> T_{f(p)}N

is a linear map. It is called the tangent map of f at p. We think of it as taking velocities of points on M at p to the corresponding velocities of points on N at f(p).

To summarize: the tangent space T_pM is a linear space isomorphic to E; constructing an explicit isomorphism requires choosing a chart around p, but the linear structure induced on T_pM does not depend on the chart. A smooth map f : M -> N induces a linear tangent map

    T_pf : T_pM -> T_{f(p)}N

by [c] ↦ [f·c], which in local representation is simply the derivative.

Note that we have 'forgotten' the norm structure on E, and we are not thinking of any particular norm for T_pM at the moment. We shall come back to this point shortly.

So far we have discussed only what happens at one point on the manifold. The next task is to put together in some coherent way the tangent spaces and tangent maps over the whole manifold M.

DEFINITION The tangent bundle of M, denoted by TM, is the union of the tangent spaces at all the points of M. Equivalently, TM is the set of all tangent vectors everywhere on M.

As it stands, this definition gives TM simply as a set, with no further structure than the linear structure of each T_pM. However, TM can quickly be made not only into a topological space but into a smooth manifold, and it then turns out that Tf is a smooth (C^{r-1} if f is C^r) map from TM to TN.

Topology and smooth structure for TM

Let us first take the local case, when M is an open subset U of E. Here we have already seen that there is a natural bijection σ between TU and U × E, defined by taking [c] in T_pU to the point (p, c'(0)) in {p} × E, i.e. we identify T_pU with {p} × E for each p in U, where we think of {p} × E as just a copy of E 'labelled' by p.

If we are on an arbitrary C^r manifold M we can take a chart φ : U -> U', and construct a bijection τ_φ : TU -> U' × E by letting each path c in U correspond to φc in U' and then applying σ : TU' -> U' × E to obtain (φc(0), (φc)'(0)) in U' × E. Again, we already have a product topology defined for U' × E, and so we lift this back via τ_φ to a topology for TU. This construction depends on the choice of chart φ, but the topology obtained for TU is in fact independent of φ. This is because any overlap map ψ = φ_β φ_α^{-1} between two charts φ_α, φ_β on U (thinking of U itself as a manifold) is a C^r diffeomorphism, and so as we saw above (**) Tψ is a C^{r-1} diffeomorphism (if r > 1) and is in any case a homeomorphism. But it follows directly from the definitions that

    τ_β W = (τ_β τ_α^{-1})(τ_α W),

where τ_β τ_α^{-1} is the product form (**) of Tψ, so τ_α W is open if and only if τ_β W is open, showing that the topology for TU does not depend on the choice of φ.

We now have a well-defined topology on TU. To get from this a topology on the whole of TM, we define a subset W of TM to be an open set precisely when the intersection W ∩ TU is open in TU for each chart domain U. It is routine to check that this does satisfy the rules for a topology (§1.3). In view of the overlap condition discussed above, it is not necessary (and of course theoretically impossible) to check this intersection property for all charts on M: it suffices to check it for one atlas.

By means of the bijections τ_φ we have gained more than a topology for TM. Each τ_φ is a homeomorphism (by definition) from an open set TU in TM to an open set U' × E in E × E, and so is a chart on TM. The overlap maps for these charts on TM are C^{r-1} diffeomorphisms if the overlap maps on M are C^r, and so a C^r structure on M gives a C^{r-1} structure on TM.

... two points in different tangent spaces.

If M has dimension n then TM has dimension 2n.

TM is never compact. This is essentially because each T_pM is non-compact, being homeomorphic to E. (A compact set in E would have to be bounded: cf. Theorem C in §1.6.)

5. There is a natural map π : TM -> M, called the projection map, taking T_pM to p for each p in M, i.e. π takes all tangent vectors at p to the point p itself. Thus π^{-1}(p) = T_pM. With π in mind we think of the tangent bundle TM as a collection of tangent spaces lying 'above' M, and then T_pM is called the fibre of the bundle over p. See Figure 33.

Figure 33

The projection π is a smooth map (C^{r-1} if M is C^r) since it has local representatives of the form φπτ_φ^{-1}, taking (q,h) in U' × E to q in U'. This has maximal rank everywhere (its derivative is surjective, and splits in the infinite-dimensional case) and so π is a submersion TM -> M.

EXAMPLES of tangent bundles

1. For an open set U in E the topology on TU was defined just in order to make the bijection σ : TU -> U × E into a homeomorphism. For this reason we usually write TU = U × E. In effect this says that we are using the same linear space E in which to represent the tangent vectors at all points in U simultaneously. This is useful for calculations, but confusing conceptually, since U itself lies in (one copy of) E and so tangent vectors lying in (the other copy of) E might be mistaken for elements of U. It is important to try to avoid this confusion.

2. If S is a smooth k-dimensional submanifold of M then by taking the right charts for M at points on S we see that TS is a smooth 2k-dimensional submanifold of TM. For each p in S the tangent space T_pS is a k-dimensional linear subspace of T_pM.

3. As an example of the above, consider S^1 in R^2. A path c in S^1 can be written in polar coordinates (r(t),θ(t)), where of course r(t) = 1 since c(t) is in S^1. The map

    σ : TS^1 -> S^1 × R

taking [c] to (c(0),θ'(0)) for each [c] in TS^1 is clearly a bijection, and by taking charts for S^1 it is easy to check that σ is a diffeomorphism. Furthermore σ takes each T_pS^1 to R (the second factor) by a linear isomorphism, so we usually write simply TS^1 = S^1 × R. The bundle projection π is just projection on the first factor.

4. T(M × N) can be regarded as TM × TN by working with the first and second 'coordinates' separately.

5. Since TM is a smooth manifold it has its own tangent bundle T(TM) or T^2M, itself a smooth manifold, of dimension 4n if M has dimension n. Similarly we then define T^3M = T(T^2M), and so on. If M is of class C^r then TM is C^{r-1}, and so the process stops at T^rM, which is a topological manifold but has no given differentiable structure and so no tangent bundle. Locally T^2M looks like

    T(U × E) = (U × E) × (E × E)

since U × E is an open subset of the linear space E × E.

6. From Examples 3 and 4 it follows that

    T(S^1 × S^1) = (S^1 × R) × (S^1 × R) = (S^1 × S^1) × R^2

on rearranging the factors. This expresses the fact that angular coordinates θ, φ on the two S^1 factors allow tangents to paths on S^1 × S^1 to be described everywhere by the pair of numbers (θ'(t), φ'(t)) in R^2.

7. The two-sphere S^2 does not have a system of angular coordinates (the Euler angles give charts that do not work at the north and south poles) and this is related to the fact that TS^2 cannot be identified with S^2 × R^2. If it could be, then we would be able immediately to assign a non-zero tangent vector to each point of S^2 in a way that varied continuously over S^2 (take a fixed vector v in the second factor R^2). The celebrated Hairy Ball Theorem in topology asserts that this is impossible. We have here more than a mere topological curiosity, for we saw on page 131 that the manifold that we now recognize as TS^2 is the natural setting for the study of the motion of a spherical pendulum.

In fact TS^n = S^n × R^n only in the cases n = 1, 3 or 7 (or 0). This is a deep result, first established by J.F. Adams [3] in 1962. It is closely connected with the existence of complex numbers (dim 2), quaternions (dim 4), Cayley numbers (dim 8) but nothing analogous in higher dimensions. The proof involves much elaborate machinery from algebraic topology.

The question of whether or not TM is the same as M × R^n is very important both on purely geometric grounds and from the point of view of applications. If TM = M × R^n then velocities of points moving on M (= tangents to paths) can all be represented in one n-dimensional coordinate space R^n, despite the fact that M itself may need several coordinate charts. However, if TM ≠ M × R^n then we need separate coordinate charts for velocities as well. Of course if M is a submanifold of some R^m then TM is embedded as a submanifold in TR^m (Ex.2) which is R^m × R^m (Ex.1), and velocities on M can be measured in the second factor R^m. However, this introduces a lot of redundant extrinsic information about the way M lies in R^m, and is not helpful to an understanding of intrinsic processes on M itself.

A tangent bundle TM which is really a product M × R^n is called trivial. The precise definition, valid also in infinite dimensions, is the following:

DEFINITION If M is a smooth manifold modelled on E then the tangent bundle TM is trivial if there exists a diffeomorphism

    TM -> M × E

which takes each tangent space T_pM by a linear isomorphism to the copy {p} × E of E in M × E. A manifold whose tangent bundle is trivial is called parallelizable.

Tangent maps for manifolds

By analogy with the local case (open set U in E) we define the tangent map Tf : TM -> TN of a smooth map f : M -> N to be the map which is the linear map T_pf on each tangent space T_pM. While f takes points on M to points on N, the map Tf takes velocities on M to velocities on N. If we take suitable charts φ : U -> U' and ψ : V -> V' to look at a local representative f̂ = ψfφ^{-1} for f, we see that T(f̂) is a local representative for Tf. Therefore Tf is of class C^{r-1} if f is of class C^r.

There is a simple global Chain Rule for maps between manifolds: given f : M -> N and g : N -> P we have g·f : M -> P, and since

    T(g·f)([c]) = [g·f·c] = Tg.Tf([c])

we see as before that

    T(g·f) = Tg·Tf.

Again, T(id_M) = id_{TM} and it follows by the same formal argument as in the local case that if f : M -> N is a diffeomorphism (C^r) then so is Tf (C^{r-1}).

Since Tf takes the tangent space of M at p to the tangent space of N at f(p), we could write

    π_N·Tf = f·π_M : TM -> N

where π_M denotes the projection TM -> M, and similarly for π_N. This is often conveniently represented by saying that the diagram

            Tf
      TM ------> TN
      |           |
     π_M         π_N
      v           v
      M  ------>  N
            f

commutes, meaning that you arrive at the same result whichever route you follow from top left to bottom right. This diagram conveys the idea of Tf : TM -> TN being a map over f, carrying on top of f information about the derivative of f.

EXAMPLES of tangent maps

1. It is worth emphasizing again that when U,V are open sets in E,F and f : U -> V then Tf : TU -> TV is the map

    U × E -> V × F : (p,h) ↦ (f(p), Df(p)h).

This is the form of all local representatives of Tf : TM -> TN. Remember that Df makes no sense in the manifold case, although Tf does. To carry out calculations (such as checking the maximal rank condition) we work with local representatives and D, but for general theorems we use T.

2. Define f : S^1 -> S^1 by f(z) = z^q, where z is a complex number e^{iθ} and q is an integer. Then (Example 3 above) TS^1 = S^1 × R and here Tf : S^1 × R -> S^1 × R is given by

    Tf(z,h) = (z^q, qh).

This is because f(e^{iθ(t)}) = e^{iqθ(t)} and so f multiplies velocities of paths on S^1 by the factor q.
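The formula Tf(z,h) = (z^q, qh) can be illustrated numerically. This is a rough sketch, assuming an illustrative exponent q = 3 and a test path on the unit circle with constant angular velocity h; the helper `angular_velocity` is a hypothetical name introduced here, estimating θ'(0) from the argument of a quotient of nearby points.

```python
import cmath

def f(z, q):
    """The map z -> z^q on the unit circle."""
    return z ** q

def angular_velocity(path, eps=1e-6):
    """Estimate theta'(0) for a path t -> e^{i theta(t)} on the unit circle.

    The quotient path(eps) / path(-eps) has argument theta(eps) - theta(-eps).
    """
    return cmath.phase(path(eps) / path(-eps)) / (2 * eps)

q = 3                     # illustrative integer exponent
theta0, h = 0.4, 1.5      # base angle and angular velocity of the test path

def c(t):
    """Path on S^1 based at e^{i theta0} with angular velocity h."""
    return cmath.exp(1j * (theta0 + h * t))

def fc(t):
    """The image path f.c."""
    return f(c(t), q)

print(angular_velocity(c))    # close to h
print(angular_velocity(fc))   # close to q * h
```

The second printed value should be about q times the first, matching the second component qh of Tf(z,h).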

Transversality

Since the tangent space to M at p is a local linear version of M near p, the geometry of configurations of linear spaces and subspaces can be used to model local geometry of configurations of manifolds and submanifolds. The development of this simple observation into a theory of transversality of intersection by Thom (see §4.7) has been of profound importance in many aspects of differential topology.

Two lines in R^3 will in general not meet; two planes will in general meet in a line, and a line and a plane will meet in a point. The geometry extends easily to R^n; two affine subspaces K,L of dimensions k,ℓ will in general not meet unless k + ℓ ≥ n, and if the dimension condition is satisfied then K and L will in general meet in such a way that dim K ∩ L = k + ℓ - n and K + L = R^n: this is easy to believe from considering low-dimensional cases that can be visualized. Applying this to tangent spaces of manifolds, we say that two submanifolds K,L of a finite-dimensional manifold M intersect transversally if for every point p on K ∩ L we have

    T_pK + T_pL = T_pM.

It is then not difficult to prove using the Inverse Function Theorem that K ∩ L is a (k + ℓ - n)-dimensional submanifold of M locally near p. If k + ℓ = n then M is expressible as the product K × L up to local diffeomorphism near p. If k + ℓ < n the transversality condition cannot be satisfied for any p in K ∩ L, so in this case we take transversality to mean K ∩ L is empty.
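The linear condition underlying transversality is a rank computation. The sketch below works only with linear subspaces of R^n given by spanning vectors (standing in for tangent spaces at a point of intersection); the function names and the example subspaces are illustrative assumptions, not taken from the text.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of vectors, by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        # Find a row at or below position r with a non-zero entry in this column.
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def is_transversal(K, L, n):
    """True if the subspaces spanned by K and by L together span R^n."""
    return rank(K + L) == n

def intersection_dim(K, L):
    """dim(K ∩ L) = dim K + dim L - dim(K + L)."""
    return rank(K) + rank(L) - rank(K + L)

# Two planes through the origin in R^3 (illustrative spanning vectors).
xy_plane = [(1, 0, 0), (0, 1, 0)]
xz_plane = [(1, 0, 0), (0, 0, 1)]
print(is_transversal(xy_plane, xz_plane, 3), intersection_dim(xy_plane, xz_plane))

# Two lines in R^3: k + l < n, so they can never span R^3.
x_axis, y_axis = [(1, 0, 0)], [(0, 1, 0)]
print(is_transversal(x_axis, y_axis, 3))
```

For the two planes the sum is all of R^3 and the intersection has dimension 2 + 2 - 3 = 1, in line with the count k + ℓ - n above.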

The idea becomes more useful if instead of two submanifolds we take one submanifold together with the image of another manifold under a smooth map into M. If S is a submanifold of M and f : N -> M is a smooth map we look at points p in N and say that f is transversal to S at p if either f(p) does not lie in S or, if it does, then

    T_pf(T_pN) + T_{f(p)}S = T_{f(p)}M.

See Figure 34. It follows from the Inverse Function Theorem that locally near p the inverse image f^{-1}(S) is (if not empty) a smooth submanifold of N of codimension equal to that of S in M. We have already met this idea before in the case when S is one point: then transversality means surjectivity of T_pf, and f is locally a submersion (cf. page 153).

The map version of transversality becomes the submanifold version when f is an embedding, and is very similar when f is an injective immersion.

We shall use it in this context when studying dynamical systems in §4.4.

Transversality makes sense in infinite dimensions, although to use any of the theory we have as usual to include a proviso about splitting. We might suppose that T_{f(p)}M splits into a direct sum T_{f(p)}S ⊕ F, say, and that (T_pf)^{-1}(T_{f(p)}S) is a factor in a splitting of T_pN, although sometimes we can assume less than this. Transversality theory in function spaces is a very powerful tool in differential topology and global analysis. See Abraham and Robbin [2].

Returning to the original ideas of 'general position' in R^n, we hope that in some sense a map f : N -> M would in general be transversal to a given submanifold S of M. This is the essence of Thom's Transversality Theorem (page 259). However, we cannot treat this here since we have not yet discussed a definition for 'in general'. We shall return to these questions in Chapter 4.

Riemannian structures

As remarked on page 168, the norm on E is forgotten when TM is modelled locally as U × E. Once the tangent bundle has been constructed, however, it becomes useful to have a measure of length in each T_pM, not only so that paths in M have speed (a number) as well as velocity (a tangent vector) but also in order to measure global contracting/expanding behaviour of dynamical systems: see §4.5. If E is a Hilbert space H (page 52), such as R^n for example, then it would be useful to be able to equip each T_pM with an inner product ⟨ , ⟩_p which varies continuously or, better, smoothly with respect to p. By this we mean that ⟨X,Y⟩_p should be a smooth function of p for all smooth vector fields X,Y defined in a neighbourhood of p. In fact it is always possible to do this, although certainly not in any unique way. The family of inner products is called a Riemannian structure (B. Riemann, 1826-1866); it is technically a section of a bundle of covariant tensors (see §4.6).

Given a Riemannian structure on M we have on each T_pM a norm ||·||_p defined (as on H) by

    ||v||_p = ⟨v,v⟩_p^{1/2}

for v ∈ T_pM. With this norm we can construct a metric on M, using the obvious intuitive idea of the 'shortest distance between two points'. Given a path γ : [0,1] -> M with γ(0) = p, γ(1) = q we define the distance from p to q along γ as

    d_γ(p,q) = ∫_0^1 ||γ'(t)||_{γ(t)} dt.

Then taking

    d(p,q) = inf_γ d_γ(p,q)

gives a metric on M. For this reason the Riemannian structure is itself often called a Riemannian metric.

3.5 VECTOR FIELDS AND DIFFERENTIAL EQUATIONS

Consider a system S of first order autonomous (i.e. no explicit t on the right hand side) ordinary differential equations

    ẋ_1 = X_1(x_1,x_2,...,x_n)
    ẋ_2 = X_2(x_1,x_2,...,x_n)
    ...
    ẋ_n = X_n(x_1,x_2,...,x_n)

where ẋ denotes dx/dt. We can write the system more economically as

    ẋ = X(x)

where x lies in some open subset U of R^n and X is a map from U to R^n, which we do not assume to have any special properties at this stage.

It is convenient, and in keeping with our general aim, to regard t here as a measure of time. The set U is called the phase space of S.

What does it mean to solve the equations? First we have to postulate some initial conditions such as

    x = p when t = 0

and then a solution means a path c : J -> U satisfying

    c(0) = p and ċ(t) = X(c(t))

for all t in the interval J, where ċ(t) is what in §2.4 we called c'(t) = Dc(t).1.

We think of X as prescribing a vector X(x) at each point x in U, and we want a path c based at p with the property that the tangent at a typical point c(t), i.e. the velocity of that point, is precisely the vector X(c(t)). We should therefore think of X(x) not as an element of R^n in which x lies, but as an element of the tangent space T_xU. Then X becomes a map

    X : U -> TU = U × R^n

with X(x) an element of T_xU for every x in U, or in other words X satisfies

    π·X = id_U

where π : TU -> U is the natural projection map. Such a map is called a vector field on U. Formally, a solution c then satisfies

    X(c(t)) = (c(t), ċ(t)),

where the seemingly redundant c(t) in the first factor is there to keep track of the particular tangent space in which X(c(t)) lies. Since by definition ċ(t) = Dc(t).1 we have in tangent map notation

    X(c(t)) = (c(t), Dc(t).1) = Tc(t,1) or T_tc(1).

These definitions, in which we have disentangled the space in which X(x) lies from that in which x itself lies, generalize immediately to manifolds.

DEFINITION A vector field on a smooth manifold M is a map X : M -> TM which satisfies

    π·X = id_M

where π is the natural projection TM -> M.

Thus X(p) lies in π^{-1}(p) = T_pM for every point p on M, i.e. X assigns a tangent vector (velocity) at p to each point p. Such a map is called a section of the bundle TM. See Figure 35.

Figure 35 (showing the image of X)

The notion of a solution to the system S now becomes:

DEFINITION Let X be a vector field on M. A solution curve to X, based at p on M, is a path c : J -> M (where J is some open interval (a,b) with a < 0 < b) satisfying c(0) = p and

    T_tc(1) = X(c(t))

for all t in J. (Here TJ = J × R and so T_tc is a map R -> TM.)

Thus a vector field on a manifold is the global version of a first order autonomous system of n ordinary differential equations on R^n. Observe that the definition works equally well for smooth manifolds modelled on an infinite dimensional normed linear space E.

Higher order equations

There is a standard trick for converting an nth order equation in one variable into a system of n first order equations in n variables. Assuming the nth order equation is of the form

    d^n x/dt^n = F(x, dx/dt, ..., d^{n-1}x/dt^{n-1})

we write x_1 for x and then

    ẋ_1 = x_2, ẋ_2 = x_3, ...

and finally

    ẋ_n = F(x_1,x_2,...,x_n),

giving us a first order system on R^n. The subset of R^n on which all this is defined (generally an open subset) is the phase space for the original equation.

In the case n = 2 we have

    ẋ_1 = x_2
    ẋ_2 = F(x_1,x_2)

and so from a second order equation on an open interval U in R we obtain a first order equation (i.e. a vector field) on U × R = TU. Not every vector field X on TU comes from a second order equation on U: it is easy to see that a necessary and sufficient condition on X for this is that the x_1-component of X(x) in T_x(TU) = R × R should be equal to x_2. In tangent bundle language this says

    Tπ · X = id_{TU}

where π : TU -> U is the usual projection, Tπ : T^2U -> TU is its tangent map, and X : TU -> T^2U is the vector field on TU. This generalizes verbatim to manifolds:

DEFINITION A second order autonomous ordinary differential equation on M is a vector field X : TM -> T^2M on TM which satisfies

    Tπ · X = id_{TM}

where π : TM -> M is the natural projection (so we have Tπ : T^2M -> TM).
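The passage from a second order equation to a vector field on TU can be sketched in code. The example below assumes the illustrative equation ẍ = -x (not taken from the text), builds the field X(x_1,x_2) = (x_2, F(x_1,x_2)), and integrates it with a classical Runge-Kutta step, which is one common numerical choice and not a method the text prescribes.

```python
import math

def second_order_to_field(F):
    """Vector field X(x1, x2) = (x2, F(x1, x2)) on U x R associated with x'' = F(x, x')."""
    def X(state):
        x1, x2 = state
        return (x2, F(x1, x2))
    return X

def rk4_step(X, state, dt):
    """One classical Runge-Kutta step for the first order system state' = X(state)."""
    def shifted(s, k, c):
        return tuple(si + c * ki for si, ki in zip(s, k))
    k1 = X(state)
    k2 = X(shifted(state, k1, dt / 2))
    k3 = X(shifted(state, k2, dt / 2))
    k4 = X(shifted(state, k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

# Illustrative equation x'' = -x; with x(0) = 1, x'(0) = 0 the solution is x(t) = cos t.
X = second_order_to_field(lambda x, v: -x)

state, dt, steps = (1.0, 0.0), 0.001, 1000
for _ in range(steps):
    state = rk4_step(X, state, dt)

# Compare the state after total time 1 with (cos 1, -sin 1).
print(state, (math.cos(1.0), -math.sin(1.0)))
```

Note that the first component of X(x_1,x_2) is always x_2, which is the coordinate form of the condition Tπ · X = id on the tangent bundle.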

Non-autonomous equations

A system of first order equations which contains t explicitly on the right hand side can be regarded as a family X_t of vector fields varying with t. This goes over directly to vector fields on manifolds. By introducing another variable u = t we can interpret the family of vector fields X_t as just one vector field X on the product manifold M × R. At a point (x,u) of M × R the tangent space to M × R is T_xM × T_uR = T_xM × R, and the component of X(x,u) in the second factor is 1 since u̇ = 1.

This technique is particularly useful when X_t is periodic (of period 1, say) in t: in this case X projects to a vector field on M × (R/Z) = M × S^1, giving rise to a flow with a cross-section: see §4.1.

Local existence and uniqueness of solutions

It is a standard and fundamental theorem that provided X is sufficiently well-behaved local solutions to X exist, are essentially unique and furthermore if X is smooth they vary smoothly with respect to the initial conditions. We will give a careful statement of a form of this theorem. Note that since we are working locally we operate in a normed linear space E rather than an arbitrary manifold. The theorem requires E to be a Banach space, since the proof involves techniques of infinite iteration and needs good criteria for convergence of sequences.

THEOREM (Local existence and uniqueness of solutions of ordinary differential equations) Let U be an open set in a Banach space E, and let f : U -> E be a C^r map with r ≥ 1. Then given any point p in U there exists a positive number ε, and a neighbourhood W of p in U such that for any x in W and any positive number δ ≤ ε there exists a unique solution curve

    c_x : (-δ,δ) -> U

to the equation ẋ = f(x), satisfying c_x(0) = x. Moreover, if we define a map

    φ : W × (-δ,δ) -> U

by φ(x,t) = c_x(t) then φ is a C^r map. Thus solutions vary C^r with respect to t and the initial conditions.

Remarks

1. This may seem rather an elaborate formulation of a simple idea. However, the reason for introducing W is that we want solutions to exist for -ε < t < ε throughout a whole neighbourhood of p. We may not be able to use U itself for this neighbourhood, since the nearer x is to the edge of U the shorter the positive (say) t-interval over which c_x is defined may be. See Figure 36. Also, we must formally allow the flexibility with δ, since otherwise it is just conceivable that solutions c^1_x, c^2_x could both exist with c^i_x(0) = x (i = 1,2) and be defined for, say, -ε/2 < t < ε/2 without contradicting the uniqueness statement for ε. (Of course these solutions would have to be non-extendible to solutions on -ε < t < ε, or ε-uniqueness would be contradicted.)

2. For proofs of the theorem see Dieudonné [32], Lang [67], or any of the many books on differential equations such as Arnol'd [11], Coddington and Levinson [30], Hartman [51], Lefschetz [69]. There is a neat proof using the Inverse Function Theorem directly, due to Robbin [104]. Most proofs begin by supposing X only to be C^1 or satisfying an even weaker Lipschitz condition (R. Lipschitz, 1832-1903), and use induction to prove the theorem in the C^r case. Moreover, proofs are usually given for the non-autonomous case, where f is a function of both x and t and the smoothness or Lipschitz conditions apply to x but not necessarily to t. However, we shall always be assuming smoothness (usually C^∞) in both variables and accordingly reduce the problem to the autonomous case as described on page 183.

We now want to extend this local existence and uniqueness theorem to a global theorem by continuing our local solution curves all over the manifold M as far as they will go, assuming X to be C^r (r ≥ 1) so that local solutions do exist. The key result enabling us to do this is the following lemma which says that any two solution curves based at p will agree on any interval on which they are both defined.

LEMMA Let c_1 : J_1 -> M and c_2 : J_2 -> M be two solution curves for X both based at the point p on M. Then c_1(t) = c_2(t) for all t where c_1 and c_2 are both defined, i.e. on the intersection J_1 ∩ J_2.

Proof This may seem obvious, but it is instructive to see how several topological notions from Chapter 1 are involved in formalizing this intuition. Let I be the subset of J_1 ∩ J_2 on which c_1 and c_2 agree; we aim to prove I = J_1 ∩ J_2. Note the following facts:

(1) I is not empty (it contains 0);

(2) I is open in J_1 ∩ J_2 (the local uniqueness guarantees this);

(3) I is closed in J_1 ∩ J_2. (If t is not in I then c_1(t) ≠ c_2(t), and since M is Hausdorff we can find neighbourhoods of c_1(t) and c_2(t) which do not meet. The continuity of c_1,c_2 implies that c_1(s) ≠ c_2(s) for all s within some distance of t, thus the complement of I is open, i.e. I is closed.)

Now J_1 ∩ J_2 is an interval, and is therefore connected (see §1.7). Hence (1), (2) and (3) imply that I = J_1 ∩ J_2, proving the lemma.

Finally, we construct a maximal solution curve c : J -> M based at p (i.e. one which cannot be extended over any larger t-interval containing J) as follows. Let {c_λ : J_λ -> M} be the family of all solution curves based at p. Let J be the union of all the various intervals J_λ, and define c : J -> M to be equal to c_λ on each J_λ. Then J is an interval (finite or infinite) and the above lemma shows that the definition of c is unambiguous. Clearly c : J -> M is the unique maximal solution curve for X based at p. We call it the global solution curve through p.

Global solutions need not be defined for all time t:

EXAMPLES

1. The equation ẋ = 1 has global solution through p given by c(t) = p+t. If the equation is defined on the open interval M = (-1,1) in R, then each solution curve is defined in M only for a finite time.

2. Let M = R and consider the equation ẋ = x^2. The global solution through p is given by c(t) = (p^{-1} - t)^{-1}, but when p > 0 this solution is defined only for -∞ < t < p^{-1}.

The problems are that (1) c(t) may reach the 'edge' of M in a finite time, or (2) c(t) can shoot off to infinity while t approaches some finite number. However, if M is compact there is no edge (our manifolds here have no boundary) and no way of shooting off to infinity, and so we would expect global solutions to be defined in this case for all t. The optimism is justified.
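The blow-up in Example 2 above can be seen numerically. This is a small sketch, assuming the illustrative initial value p = 2 (so that the blow-up time is 1/p = 0.5); it evaluates the closed-form solution and uses a finite difference to confirm that it does satisfy ẋ = x^2 at a moderate time.

```python
def solution(p, t):
    """c(t) = (1/p - t)^(-1), the solution of x' = x^2 with c(0) = p (p != 0)."""
    return 1.0 / (1.0 / p - t)

def residual(p, t, eps=1e-6):
    """Finite-difference estimate of c'(t) - c(t)^2, which should be near zero."""
    derivative = (solution(p, t + eps) - solution(p, t - eps)) / (2 * eps)
    return derivative - solution(p, t) ** 2

p = 2.0   # illustrative initial value; the solution exists only for t < 1/p = 0.5

# The values grow without bound as t approaches 0.5 from below.
for t in (0.0, 0.25, 0.4, 0.49, 0.499):
    print(t, solution(p, t))

# Check of the differential equation away from the blow-up time.
print(residual(p, 0.25))
```

The printed values increase sharply as t nears 0.5, which is case (2) of the discussion above: the solution shoots off to infinity in finite time.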

THEOREM If M is compact then each global solution curve for X is defined for all t in R.

Proof Let c : J -> M be the global solution curve through p, and suppose J = (a,b). We shall show b = ∞; the argument for a = -∞ is similar.

If b < ∞ we could find t_1,t_2,... in J with t_n -> b as n -> ∞ (e.g. t_n = b - 1/n). Look at the points c(t_1), c(t_2),... in M. If c(t_m) = c(t_n) for some m,n with m ≠ n, then local uniqueness and the above lemma show that c(t) = c(t + t_n - t_m) valid for all t in R, i.e. c is periodic with period t_n - t_m, contradicting b < ∞. Hence all the c(t_n) are distinct. By compactness of M there exists an accumulation point x* of the c(t_n)'s (see §1.6). Now the local existence and uniqueness theorem shows that unique solutions exist based at any x in some neighbourhood W of x* and defined for -ε < t < ε, where ε depends on W but not on x. Now choose t_n such that c(t_n) is in W and b - t_n is less than ε (possible since x* is an accumulation point of the c(t_n) and t_n -> b). By local uniqueness and the lemma above we then have c extendible to a solution on (a, t_n + ε), and since t_n + ε > b we have contradicted the maximality of J. Therefore b = ∞, as originally claimed.

Thus if $M$ is compact (and often when it is not) we can define $\phi(p,t) = c_p(t)$, where $c_p$ is the global solution for $X$ through $p$, and thereby obtain a map $\phi : M \times \mathbf{R} \to M$. This map, representing all the solutions for $X$ simultaneously, has a number of interesting properties which will be of central importance to what follows.

First of all, $\phi$ can be shown to be $C^r$ if $X$ is $C^r$, by piecing together the local information from the local existence and uniqueness theorem.

Secondly, if we write $\phi_t(p) = c_p(t)$ then $\phi_t : M \to M$ satisfies

$$\phi_t \cdot \phi_s = \phi_{t+s} : M \to M$$

for all $s, t$. This essentially says that uniqueness of solutions gives $t \mapsto c_p(t+s)$ as the unique global solution based at $c_p(s)$, so $\phi_{t+s}(p) = \phi_t(\phi_s(p))$. Also $\phi_0 = \mathrm{id}_M$ since $\phi_0(p) = c_p(0) = p$. Because

$$\phi_t \cdot \phi_{-t} = \phi_{-t} \cdot \phi_t = \phi_0 = \mathrm{id}_M$$

it immediately follows that $\phi_t$ is a diffeomorphism of $M$: $\phi_{-t}$ is the inverse $M \to M$.

Such a map $\phi$ is called a flow on $M$. This is the mathematical object which we shall use to model the general concept of a dynamical system governed by ordinary differential equations that we have been pursuing throughout the first three chapters of this book. The qualitative theory of dynamical systems becomes the theory of the structure of flows on manifolds, and this will occupy us for the whole of the fourth chapter.

Remarks on the literature.

Over the last decade or so there has been a slow but steady stream of books becoming available on differential topology.

In

rough chronological order (relevant since language and aims of exposition tend to change) Mackenzie Wallace Lang

there are Lang

[13],

[145] ,

[68J,

[66],

Narasimhan

[87],

Chillingworth

[[331 , Hirsch

[25],

Stamm

[15],

Spivak

Golubitsky and Guillemin

Dodson and Poston manifolds is

Munkres

Bishop and Crittenden

[130] .

recorded in Bourbaki

[128"], Milnor

[83],

[129], Brickell and Clark [20 ] ?

[42] ,

[52],

[86], Auslander and Spivak

Guillemin and Pollack

[48],

See also survey articles by

The formal theory of differentiable [18],

The series of volumes [129] by Spivak provide a very thorough and lively account of manifolds and differential geometry, although unfortunately they are not always easy to obtain.

For Morse theory see Milnor

article by Palais

[93]

in some books on differential and Clark,

or Spivak

[129],

treatments are Hochschild by Adams

[ 4 ].

[82]

or Wallace

is also strongly recommended. topology:

The classic text is Chevalley

[36]

the excellent

Lie groups

feature

see Bishop and Crittenden, Brickell [24];

other

or the concise and very clear lecture notes

A standard reference for theorems on the orbit structures

of Lie group actions is Borel [17]. Some references for algebraic topology are Agoston Young

[145] ;

[57], Maunder

[5

], Hocking and

[79].


For the basic theory of differential equations refer to any of the texts quoted on page 185, and also the short introduction to some qualitative aspects of the theory by Hurewicz [59]. The books by Arnol'd [11] and Hirsch and Smale [55] are both written very much from the qualitative point of view, and contain large numbers of interesting examples, applications and fresh ideas: they would provide a stimulating accompaniment to these notes.

4 Qualitative theory of dynamical systems

4.1 FLOWS AND DIFFEOMORPHISMS

We saw in the previous chapter that the solutions of a vector field (differential equation) on a compact manifold $M$ fit together to give a flow $\phi$ on $M$. Writing $\mathrm{Diff}^r(M)$ for the group of $C^r$ diffeomorphisms of $M$, the algebraic properties of $\phi$ say that the map $\phi : \mathbf{R} \to \mathrm{Diff}^r(M)$, $t \mapsto \phi_t$, is a group homomorphism, or in the language of Lie groups (see §3.2) that the flow is an action of $\mathbf{R}$ on the manifold $M$. The images of the solution curves of the vector field are the orbits of this action.

If we fix a value $t$ and consider positive and negative iterations $\ldots, \phi_{-2t}, \phi_{-t}, \phi_0, \phi_t, \phi_{2t}, \ldots$ (remember $\phi_{nt} = (\phi_t)^n$) we have a map $\mathbf{Z} \to \mathrm{Diff}^r(M) : n \mapsto \phi_{nt}$ which is an action of the group $\mathbf{Z}$ on $M$. This action is a kind of discrete approximation to the R-action $\phi$. More generally, any diffeomorphism $f$ of $M$ generates an action $n \mapsto f^n$ of $\mathbf{Z}$ on $M$, the orbit of a point $x$ being the set $\{f^n(x)\}$ for all $n = 0, \pm 1, \pm 2, \ldots$. We call this the Z-action generated by $f$. It does not follow that $f$ need come from a flow: see below.

In both the flow and diffeomorphism case the total configuration of all the orbits is called the phase portrait. It is this which we study when trying to understand the global qualitative behaviour of a dynamical system.

Remarks

1. One approach to the problem of classifying phase portraits is to take the quotient space (§1.3) of $M$ by the equivalence relation 'lying in the same orbit', so that orbits are then represented by points in the orbit space $M/\mathbf{R}$. However, this leads in general to such complications that it hardly sheds any light on the problem. One difficulty is that $M/\mathbf{R}$ may easily be non-Hausdorff. Try to imagine, for example, the orbit space for an irrational flow on the torus (Example 6 below), or even for an innocent linear system in $\mathbf{R}^2$ such as $\dot{x}_1 = x_1$, $\dot{x}_2 = -x_2$: see Figure 44.

2. In the same way that a flow represents the evolution of a system governed by ordinary differential equations, a diffeomorphism represents a difference equation of the form

$$x_{m+1} = f(x_m)$$

where each $x_m$ is a point on $M$. The sequence of points $x_0, x_1, x_2, \ldots$ is the forward orbit of $x_0$ under $f$, and by including $x_{-1} = f^{-1}(x_0)$ and so on we obtain the whole orbit. Therefore the theory of diffeomorphisms as Z-actions is the global qualitative theory of certain types of difference equations. For many types of difference equation the map $f$ may not be invertible, however, and so we no longer have a Z-action. Nevertheless, we could still study the behaviour of an arbitrary smooth map $f : M \to M$ under forward iterations. Although there are some interesting results along these lines, the global theory remains relatively unexplored.

3. We could drop differentiability from the discussions, and consider continuous actions where each $\phi_t$ or $f$ is merely a homeomorphism: then we need not work with manifolds, but any topological spaces we liked. The theory of such actions (often called continuous flows when $\phi_t$ varies continuously with $t$, or discrete flows, respectively) is called topological dynamics. Many useful results about flows can be proved in this generality, but differentiability yields much stronger results. We shall rely heavily on differentiability, which can usually be assumed in practical applications.
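The forward orbit and whole orbit of Remark 2 are easy to tabulate when both $f$ and $f^{-1}$ are available. The Python sketch below is only an illustration: the helper `orbit_segment` is a hypothetical name, and the diffeomorphism $f(x) = 2x$ of $\mathbf{R}$ is an assumed example, not one taken from the text.

```python
def orbit_segment(f, f_inv, x0, n):
    """Return [f^-n(x0), ..., f^-1(x0), x0, f(x0), ..., f^n(x0)].

    f and f_inv are assumed to be mutually inverse maps (a Z-action).
    """
    forward = [x0]
    for _ in range(n):
        forward.append(f(forward[-1]))   # x_{m+1} = f(x_m)
    backward = []
    x = x0
    for _ in range(n):
        x = f_inv(x)                     # x_{-m-1} = f^{-1}(x_{-m})
        backward.append(x)
    return backward[::-1] + forward


if __name__ == "__main__":
    double = lambda x: 2.0 * x
    halve = lambda x: x / 2.0
    print(orbit_segment(double, halve, 1.0, 3))
```

If $f$ is not invertible only the `forward` half of the list makes sense, which is the situation described at the end of Remark 2.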

The relationships between R-actions and Z-actions

There are two important ways in which R-actions give rise to Z-actions.

1. A choice of $t$ gives the Z-action $n \mapsto \phi_{nt}$ from the R-action $\phi$, as described above.

2. Let $\phi$ be a flow on $M$ with corresponding vector field $X$, and suppose that $\Sigma$ is a codimension-1 submanifold of $M$ with the following properties:

(a) every orbit meets $\Sigma$ for arbitrarily large positive and negative times $t$;

(b) if $p$ lies on $\Sigma$ then $X(p)$ is not tangent to $\Sigma$, i.e. $X(p)$ does not lie in $T_p\Sigma$. Thus orbits intersect $\Sigma$ transversally.

Then $\Sigma$ is called a cross-section of the flow $\phi$. It is not hard to prove that $\Sigma$ induces a Z-action on $\Sigma$ generated by $f : \Sigma \to \Sigma$, where for $p$ on $\Sigma$ we define $f(p)$ to be $\phi_{t_p}(p)$ when $t_p$ is the least positive $t$ for which $\phi_t(p)$ again lies in $\Sigma$. In general $t_p$ depends on $p$. We call $f$ the Poincaré map or first-return map for $\Sigma$. See Figure 37. The flow on $M \times S^1$ arising from a periodic non-autonomous system on $M$ (page 183) has a cross-section $\Sigma = M \times \{\theta\}$ for any fixed $\theta \in S^1$.

Figure 37

Conversely, a Z-action gives rise to an R-action in the following way. Let the Z-action be generated by $f$ in $\mathrm{Diff}^r(M)$. Take the product $M \times I$ (where $I$ is the closed interval $[0,1]$) and take a unit vector field $X$ pointing in the $I$-direction, thinking of $M \times I$ as contained in $M \times \mathbf{R}$. Now glue the two ends of $M \times I$ together, attaching the point $(x,0)$ at the 0-end to the point $(f(x),1)$ at the 1-end. All this can be made respectable, using the idea of quotient topology (§1.3): compare the mathematical description of the Möbius band in §3.2. See Figure 38.

Figure 38

We obtain a vector field on a smooth manifold of dimension $n + 1$ (if $\dim M = n$), and the corresponding flow, called the suspension of $f$, has $M$ as a cross-section.

Thus R- and Z-actions can be associated to each other in ways which reflect their orbit-structures. This association has serious limitations, though:

1. Not every flow is the suspension of a diffeomorphism, since for example a suspension has no points where the vector field vanishes. This also implies that not every flow has a cross-section.

2. Not every diffeomorphism can be incorporated into a flow on the same manifold (i.e. as $\phi_t$ for some $t$). Clearly a diffeomorphism which is not deformable continuously to the identity cannot be $\phi_t$ for any $t$. Deformability is still not enough: let $f : S^1 \to S^1$ be a diffeomorphism defined by $f(e^{i\theta}) = \rho\psi_1(e^{i\theta})$, where $\psi$ is a flow whose orbits are as illustrated in Figure 39, and $\rho$ is rotation by $180°$.

Figure 39

Then $f(p) = q$, $f(q) = p$, and so $f^2(p) = p$, $f^2(q) = q$. If $f = \phi_t$ then $f^2 = \phi_{2t}$, and the whole circle must consist of periodic points of period $2t$ (otherwise $p$ would not be periodic): see below. But this would imply $f^2(r) = r$ for any $r$, which is not the case. Therefore $f$ is not of the form $\phi_t$ for any $t$.

Nevertheless it is clear that the study of flows and the study of iterated diffeomorphisms are very closely related, and we shall pursue them both in parallel. For flows we tend to write $\phi_t(p)$ as $\phi_t p$.
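The first-return construction described above can be made concrete for a flow whose solutions are known explicitly. The Python sketch below assumes the constant vector field $\dot{\theta} = \alpha$, $\dot{\varphi} = \beta$ on the torus (Example 6 below) with $\beta > 0$, and takes the circle $\varphi = 0$ as cross-section; the function names and the choice of cross-section are illustrative assumptions.

```python
import math

TWO_PI = 2.0 * math.pi


def torus_flow(theta, phi, t, alpha, beta):
    """Flow of d(theta)/dt = alpha, d(phi)/dt = beta on S^1 x S^1 (angles mod 2*pi)."""
    return (theta + alpha * t) % TWO_PI, (phi + beta * t) % TWO_PI


def first_return(theta, alpha, beta):
    """First-return map on the cross-section {phi = 0}, assuming beta > 0.

    The return time 2*pi/beta is the same for every point of the section,
    so the first-return map is rotation of the circle by 2*pi*alpha/beta.
    """
    t_return = TWO_PI / beta
    return (theta + alpha * t_return) % TWO_PI


if __name__ == "__main__":
    theta = 0.5
    for _ in range(4):
        theta = first_return(theta, alpha=1.0, beta=2.0)
        print(theta)
```

With $\alpha/\beta = 1/2$ every point of the section returns to itself after two applications of `first_return`, so periodic orbits of the flow show up as periodic points of the first-return map.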

Important features

1. A fixed point of a flow $\phi$ is a point $p$ for which $\phi_t p = p$ for all $t$, i.e. for which the vector field vanishes: $X(p) = 0$. Thus it is also called a zero (or, misleadingly, a singularity or critical point) of the vector field. It corresponds to an equilibrium state of the dynamical system being modelled. A fixed point of a diffeomorphism $f$ is a point $p$ for which $f(p) = p$, and hence $f^n(p) = p$ for all integers $n$.

2. A periodic point of a flow $\phi$ is a point $q$ for which there exists some $T > 0$ with $\phi_T q = q$. The least such $T$ is called the period of $q$. If $r = \phi_s q$ is any point on the orbit of $q$ then

$$\phi_T r = \phi_{T+s} q = \phi_{s+T} q = \phi_s q = r$$

so $r$ is also a periodic point with the same period as $q$. Therefore we describe the orbit of $q$ as a periodic orbit of period $T$. A periodic point of a diffeomorphism $f$ is a point $q$ such that $f^m(q) = q$ for some $m > 0$ (or $m > 1$ if we want to distinguish periodic points from fixed points). The least such $m$ is the period of $q$.

Every fixed or periodic point of $f$ gives rise to a periodic orbit of the suspension flow; conversely, every periodic orbit of a flow with a cross-section $\Sigma$ gives rise to a fixed or periodic point of the first-return map for $\Sigma$. This fact that the phenomenon of periodicity is preserved when taking suspensions and cross-sections is one of the justifications for studying diffeomorphisms in order to understand flows.

Fixed points and periodic orbits correspond to 'observable' phenomena in a dynamical system, since they represent states which are unchanging or which repeat themselves. However (since we can only measure to within a certain finite accuracy), points $p$ whose orbits return infinitely often to any arbitrarily small neighbourhood of them, or even points $p$ which have points $q$ arbitrarily close to $p$ which return infinitely often (for times $t \to \infty$) to any neighbourhood of $p$, may also be associated to some kind of observable phenomenon, although it could have a rather chaotic or turbulent character. Recurrent behaviour of this type is called non-wandering behaviour. A point $p$ is wandering if it has a neighbourhood which never returns to intersect itself, and a point which is not wandering is non-wandering. Formally, we express this as follows:

DEFINITION A point $p$ on $M$ is a non-wandering point for the flow $\phi$ if, given any neighbourhood $U$ of $p$, there exist arbitrarily large $t > 0$ for which $U \cap \phi_t U$ is non-empty. Similarly for a diffeomorphism $f$, replacing $\phi_t$ with $f^n$ ($n > 0$).

Remarks

1. The set of non-wandering points is denoted by $\Omega(\phi)$, $\Omega(f)$ or simply $\Omega$.

2. Let $V = \phi_s U$. Then if $U \cap \phi_t U$ is nonempty it follows that $\phi_s U \cap \phi_{s+t} U = V \cap \phi_t V$ is also nonempty. From this it is easy to deduce

(i) if $x$ is in $\Omega(\phi)$ then the whole orbit of $x$ is in $\Omega(\phi)$;

(ii) using $t < 0$ would give the same definition of $\Omega(\phi)$ (put $s = -t$ in the above).

Result (ii) shows that if $\phi'$ is the flow obtained from $-X$ then $\Omega(\phi') = \Omega(\phi)$. Similar remarks apply to $\Omega(f)$. In particular, $\Omega(f) = \Omega(f^{-1})$.

3. Clearly fixed points and periodic orbits all lie in $\Omega$.

4. The set of wandering points is obviously an open set, and so $\Omega$ is a closed set. If $M$ happens to be compact this will imply that $\Omega$ is compact. The fact that $\Omega$ is closed means that it must contain at least the closure of the set of fixed and periodic points.

5. If a point is wandering it corresponds to transient behaviour in a dynamical system, and so to understand the long-term behaviour we need to know where (if anywhere) the point can be said to approach asymptotically as $t \to +\infty$.

For any point $x$ we say that another point $y$ is an $\omega$-limit point of (the orbit of) $x$ if there are points $\phi_{t_1}x, \phi_{t_2}x, \ldots$ on the orbit of $x$ with $t_i \to +\infty$ and $\phi_{t_i}x \to y$ as $i \to \infty$. This does not necessarily imply that $\phi_t x \to y$ as $t \to +\infty$: see examples below. The set of all such $y$ for given $x$ is called the $\omega$-limit set of $x$, written $\omega(x)$. Replacing $t \to +\infty$ by $t \to -\infty$ gives the $\alpha$-limit set $\alpha(x)$. Replacing $t_i$ by $n_i$ gives a corresponding definition of $\alpha$ and $\omega$ for diffeomorphisms.

A straightforward argument shows that $\omega(x)$ and $\alpha(x)$ are both subsets of $\Omega$, for every $x$. If $x$ is fixed or periodic then $\alpha(x) = \omega(x) =$ orbit of $x$.

EXAMPLES of flows

1. Simple harmonic oscillator

$$\ddot{x} + k^2 x = 0$$

We convert this second order equation on $\mathbf{R}$ into a first order system on $T\mathbf{R} = \mathbf{R} \times \mathbf{R} = \mathbf{R}^2$ by setting $\dot{x} = y$ and obtaining

$$\dot{x} = y, \qquad \dot{y} = -k^2 x .$$

The equation is easy to solve explicitly, and a formula for the flow $\phi : \mathbf{R}^2 \times \mathbf{R} \to \mathbf{R}^2$ can be worked out as

$$\phi(x,y;t) = (x \cos kt + k^{-1} y \sin kt,\; -kx \sin kt + y \cos kt) .$$

(It is instructive to verify that this is a flow.) The orbits are ellipses $k^2x^2 + y^2 = $ constant, except for $(0,0)$ which is the unique fixed point. Every non-fixed point is periodic, so every point is non-wandering: $\Omega = \mathbf{R}^2$.
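The verification suggested in Example 1 can also be carried out numerically. The following Python sketch uses the formula for $\phi(x,y;t)$ given in the text; the function name `sho_flow` and the sample values are illustrative assumptions.

```python
import math


def sho_flow(x, y, t, k=1.0):
    """Time-t map of the simple harmonic oscillator x' = y, y' = -k^2 x."""
    c, s = math.cos(k * t), math.sin(k * t)
    return x * c + y * s / k, -k * x * s + y * c


def invariant(x, y, k=1.0):
    """The quantity k^2 x^2 + y^2, constant along each orbit."""
    return k * k * x * x + y * y


if __name__ == "__main__":
    k, p = 2.0, (1.0, 2.0)
    s, t = 0.3, 0.5
    # Group property: flowing for s and then for t equals flowing for s + t.
    print(sho_flow(*sho_flow(*p, s, k=k), t, k=k))
    print(sho_flow(*p, s + t, k=k))
    # Every non-fixed point is periodic with period 2*pi/k.
    print(sho_flow(*p, 2.0 * math.pi / k, k=k))
```

The first two printed pairs agree up to rounding, which is the identity $\phi_t \cdot \phi_s = \phi_{t+s}$, and the third reproduces the starting point.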

2. Damped simple harmonic oscillator

$$\ddot{x} + 2b\dot{x} + k^2 x = 0 \qquad (b > 0)$$

Here

$$\dot{x} = y, \qquad \dot{y} = -2by - k^2 x$$

and without working out the equations for the flow we know that the vectors now have an 'inwards' component and so we deduce on qualitative grounds alone that all orbits spiral in towards the origin as $t \to +\infty$. Hence the only $\omega$-limit set is the origin. Also, every $\alpha(p)$ is empty except that $\alpha(0,0) = (0,0)$. Thus here $\Omega$ consists of the origin and nothing else.

This and the previous example are both linear systems, meaning that the map $\mathbf{R}^2 \to \mathbf{R}^2 : (x,y) \mapsto (\dot{x},\dot{y})$ is a linear map. For pictures of the orbit structures of these and other linear systems in $\mathbf{R}^2$ see Figure 44.

3. Simple pendulum.

$$m\ell\ddot{\theta} + mg \sin\theta = 0$$

Take $\ell = g = 1$, so that $\ddot{\theta} + \sin\theta = 0$. Here $\theta$ represents a point on $S^1$, and we convert this second order equation on $S^1$ into a first order equation on $TS^1 = S^1 \times \mathbf{R}$ by putting $\dot{\theta} = v$ to obtain

$$\dot{\theta} = v, \qquad \dot{v} = -\sin\theta .$$

When $S^1 \times \mathbf{R}$ is 'unrolled' the orbits of the flow are as shown in Figure 40. Each orbit lies in some subset $E^{-1}(c)$ where $E : TS^1 \to \mathbf{R}$ is the energy map

$$(\theta,v) \mapsto \tfrac{1}{2}v^2 - \cos\theta$$

(this simply says that energy is conserved): see Figure 41. Here $\Omega$ is all of $TS^1$.

Figure 41

Observe that $E$ has two critical points $(0,0)$ and $(\pi,0)$; they are both non-degenerate (minimum and saddle-point, respectively). For every regular value $c$ of $E$ the inverse image $E^{-1}(c)$ is a codimension-1 submanifold of $TS^1$ (see §3.3, page 157). It is now a very instructive exercise to visualize what happens to the orbits when a damping term is added, turning the equation into

$$\ddot{\theta} + b\dot{\theta} + \sin\theta = 0$$

and corresponding to a loss of energy along each non-fixed orbit.
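Conservation of energy for the undamped pendulum can be observed numerically. The Python sketch below integrates $\dot{\theta} = v$, $\dot{v} = -\sin\theta$ with a classical fourth-order Runge-Kutta step and records the largest deviation of $E(\theta,v) = \tfrac{1}{2}v^2 - \cos\theta$ from its initial value. The integrator, step size and function names are assumptions made for illustration; the numerical orbit conserves $E$ only approximately.

```python
import math


def pendulum_field(theta, v):
    """Vector field of the simple pendulum with l = g = 1."""
    return v, -math.sin(theta)


def energy(theta, v):
    """Energy map E(theta, v) = v^2/2 - cos(theta)."""
    return 0.5 * v * v - math.cos(theta)


def rk4_step(theta, v, dt):
    """One classical Runge-Kutta step of size dt."""
    k1 = pendulum_field(theta, v)
    k2 = pendulum_field(theta + 0.5 * dt * k1[0], v + 0.5 * dt * k1[1])
    k3 = pendulum_field(theta + 0.5 * dt * k2[0], v + 0.5 * dt * k2[1])
    k4 = pendulum_field(theta + dt * k3[0], v + dt * k3[1])
    theta += dt * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
    v += dt * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return theta, v


def energy_drift(theta0, v0, dt=0.01, steps=2000):
    """Largest |E - E(initial)| seen along the numerical orbit."""
    e0 = energy(theta0, v0)
    theta, v = theta0, v0
    worst = 0.0
    for _ in range(steps):
        theta, v = rk4_step(theta, v, dt)
        worst = max(worst, abs(energy(theta, v) - e0))
    return worst


if __name__ == "__main__":
    print(energy_drift(1.0, 0.0))   # oscillating orbit
    print(energy_drift(0.0, 2.5))   # rotating orbit
```

Both printed drifts are small compared with the energies involved, consistent with each orbit lying in a level set $E^{-1}(c)$.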

4. Gradient flows on $\mathbf{R}^n$

If $X$ is a vector field on $\mathbf{R}^n$ corresponding to equations of the form

$$\dot{x}_1 = \frac{\partial V}{\partial x_1}, \quad \dot{x}_2 = \frac{\partial V}{\partial x_2}, \quad \ldots, \quad \dot{x}_n = \frac{\partial V}{\partial x_n}$$

where $V : \mathbf{R}^n \to \mathbf{R}$ is some smooth function, then $X$ is called a gradient vector field. It is easy to show that $V$ can never decrease along solution curves, since

$$\frac{d}{dt}V(c(t)) = DV(c(t)).\dot{c}(t) = \sum_{i=1}^n \frac{\partial V}{\partial x_i}\,\dot{c}_i(t) = \sum_{i=1}^n (\dot{c}_i(t))^2 \ge 0 .$$

In fact $\frac{d}{dt}V(c(t)) > 0$ unless $\dot{c}(t) = 0$, in which case (by uniqueness of solutions) $c(t) = $ constant. From this it follows that if solutions are defined for all $t$, so that $X$ gives rise to a flow $\phi$ on $\mathbf{R}^n$, the only non-wandering points are fixed points, i.e. there can be no periodic or other recurrence phenomena: this is because once a point $p = c(0)$ has left for $c(t) = \phi_t(p)$ the value of $V$ must have increased, and so by the continuity of $V$ there will be some neighbourhood of $p$ into which $\phi_t(p)$ can never return.

This relatively simple structure of $\Omega$ for gradient flows on $\mathbf{R}^n$ allows their qualitative behaviour to be analyzed quite thoroughly. We shall return to this important fact and its implications later, in the wider context of manifolds.

5. The van der Pol oscillator

$$\ddot{x} - \alpha(1-x^2)\dot{x} + x = 0, \qquad \alpha > 0$$

Converting this as usual into a first order system, we have

$$\dot{x} = y, \qquad \dot{y} = \alpha(1-x^2)y - x$$

which is a non-linear system exhibiting a particularly important type of behaviour. The origin is clearly a fixed point, and local analysis there (see §4.2) shows that solution curves are spiralling away from the origin. However, they do not spiral out to infinity but approach a periodic orbit $\gamma$ which surrounds the origin. Outside $\gamma$ the solution curves spiral in towards $\gamma$, and $\gamma$ is therefore called a limit cycle. It is the $\omega$-limit set of every point except the origin; here $\Omega = \gamma \cup \{\text{origin}\}$. See Figure 42.

Figure 42

The physical interpretation of the picture is that, no matter what the initial conditions are apart from $(0,0)$, the system will eventually (or possibly very quickly) be behaving in a way which is practically indistinguishable from the periodic behaviour of points on $\gamma$. Indeed, even if it is initially at $(0,0)$, the slightest perturbation can cause it to rocket out towards $\gamma$.

It is surprisingly difficult to prove rigorously the existence of the single limit cycle $\gamma$ for this system, although it is easy enough to construct artificially other examples of systems exhibiting limit cycles. For example [...] has the circle $r = 1$ as a limit cycle. The van der Pol equation is historically important as the basic equation for the oscillation of a radio valve.
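The approach to the limit cycle of the van der Pol oscillator from inside and from outside can be observed numerically. The Python sketch below integrates $\dot{x} = y$, $\dot{y} = \alpha(1-x^2)y - x$ with a fourth-order Runge-Kutta step, discards a transient, and then reports the largest $|x|$ seen. The integrator, the value $\alpha = 1$, the time windows and the function names are all assumptions chosen for illustration, and a numerical experiment of this kind is no substitute for the existence proof mentioned in the text.

```python
def van_der_pol(x, y, alpha):
    """Vector field x' = y, y' = alpha*(1 - x^2)*y - x."""
    return y, alpha * (1.0 - x * x) * y - x


def rk4_step(x, y, dt, alpha):
    """One classical Runge-Kutta step of size dt."""
    k1 = van_der_pol(x, y, alpha)
    k2 = van_der_pol(x + 0.5 * dt * k1[0], y + 0.5 * dt * k1[1], alpha)
    k3 = van_der_pol(x + 0.5 * dt * k2[0], y + 0.5 * dt * k2[1], alpha)
    k4 = van_der_pol(x + dt * k3[0], y + dt * k3[1], alpha)
    x += dt * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
    y += dt * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return x, y


def late_amplitude(x0, y0, alpha=1.0, dt=0.01, t_transient=60.0, t_sample=20.0):
    """Largest |x| over a sampling window that starts after the transient."""
    x, y = x0, y0
    for _ in range(int(round(t_transient / dt))):
        x, y = rk4_step(x, y, dt, alpha)
    amplitude = 0.0
    for _ in range(int(round(t_sample / dt))):
        x, y = rk4_step(x, y, dt, alpha)
        amplitude = max(amplitude, abs(x))
    return amplitude


if __name__ == "__main__":
    print(late_amplitude(0.1, 0.0))   # starts near the origin, inside the cycle
    print(late_amplitude(4.0, 0.0))   # starts outside the cycle
```

Both starting points settle onto an oscillation of essentially the same amplitude, which is the behaviour the text describes for the limit cycle $\gamma$.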

6. Rational and irrational flows on the torus

Recall that in §3.3 we studied immersions $\mathbf{R} \to S^1 \times S^1$ of the form $t \mapsto (\exp i\alpha t, \exp i\beta t)$. Such a curve arises as a solution curve to the vector field on $S^1 \times S^1$ corresponding in $(\theta,\varphi)$ 'co-ordinates' to the first order system

$$\dot{\theta} = \alpha, \qquad \dot{\varphi} = \beta$$

(in fact it is the global solution through $(\theta,\varphi) = (0,0)$). If $\alpha/\beta$ is rational then every orbit of the corresponding flow is periodic, hence non-wandering. If $\alpha/\beta$ is irrational, then every orbit is dense in the torus, hence again non-wandering. Therefore in both cases $\Omega = S^1 \times S^1$.

4.2 LOCAL BEHAVIOUR NEAR FIXED POINTS AND PERIODIC ORBITS

In studying the qualitative properties of flows and diffeomorphisms our approach will be (1) to investigate behaviour on and near the non-wandering set $\Omega$ (where the action is) and then (2) piece together a global picture from information about how orbits go from (near) one part of $\Omega$ to (near) another. We begin by looking at behaviour near fixed points, corresponding physically to behaviour near equilibrium states. First we will consider flows, and then discuss the analogous behaviour for diffeomorphisms.

Local linearization of flows

Let $p$ be a fixed point of the smooth flow $\phi$ on $M$. For each $t$ we have a linear map

$$T_p\phi_t : T_pM \to T_pM$$

and the chain rule (page 166) gives

$$T_p\phi_s . T_p\phi_t = T_p(\phi_s\phi_t) = T_p\phi_{t+s}$$

so as $t$ runs through $\mathbf{R}$ the family of tangent maps $\{T_p\phi_t\}$ gives us a flow on the linear space $T_pM$, consisting of (linear) diffeomorphisms of $T_pM$. We call this the linearization of $\phi$ at $p$.

If $M = \mathbf{R}^n$ and $p = 0$, and $X$ is the vector field corresponding to $\phi$, then $X(0) = 0$. The derivative $DX(0)$ of $X$ at $0$ is a linear map $\mathbf{R}^n \to \mathbf{R}^n$, and we can consider the linear system

$$\dot{x} = DX(0).x$$

on $\mathbf{R}^n$. This is called the linearization of $X$ at $0$. Not surprisingly, the flow corresponding to this linear system turns out to be the coordinate representation in $\mathbf{R}^n$ of the above linearization of $\phi$ at $p$. The proof is simple: if $X$ gives rise to the flow $\phi$ then

$$\frac{\partial}{\partial t}\phi_t x = X(\phi_t x)$$

and so differentiation with respect to $x$ gives

$$\frac{\partial}{\partial t} D\phi_t(x) = DX(\phi_t x)\, D\phi_t(x)$$

since we can interchange differentiation with respect to $x$ and $t$ (see page 77). Now we put $x = 0$ and get

$$\frac{\partial}{\partial t} D\phi_t(0) = DX(0)\, D\phi_t(0)$$

showing that $(v,t) \mapsto D\phi_t(0)v$ is the flow arising from $DX(0)$.

It is reasonable to ask whether knowing about the behaviour of the linearization of $\phi$ at $p$ will be of any use in trying to understand the behaviour of $\phi$ itself near $p$. As with the Inverse Function Theorem and the Morse Lemma for maps and for functions, the answer is Yes, provided certain non-degeneracy conditions are met.

Since we are working locally we may as well suppose that $M = \mathbf{R}^n$ and $p = 0$, and we will once again use the language of germs (§2.7). The vector field $X$ is thought of locally as the germ at $0$ of a smooth map $X : \mathbf{R}^n \to \mathbf{R}^n$. The kind of equivalence we are interested in this time will not be right equivalence or right-left equivalence but something specially adapted to vector fields, since a local change of co-ordinates in the domain of $X$ automatically gives a transformation of both the right hand side and the left hand side of the system $\dot{x} = X(x)$. Explicitly, if $x = h(y)$ where $h$ is a local diffeomorphism then $\dot{x} = Dh(y)\dot{y}$ and so the system in the new $y$-coordinates becomes

$$\dot{y} = (Dh(y))^{-1}\dot{x} = Dh^{-1}(h(y)).X(h(y)) = Y(y),$$

say.

Therefore the idea of two vector field germs being the same up to differentiable ($C^r$) change of co-ordinates becomes formally the following:

DEFINITION Two vector field germs $X$, $Y$ are $C^r$-equivalent if there exists a $C^r$ diffeomorphism germ $h$ such that

$$Y(\cdot) = Dh^{-1}(h(\cdot)).X(h(\cdot))$$

or, more simply, if

$$Yh^{-1}(\cdot) = Dh^{-1}(\cdot).X(\cdot) .$$

Now we can ask whether the germ of a vector field $X$ at $0$ is $C^r$ equivalent for some $r$ to the germ at $0$ of the linearized vector field $DX(0)$, and the answer we receive is: usually.

THEOREM (Sternberg [131]) For each integer $r \ge 2$ there is a finite number of relationships among the eigenvalues of $DX(0)$ with the property that $X$ is $C^r$-equivalent to $DX(0)$ unless one or more of the relationships holds.

Remarks

1. The relationships are as follows: an eigenvalue $\lambda_j$ can be expressed in terms of all the eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$ (including $\lambda_j$) as

$$\lambda_j = m_1\lambda_1 + m_2\lambda_2 + \ldots + m_n\lambda_n$$

where the $m_i \ge 0$ are integers with $2 \le \sum m_i \le$ a certain $N(r)$. If any eigenvalue $\lambda_j$ is purely imaginary ($\lambda_j = i\omega$) then $-\lambda_j$ is also an eigenvalue $\lambda_k$, and so we can write

$$\lambda_j = 2\lambda_j + \lambda_k$$

which is a relationship of the proscribed type. In this case $X$ may not be linearizable. As we see on the next page, there are elementary and intuitive reasons why this is so, but it is important to realize that these Sternberg conditions are rather subtle and in general non-intuitive, showing for example that if $n = 2$ and $\lambda_1 = 1$, $\lambda_2 = 2$ then $X$ may not be $C^2$ linearizable.

2. The set of linear maps $L : \mathbf{R}^n \to \mathbf{R}^n$ avoiding the Sternberg relationships for given $r$ is open and dense in $L(\mathbf{R}^n,\mathbf{R}^n)$.

3. For $X$ to be $C^\infty$ linearizable it must avoid the Sternberg relationships for every $r \ge 1$, so an infinite number of conditions have to be met. The set of linear maps $L : \mathbf{R}^n \to \mathbf{R}^n$ avoiding the relationships for all $r$ is dense but no longer open in $L(\mathbf{R}^n,\mathbf{R}^n)$: it is a residual set (see §4.4).
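The relationships of Remark 1 can be searched for mechanically once a bound on $\sum m_i$ is fixed. The Python sketch below is only an illustration: the function name `resonances` is hypothetical, and the parameter `max_order` stands in for the bound $N(r)$, whose actual value is not given here and must be supplied by the reader.

```python
from itertools import product


def resonances(eigenvalues, max_order, tol=1e-9):
    """List pairs (j, m) with eigenvalues[j] = sum(m[i] * eigenvalues[i]).

    Only integer vectors m with m[i] >= 0 and 2 <= sum(m) <= max_order are
    tried; max_order is an assumed stand-in for the bound N(r).
    """
    n = len(eigenvalues)
    found = []
    for m in product(range(max_order + 1), repeat=n):
        if not 2 <= sum(m) <= max_order:
            continue
        combination = sum(mi * lam for mi, lam in zip(m, eigenvalues))
        for j, lam_j in enumerate(eigenvalues):
            if abs(combination - lam_j) < tol:
                found.append((j, m))
    return found


if __name__ == "__main__":
    print(resonances([1, 2], 3))       # lambda_2 = 2 * lambda_1
    print(resonances([1j, -1j], 3))    # purely imaginary pair
    print(resonances([-1, -2 ** 0.5], 5))
```

The first call reports the relationship $\lambda_2 = 2\lambda_1$ for the pair $\lambda_1 = 1$, $\lambda_2 = 2$ mentioned in Remark 1, and the second reports $\lambda_j = 2\lambda_j + \lambda_k$ for a purely imaginary pair.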

If we are concerned with a qualitative understanding of the local behaviour of a dynamical system near an equilibrium, then Sternberg's Theorem is too powerful. For example, if we want to know if the equilibrium point $p$ is a stable equilibrium (the orbits of points near $p$ all approach $p$ as $t \to +\infty$) we do not need $C^r$ information about $X$ but only topological ($C^0$) information about the configuration of the orbits of the flow locally. The key result here is the theorem of Hartman [50] and Grobman [44], usually referred to in non-Soviet literature as Hartman's Theorem. It was proved at a surprisingly late date in the development of the theory of differential equations.

THEOREM (Hartman) If $DX(0)$ has no purely imaginary (including zero) eigenvalues, then there is a homeomorphism germ at $0$ in $\mathbf{R}^n$ locally taking orbits of $\phi$ to orbits of the linearized flow (i.e. the flow defined by $DX(0)$). The homeomorphism preserves the sense of orbits ($t \to +\infty$ or $-\infty$) but not necessarily the parametrization of orbits by $t$.

That the eigenvalue condition is necessary is illustrated by the simple harmonic motion system $X$ given by

$$\dot{x} = y, \qquad \dot{y} = -x$$

(see §4.1, Example 1). Here the field is already linear, but if we take

$$\dot{x} = y + x(x^2 + y^2), \qquad \dot{y} = -x + y(x^2 + y^2)$$

then all orbits except $(0,0)$ spiral outwards and so in no neighbourhood of the origin can there be a homeomorphism taking orbits of this flow to the circular orbits of the simple harmonic flow arising from $DX(0,0)$.
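The outward spiralling can be checked by a short calculation. The sketch below assumes that the perturbed system is $\dot{x} = y + x(x^2+y^2)$, $\dot{y} = -x + y(x^2+y^2)$ (the cubic terms are a reconstruction) and writes $\rho = x^2 + y^2$ for the squared distance from the origin, a symbol introduced only for this calculation.

```latex
\begin{align*}
\frac{d\rho}{dt}
  &= 2x\dot{x} + 2y\dot{y} \\
  &= 2x\bigl(y + x\rho\bigr) + 2y\bigl(-x + y\rho\bigr) \\
  &= 2(x^2 + y^2)\rho \\
  &= 2\rho^2 .
\end{align*}
% Hence d\rho/dt > 0 at every point other than the origin: \rho increases
% strictly along every non-fixed orbit, so no such orbit can be closed.
% The linear terms contribute nothing to d\rho/dt, which is why the
% linearization alone has circular orbits.
```

Under the same assumption the equation $\dot{\rho} = 2\rho^2$ has the same form as Example 2 of the previous chapter's discussion of $\dot{x} = x^2$ up to a constant factor, so these orbits in fact escape to infinity in finite time.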

Analogous results for diffeomorphisms

If a $C^r$ diffeomorphism germ $h$ converts the system $\dot{x} = X(x)$ into $\dot{y} = Y(y)$ then since we are after all dealing only with a change of co-ordinates it is clear that under $h$ the orbits of the respective flows $\phi$, $\psi$ must agree and also respect parametrization by $t$, so that

$$h\phi_t(x) = \psi_t h(x)$$

locally or, as germs,

$$h \cdot \phi_t = \psi_t \cdot h .$$

This says that the diffeomorphisms $\phi_t$, $\psi_t$ are locally conjugate (page 26) by the $C^r$ diffeomorphism $h$. This notion of conjugacy at the germ level is what corresponds in the diffeomorphism case to $C^r$-equivalence in the vector field or flow case.

Sternberg's Theorem for local linearization of a diffeomorphism germ $f$ at $0$ is as before, with $C^r$-conjugacy replacing $C^r$-equivalence. The relationships which the eigenvalues $\mu_1, \mu_2, \ldots, \mu_n$ of $Df(0)$ should avoid are

$$\mu_j = \mu_1^{m_1}\mu_2^{m_2}\cdots\mu_n^{m_n}$$

for $m_i \ge 0$ and $2 \le \sum m_i \le N(r)$. Hartman's Theorem becomes the following:

THEOREM (Hartman's Theorem for diffeomorphisms) If $Df(0)$ has no eigenvalues with modulus equal to $1$ then the germ of $f$ at $0$ is $C^0$-conjugate to the germ at $0$ of the linear map $L = Df(0)$.

It follows immediately that the conjugating homeomorphism germ $h$ satisfying $hf = Lh$ will locally take the orbit of each $x$ under the $f$-action to the orbit of $h(x)$ under the $L$-action, since

$$h \cdot f^2(x) = h \cdot f \cdot f(x) = L \cdot h \cdot f(x) = L^2 h(x)$$

and similarly

$$h \cdot f^m(x) = L^m \cdot h(x)$$

for every integer $m$.
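A conjugacy of this kind can be written down explicitly in a simple case. The Python sketch below uses a hypothetical example on $\mathbf{R}$ that does not come from the text: the map $f(x) = 2x/(1+x)$ has $f(0) = 0$ and $Df(0) = 2$, and $h(x) = x/(1-x)$ satisfies $hf = Lh$ with $L(y) = 2y$ for $-1 < x < 1$. Here $h$ happens to be smooth near $0$; Hartman's Theorem itself promises only a homeomorphism.

```python
def f(x):
    """Hypothetical diffeomorphism germ at 0 with f(0) = 0 and Df(0) = 2."""
    return 2.0 * x / (1.0 + x)


def h(x):
    """Conjugating map for this example, valid for -1 < x < 1."""
    return x / (1.0 - x)


def L(y):
    """The linear map Df(0)."""
    return 2.0 * y


def iterate(g, x, m):
    """Apply g to x exactly m times (m >= 0)."""
    for _ in range(m):
        x = g(x)
    return x


if __name__ == "__main__":
    x = 0.1
    for m in range(5):
        # h carries the f-orbit of x to the L-orbit of h(x), step by step.
        print(m, h(iterate(f, x, m)), iterate(L, h(x), m))
```

The two printed columns agree up to rounding, which is the identity $h \cdot f^m(x) = L^m \cdot h(x)$ for this example.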

Remarks

1. A typical linear system on $\mathbf{R}^n$ is of the form

$$\dot{x} = Ax$$

where $A$ is a linear map $\mathbf{R}^n \to \mathbf{R}^n$, i.e. $A \in L(\mathbf{R}^n,\mathbf{R}^n)$. It is not hard to prove that the corresponding flow on $\mathbf{R}^n$ is given explicitly by the formula $\phi_t x = e^{tA}x$.

... $E^+$ (corresponding to eigenvalues $\lambda$ with $|\lambda| > 1$) and $E^-$ (corresponding to eigenvalues $\lambda$ with $|\lambda| < 1$) such that $L(E^+) = E^+$ and $L(E^-) = E^-$, with the following properties:

all eigenvalues of $L|E^+$ have modulus $> 1$;

all eigenvalues of $L|E^-$ have modulus $< 1$.

Furthermore, there exists a norm on $\mathbf{R}^n$ for which there exist constants $k < 1 < K$ such that

$x \in E^+$ implies $\|L(x)\| > K\|x\|$

$x \in E^-$ implies $\|L(x)\| < k\|x\|$

i.e. $L$ expands along $E^+$, and contracts along $E^-$.

We can think of $\mathbf{R}^n$ as being decomposed into the direct sum of two 'co-ordinate' subspaces: $E^+$ and $E^-$. The possibility that $E^+ = \{0\}$ or $E^- = \{0\}$ is not excluded. If $E^+ = \{0\}$ then $E^- = \mathbf{R}^n$ and we say the origin is a sink; if $E^+ = \mathbf{R}^n$ then the origin is a source. Intermediate cases are often called saddle points. See Figure 43.

Figure 43

The proofs of the above assertions are straightforward matrix theory, although the construction of the norm is not entirely trivial. Analogous results apply in infinite dimensions, using the spectrum instead of the eigenvalues.

For a hyperbolic linear flow $\phi$ arising from $\dot{x} = Ax$ we have corresponding behaviour: $\mathbf{R}^n$ splits into $E^+ \oplus E^-$ such that each of $E^+$, $E^-$ is invariant under the flow (i.e. $\phi_t E^+ = E^+$ and $\phi_t E^- = E^-$) and all eigenvalues of $A|E^+$ ($A|E^-$) have positive (negative) real part. There is a norm and constants $k < 1 < K$ such that

$x \in E^+$ implies $\|\phi_t x\| > K^t\|x\|$

$x \in E^-$ implies $\|\phi_t x\| < k^t\|x\|$

for all $t > 0$. The situation can again be symbolized by Figure 43.

In two dimensions the orbits of hyperbolic linear flows can be sketched quite easily, since the corresponding system of equations can be solved explicitly: see Figure 44. Note that the validity of these pictures rests on the fact that if a hyperbolic linear system is perturbed linearly by a small amount so that no eigenvalue crosses the imaginary axis, then the phase portrait remains the same up to homeomorphism. This observation, although not mentioned again explicitly, is a first step in the theory of global stability discussed in §4.5 and should be kept well in mind.

Pictures of orbits of flows arising from 2-dimensional hyperbolic linear systems $\dot{x} = Ax$.

(i) $\lambda_1 = \lambda_2$ real, negative, equal, with two independent eigenvectors

(ii) $\lambda_1, \lambda_2$ real, negative, unequal

(iii) or (iv) $\lambda_1, \lambda_2 = \alpha \pm i\beta$, $\beta \ne 0$, $\alpha < 0$; (iii) or (iv) depending on sign of lower left hand element in $A$

(v) $\lambda_1 = \lambda_2$ real, negative, with no two independent eigenvectors ($A$ cannot be diagonalized)

(vi) $\lambda_1, \lambda_2$ real, opposite signs

(i)-(v) are sinks, (vi) is a saddle-point. For sources change signs of (real parts of) $\lambda_1, \lambda_2$ in (i)-(v).

Figure 44
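The coarse classification into sinks, sources and saddle-points can be read off from the eigenvalues of $A$. The Python sketch below does this for a $2 \times 2$ matrix with rows $(a, b)$ and $(c, d)$; the function names and the tolerance used to decide whether a real part vanishes are illustrative assumptions, and the finer distinction between cases (i)-(v) is not attempted.

```python
import cmath


def eigenvalues_2x2(a, b, c, d):
    """Eigenvalues of the matrix with rows (a, b) and (c, d)."""
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4.0 * det)
    return (trace + root) / 2.0, (trace - root) / 2.0


def classify(a, b, c, d, tol=1e-12):
    """Classify the origin for the linear system x' = Ax in the plane."""
    l1, l2 = eigenvalues_2x2(a, b, c, d)
    if abs(l1.real) < tol or abs(l2.real) < tol:
        return "not hyperbolic"
    if l1.real < 0 and l2.real < 0:
        return "sink"
    if l1.real > 0 and l2.real > 0:
        return "source"
    return "saddle"


if __name__ == "__main__":
    print(classify(0, 1, -1, 0))    # simple harmonic oscillator, k = 1
    print(classify(0, 1, -1, -1))   # damped oscillator, b = 1/2, k = 1
    print(classify(0, 1, -1, 1))    # van der Pol linearized at 0, alpha = 1
    print(classify(1, 0, 0, -1))    # x1' = x1, x2' = -x2
```

The four sample matrices are the linear parts at the origin of systems met earlier in this section, so the output can be compared with the qualitative descriptions given there.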

The discussion of hyperbolicity so far has been concerned with fixed points of flows and diffeomorphisms. It is only a small step to extend this to periodic orbits. If q is a periodic point with period m of the diffeomorphism f then we say that q is hyperbolic if it is hyperbolic as a fixed point of f^m. It then follows automatically that each point f(q), f^2(q), ... on the orbit of q is also hyperbolic, so we can talk about a hyperbolic periodic orbit of f.

For a periodic orbit γ of a flow φ on M we need a different method of approach. Choose a point q on γ and take a (small) codimension-1 submanifold Σ passing through q and transverse to γ, i.e. such that the vector X(q) does not lie in the tangent space to Σ at q. Thus Σ is a kind of local cross-section. For each x in Σ sufficiently close to q, the continuity of the flow guarantees that the forward orbit of x will meet Σ again at a time approximately equal to the period of γ. More precisely, there exists a neighbourhood V of q in Σ on which we can define a map P : V → Σ by P(x) = φ_τ x, where τ (depending on x) is the smallest t > 0 for which φ_t x ∈ Σ. Using the Implicit Function Theorem (cf. page 194) it can be shown that P is C^r if X is. Clearly P(q) = q, and the germ of P is a diffeomorphism germ at q, since we construct an inverse by following orbits backwards instead of forwards. The germ at q of the map P is called a Poincaré map germ for γ. Therefore we can discuss the hyperbolicity or otherwise of P as a diffeomorphism germ R^{n-1} → R^{n-1}. If P is hyperbolic then we say that γ is a hyperbolic periodic orbit of the flow. This does not depend on the choice of q on γ nor on the choice of Σ at which to construct P.

Recall that if f : M → M is any diffeomorphism then f^m is a (global) Poincaré map for every periodic orbit of the suspension (page 195) of f arising from a periodic point of f of period m.

Now we can apply Hartman's theorem to the germ of f^m in the diffeomorphism case or to the germ of P in the flow case, and it tells us that the behaviour of a diffeomorphism or flow near a hyperbolic periodic

orbit is topologically the same as the behaviour of the linearized version. In view of Hartman's Theorem and our analysis of hyperbolic linear systems above we now see that if we take a sufficiently small neighbourhood

U of a hyperbolic fixed point p and define the local in-set of p in the diffeomorphism case by

in(p) = {x ∈ U | f^n(x) → p as n → +∞}

or in the flow case by

in(p) = {x ∈ U | φ_t x → p as t → +∞}

then in(p) is homeomorphic to a neighbourhood of 0 in the linear subspace E_s of R^n defined from linearization of f or φ at p, so in(p) is a topological manifold of dimension s, where s is the number of eigenvalues with modulus less than 1 (or real part < 0). In fact we can do better than this:

THEOREM

If the diffeomorphism or flow is of class C^r then in(p) is a C^r submanifold of M. Its tangent space at p corresponds to E_s in R^n.

The terminology in-set was suggested by E.C. Zeeman. More traditionally in(p) is called a local stable manifold for p, denoted by W^s_loc(p). Here s means stable, and also refers to the dimension of W^s_loc(p). Note that if we choose U carefully (corresponding to a small ball with respect to the relevant norm on R^n) we have fW^s_loc(p) ⊂ W^s_loc(p); similarly for φ_t with t > 0.

The proof of the theorem is not easy: see Smale [122], Nitecki [90], or Irwin [61]. It is known as the Stable Manifold Theorem.

Replacing n by −n (or t by −t) we obtain the local out-set or unstable manifold W^u_loc(p) of dimension u = n − s. It is a C^r submanifold of M, with tangent space at p corresponding to E_u. For appropriate U we have fW^u_loc(p) ⊃ W^u_loc(p); similarly for φ_t, t > 0.

The extension of these ideas to periodic orbits is straightforward. If q is a hyperbolic periodic point of period m of f then as above we obtain local stable and unstable manifolds for f^m at q. Transporting these around the orbit by applying f, f^2, ..., f^{m-1} we obtain local stable and unstable manifolds for f^m at f(q), f^2(q), ..., f^{m-1}(q): see Figure 45.

Figure 45

We can think of these m local stable (unstable) manifolds as together constituting a stable (unstable) manifold for the periodic orbit as a whole.

Similarly, if γ is a hyperbolic periodic orbit for the flow φ then the local stable and unstable manifolds for a Poincaré map can be transported around γ by φ_t to give manifolds of one higher dimension as local stable and unstable manifolds W^s_loc(γ), W^u_loc(γ) for γ as a whole: see Figure 46.
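The first-return construction for a periodic orbit of a flow can be tried out on a small example. The sketch below is an illustration under stated assumptions, not something from the text: the planar flow r' = r(1 − r²), θ' = 1 (in polar coordinates), the cross-section θ = 0 and all function names are chosen here for convenience. The circle r = 1 is a periodic orbit of period 2π, and because θ' = 1 every point of the section returns after exactly time 2π, so the return map is obtained by integrating the radial equation over one period. Substituting u = r^{-2} gives u' = 2 − 2u, which yields a closed form for the map and shows that its derivative at r = 1 is e^{−4π}.

```python
import math


def radial_field(r):
    # Radial part of the assumed flow: r' = r(1 - r^2), theta' = 1.
    return r * (1.0 - r * r)


def poincare_map(r0, steps=2000):
    """First-return map to the half-line theta = 0, via RK4 over one period 2*pi."""
    dt = 2.0 * math.pi / steps
    r = r0
    for _ in range(steps):
        k1 = radial_field(r)
        k2 = radial_field(r + 0.5 * dt * k1)
        k3 = radial_field(r + 0.5 * dt * k2)
        k4 = radial_field(r + dt * k3)
        r += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return r


def poincare_exact(r0):
    # Closed form: u = r**-2 satisfies u' = 2 - 2u, so u(2*pi) = 1 + (u0 - 1)*exp(-4*pi).
    k = math.exp(-4.0 * math.pi)
    return (1.0 + (r0 ** -2 - 1.0) * k) ** -0.5


# Derivative of the return map at its fixed point r = 1 (from the closed form).
multiplier = math.exp(-4.0 * math.pi)

print(poincare_map(1.0))                      # the fixed point of the return map
print(poincare_map(0.5), poincare_exact(0.5)) # numerical and closed-form values agree closely
print(abs(multiplier) != 1.0)                 # modulus differs from 1: hyperbolic orbit
```

Since the multiplier has modulus less than 1, the return map has a hyperbolic (attracting) fixed point, and the periodic orbit r = 1 is hyperbolic in the sense defined above.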

4.3 SOME GLOBAL BEHAVIOUR

Now that we have some understanding of the way in which a flow or diffeomorphism behaves near a hyperbolic fixed point or periodic orbit we look at the consequences for global behaviour on a manifold M.

In particular, let us consider the case of a point x in M for which ω(x) = p, or, equivalently, φ_t x approaches a hyperbolic fixed point p as t → +∞. If we take t large enough then φ_t x will eventually be in W^s_loc(p) defined for some suitable neighbourhood of p, so that x lies in φ_{-t} W^s_loc(p). Therefore if we take the union of the φ_{-t} W^s_loc(p) over all t > 0 (or, indeed, over all t ∈ R) we shall capture precisely those points x for which φ_t x → p as t → +∞.

[...] Any x in W^u(p_2) ∩ W^s(p_1), where p_1 ≠ p_2, satisfies ω(x) = p_1, α(x) = p_2 [...] W^s(p_1) (as a set) [...] under the action of f, and no point in it is fixed or periodic, since such a point could not [...], which is nonsense.

If there exists some x ∈ W^s(p) ∩ W^u(p) where x ≠ p, then there is an infinite number of such (cf. 2 above), all satisfying ω(x) = α(x) = p. Such points are called homoclinic points, although sometimes this term is reserved for x at which W^s(p), W^u(p) intersect transversally (cf. §4.4). See Figure 48.

Figure 48

Once again, we can extend these ideas to periodic orbits, and do everything for flows as well as for diffeomorphisms. In this case orbits are no longer discrete, and the behaviour illustrated in Figure 48 could not occur in such low dimensions (s = u = 1). Suspending these diffeomorphism phenomena shows how analogous things happen in higher dimensions for flows.

From the results of this section we see that in studying the global structure of a dynamical system we can make some progress if we know that fixed points and periodic orbits are all hyperbolic, and if we know something about the intersections of stable and unstable manifolds. Now we might reasonably hope that 'in general' all fixed points and periodic orbits will be hyperbolic, and that stable and unstable manifolds will intersect in general position, i.e. transversally. To discuss this we first have to provide a suitable mathematical meaning for 'in general', and then find which properties we can expect to be those of a 'generic' system. This is the programme for the next section.

4.4 GENERIC PROPERTIES OF FLOWS AND DIFFEOMORPHISMS

It is common practice in classifying any set S of mathematical objects to begin by trying to sort them into two types: usual (regular, non-degenerate, tame, nice, ...) and unusual (singular, degenerate, wild, pathological, ...). One way to do this is to put a measure on the set S, and to associate 'usual' with occupying a subset of large measure. Another way is via a topology on S. This is the approach we shall use here, since at the same time it allows us to discuss questions of global stability of dynamical systems.

Naively, it seems reasonable to associate 'usual' with occupying a subset which is dense in S. This is soon seen to be inadequate, since a subset of S and its complement in S can both be dense in S. For example, the subset Q of rational numbers is dense in R, but so is its complement, the set I of irrational numbers, and therefore the rationals and irrationals are both 'usual'. Nevertheless, the irrationals are in some sense 'more usual' than the rationals: there are more of them, for a start. We can capture this by using the topological notion of a residual set.

DEFINITION

Let S be any topological space. A subset G of S is called a residual set if it is the intersection of a countable number of sets, each of which is both open and dense in S.

In any respectable topological space a residual set is itself dense. This is known as Baire's Theorem, once 'respectable' is defined, and then a space in which this result holds is called a Baire space (L. Baire, 1874-1932). For example, any space homeomorphic to a complete metric space is a Baire space. The intersection of a finite or countably infinite number of residual sets is residual, and so in a non-empty Baire space the complement of a residual set cannot be residual (otherwise the empty set would be dense). Therefore occupying a residual set becomes a good synonym for 'usual' in a Baire space. Conveniently, the spaces we deal with will all be Baire spaces.

In the rational-irrational example above we see that as the rational numbers are countable we can write Q = {q_1, q_2, ..., q_n, ...} and, writing U_n for the open dense subset R − {q_n} of R, we can then express I as the intersection of all the U_n. Thus I is a residual subset of R, and Q is not.
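The rational-irrational example can be written out as a short derivation. The display below only restates the argument of the preceding paragraphs in symbols; the blackboard-bold notation for R and Q is a typesetting choice made here.

```latex
\[
  U_n = \mathbb{R}\setminus\{q_n\}, \qquad
  \bigcap_{n=1}^{\infty} U_n
    = \mathbb{R}\setminus\{q_1, q_2, \dots\}
    = \mathbb{R}\setminus\mathbb{Q}
    = I .
\]
% Each U_n is open, because its complement {q_n} is closed, and dense,
% because every open interval contains points other than q_n.
% Hence I is a countable intersection of open dense sets, i.e. residual.
%
% If Q were residual as well, then
\[
  \mathbb{Q}\cap I = \emptyset
\]
% would be an intersection of two residual sets, hence residual, hence dense
% in the Baire space R, which is impossible. So Q is not residual.
```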

The sets we are interested in are the set Diff^r(M) of C^r diffeomorphisms on the smooth n-manifold M, and the set of C^r vector fields on M, which we denote by X^r(M). We will now put topologies on these sets. Assume unless otherwise stated that M is compact: this makes the topologies easier to describe, and is in any case the context in which our later theorems will apply.

We regard Diff^r(M) as a subset of the set C^r(M,M) of C^r maps M → M, topologize C^r(M,M), and then take the subset topology induced on Diff^r(M). To topologize C^r(M,M), or, more generally, C^r(M,N), we want to capture the idea of two maps f, g : M → N being 'close', and we can do this by saying that f, g are close if their local representatives in all charts of some atlases are close. The procedure

for converting these hazy notions into a topology is as follows.

Let (U,φ), (V,ψ) be charts on M, N with fU ⊂ V and with (for technical reasons) φU bounded and the closure of U contained in some chart. By Theorem C in §1.6 this implies that the closure of φU is compact. For any ε > 0 let B_ε(f;U,V) denote the set of all g in C^r(M,N) for which gU ⊂ V and such that

\[ \sup_{x \in U} \| \psi f(x) - \psi g(x) \| < \varepsilon \qquad (*) \]

where ||·|| is any particular norm on R^n.

Take atlases for M and N chosen carefully so that there is a cover of M by charts U_{α_1}, U_{α_2}, ..., U_{α_s} with the property that each φ_{α_i}U_{α_i} is bounded with the closure of U_{α_i} contained in some chart, and such that each fU_{α_i} is contained in some chart V_{β_i} on N. This is not hard to do, by considering the cover of M by all the open sets of the form f^{-1}(V_β), using the compactness of M to get a finite subcover, and then taking slightly smaller charts to satisfy the closure condition. Now define B_ε(f) to be the intersection of the B_ε(f;U_{α_i},V_{β_i}) for i = 1,2,...,s: then g lies in B_ε(f) precisely when all its local representatives with respect to the charts U_{α_i}, V_{β_i} differ from those of f by less than ε.

Finally, we define a set W in C^r(M,N) to be an open set if, given any f ∈ W, there exists some B_ε(f) constructed as above such that B_ε(f) is contained in W. This gives a topology on C^r(M,N), the C^0-topology, which for N = M induces the C^0-topology on Diff^r(M). The topology can be shown to be independent of the choice of atlases and special charts.

If we want the topology to capture closeness of derivatives of all orders up to r we do the same as above, but replace (*) by

\[ \sum_{k=0}^{r} \sup_{y \in U'} \| D^k \tilde f(y) - D^k \tilde g(y) \| < \varepsilon \qquad (**) \]

where U' = φU, f̃ = ψfφ^{-1} (and similarly g̃ = ψgφ^{-1}), D^k f̃(y) belongs to the space (see §2.6) of k-multilinear maps R^m × R^m × ... × R^m → R^n, and ||·|| is obtained from norms on R^m and R^n. This then gives us the C^r-topology on C^r(M,N), C^r(M,M) and on Diff^r(M). Unless it is stated otherwise, this topology is assumed to be the one in use. It is known more classically as the topology of uniform convergence of all derivatives up to order r. The case r = 1 will most interest us.
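The difference between C^0-closeness and C^1-closeness can be seen numerically in one chart. The following sketch is illustrative only: it takes M = N = the interval [0, 1] with the identity chart, approximates the suprema in (*) and (**) on a finite grid, and uses the hypothetical helper name `c_r_distance`. The map g = f + ε sin(x/ε) differs from f by at most ε in value, but its derivative differs from that of f by cos(x/ε), which is of size 1.

```python
import math


def c_r_distance(f_derivs, g_derivs, a, b, samples=2001):
    """Grid approximation of sum_{k=0}^{r} sup_{a<=y<=b} |D^k f(y) - D^k g(y)|.

    f_derivs and g_derivs are lists [f, f', ..., f^(r)] of callables.
    """
    ys = [a + (b - a) * i / (samples - 1) for i in range(samples)]
    total = 0.0
    for df, dg in zip(f_derivs, g_derivs):
        total += max(abs(df(y) - dg(y)) for y in ys)
    return total


eps = 0.01
f = [math.sin, math.cos]                                   # f and f'
g = [lambda y: math.sin(y) + eps * math.sin(y / eps),      # g = f + eps*sin(y/eps)
     lambda y: math.cos(y) + math.cos(y / eps)]            # g' = f' + cos(y/eps)

d0 = c_r_distance(f[:1], g[:1], 0.0, 1.0)   # quantity in (*): values only
d1 = c_r_distance(f, g, 0.0, 1.0)           # quantity in (**) with r = 1

print(d0)  # small, of the order of eps
print(d1)  # of order 1, however small eps is taken
```

So g lies in a small C^0-neighbourhood of f but not in a small C^1-neighbourhood, which is why the choice of r matters when speaking of 'small perturbations'.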

Remarks

1. There is a more sophisticated way of describing these topologies using jets: see Golubitsky and Guillemin, for example. Alternatively we can embed N in some R^p (see §3.2), and then regard f : M → N as f : M → R^p. This has an r-fold tangent map T^r f : T^r M → T^r R^p = R^q where q = 2^r p, and with a little care the ordinary euclidean metric on R^q can be used to give a 'distance' between T^r f and T^r g. We obtain a metric on C^r(M,N), and this induces the topology already defined. Using the metric, it can be shown that C^r(M,N) and Diff^r(M) are Baire spaces.

2. When M is not compact problems arise since there may not be a finite atlas, and even when there is one a topology constructed as above may not be suitable for dealing with behaviour 'at infinity'. We will not go into details, except to mention that for open subsets M, N of R^m, R^n a useful C^r topology is obtained by replacing the constant ε in (**) by a continuous function ε : M → R (which may tail off to zero towards the edge of M). Alternatively, one can consider a topology of uniform convergence on compact subsets of derivatives up to order r.

3. Further subtleties occur even in the compact case if we wish to treat all derivatives for C^∞ maps: there are a number of topologies one could use. One choice is to say a set is open in C^∞(M,N) when it is a union of sets, each open in some C^r(M,N).

4. These ideas can be extended to infinite-dimensional manifolds and to manifolds with boundary.

For X^r(M) the procedure is similar but easier. Given a vector field X and a chart φ_α : U_α → U'_α ⊂ R^n there is the local representative Tφ_α X φ_α^{-1} : U'_α → U'_α × R^n (cf. p.169), and considering only the second factor we have X_α : U'_α → R^n. If we shrink U_α a little if necessary we can suppose that X_α is defined on the closure of U'_α, and if U'_α is bounded in R^n (hence its closure is compact) it follows that ||D^k X_α|| is a bounded function on U'_α. When M is compact we can choose a finite atlas of charts U_{α_1}, U_{α_2}, ..., U_{α_s} as above, and define

\[ \|X\|_r = \sup_{1 \le i \le s} \sum_{k=0}^{r} \sup_{y \in U'_{\alpha_i}} \| D^k X_{\alpha_i}(y) \| . \]

This is a norm for X^r(M), making it into a Banach space. The norm itself depends very much on the choice of atlas, but the topology induced on X^r(M) does not. (Cf. Example 6 on page 8, where there is only one chart.)

Remarks analogous to those in the Diff^r(M) case also apply to X^r(M).

With topologies assigned to these spaces of diffeomorphisms and vector fields we can now discuss genericity.

DEFINITION

A property P of C^r diffeomorphisms (vector fields) on M is said to be generic if the set of diffeomorphisms (vector fields) possessing P contains a residual subset of Diff^r(M) (X^r(M)).

There are two particular properties P_1, P_2 which are known to be generic for diffeomorphisms and vector fields on a compact manifold:

P_1 Every fixed point and periodic orbit is hyperbolic, and all intersections of their stable and unstable manifolds are transversal.

This does not say that there need be only a finite number of fixed points and periodic orbits. As we shall see below, the property of having only a finite number of periodic orbits is not generic.

P_2 The non-wandering set Ω is precisely the closure of the set of fixed points and periodic orbits.

Remarks

1. P_1 is C^r generic for each finite r. This is known as the Kupka-Smale Theorem. See Kupka [65], Smale [l2P] or Peixoto [96].

2. The result that P_2 is C^1 generic is due to Pugh, and follows from his Closing Lemma [103]. This states that any recurrent point p (a point such that p ∈ ω(p)) can be converted into a periodic point by arbitrarily C^1-small perturbations of the diffeomorphism or vector field. The proof is extremely difficult. For C^0-small perturbations it is easy, and for C^r-small perturbations (r > 1) little is known: the C^r Closing Lemma may be false. It is unknown whether P_2 is C^r generic or not.

Thus the choice of topology is not merely a refinement for mathematicians, but has important implications for the kinds of behaviour in dynamical systems that we are to consider as usual or as exceptional.

4.5 GLOBAL STABILITY

We will now take a different point of view from that of genericity, and consider when the overall structure of a dynamical system persists under sufficiently small perturbations of the system. After all, we should be suspicious of a differential or difference equation used to model a real-life system if the equation can be made to behave very differently by including arbitrarily small extra terms, since in practice we can only measure to within some non-zero margin of error. This leads to the idea of structural stability of diffeomorphisms and vector fields. We might indulge in some wishful thinking, and hope that structural stability, when suitably formulated, is a generic property ...

Previously (§4.2) we discussed local equivalence of germs of diffeomorphisms or vector fields at a fixed point. We can easily transfer this to a global concept.

DEFINITION

Two diffeomorphisms f, g ∈ Diff^r(M) are C^k equivalent (for some k ≤ r) if there exists h ∈ Diff^k(M) with h∘f = g∘h.

This is a special case of conjugacy of two maps (§1.5), i.e. f and g 'look the same' globally. Note that in particular h takes f-orbits globally to g-orbits.

DEFINITION

Two vector fields X, Y ∈ X^r(M) are C^k equivalent if there exists h ∈ Diff^k(M) taking X-orbits to Y-orbits, preserving senses but not necessarily parametrization by t.

In either case h takes the phase portrait of one system to the phase portrait of the other. As pointed out in the local context for fixed points, it is mainly the case k = 0 (i.e. h a homeomorphism) that is of practical interest in studying the long-term behaviour of orbits. Therefore our definition of structural stability will be as follows:

DEFINITION

A diffeomorphism f ∈ Diff^r(M) (or vector field X ∈ X^r(M)) is structurally stable if there is a neighbourhood N of f in Diff^r(M) (or N of X in X^r(M)) such that every diffeomorphism (or vector field) in N is C^0 equivalent to f (or X).

If we had worked with C^k equivalence (k ≥ 1) we would find that the definition was practically useless. For example, let f be a diffeomorphism with a finite number of fixed points. If f, g are even C^1 equivalent then the conjugating diffeomorphism h takes each fixed point p of f to a fixed point h(p) of g, and the Chain Rule shows that the linear tangent map T_p f is linearly conjugate by T_p h to T_{h(p)} g and hence has the same eigenvalues as T_{h(p)} g. However, we can easily change the eigenvalues of T_p f by arbitrarily small C^r changes in f, and so f could not be structurally stable. As an example based on the same fact about linearization, the system ẋ = −x would not be structurally stable (with k ≥ 1) since ẋ = −(1+ε)x would not be C^k equivalent to the first system even though its phase portrait is identical. Of course, if we take k = 0 then T_p h does not exist and so neither does the above argument.
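The Chain Rule argument can be checked numerically in one dimension, where the 'eigenvalue' of the tangent map at a fixed point is simply the derivative there. The sketch below is an illustration under assumptions made here, not an example from the text: the maps f(x) = 1 + (x − 1)/2 and h(x) = x + x³ on R, and all function names, are hypothetical choices. It builds g = h∘f∘h^{-1}, so that h∘f = g∘h, and compares derivatives at the corresponding fixed points p = 1 and h(p) = 2.

```python
def f(x):
    # Fixed point p = 1 with derivative 0.5 there.
    return 1.0 + 0.5 * (x - 1.0)


def f_perturbed(x, eps=0.01):
    # A small C^r perturbation of f with the same fixed point but derivative 0.5 + eps.
    return 1.0 + (0.5 + eps) * (x - 1.0)


def h(x):
    # A C^1 (indeed smooth) diffeomorphism of R, since h'(x) = 1 + 3x^2 > 0.
    return x + x ** 3


def h_inverse(y):
    # Invert h by Newton's method (adequate for the values used below).
    x = y
    for _ in range(100):
        x -= (h(x) - y) / (1.0 + 3.0 * x * x)
    return x


def g(y):
    # g = h o f o h^{-1}, so that h o f = g o h.
    return h(f(h_inverse(y)))


def multiplier(func, p, d=1e-5):
    # Central-difference approximation to the derivative of func at p.
    return (func(p + d) - func(p - d)) / (2.0 * d)


print(g(2.0))                        # h(1) = 2 is a fixed point of g
print(multiplier(f, 1.0))            # derivative of f at p
print(multiplier(g, 2.0))            # same value: preserved by C^1 conjugacy
print(multiplier(f_perturbed, 1.0))  # different value for the perturbed map
```

Since the perturbed map has a different derivative at its fixed point, no C^1 diffeomorphism can conjugate it to f, although the two maps are C^0 equivalent (both are contractions towards a single fixed point).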

Most of the important theorems on structural stability are formulated for compact manifolds. Although the manifolds that it might be necessary to use in practice could often turn out to be non-compact (e.g. phase spaces for second order systems), the results for the compact case show essentially what goes on, and they can often be adjusted to cope with specific non-compact problems. Therefore in what follows we make the standing assumption that M is a C^∞ compact n-manifold (without boundary), and we might as well suppose also that M is connected, otherwise we could treat each piece of M separately.

The early results on structural stability (called 'roughness') were those of the Russian school (Andronov, Pontrjagin [7]) in the 1930's. However, the first main global theorem which, together with Smale's results on gradient systems (see §4.6), launched a new attack on the subject, was Peixoto's theorem for flows on 2-manifolds.

THEOREM (Peixoto [95])

A C^r vector field on a 2-dimensional manifold M is structurally stable if and only if it satisfies the following conditions:
(1) The number of fixed points and periodic orbits is finite, and each is hyperbolic;
(2) there are no points x for which α(x) and ω(x) are both saddle points;
(3) Ω consists of fixed points and periodic orbits only.
Moreover, if M is orientable the set of such vector fields is open and dense (thus highly generic) in X^r(M).

Remarks

1. As stated in [95] the full result includes the non-orientable case, but there is a gap in the proof. For r = 1 the Closing Lemma comes to the rescue, but it seems that the situation has still not been clarified for r > 1.

2. Condition (2) is the same as transversality of intersection of stable and unstable manifolds: the only dimensions for which transversality is not automatic are s = u = 1 (i.e. for saddle points), and then by uniqueness of solutions if W^u(p_1) is tangent to W^s(p_2) they must meet along a whole orbit. Thus genericity of (hyperbolicity + (2)) corresponds to the Kupka-Smale Theorem (page 227).

With this result in mind, Smale proposed a closer study of systems on compact n-manifolds satisfying the above conditions (1) and (3) together with

(2)' All stable and unstable manifolds intersect transversally.

Such systems are now known as Morse-Smale (MS) systems. Apart from their dynamic properties, Morse-Smale systems have the added interest of providing close analogues to the Morse inequalities (page 160). There is a set of Morse-Smale inequalities (Smale [119], Markus [73]), through which the topology of M places constraints on the possible numbers of fixed points and periodic orbits for any MS diffeomorphism or flow on M. However, from the point of view of dynamics MS systems seemed good material for two plausible conjectures:

Conjecture 1 A system is structurally stable if and only if it is MS.

Conjecture 2 MS systems are dense in Diff^1(M) or X^1(M).

Failing Conjecture 2, one could at least try

Conjecture 3 Structurally stable systems are dense in Diff^1(M), X^1(M).

These conjectures were short-lived. Their fates are quickly summarized:

Conjecture 1 MS implies structurally stable: proved TRUE by Palis and Smale (see [22]). Structurally stable implies MS: FALSE in general.
Conjecture 2 FALSE in general.
Conjecture 3 FALSE in general.

Of course, the conjectures are true for vector fields when dim M = 2 by Peixoto's Theorem. They are also true for diffeomorphisms when dim M = 1 (i.e. M = S^1). The failure of Conjectures 1 and 2 in general is a consequence of the fact that there exist complicated non-wandering phenomena which systems can exhibit and which cannot be removed by small perturbations. We will now look at two basic examples of such phenomena. The failure of Conjecture 3 is more subtle, and we shall be in a better position to understand it once we have understood these examples.

EXAMPLE 1: The Smale Horseshoe

Take a rectangle R in the plane R^2 (vertices A, B, C, D). Pick it up, stretch it and bend it into a horseshoe shape as shown, and put it down again partly on top of its original position (A, B, C, D go to A', B', C', D'): see Figure 49.

Figure 49

It is not difficult, using smooth but possibly non-analytic functions (cf. example on page 86), to piece things together to incorporate this construction into a diffeomorphism f : R^2 → R^2. Furthermore (and this is important) we can arrange that the two rectangular strips R ∩ f^{-1}(R) consisting of points of R which remain in R when f [...] are each stretched linearly by a factor > 1 in the 'horizontal' [...] (parallel to AD) [...] 'vertical' (parallel to [...]) [...] R ∩ f^{-1}(R) ∩ f^{-2}(R) [...] of four thin rectangles, [...] remain in R when both f and f^2 [...]

0,

, a A = e

3. Hyperbolic fixed points and periodic orbits are hyperbolic invariant sets in the above sense.

4. The horseshoe example is hyperbolic on Λ; the toral automorphism example is hyperbolic on the whole torus.
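For a linear automorphism of the torus, hyperbolicity on the whole torus reduces to a condition on the eigenvalues of the integer matrix that induces it: none may have modulus 1. The sketch below is an assumption-laden illustration: the matrix with rows (2, 1) and (1, 1) is the standard textbook example of a hyperbolic toral automorphism and is used here as a stand-in, since the matrix of the example referred to above is not reproduced in this section; the function name is likewise hypothetical.

```python
import cmath


def is_hyperbolic_linear_map(a, b, c, d, tol=1e-12):
    """Return (flag, moduli) for the linear map with matrix [[a, b], [c, d]].

    The map is hyperbolic (as a diffeomorphism) when no eigenvalue has modulus 1.
    """
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    moduli = [abs((trace + root) / 2), abs((trace - root) / 2)]
    return all(abs(m - 1.0) > tol for m in moduli), moduli


# Assumed example matrix: determinant 1, so it induces a diffeomorphism of the torus.
hyperbolic, moduli = is_hyperbolic_linear_map(2, 1, 1, 1)
print(hyperbolic, moduli)   # one modulus above 1 (expanding), one below 1 (contracting)

# A rotation through a right angle also has determinant 1 but is not hyperbolic.
rotation_hyperbolic, _ = is_hyperbolic_linear_map(0, -1, 1, 0)
print(rotation_hyperbolic)
```

The expanding and contracting eigendirections of such a matrix give the splitting of the tangent space at every point of the torus.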

If x is a point of Λ and Λ is hyperbolic for f we would like to define the (generalized) stable manifold W^s(x) for x to be the set of all y in M for which the distance between f^n(y) and f^n(x) approaches zero as n → ∞. To do this we must have a metric on M, but since we already have a Riemannian structure we can use that to provide a metric: again see §3.4. Alternatively, we could embed M in euclidean space. Therefore the definition of W^s(x) makes sense (it is independent of the particular metric chosen) and there is then the Generalized Stable Manifold Theorem, due to Hirsch and Pugh [53], which we summarize:

THEOREM

Each W^s(x) is an immersed copy of R^s, with E^s_x as tangent space at x. As x runs through Λ all the W^s(x) fit together to give a continuously-varying family (at least locally near Λ), and the family is invariant in that W^s(f(x)) = fW^s(x).

Replacing n by −n we obtain the definition and analogous theorem for generalized unstable manifolds W^u(x).

This theorem in the case Λ = Ω(f) is one of the main ingredients in the long-awaited Structural Stability Theorem that generalizes Peixoto's theorem and the theorem of Palis and Smale on MS systems:

THEOREM

Suppose the diffeomorphism f satisfies the following:
(1) the non-wandering set Ω is hyperbolic;
(2) periodic orbits are dense in Ω;
(3) all intersections of (generalized) stable and unstable manifolds are transversal.
Then f is structurally stable.

Remarks

1. This theorem was proved by Robbin [105] as the culmination of work by many people including Hirsch, Pugh, Shub and Smale: see [22], [11]. In Robbin's proof it is necessary to suppose that f itself is actually C^2, although small perturbations are allowed in Diff^1(M). A proof in the C^1 case was given by Robinson [108].

2. The converse (that structural stability implies (1), (2) and (3)) is almost proved, and is certainly true if the definition of structural stability is strengthened a little: see Guckenheimer [46], Franks [39]. In fact Franks [40] shows that (1), (2), (3) are necessary and sufficient conditions for a slightly amended structural stability property that may be of more relevance from a practical point of view than the previous 'pure' definition.

3. Corresponding results exist for flows. See Robinson [107], Franks [38].

Behaviour of structurally stable systems

We will here use the word system to mean either a diffeomorphism or a flow. The two conditions on the non-wandering set Ω

Ω hyperbolic
periodic points dense in Ω

together comprise what Smale [125] called Axiom A. Thus Robbin's structural stability theorem can be stated as

Axiom A plus transversality of stable and unstable manifolds implies structural stability.

No examples are known of systems with Ω hyperbolic but periodic points not dense in Ω, and so it is an unsolved problem to decide whether the first clause of Axiom A always implies the second.

The structure of Ω for an Axiom A system can be described in fairly general but useful terms via Smale's Ω-decomposition theorem [125]:

THEOREM

If Axiom A is satisfied then Ω breaks up into a disjoint union

Ω = Ω_1 ∪ Ω_2 ∪ ... ∪ Ω_s

in which each Ω_i is closed, invariant under the diffeomorphism (or flow), and such that the diffeomorphism (or flow) acts transitively on each Ω_i, i.e. has an orbit dense in Ω_i.

It is possible that s = 1 and Ω = Ω_1 is the whole of the manifold M: this was the case for the hyperbolic toral automorphism. In general, systems for which the whole manifold is hyperbolic are called Anosov systems, since they first came to light in the flow context through work of the Russian D.V. Anosov. It is not known whether hyperbolicity of M always implies Ω = M. The classification of Anosov systems is an interesting and very difficult open problem, important because an Anosov system with Ω = M is an example of a dynamical system that exhibits complicated recurrent behaviour everywhere which cannot be destroyed by small perturbations (since it is structurally stable).

The sets Ω_i in the decomposition are called basic sets. They generalize hyperbolic fixed points and periodic orbits as models for stable (thus observable) recurrence phenomena in dynamical systems.

The Cantor set in the horseshoe is an example of a complicated basic set. In principle we would like to be able to classify all possible types of basic sets Ω_i for Axiom A systems, and especially those which are attractors, i.e. for which W^s(Ω_i) is a neighbourhood of Ω_i (so that everything near Ω_i approaches Ω_i as t (or n) → +∞). These generalize sinks, which are hyperbolic fixed points p with dim W^s(p) = dim M. Some interesting results have been obtained by Williams [151].

There is a school of thought that strange attractors in smooth dynamical systems may be the right models for hydrodynamic turbulence. See Ruelle and Takens [110], Iooss [60], Marsden et al. [75], Arnol'd [10].

The behaviour of Axiom A diffeomorphisms and flows on the basic sets Ω_i themselves has been studied in detail by Bowen, who showed that symbolic dynamics can be used to give a very good picture of the measure theoretical or statistical properties of the system. The idea of entropy plays an important role. Thus deterministic mechanics leads via Axiom A to statistical mechanics within the same model. See Bowen [19].

To understand the global behaviour of an Axiom A system we have to see which points x go from one Ω_i to another Ω_j, i.e. satisfy α(x) ⊂ Ω_i and ω(x) ⊂ Ω_j. Such an x lies in the intersection of W^u(Ω_i) and W^s(Ω_j). (This is more difficult to prove than it looks. It amounts to showing that if d(f^n(x), Ω_j) → 0 as n → +∞ then there is some point y in Ω_j with d(f^n(x), f^n(y)) → 0 as n → +∞, i.e. x ∈ W^s(y). The proof uses strongly the density of periodic points in Ω, and the result is false for an arbitrary hyperbolic invariant set Λ: see the counterexample due to Bowen in [54].) See Figure 51.

Figure 51

Putting an ordering on the Ω_i's by writing Ω_i > Ω_j to mean that W^u(Ω_i) ∩ W^s(Ω_j) is nonempty, we can draw a schematic diagram or graph with vertices representing the Ω_i's and a directed edge joining Ω_i to Ω_j whenever Ω_i > Ω_j. It can be shown that the condition of transversal intersection of stable and unstable manifolds implies that there are no cycles (closed circuits of consistently directed edges) in the graph, so that the global behaviour of the system is represented by a hierarchical structure as in Figure 52. The attractors are the Ω_i's at the bottom of the hierarchy.

Figure 52 (the basic sets labelled at the bottom of the diagram are attractors)

In general it is not known which graphs can come from Axiom A systems, or to what extent equivalent graphs imply equivalent systems, although a classification in the case of Morse-Smale systems on 2-manifolds has been carried out by Peixoto [98].
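The graph of basic sets lends itself to a small computational sketch. The code below is illustrative only: the edge list is a made-up example (it is not the graph of Figure 52), the function name is hypothetical, and edges are assumed to be recorded only between distinct basic sets (each basic set trivially meets its own stable and unstable sets). It checks that the directed graph has no cycles and lists the vertices with no outgoing edge, which correspond to the basic sets at the bottom of the hierarchy.

```python
def analyse_basic_set_graph(vertices, edges):
    """Return (has_cycle, bottom) for a directed graph of basic sets.

    An edge (i, j) with i != j records that the unstable set of basic set i
    meets the stable set of basic set j. `bottom` lists the vertices with no
    outgoing edge, in sorted order.
    """
    successors = {v: set() for v in vertices}
    for i, j in edges:
        successors[i].add(j)

    WHITE, GREY, BLACK = 0, 1, 2          # unvisited / on current path / finished
    colour = {v: WHITE for v in vertices}

    def cycle_from(v):
        colour[v] = GREY
        for w in successors[v]:
            if colour[w] == GREY:         # back edge: a closed circuit exists
                return True
            if colour[w] == WHITE and cycle_from(w):
                return True
        colour[v] = BLACK
        return False

    has_cycle = any(colour[v] == WHITE and cycle_from(v) for v in vertices)
    bottom = sorted(v for v in vertices if not successors[v])
    return has_cycle, bottom


# Hypothetical hierarchy on five basic sets.
has_cycle, bottom = analyse_basic_set_graph(
    [1, 2, 3, 4, 5],
    [(1, 2), (1, 3), (2, 4), (3, 4), (3, 5)],
)
print(has_cycle, bottom)

# A graph with a closed circuit, which transversality rules out.
cyclic, _ = analyse_basic_set_graph([1, 2], [(1, 2), (2, 1)])
print(cyclic)
```

A depth-first search is used because it detects closed circuits directly; for the small graphs that arise here any cycle test would do.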

Final remarks on stability

1. The example due to Smale [124], showing that structural stability is unfortunately not a generic property in Diff^1(M), was obtained by constructing an Axiom A diffeomorphism f having an ordinary saddle point p as one basic set and a torus with a hyperbolic toral automorphism on it as another basic set Ω_2, and then changing f to f_1 for which W^u(p) bends by a large amount and has non-transversal intersection with W^s(y) for some y ∈ Ω_2. If g is sufficiently close to f_1 this non-transversality persists, with p', y', Ω_2' for g close to p, y, Ω_2 for f_1. The type of phase-portrait for g depends on whether or not y' is periodic or non-periodic in Ω_2', and both possibilities occur densely. This means that f_1 has a neighbourhood of diffeomorphisms, none of which is structurally stable. Hence structurally stable systems are not dense in Diff^1(M), and Conjecture 3 is demolished. The same holds for vector fields, by taking a suspension.

2. Despite this result, it turns out that every system in Diff^1(M) or X^1(M) can be approximated arbitrarily C^0-closely by structurally stable systems: see Shub [116], Zeeman [152]. This is a rather confusing result to interpret, since the definition of structurally stable uses C^1 perturbations and yet implies C^0 density. The significance of this depends on what choices of topologies seem appropriate in any given context of mathematical modelling.

3. There are other global notions of stability that have been proposed in the hope that they may be both physically meaningful and (in some useful topology) generic. For example, Ω-stability (stability of the system restricted to Ω: see e.g. Shub [115]) is not generic; tolerance stability (due to Zeeman; see Takens [137]) is in a certain sense generic.

4.6 DYNAMICAL SYSTEMS UNDER CONSTRAINT

The results considered above about structural stability and genericity all assume that arbitrary perturbations in $\mathrm{Diff}(M)$ or $X(M)$ are to be allowed, whereas there are many circumstances in modelling real systems in which the admissible class of perturbations will be restricted and so possibly lead to entirely different results both about stability and genericity. In fact systems under constraint are probably more common than unconstrained systems.

We will consider three types of constrained system: gradient systems, Hamiltonian systems and systems with symmetry, and comment on their global stability properties. To discuss gradient and Hamiltonian systems we need to introduce the cotangent bundle of a smooth manifold.

Cotangent bundles

Let $f : M \to \mathbb{R}$ be a smooth function on a manifold $M$ modelled on a Banach space $E$. For each point $p$ on $M$ we have the linear tangent map

$$T_p f : T_p M \to T_{f(p)}\mathbb{R} = \mathbb{R}$$

and $T_p f$ is an element of $(T_p M)^* = L(T_p M, \mathbb{R})$. Elements of $(T_p M)^*$ are called covectors at $p$, and $(T_p M)^*$ - which we rewrite as $T_p^*M$ - is the cotangent space of $M$ at $p$. Usually we denote $T_p f$ by $df(p)$: then $df$ assigns a covector to each point $p$ on $M$, and is thus an example of a covector field on $M$.

To discuss the way $df(p)$ varies with $p$ we need to put a topology, and preferably a smooth structure, on the set $T^*M$ of all covectors at all points of $M$. This can be done by analogy with the construction for the tangent bundle $TM$ (§3.4). Choose a chart $\tau : U \to U'$ for $M$ (as on page 167) and hence identify $T_pM$ with $E$, for $p \in U$, and so identify $T_p^*M$ with $E^* = L(E,\mathbb{R})$. Corresponding to the chart $TU \to U' \times E$ for $TM$ we obtain a map $\tau^* : T^*U \to U' \times E^*$, and so an atlas for $M$ gives us an atlas for $T^*M$ as a manifold modelled on the cartesian product $E \times E^*$. If the atlas for $M$ is $C^r$ then the atlas for $T^*M$ is $C^{r-1}$. The proof of this is similar to that for the atlas obtained for $TM$, although a little more subtle. Whereas the $TM$ proof relied on the fact that a $C^r$ diffeomorphism $f : U \to V$ induces a $C^{r-1}$ map $U \times E \to V \times F$ taking $(p,h)$ to $(f(p),Ah)$ where $A = Df(p) \in L(E,F)$ (equation (**) on page 165), this time we have to use the fact that $f : U \to V$ induces a $C^{r-1}$ map $U \times E^* \to V \times F^*$ taking $(p,\mu)$ to $(f(p), (A^{-1})^*\mu)$, where for $L \in L(F,E)$ the map $L^* \in L(E^*,F^*)$ takes $\mu \in E^*$ to $\mu L \in F^*$. Thus $T^*M$ is a $C^{r-1}$ differentiable manifold called the cotangent bundle for $M$.

As with $TM$ we have a natural projection map $\pi : T^*M \to M$, and a covector field on $M$ is a section of the bundle, i.e. a map $\alpha : M \to T^*M$ satisfying $\pi \cdot \alpha = \mathrm{id}_M$. Covector fields are also called 1-forms or Pfaffians (J. Pfaff, 1765-1825).

In finite dimensions we often regard covectors as vectors by blurring the distinction between $1 \times n$ matrices and $n \times 1$ column vectors. We are really saying that since (given a choice of coordinates) any linear map $\lambda : \mathbb{R}^n \to \mathbb{R}$ can be expressed as

$$\lambda(x) = \xi_1 x_1 + \xi_2 x_2 + \dots + \xi_n x_n$$

where $\xi = (\xi_1,\xi_2,\dots,\xi_n) \in \mathbb{R}^n$, we will identify $\lambda \in \mathbb{R}^{n*}$ with $\xi \in \mathbb{R}^n$. More briefly, we let $\lambda$ correspond to $\xi$ where $\lambda(\cdot) = \xi \cdot (\cdot)$ is the standard inner product (page 52) on $\mathbb{R}^n$. This works equally well without coordinates on any finite-dimensional linear space $V$ if we replace the inner product by any non-degenerate bilinear map $\beta : V \times V \to \mathbb{R}$. There is a map $\beta_\flat : V \to V^*$ taking $v$ to $\beta(v,\cdot)$; the non-degeneracy (page 53) implies that this is injective, and then by counting dimensions it must also be surjective, i.e. an isomorphism. In the case of a Hilbert space $H$ the properties of the inner product are such that we again get an isomorphism between $H$ and $H^*$. This fact was used in discussing critical points of functions: see page 107.

Suppose that for each point $p$ on the smooth manifold $M$ we have a non-degenerate bilinear map

$$\beta(p) : T_pM \times T_pM \to \mathbb{R}.$$

Then we have $\beta_\flat(p) : T_pM \to T_p^*M$, and in the finite-dimensional or Hilbert space case we also have

$$\beta^\sharp(p) = \beta_\flat(p)^{-1} : T_p^*M \to T_pM.$$

Hence we can apply $\beta^\sharp(p)$ to convert any covector at $p$ to a vector at $p$, and so $\beta$ converts any 1-form into a vector field. To discuss continuity or smoothness of this process we construct a bundle over $M$ with fibre $M_2(T_pM \times T_pM; \mathbb{R})$ over $p$, obtaining a smooth manifold called the tensor bundle of covariant degree 2. We give no details, but the analogy with $T^*M$ is close: see e.g. Abraham [1]. A section $\beta$ of this bundle is called a covariant tensor field of degree 2, and is non-degenerate if the bilinear map $\beta(p)$ is non-degenerate for every $p$ in $M$. All that we have said so far is summarized by: any smooth non-degenerate covariant tensor field $\beta$ of degree 2 on $M$ converts a smooth covector field $\alpha$ into a smooth vector field $X$, namely the unique vector field satisfying

$$\beta(p)(X(p),v) = \alpha(p)v$$

for all $v \in T_pM$ and every $p$ in $M$.

Now we look at two particular cases.

Gradient systems

If $\beta(p)$ is symmetric and positive definite for every $p$ then $\beta$ is a Riemannian structure on $M$ (see page 178). Given a $C^r$ function $f : M \to \mathbb{R}$ the covector field $df$ is converted by $\beta$ into a $C^{r-1}$ vector field called the gradient field of $f$ with respect to $\beta$, denoted by $\mathrm{grad}_\beta f$ or just $\mathrm{grad}\, f$ if $\beta$ is understood. For example, when $M = \mathbb{R}^n$ and $\beta(p)$ is the standard inner product for all $p$ then $df(p) = Df(p)$ is the $1 \times n$ matrix

$$\left( \frac{\partial f}{\partial x_1}(p), \frac{\partial f}{\partial x_2}(p), \dots, \frac{\partial f}{\partial x_n}(p) \right)$$

and so $\mathrm{grad}\, f$ is the same expression regarded as a field of vectors in $\mathbb{R}^n$. This was our definition of gradient system in §4.1 (Example 4). If we had chosen a different inner product of the form $\sum_{i,j=1}^n g_{ij} x_i y_j$ (with $g_{ij}$ possibly depending on $p$) then $df(p)$ would not change but the $i$th component of the gradient vector at $p$ would become

$$\sum_{j=1}^n h_{ji} \frac{\partial f}{\partial x_j}(p)$$

where $(h_{ij})$ is the matrix inverse to $(g_{ij})$.

A gradient field $X$ on a (compact) manifold $M$ gives rise to a flow $\phi$ with particularly simple properties. If we write $p_t$ for $\phi_t(p)$ we have

$$\frac{d}{dt} f(p_t) = df(p_t) \cdot X(p_t) = \beta(p_t)(X(p_t),X(p_t))$$

by definition of $X$, and this last term is $\geq 0$ since $\beta(p_t)$ is positive definite. Hence $f$ increases along the orbits of $X$, and so there can be no periodic or recurrent phenomena. The non-wandering set consists of fixed points alone, namely the critical points of $f$. Moreover, the non-fixed orbits of $X$ are perpendicular (in terms of $\beta$) to the level sets $f = \text{constant}$ (remember (page 153) that away from critical points they are manifolds) since if $Y(p)$ is a tangent vector to $N = f^{-1}(k)$ then

$$\beta(p)(X(p),Y(p)) = df(p)Y(p) = 0.$$

This gives a very good grasp of the behaviour of gradient systems, and Smale [120] was able to prove that gradient systems with hyperbolic fixed points and transversal intersections of stable and unstable manifolds are structurally stable as one of the first main results of the theory. Of course, we now see this as a special case of the general structural stability theorem in §4.5. Interestingly, Smale's $\Omega$-decomposition theorem and the graph representation (page 243) allow any Axiom A system with transversality of intersection of stable and unstable manifolds to be regarded in a very loose way as a kind of overall gradient system.
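The monotonicity of $f$ along orbits of a gradient field can be checked numerically. The following is a minimal sketch under stated assumptions: the function $f(x,y) = -(x^2 + 2y^2)$, the constant metric matrix $(g_{ij})$ and the explicit Euler step size are all illustrative choices, not taken from the text.

```python
def f(x, y):
    """Sample function (an assumption for illustration)."""
    return -(x * x + 2.0 * y * y)

def partials(x, y):
    """Partial derivatives (df/dx, df/dy) of the sample function."""
    return (-2.0 * x, -4.0 * y)

# Inverse (h_ij) of the assumed metric matrix (g_ij) = [[2, 1], [1, 2]].
H = ((2.0 / 3.0, -1.0 / 3.0), (-1.0 / 3.0, 2.0 / 3.0))

def gradient_field(x, y):
    """Gradient with respect to the metric: ith component is sum_j h_ji * df/dx_j."""
    fx, fy = partials(x, y)
    return (H[0][0] * fx + H[1][0] * fy, H[0][1] * fx + H[1][1] * fy)

def orbit_values(x, y, step=0.01, n_steps=2000):
    """Follow the flow with explicit Euler steps, recording f along the way."""
    values = [f(x, y)]
    for _ in range(n_steps):
        vx, vy = gradient_field(x, y)
        x, y = x + step * vx, y + step * vy
        values.append(f(x, y))
    return values

values = orbit_values(1.0, -0.5)
print(values[0], values[-1])
```

The recorded values never decrease, and the orbit approaches the single critical point of the sample function at the origin, in line with the statement that the non-wandering set consists of critical points only.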

Hamiltonian systems

If $\beta(p)$ is anti-symmetric (i.e. $\beta(p)(u,v) = -\beta(p)(v,u)$ for every $u,v$ in $T_pM$) then $\beta$ is called a 2-form on $M$. As above, a non-degenerate 2-form will convert any covector field into a vector field. This time we see that $df$ becomes a vector field $X$ which is not perpendicular to the level sets (manifolds away from critical points of $f$ = zeros of $X$) but is tangent to them, since we have

$$\frac{d}{dt} f(p_t) = \beta(p_t)(X(p_t),X(p_t)) = 0$$

because of the anti-symmetry, and so $f$ remains constant along orbits. This means that $f$ represents some quantity which is conserved by the flow. For most $k$ (recall Sard's Theorem, page 157), the level set $f^{-1}(k)$ is a codimension-1 submanifold of $M$, and so for fixed such $k$ the system has been reduced to dimension $n-1$. If for some reason we had to apply to the system a constraint of the form $f(x) \leq k$ we would then be working with a flow on a manifold with boundary $f^{-1}(k)$.

If $\beta$ satisfies a further technical condition (namely, if it is closed: see e.g. Abraham [1] or Spivak [128]) then it becomes a symplectic form, the vector field $X$ is called a Hamiltonian vector field (W.R. Hamilton, 1805-1865) with $M$ the phase space (cf. §3.2, [8]), and $f$ is in this context called a Hamiltonian function. It so happens that the cotangent bundle $T^*N$ of any manifold $N$ has a natural symplectic 2-form that can be defined on it, and so any smooth function $f : T^*N \to \mathbb{R}$ gives automatically a Hamiltonian vector field $X$ on $T^*N$. (Remember that here $M = T^*N$, so we have $X : T^*N \to T(T^*N)$.) If we take $N$ to be an open set $U$ in $\mathbb{R}^n$, so that $T^*N = T^*U = U \times \mathbb{R}^n$, then it turns out that the vector field $X$ corresponds to the system of ordinary differential equations

$$\dot{x}_i = \frac{\partial f}{\partial y_i}, \qquad \dot{y}_i = -\frac{\partial f}{\partial x_i}, \qquad i = 1,2,\dots,n$$

where $(x,y)$ are the coordinates in $U \times \mathbb{R}^n$. These are Hamilton's equations from classical mechanics.

The global behaviour of Hamiltonian flows is very complicated, since it turns out that there can be no sources or sinks, and recurrence phenomena abound.
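Conservation of the Hamiltonian function along orbits can be seen in a small numerical experiment. This is a sketch only: the Hamiltonian $f(x,y) = \tfrac12(k^2x^2 + y^2)$ with one degree of freedom, the value $k = 2$, the sign convention $\dot x = \partial f/\partial y$, $\dot y = -\partial f/\partial x$ and the Runge-Kutta integrator are all assumptions made for illustration.

```python
def hamilton_rhs(x, y, k):
    """Hamilton's equations for f(x, y) = (k^2 x^2 + y^2) / 2:
    dx/dt = df/dy = y,  dy/dt = -df/dx = -k^2 x."""
    return y, -k * k * x

def energy(x, y, k):
    """The Hamiltonian function itself."""
    return 0.5 * (k * k * x * x + y * y)

def rk4_orbit(x, y, k=2.0, step=0.001, n_steps=5000):
    """Integrate the flow with classical fourth-order Runge-Kutta steps."""
    for _ in range(n_steps):
        a1, b1 = hamilton_rhs(x, y, k)
        a2, b2 = hamilton_rhs(x + 0.5 * step * a1, y + 0.5 * step * b1, k)
        a3, b3 = hamilton_rhs(x + 0.5 * step * a2, y + 0.5 * step * b2, k)
        a4, b4 = hamilton_rhs(x + step * a3, y + step * b3, k)
        x += step * (a1 + 2 * a2 + 2 * a3 + a4) / 6.0
        y += step * (b1 + 2 * b2 + 2 * b3 + b4) / 6.0
    return x, y

x0, y0 = 1.0, 0.0
x1, y1 = rk4_orbit(x0, y0)
drift = abs(energy(x1, y1, 2.0) - energy(x0, y0, 2.0))
print(drift)
```

Up to the small error of the integrator the value of $f$ at the end of the orbit agrees with its initial value, so the orbit stays on a single level set, in contrast with the gradient case.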

There are some quite simple properties known to be generic for Hamiltonian flows (see Robinson [106], Takens [135]), and also some extraordinarily elaborate ones (Markus and Meyer [74]). In view of these, it seems difficult to know how to formulate global stability conjectures within the Hamiltonian context.

Of course, the characteristically Hamiltonian behaviour can always be destroyed by allowing perturbations through non-Hamiltonian systems in $X(M)$. A common example is the simple harmonic system (Example 1, §4.1) in which the family of periodic orbits around the origin is converted into a spiral configuration with the origin as a sink by the addition of arbitrarily small amounts of frictional damping. Here the undamped system is Hamiltonian with $f(x_1,x_2) = k^2x_1^2 + x_2^2$ and symplectic 2-form $(u,v) \mapsto u_2v_1 - u_1v_2$ for $u,v \in \mathbb{R}^2 = T_x\mathbb{R}^2$ on $\mathbb{R}^2$.

let $E_n$ denote the set of germs at $0$ of $C^\infty$ functions $\mathbb{R}^n \to \mathbb{R}$. For convenience of writing we will be rather lax about distinguishing between germs and functions that represent them. We have previously said that two such germs $f,g$ are right equivalent if there exists a germ of a diffeomorphism $\gamma : \mathbb{R}^n \to \mathbb{R}^n$ taking $0$ to $0$ and such that $g = f \cdot \gamma$. Since we are interested only in critical points, we will now broaden the definition a little, and say that $f,g$ are equivalent if

$$g = \pm f \cdot \gamma + K$$

for some constant $K$. Suppose that we have a topology on $E_n$. Then we can define an element $f$ of $E_n$ to be stable if it has some neighbourhood $N$ in $E_n$ for which every $g$ in $N$ is equivalent to $f$. Following the general approach outlined at the beginning of this section, we let $\Sigma$ denote the subset of $E_n$ consisting of those which are not stable. To study $\Sigma$ we try to analyze the 'avoidability' of various pieces of it in general $k$-parameter families of maps $\mathbb{R}^n \to \mathbb{R}$, for $k = 1,2,\dots$

The programme behind Thom's Theorem is as follows. We ought to be able to regard $E_n$ as an infinite-dimensional manifold. Any element with non-vanishing first derivative at $0$ is stable (since it is right equivalent to its derivative (see page 110) and any two non-zero linear maps are right equivalent), and so $\Sigma$ lies in the codimension-$n$ submanifold $\Sigma_0$ consisting of germs with vanishing derivative. Further, $\Sigma$ ought to be decomposable as a collection of disjoint (although abutting) pieces $\Sigma_1, \Sigma_2, \dots$ where each $\Sigma_i$ is a codimension-$i$ submanifold of $\Sigma_0$, together with a very 'small' remaining piece. Each $\Sigma_i$ should consist of a finite number of classifiable equivalence classes. Given a $k$-parameter family of maps $f_c : \mathbb{R}^n \to \mathbb{R}$ (where $c = (c_1,c_2,\dots,c_k) \in \mathbb{R}^k$) we define a global germ map $F : \mathbb{R}^n \times \mathbb{R}^k \to E_n$ by

$$F(x,c) = \text{germ of } y \mapsto f_c(x+y) \text{ at } y = 0.$$

In other words, $F(x,c)$ is the germ at $0$ of $f_c$ with origin moved to $x$. In general we hope for this map to be transversal (page 176) to all $\Sigma_i$ and therefore to meet only those $\Sigma_i$ with $n+i \leq n+k$, avoiding the remaining piece as well. Finally, two $k$-parameter families whose global germ maps meet a given $\Sigma_i$ transversally at the same point should themselves be in some sense locally equivalent.

The initial obstacle to carrying out this programme is that $E_n$ cannot easily be made into a suitable manifold. A way around this, though, is to convert the problem into a finite-dimensional one by working with jets of some given order (see page 88), to look in this linear space of jets for submanifolds representing the $\Sigma_i$, and then to use facts about determinacy of jets (§2.9) to lift the results back up to the germ level. A second problem is that the decomposition of $\Sigma$ only works in the simple way suggested for $i \leq 5$: after that, natural candidates for $\Sigma_i$ tend to contain infinite families of equivalence classes. However, Thom was interested in $k \leq 4$ parameters for reasons mentioned below, and so this problem is avoided.

The genericity of transversality of

$F$ to the $\Sigma_i$ is expected as a consequence of Thom's powerful Transversality Theorem:

THEOREM If $M,W$ are manifolds and $S$ is a closed submanifold of $M$ then the set of maps $W \to M$ which are transversal to $S$ is an open dense subset of $C^\infty(W,M)$.

This theorem has been the springboard for many important advances in differential topology. In fact we need here a more sophisticated version to show (a) perturbations in $f$ supply enough perturbations of $F$ to obtain transversality with $\Sigma_i$, and (b) the fact that the submanifolds $\Sigma_i$ may not be closed does not matter since together they form a stratification. The required version exists, also due to Thom.

By writing $\Phi(x,c)$

instead of $f_c(x)$ we may as well regard a $k$-parameter family locally as a germ $\Phi : \mathbb{R}^n \times \mathbb{R}^k \to \mathbb{R}$ at $(0,0)$, which we will assume to be smooth in $x$ and $c$ and therefore an element of $E_{n+k}$. We call $\Phi$ an unfolding of the germ $f = f_0$. Two unfoldings $\Phi, \Psi$ of $f$ are equivalent if there exist smooth germs

$w : \mathbb{R}^n \times \mathbb{R}^k \to \mathbb{R}^n$ with $x \mapsto w(x,c)$ a diffeomorphism germ for each $c$, and $w(x,0) = x$

$b : \mathbb{R}^k \to \mathbb{R}^k$ a diffeomorphism germ with $b(0) = 0$

$a : \mathbb{R}^k \to \mathbb{R}$ with $a(0) = 0$

such that

$$\Psi(x,c) = \Phi(w(x,c),b(c)) + a(c). \qquad (*)$$

Thus $\Psi$ is obtained from $\Phi$ by (i) an invertible change of $x$-variables, depending on $c$, (ii) an invertible change $b$ of $c$-variables and (iii) addition of a number $a$ depending on $c$. We call the unfolding $\Phi$ stable if, roughly, any other unfolding $\Psi$ (of $f$) sufficiently $C^\infty$-close to $\Phi$ is equivalent to $\Phi$. In trying to make this precise we again run up against the problem of a topology for $E_{n+k}$, but one way around this is to choose a specific neighbourhood $U$ of $(0,0)$ to work with, using a $C^\infty$ topology for maps $U \to \mathbb{R}$, and then to insist on stability on $U$ for every $U$. See Wassermann [146] for a very careful discussion of these questions.

Now we can give a more technical expression of Thom's theorem.

THEOREM For any $n > 1$ there are seven equivalence classes $\Sigma_i^j$ (called strata) in $E_n$ such that

(1) each $\Sigma_i^j$ is a submanifold of $E_n$ of codimension $n+i$;

(2) generically, for a $k$-parameter family $f_c$ with $k \leq 4$ the corresponding global germ map $F$ will be transversal to the $\Sigma_i^j$ and will avoid the rest of $\Sigma$;

(3) such a $\Phi$ is versal, meaning that if the origin is chosen in $\mathbb{R}^n \times \mathbb{R}^k$ such that $\Phi$ is an unfolding of $f \in \Sigma_i^j$ then any other unfolding $\Psi$ of $f$ can be obtained from $\Phi$ by coordinate changes as in (*) except that $b$ need not be invertible. If $k = i$ then $\Phi$ is called universal: every other universal unfolding of $f$ is equivalent to $\Phi$. From this and the definition of transversal it follows that universal unfoldings are stable.

Remarks

1. The interpretation of the manifold structures is strictly speaking via jets, as mentioned above.

2. For $n = 1$ there is only one stratum of each codimension: see the comments on Thom's classification list below.

3. For $k = 5$ we include four more strata but for $k \geq 6$ we would need to include an infinite family and this would complicate the genericity property (2).

4. Under our definition $f$ is equivalent to $-f$. A sign change converts minima (= stable equilibria) into maxima and vice-versa, so in applications it is important to distinguish between $f$ and $-f$. If we do this the seven strata break into ten (not 14, since e.g. $x^3$ can be changed to $-x^3$ by $x \mapsto -x$).

5. We supposed $\Phi$ defined on all of $\mathbb{R}^n \times \mathbb{R}^k$ only for convenience; any open subset would do for the domain of $\Phi$.

6. Full details of the machinery for proving the theorem were essentially published in more general form in the work of Mather [78]. The main tool for dealing with unfoldings on the jet level is the Preparation Theorem, a classical theorem for formal power series only recently proved by Malgrange for $C^\infty$ functions. See references at the end of this chapter.

A universal unfolding of $f$ gives a full description of the way the degenerate critical point at $0$ breaks up into a cluster of critical points nearby when $f$ is perturbed in any way whatsoever that is small in the appropriate $C^\infty$ sense, thus reducing a seemingly infinitely complicated problem into a finite geometrical one. For any unfolding $\Phi$, the set $K$ of points $c \in \mathbb{R}^k$ for which $f_c$ has at least one degenerate critical point is the catastrophe set of $\Phi$. As $c$ passes through $K$ a bifurcation occurs for a flow on $\mathbb{R}^n$ governed by $f_c$. For universal unfoldings $\Phi, \Psi$ of $f$ the diffeomorphism $b$ in (*) takes the catastrophe set for $\Psi$ onto that for $\Phi$ and so the catastrophe set for any particular universal unfolding is a kind of archetype representing the catastrophe sets for all other universal unfoldings.

Thom has given a now celebrated list of seven germs representing each of the equivalence classes $\Sigma_i^j$, together with universal unfoldings for them and (theoretically) pictures of the corresponding archetypal catastrophe sets. We will not give the list, since it may be found in any of the references to catastrophe theory given below: perhaps this is the first account of the theory in which the list does not appear! Four germs are of the form $f(x) + Q$ while three are of the form $f(x,y) + Q$ where $Q$ is a non-degenerate quadratic form in the remaining $n-1$ or $n-2$ variables. Recall that from the Gromoll-Meyer lemma (page 107) any germ at a degenerate critical point can be put into the form (totally degenerate) + (non-degenerate). Implicit in Thom's list is the theorem that any germ whose degenerate part needs more than two variables will be avoidable in generic $k$-parameter families, $k \leq 4$.

Given an unfolding of a germ with degenerate critical point at $0$ it is

important to know how to find out whether the unfolding is universal or not. There is a formal criterion for this: the unfolding $\Phi$ of $f$ is universal if and only if, for every germ $g$ in $E_n$, there exist germs $h_i \in E_n$ and numbers $a_j \in \mathbb{R}$ such that

$$g(x) = a_0 + \sum_{j=1}^k a_j \frac{\partial \Phi}{\partial c_j}(x,0) + \sum_{i=1}^n h_i(x) d_i(x)$$

where $d_i(x) = \frac{\partial f}{\partial x_i}(x)$ as in §2.9.

EXAMPLES

1. $n = 1$; $f(x) = x^4$. Here $\Phi(x;c_1,c_2) = x^4 + c_1x^2 + c_2x$ is a universal unfolding since any $g$ can be written as

$$g(x) = a_0 + a_1\frac{\partial \Phi}{\partial c_1} + a_2\frac{\partial \Phi}{\partial c_2} + 4x^3 h = a_0 + a_1x^2 + a_2x + 4x^3 h$$

for some germ $h$. This is the cusp catastrophe.

2. $n = 2$; $f(x,y) = x^3 + y^3$. The unfolding

$$\Phi(x,y) = x^3 + y^3 + c_1x + c_2y + c_3xy$$

is universal, since any $g$ can be written as

$$g(x,y) = a_0 + a_1\frac{\partial \Phi}{\partial c_1} + a_2\frac{\partial \Phi}{\partial c_2} + a_3\frac{\partial \Phi}{\partial c_3} + 3x^2 h + 3y^2 k$$

for some germs $h,k$. This is the hyperbolic umbilic catastrophe.

In general when $\Phi$ is not a polynomial there will be no finite way of checking this criterion directly, so we need to replace the problem by one about polynomials rather as in the tests for determinacy in §2.9. This can be done easily enough, although it is a little elaborate to write down. See Poston and Stewart [101].
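As an illustration of a catastrophe set, consider the cusp unfolding in the form $\Phi(x;c_1,c_2) = x^4 + c_1x^2 + c_2x$ (an assumption about the normalisation). Eliminating $x$ from $\partial\Phi/\partial x = 0$ and $\partial^2\Phi/\partial x^2 = 0$ gives $c_1 = -6x^2$, $c_2 = 8x^3$, that is $8c_1^3 + 27c_2^2 = 0$. The sketch below checks this and counts critical points on either side of the curve; the grid-based root count is a crude illustrative device, not a method from the text.

```python
def d_phi(x, c1, c2):
    """Derivative in x of Phi(x; c1, c2) = x^4 + c1 x^2 + c2 x."""
    return 4 * x**3 + 2 * c1 * x + c2

def cusp_discriminant(c1, c2):
    """Vanishes exactly when d_phi has a repeated real root,
    i.e. when (c1, c2) lies on the catastrophe set."""
    return 8 * c1**3 + 27 * c2**2

def count_critical_points(c1, c2, lo=-10.0, hi=10.0, n=20001):
    """Count sign changes of d_phi on a fine grid (detects simple roots only)."""
    count = 0
    prev = d_phi(lo, c1, c2)
    for i in range(1, n):
        x = lo + (hi - lo) * i / (n - 1)
        cur = d_phi(x, c1, c2)
        if prev * cur < 0:
            count += 1
        prev = cur
    return count

# A point of the catastrophe set, parametrised by the degenerate point x = 1.
print(cusp_discriminant(-6, 8), d_phi(1, -6, 8))
# Inside the cusp (negative discriminant) and outside it (positive).
print(count_critical_points(-3.0, 1.0), count_critical_points(1.0, 1.0))
```

Crossing the curve $8c_1^3 + 27c_2^2 = 0$ changes the number of critical points of $f_c$ from three to one, which is the bifurcation that the catastrophe set records.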

The use of catastrophe theory in practice

To interpret the significance of the catastrophe set in a given family of systems we have to know how the critical points of the function $f$ are observed, via the physical or other laws which operate in the particular context. For example, when there are two or more minima of $f$ available the system may adopt the state corresponding to the lowest minimum: in this case the set of points $c$ for which $f_c$ has two equal-valued minima (the shock-wave set) will play a crucial role, rather than the catastrophe set itself.

Thom's use of catastrophes to study morphogenesis in biology is based on the following model. If $T$ is a piece of biological tissue in space and time we think of $T$ as a region in $\mathbb{R}^4$. Each cell $C$ in $T$ can be regarded as a point $c = (c_1,c_2,c_3,c_4) \in \mathbb{R}^4$, and then if we hypothesize that the internal behaviour of $C$ is described by a dynamical system governed by a potential function $f_c$ we have a 4-parameter family $\Phi$ and can apply the foregoing theory. The catastrophe set and shock-wave set are subsets of $T$ itself, and through the laws of adhesion and other properties of cells they will influence the observed manifestations of the bifurcations in $f_c$. Since we are modelling a real-life system we can assume that $\Phi$ will be everywhere locally stable as an unfolding (an assumption vital to the model) and hence since there are the local forms for

$

and the catastrophe set

K

k Rm

for cases other than

Singularities which persist (up to local equivalence) under small perturbations are stable. Stable singularities $\mathbb{R}^2 \to \mathbb{R}^2$ were classified by Whitney [149]: they have local form either

$$(x_1,x_2) \mapsto (x_1, x_2^2) \quad \text{(a fold)}$$

or

$$(x_1,x_2) \mapsto (x_1, x_2^3 + x_1x_2) \quad \text{(a cusp as in Example 4, page 102).}$$

Much is now known about stability of singularities and unfoldings of unstable ones, following the work of Mather [78]. See Golubitsky and Guillemin [42]. This information should soon find its way into any sphere of mathematical modelling concerned with $m$ functions of $n$ variables, as it has done already in mathematical economics.

4. Infinite dimensions. In continuum mechanics bifurcation theory has developed many techniques of its own (Keller and Antman [63], Sattinger [112]) and is only now beginning to interact with catastrophe theory. Sometimes the problems can be reduced to finite-dimensional ones, using the Gromoll-Meyer lemma (page 107) as in Chillingworth [?], or the powerful Centre Manifold Theorem as in Marsden, Ebin and Fisher [75] or Marsden and McCracken [76]. The ideas of genericity can sometimes be used directly in the appropriate function-space: see Chow, Hale and Mallet-Paret [29]. There are also ways in which catastrophe theory can be used in partial differential equations: see Guckenheimer [47], Duistermaat [35]. Since partial differential equations are ultimately of more use in modelling the real world than are ordinary differential equations, it is probably here that catastrophe theory will in the long run have its greatest impact.

Remarks on the literature.

Most of the recent work in global stability and genericity for dynamical systems stems from the survey of Smale [125]. See in addition the lecture notes by Markus [73], and the books by Nitecki [90] and Abraham and Robbin [?]. The progress of research can be traced through the three Proceedings volumes [22], [?7], [71], each of which also contains useful survey articles. For a treatment of differential equations with accessible proofs of some of the key results used in this chapter see the forthcoming book by Irwin and Robertson [62]. The monograph by Moser [85] relating classical and new theories in the context of celestial mechanics is highly recommended reading. In this connection see also the appealing book on differential topology and dynamical systems by Abraham [1].

The main reference in catastrophe theory is of course the treatise of Thom [138]. For mathematical background see Bröcker and Lander [21], Poston and Stewart [101], Wassermann [146], Zeeman and Trotman [159], and also the excellent survey of bifurcation theory by Arnol'd [10]. There are numerous expository articles about catastrophe theory. A very elementary introduction is Chillingworth [26]; for treatments with more mathematical content see Golubitsky [41], Sussmann [133]. The general article by Stewart [132] is valuable reading, although at present there is nothing to surpass the tour de force of Zeeman [156].

And if I've said too much, they'l say; I'm Sorry not at all; For much more unto Such, I may. And not be Criminall. Thomas Mace: Musick's Monument, 1676.


Appendix

TERMINOLOGY AND NOTATION FOR SETS AND FUNCTIONS

Sets

1. A set is a collection of objects specified either by a given list or by some defining property or process. (There are logical paradoxes inherent in this definition, but we keep well clear of them.) The objects in a set $A$ are called its elements: we write $a \in A$ to denote that $a$ is an element of $A$ or belongs to $A$. In many contexts elements are considered as points. The set consisting of elements $a_1,a_2,\dots$ is denoted by $\{a_1,a_2,\dots\}$, and if $A$ is defined as the set of elements $b$ of some other set $B$ satisfying a certain property $P$ we write $A = \{b \in B \mid b \text{ satisfies } P\}$. Standard sets we will use are:

N = natural numbers $\{1,2,3,\dots\}$
Z = integers $\{0,\pm 1,\pm 2,\dots\}$
Q = rational numbers $\{p/q$ where $p,q \in Z$ and $q \neq 0\}$
R = real numbers
C = complex numbers.

We will not define R, C here, but we assume familiarity with them.

2. The cartesian product $A \times B$ is the set of all pairs $(a,b)$ where the first object $a$ is an element of $A$ and the second object $b$ is an element of $B$. Examples are the euclidean plane $R^2 = R \times R$ and, more generally, euclidean $n$-space defined inductively as $R^n = R^{n-1} \times R$.

3. In a particular case the symbol $(a,b)$ may also be used to mean the open interval from $a$ to $b$, i.e. the set of real numbers $x$ with $a < x < b$, denoted formally by $(a,b) = \{x \in R \mid a < x < b\}$. The closed interval $[a,b]$ is $\{x \in R \mid a \leq x \leq b\}$. There are also (with the obvious meaning) half-open intervals $(a,b]$ and $[a,b)$, as well as infinite intervals $(a,\infty)$, $(-\infty,b)$, $(-\infty,\infty) = R$.

4. If every element of $A$ is automatically an element of $B$ then $A$ is a subset of $B$, or is contained in $B$: we write $A \subset B$. This allows the possibility $A = B$, although some authors prefer the notation $A \subseteq B$ if equality is allowed. Note that contained in does not mean belongs to, which applies to elements and not subsets. If $A \subset B$, the set of everything in $B$ which does not belong to $A$ is the complement of $A$ in $B$, written $B - A$ when there is no possible confusion with subtraction in an algebraic sense. The union $A \cup B$ of $A$ and $B$ is the set consisting of everything in either $A$ or $B$; the intersection $A \cap B$ is the set of everything in both $A$ and $B$. It is easy to see that if $A$ and $B$ are subsets of $C$ then

$$C - (A \cup B) = (C - A) \cap (C - B)$$
$$C - (A \cap B) = (C - A) \cup (C - B).$$

It is convenient to think formally of an empty set $\emptyset$ which has no elements at all, so that for example if $A,B$ have no elements in common we can write $A \cap B = \emptyset$. Here we say $A$ and $B$ are disjoint.

5. A relation $R$ between pairs of elements of a set $A$ is called an equivalence relation if it is reflexive ($aRa$ for every $a \in A$), symmetric ($aRb$ implies $bRa$ for every $a,b \in A$) and transitive ($aRb$ and $bRc$ together imply $aRc$ for every $a,b,c \in A$). For given $a$, the set $\{x \in A \mid xRa\}$ is called the equivalence class of $a$ with respect to $R$: it contains $a$ itself in view of reflexivity. It follows from symmetry and transitivity that any two equivalence classes either coincide or are disjoint.

6. If $A$ is a subset of R a number $M$ is called an upper bound for $A$ if $a \leq M$ for all $a \in A$. If $M$ is an upper bound for $A$, and there exists no smaller $M$ which is also an upper bound for $A$, then $M$ is a least upper bound or supremum for $A$: we write $M = \sup A$. Similarly for lower bound and greatest lower bound or infimum, written $\inf A$. It is either a theorem or an assumption about real numbers (depending on your definition of R) that every subset of R with an upper bound has a least upper bound; similarly for lower bounds.
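The set identities in item 4 and the partition property in item 5 can be tried out on small finite sets. The sets and the relation used below (congruence modulo 3) are illustrative choices, not taken from the text.

```python
C = set(range(10))
A = {0, 1, 2, 3}
B = {2, 3, 4, 5}

# The two complement identities from item 4, with complements taken in C.
assert C - (A | B) == (C - A) & (C - B)
assert C - (A & B) == (C - A) | (C - B)

def equivalence_classes(elements, related):
    """Group elements into the classes of an equivalence relation `related`.

    Comparing with a single representative per class is enough because
    the relation is assumed symmetric and transitive.
    """
    classes = []
    for a in elements:
        for cls in classes:
            if related(a, next(iter(cls))):
                cls.add(a)
                break
        else:
            classes.append({a})
    return classes

# Congruence modulo 3 on C.
classes = equivalence_classes(sorted(C), lambda a, b: (a - b) % 3 == 0)
print(classes)
```

Any two of the resulting classes are disjoint and together they exhaust the set, as item 5 asserts.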

Functions

7. A function f from A to B is a prescription for associating precisely one element of B to each element of A. We write f : A → B and denote the effect of f on a by f(a). Sometimes the brackets are omitted for simplicity. If f(a) = b we write f : a ↦ b, to be read as "f takes a to b". In formulae it is occasionally convenient to indicate a function f as f(·), with the dot keeping track of where the 'variable' has to be inserted in the formula. The word map is generally synonymous with function, although it is customary in analysis and topology to call a map a function only in the case when B = R.

8. The set A is the domain of the map f. The prescription for f(a) must be defined for every element a of A, or we are not strictly entitled to put A at the beginning of the arrow. If U is a subset of A, we may wish to consider the operation of f on elements of U alone. This restriction of f to U is written f|U. In full, this is f|U : U → B. The set B is the codomain of f, a rather formal definition since the codomain could be changed by including B in some larger set and thinking of this as the codomain without essentially doing anything to f. Not every element of the codomain need come via f from anything in A. The subset of B consisting of those elements that do come from something in A is called the range of f, or the image of f, or the image of A under f, in any case written fA. If fA is the whole of B, so that everything in B comes from something in A, then f is surjective (is a surjection).

9. Two distinct elements of A may well be taken by f to the same element of B. If this never happens, then f is injective (is an injection). Thus f is injective if and only if f(a) = f(a') implies a = a'.

10. If f is both injective and surjective it is bijective (is a bijection). In other words, it is 'one-to-one and onto'. The identity map id_A : A → A : a ↦ a is a bijection. A bijection f : A → B has an inverse f⁻¹ : B → A defined by f⁻¹(b) = unique a ∈ A satisfying f(a) = b.

11. If f : A → B and g : B → C are any two maps then they have a composition g ∘ f : A → C defined as a ↦ g(f(a)) for every a ∈ A. Note that for this to make sense the domain B of g must be the same as the codomain of f, although the image of f could be smaller than B. The inverse f⁻¹ of a bijection f : A → B is characterized by the relationships

f⁻¹ ∘ f = id_A : A → A,    f ∘ f⁻¹ = id_B : B → B.

12. Even if f is not a bijection, we can still define for each subset V of B a subset of A called the inverse image of V under f, consisting of those elements a of A for which f(a) belongs to V. This inverse image is denoted by f⁻¹(V), so

f⁻¹(V) = {a ∈ A | f(a) ∈ V}.

Remember, however, that the symbol f⁻¹ on its own is meaningless unless f is bijective. (In this case the two meanings of f⁻¹(V) coincide.) If V consists of just one element v, we write f⁻¹(v) instead of the more cumbersome but formally correct f⁻¹({v}).
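For finite sets the notions of image, injection, surjection, inverse, composition and inverse image can be checked mechanically. The sketch below assumes that a map f : A → B is stored as a Python dict {a: f(a)}; the function names are illustrative and are not notation used elsewhere in this book.

```python
def image(f):
    """The image fA of a finite map f, stored as a dict {a: f(a)}."""
    return set(f.values())


def is_injective(f):
    # f(a) = f(a') implies a = a': no value is taken twice
    return len(set(f.values())) == len(f)


def is_surjective(f, B):
    # fA is the whole of the codomain B
    return image(f) == set(B)


def inverse(f):
    """Inverse of an injective map, defined on its image.

    For a bijection f : A -> B this is f^{-1} : B -> A.
    """
    if not is_injective(f):
        raise ValueError("f is not injective, so it has no inverse")
    return {b: a for a, b in f.items()}


def compose(g, f):
    """The composition g o f : a |-> g(f(a))."""
    return {a: g[f[a]] for a in f}


def inverse_image(f, V):
    """f^{-1}(V) = {a in A | f(a) in V}; defined even when f is not a bijection."""
    return {a for a in f if f[a] in V}


A, B = {1, 2, 3}, {"x", "y", "z"}
f = {1: "x", 2: "y", 3: "x"}   # neither injective nor surjective onto B
h = {1: "x", 2: "y", 3: "z"}   # a bijection A -> B

print(inverse_image(f, {"x"}))   # the elements of A that f takes to "x"
print(compose(inverse(h), h))    # the identity map on A
```

Note that `inverse_image(f, V)` makes sense for the non-injective map f, whereas `inverse(f)` raises an error, in line with the remark that the symbol for the inverse on its own is meaningless unless the map is bijective.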

EXAMPLES

(i) f : R → R : x ↦ sin x. Neither injective nor surjective. fR = closed interval [-1,1]; f⁻¹(0) = {nπ | n = 0, ±1, ±2, ...}.

(ii) f : [-π/2, π/2] → [-1,1] : x ↦ sin x. Bijective.

(iii) Let M_n = n × n matrices (real numbers as entries), n ≥ 2. Define d : M_n → R by d(M) = det(M). Then d is surjective but not injective, and d⁻¹(R - {0}) is the set of non-singular matrices.

(iv) Let S be the unit sphere in R³, and let θ, φ be the Euler angles (see almost any book on mechanics). Define f : [...] → S by f(θ, φ) = point on S with Euler angles θ, φ. Then f is injective but not surjective. If instead we define F : [...] → S by the same wording then F is surjective but not injective.

13. A function f : A → R is bounded if fA has upper and lower bounds. We write sup_{a∈A} f(a) for sup fA; similarly for inf.

14. An infinite set A is countable if there is a bijection f : A → N. The set Q of rational numbers is countable, but R is not. See e.g. Willard [150].
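A simple instance of countability is the set Z of all integers. The sketch below gives one explicit bijection Z → N together with its inverse; it assumes that N is taken to be {0, 1, 2, ...} (an assumption made here, not a convention fixed in the text), and the names `to_natural` and `to_integer` are invented for the illustration.

```python
def to_natural(z):
    """Send the integers 0, -1, 1, -2, 2, ... to 0, 1, 2, 3, 4, ..."""
    return 2 * z if z >= 0 else -2 * z - 1


def to_integer(n):
    """Inverse map N -> Z: even n come from z >= 0, odd n from z < 0."""
    return n // 2 if n % 2 == 0 else -(n + 1) // 2


print([to_natural(z) for z in (0, -1, 1, -2, 2)])
print([to_integer(n) for n in range(5)])
```

Non-negative integers are sent to even numbers and negative integers to odd numbers, so no value is taken twice and none is missed.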

References

[1] Abraham, R. Foundations of Mechanics, Benjamin, New York 1967.

[2] Abraham, R. and Robbin, J.W. Transversal Mappings and Flows, Benjamin, New York 1967.

[3] Adams, J.F. Vector fields on spheres, Ann. Math. 75 (1962), 603-632.

[4] Adams, J.F. Lectures on Lie Groups, Benjamin, New York 1969.

[5] Agoston, M. Algebraic Topology: A First Course, Marcel Dekker, New York 1976.

[6] Ahlfors, L.V. and Sario, L. Riemann Surfaces, Princeton University Press 1960.

[7] Andronov, A.A. and Pontrjagin, L.S. Systèmes grossiers, Dokl. Akad. Nauk. 14 (1937), 247-251.

[8] Anosov, D.V. Roughness of geodesic flows on closed Riemannian manifolds of negative curvature, Soviet Math. Dokl. 3 (1962), 1068-1070.

[9] Arnol'd, V.I. Singularities of smooth mappings, Russian Math. Surveys 23 (1968), 1-43.

[10] Arnol'd, V.I. Lectures on bifurcations in versal families, Russian Math. Surveys 27 (1972), 54-124.

[11] Arnol'd, V.I. Ordinary Differential Equations, M.I.T. Press, Cambridge, Mass. 1973.

[12] Arnol'd, V.I. Critical points of smooth functions and their normal forms, Russian Math. Surveys 30 (1975).

[13] Auslander, L. and MacKenzie, R.E. Geometry of Manifolds, Academic Press, New York 1964.

[14] Berry, M.V. Waves and Thom's theorem, Adv. Phys. 25 (1976), 1-26.

[15] Bishop, R.L. and Crittenden, R.J. Geometry of Manifolds, Academic Press, New York 1964.

[16] Blackett, D.W. Elementary Topology, Academic Press, New York 1967.

[17] Borel, A. et al. Seminar on Transformation Groups, Annals of Mathematics Studies 46, Princeton University Press 1960.

[18] Bourbaki, N. Éléments de Mathématique 33: Variétés Différentiables et Analytiques; Fascicule de Résultats, Hermann, Paris 1967.

[19] Bowen, R. Equilibrium States and the Ergodic Theory of Anosov Diffeomorphisms, Lecture Notes in Mathematics 470, Springer-Verlag, Berlin-Heidelberg 1975.

[20] Brickell, F. and Clark, R.S. Differentiable Manifolds, Van Nostrand Reinhold, London 1970.

[21] Bröcker, Th. and Lander, L.C. Differentiable germs and catastrophes, L.M.S. Lecture Notes Series 17, Cambridge University Press 1975.

[22] Chern, S.S. and Smale, S. (eds.) Global Analysis: Proceedings of A.M.S. Symposium in Pure Mathematics Vol. XIV, American Mathematical Society, Providence, R.I. 1970.

[23] Chern, S.S. and Smale, S. (eds.) As above, Vol. XV.

[24] Chevalley, C. Theory of Lie Groups, Princeton University Press 1946.

[25] Chillingworth, D.R.J. Smooth manifolds and maps, in Global Analysis and its Applications, Vol. 1, IAEA, Vienna 1974.

[26] Chillingworth, D.R.J. Elementary catastrophe theory, Bull. Inst. Math. Appl. 11 (1975), 155-159.

[27] Chillingworth, D.R.J. The catastrophe of a buckling beam, in [71].

[28] Chillingworth, D.R.J. and Furness, P.M.D. Reversals of the earth's magnetic field, in [71].

[29] Chow, S.N., Hale, J.K. and Mallet-Paret, J. Applications of generic bifurcation, I, Arch. Rat. Mech. Anal. 59 (1975), 159-188.

[30] Coddington, E. and Levinson, N. Theory of Ordinary Differential Equations, McGraw-Hill, New York 1955.

[31] Cullen, C. Matrices and Linear Transformations, Addison-Wesley, Reading, Mass. 1966.

[32] Dieudonné, J. Foundations of Modern Analysis, Academic Press, New York 1960.

[33] Dodson, C.T.J. and Poston, T. Tensor Geometry, Pitman, London 1976.

[34] Dodson, M.M. Darwin's law of natural selection and Thom's theory of catastrophes, Math. Biosciences 28 (1976), 243-274.

[35] Duistermaat, J.J. Oscillatory integrals, Lagrange immersions and unfoldings of singularities, Comm. Pure Appl. Math. 27 (1974), 207-281.

[36] Eells, J. Singularities of Smooth Maps, Nelson, London 1968.

[37] Field, M.J. Equivariant dynamical systems, Bull. A.M.S. 76 (1970), 1314-1318.

[38] Franks, J.M. Ω-stability: diffeomorphisms and flows, in Proceedings of Colloquium on Smooth Dynamical Systems (ed. D.R.J. Chillingworth), Southampton University 1972.

[39] Franks, J. Absolutely structurally stable diffeomorphisms, Proc. A.M.S. 37 (1973), 293-296.

[40] Franks, J. Time dependent stable diffeomorphisms, Inv. Math. 24 (1974), 163-172.

[41] Golubitsky, M. An introduction to catastrophe theory and its applications (to appear).

[42] Golubitsky, M. and Guillemin, V. Stable Mappings and Their Singularities, Graduate Texts in Mathematics 14, Springer-Verlag, New York 1973.

[43] Griffiths, H.B. and Hilton, P.J. A Comprehensive Textbook of Classical Mathematics, Van Nostrand Reinhold, London 1970.

[44] Grobman, D. Homeomorphisms of systems of differential equations, Dokl. Akad. Nauk. 128 (1959), 880-881.

[45] Gromoll, D. and Meyer, W. On differentiable functions with isolated critical points, Topology 8 (1969), 361-369.

[46] Guckenheimer, J. Absolutely Ω-stable diffeomorphisms, Topology 11 (1972), 195-197.

[47] Guckenheimer, J. Catastrophes and partial differential equations, Ann. Inst. Fourier (Grenoble) 23 (1973), 31-59.

[48] Guillemin, V. and Pollack, A. Differential Topology, Prentice-Hall, New Jersey 1974.

[49] Halmos, P. Finite-Dimensional Vector Spaces, Van Nostrand, Princeton, N.J. 1958.

[50] Hartman, P. On the local linearization of differential equations, Proc. A.M.S. 14 (1963), 568-573.

[51] Hartman, P. Ordinary Differential Equations, Wiley, New York 1964.

[52] Hirsch, M.W. Differential Topology, Graduate Texts in Mathematics, Springer-Verlag, New York 1976.

[53] Hirsch, M.W. and Pugh, C.C. Stable manifolds and hyperbolic sets, in [22].

[54] Hirsch, M.W., Palis, J., Pugh, C.C. and Shub, M. Neighbourhoods of hyperbolic sets, Inv. Math. 9 (1970), 121-134.

[55] Hirsch, M.W. and Smale, S. Differential Equations, Dynamical Systems, and Linear Algebra, Academic Press, New York 1974.

[56] Hochschild, G. The Structure of Lie Groups, Holden-Day, San Francisco 1955.

[57] Hocking, J.G. and Young, G.S. Topology, Addison-Wesley, Reading, Mass. 1961.

[58] Hoffman, K. Analysis in Euclidean Space, Prentice-Hall, New Jersey 1975.

[59] Hurewicz, W. Lectures on Ordinary Differential Equations, M.I.T. Press, Cambridge, Mass. 1958.

[60] Iooss, G. Bifurcation of a periodic solution of the Navier-Stokes equations into an invariant torus, Arch. Rat. Mech. Anal. 58 (1975), 57-76.

[61] Irwin, M.C. On the stable manifold theorem, Bull. L.M.S. 2 (1970), 196-198.

[62] Irwin, M.C. and Robertson, S.A.R. (To appear).

[63] Keller, J.B. and Antman, S. Bifurcation Theory and Nonlinear Eigenvalue Problems, Benjamin, New York 1969.

[64] Kervaire, M. A manifold which does not admit any differentiable structure, Comm. Math. Helv. 34 (1960), 257-270.

[65] Kupka, I. Contribution à la théorie des champs génériques, Contrib. Diff. Eqs. 2 (1963), 457-484; 3 (1964), 411-420.

[66] Lang, S. Introduction to Differentiable Manifolds, Wiley (Interscience), New York 1962.

[67] Lang, S. Analysis I, Addison-Wesley, Reading, Mass. 1968.

[68] Lang, S. Differential Manifolds, Addison-Wesley, Reading, Mass. 1972. (Revised and expanded version of [66].)

[69] Lefschetz, S. Differential Equations: Geometric Theory, Wiley (Interscience), New York 1957.

[70] Loomis, L.H. and Sternberg, S. Advanced Calculus, Addison-Wesley, Reading, Mass. 1968.

[71] Manning, A.K. (ed.) Dynamical Systems - Warwick 1974, Lecture Notes in Mathematics 468, Springer-Verlag, Berlin-Heidelberg 1975.

[72] Markov, A.A. The unsolvability of the problem of homeomorphy, Proc. Internat. Congress of Mathematicians 1958, Cambridge University Press 1960, pp. 300-306 (in Russian).

[73] Markus, L. Lectures in Differentiable Dynamics, C.B.M.S. Regional Conference Series in Mathematics 3, American Mathematical Society, Providence, R.I. 1971.

[74] Markus, L. and Meyer, K.R. Solenoids in generic Hamiltonian dynamics (to appear).

[75] Marsden, J., Ebin, D., and Fischer, A. Diffeomorphism groups, hydrodynamics and relativity, Proceedings of the 13th Biennial Seminar of the Canadian Mathematical Congress (ed. J.R. Vanstone), C.M.C. 1972.

[76] Marsden, J.E. and McCracken, M. The Hopf bifurcation and its Applications, Applied Math. Series 19, Springer-Verlag, Berlin-Heidelberg-New York 1976.

[77] Massey, W.S. Algebraic Topology: An Introduction, Harcourt, Brace and World, New York 1967.

[78] Mather, J. Stability of C^∞ mappings: I. Ann. Math. 87 (1968), 89-104; II. Ann. Math. 89 (1969), 254-291; III. Publ. Math. I.H.E.S. No. 35 (1968), 127-156; IV. Publ. Math. I.H.E.S. No. 37 (1969), 223-248; V. Adv. in Math. 4 (1970), 301-336; VI. in Proceedings of Liverpool Singularities Symposium I (ed. C.T.C. Wall), Lecture Notes in Mathematics 192, Springer-Verlag, Berlin-Heidelberg 1971.

[79] Maunder, C.R.F. Algebraic Topology, Van Nostrand, New York 1970.

[80] Mendelson, B. Introduction to Topology, Allyn & Bacon, Boston, Mass. 1962.

[81] Milnor, J.W. On manifolds homeomorphic to the 7-sphere, Ann. Math. 64 (1956), 399-405.

[82] Milnor, J.W. Morse Theory, Annals of Mathematics Studies 51, Princeton University Press 1963.

[83] Milnor, J.W. Topology from the Differentiable Viewpoint, University Press of Virginia 1965.

[84] Mirsky, L. An Introduction to Linear Algebra, Oxford University Press 1955.

[85] Moser, J. Stable and Random Motions in Dynamical Systems: With Special Emphasis on Celestial Mechanics, Annals of Mathematics Studies 77, Princeton University Press 1973.

[86] Munkres, J.R. Elementary Differential Topology, Annals of Mathematics Studies 54, Princeton University Press 1961 (revised 1966).

[87] Narasimhan, R. Analysis on Real and Complex Manifolds, Masson, Paris 1968.

[88] Newhouse, S. On simple arcs between structurally stable flows, in [71].

[89] Newhouse, S. and Palis, J. Bifurcations of Morse-Smale dynamical systems, in [97].

[90] Nitecki, Z. Differentiable Dynamics, M.I.T. Press, Cambridge, Mass. 1971.

[91] Nomizu, K. Fundamentals of Linear Algebra, McGraw-Hill, New York 1966.

[92] Palais, R.S. Morse theory on Hilbert manifolds, Topology 2 (1963), 299-340.

[93] Palais, R.S. Critical point theory and the minimax principle, in [23].

[94] Patterson, E.M. Topology, Oliver and Boyd, Edinburgh 1956.

[95] Peixoto, M.M. Structural stability on two-dimensional manifolds, Topology 1 (1962), 101-120.

[96] Peixoto, M.M. On an approximation theorem of Kupka and Smale, J. Diff. Eqns. 3 (1967), 214-227.

[97] Peixoto, M.M. (ed.) Dynamical Systems: Proceedings of Salvador Symposium 1971, Academic Press, New York 1973.

[98] Peixoto, M.M. On the classification of flows on 2-manifolds, in [97].

[99] Pitts, C.G.C. Introduction to Metric Spaces, Oliver and Boyd, Edinburgh 1972.

[100] Poènaru, V. Singularités C^∞ en présence de symétrie, Lecture Notes in Mathematics 510, Springer-Verlag, Berlin-Heidelberg 1976.

[101] Poston, T. and Stewart, I.N. Taylor Expansions and Catastrophes, Research Notes in Mathematics 7, Pitman, London 1976.

[102] Poston, T., Stewart, I.N. and Woodcock, A.E.R. The Geometry of the Higher Catastrophes, Research Notes in Mathematics, Pitman, London (to appear).

[103] Pugh, C.C. The closing lemma, Amer. J. Math. 89 (1967), 956-1009.

[104] Robbin, J.W. On the existence theorem for differential equations, Proc. A.M.S. 19 (1968), 1005-1006.

[105] Robbin, J.W. A structural stability theorem, Ann. Math. 94 (1971), 447-493.

[106] Robinson, R.C. Generic properties of conservative systems, I, II, Amer. J. Math. 92 (1970), 562-603 and 897-906.

[107] Robinson, R.C. Structural stability of C^1 flows, in [71].

[108] Robinson, R.C. Structural stability of C^1 diffeomorphisms, to appear in J. Diff. Eqns.

[109] Rudin, W. Principles of Mathematical Analysis, 2nd ed., McGraw-Hill, New York 1964.

[110] Ruelle, D. and Takens, F. On the nature of turbulence, Comm. Math. Phys. 20 (1971), 167-192.

[111] Sard, A. The measure of the critical values of differentiable maps, Bull. A.M.S. 48 (1942), 883-890.

[112] Sattinger, D.H. (ed.) Topics in Stability and Bifurcation Theory, Lecture Notes in Mathematics 309, Springer-Verlag, Berlin-Heidelberg 1973.

[113] Sewell, M.J. Some mechanical examples of catastrophe theory, Bull. Inst. Math. Appl. 12 (1976), 163-172.

[114] Shields, P.C. Linear Algebra, Addison-Wesley, Reading, Mass. 1964.

[115] Shub, M. Stability and genericity for diffeomorphisms, in [97].

[116] Shub, M. Structurally stable systems are dense, Bull. A.M.S. 78 (1972), 817-818.

[117] Siersma, D. Classification and deformation of singularities; thesis, Amsterdam 1974.

[118] Simmons, G.F. Introduction to Topology and Modern Analysis, McGraw-Hill, New York 1963.

[119] Smale, S. Morse inequalities for a dynamical system, Bull. A.M.S. 66 (1960), 43-49.

[120] Smale, S. On gradient dynamical systems, Ann. Math. 74 (1961), 199-206.

[121] Smale, S. Generalized Poincaré's conjecture in dimensions greater than four, Ann. Math. 74 (1961), 391-406.

[122] Smale, S. Stable manifolds for differential equations and diffeomorphisms, Ann. Scuola Norm. Sup. Pisa 18 (1963), 97-116.

[123] Smale, S. An infinite dimensional version of Sard's theorem, Amer. J. Math. 87 (1965), 861-866.

[124] Smale, S. Structurally stable systems are not dense, Amer. J. Math. 88 (1966), 491-496.

[125] Smale, S. Differentiable dynamical systems, Bull. A.M.S. 73 (1967), 747-817.

[126] Sotomayor, J. Generic one-parameter families of vector fields, Publ. Math. I.H.E.S. No. 43 (1974), 5-46.

[127] Sotomayor, J. Generic bifurcations of dynamical systems, in [97].

[128] Spivak, M. Calculus on Manifolds, Benjamin, New York 1965.

[129] Spivak, M. A Comprehensive Introduction to Differential Geometry, Publish or Perish, Boston, Mass. 1970.

[130] Stamm, E. Introduction to differential topology, Proceedings of the 13th Biennial Seminar of the Canadian Mathematical Congress (ed. J.R. Vanstone), C.M.C. 1972.

[131] Sternberg, S. On the structure of local homeomorphisms of euclidean n-space II, Amer. J. Math. 80 (1958), 623-631.

[132] Stewart, I.N. The seven elementary catastrophes, New Scientist 68 (1975), 447-454.

[133] Sussmann, H.J. Catastrophe theory, Synthèse 31 (1975), 229-270.

[134] Sutherland, W.A. Introduction to Metric and Topological Spaces, Oxford University Press 1975.

[135] Takens, F. Hamiltonian systems: Generic properties of closed orbits and local perturbations, Math. Ann. 188 (1970), 304-312.

[136] Takens, F. Singularities of vector fields, Publ. Math. I.H.E.S. No. 43 (1973), 47-100.

[137] Takens, F. Tolerance stability, in [71].

[138] Thom, R. Stabilité Structurelle et Morphogénèse, Benjamin Advanced Book Program, Reading, Mass. 1972. English translation by D.H. Fowler, Benjamin A.B.P. 1975.

[139] Thom, R. A global dynamical scheme for vertebrate embryology, (A.A.A.S., 1971: Some Math. Questions in Biology VI), A.M.S. Lectures on Mathematics in the Life Sciences 5 (1973), 3-45.

[140] Thom, R. and Zeeman, E.C. Catastrophe theory: its present state and future perspectives, in [71].

[141] Thompson, J.M.T. and Hunt, G.W. A General Theory of Elastic Stability, Wiley, London 1973.

[142] Thompson, J.M.T. and Hunt, G.W. Towards a unified bifurcation theory, Z.A.M.P. 26 (1975), 581-604.

[143] Thompson, M. The geometry of confidence: an analysis of the Enga te and Hagen moka, a complex system of pig-giving in the New Guinea Highlands, to appear in Rubbish Theory, Paladin.

[144] Tromba, A.J. The Morse lemma on arbitrary Banach spaces, Bull. A.M.S. 79 (1973), 85-86.

[145] Wallace, A.H. Differential Topology: First Steps, Benjamin, New York 1968.

[146] Wassermann, G. Stability of Unfoldings, Lecture Notes in Mathematics 393, Springer-Verlag, Berlin-Heidelberg 1974.

[147] Whitney, H. Differentiable manifolds, Ann. Math. 37 (1936), 645-680.

[148] Whitney, H. The self-intersections of a smooth n-manifold in 2n-space, Ann. Math. 45 (1944), 220-246.

[149] Whitney, H. On singularities of mappings of Euclidean spaces I, Mappings of the plane into the plane, Ann. Math. 62 (1955), 374-410.

[150] Willard, S. General Topology, Addison-Wesley, Reading, Mass. 1970.

[151] Williams, R.F. Expanding attractors, Publ. Math. I.H.E.S. No. 43 (1974), 169-204.

[152] Zeeman, E.C. C^0 density of stable diffeomorphisms and flows, in Proceedings of Colloquium on Smooth Dynamical Systems (ed. D.R.J. Chillingworth), Southampton University 1972.

[153] Zeeman, E.C. On the unstable behaviour of stock exchanges, J. Math. Economics 1 (1974), 39-49.

[154] Zeeman, E.C. Primary and secondary waves in developmental biology, (A.A.A.S., 1974: Some Mathematical Questions in Biology VIII), A.M.S. Lectures on Mathematics in the Life Sciences 7 (1974), 69-161.

[155] Zeeman, E.C. Levels of structure in catastrophe theory, in Proceedings of the International Congress of Mathematicians, Vancouver 1974, Canadian Mathematical Congress 1975, II, 533-546.

[156] Zeeman, E.C. Catastrophe theory, Scientific American 234 (1976), 65-83.

[157] Zeeman, E.C. Euler buckling, in Catastrophe Theory - Seattle 1975, Lecture Notes in Mathematics 525, Springer-Verlag, Berlin-Heidelberg 1976.

[158] Zeeman, E.C. The umbilic bracelet and the double cusp catastrophe, in Catastrophe Theory - Seattle 1975, as above.

[159] Zeeman, E.C. and Trotman, D.J.A. The classification of elementary catastrophes of codimension ≤ 5, in Catastrophe Theory - Seattle 1975, as above.

Addenda

[160] Field, M.J. Differential Calculus and its Applications, Van Nostrand Reinhold, London 1976.

[161] Meyer, K.R. Generic bifurcations in Hamiltonian systems, in [71].

Index

α-limit set
accumulation point
action of Lie group
analytic
Anosov system
atlas
attractor
Axiom A
Baire
Banach space
basic set
basis
Betti numbers
bifurcation
bijection
bilinear
bound, bounded
boundary
bundle
C^1, C^2, C^r, C^∞, C^ω
C^r topology
Cantor set
cartesian product
catastrophe
catastrophe set
Chain Rule
chart
circle
closed set
closure
Closing Lemma
codimension
codomain
compact
compact manifold
complement
complete
complex
composition
conjugate
connected
constant map
constraint
continuous
continuous linear map
convergence
co-ordinate chart
cotangent bundle
countable
covariant tensor
covector
covector field
critical point
critical value
cross-section
cusp
cylinder
damping
degenerate
dense
derivative
determinacy
determinant
diffeomorphism
diffeomorphism, local
difference equation
differentiable
differentiable manifold
differential equation