English Pages 291 p. ; [308] Year c1976.
Research Notes in Mathematics
D R J Chillingworth
Differential topology with a view to applications
Pitman LONDON SAN FRANCISCO-MELBOURNE
9
Titles in this series
1 Improperly posed boundary value problems, A Carasso and A P Stone
2 Lie algebras generated by finite dimensional ideals, I N Stewart
3 Bifurcation problems in nonlinear elasticity, R W Dickey
4 Partial differential equations in the complex domain, D L Colton
5 Quasilinear hyperbolic systems and waves, A Jeffrey
6 Solution of boundary value problems by the method of integral operators, D L Colton
7 Taylor expansions and catastrophes, T Poston and I N Stewart
8 Function theoretic methods in differential equations, R P Gilbert and R J Weinacht
9 Differential topology with a view to applications, D R J Chillingworth
10 Characteristic classes of foliations, H V Pittie
11 Stochastic integration and generalized martingales, A U Kussmaul
12 Zeta-functions: An introduction to algebraic geometry, A D Thomas
13 Explicit a priori inequalities with applications to boundary value problems, V G Sigillito
14 Nonlinear diffusion, W E Fitzgibbon III and H F Walker
15 Unsolved problems concerning lattice points, J Hammer
16 Edge-colourings of graphs, S Fiorini and R J Wilson
17 Nonlinear analysis and mechanics: Heriot-Watt Symposium Volume I, R J Knops
18 Actions of finite abelian groups, C Kosniowski
19 Closed graph theorems and webbed spaces, M De Wilde
D R J Chillingworth University of Southampton
PITMAN PUBLISHING LIMITED 39 Parker Street, London WC2B 5PB FEARON-PITMAN PUBLISHERS INC. 6 Davis Drive, Belmont,California 94002,USA
Associated Companies
Copp Clark Ltd, Toronto
Pitman Publishing New Zealand Ltd, Wellington
Pitman Publishing Pty Ltd, Melbourne
First published 1976
Reprinted 1977
Reprinted 1978
AMS Subject Classifications: (main) 58A05, 34-02 (subsidiary) 26A57, 34C-, 34D30, 58F-, 70-34
Library of Congress Cataloging in Publication Data
Chillingworth, David. Differential topology with a view to applications. (Research notes in mathematics; 9) Includes bibliographical references. 1. Differential topology. I. Title. II. Series. QA613.6.C48 514'.7 76-28202
ISBN 0-273-00283-X
© D R J Chillingworth 1976
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording and/or otherwise without the prior written permission of the publishers. The paperback edition of this book may not be lent, resold, hired out or otherwise disposed of by way of trade in any form of binding or cover other than that in which it is published, without the prior consent of the publishers.
Reproduced and printed by photolithography in Great Britain at Biddles of Guildford
Preface With the increasing use of the language and machinery of differential topology in practical applications, many research workers in the physical, biological and social sciences are eager to learn some of the background to the subject, but are frustrated by the lack of any self-contained treatment that neither is too pure in approach nor is written at too advanced a level. Few people have the time and energy to embark on a systematic study in a field other than their own, and so this book is written with the purpose of making differential topology accessible in one volume as a working tool for applied scientists.
It could also serve as a guide for graduate students
finding their way around the subject before plunging into a more thorough treatment of one aspect or another.
The book should perhaps be read more
in the spirit of a novel (with a rather diffuse ending) than as a text-book. The particular aim is to study the global qualitative behaviour of
dynamical systems, although there are numerous byways and diversions along the route.
A dynamical system is some system (economic, physical,
biological ...) which evolves with time.
Given a starting point,
the system
moves within a universe of possible states according to known or hypothesized laws, often describable locally by a formula for the 'infinitesimal' evolution, namely a differential equation.
The global
theory is the theory of all possible evolutions from all possible initial states,
together with the way these fit together and relate to each other.
Qualitative theory is concerned with the existence of constant (equilibrium) behaviour, periodic or recurrent behaviour, and long-term behaviour, together with questions of local and overall stability of the system. Global qualitative techniques, mainly stemming from the work of Henri Poincaré (1854-1912), are important both because precise quantitative theoretical solutions may in general be unobtainable, and because in any case a qualitative model is the basis of a sound mental picture without which mechanical calculation is highly dangerous. The natural universe of evolution for a dynamical system is often a differentiable manifold; the evolution itself is a flow on the manifold, and a differential equation for infinitesimal evolution becomes a vector field on the manifold.
Chapters 1-3 of the book are concerned with defining
and explaining these terms, while Chapter 4 goes into the qualitative theory of flows on manifolds, ending with some discussion of bifurcation theory. There is an Appendix on basic terminology and notation for set theory. Inevitably there are many topics which should have been included or developed but which would have expanded the volume to twice its size. Differential forms are hardly mentioned, singularity theory is only touched upon, and the fascinating terrain of general bifurcation theory for differential equations, including the Centre Manifold Theorem (one of the few really practical applications of differential topology), is left largely unexplored.
I hope the tantalized reader will be able to follow up these
topics via the references given. Formal prerequisites are kept to a minimum.
The ideas from topology and
linear algebra that are needed are mostly developed from first principles, so that the basic requirements are hardly more than a familiarity with derivatives and partial derivatives in elementary calculus - although these, too, are defined in the text.
The exceptions to this are complex numbers, which are assumed to be well-known objects to mathematically-minded scientists, and determinants and eigenvalues of matrices which may be less well-known to some but are everyday equipment for others.
My excuse for
this logical inconsistency is lack of space and the need to draw a line somewhere:
I felt that it was more important to discuss carefully some
of the fundamental ideas about linear spaces upon which the rest of the structure is built than to go on to techniques familiar to many people and in any case quite accessible elsewhere.
In the first three Chapters the
complex numbers feature mainly in examples and illustrations but in their roles as eigenvalues they become crucial to the main plot in Chapter 4. As overall references for the qualitative theory of dynamical systems, I suggest the now historic survey article by Smale [125] and the subsequent very readable lecture notes of Markus [73]. The excellent books on differential equations by Arnol'd [11] and Hirsch and Smale [55] are both directed towards the qualitative theory of flows on manifolds. For background on differential topology a recent and attractive text is Guillemin and Pollack [48]: there is also a forthcoming* book by Hirsch [52]. The fascinating article on applications to fluid mechanics and relativity by Marsden, Ebin, and Fisher [75] is highly recommended (see also the
introduction to differential topology by Stamm in the same volume). The present book grew from a series of lectures given to a mixed audience of pure and applied mathematicians, engineers, physicists and economists at Southampton University in 1973/74.
It is through the
encouragement of several of these colleagues that I have expanded the lecture notes into book form, and I am grateful to them and others for helpful comments and criticisms.
I am particularly indebted to Peter Stefan
of the University College of North Wales, Bangor who carefully read the original notes and offered many detailed suggestions for improvement.
* Now appeared: highly recommended
Despite all this assistance, I claim the credit for errors.
I would also
like to thank Professor Umberto Mosco and Professor Nicolaas Kuiper for hospitality at the Istituto Matematico dell'Università di Roma and the I.H.E.S., Bures-sur-Yvette, respectively, during visits to which I wrote up much of the notes.
I am grateful also to Pitman Publishing for their
interest in the book and patience during its production,
to my wife Ann for
tolerating the side-effects, and especially to Cheryl Saint and Jenny Medley for spending many long hours producing such a perfect typescript. Finally, my special thanks go to Les Lander for taking upon himself the task of drawing all the figures in the book, and obtaining such professional results in a short space of time.
David Chillingworth Southampton, August 1976.
I am grateful to Mike Irwin, Mark Mostow and my father H.R. Chillingworth for kindly pointing out a number of misprints and errors which, fortunately, I have been able to correct for the second printing of the book. Also I would like to take this opportunity to draw attention to the Appendix, which explains some set-theoretic terminology and notation that is used without comment in the text.
D.R.J.C. May 1977.
Contents

1. Basic topological ideas
1.1 The concept of a function  1
1.2 Continuity  4
1.3 Continuity from a more general viewpoint  9
1.4 Further topological concepts  18
1.5 Homeomorphism of spaces and equivalence of maps  25
1.6 Compactness  27
1.7 Connectedness  33
Remarks on the literature  35

2. Calculus
2.1 Differentiation  36
2.2 Linear spaces and linear maps  38
2.3 Normed linear spaces  45
2.4 Differentiation (continued)  53
2.5 Properties and uses of the derivative  64
2.6 Higher orders of differentiation  72
2.7 Germs and jets  80
2.8 Local structure of differentiable maps I: Non-singular behaviour  92
2.9 Local structure of differentiable maps II: Singularities  100
Remarks on the literature  115

3. Differentiable manifolds and maps
3.1 The concept of a differentiable manifold  116
3.2 Remarks, comments and more examples of differentiable manifolds  131
3.3 The structure of differentiable maps between manifolds  146
3.4 Tangent bundles and tangent maps  162
3.5 Vector fields and differential equations  179
Remarks on the literature  189

4. Qualitative theory of dynamical systems
4.1 Flows and diffeomorphisms  191
4.2 Local behaviour near fixed points and periodic orbits  204
4.3 Some global behaviour  217
4.4 Generic properties of flows and diffeomorphisms  221
4.5 Global stability  227
4.6 Dynamical systems under constraint  245
4.7 Breakdown of stability: bifurcation theory  252
Remarks on the literature  267

APPENDIX: Terminology and notation for sets and functions  268
REFERENCES  273
Index  284
1 Basic topological ideas
1.1 THE CONCEPT OF A FUNCTION
We usually visualize a real-valued function of a real variable in terms of its graph. For a given number $x$ (i.e. $x \in \mathbb{R}$) the function $f$ provides another real number $f(x)$. The graph consists of all points in a plane coordinatized by $(x,y)$ which satisfy $y = f(x)$, or in formal notation
$$\operatorname{graph}(f) = \{(x,y) \in \mathbb{R} \times \mathbb{R} \mid y = f(x)\}.$$
Another 'picture' of the function, though less useful than the above, is the following:
[Figure 2: source $\mathbb{R}$ and target $\mathbb{R}$, with $x \mapsto f(x)$ and $x' \mapsto f(x')$]
For a real function $f$ of two real variables $(x_1,x_2) \in \mathbb{R} \times \mathbb{R} = \mathbb{R}^2$, the graph could be thought of as a landscape, with $(x_1,x_2)$ as coordinates in a 'horizontal' plane and $f(x_1,x_2)$ being measured 'vertically'. Formally, we have
$$\operatorname{graph}(f) = \{((x_1,x_2), y) \in \mathbb{R}^2 \times \mathbb{R} = \mathbb{R}^3 \mid y = f(x_1,x_2)\}.$$
Alternatively, we can picture $f$ by a 'source and target' picture:
[Figure 3: source $\mathbb{R}^2$ and target $\mathbb{R}$, with $(x_1,x_2) \mapsto f(x_1,x_2)$]
Now suppose we have two functions $f_1, f_2$ of two variables $(x_1,x_2) \in \mathbb{R} \times \mathbb{R} = \mathbb{R}^2$. We can consider them both at the same time by writing
$$f(x_1,x_2) = (f_1(x_1,x_2), f_2(x_1,x_2))$$
so that $f$ is a function from $\mathbb{R}^2$ to $\mathbb{R}^2$. The 'graph' picture in this case is harder to visualize, since by analogy with the previous definition of graph we have
$$\operatorname{graph}(f) = \{((x_1,x_2),(y_1,y_2)) \in \mathbb{R}^2 \times \mathbb{R}^2 \mid y_i = f_i(x_1,x_2),\ i = 1, 2\}$$
and so the graph is a subset of $\mathbb{R}^2 \times \mathbb{R}^2 = \mathbb{R}^4$.
In a similar way, if we are given $k$ functions of $n$ variables we can put them all together to obtain a corresponding function from $\mathbb{R}^n$ to $\mathbb{R}^k$. We write $x = (x_1, x_2, \ldots, x_n)$, let
$$f(x) = (f_1(x), f_2(x), \ldots, f_k(x)),$$
and keep in mind the picture:
[Figure 5]
In formal notation this would be written as
$$f : \mathbb{R}^n \to \mathbb{R}^k.$$
The graph of $f$ would be a subset of $\mathbb{R}^n \times \mathbb{R}^k = \mathbb{R}^{n+k}$, difficult to visualize and in general not yielding as much intuitive information about the behaviour of $f$ as we might extract from Figure 5 by considering the way $f$ causes $\mathbb{R}^n$ to be folded up and twisted inside $\mathbb{R}^k$. The techniques for analyzing such 'folds and twists' are those of topology and calculus together, or differential topology. Thus we see already how a study of differential topology may give useful insight into the ways in which $k$ functions of $n$ variables can mutually interact both in general circumstances and in particular cases. Our aim in the first two chapters will be to develop some of these basic techniques.
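As a minimal sketch of the construction above, the following Python fragment bundles $k$ real-valued functions of $n$ variables into a single map from $\mathbb{R}^n$ to $\mathbb{R}^k$ and forms a point of its graph in $\mathbb{R}^{n+k}$. The helper names `combine` and `graph_point` and the two component functions are hypothetical choices made for illustration only; they do not come from the text.

```python
from typing import Callable, Sequence, Tuple

Point = Tuple[float, ...]


def combine(components: Sequence[Callable[[Point], float]]) -> Callable[[Point], Point]:
    """Bundle k real-valued functions of n variables into one map R^n -> R^k."""
    def f(x: Point) -> Point:
        # f(x) = (f_1(x), f_2(x), ..., f_k(x))
        return tuple(f_i(x) for f_i in components)
    return f


def graph_point(f: Callable[[Point], Point], x: Point) -> Point:
    """Return the point (x, f(x)) of graph(f), a subset of R^n x R^k = R^(n+k)."""
    return tuple(x) + tuple(f(x))


# Hypothetical component functions of two variables (n = 2, k = 2).
f1 = lambda x: x[0] + x[1]
f2 = lambda x: x[0] * x[1]

f = combine([f1, f2])
value = f((2.0, 3.0))                 # a point of R^2
point = graph_point(f, (2.0, 3.0))    # a point of R^4
```

Here the graph point has $n + k = 4$ coordinates, which is one way to see why the graph picture stops being easy to visualize even for two functions of two variables.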
Remarks
1. It is frequently necessary to consider functions which are not defined for all values of the variables $x = (x_1, x_2, \ldots, x_n)$ but only for $(x_1, x_2, \ldots, x_n)$ belonging to some subset $U$, say, of $\mathbb{R}^n$. In this case we of course write $f : U \to \mathbb{R}^k$.
2. The word function is usually reserved for real-valued functions only, i.e. those of the form $f : (\text{something}) \to \mathbb{R}$. In other cases (e.g. for $f : U \to \mathbb{R}^k$, $k > 1$) we tend to use the term map or mapping.
1.2 CONTINUITY
Roughly speaking, a map is continuous if by making a small perturbation in the input (i.e. the independent variables) you obtain only a small change in the output (i.e. the dependent variables). However, this is obviously far too vague a definition. For example, we would wish to think of the function $g : \mathbb{R} \to \mathbb{R}$ defined by $g(x) = 10^{23} x$ as being continuous (indeed, its graph is a straight line), but it could be argued that small changes in $x$ produce very large changes in $g(x)$. The formal definition of continuity for functions $f : \mathbb{R} \to \mathbb{R}$ is as follows:
(a) Continuity at a point $x_0$. The function $f$ is continuous at $x_0$ if, given any positive number $\varepsilon$ (thought of as admissible margin of error in the output), it is then possible to find another positive number $\delta$ such that perturbing $x_0$ by less than $\delta$ causes $f(x_0)$ to vary by less than $\varepsilon$. Symbolically,
$$|x_0 - x| < \delta \quad\text{implies}\quad |f(x_0) - f(x)| < \varepsilon.$$
(b) Continuity. If we are considering $f$ defined on some subset $U$ of $\mathbb{R}$, then we say that $f$ is continuous on $U$ if it is continuous at each point $x_0$ belonging to $U$. If $U$ is the whole of $\mathbb{R}$ or is in any case understood from the context then we simply say that $f$ is continuous.
Note that the function $g$ in the above example is continuous, since for any $x_0$ and any $\varepsilon$ we may take $\delta = 10^{-23} \varepsilon$.
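The choice $\delta = 10^{-23}\varepsilon$ for $g(x) = 10^{23}x$ can be checked with exact rational arithmetic. The sketch below is only an illustration: the names `delta_for` and `implication_holds` and the sample values of $x_0$ and $\varepsilon$ are assumptions made here, and checking one perturbation is of course not a proof.

```python
from fractions import Fraction

SCALE = Fraction(10) ** 23  # the slope 10^23 of g


def g(x: Fraction) -> Fraction:
    """The function g(x) = 10^23 * x, evaluated exactly."""
    return SCALE * x


def delta_for(eps: Fraction) -> Fraction:
    """The delta proposed in the text: delta = 10^(-23) * eps."""
    return eps / SCALE


def implication_holds(x0: Fraction, x: Fraction, eps: Fraction) -> bool:
    """Check that |x0 - x| < delta implies |g(x0) - g(x)| < eps for this x."""
    delta = delta_for(eps)
    if abs(x0 - x) < delta:
        return abs(g(x0) - g(x)) < eps
    return True  # the hypothesis fails, so the implication holds vacuously


# Hypothetical sample values: margin of error 1/1000 at the point x0 = 7.
eps = Fraction(1, 1000)
x0 = Fraction(7)
x = x0 + delta_for(eps) / 2   # a perturbation smaller than delta
ok = implication_holds(x0, x, eps)
```

The perturbation of size $\delta/2$ changes the output by exactly $\varepsilon/2$, so a tiny input tolerance is needed but one always exists, which is all that continuity asks for.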
Remarks
1. It is tempting to try to combine (a) and (b) by saying (hopefully) that $f$ is continuous on $U$ if, given any $\varepsilon > 0$, there exists a $\delta > 0$ such that $|x - y| < \delta$ implies $|f(x) - f(y)| < \varepsilon$ for all $x$ and $y$ in $U$. This condition implies (b) but it does not follow from (b). The point is that for the genuine definition of continuity we must allow $\delta$ to depend on $x_0$ (as well as on $\varepsilon$). If we do not do this, but demand that the same $\delta$ should apply everywhere (given $\varepsilon$), then we have uniform continuity - a notion which is in fact of considerable importance in contexts involving approximating functions by other functions as for example in certain techniques of numerical analysis.
2. All 'standard' functions such as polynomials, exponentials, $\sin x$ and so on can easily be proved to be continuous where they are defined. The only hazard to continuity in combining them ad libitum is the risk of dividing by a function which vanishes somewhere.
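The distinction drawn in Remark 1 between continuity and uniform continuity can be seen in a short worked example. The function $f(x) = x^2$ on $\mathbb{R}$ is chosen here purely for illustration and is not taken from the text.

```latex
% Illustrative example (an assumption of this sketch): f(x) = x^2 on R.
% Continuity at x_0: delta is allowed to depend on x_0.
\[
  |f(x) - f(x_0)| = |x - x_0|\,|x + x_0| ,
  \qquad
  |x - x_0| < 1 \;\Longrightarrow\; |x + x_0| < 1 + 2|x_0| ,
\]
\[
  \text{so}\quad
  \delta = \min\!\left(1,\ \frac{\varepsilon}{1 + 2|x_0|}\right)
  \quad\text{gives}\quad
  |x - x_0| < \delta \;\Longrightarrow\; |f(x) - f(x_0)| < \varepsilon .
\]
% Failure of uniform continuity: no single delta serves every x_0.
\[
  \text{For fixed } \delta > 0 \text{ and } x_0 > 0,\ \text{put } x = x_0 + \tfrac{\delta}{2}:
  \qquad
  |f(x) - f(x_0)| = \tfrac{\delta}{2}\left(2x_0 + \tfrac{\delta}{2}\right) > \delta x_0 \ge \varepsilon
  \quad\text{whenever } x_0 \ge \frac{\varepsilon}{\delta} .
\]
```

The admissible $\delta$ shrinks as $|x_0|$ grows, so $f$ is continuous at every point of $\mathbb{R}$ without being uniformly continuous on $\mathbb{R}$.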
It is easy to see how the definition of continuity will go over to functions $f : \mathbb{R}^n \to \mathbb{R}^k$, since all that is necessary is to replace the modulus by the euclidean distance from the origin in $\mathbb{R}^n$ or $\mathbb{R}^k$ (see Example 3 below). However, we shall need to study the 'small change in input gives small change in output' problem in situations where the input and output may be rather more complicated than numbers or n-tuples of real numbers; for example, they may be differential equations, or perhaps collections of functions. Therefore we want to generalize the definition of continuity so that it applies to maps $f : A \to B$ where $A$ and $B$ are sets other than $\mathbb{R}^n$ or subsets of $\mathbb{R}^n$. To do this we need notions of distance in both sets $A$, $B$; then we could simply say
(a) $f : A \to B$ is continuous at $x_0 \in A$ if, given $\varepsilon > 0$, there exists $\delta > 0$ such that if the distance from $x$ to $x_0$ (in $A$) is less than $\delta$ then the distance from $f(x)$ to $f(x_0)$ (in $B$) is less than $\varepsilon$;
(b) $f : A \to B$ is continuous if it is continuous at every point in $A$.
Now it turns out that the minimal properties that a 'distance function' $d$ on a set $S$ needs to have in order to reflect adequately the basic relationships of distance in euclidean space are these:
(1) $d(s,s')$ is always $\ge 0$, and $= 0$ when and only when $s = s'$;
(2) $d(s,s') = d(s',s)$ for all $s$ and $s'$ in $S$;
(3) $d(s,s'') \le d(s,s') + d(s',s'')$ for all $s$, $s'$ and $s''$ in $S$.
(This last property is known as the triangle inequality.) A distance function satisfying (1), (2) and (3) is called a metric.
DEFINITION A function $d$ which satisfies (1), (2) and (3) above is called a metric on $S$. A set $S$ together with a particular metric on it is called a metric space.
Note that $d$ is actually a function from $S \times S$ to $\mathbb{R}$, and not a function on $S$ itself. The image of $d$ is contained in the set $\mathbb{R}^+_0$ of non-negative real numbers (by (1)), and so we could write $d : S \times S \to \mathbb{R}^+_0$.
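Properties (1), (2) and (3) can be spot-checked numerically on a finite sample of points. The following sketch assumes Python 3.8 or later (for `math.dist`); the function name `is_metric_on_sample`, the tolerance and the sample points are hypothetical. Passing the check on a sample does not prove that a function is a metric, although failing it does show that it is not one.

```python
import itertools
import math


def is_metric_on_sample(d, points, tol=1e-12):
    """Check properties (1)-(3) of a metric on a finite sample of points."""
    for s, t in itertools.product(points, repeat=2):
        if d(s, t) < 0:                      # (1) d is never negative
            return False
        if (d(s, t) == 0) != (s == t):       # (1) zero exactly when s = s'
            return False
        if abs(d(s, t) - d(t, s)) > tol:     # (2) symmetry
            return False
    for s, t, u in itertools.product(points, repeat=3):
        if d(s, u) > d(s, t) + d(t, u) + tol:  # (3) triangle inequality
            return False
    return True


def euclidean(x, y):
    """The euclidean distance between two points of R^n."""
    return math.dist(x, y)


def squared_gap(x, y):
    """A hypothetical non-example on R: squaring breaks the triangle inequality."""
    return (x[0] - y[0]) ** 2


plane_sample = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (3.0, 4.0)]
line_sample = [(0.0,), (1.0,), (2.0,)]

euclidean_ok = is_metric_on_sample(euclidean, plane_sample)
squared_ok = is_metric_on_sample(squared_gap, line_sample)
```

The squared gap satisfies (1) and (2) but fails (3) on the sample, since the distance from 0 to 2 would be 4 while the two intermediate steps contribute only 1 each.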
EXAMPLES of metric spaces
1. $S = \mathbb{R}$, $d(x,y) = |x - y|$.
2. $S = \mathbb{R}^2$, $d(x,y) = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2}$.
3. $S = \mathbb{R}^n$, $d(x,y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$.
Examples 1 and 2 are special cases of Example 3, known as the euclidean metric or usual metric in $\mathbb{R}^n$.
4. $S = \{\text{bounded functions } f : [a,b] \to \mathbb{R}\}$, $d(f,g) = \sup_{a \le x \le b} |f(x) - g(x)|$.
5. $S = \{\text{differentiable functions } f : (a,b) \to \mathbb{R} \text{ with bounded derivative}\}$, $d(f,g) = \sup$ [...]
[...] given any point $x$ in $W$ there exists $\delta > 0$ such that the ball $B_\delta(x)$ is contained in $W$.
This leads to a general definition:
DEFINITION Let $S$ be a metric space. A subset $W$ of $S$ is an open set if, given any $x$ in $W$, there exists some $\delta > 0$ with $B_\delta(x)$ contained in $W$.
For example, the interval $\{x \mid 2 < x < 3\}$ is an open set in the real line $\mathbb{R}$, whereas the interval $\{x \mid 2 < x \le 3\}$ is not (because $x = 3$ fails to lie in any $\delta$-ball contained in the interval). Observe that the line segment $\{(x,y) \mid 2 < x < 3,\ y = 0\}$ is not an open set in $\mathbb{R}^2$, since every $\delta$-ball will now have to contain points with $y \ne 0$. On the other hand $\{(x,y) \mid 2$ [...]
Suppose first that $f$ is continuous. Let $V$ be an open set in $B$; we must show that $f^{-1}(V)$ is open in $A$. Choose any $x \in f^{-1}(V)$. Then $f(x) \in V$, and so since $V$ is open we can find $\varepsilon > 0$ with $B_\varepsilon(f(x)) \subset V$. Now the continuity of $f$ guarantees the existence of some $\delta > 0$ with $f(B_\delta(x)) \subset B_\varepsilon(f(x)) \subset V$, so $B_\delta(x) \subset f^{-1}(V)$. This argument applies to each $x \in f^{-1}(V)$, so we have shown that $f^{-1}(V)$ is open.
Conversely, let us assume the property about open sets and deduce the continuity of $f$. Choose a point $x \in A$. For any $\varepsilon > 0$ the $\varepsilon$-ball $B_\varepsilon(f(x))$ is an open set in $B$ (an easy exercise, again using the triangle inequality), and so by hypothesis $f^{-1}(B_\varepsilon(f(x)))$ is open in $A$. This means that since $x \in f^{-1}(B_\varepsilon(f(x)))$ there is some $\delta > 0$ with $B_\delta(x) \subset f^{-1}(B_\varepsilon(f(x)))$ or, in other words, $f(B_\delta(x)) \subset B_\varepsilon(f(x))$. This applies for each $\varepsilon > 0$, and so proves the continuity of $f$ at $x$. Since $x$ was an arbitrary point of $A$ we have shown that $f$ is continuous.
In view of this theorem it is clear that in the study of continuity of maps between metric spaces it is the family of open sets in each space which is important, rather than the actual metric. More precisely, if two different metrics give rise to the same family of open sets then any map which is continuous using one metric will automatically be continuous using the other. The family of open sets of a metric space is called its topology. All that is needed, therefore, in order to define the concept of continuity of maps from any set $A$ to another set $B$ (whether or not they are given as metric spaces) is the notion of a topology for each of $A$ and $B$.
Naively, we could select any family $\mathcal{F}$ of subsets of $A$ and postulate that $\mathcal{F}$ shall be the topology for $A$. Not surprisingly this does not lead very far, unless we insist that the family $\mathcal{F}$ obey a few simple rules which the family of open sets in a metric space does obey. The rules which turn out to be crucial are these:
(1) If $U$ and $V$ belong to $\mathcal{F}$ then so does $U \cap V$.
(2) If each $U_\lambda$ belongs to $\mathcal{F}$, for some family $\{U_\lambda\}_{\lambda \in \Lambda}$, then so does $\bigcup_{\lambda \in \Lambda} U_\lambda$.
For convenience we also add a formal clause:
(3) Both the whole set $A$ and the empty set $\emptyset$ belong to $\mathcal{F}$.
Note that clause (1) implies that the intersection of any finite number of sets in $\mathcal{F}$ also belongs to $\mathcal{F}$, but it does not guarantee anything about the intersection of an infinite number of them. On the other hand, clause (2) asserts that the union of an arbitrary number of sets in $\mathcal{F}$ must still belong to $\mathcal{F}$.
It is straightforward to verify (making good use of the triangle inequality) that the family of open sets in a metric space does satisfy (1), (2) and (formally) (3). This means that we are now able to define what we mean by a topology on an arbitrary set, whether or not we are given a metric.
DEFINITION Let $S$ be a set. A family $\mathcal{F}$ of subsets satisfying the rules (1), (2) and (3) above is called a topology for $S$. The set $S$ together with its topology is a topological space.
By analogy with the metric space situation, the sets in the topology of $S$ are referred to as open sets.
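For a finite set the three rules can be checked mechanically. The sketch below uses a hypothetical helper `is_topology`, not anything from the text. It relies on the fact that a finite family has only finitely many subfamilies, so closure under pairwise unions (together with the empty set supplied by clause (3)) already gives clause (2); for infinite families this shortcut would not be enough.

```python
from itertools import combinations


def is_topology(S, family):
    """Check rules (1)-(3) for a finite family of subsets of a finite set S."""
    S = frozenset(S)
    F = {frozenset(U) for U in family}
    if S not in F or frozenset() not in F:   # clause (3)
        return False
    if any(not U <= S for U in F):           # members must be subsets of S
        return False
    for U, V in combinations(F, 2):
        if U & V not in F:                   # rule (1): pairwise intersections
            return False
        if U | V not in F:                   # rule (2), finite case: pairwise unions
            return False
    return True


# Hypothetical families of subsets of S = {1, 2, 3}.
S = {1, 2, 3}
nested = [set(), {1}, {1, 2}, S]     # a chain of sets
broken = [set(), {1}, {2}, S]        # the union {1, 2} is missing
all_subsets = [set(), {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, S]
trivial = [set(), S]

results = [is_topology(S, fam) for fam in (nested, broken, all_subsets, trivial)]
```

The second family fails only because rule (2) is violated, which shows that the rules really do restrict which families of subsets may serve as a topology.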
EXAMPLES of topological spaces
1. Obviously, every metric space is naturally a topological space - since the properties of open sets in a metric space inspired the general definition of topology. The topology arising from the usual metric on $\mathbb{R}^n$ is called the usual topology.
2. If $S$ is any set then the family of all subsets of $S$ satisfies (1), (2) and (3) and so is itself a topology. It is called the discrete topology. It is the topology obtained by giving $S$ the discrete metric (§1.2, Example 8), as follows easily from the observation that each point by itself forms an open set with respect to the discrete metric because $B_\delta(x)$ consists of $x$ alone when $\delta < 1$.
3. At the other extreme, we may take the family consisting of no open sets except those legally required by clause (3), namely $S$ itself and the empty set $\emptyset$. This is sometimes called the indiscrete topology.
4. Suppose $S$ is a set with a given topology $\mathcal{F}$, and $T$ is some arbitrary subset of $S$ (not necessarily belonging to $\mathcal{F}$). Then $T$ can be given a topology $\mathcal{F}_T$ as follows: $V$ belongs to $\mathcal{F}_T$ precisely when $V$ is of the form $T \cap U$ where $U$ is some set belonging to $\mathcal{F}$. The verification of (1), (2), (3) for $\mathcal{F}_T$ is easy. This topology is called the induced topology or subspace topology on $T$. If $\mathcal{F}$ comes from a metric on $S$ then $\mathcal{F}_T$ in fact comes from the induced metric (§1.2, Example 7) on $T$.
5. If $S$, $T$ are two topological spaces there is a natural way of constructing a topology on the cartesian product $S \times T$ from those of $S$ and $T$. It is tempting to define a set in $S \times T$ to be open if it is of the form $U \times V$ where $U$, $V$ are open in $S$, $T$ respectively, but unfortunately this fails to satisfy rule (2) in general. (For example if $S = T = \mathbb{R}$ with the usual topology then sets $U \times V$ can be thought of as rectangles, and a union of rectangles may not be a rectangle.) Instead we define a set in $S \times T$ to be open if it is the union of sets of the form $U \times V$ where $U$, $V$ are open in $S$, $T$ respectively. It is straightforward to verify that this does satisfy the rules for a topology on $S \times T$. It is called the product topology on $S \times T$. It is an instructive exercise to check that the product topology on $\mathbb{R}^n \times \mathbb{R}^m$ ($\mathbb{R}^n$, $\mathbb{R}^m$ with usual topology) is the same as the usual topology on $\mathbb{R}^{n+m} = \mathbb{R}^n \times \mathbb{R}^m$.
6. Let $S$ be a topological space, and suppose that $S$ is expressed as the disjoint union of a (possibly infinite) family of sets $S_\lambda$. For example, in certain types of application the elements of $S$ may be some objects which we are attempting to classify and each $S_\lambda$ may represent a collection of objects having a certain property $P_\lambda$ in common. If we now broaden our focus and decide to regard two objects as 'the same' if they belong to the same $S_\lambda$, we are led to consider instead of $S$ a new set $\tilde{S}$, namely the set whose elements are themselves the $S_\lambda$. This set $\tilde{S}$ inherits a topology from that of $S$ in the following way. Given any set $W \subset \tilde{S}$ let $\tilde{W}$ denote the subset of $S$ consisting of the union of all those $S_\lambda$ which belong to $W$. Then define $W$ to be open in $\tilde{S}$ if $\tilde{W}$ is open in $S$. See Figure 7. Again, it is routine to verify that this defines a topology on $\tilde{S}$.
There is an equivalent but neater way to express this construction. Let $S$ be a topological space and let $R$ be some equivalence relation defined on $S$. The equivalence classes of $R$ give a decomposition of $S$ into disjoint subsets $S_\lambda$ as above, and conversely such a decomposition defines an equivalence relation on $S$ - namely, that of belonging to the same $S_\lambda$. Denote the set of equivalence classes by $S/R$: this corresponds to $\tilde{S}$. We have a natural map $\pi : S \to S/R$ taking each $x \in S$ to its equivalence class. Define a set $U \subset S/R$ to be open if $\pi^{-1}(U)$ is open in $S$. Then this is the same definition as above, and so it equips $S/R$ $(= \tilde{S})$ with a topology. This topology on $S/R$ is called the quotient topology.
If the topology on $S$ comes from a metric it will not necessarily follow that the quotient topology comes from a metric on $S/R$. In general, there is no such thing as a 'quotient metric'. This is one more reason for studying topologies rather than metrics.
Returning now to the motivating idea behind the generalization of metric space to topological space, we will formulate a definition of continuity for maps between topological spaces. In view of the above theorem and the subsequent discussion there is only one plausible definition that coincides with our previous definition for metric spaces.
DEFINITION Let $f : A \to B$ be a map between two topological spaces (topologies understood). Then $f$ is continuous if $f^{-1}(V)$ is open in $A$ whenever $V$ is open in $B$.
Again we emphasize that there is nothing here to say that if $U$ is open in $A$ then $f(U)$ should be open in $B$.
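On finite topological spaces the definition can be applied directly by computing inverse images. In the sketch below a map is represented as a Python dictionary; the names `preimage` and `is_continuous` and the two-point space used as an example are assumptions made for illustration.

```python
def preimage(f, domain, V):
    """Return f^{-1}(V): the points of the domain that f sends into V."""
    return frozenset(x for x in domain if f[x] in V)


def is_continuous(f, A, top_A, top_B):
    """f is continuous if f^{-1}(V) is open in A whenever V is open in B."""
    open_in_A = {frozenset(U) for U in top_A}
    return all(preimage(f, A, V) in open_in_A for V in top_B)


# Hypothetical example: the two-point set {1, 2} with open sets {}, {1}, {1, 2},
# used both as the source A and as the target B.
A = {1, 2}
top = [set(), {1}, {1, 2}]

identity = {1: 1, 2: 2}
swap = {1: 2, 2: 1}        # the inverse image of {1} is {2}, which is not open
constant = {1: 2, 2: 2}    # continuous, although its image {2} is not open

checks = (
    is_continuous(identity, A, top, top),
    is_continuous(swap, A, top, top),
    is_continuous(constant, A, top, top),
)
image_of_A = frozenset(constant[x] for x in A)
```

The constant map illustrates the warning above: it is continuous, yet it carries the open set $A$ onto $\{2\}$, which is not open in the target.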
The main advantage of this topological definition of continuity is that it avoids reference to any specific metric, whether or not there is one available. It is often useful to work with topological spaces (such as function spaces) which can be given metrics but only in rather artificial ways that distract attention from the more natural topological structures. Indeed, there are contexts in which it is necessary to deal with topological spaces having no metric structure at all which is compatible with the topology.
The second advantage is the usefulness and economy of the definition for handling theoretical statements about continuous maps. For example, consider the following proof of the fact that the composition of two continuous maps is continuous. The proof for metric spaces (using $\delta$, $\varepsilon$) is of course straightforward, but not as tidy.
THEOREM If $f : A \to B$ and $g : B \to C$ are continuous then so is the composition $g \circ f : A \to C$.
Proof Choose any open set $V$ in $C$. Then $g^{-1}(V)$ is open in $B$ by the continuity of $g$, so $f^{-1}(g^{-1}(V))$ is open in $A$ by the continuity of $f$. But $f^{-1}(g^{-1}(V)) = (g \circ f)^{-1}(V)$; this finishes the proof.

1.4 FURTHER TOPOLOGICAL CONCEPTS
(i) An open set $U$ containing the point $x$ is called an open neighbourhood of $x$. For technical reasons it can be useful to think in slightly more general terms, and to call any set $N$ simply a neighbourhood of the point $x$ if it contains an open neighbourhood of $x$. Thus for example the interval $\{x \in \mathbb{R} \mid 1 \le x \le 3\}$ (though not open) is a neighbourhood of the point 2 in $\mathbb{R}$, but it is not a neighbourhood of either of the points 1 or 3 in $\mathbb{R}$.
There is a characterization of open sets that is convenient to use in practice when checking whether a given set is open or not. The proof is straightforward from the definition of topology (page 14) with clause (2) playing an important part.
Characterization of open sets
A set $U$ is open if and only if for each point $x \in U$ there is a neighbourhood $N_x$ of $x$ with $N_x$ contained in $U$.
We see from this that we can use 'neighbourhood' in topological spaces in contexts where we would use $B_\delta$ in metric spaces. For example, we can define a map $f : A \to B$ to be continuous at a point $x_0 \in A$ (a concept which we have only used so far in metric spaces) if given any neighbourhood $N$ of the point $y_0 = f(x_0) \in B$ there is a neighbourhood $M$ of $x_0$ with $f(M) \subset N$. It can then be checked that a map between two topological spaces is continuous (as defined on page [...]) if and only if it is continuous at $x_0$ (as defined above) for each $x_0 \in A$. This now brings us back satisfyingly in a full circle to the original definition of continuity in a metric space (page 9) with which we began.
(ii)
Given any set
W
in a topological space
union of alt the subsets of
W
we can consider the
S
which are open sets in
S
with the induced topology - but the converse may not hold). open
(by clause
contained in
It is called the interior of
measures how 'thick'
W
has empty interior in in
R
is
W
This is
and so can be thought of as the largest open set
(2))
W.
(so open in
is as a subset of R
.
the open interval
S.
W,
and in a sense
Thus a line or a plane
The interior of the closed interval
[a,bj
(a,b), but both have empty interior in the
2 plane
R .
The set
Q
of rational numbers and its
set of irrational numbers,
(iii)
open nor closed; simultaneously. open and closed. open.
each have empty interior in
A subset of a topological
complement is an open set.
complement I,
the
R.
space is called a closed set if its
Generally speaking, most sets are neither
it is also possible for a set to be both open and closed In a discrete space,
for example,
all
sets are both
Thus a closed set is not simply one which fails to be
The reason for introducing this
terminology is that topological
arguments frequently go through much more rapidly when expressed in terms
19
of closed sets
than they would do if they dealt with open sets directly.
Observe that by taking complements the useful characterization of open sets given above is transformed automatically into the following:

Characterization of closed sets  A set $C$ is closed if and only if for each point $y$ not in $C$ there is a neighbourhood of $y$ entirely disjoint from $C$.

(iv) Let $W$ be any subset of the topological space $S$, and define a point $x$ (which may or may not belong to $W$) to be an accumulation point or limit point of $W$ if each neighbourhood of $x$ contains at least one point of $W$ apart from (possibly) the point $x$ itself. Thus there are points of $W$ 'arbitrarily close' to $x$ but different from $x$. It is easy to see that the characterization of closed sets above can now be converted into the statement that a set is closed if and only if it contains all its limit points.

Given an arbitrary set $W$, not necessarily closed or open, we can manufacture a set consisting of $W$ together with all its limit points. This set is called the closure of $W$ and is denoted by $\overline{W}$. Obviously $W \subset \overline{W}$, and the set $C$ is closed if and only if $C = \overline{C}$. The following facts are not difficult to verify:

(1) For any set $W$ the closure $\overline{W}$ is a closed set, i.e. $\overline{\overline{W}} = \overline{W}$.
(2) $\overline{W}$ is the intersection of all closed sets that contain $W$.
(3) $\overline{W_1 \cup W_2} = \overline{W_1} \cup \overline{W_2}$.

However, note that $\overline{W_1 \cap W_2} \neq \overline{W_1} \cap \overline{W_2}$ necessarily. For example, take $W_1 = \{x \in R \mid x < a\}$, $W_2 = \{x \in R \mid x > a\}$. Then $\overline{W_1 \cap W_2} = \emptyset$ but $\overline{W_1} \cap \overline{W_2} = \{a\}$.
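The closure facts can be checked by brute force on a small finite space. The sketch below is only an illustration: the three-point space and its list of open sets are assumptions chosen for the purpose, not taken from the text. The closure is computed as the set of points all of whose open neighbourhoods meet $W$.

```python
# Hypothetical three-point space; the open sets are an assumption
# chosen for illustration only.
X = frozenset({0, 1, 2})
OPEN_SETS = [frozenset(s) for s in (set(), {0}, {2}, {0, 2}, {0, 1, 2})]


def closure(w):
    """Return w together with all its limit points.

    A point y lies outside the closure exactly when some open set
    containing y is disjoint from w.
    """
    w = frozenset(w)
    outside = {y for y in X
               if any(y in u and not (u & w) for u in OPEN_SETS)}
    return X - outside


W1, W2 = frozenset({0}), frozenset({2})

# Fact (1): taking the closure twice changes nothing.
assert closure(closure(W1)) == closure(W1)
# Fact (3): the closure of a union is the union of the closures.
assert closure(W1 | W2) == closure(W1) | closure(W2)
# Intersections can fail: W1 and W2 are disjoint, yet their closures meet.
print(sorted(closure(W1 & W2)), sorted(closure(W1) & closure(W2)))
```

Here the point 1 plays the role of the point $a$ in the example on the real line: it belongs to neither set but to both closures.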
If $A$ is a subset of $B$, and yet occupies so much of $B$ that $\overline{A}$ is equal to or contains $B$, then we say that $A$ is dense in $B$. Other equivalent ways of saying this are that every point of $B$ is either in $A$ or is an accumulation point of $A$, or that every non-empty open subset of $B$ contains points of $A$, or that the complement of $A$ in $B$ has empty interior. For example, the set of non-integer numbers is dense (and open) in $R$. The set of rational numbers is dense (and not open) in $R$. The concept of denseness is very important in mathematical characterizations of 'usual' as opposed to 'unusual' properties of maps, dynamical systems, and so on: we shall make considerable use of it. See §3.3 (Sard's theorem) and also §4.4.

(v) The familiar ideas of approaching a limit and convergence can easily be fitted into the topological framework constructed so far. In a metric space context (e.g. in euclidean space) we say

$f(x)$ approaches $c$ as $x$ approaches $x_0$

if, given any $\varepsilon > 0$, there then exists some $\delta > 0$ with the property that $d(f(x), c) < \varepsilon$ for all $x$ with $d(x, x_0) < \delta$. In topological terms this becomes

$f(x)$ approaches $c$ as $x$ approaches $x_0$

if, given any neighbourhood $N$ of $c$ there then exists some neighbourhood $M$ of $x_0$ with the property that $f(x) \in N$ for all $x \in M$. We write

$$f(x) \to c \quad \text{as} \quad x \to x_0 \qquad \text{or} \qquad \lim_{x \to x_0} f(x) = c$$

and call $c$ the limit of $f(x)$ as $x \to x_0$.
The wording of these definitions sounds familiar, and in fact we note that it is an echo of the definition of 'continuity at a point' given earlier in this section. What, then, is the exact relationship between these ideas? Proceeding carefully from the definitions alone we soon see that to say

$$f(x) \to c \quad \text{as} \quad x \to x_0$$

is the same as to say that the map $g$ defined by

$$g(x) = f(x) \ \text{when} \ x \neq x_0, \qquad g(x_0) = c$$

is continuous at $x_0$. Therefore in making the definition of 'approaching a limit' we have actually introduced nothing new.
A standard variant on the above occurs when we are dealing with a sequence of points $x_1, x_2, x_3, \ldots$ . In this case we say

$$x_n \to c \quad \text{as} \quad n \to \infty \qquad \text{or} \qquad \lim_{n \to \infty} x_n = c$$

to mean that given any neighbourhood $N$ of $c$ there then exists some positive integer $m$ such that $x_n \in N$ for all $n > m$. For metric spaces (in particular euclidean space) this coincides with the usual idea of a sequence approaching a limit: given any $\varepsilon > 0$ there exists some positive integer $m$ such that $d(x_n, c)$ is less than $\varepsilon$ for all $n > m$. Here, and in the general topological setting, we say that the sequence is convergent, or converges to the limit $c$.

In the special case of real or complex-valued functions or sequences the definitions can be stretched a little to include the notion of 'approaching infinity'. We replace the neighbourhood $N$ of $c$ by a set of the form $\{y \in R \mid y > r\}$ for some $r$ when defining $f(x) \to +\infty$ (or $x_n \to +\infty$) in the real-valued case, or by a set of the form $\{y \in R \mid y < s\}$ for some $s$ when defining $f(x) \to -\infty$ (or $x_n \to -\infty$) also in the real case, or, finally, by a set of the form $\{z \in C \mid |z| > r\}$ to define $f(x) \to \infty$ (or $x_n \to \infty$) in the complex-valued case.
Remarks

1. The definitions for sequences can actually be subsumed under those for functions. The way to do this is to consider the sequence $x_1, x_2, x_3, \ldots$ of points in the space $S$ as the images of the integers $1, 2, 3, \ldots$ under the map $N \to S$ taking 1 to $x_1$, 2 to $x_2$, and so on. By adding to $N$ an extra point $p$ (thought of as 'infinity') and putting a suitable topology on $N \cup \{p\}$ it is possible to re-express $x_n \to c$ as $n \to \infty$ as a statement about the limit of this map as $n \to p$. We will not go into details, since they are rather uninstructive.

2. There is an irritating source of ambiguity which inevitably arises at this point. We have the notion of limit of a sequence and limit point of a set, but when we think of a sequence as a set of points these two notions may not coincide. For example, the sequence

$$-\tfrac{1}{2},\ +\tfrac{3}{4},\ -\tfrac{5}{6},\ +\tfrac{7}{8},\ \ldots \qquad \text{i.e.} \qquad x_n = (-1)^n\,\frac{2n-1}{2n}$$

has no limit, although the points $\pm 1$ are both limit points of the set $\{x_n \mid n \in N\}$. On the other hand, the sequence $1, 1, 1, 1, \ldots$ has the limit 1 but as a set it has no limit points since it consists of only one point. Therefore care must be taken to use these terms with precision when sequences are being treated from a topological point of view.

3. When $S$ is a topological space which also carries some notion of addition (for example a normed linear space: see §2.3), then besides the concept of convergent sequence we also have that of convergent series. A series means simply a sequence of elements of $S$ with '+' signs between them:

$$x_1 + x_2 + x_3 + \ldots$$

We say the series is convergent or otherwise according to whether the sequence $s_1, s_2, s_3, \ldots$ is convergent or otherwise, where the $s_n$ are the so-called partial sums

$$s_n = \sum_{i=1}^{n} x_i .$$

We write $\sum_{i=1}^{\infty} x_i$ to denote $\lim_{n \to \infty} s_n$. We emphasize, however, that this only makes sense to the extent that '+' makes sense in $S$.

4. Any metric space $S$ (such as $R^n$) has the elementary property that given distinct points $x, y \in S$ there exist neighbourhoods $U, V$ of $x, y$ respectively such that $U$ and $V$ are disjoint. This may seem obvious, but must be verified from the definitions. In fact we may take $U = B_{d/2}(x)$ and $V = B_{d/2}(y)$ where $d = d(x,y)$: then the non-intersection of $U, V$ follows from the triangle inequality. Unfortunately topological spaces failing to have this 'separation' property can arise from quite natural considerations (e.g. in quotient spaces: see page 16, and §4.1). Spaces which do have the property are called Hausdorff spaces (F. Hausdorff, 1868-1942). Thus all metric spaces are Hausdorff.
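The distinction drawn in Remark 2 between the limit of a sequence and the limit points of its set of terms can be seen numerically. The following is a minimal sketch using exact rational arithmetic; the function names `x` and `distance_to` are hypothetical helpers introduced only for this illustration.

```python
from fractions import Fraction


def x(n):
    """n-th term of the sequence x_n = (-1)^n (2n - 1)/(2n), n = 1, 2, 3, ..."""
    return Fraction((-1) ** n * (2 * n - 1), 2 * n)


def distance_to(point, n):
    """Distance |x_n - point| on the real line."""
    return abs(x(n) - point)


# Even-indexed terms crowd towards +1, odd-indexed terms towards -1,
# so both +1 and -1 are limit points of the set of terms, yet the
# sequence as a whole settles on neither.
for n in (1, 2, 3, 4, 1000, 1001):
    print(n, x(n), float(distance_to(1, n)), float(distance_to(-1, n)))
```

Every neighbourhood of $+1$ contains terms with large even index and every neighbourhood of $-1$ contains terms with large odd index, which is why no single $c$ can satisfy the definition of $x_n \to c$.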
1.5 HOMEOMORPHISM OF SPACES AND EQUIVALENCE OF MAPS
Let $S, T$ be two topological spaces, and suppose $f : S \to T$ is a bijection*. If $f$ is continuous, and at the same time its inverse $f^{-1} : T \to S$ is continuous, then $f$ is called a homeomorphism. From the definition of continuity, this means that the open sets in $S$ correspond precisely to the open sets in $T$, under $f$. Thus if there exists a homeomorphism $f : S \to T$ then as far as their topological structure is concerned $S$ and $T$ are indistinguishable. We say that $S$ and $T$ are topologically equivalent or, more usually, homeomorphic. It follows that any property possessed by $S$ and expressible purely in terms of the topology of $S$ (such as the property of being Hausdorff, for example) will also be possessed by $T$, and vice versa.

In many useful contexts the fact that a bijection $f$ is continuous guarantees that its inverse $f^{-1}$ is continuous: see the following section for a particular theorem on these lines. This means that it is then not necessary to study $f^{-1}$ explicitly in order to show that $f$ is a homeomorphism, which is very useful since $f^{-1}$ may in practice be difficult to analyze. Unfortunately this result does not hold in all generality. It may fail, for example, in certain function spaces. Artificial counterexamples are also easily manufactured. Let $S = R$ with the discrete topology and $T = R$ with the usual topology, and let $f : S \to T$ be any bijection: then $f$ is continuous (every map with domain $S$ is continuous!) but $f^{-1}$ is not since there exist open sets $U$ in $S$ with $fU$ not open in $T$.

* see page 271.
Homeomorphism is the natural notion of equivalence between two topological spaces. We next consider how to define a notion of equivalence between two maps $f, g : S \to T$ where the two topological spaces $S, T$ are given. Thinking informally of a homeomorphism from a space to itself as the result of viewing the space through distorting spectacles, we can regard two maps $f, g : S \to T$ as being equivalent if $g$ looks like $f$ when $S$ and $T$ are both viewed through distorting spectacles - but with separate lenses. Formally this is expressed by saying that $f, g$ are topologically equivalent or conjugate if there exist homeomorphisms

$$h : S \to S, \qquad k : T \to T$$

such that

$$g \circ h = k \circ f ,$$

i.e. the result of applying $g$ to $S$, after distortion by $h$, is the same as the result of applying $f$ to $S$ and then distorting the result by $k$. Note that this is the same as

$$g = k \circ f \circ h^{-1} ,$$

and can be expressed simply by saying that the following diagram

$$\begin{array}{ccc} S & \xrightarrow{\ f\ } & T \\ {\scriptstyle h}\downarrow & & \downarrow{\scriptstyle k} \\ S & \xrightarrow{\ g\ } & T \end{array}$$

commutes, i.e. either pathway from the top left hand corner to the bottom right hand corner gives the same result.

Observe that since $h, k$ have inverses the relation of topological equivalence is an equivalence relation on the set of continuous maps $S \to T$. In cases when $S = T$ we often insist that $k$ should be the same as $h$, since we wish to avoid the idea of distorting $S$ in two different ways simultaneously. The expression for topological equivalence then becomes

$$g = h \circ f \circ h^{-1} .$$

We shall see more of this later, when studying the qualitative theory of dynamical systems (§§4.2, 4.5).

1.6 COMPACTNESS
There is an extremely useful theorem in analysis as follows:

THEOREM A  If $f$ is a continuous real-valued function on a closed interval $[a,b]$ then $f$ is bounded, i.e. there exists a constant $k$ such that $|f(x)| \leq k$ for all $x$ in $[a,b]$.

Note that the assumption that $[a,b]$ is a closed interval is vital. For example, $\tan x$ is defined and continuous on the non-closed interval $[0, \frac{\pi}{2})$ but is certainly not bounded.
Here are two standard proofs of the theorem.
Proof 1  Suppose $f$ is not bounded: then we deduce a contradiction. There must exist $x_1 \in [a,b]$ with $|f(x_1)| > 1$ (otherwise $f$ is bounded by 1). Then there exists $x_2 \neq x_1$ with $|f(x_2)| > 2$ (otherwise $f$ is bounded by the greater of 2 and $|f(x_1)|$). Continuing, we obtain distinct points $x_1, x_2, x_3, \ldots$ with $|f(x_n)| > n$ for each positive integer $n$. Now we invoke the 'classical' Weierstrass limit point theorem (or condensation theorem):

W  If $A$ is a subset of a closed interval and $A$ contains an infinite number of points then there is at least one accumulation point of $A$ somewhere in the interval.

This immediately implies that there is in $[a,b]$ an accumulation point $x_*$ of $A = \{x_1, x_2, x_3, \ldots\}$. But by the continuity of $f$ at $x_*$ there is a $\delta > 0$ such that $|f(x)| < |f(x_*)| + 1$ for all $x$ in $B_\delta(x_*)$. Choose now $m > |f(x_*)| + 1$. Since $x_*$ is an accumulation point of $A$, a smaller neighbourhood of $x_*$ inside $B_\delta(x_*)$ which avoids $x_1, x_2, \ldots, x_{m-1}$ (apart possibly from $x_*$ itself) must still contain some $x_n \neq x_*$, necessarily with $n \geq m$, and then

$$|f(x_n)| > n \geq m > |f(x_*)| + 1$$

which gives the contradiction. (Note: we do not claim that $x_n \to x_*$.)
Proof 2  Choose $\varepsilon > 0$. By the continuity of $f$ we know that for each $x$ in $[a,b]$ there exists $\delta_x > 0$ such that $|f(y)|$ is less than $|f(x)| + \varepsilon$ provided $y$ is within $\delta_x$ of $x$. The whole of $[a,b]$ is covered by the open intervals $(x - \delta_x, x + \delta_x)$ as $x$ runs through $[a,b]$, or equivalently

$$[a,b] = \bigcup_{x \in [a,b]} (x - \delta_x,\ x + \delta_x)'$$

where the $'$ denotes that part of $(x - \delta_x, x + \delta_x)$ lying in $[a,b]$. (Note in passing that $(x - \delta_x, x + \delta_x)'$ is an open subset of $[a,b]$ in the induced topology of $[a,b]$ as a subspace of $R$.) This time we use the Heine-Borel theorem:

HB  Suppose $[a,b]$ is a closed interval in $R$ (with $a, b \neq \pm\infty$) and there is given a family $\{J_\lambda\}$ of open intervals covering $[a,b]$, i.e. $[a,b] \subset \bigcup_\lambda J_\lambda$. Then there is a finite sub-family $J_{\lambda_1}, J_{\lambda_2}, \ldots, J_{\lambda_m}$ which also covers $[a,b]$, i.e. $[a,b] \subset J_{\lambda_1} \cup J_{\lambda_2} \cup \ldots \cup J_{\lambda_m}$.

Applying this to the family $\{J_x\}$ where $J_x = (x - \delta_x, x + \delta_x)$ we deduce that there exist $x_1, x_2, \ldots, x_m$ in $[a,b]$ with

$$[a,b] \subset J_{x_1} \cup J_{x_2} \cup \ldots \cup J_{x_m} .$$

Then clearly $|f(x)|$ is everywhere less than

$$\max\{|f(x_1)|, |f(x_2)|, \ldots, |f(x_m)|\} + \varepsilon$$

which shows that $f$ is bounded.

It is straightforward to follow through the above two proofs and check that they work for functions defined on any topological space $S$ which satisfies an analogue of W or HB respectively, where the closed interval $[a,b]$ is replaced by $S$ and open interval is replaced by open set of $S$ in HB. In fact topological spaces that satisfy this analogue of the Heine-Borel theorem play such a fundamental role in analysis that they are given an explicit name: compact spaces.
DEFINITION  A topological space $S$ is compact if and only if every cover of $S$ by open sets has a finite subcover.
Sometimes it is convenient to talk about a compact subset $T$ of a space $S$. This simply means that $T$ is compact when regarded as a topological space in its own right (with the induced topology).

In the present more general context it is easy to deduce W for compact spaces as a consequence of HB. The argument goes as follows. Suppose $A$ is a subset of the compact topological space $S$ and $A$ has an infinite number of points. We wish to prove $A$ has an accumulation point. If not, then every point of $S$ has an open neighbourhood containing at most one point of $A$. But then the compactness of $S$ implies that a finite number of these open neighbourhoods will suffice to cover $S$, which means that $A$ can have only a finite number of points. This contradiction shows that $A$ must after all have an accumulation point.

Returning to the starting-point of this section, by mimicking either Proof 1 (using the above generalized version of W) or Proof 2 directly we can construct a proof of a more general version of the theorem on boundedness of continuous functions.
THEOREM A'  If $S$ is a compact topological space and $f : S \to R$ is continuous then $f$ is bounded, i.e. there exists a constant $k$ such that $|f(x)| \leq k$ for all $x \in S$.
Remarks

1. Unfortunately it is not possible to deduce compactness from the Weierstrass property in the case of an otherwise arbitrary topological space. Thus the generalized versions of W and HB are not entirely equivalent. In the case of metric spaces, however, they can be shown to be equivalent. In particular they are equivalent in their original context, i.e. for closed intervals $[a,b]$ of the real line.

2. Theorem A' has an immediate corollary which illustrates one of the main ways in which the theorem might be used in practice, namely that if $S$ is compact and $f : S \to R$ is a continuous non-vanishing function then $f$ is bounded away from zero, i.e. there exists $k > 0$ such that $|f(x)| \geq k$ for all $x \in S$. The proof of the corollary comes from applying Theorem A' to $1/f$. Clearly this need not be true if $S$ is not compact: take $S = (0,1)$ and $f(x) = x$, for example. In applications $f$ may typically be a distance function from points of $S$ to some other point or set.
3. There is an alternative characterization of compactness as follows. A topological space $S$ is compact if and only if, given any family of closed subsets $\{C_\lambda\}$ such that any finite number of the $C_\lambda$ have non-empty intersection, then the intersection of them all is non-empty. At first sight this seems unrelated to the former definition, but in fact it is easy to translate one version into the other simply by considering the open complements $U_\lambda$ of the closed sets $C_\lambda$. Thus for example the statement $S = \bigcup_\lambda U_\lambda$ becomes $\emptyset = \bigcap_\lambda C_\lambda$, and vice versa, and so on. We leave this verification as an exercise. This characterization is frequently used to prove the existence of a point defined by some kind of infinite intersection process. As an illustration, observe that if $I_n$ is the closed interval $[0, \frac{1}{n}]$ then the intersection of any finite number of $I_n$'s is non-empty and in fact $\bigcap_{n=1}^{\infty} I_n = \{0\}$ which is non-empty. On the other hand if $J_n = (0, \frac{1}{n}]$ then again the intersection of any finite number of $J_n$'s is non-empty but now $\bigcap_{n=1}^{\infty} J_n = \emptyset$. Here $J_1 = (0,1]$ is not compact (see below).
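The contrast between the families $I_n = [0, \frac{1}{n}]$ and $J_n = (0, \frac{1}{n}]$ can be made concrete with exact arithmetic. The sketch below is an illustration only; the function names are hypothetical, and `excluding_index` simply uses the observation that any integer $n > 1/x$ gives $\frac{1}{n} < x$.

```python
from fractions import Fraction


def in_I(x, n):
    """Membership of x in the closed interval I_n = [0, 1/n]."""
    return 0 <= x <= Fraction(1, n)


def in_J(x, n):
    """Membership of x in the half-open interval J_n = (0, 1/n]."""
    return 0 < x <= Fraction(1, n)


def excluding_index(x):
    """For x > 0, return an integer n > 1/x, so that x is not in J_n."""
    return int(1 / Fraction(x)) + 1


# Any finite sub-family J_1, ..., J_N has the common point 1/N ...
N = 50
print(all(in_J(Fraction(1, N), n) for n in range(1, N + 1)))
# ... but each candidate point x > 0 is eventually excluded,
# while 0, the only point common to all I_n, is not in any J_n.
x = Fraction(3, 7)
print(excluding_index(x), in_J(x, excluding_index(x)), in_J(0, 1))
```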
We will now give a working theorem whose geometrical usefulness justifies the rather technical definition of compactness. First we state three facts about compact sets, each of which is straightforward to verify from the definitions:

(1) Every closed subset of a compact space is compact.
(2) If $f : S \to T$ is continuous and $K \subset S$ is a compact subset, then $f(K)$ is a compact subset of $T$.
(3) In a Hausdorff space every compact set is closed.
Putting these together we obtain

THEOREM B  Let $S$ and $T$ be topological spaces with $S$ compact and $T$ Hausdorff. Suppose $f : S \to T$ is a continuous map which is also a bijection. Then $f$ is automatically a homeomorphism (i.e. $f^{-1}$ is automatically continuous).

Proof  To prove $f^{-1}$ continuous we have to show $(f^{-1})^{-1}(V)$ is open whenever $V$ is open, that is, $V$ open implies $f(V)$ open. This is the same as showing $C$ closed implies $f(C)$ closed (since $f$ is a bijection). But $C$ closed implies $C$ compact (Fact (1)) which implies $f(C)$ compact (Fact (2)) which in turn implies $f(C)$ closed (Fact (3)).
Thus for example to prove that the space of real numbers modulo the integers (i.e. the quotient space $R/\mathcal{R}$ where $x \mathcal{R} y$ means $x - y \in Z$) is formally homeomorphic to a circle it suffices to construct a continuous bijection $f$ from $R/\mathcal{R}$ to the unit circle $S^1$ in the complex plane, since $R/\mathcal{R}$ is compact (being the image of $[0,1]$ under the projection $R \to R/\mathcal{R}$) and $S^1$ is Hausdorff. It is easy to verify that

$$f : (\text{equivalence class of } \theta) \mapsto e^{2\pi i \theta}$$

is a bijection, and is continuous by definition of the quotient topology in $R/\mathcal{R}$ and the continuity of the map $R \to S^1 : \theta \mapsto e^{2\pi i \theta}$. We have no need to bother with the technicalities of working with $f^{-1}$.
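As a numerical sanity check (not a proof), one can confirm that the map $\theta \mapsto e^{2\pi i\theta}$ takes the same value on numbers differing by an integer, so that it is well defined on equivalence classes, and that it lands on the unit circle. The function name `f` below mirrors the text; the sample values are arbitrary choices made for illustration.

```python
import cmath
import math


def f(theta):
    """Send (the class of) theta to the point e^{2 pi i theta} on the unit circle."""
    return cmath.exp(2j * math.pi * theta)


# Representatives of the same class (differing by an integer) have the same image,
# up to floating-point rounding.
print(abs(f(0.3) - f(0.3 + 5)))
# Every image has modulus 1, i.e. lies on the unit circle.
print(abs(f(0.77)))
# Classes that differ are sent to different points: 0.1 and 0.6 go to antipodal points.
print(abs(f(0.1) - f(0.6)))
```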
Recognizing compact sets. In all the above theory we have not discussed how to recognize a compact set when we see one. After all, it is clearly impossible to check the properties of every open cover. Fortunately, in euclidean space $R^n$, the most useful case, there is an immediate test:

THEOREM C  A subset $K$ of $R^n$ is compact precisely when it is both (a) closed and (b) bounded (i.e. there exists some number $k$ with $K \subset B_k(0)$).

The compactness of $K$ immediately implies (a) (Fact (3)) and (b) (cover $K$ by $K \cap B_k(0)$, $k = 1, 2, 3, \ldots$). However, the converse is more delicate, being a generalized Heine-Borel theorem. We will not give the proof here. Note that once we have Theorem C then the original Theorem A follows directly from Fact (2), since to say that $f$ is bounded on $[a,b]$ is the same as saying $f([a,b])$ is a bounded subset of $R$.

1.7 CONNECTEDNESS
A topological space is connected if, roughly speaking, it is all in one piece. Of course, any space with more than one point in it can be written as the union of two disjoint non-empty subsets, but it turns out that the definition which best captures our intuitive notion of being 'in one piece' is the following:

DEFINITION  A topological space $S$ is connected if it cannot be split into two disjoint non-empty subsets which are both open, i.e. if $S = U \cup V$ where $U, V$ are open and disjoint then one of them must be empty.

Thus in a connected space the only subsets which are both open and closed simultaneously are the empty set and the whole space. It follows that any space with the discrete topology cannot be connected if it has more than one point.

A subset $Q$ of $R^n$, when given the induced topology, will fail to be connected precisely when it can be written as $Q = Q_1 \cup Q_2$ where $Q_1, Q_2$ are non-empty, disjoint, and $Q_i = Q \cap U_i$ for some open set $U_i$ of $R^n$ - in other words when each $Q_i$ can be included in an open set $U_i$ of $R^n$ which does not meet the other $Q_j$. This formalizes the notion of being 'separate pieces'. For example, the union $Q$ of the two closed intervals $Q_1 = [0,1]$ and $Q_2 = [2,3]$ is not connected (as a subset of $R$ with the induced topology), since we have $Q_1 = Q \cap (-2,2)$ and $Q_2 = Q \cap (1,4)$.
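For a finite space the definition can be tested exhaustively: the space is connected exactly when no open set other than the empty set and the whole space has an open complement. The sketch below assumes the topology is handed over as an explicit list of open sets; both example topologies are hypothetical and chosen only to illustrate the two outcomes.

```python
def clopen_subsets(points, open_sets):
    """Return the open sets whose complements are also open."""
    points = frozenset(points)
    opens = {frozenset(u) for u in open_sets}
    return [u for u in opens if (points - u) in opens]


def is_connected(points, open_sets):
    """True when the only open-and-closed subsets are the empty set and the whole space."""
    points = frozenset(points)
    proper = [u for u in clopen_subsets(points, open_sets)
              if u and u != points]
    return not proper


# Discrete topology on two points: {0} and {1} split the space.
print(is_connected({0, 1}, [set(), {0}, {1}, {0, 1}]))
# A three-point space in which the point 1 'glues' the pieces together.
print(is_connected({0, 1, 2}, [set(), {0}, {2}, {0, 2}, {0, 1, 2}]))
```

The first call illustrates the remark that a discrete space with more than one point cannot be connected.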
This still leaves the problem of deciding when a given topological space is connected, since it is impossible to check all possible attempts at splitting it into disjoint open subsets. The two most useful results in this direction are:

(1) If $S$ is connected and $f : S \to T$ is continuous then $f(S)$ (with the topology induced from $T$) is connected;
(2) every open, half-open or closed interval (possibly infinite) of the real line is connected.

Sketch proofs
(1) If $f(S) = U \cup V$ (both open and disjoint in $f(S)$) then $S = f^{-1}(U) \cup f^{-1}(V)$, both open in $S$ by the continuity of $f$ and clearly disjoint.
(2) If an interval $I$ is not connected we have $I = U \cup V$, both open in $I$, disjoint and non-empty. Choose points $a$ in $U$, $b$ in $V$, suppose $a < b$, and let $m$ = least upper bound of the set $X$ of all numbers $x$ for which $[a,x) \cap I$ is contained in $U$. Contemplation of $m$ shows $m$ cannot lie in $U$ (since it would not be an upper bound for $X$) or in $V$ (since it would not be the least upper bound for $X$). Contradiction, so $I$ is connected.
As with compactness, we extend the definition to subsets of a topological space by regarding them as spaces in their own right with the induced topology.

Apart from its direct geometrical meaning, connectedness plays an important indirect role in the proofs of many theorems. Suppose we wish to prove that a certain property $P$ holds for all $x$ in some topological space $S$. A frequently useful approach is to prove first of all that the set $T$ of points $x$ for which $P$ does hold is an open subset of $S$, and then to prove by other means that the set $F$ of points for which $P$ fails is also an open subset of $S$. If $S$ is connected, then the existence of just one point $x$ for which $P$ holds means that $T$ is non-empty, so $F$ must be empty and therefore the validity of $P$ is guaranteed for all $x$ in $S$.

Remarks on the literature. There are many good introductory texts on metric and topological spaces, such as Mendelson [80], Patterson [94], Pitts [99]. Particularly recommended also are Sutherland [134], which is written in much the same spirit as this chapter but with all details provided, and Simmons [ll|, which contains much additional material relevant to the notes as a whole. See Willard [150] for a deeper study of topological spaces, or Hocking and Young [57] for more geometric aspects.
2 Calculus
2.1 DIFFERENTIATION
Recall from elementary calculus the definition of differentiability at a point $x$ for a real-valued function of one real variable.

DEFINITION  The function $f : R \to R$ is differentiable at $x$ if

$$\lim_{h \to 0} \frac{f(x+h) - f(x)}{h}$$

exists ($\pm\infty$ not admitted).

This limit is called the derivative of $f$ at $x$, and is traditionally denoted by $f'(x)$ or $\frac{df}{dx}$. Strictly speaking we should write $\frac{df}{dx}(x)$ to emphasize the point $x$ for which the limit has been evaluated.

Another way of expressing the existence of the limit is to say: there exists a real number $\lambda$ such that

$$f(x+h) - f(x) = \lambda h + \varepsilon |h|$$

where $\varepsilon \to 0$ as $h \to 0$. In general $\lambda$ of course depends on $x$. The pictorial interpretation is that the graphs of the two functions $F, L$ of $h$ defined by

$$F(h) = f(x+h) - f(x), \qquad L(h) = \lambda h$$

are tangent at $h = 0$: see Figure 8.

It is reasonable to describe this by saying that $L$ is a linear approximation to $f$ at $x$, or, equivalently, that the map $h \mapsto f(x) + \lambda h$ (whose graph is a straight line) is a first order approximation to the map $h \mapsto f(x+h)$.

Since this idea turns out to be so useful in the study of real functions of a real variable, it is worthwhile trying to generalize it as much as possible. Obviously we would get nowhere by considering functions (maps) between arbitrary topological spaces, since the minimum necessary ingredients are:
1. concepts of addition and subtraction,
2. the idea of a map being linear,
3. a possibility of taking limits.

Now the first two of these are available in the world of linear spaces or vector spaces, while the third is most conveniently associated with metric spaces (§1.2). Therefore to generalize our simple notion of differentiation we need to develop a theory of linear spaces which are at the same time metric spaces. This will occupy us for the next two sections of the chapter.
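The reformulation $f(x+h) - f(x) = \lambda h + \varepsilon|h|$ with $\varepsilon \to 0$ from §2.1 lends itself to a quick numerical experiment. The sketch below is illustrative only: the choice $f = \sin$, the point $x = 1$ and the helper name `epsilon` are assumptions made for this example. It solves the relation for $\varepsilon$ and prints it for shrinking $h$.

```python
import math


def epsilon(f, x, lam, h):
    """Return the epsilon defined by f(x+h) - f(x) = lam*h + epsilon*|h| (h != 0)."""
    return (f(x + h) - f(x) - lam * h) / abs(h)


x0 = 1.0
lam = math.cos(x0)  # the derivative of sin at x0

# With the correct lambda the error term shrinks with h ...
for h in (1e-1, 1e-2, 1e-3, -1e-3):
    print(h, epsilon(math.sin, x0, lam, h))

# ... whereas a wrong candidate (here lambda = 0) leaves an error that does not shrink.
print(epsilon(math.sin, x0, 0.0, 1e-3))
```

Only one value of $\lambda$ can make $\varepsilon$ tend to zero, which is why the derivative is well defined when it exists.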
2.2 LINEAR SPACES AND LINEAR MAPS
A real linear space or vector space is a set $V$ together with two operations called 'addition' and 'multiplication by scalars (i.e. real numbers)' which obey a number of reasonable rules. The standard rules for addition are these:

(i) $x + (y+z) = (x+y) + z$ for all $x, y, z$ in $V$
(ii) $x + y = y + x$ for all $x, y$ in $V$
(iii) there is a zero in $V$, i.e. an element $0$ which satisfies $0 + x = x + 0 = x$ for every $x$ in $V$
(iv) every $x$ in $V$ has a negative, i.e. an element $-x$ in $V$ which satisfies $x + (-x) = 0$,

followed by rules involving scalar multiplication as well as addition:

(v) $\alpha(x+y) = \alpha x + \alpha y$ for all $x, y$ in $V$ and all scalars $\alpha$
(vi) $(\alpha+\beta)x = \alpha x + \beta x$ for all $x$ in $V$ and all scalars $\alpha, \beta$
(vii) $\alpha(\beta x) = (\alpha\beta)x$ for all $x$ in $V$ and all scalars $\alpha, \beta$
(viii) $1x = x$ for all $x$ in $V$.
Remarks

1. It can be deduced from these rules that other obvious-looking arithmetical statements also hold, such as $0x = 0$ or $(-1)x = -x$. Also, it is usual to write $x - y$ instead of $x + (-y)$.

2. The same definition could equally well be made with complex numbers rather than real numbers as scalars, giving a complex linear space.

3. When thinking geometrically we call $0$ the origin of $V$.

4. Nothing has been said about any topology for $V$.
EXAMPLES of linear spaces

1. $V = R$; usual addition and multiplication.

2. $V = R^n$; co-ordinate-wise addition and scalar multiplication.

3. $V$ = {all functions $R \to R$}; addition and scalar multiplication defined pointwise, i.e. $f + g$ means the function $x \mapsto f(x) + g(x)$, and $\alpha f$ means the function $x \mapsto \alpha f(x)$.

[...]

... on the left hand side denotes addition in $V$, whereas on the right hand side it denotes addition in $W$. These may be completely different in the way they are defined. A similar remark applies to the scalar multiplications on both sides. Observe that a linear map always takes zero (in $V$) to zero (in $W$).
EXAMPLES of linear maps

1. $V$ = any linear space; $L : V \to V$ defined by $L(v) = \alpha v$ for some fixed scalar $\alpha$.

2. $V = R^n$, $W = R^m$; $L : V \to W$ defined by

$$L(x_1, x_2, \ldots, x_n) = \Bigl(\sum_{j=1}^{n} \alpha_{1j} x_j,\ \sum_{j=1}^{n} \alpha_{2j} x_j,\ \ldots,\ \sum_{j=1}^{n} \alpha_{mj} x_j\Bigr)$$

for a collection of $mn$ fixed scalars $\alpha_{ij}$ ($1 \leq i \leq m$, $1 \leq j \leq n$). Here $L$ is traditionally represented by the matrix

$$\begin{pmatrix} \alpha_{11} & \alpha_{12} & \cdots & \alpha_{1n} \\ \alpha_{21} & \alpha_{22} & \cdots & \alpha_{2n} \\ \vdots & \vdots & & \vdots \\ \alpha_{m1} & \alpha_{m2} & \cdots & \alpha_{mn} \end{pmatrix} .$$

3. $V = S$ in Example 7(b) above, $W = V$ in Example 3 above; $L : V \to W$ defined by $L(f) = f'$.

4. Linear maps from $V$ to $W$ can be added together and multiplied by scalars pointwise to give further linear maps. Indeed, it is easy to check that the set $\mathrm{Lin}(V,W)$ of linear maps $V \to W$ itself forms a linear space, in fact a linear subspace of the linear space of all maps $V \to W$.

This construction is so important in what follows that we will isolate it for convenient reference:

If $V$ and $W$ are linear spaces, then the set of all linear maps $L : V \to W$ forms a linear space, denoted by $\mathrm{Lin}(V,W)$.
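Example 2 of the linear maps above can be tried out directly. The following sketch represents a map $R^n \to R^m$ by a list of rows and checks linearity on sample vectors; the matrix `A`, the sample vectors and the helper names are hypothetical choices made for illustration, and integer entries are used so that the comparisons are exact.

```python
def apply(matrix, x):
    """Compute L(x), whose i-th entry is sum_j a_ij * x_j, for a matrix given as a list of rows."""
    return tuple(sum(a * xj for a, xj in zip(row, x)) for row in matrix)


def add(u, v):
    """Co-ordinate-wise addition of two tuples of equal length."""
    return tuple(a + b for a, b in zip(u, v))


def scale(c, u):
    """Scalar multiplication of a tuple."""
    return tuple(c * a for a in u)


A = [[1, 2, 0],
     [0, -1, 3]]          # a hypothetical 2 x 3 matrix, i.e. a map R^3 -> R^2
x, y, c = (1, 2, 3), (4, 0, -1), 5

# Linearity: L(c*x + y) equals c*L(x) + L(y).
print(apply(A, add(scale(c, x), y)) == add(scale(c, apply(A, x)), apply(A, y)))
# A linear map takes zero to zero.
print(apply(A, (0, 0, 0)))
```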
Kernel and image  For a linear map $L : V \to W$ let the kernel of $L$ denote the set of all elements $v$ in $V$ which are taken to the zero element in $W$. Linearity of $L$ implies that the kernel $\ker(L)$ is a linear subspace of $V$, and that $L$ is injective precisely when $\ker(L) = \{0\}$. Thus the 'size' of $\ker(L)$ (for example, its dimension: see below) measures the failure of injectivity of $L$. The image $\mathrm{im}(L)$ of $L$ is a linear subspace of $W$, and by definition $L$ is surjective precisely when $\mathrm{im}(L)$ is all of $W$.

If $L : V \to W$ is a linear map and is at the same time a bijection then it is a simple exercise to check that its inverse $L^{-1} : W \to V$ must also be a linear map. (Compare this with the analogous situation for continuous maps between topological spaces; recall §1.5 and Theorem B of §1.6.) A bijective linear map is called a linear isomorphism, or just isomorphism if the linear context is understood. If $L : V \to W$ is an isomorphism the linear structure of $V$ corresponds precisely to that of $W$, via $L$. Thus if there exists an isomorphism between two given linear spaces they are indistinguishable as linear spaces: they are said to be isomorphic.
EXAMPLE  Suppose $S, T$ are two linear subspaces of $V$ such that $S \cap T = \{0\}$ and $S + T = V$. Then it turns out that every $x$ in $V$ can be written in a unique way as $x = s + t$ where $s \in S$, $t \in T$. We can regard $s, t$ as 'co-ordinates' for $x$, and it is routine to check that $V$ is isomorphic to the cartesian product $S \times T$. We write $V = S \oplus T$, the direct sum of $S$ and $T$.
Dimension  In euclidean space geometrical problems can be converted into algebraic problems by taking co-ordinates, sometimes already given and sometimes constructed artificially with the solution of a particular problem in mind. This is such a useful technique that it is worth exploiting as much as possible. It leads to the concepts of basis and dimension in general linear space theory.

A linear space $V$ is said to be finite-dimensional if it contains a finite collection $\{e_1, e_2, \ldots, e_m\}$ of elements with the property that each $x$ in $V$ can be written in the form

$$x = \alpha_1 e_1 + \alpha_2 e_2 + \ldots + \alpha_m e_m$$

for some unique choice of scalars $\alpha_1, \alpha_2, \ldots, \alpha_m$ (which of course depends on $x$). The collection $\{e_1, e_2, \ldots, e_m\}$ is called a basis for $V$, and the number $m$ is called the dimension of $V$. It can be shown to be independent of the actual basis considered.

A linear space which is not finite-dimensional is said to be infinite-dimensional. Note that this says nothing about the existence of any 'infinite basis' for $V$, whatever that may mean.
EXAMPLES

1. $V = R^n$ has a basis $\{e_1, \ldots, e_n\}$ where $e_i$ is the vector $(0, \ldots, 0, 1, 0, \ldots, 0)$ with 1 in the $i$th place and 0 elsewhere. This is called the standard basis. Hence the dimension of $R^n$ is $n$ (as we would hope!).

2. Let $V$ be the linear space consisting of all maps $A \to W$ where $W$ is some given linear space, $A$ is an arbitrary set, and addition and scalar multiplication in $V$ are defined pointwise. Then $V$ is finite-dimensional precisely when $W$ is finite-dimensional and $A$ is a finite set. For example, when $W = R$ and $A = \{1, 2, \ldots, n\}$ an element of $V$ can be thought of as an n-tuple $(x_1, \ldots, x_n)$ with each $x_i$ in $R$, and $V$ is isomorphic to $R^n$. If $A = N = \{1, 2, 3, \ldots\}$ then an element of $V$ is an infinite sequence $(x_1, x_2, x_3, \ldots)$ of elements of $R$, and $V$ is infinite-dimensional.

3. Any linear space isomorphic to a finite-dimensional space is itself finite-dimensional, since the isomorphism transports a basis in one space to a basis in the other. Thus any infinite-dimensional space cannot be isomorphic to a finite-dimensional one. On the other hand, any two finite-dimensional spaces of the same dimension are isomorphic. From this and Example 1 it follows that every finite-dimensional (real) linear space of dimension $n$ is isomorphic to $R^n$ - although in practice the construction of an isomorphism may be rather artificial.
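The uniqueness of the scalars in the definition of a basis can be illustrated in $R^2$ with a basis other than the standard one. This is a minimal sketch assuming integer or rational co-ordinates; the function name `coordinates` and the sample basis are hypothetical, and Cramer's rule is used only because it is the shortest exact method in two dimensions.

```python
from fractions import Fraction


def coordinates(e1, e2, x):
    """Solve x = a1*e1 + a2*e2 in R^2 by Cramer's rule and return (a1, a2)."""
    det = e1[0] * e2[1] - e1[1] * e2[0]
    if det == 0:
        # e1 and e2 are parallel, so the representation is not unique (or does not exist).
        raise ValueError("e1, e2 do not form a basis of R^2")
    a1 = Fraction(x[0] * e2[1] - x[1] * e2[0], det)
    a2 = Fraction(e1[0] * x[1] - e1[1] * x[0], det)
    return a1, a2


# With the basis e1 = (1, 1), e2 = (1, -1), the vector (3, 1) is 2*e1 + 1*e2.
print(coordinates((1, 1), (1, -1), (3, 1)))
# With the standard basis the co-ordinates are the entries themselves.
print(coordinates((1, 0), (0, 1), (5, -7)))
```

The map sending $x$ to its pair of co-ordinates is the isomorphism with $R^2$ referred to in Example 3.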
After this excursion into the theory of linear spaces and linear maps we now turn again to the task of generalizing the definition of differentiability. If $V, W$ are linear spaces and $f : V \to W$ is any map (not necessarily linear) then given $x$ and $h$ in $V$ we can certainly define

$$F(h) = f(x+h) - f(x)$$

as an element of $W$. Now if we wish to discuss approximating $F$ by some suitable linear map $L$ we need to introduce some metric or at least topological structure. Although we could make a definition of differentiability at $x$ involving only topologies on $V$ and $W$, experience shows that in the first instance it is useful to ensure that $V$ and $W$ are each metric spaces (as well as linear spaces), and moreover that the metrics are in a certain sense compatible with the linear structures. Pursuing these ideas we arrive at the notion of a normed linear space.
2.3 NORMED LINEAR SPACES
Observe that in $R^n$ with the euclidean metric the following rules hold:

(i) $d(x+z, y+z) = d(x,y)$ for all $x, y, z$ in $R^n$ (i.e. translation does not alter relative distances).
(ii) $d(\alpha x, \alpha y) = |\alpha|\, d(x,y)$ for all $x, y$ in $R^n$ and all scalars $\alpha$ (i.e. scalar multiplication magnifies or contracts distances uniformly).

These are not consequences of the general definition of metric (§1.2), since they involve the linear structure of $R^n$ which need not exist in an arbitrary metric space.

Let us say (temporarily) that if $V$ is a linear space with some metric $d$ on it then $d$ is sensible if it satisfies (i) and (ii). If $d$ is a sensible metric then from (i) we immediately see that

$$d(x,y) = d(x-y, 0)$$

for every $x, y$ in $V$. Denoting $d(x,0)$ by $\|x\|$ (thought of as the length of the 'vector' $\overrightarrow{0x}$) we see that in view of (ii) above we have

(1) $\|\alpha x\| = |\alpha|\,\|x\|$ for every $x$ in $V$ and scalar $\alpha$,

and then since $d$ is a metric we also have

(2) $\|x\| \geq 0$ for all $x$ in $V$, and $\|x\| = 0$ precisely when $x$ is the zero element in $V$;
(3) $\|x+y\| \leq \|x\| + \|y\|$ for all $x, y$ in $V$.

Note that the given $d$ is recoverable from $\|\cdot\|$ by the formula $d(x,y) = \|x-y\|$.

A function $\|\cdot\| : V \to R$ satisfying (1), (2) and (3) is called a norm for $V$, and $V$ together with a particular norm is called a normed linear space. Any normed linear space is automatically a metric space, with metric $d$ defined by $d(x,y) = \|x-y\|$: it is straightforward to check that this $d$ does obey rules (1), (2), (3) of the definition of a metric.
EXAMPLES of normed linear spaces

1. $R^n$ with $\|x\| = \left(\sum_{i=1}^{n} |x_i|^2\right)^{1/2}$, i.e. the usual 'length' function for n-vectors. This gives the euclidean metric and so it is called the euclidean norm. In particular, when $n = 1$ we have $\|x\| = |x|$.

2. (a) $R^n$ with $\|x\| = \sum_{i=1}^{n} |x_i|$.
(b) $R^n$ with $\|x\| = \max\{|x_i| : 1 \le i \le n\}$.
In fact it can be proved that any norm on $R^n$ will give rise to the same topology on $R^n$ as does the euclidean norm. Of course the metric will in general be different.

3. The space of bounded functions $f : R \to R$, with $\|f\| = \sup\{|f(x)| : x \in R\}$. This is a linear subspace of the space of all functions $R \to R$ (Example 3, §2.2) with pointwise addition and scalar multiplication, and the metric is that of Example 4, §1.2.

4. More generally: the space of bounded maps $f : A \to W$, where $W$ is a normed linear space and $A$ is any set whatsoever. (Bounded means there exists a number $K$ such that $\|f(x)\|_W$ is less than $K$ for all $x$ in $A$; equivalently, $f(A) \subset B_K(0)$.) Again the obvious norm is $\|f\| = \sup\{\|f(x)\|_W : x \in A\}$.
5. The space of continuous functions $f : [0,1] \to R$ with $\|f\| = \sup\{|f(x)| : 0 \le x \le 1\}$. Note that this is a linear space with pointwise addition and scalar multiplication as usual, and the norm makes sense in view of Theorem A of §1.6. This space is a linear subspace of the special case of Example 4 above where $A = [0,1]$, and it inherits the norm and metric from this larger space.

6. The space of continuous functions $f : [0,1] \to R$ which are differentiable at each point in $(0,1)$ and whose derivative $f' : (0,1) \to R$ is a bounded function. Here we could take

$$\|f\|_0 = \sup\{|f(x)| : 0 \le x \le 1\}$$

or, if we wished to measure derivatives as well, we would take

$$\|f\|_1 = \sup\{|f(x)| : 0 \le x \le 1\} + \sup\{|f'(x)| : 0 < x < 1\}.$$

This is a linear subspace of the space in Example 5. The norm which it naturally inherits is $\|\cdot\|_0$, but the norm which it may be most useful to work with is $\|\cdot\|_1$. The metric corresponding to $\|\cdot\|_1$ is that of Example 5, §1.2.
7. The space of systems of differential equations

$$\dot{x}_1 = f_1(x_1, x_2), \qquad \dot{x}_2 = f_2(x_1, x_2) \qquad (F)$$

defined on the closure $\bar{U}$ of a bounded open subset $U$ of $R^2$ and such that $f_1$, $f_2$ are continuous, where the linear structure is defined by pointwise addition and scalar multiplication of maps $f = (f_1, f_2) : R^2 \to R^2$, and the norm is given by

$$\|F\|_U = \sup\{\|(f_1(x), f_2(x))\| : x = (x_1, x_2) \in \bar{U}\}.$$

Theorem C of §1.6 shows that $\bar{U}$ is compact, and then Theorem A' guarantees that $\|\cdot\|_U$ makes sense. It is routine to verify that $\|\cdot\|_U$ is a norm. The corresponding metric is like that of Example 6, §1.2, although there we supposed $f_1$, $f_2$ to be bounded on the whole of $R^2$.
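The finite-dimensional examples above can be tried out directly. The following is a minimal sketch in Python (the function names and the test vectors are illustrative choices, not part of the text): it implements the euclidean norm of Example 1 and the sum and max norms of Example 2 on $R^n$, checks rules (1), (2), (3) on sample vectors, and recovers the metric from each norm by $d(x,y) = \|x-y\|$.

```python
import math

def norm_euclid(x):
    """Euclidean norm: square root of the sum of squares."""
    return math.sqrt(sum(t * t for t in x))

def norm_sum(x):
    """Sum of the absolute values of the coordinates."""
    return sum(abs(t) for t in x)

def norm_max(x):
    """Largest absolute value of a coordinate."""
    return max(abs(t) for t in x)

def metric_from(norm):
    """Recover the metric d(x, y) = ||x - y|| from a norm."""
    return lambda x, y: norm([a - b for a, b in zip(x, y)])

x, y, a = [3.0, -4.0], [1.0, 2.0], -2.5
for norm in (norm_euclid, norm_sum, norm_max):
    ax = [a * t for t in x]
    xy = [s + t for s, t in zip(x, y)]
    # Rule (1): ||ax|| = |a| ||x||
    assert math.isclose(norm(ax), abs(a) * norm(x))
    # Rule (2): ||x|| >= 0, with value 0 at the zero vector
    assert norm(x) > 0 and norm([0.0, 0.0]) == 0.0
    # Rule (3): triangle inequality (small tolerance for rounding)
    assert norm(xy) <= norm(x) + norm(y) + 1e-12
    d = metric_from(norm)
    print(norm.__name__, norm(x), d(x, y))
```

The three norms give different lengths to the same vector, and hence different metrics, even though (as remarked in Example 2) they all induce the same topology on $R^n$.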
Now we have essentially everything we need to pursue the study of differentiability.
However,
it is worth going a little further at this
stage to provide some machinery that will be useful later on.
Linearity and continuity

Every linear map $L : R^n \to R^m$ (with the euclidean norms) is continuous. The proof is easy: let $\{e_1, e_2, \ldots, e_n\}$ be the standard basis for $R^n$ and observe that if $x$ and $y$ are in $R^n$ then

$$L(x) - L(y) = L(x-y) = L\left(\sum_{i=1}^{n} (x_i - y_i)e_i\right) = \sum_{i=1}^{n} (x_i - y_i)L(e_i)$$

by linearity. Hence

$$\|L(x) - L(y)\| \le \sum_{i=1}^{n} |x_i - y_i|\,\|L(e_i)\| \le \|x-y\|\,M \quad \text{where} \quad M = \sum_{i=1}^{n} \|L(e_i)\|.$$

Therefore $\|L(x) - L(y)\|$ is less than $\varepsilon$ whenever $\|x-y\|$ is less than $\delta = \varepsilon M^{-1}$.

In this proof we have used strongly the fact that the domain $R^n$ is finite-dimensional, but note that the proof would have worked with the codomain replaced by any normed linear space $F$.
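The estimate in this proof is easy to test numerically. The sketch below, in Python, uses a hypothetical $2 \times 3$ matrix as the linear map $L : R^3 \to R^2$ (the matrix entries, the sampling range and the helper names are assumptions made for illustration); it computes $M = \sum \|L(e_i)\|$ and checks $\|L(x) - L(y)\| \le M\|x-y\|$ on random pairs of points.

```python
import math
import random

def apply(matrix, x):
    """Apply the linear map with the given matrix (list of rows) to x."""
    return [sum(row[j] * x[j] for j in range(len(x))) for row in matrix]

def norm(v):
    """Euclidean norm."""
    return math.sqrt(sum(t * t for t in v))

# Hypothetical linear map L : R^3 -> R^2, chosen only for illustration.
L = [[1.0, -2.0, 0.5],
     [0.0, 3.0, 4.0]]
n = 3

# M = sum of ||L(e_i)|| over the standard basis, as in the proof.
basis = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
M = sum(norm(apply(L, e)) for e in basis)

random.seed(0)
worst = 0.0
for _ in range(1000):
    x = [random.uniform(-5, 5) for _ in range(n)]
    y = [random.uniform(-5, 5) for _ in range(n)]
    diff = [a - b for a, b in zip(x, y)]
    lhs = norm([a - b for a, b in zip(apply(L, x), apply(L, y))])
    worst = max(worst, lhs / norm(diff))
    assert lhs <= M * norm(diff) + 1e-9

print("M =", M, "largest observed ratio =", worst)
```

The observed ratios stay below $M$, which is in general a rather generous bound; the smallest constant that works is the norm $\|L\|$ of the map.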
In general a linear map between two normed linear spaces need not be continuous. Here is an example with, necessarily, an infinite-dimensional domain.
EXAMPLE Let $V$ be the space of all infinite sequences $a = (a_1, a_2, a_3, \ldots)$ with addition and scalar multiplication defined term-wise, and let $E$ be the linear subspace consisting of those with $\sum_{n=1}^{\infty} \frac{1}{n!}|a_n|$ finite. Define a norm for $E$ by

$$\|a\| = \sum_{n=1}^{\infty} \frac{1}{n!}|a_n|$$

and define a linear map $L : E \to E$ by

$$L(a_1, a_2, a_3, \ldots) = (a_2, a_3, \ldots).$$

Then $L$ is clearly linear, but is not continuous at $0$. To see this let $a^{(n)}$ denote the sequence with $(n-1)!$ in the $n$th place and zero elsewhere. Then $\|a^{(n)}\| = \frac{1}{n}$ but $\|L(a^{(n)})\| = 1$, and so no matter how small $\delta$ is there always exists some $a^{(n)}$ in $E$ with $\|a^{(n)}\|$ less than $\delta$ but $\|L(a^{(n)})\| = 1$.
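The two norms in this example can be computed exactly. The sketch below, in Python with exact rational arithmetic, assumes that the norm on $E$ is the factorially weighted sum $\sum |a_n|/n!$ and that $a^{(n)}$ has $(n-1)!$ in its $n$th place (this reading of the example is an assumption; the function names are illustrative). It shows $\|a^{(n)}\|$ shrinking like $1/n$ while the norm of the shifted sequence stays equal to 1.

```python
from fractions import Fraction
from math import factorial

def weighted_norm(a):
    """||a|| = sum over n >= 1 of |a_n| / n! for a finitely supported sequence.

    The sequence is stored as a list, a[0] being the first term a_1.
    """
    return sum(Fraction(abs(t), factorial(n)) for n, t in enumerate(a, start=1))

def shift(a):
    """L(a_1, a_2, a_3, ...) = (a_2, a_3, ...)."""
    return a[1:]

def spike(n):
    """The sequence a^(n): (n-1)! in the nth place and zero elsewhere."""
    return [0] * (n - 1) + [factorial(n - 1)]

for n in (2, 5, 10, 50):
    a = spike(n)
    print(n, weighted_norm(a), weighted_norm(shift(a)))
```

Since the inputs tend to $0$ in norm while their images do not, no constant $K$ as in statement (3) of the theorem below can exist for this map.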
The basic facts about continuity of linear maps are summarized in the following theorem. The details of the proof are straightforward, and can be found in any book (such as Simmons [118]) on functional analysis.
THEOREM

Let $L : E \to F$ be a linear map between normed linear spaces. Then the following statements are mutually equivalent:

(1) $L$ is continuous.

(2) $L$ is continuous at the origin in $E$.

(3) There exists a constant $K$ such that $\|L(x)\|_F$ is less than $K$ for every $x$ in $E$ with $\|x\|_E = 1$.

Statement (3) can be described by saying that however much $L$ distorts the 'unit sphere' $S_E$ in $E$, it nevertheless maps it inside some sphere of radius $K$ and centre the origin in $F$.

If $L : E \to F$ is a continuous linear map then there is a greatest lower bound for all the constants $K$ which satisfy (3). This is the radius of the smallest sphere centred at the origin in $F$ and containing $L(S_E)$, in other words

$$\sup\{\|L(x)\|_F : \|x\|_E = 1\}.$$

This number is denoted by $\|L\|$, for it can be shown without much trouble that it does define a norm on the linear subspace of $\mathrm{Lin}(E,F)$ consisting of continuous linear maps. Note that for such a map $L$ we have the inequality

$$\|Lx\|_F \le \|L\|\,\|x\|_E$$

for every $x$ in $E$. For $x = 0$ this is trivial; otherwise it follows immediately from the definition of $\|L\|$ by writing $\|x\|_E = a$ and applying $L$ to $a^{-1}x$: since $\|a^{-1}x\|_E = 1$, we have $\|L(a^{-1}x)\|_F \le \|L\|$, but the left hand side is $a^{-1}\|L(x)\|_F$ by the linearity of $L$ and property (1) of the norm.

The notion of the normed linear space of continuous linear maps $E \to F$, with particular properties of the norm such as the inequality above, will be important in §2.6 when we consider higher orders of differentiation. Quite apart from these uses, however, it constitutes one of the basic pieces of equipment in functional analysis.
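For a linear map $R^2 \to R^2$ the number $\|L\|$ can be approximated straight from its definition as a supremum over the unit sphere. The following Python sketch does this by sampling the unit circle, and compares the result with the largest singular value of the matrix, which is a standard closed form for the euclidean operator norm (the matrix, the sample count and the function names are assumptions made for illustration).

```python
import math

def apply(A, x):
    """Apply the 2 x 2 matrix A to the vector x."""
    return [A[0][0] * x[0] + A[0][1] * x[1],
            A[1][0] * x[0] + A[1][1] * x[1]]

def norm(v):
    return math.hypot(v[0], v[1])

def operator_norm_sampled(A, samples=20000):
    """Approximate sup{||A x|| : ||x|| = 1} by sampling the unit circle."""
    best = 0.0
    for k in range(samples):
        t = 2 * math.pi * k / samples
        best = max(best, norm(apply(A, [math.cos(t), math.sin(t)])))
    return best

def operator_norm_exact(A):
    """Largest singular value of a 2 x 2 matrix, from the eigenvalues of A^T A."""
    (a, b), (c, d) = A
    p = a * a + c * c          # entries of the symmetric matrix A^T A
    q = a * b + c * d
    r = b * b + d * d
    largest = (p + r) / 2 + math.sqrt(((p - r) / 2) ** 2 + q * q)
    return math.sqrt(largest)

A = [[2.0, 1.0], [0.0, 1.0]]   # hypothetical example map R^2 -> R^2
K = operator_norm_exact(A)
print(operator_norm_sampled(A), K)

# The inequality ||L x|| <= ||L|| ||x|| for a few vectors x.
for x in ([1.0, 0.0], [3.0, -4.0], [-0.2, 7.5]):
    assert norm(apply(A, x)) <= K * norm(x) + 1e-12
```

The sampled value can only underestimate the supremum, which is why the inequality is checked against the exact value rather than the sampled one.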
Completeness

Recall that in a metric space $S$ we say a sequence $x_1, x_2, \ldots$ converges to $c$ in $S$ when: given any $\varepsilon > 0$ there exists a positive integer $m$ such that $d(x_n, c) < \varepsilon$ for all $n$ with $n > m$. If $S$ is a normed linear space we have $d(x_n, c) = \|x_n - c\|$, and if $S = R$ this is just $|x_n - c|$.
When we are given a sequence how can we tell whether or not it converges to some $c$? If we have a plausible guess for $c$ then we can look at $d(x_n, c)$ and try to check the definition, but what methods are available if we are not sure even whether $c$ exists? In the case $S = R$ there is a very useful lemma (often known as the General Principle of Convergence) stating that a sequence of numbers $x_1, x_2, \ldots$ converges (to something) if and only if given any $\varepsilon > 0$ there exists a positive integer $m$ such that

$$|x_{n_1} - x_{n_2}| < \varepsilon \quad \text{for all } n_1, n_2 > m.$$

Unfortunately the generalization of this to arbitrary metric spaces $S$, with $d(x_{n_1}, x_{n_2})$ replacing $|x_{n_1} - x_{n_2}|$, turns out to be false: take, for example, $S$ to be an open interval or to be the set of rational numbers with the metric induced from $R$. In these cases a sequence as above in $S$ will converge to something in $R$ which may not belong to $S$.
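The case of the rational numbers can be made concrete. As a minimal sketch in Python (the choice of Newton's iteration for $\sqrt{2}$ is an illustrative assumption, not taken from the text), the following generates a sequence of exact rationals whose successive terms crowd together, yet whose limit in $R$ is the irrational number $\sqrt{2}$, so that the sequence has no limit inside the rationals.

```python
from fractions import Fraction

def newton_sqrt2(steps):
    """Rational Newton iterates x -> (x + 2/x) / 2, starting from 1."""
    x = Fraction(1)
    seq = [x]
    for _ in range(steps):
        x = (x + 2 / x) / 2
        seq.append(x)
    return seq

seq = newton_sqrt2(6)
# Distances between successive terms, all exact rationals.
gaps = [abs(b - a) for a, b in zip(seq, seq[1:])]
for x, g in zip(seq[1:], gaps):
    # Third column: does this rational square to exactly 2?
    print(float(x), float(g), x * x == 2)
```

Every term is rational and none squares to 2, while the gaps shrink very rapidly; this is the behaviour that the notion of completeness is designed to exclude.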
Metric spaces in which this General Principle of Convergence does hold are sufficiently important and common for it to be worthwhile identifying them with a particular name: they are called complete metric spaces. A normed linear space which is also complete when viewed as a metric space is known as a Banach space (S. Banach, 1892-1945).

The examples given above of non-complete metric spaces are, of course, not (real) linear spaces. In fact it would be impossible to construct a finite-dimensional linear example in view of the result that every finite-dimensional normed linear space is automatically a Banach space. Completeness of a normed linear space is a technical condition needed for the correct statement of some general results later on, but in finite dimensions there will be no need to worry about it explicitly.
Hilbert spaces

Apart from its linear and norm structures, $R^n$ has interesting geometrical properties that relate to the inner product

$$\langle x, y\rangle = x_1y_1 + x_2y_2 + \cdots + x_ny_n$$

of pairs of elements $x, y$, familiar as the dot product or scalar product for vectors in $R^3$. We have $\langle x, x\rangle = \|x\|^2$, and in $R^3$ it can be shown that

$$\langle x, y\rangle = \|x\|\,\|y\|\cos\theta$$

where $\theta$ is the angle between the vectors $x, y$. In $R^n$ we can take this as the definition of $\theta$, and mimic much of the geometry of $R^3$.

In functional analysis there are many circumstances under which a Banach space comes provided with an inner product analogous to that in $R^n$. Technically, an inner product on a real normed linear space $E$ means a map $\langle\ ,\ \rangle : E \times E \to R$ which is linear in each factor separately, is symmetric ($\langle x, y\rangle = \langle y, x\rangle$), and is such that $\langle x, x\rangle$ is non-negative for every $x$, vanishing only when $x = 0$. (In the complex case there are some amendments to these conditions.) Using such an inner product it is possible to derive much stronger results about the structure of a Banach space than would be possible without it. A Banach space which has an inner product is called a Hilbert space (D. Hilbert, 1862-1943).

Sometimes it is necessary to ensure that a Hilbert space, although possibly infinite-dimensional, is still not too large in a topological sense. A topological space is called separable if it contains a countable dense subset (as with the countable set $Q$ of rational numbers contained in $R$). It is particularly useful to deal with separable Hilbert spaces, since it can be proved that they have in a certain precise sense at most countably infinite dimensions, and many finite-dimensional techniques can be extended to this case by replacing n-tuples $(x_1, \ldots, x_n)$ by infinite sequences.
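Taking $\langle x, y\rangle = \|x\|\,\|y\|\cos\theta$ as the definition of angle in $R^n$ is straightforward to carry out in practice. The short Python sketch below (function names and sample vectors are illustrative assumptions) computes the inner product, the norm it induces, and the resulting angle between two nonzero vectors in $R^4$.

```python
import math

def inner(x, y):
    """<x, y> = x_1 y_1 + ... + x_n y_n."""
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    """Norm recovered from the inner product: ||x||^2 = <x, x>."""
    return math.sqrt(inner(x, x))

def angle(x, y):
    """Angle theta defined by <x, y> = ||x|| ||y|| cos(theta); x, y nonzero."""
    c = inner(x, y) / (norm(x) * norm(y))
    return math.acos(max(-1.0, min(1.0, c)))  # clamp against rounding error

x = [1.0, 0.0, 0.0, 0.0]
y = [1.0, 1.0, 0.0, 0.0]
print(math.degrees(angle(x, y)))
```

The clamping step is only a numerical precaution: in exact arithmetic the quotient always lies between $-1$ and $1$, which is what makes the definition of $\theta$ legitimate.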
2.4
DIFFERENTIATION (continued)
Let $E$ and $F$ be two normed linear spaces with respective norms $\|\cdot\|_E$, $\|\cdot\|_F$, suppose $f : E \to F$ is a given continuous (not necessarily linear) map, and let $x$ be a particular point of $E$. By analogy with §2.1 we make the following definition:
DEFINITION

The map $f$ is differentiable at $x$ if there is a linear map $L : E \to F$ which approximates $f$ at $x$ in the sense that for all $h$ in $E$ we have

$$f(x+h) - f(x) = L(h) + \|h\|_E\,\eta(h)$$

where $\eta(h)$ is an element of $F$ with $\|\eta(h)\|_F \to 0$ as $\|h\|_E \to 0$.

In general $L$ of course depends on $x$, and so to be formally correct we should write something like $L_x$ instead of $L$. Here $\|h\|_E$ plays the role of $|h|$ in §2.1, while $\eta(h)$ is no longer a number but an element (thought of as 'small') of $F$. It is not difficult to show that if such an $L$ exists (for given $E$, $F$, $x$ and $f$) then it is unique. We call it the derivative of $f$ at $x$, and denote it by $Df(x)$. It is a linear map from $E$ to $F$, and not a number.

To avoid proliferation of brackets we will often write $Df(x)h$ or $Df(x).h$ instead of $Df(x)(h)$. Remember that $Df(x)$ is a linear map from $E$ to $F$, $h$ is an element of $E$, and so $Df(x)h$ is an element of $F$. Using this notation we can re-write the defining expression for the derivative as

$$f(x+h) - f(x) = Df(x)h + \|h\|\,\eta(h) \qquad (*)$$

where $\|h\|$ means $\|h\|_E$, and $\eta(h) \to 0$ (in $F$) as $h \to 0$ (in $E$).

The pictorial interpretation is much as before, although the spaces now may not be 1-dimensional or even finite-dimensional. Instead of thinking of the graph of $f$ (which will be some kind of 'surface' in the cartesian product $E \times F$) it can often be more useful to visualize the image of $f$. This consists of some distorted version of $E$ inside $F$. The image of $Df(x)$ is a linear subspace of $F$, and when translated by the vector $f(x)$ it is 'tangent' to the image of $f$ at the point $f(x)$. See Figure 9.

[Figure 9: image of f]

Equivalently, the map

$$h \mapsto f(x) + Df(x)h$$

is a first order approximation to the map

$$h \mapsto f(x+h).$$
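The defining formula (*) can be checked numerically for a specific map. The Python sketch below uses the hypothetical example $f(x,y) = (x^2 - y^2,\, 2xy)$ on $R^2$, whose derivative at a point is written down by hand as a $2 \times 2$ matrix (the map, the base point and the directions $h$ are illustrative assumptions). It computes $\|\eta(h)\| = \|f(p+h) - f(p) - Df(p)h\| / \|h\|$ for a sequence of shrinking $h$.

```python
import math

def f(p):
    x, y = p
    return [x * x - y * y, 2 * x * y]

def Df(p):
    """Derivative of f at p, written as a 2 x 2 matrix in standard bases."""
    x, y = p
    return [[2 * x, -2 * y],
            [2 * y, 2 * x]]

def norm(v):
    return math.hypot(v[0], v[1])

def eta_norm(p, h):
    """||eta(h)|| where f(p+h) - f(p) = Df(p)h + ||h|| eta(h)."""
    A = Df(p)
    fp, fph = f(p), f([p[0] + h[0], p[1] + h[1]])
    lin = [A[0][0] * h[0] + A[0][1] * h[1], A[1][0] * h[0] + A[1][1] * h[1]]
    remainder = [fph[i] - fp[i] - lin[i] for i in range(2)]
    return norm(remainder) / norm(h)

p = [1.0, 2.0]
for k in range(1, 6):
    h = [0.3 * 10 ** -k, -0.4 * 10 ** -k]
    print(norm(h), eta_norm(p, h))
```

For this particular map the remainder is exactly quadratic in $h$, so the printed values of $\|\eta(h)\|$ shrink in proportion to $\|h\|$, as the definition requires.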
Interpretation of the derivatives in familiar cases

We will now look at the meaning of the above rather abstract definition in the case of maps between euclidean spaces, where differentiation is a more familiar concept.

Case 1. $E = R$, $F = R^m$.

Here $f : R \to R^m$ is a continuous map, which we call a path in $R^m$. (From some points of view it would be more sensible to give that name to the image of $f$ in $R^m$, but then we would have lost track of its parametrization by $t$ in $R$. We shall continue to call the map a path.) At each point $t$ in $R$ the derivative $Df(t)$, if it exists, is a linear map $R \to R^m$. Now linear maps $L$ from $R$ to any linear space $F$ are very easy to understand, since for any $s$ in $R$ we have $s = s.1$ and so by linearity of $L$ we have

$$L(s) = sL(1).$$

In other words, to find $L(s)$ all you have to do is to find $L(1)$ as an element of $F$ and multiply it by the scalar $s$. Thus there are two possibilities: either $L(1) = 0$ in $F$, in which case $L(s) = 0$ for all $s \in R$ (i.e. $L$ is the zero map $R \to F$), or $L(1) \ne 0$, in which case the image of $L$ is the straight line through the origin and the point $L(1)$ in $F$. Returning to the case when $L = Df(t) : R \to R^m$, we see that if $Df(t)$ is not the zero map then its image is a straight line which, after the origin is translated to $f(t)$, is tangent at $f(t)$ to the image of the path $f$. See Figure 10.

[Figure 10]

The association of $L(1)$ with $L$ sets up a 1-1 correspondence between linear maps $R \to F$ and elements of $F$, and it is a simple matter to check that this is a linear isomorphism between the space of all linear maps $R \to F$ (pointwise addition and scalar multiplication) and the space $F$ itself. For this reason whenever we have a map $f : R \to F$ (in particular when $F = R^m$) it is often convenient to work not with $Df(t)$ but $Df(t)(1)$ or $Df(t)1$, which is an element of $F$ and denoted by $f'(t)$. By linearity we have for all $s$ in $R$

$$Df(t)s = Df(t)(s.1) = sDf(t)1 = sf'(t)$$

and so the formula (*) defining the derivative of $f : R \to F$ becomes

$$f(t+h) - f(t) = hf'(t) + |h|\,\eta(h)$$

where $\eta(h) \to 0$ as $h \to 0$.

Finally, returning to the specific case when $F = R^m$ we have

$$f(t) = (f_1(t), f_2(t), \ldots, f_m(t))$$

and the above formula shows us that $f'(t)$ is simply the m-vector

$$(f_1'(t), f_2'(t), \ldots, f_m'(t))$$

where the $f_i'$ are the usual derivatives of the real-valued functions $f_i$. This fits neatly into Figure 10, illustrating our familiar notion of $f'(t)$ as a vector based at $f(t)$ and tangent to the (image of the) path there. Of course, Figure 10 serves equally well as a pictorial representation of the case for general $F$, not necessarily $R^m$.

To round off this description we look at the case $m = 1$, bringing us back to the beginning of the discussion of differentiation (§2.1). We have two concepts of derivative: the old familiar $f'(t)$ as a number, and the new $Df(t)$ as a linear map $R \to R$. What is their relationship? From the above we see that it is merely that

$$f'(t) = Df(t)1,$$

the right hand side being a number (element of $R$) since $m = 1$.

Case 2. $E = R^n$, $F = R$.

Here $f$ is a real-valued function of $n$ real variables $(x_1, x_2, \ldots, x_n) = x$, and $Df(x)$ (if it exists) is a linear map $R^n \to R$. To analyze this linear map we will take the standard basis $\{e_1, e_2, \ldots, e_n\}$ for $R^n$ (see §2.2). Then any $h$ in $R^n$ can be written uniquely as

$$h = h_1e_1 + h_2e_2 + \cdots + h_ne_n$$

(i.e. $h = (h_1, h_2, \ldots, h_n)$) and so by linearity we have

$$Df(x)h = h_1Df(x)e_1 + h_2Df(x)e_2 + \cdots + h_nDf(x)e_n = h_1a_1 + h_2a_2 + \cdots + h_na_n, \text{ say.}$$

By taking $h$ to be of the form $(0, 0, \ldots, t, \ldots, 0)$ (i.e. $t$ in the $i$th place and zeros elsewhere) for each $i$ in turn, and substituting in the formula (*), we find that the $a_i$ are nothing other than the familiar partial derivatives $\frac{\partial f}{\partial x_i}$ at the point $x$.

If $v$ is any element of $R^n$ then the number $Df(x)v$ is called the (partial) derivative of $f$ at $x$ in the direction of $v$. Thus in this terminology the partial derivative $\frac{\partial f}{\partial x_i}(x)$ is the derivative of $f$ at $x$ in the direction of $e_i$. If $R^n$ is replaced by an arbitrary normed linear space $E$ then the notion of 'standard' partial derivatives $\frac{\partial f}{\partial x_i}$ will not exist, but the derivative of $f$ in the direction of $v$ (where $v$ is an element of $E$) will still make sense.

Case 3. $E = R^n$, $F = R^m$.

In this case $Df(x)$ is a linear map $R^n \to R^m$, and so if we choose bases $\{e_1, e_2, \ldots, e_n\}$, $\{e'_1, e'_2, \ldots, e'_m\}$ for $R^n$ and $R^m$ respectively (not necessarily the standard bases) we can represent $Df(x)$ by a matrix. Specifically, if

$$h = \xi_1e_1 + \xi_2e_2 + \cdots + \xi_ne_n$$

then

$$Df(x)h = \lambda_1e'_1 + \lambda_2e'_2 + \cdots + \lambda_me'_m$$

where the $\lambda$'s are related to the $\xi$'s by

$$\lambda_i = \sum_{j=1}^{n} \alpha_{ij}\xi_j, \qquad 1 \le i \le m$$

for a collection of $mn$ scalars $\{\alpha_{ij}\}$ forming a matrix with $m$ rows and $n$ columns.

If we now do take the bases to be the standard bases, so that the $\xi_i$ and $\lambda_i$ are the usual coordinates, then writing

$$f(x) = (f_1(x), f_2(x), \ldots, f_m(x))$$

and putting together the discussions of cases 1 and 2 we find that

$$\alpha_{ij} = \frac{\partial f_i}{\partial x_j}(x)$$

for each $i$, $j$. The matrix representing $Df(x)$ with respect to the standard bases is thus

$$\begin{pmatrix}
\frac{\partial f_1}{\partial x_1}(x) & \frac{\partial f_1}{\partial x_2}(x) & \cdots & \frac{\partial f_1}{\partial x_n}(x) \\
\frac{\partial f_2}{\partial x_1}(x) & \frac{\partial f_2}{\partial x_2}(x) & \cdots & \frac{\partial f_2}{\partial x_n}(x) \\
\vdots & \vdots & & \vdots \\
\frac{\partial f_m}{\partial x_1}(x) & \frac{\partial f_m}{\partial x_2}(x) & \cdots & \frac{\partial f_m}{\partial x_n}(x)
\end{pmatrix}$$

This is the so-called Jacobian matrix (with respect to standard bases), condensed to the notation

$$\frac{\partial(f_1, f_2, \ldots, f_m)}{\partial(x_1, x_2, \ldots, x_n)}.$$

Observe that we could equally well construct an analogous Jacobian matrix for other choices of bases.

When $m = n$ the matrix is a square matrix, and so has a determinant. This number is often denoted by $Jf(x)$ or just $J$ when $f$ and $x$ are understood: it will be of considerable interest to us later, in §2.8.
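The description of the Jacobian matrix column by column, each column being the derivative in the direction of a standard basis vector $e_j$, translates directly into a numerical approximation. The Python sketch below compares a hand-computed Jacobian matrix with a central-difference approximation for a hypothetical map $f : R^3 \to R^2$ (the map, the step size and the function names are assumptions made for illustration).

```python
import math

def f(x):
    """Hypothetical map f : R^3 -> R^2 used only as an illustration."""
    x1, x2, x3 = x
    return [x1 * x2 + math.sin(x3), x1 ** 2 - x2 * x3]

def jacobian_exact(x):
    """Matrix of partial derivatives d f_i / d x_j (m = 2 rows, n = 3 columns)."""
    x1, x2, x3 = x
    return [[x2, x1, math.cos(x3)],
            [2 * x1, -x3, -x2]]

def jacobian_fd(func, x, step=1e-6):
    """Approximate each column j by a central difference in the direction e_j."""
    m = len(func(x))
    cols = []
    for j in range(len(x)):
        plus = list(x); plus[j] += step
        minus = list(x); minus[j] -= step
        fp, fm = func(plus), func(minus)
        cols.append([(fp[i] - fm[i]) / (2 * step) for i in range(m)])
    # cols[j][i] holds d f_i / d x_j; transpose to get m rows and n columns.
    return [[cols[j][i] for j in range(len(x))] for i in range(m)]

x = [0.5, -1.0, 2.0]
J, J_fd = jacobian_exact(x), jacobian_fd(f, x)
err = max(abs(J[i][j] - J_fd[i][j]) for i in range(2) for j in range(3))
print(J, err)
```

Agreement of the two matrices is only evidence, not proof, of differentiability: as the remarks that follow point out, existence of the partial derivatives alone does not guarantee that $Df(x)$ exists.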
Remarks

1. The Jacobian matrix depends on a choice of bases, whereas the derivative $Df(x)$ does not. The derivative was defined entirely without reference to coordinates.

2. Although the differentiability of $f$ at $x$ implies the existence of the partial derivatives of the components $f_i$ of $f$, it is a standard fact that the converse does not hold. This is essentially because the partial derivatives tell us approximately how the function behaves in $n$ particular directions, whereas the derivative $Df(x)$ deals with the behaviour in a whole neighbourhood. It can be proved, however, that if the partial derivatives exist and are themselves continuous functions in some neighbourhood of $x$ then $f$ is differentiable at $x$. We shall not be concerned with technicalities of this kind in what follows.

3. Digression into the complex field. All the above ideas work just as well for maps between complex linear spaces. In particular, if $C^n$ denotes the linear space of n-tuples of complex numbers with co-ordinate-wise addition and multiplication by scalars (complex numbers) and norm given by

$$\|z\| = \left(\sum_{i=1}^{n} |z_i|^2\right)^{1/2}$$

then the derivative of a differentiable map $f : C^n \to C^m$ at $z$ in $C^n$ is a complex-linear map $Df(z) : C^n \to C^m$. Now $C^n$ can be thought of as $R^{2n}$ by, for example, identifying $(z_1, \ldots, z_n)$ with $(x_1, y_1, x_2, y_2, \ldots, x_n, y_n)$ where $z_j = x_j + iy_j$ for each $j = 1, 2, \ldots, n$, although in $C^n$ we can multiply by complex scalars while in $R^{2n}$ we can multiply only by real scalars. A complex-linear map $C^n \to C^m$ becomes a real-linear map $R^{2n} \to R^{2m}$ (since real scalars are special cases of complex scalars) but the converse is not true. At each $z$ in $C^n = R^{2n}$ the conditions for a real-differentiable map $f$ to represent a complex-differentiable map are that

$$\frac{\partial u_i}{\partial x_j} = \frac{\partial v_i}{\partial y_j}, \qquad \frac{\partial u_i}{\partial y_j} = -\frac{\partial v_i}{\partial x_j}, \qquad 1 \le i \le m, \quad 1 \le j \le n$$

where we write

$$f(x_1, y_1, x_2, y_2, \ldots, x_n, y_n) = (u_1, v_1, u_2, v_2, \ldots, u_m, v_m),$$

each $u_k$, $v_k$ being a function of the $x_j$ and $y_j$. These are the classical Cauchy-Riemann equations.
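In the simplest case $n = m = 1$ the Cauchy-Riemann equations read $u_x = v_y$ and $u_y = -v_x$, and they can be tested numerically. The Python sketch below treats a map $C \to C$ as a map $R^2 \to R^2$, approximates the four partial derivatives by central differences, and reports how far the equations are from holding (the two sample maps, the step size and the function names are assumptions made for illustration).

```python
def partials(F, x, y, step=1e-6):
    """Central-difference partials of F(x, y) = (u, v): returns u_x, u_y, v_x, v_y."""
    ux, vx = [(a - b) / (2 * step) for a, b in zip(F(x + step, y), F(x - step, y))]
    uy, vy = [(a - b) / (2 * step) for a, b in zip(F(x, y + step), F(x, y - step))]
    return ux, uy, vx, vy

def as_real_map(g):
    """View g : C -> C as a map R^2 -> R^2, (x, y) -> (u, v)."""
    def F(x, y):
        w = g(complex(x, y))
        return (w.real, w.imag)
    return F

def cauchy_riemann_defect(g, x, y):
    """Size of the failure of u_x = v_y and u_y = -v_x at the point x + iy."""
    ux, uy, vx, vy = partials(as_real_map(g), x, y)
    return max(abs(ux - vy), abs(uy + vx))

square = lambda z: z * z             # complex-differentiable
conjugate = lambda z: z.conjugate()  # real-linear but not complex-linear

print(cauchy_riemann_defect(square, 1.0, 2.0))
print(cauchy_riemann_defect(conjugate, 1.0, 2.0))
```

Complex conjugation is a real-linear map of $R^2$ that is not complex-linear, which illustrates the statement that the converse fails.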
So far we have been discussing maps $f$ whose domain $E$ and codomain $F$ are both normed linear spaces, but it is clear that for differentiability to be defined at $x$ it is not necessary that $f$ be defined on the whole of $E$. All we need is that $f(x+h)$ should be defined for all sufficiently small $h$, i.e. on some $\delta$-ball $B_\delta(x)$. Hence if the domain of $f$ is given as an open set $U$ in a normed linear space $E$ it makes sense to talk about differentiability of $f : U \to F$ at any point $x$ in $U$. Note, however, that although the formula (*) only has meaning if $h$ is such that $x + h$ lies in $U$, the map $Df(x)$ is linear and so its domain can be thought of as the whole of $E$. Thus associated to $f : U \to F$ we have

$$Df(x) : E \to F$$

for every $x$ in $U$ at which $f$ is differentiable.

If it is the case that $f$ is differentiable at $x$ for every $x$ in $U$ then we simply say $f : U \to F$ is differentiable, or that $f$ is differentiable on $U$. For each $x$ in $U$ we have a linear map $Df(x) : E \to F$, or in other words we have a map

$$Df : U \to \{\text{linear maps } E \to F\} = \mathrm{Lin}(E,F).$$
At this point we may wish to ask whether the derivative of $f$ at $x$ varies continuously with $x$, i.e. whether the map $Df : U \to \mathrm{Lin}(E,F)$ is continuous. To make sense of this we would have to have a topology on $\mathrm{Lin}(E,F)$, or at least on some subset of $\mathrm{Lin}(E,F)$ containing the image of $Df$. Now it so happens that if the continuous map $f : U \to F$ is differentiable at $x$ then $Df(x)$ is a continuous linear map (i.e. is not merely a linear map $E \to F$ but is continuous on $E$, for each fixed $x$). This means that the image of $Df : U \to \mathrm{Lin}(E,F)$ is contained in the subspace of $\mathrm{Lin}(E,F)$ consisting of continuous linear maps. As we saw in §2.3 this space, which we will now denote by $L(E,F)$, has a natural structure as a normed linear space and hence a metric space. Therefore if we regard $Df$ not just as a map

$$Df : U \to \mathrm{Lin}(E,F)$$

but as a map

$$Df : U \to L(E,F)$$

we are able to discuss whether or not it is continuous.

DEFINITION

If $Df$ is continuous we say that $f : U \to F$ is of class $C^1$ on $U$, or simply that $f$ is of class $C^1$ or $f$ is $C^1$.

This somewhat abstract formulation may seem far removed from familiar ideas of continuity of derivative in euclidean space, but the gap is easily bridged. To begin with, recall that there is no distinction between $\mathrm{Lin}(R^n,R^m)$ and $L(R^n,R^m)$ since, as we have seen, every linear map $R^n \to R^m$ is continuous. There are various reasonable ways to put a norm on $L(R^n,R^m)$. First of all we can take the norm defined above in the general context, namely

$$\|L\| = \sup\{\|L(x)\| : \|x\| = 1\}$$

where the norms on the right hand side refer to the usual euclidean norms. Secondly, we could choose norms other than euclidean norms for $R^n$, $R^m$ and make the analogous definition of $\|L\|$. Thirdly, we could choose bases for $R^n$ and $R^m$, then represent $L$ by an $m \times n$ matrix $A = (a_{ij})$ and, regarding $A$ as an element of $R^{mn}$, take $\|L\|$ to be the euclidean norm of $A$, i.e.

$$\|L\| = \left(\sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij}^2\right)^{1/2}.$$

Then there are other possible variants such as

$$\|L\| = \max_{i,j} |a_{ij}|$$

and so on. Now the important fact to use here, which sweeps away all this confusion, is that it makes no difference which norm we choose on $L(R^n,R^m)$. More precisely, since $L(R^n,R^m)$ is finite-dimensional we can invoke a non-trivial theorem (Simmons [118]) which says that any two norms on a finite-dimensional linear space will induce the same topology, although of course the induced metrics will in general be different. Consequently, for studying the continuity (a purely topological property) of

$$Df : U \to L(R^n,R^m)$$

we can choose whichever way we like of measuring $\|Df(x)\|$. A convenient way is to take standard bases and let

$$\|Df(x)\| = \max_{i,j} \left|\frac{\partial f_i}{\partial x_j}(x)\right|.$$

It is then simple to see that continuity of $Df$ is the same as continuity of the partial derivatives $\frac{\partial f_i}{\partial x_j}$. Therefore $f : U \to R^m$ ($U$ an open subset of $R^n$) is $C^1$ precisely when all the $\frac{\partial f_i}{\partial x_j}$ are continuous functions on $U$.

This freedom in the choice of norm is another illustration of the advantages of working with topologies rather than particular metric structures.
2.5
PROPERTIES AND USES OF THE DERIVATIVE
Unless stated otherwise, $E$ and $F$ will denote normed linear spaces with $U$ an open subset of $E$.

The following properties of the derivative can be verified directly from the definition:

(i) Linear combinations. If $f, g : U \to F$ are differentiable then so is the map $\alpha f + \beta g$ for any constants $\alpha$, $\beta$, and

$$D(\alpha f + \beta g) = \alpha Df + \beta Dg$$

as maps $U \to L(E,F)$. Constant maps themselves have derivative zero, of course.

(ii) Linear maps. If $f$ is the restriction to $U$ of a continuous linear map $L : E \to F$ then

$$Df(x) = L$$

for every $x$ in $U$, i.e. $L$ is already its own linear approximation. (Note that we do not say $Df(x) = L(x)$, which would not make sense.) In particular $f$ is $C^1$. In the special case $E = F = R$ and $f(x) = \ell x$ for some number $\ell$, we have $Df(x) = \ell$ regarded as the linear map $R \to R : x \mapsto \ell x$.

(iii) Bilinear maps. If $E = E_1 \times E_2$ and $f$ is the restriction to $U$ of a continuous bilinear map $B : E_1 \times E_2 \to F$ (i.e. $B$ is linear in each factor separately) then

$$Df(x)h = B(x_1, h_2) + B(h_1, x_2)$$

where we write $x = (x_1, x_2)$, $h = (h_1, h_2)$ with $x_1, h_1$ in $E_1$ and $x_2, h_2$ in $E_2$. In particular $f$ is $C^1$ since the map $Df : U \to L(E,F)$ taking $x$ to the linear map

$$h \mapsto B(x_1, h_2) + B(h_1, x_2)$$

is continuous. As an example, consider $E_1 = E_2 = F = R$ and $f(x_1, x_2) = x_1x_2$ (ordinary multiplication). Then

$$Df(x)h = x_1h_2 + h_1x_2 = (x_2, x_1)\begin{pmatrix} h_1 \\ h_2 \end{pmatrix}$$

which is another way of saying that $\frac{\partial f}{\partial x_1} = x_2$, $\frac{\partial f}{\partial x_2} = x_1$.

(iv) Cartesian products. If $f_1 : U_1 \to F_1$ and $f_2 : U_2 \to F_2$ are differentiable then so is $f = f_1 \times f_2 : U_1 \times U_2 \to F_1 \times F_2$, where $(f_1 \times f_2)(x_1, x_2)$ means $(f_1(x_1), f_2(x_2))$, and we have

$$Df(x_1, x_2)(h_1, h_2) = (Df_1(x_1)h_1, Df_2(x_2)h_2).$$

In other words, derivatives operate co-ordinate-wise.

(v) Compositions. (Important) Suppose $E$, $F$ and $G$ are three normed linear spaces, and $U$, $V$ are open sets in $E$, $F$ respectively. Let $f : U \to F$ and $g : V \to G$ be continuous maps, and suppose the image of $f$ lies in $V$ so that the composition $g \circ f : U \to G$ exists: see Figure 11.

[Figure 11]

If $f$ is differentiable at $x$ and $g$ is differentiable at $f(x)$ then $g \circ f$ is differentiable at $x$ and

$$D(g \circ f)(x) = Dg(f(x)) \circ Df(x).$$
Observe that the right hand side is the composition of the two continuous linear maps $Df(x) : E \to F$ and $Dg(f(x)) : F \to G$ and is itself a continuous linear map from $E$ to $G$.

This formula for the derivative of $g \circ f$ is known as the Chain Rule. It can be expressed by saying that the derivative of a composition is the composition of the derivatives, but note that care must be taken to specify at which points the derivatives are calculated. In euclidean space the derivatives can be represented by corresponding Jacobian matrices, and then we obtain the familiar chain rule for partial derivatives. Explicitly, if $E = R^n$, $F = R^m$, $G = R^p$ and we write $y = f(x)$, $z = g(y)$, then the chain rule states that

$$\frac{\partial(z_1, z_2, \ldots, z_p)}{\partial(x_1, x_2, \ldots, x_n)} = \frac{\partial(z_1, z_2, \ldots, z_p)}{\partial(y_1, y_2, \ldots, y_m)} \cdot \frac{\partial(y_1, y_2, \ldots, y_m)}{\partial(x_1, x_2, \ldots, x_n)}$$

or equivalently

$$\frac{\partial z_i}{\partial x_j} = \sum_{k=1}^{m} \frac{\partial z_i}{\partial y_k}\frac{\partial y_k}{\partial x_j}$$

which is the form in which the rule is most commonly expressed. This form is the one which is most useful for specific calculations, but its disadvantages are (1) it is cumbersome, (2) it depends on choices of co-ordinates in $R^n$ etc., (3) it gives (as written) no information about where to evaluate the partial derivatives.
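The matrix form of the Chain Rule, including the question of where each Jacobian matrix is evaluated, can be checked on a small example. The Python sketch below uses hypothetical maps $f : R^2 \to R^3$ and $g : R^3 \to R^2$ with hand-computed Jacobian matrices, multiplies $Dg$ at $f(x)$ by $Df$ at $x$, and compares the product with a central-difference approximation to the Jacobian matrix of $g \circ f$ (all maps, names and the step size are assumptions made for illustration).

```python
import math

def f(x):                      # f : R^2 -> R^3 (hypothetical example)
    return [x[0] * x[1], math.sin(x[0]), x[1] ** 2]

def Jf(x):                     # 3 x 2 Jacobian matrix of f
    return [[x[1], x[0]], [math.cos(x[0]), 0.0], [0.0, 2 * x[1]]]

def g(y):                      # g : R^3 -> R^2 (hypothetical example)
    return [y[0] + y[1] * y[2], math.exp(y[0])]

def Jg(y):                     # 2 x 3 Jacobian matrix of g
    return [[1.0, y[2], y[1]], [math.exp(y[0]), 0.0, 0.0]]

def matmul(A, B):
    """Product of matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def jacobian_fd(func, x, step=1e-6):
    """Central-difference approximation to the Jacobian matrix of func at x."""
    m, n = len(func(x)), len(x)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        plus = list(x); plus[j] += step
        minus = list(x); minus[j] -= step
        fp, fm = func(plus), func(minus)
        for i in range(m):
            J[i][j] = (fp[i] - fm[i]) / (2 * step)
    return J

x = [0.7, -1.2]
# Chain Rule: Dg is evaluated at f(x), Df at x.
chain = matmul(Jg(f(x)), Jf(x))
direct = jacobian_fd(lambda t: g(f(t)), x)
err = max(abs(chain[i][j] - direct[i][j]) for i in range(2) for j in range(2))
print(chain, err)
```

Evaluating `Jg` at `x` instead of at `f(x)` would not even be meaningful here, since `x` lies in $R^2$ and `Jg` expects a point of $R^3$; this is the third disadvantage of the classical notation made visible.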
EXAMPLES

1. Ordinary multiplication of real numbers is a bilinear map $m : R \times R \to R$, and the Chain Rule allows us to use this to derive a rule for the derivative of the pointwise product $fg : E \times F \to R$ of two functions $f : E \to R$ and $g : F \to R$. Expressing $f(x)g(y)$ as $m(f(x), g(y))$ we can write $fg = m \circ (f \times g)$ and so by the Chain Rule

$$D(fg)(x,y) = Dm(f(x), g(y)) \circ D(f \times g)(x,y) = Dm(f(x), g(y)) \circ (Df(x) \times Dg(y))$$

by the rule for cartesian products. Applying both sides to $(h,k)$ in $E \times F$ we obtain

$$D(fg)(x,y).(h,k) = Dm(f(x), g(y)).(Df(x)h, Dg(y)k) = m(f(x), Dg(y)k) + m(Df(x)h, g(y))$$

by the rule for bilinear maps. The right hand side is simply $f(x)Dg(y)k + g(y)Df(x)h$ (multiplication is commutative), and so we can write the conclusion as

$$D(fg)(x,y) = f(x)Dg(y) + g(y)Df(x)$$

where here the right hand side is a pointwise linear combination of the two linear maps $Df(x) : E \to R$ and $Dg(y) : F \to R$. In the particular case when $E = F = R$ this of course reduces to nothing more than the usual product rule

$$(fg)' = fg' + gf'.$$
2. If $B : E \times E \to R$ is a bilinear function let $f : E \to R$ be defined by $f(x) = B(x,x)$. Such a function is often called a quadratic form. We can write $f$ as the composition $f = B \circ d$ where $d : E \to E \times E$ is the 'diagonal' linear map taking $x$ to $(x, x)$. Since $Dd(x) = d$ we have

$$Df(x)h = DB(d(x)).Dd(x)h = DB(x,x).(h,h) = B(x,h) + B(h,x).$$

If $B$ is symmetric in the two factors this becomes

$$Df(x)h = 2B(x,h) \quad \text{or} \quad Df(x) = 2B(x,\cdot).$$

A particularly common example is the case $E = R^n$ and $B$ = scalar product, i.e.

$$B(x,y) = x.y = \sum_{i=1}^{n} x_iy_i :$$

here $f(x) = \|x\|^2$ and $Df(x)$ is the linear map $h \mapsto 2x.h$. This we can represent by the $1 \times n$ matrix or row-vector $2x^T$. More generally, if $f(x)$ is the quadratic form $x^TAx$ then $Df(x)$ is the row-vector $x^T(A + A^T)$ (where $T$ denotes transpose of the vector $x$ and of the matrix $A$).
3. A superficially more complicated but in fact equally straightforward class of examples is illustrated by the following: Let $A$, $b$ be an $m \times n$ matrix and an n-vector, respectively, each varying differentiably with respect to some real parameter $t$, and let $f(t) = A(t)b(t)$ in $R^m$. What is the formula for $f'(t)$, or, more precisely, $Df(t).1$, in $R^m$? Writing $f = B \circ (A \times b)$, where $A \times b$ is understood as the map $t \mapsto (A(t), b(t))$ and

$$B : L(R^n,R^m) \times R^n \to R^m$$

is the bilinear map $(A,b) \mapsto Ab$, we have

$$f'(t) = Df(t).1 = DB(A(t), b(t)).(A'(t), b'(t)) = B(A(t), b'(t)) + B(A'(t), b(t)) = A(t)b'(t) + A'(t)b(t)$$

which makes sense because $b'(t) = Db(t).1$ in $R^n$ and $A'(t) = DA(t).1$ in $L(R^n,R^m)$. There are many similar examples that could be constructed using products of matrices, transposes and so on.

(vi) If $f : U \to V$ is a differentiable map which has a differentiable inverse $g : V \to U$ then at each point $x$ in $U$ the derivative $Df(x) : E \to F$ is a linear isomorphism. This is easy to prove: we have

$$g \circ f = \mathrm{id} : U \to U$$

and so the Chain Rule gives

$$Dg(f(x)).Df(x) = D(\mathrm{id})(x) = \mathrm{id} : E \to E$$

since $\mathrm{id} : U \to U$ is (the restriction of) a linear map. Similarly

$$f \circ g = \mathrm{id} : V \to V$$

and so by the Chain Rule at the point $f(x)$ in $V$

$$Df(g(f(x))).Dg(f(x)) = Df(x).Dg(f(x)) = \mathrm{id} : F \to F.$$

Therefore $Df(x)$ has $Dg(f(x))$ as its inverse, and so in particular $Df(x)$ is a linear isomorphism. Again, this can be casually expressed by saying the derivative of the inverse is the inverse of the derivative.

A differentiable map with differentiable inverse is called a diffeomorphism. A diffeomorphism defined on some open neighbourhood $U$ of a given point $x$ is called a local diffeomorphism at $x$, although this term is often reserved to apply to local diffeomorphisms whose domain and codomain are both open neighbourhoods of $x$ in $E$ and which keep the point $x$ itself fixed. Local diffeomorphisms at $x$ can be regarded simply as invertible (non-linear) changes of local co-ordinates near $x$.

(vii) A much-used result in elementary calculus is the Mean Value Theorem. This states, roughly, that if a line segment is drawn joining two points on the graph of a differentiable function then there is somewhere on the graph in between these points where the slope of the curve is the same as the slope of the line segment. More accurately, if the interval $[x, x+h]$ lies in the domain of the differentiable function $f$ (although strictly speaking we do not need differentiability at the end-points: continuity there suffices) then there exists some point $\xi$ with $x < \xi < x+h$ such that

$$f'(\xi) = (f(x+h) - f(x))/h.$$

Rewriting this as

$$f(x+h) = f(x) + f'(\xi)h$$

gives another interpretation of the theorem, namely that $f(x+h)$ can be expressed exactly as $f(x)$ plus a linear map applied to $h$. The linear map is the derivative of $f$ at $\xi$, not that at $x$ which, from the definition, may give only an approximation to $f(x+h)$. This latter interpretation, rather than the 'graph' version, generalizes directly to $n$ dimensions or to any normed linear space:
THEOREM (Mean Value Theorem)

Let $f : U \to R$ be a differentiable function. Let $x$ and $x+h$ lie in $U$, and suppose the whole line-segment joining $x$ to $x+h$ also lies in $U$. Then there is some point $y$ on the line-segment such that

$$f(x+h) = f(x) + Df(y)h.$$
The proof is easily obtained from the one-dimensional version above. Given $x$ and $h$ in $E$, let $\phi : \mathbb{R} \to \mathbb{R}$ be the function defined by
\[ \phi(t) = f(x + th), \]
i.e. $f$ evaluated at the point a proportion $t$ along the line segment from $x$ to $x+h$. Then $\phi$ is differentiable, being the composition of the differentiable functions $\mathbb{R} \to E : t \mapsto x + th$ and $f : E \to \mathbb{R}$, and the Chain Rule gives
\[ \phi'(t) = Df(x + th)h . \]
By the standard Mean Value Theorem the left hand side equals $\phi(1) - \phi(0)$ for $t$ equal to some $\xi$ between $0$ and $1$, and so taking $y = x + \xi h$ gives the result.
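The construction in the proof can be tried numerically. The following is a minimal sketch, assuming a sample function $f(x_1,x_2) = x_1^2 + 3x_2^2$ chosen purely for illustration; the helper names (`directional_derivative`, `mean_value_parameter`) are hypothetical, and the bisection step assumes that $\phi'(t) - (\phi(1) - \phi(0))$ changes sign on $[0,1]$, which holds here because $\phi'$ is increasing.

```python
def directional_derivative(grad, point, h):
    # Df(point).h for a real-valued f on R^n, with Df(point) given by its gradient.
    return sum(g * hi for g, hi in zip(grad(point), h))


def mean_value_parameter(f, grad, x, h, tol=1e-12):
    # Solve Df(x + t h).h = f(x + h) - f(x) for t in [0, 1] by bisection.
    target = f([xi + hi for xi, hi in zip(x, h)]) - f(x)

    def gap(t):
        point = [xi + t * hi for xi, hi in zip(x, h)]
        return directional_derivative(grad, point, h) - target

    a, b = 0.0, 1.0
    if gap(a) * gap(b) > 0:
        raise ValueError("no sign change on [0, 1]; bisection does not apply")
    while b - a > tol:
        mid = 0.5 * (a + b)
        if gap(a) * gap(mid) <= 0:
            b = mid
        else:
            a = mid
    return 0.5 * (a + b)


# Illustrative choice (an assumption, not taken from the text).
f = lambda p: p[0] ** 2 + 3 * p[1] ** 2
grad = lambda p: [2 * p[0], 6 * p[1]]
x, h = [1.0, 0.0], [1.0, 2.0]

t = mean_value_parameter(f, grad, x, h)
y = [xi + t * hi for xi, hi in zip(x, h)]   # the point y = x + t h on the segment
lhs = f([xi + hi for xi, hi in zip(x, h)])  # f(x + h)
rhs = f(x) + directional_derivative(grad, y, h)
```

For this particular $f$ the function $\phi(t) = 1 + 2t + 13t^2$ can be written down by hand, so the parameter found should be $t = 1/2$.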
[...]

To say that $f : U \to F$ is of class $C^1$ is to say that the map
\[ Df : U \to L(E,F) \]
is continuous, where $L(E,F)$ has the topology which comes from the norm
\[ \|L\| = \sup\{\, \|L(x)\| : \|x\| = 1 \,\} . \]
(Note: Do not confuse the possible continuity of $Df$ with the necessary continuity of $Df(x) : E \to F$.)

We can next ask whether $Df$ itself is differentiable at a given point $x$. If it is, then $D(Df)(x)$ will be a linear map $E \to L(E,F)$. This is called the second derivative of $f$ at $x$, abbreviated to $D^2f(x)$. Just as the continuity of $f$ (if $f$ is continuous) implied that its derivative was a continuous linear map, so the fact that $Df$ is continuous implies that $D^2f(x)$ is a continuous linear map, i.e. $D^2f(x)$ belongs to $L(E,L(E,F))$. If $Df$ is differentiable on the whole of $U$ we thus have a map
\[ D^2f : U \to L(E,L(E,F)) . \]
If this is continuous, which is the same thing as saying that $Df$ is $C^1$, we say that $f$ is of class $C^2$. Although the right hand side looks complicated it is still a normed linear space, and we can therefore continue in this way to define $C^3$, $C^4$ etc. If $f$ is of class $C^r$ for all $r = 1, 2, 3, \ldots$ then we say that $f$ is $C^\infty$.

Clearly, if $f$ is $C^r$ then $D^{r-1}f$ exists and is continuous, or we would not have considered differentiating it: thus $C^r$ implies $C^{r-1}$, and hence $C^r$ implies $C^s$ for all $s < r$. Conventionally we write $C^0$ to mean continuous. Sometimes the words smooth or just differentiable are used in the literature to mean $C^\infty$. Later we shall tend to use smooth to mean either $C^\infty$ or $C^r$ for some $r$ understood from the context.
Discussion of derivatives of high order rapidly becomes very uncomfortable because we have to deal explicitly with spaces of type
\[ L(E,L(E,L(E,\ldots,L(E,L(E,F))\ldots))) . \]
Fortunately we can avoid the problem by making use of some linear isomorphisms which conveniently exist between these spaces and a family of other types of space which are conceptually easier to grasp. These are spaces of multilinear maps.
We have already encountered examples of bilinear maps. An n-multilinear map is a map
\[ M : E_1 \times E_2 \times \ldots \times E_n \to F , \]
where all the $E_i$ and $F$ are linear spaces, which is a linear map in each factor separately (i.e. when all the components in the other factors are held fixed). For example, the multiplication map
\[ (x_1,x_2,\ldots,x_n) \mapsto x_1x_2\ldots x_n \]
is an n-multilinear map $\mathbb{R}^n = \mathbb{R} \times \mathbb{R} \times \ldots \times \mathbb{R} \to \mathbb{R}$. As another example, consider the space of all $n \times n$ matrices regarded as $\mathbb{R}^n \times \mathbb{R}^n \times \ldots \times \mathbb{R}^n$ ($n$ times), each factor corresponding to one column vector. Then the determinant map is an n-multilinear map
\[ \mathbb{R}^n \times \mathbb{R}^n \times \ldots \times \mathbb{R}^n \to \mathbb{R} . \]
As special cases we note that a linear map is a 1-multilinear map and a bilinear map is a 2-multilinear map. Observe that for $n > 1$ an n-multilinear map is in general not linear, since the effect of multiplying something on the left hand side by a scalar $\alpha$ is to multiply in $F$ by $\alpha^n$.
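The determinant example can be checked directly on a small case. The sketch below is illustrative only: `det` is a hypothetical helper using cofactor expansion along the first row, and the column vectors are arbitrary integers chosen so that the arithmetic is exact. It tests linearity in the first column and the scaling by $\alpha^n$ when every column is multiplied by $\alpha$.

```python
def det(cols):
    # Determinant of the square matrix whose columns are the vectors in `cols`,
    # by cofactor expansion along the first row (fine for small n).
    n = len(cols)
    if n == 1:
        return cols[0][0]
    total = 0
    for j in range(n):
        # Delete column j and the first row to form the minor.
        minor = [c[1:] for k, c in enumerate(cols) if k != j]
        total += (-1) ** j * cols[j][0] * det(minor)
    return total


def combine(a, u, b, w):
    # The vector a*u + b*w.
    return [a * ui + b * wi for ui, wi in zip(u, w)]


u, w = [1, 2, 3], [0, 1, 4]
c2, c3 = [2, 0, 1], [1, 1, 0]
a, b, alpha = 5, -2, 3

# Linearity in the first factor, with the other factors held fixed.
linear_lhs = det([combine(a, u, b, w), c2, c3])
linear_rhs = a * det([u, c2, c3]) + b * det([w, c2, c3])

# Scaling every factor by alpha multiplies the value by alpha**n (here n = 3).
scaled = det([[alpha * v for v in c] for c in (u, c2, c3)])
unscaled = det([u, c2, c3])
```

Because the determinant is linear in each column separately but not in the matrix as a whole, `scaled` equals `alpha**3 * unscaled` rather than `alpha * unscaled`.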
Consider now the space $L(E,L(E,F))$, or more generally $L(E_1,L(E_2,F))$. An element $L$ of this space is a map associating to every $x_1$ in $E_1$ an element $L(x_1)$ in $L(E_2,F)$. Now $L(x_1)$ associates to every $x_2$ in $E_2$ an element $L(x_1).x_2$ of $F$. Writing $\tilde{L}(x_1,x_2)$ instead of $L(x_1).x_2$ we have constructed a map
\[ \tilde{L} : E_1 \times E_2 \to F . \]
Since $L$ and $L(x_1)$ were linear this map is easily seen to be bilinear. Furthermore, since $L$ and $L(x_1)$ are continuous it can be shown that $\tilde{L}$ is continuous, where $E_1 \times E_2$ is equipped with the product topology (see page 15).

Conversely, if we start with a bilinear map $\tilde{L} : E_1 \times E_2 \to F$ then we can define a corresponding linear map
\[ L : E_1 \to \mathrm{Lin}(E_2,F) \]
by taking $L(x_1)$ to be, for each $x_1$ in $E_1$, the linear map taking $x_2$ to $\tilde{L}(x_1,x_2)$. Now since we have assumed nothing about continuity of $\tilde{L}$ to begin with, there is no reason to expect that $L(x_1)$ will be continuous, or, if it is, that $L$ will be continuous. However, if we do assume that $\tilde{L}$ is continuous then it turns out that $L(x_1)$ and $L$ are continuous and so $L$ belongs to $L(E_1,L(E_2,F))$. Therefore we see that $L \mapsto \tilde{L}$ gives a natural bijection between $L(E_1,L(E_2,F))$ and the space $M^2(E_1 \times E_2;F)$ of continuous bilinear maps $E_1 \times E_2 \to F$, and moreover it is routine to verify that when $M^2$ is made into a linear space by pointwise addition and scalar multiplication this bijection is a linear isomorphism.
In a similar way, the space
\[ L(E_1,L(E_2,L(E_3,\ldots,L(E_{n-1},L(E_n,F))\ldots))) \]
can be seen to be linearly isomorphic to the space
\[ M^n(E_1 \times E_2 \times \ldots \times E_n;F) \]
of continuous n-multilinear maps with pointwise addition and scalar multiplication: the correspondence is $L \mapsto \tilde{L}$ where
\[ \tilde{L}(x_1,x_2,\ldots,x_n) = (\ldots((L(x_1))(x_2))\ldots(x_{n-1}))x_n . \]
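The correspondence between nested linear maps and multilinear maps is the same device that programmers call currying. The sketch below is purely illustrative: `curry` and `uncurry` are hypothetical helper names, and the multiplication map on $\mathbb{R}^3$ is used as the sample multilinear map. It shows the passage in both directions at the level of plain functions, ignoring all questions of continuity.

```python
def curry(M, n):
    # Turn M(x1, ..., xn) into L with L(x1)(x2)...(xn) = M(x1, ..., xn).
    def step(args):
        if len(args) == n:
            return M(*args)
        return lambda x: step(args + (x,))
    return step(())


def uncurry(L, n):
    # Turn L, taking one argument at a time, back into a map of n arguments.
    def M(*xs):
        if len(xs) != n:
            raise ValueError("expected %d arguments" % n)
        out = L
        for x in xs:
            out = out(x)
        return out
    return M


def multiply(x1, x2, x3):
    # The multiplication map R x R x R -> R, a 3-multilinear map.
    return x1 * x2 * x3


L = curry(multiply, 3)
nested_value = L(2)(3)(4)           # (L(x1))(x2) applied to x3
flat_value = uncurry(L, 3)(2, 3, 4)  # the multilinear map recovered from L
```

Applying `curry` and then `uncurry` returns a function with the same values as the original, which is the bijection described above.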
This now gives us a much less clumsy notation for dealing with derivatives. Instead of
\[ (\ldots((D^rf(x)(h_1))(h_2))\ldots(h_{r-1}))h_r \]
we simply write
\[ D^rf(x)(h_1,h_2,\ldots,h_r) , \]
thinking of $D^rf$ here as a map
\[ U \to M^r(E \times E \times \ldots \times E;F) . \]
When $E = \mathbb{R}^n$ and $F = \mathbb{R}$, so that we are dealing with a real-valued function of $n$ variables, the expression for $D^rf(x)(h,h,\ldots,h)$ becomes
\[ \sum a_{ij\ldots p}(x)\, h_ih_j\ldots h_p \quad\text{where}\quad a_{ij\ldots p} = \frac{\partial^r f}{\partial x_i \partial x_j \ldots \partial x_p} \]
and the summation is over all r-tuples (with possible repeats) of the integers from $1$ to $n$. When $n = 1$ this reduces to
\[ D^rf(x)(h,h,\ldots,h) = f^{(r)}(x)\, h^r \]
where $f^{(r)}$ is the usual $r$th derivative of a function of one variable. Formally, we see that the relationship between the abstract interpretation of $f^{(r)}(x)$ as a number and $D^rf(x)$ as a multilinear map is that
\[ f^{(r)}(x) = D^rf(x)(1,1,\ldots,1) . \]
In other words, we need only one coefficient to describe the effect of an r-multilinear map
\[ M : \mathbb{R} \times \mathbb{R} \times \ldots \times \mathbb{R} \to \mathbb{R} \]
on $(h,h,\ldots,h)$, and when $M = D^rf(x)$ this coefficient is $f^{(r)}(x)$.

We remark here that, as with linear maps, every multilinear map is automatically continuous when we work with finite-dimensional spaces. Therefore explicit references to the continuity of $D^rf(x)$ as a multilinear map can be conveniently dispensed with when $f$ is a map between euclidean spaces.
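The particular case $r = 2$. Here $D^2f(x)$ is thought of as a bilinear map $E \times E \to F$. When $E = \mathbb{R}^n$ and $F = \mathbb{R}$ then $D^2f(x)(\cdot,\cdot)$ is a quadratic form on $\mathbb{R}^n$. Given a choice of basis in $\mathbb{R}^n$ any such bilinear map $B : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ can be represented by a matrix $Q = (q_{ij})$, where
\[ B(u,v) = u^TQv = \sum_{i,j=1}^{n} u_i q_{ij} v_j . \]
When $B = D^2f(x)$ this matrix $Q$ is the matrix of second partial derivatives of $f$ at $x$:
\[ \begin{pmatrix}
\dfrac{\partial^2 f}{\partial x_1 \partial x_1}(x) & \dfrac{\partial^2 f}{\partial x_1 \partial x_2}(x) & \cdots & \dfrac{\partial^2 f}{\partial x_1 \partial x_n}(x) \\
\vdots & \vdots & & \vdots \\
\dfrac{\partial^2 f}{\partial x_n \partial x_1}(x) & \dfrac{\partial^2 f}{\partial x_n \partial x_2}(x) & \cdots & \dfrac{\partial^2 f}{\partial x_n \partial x_n}(x)
\end{pmatrix} \]
This (now depending on $x$) is called the Hessian matrix of $f$ at $x$. It depends on the choice of co-ordinates in $\mathbb{R}^n$, whereas the original bilinear map
\[ D^2f(x) : E \times E \to \mathbb{R} \]
does not.

It is well known that for a point $x$ to be a local maximum or minimum for $f$ it is necessary that the partial derivatives
\[ \frac{\partial f}{\partial x_1}(x),\ \frac{\partial f}{\partial x_2}(x),\ \ldots,\ \frac{\partial f}{\partial x_n}(x) \]
should all vanish, and then in order to determine whether $x$ is a maximum or is a minimum it is necessary to look closely at the properties of the Hessian matrix at $x$. We shall return to this in detail in §2.9.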
It is a standard fact from elementary calculus that when $f : \mathbb{R}^n \to \mathbb{R}$ is $C^2$ we always have
\[ \frac{\partial^2 f}{\partial x_i \partial x_j}(x) = \frac{\partial^2 f}{\partial x_j \partial x_i}(x) , \qquad i,j = 1,2,\ldots,n, \]
which means that the Hessian matrix is symmetric. This is the same as saying that the bilinear map $D^2f(x) : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ is symmetric, i.e.
\[ D^2f(x)(u,v) = D^2f(x)(v,u) \]
for all $u,v$ in $\mathbb{R}^n$. More generally, if $f$ is $C^r$ then each $r$th partial derivative is independent of the order of differentiation. The corresponding co-ordinate-free version of this is that the r-multilinear map $D^rf(x)$ is symmetric, i.e. independent of the relative ordering of the components of $E \times E \times \ldots \times E$. A proof of this fact can be found in Dieudonné [32], for example.

Note in passing that of the two examples of multilinear maps given above, the multiplication example is symmetric but the determinant example is not.
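The symmetry of the Hessian can be observed numerically. The following is a rough sketch under stated assumptions: the sample function $f(x_1,x_2) = x_1^2x_2 + x_2^3$ and the helper names `hessian` and `bilinear` are illustrative choices, and the Hessian is approximated by central differences of the hand-computed first partial derivatives, so that entry $(i,j)$ and entry $(j,i)$ are obtained by genuinely different calculations.

```python
def hessian(grad, x, step=1e-4):
    # Approximate Q[i][j] = d/dx_j (df/dx_i) by central differences of the gradient.
    n = len(x)
    Q = [[0.0] * n for _ in range(n)]
    for j in range(n):
        plus, minus = list(x), list(x)
        plus[j] += step
        minus[j] -= step
        g_plus, g_minus = grad(plus), grad(minus)
        for i in range(n):
            Q[i][j] = (g_plus[i] - g_minus[i]) / (2 * step)
    return Q


def bilinear(Q, u, v):
    # B(u, v) = u^T Q v = sum over i, j of u_i q_ij v_j.
    return sum(u[i] * Q[i][j] * v[j]
               for i in range(len(u)) for j in range(len(v)))


# Gradient of the sample function f(x1, x2) = x1^2 x2 + x2^3 (an assumption).
grad = lambda p: [2 * p[0] * p[1], p[0] ** 2 + 3 * p[1] ** 2]

Q = hessian(grad, [1.0, 2.0])
u, v = [1.0, -1.0], [2.0, 3.0]
b_uv = bilinear(Q, u, v)
b_vu = bilinear(Q, v, u)
```

At the point $(1,2)$ the exact Hessian of this $f$ is the matrix with rows $(4, 2)$ and $(2, 12)$, and the two values `b_uv` and `b_vu` agree up to rounding, as the symmetry of $D^2f(x)$ requires.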
Remarks

1. If $f(x) = (f_1(x),f_2(x),\ldots,f_m(x))$ where $x$ is in $\mathbb{R}^n$ then it is straightforward to prove that $f$ is $C^r$ ($r \geq 1$) precisely when all the $r$th mixed partial derivatives of all $f_i$ exist and are continuous. (See Remark 2, §2.4.)

2. An n-multilinear map $E \times E \times \ldots \times E \to \mathbb{R}$ is sometimes called a covariant tensor of rank $n$ on $E$. When $E = \mathbb{R}^m$ a covariant tensor of rank $n$ is specified by $m^n$ coefficients.

3. In the discussion of multilinear maps we have avoided mentioning topologies for $M^n(E_1 \times E_2 \times \ldots \times E_n;F)$. Strictly speaking we can manage without these since, as we showed, the $M^n$ spaces are just alternative versions of the complicated spaces of continuous linear maps for which at least in principle we already have topologies derived from norms: see §2.3. Nevertheless, it is useful to see how to make $M^n$ into a normed linear space explicitly. By analogy with the theory for continuous linear (= 1-multilinear) maps, it can be shown easily that an n-multilinear map
\[ M : E_1 \times E_2 \times \ldots \times E_n \to F \]
is continuous if and only if there exists a constant $K$ such that
\[ \|M(x)\| \leq K\, \|x_1\|\,\|x_2\| \ldots \|x_n\| \]
for all $x = (x_1,x_2,\ldots,x_n)$. If we then take $\|M\|$ to be the least such $K$, i.e.
\[ \|M\| = \sup\{\, \|M(x)\| \text{ with } \|x_i\| = 1 \text{ for } i = 1,2,\ldots,n \,\} \]
then this does define a norm for $M^n(E_1 \times E_2 \times \ldots \times E_n;F)$. What is more, this is in a sense the most reasonable norm to choose, since the natural bijection between $M^n$ and the corresponding space of linear maps $L(E_1,L(E_2,\ldots$ etc.$))$ is then not only a linear isomorphism but is also norm-preserving.
Properties of higher derivatives

Corresponding to the properties of the derivative mentioned in §2.5 we have the following properties of higher derivatives:

(i) Linear combinations of $C^r$ maps are $C^r$, with
\[ D^r(\alpha f + \beta g) = \alpha D^rf + \beta D^rg . \]

(ii) Linear maps $L$ are $C^\infty$, with $DL(x) = L$ for every $x$, and $D^rL = 0$ for all $r \geq 2$. This is because $DL(x) = L$ does not vary with $x$, i.e. $DL$ is a constant map.

(iii) Bilinear maps $B$ are $C^\infty$, with $DB$ as given in §2.5 and $D^2B$ given by
\[ D^2B(x)((h_1,h_2),(k_1,k_2)) = B(h_1,k_2) + B(k_1,h_2) \]
and $D^rB = 0$ for all $r \geq 3$. To see this, observe that $DB(x)$ is linear in $x$. This means that
\[ D^2B(x) = D(DB)(x) = DB \]
for all $x$ and so
\[ D^2B(x)(h,k) = (D^2B(x).h).k \ \text{(by definition)} \ = DB(h).k \]
as claimed. Since by the above $D^2B(x)$ does not vary with $x$ we have $D^rB = 0$ for all $r \geq 3$.

In a similar way it follows that n-multilinear maps $M$ are $C^\infty$, and
\[ D^nM(x)(h,k,\ldots,p) = \sum_{\sigma} M(h_{\sigma(1)},k_{\sigma(2)},\ldots,p_{\sigma(n)}) \]
where there are $n$ entries $h,k,\ldots,p$ and the sum is over all permutations $\sigma$ of $\{1,2,\ldots,n\}$. Since this is independent of $x$ it follows that $D^rM = 0$ for all $r \geq n+1$.
(iv) The cartesian product of two $C^r$ maps is $C^r$, and all derivatives just operate co-ordinate-wise.

(v) The composition of two $C^r$ maps is $C^r$. This comes from the Chain Rule together with induction on $r$. We have
\[ D(g \circ f)(x) = Dg(f(x)) \circ Df(x) \]
so if $f$ and $g$ are $C^r$ then $Df$ and $Dg$ are $C^{r-1}$ and the induction hypothesis (that the result holds for $r-1$) ensures that $Dg(f(x))$ is a $C^{r-1}$ function of $x$. Then we use the bilinearity of the map
\[ L(E,F) \times L(F,G) \to L(E,G) : (A,B) \mapsto B \circ A \]
and the Chain Rule and induction hypothesis again to see that $D(g \circ f)$ is $C^{r-1}$ and so $g \circ f$ is $C^r$. There is a formula for $D^r(g \circ f)$ in terms of derivatives of $f$ and $g$, but it is cumbersome and not worth trying to memorize.
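The formula for $D^2B$ in property (iii) can be checked on a concrete bilinear map. This is a minimal sketch, assuming for illustration that $B$ is the dot product on $\mathbb{R}^2 \times \mathbb{R}^2$ and using hypothetical helper names. Since $B$ is a homogeneous quadratic function of the point $x = (x_1,x_2)$, the second difference $B(x+h+k) - B(x+h) - B(x+k) + B(x)$ equals $D^2B(x)(h,k)$ exactly, so integer inputs give an exact comparison.

```python
def dot(a, b):
    return sum(s * t for s, t in zip(a, b))


def B(x):
    # x = (x1, x2) is a point of E1 x E2; here B is the dot product (an assumption).
    x1, x2 = x
    return dot(x1, x2)


def add(x, y):
    # Componentwise sum of two points of E1 x E2.
    return tuple(tuple(s + t for s, t in zip(xi, yi)) for xi, yi in zip(x, y))


def second_difference(q, x, h, k):
    # For a quadratic map q this equals D^2 q(x)(h, k) exactly.
    return q(add(add(x, h), k)) - q(add(x, h)) - q(add(x, k)) + q(x)


def second_derivative_formula(h, k):
    # The claimed value B(h1, k2) + B(k1, h2).
    (h1, h2), (k1, k2) = h, k
    return dot(h1, k2) + dot(k1, h2)


x = ((1, 2), (3, 4))
h = ((1, 0), (2, -1))
k = ((0, 5), (1, 1))

from_difference = second_difference(B, x, h, k)
from_formula = second_derivative_formula(h, k)
```

The result does not depend on the base point `x`, in line with the statement that $D^2B(x)$ does not vary with $x$.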
2.7 GERMS AND JETS

If $f$ and $g$ are two maps which agree on some neighbourhood of a point $x$ then all their derivatives at that point are the same. Therefore if we are interested in trying to deduce the local behaviour of a map from information about its derivatives at $x$ we need not be concerned with the precise nature of the map away from $x$ but could equally well consider any map coinciding with $f$ on some neighbourhood of $x$. This leads naturally to the idea of the germ of a map.

Let $T$ be a topological space and let $S$ be any set. Let $f : U \to S$ and $g : V \to S$ be maps with domains $U,V$ open sets in $T$, and suppose $x$ lies in $U \cap V$. Then $f$ and $g$ are said to be germ-equivalent at $x$ if there exists some open neighbourhood $W$ of $x$ lying inside $U \cap V$ such that $f|W = g|W$, i.e. $f$ and $g$ coincide on $W$. This is an equivalence relation on the set of all maps defined on neighbourhoods of $x$ in $T$ and with values in $S$, and the equivalence classes are called germs of maps at $x$. If $S$ is a topological space also then we can consider germs of continuous maps, and if $T,S$ are normed linear spaces $E,F$ we can consider germs at $x$ of $C^r$ maps. We denote the set of these germs by $\mathcal{E}^r_x(E,F)$.
In any of these cases, if $S$ is a linear space we can define pointwise addition and scalar multiplication of maps from a given domain into $S$, and this goes through to the germ level to give the set of germs at $x$ the structure of a linear space. If $S$ also has a multiplicative structure which interacts reasonably with addition and scalar multiplication then this, too, will go through to a similar structure on the set of germs at $x$. In particular, if $S$ is a ring or an algebra (whose definitions we will not go into here) then the set of germs at $x$ becomes a ring or an algebra.

The cases which will mainly interest us are those for which $T = \mathbb{R}^n$ and $S = \mathbb{R}^m$. Taking $x$ without loss of generality to be the origin in $\mathbb{R}^n$, we see that the set $\mathcal{E}^r_0(\mathbb{R}^n,\mathbb{R}^m)$ forms a linear space since the codomain $\mathbb{R}^m$ does and linear combinations of $C^r$ maps are $C^r$ (property (i) in §2.6). In the case $m = 1$ the codomain $\mathbb{R}$ is a ring and even an algebra, and so $\mathcal{E}^r_0(\mathbb{R}^n,\mathbb{R})$ forms a ring (algebra): here we use properties (iii) and (v) of §2.6 to show that the pointwise product of two $C^r$ functions is $C^r$. The ring $\mathcal{E}^\infty_0(\mathbb{R}^n,\mathbb{R})$ of $C^\infty$ germs is often denoted simply $\mathcal{E}_n$ or $\mathcal{E}(n)$. When studying local properties of $C^\infty$ functions of $n$ variables we will really be studying properties of elements of $\mathcal{E}_n$.
As we have said, if two $C^\infty$ maps are germ-equivalent at $x$ then all their derivatives at $x$ are the same. This means that we can talk about the derivatives at $x$ of a given germ, rather than of a specific map representing that germ. The basic question then is: to what extent does knowledge of all the derivatives of a germ at $x$ provide knowledge about the germ itself? Can the germ be reconstructed, given all its derivatives at $x$?
The most elementary result in this direction is the Mean Value Theorem (§2.5) which for $f : \mathbb{R} \to \mathbb{R}$ states that
\[ f(x+h) = f(x) + hf'(\xi) \]
for some $\xi$ lying between $x$ and $x+h$. Of course, this involves $f'(\xi)$ which is not a derivative at $x$. Therefore
\[ f(x+h) = f(x) + hf'(x) + \text{error term}. \]
Further manipulations with the Mean Value Theorem (as in any standard book on calculus) show that the error term can be expressed as
\[ \frac{h^2}{2!} f''(\xi_2) \]
for some $\xi_2$ lying between $x$ and $x+h$. Again, this involves a derivative elsewhere than at $x$, and so if we insist on working with derivatives at $x$ we must write
\[ f(x+h) = f(x) + hf'(x) + \frac{h^2}{2!} f''(x) + \text{error term}. \]
Once more the Mean Value Theorem can be used to show that the error term will have the form
\[ \frac{h^3}{3!} f'''(\xi_3) \]
for some $\xi_3$ between $x$ and $x+h$. This process of pushing the error term further to the right can be continued indefinitely, although there is no reason to believe that the error term will become any smaller as this happens. Indeed, it can easily get larger.
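A small experiment shows both kinds of behaviour of the error term. The sketch below is illustrative only and assumes the sample function $f(x) = \log(1+x)$ expanded about $x = 0$, for which $f^{(k)}(0) = (-1)^{k-1}(k-1)!$, so the term of order $k$ is $(-1)^{k-1}h^k/k$; the function names are hypothetical.

```python
import math


def taylor_polynomial_log1p(h, n):
    # Sum of the terms of order 1..n of the expansion of log(1 + h) about 0.
    return sum((-1) ** (k - 1) * h ** k / k for k in range(1, n + 1))


def errors(h, orders):
    # Absolute error of each Taylor polynomial against the true value.
    exact = math.log(1 + h)
    return [abs(exact - taylor_polynomial_log1p(h, n)) for n in orders]


orders = range(1, 11)
small_step = errors(0.5, orders)  # the error shrinks as more terms are taken
large_step = errors(2.0, orders)  # the error grows as more terms are taken
```

With $h = 0.5$ each extra term improves the approximation, whereas with $h = 2$ each extra term makes it worse, even though the function itself is perfectly well behaved at $x + h = 2$.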
The true state of affairs is described by Taylor's Theorem, which says that if we write
\[ f(x+h) = f(x) + hf'(x) + \frac{h^2}{2!} f''(x) + \ldots + \frac{h^n}{n!} f^{(n)}(x) + \text{error term} \]
then the error term, or remainder term, has the form
\[ \frac{h^{n+1}}{(n+1)!} f^{(n+1)}(\xi) \]
[...] the error term $R_n(h)$ has the form
\[ \frac{1}{(n+1)!} D^{n+1}f(\xi)(h,h,\ldots,h) \]
for some point $\xi$ between $x$ and $x+h$.

Remember that here $D^rf(x)$ is an r-multilinear map
\[ E \times E \times \ldots \times E \to \mathbb{R} \]
and so $D^rf(x)(h,h,\ldots,h)$ is a real number. Recall also that when $E$ is euclidean space $\mathbb{R}^n$ we can for purposes of calculations write $D^rf(x)(h,h,\ldots,h)$ as
\[ \sum \frac{\partial^r f}{\partial x_i \partial x_j \ldots \partial x_p}(x)\, h_ih_j\ldots h_p \]
where the sum is over all ordered r-tuples $i,j,\ldots,p$ of the integers from $1$ to $n$.
The proof of the general version of the theorem is surprisingly simple, given the one-dimensional version. We write
\[ F(t) = f(x + th) \]
and apply the theorem to $F$ at $t = 0$. First of all
\[ F'(t) = DF(t).1 = Df(x + th).h \]
by the Chain Rule, and then since the map $L(E,\mathbb{R}) \to \mathbb{R}$ taking $A$ to $A.h$ is linear for each fixed $h$ the Chain Rule applied again gives
\[ F''(t) = D^2f(x + th)(h,h) \]
and in general
\[ F^{(r)}(t) = D^rf(x + th)(h,h,\ldots,h) . \]
Now writing
\[ F(1) = F(0) + F'(0) + \frac{1}{2!} F''(0) + \ldots + \frac{1}{n!} F^{(n)}(0) + \frac{1}{(n+1)!} F^{(n+1)}(\xi) \]
[...] stated for $f$.

If it is the case that $R_n(h)$ does approach zero as $n$ increases then we are, by definition of convergence for infinite series, entitled to write
\[ f(x+h) = f(x) + Df(x).h + \frac{1}{2!} D^2f(x)(h,h) + \ldots = \sum_{n=0}^{\infty} \frac{1}{n!} D^nf(x)(h,h,\ldots,h) , \]
where $D^0f$ means $f$. The infinite series on the right is called the Taylor series for (the germ of) $f$ at $x$.
This definition can be generalized directly to maps $f : E \to F$, since the series itself (now a series of elements in $F$) does not depend on the Mean Value Theorem. If $f$ is $C^\infty$ the Taylor series always exists, but

(1) it may diverge, or
(2) it may converge, but to something other than $f(x+h)$.

If neither of these happens, and the Taylor series converges to $f(x+h)$ for all $h$ in some neighbourhood of the origin, then $f$ is said to be analytic at $x$. For $E = \mathbb{R}^n$, $F = \mathbb{R}^m$ this means that $f_i(x+h)$ can be expressed as a convergent power-series in the co-ordinates of $h$ for sufficiently small $h$, for each component $f_i$ of $f$. Analyticity is sometimes called class $C^\omega$.

Analytic maps are convenient to work with, since we can invoke the whole machinery of differentiation of power-series term by term, inversion of power series, etc., of which details can be found in almost any analysis text. Furthermore, powerful results from the theory of complex functions can also be used. What is perhaps even more significant for practical purposes is the fact that if we replace $f(x+h)$ by
\[ f(x) + Df(x).h + \frac{1}{2!} D^2f(x)(h,h) + \ldots + \frac{1}{n!} D^nf(x)(h,h,\ldots,h) \]
which is a polynomial in $h$, then we can be confident that by choosing $n$ large enough the error in taking this approximation will be insignificantly small.

Despite these enticing advantages of analytic functions, implicitly relied upon in much mathematical modelling in physics, engineering and so on, there are two serious objections - one practical and one philosophical - to working with analytic functions. The practical objection is that functions do arise in real life which look harmless but are in fact not analytic even though they are $C^\infty$. A standard example is the function $f : \mathbb{R} \to \mathbb{R}$ given by

f(x) = [...] , x > 0
f(x) = 0 , x [...]

This is not analytic at $0$ [...] (induction) [...]

[...] $\mathbb{R}^n = \mathbb{R}^m \times \mathbb{R}^{n-m}$ [...] $(f(u,v),v)$ [...] has matrix [...] which is non-singular [...] Inverse Function Theorem [...] germ at $0$ [...] $L(u,v)$ [...] Hence $f$ is right equivalent to $A$, proving Corollary 1.
Proof of Corollary 2

Again we suppose $x = 0$ in $\mathbb{R}^n$, $f(x) = 0$ in $\mathbb{R}^m$ and we choose co-ordinates such that $A$ has matrix
\[ \begin{pmatrix} B \\ C \end{pmatrix} \]
where now $B$ is an $n \times n$ non-singular matrix and $C$ is an $(m-n) \times n$ matrix. We write $\mathbb{R}^m$ as $\mathbb{R}^n \times \mathbb{R}^{m-n}$ with corresponding co-ordinates $(x,y)$ and this time define a map $G : \mathbb{R}^n \times \mathbb{R}^{m-n} \to \mathbb{R}^m$ by
\[ G(x,y) = f(x) + (0,y) . \]
Then $DG(0) = M$ has matrix
\[ \begin{pmatrix} B & 0 \\ C & I \end{pmatrix} \]
which is non-singular and so by the Inverse Function Theorem $G$ is locally a diffeomorphism. Let $\beta$ be the germ at $0$ of $M \circ G^{-1}$. Then
\[ M(x,0) = \beta G(x,0) = \beta f(x) \]
giving $\beta \circ f = A$ since
\[ M(x,0) = \begin{pmatrix} Bx \\ Cx \end{pmatrix} = Ax . \]
Hence $f$ is left equivalent to $A$, proving Corollary 2. See Figure 16.
[Figure 16]

The condition that $A$ is injective means that the kernel of $A$ is zero, i.e. the only n-vector $u$ satisfying $Au = 0$ is the zero vector. This means that the only linear combination
\[ \sum_{i=1}^{n} u_ic_i \]
of the columns $c_i$ of $A$ (as a matrix) which is zero is the one with all $u_i = 0$, or in other words the columns of $A$ are linearly independent. Correspondingly, $A$ is surjective precisely when the rows of $A$ are linearly independent. The rank of a matrix is the largest number of linearly independent rows, or alternatively, columns (the answer is the same). Thus the two corollaries above can be summarized as follows:

If $Df(x)$ has maximum rank then $f$ looks near $x$ like the linear map $Df(x)$ near $0$.
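The statements about rank can be tested on small matrices. The following sketch is an illustration under assumptions: `rank` is a hypothetical helper that performs Gaussian elimination in exact rational arithmetic, and the sample matrices are arbitrary. It checks that row rank and column rank agree, and reads off injectivity and surjectivity from the rank.

```python
from fractions import Fraction


def rank(matrix):
    # Rank by Gaussian elimination over the rationals (exact arithmetic).
    rows = [[Fraction(v) for v in row] for row in matrix]
    ncols = len(rows[0]) if rows else 0
    r = 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def is_injective(matrix):
    # Columns linearly independent: rank equals the number of columns.
    return rank(matrix) == len(matrix[0])


def is_surjective(matrix):
    # Rows linearly independent: rank equals the number of rows.
    return rank(matrix) == len(matrix)


A = [[1, 0], [0, 1], [1, 1]]   # a linear map R^2 -> R^3
C = [[1, 2, 3], [2, 4, 6]]     # a linear map R^3 -> R^2 with dependent rows
```

Here `A` is injective but not surjective, while `C` has rank 1 and is neither, so `C` does not have maximum rank.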
Remarks

1. In infinite dimensions we cannot talk about ranks, so we must stick to the first formulation of the corollaries. We also have to add in an extra condition to ensure that the splitting of $E$ or $F$ into two suitable 'co-ordinates' $(x,y)$ is still valid: then the proofs are the same as in the finite-dimensional case. Explicitly, the condition is that in Corollary 1 we should be able to split $E$ into a direct sum $E_1 \oplus E_2$ (see page 42) with $E_1 = \ker A$ and where $E_1,E_2$ are closed linear subspaces of $E$ and $A$ takes $E_2$ isomorphically onto $F$; similarly in Corollary 2 we should be able to split $F$ into $F_1 \oplus F_2$ where $E$ is taken by $A$ isomorphically onto $F_1$. These splitting conditions can be shown to be automatically satisfied when $E$ is finite-dimensional in Corollary 1 or $F$ is finite-dimensional in Corollary 2.

2. In the context of Corollary 1 we can define a $C^r$ map (germ) $\phi : \mathbb{R}^{n-m} \to \mathbb{R}^m$ by
\[ \phi(y) = \text{first co-ordinate of } \gamma(-B^{-1}Cy,y) , \]
where the inspiration for the expression on the right is that the linear subspace of $\mathbb{R}^n$ consisting of all elements of the form $(-B^{-1}Cy,y)$ is precisely the kernel of $A$. It then follows that
\[ f(\phi(y),y) = f\gamma(-B^{-1}Cy,y) \]
because $\gamma$ preserves the $y$ co-ordinate (because $L$ and $F$ do),
\[ = A(-B^{-1}Cy,y) = 0 \]
by design.

This version of Corollary 1 is called the Implicit Function Theorem, since it says that if
\[ f : \mathbb{R}^n = \mathbb{R}^m \times \mathbb{R}^{n-m} \to \mathbb{R}^m \]
is a $C^r$ map with $f(0) = 0$ and such that the derivative of $f$ restricted to the first factor is an isomorphism (represented by the matrix $B$ above), then there exists a $C^r$ map $\phi : \mathbb{R}^{n-m} \to \mathbb{R}^m$ satisfying
\[ f(\phi(y),y) = 0 \]
for all $y$ in some neighbourhood of the origin, or in other words putting $f = 0$ gives the first co-ordinate as an implicit function $\phi$ of the second co-ordinate. Note that when $n = 2$, $m = 1$ [...] and the derivative condition is simply that [...]
DEFINITION
A singular point of $f$ is a point $x$ at which $Df(x)$ does not have maximal rank, i.e. where $Df(x)$ is neither injective nor surjective. The germ of $f$ at a singular point is called a singularity.

Thus at a singular point the derivative may not give a good local qualitative picture of the map itself.

Remark The term singularity is used in many different fields to mean a variety of different things - such as a point where $f$ is not differentiable, or not continuous, or not defined. It may also refer to a region of space with bizarre geometric properties, such as a 'black hole' in cosmology. Here we will keep the word to mean the germ of a differentiable map at a singular point.
EXAMPLES of singularities

1. In the case $n = m = 1$ the derivative $Df(x)$ is a linear map $\mathbb{R} \to \mathbb{R}$ which must either be surjective or be the zero map. Thus a singular point of $f : \mathbb{R} \to \mathbb{R}$ is a point $x$ where $f'(x) = 0$, called a critical point or stationary point. For example, $f : \mathbb{R} \to \mathbb{R}$ given by $f(x) = x^2$ has a singularity at $x = 0$ and nowhere else.

2. The previous example generalizes immediately to the case $m = 1$ for any $n$, i.e. to functions $f : \mathbb{R}^n \to \mathbb{R}$. Here the derivative $Df(x)$ is the linear map $\mathbb{R}^n \to \mathbb{R}$ represented by the $1 \times n$ matrix
\[ \left( \frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \ldots, \frac{\partial f}{\partial x_n} \right) \]
and so the only way in which it can fail to have maximal rank (= 1) is by the vanishing of all the partial derivatives $\partial f/\partial x_i$. Again, this is called a critical point or stationary point of $f$. A standard example is $f : \mathbb{R}^n \to \mathbb{R}$ given by
\[ f(x) = x_1^2 + x_2^2 + \ldots + x_n^2 ; \]
this has a singularity at $0$ in $\mathbb{R}^n$ but nowhere else. As we shall see later, this is in a certain sense the most 'nonsingular' type of singularity.
3. A constant map $f : \mathbb{R}^n \to \mathbb{R}^m$ (i.e. $f(x) = c$ for all $x$ in $\mathbb{R}^n$) is the most singular kind of map imaginable. Here the derivative $Df(x)$ is the zero linear map $\mathbb{R}^n \to \mathbb{R}^m$ for each $x$, and the whole of the structure of $\mathbb{R}^n$ is crushed by $f$ into a single point of $\mathbb{R}^m$.

4. As a geometrically more interesting kind of singularity consider $f : \mathbb{R}^2 \to \mathbb{R}^2$ given by
\[ f(x_1,x_2) = (x_1,\ x_2^3 + x_1x_2) . \]
Here $Df(x)$ is represented by the $2 \times 2$ matrix
\[ \begin{pmatrix} 1 & 0 \\ x_2 & 3x_2^2 + x_1 \end{pmatrix} \]
and so the rank is less than maximal precisely when
\[ 3x_2^2 + x_1 = 0 \]
which is the equation of a parabola in the domain $\mathbb{R}^2$. The whole map can be described pictorially as in Figure 17.
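The singular set in Example 4 is easy to confirm by direct calculation. The sketch below is illustrative and uses hypothetical function names; the Jacobian matrix is the one written out in the example, and integer inputs keep the arithmetic exact.

```python
def f(x1, x2):
    # The map of Example 4.
    return (x1, x2 ** 3 + x1 * x2)


def jacobian(x1, x2):
    # Matrix of Df(x): rows are the partial derivatives of each component.
    return [[1, 0],
            [x2, 3 * x2 ** 2 + x1]]


def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def is_singular_point(x1, x2):
    # Df(x) fails to have maximal rank exactly when its determinant vanishes.
    return det2(jacobian(x1, x2)) == 0


on_parabola = [(-3 * t ** 2, t) for t in range(-3, 4)]   # points with 3*x2^2 + x1 = 0
off_parabola = [(1, 1), (0, 2), (-3, 2), (5, 0)]
```

Every point of `on_parabola` is singular and no point of `off_parabola` is, in agreement with the equation $3x_2^2 + x_1 = 0$.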
Critical points of real-valued functions

We will now return to the case when $m = 1$, in other words when $f$ is one real function of $n$ variables and singular points mean critical points, i.e. points at which all the partial derivatives of $f$ vanish. To understand how $f$ behaves near a critical point, it is natural to look at the next available 'approximate' information about $f$, namely its second derivative. Recall that this second derivative $D^2f(x)$ can be regarded as a symmetric bilinear map
\[ D^2f(x) : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R} \]
or as a quadratic form, represented by the Hessian matrix
\[ H_f(x) = \left( \frac{\partial^2 f}{\partial x_i \partial x_j}(x) \right) \]
of $f$ at $x$. From experience with the Inverse Function Theorem it seems reasonable that we will be able to extract the most local information about $f$ from its second derivative when this Hessian matrix is non-singular.
DEFINITION
A critical point $x$ of $f : U \to \mathbb{R}$ is called degenerate or non-degenerate according to whether the determinant of the Hessian matrix of $f$ at $x$ does or does not vanish.
Remarks

1. The Hessian matrix itself depends on a choice of co-ordinates in $\mathbb{R}^n$, but the vanishing or otherwise of its determinant does not. This is easy to verify by applying a co-ordinate change and then using the Chain Rule twice. However, the real reason for this is that the vanishing or non-vanishing of the determinant of $H_f(x)$ corresponds to the degeneracy or non-degeneracy of the bilinear map $D^2f(x)$. An arbitrary bilinear map $B : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ is said to be degenerate if there exists some non-zero $u$ in $\mathbb{R}^n$ with the property that $B(u,v) = 0$ for every $v$ in $\mathbb{R}^n$; otherwise it is non-degenerate. Thus the critical point $x$ of $f$ is degenerate or non-degenerate according to whether $D^2f(x)$ is a degenerate or non-degenerate bilinear map.

2. In view of the previous remark, we can extend the theory of critical points to functions on any normed linear space $E$. The point $x$ is a critical point of $f : E \to \mathbb{R}$ when $Df(x)$ is the zero linear map, and is degenerate or non-degenerate according to the degeneracy or non-degeneracy of the bilinear map $D^2f(x) : E \times E \to \mathbb{R}$.
The Inverse Function Theorem allowed us to change co-ordinates to bring $f$ locally into linear form, provided the derivative was as well-behaved as possible. We might next hope for a theorem saying that even at a critical point we can change co-ordinates to bring $f$ locally into some agreeable standard form, provided that the critical point is non-degenerate. There is such a theorem. It is called the Morse Lemma, after Marston Morse (b. 1892) who gave it as part of a general theory of the structure of real-valued functions, a theory which has had far-reaching implications in differential topology. Proofs may be found in many books, such as Golubitsky and Guillemin [42] or Milnor [82].
THEOREM (Morse Lemma)
Suppose $p$ is a non-degenerate critical point of $f$. Then the germ of $f$ at $p$ is right equivalent to the germ at $0$ of a function $\phi$ of the form
\[ (x_1,x_2,\ldots,x_n) \mapsto f(p) + \epsilon_1x_1^2 + \epsilon_2x_2^2 + \ldots + \epsilon_nx_n^2 \]
where each $\epsilon_i$ is $\pm 1$.

It is clear that $\phi$ does have a non-degenerate critical point at the origin, since
\[ D\phi(x) = (2\epsilon_1x_1, 2\epsilon_2x_2, \ldots, 2\epsilon_nx_n) \]
(as a $1 \times n$ matrix) which vanishes at $x = 0$, and $H_\phi(0)$ is the $n \times n$ diagonal matrix
\[ \begin{pmatrix} 2\epsilon_1 & & & \\ & 2\epsilon_2 & & \\ & & \ddots & \\ & & & 2\epsilon_n \end{pmatrix} \]
which is non-singular. The Morse Lemma says that this is the archetypal example, every other function near a non-degenerate critical point being reducible to one of this type by a local change of co-ordinates.

The number of $\epsilon_i$ which are $-1$ is called the index of the critical point $p$. It depends only on $f$ near $p$, and not on the particular way of choosing new co-ordinates; it is the number of negative eigenvalues of $H_f(p)$ or of $D^2f(p)$ as a linear map $\mathbb{R}^n \to L(\mathbb{R}^n,\mathbb{R}) = \mathbb{R}^n$. If the index of $p$ is $k$ we can rearrange co-ordinates so that $\phi$ has (apart from the constant $f(p)$) the form
\[ (x_1,x_2,\ldots,x_n) \mapsto x_1^2 + \ldots + x_{n-k}^2 - x_{n-k+1}^2 - \ldots - x_n^2 . \]
This allows us to gain a good geometric picture of the behaviour of $\phi$ near $0$, i.e. of $f$ near $p$. There are $n + 1$ possible values of the index (including $k = 0$) and $n + 1$ correspondingly different types of non-degenerate critical point.

n = 1    k = 0    Local form $x^2$ : minimum
         k = 1    Local form $-x^2$ : maximum

n = 2    k = 0    Local form $x_1^2 + x_2^2$ : minimum
         k = 1    Local form $x_1^2 - x_2^2$ : saddle point
         k = 2    Local form $-x_1^2 - x_2^2$ : maximum
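For $n = 2$ the classification above can be read off from the Hessian matrix at the critical point. The sketch below is illustrative only: the function names are hypothetical, the Hessian is assumed to be supplied as the three numbers $a, b, c$ of the symmetric matrix with rows $(a, b)$ and $(b, c)$, and the index is computed as the number of negative eigenvalues using the closed formula for a $2 \times 2$ symmetric matrix.

```python
import math


def eigenvalues_2x2(a, b, c):
    # Eigenvalues of the symmetric matrix [[a, b], [b, c]].
    mean = (a + c) / 2
    radius = math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    return mean - radius, mean + radius


def classify(a, b, c):
    # Returns (index, description) for a critical point with Hessian [[a, b], [b, c]].
    if a * c - b * b == 0:
        return None, "degenerate"
    index = sum(1 for lam in eigenvalues_2x2(a, b, c) if lam < 0)
    kind = {0: "minimum", 1: "saddle point", 2: "maximum"}[index]
    return index, kind


# Hessians at the origin of the three local forms, plus two further samples.
results = {
    "x1^2 + x2^2": classify(2, 0, 2),
    "x1^2 - x2^2": classify(2, 0, -2),
    "-x1^2 - x2^2": classify(-2, 0, -2),
    "x1*x2": classify(0, 1, 0),
    "x1^2": classify(2, 0, 0),
}
```

The function $x_1x_2$ is included to show a saddle point that is not yet in the standard form, and $x_1^2$ (regarded as a function of two variables) to show a degenerate critical point to which the Morse Lemma does not apply.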
See Figure 18 for pictures in the case $n = 2$. Note that a saddle point can be thought of as a minimum in one direction coupled with a maximum in another. In general there will be $(n-1)$ different types of saddle point corresponding to $k = 1,2,\ldots,n-1$.

Remarks

1. An immediate deduction from these local forms is that each non-degenerate critical point is isolated, i.e. lies in some neighbourhood containing no other critical point, degenerate or otherwise.
2. The Morse Lemma has an infinite-dimensional version, extremely useful in calculus of variations where it is necessary to consider the behaviour of a function defined on a normed linear space $E$ whose elements are themselves functions. In finite dimensions we could write the (rearranged) canonical form for $\phi$ as
\[ x \mapsto f(p) + \|(I-P)x\|^2 - \|Px\|^2 \]
where $P : \mathbb{R}^n \to \mathbb{R}^n$ is projection onto the last $k$ co-ordinates and $I$ denotes the identity map. The infinite-dimensional Morse Lemma states that, provided $E$ is a Hilbert space (page 53), in a neighbourhood of a non-degenerate critical point $p$ the function $f$ can be converted into the above form where $P$ now denotes a perpendicular projection onto some linear subspace of $E$. See Palais [92]. There do exist generalizations of this to Banach spaces which are not Hilbert spaces: see Tromba [144].
Further study of degenerate critical points

At a degenerate critical point $p$ of $f$ the Hessian matrix $H_f(p)$ is singular, which is equivalent to saying that as a linear map $\mathbb{R}^n \to \mathbb{R}^n$ the matrix has non-zero kernel, i.e. there exists at least one direction $u$ with $H_f(p).u = 0$. However, it would be reasonable to hope that if we keep perpendicular to such directions we could still apply a kind of Morse Lemma giving a standard quadratic form in some co-ordinates, leaving a completely degenerate (i.e. degenerate in all directions) function of the remaining variables. Such a generalized Morse Lemma, allowing us to choose co-ordinates so as to split the function locally into the sum of a non-degenerate 'Morse' part and a completely degenerate part, conveniently does exist. It was first published in an infinite-dimensional context by Gromoll and Meyer [45].

THEOREM (Splitting Lemma, or Gromoll-Meyer Lemma)
Suppose $p$ is a degenerate critical point of $f$. Then the germ of $f$ at $p$ is right equivalent to the germ at $0$ of a function of the form
\[ (x_1,x_2,\ldots,x_n) \mapsto f(p) + \epsilon_1x_1^2 + \ldots + \epsilon_rx_r^2 + \psi(x_{r+1},\ldots,x_n) \]
where each $\epsilon_i$ is $\pm 1$ and $\psi$ is completely degenerate, i.e. $D^2\psi(0) = 0$ (as well as $D\psi(0) = 0$).
Remark In the infinite-dimensional version when $E$ is a Hilbert space $H$ we let $K$ be the kernel of the second derivative $D^2f(p)$ regarded as a linear map $H \to L(H,\mathbb{R}) = H^*$. (In fact for a Hilbert space $H^* = H$, in much the same way that $(\mathbb{R}^n)^* = \mathbb{R}^n$; see page 247.) We assume that $K$ is finite-dimensional, so that there is only a 'finite-dimensional amount' of degeneracy, and that the image of $D^2f(p)$ is a closed subspace of $H^*$ (a technical necessity). Each point $x$ can be regarded as having co-ordinates $(w,y)$ where $y$ belongs to $K$ and $w$ is perpendicular to $K$. Then the standard form given by the Lemma is
\[ x \mapsto f(p) + \|(I-P)w\|^2 - \|Pw\|^2 + \psi(y) \]
where this time $P$ is a projection inside the space of vectors perpendicular to $K$, and $\psi$ satisfies $D\psi(0) = 0$ and $D^2\psi(0) = 0$ as an element of $L(K,K^*)$ or as a bilinear map $K \times K \to \mathbb{R}$.

In view of the Gromoll-Meyer Lemma, the problem of studying the behaviour of degenerate critical points has been narrowed down to the study of critical points at which the second derivative vanishes entirely, i.e. points at which the 2-jet vanishes (ignoring the initial constant $f(p)$). It would be pleasant if there were some kind of 'non-degeneracy' condition on the third derivative in the presence of which we could again convert the function locally into some standard form by a sort of third degree Morse Lemma. Essentially this is the case, but the details of the theory, which generalizes to functions with vanishing 3-jets, 4-jets, etc., are rather more subtle than one might at first suspect. Indeed, it is only in recent years with the work of Thom, Mather, Arnol'd and others that the classification of degenerate critical points under various conditions of relative non-degeneracy has become properly understood at all. The problem can be stated in two parts:

(1) To what extent does the first non-vanishing k-jet of $f$ determine the local behaviour of $f$?
(2) Can the first non-vanishing k-jet (which is a homogeneous polynomial in $x_1,x_2,\ldots,x_n$ of degree $k$) be put into some normal form?
The second problem is one of algebraic geometry, and can be solved more or less easily for a few low values of $k$ and $n$. For example, when $k = 2$ we have the classification of quadratic forms into standard non-degenerate types

$\varepsilon_1 x_1^2 + \varepsilon_2 x_2^2 + \ldots + \varepsilon_n x_n^2$   $(\varepsilon_i = \pm 1)$

which appeared in the Morse Lemma, and degenerate types

$\varepsilon_1 x_1^2 + \varepsilon_2 x_2^2 + \ldots + \varepsilon_r x_r^2$   $(\varepsilon_i = \pm 1,\ 0 \le r < n)$

which featured in the Gromoll-Meyer Lemma. When $k = 3$ and $n = 2$ a cubic form in $x_1, x_2$ factorizes into one or three real factors (which may coincide) giving without much effort the various standard forms

$x_1^3 + x_2^3$,  $x_1^3 - 3x_1x_2^2$,  $x_1^2x_2$,  $x_1^3$.

The first problem is that of determinacy of the k-jet (see §2.7).
Without going into the depths of the theory, we will give the result in the form of a finite procedure for testing the determinacy of the first non-vanishing jet. We may as well suppose that we are working at the origin in $R^n$, and we shall regard the first non-vanishing jet of a function germ as being a homogeneous polynomial of the appropriate degree $k$. Let $\xi$ be such a k-jet, and for $i = 1,2,\ldots,n$ let $d_i$ denote $\partial\xi/\partial x_i$, itself either zero or a homogeneous polynomial of degree $k - 1$.

DETERMINACY TEST. The jet $\xi$ as above is determined if and only if every homogeneous polynomial $q$ of degree $k + 1$ in the $x_i$ can be written as

$q = q_1d_1 + q_2d_2 + \ldots + q_nd_n$

where the $q_i$ are homogeneous polynomials of degree 2.

Since the space of homogeneous polynomials of degree $k$ is a finite-dimensional real linear space of dimension $\frac{(n+k-1)!}{(n-1)!\,k!}$ (the elements $x_1^{m_1}x_2^{m_2}\cdots x_n^{m_n}$ with $m_1 + m_2 + \ldots + m_n = k$ form a basis), the Determinacy Test reduces to a finite combinatorial problem.
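The finite procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a construction taken from the text: polynomials are stored as dictionaries from exponent tuples to coefficients, and the helper names (`monomials`, `partial`, `rank`, `is_determined`) are hypothetical. The test asks whether the products (degree 2 monomial) x $d_i$ span the whole space of homogeneous polynomials of degree $k + 1$, which is a rank computation carried out here in exact rational arithmetic.

```python
from fractions import Fraction

def monomials(n, deg):
    """All exponent tuples (m_1, ..., m_n) with m_1 + ... + m_n = deg."""
    if n == 1:
        return [(deg,)]
    return [(i,) + rest
            for i in range(deg + 1)
            for rest in monomials(n - 1, deg - i)]

def partial(poly, i):
    """Partial derivative of a polynomial {exponents: coeff} with respect to x_i."""
    out = {}
    for exps, c in poly.items():
        if exps[i] > 0:
            e = list(exps)
            e[i] -= 1
            out[tuple(e)] = out.get(tuple(e), 0) + c * exps[i]
    return out

def times_monomial(poly, mono):
    """Multiply a polynomial by a single monomial (exponents add)."""
    return {tuple(a + b for a, b in zip(exps, mono)): c
            for exps, c in poly.items()}

def rank(rows):
    """Rank of a matrix by Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in r] for r in rows]
    rk = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(len(rows)):
            if r != rk and rows[r][col] != 0:
                f = rows[r][col] / rows[rk][col]
                rows[r] = [a - f * b for a, b in zip(rows[r], rows[rk])]
        rk += 1
    return rk

def is_determined(jet, n, k):
    """Determinacy Test for a homogeneous k-jet in n variables."""
    target = monomials(n, k + 1)
    index = {m: j for j, m in enumerate(target)}
    rows = []
    for i in range(n):
        d_i = partial(jet, i)
        for mono in monomials(n, 2):          # q_i ranges over degree 2
            row = [0] * len(target)
            for exps, c in times_monomial(d_i, mono).items():
                row[index[exps]] = c
            rows.append(row)
    return rank(rows) == len(target)

# x1^3 - 3*x1*x2^2 passes the test; x1^2*x2 does not.
elliptic = {(3, 0): 1, (1, 2): -3}
degenerate = {(2, 1): 1}
```

With these definitions `is_determined(elliptic, 2, 3)` returns `True` and `is_determined(degenerate, 2, 3)` returns `False`. Exact fractions are used so that the rank is not affected by rounding.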
EXAMPLES
1. The k-jet of a constant function is not determined since each $d_i = 0$. However, this fact is obvious anyway.

2. If $Df(0) \neq 0$ (i.e. $0$ is not a critical point of $f$) then the 1-jet $\xi$ of $f$ is determined since at least one of the $d_i$ (say $d_1$) is a non-zero number and then every polynomial $q$ of degree $k+1 = 2$ can be written as $d_1$ times a polynomial of degree 2. However, we already knew this fact as a consequence of the Inverse Function Theorem in §2.8.

3. If $Df(0) = 0$ then if $A$ denotes the Hessian matrix $H_f(0)$ we have

$d_i = \sum_{j=1}^{n} A_{ij}x_j$.

If $A$ is non-singular (i.e. $0$ is a non-degenerate critical point of $f$) there is a matrix $B = A^{-1}$ such that

$x_j = \sum_{i=1}^{n} B_{ji}d_i$.

Now every homogeneous polynomial $q$ of degree 3 can be written as
n
(0)
= 0,
and then take
y = x(l +
(x))
1 /k
.
It is easy to make
argument rigorous.
Observe in contrast:

5. For $k > 1$ the k-jet $(x_1,x_2) \mapsto x_1^k$ is not determined, since $x_2^{k+1}$ cannot be written as

$q_1\frac{\partial\xi}{\partial x_1} + q_2\frac{\partial\xi}{\partial x_2} = q_1kx_1^{k-1}$.

The result is reasonable, since $x_1^k$
vanishes everywhere along a line
(the x2-axis) whereas no amount of local changes of co-ordinates
around
0
111
in
R“
x^ - x-^ dx2
will alter the fact that e.g.
also along a curve touching the More generally,
x2-axis at
vanishes
the origin.
it is easy to see both from the algebra and the
geometry that no k-jet of
f
:
Rn -> R
which is
independent of one or more
of the co-ordinates can possibly be determined.
6. The 3-jet $(x_1,x_2) \mapsto x_1^3 - 3x_1x_2^2$ is determined since $d_1 = 3x_1^2 - 3x_2^2$, $d_2 = -6x_1x_2$ and every homogeneous polynomial of degree 4 in $x_1, x_2$ can be written as $q_1d_1 + q_2d_2$ with $q_1, q_2$ of degree 2. To see this, observe that $x_1^3x_2$, $x_1^2x_2^2$, $x_1x_2^3$ are already multiples of $d_2$ and that

$x_1^4 = \frac{1}{3}x_1^2d_1 - \frac{1}{6}x_1x_2d_2$,
$x_2^4 = -\frac{1}{3}x_2^2d_1 - \frac{1}{6}x_1x_2d_2$.

However:

7. The 3-jet $(x_1,x_2) \mapsto x_1^2x_2$ is not determined, since $d_1 = 2x_1x_2$, $d_2 = x_1^2$ and so every combination of $d_1$ and $d_2$ would have $x_1$ as a factor. Thus $x_2^4$ is not of the form $q_1d_1 + q_2d_2$. This non-determinacy can be understood geometrically as follows: $x_1^2x_2$ vanishes along two lines (the co-ordinate axes) cutting at right angles, whereas $x_1^2x_2 + x_2^5$, for example, vanishes along the $x_1$-axis only. No local co-ordinate changes in $R^2$ can convert one of these configurations into the other.
So far we have considered determinacy only for homogeneous k-jets, which materialize as the first non-vanishing jet at $0$ of some function germ. It is important also to be able to test determinacy of non-homogeneous jets, since it frequently happens that although the first non-vanishing jet of $f$ may not be determined, the addition of the next term in the Taylor series of $f$ gives a jet which then is determined.

The necessary and sufficient test for determinacy of non-homogeneous jets is a little cumbersome, and in many cases it is enough to work with two tests, one necessary and one sufficient, which do not quite meet in the middle. For full details see Poston and Stewart [101].

Let $\xi$ be any (possibly non-homogeneous) k-jet, and define $d_i = \partial\xi/\partial x_i$ as before. In this case $d_i$ may be a non-homogeneous polynomial of any degree $\le k - 1$. Let the minimum degree of a polynomial denote the lowest degree of all its terms. For example, the minimum degree of $x_1^2x_2x_3 + x_1^3x_3^2$ is four.

Sufficient test. If every homogeneous polynomial $q$ of degree $k + 1$ can be written as

$q = \sum_{i=1}^{n} q_id_i + r$

where the $q_i$ are polynomials of minimum degree at least 2 and $r$ is a polynomial of minimum degree at least $k + 2$, then $\xi$ is determined.

This is similar to the earlier test (for homogeneous jets), but we now allow the $q_i$ to be non-homogeneous and discard all terms of degree $> k + 1$ which arise.
EXAMPLE
We saw previously that $(x_1,x_2) \mapsto x_1^2x_2$ is not a determined 3-jet. If now we let $\eta$ denote the 4-jet $(x_1,x_2) \mapsto x_1^2x_2 + x_2^4$ we find that $x_2^5$, and the other homogeneous polynomials of degree 5 (which have $x_1$ as a factor), can all be written in the required form (here $r = 0$). Thus $\eta$ is determined: the undetermined 3-jet $x_1^2x_2$ has been made into a determined 4-jet by adding on $x_2^4$.

Necessary test. If $\xi$ is determined then every homogeneous polynomial $q$ of degree $k + 1$ can be written as

$q = \sum_{i=1}^{n} q_id_i + r$   (*)

where the $q_i$ are polynomials of minimum degree 1 and $r$ is as before.

Clearly the sufficient test implies the necessary test, as it should do. However, there is a small gap between the two through which jets can easily fall.
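For the 4-jet $x_1^2x_2 + x_2^4$ the sufficient test can be checked by hand. The following worked identities are a reconstruction offered as a check, not a quotation from the text; they use $d_1 = 2x_1x_2$ and $d_2 = x_1^2 + 4x_2^3$, and every coefficient polynomial has minimum degree at least 2 while the remainder $r$ is zero.

```latex
\begin{align*}
x_2^5 &= \tfrac{1}{4}x_2^2\,d_2 - \tfrac{1}{8}x_1x_2\,d_1
       = \bigl(\tfrac{1}{4}x_1^2x_2^2 + x_2^5\bigr) - \tfrac{1}{4}x_1^2x_2^2, \\
x_1^5 &= x_1^3\,d_2 - 2x_1^2x_2^2\,d_1
       = \bigl(x_1^5 + 4x_1^3x_2^3\bigr) - 4x_1^3x_2^3, \\
x_1^a x_2^b &= \tfrac{1}{2}x_1^{a-1}x_2^{b-1}\,d_1
       \qquad (a, b \ge 1,\ a + b = 5).
\end{align*}
```

These three cases cover all six monomials of degree 5 in $x_1, x_2$, so every homogeneous polynomial of degree 5 is of the required form.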
EXAMPLE (Siersma's example [11?])
Let $\xi$ be the 4-jet $(x_1,x_2) \mapsto x_1^3 + x_1x_2^3$. We have $d_1 = 3x_1^2 + x_2^3$, $d_2 = 3x_1x_2^2$. A little combinatorial work shows there is no way in which $x_2^5$ can be written as $q_1d_1 + q_2d_2$ (modulo terms of degree 6) with $q_1, q_2$ each of minimum degree 2. Thus the Sufficient Test fails and so we suspect that $\xi$ may not be determined. To prove this we must apply the Necessary Test: here we find that any homogeneous polynomial of degree 5 can be written as $q_1d_1 + q_2d_2$ (modulo terms of degree 6) with each of $q_1, q_2$ of minimum degree 1 (in particular we see $x_2^5 = x_2^2d_1 - x_1d_2$) and so $\xi$ does not fail the Necessary Test and we cannot be sure that it is not determined: it has fallen between the two tests. In fact this $\xi$ is determined, but to prove this needs more technicalities which we will not discuss here.

Interestingly, for this $\xi$ it turns out that every homogeneous polynomial of degree 6 can be written as $q_1d_1 + q_2d_2$ (modulo terms of degree 7) where $q_1, q_2$ have minimum degree 2. This means that $\xi$ regarded as a 5-jet is determined, in other words that any function germ of the form

$(x_1,x_2) \mapsto x_1^3 + x_1x_2^3 + (\text{terms of degree} \ge 6)$

is right equivalent to $\xi$. It is in showing that terms of degree 5 also make no difference that the subtlety of this example lies.
Remarks on the literature.
There are a great many books on linear algebra. A fairly random selection is Cullen [31], Halmos [49], Mirsky [84] (in which linear spaces are called linear manifolds), Nomizu [91], Shields [114]. The basic material on calculus in normed linear spaces can be found in Dieudonné [32], Lang [67] or Loomis and Sternberg [70], and for euclidean space in almost any text on advanced calculus such as Hoffman [58], Rudin [10?] or Spivak [12?]. See also Griffiths and Hilton [43] or Hirsch and Smale [55] for both linear algebra and calculus. Some references on singularity theory are Eells [36], Golubitsky and Guillemin [42], or the survey by Arnol'd [9]. Determinacy of jets is explored and explained thoroughly in Poston and Stewart [101]. Much of the content of this chapter is also contained in books on differential topology referred to at the end of Chapter 3, and also in Field [160].
3 Differentiable manifolds and maps

3.1 THE CONCEPT OF A DIFFERENTIABLE MANIFOLD

With the topology and calculus developed so far we have assembled a mathematical toolkit for dealing with

1. global structures which are not necessarily linear.
2. differentiation in open subsets of euclidean spaces or, more generally, in normed linear spaces.

The next step is to unite these two. As a natural result of the synthesis of local linearity with global non-linearity we arrive at the definition of a differentiable manifold.

Recall that in order to formulate the definition of differentiability at $x$ for a continuous map $f : U \to F$, where $U$ is open in $E$, we only needed to use the linearity of $E$, $F$ locally, i.e. in some neighbourhood of $x$ in $U$ and $f(x)$ in $F$. Therefore if $S$ and $T$ are two topological spaces we can talk about differentiability of a map $f : S \to T$ at a point $x$ in $S$ provided that $S$, $T$ look locally like normed linear spaces in neighbourhoods of $x$, $f(x)$. Now 'looks like' in the context of topological spaces means 'is homeomorphic to', so we can express this more formally as follows.

Suppose $S$ and $T$ are topological spaces and $f : S \to T$ is a continuous map. Let $x$ be a point of $S$, and suppose that $x$ has an open neighbourhood $U$ in $S$ such that there is a homeomorphism $\phi : U \to U'$ where $U'$ is an open subset of a normed linear space $E$. Suppose also that $f(x)$ has an open neighbourhood $V$ in $T$ with a homeomorphism $\psi : V \to V'$ where $V'$ is an open subset of a normed linear space $F$; assume for simplicity that $f(U) \subset V$, so that
a
in a normed linear space r
a
.
The same
If we put in the further reasonable assumption that $S$ looks locally the same everywhere, i.e. that all the $E_\alpha$'s are really just copies of the same normed linear space $E$, we have arrived at the definition of a manifold modelled on $E$.
DEFINITION
A manifold modelled, on a normed linear space which admits a family
of open sets covering
{U^}
is homeomorphic with an open subset
Each
U ,
called a chart.
The collection
S. f
{(U
a
,
S {(h
... is a homeomorphism and ... is one for $T$.

Despite the formal manoeuvring in this last example, we may still feel uncomfortable about trying to do calculus around the edge of a square. We would be right: our instincts detect the fact that manifolds as we have defined them do not yet quite provide the correct setting for global calculus. We will pursue this problem further, after ending the list above with two rather bizarre examples of manifolds.
7. $S = R^2$ with an unusual topology constructed as follows: $U$ is open in $S$ precisely when $U$ intersects every line $\ell$ parallel to the $x_1$-axis in an open subset of $\ell$ (regarded as $R$ with the usual topology). Thus every 'usual' open subset of $R^2$ is open in this topology, but certainly not conversely. For example, a line parallel to the $x_1$-axis is open in $S$ but not in the usual $R^2$. Every point $p$ in $S$ has a neighbourhood homeomorphic to $R$, namely the line through $p$ parallel to the $x_1$-axis. Thus $S$ is a manifold modelled on $R$. This manifold is in a certain sense embarrassingly large, as it possesses an uncountably infinite number of disjoint open sets. It needs an uncountably infinite number of charts to constitute an atlas.

8. $S = [0,\infty)$ with an unusual topology as follows: the open sets in $S$ are precisely those which are open in the usual topology (induced from $R$) and which are either disjoint from $0$ or contain $0$ together with $(1, 1+\varepsilon)$ for some $\varepsilon > 0$. It is easy to verify that this prescription does define a topology for $S$. Furthermore, every point has neighbourhoods homeomorphic to open intervals in $R$: this is by definition for points other than $0$, and a little checking reveals that each neighbourhood of $0$ of the form $[0,\varepsilon) \cup (1,1+\varepsilon)$ is homeomorphic to $(-\varepsilon,\varepsilon)$ in $R$. Thus $S$ is a manifold modelled on $R$. However, $S$ is not Hausdorff, since every neighbourhood of $0$ meets every neighbourhood of $1$.
There are contexts in which the study of 'large' manifolds and non-Hausdorff manifolds is important, but we shall not be seriously concerned with them. From now on all our manifolds will be assumed automatically to be Hausdorff and possess atlases with at most a countably infinite number of charts.
Next we look a little more closely at what is involved in doing calculus on manifolds. As already described, we discuss differentiability of maps between manifolds by representing the maps locally (via charts) as maps between open subsets of normed linear spaces. It is clear, though, that
the result of this discussion may depend heavily on the choices chart homeomorphisms if
x
lies
d> 1 Vp
Vk V
p
difficulty,
can write
first observe that
the map r
n U ) -> V' = (JjV c: F
pap
as
a
(U
n
a
UJ
followed by ipf
V'
a composition:
^p1 = See Figure
* Vp^
One representative for
f
is
* therefore the
Figure 21 composed with an overlap map
other representative such overlap map is space)
a homeomorphism of one open subset of
with another: we have
If it happened to be the case
atlas were of class
C
^
•
~~
• • the ambiguity about of different
was C
r
E
(the model
that every overlap map within the given
(l£r N
differentiable manifolds is a $C^r$ map if all its local representatives via charts of $M$ and $N$ are $C^r$ maps in the sense of §2.6.

Note that to verify this in any given case it suffices to check the differentiability of the local representatives with respect to just one convenient atlas for each of $M$ and $N$. It then follows automatically for all other atlases in the same $C^r$ structure.

As a special case of the above we have:

DEFINITION
A homeomorphism $f : M \to N$ which is a $C^r$ map and whose inverse $f^{-1} : N \to M$ is also a $C^r$ map is called a $C^r$ diffeomorphism.

If there exists a $C^r$ diffeomorphism between $M$ and $N$ they are said to be diffeomorphic (with the qualifier $C^r$ inserted according to context). Two diffeomorphic manifolds are indistinguishable as far as their topologies and differentiable structures are concerned.
EXAMPLES of differentiable manifolds
All the examples of manifolds above are in fact examples of differentiable manifolds, since the overlap maps of the charts are all $C^\infty$. For example, in 3(a) one of the overlap maps has domain the open interval $(0,1)$ (observe that the overlap of the two chart domains consists of those points of $S^1$ lying in the first quadrant), image also $(0,1)$, and is given by the formula

$t \mapsto +\sqrt{1-t^2}$,   $0 < t < 1$,

which is $C^\infty$. The other overlap maps are equally easy to check. In fact the atlases 3(a) and 3(b) for $S^1$ define the same $C^\infty$ structure since the charts from one overlap $C^\infty$ with the charts from the other and therefore a maximal $C^\infty$ atlas including either of them would include both.

Example 6 (the square) still seems worrying. How can a differentiable manifold have sharp corners? Yet the square is a differentiable manifold since the overlap map between $\phi_\alpha h$ and $\phi_\beta h$ is $(\phi_\beta h)(\phi_\alpha h)^{-1}$ which is nothing other than $\phi_\beta\phi_\alpha^{-1}$, which could perfectly well have been arranged to be $C^\infty$ as in the examples for $S^1$ above. The point here is that the 'differentiable' in 'differentiable manifold' refers to a relative property of overlap maps, and so if you live in the world defined everywhere locally by the charts you simply do not see sharp corners. The corners exist only as a feature of the square in something else, such as the plane $R^2$. Once the manifold is considered as part of something bigger then it is having to carry more than its own intrinsic properties as a manifold, and it is important to distinguish these extrinsic properties (whatever they may be in any given context) from the intrinsic ones which do not vary with the context.
What, then, is the real nature of the extrinsic property of the circle/square manifold of having corners, or, conversely, what is the meaning of not having corners? A straight line in $R^2$ clearly has no corners, and nor does anything which can be converted into a straight line by a diffeomorphism. Generalizing this to k-dimensional affine subspaces of $R^n$ (i.e. linear subspaces translated by some constant vector, so not necessarily passing through the origin in $R^n$) and realizing that having corners or otherwise is a local property (although extrinsic) we arrive at the definition of a subset $S$ of $R^n$ having 'no corners'. A subset of this kind is called a submanifold of $R^n$.
DEFINITION
Rn
A submanifold of for each
x
in
dif feomorphism that
(x1 - /l-x2
local
:
U
R
2
by
2 ,
x2)
.
a diffeomorphism onto its image
= {x e R
segment
2
C
x^
{x e R2
> - /l-x 2
|x2|
Xx = °, Ix2l
< 1}
< 1}.
and takes
U A S
See Figure 23.
to
the line
It is easy to find
Figure 23 129
charts
(U,)
to take
care of points
on
S
with
< 0
or
°o
(use
x9
instead) ,
thus
showing that
Note that the restriction of
R
in Example 3(a), when
indeed any chart for wo uld, with
S
S
is a
to
U A S
is
thought of as
C
This means
.
above is the
simply
3(a)
(or 3(b))
submanifold of
R
2
R
in
,
C
structure for
the same
structure that
and
2
K
certainly overlap
C
S - S
the
defined by
,
C
.
D
arising from its submanifold property m
that the
is
K
the chart map
x^-axis
CO
atlas
d^
submanifold of
if it were not already one of those in 3(a),
them.
0
S
,
inherits
as
a
.
2. Likewise it can be seen that $S^n$ is a $C^\infty$ submanifold of $R^{n+1}$ $(n \ge 1)$, although later we shall find a much easier way of showing this than by brandishing atlases explicitly. This $C^\infty$ structure for $S^n$ is usually called its standard structure.

Now we can unravel the confusion surrounding the 'square' example. The fact is that the square $S$ is not a differentiable submanifold of $R^2$, since at any corner there is no local diffeomorphism in $R^2$ which will straighten out $S$ locally to a straight line. Nevertheless $S$ can be equipped with a $C^\infty$ structure by exploiting its topological equivalence to a circle as on page 126, but this structure bears little direct relationship to the $C^\infty$ structure of $R^2$. Examples like this hardly ever arise in practical situations, since manifolds are very often encountered initially as submanifolds of $R^n$. The only reason for mentioning it was to highlight the distinction between intrinsic and extrinsic properties of manifolds.

3.
Suppose we wish to study the motion of a spherical pendulum. The position is represented by a point $x$ in the unit 2-sphere $S^2$, say, and the velocity is represented by a vector $v$ in $R^3$ which is tangent to the sphere at $x$, which is the same as saying that $v$ is perpendicular to $x$ regarded as a vector in $R^3$. Thus the space we must consider if we wish to analyze the global dynamics of the pendulum is

$T = \{(x,v) \in S^2 \times R^3 : v \text{ is perpendicular to } x\}$

which consists of all elements $(x_1,x_2,x_3;\ v_1,v_2,v_3)$ of the space $R^6$ that satisfy

$x_1^2 + x_2^2 + x_3^2 = 1$
$x_1v_1 + x_2v_2 + x_3v_3 = 0$.

This set $T$ is a submanifold of $R^6$, as can be proved either directly by a careful choice of charts or indirectly by more general methods: see §3.4 below.
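The two defining equations of $T$ are easy to test numerically. The sketch below is purely illustrative; the helper names `in_T` and `project_to_tangent` are hypothetical and not part of the text. It checks membership of $T$ up to a tolerance and produces a tangent vector at $x$ by removing from an arbitrary vector its component along $x$.

```python
import math

def dot(a, b):
    """Euclidean inner product of two vectors given as sequences."""
    return sum(p * q for p, q in zip(a, b))

def in_T(x, v, tol=1e-9):
    """True if (x, v) satisfies |x|^2 = 1 and x.v = 0 up to the tolerance."""
    return abs(dot(x, x) - 1.0) < tol and abs(dot(x, v)) < tol

def project_to_tangent(x, w):
    """Subtract from w its component along the unit vector x."""
    c = dot(x, w)
    return tuple(wi - c * xi for wi, xi in zip(w, x))

# A position on the unit sphere and an admissible velocity at that position.
x = (1 / math.sqrt(3),) * 3
v = project_to_tangent(x, (1.0, 2.0, 3.0))
```

A pair such as `(x, v)` above lies in $T$, whereas `(x, (1.0, 2.0, 3.0))` does not, because the unprojected vector has a component along $x$.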
An obvious question arising is: can every differentiable manifold be realized as a submanifold of some $R^n$? If this is so, then we can in theory dispense with the abstract definition. The answer is Yes: see §3.2 below. However, the ways of doing this may be quite artificial, and distract from the essential properties of the manifold itself. For this reason we prefer to deal with manifolds abstractly where possible, to avoid carrying around a lot of redundant information.
3.2 REMARKS, COMMENTS, AND MORE EXAMPLES OF DIFFERENTIABLE MANIFOLDS

1. A manifold is n-dimensional, or an n-manifold, if the normed linear space on which it is modelled is a finite-dimensional space of dimension $n$. In this case we can regard the model space as $R^n$. An infinite-dimensional manifold is one which is modelled on an infinite-dimensional normed linear space.

If $\phi : U \to U'$ is a chart on an n-manifold we can for each point $u$ in $U$ write

$\phi(u) = (\phi_1(u), \ldots, \phi_n(u))$ in $R^n$.

Sometimes it is formally useful to write $x_i$ as well as $\phi_i$ to denote the function $\phi_i : U \to R$ (the $i$th coordinate function). If $u$ also lies in the domain $V$ of another chart $\psi$ then similarly writing $y_j$ to denote $\psi_j$ $(j = 1,2,\ldots,n)$ we can express the fact that the overlap map $\psi\phi^{-1}$ is a $C^r$ diffeomorphism by: all partial derivatives of the $y$'s as functions of the $x$'s exist (and are continuous) up to order $r$, and the Jacobian matrix

$\frac{\partial(y_1,y_2,\ldots,y_n)}{\partial(x_1,x_2,\ldots,x_n)}$

is non-singular everywhere where it is defined. This is a consequence of the Inverse Function Theorem (§2.8).
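The Jacobian criterion for an overlap map can be illustrated numerically. The example below is an assumption chosen for illustration and does not come from the text: the two charts are Cartesian and polar coordinates on the half-plane $x_1 > 0$, so the overlap map sends $(x_1,x_2)$ to $(r,\theta)$. The Jacobian is approximated by central differences and its determinant should be close to $1/r$, which is never zero.

```python
import math

def overlap(x1, x2):
    """Cartesian -> polar coordinates on the half-plane x1 > 0."""
    return (math.hypot(x1, x2), math.atan2(x2, x1))

def jacobian(f, x1, x2, h=1e-6):
    """2x2 Jacobian matrix of f at (x1, x2), by central differences."""
    fx_p, fx_m = f(x1 + h, x2), f(x1 - h, x2)
    fy_p, fy_m = f(x1, x2 + h), f(x1, x2 - h)
    return [[(fx_p[i] - fx_m[i]) / (2 * h),
             (fy_p[i] - fy_m[i]) / (2 * h)] for i in range(2)]

def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

# At the point (3, 4) we have r = 5, so the determinant is close to 1/5.
d = det2(jacobian(overlap, 3.0, 4.0))
```

A finite-difference estimate only samples the Jacobian at chosen points; it suggests, but does not prove, non-singularity on the whole overlap.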
If everything that we have done so far is re-cast in the setting of complex linear spaces, with complex differentiability (see Remark 3, page 60) we obtain complex manifolds. These arise in the global theory of functions of one or more complex variables, in particle physics for example.

Any open subset $W$ of a $C^r$ manifold $M$ is itself a manifold of the same dimension with charts the restrictions to $W$ of those in the $C^r$ structure of $M$. In particular (as already noted in §3.1) any open subset of the model space $E$ is a $C^\infty$ manifold. Observe that the meaning of diffeomorphism (page 125) is in this case the same as that originally used in §2.5.
We have so far avoided asking an obvious question: given a topological manifold is it always possible to choose an atlas whose overlap maps are $C^r$, thus making the manifold into a differentiable manifold? The answer is No, although not surprisingly the proof involves esoteric topological methods. An example in dimension 10 was found by Kervaire [64] in 1960. However, it can be shown that if there is a $C^1$ atlas then there will also exist a $C^r$ atlas for every $r$, and even a $C^\infty$ atlas. Therefore from now on we will tend to work with $C^\infty$ differentiable manifolds, and call them smooth manifolds. Remarks made about smooth manifolds will usually apply equally well in the $C^r$ case for any given $r \ge 1$.

Another natural question to ask is: can a given manifold be equipped with two distinct smooth structures? Officially, the answer is Yes, and this is very easy to see. For example, consider the two single-chart atlases on $R$:

(1) $\phi : R \to R : x \mapsto x$
(2) $\psi : R \to R : x \mapsto x^3$.

These define distinct $C^1$ structures since their overlap map, although a differentiable homeomorphism, is not a diffeomorphism (its inverse is not differentiable at $0$), and therefore $(R,\psi)$ does not belong to the structure determined by $(R,\phi)$. Nevertheless, the two copies of $R$ are diffeomorphic as smooth manifolds. The map

$f : R_{(1)} \to R_{(2)}$

taking $x$ to $\sqrt[3]{x}$ is a diffeomorphism because its representative via the given charts is $\psi f\phi^{-1} : x \mapsto (\sqrt[3]{x})^3 = x$, the identity map.
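Both halves of the argument about the charts $x \mapsto x$ and $x \mapsto x^3$ on $R$ can be probed numerically. This is a rough sketch, with hypothetical helper names, and it gives evidence rather than proof: the difference quotients of the cube root at $0$ grow without bound (so the inverse of the overlap map is not differentiable there), while the local representative of the cube-root map in the two charts reduces to the identity.

```python
import math

def phi(x):
    """Chart (1): the identity on R."""
    return x

def psi(x):
    """Chart (2): x -> x^3."""
    return x ** 3

def cbrt(x):
    """Real cube root, valid for negative arguments as well."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)

def representative(x):
    """psi . f . phi^{-1} with f = cbrt; phi is its own inverse."""
    return psi(cbrt(phi(x)))

def difference_quotient(g, h):
    """(g(h) - g(0)) / h, an approximation to a derivative at 0 if one exists."""
    return (g(h) - g(0.0)) / h

# The quotients for cbrt at 0 behave like h^(-2/3) and blow up as h -> 0.
coarse = difference_quotient(cbrt, 1e-3)
fine = difference_quotient(cbrt, 1e-6)
```

Here `fine` is much larger than `coarse`, in line with the failure of differentiability at $0$, and `representative(x)` agrees with `x` up to rounding.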
A smooth manifold $G$ which also has the algebraic structure of a group, and for which the operations of multiplication

$G \times G \to G : (x,y) \mapsto xy$

and inversion $G \to G : x \mapsto x^{-1}$ are both smooth maps, is called a Lie group (after Sophus Lie, 1842-1899). Examples are: the circle group $S^1$ (with multiplication $(e^{i\theta}, e^{i\phi}) \mapsto e^{i(\theta+\phi)}$), the torus $S^1 \times S^1$ (with $S^1$-multiplication taking place in each factor), the group $GL(n;R)$ of invertible $n \times n$ matrices with real entries. This last is an open subset of $R^{n^2}$. Many Lie groups such as these arise naturally as groups of self-transformations of some geometric object. Conversely, one can start with an abstractly-defined Lie group, and look for objects on which it can act as a group of transformations.

There is a particularly rich theory of the actions of Lie groups as groups of diffeomorphisms of manifolds. Technically, an action of $G$ on a manifold $M$ is a map

$\alpha : G \to \{\text{all diffeomorphisms } M \to M\}$

which respects the rules of composition on both sides, i.e.

$\alpha(y)\cdot\alpha(x) = \alpha(xy)$

for all $x, y$ in $G$. (Remember the composition on the left hand side means 'do $\alpha(x)$ then $\alpha(y)$'.) If the map $G \times M \to M$ taking $(x,p)$ to $\alpha(x)(p)$ is a smooth map, then we say that $\alpha$ is a smooth action. For a given point $p$ in $M$, the set of all points $\{\alpha(x)(p)\}$ as $x$ runs through all of $G$ is called the orbit of $p$ under the action $\alpha$.
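The definitions of action and orbit can be made concrete with the circle group acting on the plane by rotations. This example is an assumption made for illustration and is not taken from the text: the plane is identified with the complex numbers, a group element is a unit complex number `x`, and `alpha(x)` is multiplication by `x`. Since the circle group is abelian, the order of composition does not matter here.

```python
import cmath
import math

def alpha(x):
    """The diffeomorphism of the plane given by rotation through arg(x)."""
    return lambda p: x * p

def orbit(p, samples=8):
    """Finitely many sample points of the orbit of p under the circle group."""
    return [alpha(cmath.exp(2j * math.pi * k / samples))(p)
            for k in range(samples)]

# Two group elements and a point of the plane.
x, y = cmath.exp(0.3j), cmath.exp(1.1j)
p = 2 + 1j

# Doing alpha(x) and then alpha(y) agrees with alpha(xy).
composed = alpha(y)(alpha(x)(p))
direct = alpha(x * y)(p)
```

The sampled orbit of `p` lies on the circle of radius `abs(p)` about the origin, and the orbit of the origin is a single point, so orbits of one action may have different dimensions.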
It is an immediate consequence of the definition of a group that 'belonging to the same orbit' is an equivalence relation on $M$, and so $M$ is partitioned into $\alpha$-orbits. In Chapter 4 we shall be studying the orbit structures of smooth actions of the 1-dimensional Lie group $R$ and the 0-dimensional Lie group $Z$ (in both of which the 'multiplication' is usual addition) on compact manifolds.

3.3 THE STRUCTURE OF DIFFERENTIABLE MAPS BETWEEN MANIFOLDS

We have now reached the stage of generalizing the local study of maps $f : R^n \to R^m$ (i.e. families of real functions of real variables) to the global study of maps between differentiable manifolds. There are two main questions to ask:

1. What are the possible geometrical (topological) forms that differentiable manifolds can take?
2. What can differentiable maps between them look like?

Question 1 was discussed briefly in §3.2 (item 10). We will now turn attention to Question 2, which is more important in applications since in practice we are likely to know which manifolds we are dealing with but need to know something about maps between them as relating to inputs and outputs of dynamical systems.

To investigate the structure of differentiable maps we first look at local structure and then piece together local information to give global information. As before, we will take smooth to mean either $C^\infty$ or $C^r$ for some given $r \ge 1$.
(a) Local structure

Let $M$, $N$ be smooth manifolds modelled on $R^m$, $R^n$ or, more generally, normed linear spaces $E$, $F$ (§3.2). Any smooth map $f : M \to N$ is by definition represented locally by smooth maps between open sets in $E$ and open sets in $F$, and therefore the theory of the local structure of smooth maps between manifolds is precisely the theory of the local structure of smooth maps in euclidean or more general normed linear spaces.

We have already studied the local structure of such maps to some extent in §§2.8, 2.9 and we have no further results to offer at this stage. However, it is worthwhile looking again at the consequences of the Inverse Function Theorem and thinking of them from a manifold point of view. Recall that two significant corollaries of the theorem were summarized on page 98 by saying that if $f : R^n \to R^m$ is such that the derivative $Df(x) : R^n \to R^m$ has maximal rank then near $x$ the map $f$ looks like the linear map $Df(x)$. In particular, when $n \le m$ the local image of $f$ looks like an n-dimensional linear subspace of $R^m$, and when $n \ge m$ the map $f$ looks locally like a linear projection of $R^n$ onto $R^m$ and so the inverse image of each point of $R^m$ near $f(x)$ looks like an $(n-m)$-dimensional linear subspace of $R^n$. Now the first of these two cases can be described in the language of §3.1 by saying that locally the image of $f$ is an n-dimensional submanifold of $R^m$, and in the second case the local inverse images of points near $f(x)$ are $(n-m)$-dimensional submanifolds of $R^n$. Of course to tie this down precisely, we have to check that the formal definitions of 'looks like' in the two contexts do correspond to what is required for the formal definition of submanifold in each case, but they do. Thus, remembering the extension of the idea of submanifold from a euclidean space setting to manifolds in general (§3.2), we can summarize the two corollaries of the Inverse Function Theorem this time as a theorem about maps between manifolds as follows:
THEOREM
Let $f : N \to M$ be a smooth map between smooth manifolds of dimensions $n$, $m$ respectively, and let $x$ be a point of $N$. Take a local representative $\tilde f = \psi f\phi^{-1}$ for $f$ at $x$, and suppose that the derivative $D\tilde f(\phi(x))$ has maximal rank. Then in the cases (i) $n \le m$ and (ii) $n \ge m$ we have

(i) $f$ takes some neighbourhood of $x$ in $N$ to an n-dimensional smooth submanifold of $M$;
(ii) the inverse images under $f$ of points in some neighbourhood of $f(x)$ in $M$ are locally (i.e. near $x$) $(n-m)$-dimensional submanifolds of $N$.
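The maximal rank hypothesis is easy to examine numerically for maps between euclidean spaces. The sketch below is illustrative only: the two sample maps and all helper names are assumptions, the Jacobian is approximated by central differences, and the rank is computed by elimination with a tolerance. The function `sphere_fn` is an instance of case (ii) with $n = 3$, $m = 1$, whose level set through a point of the unit sphere is the sphere itself; `curve` is an instance of case (i) with $n = 1$, $m = 2$, and it loses maximal rank at the cusp $t = 0$.

```python
def jacobian(f, x, h=1e-6):
    """m x n Jacobian of f : R^n -> R^m at x, by central differences."""
    n = len(x)
    cols = []
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        fp, fm = f(xp), f(xm)
        cols.append([(a - b) / (2 * h) for a, b in zip(fp, fm)])
    return [[cols[j][i] for j in range(n)] for i in range(len(cols[0]))]

def rank(matrix, tol=1e-6):
    """Numerical rank by Gaussian elimination with partial pivoting."""
    rows = [r[:] for r in matrix]
    rk = 0
    for col in range(len(rows[0])):
        pivot = max(range(rk, len(rows)),
                    key=lambda r: abs(rows[r][col]), default=None)
        if pivot is None or abs(rows[pivot][col]) < tol:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(rk + 1, len(rows)):
            factor = rows[r][col] / rows[rk][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rk])]
        rk += 1
    return rk

def sphere_fn(x):
    """R^3 -> R, x -> |x|^2; maximal rank means rank 1."""
    return [x[0] ** 2 + x[1] ** 2 + x[2] ** 2]

def curve(t):
    """R -> R^2, t -> (t^2, t^3); maximal rank means rank 1."""
    return [t[0] ** 2, t[0] ** 3]
```

For instance `rank(jacobian(sphere_fn, [0, 0, 1]))` is 1, while at the origin the rank drops to 0, which is exactly where the level set degenerates to a point. The tolerance is a practical choice and would need adjusting for badly scaled maps.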
Remarks
1. If the maximal rank condition holds for one such local representative $\tilde f$ then it holds for them all (at the points corresponding to the given $x$, of course). This follows immediately from the Chain Rule, using the fact that if $A : R^n \to R^m$ is a linear map and $B : R^n \to R^n$, $C : R^m \to R^m$ are isomorphisms then $CAB$ has the same rank as $A$. Arguments of this kind concerning properties of $f$ which are independent of the choice of local representative are becoming quite familiar by now.

2. The case $n = m$ is common to both (i) and (ii). Here the maximal rank condition is that the derivative should be an isomorphism; the conclusion is that $f$ is locally a diffeomorphism (the Inverse Function Theorem directly). The local image of $f$ is an open set in $M$, and thus m-dimensional, and locally the inverse images of points near $f(x)$ are single points and thus 0-dimensional.
(b) Non-singular global structure

Since the maximal rank property at one point implies that $f$ has agreeable local properties near that point, we might expect that if $f : N \to M$ has the maximal rank property everywhere (i.e. using local representatives via charts) then $f$ would have some easily understandable global description. In fact, looking more closely at the above theorem, we could reasonably hope that under the assumption of maximal rank everywhere

(i) if $n \le m$ then $fN$ is a smooth n-submanifold of $M$
and
(ii) if $n \ge m$ then inverse images of points are smooth $(n-m)$-submanifolds of $N$.

A little thought shows that (i) will not hold as stated, because even though $f$ takes small open pieces of $N$ into submanifolds of $M$, different pieces may collide or otherwise interfere with each other when $f$ is applied and so $fN$, as a subset of $M$, may not have the structure of a submanifold. The maps illustrated in Figure 29 have maximal rank everywhere (in these cases the condition amounts to the non-vanishing of the 'velocity' vector $f'(t)$), but their images are not submanifolds of $R^2$: in (a) there are places (the crossover points) where $fS^1$ does not look locally like $R$ in $R^2$, and in (b) the same applies at $p$, even though there is no actual crossover.

[Figure 29: (a) image of $S^1$; (b) image of $R$]

Nevertheless there will be many instances where $fN$ is a submanifold of $M$, and when we add the extra assumption that $f$ is injective, i.e. $f : N \to fN$ is a bijection onto its image, to rule out examples such as $z \mapsto z^2$ taking $S^1$ to $S^1$ and thus winding $S^1$ twice around its image, we have the definition of an embedding, with $n \le m$: the most 'nonsingular' kind of map between manifolds.

Rather than recording the ingredients of this definition directly, we will pause for a moment to see what it means. If we know that $f : N \to M$ satisfies the maximal rank condition, and that $fN$ is a submanifold of $M$, then by taking 'submanifold' charts for $fN$ in $M$ (§3.1) we see that the local representatives for $f : N \to fN$ all have non-singular derivative everywhere. Now the Inverse Function Theorem tells us immediately that $f : N \to fN$ is everywhere locally invertible as a smooth map (i.e. locally a diffeomorphism), and since smoothness is a local property this means that the inverse of the bijection $f : N \to fN$ is smooth, or in other words that $f : N \to fN$ is a diffeomorphism onto its image. Thus we can paraphrase the definition of embedding as follows:
DEFINITION
A smooth map
f
: N -* M
is an embedding if it is a dif feomorphism from
to a smooth submanifold of
Thus N
150
an embedding of
as a submanifold of
N M.
M.
in
M
can be regarded as a
'realization'
of
N
If
f
satisfies
the maximal rank condition
(with
n £ m)
to be an embedding then it is called an immersion.
fails
called an immersed submanifold of
sometimes
strictly speaking be a submanifold.
M,
but possibly
Its
image is
although it may not
From the remarks
in (a) we can say
that an immersion is a map which is everywhere locally an embedding_, it may fail to be an embedding for certain global reasons. reasons have been illustrated in Figure 29. relating to almost periodic motions
while
Two possible
Here is an important example
in dynamical systems
(see §4.1),
giving
yet another illustration.
EXAMPLE (Irrational flow on a torus) Let $f : R \to S^1 \times S^1$ be defined by

$f(t) = (\exp i\alpha t, \exp i\beta t)$

where $\alpha, \beta$ are chosen to be two nonzero real numbers with $\alpha/\beta$ irrational. Then f satisfies the maximal rank condition, since in local angular coordinates on $S^1 \times S^1$ we have $f'(t) = (\alpha, \beta) \ne (0,0)$. Moreover, f is injective, since $f(t) = f(s)$ would imply that $\alpha(t-s)$ and $\beta(t-s)$ were both integer multiples of $2\pi$, and so if $t \ne s$ we would have $\alpha/\beta$ expressible as the quotient of two integers. Thus f is an immersion and is injective, but it is not an embedding since its image fR is a line winding round and round the torus at rate $\alpha$ in one sense and $\beta$ in the other and never joining up with itself: it forms a dense subset of the torus and so nowhere looks simply like $R^1$ in $R^2$ locally.

Notice that if $\alpha/\beta$ were rational, then f would still be an immersion but no longer injective. Its image would be (diffeomorphic to) a circle winding p times around the torus one way and q times the other way, where $\alpha/\beta = p/q$ in lowest terms.
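The contrast between the rational and irrational cases can be explored numerically. The following is only a sketch under stated assumptions: the function name `closest_return` is hypothetical, the times $t_k = 2\pi k/\alpha$ are chosen so that the first angle has returned to its starting value, and floating-point numbers can only approximate an irrational ratio.

```python
import math

def closest_return(alpha, beta, n_returns):
    """Smallest angular gap (radians) between f(t_k) and f(0) on the torus.

    At t_k = 2*pi*k/alpha the first angle is back at 0, and the second
    angle equals 2*pi*frac(k*beta/alpha); we scan k = 1..n_returns.
    """
    ratio = beta / alpha
    best = math.pi  # the largest possible gap between two angles
    for k in range(1, n_returns + 1):
        frac = (k * ratio) % 1.0               # fractional part of k*beta/alpha
        gap = 2.0 * math.pi * min(frac, 1.0 - frac)
        best = min(best, gap)
    return best

# alpha/beta = 2/3: the path closes up exactly (f is not injective).
rational_gap = closest_return(2.0, 3.0, 10)

# alpha/beta = 1/sqrt(2): the path comes back arbitrarily close to its
# starting point without ever closing up (float approximation only).
irrational_gap = closest_return(1.0, math.sqrt(2.0), 1000)
```

Increasing `n_returns` in the irrational case makes the gap shrink further, which is the numerical shadow of the image being dense in the torus.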
The two examples we have seen so far of injective immersions that are not embeddings fail to be embeddings because of the way the immersion behaves as points in the domain 'approach infinity': it is only the long-term behaviour that causes the trouble. Now, the topological notion of compactness (§1.6) in a sense corresponds to an intuitive idea of not being able to 'approach infinity'. Therefore it is reasonable to expect that every injective immersion of a compact manifold may have to be an embedding. This is indeed true.

THEOREM An injective immersion of a compact manifold is an embedding.
The value of this theorem is that if we are given $f : N \to M$ with $n \le m$, then we only have to check properties of f and N to know that fN will be a submanifold of M: we do not have to know in advance what fN is to see whether $f : N \to fN$ is a diffeomorphism. This is typical of the kind of simplification to a problem that the use of compactness can give.
Sketch of proof: As mentioned above, the Inverse Function Theorem guarantees that f takes suitably small open neighbourhoods U in N diffeomorphically to their images in M. Once we know that these images are themselves open sets of fN (with its topology induced as a subset of M) we immediately see that $f^{-1} : fN \to N$ is smooth since smoothness is a local property. Therefore the problem is boiled down to proving that f takes open sets in N to open sets in fN, i.e. that f, already a bijection $N \to fN$, is actually a homeomorphism. But now we need go no further, for Theorem B in §1.6 tells us that this is true since N is compact by assumption and fN is Hausdorff because M is.
In this discussion of global non-singular behaviour so far we have been concentrating on the case (i) $n \le m$. What about the other case (ii) $n \ge m$? Here there are no hidden traps. Under the maximal rank assumption, inverse images of points are smooth (n-m)-submanifolds of N locally, and therefore also globally since being a submanifold of N is a local property in terms of the topology of N. In the case $n \ge m$ a map satisfying the maximal rank condition is called a submersion, and so we have the following theorem:

THEOREM If $f : N \to M$ is a submersion then for every point p of fN the inverse image $f^{-1}(p)$ is an (n-m)-submanifold of N.
Remarks
1. There may be points p of M not in the image of f, in which case $f^{-1}(p)$ will be empty.
2. If f is of class $C^r$ then each $f^{-1}(p)$ is in fact a $C^r$ submanifold of N.
3. The simplest examples of submersions (other than diffeomorphisms, which are simultaneously submersions and immersions) are projections from a cartesian product $N = M_1 \times M_2$ onto one or other of the factors. The inverse image of a point $p_2$ of $M_2$ under the projection $N \to M_2 : (p_1, p_2) \mapsto p_2$ is the set $M_1 \times \{p_2\}$, which is of course diffeomorphic to $M_1$. Here $\dim(M_1) = \dim(N) - \dim(M_2)$, as in the theorem.
4. We know from the previous discussion that every submersion has this type of product structure locally. That it need not do so globally is easily illustrated by the Möbius strip (§3.2 [9]). First remove the boundary of the strip, leaving a non-compact 2-manifold (without boundary), and collapse this manifold down on to a circle going around the strip once. Such a map can be given by $(x,y) \mapsto (x, \tfrac{1}{2})$ in our formal definition of the Möbius strip as a quotient space obtained from the square $A = [0,1] \times [0,1]$. Locally this looks like a product projection as above, but it is not so globally since the Möbius strip is not homeomorphic to the product $S^1 \times [0,1]$.
This example, nevertheless showing how a space which is not a cartesian product can look everywhere locally like one and even have a globally defined 'projection' (= submersion) onto one 'factor', gives the germ of the topological idea of a bundle. We shall meet this again shortly in the context of tangent and cotangent bundles for smooth manifolds.
5. The purpose of removing the boundary of the Möbius strip in the preceding example was to avoid having to define submersions for manifolds with boundary. In fact, with a little care there is no real problem in defining submersions, immersions and embeddings for such manifolds.
6. In general this discussion goes over to infinite-dimensional manifolds, provided the usual precautions are taken about splitting (Remark 1, page 99).

(c) Global structure with singularities
The problem of understanding the global structure of smooth maps between manifolds, where singularities are admitted, is no less than the problem of understanding everything about the global properties of families of smooth functions of n variables. Therefore it is not surprising that our knowledge is rather limited. Nevertheless, there are some useful general theorems, and in particular cases very strong results about structure can be obtained. The theory of singularities of smooth maps, and in particular the global theory, is still very young as a branch of mathematics, but it is growing rapidly with the intensive cultivation of many ideas due in large part to Thom in the 1960's.

The simplest non-trivial maps to look at are those for which at least one of the manifolds is R or an interval in R. A map $R \to M$ is thought of as a path in M, and although the sets of paths and loops and deformations of loops in M are interesting and important things to study from the point of view of the global topology of M, the actual paths do not lend themselves to much analysis which is of topological significance. After all, a singularity is simply a place on the path where in some (and hence any) local coordinates the 'velocity' vanishes. Therefore we will move immediately on to the more interesting case of maps $M \to R$, i.e. real-valued functions of many variables.
The structure of real-valued functions

Singularities of such functions are called critical points and we have already studied some local structure near critical points (using the language of germs) in §2.9. We will now consider some global implications.
EXAMPLES
1. Let $M = R^n$ with $f : M \to R$ defined by

$f(x_1, x_2, \dots, x_n) = a_1 x_1^2 + a_2 x_2^2 + \dots + a_n x_n^2$

where the $a_i$'s are non-zero constants. Then $Df(x)$ is the linear map $R^n \to R$ described by the $1 \times n$ matrix

$(2a_1 x_1, 2a_2 x_2, \dots, 2a_n x_n)$

which fails to have maximal rank (= 1) precisely when $x_1 = x_2 = \dots = x_n = 0$. Therefore if $k \ne 0$ and k lies in the image of f, the general theory gives that the set

$\{x \in R^n : a_1 x_1^2 + a_2 x_2^2 + \dots + a_n x_n^2 = k\}$

will be an (n-1)-dimensional smooth submanifold of $R^n$. If all the $a_i$ are positive and $k > 0$ then $f^{-1}(k)$ is a compact manifold diffeomorphic to $S^{n-1}$ (an ellipse when n = 2), while if some $a_i$ and $a_j$ have opposite signs then $f^{-1}(k)$ is noncompact (a hyperbola when n = 2). See Figure 30. Note that $f^{-1}(0)$ is either just one point or is not a submanifold of $R^n$ (a pair of lines when n = 2), depending on the signs of the $a_i$'s. In particular, taking each $a_i = 1$ and $k = 1$ gives that the (n-1)-sphere $S^{n-1}$ is a smooth submanifold of $R^n$. We already know this (§3.1) but now we have proved it by general theory rather than by considering charts.

2. Let $f : M \to R$ be projection onto one coordinate axis (the 'height' function) for the surface M sketched in Figure 31. We see that for most points k in fM the inverse image $f^{-1}(k)$ is a 1-dimensional submanifold of M, although not for all.
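For the quadratic function of Example 1 the case analysis can be written out directly. This is an illustrative sketch only: the function names and the descriptive strings returned by `level_set_kind` are hypothetical labels, and the classification assumes $n \ge 2$ wherever it speaks of a hypersurface being noncompact.

```python
def derivative(a, x):
    """Row matrix Df(x) = (2*a_1*x_1, ..., 2*a_n*x_n) for f(x) = sum a_i x_i^2."""
    return [2.0 * ai * xi for ai, xi in zip(a, x)]

def is_critical(a, x):
    """Df(x) fails to have rank 1 exactly when every entry vanishes."""
    return all(entry == 0.0 for entry in derivative(a, x))

def level_set_kind(a, k):
    """Describe the level set f^{-1}(k) when all coefficients a_i are non-zero."""
    if any(ai == 0 for ai in a):
        raise ValueError("all coefficients must be non-zero")
    if k == 0:
        one_sign = all(ai > 0 for ai in a) or all(ai < 0 for ai in a)
        return "single point" if one_sign else "not a submanifold"
    same_sign_as_k = [ai * k > 0 for ai in a]
    if all(same_sign_as_k):
        return "compact hypersurface"     # an ellipsoid, diffeomorphic to S^(n-1)
    if not any(same_sign_as_k):
        return "empty"                    # k is not in the image of f
    return "noncompact hypersurface"      # e.g. a hyperbola when n = 2
```

For instance `level_set_kind([1, 1], 1)` describes the unit circle and `level_set_kind([1, -1], 0)` the pair of crossing lines.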
From the picture in Example 2 it seems reasonable to expect that in general 'most' points k in the image of f will have inverse images which are (n-1)-dimensional (i.e. codimension 1) submanifolds of M. This is indeed the case. It is clearly going to be true when f has only a finite number of critical points, for on removing the critical points the restriction of f to the remaining manifold is always a submersion, but the importance of the result is that it holds for any smooth f whatsoever, no matter how large or complicated the set of critical points may be.

Define a critical value of $f : M \to R$ to be any number $k = f(p)$ where p is a critical point of f. Thus critical points belong to M, critical values to R. Numbers which are not critical values, including those not even in fM, are called regular values. Assume that f is $C^\infty$. The theorem that gives us what we hope for is:

THEOREM (Sard's Theorem) The set of regular values of f is a dense subset of R.
Remarks
1. The original theorem (see Sard [123]) is much stronger than this. It relates the measure (a non-topological concept) of the set of critical values to the degree of differentiability of f and the dimension of M. Also, it deals with maps from n to m dimensions rather than just real-valued functions: see below.
2. A version of Sard's theorem applicable to infinite-dimensional manifolds has been formulated by Smale [111].
3. The theorem does not say that the set of non-critical points is dense in M. For example, the map $R^n \to R$ taking everything to zero has all non-zero k as regular values, but has no non-critical points at all.

The practical interpretation of the theorem is that if k is such that $f^{-1}(k)$ is non-empty but is not a codimension-1 submanifold of M, then we can find k' as close as we wish to k with the property that $f^{-1}(k')$ is either empty or is a genuine codimension-1 submanifold of M. If we call a codimension-1 submanifold a hypersurface, then we can summarize the above discussion by saying that for 'most' k in R the set of points x in M satisfying the nonlinear equation $f(x) = k$ is either empty or is a hypersurface in M. The advantage of this conclusion is that it allows us to take local coordinate systems consistently and meaningfully everywhere on $f^{-1}(k)$, which we think of as the world in which x can vary subject to the constraint $f(x) = k$. We shall use this hypersurface property when considering certain important types of dynamical system (§4.6).
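The practical reading of Sard's Theorem can be seen on a one-variable toy case. The sketch below is an illustration under assumptions of our own: the cubic $f(x) = x^3 - 3x$ is a hypothetical example not taken from the text, chosen because its critical points $x = \pm 1$ can be written down by hand, and the helper names are invented for the purpose.

```python
def f(x):
    return x ** 3 - 3.0 * x

CRITICAL_POINTS = (-1.0, 1.0)   # zeros of f'(x) = 3x^2 - 3
CRITICAL_VALUES = tuple(sorted(f(p) for p in CRITICAL_POINTS))

def is_regular_value(k):
    """k is regular when it is not the image of a critical point."""
    return k not in CRITICAL_VALUES

def nearby_regular_value(k, eps):
    """Return a regular value k' with |k' - k| < eps (k itself if already regular)."""
    if is_regular_value(k):
        return k
    step = eps / 2.0
    for _ in range(60):          # finitely many critical values, so this ends quickly
        candidate = k + step
        if candidate != k and is_regular_value(candidate):
            return candidate
        step /= 2.0
    raise ValueError("eps is too small to resolve in floating point")
```

Here the critical values form a finite set, so the density of regular values is obvious; the force of the theorem is that the same conclusion holds however complicated the critical set is.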
Let us look again at the illustration in Example 2, and suppose that the critical points are given to be non-degenerate. As we saw in §2.9 this definition does not depend on the choice of local coordinate chart. In a neighbourhood of each critical point p the Morse lemma gives us a local model which enables us to see how the submanifold $f^{-1}(k)$ changes or even disappears as k passes through the corresponding critical value $k = f(p)$. Furthermore, $f^{-1}(k)$ does not change at all (up to diffeomorphism) as k moves throughout any interval in R not containing any critical values. Thus from a knowledge only of the critical points of f and their types it seems that we can more or less reconstruct the global geometry of the manifold M. This simple yet profound observation is the basis of Morse theory.

Morse theory has been used to great effect by topologists in studying the geometry of high-dimensional manifolds, for example in Smale's proof of the generalized Poincaré conjecture (§3.2, [10]). In practical applications, however, it may often be more useful to work in the opposite direction, using the fact that the global topological structure of M places constraints on the possible numbers of the various types of non-degenerate critical point that a smooth function on M can have.

Let us suppose that M is a compact manifold. As remarked in §2.9, each non-degenerate critical point of f is isolated, i.e. lies in a neighbourhood containing no other critical points. If we now suppose also that all the critical points are non-degenerate, the compactness of M implies that there can only be a finite number of them (if there were an infinite number they would have to have an accumulation point (§1.6), and since all derivatives of local representatives of f are continuous such an accumulation point would have to be a critical point of f: an impossibility if all critical points are isolated). Let $N_i$ denote the number of critical points of index i (see §2.9). Then the numbers $N_i$ ($0 \le i \le n$) are related to the topology of M by the so-called Morse inequalities:

$N_r - N_{r-1} + N_{r-2} - \dots \pm N_0 \;\ge\; B_r - B_{r-1} + B_{r-2} - \dots \pm B_0$

$N_n - N_{n-1} + N_{n-2} - \dots \pm N_0 \;=\; B_n - B_{n-1} + B_{n-2} - \dots \pm B_0$

where the $B_i$ are the Betti numbers of M (certain integers defined from the topology of M via its homology groups: we will give no details here. See e.g. Maunder [79]). The alternating sum

$B_0 - B_1 + B_2 - \dots \pm B_n$

is the Euler characteristic of M, usually written $\chi(M)$. The Betti numbers, and hence the Euler characteristic, are topological invariants, which means that homeomorphic spaces have the same Betti numbers. These invariants can in principle be computed directly for any manifold or reasonable topological space defined in some fairly concrete way, and much can be discovered about them abstractly using the machinery of homology theory, itself a small part of algebraic topology.
EXAMPLES
1. For the function on the torus in Figure 31 we have $N_0 = N_2 = 2$ and $N_1$ (the number of saddle-points) = 4, so $\chi(\text{torus}) = 0$. Hence any $f : \text{torus} \to R$ with all critical points non-degenerate would satisfy $N_0 - N_1 + N_2 = 0$. It is worth trying several examples to verify this.

2. The n-sphere $S^n$ admits a smooth map $f : S^n \to R$ having one non-degenerate maximum (index n), one non-degenerate minimum (index 0), and no other critical points. For example, let f be the projection of the standard $S^n$ (unit sphere in $R^{n+1}$) onto one coordinate axis in $R^{n+1}$. Hence $\chi(S^n) = 1 + (-1)^n$. In particular $\chi(S^2) = 2$, which incidentally gives a proof that $S^2$ is not homeomorphic to a torus.

3. An orientable surface of genus g (see §3.2, [10]) has Euler characteristic $2 - 2g$, as may be seen by counting critical points of a height function based on Figure 28. Note that this is negative for all cases other than the sphere or the torus.
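The bookkeeping in these examples is easy to mechanize. The sketch below is a hedged illustration: the function names are hypothetical, the inequality is checked for every r below n and the equality at r = n, and the Betti numbers used in the usage lines (1, 2, 1 for the torus; 1, 0, 1 for the 2-sphere; 1, 2g, 1 for an orientable surface of genus g) are standard values assumed here rather than derived in the text.

```python
def alternating_sum(values, r):
    """values[r] - values[r-1] + values[r-2] - ... +/- values[0]."""
    return sum((-1) ** (r - i) * values[i] for i in range(r + 1))

def morse_inequalities_hold(N, B):
    """Check the Morse inequalities for critical-point counts N and Betti numbers B."""
    if len(N) != len(B):
        raise ValueError("N and B must both have length n + 1")
    n = len(N) - 1
    inequalities = all(
        alternating_sum(N, r) >= alternating_sum(B, r) for r in range(n)
    )
    equality = alternating_sum(N, n) == alternating_sum(B, n)
    return inequalities and equality

def euler_characteristic(B):
    """Alternating sum B_0 - B_1 + B_2 - ... of the Betti numbers."""
    return sum((-1) ** i * b for i, b in enumerate(B))

TORUS_BETTI = [1, 2, 1]
SPHERE_BETTI = [1, 0, 1]

def surface_betti(g):
    """Betti numbers of an orientable surface of genus g (assumed standard values)."""
    return [1, 2 * g, 1]
```

A function on the torus with one minimum, one maximum and no saddles would need N = [1, 0, 1], and `morse_inequalities_hold([1, 0, 1], TORUS_BETTI)` rejects it, which is the sense in which topology constrains the possible critical points.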
Maps between arbitrary manifolds

As soon as we take both manifolds to have dimension greater than 1 the global problems become much more subtle and complicated. However, one of the main tools - Sard's Theorem - is still available. Following a path parallel to that we took for functions $f : M \to R$, we define singular points of $f : M \to N$ to be points p where the maximal rank condition fails (in some and therefore any local coordinates), singular values to be points q of N which are the images of singularities under f, and regular values to be points in N which are not singular values (including points which are not in fM). Then, assuming f to be $C^\infty$, Sard's Theorem says that the set of regular values is a dense subset of N.

Again exploiting the corollaries of the Inverse Function Theorem as in the first part of this section, we see that when $\dim N \le \dim M$ the inverse image of each regular value is either empty or is a smooth submanifold of M of codimension equal to dim N. Thus the practical interpretation is that if $f^{-1}(q)$ is non-empty but is not a submanifold of M of codimension equal to dim N, then there exist points q' arbitrarily close to q in N for which either the submanifold condition will be true or $f^{-1}(q')$ will be empty. Much as before, we can summarize this by saying that 'usually' the set of solutions of n non-linear equations in m variables, where $n \le m$, is either empty or is a smooth manifold of dimension m-n.
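As a small illustration of the statement about n equations in m variables, take the two equations $x^2 + y^2 + z^2 = q_1$, $z = q_2$ in three variables. This system is a hypothetical example chosen for the sketch, not one from the text; the rank test uses the elementary fact that a 2 x 3 matrix has rank 2 exactly when one of its 2 x 2 minors is non-zero.

```python
import math

def jacobian(p):
    """Derivative of f(x, y, z) = (x^2 + y^2 + z^2, z) at the point p."""
    x, y, z = p
    return [[2.0 * x, 2.0 * y, 2.0 * z],
            [0.0, 0.0, 1.0]]

def has_maximal_rank(p, tol=1e-12):
    """A 2x3 matrix has rank 2 iff some 2x2 minor is non-zero."""
    (a, b, c), (d, e, g) = jacobian(p)
    minors = [a * e - b * d, a * g - c * d, b * g - c * e]
    return any(abs(m) > tol for m in minors)

# q = (1, 0.5): the solutions form a circle of radius sqrt(0.75) (dimension 3 - 2 = 1).
point_on_circle = (math.sqrt(0.75), 0.0, 0.5)

# q = (1, 1): the only solution is the north pole, which is a singular point.
north_pole = (0.0, 0.0, 1.0)
```

The value q = (1, 1) is singular and its inverse image is a single point rather than a 1-manifold, while values q = (1, q_2) with q_2 slightly less than 1 are regular and give small circles, in line with the 'usually' of the text.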
Given some notion of 'nondegeneracy' for singularities in arbitrary dimensions, we would aim to mimic the Morse inequalities by relating the number and types (whatever they may be) of nondegenerate singularities of maps $f : M \to N$ to the topological structures of M and N. This is a programme which has hardly begun, since it is only relatively recently that suitable ideas of 'nondegeneracy' in this wider context have been evolved, and in any case the difficulties seem formidable. It is likely, though, that we shall see exciting developments in these directions in the next few years.
3.4 TANGENT BUNDLES AND TANGENT MAPS

In the previous discussion of the structure of smooth maps the maximal rank condition played an important part. This is a condition on the derivative of a local representative of a map, and carries with it the remark that it does not depend on the choice of local representative. Similarly, the definitions of critical point, non-degenerate, singular point are in terms of the derivatives of a local representative, with the accompanying assurance about independence of choice of representative. It is cumbersome to have to refer to local representatives when dealing with properties independent of them, but how else can one work with derivatives?

The answer is: by means of the tangent bundle, a concept which will enable us to deal with derivatives of smooth maps without mentioning local representatives explicitly.
If we think of some evolving dynamical system as being represented by a point moving on a manifold, then it is likely that we will be interested not only in the position but also in some sense the velocity of the point. The intuitive idea of the tangent bundle of M is that it is the space of all positions and velocities of points moving on M. The subtlety of its construction stems from the need for an intrinsic definition, using information only about M and not about any embedding of M in some euclidean or other linear space (in which case the definition of velocity vectors would present no problem).

To formulate the definition let us go back to a local setting. Take E, F to be Banach spaces (e.g. euclidean spaces $R^n$), and let U, V be open sets in E, F respectively. The basis of our definition of tangent bundle will be the following observation: if $f : U \to V$ is differentiable at p in U, then the derivative $Df(p) : E \to F$ is characterized entirely by the effect of f on smooth paths in U based at p. Some explanation is needed. By a smooth path in U based at p we mean a smooth map

$c : J \to U$

where J is some open interval (a,b) with a < 0 < b and c(0) = p. If $f : U \to V$ is a smooth map, then by the Chain Rule we see that the composition

$f \circ c : J \to V$

is a smooth path in V based at f(p) and that

$D(f \circ c)(0) = Df(c(0)) \circ Dc(0) : R \to F$,

i.e.

$(f \circ c)'(0) = Df(p) \,.\, c'(0)$.    (*)

Thus the derivative Df(p) takes the tangent to the path c at p to the tangent to the path $f \circ c$ at f(p). See Figure 32. For any v in E there is a path c based at p with c'(0) = v (e.g. $c : t \mapsto p + tv$, for |t| small enough to ensure c(t) lies in U) and so the property above describes Df(p) completely.

Given two smooth paths $c_1, c_2$ based at p,
define them to be tangent there if

$(c_1(t) - c_2(t))/t \to 0$ (in E) as $t \to 0$

(the usual definition of 'first order contact'). Then tangency is an equivalence relation on the set of all paths in U based at p, and we call the equivalence classes tangency classes at p. This is the formal way of capturing the idea of velocities at p.

Denote the tangency class of c by [c]. Then the map

$\sigma_p : \{\text{tangency classes at } p\} \to E$

defined by

$\sigma_p[c] = c'(0)$

makes sense ($[c_1] = [c_2]$ implies $c_1'(0) = c_2'(0)$) and is injective ($c_1'(0) = c_2'(0)$ implies $[c_1] = [c_2]$) and we have already noted above that it is surjective (for any v in E there exists c with c'(0) = v). Therefore we have a natural bijection between the linear space E and the set $T_pU$ of tangency classes of smooth paths in U based at p. This means that we can regard $T_pU$ as a normed linear space isomorphic to E (just lift the structure of E over to $T_pU$ via the bijection), and similarly think of $T_{f(p)}V$ as an isomorphic copy of F. This linear space $T_pU$ is called the tangent space to U at the point p. In this new guise the linear map $Df(p) : E \to F$ becomes the linear map

$T_pf : T_pU \to T_{f(p)}V$

which takes each [c] in $T_pU$ to $[f \circ c]$ in $T_{f(p)}V$.

Define the tangent bundle TU of U to be the union of all the linear spaces $T_pU$ as p runs through U. The isomorphisms $\sigma_p : T_pU \to E$ fit together to give a bijection

$\sigma : TU \to U \times E$

defined explicitly by

$\sigma([c]) = (c(0), c'(0)) = (p, c'(0))$

for every smooth path c based at p. We then assign to TU that unique topology which makes $\sigma$ into a homeomorphism, i.e. W is open in TU precisely when $\sigma W$ is open in $U \times E$ with its product topology.

Any smooth map $f : U \to V$ induces a tangent map

$Tf : TU \to TV$

defined as $T_pf$ on each linear space $T_pU$. Thus Tf represents the effect of the derivative of f simultaneously all over U. If we regard TU, TV as $U \times E$, $V \times F$ via $\sigma$ and its analogue for V, then Tf is the map

$U \times E \to V \times F : (p,h) \mapsto (f(p), Df(p).h)$.    (**)

From this expression we see that if f is of class $C^r$ then Tf is of class $C^{r-1}$ since Df is of class $C^{r-1}$ (cf. page 80).
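The path characterization (*) of the derivative lends itself to a quick numerical check. What follows is a sketch under our own assumptions: the map f on $R^2$ and the point and vector used are hypothetical choices, the path is the straight line $c(t) = p + tv$ mentioned in the text, and the velocity of $f \circ c$ is estimated by a central difference rather than computed exactly.

```python
import math

def f(p):
    """A sample smooth map R^2 -> R^2 (hypothetical, for illustration only)."""
    x, y = p
    return (x * y, x + math.sin(y))

def Df(p, v):
    """Df(p) applied to v, from the Jacobian [[y, x], [1, cos y]]."""
    x, y = p
    return (y * v[0] + x * v[1], v[0] + math.cos(y) * v[1])

def straight_path(p, v):
    """The path c(t) = p + t*v, which is based at p and has c'(0) = v."""
    return lambda t: (p[0] + t * v[0], p[1] + t * v[1])

def velocity_at_zero(curve, h=1e-6):
    """Central-difference estimate of curve'(0)."""
    plus, minus = curve(h), curve(-h)
    return tuple((a - b) / (2.0 * h) for a, b in zip(plus, minus))

p, v = (1.0, 2.0), (0.5, -1.0)
c = straight_path(p, v)
lhs = velocity_at_zero(lambda t: f(c(t)))   # estimate of (f o c)'(0)
rhs = Df(p, v)                              # Df(p) . c'(0)
```

Replacing `straight_path` by any other path with the same value and velocity at t = 0 leaves `lhs` unchanged up to discretization error, which is exactly the statement that $T_pf$ is well defined on tangency classes.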
Using this notation the Chain Rule has a very clean form. If we have smooth maps f, g whose composition $g \circ f$ makes sense then $T(g \circ f)$ takes (p,h) to

$(g \circ f(p), D(g \circ f)(p).h) = (g(f(p)), Dg(f(p)) \circ Df(p).h) = Tg(Tf(p,h))$

so in fact the Chain Rule becomes

$T(g \circ f) = Tg \circ Tf$.

This is easy to see from the tangent space definition, since

$T(g \circ f)([c]) = [g \circ f \circ c] = Tg([f \circ c]) = Tg \circ Tf([c])$

for any [c] in TU. From the Chain Rule it follows immediately that if f is a diffeomorphism then so is Tf, since $T(\mathrm{id}_U) = \mathrm{id}_{TU}$ (similarly for V) and so $T(f^{-1})$ is an inverse for Tf.
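The identity $T(g \circ f) = Tg \circ Tf$ can be verified on a concrete pair of maps using the formula $(p,h) \mapsto (f(p), Df(p).h)$. The maps below are hypothetical examples chosen so that every derivative can be written by hand; the helper `tangent_map` is likewise an illustrative name.

```python
def tangent_map(func, dfunc):
    """Build Tf : (p, h) -> (f(p), Df(p).h) from a map and its derivative."""
    return lambda p, h: (func(p), dfunc(p, h))

# f(x, y) = (x + y, x*y) with Df(x, y).h = (h1 + h2, y*h1 + x*h2)
f = lambda p: (p[0] + p[1], p[0] * p[1])
Df = lambda p, h: (h[0] + h[1], p[1] * h[0] + p[0] * h[1])

# g(u, w) = u*w with Dg(u, w).k = w*k1 + u*k2
g = lambda q: q[0] * q[1]
Dg = lambda q, k: q[1] * k[0] + q[0] * k[1]

# g o f (x, y) = x^2*y + x*y^2, differentiated directly
gf = lambda p: p[0] ** 2 * p[1] + p[0] * p[1] ** 2
Dgf = lambda p, h: ((2 * p[0] * p[1] + p[1] ** 2) * h[0]
                    + (p[0] ** 2 + 2 * p[0] * p[1]) * h[1])

Tf, Tg, Tgf = tangent_map(f, Df), tangent_map(g, Dg), tangent_map(gf, Dgf)

p, h = (1.0, 2.0), (3.0, -1.0)
direct = Tgf(p, h)            # T(g o f)(p, h)
composed = Tg(*Tf(p, h))      # Tg(Tf(p, h))
```

Both routes give the same point and the same velocity, so the pair `direct` and `composed` agree; no local coordinates beyond the standard ones on $R^2$ were needed.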
The idea of tangency classes is the one we use in order to put derivatives into a manifold context. If M is any smooth manifold modelled on E and p is a point on M, define a smooth path based at p to be a smooth map

$c : J \to M$

where J = (a,b) with a < 0 < b and c(0) = p, and define two such to be tangent if their local representatives with respect to some chart around p are tangent in the sense already defined above for maps between open sets
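in normed linear spaces. That this does not depend on the choice of chart is easily seen by applying (*) to the overlap map.

The set $T_pM$ of tangency classes at p acquires a linear structure via the bijection with E given by a chart. If we choose a different chart we may obtain a different bijection between $T_pM$ and E. However, the important thing is that on going from E to $T_pM$ by the bijection corresponding to one chart, and then from $T_pM$ to E by the bijection corresponding to the other, the net result is a linear isomorphism of E; this follows again from (*). In more concrete terms, this means that given $[c_1], [c_2]$ in $T_pM$ their sum is the same element of $T_pM$ whichever chart is used to form it. The same applies to scalar multiples. Therefore, no matter which chart we choose, we can define a linear structure on $T_pM$, and the rules for a linear space (§2.2) work for $T_pM$ itself.

If we have a smooth map $f : M \to N$, where N is a manifold modelled on F, then by taking local representatives we see that if $[c_1] = [c_2]$ in $T_pM$ then $[f \circ c_1] = [f \circ c_2]$ in $T_{f(p)}N$. This means that f determines a well-defined map

$T_pf : T_pM \to T_{f(p)}N : [c] \mapsto [f \circ c]$.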
In terms of a local representative $\tilde f = \psi f \varphi^{-1}$ this map is none other than $D\tilde f(\varphi(p)) : E \to F$. Since this is linear, the map $T_pf : T_pM \to T_{f(p)}N$ is a linear map. It is called the tangent map of f at p. We think of it as taking velocities of points p of M to the corresponding velocities of points f(p) on N.

To summarize: the tangent space $T_pM$ is a linear space isomorphic to E; constructing an explicit isomorphism requires choosing a chart around p, but the linear structure induced on $T_pM$ does not depend on the chart. A smooth map $f : M \to N$ induces a linear tangent map $T_pf : T_pM \to T_{f(p)}N$ by $[c] \mapsto [f \circ c]$, which in local representation is simply the derivative.

Note that we have 'forgotten' the norm structure on E, and we are not thinking of any particular norm for $T_pM$ at the moment. We shall come back to this point shortly.

So far we have discussed only what happens at one point on the manifold. The next task is to put together in some coherent way the tangent spaces and tangent maps over the whole manifold M.
DEFINITION The tangent bundle of M, denoted by TM, is the union of the tangent spaces at all the points of M. Equivalently, TM is the set of all tangent vectors everywhere on M.

As it stands, this definition gives TM simply as a set, with no further structure than the linear structure of each $T_pM$. However, TM can quickly be made not only into a topological space but into a smooth manifold, and it then turns out that Tf is a smooth ($C^{r-1}$ if f is $C^r$) map from TM to TN.
Topology and smooth structure for TM

Let us first take the local case, when M is an open subset U of E. Here we have already seen that there is a natural bijection $\sigma : TU \to U \times E$ defined by taking each [c] in $T_pU$ to the point (p, c'(0)) in $U \times E$ - i.e. we identify $T_pU$ with $\{p\} \times E$ for each p in U, where we think of $\{p\} \times E$ as just a copy of E 'labelled' by p. Again, we already have a product topology defined for $U \times E$ and so we lift this back via $\sigma$ to a topology for TU.

If we are on an arbitrary $C^r$ manifold M we can take a chart $\varphi : U \to U'$, construct a bijection between TU and TU' by letting each path c in U correspond to $\varphi c$ in U', and then apply $\sigma : TU' \to U' \times E$. This gives a bijection

$\tau_\varphi : TU \to U' \times E$

taking [c] to $(\varphi c(0), (\varphi c)'(0))$, and we lift the product topology of $U' \times E$ back via $\tau_\varphi$ to a topology for TU. Now this construction depends on the choice of chart $\varphi : U \to U'$, but the topology obtained for TU is in fact independent of $\varphi$. This is because any overlap map $\gamma = \varphi_\alpha \varphi_\beta^{-1}$ is a $C^r$ diffeomorphism, and so as we saw above $T\gamma$ is a $C^{r-1}$ diffeomorphism (if r > 1) and is in any case a homeomorphism. But it follows directly from (*) that $T\gamma$ is in fact $\tau_\alpha \tau_\beta^{-1}$, and so for every W in TU

$\tau_\alpha W = \tau_\alpha \tau_\beta^{-1}(\tau_\beta W)$

is open if and only if $\tau_\beta W$ is open, showing that the topology for TU does not depend on the choice of $\varphi$.

We now have a well-defined topology on TU (thinking of U itself as a manifold). To get from this a topology on the whole of TM we define W in TM to be an open set precisely when the intersection of W with each TU is open in TU. It is routine to check that this does satisfy the rules for a topology (§1.3). In view of the overlap condition discussed above, it is not necessary (and of course theoretically impossible) to check this intersection property for all charts on M: it
suffices to check it for one atlas.

By means of the bijections $\tau$ we have gained more than a topology for TM. Each $\tau$ is a homeomorphism (by definition) from an open set TU in TM to an open set $U' \times E$ in $E \times E$, and so the $\tau$'s are charts on TM modelled on $E \times E$. The overlap maps $\tau_\alpha \tau_\beta^{-1}$ are $C^{r-1}$ diffeomorphisms if the overlap maps on M are $C^r$, and so a $C^r$ structure on M gives an atlas and hence a $C^{r-1}$ structure on TM.

... two points in different tangent spaces.

If M has dimension n then TM has dimension 2n.

TM is never compact. This is essentially because each $T_pM$ is non-compact, being homeomorphic to E. (A compact set in E would have to be bounded: cf. Theorem C in §1.6.)

5. There is a natural map $\pi : TM \to M$, called the projection map, taking $T_pM$ to p for each p in M, i.e. $\pi$ takes all tangent vectors at p to the point p itself. Thus $\pi^{-1}(p) = T_pM$. With $\pi$ in mind we think of the tangent bundle TM as a collection of tangent spaces lying 'above' M, and then $T_pM$ is called the fibre of the bundle over p. See Figure 33.
[Figure 33]

The projection $\pi$ is a smooth map ($C^{r-1}$ if M is $C^r$) since it has local representatives of the form $\varphi \pi \tau^{-1}$ taking (q,h) in $U' \times E$ to q in U'. This has maximal rank everywhere (its derivative is surjective, and splits in the infinite-dimensional case) and so $\pi$ is a submersion $TM \to M$.
EXAMPLES of tangent bundles

1. For an open set U in E the topology on TU was defined just in order to make the bijection $\sigma : TU \to U \times E$ into a homeomorphism. For this reason we usually write $TU = U \times E$. In effect this says that we are using the same linear space E in which to represent the tangent vectors at all points in U simultaneously. This is useful for calculations, but confusing conceptually, since U itself lies in (one copy of) E and so elements of U might be mistaken for tangent vectors lying in (the other copy of) E. It is important to try to avoid this confusion.

2. If S is a smooth k-dimensional submanifold of M then by taking the right charts for M at points on S we see that TS is a smooth 2k-dimensional submanifold of TM. For each p in S the tangent space $T_pS$ is a k-dimensional linear subspace of $T_pM$.
3. As an example of the above, consider $S^1$ in $R^2$. A path c in $S^1$ can be written in polar coordinates as $c(t) = (r(t), \theta(t))$, where of course r(t) = 1 since c is in $S^1$. The map $\sigma : TS^1 \to S^1 \times R$ taking [c] to $(c(0), \theta'(0))$ for each [c] in $TS^1$ is clearly a bijection, and by taking charts for $S^1$ it is easy to check that $\sigma$ is a diffeomorphism. Furthermore $\sigma$ takes each $T_pS^1$ to R (the second factor) by a linear isomorphism, so we usually write simply $TS^1 = S^1 \times R$. The bundle projection $\pi$ is just projection on the first factor.

4. $T(M \times N)$ can be regarded as $TM \times TN$ by working with the first and second 'coordinates' separately.

5. Since TM is itself a smooth manifold, it has its own tangent bundle T(TM) or $T^2M$, of dimension 4n if M has dimension n. Similarly we define $T^3M = T(T^2M)$, and so on. If M is of class $C^r$ then TM is $C^{r-1}$, and so the process stops at $T^rM$ which is a topological manifold but has no given differentiable structure and so no tangent bundle. Locally $T^2M$ looks like

$T(U \times E) = (U \times E) \times (E \times E)$

since $U \times E$ is an open subset of the linear space $E \times E$.

6. From Examples 3 and 4 it follows that

$T(S^1 \times S^1) = (S^1 \times R) \times (S^1 \times R) = (S^1 \times S^1) \times R^2$

on rearranging the factors. This expresses the fact that angular coordinates $\theta, \varphi$ on the two $S^1$ factors allow tangents to paths on $S^1 \times S^1$ to be described everywhere by the pair of numbers $(\theta'(t), \varphi'(t))$ in $R^2$.
7. The two-sphere $S^2$ does not have a system of angular coordinates (the Euler angles give charts that do not work at the north and south poles), and this is related to the fact that $TS^2$ cannot be identified with $S^2 \times R^2$. If it could be, then we would be able immediately to assign a non-zero tangent vector to each point of $S^2$ in a way that varied continuously over $S^2$ (take a fixed vector v in the second factor $R^2$). The celebrated Hairy Ball Theorem in topology asserts that this is impossible. We have here more than a mere topological curiosity, for we saw on page 131 that the manifold that we now recognize as $TS^2$ is the natural setting for the study of the motion of a spherical pendulum.

In fact $TS^n = S^n \times R^n$ only in the cases n = 1, 3 or 7 (or 0). This is a deep result, first established by J.F. Adams [3] in 1962. It is closely connected with the existence of complex numbers (dim 2), quaternions (dim 4), Cayley numbers (dim 8) but nothing analogous in higher dimensions. The proof involves much elaborate machinery from
algebraic topology.

The question of whether or not TM is the same as $M \times R^n$ is very important both on purely geometric grounds and from the point of view of applications. If $TM = M \times R^n$ then velocities of points moving on M (= tangents to paths) can all be represented in one n-dimensional coordinate space $R^n$, despite the fact that M itself may need several coordinate charts. However, if $TM \ne M \times R^n$ then we need separate coordinate charts for velocities as well. Of course if M is embedded as a submanifold in some $R^m$ then TM is a submanifold of $TR^m$ (Ex.2) which is $R^m \times R^m$ (Ex.1), and velocities on M can be measured in the second factor $R^m$. However, this introduces a lot of redundant extrinsic information about the way M lies in $R^m$, and is not helpful to an understanding of intrinsic processes on M itself.
A tangent bundle $TM$ which is really a product $M \times \mathbf{R}^n$ is called trivial. The precise definition, valid also in infinite dimensions, is the following:
DEFINITION If $M$ is a smooth manifold modelled on $E$ then the tangent bundle $TM$ is trivial if there exists a diffeomorphism $TM \to M \times E$ which takes each tangent space $T_pM$ by a linear isomorphism to the copy $\{p\} \times E$ of $E$ in $M \times E$. A manifold whose tangent bundle is trivial is called parallelizable.
Tangent maps for manifolds
By analogy with the local case (open set $U$ in $E$) we define the tangent map $Tf : TM \to TN$ of a smooth map $f : M \to N$ to be the map which is the linear map $T_pf$ on each tangent space $T_pM$. While $f$ takes points on $M$ to points on $N$, the map $Tf$ takes velocities on $M$ to velocities on $N$. If we take suitable charts $\varphi : U \to U'$, $\psi : V \to V'$ to look at a local representative $\psi f \varphi^{-1}$ for $f$, we see that $T(\psi f \varphi^{-1})$ is a local representative for $Tf$. Therefore $Tf$ is of class $C^{r-1}$ if $f$ is of class $C^r$.
There is a simple global Chain Rule for maps between manifolds: given $f : M \to N$ and $g : N \to P$ we have $g \cdot f : M \to P$, and since $[g \cdot f \cdot c] = Tg.Tf([c])$ we see as before that
$$T(g \cdot f) = Tg \cdot Tf.$$
Again, $T(\mathrm{id}_M) = \mathrm{id}_{TM}$ and it follows by the same formal argument as in the local case that if $f : M \to N$ is a diffeomorphism ($C^r$) then so is $Tf$ ($C^{r-1}$).
Since $Tf$ takes the tangent space of $M$ at $p$ to the tangent space of $N$ at $f(p)$, we could write
$$\pi_N \cdot Tf = f \cdot \pi_M : TM \to N$$
where $\pi_M$ denotes the projection $TM \to M$, and similarly for $\pi_N$. This is often conveniently represented by saying that the diagram
$$\begin{array}{ccc} TM & \xrightarrow{\;Tf\;} & TN \\ \downarrow{\scriptstyle \pi_M} & & \downarrow{\scriptstyle \pi_N} \\ M & \xrightarrow{\;\;f\;\;} & N \end{array}$$
commutes, meaning that you arrive at the same result whichever route you follow from top left to bottom right. This diagram conveys the idea of $Tf : TM \to TN$ being a map over $f$, carrying on top of $f$ information about the derivative of $f$.
EXAMPLES of tangent maps
1. It is worth emphasizing again that when $U, V$ are open sets in $E, F$ and $f : U \to V$ then $Tf : TU \to TV$ is the map
$$U \times E \to V \times F : (p,h) \mapsto (f(p), Df(p)h).$$
This is the form of all local representatives of $Tf : TM \to TN$. Remember that $Df$ makes no sense in the manifold case, although $Tf$ does. To carry out calculations (such as checking the maximal rank condition) we work with local representatives and $D$, but for general theorems we use $T$.
2. Define $f : S^1 \to S^1$ by $f(z) = z^q$ where $z$ is a complex number $e^{i\theta}$ and $q$ is an integer. Then (Example 3 above) $TS^1 = S^1 \times \mathbf{R}$ and here $Tf : S^1 \times \mathbf{R} \to S^1 \times \mathbf{R}$ is given by
$$Tf(z,h) = (z^q, qh).$$
This is because $f(e^{i\theta(t)}) = e^{iq\theta(t)}$ and so $f$ multiplies velocities of paths on $S^1$ by the factor $q$.
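The claim in Example 2 that $f(z) = z^q$ multiplies angular velocities by $q$ can be checked numerically. The following is only an illustrative sketch: the angle function `theta` and the helper names are hypothetical choices made for this check, and the velocity is estimated by a central finite difference rather than computed exactly.

```python
import cmath
import math

def f(z, q):
    """The map f(z) = z**q on the unit circle."""
    return z ** q

def angular_velocity(path, t, dt=1e-6):
    """Finite-difference estimate of d(theta)/dt for a path t -> e^{i theta(t)}."""
    # The quotient of two nearby points on the circle is e^{i * (change in theta)},
    # so its phase is the change in angle over the interval [t - dt, t + dt].
    ratio = path(t + dt) / path(t - dt)
    return cmath.phase(ratio) / (2 * dt)

def demo(q=3, t=0.4):
    """Return (velocity of a path c at time t, velocity of f . c at time t)."""
    theta = lambda s: 0.7 * s + 0.2 * math.sin(s)   # hypothetical angle function
    c = lambda s: cmath.exp(1j * theta(s))          # a path on S^1
    h = angular_velocity(c, t)
    h_image = angular_velocity(lambda s: f(c(s), q), t)
    return h, h_image
```

Calling `demo(q)` returns two numbers whose ratio should be close to `q`, in agreement with $Tf(z,h) = (z^q, qh)$.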
Transversality
Since the tangent space to $M$ at $p$ is a local linear version of $M$ near $p$, the geometry of configurations of linear spaces and subspaces can be used to model local geometry of configurations of manifolds and submanifolds. The development of this simple observation into a theory of transversality of intersection by Thom (see §4.7) has been of profound importance in many aspects of differential topology.
Two lines in $\mathbf{R}^3$ will in general not meet; two planes will in general meet in a line, and a line and a plane will meet in a point. The geometry extends easily to $\mathbf{R}^n$; two affine subspaces $K, L$ of dimensions $k, \ell$ will in general not meet unless $k + \ell \geq n$, and if the dimension condition is satisfied then $K$ and $L$ will in general meet in such a way that $\dim K \cap L = k + \ell - n$ and $K + L = \mathbf{R}^n$: this is easy to believe from considering low-dimensional cases that can be visualized. Applying this to tangent spaces of manifolds, we say that two submanifolds $K, L$ of a finite-dimensional manifold $M$ intersect transversally if for every point $p$ on $K \cap L$ we have
$$T_pK + T_pL = T_pM.$$
It is then not difficult to prove using the Inverse Function Theorem that $K \cap L$ is a $(k + \ell - n)$-dimensional submanifold of $M$. If $k + \ell = n$ then up to local diffeomorphism $M$ is expressible as the product $K \times L$ locally near $p$. If $k + \ell < n$ the transversality condition cannot be satisfied for any $p$ in $K \cap L$, so in this case we take transversality to mean $K \cap L$ is empty.
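For linear subspaces the condition $K + L = \mathbf{R}^n$ is a rank condition, and $\dim K \cap L = k + \ell - \dim(K + L)$. The sketch below tests this for subspaces given by spanning vectors; the function names and the use of exact rational arithmetic are choices made here for illustration, not part of the text.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of vectors, by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        # Find a row at or below position r with a non-zero entry in this column.
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # Clear this column in every other row.
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def intersection_data(K, L, n):
    """K, L: lists of spanning vectors of two linear subspaces of R^n."""
    k, l = rank(K), rank(L)
    s = rank(K + L)                      # dimension of the sum K + L
    return {"transversal": s == n, "dim_intersection": k + l - s}
```

Two coordinate planes in $\mathbf{R}^3$ come out transversal with a one-dimensional intersection, while two distinct lines through the origin do not span $\mathbf{R}^3$ and so are not transversal there.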
The idea becomes more useful if instead of two submanifolds we take one submanifold together with the image of another manifold under a smooth map into $M$. If $S$ is a submanifold of $M$ and $f : N \to M$ is a smooth map we look at points $p$ in $N$ and say that $f$ is transversal to $S$ at $p$ if either $f(p)$ does not lie in $S$ or, if it does, then
$$T_pf(T_pN) + T_{f(p)}S = T_{f(p)}M.$$
See Figure 34. It follows from the Inverse Function Theorem that locally near $p$ the inverse image $f^{-1}(S)$ is (if not empty) a smooth submanifold of $N$ of codimension equal to that of $S$ in $M$. We have already met this idea before in the case when $S$ is one point: then transversality means surjectivity of $T_pf$ and $f$ is locally a submersion (cf. page 153).
The map version of transversality becomes the submanifold version when $f$ is an embedding, and is very similar when $f$ is an injective immersion.
We shall use it in this context when studying dynamical systems in §4.4. Transversality makes sense in infinite dimensions, although to use any of the theory we have as usual to include a proviso about splitting. We might suppose that $T_{f(p)}M$ splits into a direct sum $T_{f(p)}S \oplus F$, say, and that $(T_pf)^{-1}(T_{f(p)}S)$ is a factor in a splitting of $T_pN$, although sometimes we can assume less than this. Transversality theory in function spaces is a very powerful tool in differential topology and global analysis. See Abraham and Robbin [2].
Returning to the original ideas of 'general position' in $\mathbf{R}^n$, we hope that in some sense a map $f : N \to M$ would in general be transversal to a given submanifold $S$ of $M$. This is the essence of Thom's Transversality Theorem (page 259). However, we cannot treat this here since we have not yet discussed a definition for 'in general'. We shall return to these questions in Chapter 4.
Riemannian structures
As remarked on page 168, the norm on $E$ is forgotten when $M$ is modelled locally as $U \times E$. Once the tangent bundle $TM$ has been constructed, however, it becomes useful to have a measure of length in each $T_pM$ not only so that paths in $M$ have speed (a number) as well as velocity (a tangent vector) but also in order to measure global contracting/expanding behaviour of dynamical systems: see §4.5. If $E$ is a Hilbert space $H$ (page 52), such as $\mathbf{R}^n$ for example, then it would be useful to be able to equip each $T_pM$ with an inner product $\langle\ ,\ \rangle_p$ which varies continuously or, better, smoothly with respect to $p$. By this we mean that $\langle X, Y \rangle_p$ should be a smooth function of $p$ for all smooth vector fields $X, Y$ defined in a neighbourhood of $p$. In fact it is always possible to do this, although certainly not in any unique way. The family of inner products is called a Riemannian structure (B. Riemann, 1826-1866); it is technically a section of a bundle of covariant tensors (see §4.6).
Given a Riemannian structure on $M$ we have on each $T_pM$ a norm $\|\cdot\|_p$ defined (as on $H$) by
$$\|v\|_p = \langle v, v \rangle_p^{1/2}$$
for $v \in T_pM$. With this norm we can construct a metric on $M$, using the obvious intuitive idea of the 'shortest distance between two points'. Given a path $\gamma : [0,1] \to M$ with $\gamma(0) = p$, $\gamma(1) = q$ we define the distance from $p$ to $q$ along $\gamma$ as
$$d_\gamma(p,q) = \int_0^1 \|\gamma'(t)\|_{\gamma(t)}\, dt.$$
Then taking
$$d(p,q) = \inf_\gamma d_\gamma(p,q)$$
gives a metric on $M$. For this reason the Riemannian structure is itself often called a Riemannian metric.
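As a small illustration of the path-length integral, the sketch below evaluates $d_\gamma$ numerically for a path on the two-sphere. It assumes the usual round metric written in spherical coordinates $(\theta, \varphi)$, namely $\|v\|^2 = \dot\theta^2 + \sin^2\theta\,\dot\varphi^2$; this choice of coordinates and metric, the midpoint quadrature rule and all function names are assumptions made for the example. Note that it computes the length along one given path only, which is an upper bound for $d(p,q)$, not the infimum itself.

```python
import math

def sphere_norm(point, velocity):
    """Norm of a tangent vector on S^2 in coordinates (theta, phi), round metric."""
    theta, _ = point
    dtheta, dphi = velocity
    return math.sqrt(dtheta ** 2 + (math.sin(theta) * dphi) ** 2)

def path_length(gamma, dgamma, norm, steps=2000):
    """Midpoint-rule approximation to the integral over [0, 1] of |gamma'(t)|."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += norm(gamma(t), dgamma(t))
    return total * h

# A circle of latitude at polar angle theta0, traversed once.
theta0 = math.pi / 3
latitude = lambda t: (theta0, 2 * math.pi * t)
latitude_velocity = lambda t: (0.0, 2 * math.pi)
length = path_length(latitude, latitude_velocity, sphere_norm)
```

For this circle of latitude the result should agree with the elementary value $2\pi \sin\theta_0$.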
3.5 VECTOR FIELDS AND DIFFERENTIAL EQUATIONS
Consider a system $S$ of first order autonomous (i.e. no explicit $t$ on the right hand side) ordinary differential equations
$$\begin{aligned} \dot x_1 &= X_1(x_1, x_2, \ldots, x_n) \\ \dot x_2 &= X_2(x_1, x_2, \ldots, x_n) \\ &\;\;\vdots \\ \dot x_n &= X_n(x_1, x_2, \ldots, x_n) \end{aligned}$$
where $\dot x$ denotes $\frac{d}{dt}x$. We can write the system more economically as
$$\dot x = X(x)$$
where $x$ lies in some open subset $U$ of $\mathbf{R}^n$ and $X$ is a map from $U$ to $\mathbf{R}^n$, which we do not assume to have any special properties at this stage. It is convenient, and in keeping with our general aim, to regard $t$ here as a measure of time. The set $U$ is called the phase space of $S$.
What does it mean to solve the equations?
First we have to postulate some initial conditions such as $x = p$ when $t = 0$, and then a solution means a path $c : J \to U$ satisfying $c(0) = p$ and
$$\dot c(t) = X(c(t))$$
for all $t$ in the interval $J$, where $\dot c(t)$ is what in §2.4 we called $c'(t) = Dc(t).1$.
We think of $X$ as prescribing a vector $X(x)$ at each point $x$ in $U$, and we want a path $c$ based at $p$ with the property that the tangent at a typical point $c(t)$, i.e. the velocity of that point, is precisely the vector $X(c(t))$. We should therefore think of $X(x)$ not as an element of $\mathbf{R}^n$ in which $x$ lies, but as an element of the tangent space $T_xU$. Then $X$ becomes a map
$$X : U \to TU = U \times \mathbf{R}^n$$
with $X(x)$ an element of $T_xU$ for every $x$ in $U$, or in other words $X$ satisfies
$$\pi \cdot X = \mathrm{id}_U$$
where $\pi : TU \to U$ is the natural projection map. Such a map is called a vector field on $U$. Formally, a solution $c$ then satisfies
$$X(c(t)) = (c(t), \dot c(t)),$$
where the seemingly redundant $c(t)$ in the first factor is there to keep track of the particular tangent space in which $X(c(t))$ lies. Since by definition $\dot c(t) = Dc(t).1$ we have in tangent map notation
$$X(c(t)) = (c(t), Dc(t).1) = Tc(t,1)$$
or $T_tc(1)$.
These definitions, in which we have disentangled the space in which $X(x)$ lies from that in which $x$ itself lies, generalize immediately to manifolds.
DEFINITION A vector field on a smooth manifold $M$ is a map $X : M \to TM$ which satisfies $\pi \cdot X = \mathrm{id}_M$, where $\pi$ is the natural projection $TM \to M$.
Thus $X(p)$ lies in $\pi^{-1}(p) = T_pM$ for every point $p$ on $M$, i.e. $X$ assigns a tangent vector (velocity) at $p$ to each point $p$. Such a map is called a section of the bundle $TM$. See Figure 35.
[Figure 35: the image of $X$]
The notion of a solution to the system $S$ now becomes:
DEFINITION Let $X$ be a vector field on $M$. A solution curve to $X$, based at $p$ on $M$, is a path $c : J \to M$ (where $J$ is some open interval $(a,b)$ with $a < 0 < b$) satisfying $c(0) = p$ and
$$T_tc(1) = X(c(t))$$
for all $t$ in $J$. (Here $TJ = J \times \mathbf{R}$ and so $T_tc$ is a map $\mathbf{R} \to TM$.)
Thus a vector field on a manifold is the global version of a first order autonomous system of $n$ ordinary differential equations on $\mathbf{R}^n$. Observe that the definition works equally well for smooth manifolds modelled on an infinite dimensional normed linear space $E$.
Higher order equations
There is a standard trick for converting an $n$th order equation in one variable into a system of $n$ first order equations in $n$ variables. Assuming the $n$th order equation is of the form
$$\frac{d^nx}{dt^n} = F\left(x, \frac{dx}{dt}, \ldots, \frac{d^{n-1}x}{dt^{n-1}}\right)$$
we write $x_1$ for $x$ and then
$$\dot x_1 = x_2, \quad \dot x_2 = x_3, \quad \ldots$$
and finally
$$\dot x_n = F(x_1, x_2, \ldots, x_n),$$
giving us a first order system on $\mathbf{R}^n$. The subset of $\mathbf{R}^n$ on which all this is defined (generally an open subset) is the phase space for the original equation.
In the case $n = 2$ we have
$$\dot x_1 = x_2, \qquad \dot x_2 = F(x_1, x_2)$$
and so from a second order equation on an open interval $U$ in $\mathbf{R}$ we obtain a first order equation (i.e. a vector field) on $U \times \mathbf{R} = TU$. Not every vector field $X$ on $TU$ comes from a second order equation on $U$: it is easy to see that a necessary and sufficient condition on $X$ for this is that the $x_1$-component of $X(x)$ in $T_x(TU) = \mathbf{R}^2$ should be equal to $x_2$. In tangent bundle language this says
$$T\pi \cdot X = \mathrm{id}_{TU}$$
where $\pi : TU \to U$ is the usual projection, $T\pi : T^2U \to TU$ is its tangent map, and $X : TU \to T^2U$ is the vector field on $TU$. This generalizes verbatim to manifolds:
DEFINITION A second order autonomous ordinary differential equation on $M$ is a vector field $X : TM \to T^2M$ on $TM$ which satisfies
$$T\pi \cdot X = \mathrm{id}_{TM}$$
where $\pi : TM \to M$ is the natural projection (so we have $T\pi : T^2M \to TM$).
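The reduction of a second order equation to a vector field on $TU$ can be sketched in a few lines. The code below builds the field $(x_1, x_2) \mapsto (x_2, F(x_1, x_2))$ from a given $F$ and integrates it with a classical fourth-order Runge-Kutta step; the integrator, the step counts and the function names are illustrative assumptions, not something prescribed by the text.

```python
import math

def second_order_to_field(F):
    """Turn x'' = F(x, x') into the vector field (x1, x2) -> (x2, F(x1, x2))."""
    def X(state):
        x1, x2 = state
        # The first component is always x2: this is the condition that the
        # field on TU comes from a second order equation on U.
        return (x2, F(x1, x2))
    return X

def rk4_step(X, state, dt):
    """One classical Runge-Kutta step for the autonomous system x' = X(x)."""
    def shifted(base, direction, scale):
        return tuple(b + scale * d for b, d in zip(base, direction))
    k1 = X(state)
    k2 = X(shifted(state, k1, dt / 2))
    k3 = X(shifted(state, k2, dt / 2))
    k4 = X(shifted(state, k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def integrate(X, state, t_end, steps):
    """Approximate the solution curve through `state` at time t_end."""
    dt = t_end / steps
    for _ in range(steps):
        state = rk4_step(X, state, dt)
    return state
```

For instance, `second_order_to_field(lambda x, v: -x)` gives the field for $\ddot x + x = 0$, and integrating from $(1, 0)$ up to $t = \pi$ should land close to $(-1, 0)$.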
Non-autonomous equations
A system of first order equations which contains $t$ explicitly on the right hand side can be regarded as a family $X_t$ of vector fields varying with $t$. This goes over directly to vector fields on manifolds. By introducing another variable $u = t$ we can interpret the family of vector fields $X_t$ as just one vector field $X$ on the product manifold $M \times \mathbf{R}$. At a point $(x,u)$ of $M \times \mathbf{R}$ the tangent space to $M \times \mathbf{R}$ is $T_xM \times T_u\mathbf{R} = T_xM \times \mathbf{R}$ and the component of $X(x,u)$ in the second factor is $1$ since $\dot u = 1$.
This technique is particularly useful when $X_t$ is periodic (of period 1, say) in $t$: in this case $X$ projects to a vector field on $M \times (\mathbf{R}/\mathbf{Z}) = M \times S^1$ giving rise to a flow with a cross-section: see §4.1.
Local existence and uniqueness of solutions
It is a standard and fundamental theorem that provided $X$ is sufficiently well-behaved local solutions to $X$ exist, are essentially unique and furthermore if $X$ is smooth they vary smoothly with respect to the initial conditions. We will give a careful statement of a form of this theorem. Note that since we are working locally we operate in a normed linear space $E$ rather than an arbitrary manifold. The theorem requires $E$ to be a Banach space, since the proof involves techniques of infinite iteration and needs good criteria for convergence of sequences.
THEOREM (Local existence and uniqueness of solutions of ordinary differential equations) Let $U$ be an open set in a Banach space $E$, and let $f : U \to E$ be a $C^r$ map with $r \geq 1$. Then given any point $p$ in $U$ there exists a positive number $\varepsilon$, and a neighbourhood $W$ of $p$ in $U$ such that for any $x$ in $W$ and any positive number $\delta \leq \varepsilon$ there exists a unique solution curve
$$c_x : (-\delta, \delta) \to U$$
to the equation $\dot x = f(x)$, satisfying $c_x(0) = x$. Moreover, if we define a map
$$\phi : W \times (-\delta, \delta) \to U$$
by $\phi(x,t) = c_x(t)$ then $\phi$ is a $C^r$ map. Thus solutions vary $C^r$ with respect to $t$ and the initial conditions.
Remarks
1. This may seem rather an elaborate formulation of a simple idea. However, the reason for introducing $W$ is that we want solutions to exist for $-\varepsilon < t < \varepsilon$ throughout a whole neighbourhood of $p$. We may not be able to use $U$ itself for this neighbourhood, since the nearer $x$ is to the edge of $U$ the shorter the positive (say) $t$-interval over which $c_x$ is defined may be. See Figure 36.
Also, we must formally allow the flexibility with $\delta$, since otherwise it is just conceivable that solutions $c_x^1, c_x^2$ could both exist with $c_x^i(0) = x$ $(i = 1,2)$ and be defined for, say, $-\varepsilon/2 < t < \varepsilon/2$ without contradicting the uniqueness statement for $\varepsilon$. (Of course these solutions would have to be non-extendible to solutions on $-\varepsilon < t < \varepsilon$, or $\varepsilon$-uniqueness would be contradicted.)
2. For proofs of the theorem see Dieudonné [32], Lang [67], or any of the many books on differential equations such as Arnol'd [11], Coddington and Levinson [30], Hartman [51], Lefschetz [69]. There is a neat proof using the Inverse Function Theorem directly, due to Robbin [104]. Most proofs begin by supposing $X$ only to be $C^1$ or satisfying an even weaker Lipschitz condition (R. Lipschitz, 1832-1903), and use induction to prove the theorem in the $C^r$ case. Moreover, proofs are usually given for the non-autonomous case, where $f$ is a function of both $x$ and $t$ and the smoothness or Lipschitz conditions apply to $x$ but not necessarily to $t$. However, we shall always be assuming smoothness (usually $C^\infty$) in both variables and accordingly reduce the problem to the autonomous case as described on page 183.
We now want to extend this local existence and uniqueness theorem to a global theorem by continuing our local solution curves all over the manifold $M$ as far as they will go, assuming $X$ to be $C^r$ $(r \geq 1)$ so that local solutions do exist. The key result enabling us to do this is the following lemma which says that any two solution curves based at $p$ will agree on any interval on which they are both defined.
LEMMA Let $c_1 : J_1 \to M$ and $c_2 : J_2 \to M$ be two solution curves for $X$ both based at the point $p$ on $M$. Then $c_1(t) = c_2(t)$ for all $t$ where $c_1$ and $c_2$ are both defined, i.e. on the intersection $J_1 \cap J_2$.
Proof This may seem obvious, but it is instructive to see how several topological notions from Chapter 1 are involved in formalizing this intuition. Let $I$ be the subset of $J_1 \cap J_2$ on which $c_1$ and $c_2$ agree; we aim to prove $I = J_1 \cap J_2$. Note the following facts:
(1) $I$ is not empty (it contains $0$);
(2) $I$ is open in $J_1 \cap J_2$ (the local uniqueness guarantees this);
(3) $I$ is closed in $J_1 \cap J_2$. (If $t$ is not in $I$ then $c_1(t) \neq c_2(t)$ and since $M$ is Hausdorff we can find neighbourhoods of $c_1(t)$, $c_2(t)$ which do not meet. The continuity of $c_1, c_2$ implies that $c_1(s) \neq c_2(s)$ for all $s$ within some distance of $t$; thus the complement of $I$ is open, i.e. $I$ is closed.)
Now $J_1 \cap J_2$ is an interval, and is therefore connected (see §1.7). Hence (1), (2) and (3) imply that $I = J_1 \cap J_2$, proving the lemma.
Finally, we construct a maximal solution curve $c : J \to M$ (i.e. one which cannot be extended over any larger $t$-interval containing $J$) as follows. Let $\{c_\lambda : J_\lambda \to M\}$ be the family of all solution curves based at $p$. Let $J$ be the union of all the various intervals $J_\lambda$, and define $c : J \to M$ to be equal to $c_\lambda$ on each $J_\lambda$. Then $J$ is an interval (finite or infinite) and the above lemma shows that the definition of $c$ is unambiguous. Clearly $c$ is the unique maximal solution curve for $X$ based at $p$. We call it the global solution curve through $p$.
Global solutions need not be defined for all time $t$:
EXAMPLES
1. The equation $\dot x = 1$ has global solution through $p$ given by $c(t) = p + t$. If the equation is defined on the open interval $M = (-1,1)$ in $\mathbf{R}$, then each solution curve is defined only for a finite time.
2. Let $M = \mathbf{R}$ and consider the equation $\dot x = x^2$. The global solution through $p$ is given by $c(t) = (p^{-1} - t)^{-1}$, but when $p > 0$ this solution is defined only for $-\infty < t < p^{-1}$.
The problems are that (1) $c(t)$ may reach the 'edge' of $M$ in a finite time, or (2) $c(t)$ can shoot off to infinity while $t$ approaches some finite number. However, if $M$ is compact there is no edge (our manifolds here have no boundary) and no way of shooting off to infinity, and so we would expect global solutions to be defined in this case for all $t$. The optimism is justified.
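Example 2 can be explored numerically. The sketch below implements the closed-form solution $c(t) = (p^{-1} - t)^{-1}$ of $\dot x = x^2$, checks by a central finite difference that it satisfies the equation, and lets one watch the values grow as $t$ approaches $p^{-1}$. The function names and the finite-difference check are illustrative choices only.

```python
def c(p, t):
    """Closed-form solution of x' = x**2 with c(0) = p (p non-zero).

    For p > 0 the formula is only meaningful for t < 1/p, where the
    solution escapes to infinity.
    """
    return 1.0 / (1.0 / p - t)

def residual(p, t, dt=1e-6):
    """Finite-difference estimate of c'(t) - c(t)**2, which should be near 0."""
    derivative = (c(p, t + dt) - c(p, t - dt)) / (2 * dt)
    return derivative - c(p, t) ** 2

# Values of the solution through p = 2 as t approaches the escape time 1/p = 0.5.
samples = [c(2.0, t) for t in (0.0, 0.4, 0.49, 0.499, 0.4999)]
```

The entries of `samples` increase without bound as $t$ approaches $0.5$, which is case (2) of the discussion above: the solution shoots off to infinity in finite time.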
THEOREM If $M$ is compact then each global solution curve for $X$ is defined for all $t$ in $\mathbf{R}$.
Proof Let $c : J \to M$ be the global solution curve through $p$, and suppose $J = (a,b)$. We shall show $b = \infty$; the argument for $a = -\infty$ is similar.
If $b < \infty$ we could find $t_1, t_2, \ldots$ in $J$ with $t_n \to b$ as $n \to \infty$ (e.g. $t_n = b - \frac{1}{n}$). Look at the points $c(t_1), c(t_2), \ldots$ in $M$. If $c(t_m) = c(t_n)$ for some $m, n$ with $m \neq n$, then local uniqueness and the above lemma show that $c$ is periodic, with $c(t) = c(t + t_n - t_m)$ valid for all $t$ in $\mathbf{R}$, contradicting $b < \infty$. Hence all the $c(t_n)$ are distinct. By compactness of $M$ there exists an accumulation point $x_*$ of the $c(t_n)$'s (see §1.6). Now the local existence and uniqueness theorem shows that unique solutions exist based at any $x$ in some neighbourhood $W$ of $x_*$ and defined for $-\varepsilon < t < \varepsilon$, where $\varepsilon$ depends on $W$ but not on $x$. Now choose $t_n$ such that $c(t_n)$ is in $W$ and $b - t_n$ is less than $\varepsilon$ (possible by continuity of $c$). By local uniqueness and the lemma above we then have $c$ extendible to a solution on $(a, t_n + \varepsilon)$, and since $t_n + \varepsilon > b$ we have contradicted the maximality of $J$. Therefore $b = \infty$, as originally claimed.
Thus if $M$ is compact (and often when it is not) we can define
$$\phi(p,t) = c_p(t)$$
where $c_p$ is the global solution for $X$ through $p$, and thereby obtain a map
$$\phi : M \times \mathbf{R} \to M.$$
This map, representing all the solutions for $X$ simultaneously, has a number of interesting properties which will be of central importance to what follows. First of all, $\phi$ can be shown to be $C^r$ if $X$ is $C^r$, by piecing together the local information from the local existence and uniqueness theorem.
Secondly, if we write $\phi_t(p) = c_p(t)$ then $\phi_t : M \to M$ satisfies
$$\phi_t \cdot \phi_s = \phi_{t+s} : M \to M$$
for all $s, t$. This essentially says that uniqueness of solutions gives $\phi_{t+s}(p) = \phi_t(\phi_s(p))$ as the unique global solution based at $c_p(s)$. Also
$$\phi_0 = \mathrm{id}_M$$
since $\phi_0(p) = c_p(0) = p$, so because
$$\phi_t \cdot \phi_{-t} = \phi_{-t} \cdot \phi_t = \phi_0 = \mathrm{id}_M$$
it immediately follows that $\phi_t$ is a diffeomorphism of $M$: the inverse is $\phi_{-t}$.
Such a map $\phi : M \times \mathbf{R} \to M$ is called a flow on $M$. This is the mathematical object which we shall use to model the general concept of a dynamical system governed by ordinary differential equations that we have been pursuing throughout the first three chapters of this book. The qualitative theory of dynamical systems becomes the theory of the structure of flows on manifolds, and this will occupy us for the whole of the fourth chapter.
Remarks on the literature.
Over the last decade or so there has been a slow but steady stream of books becoming available on differential topology. In rough chronological order (relevant since language and aims of exposition tend to change) there are Lang [66], Munkres [86], Auslander and Mackenzie [13], Bishop and Crittenden [15], Spivak [128], Milnor [83], Wallace [145], Narasimhan [87], Spivak [129], Brickell and Clark [20], Lang [68], Golubitsky and Guillemin [42], Guillemin and Pollack [48], Hirsch [52], Dodson and Poston [33]. See also survey articles by Chillingworth [25], Stamm [130]. The formal theory of differentiable manifolds is recorded in Bourbaki [18]. The series of volumes [129] by Spivak provide a very thorough and lively account of manifolds and differential geometry, although unfortunately they are not always easy to obtain.
For Morse theory see Milnor [82] or Wallace [145]; the excellent article by Palais [93] is also strongly recommended. Lie groups feature in some books on differential topology: see Bishop and Crittenden, Brickell and Clark, or Spivak [129]. The classic text is Chevalley [24]; other treatments are Hochschild [56] or the concise and very clear lecture notes by Adams [4]. A standard reference for theorems on the orbit structures of Lie group actions is Borel [17]. Some references for algebraic topology are Agoston [5], Hocking and Young [57], Maunder [79].
For the basic theory of differential equations refer to any of the texts quoted on page 185, and also the short introduction to some qualitative aspects of the theory by Hurewicz [59]. The books by Arnol'd [11] and Hirsch and Smale [55] are both written very much from the qualitative point of view, and contain large numbers of interesting examples, applications and fresh ideas: they would provide a stimulating accompaniment to these notes.
4 Qualitative theory of dynamical systems
4.1 FLOWS AND DIFFEOMORPHISMS
We saw in the previous chapter that the solutions of a dynamical system (differential equation) on a compact manifold $M$ give a flow $\phi$ on $M$. Writing $\mathrm{Diff}^r(M)$ for the group of $C^r$ diffeomorphisms of $M$, the algebraic properties $\phi_t \cdot \phi_s = \phi_{t+s}$, $\phi_0 = \mathrm{id}_M$ say that the map $\phi : \mathbf{R} \to \mathrm{Diff}^r(M)$, $t \mapsto \phi_t$, is a group homomorphism, or in the language of Lie groups (see §3.2 [12]) that the flow is an action of $\mathbf{R}$ on the manifold $M$. The images of the solution curves of the vector field are the orbits of this action.
If we fix a value $t$ and consider positive and negative iterations
$$\ldots, \phi_{-2t}, \phi_{-t}, \phi_0, \phi_t, \phi_{2t}, \ldots$$
(remember $\phi_{nt} = (\phi_t)^n$) we have a map $\mathbf{Z} \to \mathrm{Diff}^r(M) : n \mapsto \phi_{nt}$ which is an action of the group $\mathbf{Z}$ on $M$. This action is a kind of discrete approximation to the R-action $\phi$. More generally, any diffeomorphism $f$ of $M$ generates an action $n \mapsto f^n$ of $\mathbf{Z}$ on $M$. We call this the Z-action generated by $f$, the orbit of $x$ meaning $\{f^n(x)\}$ for all $n = 0, \pm 1, \pm 2, \ldots$. It does not follow that $f$ need come from a flow: see below.
In both the flow and diffeomorphism case the total configuration of all the orbits is called the phase portrait. It is this which we study when trying to understand the global qualitative behaviour of a dynamical system.
Remarks
1. One approach to the problem of classifying phase portraits is to take the quotient space (§1.3) of $M$ by the equivalence relation 'lying in the same orbit', so that orbits are then represented by points in the orbit space $M/\mathbf{R}$. However, this leads in general to such complications that it hardly sheds any light on the problem. One difficulty is that $M/\mathbf{R}$ may easily be non-Hausdorff. Try to imagine, for example, the orbit space for an irrational flow on the torus (Example 6 below), or even for an innocent linear system in $\mathbf{R}^2$ such as $\dot x_1 = x_1$, $\dot x_2 = -x_2$: see Figure 44.
2. In the same way that a flow represents the evolution of a system governed by ordinary differential equations, a diffeomorphism represents a difference equation of the form
$$x_{m+1} = f(x_m)$$
where each $x_m$ is a point on $M$. The sequence of points $x_0, x_1, x_2, \ldots$ is the forward orbit of $x_0$ under $f$, and by including $x_{-1} = f^{-1}(x_0)$ and so on we obtain the whole orbit. Therefore the theory of diffeomorphisms as Z-actions is the global qualitative theory of certain types of difference equations. For many types of difference equation the map $f$ may not be invertible, however, and so we no longer have a Z-action. Nevertheless, we could still study the behaviour of an arbitrary smooth map $f : M \to M$ under forward iterations. Although there are some interesting results along these lines, the global theory remains relatively unexplored.
3. We could drop differentiability from the discussions, and consider continuous actions where each $\phi_t$ or $f$ is merely a homeomorphism: then we need not work with manifolds, but any topological spaces we liked. The theory of such actions (often called continuous flows, when $\phi_t$ varies continuously with $t$, or discrete flows, respectively) is called topological dynamics. Many useful results about flows can be proved in this generality, but differentiability yields much stronger results. We shall rely heavily on differentiability, which can usually be assumed in practical applications.
The relationships between R-actions and Z-actions
There are two important ways in which R-actions give rise to Z-actions.
1. A choice of $t$ gives the Z-action $n \mapsto \phi_{nt}$ from the R-action $\phi$, as described above.
2. Let $\phi$ be a flow on $M$ with corresponding vector field $X$, and suppose that $\Sigma$ is a codimension-1 submanifold of $M$ with the following properties:
(a) every orbit meets $\Sigma$ for arbitrarily large positive and negative times $t$;
(b) if $p$ lies on $\Sigma$ then $X(p)$ is not tangent to $\Sigma$, i.e. $X(p)$ does not lie in $T_p\Sigma$. Thus orbits intersect $\Sigma$ transversally.
Then $\Sigma$ is called a cross-section for the flow $\phi$. It is not hard to prove that $\Sigma$ induces a Z-action on $\Sigma$ generated by $f : \Sigma \to \Sigma$, where for $p$ on $\Sigma$ we define $f(p)$ to be $\phi_{t_p}(p)$ when $t_p$ is the least positive $t$ for which $\phi_t(p)$ again lies in $\Sigma$. In general $t_p$ depends on $p$. We call $f$ the first-return map or Poincaré map of the flow $\phi$ for $\Sigma$. See Figure 37. The flow on $M \times S^1$ arising from a periodic non-autonomous system on $M$ (page 183) has a cross-section $\Sigma = M \times \{\theta\}$ for any fixed $\theta \in S^1$.
[Figure 37]
Conversely, a Z-action gives rise to an R-action in the following way. Let the Z-action be generated by $f$ in $\mathrm{Diff}^r(M)$. Take the product $M \times I$ (where $I$ is the closed interval $[0,1]$) and take a unit vector field $X$ pointing in the $I$-direction, thinking of $M \times I$ as contained in $M \times \mathbf{R}$. Now glue the two ends of $M \times I$ together, attaching the point $(x,0)$ at the 0-end to the point $(f(x),1)$ at the 1-end. All this can be made respectable, using the idea of quotient topology (§1.3): compare the mathematical description of the Möbius band in §3.2. See Figure 38.
[Figure 38]
We obtain a vector field on a smooth manifold of dimension $n + 1$ (if $\dim M = n$), and the corresponding flow, called the suspension of $f$, has $M$ as a cross-section.
Thus R- and Z-actions can be associated to each other in ways which reflect their orbit-structures. This association has serious limitations, though:
1. Not every flow is the suspension of a diffeomorphism, since for example a suspension has no points where the vector field vanishes. This also implies that not every flow has a cross-section.
2. Not every diffeomorphism can be incorporated into a flow on
the same manifold (i.e. as
t
for some
t) .
Clearly a
diffeomorphism which is not deformable continuously to the identity cannot be not enough: by
f(e
are as
i0
)
t
let
f
= p*^
(e
for any
: i0
t.
Deformability is
still
be a diffeomorphism defined ),
where
illustrated in Figure 39,
\p
is a flow whose orbits and
p
is rotation by
180°
.
195
q Then If
f(p) f = ,
f
2 (p)
= P,
f
(q) = 9*
then the whole circle must
would not be periodic) which would
of
period
(r)
f r.
and
for any
2
= p,
f(q)
Figure 39
2t
:
see below.
Therefore
f
But this
is not of the
t.
it is clear that the study of flows and the study of iterated diffeomorphisms are very closely related, and we shall pursue them both in parallel. For flows we tend to write $\phi_t(p)$ as $\phi_tp$.
Important features
1. A fixed point of a flow $\phi$ is a point $p$ for which $\phi_tp = p$ for all $t$, i.e. for which the vector field vanishes: $X(p) = 0$. Thus it is also called a zero (or, misleadingly, a singularity or critical point) of the vector field. It corresponds to an equilibrium state of the dynamical system being modelled.
A fixed point of a diffeomorphism $f$ is a point $p$ for which $f(p) = p$, and hence $f^n(p) = p$ for all integers $n$.
2. A periodic point of a flow $\phi$ is a point $q$ for which there exists some $T > 0$ with $\phi_Tq = q$. The least such $T$ is called the period of $q$. If $r = \phi_sq$ is any point on the orbit of $q$ then
$$\phi_Tr = \phi_{T+s}q = \phi_{s+T}q = \phi_sq = r$$
so $r$ is also a periodic point with the same period as $q$. Therefore we describe the orbit of $q$ as a periodic orbit of period $T$.
A periodic point of a diffeomorphism $f$ is a point $q$ such that $f^m(q) = q$ for some $m > 0$ (or $m > 1$ if we want to distinguish periodic points from fixed points). The least such $m$ is the period of $q$.
Every fixed or periodic point of $f$ gives rise to a periodic orbit of the suspension flow; conversely, every periodic orbit of a flow with a cross-section $\Sigma$ gives rise to a fixed or periodic point of the first-return map for $\Sigma$. This fact that the phenomenon of periodicity is preserved when taking suspensions and cross-sections is one of the justifications for studying diffeomorphisms in order to understand flows.
Fixed points and periodic orbits correspond to 'observable' phenomena in a dynamical system, since they represent states which are unchanging or which repeat themselves. However, points whose orbits return infinitely often to any arbitrarily small neighbourhood of them, or even points $p$ which have points $q$ arbitrarily close to $p$ which return infinitely often (for times $t \to \infty$) to any neighbourhood of $p$, may also be (since we can only measure to within a certain finite accuracy) associated to some kind of observable phenomenon although it could have a rather chaotic or turbulent character. Recurrent behaviour of this type is called non-wandering behaviour. A point $p$ is wandering if it has a neighbourhood which never returns to intersect itself, and a point which is not wandering is non-wandering. Formally, we express this as follows:
DEFINITION A point $p$ on $M$ is a non-wandering point for the flow $\phi$ if, given any neighbourhood $U$ of $p$, there exist arbitrarily large $t > 0$ for which $U \cap \phi_tU$ is non-empty. Similarly for a diffeomorphism $f$, replacing $\phi_t$ with $f^n$ $(n > 0)$.
Remarks
1. The set of non-wandering points is denoted by $\Omega(\phi)$, $\Omega(f)$ or simply $\Omega$.
2. Let $V = \phi_sU$. Then if $U \cap \phi_tU$ is nonempty it follows that $V \cap \phi_tV = \phi_s(U \cap \phi_tU)$ is also nonempty. From this it is easy to deduce
(i) if $x$ is in $\Omega(\phi)$ then the whole orbit of $x$ is in $\Omega(\phi)$;
(ii) using $t < 0$ would give the same definition of $\Omega(\phi)$ (put $s = -t$ in the above).
Result (ii) shows that if $\phi'$ is the flow obtained from $-X$ then $\Omega(\phi') = \Omega(\phi)$. Similar remarks apply to $\Omega(f)$. In particular, $\Omega(f) = \Omega(f^{-1})$.
3. Clearly fixed points and periodic orbits all lie in $\Omega$.
4. The set of wandering points is obviously an open set, and so $\Omega$ is a closed set. If $M$ happens to be compact this will imply that $\Omega$ is compact. The fact that $\Omega$ is closed means that it must contain at least the closure of the set of fixed and periodic points.
5. If a point is wandering it corresponds to transient behaviour in a dynamical system, and so to understand the long-term behaviour we need to know where (if anywhere) the point can be said to approach asymptotically as $t \to +\infty$. For any point $x$ we say that another point $y$ is an $\omega$-limit point of (the orbit of) $x$ if there are points $\phi_{t_1}x, \phi_{t_2}x, \ldots$ on the orbit of $x$ with $t_i \to +\infty$ and $\phi_{t_i}x \to y$ as $i \to \infty$. This does not necessarily imply that $\phi_tx \to y$ as $t \to +\infty$: see examples below. The set of all such $y$ for given $x$ is called the $\omega$-limit set of $x$, written $\omega(x)$. Replacing $t \to +\infty$ by $t \to -\infty$ gives the $\alpha$-limit set $\alpha(x)$. Replacing $t_i$ by $n_i$ gives a corresponding definition of $\alpha$ and $\omega$ for diffeomorphisms.
A straightforward argument shows that $\omega(x)$ and $\alpha(x)$ are both subsets of $\Omega$, for every $x$. If $x$ is fixed or periodic then $\alpha(x) = \omega(x) =$ orbit of $x$.
EXAMPLES of flows
1. Simple harmonic oscillator
\[ \ddot{x} + k^2 x = 0 \]
We convert this second order equation on $\mathbf{R}$ into a first order system on $T\mathbf{R} = \mathbf{R} \times \mathbf{R} = \mathbf{R}^2$ by setting $\dot{x} = y$ and obtaining
\[ \dot{x} = y, \quad \dot{y} = -k^2 x . \]
The equation is easy to solve explicitly, and a formula for the flow $\phi : \mathbf{R}^2 \times \mathbf{R} \to \mathbf{R}^2$ can be worked out as
\[ \phi(x,y;t) = (x \cos kt + k^{-1} y \sin kt,\; -kx \sin kt + y \cos kt) . \]
(It is instructive to verify that this is a flow.) The orbits are ellipses $k^2x^2 + y^2 =$ constant, except for $(0,0)$ which is the unique fixed point. Every non-fixed point is periodic, so every point is non-wandering: $\Omega = \mathbf{R}^2$.
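The verification suggested in Example 1 can also be spot-checked numerically. The following is a minimal sketch, not part of the original text: the function names, the value $k = 2$ and the sample point are illustrative assumptions. It evaluates the closed formula for the harmonic oscillator flow and checks that $\phi_0$ is the identity, that $\phi_s\phi_t = \phi_{s+t}$, that $k^2x^2 + y^2$ is constant along an orbit, and that every orbit closes up after time $2\pi/k$.

```python
import math

K = 2.0  # hypothetical value of the constant k, chosen only for illustration

def sho_flow(x, y, t, k=K):
    """Closed formula for the flow of x' = y, y' = -k^2 x at time t."""
    c, s = math.cos(k * t), math.sin(k * t)
    return (x * c + y * s / k, -k * x * s + y * c)

def close(p, q, tol=1e-9):
    """Componentwise comparison of two points of the plane."""
    return all(abs(a - b) <= tol for a, b in zip(p, q))

def invariant(p, k=K):
    """The quantity k^2 x^2 + y^2, constant on each orbit."""
    return k * k * p[0] ** 2 + p[1] ** 2

p = (0.7, -1.3)   # arbitrary sample point
s, t = 0.4, 1.1   # arbitrary sample times

identity_ok = close(sho_flow(*p, 0.0), p)                            # phi_0 = id
group_ok = close(sho_flow(*sho_flow(*p, t), s), sho_flow(*p, s + t))  # phi_s phi_t = phi_{s+t}
invariant_ok = abs(invariant(sho_flow(*p, t)) - invariant(p)) < 1e-9  # orbit stays on its ellipse
periodic_ok = close(sho_flow(*p, 2 * math.pi / K), p)                 # period 2*pi/k

print(identity_ok, group_ok, invariant_ok, periodic_ok)
```

A numerical check at a few sample points is of course no substitute for the algebraic verification, but it catches sign errors in the formula quickly.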
2. Damped simple harmonic oscillator
\[ \ddot{x} + 2b\dot{x} + k^2 x = 0 \qquad (b > 0) \]
Here
\[ \dot{x} = y, \quad \dot{y} = -2by - k^2 x \]
and without working out the equations for the flow we know that the vectors now have an 'inwards' component and so we deduce on qualitative grounds alone that all orbits spiral in towards the origin as $t \to +\infty$. Hence the only $\omega$-limit set is the origin. Also, every $\alpha(p)$ is empty except that $\alpha(0,0) = (0,0)$. Thus here $\Omega$ consists of the origin and nothing else.
This and the previous example are both linear systems, meaning that the map $\mathbf{R}^2 \to \mathbf{R}^2 : (x,y) \mapsto (\dot{x},\dot{y})$ is a linear map. For pictures of the orbit structures of these and other linear systems in $\mathbf{R}^2$ see Figure 44.
3. Simple pendulum
\[ m\ell\ddot{\theta} + mg \sin\theta = 0 \]
Here $\theta$ represents a point on $S^1$. Take $\ell = g = 1$, and convert this second order equation on $S^1$ into a first order equation on $TS^1 = S^1 \times \mathbf{R}$ by putting $\dot{\theta} = v$ to obtain
\[ \dot{\theta} = v, \quad \dot{v} = -\sin\theta . \]
When $S^1 \times \mathbf{R}$ is 'unrolled' the orbits of the flow are as shown in Figure 40. Each orbit lies in some subset $E^{-1}(c)$ where $E : TS^1 \to \mathbf{R}$ is the energy map
\[ (\theta,v) \mapsto \tfrac{1}{2}v^2 - \cos\theta \]
(this simply says that energy is conserved): see Figure 41. Here $\Omega$ is all of $TS^1$.
Figure 41
Observe that $E$ has two critical points $(0,0)$ and $(\pi,0)$; they are both non-degenerate (minimum and saddle-point, respectively). For every regular value $c$ of $E$ the inverse image $E^{-1}(c)$ is a codimension-1 submanifold of $TS^1$ (see §3.3, page 157). It is now a very instructive exercise to visualize what happens to the orbits when a damping term is added, turning the equation into
\[ \ddot{\theta} + b\dot{\theta} + \sin\theta = 0 \]
and corresponding to a loss of energy along each non-fixed orbit.
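The exercise on the damped pendulum can be explored numerically. The sketch below is an illustration only: the Runge-Kutta integrator, the damping value $b = 0.3$, the step size and the starting point are all assumptions made for this example and do not come from the text. It tracks the energy $E(\theta,v) = \frac{1}{2}v^2 - \cos\theta$ along one orbit with and without damping.

```python
import math

def pendulum_field(theta, v, b=0.0):
    """Right-hand side of theta' = v, v' = -sin(theta) - b*v (b = 0 means no damping)."""
    return (v, -math.sin(theta) - b * v)

def rk4_step(theta, v, dt, b=0.0):
    """One classical fourth-order Runge-Kutta step of size dt."""
    k1 = pendulum_field(theta, v, b)
    k2 = pendulum_field(theta + 0.5 * dt * k1[0], v + 0.5 * dt * k1[1], b)
    k3 = pendulum_field(theta + 0.5 * dt * k2[0], v + 0.5 * dt * k2[1], b)
    k4 = pendulum_field(theta + dt * k3[0], v + dt * k3[1], b)
    theta += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
    v += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return theta, v

def energy(theta, v):
    """The energy map E(theta, v) = v^2/2 - cos(theta)."""
    return 0.5 * v * v - math.cos(theta)

def energies_along_orbit(theta, v, b, dt=0.01, steps=2000):
    """Energy values sampled along the numerically integrated orbit."""
    values = [energy(theta, v)]
    for _ in range(steps):
        theta, v = rk4_step(theta, v, dt, b)
        values.append(energy(theta, v))
    return values

undamped = energies_along_orbit(1.0, 0.5, b=0.0)
damped = energies_along_orbit(1.0, 0.5, b=0.3)   # b = 0.3 is a hypothetical damping value

drift = max(abs(e - undamped[0]) for e in undamped)
print("largest energy drift without damping:", drift)
print("energy with damping, start and end:", damped[0], damped[-1])
```

Without damping the computed energy stays constant up to integration error, so the orbit remains on one level set $E^{-1}(c)$; with damping the energy falls towards the minimum value $-1$ attained at $(0,0)$.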
4. Gradient flows on $\mathbf{R}^n$
If $X$ is a vector field on $\mathbf{R}^n$ corresponding to equations of the form
\[ \dot{x}_1 = \frac{\partial V}{\partial x_1}, \quad \dot{x}_2 = \frac{\partial V}{\partial x_2}, \quad \ldots, \quad \dot{x}_n = \frac{\partial V}{\partial x_n} \]
where $V : \mathbf{R}^n \to \mathbf{R}$ is some smooth function, then $X$ is called a gradient vector field. It is easy to show that $V$ can never decrease along solution curves, since
\[ \frac{d}{dt}V(c(t)) = DV(c(t)).\dot{c}(t) = \sum_{i=1}^{n} \frac{\partial V}{\partial x_i}\,\dot{c}_i(t) = \sum_{i=1}^{n} (\dot{c}_i(t))^2 \geq 0 . \]
In fact $\frac{d}{dt}V(c(t)) > 0$ unless $\dot{c}(t) = 0$, in which case (by uniqueness of solutions) $c(t) =$ constant. From this it follows that if solutions are defined for all $t$, so that $X$ gives rise to a flow $\phi$ on $\mathbf{R}^n$, the only non-wandering points are fixed points - i.e. there can be no periodic or other recurrence phenomena: this is because once the point $c(t) = \phi_t(p)$ has left $p = c(0)$ the value of $V$ must have increased, and so by the continuity of $V$ there will be some neighbourhood of $p$ into which $\phi_t(p)$ can never return.
This relatively simple structure of $\Omega$ for gradient flows on $\mathbf{R}^n$ allows their qualitative behaviour to be analyzed quite thoroughly. We shall return to this important fact and its implications later, in the wider context of manifolds.
5. The van der Pol oscillator
\[ \ddot{x} - \alpha(1-x^2)\dot{x} + x = 0, \qquad \alpha > 0 \]
Converting this as usual into a first order system, we have
\[ \dot{x} = y, \quad \dot{y} = \alpha(1-x^2)y - x \]
which is a non-linear system exhibiting a particularly important type of behaviour. The origin is clearly a fixed point, and local analysis there (see §4.2) shows that solution curves are spiralling away from the origin. However, they do not spiral out to infinity but approach a periodic orbit $\gamma$ which surrounds the origin. Outside $\gamma$ the solution curves spiral in towards $\gamma$, and $\gamma$ is therefore called a limit cycle. It is the $\omega$-limit set of every point except the origin; here $\Omega = \gamma \cup \{\text{origin}\}$. See Figure 42.
Figure 42
The physical interpretation of the picture is that, no matter what the initial conditions are apart from $(0,0)$, the system will eventually (or possibly very quickly) be behaving in a way which is practically indistinguishable from the periodic behaviour of points on $\gamma$. Indeed, even if it is initially at $(0,0)$, the slightest perturbation can cause it to rocket out towards $\gamma$.
It is surprisingly difficult to prove rigorously the existence of the single limit cycle $\gamma$ for this system, although it is easy enough to construct artificially other examples of systems exhibiting limit cycles. For example [...] has the circle $r = 1$ as a limit cycle. The van der Pol equation is historically important as the basic equation for the oscillation of a radio valve.
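The attracting nature of the van der Pol limit cycle can be observed numerically, although this is of course no proof of its existence. The sketch below is an illustration under stated assumptions: the value $\alpha = 1$, the Runge-Kutta integrator, the two starting points and the lengths of the transient and measuring windows are all choices made for this example. It starts one orbit close to the origin and one far outside, and compares the late-time amplitude of $x$ along each.

```python
def vdp_field(x, y, alpha):
    """Right-hand side of x' = y, y' = alpha*(1 - x^2)*y - x."""
    return (y, alpha * (1.0 - x * x) * y - x)

def rk4_step(x, y, dt, alpha):
    """One classical fourth-order Runge-Kutta step of size dt."""
    k1 = vdp_field(x, y, alpha)
    k2 = vdp_field(x + 0.5 * dt * k1[0], y + 0.5 * dt * k1[1], alpha)
    k3 = vdp_field(x + 0.5 * dt * k2[0], y + 0.5 * dt * k2[1], alpha)
    k4 = vdp_field(x + dt * k3[0], y + dt * k3[1], alpha)
    x += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
    y += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return x, y

def late_amplitude(x, y, alpha=1.0, dt=0.01, transient_steps=6000, measure_steps=2000):
    """Largest |x| seen in a late time window, after the transient has been discarded."""
    for _ in range(transient_steps):
        x, y = rk4_step(x, y, dt, alpha)
    amplitude = 0.0
    for _ in range(measure_steps):
        x, y = rk4_step(x, y, dt, alpha)
        amplitude = max(amplitude, abs(x))
    return amplitude

inner = late_amplitude(0.05, 0.0)   # starts just off the fixed point at the origin
outer = late_amplitude(4.0, 0.0)    # starts well outside the expected cycle

print("late amplitude from the inner start:", inner)
print("late amplitude from the outer start:", outer)
```

Both orbits settle to the same amplitude, consistent with a single periodic orbit $\gamma$ being the $\omega$-limit set of every point other than the origin.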
6. Rational and irrational flows on the torus
Recall that in §3.3 we studied immersions $\mathbf{R} \to S^1 \times S^1$ of the form $t \mapsto (\exp i\alpha t, \exp i\beta t)$. Such a curve arises as a solution curve to the vector field on $S^1 \times S^1$ corresponding in $(\theta,\phi)$ 'co-ordinates' to the first order system
\[ \dot{\theta} = \alpha, \quad \dot{\phi} = \beta \]
(in fact it is the global solution through $(\theta,\phi) = (0,0)$). If $\alpha/\beta$ is rational then every orbit of the corresponding flow is periodic, hence non-wandering. If $\alpha/\beta$ is irrational, then every orbit is dense in the torus, hence again non-wandering. Therefore in both cases $\Omega = S^1 \times S^1$.

4.2 LOCAL BEHAVIOUR NEAR FIXED POINTS AND PERIODIC ORBITS
In studying the qualitative properties of flows and diffeomorphisms our approach will be (1) to investigate behaviour on and near the non-wandering set $\Omega$ (where the action is) and then (2) piece together a global picture from information about how orbits go from (near) one part of $\Omega$ to (near) another. We begin by looking at behaviour near fixed points, corresponding physically to behaviour near equilibrium states. First we will consider flows, and then discuss the analogous behaviour for diffeomorphisms.
Local linearization of flows
Let $p$ be a fixed point of the smooth flow $\phi$ on $M$. For each $t$ we have a linear map
\[ T_p\phi_t : T_pM \to T_pM \]
and the composition rule for tangent maps (page 166) gives
\[ T_p\phi_s \cdot T_p\phi_t = T_p(\phi_s\phi_t) = T_p\phi_{t+s} \]
so as $t$ runs through $\mathbf{R}$ the family $\{T_p\phi_t\}$ of tangent maps gives us a flow of (linear) diffeomorphisms on the linear space $T_pM$. We call this the linearization of $\phi$ at $p$.
If $M = \mathbf{R}^n$, $p = 0$ and $\phi$ arises from a vector field $X$, then $X(0) = 0$. The derivative $DX(0)$ of $X$ at $0$ is a linear map $\mathbf{R}^n \to \mathbf{R}^n$, and we can consider the linear system
\[ \dot{x} = DX(0).x \]
on $\mathbf{R}^n$. This is called the linearization of $X$ at $0$. Not surprisingly, the flow corresponding to this linear system turns out to be the coordinate representation in $\mathbf{R}^n$ of the above linearization of $\phi$ at $p$. The proof is simple: if $X$ gives rise to the flow $\phi$ then
\[ \frac{\partial}{\partial t}\phi_t x = X(\phi_t x) \]
and so differentiation with respect to $x$ gives
\[ \frac{\partial}{\partial t} D\phi_t(x) = DX(\phi_t x) \cdot D\phi_t(x) \]
since we can interchange differentiation with respect to $x$ and $t$ (see page 77). Now we put $x = 0$ and get
\[ \frac{\partial}{\partial t} D\phi_t(0) = DX(0) \cdot D\phi_t(0) \]
showing that $(v,t) \mapsto D\phi_t(0)v$ is the flow arising from $DX(0)$.
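The statement that $D\phi_t(0)$ is the flow of the linear system $\dot{x} = DX(0).x$ can be tested numerically on a concrete field. The sketch below is an illustration only: it uses the damped pendulum field $X(\theta,v) = (v, -\sin\theta - bv)$ with a hypothetical damping value $b = 0.3$, a Runge-Kutta integrator, and a central difference with step $h$ to approximate $D\phi_t(0)v$; none of these choices come from the text.

```python
import math

B = 0.3  # hypothetical damping constant, chosen only for illustration

def nonlinear_field(p):
    """Damped pendulum field X(theta, v) = (v, -sin(theta) - B*v); X(0) = 0."""
    theta, v = p
    return (v, -math.sin(theta) - B * v)

def linear_field(p):
    """The linearization DX(0).p, where DX(0) has rows (0, 1) and (-1, -B)."""
    theta, v = p
    return (v, -theta - B * v)

def flow(field, p, t, steps=2000):
    """Approximate phi_t(p) for the given field by Runge-Kutta integration."""
    dt = t / steps
    x, y = p
    for _ in range(steps):
        k1 = field((x, y))
        k2 = field((x + 0.5 * dt * k1[0], y + 0.5 * dt * k1[1]))
        k3 = field((x + 0.5 * dt * k2[0], y + 0.5 * dt * k2[1]))
        k4 = field((x + dt * k3[0], y + dt * k3[1]))
        x += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        y += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return (x, y)

def derivative_of_flow_at_origin(v, t, h=1e-4):
    """Central-difference approximation to D(phi_t)(0) applied to the vector v."""
    plus = flow(nonlinear_field, (h * v[0], h * v[1]), t)
    minus = flow(nonlinear_field, (-h * v[0], -h * v[1]), t)
    return ((plus[0] - minus[0]) / (2 * h), (plus[1] - minus[1]) / (2 * h))

v = (1.0, -0.5)   # arbitrary tangent vector at the fixed point
t = 2.0           # arbitrary time

lhs = derivative_of_flow_at_origin(v, t)   # D(phi_t)(0) v for the non-linear flow
rhs = flow(linear_field, v, t)             # flow of the linearized system started at v
gap = max(abs(a - b) for a, b in zip(lhs, rhs))
print(lhs, rhs, gap)
```

The two vectors agree up to the error of the finite difference, which is what the calculation above predicts.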
It is reasonable to ask whether knowing about the behaviour of the linearization of $\phi$ at $p$ will be of any use in trying to understand the behaviour of $\phi$ itself near $p$. As with the Inverse Function Theorem and the Morse Lemma for maps and for functions, the answer is Yes, provided certain non-degeneracy conditions are met. Since we are working locally we may as well suppose that $M = \mathbf{R}^n$ and $p = 0$, and we will once again use the language of germs (§2.7). The vector field $X$ is thought of locally as the germ at $0$ of a smooth map $X : \mathbf{R}^n \to \mathbf{R}^n$.
The kind of equivalence we are interested in this time will not be right equivalence or right-left equivalence but something specially adapted to vector fields, since a local change of co-ordinates in the domain of $X$ automatically gives a transformation of both the right hand side and the left hand side of the system $\dot{x} = X(x)$. Explicitly, if $x = h(y)$ where $h$ is a local diffeomorphism then $\dot{x} = Dh(y)\dot{y}$ and so the system in the new y-coordinates becomes
\[ \dot{y} = (Dh(y))^{-1}\dot{x} = Dh^{-1}(h(y)).X(h(y)) = Y(y), \text{ say.} \]
Therefore the idea of two vector field germs being the same up to differentiable ($C^r$) change of co-ordinates becomes formally the following:
DEFINITION
Two vector field germs $X$, $Y$ are $C^r$-equivalent if there exists a $C^r$ diffeomorphism germ $h$ such that
\[ Y(\cdot) = Dh^{-1}(h(\cdot)).X(h(\cdot)) \]
or, more simply,
\[ Yh^{-1}(\cdot) = Dh^{-1}(\cdot).X(\cdot) . \]
C
T*
Now we can ask whether the germ of a vector field equivalent for some
r
to the germ at
DX(O),
and the answer we receive is:
THEOREM
(Sternberg
For each integer
at
0
is
Cr
of the linearized vector field
usually.
[jL3l] )
there is a finite number of relationships among
r > 2
the eigenvalues of to
0
X
with the property that
DX(O)
is Cr-equivalent
X
unless one or more of the relationships holds.
DX(O)
Remarks
1. The relationships are as follows: an eigenvalue $\lambda_j$ can be expressed in terms of all the eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$ (including $\lambda_j$) as
\[ \lambda_j = m_1\lambda_1 + m_2\lambda_2 + \cdots + m_n\lambda_n \]
where the $m_i \geq 0$ are integers with $2 \leq \sum m_i \leq$ a certain $N(r)$. If any eigenvalue $\lambda_j$ is purely imaginary ($\lambda_j = i\omega$) then $-\lambda_j$ is also an eigenvalue and so we can write
\[ \lambda_j = 2\lambda_j + (-\lambda_j) \]
which is a relationship of the proscribed type. In this case $X$ may not be linearizable. As we see on the next page, there are elementary and intuitive reasons why this is so, but it is important to realize that these Sternberg conditions are rather subtle and in general non-intuitive, showing for example that if $n = 2$ and $\lambda_1 = 1$, $\lambda_2 = 2$ then $X$ may not be linearizable.
2. The set of linear maps $L : \mathbf{R}^n \to \mathbf{R}^n$ avoiding the Sternberg relationships for given $r$ is open and dense in $L(\mathbf{R}^n,\mathbf{R}^n)$.
3. For $X$ to be $C^\infty$ linearizable it must avoid the Sternberg relationships for every $r \geq 1$, so an infinite number of conditions have to be met. The set of linear maps $L : \mathbf{R}^n \to \mathbf{R}^n$ avoiding the relationships for all $r$ is dense but no longer open in $L(\mathbf{R}^n,\mathbf{R}^n)$: it is a residual set (see §4.4).
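The relationships in Remark 1 are easy to search for mechanically. The following is a minimal sketch, not taken from the text: the function name `resonances` and the cut-off `max_order` (standing in for the unspecified bound $N(r)$) are assumptions made for this example, and eigenvalues are indexed from 0 in the output. It lists every way of writing some $\lambda_j$ as $\sum m_i\lambda_i$ with non-negative integers $m_i$ and $2 \leq \sum m_i \leq$ `max_order`.

```python
from itertools import product

def resonances(eigenvalues, max_order, tol=1e-9):
    """Return pairs (j, m) with eigenvalues[j] == sum(m[i] * eigenvalues[i]),
    where m is a tuple of non-negative integers and 2 <= sum(m) <= max_order.
    max_order is a hypothetical stand-in for the bound N(r) of the theorem."""
    n = len(eigenvalues)
    found = []
    for m in product(range(max_order + 1), repeat=n):
        if not 2 <= sum(m) <= max_order:
            continue
        combination = sum(mi * lam for mi, lam in zip(m, eigenvalues))
        for j, lam in enumerate(eigenvalues):
            if abs(combination - lam) <= tol:
                found.append((j, m))
    return found

# The example of Remark 1: eigenvalues 1 and 2 satisfy lambda_2 = 2*lambda_1.
print(resonances([1, 2], 3))

# Purely imaginary pair: lambda_1 = 2*lambda_1 + lambda_2 with lambda_2 = -lambda_1.
print(resonances([1j, -1j], 3))

# A pair with irrational ratio has no relationship of low order.
print(resonances([-1, 3 ** 0.5], 4))
```

Comparison with a tolerance is a practical compromise: in floating point arithmetic an exact relationship and a near-relationship cannot be told apart, so the output should be read as a list of candidates.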
If we are concerned with a qualitative understanding of the local behaviour of a dynamical system near an equilibrium, then Sternberg's Theorem is too powerful. For example, if we want to know if the equilibrium point $p$ is a stable equilibrium (the orbits of points near $p$ all approach $p$ as $t \to +\infty$) we do not need $C^r$ information about $X$ but only topological ($C^0$) information about the configuration of the orbits of the flow locally. The key result here is the theorem of Hartman [44] and Grobman, usually referred to in non-Soviet literature as Hartman's Theorem. It was proved at a surprisingly late date in the development of the theory of differential equations.
THEOREM (Hartman)
If $DX(0)$ has no purely imaginary (including zero) eigenvalues, then there is a homeomorphism germ at $0$ in $\mathbf{R}^n$ locally taking orbits of $\phi$ to orbits of the linearized flow (i.e. the flow defined by $DX(0)$). The homeomorphism preserves the sense of orbits ($t \to +\infty$ or $-\infty$) but not necessarily the parametrization of orbits by $t$.
That the eigenvalue condition is necessary is illustrated by the simple harmonic motion system $X$ given by
\[ \dot{x} = y, \quad \dot{y} = -x \]
(see §4.1, Example 1). Here the field is already linear, but if we add suitable non-linear terms, taking
\[ \dot{x} = y + \cdots, \quad \dot{y} = -x + \cdots \]
then all orbits except $(0,0)$ spiral outwards and so in no neighbourhood of the origin can there be a homeomorphism taking orbits of this flow to the circular orbits of the simple harmonic flow arising from $DX(0,0)$.
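The failure of linearization at a purely imaginary pair of eigenvalues can be seen in a concrete computation. The sketch below uses an assumed perturbation, chosen for this illustration and not necessarily the one printed in the original: $\dot{x} = y + x(x^2+y^2)$, $\dot{y} = -x + y(x^2+y^2)$. For this system $r\dot{r} = x\dot{x} + y\dot{y} = r^4$, so $\dot{r} = r^3$ and $r(t) = r_0/\sqrt{1 - 2r_0^2 t}$, while the linear part $\dot{x} = y$, $\dot{y} = -x$ keeps $r$ constant.

```python
import math

def linear_field(x, y):
    """Simple harmonic motion: x' = y, y' = -x."""
    return (y, -x)

def perturbed_field(x, y):
    """Assumed non-linear perturbation with the same linear part at the origin."""
    r2 = x * x + y * y
    return (y + x * r2, -x + y * r2)

def radius_after(field, x, y, t, steps=1000):
    """Distance from the origin after time t, using Runge-Kutta integration."""
    dt = t / steps
    for _ in range(steps):
        k1 = field(x, y)
        k2 = field(x + 0.5 * dt * k1[0], y + 0.5 * dt * k1[1])
        k3 = field(x + 0.5 * dt * k2[0], y + 0.5 * dt * k2[1])
        k4 = field(x + dt * k3[0], y + dt * k3[1])
        x += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        y += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return math.hypot(x, y)

r0, t = 0.5, 1.0
r_linear = radius_after(linear_field, r0, 0.0, t)
r_perturbed = radius_after(perturbed_field, r0, 0.0, t)
r_predicted = r0 / math.sqrt(1.0 - 2.0 * r0 * r0 * t)   # from r' = r^3

print(r_linear, r_perturbed, r_predicted)
```

The linear orbit stays on its circle while the perturbed orbit moves steadily outwards, so no homeomorphism can match the two orbit pictures near the origin.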
Analogous results for diffeomorphisms
If a $C^r$ diffeomorphism germ $h$ converts the system $\dot{x} = X(x)$ into $\dot{y} = Y(y)$ then since we are after all dealing only with a change of co-ordinates it is clear that under $h$ the orbits of the respective flows $\phi_t$, $\psi_t$ must agree and also respect parametrization by $t$, so that
\[ h\phi_t(x) = \psi_t h(x) \]
locally or, as germs,
\[ h \cdot \phi_t = \psi_t \cdot h . \]
This says that the diffeomorphisms $\phi_t$, $\psi_t$ are locally conjugate (page 26) by the $C^r$ diffeomorphism $h$. This notion of conjugacy at the germ level is what corresponds in the diffeomorphism case to $C^r$-equivalence in the vector field or flow case.
Sternberg's Theorem for local linearization of a diffeomorphism germ $f$ at $0$ is as before, with $C^r$-conjugacy replacing $C^r$-equivalence. The relationships which the eigenvalues $\mu_1, \mu_2, \ldots, \mu_n$ of $Df(0)$ should avoid are
\[ \mu_j = \mu_1^{m_1}\mu_2^{m_2} \cdots \mu_n^{m_n} \]
for $m_i \geq 0$ and $\sum m_i \geq 2$.
Hartman's Theorem becomes the following:
THEOREM (Hartman's Theorem for diffeomorphisms)
If $Df(0)$ has no eigenvalues with modulus equal to $1$ then the germ of $f$ at $0$ is $C^0$-conjugate to the germ at $0$ of the linear map $L = Df(0)$.
It follows immediately that the conjugating homeomorphism germ $h$ satisfying $hf = Lh$ will locally take the orbit of each $x$ under the $f$-action to the orbit of $h(x)$ under the $L$-action, since
\[ h \cdot f^2(x) = h \cdot f \cdot f(x) = L \cdot h \cdot f(x) = L^2 h(x) \]
and similarly $h \cdot f^m(x) = L^m \cdot h(x)$ for every integer $m$.
Remarks
1. A typical linear system on $\mathbf{R}^n$ is of the form
\[ \dot{x} = Ax \]
where $A$ is a linear map $\mathbf{R}^n \to \mathbf{R}^n$, i.e. $A \in L(\mathbf{R}^n,\mathbf{R}^n)$. It is not hard to prove that the corresponding flow on $\mathbf{R}^n$ is given explicitly by the formula
\[ \phi_t x = e^{tA}x . \]
[...] a splitting $\mathbf{R}^n = E^+ \oplus E^-$ ($E^+$ corresponding to eigenvalues $\lambda$ with $|\lambda| > 1$, $E^-$ to eigenvalues $\lambda$ with $|\lambda| < 1$) with the following properties:
$L(E^+) = E^+$, $L(E^-) = E^-$;
all eigenvalues of $L|E^+$ have modulus $> 1$;
all eigenvalues of $L|E^-$ have modulus $< 1$.
Furthermore, there exists a norm on $\mathbf{R}^n$ for which there exist constants $k < 1 < K$ such that
$x \in E^+$ implies $\|L(x)\| > K\|x\|$
$x \in E^-$ implies $\|L(x)\| < k\|x\|$
i.e. $L$ expands along $E^+$, and contracts along $E^-$.
We can think of $\mathbf{R}^n$ as being decomposed into the direct sum of two 'co-ordinate' subspaces: $E^+$ and $E^-$. The possibility that $E^+ = \{0\}$ or $E^- = \{0\}$ is not excluded. If $E^+ = \{0\}$ then $E^- = \mathbf{R}^n$ and we say the origin is a sink; if $E^+ = \mathbf{R}^n$ then the origin is a source. Intermediate cases are often called saddle points. See Figure 43.
Figure 43
The proofs of the above assertions are straightforward matrix theory, although the construction of the norm is not entirely trivial. Analogous results apply in infinite dimensions, using the spectrum instead of the eigenvalues.
For a hyperbolic linear flow $\phi$ arising from $\dot{x} = Ax$ we have corresponding behaviour: $\mathbf{R}^n$ splits into $E^+ \oplus E^-$ such that each of $E^+$, $E^-$ is invariant under the flow (i.e. $\phi_t E^{\pm} = E^{\pm}$) and all eigenvalues of $A|E^+$ ($A|E^-$) have positive (negative) real part. There is a norm and constants $k < 1 < K$ such that
$x \in E^+$ implies $\|\phi_t x\| > K^t\|x\|$
$x \in E^-$ implies $\|\phi_t x\| < k^t\|x\|$
for all $t > 0$. The situation can again be symbolized by Figure 43.
In two dimensions the orbits of hyperbolic linear flows can be sketched quite easily, since the corresponding system of equations can be solved explicitly: see Figure 44. Note that the validity of these pictures rests on the fact that if a hyperbolic linear system is perturbed linearly by a small amount so that no eigenvalue crosses the imaginary axis, then the phase portrait remains the same up to homeomorphism. This observation, although not mentioned again explicitly, is a first step in the theory of global stability discussed in §4.5 and should be kept well in mind.
Figure 44
Pictures of orbits of flows arising from 2-dimensional hyperbolic linear systems $\dot{x} = Ax$.
(i) $\lambda_1 = \lambda_2$ real, negative, with two independent eigenvectors
(ii) $\lambda_1, \lambda_2$ real, negative, unequal
(iii) or (iv) $\lambda_1, \lambda_2 = \alpha \pm i\beta$, $\beta \neq 0$, $\alpha < 0$, depending on sign of lower left hand element in $A$
(v) $\lambda_1 = \lambda_2$ real, negative, with no two independent eigenvectors ($A$ cannot be diagonalized)
(vi) $\lambda_1, \lambda_2$ real, opposite signs
(i)-(v) are sinks, (vi) is a saddle-point. For sources change signs of (real parts of) $\lambda_1, \lambda_2$ in (i)-(v).
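The classification summarized in Figure 44 depends only on the signs of the real parts of the two eigenvalues, which for a $2 \times 2$ matrix follow from its trace and determinant. The sketch below is an illustration only; the function names and the sample matrices are assumptions made for this example. It sorts a matrix with rows $(a, b)$ and $(c, d)$ into sink, source, saddle or non-hyperbolic.

```python
import cmath

def eigenvalues_2x2(a, b, c, d):
    """Eigenvalues of the matrix with rows (a, b) and (c, d), via trace and determinant."""
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return ((trace + root) / 2, (trace - root) / 2)

def classify(a, b, c, d, tol=1e-12):
    """Type of the fixed point 0 of the linear system x' = Ax."""
    l1, l2 = eigenvalues_2x2(a, b, c, d)
    if abs(l1.real) <= tol or abs(l2.real) <= tol:
        return "not hyperbolic"   # some eigenvalue lies on the imaginary axis
    if l1.real < 0 and l2.real < 0:
        return "sink"
    if l1.real > 0 and l2.real > 0:
        return "source"
    return "saddle"

print(classify(-1, 0, 0, -1))   # equal negative eigenvalues, as in case (i)
print(classify(0, 1, -4, -1))   # damped oscillator with k = 2, b = 0.5: complex pair, case (iii)/(iv)
print(classify(1, 0, 0, -2))    # real eigenvalues of opposite sign, case (vi)
print(classify(2, 1, 0, 3))     # both eigenvalues positive
print(classify(0, 1, -1, 0))    # simple harmonic oscillator: purely imaginary pair
```

The finer distinctions between cases (i) to (v) require the discriminant and the eigenvectors as well; the sketch deliberately stops at the coarse sink, source and saddle trichotomy.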
The discussion of hyperbolicity so far has been concerned with fixed points of flows and diffeomorphisms. It is only a small step to extend this to periodic orbits. If $q$ is a periodic point with period $m$ of the diffeomorphism $f$ then we say that $q$ is hyperbolic if it is hyperbolic as a fixed point of $f^m$. It then follows automatically that each point $f(q), f^2(q), \ldots$ on the orbit of $q$ is also hyperbolic, so we can talk about a hyperbolic periodic orbit of $f$.
For a periodic orbit $\gamma$ of a flow $\phi$ on $M$ we need a different method of approach. Choose a point $q$ in $\gamma$ and take a (small) piece of codimension-1 submanifold $\Sigma$ passing through $q$ and transverse to $\gamma$, i.e. such that the vector $X(q)$ does not lie in $T_q\Sigma$. Thus $\Sigma$ is a kind of local cross-section (cf. page 194). For each $x \in \Sigma$ sufficiently close to $q$, the continuity of the flow guarantees that the forward orbit of $x$ will meet $\Sigma$ again at a time approximately equal to the period of $\gamma$. More precisely, there exists a neighbourhood $V$ of $q$ in $\Sigma$ on which we can define a map $P : V \to \Sigma$ by
\[ P(x) = \phi_t x \]
where $t$ (depending on $x$) is the smallest $t > 0$ for which $\phi_t x \in \Sigma$. Thus $P$ is the first-return map. Using the Implicit Function Theorem it can be shown that $P$ is $C^r$ if $X$ is. Clearly $P(q) = q$, and the germ of $P$ at $q$ is a diffeomorphism germ since we construct an inverse by following orbits backwards instead of forwards. Therefore we can discuss the hyperbolicity or otherwise of $P$ as a diffeomorphism germ $\mathbf{R}^{n-1} \to \mathbf{R}^{n-1}$. If $P$ is hyperbolic then we say that $\gamma$ is a hyperbolic periodic orbit of the flow. This does not depend on the choice of $q$ on $\gamma$ nor on the choice of $\Sigma$ transverse to $\gamma$ at which to construct $P$. The germ at $q$ of the map $P$ is called a Poincaré map germ for $\gamma$. Recall that if $f : M \to M$ is any diffeomorphism then $f^m$ is a (global) Poincaré map for every periodic orbit of the suspension of $f$ (page 195) arising from a periodic point of $f$ of period $m$.
Now we can apply Hartman's theorem to the germ of $f^m$ in the diffeomorphism case or to the germ of $P$ in the flow case, and it tells us that the behaviour of a diffeomorphism or flow near a hyperbolic periodic orbit is topologically the same as the behaviour of the linearized version.
In view of Hartman's Theorem and our analysis of hyperbolic linear systems above we now see that if we take a sufficiently small neighbourhood $U$ of a hyperbolic fixed point $p$ and define the local in-set of $p$ in the diffeomorphism case by
\[ \mathrm{in}(p) = \{x \in U \mid f^n(x) \to p \text{ as } n \to +\infty\} \]
or in the flow case by
\[ \mathrm{in}(p) = \{x \in U \mid \phi_t x \to p \text{ as } t \to +\infty\} \]
then $\mathrm{in}(p)$ is homeomorphic to a neighbourhood of $0$ in the linear subspace $E^-$ of $\mathbf{R}^n$ defined from linearization of $f$ or $\phi$ at $p$, so $\mathrm{in}(p)$ is a topological manifold of dimension $s$ where $s$ is the number of eigenvalues with modulus less than $1$ (or real part $< 0$). In fact we can do better than this:
THEOREM
If the diffeomorphism or flow is $C^r$ then $\mathrm{in}(p)$ is a $C^r$ submanifold of $M$. Its tangent space at $p$ corresponds to $E^-$ in $\mathbf{R}^n$.
The terminology in-set was suggested by E.C. Zeeman. More traditionally $\mathrm{in}(p)$ is called a local stable manifold for $p$, denoted by $W^s_{loc}(p)$. Here $s$ means stable, and also refers to the dimension of $W^s_{loc}(p)$.
Note that if we choose $U$ carefully (corresponding to a small ball with respect to the relevant norm on $\mathbf{R}^n$) then we have $fW^s_{loc}(p) \subset W^s_{loc}(p)$; similarly for $\phi_t$ with $t > 0$.
The proof of the theorem is not easy: see Smale [122], Nitecki [90], or Irwin [61]. It is known as the Stable Manifold Theorem. Replacing $n$ by $-n$ (or $t$ by $-t$) we obtain the local out-set or unstable manifold $W^u_{loc}(p)$ for $p$. It is a $C^r$ submanifold of $M$ of dimension $u = n - s$, with tangent space at $p$ corresponding to $E^+$. With appropriate $U$ we have $fW^u_{loc}(p) \supset W^u_{loc}(p)$; similarly for $\phi_t$, $t > 0$.
The extension of these ideas to periodic orbits is straightforward. If $q$ is a hyperbolic periodic point of period $m$ then as above we have local stable and unstable manifolds for $f^m$ at $q$. Transporting these around the orbit by applying $f, f^2, \ldots$ we obtain local stable and unstable manifolds for $f^m$ at $f(q), f^2(q), \ldots, f^{m-1}(q)$: see Figure 45.
Figure 45
We can think of these $m$ local stable (unstable) manifolds as together constituting a stable (unstable) manifold for the periodic orbit as a whole.
Similarly, if $\gamma$ is a hyperbolic periodic orbit for the flow $\phi$ then the local stable and unstable manifolds for a Poincaré map can be transported around $\gamma$ by $\phi_t$ to give manifolds of one higher dimension as local stable and unstable manifolds $W^s_{loc}(\gamma)$, $W^u_{loc}(\gamma)$ for $\gamma$ as a whole: see Figure 46.

4.3 SOME GLOBAL BEHAVIOUR
Now that we have some understanding of the way in which a flow or diffeomorphism behaves near a hyperbolic fixed point or periodic orbit we look at the consequences for global behaviour on a manifold $M$. In particular, let us consider the case of a point $x$ in $M$ for which $\phi_t x$ approaches a hyperbolic fixed point $p$ as $t \to +\infty$, so that $\omega(x) = p$. If we take $t$ large enough then $\phi_t x$ will eventually be in $W^s_{loc}(p)$ defined for some suitable neighbourhood of $p$, or, equivalently, $x$ lies in $\phi_{-t}W^s_{loc}(p)$. Therefore if we take the union of the $\phi_{-t}W^s_{loc}(p)$ over all $t > 0$ (or, indeed, over all $t \in \mathbf{R}$) we shall capture precisely those points $x$ for which $\phi_t x \to p$ as $t \to +\infty$. [...]
Any $x$ in $W^s(p_1) \cap W^u(p_2)$ has $\omega(x) = p_1$ and $\alpha(x) = p_2$ [...]
If there exists some $x \in W^s(p) \cap W^u(p)$ where $x \neq p$ then there is an infinite number of such $x$ (cf. 2 above), all satisfying $\omega(x) = \alpha(x) = p$. Such points are called homoclinic points, although sometimes this term is reserved for the case when $W^s(p)$ and $W^u(p)$ intersect transversally (cf. §4.4). See Figure 48.
Figure 48
then
Once again, we can extend these ideas everything for flows as well are no longer discrete,
as for diffeomorphisms.
(s — u = 1).
phenomena shows how analogous
and do
In this case orbits
and the behaviour illustrated
not occur in such low dimensions diffeomorphism
to periodic orbits,
in Figure 48 could
Suspending these things happen in higher
dimensions for flows.
From the results of this section we see that in studying the global structure of a dynamical system we can make some progress if we know that fixed points and periodic orbits are all hyperbolic, and if we know something about the intersections of stable and unstable manifolds. Now we might reasonably hope that 'in general' all fixed points and periodic orbits will be hyperbolic, and that stable and unstable manifolds will intersect in general position, i.e. transversally. To discuss this we first have to provide a suitable mathematical meaning for 'in general', and then find which properties we can expect to be those of a 'generic' system. This is the programme for the next section.

4.4 GENERIC PROPERTIES OF FLOWS AND DIFFEOMORPHISMS
It is common practice in classifying any set $S$ of mathematical objects to begin by trying to sort them into two types: usual (regular, non-degenerate, tame, nice, ...) and unusual (singular, degenerate, wild, pathological, ...). One way to do this is to put a measure on the set $S$, and to associate 'usual' with occupying a subset of large measure. Another way is via a topology on $S$. This is the approach we shall use here, since at the same time it allows us to discuss questions of global stability of dynamical systems.
it seems reasonable to associate
subset which is dense in since a subset of For example,
S
This is soon seen to be inadequate,
and its complement in
the subset
its complement,
S.
the set
for a start.
S
of rational numbers is dense in
I
of irrational numbers,
'more usual'
'usual'.
though,
can both be dense in
Q
rationals and irrationals are both are in some sense
'usual' with occupying a
but so is
and therefore the
Nevertheless,
than the rationals:
R,
S.
the irrationals
there are more of them,
We can capture this by using the topological notion of a
residual set.
DEFINITION
Let $S$ be any topological space. A subset $G$ of $S$ is called a residual set if it is the intersection of a countable number of sets, each of which is both open and dense in $S$.
In any respectable topological space a residual set is itself dense. This is known as Baire's Theorem, once 'respectable' is defined, and then a space in which this result holds is called a Baire space (L. Baire, 1874-1932). For example, any space homeomorphic to a complete metric space is a Baire space. The intersection of a finite or countably infinite number of residual sets is residual, and so in a non-empty Baire space the complement of a residual set cannot be residual (otherwise the empty set would be dense). Therefore occupying a residual set becomes a good synonym for 'usual' in a Baire space. Conveniently, the spaces we deal with will all be Baire spaces.
In the rational-irrational example above we see that as the rational numbers are countable we can write $Q = \{q_1, q_2, \ldots, q_n, \ldots\}$ and, writing $U_n$ for the open dense subset $\mathbf{R} - \{q_n\}$ of $\mathbf{R}$, we can then express $I$ as the intersection of all the $U_n$. Thus $I$ is a residual subset of $\mathbf{R}$, and $Q$ is not.
The sets we are interested in are the set $\mathrm{Diff}^r(M)$ of $C^r$ diffeomorphisms on the smooth $n$-manifold $M$, and the set of $C^r$ vector fields on $M$ which we denote by $X^r(M)$. We will now put topologies on these sets. Assume unless otherwise stated that $M$ is compact: this makes the topologies easier to describe, and is in any case the context in which our later theorems will apply.
We regard $\mathrm{Diff}^r(M)$ as a subset of the set $C^r(M,M)$ of $C^r$ maps $M \to M$, topologize $C^r(M,M)$, and then take the subset topology induced on $\mathrm{Diff}^r(M)$. To topologize $C^r(M,M)$, or, more generally, $C^r(M,N)$, we want to capture the idea of two maps $f, g : M \to N$ being 'close' and we can do this by saying that $f$ and $g$ are close if their local representatives in all charts of some atlases are close. The procedure for converting these hazy notions into a topology is as follows.
(U,),
the closure
(V,ijj)
fU
of
be charts on fU
fU g
is compact. in
Y
C (M,N)
sup | | ipf (x) - Tj)g(x) | | xeU where
||* I I
with (for technical reasons)
bounded and contained in
§1.6 this implies that denote the set of all
M, N
For any .
Rn .
fU
,...,U
with the property that each “s
contained in some chart
B (f;U,V)
and such that
Take atlases for
U
“2
gU c V
let
(*)
M
,U
—
for which
chosen carefully so that there is a cover of
“l
By Theorem C in
e > 0
< e
is any particular norm on
V.
M
and
N
by charts is bounded and “i
V
on
N.
This is not hard to do, by
3i considering the cover of
M
by all the open sets of the form
f
(V^),
223
using the compactness of
M
to get a finite subcover, and then taking
slightly smaller charts to satisfy the closure condition. B (f) e then
to be the intersection of the
g
lies in
B^_(f)
respect to the charts
U
a.
, V„ B.
1
1
W
there exists some
is contained in which for
W.
N = M
for
i = l,2,...,s
:
precisely when all its local representatives with
Finally, we define a set f e W,
B (f;U ,Vn ) e ou fk
Now define
differ from those of
in B (f) e
Cr(M,N)
f
by less then
e.
to be an open set if, given any
constructed as above such that IT C (M,N)
This gives a topology on induces the C -topology on
Diff
B (f) e
0 the C -topology,
(M).
The topology can
be shown to be independent of the choice of atlases and special charts
If we want the topology to capture closeness of derivatives of all orders up to $r$ we do the same as above, but replace (*) by
\[ \sum_{k=0}^{r} \sup_{y \in U'} \|D^k\tilde{f}(y) - D^k\tilde{g}(y)\| < \varepsilon \qquad (**) \]
where $U' = \phi U$, $\tilde{f} = \psi f\phi^{-1}$ and $\tilde{g} = \psi g\phi^{-1}$ (see §2.6). Here $D^k\tilde{f}(y)$ belongs to the space $M^k(\mathbf{R}^m \times \mathbf{R}^m \times \cdots \times \mathbf{R}^m;\mathbf{R}^n)$ of $k$-multilinear maps and $\|\cdot\|$ is obtained from norms on $\mathbf{R}^m$ and $\mathbf{R}^n$. This then gives us the $C^r$-topology on $C^r(M,N)$, and on $C^r(M,M)$ and $\mathrm{Diff}^r(M)$. Unless it is stated otherwise, this topology is assumed to be the one in use. It is known more classically as the topology of uniform convergence of all derivatives up to order $r$. The case $r = 1$ will most interest us.
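The difference between the $C^0$ and $C^1$ notions of closeness is easy to see in one dimension. The sketch below is an illustration only, with several assumptions: it works on the interval $[0,1]$ with a single chart, takes derivatives as explicitly supplied functions, and approximates each supremum in (**) by a maximum over sample points. It compares the zero map with $g(y) = \frac{1}{100}\sin 100y$, which is uniformly small but has derivative of size $1$.

```python
import math

def cr_distance(f_derivs, g_derivs, r, a=0.0, b=1.0, samples=1001):
    """Approximate the sum over k = 0..r of sup |D^k f - D^k g| on [a, b].
    f_derivs[k] and g_derivs[k] are the k-th derivatives as Python functions;
    each supremum is replaced by a maximum over equally spaced sample points."""
    points = [a + (b - a) * i / (samples - 1) for i in range(samples)]
    total = 0.0
    for k in range(r + 1):
        total += max(abs(f_derivs[k](y) - g_derivs[k](y)) for y in points)
    return total

# f is the zero map; g(y) = sin(100 y) / 100 with derivative cos(100 y).
f_derivs = [lambda y: 0.0, lambda y: 0.0]
g_derivs = [lambda y: math.sin(100.0 * y) / 100.0, lambda y: math.cos(100.0 * y)]

c0 = cr_distance(f_derivs, g_derivs, r=0)   # uniform distance between the maps only
c1 = cr_distance(f_derivs, g_derivs, r=1)   # also counts the first derivatives

print("C^0 distance (approx):", c0)
print("C^1 distance (approx):", c1)
```

So $g$ lies in a small $C^0$ neighbourhood of the zero map but not in a small $C^1$ neighbourhood, which is why statements about genericity and stability depend on the value of $r$.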
Remarks
1. There is a more sophisticated way of describing these topologies using jets: see Golubitsky and Guillemin, for example. Alternatively we can embed $N$ in some $\mathbf{R}^p$ (see §3.2), and then regard $f : M \to N$ as $f : M \to \mathbf{R}^p$. This has an $r$-fold tangent map $T^rf : T^rM \to T^r\mathbf{R}^p = \mathbf{R}^q$ where $q = 2^rp$, and with a little care the ordinary euclidean metric on $\mathbf{R}^q$ can be used to give a 'distance' between $T^rf$ and $T^rg$. We obtain a metric on $C^r(M,N)$, and this induces the topology already defined. Using the metric, it can be shown that $C^r(M,N)$ and $\mathrm{Diff}^r(M)$ are Baire spaces.
2. When $M$ is not compact problems arise since there may not be a finite atlas, and even when there is one a topology constructed as above may not be suitable for dealing with behaviour 'at infinity'. We will not go into details, except to mention that for $M$, $N$ open subsets of $\mathbf{R}^m$, $\mathbf{R}^n$ a useful $C^r$ topology is obtained by replacing the constant $\varepsilon$ in (**) by a continuous function $\varepsilon : M \to \mathbf{R}$ (which may tail off to zero towards the edge of $M$). Alternatively, one can consider a topology of uniform convergence on compact subsets of derivatives up to order $r$.
3. Further subtleties occur even in the compact case if we wish to treat all derivatives for $C^\infty$ maps: there are a number of topologies one could use. One choice is to say a set is open in $C^\infty(M,N)$ when it is a union of sets, each open in some $C^r(M,N)$.
4. These ideas can be extended to infinite-dimensional manifolds and to manifolds with boundary.
For $X^r(M)$ the procedure is similar but easier. Given a vector field $X$ and a chart $\phi_\alpha : U_\alpha \to U_\alpha' \subset \mathbf{R}^n$ there is the local representative
\[ T\phi_\alpha \cdot X \cdot \phi_\alpha^{-1} : U_\alpha' \to U_\alpha' \times \mathbf{R}^n \]
(cf. p.169) and considering only the second factor we have $X_\alpha : U_\alpha' \to \mathbf{R}^n$. If we shrink $U_\alpha$ a little if necessary we can suppose that $X_\alpha$ is defined on the closure of $U_\alpha'$, and if $U_\alpha'$ is bounded in $\mathbf{R}^n$ (hence has compact closure) it follows that $\|D^kX_\alpha\|$ is a bounded function on $U_\alpha'$. Let
\[ \|X\|_\alpha = \sum_{k=0}^{r} \sup_{U_\alpha'} \|D^kX_\alpha\| . \]
When $M$ is compact we can choose a finite atlas of charts $U_{\alpha_1}, U_{\alpha_2}, \ldots, U_{\alpha_s}$ as above, and define
\[ \|X\|_r = \sup_{1 \leq i \leq s} \|X\|_{\alpha_i} . \]
This is a norm for $X^r(M)$, making it into a Banach space. The norm itself depends very much on the choice of atlas, but the topology induced on $X^r(M)$ does not. (Cf. Example 6 on page 8, where there is only one chart.) Remarks analogous to those in the $\mathrm{Diff}^r(M)$ case also apply to $X^r(M)$.
With topologies assigned to these spaces of diffeomorphisms and vector fields we can now discuss genericity.
DEFINITION
A property $P$ of diffeomorphisms (vector fields) on $M$ is said to be $C^r$ generic if the set of diffeomorphisms (vector fields) possessing $P$ contains a residual subset of $\mathrm{Diff}^r(M)$ ($X^r(M)$).
There are two particular properties $P_1$, $P_2$ which are known to be generic for diffeomorphisms and vector fields on a compact manifold:
$P_1$ Every fixed point and periodic orbit is hyperbolic, and all intersections of their stable and unstable manifolds are transversal.
This does not say that there need be only a finite number of fixed points and periodic orbits. As we shall see below, the property of having only a finite number of periodic orbits is not generic.
$P_2$ The non-wandering set $\Omega$ is precisely the closure of the set of fixed points and periodic orbits.
Remarks
1. $P_1$ is $C^r$ generic for each finite $r$. This is known as the Kupka-Smale Theorem. See Kupka [65], Smale [12?] or Peixoto [96].
2. The result that $P_2$ is $C^1$ generic is due to Pugh, and follows from his Closing Lemma [103]. This states that any recurrent point (a point $p$ such that $p \in \omega(p)$) can be converted into a periodic point by arbitrarily $C^1$-small perturbations of the diffeomorphism or vector field. The proof is extremely difficult. For $C^0$-small perturbations it is easy, and for $C^r$-small perturbations ($r > 1$) little is known: the $C^r$ Closing Lemma may be false. It is unknown whether $P_2$ is $C^r$ generic or not.
Thus the choice of topology is not merely a refinement for mathematicians, but has important implications for the kinds of behaviour in dynamical systems that we are to consider as usual or as exceptional.

4.5 GLOBAL STABILITY
We will now take a different point of view from that of genericity, and consider when the overall structure of a dynamical system persists under sufficiently small perturbations of the system. After all, we should be suspicious of a differential or difference equation used to model a real-life system if the equation can be made to behave very differently by including arbitrarily small extra terms, since in practice we can only measure to within some non-zero margin of error. This leads to the idea of structural stability of diffeomorphisms and vector fields. We might indulge in some wishful thinking, and hope that structural stability, when suitably formulated, is a generic property ...
Previously
(§4.2) we discussed local equivalence of germs of
diffeomorphisms or vector fields at a fixed point.
We can easily transfer
this to a global concept.
DEFINITION Two diffeomorphisms $f, g \in \mathrm{Diff}^r(M)$ are $C^k$ equivalent (for some $k \le r$) if there exists $h \in \mathrm{Diff}^k(M)$ with $h \cdot f = g \cdot h$.
This is a special case of conjugacy of two maps (§1.5), i.e. $f$ and $g$ 'look the same' globally. Note that in particular $h$ takes $f$-orbits globally to $g$-orbits.
DEFINITION Two vector fields $X, Y \in \mathfrak{X}^r(M)$ are $C^k$ equivalent if there exists $h \in \mathrm{Diff}^k(M)$ taking $X$-orbits to $Y$-orbits, preserving senses but not necessarily parametrization by $t$.
In either case $h$ takes the phase portrait of one system to the phase portrait of the other. As pointed out in the local context for fixed points, it is mainly the case $k = 0$ (i.e. $h$ a homeomorphism) that is of practical interest in studying the long-term behaviour of orbits. Therefore our definition of structural stability will be as follows:
DEFINITION A diffeomorphism $f \in \mathrm{Diff}^r(M)$ (or vector field $X \in \mathfrak{X}^r(M)$) is structurally stable if there is a neighbourhood $N$ of $f$ in $\mathrm{Diff}^r(M)$ (or $N$ of $X$ in $\mathfrak{X}^r(M)$) such that every diffeomorphism (or vector field) in $N$ is $C^0$ equivalent to $f$ (or $X$).
If we had worked with $C^k$ equivalence $(k \ge 1)$ we would find that the definition was practically useless. For example, let $f$ be a $C^1$ diffeomorphism with a finite number of fixed points. If $f$, $g$ are even $C^1$ equivalent then the conjugating diffeomorphism $h$ takes each fixed point $p$ of $f$ to a fixed point $h(p)$ of $g$, and the Chain Rule shows that the linear tangent map $T_p f$ is linearly conjugate by $T_p h$ to $T_{h(p)} g$, and hence has the same eigenvalues as $T_{h(p)} g$. However, we can easily change the eigenvalues of $T_p f$ by arbitrarily small changes in $f$, and so $f$ could not be structurally stable. As an example based on the same fact about linearization, the system $\dot{x} = -x$ would not be structurally stable (with $k \ge 1$) since $\dot{x} = -(1+\varepsilon)x$ would not be $C^k$ equivalent to the first system even though its phase portrait is identical. Of course, if we take $k = 0$ then $T_p h$ does not exist and so neither does the above argument.
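The Chain Rule argument above can be checked numerically in one dimension. The following is a minimal sketch under stated assumptions: the maps `f`, `h` and the perturbation `f_eps` are hypothetical choices made only for illustration, and derivatives are approximated by central differences. It suggests that a map conjugate to $f$ by a smooth $h$ has the same derivative at the corresponding fixed point, whereas a small change in $f$ alters that derivative and so rules out $C^1$ conjugacy.

```python
def h(x):
    """A smooth diffeomorphism of the real line with h(0) = 0 and h'(0) = 2 (illustrative choice)."""
    return 2.0 * x + x ** 3


def h_inverse(y, tol=1e-14):
    """Invert h by Newton's method; h is strictly increasing, so the root is unique."""
    x = y / 2.0
    for _ in range(100):
        step = (h(x) - y) / (2.0 + 3.0 * x * x)
        x -= step
        if abs(step) < tol:
            break
    return x


def f(x):
    """A linear contraction with fixed point 0 and eigenvalue 0.5 (illustrative choice)."""
    return 0.5 * x


def g(x):
    """The conjugate map g = h . f . h^{-1}, so that h . f = g . h."""
    return h(f(h_inverse(x)))


def f_eps(x, eps=1e-3):
    """A small perturbation of f whose eigenvalue at 0 is 0.5 + eps."""
    return (0.5 + eps) * x


def derivative(func, p, step=1e-6):
    """Central-difference approximation to func'(p)."""
    return (func(p + step) - func(p - step)) / (2.0 * step)


print("f'(0)     ~", derivative(f, 0.0))
print("g'(0)     ~", derivative(g, 0.0))      # agrees with f'(0) up to discretisation error
print("f_eps'(0) ~", derivative(f_eps, 0.0))  # differs, so no C^1 conjugacy with f
```

With $k = 0$ no such obstruction arises, since a homeomorphism need not have a derivative at the fixed point.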
Most of the important theorems on structural stability are formulated for compact manifolds. Although the manifolds that it might be necessary to use in practice could often turn out to be non-compact (e.g. phase spaces for second order systems), the results for the compact case show essentially what goes on, and they can often be adjusted to cope with specific non-compact problems. Therefore in what follows we make the standing assumption that $M$ is a $C^\infty$ compact $n$-manifold (without boundary), and we might as well suppose also that $M$ is connected, otherwise we could treat each piece of $M$ separately.
The early results on structural stability (called 'roughness') were those of the Russian school (Andronov, Pontrjagin [7]) in the 1930's. However, the first main global theorem which together with Smale's results on gradient systems (see §4.6) launched a new attack on the subject, was Peixoto's theorem for flows on 2-manifolds.
THEOREM (Peixoto [95]) A $C^r$ vector field on a 2-dimensional manifold $M$ is structurally stable if and only if it satisfies the following conditions:
(1) The number of fixed points and periodic orbits is finite, and each is hyperbolic;
(2) there are no points $x$ for which $\alpha(x)$ and $\omega(x)$ are both saddle points;
(3) $\Omega$ consists of fixed points and periodic orbits only.
Moreover, if $M$ is orientable the set of such vector fields is open and dense (thus highly generic) in $\mathfrak{X}^r(M)$.
Remarks
1. As stated in [95] the full result includes the non-orientable case, but there is a gap in the proof. For $r = 1$ the Closing Lemma comes to the rescue, but it seems that the situation has still not been clarified for $r > 1$.
2. Condition (2) is the same as transversality of intersection of stable and unstable manifolds: the only dimensions for which transversality is not automatic are $s = u = 1$ (i.e. for saddle points), and then by uniqueness of solutions if $W^u(p_1)$ is tangent to $W^s(p_2)$ they must meet along a whole orbit. Thus genericity of (hyperbolicity + (2)) corresponds to the Kupka-Smale Theorem (page 227).
Smale proposed a closer study of systems on
compact n-manifolds satisfying the above conditions with
230
)
(1)
and
(3)
together
(2)
All stable and unstable manifolds intersect transversally.
Such systems are now known as Morse-Smale
(MS)
systems.
Apart from their dynamic properties, Morse-Smale systems have the added interest of providing close analogues to the Morse inequalities (page 160). There is a set of Morse-Smale inequalities (Smale [119], Markus [73]), through which the topology of $M$ places constraints on the possible numbers of fixed points and periodic orbits for any MS diffeomorphism or flow on $M$. However, from the point of view of dynamics MS systems seemed good material for two plausible conjectures:
Conjecture 1 A system is structurally stable if and only if it is MS.
Conjecture 2 MS systems are dense in $\mathrm{Diff}^r(M)$ or $\mathfrak{X}^r(M)$.
Failing Conjecture 2, one could at least try
Conjecture 3 Structurally stable systems are dense in $\mathrm{Diff}^1(M)$, $\mathfrak{X}^1(M)$.
These conjectures were short-lived. Their fates are quickly summarized:
Conjecture 1: MS implies structurally stable: proved TRUE by Palis and Smale (see [22]). Structurally stable implies MS: FALSE in general.
Conjecture 2: FALSE in general.
Conjecture 3: FALSE in general.
Of course, the conjectures are true for vector fields when $\dim M = 2$ by Peixoto's Theorem. They are also true for diffeomorphisms when $\dim M = 1$ (i.e. $M = S^1$).
The failure of Conjectures 1 and 2 in general
is a consequence of the fact that there exist complicated non-wandering
phenomena which systems can exhibit and which cannot be removed by small perturbations.
We will now look at two basic examples of such phenomena.
The failure of Conjecture 3 is more subtle,
and we shall be in a better
position to understand it once we have understood these examples.
EXAMPLE 1: The Smale Horseshoe
Take a rectangle $R$ in the plane $\mathbf{R}^2$. Pick it up, stretch it and bend it into a horseshoe shape as shown, and put it down again partly on top of its original position (vertices $A$, $B$, $C$, $D$ go to $A'$, $B'$, $C'$, $D'$): see Figure 49.
Figure 49
It is not difficult, using smooth but possibly non-analytic functions (cf. example on page 86) to piece things together to incorporate this construction into a diffeomorphism $f : \mathbf{R}^2 \to \mathbf{R}^2$. Furthermore (and this is important) we can arrange that the two rectangular strips $R \cap f^{-1}(R)$
consisting of points of
R
which remain in
are each stretched linearly by a factor (parallel to 'horizontal'
AD)
remain m
R
R
>1
when in the
f
R^f
-1
(parallel to (R)
when both
of four thin rectangles,
f
f
-2
and
(R) ,
'vertical'
f
2
0,
, a A = e
3. Hyperbolic fixed points and periodic orbits are hyperbolic invariant sets in the above sense.
4. The horseshoe example is hyperbolic on $\Lambda$; the toral automorphism example is hyperbolic on the whole torus.
If $\Lambda$ is hyperbolic for $f$ and $x$ is a point of $\Lambda$ we would like to define the (generalized) stable manifold $W^s(x)$ to be the set of all $y$ in $M$ for which the distance between $f^n(y)$ and $f^n(x)$ approaches zero as $n \to \infty$. To do this we must have a metric on $M$, but since we already have a Riemannian structure we can use that to provide a metric: again see §3.4. Alternatively, we could embed $M$ in euclidean space. Therefore the definition of $W^s(x)$ makes sense (it is independent of the particular metric chosen) and there is then the Generalized Stable Manifold Theorem, due to Hirsch and Pugh [53], which we summarize:
THEOREM Each $W^s(x)$ is an immersed copy of $\mathbf{R}^s$, with $E^s_x$ as tangent space at $x$. As $x$ runs through $\Lambda$ all the $W^s(x)$ fit together to give a continuously-varying family (at least locally near $\Lambda$), and the family is invariant in that $W^s(f(x)) = fW^s(x)$.
Replacing $n$ by $-n$ we obtain the definition and analogous theorem for generalized unstable manifolds $W^u(x)$. This theorem in the case $\Lambda = \Omega(f)$ is one of the main ingredients in the long-awaited Structural Stability Theorem that generalizes Peixoto's theorem and the theorem of Palis and Smale on MS systems:
THEOREM Suppose the diffeomorphism $f$ satisfies the following:
(1) the non-wandering set $\Omega$ is hyperbolic;
(2) periodic orbits are dense in $\Omega$;
(3) all intersections of (generalized) stable and unstable manifolds are transversal.
Then $f$ is structurally stable.
Remarks
1. This theorem was proved by Robbin [105] as the culmination of work by many people including Hirsch, Pugh, Shub and Smale: see [22], [11]. In Robbin's proof it is necessary to suppose that $f$ itself is actually $C^2$, although small perturbations are allowed in $\mathrm{Diff}^1(M)$. A proof in the $C^1$ case was given by Robinson [108].
2. The converse (that structural stability implies (1), (2) and (3)) is almost proved, and is certainly true if the definition of structural stability is strengthened a little: see Guckenheimer [46], Franks [39]. In fact Franks [40] shows that (1), (2), (3) are necessary and sufficient conditions for a slightly amended structural stability property that may be of more relevance from a practical point of view than the previous 'pure' definition.
3. Corresponding results exist for flows. See Robinson [107], Franks [38].
Behaviour of structurally stable systems
We will here use the word system to mean either a diffeomorphism or a flow. The two conditions on the non-wandering set $\Omega$
$\Omega$ hyperbolic
periodic points dense in $\Omega$
together comprise what Smale [125] called Axiom A. Thus Robbin's structural stability theorem can be stated as
Axiom A plus transversality of stable and unstable manifolds implies structural stability.
No examples are known of systems with $\Omega$ hyperbolic but periodic points not dense in $\Omega$, and so it is an unsolved problem to decide whether the first clause of Axiom A always implies the second.
The structure of $\Omega$ for an Axiom A system can be described in fairly general but useful terms via Smale's $\Omega$-decomposition theorem [125]:
THEOREM If Axiom A is satisfied then $\Omega$ breaks up into a disjoint union
$\Omega = \Omega_1 \cup \Omega_2 \cup \ldots \cup \Omega_s$
in which each $\Omega_i$ is closed, invariant under the diffeomorphism (or flow), and such that the diffeomorphism (or flow) acts transitively on each $\Omega_i$, i.e. has an orbit dense in $\Omega_i$.
It is possible that $s = 1$ and $\Omega = \Omega_1$ is the whole of the manifold $M$: this was the case for the hyperbolic toral automorphism. In general, systems for which the whole manifold is hyperbolic are called Anosov systems, since they first came to light in the flow context through work of the Russian D.V. Anosov. It is not known whether hyperbolicity of $M$ always implies $\Omega = M$. The classification of Anosov systems is an interesting and very difficult open problem, important because an Anosov system with $\Omega = M$ is an example of a dynamical system that exhibits complicated recurrent behaviour everywhere which cannot be destroyed by small perturbations (since it is structurally stable).
The sets $\Omega_i$ in the decomposition are called basic sets. They generalize hyperbolic fixed points and periodic orbits as models for stable (thus observable) recurrence phenomena in dynamical systems.
The Cantor set in the horseshoe is an example of a complicated basic set. In principle we would like to be able to classify all possible types of basic sets $\Omega_i$ for Axiom A systems, and especially those which are attractors, i.e. for which $W^s(\Omega_i)$ is a neighbourhood of $\Omega_i$ (so that everything near $\Omega_i$ approaches $\Omega_i$ as $t$ (or $n$) $\to +\infty$). These generalize sinks, which are hyperbolic fixed points $p$ with $\dim W^s(p) = \dim M$. Some interesting results have been obtained by Williams [151].
There is a school of thought that strange attractors in smooth dynamical systems may be the right models for hydrodynamic turbulence. See Ruelle and Takens [110], Iooss [60], Marsden et al. [75], Arnol'd [10].
The behaviour of Axiom A diffeomorphisms and flows on the basic sets $\Omega_i$ themselves has been studied in detail by Bowen, who showed that symbolic dynamics can be used to give a very good picture of the measure theoretical or statistical properties of the system. The idea of entropy plays an important role. Thus deterministic mechanics leads via Axiom A to statistical mechanics within the same model. See Bowen [19].
To understand the global behaviour of an Axiom A system we have to see which points $x$ go from one $\Omega_i$ to another $\Omega_j$, i.e. satisfy $\alpha(x) \subset \Omega_i$ and $\omega(x) \subset \Omega_j$. Such an $x$ lies in the intersection of $W^u(\Omega_i)$ and $W^s(\Omega_j)$. (This is more difficult to prove than it looks. It amounts to showing that if $d(f^n(x), \Omega_j) \to 0$ as $n \to +\infty$ then there is some point $y$ in $\Omega_j$ with $d(f^n(x), f^n(y)) \to 0$ as $n \to +\infty$, i.e. $x \in W^s(y)$. See Figure 51. The proof uses strongly the density of periodic points in $\Omega$, and the result is false for an arbitrary hyperbolic invariant set $\Lambda$: see the counterexample due to Bowen in [54].)
Figure 51
Putting an ordering on the $\Omega_i$'s by writing $\Omega_i > \Omega_j$ to mean that $W^u(\Omega_i) \cap W^s(\Omega_j)$ is nonempty, we can draw a schematic diagram or graph with vertices representing the $\Omega_i$'s and a directed edge joining $\Omega_i$ to $\Omega_j$ whenever $\Omega_i > \Omega_j$. It can be shown that the condition of transversal intersection of stable and unstable manifolds implies that there are no cycles (closed circuits of consistently directed edges) in the graph, so that the global behaviour of the system is represented by a hierarchical structure as in Figure 52. The attractors are the $\Omega_i$'s at the bottom of the hierarchy.
Figure 52
In general it is not known which graphs can come from Axiom A systems, or to what extent equivalent graphs imply equivalent systems, although a classification in the case of Morse-Smale systems on 2-manifolds has been carried out by Peixoto [98].
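The graph representation just described can be sketched computationally. This is only an illustrative sketch: the vertex names and the edge list below are hypothetical (they do not come from any particular Axiom A system), and edges of the form $\Omega_i > \Omega_i$ are ignored. The code checks that a given directed graph has no cycles and picks out the vertices at the bottom of the hierarchy, which play the role of attractors.

```python
def successors_map(vertices, edges):
    """Adjacency lists for the directed graph, ignoring trivial self-relations."""
    successors = {v: [] for v in vertices}
    for a, b in edges:
        if a != b:
            successors[a].append(b)
    return successors


def has_cycle(vertices, edges):
    """Depth-first search for a closed circuit of consistently directed edges."""
    successors = successors_map(vertices, edges)
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {v: WHITE for v in vertices}

    def visit(v):
        colour[v] = GREY
        for w in successors[v]:
            if colour[w] == GREY:
                return True  # returned to a vertex on the current path
            if colour[w] == WHITE and visit(w):
                return True
        colour[v] = BLACK
        return False

    return any(colour[v] == WHITE and visit(v) for v in vertices)


def bottom_of_hierarchy(vertices, edges):
    """Vertices with no outgoing edge: the candidates for attractors."""
    successors = successors_map(vertices, edges)
    return sorted(v for v in vertices if not successors[v])


# Hypothetical example: five basic sets, an edge (a, b) meaning a > b.
basic_sets = ["O1", "O2", "O3", "O4", "O5"]
order = [("O1", "O2"), ("O1", "O3"), ("O2", "O4"), ("O3", "O4"), ("O3", "O5")]

print("cycle present:", has_cycle(basic_sets, order))
print("attractors:", bottom_of_hierarchy(basic_sets, order))
```

A depth-first search is used here simply because it detects directed cycles in time linear in the size of the graph; any other cycle test would serve equally well.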
Final remarks on stability
1. The example due to Smale [124], showing that structural stability is unfortunately not a generic property in $\mathrm{Diff}^1(M)$, was obtained by constructing an Axiom A diffeomorphism $f$ having an ordinary saddle point $p$ as one basic set $\Omega_1$ and a torus with a hyperbolic toral automorphism on it as another basic set $\Omega_2$, and then changing $f$ to $g$ for which $W^u(p)$ bends by a large amount and has non-transversal intersection with $W^s(y)$ for some $y \in \Omega_2$. If $g'$ is sufficiently close to $g$ this non-transversality persists, with $p', y', \Omega_2'$ for $g'$ close to $p, y, \Omega_2$ for $g$. The type of phase-portrait for $g'$ depends on whether or not $y'$ is periodic or non-periodic in $\Omega_2'$, and both possibilities occur densely. This means that $g$ has a neighbourhood of diffeomorphisms, none of which is structurally stable. Hence structurally stable systems are not dense in $\mathrm{Diff}^1(M)$, and Conjecture 3 is demolished. The same holds for vector fields, by taking a suspension.
2. Despite this result, it turns out that every system in $\mathrm{Diff}^1(M)$ or $\mathfrak{X}^1(M)$ can be approximated arbitrarily $C^0$-closely by structurally stable systems: see Shub [116], Zeeman [152]. This is a rather confusing result to interpret, since the definition of structurally stable uses $C^1$ perturbations, and yet the result gives $C^0$ density. The significance of this depends on what choices of topologies seem appropriate in any given context of mathematical modelling.
3. There are other global notions of stability that have been proposed in the hope that they may be both physically meaningful and (in some useful topology) generic. For example, $\Omega$-stability (stability of the system restricted to $\Omega$: see e.g. Shub [115]) is not generic; tolerance stability (due to Zeeman; see Takens [137]) is in a certain sense generic.
4.6 DYNAMICAL SYSTEMS UNDER CONSTRAINT
The results considered above about structural stability and genericity all assume that arbitrary perturbations in $\mathrm{Diff}^r(M)$ or $\mathfrak{X}^r(M)$ are to be allowed, whereas there are many circumstances in modelling real systems in which the admissible class of perturbations will be restricted and so possibly lead to entirely different results both about stability and genericity. In fact systems under constraint are probably more common than unconstrained systems. We will consider three types of constrained system: gradient systems, Hamiltonian systems and systems with symmetry, and comment on their global stability properties. To discuss gradient and Hamiltonian systems we need to introduce the cotangent bundle of a smooth manifold.
Cotangent bundles
Let $f : M \to \mathbf{R}$ be a smooth function on a manifold $M$ modelled on a Banach space $E$. For each point $p$ on $M$ we have the linear tangent map
$T_p f : T_p M \to T_{f(p)}\mathbf{R} = \mathbf{R}$
and $T_p f$ is an element of $(T_p M)^* = L(T_p M, \mathbf{R})$. Elements of $(T_p M)^*$ are called covectors at $p$, and $(T_p M)^*$ - which we rewrite as $T_p^*M$ - is the cotangent space of $M$ at $p$. Usually we denote $T_p f$ by $df(p)$: then $df$ assigns a covector to each point of $M$, and is thus an example of a covector field on $M$.
To discuss the way $df(p)$ varies with $p$ we need to put a topology, and preferably a smooth structure, on the set $T^*M$ of all covectors at all points on $M$. This can be done by analogy with the construction for the tangent bundle $TM$ (§3.4). Choose a chart $\tau : U \to U' \subset E$ for $M$ (as on page 167) and hence identify $T_p M$ with $E$ for $p \in U$; then identify $T_p^*M$ with $E^* = L(E, \mathbf{R})$. Corresponding to the chart $T\tau : TU \to U' \times E$ for $TM$ we obtain a map $\tau^* : T^*U \to U' \times E^*$, and so an atlas for $M$ gives us an atlas for $T^*M$ as a manifold modelled on the cartesian product $E \times E^*$. If the atlas for $M$ is $C^r$ then the atlas obtained for $T^*M$ is $C^{r-1}$. The proof of this is similar to that for $TM$, although a little more subtle. Whereas the $TM$ proof relied on the fact that a $C^r$ diffeomorphism $f : U \to V$ induces a $C^{r-1}$ overlap map $U \times E \to V \times F$ taking $(p,h)$ to $(f(p), Ah)$ where $A = Df(p) \in L(E,F)$ (equation (**) on page 165), this time we have to use the fact that $f : U \to V$ induces a $C^{r-1}$ map $U \times E^* \to V \times F^*$ taking $(p,h)$ to $(f(p), (A^{-1})^* h)$, where for $L \in L(F,E)$ the map $L^* \in L(E^*,F^*)$ takes $\mu \in E^*$ to $\mu \cdot L \in F^*$.
Thus $T^*M$ is a $C^{r-1}$ differentiable manifold called the cotangent bundle for $M$. As with $TM$ we have a natural projection map $\pi : T^*M \to M$, and a covector field on $M$ is a section of the bundle, i.e. a map $\alpha : M \to T^*M$ satisfying $\pi \cdot \alpha = \mathrm{id}$. Covector fields are also called 1-forms or Pfaffians (J. Pfaff, 1765-1825).
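The role of $(A^{-1})^*$ in the overlap map for $T^*M$ can be illustrated in coordinates. The following is a small sketch under assumptions of my own choosing: $E = F = \mathbf{R}^2$, the matrix `A` standing for $Df(p)$ is hypothetical, vectors are columns and covectors are rows. Transforming vectors by $A$ and covectors by composition with $A^{-1}$ leaves the value of a covector on a vector unchanged, which composing with $A$ itself would not.

```python
def apply_matrix(A, h):
    """Image A h of a column vector h under the linear map with matrix A."""
    return [sum(A[i][j] * h[j] for j in range(len(h))) for i in range(len(A))]


def compose_covector(mu, L):
    """The covector mu . L, i.e. the row vector mu multiplied on the right by L."""
    return [sum(mu[i] * L[i][j] for i in range(len(mu))) for j in range(len(L[0]))]


def evaluate(mu, h):
    """Value of the covector mu on the vector h."""
    return sum(m * x for m, x in zip(mu, h))


# Hypothetical derivative A = Df(p) and its inverse (det A = 1, so entries stay integral).
A = [[2, 1], [1, 1]]
A_inv = [[1, -1], [-1, 2]]

h = [4, 5]    # a tangent vector at p, in the chart on U
mu = [3, -1]  # a covector at p, in the chart on U

h_new = apply_matrix(A, h)             # vectors transform by A
mu_new = compose_covector(mu, A_inv)   # covectors transform by (A^{-1})^*
mu_naive = compose_covector(mu, A)     # the wrong rule, for comparison

print("original pairing:      ", evaluate(mu, h))
print("after correct transfer:", evaluate(mu_new, h_new))
print("after naive transfer:  ", evaluate(mu_naive, h_new))
```

The first two printed values agree, reflecting that $(\mu \cdot A^{-1})(Ah) = \mu h$.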
In finite dimensions we often regard $(\mathbf{R}^n)^*$ as $\mathbf{R}^n$ by blurring the distinction between $1 \times n$ matrices and $n \times 1$ column vectors. We are really saying that since any linear map $\lambda : \mathbf{R}^n \to \mathbf{R}$ can be expressed as
$\lambda(x) = \xi_1 x_1 + \xi_2 x_2 + \ldots + \xi_n x_n$
(given a choice of coordinates) we will identify $\lambda \in (\mathbf{R}^n)^*$ with $\xi = (\xi_1, \xi_2, \ldots, \xi_n) \in \mathbf{R}^n$. More briefly, we let $\xi \in \mathbf{R}^n$ correspond to $\lambda = \langle \xi, \cdot \rangle$ where $\langle\ ,\ \rangle$ is the standard inner product (page 52) on $\mathbf{R}^n$. This works equally well without coordinates on any finite-dimensional linear space $V$ if we replace the inner product by any non-degenerate bilinear map $\beta : V \times V \to \mathbf{R}$. There is a map $\beta^\flat : V \to V^*$ taking $v$ to $\beta(v, \cdot)$; the non-degeneracy implies that this is injective, and then by counting dimensions it must also be surjective, i.e. an isomorphism. In the case of a Hilbert space $H$ (page 53) the properties of the inner product are such that we again get an isomorphism between $H$ and $H^*$. This fact was used in discussing critical points of functions: see page 107.
Suppose that for each point $p$ on the smooth manifold $M$ we have a non-degenerate bilinear map
$\beta(p) : T_p M \times T_p M \to \mathbf{R}$.
Then we have
$\beta^\flat(p) : T_p M \to T_p^*M$,
and in the finite-dimensional or Hilbert space case we also have
$\beta^\sharp(p) = \beta^\flat(p)^{-1} : T_p^*M \to T_p M$.
Hence we can apply $\beta^\sharp(p)$ to convert any covector at $p$ to a vector at $p$, and so $\beta^\sharp$ converts any 1-form into a vector field.
To discuss continuity or smoothness of this process we construct a bundle over $M$ with fibre $M_2(T_p M \times T_p M;\ \mathbf{R})$ over $p$, obtaining a smooth manifold called the tensor bundle of covariant degree 2. We give no details, but the analogy with $T^*M$ is close: see e.g. Abraham [1]. A section $\beta$ of this bundle is called a covariant tensor field of degree 2, and is non-degenerate if the bilinear map $\beta(p)$ is non-degenerate for every $p$ in $M$.
All that we have said so far is summarized by: any smooth non-degenerate covariant tensor field $\beta$ of degree 2 on $M$ converts a smooth covector field $\alpha$ into a smooth vector field $X$, namely the unique vector field satisfying
$\beta(p)(X(p), v) = \alpha(p)v$
for every $v \in T_p M$ and every $p$ in $M$.
Now we look at two particular cases.
Gradient systems
If $\beta(p)$ is symmetric and positive definite for every $p$ then $\beta$ is a Riemannian structure on $M$ (see page 178). Given a $C^r$ function $f : M \to \mathbf{R}$ the covector field $df$ is converted by $\beta$ into a $C^{r-1}$ vector field called the gradient field of $f$ with respect to $\beta$, denoted by $\mathrm{grad}_\beta f$ or just $\mathrm{grad}\, f$ if $\beta$ is understood. For example, when $M = \mathbf{R}^n$ and $\beta(p)$ is the standard inner product for all $p$ then $df(p) = Df(p)$ is the $1 \times n$ matrix
$\left( \dfrac{\partial f}{\partial x_1}(p),\ \dfrac{\partial f}{\partial x_2}(p),\ \ldots,\ \dfrac{\partial f}{\partial x_n}(p) \right)$
and so $\mathrm{grad}\, f$ is the same expression regarded as a field of vectors in $\mathbf{R}^n$. This was our definition of gradient system in §4.1 (Example 4). If we had chosen a different inner product of the form
$\beta(p)(x,y) = \sum_{i,j=1}^{n} g_{ij} x_i y_j$
(with $g_{ij}$ possibly depending on $p$) then $df(p)$ would not change but the $i$th component of the gradient vector at $p$ would become
$\sum_{j=1}^{n} h_{ij} \dfrac{\partial f}{\partial x_j}(p)$
where $(h_{ij})$ is the matrix inverse to $(g_{ij})$.
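The dependence of the gradient on the chosen inner product can be made concrete with a small computation. The sketch below rests on illustrative assumptions: $M = \mathbf{R}^2$, the function $f(x_1, x_2) = x_1^2 + 3x_2$, the point $p = (1, 2)$ and the matrix `g` are all hypothetical choices. It computes the components $\sum_j h_{ij}\, \partial f/\partial x_j$ and checks the defining relation $\beta(p)(X(p), v) = df(p)v$ on the basis vectors.

```python
from fractions import Fraction


def inverse_2x2(g):
    """Exact inverse (h_ij) of a 2 x 2 integer matrix (g_ij)."""
    (a, b), (c, d) = g
    det = a * d - b * c
    if det == 0:
        raise ValueError("the bilinear map is degenerate")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]


def gradient(g, df):
    """Components of the gradient vector: sum over j of h_ij * (df)_j."""
    h = inverse_2x2(g)
    return [sum(h[i][j] * df[j] for j in range(2)) for i in range(2)]


def beta(g, x, y):
    """The inner product sum over i, j of g_ij x_i y_j."""
    return sum(g[i][j] * x[i] * y[j] for i in range(2) for j in range(2))


# df(p) for the illustrative function f(x1, x2) = x1^2 + 3*x2 at p = (1, 2).
df_p = [2, 3]

standard = [[1, 0], [0, 1]]  # standard inner product
g = [[2, 1], [1, 1]]         # a different symmetric positive definite choice

print("gradient, standard inner product:", gradient(standard, df_p))
print("gradient, inner product g:       ", gradient(g, df_p))
```

The covector $df(p)$ is the same in both cases; only the vector it is converted into changes.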
A gradient field $X$ on a (compact) manifold $M$ gives rise to a flow $\phi$ with particularly simple properties. If we write $p_t = \phi_t(p)$ we have
$\dfrac{d}{dt} f(p_t) = df(p_t) X(p_t) = \beta(p_t)(X(p_t), X(p_t)) \ge 0$
by definition of $X$, since $\beta(p_t)$ is positive definite. Hence $f$ increases along the orbits of $X$, and so there can be no periodic or recurrent phenomena. The non-wandering set $\Omega$ consists of fixed points alone, namely the critical points of $f$. Moreover, the non-fixed orbits of $X$ are perpendicular (in terms of $\beta$) to the level sets $f = $ constant (remember (page 153) that away from critical points they are manifolds) since if $Y(p)$ is a tangent vector to $N = f^{-1}(k)$ then
$\beta(p)(X(p), Y(p)) = df(p)Y(p) = 0$.
This gives a very good grasp of the behaviour of gradient systems, and Smale [120] was able to prove that gradient systems with hyperbolic fixed points and transversal intersections of stable and unstable manifolds are structurally stable as one of the first main results of the theory. Of course, we now see this as a special case of the general structural stability theorem in §4.5. Interestingly, Smale's $\Omega$-decomposition theorem and the graph representation (page 243) allow any Axiom A system with transversality of intersection of stable and unstable manifolds to be regarded in a very loose way as a kind of overall gradient system.
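The monotonic behaviour of $f$ along orbits of a gradient field can be observed numerically. The following is a rough sketch under assumptions made only for illustration: $M = \mathbf{R}^2$ with the standard inner product, the function $f(x, y) = -(x^2 + 2y^2)$ is a hypothetical choice with a single critical point at the origin, and the flow is approximated by Euler steps rather than computed exactly.

```python
def f(x, y):
    """Illustrative function with a single critical point (a maximum) at the origin."""
    return -(x * x + 2.0 * y * y)


def grad_f(x, y):
    """Gradient of f with respect to the standard inner product."""
    return (-2.0 * x, -4.0 * y)


def euler_orbit(x, y, dt=0.01, steps=1000):
    """Approximate the orbit of the gradient field through (x, y) by Euler steps."""
    points = [(x, y)]
    for _ in range(steps):
        gx, gy = grad_f(x, y)
        x, y = x + dt * gx, y + dt * gy
        points.append((x, y))
    return points


orbit = euler_orbit(1.0, -0.5)
values = [f(x, y) for (x, y) in orbit]

# f never decreases along the approximate orbit, and the orbit tends to the critical point.
print("f at start:", values[0])
print("f at end:  ", values[-1])
print("monotone:  ", all(b >= a for a, b in zip(values, values[1:])))
```

No recurrence is possible here: the orbit can only climb towards the critical point.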
Hamiltonian systems
If $\beta(p)$ is anti-symmetric (i.e. $\beta(p)(u,v) = -\beta(p)(v,u)$ for every $u, v$ in $T_p M$) then $\beta$ is called a 2-form on $M$. As above, a non-degenerate 2-form will convert any covector field into a vector field. This time we see that $df$ becomes a vector field $X$ which is not perpendicular to the level sets (manifolds away from critical points of $f$, = zeros of $X$) but is tangent to them, since we have
$\dfrac{d}{dt} f(p_t) = \beta(p_t)(X(p_t), X(p_t)) = 0$
because of the anti-symmetry, and so $f$ remains constant along orbits. This means that $f$ represents some quantity which is conserved by the flow. For most $k$ (recall Sard's Theorem, page 157), the level set $f^{-1}(k)$ is a codimension-1 submanifold of $M$ and so for fixed such $k$ the system has been reduced to dimension $n-1$. If for some reason we had to apply to the system a constraint of the form $f(x) \le k$ we would then be working with a flow on a manifold with boundary $f^{-1}(k)$.
If $\beta$ satisfies a further technical condition (namely, that it is closed: see e.g. Abraham [1] or Spivak [128]) then it becomes a symplectic form, and the vector field $X$ is called a Hamiltonian vector field (W.R. Hamilton, 1805-1865) with $f$ in this context called a Hamiltonian function. It so happens that the cotangent bundle $T^*N$ of any manifold $N$ has a natural symplectic 2-form that can be defined on it (cf. §3.2, [8]), and so any smooth function $f : T^*N \to \mathbf{R}$ gives automatically a Hamiltonian vector field $X$ on the phase space $T^*N$. (Remember that here $M = T^*N$, so we have $X : T^*N \to T(T^*N)$.) If we take $N$ to be an open set $U$ in $\mathbf{R}^n$ then $T^*U = U \times \mathbf{R}^n$ and it turns out that the vector field $X$ corresponds to the system of ordinary differential equations
$\dot{x}_i = \dfrac{\partial f}{\partial y_i}, \qquad \dot{y}_i = -\dfrac{\partial f}{\partial x_i}, \qquad i = 1, 2, \ldots, n$
where $(x,y)$ are the coordinates in $U \times \mathbf{R}^n$. These are Hamilton's equations from classical mechanics.
The global behaviour of Hamiltonian flows is very complicated, since it turns out that there can be no sources or sinks, and recurrence phenomena abound.
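Conservation of a Hamiltonian function along orbits of Hamilton's equations can be checked numerically. The sketch below makes several assumptions for illustration only: $n = 1$, the Hamiltonian $f(x, y) = \tfrac{1}{2}(k^2 x^2 + y^2)$ with the normalising factor $\tfrac{1}{2}$ is a hypothetical choice, the sign convention is $\dot{x} = \partial f/\partial y$, $\dot{y} = -\partial f/\partial x$, and the flow is approximated by a fourth-order Runge-Kutta scheme, so $f$ is conserved only up to a small discretisation error.

```python
def hamiltonian(x, y, k):
    """Illustrative Hamiltonian function f(x, y) = (k^2 x^2 + y^2) / 2."""
    return 0.5 * (k * k * x * x + y * y)


def hamiltonian_field(x, y, k):
    """Hamilton's equations: (x', y') = (df/dy, -df/dx)."""
    return (y, -k * k * x)


def rk4_step(x, y, k, dt):
    """One classical fourth-order Runge-Kutta step for the Hamiltonian vector field."""
    k1x, k1y = hamiltonian_field(x, y, k)
    k2x, k2y = hamiltonian_field(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y, k)
    k3x, k3y = hamiltonian_field(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y, k)
    k4x, k4y = hamiltonian_field(x + dt * k3x, y + dt * k3y, k)
    x_new = x + dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
    y_new = y + dt * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0
    return x_new, y_new


def energies_along_orbit(x, y, k=2.0, dt=0.001, steps=10000):
    """Values of the Hamiltonian at successive points of the approximate orbit."""
    values = [hamiltonian(x, y, k)]
    for _ in range(steps):
        x, y = rk4_step(x, y, k, dt)
        values.append(hamiltonian(x, y, k))
    return values


energies = energies_along_orbit(1.0, 0.0)
drift = max(abs(e - energies[0]) for e in energies)
print("initial value of f:", energies[0])
print("largest deviation of f along the orbit:", drift)
```

The orbit stays, to within numerical error, on the level set of $f$ through its starting point, in contrast with the gradient case where $f$ strictly increases away from critical points.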
There are some quite simple properties known to be generic for Hamiltonian flows (see Robinson [106], Takens [135]), and also some extraordinarily elaborate ones (Markus and Meyer [74]). In view of these, it seems difficult to know how to formulate global stability conjectures within the Hamiltonian context. Of course, the characteristically Hamiltonian behaviour can always be destroyed by allowing perturbations through non-Hamiltonian systems in $\mathfrak{X}(M)$. A common example is the simple harmonic system (Example 1, §4.1) in which the family of periodic orbits around the origin is converted into a spiral configuration with the origin as a sink by the addition of arbitrarily small amounts of frictional damping. Here the undamped system is Hamiltonian with symplectic 2-form
(u,v) *->
- u^v2
for
2 2 2 f(x^,x2) = k x^ + x2 u,v e R
2
= T^R
2
an R
.
let $E_n$ denote the set of germs at $0$ of $C^\infty$ functions $\mathbf{R}^n \to \mathbf{R}$. For convenience of writing we will be rather lax about distinguishing between germs and functions that represent them. We have previously said that two such germs $f, g$ are right equivalent if there exists a germ of a diffeomorphism $\gamma : \mathbf{R}^n \to \mathbf{R}^n$ taking $0$ to $0$ and such that $g = f \cdot \gamma$. Since we are interested only in critical points, we will now broaden the definition a little, and say that $f, g$ are equivalent if
$g = \pm f \cdot \gamma + K$
for some constant $K$.
Suppose that we have a topology on $E_n$. Then we can define an element $f$ of $E_n$ to be stable if it has some neighbourhood $N$ in $E_n$ for which every $g$ in $N$ is equivalent to $f$.
Following the general approach outlined at the beginning of this section, we let $\Sigma$ denote the subset of $E_n$ consisting of those germs which are not stable. To study $\Sigma$ we try to analyze the 'avoidability' of various pieces of it in general $k$-parameter families of maps $\mathbf{R}^n \to \mathbf{R}$, for $k = 1, 2, \ldots$
The programme behind Thom's Theorem is as follows. We ought to be able to regard $E_n$ as an infinite-dimensional manifold. Any element with non-vanishing first derivative at $0$ is stable (since it is right equivalent to its derivative (see page 110) and any two non zero linear maps are right equivalent), and so $\Sigma$ lies in the codimension-$n$ submanifold $\Sigma_0$ consisting of germs with vanishing derivative. Further, $\Sigma$ ought to be decomposable as a collection of disjoint (although abutting) pieces $\Sigma_1, \Sigma_2, \ldots$ together with a very 'small' remaining piece, where each $\Sigma_i$ is a codimension-$i$ submanifold of $\Sigma_0$. Each $\Sigma_i$ should consist of a finite number of classifiable equivalence classes.
Given a $k$-parameter family of maps $f_c : \mathbf{R}^n \to \mathbf{R}$ (where $c = (c_1, c_2, \ldots, c_k) \in \mathbf{R}^k$) we define a global germ map
$F : \mathbf{R}^n \times \mathbf{R}^k \to E_n$
by
$F(x,c)$ = germ of $y \mapsto f_c(x+y)$ at $y = 0$.
In other words, $F(x,c)$ is the germ at $0$ of $f_c$ with origin moved to $x$. In general we hope for this map to be transversal (page 176) to all the pieces $\Sigma_i$ of $\Sigma$, and therefore to meet only those $\Sigma_i$ with $n+i \le n+k$, avoiding the remaining piece as well. Finally, two $k$-parameter families whose global germ maps meet a given $\Sigma_i$ transversally at the same point should themselves be in some sense locally equivalent.
The initial obstacle to carrying out this programme is that $E_n$ cannot easily be made into a suitable manifold. A way around this, though, is to convert the problem into a finite-dimensional one by working with jets of some given order, look in this linear space of jets for submanifolds representing the $\Sigma_i$ (see page 88) and then to use facts about determinacy of jets (§2.9) to lift the results back up to the germ level.
A second problem is that the decomposition of $\Sigma$ only works in the simple way suggested for $i \le 5$: after that, natural candidates for $\Sigma_i$ tend to contain infinite families of equivalence classes. However, Thom was interested in $k \le 4$ parameters for reasons mentioned below, and so this problem is avoided.
The genericity of transversality of $F$ to the $\Sigma_i$ is expected as a consequence of Thom's powerful Transversality Theorem:
THEOREM If $M, W$ are manifolds and $S$ is a closed submanifold of $M$ then the set of maps $W \to M$ which are transversal to $S$ is an open dense subset of $C^\infty(W,M)$.
This theorem has been the springboard for many important advances in differential topology. In fact we need here a more sophisticated version to show (a) perturbations in $f$ supply enough perturbations of $F$ to obtain transversality with the pieces $\Sigma_i$ of $\Sigma$, and (b) the fact that the submanifolds $\Sigma_i$ may not be closed does not matter since together they form a stratification. The required version exists, also due to Thom.
By writing $\Phi(x,c)$ instead of $f_c(x)$ we may as well regard a $k$-parameter family locally as a germ
$\Phi : \mathbf{R}^n \times \mathbf{R}^k \to \mathbf{R}$
at $(0,0)$, which we will assume to be smooth in $x$ and $c$ and therefore an element of $E_{n+k}$. We call $\Phi$ an unfolding of the germ $f = f_0$. Two unfoldings $\Phi, \Psi$ of $f$ are equivalent if there exist smooth germs
$w : \mathbf{R}^n \times \mathbf{R}^k \to \mathbf{R}^n$ with $x \mapsto w(x,c)$ a diffeomorphism germ for each $c$, and $w(x,0) = x$
$b : \mathbf{R}^k \to \mathbf{R}^k$ a diffeomorphism germ with $b(0) = 0$
$a : \mathbf{R}^k \to \mathbf{R}$ with $a(0) = 0$
such that
$\Psi(x,c) = \Phi(w(x,c), b(c)) + a(c)$. (*)
Thus $\Psi$ is obtained from $\Phi$ by (i) an invertible change of $x$-variables, depending on $c$, (ii) an invertible change $b$ of $c$-variables and (iii) addition of a number $a$ depending on $c$. We call the unfolding $\Phi$ stable if, roughly, any other unfolding $\Psi$ (of $f$) sufficiently $C^\infty$-close to $\Phi$ is equivalent to $\Phi$. In trying to make this precise we again run up against the problem of a topology for $E_{n+k}$, but one way around this is to choose a specific neighbourhood $U$ of $(0,0)$ to work with, using a $C^\infty$ topology for maps $U \to \mathbf{R}$, and then to insist on stability on $U$ for every $U$. See Wassermann [146] for a very careful discussion of these questions.
Now we can give a more technical expression of Thom's theorem.
THEOREM For any $n > 1$ there are seven equivalence classes (called strata) $\Sigma_i^j$ in $E_n$ such that
(1) each $\Sigma_i^j$ is a submanifold of $E_n$ of codimension $n+i$;
(2) generically, for a $k$-parameter family $\Phi$ with $k \le 4$ the corresponding global germ map $F$ will be transversal to the $\Sigma_i^j$ and will avoid the rest of the set $\Sigma$ of unstable germs;
(3) such a $\Phi$ is versal, meaning that if the origin is chosen in $\mathbf{R}^n \times \mathbf{R}^k$ such that $\Phi$ is an unfolding of $f \in \Sigma_i^j$ then any other unfolding $\Psi$ of $f$ can be obtained from $\Phi$ by coordinate changes as in (*) except that $b$ need not be invertible. If $k = i$ then $\Phi$ is called universal: every other universal unfolding of $f$ is equivalent to $\Phi$.
From this and the definition of transversal it follows that universal unfoldings are stable.
Remarks
1. The interpretation of the manifold structures is strictly speaking via jets, as mentioned above.
2. For $n = 1$ there is only one stratum of each codimension: see the comments on Thom's classification list below.
3. For $k = 5$ we include four more strata, but for $k \ge 6$ we would need to include an infinite family and this would complicate the genericity property (2).
4. Under our definition $f$ is equivalent to $-f$. A sign change converts minima (= stable equilibria) into maxima and vice-versa, so in applications it is important to distinguish between $f$ and $-f$. If we do this the seven strata break into ten (not 14, since e.g. $x^3$ can be changed to $-x^3$ by $x \mapsto -x$).
5. We supposed $\Phi$ defined on all of $\mathbf{R}^n \times \mathbf{R}^k$ only for convenience; any open subset would do for the domain of $\Phi$.
6. Full details of the machinery for proving the theorem were essentially published in more general form in the work of Mather [jsH •
The main tool
for dealing with unfoldings on the jet level is the Preparation Theorem, a classical theorem for formal power series only recently proved by Malgrange CO
for
C
functions.
See references at the end of this chapter.
A universal unfolding of $f$ gives a full description of the way the degenerate critical point at $0$ breaks up into a cluster of critical points nearby when $f$ is perturbed in any way whatsoever that is small in the appropriate $C^\infty$ sense, thus reducing a seemingly infinitely complicated problem into a finite geometrical one.

For any unfolding $\Phi$, the set $K$ of points $c \in R^k$ for which $f_c$ has at least one degenerate critical point is the catastrophe set of $\Phi$. As $c$ passes through $K$ a bifurcation occurs for a flow on $R^n$ governed by $f_c$. For universal unfoldings $\Phi$, $\Psi$ of $f$ the diffeomorphism $b$ in (*) takes the catastrophe set for $\Psi$ onto that for $\Phi$, and so the catastrophe set for any particular universal unfolding is a kind of archetype representing the catastrophe sets for all other universal unfoldings.

Thom has given a now celebrated list of seven germs representing each of the equivalence classes $\Sigma$, together with universal unfoldings for them and (theoretically) pictures of the corresponding archetypal catastrophe sets. We will not give the list, since it may be found in any of the references to catastrophe theory given below: perhaps this is the first account of the theory in which the list does not appear! Four germs are of the form $f(x) + Q$ while three are of the form $f(x,y) + Q$, where $Q$ is a non-degenerate quadratic form in the remaining $n-1$ or $n-2$ variables. Recall that from the Gromoll-Meyer lemma (page 107) any germ at a degenerate critical point can be put into the form (totally degenerate) + (non-degenerate). Implicit in Thom's list is the theorem that any germ whose degenerate part needs more than two variables will be avoidable in generic k-parameter families, $k \le 4$.
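As a concrete illustration of a catastrophe set, take the two-parameter family $f_c(x) = x^4 + c_1 x + c_2 x^2$ (chosen here only as an example). A degenerate critical point requires $f_c'(x) = f_c''(x) = 0$, and eliminating $x$ gives the curve $8c_2^3 + 27c_1^2 = 0$ in the $(c_1, c_2)$-plane. The following is a minimal Python sketch under these assumptions; the function names are hypothetical and the root count is a crude grid search, not a rigorous method.

```python
def degenerate_parameter(x):
    """Return (c1, c2) for which f_c(t) = t**4 + c1*t + c2*t**2
    has a degenerate critical point at t = x.

    f_c'(x)  = 4x^3 + 2*c2*x + c1 = 0
    f_c''(x) = 12x^2 + 2*c2       = 0
    """
    c2 = -6.0 * x**2
    c1 = -4.0 * x**3 - 2.0 * c2 * x  # simplifies to 8 x^3
    return c1, c2


def cusp_discriminant(c1, c2):
    """Vanishes exactly on the catastrophe set of the family above."""
    return 8.0 * c2**3 + 27.0 * c1**2


def count_critical_points(c1, c2, lo=-3.0, hi=3.0, steps=6001):
    """Count sign changes of f_c'(t) on a grid (assumes all roots lie in [lo, hi])."""
    count = 0
    prev_sign = 0
    for i in range(steps):
        t = lo + (hi - lo) * i / (steps - 1)
        d = 4.0 * t**3 + 2.0 * c2 * t + c1
        sign = (d > 0) - (d < 0)
        if sign != 0:
            if prev_sign != 0 and sign != prev_sign:
                count += 1
            prev_sign = sign
    return count
```

Parameters with negative discriminant lie inside the cusp, where $f_c$ has three critical points; those with positive discriminant lie outside, where it has one. Crossing the curve changes the count, which is the bifurcation described above.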
Given an unfolding of a germ with degenerate critical point at $0$ it is important to know how to find out whether the unfolding is universal or not. There is a formal criterion for this: the unfolding $\Phi$ of $f$ is universal if and only if, for every germ $g \in E_n$, there exist real numbers $a_0, a_1, \ldots, a_k$ and germs $h_i \in E_n$ such that
$$g(x) = a_0 + \sum_{j=1}^{k} a_j \frac{\partial \Phi}{\partial c_j}(x,0) + \sum_{i=1}^{n} h_i(x)\, d_i(x)$$
where $d_i(x) = \partial f / \partial x_i$ as in §2.9.
EXAMPLES
1. $n = 1$; $f(x) = x^4$. Here $\Phi(x; c_1, c_2) = x^4 + c_1 x + c_2 x^2$ is a universal unfolding since any $g$ can be written as
$$g(x) = a_0 + a_1 \frac{\partial \Phi}{\partial c_1} + a_2 \frac{\partial \Phi}{\partial c_2} + \frac{\partial f}{\partial x}\, h = a_0 + a_1 x + a_2 x^2 + 4x^3 h$$
for some germ $h$. This is the cusp catastrophe.
2. $n = 2$; $f(x,y) = x^3 + y^3$. The unfolding $\Phi(x,y; c_1,c_2,c_3) = x^3 + y^3 + c_1 x + c_2 y + c_3 xy$ is universal, since any $g$ can be written as
$$g(x,y) = a_0 + a_1 \frac{\partial \Phi}{\partial c_1} + a_2 \frac{\partial \Phi}{\partial c_2} + a_3 \frac{\partial \Phi}{\partial c_3} + 3x^2 h + 3y^2 k$$
for some germs $h$, $k$. This is the hyperbolic umbilic catastrophe.

In general when $\Phi$ is not a polynomial there will be no finite way of checking this criterion directly, so we need to replace the problem by one about polynomials rather as in the tests for determinacy in §2.9. This can be done easily enough, although it is a little elaborate to write down. See Poston and Stewart [101].
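For a polynomial $g$ the decomposition in Example 1 amounts to splitting coefficients: the terms of degree at most 2 supply $a_0, a_1, a_2$, and every term of degree 3 or more is divisible by $4x^3$. The sketch below illustrates only this polynomial case, with $g$ given as a list of coefficients in increasing degree; the function names are hypothetical, and the general criterion for smooth germs is not captured by it.

```python
from fractions import Fraction


def cusp_decomposition(g):
    """Split g(x) = sum(g[i] * x**i) as a0 + a1*x + a2*x**2 + 4*x**3 * h(x).

    Returns ((a0, a1, a2), h) where h is a coefficient list (possibly empty).
    """
    coeffs = [Fraction(c) for c in g]
    low = coeffs[:3] + [Fraction(0)] * (3 - len(coeffs[:3]))
    h = [c / 4 for c in coeffs[3:]]  # divide the high-order part by 4
    return tuple(low), h


def evaluate(coeffs, x):
    """Evaluate a polynomial given by its coefficient list at x."""
    return sum(c * x**i for i, c in enumerate(coeffs))


def recombine(a, h, x):
    """Evaluate a0 + a1*x + a2*x**2 + 4*x**3*h(x)."""
    a0, a1, a2 = a
    return a0 + a1 * x + a2 * x**2 + 4 * x**3 * evaluate(h, x)
```

For instance, `cusp_decomposition([1, -2, 0, 8, 5])` yields the constants `(1, -2, 0)` and the quotient `h(x) = 2 + (5/4)x`; exact rational arithmetic is used so that the identity can be checked without rounding error.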
The use of catastrophe theory in practice

To interpret the significance of the catastrophe set in a given family of systems we have to know how the critical points of the function $f_c$ are observed, via the physical or other laws which operate in the particular context. For example, when there are two or more minima of $f_c$ available the system may adopt the state corresponding to the lowest minimum: in this case the set of points $c$ for which $f_c$ has two equal-valued minima (the shock-wave set) will play a crucial role, rather than the catastrophe set itself.

Thom's use of catastrophes to study morphogenesis in biology is based on the following model. If $T$ is a piece of biological tissue in space and time we think of $T$ as a region in $R^4$. Each cell $C$ in $T$ can be regarded as a point $c = (c_1,c_2,c_3,c_4) \in R^4$, and then if we hypothesize that the internal behaviour of $C$ is described by a dynamical system governed by a potential function $f_c$ we have a 4-parameter family $\Phi$ and can apply the foregoing theory. The catastrophe set and shock-wave set are subsets of $T$ itself, and through the laws of adhesion and other properties of cells they will influence the observed manifestations of the bifurcations in $f_c$. Since we are modelling a real-life system we can assume that $\Phi$ will be everywhere locally stable as an unfolding (an
assumption vital to the model) and hence since there are the local forms for
$
and the catastrophe set
K
k Rm
for cases other than
Singularities which persist (up to local equivalence) under small perturbations are stable. Stable singularities $R^2 \to R^2$ were classified by Whitney [149]: they have local form either $(x_1,x_2) \mapsto (x_1, x_2^2)$ (a fold) or $(x_1,x_2) \mapsto (x_1, x_2^3 + x_1 x_2)$ (a cusp as in Example 4, page 102). Much is now known about stability of singularities and unfoldings of unstable ones, following the work of Mather [78]. See Golubitsky and Guillemin [42].
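The fold and cusp normal forms can be explored directly: the singular set of each map is where its Jacobian determinant vanishes. The sketch below (hypothetical function names, exact integer arithmetic) computes these determinants and checks that the cusp map sends its singular set $x_1 = -3x_2^2$ onto the cuspidal curve $4u^3 + 27v^2 = 0$; it is an illustration, not part of Whitney's classification argument.

```python
def fold(x1, x2):
    """Whitney fold: (x1, x2) -> (x1, x2^2)."""
    return (x1, x2**2)


def cusp(x1, x2):
    """Whitney cusp: (x1, x2) -> (x1, x2^3 + x1*x2)."""
    return (x1, x2**3 + x1 * x2)


def fold_jacobian_det(x1, x2):
    # Jacobian matrix is [[1, 0], [0, 2*x2]].
    return 2 * x2


def cusp_jacobian_det(x1, x2):
    # Jacobian matrix is [[1, 0], [x2, 3*x2**2 + x1]].
    return 3 * x2**2 + x1


def cusp_singular_point(t):
    """Point of the singular set {x1 = -3*x2^2} with x2 = t."""
    return (-3 * t**2, t)


def on_cuspidal_curve(u, v):
    """True if (u, v) lies on 4u^3 + 27v^2 = 0."""
    return 4 * u**3 + 27 * v**2 == 0
```

The fold is singular along the line $x_2 = 0$, whose image is the line $v = 0$; the cusp is singular along a parabola, whose image has a cusp point at the origin.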
This information should soon find its way into any sphere of mathematical modelling concerned with $m$ functions of $n$ variables, as it has done already in mathematical economics.

4. Infinite dimensions.
In continuum mechanics bifurcation theory has developed many techniques of its own (Keller and Antman [63], Sattinger [112]) and is only now beginning to interact with catastrophe theory. Sometimes the problems can be reduced to finite-dimensional ones, using the Gromoll-Meyer lemma (page 107) as in Chillingworth [27], or the powerful Centre Manifold Theorem as in Marsden, Ebin and Fischer [75] or Marsden and McCracken [76]. The ideas of genericity can sometimes be used directly in the appropriate function-space: see Chow, Hale and Mallet-Paret [29]. There are also ways in which catastrophe theory can be used in partial differential equations: see Duistermaat [35], Guckenheimer [47]. Since partial differential equations are ultimately of more use in modelling the real world than are ordinary differential equations, it is probably here that catastrophe theory will in the long run have its greatest impact.
Remarks on the literature.

Most of the recent work in global stability and genericity for dynamical systems stems from the survey of Smale [125]. See in addition the lecture notes by Markus [73], and the books by Nitecki [90] and Abraham and Robbin [2]. The progress of research can be traced through the three Proceedings volumes [22], [97], [71], each of which also contains useful survey articles. For a treatment of differential equations with accessible proofs of some of the key results used in this chapter see the forthcoming book by Irwin and Robertson [62]. The monograph by Moser [85] relating classical and new theories in the context of celestial mechanics is highly recommended reading. In this connection see also the appealing book on differential topology and dynamical systems by Abraham [1].

The main reference in catastrophe theory is of course the treatise of Thom [138]. For mathematical background see Bröcker and Lander [21], Poston and Stewart [101], Wassermann [146], Zeeman and Trotman [159], and also the excellent survey of bifurcation theory by Arnol'd [10]. There are numerous expository articles about catastrophe theory. A very elementary introduction is Chillingworth [26]; for treatments with more mathematical content see Golubitsky [41], Sussmann [133]. The general article by Stewart [132] is valuable reading, although at present there is nothing to surpass the tour de force of Zeeman [156].
And if I've said too much, they'l say;
I'm Sorry not at all;
For much more unto Such, I may,
And not be Criminall.
Thomas Mace: Musick's Monument, 1676.
Appendix

TERMINOLOGY AND NOTATION FOR SETS AND FUNCTIONS

Sets

1. A set is a collection of objects specified either by a given list or by some defining property or process. (There are logical paradoxes inherent in this definition, but we keep well clear of them.) The objects in a set are called its elements: we write $a \in A$ to denote that $a$ is an element of $A$ or belongs to $A$. In many contexts elements are considered as points. The set consisting of elements $a_1, a_2, \ldots$ is denoted by $\{a_1, a_2, \ldots\}$, and if $A$ is defined as the set of elements $b$ of some other set $B$ satisfying a certain property $P$ we write $A = \{b \in B \mid b \text{ satisfies } P\}$. Standard sets we will use are:
N = natural numbers $\{1,2,3,\ldots\}$
Z = integers $\{0,\pm 1,\pm 2,\ldots\}$
Q = rational numbers $\{p/q$ where $p,q \in Z$ and $q \ne 0\}$
R = real numbers
C = complex numbers.
We will not define R, C here, but we assume familiarity with them.

2. The cartesian product $A \times B$ is the set of all pairs $(a,b)$ where the first object $a$ is an element of $A$ and the second object $b$ is an element of $B$. Examples are the euclidean plane $R^2 = R \times R$ and, more generally, euclidean n-space defined inductively as $R^n = R^{n-1} \times R$.

3. In a particular case the symbol $(a,b)$ may also be used to mean the open interval from $a$ to $b$, i.e. the set of real numbers $x$ with $a < x < b$, denoted formally by $(a,b) = \{x \in R \mid a < x < b\}$. The closed interval $[a,b]$ is $\{x \in R \mid a \le x \le b\}$. There are also (with the obvious meaning) half-open intervals $(a,b]$ and $[a,b)$, as well as infinite intervals $(a,\infty)$, $(-\infty,b)$, $(-\infty,\infty) = R$.

4. If every element of $A$ is automatically an element of $B$ we write $A \subset B$: $A$ is a subset of $B$, or is contained in $B$. This allows the possibility $A = B$, although some authors prefer the notation $A \subseteq B$ if equality is allowed. Note that contained in does not mean belongs to, which applies to elements and not subsets. If $A \subset B$, the set of everything in $B$ which does not belong to $A$ is the complement of $A$ in $B$, written $B - A$ when there is no possible confusion with subtraction in an algebraic sense. The union $A \cup B$ of $A$ and $B$ is the set consisting of everything in either $A$ or $B$; the intersection $A \cap B$ is the set of everything in both $A$ and $B$. It is easy to see that if $A$ and $B$ are subsets of $C$ then
$$C - (A \cup B) = (C - A) \cap (C - B)$$
$$C - (A \cap B) = (C - A) \cup (C - B).$$
It is convenient to think formally of an empty set $\emptyset$ which has no elements at all, so that for example if $A$, $B$ have no elements in common we can write $A \cap B = \emptyset$. Here we say $A$ and $B$ are disjoint.

5. A relation $R$ between pairs of elements of a set $A$ is called an equivalence relation if it is reflexive ($aRa$ for every $a \in A$), symmetric ($aRb$ implies $bRa$ for every $a,b \in A$) and transitive ($aRb$ and $bRc$ together imply $aRc$ for every $a,b,c \in A$). For given $a$ the set $\{x \in A \mid xRa\}$ is called the equivalence class of $a$ with respect to $R$: it contains $a$ itself in view of reflexivity. It follows from symmetry and transitivity that any two equivalence classes either coincide or are disjoint.

6. If $A$ is a subset of $R$ a number $M$ is called an upper bound for $A$ if $a \le M$ for all $a \in A$. If $M$ is an upper bound for $A$, and there exists no smaller number which is also an upper bound for $A$, then $M$ is a least upper bound or supremum for $A$: we write $M = \sup A$. Similarly for lower bound and greatest lower bound or infimum, written $\inf A$. It is either a theorem or an assumption about real numbers (depending on your definition of $R$) that every non-empty subset of $R$ with an upper bound has a least upper bound; similarly for lower bounds.
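Two of the statements in items 4 and 5 can be checked mechanically on finite sets: the complement identities for unions and intersections, and the fact that equivalence classes partition a set. The following minimal Python sketch uses built-in sets; the particular sets, the relation (congruence modulo 3) and the function name are illustrative choices only.

```python
def equivalence_classes(elements, related):
    """Partition `elements` into classes of the equivalence relation `related`."""
    classes = []
    for x in elements:
        for cls in classes:
            # One representative suffices, by symmetry and transitivity.
            if related(x, next(iter(cls))):
                cls.add(x)
                break
        else:
            classes.append({x})
    return classes


C = set(range(12))
A = {n for n in C if n % 2 == 0}  # even numbers in C
B = {n for n in C if n % 3 == 0}  # multiples of 3 in C

# C - (A u B) = (C - A) n (C - B)  and  C - (A n B) = (C - A) u (C - B)
de_morgan_union = (C - (A | B)) == ((C - A) & (C - B))
de_morgan_intersection = (C - (A & B)) == ((C - A) | (C - B))

# Equivalence classes of "a - b is divisible by 3" on C.
mod3_classes = equivalence_classes(sorted(C), lambda a, b: (a - b) % 3 == 0)
```

Here `mod3_classes` consists of three pairwise disjoint sets whose union is `C`, in line with the remark that two equivalence classes either coincide or are disjoint.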
Functions

7. A function $f$ from $A$ to $B$ is a prescription for associating precisely one element of $B$ to each element of $A$. We write $f : A \to B$ and denote the effect of $f$ on $a$ by $f(a)$. Sometimes the brackets are omitted for simplicity. If $f(a) = b$ we write $f : a \mapsto b$, to be read as "$f$ takes $a$ to $b$". In formulae it is occasionally convenient to indicate $f$ as $f(\cdot)$, with the dot keeping track of where the 'variable' has to be inserted in the formula. The word map is generally synonymous with function, although it is customary in analysis and topology to call a map a function only in the case when $B = R$. The set $A$ is the domain of the map $f$. The prescription for $f(a)$ must be defined for every element $a$ of $A$, or we are not strictly entitled to put $A$ at the beginning of the arrow. If $U$ is a subset of $A$, we may wish to consider the operation of $f$ on elements of $U$ alone. This restriction of $f$ to $U$ is written $f|U$. In full, this is $f|U : U \to B$.

8. The set $B$ is the codomain of $f$, a rather formal definition since the codomain could be changed by including $B$ in some larger set and thinking of this as the codomain without essentially doing anything to $f$. Not every element of the codomain need come via $f$ from anything in $A$. The subset of $B$ consisting of those elements that do come from something in $A$ is called the range of $f$, or the image of $f$, or the image of $A$ under $f$, in any case written $fA$. If $fA$ is the whole of $B$, so that everything in $B$ comes from something in $A$, then $f$ is surjective (is a surjection).

9. Two distinct elements of $A$ may well be taken by $f$ to the same element of $B$. If this never happens, then $f$ is injective (is an injection). Thus $f$ is injective if and only if $f(a) = f(a')$ implies $a = a'$.

10. If $f$ is both injective and surjective it is bijective (is a bijection). In other words, it is 'one-to-one and onto'. The identity map $id_A : A \to A : a \mapsto a$ is a bijection. A bijection $f : A \to B$ has an inverse $f^{-1} : B \to A$ defined by $f^{-1}(b)$ = unique $a \in A$ satisfying $f(a) = b$.

11. If $f : A \to B$ and $g : B \to C$ are any two maps then they have a composition $g \circ f : A \to C$ defined as $a \mapsto g(f(a))$ for every $a \in A$. Note that for this to make sense the domain $B$ of $g$ must be the same as the codomain of $f$, although the image of $f$ could be smaller than $B$. The inverse $f^{-1}$ of a bijection $f : A \to B$ is characterized by the relationships
$$f^{-1} \circ f = id_A : A \to A, \qquad f \circ f^{-1} = id_B : B \to B.$$

12. Even if $f$ is not a bijection, we can still define for each subset $V$ of $B$ a subset of $A$ called the inverse image of $V$ under $f$, consisting of those elements $a$ of $A$ for which $f(a)$ belongs to $V$. This inverse image is denoted by $f^{-1}(V)$, so
$$f^{-1}(V) = \{a \in A \mid f(a) \in V\}.$$
Remember, however, that the symbol $f^{-1}$ on its own is meaningless unless $f$ is bijective. (In this case the two meanings of $f^{-1}(V)$ coincide.) If $V$ consists of just one element $v$, we write $f^{-1}(v)$ instead of the more cumbersome but formally correct $f^{-1}(\{v\})$.
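For maps between finite sets the notions of image, injectivity, surjectivity and inverse image in items 8 to 12 can be tested directly. The sketch below represents a map as a Python dictionary; the helper names and the squaring example are illustrative assumptions, not notation from the text.

```python
def image(f, domain):
    """The image fA: all values taken by f on the domain."""
    return {f[a] for a in domain}


def is_injective(f, domain):
    """No two distinct elements of the domain share a value."""
    return len(image(f, domain)) == len(set(domain))


def is_surjective(f, domain, codomain):
    """Every element of the codomain is a value of f."""
    return image(f, domain) == set(codomain)


def inverse_image(f, domain, V):
    """All a in the domain with f(a) in V; defined even if f is not bijective."""
    return {a for a in domain if f[a] in V}


A = {-2, -1, 0, 1, 2}
B = {0, 1, 2, 3, 4}
square = {a: a * a for a in A}  # the map A -> B : a -> a^2
```

With these definitions `square` is neither injective nor surjective as a map from `A` to `B`, the inverse image of `{1, 4}` is `{-2, -1, 1, 2}`, and the inverse image of `{3}` is empty. Passing the subset `{0, 1, 2}` as the domain plays the role of the restriction of the map, which is injective.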
EXAMPLES
(i) $f : R \to R : x \mapsto \sin x$. Neither injective nor surjective. $fR$ = closed interval $[-1,1]$. $f^{-1}(0) = \{n\pi \mid n = 0,\pm 1,\pm 2,\ldots\}$.
(ii) $f : [-\pi/2, \pi/2] \to [-1,1] : x \mapsto \sin x$. Bijective.
(iii) Let $M_n$ = $n \times n$ matrices (real numbers as entries), $n \ge 1$. Define $d : M_n \to R$ by $d(M) = \det(M)$. Then $d$ is surjective, and for $n \ge 2$ it is not injective; $d^{-1}(R - \{0\})$ is the set of non-singular matrices.
(iv) Let $S$ be the unit sphere in $R^3$, and let $\theta$, $\phi$ be the Euler angles (see almost any book on mechanics). Define $f : [...] \to S$ by $f(\theta,\phi)$ = point on $S$ with Euler angles $\theta$, $\phi$. Then $f$ is injective but not surjective. If instead we define $F : [...] \to S$ by the same wording then $F$ is surjective but not injective.

13. A function $f : A \to R$ is bounded if $fA$ has upper and lower bounds. We write $\sup_{a \in A} f(a)$ for $\sup fA$; similarly for $\inf$.

14. An infinite set $A$ is countable if there is a bijection $A \to N$. The set $Q$ of rational numbers is countable, but $R$ is not. See e.g. Willard [150].
References

[1] Abraham, R. Foundations of Mechanics, Benjamin, New York 1967.
[2] Abraham, R. and Robbin, J.W. Transversal Mappings and Flows, Benjamin, New York 1967.
[3] Adams, J.F. Vector fields on spheres, Ann. Math. 75 (1962), 603-632.
[4] Adams, J.F. Lectures on Lie Groups, Benjamin, New York 1969.
[5] Agoston, M. Algebraic Topology: A First Course, Marcel Dekker, New York 1976.
[6] Ahlfors, L.V. and Sario, L. Riemann Surfaces, Princeton University Press 1960.
[7] Andronov, A.A. and Pontrjagin, L.S. Systèmes grossiers, Dokl. Akad. Nauk. 14 (1937), 247-251.
[8] Anosov, D.V. Roughness of geodesic flows on closed Riemannian manifolds of negative curvature, Soviet Math. Dokl. 3 (1962), 1068-1070.
[9] Arnol'd, V.I. Singularities of smooth mappings, Russian Math. Surveys 23 (1968), 1-43.
[10] Arnol'd, V.I. Lectures on bifurcations in versal families, Russian Math. Surveys 27 (1972), 54-124.
[11] Arnol'd, V.I. Ordinary Differential Equations, M.I.T. Press, Cambridge, Mass. 1973.
[12] Arnol'd, V.I. Critical points of smooth functions and their normal forms, Russian Math. Surveys 30 (1975).
[13] Auslander, L. and MacKenzie, R.E. Geometry of Manifolds, Academic Press, New York 1964.
[14] Berry, M.V. Waves and Thom's theorem, Adv. Phys. 25 (1976), 1-26.
[15] Bishop, R.L. and Crittenden, R.J. Geometry of Manifolds, Academic Press, New York 1964.
[16] Blackett, D.W. Elementary Topology, Academic Press, New York 1967.
[17] Borel, A. et al. Seminar on Transformation Groups, Annals of Mathematics Studies 46, Princeton University Press 1960.
[18] Bourbaki, N. Éléments de Mathématique 33: Variétés Différentiables et Analytiques; Fascicule de Résultats, Hermann, Paris 1967.
[19] Bowen, R. Equilibrium States and the Ergodic Theory of Anosov Diffeomorphisms, Lecture Notes in Mathematics 470, Springer-Verlag, Berlin-Heidelberg 1975.
[20] Brickell, F. and Clark, R.S. Differentiable Manifolds, Van Nostrand Reinhold, London 1970.
[21] Bröcker, Th. and Lander, L.C. Differentiable Germs and Catastrophes, L.M.S. Lecture Notes Series 17, Cambridge University Press 1975.
[22] Chern, S.S. and Smale, S. (eds.) Global Analysis: Proceedings of A.M.S. Symposium in Pure Mathematics Vol. XIV, American Mathematical Society, Providence, R.I. 1970.
[23] Chern, S.S. and Smale, S. (eds.) As above, Vol. XV.
[24] Chevalley, C. Theory of Lie Groups, Princeton University Press 1946.
[25] Chillingworth, D.R.J. Smooth manifolds and maps, in Global Analysis and its Applications, Vol. 1, IAEA, Vienna 1974.
[26] Chillingworth, D.R.J. Elementary catastrophe theory, Bull. Inst. Math. Appl. 11 (1975), 155-159.
[27] Chillingworth, D.R.J. The catastrophe of a buckling beam, in [71].
[28] Chillingworth, D.R.J. and Furness, P.M.D. Reversals of the earth's magnetic field, in [71].
[29] Chow, S.N., Hale, J.K. and Mallet-Paret, J. Applications of generic bifurcation, I, Arch. Rat. Mech. Anal. 59 (1975), 159-188.
[30] Coddington, E. and Levinson, N. Theory of Ordinary Differential Equations, McGraw-Hill, New York 1955.
[31] Cullen, C. Matrices and Linear Transformations, Addison-Wesley, Reading, Mass. 1966.
[32] Dieudonné, J. Foundations of Modern Analysis, Academic Press, New York 1960.
[33] Dodson, C.T.J. and Poston, T. Tensor Geometry, Pitman, London 1976.
[34] Dodson, M.M. Darwin's law of natural selection and Thom's theory of catastrophes, Math. Biosciences 28 (1976), 243-274.
[35] Duistermaat, J.J. Oscillatory integrals, Lagrange immersions and unfoldings of singularities, Comm. Pure Appl. Math. 27 (1974), 207-281.
[36] Eells, J. Singularities of Smooth Maps, Nelson, London 1968.
[37] Field, M.J. Equivariant dynamical systems, Bull. A.M.S. 76 (1970), 1314-1318.
[38] Franks, J.M. Ω-stability: diffeomorphisms and flows, in Proceedings of Colloquium on Smooth Dynamical Systems (ed. D.R.J. Chillingworth), Southampton University 1972.
[39] Franks, J. Absolutely structurally stable diffeomorphisms, Proc. A.M.S. 37 (1973), 293-296.
[40] Franks, J. Time dependent stable diffeomorphisms, Inv. Math. 24 (1974), 163-172.
[41] Golubitsky, M. An introduction to catastrophe theory and its applications (to appear).
[42] Golubitsky, M. and Guillemin, V. Stable Mappings and Their Singularities, Graduate Texts in Mathematics 14, Springer-Verlag, New York 1973.
[43] Griffiths, H.B. and Hilton, P.J. A Comprehensive Textbook of Classical Mathematics, Van Nostrand Reinhold, London 1970.
[44] Grobman, D. Homeomorphisms of systems of differential equations, Dokl. Akad. Nauk. 128 (1959), 880-881.
[45] Gromoll, D. and Meyer, W. On differentiable functions with isolated critical points, Topology 8 (1969), 361-369.
[46] Guckenheimer, J. Absolutely Ω-stable diffeomorphisms, Topology 11 (1972), 195-197.
[47] Guckenheimer, J. Catastrophes and partial differential equations, Ann. Inst. Fourier (Grenoble) 23 (1973), 31-59.
[48] Guillemin, V. and Pollack, A. Differential Topology, Prentice-Hall, New Jersey 1974.
[49] Halmos, P. Finite-Dimensional Vector Spaces, Van Nostrand, Princeton, N.J. 1958.
[50] Hartman, P. On the local linearization of differential equations, Proc. A.M.S. 14 (1963), 568-573.
[51] Hartman, P. Ordinary Differential Equations, Wiley, New York 1964.
[52] Hirsch, M.W. Differential Topology, Graduate Texts in Mathematics, Springer-Verlag, New York 1976.
[53] Hirsch, M.W. and Pugh, C.C. Stable manifolds and hyperbolic sets, in [22].
[54] Hirsch, M.W., Palis, J., Pugh, C.C. and Shub, M. Neighbourhoods of hyperbolic sets, Inv. Math. 9 (1970), 121-134.
[55] Hirsch, M.W. and Smale, S. Differential Equations, Dynamical Systems, and Linear Algebra, Academic Press, New York 1974.
[56] Hochschild, G. The Structure of Lie Groups, Holden-Day, San Francisco 1955.
[57] Hocking, J.G. and Young, G.S. Topology, Addison-Wesley, Reading, Mass. 1961.
[58] Hoffman, K. Analysis in Euclidean Space, Prentice-Hall, New Jersey 1975.
[59] Hurewicz, W. Lectures on Ordinary Differential Equations, M.I.T. Press, Cambridge, Mass. 1958.
[60] Iooss, G. Bifurcation of a periodic solution of the Navier-Stokes equations into an invariant torus, Arch. Rat. Mech. Anal. 58 (1975), 57-76.
[61] Irwin, M.C. On the stable manifold theorem, Bull. L.M.S. 2 (1970), 196-198.
[62] Irwin, M.C. and Robertson, S.A.R. (To appear).
[63] Keller, J.B. and Antman, S. Bifurcation Theory and Nonlinear Eigenvalue Problems, Benjamin, New York 1969.
[64] Kervaire, M. A manifold which does not admit any differentiable structure, Comm. Math. Helv. 34 (1960), 257-270.
[65] Kupka, I. Contribution à la théorie des champs génériques, Contrib. Diff. Eqs. 2 (1963), 457-484; 3 (1964), 411-420.
[66] Lang, S. Introduction to Differentiable Manifolds, Wiley (Interscience), New York 1962.
[67] Lang, S. Analysis I, Addison-Wesley, Reading, Mass. 1968.
[68] Lang, S. Differential Manifolds, Addison-Wesley, Reading, Mass. 1972. (Revised and expanded version of [66].)
[69] Lefschetz, S. Differential Equations: Geometric Theory, Wiley (Interscience), New York 1957.
[70] Loomis, L.H. and Sternberg, S. Advanced Calculus, Addison-Wesley, Reading, Mass. 1968.
[71] Manning, A.K. (ed.) Dynamical Systems - Warwick 1974, Lecture Notes in Mathematics 468, Springer-Verlag, Berlin-Heidelberg 1975.
[72] Markov, A.A. The unsolvability of the problem of homeomorphy, Proc. Internat. Congress of Mathematicians 1958, Cambridge University Press 1960, pp. 300-306 (in Russian).
[73] Markus, L. Lectures in Differentiable Dynamics, C.B.M.S. Regional Conference Series in Mathematics 3, American Mathematical Society, Providence, R.I. 1971.
[74] Markus, L. and Meyer, K.R. Solenoids in generic Hamiltonian dynamics (to appear).
[75] Marsden, J., Ebin, D., and Fischer, A. Diffeomorphism groups, hydrodynamics and relativity, Proceedings of the 13th Biennial Seminar of the Canadian Mathematical Congress (ed. J.R. Vanstone), C.M.C. 1972.
[76] Marsden, J.E. and McCracken, M. The Hopf Bifurcation and its Applications, Applied Math. Series 19, Springer-Verlag, Berlin-Heidelberg-New York 1976.
[77] Massey, W.S. Algebraic Topology: An Introduction, Harcourt, Brace and World, New York 1967.
[78] Mather, J. Stability of $C^\infty$ mappings: I. Ann. Math. 87 (1968), 89-104; II. Ann. Math. 89 (1969), 254-291; III. Publ. Math. I.H.E.S. No. 35 (1968), 127-156; IV. Publ. Math. I.H.E.S. No. 37 (1969), 223-248; V. Adv. in Math. 4 (1970), 301-336; VI. in Proceedings of Liverpool Singularities Symposium I (ed. C.T.C. Wall), Lecture Notes in Mathematics 192, Springer-Verlag, Berlin-Heidelberg 1971.
[79] Maunder, C.R.F. Algebraic Topology, Van Nostrand, New York 1970.
[80] Mendelson, B. Introduction to Topology, Allyn & Bacon, Boston, Mass. 1962.
[81] Milnor, J.W. On manifolds homeomorphic to the 7-sphere, Ann. Math. 64 (1956), 399-405.
[82] Milnor, J.W. Morse Theory, Annals of Mathematics Studies 51, Princeton University Press 1963.
[83] Milnor, J.W. Topology from the Differentiable Viewpoint, University Press of Virginia 1965.
[84] Mirsky, L. An Introduction to Linear Algebra, Oxford University Press 1955.
[85] Moser, J. Stable and Random Motions in Dynamical Systems: With Special Emphasis on Celestial Mechanics, Annals of Mathematics Studies 77, Princeton University Press 1973.
[86] Munkres, J.R. Elementary Differential Topology, Annals of Mathematics Studies 54, Princeton University Press 1961 (revised 1966).
[87] Narasimhan, R. Analysis on Real and Complex Manifolds, Masson, Paris 1968.
[88] Newhouse, S. On simple arcs between structurally stable flows, in [71].
[89] Newhouse, S. and Palis, J. Bifurcations of Morse-Smale dynamical systems, in [97].
[90] Nitecki, Z. Differentiable Dynamics, M.I.T. Press, Cambridge, Mass. 1971.
[91] Nomizu, K. Fundamentals of Linear Algebra, McGraw-Hill, New York 1966.
[92] Palais, R.S. Morse theory on Hilbert manifolds, Topology 2 (1963), 299-340.
[93] Palais, R.S. Critical point theory and the minimax principle, in [23].
[94] Patterson, E.M. Topology, Oliver and Boyd, Edinburgh 1956.
[95] Peixoto, M.M. Structural stability on two-dimensional manifolds, Topology 1 (1962), 101-120.
[96] Peixoto, M.M. On an approximation theorem of Kupka and Smale, J. Diff. Eqns. 3 (1967), 214-227.
[97] Peixoto, M.M. (ed.) Dynamical Systems: Proceedings of Salvador Symposium 1971, Academic Press, New York 1973.
[98] Peixoto, M.M. On the classification of flows on 2-manifolds, in [97].
[99] Pitts, C.G.C. Introduction to Metric Spaces, Oliver and Boyd, Edinburgh 1972.
[100] Poénaru, V. Singularités $C^\infty$ en présence de symétrie, Lecture Notes in Mathematics 510, Springer-Verlag, Berlin-Heidelberg 1976.
[101] Poston, T. and Stewart, I.N. Taylor Expansions and Catastrophes, Research Notes in Mathematics 7, Pitman, London 1976.
[102] Poston, T., Stewart, I.N. and Woodcock, A.E.R. The Geometry of the Higher Catastrophes, Research Notes in Mathematics, Pitman, London (to appear).
[103] Pugh, C.C. The closing lemma, Amer. J. Math. 89 (1967), 956-1009.
[104] Robbin, J.W. On the existence theorem for differential equations, Proc. A.M.S. 19 (1968), 1005-1006.
[105] Robbin, J.W. A structural stability theorem, Ann. Math. 94 (1971), 447-493.
[106] Robinson, R.C. Generic properties of conservative systems, I, II, Amer. J. Math. 92 (1970), 562-603 and 897-906.
[107] Robinson, R.C. Structural stability of $C^1$ flows, in [71].
[108] Robinson, R.C. Structural stability of $C^1$ diffeomorphisms, to appear in J. Diff. Eqns.
[109] Rudin, W. Principles of Mathematical Analysis, 2nd ed., McGraw-Hill, New York 1964.
[110] Ruelle, D. and Takens, F. On the nature of turbulence, Comm. Math. Phys. 20 (1971), 167-192.
[111] Sard, A. The measure of the critical values of differentiable maps, Bull. A.M.S. 48 (1942), 883-890.
[112] Sattinger, D.H. (ed.) Topics in Stability and Bifurcation Theory, Lecture Notes in Mathematics 309, Springer-Verlag, Berlin-Heidelberg 1973.
[113] Sewell, M.J. Some mechanical examples of catastrophe theory, Bull. Inst. Math. Appl. 12 (1976), 163-172.
[114] Shields, P.C. Linear Algebra, Addison-Wesley, Reading, Mass. 1964.
[115] Shub, M. Stability and genericity for diffeomorphisms, in [97].
[116] Shub, M. Structurally stable systems are dense, Bull. A.M.S. 78 (1972), 817-818.
[117] Siersma, D. Classification and deformation of singularities; thesis, Amsterdam 1974.
[118] Simmons, G.F. Introduction to Topology and Modern Analysis, McGraw-Hill, New York 1963.
[119] Smale, S. Morse inequalities for a dynamical system, Bull. A.M.S. 66 (1960), 43-49.
[120] Smale, S. On gradient dynamical systems, Ann. Math. 74 (1961), 199-206.
[121] Smale, S. Generalized Poincaré's conjecture in dimensions greater than four, Ann. Math. 74 (1961), 391-406.
[122] Smale, S. Stable manifolds for differential equations and diffeomorphisms, Ann. Scuola Norm. Sup. Pisa 18 (1963), 97-116.
[123] Smale, S. An infinite dimensional version of Sard's theorem, Amer. J. Math. 87 (1965), 861-866.
[124] Smale, S. Structurally stable systems are not dense, Amer. J. Math. 88 (1966), 491-496.
[125] Smale, S. Differentiable dynamical systems, Bull. A.M.S. 73 (1967), 747-817.
[126] Sotomayor, J. Generic one-parameter families of vector fields, Publ. Math. I.H.E.S. No. 43 (1974), 5-46.
[127] Sotomayor, J. Generic bifurcations of dynamical systems, in [97].
[128] Spivak, M. Calculus on Manifolds, Benjamin, New York 1965.
[129] Spivak, M. A Comprehensive Introduction to Differential Geometry, Publish or Perish, Boston, Mass. 1970.
[130] Stamm, E. Introduction to differential topology, Proceedings of the 13th Biennial Seminar of the Canadian Mathematical Congress (ed. J.R. Vanstone), C.M.C. 1972.
[131] Sternberg, S. On the structure of local homeomorphisms of euclidean n-space II, Amer. J. Math. 80 (1958), 623-631.
[132] Stewart, I.N. The seven elementary catastrophes, New Scientist 68 (1975), 447-454.
[133] Sussmann, H.J. Catastrophe theory, Synthèse 31 (1975), 229-270.
[134] Sutherland, W.A. Introduction to Metric and Topological Spaces, Oxford University Press 1975.
[135] Takens, F. Hamiltonian systems: Generic properties of closed orbits and local perturbations, Math. Ann. 188 (1970), 304-312.
[136] Takens, F. Singularities of vector fields, Publ. Math. I.H.E.S. No. 43 (1973), 47-100.
[137] Takens, F. Tolerance stability, in [71].
[138] Thom, R. Stabilité Structurelle et Morphogénèse, Benjamin Advanced Book Program, Reading, Mass. 1972. English translation by D.H. Fowler, Benjamin A.B.P. 1975.
[139] Thom, R. A global dynamical scheme for vertebrate embryology, (A.A.A.S., 1971: Some Math. Questions in Biology VI), A.M.S. Lectures on Mathematics in the Life Sciences 5 (1973), 3-45.
[140] Thom, R. and Zeeman, E.C. Catastrophe theory: its present state and future perspectives, in [71].
[141] Thompson, J.M.T. and Hunt, G.W. A General Theory of Elastic Stability, Wiley, London 1973.
[142] Thompson, J.M.T. and Hunt, G.W. Towards a unified bifurcation theory, Z.A.M.P. 26 (1975), 581-604.
[143] Thompson, M. The geometry of confidence: an analysis of the Enga te and Hagen moka, a complex system of pig-giving in the New Guinea Highlands, to appear in Rubbish Theory, Paladin.
[144] Tromba, A.J. The Morse lemma on arbitrary Banach spaces, Bull. A.M.S. 79 (1973), 85-86.
[145] Wallace, A.H. Differential Topology: First Steps, Benjamin, New York 1968.
[146] Wassermann, G. Stability of Unfoldings, Lecture Notes in Mathematics 393, Springer-Verlag, Berlin-Heidelberg 1974.
[147] Whitney, H. Differentiable manifolds, Ann. Math. 37 (1936), 645-680.
[148] Whitney, H. The self-intersections of a smooth n-manifold in 2n-space, Ann. Math. 45 (1944), 220-246.
[149] Whitney, H. On singularities of mappings of Euclidean spaces I, Mappings of the plane into the plane, Ann. Math. 62 (1955), 374-410.
[150] Willard, S. General Topology, Addison-Wesley, Reading, Mass. 1970.
[151] Williams, R.F. Expanding attractors, Publ. Math. I.H.E.S. No. 43 (1974), 169-204.
[l52]
Zeeman, E.C.
n° C
density of stable diffeomorphisms and
flows, in Proceedings of Colloquium on Smooth Dynamical Systems (ed. D.R.J. Chillingworth), University 1972. [l53]]
Zeeman, E.C.
Southampton
On the unstable behaviour of stock exchanges,
J. Math. Economics 1 (1974), 39-49. [l54]
Zeeman,
E.C.
Primary and secondary waves in developmental biology,
(A.A.A.S.,
1974:
Some Mathematical
Lectures on Mathematics in the Life Sciences 7 (1974), Questions
in Biology VIII), A.M.S.
69-161. |jL55]
Zeeman, E.C.
Levels of structure in catastrophe theory, in Proceedings of the International Congress of Mathematicians_, Vancouver 1974, Canadian Mathematical Congress 1975,
u,
533-546.
[jL56]
Zeeman, E.C.
Catastrophe theory, Scientific American 234 (1976), 65-83.
[1.57]
Zeeman, E.C.
Euler buckling, in Catastrophe Theory Seattle 1975, Lecture Notes in Mathematics 525, Springer-Verlag, 1976.
[jL58]
Zeeman,
E.C.
Berlin*Heidelberg
The umbilic bracelet and the double cusp catastrophe, in Catastrophe Theory - Seattle 1975, as above.
[I59]
Zeeman, E.C. and Trotman, D.J.A.
The classification of elementary catastrophes of codimension £ 5,
in Catastrophe Theory -
Seattle 1975, as above.
Addenda [160]
Field, M.J.
Differential Calculus and its Applications, Van Nostrand Reinhold, London 1976.
[l6f]
Meyer,
K.R.
Generic bifurcations in Hamiltonian systems, in
[7l] .
Index
α-limit set
199
accumulation point
20,
action of Lie group
145,
analytic
85-6
Anosov system
atlas
241
attractor
Axiom A
242-3 240-244,
Baire
222,
Banach space
52,
basic set
basis
241
Betti numbers
188 191-6,
118-124 249
225 93,
184,
§4.7
bijection
94,
271
bilinear
64, 27,
76-9, 30-1,
bundle
c1 C
2
165-175,
2
178,
62 ,
C
r
,
00 C
cu C
103,247 33, 46, 50,
137-8 154, 162,
boundary
226
43 160
bifurcation
bound, bounded
251
x
72 85
topology
Cantor set
223-6
cartesian product
233, 242 65, 80, 134,
catastrophe
256-7
catastrophe set
261,
Chain Rule
chart
66, 148, 163, 166,
268
264-5 174
118-131
circle
32,
closed set
19-20
closure
20,
Closing Lemma
227,
230
codimension
136,
177,
codomain
compact
270
compact manifold
143,
complement
269
complete
52,
94,
complex
61,
132
176,
118, 198,
§1.6,
138,
144,
223,
225-6
253-5
187, 152,
145
198, 159,
222
225 223,
229
272
composition
65,
80,
conjugate
26,
209,
connected
33,
271 234
186, 229, 102, 110
constant map
constraint
158,
245,
continuous
4-6,
9,
continuous linear map
convergence
48,
72,
62
co-ordinate chart
21-3, 118
cotangent bundle
countable
53,
122,
covariant tensor
78,
248
covector
245
covector field
246,
critical point
101-115,
critical value
157
cross-section
cusp
265
17-18,
50,
51,
245-6,
238
84,
224-5
250 222,
272
155,
157,
250
183,
194,
cylinder
263, 134,
266 138
damping
200-1,
degenerate
derivative
103, 107, 256, 261 21, 207, 222, 231, 239-241, 244, 254, 36, 54, 58, 67 , 72
determinacy
88-9,
determinant
73, 69,
dense
diffeomorphism
diffeomorphism, local
difference equation
197,
214
251
109-115,
139 125,
132,
258 189,
69, 88 192
differentiable
36,
53,
differentiable manifold
ii,
124,
126,
differential equation
i,
§3.5,
191,
199-2