Chaos - An Introduction for Applied Mathematicians [1 ed.] 978-3-030-32537-4

English Pages XIV, 303 [311] Year 2020

Table of contents:
Preface
Contents
1 Introduction
1.1 Limit Cycles
1.2 Bifurcation Theory
1.2.1 Hopf Bifurcation
1.3 Strange Attractors
1.4 Turbulence
1.5 Stochasticity
1.6 The Lorenz Equations
1.7 Notes and References
1.7.1 Poincaré and the Swedish Prize
1.7.2 Weather Predictability
1.8 Exercises
2 One-Dimensional Maps
2.1 Chaos in Maps
2.2 Bifurcations in Maps
2.3 Period-Doubling and the Feigenbaum Conjectures
2.3.1 Bifurcation Trees
2.4 Symbolic Dynamics: The Kneading Sequence
2.5 Notes and References
2.5.1 The Feigenbaum Conjectures
2.5.2 Kneading Theory
2.5.3 Period-Doubling in Experiments
2.6 Exercises
3 Hopf Bifurcations
3.1 Hopf Bifurcation Theorem
3.1.1 Implicit Function Theorem
3.1.2 Proof of Hopf Bifurcation Theorem
3.1.3 Stability
3.2 Normal Forms
3.3 Centre Manifold Theorem
3.3.1 Formal Power Series Expansion
3.3.2 Application to Hopf Bifurcation
3.4 Secondary Hopf Bifurcations
3.4.1 Stability of Periodic Orbits
3.4.2 Floquet Theory
3.4.3 Normal Forms
3.4.4 Weak and Strong Resonance
3.4.5 Circle Maps, Arnold Tongues, Frequency Locking
3.4.6 Structural Stability
3.5 Tertiary Hopf Bifurcation
3.6 Notes and References
3.6.1 Many Theorems
3.6.2 Multiple Scale Methods
3.6.3 Circle Maps
3.6.4 Tertiary Hopf Bifurcation
3.7 Exercises
4 Homoclinic Bifurcations
4.1 Lorenz Equations
4.1.1 Homoclinic Bifurcations
4.1.2 One-Dimensional Map
4.2 Symbolic Dynamics
4.2.1 The Smale Horseshoe
4.2.2 Strange Attractors
4.3 Shilnikov Bifurcations
4.3.1 Approximation and Proof
4.4 Matched Asymptotic Expansions for n-dimensional Flows
4.4.1 Strange Attractors
4.4.2 Partial Differential Equations
4.4.3 Transition to Turbulence
4.5 Notes and References
4.5.1 The Lorenz Equations
4.5.2 Shilnikov Bifurcations
4.5.3 Infinite Dimensions
4.5.4 Turbulence and Chaos
4.6 Exercises
5 Hamiltonian Systems
5.1 Lagrangian Mechanics
5.1.1 Hamilton's Principle
5.1.2 Hamilton's Equations
5.2 Hamiltonian Mechanics
5.2.1 Integrability
5.2.2 Action-Angle Variables
5.2.3 Integral Invariants
5.2.4 Canonical Transformations
5.2.5 The Hamilton–Jacobi Equation
5.2.6 Quasi-Periodic Motion
5.3 Perturbation Theory
5.3.1 Other Applications
5.3.2 Resonance and Small Divisors
5.3.3 The KAM Theorem
5.3.4 Superconvergence
5.4 Resonance and Stochasticity
5.4.1 Area-Preserving Maps
5.4.2 Poincaré–Birkhoff Fixed Point Theorem
5.4.3 Removal of Resonances
5.4.4 Secondary Resonance
5.4.5 Melnikov's Method
5.4.6 Heteroclinic Tangles
5.4.7 Arnold Diffusion
5.5 Examples
5.5.1 A Restricted Three-Body Problem
5.5.2 Lagrange Points
5.5.3 The Hénon–Heiles System
5.5.4 Hénon's Area-Preserving Map
5.5.5 Standard Map
5.6 Notes and References
5.6.1 The KAM Theorem
5.6.2 Restricted Three-Body Problem
5.6.3 Hénon–Heiles Potential
5.6.4 Hénon Area-Preserving Map
5.6.5 The Last KAM Torus
5.7 Exercises
6 Diverse Applications
6.1 Chaotic Data
6.1.1 Weather Forecasting
6.1.2 The Stock Market
6.1.3 Landslide Statistics
6.1.4 Chaos and Noise
6.2 Statistical Methods
6.2.1 Stochastic Processes
6.2.2 Autocorrelation and Power Spectral Density
6.2.3 Autoregressive Models
6.3 Phase Space Embedding
6.3.1 Singular Systems Analysis
6.3.2 Time Lag Selection
6.3.3 Nonlinear Filtering
6.3.4 Prediction
6.4 Dimensions and Lyapunov Exponents
6.4.1 The Kaplan–Yorke Conjecture
6.5 Fractals
6.5.1 The Cantor Middle-Thirds Set
6.5.2 Iterated Function Systems
6.5.3 Julia Sets
6.5.4 The Mandelbrot Set
6.6 Whither Turbulence?
6.6.1 Linear Stability
6.6.2 Nonlinear Stability
6.6.3 Experimental Observations in Shear Flows
6.6.4 A Homoclinic Connection?
6.6.5 Practical Turbulence
6.7 Notes and References
6.7.1 Time Series
6.7.2 Phase Space Embedding
6.7.3 Dimensions and Fractals
6.7.4 Turbulence
6.8 Exercises
Appendix: Numerical Notes for Figures
References
Index


Andrew Fowler Mark McGuinness

Chaos An Introduction for Applied Mathematicians




Andrew Fowler MACSI University of Limerick Limerick, Ireland

Mark McGuinness School of Mathematics and Statistics Victoria University of Wellington Wellington, New Zealand

ISBN 978-3-030-32537-4 ISBN 978-3-030-32538-1


Mathematics Subject Classification (2010): 34A34, 34C23, 34C28, 34C37, 37B10, 37C29, 37D45, 37E05, 37J40

© Springer Nature Switzerland AG 2019

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Front cover image: a drawing of water pouring into a pool by Leonardo da Vinci, ca. 1507. Windsor Collection 12660v.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland.

We dedicate this book to Stephen O’Brien, who has variously brought both chaos and order, in differing measures, to each of us. Stephen is a leader in applied mathematics in Ireland at the present time, and is a figurehead in the effort to bring the subject there to a level where it seriously engages with industry, science and nature. Under his leadership, the Mathematics Applications Consortium for Science and Industry (MACSI) at the University of Limerick has become a flagship for applied mathematics in Ireland.


The modern theory of chaos arguably stems from the publication of Ed Lorenz's remarkable paper in 1963 on deterministic nonperiodic flow. It took a while for it to be noticed outside the meteorological literature, but by the 1970s, the subject was gaining momentum, courses were created and in the 1980s books started to appear. At first they were few in number, but eventually they too became a roaring current. The subject is now well established, and as well as plentiful literature, there are a number of dedicated journals. So the first thing an author needs to do is to provide an excuse for why another book should be added to the list. There are two reasons, both of which provided a motivation for the original drafting of this material, and both of which remain valid in my opinion. The first is that most of the books on dynamical systems take a particular point of view, whereas actually the subject is a composite of a number of different historical strands. For example, Poincaré's (1892) early investigations were motivated by the problem of predictability in celestial mechanics, particularly with regard to the stability of the solar system, and that work led in one way to the KAM theorem and much of the material concerning stochasticity in Hamiltonian systems; but equally it led (via the Poincaré–Lindstedt method) to modern methods of perturbation theory and bifurcation theory. This homogeneity of the subject has largely been lost in books on chaos, which mostly focus on one subject area or another. The intention here is to provide a view which accommodates all the different directions the subject has taken, whether it be Hamiltonian systems, maps or bifurcation theory. The second reason is found in the book's title; I am an applied mathematician, and this means I chart a middle course between that of the physicist and that of the purer mathematician. But most of the books on chaos are by people who reside in one or other of these camps, and the flavour of the applied mathematician is rather different.




Applied mathematicians are less interested in theorems and proofs,¹ because most of what they do is too hard to prove anything about. That is not always true, but it mostly is. They are interested in practicalities, and often this means approximations. An applied mathematician is less interested in a proof that, for example, the Lorenz equations have chaotic solutions: he² is willing to take it on trust since there is no reason not to think it so. On the other hand, he is a mathematician, and will therefore be worried by the minutiae of (for example) dealing with homoclinic bifurcations, where it is necessary for peace of mind to pursue the details and mechanism of proof: these would not be items of much concern for physicists, who are yet more interested in observations, to the exclusion of some of the technical detail. For both these reasons, I, or rather we, feel that the present text will fill a small gap in the literature. We provide in chapter one a semi-historical overview of the development and context of the subject, and then in the following chapters we tell the story of the four horsemen: maps, Hopf and homoclinic bifurcations and Hamiltonian systems. The final chapter represents an exploration of other topics of more recent and practical or simply intellectual interest. The treatment is succinct, but in places we push it to a more challenging level. Such excursions provide another deviation from other texts. In order to preserve what readability there is, we have not cluttered the text with references, and we certainly do not aim to be overly extensive in our bibliography. The 'Notes and references' sections at the end of each chapter provide both citations and also further commentary on, or elaboration of, some of the earlier text. The material may be suitable for final-year undergraduate or graduate courses; it assumes a working knowledge of applied mathematics curricular subjects such as differential equations, mechanics and nonlinear dynamics. In places, it may assume a familiarity with other likely staples of the curriculum: fluid mechanics and mathematical biology, for example. But it also dabbles with material which may be less familiar to the intended audience, though still essential to the text: point set topology, and some elements of functional analysis.

In preparing this book for publication, I have taken on as co-author my long-standing colleague and good friend Mark McGuinness, whose personal entwinement with my career dates back 40 years. Mark and I met as postdoctoral research fellows in Dublin in 1979, and we proceeded to write several papers on chaos in the Lorenz equations, before going our separate ways: he to Caltech and eventually Wellington, I to M. I. T. and eventually Oxford and subsequently Limerick. And, eventually, we met up again, and have since been frequent visitors to each other's institutions. The plan to finish this book coincided with a visit by me to Victoria University, Wellington, where Mark is Professor in the School of Mathematics and Statistics, and I have a position of Adjunct Professor, and I am grateful for this award, without which this book might never have seen the light of day!

¹ But, I will confess, you will find elements of both here. It is simply necessary to the narrative.
² Or she: but for prosaic convenience I will deem the pronoun to connote persons of either gender, when it is not specific.

Kilkee, Co. Clare, Ireland
February 2019

Andrew Fowler

I am delighted to be involved in writing a book on chaos with my colleague, co-author and friend Andrew Fowler. It is apposite that this book began during a visit to Victoria University of Wellington from Andrew in his capacity as Adjunct Professor, and was completed during my first visit as Adjunct Professor to the University of Limerick. My first forays into chaotic dynamics were in 1979 with Andrew and with John Gibbon, in Dublin, Ireland. We were all newly in town and fresh out of our Ph.D.s, and free and willing to talk about mathematics in Irish pubs and over meals at each other's homes. It was classic stuff, scribbling on napkins, finding connections between soliton equations and the Lorenz equations, sharpening cusps and using asymptotic methods to make the connection between differential equations and one-dimensional maps. It was not until much later that I was to realise that what happened collaboratively and instinctively in Ireland was very special indeed. This book is in many ways a celebration of those good times, and of the very productive years that followed.

Wellington, New Zealand
February 2019

Mark McGuinness



Chapter 1
Introduction


1.1 Limit Cycles

The behaviour of autonomous (that is, where time is not represented explicitly) ordinary differential equations in ℝ² is well understood. The qualitative behaviour of the equation

  ẋ = f(x),  x ∈ ℝ²,   (1.1)

is organised by its fixed points, where ẋ = f(x) = 0. Elementary texts show how these may be classified into saddles, spirals and nodes depending on the eigenvalues of the Jacobian Df(x) at the fixed point. In particular, if these eigenvalues are complex conjugates, there is a spiral, or focus; if they are real and of opposite sign, there is a saddle; and if they are real and of the same sign, there is a node: see Fig. 1.1. Nodes and spirals may be stable or unstable depending on whether the eigenvalues λ have Re λ < 0 or Re λ > 0, respectively; saddles are always unstable. Other degenerate cases (centre, degenerate node) can occur if Re λ = 0, or if two eigenvalues coalesce. Knowledge of the behaviour of trajectories near fixed points informs the method of phase plane analysis. An example of its use is to find travelling solitary wave



Fig. 1.1 The three basic types of critical point
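This classification is mechanical enough to automate. The following sketch (our illustration, not from the book) uses the fact that for a 2×2 Jacobian the eigenvalues are λ = (T ± √(T² − 4D))/2, where T is the trace and D the determinant, so T and D alone decide the type:

```python
def classify_fixed_point(J):
    """Classify a fixed point of a planar system from its 2x2 Jacobian J.

    Eigenvalues are lambda = (T ± sqrt(T^2 - 4D))/2 with T = trace, D = det,
    so the sign of the discriminant, of D, and of T decide the type.
    """
    (a, b), (c, d) = J
    T, D = a + d, a * d - b * c
    disc = T * T - 4.0 * D
    if disc < 0:                 # complex conjugate pair: spiral (centre if Re lambda = 0)
        if T == 0:
            return "centre"
        return ("stable " if T < 0 else "unstable ") + "spiral"
    if D < 0:                    # real eigenvalues of opposite sign
        return "saddle"
    return ("stable " if T < 0 else "unstable ") + "node"

print(classify_fixed_point([[0, 1], [1, 0]]))      # eigenvalues ±1: saddle
print(classify_fixed_point([[-1, -2], [2, -1]]))   # eigenvalues -1 ± 2i: stable spiral
print(classify_fixed_point([[-2, 0], [0, -1]]))    # eigenvalues -2, -1: stable node
```

The degenerate cases mentioned above (coalescing eigenvalues, disc = 0) fall through to the node branch here; a fuller version would treat them separately.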




Fig. 1.2 Phase plane for Eq. (1.4)

solutions of the Korteweg–de Vries equation

  uₜ + u uₓ + uₓₓₓ = 0   (1.2)


in the form u = f(x − ct) = f(ξ), so that

  −c f′ + f f′ + f‴ = 0,   (1.3)

with first integral

  −c f + ½ f² + f″ = 0   (1.4)

(so that f(±∞) = 0). This represents a nonlinear oscillator with potential V(f) = −½ c f² + ⅙ f³.¹ If we take c > 0 (without loss of generality), then the phase plane is as in Fig. 1.2. There are two fixed points, f = 0 and f = 2c, which are respectively a saddle and a centre. (The latter persists as a centre in the presence of nonlinear terms by virtue of the existence of a first integral, which in turn is associated with the absence of a damping term.) Solitary waves exist, by virtue of the homoclinic orbit which connects 0 to itself. We see in addition that there exists a family of periodic orbits lying within the homoclinic loop. One can think of the homoclinic orbit as a limiting member of this family with infinite period. More generally, periodic behaviour occurs in two-dimensional systems at isolated limit cycles. A limit cycle is a closed curve in a phase space (which, therefore, represents a periodic time series) to which nearby trajectories tend either as t → +∞ (stable limit cycle) or as t → −∞ (unstable limit cycle). A simple example (see also exercise 1.7) is offered by the system

  ẋ = x − y − x(x² + y²),
  ẏ = x + y − y(x² + y²);   (1.5)

¹ A nonlinear oscillator is an equation of the form ẍ + V′(x) = 0; the quantity V(x) is called the potential (energy) on account of its rôle in the energy-like conservation law E = ½ ẋ² + V(x).




Fig. 1.3 Phase plane for Eq. (1.5)



by transforming to polar coordinates (r, θ), it is straightforward to find the phase plane shown in Fig. 1.3. There is a stable limit cycle at r = 1, and an unstable focus at r = 0. One is often interested in the behaviour of a (physical) system as a parameter changes (e.g. the Earth's climate as the atmospheric CO₂ level changes). In this case, one might study differential equations of the form

  ẋ = f(x, μ),   (1.6)

where μ is a parameter. For some ranges of μ, the system may exhibit steady behaviour, in others it may behave periodically. One then wishes to chart regions in parameter space where such qualitative changes occur. These delineate particular values of μ at which such bifurcations take place, and μ is then called a bifurcation parameter.

1.2 Bifurcation Theory

The Watt governor (see Fig. 1.4) is a device invented by James Watt around 1788 to control the flow of steam into the cylinders of an engine. It is a simple but efficient device, which acts on the basis of bifurcation theory. As shown in the figure, the device consists in essence of a pair of rigid pendulums attached to a central shaft, which rotates about its axis at a rate determined by the pressure of steam in the engine. The object is to have a device such that if this rotation rate becomes too large, then the arms of the device rise, thus opening a valve which alleviates the pressure. If we denote the angle of the arms of the governor to the downward vertical as θ, and the angular velocity of the shaft as Ω, then the equation governing the dynamics is, ignoring friction (see question 5.1),

  θ̈ − Ω² sin θ cos θ + ω_c² sin θ = 0,   (1.7)




Fig. 1.4 A cartoon of the Watt governor. The image is figure 15 in the e-book The Story of Great Inventions by Elmer Ellsworth Burns, and can be found at https://www.

where c =

g , l


g is the acceleration due to gravity and l is the length of the pendulum  2 arms. c if  > c . The steady states of the system are θ = 0, π , and also ± cos−1 2 The function of the governor can be seen in the fact that we can have θ as an increasing function of  if  > c , thus opening the valve, and its operation relies on the existence of a pitchfork bifurcation at  = c . More precisely, if we linearise (1.7) near the origin, we see that   θ¨ ≈ 2 − 2c θ,


and thus the vertical state loses stability for  > c . Expanding further in θ and also supposing that  − c is small, we have   θ¨ ≈ 2 − 2c θ − 21 2c θ 3 . . . ,


which shows explicitly that there is an exchange of stability between θ = 0 and the non-zero fixed points. Bifurcation theory is discussed in more detail in Chap. 3.
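The pitchfork at Ω = Ω_c is easy to verify numerically. The following sketch tabulates the stable steady state of the governor as Ω increases; the parameter values g = 9.81, l = 1 and the function name `steady_angle` are merely illustrative choices, not from the text:

```python
import math

# Watt governor steady states: the non-trivial branch satisfies
# cos(theta) = (Omega_c / Omega)^2, and exists only for Omega > Omega_c.
g, l = 9.81, 1.0                    # illustrative values
Omega_c = math.sqrt(g / l)

def steady_angle(Omega):
    """Stable steady state of the governor arms for shaft rate Omega."""
    if Omega <= Omega_c:
        return 0.0                                  # only theta = 0 below the bifurcation
    return math.acos((Omega_c / Omega) ** 2)        # pitchforked branch above it

for Omega in (2.0, 3.0, Omega_c, 3.5, 4.0):
    print(f"Omega = {Omega:.3f}  theta = {steady_angle(Omega):.4f}")
```

The printed angle is zero up to Ω_c ≈ 3.13 and then increases with Ω, which is precisely the valve-opening behaviour described above.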

1.2.1 Hopf Bifurcation

The transition from steady to periodic behaviour is often associated with a Hopf bifurcation, as instanced by the following generalised version of (1.5):

x˙ = μx − y − x(x² + y²),
y˙ = x + μy − y(x² + y²).




Fig. 1.5 Hopf bifurcation diagram. This is for a supercritical Hopf bifurcation: the full lines represent stable orbits, and the dashed line represents the unstable focus at the origin

It is simple to find that the origin is a stable focus for μ < 0, an unstable focus for μ > 0, and that for μ > 0 a stable periodic solution exists, whose amplitude A = |z|max = |x + iy|max = O(√μ) (and in fact equals √μ). The bifurcation diagram is shown in Fig. 1.5. The salient features are that, if the linearised system has solutions ∝ e^{σt}, then a pair of complex conjugate eigenvalues σ = μ ± i of the fixed point cross the imaginary axis at μ = 0, at a non-zero 'rate' (d[Re σ]/dμ ≠ 0), and as a result, a limit cycle exists in μ > 0. This is known as a Hopf bifurcation, and will be considered further in Chap. 3. Notice that the limit cycle exists (for small μ) in either μ > 0 or μ < 0, so that its stability is opposite to that of the fixed point with which it coexists. Furthermore, its amplitude is O(√μ). All these features are generally true.

Fixed points and periodic solutions are 'all' that a two-dimensional system can exhibit. This is because trajectories cannot intersect in the phase space, due to the following existence theorem:

Theorem (Picard) If x˙ = f(x) and x(0) = x0, and f is Lipschitz continuous (i.e. |f(x) − f(y)| < K|x − y| for some K), then x exists locally (in t) and is unique. ∎

Here, in fact, we can take x ∈ Rⁿ, provided a suitable norm is chosen. In terms of differentiability, a sufficient condition on f is that f ∈ C¹ (i.e. f is continuously differentiable); the usual example for necessity is that of x˙ = x^{1/2}, x(0) = 0, which has infinitely many solutions of the form x = ¼([t − t0]+)², where t0 ≥ 0.²
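The √μ amplitude scaling of the limit cycle can be checked by direct integration. This is a minimal numerical sketch, using forward Euler with an illustrative step size; the function `simulate` is our own name, not from any library:

```python
import math

def simulate(mu, x=0.1, y=0.0, dt=1e-3, steps=100_000):
    """Integrate x' = mu*x - y - x(x^2+y^2), y' = x + mu*y - y(x^2+y^2)
    by forward Euler, and return the final radius |z| = sqrt(x^2 + y^2)."""
    for _ in range(steps):
        r2 = x * x + y * y
        dx = mu * x - y - x * r2
        dy = x + mu * y - y * r2
        x, y = x + dt * dx, y + dt * dy
    return math.hypot(x, y)

# For mu > 0 the orbit winds onto the limit cycle of radius sqrt(mu):
for mu in (0.04, 0.16, 0.25):
    print(mu, simulate(mu), math.sqrt(mu))
```

In polar coordinates the system reduces to r˙ = μr − r³, θ˙ = 1, so the computed radius approaches √μ up to the (small) Euler discretisation error.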

1.3 Strange Attractors

In three or more dimensions, it is still true that trajectories do not intersect, but no longer does this constrain the orbits in the way that it does in two dimensions. It is quite possible to imagine a bounded trajectory in three dimensions which never repeats itself. Whether such trajectories can be realised is another matter, of course, but Fig. 1.6 suggests that they can. This shows a numerical solution of a set of three ordinary differential equations, known as the Lorenz equations (written explicitly in (1.14) below, and which we shall meet again later, particularly in Chap. 4); the orbit appears to trace out a surface, albeit a convoluted one, on which the orbit is dense

² We define [x]+ = max(x, 0).



Fig. 1.6 Solution of the Lorenz equations; r = 28, σ = 10, b = 8/3

(i.e. it 'fills out' the surface). These observations raise certain conceptual problems. We term a differential equation

x˙ = f(x), x ∈ Rⁿ,


dissipative if ∇ · f < 0; the Lorenz equations are an example of a dissipative system. It is easy to show that an arbitrary volume V of initial points in phase space satisfies the equation

V˙ = V ∇ · f.    (1.13)

Therefore if ∇ · f < 0 (and is bounded away from zero), then V → 0 as t → ∞. If, in addition, solutions are bounded for all t (e.g. if x · f < 0 for large ‖x‖), then all trajectories must be attracted to a set of zero volume. Consider the Lorenz equations in three dimensions. Intuition tells us to expect the zero-volume attractor to be of dimension two, i.e. a surface. And yet, we have already pointed out that orbits on a surface cannot be more exciting than periodic, in contrast to Fig. 1.6. The resolution of this paradox must lie in the fact that while the attractor is locally two-dimensional, it forms a braided set of connected surfaces, on which orbits can wander forever. Thus the area of the attractor is unbounded, and in a sense, its dimension is greater than two. This is one facet of what is termed a strange attractor. Another is that nearby trajectories diverge (exponentially) in time. It is this sensitive dependence on initial conditions which was the motivation for Lorenz's original paper, and which is (probably) why meteorologists cannot predict the weather more than about a week ahead. Any initial uncertainties in today's data (e.g. due to an insufficient number of weather stations collecting information) are magnified exponentially in time, so that predictions beyond a few days are unreliable. It is probably the case that nature abounds with such inherently unpredictable phenomena.
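For the Lorenz equations (written out in (1.14) below), the contraction is uniform: ∇ · f = −(σ + b + 1) at every point. This can be confirmed with a small finite-difference sketch; the parameter values match Fig. 1.6, and the function names are our own:

```python
def lorenz(x, y, z, sigma=10.0, r=28.0, b=8.0 / 3.0):
    """Right-hand side of the Lorenz equations."""
    return (-sigma * x + sigma * y, (r - z) * x - y, x * y - b * z)

def divergence(x, y, z, h=1e-6):
    """Central-difference estimate of div f = sum_i d(f_i)/d(x_i)."""
    div = 0.0
    for i in range(3):
        p = [x, y, z]; m = [x, y, z]
        p[i] += h; m[i] -= h
        div += (lorenz(*p)[i] - lorenz(*m)[i]) / (2 * h)
    return div

# The same value at any point: -(sigma + b + 1) = -13.666...
print(divergence(1.0, 2.0, 3.0))
print(divergence(-3.0, 0.5, 20.0))
```

Because each diagonal entry of the Jacobian is constant (−σ, −1 and −b respectively), the estimate is independent of the point chosen, illustrating why phase volumes contract at the uniform rate e^{−(σ+b+1)t}.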



1.4 Turbulence

It has been a commonplace observation for centuries that water often flows in an irregular fashion. Figure 1.7 shows a drawing by Leonardo da Vinci of water flowing into a pool. It is a mass of bubbles and irregularly placed whorls. Osborne Reynolds was the first to conduct systematic investigations into this phenomenon of turbulence, beginning in 1880. He studied flow in a cylindrical pipe, and found that when a parameter proportional to the flow rate through the pipe increased through a critical value, the flow underwent a transition from a steady, uniform state to an irregular, time-dependent, turbulent one. This parameter, now called the Reynolds number (denoted Re), thus acts as a bifurcation parameter for fluid flow. Other types of transition to 'turbulence' in fluids, for example in thermal convection, rotating Couette flow, rotating stratified flows, etc., are associated with changes in Rayleigh number Ra, Taylor number Ta and Rossby number Ro. The analogy between the transition in ordinary differential equations from fixed point to periodic motion and ultimately to chaos, and similar transitions in some fluid flow experiments, suggests that a useful strategy to understand these phenomena may be to study the bifurcations associated with the transitions. In some experiments, this approach has been markedly successful. The word 'turbulence' has come to have a particular meaning to fluid dynamicists, who often associate it with transitions in shear flow (e.g. Reynolds' flows in a pipe,

Fig. 1.7 A drawing of water pouring into a pool by Leonardo da Vinci, ca. 1507. Windsor Collection 12660v. Image from Art Heritage/Alamy Stock Photo



or plane Poiseuille flow, or boundary layer flow). It is characterised by temporal and spatial disorder, and the existence of 'eddies' on many different length scales. Moreover, shear flows are unlike other fluid experiments, such as Rayleigh–Bénard convection (the motion induced by heating a layer of fluid from below), in that the mechanisms of transition show no obvious analogy with transitions in finite-dimensional systems. As such, they lie apart from more 'well-behaved' experiments. Some mathematicians and physicists have trodden on this sacred turf, by annexing the word 'turbulence' for their own purposes, sometimes simply to characterise chaotic motion in ordinary differential equations, or as a general term to describe spatiotemporal disorder. In doing so, they have alienated some fluid dynamicists, which is a pity, since there is little doubt that (in the words of Ed Spiegel) 'turbulence is chaos, but chaos is not turbulence'. But the clue to the understanding of turbulence in fluids lies in the modern theory of chaos in dynamical systems.

1.5 Stochasticity

Osborne Reynolds began his experiments on pipe flow in 1880, and the first paper reporting his results appeared in 1883. A second, outlining the basis for the statistical description of turbulence, and introducing the Reynolds stresses, appeared in 1885. In this same year, King Oscar II of Sweden proposed a prize for the solution of a seemingly unrelated problem in the much older subject of celestial mechanics, namely the stability of orbits of the n-body problem. Given a system of self-gravitating mass points, the prize asked for convergent series representations of their trajectories for all time. Put another way, it raised the question as to whether Laplace's famous principle of determinism³ could be put into practice: are the motions of celestial bodies predictable; is the solar system stable? At first sight such a question seems ludicrous; the solar system, in particular, seems the very epitome of predictability. Treating the planets as mass points interacting according to Newtonian gravitation, one can predict in detail the trajectories of the Earth, the Moon and the other planets accurately for thousands of years into the future or into the past. Eclipses are no longer a harbinger of doom, and spacecraft navigate the well-charted courses of planets. Elementary Newtonian mechanics teaches us that planetary orbits—the two-body problem—are ellipses. The complication arises when one tries to include the perturbing effect of other planets on these simple elliptical trajectories. Although these perturbations are small, the increase in the number of degrees of freedom means that one can no longer be certain of the predictability of the orbits.

³ In a nutshell: if you know precisely the state of the universe at time t₀, together with the deterministic laws governing it, then (if you're God) you will know the state of the future. At an atomic level, this is countermanded by the uncertainty principle, and, at a practical macroscopic level, by the subject of this book.



The conundrum is this: methods of perturbation theory allow one to calculate actual orbits as accurately as one wishes for a finite time. But the stability question is one which requires the study of the dynamics for all time. Schematically, we can find accurate series representations for small perturbations of O(ε), say, for t < T. That is, one can accurately compute trajectories for fixed T as ε → 0. But the reality is that ε is small but finite, and one really wants to study the limit T → ∞. The series that one obtains are not necessarily convergent, but asymptotic. The non-convergence of the perturbation series representations for planetary trajectories is associated with the phenomenon of stochasticity in Hamiltonian dynamics. A system such as the solar system (nine⁴ planets plus the Sun, let us say) is represented by the Hamiltonian dynamics associated with the Hamiltonian H (which will be just the sum of the kinetic and potential energies). This Hamiltonian is close to that which describes the trajectories as nine uncoupled two-body problems, H0, say. Corresponding to H0, the motion is of nine separate oscillators, each of whose trajectories can be thought of as lying on a circle in its own 2-dimensional phase space. Thus orbits of H0 lie in the Cartesian product of nine circles, S¹ × S¹ × · · · × S¹, or a nine-torus, in the 18-dimensional positional phase space. What happens if H0 is perturbed to H? The question is whether the nine-torus is robust to perturbations, and also whether, if so, trajectories on it are close to quasi-periodic (i.e. have no more than nine independent frequencies). In fact, there is a one-parameter family of nine-tori corresponding to different values of the energy H. Poincaré won the King's prize by showing amongst other things that series expansions in ε (if H − H0 ∼ ε) were divergent, and he also knew that such divergence was associated with the destruction of certain tori.
The achievement of Kolmogorov, Arnold and Moser, in the celebrated KAM theorem, was to show that most tori (in the sense of measure) are conserved under perturbation, and thus the answer to the question: is the solar system stable? is: probably. However, near the tori which are destroyed by a perturbation, there exist regions of irregular, stochastic behaviour, and as the perturbation grows in amplitude, the regions of stochasticity grow in size, until eventually, they envelop the phase space. Celestial stability is a classical problem, and of interest for its own sake. Nevertheless, the question of whether the solar system might fly apart in a few million years might seem overly abstruse. There are, however, other more practical problems where these issues are central. Particle accelerators, and other magnetic confinement devices such as tokamaks, operate by accelerating atomic particles around closed loops. They have to maintain almost circular orbits for many millions of rotations. In this case, the long-term stability of almost periodic motions is a central concern, which makes the study of the analogous celestial mechanics problem rather less academic.

⁴ Now apparently eight since the relegation of Pluto.



1.6 The Lorenz Equations

The surge of recent interest in chaotic behaviour does not, however, stem from a breakthrough in studies of turbulence, nor in the revelations offered by the KAM theorem. Instead, it was a more mundane, if difficult, problem that initiated much of what followed: predicting the weather. Ed Lorenz was a meteorologist at M. I. T., interested in medium-range weather forecasting, and particularly the issue of whether weather was in principle predictable beyond some optimum time. Given that the initial conditions (the weather today) can only be known incompletely and approximately at best, will the real solution and the approximate one diverge so rapidly that prediction is in effect impossible after some finite time?⁵ The question is exemplified by the simple equation x˙ = x, x(0) = 1 or 1 + ε. The solutions, e^t and (1 + ε)e^t, diverge, so that at time t = O(ln(1/ε)), their difference is O(1). This divergence, however, is rather slow. A more dramatic example is with the initial conditions x = 0 or x = ε. Lorenz wished to show that a 'realistic' set of nonlinear equations (i.e. which bear some relation to a physical model) could exhibit the same behaviour, and moreover that the solutions themselves could be irregular and aperiodic. The minimum requirement is a set of three nonlinear, ordinary differential equations, and Lorenz derived the equations named for him by writing down a Fourier series representation for the stream function and temperature in two-dimensional, Boussinesq⁶ convection. By retaining only three terms, he obtained the following:

x˙ = −σx + σy,
y˙ = (r − z)x − y,    (1.14)
z˙ = xy − bz.
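These equations are readily integrated numerically, and doing so exhibits the sensitive dependence on initial conditions described in Sect. 1.3. The following is a sketch only: a classical fourth-order Runge–Kutta step, with step size, integration time and initial conditions chosen purely for illustration:

```python
def lorenz_rhs(u, sigma=10.0, r=28.0, b=8.0 / 3.0):
    """Right-hand side of the Lorenz equations."""
    x, y, z = u
    return (-sigma * x + sigma * y, (r - z) * x - y, x * y - b * z)

def rk4_step(u, dt):
    """One classical fourth-order Runge-Kutta step."""
    add = lambda w, k, h: tuple(wi + h * ki for wi, ki in zip(w, k))
    k1 = lorenz_rhs(u)
    k2 = lorenz_rhs(add(u, k1, dt / 2))
    k3 = lorenz_rhs(add(u, k2, dt / 2))
    k4 = lorenz_rhs(add(u, k3, dt))
    return tuple(ui + dt * (a + 2 * p + 2 * q + s) / 6
                 for ui, a, p, q, s in zip(u, k1, k2, k3, k4))

# Two orbits whose initial conditions differ by 10^-8 in x alone:
u = (1.0, 1.0, 1.0)
v = (1.0 + 1e-8, 1.0, 1.0)
dt = 0.01
for _ in range(2500):          # integrate to t = 25
    u, v = rk4_step(u, dt), rk4_step(v, dt)
sep = sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5
print(sep)                     # the initially microscopic separation has grown enormously
```

By t = 25 the two orbits, initially indistinguishable, are macroscopically far apart on the attractor: the numerical counterpart of the forecasting problem above.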


The three parameters r, σ , b are all positive; r represents external forcing and is normally used as a bifurcation parameter. We have already seen, in Fig. 1.6, that solutions of these equations can appear chaotic when viewed in phase space. Figure 1.8 shows a typical time series, which is apparently chaotic. Lorenz made several key observations, which led to a partial understanding of the behaviour he observed, and raised further questions for the theorist. First, the divergence of the right-hand side of (1.14) (taken as a vector) is −(σ + b + 1) < 0, and thus phase volumes contract uniformly in time. Second, orbits are bounded in phase space. Therefore, there is a bounded attracting set, which all trajectories approach. From his numerical integrations, Lorenz guessed that he might be able to characterise orbits by some single feature such as the maximum value of z as they

⁵ Currently thought to be about ten days.
⁶ The Boussinesq approximation neglects density variations except in the buoyancy term. It is equivalent to the assumption that αΔT ≪ 1, where α is the thermal expansion coefficient, and ΔT the prescribed temperature drop.



Fig. 1.8 x(t) in a solution of the Lorenz equations, corresponding to the parameter values of Fig. 1.6

Fig. 1.9 Lorenz map of successive maxima of z at parameter values r = 28, σ = 10, b = 8/3

loop round. He thus tabulated the sequence of maxima in z, {Mn}, and plotted Mn+1 versus Mn: the result is shown in Fig. 1.9. On the face of it, it seems that one can use Mn as a predictor for Mn+1. In some sense, the sequence {Mn} encodes the orbits, and thus we are led to studying the difference map given in Fig. 1.9. The issue of why the differential equation should be thus encoded, and how one should derive Fig. 1.9 from (1.14), is rather more difficult; we shall return to it in Chap. 4. We shall have more to say about one-dimensional maps in Chap. 2. For the moment, we follow Lorenz, who opted to study the simpler tent map given by

Mn+1 = 2Mn,        Mn < ½,
Mn+1 = 2 − 2Mn,    Mn > ½,    (1.15)

which is shown in Fig. 1.10. Its dynamics are extremely complicated, but nevertheless simple to understand. Lorenz argued as follows. Iterates of (1.15) satisfy

Mn = m_n ± 2ⁿ M0,    (1.16)




Fig. 1.10 The piecewise linear map given by (1.15)

where m_n is even. Consider first an initial value M0 = u/2^p, where u is odd (and M0 < ½). Then M_{p−1} = ½, M_p = 1 and M_{p+1} = 0. There are thus a countable number of trajectories which are bi-asymptotic to 0 (that is, Mi → 0 as i → ±∞); such trajectories are called homoclinic trajectories. If M0 = u/(2^p v), where u, v are relatively prime (and odd), then for all k > 0, M_{p+k} = u_k/v, where u_k, v are relatively prime and u_k is even. Hence all such values of M0 lead to asymptotically periodic sequences, since for any v, only a finite number of even values of u_k < v can occur. Finally, suppose that M0 is irrational; it is easy to see from (1.16) that M_{n+k} ≠ M_n for all values of n and k, and thus such sequences are aperiodic. In consequence, (i) there are a countable number of periodic orbits; (ii) there are an uncountable number of aperiodic orbits. Furthermore, if M0′ = M0 + ε, ε ≪ 1, then for sufficiently small ε and given k,

Mk′ = Mk ± 2^k ε.


It follows from this that (iii) the map displays sensitive dependence on initial conditions. No matter how close two initial values M0 and M0′ are, their iterates diverge after a finite number of terms. The three ingredients above are often taken as the definition of deterministic chaos. That is, an invariant set which includes (i) and (ii) and displays (iii) is said to be chaotic. Since (iii) implies that periodic orbits are unstable, it follows that orbits will 'look' chaotic. A fourth ingredient, not considered by Lorenz, is (iv) the existence of a dense⁷ orbit. That is to say, there is an orbit which approaches any point in the invariant set arbitrarily closely. It is easy to see that a dense orbit exists for (1.15). First, (1.16) implies that if M0 is written as a binary expansion M0 = .a1 a2 a3 . . . then

⁷ A set M is dense in X if X is the closure of M.


Mn = .an an+1 an+2 . . . or 1 − .an an+1 an+2 . . . .



Let s ∈ (0, 1) have binary expansion s = .s1 s2 s3 . . .; then if s̃ = .s1 s2 . . . sk, we have |s − s̃| < 2^{−k}. Suppose also t̃ = 1 − s̃ = .t1 t2 . . . tk. Then if M0 = .a1 a2 . . . s1 s2 . . . sk t1 t2 . . . tk . . ., we see that some iterate of M0 lies within 2^{−k} of s. By sequentially including all such numbers for larger and larger k, we thus create an M0 whose iterates come arbitrarily close (and infinitely often) to any point in the interval. For example,

M0 = .01011011001010011100101110111 . . .


is such a point. And there are countably many such points.
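The properties (i) and (iii) of the tent map can be observed directly with exact rational arithmetic; a short sketch, with initial values chosen purely for illustration:

```python
from fractions import Fraction

def tent(M):
    """The tent map (1.15): double M, folding the result back into [0, 1]."""
    return 2 * M if M < Fraction(1, 2) else 2 - 2 * M

# (i) A dyadic rational u/2^p (u odd) is homoclinic to 0: here M0 = 3/16,
# so p = 4, and the orbit reaches 1/2, then 1, then sticks at 0.
orbit = [Fraction(3, 16)]
for _ in range(6):
    orbit.append(tent(orbit[-1]))
print(orbit)

# (iii) Sensitive dependence: a perturbation eps doubles at each step
# while the two orbits remain on the same branch of the map.
eps = Fraction(1, 2 ** 20)
a, b = Fraction(1, 3), Fraction(1, 3) + eps
for _ in range(10):
    a, b = tent(a), tent(b)
print(abs(a - b))   # 2^10 * eps = 1/1024, exactly
```

Using `Fraction` rather than floating point matters here: binary floating-point iteration of the tent map loses one bit of the initial condition per step, which would itself be a (rather vivid) manifestation of (iii).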

1.7 Notes and References

The methods of nonlinear dynamics describing bifurcation theory, phase plane analysis and so on are discussed in any number of books on differential equations; two popular ones which carry on into dynamical systems and chaos are those by Strogatz (1994) and Alligood et al. (1997). Although dynamical systems may be thought to start with the work of Poincaré over a hundred years ago, the subject of chaos really burst into flower in the 1980s. In Chaos II (1990), a collection of important reprints, Hao Bao-Lin lists 117 books or conference proceedings up to the end of 1989 (not all on chaos, though). Among the first of these was that of Guckenheimer and Holmes (1983). There is a lot of information packed in this book, although the order will not be to everyone's taste, and the style is solidly mathematical. The strengths are in differential equation theory. The book by Devaney (1986) (there is a second edition from 2003) is solely concerned with maps, and is a good undergraduate mathematics text. On the other hand, there are a number of books by physicists; one such is that by Lichtenberg and Lieberman (1983) (with a second edition from 1992 and a paperback edition from 2010), which is comprehensive but largely concerned with Hamiltonian systems and their perturbations, and not easy to read. The reprint collection by Hao Bao-Lin (1990) contains a very useful introduction to chaos, as does that by Cvitanović (1984), and both books are very useful sources of journal articles. The introduction in the latter, in particular, contains a lucid discussion of period-doubling (better than Feigenbaum's original papers (1978, 1979), which are rather dense reading). A nice book, again by physicists, is that by Bergé et al. (1986), which ends with a rather Gallic debate between the three authors. Arnold's (1983) book is full of good things, though it shares with Guckenheimer and Holmes's book a somewhat perverse ordering.
Altogether different is the book by Thompson and Stewart (1986) (second edition from 2002), who are engineers by training. It is very clear and readable, though it stops short



of the kind of detail needed to satisfy even an applied mathematician. It is very useful as a reference. MacKay and Meiss (1987) is a large collection of reprints in Hamiltonian mechanics, including classic papers by Kolmogorov and Arnold on the KAM theorem. Another book with the Hamiltonian flavour is that of Tabor (1989). A lovely book on the Lorenz equations is that of Sparrow (1982), which gave an early emphasis to the importance of homoclinic bifurcations. A book which considers these in enormous detail is that of Wiggins (1990). The prolific Hao Bao-Lin (1989) has also written a nice book on symbolic dynamics. Books on related subjects are those by Carr (1981) on centre manifold theory, Hassard et al. (1981) on the Hopf bifurcation theorem (but including normal forms) and the book on ordinary differential equations by Hartman (1982), which contains many smoothness results which are used in the various bifurcation analyses. It can be seen from the above that there was a bloom in books on chaos in the 1980s and 1990s, and many of these are now considered classics. Of course, books on chaos continue to be published, often as much later second editions. Recent examples include the books by Argyris et al. (2015), Shivamoggi (2014) and Barreira and Valls (2012). The last of these largely concerns maps, and has an abstract, analytic flavour. The other two are both second editions of much earlier books, and are more substantial. Shivamoggi's book is a tour of many of the topics of the present book, and more: solitons, the Painlevé property, amongst other things. The book by Argyris et al. is a monster at over 800 pages, and is suitably thorough; in particular, there is liberal use of illustration. The main topics, though, are just those in the older books, though with some eclectic add-ons: for example, the last chapter on the dynamics of surface chemical reactions.
One of the principal developments in the subject in more recent years has been the development of the subject of non-smooth, or piecewise discontinuous, dynamical systems. These arise in a variety of applications, for example, involving impact oscillators, systems with friction, and so on, and there are already a number of books on the subject, for example, that by di Bernardo et al. (2008). However, their study goes beyond the aspiration of the present book.

1.7.1 Poincaré and the Swedish Prize

An extensive account of King Oscar's prize competition, its conduct and the award of the prize to Poincaré is given by Barrow-Green (1997), and to a lesser extent by Daniel Goroff in his introduction to the translation of Poincaré's Méthodes nouvelles de la Mécanique Céleste (Poincaré 1993). Interestingly, the two historical accounts diverge slightly in interpretation. Poincaré's original submission contained a mistake, spotted by the reviewer Phragmén, which involved the assumption that the stable and unstable manifolds⁸ of fixed points of a Poincaré map for a Hamiltonian system would intersect (if at all) smoothly. In fact this is not necessarily the case, and transverse

⁸ We will have plenty of discussion of manifolds later, but essentially in Rⁿ they are simply subspaces.



intersection provides an explicit mechanism for chaotic behaviour, as explained in Chap. 5, Section 5.3.6. The error was corrected in the eventually published memoir in Acta Mathematica, but was a cause of some embarrassment to the prize committee, none of whom had noticed it.

1.7.2 Weather Predictability

We mentioned the weather briefly in Sect. 1.3, and again in Sect. 1.6. The problem of predicting the weather is confounded by two issues. The first of these is that, most probably, the model equations which describe weather patterns have chaotic solutions, with the specific implication that small changes in initial conditions lead after a finite time to widely different outcomes. This feature of chaos, 'sensitive dependence on initial conditions', is picturesquely referred to as 'the butterfly effect', the title of the first chapter of James Gleick's popular book on chaos (Gleick 1988), and a phrase credited to Edward Lorenz. Lorenz himself gives an interesting gloss on the phrase in his own popular and readable account (Lorenz 1993). It seems it originates with the title of a talk given by Lorenz in 1972, 'Does the flap of a butterfly's wings in Brazil set off a tornado in Texas?' Interestingly, exactly the same idea (and even with a butterfly playing the key rôle) occurs in Ray Bradbury's short story 'A sound of thunder', which appears in his collection 'The golden apples of the sun'; the story was published originally in 1952. The consequence of this is that, even if the model equations could be solved exactly (which they cannot be), the necessary imprecision in specification of the initial conditions for the variables, due to incomplete knowledge of the state of the atmosphere, leads to divergence of the forecast from reality after a finite time, usually thought to be about one week. A second issue is that the governing model equations are themselves imperfect representations of the real atmosphere, so that even if the solutions were not chaotic, there would be divergence of the predicted weather from that which actually occurs.
Sensitive dependence on initial conditions leads to exponential divergence in time of neighbouring solution trajectories, while model error causes linear growth; which of these is the more important is a matter for debate. In practice, present-day forecasts attempt to circumvent the issue by using ensemble forecasts, in which a suite of different predictions are made using a distribution of different initial conditions. The forecasts then become probabilistic in nature. See, for example, Gneiting and Raftery (2005).



1.8 Exercises

1.1 The pair of equations x˙ = f(x, y), y˙ = g(x, y) has an equilibrium point at x = x0, y = y0. If the matrix A is given by

A = ( f_x  f_y )
    ( g_x  g_y )

evaluated at (x0, y0), and its trace and determinant are denoted by T = tr A, D = det A, show that the stability of (x0, y0) is determined by T and D, and delineate the curves in (T, D) space which separate regions in which the critical point is a saddle, (stable or unstable) node, etc.

1.2 The Lorenz equations are given by

x˙ = −σx + σy,
y˙ = (r − z)x − y,
z˙ = xy − bz,

where r, σ, b > 0. Show that the origin is a stable equilibrium point if r < 1, and unstable if r > 1. Show that if r > 1, there are two further equilibria C±, given by x = y = ±√(b(r − 1)), z = r − 1, and show that the eigenvalues λ of the Jacobian of the linearised equations at either equilibrium satisfy

p(λ, r) ≡ λ³ + (σ + b + 1)λ² + b(σ + r)λ + 2bσ(r − 1) = 0.

By consideration of the graph of p for r > 1 show that there is always at least one root of p which is negative, and show that λ = 0 is never a root for r > 1. Show that when r = 1, the other two roots are also negative. Deduce that this is also the case if r ≈ 1. [Hint: as r varies, the roots λ(r) of p = 0 vary continuously (why?)] Show that the two roots other than the negative one can be an imaginary pair ±iω only if r = rc, where

rc = σ(σ + b + 3) / (σ − b − 1).

Deduce that the equilibria C± are stable if 1 < r < rc , and unstable if r > rc .



1.3 If x ∈ R satisfies x˙ = f(x, μ), with f(0, 0) = 0, f_x(0, 0) = 0, write down an approximate equation for x near 0 by expanding f in a Taylor series, retaining terms up to degree 3. By considering the relative magnitude of these terms when x and μ are small, show that (i) if f_μ ≠ 0, then approximately x˙ = μf_μ + ½x² f_xx; (ii) if f_μ = 0, approximately x˙ = μx f_xμ + ½x² f_xx + ½μ² f_μμ; (iii) if f_μ = 0 and f_xx = 0, approximately x˙ = μx f_xμ + ⅙x³ f_xxx + ½μ² f_μμ, and that in the last case, the μ² term can also be ignored. Describe the corresponding bifurcation diagrams which plot the steady states of x versus the parameter μ, and indicate the stability of the various fixed points. [These bifurcations are known as saddle-node, transcritical and pitchfork bifurcations, respectively.]

1.4 Show that the system

x˙ = μ − x²,
y˙ = −y,


has a bifurcation at μ = 0 where a saddle in x < 0 joins a node in x > 0 (hence the name saddle-node bifurcation).

1.5 The Lotka–Volterra equations are

x˙ = ax − bxy,
y˙ = −cy + dxy.

Show by a scaling argument that they can be written in the dimensionless form

x˙ = x(1 − y),
y˙ = γy(x − 1),

and define γ in terms of a, b, c and d. Show that there is a (linear) centre at (1, 1) and a saddle at (0, 0).



By considering dy/dx or otherwise, find a first integral, and deduce carefully that trajectories of the nonlinear system form closed orbits about (1, 1). Hence construct the phase plane for the Lotka–Volterra equations.

1.6 The Van der Pol equation is

x¨ + μ(x² − 1)x˙ + x = 0.

By an appropriate definition of y, write this equation as a pair of first-order equations for x and y. Show that there is a unique fixed point, and show that it is stable if μ < 0 and unstable if μ > 0. What kind of bifurcation occurs at μ = 0?

1.7 A λ-ω system is given by

u˙ = λu − ωv,
v˙ = ωu + λv,

in which λ = λ(r), ω = ω(r) and r² = u² + v². Write the pair of equations as a single equation for the complex variable z = u + iv. Hence, or otherwise, show that

r˙ = λ(r)r.

Deduce that if ω(r) ≡ 1 and λ(r) = μ − r², a Hopf bifurcation occurs at μ = 0. Plot the resultant bifurcation diagram. If, instead, λ = μr² − r⁴, show that the bifurcation diagram remains the same. Does a Hopf bifurcation occur in this case? If not, why not?

1.8 Sketch the bifurcation diagrams for the systems
(i) x˙ = x(μ − c − μbx), b, c > 0;
(ii) x˙ = μx − bx² + cx³, b > 0;
(iii) x˙ = −x(x² − 2bx − μ), b > 0;
indicate stability of the solution branches. (You should not necessarily assume that the critical value of μ is zero.)

1.9 Describe the (x, y) phase portrait of the following systems, written in polar coordinates (r, θ), for μ ≷ 0, and determine which of them has a Hopf bifurcation at μ = 0:
(a) r˙ = r(μ − r²)(2μ − r²)²;
(b) r˙ = r(r + μ)(r − μ);
(c) r˙ = −μ²r(r + μ)²(r − μ)²;
with θ˙ = 1 in all cases.



1.10 A feedback control system is modelled by x¨ + δ x˙ + x(x 2 − 1) = −z, z˙ + αz = αγ x, where x is the amplitude of a damped, nonlinear spring and z is the controller. Evaluate the fixed points and their stability, and show that bifurcations can occur on three distinct surfaces in (α, δ, γ ) space. Show that these surfaces meet at γ = 1, δ = 1/α, where there is a degenerate bifurcation. 1.11 The equation ε x˙ = −x + f (x1 ), where x1 = x(t − 1), is called a delay-differential equation. Explain why the initial data for this equation is x = u(t), t ∈ [−1, 0], and deduce that the system is infinite-dimensional. By appropriately defining xn , show that, for large N , the equation can be approximated by the (N + 1)-dimensional set ε x˙0 = −x0 + f (x N ), 1 x˙n = −xn + xn−1 , n = 1, 2, ...N . N


Hence show that the system is dissipative. If x* is an equilibrium point, derive a transcendental equation for the exponents σ of infinitesimal perturbations e^{σt} to x*, for the delay equation. Hence show that if |f′(x*)| < 1, the fixed point is stable. Is there a corresponding result for the (N + 1)-dimensional approximation? [This is called a delay-recruitment equation. If the map x → f(x) has chaotic dynamics, then so does the delay-differential equation for small enough ε. The fixed point x* loses stability by a Hopf bifurcation (if f′(x*) < −1), and then period-doubles⁹ towards chaotic behaviour.]

⁹ Period-doubling is discussed in Chap. 2.
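The finite-dimensional approximation in exercise 1.11 is easy to experiment with numerically. A minimal sketch (the affine choice f(x) = 0.5x + 0.2, for which |f′| = 0.5 < 1, and all parameter values are illustrative assumptions, not from the text): since |f′| < 1, the chain should relax to the fixed point x* = f(x*) = 0.4.

```python
# Euler integration of the (N+1)-dimensional approximation to the
# delay equation eps*x' = -x + f(x(t-1)):
#   eps*x0' = -x0 + f(xN),  (1/N)*xn' = -xn + x_{n-1},  n = 1..N
# Illustrative choices: f(x) = 0.5*x + 0.2 (so |f'| < 1), eps = 0.5.

def f(x):
    return 0.5 * x + 0.2        # fixed point x* = 0.4

N, eps, dt, T = 20, 0.5, 0.005, 40.0
x = [1.0] * (N + 1)             # initial data: x = 1 on [-1, 0]

t = 0.0
while t < T:
    new = x[:]
    new[0] = x[0] + dt * (-x[0] + f(x[N])) / eps
    for n in range(1, N + 1):
        new[n] = x[n] + dt * N * (x[n - 1] - x[n])
    x, t = new, t + dt

print(x[0])   # close to the stable fixed point x* = 0.4
```

Replacing f with a map having chaotic dynamics (and shrinking ε) is the route to the chaotic behaviour mentioned in the bracketed remark.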

Chapter 2

One-Dimensional Maps

2.1 Chaos in Maps

The study of differential equations is naturally bound to the study of maps. Depending on your point of view, difference equations are of interest in their own right, and an extreme position leads to such esoteric (but elegant!) subjects as the study of fractals, the Mandelbrot set, Julia sets, and so on (for which, see Chap. 6). At a more mundane level, difference equations may be viewed as mathematical models in their own right. In particular, discrete generations of animal or insect populations may be modelled by the size of successive generations x₁, x₂, x₃, . . ., and it is plausible that the successive values are related to those of the previous generation by a difference equation of the form

xₙ₊₁ = f(xₙ). (2.1)

The simplest assumption is that of exponential growth, in which the rate of reproduction is directly proportional to the population:

xₙ₊₁ = λxₙ. (2.2)


However, it is evident in practice that unlimited population growth cannot occur, and the effects of overcrowding, resource limitation, disease, etc., will cause a realistic function f(x) to be sub-linear; in particular, f(∞) is usually bounded and often negative. Possibly the simplest correction to allow for such effects is the logistic difference equation

xₙ₊₁ = λxₙ(1 − xₙ). (2.3)

Obviously populations (here normalised so that the growth rate is zero at x = 1) must have xₙ ≥ 0, which suggests that 0 ≤ xₙ ≤ 1 if the sequence is to retain a meaning.

© Springer Nature Switzerland AG 2019 A. Fowler and M. McGuinness, Chaos,



Fig. 2.1 Convergence to a fixed point (left) and chaos (right) for a unimodal map (i.e. one with a single maximum)

Consequently, we must take 0 ≤ λ ≤ 4 if xₙ ∈ [0, 1] for all n.¹ The parameter λ is a bifurcation parameter, so-called because as it changes, the behaviour of sequences {xₙ} will change dramatically. In practice, one might expect values of λ to depend on external circumstances, such as those which regulate food supply, temperature and so on.

An instructive method to get some feel for the dynamics of sequences of interest is to trace them graphically. One obtains a sequence x₁, x₂, x₃, . . . in two-dimensional space with axes (xₙ, xₙ₊₁) by tracing a 'staircase' vertically from x₁ on the xₙ axis to the graph of xₙ₊₁ = f(xₙ) (which gives x₂); then horizontally to the line at 45°, xₙ₊₁ = xₙ (thus x₂ is the new abscissa); then vertically to f(x₂), horizontally to the 45° line (giving x₃), and so on. Figure 2.1 illustrates this procedure (called cobwebbing) schematically for two different values of λ in (2.3). If λ is quite small (1 < λ < 3, in fact), then successive iterates appear to converge to the non-trivial fixed point x* = f(x*) in (0, 1). However, if λ is larger, then iterates appear to wander around aimlessly.

Before we proceed, we need some simple definitions.

Definition For the map xₙ₊₁ = f(xₙ), x* is a fixed point if x* = f(x*).

Definition An orbit of f is a sequence {xₙ} such that xₙ₊₁ = f(xₙ).

Notation

We will write iterates of f with a superscript. Thus f² = f ∘ f, i.e. f²(x) ≡ f[f(x)], and more generally fⁿ ≡ f ∘ fⁿ⁻¹.

Definition A periodic orbit or p-cycle is an orbit {xᵢ}, i ∈ ℤ, with some smallest value p such that xᵢ₊ₚ = f ᵖ(xᵢ) = xᵢ for all i; p is the period.

Definition An aperiodic orbit is an orbit which is not periodic.
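The two behaviours sketched in Fig. 2.1 can be reproduced by direct iteration of (2.3); the values λ = 2.8 and λ = 4 and the initial point below are arbitrary illustrative choices.

```python
def logistic(lam, x):
    return lam * x * (1 - x)

# 1 < lambda < 3: iterates converge to the fixed point x* = 1 - 1/lambda
xa = 0.2
for _ in range(1000):
    xa = logistic(2.8, xa)
print(xa, 1 - 1 / 2.8)      # the two values agree

# lambda = 4: iterates wander over (0, 1) without ever settling down
x, orbit = 0.2, []
for _ in range(1000):
    x = logistic(4.0, x)
    orbit.append(x)
print(min(orbit), max(orbit))
```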

¹ Because of this, (2.3) is not wholly satisfactory; a better kind of function has f(x) → 0 as x → ∞. One example which has been used is xₙ₊₁ = λxₙ(1 + xₙ)⁻ᵝ, with β > 1.



Fig. 2.2 Period-doubling bifurcation

Now we can, of course, easily examine iterates of maps such as (2.3) numerically, using a simple program. If 0 < λ < 1, one finds that every sequence tends to zero (xₙ → 0 as n → ∞), and it is in fact easy to see why this must be so.² For λ > 1, there is a non-trivial fixed point x* = 1 − (1/λ), and we find that if 1 < λ < 3, then any initial value (other than x₁ = 0 or 1) gives a sequence tending to x* as n → ∞. However, as λ is increased further, the behaviour becomes more interesting. For example, if λ = 4, then almost any initial condition leads to a sequence which seems to behave aperiodically. The transition in the behaviour of sequences as λ increases from 3 to 4 is associated with a complicated sequence of bifurcations.

For example, at λ = 3.1, we see a 2-cycle. This bifurcates from x* at λ = 3 in the sense that as λ approaches 3 from above, the amplitude (|xᵢ₊₁ − xᵢ|) of the 2-cycle approaches zero, and each point of the cycle approaches x*. This is the period-doubling bifurcation (so-called because the period-1 orbit (x*) bifurcates to give rise to a period-2 orbit); it is illustrated in Fig. 2.2. Notice that the fixed point x* continues to exist for λ > 3. However, it becomes unstable as λ increases through 3.

Now a 2-cycle of f can also be thought of as two fixed points of f². As λ is increased further, one finds that these fixed points also lose stability (at λ = λ₂ ≈ 3.449 . . .) in a period-doubling bifurcation, giving a period-4 orbit for f. This process continues as λ increases, giving period-8, period-16, etc. orbits. Furthermore, the values of λ at which these bifurcations occur accumulate, so that at λ ≈ 3.57, periods of all orders 2ᵏ are present. Also, all these orbits are unstable. It does not take much to sense that for λ > 3.57, a typical computed sequence is likely to appear 'chaotic'. With a countable number of unstable periodic orbits sprinkled throughout the unit interval, one might guess that 'most' sequences would shuttle around, occasionally approaching different cycles before veering off again.

² Since xₙ < xₙ₋₁ for all n.
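The 2-cycle born at λ = 3 is easy to exhibit numerically; λ = 3.2 below is an arbitrary value in the period-2 range, and the closed-form cycle points (the roots of a quadratic extracted from f²(x) = x, a small side calculation rather than a formula from the text) are quoted for comparison.

```python
import math

def logistic(lam, x):
    return lam * x * (1 - x)

lam = 3.2                      # inside the period-2 range 3 < lambda < 3.449
x = 0.3
for _ in range(2000):          # discard the transient
    x = logistic(lam, x)
x1, x2 = x, logistic(lam, x)

assert abs(logistic(lam, x2) - x1) < 1e-9   # f(f(x1)) = x1: period 2
assert abs(x2 - x1) > 0.1                   # ...but not period 1

# the cycle points solve lam^2 x^2 - lam(lam+1) x + (lam+1) = 0
r = math.sqrt((lam + 1) * (lam - 3))
exact = sorted([(lam + 1 - r) / (2 * lam), (lam + 1 + r) / (2 * lam)])
print(sorted([x1, x2]), exact)   # the two pairs agree
```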



And indeed, this statement is roughly accurate (and gets more accurate as λ → 4), but the full description is somewhat more complicated than this. The case λ = 4 is special, because the dynamical behaviour is truly chaotic, and because the behaviour can be very precisely described. To see this we put

x = sin²θ, λ = 4,

in (2.3). Then sin²θₙ₊₁ = sin²2θₙ, so that {xₙ} can be determined from the sequence {θₙ}, where

θₙ₊₁ = 2θₙ mod 2π.³ (2.5)

This is an elementary example of a circle map, defined on the circle S¹, and will be termed the (one-dimensional) baker map.⁴ We put θ = 2πφ. The map

φₙ₊₁ = f(φₙ) = 2φₙ mod 1,


is equivalent to a binary shift on two symbols. For each φ ∈ (0, 1), there corresponds a unique⁵ binary representation

φ = .a₁a₂a₃ . . . = Σᵢ₌₁^∞ aᵢ/2ⁱ.
The map acting on the unit interval is equivalent to the shift map .a₁a₂a₃ . . . → .a₂a₃a₄ . . .. In particular, it is clear that the binary representation of the initial value φ₁ = .a₁a₂a₃ . . . encodes the entire trajectory (since it contains all subsequent iterates, thus φₖ = .aₖaₖ₊₁ . . .). In particular, the sequence a₁a₂a₃ . . . tells us whether the k-th iterate lies in (0, ½) (aₖ = 0) or (½, 1) (aₖ = 1).⁶ Thus trajectories can be uniquely encoded by the binary sequences on the symbols (L, R) which denote whether φₖ is in (0, ½) or (½, 1). For values of λ other than λ = 4 (and for other maps), such sequences, called kneading sequences, can be used to establish the order of bifurcation of periodic orbits.⁷

It is immediate to see from the shift map that (i) there are a countable number of periodic orbits (φ = m/n ∈ ℚ, n ≠ 2ʳ, r ∈ ℕ); (ii) there are an uncountable number of aperiodic orbits (φ ∉ ℚ); (iii) there are an uncountable number of homoclinic points (φ = m/2ʳ), that is to say, points for which φₖ = 0 for some k, and for which there exist trajectories . . . φ₋ₘ, φ₋₍ₘ₋₁₎, . . . , φ₁, . . . which pass through φ₁ and emanate from 0 (φ₋ₘ → 0 as m → ∞); (iv) f has sensitive dependence on initial conditions: given x, there exists δ > 0 such that for any ε > 0, there is a value y with |x − y| < ε, such that, for some N, |f^N(x) − f^N(y)| > δ; (v) f has (infinitely many) dense orbits.

³ x = y mod z if (x − y)/z is an integer.
⁴ The sequences {xₙ} may also be determined from the tent map θₙ₊₁ = 2θₙ, 0 < θₙ < π; θₙ₊₁ = 4π − 2θₙ, π < θₙ < 2π (see Chap. 1).
⁵ Providing we exclude sequences ending . . . 1111 . . ..
⁶ Trajectories which hit φ = ½ are those which terminate . . . 1000 . . ., since if φₖ = ½ = .1000 . . ., then φₖ₊ᵢ = 0 for i ≥ 1.
⁷ And then one needs a symbol C to denote when φₖ = c, where c is the turning point of f.

There are various definitions of chaos for maps or flows, which involve some or all of the above properties. A working definition used by Devaney⁸ uses essentially properties (i) (actually, that periodic points are dense), (iv) and (v). One reason that no universal definition of chaos is agreed on is the practical importance of numerical computations. The essence of chaos, as opposed to noise, is that the dynamics of the system are 'generally' chaotic in appearance. But it is typically the case that not all initial points lead to aperiodic orbits, and furthermore machine accuracy when one actually computes orbits can lead to misleading results. If a computed orbit looks chaotic, it ought to be chaotic; but in a mathematical sense, it may well not be. For example, finite precision computations ensure that all numerically computed iterates of difference equations must be periodic (if they do not terminate), though this is typically an academic (if important) point. But a machine working in binary arithmetic would numerically predict that 0 was a globally stable attracting fixed point for the baker map! The two critical features of Devaney's definition are thus the dense orbit, and the sensitive dependence on initial conditions. These imply that numerical computations will typically look chaotic, and in normal circumstances, a decent numerical computation which looks chaotic may well correspond to chaos thus defined. However, one needs to avoid being dogmatic.
The best definition of chaos would be one where what looks chaotic is chaotic, but it is quite possible (for example) to have extremely long chaotic transients, which eventually lead to steady behaviour. It is often the case that such behaviour can be related to the more restrictive definition of chaos given above.
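The remark about binary arithmetic is easily demonstrated for the doubling map (the starting values 1/3 and 0.1 below are arbitrary): in exact rational arithmetic, an odd-denominator rational is genuinely periodic, while in double precision one significand bit is shifted out per step, so the iterates collapse exactly onto 0.

```python
from fractions import Fraction

def double_mod1(x):
    return (2 * x) % 1

# exact arithmetic: 1/3 = .010101... in binary, a genuine 2-cycle
orbit = [Fraction(1, 3)]
for _ in range(6):
    orbit.append(double_mod1(orbit[-1]))
print(orbit)   # 1/3, 2/3, 1/3, 2/3, ...

# double precision: the 53 significand bits of 0.1 are shifted out
# one per step, so the orbit lands exactly on 0 within ~55 steps
x, steps = 0.1, 0
while x != 0.0 and steps < 200:
    x = double_mod1(x)
    steps += 1
print(steps, x)   # the machine 'predicts' 0 is attracting
```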

2.2 Bifurcations in Maps

Although the logistic map has infinitely many periodic orbits for λ ≥ λ∞ ≈ 3.57 . . ., and exhibits chaotic motion over much of the parameter interval (λ∞, 4), there are many intervals, or windows, in which stable periodic behaviour can occur. For example, if 3.8284 < λ < 3.8415, there is a stable 3-cycle, to which almost all (in the sense of Lebesgue measure) initial values of x₁ are attracted. In some sense, the transition to chaos is a gradual thing. This also shows that periodic orbits with periods other than 2ʳ can exist. In order to understand where these come from, we now categorise the two basic bifurcations exhibited by general one-dimensional maps.

⁸ In his book, An Introduction to Chaotic Dynamical Systems.
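The period-3 window quoted above is easy to verify by brute force. The sketch below (the parameter values, transient length and tolerance are arbitrary choices) iterates away a transient from the critical point and then measures the period of the attractor.

```python
def logistic(lam, x):
    return lam * x * (1 - x)

def attractor_period(lam, x0=0.5, transient=5000, pmax=32, tol=1e-8):
    # run off the transient, then find the smallest p with |f^p(x) - x| < tol
    x = x0
    for _ in range(transient):
        x = logistic(lam, x)
    y = x
    for p in range(1, pmax + 1):
        y = logistic(lam, y)
        if abs(y - x) < tol:
            return p
    return None

print(attractor_period(3.835))   # 3: inside the period-three window
print(attractor_period(3.5))     # 4: inside the period-doubling cascade
```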



First we define stability for fixed points.

Definition A fixed point x* of the one-dimensional map f is (asymptotically) stable if there is a δ > 0 such that for all x with |x − x*| < δ, fⁿ(x) → x* as n → ∞. Otherwise, x* is unstable.

It follows from this definition that x* is (linearly) stable if |f′(x*)| < 1, unstable if |f′(x*)| > 1. This is because if xₙ = x* + yₙ, then yₙ₊₁ ≈ f′(x*)yₙ. If |f′(x*)| = 1, linear theory is inconclusive, but as we will be interested in the passage of |f′(x*)| through one as a parameter μ is varied, this is of little consequence. Two principal possibilities now occur.⁹

(i) Saddle-node bifurcation
Suppose that, when the parameter μ = 0, f(x*, μ) = x* is a fixed point of the map f(x, μ) and that f_x(x*, 0) = 1. For x near x* and μ near 0, we have (by appropriately re-defining μ if necessary)

f ≈ x + μ + a(x − x*)² + · · · ,

and thus there exist two fixed points given by

x = x* ± (−μ/a)^{1/2}

when μ/a < 0. This bifurcation is called the saddle-node bifurcation, as it represents (more generally) the confluence of a saddle-type fixed point with a node-type fixed point. The saddle is unstable, while the node is stable. In one sense, this is not a true bifurcation: no new fixed points are created; rather, one can view the fold as part of a one-parameter family of fixed points which is simply not monotonically determined by μ (Fig. 2.3).

Fig. 2.3 A saddle-node bifurcation

⁹ There are actually two others, the transcritical bifurcation and the pitchfork bifurcation (cf. question 1.3): in both of these f_μ = 0 at the fixed point x*, and in the second also f is odd about x*; see question 2.6.



Fig. 2.4 Supercritical (left) and subcritical (right) period-doubling bifurcations








(ii) Period-doubling bifurcation
This occurs as f′ decreases through −1. If f(x*, 0) = x*, f_x(x*, 0) = −1, it is clear (since y = f(x) cuts y = x transversely) that the fixed point persists for μ ≠ 0, and without loss of generality x*(μ) = constant. Then f(x*, μ) ≡ x*, and for small μ and x − x*,

f ≈ x* − (1 + μ)(x − x*) + a(x − x*)² + b(x − x*)³ . . . ,

with error O(μy, y³), where y = x − x*. Composing f with itself, we find

f²(x) − x ≈ −2by³ − 2a²y³ + 2μy,

whence it follows that f has only the fixed point x* for |μ| ≪ 1, while f² has two fixed points

x − x* = ±[μ/(b + a²)]^{1/2}. (2.12)

We say that f² has a pitchfork bifurcation at μ = 0, on account of the amplitude–parameter diagram, Fig. 2.4. The stability of the bifurcated two-cycle (x₁, x₂) is given by (f²)′(x₁) = (f²)′(x₂) = f′(x₁)f′(x₂) (using the chain rule). More generally, the stability of a p-cycle {x₁, . . . , xₚ} is determined by whether |∏₁ᵖ f′(xᵢ)| < 1.
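The chain-rule criterion can be checked numerically. For the logistic map's 2-cycle, a short side calculation using the sum and product of the cycle points (not in the text) reduces the product f′(x₁)f′(x₂) to the closed form −λ² + 2λ + 4; the sketch below compares this with the multiplier computed directly, at the arbitrary value λ = 3.2.

```python
def f(lam, x):
    return lam * x * (1 - x)

def fprime(lam, x):
    return lam * (1 - 2 * x)

lam = 3.2
x = 0.3
for _ in range(5000):          # converge onto the attracting 2-cycle
    x = f(lam, x)
x1, x2 = x, f(lam, x)

# stability of the 2-cycle via the chain rule: (f^2)'(x1) = f'(x1) f'(x2)
mult = fprime(lam, x1) * fprime(lam, x2)
print(mult)                    # |mult| < 1, so the 2-cycle is stable
print(-lam**2 + 2 * lam + 4)   # closed form of the same product
```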

Fig. 2.5 A period-doubling sequence



For equations such as the logistic map, we find numerically that there are a whole host of period-doubling bifurcations, as indicated by Fig. 2.5. These, for example, produce the 2ʳ-orbits. Other orbits are produced by saddle-node bifurcations, e.g. that of period 3 for λ ≈ 3.83, in which an unstable and a stable 3-cycle appear. It is then found that the stable 3-cycle period-doubles to orbits of period 3 × 2ʳ, and these period-doublings have their own accumulation point (at λ ≈ 3.8495 . . .).
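Accumulation points like these can be estimated with a few lines of code. A sketch for the main 2ʳ cascade of the logistic map (the bisection brackets below were chosen by hand, and the 'superstable' parameter values, at which the orbit of the critical point x = ½ is exactly periodic, are used as convenient stand-ins for the bifurcation values, since they accumulate at the same geometric rate):

```python
def f(lam, x):
    return lam * x * (1 - x)

def Q(n, lam):
    # f^(2^n)(1/2) - 1/2: vanishes when the critical point lies on a 2^n-cycle
    x = 0.5
    for _ in range(2 ** n):
        x = f(lam, x)
    return x - 0.5

def bisect(n, lo, hi, iters=80):
    sign_lo = Q(n, lo) > 0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if (Q(n, mid) > 0) == sign_lo:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

s0 = bisect(0, 1.5, 2.5)   # superstable fixed point: lambda = 2
s1 = bisect(1, 3.0, 3.4)   # superstable 2-cycle: lambda = 1 + sqrt(5)
s2 = bisect(2, 3.3, 3.54)  # superstable 4-cycle
delta_est = (s1 - s0) / (s2 - s1)
print(s0, s1, s2, delta_est)
```

The gap ratio is already within a few percent of the universal constant δ ≈ 4.669 discussed in the next section; deeper levels of the cascade converge further.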

2.3 Period-Doubling and the Feigenbaum Conjectures

In the late 1970s, the world of chaos was lit up by the (re-)discovery of concepts of universality applied to the bifurcations in one-dimensional maps. The promoter of these ideas was Mitchell Feigenbaum, and in this section we describe the rather beautiful way in which ideas imported from physics were applied to maps. 'Universality' is big business for physicists, which explains the popularity of the ideas at the time, although subsequently their influence has waned. We will restrict our consideration to the so-called 'unimodal' (i.e. one-humped) maps, such as the logistic map, which for the present section are defined as follows:¹⁰

Definition A unimodal map f : [−1, 1] → [−1, 1] is one for which f is continuous, f(0) = 1, and f is strictly monotone on each of [−1, 0), (0, 1].

We will, in fact, suppose in addition that f is symmetric, although this is not necessary. An example and its second iterate f² = f ∘ f are shown in Fig. 2.6. As illustrated in Fig. 2.6, f² has a single (stable) fixed point, which is the same as the fixed point of f. Now suppose f depends on a parameter μ, and is such that as μ increases, f gets steeper (or sharper). It follows that f² gets steeper also. Moreover, if f′ at the fixed point decreases through −1, we know that a period-doubling occurs. This means that f² has a pitchfork bifurcation, since in addition to the fixed point of



Fig. 2.6 A unimodal map and its second iterate

¹⁰ This fixes the maximum of f; in the following Sect. 2.4 we revert to an affinely related version such as (2.3) in which the origin is a fixed point.





Fig. 2.7 A period-doubling bifurcation is a pitchfork bifurcation for f 2

f, the 2-cycle of f consists of two points, each of which is a fixed point of f². This transition is illustrated in Fig. 2.7, where we see that the bifurcation corresponds to the passage of the slope of f² through +1.

The key idea is that as μ continues to increase, the slope of f² at the two new fixed points will increase in magnitude, and if circumstances permit, a further period-doubling (of the f² fixed points) will occur, thus producing four period-four orbits (actually, one period-four orbit, but four fixed points of f⁴). Moreover, the process of transition of f² is similar to that of f (see Fig. 2.8), provided we invert the picture and scale it down by a factor (of about 2.5, in fact). There is nothing in principle to stop this sequence of bifurcations continuing indefinitely, and indeed it is commonly found numerically that such period-doubling sequences do occur. For example, the logistic map (in the form x → 1 − μx²) has period-doubling bifurcations at μ₁ = 0.75, μ₂ = 1.25, μ₃ = 1.3680989 . . ., μ₄ = 1.3940461 . . ., and these values appear to accumulate geometrically at the value μ∞ = 1.401155 . . ., that is,

μ∞ − μₙ ∼ δ⁻ⁿ. (2.13)

Feigenbaum's achievement was the discovery that the number δ is independent of the function f, providing f has a quadratic maximum. The number δ = 4.669201609 . . . is sometimes called Feigenbaum's constant. This remarkable fact endows the process of period-doubling with the fashionable property of universality: a mechanism which is independent of precise detail. Such an idea is very attractive to physicists,

Fig. 2.8 f 2 after a second period-doubling






Fig. 2.9 Renormalisation: the central portion of f 2 resembles a rescaled, inverted version of f

though less exciting for mathematicians; it is a remarkable discovery, nonetheless, and the underlying property of self-similarity which provides the explanation has led to new concepts in applying ideas involving fractals to such phenomena as isotropic turbulence, earthquake mechanisms and the permeability of sandstones, for example.

In order to explain the universal geometric convergence, we use the ideas of renormalisation, which for the present case is based on the idea embodied in Fig. 2.9: that is to say, f resembles the central portion of f², inverted and rescaled. To be specific, define

α = −1/f²(0) = −1/f(1), (2.14)

and we suppose α > 1. Since f²(0) = −1/α, it follows that the graph of f² needs to be scaled up by a factor α to make the central portion unimodal. Equivalently, for it to be unimodal in −1 < x < 1, and since the central portion of f² lies in −1/α < x < 1/α, we require the argument to be scaled up by −α (see Fig. 2.9). Thus if we define a function f̃(x) via

f̃(x) = −αf²[−x/α]

(where α = −1/f(1)), then the idea is that f̃ 'resembles' f. The hope is that this resemblance gets better as the period-doubling progresses.

To proceed more formally, we define a map T on the space of unimodal functions as follows: the function Tf is defined by

Tf(x) = −αf²[−x/α], α = −1/f(1).

We hope Tf 'resembles' f, i.e. f is 'close' to a fixed point of T. If this is the case, we can hope to linearise about the fixed point in order to study the effect of T. We first need to check whether Tf is itself unimodal. We require α > 0 for the rescaling to be possible, and f²(1/α) < 1/α so that Tf maps [−1, 1] into [−1, 1]. Thus T in fact maps a subspace of the space of unimodal mappings (M) to M. Denote this subspace by D. We now propose, without further ado, the three Feigenbaum conjectures:



(i) T has a unique real fixed point g (in D) satisfying the Cvitanović–Feigenbaum equation

g(x) = −αg[g(−x/α)], α = −1/g(1) (2.17)

(g is in fact analytic).

(ii) The derivative DT of T has a single unstable eigenvalue δ = 4.669 . . . (for quadratic maxima of f), and the remainder of the spectrum lies within the unit circle (i.e. is contracting). It follows that there is a one-dimensional unstable manifold¹¹ of g, written Wᵘ.

(iii) If we denote by Λ₁ the space of functions in D where the first period-doubling occurs (that is, if x* is the unique fixed point in (0, 1) of f ∈ D, then f′(x*) = −1), then Wᵘ intersects Λ₁ transversely.¹²

The principal statements here are (i) and (ii), whose likelihood was shown formally by Feigenbaum, and later proved by Lanford. The transversality condition (iii) was in effect proved by Eckmann and Wittwer. First of all, we show how these conjectures can explain the universality of period-doubling sequences, and then we show formally how to calculate g and δ.

Suppose f μ(x) = f(x, μ) is a family of unimodal maps which exhibits a full period-doubling sequence. Put another way, f μ intersects all the co-dimension-one¹³ surfaces Λₙ on which the n-th period-doubling occurs. Now T maps Λₙ₊₁ to Λₙ (i.e. if f ∈ Λₙ₊₁, then Tf ∈ Λₙ), for if {xᵢ}, i = 1, . . . , 2ⁿ⁺¹, is a 2ⁿ⁺¹-cycle of f in Λₙ₊₁, then {yᵢ}, yᵢ = αx₂ᵢ, i = 1, . . . , 2ⁿ, is a 2ⁿ-cycle of Tf in Λₙ, and the stability is the same. Since Λ₁ intersects Wᵘ transversely, we have the picture shown in Fig. 2.10. Since T maps f away from the stable manifold Wˢ, but towards Wᵘ, we see that {Λᵣ} forms a sequence of surfaces which tend to Wˢ as r → ∞; moreover, Wˢ = Λ∞ defines the surface where the sequence accumulates. It immediately follows that μ∞ − μₙ ∼ δ⁻ⁿ as n → ∞.

We can go further. Since T maps f μ towards Wᵘ, and since Wˢ is invariant, we have the situation shown in Fig. 2.11. T maps f μₙ₊₁ on Λₙ₊₁ to a function on Λₙ. It is convenient to define a map which stays on Λₙ. This is T*, defined by

T* f μₙ = T f μₙ₊₁.


¹¹ A manifold is simply a space that is locally Euclidean, i.e. it looks like ℝⁿ. For example, the surface of a sphere is a two-dimensional manifold. In a function space, it is most easily conceived as a subspace of functions such as Σ₁ⁿ aᵢ fᵢ(x), or more generally a family of functions f(x, a), where for an n-dimensional manifold, a ∈ ℝⁿ. A one-dimensional unstable manifold for the fixed point g is then an invariant manifold on which the points are mapped away from g by the map T.
¹² Transverse here has an obvious geometrical meaning, and is as illustrated in Fig. 2.10. Provided we have concepts of angles and tangents, we want the angle of the tangents at the point of intersection of Wᵘ and Λ₁ to be non-zero. Angles are interpreted in function spaces by inner products, and tangents by linear operators known as Fréchet derivatives.
¹³ The co-dimension of a subspace ℝᵏ ⊂ ℝⁿ is just n − k, the dimension of the complement. More generally we talk of the co-dimension of a manifold. The use of the term becomes advantageous when, as here, we are considering subspaces of infinite-dimensional spaces.


Fig. 2.10 T maps the (n + 1)-th period-doubling family to the n-th one





Fig. 2.11 T maps the family f μ towards the unstable manifold of the Feigenbaum fixed point




T* maps functions sideways on Λₙ, and it follows that lim_{r→∞} T*ʳ f μₙ = gₙ ∈ Wᵘ exists (and is a universal function), and that gₙ → g as n → ∞.

In order to calculate g, we substitute a Taylor expansion into (2.17). One finds

α = 2.502, g = 1 − 1.527x² + 0.1048x⁴ . . . .
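A rough version of this calculation is easy to script. The sketch below truncates g at x⁴ and collocates the Cvitanović–Feigenbaum equation at the two hand-picked points x = 0.5 and x = 1 (the collocation points, starting guess and Newton setup are all illustrative assumptions; the severe truncation means α and the coefficients only roughly match the values quoted above).

```python
def g(x, c):
    # truncated ansatz g(x) = 1 + c2*x^2 + c4*x^4, normalised so g(0) = 1
    return 1.0 + c[0] * x * x + c[1] * x ** 4

def residual(c):
    alpha = -1.0 / g(1.0, c)
    # F(x) = g(x) + alpha*g(g(-x/alpha)) should vanish identically
    return [g(x, c) + alpha * g(g(-x / alpha, c), c) for x in (0.5, 1.0)]

c = [-1.527, 0.10]          # start near the quoted coefficients
for _ in range(40):         # Newton with a finite-difference Jacobian
    r = residual(c)
    h = 1e-8
    cols = []
    for j in range(2):
        cp = list(c)
        cp[j] += h
        rp = residual(cp)
        cols.append([(rp[i] - r[i]) / h for i in range(2)])
    m00, m01 = cols[0][0], cols[1][0]
    m10, m11 = cols[0][1], cols[1][1]
    det = m00 * m11 - m01 * m10
    c[0] -= (r[0] * m11 - r[1] * m01) / det
    c[1] -= (r[1] * m00 - r[0] * m10) / det

alpha = -1.0 / g(1.0, c)
print(alpha, c)             # alpha and c2 in the neighbourhood of 2.5, -1.5
```

Carrying more terms in the ansatz (and more collocation points) drives α towards 2.502 . . ..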


The calculation of δ is a little more subtle. Linearisation of T about g (write f = g + εh, ε ≪ 1, and expand to terms in O(ε)) gives the derivative operator

DT(g)h = Lh + αh(1){g − xg′},

where α = −1/g(1), and

Lh = −α[h{g(−x/α)} + g′{g(−x/α)}h(−x/α)].

We require h(0) = 0, h′(0) = 0 in order that g and g + εh both be unimodal. The eigenvalues of DT can be related to those of L, as follows. Suppose

Lψ = δψ, (2.22)



and choose ψ(0) = 1 (and ψ′(0) = 0) without loss of generality. It is an easy exercise to show that

Lψ|ₓ₌₀ = δ = −α[ψ₁ + g₁′], (2.23)

where ψ₁ = ψ(1), g₁′ = g′(1). Also g(1) = −1/α, and applying L'Hôpital's rule to the derivative of (2.17) as x → 0, we get g₁′ = −α; therefore

ψ₁ = −(δ/α) + α.

It is also straightforward to show that

L[g − xg′] = g − xg′,

associated with the fact that gμ(x) = μg(x/μ) also satisfies the Cvitanović–Feigenbaum equation for any μ. Now define

h = ψ − (g − xg′),

and note h(0) = h′(0) = 0. We find, with a little effort,

DT(g)h = δh.

It therefore follows that the eigenvalues of DT can be calculated from those of L. One then finds (using Taylor expansions) that

ψ ≈ 1 − 0.3256x² − 0.00505x⁴, δ = 4.66920 . . . .


2.3.1 Bifurcation Trees

It is instructive to plot trajectories of a difference equation, such as the logistic map, as points in (x, μ) space. This can easily be done on a computer, and gives the typical diagram shown in Figs. 2.12 and 2.13. As μ varies, we see successive bifurcations from the fixed point to a period-2 cycle, then to a 4-cycle, and eventually to (apparently) chaotic trajectories at μ ≈ 1.4. The chaotic region coincides with the limit point of the period-doubling bifurcation values μₙ. It does not fill the unit interval but gradually grows as μ increases.

Various features are apparent in the chaotic régime, and these are associated with further bifurcations. Most noticeable are the 'periodic windows' in which the chaos disappears, to be replaced by periodic orbits. The biggest window is the period-three window at about μ ≈ 7/4, but others are visible. A useful exercise is to blow up the régime map by magnifying the scale of x and μ. Figure 2.14 shows a magnified region

Fig. 2.12 Logistic bifurcation tree for xₙ₊₁ = λxₙ(1 − xₙ)

Fig. 2.13 Logistic bifurcation tree for y → 1 − μy² (obtained from Fig. 2.12 by x = ½ + ½(½λ − 1)y, λ = 1 + (1 + 4μ)^{1/2})

Fig. 2.14 A close up of Fig. 2.13 revealing the period-three window, as well as the unstable period-three orbit


Fig. 2.15 A further close up of Fig. 2.14 showing the saddle-node bifurcation of the period-three orbit in more detail




near x = 0 and μ = 1.75. We see that there appears to be a tangency of the period-three orbit just where the chaotic region ends. This tangency effects a crisis in the following way. The period-three orbit is annihilated as μ decreases in a saddle-node bifurcation with an unstable period-three orbit, shown dotted in Fig. 2.14, and further magnified in Fig. 2.15. As μ increases, the stable period-three orbit undergoes period-doubling, to periods 6, 12, . . . , 3 × 2ⁿ . . .. In a sense, these orbits are generated by the 'production' of the unstable period-three orbit (at μ = 4). The crisis is precipitated by the collision, as μ decreases, of the unstable orbit with the chaotic attractor which exists just below the saddle-node bifurcation. Such bifurcations occur ubiquitously.

Figure 2.12 reveals a sequence of dark lines, also revealed in Fig. 2.13, whose behaviour seems related to the bifurcations exhibited by the map. These form what has been called the 'skeleton' of the bifurcation diagram by Hao Bai-Lin, and we can explain their presence quite simply. For a unimodal map xₙ₊₁ = f(xₙ, μ), the 'stability' of any string {x₁ . . . xₚ} (not necessarily periodic) is given by the product ∏₁ᵖ f′(xᵢ). In particular, if x₁ = 0 (where f′ = 0), then ∏₁ᵖ f′(xᵢ) = 0, and the string is 'superstable'. What this means is that if the behaviour is chaotic, then points which land close to zero will stay closer to subsequent iterates of zero for longer than other trajectories remain close to each other. Therefore, in a chaotic region, we expect to see iterates of zero (but not zero itself) occur more frequently than other points. Let us define the functions Pₖ(μ) as the successive iterates of zero: thus

P₁(μ) = f(0, μ), . . . , Pₙ(μ) = f[Pₙ₋₁(μ), μ]. (2.30)


These define curves in (x, μ) space, and these curves form the skeleton of the bifurcation diagram. For example, if f = 1 − μx², we have

P₁ = 1, P₂ = 1 − μ, P₃ = 1 − μ + 2μ² − μ³,
P₄ = 1 − μ + 2μ² − 5μ³ + 6μ⁴ − 6μ⁵ + 4μ⁶ − μ⁷,
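These polynomials are just iterates of zero, so they are easy to generate and check. The sketch below compares direct iteration of zero with the expanded polynomial P₄ = 1 − μ + 2μ² − 5μ³ + 6μ⁴ − 6μ⁵ + 4μ⁶ − μ⁷ (the sample values of μ are arbitrary choices):

```python
def f(mu, x):
    return 1 - mu * x * x

def P(n, mu):
    # P_n(mu): the n-th iterate of zero, P_1 = f(0) = 1
    x = 0.0
    for _ in range(n):
        x = f(mu, x)
    return x

def P4_poly(mu):
    return (1 - mu + 2 * mu**2 - 5 * mu**3 + 6 * mu**4
            - 6 * mu**5 + 4 * mu**6 - mu**7)

for mu in (0.3, 0.9, 1.3, 1.9):
    assert abs(P(4, mu) - P4_poly(mu)) < 1e-10
print(P(4, 1.3), P4_poly(1.3))   # two evaluations of the same curve P4
```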




Fig. 2.16 Successive iterates of zero, defined by (2.30)

Fig. 2.17 As in Fig. 2.16, but with the bifurcation tree superimposed

and so on. Some of these curves are shown in Fig. 2.16, and Fig. 2.17 shows them superimposed on the bifurcation tree, which shows the relation between them and the periodic windows.

It is instructive, for example, to consider the period-three window. Figure 2.18 shows the graph of f³ for μ just lower than the value μc where the period-three orbit comes into existence. As μ increases, the troughs dip down below the 45-degree line, and a pair of orbits (one stable, one unstable) is created, as in Fig. 2.13. For μ slightly below μc, the chaotic attractor spends most of its time near the incipient period-three orbit. The trajectories become intermittently chaotic, as indicated in Fig. 2.19, and in particular, the iterates of the minima of f³ are all close to the period-three orbit. It follows that for μ near μc many of the curves Pₙ pass close to the fixed points of f³.

For a slightly higher value of μ, the minima of f³ become fixed points themselves, i.e. the period-three orbit becomes superstable, and at this point P₃(μ) = P₆(μ) = P₉(μ) = . . .. It follows that such points of confluence of the functions Pₙ can be

Fig. 2.18 Third iterate of the logistic map y → 1 − μy², for μ = 1.74







Fig. 2.19 Close up of Fig. 2.18: intermittent chaos







approximately associated with the onset of periodic windows via the creation of a pair of orbits in a saddle-node bifurcation and the simultaneous crisis as these fixed points collide with the chaotic attractor. Such approximation becomes better for higher-period orbits, since fⁿ becomes steeper, and the parameter values where (fⁿ)′ = 1 and (fⁿ)′ = 0 become very close. It is not difficult to show that the {Pₖₙ} curves associated with saddle-node crises are tangential.

Other types of intersection might be termed 'stars'; the most prominent in Fig. 2.12 is that associated with the merging of two separate pieces of the attractor. To understand this, consider Fig. 2.20. The two pieces of chaotic attractor are associated with period-doubling sequences associated with each fixed point of f². The period-doubling sequence reaches its limit before the separate regions merge. It is clear that the merger of the two attracting sets is associated with an orbit which connects the turning points to the (unstable) fixed point. Specifically, f³(0) = x* is the fixed point, so that this bifurcation is associated with the value of μ for which P₃(μ) = P₄(μ) = P₅(μ) = . . ..



Fig. 2.20 The transition from a two-piece attractor to a one-piece attractor

2.4 Symbolic Dynamics: The Kneading Sequence

The ideas used previously in discussing the dynamics of the tent map can be very elegantly generalised to describe the progress from order to chaos as the bifurcation parameter μ increases. However, it is now convenient to revert to a definition of a unimodal map as a one-humped function on the unit interval, with a fixed point at the left:

Definition A unimodal map f : [0, 1] → ℝ is one for which f is continuous, f(0) = f(1) = 0, and f is strictly monotone on each of [0, c), (c, 1], where c ∈ (0, 1).

We will continue to use μ as the bifurcation parameter. Note that the unit interval is invariant if f(c) ≤ 1, and we assume this. The logistic map takes f(x) = μx(1 − x), with 0 ≤ μ ≤ 4.

There are two main combinatorial problems. Firstly, since for small μ there is a globally attracting fixed point, but for μ = 4 (in the logistic map) periodic orbits of all periods exist, how do these orbits come into existence? As we have seen, the answer is essentially that periods of order k × 2ⁿ are generated by period-doubling, while basic odd periods can be generated by saddle-node bifurcations. But we want to know more than this. What order do the periodic orbits occur in? Moreover, how many orbits of any period p are there?

We define the itinerary of a point x as the infinite sequence of symbols which indicates whether fⁿ(x) is less than, greater than, or equal to the turning point c. More precisely,

S(x) = s = s₀s₁s₂ . . . , (2.31)

where

sⱼ = 0 if f ʲ(x) < c, sⱼ = 1 if f ʲ(x) > c, sⱼ = C if f ʲ(x) = c. (2.32)



Although we have used 0 and 1, the sequence s should be thought of as a lexicographic string, not a number, though we retain the ordering.14 Next, we define the kneading sequence, which is the itinerary of f(c):

K(f) = S[f(c)].


The aim is to identify points with their itineraries, in the same sort of way as we were able to do for the tent map. This is facilitated by a natural ordering (denoted ≺) on the set of itineraries. Let s = (s0s1s2 . . .) and t = (t0t1t2 . . .) be two itineraries. Suppose si = ti for 0 ≤ i ≤ n − 1 but sn ≠ tn, and let τn(s) be the number of 1s in s0, s1, . . . , sn. Then we define s ≺ t if either

sn < tn and τn−1(s) is even; or sn > tn and τn−1(s) is odd;


otherwise s ≻ t. In this definition, we take τ−1 = 0, i.e. if s0 ≠ t0 then the ordering is that of the first symbol. We take 0 < C < 1 in this definition. It is an awkward but trivial exercise to show that this defines a complete ordering on the set of itineraries. Moreover, this is the same as that on the real line, due to the following

Theorem If S(x) ≺ S(y), then x < y; if x < y, then S(x) ⪯ S(y).
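The parity rule in this ordering can be implemented directly. The sketch below (the function `compare` and the symbol ranking are our own illustrative choices) compares two finite itinerary prefixes, flipping the symbol order whenever an odd number of 1s has been seen:

```python
# Sketch of the itinerary ordering: symbols ranked 0 < C < 1, with the
# comparison reversed when the count of preceding 1s is odd.
# Returns -1 (s before t), 0 (equal prefixes), +1 (s after t).

RANK = {'0': 0, 'C': 1, '1': 2}

def compare(s, t):
    ones = 0                      # number of 1s among s_0 .. s_{n-1}
    for a, b in zip(s, t):
        if a != b:
            less = RANK[a] < RANK[b]
            if ones % 2 == 1:     # odd parity reverses the order
                less = not less
            return -1 if less else 1
        if a == '1':
            ones += 1
    return 0

# consistent with the theorem: for the logistic map at mu = 4,
# x = 0.1 has itinerary starting "00", x = 0.2 starts "01", and 0.1 < 0.2;
# x = 0.9 starts "10", x = 0.6 starts "11", and 0.9 > 0.6.
print(compare('00', '01'), compare('10', '11'))  # -1 1
```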


The equality in the second part (which follows directly from the first) cannot be removed, since if f has a stable periodic orbit, then there will generally be an interval of points with identical itineraries. The proof of the (first part of) the theorem is by induction on n, the discrepancy between s and t.15 Let us assume S(x) = s ≺ S(y) = t. If the discrepancy n = 0, then we have directly s0 < t0 , thus x < y in view of (2.32). Now suppose the theorem is true for all sequences with discrepancies less than n, and suppose that S(x) = s ≺ S(y) = t, where s and t have discrepancy n > 0. Since S[ f (x)] = s1 s2 . . . and S[ f (y)] = t1 t2 . . ., these two sequences have discrepancy n − 1. There are three cases to consider. If s0 = t0 = 0, the number of ones in s0 . . . sn is the same as in s1 . . . sn , and therefore in this case S(x) ≺ S(y) implies S[ f (x)] ≺ S[ f (y)], which by the inductive assumption implies f (x) < f (y); but since s0 = t0 = 0 implies x < c, y < c, where f is increasing, then this implies x < y.

14 In his book, Sparrow (1982) uses the symbols L, R instead of 0, 1.

15 The discrepancy n ≥ 0 is the first integer in the sequences (2.31) for S(x) = s and S(y) = t for which sn ≠ tn.



Alternatively, if s0 = t0 = 1, then the number of ones in s0 . . . sn is one more than in s1 . . . sn, and therefore in this case S(x) ≺ S(y) implies S[f(x)] ≻ S[f(y)], which by the inductive assumption implies f(x) > f(y); but since s0 = t0 = 1 implies x > c, y > c, on which f is decreasing, then we again obtain x < y. The final possibility is that s0 = t0 = C, but this implies x = y = c, and thus S(x) = S(y), a contradiction. Thus we have proved the inductive step, and hence the theorem is proved.

Now, amongst all possible sequences s, we want to identify those which are the itineraries of some point x. We call such sequences admissible. The admissible sequences will then basically tell us how many periodic (and aperiodic) orbits there are. Notice that we have f(x) ≤ f(c) ≤ 1 for all x ∈ [0, 1]. If we denote the shift map on symbolic sequences (s0s1s2 . . .) by

σ(s0s1s2 . . .) = (s1s2s3 . . .),


then we have that

S[f(x)] = σ[S(x)].

It follows that σⁿ[S(x)] = S[fⁿ(x)], and fⁿ(x) ≤ f(c), whence by the Theorem (2.35),

σⁿ[S(x)] = S[fⁿ(x)] ⪯ S[f(c)] = K(f), for any n.

Thus any admissible sequence must satisfy the above criterion. Remarkably, the converse is also (almost) true. In fact, we have the

Theorem (Milnor–Thurston) If c is not periodic,16 and if σⁿ(t) ≺ K(f) ∀ n ≥ 1, then ∃ x with S(x) = t.

To prove this, first note that if v = C . . ., then σ(v) = K(f). It follows from this that if t satisfies the conditions of the theorem, then tj ≠ C for all j (otherwise σ^{j+1}(t) = K(f)). Next, we use an overbar to signify a repeating sequence, thus 0̄ = 000 . . ., 10̄ = 1000 . . ., and 1̄0̄ = 101010 . . ., etc. Since f(0) = f(1) = 0, it follows that

S(0) = 0̄, S(1) = 10̄.


Since both 0̄ and 10̄ are itineraries of points (0 and 1), we can assume that t ≠ 0̄, 10̄. We then have

0̄ ≺ t ≺ 10̄.   (2.39)

The first of these follows from the fact that (since we take t ≠ 0̄) there is a first non-zero value tk, so that the discrepancy is k and the number of 1s in the first k − 1 symbols is zero, thus even, and 0 < tk. The second follows similarly: if t0 ≠ 1, then the discrepancy is zero, the number of preceding 1s is zero, thus even, and

16 Thus we are away from points of intersection of Pn and Pm (see (2.30) or Fig. 2.16).



t0 < 1, while if t0 = 1 and tk is the first value for k > 0 such that tk ≠ 0, then the discrepancy is k and the number of 1s is odd, and tk > 0.

Now let Lt = {x | S(x) ≺ t} and Rt = {x | S(x) ≻ t}. Both these sets are nonempty, since 0 ∈ Lt and 1 ∈ Rt, and their intersection is clearly empty. We shall show that they are open sets. It will then follow that the set {x | S(x) = t} = [0, 1] \ (Lt ∪ Rt) is closed and non-empty, which proves the theorem.17

We shall show that Lt is open; the argument for Rt is similar. Let v = v0v1v2 . . . be such that vj ≠ C for j = 0, . . . , n, and denote Un(v) = {x | S(x) = v0 . . . vn wn+1 . . .}, where wk is arbitrary for k > n. This set is open, since small perturbations to x ∈ Un(v) leave the iterates f^j(x + δx), j = 0, . . . , n on the same side of c as f^j(x). Now suppose we select z ∈ Lt so that S(z) = s ≺ t. Let the discrepancy of s and t be n, and suppose for example tn = 1 (the case tn = 0 is similar), so that sn = 0 or C. Since s ≺ t and sn < tn in this case, there must be an even number of 1s in s0, . . . , sn−1. If sn = 0, then z ∈ Un(s), and all x ∈ Un(s) have itinerary S(x) with the same discrepancy with t as s, and so satisfy S(x) ≺ t. Thus z ∈ Un(s) ⊂ Lt.

Alternatively, suppose that sn = C. Then since c is assumed not to be periodic, we have that sk ≠ C for k > n. Furthermore, σ^{n+1}(s) = K(f). It must therefore be the case that there is some minimum k ≥ 1 such that sn+k ≠ tn+k, for otherwise σ^{n+1}(t) = σ^{n+1}(s) = K(f), contradicting our assumption. Let there be an even number of 1s in tn+1, . . . , tn+k−1 (the odd case is similar); then we must have tn+k = 0, sn+k = 1 in order that σ^{n+1}(t) = tn+1 . . . ≺ σ^{n+1}(s) = K(f). Now let W be the set {x | S(x) = s0 . . . sn−1 ξ sn+1 . . . sn+k wn+k+1 . . .}, where wn+j is arbitrary for j > k, and also ξ = 0, C or 1. In particular, z ∈ W, and small perturbations of x ∈ W remain in W, so that W is open.
If ξ = 0 or C, it is immediate that S(x) ≺ t for x ∈ W , since the discrepancy is n and we have supposed an even number of 1s in the first n − 1 elements of t. On the other hand, if ξ = 1, then the first n + k − 1 elements of S(x) and t coincide, and further, there are an odd number of 1s in the first n + k − 1 elements (the even ones in s0 . . . sn−1 , the even ones in sn+1 . . . sn+k−1 and the odd one at sn ); furthermore, by definition of W , S(x)n+k = 1 and tn+k = 0. Therefore, we have that S(x) ≺ t in this case also, and so z ∈ W ⊂ L t . Since both Un (s) and W are open sets, it follows that L t is open. The various alternatives are treated similarly, and this concludes the proof.  It follows from this that the existence and number of periodic orbits are determined by the kneading sequence. Bifurcations in the set of periodic orbits, therefore, occur when the kneading sequence changes, and this happens precisely at values of μ where c is periodic—i.e. at the Pn intersections previously referred to. Using this special ordering, one can deduce the progressive appearance of periodic orbits. We do no more here than give the theorem due to Šarkowskii (1964):

17 We have strayed deep into the land of pure mathematics. Only a pure mathematician would imagine proving a constructive result by showing that a set is open.



Order the integers as follows:

1 ◁ 2 ◁ 4 ◁ . . . ◁ 2^k ◁ 2^{k+1} ◁ . . . ◁ 2^{k+1}(2l + 1) ◁ 2^{k+1}(2l − 1) ◁ . . . ◁ 2^{k+1}·5 ◁ 2^{k+1}·3 ◁ . . . ◁ 2^k(2l + 1) ◁ 2^k(2l − 1) ◁ . . . ◁ 2^k·3 ◁ . . . ◁ 2·5 ◁ 2·3 ◁ . . . ◁ (2l + 1) ◁ (2l − 1) ◁ . . . ◁ 5 ◁ 3.

If f is a continuous map of an interval into itself with a periodic point of period p, then f has periodic points of period q for all q ◁ p. In particular, period three implies periods of all orders. In terms of admissible sequences, this involves finding periodic sequences s such that σⁿ(s) ≺ K(f) = 10C (the block 10C repeating; if we take the superstable period-three orbit18 for example); in this case the Milnor–Thurston theorem does not immediately apply, since c is periodic, but it does if one adds the extra condition (in this case) that σⁿ(s) ≺ 101 (repeating).19
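The Šarkovskii ordering can be encoded as a sortable key. The sketch below (function names are ours) writes n = 2^k·m with m odd: powers of two come first in increasing order, and the remaining integers follow with larger power-of-two factor first and, within a family, larger odd part first:

```python
# Sketch of the Sarkovskii ordering above as a key: n = 2^k * m, m odd.

def shark_key(n):
    k = 0
    while n % 2 == 0:
        n //= 2
        k += 1
    m = n
    if m == 1:
        return (0, k)          # 1, 2, 4, 8, ... come first
    return (1, -k, -m)         # then 2^k * odd, ending at ... 7, 5, 3

def precedes(q, p):            # q comes before p in the Sarkovskii order
    return shark_key(q) < shark_key(p)

# period three is maximal: every other period precedes it
print(all(precedes(q, 3) for q in range(1, 100) if q != 3))  # True
```

With this key, `precedes(q, p)` being true for all q ≠ p is exactly the statement "period p implies period q" of the theorem.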

2.5 Notes and References

2.5.1 The Feigenbaum Conjectures

Mitchell Feigenbaum came to M.I.T. to give a seminar in about 1980 on his recent work on period-doubling (Feigenbaum 1978, 1979). The room was full. And it was a big lecture theatre: four hundred people had turned up. At the time, the idea of universality in physical systems seemed to provide a route to understanding turbulence in fluid mechanical systems (we will see more of this in Chap. 6). Certainly, there was an influx of physicists and analysts (with inertial manifold theory) into the classical territory of fluid dynamics, and there was some friction as a consequence. Less attention is now paid to the Feigenbaum period-doubling cascade. The key to the publicity attached at the time to the Feigenbaum ‘route to chaos’ lies in the glamour of physics. The idea of ‘universality’, and the use of the fashionable ideas of renormalisation, guaranteed an immediate and well-publicised success for Feigenbaum’s ideas. There is no doubt that physicists’ enthusiasm for the concepts has provided a major advance in our understanding of physical processes. But nor is there any doubt that the details of fluid turbulence are more complicated than their

18 A superstable periodic orbit {xi} is one which passes through c, since then the quantity ∏ f′(xi) = 0.
19 See Devaney (1986), remark 2 on p. 146, for example.




description by one-dimensional maps, and the hyperbole of those days is no longer with us. The proofs of the Feigenbaum conjectures mentioned in the text are due to Lanford (1982) and Eckmann and Wittwer (1987).

2.5.2 Kneading Theory

The description of the kneading theory, and indeed much of this chapter, follows the exposition in the book by Devaney (1986). It must be said that this book is very well written, making what is essentially pure mathematics comprehensible even to an applied mathematician. The proof of the Milnor–Thurston theorem which we give here follows that of Devaney, but his proof (p. 145) has been elaborated in the present text.

2.5.3 Period-Doubling in Experiments

A good deal of the excitement caused by the ‘period-doubling route to chaos’ was due to the fact that it was observed in various fluid dynamical experiments. A summary of some of these is given in the monograph by Argyris et al. (2015). Among the early experimentalists, Albert Libchaber and Harry Swinney were prominent figures. Libchaber’s work focussed on convection experiments with liquid helium and also mercury in a tiny box, while Swinney was more concerned with the transition to turbulent flow observed in Taylor–Couette flow. These latter experiments (e.g. Gollub and Swinney 1975), involving the shear flow induced by the differential motion caused by rotation of the inner of two concentric cylinders, found a sequence consisting of Hopf bifurcation and secondary Hopf bifurcation, before the appearance of broadband noise signalled the onset of chaotic motion. This is consistent with the transitions described in Chap. 3. Libchaber, on the other hand, found transitions associated with period-doubling. Some of this work is reviewed by Libchaber (1983).

2.6 Exercises

2.1 Give a formal justification that a fixed point x∗ is stable if |f′(x∗)| < 1. (That is, prove that if f is continuously differentiable and x0 is close enough to x∗, then xn → x∗ as n → ∞.) Show also that the stability of the periodic orbit {x1, . . . , xp} of f is determined by ∏_{j=1}^{p} f′(xj).
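The product criterion in exercise 2.1 can be checked numerically. The following hedged sketch (the parameter choice μ = 3.2 and the helper names are ours) settles onto the attracting 2-cycle of the logistic map and evaluates the multiplier ∏ f′(xj):

```python
# Sketch: multiplier of the period-2 orbit of the logistic map at mu = 3.2.
# Stability requires |f'(x1) * f'(x2)| < 1.

def f(x, mu):
    return mu * x * (1.0 - x)

def fprime(x, mu):
    return mu * (1.0 - 2.0 * x)

mu = 3.2
x = 0.1
for _ in range(1000):          # settle onto the attracting 2-cycle
    x = f(x, mu)
x1, x2 = x, f(x, mu)

multiplier = fprime(x1, mu) * fprime(x2, mu)
print(round(multiplier, 6))    # 0.16, and |0.16| < 1: the 2-cycle is stable
```

For this family the multiplier of the 2-cycle is known in closed form, −μ² + 2μ + 4, which equals 0.16 at μ = 3.2, so the numerical product can be checked exactly.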



2.2 Use the implicit function theorem20 to show that if x → f(x, ε) is a smooth (C¹) map, and x∗ = f(x∗, 0) with f′(x∗, 0) ≠ 1, then for sufficiently small ε, there is a fixed point x(ε) of f with x(0) = x∗. Deduce (with some care!) that the map θ → f_ε^s(θ) = θ + sε sin sθ + O(ε²) (mod 2π) defined on the unit circle, i.e. θ ∈ [0, 2π], has 2s fixed points for small ε, where the O(ε²) term is a C² perturbation. [The notation g ∈ Cⁿ means the function g and its first n derivatives are continuous. Hint: consider, in the statement of the implicit function theorem, the functions g(x, ε) = f(x, ε) − x, g(θ, ε) = (f_ε^s(θ) − θ)/ε.]

2.3 By expanding in Taylor series in the vicinity of the bifurcation point, show in detail that a period-doubling bifurcation generates a single periodic orbit, and give a criterion for its stability in terms of the Schwarzian derivative at the fixed point,

S f = f‴/f′ − (3/2)(f″/f′)².

2.4 Classify all bifurcations of fixed points (stating the bifurcation parameter values) of the following maps: (i) x → μ − x²; (ii) x → μx(1 − x); (iii) x → μx − x³. Sketch the local bifurcation diagrams.

2.5 If x → f(x, μ) = x[μ/{x² + (μ − x²)e^{−4πμ}}]^{1/2}, show that if μ < 0, f has a stable fixed point x = 0, and if μ > 0 then f has an unstable fixed point x = 0, and two stable fixed points x = ±√μ. Find fⁿ(x, μ) (the notation fⁿ denotes the n-th iterate of f). What are the domains of attraction of these stable fixed points?

2.6 Draw a graph of F(x) = 4x³ − 3x on the interval [−1, 1]. Show that the map x → F(x) has three fixed points, and examine their stability. By using a suitable trigonometric transformation, show that the map is chaotic, and construct a suitable symbolic representation for orbits of F. [Hint: write x as a ternary fraction.] Use the symbolic representation of x to find how many fixed points and period-2 cycles there are, and verify your answer using the map.
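For exercise 2.6, the key facts are easy to verify numerically. The sketch below (an illustration, not the exercise's full solution) checks the semiconjugacy F(cos t) = cos 3t, which is the trigonometric transformation the exercise points to, and finds the three fixed points:

```python
# Sketch for exercise 2.6: F(x) = 4x^3 - 3x satisfies F(cos t) = cos(3t),
# so x = cos(pi * a) with a a ternary fraction turns F into a digit shift.

import math

def F(x):
    return 4.0 * x**3 - 3.0 * x

# check the trigonometric semiconjugacy at a few sample angles
for t in (0.1, 0.7, 2.3):
    assert abs(F(math.cos(t)) - math.cos(3.0 * t)) < 1e-12

# fixed points solve F(x) = x, i.e. 4x^3 - 4x = 0, so x = -1, 0, 1;
# |F'(x)| = |12x^2 - 3| exceeds 1 at each of them, so all are unstable
fps = sorted(x for x in (-1.0, 0.0, 1.0) if F(x) == x)
print(fps)  # [-1.0, 0.0, 1.0]
```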
20 A full statement of the implicit function theorem is coming up shortly, at the beginning of Chap. 3. In the present context, it says that if the function g(x, ε) satisfies g(x∗, 0) = 0 and has non-zero derivative (with respect to x) at the same point, and is smooth (continuously differentiable), then for small y, the equation g(x, ε) = y defines a smooth function x = X(y, ε); see, e.g., Devaney’s book.



2.7 Let f : I → I be a unimodal map, let K(f) be non-repeating, and let s = s0s1 . . . sn−1 be a periodic sequence satisfying σⁱ(s) ≺ K(f) ∀ i. Assuming that J = {z ∈ I | S(z) = s} is non-empty and closed, show that

(i) fⁿ(J) ⊂ J;
(ii) if J is a single point, J = {x}, then x generates a period n orbit;
(iii) otherwise, J = [a, b] is a closed interval; and then:
(iv) fⁱ(x) ≠ c ∀ x ∈ J, ∀ i;
(v) fⁿ(x) is strictly monotone on J;
(vi) fⁿ(a) = a or b, fⁿ(b) = a or b;

deduce that f has a periodic point x of period n with S(x) = s.

2.8 (i) Fermat's little theorem states that p | aᵖ − a (p divides aᵖ − a) if p is prime and a is a positive integer. Equivalently, using arithmetic modulo p (where r ≡ s mod p means that r = s + kp for some integer k ∈ Z), aᵖ ≡ a mod p. If p | a, the result is obvious, so assume p ∤ a. Show that the sequence {ra, r = 1, 2, . . . , p − 1} is a permutation of {1, 2, . . . , p − 1} modulo p. Hence deduce the theorem by multiplying the elements of each set.

(ii) The logistic map is given by z → f(z) = 1 − μz², where z and μ are in general complex. Show that the nth iterate fⁿ(z) is a polynomial of degree 2ⁿ. Deduce that if q is prime, the number of period-q orbits is (2^q − 2)/q, explaining why this is an integer. Deduce the number of period-3, period-5 and period-7 orbits. What is the number of period-4 orbits?

(iii) Write down the eighth degree polynomial corresponding to f³(z) − z, and by factorising it, show that the sixth degree polynomial whose roots correspond to the period-3 orbits is given by

p6(ζ, μ) = a0 − a1ζ + a2ζ² + a3ζ³ − a4ζ⁴ − ζ⁵ + ζ⁶,

where ζ = μz and

a0 = 1 − μ(1 − μ)², a1 = (1 − μ)², a2 = 1 − 3μ + 3μ², a3 = −(1 − 2μ), a4 = −(1 − 3μ).

Now note from Figs. 2.14 and 2.15 that the bifurcating period-3 orbit at μ = 1.75 is approximately at z = 0.0314, together with its first and second iterate, and thus ζ ≈ 0.055. Corroborate this by plotting p6(ζ, 1.75) and showing that it has three double roots.



Since one of the roots is small and close to zero, a perturbation method is suggested. Calculate the values of a0, a1 and a2 at μ = 1.75 and show that a0 ≪ 1, a1 ∼ 1, and the discriminant of the quadratic formed by the first three terms, Δ = a1² − 4a0a2, is small. Define

ε = a1/(2a2), β0 = 2a2Δ/a1³, β3 = a3/a2, β4 = a4/a2, b = 2/a1,

and calculate the values of these constants when μ = 1.75. Show that if we define ζ = εX, the scaled polynomial

p(X) = (2/(εa1)) p6 = (1 − X)² − εβ0 + εβ3X³ − ε²β4X⁴ − ε⁴bX⁵ + ε⁵bX⁶,

and hence show that the real values of z = x for μ > 1.75 are approximately

x = (ε/μ)[1 ± {ε(β0 − β3)}^{1/2}].   (∗)


Plot β0 as a function of μ for 1.75 < μ < 2, and explain why it becomes so large. Figure 2.21 shows a comparison of the approximation in (∗) with the exact solution for 1.75 < μ < 2. By consideration of the variation of the other constants with μ, explain why the approximation is so good, given the largeness of β0 . Plot an improved correction including β4 . Is it better? Why, or why not?
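Before attempting the perturbation expansion, it is worth checking numerically that z ≈ 0.0314 really is close to a period-3 point at μ = 1.75. A minimal sketch (our own check, not part of the exercise's solution):

```python
# Sketch: verify that at mu = 1.75 the point z ~ 0.0314 quoted in the text
# is nearly period-3 for f(z) = 1 - mu*z^2 (the saddle-node of the 3-cycle).

def f3(z, mu):
    for _ in range(3):
        z = 1.0 - mu * z * z
    return z

z0 = 0.0314
err = abs(f3(z0, 1.75) - z0)
print(err < 1e-3)  # True: z0 is very nearly a period-3 point
```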

Fig. 2.21 A comparison of the approximate root in (∗) with the exact pair of roots of f 3 (x) = x near x = 0. Solid line: exact roots; circles: approximation



2.9 Let Σ2 be the sequence space of symbols 0 or 1, s = s0s1s2 . . . ∈ Σ2, si = 0 or 1. Define

d(s, t) = Σ_{i=0}^{∞} |si − ti| / 2^i.

Show that d is a metric for Σ2, i.e. d(s, t) ≥ 0, d(s, t) = 0 iff si = ti ∀ i, d(s, t) = d(t, s), d(r, s) + d(s, t) ≥ d(r, t). Show that the shift map σ : Σ2 → Σ2 defined by σ(s0s1s2 . . .) = (s1s2s3 . . .) is continuous.

2.10 If s is a finite sequence of symbols s0 . . . sn, let us identify s with the periodic sequence s̄. For such a periodic sequence, we define the maximal sequence to be M(s) = σʲ(s) such that σⁱ(s) ≺ σʲ(s) for all i ≠ j (we assume s has no periodic subsequences). Finally we define the concatenation of two periodic sequences s and t to be the periodic sequence with repeating block st.

Consider now periodic sequences containing only 0s and 1s. If s = s0 . . . sn, define ŝ = s0 . . . sn−1ŝn, where ŝn ≠ sn. Let the sequences τ0, τ1, τ2, . . . be defined inductively by τ0 = 1, τj+1 = τjτ̂j. Write down the first few such sequences. Show that

(i) τj has period 2ʲ;
(ii) τj has an odd number of 1s;
(iii) τ0 ≺ τ1 ≺ τ2 ≺ . . .;
(iv) τj is a maximal sequence. [Hint: use induction.]
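The metric of exercise 2.9 can be explored on finite prefixes. A hedged sketch (truncated sums and helper names are ours): sequences agreeing on a long prefix are close, and one application of the shift moves points apart by at most a factor of 2, which is why σ is continuous.

```python
# Sketch for exercise 2.9: the metric d(s, t) = sum |s_i - t_i| / 2^i,
# evaluated on finite prefixes, and the Lipschitz-like bound for the shift.

def d(s, t):
    return sum(abs(int(a) - int(b)) / 2**i for i, (a, b) in enumerate(zip(s, t)))

def shift(s):
    return s[1:]

s = '01101001'
t = '01101000'          # differs only in the last retained digit
print(d(s, t))          # 1/2^7 = 0.0078125

# one shift at most doubles the distance
assert d(shift(s), shift(t)) <= 2 * d(s, t)
```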

Suppose that f(x; λ) is a family of maps such that K(f) = 0̄ when λ = λ0 and K(f) = 10̄ when λ = λ1 > λ0, and that K(f) increases monotonically with λ. (The logistic map with λ0 = 0 and λ1 = 4 is such a family.) Deduce that the period-2^r orbits of such a family occur in the order 2 ◁ 2² ◁ . . . ◁ 2^k ◁ 2^{k+1} ◁ . . .. Show also that if t is any periodic sequence, ti ≠ C, t ≠ 0̄, t ≠ τj ∀ j, then M(t) ≻ τj ∀ j, where M(t) is the maximal sequence of t. Hence show that all period-2^r orbits appear before any odd period orbits.
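The inductive construction of the τ sequences in exercise 2.10 can be written out directly (a sketch; `flip_last` and `tau` are our own names):

```python
# Sketch of exercise 2.10's construction: tau_{j+1} is tau_j followed by
# tau_j with its final symbol flipped.

def flip_last(s):
    return s[:-1] + ('0' if s[-1] == '1' else '1')

def tau(j):
    s = '1'
    for _ in range(j):
        s = s + flip_last(s)
    return s

for j in range(4):
    print(j, tau(j))
# 0 1
# 1 10
# 2 1011
# 3 10111010

# each tau_j has length 2^j and an odd number of 1s, as in parts (i), (ii)
assert all(len(tau(j)) == 2**j and tau(j).count('1') % 2 == 1 for j in range(8))
```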

Chapter 3

Hopf Bifurcations

3.1 Hopf Bifurcation Theorem

In the previous chapter, we saw how chaos can be understood naturally in trajectories of mappings, and in particular, one-dimensional maps. Much of the understanding of the transition to chaos consists of the description of the various bifurcations which create periodic orbits, and this remains true when we turn our attention to differential equations, as we do now. The simplest1 such bifurcation is that associated with the name of Hopf, although arguably the idea stems from earlier work by Andronov in the Soviet Union, and before that, Poincaré. In order to fix ideas, we will consider a system of ordinary differential equations,

ẋ = f(x, μ), x ∈ Rⁿ, μ ∈ R,   (3.1)


where μ is a real-valued parameter. There is a close connection between differential equations and difference equations. Indeed, (3.1) defines a flow φt(x), which satisfies the properties

φ0(x) = x, φs ◦ φt = φs+t   (3.2)

for all s, t ≥ 0, and which gives the map along trajectories from time 0 to time t.2 For a fixed t, the flow is a map, and for certain purposes we may replace consideration of the differential equations by a consideration of the corresponding flow. In particular, the Poincaré map is the map φ_{t(x)}(x) : U → U defined on a Poincaré section U, where t(x) is the first time for which φ_{t(x)}(x) ∈ U; cf. Fig. 3.1.
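The group property (3.2) can be illustrated numerically. The following hedged sketch (our own RK4 helper for an arbitrary scalar example, not from the text) checks that composing the flow over times s and t agrees with flowing once over s + t:

```python
# Sketch: verify phi_s(phi_t(x)) = phi_{s+t}(x) for x' = x - x^3,
# using a fixed-step RK4 integrator.

def rhs(x):
    return x - x**3

def phi(t, x, n=2000):
    h = t / n
    for _ in range(n):
        k1 = rhs(x)
        k2 = rhs(x + 0.5 * h * k1)
        k3 = rhs(x + 0.5 * h * k2)
        k4 = rhs(x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return x

x0 = 0.2
a = phi(0.7, phi(1.3, x0))   # phi_s composed with phi_t
b = phi(2.0, x0)             # phi_{s+t}
print(abs(a - b) < 1e-9)     # True, to within RK4 discretisation error
```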

1 Well, not the simplest; but it's the simplest interesting bifurcation.
2 The notation φs ◦ φt(x) means φs[φt(x)].

© Springer Nature Switzerland AG 2019 A. Fowler and M. McGuinness, Chaos,




Fig. 3.1 The flow φt

For differential equations, we have the following

Definitions: A fixed point is a constant value x∗ such that f(x∗, μ) = 0 (thus, it represents a steady-state solution of (3.1)). A periodic orbit is a solution such that x(t + p) = x(t) for all t and some minimal value of p > 0, called the period. And for future reference: A homoclinic orbit is an orbit which is asymptotic to a fixed point as t → ±∞.3

Much of bifurcation theory is concerned with the branching of small amplitude solutions from critical points (fixed points) as they lose stability. A whole wealth of (equivalent) techniques have been brought to bear on such problems. Some of these look very profound, and so it is worth pointing out that much of bifurcation theory rests on nothing more than the idea of expansion in a Taylor series, and use of the implicit function theorem. In certain cases, we may use the idea of the Fredholm alternative, but even this is essentially undergraduate mathematics. It is a sobering thought that the whole effort of modern nonlinear stability theory, for example, basically rests on the use of Taylor series. We here recall Taylor's theorem, and the implicit function theorem.

Taylor's Theorem Suppose f(x) ∈ C^k (has continuous k-th derivatives). Then there exists a unique expansion

f(x) = a0 + a1x + · · · + ak−1x^{k−1} + O(k),   (3.3)

where O(k) means O(x^k). We can take f ∈ Rⁿ, and then am x^m represents terms of the form am x1^{m1} . . . xn^{mn} where m1 + · · · + mn = m. Equally, if an expansion such as (3.3) exists, then f ∈ C^k. We don't actually need f^{(k)} to be continuous, just that it exists, but it is convenient to assume so for ease of reference.

3 It is a periodic orbit with infinite period.



3.1.1 Implicit Function Theorem

Let f(x; μ) ∈ Rⁿ have continuous partial derivatives with respect to x ∈ Rⁿ and be continuous with respect to μ; if f(x0; 0) = 0, and if the Jacobian Df of f has nonzero determinant at x = x0, μ = 0, then for sufficiently small y ∈ Rⁿ, the equation f(x; μ) = y is uniquely solvable by a C¹ function x = g(y; μ). In fact, if f ∈ C^r, then also g ∈ C^r.

Bifurcations of particular solutions are associated with loss of stability as the parameter μ is varied. For fixed points, linear stability of a fixed point is determined by the eigenvalues of the Jacobian Df of f at the fixed point. Specifically, let x∗ be a fixed point. Then if f ∈ C², we put x = x∗ + y, so

ẏ = f(x∗ + y) − f(x∗) = Df(x∗)y + O(y²).   (3.4)


For small y, we neglect the nonlinear term (and in certain circumstances, this can be justified by the Hartman–Grobman theorem, see below), thus y˙ ≈ D f (x ∗ )y,


with solutions y ∝ exp(λt), where {λ} are the eigenvalues of Df. If Re λ > 0 for any λ, the fixed point is unstable, whereas if Re λ < 0 for all λ, it is stable. Bifurcations are thus associated with values of λ crossing the imaginary axis. For a real system, eigenvalues are either real, or occur in complex conjugate pairs. If a real eigenvalue goes through zero at μ = 0, then the corresponding eigencoordinate4 u satisfies

u̇ = μu + au² + · · · ,   (3.6)

and we see that there is an exchange of stability between the fixed points u = 0 and u = −μ/a. Such a bifurcation is called a transcritical bifurcation, and is depicted in Fig. 3.2. More generally (or generically), the fixed point may itself depend on μ, and then we can write, for u close to zero,

u̇ = μ + au² + · · · ,


which describes the saddle-node bifurcation. This represents the general unfolding of an instability to positive eigenvalues. The transcritical bifurcation applies only to restricted systems where the fixed point is independent of μ. If in addition, the system has a symmetry, then a further restriction can occur. If the system is symmetric under

4 By this, we mean that there is a single eigenvector corresponding to the passage of the eigenvalue through zero; by making a suitable coordinate change we can make this eigenvector one of the coordinate axes, and we call the corresponding coordinate an eigencoordinate.




Fig. 3.2 The three basic types of bifurcation (one panel labelled ‘saddle node’)

the transformation u → −u, then u = 0 is always a fixed point, and the system can be written as

u̇ = μu + au³ + · · · ,   (3.8)

which gives a pitchfork bifurcation, as shown in Fig. 3.2. In general, we see that for a real instability, we can generate only steady states, and for the generic saddle-node bifurcation, we can think of the fold as simply representing a single fixed point which depends non-monotonically on the parameter. For an oscillatory instability, a pair of complex conjugate eigenvalues cross the imaginary axis. As an example of what can happen in this case, consider the system

ẋ = (μ − r²)x − ωy,
ẏ = ωx + (μ − r²)y,   (3.9)


where r² = x² + y². It is easy to see that the eigenvalues of the linearised equation at the origin are μ ± iω, and therefore there is an oscillatory instability as μ increases through zero. This artificial example is, of course, trivially solved (and we have seen it before, in question 1.7). In polar coordinates, it reduces to

ṙ = μr − r³, θ̇ = ω,   (3.10)


and thus for μ < 0, the origin is globally asymptotically stable. For μ > 0, however, a stable limit cycle bifurcates from the origin, of amplitude r = √μ. The phase space has the (typical) form shown in Fig. 3.3. The behaviour exhibited by this simple system is in fact generic, and this is the substance of the

Hopf bifurcation theorem Let ẋ = f(x, μ), x ∈ Rⁿ, μ ∈ R, have a fixed point x = 0 for all μ. Suppose the eigenvalues of Df(0, μ) have negative real part, except for a pair λ(μ) = α(μ) + iω(μ) and its complex conjugate λ̄(μ), for which λ(0) = iω0, and suppose (transversality) α′(0) ≠ 0 (i.e. the eigenvalues cross the imaginary axis at a




Fig. 3.3 Hopf bifurcation for (3.9). Left: μ < 0; right: μ > 0








Fig. 3.4 Supercritical and subcritical Hopf bifurcation

non-zero ‘rate’). Then for small μ, there exists a one-parameter family of periodic orbits, xμ(t), in either μ ≥ 0 or μ ≤ 0, whose stability is opposite to that of the coexisting fixed point.

Hopf proved his theorem for analytic f, and in this case, we have that μ, xμ and the period T are analytic functions of the (suitably defined) amplitude ε, and both T and μ are in fact functions of ε². In general, then, the amplitude ε of xμ is given by ε ∼ |μ|^{1/2}, and the orbit exists in either μ > 0 or μ < 0, as indicated in Fig. 3.4. These two cases are termed supercritical and subcritical bifurcations, respectively. The proof of the Hopf bifurcation theorem (to which we come later) involves (Taylor) series expansions and use of the implicit function theorem. The method of construction of the power series for xμ then involves successive substitution and the use of the periodicity requirement to determine the coefficients. In its simplest form, these are the Poincaré–Lindstedt series. Subsequently, more elaborate formal perturbation methods have been devised to construct uniformly valid asymptotic expansions for solutions in the parametric neighbourhood of a Hopf bifurcation. Prominent amongst these are the method of multiple scales, and the method of averaging. Though the former is often unaccompanied by formal proof, its calculations are formally the same as those of the Hopf theorem, but it is of greater generality, and applies in other situations as well, where amplitude expansions can be used.
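The model example (3.9) makes the amplitude scaling ε ∼ |μ|^{1/2} concrete. A hedged sketch (step size, initial data and function names are arbitrary choices of ours): integrating the polar radial equation by Euler steps, the orbit settles onto the limit cycle of radius √μ for μ > 0.

```python
# Sketch: Euler integration of r' = mu*r - r^3; for mu > 0 the radius
# approaches the limit-cycle amplitude sqrt(mu).

def settle(mu, r=0.1, h=1e-3, steps=200000):
    for _ in range(steps):
        r += h * (mu * r - r**3)
    return r

mu = 0.25
print(abs(settle(mu) - mu**0.5) < 1e-6)  # True: amplitude ~ sqrt(0.25) = 0.5
```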



Let us illustrate the use of the method of multiple scales by consideration of the following problem5:

ẍ − μẋ + ẋ³ + x = 0.   (3.11)

For μ < 0, the origin is linearly stable, while it is oscillatorily unstable for μ > 0. The Hopf theorem applies, and we seek solutions close to the periodic solution. The key observation is that in the neighbourhood of x = 0, μ = 0, the sinusoidal oscillation e^{it} is modulated by a slowly varying amplitude, which can be thought of as a function of the slow time μt. The method of multiple scales formalises this by seeking solutions which depend explicitly on the fast time scale t∗ and the slow time scale t̃, defined by

t∗ = t, t̃ = ε²t,   (3.12)

where, in view of the Hopf theorem, we define

μ = cε², c = ±1,


and we shall choose c in the course of the calculation. The value of c determines whether the bifurcation is supercritical (c = 1) or subcritical (c = −1). Considered as a function of the two variables, x now satisfies

∂²x/∂t∗² + 2ε² ∂²x/∂t∗∂t̃ + ε⁴ ∂²x/∂t̃² − cε²(∂x/∂t∗ + ε² ∂x/∂t̃) + (∂x/∂t∗ + ε² ∂x/∂t̃)³ + x = 0;   (3.14)

we expand x as the asymptotic series

x = εx0(t∗, t̃) + ε²x1(t∗, t̃) + · · · ,


so that, equating successive powers of ε:

∂²x0/∂t∗² + x0 = 0,
∂²x1/∂t∗² + x1 = 0,
∂²x2/∂t∗² + x2 = −2 ∂²x0/∂t∗∂t̃ + c ∂x0/∂t∗ − (∂x0/∂t∗)³.   (3.16)

The solutions are

x0 = A0(t̃)e^{it∗} + (c.c.),



where (c.c.) denotes the complex conjugate, and a similar expression applies for (3.16)2 , so that (3.16)3 becomes 5 This is essentially the Van der Pol equation of question 1.6 (take a derivative of

to avoid the degeneracy whereby the nonlinearity disappears when μ = 0.

(3.11)), but rescaled



∂²x2/∂t∗² + x2 = −i[2A0′ − cA0 + 3|A0|²A0]e^{it∗} + iA0³e^{3it∗} + (c.c.).   (3.18)


The procedure to determine A0 (t˜) follows from insisting that slow, secular behaviour is fully represented by the slowly varying amplitude functions. Hence, this is equivalent to requiring that each xi be periodic in t ∗ . In order for this to be possible, we must ∗ choose A0 to suppress the resonant terms proportional to eit on the right hand side of (3.18). A more formal restatement of this (applicable to more general situations) uses the Fredholm alternative: we typically obtain by this procedure a sequence of linear inhomogeneous equations of the form L x0 = 0, L x1 = f 1 (x0 ), L x2 = f 2 (x0 , x1 ),


etc.6 Since the homogeneous equation (3.19)1 has a solution, it follows from the Fredholm alternative that the subsequent equations only have a solution (in the relevant Hilbert space) if each fi is orthogonal to the solutions of the homogeneous adjoint equation

L∗η = 0.   (3.20)

In the case of (3.18), L is self-adjoint, and we require orthogonality with e^{±it∗}, equivalent to requiring the coefficients of these terms on the right hand side to vanish. Thus we obtain the differential equation for A0:

dA0/dt̃ = (1/2)cA0 − (3/2)|A0|²A0,   (3.21)

known variously as the Landau–Stuart equation (or Stuart–Watson, or Ginzburg–Landau). The solution of (3.21) is easily found. There is a limit cycle solution in which |A0| = (c/3)^{1/2}, provided c = 1, i.e. the bifurcation is supercritical, and the periodic orbit is stable. The bifurcation diagram is, therefore, that shown on the left in Fig. 3.4. In terms of the original variable x, the limit cycle xμ has amplitude O(√μ), and period 2π + O(μ), in keeping with Hopf's theorem.
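The equilibrium amplitude of (3.21) can be confirmed numerically. A hedged sketch (a real amplitude suffices here since the phase decouples; step size and initial value are our own choices):

```python
# Sketch: Euler integration of the Landau-Stuart amplitude equation (3.21)
# with c = 1 (supercritical); |A0| should approach (1/3)^(1/2).

def amplitude(a=0.05, h=1e-3, steps=200000):
    for _ in range(steps):
        a += h * (0.5 * a - 1.5 * a**3)   # real amplitude equation
    return a

a_inf = amplitude()
print(abs(a_inf - (1.0 / 3.0) ** 0.5) < 1e-9)  # True
```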

3.1.2 Proof of Hopf Bifurcation Theorem

In view of later procedures involving normal forms, etc., it is convenient to consider functions f which are smooth but not analytic. Our sketch of the proof will follow the roundabout method expounded by Hassard, Kazarinoff and Wan, since the procedure adopted will re-appear later in various guises.

⁶ If the unperturbed (ε = 0) system is itself nonlinear, then the first of these is nonlinear, and it is usually appropriate to define the three variables in a more complicated way: this is Kuzmak's method.

First we define a manifold (in Rⁿ). A manifold is a subspace of Rⁿ which is locally diffeomorphic to Rᵐ, m ≤ n; i.e. it looks locally like a piece of Rᵐ. A good example is the surface of a torus (locally like R²) or the surface of a sphere (also R²). (Here, a diffeomorphism is a smooth (i.e. at least C¹) invertible mapping.) It follows from the centre manifold theorem (which will be expounded below) that it suffices to consider systems ẋ = f(x, μ), where x ∈ R². More or less, this is because the components of x in the direction of the eigenvectors of Df(0, μ) with a negative real part will rapidly collapse to zero, leaving an invariant subspace spanned by the eigenvectors associated with the marginally stable eigenvalues.

Nor is it necessary to consider a completely general function f. Poincaré's method of reduction to normal forms enables all sufficiently smooth functions to be reduced to simpler expressions, via a method of successive approximation which we elaborate below. In fact, we shall see that it is sufficient to consider a two-dimensional system in the form

ż = λ(μ)z + Σ_{j=1}^{[L/2]} c_j(μ) z|z|^{2j} + O{|z|(z, μ)^{L+1}},   (3.22)

where [L/2] = int(L/2), and we specifically assume f ∈ C^{L+2}, with L ≥ 2. That is, f has L + 2 continuous derivatives in z and μ, and therefore Taylor's theorem can be used to write f as a power series, with an error term (involving (L+2)-derivatives) of O(L + 2), where O(L + 2) = O{(z, μ)^{L+2}} means terms in z and μ of degree L + 2. That we exclude O(μ^{L+2}) (as indicated in (3.22)) is due to the assumption that f(0, μ) = 0 for all μ. Note that (3.22) is just an extension of the Stuart–Landau equation (3.21) (and it can be derived in the same way).

The idea of the proof is that there will be a solution of (3.22) close to

z = ε exp[2πit/T(μ)]   (3.23)


(thus with period T(μ)), providing μ(ε) and T(μ) are appropriately chosen, viz.

Re[λ(μ) + Σ_j c_j(μ)ε^{2j}] = 0,   (3.24)

which defines μ(ε) implicitly, and

T(μ) = 2π / Im[λ(μ) + Σ_j c_j(μ)ε^{2j}].   (3.25)


To prove this simply involves applying the implicit function theorem judiciously. First we put

z = εζ,   (3.26)



so that

ζ̇ = F(ζ, ζ̄, ε, μ) = λ(μ)ζ + Σ_j c_j(μ)ε^{2j} ζ|ζ|^{2j} + O(|ζ|(εζ, μ)^{L+1}),   (3.27)

so that F ∈ C^{L+1} jointly in ζ, μ and ε. Standard theory of ordinary differential equations (e.g. Hartman 1982, p. 100, Theorem 4.1) then implies that ζ(t, ε, μ) depends smoothly (C^{L+1}) on ε and μ. Now suppose that

ζ = 1 at t = 0;   (3.28)

when ε = μ = 0, the solution is

ζ = e^{λt} = e^{iω_0 t},   (3.29)

where ω_0 = Im λ(0). It follows from the smoothness result for ζ that for ε = 0 and small μ, there is a C^{L+1} function T_0(μ) such that Im ζ[T_0(μ), 0, μ] = 0, with

T_0 = 2π/ω_0 + O(μ^{L+1}),   (3.30)

and also

Re ζ[T_0(μ), 0, μ] = e^{2πα(μ)/ω(μ)} + O(μ^{L+1}),   (3.31)


where λ = α + iω. We now apply the implicit function theorem when ε = 0. For 0 < ε ≪ 1, there exists T(ε, μ), C^{L+1} jointly in both ε and μ, with T(0, μ) = T_0(μ) and

Im ζ[T(ε, μ), ε, μ] = 0.   (3.32)

Define

R(ε, μ) = Re ζ[T(ε, μ), ε, μ],   (3.33)

and note that R(0, 0) = 1, and

∂R/∂μ (0, 0) = ∂/∂μ [Re ζ{T_0(μ), 0, μ}]|_{μ=0} = 2πα′(0)/ω(0) ≠ 0,   (3.34)

by the assumption of transversality. Application of the implicit function theorem now yields the existence of a C^{L+1} function μ(ε) such that



R[ε, μ(ε)] = 1,   (3.35)

and this gives the existence of the required (unique) periodic orbit.

The computation of the coefficients in the expansion for z or ζ follows from (3.24) and (3.25) above. Let us define

τ = t/T(ε, μ),  ζ = e^{2πiτ} η,   (3.36)


with T(ε, μ) = T(ε) defined by choosing μ(ε) to satisfy (3.24). We know that η is a C^{L+1} function of ε, thus

η = Σ_i η_i ε^i + O(ε^{L+1}),   (3.37)



and we know there exists a periodic solution for η, with η(0) = 1 (by definition of ε). The equation for η is

2πiη + dη/dτ = Tη[λ + Σ_j c_j ε^{2j}|η|^{2j}] + O(ε^{L+1});   (3.38)

using (3.24) and (3.25), we find

dη/dτ = Tη Σ_{j=1}^{[L/2]} c_j ε^{2j}{|η|^{2j} − 1} + O(ε^{L+1}).   (3.39)

Equating powers of ε, we find

η_0 ≡ 1.   (3.40)



Now suppose η_r (r > 0) is the first non-vanishing coefficient in the expansion. Then η = 1 + η_r ε^r + ⋯, |η|² = 1 + (η_r + η̄_r)ε^r + ⋯, thus

ε^r dη_r/dτ + ⋯ = T[1 + η_r ε^r + ⋯] Σ_{j=1}^{[L/2]} j c_j ε^{2j}{(η_r + η̄_r)ε^r + ⋯} + O(ε^{L+1});   (3.41)

hence

dη_r/dτ = 0,   (3.42)

and (since η_r(0) = 0) η_r is identically zero. Thus η = 1 + O(ε^{L+1}), and (3.23) is corroborated.



3.1.3 Stability

Reverting now to (3.22), we put z = re^{iθ}, so that

ṙ = r Re[λ + Σ_j c_j r^{2j}] + O(ε^{L+2}).   (3.43)



From the definition of μ, we have

ṙ = r Re[Σ_j c_j(r^{2j} − ε^{2j})] + O(ε^{L+2})
  = r Re{c_1(0)}(r² − ε²) + ⋯,   (3.44)


and the stability of the orbit depends on Re c_1(0): if Re c_1(0) < 0, the orbit is stable; if Re c_1(0) > 0, it is unstable. Recall that μ(ε) is defined via

Re[λ(μ) + Σ_j c_j ε^{2j}] = 0,   (3.45)



whence

μ = −(Re c_1(0)/Re λ′(0)) ε² + ⋯.   (3.46)

We thus see that in the generic case Re c_1(0) ≠ 0, the diagrams in Fig. 3.4 are borne out by the analysis. If Re c_1(0) = 0, then one must go to higher order in the expansion in order to compute the stability condition.
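The two cases can be seen numerically (an illustration of mine, not from the text) by integrating the radial equation ṙ = μr + c_1 r³ for both signs of c_1 (taking c_1 real, so that it plays the role of Re c_1(0)):

```python
def radial_flow(r0, mu, c1, dt=1e-3, t_end=50.0):
    """Integrate r' = mu r + c1 r^3 by forward Euler; c1 plays Re c1(0)."""
    r = r0
    for _ in range(int(t_end / dt)):
        r += dt * (mu * r + c1 * r ** 3)
        if r > 10.0:
            return float("inf")   # subcritical escape to large amplitude
    return r

# supercritical (c1 < 0): for mu > 0, a stable cycle of radius sqrt(-mu/c1)
print(radial_flow(0.01, mu=0.2, c1=-1.0))   # ~ sqrt(0.2) = 0.447
# subcritical (c1 > 0): for mu < 0, an unstable cycle at sqrt(-mu/c1) = 0.447
print(radial_flow(0.40, mu=-0.2, c1=1.0))   # starts inside the cycle: decays to 0
print(radial_flow(0.50, mu=-0.2, c1=1.0))   # starts outside: escapes
```

The unstable branch for c1 > 0 separates decaying from escaping initial amplitudes, which is exactly the right-hand diagram of Fig. 3.4.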

3.2 Normal Forms

We now have to show why any two-dimensional system satisfying the Hopf criteria can be reduced to the normal form (3.22). The idea here is to reduce a general power series to as simple a form as possible, by removing unwanted nonlinear terms. This procedure is effected by means of successive substitutions. The ultimate simplification would be the complete linearisation of a flow, and in some circumstances this can be done locally. In fact, this statement provides the justification for doing phase plane analysis, as it guarantees that (except in 'resonant' situations, see below) linearisation of the flow faithfully represents the topology of the trajectories.

More precisely, we define a homeomorphism to be a continuous map with continuous inverse. The smooth case is a diffeomorphism, that is a C^k map with C^k inverse for some k ≥ 1. We now have the



Hartman–Grobman theorem: If ẋ = Ax + g(x), g ∈ C^k, g(0) = g′(0) = 0, and no eigenvalues of A have Re λ = 0, then there exists a local homeomorphism h from orbits of x near the origin to orbits of y satisfying ẏ = Ay. (That is to say, h ∘ φ_t^x = φ_t^y ∘ h, where φ_t^x, φ_t^y are the flows for x and y.) □

The reason for the condition on Re λ will be obvious if, for example, one thinks of a centre in two dimensions. The nonlinear terms may convert it to a spiral, which is clearly not homeomorphic to a centre. In seeking a normal form to which we can apply the Hopf theorem, we are naturally led to inquire as to when h in the Hartman–Grobman theorem will, in fact, be a diffeomorphism. The answer to this is

Sternberg's theorem (for C^∞ functions): If x is as in the Hartman–Grobman theorem, but g ∈ C^∞, then there exists a C^∞ diffeomorphism⁷ x = h(y) from x to y, providing a countable number of non-resonance conditions can be satisfied (these are detailed below). □

The first of these is that Re λ ≠ 0. We now give an example of the difference between a smooth and a topological map. Consider the flow φ_t^ε defined by

φ_t^ε(x) = ⟨v_ε(t), x⟩,   (3.47)

where ⟨ , ⟩ denotes the inner product,⁸ ε is a parameter, and the vector v_ε(t) is given by

v_ε(t) = (e^t, e^{(1+ε)t}).   (3.48)

We leave it as an exercise to show that φ_t^ε and φ_t^{ε′} are diffeomorphic for ε, ε′ ≠ 0, but φ_t^ε and φ_t^0 are only homeomorphic. Actually, this example is more trivial than the elaborate notation makes it appear. The flow φ_t^ε gives the solutions of the system

ẋ = x,  ẏ = (1 + ε)y,   (3.49)

with trajectories

y = c|x|^{1+ε}.   (3.50)

It is clear from Fig. 3.5 that the flows are diffeomorphic everywhere, except at the origin when ε ≠ 0. In order to interpret this distinction in terms of the Hartman–Grobman and Sternberg theorems (which do not involve parameters), we embed the system (3.49) in the three-dimensional set

⁷ That is, it has derivatives of all orders. One might think this is the same as being analytic, i.e. having a convergent Taylor series, but it is not the case: think of exp(−1/x²) at the origin.
⁸ The inner product is just the generalisation of the scalar product in vector spaces to more complicated spaces, such as function spaces; here, as we are in Rⁿ (actually R² in this example), they are the same.








Fig. 3.5 A star and a node

ẋ = x,  ẏ = (1 + z)y,  ż = 0,   (3.51)


which has the same solutions. As we shall see, (3.51) is an example of an equation whose linearisation has resonances.

We now turn to the general case, and illustrate the way in which resonances occur. Suppose we have an equation

ẋ = Λx + g(x),   (3.52)

where the Jacobian Λ at x = 0 is taken to be diagonal (if the Jacobian cannot be diagonalised, this is itself an example of resonance in the eigenvalues, as we shall see). Denote the eigenvalues of Λ by λ_i, Λ = diag(λ_i), and suppose g(0) = g′(0) = 0, with g ∈ C^∞, for example. This implies that g has a formal power series to which g is asymptotic. The aim is to transform the system (3.52) to the linear system ẏ = Λy, if possible. We do this by successively eliminating higher and higher order terms in the formal power series for g by near-identity transformations. Suppose that g has lowest terms of degree k, g = O(k),⁹ thus g = g_k + O(k + 1), where g_k is a polynomial of degree k. We put

x = h(y) = y + p_k(y),   (3.53)


where p_k is a polynomial of degree k, to be chosen to eliminate terms of degree k in (3.52). Denote P_{k−1} = Dp_k, the Jacobian of p_k; thus P_{k−1} is a matrix (hence the capital letter) of degree k − 1. We have

ẏ = (I + P_{k−1})^{−1}[Λ(y + p_k) + g{y + p_k}]
  = [I − P_{k−1} + O{2(k − 1)}][Λy + Λp_k + g_k(y) + O(2k − 1)]
  = Λy + Λp_k + g_k − P_{k−1}Λy + O(k + 1, 2k − 1)
  = Λy + [Λp_k + g_k − P_{k−1}Λy] + O(k + 1),   (3.54)

providing k ≥ 2, as is the case. We see that the required elimination of terms of degree k is effected provided we can choose p_k so that

Λp_k − P_{k−1}Λy = −g_k.   (3.55)

⁹ For a vector x = (x_1, …, x_n) ∈ Rⁿ, we use the notation O(x^k) to represent O(x_1^{α_1} ⋯ x_n^{α_n}), Σ_i α_i = k, and we also write O(x^k) = O(k).


This linear equation for p_k is called the homological equation. Now let us omit the suffix k, and let p^{(i)} denote the ith component of the vector p, with Λ = diag(λ_i); then the components of (3.55) satisfy

λ_i p^{(i)} − Σ_j (∂p^{(i)}/∂y_j) λ_j y_j = −g^{(i)},   (3.56)

where the summation convention is not applied. Each term of p^{(i)} is of the form c y_1^{a_1} y_2^{a_2} ⋯ y_n^{a_n}, where Σ_i a_i = k, and each such term can be solved for separately. For p^{(i)} of this form, Σ_j (∂p^{(i)}/∂y_j) λ_j y_j = (Σ_j a_j λ_j) p^{(i)}, and so

λ_i p^{(i)} − Σ_j (∂p^{(i)}/∂y_j) λ_j y_j = (λ_i − Σ_j a_j λ_j) p^{(i)}.   (3.57)

Thus if g^{(i)} contains a term g̃^{(i)} = b_i Π_s y_s^{a_s}, then the corresponding component of p^{(i)} is

p̃^{(i)} = −g̃^{(i)} / (λ_i − Σ_j a_j λ_j).   (3.58)

The general form of p^{(i)}, and thus p, can be determined by superposition. We see that a (unique) transformation exists to remove g_k, provided λ_i ≠ Σ_j a_j λ_j. Therefore, we make the following

Definition The eigenvalues λ_i of Λ are said to be resonant if there exist integers a_i ≥ 0, Σ_j a_j ≥ 2, such that for some i,

λ_i = Σ_j a_j λ_j.   (3.59)

With this definition, we have

Poincaré's theorem: The system ẋ = Ax + ⋯ can be formally transformed (via a uniquely defined power series) to the linear system ẏ = Ay, providing the eigenvalues of A are non-resonant.¹⁰ □

¹⁰ A need not, in fact, be diagonalisable.



It is obvious that the method of construction outlined above can be extended iteratively to eliminate terms of arbitrarily high degree. Furthermore, when the eigenvalues are resonant, so that (3.59) holds for some a_i, it is only terms of the form Π_i x_i^{a_i} which cannot be eliminated. This leads us to the normal form of an equation. The normal form is simply the result of eliminating all terms of non-resonant type from the equation.

Example (i) The eigenvalues λ_1, λ_2 are not resonant if 2λ_1 = 3λ_2. For then λ_1 = kλ_1 + (3/2)(1 − k)λ_2, and resonance would require (3/2)(1 − k) to be a positive integer, which is not possible. (Similarly, if we choose i = 2 in (3.59).)

Example (ii) If λ_1 = 2λ_2, then λ_1 and λ_2 are resonant (specifically, terms in x_2² cannot be eliminated).

Example (iii) The equation (3.51) can be written in the form ẋ = Λx + g, where Λ = diag(1, 1, 0), g = (0, x_2x_3, 0)ᵀ, so that the eigenvalues are λ_1 = 1, λ_2 = 1, λ_3 = 0. Thus λ_1 = λ_2 + kλ_3, λ_2 = λ_1 + kλ_3 for any k ≥ 1, and so, for example, terms x_2x_3^k are resonant in the x_1 equation.

Example (iv) (Hopf) If λ_1 + λ_2 = 0, then λ_1 = 2λ_1 + λ_2 = 3λ_1 + 2λ_2 = ⋯, so that terms x_1^{m+1}x_2^m, m ≥ 1, are resonant (in the x_1 equation). In the Hopf case, we have (at μ = 0) the diagonal form ż = λz + ⋯, and its complex conjugate ż̄ = λ̄z̄ + ⋯, with λ = i, thus λ + λ̄ = 0; terms of the form z^{m+1}z̄^m are resonant in the z equation, and we have the normal form

ż = z[i + Σ_k c_k|z|^{2k}] + ⋯.   (3.60)
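The resonance condition (3.59) is easy to test mechanically. The following sketch (my own illustration; the function name is made up) enumerates multi-indices a with 2 ≤ Σ_j a_j ≤ some maximum degree and checks λ_i = Σ_j a_j λ_j for the examples above:

```python
from itertools import product

def resonances(eigs, max_degree=4, tol=1e-9):
    """Find resonances lambda_i = sum_j a_j lambda_j with integers a_j >= 0
    and sum(a_j) >= 2, up to the given total degree (definition (3.59))."""
    n = len(eigs)
    found = []
    for a in product(range(max_degree + 1), repeat=n):
        if not 2 <= sum(a) <= max_degree:
            continue
        combo = sum(aj * lj for aj, lj in zip(a, eigs))
        for i, li in enumerate(eigs):
            if abs(combo - li) < tol:
                found.append((i, a))
    return found

# Example (ii): lambda_1 = 2 lambda_2 is resonant (the x_2^2 term in equation 1)
print(resonances([2.0, 1.0]))      # contains (0, (0, 2))
# Example (i): 2 lambda_1 = 3 lambda_2 is not resonant
print(resonances([3.0, 2.0]))      # []
# Example (iv) (Hopf): lambda = +/- i gives the z|z|^2-type resonances
print(resonances([1j, -1j]))       # contains (0, (2, 1)) and (1, (1, 2))
```

The Hopf output (0, (2, 1)) is precisely the monomial z²z̄ = z|z|², the first term surviving in (3.60).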

3.3 Centre Manifold Theorem

Much is made in modern dynamical systems theory of the notion of centre manifold reduction. The idea is indeed a very powerful one, so much so that it is dressed up in different terminologies such as Lyapunov–Schmidt reduction, slaving principle, etc. And the centre manifold theorem itself is a complicated statement which takes a long time to prove rigorously. But, like the normal form procedure, the idea is very simple, and the simple message should not be obscured by the technicality of the more formal language involved in rigorous proofs.

The idea is again concerned with motions close to an equilibrium. Suppose the Jacobian of the system at the fixed point is diagonalised, and the equations are then partitioned 'naturally' into equations for two (vector) variables x, y, with



ẋ = Bx + f(x, y),  ẏ = Cy + g(x, y).   (3.61)


Here f and g are of degree at least two, as usual. The idea of the partition is that, if the eigenvalues of B have real part equal, or close to zero, while those of C have real part negative (and well away from zero), then the linearised behaviour will, after sufficient time, be governed by x only, since y → 0 on a fast time, while x varies on a slow time. The principle involved is little more than an extension of the notion of linear stability, and the idea of the centre manifold theorem is to extend this notion to the nonlinear system. We expect, and it is generally true, that if x ∈ Rn , then there is an n-dimensional subspace of the complete phase space for (x, y) which is locally invariant (points in the subspace stay in the subspace under the flow) and attracting (nearby points tend towards this subspace). Such a subspace is called a centre manifold, and its existence guarantees that formal finite-dimensional calculations, as done, for example, in nonlinear stability theory for partial differential equations, actually have a rigorous content. For practical purposes, it is enough to know that the theorem exists; the business of calculating the dynamics of the centre manifold still proceeds in a formal manner. In particular, our roundabout proof of the Hopf theorem requires the application of the centre manifold theorem, to reduce a general system to a two-dimensional one; we carry out this application as an exercise. First we make some definitions. We have already defined what a manifold is, but we recall this here for completeness. Definition An (n-dimensional) manifold M is a subset of R N which is locally diffeomorphic to Rn (where n ≤ N ).  For example, a two-torus (Fig. 3.6) is a two-dimensional (compact) manifold. So is the surface of a sphere. Definition An invariant set S for a flow φt is a set such that for all x ∈ S, φt (x) ∈ S for all t ∈ R (i.e. for both positive and negative time). An invariant manifold is similarly defined.

Fig. 3.6 A two-torus



Definition The stable (invariant) manifold of a compact invariant set S ⊂ Rⁿ is the set of points x such that φ_t(x) → S as t → +∞. The unstable (invariant) manifold of S is defined similarly, but such that φ_t(x) → S as t → −∞. □

Suppose x* is a fixed point of a differential equation ẋ = f(x), and the corresponding Jacobian is Df(x*). We define the stable, centre and unstable eigenspaces E^s, E^c, E^u at x* to be the vector spaces spanned by the eigenvectors corresponding to stable (Re λ < 0), neutral (Re λ = 0) and unstable (Re λ > 0) eigenvalues of Df(x*).

Example The Lorenz equations are

ẋ = −σx + σy,  ẏ = (r − z)x − y,  ż = xy − bz.   (3.62)

At the origin, the linearised system is

(ẋ, ẏ, ż)ᵀ = [[−σ, σ, 0], [r, −1, 0], [0, 0, −b]] (x, y, z)ᵀ.   (3.63)

The eigenvalues of the Jacobian matrix are −b (with eigenvector up the z axis) and the two roots of

λ² + (σ + 1)λ − σ(r − 1) = 0,   (3.64)

which are

2λ = −(σ + 1) ± [(σ + 1)² + 4σ(r − 1)]^{1/2}.   (3.65)

Suppose that r > 1 (and σ > 0). Then these two roots are real, λ±, say, and λ₊ > 0 > λ₋. The stable eigenspace E^s is thus two-dimensional, and the unstable one E^u is one-dimensional, as shown in Fig. 3.7. If r = 1, however, then λ₊ = 0, so E^s is two-dimensional, E^u is empty, and there is a one-dimensional centre eigenspace E^c.

Definition A centre manifold is an invariant manifold tangent to the centre eigenspace. □

We are now in a position to state the

Centre manifold theorem: Let ẋ = f(x), x ∈ Rⁿ, f ∈ C^r; then there exist C^r stable and unstable invariant manifolds W^u and W^s (tangent to E^u and E^s), and a C^{r−1} centre manifold W^c (tangent to E^c); W^u and W^s are unique, W^c need not be. [If r = ∞, then for each k ∈ N there exists a centre manifold W^c which is C^k.]



Fig. 3.7 Stable and unstable eigenspaces for the Lorenz equations

Remark Basically, W^u, W^s are the spaces spanned by the continuation of E^u and E^s. This is also true for W^c, but the continuation is not unique. The reason for this is closely allied to the non-uniqueness of asymptotic expansions for C^∞ functions. As an illustration, consider the following

Example (Kelley 1967) Let x, y satisfy

ẋ = x²,  ẏ = −y;   (3.66)

the trajectories are then y = ae^{1/x}, where a is constant, and are illustrated in Fig. 3.8. The linearised system is

ẋ = 0,  ẏ = −y,   (3.67)

with a neutral eigenvalue λ = 0 and corresponding eigenvector (1, 0). Thus the x-axis constitutes E^c, but it is evident from the trajectories that a curve {y = ae^{1/x}, x < 0} ∪ {y = 0} is a centre manifold for any value of a. Thus there

Fig. 3.8 Phase space for (3.66)



are infinitely many (C^∞) centre manifolds, but note that only one of these is analytic (i.e. y = 0). Here, the non-uniqueness may be associated with an isolated essential singularity of the system at the origin. It is worth noting that all the centre manifolds have the same asymptotic expansion for y = h(x) as a power series in x.

Example (van Strien 1979) Consider

ẋ = −x² + μ²,  ẏ = −y − (x² − μ²),  μ̇ = 0.   (3.68)

(Although the system is three-dimensional, it is equivalent to a two-dimensional system with a parameter μ; the device of thus embedding a system in a higher-dimensional one is a common one, as we shall see later, when the application to Hopf bifurcation is made.) In the previous example, we had an analytic system, with infinitely many C^∞ centre manifolds, but only one analytic centre manifold. The immediate inferential guess one might make is quashed by this example, which is a C^∞ system (in fact, analytic), but has no C^∞ centre manifold. To solve (3.68), we define

X(x, μ) = |(μ + x)/(μ − x)|^{1/2μ};   (3.69)

then Ẋ = X, and one finds

y = (1/X) ∫ X dx = |(μ − x)/(μ + x)|^{1/2μ} ∫^x |(μ + s)/(μ − s)|^{1/2μ} ds.   (3.70)


It looks plausible that the surfaces (3.70) are not C^∞, but it is not terribly easy to see how to show this. A rather different method used by Van Strien is this. Suppose y = h(x, μ) is an invariant centre manifold. Defining

Y = y − h(x, μ),   (3.71)

we find

Ẏ = −Y + [−h + (1 − h_x)(μ² − x²)],   (3.72)

and Y = 0 is the centre manifold if the square-bracketed term vanishes, i.e. if

(μ² − x²)h_x + h = μ² − x².   (3.73)

Rather than solve (3.73) (to obtain (3.70)), suppose h is C^k. Then we can expand h as

h = Σ_{j=1}^{k−1} a_j(x − μ)^j + O[(x − μ)^k].   (3.74)



Substituting in, we find

a_1 = 2μ/(2μ − 1),
a_2 = (1 − a_1)/(4μ − 1),
a_j = −(j − 1)a_{j−1}/(2μj − 1),  3 ≤ j ≤ k − 1.   (3.75)

Now suppose that μ = 1/2n, with 3 ≤ n ≤ k − 1. It follows that a_{n−1} = 0, and thus, working backwards, a_2 = 0. However, (3.75)_{1,2} imply that a_1 = −1/(n − 1) and a_2 = −n²/((n − 1)(n − 2)) < 0. Therefore, the assumption (3.74) is invalid if k ≥ 1 + 1/2μ. Particularly, there is a C^k centre manifold for any k, providing μ < 1/[2(k − 1)], but a C^∞ centre manifold does not exist.


When r = 1, the eigenvalues are 0, −(σ + 1), −b, with eigenvectors (1, 1, 0), (σ, −1, 0), (0, 0, 1). Therefore, put

(x, y, z)ᵀ = [[1, σ, 0], [1, −1, 0], [0, 0, 1]] (u, v, w)ᵀ,   (3.77)

so that




(u̇, v̇, ẇ)ᵀ = diag(0, −(σ + 1), −b)(u, v, w)ᵀ + ( −σ(u + σv)w/(1 + σ), (u + σv)w/(1 + σ), (u + σv)(u − v) )ᵀ.   (3.78)

The linearly stable eigenspace E^s is u = 0, and the centre eigenspace E^c is v = w = 0. It is clear that the projection v = w = 0 is not W^c, however, since when v = w = 0 in (3.78), ẇ = u²: that is to say, E^c is not invariant.

We now copy the normal form procedure to try to get a better approximation to the centre manifold. The idea is that we have v̇ = 0, ẇ = O(u²) on v = w = 0, so E^c is invariant up to terms of second order. We therefore try to choose a near-identity transformation such that in the new coordinates U, V, W, both V̇ and Ẇ are O(3), i.e. zero up to third order. In this way, we obtain a (unique) formal power series expansion for the graph¹¹ of a centre manifold. Since the offending variable w satisfies ẇ = −bw + u² + ⋯, where the dotted terms vanish on v = w = 0, we put

w = u²/b + W   (3.79)

(and leave u, v unchanged), so that

Ẇ = −bW + (σ − 1)uv − σv² + [2σu(u + σv)/(b(1 + σ))][W + u²/b];   (3.80)

projection on to v = W = 0 now gives

Ẇ = 2σu⁴/(b²(1 + σ)),   (3.81)

which is invariant to second order (in fact, to third order, but we get cubic terms in the equation for v).
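The approximation w ≈ u²/b can be checked by direct integration (an illustration of mine, not from the text): starting the Lorenz system at r = 1 from a point off the centre manifold, v collapses quickly and w hugs the surface u²/b, while u itself decays only algebraically:

```python
def lorenz_rhs(x, y, z, sigma=10.0, b=8.0 / 3.0, r=1.0):
    return -sigma * x + sigma * y, (r - z) * x - y, x * y - b * z

def rk4_step(x, y, z, dt):
    k1 = lorenz_rhs(x, y, z)
    k2 = lorenz_rhs(x + 0.5 * dt * k1[0], y + 0.5 * dt * k1[1], z + 0.5 * dt * k1[2])
    k3 = lorenz_rhs(x + 0.5 * dt * k2[0], y + 0.5 * dt * k2[1], z + 0.5 * dt * k2[2])
    k4 = lorenz_rhs(x + dt * k3[0], y + dt * k3[1], z + dt * k3[2])
    return (x + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6,
            y + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6,
            z + dt * (k1[2] + 2 * k2[2] + 2 * k3[2] + k4[2]) / 6)

sigma, b = 10.0, 8.0 / 3.0
x, y, z = 0.6, 0.4, 0.3
for _ in range(10000):            # integrate to t = 10 with dt = 1e-3
    x, y, z = rk4_step(x, y, z, 1e-3)

u = (x + sigma * y) / (1 + sigma)  # invert the transformation (3.77)
v = (x - y) / (1 + sigma)
w = z
print(u, v, w - u * u / b)         # v and w - u^2/b are small compared with u
```

The residuals v and w − u²/b are O(u³) and O(u⁴) respectively, which is the content of (3.79)–(3.81).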

3.3.1 Formal Power Series Expansion

We can carry out this procedure of successive approximation in general. Suppose that W^u is empty (this is often the interesting case, for example, in applications of the Hopf bifurcation theorem). Consider the system

¹¹ A graph is the generalisation to n dimensions of the two-dimensional idea of the graph of a scalar function h(x) of a scalar variable x.



ẋ = Bx + f(x, y),  ẏ = Cy + g(x, y),   (3.82)


where f, g and their first derivatives vanish at the origin, (x, y) ∈ Rⁿ × Rᵐ, and B, C have eigenvalues λ with Re λ = 0 and Re λ < 0, respectively; B, C are diagonal matrices. We therefore have that the centre manifold W^c is tangent to E^c (y = 0) and can thus be written as a graph y = h(x) with h(0) = h′(0) = 0 (here h′(0) is the Jacobian of h at x = 0). Our aim is to calculate h(x) by expansion in powers of x and projection on to y = h. We do this by making the near-identity transformation

Y = y − h   (3.83)

(x remains as it is), so that

ẋ = Bx + f[x, h(x) + Y],
Ẏ = ẏ − Dh ẋ = C(h + Y) + g[x, h + Y] − Dh[Bx + f(x, h + Y)];   (3.84)

our aim is to choose h so that when y = h, i.e. Y = 0, then y = h is a centre manifold, i.e. trajectories of the full system remain on Y = 0. This is achieved if, when we put Y = 0 on the right hand side of (3.84)_2, we obtain Ẏ = 0 also. Thus we aim to choose h(x) to satisfy the nonlinear partial differential equation

N(h) = Dh[Bx + f(x, h)] − Ch − g(x, h) = 0.   (3.85)

Our method of solution is constructive, via power series expansions, and is validated by the following

Theorem (Carr 1981): If φ(0) = φ′(0) = 0 and N(φ) = O(x^p), then h = φ + O(x^p). □

Suppose, for example, that ẏ = Cy + g(x, y), with g of degree k; thus ẏ = O(k) on y = 0. We pick h of degree k, y = h + Y, so that Ẏ = −N(h) on Y = 0, and we want to choose h so that N(h) = O(k + 1). From (3.85), the terms of degree k give

Dh[Bx + ⋯] − Ch − g(x, 0) = 0.   (3.87)

As for the normal form procedure, this is now a linear equation for h. If B = diag(b_i), C = diag(c_i), then a term g^{(i)} = a Π_j x_j^{α_j} in the ith component of g, where Σ_j α_j = k, has a solution



h^{(i)} = g^{(i)} / (Σ_j α_j b_j − c_i).   (3.88)

Notice that, since Re b_i = 0, Re c_i < 0, and the α_j are real, the denominator in (3.88) is always non-zero, and the problem of resonance does not occur. By construction, these power series are unique. But they are by no means always convergent. Exotic singularities may lurk at the origin, making trajectories non-analytic there.

Example Consider

ẋ = −x³,  ẏ = −y + x²;   (3.89)

putting y = Y + h, we obtain

ẋ = −x³,  Ẏ = −Y + [x³h′ − h + x²].   (3.90)

Adopting a formal power series expansion (with h(0) = h′(0) = 0, since E^c is just y = 0)

h ∼ Σ_{r=2}^{∞} h_r x^r,   (3.91)

we find

Σ_r r h_r x^{r+2} − Σ_r h_r x^r + x² = 0,   (3.92)

whence

h_2 = 1,  h_3 = 0,  h_{r+2} = r h_r for r ≥ 2,   (3.93)

and thus

h_i = 0, i odd,  h_{2n} = 2^{n−1}(n − 1)!,   (3.94)

and we have

h ∼ Σ_{n=1}^{∞} 2^{n−1}(n − 1)! x^{2n} ∼ x² + 2x⁴ + 8x⁶ + ⋯,   (3.95)




a series which is clearly non-convergent anywhere (except x = 0). In fact, the auxiliary equation for h is linear and solvable in this case, and satisfies the same equation as the trajectories for y as a function of x. These trajectories are

y = exp(−1/2x²) [ c − ∫^x exp(1/2u²) du/u ],   (3.96)

where c is a constant. There is an essential singularity at x = 0. We leave it as an exercise to show that (3.95) is the asymptotic expansion of (3.96) (for any c).

The dynamics of trajectories on the centre manifold W^c are given by the equation for x:

ẋ = Bx + f[x, h].   (3.97)

More generally, with h satisfying (3.85), we have

ẋ = Bx + f[x, h + Y],  Ẏ = CY + G(x, Y),   (3.98)


where

G = g[x, h + Y] − g[x, h] − Dh[f(x, h + Y) − f(x, h)].   (3.99)

Since f, g are O(2), so is G, and thus Y = 0 is linearly stable (as Re c_i < 0). It then follows from the Hartman–Grobman theorem that the centre manifold Y = 0 is locally attracting.
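The recurrence (3.93) and the closed form (3.94) are easy to confirm, and the asymptotic (rather than convergent) character of (3.95) can also be seen numerically: at fixed x the terms shrink at first and then grow factorially (a sketch of mine, not from the text):

```python
import math

def centre_manifold_coeffs(nmax):
    """h_r for h ~ sum h_r x^r, from (3.93): h_2 = 1, h_3 = 0, h_{r+2} = r h_r."""
    h = {2: 1, 3: 0}
    for r in range(2, 2 * nmax):
        h[r + 2] = r * h[r]
    return h

h = centre_manifold_coeffs(12)
for n in range(1, 9):                     # closed form (3.94)
    assert h[2 * n] == 2 ** (n - 1) * math.factorial(n - 1)

# asymptotic series: at x = 0.3 the terms t_n = h_{2n} x^{2n} have ratio
# t_{n+1}/t_n = 2 n x^2, so they decrease up to n = 6 and then diverge
x = 0.3
terms = [h[2 * n] * x ** (2 * n) for n in range(1, 12)]
print(min(range(len(terms)), key=terms.__getitem__))  # index 5, i.e. n = 6
```

Truncating at the smallest term (here n = 6) is the usual "optimal truncation" of an asymptotic series; beyond it the partial sums deteriorate, as (3.96) warns.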

3.3.2 Application to Hopf Bifurcation

The drawback in applying the centre manifold theorem immediately is that we wish typically to consider systems which are slightly stable or unstable (so that W^c is only non-empty at some critical parameter value μ = 0). To apply the theorem, we embed an equation

ẋ = f(x, μ),  x ∈ Rⁿ,   (3.100)

where μ is a small parameter, into the system

ẋ = f(x, μ),  μ̇ = 0,  (x, μ) ∈ Rⁿ × R.   (3.101)


As an example, we consider the Hopf case, where we suppose W u is empty (though this is not necessary). At μ = 0, there is a single pair of neutrally stable eigenvalues



λ = ±iω, ω ≠ 0, and there is therefore a (2 + 1)-dimensional centre manifold which is locally attracting. Specifically, write the system in the form

ż = iωz + f(z, z̄, y, μ),
ż̄ = −iωz̄ + f̄(z̄, z, y, μ),
μ̇ = 0,
ẏ = Cy + g(z, z̄, y, μ),   (3.102)


with f, g and their first derivatives vanishing at the origin. Application of the centre manifold theorem now proceeds as follows. Suppose f, g ∈ C^k; then there exists a centre manifold h(z, z̄, μ) ∈ C^{k−1}, such that if y = Y + h, then transformation to coordinates z, z̄, Y, μ yields the system

ż = iωz + F(z, z̄, μ, Y),
μ̇ = 0,
Ẏ = CY + G(z, z̄, Y, μ),   (3.103)


together with the complex conjugate of (3.103)_1, where F, G are O(2), F, G ∈ C^k, and

G(z, z̄, 0, μ) ≡ 0.   (3.104)

The centre manifold Y = 0 is locally attracting, and the dynamics on it are governed by (3.103)_1 with Y = 0, i.e.

ż = iωz + F(z, z̄, μ),  μ̇ = 0.   (3.105)

We now apply the normal form procedure to (3.105); the eigenvalues are λ_1 = iω, λ_2 = −iω, λ_3 = 0, and resonant terms are of the form |z|^{2r}zμ^s, |z|^{2r}z̄μ^s, |z|^{2r}μ^s, which can arise in the equations for z, z̄ and μ respectively. Evidently, the normal form procedure now yields an equation for z of the form

ż = iωz + Σ_{j≥0} c_j(μ)|z|^{2j}z,   (3.106)



while the μ equation is unaltered (as there are no nonlinear terms to be removed). The term in j = 0 constitutes the linear growth or decay for μ ≠ 0, hence

ż = λz + Σ_{j≥1} c_j(μ)|z|^{2j}z,   (3.107)

with

λ = c_0(μ) + iω.   (3.108)


As a final remark, note that our procedure here involves (i) a reduction to motion on a centre manifold, followed by (ii) a coordinate change to the normal form. This elaborate scheme is not usually worthwhile, and other, more practical methods (Poincaré–Lindstedt, multiple scales) go directly for the solution. The more rigorous construction given here does not add any new information.

3.4 Secondary Hopf Bifurcations

The whole apparatus that we have developed so far (Poincaré–Lindstedt series, normal forms, the centre manifold theorem, all basically being a result of power series expansions) has led us to a mechanism, the Hopf bifurcation, which 'produces' a periodic orbit from a steady state. Our philosophy in understanding chaos is to develop ideas which can explain the topological origin of periodic and more complicated motions, and all this paraphernalia of normal forms, etc. would be rather wasted if only the Hopf bifurcation could be analysed. In fact, there are many other applications. In particular, periodic solutions may themselves be stable or unstable, and power series expansions can be applied to the parametric instability of such orbits, in order to determine the resulting bifurcation structure.

3.4.1 Stability of Periodic Orbits

We introduce the idea of a Poincaré map. Suppose

u̇ = f(u, μ),  u ∈ Rⁿ,  μ ∈ R,   (3.109)

is a differential equation having a periodic orbit u_p(t) of period 2π (this without loss of generality). Let U be an (n − 1)-dimensional manifold transverse¹² to u_p. Then if u* ∈ U lies on the periodic orbit, it will be true that trajectories of (3.109) sufficiently close to u* will also intersect U transversely (if f is smooth, as we assume). Therefore, there is a neighbourhood of u* in U on which we can define a Poincaré map F: N(u*) → U (where N is the neighbourhood of u*) as follows: if u_0 ∈ N(u*) then F(u_0) = u(τ), where u(0) = u_0, u satisfies (3.109), and τ is the minimum value of t for which u ∈ U. The idea is illustrated in Fig. 3.9; basically an initial value u_0 ∈ U is mapped to the next intersection of the trajectory of (3.109)

¹² Transverse here has the obvious geometrical meaning. If U intersects u_p at u = u*, and û is a vector tangent to u_p at u*, then we require ⟨û, v⟩ to be non-zero for non-zero vectors v in U (using the usual inner product).




Fig. 3.9 The Poincaré map


through U. In general, a Poincaré map is only properly defined in a neighbourhood of a periodic orbit, but in practice, one may define Poincaré maps on Poincaré surfaces, provided these can be chosen to exclude tangencies with trajectories of the system.

If we suppose that U is indeed defined in the neighbourhood of a periodic orbit, then the point u* is a fixed point of the Poincaré map F, i.e. F(u*) = u*. The map F will depend on the parameter μ, and the stability of the periodic orbit for the flow may be determined by the stability of the fixed point u* for the map F. Linear stability of u* is determined by the Jacobian DF of F at u*. If the eigenvalues of DF are λ, then the unit circle marks the stability boundary: if |λ| < 1 for all λ, u* is stable; when |λ| = 1 for some λ (but the other λ have |λ| ≤ 1), then u* is marginally stable. Thus instability is associated with passage of λ through the unit circle, and there are then three cases to consider.

(i) λ = 1. This is the saddle-node or transcritical type, and is associated with the same bifurcation as in one-dimensional maps, see Fig. 2.3.

(ii) λ = −1. This is the period-doubling bifurcation, and is again analogous to that for one-dimensional maps, see Fig. 2.4.¹³

(iii) λ = e^{2πiω}, ω ∈ (0, 1). For a real-valued differential equation, F is real valued, and a complex conjugate pair of eigenvalues cross the unit circle. This is the analogue of the Hopf bifurcation for maps. It corresponds to a secondary Hopf bifurcation for the flow, where the periodic orbit becomes oscillatorily unstable.
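As a concrete sketch (my own illustration, not from the text), consider the planar system ẋ = μx − ωy − x(x² + y²), ẏ = ωx + μy − y(x² + y²), which has a stable limit cycle of radius √μ. In polar coordinates θ̇ = ω exactly, so trajectories return to the section U = {y = 0, x > 0} after exactly T = 2π/ω, and the Poincaré map is just the time-T flow of the x coordinate; its fixed point is x* = √μ, with multiplier e^{−2μT} < 1 (the stable case |λ| < 1):

```python
import math

MU, OMEGA = 0.25, 1.0

def rhs(x, y):
    r2 = x * x + y * y
    return MU * x - OMEGA * y - x * r2, OMEGA * x + MU * y - y * r2

def poincare_map(x0, n=20000):
    """One return to {y = 0, x > 0}: for this system the return time is
    exactly T = 2 pi / omega, so just integrate for time T (RK4)."""
    dt = 2.0 * math.pi / (OMEGA * n)
    x, y = x0, 0.0
    for _ in range(n):
        k1 = rhs(x, y)
        k2 = rhs(x + 0.5 * dt * k1[0], y + 0.5 * dt * k1[1])
        k3 = rhs(x + 0.5 * dt * k2[0], y + 0.5 * dt * k2[1])
        k4 = rhs(x + dt * k3[0], y + dt * k3[1])
        x += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        y += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return x

x = 0.8
for _ in range(20):
    x = poincare_map(x)
print(x)                                   # converges to x* = sqrt(mu) = 0.5
lam = (poincare_map(0.501) - poincare_map(0.499)) / 0.002
print(lam, math.exp(-2.0 * MU * 2.0 * math.pi))  # multiplier ~ e^{-2 mu T} = e^{-pi}
```

Iterating the map contracts towards the fixed point by a factor of about 0.043 per period, which is the eigenvalue of DF in case (stability) above.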

3.4.2 Floquet Theory

Let u be as before, satisfying u̇ = f(u, μ) with a 2π-periodic orbit up(t). Put

u = up(t) + x,


13 In fact, the centre manifold theorem applies to maps as well, and so one can prove that bifurcations associated with passage of λ through ±1 are strictly analogous to the one-dimensional maps discussed in Chap. 2.


3 Hopf Bifurcations

so that

ẋ = f[up + x; μ] − f[up; μ] ≡ A(t)x + v(x, t),14


where A ≡ Df(up(t); μ) is a 2π-periodic matrix, and v(0, t) = vₓ(0, t) = 0, i.e. v is of degree two in x. We suppose v is C², so that this is true. The linear stability of up is thus determined by the non-autonomous linear system

ẋ = A(t)x.   (3.112)


Definition The monodromy operator M is the linear operator which takes x(0) to x(2π) (i.e. it is a local approximate Poincaré map for the flow).

Floquet's theorem Suppose the monodromy operator M is diagonal and its eigenvalues are given by μs = exp(2πλs); then (3.112) can be reduced by a substitution x = B(t)y, where B is 2π-periodic, to the autonomous system

ẏ = Λy, Λ = diag(λs).   (3.113)


To prove this, let G and F be the flows of x and y satisfying (3.112) and (3.113), that is to say x(t) = G(t)x(0), y(t) = F(t)y(0);


G and F are just fundamental matrices, and we have F(0) = G(0) = I,

F(2π ) = G(2π ) = M


(by choice of λ, and since M is diagonal). Therefore

B(t) = G(t)F(t)^{−1}


is a 2π -periodic matrix. Also G˙ = AG,

Ḟ = ΛF.


We put x = By,


or equivalently

x = Gz, y = Fz;

14 The symbol ≡ indicates definition.





ẋ = Ġz + Gż = AGz + Gż = Ax + Gż = Ax


from (3.117). Therefore Gż = 0, so z is constant, and

ẏ = Ḟz = ΛFz = Λy,


which proves the result. □ This theorem reduces the problem of stability of up to the determination of the characteristic multipliers μs; the secondary Hopf bifurcation corresponds to the passage of a complex conjugate pair μs, μ̄s through the unit circle at a value μs = exp[2πiω].
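The multipliers μs are easily computed numerically: integrate Ġ = A(t)G over one period with G(0) = I, so that M = G(2π), and take the eigenvalues of M. The sketch below is our own illustration (the example system is not from the text): the planar flow ẋ = x − y − x(x² + y²), ẏ = x + y − y(x² + y²) has the 2π-periodic orbit up = (cos t, sin t), whose multipliers are known exactly to be 1 and e^{−4π}.

```python
import math

def A(t):
    """Jacobian of the vector field, evaluated on the orbit (cos t, sin t)."""
    x, y = math.cos(t), math.sin(t)
    return [[1 - 3 * x * x - y * y, -1 - 2 * x * y],
            [1 - 2 * x * y, 1 - x * x - 3 * y * y]]

def matmul(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add_scaled(G, K, h):
    return [[G[i][j] + h * K[i][j] for j in range(2)] for i in range(2)]

def monodromy(n=4000):
    """RK4 integration of Gdot = A(t) G over [0, 2*pi], with G(0) = I."""
    G = [[1.0, 0.0], [0.0, 1.0]]
    dt = 2 * math.pi / n
    for i in range(n):
        t = i * dt
        k1 = matmul(A(t), G)
        k2 = matmul(A(t + dt / 2), add_scaled(G, k1, dt / 2))
        k3 = matmul(A(t + dt / 2), add_scaled(G, k2, dt / 2))
        k4 = matmul(A(t + dt), add_scaled(G, k3, dt))
        G = [[G[r][j] + dt * (k1[r][j] + 2 * k2[r][j] + 2 * k3[r][j]
                              + k4[r][j]) / 6 for j in range(2)] for r in range(2)]
    return G

M = monodromy()
tr = M[0][0] + M[1][1]
det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
disc = math.sqrt(tr * tr - 4 * det)
mu = sorted([(tr - disc) / 2, (tr + disc) / 2])
# mu[1] ~ 1 (along the orbit), mu[0] ~ e^{-4*pi} (transverse contraction)
```

The multiplier 1 reflects neutral perturbations along the orbit itself; a secondary Hopf bifurcation would appear as a complex pair of multipliers crossing |μ| = 1.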

3.4.3 Normal Forms

In order to analyse the nonlinear behaviour at an oscillatory instability, we now reduce the nonlinear system to its normal form. We suppose that Floquet's theorem has been applied, so that we consider the equation for x in the form

ẋ = Λx + v(x, t),


where v is 2π-periodic in t, and of degree ≥ 2; Λ is a diagonal, constant matrix, Λ = diag(λi). We aim to eliminate the nonlinear terms from the equation, in the same way as for autonomous equations. Suppose v is of degree k. We seek a near-identity transformation

x = y + h(x, t),   (3.123)

where h = O(k), which will remove terms of degree k from v, i.e. we require h to be chosen so that

ẏ = Λy + O(k + 1).   (3.124)

We find

ẏ = ẋ − ht − Dh ẋ = (I − Dh)[Λy + Λh + v] − ht = Λy + Λh + v − Dh Λx − ht − Dh v.


We choose h of degree k, and then Dh v is O(2k − 1), which is of higher degree than k if k > 1, as it is. Therefore, we obtain the required result if h satisfies the partial differential equation

ht + LΛ h = v,   (3.126)





where LΛ h ≡ Dh Λx − Λh;15


(3.126) is called the homological equation, and we see that it is consistent that h is O(k) if v is also. This is a linear partial differential equation for h, and can be solved just as before. Since v is 2π-periodic, we can write a Fourier series for v:

v = Σ_{m,k,s} v_{mks} x^m e^{ikt} e_s,

where e_s is the sth unit vector in the normal basis for R^n, m = (m1, m2, . . . , mn), and x^m denotes x1^{m1} · · · xn^{mn}. The formal solution for h is then

h = Σ_{m,k,s} h_{mks} x^m e^{ikt} e_s,

where

[ik − λs + Σj mj λj] h_{mks} = v_{mks}.   (3.130)
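In code, (3.130) is a single division, with resonance appearing as a vanishing denominator. The sketch below is our own transcription; the value ω = 1 (so λ = (i, −i)) is purely illustrative.

```python
# Solve (3.130) for one coefficient: h_{mks} = v_{mks} / (ik - lambda_s + m.lambda).
def h_coeff(v_mks, m, k, s, lam):
    denom = 1j * k - lam[s] + sum(mj * lj for mj, lj in zip(m, lam))
    if abs(denom) < 1e-12:
        raise ZeroDivisionError("resonant term: coefficient cannot be removed")
    return v_mks / denom

lam = (1j, -1j)                        # eigenvalues +-i*omega, with omega = 1
h = h_coeff(1.0, (2, 0), 1, 0, lam)    # denominator 2i, so h = 1/(2i) = -0.5i
# h_coeff(1.0, (2, 1), 0, 0, lam) raises: k = 0, m1 = m2 + 1 is resonant
```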



Thus nonlinear terms can be successively eliminated, provided the bracketed coefficient in (3.130) is non-zero. Thus we have the

Definition If λ = (λ1, . . . , λn) is such that

ik − λs + (m, λ) = 0 for some m ∈ N^n with Σj mj ≥ 2 and some k ∈ Z,

then the eigenvalues are resonant.

Notice that this definition includes that for autonomous systems, when k = 0, as a special case.

Example We proceed straight to the case of the secondary Hopf bifurcation. Suppose n = 2, so the system is two-dimensional, and suppose it is written in the complex form

ż = iωz + Σ_{m,k} v_{mk} z^{m1} z̄^{m2} e^{ikt},   (3.132)



at the critical value for which the eigenvalues ±iω are purely imaginary. Thus λ1 = iω, λ2 = −iω, and resonance occurs if k + (m1 − m2 − 1)ω = 0. Since k, m1, m2 are integers, there are obviously two cases.

15 The symbol ≡ indicates definition, as earlier in (3.111).




(i) ω is irrational. That is to say, the period of the instability is not commensurate with (not a rational multiple of) the period of the underlying orbit. In this case, resonance requires

k = 0, m1 = m2 + 1,   (3.134)

and we derive the normal form

ż = iωz + c1 z|z|² + · · · ,   (3.135)


just as for the ordinary Hopf bifurcation. (Remember, though, that z is describing orbits about the periodic orbit.) The form of (3.135) suggests that there is a periodic solution for z, and in terms of the original variable u, this means motion on a two-torus, as depicted in Fig. 3.10. However, while the creation of an invariant two-torus does indeed occur, the issue as to whether the resultant motion is periodic or doubly periodic (i.e. with two incommensurate frequencies) is more subtle, as we shall see.

(ii) ω is rational. Let us suppose ω = p/q (in lowest terms). Then kq + (m1 − m2 − 1)p = 0 for resonances, so that

k = pr, m1 = m2 + 1 − qr


determine the resonances, for any r ∈ Z with m1 + m2 ≥ 2. Define

z = e^{iωt} ζ,


and suppose that all the non-resonant terms have been removed. It follows that the equation (3.132) can be reduced to

ζ̇ = Σ_{m,k} v_{mk} ζ^{m1} ζ̄^{m2},   (3.138)

summed over values of m1 and m2 with m1 − m2 = 1 mod q,16 and this gives the normal form. With m1 = m2 + 1 − rq, we have

ζ^{m1} ζ̄^{m2} = |ζ|^{2m1} ζ̄^{rq−1} = |ζ|^{2m2} ζ^{1−rq}.


Therefore, for r = 0, all terms of the form |ζ|^{2m} ζ are resonant (and k = 0); if r ≠ 0, the lowest order terms are

r = 1: ζ̄^{q−1} + O(ζ^{q+1}); r = −1: O(ζ^{q+1});

16 Note that the terms in (3.138) are nonlinear, so that m1 + m2 ≥ 2.




Fig. 3.10 Secondary Hopf bifurcation, showing the secondary orbit on the invariant torus

therefore we can certainly write the normal form as

ζ̇ = ζ a(|ζ|²) + b ζ̄^{q−1} + O(ζ^{q+1}),   (3.141)

where a is a polynomial of degree [q/2].
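The bookkeeping in cases (i) and (ii) can be checked by brute force: enumerate the monomials ζ^{m1} ζ̄^{m2} surviving in (3.138), i.e. those with m1 − m2 ≡ 1 (mod q). This short sketch is our own check, not code from the text.

```python
# Resonant monomials zeta^{m1} zetabar^{m2} (with m1 + m2 >= 2) in the reduced
# equation: those satisfying m1 - m2 = 1 (mod q), listed up to a given degree.
def resonant_terms(q, max_degree):
    terms = []
    for degree in range(2, max_degree + 1):
        for m1 in range(degree + 1):
            m2 = degree - m1
            if (m1 - m2 - 1) % q == 0:
                terms.append((m1, m2))
    return terms

print(resonant_terms(3, 3))   # → [(0, 2), (2, 1)]: zetabar^2 and |zeta|^2 zeta
print(resonant_terms(4, 3))   # → [(0, 3), (2, 1)]: zetabar^3 and |zeta|^2 zeta
```

For q = 3 this recovers the ζ̄² term of the strong-resonance normal form, and for q = 4 the ζ̄³ term, in agreement with Sect. 3.4.4.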

3.4.4 Weak and Strong Resonance

Suppose a complex pair of eigenvalues of the monodromy matrix M cross the unit circle at exp[±2πiω] transversely when μ = 0. By suspending the system with μ̇ = 0, and using the centre manifold theorem as before, we can take the general normal form for ζ just as in (3.141), but with the coefficients of a depending on μ, thus

a = a0(μ) + a1(μ)|ζ|² + · · · ,

where the ai(μ) are power series expansions in μ, with a0(0) = 0, and transversality implying a0′(0) ≠ 0 (and particularly, Re a0′(0) ≠ 0). We divide the case of rational ω = p/q into two sub-cases: first, if q ≥ 5 we have weak resonance.17 If q ≥ 5, then O(ζ^{q−1}) ≤ O(ζ⁴) ≪ O(|ζ|²ζ), and therefore the generic leading order expansion of (3.140) is the same as for non-resonance (ω irrational), that is,

ζ̇ = a0ζ + a1|ζ|²ζ + · · · ,

and in the same way as for the Hopf case, we have that there is an invariant periodic solution given by |ζ|² = −(Re a0)/(Re a1) = O(μ), whose stability depends on Re a1, just as before. This immediately implies the existence of an invariant torus. However, the frequency is only prescribed with a small error, and we cannot yet pronounce on the issue of quasi-periodicity.

The four cases of strong resonance have particular results. When q = 1, we have the saddle-node bifurcation (ω = 0, e^{2πiω} = 1); when q = 2, we have the period-doubling bifurcation (ω = 1/2, e^{2πiω} = −1). When q = 3, the normal form is


and in the same way as for the Hopf case, we have that there is an invariant periodic solution given by |ζ |2 = −(Re a0 )/(Re a1 ) = O(μ), whose stability depends on Re a1 , just as before. This immediately implies the existence of an invariant torus. However, the frequency is only prescribed with a small error, and we cannot yet pronounce on the issue of quasi-periodicity. The four cases of strong resonance have particular results. When q = 1, we have the saddle-node bifurcation (ω = 0, e2πiω = 1); when q = 2, we have the perioddoubling bifurcation (ω = 21 , e2πiω = −1). When q = 3, the normal form is

17 The resonance referred to here is that between primary and secondary periods, not that involved in removing nonlinear terms from the system.


ζ̇ = a0ζ + bζ̄², q = 3,

and a single unstable 6π-periodic solution exists for μ ≷ 0. When q = 4, the normal form is

ζ̇ = a0ζ + a1|ζ|²ζ + bζ̄³,   (3.144)

and there are either two, or no, 8π-periodic solutions for μ ≷ 0. The general case (in the absence of particular symmetries or pathologies) is the case of weak resonance or non-resonance, and we therefore focus our attention on it.

3.4.5 Circle Maps, Arnold Tongues, Frequency Locking

The normal form for weak or non-resonant cases is

ζ̇ = a(|ζ|²; μ)ζ + b ζ̄^{q−1} + O(ζ^{q+1}),


where b = 0 if ω is irrational. Recall that this describes the motion of z = e^{iωt}ζ, where the original flow variable was written (more or less) as u = zp(t) + z, with zp being 2π-periodic. Also, since the right hand side of (3.138) has no linear terms when μ = 0, we have a(0; 0) = 0. Put

ζ = r e^{iθ}, b = |b| e^{iφ}, a(r²) = aR(r²) + i aI(r²),


so that

ṙ = r aR(r²) + |b| r^{q−1} cos(qθ − φ) + · · · ,
θ̇ = aI(r²) − |b| r^{q−2} sin(qθ − φ) + · · · .   (3.147)


We expand aR and aI as

aR = a0R μ + a1R r² + · · · , aI = a0I μ + a1I r² + · · · ;


an invariant circle exists for a0R μ/a1R < 0, whose stability is opposite to that of the fixed point at the same value of μ. On these circles, r ≈ (−a0R μ/a1R)^{1/2}. The existence of the invariant circles (tori for the original flow) follows from the use of the implicit function theorem. We are now interested in the dynamics of the trajectories on these curves. From (3.147)2, θ̇ ≪ 1 (since a(0; 0) = 0), so that we can derive an approximate Poincaré map by integrating from 0 to 2π (corresponding to a single orbit of up). We get the map

θ → θ + 2πaI − 2π|b| r^{q−2} sin(qθ − φ) + · · · ,   (3.149)



or, defining θ̃ = θ − (φ/q) + (π/q), α = aI, β = 2π|b| r^{q−2}, this can be written as

θ̃ → θ̃ + 2πα + β sin qθ̃.   (3.151)



Finally, it is convenient to interpret this map for the phase of z = e^{iωt}ζ rather than that of ζ. If z = r e^{iΘ}, then the map for Θ corresponding to (3.151) is just

Θ → Θ + 2πΩ + β sin qΘ,   (3.152)

since over one period Θ increases by 2πω = 2π(p/q) together with the increment of θ̃, and sin qθ̃ = sin qΘ at the section; here

Ω = ω + α.


Note that ω = p/q, α ∼ μ, β ∼ μ^{(q−2)/2}. Equation (3.152) is an example of a circle map, and we wish to examine its solution structure for α, β ≪ 1. Note also that we can take Θ modulo 2π in (3.152).

Comment Our analysis will apply for each fixed ω = p/q as μ → 0. In practice, we can take indefinitely large values of q, but it should be observed that the smoothness of (3.152) breaks down as q → ∞, since the map is non-monotonic for βq > 1. That is to say, the limits β → 0, q → ∞ do not commute, and for large values of q, very small values of β (hence μ) are necessary. For βq > 2, period-doubling in the circle map occurs, leading to chaos. By embedding the circle map in the original two-dimensional phase space, one can obtain an understanding of how invariant tori can break down to form strange invariant manifolds by the formation of Smale horseshoes, but such a discussion is beyond the scope of this chapter (however, see Chap. 4).

We now analyse the dynamics of the map (3.152). Consider first the unperturbed map,

Θ → Θ + 2πω,   (3.154)

where ω = p/q. It is clear that after q iterates of the map, an initial value of Θ returns to its original value: thus every value of Θ lies on a period q cycle. This degeneracy disappears under the influence of a mild nonlinearity. In fact, for α, β ≠ 0 in (3.152), q-cycles exist if

2π|α| < β.   (3.155)

To show this, take the qth iterate of (3.152): since α, β are small, and since sin qΘ = sin q(Θ + 2πω), we have

Θ^{(q)} ≈ Θ + 2πqΩ + qβ sin qΘ + · · · ,




where Θ^{(q)} is the qth iterate of (3.152), thus

Θ^{(q)} ≈ Θ + 2πqα + qβ sin qΘ.   (3.157)


If β > 2π|α|, there exist q pairs of fixed points of (3.157), as illustrated in Fig. 3.11, which are alternately saddles and stable (or unstable) nodes. The existence of these fixed points is proved, for fixed q and sufficiently small μ, by the implicit function theorem. We have to show that this result is all that happens; that is, there are no other cycles for β > 2π|α|, and no cycles at all for β < 2π|α| (and this then includes the non-resonant case). In order to do this, we define the winding, or rotation, number of a circle map. Consider, in particular, the circle map

Θ → A(Θ) ≡ Θ + a(Θ),


where a′ > −1 and a is 2π-periodic. This is an example of an orientation-preserving map. Now put

ak(Θ) = a(Θ) + a[A(Θ)] + · · · + a[A^{k−1}(Θ)].

Evidently ak is the total angle of rotation after k iterates of A. Now we have the following

Theorem The limit ρA = lim_{k→∞} [ak(Θ)/2πk] exists, and is independent of Θ. It is called the rotation number, or winding number, and is the average rate of rotation of A.

It is remarkable enough that the rotation number can even be defined. More remarkable are certain properties which can easily be proved about it.

Proposition 1 ρA is a continuous function of A (in the supremum norm) for 0 < ρA < 1.
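The rotation number is easy to estimate numerically from the lift of a circle map. The sketch below (our own illustrative code and parameter values) applies this to (3.152) with q = 2: inside the tongue criterion (3.155) the estimate locks to exactly ½ (up to the O(1/n) truncation of the limit), while outside it does not.

```python
import math

def rotation_number(Omega, beta, q, n=100000):
    """Estimate rho for the lift Theta -> Theta + 2*pi*Omega + beta*sin(q*Theta)."""
    theta = 0.0
    for _ in range(n):
        theta += 2.0 * math.pi * Omega + beta * math.sin(q * theta)
    return theta / (2.0 * math.pi * n)

q, beta = 2, 0.2
rho_in = rotation_number(0.5 + 0.01, beta, q)   # 2*pi*alpha ~ 0.063 < beta: locked
rho_out = rotation_number(0.5 + 0.09, beta, q)  # 2*pi*alpha ~ 0.57 > beta: not locked
# rho_in = 1/2 to within O(1/n); rho_out stays near 0.59
```

Sweeping Ω across [0, 1] at fixed small β and plotting ρ against Ω would trace out the devil's staircase discussed below.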

Fig. 3.11 For an attracting torus, the q-cycles are alternately saddles and stable nodes


Fig. 3.12 Arnold tongues



Proposition 2 Let Aε be a family of orientation-preserving maps depending on a parameter ε, monotonically increasing with ε; then ρ_{Aε} is also monotone increasing, in ranges of ε where Aε has no fixed point.

Proposition 3 ρA is rational if A has a periodic orbit.

Lastly, we have two important theorems.

Denjoy's theorem If A is orientation-preserving, ρA ∉ Q, and A ∈ C², then A is topologically equivalent (homeomorphic) to a rotation by 2πρA (and so the motion of the original system on the invariant torus under the action of A is doubly periodic).18

Herman's theorem For almost all ρA, and if A ∈ C³, A is diffeomorphic to a rotation by 2πρA.

Now let us construct the bifurcation diagram in terms of the results we have already obtained for the circle map (3.152). The distinction between α and β, or between Ω and β, suggests that it may be convenient to portray results graphically in terms of two parameters, rather than just one (μ). We use Ω and μ as the two parameters, which makes sense, since as μ is changed, the frequency of the secondary oscillation will change, so that changing μ in a dynamical system will generally lead to a curvilinear path in (Ω, μ) space. We now illustrate the criterion (3.155) in (Ω, μ) space. It states that period q-cycles occur within Arnold tongues defined by

|Ω − p/q| ≤ c μ^{(q−2)/2}, q ≥ 5,


as is illustrated in Fig. 3.12. We call these periodic motions frequency locked, and speak of phase entrainment within the tongues. In general, the frequency ratio Ω of the circle map for the phase on the invariant torus will vary as μ varies, and so as μ

18 A function of time is doubly periodic if it can be written in the form f(ω1 t, ω2 t), where f(x, y) is 2π-periodic separately in each of its arguments, and ω1/ω2 is irrational.



varies above zero, the system will pass through many Arnold tongues. Practically, only the largest of these will be numerically or experimentally observable, and so one can expect that at a secondary Hopf bifurcation the typical sequence will be, first, doubly periodic motion, and then passage through a low-q (hence wide) tongue, with resultant frequency locking. This is often seen experimentally, and can be followed by the period-doubling passage to chaos alluded to earlier, as the map on the invariant torus becomes non-monotonic.

What of the winding number ρ(Ω, μ)? Since ρ is rational inside the tongues (Proposition 3) and continuous (Proposition 1), it is in fact constant in a tongue, and evidently ρ = p/q for the tongue emanating from (p/q, 0). Between tongues, the map has no fixed points; therefore (Proposition 2) ρ is monotone increasing. Since it is rational (= p/q) precisely in the tongues, but not elsewhere (else it would not be monotonic), it follows that ρ is irrational outside the tongues. It is a weird, undrawable function which is called a devil's staircase. Denjoy now tells us the motion is homeomorphic to a rotation by 2πρ. However, if we wish the conjugacy to a rotation to be a diffeomorphism, then we must apply Herman's theorem. We have now essentially proved what we need to verify the bifurcation diagram in Fig. 3.12.

Before proceeding, it is useful to indicate constructively the meaning of Denjoy's and Herman's theorems. The idea is that we seek a diffeomorphism h of a smooth circle map f which is a rotation by α = 2πρ(f), i.e. we solve

h[f(θ)] = h[θ] + α,   (3.161)


given f, itself close to a rotation. Suppose

f = Rα + F, h = I + H,


where Rα is rotation by α, and I is the identity; we then find from (3.161) that

H[θ + α + F(θ)] − H(θ) = −F(θ).


For small F, this is approximately the linear equation

H(θ + α) − H(θ) = −F(θ),   (3.164)

and if

F = Σm am e^{2πimθ}, H = Σm bm e^{2πimθ},

then

bm = am / (1 − e^{2πimα}).   (3.166)






The whole procedure is just like that of normal forms. The behaviour of the Fourier coefficients am depends on the smoothness of F, and thus of f. Integration by parts of

am = (1/2π) ∫_{−π}^{π} F(θ) e^{−imθ} dθ   (3.167)

shows that am = O(1/m) if F is discontinuous, am = O(1/m²) if F is continuous, and generally am = O(1/m^{k+2}) if F ∈ C^k. If F is analytic, then generally am = exp[−O(m)] (see question 3.8). The convergence of the Fourier series for H thus depends on the behaviour of the denominator in (3.166). Supposing H exists, one can prove that the nonlinear equation (3.161) has a solution. But (3.164) possesses a difficulty analogous to resonance in normal forms, called the small divisor problem. It is obvious that if α is rational, the series does not exist (some of the bm are infinite), and the procedure is void. Even if α is irrational, it may be sufficiently 'close' to rational to give problems. How do we measure how close to being rational an irrational number is? It can be elegantly shown19 that, for any irrational α, there is a sequence of rational approximations p/q with

|α − p/q| < 1/q².   (3.168)

Conversely, given any σ > 0, there is a number K(α) > 0 for almost all α in any bounded interval (and thus on the whole real line), such that for any fraction p/q (with q > 0 without loss of generality),

|α − p/q| ≥ K/q^{2+σ} (σ > 0).   (3.169)

To see this, suppose 0 < α < 1. Let

I = {α : ∃ K(α) s.t. |α − p/q| ≥ K/q^{2+σ} ∀ p, q (q > 0)}.

The complement of this set is

Ī = {α : ∀ K > 0 ∃ p, q s.t. |α − p/q| < K/q^{2+σ}}.

19 Using the continued fraction approximation for an irrational number.




To show that (3.169) is satisfied for some K(α) for almost all α, we need to show that the measure of Ī is zero. We define two further sets:

J(K) = {α : ∃ pα, qα s.t. |α − pα/qα| < K/qα^{2+σ}}, K fixed,
B_{p,q}(K) = {α : |α − p/q| < K/q^{2+σ}}, K, p, q fixed.   (3.172)

It follows that if α ∈ J(K), then α ∈ ∪_{p,q} B_{p,q}(K), i.e.

J(K) ⊆ ∪_{p,q} B_{p,q}(K).   (3.173)



Denote the measure of a set X as M(X). As we are dealing with unions of intervals, the measure is just the sum of their lengths, or less than this if some of the intervals have non-empty intersection. It is clear from (3.173) that

M[J(K)] ≤ M[∪_{p,q} B_{p,q}(K)].   (3.174)



We now obtain an upper bound for the expression on the right-hand side of this inequality. First fix K, q and p. We take K < 1 without loss of generality, so that

K/q^{2+σ} < 1/q.   (3.175)

For fixed p and q, the measure of B_{p,q}(K) is

M[B_{p,q}(K)] ≤ 2K/q^{2+σ},


(just the interval of this length about p/q).20 Summing over p = 0, 1, . . . , q, and noting that we can exclude the left half interval for p = 0 and the right half interval for p = q, and also that we can ignore all values p < 0 and p > q in view of (3.175), we see that the total measure of the union is

M[∪_p B_{p,q}(K)] ≤ 2K/q^{1+σ},   (3.177)

20 It will be less than this if p/q is near either end point.



and summing this over q, we have

M[∪_{p,q} B_{p,q}(K)] ≤ K C(σ), C(σ) = 2 Σ_{q≥1} 1/q^{1+σ} < ∞.

It follows from (3.174) that M[J(K)] ≤ K C(σ).


From the definition of J(K) in (3.172), we have J(K1) ⊆ J(K2) if K1 < K2, and therefore we have

Ī = ∩_{K>0} J(K) = ∩_{0<K<ε} J(K),   (3.180)

for arbitrary ε, and hence M(Ī) ≤ M[J(ε)] ≤ ε C(σ); since ε is arbitrary, M(Ī) = 0, as required.

We can now apply this to the denominators in (3.166). Note first that |e^{iθ} − 1| = 2|sin ½θ| > 2|θ|/π for 0 < |θ| ≤ π. Now for almost all α and σ > 0, we have

|α − n/m| ≥ K/m^{2+σ}

for all m and n, and thus

|αm − n| ≥ K/m^{1+σ}.

Therefore

|1 − e^{2πimα}| = |e^{2πi(αm−n)} − 1| > 4|αm − n| ≥ 4K/m^{1+σ},

and so from (3.166)

|bm| < |am| m^{1+σ}/(4K).   (3.185)

If we suppose F ∈ C^r, then am = O(1/m^{r+2}), and thus bm = O(1/m^{r+1−σ}) for any positive σ, thus certainly bm ≤ O(1/m^r), so that h ∈ C^{r−2}. We get Denjoy's theorem from r = 2, Herman's from r = 3.21
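The small divisors can be exhibited directly. In this sketch (ours, not from the text), we compare the golden mean, which satisfies a Diophantine bound of the type (3.169), with a number very close to 1/3: for the former, m|1 − e^{2πimα}| stays bounded away from zero, while for the latter it collapses at multiples of 3.

```python
import math

def divisor(alpha, m):
    """|1 - exp(2*pi*i*m*alpha)| = 2*|sin(pi*m*alpha)|, without complex arithmetic."""
    return 2.0 * abs(math.sin(math.pi * m * alpha))

golden = (math.sqrt(5.0) - 1.0) / 2.0       # badly approximable by rationals
near_rational = 1.0 / 3.0 + 1e-8            # 'too close' to 1/3

worst_golden = min(m * divisor(golden, m) for m in range(1, 2001))
worst_near = min(m * divisor(near_rational, m) for m in range(1, 2001))
# worst_golden stays O(1); worst_near is ~ 5.7e-7, attained at m = 3
```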
> 0 and Re β < 0.

3.8 Suppose f(x) is a 2π-periodic function which is analytic, and has a Fourier series Σk ak e^{ikx}. Define a function g(z) in the complex plane via

g(e^{iθ}) = f(θ),

and write down an expression for ak in terms of a contour integral of g. Hence show that if g is analytic in the annulus r ≤ |z| ≤ R, where r < 1 and R > 1, then ak = exp[−O(|k|)] as |k| → ∞.

3.9 The complex Lorenz equations are defined by

ẋ = −σx + σy,
ẏ = (R − z)x − ay,
ż = ½(x ȳ + x̄y) − bz.

Here, x and y are complex, the overbar denotes the complex conjugate, z is real, σ and b are real and positive, a = 1 − ie, and R = r + is. Assume e + s ≠ 0. Taking r as the bifurcation parameter, show that a Hopf bifurcation occurs at r = r_H with frequency ω, and find these values. Show that the bifurcating solution has an exact representation in which x = A e^{iωt}, and give it explicitly; in particular, show that |A|² = b(r − r_H). By writing x = X e^{iωt}, y = Y e^{iωt}, z = Z, show that there is a continuum of steady states for X, Y, Z (and thus each has one neutrally stable eigenmode). By linearising, show that a secondary Hopf bifurcation occurs at r = r_{SH} when A² (taken as real) satisfies the quadratic

q1 A⁴ + q2 A² + q3 = 0,

where

q1 = (β + γ)(γ − b − β),
q2 = β(b + 2β)(2βγ − 2bβ − bγ) − α(−bγ + b² + βb + 2β² + 2βγ),

3.7 Exercises


q3 = −2αβb(α + b² + 2βb),

and α = (σ + 1)² + (2ω − e)², β = σ + 1, γ = 2σ. [For further information, see Fowler et al. (1982).]

3.10 [Elliptic functions and the Lorenz equations] Jacobian elliptic functions are defined in the following way. First we define, for |m| < 1,

u = ∫₀^φ dθ / (1 − m sin²θ)^{1/2}.

Then the functions sn u, cn u and dn u are defined via

sn u = sin φ, cn u = cos φ, dn u = (1 − m sin²φ)^{1/2}.

The complete elliptic integral of the first kind is defined by

K(m) = ∫₀^{π/2} dθ / (1 − m sin²θ)^{1/2}.

Show that sn u, cn u and dn u are periodic, with periods 4K(m), 4K(m) and 2K(m), respectively. Show that if

f(ξ) = sn ξ, g(ξ) = cn ξ, h(ξ) = dn ξ,

then the derivatives are

f′ = gh = √[(1 − f²)(1 − mf²)],
g′ = −fh = −√[(1 − g²)(1 − m + mg²)],
h′ = −mfg = −√[(1 − h²)(m − 1 + h²)].

The Lorenz equations are given by

ẋ = −σx + σy, ẏ = (r − z)x − y, ż = xy − bz.



By rescaling the variables, show that they may be written in the form

ξ̇ = η − εσξ, η̇ = wξ − εη, ẇ = −ξη + εb(1 − w),

and give the definition of ε. Show that if (ξ, η, w) is a solution, then so is (−ξ, −η, w). Show that if r ≫ 1, two approximate first integrals are

w² + η² = B², w + ½ξ² = D,

where we may take B > 0, and hence deduce that ξ vanishes somewhere on the orbit if −B < D < B, whereas ξ is of one sign if D > B. Show that in this case

ξ ≈ ±√[2(B + D)] dn[√(B/m) t], m = 2B/(B + D),

and determine the corresponding values of η and w.



What happens to the averaging procedure as D → B? What do you think might happen in the original equations in this case? [For further information, see Sparrow (1982) and Fowler (1984); but beware an apparent algebraic error in equation (2.7) of the latter paper.]

3.11 A circle map for the qth iterate of the angle variable Θ is given by

Θ → Θ + A + B sin qΘ,

where q is an integer corresponding to the resonance p/q in the frequency ratio on a secondary Hopf-bifurcating torus, A is a measure of the perturbation of the linearised frequency ratio from this resonance value, and B is a measure of the nonlinearity of the perturbation. Generally, if μ is an actual perturbation parameter, then we would have A ∼ μ and B ∼ μ^{(q−2)/2}, but we will ignore this inessential detail and use A and B as independent parameters; we take B > 0. (Note that in this case Fig. 3.12 is squashed vertically, so that all the Arnold tongues lie in the region |A| < B.) First assume that also A > 0. Show that if B > A and

φ = sin⁻¹(A/B) ∈ (0, ½π),

then the fixed points of the map are at

Θ = υr = 2rπ/q − φ/q, Θ = σr = (2r − 1)π/q + φ/q, r = 1, . . . , q.

By calculating the slope of the map at the fixed points, show that each υr is unstable, but that each σr is stable for sufficiently small B. Show further that the fixed points σr undergo a period-doubling bifurcation at

B² = A² + 4/q²,

and draw the corresponding bifurcation curve inside the Arnold tongue in (A, B) space, and also in the stretched version of Fig. 3.12. What does this imply for high order resonances for which q ≫ 1?

Now suppose that q ≫ 1. Show in this case that the period-doubling bifurcation occurs at

B ≈ A[1 + 2/(q²A²)],

assuming qA ≫ 1. Next (focussing on r = 1), define

B = A[1 + 2η/(q²A²)], Θ = (1/q)[3π/2 + 2u/(qA)];



show that the map takes the approximate form u → u + u² − η, and find the affine transformation which transforms this to the logistic map in the form ζ → 1 − λζ², and find the definition of λ in terms of η. Extend the results to A < 0.

Chapter 4

Homoclinic Bifurcations

In the previous chapter, we showed how motion on invariant two-tori near secondary Hopf bifurcations was described by circle maps. If we take such maps, and increase the nonlinearity (increase μ), then the orbits of the map undergo a transition to chaotic behaviour. This transition to aperiodicity is associated with non-monotonicity of the map, and consequent period-doubling. Now, the map becomes non-invertible when this occurs, and as such, it can no longer model the Poincaré map of a differential equation (it is also in this case no longer derived from an ordinary differential equation, since the derivation for secondary Hopf bifurcation requires the circle map to be almost monotone). In order to maintain some contact with reality, this problem can be circumvented by embedding the non-invertible circle map in an invertible map of the annulus, as shown in Fig. 4.1. More or less analogous results can then be obtained, and the transition to chaos studied. This transition, via the ‘breakdown of tori’, forms one of the popularly dubbed ‘routes to chaos’ in differential equations. Thus we have a list of ‘mechanisms’: period-doubling, intermittency, breakdown of 2-tori, Newhouse–Ruelle–Takens. In addition, in this chapter we show that homoclinic connections between (for example) fixed points of a flow lead to the creation of strange invariant sets, and hence generate the topological skeleton of chaos. (The muscle is added when these sets become attracting, which they are typically not at their birth.) In fact, it is a misconception to think of the different phenomenologies listed above as somehow constituting distinct kinds of behaviour. Rather, it is better to think of one basic mechanism (one bifurcation) for generating chaotic trajectories, and the different routes simply represent different approaches to this bifurcation point. Why should we say this? The point is that some of the ‘routes’ to chaos are blindfolds.
‘Intermittency’ is a prime example; in essence, it results from a saddle-node bifurcation in a one-dimensional map. As the stable fixed point coalesces with its unstable neighbour, one may see a transition to chaotic behaviour; but the saddle-node bifurcation is completely irrelevant to the origin of the chaotic behaviour. In a certain way, one could argue that while the breakdown of two-tori may shed light on the transition from doubly periodic motion to chaos in differential equations, it

© Springer Nature Switzerland AG 2019 A. Fowler and M. McGuinness, Chaos




Fig. 4.1 Map of an annulus: D → f (D)


begs the basic question of what happens to the orbit structure of the equation, which causes the fold in the map. In this chapter we study one mechanism, perhaps the basic mechanism, whereby strange sets of trajectories can be generated. This is the homoclinic bifurcation. One may in fact conjecture that homoclinic bifurcations must play a rôle in all the routes to chaos, and in this sense, and so long as our principal focus is on the dynamics of real differential equations and not just arbitrary maps, their existence is central to the study of chaos. Let us begin with the following

Definition A homoclinic orbit Γ to a fixed point (say x = 0) of a system ẋ = f(x) is a trajectory x(t) such that x(±∞) = 0 (as shown in Fig. 4.2).

In a similar way, one can have homoclinic orbits to a limit cycle. The resulting Poincaré map is said to have a homoclinic connection, a matter we shall return to in Chap. 5. Other homoclinic connections are possible, for example, to a torus.

Definition A heteroclinic orbit between two fixed points a, b of a system ẋ = f(x) is a trajectory x(t) such that x(−∞) = a, x(+∞) = b.

Similarly, one can have heteroclinic connections between limit cycles, tori, etc. One can have families of heteroclinic connections, which combine to form a large

Fig. 4.2 A homoclinic orbit







Fig. 4.3 The nonlinear oscillator with potential V = −½x² + ¼x⁴ has a symmetric pair of homoclinic orbits when E = 0

homoclinic loop. The analysis of the bifurcations associated with all these sorts of trajectory is exemplified by the simplest type of homoclinic connection, which we study here. First, we give an example.

Example The nonlinear oscillator

ẍ + ∂V/∂x = 0,   (4.1)

with energy integral ½ẋ² + V(x) = E, has a homoclinic orbit if V(0) = E, V′(0) = 0, V″(0) < 0, and if there is a value x* ≷ 0 with V(x*) = E, V′(x*) ≠ 0, as shown in Fig. 4.3. For example, with V = −½x² + ¼x⁴, there is a homoclinic orbit when E = 0. Note the following: homoclinic orbits only exist for isolated value(s) of the parameter E; in two-dimensional autonomous systems, they are not associated with any exotic behaviour (because trajectories cannot intersect). This latter feature changes dramatically in three dimensions.
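For this potential the homoclinic orbit is available in closed form, x(t) = √2 sech t (a standard result, though not quoted in the text), and the energy can be checked to vanish identically along it:

```python
import math

def x_h(t):
    """Homoclinic orbit x(t) = sqrt(2) sech t of xddot - x + x^3 = 0."""
    return math.sqrt(2.0) / math.cosh(t)

def energy(t):
    """E = xdot^2/2 + V(x), V = -x^2/2 + x^4/4, along the homoclinic orbit."""
    x = x_h(t)
    xdot = -math.sqrt(2.0) * math.tanh(t) / math.cosh(t)
    return 0.5 * xdot * xdot - 0.5 * x * x + 0.25 * x ** 4

for t in (-5.0, -1.0, 0.0, 2.0, 10.0):
    assert abs(energy(t)) < 1e-12          # E = 0 all along the orbit
# x_h(0) = sqrt(2) is the turning point x*; x_h -> 0 as |t| -> infinity
```

Its mirror image −x(t) gives the symmetric partner shown in Fig. 4.3.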

4.1 Lorenz Equations

The equations

ẋ = −σx + σy,
ẏ = (r − z)x − y,
ż = xy − bz,   (4.2)


were introduced to the meteorology literature by Lorenz in 1963. The parameters σ, b, r are positive, and conventionally, σ and b are fixed, while r may vary, and is

Fig. 4.4 The bifurcation diagram for the ‘action’ (period × amplitude) for the Lorenz equations (4.2) as r varies, with σ = 10, b = 8/3. The value rH is the Hopf bifurcation value r = 24.74 and rh is the homoclinic bifurcation value r = 13.926






considered to be the bifurcation parameter. When r < 1, the only fixed point is the origin; for r > 1, a bifurcation occurs, so that two further fixed points at z = r − 1, x = y = ±√[b(r − 1)] exist (the bifurcation is of pitchfork type because the system possesses the symmetry x, y, z → −x, −y, z). If we take Lorenz's values σ = 10, b = 8/3, then the non-zero fixed points have a subcritical Hopf bifurcation at r = rH ≈ 24.74. It is possible to use numerical techniques to compute this periodic orbit (even though it is unstable) as r decreases. One finds that its period increases as r decreases, and apparently tends to infinity as r → rh ≈ 13.926. To plot this behaviour (amplitude → 0 as r → rH, period → ∞ as r → rh), it is convenient to introduce a measure of the orbit, such as the ‘action’ A defined by

A = period × amplitude.   (4.3)

Then the bifurcation diagram for A versus r for the Hopf-bifurcating periodic orbit is as shown in Fig. 4.4. The limit of the orbit as r → rh is a homoclinic orbit, and we shall see in the sequel that it is more useful to think of the periodic orbit as being

Fig. 4.5 The symmetric pair of homoclinic orbits to the origin for the Lorenz equations at r = rh



4.1 Lorenz Equations


generated at rh and absorbed at r H , rather than the other way round. Mainly, this is because the homoclinic bifurcation at rh actually produces an infinite number of periodic and aperiodic orbits, some of which survive to become the strange attractor which gives rise to observed chaotic behaviour at a value r ≈ 24.06. In this sense, the Hopf bifurcation is largely irrelevant to the existence of chaos. The homoclinic orbits (a symmetric pair) are shown in Fig. 4.5.

4.1.1 Homoclinic Bifurcations

We analyse the orbit structure in the neighbourhood of a homoclinic orbit by constructing an approximate Poincaré map. We will illustrate the formal procedure of how this is done by constructing such a map for the Lorenz equations. Later on, we will show how the technique is used on other examples. By a linear transformation, the Lorenz system can be written in the form ẋ = f(x, μ), x = (x1, x2, x3) ∈ R³, where x3 = z, μ = r − rh, and the Jacobian at the origin is diagonal. Specifically, the eigenvalues of the linearised system at the origin corresponding to motion in the (x, y) plane are λ+ > 0 and −λ− < 0, where

λ± = ½ [ {(σ − 1)² + 4rσ}^{1/2} ∓ (σ + 1) ].
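As a quick check of this formula, the sketch below (illustrative, using the values σ = 10, r = 13.926 quoted in the text) verifies the product λ+λ− = σ(r − 1), which follows because the two rates are the roots of λ² + (σ + 1)λ − σ(r − 1) = 0, together with the eigenvalue ordering assumed shortly.

```python
import math

def origin_eigenvalues(sigma=10.0, r=13.926):
    # lambda_{+,-} = (1/2)[{(sigma - 1)^2 + 4 r sigma}^{1/2} -+ (sigma + 1)];
    # the z direction decouples with eigenvalue -b
    disc = math.sqrt((sigma - 1.0) ** 2 + 4.0 * r * sigma)
    lam_plus = 0.5 * (disc - (sigma + 1.0))
    lam_minus = 0.5 * (disc + (sigma + 1.0))  # the stable eigenvalue is -lam_minus
    return lam_plus, lam_minus

lp, lm = origin_eigenvalues()
# roots of lambda^2 + (sigma + 1) lambda - sigma (r - 1) = 0, so lp * lm = sigma (r - 1)
assert abs(lp * lm - 10.0 * (13.926 - 1.0)) < 1e-9
# ordering assumed in the text: |lambda_2| > lambda_1 > |lambda_3| = b > 0
assert lm > lp > 8.0 / 3.0 > 0.0
```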


The corresponding eigenvectors are

w+ = (σ, σ + λ+)ᵀ,  w− = (σ, σ − λ−)ᵀ,

and with the matrix of the eigenvectors P = (w+, w−), we have

(x1, x2)ᵀ = P⁻¹ (x, y)ᵀ,

and with a suitable scale factor, we can take

x1 = (λ−/σ − 1) x + y,
x2 = (λ+/σ + 1) x − y.










Fig. 4.6 The eigenspace of the Lorenz equations at the origin. The (x2 , x3 ) plane is the stable eigenspace, and the x1 axis is the unstable eigenspace. The arrows indicate the size of the eigenvalues

Fig. 4.7 Construction of an approximate Poincaré map. Two of the faces S and S+ of the small box B at the origin are labelled, of which the horizontal face S is the Poincaré surface. The two homoclinic orbits, Γ and its reflection, are also shown (these are in fact the primary homoclinic orbits for the Lorenz equations). For points P on S close to Γ, we map P to Q on S+ by linearising about the origin; to map Q to R on S, we linearise about Γ. The Poincaré map is the composition of the two maps

Under this transformation, f retains the symmetry as (x1, x2, x3) → (−x1, −x2, x3). Denoting the eigenvalues corresponding to the x1, x2, x3 axes as λ1 = λ+, λ2 = −λ−, λ3 = −b, we will assume that −λ2 > λ1 > −λ3 > 0, as is the case for the Lorenz equations at σ = 10, b = 8/3, r = 13.926: see Fig. 4.6. We first construct a box B : |xi| ≤ ci surrounding the origin, as shown in Fig. 4.7. The homoclinic orbits will pierce the top face of this box, S : x3 = c3, at x1 = 0, x2 = ±x2*, approximately, and they will leave through the faces x1 = ±c1 at x2 = x3 = 0, approximately.¹ Choose one face of the box through which (each) homoclinic orbit

¹We can in fact assume that these points are exact by making a further change of variable, so that the stable manifold is locally given by the (x2, x3) plane, and the unstable manifold is the x1 axis. The procedure is similar to the centre manifold construction. But if the box is small (|ci| ≪ 1), the error is small anyway, and the formal construction involved is identical.



passes. We will pick S+, given by x1 = c1. The map from S to S+ and back to S is a Poincaré map on the Poincaré surface S, and we construct an approximation to it. The approximation is made in two parts. If we consider orbits close to the homoclinic orbit Γ in x1 > 0, then inside the box B, if the ci are small enough, we may linearise the flow about the origin. On the other hand, for orbits sufficiently close to Γ outside B, we may linearise about the homoclinic orbit. In this way, we can construct the form of the Poincaré map. Precise construction of the coefficients requires numerical computations, which have not been done. This is of little consequence, however.

Inside the box B, we linearise the equations. The solution through the point (x10, x20, c3) in S is

x1 = x10 e^{λ1 t},  x2 = x20 e^{−|λ2| t},  x3 = c3 e^{−|λ3| t},

and this cuts the surface S+ : x1 = c1 of B at

t = (1/λ1) ln |c1/x10|.

The same result is true for orbits passing through S− : x1 = −c1. Thus points (x10, x20, c3) in S are mapped to (±c1, x2, x3) ∈ S±, where

x2 = x20 |c1/x10|^{−|λ2|/λ1},  x3 = c3 |c1/x10|^{−|λ3|/λ1},


and we take the upper (lower) sign if x10 > 0 (x10 < 0). Now let the point on S be mapped to a point very close (within O(|ci|)) to (c1, 0, 0). The trajectory emerging from this point will be close to Γ until it reaches S again. Therefore we can linearise the system about Γ. Using this notion (see also later discussion), it is plain that in the Poincaré map, points sufficiently close to Γ on S± will be affinely mapped² to S.³ Thus the map from S+ to S will be given by

(x1′, x2′ − x2*)ᵀ = A (x2, x3)ᵀ + μb;

here x2* is the x2 coordinate of the intersection of the homoclinic orbit through S+ with S. The inhomogeneous term is proportional to μ, which reflects the degree of translation of Γ for non-zero μ. Also, the entries of the matrix A will depend on ci, with A diverging as ci → 0 due to the nearby singularity. Combining the two maps, we find that an approximate Poincaré map from S to S is given by

²An affine map is simply a linear, inhomogeneous map.
³In fact, this can be formally derived by simply linearising the flow map from S± to S.



(x1′, x2′)ᵀ = (0, x2*)ᵀ + A ( x2 |x1/c1|^{|λ2|/λ1}, c3 |x1/c1|^{|λ3|/λ1} )ᵀ + μb,
for x1 > 0: A is a matrix and b is a vector. If x1 < 0, we can show from the symmetry that the same map is obtained, but with −x2*, −c3 and −b replacing x2*, c3 and b. Combining these results, we have that (x1, x2) in S is mapped, provided x1 ≠ 0, to (x1′, x2′) in S given by

x1′ = sgn(x1) [ a c3 |x1/c1|^{|λ3|/λ1} + μ b1 ] + d x2 |x1/c1|^{|λ2|/λ1},
x2′ = sgn(x1) [ x2* + a′ c3 |x1/c1|^{|λ3|/λ1} + μ b2 ] + d′ x2 |x1/c1|^{|λ2|/λ1},    (4.13)

where a, a′, d and d′ are the components of A. Two-dimensional maps are generally not particularly easy to analyse, but (4.13) possesses a crucial simplifying feature, which, moreover, generalises to higher dimensional problems. Recall the ordering

|λ2| > λ1 > |λ3|,


whence |λ3|/λ1 < 1 < |λ2|/λ1. Denote

δ2 = |λ2|/λ1 > 1,  δ3 = |λ3|/λ1 < 1.


Now we can suppose x2 ≲ c3, and |x1/c1|^{δ2} ≪ |x1/c1|^{δ3} since |x1| ≪ c1; therefore (4.13) can be (smoothly) approximated away from x1 = 0 by dropping the last term in each component equation, which shows that the Poincaré map is a perturbation of a one-dimensional map. We put c1 = c3,

x1 = c1 ξ,  x2 = c1 η,

and then, by rescaling (or redefining) μ, we can write (4.13) in the form

ξ′ = sgn(ξ) [ α|ξ|^{δ3} − μ ] + γ η|ξ|^{δ2},
η′ = sgn(ξ) [ η* + β|ξ|^{δ3} − ζμ ] + δ η|ξ|^{δ2},    (4.17)




Fig. 4.8 The form of the one-dimensional map (4.18) in the cases μ < 0 (left), μ = 0 (centre), and μ > 0 (right), illustrated using δ3 = 0.5, α = 1 and μ = −0.1, 0, 0.2

where α, β, γ , δ and ζ are the variously rescaled forms of the constants in (4.13). Note that (0, η∗ ) represents the intersection of the homoclinic orbit with S when μ = 0.

4.1.2 One-Dimensional Map

If we first neglect the small term in γ in (4.17)₁, then iterates (ξ, η) of the Poincaré map satisfy approximately

ξ′ = sgn(ξ) [ α|ξ|^{δ3} − μ ].    (4.18)
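The map (4.18) is easily explored numerically. A minimal sketch follows; the values α = 1, δ3 = ½ are illustrative choices, for which the small fixed point can be computed exactly from a quadratic.

```python
import math

def f418(xi, mu, alpha=1.0, delta3=0.5):
    # the one-dimensional map (4.18): xi' = sgn(xi)[alpha |xi|^delta3 - mu]
    s = 1.0 if xi >= 0.0 else -1.0
    return s * (alpha * abs(xi) ** delta3 - mu)

def escapes(xi0, mu, n=100, bound=0.5):
    # iterate until the orbit leaves a small neighbourhood of the origin
    xi = xi0
    for _ in range(n):
        if abs(xi) > bound:
            return True
        xi = f418(xi, mu)
    return False

# mu <= 0: nonzero starting points are expelled (the slope at 0 is infinite)
assert escapes(1e-6, mu=0.0)
# mu > 0: for delta3 = 1/2 the small fixed point solves xi = sqrt(xi) - mu exactly
mu = 0.2
s = (1.0 - math.sqrt(1.0 - 4.0 * mu)) / 2.0
xi_star = s * s
assert abs(f418(xi_star, mu) - xi_star) < 1e-12
# the fixed point persists over a moderate number of iterates (it is unstable,
# so floating-point drift would eventually expel it)
assert not escapes(xi_star, mu, n=30)
```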


The behaviour of iterates of ξ depends (as μ varies) on the sign of α, and this is the main coefficient which needs to be computed. If α > 0, we call the bifurcation type (a), and if α < 0, we call it type (b). It turns out that type (a) bifurcations are the usual case in the Lorenz equations, and so we suppose henceforth that α > 0. In Fig. 4.8, we sketch the behaviour of (4.18) as μ varies. Notice that the slope at ξ = 0 is infinite, since δ3 < 1. It is evident that for μ ≤ 0, all initial values other than ξ = 0 (at μ = 0) give trajectories which diverge away from zero. For μ > 0, however, we shall see that some values of ξ remain forever near ξ = 0. In fact, Fig. 4.9 shows that there are two non-zero fixed points of the map (given approximately, for small μ, by ξ ≈ ±(μ/α)^{1/δ3}). Denoting these by ±ξ*, and their pre-images by ∓χ, we see that the intervals (−ξ*, −χ) and (χ, ξ*) are both mapped into (−ξ*, ξ*). There will therefore be some points in this interval which continue to map into (−ξ*, ξ*) under all iterates of the map, and these form a set of points which constitute the invariant set for (4.18). It is fairly straightforward to see, by considering pre-images, that this invariant set will have measure zero; it is in fact a Cantor set.⁴ This one-dimensional map suggests a result concerning the invariant set which can be more easily seen by considering


⁴A Cantor set in an interval is one which is closed and uncountable but has no interior points and no isolated points.



Fig. 4.9 When α > 0, δ3 < 1 and μ > 0, the map (4.18) has two fixed points, and a strange invariant set




Fig. 4.10 Effect of the two-dimensional Poincaré map on the Poincaré section S. This assumes that η* < 0 in (4.17) (as is the case for the Lorenz equations)

the effect of the two-dimensional map (4.17) on a part of the Poincaré surface which encloses the region U = [−ξ*, ξ*] × [−η*, η*]. This is shown in Fig. 4.10. We portray the Poincaré surface S as the box AGHN, which contains the region U mentioned above, and the red dots mark the intersection of the pair of homoclinic orbits when μ = 0 with S, at (0, ±η*). The line DK represents the stable manifold of the origin, so that in fact points on this line do not return to S, as the trajectories terminate at the origin. The image of the strip ADKN is A′D′K′N′: it is uniformly expanding in ξ, and uniformly contracting in η to a value (since ξ < 0) η ≈ −η*, with a small width due to the term δη|ξ|^{δ2}, which forms a cusp as shown at the end (since δ2 > 1). The figure illustrates the case β = 0 in (4.17); otherwise the cusp is turned up or down like curled up toes (down) or jester’s shoes (up). Equivalently, the strip DGHK is mapped to D′G′H′K′. Thus the Poincaré map contracts uniformly in the η direction, and expands uniformly in the ξ direction. These are the hallmarks of hyperbolicity, and are the ingredients we need to find a strange invariant set.
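This uniform stretching and contraction can be checked directly on (4.17) by finite-differencing the map at a point with small |ξ|; all parameter values below are illustrative.

```python
def f417(xi, eta, mu=0.3, alpha=1.0, beta=0.5, gamma=0.1, d=0.1, zeta=1.0,
         eta_star=-0.2, delta3=0.5, delta2=2.0):
    # the two-dimensional map (4.17); the parameter values are illustrative
    s = 1.0 if xi >= 0.0 else -1.0
    a = abs(xi)
    xi1 = s * (alpha * a ** delta3 - mu) + gamma * eta * a ** delta2
    eta1 = s * (eta_star + beta * a ** delta3 - zeta * mu) + d * eta * a ** delta2
    return xi1, eta1

# finite-difference Jacobian entries at a point with small |xi|
h, xi0, eta0 = 1e-7, 0.05, 0.1
base = f417(xi0, eta0)
stretch = (f417(xi0 + h, eta0)[0] - base[0]) / h   # d xi'/d xi ~ |xi|^(delta3 - 1)
squash = (f417(xi0, eta0 + h)[1] - base[1]) / h    # d eta'/d eta ~ |xi|^delta2

assert abs(stretch) > 1.0   # expansion in the xi direction
assert abs(squash) < 1.0    # contraction in the eta direction
```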

Fig. 4.11 A cartoon indicating the effect of a Lorenz-like map on the potential constituents of the invariant set of the Poincaré map ψ : S → S



Definition A hyperbolic set for a map ψ is an invariant set which can be written as the direct sum⁵ Mu ⊕ Ms of two sets on which ψ is respectively uniformly expanding and uniformly contracting.

4.2 Symbolic Dynamics

By analogy with Fig. 4.10, consider the cartoon shown in Fig. 4.11, indicating the action of a map ψ analogous to that in Fig. 4.10, where attention is restricted to the vertical strips S1 : BCLM and S2 : EFIJ in Fig. 4.10. The images ψ(S1) and ψ(S2) are two horizontal strips as shown, and the intersections define four boxes

S_{ij} = ψ(S_i) ∩ S_j,  i, j = 1, 2.


Points in these boxes (since they lie in the vertical strips) will be mapped back to the horizontal strips as shown in Fig. 4.11, remembering that expansion causes them to intersect both strips, while contraction causes a uniform thinning. The images of the S_{ij} are thus four thin horizontal strips, and their intersections with the vertical strips form eight strips, two to each box S_{ij}, denoted by

ψ(S_{ij}) ∩ S_k = S_{ijk},  i, j, k = 1, 2.


⁵Essentially, this means that points in the set can be written as a coordinate pair (xu, xs), with xu ∈ Mu and xs ∈ Ms.



It is now obvious that we can define the n-th step in this iterative process, giving 2^n sets

S_{i1 i2 … in} = ψ(S_{i1 i2 … in−1}) ∩ S_{in},  ij = 1, 2,    (4.21)

each of which is of height ε^n, where ε is the vertical contraction under ψ. As n → ∞, we form a set of horizontal ‘whiskers’, each of which can be identified as ⋂_{n=1}^{∞} S_{i1 i2 … in} for a particular semi-infinite sequence {i1 i2 … in …}. These whiskers have measure zero, and are in one-to-one correspondence with the infinite sequences {ij}, ij = 1 or 2.

We repeat the process for ψ⁻¹ (the map is invertible⁶), and it is clear that in a similar way we form a set of vertical whiskers, ⋂_{n=0}^{∞} S_{i−n i−(n−1) … i−1 i0}, which may also be identified with semi-infinite sequences of ones and twos. The intersection of these two sets is (an) invariant set I for ψ, since by construction all points in I remain in I under both ψ and ψ⁻¹ (and hence, in the flow, as t → ±∞). I is the cross product of two Cantor sets C_H × C_V, where C_H represents the horizontal whiskers, and C_V represents the corresponding vertical whiskers. Since C_H can be identified by a forward semi-infinite sequence of ones and twos, and C_V likewise with a backwards sequence, any point in I is identified with an infinite sequence {i−n … i−1 i0 . i1 … im …}; these constitute the space of symbolic sequences on two integers, {1, 2}^Z. The action of ψ on I is, by construction, identical to that of the shift map σ on {1, 2}^Z: that is, if i = {… i−1 i0 . i1 …}, then σ(i) = {… i0 i1 . i2 …}; alternatively, if i_m denotes the m-th symbol of i (m ∈ Z), then σ(i)_m = i_{m+1}. In other words, the symbols i_m of a particular point i ∈ I (by isomorphism) tell us which strip, S1 or S2, the point ψ^k(i) lies in. For example, S_{12} = ψ(S_1) ∩ S_2, so points in S_{12} start in strip S1, and are mapped to strip S2. Similarly, a point x in S_{ijk} has x ∈ S_i, ψ(x) ∈ S_j, ψ²(x) ∈ S_k. Finally, the point {i1 i2 …} in C_H visits the strips S_{i1}, S_{i2}, … in order.

It is an immediate consequence of this that I is a strange invariant set in the following sense:

(i) I contains a countable number of periodic orbits (i.e. repeating symbolic sequences);
(ii) I contains an uncountable number of aperiodic orbits (i.e. non-repeating symbolic sequences);
(iii) I contains a dense orbit.⁷
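Property (i) can be illustrated with a few lines of code acting on finite words (a toy model of the isomorphism: a periodic point of ψ corresponds to a repeating word, and there are 2^n admissible words of length n, hence countably many periodic orbits in all).

```python
from itertools import product

def word_period(word):
    # least period of the bi-infinite sequence obtained by repeating `word`,
    # i.e. the smallest cyclic shift that leaves the word unchanged
    seq, n = tuple(word), len(word)
    return min(k for k in range(1, n + 1) if seq[k % n:] + seq[:k % n] == seq)

# 2^n words of length n on the symbols {1, 2} ...
assert len(list(product((1, 2), repeat=4))) == 16
# ... and repeating words give periodic orbits of the shift map
assert word_period((1, 2)) == 2
assert word_period((1, 1, 2)) == 3
assert word_period((1, 1)) == 1
```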

⁶Because we are solving ordinary differential equations to produce the Poincaré map; if we are solving partial differential equations, we can still manage but it requires some artistry.
⁷The orbit approaches every point in I arbitrarily closely.



Fig. 4.12 The Smale horseshoe








4.2.1 The Smale Horseshoe

The ideas expressed above are essentially those used in analysing the Smale horseshoe. This is illustrated in Fig. 4.12. The square is stretched and folded back on itself under a map, so that its image lies like a horseshoe across the original square. With essentially the same construction as above, we can see that the symbolic dynamics description applies. It is important that the set is hyperbolic for this to work. This means that strange invariant sets are often non-attracting because of the expanding direction. This is illustrated by the Hénon map

(x, y) → (1 + y − ax², bx),  a = 1.4, b = 0.3,    (4.22)


which maps a certain trapezium into itself, as shown in Fig. 4.13. Stretching and folding does occur, but not uniformly. Thus, although numerical studies suggest that the map is chaotic, it is less easy to prove. Figure 4.14 shows the attractor, as well as a magnification which indicates the fractal nature of the cross section.
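A minimal sketch of the iteration follows; the initial values and the bounding box are illustrative (the bound reflects the trapping trapezium of Fig. 4.13).

```python
def henon(x, y, a=1.4, b=0.3):
    # the Henon map (4.22): (x, y) -> (1 + y - a x^2, b x)
    return 1.0 + y - a * x * x, b * x

# iterate from the origin, discarding a transient
x, y = 0.0, 0.0
orbit = []
for n in range(2000):
    x, y = henon(x, y)
    if n >= 100:
        orbit.append((x, y))

# the orbit stays bounded: it settles onto the attractor of Fig. 4.14
# (note y is always 0.3 times the previous x: a uniform one-step contraction)
assert all(abs(px) < 1.5 and abs(py) < 0.45 for px, py in orbit)
```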

4.2.2 Strange Attractors

The real way to obtain a subsidiary bifurcation which generates a strange attractor (i.e. a strange invariant set which is attracting) is through heteroclinic connections. This can be illustrated schematically through an extension of the Lorenz map. If we suppose the one-dimensional map (4.18) suffices for discussion, then the problem is that most orbits near ξ = 0 are eventually blown away from the origin. In fact, for r < 24.06, they tend to one of the non-zero fixed points, as indicated schematically by Fig. 4.15 (note that we are offering only pictures here; there is no reason that the one-dimensional map should retain validity away from ξ = 0). The strange invariant set is unstable for small μ because A in Fig. 4.15 lies above the fixed point of the


Fig. 4.13 The Hénon map, given by (4.22). The trapezium ABCD is mapped to A′B′C′D′ under the map; extension and contraction occur, but not uniformly, and the map is a contraction on this trapezium. The coordinates of the vertices of the trapezium are A: (−1.33, 0.42); B: (1.32, 0.133); C: (1.245, −0.14); D: (−1.06, −0.5). These values are given by Strogatz (1994). Evidently, it is not easy to find such an attracting region, as the quadratic term means many trajectories are mapped to infinity

Fig. 4.14 The strange attractor of the Hénon map, obtained by iterating (4.22) from an initial value of x = y = 0. The left figure shows the overall shape of the attractor, and that on the right shows a magnification of the inset on the left. It can be seen in this that the attractor consists locally of three bands of points, and that the upper band itself consists of three sub-bands which, on magnification, reveal the same structure. In cross section, the attractor is thus a fractal

map at B. However, at higher μ, B can rise above A, and in this case the invariant set becomes attracting. The transition occurs when the fixed point at B has ξ = μ, and this corresponds in the differential equation to a heteroclinic trajectory from the origin to the unstable limit cycle. This also appears to be what happens in the Lorenz equations.



Fig. 4.15 The Lorenz map of (4.18) in Fig. 4.9 actually has a pair of stable fixed points at large amplitude, corresponding to the pair of fixed points in the Lorenz equations. The unstable fixed point B of the map corresponds to an unstable limit cycle. For small μ, the ordinate ξ′ = μ of A lies above the fixed point at B, but when it becomes lower than it at higher μ, the invariant set becomes attracting

4.3 Shilnikov Bifurcations

The Lorenz equations are somewhat special, since they possess a symmetry (which stems from their derivation from a physically symmetric problem), and it is only because of this that a strange invariant set is produced at the bifurcation. A single homoclinic orbit of Lorenz type produces only a single periodic orbit, as can be seen by considering the one-dimensional map for ξ > 0 only. Since the specific form of this map depends mainly on δ3 = |λ3|/λ1, we may expect that rather different behaviour can take place if either δ3 > 1, or, more importantly, if δ3 is complex. Notice that δ3 is the ratio of the least stable eigenvalue to the least unstable one, i.e. the ratio of the two eigenvalues closest to the imaginary axis. It is not surprising that this should be so, since in general the approach of the homoclinic orbit to the fixed point is along the least stable eigenvector; similarly, the departure along the unstable manifold is along the least unstable eigenvector. In fact, it is these two eigenvalues which control the dynamics near zero, and this is really why the Poincaré map is approximately one-dimensional. Moreover, this observation can be extended to higher dimensional systems. The case where the eigenvalues nearest the imaginary axis consist of a complex pair and a real eigenvalue was considered by Shilnikov in a sequence of papers in the 1960s. Homoclinic orbits to a fixed point, in this case, can be said to lead to Shilnikov bifurcations. The method of analysis is entirely analogous to that for the Lorenz equations, and we only sketch the derivation of the Poincaré map. Consider the system



Fig. 4.16 A homoclinic orbit of Shilnikov type. The orbit was computed from a solution of the Rössler equations u̇ = −v − w, v̇ = u + av, ẇ = b + w(u − c), with a = 0.181, b = 0.2, c ≈ 10.3134, with the direction of time reversed. For further detail, see the appendix

ẋ = −λ2 x − ωy + P(x, y, z, μ),
ẏ = ωx − λ2 y + Q(x, y, z, μ),
ż = λ1 z + R(x, y, z, μ),


and suppose P, Q, R are analytic and second order. (Analyticity is not necessary, in fact.) Assume λ1, λ2 > 0. Suppose at μ = 0 there exists a homoclinic orbit Γ to (0, 0, 0), as shown in Fig. 4.16. As for the Lorenz case, we construct a box B near the origin. It is a cylinder {0 < z < c′, 0 < r < c}, and we define S to be the circular face r = c, and S′ to be the top face z = c′. Orbits sufficiently close to the homoclinic orbit Γ and which pass through z > 0 will be mapped to S′ and thence back to a neighbourhood of the origin. Note that if z < 0 as a trajectory approaches S, then it subsequently wanders away from Γ, and no further information is possible. The orbit within B follows from a linear description of the flow, which leads to an (approximate) map from (z0, θ0) in S to (r′, θ′) in S′. Using cylindrical polars, the linearised flow satisfies

ṙ = −λ2 r,  θ̇ = ω,  ż = λ1 z,


with solution

r = r0 e^{−λ2 t},  θ = θ0 + ωt,  z = z0 e^{λ1 t};


the time to pass from S to S′ is

t = (1/λ1) ln (c′/z0),















Fig. 4.17 The map given by (4.32) in the case δ = 1.2 (left) and δ = 0.3 (right), plotted as ζ′ − μ versus ζ. Also plotted are ζ − μ versus ζ for μ = 0, −0.2, 0.3. Other parameters used are d = 0.5, ω = 5, λ1 = 1, and Φ1 = 0

so the map from S to S′ is approximately

r′ = c (z0/c′)^{λ2/λ1},  θ′ = θ0 + (ω/λ1) ln (c′/z0).
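The passage through the box can be checked directly: compute the transit time t = (1/λ1) ln(c′/z0) from the linear flow and verify that the exit point agrees with the closed-form map above. The rates below are illustrative.

```python
import math

# linearised flow in the cylinder B: r' = -lam2 r, theta' = omega, z' = lam1 z
lam1, lam2, omega = 1.0, 1.2, 5.0
c, cp = 0.1, 0.1          # cp plays the role of c' (the cylinder height)

def to_S_prime(z0, theta0):
    t = math.log(cp / z0) / lam1          # transit time to the top face z = c'
    r = c * math.exp(-lam2 * t)           # radial decay while rising
    theta = theta0 + omega * t            # uniform rotation
    return r, theta

r1, th1 = to_S_prime(1e-3, 0.2)
# agrees with the closed-form S -> S' map in the text
assert abs(r1 - c * (1e-3 / cp) ** (lam2 / lam1)) < 1e-12
assert abs(th1 - (0.2 + (omega / lam1) * math.log(cp / 1e-3))) < 1e-12
```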



The other part of the map, from S′ to S, is obtained by linearising about the homoclinic orbit. We know that when μ = 0, the point x′ = y′ = 0 on S′ is mapped to x = c, y = 0 on S (we can choose y = 0 by fixing the origin of θ appropriately): this assumes that, if necessary, a local transformation has been carried out to make the (x, y) plane the stable manifold (inside B) and the z-axis the unstable manifold (inside B). For x′, y′, μ ≪ c′, we can then linearise the map about this fixed point, so that the image in S has y, z ≪ c; in particular, θ ≪ 1 in S, thus y ≈ cθ, and an appropriate affine map is

(z1, cθ1)ᵀ = M (r′ cos θ′, r′ sin θ′)ᵀ + μb.    (4.28)


Combining the two, we obtain a Poincaré map for (z0, θ0) → (z1, θ1) given by

z1 = Dc (z0/c′)^δ cos[θ0 + (ω/λ1) ln(c′/z0) + φ1] + μb1,
cθ1 = Ac (z0/c′)^δ cos[θ0 + (ω/λ1) ln(c′/z0) + φ2] + μb2,


where δ = λ2 /λ1 ,




and A, D are combinations of elements of M (and so depend on c, c′). The map is a valid approximation if z0 ≪ c, θ0 ≪ 1. Put c = c′, z = cζ; then, with an appropriate re-definition of μ, we get

ζ → dζ^δ cos[θ + (ω/λ1) ln(1/ζ) + Φ1] + μ,
θ → aζ^δ cos[θ + (ω/λ1) ln(1/ζ) + Φ2] + μβ,    (4.31)

where ζ, θ ≪ 1. It follows that (4.31) is close to the one-dimensional map

ζ → ζ′ = dζ^δ cos[(ω/λ1) ln(1/ζ) + Φ1] + μ.    (4.32)


This is a rather more interesting map than the Lorenz map, in some ways. There are two cases, δ ≷ 1, which are illustrated in Fig. 4.17. For δ > 1, a single periodic orbit bifurcates from the homoclinic orbit, as shown on the left of Fig. 4.18. On the other hand, when δ < 1, an infinite number of periodic orbits are created at μ = 0, which coalesce in saddle-node bifurcations at positive and negative values of μ. Here the bifurcation diagram is as shown on the right of Fig. 4.18. The period P is given by

P ∼ (1/λ1) ln (1/ζ),    (4.33)

so that for δ > 1, ζ ≈ μ, thus μ ∼ e^{−λ1 P}, while when δ < 1, the fixed points are given by

μ ∼ −dζ^δ cos[(ω/λ1) ln(1/ζ) + Φ1],    (4.34)
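The contrast between the two cases can be seen by counting fixed points of (4.32) numerically, i.e. sign changes of ζ′ − ζ on a fine grid; the parameters follow Fig. 4.17, while the grid and cut-offs are illustrative.

```python
import math

def f432(z, mu, d=0.5, delta=0.3, omega=5.0, lam1=1.0, phi1=0.0):
    # the one-dimensional Shilnikov map (4.32)
    return d * z ** delta * math.cos(omega * math.log(1.0 / z) / lam1 + phi1) + mu

def crossings(mu, delta, n=100000, zmax=0.2):
    # count sign changes of zeta' - zeta on a uniform grid in (0, zmax]
    count, prev = 0, None
    for i in range(1, n + 1):
        z = zmax * i / n
        g = f432(z, mu, delta=delta) - z
        if prev is not None and g * prev < 0.0:
            count += 1
        prev = g
    return count

# delta < 1: fixed points accumulate at the homoclinic bifurcation mu = 0
assert crossings(0.0, delta=0.3) > 5
# delta > 1: the oscillation amplitude d zeta^delta is beaten by zeta itself
assert crossings(0.0, delta=1.2) <= 2
```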









Fig. 4.18 The variation of the fixed point of (4.32) with μ in the case δ = 3 (left), in terms of the period P defined by (4.33), and for δ = 0.3 (right), with the other parameters as in Fig. 4.17. In the latter case an infinite number of periodic orbits are produced at the homoclinic bifurcation, which disappear in saddle-node bifurcations as |μ| increases



so that the curve on the right in Fig. 4.18 satisfies

μ ∼ e^{−λ2 P} cos[ωP + Φ1],  P → ∞.


In a real sense, the bifurcation only produces one periodic orbit associated with fixed points of the one-dimensional map (just as in the Lorenz case, although there the symmetry produces a symmetric pair), and it remains to be seen whether there is any other interesting behaviour. To examine this, we must again study the two-dimensional map (4.31). First, we write it in the composite form (ζ, θ) → (ζ′, θ′) via polar coordinates (ρ, φ) (which are simply scaled versions of r′ and θ′ in S′), thus

ζ′ = dρ cos(φ + Φ1) + μ,
θ′ = aρ cos(φ + Φ2) + μβ,    (4.36)


where

ρ = ζ^δ,  φ = θ + (ω/λ1) ln (1/ζ).    (4.37)


The image in S′ of a line θ = constant in S is the logarithmic spiral

φ = θ + (ω/λ2) ln (1/ρ).
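The spiral form follows from eliminating ζ between the two relations in (4.37), since ln(1/ρ) = (λ2/λ1) ln(1/ζ); it is easily verified numerically (the rates below are illustrative).

```python
import math

lam1, lam2, omega = 1.0, 0.5, 5.0
delta = lam2 / lam1

def image(z, theta):
    # the map (4.37) into polar coordinates (rho, phi) on S'
    return z ** delta, theta + (omega / lam1) * math.log(1.0 / z)

theta0 = 0.3
for z in (1e-4, 1e-3, 1e-2, 1e-1):
    rho, phi = image(z, theta0)
    # a line theta = const maps onto phi = theta + (omega/lam2) ln(1/rho)
    assert abs(phi - (theta0 + (omega / lam2) * math.log(1.0 / rho))) < 1e-9
```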


The map of a thin strip U in S (say, 0 ≤ ζ, θ ≤ ε ≪ 1) is thus a slightly thickened logarithmic spiral strip V, as shown in Fig. 4.19. Since the subsequent map (the rescaled version of (4.28)) from S′ to S is affine, we see that the image of U under the Poincaré map is a thin spiral strip U′ in S, as also indicated in Fig. 4.19. It is now fairly clear that a number of horseshoes exist for the map. Since the map has a number of fixed points, the intersection of U and U′ is non-empty (so long as we have, from (4.31)₂, that the value θ = μβ is in the range (0, ε)); the idea is now to take a horizontal strip in U between two (close) values of ζ. The image of this must be a (relatively short) segment of the spiral. Since for δ < 1 we have a large number of fixed points (an infinite number at μ = 0), it is clear that successive fixed points as ζ increases lie in thin vertical strips whose images under the Poincaré map provide a sequence of horseshoes. An illustration of this is provided in Fig. 4.20. That chaotic dynamics does indeed occur is then guaranteed by the uniform stretching and contraction evidenced by the way the strip ABCD is deformed to A′B′C′D′. More specifically, let us choose a strip ζ1 < ζ < ζ2 in S such that the range of φ in S′ is 2π, thus

ζ2 = ζ1 e^{2πλ1/ω},    (4.39)


Fig. 4.19 The image of a vertical strip in S is a spiral strip in S′, and this is then mapped affinely back to a spiral strip in S. The map from S to S′ is given by (4.37), using δ = 0.5, ω = 5, λ1 = 1, and that from S′ back to S is given by (4.36), with d = 0.1, Φ1 = 0, μ = 0.05, a = 10, Φ2 = ½π, and β = 4. The range of θ and θ′ in S is [−π, π], and the range of ζ is [0, ν], where ν = 0.1. The rectangular box on the left is 0 ≤ θ ≤ 1, ν² < ζ < ν. Its image in S′ is surrounded by a circle of radius ρ = ν^{0.9δ}








Fig. 4.20 A Smale horseshoe results from the map in Fig. 4.19, since neighbouring intersections of the strip and its image contain fixed points, as in Fig. 4.17 (right). The map is the same as in Fig. 4.19, but the θ and ζ axes have been swapped, in order to draw out the resemblance to Fig. 4.12. The parameters are those in Fig. 4.19, except that ω = 5.1, and the range of the rectangle is 0.018 < ζ < 0.074, 0 < θ < 1. The comparison with Fig. 4.12 is not precise, since evidently it is the ζ direction which expands and the θ direction which contracts, but the illusion is one of scale, as the θ scale has been severely compressed in the figure in order to make the mapping visually clear

where we suppose without loss of generality that ω > 0. Since the range of φ is 2π, the maximum of ζ′ in A′B′C′D′ is

ζ′max = μ + |d|ρ = μ + |d|ζ2^δ,

while the minimum is

ζ′min = μ − |d|ζ2^δ.




Thus we might pick ζ1, ζ2 such that

ζ1 + |d|ζ2^δ < μ < ζ2 − |d|ζ2^δ,


which is always possible as μ → 0 if δ < 1 (bearing in mind that ζ2 = Cζ1, C > 1). It is also then possible to choose ζ1 so that the minima of ζ′ occur at the end points of the strip. Thus horseshoes (in fact, many) exist if δ < 1. If δ > 1, however, the spirals are squashed flat in S, and horseshoes do not exist.

4.3.1 Approximation and Proof

The astute reader will observe that the formal apparatus involved in analysing homoclinic bifurcations is exactly that of matched asymptotic expansions. The solutions we seek are relaxational, in the sense that there are two distinct time scales (t ∼ 1 and t ∼ ln(1/μ)), during which different approximations are valid. In the normal course of events, we match the different solutions in a matching region, and here the matching ‘regions’ are the Poincaré surfaces S and S′. It is clear that the precise location of S and S′ cannot matter to the dynamics, i.e. c is restricted to be small and ≫ μ, but not otherwise. So a different, but equivalent, approach would be to derive maps for the successive iterates of the matching constants between inner (near 0) and outer (near Γ) expansions.

Students of asymptotic expansions, at least in applications, rarely expect a proof of the validity of their results, and we might therefore be happy with the story thus told. But as mathematicians, one aspect of the results is of concern: we have derived very sensitive results detailing the Cantor structure of invariant sets, on the basis of approximate solutions; just as for Arnold tongues, we must ask in what way these approximations can be justified in deriving such sensitive results. The procedure which lies behind the formal approximations is the following. Suppose we have the system

ẋ = f(x, μ),  x ∈ Rⁿ,    (4.43)


with a homoclinic orbit to 0 at μ = 0. First, we suppose that no eigenvalue has Re λ = 0, and that they are all distinct (this is the general case). Then we can diagonalise Df at 0, so that the stable and unstable eigenspaces provide a direct decomposition of Rⁿ at x = 0, Rⁿ = Eu ⊕ Es. Next, if f is smooth (as we assume), there are smooth local transformations which ensure that the stable and unstable eigenspaces are actually the stable and unstable manifolds of the origin. We suppose this transformation has been made too. Thus we have a local decomposition Rⁿ = Wu ⊕ Ws. The standard procedure now diverges. One approach is to suppose that the eigenvalues of Df are non-resonant (i.e. as in the normal form procedure), so that there is a local near-identity transformation which



replaces the equation by a linear system for |x| < c. If we take the box B to be |x| < c, for example, then the solution inside B is just

x = e^{tDf} x₀.    (4.44)


This gives the map from S to S′. Alternatively, for c ≪ 1, we simply approximate Eq. (4.43) inside B by

ẋ = Df(0, μ)x + O(|x|²).    (4.45)


This equation can be formally ‘solved’ via reformulation as an integral equation, using the fundamental matrix⁸ e^{tDf} = e^{tD}. We have

x = e^{tD} x₀ + e^{tD} ∫₀ᵗ e^{−sD} O(|x|²) ds,    (4.46)



and standard results of ordinary differential equation theory show that, for smooth nonlinear terms in (4.45), the solution of (4.46) is smoothly approximated by (4.44). Why bother, then, with the non-resonant transformation to the linear system? Perhaps the approximate result is not good enough, or perhaps there is some exceptional behaviour associated with the resonant eigenvalues? Apparently, however, this is not the case, and the non-resonant transformation is simply made for algebraic simplicity. To understand why this should be the case, we need to consider the issue of resonance more clearly. If the eigenvalues of the matrix are distinct, then D can be taken to be diagonal, D = diag(λi), and the solution of ẋ = Dx is just x = e^{tD}x₀, or x = Σi ai e^{λi t}. By expanding the nonlinear system as a power series, we can construct a formal solution of the nonlinear system in the form


x = εx⁽¹⁾ + ε²x⁽²⁾ + ···,  x⁽¹⁾ = e^{tD}x₀,

ẋ⁽ʲ⁾ − Dx⁽ʲ⁾ = g_j(x⁽⁰⁾, x⁽¹⁾, …, x⁽ʲ⁻¹⁾),    (4.48)


where g j is a polynomial. Since the fundamental solution of the homogeneous part of (4.48) is just et D , it is clear that resonant terms in g j will occur if any combination of x (k) , k ≤ j − 1, gives an exponential of the form exp(λi t). Suppose that x (k) contains no resonant terms for k < j. For example, x (1) is a linear combination of exponentials exp(λi t). Therefore g2 contains terms exp[(λi + λ j )t], which give particular solutions of the  same form if λi + λ j = λl for any i, j, l. Proceeding on, contains terms exp( m λ )t, m k = r, and thus non-resonance is assured if g r k k k  m λ = λ for any r , and for all m such that r k k k k k m k = r ≥ 2. 8 Fundamental

at (3.114).

matrices were in effect defined for linear systems of ordinary differential equations
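The non-resonance condition is finitely checkable order by order, so it can be tested numerically. The following minimal sketch (the helper `is_nonresonant` is illustrative, not from the text) scans all multi-indices m with 2 ≤ Σₖ mₖ ≤ r and tests whether Σₖ mₖλₖ comes within a tolerance of any eigenvalue λₗ:

```python
from itertools import product

def is_nonresonant(eigs, max_order=5, tol=1e-9):
    """Check Poincare's non-resonance condition for eigenvalues `eigs`:
    sum_k m_k * lambda_k != lambda_l for all integer m_k >= 0 with
    2 <= sum_k m_k <= max_order.  Returns (True, None) if non-resonant
    up to that order, else (False, (m, l)) for the offending index."""
    n = len(eigs)
    for m in product(range(max_order + 1), repeat=n):
        r = sum(m)
        if r < 2 or r > max_order:
            continue
        combo = sum(mk * lam for mk, lam in zip(m, eigs))
        for l, lam in enumerate(eigs):
            if abs(combo - lam) < tol:
                return False, (m, l)
    return True, None

# lambda_3 = lambda_1 + lambda_2 is a resonance at second order:
print(is_nonresonant([1.0, -2.5, -1.5]))
# no resonance up to fifth order:
print(is_nonresonant([1.0, -2.2, -3.7 + 1j]))
```

Of course this only verifies non-resonance up to a finite order; the analytical condition covers all r ≥ 2.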



Of course, this is just the criterion for non-resonance given previously. However, now we can see explicitly that if the eigenvalues are non-resonant, then a formal power series of the form x ∼ b₁y + b₂y² + · · ·, y_j = exp(λ_jt), can be constructed. It is essentially because of this that the flow can be exactly linearised. What if resonance occurs? Then the series are occluded by the presence of terms such as t e^{λ_rt} somewhere in the expansion. In terms of y this corresponds to terms of the form y_r ln(1/y_r), and the smoothness of the transformation from x to y breaks down. From the point of view of the homoclinic bifurcation analysis, however, this does not matter. The reason is that the smoothness is broken only at x = 0, and this does not affect trajectories which avoid zero. Trajectories which do tend towards zero lie on the stable manifold, so that for them the construction of the Poincaré map, particularly via that part which maps S to S′, is irrelevant. For points away from the stable manifold, such as those we are concerned with, the linear flow is indeed a smooth perturbation of the full system.

We return to the construction of the Poincaré map. The map from S′ to S is obtained by writing

x = ξ(t) + y,    (4.49)

where ξ(t) is the homoclinic orbit. Then y satisfies

ẏ = f[ξ + y, μ] − f[ξ, 0]
  = Df[ξ(t), μ]y + μ ∂f/∂μ (ξ, 0) + O(μ², |y|²).    (4.50)

At leading order, we have a linear inhomogeneous system, whose solution can be formally written

y(t) = M(t)y₀ + μ M(t) ∫₀ᵗ M⁻¹(s) ∂f/∂μ [ξ(s), 0] ds,    (4.51)


where M is a fundamental matrix for the equation. The time to get back to S is essentially constant, whence (4.51) gives an affine map. The combination of (4.51) and (4.46) then gives the Poincaré map, as illustrated earlier in this chapter.

In the Shilnikov and Lorenz examples, this map is used to show two results. Firstly, that there is a countably infinite number of periodic orbits as μ → 0; more specifically, we can trace the existence of periodic orbits in both examples by using the fact that the approximate Poincaré map is itself approximated by a one-dimensional map. The principal theorem involved here is the implicit function theorem. This states that if the approximations are smooth, i.e. if the neglected terms are C¹, then periodic orbits of the approximate system correspond to periodic orbits of the real system. It is because of this that we can be sure of the existence of the periodic solutions we describe. Particularly in the Shilnikov case, we can use the one-dimensional map to infer that period-doubling windows occur, although in a rather roundabout way. If the one-dimensional map has a period-doubling bifurcation of some orbit at μ = μ_c, then
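The fundamental matrix M(t) appearing in (4.51) can be computed numerically by integrating the variational (matrix) equation Ṁ = Df[ξ(t), μ]M, M(0) = I, along the orbit. A minimal sketch for a constant-coefficient test case, where M(t) = e^{tA} is known exactly (the RK4 stepper and all names here are illustrative, not from the text):

```python
import math

def rk4_fundamental_matrix(jac, t1, steps=1000):
    """Integrate M' = jac(t) @ M, M(0) = I (2x2, plain lists) with RK4."""
    def matmul(a, b):
        return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
                for i in range(2)]
    def axpy(c, a, b):  # elementwise b + c*a
        return [[b[i][j] + c * a[i][j] for j in range(2)] for i in range(2)]
    M = [[1.0, 0.0], [0.0, 1.0]]
    h = t1 / steps
    t = 0.0
    for _ in range(steps):
        k1 = matmul(jac(t), M)
        k2 = matmul(jac(t + h / 2), axpy(h / 2, k1, M))
        k3 = matmul(jac(t + h / 2), axpy(h / 2, k2, M))
        k4 = matmul(jac(t + h), axpy(h, k3, M))
        for i in range(2):
            for j in range(2):
                M[i][j] += h / 6 * (k1[i][j] + 2 * k2[i][j]
                                    + 2 * k3[i][j] + k4[i][j])
        t += h
    return M

# Constant Jacobian A = [[0, 1], [-1, 0]]:
# exact fundamental matrix M(t) = [[cos t, sin t], [-sin t, cos t]].
A = [[0.0, 1.0], [-1.0, 0.0]]
M = rk4_fundamental_matrix(lambda t: A, 1.0)
print(M[0][0], math.cos(1.0))
```

In the homoclinic setting one would supply jac(t) = Df[ξ(t), 0] evaluated along a numerically computed orbit, rather than a constant matrix.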



we can use the implicit function theorem to guarantee the existence of the period-doubled orbit except at μ_c itself, where the fixed point has neutral stability and the implicit function theorem breaks down. The details of the ‘period-doubling’ process might, therefore, be considerably more complicated than for one-dimensional maps. Nevertheless, the global picture away from such singular values of μ is well described by this one-dimensional map.

If the one-dimensional map period-doubles to chaos, we cannot necessarily infer chaotic trajectories for the full Poincaré map, since there is no fixed point or cycle for the implicit function theorem to deal with. Instead we show the existence of a chaotic invariant set by doing symbolic dynamics on Smale horseshoes. In the Shilnikov case, we can show that many such horseshoes exist for the approximate map. Now the essential feature of these horseshoes is that the intersection of the chosen subset of S with its horseshoe image should be transverse, and that the map should be hyperbolic on the intersection. Having shown this for the approximate map, the applicability to the real system then follows by virtue of the smoothness of the transformation (C¹ is enough), which is required to ensure that the real map is hyperbolic (since this property depends on the derivatives of the map). Thus, so long as the approximations we make are smooth, the results on the invariant set we obtain can be applied to the real system.

4.4 Matched Asymptotic Expansions for n-dimensional Flows

Analysts who have proved these bifurcation results are thus (presumably, unwittingly) doing matched asymptotic expansions for the flow. The flow consists of two separate phases, described by distinct approximations. Evidently, the Poincaré surface S lies in the matching region between the inner (near x = 0) and outer (near x = ξ(t)) regions. It is a useful way of thinking about the process, at least for an applied mathematician, and it will be helpful to use it in constructing the solutions, since the rôle of the size of the perturbations may be made clear. We will now recount the procedure in this light, for a general n-dimensional flow.

Suppose the system

ẋ = f(x, μ),  x ∈ ℝⁿ,    (4.52)

has a homoclinic orbit ξ₀(t) at μ = 0 to x = 0. We take a Poincaré section Σ transverse to this orbit, and seek an approximate map P : Σ → Σ for orbits close to ξ₀ on Σ, and for μ small. The situation is shown in Fig. 4.21. Note that Σ is not required to be close to 0, and in fact we generally suppose it is not. We now seek formal expansions for trajectories

x ∼ ξ₀ + εξ₁ + · · · ,  ε ≪ 1,    (4.53)


Fig. 4.21 A Poincaré map near a homoclinic orbit




which are close to ξ₀ on Σ. We take the time origin so that t = 0 for x ∈ Σ. Here ε is a formal small parameter which will help keep in mind the expansion procedure, but it will ultimately be omitted. A second small parameter is μ, and it is clear that a useful distinguished limit has μ ∼ ε, although again this restriction is unnecessary, and will eventually be omitted. For the moment, though, we write

μ = εμ̄,  μ̄ = O(1);    (4.54)


expanding (4.52), we have at O(ε)

ξ̇₁ = f′(ξ₀, 0)ξ₁ + μ̄ ∂f/∂μ (ξ₀, 0),    (4.55)

which is a linear inhomogeneous equation (cf. (4.50)). If Φ is a fundamental matrix for the Jacobian f′[ξ₀(t), 0], then the solution of (4.55) satisfying ξ₁ = η at t = 0 on Σ is just

ξ₁ = Φ(t)η + μ̄ Φ(t) ∫₀ᵗ Φ⁻¹(s) ∂f/∂μ [ξ₀(s), 0] ds.    (4.56)

We denote the Jacobian matrix at x = 0 as

Df(0, μ) = D(μ) = D = D₀ + μD₀′ + O(μ²),    (4.57)


and we can assume D, and thus D₀ and D₀′, to be diagonal, as earlier. Define the heteroclinic matrix H via

Φ(t) = e^{tD₀} H(t).    (4.58)

Note that since Φ(0) = I, so also H(0) = I. By a theorem of Coddington and Levinson (1955) (theorem 8.1 in their book), the infinite-period version of Floquet theory shows that H(±∞) = H± are constant matrices. We have

ξ₁ = e^{tD₀} H(t)η + μ̄ e^{tD₀} H(t) ∫₀ᵗ H⁻¹(s) e^{−sD₀} ∂f/∂μ [ξ₀(s), 0] ds.    (4.59)




As t → ∞, then,

ξ₁ ∼ e^{tD₀} H₊ η + μ̄ e^{tD₀} H₊ ∫₀ᵗ H₊⁻¹ e^{−sD₀} D₀′ ξ₀ ds
   + μ̄ e^{tD₀} H₊ ∫₀^∞ { H⁻¹(s) e^{−sD₀} ∂f/∂μ [ξ₀(s), 0] − H₊⁻¹ e^{−sD₀} D₀′ ξ₀ } ds + O(1).    (4.60)

Now we define β by

ξ₀ ∼ e^{tD₀} β as t → ∞;    (4.61)

note that (if Σ is not close to 0) β = O(1). We find (remembering that D₀, D₀′ are diagonal)

ξ₁ ∼ e^{tD₀} H₊ η + μ̄ t e^{tD₀} D₀′ β + μ̄ e^{tD₀} c₊ + O(1)    (4.62)

as t → ∞, with c₊ defined by the convergent integral from 0 to ∞ in (4.60). Therefore

x ∼ e^{tD₀} β + ε e^{tD₀} H₊ η + μ t e^{tD₀} D₀′ β + μ e^{tD₀} c₊    (4.63)

as t → ∞. Finally, we can as accurately telescope the secular term⁹ by writing (4.63) as

x ∼ e^{tD} [β + ε H₊ η + μ c₊],  t → ∞.    (4.64)


It is an exactly analogous procedure to compute x as t → −∞. We define

ξ₀ ∼ e^{tD₀} α,  α = O(1) as t → −∞,    (4.65)

and obtain

x ∼ e^{tD} [α + ε H₋ η + μ c₋],  t → −∞.    (4.66)

The asymptotic procedure involves the fact that the ordering of (4.64) breaks down as t → ∞. Specifically, suppose ℝⁿ is partitioned locally as Wᵘ and Wˢ, the unstable and stable manifolds of 0. Then β ∈ Wˢ, α ∈ Wᵘ, so that e^{tD}β → 0 as t → ∞. On the other hand, ε e^{tD}H₊η will grow exponentially. The expansion thus becomes invalid when the two terms are of comparable size. At this point |x| ≪ 1, and an approximation which describes the flow here is simply

ẋ = Dx + O(|x|²),    (4.67)

with solution

x = e^{tD} x₀ + · · · .    (4.68)

⁹ The one ∝ t e^{tD₀}.
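Since D is diagonal, the telescoping in (4.64) is just the entrywise expansion e^{t(λ₀+μλ₀′)} = e^{tλ₀}(1 + μλ₀′t + O(μ²t²)), so the secular term μt e^{tD₀}D₀′β in (4.63) is precisely the first correction. A quick numerical check of this in the scalar case (all values are illustrative only):

```python
import math

# One diagonal entry: lam = lam0 + mu*lam0p, acting on a component beta.
lam0, lam0p, mu, beta, t = -1.3, 0.4, 1e-3, 2.0, 5.0

# Telescoped form, as in (4.64): e^{tD} beta
exact = math.exp(t * (lam0 + mu * lam0p)) * beta

# Two-term form, as in (4.63): e^{tD0} beta + mu*t*e^{tD0}*D0'*beta
two_term = (math.exp(t * lam0) * beta
            + mu * t * math.exp(t * lam0) * lam0p * beta)

# The difference is O((mu*lam0p*t)^2), confirming that writing e^{tD}
# absorbs the secular term exactly.
print(abs(exact - two_term))
```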



In fact, there is nothing complicated to do; all we need is to relate the large time asymptotics of x in (4.64) to the large negative time asymptotics of x in (4.66) corresponding to the next passage through Σ. To be specific, let η ∈ Σ be mapped to η′ ∈ Σ, after a (recurrence) time P. We write

t = t′ + P,    (4.69)

so that the solution for x for −t′ ≫ 1 is

x ∼ e^{t′D} [α + ε H₋ η′ + μ c₋] = e^{tD} e^{−PD} [α + ε H₋ η′ + μ c₋],    (4.70)

while that for large t is just (4.64). These must be the same (as they both match to the flow near the origin, (4.68)), and thus the approximate Poincaré map is simply given by

e^{−PD} [α + H₋ η′ + μ c₋] = β + H₊ η + μ c₊,    (4.71)

where we omit ε, but then require η, η′ to be small. Given η ∈ ℝⁿ, there are n + 1 unknowns η′, P to be determined from the n-dimensional Eq. (4.71). The extra relation follows by the choice of Σ; for example, one such requires

⟨η, ξ̇₀(0)⟩ = ⟨η′, ξ̇₀(0)⟩ = 0:    (4.72)


the Poincaré surface is orthogonal to the homoclinic trajectory. The map (4.71) generalises the Poincaré maps determined for the Lorenz and Shilnikov systems. The Eq. (4.71) is a smooth approximation to the exact map, so it suffices to consider the approximate map.

A simplification follows from the following observation. The function ξ₁ = ξ̇₀ satisfies (4.55) with μ̄ = 0, and if we thus put μ̄ = 0 in (4.59) and take η = ξ̇₀(0), we find, letting t → ±∞ and using also (4.61) and (4.65), that

D₀β = H₊ ξ̇₀(0),  D₀α = H₋ ξ̇₀(0);    (4.73)

whence

M D₀ α = D₀ β,  M = H₊ H₋⁻¹.    (4.74)




It is convenient¹⁰ to define v and v′ by

η = H₋⁻¹ v,  η′ = H₋⁻¹ v′,    (4.75, 4.76)

¹⁰ This effectively places v on the (outgoing) face S of the box near 0, although it may be simply thought of as a convenient transformation.



so that the map (4.71) is

e^{−PD} [α + v′ + μ c₋] = β + M v + μ c₊.    (4.77)

Now, decompose this map into components on Wᵘ and Wˢ. Using an obvious notation, and that αₛ = 0 = βᵤ, we have¹¹

Mᵤᵤ vᵤ + Mᵤₛ vₛ + μ cᵤ₊ = e^{−PDᵤ} [αᵤ + v′ᵤ + μ cᵤ₋],
v′ₛ + μ cₛ₋ = e^{PDₛ} [βₛ + Mₛᵤ vᵤ + Mₛₛ vₛ + μ cₛ₊].    (4.78)

Notice that e^{PDₛ}, e^{−PDᵤ} ≪ 1 (as we have αᵤ, βₛ = O(1)). It follows that the map in (4.78) is smoothly approximated by

Mᵤᵤ vᵤ = e^{−PDᵤ} αᵤ − Mᵤₛ vₛ − μ cᵤ₊,
v′ₛ = e^{PDₛ} βₛ − μ cₛ₋.    (4.79)

Together with the definition of Σ, ⟨η, ξ̇₀(0)⟩ = 0, (4.79) provides a map from (vᵤ, vₛ) to (v′ᵤ, v′ₛ), with P chosen so that ⟨η′, ξ̇₀(0)⟩ = 0. Now it is clear that the approximation (4.79), since it does not involve v′ᵤ, is in effect a restriction on the points in Σ which map forwards to a point on Σ near ξ₀. If we are only interested in such trajectories (as we are) then it is clear that we may approximate v′ᵤ smoothly by the solution of

Mᵤᵤ v′ᵤ = e^{−P′Dᵤ} αᵤ − Mᵤₛ v′ₛ − μ cᵤ₊,    (4.80)

where P′ is the subsequent recurrence time (to be found). Using (4.79)₂ (which provides a comparable restraint on points that have come from points on Σ near ξ₀), we have, for trajectories which remain close to ξ₀,

Mᵤᵤ v′ᵤ ≈ e^{−P′Dᵤ} αᵤ − Mᵤₛ e^{PDₛ} βₛ + μ {Mᵤₛ cₛ₋ − cᵤ₊}.    (4.81)


Now the relation (4.74), reflecting the time translation invariance of the system, can be written, when projected on to Wᵘ, and since αₛ = βᵤ = 0,

Mᵤᵤ Dᵤ αᵤ = 0.    (4.82)

(Here we have written D₀ = diag(Dᵤ, Dₛ).) Therefore Mᵤᵤ has rank n − 1 and a unique null vector (if there were a second null vector, there would be a second homoclinic orbit, a non-generic possibility we discount), and therefore there is a unique w such that

Mᵤᵤᵀ w = 0.    (4.83)

¹¹ Remember that D is diagonal.



Then (4.81) can only be solved (and then uniquely, up to additions of multiples of the null vector of Mᵤᵤ) if w is orthogonal to its right-hand side, i.e. if

⟨w, e^{−P′Dᵤ} αᵤ⟩ = ⟨w, Mᵤₛ e^{PDₛ} βₛ⟩ + μ ⟨w, cᵤ₊ − Mᵤₛ cₛ₋⟩,    (4.84)


which is in fact a difference equation giving P′ in terms of P and μ.

If the unstable eigenvalues are denoted by λᵤʲ, and the stable ones by λₛʲ, then (4.84) can be written in the form

Σⱼ aⱼ exp(−λᵤʲ P′) = Σⱼ bⱼ exp(λₛʲ P) + μ.    (4.85)



In fact, (4.85) can be further approximated (since P and P′ are large) by restricting the sum to those eigenvalues with real part closest to zero. In this way we get three basic cases:

(i) (Lorenz) Two real eigenvalues σᵤ, −σₛ: the map is

a₁ exp(−σᵤ P′) = b₁ exp(−σₛ P) + μ;    (4.86)

(ii) (Shilnikov) One real eigenvalue σᵤ, and a complex conjugate pair −σₛ ± iωₛ: the map is

a₁ exp(−σᵤ P′) = b₁ exp(−σₛ P) cos(ωₛ P + φₛ) + μ;    (4.87)

(iii) (Bifocal) Two complex pairs σᵤ ± iωᵤ, −σₛ ± iωₛ: this gives the map

a₁ exp(−σᵤ P′) cos(ωᵤ P′ + φᵤ) = b₁ exp(−σₛ P) cos(ωₛ P + φₛ) + μ.    (4.88)
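These one-dimensional return-time maps are easy to iterate. A sketch for the Shilnikov case, solving for P′ at each step; the parameter values (a₁, b₁, σᵤ, σₛ, ωₛ, φ, μ) are illustrative only, and steps on which the right-hand side is non-positive (no return: the trajectory escapes the neighbourhood of the homoclinic orbit) are flagged:

```python
import math

def shilnikov_map(P, a1=1.0, b1=0.8, sig_u=1.0, sig_s=0.5, om_s=3.0,
                  phi=0.0, mu=0.05):
    """One step of the Shilnikov-type return map: solve
    a1*exp(-sig_u*P') = b1*exp(-sig_s*P)*cos(om_s*P + phi) + mu  for P'.
    Returns None if the right-hand side is <= 0 (trajectory escapes)."""
    rhs = b1 * math.exp(-sig_s * P) * math.cos(om_s * P + phi) + mu
    if rhs <= 0:
        return None
    return -math.log(rhs / a1) / sig_u

# Iterate from a long initial return time; the oscillatory factor makes
# successive return times wander irregularly when sig_s < sig_u
# (the chaotic Shilnikov condition).
P, history = 6.0, []
for _ in range(30):
    P = shilnikov_map(P)
    if P is None:
        break
    history.append(P)
print(history)  # a few successive return times, then escape
```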


From these maps, it is straightforward to prove results about fixed points (periodic orbits). If all the perturbations are smooth, then the implicit function theorem can be used with abandon. It is not much more difficult to show that (n − 1)-dimensional ‘horseshoes’ exist. In fact, (4.81) shows that v′ᵤ lies near a curve parametrised by P and P′ (and hence just by P, in view of (4.85)). The same applies for v′ₛ, from (4.79)₂. Therefore the invariant set will be inside a long thin set, whose length may be parametrised by P and whose transverse directions are parametrised by vᵤ and vₛ. These transverse directions lend hyperbolic structure to the map near fixed points, and this can be used to show that a symbolic dynamic description of the flow can be given. Thus, strange invariant sets are produced in much the same way as in three dimensions.



Fig. 4.22 Two examples of heteroclinic reconnection which could lead to attracting chaotic behaviour

4.4.1 Strange Attractors

Homoclinic bifurcations are hopelessly bad at producing strange attracting behaviour. One can observe that the components of Wᵘ will fling most trajectories away from the homoclinic orbit. Something more exotic is required.

In fact, it need not be that much more exotic. The problem is that most trajectories will be ejected away from ξ₀ into Wᵘ. Strange attracting behaviour is then likely if these ejected trajectories can find their way back eventually. The obvious mechanism to do this is via a heteroclinic connection to another fixed point (for example). Suppose, to illustrate, that the slowest growing eigenvector at 0 has a real eigenvalue λᵤ. Then trajectories leaving 0 do so asymptotically along ±eᵤ, where eᵤ is the eigenvector associated with λᵤ. We have supposed that eᵤ (say) connects to a homoclinic orbit when a parameter μ = 0. If −eᵤ does as well, or connects to a different (unstable) fixed point, then, depending on the behaviour there, re-injection may take place. Figure 4.22 illustrates two examples. In the first (degenerate) case, three separate heteroclinic connections suggest the possibility of a (strange) attractor: this is apparently a co-dimension three bifurcation.¹² In the second (the ‘gluing’ bifurcation), two Shilnikov homoclinic orbits coexist at a co-dimension two bifurcation. Evidently, if symmetries are present (as is often the case in physical models), the relevant co-dimension is lowered, to two and one respectively.

More generally, it is plausible to think of high-dimensional chaos as taking place in a phase space suffused with fixed points, with a strongly forced trajectory launching itself from one fixed point to the next along a tangled web of heteroclinic connections. This is in fact exactly how chaos is produced in weakly perturbed integrable Hamiltonian systems (see Chap. 5); in the present context, weak

¹² In this context, the co-dimension refers to the number of independent parameters whose selection is necessary to secure the bifurcation.



perturbation is associated with weak damping, i.e. strong forcing (for example, high Reynolds number flows).

4.4.2 Partial Differential Equations

It is natural to ask whether the formal techniques outlined above can be extended to partial differential equations. On a finite domain, this is basically trivial, since the solution will have a (discrete) Fourier series expansion, and in fact inertial manifold techniques¹³ (if appropriate) will ensure that the system effectively behaves like a finite-dimensional system. The basic type of partial differential equation which is not a glorified ordinary differential equation is then an evolution equation for a variable u(x, t) on an infinite spatial domain.¹⁴ A homoclinic orbit is a trajectory ξ₀(x, t) such that ξ₀ → 0 as t → ±∞ uniformly in space: a soliton is not a homoclinic orbit. In a formal sense, one can analyse such a system in the same way as finite-dimensional systems, and the basic result is a straightforward generalisation of (4.77) to functions v(k), v′(k) which depend on a continuous wavenumber k. Moreover, the map can be simplified in the same way, by using the fact that the recurrence time P is large.

A significant extra feature occurs when (as often) the partial differential equation is autonomous in space (spatially translationally invariant), and the homoclinic orbit is to a uniform state (e.g. u = 0). Then there are two analogues of (4.82) (the analogue replaces matrix multiplication by an integral convolution operator), and so when we condense the system by orthogonality to the components of the null space of Mᵤᵤ*, the adjoint of Mᵤᵤ, there are two conditions. The added feature is that the subsequent pulse need not occur at the same location, and will generally be phase shifted by a quantity Q. The two orthogonality conditions imply that the Poincaré map is approximately given by a map P′ = P′(P), Q = Q(P). Analysis of such maps is in its infancy. As with ordinary differential equations, one may seek classes of solutions.
Fixed points do not exist, but there do exist (a countably infinite number of) doubly periodic modulated travelling waves. As far as horseshoes are concerned, it seems likely that there will exist uncountably many spatio-temporally chaotic solutions, associated with some symbolic dynamic description, these being close to the modulated solutions. A much more interesting concept is the following. There is nothing to prevent the occurrence of a homoclinic pulse being followed by subsequent generations of two (or more) pulses, since if these are far apart, their exponential tails will not affect each other significantly. Therefore homoclinic bifurcations in partial differential equations may provide a mechanism for the generation and spread of chaotic spatial patches.

¹³ See Sect. 4.5.

¹⁴ A finite domain can be effectively infinite if the smallest relevant dynamical length scale is much smaller than the domain size; this commonly occurs when the dynamical forcing parameter (e.g. the Reynolds number in shear flows, or the Rayleigh number in convection) is large.



4.4.3 Transition to Turbulence

The central problem which needs to be addressed by chaos theory is the transition to turbulence in parallel shear flows, as illustrated by the formation of turbulent bursts, or spots. We might conjecture the following, for instance. The formation of spots in plane Poiseuille flow, or boundary layer flow over a flat plate, is associated with near-homoclinic orbits of the governing equations. The stable manifold of the zero state (steady, two-dimensional flow) consists of slowly decaying (t ∼ Re) two-dimensional cat’s-eye patterns. For sufficiently small amplitudes these are subject to three-dimensional inflection point instability (on a time scale t ∼ 1) which sweeps the transverse vortex up and folds it back down as a hairpin vortex. Thus, if the Euler equations for a transverse vortex in a shear flow have heteroclinic orbits connecting two-dimensional states, this picture might gain some credibility. In that light, the possibility that homoclinic bifurcations for partial differential equations lead to a spreading mechanism which is inherently absent in ordinary differential equations means that efforts to understand the transition process by (for example) truncated Fourier-type models may be misdirected. However, it is very difficult to make progress with this problem.

4.5 Notes and References

4.5.1 The Lorenz Equations

His famous eponymous equations (4.2) were derived by Lorenz (1963). The equations themselves are a three-term Fourier truncation of a two-dimensional model of Boussinesq thermal convection. In these, x represents the velocity field, y represents the horizontal variation of temperature, and z represents the vertical distortion from the uniform state. Occasionally one reads that the Lorenz equations are a model of convection, but this is hardly true. However, the equations do retain a coarse representation of the physical process: the steady state at the origin represents the motionless conductive state, and its instability corresponds to the onset of convection, for example. In particular, the relaxational behaviour of the Lorenz equations at high Rayleigh (r) and Prandtl number (σ) finds an analogy with an influential paper on convective turbulence by Howard (1966). In this paper, Howard conceives of thermal turbulence as consisting of long quiescent periods, during which the temperature relaxes towards equilibrium, interspersed with short violent overturning episodes, due to the convective instability of the growing thermal boundary layers at the top and bottom. In the Lorenz equations, such behaviour would correspond to a quiescent phase where x, y ≈ 0 and thus z ∼ e^{−bt}, and a rapid phase in which x and y have a near-homoclinic excursion. In fact, this is exactly what happens when r ∼ σ ≫ 1, a limit which is arguably already approached in Lorenz’s original parameter choice, and the limit can be exploited analytically to produce explicit approximations to a Poincaré



map for the equations. This was done by Fowler and McGuinness (1982, 1983); Fowler (1992) attempts to draw the parallel with real convection more explicitly.

The argument given in Sect. 4.2.2 interpreting the transition from a strange invariant set to a strange attractor at r ≈ 24.06 is given by Sparrow (1982), pp. 45 ff. Describing what happens is one thing, but proving results rigorously is another. Tucker (1999) offers a proof of the existence of a strange attractor for the Lorenz equations; see also Ovsyannikov and Turaev (2017). Likewise, Zgliczyński (1997) provides a proof of chaos in the Hénon map (and the Rössler equations). The Hénon map introduced in (4.22) and Figs. 4.13 and 4.14 was provided by Hénon (1976), who was actually motivated by Lorenz’s (1963) results.
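Hénon’s map is trivially iterated numerically. A sketch at his original parameter values, written in the standard form x_{n+1} = 1 − ax_n² + y_n, y_{n+1} = bx_n (which may differ in presentation from the form (4.22) used in the text):

```python
def henon(x, y, a=1.4, b=0.3):
    """One iterate of the Henon (1976) map in its standard form."""
    return 1.0 - a * x * x + y, b * x

# Discard a transient, then collect points on the attractor.
x, y = 0.1, 0.1
for _ in range(1000):
    x, y = henon(x, y)
pts = []
for _ in range(2000):
    x, y = henon(x, y)
    pts.append((x, y))

# The attractor is bounded (roughly |x| < 1.5, |y| < 0.45);
# plotting pts reveals its folded, fractal structure.
print(max(abs(p[0]) for p in pts), max(abs(p[1]) for p in pts))
```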

4.5.2 Shilnikov Bifurcations

The series of papers in the 1960s by Shilnikov (e.g. 1965; 1967; 1968; 1970), and later by Glendinning and Sparrow (1984) and Gaspard (1984), paved the way for the application of homoclinic bifurcation analysis to systems of ordinary differential equations. In 1965, Shilnikov showed that at the point of bifurcation (i.e. μ = 0) and for a homoclinic orbit to a saddle-focus in three dimensions, there may exist a countable number of periodic orbits. Later (1967), he obtained an equivalent result (in four dimensions) when the fixed point is bifocal, i.e. a double focus. This analysis was extended to n dimensions and allowed μ ≠ 0 (Neimark and Shilnikov 1965; Shilnikov 1968), but only in obtaining criteria for the bifurcation of a single periodic orbit. In 1970, Shilnikov extended the analysis of the saddle-focus and bifocal cases to n dimensions, to obtain (via symbolic dynamics) the result that at μ = 0, there exist uncountably many aperiodic orbits. The illumination due to Glendinning and Sparrow (1984) and Gaspard (1984) lay in showing that the bifurcations could be viewed in terms of the existence of a one-parameter family of periodic solutions of period P, parameterised by μ(P), a possibly non-monotonic function. Thus Shilnikov’s (1965) result was extended to μ ≠ 0, and to n dimensions. For the bifocal case, Fowler and Sparrow (1991) provided an extensive geometric account of the bifurcations associated with the invariant set of the Poincaré map (for which the Poincaré surface is three-dimensional); a similar discussion can be found in the book by Kuznetsov (2004).

4.5.3 Infinite Dimensions

Fowler (1990a) derived a Poincaré map for the n-dimensional case, but showed additionally that it could be asymptotically reduced to an effective one-dimensional map embedded inside a Smale horseshoe structure; one thus obtains all the results on periodic and aperiodic orbits.



An obvious and interesting question is whether these methods can be extended to n = ∞. The two applications for which this would be relevant are to delay-differential equations and partial differential equations. Delay-differential equations appear to form a very tough proposition, not least because a characterisation of chaotic solutions is difficult to obtain (e.g. Wattis 2017). For partial differential equations, we must first put to one side some of the literature on this subject. Most obviously, we are not concerned with modal truncations (e.g. Haller and Wiggins 1995). These would be relevant in a finite domain, where Fourier series expansion yields a countable sequence of ordinary differential equations, and then inertial manifold theory (see below) provides a justification that a sufficiently large truncation provides an accurate portrait of the dynamics; but we are then back at n dimensions. Nor are we concerned with homoclinic bifurcations arising, for example, in travelling wave solutions (e.g. Bell and Deng 2002), where a system such as the third-order Fitzhugh–Nagumo equations will have travelling waves governed by a third-order system of ordinary differential equations; nor are we concerned with spatially homoclinic orbits (Lord et al. 1999). A large literature has been built up on this latter topic (e.g. Beck et al. 2009; Parra-Rivas et al. 2018), which is concerned with localised spatial structures, whose bifurcation response diagrams (amplitude versus μ) are commonly referred to as ‘snakes’ (e.g. Beck et al. 2009). Certainly, we are on an infinite domain. But a solitary wave is not a homoclinic orbit, nor is a purely time-dependent orbit (Coullet et al. 2002). Our generalisation of a homoclinic bifurcation analysis from ordinary differential equations to partial differential equations assumes a homoclinic trajectory in which the dependent variable approaches the fixed point (zero, to be simple) uniformly in both space and time. The expression e^{−(x²+t²)} will serve, but e^{−(x−t)²} will not. Fowler (1990b) provides the relevant derivation of a Poincaré map for this case. Further elaboration and development can be found in the thesis by Drysdale (1994). See also question 4.6.

Inertial Manifolds

Inertial manifolds were introduced by Foias and co-workers in the mid-1980s; see, for example, Foias et al. (1988). These are essentially finite-dimensional projection spaces which enable one to identify the dynamics of an infinite-dimensional system. In their words, ‘The inertial manifolds contain the global attractor, they attract exponentially all solutions, and they are stable with respect to perturbations. Furthermore, in the infinite-dimensional case they allow for the reduction of the dynamics to a finite-dimensional ordinary differential equation’. Constantin et al. (1985) show how to use integral estimates to obtain explicit bounds for the attractor dimension. For systems described by partial differential equations on large domains, these estimates are generally large. For example, it is thought that for the Navier–Stokes equations, the number of degrees of freedom (i.e. the dimension) is N ∼ Re^{9/4}, based on the Kolmogorov length scale (see, for example, Tran 2009). Proven estimates generally overshoot the presumed actual values, which is hardly surprising, since in making the



estimates, nothing is being solved. For example, Hyman and Nicolaenko (1986) give an estimate for the attractor dimension for the Kuramoto–Sivashinsky equation on a domain of (dimensionless) length l as N ∼ l^{3.5}, as compared to the apparent value N ∼ l^{1.5}. But in any event, it is clear that the attractor dimension will generally become large as the domain size increases, or equivalently as the forcing parameter increases. In practice, it is then the case that the finite dimensionality of the system is not of practical use, because of an interchange of the limits N → ∞ and μ → 0. The reason for this is that to resolve the attractor, we need to resolve length scales of O(1/l), whereas changes in the dynamics can be expected to be of magnitude μ; the order in which these go to zero is thus important.

4.5.4 Turbulence and Chaos

In Ed Spiegel’s phrase, which we mentioned in Chap. 1, ‘turbulence is chaos but chaos is not turbulence’. We shall have more to say on this in Chap. 6, but for the moment we re-iterate the view that there is a fundamental difference between finite- and infinite-dimensional systems. Certainly some systems (for example convection) behave like low-dimensional systems at low forcings (i.e. low Rayleigh number), but they lose this behaviour at high forcings, and appear to be best described by the dynamics of an interacting set of ‘coherent structures’ (in the case of thermal convection, these might be the thermal plumes). The challenge is then to identify these coherent structures and describe their interaction. Nor is fluid mechanics the only field where this challenge occurs. One thinks of the brain and the collective behaviour of neurons, for example.

4.6 Exercises

4.1 For the symmetric Lorenz map

ξ → sgn(ξ)[α|ξ|^δ − μ],  α > 0,  δ < 1,

find the values of μ for which (a) the bifurcating fixed point (from ξ = 0 at μ = 0) has a saddle-node bifurcation; (b) the strange invariant set becomes attracting. Describe what happens as μ varies if α < 0.

4.2 (i) By consideration of the Lyapunov function V = rx² + σy² + σ(z − 2r)², show in detail that trajectories of the Lorenz equations enter the ellipsoid

D_δ: rx² + y² + b(z − r)² = b(r² + δ),  δ > 0,

after a finite time. Deduce that thereafter trajectories remain inside the ellipsoid V ≤ max_{D_δ} V. Can we put δ = 0 in the above? Why? Or why not?



(ii) Show that, if 0 < r < 1 in the Lorenz equations, the origin is globally stable. (Hint: consider the Lyapunov function V = x 2 + σ y 2 + σ z 2 .) (iii) Show that the Lorenz equations have a pitchfork bifurcation of the origin at r = 1, and a Hopf bifurcation of the two non-trivial steady states at r = σ (σ + b + 3)/(σ − b − 1). 4.3 The Lyapunov function V = r x 2 + σ y 2 + σ (z − 2r )2 is defined as in question 4.2, and the parameters r , σ and b are all positive. Using the method of Lagrange multipliers (or otherwise), find the stationary points of V on the ellipsoid D0 defined by r x 2 + y 2 + b(z − r )2 = br 2 . Show that stationary values V = 0, VM = 4σ r 2 always exist and give the points where they are attained (P and O respectively). Show also that if b > 2, and assuming σ = 1, two further stationary points exist, on which the (equal) stationary values Vσ > VM , and describe geometrically how these stationary points bifurcate from O. Show further that if b > 2σ (again assuming σ = 1), that two further stationary points exist, and show that the (equal) stationary values at these points, V1 , also satisfy V1 > VM , and describe how they bifurcate from O. Show that if σ < 1 and b > 2, then Vσ < V1 , whereas if σ > 1 and b > 2σ , then Vσ > V1 . Finally, if σ = 1, show that extremal (not necessarily stationary) values of V occur at P and, if b < 2, at O, but if b > 2, stationary values occur on an ellipse, where V > VM . 4.4 As we will briefly discuss in Sect. 6.5.5, the theory of turbulent fluid flows often proceeds by trying to compute averages of the flow properties. This question will consider a similar approach to the Lorenz equations. 
The Lorenz equations are given by
$$\dot x = -\sigma(x - y), \qquad \dot y = (r - z)x - y, \qquad \dot z = xy - bz,$$
and it is known that at the 'usual' values $r = 28$, $\sigma = 10$, $b = 8/3$, all three steady states are unstable, and the attracting set is a strange attractor (in which there are embedded an infinite number of unstable periodic orbits, for example). We will assume that $r > 1$ (otherwise all averages $\bar\phi(x, y, z) = \phi(0, 0, 0)$, by question 4.2). The time average of a quantity such as $x(t)$ is denoted $\bar x$, and defined as
$$\bar x = \lim_{T \to \infty} \frac{1}{T}\int_0^T x\,dt.$$
Show that the time average of the time derivative of any bounded function $f(x, y, z)$ is zero. Let us make the ergodic assumption about the chaotic attractor, which is that for almost all initial conditions, the average of any quantity $\phi(x, y, z)$ is independent of the initial condition. Basically, this says that almost any trajectory 'fills out' the attractor as $t \to \infty$. With this assumption, use the symmetry of the equations to show that $\bar x = \bar y = \overline{xz} = \overline{yz} = 0$. Deduce for all initial conditions that
$$\overline{xy} = b\bar z, \qquad \overline{x^2} = \overline{xy}, \qquad \overline{xyz} = b\overline{z^2}, \qquad \overline{y^2} = b\left(r\bar z - \overline{z^2}\right),$$
and hence that
$$\overline{\dot x^2} = \sigma^2 b\left[(r - 1)\bar z - \overline{z^2}\right],$$
and thus that $\overline{z^2} \le (r - 1)\bar z$, with equality only if the initial condition is one of the steady states (note that these are not generic). Use the Cauchy–Schwarz inequality for integrals to show that $\bar z^2 \le \overline{z^2}$, and deduce that, if $\bar z > 0$,
$$\bar z \le r - 1, \qquad \overline{z^2} \le (r - 1)^2,$$
with equality only for the non-trivial steady states. Why may we assume that $\bar z > 0$? [Although these upper bounds are attained at the steady states, they are not attained on the chaotic attractor: for the usual values of the parameters, for example, it appears that $\bar z \approx 23.55$, with some numerical uncertainty. See also Goluskin (2018).]
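These average identities are easy to probe numerically. The following sketch (our own construction, not part of the text; the integration scheme, step size and averaging window are arbitrary choices, and Python is used for definiteness) integrates the Lorenz equations with a fourth-order Runge–Kutta method and compares finite-time averages with the relations above.

```python
# Finite-time check of the average relations for the Lorenz equations
# at the usual parameters (sigma, r, b) = (10, 28, 8/3).
# Scheme, step size and run length are our own (hypothetical) choices.

def rhs(u, sigma=10.0, r=28.0, b=8.0/3.0):
    x, y, z = u
    return [sigma * (y - x), (r - z) * x - y, x * y - b * z]

def rk4(u, dt):
    k1 = rhs(u)
    k2 = rhs([u[i] + 0.5 * dt * k1[i] for i in range(3)])
    k3 = rhs([u[i] + 0.5 * dt * k2[i] for i in range(3)])
    k4 = rhs([u[i] + dt * k3[i] for i in range(3)])
    return [u[i] + dt * (k1[i] + 2*k2[i] + 2*k3[i] + k4[i]) / 6 for i in range(3)]

dt, u = 0.005, [1.0, 1.0, 1.0]
for _ in range(4000):               # discard a transient (t = 20)
    u = rk4(u, dt)

n = 100000                          # average over t = 500
sz = sxy = sx2 = sz2 = 0.0
for _ in range(n):
    u = rk4(u, dt)
    x, y, z = u
    sz += z; sxy += x * y; sx2 += x * x; sz2 += z * z

z_avg, xy_avg, x2_avg, z2_avg = sz / n, sxy / n, sx2 / n, sz2 / n
b, r = 8.0 / 3.0, 28.0
print(z_avg, xy_avg, b * z_avg)     # xy-average should be close to b * z-average
print(z_avg <= r - 1, z2_avg <= (r - 1) * z_avg)
```

At these parameters $\bar z$ comes out near 23.5, comfortably below the bound $r - 1 = 27$, while $\overline{xy}$ and $\overline{x^2}$ track $b\bar z$ to within the finite-time error.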



4.5 The Rössler equations are given by the system
$$\dot x = -y - z, \qquad \dot y = x + ay, \qquad \dot z = b + (x - c)z,$$
where $a$, $b$ and $c$ are positive constants. Show first that if $z > 0$ initially, it remains so. Assuming this, it is of interest to know whether trajectories are bounded for all time. On the assumption that they are, and using the same notation as in question 4.4, show that $\bar x = a\bar z$, $\bar y = -\bar z$, and then also that
$$\overline{xy} = b - c\bar z, \qquad \overline{xz} = c\bar z - b, \qquad \overline{y^2} = \frac{c\bar z - b}{a}.$$
Using these results, show also that
$$b\,\overline{z^{-1}} + a\bar z = c,$$
and deduce that trajectories cannot remain bounded if $c < 2\sqrt{ab}$.

4.6 Shilnikov bifurcations occur for systems of the form
$$\dot x = -\lambda_2 x - \omega y + P(x, y, z, \mu), \qquad \dot y = \omega x - \lambda_2 y + Q(x, y, z, \mu), \qquad \dot z = \lambda_1 z + R(x, y, z, \mu),$$
where $P, Q, R = O(r^2)$. If there is a homoclinic bifurcation at $\mu = 0$, then an appropriate Poincaré map is approximately one-dimensional, and is
$$\zeta \to a\zeta^\delta\cos\left[\frac{\omega}{\lambda_1}\ln(1/\zeta)\right] + \mu \quad \text{if } \zeta > 0,$$
where $\zeta \propto z$ on the Poincaré surface $r = (x^2 + y^2)^{1/2} = c$. Here $\delta = \lambda_2/\lambda_1$. Deduce that for $\delta < 1$, an infinite number of periodic orbits bifurcate at $\mu = 0$, but for $\delta > 1$, only one does. Draw a bifurcation diagram of $\ln(1/\zeta)$ versus $\mu$ in each case.
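The mean relations of exercise 4.5 can also be checked numerically. The sketch below (our own construction, in Python; the classical parameter values $(a, b, c) = (0.2, 0.2, 5.7)$ and the averaging window are assumptions, not taken from the text) integrates the Rössler system and compares $\bar x$ with $a\bar z$ and $\bar y$ with $-\bar z$.

```python
# Finite-time check of the Rössler mean relations x̄ = a z̄ and ȳ = -z̄.
# Parameter values and run lengths are our own (hypothetical) choices.

def rhs(u, a=0.2, b=0.2, c=5.7):
    x, y, z = u
    return [-y - z, x + a * y, b + (x - c) * z]

def rk4(u, dt):
    k1 = rhs(u)
    k2 = rhs([u[i] + 0.5 * dt * k1[i] for i in range(3)])
    k3 = rhs([u[i] + 0.5 * dt * k2[i] for i in range(3)])
    k4 = rhs([u[i] + dt * k3[i] for i in range(3)])
    return [u[i] + dt * (k1[i] + 2*k2[i] + 2*k3[i] + k4[i]) / 6 for i in range(3)]

dt, u = 0.01, [1.0, 1.0, 0.1]
for _ in range(5000):                # transient, t = 50
    u = rk4(u, dt)

n = 100000                           # average over t = 1000
sx = sy = sz = 0.0
zmin = float('inf')
for _ in range(n):
    u = rk4(u, dt)
    sx += u[0]; sy += u[1]; sz += u[2]
    zmin = min(zmin, u[2])

x_avg, y_avg, z_avg = sx / n, sy / n, sz / n
print(x_avg, 0.2 * z_avg)            # x̄ should be close to a z̄
print(y_avg, -z_avg)                 # ȳ should be close to -z̄
print(zmin > 0.0)                    # z remains positive, as the exercise asserts
```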



4.7 The Cantor middle-thirds set is constructed by deleting successive middle-thirds of intervals, as follows:
$$S_0 = [0, 1], \qquad S_1 = \left[0, \tfrac{1}{3}\right] \cup \left[\tfrac{2}{3}, 1\right], \qquad S_2 = \left[0, \tfrac{1}{9}\right] \cup \left[\tfrac{2}{9}, \tfrac{1}{3}\right] \cup \left[\tfrac{2}{3}, \tfrac{7}{9}\right] \cup \left[\tfrac{8}{9}, 1\right], \ldots$$
Show that $\lim_{n \to \infty} S_n$ exists, and is a Cantor set (closed, no interior points, no isolated points). [Hint: use a ternary representation for $x \in [0, 1]$.] Show that this Cantor set has (Lebesgue) measure zero. Can you define a mapping $\phi$ such that $\phi(S_\infty) = S_\infty$, and which is chaotic?

4.8 Let $\Sigma_2$ be the sequence space of bi-infinite sequences consisting of symbols 0 or 1,
$$s = \ldots s_{-2}s_{-1}.s_0s_1s_2\ldots$$
(a version for semi-infinite sequences was defined in question 2.9). Consider the conjugacy between the Smale horseshoe map $\psi$ indicated in Fig. 4.11 and the shift map $\sigma$ on $\Sigma_2$.

(i) Define a metric on $\Sigma_2$.
(ii) Show that $\sigma$ is continuous and (continuously) invertible.
(iii) Construct a dense orbit for $\sigma$.
(iv) Let $\bar 0 = (\ldots 00.000\ldots) \in \Sigma_2$. A sequence $s$ is called homoclinic if $\sigma^n(s) \to \bar 0$ as $n \to \pm\infty$. Prove that homoclinic sequences are dense in $\Sigma_2$.
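The two quantitative claims in exercise 4.7 can be illustrated computationally. The sketch below (our own construction, in Python, using exact rational arithmetic; all names are our own choices) builds the levels $S_n$, verifies that the total length of $S_n$ is $(2/3)^n$ (so the limit set has measure zero), and checks that the non-endpoint Cantor point $x = 3/4 = 0.202020\ldots$ (ternary) lies in every level.

```python
from fractions import Fraction

# Construct the levels S_n of the middle-thirds set exactly.
def next_level(intervals):
    out = []
    for a, b in intervals:
        third = (b - a) / 3
        out.append((a, a + third))       # keep the left third
        out.append((b - third, b))       # keep the right third
    return out

levels = [[(Fraction(0), Fraction(1))]]
for _ in range(10):
    levels.append(next_level(levels[-1]))

# Total length of S_n is (2/3)^n, which tends to zero: measure zero.
total_length = sum(b - a for a, b in levels[10])
print(total_length == Fraction(2, 3) ** 10)

# x = 3/4 has ternary expansion 0.202020..., containing no digit 1,
# so it should lie in every S_n.
x = Fraction(3, 4)
in_every_level = all(any(a <= x <= b for a, b in lvl) for lvl in levels)
print(in_every_level)
```

As for the chaotic map with $\phi(S_\infty) = S_\infty$: one natural candidate (our suggestion, not necessarily the answer intended by the text) is the ternary shift $\phi(x) = 3x \bmod 1$, which deletes the leading ternary digit and so maps sequences of 0s and 2s to sequences of 0s and 2s.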

4.9 The Lorenz equations are given by
$$\dot x = -\sigma x + \sigma y, \qquad \dot y = (r - z)x - y, \qquad \dot z = xy - bz.$$
By rescaling the variables, show that they may be written in the form
$$\dot x = -x + y, \qquad \dot y = \rho(1 - z)x - \gamma y, \qquad \dot z = xy - \delta z,$$
and define the parameters $\gamma$ and $\delta$ ($\rho$ is defined by $\rho = r/\sigma$). Assuming that $r \sim \sigma \gg 1$, show that $\delta$ and $\gamma$ are small. Henceforth, neglect $\gamma$ but retain the small $\delta$ term. Solve the system numerically at suitable parameter values (for example take $\sigma = 100$, $r = 120$, $b = 1$), and show that the solution consists of fast spikes in $x$ and $y$ during which $z$ jumps up, and long quiescent phases, when $x$ and $y$ are small and $z$ slowly decays.
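The suggested numerical experiment can be sketched as follows (our own construction, in Python; the exercise suggests the parameter values, but the step size, run length and spike/quiescence thresholds below are our own assumptions). The run should show both large excursions (spikes) and long stretches where $|x|$ is small.

```python
# Integrate the Lorenz equations at the exercise's suggested values
# sigma = 100, r = 120, b = 1 and look for spike/quiescent structure.
# Thresholds and run length are our own (hypothetical) choices.

def rhs(u, sigma=100.0, r=120.0, b=1.0):
    x, y, z = u
    return [sigma * (y - x), (r - z) * x - y, x * y - b * z]

def rk4(u, dt):
    k1 = rhs(u)
    k2 = rhs([u[i] + 0.5 * dt * k1[i] for i in range(3)])
    k3 = rhs([u[i] + 0.5 * dt * k2[i] for i in range(3)])
    k4 = rhs([u[i] + dt * k3[i] for i in range(3)])
    return [u[i] + dt * (k1[i] + 2*k2[i] + 2*k3[i] + k4[i]) / 6 for i in range(3)]

dt, u = 0.001, [1.0, 1.0, 1.0]
xs = []
for k in range(150000):              # integrate to t = 150
    u = rk4(u, dt)
    if k >= 20000:                   # discard transient (t < 20)
        xs.append(u[0])

max_abs_x = max(abs(v) for v in xs)  # spike amplitude
min_abs_x = min(abs(v) for v in xs)  # quiescent-phase amplitude
bounded = all(abs(v) < 1e4 for v in xs)
print(bounded, max_abs_x, min_abs_x)
```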



First consider the quiescent phase, where $x, y \ll 1$. Show that while this is the case, $z \approx Me^{-\tau}$, where $\tau = \delta t$, and we suppose $\tau \approx 0$ marks the end of a fast phase. Next consider a fast phase, in which $t \sim O(1)$. By neglecting $\delta$, find a first integral involving $y$ and $z$. Hence show that if $x, y \approx 0$ and $z \approx m < 1$ at the beginning of the fast phase, then we can write
$$\rho(1 - z) \approx -k\cos\phi, \qquad y \approx -\frac{k}{\sqrt{\rho}}\sin\phi, \qquad k = \rho(1 - m),$$
where $\phi \approx \pi$ at the start of the fast phase. Deduce an equation for $\phi$, and thus show that $\phi \to 0$ at the end of the fast phase, and thus that $z \to M = 2 - m$.

In order to find where the next fast phase occurs, we consider the variation of $x$ in the slow (quiescent) phase. Show that $x$ satisfies
$$\ddot x + \dot x - \rho(1 - Me^{-\tau})x \approx 0,$$
and by taking $x \sim \exp[\Sigma(\tau)/\delta]$ in the WKB method, show that the two independent solutions for $\Sigma$ satisfy
$$\Sigma'^2 + \delta\Sigma'' + \Sigma' = \rho(1 - Me^{-\tau}).$$
By selecting (why?) the leading order solution that can grow again, show that
$$\Sigma \approx \int_0^\tau\left[-\tfrac{1}{2} + \left\{\tfrac{1}{4} + \rho(1 - Me^{-s})\right\}^{1/2}\right] ds.$$
Hence give reasons why the next fast pulse occurs at $\tau = \tau^*(M)$, where
$$\int_0^{\tau^*}\left[-\tfrac{1}{2} + \left\{\tfrac{1}{4} + \rho(1 - Me^{-s})\right\}^{1/2}\right] ds = 0,$$
and thus why a suitable 'Poincaré' (though actually one-dimensional) map relating successive maxima of $z$ is
$$M \to 2 - Me^{-\tau^*(M)}.$$
[The detail is actually more subtle than this; for further information see Fowler and McGuinness (1982).]

4.10 A bifocal homoclinic orbit $\Gamma$ of a system
$$\dot x = f(x, \mu)$$



is one in which both the approach to, and the path from, the fixed point $O: x = 0$ is oscillatory (at a value $\mu = 0$, say). Such systems are at least four-dimensional, and in terms of appropriate pairs of polar coordinates near $O$ for such a four-dimensional system, we can assume the local linear behaviour near $O$ is given by
$$\dot r_1 = -\lambda_1 r_1, \qquad \dot\theta_1 = \omega_1, \qquad \dot r_2 = \lambda_2 r_2, \qquad \dot\theta_2 = \omega_2,$$
where $\lambda_1$ and $\lambda_2$ are positive. An approximate Poincaré map is constructed by defining two (hyper-)surfaces $S$ and $S'$: $S$ is defined by $r_1 = h$, $0 \le r_2 \le \bar r_2$, and $S'$ is defined by $r_2 = k$, $0 \le r_1 \le \bar r_1$, where $h$, $k$, $\bar r_1$ and $\bar r_2$ are suitably small quantities. Show that $S$ and $S'$ define tori, and draw them, indicating the orientation of the angle variables $\theta_1$ and $\theta_2$ in each. Derive the map from $(r_2^0, \theta_1^0, \theta_2^0) \in S$ to $(r_1, \theta_1, \theta_2) \in S'$:
$$r_1 = h\left(\frac{r_2^0}{k}\right)^\delta, \qquad \theta_1 = \theta_1^0 + \frac{\omega_1}{\lambda_2}\ln\left(\frac{k}{r_2^0}\right), \qquad \theta_2 = \theta_2^0 + \frac{\omega_2}{\lambda_2}\ln\left(\frac{k}{r_2^0}\right),$$
where $\delta = \lambda_1/\lambda_2$, and calculate also the time of transit $T$ from $S$ to $S'$.

The homoclinic orbit $\Gamma$ at $\mu = 0$ passes through $S$ at $R: r_2 = \theta_1 = 0$, and through $S'$ at $R': r_1 = \theta_2 = 0$, and we consider only orbits which remain close to $R$ and $R'$; thus we consider only trajectories for which $|\theta_1| \ll 1$ in $S$ and $|\theta_2| \ll 1$ in $S'$. Show that this restricts the points of consideration in $S$ and $S'$ to spiral scrolls $S_c \subset S$ and $S_c' \subset S'$:
$$\theta_2 \approx -\frac{\omega_2}{\lambda_2}\ln\left(\frac{k}{r_2}\right) \text{ in } S, \qquad \theta_1 \approx \frac{\omega_1}{\lambda_1}\ln\left(\frac{h}{r_1}\right) \text{ in } S',$$
and draw these scrolls.



Points in the scroll $S_c'$ in $S'$ are mapped affinely to points in $S$ via
$$\begin{pmatrix} r_2\cos\theta_2 \\ r_2\sin\theta_2 \\ h\theta_1 \end{pmatrix} = A\begin{pmatrix} r_1\cos\theta_1 \\ r_1\sin\theta_1 \\ k\theta_2 \end{pmatrix} + \mu d.$$
Show that, for points in $S_c$ which are mapped to $S_c'$, the resultant Poincaré map can be approximated by the one-dimensional map
$$a_1\exp(-\lambda_2 P')\cos(\omega_2 P' + \phi_2) = b_1\exp(-\lambda_1 P)\cos(\omega_1 P + \phi_1) + \mu,$$
where $P$ is the time of passage. Hence deduce the bifurcation diagram of $P$ versus $\mu$ for periodic solutions. [See also Fowler and Sparrow (1991).]

4.11 An autonomous partial differential equation on the infinite real axis is of the form
$$A_t = N(A; \mu), \qquad (*)$$
where $N$ is a nonlinear operator. It is assumed that at $\mu = 0$ there is a homoclinic orbit $A_0(x, t)$ such that $A \to 0$ uniformly in $x$ as $t \to \pm\infty$. In a manner analogous to the derivation of a one-dimensional map for the recurrence time $P$ as in (4.85), we can derive maps of the form
$$\sum_m \frac{c_{jm}\,e^{-i\omega_m P' - ik_m L}}{P' + i\lambda_m L} = \sum_m \frac{d_{jm}\,e^{i\omega_m P}}{P} + \mu, \qquad j = 1, 2.$$

Defining $\sigma(k)$ to be the dispersion relation for solutions $A = e^{ikx + \sigma t}$ of the linearisation of $(*)$ about $A = 0$ at $\mu = 0$, the integers $m$ denote the values where $\operatorname{Re}\sigma = 0$ (and $\operatorname{Im}\sigma = \omega_m$), and $\lambda_m = -1/\sigma'(k_m)$. Two equations arise (determining the recurrence time $P'$ and the translational shift $L$) because of an assumed translation invariance in both space and time; the translation shift is the shift in $x$ where the next near-homoclinic orbit occurs (with the preceding one at $x = 0$). Assuming a real system with a real dispersion relation which depends only on $k^2$ and with a single pair of zeros at $\pm k_0$, where $\sigma = 0$ and $\lambda = \pm\lambda_0$, so that the $c_{jm}$ occur as complex conjugates, show that the maps take the form
$$c_1\zeta + \bar c_1\bar\zeta = \mu + \frac{d_1}{P}, \qquad c_2\zeta + \bar c_2\bar\zeta = \mu + \frac{d_2}{P},$$
where the overbars denote complex conjugates and
$$\zeta = \frac{e^{-ik_0 L}}{P' + i\lambda_0 L}.$$
Hence show that the maps combine to the single complex map
$$\frac{e^{-ik_0 L}}{P' + i\lambda_0 L} = \frac{a}{P} + \mu b,$$
and give expressions for $a$ and $b$. Show in general that periodic orbits ($P' = P$, $L = 0$) do not exist, but that there is an infinite sequence of modulated travelling waves ($P' = P$, $L \ne 0$), and show that as $\mu \to 0$, $\mu \sim 1/P$ with $L$ fixed. Describe (inexhaustively) whether these solutions exist for $\mu > 0$ or $\mu < 0$, depending on the size of $|a|$. [For further information see Fowler (1990b) and Drysdale (1994).]

4.12 The Rössler equations are given by the system
$$\dot x = -y - z, \qquad \dot y = x + ay, \qquad \dot z = b + (x - c)z,$$
where $a$, $b$ and $c$ are positive constants, and solutions are to be sought when $a, b = O(1)$, $c \gg 1$. By suitably rescaling the equations, show that they can be written in the form
$$\ddot x - a\dot x + x = az - \dot z, \qquad \varepsilon\dot z = \varepsilon^2 b + (x - 1)z,$$
where you should define $\varepsilon \ll 1$. Show that if initially $x < 1$ and $\dot x = 0$, there is a slow phase in which $z \ll 1$, and
$$x = -Ce^{\frac{1}{2}at}\cos(\omega t + \chi), \qquad \dot x = Ce^{\frac{1}{2}at}\sin\omega t,$$
where we define
$$\tfrac{1}{2}a = \sin\chi, \qquad \omega = \left(1 - \tfrac{1}{4}a^2\right)^{1/2} = \cos\chi.$$
Show that this slow phase approximation breaks down as $x \to 1$. If this occurs at $t = t_c$, and $x \approx 1 + \beta(t - t_c) - \gamma(t - t_c)^2$ near that value, show that $\gamma \approx \tfrac{1}{2}(1 - a\beta)$. If $\beta = \varepsilon^{1/3}\beta'$, $\beta' \sim O(1)$, derive a new approximation by writing
$$t = t_c + \varepsilon^{1/3}T, \qquad x = 1 + \varepsilon^{2/3}X, \qquad z = \varepsilon^{4/3}bZ,$$
and show that the solution for $Z$ which matches to the solution in $t < t_c$ can be written in the form
$$Z = \exp\left[\tfrac{1}{2}\beta' T^2 - \tfrac{1}{6}T^3\right]\int_0^\infty \exp\left[-\tfrac{1}{2}\beta'(s - T)^2 - \tfrac{1}{6}(s - T)^3\right] ds.$$
Show that this solution has a maximum at $T \approx 2\beta'$, and for large $\beta'$ that this maximum is approximately equal to $\left(\frac{2\pi}{\beta'}\right)^{1/2}e^{\frac{2}{3}\beta'^3}$, and that for large $T$, the solution matches back to another slow phase. [You may like to compare this behaviour with Fig. A.3.]
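The quoted formula for $Z$ can be sanity-checked numerically. With $\gamma \approx \tfrac12$, the inner-layer problem reduces (this reduction is our own derived form, stated here as an assumption rather than as the book's) to $Z_T = 1 + (\beta'T - \tfrac12 T^2)Z$, and the displayed integral should satisfy it. A Python sketch with our own discretisation choices:

```python
import math

# Check that Z(T) = exp(F(T)) * ∫_0^∞ exp[-β'(s-T)²/2 - (s-T)³/6] ds,
# with F(T) = β' T²/2 - T³/6, satisfies  dZ/dT = 1 + (β' T - T²/2) Z.
# β', grid sizes and the test point are our own (hypothetical) choices.

BP = 1.0                  # beta'
DS, S_MAX = 1e-3, 30.0    # quadrature step and (fixed) upper cutoff

def Z(T):
    pref = math.exp(0.5 * BP * T * T - T**3 / 6.0)
    n = int(S_MAX / DS)
    total = 0.0
    for k in range(n):            # composite midpoint rule
        v = (k + 0.5) * DS - T    # v = s - T
        total += math.exp(-0.5 * BP * v * v - v**3 / 6.0)
    return pref * total * DS

T0, h = 1.0, 1e-3
dZ = (Z(T0 + h) - Z(T0 - h)) / (2.0 * h)      # central difference
X = BP * T0 - 0.5 * T0 * T0
residual = dZ - 1.0 - X * Z(T0)
print(Z(T0), residual)                         # residual should be tiny
```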

Chapter 5

Hamiltonian Systems

Celestial mechanics, the study of the motions of the heavenly bodies, is a subject which has occupied the attention of the popular mind throughout history. From the time of the Babylonian and Aztec astronomer-priests, through to the best-selling musings of their modern counterparts, relativistic gurus such as Hawking and Weinstein, man has been enthralled by the business of understanding how the cosmos works. Celestial mechanics is littered with scientific giants. Kepler and Galileo; Newton, of course; Lagrange, Laplace, Poincaré, Arnold. The study of the dynamics of the solar system has a deceptively simple object, aptly summarised by the statement of a prize question proposed by the Swedish King Oscar II in 1885: for an arbitrary system of mass points which attract each other according to Newton's laws, assuming that no two points ever collide, give the coordinates of the individual points for all time as the sum of uniformly convergent series whose terms are made up of known functions. This is a tall order, and in fact, it is in general not possible. However, there are some grounds for hope. Taking the solar system as the Sun together with nine¹ planets, each considered as a point mass, it is evident that the motion of each planet is primarily determined by the attractive force of the Sun. Roughly, the problem is one of nine separate 'two-body' problems, each of which can be solved by Newtonian mechanics to give the well-known elliptical orbits of Kepler. The reason for this is that interplanetary forces are much smaller than the force due to the Sun, so that the interplanetary forces act as small perturbations. It may be hoped that these could be analysed using appropriate perturbation expansions, and indeed this can be done, and gives valid results for finite times.
The problem is, and this was pointed out by Poincaré in 1892 (and he won the prize, although unable to answer the question affirmatively), that the series expansions so obtained do not converge: they are asymptotic expansions. In particular, they are unable to give results which are accurate for all time. Therefore, the question of the ultimate stability of the solar system (is it eventually liable to blow apart?) cannot be answered definitely by perturbation methods. We shall see that the KAM theorem will suggest that it 'probably' won't, while very long period integrations suggest some degree of chaos, mostly associated with Pluto.

¹ Or eight, Pluto having been demoted.

© Springer Nature Switzerland AG 2019 A. Fowler and M. McGuinness, Chaos
5.1 Lagrangian Mechanics

Classical mechanics is concerned with systems of particles which interact in a clean, frictionless way. Particularly, there is no loss of energy, so that such systems are called conservative. Examples of systems describable in this way are rigid body motions and systems of gravitationally interacting particles. Often in such systems, it is convenient to use a set of generalised coordinates, denoted $q_j$, $j = 1, \ldots, n$, which naturally allow for any constraints on the motion. For example, the motion of a particle on the surface of a sphere is most easily described by the polar and azimuthal angles $\theta$ and $\phi$.

Up to now in this book, we have eschewed the applied mathematician's normal practice of writing vectors in boldface, but in discussing mechanics we will (initially) revert to this practice. The principal reason for this is that, as we begin by dealing with systems of particles labelled by a subscript $i$, it avoids to some extent the confusion that would otherwise arise because the components of vectors naturally have the same notation. Additionally, we want to use $r$, for example, to denote the magnitude of the vector $\mathbf r$. However, we will return to our previous practice by the time we get to Sect. 5.3.

Suppose we have a set of particles of mass $m_i$ at position $\mathbf r_i$, describable by generalised coordinates $q$, so that $\mathbf r_i = \mathbf r_i(q, t)$. Suppose also that there are internal forces acting on the particles, $\mathbf f_i$, and external forces, $\mathbf F_i$; Newton's equation of motion is then
$$m_i\ddot{\mathbf r}_i = \mathbf F_i + \mathbf f_i. \qquad (5.1)$$
We suppose that in any (virtual)² displacement of the particles (subject to any constraints), the internal forces do no work, that is, $\sum_i \mathbf f_i \cdot d\mathbf r_i = 0$. This is not as mysterious as it looks; more or less, it is a consequence of an assumption that forces between particles (for example, gravitational, electrostatic, etc.) are equal and opposite.
For example, consider two particles 1 and 2 on a line a distance $x$ apart, subject to a mutually attractive force derived from a potential $V(x)$, so that
$$f_1 = -\frac{\partial V}{\partial x}, \qquad f_2 = \frac{\partial V}{\partial x};$$
then if particle 2 is moved a distance $dx$, we have
$$f_1\,dx = -dV = -f_2\,dx;$$

² A virtual displacement is one which is externally imposed, i.e. an external agent takes a particular configuration of the particles and simply moves them to different positions; it is not a consequence of dynamics.


the extension to a system of particles is straightforward. It follows that
$$\sum_i m_i\ddot{\mathbf r}_i \cdot d\mathbf r_i = \sum_i \mathbf F_i \cdot d\mathbf r_i,$$
and this is known as D'Alembert's principle. To put this equation into generalised coordinates, we write it as
$$\sum_{i,j} m_i\ddot{\mathbf r}_i \cdot \frac{\partial\mathbf r_i}{\partial q_j}\,dq_j = \sum_j Q_j\,dq_j,$$
where $Q_j$ are the components of the generalised force,
$$Q_j = \sum_i \mathbf F_i \cdot \frac{\partial\mathbf r_i}{\partial q_j}.$$


Now the velocities are given by r˙ i = vi = ˙ t)) whence (with vi = vi (q, q,

and also d dt

∂ri ∂q j


∂ri  ∂ri + q˙k , ∂t ∂qk k

∂vi ∂ri = , ∂ q˙k ∂qk

 ∂ 2 ri ∂ 2 ri ∂ r˙ i ∂vi q˙k = = . + ∂q j ∂t ∂q j ∂qk ∂q j ∂q j k





 ∂ri ∂ri = m i v˙ i . ∂q ∂q j j i, j i, j     d  ∂ri  d ∂ri − vi . vi . = mi dt ∂q j dt ∂q j i, j      d ∂vi ∂vi − vi . vi . mi = dt ∂ q˙ j ∂q j i, j 

   d ∂ ∂ 2 2 1 1 = mv mv − . 2 i i 2 i i dt ∂ q˙ j ∂q j j i i m i r¨ i .




Now the kinetic energy is just $T = \sum_i \tfrac{1}{2}m_iv_i^2$, so that (5.5) can be written (using (5.10)) as
$$\sum_j\left[\frac{d}{dt}\left(\frac{\partial T}{\partial\dot q_j}\right) - \frac{\partial T}{\partial q_j} - Q_j\right]dq_j = 0. \qquad (5.11)$$
On the assumption that the constraints are implicitly satisfied by the choice of generalised coordinates, the $q_j$ are independent, and (5.11) implies
$$\frac{d}{dt}\left(\frac{\partial T}{\partial\dot q_j}\right) - \frac{\partial T}{\partial q_j} - Q_j = 0.$$
In the frequent case where the generalised forces are derivable from a potential, i.e. there is a function $V(\mathbf r, t)$ such that $\mathbf F_i = -\nabla_i V$, then the generalised forces are
$$Q_j = -\sum_i \frac{\partial\mathbf r_i}{\partial q_j} \cdot \nabla_i V = -\frac{\partial V}{\partial q_j},$$
and if we define the Lagrangian $L = T - V$, then the system is described by Lagrange's equations:
$$\frac{d}{dt}\left(\frac{\partial L}{\partial\dot q_j}\right) - \frac{\partial L}{\partial q_j} = 0.$$

5.1.1 Hamilton’s Principle Having offloaded the awkward extra index in m i , we now largely revert to the use of the summation convention, whereby summation is implied over repeated indices.3 Lagrange’s equations can be derived from a variational principle by defining the action integral  t2

I =

˙ t) dt. L(q, q,



Hamilton’s principle then states that amongst all possible trajectories which take q from q(t1 ) to q(t2 ), that which is actually realised is the trajectory which gives a stationary value of this integral. To show this, we use the calculus of variations.4 Suppose q(t) is this stationary trajectory with corresponding action integral I , and consider the increment δ I computed by considering a neighbouring trajectory q + δq, such that q + δq has the same values at t1 and t2 , i.e. δq(t1 ) = δq(t2 ) = 0. Then the first- order increment is

3 Occasionally, 4 See

we will add a summation symbol for added clarity. van Brunt (2004).

5.1 Lagrangian Mechanics


 δI =





 ∂L ∂L δqi + δ q˙i dt ∂qi ∂ q˙i 


d dt

∂L ∂ q˙i


 ∂L δqi dt, ∂qi


using the summation convention and integrating by parts. We see that δ I = 0 (for arbitrary δq) only if Lagrange’s equations are satisfied.
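The stationarity of the action can be seen numerically. The sketch below (our own construction, in Python; the oscillator, the interval $[0, 1]$ and the perturbation shape are arbitrary choices) evaluates the action for the simple harmonic oscillator $L = \tfrac12\dot q^2 - \tfrac12 q^2$ along the exact path $q = \sin t$ and along perturbed paths $q + \varepsilon\sin(\pi t)$ with fixed endpoints: the change in the action is $O(\varepsilon^2)$, with no $O(\varepsilon)$ part.

```python
import math

# Action of the SHO along q = sin(t) + eps*sin(pi*t) on [0, 1];
# the perturbation vanishes at both endpoints, as Hamilton's
# principle requires.  Grid size is our own choice.
T1, N = 1.0, 4000

def action(eps):
    h = T1 / N
    total = 0.0
    for k in range(N):               # composite midpoint rule
        t = (k + 0.5) * h
        q = math.sin(t) + eps * math.sin(math.pi * t)
        qdot = math.cos(t) + eps * math.pi * math.cos(math.pi * t)
        total += 0.5 * (qdot * qdot - q * q) * h
    return total

S0 = action(0.0)
d1 = action(1e-3) - S0
d2 = action(2e-3) - S0
print(d1, d2, d2 / d1)   # ratio near 4: the increment scales like eps**2
```

Doubling $\varepsilon$ quadruples the increment, confirming that the first variation vanishes on the true trajectory.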

5.1.2 Hamilton’s Equations Since the generalised forces are Q j = −∂ V /∂q j = ∂ L/∂q j , it is natural to define the generalised momentum by ∂L pi = (5.16) ∂ q˙i ˙ t), and Lagrange’s equations imply so that pi = pi (q, q, p˙ i =

∂L . ∂qi


Hamilton’s equations provide an alternative method of describing the equations of motion in terms of the generalised coordinates and momenta qi and pi . For any function, and in particular for L (at fixed t), we can write the total differential ∂L ∂L dqi + d q˙i ∂qi ∂ q˙i = p˙ i dqi + pi d q˙i .

dL =


Now we wish to use pi , qi as independent variables. To this end, we define the Hamiltonian H = pi q˙i − L, so that its total differential is d H = pi d q˙i + q˙i dpi − d L = q˙i dpi − pi dqi ;


it follows that Hamilton’s equations for p and q are ∂H , ∂ pi ∂H p˙ i = − . ∂qi

q˙i =




As an illustration, suppose that the kinetic energy $T$ is quadratic in $\dot q_i$, i.e. $T = \tfrac{1}{2}A_{ij}\dot q_i\dot q_j$ (summed), where $A_{ij} = A_{ij}(q_k, t)$, and $V = V(q_k, t)$. Then $\partial L/\partial\dot q_i = A_{ij}\dot q_j$, whence
$$p_i\dot q_i = \dot q_i\frac{\partial L}{\partial\dot q_i} = 2T, \qquad (5.21)$$
and therefore
$$H = T + V,$$
which is the total energy.

Example The $n$-body problem describes $n$ masses $m_i$ at positions $\mathbf r_i$ subject to mutual gravitational attractions. Using $\mathbf r_i$ as the generalised coordinates, the Hamiltonian must simply be
$$H = \sum_i \tfrac{1}{2}m_i|\dot{\mathbf r}_i|^2 - \sum_{i \ne j}\frac{Gm_im_j}{|\mathbf r_i - \mathbf r_j|} = \sum_i \frac{|\mathbf p_i|^2}{2m_i} - \sum_{i \ne j}\frac{Gm_im_j}{|\mathbf q_i - \mathbf q_j|},$$
where $\mathbf q_i = \mathbf r_i$, and $\mathbf p_i = m_i\dot{\mathbf q}_i$ are the generalised vector momenta.

Example A particle in a potential well satisfies
$$m\ddot x + V'(x) = 0.$$
With $q = x$ and $p = m\dot x$, the Hamiltonian is
$$H = \frac{p^2}{2m} + V(q).$$
5.2 Hamiltonian Mechanics Suppose H is independent of time. Then ∂H ∂H H˙ = p˙ i + q˙i = 0, ∂ pi ∂qi


and H is constant. This is just the principle of conservation of energy. Consider, for example, the simple oscillator in the second example above. We have5 5 Some

care is properly required with the sign of the square root.

5.2 Hamiltonian Mechanics


p = [2m(E − V (q))]1/2 ,


where H = E is now constant, and also q˙ = thus

∂H = p/m, ∂p

2{E − V (q)} q˙ = m


1/2 ,




and the solution is obtained by formal quadrature,  t=


m 2{E − V (q)}


Example A slightly less trivial example is that of $n$ identical (uncoupled) oscillators, whose Hamiltonian is
$$H = \sum_i\left[\frac{p_i^2}{2m} + V(q_i)\right], \qquad (5.31)$$
or $H = \sum_i H_i$, where $H_i = H^*(p_i, q_i)$, and
$$H^*(p, q) = \frac{p^2}{2m} + V(q);$$
it follows that there are $n$ first integrals $H^*(p_i, q_i)$, $i = 1, \ldots, n$. This system can then be integrated as in the previous example.

5.2.1 Integrability

Hamiltonian systems which can be completely solved are called integrable; the $n$ uncoupled oscillators above are an example of an integrable system. More generally, the special structure of Hamiltonian systems enables the equations to be solved (in principle) provided $n$ independent integrals can be found. Specifically, if $n$ integrals in involution (which we define below) can be found, then a further $n$ integrals exist, and (for a $2n$-th order system) this implies the integrability of the equations.

Definition The Poisson bracket of two functions $u$ and $v$ is the expression
$$[u, v] = \frac{\partial u}{\partial q_i}\frac{\partial v}{\partial p_i} - \frac{\partial u}{\partial p_i}\frac{\partial v}{\partial q_i}$$
(where the summation convention is employed).



5 Hamiltonian Systems

We then say that a set of functions $\{u_k\}$ is in involution if $[u_r, u_s] = 0$ for all $r$ and $s$. Notice that, for any function,
$$\dot u = \frac{\partial u}{\partial t} + \frac{\partial u}{\partial q_i}\dot q_i + \frac{\partial u}{\partial p_i}\dot p_i = \frac{\partial u}{\partial t} + \frac{\partial u}{\partial q_i}\frac{\partial H}{\partial p_i} - \frac{\partial u}{\partial p_i}\frac{\partial H}{\partial q_i} = u_t + [u, H].$$
For the particular (but important) case of autonomous (time-independent) Hamiltonians, $\dot H = [H, H] = 0$, and $H$ is a constant of the motion, and any set of integrals in involution must then be independent of time also (since each satisfies $[u, H] = 0$). An important property of the Poisson bracket is the Jacobi identity:
$$[u, [v, w]] + [v, [w, u]] + [w, [u, v]] = 0,$$
which allows us to generate further integrals of the motion from a given set. For, if $u, v$ are integrals of an autonomous system with Hamiltonian $H$, then since $[u, H] = [v, H] = 0$, it follows from (5.35) that (using $[a, b] = -[b, a]$)
$$\frac{d}{dt}[u, v] = [[u, v], H] = 0.$$

There is, however, no guarantee that the resulting functions are independent. Suppose now that there are $n$ integrals of the motion $\phi_r(q, p, t)$ in involution (note that $\partial H/\partial t$ is not necessarily zero). In a particular realisation of the motion, we set
$$\phi(q, p, t) = a, \qquad (5.37)$$
with inverse $p = f(q, a, t)$, such that
$$\phi\left(q, f\{q, a, t\}, t\right) \equiv a. \qquad (5.38)$$
The restriction implied by (5.37) confines the motion to an $n$-dimensional submanifold $M$ of the phase space $\mathbb R^{2n}$. Differentiating (5.38), we have (on $M$)
$$\frac{\partial\phi_r}{\partial q_i} + \frac{\partial\phi_r}{\partial p_j}\frac{\partial f_j}{\partial q_i} = 0, \qquad \frac{\partial\phi_s}{\partial q_j} + \frac{\partial\phi_s}{\partial p_i}\frac{\partial f_i}{\partial q_j} = 0,$$
whence multiplying by $\partial\phi_s/\partial p_i$ and $\partial\phi_r/\partial p_j$, respectively, subtracting, and using the involution property of $\{\phi_r\}$, we find

$$\frac{\partial f_j}{\partial q_i} = \frac{\partial f_i}{\partial q_j}. \qquad (5.40)$$
Let $H_1$ be the value of $H$ on the solution manifold $M$, i.e.
$$H_1(q, a, t) = H[q, f(q, a, t), t];$$
then on $M$, we have
$$-\frac{\partial H_1}{\partial q_r} = -\frac{\partial H}{\partial q_r} - \frac{\partial H}{\partial p_i}\frac{\partial f_i}{\partial q_r} = \dot p_r - \dot q_i\frac{\partial f_r}{\partial q_i} = \frac{df_r}{dt} - \dot q_i\frac{\partial f_r}{\partial q_i} = \frac{\partial f_r}{\partial t}. \qquad (5.42)$$
The relationships (5.40) and (5.42) imply the existence of a function $A$ of $q$ and $t$ on $M$ (hence we can write $A = A(q, \phi, t)$), such that
$$\frac{\partial A}{\partial q_i} = f_i, \qquad \frac{\partial A}{\partial t} = -H_1 \quad \text{on } M. \qquad (5.43)$$
Now if we consider the total differential of $A$, we have
$$dA = \frac{\partial A}{\partial q_i}dq_i + \frac{\partial A}{\partial t}dt + \frac{\partial A}{\partial\phi_i}d\phi_i = f_i\,dq_i - H_1\,dt + \frac{\partial A}{\partial\phi_i}d\phi_i,$$
or
$$dA - \frac{\partial A}{\partial\phi_i}d\phi_i = f_i\,dq_i - H_1\,dt,$$
with $H_1 = H_1(q, \phi, t)$. We revert on the right-hand side to variables $q, p, t$ (i.e. we change back from $q, \phi, t$ to $q, p, t$); thus
$$dA - \frac{\partial A}{\partial\phi_i}d\phi_i = p_i\,dq_i - H\,dt. \qquad (5.46)$$
Now the Hamiltonian system of equations is equivalent to
$$0 = \delta\int_{t_1}^{t_2} L\,dt = \delta\int_{t_1}^{t_2}\left(p_i\,dq_i - H\,dt\right), \qquad (5.47)$$
where the integral follows a path in $(q, p)$ space.


In view of (5.46), this is equivalent to
$$\delta\int_{t_1}^{t_2}\left(-\frac{\partial A}{\partial\phi_i}\,d\phi_i\right) = \delta\int_{t_1}^{t_2}\left(-\frac{\partial A}{\partial\phi_i}\dot\phi_i\right)dt = 0, \qquad (5.48)$$
where the path is traced out in $(q, \phi)$ space. Applying the calculus of variations, we have on $M$
$$\delta\int_{t_1}^{t_2}\left(-\frac{\partial A}{\partial\phi_i}\dot\phi_i\right)dt = \int_{t_1}^{t_2}\left(-\frac{\partial^2 A}{\partial\phi_i\,\partial q_j}\dot\phi_i\,\delta q_j - \frac{\partial^2 A}{\partial\phi_i\,\partial\phi_j}\dot\phi_i\,\delta\phi_j - \frac{\partial A}{\partial\phi_i}\,\delta\dot\phi_i\right)dt$$
$$= \int_{t_1}^{t_2}\left[-\frac{\partial^2 A}{\partial\phi_i\,\partial q_j}\dot\phi_i\,\delta q_j + \left\{\frac{d}{dt}\left(\frac{\partial A}{\partial\phi_j}\right) - \frac{\partial^2 A}{\partial\phi_i\,\partial\phi_j}\dot\phi_i\right\}\delta\phi_j\right]dt, \qquad (5.49)$$
and taking account of the fact that $\dot\phi_i = 0$ on $M$, it follows that
$$\frac{\partial A}{\partial\phi_j} = \text{constant on } M, \qquad j = 1, \ldots, n. \qquad (5.50)$$
This gives a further $n$ integrals of the motion, and enables us to solve the system.

5.2.2 Action-Angle Variables

We now restrict ourselves to the autonomous case $H = H(q, p)$. Then (5.42) implies (since all the integrals $\phi_r$ are independent of $t$, hence so also are the $f_r$) that $\partial H_1/\partial q_r = 0$, thus $H_1 = H_1(\phi)$. We can, therefore, define a time-independent function $S(q, I)$ as (note the change of notation from $\phi$ to $I$)
$$S(q, I) = A(q, I, t) + H_1(I)t,$$
since from (5.43) $\partial S/\partial t = 0$. In addition, we have (from (5.43))
$$p_i = \frac{\partial S}{\partial q_i}, \qquad (5.52)$$
and we define variables $\theta_i$ by
$$\theta_i = \frac{\partial S}{\partial I_i} = \frac{\partial A}{\partial I_i} + \frac{\partial H_1}{\partial I_i}t. \qquad (5.53)$$
Then (5.46) becomes
$$d[S - I_i\theta_i] + I_i\,d\theta_i - H_1(I_i)\,dt = p_i\,dq_i - H\,dt.$$
If we consider $\theta$ and $I$ as a new set of generalised coordinates, then in view of (5.47), they form the basis for a Hamiltonian description,⁶ and (writing $H_1 = H$)
$$\dot\theta_i = \frac{\partial H}{\partial I_i}, \qquad \dot I_i = 0.$$


In this form, $(I, \theta)$ are called action-angle variables. In particular, if the $n$-dimensional manifold is bounded (strictly, it should be compact and connected) then (if $\partial H/\partial I_i \ne 0$) $\theta_i$ can only increase indefinitely by representing an angle variable. This is the normal case for systems with finite and positive definite energy (or at least bounded below), and in this case the manifold $M$ is (diffeomorphic to) an $n$-torus: the Cartesian product of $n$ circles. Of course, we would like our angle variables to be $2\pi$-periodic, or equivalently the transformation $(q, p) \to (\theta, I)$ should be $2\pi$-periodic in each $\theta_i$. This is effected by choosing the action variables $I_i$ appropriately. In fact, from (5.53), $\theta_i = \theta_i(q, I) = \partial S/\partial I_i$, thus
$$\frac{\partial\theta_i}{\partial q_j} = \frac{\partial^2 S}{\partial I_i\,\partial q_j}, \qquad (5.56)$$
so that on the $n$-torus $M$ ($I =$ constant)
$$d\theta_i = \frac{\partial\theta_i}{\partial q_j}dq_j = \frac{\partial p_j}{\partial I_i}dq_j, \qquad (5.57)$$
using (5.52). We can define $n$ topologically independent closed loops $\gamma_i$ on $M$ such that $[\theta_i]_{\gamma_i} = 2\pi$, $[\theta_j]_{\gamma_i} = 0$ for $j \ne i$; then (5.57) suggests that we define the action variables as
$$I_i = \frac{1}{2\pi}\oint_{\gamma_i} p_j\,dq_j. \qquad (5.58)$$
5.2.3 Integral Invariants At first sight, it seems we get something for nothing: (5.58) defines the n integrals we are seeking. But of course, this is not really correct, since the curves γi lie on M—which is not known until the integrals are found. All (5.58) does is to tell us what the best constants of integration to use on M are.

⁶ The point is that a Hamiltonian system is defined by Hamilton's principle, to which (5.47) is equivalent, and since $\delta\int_{t_1}^{t_2} d[S - I_i\theta_i] = 0$ (as it is an exact integral), we have $\delta\int_{t_1}^{t_2}\left(I_i\,d\theta_i - H_1\,dt\right) = 0$, hence the conclusion.




We should, though, check that the $I_i$ are invariant. In fact, for any closed curve which moves with the flow, we have, with $H = H(q, p)$,
$$\frac{d}{dt}\oint p_i\,dq_i = \oint\left(\dot p_i\,dq_i + p_i\,d\dot q_i\right) = \oint\left(\dot p_i\,dq_i - \dot q_i\,dp_i\right) = -\oint dH = 0.$$
More generally, $\oint_\gamma\left(p_i\,dq_i - H\,dt\right)$ is invariant along any closed curve in $(q, p, t)$ space which follows trajectories. These integrals are called the Poincaré–Cartan invariants. In general, they depend on the curve $\gamma$, and it is only on invariant tori that they can be used to define action variables.

The Poincaré–Cartan invariants are just the first of a sequence of such invariants. The last of them is associated with Liouville's theorem: phase volumes are preserved under the flow of a Hamiltonian system. Mathematically,
$$\frac{d}{dt}\int_V \prod_i dq_i\,dp_i = 0,$$
where $V$ evolves under the flow. This simply follows from the 'incompressibility' of the flow, i.e.
$$\frac{d}{dt}(dV) = \frac{d}{dt}\prod_i dq_i\,dp_i = \sum_i\left(\frac{\partial\dot q_i}{\partial q_i} + \frac{\partial\dot p_i}{\partial p_i}\right)dV = 0. \qquad (5.61)$$
5.2.4 Canonical Transformations

The above discussion describes the structure of the solution when the integrals of the system are known. In seeking to actually solve an integrable Hamiltonian system, our task is thus one of finding the function $S$, called a generating function. The aim is to find a transformation $(q, p) \to (Q, P)$ which preserves the Hamiltonian structure. In view of the association of this with the variational principle (5.47), it is sufficient to show that
$$p_i\,dq_i - H\,dt = P_i\,dQ_i - K\,dt + dF, \qquad (5.62)$$
where $dF$ is the total differential of some function, and $K(Q, P)$ is the new Hamiltonian. We define $S = S(q, P, t)$ and
$$p_i = \frac{\partial S}{\partial q_i}, \qquad Q_i = \frac{\partial S}{\partial P_i}; \qquad (5.63)$$
then
$$dS = \frac{\partial S}{\partial t}dt + p_i\,dq_i + Q_i\,dP_i = \frac{\partial S}{\partial t}dt + p_i\,dq_i + d(P_iQ_i) - P_i\,dQ_i,$$
hence $p_i\,dq_i - H\,dt = P_i\,dQ_i - K\,dt + dF$, where
$$K = H + \frac{\partial S}{\partial t}, \qquad F = S - P_iQ_i.$$
Thus $Q, P$ are canonical coordinates, and the transformation is a canonical transformation if the Hamiltonian structure is preserved, i.e.
$$\dot Q_i = \frac{\partial K}{\partial P_i}, \qquad \dot P_i = -\frac{\partial K}{\partial Q_i}.$$
What have we done? Nothing much, so far. Define $P$ implicitly via the first of (5.63) (given $S$), then $Q$ from (5.63)₂. Then by writing $K = H(q, p) + \partial S/\partial t$ as a function of $Q$ and $P$, we have a new Hamiltonian system. Even if we are given $S$, finding $P$ is non-trivial (if not impossible), and the resultant $K$ is likely to be worse than the original $H$.

5.2.5 The Hamilton–Jacobi Equation

The point is, of course, that we should try and choose $S$ in order that $K$ be as nice as possible. In particular, for $K$ to be integrable, we seek $K = K(P)$, so that $P, Q$ are action-angle variables, and
$$\dot Q_i = \frac{\partial K}{\partial P_i}, \qquad \dot P_i = 0, \qquad (5.67)$$
which can be integrated directly. Using the definition of $p_i$ in (5.63), this leads us to the Hamilton–Jacobi equation for $S(q, P, t)$:
$$\frac{\partial S}{\partial t} + H\left(q, \frac{\partial S}{\partial q}, t\right) = K(P). \qquad (5.69)$$
Given $K$, this is a first-order partial differential equation for $S$, whose solution is generally as difficult as the original system. However, we shall see in the following subsection how (5.69) can be used in perturbation theory. In the time-independent case, $K = E$ is the total energy (which is given by $H$ at $t = 0$), and we have to solve
$$H\left(q, \frac{\partial S}{\partial q}\right) = E.$$

Example For a one-dimensional oscillator in a potential well,
$$H = \frac{p^2}{2m} + V(q),$$
we define a generating function $S(q, P)$, with
$$p = \frac{\partial S}{\partial q}, \qquad Q = \frac{\partial S}{\partial P},$$
and the Hamilton–Jacobi equation is
$$\frac{1}{2m}\left(\frac{\partial S}{\partial q}\right)^2 + V(q) = E,$$
whence
$$S = \int\left[2m\{E - V(q)\}\right]^{1/2} dq,$$
where $E = E(P)$. We can choose $E(P)$ as we like, but it is convenient to define $E = P$, for then (5.67) implies $\dot Q = 1$, and $Q = t + \text{constant}$. Then
$$S = \int\left[2m\{P - V(q)\}\right]^{1/2} dq,$$
and
$$Q = \frac{\partial S}{\partial P} = \left(\frac{m}{2}\right)^{1/2}\int\frac{dq}{[P - V(q)]^{1/2}} = t + \text{constant}.$$

To choose action-angle variables, we define
$$I = \frac{1}{2\pi}\oint p\,dq = \frac{1}{2\pi}\oint\left[2m\{E - V(q)\}\right]^{1/2} dq,$$
and note $I = I(E)$. To illustrate, the simple harmonic oscillator has $V = \tfrac{1}{2}\omega^2q^2$, where $\omega$ is the frequency. Take $m = 1$; then

$$I = \frac{1}{2\pi}\oint\left[2E - \omega^2q^2\right]^{1/2} dq = E/\omega.$$
The generating function is
$$S = \int\left[2I\omega - \omega^2q^2\right]^{1/2} dq,$$
thus
$$\theta = \frac{\partial S}{\partial I} = \int\frac{\omega\,dq}{\left[2I\omega - \omega^2q^2\right]^{1/2}} = \sin^{-1}\left[\left(\frac{\omega}{2I}\right)^{1/2} q\right],$$
so that
$$q = \left(\frac{2I}{\omega}\right)^{1/2}\sin\theta,$$
where $\dot\theta = \frac{\partial H}{\partial I} = \omega$, i.e., $\theta = \omega t + \text{constant}$.
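The identity $I = E/\omega$ can be confirmed by direct quadrature. The sketch below (our own construction, in Python; the values of $\omega$ and $E$ are arbitrary) evaluates $\oint p\,dq$ over one cycle, using the substitution $q = q_0\sin s$ (with turning point $q_0 = \sqrt{2E}/\omega$) to remove the square-root endpoint singularity.

```python
import math

# I = (1/2π) ∮ [2E - ω² q²]^(1/2) dq for the simple harmonic oscillator,
# compared with the exact value E/ω.
omega, E = 2.0, 3.0
q0 = math.sqrt(2.0 * E) / omega       # turning point

# ∮ p dq = 2 ∫_{-q0}^{q0} sqrt(2E - ω² q²) dq (out and back);
# with q = q0 sin(s), the integrand becomes (2E/ω) cos²(s).
n = 20000
ds = math.pi / n
half_cycle = 0.0
for k in range(n):                    # composite midpoint rule
    s = -math.pi / 2 + (k + 0.5) * ds
    half_cycle += (2.0 * E / omega) * math.cos(s) ** 2 * ds

I = 2.0 * half_cycle / (2.0 * math.pi)
print(I, E / omega)                   # the two values should agree
```

Geometrically, $\oint p\,dq$ is the area of the phase-plane ellipse with semi-axes $\sqrt{2E}/\omega$ and $\sqrt{2E}$, namely $2\pi E/\omega$, which is why $I = E/\omega$ exactly.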



5.2.6 Quasi-Periodic Motion

In action-angle coordinates, the integrable Hamiltonian system can be written as
$$\dot I_i = 0, \qquad \dot\theta_i = \omega_i = \frac{\partial H}{\partial I_i}.$$
The resulting motion on the $n$-torus $T^n$ is quasi-periodic if the frequencies $\omega_i$ are incommensurate (i.e. $\sum_i m_i\omega_i \ne 0$ for all non-zero $m \in \mathbb Z^n$), and then the solution never repeats itself (though it will come arbitrarily close to doing so). Trajectories of the form $q = q(\omega_1 t, \omega_2 t, \ldots, \omega_n t)$, where $q$ is $2\pi$-periodic in each argument separately, are called quasi-periodic. Our main interest in such systems is whether this property is robust under small perturbations, and a prime application is to the motion of the planets in the solar system.

5.3 Perturbation Theory

At this point, having long since lost sight of systems of particles, we now revert to our previous notation, in which the boldface font for vectors is not used. The Hamiltonian for two-dimensional motion of a particle under a central (conservative) force is


$$H = \frac{p_r^2}{2m} + \frac{p_\phi^2}{2mr^2} + V(r), \qquad (5.84)$$
where $p_r = m\dot r$ is the radial momentum, and $p_\phi = mr^2\dot\phi$ is the angular momentum; here $(r, \phi)$ are plane polar coordinates. Two integrals of motion are then $H$ and $p_\phi$, and the system is integrable. The action variables can be found to be
$$I_1 = p_\phi, \qquad I_2 = \frac{1}{2\pi}\oint \left[2m\{E - I_1^2/2mr^2 - V(r)\}\right]^{1/2} dr, \qquad (5.85)$$

and for the Newtonian inverse square law, $V = -k/r$, the resultant motion (for bounded orbits) is periodic. If we take a number of different particles under the same central force, we form the Hamiltonian by summing $H$ given by (5.84) with $p_r$, $p_\phi$, $r$, $\phi$ appropriate to each particle. As such, we then have an integrable system which approximates the $n$-body problem for the solar system: if we consider only the gravitational force due to the Sun on the planets, then each separate oscillator uncouples from the rest, and the trajectories live on a nine-torus in the phase space. However, the real problem also involves the (smaller) gravitational attractions between the planets themselves. Thus, although the problem is no longer integrable, it is still a Hamiltonian system, and its Hamiltonian can be represented as
$$H = H_0(I) + \varepsilon H_1(I, \theta), \qquad \varepsilon \ll 1, \qquad (5.86)$$


where I, θ are the action-angle variables for the unperturbed system. The question we now pose is this: can we use perturbation theory based on expansions in powers of ε to determine whether the perturbed system is integrable? In particular, can we expect that the invariant tori for the integrable unperturbed system remain, being merely shifted by O(ε)? If this is the case, then we would have an affirmative answer to the question of the stability of the solar system.

5.3.1 Other Applications

There are other astronomical problems to which perturbation theory can be applied. Between Mars and Jupiter, many asteroids circle the Sun. Their orbits are perturbed primarily by the influence of Jupiter, but it was noted by Kirkwood in 1866 that the periods of asteroid orbits are not smoothly distributed; there are gaps (the Kirkwood gaps) at periods where no asteroids are found. Moreover, these gaps occur where the missing period and that of Jupiter are rationally related. This phenomenon of resonance is one mechanism whereby the integrability of the perturbed system can be destroyed.



There are also applications to plasma confinement devices. In these, particles are confined to circular orbits by magnetic fields, but any imperfections or time variation lead to perturbations as above, and the issue of instability is very important. It is one reason why the attainment of nuclear fusion in tokamaks is very difficult, despite the efforts of many decades' work.

Unlike perturbation methods in other areas (applied analysis, fluid mechanics), we do not seek to find the form of the perturbed solution in terms of $\varepsilon$. This is because there is likely to be a whole family of solutions corresponding to different energy levels, etc. Rather, we suppose that the perturbed system is integrable, and try to find its action-angle variables by solving the Hamilton–Jacobi equation which results from the canonical transformation of variables. It is in solving the Hamilton–Jacobi equation that perturbation theory is used.

We consider the perturbed Hamiltonian system (5.86), and we assume $I \in \mathbb{R}^n$, $\theta \in S_1^n$, where $S_1$ is the unit circle; thus each component $\theta_i$ of $\theta$ is an angle variable. There is thus no loss of generality in taking the perturbation $H_1$ to be a $2\pi$-periodic function; consequently we can express it as a Fourier series,
$$H_1(I, \theta) = \bar H_1(I) + \sum_{m \neq 0} H_{1m}(I)\, e^{i m\cdot\theta}, \qquad (5.87)$$
where $\bar H_1(I)$ is the average of $H_1$ and $m \in \mathbb{Z}^n$ is an integer-valued vector.

We now seek a near-identity change of variables $(\theta, I) \to (\phi, J)$ via a generating function $S(\theta, J)$, such that $J$ and $\phi$ are action-angle variables for the perturbed system. From (5.63), we have
$$I_i = \frac{\partial S}{\partial \theta_i}, \qquad \phi_i = \frac{\partial S}{\partial J_i}, \qquad (5.88)$$
so that $S = \theta\cdot J$ would correspond to the identity transformation. Therefore we put
$$S = \theta\cdot J + \varepsilon S_1 + \ldots \qquad (5.89)$$
(do not confuse $S_1$ here with the unit circle), and the Hamilton–Jacobi equation (5.70) is
$$H\left(\theta_i, \frac{\partial S}{\partial \theta_i}\right) = K(J), \qquad (5.90)$$
where $K$ is the new (integrable) Hamiltonian that we seek. Expanding, this is
$$H_0\left(J_i + \varepsilon \frac{\partial S_1}{\partial \theta_i} + \ldots\right) + \varepsilon H_1\left(J_i + \varepsilon \frac{\partial S_1}{\partial \theta_i} + \ldots, \theta_i\right) = K,$$
hence, expanding $K$ as well,
$$H_0(J_i) + \varepsilon\left[\sum_i \frac{\partial H_0}{\partial J_i}\frac{\partial S_1}{\partial \theta_i} + H_1(J_i, \theta_i)\right] + \ldots = K_0(J_i) + \varepsilon K_1(J_i) + \ldots,$$
and on equating powers of $\varepsilon$, we choose
$$K_0(J) = H_0(J)$$
(the unperturbed Hamiltonian), and then we have to solve (for $S_1$)
$$\omega_0 \cdot \nabla_\theta S_1 + H_1 = K_1, \qquad \text{where} \qquad \omega_0 = \frac{\partial H_0}{\partial J}$$
is the vector of unperturbed frequencies. In view of (5.87), the solution is
$$K_1(J) = \bar H_1(J), \qquad S_1 = i\sum_{m \neq 0} \frac{H_{1m}(J)}{m\cdot\omega_0}\, e^{i m\cdot\theta}. \qquad (5.96)$$
5.3.2 Resonance and Small Divisors

The first-order perturbation theory given above can be extended to higher order easily enough, but there is a problem. (The problem is similar to that encountered when calculating normal forms (see Chap. 3), and much of the following discussion parallels the development there.) It is this: if the frequencies $\omega_{0i}$ are commensurate, that is to say, there exists at least one integer-valued vector $m$ such that $m\cdot\omega_0 = 0$, then the frequencies are said to be resonant. In this case the simple perturbation expansion does not work, and the existence of an invariant torus for the motion is thrown into doubt. In fact, as we will see later, the topology of the unperturbed torus is destroyed by the occurrence of resonance. The phase locking of the different oscillators leads to a periodic motion which does, however, lie close to the original torus. One might suppose this makes little difference in practice, but the reality is a little more devious than that.

The resonant frequencies are densely distributed in frequency space. Indeed, any rational set of frequencies $\omega_0$, with $\omega_{0i} = r_i/s_i$, is resonant, since
$$r_n \sum_{i=1}^{n-1} s_i \omega_{0i} - s_n\left(\sum_{i=1}^{n-1} r_i\right)\omega_{0n} = 0,$$
and thus any vector $\omega_0$ can be arbitrarily closely approximated by a resonant frequency vector. This in turn implies, for almost every integrable Hamiltonian, that the resonant tori are densely distributed in the phase space.
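The density of resonances can be made concrete with the standard library's rational approximation (a sketch; the helper name is mine). For $n = 2$ and $\omega_0 = (1, \omega)$, a continued-fraction convergent $p/q$ of $\omega$ supplies an integer vector $m = (p, -q)$ with $m\cdot\omega_0 = p - q\omega$ as small as we please.

```python
from fractions import Fraction
import math

def resonant_neighbour(omega, qmax):
    """Integer vector m = (m1, m2) with m . (1, omega) small, built from a
    rational approximation p/q of omega with denominator at most qmax."""
    f = Fraction(omega).limit_denominator(qmax)
    m = (f.numerator, -f.denominator)
    return m, m[0] + m[1] * omega

for qmax in (10, 100, 1000):
    m, r = resonant_neighbour(math.sqrt(2.0), qmax)
    print(qmax, m, r)          # |m . omega_0| shrinks as qmax grows
```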



Fig. 5.1 Locating near-resonant frequency vectors

It is actually worse than this, because the convergence of the series (5.96) for $S_1$ will be jeopardised by the existence of values of $m\cdot\omega_0$ which, though not zero, are small: this is the classical problem of small divisors. To rescue this situation, we have the following

Lemma Almost every (in the sense of Lebesgue measure) $\omega \in \mathbb{R}^n$ satisfies, for all nonzero $m \in \mathbb{Z}^n$,
$$|m\cdot\omega| \geq \frac{K(\omega)}{|m|^\nu}, \qquad (5.98)$$
for some $K$ which depends on $\omega$, and $\nu > n - 1$.⁷

That is to say, for non-resonant values of $\omega$, $m\cdot\omega$ does not come too close to zero. The norm in (5.98) is the 1-norm, that is, $|m| = |m_1| + \cdots + |m_n|$.

Consider a bounded domain $\Omega$ in $\mathbb{R}^n$ of maximal cross-sectional 'area' $A$ (that is, $A$ is the maximum area of the intersection of any $(n-1)$-dimensional hyperplane with $\Omega$, computed in the usual way). Let $m$ be an integer-valued vector, with 1-norm $|m| = \sum_i |m_i|$. The set of values of $\omega$ for which $|m\cdot\omega| < \varepsilon$ is found as follows. Let $m\cdot\tilde\omega = 0$, and put $\omega = \tilde\omega + \alpha m$; then (see Fig. 5.1)
$$|m\cdot\omega| = \alpha|m\cdot m| = \alpha \|m\|_2^2,$$
and $|m\cdot\omega| < \varepsilon$ if $\alpha\|m\|_2 < \varepsilon/\|m\|_2$. Also note that the Euclidean norm $\|m\|_2$ and the one-norm $|m|$ satisfy $\sqrt{n}\,\|m\|_2 \geq |m|$.⁸ Therefore the volume $V_m$ of values of $\omega$ in $\Omega$ satisfying $|m\cdot\omega| < \varepsilon$ is
$$V_m \leq \alpha A \|m\|_2 \leq \frac{\varepsilon\sqrt{n}\,A}{|m|}.$$


Next, the number $N_p$ of integer-valued vectors $m$ satisfying $|m| = p$ is certainly less than $2(2p+1)^{n-1}$ (there are a maximum of $2p+1$ choices for each of the first $n-1$ values $m_1, \ldots, m_{n-1}$, and then two choices for $m_n$). Therefore
$$N_p < 2p^{n-1}\left(2 + \frac{1}{p}\right)^{n-1} \leq \frac{L_n\, p^{n-1}}{\sqrt{n}}, \qquad (5.100)$$

⁷ A certain familiarity presents itself here, if we compare this with (3.169), for example.
⁸ This is easily proved by induction.



where $L_n = 2\sqrt{n}\,3^{n-1}$. Now let $V^p$ denote the volume of the set of $\omega$-values in $\Omega$ satisfying $|m\cdot\omega| < \varepsilon$ for some vector $m \in \mathbb{Z}^n$ with $|m| = p$. Evidently
$$V^p \leq \sum_{|m|=p} V_m < \frac{L_n\, p^{n-1}}{\sqrt{n}}\cdot\frac{\varepsilon\sqrt{n}\,A}{p} = \varepsilon L_n A\, p^{n-2}. \qquad (5.101)$$

Finally, the total volume of all such resonance zones, $V$, satisfies
$$V < L_n A \sum_p \varepsilon_p\, p^{n-2}, \qquad (5.102)$$
where we suppose the choice of $\varepsilon$ may depend on $p$. If we choose $\varepsilon = K|m|^{-\nu}$ ($= Kp^{-\nu}$ for each term in (5.101) and thus (5.102)), then (5.102) implies that $V < O(K)$ provided the series converges, that is, $\nu > n - 1$. Thus, given a bounded set $\Omega$, we see that by choosing $K$ sufficiently small, the volume of resonance zones can be made arbitrarily small. It then follows that (5.98) applies for almost every $\omega \in \Omega$, and hence for almost every $\omega \in \mathbb{R}^n$. For suppose not: then there exists in $\Omega$ a set of finite volume for which $|m\cdot\omega| < K/|m|^\nu$ for every $K$; but we have just shown that the volume of such a set for a fixed $K$ is $O(K)$, and this contradicts the assertion.

A second lemma concerns the rate of convergence of the Fourier coefficients of $H_1$. Suppose that $H_1$ is analytic in $\theta$. We have

$$H_{1m} = \frac{1}{(2\pi)^n}\int_{T^n} H_1(I, \theta)\, e^{-i m\cdot\theta}\, d\theta, \qquad (5.103)$$
where $T^n$ is the $n$-torus $[0, 2\pi)^n$. We assume $|H_1| \leq M$, and that $H_1$ is analytic in a strip $|\mathrm{Im}\,\theta| \leq \rho$. We can then deform the contour of integration to $\mathrm{Im}\,\theta_i = \pm\rho$ as appropriate, so that
$$H_{1m} = \frac{1}{(2\pi)^n}\int_{T^n} H_1\, e^{-|m|\rho}\, e^{-i m\cdot\theta}\, d\theta, \qquad (5.104)$$
and it follows that
$$|H_{1m}| \leq M e^{-|m|\rho}. \qquad (5.105)$$
Combining this result with the preceding lemma, we see that for almost all $\omega$ there is a $K(\omega)$ such that (5.98) holds, and together with (5.105) we have
$$\left|\frac{H_{1m}}{m\cdot\omega}\right| \leq \frac{M |m|^\nu e^{-|m|\rho}}{K}. \qquad (5.106)$$


We therefore see that the Fourier series for $S_1$ in (5.96) will in fact converge, for almost all $\omega$, to a function $S_1(\theta, J)$. Everything seems better, but it is not so. The function $S_1$ must be everywhere discontinuous in $J$, since the set of $J$'s (hence $\omega$'s) for which the Fourier series (5.96) fails to converge is densely distributed. This immediately implies that the series expansion (5.89) is doomed to failure, since higher order terms involve the computation of quantities such as $\dfrac{\partial^2 S_1}{\partial J_i\, \partial \theta_k}$, which do not exist.
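The dichotomy in the lemma can be seen in a small experiment (a sketch; the names, and the choices $n = 2$, $\nu = 1$, are mine). For $\omega = (1, \gamma)$ with $\gamma$ the golden mean, the product $|m\cdot\omega|\,|m|^\nu$ stays bounded away from zero over all $0 < |m| \leq p$, so a constant $K(\omega)$ exists; for a near-resonant $\omega$ it collapses.

```python
import numpy as np
from itertools import product

def divisor_constant(omega, p, nu=1.0):
    """min over integer vectors 0 < |m| <= p (1-norm) of |m . omega| * |m|**nu,
    i.e. the best constant K(omega) visible up to order p in (5.98)."""
    best = np.inf
    for m1, m2 in product(range(-p, p + 1), repeat=2):
        norm = abs(m1) + abs(m2)
        if 0 < norm <= p:
            best = min(best, abs(m1 * omega[0] + m2 * omega[1]) * norm**nu)
    return best

golden = (1.0 + np.sqrt(5.0)) / 2.0
print(divisor_constant((1.0, golden), 200))        # stays O(1): Diophantine
print(divisor_constant((1.0, 0.5 + 1e-6), 10))     # collapses: near-resonant
```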

5.3.3 The KAM Theorem

This is about as far as Poincaré got, at least in terms of finding convergent series solutions. Ordinary perturbation theory can be persuaded to work as far as first order, mostly; but the perturbation series does not converge (indeed, cannot be computed) for most perturbations. Nor is the problem a conceptually simple one, as in the use of multiple scale methods, where secular non-uniformities must be removed by modification of the time scale. Here, the phenomenon of resonance riddles the whole perturbation method like a reticulum. There is no obvious remedy, and the method used to resolve the problem, due to Kolmogorov, is as brilliant as it is simple.

Kolmogorov's idea, published in 1954, is that, rather than allow $J$ and hence the frequencies $\omega$ to vary in the perturbation expansion, one should fix the (non-resonant) frequency vector $\omega^*$ (satisfying (5.98)), and seek to find an invariant torus with $\omega^*$ as its frequency. This removes the problem of having frequencies near the unperturbed one which are resonant. The second idea is that, rather than use a plain Taylor series in $\varepsilon$ to find convergent perturbation series, one seeks corrections which can be applied iteratively. The usual analogue is that of finding roots of an algebraic equation
$$h_0(x) + \varepsilon h_1(x) = 0.$$
The 'ordinary' perturbation method would consist of putting $x = x_0 + \varepsilon x_1 + \varepsilon^2 x_2 + \ldots$, where $h_0(x_0) = 0$, and computing corrections sequentially. At each stage of the computation, the approximate root improves by $O(\varepsilon)$. However, if Newton's method is used, then (with $h = h_0 + \varepsilon h_1$) one replaces an estimate $\tilde x$ by $\tilde x - h(\tilde x)/h'(\tilde x)$, leading to a sequence $\tilde x_0, \tilde x_1, \ldots$, which can easily be shown to be 'superconvergent', that is, $\tilde x_n - \tilde x_\infty = O(\varepsilon^{2^n})$. It is the corresponding exponential convergence in the KAM theorem which ensures the viability of the method.
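The superconvergence of Newton's method is easily demonstrated (a sketch with an arbitrary choice $h_0 = x^2 - 2$, $h_1 = x$, not from the text): starting from the root of $h_0$, each iterate roughly squares the error, so machine accuracy is reached in two or three steps.

```python
import math

eps = 1e-3
h  = lambda x: x * x - 2.0 + eps * x       # h0 + eps*h1 with h0 = x^2 - 2, h1 = x
dh = lambda x: 2.0 * x + eps

root = (-eps + math.sqrt(eps * eps + 8.0)) / 2.0   # exact root, for comparison

x = math.sqrt(2.0)                         # root of h0: an O(eps) first guess
errs = []
for _ in range(3):
    x = x - h(x) / dh(x)                   # Newton step
    errs.append(abs(x - root))
print(errs)                                # the error is roughly squared at each step
```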
The KAM theorem is named for Kolmogorov, whose idea it was, and Arnold and Moser, who independently (in the early 1960s) proved rigorous versions of the theorem for Hamiltonian differential equations (Arnold) and area-preserving maps (Moser). Later authors have refined these proofs, most notably in reducing the smoothness requirements on the system. The KAM theorem itself can be stated as follows:

Theorem (Kolmogorov, Arnold, Moser) Suppose
$$H = H_0(I, \varepsilon) + \varepsilon H_1(I, \theta, \varepsilon)$$
is a Hamiltonian, $I \in \mathbb{R}^n$, $\theta \in T^n$, $H_1$ is $2\pi$-periodic in each $\theta_i$, and $H$ is analytic in each argument. Providing $H_0$ is non-degenerate, that is, $\det H_0'' \neq 0$ (where $(H_0'')_{ij} = \dfrac{\partial^2 H_0}{\partial I_i\, \partial I_j}$), then almost every invariant torus of $H_0$ persists for $0 < \varepsilon \ll 1$.



More precisely, the measure of the complement of the union of the persistent invariant tori is small when $\varepsilon$ is small. We give an idea of the proof. The full details are beyond the ambition of the present text, while illustration by analogy with Newton's method is not satisfying enough. For the perturbed system, we define the generating function
$$S(\theta, J) = \theta\cdot J + \varepsilon i \sum_{m \neq 0} \frac{H_{1m}(J)}{m\cdot\omega^*}\, e^{i m\cdot\theta}, \qquad (5.107)$$
where now we fix $\omega^*$ satisfying the non-resonance inequality (5.98). The idea is that by fixing $\omega^*$, we may hope to find an iterative perturbation procedure which avoids the small divisor problem, and locates a particular invariant torus corresponding to a particular value of the action $J^*$. From (5.107), we find (with an obvious notation)

$$I = \frac{\partial S}{\partial \theta} = J - \varepsilon\mathcal{J}, \qquad \mathcal{J} \equiv \sum_{m \neq 0} \frac{m\, H_{1m}\, e^{i m\cdot\theta}}{m\cdot\omega^*}, \qquad \phi = \frac{\partial S}{\partial J} = \theta + \varepsilon i \sum_{m \neq 0} \frac{H_{1m}'\, e^{i m\cdot\theta}}{m\cdot\omega^*}; \qquad (5.108)$$
notice that $H_{1m}$ is a scalar (but $m$ is a vector), while $H_{1m}'$ is a vector (actually $\nabla_J H_{1m}$). Substituting for $I$ in $H = H_0(I) + \varepsilon H_1(I, \theta)$, and putting
$$\omega_0(J) = H_0'(J) = \omega^* + (\omega_0 - \omega^*), \qquad (5.109)$$
we derive
$$H = H_0(J) + \varepsilon \bar H_1(J) - \varepsilon\, \mathcal{J}\cdot(\omega_0 - \omega^*) + \varepsilon^2\, \mathcal{J}\cdot\left[\tfrac12 \nabla_J(\mathcal{J}\cdot\omega_0) - \bar H_1' - \nabla_J(\omega^*\cdot\mathcal{J})\right] + O(\varepsilon^3), \qquad (5.110)$$
noting that
$$\omega^*\cdot\mathcal{J} = \sum_{m \neq 0} H_{1m}\, e^{i m\cdot\theta}, \qquad \nabla_J(\omega^*\cdot\mathcal{J}) = \sum_{m \neq 0} H_{1m}'\, e^{i m\cdot\theta}. \qquad (5.111)$$



Suppose $I^*$ is the (unperturbed) action vector such that $H_0'(I^*) = \omega^*$. Then we expect $J = I^* + O(\varepsilon)$ in order that the frequency vector be unchanged. Specifically, let $J^*$ be the solution (near $I^*$) of
$$H_0'(J^*) + \varepsilon \bar H_1'(J^*) = \omega^*. \qquad (5.112)$$
For sufficiently small $\varepsilon$, the existence of $J^*$ close to $I^*$ is guaranteed by the implicit function theorem provided $H_0$ is non-degenerate, i.e. $\det H_0'' \neq 0$, and in fact we have
$$J^* = I^* - \varepsilon H_0''(I^*)^{-1} \bar H_1'(I^*) + O(\varepsilon^2). \qquad (5.113)$$
Putting $J = I^* + \varepsilon J^{(1)}$, thus $\omega_0 - \omega^* = \varepsilon H_0''(I^*) J^{(1)} + \ldots$, (5.110) can be written, after simplification, as
$$H = H_0(J) + \varepsilon \bar H_1(J) + \varepsilon^2\, \mathcal{J}\cdot\left[-H_0''(I^*)J^{(1)} - \bar H_1'(J) - \tfrac12 \nabla_J(\mathcal{J}\cdot\omega^*)\right] + O(\varepsilon^3). \qquad (5.114)$$


Since $\phi = \theta + O(\varepsilon)$ (from (5.108)), we can write (5.114) as
$$H = \hat H_0(J, \varepsilon) + \varepsilon^2 \hat H_1(J, \phi, \varepsilon), \qquad (5.115)$$
where
$$\hat H_0(J, \varepsilon) = H_0(J) + \varepsilon \bar H_1(J), \qquad \hat H_1(J, \phi, \varepsilon) = \mathcal{J}\cdot\left[-H_0''(I^*)J^{(1)} - \bar H_1'(J) - \tfrac12 \nabla_J(\mathcal{J}\cdot\omega^*)\right] + O(\varepsilon), \qquad (5.116)$$
where $J = I^* + \varepsilon J^{(1)}$. The transformation of (5.86) to (5.115) is the basic step in the construction of the superconvergent iteration method.

Now suppose as before that $H_1$ is analytic in $|\mathrm{Im}\,\theta| \leq \rho$, and $|H_1|, |H_1'| < M$, so that
$$|H_{1m}| \leq M e^{-\rho|m|}, \qquad |H_{1m}'| \leq M e^{-\rho|m|}. \qquad (5.117)$$
Let us additionally assume $J - J^* = O(\varepsilon^2)$, where $J^*$ satisfies (5.112). That is (since $J = I^* + \varepsilon J^{(1)}$), $J^* = I^* + \varepsilon J^{(1)} + O(\varepsilon^2)$, and expanding (5.112) and noting that $H_0'(I^*) = \omega^*$,
$$H_0''(I^*) J^{(1)} + \bar H_1'(I^*) = O(\varepsilon). \qquad (5.118)$$
Hence, from (5.116),
$$\hat H_1 = -\tfrac12\, \mathcal{J}\cdot\nabla_J(\mathcal{J}\cdot\omega^*) + O(\varepsilon) = -\tfrac12 \sum_{m \neq 0}\sum_{s \neq 0} \frac{m\, H_{1m}(J)\, H_{1s}'(J)}{m\cdot\omega^*}\, e^{i(m+s)\cdot\phi} + O(\varepsilon) = \Big(\sum_m a_m e^{i m\cdot\phi}\Big)\Big(\sum_s b_s e^{i s\cdot\phi}\Big) + O(\varepsilon), \qquad (5.119)$$






where we define
$$a_m = -\frac{m\, H_{1m}(J)}{2\, m\cdot\omega^*}, \qquad b_s = H_{1s}'(J). \qquad (5.120)$$


In order for our iterative procedure to continue, we need to be able to extend $\hat H_1$ to an analytic function in a strip containing the real $\phi$ axis. This will be the case if there is a positive quantity $\rho'$ such that $|a_m| < O(e^{-|m|\rho'})$, $|b_s| < O(e^{-|s|\rho'})$. Using the inequalities on $H_{1m}$, $H_{1m}'$ in (5.117) and for $m\cdot\omega^*$ in (5.98), we have (for $\nu > n - 1$ in (5.98))
$$|a_m| < \frac{|m|^n M e^{-|m|\rho}}{2K}, \qquad |b_s| < M e^{-|s|\rho}. \qquad (5.121)$$
It follows that $\hat H_1$ is analytic in a strip $|\mathrm{Im}\,\phi| < \rho'$ for any $\rho' < \rho$. Additionally, we will require below a more precise bound on $|\hat H_1|$. From (5.119), we have, using (5.117) and (5.98),
$$|\hat H_1| < \frac{M^2}{2K}\sum_{m \neq 0}\sum_{s \neq 0} |m|^n\, e^{-(|m|+|s|)\rho}, \qquad (5.122)$$
and since the number of terms $N_r$ with $|m| = r$ is less than $L_n r^{n-1}/\sqrt{n}$ (see (5.100)), it follows that
$$|\hat H_1| < \frac{M^2 L_n^2}{2Kn}\sum_{r \geq 1} r^{2n-1} e^{-\rho r}\sum_{k \geq 1} k^{n-1} e^{-\rho k}. \qquad (5.123)$$
Note that, for any $x > 0$ and $\alpha \geq 0$, $x^\alpha \leq P_\alpha e^x$, where $P_\alpha = \exp[\alpha(\ln\alpha - 1)]$ (and $P_0 = 1$). Putting $x = \delta r$, it follows that
$$\sum_{r \geq 1} r^\alpha e^{-\rho r} \leq \frac{P_\alpha}{\delta^\alpha}\sum_{r \geq 1} e^{-(\rho - \delta)r},$$
which converges for $\delta < \rho$; in the iterative scheme the successive strip widths remain greater than $\rho/2$ for all $r$.⁹ Now suppose $\varepsilon_r < \delta_r^T$ for some value of $T$ to be chosen. Then (using $\delta_{r+1} = \delta_r^{3/2}$)
$$\varepsilon_{r+1} = \frac{C\varepsilon_r^2}{\delta_{r+1}^\beta},$$
so that $\varepsilon_{r+1} < \delta_{r+1}^T$ providing $T > 3\beta$ and $\delta_r$ is small enough, since then $\varepsilon_{r+1}/\delta_{r+1}^T < C\,\delta_r^{(T - 3\beta)/2}$.


Fig. 5.4 Standard map section: iterates of (5.142), for K = 0.85

Alternatively one can consider artificial systems from the start, where action-angle variables are used, and one considers maps rather than flows. An example is the standard map
$$\theta_{n+1} = \theta_n + I_{n+1}, \qquad I_{n+1} = I_n + K\sin\theta_n, \qquad (5.142)$$
in which $K$ is a parameter. Here one often portrays orbits as in Fig. 5.4, where $I$ is the action variable, and $\theta$ is the angle. Invariant tori then correspond to unbroken sub-horizontal curves from $\theta = 0$ to $\theta = 2\pi$ (or $-\pi$ to $\pi$, as in the diagram). The area-preserving nature of a map of the general form
$$\theta' = \theta + F(\theta, I), \qquad I' = I + G(\theta, I)$$



is assured by requiring the Jacobian $\dfrac{\partial(\theta', I')}{\partial(\theta, I)} = 1$, which implies
$$T = -D, \qquad T = \mathrm{tr}\, M, \quad D = \det M, \quad M = \begin{pmatrix} F_\theta & F_I \\ G_\theta & G_I \end{pmatrix}.$$



Notice that (5.142) is written as a map of the form
$$\theta' = \theta + F(\theta, I'), \qquad I' = I + G(\theta, I'),$$
(with $F = I'$, $G = K\sin\theta$). It is easy to show (see question 5.7) that such a map is area preserving if
$$F_\theta + G_{I'} = 0. \qquad (5.146)$$
This obviously makes it easier to construct area-preserving maps, and indeed the approximate derivation of (5.140) from (5.139) could not asymptotically distinguish between $I$ and $I'$ in the arguments of $F$ and $G$.
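Both the iteration of (5.142) and its area preservation are easily checked in code (a sketch; the sample point and finite-difference step are arbitrary choices of mine). Writing the map as $\theta' = \theta + I + K\sin\theta$, $I' = I + K\sin\theta$, its Jacobian matrix is $\begin{pmatrix}1 + K\cos\theta & 1\\ K\cos\theta & 1\end{pmatrix}$, with determinant identically 1, which the finite-difference computation confirms.

```python
import numpy as np

def standard_map(theta, I, K):
    I_new = I + K * np.sin(theta)
    theta_new = theta + I_new      # theta left unwrapped, for differentiation
    return theta_new, I_new

K = 0.85

# iterate one orbit, as in a phase portrait like Fig. 5.4
th, I = 0.3, 2.1
for _ in range(1000):
    th, I = standard_map(th, I, K)

# finite-difference Jacobian at an arbitrary point: determinant should be 1
d = 1e-6
th0, I0 = 1.234, 0.567
f = lambda a, b: np.array(standard_map(a, b, K))
J = np.column_stack([(f(th0 + d, I0) - f(th0 - d, I0)) / (2 * d),
                     (f(th0, I0 + d) - f(th0, I0 - d)) / (2 * d)])
detJ = np.linalg.det(J)
print(detJ)
```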

5.4.2 Poincaré–Birkhoff Fixed Point Theorem

Now let us return to the map (5.140). The frequency ratio is given by $\omega = \omega_1/\omega_2$, and resonance for the unperturbed Hamiltonian occurs if $m_1\omega_1 + m_2\omega_2 = 0$, i.e. $\omega = -m_2/m_1$ for some $m_i \in \mathbb{Z}$. That is, resonance corresponds to rational values of $\omega$. When $\varepsilon = 0$, then $\theta \to \theta + 2\pi\omega$, $I = \text{constant}$, and the invariant tori are straight lines in the (cartesian) $(\theta, I)$ space, or circles in the polar $(\sqrt{2I}, \theta)$ space. Let $I_0$ be a value of $I$ for which $\omega = r/s \in \mathbb{Q}$, and suppose $\omega'(I_0) \neq 0$; without loss of generality we suppose $\omega'(I_0) > 0$. Let us denote the map (5.140) by $T_\varepsilon$, thus
$$T_\varepsilon\begin{pmatrix}\theta\\ I\end{pmatrix} = \begin{pmatrix}\theta + 2\pi\omega(I) + \varepsilon f(I, \theta)\\ I + \varepsilon g(I, \theta)\end{pmatrix}. \qquad (5.147)$$


Note that the $s$-th iterate $T_0^s$ of $T_0$ carries $(I_0, \theta)$ to itself for any $\theta$. The Poincaré–Birkhoff theorem is now as follows: for sufficiently small $\varepsilon$, there are $2ks$ (for some integer $k$) orbits of $T_\varepsilon$ of period $s$ near $I = I_0$; these are alternately saddles and sinks.

To prove this, consider curves $C_\pm : I = I_\pm$, with $I_- < I_0 < I_+$ and both close to $I_0$. Since $\omega'(I_0) > 0$, it follows that $T_0^s$ maps points on $C_-$ clockwise¹¹ and points on $C_+$ anticlockwise (see Fig. 5.5). This property is preserved for sufficiently small $\varepsilon$.

¹¹ We use the polar coordinate representation $(\sqrt{2I}, \theta)$.



Fig. 5.5 The Poincaré–Birkhoff fixed point theorem. The arrows indicate the direction of motion of points under the $s$th iterate map $T_\varepsilon^s$. Points on $C_\varepsilon$ move radially; points outside $C_\varepsilon$ move anticlockwise and points inside move clockwise

It follows that for each θ , there is some intermediate value of I (between I+ and I− ) which is mapped by Tεs radially. Denote the curve of these points as Cε . Since Tε is area-preserving, the area of Tεs Cε is the same as that of Cε , and the two curves must have at least two (and hence 2s and in general 2ks) intersections. These intersections give the period-s orbits of Tε . As can be seen in Fig. 5.5, they are alternately saddles or sinks, otherwise known as hyperbolic and elliptic fixed points of Tεs , respectively.

5.4.3 Removal of Resonances The situation here is very similar to the situation occurring in the secondary Hopf bifurcation, where invariant tori are dismantled within Arnold tongues. Here, the resonant tori are replaced by a sequence of periodic orbits, and everything appears unremarkable. However, we need to complete the phase portraits initiated in Fig. 5.5, and to do this we can again use perturbation theory—despite its apparent failure for resonance! Returning to the Hamiltonian H = H0 (I ) + ε

H1m (I )eim.θ ,



where, for two degrees of freedom, m = (m 1 , m 2 ), θ = (θ1 , θ2 ), I = (I1 , I2 ), let us suppose that ω = ω1 /ω2 = r/s at some value (I1∗ , I2∗ ) of the action variables. Perturbation theory fails, but we can proceed by shifting our attention to one of the Poincaré-Birkhoff fixed points. Specifically, we consider a slow angle variable 1 = sθ1 − r θ2 . While θ2 changes by 2π on the resonant torus, 1 changes by a small amount, and this enables us to obtain an approximate description by the method of averaging (which is really equivalent to deriving the approximate map Tε in (5.147)). We make a canonical transformation via the generating function


5 Hamiltonian Systems

S(θ, J ) = (sθ1 − r θ2 )J1 + θ2 J2 ,


whence the Hamiltonian is H = H0 (J ) + ε

 H1m (J ) exp


 i{m 1 1 + (m 1r + m 2 s)2 } , s


and the variables are related by I1 = s J1 ,

I2 = J2 − r J1 ,

1 = sθ1 − r θ2 , 2 = θ2 ,


so that H0 (J ) = H0 (s J1 , J2 − r J1 ), and in particular ∂H0 = sω1 − r ω2 , ∂ J1 The equations are thus

J˙1 = O(ε),

∂H0 = ω2 . ∂ J2

J˙2 = O(ε),



˙ 1 = O(ε), and for J1 ≈ J1∗ = I1∗ /s, J2 ≈ J2∗ = I2∗ + r J1∗ , then ∂H0 /∂ J1 ≈ 0, thus  ˙ 2 = ω2 + O(ε). For values of J near the resonant torus, 1 is a slow variable, and the method of averaging derives approximate equations for J and 1 by averaging the system over several periods (s, in fact) of the fast angle variable 2 . This gives us the averaged Hamiltonian, by summing over m 1 and m 2 with m 1r + m 2 s = 0, H¯ (J1 , J2 , 1 ) = H0 (J ) + ε


H1 p (J )ei p1 ,



where we write H1 p = H1,( ps,− pr ) . Since H¯ is independent of 2 , J2 is constant to all orders of ε. It is called an adiabatic invariant, and in fact changes by exponentially small (in ε) amounts. We can see that the effect of averaging is to reduce the number of degrees of freedom by one. In effect, the slow variation of J1 and 1 is equivalent to the slow variation of I and θ under the s-th iterate of Tε in (5.147). An approximate description of the motion is obtained by retaining the first oscillatory term in (5.154) (note that (5.105) suggests |H1 p | ≤ M exp[− p(s + r )ρ]): H¯ = H0 (J1 ) + εH10 (J1 ) + 2εH11 (J1 ) cos 1 ,


where we take H¯ real (and H11 also, if necessary by adding a constant to 1 ). We suppress the dependence of H¯ on the adiabatically constant J2 . From the Poincaré-Birkhoff theorem, we know H¯ must have fixed points, given by ∂ H¯ /∂ J1 = ∂ H¯ /∂1 = 0. If we choose J2 = J2∗ , then ∂H0 /∂ J1 = 0, and the fixed points are

5.4 Resonance and Stochasticity


Fig. 5.6 Phase plane for the simple pendulum, with constant energy levels h = E in (5.157). The closed orbits represent periodic motion (libration), while the outer unbounded ones represent rotation

at 1 = 0, π (−π , etc.) and H01 = ∓2 H11 , with the minus sign corresponding to 1 = 0 and the positive sign to 1 = π . Let us denote the value of J1 at either of the fixed points by J10 . To construct the phase portrait near J10 , we put

J1 = J10 + ε1/2 K , 1 = φ,

H¯ = H0 (J10 ) + εH10 (J10 ) + εh;


then the perturbed Hamiltonian near resonance is approximated by h(K , φ) ≈ 21 G K 2 − F cos φ, where

G = H0

(J10 ),

F = −2H11 (J10 ).



This is simply the Hamiltonian for the simple pendulum, and its phase portrait is shown in Fig. 5.6. It is a remarkable result that the motion near any resonance is essentially that of a pendulum. The higher order Fourier terms in (5.154) do not significantly alter this fact. The width of the ‘islands’12 near resonant tori is given by 4(εF/G)1/2 , and ˙ 1 = ∂ H¯ /∂ J1 ≈ ε1/2 ∂h/∂ K .13 The the frequency is similarly O(ε1/2 ). In fact, φ˙ =  period of these secondary oscillations (libration) thus ranges from 2π/(εG F)1/2 near the resonant point K = φ = 0 to infinity as the trajectories approach the separatrix. Note also that for analytic H1 with |H11 | ≤ M exp[−(s + r )ρ], then |F| ≤ 2M exp[−(s + r )ρ], and |F| decreases rapidly for higher order resonances, s  1. The islands are correspondingly smaller in both J1 and θ1 directions, but rapidly become thinner.

is the width in the J1 direction; the width in the 1 direction follows from the variation of 1 = φ = sθ1 on the Poincaré section θ2 = constant, and this gives an angular width of 2π θ1 = ; thus the angular width decreases at the high order resonances (s large). s 13 So properly we might also scale t = τ/√ε in (5.156). 12 This


5 Hamiltonian Systems

Fig. 5.7 Secondary resonance. Outside the orbits of the large resonant orbit centred around (θ, I ) = (π, 0) or (π, 2π ), one can see the break up of the ninth of the secondary tori into secondary resonant fixed points with their own local orbits. More obviously, one sees secondary resonance about the period 2 orbit at (0, π ), (π, π ). The figure is computed from the standard map (5.142), with K = 0.757

5.4.4 Secondary Resonance The existence of secondary periodic orbits implies the existence of secondary resonances. For small ε, these are confined to higher resonances (ω = r/s with s  1) but as ε increases, lower resonances occur and the corresponding island widths become larger. An example is shown in Fig. 5.7. There is evidently no limit to this process, and the secondary islands will themselves have subsidiary resonances. These higher resonances can also be studied approximately using perturbation theory. We see that the phase portraits are exceedingly complicated, but so far we see no evidence of any dramatic reconstruction of the trajectories. To be sure, there is a topological discontinuity under perturbation of the resonant tori, but the motion is apparently still constrained to lie on or near period s orbits near the unperturbed torus.

Homoclinic Connections

The method of averaging as applied to the renormalised Hamiltonian (5.150) breaks down near the separatrices between libration (periodic motion) and rotation (φ increases indefinitely). The reason for this is that the (averaged) trajectories spend an exponentially long time near the fixed points φ = 0 and φ = π , and exponentially small initial differences in the action can lead to algebraic differences after passage near the saddles at φ = ± π . In particular, the separatrix from φ = π may not join to φ = −π , despite what perturbation theory implies. In order to examine this, we reconsider the Hamiltonian (5.150). Retaining the first oscillatory averaged perturbation term (m 1 = ± s, m 1r + m 2 s = 0) and a

5.4 Resonance and Stochasticity


single oscillatory term in 2 (with m 1 = ± 1, m 1r + m 2 s = ∓ 1), we have (writing 1 = φ) H = H0 (J ) + εH10 (J ) + 2εH11 (J ) cos φ   (φ − 2 ) + ..., + ε cos s


where  is a real amplitude constant, and we have taken the phase of 2 to be zero, without loss of generality; putting 2 ≈ ω2 t, J = J10 + ε1/2 K , we get, to leading order, φ˙ = ε1/2 G K ,    φ − ω2 t . K˙ = ε1/2 −F sin φ +  sin s


It is in averaging (5.160) over (0, 2π s/ω2 ) that the  term drops out and we get (5.157). Suppose we see whether the unstable manifold from φ = −π joins the stable manifold of φ = π . On the unperturbed separatrix (with  = 0), the Hamiltonian is just (5.161) h = 21 G K 2 − F cos φ = F, and the solution is φ = φ0 (t − t0 ) = 4 tan−1 [exp{εG F}1/2 (t − t0 )] − π,


for arbitrary t0 . Now from (5.160) and (5.161), we have     ˙h = ε1/2 G K sin φ − ω2 t = φ˙ sin φ − ω2 t . s s


Now suppose a point P is on the stable manifold of φ = π , where h = F and take t = 0 there. Then (5.163) implies  F − h(P) ≈

φ˙ 0 (t − t0 ) sin


φ0 − ω2 t s



If P is also on the unstable manifold of φ = −π , (where h = F), then  h(P) − F ≈



φ˙ 0 (t − t0 ) sin

φ0 − ω2 t s



It follows that the stable and unstable manifolds intersect only if the Melnikov function


5 Hamiltonian Systems

 M(t0 ) =


φ˙ 0 (t) sin

φ0 (t) − ω2 (t + t0 ) s



has a zero. If it has one zero, it necessarily has an infinite number. If these are discrete, then the stable and unstable manifolds intersect transversely, with far-reaching consequences as we see below. If M(t0 ) ≡ 0 then the separatrix exists, and if M(t0 ) = 0 for all t0 , the separatrix is broken—but this can only happen in a non-Hamiltonian system. Since φ0 is odd (thus also (φ0 − ω2 t)/s) and φ˙ 0 is even, we find  M(t0 ) = −


φ˙ 0 cos

    ω2 t0 (φ0 − ω2 t) dt sin , s s


and discrete homoclinic intersections occur providing the integral is non-zero. This is indeed the case, and evidently one can expect such intersections in general.

5.4.5 Melnikov’s Method The idea above can be applied generally to Hamiltonian systems, not necessarily in action-angle form. If such a perturbed system is ∂ H1 ∂ H0 +ε , ∂p ∂p ∂ H0 ∂ H1 p˙ = − −ε , ∂q ∂q

q˙ =


where H0 = H0 (q, p), H1 = H1 (q, p, t), let us suppose that the unperturbed system has a heteroclinic trajectory  = {q 0 (t − t0 ), p 0 (t − t0 )} (t0 is arbitrary) connecting two fixed points (in q, p) space) M and N . Thus (q 0 , p 0 ) → M as t → −∞, and → N as t → +∞. Let the value of H0 at M (and N ) be H ∗ . Under the perturbation, the stable manifold WsN of N and the unstable manifold M Wu of M will be perturbed, and the quantity H0 will no longer be conserved. Approximately, we have ∂ H0 ∂ H0 q˙i + p˙ i H˙ 0 = ∂qi ∂ pi   ∂ H1 ∂ H1 ≈ ε − p˙ i0 (t − t0 ) − q˙i0 (t − t0 ) ∂ pi ∂qi = ε[H0 , H1 ]{qi0 (t − t0 ), pi0 (t − t0 ), t},


where [H0 , H1 ] is the Poisson bracket evaluated at the values in the curly brackets. Now let us assume that H1 is 2π -periodic in t. Then it is appropriate to study the Poincaré map obtained by integrating the system over an interval 2π . M and N are

5.4 Resonance and Stochasticity


t = t0

Fig. 5.8 The perturbed stable and unstable manifolds

xu WuM





perturbed to fixed points on the Poincaré surface (corresponding to periodic orbits of amplitude ε for the flow). Suppose without loss of generality that H0 at each of these perturbed locations is equal. Consider a point x u = (q u , p u ) on the unstable manifold WuM of M, as shown in Fig. 5.8. We select a corresponding point x 0 = (q 0 , p 0 ) on the homoclinic trajectory  by assuming x u − x 0 ⊥ , and suppose t = t0 at this point;14 equally there is a point x s = (q s , p s ) ∈ WsN , the stable manifold of N , with x s − x 0 ⊥ . Without loss of generality, we can take t = t0 at all three points. We are interested in the distance between WuM and WsN which is xu − x s . Writ ∂ H0 ∂ H0 ing x u − x s = δx, we note that δx n, where the normal n = ∇ H0 = , , ∂q ∂ p since H0 is constant on . This suggests that we define a distance function M(t0 ) = n.δx = δq.

∂ H0 ∂ H0 + δp. = δ H0 = H0u − H0s . ∂q ∂p


Now integrating (5.169) forward from x s , we have H ∗ − H0s = ε and equivalently H0u − H ∗ = ε

[H0 , H1 ] dt,


[H0 , H1 ] dt,





Adding the two, we see that the Melnikov distance function is (to leading order in ε)  M(t0 ) = ε

∞ −∞

[H0 , H1 ]dt,


with arguments as in (5.157), and we have the result due to Poincaré: if M(t0 ) = 0 ∀ t0 , then WuM and WsN do not intersect, and the unperturbed separatrix is split. If M(t0 ) ≡ 0 then the separatrix is not split, while if M(t0 ) has an isolated zero, 14 Thus

t0 is defined as a monotonically increasing function of x 0 ∈ .


5 Hamiltonian Systems A










Fig. 5.9 Transverse intersection of stable and unstable manifolds

then it has an infinite number, and it is in this situation that complicated, ‘stochastic’ behaviour occurs.
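As a concrete illustration of the Melnikov calculation (an example added here, not one from the text: the pendulum Hamiltonian $H_0 = \frac{1}{2}p^2 - \cos q$ with perturbation $H_1 = q\cos t$ is chosen for convenience), the sketch below evaluates $M(t_0)/\varepsilon = \int_{-\infty}^{\infty}[H_0, H_1]\,dt$ numerically along the unperturbed separatrix $p^0(t) = 2\,\mathrm{sech}\,t$, and compares with the closed form $-2\pi\,\mathrm{sech}(\pi/2)\cos t_0$.

```python
import math

# Illustrative example (not from the text): pendulum H0 = p^2/2 - cos q,
# perturbation H1 = q cos t. Separatrix: q0(t) = 4 arctan(e^t) - pi,
# p0(t) = 2 sech t; the Poisson bracket is [H0, H1] = -p cos t.

def melnikov(t0, T=30.0, n=60000):
    """M(t0)/eps = integral of [H0, H1] along the separatrix (trapezium rule)."""
    h = 2 * T / n
    s = 0.0
    for i in range(n + 1):
        t = -T + i * h
        p0 = 2.0 / math.cosh(t)              # p on the unperturbed separatrix
        f = -p0 * math.cos(t + t0)           # [H0, H1] evaluated along it
        s += (0.5 if i in (0, n) else 1.0) * f
    return s * h

# closed form: -2*pi*sech(pi/2)*cos(t0), using the Fourier transform of sech
amp = 2 * math.pi / math.cosh(math.pi / 2)
print(melnikov(0.0), -amp)
```

Since $M(t_0) \propto \cos t_0$ has simple, isolated zeros, the separatrix splits with transverse intersections: precisely the situation producing the tangles discussed next.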

5.4.6 Heteroclinic Tangles

We can easily illustrate the effect of transverse intersections of $W_u^M$ and $W_s^N$ by considering two-dimensional area-preserving maps, but the same considerations apply in higher dimensions. If we consider the situation shown in Fig. 5.9, where $W_u^M$ and $W_s^N$ intersect at $P$, then, denoting the Poincaré map by $\Phi$, we have $\Phi^m(P) \in W_u^M \cap W_s^N$ for all $m \in \mathbb{Z}$. In particular the area $A$ bounded by $P$, $P' = \Phi(P)$, $W_u$ and $W_s$ is a closed region which is mapped successively to $A'$, $A''$, etc. The areas of these regions are all equal, but since $P$ is mapped along $W_s$ towards $N$, the segments $PP'$, $P'P''$, … decrease (exponentially) in length, while the transverse widths of the segments increase exponentially. In fact, the unstable manifold must wind round in an incredibly contorted manner. Similarly, the stable manifold of $N$ becomes wildly distorted near $M$. Thus the saddle points exert a disordering influence on the dynamics, and in particular, it is obvious that in these regions near the unperturbed separatrix, the map exhibits sensitive dependence on initial conditions. One can show that certain iterates of the map have Smale horseshoes, so that one can indeed prove the existence of chaos and a strange invariant set. Since splitting of separatrices is the norm, one can expect these thin stochastic regions to appear near the separatrices of any resonance in the unperturbed Hamiltonian. The stochastic regions are, however, exceedingly thin, since it is generally found that $M(t_0) \sim \exp[-O(1/\varepsilon^{1/2})]$, so that for small $\varepsilon$ they are hardly noticeable. Nevertheless, stochastic trajectories exist for arbitrarily small $\varepsilon$. We are led to the famous, undrawable picture of the effect of perturbation on an integrable system. Figure 5.10 attempts to show this, while actual examples are given in the following Sect. 5.5.



Fig. 5.10 The heteroclinic tangles which occur when the separatrices are broken

5.4.7 Arnold Diffusion

For a two degree of freedom Hamiltonian, the three-dimensional phase space (on a constant energy surface) is divided by the two-dimensional KAM tori into distinct, disconnected regions. This disconnection is reflected in the area-preserving Poincaré map, divided up by KAM tori into separate stochastic layers. As the perturbation parameter $\varepsilon$ increases, more and more of the KAM tori are obliterated, and separate stochastic layers merge. One can then identify a transition from local to global stochasticity associated with the disappearance of the 'last' KAM torus, and certain approximate methods have been developed to determine when this occurs. The distinction between local and global stochasticity is rather irrelevant in higher dimensional systems, for the following reason. For $N$ degrees of freedom, the energy surface is $(2N - 1)$-dimensional, while the KAM tori are $N$-dimensional. The reason that a two-dimensional surface splits three-dimensional space into distinct regions is that the intersection of a curve (one-dimensional) and a two-dimensional surface in $\mathbb{R}^3$ is generically a point. More generally, the intersection of a curve and an


Fig. 5.11 The geometry of the three-body problem: Earth at $(r, \theta)$, Jupiter at $(\rho, \phi)$, Sun at $(R, \Theta)$

$N$-torus in $\mathbb{R}^{2N-1}$ has co-dimension¹⁵ $[(2N-1) - N] + [(2N-1) - 1]$, i.e. has dimension $N + 1 - (2N - 1) = 2 - N$, and thus for $N > 2$, it is exceptional for curves to intersect $N$-tori in $\mathbb{R}^{2N-1}$. Therefore, we can expect the thin stochastic layers surrounding each separatrix on each resonant torus to form a web—the Arnold web—which permeates the entire phase space, and on which trajectories are not limited by the existence of KAM $N$-tori. Trajectories on this web wander between resonant saddles, where they are subject to effectively random impulses: a cloud of points in the phase space will be spread out as it passes near a saddle, and this process has been called Arnold diffusion. Approximate methods to calculate appropriate diffusion rates have been developed, with some success.

5.5 Examples

5.5.1 A Restricted Three-Body Problem

In a subject which derives its inspiration from the studies of celestial mechanics by Poincaré and others before him in the nineteenth century, it would be remiss not to discuss at least one version of the problem of the dynamics of the solar system. We will confine ourselves to one simple version of this. The solar system consists of nine planets (or eight, if we discount the discredited Pluto) which revolve in elliptical orbits around the Sun. Of these, the giant planets Jupiter, Saturn and to a lesser degree Uranus and Neptune are much more massive than the inner planets Earth, Mars, Venus and Mercury, and they are also much further from the Sun. Most of the orbits are fairly circular, except for Mercury, whose eccentricity is around 0.2 (and Pluto is even more eccentric, but is tiny, smaller than the Moon).¹⁶ In what follows, we will consider the motion of two planets orbiting the Sun, one of which is Jupiter, and the other could be any of the others, although we will take

¹⁵ The co-dimension of a $k$-dimensional manifold in $\mathbb{R}^n$ is $n - k$, and the co-dimension of the intersection of two manifolds $U$ and $V$ is the sum of their co-dimensions. See also footnotes 13 in Chap. 2, 12 in Chap. 4, and a couple of paragraphs before (6.29) in Chap. 6.

¹⁶ Having said that, slightly eccentric ellipses don't actually look elliptical until the eccentricity is about 0.5; for an eccentricity of 0.2, the orbit is basically circular. This is because the ratio of the principal axis lengths is $\sqrt{1 - e^2} \approx 1 - \frac{1}{2}e^2$.



Earth as a specific example. We assume the orbits are planar, and we take the centre of mass of the three bodies as the origin (which is therefore fixed). We denote polar coordinates for the Earth as $(r, \theta)$, those for Jupiter as $(\rho, \phi)$, and $(R, \Theta)$ for the Sun. All three bodies are taken to be spherically symmetric, and thus act as point masses. Under the action of gravity, the potential is
$$V = -\frac{G M_J m}{x} - \frac{G M_S M_J}{y} - \frac{G M_S m}{z},$$


where $G$ is the gravitational constant, and $M_S$, $M_J$ and $m$ are the masses of the Sun, Jupiter and Earth, and
$$x = [r^2 + \rho^2 - 2\rho r\cos(\theta - \phi)]^{1/2}, \quad y = [\rho^2 + R^2 - 2\rho R\cos(\Theta - \phi)]^{1/2}, \quad z = [r^2 + R^2 - 2r R\cos(\theta - \Theta)]^{1/2},$$
are the distances from Earth to Jupiter, Jupiter to Sun and Earth to Sun (see Fig. 5.11). The Hamiltonian is
$$H = \tfrac{1}{2}m(\dot r^2 + r^2\dot\theta^2) + \tfrac{1}{2}M_J(\dot\rho^2 + \rho^2\dot\phi^2) + \tfrac{1}{2}M_S(\dot R^2 + R^2\dot\Theta^2) + V,$$


the generalised momenta are
$$p_r = m\dot r, \quad p_\theta = m r^2\dot\theta, \quad p_\rho = M_J\dot\rho, \quad p_\phi = M_J\rho^2\dot\phi, \quad p_R = M_S\dot R, \quad p_\Theta = M_S R^2\dot\Theta,$$
and thus
$$m(\ddot r - r\dot\theta^2) = -\frac{\partial V}{\partial r}, \qquad m\,\frac{d(r^2\dot\theta)}{dt} = -\frac{\partial V}{\partial \theta},$$
$$M_J(\ddot\rho - \rho\dot\phi^2) = -\frac{\partial V}{\partial \rho}, \qquad M_J\,\frac{d(\rho^2\dot\phi)}{dt} = -\frac{\partial V}{\partial \phi},$$
$$M_S(\ddot R - R\dot\Theta^2) = -\frac{\partial V}{\partial R}, \qquad M_S\,\frac{d(R^2\dot\Theta)}{dt} = -\frac{\partial V}{\partial \Theta}.$$

The centre of mass condition gives




$$m r e^{i\theta} + M_J \rho e^{i\phi} + M_S R e^{i\Theta} = 0.$$


We let $d$ be the mean distance of Jupiter from the Sun, and then we scale the variables as
$$r, \rho \sim d, \quad R \sim \varepsilon d, \quad t \sim t_J \equiv \left(\frac{d^3}{G M_S}\right)^{1/2}, \quad V \sim \frac{G M_S M_J}{d},$$


which yields the non-dimensional equations
$$\ddot r - r\dot\theta^2 = \left(\frac{1}{z}\right)_r + \varepsilon\left(\frac{1}{x}\right)_r, \qquad \frac{d(r^2\dot\theta)}{dt} = \left(\frac{1}{z}\right)_\theta + \varepsilon\left(\frac{1}{x}\right)_\theta,$$
$$\ddot\rho - \rho\dot\phi^2 = \left(\frac{1}{y}\right)_\rho + \varepsilon\delta\left(\frac{1}{x}\right)_\rho, \qquad \frac{d(\rho^2\dot\phi)}{dt} = \left(\frac{1}{y}\right)_\phi + \varepsilon\delta\left(\frac{1}{x}\right)_\phi,$$
$$\varepsilon\left(\ddot R - R\dot\Theta^2\right) = \left(\frac{1}{y}\right)_R + \delta\left(\frac{1}{z}\right)_R, \qquad \varepsilon\,\frac{d(R^2\dot\Theta)}{dt} = \left(\frac{1}{y}\right)_\Theta + \delta\left(\frac{1}{z}\right)_\Theta,$$
where the subscripts denote partial derivatives,
$$\varepsilon = \frac{M_J}{M_S}, \qquad \delta = \frac{m}{M_J},$$
and
$$x = [r^2 + \rho^2 - 2\rho r\cos(\theta - \phi)]^{1/2}, \quad y = [\rho^2 + \varepsilon^2 R^2 - 2\varepsilon\rho R\cos(\Theta - \phi)]^{1/2}, \quad z = [r^2 + \varepsilon^2 R^2 - 2\varepsilon r R\cos(\theta - \Theta)]^{1/2}.$$


The centre of mass condition is
$$\delta r e^{i\theta} + \rho e^{i\phi} + R e^{i\Theta} = 0.$$


The masses of the Sun, Jupiter and the Earth are
$$M_S \approx 2 \times 10^{30}\ \mathrm{kg}, \quad M_J \approx 1.9 \times 10^{27}\ \mathrm{kg}, \quad m \approx 6 \times 10^{24}\ \mathrm{kg},$$



and thus
$$\varepsilon \approx 10^{-3}, \qquad \delta \approx 3 \times 10^{-3},$$
so at least for Earth we can take $\delta \sim \varepsilon$. Additionally,
$$d \approx 7.79 \times 10^{8}\ \mathrm{km}, \qquad G \approx 6.673 \times 10^{-11}\ \mathrm{m^3\,kg^{-1}\,s^{-2}},$$
whence it follows that the time scale is $t_J \approx 1.88$ y.
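These scales (and the dimensionless Earth values quoted just below) are easy to verify; a quick check, using only the values quoted in the text:

```python
import math

# check the quoted scales from the quoted masses and distances
G   = 6.673e-11          # m^3 kg^-1 s^-2
M_S = 2.0e30             # Sun, kg
M_J = 1.9e27             # Jupiter, kg
m   = 6.0e24             # Earth, kg
d   = 7.79e11            # mean Sun-Jupiter distance, m

eps   = M_J / M_S                        # ~ 1e-3
delta = m / M_J                          # ~ 3e-3
t_J   = math.sqrt(d**3 / (G * M_S))      # time scale, seconds
year  = 3.156e7                          # seconds per year

r_E      = 1.5e11 / d                    # dimensionless Earth-Sun distance ~ 0.19
thetadot = 2 * math.pi * t_J / year      # dimensionless angular velocity ~ 11.86

print(eps, delta, t_J / year, r_E, thetadot)
```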


Note also for future reference that the mean distance of the Earth from the Sun, $1.5 \times 10^8$ km, and its mean angular velocity, $2\pi\ \mathrm{y}^{-1}$, correspond to dimensionless values
$$z \approx r \approx 0.19, \qquad \dot\theta \approx 11.86. \tag{5.189}$$
So long as $\delta \ll 1$, we can approximate (5.181) by neglecting the terms in $\varepsilon\delta$ and $\varepsilon^2$, which suggests that Jupiter's orbit is unaffected by the smaller planet. For the other planets, this still remains reasonably valid, except for Saturn, whose mass is about a third of Jupiter's. We then have, approximately,
$$x = [r^2 + \rho^2 - 2\rho r\cos(\theta - \phi)]^{1/2}, \quad y = \rho - \varepsilon R\cos(\Theta - \phi), \quad z = r - \varepsilon R\cos(\theta - \Theta).$$


Now since $R$ only appears at $O(\varepsilon)$ in $y$ and $z$, it is sufficient to approximate the centre of mass condition (5.184) as
$$\rho e^{i\phi} + R e^{i\Theta} = 0.$$


Equally it is sufficient in (5.181) to calculate $R$ to leading order, thus the Sun's orbit satisfies, using (5.190),
$$\ddot R - R\dot\Theta^2 = \frac{\cos(\Theta - \phi)}{\rho^2}, \qquad \frac{d(R^2\dot\Theta)}{dt} = -\frac{R\sin(\Theta - \phi)}{\rho^2}.$$


Rearranging this gives
$$\frac{d^2(R e^{i\Theta})}{dt^2} = \frac{e^{i\phi}}{\rho^2} = -\frac{d^2(\rho e^{i\phi})}{dt^2},$$



Fig. 5.12 The natural two-torus for the three-body problem

so both the Sun and Jupiter have elliptical orbits.¹⁷ We will now limit our study to the case that both orbits are circular,¹⁸ whence it follows that
$$\rho = R = 1, \qquad \phi = t, \qquad \Theta = t + \pi, \tag{5.194}$$
and the model thus reduces to
$$\ddot r - r\dot\theta^2 = \left(\frac{1}{z}\right)_r + \varepsilon\left(\frac{1}{x}\right)_r = -\frac{1}{r^2} + \varepsilon\left(-\frac{r - \cos(\theta - t)}{x^3} + \frac{2\cos(\theta - t)}{r^3}\right),$$
$$\frac{d(r^2\dot\theta)}{dt} = \left(\frac{1}{z}\right)_\theta + \varepsilon\left(\frac{1}{x}\right)_\theta = \varepsilon\sin(\theta - t)\left(\frac{1}{r^2} - \frac{r}{x^3}\right), \tag{5.195}$$
where
$$x = [1 + r^2 - 2r\cos(\theta - t)]^{1/2}, \qquad \frac{1}{z} \approx \frac{1}{r} - \frac{\varepsilon\cos(\theta - t)}{r^2}.$$


Evidently when $\varepsilon = 0$, the (bounded) orbits are ellipses in physical space, and are represented by closed orbits in the $(r, \dot r)$ phase space. Specifically, the energy
$$E = \tfrac{1}{2}(\dot r^2 + r^2\dot\theta^2) - \frac{1}{r}$$
and the angular momentum
$$h = r^2\dot\theta$$
are conserved when $\varepsilon = 0$. In particular, the energy is then
$$\tfrac{1}{2}\dot r^2 + U(r, h) = E,$$
where the potential function is
$$U(r, h) = \frac{h^2}{2r^2} - \frac{1}{r},$$
and we require $-\dfrac{1}{2h^2} < E < 0$ for closed orbits: thus $r > \tfrac{1}{2}h^2$, and the fixed point (a circular orbit) is at $r = h^2$. The solution of the system when $\varepsilon = 0$ is the conic
$$\frac{h^2}{r} = 1 + e\cos(\theta + \alpha),$$
where $\alpha$ is constant and $e$ is the eccentricity of the orbit, and $e < 1$ for bounded orbits. If we start a computation with $r = r_0$, $h = h_0$ and $\dot r = 0$, then the eccentricity of the orbits is just
$$e = \left|\frac{h_0^2}{r_0} - 1\right|, \tag{5.202}$$
and it is clear that if this is too large, the Earth orbits will intersect that of Jupiter, and eventually collision will occur. It is in fact noticeable that all the planets have low eccentricity (as we mentioned earlier), with $e < 0.1$ except for Mercury at 0.2 (and the aberrant Pluto, at 0.24), and in our simulations we will generally take $r_0 \sim h_0^2$ to reflect this. The more eccentric orbits can be chaotic.

When $\varepsilon = 0$, the trajectories can be thought of as residing on a two-torus in $(r, \theta, \dot r)$ space (these being cylindrical polars), as shown in Fig. 5.12. A natural Poincaré section is thus the $(r, \dot r)$ plane at $\theta = 0$, and the coordinates $h$ and $E$ give respectively a measure of the primary radius of the torus in Fig. 5.12 ($r = h^2$) and the secondary radius of the cross section, $\tfrac{1}{2}\dot r^2 + U(r, h) = E$. For given $E > -\dfrac{1}{2h^2}$, the solution resides on a closed loop on the Poincaré section, but, in fact, it is degenerate, since every point on the loop is a fixed point of the Poincaré map (since all solutions are periodic).¹⁹ We expect this degeneracy to disappear when $\varepsilon$ is small.

When $\varepsilon$ is small, then the quantities $h$ and $E$ will vary slowly, so that the orbits remain approximately elliptical in physical space, approximately closed loops in phase space, but the amplitude and period of the orbits will vary slowly. Calculating the derivatives of $h$ and $E$, we find
$$\dot h = \varepsilon W_\theta, \qquad \dot E = \varepsilon(W_\theta\dot\theta + W_r\dot r),$$

¹⁷ Simply expand out the last term in (5.193).

¹⁸ This is called the restricted three-body problem. Actually there are three separate restrictions which are commonly assumed: first, the three bodies move in a common plane; secondly, the mass of the satellite is negligible (here this corresponds to $\delta \to 0$); and lastly, the consequent elliptical orbits of the two major bodies are actually circular. Sometimes this is called the circular restricted three-body problem, or even the planar circular restricted three-body problem, so the restricted part principally refers to the negligible mass.

¹⁹ This requires the period of rotation with respect to both angle variables, i.e. in $\theta$ and around the $(r, \dot r)$ loop, to be the same. This is exceptional, but is, in fact, the case: see question 5.7.
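The unperturbed orbital elements can be checked numerically; the following sketch (added here for illustration, with values $r_0 = 0.19$, $h_0 = 0.44$ close to those used in the figures) integrates the $\varepsilon = 0$ radial equation $\ddot r = -1/r^2 + h^2/r^3$ and confirms that $r$ oscillates between the perihelion $h^2/(1+e)$ and aphelion $h^2/(1-e)$ implied by the conic solution and (5.202).

```python
import math

def orbit_extremes(r0, h, dt=1e-4, t_max=0.8):
    """Integrate r'' = -1/r^2 + h^2/r^3 (the eps = 0 radial equation)
    from r = r0, r' = 0 with RK4, and return (min r, max r)."""
    def acc(r):
        return -1.0 / r**2 + h**2 / r**3
    r, v = r0, 0.0
    rmin = rmax = r0
    for _ in range(int(t_max / dt)):
        k1r, k1v = v, acc(r)
        k2r, k2v = v + 0.5*dt*k1v, acc(r + 0.5*dt*k1r)
        k3r, k3v = v + 0.5*dt*k2v, acc(r + 0.5*dt*k2r)
        k4r, k4v = v + dt*k3v, acc(r + dt*k3r)
        r += dt*(k1r + 2*k2r + 2*k3r + k4r)/6
        v += dt*(k1v + 2*k2v + 2*k3v + k4v)/6
        rmin, rmax = min(rmin, r), max(rmax, r)
    return rmin, rmax

r0, h = 0.19, 0.44
e = abs(h**2 / r0 - 1.0)             # eccentricity, from (5.202)
rmin, rmax = orbit_extremes(r0, h)
print(e, rmin, rmax, h**2/(1 + e), h**2/(1 - e))
```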



Fig. 5.13 Invariant tori in the solution of (5.195) for $\varepsilon = 0.001$. The plots give intersections of the trajectories in cylindrical polars $(r, v, \psi)$ at the Poincaré section $\psi$ decreasing through $\psi = \frac{1}{2}\pi$, where $v = \dot r$. The calculations are done at a fixed value of $J_0 \approx -3.068$, corresponding approximately to the present Earth orbit, and as described in the text in (5.208) and (5.209), in which $r_0 = 0.19$ and $h_0 = 0.44$, and with initial conditions $r_0 = 0.19$, $\psi = \frac{1}{2}\pi$, and then a variety of values of $v_0$ between 0.2 and 2.1027 are chosen, with the corresponding value of $h_0$ chosen as in (5.210). The right side shows a zoom, in which it can be seen that for $v_0 = 2.0$, the behaviour is stochastic; $v_0$ values larger than 2.1027 lead to collision with the Sun. The small line segments near $r = 0$, $|v| = 16$ are secondary tori

where $W_r$ and $W_\theta$ are partial derivatives of the function
$$W = \frac{1}{[1 + r^2 - 2r\cos(\theta - t)]^{1/2}} - \frac{\cos(\theta - t)}{r^2}.$$


Now because $W$ depends on the combination $\theta - t$, we have $W_\theta = -W_t$, and therefore we see that
$$E - h - \varepsilon W = J \tag{5.205}$$
is an exact integral (the Jacobi integral) of these equations: $J$ is a constant, which is dependent on the initial conditions. Note that neither $E$ nor $h$ is conserved under the perturbation. We can in fact write the non-autonomous system (5.198), (5.199) and (5.203) in autonomous form by defining
$$\psi = \theta - t, \tag{5.206}$$
whence we have
$$\dot r = v, \qquad \dot v = -U_r + \varepsilon W_r, \qquad \dot\psi = \frac{h}{r^2} - 1, \qquad \dot h = \varepsilon W_\psi, \qquad W = \frac{1}{[1 + r^2 - 2r\cos\psi]^{1/2}} - \frac{\cos\psi}{r^2}, \tag{5.207}$$
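The autonomous system (5.207) is straightforward to integrate numerically (the figures were produced this way; the sketch below is an illustrative reconstruction with arbitrary initial data, not the book's code), and one can use the run to verify that the Jacobi integral $J = E - h - \varepsilon W$ is indeed conserved along a trajectory:

```python
import math

eps = 0.001

def W(r, psi):
    x = math.sqrt(1 + r*r - 2*r*math.cos(psi))
    return 1.0/x - math.cos(psi)/r**2

def rhs(state):
    # right-hand sides of (5.207)
    r, v, psi, h = state
    x = math.sqrt(1 + r*r - 2*r*math.cos(psi))
    Wr = -(r - math.cos(psi))/x**3 + 2*math.cos(psi)/r**3
    Wpsi = math.sin(psi)*(1.0/r**2 - r/x**3)
    Ur = -h*h/r**3 + 1.0/r**2
    return (v, -Ur + eps*Wr, h/r**2 - 1.0, eps*Wpsi)

def jacobi(state):
    r, v, psi, h = state
    E = 0.5*v*v + h*h/(2*r*r) - 1.0/r
    return E - h - eps*W(r, psi)

def rk4_step(s, dt):
    k1 = rhs(s)
    k2 = rhs(tuple(si + 0.5*dt*ki for si, ki in zip(s, k1)))
    k3 = rhs(tuple(si + 0.5*dt*ki for si, ki in zip(s, k2)))
    k4 = rhs(tuple(si + dt*ki for si, ki in zip(s, k3)))
    return tuple(si + dt*(a + 2*b + 2*c + d)/6
                 for si, a, b, c, d in zip(s, k1, k2, k3, k4))

s = (0.19, 0.2, 0.5*math.pi, 0.44)    # (r, v, psi, h), illustrative values
J0 = jacobi(s)
for _ in range(5000):
    s = rk4_step(s, 1e-3)
print(J0, jacobi(s))                  # J conserved to integration accuracy
```

Recording $(r, v)$ each time $\psi$ decreases through $\frac{1}{2}\pi$ then gives sections like Figs. 5.13–5.15.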



Fig. 5.14 Resonant tori and their break up at $\varepsilon = 0.04$ in the solution of (5.207). The right-hand figure shows a close-up. Calculations were done as for Fig. 5.13, with $0 < v_0 < 1.1805$ and $J_0 \approx -3.106$. Larger values lead to collision with the Sun

where the subscripts denote partial derivatives. Because of the Jacobi integral (5.205), it is evident that the solution trajectories occupy a three-dimensional space, and when $\varepsilon = 0$ this space is further sliced into the tori in Fig. 5.12, except that now if we select a Poincaré section $\psi = \frac{1}{2}\pi$ say, the periodic orbits in physical space will cycle around the $(r, \dot r)$ loops, and thus the unperturbed Poincaré map will have a set of nested loops. Under a perturbation, we can then expect the usual break up to form resonant orbits and stochastic regions. This is shown in Fig. 5.13, to a small extent, but more clearly in Fig. 5.14. Some resonant tori, KAM tori, and also stochastic layers are evident.

In selecting the quantity $J$ to compute Figs. 5.13 and 5.14, we note that on the Poincaré section $\psi = \frac{1}{2}\pi$, we have
$$J = \tfrac{1}{2}v^2 + K(r, h) - h, \qquad K = \frac{h^2}{2r^2} - \frac{1}{r} - \frac{\varepsilon}{\sqrt{1 + r^2}}, \tag{5.208}$$
and for sufficiently negative $J$ ($J < -h$) these give closed curves in the plane, with a central point when $K$ is minimum. (If $J \gtrsim -h$, trajectories will tend to escape.) We choose this minimum to be at Earth's present orbit, $r = r_0 \approx 0.19$, and this requires
$$h = h_0 = \left[r_0 + \frac{\varepsilon r_0^4}{(1 + r_0^2)^{3/2}}\right]^{1/2}, \tag{5.209}$$


and then $J = J_0$ is computed from (5.208) using $v = 0$, $r = r_0$ and $h = h_0$. Values of $J < J_0$ are not (initially) obtainable on the Poincaré section, and only one point satisfies $J = J_0$. If $h$ were constant in the motion, it would follow that for this one value, assuming that the trajectories are recurrent (which is the case provided $h_0 \not\approx r_0^2$), the trajectory corresponding to $J_0$ would in fact be a periodic orbit of the system (and a fixed point of the Poincaré map). In practice $h$ varies during the motion,


5 Hamiltonian Systems

so that we can only expect the value of J0 to be close to the value at such a fixed point. In order to compute other orbits of the system, we select initial conditions ψ = 21 π , r = r0 , and then, for a range of v0 = 0, we determine the initial value of h by solving the quadratic (5.208) for h: ⎡ h=





2r02 J0

− r02 v02

+ 2r0 +

⎤1/2 2εr02 ⎦ . 1 + r02


(The positive sign is selected since for the Earth we have h > r 2 .) In practice the values of h 0 hardly vary, and essentially J0 = −3.067 − ε suffices as a choice. A complication of this particular system arises because the underlying Hamiltonian is not analytic, or even bounded. This is manifested in the reduced model by the singularities at r = 0 and x = 0, and when ε > 0, solutions can reach r = 0 or x = 0 in finite time, and can also escape (r → ∞). The latter property is common in Hamiltonian systems, but the presence of singularities renders the transition to stochasticity less easy to visualise, simply because when the solutions become chaotic, collisions tend to occur, and the trajectories terminate. Finally, in Fig. 5.15 we show the breakup of invariant tori when ε = 0.1. For this relatively large value of ε, stable orbits are restricted to a narrow range of v0 . The

Fig. 5.15 Resonant tori and their break up at ε = 0.1 in the solution of (5.207). Calculations were done as for Fig. 5.13, with 0 < v0 < 1.1805, and J0 ≈ −3.164. Larger values of v0 lead to collision with the Sun. Note that the fourth panel shows more detail than can be inferred from the third



reason for this can be seen if we look at the definition of $W$ in (5.207). As we take $r \sim r_0$, the value of $\varepsilon W \sim \varepsilon/r_0^2$ for small $r_0$; taking $r_0 = 0.19$, this is about 1.1 for $\varepsilon = 0.04$ and 2.8 for $\varepsilon = 0.1$; thus the perturbing terms are actually quite large in both cases, and only small variations about the fixed points of the map are stable. Even with the actual Jupiter–Sun mass ratio of $\varepsilon = 0.001$, the value of $r_0 = 0.074$ for the innermost planet Mercury yields a perturbation of magnitude $\varepsilon W \sim \varepsilon/r_0^2 \sim 0.18$.

5.5.2 Lagrange Points

The method of averaging, or Melnikov's method, can be applied to (5.207), although the various integrals involved need to be computed numerically. We can, however, use (5.207) to establish the existence of the Lagrange points. These correspond to equilibria of (5.207) in which $h$ is constant, $\psi$ is constant, so the orbits are circular and rotate at the same frequency as Jupiter, with $r^2 = h$. Evidently $W_\psi = 0$, and since (5.207)$_1$ can be written
$$\ddot r - \frac{h^2}{r^3} + \frac{1}{r^2} = \varepsilon W_r,$$
we also require (since $h = r^2$)
$$\frac{1}{r^2} - r = \varepsilon W_r; \tag{5.212}$$
evidently $r \approx 1$, so that these points are approximately on Jupiter's orbit. Calculating $W_\psi$, we find that either $\psi = 0$, $\psi = \pi$, or $\cos\psi = \frac{1}{2r} \approx \frac{1}{2}$, whence $\psi \approx \pm\frac{1}{3}\pi$. We then calculate $W_r$ in each case, and substituting into (5.212), we find (approximately) the five Lagrangian points:
$$r = 1 \pm \left(\tfrac{1}{3}\varepsilon\right)^{1/3}, \quad \psi = 0; \qquad r = 1 + \tfrac{3}{4}\varepsilon, \quad \psi = \pi; \qquad r = 1 - \tfrac{1}{6}\varepsilon, \quad \psi = \pm\tfrac{1}{3}\pi.$$
Of these, only the last two are stable (for sufficiently small $\varepsilon$). Going slightly backwards, note that (5.195) can be written in vector form as
$$\ddot{\mathbf{r}} = -\nabla V,$$
where, using (5.204),



Fig. 5.16 Contours of the effective potential $V_R = -\left(\frac{1}{2}r^2 + \frac{1}{r} + \varepsilon W\right)$ in the co-rotating frame with polar coordinates $r$ and $\psi$ (thus $x + iy = re^{i\psi}$); the value of $\varepsilon = 10^{-3}$. The five Lagrange points are visible as the black dots

$$V = -\left(\frac{1}{r} + \varepsilon W\right). \tag{5.215}$$


Transforming to coordinates in the co-rotating frame (with angular speed $\dot\theta = 1$), this leads to
$$\ddot{\mathbf{r}} + 2\mathbf{k} \times \dot{\mathbf{r}} = -\nabla V_R, \tag{5.216}$$
where the unit vector $\mathbf{k}$ is orthogonal to the orbital plane, and the Roche potential is given by
$$V_R = -\left(\tfrac{1}{2}r^2 + \frac{1}{r} + \varepsilon W\right). \tag{5.217}$$
Taking the scalar product with $\dot{\mathbf{r}}$ and integrating shows that the pseudo-energy is conserved:
$$\tfrac{1}{2}|\dot{\mathbf{r}}|^2 + V_R = J; \tag{5.218}$$
this is in fact just the Jacobi integral. It shows that the Lagrange points correspond to the extrema of $V_R$, and is correctly suggestive of their stability, at least for small $\varepsilon$. Figure 5.16 shows the contours of $V_R$; the Lagrange points are visible, particularly the Trojan points at $\pm 60°$.
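The approximate locations just given are easily checked against the full equilibrium condition (5.212). For instance at $\psi = \pi$, where $W_r = -1/(1+r)^2 - 2/r^3$, a few bisection steps recover $r \approx 1 + \frac{3}{4}\varepsilon$ (a sketch added for illustration; the bracket and iteration count are arbitrary choices):

```python
eps = 1e-3

def f(r):
    # radial equilibrium (5.212) at psi = pi, where Wr = -1/(1+r)^2 - 2/r^3
    return 1.0/r**2 - r - eps*(-1.0/(1 + r)**2 - 2.0/r**3)

# bisection on a bracket around r = 1
a, b = 0.95, 1.05
for _ in range(60):
    c = 0.5*(a + b)
    if f(a)*f(c) <= 0:
        b = c
    else:
        a = c
r_L3 = 0.5*(a + b)
print(r_L3, 1 + 0.75*eps)   # agree to O(eps^2)
```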



5.5.3 The Hénon–Heiles System

In 1964, Hénon and Heiles studied numerically the problem of two particles $(x, y)$ oscillating in a potential
$$V(x, y) = \tfrac{1}{2}(x^2 + y^2) + x^2 y - \tfrac{1}{3}y^3.$$
For small values of $x$ and $y$, motion is separable and harmonic, while for larger $x$ and $y$ the two oscillators are coupled. The Hamiltonian is (with particle mass equal to one)
$$H = \tfrac{1}{2}(p_1^2 + p_2^2 + q_1^2 + q_2^2) + q_1^2 q_2 - \tfrac{1}{3}q_2^3, \tag{5.220}$$
and the equations of motion can be written as $p_1 = \dot q_1$, $p_2 = \dot q_2$, and
$$\ddot q_1 + q_1 + 2q_1 q_2 = 0, \qquad \ddot q_2 + q_2 + q_1^2 - q_2^2 = 0.$$


We see from this that there are fixed points at $(q_1, q_2) = (0, 0)$, $(0, 1)$, and $(\pm\frac{1}{2}\sqrt{3}, -\frac{1}{2})$. The equilibrium at the origin is a centre, while each of the outer three fixed points is a saddle-centre, with two oscillatory components and two exponential components (one growing, one decaying).²⁰ The value of $H$ is $\frac{1}{6}$ at these three saddles, and they are connected in phase space by separatrices. If the energy $E = H > \frac{1}{6}$, then motion is unbounded, and trajectories diverge to infinity.

In this system, the energy plays the rôle of the perturbing parameter. If we put $E = \varepsilon^2\tilde E$, $q_i = \varepsilon\tilde q_i$, $p_i = \varepsilon\tilde p_i$, $H = \varepsilon^2\tilde H$, then the system is governed by the Hamiltonian
$$\tilde H = \tfrac{1}{2}(\tilde p_1^2 + \tilde p_2^2) + \tfrac{1}{2}(\tilde q_1^2 + \tilde q_2^2) + \varepsilon(\tilde q_1^2\tilde q_2 - \tfrac{1}{3}\tilde q_2^3),$$
and the energy is $\tilde E = \tilde H$. We select a fixed energy hypersurface $\tilde E = 1$, say; the non-integrability is parameterised by $\varepsilon = E^{1/2}$. In $(q_i, p_i)$ space, we can then expect increasing stochasticity as $E$ is increased. This is shown in Fig. 5.17, where numerical computations reveal the characteristic resonant saddle/centre portrait (for $q_2, p_2$) of the Poincaré–Birkhoff theorem. The 'eggs' in the figure increase in size with $E$, because on the Poincaré surface, we have
$$p_2^2 + V(q_2) = 2\left(E - \tfrac{1}{6}\right) - p_1^2, \qquad V(q) = -\tfrac{1}{3}(1 - q)^2(1 + 2q),$$
and thus the outer bounding surface of an egg is when $p_1 = 0$, and gives
$$p_2^2 + V(q_2) = 2\left(E - \tfrac{1}{6}\right),$$

²⁰ The easiest way to see this is to write the system as $\ddot z + z + i\bar z^2 = 0$, where $z = q_1 + iq_2$. Apart from $z = 0$, the equilibria then satisfy $z^3 = -i$, and the linear stability of all three non-zero fixed points is the same (since the equation is invariant under rotation by $\frac{2}{3}\pi$).



Fig. 5.17 Increasing stochasticity in the Hénon–Heiles system (5.220) as the energy $H = E$ is increased. The figures show intersections in the $(q_2, p_2)$ plane of the trajectories with the Poincaré section $q_1 = 0$, $p_1 > 0$. From left to right and top to bottom, the values of $E$ are $\frac{1}{24}$, $\frac{1}{16}$, $\frac{1}{12}$, $\frac{7}{64}$, $\frac{1}{8}$, $\frac{7}{48}$, and $\frac{1}{6}$. The figures all have the same scale. The cusp for $E = \frac{1}{6}$ is at $q_2 = 1$, $p_2 = 0$

which are closed loops as can be seen for $E \le \frac{1}{6}$. As mentioned, the curve becomes unbounded if $E > \frac{1}{6}$. Note that the degeneracy of the unperturbed system (the two primary frequencies are equal when $\varepsilon = 0$) means that the basic portrait for $\varepsilon \to 0$ is split into two sets of nested loops in $q_2 > 0$ and $q_2 < 0$. For $E \le \frac{1}{12}$, the loops appear to represent invariant tori, and the separatrices appear to be clean. This is, however, not the case: thin stochastic regions of thickness $\exp[-O(1/E)]$ exist for all values of $E > 0$, and they become rapidly larger as $E$ increases, as can be seen for $E \ge \frac{1}{8}$. The motion appears fully stochastic when $E = \frac{1}{6}$.
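Figures like Fig. 5.17 are produced by integrating the equations of motion and recording the crossings of the section $q_1 = 0$, $p_1 > 0$. A minimal sketch (illustrative, not the book's code; the starting point on the energy surface $E = \frac{1}{12}$ is an arbitrary choice), which also checks energy conservation:

```python
import math

def rhs(s):
    # Henon-Heiles equations of motion
    q1, q2, p1, p2 = s
    return (p1, p2, -q1 - 2*q1*q2, -q2 - q1*q1 + q2*q2)

def energy(s):
    q1, q2, p1, p2 = s
    return 0.5*(p1*p1 + p2*p2 + q1*q1 + q2*q2) + q1*q1*q2 - q2**3/3

def rk4(s, dt):
    k1 = rhs(s)
    k2 = rhs(tuple(x + 0.5*dt*k for x, k in zip(s, k1)))
    k3 = rhs(tuple(x + 0.5*dt*k for x, k in zip(s, k2)))
    k4 = rhs(tuple(x + dt*k for x, k in zip(s, k3)))
    return tuple(x + dt*(a + 2*b + 2*c + d)/6
                 for x, a, b, c, d in zip(s, k1, k2, k3, k4))

E = 1/12
q2, p2 = 0.1, 0.0                                  # illustrative starting point
p1 = math.sqrt(2*E - p2*p2 - q2*q2 + 2*q2**3/3)    # place the point on H = E
s, dt, section = (0.0, q2, p1, p2), 0.01, []
for _ in range(20000):
    q1_old = s[0]
    s = rk4(s, dt)
    if q1_old < 0 <= s[0] and s[2] > 0:    # crossing q1 = 0 upwards, p1 > 0
        section.append((s[1], s[3]))       # record (q2, p2)
print(len(section), abs(energy(s) - E))
```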



Fig. 5.18 Trajectories of the Hénon map (5.225), with α = 1.328

5.5.4 Hénon’s Area-Preserving Map The map z → eiα (z − i x 2 ),


where z = x + i y, is an area-preserving map considered by Hénon in 1969. If we define z = εζ = ε(ξ + iη), then the map can be written ζ → εiα (ζ − εiξ 2 ),


and in polar coordinates ζ = r eiθ r → r − εr 2 cos2 θ sin θ + . . . , θ → θ + α − εr cos3 θ . . . ,


which we see is of the form of the perturbed Poincaré map (5.140). We therefore expect to see increasing effects of the perturbation for larger values of $|z|$, and this is exhibited in Fig. 5.18, where for $\alpha/2\pi = 0.21$ ($\alpha = 1.328$) and small $|z| \le 0.3$, only KAM tori are visible. For $|z| \sim 0.5$, a period five island chain is visible, and beyond this, the motion appears stochastic. The orbits which are apparently separatrices joining the period five saddles are in fact thin stochastic layers, as can be seen in Fig. 5.19, where the phase portrait near the saddle at (0.57, 0.16) is magnified. The stochastic region is revealed, as is a further wealth of island chains formed by secondary resonances surrounding the period five centres. Tertiary resonances are also visible, as well as a secondary stochastic layer. Further magnification would reveal further similar complexity.
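A sketch of the iteration (added for illustration), together with a finite-difference check that the map is indeed area-preserving (Jacobian determinant equal to one):

```python
import cmath

alpha = 1.328

def henon(z):
    # z -> e^{i alpha} (z - i x^2), with x = Re z
    return cmath.exp(1j*alpha) * (z - 1j*z.real**2)

# iterate one orbit inside the KAM region |z| <= 0.3
z, orbit = 0.3 + 0.0j, []
for _ in range(1000):
    z = henon(z)
    orbit.append(z)

# numerical Jacobian determinant at an arbitrary point: should be 1
d = 1e-6
z0 = 0.2 + 0.1j
fx = (henon(z0 + d) - henon(z0 - d)) / (2*d)        # derivative in x
fy = (henon(z0 + 1j*d) - henon(z0 - 1j*d)) / (2*d)  # derivative in y
det = fx.real*fy.imag - fx.imag*fy.real
print(det)
```

Plotting many such orbits for a range of starting points reproduces the portrait of Fig. 5.18.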



Fig. 5.19 Magnification of Fig. 5.18 near the saddle at x = 0.57, y = 0.16

5.5.5 Standard Map

If we integrate the averaged form of (5.160),
$$\dot\phi = \varepsilon^{1/2} G J, \qquad \dot J = -\varepsilon^{1/2} F\sin\phi,$$
over an interval $2\pi$, we get the radial twist map
$$\phi \to \phi + 2\pi\varepsilon^{1/2} G J, \qquad J \to J - \varepsilon^{1/2} F\sin\phi,$$
and a rescaling puts this in the form of the standard map
$$\theta_{n+1} = \theta_n + I_{n+1}, \qquad I_{n+1} = I_n + K\sin\theta_n. \tag{5.230}$$
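The map is elementary to iterate; the sketch below (an added illustration) checks that for $K = 0$ it is a pure rotation, with $I$ exactly conserved, while for $K = 0.5$ — below the critical value discussed shortly — the excursion of $I$ along an orbit remains bounded by the surviving KAM curves.

```python
import math

def standard_map(theta, I, K, n):
    """Iterate the standard map n times, returning the orbit."""
    orbit = [(theta, I)]
    for _ in range(n):
        I = I + K*math.sin(theta)
        theta = (theta + I) % (2*math.pi)
        orbit.append((theta, I))
    return orbit

# K = 0: pure rotation, I is exactly conserved
orb0 = standard_map(0.5, 1.0, 0.0, 1000)

# K = 0.5: below the critical value K ~ 0.9716, KAM curves still bound I
orb = standard_map(0.5, 1.0, 0.5, 10000)
drift = max(abs(I - 1.0) for _, I in orb)
print(drift)
```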


Both I and θ are periodic, of period 2π . K > 0 is the stochasticity parameter, and for K = 0, the map is a pure rotation, and the invariant tori are straight lines in the cartesian (θ, I ) plane. For small positive K , resonant islands form as the KAM tori are destroyed. Figure 5.20 shows the phase portrait at K = 0.5. The major island is associated with the resonance described by (5.157), but we also see a period two island chain near I = π . When K = 1, Fig. 5.21 shows a more complicated structure. Many further island chains have been destroyed, and stochastic layers are also visible. Numerical calculations indicate that the last KAM torus is destroyed at K ≈ 0.9716, and this marks the onset of ‘global stochasticity’, although some stable periodic behaviour persists for higher values of K . Greene suggested a mechanism for calculating the value of K at which the last KAM torus is removed. When K > 0, the remaining tori no longer have constant rotation rates, rather the average rotation rate is given by the winding number (the def-



Fig. 5.20 Trajectories of the standard map (5.230) at K = 0.5

Fig. 5.21 Trajectories of the standard map (5.230) at K =1

inition of which followed Eq. (3.159)), and this performs the same function. Greene postulates a correspondence between the disappearance of a torus with irrational winding number $\Omega$, and the destabilisation of elliptic critical points of resonances with frequency ratio $r/s$, where $r/s$ are rational approximants of $\Omega$. Specifically, if a sequence of approximants $r_n/s_n$ to $\Omega$ (i.e. $r_n/s_n \to \Omega$ as $n \to \infty$) become unstable at values $K_n$, then we postulate that the torus disappears at $K = \lim_{n\to\infty} K_n$. We might expect that the last such torus to disappear would be that whose winding number is the hardest to approximate by rationals, and this is the golden mean $(\sqrt{5} - 1)/2 = 0.618\ldots$. The reason for this arcane choice is that the best rational approximants for an irrational are the successive iterates of the continued fraction representation, and the most slowly converging continued fraction is
$$0.618\ldots = \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1 + \cdots}}}.$$
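Equivalently, the convergents of this all-ones continued fraction are ratios of successive Fibonacci numbers $F_n/F_{n+1}$, and their convergence to $(\sqrt{5} - 1)/2$ is the slowest possible; a quick sketch:

```python
import math

def convergents(n):
    """Convergents of 1/(1+1/(1+...)): ratios F_n/F_{n+1} of Fibonacci numbers."""
    a, b = 1, 1
    out = []
    for _ in range(n):
        out.append(a / b)
        a, b = b, a + b
    return out

golden = (math.sqrt(5) - 1) / 2
cs = convergents(20)
print(cs[-1], golden, abs(cs[-1] - golden))
```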

Numerically, it has been confirmed that the last torus to disappear in the standard mapping is indeed that for which $\Omega = 0.618\ldots$. For other systems, this may not be



true, however. As mentioned above, Greene finds $K \approx 0.9716$. Percival suggests that at the transition value, the invariant torus (with irrational $\Omega$, so that the torus is filled by a trajectory on it) is destroyed by becoming a 'cantorus', i.e. the invariant set on the torus is no longer the whole curve, but only a Cantor subset of it, and trajectories can leak through from one side to the other.

5.6 Notes and References

A clear derivation of Lagrange's equations from first principles is given in the book by Goldstein (1950). Classical books on chaos in Hamiltonian systems are those by Moser (1973), Lichtenberg and Lieberman (1983) and Tabor (1989). Contopoulos (2002) deals with chaos in Hamiltonian systems, with an emphasis on galactic dynamics.

The matter of Poincaré and the King's prize was discussed in Chap. 1. There is also a nice discussion in the popularly written book by Diacu and Holmes (1996). The three volumes of Poincaré's Méthodes nouvelles contain many results; the one concerning transverse intersection of stable and unstable manifolds (5.173) is presented in Poincaré's treatise more as a calculation: see Poincaré (1892–1899; 1993, art. 401 ff.). The consequent heteroclinic tangles of Sect. 5.4.6 and Fig. 5.9 were also discovered by Poincaré (1892–1899, art. 397; 1993, p. 1,059). It is worth quoting what he says:

When we try to represent the figure formed by these two curves and their infinitely many intersections, each corresponding to a doubly asymptotic solution, these intersections form a type of trellis, tissue, or grid with infinitely fine mesh. Neither of these two curves must ever cut across itself again, but it must bend back upon itself in a very complex manner in order to cut across all of the meshes in the grid an infinite number of times. The complexity of this figure is striking, and I shall not even try to draw it …

Thus Poincaré came to the abyss and confronted chaos; but he stopped short of jumping in.

5.6.1 The KAM Theorem

The KAM theorem is named for Kolmogorov, Arnold and Moser. Kolmogorov's original paper (1954) is no more than a summary of the method, but will suffice for most readers. For a full proof, see Arnold (1963); while this is a long paper (over one hundred pages), the central technical part of the proof is some twenty pages. Moser (1962) provided an independent proof, but while Arnold (and Kolmogorov) considered a general $n$-dimensional Hamiltonian system, Moser's analysis was for area-preserving mappings; see also Moser (1973).



5.6.2 Restricted Three-Body Problem

The three-body problem has a long history, dating back even to Newton (who of course solved the two-body problem). Euler, Lagrange and Jacobi tussle with it, but it blooms in the late nineteenth century, reaching a pinnacle with Poincaré, and then even finds whole new territories in the twentieth century. Valtonen and Karttunen's (2006) book is entirely concerned with it, as is that of Szebehely (1967). There are of course a number of applications; we have focussed on Earth–Jupiter–Sun, but there is also Earth–Moon–Sun, Earth–Moon–satellite (see Koon et al. (2011)), and other exotica such as binary stars, black holes, galaxies, and so on (Valtonen and Karttunen 2006).

We are used to thinking of the solar system as being stable, and in particular we would normally expect the three-body problem to have regular solutions. As we have seen, relatively extreme parameters can propel the solutions towards blow-up, escape or chaos. One practical example of apparent chaos is the orbit of Pluto. Numerical integrations over long time suggest that Pluto has a chaotic orbit (Sussman and Wisdom 1988). Unlike the eight actual planets of the solar system, whose orbits are co-planar, Pluto's orbit is inclined to the others' orbital plane (the ecliptic) by some 17°. Its period is also in a 3:2 resonance with the orbit of Neptune, which lends it a certain short-term stability.

The Kirkwood gaps are named for a paper by Kirkwood (1867), where, in a short paragraph at the end, he comments on three conspicuous gaps of asteroids (in a population of 87) in the asteroid belt, associated with distances from the Sun (with Earth's orbit being equal to one: one astronomical unit) in the intervals (2.21, 2.26) (Ariadne to Feronia), (2.48, 2.52) (Thetis to Hestia), and (2.79, 2.86) (Leto to Polyhymnia), the corresponding ratios of Jupiter's period to the asteroid period being the 7:2, 3:1 and 5:2 resonances. With many more asteroids now charted, it is the second and third of these gaps which are most prominent.

5.6.3 Hénon-Heiles Potential

The Hénon-Heiles potential was studied by Hénon and Heiles (1964). For the motion of a star in a galaxy subject to an axisymmetric potential V(R, z) (in cylindrical polars (R, z, θ)), such that

    r̈ = −∇V,   (5.232)

there are two obvious first integrals, those of energy E and angular momentum h. The problem can be reduced to that of the motion of a particle in the x = (R, z) plane subject to the potential U = V + h²/2R²:

    ẍ = −∇U.   (5.233)



5 Hamiltonian Systems

The question of the existence of a third (isolating) integral has been a matter of long-standing interest (e.g. Contopoulos (2002)). If one exists, then trajectories live on surfaces in what can be shown to be a three-dimensional phase space (because of 'energy' conservation for (5.233)); if not, the motion is ergodic. It was to study this issue that Hénon and Heiles examined the dynamics of their potential, taken (writing R = x, z = y) as

    U = ½(x² + y²) + x²y − ⅓y³.
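The orbits in this potential are easily explored numerically. The following sketch (our own construction, not from Hénon and Heiles; the initial condition and step size are illustrative choices) integrates the resulting equations ẍ = −x − 2xy, ÿ = −y − x² + y² with a classical Runge–Kutta step, and checks that the energy E = ½(ẋ² + ẏ²) + U is conserved along the trajectory:

```python
def accel(x, y):
    # forces from the Henon-Heiles potential U = (x^2 + y^2)/2 + x^2 y - y^3/3
    return -(x + 2.0 * x * y), -(y + x * x - y * y)

def energy(x, y, vx, vy):
    U = 0.5 * (x * x + y * y) + x * x * y - y ** 3 / 3.0
    return 0.5 * (vx * vx + vy * vy) + U

def rk4_step(state, dt):
    # one classical fourth-order Runge-Kutta step for (x, y, vx, vy)
    def deriv(s):
        x, y, vx, vy = s
        ax, ay = accel(x, y)
        return (vx, vy, ax, ay)
    k1 = deriv(state)
    k2 = deriv(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = deriv(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = deriv(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(s + dt * (a + 2 * b + 2 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

# a bounded orbit with energy close to 1/8 (an illustrative choice)
state = (0.0, 0.1, 0.49, 0.0)
E0 = energy(*state)
for _ in range(20000):          # integrate to t = 200 with dt = 0.01
    state = rk4_step(state, 0.01)
drift = abs(energy(*state) - E0)
```

Plotting successive intersections of such orbits with the plane x = 0 (a Poincaré section) is how Hénon and Heiles exhibited the breakdown of the third integral as E increases.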

5.6.4 Hénon Area-Preserving Map

Not to be confused with the (dissipative) Hénon map of 1976 which we encountered in the preceding chapter, Hénon's (1969) area-preserving map was introduced by way of reference to conservative dynamical systems with two degrees of freedom; as applications, he mentions the restricted three-body problem and the motion of a star in an axisymmetric galaxy, amongst others.
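The map of the 1969 paper is usually quoted in the form x′ = x cos α − (y − x²) sin α, y′ = x sin α + (y − x²) cos α, a rotation composed with a quadratic shear, and its Jacobian determinant is identically 1. A quick numerical check of area preservation (our sketch; the sample points and angle are arbitrary):

```python
import math

def henon_map(x, y, a):
    # Henon's (1969) quadratic area-preserving map:
    #   x' = x cos(a) - (y - x^2) sin(a),  y' = x sin(a) + (y - x^2) cos(a)
    c, s = math.cos(a), math.sin(a)
    return c * x - s * (y - x * x), s * x + c * (y - x * x)

def jacobian_det(x, y, a, h=1e-6):
    # numerical Jacobian determinant by centred differences
    fx1, fy1 = henon_map(x + h, y, a)
    fx0, fy0 = henon_map(x - h, y, a)
    gx1, gy1 = henon_map(x, y + h, a)
    gx0, gy0 = henon_map(x, y - h, a)
    dxdx, dydx = (fx1 - fx0) / (2 * h), (fy1 - fy0) / (2 * h)
    dxdy, dydy = (gx1 - gx0) / (2 * h), (gy1 - gy0) / (2 * h)
    return dxdx * dydy - dxdy * dydx

# the determinant should be 1 everywhere in the plane
dets = [jacobian_det(0.3 * i, 0.2 * i - 0.5, 1.1) for i in range(5)]
```

Since the map is quadratic, the centred differences are exact up to rounding error, and the computed determinants are all 1.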

5.6.5 The Last KAM Torus

The discussion of the last KAM torus is by Greene (1979). Reference to the creation of a 'cantorus' is given by Percival (1979). See also Contopoulos (2002) and Meiss (2015).

5.7 Exercises

5.1 The Watt governor (Fig. 5.22) was described in Sect. 1.2. The angular velocity is Ω, a constant, and the arms, of length l, have masses ½m at their ends and are at an angle θ to the vertical. Show that the kinetic energy is

    T = ½ ml²(θ̇² + Ω² sin²θ),

Fig. 5.22 The Watt governor (as in Fig. 1.4)



and that the potential energy is V = −mgl cos θ, and deduce the form of the Lagrangian L. Deduce that the generalised momentum is p = ml²θ̇. Use Lagrange's equations to show that θ satisfies the equation

    θ̈ + Ω_c² sin θ − Ω² sin θ cos θ = 0,   (∗)

where you should define Ω_c. Show that if we define H = T + V, with p as above, then

    H = p²/2ml² + ½ ml²Ω² sin²θ − mgl cos θ,

and hence show that the equation

    ṗ = −∂H/∂θ

is inconsistent with (∗). Why?

5.2 Show that the equation

    q̈ + G(q)q̇² + F(q) = 0

may be put in Hamiltonian form by writing p = f(q)q̇ for some appropriate function f(q). Hence find a first integral of the equation. Show that if F > 0 or F < 0 for all q, solutions are unbounded. If F has a single zero at q = q₁, show that bounded (and thus periodic) solutions only occur if F(∞) > 0. What happens if F has multiple zeroes?

5.3 Suppose that a Hamiltonian system has Hamiltonian H(p, q), and that a function S(q, I) can be found such that p = ∂S/∂q, and S satisfies the Hamilton–Jacobi equation

    H(∂S/∂q, q) = H̃(I),

for some function H̃. Derive an expression for ṗ in terms of q̇, İ and derivatives of S. Derive another expression for ṗ by differentiating H̃ with respect to q. Hence show that İ = 0. By differentiating H̃ with respect to I, show also that

    θ̇ = ∂H̃/∂I, where θ = ∂S/∂I.

(I, θ are action-angle coordinates for the integrable Hamiltonian H̃.) A simple harmonic oscillator has Hamiltonian H = ½(p² + ω²q²). By solving the Hamilton–Jacobi equation for S(q, I), find the action-angle variables.

5.4 Use perturbation theory to solve for the motion described by the perturbed Hamiltonian

    H = ½(p² + ω₀²q²) + εq³, ε ≪ 1.

(You may assume that suitable action-angle coordinates when ε = 0 are determined by q = (2I/ω₀)^½ sin θ, p = (2ω₀I)^½ cos θ.) Hence show that the perturbed frequencies are given by

    ω(J) = ω₀ − 15ε²J/2ω₀⁴.

5.5 (The precession of the perihelion of a planet in the general theory of relativity.) The equation of the orbit of a planet with plane polar coordinates (r, θ) is

    d²u/dθ² + u = 1/l + εlu²,

where u = 1/r, the Sun is fixed at the origin r = 0, and

    l = h²/GM, ε = 3GM/c²l.

Here G is the gravitational constant, c is the velocity of light, M is the mass of the Sun and h is the angular momentum per unit mass of the planet about the Sun. Show that there is a centre at (u_N, 0) and a saddle point at (u_r, 0) in the (u, u′) phase plane, and find approximate expressions for these when ε ≪ 1 (the Newtonian limit). Sketch the phase portrait and identify the region of orbits representing solutions which are periodic functions of θ. Define φ = ωθ, where 2π/ω is the period (in θ) of an orbit, and assume that ω(ε) = ω₀ + εω₁ + ε²ω₂ + .... Show that ω₀ = 1 and ω₁ = −1. Deduce that



the planet is at perihelion, i.e. that its distance from the Sun is a local minimum, at successive angles θ differing by

    2π/ω = 2π + 2πε + O(ε²) as ε → 0.

[This gives the precession of the planet by the angle 2πε = 6πGM/c²l approximately each revolution, where l is close to the mean radius of the orbit. This result is used in one of the classic tests of Einstein's general theory of relativity.]

5.6 A model of neuroelectrical activity gives rise to the equations

    ẋ = (a − by)x(1 − x), ẏ = −(c − dx)y(1 − y),

where a, b, c, d > 0. By defining x = F(q), y = G(p) for a suitable choice of F and G, show that the system is Hamiltonian, and find H(p, q). Hence show that if b > a and d > c, all the phase curves are closed curves. What happens if b < a or d < c?

5.7 A canonical transformation from generalised coordinates (q, p) ∈ R² to (Q, P) is one which preserves the Hamiltonian structure. If H and K are the respective Hamiltonians, such that H(q, p) = K[Q(q, p), P(q, p)], show that the transformation is canonical if the Jacobian

    ∂(Q, P)/∂(q, p) = 1.

With k constant, determine which of the following are canonical transformations:

    (i) Q = ½q², P = p/q;
    (ii) Q = tan q, P = (p − k) cos²q;
    (iii) Q = sin q, P = (p − k)/cos q.

5.8 Show directly from the variational principle that if

    δ ∫ (pᵢq̇ᵢ − H) dt = 0

for all qᵢ(t), pᵢ(t) joining fixed end points in the (q, p) phase space, then qᵢ, pᵢ satisfy Hamilton's equations. Find the Hamiltonians and the action-angle variables for the nonlinear oscillator q̈ + V′(q) = 0, when the potential V(q) is given by

    (i) V = U tan²αq, U, α > 0;
    (ii) V = ¼βq⁴, β > 0.



5.9 Consider the two degree of freedom Hamiltonian H = H₀(q, p) + εH₁(q, p), where ε ≪ 1 and q, p ∈ R².

(i) Show that if H₀ = p₁² + ω₁²q₁² + p₂² + ω₂²q₂², then the system is integrable when ε = 0, and find the action-angle variables I₁, θ₁, I₂, θ₂.

(ii) Show that if ω₂/ω₁ = r/s, then resonance occurs. By defining a generating function S = (rθ₁ − sθ₂)J₁ + θ₂J₂, show that the system is transformed to one where resonance is removed, in a sense you should specify.

(iii) If φ₁ and φ₂ are the new angle variables, show by integrating over an interval 2π in φ₂, that J₂ is approximately constant, and that φ₁ and J₁ are slowly varying. If a slow time τ = εt is defined, show that the iterates of φ₁ and J₁ on the Poincaré surface φ₂ = 0 mod 2π are approximately described by the Hamiltonian system

    dφ₁/dτ = ∂H₁/∂J₁, dJ₁/dτ = −∂H₁/∂φ₁,

where H₁ should be given.

5.10 (i) Explain why Hamiltonian systems are volume preserving, i.e.

    d/dt [dq₁...dqₙ dp₁...dpₙ] = 0.

Hence explain why, if a Poincaré map is defined on an appropriate Poincaré section, it is area-preserving.

(ii) Show that the standard map defined by

    θₙ₊₁ = θₙ + Iₙ₊₁, Iₙ₊₁ = Iₙ + K sin θₙ,

(note Iₙ₊₁ in the first equation) is area-preserving in the Cartesian (θ, I) plane.

(iii) Explain why the intersection of unstable and stable manifolds of fixed points of an area-preserving map leads to stochastic behaviour.

(iv) Show that the forced Duffing-type equation

    ẍ + V′(x) = ε[γ cos ωt − βẋ], V = −½x² + ¼x⁴, ε ≪ 1,

has a small amplitude periodic solution x_ε(t) with E = ½ẋ² + V(x) = O(ε²).

(v) Show that when ε = 0, E = 0, there is a positive homoclinic orbit x₀(t) = √2 sech t, and when ε ≠ 0, Ė = εẋ[γ cos ωt − βẋ]; hence show that



when E ≪ 1, x ≈ x₀(t − t₀) + O(ε) if (x, ẋ) = (√2, 0) + O(ε) at t = t₀, and deduce that if E₀ = E(t₀) = O(ε) at t = t₀, then x → x_ε as t → +∞ if

    −E₀ ≈ ε ∫_{t₀}^{∞} ẋ₀[γ cos ωt − βẋ₀] dt,

while, if

    E₀ = ε ∫_{−∞}^{t₀} ẋ₀[γ cos ωt − βẋ₀] dt,

then x → x_ε as t → −∞.

(vi) Hence show that the stable and unstable manifolds of x_ε intersect if the Melnikov function

    M(t₀) = ∫_{−∞}^{∞} ẋ₀(t − t₀)[γ cos ωt − βẋ₀(t − t₀)] dt = 0.

Show (putting first t = s + t₀) that

    M(t₀) = √2 πγω sech(½πω) sin ωt₀ − ⁴⁄₃β,

and deduce that chaos will occur for ε ≪ 1 if

    2√2 β cosh(½πω) < 3πγω.

5.11 Show that the system of equations corresponding to the Hénon-Heiles Hamiltonian

    H = ½(p₁² + p₂² + q₁² + q₂²) + q₁²q₂ − ⅓q₂³

is equivalent to

    z̈ + z + iz̄² = 0, where z = q₁ + iq₂.

More generally, show that the motion of two particles x, y under a potential ½(x² + y²) + V(x, y) (i.e. the Hamiltonian is ½{ẋ² + ẏ² + x² + y²} + V) is governed by the system

    z̈ + z + 2V_z̄ = 0,

where we write V(x, y) = V(z, z̄).

5.12 Show that the system

    r̈ − rθ̇² = −1/r² − εF_r/x³,
    d(r²θ̇)/dt = −εF_θ/x³,




with

    x = [1 + r² − 2r cos(θ − t)]^½, F_r = r − cos(θ − t), F_θ = r sin(θ − t),

can be written in the form

    r̈ − rθ̇² = −∂W/∂r,
    d(r²θ̇)/dt = −∂W/∂θ,

where the potential is

    W = −1/r − ε/x.

Show however that the energy E = ½ṙ² + ½r²θ̇² + W is not conserved, but that it is approximately conserved on average, assuming an unperturbed circular orbit.

5.13 A planetary orbit is described in polar coordinates by the conservation laws

    r²θ̇ = h,
    ½(ṙ² + r²θ̇²) − 1/r = E.

Show that if E < 0, periodic solutions in time for r are possible, oscillating between values r±, and give an expression for the period P as an integral depending on E and h. Show that

    ∫₀^P θ̇ dt = 2π,

and deduce that the trajectories on the invariant torus in the r, θ, ṙ plane are actually periodic. [If you get this far, you may use the fact that

    ∫₀^π dx/(1 + a cos x) = π/√(1 − a²)

if |a| < 1, although you ought to prove that as well (use complex variables).]

Chapter 6

Diverse Applications

In this final chapter, we turn our attention to more practical issues, some of them concerned with the analysis of real data. Hitherto we have been interested in understanding the origin and nature of chaos as it occurs in known, deterministic systems. We can see the system is chaotic, but we know what the governing equations are, and the process is fully described by them. In applications, we sometimes assume that the system is deterministic, even if we are unable to analyse the equations. For example, we think of turbulence in fluids as representing the chaotic behaviour of solutions of the Navier–Stokes equations, although at best this is only an inference. In other situations, we may conceive of the system behaviour as having a certain structured randomness. An example of this is the stock market, which is usually thought of as a directed random walk. The question then arises, how do we deal with such systems in practice?

6.1 Chaotic Data

As a taster, consider Fig. 6.1. It shows three different time series. The first of these is a solution of the Mackey–Glass equation, a delay-recruitment equation of the form

    εẋ = −x + f(x₁),   (6.1)


where x₁ = x(t − 1) and ε ≪ 1. The function f(x) = λx(1 − x) has the typical unimodal shape, which induces chaos in one-dimensional maps (when ε = 0 in (6.1)), and when ε is small, it is known that trajectories are chaotic. The second series in Fig. 6.1 is a realisation of a stochastic process consisting of what is known as red noise.¹ While bearing a superficial resemblance to the first series, it is (at least, partly) determined by random events. The final series represents actual data, being

¹Noise will be discussed below.

© Springer Nature Switzerland AG 2019 A. Fowler and M. McGuinness, Chaos,


Fig. 6.1 Three different 'chaotic' time series. The top one is the solution of the delay-recruitment Eq. (6.1), with f(x) = λx(1 − x), and λ = 3.8, ε = 0.1. The second one is a realisation of an AR process (see Sect. 6.2.3) given by xₙ₊₁ = 0.2xₙ − 0.4xₙ₋₁ + 0.1xₙ₋₂ + 0.1xₙ₋₃ + z, where z is sampled from a normal distribution. The third is the Trade-Weighted Index for the New Zealand dollar (NZ$); the horizontal axis is the number of (week) days, starting at 6 January 2014. It represents the value of the NZ$ against a basket of 17 currencies

a time series of the value of the New Zealand dollar over a period of somewhat less than 3 years. There are obvious qualitative differences between the three series, but they are each evidently 'irregular'. The question we need to address, given the examples in Fig. 6.1, is whether the data sets represent the output of a deterministic chaotic system, or whether they are intrinsically 'noisy' random series, best dealt with using statistical methods. Next, we give some examples.
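The delay-recruitment equation (6.1) is straightforward to simulate by time-stepping with a history buffer holding one delay interval of past values; a minimal forward-Euler sketch (the step size and constant initial history are our choices):

```python
# Forward-Euler simulation of the delay-recruitment equation
#   eps * x'(t) = -x(t) + f(x(t - 1)),   f(x) = lam * x * (1 - x),
# with the parameter values quoted for Fig. 6.1 (lam = 3.8, eps = 0.1).
lam, eps = 3.8, 0.1
dt = 0.001
delay_steps = int(round(1.0 / dt))      # the delay is 1 time unit

f = lambda x: lam * x * (1.0 - x)

history = [0.5] * (delay_steps + 1)     # constant history x(t) = 0.5 on [-1, 0]
for n in range(100000):                 # integrate to t = 100
    x_now = history[-1]
    x_delayed = history[-(delay_steps + 1)]
    history.append(x_now + dt * (-x_now + f(x_delayed)) / eps)

tail = history[-20000:]                 # discard the transient
tail_min, tail_max = min(tail), max(tail)
```

Each step forms a convex combination of the current value and f of the delayed value, so the solution remains in (0, 1); after the transient it oscillates irregularly, as in the top panel of Fig. 6.1.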



6.1.1 Weather Forecasting

Weather prediction is done by solving the basic equations of fluid mechanics describing the motion of the atmosphere. This is hard enough on its own, since the flow is turbulent, but is rendered harder by the necessity to include radiative transport of energy, and also to model the transport, evaporation, condensation and precipitation of moisture. In particular, the way in which cloud formation is treated in numerical weather prediction models is a source of great uncertainty. Even without these extra complications, and even without paying too much attention to the fluctuations associated with 'sub-grid scale' processes, it is generally thought that large-scale weather systems are intrinsically chaotic, and that this limits effective predictability to a period of some five days. The uncertainty inherent in the initial conditions means that trajectory divergence² will eventually occur. The current fashion is to deal with this inherent uncertainty by carrying out multiple computations, so that instead of a single forecast, one gets a representation of a forecast probability density. The prediction is then one of probability ('there is a 70% chance of rain'), and the deterministic model is hijacked to become a semi-deterministic stochastic process, in which the effect of the chaos is thought of as a random process.

6.1.2 The Stock Market

Just like thermodynamics, financial systems represent an item of incomprehensibility to the applied mathematician. The basic conceptual problem is that there is no conservation law: there is no quantity which follows any determinable rules in allocating its amount.³ Despite this, financial mathematical models exist and are used. The basic assumption on the price of a stock or of an option is that it follows a directed random process: there is an element of growth, on which is superimposed a random walk. When this pointwise stochastic process is converted to a Fokker–Planck equation describing the evolution of the expected value, it becomes a partial differential equation called the Black–Scholes equation, in which the randomness is manifested by a diffusive term, whose diffusion coefficient is called the volatility. The problem is, real data sets do not conform to this assumption. Mathematical finance thus provides a semi-empirical model based on a working hypothesis which is demonstrably wrong.

²In the enormous-dimensional phase space.
³Money might be such a quantity, at the least analogous to the use of probability as a surrogate for actual events, but then one introduces further abstractions: options, exchange rates, which are one further step removed from connection to actual commodities.



6.1.3 Landslide Statistics

Figure 6.2 shows a distribution representing the number of observed landslides as a function of their area in three different climatic events. Such distributions are commonly found in many different geoscience contexts, for example, in earthquake magnitude distribution, rock grain size distribution, drumlin size distribution, and many others. The presence of such distributions attests to a statistical process, but their regularity is also suggestive of an underlying deterministic process which controls the form of the distribution.

6.1.4 Chaos and Noise

We have alluded above to a distinction between chaos and noise, and to make this distinction more precise, we must categorise what we mean by noise. A sequence of numbers {xₖ} is called a white noise process if it is a sequence of uncorrelated random variables from a fixed distribution with constant mean and variance. If the distribution is normal, then the process is said to be Gaussian. We shall see that the

Fig. 6.2 Superimposed distributions of landslide size following three different events in three different parts of the world: earthquake in California, snowmelt in Umbria, rainfall in Guatemala. Also plotted is a fitting inverse gamma distribution. Of interest is the similarity of the three different distributions, as well as the power law tail, often taken to represent self-similarity in the process. Figure supplied courtesy of Bruce Malamud, an edited version of figure 1 in the paper by Malamud et al. (2004), and reproduced with permission of Earth Surface Processes and Landforms



Fourier spectrum of such a process is flat, i.e. uniform. Other, partially correlated processes have non-uniform spectra, and are often referred to as coloured noises: examples are red noise and brown noise. One particularly interesting process is so-called '1/f noise', where the power spectral density (see the following section) S(f) ∼ f^(−α), with α near 1. Examples of 1/f noise are very prevalent in physical systems, particularly various kinds of electronic devices. However, chaotic processes may also be considered as generating noisy signals, in some sense. As we shall see later, one can associate probability densities with chaotic signals, and highly uncorrelated signals can be obtained by using sufficiently high-dimensional chaotic processes. For example, consider the map

    f(x) = Nx mod 1, x ∈ (0, 1).


This is chaotic for N > 1, and if N is large then information is rapidly lost. For example, the map x → f^M(x) (the Mth iterate) will approximate a white noise process for M and N large, e.g. M = 10, N = 1000. A useful distinction between chaos and noise may thus be that noise effectively represents chaos of very high dimension, and to distinguish between chaotic and noisy signals is more a question of delimiting the amount of resolution a data set can offer. In the following section, we describe briefly a statistical approach, where it is assumed from the outset that the process is, in some sense, random.
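Floating-point arithmetic cannot iterate x → Nx mod 1 faithfully (each step discards low-order bits), so the following sketch iterates the map exactly on rationals n/m with a large prime denominator m (the values of N, m and the seed are our illustrative choices, not the book's). For this map the lag-1 autocorrelation of a uniformly distributed orbit is close to 1/N, hence very small for large N:

```python
# Iterate x -> N*x (mod 1) exactly by tracking numerators n_k, with x_k = n_k / m.
N = 137
m = (1 << 61) - 1          # a Mersenne prime, so the iteration is exact
n = 123456789
xs = []
for _ in range(20000):
    n = (N * n) % m
    xs.append(n / m)

# sample lag-1 autocorrelation of the orbit; expect roughly 1/N
mean = sum(xs) / len(xs)
num = sum((xs[k] - mean) * (xs[k + 1] - mean) for k in range(len(xs) - 1))
den = sum((x - mean) ** 2 for x in xs)
lag1 = num / den
```

The orbit is of course entirely deterministic; the correlation is small only because the multiplier N spreads nearby points rapidly, illustrating the point that 'noise' can be read as chaos of very short memory.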

6.2 Statistical Methods

The basis of statistics is that we know nothing. The basis of applied mathematics⁴ is that we know everything. The two subjects (should) engage with each other because, on the one hand, the deterministic models of the applied mathematician can lead to apparently random behaviour, while the effort of Bayesians in statistics is directed towards the idea that, yes, we actually do know something. Statistics is based on the theory of probability, which works as follows. It associates to an 'event' a number, called its probability. It then assigns to this probability a set of rules, based on intuitive understanding (just as we base the rules of arithmetic on an intuitive understanding of number), and it then does a lot of things up in this 'probability space'. The reason probability theory is useful is that eventually it makes contact with the real world again by providing a way in which one can identify what the mythical probability actually is: this comes via the central limit theorem. So probability and thus statistics make sense. We now survey the way in which it deals with the analysis of a time series, which is simply a sequence of values which a quantity x takes at successive times.

⁴In the, perhaps, limited sense that I mean here.



6.2.1 Stochastic Processes

Let us consider a quantity X,⁵ taking a value x at time t. This evolves to a new value at a time t + Δt, and any model will provide a recipe for x(t + Δt) in terms of x(t). There are various ways to do this. One, obviously, is the deterministic way: x(t + Δt) = x(t) + f Δt, where f is a prescribed function. In the limit Δt → 0, this yields a differential equation, while if Δt remains finite, it gives a deterministic difference equation. A stochastic process is one in which the evolution of x is partly or entirely determined by chance. There are then various ways of prescribing the stochastic evolution of X. One simple one is to realise that, if X is evolving randomly, it only makes sense to deal with its probability density, and thus we consider p_t(x), which is the probability that X = x at time t. The probability p_{t+Δt} is then constructed on the basis of the sum of conditional probabilities, yielding an equation called the Chapman–Kolmogorov equation. Note that this involves a microscale description of the local probability of evolving x to an adjoining state. An example of this is the random walk on the integers. More specifically, we can define a conditional transition probability density T_t(x | z), which is the probability (density) that X = x at t given that X = z at t = 0. That T_t depends only on the time difference between initial and final state is due to the assumption of a Markov process: that the system has no memory, i.e. its future evolution depends only on the current state. The Chapman–Kolmogorov equation is then

    T_{t+Δt}(x | z) = ∫_Y T_{Δt}(x | y) T_t(y | z) dy.   (6.3)
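For a finite-state Markov chain, (6.3) becomes the statement that transition matrices compose by matrix multiplication, T_{m+n} = T_m T_n; a quick numerical check with an illustrative 3-state chain (our example, not from the text):

```python
# For a finite-state Markov chain the Chapman-Kolmogorov equation (6.3)
# reads T_{m+n} = T_m T_n: transition matrices compose by multiplication.
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# a 3-state transition matrix, T[i][j] = P(state j -> state i); columns sum to 1
T = [[0.7, 0.2, 0.1],
     [0.2, 0.5, 0.3],
     [0.1, 0.3, 0.6]]

T2 = matmul(T, T)          # two-step transition probabilities
T3 = matmul(T2, T)         # three steps, composed as (T^2) T ...
T3_alt = matmul(T, T2)     # ... or as T (T^2): the same, by (6.3)
```

The columns of every power of T still sum to 1, so each T_n remains a valid transition probability, which is the discrete analogue of the conservation of total probability in (6.3).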

A separation now occurs, depending on whether x and/or t are discrete or continuous variables. When both are discrete, we have a discrete Markov process, or Markov chain, whose solution can be found using generating functions, which are essentially a kind of Laplace transform. If time is continuous, we have a sequence of differential-difference equations, again equally solved with a generating function, which is then described by a partial differential equation. A more complicated situation occurs when X is continuous, but we allow the jump in x to be distributed.⁶ Here we adopt the continuous limit of the Chapman–Kolmogorov equation which describes the generation to generation evolution of the probability density of x, and when time is also allowed to be continuous, we obtain an integro-differential equation which is called the master equation. If we let φ(x, t) denote the probability density function of X at time t, then

    φ(x, t) = ∫_Z T_t(x | z) φ₀(z) dz,   (6.4)

⁵In probability theory, X is a random variable.
⁶An example is the successive fragmentation of pieces of rock.




where φ = φ₀(x) at t = 0, and

    ∂φ/∂t = ∫ S(x, y) φ(y, t) dy,   (6.5)

where S(x, y) is the time derivative of T_t(x | y). A further elaboration results from taking

    T_Δt(x | z) = W(x, z) Δt, x ≠ z,   (6.6)

from which we can then derive the master equation

    ∂φ(x, t)/∂t = ∫ [W(x, y) φ(y, t) − W(y, x) φ(x, t)] dy.   (6.7)

A particular simplification occurs if it is assumed that migration is local. Specifically, if we take

    x = y + ξ, W(x, y) = w(y, ξ),   (6.8)

with w varying slowly with y but rapidly with ξ, then a Taylor series expansion in ξ leads to the equation

    ∂φ/∂t = Σₙ₌₁^∞ ((−1)ⁿ/n!) ∂ⁿ/∂xⁿ [Mₙφ],   (6.9)

where the moments Mₙ are

    Mₙ = ∫ ξⁿ w(x, ξ) dξ.   (6.10)


Truncation at second order gives the Fokker–Planck equation. A recovery of some of these descriptions is provided in the exercises. A different way of describing the evolution of X is to consider its actual value, and to allow randomness in its evolution. A particular simulation will then yield a realisation of the process. A typical example of this is the stochastic differential equation

    x(t + Δt) − x(t) = Δx = aΔt + bΔW,   (6.11)

in which a and b are prescribed functions of x and t, and the increments ΔW are drawn from a distribution, which can be taken to have mean zero. If the distribution of ΔW is Gaussian and has variance Δt, then it is called a Wiener process. Since the sum of a sequence of Gaussians with means μᵢ and variances σᵢ² is a Gaussian with mean Σᵢμᵢ and variance Σᵢσᵢ² (see question 6.4), we see that W(t) is also Gaussian, with mean zero and variance t. To relate the stochastic differential equation (6.11) to the master equation (6.5), we proceed as follows. The probability density that X = x at t + Δt given that X = y at t is T_Δt(x | y) dy. In view of (6.11), this occurs if ΔW = (x − y − aΔt)/b, for which the corresponding probability density is Gaussian with mean zero and variance Δt, and it follows that

    T_Δt(x | y) dy = (1/b√(2πΔt)) exp[−(x − y − aΔt)²/2b²Δt] dy.   (6.12)

Using (6.6), (6.8) and (6.10), we find M₁ = a, M₂ ≈ b², and Mₙ ≪ 1 for n > 2; this yields the Fokker–Planck equation in the form

    ∂φ/∂t + ∂(aφ)/∂x = ½ ∂²(b²φ)/∂x²,   (6.13)

which provides an explicit connection between the realisation (6.11) and the probabilistic forecast (6.13).
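This connection can be checked directly: for constant a and b the Fokker–Planck equation (6.13) predicts that X(t) is Gaussian with mean x₀ + at and variance b²t, and an ensemble of Euler–Maruyama realisations of (6.11) reproduces these moments (a sketch; the parameter values and seed are our choices):

```python
import random

# Euler-Maruyama realisations of dx = a dt + b dW with constant a, b.
# For this constant-coefficient case the Fokker-Planck equation predicts
# X(t) ~ Gaussian with mean x0 + a*t and variance b^2 * t.
random.seed(1)
a, b = 0.5, 1.2
dt, nsteps, npaths = 0.01, 200, 5000
t_end = dt * nsteps                     # simulate to t = 2

finals = []
for _ in range(npaths):
    x = 0.0
    for _ in range(nsteps):
        x += a * dt + b * random.gauss(0.0, dt ** 0.5)   # dW ~ N(0, dt)
    finals.append(x)

mean = sum(finals) / npaths
var = sum((x - mean) ** 2 for x in finals) / npaths
```

With 5000 paths the sampling error in the mean is of order √(b²t/npaths) ≈ 0.02, so the agreement with at = 1 and b²t = 2.88 is within a few per cent.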

6.2.2 Autocorrelation and Power Spectral Density

Statisticians live in a discrete world, so they deal with finite time series {x₁, ..., xₙ}. Obviously one can simply deal with these values purely statistically, which is to say we suppose they are values drawn from some distribution,

    xᵢ = zᵢ,   (6.14)

where we take i = 1, 2, ..., n, and zᵢ denotes (most simply) a white noise process, i.e. the values are independent. In what follows, we will suppose both {xᵢ} and {zᵢ} have zero mean. There are various statistical measures used to analyse time series. One such is the correlogram, which is used in assessing the independence of the successive values. The autocovariance function of a random variable X with zero mean is

    γ(τ) = E[X(t)X(t − τ)],   (6.15)

and the autocorrelation function is G(τ) = γ(τ)/γ(0). Evidently G(0) = 1, and generally G decreases as τ increases; however, for a periodic signal, clearly it is periodic, and thus an oscillatory behaviour of G suggests a degree of periodicity in the time series. Note that for a white noise of variance σ², γ(0) = σ² and γ(τ) = 0 for τ ≠ 0 (since then X(t) and X(t − τ) are independent). The discrete estimator for γ(τ) is

    g_j = (1/n) Σᵢ xᵢ xᵢ₋ⱼ,   (6.16)
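The estimator (6.16) is a few lines of code; for a zero-mean periodic signal the correlogram g_j/g₀ oscillates, returning to values near 1 at lags equal to the period (a minimal sketch, with our choice of test signal):

```python
import math

def correlogram(x, maxlag):
    # g_j = (1/n) * sum_i x_i x_{i-j}, normalised by g_0 (series assumed zero-mean)
    n = len(x)
    g = [sum(x[i] * x[i - j] for i in range(j, n)) / n for j in range(maxlag + 1)]
    return [gj / g[0] for gj in g]

# a zero-mean signal with period 25 samples
x = [math.sin(2 * math.pi * i / 25) for i in range(500)]
r = correlogram(x, 30)
```

By construction r[0] = 1; r[25] is close to 1 (reduced slightly by the finite-sample edge effect), and near the half-period the correlogram is strongly negative, the oscillatory signature described above.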



where n is the number of data points and the correlogram is just g_j/g₀, i.e. the discrete estimator of G(τ). Another statistic, the periodogram, is also used in assessing the oscillatory qualities of a time series. It is defined by

    I(ϖ) = (1/n) |Σ_j x_j e^(ijϖ)|²,   (6.17)

and is evidently related to the discrete Fourier transform. Both of these quantities have continuous analogues, which are reached in the limit of large n. Specifically, if we define

    x_j = X(jΔt), t = jΔt, ϖ = ωΔt,   (6.18)

then (with a suitable time origin shift) (6.17) is the discrete analogue of the power spectral density:

    Δt I(ϖ) ≈ P(ω) = |X̂(ω)|²/2T, X̂(ω) = ∫_{−T}^{T} X(t) e^(iωt) dt, 2T = nΔt,   (6.19)

where it is assumed that T is large; so the periodogram is essentially the squared amplitude of the Fourier transform of X (extended to be zero beyond (−T, T)). Equally the correlogram has a continuous analogue in the autocorrelation function

    G(τ) ≈ C(τ) = (1/2T) ∫ X(s) X(s − τ) ds,   (6.20)

and in fact the two continuous quantities are related, as are their discrete versions: the Fourier transform of C is the power spectral density:

    Ĉ(ω) = P(ω).   (6.21)
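The periodogram (6.17) can be computed directly from its definition; for a sinusoid at a grid frequency it produces a single sharp spike (a minimal sketch of the direct sum; a fast implementation would use the FFT, as noted in the caption of Fig. 6.3):

```python
import cmath, math

def periodogram(x, w):
    # I(w) = (1/n) * | sum_j x_j exp(i j w) |^2, as in (6.17)
    n = len(x)
    s = sum(xj * cmath.exp(1j * j * w) for j, xj in enumerate(x))
    return abs(s) ** 2 / n

n = 256
w0 = 2 * math.pi * 16 / n               # a frequency on the Fourier grid
x = [math.sin(w0 * j) for j in range(n)]

grid = [2 * math.pi * k / n for k in range(n // 2)]
I = [periodogram(x, w) for w in grid]
peak = max(range(len(I)), key=I.__getitem__)
```

Because the signal frequency coincides with a grid frequency, orthogonality of the complex exponentials makes the off-peak values vanish (to rounding error); sampling between grid frequencies produces the leakage and grid-scale oscillation discussed for Fig. 6.3.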


The power spectral density comes into its own in indicating periodicity in a signal (where spikes occur at multiples r ω of a single frequency), or quasi-periodicity, where for example doubly periodic motion is indicated by spikes at frequencies r ω1 + sω2 . The power spectrum has thus been used in fluid dynamical experiments to track the sequence of bifurcations leading to chaotic behaviour. Figure 6.3 shows the power spectrum of a chaotic solution of the Lorenz equations. We see that there is power at certain frequencies, but also a low level ‘noise floor’. We might thus be tempted to suppose that the signal is that of a nonlinear oscillator or oscillators subjected to noise. In fact, contemplation of the entirely deterministic signal allows us to explain, or at least interpret, some of the spikes in the spectrum. The most obvious oscillatory period is that of the short term fluctuations about either of the non-zero fixed points. For example, in 17.75 ≤ t < 22.64 there are eight


Fig. 6.3 Power spectrum (computed via the periodogram) of a solution of the Lorenz equations (4.2) at the standard parameter values r = 28, σ = 10, b = 8/3. The upper figure shows the solution for x used in construction of the power spectrum in the lower figure. The time series analysed is 512 points over a time range of 2T = 100 units, so that Δt = 100/511 = 0.1957, and the Nyquist frequency is f_N = 511/200, the maximum resolution above which the periodogram begins to mirror itself. The maximum value of angular frequency ω = ϖ/Δt is then ω_N = 2πf_N ≈ 16. The periodogram is computed simply by directly calculating the sum in Eq. (6.17) for each value of ϖ chosen, up to the maximum value ϖ_N = ω_NΔt = π. Because (6.17) is closely related to the discrete Fourier transform, it is natural to choose 2⁹ = 512 equally spaced values of ϖ ∈ (−π, π), thus Δϖ = 2π/n = 0.0122 and consequently Δω = 2π/2T = 0.0628 in computing the periodogram; however, when this is done, grid-scale oscillations occur, lending credibility to the suspicion that the periodogram is under-resolved. Thus in the figure, we have chosen 2¹⁴ = 32 × 512 evenly spaced values of ϖ ∈ [0, 3], increasing the number of points until there were enough to fully resolve and accurately locate the peaks and troughs in the graph of the periodogram

peaks and thus seven approximate cycles, thus a period ≈ 0.7; and there is a peak at ω ≈ 8.84, corresponding to a period of 0.71. That this spike is of comparatively small amplitude can be ascribed to the fact that the amplitude of these fast oscillations is about a quarter to a third of the longer 'period' cycles from x < 0 to x > 0, and since P(ω) is proportional to the square of the signal amplitude, the power in these high-frequency signals should be an order of magnitude less than those of the slower excursions. Concerning the latter, the same interval 17.6 < t < 22.8 (of successive x = 0 crossings) indicates a half-period of such a cycle of Δt = 5.2, and thus ω = 0.604;

Fig. 6.4 Power spectrum of the solution of the Lorenz equations as in Fig. 6.3. The upper figure shows the spectrum for 2T = 800 and the lower for 2T = 6,400. The same value of Δt = 2T/n was used as in Fig. 6.3, i.e. n = 2¹² and n = 2¹⁵, respectively, and the default grid interval Δω = π/T was used

and indeed there is a spike at ω = 0.606. Similarly, the sequence from t = 38 to 43.2 also suggests a half-period of 5.2. Working the other way, we see a spike at ω = 2.8, corresponding to a period of Δt = 2.24; and three examples of peak to peak intervals of this approximate size are in (29.48, 31.75), (63.39, 65.74) and (96.32, 98.63), of lengths 2.27, 2.35 and 2.31, respectively. It thus seems that we can interpret the spectrum on the basis that the time series lands close to temporary periodic cycles, and that it is these which provide the peaks in the spectrum. Since all such periodic cycles are unstable, we might expect that if the time series is run longer, more peaks of smaller relative power would appear, and the long time evolution of the power spectrum would be smooth. This is borne out in Fig. 6.4, which shows just such spectra. In contrast to periodic or quasi-periodic behaviour (where the power spectrum of a long time series approximates to a sequence of delta functions), the power spectrum of a white noise {z_j} is flat. The easiest way to see this is via the correlogram in (6.16) and the periodogram in (6.17). We have (since g₀ = σ², g_j = 0, j ≠ 0 for a white noise)


$$\sigma^2 = \sum_j g_j e^{ij\Omega} = \frac{1}{n}\sum_{j,k} x_k x_{k-j}\, e^{ij\Omega} = \frac{1}{n}\sum_k x_k e^{i\Omega k} \sum_l x_l e^{-i\Omega l} = I(\Omega), \qquad (6.22)$$
thus I(Ω) = σ², independent of Ω, and the corresponding power spectrum is P(ω) = σ²Δt.

Generally, we might imagine that an experimentally measured signal consists of a deterministic signal together with a superimposed noise. In that case we might wish to clean the signal by removing the noise. This process is called filtering. A filter of a time series {x_i} is simply a modified time series

$$y_i = \sum_j a_j x_{i-j}.$$

In continuous terms, y = a ∗ x is the Fourier convolution of a with x, and thus filtering is most easily thought of in Fourier space as ŷ = â x̂.
Particularly useful forms of filter can be chosen when there is a large discrepancy in the time scales of signal and noise. A low-pass filter allows low frequencies but suppresses high frequencies, for example by taking â = 1 for ω < ω_c and â = 0 for ω > ω_c. A question of some interest for spectra such as that in Fig. 6.3 is, how should we choose an appropriate filter?
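As a minimal illustration (a Python sketch of our own; the function name `lowpass` and the test signal are assumptions, not from the text), such an ideal low-pass filter is conveniently applied via the FFT:

```python
import numpy as np

def lowpass(x, dt, omega_c):
    """Ideal low-pass filter: zero all Fourier modes with |omega| > omega_c."""
    xhat = np.fft.fft(x)
    omega = 2 * np.pi * np.fft.fftfreq(len(x), d=dt)  # angular frequency grid
    xhat[np.abs(omega) > omega_c] = 0.0               # a_hat = 1 below the cutoff, 0 above
    return np.real(np.fft.ifft(xhat))

# usage: a slow sine plus a fast oscillation, sampled over one exact period
n = 1024
t = np.arange(n) * 2 * np.pi / n
x = np.sin(t) + 0.3 * np.sin(50 * t)
y = lowpass(x, dt=2 * np.pi / n, omega_c=10.0)  # recovers the sin(t) component
```

Because both components sit exactly on the discrete frequency grid here, the slow component is recovered essentially exactly; for generic data, spectral leakage makes the cutoff choice a genuine question, as noted above.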

6.2.3 Autoregressive Models

Statisticians accommodate the idea of a model by considering not-quite-random processes, of which the simplest representative type is the autoregressive (AR) model, given for example by

$$x_i = \alpha x_{i-1} + z_i, \qquad (6.25)$$

where z_i are samples from a white noise process.7 Evidently this is equivalent to the stochastic differential Eq. (6.11). A variety of such models exist, for example, the ARMA (autoregressive moving average) models, which replace the random term z_i by a weighted sequence Σ_k β_k z_{i−k}, and the ARIMA (autoregressive integrated moving average) models, which relate higher order differences of x_i to the applied noise.8 We will limit our discussion to the first order AR model (6.25). There are three issues of concern: identification, diagnosis and prediction. Typically, the form of the autocorrelation function is used in identifying a suitable form of model, and its order. In terms of diagnosis, one immediate question is how to choose

7 This is a first order process. More generally, x_i = Σ_{j=1}^p α_j x_{i−j} + z_i is an AR process of order p.

8 If we define the backwards difference operator ∇ by ∇x_i = x_i − x_{i−1}, then ARIMA models are, for example, of the form ∇^p x_i = Σ_k β_k z_{i−k}, and are discrete representations of a higher order differential equation.


the value of α in (6.25). Most obviously, in an actual data set the sum of squares of the residuals r_i = x_{i+1} − αx_i satisfies

$$\sum_i r_i^2 = \sum_i x_{i+1}^2 - 2\alpha \sum_i x_i x_{i+1} + \alpha^2 \sum_i x_i^2,$$

and so α can be estimated by taking the minimum of the quadratic, thus

$$\alpha = \frac{\sum_i x_i x_{i+1}}{\sum_i x_i^2}.$$
This is just the method of least squares. Often one is interested in prediction, and the point prediction for (6.25) is evidently just x_i = αx_{i−1}; but equally evidently, the noise term induces a drift in the prediction, which is best accommodated by allowing an evolving probability density for the prediction; indeed this is what the Fokker–Planck equation does. For a simple AR model (take α = 1 in (6.25)), the solution after m time steps (starting from x_0 = 0) is just x_m = Σ_{i=1}^m z_i, and by the central limit theorem this is a Gaussian, in keeping with the representation of the drift as a diffusive process.
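The least squares estimate above is easily checked numerically; the following sketch (our own Python illustration, with the parameter values assumed for the example) simulates (6.25) and recovers α:

```python
import numpy as np

rng = np.random.default_rng(0)

# simulate the AR(1) model (6.25): x_i = alpha*x_{i-1} + z_i, z_i white noise
alpha_true = 0.8
n = 20000
z = rng.standard_normal(n)
x = np.empty(n)
x[0] = z[0]
for i in range(1, n):
    x[i] = alpha_true * x[i - 1] + z[i]

# least-squares estimate: alpha = sum_i x_i x_{i+1} / sum_i x_i^2
alpha_hat = np.dot(x[:-1], x[1:]) / np.dot(x[:-1], x[:-1])
```

With n = 20,000 samples the standard error of the estimate is of order √((1 − α²)/n) ≈ 0.004, so α̂ lies very close to the true value.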

6.3 Phase Space Embedding

Now let us return to deterministic models. We are confronted with the same issues as discussed above, but now we take a different perspective. In particular, and in distinction to both the statistical and stochastic approaches outlined above, we develop an analysis which is nonlinear. The fundamental tool is the construction of a phase space from a single (and perhaps scalar) time series {x_i}. We do this by selecting an embedding dimension d_E and then choosing a d_E-dimensional vector X_i variously as

$$X_i = (x_i, x_{i-1}, \ldots, x_{i-d_E+1}), \quad X_i = (x_i, x_{i+1}, \ldots, x_{i+d_E-1}), \quad X_i = (x_{i-[\frac{1}{2}d_E]}, \ldots, x_{i+[\frac{1}{2}d_E]}), \qquad (6.28)$$
where [y] denotes the integer part of y. The first of these is called back projection, the second forward projection, and the third centre projection; generally, it is convenient to assume d_E is odd if centre projection is used.9 They are all topologically equivalent, and which is used is a matter of convenience. We thus obtain from the original scalar sequence {x_i} a sequence of vectors {X_i}, which we think of as tracing out a trajectory in the embedding phase space R^{d_E}.

9 If d_E is even, then the extreme coordinates can be taken as x_{i−½d_E} and x_{i+½d_E−1}, for example.


6 Diverse Applications

The idea behind phase space embedding is that if the sequence is deterministic, and either of finite dimension or with a finite-dimensional attractor, then if d_E is sufficiently large, the embedded trajectory will be a topological image of the trajectory in the actual dynamical phase space, and thus can be used to make pointwise predictions, much as is done in statistical time series analysis. The method is nonlinear, as it applies equally to systems whose dynamics are nonlinear, but the methodology involved is linear, as we shall see, and runs somewhat in parallel with the statistical approach.

The reason that a finite-dimensional attractor can be safely embedded in a finite-dimensional space is embodied by Takens' embedding theorem. The co-dimension10 of a k-dimensional manifold in R^d is d − k, and the intersection of two manifolds of co-dimension m_1 and m_2 has co-dimension m_1 + m_2. For example, two surfaces in R³ each have co-dimension one, and in general their intersection is a line (or lines) with co-dimension two. On the other hand, two lines with co-dimension two would generally intersect in a space with co-dimension four, thus of dimension −1! And, of course, they do not generally intersect. Takens' embedding theorem states that if an attractor has dimension ≤ k, then the embedding method described above 'works' if d_E ≥ 2k + 1, by which it is meant that the embedded trajectory will not intersect itself. If embedded in R^{d_E}, then the co-dimension of the attractor is d_E − k, and the co-dimension of the intersection of the attractor with itself is 2(d_E − k), and thus of dimension 2k − d_E. Thus if d_E ≥ 2k + 1, there is in general no intersection, as required.

An issue in the choice of embedding space is the choice of time lag. If for example we suppose that continuous time data x(t) is available (or frequently sampled), then the time lag Δ is the time interval between sample points used in the embedding (centre projection, with d_E odd):

$$X_i = \left( x(t - \tfrac{1}{2}d_E\Delta), \ldots, x(t - \Delta), x(t), \ldots, x(t + \tfrac{1}{2}d_E\Delta) \right). \qquad (6.29)$$
It is evident that the lag time should be neither too small nor too large. As an example, for a periodic sequence with period P, a useful choice has the time window τ_w = d_EΔ ∼ P, and generally Δ ≪ P. More generally, we might suppose a sequence might have a typical cycle time P, and the same precept applies. The question arises, how should one automate the choice of the lag time for a general sequence? This is the analogue of the question of identification in time series analysis, how to choose the order of the AR process used to model the data. We will come back to this question, but first we discuss the basic methodology which underlies the procedure.
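The back, forward and centre projections above are simple to construct; the following is a minimal Python sketch (the helper name `embed` is our own, not from the text):

```python
import numpy as np

def embed(x, d_E, mode="forward"):
    """Embed a scalar series {x_i} as d_E-dimensional vectors X_i (the rows of
    the trajectory matrix): back, forward, or centre projection."""
    x = np.asarray(x, dtype=float)
    N = len(x) - d_E + 1                                   # number of vectors
    M = np.column_stack([x[j:j + N] for j in range(d_E)])  # forward: (x_i, ..., x_{i+d_E-1})
    if mode == "forward":
        return M
    if mode == "back":      # (x_i, x_{i-1}, ..., x_{i-d_E+1}): reverse the columns
        return M[:, ::-1]
    if mode == "centre":    # same matrix; the reference index is the middle column (d_E odd)
        return M
    raise ValueError(mode)

M = embed([0.0, 1.0, 2.0, 3.0, 4.0], 3)
# rows of M: (0,1,2), (1,2,3), (2,3,4)
```

All three modes produce the same cloud of points up to a relabelling of coordinates, reflecting their topological equivalence.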

6.3.1 Singular Systems Analysis

The basic issue in linear time series analysis was the identification of a suitable AR model, estimation of its parameters, and the filtering of the predictive part of the sequence from the superimposed noise. This involved the use of the Fourier

10 See footnote 15 in Sect. 5.4.7.


spectrum, together with its inherent projection of the sequence on to the underlying Fourier components. In the present case, we follow a similar strategy, but now we use the data itself to determine the basis on to which to project the data. This method uses the technique of singular value decomposition (SVD).

Singular value decomposition is an algebraic process which has various uses. It represents a (generally non-square) matrix M in the form

$$M = U \Sigma V^T, \qquad (6.30)$$

where U and V are square and orthogonal, and Σ is diagonal. To see what this has to do with phase space embedding, we note first that the embedding produces a sequence of vectors {X_i}, where if there were n data points, there would be N = n − d_E + 1 vectors. This sequence defines an N × d_E trajectory matrix

$$M = (X_i^T),$$
which represents a cloud of N points in R^{d_E}, and in our case we will have N > d_E, though this is not essential to (6.30), but we assume it below. To understand singular value decomposition, observe that M^T M is a d_E × d_E real symmetric matrix which is positive definite,11 and therefore it has a sequence of positive eigenvalues σ_1² ≥ σ_2² ≥ ... ≥ σ_{d_E}², with corresponding orthonormal eigenvectors v_1, ..., v_{d_E}. Without loss of generality we take σ_i > 0. The vectors v_i are called singular vectors and the quantities σ_i are the associated singular values. Now MM^T is an N × N matrix of the same rank as M^T M and the same non-zero eigenvalues σ_i², with the rest therefore being zero. Indeed, if we have M^T M v = σ² v, then MM^T (Mv) = σ² Mv, so that we can define the non-null eigenvectors u_i of MM^T by defining

$$Mv_i = \sigma_i u_i, \qquad (6.32)$$

where this definition ensures that the vectors {u_i} are orthonormal, since v^T M^T M v = σ² = ||Mv||². Equally, we have

$$M^T u_i = \sigma_i v_i. \qquad (6.33)$$
(The summation convention is not used here.) Now let us define the d_E × d_E matrices D and V, and the N × d_E matrix U_d, as

$$D = \mathrm{diag}(\sigma_i), \quad V = (v_i), \quad U_d = (u_i); \qquad (6.34)$$

we then have

$$MV = U_d D.$$

11 Positive definite since v^T M^T M v = ||Mv||² ≥ 0, and can only be zero if Mv = 0, which means v is orthogonal to the whole data set; in which case we can re-define the embedding space to be the subspace orthogonal to v.

Fig. 6.5 The geometric basis of singular value decomposition. If a cloud of points is approximately linear, the first singular vector gives its direction (after the mean has been subtracted)

We let {u_j^⊥}, j = 1, ..., N − d_E, be an orthonormal basis for the null space of M^T. We define the N × (N − d_E) matrix U_⊥ and the (N − d_E) × d_E zero matrix O via

$$U_\perp = (u_j^\perp), \quad O_{jk} = 0;$$

then we also define the N × d_E matrix

$$\Sigma = \begin{pmatrix} D \\ O \end{pmatrix}$$

and the N × N orthogonal matrix

$$U = (U_d, U_\perp),$$

as a consequence of which we have

$$M^T U = V \Sigma^T.$$
Since U is an orthogonal matrix, i.e. UU^T = I_N, where I_N is the N × N identity matrix, it follows that M^T = VΣ^T U^T, and taking the transpose yields (6.30).

Let us think where we are going. We have a cloud of points in R^{d_E}. We assume that they have mean zero, so are centred on the origin. They will not, in general, be symmetrically placed; let us say they are more like an ellipsoid. In the extreme case, they might be arranged somewhat linearly, as in Fig. 6.5. How should we estimate the consequent direction vector of the points, v, and a measure of its magnitude σ? To answer this, let us consider the vector Mw, where w ∈ R^{d_E}. Mw is a vector in R^N, and its components are the scalar products X_i · w. If all the vectors X_i were lined up as in Fig. 6.5 along a vector v, then we would have Mw = 0 for all w ⊥ v. An equivalent way to select v is thus to maximise ||Mv||; to make this sensible we suppose v is a unit vector. Here the norm is the usual Euclidean norm. Equivalently we seek to maximise v^T M^T M v. Because of the variational principle for the eigenvalues of M^T M, we select v = v_1, and σ_1 measures the spread of the points.

To generalise this, think of the cloud of points as a number of equal point masses. We use {v_i} as a basis in R^{d_E}. The components of Mv_j are just the v_j components of the points X_i. Thus ||Mv_j||² = σ_j² is just the moment of inertia of the points about



the plane orthogonal to v_j. And since, in view of (6.32), Mv_i is orthogonal to Mv_j for i ≠ j, the v_j give the principal axes of inertia. So the singular vectors and their corresponding singular values give a recipe for finding where in R^{d_E} most of the points are.
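As a quick numerical illustration (a Python sketch of our own; the synthetic cloud and its parameters are assumptions), the first singular vector of a stretched cloud of points recovers its dominant direction, and σ_1/√N its spread:

```python
import numpy as np

rng = np.random.default_rng(1)

# a cloud of N points in R^3, stretched along a known unit vector v
v = np.array([1.0, 2.0, 2.0]) / 3.0
N = 5000
M = np.outer(5.0 * rng.standard_normal(N), v)   # spread ~5 along v
M += 0.1 * rng.standard_normal((N, 3))          # small isotropic scatter

# SVD of the trajectory matrix: M = U Sigma V^T
U, sigma, VT = np.linalg.svd(M, full_matrices=False)
v1 = VT[0]                      # first singular vector: direction of largest spread
spread = sigma[0] / np.sqrt(N)  # ~5, the standard deviation along v
```

The cloud is centred by construction; for real data, the mean must be subtracted first, as in the caption of Fig. 6.5.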

6.3.2 Time Lag Selection

Let us now return to the question of the selection of the time lag in (6.29). There is no unique way to do this, and a number of choices have been suggested. For example, since the autocorrelation function of a periodic time series is periodic, one might choose a time window based on, for example, the first minimum (if there is one) of the autocorrelation function. Given a choice of time window, the embedding dimension (and thus the time lag) can be chosen by making it sufficiently large that an apparent convergence in the singular value spectrum occurs. Here we will describe one particular method, based on the use of the singular value fraction. First we note that the Frobenius norm of M is

$$\mathrm{tr}\, M^T M = \sum_i \sigma_i^2.$$


This follows from (6.30), from which we find M^T M = V D² V^T, where D and V are defined in (6.34). Calculating the trace, we find that it is Σ_k σ_k² ||v_k||², whence the result. Supposing for illustration that we take a forward projection in (6.28), then we have that the (i, j)th element of M is

$$M_{ij} = x_{i+j-1}, \quad 1 \le i \le N, \ 1 \le j \le d_E,$$

and thus

$$\mathrm{tr}\, M^T M = \sum_{i=1}^{N} \sum_{j=1}^{d_E} x_{i+j-1}^2,$$

and for large N we thus have

$$\sum_{i=1}^{d_E} \sigma_i^2 \approx N d_E \sigma_d^2 = N d_E,$$

where σ_d² is the variance of the time series {x_i}, which we assume equal to one (this is easily achieved by rescaling the variables if necessary). We now proceed to define the singular value fraction. First we define the quantity

$$F_k = \frac{1}{N d_E} \sum_{i=1}^{k} \sigma_i^2,$$

which measures the power in the first k singular components. Note that, since σ_k decreases with k, the differences

$$F_k - F_{k-1} = \frac{\sigma_k^2}{N d_E}$$

form a decreasing sequence, so that F_k is a concave function. Since F_0 = 0 and F_{d_E} = 1, we see that

$$F_k \ge \frac{k}{d_E}. \qquad (6.46)$$

Equivalently, the sequence s_k = d_E F_k / k can be shown to be decreasing (use induction), and equals one at k = d_E.

The idea now is that a finite-dimensional attractor of dimension d_A embedded in a phase space will wander through a subspace of the embedding space, generally of a dimension greater than d_A. A good embedding is one in which the embedded trajectory is 'spread out' as much as possible, and thus ideally all the singular values of significance would be equal.12 If we suppose that the attractor does occupy the entire embedding space, then this implies F_k = k/d_E. This will of course not be generally true; but since F_k is concave, a way of choosing the time lag is to find the value of Δ for which, given k, the quantity

$$f_{SV}(k) = \frac{F_k - \dfrac{k}{d_E}}{1 - \dfrac{k}{d_E}}$$

is a minimum; if in particular we take k = 1, then we define the singular value fraction to be

$$f_{SV} = f_{SV}(1) = \frac{\sigma_1^2 - N}{N(d_E - 1)}. \qquad (6.49)$$

This is a number between 0 and 1; if f_SV = 0, then power is equally distributed in all the singular vectors, which is ideal. Given N and d_E, f_SV is a function of the lag Δ, and for sequences with a recognisable cycle time, choosing Δ to be the value at the first minimum gives a good choice of lag.

12 If the attractor does occupy a subspace of the embedding space, then the singular values will become zero at some point; more generally, for a noisy series, they will reach a 'noise floor', discussed further below.
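A minimal computation of the singular value fraction (6.49) might look as follows (a Python sketch under our own conventions: unit-variance rescaling, a forward embedding, and an integer lag measured in samples):

```python
import numpy as np

def f_sv(x, d_E, lag):
    """Singular value fraction (6.49) for a series rescaled to unit variance,
    using a forward embedding with integer lag (in samples)."""
    x = np.asarray(x, dtype=float)
    x = (x - x.mean()) / x.std()
    N = len(x) - (d_E - 1) * lag
    M = np.column_stack([x[j * lag : j * lag + N] for j in range(d_E)])
    s = np.linalg.svd(M, compute_uv=False)
    return (s[0] ** 2 - N) / (N * (d_E - 1))

theta = 0.02 * np.arange(10000)
x = np.sin(theta)
tiny = f_sv(x, 2, lag=79)  # lag ~ quarter period: power evenly spread, f_SV small
big = f_sv(x, 2, lag=1)    # tiny lag: nearly degenerate embedding, f_SV near 1
```

For a pure sine with d_E = 2, a lag near a quarter period spreads the power evenly between the two singular vectors (f_SV small), while a very small lag gives a nearly degenerate embedding with f_SV close to one.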



As an example, we consider the series

$$x = \sqrt{2}\, \sin \pi t,$$

for which, if N is large, it can be shown that

$$f_{SV} = 1 - \frac{1}{2(d_E - 1)} \left| \, d_E - \frac{\sin \pi \Delta d_E}{\sin \pi \Delta} \, \right|,$$

suggesting the choice Δ = 1/d_E. Application of this method to low-dimensional chaotic systems indicates its value, as measured for example by the prediction error, discussed below. It seems that it works well for chaotic series with a clear 'recurrence time', an example being the Rössler equations. It does not work (there is no minimum) for the Lorenz equations, but this is due to the fact that the Lorenz solutions cycle about two fixed points x = ±c, on account of the symmetry. It turns out that if the singular value fraction is applied to the series x² − c², it does provide a good diagnostic, and one could generalise this to more complicated systems in an obvious way.

It is evident that there is no necessity for uniform time lags to be used in embeddings, and more generally a backward embedding might be of the form

$$X(t) = \{x(t), x(t - t_1), \ldots, x(t - t_{d_E-1})\},$$
where the time lags t_i can be chosen arbitrarily. One situation where this can be useful is in the embedding of a time series derived from the solution of a delay equation. Consider, for example, the delay-differential equation

$$\varepsilon \dot{x} = -x + f(x_1), \qquad (6.53)$$

where x_1 ≡ x(t − 1), and we will assume f is a unimodal function, so that (6.53) can have chaotic solutions; if ε ≪ 1 these can be very irregular, as was shown in Fig. 6.1. Indeed, (6.53) is an infinite-dimensional system, and unsurprisingly its attractor has a dimension of order 1/ε, so that for very small ε, an enormous embedding space would be necessary. On the other hand, the presence of an explicit delay suggests that an irregular embedding of solutions of (6.53), say of the form

$$X(t) = \{x(t), x(t - \Delta), x(t - 2\Delta), \ldots, x(t - k\Delta), x(t - T)\},$$

may provide a useful representation of the data if T = 1. The reason for this is that if ε ≪ 1, then in practice the embedding dimension d_E = k + 2 will be less than the dimension of the attractor, and thus the embedded trajectory fills the embedding space. But if T = 1, the trajectory should collapse to a smaller volume. For example, a three-dimensional embedding

$$X(t) = \{x(t), x(t - \Delta), x(t - T)\} \equiv (x_1, x_2, x_3)$$

approximates (6.53) as the surface

$$\varepsilon (x_1 - x_2) = \Delta \{-x_1 + f(x_3)\}.$$
What this means is that if we plot the singular value fraction as a function of T , mostly it should have a low value, because the trajectory fills the embedding space, but at T = 1, the trajectory should lie on an approximate surface, and we can expect f SV to have a sharp peak. Computational results show that this is indeed the case.

6.3.3 Nonlinear Filtering

Singular value decomposition provides a useful method of filtering a time series. The idea is that we embed the time series as above to provide a sequence of points X_i ∈ R^{d_E}. We then perform the SVD, which provides an orthonormal basis {v_i} in R^{d_E} and an associated set of decreasing singular values σ_1 ≥ σ_2 ≥ .... Now while we may need d_E dimensions to provide a suitable embedding space, it may commonly be the case that the data inhabits a lower-dimensional subspace, and SVD provides a recipe to establish this. Specifically, if the sequence of singular values decreases to a noise floor, where σ_k ∼ σ_f √N for k > M, then we can filter the data by projecting on to the first M singular vectors, and this provides both a legitimate way of truncating a deterministic set of data, and also of eliminating superimposed noise. Figure 6.6 shows an example of this. The noise floor may refer to a level where there is little signal strength, or, as in Fig. 6.6, where there is superimposed noise of variance σ_f².

In more detail, suppose that the trajectory matrix (M) elements are X_i^j, where 1 ≤ i ≤ N and 1 ≤ j ≤ d_E. Any of the columns of M gives (most of) the time series x_i. For each i, the vector with components X_i^j is a d_E-dimensional vector, and therefore if v_k, k = 1, ..., d_E, is any orthonormal basis for R^{d_E}, there exist coefficients α_{ik} such that

$$X_i^j = \sum_{k=1}^{d_E} \alpha_{ik} v_k^j, \qquad (6.57)$$

where v_k^j is the jth component of v_k. The N × d_E matrix α obviously depends on the choice of the basis vectors v_k. If, for example, we choose v_k^j = δ_{kj}, then X_i^j = α_{ij}, and any column of α recovers the original time series.

The idea of filtering using SVD is that we choose the basis vectors to be the singular vectors v_k, and the elements of V^T are V_{ij}^T = v_i^j, bearing in mind that, for example in (6.34), the vectors v_i are column vectors. Comparing (6.30) with (6.57), we see that in this case α = UΣ = U_d D, and


Fig. 6.6 SVD applied to a solution of the Lorenz equations at values r = 28, σ = 10, b = 8/3, to which has been added white noise at each time step of variance 16, which is about a quarter of the variance of the Lorenz signal. The plot shows the scaled singular values σ_i/√N versus i for a central embedding using a time window τ = 0.36 with d_E = 37 (thus the lag time Δ = 0.01), and with N = 10,000

$$\alpha_{ik} = \sigma_k u_i^k$$

(not summed), and thus (6.57) takes the form

$$X_i^j = \sum_k \sigma_k u_i^k v_k^j.$$

Filtering consists of truncating the sum, selecting only a certain number of largest singular values, and centre projection identifies the filtered time series as

$$x_i^1 = \sum_k \sigma_k u_i^k v_k^{j^*},$$
where for odd d_E and centre projection, j* = ½(d_E + 1). (Back projection takes j* = d_E and forward projection takes j* = 1.) In matrix terms, we take the singular value decomposition M = UΣV^T and then compute the filtered matrix M_F = UΣ_F V^T, where Σ_F is obtained from Σ by putting all the neglected singular values to zero. The filtered time series is then simply the j*-th column of M_F.

This filtering procedure can be iterated. The filtered sequence of points x_i^1 provides a new trajectory matrix, from which a new set of singular vectors and singular values can be constructed. Applying the same filtering process yields a second iterate x_i^2. An example of this is shown in Fig. 6.7, where the same original data set as in Fig. 6.6 is filtered twice. Evidently the procedure can be iterated further, and at least for this data set, the sequence seems to settle down; but, just like an asymptotic expansion, continued filtering eventually starts stripping more signal than noise.
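The truncation just described can be sketched in a few lines of Python (our own illustration; the test signal and the choice keep = 2 are assumptions, chosen because a sine embeds in a rank-two subspace):

```python
import numpy as np

def svd_filter(x, d_E, keep):
    """One pass of SVD filtering: embed, zero all but the `keep` largest
    singular values, and read off the centre column j* of the filtered matrix."""
    x = np.asarray(x, dtype=float)
    N = len(x) - d_E + 1
    M = np.column_stack([x[j:j + N] for j in range(d_E)])
    U, s, VT = np.linalg.svd(M, full_matrices=False)
    s[keep:] = 0.0                 # neglected singular values set to zero
    MF = (U * s) @ VT              # filtered matrix M_F = U Sigma_F V^T
    jstar = (d_E - 1) // 2         # centre projection (d_E odd)
    return MF[:, jstar]            # filtered series, aligned with x[jstar : jstar + N]

# usage: a sine buried in noise, filtered by keeping two singular components
rng = np.random.default_rng(2)
i = np.arange(3000)
clean = np.sin(0.1 * i)
noisy = clean + 0.3 * rng.standard_normal(3000)
filt = svd_filter(noisy, d_E=21, keep=2)
```

Because the noise is spread over all d_E singular directions while the signal occupies only two, the truncation retains roughly a fraction 2/d_E of the noise power, which is the mechanism behind the noise-floor picture of Fig. 6.6.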



Fig. 6.7 An example of iterated filtering using SVD. A time series from a solution of the Lorenz equations is plotted against t, where a superimposed white noise has been added, as in Fig. 6.6; the panels share a vertical range of −20 to 20. The series is embedded in a phase space of dimension d_E = 37, and then iterative filtering is applied, as described in the text. Two iterates of the filtering process are shown

6.3.4 Prediction

When a time series is embedded in a phase space, the embedding can be used to make forward predictions. The simplest such method uses nearest neighbours in the phase space. For X_i ∈ R^{d_E}, we locate the K nearest neighbours X_{j_1}, ..., X_{j_K}, such that ||X_i − X_j|| > r for all j ∉ J, where J = {j_1, ..., j_K}; here r = max_{j_k ∈ J} ||X_i − X_{j_k}||. Evidently, there is a choice available in the prescription of K and r. We may fix K, in which case r will vary, and we expect better results for smaller r. Alternatively we may fix r, giving variable K (and perhaps K = 0). The first choice is thus the safer one. A simple predictor then just takes X_{i+1} to be the average of the one-step nearest neighbour forward iterates X_{j_k+1}, i.e.



$$X_{i+1} = \frac{1}{K} \sum_{k=1}^{K} X_{j_k+1}.$$
Other more complicated predictors may be used. One obvious strategy is to weight the influence of a nearest neighbour by a weight function w(r_k), where r_k is the vector X_{j_k} − X_i. A popular choice is to use radial basis functions for the weights.

For a given time series, one of course wants to know how accurate an approximation may be. No assessment of this can be made if the whole data set is used, and the usual practice is to use a fraction (say, a half) of the data set to 'train' the predictor. Thus we take j_k < ½N, which means that the one-step prediction of the second half of the time series can be assessed (since one actually knows the iterates). This allows a calculation of the prediction error, and this statistic is useful in assessing the quality of the embedding (as the error can be calculated for different embedding dimensions, lags and so on). Variants are also possible. One obvious one is to update the training set sequentially as one moves along the time series.
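A minimal nearest-neighbour predictor along these lines can be sketched as follows (a Python illustration of our own; fixing K and restricting neighbours to earlier points, as described above):

```python
import numpy as np

def nn_predict(X, i, K):
    """Predict X_{i+1} as the mean of the forward iterates of the K nearest
    neighbours of X_i among the earlier (training) points."""
    d = np.linalg.norm(X[:i - 1] - X[i], axis=1)  # distances to earlier points
    J = np.argsort(d)[:K]                         # indices j_k of the K nearest
    return X[J + 1].mean(axis=0)                  # average one-step forward iterates

# usage: embed a sine wave (forward projection, d_E = 2) and predict one step
x = np.sin(0.1 * np.arange(4000))
X = np.column_stack([x[:-1], x[1:]])
pred = nn_predict(X, 3000, K=3)
```

Comparing `pred` with the actual X_{3001} gives the one-step prediction error; repeating this over the second half of the series gives the prediction-error statistic mentioned above.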

6.4 Dimensions and Lyapunov Exponents

We have referred casually above to attractor dimension, without really saying what we mean by this. The first thing to say is that a chaotic attractor, or a chaotic (strange) invariant set, inhabits a subspace of the phase space which does not have an integer dimension. For example, the Lorenz equations (1.14) live in R³, but phase volumes contract (the divergence of the right-hand side is negative), and so the attracting set has dimension less than three. It is fairly evidently not one-dimensional (it would have to be a limit cycle), but is it a surface? If it is, it would have to be a rather complicated one. In fact, the Lorenz attractor is rather more than a surface, and it has a non-integer dimension which is somewhat greater than two.

In order to see this, it is necessary to define a way to assess the dimension of a set. There is no unique way to do this, and there are a number of different measures which are used. The simplest is the Hausdorff, or box-counting dimension, sometimes called the fractal dimension. To define this, we need a useful characterisation of the dimension of a set S ⊂ R^n. Suppose we cover the set with a (minimal) number N(r) of n-dimensional boxes of side r, i.e. having volume r^n; the number N(r) will depend on r, increasing as r decreases. We use this to define the box-counting dimension. To see how, imagine a line, surface or volume in R³. A line of length l will require N ∼ l/r boxes to cover it, thus N ∼ r⁻¹ as r → 0. Similarly, a surface of area A will have N ≈ A/r², so N ∼ r⁻² as r → 0. And equally for a volume, N ∼ r⁻³. This motivates us to define the dimension d of a bounded set S by the exponent d in the relation N ∼ r⁻ᵈ, or more precisely

$$d = \lim_{r \to 0} \frac{\ln N(r)}{\ln(1/r)}. \qquad (6.62)$$

How then does a non-integer dimension arise? To see this, let us go back to the Lorenz equations, and the Poincaré map indicated in Fig. 4.10, and more specifically its cartoon version in Fig. 4.11. It is evident that the dimension of the invariant set will be 2 + d_f, where d_f is the dimension of the intersection of the limit S_∞ = ∩_{n=1}^∞ S_{i_1 i_2 ... i_n} of the nested sets defined in (4.21) with a vertical line. This is because there is a dimension of one along the orbit as it travels from the Poincaré surface back to it, another dimension of one from the expanding (horizontal) direction in Fig. 4.11, and d_f from the dimension of the intersection of S_∞ with a vertical line through S_1 or S_2. If, as described after (4.21), we suppose there is a uniform vertical contraction by a factor of ε at each iteration, then at the nth iterate there are 2ⁿ strips, each of height lεⁿ⁻¹ (if the square in Fig. 4.11 has length l), and thus (6.62) implies

$$d_f = \frac{\ln 2}{\ln(1/\varepsilon)} < 1. \qquad (6.63)$$

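As a sketch of the recipe (6.62) (our own Python illustration), the middle-thirds Cantor set, whose fractal dimension is ln 2/ln 3 ≈ 0.6309, can be box-counted at a matching scale r = 3^(−level):

```python
import numpy as np

# box-counting estimate d = ln N(r)/ln(1/r), as in (6.62), for the
# middle-thirds Cantor set
def cantor_points(level):
    """Left endpoints of the 2^level intervals at the given construction level."""
    pts = np.array([0.0])
    for _ in range(level):
        pts = np.concatenate([pts / 3.0, pts / 3.0 + 2.0 / 3.0])
    return pts

def box_dimension(pts, level):
    r = 3.0 ** (-level)                                 # box size
    N = len(np.unique(np.floor(pts / r + 1e-9)))        # occupied boxes N(r)
    return np.log(N) / np.log(1.0 / r)

d = box_dimension(cantor_points(10), 6)  # count boxes of size 3^-6
```

At this scale exactly 2⁶ boxes are occupied out of 3⁶, so the estimate reproduces ln 2/ln 3 exactly; the small offset inside `floor` guards against floating-point endpoints sitting just below a box boundary.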
Box-counting is not an efficient way to compute the fractal dimension, but an alternative method is to use the Lyapunov exponents. Roughly speaking, Lyapunov exponents give the average rates of amplification of perturbations to a trajectory. When they are positive, they indicate (exponential) divergence of nearby trajectories, which implies sensitive dependence on initial conditions, and is thus an indicator of chaos. Let us consider a flow defined by the solutions of

$$\dot{x} = f(x), \quad x(0) = \xi, \qquad (6.64)$$

which defines a time t map

$$\xi \to F_t(\xi), \quad \text{where} \quad x(t) = F_t(\xi). \qquad (6.65)$$
Trajectories x + y sufficiently close to x are approximately described by the linearised flow and flow map

$$\dot{y} = Sy, \quad y \to T_t y, \qquad (6.66)$$

where

$$S = Df[x(t)], \quad T_t = DF_t(\xi) \qquad (6.67)$$

are the Jacobians of f and F_t with respect to x and ξ. Note that by differentiating T_t with respect to t,13 we have

$$\dot{T}_t = S(x) T_t. \qquad (6.68)$$

The eigenvalues μ of S give the local rate of exponential growth or contraction of small perturbations to a trajectory. Lyapunov exponents are associated with the long-term average of these growth rates, or more precisely their integrated form. These

13 Holding ξ fixed.



are associated with the singular values of the time t map T_t. For fixed ξ, we can define the singular values σ_i(t) of T_t as we did earlier; thus the σ_i² are the eigenvalues of the positive definite symmetric matrix T_t^T T_t. From our discussion of singular value decomposition earlier, we know that σ represents the expansion or contraction in the principal axes of inertia of a cloud of points surrounding ξ. An initial sphere of points y with radius δ is mapped to an ellipsoid with semi-major axes of length σ_i δ, directed along the corresponding singular vectors v_i. As time increases, we can think of the weighted singular vectors σ_i v_i attached to a point evolving both in direction and magnitude as we move along the trajectory.

To illustrate the relation to the eigenvalues of S, suppose S is constant and symmetric. Then T_t = e^{St}, T_t^T T_t = e^{2St}, (T_t^T T_t)^{1/2t} = e^S, and the eigenvalues μ of S are related to the singular values σ_i of T_t via

$$\mu_i = \frac{1}{t} \ln \sigma_i. \qquad (6.69)$$

This result generalises in the following sense. If the system is ergodic, then the limit

$$\lim_{t \to \infty} (T_t^T T_t)^{1/2t} = \Lambda \qquad (6.70)$$

exists (and is independent of the choice of ξ for almost all ξ); then we can define the Lyapunov exponents as

$$\lambda_i = \ln \omega_i, \qquad (6.71)$$

where ω_i are the eigenvalues of Λ. The result in (6.70) is Oseledec's multiplicative ergodic theorem.

We need to comment on the term 'ergodic' above. The ergodic theorem refers to the idea that the long time average of a quantity evaluated on a trajectory is the same as the spatial average of that quantity over the trajectory. In symbols,

$$\lim_{\tau \to \infty} \frac{1}{\tau} \int_0^\tau \varphi[F_t(\xi)]\, dt = \int_A \varphi(y)\, dM(y), \qquad (6.72)$$
for any continuous function φ, and the left hand side is independent of ξ (for almost all ξ). One has to be a bit careful about this, because if the attractor A is chaotic, then integration becomes a bit awkward. Also, (6.72) can at best be true for almost all ξ, since, for example, A is likely to contain periodic cycles, for which the statement will not be true. Strictly, one defines dM as a measure (and ∫_A dM = 1), but intuitively it is a Stieltjes integral over a probability distribution, or more obviously dM = g(ξ) dξ, where g is the probability density of trajectories on the attractor.

Generally, the ergodic theorem is more commonly the ergodic hypothesis, since in practice it is difficult to verify. It is a wish that everything is equally likely, so that a typical trajectory visits all parts of the attractor with the same probability as any other trajectory. The ergodic hypothesis is used in classical statistical mechanics. It seems reasonable both there, and also in chaotic dynamics, where some of our earlier descriptions of chaotic



sets arising for example in homoclinic bifurcations (such as the existence of dense orbits) are consistent with it.

The connection of the ergodic theorem with the Oseledec result (6.70) can be seen as follows. Since the matrix

$$W = T_t^T T_t = V \Sigma^2 V^T \qquad (6.73)$$

(where we write the singular value decomposition of T_t as T_t = UΣV^T) is positive definite and symmetric, we can define its logarithm ln W = 2V(ln Σ)V^T. Further, it is not difficult to show, using (6.68), that

$$\dot{W} = T_t^T (S + S^T) T_t, \qquad (6.74)$$

and further that W and Ẇ commute. This sensibly allows us to define derivatives of functions of W, and in particular

$$\frac{d(\ln W)}{dt} = W^{-1} \dot{W}, \qquad (6.75)$$

which allows us to write (6.70) in the form

$$\lim_{\tau \to \infty} \frac{1}{\tau} \int_0^\tau W^{-1} \dot{W}\, dt = \lim_{\tau \to \infty} \frac{1}{\tau} \int_0^\tau T_t^{-1} (S + S^T) T_t\, dt = 2 \ln \Lambda. \qquad (6.76)$$
The ergodic theorem then translates this to

$$\int_A \Phi(y, \xi)\, dM(y) = 2 \ln \Lambda, \qquad (6.77)$$

where

$$\Phi = T_t^{-1} (S + S^T) T_t, \qquad (6.78)$$

which gives (6.70) providing the integral in (6.77) exists. The problem here is that, while S = S(y), T_t depends on y and ξ, so Φ in (6.78) depends also on ξ, Φ = Φ(y, ξ). Of course, the ergodic theorem says this doesn't matter, but it is not easy to see why the integral should exist.
6.4.1 The Kaplan–Yorke Conjecture

We said earlier that Lyapunov exponents λ_i have a use in estimating the dimension of a chaotic attractor. First of all, we suppose the exponents are ordered, λ_1 ≥ λ_2 ≥ ..., etc. Second, we recall that if λ_1 > 0 for a system of dimension three or larger, then the system is chaotic, assuming it is bounded. Note also that for an autonomous

6.4 Dimensions and Lyapunov Exponents


system, at least one exponent (and generally only one) is zero, corresponding to time translation invariance: $y = \dot x$ satisfies the equation $\dot y = Sy$, which implies that
$$T_t f(\xi) = f[F_t(\xi)]. \quad (6.79)$$


Now from (6.70) and the singular value decomposition of $T_t$, it follows (cf. (6.32)) that for large time $T_t v \approx e^{\lambda t}u$, and thus the Lyapunov exponents are just
$$\lambda = \lim_{t\to\infty}\frac{1}{t}\ln\|T_t v\|, \quad (6.80)$$


where v is an eigenvector of W (or singular vector of $T_t$ at large t). In particular, if we assume a bounded chaotic attractor with mostly dense orbits, then $F_t(\xi)$ comes arbitrarily close to ξ for arbitrarily large t, so that (6.79) and (6.80) imply that λ = 0 for $v = f(\xi)$. Kaplan and Yorke conjectured that the fractal dimension discussed above in (6.62) would be related, possibly equal, to a quantity they called the Lyapunov dimension, defined as follows. Suppose that k is the maximum number of Lyapunov exponents such that
$$c(k) = \sum_{1}^{k}\lambda_i \ge 0; \quad (6.81)$$

then the Lyapunov dimension is defined as
$$D_L = k + d_f = k + \frac{\sum_{1}^{k}\lambda_i}{|\lambda_{k+1}|}. \quad (6.82)$$


This odd definition simply estimates the dimension as the value of k where c(k) = 0, with linear interpolation used between integer values. An apparently quite different suggestion was made by Mori; in an N-dimensional system, if we assume that there are m positive exponents and just one of zero, his estimate is
$$D_M = m + 1 + d_f = m + 1 + \frac{(N-m-1)\sum_{1}^{m}\lambda_i}{\sum_{m+2}^{N}|\lambda_i|}. \quad (6.83)$$
Note that if we assume a dissipative system ($\sum\lambda_i < 0$) and just one negative exponent, then $D_M = D_L$. This is the case for the Lorenz equations, for example. To understand the Kaplan–Yorke conjecture, we note that the volume of k-balls is expanding, but that of (k+1)-balls is contracting, suggesting $k < D < k+1$. Accepting this, the remaining question is, why would the fractional part be $\sum_{1}^{k}\lambda_i/|\lambda_{k+1}|$? To understand this, we argue by analogy with Fig. 4.11. For that figure, we estimated a fractal dimension of $2 + d_f$ (see (6.63)). The integer part came from the non-contracting part of the flow (thus k in (6.82)), while the fractional part $d_f = \frac{\ln 2}{\ln(1/\varepsilon)}$



came from the shrinkage by a factor ε in the return map, together with the fact that there were two strips which came back to the same square. In the Lorenz equations, this is due to the symmetry of the system (without the symmetry, the homoclinic orbit produces a single periodic orbit). More generally, the image of a set in the Poincaré surface under the action of the return map will generally consist of a number of strips, due to the folding of the section: see Fig. 6.8. The only sensible way to estimate the number of such strips is via the expansion of the set B in the unstable direction. If the recurrence time is T, then the unstable volume expansion ratio is
$$n_U = \exp\Bigl(\sum_{1}^{k}\lambda_i\,T\Bigr), \quad (6.84)$$



and this is also the maximum number of strips that can fit in the box B. On the other hand, these strips contract by a ratio $\exp\bigl(\sum_{k+1}^{N}\lambda_i\,T\bigr)$ in the orthogonal contracting space, but in fact it is only the smallest contraction which is important in estimating the box-counting dimension. Thus the image B′ is essentially one-dimensional in the contracting direction, and the relevant contraction ratio is

$$r_S = \exp\bigl(-|\lambda_{k+1}|\,T\bigr), \quad (6.85)$$


Fig. 6.8 A typical set of Smale horseshoes in a Poincaré return map. The oval B is mapped to the image B′, whence the horseshoes provide a strange invariant set. To make it attracting, we must suppose that the image (dashed) of the outer circle $\bar B$ is contained within $\bar B$, so that the image of the annulus exterior to B is strongly contracted. Additionally, there must be a reconnection mechanism to bring points outside B back inside it



and as in (6.63), the estimated dimension is
$$d_f = \frac{\ln n_U}{\ln(1/r_S)}, \quad (6.86)$$
which is just (6.82).
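The bookkeeping in (6.81)–(6.83) is easy to mechanise. A minimal sketch (the Lyapunov exponents used below are approximate published values for the Lorenz equations at the standard parameters, quoted only for illustration):

```python
def lyapunov_dimension(lams):
    """Kaplan-Yorke dimension (6.82): k is the largest index with
    c(k) >= 0, and the fractional part is c(k)/|lam_{k+1}|."""
    lams = sorted(lams, reverse=True)
    c, k = 0.0, 0
    for i, lam in enumerate(lams[:-1]):
        if c + lam < 0:
            break
        c, k = c + lam, i + 1
    return k + c / abs(lams[k])

def mori_dimension(lams):
    """Mori's estimate (6.83), with m positive exponents and one zero."""
    lams = sorted(lams, reverse=True)
    N = len(lams)
    m = sum(1 for lam in lams if lam > 0)
    return m + 1 + (N - m - 1) * sum(lams[:m]) / sum(abs(lam) for lam in lams[m + 1:])

# Approximate exponents for the Lorenz equations (r = 28): one positive,
# one zero, one strongly negative, so D_M = D_L as noted in the text.
lorenz = [0.906, 0.0, -14.57]
```

For these values both formulas give $D \approx 2.06$: slightly more than a surface, in keeping with the thinness of the Lorenz attractor.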

6.5 Fractals

Fractals are sets which live in between dimensions. Strange attractors are thus fractals. They can be curves which are more than curves, surfaces which are more than surfaces. They have become extremely well known in the popular imagination, because of their endless artistic qualities, and because of the irritating fascination of their properties: a curve which can be defined but not drawn, which is continuous but nowhere differentiable, which can fill space. Popular books on mathematics are inundated with them: the Koch snowflake, the Sierpiński gasket, the Mandelbrot set. Fractals really come from the world of pure mathematics, but have also attracted attention because of the idea that fractal surfaces occur (approximately) in nature. The shape of a coastline, the outline of a cloud, growing dendritic crystals, river networks: all of these have been described as fractals. Their common feature is that they are self-similar, and as we shall see, this quality provides the way to construct them, and it causes their dimension to be non-integral.

6.5.1 The Cantor Middle-Thirds Set

Possibly the most famous fractal is the Cantor set, whose construction mirrors that of the strange invariant set in a homoclinic bifurcation; indeed, its definition was introduced in Sect. 4.1.2. It was also detailed in question 4.7. The middle-thirds set is constructed as follows. We take the set $S_0 = [0, 1]$ and two functions:
$$\varphi_1(x) = \tfrac{1}{3}x, \qquad \varphi_2(x) = \tfrac{2}{3} + \tfrac{1}{3}x, \quad (6.87)$$


and we let $S_1 = \varphi_1(S_0)\cup\varphi_2(S_0)$, thus $S_1 = [0,\frac13]\cup[\frac23,1]$: we have excised the middle third of $S_0$ and created two new intervals, which we label as $S_{00}$ (the left one) and $S_{20}$ (the right one) for reasons which will become clear. We now repeat the process, excising the middle third of $S_{00}$ and $S_{20}$, yielding a new set $S_2$ consisting of four intervals:
$$S_2 = S_{000}\cup S_{020}\cup S_{200}\cup S_{220}, \quad (6.88)$$
where $S_{000} = [0,\frac19]$, $S_{020} = [\frac29,\frac13]$, $S_{200} = [\frac23,\frac79]$, $S_{220} = [\frac89,1]$.




Note also that $S_2 = \varphi_1(S_1)\cup\varphi_2(S_1)$. It is evident that this process can be continued, yielding nested sets of intervals
$$S_n = \varphi_1(S_{n-1})\cup\varphi_2(S_{n-1}), \quad (6.89)$$
and the Cantor set is then defined as its limit
$$S_\infty = \bigcap_{n=0}^{\infty} S_n. \quad (6.90)$$



The fractal dimension is easy to calculate. At the nth iterate, there are $2^n$ intervals, each of length $\frac{1}{3^n}$; the fractal dimension is thus $\frac{\ln 2}{\ln 3}$. The reason for the notation for the intervals in each union lies in the fact that if we use ternary notation for the fractions, and write fractions which terminate as $\ldots1000\ldots$ as $\ldots0222\ldots$, then for example $S_{00}$ contains all numbers between $0.0\dot 0$ and $0.0\dot 2$;¹⁴ similarly $S_{020}$ contains all numbers between $0.02\dot 0$ and $0.02\dot 2$. In general,
$$S_{j_1 j_2 \ldots j_n} = \{x : 0.j_1 j_2\ldots j_n\dot 0 \le x \le 0.j_1 j_2\ldots j_n\dot 2\}. \quad (6.91)$$


It is then clear that the Cantor set is just
$$S_\infty = \{x = 0.j_1 j_2 j_3\ldots,\; j_k = 0 \text{ or } 2\}, \quad (6.92)$$


and it is evident that it is closed, every point is a limit point, and it contains no intervals.
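The construction (6.89) is easy to carry out exactly with rational arithmetic. A minimal sketch (using Python's `fractions` module, so the interval endpoints are exact):

```python
from fractions import Fraction
import math

def cantor_level(n):
    """Apply phi1(x) = x/3 and phi2(x) = 2/3 + x/3 to each interval, n times."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        nxt = []
        for a, b in intervals:
            nxt.append((a / 3, b / 3))                                    # phi1
            nxt.append((Fraction(2, 3) + a / 3, Fraction(2, 3) + b / 3))  # phi2
        intervals = sorted(nxt)
    return intervals

S2 = cantor_level(2)   # the four intervals of (6.88)
S4 = cantor_level(4)   # 2^4 intervals, each of length 3^-4
dimension = math.log(2) / math.log(3)   # ~0.6309
```

The second level reproduces (6.88) exactly, and the scaling ($2^n$ intervals of length $3^{-n}$) gives the dimension $\ln 2/\ln 3$.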

6.5.2 Iterated Function Systems

The way in which the Cantor middle-thirds set is constructed is an example of an iterated function system, and this provides a good recipe for constructing fractals. Given a set B and a sequence of contraction mappings¹⁵ $\{w_i\}$, $i = 1,\ldots,n$ on B, we define a transformation $W: B \to B$ as
$$W(B) = \bigcup_{i=1}^{n} w_i(B).$$



We have $W(B) \subset B$. We can then define a sequence
$$W_0(B) = B, \qquad W_n(B) = W[W_{n-1}(B)],$$

¹⁴ The overdot indicates that the last digit is repeated indefinitely.
¹⁵ Contraction mappings uniformly shrink volumes.




Fig. 6.9 The black spleenwort fern Asplenium adiantum-nigrum, created by Barnsley’s famous iterated function system. Next to it, an actual fern (but not the black spleenwort), from Zealandia in Wellington, New Zealand

which forms a nested sequence having a non-empty limit set¹⁶
$$W_\infty = \bigcap_{n=0}^{\infty} W_n(B).$$



The same kind of coding of the iterates can be done as for the Cantor set, showing that the limit sets will commonly be fractal, and iterated function systems can be used to generate pleasing pictures, such as that of the 'fern' in Fig. 6.9. The trick here is the 'collage' theorem, which basically recommends using a set of contraction maps whose union has a boundary close to the outline of the desired object.
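Barnsley's fern can be generated by the 'chaos game': at each step apply one affine contraction, chosen at random with fixed probabilities, and record the orbit. A sketch using the standard published coefficients for the fern of Fig. 6.9 (iteration count and seed are arbitrary):

```python
import random

# Barnsley's affine maps x' = a x + b y + e, y' = c x + d y + f,
# each chosen with probability p (standard published coefficients).
MAPS = [  # (a, b, c, d, e, f, p)
    (0.00,  0.00,  0.00, 0.16, 0.0, 0.00, 0.01),
    (0.85,  0.04, -0.04, 0.85, 0.0, 1.60, 0.85),
    (0.20, -0.26,  0.23, 0.22, 0.0, 1.60, 0.07),
    (-0.15, 0.28,  0.26, 0.24, 0.0, 0.44, 0.07),
]

def chaos_game(n, seed=1):
    rng = random.Random(seed)
    x = y = 0.0
    pts = []
    for _ in range(n):
        r = rng.random()
        cum = 0.0
        for a, b, c, d, e, f, p in MAPS:
            cum += p
            if r <= cum:
                x, y = a * x + b * y + e, c * x + d * y + f
                break
        pts.append((x, y))
    return pts

pts = chaos_game(20000)
```

Plotting `pts` (e.g. with matplotlib) reproduces the fern; the orbit settles rapidly onto the attractor $W_\infty$.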

6.5.3 Julia Sets

Another situation where fractals occur is in the study of Julia sets. These arise in the study of the dynamics of iterated analytic maps in the complex plane, of the form
$$z \to f(z). \quad (6.97)$$


¹⁶ Non-empty because each sequence of iterates $w_i(B)$ has a fixed point, by the contraction mapping theorem.



We have come across maps throughout this book, but none in analytic form. As can be expected, the analyticity has a significant effect. The basic statement about the map (6.97) is that the complex plane is divided into different regions in which the dynamics of the map are 'similar'. These are open sets, called Fatou domains, and the boundary between them is the Julia set. To be specific, we will take f(z) to be a polynomial (of degree greater than one), and actually, we will use the quadratic map
$$z \to w(z) = z^2 - \mu \quad (6.98)$$
as a particular choice, though this is not essential, and the discussion is easily generalised. For a polynomial, the Julia set is the boundary of the set of points $z \in \mathbb{C}$ such that the iterates $f^k(z) \to \infty$. First, it is clear for the set $V: |z| \ge R$ that $|f(z)| > |z|$ if R is sufficiently large. Hence $f(V) \subset V$, and it is clear that the iterates $f^n(V)$ form a nested sequence whose limit is the point at infinity. Now equally we have $V \subset f^{-1}(V)$,¹⁷ and thus the inverse iterates $f^{-n}(V)$ form a growing sequence of sets. Since V was the exterior of a circle, it is easier to think of images of its complement $\bar V = \mathbb{C}\setminus V$, which is the interior of a circle. Thus the sequence $\{f^{-n}(\bar V)\}$ forms a shrinking nested sequence, which has a limit: this is a non-empty (since f has a fixed point) open set, and its boundary is the Julia set J. The Julia set divides $\mathbb{C}$ into invariant regions called Fatou domains $F_j$, in which the dynamics are relatively ordered. Equally, the Julia set is invariant, i.e. $f(J) = J$, but the dynamics of f on J are chaotic: points arbitrarily close to J (in $f^{-n}(V)$ for some n) go to ∞ under iteration of f, so since f is continuous, it has sensitivity to initial conditions on J. For example, if we take μ = 0 in (6.98), the Julia set is the unit circle. Under iteration of f, $z \to 0$ for $z \in \mathrm{int}(J) = F_1$, while $z \to \infty$ for $z \in \mathrm{ext}(J) = F_2$.
On J, we have $z = e^{i\theta}$, and the map is $\theta \to 2\theta$, which is chaotic, as we saw in Chap. 2 (Eq. (2.5)). Not all Julia sets are so smooth. Indeed, the example above is the exception rather than the rule. The underlying reason for this is due to the analyticity of f. If $f'(z) \ne 0$, then a neighbourhood of $z \in J$ is mapped conformally to a neighbourhood of $f(z) \in J$,¹⁸ which implies that the two neighbourhoods are self-similar. Unsurprisingly, this property leads to typically fractal Julia sets, an example of which is shown in Fig. 6.10. Images such as that in the figure can be computed in one of two ways. The escape time algorithm associates with each point z in the plane (numerically, on a grid) an integer n which is the number of iterates for it to reach V (which can be taken as any suitable shape), $f^n(z) \in V$. The Julia set is then the boundary of the points for which $n = \infty$. Numerically, we iterate up to some large value of n, and then we can image the filled Julia set by putting a black dot at each point with 'infinite' n.

¹⁷ Here we have an issue, as the inverse is not uniquely defined. For the moment, we simply choose one branch of the inverse function.
¹⁸ Angles are preserved under the transformation.
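The escape time algorithm just described can be sketched in a few lines (the escape radius R = 2 suffices for the map (6.98) when |μ| ≤ 2; the grid size and iteration cap are illustrative choices):

```python
def escape_time(z, mu, max_iter=100, R=2.0):
    """Number of iterates of z -> z^2 - mu needed for |z| to exceed R."""
    for n in range(max_iter):
        if abs(z) > R:
            return n
        z = z * z - mu
    return max_iter   # treated as 'infinite': z lies in the filled Julia set

def filled_julia_grid(mu, N=200, box=1.5, max_iter=100):
    """Escape times on an N x N grid over [-box, box]^2."""
    h = 2 * box / (N - 1)
    return [[escape_time(complex(-box + i * h, -box + j * h), mu, max_iter)
             for i in range(N)] for j in range(N)]
```

For μ = 0 the filled Julia set is the closed unit disc: interior points never escape, exterior points do, and colouring the finite escape times produces the familiar pictures.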



Fig. 6.10 The Julia set for the quadratic map (6.98), with μ = 0.999 − 0.251i

Equally, assigning a colour scale to the finite values of n yields some of the sumptuous artwork which is readily viewed on the Internet. An alternative is to use an iterated function system, as discussed above. This makes use of the fact that the inverse mapping $w \to z$ will typically have a number of different branches. For example, the inverse of (6.98) consists of the two branches
$$z = f_\pm^{-1}(w) = \pm(w + \mu)^{1/2}, \quad (6.99)$$


and each is a contraction on the initial set $\bar V$. In fact we may define a branch cut for $f_\pm^{-1}$ going from $-\mu$ to $-\mu + \infty e^{i\pi}$, and then the image of each $f_\pm^{-1}$ is a half-disc within $\bar V$. If we now define $W_0 = \partial\bar V$ to be the boundary of $\bar V$, and the iterated function system
$$F(W_0) = f_+^{-1}(W_0) \cup f_-^{-1}(W_0), \quad (6.100)$$
we see that $F^n(W_0) \to J$ as $n \to \infty$, and this provides a practical way to compute the Julia set. If we take μ to be real, we can relate the evolution of the Julia set of (6.98) as μ increases to that of the logistic map
$$\zeta \to \omega(\zeta) = 1 - \mu\zeta^2, \quad (6.101)$$


which was studied in Chap. 2. Indeed, if we write $w = -\mu\omega$, $z = -\mu\zeta$, then we regain (6.98).
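Rather than mapping the whole boundary curve forward as in (6.100), a common shortcut is random inverse iteration: repeatedly apply one of the two branches of (6.99), chosen at random, to a single point; after a transient the iterates accumulate on J. A minimal sketch (transient length and point count are arbitrary):

```python
import cmath
import random

def julia_points(mu, n=2000, seed=1):
    """Backward iterates z -> +/- sqrt(z + mu); after a transient the
    points lie (numerically) on the Julia set J of z -> z^2 - mu."""
    rng = random.Random(seed)
    z = complex(1.0, 1.0)
    pts = []
    for i in range(n + 200):
        z = cmath.sqrt(z + mu)
        if rng.random() < 0.5:
            z = -z
        if i >= 200:         # discard the transient
            pts.append(z)
    return pts
```

For μ = 0 this reproduces the unit circle; for a complex μ such as that of Fig. 6.10 it traces out the fractal boundary.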




Fig. 6.11 A sequence of Julia sets for the quadratic map (6.98), with increasing values of μ. From left to right, and top to bottom, the values are μ = 0.3, 0.618, 0.9, 1.2, 1.5, 1.8. The sets gradually collapse, as described in the text

As we have seen, when μ = 0, J is the unit circle. Essentially, as μ increases, the Julia set shrinks: at first it appears as an area with an odd boundary, but with further increase of μ, the enclosed area shrinks to zero, and there is no interior: the Julia set becomes a bit like the middle-thirds Cantor set. For μ = 2, J is the real interval [−2, 2], and for μ greater than this, the Julia set becomes a disconnected Cantor dust. Note that for μ > 2, the logistic map (6.101) loses the invariance of the interval [−1, 1], and most iterates escape to infinity. A sequence of Julia sets illustrating this behaviour is shown in Fig. 6.11.

6.5.4 The Mandelbrot Set

The Mandelbrot set, as shown in Fig. 6.12, is related to Julia sets, but not in an obvious way. It is defined as the set of points μ in the complex plane for which the origin z = 0 is not mapped to ∞ under iteration of the quadratic map (6.98), that is, for which the origin is in the Julia set or its interior (if it has one). It can be shown that this set of points is also precisely the set of values for which the corresponding Julia sets are connected.¹⁹

¹⁹ A set is connected if it cannot be partitioned as the union of two non-empty sets, both of which have no points in common with the closure of the other. It is pathwise connected if any two points



Fig. 6.12 The Mandelbrot set

The Mandelbrot set can be constructed using the escape time algorithm, but the use of colour is helpful, since (as is suggested by Fig. 6.12) it has a lot of wispy filaments. In particular, it contains a 'spire' extending along the real axis to the value μ = 2 (to the right of the figure). It is associated with the bifurcation structure of the logistic map; indeed its boundary gives the parameter values where bifurcations occur. For example, the large kidney-shaped bulb is associated with the stability of the non-trivial fixed point of the logistic map $x \to \lambda x(1-x)$; the relation of λ to μ is (cf. Fig. 2.13)
$$\lambda = (1 + 4\mu)^{1/2} + 1.$$


For (6.98), the corresponding fixed point is
$$z = \tfrac{1}{2}\bigl[1 - (1 + 4\mu)^{1/2}\bigr].$$


This fixed point is stable if $|f'(z)| < 1$, which corresponds to the region
$$\bigl|1 - (1 + 4\mu)^{1/2}\bigr| < 1.$$


can be joined by a path in the set, and simply connected if pathwise connected and any closed loop in the set can be shrunk to a point without leaving the set. The surface of a sphere is simply connected, the surface of a torus is pathwise connected, and $(-1, 0)\cup(0, 1)$ is not connected.



Solving for the boundary, we obtain $\mu + \frac14 = e^{i\theta}\cos^2\frac12\theta$, where θ is the polar angle with respect to $\mu = -\frac14$, and thus in the same polar coordinates,
$$r = \cos^2\tfrac12\theta,$$


which is the equation of a cardioid. This is the smooth contour of the large bulb of the figure. Similarly, the circle to its right is the stability region for the period-two cycle, and the smaller one to its right is that for the period-four cycle. The process carries on as for the Feigenbaum sequence, yielding smaller and smaller regions of stability. Further out past the limiting value of $\mu_\infty \approx 1.40\ldots$ (see Eq. (2.13)), there are other small regions, the most noticeable of which corresponds to the period-three window at $\mu \approx 1.75$. Many other bulbs can be seen around the periphery; those touching the main cardioid correspond to stability regions for period-q cycles having a winding number²⁰ of $\frac{p}{q}$, and they touch the cardioid at
$$(1 + 4\mu)^{1/2} - 1 = e^{i\theta}, \qquad \theta = \pi\left(\frac{2p}{q} - 1\right).$$


The large bulb to the right of the cardioid corresponds to period-two cycle stability, p = 1, q = 2; the bulbs at the bottom and top (respectively) of the cardioid represent period-three cycle stability boundaries, p = 1, 2, q = 3; and so on. For more details, see questions 6.8 and 6.9, and also Sect. 6.7.3. It is evident that there is a good deal of self-similarity in the figure, and exotic features can be traced down to indefinitely small scales.
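These formulas are easy to check numerically: the escape time test decides membership of the Mandelbrot set, and points strictly inside the cardioid $r = \cos^2\frac12\theta$ (in polar coordinates about $\mu = -\frac14$) should pass it. A sketch (radii, iteration counts and test points are illustrative choices):

```python
import cmath
import math

def in_mandelbrot(mu, max_iter=300, R=4.0):
    """Does the orbit of z = 0 under z -> z^2 - mu stay bounded?"""
    z = 0j
    for _ in range(max_iter):
        z = z * z - mu
        if abs(z) > R:
            return False
    return True

def cardioid_point(theta, scale):
    """mu = -1/4 + r e^{i theta} with r = scale * cos^2(theta/2);
    scale < 1 gives a point strictly inside the main cardioid."""
    r = scale * math.cos(theta / 2) ** 2
    return -0.25 + r * cmath.exp(1j * theta)
```

The spire along the real axis shows up too: μ = 2 has a bounded critical orbit, while slightly larger real μ escapes.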

6.6 Whither Turbulence?

We mentioned turbulence in Sect. 1.4 of Chap. 1, and again in Sect. 4.4.3 of Chap. 4. Chaos is clearly a fundamental constituent of the dynamical behaviour of all the subjects we have discussed in this book: bifurcations in ordinary differential equations, unimodal maps, area-preserving and dissipative maps, Hamiltonian systems, celestial mechanics. Turbulence in fluids, however, is somewhat different. Its most natural explanation would certainly lie in the idea that the Navier–Stokes equations of fluid flow have chaotic solutions, and that these correspond to the turbulent motions that are seen in experiment, but the extrapolation of ideas of chaos in finite-dimensional systems to systems which are fundamentally infinite-dimensional raises conceptual issues, about which little is known. We saw one example of this in Sect. 4.4.2, where the application of the homoclinic bifurcation analysis of Chap. 4 to a genuinely infinite-dimensional partial differential

²⁰ The average rate of rotation, in revolutions per iteration; this was defined following (3.159).



equation (i.e. one on an infinite spatial domain, not a finite domain) was discussed, and that analysis displayed at least one distinctive feature, the idea of spatial migration due to translation invariance. The question we want to consider is whether chaos in partial differential equations has the potential to explain turbulence in any useful sense. A tacit theme of this book has been the idea that chaotic behaviour can be explained either by the 'classical' route of bifurcation theory, leading to more and more complicated behaviour, or by the more direct 'explosive' approach through homoclinic bifurcations. This duality lends insight to the Lorenz equations and the logistic map, amongst other dynamical systems, but our view is that the homoclinic approach is the more fundamental, although historically, bifurcation theory came first, at least in dissipative systems. Now let us trace the history of studies of turbulence in this light.

6.6.1 Linear Stability

Reynolds's original experiments in the 1880s told us a good deal about shear flow turbulence in pipes, and some of this is discussed below. But the principal observation is that when the Reynolds number (computed as
$$Re = \frac{\rho\bar u d}{\eta}, \quad (6.108)$$


where ρ is fluid density, $\bar u$ is mean velocity, d is tube diameter, and η is fluid viscosity) exceeds a value of about $Re_c \approx 2{,}300$, the flow becomes disordered, as can be found by injecting ink at the inlet: the ink swirls around and mixes in the flow. Below $Re_c$ it maintains a steady straight trace through the pipe. Naturally enough, the first approach to understanding this was to study the stability of the flow. The uniform steady solution (Hagen–Poiseuille flow) is subjected to a small perturbation, and the equations are then linearised, leading eventually to the Orr–Sommerfeld equation for the perturbed stream function. This is a fourth-order ordinary differential equation eigenvalue problem, in which the eigenvalues λ are the growth rate exponents (the solutions are $\propto e^{\lambda t}$), and depend on an axial wavenumber k. If any eigenvalue has $\mathrm{Re}\,\lambda > 0$, then the steady state is unstable. The corresponding flow between two parallel plates is called plane Poiseuille flow, and is also an experimental system of interest. It is easier to analyse, because of the Cartesian coordinate system, and also somewhat easier to visualise experimentally. For plane Poiseuille flow, the eigenvalue problem can be solved numerically,²¹ and it is found that instability occurs for $Re \gtrsim 5{,}772$. (This large value also enables an asymptotic approach to be taken, which is complicated, but yields an accurate estimate.)

²¹ For plane Poiseuille flow, the Reynolds number is defined as in (6.108), but with d being half the distance between the plates, and $\bar u$ being the maximum velocity in laminar flow. If instead we wanted to retain d as the channel width and $\bar u$ as the mean velocity, then we would take $\frac43$ of the value in (6.108).
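As a quick numerical illustration of (6.108) (the fluid properties below are rough water-like values, chosen only to show the arithmetic):

```python
def reynolds(rho, u_bar, d, eta):
    """Reynolds number Re = rho * u_bar * d / eta, as in (6.108)."""
    return rho * u_bar * d / eta

# Water-like values: rho = 1000 kg/m^3, eta = 1e-3 Pa s, in a 1 cm pipe.
# A mean velocity of 0.23 m/s then sits right at Re_c ~ 2,300.
Re = reynolds(1000.0, 0.23, 0.01, 1.0e-3)
```

So for water in household-scale pipes, quite modest velocities already exceed the transitional value.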



So instability appears superficially to be consistent with experiment. But all is not well. In plane Poiseuille flow, instability (in the form of turbulence) occurs at Reynolds numbers of O(1,000), well short of the theoretical result. And for pipe Poiseuille flow, it is worse: the steady state appears to be linearly stable for all Reynolds numbers. So linear stability appears not to be the answer.

6.6.2 Nonlinear Stability

In the period after the Second World War, methods for treating (weakly) nonlinear stability effects appeared. These were based on Landau's earlier amplitude equation theory, and the bifurcation theorem of Hopf, earlier anticipated by Andronov. They were developed in the context of shear flows by Stuart, and for thermal convection by Malkus and Veronis; the constructive techniques use the method of multiple scales. The main result to come from these studies was the idea that a Hopf bifurcation could be subcritical or supercritical. For a supercritical bifurcation, instability of a uniform flow would lead to a small oscillation which grew with the bifurcation parameter (here the Reynolds number). In the case of pipe flow, this is not seen. On the other hand, a subcritical bifurcation causes an unstable periodic solution to bifurcate from the steady state. The situation for each type of bifurcation was discussed in Chap. 1 and in more detail in Chap. 3, and was shown in Fig. 3.4. It turns out that the bifurcation for plane Poiseuille flow is subcritical, so that an unstable travelling wave solution exists for $Re < Re_c$, where $Re_c$ is the critical Reynolds number for instability. This solution, which is two-dimensional (in keeping with the fact that two-dimensional modes are the most unstable: Squire's theorem), exists for Reynolds numbers down to about 2,800, where presumably it turns round, as schematically illustrated in Fig. 6.13, through a saddle-node bifurcation to an upper branch solution which would be stable to two-dimensional perturbations. It turns out that below Re = 2,800, quasi-steady two-dimensional structures exist which relax (two-dimensionally) on a slow viscous time scale to equilibrium, but they are rapidly unstable to three-dimensional perturbations, due to Rayleigh's theorem,

Fig. 6.13 Schematic illustration of the bifurcating two-dimensional travelling wave solution in plane Poiseuille flow. The amplitude of the perturbation about the mean flow is denoted as A






which shows that a sufficient condition for inviscid instability (thus on an inertial time scale) is that the transverse shear flow profile has an inflection point. This is not the case for the uniform steady state, but it is the case for a finite amplitude transverse streak-like structure, corresponding to the transient two-dimensional perturbation. More recently, unstable steady streak-like waves have been found to exist in pipe flow, and it seems that the surface on which these structures exist in the phase space of solutions may separate regions where solutions collapse to the laminar state from regions where the attractor is a turbulent state. It has even been suggested that the dynamics on this (invariant) surface may be chaotic, due to the existence of heteroclinic connections between the different streak patterns, which sounds just like a Julia set. The description is also reminiscent of the Lorenz equations, where the strange invariant set born in the homoclinic bifurcation becomes a strange attractor when the invariant set collides with the unstable periodic orbits, which is also when there is a heteroclinic connection between the origin and the periodic orbits (see Sect. 4.2.2). Pursuing this analogy, we might suppose that turbulent trajectories are created by a homoclinic orbit from the laminar state to itself, and that the strange trajectories become attracting when there is a heteroclinic connection from the laminar state to the unstable streak-like solutions.

6.6.3 Experimental Observations in Shear Flows

Some observations of pipe or boundary layer flows are of significance in this regard. The most pertinent is that in a pipe flow, there is no sudden transition from a fully laminar flow to a fully turbulent flow throughout the length of the pipe; rather, isolated turbulent 'puffs' and 'slugs' occur, as indicated in Fig. 6.14. Their occurrence depends both on the Reynolds number and on the amplitude of the disturbance at the inlet (if there is no disturbance, the flow remains laminar).

Fig. 6.14 Parametric locations of puffs and slugs in pipe flow as a function of Reynolds number Re and disturbance amplitude A. Adapted from Wygnanski and Champagne (1973), with permission of Cambridge University Press. The amplitude of the inlet disturbance A is measured in percent, presumably as inlet velocity fluctuation divided by mean velocity








0.1 laminar

10 3

10 4


10 5



Fig. 6.15 Image of a turbulent spot in a boundary layer. Image from Milton van Dyke, An Album of Fluid Motion, Stanford, Parabolic Press, 1982, image 109, page 64. The original image is from the paper by B. Cantwell, D. Coles and P. Dimotakis, Structure and entrainment in the plane of symmetry of a turbulent spot, Journal of Fluid Mechanics 87 (4), 641–672 (August 1978), Fig. 6(a). Reproduced by permission of Cambridge University Press

Turbulent slugs occur because the disturbance at the inlet renders the velocity profile unstable (presumably by creating an inflectional profile), and a local transition to a slug of turbulent flow occurs; the slug grows as the flow progresses downstream, but remains of finite extent and is surrounded by laminar flow. The flow thus consists of patches of turbulence, and this phenomenon is called intermittency. In a boundary layer flow over a flat plate, these patches of turbulence are manifested as turbulent 'spots', as shown in Fig. 6.15. Turbulent puffs occur at much larger disturbance amplitudes, when the Reynolds number is too low to maintain the turbulent patches. They are thus regions of localised turbulence in the process of relaminarisation. It is possible to understand the structure of puffs and slugs in terms of a simple model in which one presumes a bistable transition between laminar and turbulent 'states' described by a turbulent 'amplitude' (consistent with the earlier described numerical results for pipe flow), but there remains the issue of describing what this turbulent state actually is.

6.6.4 A Homoclinic Connection?

The simple hypothesis would be that there is a homoclinic orbit from the laminar state back to itself at some critical value of the Reynolds number. If this is the case, then local homoclinic bifurcation analysis suggests that a strange invariant set will come into existence, much as occurs for ordinary differential equations,



as described in Chap. 4. However, whereas finite-dimensional systems produce a strange set of unstable periodic orbits, the corresponding items produced in a partial differential equation having a localised homoclinic orbit²² are modulated (doubly periodic) travelling waves, or wave packets. The resemblance of such structures to the turbulent spot in Fig. 6.15 is enticing. Is there any hope of finding such a homoclinic orbit? Well, it would be hard. The simplest way to find homoclinic orbits in ordinary differential equations is through an orbit-following programme such as AUTO, which essentially follows an initial periodic orbit in parameter space by Newton iteration. This involves shooting from a neighbourhood of the fixed point and then aiming to choose the initial conditions to suppress the unstable components as the trajectory approaches the fixed point. In more detail, suppose we seek a homoclinic orbit to a fixed point where the unstable manifold is $n_u$-dimensional and the stable manifold is $n_s$-dimensional, and the bifurcation parameter is μ. Selecting a value of one of the unstable components as an initial Poincaré surface, one then integrates for a time T till closest approach to the fixed point, and the idea is to choose the remaining $n_u - 1$ components and the parameter value μ in order to eliminate the $n_u$ unstable components at t = T. More or less the same principle would apply in orbit-following for a partial differential equation, but the procedure rapidly becomes intractable as $n_u$ increases. Are there any analytic clues which might help? One of the features of turbulent flows is that the fluctuations are of relatively small amplitude, and this naturally suggests an asymptotic approach, in which the amplitude of the fluctuations scales with some inverse power of the Reynolds number.
Of course, sufficiently small amplitude fluctuations are described by the linear Orr–Sommerfeld equation, which itself has boundary layer behaviour associated with the large value of the Reynolds number. It turns out that if the amplitude A of the fluctuations is of $O(Re^{-2/3})$, then nonlinear terms become significant in the critical layer of linear analyses. This suggests the as yet unrealised possibility of studying a suitably reduced model of the Navier–Stokes equation for the existence of homoclinic orbits.
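The shooting idea described above can be illustrated on a toy problem. This is emphatically not the Navier–Stokes setting: the system $\dot x = y$, $\dot y = x - x^2 + \mu y$ is chosen only because it is known to have a homoclinic orbit to the saddle at the origin exactly at μ = 0, so bisection on the shooting 'miss' should recover that value:

```python
import math

def rhs(x, y, mu):
    """The toy system x' = y, y' = x - x^2 + mu*y."""
    return y, x - x * x + mu * y

def rk4_step(x, y, mu, h):
    k1x, k1y = rhs(x, y, mu)
    k2x, k2y = rhs(x + h / 2 * k1x, y + h / 2 * k1y, mu)
    k3x, k3y = rhs(x + h / 2 * k2x, y + h / 2 * k2y, mu)
    k4x, k4y = rhs(x + h * k3x, y + h * k3y, mu)
    return (x + h / 6 * (k1x + 2 * k2x + 2 * k3x + k4x),
            y + h / 6 * (k1y + 2 * k2y + 2 * k3y + k4y))

def miss(mu, eps=1e-3, h=2e-3, max_steps=100000):
    """Shoot along the unstable manifold of the saddle at (0,0).
    Returns +1 if the orbit gains energy and overshoots (crosses x = 0),
    -1 if it loses energy and turns back inside the loop."""
    lam = (mu + math.sqrt(mu * mu + 4)) / 2    # unstable eigenvalue
    x, y = eps, eps * lam                      # point on the unstable eigenvector
    past_top = False
    for _ in range(max_steps):
        x, y = rk4_step(x, y, mu, h)
        if y < 0:
            past_top = True
        if past_top and x <= 0:
            return 1
        if past_top and y >= 0:
            return -1
    raise RuntimeError("no event detected")

# Bisect on the sign of the miss: the homoclinic connection lies between
# a parameter value that overshoots and one that undershoots.
lo, hi = -0.2, 0.2
for _ in range(30):
    mid = (lo + hi) / 2
    if miss(mid) > 0:
        hi = mid
    else:
        lo = mid
mu_star = (lo + hi) / 2    # should be ~0 for this system
```

This is the one-unstable-direction case ($n_u = 1$), where only μ needs adjusting; as the text notes, with more unstable components the same idea requires tuning $n_u - 1$ initial conditions as well, and quickly becomes intractable.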

6.6.5 Practical Turbulence

While the quest for a homoclinic orbit may provide an answer to why turbulence occurs, it may do little for the practical issue of computing actual turbulent flows. Classical such approaches use various closure schemes, of which the simplest may be the assumption of an eddy viscosity, such that the Reynolds stress (mentioned in Sect. 1.5, defined by the left-hand side below, and derived in question 6.9) is
$$-\rho\,\overline{(u_i u_j)} = \mu_T\left(\frac{\partial\bar u_i}{\partial x_j} + \frac{\partial\bar u_j}{\partial x_i}\right), \quad (6.109)$$

²² That is, the orbit is local in both time and space.





where $\bar u$ is the mean velocity, u is the fluctuation velocity, the overbar denotes a (local) time average, and $\mu_T$ is the eddy viscosity. It is an extraordinary fact that Prandtl's mixing length theory, applied to pipe flow, gives an expression for the drag that precisely agrees with measurements. This suggests that the rather arbitrary self-similar assumption of the theory contains the essence of the truth. The idea is also consistent with the notion of a self-similar energy cascade from large scale eddies to smaller scale eddies, culminating when viscosity dissipates the energy at the Kolmogorov length scale, and famously described by Lewis Fry Richardson:

We realise thus that: big whirls have little whirls that feed on their velocity, and little whirls have lesser whirls and so on to viscosity—in the molecular sense.

What is needed in the closure problem is a recipe for the evolution of the fluctuation velocity distribution at the microscale. There is an analogy here with the derivation of the Navier–Stokes equations from the Boltzmann equation, which precisely produces (6.109) (for the laminar viscosity) by describing the evolution of the velocity distribution in terms of particle collisions; what is needed for the turbulent energy cascade is a comparable recipe describing the interaction of two ‘eddies’. One possibility would be to idealise a turbulent flow as consisting of a population of line vortices, and the ‘collisions’ between vortices would provide the analogue of particle collisions in kinetic theory. Some such rendition of the Navier–Stokes equations is necessary to progress.

6.7 Notes and References

6.7.1 Time Series

Stochastic processes are discussed in the book by Gardiner (2009). A number of applications of current interest are discussed in the book by Krapivsky et al. (2010): in particular these include aggregation, fragmentation, and nucleation and crystal growth. Statistical methods of forecasting time series are discussed by Box and Jenkins (1970); a basic exposition of the subject is given by Diggle (1990). The linear ARIMA models lend themselves to Fourier analysis, and the use of power spectra in analysing fluid dynamical experiments was a focal point for the early work of Swinney and Libchaber, amongst others, as mentioned in Sect. 2.5.3. Turcotte (1997) gives a whole range of geophysical applications in which scale-independent power spectra occur, for example earthquake size distributions, rock grain size distributions, and so on.



6.7.2 Phase Space Embedding

The use of an embedding dimension is described by Takens (1981), for example. Singular value decomposition, which survives under a whole range of different names, such as pseudo-inverse theory, principal component analysis, and Karhunen–Loève decomposition, is a classical part of numerical linear algebra, described for example by Golub and van Loan (1989). The choice of an appropriate time lag to use in the embedding method has been quite varied; our discussion here follows a method due to Kember and Fowler (1993), and its implementation for delay equations is given by Fowler and Kember (1993).

The Pseudo-Inverse

The pseudo-inverse of an $N \times d_E$ matrix $M$ is defined by the $d_E \times N$ matrix
$$M^+ = V\Sigma^+ U^T, \qquad (6.110)$$
where, if $\Sigma$ is defined by (6.37), then
$$\Sigma^+ = (D^{-1},\, O^T), \qquad (6.111)$$
and is a $d_E \times N$ matrix. It is immediate from this definition that if $x \in \mathbb{R}^{d_E}$, then
$$M^+ M x = x, \qquad (6.112)$$


hence the name pseudo-inverse. The use of this is in the solution of the equation
$$Mx = c, \qquad (6.113)$$
where $x \in \mathbb{R}^{d_E}$ and $c \in \mathbb{R}^N$. If $N > d_E$, then no solution is generally possible, and the best one can hope for is a least squares fit, such that the norm of the residual
$$r = c - Mx \qquad (6.114)$$
is minimised. Note that $c \in \mathbb{R}^N$, so we may write
$$c = \sum_1^N \lambda_i u_i \qquad (6.115)$$
for some choice of $\lambda_i$. Suppose equally that
$$x = \sum_1^{d_E} \mu_i v_i \in \mathbb{R}^{d_E}. \qquad (6.116)$$
From (6.114) we then have, using (6.33), together with the orthogonality of both $u_i$ and $v_i$,
$$r^2 = \sum_1^N \lambda_i^2 - 2\sum_1^{d_E} \lambda_i\sigma_i\mu_i + \sum_1^{d_E} \sigma_i^2\mu_i^2 = \sum_{d_E+1}^N \lambda_i^2 + \sum_1^{d_E} (\lambda_i - \sigma_i\mu_i)^2, \qquad (6.117)$$
and clearly $r^2$ is minimised by the choice
$$\mu_i = \frac{\lambda_i}{\sigma_i}, \quad 1 \le i \le d_E, \qquad x = \sum_1^{d_E} \frac{\lambda_i}{\sigma_i}\,v_i. \qquad (6.118)$$
Now, putting $x = v_i$ in (6.112), we see that, using (6.32),
$$M^+ u_i = \frac{v_i}{\sigma_i} \quad \text{for } i = 1,\ldots,d_E, \qquad (6.119)$$
and therefore from (6.118),
$$x = \sum_1^{d_E} \lambda_i M^+ u_i = M^+ c, \qquad (6.120)$$
justifying the term pseudo-inverse for $M^+$. The use of singular systems analysis and its relation to moving window spectral methods is discussed by Fowler and Kember (1998).
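As a concrete check of this construction, the following sketch (in Python with NumPy, since the book's own codes are in Matlab) builds $M^+$ from the SVD and verifies both $M^+Mx = x$ and the least squares property; the sizes chosen for $N$ and $d_E$ are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
N, dE = 12, 4                        # N > d_E, as in the text
M = rng.standard_normal((N, dE))

# SVD: M = U Sigma V^T, with U (N x N) and V (d_E x d_E) orthogonal
U, sigma, Vt = np.linalg.svd(M)

# Sigma^+ = (D^{-1}, O^T), a d_E x N matrix
Sigma_plus = np.zeros((dE, N))
Sigma_plus[:, :dE] = np.diag(1.0 / sigma)

# the pseudo-inverse M^+ = V Sigma^+ U^T
M_plus = Vt.T @ Sigma_plus @ U.T

# M^+ M x = x for any x in R^{d_E}
x = rng.standard_normal(dE)

# M^+ c is the least-squares solution of M x = c
c = rng.standard_normal(N)
x_ls = M_plus @ c
```

The last assertion below confirms that $M^+c$ agrees with the minimiser of $|c - Mx|$ obtained by a standard least squares routine.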

6.7.3 Dimensions and Fractals

A thorough review of various definitions of dimensions is given by Eckmann and Ruelle (1985). As mentioned above, a range of geophysical applications for fractals is given by Turcotte (1997). McGuinness (1983) describes some of the difficulties of computing fractal dimensions. The Kaplan–Yorke conjecture is due to Kaplan and Yorke (1979); see also Frederickson et al. (1983), Nichols et al. (2003), and Gröger and Hunt (2013), for example. The suggested definition of dimension by Mori is given in Mori (1980). Oseledec's multiplicative ergodic theorem can be found in the paper by Oseledec (1968). The recipe given by Barnsley (1988) for the black spleenwort fern illustrated in Fig. 6.9 is the following: it is an iterated function scheme based on four affine maps of the form



$$\begin{pmatrix} x \\ y \end{pmatrix} \to \begin{pmatrix} r\cos\theta & -s\sin\phi \\ r\sin\theta & s\cos\phi \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} h \\ k \end{pmatrix}, \quad i = 1, 2, 3, 4, \qquad (6.121)$$

Table 6.1 The mappings used for the black spleenwort fern in Fig. 6.9, as given by Barnsley (1988). The map column indicates the value of i in (6.121); the angles are in degrees.

Map | Translations h, k | Rotations θ, φ | Scalings r, s
 1  | 0, 0              | 0, 0           | 0, 0.16
 2  | 0, 1.6            | −2.5, −2.5     | 0.85, 0.85
 3  | 0, 1.6            | 49, 49         | 0.3, 0.34
 4  | 0, 0.44           | 120, −50       | 0.3, 0.37
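The iterated function scheme (6.121) with the constants of Table 6.1 can be sketched as follows. The probability weights used to choose among the four maps are an assumption (roughly Barnsley's, weighted by map area) and are not part of (6.121) itself; equal weights also converge to the fern, just more slowly.

```python
import math
import random

# (r, s, theta, phi, h, k) for the four maps of Table 6.1; angles in degrees
MAPS = [
    (0.0,  0.16,   0.0,   0.0, 0.0, 0.0),
    (0.85, 0.85,  -2.5,  -2.5, 0.0, 1.6),
    (0.3,  0.34,  49.0,  49.0, 0.0, 1.6),
    (0.3,  0.37, 120.0, -50.0, 0.0, 0.44),
]

def apply_map(point, m):
    """One affine map of (6.121): rotation/scaling matrix plus translation."""
    r, s, th, ph, h, k = m
    th, ph = math.radians(th), math.radians(ph)
    x, y = point
    return (r*math.cos(th)*x - s*math.sin(ph)*y + h,
            r*math.sin(th)*x + s*math.cos(ph)*y + k)

def fern(n, seed=1):
    """Iterate the scheme from the origin; the orbit traces the attractor."""
    rnd = random.Random(seed)
    p, pts = (0.0, 0.0), []
    for _ in range(n):
        m = rnd.choices(MAPS, weights=(0.01, 0.85, 0.07, 0.07))[0]
        p = apply_map(p, m)
        pts.append(p)
    return pts
```

Plotting the points returned by `fern(20000)` reproduces the fern of Fig. 6.9, which occupies roughly the box $|x| < 3$, $0 \le y \lesssim 10$.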
where the values of the constants are given in Table 6.1.

Julia sets are described in some detail by Milnor (2006), a book whose origins as 1990 lecture notes can be found online. The material is theoretical, but there are some nice illustrations of various Julia sets (none of them the same as the ones shown in Figs. 6.10 and 6.11). Another nice exposition is in the book by Falconer (2003).

Julia sets and Mandelbrot bulbs

In three exercises (6.7, 6.8 and 6.9), we use techniques of perturbation theory to analyse the structure of Julia sets and the Mandelbrot set. The first and third questions are successful, but the second is not, and it is worth discussing why. The issue for question 6.8 is the size of the 'bulbs' in the Mandelbrot set in Fig. 6.12. The Mandelbrot set consists of the black parts of the figure, and at this scale consists of a central cardioid surrounded by a sequence of approximate circles tangent to its boundary. In turn, these circles have further tangent circles, but it is the primary circles which form the focus of the question, which is that of determining their size. Each point of tangency is associated with loss of stability of one of the fixed points of the logistic map $z \to z^2 - \mu$ through bifurcation to a stable period-q orbit associated with a winding number $p/q$. Since the circles become smaller and smaller as q increases, a perturbation method to determine the small circles, which are themselves the parametric regions of stability of the period-q orbits, is suggested. And indeed, question 6.8 does derive an approximate formula for these circles, of the form
$$\left|(\mu - \mu^*)e^{-i\psi} - ia\right| < a, \qquad a = \frac{1}{2q}\sin\frac{\pi p}{q}. \qquad (6.122)$$
Here $\mu^*$ is the value of $\mu$ at the point of tangency, $\psi$ is the angle of the tangent, and $a$ is the bulb radius. The problem is that the radius is not very accurate, and a much better pragmatic estimate is to use the radius
$$a_q = \frac{1}{q^2}\sin\frac{\pi p}{q}, \qquad (6.123)$$




Fig. 6.16 The Mandelbrot set, overlain with the main cardioid bulbs of periods 2, 3, 4, 5, with their estimated circular form (in blue) from (6.122), using a = aq from (6.123)
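The estimates (6.122) and (6.123) are easy to probe numerically. The sketch below (Python; the helper names are our own) computes the approximate centre of a period-q bulb from the tangency point, tangent angle and radius of question 6.8 and (6.123), and then detects the period of the attracting cycle of the critical orbit of $z \to z^2 - \mu$; the exact period-two bulb $|\mu - 1| < \frac14$ has its superattracting centre at $\mu = 1$.

```python
import cmath
import math

def bulb_center(p, q):
    """Approximate centre mu* + i a_q e^{i psi} of the p/q bulb, using
    alpha = pi(2p/q - 1), mu* = -1/4 + e^{i alpha} cos^2(alpha/2),
    psi = (3/2) alpha - pi/2, and a_q = sin(pi p/q)/q^2 from (6.123)."""
    alpha = math.pi*(2*p/q - 1)
    mu_star = -0.25 + cmath.exp(1j*alpha)*math.cos(alpha/2)**2
    psi = 1.5*alpha - 0.5*math.pi
    a_q = math.sin(math.pi*p/q)/q**2
    return mu_star + 1j*a_q*cmath.exp(1j*psi)

def attracting_period(mu, n_settle=5000, max_q=8, tol=1e-6):
    """Iterate the critical orbit of z -> z^2 - mu and report the period of
    the attracting cycle it settles onto (None if the orbit escapes)."""
    z = 0j
    for _ in range(n_settle):
        z = z*z - mu
        if abs(z) > 4:
            return None
    for q in range(1, max_q + 1):
        w = z
        for _ in range(q):
            w = w*w - mu
        if abs(w - z) < tol:
            return q
    return None
```

The tests confirm that the approximate bulb centres for $p/q = 1/3$ and $1/4$ do lie inside the true bulbs: the critical orbit is attracted to cycles of the predicted periods.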












which looks very good (though not perfect), even for low values of q, as shown in Fig. 6.16. Why this should be is unclear. More to the point, why is the perturbative approximation in question 6.8 apparently incorrect? To answer this, we consider the question in more detail. First, it writes the map $z \to z^2 - \mu$ in the form
$$\zeta \to \zeta^2 + s\zeta - \epsilon, \qquad (6.124)$$
where $z = z^* + \zeta$, $\mu = \mu^* + \epsilon$, $z^*$ being the value of the fixed point where the bifurcation occurs, at the critical value $\mu = \mu^*$, and $s = \exp(2\pi i p/q)$. Thus at the bifurcation point $\epsilon = 0$, the linear approximation $\zeta \to s\zeta$ has a period q orbit. The object of the question is to compute this orbit and its stability when $\epsilon$ (and thus also $\zeta$) is small. The question aims to do this by defining the iterates $\zeta_r$ of the map (with $\zeta_0 = \zeta$) as
$$\zeta_r = \zeta + \frac{D_-(1 - s^r u_r)}{1-s}, \qquad (6.125)$$
where
$$D_\pm = \zeta^2 \pm (1-s)\zeta - \epsilon, \qquad (6.126)$$
and
$$u_{r+1} - u_r = -\frac{s^r D_- u_r^2}{s(1-s)} - \frac{D_+}{s^r\,s(1-s)} + \frac{2(\zeta^2-\epsilon)u_r}{s(1-s)}, \qquad (6.127)$$
with the initial condition that $u_0 = 1$; q-periodic solutions ($q \ge 2$) satisfy $u_q = 1$.



The question 6.8 suggests that when $\zeta$ and $\epsilon$ are small, and thus also $D_\pm$, then since the right-hand side of (6.127) is small, we have $u_r \approx 1$, and with this in hand, an explicit expression for $u_r$ can be found. This gives $u_q = 1$ if
$$\zeta = \pm\epsilon^{1/2}, \qquad (6.128)$$
and the resulting stability result is that in (6.122). This is exact for q = 2 but inaccurate for q > 2. What is wrong?

The method in the question is somewhat simple, but can be made more methodical using the method of multiple scales for difference equations (see, for example, Hoppensteadt and Miranker (1977)). Specifically, if we write
$$\epsilon = \varepsilon^2, \quad \zeta = \varepsilon Z, \quad R = \varepsilon r, \quad u_r = U(r, R) \equiv U_r(R), \qquad (6.129)$$
and presume that the dependence on R is continuous, so that
$$u_{r+1} = U_{r+1}(R) + \varepsilon U'_{r+1}(R) + \cdots, \qquad (6.130)$$
where the prime denotes differentiation with respect to R, then with
$$U \sim U^{(0)} + \varepsilon U^{(1)} + \ldots, \qquad (6.131)$$
one can show by equating successive powers of $\varepsilon$ that
$$U_r^{(0)} = U_0(R), \qquad U_r^{(1)} = U_1(R) + \frac{Z}{s}\left[\frac{1-s^r}{1-s} - \frac{1-s^{-r}}{1-s^{-1}}\right], \qquad (6.132)$$
and the functions $U_0$ and $U_1$ are determined by the requirement that secular terms be suppressed (so that the solutions are q-periodic). It then follows that $U_0 \equiv 1$ and, by removing secular terms at $O(\varepsilon^2)$, that
$$U_1 = -\frac{2R}{s(1-s)}:$$


but the procedure fails, whether it be this or the one in the question.

To elucidate the issue, we look at explicit iterates of the map (6.124). With $D_-$ defined in (6.126), the r-th iterates of (6.124) can be defined (as in question 6.8) as
$$\zeta_r = \zeta + D_- A_r, \qquad (6.133)$$
$$A_{r+1} = 1 + sA_r + 2\zeta A_r + D_- A_r^2, \qquad (6.134)$$
thus
$$A_0 = 0, \quad A_1 = 1, \quad A_2 = D_- + (1+s+2\zeta),$$
$$A_3 = D_-^3 + 2(1+s+2\zeta)D_-^2 + \{s + 2\zeta + (1+s+2\zeta)^2\}D_- + (s+2\zeta)(1+s+2\zeta) + 1, \qquad (6.135)$$
and so on. Since $D_- = 0$ at the fixed points of the map, it follows that a period q orbit is one for which $A_q = 0$ (and $A_r \ne 0$ for $1 \le r < q$).

Consider first the p = 1, q = 2 resonance, for which s = −1 and a period 2 orbit bifurcates at $\epsilon = 0$. In this case $D_- = \zeta^2 - 2\zeta - \epsilon$, and thus $A_2 = \zeta^2 - \epsilon$; this is exact, and is in fact identical to the approximate result in question 6.8. Thus the stability result in (6.122) is exact for q = 2.

Now let us see what happens for q = 3. In this case we take $s = \exp(\pm 2\pi i/3)$, and thus $s^3 = 1$, $1 + s + s^2 = 0$. Using these, we find
$$A_3(D_-, \zeta) = D_-^3 + 2(1+s+2\zeta)D_-^2 + 2\{s + (3+2s)\zeta + 2\zeta^2\}D_- + 2\zeta(1+2s+2\zeta). \qquad (6.136)$$
We define
$$D(\zeta) = \zeta^2 - (1-s)\zeta, \qquad (6.137)$$
so that $D_- = D - \epsilon$, and we want to find the roots of $A_3 = 0$ for small $\epsilon$. Evidently these are close to the roots of $P_3(\zeta) = A_3[D(\zeta), \zeta]$, and simplifying this using (6.136), we find
$$P_3(\zeta) = \zeta^3\left[\zeta^3 + (1+3s)\zeta^2 - (2-s)\zeta - (3+2s)\right]. \qquad (6.138)$$
For small $\epsilon$, we then have
$$A_3(D_-, \zeta) = P_3(\zeta) - \epsilon\,\frac{\partial A_3}{\partial D} + \ldots. \qquad (6.139)$$
The map $\zeta \to \zeta_3$ is an eighth-degree polynomial. It has two fixed points corresponding to the fixed points of the map (6.124), and the other six roots, which are those of $A_3$, correspond to the two period 3 orbits. It is evident from (6.138) that the period 3 orbit of concern in the bifurcation is that corresponding to the three roots near zero. From (6.136),
$$\frac{\partial A_3}{\partial D} = 2s + O(\zeta), \qquad (6.140)$$
and thus (6.139) gives
$$A_3 = c_3\zeta^3 - b_3\epsilon + \ldots, \qquad c_3 = -(3+2s), \quad b_3 = 2s, \qquad (6.141)$$
so that the correct approximation for the roots has $\zeta \sim \epsilon^{1/3}$, in distinction to (6.128). It does not take much nous to infer that for period q orbits, we expect that if $s^q = 1$, $1 + s + s^2 + \ldots + s^{q-1} = 0$, then
$$A_q = c_q\zeta^q - b_q\epsilon + \ldots; \qquad (6.142)$$
substituting this into (6.133) and calculating the approximate value of $\zeta_q'(\zeta)$, we find the stability condition as in (6.122), with the value of a for period q orbits being
$$a_q = \frac{1}{Q}\sin\frac{\pi p}{q}, \qquad Q = -\tfrac12 q b_q s(1-s)^2, \qquad (6.143)$$
where we make use of the formula
$$s(1-s) = -2ie^{i\psi}\sin\frac{\pi p}{q}.$$
For q = 2 and q = 3, we regain (6.123). In fact it is not difficult to show that
$$b_q = -\frac{2q}{s(1-s)^2},$$
whence indeed $Q = q^2$; it is less easy to show that $P_q(\zeta) \equiv A_q[D(\zeta), \zeta] = c_q\zeta^q + O(\zeta^{q+1})$ when $s^q = 1$. The derivation of (6.142) forms the subject of question 6.9.
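The explicit iterates above are easy to verify numerically. The sketch below (Python, in place of the book's Matlab) iterates the recurrence $A_{r+1} = 1 + sA_r + 2\zeta A_r + D_-A_r^2$ with $D_- = \zeta^2 - (1-s)\zeta - \epsilon$, and checks that $A_2 = \zeta^2 - \epsilon$ exactly when s = −1, and that for $s = e^{2\pi i/3}$ the leading behaviour $A_3 \approx c_3\zeta^3 - b_3\epsilon$ with $c_3 = -(3+2s)$, $b_3 = 2s$ holds near the origin.

```python
import cmath

def A_iterates(s, zeta, eps, n):
    """A_0 = 0, A_{r+1} = 1 + (s + 2 zeta) A_r + D_- A_r^2,
    with D_- = zeta^2 - (1 - s) zeta - eps."""
    D_minus = zeta**2 - (1 - s)*zeta - eps
    A = [0j]
    for _ in range(n):
        A.append(1 + (s + 2*zeta)*A[-1] + D_minus*A[-1]**2)
    return A

s2 = -1 + 0j
s3 = cmath.exp(2j*cmath.pi/3)

# q = 2: A_2 = zeta^2 - eps, exactly
zeta, eps = 0.3 + 0.2j, 0.1 - 0.05j
A2 = A_iterates(s2, zeta, eps, 2)[2]

# q = 3: A_3 ~ c_3 zeta^3 for eps = 0 and small zeta
z_small = 1e-3
c3_est = A_iterates(s3, z_small, 0, 3)[3] / z_small**3

# and dA_3/d eps = -dA_3/dD ~ -2s near the origin, so b_3 ~ 2s
d_eps = 1e-6
b3_est = -(A_iterates(s3, 1e-4, d_eps, 3)[3]
           - A_iterates(s3, 1e-4, 0, 3)[3]) / d_eps
```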

6.7.4 Turbulence

The proper study of turbulence may be said to begin with the paper by Reynolds (1883), where he identifies the Reynolds number, and shows by experiment that the onset of turbulence occurs at a critical Reynolds number. He further observed that the onset is sudden, that it depends on the amplitude of the inlet disturbance, and that it is intermittent. This latter observation was elaborated by Wygnanski and Champagne (1973), who described the formation of slugs and puffs, as indicated in Fig. 6.14, based on a prescribed inlet disturbance. Further experimental description was provided by Darbyshire and Mullin (1995), who placed their disturbance downstream, so as to better approximate a fully developed flow. A specific attempt to explain these observations qualitatively was given by Fowler and Howell (2003), using a characterisation of the flow as being locally in one of two attracting 'states', turbulent or laminar, with corresponding different friction laws. A similar method has been employed by Barkley (2016).

Linear stability of shear flows is described by the Orr–Sommerfeld equation (Orr 1907a, b; Sommerfeld 1908), which is derived from the Navier–Stokes equations by linearisation. Specifically, for a two-dimensional incompressible shear flow $(U(y), 0)$, the linearised equations for the perturbed velocity components in the velocity field $\mathbf{u} = (U(y) + u, v)$ are
$$u_t + Uu_x + U'v = -p_x + \frac{1}{Re}\nabla^2 u, \qquad v_t + Uv_x = -p_y + \frac{1}{Re}\nabla^2 v;$$
cross-differentiating to eliminate the pressure gradient term, and introducing a stream function $\psi$ such that $u = \psi_y$, $v = -\psi_x$, we obtain
$$\nabla^2\psi_t + U\nabla^2\psi_x - U''\psi_x = \frac{1}{Re}\nabla^4\psi,$$
and for normal mode solutions of the form $\psi = \phi(y)e^{ik(x-ct)}$, the Orr–Sommerfeld equation is
$$(U-c)(\phi'' - k^2\phi) - U''\phi = \frac{1}{ikRe}\left(\phi^{iv} - 2k^2\phi'' + k^4\phi\right).$$
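The reduction from the linearised vorticity equation to the Orr–Sommerfeld equation can be verified symbolically. The sketch below (using SymPy rather than the book's Matlab) substitutes the normal mode $\psi = \phi(y)e^{ik(x-ct)}$ into $\nabla^2\psi_t + U\nabla^2\psi_x - U''\psi_x - \frac{1}{Re}\nabla^4\psi$ and checks that it equals the Orr–Sommerfeld form multiplied back by $ikE$.

```python
import sympy as sp

x, y, t, k, c, Re = sp.symbols('x y t k c Re')
U = sp.Function('U')(y)
phi = sp.Function('phi')(y)

E = sp.exp(sp.I*k*(x - c*t))          # normal-mode factor
psi = phi*E

def lap(f):
    return sp.diff(f, x, 2) + sp.diff(f, y, 2)

# the linearised vorticity equation
vorticity = (sp.diff(lap(psi), t) + U*sp.diff(lap(psi), x)
             - sp.diff(U, y, 2)*sp.diff(psi, x) - lap(lap(psi))/Re)

# the Orr-Sommerfeld form, multiplied by ik E
orr_sommerfeld = sp.I*k*E*((U - c)*(sp.diff(phi, y, 2) - k**2*phi)
                           - sp.diff(U, y, 2)*phi
                           - (sp.diff(phi, y, 4) - 2*k**2*sp.diff(phi, y, 2)
                              + k**4*phi)/(sp.I*k*Re))

residual = sp.simplify(sp.expand(vorticity - orr_sommerfeld))
```

The residual vanishing identically confirms that the two forms are term-by-term equivalent.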


Nonlinear stability results for fluid flows were developed for convection by Malkus and Veronis (1958), and for shear flows by Stuart (1960). These methods have their origin in the theory of Hopf bifurcation, and their legacy was the development of amplitude equations in general contexts of pattern formation in weakly nonlinear systems; a wealth of such equations have been studied: Kuramoto–Sivashinsky, Cross–Hohenberg, nonlinear Schrödinger; and many such systems are used by way of analogue to more complicated equations.

For shear flow, these nonlinear stability results rather petered out. Largely, this was due to the numerical results obtained by Orszag and co-workers, who showed in a sequence of papers (Orszag and Kells 1980; Orszag and Patera 1980, 1983) that while there could be perturbed two-dimensional equilibrium structures below an instability threshold (should one exist), these are all subject to fast inviscid instability, essentially due to the consequent inflectional velocity profile (see question 6.12). Since then, many other such structures have been discovered, see for example Faisst and Eckhardt (2003), and their dynamical significance has been studied by Duguet et al. (2008) and Eckhardt et al. (2007), for example.

These coherent structures take the form of lengthwise periodic vortices and streaks, whose form warrants some description. As described by Faisst and Eckhardt (2003), they are represented in cross-sectional view as an array of circular structures, the contours of which measure along-pipe velocity, with accelerated structures near the pipe walls. The cross-stream velocities are arranged as lengthwise vortices, so the two structures co-exist. On the other hand, analytical studies such as that by Benney and Bergeron (1969) indicate, in a longitudinal section,23 that disturbances take the form of 'cat's eyes', as represented by perturbed streamlines akin to those in a phase plane of the simple pendulum. We thus naturally think of these as transverse structures, wrongly, as we need to add the base flow; thus Benney and Bergeron's cat's eyes are a manifestation of a longitudinal streak, despite looking at face value like a transverse vortex. The subject continues to attract analytical approaches (Deguchi and Hall 2014; Chini et al. 2017), as indeed it has for a long time, and these tend to be of daunting complexity, as is appropriate.

23 To be more precise, if x is the downstream direction, then the Faisst and Eckhardt structures are visualised in the (y, z) plane, while the Kelvin cat's eye patterns of Benney and Bergeron are in the (x, z) plane.



Richardson’s Dream

The famous quotation by Lewis Fry Richardson (‘big whirls have little whirls…’) is commonly written as a verse, although it is simply a sentence (page 66) in Richardson’s (1922) dense text on meteorology. (And it is commonly misquoted, as ‘big whorls have little whorls…’, e.g. Editorial (2016).) Apart from the poetic quotation, Richardson’s extraordinary book is also known for his fantastic vision (page 219) of a theatrical gallery of human ‘computers’ (he thought 64,000 were necessary) who would be able to do the numerical calculations by hand sufficiently rapidly to provide useful weather predictions (see the illuminating book by Lynch (2006)). But apart from these notable highlights, Richardson’s book is a comprehensive treatise on meteorology. One tends to think of his much-vaunted numerical prediction of the weather on 20 May 1910, over central Europe, as being some simple representation of a basic primitive model involving mass and momentum conservation. But the treatment is extraordinarily detailed: moisture, radiation, clouds, but also basal friction, soil moisture and all kinds of other things. The following passage will give the idea: A cornfield provides a uniform elastic surface by observing which we can measure the shearing stress. For example at Benson, 1920 Aug. 16th to 22nd, observations were made on a field of ripe wheat. The wind acts mainly on the ear, but partly also on the stalk, as may be shown by lowering a glass jar over the ear. The centre of the force due to the wind appears to be about 5 cm below the ear…

Richardson's book was reprinted in facsimile by Dover Books in 1965, together with an introduction by no less than Sydney Chapman, which contains interesting biographical information, together with diverting caveats for American readers, explaining what size Europe was, and so on. Richardson was something of an outsider in the academic establishment, somewhat like R. A. Bagnold24 or William Smith,25 and his book carries some eccentricities: a use of Coptic letters, not to mention some other apparently invented symbols, and a translation of their meaning into Ido, one of several 'universal languages' then prevalent, of which the only one which remains in the popular consciousness is Esperanto.

6.8 Exercises

6.1 A particle undergoes a random walk on the integers $\mathbb{Z}$, and the probability of being at position n at step j is $p_{n,j}$. Suppose the events 'move left', 'move right', 'remain stationary' have probabilities l, r and 1 − l − r respectively. By using conditional probabilities, show that

24 Renowned author of The Physics of Blown Sand and Desert Dunes.
25 The author of the first geological map of Britain.



$$p_{n,j} = l\,p_{n+1,j-1} + (1-l-r)\,p_{n,j-1} + r\,p_{n-1,j-1},$$
and if the particle starts at the origin, show that $p_{n,0} = \delta_{n0}$, where $\delta_{ij}$ is the Kronecker delta. Find an equation for the generating function
$$G_j(s) = \sum_n p_{n,j}\,s^n, \qquad (*)$$
and hence show that
$$G_j(s) = \left[\frac{l}{s} + (1-l-r) + rs\right]^j. \qquad (\dagger)$$

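The equivalence of the recurrence and the generating function $(\dagger)$ can be checked directly; the sketch below computes $p_{n,j}$ both from the recurrence and as the coefficient of $s^n$ in $(l/s + (1-l-r) + rs)^j$, expanded with the multinomial theorem.

```python
import math
from collections import defaultdict

def probs_recurrence(l, r, j):
    """p_{n,j} from the recurrence, starting from p_{n,0} = delta_{n0}."""
    p = {0: 1.0}
    for _ in range(j):
        q = defaultdict(float)
        for n, pr in p.items():
            q[n - 1] += l*pr              # move left
            q[n] += (1 - l - r)*pr        # remain stationary
            q[n + 1] += r*pr              # move right
        p = dict(q)
    return p

def probs_generating(l, r, j):
    """Coefficient of s^n in (l/s + (1-l-r) + r s)^j by the multinomial
    theorem: a left steps, b stays, c right steps, with a + b + c = j."""
    p = defaultdict(float)
    for a in range(j + 1):
        for c_ in range(j + 1 - a):
            b = j - a - c_
            coeff = (math.factorial(j)
                     // (math.factorial(a)*math.factorial(b)*math.factorial(c_)))
            p[c_ - a] += coeff * l**a * (1 - l - r)**b * r**c_
    return dict(p)
```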
For the case $l = r = \tfrac12$, find the large time behaviour of the distribution as follows. Write
$$n = \sqrt{j}\,x, \quad p_{n,j} = \frac{1}{\sqrt{j}}\,f(x), \quad s = \exp\left(\frac{it}{\sqrt{j}}\right),$$
and find two approximations for $G_j$ from $(*)$ and $(\dagger)$, one as a Fourier integral. Hence deduce that f is a normal distribution, independently of j. What are its mean and variance? Find the equivalent result for general l and r.

6.2 A large number of particles undergo Brownian motion in one spatial dimension x, and the probability density function $p(x,t)$ of being at position x at time t is determined by a probability density $\phi(\ell)$ of moving a distance $\ell$ in a small time $\tau$. Using conditional probability and the independence of successive events in time $\tau$, write down an equation for $p(x+\ell, t)$, and assuming $\ell$ and $\tau$ are small, show that p satisfies a diffusion equation, where the diffusivity is
$$D = \frac{1}{2\tau}\int_{-\infty}^{\infty} \ell^2\phi(\ell)\,d\ell.$$
Find the solution if all the particles are initially at x = 0, and hence show that the mean square displacement $\langle x^2\rangle = 2Dt$. For motion in three dimensions, the density satisfies the diffusion equation
$$\frac{\partial p}{\partial t} = D\left(\frac{\partial^2 p}{\partial r^2} + \frac{2}{r}\frac{\partial p}{\partial r}\right),$$

together with $p = \delta(\mathbf{r})$ at t = 0. Noting that
$$\int_0^\infty 4\pi r^2 p\,dr \equiv 1,$$


find a suitable similarity solution for p, and hence show that $\langle r^2\rangle = 6Dt$. Is this consistent with the one-dimensional result?

6.3 A nucleation and growth process is modelled on a discrete lattice of N points arranged on a circle of length $L = N\Delta x$, where $\Delta x$ is the spacing between the points. In a time interval $\Delta t$ an empty site is nucleated with probability p, forming a 'bubble' of occupied sites. In each time step these bubbles grow at each end by one step. The gaps between the bubbles are called 'holes'. Let $H_i^j$ be the expected number of holes of length i at time step j. Show that
$$H_i^{j+1} = H_{i+2}^j - i\,p\,H_{i+2}^j + 2p\sum_{k=i}^{N-2} H_{k+2}^j.$$


By defining
$$x = i\Delta x, \quad t = j\Delta t, \quad v = \frac{\Delta x}{\Delta t}, \quad H_i^j = h(x,t)\,\Delta x, \quad p = \gamma\,\Delta x\,\Delta t,$$

show that the continuum limit of this as $\Delta x, \Delta t \to 0$ is
$$h_t - 2v h_x = -\gamma x h + 2\gamma\int_x^L h(y,t)\,dy.$$

Show that an appropriate initial condition if there is initially one occupied site is $h(x,0) = \delta_-(L-x)$. ($\delta_-(L-x)$ is the left-sided delta function which is zero everywhere except at x = L, and $\int_0^L \delta_-(L-x)\,dx = 1$.) Show that in the continuum limit, the number of holes $N_h$ and the fraction of unoccupied space S are
$$N_h = \int_0^L h\,dx, \qquad S = \frac{1}{L}\int_0^L x h\,dx.$$

By suitably scaling the variables, show that the equation can be written in the form
$$f_t - f_x = -x f + 2\int_x^{L^*} f(y,t)\,dy,$$


where you should define $L^*$, and show that the initial condition can be written as
$$f(x,0) = \frac{1}{L^*}\,\delta_-(L^*-x).$$
Explain why a suitable boundary condition for f is f = 0 at $x = L^*$. Use a Laplace transform to find the solution of this. [Hint: the definition of f can be extended to $(0,\infty)$ by taking f = 0 for $x > L^*$.]

6.4 The characteristic function $\phi(t)$ of a random variable X is defined by $\phi(t) = E(e^{itX})$, where E is the expectation, or mean. A Gaussian random variable X of mean $\mu$ and variance $\sigma^2$ is one for which
$$P(X < x) = \int_{-\infty}^{x} \frac{1}{\sigma\sqrt{2\pi}}\exp\left[-\frac{(u-\mu)^2}{2\sigma^2}\right] du,$$
where $x \in (-\infty, \infty)$. Show that the characteristic function of such a random variable is $e^{i\mu t - \frac12\sigma^2 t^2}$. Let $X_1, \ldots, X_n$ be a sequence of independent Gaussian random variables with means $\mu_i$ and variances $\sigma_i^2$. Show that the sum $S = \sum_{i=1}^n X_i$ is also Gaussian, with mean $\sum_i \mu_i$ and variance $\sum_i \sigma_i^2$.

6.5 X(t) is a continuous time series for $t \in [-T, T]$ with mean zero and variance one:
$$\frac{1}{2T}\int_{-T}^{T} X^2\,dt = 1;$$
T is taken to be very large. The definition of X is extended outside $[-T, T]$ by taking it to be zero there. A discrete sequence $\{x_i\}$ is defined by
$$X(t_i) = x_i, \quad t_i = i\Delta t, \quad i \in I_N = \{1, 2, \ldots, N\}, \quad (N-1)\Delta t = 2T.$$
Show that the variance of the discrete sequence is approximately one if N is large,
$$\frac{1}{N-1}\sum_{I_N} x_i^2 = 1.$$

The autocorrelation function is defined by
$$C_X(t) = \frac{1}{2T}\int_{-T}^{T} X(s)X(s-t)\,ds.$$

Show that in the limit $T \to \infty$, $C_X(t)$ is symmetric. Show that a discrete approximation to $C_X(t)$ is given by the vector $c \in \mathbb{R}^N$,
$$c_i = \frac{1}{N-1}\sum_{k \in I_N} x_k x_{k-i}.$$

Now consider the $N \times d_E$ trajectory matrix M, whose ith row is the row vector $X_i \in \mathbb{R}^{d_E}$, where the jth component of $X_i$ is $X_i^j = x_{i+j}$, $j \in I_d = \{-[\tfrac12 d_E], -[\tfrac12 d_E]+1, \ldots, [\tfrac12 d_E]\}$, with $d_E$ being odd. [This corresponds to centre projection; equally we can take $I_d = \{0, -1, \ldots, -(d_E-1)\}$ (back projection), or $I_d = \{0, 1, \ldots, d_E-1\}$ (forward projection).] Show that
$$(M^T M)_{ij} = \sum_{k \in I_N} x_{i+k}\,x_{k+j}, \quad i, j \in I_d.$$

The integral operator $A(\rho)$ is defined on functions $\rho(t)$, $t \in (-\tau, \tau)$, by
$$A(\rho)(t) = \frac{1}{2\tau}\int_{-\tau}^{\tau} C_X(t-s)\rho(s)\,ds.$$
Assuming $2\tau = (d_E-1)\Delta t$ and that $N \gg d_E \gg 1$, show that, if $r_j = \rho(t_j)$, a discrete approximation to $A(\rho)(t)$ is provided by $Br$, where $r \in \mathbb{R}^{d_E}$ has components $r_j$, and B is the $d_E \times d_E$ matrix with components
$$B_{ij} \approx \frac{1}{(d_E-1)(N-1)}\,(M^T M)_{ij}.$$
Deduce that the eigenvalues of $A(\rho)$ are approximately
$$\lambda_i = \frac{\sigma_i^2}{(d_E-1)(N-1)},$$
where $\sigma_i$ are the singular values of M. Hence show that
$$\sum \lambda_i \approx 1.$$
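This spectral correspondence can be checked on a synthetic series: the sketch below (Python/NumPy; forward projection, which does not affect the spectrum) builds the trajectory matrix of a normalised signal and verifies that the eigenvalues of $B = M^TM/((d_E-1)(N-1))$ are $\sigma_i^2/((d_E-1)(N-1))$, with sum close to one.

```python
import numpy as np

rng = np.random.default_rng(3)
N, dE = 4000, 21                      # N >> d_E >> 1, d_E odd

# a synthetic time series, normalised to mean zero and variance one
series = np.sin(0.05*np.arange(N + dE)) + 0.5*rng.standard_normal(N + dE)
series = (series - series.mean())/series.std()

# trajectory matrix: row i holds (x_i, ..., x_{i+dE-1})
M = np.lib.stride_tricks.sliding_window_view(series, dE)[:N]

B = M.T @ M / ((dE - 1)*(N - 1))
eig_B = np.sort(np.linalg.eigvalsh(B))

sigma = np.linalg.svd(M, compute_uv=False)
lam = np.sort(sigma**2/((dE - 1)*(N - 1)))
```

Note that with these conventions the sum is actually $d_E/(d_E-1)$ times one, which is why the exercise assumes $d_E \gg 1$.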


6.6 The Koch snowflake curve is generated as follows. Beginning with an equilateral triangle $\Gamma_0$, take each straight line segment, build an equilateral triangle on its middle third, and then remove the middle third. The imposed triangle should be such as to increase the area of the polygon; call this new polygon $\Gamma_1$. Now repeat this procedure, yielding a sequence $\{\Gamma_n\}$.

Calculate the number $s_n$ of straight line segments of $\Gamma_n$, show that the length $l_n$ of the perimeter of $\Gamma_n \to \infty$ as $n \to \infty$, but that the enclosed area $A_n$ tends to a limit. What is this, if $A_0 = 1$? Calculate the fractal dimension of $\Gamma_\infty$.

6.7 The (invariant) Julia set of the map $z \to z^2 - \mu$ is defined as the 'boundary' of the set of points which tend to $\infty$ under the map. When $\mu = 0$, show directly that the Julia set is given by $z = re^{i\theta}$, r = 1, and that the mapping on it is $\theta \to 2\theta$, and thus chaotic. When $\mu$ is small and real, show that the Julia set $r = r(\theta)$ is defined by
$$r(\Theta)e^{i\Theta} = r^2(\theta)e^{2i\theta} - \mu,$$
which implicitly defines both the set $r = r(\theta)$ and the mapping on it, $\theta \to \Theta(\theta)$. By writing
$$r(\theta) = 1 + \mu\rho(\theta), \quad \Theta = 2\theta + \mu\psi(\theta),$$
and expanding for small $\mu$, show that, approximately,
$$\Theta \approx 2\theta + \mu\sin 2\theta, \quad \rho(2\theta) \approx 2\rho(\theta) - \cos 2\theta.$$
Show by induction that
$$\rho(s) = 2^n\left[\rho\left(\frac{s}{2^n}\right) - 1\right] + 1 + \sum_{0}^{n-1} 2^j\left(1 - \cos\frac{s}{2^j}\right), \qquad (*)$$
for any positive integer n. Deduce that $\rho(0) = 1$. By defining
$$s = \frac{2\pi}{2^\xi}, \quad \rho(s) = 1 - s\,h(\xi), \quad \xi > 0,$$
show that h satisfies
$$h(\xi + n) = h(\xi) + \sum_{j=1}^{n} \frac{\sin^2\Omega_j}{\Omega_j}, \quad \Omega_j = \frac{2\pi}{2^{\xi+j}}, \quad n \ge 1.$$
Hence show that if $h(\xi) = \phi(\xi)$, $\xi \in [0,1)$, then
$$h(\xi) = \phi(\xi - n) + \sum_{j=0}^{n-1} \frac{\sin^2\omega_j}{\omega_j}, \quad \omega_j = \frac{2\pi}{2^{\xi-j}}, \quad \xi \in [n, n+1), \quad n \ge 1.$$
Show that h is continuous providing $\phi$ is continuous, and $\phi(1) = \phi(0)$.



What piece of information has not been used, which might help determine $\phi$? Next, let $\theta$ be any angle $\in (0, 2\pi)$. Thus we can write
$$\frac{\theta}{2\pi} = 0.a_1a_2\ldots a_n\ldots,$$
where $0.a_1a_2\ldots a_n\ldots$ is a binary fraction, with each $a_i = 0$ or 1. Show that truncating the fraction at $a_n$ gives an approximation $\theta_n$ of $\theta$, with $\theta_n \to \theta$ as $n \to \infty$, and
$$\theta_n = \frac{2\pi k_n}{2^n}, \quad k_n = 2^{n-1}a_1 + 2^{n-2}a_2 + \ldots + a_n.$$
Show that the integer $k_n = 2k_{n-1} + a_n$ and $k_n < 2^n$. By using the recurrence formula $(*)$ with a value $s = 2\pi k_n$, show that
$$\rho(\theta_n) = 1 - \sum_{j=0}^{n-1} \frac{\sin^2(2^j\theta_n)}{2^j}.$$
Show that the series converges uniformly, and deduce that a $2\pi$-periodic continuous solution for $\rho$ exists of the form
$$\rho(\theta) = 1 - \sum_{j=0}^{\infty} \frac{\sin^2(2^j\theta)}{2^j} = \sum_{j=1}^{\infty} \frac{\cos(2^j\theta)}{2^j}.$$
[This is Weierstrass's non-differentiable function, announced by him in 1872; it is everywhere continuous and nowhere differentiable; it is also self-similar on fine scales; see Hardy (1916), who proved the non-differentiability for this particular example, and more recently Johnsen (2010). Fig. 6.17 shows the comparison of $\rho(\theta)$ given by this formula with that computed directly from the map, for $\mu = 0.001$ and $\mu = 0.1$.]
















Fig. 6.17 Comparison of the Weierstrass formula ρ(θ) (circles) with the Julia set computed directly from the map. Left: μ = 0.001; right: μ = 0.1
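The series for $\rho$ is simple to evaluate; the sketch below checks $\rho(0) = 1$, the functional equation $\rho(2\theta) = 2\rho(\theta) - \cos 2\theta$, and the equivalence of the sine and cosine forms (truncated; the tails are of size $2^{-\text{terms}}$).

```python
import math

def rho_sin(theta, terms=50):
    """rho(theta) = 1 - sum_{j>=0} sin^2(2^j theta)/2^j, truncated."""
    return 1.0 - sum(math.sin((2**j)*theta)**2 / 2**j for j in range(terms))

def rho_cos(theta, terms=50):
    """Equivalent form rho(theta) = sum_{j>=1} cos(2^j theta)/2^j, truncated."""
    return sum(math.cos((2**j)*theta)/2**j for j in range(1, terms + 1))
```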



6.8 A wrong answer
The Mandelbrot set is the set of values $\mu \in \mathbb{C}$ such that the origin is not mapped eventually to $\infty$ under successive iterations of the map $z \to z^2 - \mu$. In particular, it includes the parameter ranges in which the period-q fixed points are stable. Write down the two values of z which are fixed points of the mapping, identify which one can be stable (and denote it as $z^*$), and show that the boundary of the region in the complex $\mu$-plane where $z^*$ is stable (given by $|f'(z^*)| = 1$) is the cardioid
$$\mu + \tfrac14 = re^{i\theta}, \quad r = \cos^2\tfrac12\theta, \quad \theta \in (-\pi, \pi).$$
Write down a quartic equation for z, two of whose roots give the unique period-two orbit $\{z_+, z_-\}$. By noting that two of the roots of the quartic correspond to the two fixed points, factorise the quartic and find $z_\pm$ explicitly. Hence show that the boundary of stability of the period-two orbit (given by $|f'(z_+)f'(z_-)| = 1$) is the circle $|\mu - 1| = \tfrac14$. [This is the bulb to the right of the cardioid in Fig. 6.12.]
Show that on the cardioid, at polar angle $\theta$ with respect to the cusp at $\mu = -\tfrac14$, the cartesian slope angle is $\psi = \tfrac32\theta - \tfrac12\pi$.
Next, we write the map in terms of a perturbation of the fixed point $z = z^*$ when $\mu = \mu^*$ is on the cardioid. Define
$$z = z^* + \zeta, \quad \mu = \mu^* + \epsilon, \quad \mu^* = -\tfrac14 + e^{i\alpha}\cos^2\tfrac12\alpha,$$
and show that the map is transformed to
$$\zeta \to -\zeta e^{i\alpha} + \zeta^2 - \epsilon. \qquad (*)$$
Hence show that when $\epsilon = 0$, the bifurcation at $\theta = \alpha$ on the cardioid is to a period-q orbit with rotation number p/q (where p and q are relatively prime) if
$$\alpha = \pi\left(\frac{2p}{q} - 1\right).$$
Identify period-three and period-four stability bulbs in Fig. 6.12, and show that their cartesian slope angles at the point of tangency with the (upper branch of the) cardioid are $\psi = 0$ and $\psi = \tfrac14\pi$, respectively.
Let the r-th iterate of $(*)$ be $\zeta_r$ (the initial value $\zeta_0$ being defined to be $\zeta$), so that
$$\zeta_{r+1} = s\zeta_r + \zeta_r^2 - \epsilon, \quad s = e^{i(\pi+\alpha)}, \quad \zeta_0 = \zeta,$$
and assume that $\zeta_r$ and $\epsilon$ are small. Define
$$D_\pm(\zeta) = \zeta^2 \pm (1-s)\zeta - \epsilon,$$
and show that the fixed points of the map correspond to the solutions of $D_- = 0$. Assuming $\zeta$ is not a fixed point of the map, and defining $\zeta_r = \zeta + D_- A_r$, show that
$$A_{r+1} = 1 + sA_r + 2\zeta A_r + D_- A_r^2, \qquad A_0 = 0.$$
Hence show that an approximate solution is
$$A_r = \frac{1-s^r}{1-s},$$
and that this represents a (neutrally stable) period q orbit. Now define
$$A_r = \frac{1 - s^r u_r}{1-s},$$
and show that
$$u_{r+1} = u_r - \frac{s^r D_- u_r^2}{s(1-s)} - \frac{D_+}{s^r\,s(1-s)} + \frac{2(\zeta^2-\epsilon)u_r}{s(1-s)}, \qquad u_0 = 1,$$
and that a period q (> 1) orbit corresponds to having $u_q = 1$. In a naïve approximation, we may note that for small $\zeta$ and $\epsilon$, $u_r \approx 1$, so that
$$u_{r+1} \approx u_r - \frac{s^r D_-}{s(1-s)} - \frac{D_+}{s^r\,s(1-s)} + \frac{2(\zeta^2-\epsilon)}{s(1-s)}, \qquad u_0 = 1.$$
Show that the solution of this is
$$u_r \approx 1 - \frac{D_-}{s(1-s)}\,\frac{1-s^r}{1-s} - \frac{D_+}{s(1-s)}\,\frac{1-s^{-r}}{1-s^{-1}} + \frac{2(\zeta^2-\epsilon)r}{s(1-s)}.$$
[It may help to notice that, if $x_{r+1} = x_r + As^{kr}$ and $x_0 = 0$, then $x_r = A(1-s^{kr})/(1-s^k)$ if $k \ne 0$, and if k = 0, then $x_r = Ar$. Note in particular that, since $s^q = 1$, $x_q = 0$ except for k = 0, when $x_q = Aq$.] Deduce that
$$u_q \approx 1 + \frac{2q(\zeta^2-\epsilon)}{s(1-s)},$$
and thus that there is a period-q orbit through $\zeta \approx \pm\sqrt{\epsilon}$. Show that the region of stability is the bulb defined approximately by the circle
$$\left|\epsilon e^{-i\psi} - ia\right| < a, \qquad a = \frac{1}{2q}\sin\frac{\pi p}{q},$$
where $\psi = \tfrac32\alpha - \tfrac12\pi$. Note that the complex variable $\epsilon e^{-i\psi}$ has its real axis along the tangent to the cardioid, so that the circles are external and tangent to it, as can be seen in Fig. 6.12. [The formula above is not correct, as the naïve approximation method is incorrect (except for q = 2). Further discussion of the issue is given in Sect. 6.7.3 and in the following question.]

6.9 A better answer
An iterative map for the complex sequence $\zeta_r$ is given by
$$\zeta_{r+1} = s\zeta_r + \zeta_r^2 - \epsilon, \quad \zeta_0 = \zeta, \quad s = e^{2\pi i p/q}, \qquad (*)$$
where $\epsilon$ is also complex. This question concerns the stability of the period-q orbit which bifurcates from $\zeta = 0$ at $\epsilon = 0$. By writing
$$\zeta_r = \zeta + (D-\epsilon)A_r, \quad D = \zeta^2 - (1-s)\zeta, \quad \lambda = s + 2\zeta,$$
show that
$$A_{r+1} = 1 + \lambda A_r + (D-\epsilon)A_r^2, \qquad A_0 = 0.$$
The mapping defines a function $A_r(D-\epsilon, \lambda)$. Show that a period-q orbit corresponds to a value of $\zeta$ for which $A_q(D-\epsilon, \lambda) = 0$. When $\epsilon$ is small, show that the period-q orbit is given approximately by
$$A_q[D(\zeta), \lambda(\zeta)] - \epsilon\,\frac{\partial A_q}{\partial D}[D(\zeta), \lambda(\zeta)] = 0.$$
By writing $A_r$ as a power series,
$$A_r = a_0^{(r)}(\lambda) + a_1^{(r)}(\lambda)D + a_2^{(r)}(\lambda)D^2 + \ldots,$$
and solving sequentially for $a_i^{(r)}$, show that, for small $\zeta$,
$$\frac{\partial A_q}{\partial D}[D(\zeta), \lambda(\zeta)] = b_q + O(\zeta), \qquad b_q = -\frac{2q}{s(1-s)^2}.$$
Next, consider the mapping $(*)$ with $\epsilon = 0$. Assuming the expansion
$$\zeta_r = a_{1,r}\zeta + \ldots + a_{j,r}\zeta^j + \ldots,$$
write down a series of difference equations for $a_{j,r}$, find $a_{1,r}$ and $a_{2,r}$, and show inductively that the assumption
$$a_{j,r} = \sum_k b_{jk}\,\frac{s^r - s^{kr}}{s - s^k}$$
is valid for $2 \le j \le q$. Deduce that
$$\zeta_q = \zeta + a_{q+1,q}\zeta^{q+1} + O(\zeta^{q+2}),$$
and thus that
$$A_q[D(\zeta), \lambda(\zeta)] = c_q\zeta^q + O(\zeta^{q+1}), \qquad c_q = -\frac{a_{q+1,q}}{1-s}.$$
Show further that $a_{q+1,q}$ can be determined via
$$a_{q+1,q} = \left(-2C_q + \sum_m C_m C_{q+1-m}\right)\frac{q}{s},$$
where
$$(s - s^j)C_j = -2C_{j-1} + \sum_m C_m C_{j-m}, \quad j \ge 3, \qquad C_2 = \frac{1}{s-s^2}.$$
Show that the period-q orbit is stable approximately in the circle
$$\left|1 + \frac{2q^2\epsilon}{s(1-s)}\right| < 1.$$

6.10 In a simple qualitative model of turbulent pipe flow, the mean velocity u(t) and the turbulent fluctuation amplitude A are given by the dimensionless equations
$$\varepsilon[A_t + \{u - V(A)\}A_x] = f(A,u) + \varepsilon^2\{\kappa(A)A_x\}_x, \qquad \dot u = F - u - \frac{1}{L}\int_0^L A\,dx;$$
L is the dimensionless tube length, F is the dimensionless average pressure gradient, which is prescribed, V(A) and $\kappa(A)$ are prescribed positive functions, and


f(A, u) = A³[u − ū(A)],   ū(A) = a + b A^s + δ/A²,

where a, b, s and δ are positive constants, such that the minimum value of ū is 1. The parameter ε = 32/Re is small (Re is the Reynolds number).

Draw a graph of ū(A), and hence show that for u > 1 there are three steady spatially uniform states. Draw a graph of f(A, u) as a function of A, and show that two of them are stable (you may assume periodic boundary conditions for A). Draw the resulting response diagram for F as a function of u in the steady state, indicating which steady state is 'laminar' and which is 'turbulent'.

Turbulent 'slugs' are patches of turbulent flow which spread as they propagate downstream, with laminar flow upstream and downstream. Suppose that u is constant. Show that the boundaries between turbulent and laminar patches can be described in this model by travelling wave fronts in which

x = ut + εX,   t = εT,

and that the front and rear fronts move outwards relative to the mean flow at speeds v± given by

v± = [∫₀^{A_M} κ f dA ∓ ∫_{−∞}^{∞} κ V A_ξ² dξ] / ∫_{−∞}^{∞} κ A_ξ² dξ,

where ξ = X ∓ v± T, and A = A_M is the turbulent steady state.

Consider for example the leading edge of the slug. Suppose that κ = 1, and that v = v₊ + V is constant and positive. Deduce that ∫₀^{A_M} f dA > 0. Show that in an appropriate (A, B) phase plane the front corresponds to a trajectory connecting the fixed point (A_M, 0) to (0, 0), where

A′ = −B,   B′ = f(A) − vB,

with f(A) ≡ f(A, u), u > 1. Show that for such a trajectory,

dB/dA = v − f(A)/B,

where

B ∼ λA as A → 0,   λ = (1/2)[v + {v² + 4|f′(0)|}^{1/2}].

By consideration of the monotonicity of the solution as v varies, deduce that there is a unique value of v > 0 for which B = 0 when A = A_M. For further details see Fowler and Howell (2003).



6.11 Starting from the Navier–Stokes equations in the form²⁶

∇·u = 0,
ρ[∂u/∂t + ∇·(uu)] = −∇p + η∇²u,

derive averaged equations for the mean velocity and pressure by writing

u = ū + u′,   p = p̄ + p′,

where the overbar denotes an ensemble average and the prime denotes fluctuations (thus the averages of u′ and p′ vanish). Show that these take the form

∇·ū = 0,
ρ dū/dt = −∇p̄ + η∇²ū + ∇·τ^Re,

where the Reynolds stress tensor τ^Re has components

τ^Re_ij = −ρ \overline{u_i′ u_j′}.

6.12 The inviscid Orr–Sommerfeld equation is given by (6.146) when the Reynolds number Re = ∞, and is known as the Rayleigh equation:

(U − c)(φ″ − k²φ) − U″φ = 0.

Show that the basic shear flow U(y) is unstable if Im c = c_i ≠ 0, and that it is neutrally stable if c is real. Explain why suitable boundary conditions in the domain (0, 1) are φ(0) = φ(1) = 0.

If c is real, and U(y_c) = c for some y_c ∈ (0, 1) (this is the critical layer), show using Frobenius's method that in general no regular solution for φ exists. For Couette flow, U = 1 − y. Show in this case that there must be a critical layer, and determine the solutions in this case. Hence determine the spectrum of values of c for which solutions exist. Does the spectrum depend on k?

Show that a necessary condition for instability is that U″ = 0 somewhere in the flow. (This is Rayleigh's inflection point criterion.) [Hint: it is useful first to divide by U − c, and then use methods familiar in proving the orthogonality of Sturm–Liouville eigenfunctions. Note that φ will be complex in this case.]

²⁶ The dyadic uu denotes the tensor with components u_i u_j, and its divergence ∇·(uu) is the vector with i-th component ∂(u_i u_j)/∂x_j, where the summation convention is used.


Numerical Notes for Figures

This appendix gives detailed descriptions of how selected figures in this book were generated. The annotated Matlab code for these figures will be made available online as an additional resource. No attempt at numerical sophistication has been made; rather, we simply describe the use of Matlab as a working tool of the applied mathematician.

Figure 1.9 The Lorenz equations

ẋ = −σx + σy,
ẏ = (r − z)x − y,
ż = xy − bz,


are solved numerically with the three parameter values r = 28, σ = 10, b = 8/3. Initial values used are (x, y, z) = (5, 5, 10), and the relative and absolute tolerances passed to the Matlab routine ode45 are both set to 10⁻¹⁰. After a relatively short initial period, orbits in (x, y, z) phase space lie close to the Lorenz attractor, shaped something like the wings of a butterfly, with one wing centred on one unstable spiral fixed point, and the other wing centred on the other. Successive maxima in the z variable are plotted against previous values of these maxima as a series of dots, to produce the cusp-shaped first return map that Lorenz studied in 1963. Orbits that give maxima in z close to the tip of the cusp are rarely encountered, so the cusp of the return map associated with the Lorenz equations is not accurately delineated by the above process, even if the orbit is followed for a relatively long time. The cusp is therefore sharpened in the m-file fig1_9.m by actively shooting for its tip. The tip of the cusp is associated with an orbit that is poised between completing another circuit of the same butterfly wing and crossing over to the other wing. Two starting points (x_a, y_a, z_a) and (x_b, y_b, z_b) are sought in phase space for two orbits that give z maxima that straddle the cusp. Then these starting points are refined using interval bisection along the line joining them to shoot for the cusp, thereby sharpening it. The sign of x is used as a simple proxy for which of the fixed points is being visited, that is, which wing of the butterfly the orbit is currently on.

© Springer Nature Switzerland AG 2019. A. Fowler and M. McGuinness, Chaos.
Fig. A.1 An orbit for the Lorenz equations, that circles the positive unstable fixed point twice before circling the other one. The uppermost plot is of x versus t, the next is of z vs t, and the lower plot is a projection of the orbit in three-dimensional phase space with axes x, y, and z. Red discs identify locations of successive maxima in z, labelled for reference in the text. The line joining (x1 , y1 , z 1 ) and (x2 , y2 , z 2 ) is the line along which interval bisection is performed. Asterisks identify the two unstable spiral points and the unstable saddle at the origin. Arrows on the orbit in the lower plot indicate the direction of increasing time



We use the (x, y, z) locations in phase space of successive maxima in z on an orbit as our two starting points. One orbit is followed in time until it completes two circuits of one unstable point before crossing to circle the other unstable point. We label the three successive points on this orbit where z reaches its maximum as (x₁, y₁, z₁), (x₂, y₂, z₂), and (x₃, y₃, z₃), as illustrated in Fig. A.1. The first and second of these are on the same wing, orbiting one unstable point, and the third occurs after the orbit has crossed over to the other wing. We set (x_a, y_a, z_a) = (x₁, y₁, z₁) and (x_b, y_b, z_b) = (x₂, y₂, z₂). We know that the first initial point completes one more circuit of the same wing before crossing, while the second initial point reaches the next maximum in z only after crossing to the other wing. These two points then straddle the cusp, in the sense that the orbits starting on a continuous line drawn between them are partitioned into one set that circles the fixed point again, and another set that crosses over to the other fixed point. To put it differently, the two points lie on opposite sides of the two-dimensional stable manifold of the origin. In principle, by continuity, one point on the line, at the junction of the two partitions, has an orbit that approaches the tip of the cusp; this would correspond to an orbit that approaches the saddle at the origin in a direction asymptotic to the z-axis.

A new starting point is chosen halfway along the line segment connecting (x_a, y_a, z_a) and (x_b, y_b, z_b); this line segment is shown in Fig. A.1. The orbit starting from this point is computed, and the next maximum of z is checked to see whether x is positive or negative there. The new pair of points straddling the cusp is then used, that is, the closest pair of the three points under consideration for which one has a positive and the other a negative value of x at the next maximum in z.
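The straddle-and-bisect procedure can be sketched in a few lines of pure Python. The book's fig1_9.m uses ode45 with event detection; the fixed-step RK4 integrator, the step sizes, and all the names below are ours:

```python
# Pure-Python sketch of the straddle-and-bisect search for the tip of the cusp
# of the Lorenz return map (a stand-in for the ode45-based fig1_9.m).
import math

SIGMA, R, B = 10.0, 28.0, 8.0 / 3.0

def lorenz(u):
    x, y, z = u
    return (-SIGMA * x + SIGMA * y, (R - z) * x - y, x * y - B * z)

def rk4_step(u, dt):
    k1 = lorenz(u)
    k2 = lorenz(tuple(ui + dt / 2 * ki for ui, ki in zip(u, k1)))
    k3 = lorenz(tuple(ui + dt / 2 * ki for ui, ki in zip(u, k2)))
    k4 = lorenz(tuple(ui + dt * ki for ui, ki in zip(u, k3)))
    return tuple(ui + dt / 6 * (a + 2 * b + 2 * c + d)
                 for ui, a, b, c, d in zip(u, k1, k2, k3, k4))

def next_zmax(u, dt=0.005, max_steps=40000):
    """Follow the orbit until z passes through a local maximum; return that state."""
    prev, rising = u, False
    for _ in range(max_steps):
        cur = rk4_step(prev, dt)
        if cur[2] > prev[2]:
            rising = True
        elif rising:
            return prev              # z was (approximately) maximal at prev
        prev = cur
    raise RuntimeError("no z maximum found")

def wing(u):
    """Sign of x at the next z maximum: which butterfly wing is visited next."""
    return math.copysign(1.0, next_zmax(u)[0])

# Walk along one orbit, collecting successive z maxima, until two consecutive
# maxima are found whose onward orbits visit different wings: a straddling pair.
pa = next_zmax((5.0, 5.0, 10.0))
for _ in range(100):
    pb = next_zmax(pa)
    if wing(pa) != wing(pb):
        break
    pa = pb

# Interval bisection along the segment joining pa and pb, shooting for the cusp.
for _ in range(30):
    mid = tuple((a + b) / 2 for a, b in zip(pa, pb))
    if wing(mid) == wing(pa):
        pa = mid
    else:
        pb = mid
gap = math.dist(pa, pb)
```

After the halvings the straddling pair brackets an orbit aimed at the cusp tip, and all z maxima encountered along the way populate a shrinking neighbourhood of the cusp.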
The process is repeated, always halving the distance between the two straddling points, until they are very close to each other, shooting for the cusp. All maxima of z encountered are recorded, to show points in a decreasing neighbourhood of the cusp. The cusp shown in Fig. 1.9 was obtained by iterating until the straddling points were a distance of 10⁻⁸ apart.

Figure 2.12 Two thousand equally spaced values of λ ∈ [2.8, 4] are chosen, and the map is iterated starting from x₀ = 0.7. The first five hundred iterates are discarded, to allow the iterates to come close to the stable attractor, before the next two hundred iterates are saved. These iterates are then plotted as dots against the value of λ used to generate them.

Figure 4.4 The Lorenz equations

ẋ = −σx + σy,
ẏ = (r − z)x − y,
ż = xy − bz,


are solved numerically with the parameter values σ = 10, b = 8/3, and with r varying from just less than 24.74 down to 13.926. The relative and absolute tolerances



passed to the Matlab routine ode45 are both set to 10⁻¹³. We seek to compute the action of unstable periodic orbits about either of the two stable fixed points; the action is the period times the amplitude of the principal unstable periodic orbit. The first r value is approximately where the two non-zero fixed points lose stability in a Hopf bifurcation (they are unstable for larger values of r). For values of r below this value, there is a symmetric pair of bifurcating unstable periodic orbits, one about each stable fixed point, which increase in amplitude as r decreases, and each periodic orbit approaches one of a symmetric pair of homoclinic orbits to the origin, which exist near r = 13.926.

We want to compute the amplitude and period of the unstable periodic orbit that goes around the positive stable fixed point, for a range of r values. There is a sophisticated package called AUTO that will compute such unstable orbits, but we describe here a simple method for directly finding approximations to these unstable orbits by solving the Lorenz equations numerically.

At each value of r, the unstable periodic orbit is found by a shooting method: the initial point is run until it has gone through two maxima in z. Maxima in z are found by setting the 'events' option for ode45 to search for places where ż(t) is decreasing through zero. If successive maxima in the z variable are increasing, or if the orbit flips from circling one non-zero fixed point to circling the other, the initial point is known to be 'outside' the unstable periodic orbit; otherwise the initial point is 'inside'. To find an inside point, initial values used are (x₀, y₀, z₀) = (x_c, y_c, r − 1 + ε), where x_c = y_c = √(b(r − 1)) and z_c = r − 1 are the coordinates of the positive fixed point, and ε = (r_c − r)/10 with r_c = 24.74 is a small increment used to take the initial value off the fixed point.

A simple search is made by starting at (x₀, y₀, z₀), and then moving along the line between the fixed point and (x₀, y₀, z₀), stepping halfway towards the fixed point each time, until an inside point (x_i, y_i, z_i) is found. Then another search is made, starting at twice the distance that (x_i, y_i, z_i) is away from the fixed point along the same line, at (x₁, y₁, z₁) = (x_c, y_c, z_c + 2(z_i − z_c)), and doubling this distance until an outside point (x_o, y_o, z_o) is found. The inside and outside points are then refined using interval bisection: a point halfway along the line segment joining (x_i, y_i, z_i) and (x_o, y_o, z_o) is classified as inside or outside, and the two closest points straddling the unstable periodic orbit, one inside and one outside, are retained. This is continued a maximum of 60 times, or until the line segment from the inside to the outside point has length less than 10⁻¹⁵ of the length of the vector from the origin to the fixed point. The resulting unstable orbits are shown in two different views of the three-dimensional phase portrait in Fig. A.2, for 20 different r values.

The Matlab file fig4_4.m generates Fig. 4.4, which shows the action versus r. It takes about half an hour to run on a 2016 Mac laptop, when orbits are computed for 68 different r-values.
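Stripped of the Lorenz-specific details, the halve-in, double-out, then bisect search has a simple structure. In the sketch below (ours, not from fig4_4.m) the expensive orbit classification is replaced by a toy predicate so that the scaffolding itself can be checked:

```python
# Abstraction of the inside/outside search described above. 'is_inside' stands
# for the orbit classification (two z maxima, wing-flip test); here any boolean
# predicate of the distance d from the fixed point will do. Names are ours.

def bracket_and_bisect(is_inside, d0, n_bisect=60, tol=1e-12):
    d = d0
    while not is_inside(d):      # step halfway towards the fixed point
        d /= 2
    d_in = d
    d_out = 2 * d_in             # start twice as far out along the same line
    while is_inside(d_out):      # double until an outside point is found
        d_out *= 2
    for _ in range(n_bisect):    # refine by interval bisection
        mid = (d_in + d_out) / 2
        if is_inside(mid):
            d_in = mid
        else:
            d_out = mid
        if d_out - d_in < tol:
            break
    return d_in, d_out

# Toy stand-in: 'inside' means within distance 1.3 of the fixed point.
d_in, d_out = bracket_and_bisect(lambda d: d < 1.3, d0=0.5)
```

With the toy predicate the bracket collapses onto the boundary at d = 1.3, which is exactly the behaviour the unstable-periodic-orbit search relies on.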



Fig. A.2 Unstable orbits of the Lorenz equations which circle the positive stable fixed point, at 20 evenly spaced r values ranging from 24.74 down to 13.926, viewed from two slightly different directions in the 3-dimensional phase space. The thick black line is the locus of the positive stable fixed point as it varies with r. The pair of plots can be viewed like a stereo plot, yielding a 3-dimensional view

Figure 4.16 This figure shows an approximation to a homoclinic orbit of Shilnikov type. The orbit was computed from a solution of the Rössler equations

ẋ = −y − z,
ẏ = x + ay,
ż = b + z(x − c),


with a = 0.181, b = 0.2, c ≈ 10.3134, and the arrows in Fig. 4.16 indicate the direction of reversed time, as discussed in the text. In general, the way of computing a homoclinic orbit would mirror that used for the Lorenz system in Fig. 4.4: find a periodic orbit and follow it in parameter space towards the point where its period becomes infinite. However, the issue in illustrating a Shilnikov orbit as in Fig. 4.16 is rather different. It is commonly difficult to see the spiral structure, but we wanted an illustration that would show it. We were guided by the paper by Barrio et al. (2011), whose Fig. 7 shows a nicely shaped homoclinic orbit near the intersection point of two curves in parameter space. This intersection point is at (a, b, c) = (0.1798, 0.2, 10.3084), and our parameters were chosen close to this for cosmetic purposes. The two curves of the Barrio figure are more or less coincident in (a, c) space (with b = 0.2); the lower is the path of the principal homoclinic orbit (which has a single excursion away from the fixed point), while the one above is apparently that of a subsidiary double-pulse homoclinic orbit, as described by Glendinning and Sparrow (1984). The issue is clouded by the observation that in a



Fig. A.3 A numerical solution to the Rössler equations using values a = 0.181, b = 0.2, c ≈ 10.3134, plotted in three-dimensional phase space. The solution was started close to the saddle-focus P

parametric neighbourhood of the value c = c_h(a, b) where the principal homoclinic orbit occurs, there are an infinite number of periodic orbits which all look more or less the same (except that they have different numbers of loops). Thus, finding the approximate orbit that looks close to a homoclinic orbit is not straightforward for this set of equations. A single orbit that starts close to the saddle-focus

P = (−ap, p, −p),   p = [−c + √(c² − 4ab)]/(2a),   (A.4)

is plotted in Fig. A.3, in the three-dimensional (x, y, z) phase space, and has the characteristic shape of the Rössler attractor. For our values of the parameters, P ≈ (0.0035, −0.0194, 0.0194). The starting point for this orbit was set to be in the tangent plane at P to the unstable manifold on which solutions spiral away from P. The equations can be linearised at P to give the system ẋ = Ax, where (in Matlab row notation)

A = [0 −1 −1; 1 a 0; −p 0 −ap−c] ≈ [0 −1.0 −1.0; 1.0 0.1810 0; 0.0194 0 −10.3099],

and x is relative to an origin at P. (We call this set of coordinates the shifted coordinates, as opposed to the original coordinates used in (A.3).) The matrix A above has eigenvalues

λ± = 0.0896 ± 0.9959i,   μ = −10.3080,


with corresponding eigenvectors approximated by v± = (0.7071, −0.0646 ∓ 0.7041i, 0.0013 ∓ 0.0001i)T , v3 = (0.0957, −0.0091, 0.9954)T .
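These eigenvalues are easy to reproduce without Matlab's eig: the characteristic polynomial of the 3 × 3 Jacobian is a cubic, whose roots can be found with a short Durand–Kerner iteration. A pure-Python sketch (helper names are ours):

```python
# Eigenvalues of the Jacobian at the saddle-focus P of the Rossler equations,
# computed from the characteristic cubic with a Durand-Kerner root finder.
import math

a, b, c = 0.181, 0.2, 10.3134
p = (-c + math.sqrt(c * c - 4 * a * b)) / (2 * a)      # y-coordinate of P

# Jacobian of (x' = -y - z, y' = x + a y, z' = b + z(x - c)) at P = (-ap, p, -p).
A = [[0.0, -1.0, -1.0],
     [1.0, a, 0.0],
     [-p, 0.0, -a * p - c]]

tr = A[0][0] + A[1][1] + A[2][2]
m2 = sum(A[i][i] * A[j][j] - A[i][j] * A[j][i] for i, j in ((0, 1), (0, 2), (1, 2)))
det = (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
       - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
       + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

def f(t):
    """Characteristic polynomial det(t I - A) = t^3 - tr t^2 + m2 t - det."""
    return ((t - tr) * t + m2) * t - det

roots = [complex(0.4, 0.9) ** k for k in range(3)]     # standard starting guesses
for _ in range(200):
    new = []
    for i, ti in enumerate(roots):
        d = 1 + 0j
        for j, tj in enumerate(roots):
            if i != j:
                d *= ti - tj
        new.append(ti - f(ti) / d)
    roots = new

mu = min(roots, key=lambda t: t.real)                  # strongly stable eigenvalue
lam = max(roots, key=lambda t: t.imag)                 # one of the unstable pair
```

The computed roots agree with the values quoted above to the precision printed there.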




The real and imaginary parts of v− , v1 = (0.7071, −0.0646, 0.0013)T , v2 = (0, 0.7041, 0.0001)T ,


together with v₃, provide a suitable right-handed basis for analysing the dynamics of the flow near the fixed point. See also question A.1. The strongly stable eigenspace is approximately the z axis, while the weakly unstable eigenspace is approximately the (x, y) plane, as can be seen in Fig. A.3.

In order to find the principal homoclinic orbit, we adopt the following procedure. First we fix the parameters a = 0.181 and b = 0.2. Because the stable manifold is one-dimensional, we first select an initial value close to P on it. In practice, we start at a small distance along the stable eigenspace in the shifted coordinate system (along the vector v₃). We then integrate backwards in time and adjust c so that the trajectory lands on the spiral disc in Fig. A.3, which is (in forward time) the unstable manifold. Because the unstable manifold is two-dimensional, it is straightforward to do this in a manner analogous to the way in which the cusp was sharpened in the Lorenz map (Fig. 1.9). Specifically, c = 10.3 and c = 10.4 give departures from the unstable manifold (solving backwards in time) that go to positive and negative z values, respectively, thereby bracketing the unstable manifold. A combination of interval bisection and inverse quadratic interpolation using Matlab's fzero command then refines the value of c, obtaining c = 10.3134490623088 when the initial perturbation from P is 10⁻²v₃. In solving the equations numerically, we used the smallest tolerance, 2 × 10⁻¹⁴, for both absolute and relative accuracy. The tolerance on the c value when bisecting was 10⁻¹⁵. Of course this is not quite good enough, as the initial value on the stable eigenspace is not exactly on the stable manifold. We thus repeat the exercise for initial perturbations εv₃, successively reducing ε. This gives the sequence of values in Table A.1, which is clearly rapidly convergent.
Table A.1 Converged values of c approximating the homoclinic orbit for (A.3), when a = 0.181 and b = 0.2, and initial conditions (integrating backwards in time) are taken at εv₃ in the shifted coordinates

ε        c
1        10.3126454288201
0.1      10.3134413344485
10⁻²     10.3134490623088
10⁻³     10.3134491336202
10⁻⁴     10.3134491342413
10⁻⁵     10.3134491342463
10⁻⁶     10.3134491342463

While the procedure above provides a good estimate of the parameter values at which the homoclinic orbit exists, it does not do a good job of computing the homoclinic orbit itself. The reason lies in the extreme disparity of the growth and decay rates of the unstable and stable eigenmodes. In order to compute the homoclinic orbit, we need to integrate forward in time. For convenience, we transform the shifted



coordinates to (x′, y′, z′) measured with respect to the new basis (v₁, v₂, v₃). We start close to P on the unstable eigenspace spanned by v₁ and v₂. Because of the rapid decay of the stable v₃ component, we can effectively suppose this places us on the unstable manifold. Next, we solve forwards in time until the orbit intersects a Poincaré section, taken to be y′ = 0, x′ < 0, twice, indicating that it has completed at least one full circuit around P. The two intersection points in the three-dimensional phase space provide the end-points of a line segment, and the homoclinic orbit must pass very close to this segment, since all orbits leaving P do. We shoot on orbits that start on the line segment for the orbit that most closely approaches P after spiralling away from it. This orbit looks like the homoclinic orbit for P.

Figure 5.3 The Hénon–Heiles system solved is

q̇₁ = p₁,
q̇₂ = p₂,
ṗ₁ = −q₁ − 2q₁q₂,
ṗ₂ = −q₂ − q₁² + q₂²,

starting with hand-picked initial conditions. Values of p₂ and q₂ are saved in the Poincaré section with q₁ = 0 and p₁ > 0. The initial conditions are listed in Table A.2; later figures in the chapter use similarly hand-picked initial conditions. The equations are solved for times up to t = 10,000, starting at time t = 0. This leads to a total of about 63,000 points generated in the Poincaré section, from which every fifth point is plotted.

Table A.2 Initial values of q₂ and p₂ for Fig. 5.3; the initial value of q₁ was always zero
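The computation can be sketched in pure Python as follows. The book's fig5_3.m uses Matlab; the RK4 stepper, the step size, and the single initial condition below are ours, not values from Table A.2. The Hamiltonian provides a cheap accuracy check:

```python
# Henon-Heiles orbit and the Poincare section q1 = 0, p1 > 0, in pure Python.

def rhs(s):
    q1, q2, p1, p2 = s
    return (p1, p2, -q1 - 2 * q1 * q2, -q2 - q1 * q1 + q2 * q2)

def rk4(s, dt):
    k1 = rhs(s)
    k2 = rhs(tuple(si + dt / 2 * ki for si, ki in zip(s, k1)))
    k3 = rhs(tuple(si + dt / 2 * ki for si, ki in zip(s, k2)))
    k4 = rhs(tuple(si + dt * ki for si, ki in zip(s, k3)))
    return tuple(si + dt / 6 * (a + 2 * b + 2 * c + d)
                 for si, a, b, c, d in zip(s, k1, k2, k3, k4))

def energy(s):
    """Hamiltonian H = (p1^2 + p2^2 + q1^2 + q2^2)/2 + q1^2 q2 - q2^3/3."""
    q1, q2, p1, p2 = s
    return 0.5 * (p1 * p1 + p2 * p2 + q1 * q1 + q2 * q2) + q1 * q1 * q2 - q2 ** 3 / 3

s = (0.0, 0.1, 0.35, 0.0)                # (q1, q2, p1, p2), starts on the section
E0 = energy(s)
dt, section = 0.01, []
for _ in range(50000):                   # integrate to t = 500
    s_new = rk4(s, dt)
    if s[0] < 0 <= s_new[0]:             # q1 crosses zero upward, so p1 = q1' > 0
        f = -s[0] / (s_new[0] - s[0])    # linear interpolation onto the section
        q2, p2 = [a + f * (b - a) for a, b in zip(s, s_new)][1::2]
        section.append((q2, p2))
    s = s_new
drift = abs(energy(s) - E0)
```

Plotting the recorded (q₂, p₂) pairs for several initial conditions reproduces a section of the kind shown in Fig. 5.3.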
Figure 5.13 The file fig5_13.m generates Fig. 5.13, a Poincaré section through solutions of the restricted three-body problem. We consider the Sun, Jupiter, and the Earth rotating about their centre of mass, which is taken as the origin. We neglect the effect of the Earth on the orbits of the other two bodies, and we restrict our attention to the case that the Sun and Jupiter both have circular orbits. The equations take the form



ṙ(t) = v,

v̇(t) = h²/r³ − 1/r² + ε[−(r − cos ψ)/x³ + 2 cos ψ/r³],

ψ̇(t) = h/r² − 1,

ḣ(t) = ε[1/r² − r/x³] sin ψ,

where r is the distance from the Earth to the origin, x is the distance from the Earth to Jupiter, h is the angular momentum of the Earth, ψ = θ − t, where θ is the angular coordinate of the Earth, and ε ≈ 0.001 is the ratio of the Jovian and solar masses. To avoid the inaccuracies that arise during close approaches to the Sun (near the origin), we rescale time as dt = r³ dT to obtain

ṙ(T) = r³v,

v̇(T) = h² − r + ε[−(r³/x³)(r − cos ψ) + 2 cos ψ],

ψ̇(T) = hr − r³,

ḣ(T) = ε[r − r⁴/x³] sin ψ.


Note that in fact it makes little difference to the solutions computed by Matlab whether we use the rescaled equations or not, for values of ε larger than 0.001. Our discussion below always refers to values of t.
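A quick consistency check on the rescaled system (ours, not from fig5_13.m): with ε = 0 it is the Kepler problem in a disguised time variable, so the Kepler energy E = v²/2 + h²/(2r²) − 1/r must be conserved along computed orbits. A pure-Python RK4 sketch, with step size and initial values of our own choosing:

```python
# RK4 integration of the rescaled three-body equations; eps = 0 gives pure
# Kepler motion, for which the energy below is an exact invariant.
import math

def rhs(s, eps):
    r, v, psi, h = s
    x3 = (1 + r * r - 2 * r * math.cos(psi)) ** 1.5   # (Earth-Jupiter distance)^3
    return (r ** 3 * v,
            h * h - r + eps * (-(r ** 3 / x3) * (r - math.cos(psi))
                               + 2 * math.cos(psi)),
            h * r - r ** 3,
            eps * (r - r ** 4 / x3) * math.sin(psi))

def rk4(s, dT, eps):
    k1 = rhs(s, eps)
    k2 = rhs(tuple(si + dT / 2 * ki for si, ki in zip(s, k1)), eps)
    k3 = rhs(tuple(si + dT / 2 * ki for si, ki in zip(s, k2)), eps)
    k4 = rhs(tuple(si + dT * ki for si, ki in zip(s, k3)), eps)
    return tuple(si + dT / 6 * (a + 2 * b + 2 * c + d)
                 for si, a, b, c, d in zip(s, k1, k2, k3, k4))

def kepler_energy(s):
    r, v, _, h = s
    return 0.5 * v * v + h * h / (2 * r * r) - 1 / r

s = (0.19, 0.5, math.pi / 2, 0.44)       # (r, v, psi, h), a bound orbit
E0 = kepler_energy(s)
for _ in range(20000):
    s = rk4(s, 0.01, 0.0)                # eps = 0: angular momentum h is constant
drift = abs(kepler_energy(s) - E0)
```

Because the rescaled step dT corresponds to a tiny physical step dt = r³ dT near this orbit, the energy drift is far below the tolerances discussed in the text.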
Fig. A.4 Invariant tori in the solution of equation A.10 for ε = 0.001. The plots give intersections of the trajectories in cylindrical polars (r, v, ψ) at the Poincaré section ψ decreasing through π/2, where v = r˙ . The calculations are done at a fixed value of J0 ≈ −3.068, corresponding approximately to the present Earth orbit, in which r0 = 0.19 and h 0 = 0.44, and with initial conditions r0 = 0.19, ψ = π/2, and then a variety of values of v0 between 0.2 and 2.1027 are chosen, with the corresponding value of h 0 chosen as in (A.11). The right side shows a zoom, in which it can be seen that for v0 ≈ 2.0, the behaviour is stochastic; v0 values larger than 2.1027 lead to collision with the sun. The small line segments near r = 0, |v| = 16 are examined more closely in the next figure



Parameter values are ε = 0.001, and initial values have been chosen so that the Earth's orbit is close to the centre of the tori seen in the Poincaré section in Fig. A.4, with r₀ = 0.19 and J₀ ≈ −3.068 when ψ = π/2. Initial values of the angular momentum h₀ are chosen to keep J constant at the value J₀, by using the equation

h = [2r₀²J₀ − r₀²v₀² + 2r₀ + 2εr₀²/√(1 + r₀²)]^(1/2).   (A.11)
A variety of initial values of the velocity ṙ = v are used in the approximate range (0, 2). Initial values of velocity a little larger than 2 lead to collision with the Sun. Values near 2 give thin regions of stochastic behaviour and secondary tori, which are most clearly visible at values of r near zero and values of |v| near 16, as shown in the magnified plots in Fig. A.5.

Fig. A.5 Solutions of the restricted three-body problem with ε = 0.001. These are a series of zoom-ins on the large v and small r secondary tori that become apparent at large magnifications

Fig. A.6 Values of the conserved quantity J versus time, for computed solutions of the restricted three-body problem with ε = 0.04. This is a check on computational accuracy, when relative and absolute tolerances are set to 10⁻⁹. The desired value of J is −3.11, so J + 3.11 is displayed, after multiplying by 10⁵ so that the small changes are visible

Fig. A.7 Solutions of the restricted three-body problem with ε = 0.04, with ψ decreasing through the value π/2. This looks qualitatively the same as Fig. 5.14, which has ψ increasing through π/2

The differential equations are solved using Matlab's ode45 command. Absolute and relative tolerances need to be set to 10⁻¹⁰ or smaller to obtain plots that do not change when the tolerance is further reduced; larger tolerances lead to visibly noisy largest tori and to J values that change appreciably with time. Tolerances of 10⁻⁸ lead to computed J values that vary by a factor of about 3 × 10⁻³ over scaled run times up to T = 10⁸. Tolerances of 10⁻¹⁰ lead to computed values of J that are constant to within a factor of about 3 × 10⁻⁵ over the same times, and regions in the Poincaré section that looked chaotic at looser tolerances change into clean-looking tori and secondary tori at this tighter tolerance. Elapsed times required to solve at these tolerances for long enough to produce the plots in Fig. A.5 and see secondary tori are of the order of forty minutes on a laptop.

Figures 5.14 and 5.15 are computed with the same code as Fig. 5.13, but with different values of ε and different initial values of r and v, and different tolerances. Absolute and relative tolerances for Fig. 5.14 were eventually set to 10⁻⁹. The values of J stayed within 7 × 10⁻⁶ of the desired value J = −3.11 for all but six orbits, as illustrated in Fig. A.6. These six orbits failed to progress beyond times of the order of 1,000; Matlab stopped calculating them because of failure to meet the required tolerances, at times that coincided with the sudden divergence of their J values. Note that Fig. 5.14 was computed by taking a Poincaré section with ψ increasing through the value π/2, while Fig. 5.13 used ψ decreasing through π/2. Figure A.7 shows the Poincaré section with ψ decreasing through the value π/2, which has all of the same qualitative features as Fig. 5.14, just shifted.



Similarly, with ε increased to 0.1 for Fig. 5.15, we found that a more relaxed absolute and relative tolerance of 10⁻⁸ was adequate, and this is the value used to produce the figure; no perceptible difference in the Poincaré section plot is seen when the tolerance is eased further to 10⁻⁷, for which the values of J remain constant to within a factor of 0.001. For this figure, too, we used (arbitrarily) the Poincaré section with ψ increasing through π/2.

Figure 6.1 The file fig6_1a.m generates Fig. 6.1a. It shows the solution of the delay-recruitment equation

εẋ = −x + λx₁(1 − x₁),   λ = 3.8,   x₁ = x(t − 1),

with ε = 0.01 and with initial values in the t-range [−1, 0] linearly interpolated on ten random numbers in the range [0.2, 1]. It uses the delay-differential equation solver dde23, and takes half a day to reach the time range plotted when tolerances are set to 10⁻⁸.

Figure 6.3 This figure is generated by the Matlab file fig6_3.m. The data used to illustrate the periodogram is generated by solving the Lorenz equations

ẋ = −σx + σy,
ẏ = (r − z)x − y,
ż = xy − bz,
with ε = 0.01 and with initial values in the t-range [−1, 0] linearly interpolated on ten random numbers in the range [0.2, 1]. It uses the delay-differential equation solver dde23, and takes half a day to reach the time range plotted when tolerances are set to 10−8 . Figure 6.3 Figure 6.3 is generated by the Matlab file fig6_3.m. The data used to illustrate the periodogram is generated by solving the Lorenz equations x˙ = −σ x + σ y, y˙ = (r − z)x − y, z˙ = x y − bz,


with r = 28, σ = 10, b = 8/3. Initial values used are (x, y, z) = (5, 5, 10), and the relative and absolute tolerances passed to the Matlab routine ode45 are both set to 10⁻⁴. The solution is computed for times up to 100, with time-step sizes varied by Matlab to meet the desired accuracy. The solution for x is then interpolated onto 512 evenly spaced points over this time range, which just barely resolves the most rapid oscillations. Higher resolutions make little visible difference to the periodogram.

The periodogram is generated in three ways, the first two by using Matlab's periodogram command, and the third by directly computing the sum in equation (6.17); this sum is easily calculated directly. The Matlab commands offer one-sided and two-sided periodograms. All three methods give matching results, suggesting that the computations are correct. Matlab computes a 'periodogram power spectral density' that takes the form (after a shift of origin to bring the summation to the same range as in the text)

P_m(F) = (Δt/n) |Σ_{j=1}^{n} x_j e^(−2πiFj)|²,




where Δt is the sampling interval, F = fΔt is a dimensionless frequency measured in cycles per sample, n is the number of samples, i² = −1, and the frequency range is restricted by the Nyquist or folding frequency, that is,

−1/(2Δt) < f < 1/(2Δt).

Another form offered by Matlab for the periodogram power spectral density is a spectral density per radian,

P_m(Ω) = (1/(2πn)) |Σ_{j=1}^{n} x_j e^(−iΩj)|²,   (A.15)

with an angular frequency Ω = ωΔt that is dimensionless, and restricted to the range

−π < Ω < π.


This form follows from the first by substituting Ω = 2πF and dividing the resulting formula by 2πΔt to normalise the power to be per radian. The sign in the exponent can be switched for real time series, since the form is symmetric in frequency about the origin. Matlab calls the above forms 'two-sided' periodograms. Shifting the frequency by adding π puts it in the range (0, 2π), and the Nyquist or folding frequency is then at Ω = π.

A one-sided periodogram reflects the fact that physically there is little difference between a negative frequency and a positive frequency of the same absolute value, unless there is a way to distinguish directions of oscillation. So the power attributed by Matlab to a positive frequency value in a one-sided periodogram is double the power value in a two-sided periodogram, unless the frequency is zero or π.

Equation (A.15) now matches our equation (6.19), which can be rewritten in the form

P(ω) = (Δt/n) |Σ_{j=1}^{n} x_j e^(−iΩj)|²,   (A.17)

with Ω = ωΔt, provided that we multiply P_m(Ω) by 2πΔt to make its amplitude the same as ours. So we compute a one-sided periodogram, with the column vector w of angular frequencies and P_m(Ω) output to the variable pxx, by the Matlab command

[pxx, w] = periodogram(x, [], 512);   (A.18)

The default interpretation by Matlab is to produce a one-sided periodogram because x is a real-valued data series. Matlab doubles the power at each frequency except the first and last, so for comparison we also double the first and last, and then we



Fig. A.8 Input data x from solving the Lorenz equations, and the resulting periodogram power spectral densities P. Black circles are from using the one-sided periodogram, blue plusses the two-sided periodogram and the red lines denote the result of a direct calculation as in the text

halve the power conversion by plotting πΔt·pxx against w/Δt as black circles in Fig. A.8. The empty square brackets in equation (A.18) result in Matlab using the default square window for the data. We also compute, for comparison, a two-sided periodogram with the command

[pxx2, w2] = periodogram(x, [], 512, 'twosided');


and we plot 2πΔt·pxx2 against w2/Δt as blue plus symbols in Fig. A.8. We compute equation (A.15) directly for comparison, as follows: we set jnumb to be the row vector (1 … n), where n is the number of data points of x. We matrix multiply the column vector jnumb' by the row vector of angular frequencies w' to produce the desired exponents in the matrix OMj defined by

OMj = ⎛  Ω₁    Ω₂   ⋯   Ωₙ ⎞
      ⎜ 2Ω₁   2Ω₂   ⋯  2Ωₙ ⎟
      ⎜  ⋮     ⋮        ⋮  ⎟
      ⎝ nΩ₁   nΩ₂   ⋯  nΩₙ ⎠ .


We take the matrix product of the row vector x with the matrix exp(i·OMj), which gives the desired sum inside the absolute value signs in equation (A.17) as a row vector, one sum for each frequency value. The magnitude of each element of this row vector is squared, and multiplied by Δt/n to produce a row vector of P(ω) values, graphed as a red line in Fig. A.8. We see that all three computations produce the same result.
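The direct calculation is equally simple outside Matlab. Below is a pure-Python version of equation (A.17), applied to a test sinusoid of our own (not the Lorenz data) whose frequency sits exactly on the natural frequency grid:

```python
# Direct evaluation of P(omega) = (dt/n) |sum_{j=1}^n x_j e^{-i Omega j}|^2 on
# the grid omega_k = 2 pi k/(n dt), for a test sinusoid. Names are ours.
import cmath, math

dt = 0.1
n = 512
f0 = 25 / (n * dt)                       # 25th grid frequency, in cycles per time
x = [math.sin(2 * math.pi * f0 * j * dt) for j in range(1, n + 1)]

def P(omega):
    Om = omega * dt                      # dimensionless angular frequency
    s = sum(xj * cmath.exp(-1j * Om * j) for j, xj in enumerate(x, start=1))
    return dt / n * abs(s) ** 2

omegas = [2 * math.pi * k / (n * dt) for k in range(n // 2 + 1)]  # one-sided grid
power = [P(w) for w in omegas]
k_peak = max(range(len(power)), key=power.__getitem__)
f_peak = omegas[k_peak] / (2 * math.pi)
```

The power is concentrated in the single bin containing the input frequency, with the remaining bins at floating-point noise level.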



Fig. A.9 The periodogram power spectral densities P as in the previous figure, but with a higher frequency resolution version for comparison. A disc symbol is placed at each computed point. The higher resolution version uses 2¹⁴ frequency points, whereas the lower resolution version uses 2⁹ frequency points when evaluating P. The same 2⁹ data points were used in both cases

As noted in the caption to Fig. 6.3, the choice of 512 ω values is a natural match to the number of data values. Taking a higher resolution in ω, as illustrated in Fig. A.9, does give slightly more information: no new peaks are revealed, and power values match the lower resolution results at matching frequencies, but existing peaks are more finely delineated, giving better estimates of their amplitudes and frequencies by homing in on the correct frequency values. Matlab has a 'reassigned' option that does the same job; it hunts for better frequency locations where power is locally maximised.

Figures 6.6 and 6.7 The file fig6_6.m generates Fig. 6.6, using the singular value decomposition (SVD) of a component of the solution to the Lorenz equations,

ẋ = −σx + σy,  ẏ = (r − z)x − y,  ż = xy − bz,


with r = 28, σ = 10, b = 8/3. Initial values used are (x, y, z) = (5, 5, 10), and the relative and absolute tolerances passed to the Matlab routine ode45 are both set to 10⁻⁴. The solution is run until t = 2 to ensure it is on the strange attractor. It is then continued for a further N = 10,000 time steps of size 0.01 to create a relatively noise-free time series from the x-component of the solution. White noise is added by using the randn function, which returns a sample of random numbers drawn from a normal distribution with mean zero and variance one. Each term in the sample is then multiplied by four to make the variance 16, before adding it to the noise-free time series as illustrated in Fig. A.10. The singular values are computed as follows. The noisy sampled signal x is put into a matrix M, with the first row being x(1), x(2), …, x(d_E), the second row being x(2), x(3), …, x(1 + d_E), etc., for a total of N − d_E + 1 rows and d_E columns in the matrix M. Usually one would remove the mean of each column so that the new column has mean zero, but the nature of the signal, the size of d_E used, and the total



Fig. A.10 The data from the x-component of a solution to the Lorenz equations, before and after white noise of variance 16 is added

signal length used, give a column mean that is already very close to zero. Then the command svals = svd(M)/sqrt(N) gives the singular values σᵢ normalised by √N. These are plotted in Fig. 6.6, and are seen to rapidly reach the noise floor near the value 4, the standard deviation of the added white noise. The SVD is then used to filter the noisy signal and produce Fig. 6.7, in the Matlab file fig6_7.m, as follows. The SVD is computed with the command [U, S, V] = svd(M, 0),


so that the singular values are output into the diagonal matrix S = Σ, and the matrices U and V are such that

$$M = U\Sigma V^{T}. \qquad (A.23)$$

The zero argument to the SVD command tells Matlab to compute an economy-sized SVD, for which the S matrix is square with side of length d_E, and only the first d_E columns of U are computed. Then a filtered matrix is computed as

$$M_F = U\Sigma_F V^{T},$$


where the filtered matrix Σ_F is the matrix S with the neglected singular values set to zero. Any column of the filtered matrix can be used as a filtered time series signal xi1. We use the first column, called the forward projection, concatenated with the remainder of the bottom row, so that the data set is not reduced in size upon filtering. To obtain a second filtered time series, this process is repeated, using xi1 as the data set, to generate xi2, and a third filtered time series can be obtained using xi2 as

Appendix: Numerical Notes for Figures


the data set to generate xi3; evidently this process can be repeated, but it loses its efficacy after one or two iterations.

Figure 6.9 The file fig6_9.m generates the black spleenwort fern shown in Fig. 6.9. It uses the four transformations x → wᵢ, i = 1, …, 4, defined by

wᵢ = Aᵢx + Tᵢ,  x ∈ ℝ²,


where

$$A_1 = \begin{pmatrix} 0 & 0 \\ 0 & 0.16 \end{pmatrix}, \quad A_2 = \begin{pmatrix} 0.85 & 0.04 \\ -0.04 & 0.85 \end{pmatrix}, \quad A_3 = \begin{pmatrix} 0.2 & -0.26 \\ 0.23 & 0.22 \end{pmatrix}, \quad A_4 = \begin{pmatrix} -0.15 & 0.28 \\ 0.26 & 0.24 \end{pmatrix},$$

$$T_1 = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \quad T_2 = \begin{pmatrix} 0 \\ 1.6 \end{pmatrix}, \quad T_3 = \begin{pmatrix} 0 \\ 1.6 \end{pmatrix}, \quad T_4 = \begin{pmatrix} 0 \\ 0.44 \end{pmatrix}, \qquad (A.26)$$

as given by Barnsley (1988) (or in the second edition of 1993). The values in (A.26) are consistent with those in (6.121) and Table 6.1. The random iteration algorithm, also to be found in Michael Barnsley's book, starts with an initial point x0 ∈ ℝ², and recursively and independently generates new points in ℝ²,

xn ∈ {w1(xn−1), w2(xn−1), w3(xn−1), w4(xn−1)},   (A.27)

where the probability of choosing the event xn = wi(xn−1) is pi, and (p1, p2, p3, p4) = (0.01, 0.85, 0.07, 0.07).
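As a minimal sketch of the random iteration algorithm, in Python with NumPy rather than Matlab (the function name fern_points and the point counts are illustrative choices, not those of fig6_9.m):

```python
import numpy as np

# The four affine maps w_i(x) = A_i x + T_i of (A.26), with the
# probabilities (p1, p2, p3, p4) = (0.01, 0.85, 0.07, 0.07).
A = [np.array([[0.0, 0.0], [0.0, 0.16]]),
     np.array([[0.85, 0.04], [-0.04, 0.85]]),
     np.array([[0.2, -0.26], [0.23, 0.22]]),
     np.array([[-0.15, 0.28], [0.26, 0.24]])]
T = [np.array([0.0, 0.0]), np.array([0.0, 1.6]),
     np.array([0.0, 1.6]), np.array([0.0, 0.44])]
cum_p = np.cumsum([0.01, 0.85, 0.07, 0.07])

def fern_points(n_points, n_skip=10, seed=0):
    """Random iteration: choose w_i with probability p_i at each step."""
    rng = np.random.default_rng(seed)
    x = np.zeros(2)
    pts = []
    for k in range(n_skip + n_points):
        # Cumulative-probability selection, as with Matlab's rand.
        i = min(int(np.searchsorted(cum_p, rng.random())), 3)
        x = A[i] @ x + T[i]
        if k >= n_skip:                  # skip the initial transients
            pts.append(x)
    return np.array(pts)

pts = fern_points(20000)
```

Plotting a dot at each row of pts reproduces the fern shape; the book uses 500,000 points for the full-resolution image.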


This leads to a sequence of points xn. We skip the first ten points, and then plot a dot for each of the next 500,000 points to generate the black spleenwort fern image. The probabilities are imposed by using the Matlab command rand, which returns a single uniformly distributed random number in the range [0, 1]. The number returned is checked to see if it is less than p1; if so, w1 is chosen. Otherwise, if it is less than p1 + p2, w2 is chosen; otherwise, if it is less than p1 + p2 + p3, w3 is chosen; otherwise, w4 is chosen.

Figure 6.10 The file fig6_10.m generates Fig. 6.10, the Julia set J(f) for the map z → f(z) = z² − μ, with complex z and μ = 0.999 − 0.251i. The Julia set is (amongst other things) the boundary of the set of points which are mapped to ∞ under iteration of f, and is an invariant set under the map; the dynamics of f on J(f) are chaotic. The Julia set may be found by iterating the inverse of the function, starting from an unstable fixed point of the map. This approach is based on the observation that the Julia set is invariant under the map and under its inverse, and that the inverse is not subject to rapid growth in the magnitudes of errors. However, any given point has two inverse images, and hence as one iterates backwards the number of points increases exponentially fast, eventually exceeding the limitations of computer memory.
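In Python, inverse iteration with a randomly chosen square-root branch can be sketched as follows; the function name and iteration counts are illustrative:

```python
import numpy as np

def julia_random_inverse(mu, n_points=5000, n_skip=20, seed=1):
    """Approximate the Julia set of f(z) = z^2 - mu by iterating the
    inverse map z -> +/- sqrt(z + mu), choosing a branch at random."""
    rng = np.random.default_rng(seed)
    # Start from the unstable fixed point z = (1 + sqrt(1 + 4 mu))/2.
    z = (1 + np.sqrt(1 + 4 * mu + 0j)) / 2
    pts = []
    for _ in range(n_points + n_skip):
        w = np.sqrt(z + mu)        # one inverse image, since w^2 - mu = z
        if rng.random() < 0.5:     # pick one of the two branches at random
            w = -w
        z = w
        pts.append(z)
    return np.array(pts[n_skip:])  # discard the initial transients

mu = 0.999 - 0.251j
pts = julia_random_inverse(mu)
```

Every retained point is an exact inverse iterate, so consecutive points satisfy f(z_{k+1}) = z_k, and all of them remain bounded on the Julia set.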



To cope with the rapid growth of the number of points to be iterated, one approach is to randomly choose just one of the two branches each time an iteration is made. This method works quite well, but there is a method that we find gives better results: we follow the algorithm used in the computer algebra code Maxima. The Maxima code was written by Adam Majewski, and is termed the modified inverse iteration method (MIIM). The work described by Majewski is issued under a variety of Creative Commons Attributions, and we are grateful to him for permission to use his algorithmic ideas in our Matlab code version of MIIM. More information on MIIM may be found at complex_plane/q-iterations.

A brief summary of the MIIM as implemented in our code follows. We count the number of hits on pixels by each of the two points generated when we iterate. That is, we set up a grid, which we might visualise as a grid of pixels at some resolution, overlying the plotting area. We identify which pixel is hit or landed on by the current iterate, and we keep count of the number of times each pixel has been landed on. We start with an unstable fixed point, choosing the point (1 + √(1 + 4μ))/2, and we iterate it on the inverse map. Note that this gives two inverses, one of which is the fixed point itself. We could alternatively start with any unstable periodic point; starting with a point on an unstable two-cycle gives essentially the same image of the Julia set.

(*) At each inverse iteration, we produce two inverse iterates. We seek to save each iterate to a stack, provided a hit count criterion is met. The stack is used as the source of points to be iterated again. We call the save a push onto the stack, and visualise it as pushing the iterate into the bottom of the stack, thereby pushing upwards all points already in the stack. When a point is iterated it is taken from the bottom of the stack in a pop operation, so that the other points in the stack fall down one place.
Points are saved to the stack at full resolution (16 digits), so that they can be iterated on later. The criterion for whether an iterate is pushed to the stack or discarded after iteration is based on the hit count: we push an iterate to the stack for further iteration if the pixel hit count for that iterate is less than some maximum value; otherwise we discard the iterate. We also separately save the iterate for plotting as a full resolution dot if and only if the hit count is one, so that the pixel the point has landed on has just been hit for the first time. After dealing with each of the two inverse iterates by pushing or discarding, and possibly saving for plotting, we pop the bottom point off the stack to iterate on, and we repeat from (*) until there are no more points in the stack. The number of hits on each visited pixel increases towards the maximum value, and when enough pixels are at the saturation hit count, the stack size begins to drop, because iterates visiting a saturated pixel are discarded. Eventually the stack is emptied and iteration stops. Saved points are fully accurate for plotting.
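Our reading of this summary can be sketched in Python; the grid size and maximum hit count below are illustrative assumptions, much smaller than those used for Fig. 6.10, and a plain last-in-first-out stack is used in place of the push-to-bottom/pop-from-bottom description:

```python
import numpy as np

def julia_miim(mu, half_width=2.0, nx=200, ny=200, max_hits=15):
    """Modified inverse iteration for the Julia set of f(z) = z^2 - mu.
    Branches are pruned once the pixel an iterate lands on has been
    hit max_hits times; first hits are saved for plotting."""
    hits = np.zeros((ny, nx), dtype=int)
    plotted = []

    def pixel(z):
        # Map a point of [-half_width, half_width]^2 to grid indices.
        ix = int((z.real + half_width) / (2 * half_width) * (nx - 1))
        iy = int((z.imag + half_width) / (2 * half_width) * (ny - 1))
        return min(max(iy, 0), ny - 1), min(max(ix, 0), nx - 1)

    # Start from the unstable fixed point (1 + sqrt(1 + 4 mu))/2.
    stack = [(1 + np.sqrt(1 + 4 * mu + 0j)) / 2]
    while stack:
        z = stack.pop()                  # plain LIFO stack, for simplicity
        w = np.sqrt(z + mu)
        for u in (w, -w):                # the two inverse images of z
            p = pixel(u)
            hits[p] += 1
            if hits[p] == 1:             # first visit: save for plotting
                plotted.append(u)
            if hits[p] < max_hits:       # hit-count criterion for pushing
                stack.append(u)
    return np.array(plotted), hits

pts, hits = julia_miim(0.999 - 0.251j)
```

Termination is guaranteed because a pixel can be pushed at most max_hits times, so the stack must eventually empty once the visited pixels saturate.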



Fig. A.11 The Julia set computed for the map z 2 − μ, with complex z and μ = 0.999 − 0.251i

This process avoids the known issue of the number of inverse points doubling at each iteration. We have written Majewski's MIIM algorithm in Matlab code to generate Fig. 6.10. If the pixel resolution is 2000 by 1000, it takes about fourteen minutes to run when the maximum number of hits per pixel is set to 20; the run time is much less for coarser grids. About 74,500 points were plotted to produce our Fig. A.11, and the stack reached a maximum size of nearly 128,000 points during iteration.

Figure 6.11 The file fig6_11.m generates Fig. 6.11, the Julia sets for the map z² − μ, with complex z and μ = 0.3, 0.618, 0.9, 1.2, 1.5, 1.8; the sets gradually collapse. The method is MIIM, the same as described above for Fig. 6.10.

Figure 6.12 The file fig6_12.m generates the Mandelbrot set plotted in Fig. 6.12. The complex quadratic map z² − μ is used, and the values of μ for which the origin is not mapped to infinity are sought. We use an escape time algorithm, together with a transformation that focusses the region of rapid colour changes more sharply onto the more interesting features of the set. We set an upper bound on the distance of iterates from the origin, at about 4. Iterates of the origin that exceed this distance within a maximum of 10,000 iterations are deemed to have escaped. We record the number of iterations C_old required for (0, 0) to escape, for each value of μ considered; this number is used to colour the pixel for that value of μ. The 'hot' colourmap built into Matlab is used, with 10,000 colours to reduce the banding that is seen when the escape time is integer valued. Before using the colourmap, the number of iterates is transformed using two arctan functions to focus the regions where the colours change onto the boundary between trapped and escaping values of μ. The transformation used is



$$C_{\rm new} = 0.8\left[1 + \frac{2}{\pi}\tan^{-1}\!\left(\frac{C_{\rm old} - 0.0008}{0.0032}\right)\right] + 0.2\left[1 + \frac{2}{\pi}\tan^{-1}\!\left(\frac{C_{\rm old} - 0.7}{0.5}\right)\right], \qquad (A.29)$$

which is graphed in Fig. A.12, and focusses the most rapid change in colour onto a region near the original colour value of 0.0008. The resolution of the output image is 10,000 by 10,000 pixels; this takes about 12 hours to run on a 2017 Macbook laptop. Points inside the basins of attraction, which would normally be coloured white because they never escape, are set to black for better contrast at the boundary.

Fig. A.12 The transformation used to move the colour changes to focus on the edge of the inside or black region of the Mandelbrot set. The second plot is a closer view
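A Python sketch of the escape-time count and the two-arctan recolouring; the normalisation of C_old by the maximum iteration count before applying the recolouring is our assumption:

```python
import numpy as np

def escape_time(mu, max_iter=10000, bound=4.0):
    """Escape-time count for z -> z^2 - mu iterated from the origin;
    returns max_iter if the orbit never leaves |z| <= bound."""
    z = 0.0 + 0.0j
    for k in range(1, max_iter + 1):
        z = z * z - mu
        if abs(z) > bound:
            return k
    return max_iter

def recolour(c_old, max_iter=10000):
    """The two-arctan recolouring of (A.29), applied to the count
    normalised by max_iter (the normalisation is an assumption)."""
    c = c_old / max_iter
    return (0.8 * (1 + (2 / np.pi) * np.arctan((c - 0.0008) / 0.0032))
            + 0.2 * (1 + (2 / np.pi) * np.arctan((c - 0.7) / 0.5)))

# mu = -1 escapes quickly; mu = 0.25 gives a bounded orbit and never escapes.
fast = escape_time(-1.0, max_iter=200)
slow = escape_time(0.25, max_iter=200)
```

The recolouring is monotonic with values in (0, 2), steepest near the normalised count 0.0008, which is what concentrates the colour changes at the set's boundary.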

Exercise A.1  x ∈ ℝ³ satisfies the linear system

$$\dot{\mathbf{x}} = A\mathbf{x}, \qquad (*)$$

where the matrix A is real. The complex matrix P = (v₊, v₋, v₃), with v₋ = v̄₊ and v₃ real, diagonalises A, thus AP = PD, where D = diag(λ₊, λ₋, μ), with λ₋ = λ̄₊, μ < 0 real, and Re λ₊ > 0. Show that the transformation x = Py transforms (∗) to the diagonal system ẏ = Dy.

However, it is preferable to make a real transformation. To do this, show that PR = T, where T = (Re v₋, Im v₋, v₃) = (v₁, v₂, v₃), and the matrix R is

$$R = \begin{pmatrix} \tfrac{1}{2} & \tfrac{1}{2}i & 0 \\[2pt] \tfrac{1}{2} & -\tfrac{1}{2}i & 0 \\[2pt] 0 & 0 & 1 \end{pmatrix};$$

hence show that the further transformation y = Rz (i.e., x = Tz) transforms the linear system (∗) to the form

$$\dot{\mathbf{z}} = R^{-1}DR\,\mathbf{z} = \begin{pmatrix} \lambda_R & -\lambda_I & 0 \\ \lambda_I & \lambda_R & 0 \\ 0 & 0 & \mu \end{pmatrix}\mathbf{z}, \qquad \lambda_+ = \lambda_R + i\lambda_I.$$

Show that the tangent plane to the unstable manifold at the origin is given by z·k = 0 (using basis vectors i, j, k), or kᵀT⁻¹x = 0, and show that this implies

$$\begin{vmatrix} x & v_{11} & v_{21} \\ y & v_{12} & v_{22} \\ z & v_{13} & v_{23} \end{vmatrix} = 0, \qquad \mathbf{x} = (x, y, z), \quad v_{ij} = (v_i)_j;$$

hence deduce that the spanning vectors for the unstable tangent plane at the origin are v₁ and v₂. By considering as an example v₊ = (0.7, −0.06 − 0.7i, 0)ᵀ, v₃ = (0.1, 0, 1)ᵀ, show that T is a right-handed transformation.


References

Alligood, K. T., T. D. Sauer and J. A. Yorke 1997 Chaos: an introduction to dynamical systems. Springer–Verlag, New York. Apostol, T. M. 1957 Mathematical analysis. Addison–Wesley, Reading, Massachusetts. Argyris, J., G. Faust, M. Haase and R. Friedrich 2015 An exploration of dynamical systems and chaos, 2nd ed. Springer–Verlag, Berlin. Arnold, V. I. 1963 Small denominators and problems of stability of motion in classical and celestial mechanics. Russ. Math. Surv. 18 (6), 85–191. [Reprinted in Mackay and Meiss (1987), p. 260.] Arnold, V. I. 1973 Ordinary differential equations. M. I. T. Press, Cambridge, Massachusetts. Arnold, V. I. 1978 Mathematical methods of classical mechanics. Springer–Verlag, Berlin. Arnold, V. I. 1983 Geometric methods in ordinary differential equations. Springer–Verlag, Berlin. Baesens, C., J. Guckenheimer, S. Kim and R. S. Mackay 1991 Three coupled oscillators: mode-locking, global bifurcations and toroidal chaos. Physica 49D, 387–475. Barreira, L. and C. Valls 2012 Dynamical systems: an introduction. Springer–Verlag, London. Barrio, R., F. Blesa, A. Denac and S. Serrano 2011 Qualitative and numerical analysis of the Rössler model: bifurcations of equilibria. Comput. Math. Applic. 62, 4,140–4,150. Barrow-Green, J. 1997 Poincaré and the three body problem. American Mathematical Society, Providence, Rhode Island. Barkley, D. 2016 Theoretical perspective on the route to turbulence in a pipe. J. Fluid Mech. 803, P1. Barnsley, M. 1988 Fractals everywhere. Academic Press, San Diego. (Second edition, 1993.) Beck, M., J. Knobloch, D. J. B. Lloyd, B. Sandstede and T. Wagenknecht 2009 Snakes, ladders, and isolas of localized patterns. SIAM J. Math. Anal. 41 (3), 936–972. Bell, D. C. and B. Deng 2002 Singular perturbation of n-front travelling waves in the FitzHugh–Nagumo equations. Nonlinear analysis: real world applications 3 (4), 515–541. Benney, D. J. and R. F. Bergeron 1969 A new class of nonlinear waves in parallel flows. Stud. Appl. Math.
48, 181–204. Bergé, P., Y. Pomeau and Ch. Vidal 1986 Order within Chaos. John Wiley and sons, Chichester. di Bernardo, M., C. J. Budd, A. R. Champneys and P. Kowalczyk 2008 Piecewise-smooth dynamical systems; theory and applications. Springer-Verlag, London. Box, G. E. P. and G. M. Jenkins 1970 Time series analysis. Holden-Day, San Francisco. Broer, H., H. Hanßmann and F. Wagener 2018 Persistence properties of normally hyperbolic tori. Regular and Chaotic Dynamics 23 (2), 212–225. Carr, J. 1981 Applications of centre manifold theory. Springer-Verlag, New York. Chini, G. P., B. Montemuro, C. M. White and J. Klewicki 2017 A self-sustaining process model of inertial layer dynamics in high Reynolds number turbulent wall flows. Phil. Trans. R. Soc. A375, 20160090. © Springer Nature Switzerland AG 2019 A. Fowler and M. McGuinness, Chaos,




Coddington, E. A. and N. Levinson 1955 Theory of ordinary differential equations. McGraw-Hill, New York. Constantin, P., C. Foias, O. P. Manley and R. Temam 1985 Determining modes and fractal dimension of turbulent flows. J. Fluid Mech. 150, 427–440. Contopoulos, G. 2002 Order and chaos in dynamical astronomy. Springer-Verlag, Berlin. (Corrected second printing, 2004.) Coullet, P., E. Risler and N. Vandenberghe 2002 Spatial unfolding of homoclinic bifurcations. In: Nonlinear PDEs in condensed matter and reactive flows, eds. H. Berestycki and Y. Pomeau, pp. 399–412. Kluwer, Dordrecht. Cvitanović, P. 1984 (ed.) Universality in Chaos. Adam Hilger, Bristol. Diacu, F. and P. Holmes 1996 Celestial encounters: the origins of chaos and stability. Princeton University Press, Princeton, New Jersey. Darbyshire, A. G. and T. Mullin 1995 Transition to turbulence in constant-mass-flux pipe flow. J. Fluid Mech. 289, 83–114. Deguchi, K. and P. Hall 2014 Free-stream coherent structures in parallel boundary-layer flows. J. Fluid Mech. 752, 602–625. Devaney, R. L. 1986 An introduction to chaotic dynamical systems. Benjamin Cummings, Menlo Park, CA. (Second edition 2003, Westview Press, Cambridge, Massachusetts.) Diggle, P. J. 1990 Time series: a biostatistical introduction. Clarendon Press, Oxford. Drazin, P. G. 1992 Nonlinear systems. C.U.P., Cambridge. Drysdale, D. M. 1994 Homoclinic bifurcations. D. Phil. thesis, Oxford University. Duguet, Y., A. P. Willis and R. R. Kerswell 2008 Transition in pipe flow: the saddle structure on the boundary of turbulence. J. Fluid Mech. 613, 255–274. Eckhardt, B., T. M. Schneider, B. Hof and J. Westerweel 2007 Turbulence transition in pipe flow. Ann. Rev. Fluid Mech. 39, 447–468. Eckmann, J.-P. and D. Ruelle 1985 Ergodic theory of chaos and strange attractors. Rev. Mod. Phys. 57, 617–656. Eckmann, J.-P. and P. Wittwer 1987 A complete proof of the Feigenbaum conjectures. J. Stat. Phys. 46, 455–475. Editorial 2016 Big whorls, little whorls.
Nature Physics 12, 197. Faisst, H. and B. Eckhardt 2003 Traveling waves in pipe flow. Phys. Rev. Lett. 91, 224502. Falconer, K. 2003 Fractal geometry: mathematical foundations and applications, 2nd ed. John Wiley and sons, Chichester. Feigenbaum, M. J. 1978 Quantitative universality for a class of nonlinear transformations. J. Stat. Phys. 19, 25–52. Feigenbaum, M. J. 1979 The universal metric properties of nonlinear transformations. J. Stat. Phys. 21, 669–706. Foias, C., G. R. Sell and R. Temam 1988 Inertial manifolds for nonlinear evolutionary equations. J. Diff. Eqs. 73, 309–353. Fowler, A. C. 1984 Analysis of the Lorenz equations for large r . Stud. Appl. Math. 70, 215–233. Fowler, A. C. 1990a Homoclinic bifurcations in n dimensions. Stud. Appl. Math. 83, 193–209. Fowler, A. C. 1990b Homoclinic bifurcations for partial differential equations in unbounded domains. Stud. Appl. Math. 83, 329–353. Fowler, A. C. 1992 Convection and chaos. In: Chaotic processes in the geological sciences, ed. David A. Yuen, Springer-Verlag, pp. 43–69. Fowler, A. C., J. D. Gibbon and M. J. McGuinness 1982 The complex Lorenz equations. Physica 4D, 139–163. Fowler, A. C. and P. D. Howell 2003 Intermittency in the transition to turbulence. SIAM J. Appl. Math. 63, 1,184–1,207. Fowler, A. C. and G. Kember 1993 Delay recognition in chaotic time series. Phys. Letts. A 175, 402–408. Fowler, A. C. and G. Kember 1998 Singular systems analysis as a moving-window spectral method. Euro. J. Appl. Math. 9, 55–79.



Fowler, A. C. and M. J. McGuinness 1982 A description of the Lorenz attractor at high Prandtl number. Physica 5D, 149–182. Fowler, A. C. and M. J. McGuinness 1983 Hysteresis, period doubling and intermittency at high Prandtl number in the Lorenz equations. Stud. Appl. Math. 69, 99–126. Fowler, A. C. and C. T. Sparrow 1991 Bifocal homoclinic orbits in four dimensions. Nonlinearity 4, 1,159–1,182. Frederickson, P., J. L. Kaplan, E. D. Yorke and J. A. Yorke 1983 The Liapunov dimension of strange attractors. J. Diff. Eqs. 49, 185–207. Gardiner, C. 2009 Stochastic methods: a handbook for the natural and social sciences, 4th ed. Springer-Verlag, Berlin. Gaspard, P. 1984 Generation of a countable set of homoclinic flows through bifurcation in multidimensional systems. Bull. Sci. Acad. Roy. Belgique 70, 61–83. Gleick, J. 1988 Chaos: making a new science. Sphere Books, London. Glendinning, P. and C. Sparrow 1984 Local and global behaviour near homoclinic orbits. J. Stat. Phys. 35, 645–696. Gneiting, T. and A. E. Raftery 2005 Weather forecasting with ensemble methods. Science 310 (5,746), 248–249. Goldstein, H. 1950 Classical mechanics. Addison-Wesley, Reading, Massachusetts. Golub, G. H. and C. F. van Loan 1989 Matrix computations, 2nd ed. Johns Hopkins University Press, Baltimore. Gollub, J. P. and H. L. Swinney 1975 Onset of turbulence in a rotating fluid. Phys. Rev. Letts. 35 (14), 927–930. Goluskin, D. 2018 Bounding averages rigorously using semidefinite programming: mean moments of the Lorenz system. J. Nonlin. Sci. 28 (2), 621–651. Grebogi, C., E. Ott and J. A. Yorke 1983 Are three-frequency quasi-periodic orbits to be expected in typical nonlinear dynamical systems? Phys. Rev. Letts. 51, 339–342. Greene, J. M. 1979 A method for determining a stochastic transition. J. Math. Phys. 20, 1,183–1,201. Gröger, M. and B. R. Hunt 2013 Coupled skinny baker’s maps and the Kaplan-Yorke conjecture. Nonlinearity 26, 2,641–2,667. Guckenheimer, J. and P. 
Holmes 1983 Nonlinear oscillations, dynamical systems and bifurcations of vector fields. Springer-Verlag, Berlin. Haller, G. and S. Wiggins 1995 Multi-pulse jumping orbits and homoclinic trees in a modal truncation of the damped-forced nonlinear Schrödinger equation. Physica 85D, 311–347. Hao Bao-Lin 1984 Chaos. World Scientific, Singapore. Hao Bao-Lin 1989 Elementary symbolic dynamics and chaos in dissipative systems. World Scientific, Singapore. Hao Bao-Lin 1990 (ed.) Chaos II. World Scientific, Singapore. Hardy, G. H. 1916 Weierstrass’s non-differentiable function. Trans. Amer. Math. Soc. 17 (3), 301– 325. Hartman, P. 1982 Ordinary differential equations (2nd ed.). Birkhäuser, Secaucus, N. J. Hassard, B., N. Kazarinoff and Y. Wan 1981 Theory and application of Hopf bifurcation. C.U.P., Cambridge. Hénon, M. 1969 Numerical study of quadratic area-preserving mappings. Quart. Appl. Math. 27, 291–312. Hénon, M. 1976 A two-dimensional mapping with a strange attractor. Commun. Math. Phys. 50, 69–77. Hénon, M. and C. Heiles 1964 The applicability of the third integral of motion: some numerical experiments. Astron. J. 69, 73–79. Hoppensteadt, F. C. and W. L. Miranker 1977 Multitime methods for systems of difference equations. Stud. Appl. Math. 56, 273–289. Howard, L. N. 1966 Convection at high Rayleigh number. Proc. 11th Int. Cong. Appl. Mech., ed. H. Görtler, pp. 1109–1115. Springer-Verlag, Berlin.



Hyman, J. and B. Nicolaenko 1986 Order and complexity in the Kuramoto-Sivashinsky model of weakly turbulent interfaces. Physica 23D, 265–292. Johnsen, J. 2010 Simple proofs of nowhere-differentiability for Weierstrass’s function and cases of slow growth. J. Fourier Anal. Appl. 16, 17–33. Kaplan, J. and J. Yorke 1979 Chaotic behavior of multidimensional difference equations. In: Peitgen, H. O. and H. O. Walther, eds., Functional differential equations and the approximation of fixed points. Lecture Notes in Mathematics 730, pp. 204–227, Springer–Verlag, Berlin. Kelley, A. 1967 The stable, center stable, center, center unstable and unstable manifolds. J. Diff. Eqns. 3, 546–570. Kember, G. and A. C. Fowler 1993 A correlation function for choosing time delays in phase portrait reconstructions. Phys. Letts. A 179, 72–80. Kevorkian, J. and J. D. Cole 1981 Perturbation methods in applied mathematics. Springer-Verlag, Berlin. Kirkwood, D. 1867 On the theory of meteors. Proc. Amer. Assoc. Adv. Sci. 15, 8–14. (Proceedings of the 15th meeting of the Association, Buffalo, New York, August 1866.) Kolmogorov, A. N. 1954 Preservation of conditionally periodic movements with small change in the Hamiltonian function. Dokl. Akad. Nauk. SSSR 98, 527–530. [Translation appears in Hao Bao-Lin (1984), p. 81.] Koon, W. S., M. W. Lo, J. E. Marsden and S. D. Ross 2011 Dynamical systems, the three-body problem and space mission design. Marsden Books, ISBN 978-0-615-24095-4. http://www.shaneross. com/books/ Krapivsky, P. L., S. Redner and E. Ben-Naim 2010 A kinetic view of statistical physics. C.U.P., Cambridge. Kuznetsov, Y. A. 2004 Elements of applied bifurcation theory, 3rd ed. Springer-Verlag, New York. Lanford, O. E. 1982 A computer-assisted proof of the Feigenbaum conjectures. Bull. Amer. Math. Soc. 6, 427–434. Libchaber, A. 1983 Experimental aspects of the period doubling scenario. In: Dynamical systems and chaos, ed. L. Garrido, pp. 157–164; Lecture notes in physics, vol. 
179, Springer–Verlag, Berlin. Lichtenberg, A. J. and M. Lieberman 1983 Regular and stochastic motion. Springer-Verlag, Berlin. Lord, G. J., A. R. Champneys and G. W. Hunt 1999 Computation of homoclinic orbits in partial differential equations: an application to cylindrical shell buckling. SIAM J. Sci. Comput. 21 (2), 591–619. Lorenz, E. N. 1963 Deterministic nonperiodic flow. J. Atmos. Sci. 20, 130–141. Lorenz, E. N. 1993 The essence of chaos. UCL Press, London. Lynch, P. 2006 The emergence of numerical weather prediction: Richardson’s dream. C.U.P., Cambridge. Mackay, R. S. and J. D. Meiss 1987 (eds.) Hamiltonian dynamical systems. Adam Hilger, Bristol. Malamud, B. D., D. L. Turcotte, F. Guzzetti and P. Reichenbach 2004 Landslide inventories and their statistical properties. Earth Surf. Process. Landforms 29, 687–711. Malkus, W. V. R. and G. Veronis 1958 Finite amplitude cellular convection. J. Fluid Mech. 4, 225–260. Marsden, J.E. and M.J. McCracken 1976 The Hopf bifurcation and its applications. Springer-Verlag, Berlin. McGuinness, M. J. 1983 The fractal dimension of the Lorenz attractor. Phys. Letts. A 99, 5–9. Meiss, J. D. 2015 Thirty years of turnstiles and transport. Chaos 25, 097602. Milnor, J. W. 2006 Dynamics in one complex variable, 3rd ed. Princeton University Press, Princeton. Mori, H. 1980 Fractal dimensions of chaotic flows of autonomous dissipative systems. Prog. Theor. Phys. 63, 1,044–1,047. Moser, J. K. 1962 On invariant curves of area-preserving mappings of an annulus. Nachr. Akad. Wiss. Göttingen II, 1–20. Moser, J. K. 1973 Stable and random motions in dynamical systems. Princeton University Press, Princeton, New Jersey.



Neimark, J. I. and L. P. Shilnikov 1965 A case of generation of periodic motions. Sov. Math. Dokl. 6, 305–309. Newhouse, S. E., D. Ruelle and F. Takens 1978 Occurrence of strange axiom A attractors near quasi periodic flows on T m , m ≥ 3. Commun. Math. Phys. 64, 35–40. Nichols, J. M., M. D. Todd, M. Seaver, S. T. Trickey, L. M. Pecora and L. Moniz 2003 Controlling system dimension: a class of real systems that obey the Kaplan-Yorke conjecture. Proc. Nat. Acad. Sci. 100 (26), 15,299–15,303. Orr, W. M’F. 1907a The stability or instability of the steady motions of a perfect liquid and of a viscous liquid. Part I: a perfect liquid. Proc. R. Ir. Acad. A 27, 9–68. Orr, W. M’F. 1907b The stability or instability of the steady motions of a perfect liquid and of a viscous liquid. Part II: a viscous liquid. Proc. R. Ir. Acad. A 27, 69–138. Orszag, S. A. and L. C. Kells 1980 Transition to turbulence in plane Poiseuille and plane Couette flow. J. Fluid Mech. 96, 159–205. Orszag, S. A. and A. T. Patera 1980 Subcritical transition to turbulence in plane channel flows. Phys. Rev. Lett. 45, 989–993. Orszag, S. A. and A. T. Patera 1983 Secondary instability of wall-bounded shear flows. J. Fluid Mech. 128, 347–385. Oseledec, V. I. 1968 A multiplicative ergodic theorem: Liapunov characteristic numbers for dynamical systems. Trans. Moscow Math. Soc. 19, 197–231. Ovsyannikov, I. and D. V. Turaev 2017 Analytic proof of the existence of the Lorenz attractor in the extended Lorenz model. Nonlinearity 30, 115–137. Parra-Rivas, P., D. Gomila, L. Gelens and E. Knobloch 2018 Bifurcation structure of localized states in the Lugiato-Lefever equation with anomalous dispersion. Phys. Rev. E 97, 042204. Percival, I. C. 1979 Variational principles for invariant tori and cantori. AIP Conf. Proc. 57, 302–310. [Also reprinted in the selection by Mackay and Meiss (1987).] Poincaré, H. 1892–1899 Les méthodes nouvelles de la mécanique céleste, Tome I. Gauthier-Villars, Paris. 
In fact Poincaré’s treatise appeared in three volumes, the second and third appearing in 1893 and 1899. They appear in translation as Poincaré (1993). Poincaré, H. 1993 New methods of celestial mechanics. Vol. 1: Periodic and asymptotic solutions; Vol. 2: Approximations by series; Vol. 3: Integral invariants and asymptotic properties of certain solutions; edited and introduced by D. L. Goroff, American Institute of Physics. Reynolds, O. 1883 An experimental investigation of the circumstances which determine whether the motion of water shall be direct or sinuous and of the law of resistance in parallel channels. Phil. Trans. R. Soc. Lond. 174, 935–982. Richardson, L. F. 1922 Weather prediction by numerical process. C.U.P., London. Šarkowskii, A. N. 1964 Coexistence of the cycles of a continuous mapping of the line into itself. Ukrainian Math. J. 16, 61–71 (in Russian). Shilnikov, L. P. 1965 A case of the existence of denumerable set of periodic motions. Sov. Math. Dokl. 6, 163–166. Shilnikov, L. P. 1967 The existence of a denumerable set of periodic motions in four-dimensional space in an extended neighbourhood of a saddle-focus. Sov. Math. Dokl. 8, 54–58. Shilnikov, L. P. 1968 On the generation of a periodic motion from trajectories doubly asymptotic to an equilibrium state of saddle type. Math. USSR Sb. 6, 427–438. Shilnikov, L. P. 1970 A contribution to the problem of the structure of an extended neighbourhood of a rough equilibrium state of saddle-focus type. Math. USSR Sb. 10, 91–102. Shivamoggi, B. K. 2014 Nonlinear dynamics and chaotic phenomena: an introduction, 2nd ed. Springer Science and Business Media, Dordrecht. Sommerfeld, A. 1908 Ein Beitrag zur hydrodynamische Erklärung der turbulenten Flüssigkeitsbewegungen. Proc. 4th Int. Congress Math. III. Rome, pp. 116–124. Sparrow, C. 1982 The Lorenz equations: bifurcations, chaos and strange attractors. Springer-Verlag, Berlin. Strogatz, S. H. 1994 Nonlinear dynamics and chaos. 
Addison–Wesley, Reading, Massachusetts.



Stuart, J. T. 1960 On the non-linear mechanics of wave disturbances in stable and unstable parallel flows. Part 1. The basic behaviour in plane Poiseuille flow. J. Fluid Mech. 9, 353–370. Sussman, G. J. and J. Wisdom 1988 Numerical evidence that the motion of Pluto is chaotic. Science 241 (4,864), 433–437. Szebehely, V. 1967 Theory of orbits: the restricted problem of three bodies. Academic Press, New York. Tabor, M. 1989 Chaos and integrability in nonlinear dynamics. John Wiley and sons, New York. Takens, F. 1981 Detecting strange attractors in turbulence. In: Lecture notes in mathematics, eds. D. A. Rand and L.-S. Young, pp. 366–381, Springer-Verlag, Berlin. Thompson, J. M. T. and H. B. Stewart 1986 Nonlinear dynamics and chaos. John Wiley and sons, New York. (Second edition, 2002.) Tran, C. V. 2009 The number of degrees of freedom of three-dimensional Navier-Stokes turbulence. Phys. Fluids 21, 125103. Tucker, W. 1999 The Lorenz attractor exists. C. R. Acad. Sci. Paris, Sér. I, 328, 1,197–1,202. Turcotte, D. L. 1997 Fractals and chaos in geology and geophysics, 2nd ed. C.U.P., Cambridge. Valtonen, M. and H. Karttunen 2006 The three-body problem. C.U.P., Cambridge. van Brunt, B. 2004 Calculus of variations. Springer-Verlag, Berlin. van Strien, S. J. 1979 Center manifolds are not C∞. Math. Z. 166, 143–145. Wattis, J. A. D. 2017 Shape of transition layers in a differential-delay equation. IMA J. Appl. Math. 82, 681–696. Wiggins, S. 1990 Global bifurcations and chaos: analytical methods. Springer-Verlag, Berlin. Wygnanski, I. J. and F. H. Champagne 1973 On transition in a pipe. Part 1. The origin of puffs and slugs and the flow in a turbulent slug. J. Fluid Mech. 59, 281–335. Zgliczyński, P. 1997 Computer assisted proof of chaos in the Rössler equations and in the Hénon map. Nonlinearity 10, 243–252.


Index

Action, 102, 146, 274
Action-angle variables, 152
Adiabatic invariant, 174
Affine map, 105
Area-preserving maps, 169
Arnold diffusion, 181
Arnold tongues, 81, 84, 89
Astronomical unit, 199
Asymptotic expansions, 119, 122
Asymptotic series, 9
AUTO, 247, 274
Autocorrelation function, 214, 260
Autocovariance function, 214
Autonomous differential equations, 1
Autoregressive (AR) models, 208, 218
Averaging, 96, 174, 191

Baker map, 24
Bifocal homoclinic orbit, 127, 131, 138
Bifurcation, 51
  diagram, 5
  homoclinic, 99
  Hopf, 4
  in maps, 25
  parameter, 7, 22
  period-doubling, 27, 75
  pitchfork, 4, 17, 26, 52
  saddle-node, 17, 26, 51, 75
  Shilnikov, 113
  theory, 3
  transcritical, 17, 26, 51, 75
  tree, 34
Binary shift map, 24
Black–Scholes equation, 209
Black spleenwort fern, 237, 250, 287
Boltzmann equation, 248

© Springer Nature Switzerland AG 2019 A. Fowler and M. McGuinness, Chaos

Boussinesq approximation, 10, 130
Butterfly effect, 15

Canonical coordinates, 155
Canonical transformation, 154, 203
Cantor set, 107, 110
  middle-thirds set, 137, 235
Cantorus, 198, 200
Cardioid, 242, 251, 264
Cauchy–Schwarz inequality, 135
Celestial mechanics, 8, 143, 182
  n-body problem, 8
Centre manifold, 65
  theorem, 56, 63, 65
Chaos, 12, 25
  in maps, 21
Chapman–Kolmogorov equation, 212
Circle maps, 24, 81, 91
Classical mechanics, 144
Cobwebbing, 22
Co-dimension, 31, 128, 182, 220
Coherent structures, 133, 256
Concatenation, 47
Connected, 240
Continued fraction, 86, 91, 197
Contraction mapping, 236
Convection, 133
Correlogram, 214
Countable, 12, 129
Crisis, 35
Cvitanović–Feigenbaum equation, 31, 33

D’Alembert’s principle, 145
Delay-recruitment equation, 207, 282
Dense, 12, 89, 160, 162
  orbit, 5, 25, 110, 137, 232
Determinism, 8
Deterministic chaos, 12
Diffeomorphism, 56, 59
Difference equation, 21
Discrepancy, 39
Discrete Fourier transform, 215
Dissipative, 6
Doubly periodic, 84, 129
Duffing oscillator, 204

Eddy viscosity, 248
Ed Lorenz, 10
Ed Spiegel, 8, 133
Eigenspace, 104
Elliptic functions, 95
Embedding dimension, 219
Ensemble forecasts, 15
Ergodic hypothesis, 135, 231
Exchange of stability, 4, 51

Fatou domain, 238
Feigenbaum conjectures, 28, 30, 42
Feigenbaum’s constant, 29
Feigenbaum sequence, 242
Fermat’s little theorem, 45
Filter, 218
  nonlinear, 226
Fixed point, 1, 22, 50, 75
  elliptic, hyperbolic, 173
Floquet theory, 75, 123
Flow, 49
Focus, 5
Fokker–Planck equation, 209, 213
Fourier coefficients, 86
Fractal dimension, 229, 250
Fractals, 30, 235
Fréchet derivative, 31
Fredholm alternative, 55
Frequency locking, 81, 84, 89
Frobenius norm, 223
Fundamental matrix, 76, 120, 121

Generating function, 154, 164
Ginzburg–Landau equation, 55
Gluing bifurcation, 128
Graph, 69

Hagen–Poiseuille flow, 243
Hamiltonian mechanics, 9, 143, 148
Hamilton–Jacobi equation, 155, 159, 201
Hamilton’s equations, 147
Hamilton’s principle, 146
Hartman–Grobman theorem, 51, 60
Hausdorff dimension, 229
Hénon area-preserving map, 195
Hénon–Heiles system, 170, 193, 199, 205
Hénon map, 111
Heteroclinic connection, 128
Heteroclinic matrix, 123
Heteroclinic orbit, 100
Heteroclinic tangles, 180
Homeomorphism, 59
Homoclinic bifurcation, 99, 103
  partial differential equations, 129
Homoclinic connections, 176
Homoclinic orbit, 2, 50, 100
  bifocal, 127
  for maps, 12
  principal and subsidiary, 275
  Shilnikov type, 275
Homological equation, 62
Hopf bifurcation, 4, 49
  proof of theorem, 55
  secondary, 74
  subcritical, 102
  tertiary, 89
  theorem, 52
Hyperbolic set, 109

Implicit function theorem, 44, 51, 57, 121
Inertial manifold, 42, 129, 132
Inner product, 31, 60
Integrable systems, 149
Integral invariants, 153
Intermittent chaos, 36, 99
Invariant manifold, 31
Invariant tori, 79, 194, 279
Involution, 150
Islands (near resonant tori), 175, 196
Isolating integral, 200
Iterated function systems, 236
Itinerary, 38

Jacobian, 75
Jacobi identity, 150
Jacobi integral, 188
James Watt, 3
Julia set, 237, 251, 262, 287

KAM theorem, vii, 9, 88, 144, 163, 198
KAM tori, 90, 181, 189, 196
Kaplan–Yorke conjecture, 232
King Oscar II, 8, 14, 143
Kirkwood gaps, 158, 199
Kneading sequence, 24, 38, 43
Koch snowflake, 235, 261
Kolmogorov length scale, 248
Korteweg–de Vries equation, 2
Kuramoto–Sivashinsky equation, 133

Lagrange points, 191
Lagrange’s equations, 146, 198
Lagrangian mechanics, 144
Landau–Stuart equation, 55
Landslide statistics, 210
Last KAM torus, 196, 200
Leonardo da Vinci, 7
Lewis Fry Richardson, 248, 257
Libration, 175
Limit cycles, 1
Linear stability, 51, 243
Liouville’s theorem, 154
Logistic map, 21, 241
Lorenz bifurcation, 127
Lorenz equations, 5, 6, 10, 16, 65, 68, 95, 101, 130, 137, 216, 225, 229, 271, 273, 282, 285
  complex, 94
Lorenz map, 11, 111, 133, 271
Lotka–Volterra equations, 17
Low-pass filter, 218
Lyapunov dimension, 233
Lyapunov exponents, 229
Lyapunov function, 133
Lyapunov–Schmidt reduction, 63

Mackey–Glass equation, 207
Mandelbrot set, 235, 240, 241, 289
  bulbs, 251, 264
Manifold, 31, 56, 64
  invariant, 64
Markov process, 212
Master equation, 212
Melnikov’s method, 178, 191, 205
Method of least squares, 219
Mitchell Feigenbaum, 28, 42
Mixing length theory, 248
Model error, 15
Modulated travelling waves, 129
Monodromy operator, 76
Multiple scales, 54, 74, 90

Navier–Stokes equations, 132, 207, 242, 248, 255, 269
n-body problem, 8, 148, 158
n-dimensional flows, 122
Near-identity transformation, 61
Neurons, 133
Newton’s law, 144
Newton’s method, 163
Node, 1
Noise, 210
  1/f noise, 211
Nonlinear filtering, 226
Nonlinear oscillator, 2
Nonlinear stability, 244
Normal forms, 56, 59, 63, 77, 86
n-torus, 153, 162

One-dimensional map, 11, 21, 107
  unimodal, 28
Orbit, 22
Orr–Sommerfeld equation, 243, 255, 269
Osborne Reynolds, 7, 8
Oseledec’s multiplicative ergodic theorem, 231

Period-doubling, 19, 28, 99
  bifurcation, 23, 75
  Feigenbaum conjectures, 28
  in experiments, 43
Periodic orbit, 22, 50
Periodic window, 25, 33
Periodogram, 215, 282
Period three window, 34
Perturbation theory, 9, 157
Phase entrainment, 84
Phase plane analysis, 1
Phase space embedding, 219, 249
Picard’s theorem, 5
Pipe flow, 8, 243
Pitchfork bifurcation, 4, 17, 26, 52
Poincaré, vii
Poincaré–Birkhoff fixed point theorem, 172, 193
Poincaré–Cartan invariant, 154, 169
Poincaré–Lindstedt method, vii, 53, 74
Poincaré map, 74, 103, 113, 123, 125, 139, 170, 234
Poincaré section, 49, 108, 278
Poincaré’s theorem, 62
Poiseuille flow, 130
Poisson bracket, 149, 178
Power spectral density, 211, 214, 282

Prandtl number, 130
Prediction, 228
Pseudo-inverse, 249

Quasi-periodicity, 157

Rayleigh–Bénard convection, 8
Rayleigh equation, 269
Rayleigh number, 7, 130, 133
Rayleigh’s inflection point criterion, 269
Rayleigh’s theorem, 244
Renormalisation, 30
Resonance, 62, 71, 78, 119, 158, 160, 169, 173
  resonant terms, 55, 61
  resonant tori, 190
  secondary, 176
  strong, 80
  weak, 80
Restricted three-body problem, 182, 186, 199, 278
Reynolds number, 7, 243
Reynolds stress, 8, 247, 269
Richardson’s dream, 257
Roche potential, 192
Rossby number, 7
Rössler equations, 114, 136, 141, 275
Rotating frame, 192
Rotation, 175
Rotation number, 83, 196, 242
Routes to chaos, 99

Saddle-node bifurcation, 17, 26, 51, 75
Saddle point, 1
Secondary Hopf bifurcation, 74, 75, 169
Secondary resonance, 176
Secondary tori, 188
Secular term, 124
Self-adjoint, 55
Self-similarity, 30
Sensitive dependence on initial conditions, 6, 12, 25
Separatrix, 177
Shilnikov bifurcation, 113, 127, 131
Sierpiński gasket, 235
Simple harmonic oscillator, 156
Simple pendulum, 175
Singular systems analysis, 220
Singular Value Decomposition (SVD), 221, 285
Singular value fraction, 223

Slaving principle, 63
Smale horseshoe, 111, 117, 129, 234
Small divisors, 86, 160
Solar system, vii, 8, 143, 157, 169, 182, 199
Soliton, 129
Spiral point, 1
Spiral scrolls, 139
Squire’s theorem, 244
Stability, 26, 59
  exchange of, 4
  of periodic orbits, 74
Stable manifold, 31, 65
Standard map, 171, 176, 196, 204
Stochastic differential equation, 213
Stochasticity, 8, 9, 169, 181
Stochastic processes, 212
Stock market, 209
Strange attractor, 5, 6, 103, 111, 128, 235
Strange invariant set, 99, 110
Strong resonance, 80
Structural stability, 89
Stuart–Watson equation, 55
Superconvergence, 163, 166
Superstable, 42
Symbolic dynamics, 38, 109

Takens’ embedding theorem, 220
Taylor–Couette flow, 43
Taylor number, 7
Taylor’s theorem, 50
Tertiary Hopf bifurcation, 89
Theorem
  Carr, 70
  centre manifold, 65
  Denjoy, 84
  Floquet, 76
  Hartman–Grobman, 60
  Herman, 84
  Hopf, 52
  Kolmogorov, Arnold, Moser, 163
  Milnor–Thurston, 40
  Picard, 5
  Poincaré, 62
  Poincaré–Birkhoff, 172
  Šarkovskii, 41
  Sternberg, 60
  Taylor, 50
Time lag, 220
  selection, 223
Time series, 211, 248
2-torus, 170
  breakdown, 99

Trade-Weighted Index, 208
Trajectory matrix, 221
Transcritical bifurcation, 17, 26, 51, 75
Transversality, 52, 57
Trojan points, 192
Turbulence, 7, 130, 133, 242, 255
  intermittency, 255
  puffs, 245
  slugs, 245, 268
  spots, 130, 246

Uncountable, 12
Unimodal map, 28, 38
Universality, 29
Unstable manifold, 65

Van der Pol equation, 18, 54
Volatility, 209

Watt governor, 3, 200
Weak resonance, 80
Weather prediction, 15, 209, 257
Weierstrass’s non-differentiable function, 263
White noise, 210
  power spectrum, 217
Winding number, 83, 196, 242
WKB method, 138