Lectures on Lyapunov Exponents [1 ed.] 1107081734, 978-1-107-08173-4, 9781316057964, 1316057968

The theory of Lyapunov exponents originated over a century ago in the study of the stability of solutions of differentia


English Pages 215 [213] Year 2014


Table of contents:
Preface
1. Introduction
2. Linear cocycles
3. Extremal Lyapunov exponents
4. Multiplicative ergodic theorem
5. Stationary measures
6. Exponents and invariant measures
7. Invariance principle
8. Simplicity
9. Generic cocycles
10. Continuity
References
Index.


Lectures on Lyapunov Exponents MARCELO VIANA Instituto Nacional de Matemática Pura e Aplicada (IMPA), Rio de Janeiro

University Printing House, Cambridge CB2 8BS, United Kingdom Cambridge University Press is part of the University of Cambridge. It furthers the University’s mission by disseminating knowledge in the pursuit of education, learning and research at the highest international levels of excellence. www.cambridge.org Information on this title: www.cambridge.org/9781107081734 © Marcelo Viana 2014 This publication is in copyright. Subject to statutory exception and to the provisions of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press. First published 2014 Printed in the United Kingdom by CPI Group Ltd, Croydon CRO 4YY A catalogue record for this publication is available from the British Library Library of Congress Cataloging-in-Publication data Viana, Marcelo, author. Lectures on Lyapunov exponents / Marcelo Viana, Instituto Nacional de Matemática Pura e Aplicada (IMPA), Rio de Janeiro. pages cm. – (Cambridge studies in advanced mathematics ; 145) Includes bibliographical references and index. ISBN 978-1-107-08173-4 (Hardback) 1. Lyapunov exponents. I. Title. QA372.V53 2014 515 .48–dc23 2014021609 ISBN 978-1-107-08173-4 Hardback Cambridge University Press has no responsibility for the persistence or accuracy of URLs for external or third-party internet websites referred to in this publication, and does not guarantee that any content on such websites is, or will remain, accurate or appropriate.

Contents

Preface

1 Introduction
1.1 Existence of Lyapunov exponents
1.2 Pinching and twisting
1.3 Continuity of Lyapunov exponents
1.4 Notes
1.5 Exercises

2 Linear cocycles
2.1 Examples
2.1.1 Products of random matrices
2.1.2 Derivative cocycles
2.1.3 Schrödinger cocycles
2.2 Hyperbolic cocycles
2.2.1 Definition and properties
2.2.2 Stability and continuity
2.2.3 Obstructions to hyperbolicity
2.3 Notes
2.4 Exercises

3 Extremal Lyapunov exponents
3.1 Subadditive ergodic theorem
3.1.1 Preparing the proof
3.1.2 Fundamental lemma
3.1.3 Estimating ϕ−
3.1.4 Bounding ϕ+ from above
3.2 Theorem of Furstenberg and Kesten
3.3 Herman’s formula
3.4 Theorem of Oseledets in dimension 2
3.4.1 One-sided theorem
3.4.2 Two-sided theorem
3.5 Notes
3.6 Exercises

4 Multiplicative ergodic theorem
4.1 Statements
4.2 Proof of the one-sided theorem
4.2.1 Constructing the Oseledets flag
4.2.2 Measurability
4.2.3 Time averages of skew products
4.2.4 Applications to linear cocycles
4.2.5 Dimension reduction
4.2.6 Completion of the proof
4.3 Proof of the two-sided theorem
4.3.1 Upgrading to a decomposition
4.3.2 Subexponential decay of angles
4.3.3 Consequences of subexponential decay
4.4 Two useful constructions
4.4.1 Inducing and Lyapunov exponents
4.4.2 Invariant cones
4.5 Notes
4.6 Exercises

5 Stationary measures
5.1 Random transformations
5.2 Stationary measures
5.3 Ergodic stationary measures
5.4 Invertible random transformations
5.4.1 Lift of an invariant measure
5.4.2 s-states and u-states
5.5 Disintegrations of s-states and u-states
5.5.1 Conditional probabilities
5.5.2 Martingale construction
5.5.3 Remarks on 2-dimensional linear cocycles
5.6 Notes
5.7 Exercises

6 Exponents and invariant measures
6.1 Representation of Lyapunov exponents
6.2 Furstenberg’s formula
6.2.1 Irreducible cocycles
6.2.2 Continuity of exponents for irreducible cocycles
6.3 Theorem of Furstenberg
6.3.1 Non-atomic measures
6.3.2 Convergence to a Dirac mass
6.3.3 Proof of Theorem 6.11
6.4 Notes
6.5 Exercises

7 Invariance principle
7.1 Statement and proof
7.2 Entropy is smaller than exponents
7.2.1 The volume case
7.2.2 Proof of Proposition 7.4
7.3 Furstenberg’s criterion
7.4 Lyapunov exponents of typical cocycles
7.4.1 Eigenvalues and eigenspaces
7.4.2 Proof of Theorem 7.12
7.5 Notes
7.6 Exercises

8 Simplicity
8.1 Pinching and twisting
8.2 Proof of the simplicity criterion
8.3 Invariant section
8.3.1 Grassmannian structures
8.3.2 Linear arrangements and the twisting property
8.3.3 Control of eccentricity
8.3.4 Convergence of conditional probabilities
8.4 Notes
8.5 Exercises

9 Generic cocycles
9.1 Semi-continuity
9.2 Theorem of Mañé–Bochi
9.2.1 Interchanging the Oseledets subspaces
9.2.2 Coboundary sets
9.2.3 Proof of Theorem 9.5
9.2.4 Derivative cocycles and higher dimensions
9.3 Hölder examples of discontinuity
9.4 Notes
9.5 Exercises

10 Continuity
10.1 Invariant subspaces
10.2 Expanding points in projective space
10.3 Proof of the continuity theorem
10.4 Couplings and energy
10.5 Conclusion of the proof
10.5.1 Proof of Proposition 10.9
10.6 Final comments
10.7 Notes
10.8 Exercises

References
Index

Preface

1. The study of characteristic exponents originated from the fundamental work of Aleksandr Mikhailovich Lyapunov [85] on the stability of solutions of differential equations. Consider a linear equation

\[ \dot v(t) = B(t) \cdot v(t) \tag{1} \]

where $B(\cdot)$ is a bounded function from $\mathbb{R}$ to the space of $d \times d$ matrices. By the general theory of differential equations, there exists a so-called fundamental matrix $A^t$, $t \in \mathbb{R}$, such that $v(t) = A^t \cdot v_0$ is the unique solution of (1) with initial condition $v(0) = v_0$. If the characteristic exponents

\[ \lambda(v) = \limsup_{t \to \infty} \frac{1}{t} \log \|A^t \cdot v\| \tag{2} \]

are negative, for all $v \neq 0$, then the trivial solution $v(t) \equiv 0$ is asymptotically stable, and even exponentially asymptotically stable. The stability theorem of Lyapunov asserts that, under an additional regularity condition, stability remains valid for nonlinear perturbations

\[ \dot w(t) = B(t) \cdot w(t) + F(t, w) \quad \text{with} \quad \|F(t, w)\| \le \mathrm{const}\, \|w\|^{1+\varepsilon}. \]

That is, the trivial solution $w(t) \equiv 0$ is still exponentially asymptotically stable. The regularity condition of Lyapunov means, essentially, that the limit in (2) does exist, even if one replaces vectors $v$ by $l$-vectors $v_1 \wedge \cdots \wedge v_l$; that is, elements of the $l$th exterior power of $\mathbb{R}^d$, for any $0 \le l \le d$. This is usually difficult to check in specific situations. But the multiplicative ergodic theorem of Oseledets asserts that Lyapunov regularity holds with full probability, in great generality. In particular, it holds on almost every flow trajectory, relative to any probability measure invariant under the flow.

2. The work of Furstenberg, Kesten, Oseledets, Kingman, Ledrappier, Guivarc’h, Raugi, Gol’dsheid, Margulis and other mathematicians, mostly in the 1960s–80s, built the study of Lyapunov characteristic exponents into a very active research field in its own right, and one with an unusually vast array of interactions with other areas of Mathematics and Physics, such as stochastic processes (random matrices and, more generally, random walks on groups), spectral theory (Schrödinger-type operators) and smooth dynamics (non-uniform hyperbolicity), to mention just a few. My own involvement with the subject goes back to the late 20th century and was initially motivated by my work with Christian Bonatti and José F. Alves on the ergodic theory of partially hyperbolic diffeomorphisms and, soon afterwards, with Jairo Bochi on the dependence of Lyapunov exponents on the underlying dynamical system. The way these two projects unfolded very much inspired the choice of topics in the present book.

3. A diffeomorphism $f : M \to M$ is called partially hyperbolic if there exists a $Df$-invariant decomposition $TM = E^s \oplus E^c \oplus E^u$ of the tangent bundle such that $E^s$ is uniformly contracted and $E^u$ is uniformly expanded by the derivative $Df$, whereas the behavior of $Df$ along the center bundle $E^c$ lies somewhere in between. It soon became apparent that to improve our understanding of such systems one should try to get a better hold of the behavior of $Df \mid E^c$ and, in particular, of its Lyapunov exponents. In doing this, we turned to the classical linear theory for inspiration. That program proved to be very fruitful, as much in the linear context (e.g. the proof of the Zorich–Kontsevich conjecture, by Artur Avila and myself) as in the setting of partially hyperbolic dynamics we had in mind originally (e.g. the rigidity results by Artur Avila, Amie Wilkinson and myself), and remains very active to date, with important contributions from several mathematicians.

4. Before that, in the early 1980s, Ricardo Mañé came to the surprising conclusion that generic (a residual subset of) volume-preserving $C^1$ diffeomorphisms on any surface have zero Lyapunov exponents, or else they are globally hyperbolic (Anosov); in fact, the second alternative is possible only if the surface is the torus $\mathbb{T}^2$. This discovery went against the intuition drawn from the classical theory of Furstenberg. Although Mañé did not write a complete proof of his findings, his approach was successfully completed by Bochi almost two decades later. Moreover, the conclusions were extended to arbitrary dimension, both in the volume-preserving and in the symplectic case, by Bochi and myself.

Downloaded from University Publishing Online. This is copyrighted material http://dx.doi.org/10.1017/CBO9781139976602.001


5. In this monograph I have sought to cover the fundamental aspects of the classical theory (mostly in Chapters 1 through 6), as well as to introduce some of the more recent developments (Chapters 7 through 10). The text started from a graduate course that I taught at IMPA during the (southern hemisphere) summer term of 2010. The very first draft consisted of lecture notes taken by Carlos Bocker, José Régis Varão and Samuel Feitosa. The unpublished notes [9] and [28], by Artur Avila and Jairo Bochi, were important for setting up the first part of the course. The material was reviewed and expanded later that year, in my seminar, with the help of graduate students and post-docs of IMPA’s Dynamics group. I taught the course again in early 2014, and I took that occasion to add some proofs, to reorganize the exercises and to include historic notes in each of the chapters. Chapter 10 was completely rewritten and this preface was also much expanded.

6. The diagram below describes the logical connections between the ten chapters. The first two form an introductory cycle. In Chapter 1 we offer a glimpse of what is going to come by stating three main results, whose proofs will appear, respectively, in Chapters 3, 6 and 10. In Chapter 2 we introduce the notion of linear cocycle, upon which is built the rest of the text. We examine more closely the particular case of hyperbolic cocycles, especially in dimension 2, as this will be useful in Chapter 9.

[Diagram: logical dependencies among Chapters 1 through 10]

In the next four chapters we present the main classical results, including the Furstenberg–Kesten theorem and the subadditive ergodic theorem of Kingman (Chapter 3), the multiplicative ergodic theorem of Oseledets (Chapter 4), Ledrappier’s exponent representation theorem, Furstenberg’s formula for exponents of irreducible cocycles and Furstenberg’s simplicity theorem in dimension 2 (Chapter 6). The proof of the multiplicative ergodic theorem is based on the subadditive ergodic theorem and also heralds the connection between Lyapunov exponents and invariant/stationary measures that lies at the heart of the results in Chapter 6. In Chapter 5 we provide general tools to develop that connection, in both the invertible and the non-invertible case.

7. The last four chapters are devoted to more advanced material. The main goal there is to provide a friendly introduction to the existing research literature. Thus, the emphasis is on transparency rather than generality or completeness. This means that, as a rule, we choose to state the results in the simplest possible (yet relevant) setting, with suitable references given for stronger statements. Chapter 7 introduces the invariance principle and exploits some of its consequences, in the context of locally constant linear cocycles. This includes Furstenberg’s criterion for λ− = λ+, which extends Furstenberg’s simplicity theorem to arbitrary dimension. The invariance principle has been used recently to analyze much more general dynamical systems, linear and nonlinear, whose Lyapunov exponents vanish. A finer extension of Furstenberg’s theorem appears in Chapter 8, where we present a criterion for simplicity of the whole Lyapunov spectrum. Then, in Chapter 9, we turn our attention to the contrasting Mañé–Bochi phenomenon of systems whose Lyapunov spectra are generically not simple. We prove an instance of the Mañé–Bochi theorem, for continuous linear cocycles. Moreover, we explain how those methods can be adapted to construct examples of discontinuous dependence of Lyapunov exponents on the cocycle, even in the Hölder-continuous category. Having raised the issue of (dis)continuity, in Chapter 10 we prove that for products of random matrices in GL(2) the Lyapunov exponents do depend continuously on the cocycle data.

8. Each chapter ends with a set of notes and a list of exercises. Some of the exercises are actually used in the proofs. They should be viewed as an invitation for the reader to take an active part in the arguments.

Throughout, it is assumed that the reader is familiar with the basic ideas of Measure Theory, Differential Topology and Ergodic Theory. All that is needed can be found, for instance, in my book with Krerley Oliveira, Fundamentos da Teoria Ergódica [114]; a translation into English is under way.

I thank David Tranah, of Cambridge University Press, for his interest in this book and for patiently waiting for the writing to be completed. I am also grateful to Vaughn Climenhaga, and David himself, for a careful revision of the manuscript that very much helped improve the presentation.

Rio de Janeiro, March, 2014
Marcelo Viana

1 Introduction

This chapter is a kind of overture. Simplified statements of three theorems are presented that set the tone for the whole text. Much broader versions of these theorems will appear later and several other themes around them will be introduced and developed as we move on. At this initial stage we choose to focus on the following special, yet significant, setting. Let $A_1, \dots, A_m$ be invertible 2 × 2 real matrices and let $p_1, \dots, p_m$ be positive numbers with $p_1 + \cdots + p_m = 1$. Consider

\[ L^n = L_{n-1} \cdots L_1 L_0, \qquad n \ge 1, \]

where the $L_j$ are independent random variables with identical probability distributions, such that the probability of $\{L_j = A_i\}$ is equal to $p_i$ for all $j \ge 0$ and $i = 1, \dots, m$. In brief, our goal is to describe the (almost certain) behavior of $L^n$ as $n \to \infty$.

1.1 Existence of Lyapunov exponents

We begin with the following seminal result of Furstenberg and Kesten [56]:

Theorem 1.1 There exist real numbers $\lambda_+$ and $\lambda_-$ such that

\[ \lim_n \frac{1}{n} \log \|L^n\| = \lambda_+ \quad \text{and} \quad \lim_n \frac{1}{n} \log \|(L^n)^{-1}\|^{-1} = \lambda_- \]

with full probability.

The numbers $\lambda_+$ and $\lambda_-$ are called extremal Lyapunov exponents. Clearly,

\[ \lambda_+ \ge \lambda_- \tag{1.1} \]

because $\|B\| \ge \|B^{-1}\|^{-1}$ for any invertible matrix $B$. If $B$ has determinant ±1 then we even have $\|B\| \ge 1 \ge \|B^{-1}\|^{-1}$. Hence,

\[ \lambda_+ \ge 0 \ge \lambda_- \tag{1.2} \]

when all matrices $A_i$, $1 \le i \le m$, have determinant ±1.
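The limits in Theorem 1.1 can be explored numerically. The following is only a minimal sketch, not a method taken from the text: it uses the standard Gram–Schmidt (QR) renormalization trick, and the function name `estimate_extremal_exponents`, the starting frame and the sample matrices are all illustrative assumptions. For a typical starting frame, the averaged logarithms of the two diagonal factors approximate λ+ and λ−, respectively.

```python
import math
import random


def mat_vec(a, v):
    """Apply a 2x2 matrix (tuple of rows) to a vector in R^2."""
    return (a[0][0] * v[0] + a[0][1] * v[1], a[1][0] * v[0] + a[1][1] * v[1])


def estimate_extremal_exponents(matrices, probs, n_steps=20000, seed=0):
    """Estimate (lambda_plus, lambda_minus) for i.i.d. products of 2x2 matrices.

    Hypothetical helper: an orthonormal frame (q1, q2) is pushed forward by
    each random factor and re-orthonormalized; the logs of the stretching
    factors r11 and r22 are averaged over the run.
    """
    rng = random.Random(seed)
    # A generic (non-axis-aligned) orthonormal starting frame.
    q1 = (math.cos(0.3), math.sin(0.3))
    q2 = (-math.sin(0.3), math.cos(0.3))
    log_r11 = 0.0
    log_r22 = 0.0
    for _ in range(n_steps):
        a = rng.choices(matrices, weights=probs, k=1)[0]
        w1, w2 = mat_vec(a, q1), mat_vec(a, q2)
        r11 = math.hypot(w1[0], w1[1])
        q1 = (w1[0] / r11, w1[1] / r11)
        # Remove the component of w2 along q1 (Gram-Schmidt step).
        r12 = q1[0] * w2[0] + q1[1] * w2[1]
        w2 = (w2[0] - r12 * q1[0], w2[1] - r12 * q1[1])
        r22 = math.hypot(w2[0], w2[1])
        q2 = (w2[0] / r22, w2[1] / r22)
        log_r11 += math.log(r11)
        log_r22 += math.log(r22)
    return log_r11 / n_steps, log_r22 / n_steps


# Illustrative data: a diagonal matrix and a rotation by pi/2, equal weights.
A1 = ((2.0, 0.0), (0.0, 0.5))
A2 = ((0.0, -1.0), (1.0, 0.0))
lam_plus, lam_minus = estimate_extremal_exponents([A1, A2], [0.5, 0.5])
```

Since r11 · r22 = |det A| at every step, the two estimates add up to the average of log |det|, which gives a quick consistency check against (1.2) when all determinants are ±1.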

1.2 Pinching and twisting

Next, we discuss conditions for the inequalities (1.1) and (1.2) to be strict. Let $\mathcal{B}$ be the monoid generated by the matrices $A_i$, $i = 1, \dots, m$; that is, the set of all products $A_{k_1} \cdots A_{k_n}$ with $1 \le k_j \le m$ and $n \ge 0$ (for $n = 0$ interpret the product to be the identity matrix). We say that $\mathcal{B}$ is pinching if for any constant $\kappa > 1$ there exists some $B \in \mathcal{B}$ such that

\[ \|B\| > \kappa\, \|B^{-1}\|^{-1}. \tag{1.3} \]

This means that the images of the unit circle under the elements of $\mathcal{B}$ are ellipses with arbitrarily large eccentricity. See Figure 1.1.

[Figure 1.1: Eccentricity and pinching. The unit circle and its image under $B$, an ellipse with semi-axes $\|B\|$ and $\|B^{-1}\|^{-1}$.]
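To make the ratio in (1.3) concrete, here is a small sketch that computes the norm, the conorm and their quotient for a 2 × 2 matrix from the closed-form singular values. The helper names (`norm_and_conorm`, `eccentricity`, `mat_pow`) and the sample parabolic matrix are assumptions chosen for illustration only.

```python
import math


def norm_and_conorm(b):
    """Return (||B||, ||B^-1||^-1), i.e. the two singular values of a 2x2 matrix."""
    (a11, a12), (a21, a22) = b
    frob2 = a11 * a11 + a12 * a12 + a21 * a21 + a22 * a22
    det = a11 * a22 - a12 * a21
    # Squared singular values are the roots of t^2 - frob2 * t + det^2 = 0.
    disc = max(frob2 * frob2 - 4.0 * det * det, 0.0)
    norm = math.sqrt((frob2 + math.sqrt(disc)) / 2.0)
    conorm = abs(det) / norm  # product of singular values equals |det|
    return norm, conorm


def eccentricity(b):
    """Ratio ||B|| / ||B^-1||^-1 of the semi-axes of the image of the unit circle."""
    norm, conorm = norm_and_conorm(b)
    return norm / conorm


def mat_mul(a, b):
    """Product of two 2x2 matrices given as tuples of rows."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def mat_pow(b, n):
    """n-th power of a 2x2 matrix, with the identity for n = 0."""
    result = ((1.0, 0.0), (0.0, 1.0))
    for _ in range(n):
        result = mat_mul(result, b)
    return result


# Powers of a parabolic matrix have growing eccentricity.
P = ((1.0, 1.0), (0.0, 1.0))
eccentricities = [eccentricity(mat_pow(P, n)) for n in range(1, 6)]
```

In this example the quotient keeps growing along the powers of `P`, so for any given κ some power eventually satisfies (1.3).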

We say that the monoid $\mathcal{B}$ is twisting if given any vector lines $F, G_1, \dots, G_n \subset \mathbb{R}^2$ there exists $B \in \mathcal{B}$ such that

\[ B(F) \notin \{G_1, \dots, G_n\}. \tag{1.4} \]

The following result is a variation of a theorem of Furstenberg [54]:

Theorem 1.2 Assume $\mathcal{B}$ is pinching and twisting. Then $\lambda_- < \lambda_+$. In particular, if $|\det A_i| = 1$ for all $1 \le i \le m$ then both extremal Lyapunov exponents are different from zero.

1.3 Continuity of Lyapunov exponents

The extremal Lyapunov exponents $\lambda_+$ and $\lambda_-$ may be viewed as functions of the data $A_1, \dots, A_m, p_1, \dots, p_m$. Let the matrices $A_j$ vary in the linear group GL(2) of invertible 2 × 2 matrices and the probability vectors $(p_1, \dots, p_m)$ vary in the open simplex

\[ \Delta_m = \{(p_1, \dots, p_m) : p_1 > 0, \dots, p_m > 0 \text{ and } p_1 + \cdots + p_m = 1\}. \]

The following result is part of a theorem of Bocker and Viana [35]:

Theorem 1.3 The extremal Lyapunov exponents $\lambda_\pm$ depend continuously on $(A_1, \dots, A_m, p_1, \dots, p_m) \in GL(2)^m \times \Delta_m$ at all points.

Example 1.4 Let m = 2, with

\[ A_1 = \begin{pmatrix} \sigma & 0 \\ 0 & \sigma^{-1} \end{pmatrix} \quad \text{and} \quad A_2 = R_\theta A_1 R_{-\theta}, \qquad R_\theta = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}, \]

for some $\sigma > 1$ and $\theta \in \mathbb{R}$. By Theorem 1.3, the Lyapunov exponents $\lambda_\pm$ depend continuously on the parameters $\sigma$ and $\theta$. Moreover, using Theorem 1.2, we have $\lambda_+ = 0$ if and only if $p_1 = p_2 = 1/2$ and $\theta = \pi/2 + n\pi$ for some $n \in \mathbb{Z}$.
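As a sanity check on the last claim of Example 1.4, the special case θ = π/2 + nπ can be computed by hand. The short derivation below is an added illustration (the counting variable $S_n$ is notation introduced here, not in the text); it relies only on the law of large numbers.

```latex
% For theta = pi/2 + n*pi the rotation swaps the coordinate axes up to sign, so
A_2 = R_\theta A_1 R_{-\theta}
    = \begin{pmatrix} \sigma^{-1} & 0 \\ 0 & \sigma \end{pmatrix} = A_1^{-1}.
% Let S_n = #{j < n : L_j = A_1} - #{j < n : L_j = A_2}. The factors commute, hence
L^n = A_1^{S_n}, \qquad \|L^n\| = \sigma^{|S_n|}, \qquad
\frac{S_n}{n} \to p_1 - p_2 \quad \text{with full probability},
% and therefore
\lambda_+ = \lim_n \frac{1}{n} \log \|L^n\| = |p_1 - p_2| \log \sigma .
```

In particular, for this value of θ the exponent $\lambda_+$ vanishes exactly when $p_1 = p_2 = 1/2$, in agreement with the statement above.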

1.4 Notes

Theorem 1.1 is a special case of the theorem of Furstenberg and Kesten [56], which is valid in any dimension d ≥ 2. The full statement and the proof will appear in Chapter 3: we will deduce this theorem from an even more general statement, the subadditive ergodic theorem of Kingman [74]. Kingman’s theorem will also be used in Chapter 4 to prove the fundamental result of the theory of Lyapunov exponents, the multiplicative ergodic theorem of Oseledets [92].

It is natural to ask whether the type of asymptotic behavior prescribed by Theorem 1.1 for the norm $\|L^n\|$ and conorm $\|(L^n)^{-1}\|^{-1}$ extends to the individual matrix coefficients $L^n_{i,j}$. Furstenberg and Kesten [56] proved that this is so if the coefficients of the matrices $A_i$, $1 \le i \le m$, are all strictly positive. The example in Exercise 1.3 shows that this assumption cannot be removed. On the other hand, the theorem of Oseledets does contain such a description for the matrix column vectors.

Theorem 1.2 is also the tip of a series of fundamental results, which are to be discussed in Chapters 6 through 8. The full statement and proof of Furstenberg’s theorem for 2-dimensional cocycles (Furstenberg [54]) will be given in Chapter 6. The extension to any dimension will be stated and proved in Chapter 7: it will be deduced from the invariance principle (Ledrappier [81], Bonatti, Gomez-Mont and Viana [37], Avila and Viana [16], Avila, Santamaria and Viana [13]), a general tool that has several other applications, both for linear and nonlinear systems. In dimension larger than 2, there is a more ambitious problem: rather than asking when $\lambda_- < \lambda_+$, one wants to know when all the Lyapunov exponents are distinct. That will be the subject of Chapter 8, which is based on Avila and Viana [14, 15].

Furstenberg and Kifer [57] proved continuity of the Lyapunov exponents of products of random matrices, restricted to the (almost) irreducible case. A variation of their argument will be given in Section 6.2.2. The reducible case requires a delicate analysis of the random walk defined by the cocycle in projective space. That was carried out by Bocker and Viana [35], in the 2-dimensional case, using certain discretizations of projective space. At the time of writing, Avila, Eskin and Viana [12] are extending the statement of the theorem to arbitrary dimension, using a very different strategy. The proof of Theorem 1.3 that we present in Chapter 10 is based on this more recent approach.

The problem of the dependence of Lyapunov exponents on the data can be formulated in the broader context of linear cocycles that we are going to introduce in Chapter 2. We will see in Chapter 9 that, in contrast, continuity often breaks down in that generality.

1.5 Exercises

The following elementary notions are used in some of the exercises that follow. We call a 2 × 2 matrix hyperbolic if it has two distinct real eigenvalues, parabolic if it has a unique real eigenvalue, with a one-dimensional eigenspace, and elliptic if it has two distinct complex eigenvalues. Multiples of the identity belong to none of these three classes.

Exercise 1.1 Show that, in dimension d = 2, if $|\det A_i| = 1$ for all $1 \le i \le m$ then $\lambda_+ + \lambda_- = 0$.

Exercise 1.2 Calculate the extremal Lyapunov exponents for m = 2 and $p_1, p_2 > 0$ with $p_1 + p_2 = 1$ and

(1) $A_1 = \begin{pmatrix} \sigma & 0 \\ 0 & \sigma^{-1} \end{pmatrix}$ and $A_2 = \begin{pmatrix} \sigma^{-1} & 0 \\ 0 & \sigma \end{pmatrix}$, where $\sigma > 1$;

(2) $A_1 = \begin{pmatrix} \sigma & 0 \\ 0 & \sigma^{-1} \end{pmatrix}$ and $A_2 = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$, where $\sigma > 1$.

Exercise 1.3 (Furstenberg and Kesten [56]) Take m = 2 with $p_1 = p_2 = 1/2$ and

\[ A_1 = \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix} \quad \text{and} \quad A_2 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}. \]

Show that $\lim_n (1/n) \log |L^n_{i,j}|$ does not exist for any i, j, with full probability.

Exercise 1.4 Show that if some matrix $A_i$, $1 \le i \le m$, is either hyperbolic or parabolic then the monoid $\mathcal{B}$ is pinching.

Exercise 1.5 Show that the monoid $\mathcal{B}$ may be pinching even if all the matrices $A_i$, $1 \le i \le m$, are elliptic.

Exercise 1.6 Suppose that there exists $1 \le i \le m$ such that $A_i$ is conjugate to an irrational rotation. Conclude that $\mathcal{B}$ is twisting.

Exercise 1.7 Suppose that there exist $1 \le i, j \le m$ such that $A_i$ and $A_j$ are either hyperbolic or parabolic and that they have no common eigenspace. Conclude that $\mathcal{B}$ is twisting (and pinching).

Exercise 1.8 Let $A_i$, i = 1, 2, be as in the second part of Exercise 1.2. Check that

\[ \lambda_+(A_1, A_2, 1, 0) \neq \lim_{p_2 \to 0} \lambda_+(A_1, A_2, 1 - p_2, p_2). \]

Thus, the hypothesis $p_1 > 0, \dots, p_m > 0$ cannot be removed in Theorem 1.3.


2 Linear cocycles

Linear cocycles are the basic object upon which this text is built. Here we define this concept and introduce a few examples. Special attention is given to (uniformly) hyperbolic cocycles, a class that is often used as a kind of paradigm for the behavior of more general systems.

Let $(M, \mathcal{B}, \mu)$ be a probability space and $f : M \to M$ be a measure-preserving map. Let $A : M \to GL(d)$ be a measurable function with values in the linear group GL(d) of invertible d × d matrices with real coefficients. Sometimes we let A take values in the special linear group SL(d) of real d × d matrices with determinant ±1. The linear cocycle defined by A over f is the transformation

\[ F : M \times \mathbb{R}^d \to M \times \mathbb{R}^d, \qquad (x, v) \mapsto (f(x), A(x)v). \tag{2.1} \]

Observe that $F^n(x, v) = (f^n(x), A^n(x)v)$ for every $n \ge 1$, where

\[ A^n(x) = A(f^{n-1}(x)) \cdots A(f(x)) A(x). \]

If f is invertible then so is F. Moreover, $F^{-n}(x, v) = (f^{-n}(x), A^{-n}(x)v)$ for all $n \ge 1$, where

\[ A^{-n}(x) = A(f^{-n}(x))^{-1} \cdots A(f^{-1}(x))^{-1} = A^n(f^{-n}(x))^{-1}. \]

The Furstenberg–Kesten theorem (Theorem 1.1) extends to this setting, as follows: for any f-invariant probability measure μ such that $\log \|A^{\pm 1}\| \in L^1(\mu)$,

\[ \lambda_+(x) = \lim_n \frac{1}{n} \log \|A^n(x)\| \quad \text{and} \quad \lambda_-(x) = \lim_n \frac{1}{n} \log \|A^n(x)^{-1}\|^{-1} \tag{2.2} \]

exist at μ-almost every point x. This fact will be proven in Chapter 3.

More generally, one may consider A to take values in the group $GL(d, \mathbb{C})$ of invertible d × d matrices with complex coefficients, or the subgroup $SL(d, \mathbb{C})$ of matrices with determinant in the unit circle. This gives rise to complex linear cocycles $M \times \mathbb{C}^d \to M \times \mathbb{C}^d$. Of course, every complex cocycle in dimension d is also a real cocycle in dimension 2d and, conversely, every d-dimensional real linear cocycle defines a d-dimensional complex linear cocycle. The two theories, real and complex, are actually very similar. We focus on the real case, except where stated otherwise.
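The iteration rule $F^n(x, v) = (f^n(x), A^n(x)v)$ can be sketched in a few lines. This is only an illustration restricted to d = 2: the helper name `iterate_cocycle`, the circle rotation used as base map and the unipotent matrix function are assumptions made for the example, not objects taken from the text.

```python
def iterate_cocycle(f, a, x, v, n):
    """Return F^n(x, v) = (f^n(x), A^n(x) v) for F(x, v) = (f(x), A(x) v), d = 2.

    `f` is the base map and `a` maps a point x to a 2x2 matrix (tuple of rows).
    """
    for _ in range(n):
        m = a(x)  # matrix at the current base point, applied before moving x
        v = (m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1])
        x = f(x)
    return x, v


def rotation(x):
    """Rotation of the circle R/Z by a quarter turn (illustrative base map)."""
    return (x + 0.25) % 1.0


def shear(x):
    """Illustrative matrix-valued function A(x) with determinant 1."""
    return ((1.0, x), (0.0, 1.0))


# Four steps along the orbit 0, 1/4, 1/2, 3/4 of the rotation.
x4, v4 = iterate_cocycle(rotation, shear, 0.0, (0.0, 1.0), 4)
```

Note the order of operations inside the loop: the matrix A(x) acts first and the base point moves afterwards, which is what produces the product $A(f^{n-1}(x)) \cdots A(f(x)) A(x)$ with the most recent factor on the left.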

2.1 Examples

We illustrate this notion with three important classes of linear cocycles, arising from probability theory, dynamical systems, and spectral theory, respectively.

2.1.1 Products of random matrices

The situation we considered in Chapter 1 can be modeled by (a special case of) the following class of linear cocycles. Let $X = GL(d)$ and $M = X^{\mathbb{Z}}$ (or $M = X^{\mathbb{N}}$) and

\[ f : M \to M, \qquad (\alpha_k)_k \mapsto (\alpha_{k+1})_k \]

be the shift map on M. Consider the function

\[ A : M \to GL(d), \qquad (\alpha_k)_k \mapsto \alpha_0 \]

and let $F : M \times \mathbb{R}^d \to M \times \mathbb{R}^d$ be the linear cocycle defined by A over f. Note that the nth iterate of F is given by

\[ F^n\big((\alpha_k)_k, v\big) = \big((\alpha_{k+n})_k, \alpha_{n-1} \cdots \alpha_1 \alpha_0 v\big). \]

Given a probability measure p on the space GL(d), consider the product measure $\mu = p^{\mathbb{Z}}$ (or $\mu = p^{\mathbb{N}}$), which is characterized by

\[ \mu\big(\{(\alpha_k)_k : \alpha_i \in E_i, \dots, \alpha_j \in E_j\}\big) = p(E_i) \cdots p(E_j) \]

for every $i \le j$ and any measurable sets $E_i, \dots, E_j \subset X$. It is clear that μ is invariant under the shift map.

We call a locally constant linear cocycle the following slightly more general construction. Let $(Y, \mathcal{Y}, q)$ be any probability space and then consider $N = Y^{\mathbb{Z}}$ endowed with the product σ-algebra $\mathcal{C} = \mathcal{Y}^{\mathbb{Z}}$ and the product measure $\nu = q^{\mathbb{Z}}$ (or $N = Y^{\mathbb{N}}$ endowed with $\mathcal{C} = \mathcal{Y}^{\mathbb{N}}$ and $\nu = q^{\mathbb{N}}$). Let $g : N \to N$ be the shift map. Moreover, let $B : N \to GL(d)$ be any measurable function depending only on the zeroth coordinate; that is, of the form $B(y) = \beta(y_0)$ for some measurable function $\beta : Y \to GL(d)$. Then consider the linear cocycle $G : N \times \mathbb{R}^d \to N \times \mathbb{R}^d$ defined by B over g. Note that G is semi-conjugate to a cocycle F as in the previous paragraph, with $p = \beta_* q$:

\[ \begin{array}{ccc} N \times \mathbb{R}^d & \xrightarrow{\;G\;} & N \times \mathbb{R}^d \\ {\scriptstyle\Phi} \downarrow & & \downarrow {\scriptstyle\Phi} \\ M \times \mathbb{R}^d & \xrightarrow{\;F\;} & M \times \mathbb{R}^d \end{array} \]

with $\Phi\big((x_k)_k, v\big) = \big((\beta(x_k))_k, v\big)$. For this reason, the two cocycles are equivalent for most of our purposes.

2.1.2 Derivative cocycles

Consider a diffeomorphism $f : M \to M$ on the torus $M = \mathbb{T}^d$ of dimension $d \ge 1$. It is easy to construct smooth vector fields $X_1, \dots, X_d$ on $\mathbb{T}^d$ such that $\{X_1(x), \dots, X_d(x)\}$ is a basis of the tangent space $T_x M$, for every $x \in M$. One says that the torus is a parallelizable manifold. The derivative cocycle of f is

\[ F : M \times \mathbb{R}^d \to M \times \mathbb{R}^d, \qquad (x, v) \mapsto (f(x), A(x)v), \]

where $A(x) \in GL(d)$ is the matrix, with respect to these bases, of the derivative $Df(x) : T_x M \to T_{f(x)} M$.

For more general diffeomorphisms, on non-parallelizable manifolds, the previous construction does not apply. However, one can still view the derivative map $Df : TM \to TM$ as a linear cocycle, in the following more general sense. Let $\pi : V \to M$ be a finite-dimensional vector bundle. This means that V is equipped with a family of homeomorphisms $h_\alpha : U_\alpha \times \mathbb{R}^d \to \pi^{-1}(U_\alpha)$ such that:

(i) $\{U_\alpha\}$ is an open cover of M;
(ii) $\pi \circ h_\alpha(x, v) = x$ for every $x \in U_\alpha$ and any α;
(iii) for every $x \in U_\alpha \cap U_\beta$ and any α, β, there exists a linear isomorphism $L_{\alpha,\beta}(x) : \mathbb{R}^d \to \mathbb{R}^d$ such that $h_\beta^{-1} \circ h_\alpha(x, v) = (x, L_{\alpha,\beta}(x)v)$ for every v.

The integer $d \ge 1$ is the dimension of the vector bundle. A linear cocycle on V over a transformation $f : M \to M$ is a measurable transformation $F : V \to V$ such that $\pi \circ F = f \circ \pi$ and the actions $F_x : V_x \to V_{f(x)}$ on the fibers are linear isomorphisms:

\[ \begin{array}{ccc} V & \xrightarrow{\;F\;} & V \\ {\scriptstyle\pi} \downarrow & & \downarrow {\scriptstyle\pi} \\ M & \xrightarrow{\;f\;} & M \end{array} \]

For our purposes there is not much to gain from considering such generality. So, most of the time we will stick to the case of trivial fiber bundles; that is, to linear cocycles of the form (2.1).

2.1.3 Schrödinger cocycles

Consider $\ell^2 = \{(u_n)_{n \in \mathbb{Z}} : \sum_n |u_n|^2 < \infty\}$. The Schrödinger operator associated with a sequence $(V_n)_{n \in \mathbb{Z}}$ in $\mathbb{R}$ is defined by

\[ H : \ell^2 \to \ell^2, \qquad u = (u_n)_{n \in \mathbb{Z}} \mapsto H(u) = (u_{n+1} + u_{n-1} + V_n u_n)_{n \in \mathbb{Z}}. \tag{2.3} \]

In the most interesting models the sequence $V_n$ is generated from a dynamical system $f : M \to M$ and a function $V : M \to \mathbb{R}$ (the so-called potential) through $V_n = V(f^n(x))$, for some $x \in M$. The most studied cases are:

(1) Random Schrödinger cocycles: Let $M = X^{\mathbb{Z}}$ be a shift space, $f : M \to M$ be the shift map, and $\mu = p^{\mathbb{Z}}$ be a Bernoulli measure on M. Fix $x \in M$ and then take $V_n = V(f^n(x))$, where the function $V : M \to \mathbb{R}$ is such that V(x) depends only on the zeroth coordinate of $x \in M$.

(2) Quasi-periodic Schrödinger cocycles: Let μ be the normalized Lebesgue measure on $M = \mathbb{T}^d$ and $f : \mathbb{T}^d \to \mathbb{T}^d$ be an irrational translation. Fix $x \in \mathbb{T}^d$ and take $V_n = V(f^n(x))$, where $V : \mathbb{T}^d \to \mathbb{R}$ is an analytic function.

A main objective is to understand the spectral theory of these operators (a theorem of Pastur [97] asserts that when the base system (f, μ) is ergodic the spectrum of H is the same for almost all choices of $x \in M$; the same is true for the absolutely continuous spectrum, the singular continuous spectrum and the pure point spectrum, by Kunz and Souillard [78]). Thus, one is led to studying the eigenvalue equation

\[ H(u) = Eu, \qquad \text{for } E \in \mathbb{R}. \tag{2.4} \]

While, by definition, the eigenvectors of H are the solutions of this equation in the space $\ell^2$, it is useful to consider (2.4) for any real sequence $u = (u_n)_{n \in \mathbb{Z}}$. Note that the equation may be rewritten as $u_{n+1} + u_{n-1} + V(f^n(x)) u_n = E u_n$ or, still equivalently,

\[ \begin{pmatrix} u_{n+1} \\ u_n \end{pmatrix} = \begin{pmatrix} E - V(f^n(x)) & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} u_n \\ u_{n-1} \end{pmatrix}. \]

This suggests that we consider the linear cocycle $F_E : M \times \mathbb{R}^2 \to M \times \mathbb{R}^2$ defined over $f : M \to M$ by the function

\[ A : M \to SL(2), \qquad A(y) = \begin{pmatrix} E - V(y) & -1 \\ 1 & 0 \end{pmatrix}. \]

We have just seen that $u = (u_n)_{n \in \mathbb{Z}}$ is a solution to $H(u) = Eu$ if and only if $U = (u_n, u_{n-1})_{n \in \mathbb{Z}}$ is a trajectory of the linear cocycle $F_E$. The behavior of these linear cocycles provides useful information about the spectral properties of the Schrödinger operator. For example, if the Lyapunov exponents of $F_E$ are different from zero then E cannot be an eigenvalue of $H : \ell^2 \to \ell^2$. See Damanik [48] for much more information.
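The equivalence between the eigenvalue equation (2.4) and trajectories of the cocycle can be checked numerically. The sketch below is illustrative only: the helper names, the cosine potential (an assumed example of an analytic potential over an irrational rotation of the circle), and the values of the energy, coupling and frequency are all choices made here, not data from the text.

```python
import math


def transfer_matrix(energy, potential_value):
    """The matrix A(y) = [[E - V(y), -1], [1, 0]] of the Schrodinger cocycle."""
    return ((energy - potential_value, -1.0), (1.0, 0.0))


def solve_eigen_equation(energy, potential, u0, u_minus1, n_steps):
    """Return [u_{-1}, u_0, ..., u_{n_steps}] by iterating transfer matrices.

    `potential(n)` is assumed to return V_n = V(f^n(x)) for n >= 0.
    """
    u = [u_minus1, u0]
    for n in range(n_steps):
        a = transfer_matrix(energy, potential(n))
        cur, prev = u[-1], u[-2]
        # First row of A applied to (u_n, u_{n-1}) gives u_{n+1}.
        u.append(a[0][0] * cur + a[0][1] * prev)
    return u


ALPHA = (math.sqrt(5.0) - 1.0) / 2.0  # irrational rotation number (assumption)
KAPPA = 1.0                           # coupling constant (assumption)
X0 = 0.1                              # base point on the circle (assumption)


def quasi_periodic_potential(n):
    """V_n = 2 * kappa * cos(2 * pi * (x0 + n * alpha)), an illustrative choice."""
    return 2.0 * KAPPA * math.cos(2.0 * math.pi * (X0 + n * ALPHA))


ENERGY = 0.5
u = solve_eigen_equation(ENERGY, quasi_periodic_potential, 1.0, 0.0, 40)

# Residuals of u_{n+1} + u_{n-1} + V_n u_n - E u_n, which should vanish
# up to rounding; recall that u[k + 1] stores u_k.
residuals = [
    u[n + 2] + u[n] + quasi_periodic_potential(n) * u[n + 1] - ENERGY * u[n + 1]
    for n in range(40)
]
```

The sequence produced this way solves the difference equation for any initial pair $(u_0, u_{-1})$; whether it lies in $\ell^2$, and hence gives an eigenvector of H, is a separate question governed by its growth.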

2.2 Hyperbolic cocycles

We are going to define an important class of cocycles whose behavior is particularly well understood. We focus on the two-dimensional setting, but we also comment briefly on the general case.

2.2.1 Definition and properties

Let $M$ be a compact metric space and $f : M \to M$ be a homeomorphism. We call a continuous cocycle

$$F : M \times \mathbb{R}^2 \to M \times \mathbb{R}^2, \quad (x, v) \mapsto (f(x), A(x)v)$$

hyperbolic if there are $C > 0$ and $\lambda < 1$ and, for every $x \in M$, there exist transverse lines $E^s_x$ and $E^u_x$ in $\mathbb{R}^2$ such that

(1) $A(x)E^s_x = E^s_{f(x)}$ and $A(x)E^u_x = E^u_{f(x)}$

(2) $\|A^n(x)v^s\| \le C\lambda^n \|v^s\|$ and $\|A^{-n}(x)v^u\| \le C\lambda^n \|v^u\|$

for every $v^s \in E^s_x$, $v^u \in E^u_x$, $x \in M$, and $n \ge 1$.

Proposition 2.1  Let $F : M \times \mathbb{R}^2 \to M \times \mathbb{R}^2$ be the linear cocycle defined by a continuous function $A : M \to \mathrm{SL}(2)$ over a homeomorphism $f : M \to M$. Then $F$ is hyperbolic if and only if there exist constants $c > 0$ and $\sigma > 1$ such that $\|A^n(x)\| \ge c\sigma^n$ for all $x \in M$ and $n \ge 1$.

Proof  (We are going to use (2.2), whose proof will be given in Section 3.2. Similar arguments will appear in Section 3.4, for proving the multiplicative ergodic theorem in dimension 2.)


Suppose that $F$ is hyperbolic. Condition (2) in the definition implies that $\|A^n(x)v^u\| \ge C^{-1}\lambda^{-n}\|v^u\|$, and so

$$\|A^n(x)\| \ge C^{-1}\lambda^{-n} \quad \text{for every } x \text{ and } n.$$

This means that we may take $c = C^{-1}$ and $\sigma = \lambda^{-1}$.

In the converse direction, suppose that $\|A^n(x)\| \ge c\sigma^n$ for every $x$ and $n$. In particular, $\|A^n(x)\| > 1$ for every large $n$. Let $u_n(x)$ and $s_n(x)$ be unit vectors, respectively, most expanded and most contracted by $A^n(x)$ (Exercise 2.3):

$$\|A^n(x)u_n(x)\| = \|A^n(x)\| \ge c\sigma^n \quad \text{and} \quad \|A^n(x)s_n(x)\| = \|A^n(x)^{-1}\|^{-1} = \|A^n(x)\|^{-1} \le c^{-1}\sigma^{-n}. \tag{2.5}$$

Lemma 2.2  There are $C_1, C_2 > 0$ such that

$$|\sin\angle(s_n(x), s_{n+1}(x))| \le C_1 \sigma^{-n}\|A^n(x)\|^{-1} \le C_2 \sigma^{-2n}$$

for all $n \ge 0$ and $x \in M$.

Proof  Write $\alpha_n = \angle(s_n(x), s_{n+1}(x))$. Then $s_n(x) = \sin\alpha_n\, u_{n+1}(x) + \cos\alpha_n\, s_{n+1}(x)$. Then, since the images of $s_{n+1}(x)$ and $u_{n+1}(x)$ under $A^{n+1}(x)$ are orthogonal,

$$\|A^{n+1}(x)s_n(x)\| \ge \|\sin\alpha_n\, A^{n+1}(x)u_{n+1}(x)\| = |\sin\alpha_n|\, \|A^{n+1}(x)\|.$$

On the other hand,

$$\|A^{n+1}(x)s_n(x)\| \le \|A(f^n(x))\|\, \|A^n(x)s_n(x)\| = \|A(f^n(x))\|\, \|A^n(x)\|^{-1}.$$

Let $C_0 > 0$ be an upper bound for the norm of $A$. Then, substituting (2.5),

$$|\sin\alpha_n| \le \frac{\|A(f^n(x))\|}{\|A^{n+1}(x)\|\, \|A^n(x)\|} \le \frac{C_0}{c\sigma^{n+1}\|A^n(x)\|} \le \frac{C_0}{c^2\sigma^{2n+1}}.$$

This implies the conclusion of the lemma, with $c\sigma C_1 = c^2\sigma C_2 = C_0$.

Lemma 2.2 implies that $(s_n(x))_n$ is a Cauchy sequence in projective space; that is, it can be made a Cauchy sequence by multiplying some of the vectors $s_n(x)$ by $-1$. Let $s(x) = \lim_n s_n(x)$. Then,

$$|\sin\angle(s_n(x), s(x))| \le C_1 \sum_{m=n}^{\infty} \sigma^{-m}\|A^m(x)\|^{-1} \le C_2 \sum_{m=n}^{\infty} \sigma^{-2m} \tag{2.6}$$

for every $x \in M$ and every $n \ge 1$. In particular, the convergence is uniform. This also implies that $s(x)$ is a continuous function of $x \in M$.

Lemma 2.3  $A(x)s(x)$ is collinear to $s(f(x))$ for every $x \in M$.

Proof  Let $\beta_n = \angle(A(x)s_{n+1}(x), s_n(f(x)))$. Then

$$A(x)s_{n+1}(x) = \cos\beta_n\, s_n(f(x)) + \sin\beta_n\, u_n(f(x)).$$

Applying $A^n(f(x))$ to both sides,

$$\|A^{n+1}(x)s_{n+1}(x)\| \ge |\sin\beta_n|\, \|A^n(f(x))u_n(f(x))\| - \|A^n(f(x))s_n(f(x))\|.$$

Substituting (2.5), we get that $c^{-1}\sigma^{-n-1} \ge c\sigma^n|\sin\beta_n| - c^{-1}\sigma^{-n}$, and so $|\sin\beta_n| \le 2c^{-2}\sigma^{-2n}$. So $|\sin\beta_n| \to 0$ as $n \to \infty$, and this implies the claim in the lemma.

Lemma 2.4  For any $\sigma_0 < \sigma$ there exists $n_0 \ge 1$ such that $\|A^n(x)s(x)\| \le \sigma_0^{-n}$ for every $x \in M$ and $n \ge n_0$.

Proof  First, take $x \in M$ to be such that $\lim_n n^{-1}\log\|A^n(x)\|$ exists. Observe that, according to (2.2), this is the case for $\mu$-almost every point $x$ and every $f$-invariant probability measure $\mu$. We claim that

$$\limsup_n \frac{1}{n}\log\|A^n(x)s(x)\| \le -\log\sigma. \tag{2.7}$$

For proving this, let $\gamma_n = \angle(s(x), s_n(x))$. Then $s(x) = \cos\gamma_n\, s_n(x) + \sin\gamma_n\, u_n(x)$ and so, using (2.5) and (2.6),

$$\|A^n(x)s(x)\| \le |\cos\gamma_n|\, \|A^n(x)s_n(x)\| + |\sin\gamma_n|\, \|A^n(x)u_n(x)\| \le \|A^n\|^{-1} + C_1 \sum_{j=n}^{\infty} \sigma^{-j}\|A^j\|^{-1}\|A^n\|.$$

The assumption implies that for any $\varepsilon > 0$ there exists $n_\varepsilon \ge 1$ such that

$$e^{-n\varepsilon} \le \|A^m\|^{-1}\|A^n\| \le e^{n\varepsilon} \quad \text{for every } m \ge n \ge n_\varepsilon.$$

Using (2.5) once more, it follows that

$$\|A^n(x)s(x)\| \le c^{-1}\sigma^{-n} + C_1 e^{n\varepsilon} \sum_{j=n}^{\infty} \sigma^{-j} \le C_1' e^{n\varepsilon}\sigma^{-n} \quad \text{for every } n \ge n_\varepsilon,$$

where the constant $C_1'$ depends only on $c$, $C_1$, and $\sigma$. This implies our claim. Thus, we have shown that (2.7) holds for $\mu$-almost every $x$ and any $f$-invariant probability measure $\mu$.

Now, suppose that the conclusion of the lemma is false. Then there exists $\sigma_0 < \sigma$ and, for every $k \ge 1$, there exists $n_k \ge k$ and $x_k \in M$ such that

$$\|A^{n_k}(x_k)s(x_k)\| > \sigma_0^{-n_k}.$$

Define $\mu_k = n_k^{-1}\sum_{j=0}^{n_k-1} \delta_{f^j(x_k)}$ and $\phi(x) = \log\|A(x)s(x)\|$. The previous relation




may be written, equivalently, as $\int \phi\, d\mu_k > -\log\sigma_0$. Since the space of probability measures on $M$ is weak$^*$-compact, up to restricting to a subsequence if necessary we may assume that the sequence $(\mu_k)_k$ converges in the weak$^*$ topology to some probability measure $\mu$ on $M$. Since the function $\phi$ is continuous, it follows that

$$\int \phi\, d\mu \ge -\log\sigma_0.$$

Clearly, $f_*\mu_k = \mu_k + n_k^{-1}\big(\delta_{f^{n_k}(x_k)} - \delta_{x_k}\big)$. Making $k \to \infty$ we find that $f_*\mu = \mu$; that is, $\mu$ is invariant under $f$. Let $\tilde\phi$ be the Birkhoff time average of $\phi$:

$$\tilde\phi(x) = \lim_n \frac{1}{n}\sum_{j=0}^{n-1} \phi(f^j(x)) = \lim_n \frac{1}{n}\log\|A^n(x)s(x)\|.$$

On the one hand, by the ergodic theorem, $\int \tilde\phi\, d\mu = \int \phi\, d\mu \ge -\log\sigma_0$. On the other hand, (2.7) gives that $\tilde\phi(x) \le -\log\sigma$ for $\mu$-almost every $x$. These two inequalities are incompatible. This contradiction proves the lemma.

Let $\sigma_0 \in (1, \sigma)$ be fixed. By Lemma 2.4, there exists $C_3 > 0$ such that

$$\|A^n(x)s(x)\| \le C_3\sigma_0^{-n} \quad \text{for every } x \in M \text{ and every } n \ge 1. \tag{2.8}$$

Analogously, considering backward iterates instead, one constructs a unit vector $u(x)$ such that $A^{-1}(x)u(x)$ is collinear to $u(f^{-1}(x))$ for every $x \in M$ and

$$\|A^{-n}(x)u(x)\| \le C_3\sigma_0^{-n} \quad \text{for every } x \in M \text{ and every } n \ge 1. \tag{2.9}$$

Incidentally, this is the first time in the proof that we have used the assumption that the cocycle is invertible.

Lemma 2.5  The vectors $s(x)$ and $u(x)$ are transverse for every $x \in M$.

Proof  From (2.8) we get that $\|A^n(f^{-n}(x))s(f^{-n}(x))\| \le C_3\sigma_0^{-n}$ for every $x$ and $n$. This may be rewritten as $\|A^{-n}(x)s(x)\| \ge C_3^{-1}\sigma_0^n$, in view of Lemma 2.3. Moreover, using (2.9), $\|A^{-n}(x)u(x)\| \le C_3\sigma_0^{-n}$, for every $x$ and $n$. Then $\|A^{-n}(x)s(x)\|$ is larger than $\|A^{-n}(x)u(x)\|$ for every large $n$. In particular, $s(x)$ and $u(x)$ cannot be collinear, as claimed.

Define $E^s_x$ and $E^u_x$ to be the lines generated by $s(x)$ and $u(x)$, respectively. Lemmas 2.2–2.5 yield all the properties in the conclusion of Proposition 2.1.


2.2.2 Stability and continuity

Let $C^0(M, \mathrm{SL}(2))$ denote the space of continuous functions $M \to \mathrm{SL}(2)$, endowed with the distance

$$d(A, B) = \sup_{x \in M} \|A(x) - B(x)\|.$$

We are going to deduce from Proposition 2.1 that hyperbolicity corresponds to an open subset of this space. Moreover, the invariant decomposition varies continuously on this subset.

Proposition 2.6  Let $f : M \to M$ be fixed. Suppose that the linear cocycle $F$ defined by $A : M \to \mathrm{SL}(2)$ over $f$ is hyperbolic, and let $\mathbb{R}^2 = E^s_x \oplus E^u_x$ be the corresponding invariant decomposition. There exists $\delta > 0$ such that the cocycle defined over $f$ by any continuous function $B : M \to \mathrm{SL}(2)$ with $d(A, B) < \delta$ is also hyperbolic. Moreover, given $\varepsilon > 0$, the $B$-invariant decomposition $\mathbb{R}^2 = E^s_{B,x} \oplus E^u_{B,x}$, $x \in M$ satisfies $|\sin\angle(E^s_x, E^s_{B,x})| < \varepsilon$ and $|\sin\angle(E^u_x, E^u_{B,x})| < \varepsilon$ for all $x \in M$, if $\delta$ is small enough.

Proof  For each $x \in M$ and $\gamma > 0$, let $C^u(x, \gamma)$ be the set of vectors $v \in \mathbb{R}^2$ whose coordinates $(v^s, v^u)$ in the decomposition $\mathbb{R}^2 = E^s_x \oplus E^u_x$ satisfy $\|v^s\| \le \gamma\|v^u\|$ (this is a particular case of a cone, of which we will hear more in Section 4.4.2). Let $C > 0$ and $\lambda < 1$ be constants as in the definition of hyperbolicity. Fix $k \ge 1$ large enough so that $C\lambda^k \le 1/3$. Then, for any $v \in C^u(x, 1)$ and $x \in M$,

$$\|A^k(x)v^u\| \ge 3\|v^u\| \quad \text{and} \quad \|A^k(x)v^s\| \le \frac{1}{3}\|v^s\| \le \frac{1}{3}\|v^u\| \le \frac{1}{9}\|A^k(x)v^u\|,$$

and so $A^k(x)v \in C^u(f^k(x), 1/9)$. Then, for any $B : M \to \mathrm{SL}(2)$ such that $d(A, B)$ is sufficiently small, for every $v \in C^u(x, 1)$, and every $x \in M$,

$$\|(B^k(x)v)^u\| \ge 2\|v^u\| \quad \text{and} \quad B^k(x)v \in C^u(f^k(x), 1).$$

So, by induction, $\|(B^{kn}(x)v)^u\| \ge 2^n\|v^u\|$ for every $v \in C^u(x, 1)$, every $x \in M$, and every $n \ge 1$. This implies that $\|B^n(x)\| \ge c\sigma^n$ for every $x \in M$ and $n \ge 1$, where $\sigma = 2^{1/k}$ and $c > 0$ are independent of $B$, $x$, and $n$. By Proposition 2.1, it follows that the cocycle defined by $B$ is hyperbolic, as claimed.

To prove the other claim in the proposition, let $\varepsilon > 0$ be fixed. Let $s_B(x)$ be a unit vector in the direction of $E^s_{B,x}$ and, for each $n \ge 1$, let $s_{B,n}(x)$ be the unit vector most contracted by $B^n(x)$. By (2.6), there is a constant $C > 0$ independent of $B$, $x$, and $n$, such that

$$|\sin\angle(s_{B,n}(x), s_B(x))| \le C\sigma^{-2n}.$$


Fix $n \ge 1$, large enough so that $C\sigma^{-2n} < \varepsilon/4$ and then assume $d(A, B)$ is small enough so that

$$|\sin\angle(s_{A,n}(x), s_{B,n}(x))| < \varepsilon/4 \quad \text{for every } x \in M \text{ and } n \ge 1.$$

Then

$$|\sin\angle(s_A(x), s_B(x))| \le |\sin\angle(s_{A,n}(x), s_{B,n}(x))| + 2C\sigma^{-2n} < \varepsilon.$$

This proves that $|\sin\angle(E^s_x, E^s_{B,x})| < \varepsilon$ for every $x \in M$. The corresponding statement for $E^u$ is analogous.

Here is another interesting consequence of Proposition 2.1.

Proposition 2.7  Suppose that $A : M \to \mathrm{SL}(2)$ is such that $A(x)$ has positive entries for every $x \in M$. Then the linear cocycle $F$ defined by $A$ over any homeomorphism $f : M \to M$ is hyperbolic.

Proof  By hypothesis,

$$A(x) = \begin{pmatrix} a_x & b_x \\ c_x & d_x \end{pmatrix}$$

with $a_x, b_x, c_x, d_x > 0$ and $a_x d_x - b_x c_x = 1$. Since $M$ is compact and $A$ is continuous, there exists a positive lower bound $\delta > 0$ for $a_x$, $b_x$, $c_x$, and $d_x$ over all $x \in M$. It is clear that the cocycle preserves the subset of vectors with positive entries: if $v = (v_0, w_0)$ is such that $v_0 > 0$ and $w_0 > 0$ then $A(x)v = (v_1, w_1)$ with $v_1 > 0$ and $w_1 > 0$. Moreover,

$$v_1 w_1 > (1 + 2b_x c_x)v_0 w_0 \ge (1 + 2\delta^2)v_0 w_0.$$

Then, by induction, the iterates $(v_n, w_n) = A^n(x)v$ have $v_n > 0$, $w_n > 0$ and $v_n w_n > (1 + 2\delta^2)^n v_0 w_0$. This implies that the norm of $A^n(x)$ grows exponentially fast: $\|A^n(x)\| \ge c\sigma^n$ for some $c > 0$ and $\sigma = (1 + 2\delta^2)^{1/2}$. Now the claim follows directly from Proposition 2.1.

The notion of hyperbolicity extends naturally to linear cocycles in any dimension and one may even consider linear cocycles on general vector bundles, as mentioned in Section 2.1.2. Namely, let $f : M \to M$ be a homeomorphism on a compact space $M$. A continuous linear cocycle $F : V \to V$ over $f$ is hyperbolic if there are constants $C > 0$ and $\lambda < 1$ and for every $x \in M$ there exists a direct sum decomposition $V_x = E^s_x \oplus E^u_x$ of the fiber over $x$ satisfying

(1) $F_x(E^s_x) = E^s_{f(x)}$ and $F_x(E^u_x) = E^u_{f(x)}$

(2) $\|F^n_x(v^s)\| \le C\lambda^n\|v^s\|$ and $\|F^{-n}_x(v^u)\| \le C\lambda^n\|v^u\|$

for every $v^s \in E^s_x$, $v^u \in E^u_x$, and $n \ge 1$. The conclusion of Proposition 2.6 remains valid in this generality, as we are going to explain. First of all, there is a natural topology in the space of linear cocycles over $f : M \to M$, associated with the following norm. Fix any finite


set of trivializing coordinates $h_\alpha : U_\alpha \times \mathbb{R}^d \to \pi^{-1}(U_\alpha)$ for the vector bundle $V$. The local expressions

$$h_\beta^{-1} \circ F \circ h_\alpha : \big(U_\alpha \cap f^{-1}(U_\beta)\big) \times \mathbb{R}^d \to U_\beta \times \mathbb{R}^d$$

of $F$ are maps of the form $(x, v) \mapsto (f(x), F_{\alpha,\beta}(x)v)$. Define

$$\|F\| = \max_{\alpha,\beta} \sup\big\{\|F_{\alpha,\beta}(x)\| : x \in U_\alpha \cap f^{-1}(U_\beta)\big\}.$$

The topology associated with this norm does not depend on the choice of the trivializing atlas. Then the set of hyperbolic cocycles is an open subset of the space of all linear cocycles over $f : M \to M$. Moreover, the invariant sub-bundles $E^s$ and $E^u$ vary continuously with the linear cocycle inside that open set, in the following sense: restricted to each trivializing domain $U_\alpha$, we may view $x \mapsto E^*_x$, $* \in \{s, u\}$ as a map from $U_\alpha$ to the Grassmannian $\mathrm{Gr}(d)$; if $F$ is a hyperbolic linear cocycle and $G$ is a nearby linear cocycle then the maps $U_\alpha \to \mathrm{Gr}(d)$ associated with $G$ are uniformly close to those of $F$, for each $\alpha$. The proof of these facts is part of Exercise 2.6 (see also Shub [107]). By the Grassmannian $\mathrm{Gr}(d)$ we mean the (disjoint) union of the Grassmannian manifolds $\mathrm{Gr}(l, d)$, $0 \le l \le d$, whose elements are the $l$-dimensional vector subspaces of $\mathbb{R}^d$.
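The growth estimate $v_n w_n > (1 + 2\delta^2)^n v_0 w_0$ from the proof of Proposition 2.7 can be checked numerically on a finite sample of matrices. This is a minimal sketch under assumptions that are not in the text: the sampling scheme, the bounds `low` and `high`, and the function names are hypothetical, and a finite list of matrices stands in for the values $A(f^j(x))$ along an orbit, with $\delta$ taken as the smallest entry in the sample.

```python
import math
import random


def random_positive_sl2(rng, low=0.2, high=1.5):
    """Sample (a, b, c, d) with positive entries and a*d - b*c = 1.

    The entries b, c, d are drawn uniformly from [low, high] and a is then
    determined by the determinant condition.
    """
    b = rng.uniform(low, high)
    c = rng.uniform(low, high)
    d = rng.uniform(low, high)
    a = (1.0 + b * c) / d
    return (a, b, c, d)


def product_growth_holds(matrices, v0=1.0, w0=1.0):
    """Check v_n * w_n > (1 + 2*delta^2)^n * v_0 * w_0 for n = 1, ..., len(matrices).

    The comparison is done in log scale to stay away from overflow.
    """
    delta = min(min(m) for m in matrices)  # lower bound for all sampled entries
    log_rate = math.log(1.0 + 2.0 * delta * delta)
    log_start = math.log(v0 * w0)
    v, w = v0, w0
    for n, (a, b, c, d) in enumerate(matrices, start=1):
        v, w = a * v + b * w, c * v + d * w  # (v_n, w_n) = A(f^{n-1}(x)) (v_{n-1}, w_{n-1})
        if not math.log(v) + math.log(w) > n * log_rate + log_start:
            return False
    return True


if __name__ == "__main__":
    rng = random.Random(0)
    sample = [random_positive_sl2(rng) for _ in range(100)]
    print("growth estimate holds on the sample:", product_growth_holds(sample))
```

The strict inequality comes from discarding the positive terms $a_x c_x v_0^2$ and $b_x d_x w_0^2$ in the product $v_1 w_1$ and using $a_x d_x + b_x c_x = 1 + 2 b_x c_x$.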

2.2.3 Obstructions to hyperbolicity

We describe a few mechanisms that exclude the presence of hyperbolic cocycles in certain situations.

Example 2.8  Let $M$ be a compact, connected metric space, $f : M \to M$ be a continuous transformation, and $A : M \to \mathrm{SL}(d)$ be a continuous function such that for every $1 \le i \le d - 1$ there exists a periodic point $p_i \in M$ of $f$, with period $\kappa_i \ge 1$, such that the eigenvalues $\{\beta^i_j : 1 \le j \le d\}$ of each $A^{\kappa_i}(p_i)$ satisfy

$$|\beta^i_1| \ge \cdots \ge |\beta^i_{i-1}| > |\beta^i_i| = |\beta^i_{i+1}| > |\beta^i_{i+2}| \ge \cdots \ge |\beta^i_d| \tag{2.10}$$

and $\beta^i_i$, $\beta^i_{i+1}$ are complex conjugate (not real). Such an $A$ may be found, for instance, starting with a constant cocycle and deforming it on disjoint neighborhoods of the periodic orbits. Property (2.10) remains valid for every $B : M \to \mathrm{SL}(d)$ in a $C^0$ neighborhood $U$ of $A$. Consider any $B \in U$ and suppose the associated cocycle is hyperbolic, with hyperbolic decomposition $E^s_x \oplus E^u_x$ at each $x \in M$. Since the hyperbolic decomposition is continuous (Exercises 2.1 and 2.6), and $M$ is connected, the dimensions of $E^s_x$ and $E^u_x$ are constant. Let $\dim E^u_x = l$. Then, for any point $p \in M$ with $f^\kappa(p) = p$, the $l$ largest (in norm) eigenvalues of $B^\kappa(p)$ are strictly larger than the other $d - l$ eigenvalues. This


is incompatible with (2.10) when $p = p_l$. This contradiction shows that no cocycle in $U$ can be hyperbolic.

Example 2.9 (Michael Herman)  Let $S^1 = \mathbb{R}/\mathbb{Z}$ and $f : S^1 \to S^1$ be a continuous transformation, with topological degree $\deg(f) \in \mathbb{Z}$. Let $A : S^1 \to \mathrm{SL}(2)$ be of the form $A(x) = A_0 R_{2\pi\alpha(x)}$ where $A_0 \in \mathrm{SL}(2)$, $\alpha : S^1 \to S^1$ is a continuous function with topological degree $\deg(\alpha) \in \mathbb{Z}$, and $R_\theta$ denotes the rotation of angle $\theta$. Let $U$ be the isotopy class of $A$ in the space of maps from $S^1$ to $\mathrm{SL}(2)$. This is a $C^0$ neighborhood of $A$. We claim that if $2\deg(\alpha)$ is not a multiple of $\deg(f) - 1$ then the linear cocycle associated with every $B \in U$ is not hyperbolic. Indeed, let $B \in U$ and suppose the associated cocycle is hyperbolic. Let $E^s_x \oplus E^u_x$ be the hyperbolic decomposition. Let us view $E^s : x \mapsto E^s_x$ as a continuous map from $S^1$ to the projectivization $\mathbb{P}\mathbb{R}^2$ of the real plane. The graph $\{(x, E^s_x) : x \in S^1\}$ represents some element $(\eta, \zeta)$ of the fundamental group $\pi_1(S^1 \times \mathbb{P}\mathbb{R}^2) = \mathbb{Z} \oplus \mathbb{Z}$. Since $B$ is isotopic to $A$, the image of $\mathrm{graph}(E^s)$ under the cocycle must represent $(\eta\deg(f), \zeta + 2\deg(\alpha)) \in \pi_1(S^1 \times \mathbb{P}\mathbb{R}^2)$ (the factor 2 comes from the fact that $S^1$ is the 2-fold covering of $\mathbb{P}\mathbb{R}^2$). This must be collinear to $(\eta, \zeta)$, because $E^s$ is invariant under the cocycle. In other words, we must have $\zeta + 2\deg(\alpha) = \deg(f)\zeta$. Then $\deg(f) - 1$ must divide $2\deg(\alpha)$. This proves our claim.

Example 2.10  A diffeomorphism $f : M \to M$ on a compact manifold is an Anosov diffeomorphism if the derivative cocycle $F = Df$ is hyperbolic. The existence of Anosov diffeomorphisms imposes strong restrictions on the manifold $M$. For one thing, the Euler characteristic must be zero. In dimension 2 the Klein bottle may also be excluded, so that the only surface that admits Anosov diffeomorphisms is the torus $\mathbb{T}^2$. Anosov diffeomorphisms can also be constructed on higher-dimensional tori, as follows. Let $A$ be any $d \times d$ matrix with integer coefficients and determinant 1. Then $A$ defines a diffeomorphism of $\mathbb{T}^d = \mathbb{R}^d/\mathbb{Z}^d$ and, assuming that $A$ has no eigenvalues on the unit circle, this diffeomorphism is Anosov. Every Anosov diffeomorphism on a torus is topologically conjugate to such a hyperbolic automorphism. Anosov diffeomorphisms with a similar algebraic flavor may be constructed on the more general class of infranilmanifolds, which are suitable quotients of Lie groups. Again, all Anosov diffeomorphisms on infranilmanifolds are topologically conjugate to a hyperbolic automorphism of the manifold. Moreover, all known examples of Anosov diffeomorphisms are defined on infranilmanifolds. See Newhouse [91], Franks [52], Manning [90].
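In dimension $d = 2$, the condition in Example 2.10 that an integer matrix with determinant 1 has no eigenvalues on the unit circle can be tested directly. The sketch below is only an illustration; the function names and the tolerance `tol` are assumptions introduced here, and the three matrices in the demo are standard examples chosen for this note, not taken from the text.

```python
import cmath


def eigenvalues_2x2(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]] from the characteristic polynomial."""
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return ((trace + root) / 2, (trace - root) / 2)


def is_hyperbolic_automorphism(a, b, c, d, tol=1e-9):
    """True if the integer matrix has determinant 1 and no eigenvalue of modulus 1."""
    if a * d - b * c != 1:
        return False
    return all(abs(abs(z) - 1.0) > tol for z in eigenvalues_2x2(a, b, c, d))


if __name__ == "__main__":
    for name, entries in [
        ("[[2, 1], [1, 1]]", (2, 1, 1, 1)),
        ("rotation by a quarter turn", (0, -1, 1, 0)),
        ("shear [[1, 1], [0, 1]]", (1, 1, 0, 1)),
    ]:
        print(name, "->", is_hyperbolic_automorphism(*entries))
```

For determinant 1 in dimension 2 the two eigenvalues are inverse to each other, so the test amounts to the trace having absolute value greater than 2.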


2.3 Notes

The theory of products of random matrices was effectively initiated by Furstenberg and Kesten [56] and Furstenberg [54]. There is now a vast literature on the subject, some of which will be discussed later. In addition to the references we provide along the way, let us mention the collections [7, 44, 51] and the book [46].

The time-independent Schrödinger equation describes the so-called orbitals, or stationary waves, in quantum mechanics. It takes the form $H\Psi = E\Psi$ where $\Psi$ is the wave function, $E$ is the energy of the quantum state $\Psi$ and $H$ is the Hamiltonian operator, a Hermitian operator that characterizes the total energy of any wave function. In the case of a single particle moving in an electric field the Hamiltonian operator is given by

$$H\Psi(x) = \Big[-\frac{\hbar^2}{2m}\Delta + V(x)\Big]\Psi(x)$$

where $\hbar$ is the reduced Planck constant, $m$ is the mass, $V$ is the potential energy and $\Delta$ is the Laplacian operator. Our discussion in Section 2.1.3 concerns the case when the space variable $x$ is one-dimensional and discrete. Then, the Laplacian is simply given by $\Delta\Psi(n) = \Psi(n + 1) - 2\Psi(n) + \Psi(n - 1)$ and this readily leads to the form of the Schrödinger operator in (2.3).

Much of the mathematical study of Schrödinger operators is motivated by an observation made in 1958 by the American physicist Philip Anderson. He argued in [1] that, while ideal crystals are always conductors, the presence of impurities should cause the crystal to lose all its conductivity properties and, thus, become an insulator: the electrons are trapped due to the crystal lattice disorder (this discovery earned Anderson the Nobel Prize for Physics in 1977). Mathematically, such disordered systems are modeled by suitable (random) Schrödinger operators $H : \ell^2(\mathbb{Z}^d) \to \ell^2(\mathbb{Z}^d)$ and the Anderson localization phenomenon corresponds to the operator $H$ having only a pure point spectrum. In other words, localization means that the space $\ell^2(\mathbb{Z}^d)$ admits a Hilbert basis formed by eigenvectors of $H$. See Damanik [47] and Jitomirskaya and Marx [66] for surveys on this and related problems. In the text, we restricted ourselves to hinting (in the case $d = 1$) that the Lyapunov exponents of the associated Schrödinger cocycles have a say in the spectral theory of these operators. Damanik [48] contains a lot more information on this topic.

The observations in Section 2.2 will be useful in Chapter 9. The notion of (uniform) hyperbolicity is due to Smale; see [109] and references therein. He had in mind diffeomorphisms and smooth flows (derivative cocycles) and the purpose was twofold: to prove that hyperbolicity characterizes the structural


stability of the dynamics (the Stability Conjectures of Palis and Smale [96]), and to conclude that most smooth systems are structurally stable (which turned out not to be true). Of course, hyperbolicity was implicit in some previous works, for instance in the proof by E. Hopf [65] that the geodesic flow on any surface with negative curvature is ergodic. Anosov [2] introduced the class of globally hyperbolic diffeomorphisms and flows to extend this result to arbitrary dimension: he observed that the geodesic flow on any compact manifold with negative (sectional) curvature is globally hyperbolic and he proved that volume-preserving globally hyperbolic $C^2$ diffeomorphisms and flows are ergodic.

Especially in the context of diffeomorphisms and smooth flows, the concept of (uniform) hyperbolicity has been broadened to that of non-uniform hyperbolicity, which refers to systems whose Lyapunov exponents are non-zero at almost every point, relative to some distinguished invariant measure. This was initiated by Pesin [99, 100] with major contributions also by Katok [68], Mañé [86], Ledrappier [79], Ledrappier and Young [83, 84] and Barreira, Pesin and Schmeling [19] among others. See also the book of Barreira and Pesin [18] and references therein. For most of these results the system is assumed to be $C^{1+\text{Hölder}}$.

2.4 Exercises

Exercise 2.1  Show that $E^s_x$ and $E^u_x$ are unique when they exist. Moreover, show they depend continuously on the point $x \in M$.

Exercise 2.2  Prove that if the Schrödinger cocycle $F_E$ is hyperbolic then the spectral equation $H(u) = Eu$ has no solution in $\ell^2$.

Exercise 2.3  Show that given $B \in \mathrm{SL}(2)$ such that $\|B\| \ne 1$, there exist unit vectors $s$ and $u$ such that $\|B(u)\| = \|B\|$ and $\|B(s)\| = \|B^{-1}\|^{-1} = \|B\|^{-1}$. These vectors are unique, up to multiplication by $-1$, they are orthogonal, and their images $B(s)$ and $B(u)$ are also orthogonal.

Exercise 2.4  Check that $\|B|_{r_1}\| / \|B|_{r_2}\| \le \|B\|^2$ for any lines $r_1, r_2 \subset \mathbb{R}^2$ and $B \in \mathrm{SL}(2)$.

Exercise 2.5  Prove that a linear cocycle $F$ is hyperbolic if and only if any iterate $F^k$, $k \ne 0$ is hyperbolic.

Exercise 2.6  Extend Exercise 2.1, Proposition 2.6, and Exercise 2.5 to linear cocycles on vector bundles of arbitrary dimension.


3 Extremal Lyapunov exponents

As announced already in (2.2), the theorem of Furstenberg and Kesten [56] states that, under a suitable integrability assumption, the norm $\|A^n(x)\|$ and the conorm $\|A^n(x)^{-1}\|^{-1}$ of any linear cocycle have well-defined exponential rates as time $n \to \infty$, for almost every point $x$. The precise statement, which contains Theorem 2.2, is given in Section 3.2. Indeed, we obtain the result as a straightforward consequence of the so-called subadditive ergodic theorem of Kingman [74], which we state and prove in Section 3.1. We take the occasion, in Section 3.3, to illustrate Herman's [62] subharmonic trick for bounding the largest Lyapunov exponents from below.

The multiplicative ergodic theorem of Oseledets [92] improves the Furstenberg–Kesten theorem in that it provides exponential rates for the iterates $A^n(x)v$ of all vectors, rather than just for the norm and conorm of the matrices. The full statements will appear in Chapter 4. Here (Section 3.4), we treat the special case of cocycles in dimension 2, which is much simpler and contains a few of the elements of the general case.

3.1 Subadditive ergodic theorem

Let $(M, \mathcal{B}, \mu)$ be a probability space and $f : M \to M$ be a measure-preserving transformation. The positive part and the negative part of a measurable function $\varphi : M \to [-\infty, +\infty]$ are the non-negative functions defined by, respectively,

$$\varphi^+(x) = \max\{0, \varphi(x)\} \quad \text{and} \quad \varphi^-(x) = \max\{0, -\varphi(x)\}.$$

Downloaded from University Publishing Online. This is copyrighted material http://dx.doi.org/10.1017/CBO9781139976602.004


A measurable function $\varphi$ is called (essentially) invariant if $\varphi(f(x)) = \varphi(x)$ for $\mu$-almost all $x \in M$. By definition, a measurable subset of $M$ is invariant if its characteristic function is invariant. A sequence $\varphi_n : M \to [-\infty, +\infty)$, $n \ge 1$, of measurable functions is subadditive, relative to $f$, if

$$\varphi_{m+n} \le \varphi_m + \varphi_n \circ f^m \quad \text{for all } m, n \ge 1.$$

Example 3.1  Given any measurable function $\psi : M \to \mathbb{R}$, consider its orbital sum $\varphi_n = \sum_{j=0}^{n-1} \psi \circ f^j$. Then $\varphi_{m+n} = \varphi_m + \varphi_n \circ f^m$ for every $m$ and $n$ and, in particular, $(\varphi_n)_n$ is a subadditive sequence, and even an additive sequence, since the equality always holds. It is clear that, conversely, every additive sequence is the orbital sum of its first term.

Example 3.2  Given any measurable function $A : M \to \mathrm{GL}(d)$, consider the sequence $\varphi_n(x) = \log\|A^n(x)\|$, where $A^n(x) = A(f^{n-1}(x)) \cdots A(f(x))A(x)$. As $\|B_1 B_2\| \le \|B_1\|\,\|B_2\|$ for every $B_1, B_2 \in \mathrm{GL}(d)$, the sequence $(\varphi_n)_n$ is subadditive.

Theorem 3.3 (Kingman)  Let $\varphi_n : M \to [-\infty, +\infty)$, $n \ge 1$ be a subadditive sequence of measurable functions such that $\varphi_1^+ \in L^1(\mu)$. Then $(\varphi_n/n)_n$ converges $\mu$-almost everywhere to some invariant function $\varphi : M \to [-\infty, +\infty)$. Moreover, the positive part $\varphi^+$ is integrable and

$$\int \varphi\, d\mu = \lim_n \frac{1}{n}\int \varphi_n\, d\mu = \inf_n \frac{1}{n}\int \varphi_n\, d\mu \in [-\infty, +\infty).$$

The proof of this theorem will be presented later. An interesting feature is that it does not use the ergodic theorem of Birkhoff. Thus, the latter can be obtained as a consequence (Corollary 3.10).
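The subadditivity claimed in Example 3.2 can be observed numerically for $d = 2$. The sketch below is an illustration only: the base map (a circle rotation with rotation number `ALPHA`), the matrix function `A` and all function names are hypothetical choices made for this note, not objects from the text. The operator norm of a $2 \times 2$ matrix is computed as its largest singular value.

```python
import math


def matmul(p, q):
    """Product of two 2x2 matrices given as nested tuples."""
    (a, b), (c, d) = p
    (e, f_), (g, h) = q
    return ((a * e + b * g, a * f_ + b * h), (c * e + d * g, c * f_ + d * h))


def operator_norm(m):
    """Largest singular value of a 2x2 matrix.

    The squared singular values are the roots of t^2 - T t + det^2, where T
    is the sum of the squares of the entries.
    """
    (a, b), (c, d) = m
    t = a * a + b * b + c * c + d * d
    det = a * d - b * c
    disc = max(t * t - 4.0 * det * det, 0.0)
    return math.sqrt((t + math.sqrt(disc)) / 2.0)


ALPHA = (math.sqrt(5.0) - 1.0) / 2.0  # assumed rotation number


def f(x):
    """Base dynamics: rotation of the circle R/Z by ALPHA."""
    return (x + ALPHA) % 1.0


def A(x):
    """An assumed continuous map into GL(2); its determinant is 2 + cos(2 pi x) > 0."""
    return ((3.0 + math.cos(2.0 * math.pi * x), 1.0), (1.0, 1.0))


def f_iter(m, x):
    """Return f^m(x)."""
    for _ in range(m):
        x = f(x)
    return x


def phi(n, x):
    """phi_n(x) = log ||A^n(x)||, with A^n(x) = A(f^{n-1}(x)) ... A(f(x)) A(x)."""
    product = ((1.0, 0.0), (0.0, 1.0))
    for _ in range(n):
        product = matmul(A(x), product)
        x = f(x)
    return math.log(operator_norm(product))


if __name__ == "__main__":
    x0 = 0.1
    for m, n in [(1, 1), (2, 3), (5, 4)]:
        print(m, n, phi(m + n, x0), "<=", phi(m, x0) + phi(n, f_iter(m, x0)))
```

The inequality follows from $A^{m+n}(x) = A^n(f^m(x))\,A^m(x)$ together with the submultiplicativity of the norm.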

3.1.1 Preparing the proof

A sequence $(a_n)_n$ in $[-\infty, +\infty)$ is said to be subadditive if $a_{m+n} \le a_m + a_n$ holds for any $m, n \ge 1$.

Lemma 3.4  If $(a_n)_n$ is a subadditive sequence then

$$\lim_n \frac{a_n}{n} = \inf_n \frac{a_n}{n} \in [-\infty, \infty). \tag{3.1}$$

Proof  If $a_m = -\infty$ for some $m$ then, by subadditivity, $a_n = -\infty$ for all $n > m$. Then both sides of (3.1) are equal to $-\infty$, and so the lemma holds in this case. From now on, assume that $a_n \in \mathbb{R}$ for all $n$. Let $L = \inf_n (a_n/n) \in [-\infty, +\infty)$ and let $L'$ be any real number bigger than $L$. Then we can find $k \ge 1$ such that $a_k/k < L'$. For $n > k$, we may write $n = kp + q$, where $p$ and $q$ are integer numbers such that $p \ge 1$ and $1 \le q \le k$. Then, by subadditivity,

$$a_n \le a_{kp} + a_q \le p a_k + a_q \le p a_k + \alpha,$$

where $\alpha = \max\{a_i : 1 \le i \le k\}$. Then,

$$\frac{a_n}{n} \le \frac{pk}{n}\,\frac{a_k}{k} + \frac{\alpha}{n}.$$

Observe that $pk/n$ converges to 1 and $\alpha/n$ converges to zero when $n \to \infty$. Therefore, since $a_k/k < L'$, we have

$$L \le \frac{a_n}{n} < L'$$

for any large enough $n$. Making $L' \to L$, we conclude that

$$\lim_n \frac{a_n}{n} = L = \inf_n \frac{a_n}{n}.$$

This completes the argument.

Now let $(\varphi_n)_n$ be as in Theorem 3.3. By subadditivity,

$$\varphi_n \le \varphi_1 + \varphi_1 \circ f + \cdots + \varphi_1 \circ f^{n-1}.$$

This inequality remains true if we replace $\varphi_n$ and $\varphi_1$ by $\varphi_n^+$ and $\varphi_1^+$, respectively. So, the hypothesis that $\varphi_1^+ \in L^1(\mu)$ implies that $\varphi_n^+ \in L^1(\mu)$ for any $n$. On the other hand, the hypothesis that $(\varphi_n)_n$ is subadditive implies that

$$a_n = \int \varphi_n\, d\mu, \quad n \ge 1,$$

is a subadditive sequence in $[-\infty, +\infty)$. Then, by Lemma 3.4,

$$\lim_n \frac{a_n}{n} = \inf_n \frac{a_n}{n} = L \in [-\infty, \infty)$$

exists. Define $\varphi_- : M \to [-\infty, \infty]$ and $\varphi_+ : M \to [-\infty, \infty]$ by

$$\varphi_-(x) = \liminf_n \frac{\varphi_n}{n}(x) \quad \text{and} \quad \varphi_+(x) = \limsup_n \frac{\varphi_n}{n}(x).$$

It is clear that $\varphi_-(x) \le \varphi_+(x)$ for every $x \in M$. We are going to prove that

$$\int \varphi_-\, d\mu \ge L \ge \int \varphi_+\, d\mu, \tag{3.2}$$

provided that every function $\varphi_n$ is bounded away from $-\infty$. Consequently, the two functions $\varphi_-$ and $\varphi_+$ coincide at $\mu$-almost every point and their integrals are equal to $L$. This yields the theorem in this case, with $\varphi = \varphi_- = \varphi_+$ (the fact that $\varphi$ is an invariant function is proved in Exercise 3.2). At the end, we use a truncation trick to remove the boundedness condition and, hence, prove the theorem in complete generality.
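Lemma 3.4 can be illustrated on a concrete subadditive sequence. The example $a_n = 2n + \sqrt{n}$ below is an assumption made for this note (it does not appear in the text), as are the function names; it is subadditive because $\sqrt{m + n} \le \sqrt{m} + \sqrt{n}$, and $a_n/n = 2 + 1/\sqrt{n}$ decreases to the infimum 2 without attaining it.

```python
import math


def is_subadditive(a, n_max):
    """Check a(m + n) <= a(m) + a(n) for all m, n >= 1 with m + n <= n_max.

    A small tolerance absorbs floating-point rounding.
    """
    return all(
        a(m + n) <= a(m) + a(n) + 1e-12
        for m in range(1, n_max)
        for n in range(1, n_max - m + 1)
    )


def example_sequence(n):
    """a_n = 2n + sqrt(n), an assumed example of a subadditive sequence."""
    return 2.0 * n + math.sqrt(n)


def ratios(a, n_max):
    """Return [a_1 / 1, a_2 / 2, ..., a_{n_max} / n_max]."""
    return [a(n) / n for n in range(1, n_max + 1)]


if __name__ == "__main__":
    print("subadditive up to 60:", is_subadditive(example_sequence, 60))
    r = ratios(example_sequence, 10000)
    print("a_n / n at n = 10, 100, 10000:", r[9], r[99], r[9999])
```

A finite check of this kind cannot prove convergence; it only shows the ratios $a_n/n$ approaching their infimum, as the lemma predicts.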


3.1.2 Fundamental lemma

In this section we assume that $\varphi_- > -\infty$ at every point. Fix $\varepsilon > 0$ and define, for each $k \in \mathbb{N}$,

$$E_k = \big\{x \in M : \varphi_j(x) \le j\big(\varphi_-(x) + \varepsilon\big) \text{ for some } j \in \{1, \dots, k\}\big\}.$$

It is clear that $E_k \subset E_{k+1}$ for any $k$. Moreover, the definition of $\varphi_-(x)$ implies that $M = \bigcup_k E_k$. Define also,

$$\psi_k(x) = \begin{cases} \varphi_-(x) + \varepsilon & \text{if } x \in E_k \\ \varphi_1(x) & \text{if } x \in E_k^c. \end{cases}$$

The definition of $E_k$ implies that $\varphi_1(x) > \varphi_-(x) + \varepsilon$ for every $x \in E_k^c$. Thus, $\psi_k(x)$ decreases to $\varphi_-(x) + \varepsilon$ as $k \to \infty$, for every $x \in M$. In particular, by the monotone convergence theorem,

$$\int \psi_k\, d\mu \to \int (\varphi_- + \varepsilon)\, d\mu \quad \text{as } k \to \infty.$$

The crucial step in the proof of the theorem is the following estimate:

Lemma 3.5  For any $n > k \ge 1$ and $\mu$-almost every point $x \in M$,

$$\varphi_n(x) \le \sum_{i=0}^{n-k-1} \psi_k(f^i(x)) + \sum_{i=n-k}^{n-1} \max\{\psi_k, \varphi_1\}(f^i(x)).$$

This means that the sequence $(\varphi_n)_n$ is bounded from above by an orbital sum of $\psi_k$ and an "error term". Since orbital sums are additive sequences (recall Example 3.1), this pretty much reduces our subadditive setting to the much easier additive case. Recall that $\psi_k$ converges to $\varphi_- + \varepsilon$ when $k$ goes to infinity. The last sum on the right-hand side of the inequality (the "error term") is negligible because it is the sum of a fixed number $k$ of integrable functions, and $n$ may be taken to be much larger than $k$.

Proof  Take $x \in M$ such that $\varphi_-(x) = \varphi_-(f^j(x))$ for any $j \ge 1$ (this is the case for $\mu$-almost every point; see Exercise 3.2). Consider the sequence, possibly finite, of integer numbers

$$m_0 \le n_1 < m_1 \le n_2 < m_2 < \cdots \tag{3.3}$$

defined inductively in the following way (see also Figure 3.1). Take $m_0 = 0$. Given $j \ge 1$, let $n_j$ be the smallest integer greater than or equal to $m_{j-1}$ that satisfies $f^{n_j}(x) \in E_k$ (assuming it exists). Then, by definition of $E_k$, there exists $m_j$ such that $1 \le m_j - n_j \le k$ and

$$\varphi_{m_j - n_j}(f^{n_j}(x)) \le (m_j - n_j)\big(\varphi_-(f^{n_j}(x)) + \varepsilon\big). \tag{3.4}$$

[Figure 3.1: Splitting the trajectory of a point. Labels: $m_0, n_1, m_1, \dots, n_l, m_l, n_{l+1}, n$; $E_k^c$.]

This completes the definition of the sequence (3.3). Given any $n \ge k$, let $l \ge 0$ be largest such that $m_l \le n$. By subadditivity,

$$\varphi_{n_j - m_{j-1}}(f^{m_{j-1}}(x)) \le \sum_{i=m_{j-1}}^{n_j - 1} \varphi_1(f^i(x))$$

for any $j = 1, \dots, l$ such that $m_{j-1} \ne n_j$, and similarly for $\varphi_{n - m_l}(f^{m_l}(x))$. Thus,

$$\varphi_n(x) \le \sum_{i \in I} \varphi_1(f^i(x)) + \sum_{j=1}^{l} \varphi_{m_j - n_j}(f^{n_j}(x)) \tag{3.5}$$

where $I = \bigcup_{j=1}^{l} [m_{j-1}, n_j) \cup [m_l, n)$. Observe that

$$\varphi_1(f^i(x)) = \psi_k(f^i(x)) \quad \text{for any } i \in \bigcup_{j=1}^{l} [m_{j-1}, n_j) \cup [m_l, \min\{n_{l+1}, n\}),$$

since $f^i(x) \in E_k^c$ in all these cases. Moreover, as $\varphi_-$ is constant on orbits (Exercise 3.2) and $\psi_k \ge \varphi_- + \varepsilon$, the relation (3.4) implies that

$$\varphi_{m_j - n_j}(f^{n_j}(x)) \le \sum_{i=n_j}^{m_j - 1} \big(\varphi_-(f^i(x)) + \varepsilon\big) \le \sum_{i=n_j}^{m_j - 1} \psi_k(f^i(x))$$

for every $j = 1, \dots, l$. Thus, using (3.5) we conclude that

$$\varphi_n(x) \le \sum_{i=0}^{\min\{n_{l+1}, n\} - 1} \psi_k(f^i(x)) + \sum_{i=n_{l+1}}^{n-1} \varphi_1(f^i(x)).$$

Since $n_{l+1} > n - k$, the lemma is proved.
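The inductive construction of the times $m_0 \le n_1 < m_1 \le \cdots$ in the proof of Lemma 3.5 is a greedy procedure, and it can be sketched in code. The encoding below is an assumption made for illustration: the orbit is represented by two hypothetical lists, `in_E[i]` telling whether $f^i(x) \in E_k$ and `jump[i]` giving, for such $i$, an integer $j \in \{1, \dots, k\}$ with $\varphi_j(f^i(x)) \le j(\varphi_-(f^i(x)) + \varepsilon)$. Both lists must have length at least $n$.

```python
def split_trajectory(in_E, jump, n):
    """Return the blocks [(n_1, m_1), ..., (n_l, m_l)] with m_l <= n.

    Follows the inductive definition: m_0 = 0, n_j is the first time at or
    after m_{j-1} at which the orbit is in E_k, and m_j = n_j + jump[n_j].
    """
    blocks = []
    m_prev = 0  # m_0 = 0
    while True:
        n_j = m_prev
        while n_j < n and not in_E[n_j]:
            n_j += 1
        if n_j >= n:  # no further visit to E_k before time n
            break
        if jump[n_j] < 1:
            raise ValueError("jump lengths must be at least 1")
        m_j = n_j + jump[n_j]
        if m_j > n:  # this block would overshoot n, so l is the current count
            break
        blocks.append((n_j, m_j))
        m_prev = m_j
    return blocks


def uncovered_times(blocks, n):
    """Times in [0, n) that lie outside every block [n_j, m_j)."""
    covered = set()
    for n_j, m_j in blocks:
        covered.update(range(n_j, m_j))
    return [i for i in range(n) if i not in covered]


if __name__ == "__main__":
    in_E = [False, True, True, False, False, True, False, True, False, False]
    jump = [0, 2, 1, 0, 0, 3, 0, 1, 0, 0]
    blocks = split_trajectory(in_E, jump, 10)
    print("blocks:", blocks)
    print("uncovered times:", uncovered_times(blocks, 10))
```

The uncovered times correspond to the index set $I$ in (3.5); those before $\min\{n_{l+1}, n\}$ are times at which the orbit is outside $E_k$, which is what allows $\varphi_1$ to be replaced by $\psi_k$ there.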

3.1.3 Estimating $\varphi_-$

In order to prove (3.2), in this section we prove the following lemma:

Lemma 3.6  $\int \varphi_-\, d\mu = L$.

Proof  Suppose, for the time being, that $\varphi_n/n$ is uniformly bounded from below; that is, there exists $\kappa > 0$ such that $\varphi_n/n \ge -\kappa$ for every $n$. In particular, $\varphi_- \ge -\kappa > -\infty$. Applying Fatou's lemma to the sequence of non-negative functions $\varphi_n/n + \kappa$, we obtain that $\varphi_-$ is integrable and

$$\int \varphi_-\, d\mu \le \lim_n \int \frac{\varphi_n}{n}\, d\mu = L.$$

To prove the opposite inequality, observe that Lemma 3.5 implies that

$$\frac{1}{n}\int \varphi_n\, d\mu \le \frac{n-k}{n}\int \psi_k\, d\mu + \frac{k}{n}\int \max\{\psi_k, \varphi_1\}\, d\mu. \tag{3.6}$$

Note that $\max\{\psi_k, \varphi_1\} \le \max\{\varphi_- + \varepsilon, \varphi_1^+\}$ and this last function is integrable. So, the $\limsup_n$ of the final term in (3.6) is non-positive. Hence, taking $n \to \infty$ we obtain that $L \le \int \psi_k\, d\mu$ for any $k$. Then, making $k \to \infty$ we conclude that

$$L \le \int \varphi_-\, d\mu + \varepsilon.$$

Finally, making $\varepsilon \to 0$ we obtain that $L \le \int \varphi_-\, d\mu$. This proves the lemma when $\varphi_n/n$ is uniformly bounded from below.

Now, let us remove that hypothesis. Define, for each $\kappa > 0$,

$$\varphi_n^\kappa = \max\{\varphi_n, -\kappa n\} \quad \text{and} \quad \varphi_-^\kappa = \max\{\varphi_-, -\kappa\}.$$

The sequence $(\varphi_n^\kappa)_n$ satisfies the hypotheses of Theorem 3.3: it is subadditive and the positive part of $\varphi_1^\kappa$ is integrable. Moreover, $\varphi_-^\kappa = \liminf_n (1/n)\varphi_n^\kappa$. Hence, the argument in the previous paragraph shows that

$$\int \varphi_-^\kappa\, d\mu = \inf_n \frac{1}{n}\int \varphi_n^\kappa\, d\mu. \tag{3.7}$$

By the monotone convergence theorem, we also have that

$$\int \varphi_n\, d\mu = \inf_\kappa \int \varphi_n^\kappa\, d\mu \quad \text{and} \quad \int \varphi_-\, d\mu = \inf_\kappa \int \varphi_-^\kappa\, d\mu. \tag{3.8}$$

Combining (3.7) and (3.8), we find that

$$\int \varphi_-\, d\mu = \inf_\kappa \int \varphi_-^\kappa\, d\mu = \inf_\kappa \inf_n \frac{1}{n}\int \varphi_n^\kappa\, d\mu = \inf_n \frac{1}{n}\int \varphi_n\, d\mu = L.$$

This completes the proof of this lemma.

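3.1.4 Bounding $\phi_+$ from above

We are going to show that $\int \phi_+ \, d\mu \le L$ if every $\phi_n$ is bounded away from $-\infty$ (that is, $\inf \phi_n > -\infty$). That will complete the proof of (3.2). First, we prove a couple of auxiliary results:

Lemma 3.7  If $\varphi : M \to \mathbb{R}$ is integrable with respect to $\mu$ then
\[ \lim_n \frac{1}{n} \varphi(f^n(x)) = 0 \quad\text{for $\mu$-almost all } x \in M. \]

Proof  Fix any $\varepsilon > 0$. Since $\mu$ is invariant under $f$,
\[ \mu\big(\{x \in M : |\varphi(f^n(x))| \ge n\varepsilon\}\big) = \mu\big(\{x \in M : |\varphi(x)| \ge n\varepsilon\}\big) = \sum_{k=n}^{\infty} \mu\Big(\Big\{x \in M : k \le \frac{|\varphi(x)|}{\varepsilon} < k+1\Big\}\Big). \]
Adding these equalities over every $n \in \mathbb{N}$, we obtain
\[ \sum_{n=1}^{\infty} \mu\big(\{x \in M : |\varphi(f^n(x))| \ge n\varepsilon\}\big) = \sum_{k=1}^{\infty} k \, \mu\Big(\Big\{x \in M : k \le \frac{|\varphi(x)|}{\varepsilon} < k+1\Big\}\Big) \le \int \frac{|\varphi|}{\varepsilon} \, d\mu. \]
Since $\varphi$ is assumed to be integrable, the right-hand side is finite. Hence, we may use the Borel–Cantelli lemma to conclude that the set $B(\varepsilon)$ of points $x$ such that $|\varphi(f^n(x))| \ge n\varepsilon$ for infinitely many values of $n$ has measure zero. By the definition of $B(\varepsilon)$, for every $x \notin B(\varepsilon)$ there exists $p \ge 1$ such that $|\varphi(f^n(x))| < n\varepsilon$ for every $n \ge p$. Now, consider $B = \bigcup_{i=1}^{\infty} B(1/i)$. Then $B$ has measure zero and $\lim_n (1/n)\varphi(f^n(x)) = 0$ for every $x \notin B$.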
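The statement of Lemma 3.7 can be probed numerically. The sketch below is only an illustration under assumptions that are not in the text: the map is the rotation of the circle by the golden mean, the function is $1/\sqrt{x}$ (integrable but unbounded near $0$), and the orbit starts at the single point $0$, whereas the lemma is an almost-everywhere statement.

```python
import math

def scaled_values(phi, omega, n_max):
    """Return [phi(f^n(0)) / n for n = 1..n_max] for the rotation f(x) = x + omega mod 1."""
    out = []
    for n in range(1, n_max + 1):
        x = (n * omega) % 1.0          # f^n(0); never exactly 0 because omega is irrational
        out.append(phi(x) / n)
    return out

def phi(x):
    # Integrable on (0, 1] with respect to Lebesgue measure, but unbounded near 0.
    return 1.0 / math.sqrt(x)

omega = (math.sqrt(5.0) - 1.0) / 2.0   # golden-mean rotation number (illustrative choice)
vals = scaled_values(phi, omega, 10_000)
tail_max = max(vals[5_000:])           # largest value of phi(f^n(0))/n over n > 5000
print(tail_max)
```

Even though the function is unbounded along the orbit, the printed tail maximum is small, in line with the conclusion that $\varphi(f^n(x))/n \to 0$.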

Lemma 3.8  For any fixed $k$,
\[ \limsup_n \frac{\phi_{kn}}{n} = k \limsup_n \frac{\phi_n}{n}. \]

Proof  The inequality $\le$ is clear, since $\phi_{kn}/kn$ is a subsequence of $\phi_n/n$. To get the other inequality, write $n = kq_n + r_n$ with $r_n \in \{1, \dots, k\}$. By subadditivity,
\[ \phi_n \le \phi_{kq_n} + \phi_{r_n} \circ f^{kq_n} \le \phi_{kq_n} + \psi \circ f^{kq_n}, \]
where $\psi = \max\{\phi_1^+, \dots, \phi_k^+\}$. Note that $n/q_n \to k$ as $n \to \infty$. Moreover, since $\psi \in L^1(\mu)$, we may use Lemma 3.7 to check that $\psi \circ f^n / n$ converges to zero at $\mu$-almost every point. Thus, dividing the previous relation by $n$ and taking the lim sup when $n \to \infty$, we obtain that
\[ \limsup_n \frac{1}{n}\phi_n \le \limsup_n \frac{1}{n}\phi_{kq_n} + \limsup_n \frac{1}{n}\psi \circ f^{kq_n} = \frac{1}{k} \limsup_q \frac{1}{q}\phi_{kq}, \]
as claimed in the lemma.

Lemma 3.9  Suppose that $\inf \phi_n > -\infty$ for any $n$. Then $\int \phi_+ \, d\mu \le L$.

Proof  For each fixed $k$ and $n \ge 1$, consider $\theta_n = -\sum_{j=0}^{n-1} \phi_k \circ f^{jk}$. Observe that
\[ \int \theta_n \, d\mu = -n \int \phi_k \, d\mu \quad\text{for every } n, \tag{3.9} \]
since $f^k$ preserves the measure $\mu$. As the sequence $(\phi_n)_n$ is subadditive, we have that $\theta_n \le -\phi_{kn}$ for any $n$. Then, using Lemma 3.8,
\[ \theta_- = \liminf_n \frac{\theta_n}{n} \le -\limsup_n \frac{\phi_{kn}}{n} = -k \limsup_n \frac{\phi_n}{n} = -k\phi_+ \]
and so
\[ \int \theta_- \, d\mu \le -k \int \phi_+ \, d\mu. \tag{3.10} \]
Observe also that the sequence $(\theta_n)_n$ is additive: $\theta_{m+n} = \theta_m + \theta_n \circ f^{km}$ for any $m, n \ge 1$. As $\theta_1 = -\phi_k$ is bounded from above by $-\inf \phi_k$, we also have that the function $\theta_1^+$ is bounded and, consequently, integrable. Thus, we can apply Lemma 3.6, together with (3.9), to conclude that
\[ \int \theta_- \, d\mu = \lim_n \int \frac{\theta_n}{n} \, d\mu = -\int \phi_k \, d\mu. \tag{3.11} \]
Putting the relations (3.10) and (3.11) together, we obtain that
\[ \int \phi_+ \, d\mu \le \frac{1}{k} \int \phi_k \, d\mu. \]
Finally, taking the infimum on $k$ yields $\int \phi_+ \, d\mu \le L$.

Lemmas 3.6 and 3.9 prove the relation (3.2) and, thus, Theorem 3.3 when $\inf \phi_n > -\infty$ for any $n$. In the general case, define
\[ \phi_n^\kappa = \max\{\phi_n, -\kappa n\}, \quad \phi_-^\kappa = \max\{\phi_-, -\kappa\} \quad\text{and}\quad \phi_+^\kappa = \max\{\phi_+, -\kappa\} \]
for any constant $\kappa > 0$. The previous arguments can be applied to the sequence $(\phi_n^\kappa)_n$ for any fixed $\kappa > 0$. Therefore, $\phi_+^\kappa = \phi_-^\kappa$ at $\mu$-almost every point and for any $\kappa > 0$. Since $\phi_-^\kappa \to \phi_-$ and $\phi_+^\kappa \to \phi_+$ when $\kappa \to \infty$, it follows that $\phi_- = \phi_+$ at $\mu$-almost every point. The proof of Theorem 3.3 is complete.

Corollary 3.10 (Birkhoff ergodic theorem)  Let $\phi : M \to \mathbb{R}$ be a $\mu$-integrable function. Then
\[ \tilde\phi(x) = \lim_n \frac{1}{n} \sum_{j=0}^{n-1} \phi(f^j(x)) \]
exists at $\mu$-almost every point. Moreover, the function $\tilde\phi$ is invariant and $\mu$-integrable, with $\int \tilde\phi \, d\mu = \int \phi \, d\mu$.


Proof

According to Example 3.1, this is a particular case of Theorem 3.3.
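As a quick numerical sanity check of the Birkhoff ergodic theorem, the following sketch compares a time average with the space average. The setting is an assumption chosen for illustration and does not come from the text: an irrational rotation of the circle (for which Lebesgue measure is invariant and ergodic) and the observable $\cos^2(2\pi x)$, whose integral is $1/2$.

```python
import math

def birkhoff_average(phi, x0, omega, n):
    """Time average (1/n) * sum_{j<n} phi(f^j(x0)) for the rotation f(x) = x + omega mod 1."""
    total = 0.0
    x = x0
    for _ in range(n):
        total += phi(x)
        x = (x + omega) % 1.0
    return total / n

def phi(x):
    # Observable with space average 1/2 with respect to Lebesgue measure on [0, 1).
    return math.cos(2.0 * math.pi * x) ** 2

omega = (math.sqrt(5.0) - 1.0) / 2.0   # irrational rotation number (illustrative choice)
avg = birkhoff_average(phi, 0.3, omega, 10_000)
print(avg)
```

For this particular observable the time average is close to $1/2$ already for moderate $n$; in general the theorem gives no rate of convergence.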

Now we can also deduce the following stronger version of Lemma 3.7 (the conclusion is the same while the assumption is weaker) that will be useful later.

Corollary 3.11  Let $\varphi : M \to \mathbb{R}$ be a measurable function such that the function $\psi = \varphi \circ f - \varphi$ is integrable with respect to $\mu$. Then
\[ \lim_n \frac{1}{n} \varphi(f^n(x)) = 0 \quad\text{for $\mu$-almost all } x \in M. \]
In particular, this holds if $\varphi \in L^1(\mu)$.

Proof  Note that $\varphi(f^n(x)) = \varphi(x) + \sum_{j=0}^{n-1} \psi(f^j(x))$ for every $x$ and every $n$. So, by the Birkhoff ergodic theorem applied to the integrable function $\psi$,
\[ \lim_n \frac{1}{n} \varphi(f^n(x)) = \lim_n \frac{1}{n} \varphi(x) + \lim_n \frac{1}{n} \sum_{j=0}^{n-1} \psi(f^j(x)) \tag{3.12} \]
exists at $\mu$-almost every point. On the other hand, since $\mu$ is $f$-invariant, for any $c > 0$,
\[ \mu\Big(\Big\{x : \Big|\frac{1}{n}\varphi(f^n(x))\Big| \ge c\Big\}\Big) = \mu\big(\{y : |\varphi(y)| \ge nc\}\big) \to 0 \quad\text{when } n \to \infty. \]
In other words, the sequence $(1/n)\,\varphi \circ f^n$ converges to zero in measure. Thus, the limit in (3.12) must be zero at $\mu$-almost every point.

3.2 Theorem of Furstenberg and Kesten

Let $F : M \times \mathbb{R}^d \to M \times \mathbb{R}^d$ be given by $F(x, v) = (f(x), A(x)v)$, for some measurable function $A : M \to \mathrm{GL}(d)$. Let $L^1(\mu)$ denote the space of $\mu$-integrable functions on $M$.

Theorem 3.12 (Furstenberg–Kesten)  If $\log^+ \|A^{\pm 1}\| \in L^1(\mu)$ then
\[ \lambda_+(x) = \lim_n \frac{1}{n} \log \|A^n(x)\| \quad\text{and}\quad \lambda_-(x) = \lim_n \frac{1}{n} \log \|(A^n(x))^{-1}\|^{-1} \]
exist for $\mu$-almost every $x \in M$. Moreover, the functions $\lambda_\pm$ are invariant and $\mu$-integrable, with
\[ \int \lambda_+ \, d\mu = \lim_n \frac{1}{n} \int \log \|A^n(x)\| \, d\mu \quad\text{and}\quad \int \lambda_- \, d\mu = \lim_n \frac{1}{n} \int \log \|(A^n(x))^{-1}\|^{-1} \, d\mu. \]

Proof  This is a direct consequence of Theorem 3.3. Define
\[ \phi_n(x) = \log \|A^n(x)\| \quad\text{and}\quad \psi_n(x) = \log \|(A^n(x))^{-1}\|. \]
The hypothesis implies that $\phi_1^+, \psi_1^+ \in L^1(\mu)$ and so $\phi_1(x), \psi_1(x) \in [-\infty, +\infty)$ for $\mu$-almost every $x$. Since the norm of linear operators is sub-multiplicative, the sequences $\phi_n$ and $\psi_n$ are subadditive (Example 3.2). Now the conclusion of Theorem 3.12 follows immediately from applying Theorem 3.3 to these sequences.
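The subadditivity used in this proof, $\phi_{m+n}(x) \le \phi_m(x) + \phi_n(f^m(x))$ for $\phi_n = \log\|A^n\|$, can be checked on a toy example. Everything concrete below is an assumption made for illustration: the base map is the cyclic shift on seven points (preserving the uniform measure), the matrices are random $2 \times 2$ matrices, and the norm is the operator norm induced by the maximum norm on $\mathbb{R}^2$.

```python
import math
import random

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def norm(a):
    # Operator norm induced by the max norm on R^2 (maximum absolute row sum); sub-multiplicative.
    return max(abs(a[i][0]) + abs(a[i][1]) for i in range(2))

def cocycle_product(mats, x, n):
    """A^n(x) = A(f^{n-1}(x)) ... A(f(x)) A(x) for the cyclic shift f(x) = x + 1 mod len(mats)."""
    prod = [[1.0, 0.0], [0.0, 1.0]]
    for j in range(n):
        prod = matmul(mats[(x + j) % len(mats)], prod)
    return prod

def phi(mats, x, n):
    return math.log(norm(cocycle_product(mats, x, n)))

random.seed(0)
mats = [[[random.uniform(-2, 2) for _ in range(2)] for _ in range(2)] for _ in range(7)]

# Largest value of phi_{m+n}(x) - phi_m(x) - phi_n(f^m(x)); subadditivity says it is <= 0.
worst = max(
    phi(mats, x, m + n) - phi(mats, x, m) - phi(mats, (x + m) % len(mats), n)
    for x in range(len(mats)) for m in range(1, 6) for n in range(1, 6)
)
print(worst)
```

Up to rounding, the printed defect is non-positive. Note that the entrywise maximum, which is used as a norm in Section 3.3, is not sub-multiplicative in general, which is why an operator norm is used in this sketch.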

3.3 Herman’s formula

Michael Herman devised a method for bounding Lyapunov exponents from below, based on the theory of (sub)harmonic functions. Here is one application.

Theorem 3.13 (Herman’s formula)  Let $S^1 = \mathbb{R}/\mathbb{Z}$ and $f : S^1 \to S^1$ be an irrational rotation. Let $A : S^1 \to \mathrm{SL}(2)$ be of the form $A(x) = A_0 R_{2\pi x}$ where
\[ A_0 = \begin{pmatrix} \sigma & 0 \\ 0 & \sigma^{-1} \end{pmatrix} \text{ for some } \sigma > 0 \quad\text{and}\quad R_\theta = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}. \]
Then
\[ \lambda_+ \ge \log \frac{\sigma + \sigma^{-1}}{2} \tag{3.13} \]
where $\lambda_+$ is the largest Lyapunov exponent of the cocycle defined by $A$, relative to the unique $f$-invariant probability measure $m$.

Proof  Let $\omega \in 2\pi(\mathbb{R} \setminus \mathbb{Q})$ be the angle of the rotation $f : S^1 \to S^1$. By Theorem 3.12,
\[ \begin{aligned} \lambda_+ &= \lim_n \frac{1}{n} \int_{S^1} \log \|A^n(y)\| \, dm(y) \\ &= \lim_n \frac{1}{2\pi n} \int_0^{2\pi} \log \|A_0 R_{x+(n-1)\omega} \cdots A_0 R_{x+\omega} A_0 R_x\| \, dx. \end{aligned} \tag{3.14} \]
This does not depend on the choice of the norm; we take $\|\cdot\|$ to be given by the maximum absolute value of the coefficients. The idea of the proof is to extend the function in the last integral to a subharmonic function on the complex plane and then deduce the claim of the theorem from the average property of subharmonic functions (the value at any point is not bigger than the average of the function over any circle centered at that point). For this, consider the complex matrices
\[ R(z) = \begin{pmatrix} (z^2+1)/2 & (-z^2+1)/(2i) \\ (z^2-1)/(2i) & (z^2+1)/2 \end{pmatrix} \]
defined for $z \in \mathbb{C}$. Observe that $R(e^{i\theta}) = e^{i\theta} R_\theta$ for every $\theta$. Thus, $z \mapsto R(z)$ is a kind of holomorphic extension of the family $\theta \mapsto R_\theta$ of rotations. Now write
\[ C_n(z) = A_0 R(e^{(n-1)\omega i} z) \cdots A_0 R(e^{\omega i} z) A_0 R(z). \]
Then $C_n(e^{ix}) = e^{i\tau} A^n(x)$ with $\tau = nx + n(n-1)\omega/2$, and so (3.14) becomes
\[ \lambda_+ = \lim_n \frac{1}{2\pi n} \int_0^{2\pi} \log \|C_n(e^{ix})\| \, dx. \]
The function $z \mapsto \log \|C_n(z)\|$ is subharmonic because the logarithm of the absolute value of a holomorphic function is subharmonic and the maximum of subharmonic functions is subharmonic (see [45, § 19.4]). It follows that
\[ \lambda_+ \ge \lim_n \frac{1}{n} \log \|C_n(0)\|. \]
Moreover,
\[ \lim_n \frac{1}{n} \log \|C_n(0)\| = \lim_n \frac{1}{n} \log \|(A_0 R(0))^n\| \]
equals the logarithm of the spectral radius of $A_0 R(0)$. An explicit calculation gives that the spectral radius is $(\sigma + \sigma^{-1})/2$. This proves the claim.

For $\sigma \ne 1$ the theorem gives that $\lambda_+$ is positive. Observe that $\deg(f) = 1$ and $\deg(\theta) = 1$, where $\theta(x) = x$ for $x \in S^1$. So, by the criterion in Example 2.9, the cocycle cannot be hyperbolic.
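The "explicit calculation" of the spectral radius of $A_0 R(0)$, and the bound (3.13) itself, can be checked numerically. The sketch below is illustrative only: the helper names, the value $\sigma = 2$, the golden-mean rotation number, the starting point and the number of iterates are all assumptions, and the Lyapunov exponent is estimated by following a single vector along a single orbit, which is a heuristic rather than a proof.

```python
import cmath
import math

def spectral_radius_2x2(m):
    """Largest modulus of an eigenvalue of a 2x2 complex matrix, via trace and determinant."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    disc = cmath.sqrt(tr * tr - 4.0 * det)
    return max(abs((tr + disc) / 2.0), abs((tr - disc) / 2.0))

def a0_r0(sigma):
    # A0 * R(0), with R(0) = [[1/2, 1/(2i)], [-1/(2i), 1/2]] read off from the formula for R(z).
    r0 = [[0.5, 1.0 / 2j], [-1.0 / 2j, 0.5]]
    return [[sigma * r0[0][0], sigma * r0[0][1]],
            [r0[1][0] / sigma, r0[1][1] / sigma]]

def lyapunov_estimate(sigma, omega, x0, n):
    """Finite-time estimate (1/n) log |A^n(x0) v| for A(x) = A0 R_{2 pi x}, f(x) = x + omega mod 1."""
    v = (1.0, 0.0)
    x = x0
    log_growth = 0.0
    for _ in range(n):
        t = 2.0 * math.pi * x
        # Apply the rotation R_t, then the diagonal matrix A0 = diag(sigma, 1/sigma).
        w0 = sigma * (math.cos(t) * v[0] - math.sin(t) * v[1])
        w1 = (math.sin(t) * v[0] + math.cos(t) * v[1]) / sigma
        size = math.hypot(w0, w1)
        log_growth += math.log(size)
        v = (w0 / size, w1 / size)      # renormalize to avoid overflow
        x = (x + omega) % 1.0
    return log_growth / n

sigma = 2.0
radius = spectral_radius_2x2(a0_r0(sigma))
bound = math.log((sigma + 1.0 / sigma) / 2.0)
estimate = lyapunov_estimate(sigma, (math.sqrt(5.0) - 1.0) / 2.0, 0.1, 20_000)
print(radius, bound, estimate)
```

The determinant of $A_0 R(0)$ vanishes, so its only non-zero eigenvalue is its trace $(\sigma + \sigma^{-1})/2$; the finite-time estimate can then be compared by eye with the lower bound in (3.13).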

3.4 Theorem of Oseledets in dimension 2

We are going to state and prove a version of the multiplicative ergodic theorem for 2-dimensional cocycles, both invertible and non-invertible. The full statement of the theorem will be the topic of Chapter 4. The reason for including this discussion of a special case is that it can be handled by much simpler methods and, thus, provides a useful glimpse into the main features of the theorem, in a very transparent situation.

Let $F : M \times \mathbb{R}^2 \to M \times \mathbb{R}^2$ be given by $F(x, v) = (f(x), A(x)v)$, for some measurable function $A : M \to \mathrm{GL}(2)$ satisfying $\log^+ \|A^{\pm 1}\| \in L^1(\mu)$.

3.4.1 One-sided theorem

Theorem 3.14  For $\mu$-almost every $x \in M$,

(1) either $\lambda_-(x) = \lambda_+(x)$ and
\[ \lim_n \frac{1}{n} \log \|A^n(x)v\| = \lambda_\pm(x) \quad\text{for all } v \in \mathbb{R}^2 \setminus \{0\}; \]
(2) or $\lambda_+(x) > \lambda_-(x)$ and there exists a vector line $E^s_x \subset \mathbb{R}^2$ such that
\[ \lim_n \frac{1}{n} \log \|A^n(x)v\| = \begin{cases} \lambda_-(x) & \text{if } v \in E^s_x \setminus \{0\} \\ \lambda_+(x) & \text{if } v \in \mathbb{R}^2 \setminus E^s_x. \end{cases} \]
Moreover $A(x)E^s_x = E^s_{f(x)}$ for every $x$ as in (2).

Proof  We treat the case when $A$ takes values in $\mathrm{SL}(2)$ and leave it to the reader (Exercise 3.3) to extend the conclusions to the general $\mathrm{GL}(2)$ setting. Consider any $x$ as in the conclusion of Theorem 3.12 and let $\lambda(x) = \lambda_+(x) = -\lambda_-(x)$. First, let us consider $x \in M$ such that $\lambda(x) = 0$. For any $v \in \mathbb{R}^2$,
\[ \|A^n(x)\|^{-1}\|v\| = \|A^n(x)^{-1}\|^{-1}\|v\| \le \|A^n(x)v\| \le \|A^n(x)\|\,\|v\|, \]
and so, for $v \ne 0$,
\[ \frac{1}{n} \log\big(\|A^n(x)\|^{-1}\|v\|\big) \le \frac{1}{n} \log \|A^n(x)v\| \le \frac{1}{n} \log\big(\|A^n(x)\|\,\|v\|\big). \]
Sending $n \to \infty$ the left-hand side goes to $-\lambda(x) = 0$ and the right-hand side goes to $\lambda(x) = 0$. So, we are done with the case when the exponent vanishes.

Now let us suppose that $\lambda(x) > 0$. Then $\|A^n(x)\| \approx e^{n\lambda(x)}$ is larger than 1 for every large $n$. So (Exercise 2.3), there exist unit vectors $s_n(x)$ and $u_n(x)$, respectively, most contracted and most expanded under $A^n(x)$:
\[ \|A^n(x)s_n(x)\| = \|A^n(x)\|^{-1} \quad\text{and}\quad \|A^n(x)u_n(x)\| = \|A^n(x)\|. \tag{3.15} \]

Lemma 3.15  The angle $\angle(s_n(x), s_{n+1}(x))$ decreases exponentially:
\[ \limsup_n \frac{1}{n} \log |\sin \angle(s_n(x), s_{n+1}(x))| \le -2\lambda(x). \]

Proof  Let us denote $\alpha_n = \angle(s_n(x), s_{n+1}(x))$. Then
\[ s_n(x) = \sin\alpha_n \, u_{n+1}(x) + \cos\alpha_n \, s_{n+1}(x). \]
It follows that
\[ \|A^{n+1}(x)s_n(x)\| \ge \|\sin\alpha_n \, A^{n+1}(x)u_{n+1}(x)\| = |\sin\alpha_n| \, \|A^{n+1}(x)\|, \]
and
\[ \|A^{n+1}(x)s_n(x)\| \le \|A(f^n(x))\| \, \|A^n(x)s_n(x)\| = \|A(f^n(x))\| \, \|A^n(x)\|^{-1}. \]
Hence
\[ |\sin\alpha_n| \le \frac{\|A(f^n(x))\|}{\|A^{n+1}(x)\| \, \|A^n(x)\|}. \]
Corollary 3.11 ensures that
\[ \lim_n \frac{1}{n} \log \|A(f^n(x))\| = 0. \]
So, taking limit superior on both sides of the previous expression, we find that
\[ \limsup_n \frac{1}{n} \log |\sin\alpha_n| \le -2\lambda(x). \]
This completes the proof of the lemma.

Lemma 3.16  The sequence $(s_n(x))_n$ is Cauchy in projective space.

Proof  Consider any $\varepsilon > 0$ such that $-2\lambda(x) + \varepsilon < 0$. By Lemma 3.15,
\[ |\sin\alpha_n| \le e^{n(-2\lambda(x)+\varepsilon)} \]
for every large $n$. Then, up to replacing some $s_j(x)$ by $-s_j(x)$,
\[ \|s_n(x) - s_{n+1}(x)\| \le 2e^{n(-2\lambda(x)+\varepsilon)} \]
for every large $n$. Consequently, there exists $C > 0$ such that
\[ \|s_{n+k}(x) - s_n(x)\| \le Ce^{n(-2\lambda(x)+\varepsilon)} \]
for every $k \ge 1$ and $n$ large enough. In particular, the sequence is Cauchy.

Define $s(x) = \lim s_n(x)$ whenever the limit exists.

Lemma 3.17  The vector $s(x)$ is contracted at the rate $-\lambda(x)$:
\[ \lim_n \frac{1}{n} \log \|A^n(x)s(x)\| = -\lambda(x). \]

Proof  Let $\beta_n = \angle(s(x), s_n(x))$. Then $s(x) = \cos\beta_n \, s_n(x) + \sin\beta_n \, u_n(x)$ and so
\[ \|A^n(x)s(x)\| \le |\cos\beta_n| \, \|A^n(x)s_n(x)\| + |\sin\beta_n| \, \|A^n(x)u_n(x)\|. \]
Then (Exercise 3.6),
\[ \begin{aligned} \limsup_n \frac{1}{n} \log \|A^n(x)s(x)\| &\le \max\Big\{ \limsup_n \frac{1}{n} \log\big(|\cos\beta_n| \, \|A^n(x)s_n(x)\|\big), \ \limsup_n \frac{1}{n} \log\big(|\sin\beta_n| \, \|A^n(x)u_n(x)\|\big) \Big\} \\ &\le \max\Big\{ \limsup_n \frac{1}{n} \log \|A^n(x)\|^{-1}, \ \limsup_n \frac{1}{n} \log |\sin\beta_n| + \limsup_n \frac{1}{n} \log \|A^n(x)u_n(x)\| \Big\} \\ &\le \max\{-\lambda(x), -2\lambda(x) + \lambda(x)\} = -\lambda(x). \end{aligned} \]
The reverse inequality for the limit inferior follows from $\|A^n(x)s(x)\| \ge \|A^n(x)\|^{-1}\|s(x)\|$. This completes the proof.

Lemma 3.18  If $v \in \mathbb{R}^2$ is not collinear with $s(x)$ then
\[ \lim_n \frac{1}{n} \log \|A^n(x)v\| = \lambda(x). \]

Proof  Denote $\gamma_n = \angle(v, s_n(x))$. Then $v = \cos\gamma_n \, s_n(x) + \sin\gamma_n \, u_n(x)$ and so
\[ \|A^n(x)v\| \ge |\sin\gamma_n| \, \|A^n(x)u_n(x)\| - |\cos\gamma_n| \, \|A^n(x)s_n(x)\|. \]
Note that $|\sin\gamma_n|$ is bounded from zero for all large $n$, because $s_n(x) \to s(x)$ and $v$ is not collinear to $s(x)$. Using also (3.15),
\[ \|A^n(x)u_n(x)\| \approx e^{n\lambda(x)} \quad\text{and}\quad \|A^n(x)s_n(x)\| \approx e^{-n\lambda(x)}. \]
Substituting this in the previous inequality, and taking the limit as $n \to \infty$, we get
\[ \liminf_n \frac{1}{n} \log \|A^n(x)v\| \ge \lambda(x). \]
From $\|A^n(x)v\| \le \|A^n(x)\| \, \|v\|$ we immediately get the opposite inequality:
\[ \limsup_n \frac{1}{n} \log \|A^n(x)v\| \le \lim_n \frac{1}{n} \log \|A^n(x)\| = \lambda(x). \]
The proof of the lemma is complete.

Lemma 3.19  $A(x)s(x)$ is collinear to $s(f(x))$.

Proof  By Lemma 3.17,
\[ \lim_n \frac{1}{n} \log \|A^n(f(x))A(x)s(x)\| = \lim_n \frac{1}{n} \log \|A^{n+1}(x)s(x)\| = -\lambda(x). \]
By Lemma 3.18,
\[ \lim_n \frac{1}{n} \log \|A^n(f(x))v\| = \lambda(f(x)) = \lambda(x) \]
for every $v$ not collinear to $s(f(x))$. This implies the claim of the lemma.

Take $E^s_x$ to be the line $\mathbb{R}s(x)$ generated by $s(x)$. Lemmas 3.15 through 3.19 contain all the claims in Theorem 3.14.
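The construction of $E^s_x$ as the limit of the most contracted directions $s_n(x)$ can be followed numerically in the simplest possible case. The sketch below assumes a constant cocycle given by a single (non-symmetric) matrix in $\mathrm{SL}(2)$ with eigenvalues $2$ and $1/2$, so that $\lambda = \log 2$; the matrix and the helper names are illustrative choices, not taken from the text.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def most_contracted_direction(m):
    """Unit vector s minimizing |m s|: eigenvector of m^T m for its smallest eigenvalue."""
    a = m[0][0] ** 2 + m[1][0] ** 2              # entries of the symmetric matrix m^T m
    b = m[0][0] * m[0][1] + m[1][0] * m[1][1]
    d = m[0][1] ** 2 + m[1][1] ** 2
    theta = 0.5 * math.atan2(2.0 * b, a - d)     # direction of the most expanded vector u
    return (-math.sin(theta), math.cos(theta))   # s is orthogonal to u

def sin_angle(v, w):
    return abs(v[0] * w[1] - v[1] * w[0])        # |sin| of the angle between unit vectors

A = [[2.0, 1.0], [0.0, 0.5]]                     # constant SL(2) cocycle, lambda = log 2
stable = (2.0 / math.sqrt(13.0), -3.0 / math.sqrt(13.0))   # eigenvector of A for eigenvalue 1/2

product = [[1.0, 0.0], [0.0, 1.0]]
directions = []
for n in range(1, 13):
    product = matmul(A, product)                 # product is now A^n
    directions.append(most_contracted_direction(product))

# gaps[i] is |sin| of the angle between s_{i+1} and s_{i+2}
gaps = [sin_angle(directions[i], directions[i + 1]) for i in range(len(directions) - 1)]
print(gaps[-1], sin_angle(directions[-1], stable))
```

In this example the gaps shrink roughly by a factor $4 = e^{2\lambda}$ at each step, in line with Lemma 3.15, and $s_n$ approaches the contracting eigendirection of the matrix, which plays the role of $E^s_x$.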

3.4.2 Two-sided theorem

Theorem 3.20  If $f : M \to M$ is invertible then for $\mu$-almost every $x \in M$:

(1) either $\lambda_-(x) = \lambda_+(x)$ and
\[ \lim_{n \to \pm\infty} \frac{1}{n} \log \|A^n(x)v\| = \lambda_\pm(x) \quad\text{for all } v \in \mathbb{R}^2 \setminus \{0\}; \]
(2) or $\lambda_-(x) < \lambda_+(x)$ and there is a direct sum decomposition $\mathbb{R}^2 = E^u_x \oplus E^s_x$ such that
\[ \lim_{n \to +\infty} \frac{1}{n} \log \|A^n(x)v\| = \begin{cases} \lambda_-(x) & \text{if } v \in E^s_x \setminus \{0\} \\ \lambda_+(x) & \text{if } v \in \mathbb{R}^2 \setminus E^s_x \end{cases} \]
\[ \lim_{n \to -\infty} \frac{1}{n} \log \|A^n(x)v\| = \begin{cases} \lambda_+(x) & \text{if } v \in E^u_x \setminus \{0\} \\ \lambda_-(x) & \text{if } v \in \mathbb{R}^2 \setminus E^u_x. \end{cases} \]
Moreover, in the latter case, $A(x)E^u_x = E^u_{f(x)}$ and $A(x)E^s_x = E^s_{f(x)}$ and the angle between the two lines decreases subexponentially along orbits:
\[ \lim_{n \to \pm\infty} \frac{1}{n} \log |\sin \angle(E^u_{f^n(x)}, E^s_{f^n(x)})| = 0. \]

Proof  We deal with the case when $A(x) \in \mathrm{SL}(2)$ for all $x$ and let the reader extend the conclusions to the general setting. For $x$ as in the conclusion of Theorem 3.12, write $\lambda(x) = \lambda_+(x) = -\lambda_-(x)$. The case $\lambda(x) = 0$ follows directly from Theorem 3.14 applied to $F$ and to its inverse $F^{-1}$. From now on, assume that $\lambda(x) > 0$. Let $E^s_x = \mathbb{R}s(x)$ and $E^u_x = \mathbb{R}u(x)$ be the subspaces given by Theorem 3.14 for $F$ and $F^{-1}$, respectively. We need to check that these two lines are transverse:

Lemma 3.21  The vectors $s(x)$ and $u(x)$ are non-collinear, for $\mu$-almost every point in $\{x : \lambda(x) > 0\}$.

Proof  We have $\lim_{n \to -\infty} n^{-1} \log \|A^n(x) \mid E^u_x\| = \lambda(x)$, by Theorem 3.14 applied to $F^{-1}$. Thus, the lemma will follow if we prove that
\[ \lim_{n \to -\infty} \frac{1}{n} \log \|A^n(x) \mid E^s_x\| = -\lambda(x). \]
From Theorem 3.14 applied to $F^{-1}$, we know that the limit on the left-hand side exists. Let us denote it by $\psi(x)$. Consider the sequence of functions
\[ \psi_n(x) = \frac{1}{-n} \log \|A^{-n}(x) \mid E^s_x\| \quad\text{and}\quad \varphi_n(y) = \frac{1}{-n} \log \|(A^n(y) \mid E^s_y)^{-1}\|. \]
From the definition of $A^{-n}$ we get that $\psi_n(x) = \varphi_n(f^{-n}(x))$ for every $n \ge 1$. Since $E^s$ is one-dimensional, we may write
\[ \varphi_n(y) = \frac{1}{n} \log \|A^n(y) \mid E^s_y\|. \]
Then $\lim_{n \to \infty} \varphi_n(y) = -\lambda(y)$, by Lemma 3.17. In particular, the sequence $\varphi_n$ converges to $-\lambda$ in measure; that is,
\[ \lim_n \mu\big(\{y : |\varphi_n(y) + \lambda(y)| > \delta\}\big) = 0 \quad\text{for any } \delta > 0. \]
Then, since $\mu$ is $f$-invariant,
\[ \lim_n \mu\big(\{x : |\varphi_n(f^{-n}(x)) + \lambda(f^{-n}(x))| > \delta\}\big) = 0 \quad\text{for any } \delta > 0. \]
In view of the previous observations, and the fact that the function $\lambda$ is invariant, this implies that
\[ \lim_n \mu\big(\{x : |\psi_n(x) + \lambda(x)| > \delta\}\big) = 0 \quad\text{for any } \delta > 0. \]
This means that $\psi_n$ converges to $-\lambda$ in measure. On the other hand, $\psi_n$ converges to $\psi$ almost everywhere and, consequently, in measure. By uniqueness of the limit, it follows that $\psi = -\lambda = \lambda_-$, as claimed.

Lemma 3.22  Let $\theta(y) = \angle(E^s_y, E^u_y)$. For $\mu$-almost every $x$ with $\lambda(x) > 0$,
\[ \lim_{n \to \pm\infty} \frac{1}{n} \log |\sin\theta(f^n(x))| = 0. \]

Proof  From the elementary relation (Exercise 3.4)
\[ \|A(x)\|^{-2} \le \frac{|\sin\theta(f(x))|}{|\sin\theta(x)|} \le \|A(x)\|^2 \]
we find that $\big| \log|\sin\theta(f(x))| - \log|\sin\theta(x)| \big| \le 2\log\|A(x)\|$ and, in particular, $\log|\sin\theta| \circ f - \log|\sin\theta| \in L^1(\mu)$. So we may conclude from Corollary 3.11 that
\[ \lim_{n \to \pm\infty} \frac{1}{n} \log |\sin\theta(f^n(x))| = 0 \]
as stated.

This finishes the proof of Theorem 3.20.

3.5 Notes

Theorem 3.3 is an application of the main result of Kingman [74]. The proof we presented here is due to Avila and Bochi [10], inspired by a proof of the Birkhoff ergodic theorem by Katznelson and Weiss [69]. See Ledrappier [80, § I.2] for another proof inspired by [69]. Theorem 3.12, which we obtained as a consequence of the subadditive ergodic theorem, was actually proven before, by Furstenberg and Kesten [56]. The subadditive ergodic theorem will reappear in Chapter 4, as part of the proof of the Oseledets theorem. Theorem 3.13 is perhaps the simplest application of the method devised by Herman [62] for estimating Lyapunov exponents from below. Avila and Bochi [11] proved that (3.13) is, actually, an equality. The arguments in Section 3.4 are inspired by the proof of Proposition 2.1. Other proofs of the Oseledets theorem in two dimensions have been given by Young [119] and Bochi [29].

3.6 Exercises

Exercise 3.1  Check that, given any measurable function $A : M \to \mathrm{GL}(d)$,
\[ \log^+ \|A^{\pm 1}\| \in L^1(\mu) \iff |\log \|A^{\pm 1}\|| \in L^1(\mu) \iff \log^- \|A^{\pm 1}\| \in L^1(\mu). \]
Moreover, if $A$ takes values in $\mathrm{SL}(d)$, all these conditions are equivalent to $\log \|A\| \in L^1(\mu)$.

Exercise 3.2  Check that the functions $\phi_-$ and $\phi_+$ are invariant.

Exercise 3.3  Given $A : M \to \mathrm{GL}(d)$, define $c(x) = |\det A(x)|^{1/d}$ and then let $B : M \to \mathrm{SL}(d)$ be given by $A(x) = c(x)B(x)$. Show that $\log c$ and $\log^+ \|B^{\pm 1}\|$ are in $L^1(\mu)$ if $\log^+ \|A^{\pm 1}\|$ is in $L^1(\mu)$. Check that, for $\mu$-almost every $x \in M$,
\[ \lim_n \frac{1}{n} \log \|A^n(x)v\| = t(x) + \lim_n \frac{1}{n} \log \|B^n(x)v\| \quad\text{for every } v \ne 0, \]
where $t(x) = \lim_n n^{-1} \sum_{j=0}^{n-1} \log c(f^j(x))$ is the Birkhoff time average of the function $\log c$. Deduce that the associated cocycles $F(x, v) = (f(x), A(x)v)$ and $G(x, v) = (f(x), B(x)v)$ have the same Oseledets flag/decomposition at almost every point. Moreover, the Lyapunov spectrum of the former cocycle is the $t(x)$-translate of the Lyapunov spectrum of the latter.

Exercise 3.4  Show that, for any $B \in \mathrm{SL}(2)$ and non-zero vectors $u, v \in \mathbb{R}^2$,
\[ \|B\|^{-2} \le \frac{|\sin \angle(B(u), B(v))|}{|\sin \angle(u, v)|} \le \|B\|^2. \]

Exercise 3.5  Prove that $\lim_{n \to \infty} n^{-1} \log |\sin \angle(A^n(x)v, E^u_{f^n(x)})| = -2\lambda(x)$ for any $v \notin E^s_x$. If the cocycle $F$ is invertible then the same holds as $n \to -\infty$, with $E^u_x$ in the place of $E^s_x$.

Exercise 3.6  Show that if $a_n, b_n > 0$ for every $n$ then

(1) $\limsup_n \frac{1}{n} \log(a_n + b_n) = \max\big\{ \limsup_n \frac{1}{n} \log a_n, \ \limsup_n \frac{1}{n} \log b_n \big\}$.
(2) $\limsup_n \frac{1}{n} \log \sqrt{a_n^2 + b_n^2} = \max\big\{ \limsup_n \frac{1}{n} \log a_n, \ \limsup_n \frac{1}{n} \log b_n \big\}$.
(3) $\liminf_n \frac{1}{n} \log(a_n + b_n) \ge \max\big\{ \liminf_n \frac{1}{n} \log a_n, \ \liminf_n \frac{1}{n} \log b_n \big\}$.
(4) $\liminf_n \frac{1}{n} \log \sqrt{a_n^2 + b_n^2} \ge \max\big\{ \liminf_n \frac{1}{n} \log a_n, \ \liminf_n \frac{1}{n} \log b_n \big\}$.
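Part (1) of Exercise 3.6 is the mechanism behind several estimates in Section 3.4, so a small numerical illustration may help; it is not a solution of the exercise. The sequences $a_n = e^{0.3n}$ and $b_n = e^{-0.7n}$ and the helper name are assumptions made for this sketch, and the sum is evaluated in logarithmic form to avoid overflow.

```python
import math

def rate(log_a, log_b, n):
    """(1/n) log(a_n + b_n), computed stably from log a_n and log b_n (log-sum-exp)."""
    hi, lo = max(log_a, log_b), min(log_a, log_b)
    return (hi + math.log1p(math.exp(lo - hi))) / n

# a_n = exp(0.3 n), b_n = exp(-0.7 n): exponential rates 0.3 and -0.7 (illustrative choice)
rates = [rate(0.3 * n, -0.7 * n, n) for n in (10, 100, 1000)]
print(rates)
```

The printed values approach the larger of the two rates: the faster growing sequence dominates the sum on the exponential scale.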


4 Multiplicative ergodic theorem

This chapter is devoted to the fundamental result in the theory of Lyapunov exponents: the multiplicative ergodic theorem of Oseledets [92]. The statements are given in Section 4.1 and proofs appear in Section 4.2 (one-sided version, for general cocycles) and Section 4.3 (two-sided version, for invertible cocycles). Some related issues are discussed in Section 4.4. Throughout, we take $(M, \mathcal{B}, \mu)$ to be a complete separable probability space. Recall that complete means that any subset of a measurable set with zero measure is measurable (and has zero measure) and separable means that there exists a countable family $\mathcal{E} \subset \mathcal{B}$ such that for any $\varepsilon > 0$ and any $B \in \mathcal{B}$ there exists $E \in \mathcal{E}$ such that $\mu(B \Delta E) < \varepsilon$. Let $F : M \times \mathbb{R}^d \to M \times \mathbb{R}^d$ be the linear cocycle defined by a measurable function $A : M \to \mathrm{GL}(d)$ over a measurable transformation $f : M \to M$ that preserves the probability measure $\mu$. It is assumed that the functions $\log^+ \|A^{\pm 1}\|$ are integrable with respect to $\mu$.

4.1 Statements

Recall that the Grassmannian of $\mathbb{R}^d$ is the disjoint union $\mathrm{Gr}(d)$ of the Grassmannian manifolds $\mathrm{Gr}(l, d)$, $0 \le l \le d$. A map $x \mapsto V_x$ with values in $\mathrm{Gr}(d)$ is measurable if and only if there exist measurable, linearly independent vector fields that span $V_x$ at each point (Exercise 4.1). A flag in $\mathbb{R}^d$ is a decreasing family $W^1 \supsetneq \cdots \supsetneq W^k \supsetneq \{0\}$ of vector subspaces of the $d$-dimensional Euclidean space. The flag is called complete if $k = d$ and $\dim W^j = d + 1 - j$ for all $j = 1, \dots, d$.

Theorem 4.1 (Oseledets)  For $\mu$-almost every $x \in M$ there is $k = k(x)$, numbers $\lambda_1(x) > \cdots > \lambda_k(x)$ and a flag $\mathbb{R}^d = V^1_x \supsetneq \cdots \supsetneq V^k_x \supsetneq \{0\}$, such that, for all $i = 1, \dots, k$:

(a) $k(f(x)) = k(x)$ and $\lambda_i(f(x)) = \lambda_i(x)$ and $A(x) \cdot V^i_x = V^i_{f(x)}$;
(b) the maps $x \mapsto k(x)$ and $x \mapsto \lambda_i(x)$ and $x \mapsto V^i_x$ (with values in $\mathbb{N}$ and $\mathbb{R}$ and $\mathrm{Gr}(d)$, respectively) are measurable;
(c) $\lim_n \frac{1}{n} \log \|A^n(x)v\| = \lambda_i(x)$ for all $v \in V^i_x \setminus V^{i+1}_x$ (with $V^{k+1}_x = \{0\}$).

When $\mu$ is ergodic, it follows that the values of $k(x)$ and of each of the Lyapunov exponents $\lambda_i(x)$ are constant on a full measure subset, and so are the dimensions of the Oseledets subspaces $V^i_x$. We call $\dim V^i_x - \dim V^{i+1}_x$ the multiplicity of the corresponding Lyapunov exponent $\lambda_i(x)$. The Lyapunov spectrum of $F$ is the set of all Lyapunov exponents, each counted with multiplicity. The Lyapunov spectrum is simple if all Lyapunov exponents have multiplicity 1 or, equivalently, if the Oseledets flag is complete.

When the transformation $f : M \to M$ is invertible, we have a stronger conclusion:

Theorem 4.2 (Oseledets)  Suppose that $f : M \to M$ is invertible. Then, for $\mu$-almost every $x \in M$, there exists a direct sum decomposition $\mathbb{R}^d = E^1_x \oplus \cdots \oplus E^k_x$ such that, for every $i = 1, \dots, k$:

(a) $A(x) \cdot E^i_x = E^i_{f(x)}$ and $V^i_x = \bigoplus_{j=i}^{k} E^j_x$ and
(b) $\lim_{n \to \pm\infty} \frac{1}{n} \log \|A^n(x)v\| = \lambda_i(x)$ for all $v \in E^i_x \setminus \{0\}$ and
(c) $\lim_{n \to \pm\infty} \frac{1}{n} \log \Big| \sin \angle\Big( \bigoplus_{i \in I} E^i_{f^n(x)}, \bigoplus_{j \in J} E^j_{f^n(x)} \Big) \Big| = 0$ whenever $I \cap J = \emptyset$.

By definition, the angle $\angle(V, W)$ between two subspaces $V$ and $W$ of $\mathbb{R}^d$ is the smallest angle between non-zero vectors $v \in V$ and $w \in W$. Clearly, the multiplicity of each Lyapunov exponent $\lambda_i$ coincides with the dimension $\dim E^i_x = \dim V^i_x - \dim V^{i+1}_x$ of the associated Oseledets subspace $E^i_x$. Thus, the Lyapunov spectrum is simple if and only if $\dim E^i_x = 1$ for every $i$.

Remark 4.3  The sums $\bigoplus_{j=i}^{k} E^j_x$ of Oseledets subspaces corresponding to the smallest Lyapunov exponents depend only on the forward iterates of the linear cocycle, since they coincide with the subspaces $V^i_x$. Analogously, any sum of Oseledets subspaces corresponding to the largest Lyapunov exponents depends only on the backward iterates.

Downloaded from University Publishing Online. This is copyrighted material http://dx.doi.org/10.1017/CBO9781139976602.005


4.2 Proof of the one-sided theorem

As we are going to see, it is rather easy to get a weaker form of Theorem 4.1, where limit is replaced by limit superior in part (c). In Section 4.2.1 we find the Lyapunov exponents $\lambda_i(x)$ and the Oseledets subspaces $V^i_x$ and we check that they are invariant. Then, in Section 4.2.2, we show that these objects are measurable. The main difficulty is to prove that the limit in part (c) actually exists. Roughly speaking, the proof is by induction on the number $k$ of Oseledets subspaces. Sections 4.2.3 and 4.2.4 prepare the proof of the case $k = 1$, whereas the inductive step is prepared in Section 4.2.5. In Section 4.2.6 we wrap up the proof.

4.2.1 Constructing the Oseledets flag

For each $v \in \mathbb{R}^d \setminus \{0\}$ and $x \in M$, define
\[ \lambda(x, v) = \limsup_n \frac{1}{n} \log \|A^n(x)v\|. \tag{4.1} \]
In the next lemma we collect a few basic properties of this function. Recall that, by Theorem 3.12, the extremal Lyapunov exponents $\lambda_\pm(x)$ are well defined and finite for $\mu$-almost every $x$.

Lemma 4.4  For $\mu$-almost every $x \in M$ and any $v, v' \in \mathbb{R}^d \setminus \{0\}$,

(i) $\lambda_-(x) \le \lambda(x, v) \le \lambda_+(x)$;
(ii) $\lambda(x, cv) = \lambda(x, v)$ for $v \in \mathbb{R}^d \setminus \{0\}$ and $c \ne 0$;
(iii) $\lambda(x, v + v') \le \max\{\lambda(x, v), \lambda(x, v')\}$ if $v + v' \ne 0$;
(iv) $\lambda(x, v) = \lambda(f(x), A(x)v)$.

Proof  For claim (i), just observe that
\[ \|A^n(x)^{-1}\|^{-1} \|v\| \le \|A^n(x)v\| \le \|A^n(x)\| \, \|v\| \]
and take the limit/limit superior as $n \to \infty$. Part (ii) follows directly from the definition (4.1). Claim (iii) is a direct consequence of Exercise 3.6, applied to the inequality $\|A^n(x)(v + v')\| \le \|A^n(x)v\| + \|A^n(x)v'\|$. Part (iv) also follows directly from the definition.

For any $x \in M$ as in Lemma 4.4, take $k(x) \ge 1$ to be the number of elements of $\{\lambda(x, v) : v \in \mathbb{R}^d \setminus \{0\}\}$, let $\lambda_1(x) > \cdots > \lambda_{k(x)}(x)$ be those elements, in decreasing order, and let
\[ V^i_x = \{v \in \mathbb{R}^d \setminus \{0\} : \lambda(x, v) \le \lambda_i(x)\} \cup \{0\} \quad\text{for } i = 1, \dots, k(x). \]
By part (i) of Lemma 4.4, every $\lambda_i(x)$ is a real number. Moreover, parts (ii) and (iii) ensure that $V^i_x$ is a vector subspace, for every $i$. It follows directly from these definitions that the $V^i_x$ constitute a flag:
\[ \mathbb{R}^d = V^1_x \supsetneq \cdots \supsetneq V^{k(x)}_x \supsetneq \{0\}. \]
In particular, $k(x) \le d$. Another direct consequence of the definitions is:
\[ \lambda(x, v) = \lambda_i(x) \quad\text{for every } v \in V^i_x \setminus V^{i+1}_x. \tag{4.2} \]
In particular, by part (i) of Lemma 4.4,
\[ \lambda_-(x) \le \lambda_i(x) \le \lambda_+(x) \quad\text{for every } i = 1, \dots, k(x). \tag{4.3} \]
Finally, by Lemma 4.4(iv), the functions $x \mapsto k(x)$ and $x \mapsto \lambda_i(x)$ and $x \mapsto V^i_x$ are all invariant: for every $i = 1, \dots, k(x)$,
\[ k(f(x)) = k(x) \quad\text{and}\quad \lambda_i(f(x)) = \lambda_i(x) \quad\text{and}\quad V^i_{f(x)} = A(x)V^i_x. \]
This proves part (a) of the theorem.
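The flag construction can be made concrete for a constant diagonal cocycle, where $\lambda(x, v)$ is simply the largest growth rate among the coordinates in which $v$ is non-zero. In the sketch below the matrix $\mathrm{diag}(e, 1, e^{-1})$, the helper names and the finite horizon $n = 2000$ are assumptions made for illustration; a finite-time exponent replaces the limit superior.

```python
import math

RATES = (1.0, 0.0, -1.0)    # A = diag(e^1, e^0, e^-1), a constant cocycle (illustrative)

def finite_time_exponent(v, n):
    """(1/n) log |A^n v| for the diagonal matrix above, computed via log-sum-exp."""
    logs = [r * n + math.log(abs(c)) for r, c in zip(RATES, v) if c != 0.0]
    top = max(logs)
    return (top + 0.5 * math.log(sum(math.exp(2.0 * (t - top)) for t in logs))) / n

def flag_index(v, n=2000):
    """Index i such that v lies in V^i but not in V^{i+1}, read off from the finite-time exponent."""
    lam = finite_time_exponent(v, n)
    return min(range(3), key=lambda i: abs(RATES[i] - lam)) + 1

print(flag_index((1.0, 1.0, 1.0)), flag_index((0.0, 2.0, 5.0)), flag_index((0.0, 0.0, 3.0)))

# Two vectors with exponent 1 whose sum has exponent 0: the maximum in Lemma 4.4(iii)
# is only an upper bound when the two exponents coincide.
v, w = (1.0, 1.0, 0.0), (-1.0, 0.0, 0.0)
lam_sum = finite_time_exponent((0.0, 1.0, 0.0), 2000)
lam_max = max(finite_time_exponent(v, 2000), finite_time_exponent(w, 2000))
print(lam_sum, lam_max)
```

Here $V^1 = \mathbb{R}^3$, $V^2$ is the plane of the last two coordinates and $V^3$ is the last coordinate axis, so the flag is complete and the spectrum $\{1, 0, -1\}$ is simple.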

4.2.2 Measurability

Next, we are going to show that the functions $k$, $\lambda_i$ and $V^i$ are measurable, as claimed in part (b) of the theorem. We will use the following criteria for measurability, whose proof can be found in Castaing and Valadier [43]. Let $(X, \mathcal{B}, \mu)$ be a complete probability space and $Y$ be a separable complete metric space. Denote by $\mathcal{B}(Y)$ the Borel $\sigma$-algebra of $Y$.

Proposition 4.5 (Theorem III.23 in [43])  Let $\mathcal{B} \otimes \mathcal{B}(Y)$ be the product $\sigma$-algebra in $X \times Y$ and $\pi : X \times Y \to X$ be the canonical projection. Then $\pi(E) \in \mathcal{B}$ for every $E \in \mathcal{B} \otimes \mathcal{B}(Y)$.

Proposition 4.6 (Theorem III.30 in [43])  Let $\mathcal{K}(Y)$ be the space of compact subsets of $Y$, with the Hausdorff topology. The following are equivalent:

(i) a map $x \mapsto K_x$ from $X$ to $\mathcal{K}(Y)$ is measurable;
(ii) its graph $\{(x, y) : y \in K_x\}$ is in $\mathcal{B} \otimes \mathcal{B}(Y)$;
(iii) $\{x \in X : K_x \cap U \ne \emptyset\} \in \mathcal{B}$ for any open set $U \subset Y$.

Moreover, any of these conditions implies that there exists a measurable map $\sigma : X \to Y$ such that $\sigma(x) \in K_x$ for every $x \in X$.

Here is a useful application to maps with values in the Grassmannian:

Corollary 4.7  Let $(X, \mathcal{B}, \mu)$ be a complete probability space and $x \mapsto V_x$ be a map from $X$ to the Grassmannian $\mathrm{Gr}(d)$. The following are equivalent:

(a) the map $x \mapsto V_x$ is measurable;
(b) its graph $\{(x, v) \in X \times \mathbb{R}^d : v \in V_x\}$ is in $\mathcal{B} \otimes \mathcal{B}(\mathbb{R}^d)$.

Proof  For each subspace $V$ of $\mathbb{R}^d$, define $\mathbb{P}V = \{\xi \in \mathbb{P}\mathbb{R}^d : \xi \subset V\}$. Consider the following projectivized versions of conditions (a) and (b):

(a′) the map $x \mapsto \mathbb{P}V_x$, from $M$ to $\mathcal{K}(\mathbb{P}\mathbb{R}^d)$, is measurable;
(b′) its graph $\{(x, \xi) \in M \times \mathbb{P}\mathbb{R}^d : \xi \in \mathbb{P}V_x\}$ is in $\mathcal{B} \otimes \mathcal{B}(\mathbb{P}\mathbb{R}^d)$.

Note that (a′) ⇔ (b′), by Proposition 4.6. Next, consider the map
\[ \mathbb{P} : \mathrm{Gr}(d) \to \mathcal{K}(\mathbb{P}\mathbb{R}^d), \quad V \mapsto \mathbb{P}V. \]
This map is continuous and injective. Since $\mathrm{Gr}(d)$ is compact, it follows that $\mathbb{P}$ is a homeomorphism onto its image. The composition of $\mathbb{P}$ with the map in (a) is, precisely, the map in (a′). Thus, (a) ⇔ (a′). Similarly,
\[ \{(x, v) \in M \times \mathbb{R}^d : v \in V_x\} = p^{-1}\big(\{(x, \xi) \in M \times \mathbb{P}\mathbb{R}^d : \xi \in \mathbb{P}V_x\}\big) \cup \big(M \times \{0\}\big), \]
where
\[ p : M \times \big(\mathbb{R}^d \setminus \{0\}\big) \to M \times \mathbb{P}\mathbb{R}^d, \quad p(x, v) = (x, [v]). \]
So, since the map $p$ is measurable, (b′) ⇒ (b). Next, we prove that (b) ⇒ (b′). Let $\pi : M \times \mathbb{R}^d \to M$ be the canonical projection to the first coordinate and let $U$ be any open subset of $\mathbb{P}\mathbb{R}^d$. Observe that
\[ \{x \in M : \mathbb{P}V_x \cap U \ne \emptyset\} = \pi\big(\{(x, v) \in M \times \mathbb{R}^d : v \in V_x\} \cap p^{-1}(M \times U)\big). \tag{4.4} \]
Assume (b); that is, take $\{(x, v) \in M \times \mathbb{R}^d : v \in V_x\}$ to be in $\mathcal{B} \otimes \mathcal{B}(\mathbb{R}^d)$. Then, using Proposition 4.5, it follows from (4.4) that $\{x \in M : \mathbb{P}V_x \cap U \ne \emptyset\}$ is in $\mathcal{B}$. By Proposition 4.6(iii), this implies the statement in (b′): $\{(x, \xi) \in M \times \mathbb{P}\mathbb{R}^d : \xi \in \mathbb{P}V_x\}$ is in $\mathcal{B} \otimes \mathcal{B}(\mathbb{P}\mathbb{R}^d)$. Thus (b) ⇒ (b′). This completes the proof that (a) ⇔ (b).

Let $e_1, \dots, e_d$ be an arbitrary basis of $\mathbb{R}^d$. By Exercise 3.6,
\[ \lambda_1(x) = \max\{\lambda(x, e_i) : 1 \le i \le d\}. \]
Since $(x, v) \mapsto \lambda(x, v)$ is measurable, it follows that $x \mapsto \lambda_1(x)$ is a measurable function and
\[ V^2_* = \{(x, v) \in M \times (\mathbb{R}^d \setminus \{0\}) : \lambda(x, v) < \lambda_1(x)\} \]
is a measurable subset of $M \times \mathbb{R}^d$. Observe that
\[ \pi(V^2_*) = \{x \in M : \lambda(x, v) < \lambda_1(x) \text{ for some } v \in \mathbb{R}^d \setminus \{0\}\} = \{x \in M : k(x) \ge 2\}. \]
By Proposition 4.5, this is a measurable subset of $M$. For $x \in \pi(V^2_*)$, define $V^2_x = \{v \in \mathbb{R}^d : (x, v) \in V^2_*\} \cup \{0\}$. Since $V^2_* \cup (M \times \{0\})$ is a measurable subset of $M \times \mathbb{R}^d$, Corollary 4.7 gives that $x \mapsto V^2_x$ is a measurable map on $\pi(V^2_*)$. Then, by Exercise 4.1, each
\[ M^2_l = \{x \in \pi(V^2_*) : \dim V^2_x = l\}, \quad 1 \le l \le d, \]
is a measurable subset and for each $l$ there exist measurable functions $v_1, \dots, v_l : M^2_l \to \mathbb{R}^d$ such that $\{v_1(x), \dots, v_l(x)\}$ is a basis of $V^2_x$ for every $x$. Then
\[ \lambda_2(x) = \max\{\lambda(x, v_i(x)) : 1 \le i \le l\} \]
is a measurable function on $M^2_l$, for every $1 \le l \le d$. Next, let
\[ V^3_* = \{(x, v) \in M \times (\mathbb{R}^d \setminus \{0\}) : \lambda(x, v) < \lambda_2(x)\} \]
and $V^3_x = \{v \in \mathbb{R}^d : (x, v) \in V^3_*\} \cup \{0\}$ for each $x \in \pi(V^3_*)$. Just as before, $\pi(V^3_*) = \{x \in M : k(x) \ge 3\}$ is a measurable subset of $M$, the map $x \mapsto V^3_x$ is measurable on $\pi(V^3_*)$, each
\[ M^3_l = \{x \in \pi(V^3_*) : \dim V^3_x = l\}, \quad 1 \le l \le d, \]
is a measurable subset and, for each $l$, there exist measurable functions $v_1, \dots, v_l : M^3_l \to \mathbb{R}^d$ such that $\{v_1(x), \dots, v_l(x)\}$ is a basis of $V^3_x$ for every $x$. Repeating this argument successively, we find that

(i) $\{x \in M : k(x) \ge s\}$ is measurable for every $s \ge 1$; thus, the function $x \mapsto k(x)$ is measurable;
(ii) each Lyapunov exponent $\lambda_i(x)$ is a measurable function of $x$ on the set $\pi(V^i_*) = \{x \in M : k(x) \ge i\}$;
(iii) each Oseledets subspace $V^i_x$ is a measurable function of $x$ on $\pi(V^i_*)$.

This proves part (b) of the theorem.

4.2.3 Time averages of skew products The relation (4.2) gives a weaker version of part (c), with limit superior instead of limit. We are left to prove that the limit does exist. The heart of the argument is Proposition 4.8 below. Let P be a compact metric space. Consider the space C0 (P) of continuous real functions on P, endowed with the norm ψ 0 = sup{|ψ (ξ )| : ξ ∈ P}. It is well known that C0 (P) is a separable space (see [114, Theorem A.3.13]). Denote by F the space of measurable functions Ψ : M × P → R such that Ψ(x, ·) ∈ C0 (P) for μ -almost every x and x → Ψ(x, ·)0 is integrable with respect to μ , modulo identifying any two functions that coincide on some N ×P with μ (N) = 1. Then, Ψ1 =



Ψ(x, ·)0 d μ (x)

(4.5)

defines a complete norm on F . Using the assumption that (M, B, μ ) is a separable probability space, one also gets that (F ,  · ) is separable. See Exercise 4.2. Let M (μ ) be the space of probability measures on M ×P such that π∗ η = μ , where π : M × P → M is the canonical projection. The weak∗ topology is the smallest topology on M (μ ) such that the operator M (μ ) → R,

η →



Ψ dη

is continuous, for every Ψ ∈ F .

A variation of the classical Banach–Alaoglu argument (see Theorem 2.1 in [114]) shows that this topology is metrizable and compact (Exercise 4.4).

Proposition 4.8 Let $\mathcal{G} : M \times P \to M \times P$ be a measurable map of the form $\mathcal{G}(x, v) = (f(x), \mathcal{G}_x(v))$, where $\mathcal{G}_x : P \to P$ is a continuous map for $\mu$-almost every $x$. Given any $\Phi \in \mathcal{F}$, define
$$I(x) = \lim_n \frac{1}{n} \inf_{v \in P} \sum_{j=0}^{n-1} \Phi(\mathcal{G}^j(x, v)) \quad\text{and}\quad S(x) = \lim_n \frac{1}{n} \sup_{v \in P} \sum_{j=0}^{n-1} \Phi(\mathcal{G}^j(x, v)).$$
The limits exist at $\mu$-almost every point and there exist $\mathcal{G}$-invariant probability measures $\eta_I \in \mathcal{M}(\mu)$ and $\eta_S \in \mathcal{M}(\mu)$ such that
$$\int \Phi \, d\eta_I = \int I \, d\mu \quad\text{and}\quad \int \Phi \, d\eta_S = \int S \, d\mu. \tag{4.6}$$

Proof The roles of the infimum and the supremum are exchanged if one replaces $\Phi$ by $-\Phi$. Thus, it suffices to prove the claims pertaining to either one of them. For each $n \ge 1$, define
$$I_n(x) = \inf_{v \in P} \sum_{j=0}^{n-1} \Phi(\mathcal{G}^j(x, v)). \tag{4.7}$$
Every $I_n$ is a measurable function, since it coincides with the infimum taken over any countable dense subset of $P$. Note that $I_1$ is integrable, because $\Phi \in \mathcal{F}$. Moreover, the sequence $(I_n)_n$ is super-additive:
$$I_{m+n}(x) \ge I_m(x) + I_n(f^m(x)) \quad\text{for every } m, n \text{ and } x.$$
Then, by the subadditive ergodic theorem applied to $(-I_n)_n$,
$$I(x) = \lim_n \frac{1}{n} I_n(x)$$
exists for $\mu$-almost every $x$.
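The super-additivity of the sequence $(I_n)_n$ can be checked numerically on a toy skew product. The following is only a sketch under stated assumptions: the base map (a circle rotation), the finite fibre P = {0, ..., 4}, the fibre maps and the function Phi are hypothetical choices made for illustration, not objects from the text.

```python
import math

ALPHA = (math.sqrt(5) - 1) / 2   # rotation number (assumed for illustration)
P = range(5)                      # finite fibre, trivially a compact metric space


def f(x):
    """Base map: rotation of the circle R/Z."""
    return (x + ALPHA) % 1.0


def G(x, v):
    """Skew product G(x, v) = (f(x), G_x(v)); the fibre map is hypothetical."""
    step = 1 if x < 0.5 else 2
    return f(x), (v + step) % 5


def Phi(x, v):
    """Hypothetical observable on M x P."""
    return math.cos(2 * math.pi * x) * (v - 2) + 0.1 * v


def birkhoff_sum(x, v, n):
    """Sum of Phi(G^j(x, v)) for j = 0, ..., n-1."""
    total = 0.0
    for _ in range(n):
        total += Phi(x, v)
        x, v = G(x, v)
    return total


def I(n, x):
    """I_n(x): infimum over the fibre of the n-step Birkhoff sum, as in (4.7)."""
    return min(birkhoff_sum(x, v, n) for v in P)


def iterate_f(x, m):
    """Return f^m(x)."""
    for _ in range(m):
        x = f(x)
    return x


def is_superadditive(x, m, n, tol=1e-9):
    """Check I_{m+n}(x) >= I_m(x) + I_n(f^m(x)), up to rounding error."""
    return I(m + n, x) >= I(m, x) + I(n, iterate_f(x, m)) - tol
```

The inequality holds because the minimizing fibre point for the (m+n)-step sum is a competitor in the first m steps, and its image under G^m is a competitor in the remaining n steps.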

Consider the subsets $\Gamma_n$ of $M \times P$ defined by
$$\Gamma_n = \Big\{(x, v) \in M \times P : \sum_{j=0}^{n-1} \Phi(\mathcal{G}^j(x, v)) = I_n(x)\Big\}$$
and let $\Gamma_n(x) = \{v \in P : (x, v) \in \Gamma_n\}$ for each $x \in M$. This is a non-empty compact subset of $P$, for $\mu$-almost every $x$, since $P$ is compact and $v \mapsto \Phi(\mathcal{G}^j(x, v))$ is continuous for every $j$. Then, since $\Gamma_n$ is measurable, Proposition 4.6 gives that there exists a measurable map $v_n : M \to P$ such that $v_n(x) \in \Gamma_n(x)$ for $\mu$-almost every $x$. Now let
$$\xi_n = \int \delta_{(x, v_n(x))} \, d\mu(x) \quad\text{and}\quad \eta_n = \frac{1}{n} \sum_{j=0}^{n-1} \mathcal{G}^j_* \xi_n.$$
It is clear that $\pi_* \xi_n = \mu$ for every $n$. It follows that $\pi_* \eta_n = \mu$ for every $n$, because $\mathcal{G}$ is a skew product over $f$ and the measure $\mu$ is $f$-invariant. By the compactness of $\mathcal{M}(\mu)$, there exists $(n_k)_k \to \infty$ such that $(\eta_{n_k})_k$ converges to some $\eta_I \in \mathcal{M}(\mu)$. Given any $\Psi \in \mathcal{F}$,
$$\Big| \int (\Psi \circ \mathcal{G}) \, d\eta_{n_k} - \int \Psi \, d\eta_{n_k} \Big| = \frac{1}{n_k} \Big| \int (\Psi \circ \mathcal{G}^{n_k}) \, d\xi_{n_k} - \int \Psi \, d\xi_{n_k} \Big| \le \frac{2}{n_k} \int \|\Psi(x, \cdot)\|_0 \, d\mu(x) = \frac{2}{n_k} \|\Psi\|_1. \tag{4.8}$$
Observe that $\Psi \circ \mathcal{G} \in \mathcal{F}$, since $\mathcal{G}$ is measurable, $\mathcal{G}_x$ is continuous and $\|(\Psi \circ \mathcal{G})(x, \cdot)\|_0 \le \|\Psi(f(x), \cdot)\|_0$ for $\mu$-almost every $x$. Thus, by the definition of the weak$^*$ topology, the left-hand side of (4.8) converges to $|\int (\Psi \circ \mathcal{G}) \, d\eta_I - \int \Psi \, d\eta_I|$. The right-hand side converges to zero. So, we have shown that
$$\int (\Psi \circ \mathcal{G}) \, d\eta_I = \int \Psi \, d\eta_I \quad\text{for all } \Psi \in \mathcal{F}.$$

This implies that $\eta_I$ is a $\mathcal{G}$-invariant measure (Exercise 4.3). Finally,
$$\int \Phi \, d\eta_I = \lim_k \int \Phi \, d\eta_{n_k} = \lim_k \int \frac{1}{n_k} \sum_{j=0}^{n_k - 1} \Phi(\mathcal{G}^j(x, v_{n_k}(x))) \, d\mu(x) = \lim_k \frac{1}{n_k} \int I_{n_k}(x) \, d\mu(x) = \int I \, d\mu$$
(the last step uses the subadditive ergodic theorem). This proves our claim.

Corollary 4.9 In the setting of Proposition 4.8, for $\mu$-almost every $x$ there exist $v_I(x) \in P$ and $v_S(x) \in P$ such that
$$\lim_n \frac{1}{n} \sum_{j=0}^{n-1} \Phi(\mathcal{G}^j(x, v_I(x))) = I(x) \quad\text{and}\quad \lim_n \frac{1}{n} \sum_{j=0}^{n-1} \Phi(\mathcal{G}^j(x, v_S(x))) = S(x).$$

Proof Up to replacing $\Phi$ by $-\Phi$, it suffices to prove either one of the two claims. It is clear that, for every $v \in P$ and $\mu$-almost every $x$,
$$I(x) \le \liminf_n \frac{1}{n} \sum_{j=0}^{n-1} \Phi(\mathcal{G}^j(x, v)) \le \limsup_n \frac{1}{n} \sum_{j=0}^{n-1} \Phi(\mathcal{G}^j(x, v)) \le S(x). \tag{4.9}$$
By the Birkhoff ergodic theorem, given any $\mathcal{G}$-invariant measure $\eta$,
$$\tilde{\Phi}(x, v) = \lim_n \frac{1}{n} \sum_{j=0}^{n-1} \Phi(\mathcal{G}^j(x, v)) \tag{4.10}$$
exists for $\eta$-almost every point $(x, v)$, and satisfies $\int \tilde{\Phi} \, d\eta = \int \Phi \, d\eta$. Taking $\eta = \eta_I$, this gives that $\int \tilde{\Phi} \, d\eta_I = \int I \, d\mu$. In view of (4.9), this implies that the measurable set
$$E = \{(x, v) \in M \times P : \tilde{\Phi}(x, v) = I(x)\}$$
has full $\eta_I$-measure. By Proposition 4.5, the set $\pi(E)$ is measurable. Moreover, it has full $\mu$-measure: since $\pi_* \eta_I = \mu$,
$$\mu(\pi(E)) = \eta_I(\pi^{-1}(\pi(E))) \ge \eta_I(E) = 1.$$
Now, it is clear that every $x \in \pi(E)$ satisfies the claim relative to $I$ and $v_I$.

Remark 4.10 When $\mathcal{G}$ is invertible, we may replace (4.10) by
$$\tilde{\Phi}(x, v) = \lim_{n \to \pm\infty} \frac{1}{n} \sum_{j=0}^{n-1} \Phi(\mathcal{G}^j(x, v))$$
(the two limits exist and are equal almost everywhere). Then we get
$$\lim_{n \to \pm\infty} \frac{1}{n} \sum_{j=0}^{n-1} \Phi(\mathcal{G}^j(x, v_I(x))) = I(x) \quad\text{and}\quad \lim_{n \to \pm\infty} \frac{1}{n} \sum_{j=0}^{n-1} \Phi(\mathcal{G}^j(x, v_S(x))) = S(x)$$
in the conclusion of Corollary 4.9.

4.2.4 Applications to linear cocycles

Let us go back to linear cocycles. We are going to deduce:

Proposition 4.11 Let $x \mapsto V_x$ be a measurable invariant sub-bundle for the cocycle $F : M \times \mathbb{R}^d \to M \times \mathbb{R}^d$. Then, for $\mu$-almost every point $x$,

(a) $\lim_n \frac{1}{n} \log \|(A^n(x) \mid V_x)^{-1}\|^{-1} = \min\{\lambda(x, v) : v \in V_x \setminus \{0\}\}$;
(b) $\lim_n \frac{1}{n} \log \|A^n(x) \mid V_x\| = \max\{\lambda(x, v) : v \in V_x \setminus \{0\}\}$.

Proof Up to restricting to convenient invariant measurable subsets of $M$, and considering the normalized restriction of $\mu$ to such subsets, we may suppose that the dimension of $V_x$ is constant. Let $l = \dim V_x$ for every $x$. By Exercise 4.1 and Gram–Schmidt, we may find measurable functions $\{v_1(x), \dots, v_l(x)\}$ that constitute an orthonormal basis of $V_x$ at every point. Using such a basis, we may identify $V_x$ with $\mathbb{R}^l$ through an isometry. Denote $D(x) = A(x) \mid V_x$ and let $G : M \times \mathbb{R}^l \to M \times \mathbb{R}^l$ be the linear cocycle induced by the map $D : M \to GL(l)$ over $f$. Clearly, $\|D(x)\| \le \|A(x)\|$ and $\|D(x)^{-1}\| \le \|A(x)^{-1}\|$ and so the hypotheses imply $\log^+ \|D^{\pm 1}\| \in L^1(\mu)$. We also consider the projectivization $\mathcal{G} : M \times \mathbb{P}\mathbb{R}^l \to M \times \mathbb{P}\mathbb{R}^l$ of $G$. Consider $\Phi : M \times \mathbb{P}\mathbb{R}^l \to \mathbb{R}$, defined by
$$\Phi(x, [v]) = \log \frac{\|D(x)v\|}{\|v\|}.$$
It is clear that $\Phi \in \mathcal{F}$; that is, $\Phi$ is measurable and $\Phi(x, \cdot) \in C^0(\mathbb{P}\mathbb{R}^l)$ for every $x$. For any vector $v \in \mathbb{R}^l \setminus \{0\}$ and any $n \ge 0$,
$$\limsup_n \frac{1}{n} \sum_{j=0}^{n-1} \Phi(\mathcal{G}^j(x, [v])) = \limsup_n \frac{1}{n} \log \frac{\|D^n(x)v\|}{\|v\|} = \lambda(x, v).$$
Moreover,
$$I_n(x) = \log \|D^n(x)^{-1}\|^{-1} \quad\text{and}\quad S_n(x) = \log \|D^n(x)\|$$
for every $n \ge 0$, and so
$$I(x) = \lambda_-(x) \quad\text{and}\quad S(x) = \lambda_+(x).$$
According to Lemma 4.4(i), we have $\lambda_-(x) \le \lambda(x, v) \le \lambda_+(x)$ for every $v \ne 0$. So, the conclusion of Corollary 4.9 means that $\min\{\lambda(x, v) : v \in \mathbb{R}^l \setminus \{0\}\} = \lambda_-(x)$ and $\max\{\lambda(x, v) : v \in \mathbb{R}^l \setminus \{0\}\} = \lambda_+(x)$, as we wanted to prove.

Remark 4.12 If $f$ is invertible, using Remark 4.10 we get:

(a) $\lim_{n \to \pm\infty} \frac{1}{n} \log \|(A^n(x) \mid V_x)^{-1}\|^{-1} = \min\{\lambda(x, v) : v \in V_x \setminus \{0\}\}$;
(b) $\lim_{n \to \pm\infty} \frac{1}{n} \log \|A^n(x) \mid V_x\| = \max\{\lambda(x, v) : v \in V_x \setminus \{0\}\}$.

The special case $V_x = \mathbb{R}^d$ in Proposition 4.11 gives:

Corollary 4.13 $\lambda_+(x) = \lambda_1(x)$ and $\lambda_-(x) = \lambda_{k(x)}(x)$ for $\mu$-almost every $x$.
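For a constant cocycle, Corollary 4.13 says that the growth rates of the largest and smallest singular values of the n-th power recover the extremal exponents. The sketch below checks this for one hypothetical 2x2 matrix with eigenvalues 2 and 1/2; the matrix and the helper names are assumptions made for illustration.

```python
import math


def matmul(X, Y):
    """Product of two 2x2 matrices given as nested tuples."""
    return (
        (X[0][0] * Y[0][0] + X[0][1] * Y[1][0], X[0][0] * Y[0][1] + X[0][1] * Y[1][1]),
        (X[1][0] * Y[0][0] + X[1][1] * Y[1][0], X[1][0] * Y[0][1] + X[1][1] * Y[1][1]),
    )


def singular_values(X):
    """Largest and smallest singular values of a 2x2 matrix."""
    (a, b), (c, d) = X
    s = a * a + b * b + c * c + d * d      # trace of X^T X
    det = abs(a * d - b * c)               # product of the two singular values
    sigma_max = math.sqrt((s + math.sqrt(max(s * s - 4 * det * det, 0.0))) / 2)
    return sigma_max, det / sigma_max


def extremal_exponents(A, n):
    """Finite-time values of (1/n) log ||A^n|| and (1/n) log ||(A^n)^-1||^-1."""
    power = ((1.0, 0.0), (0.0, 1.0))
    for _ in range(n):
        power = matmul(power, A)
    sigma_max, sigma_min = singular_values(power)
    return math.log(sigma_max) / n, math.log(sigma_min) / n


A = ((2.0, 1.0), (0.0, 0.5))               # hypothetical constant cocycle
lam_plus, lam_minus = extremal_exponents(A, 200)
```

Here the operator norm of the inverse of A^n is the reciprocal of the smallest singular value of A^n, so the second returned value approximates the smallest exponent. The finite-time values differ from log 2 and -log 2 by a term of order 1/n.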

4.2.5 Dimension reduction

Let us consider the following situation. Let $x \mapsto V_x$ be a measurable invariant sub-bundle and $\alpha(x) < \beta(x)$ be measurable invariant functions such that

(i) $\lambda(x, v) \le \alpha(x)$ for every $v \in V_x \setminus \{0\}$;
(ii) $\lambda(x, u) \ge \beta(x)$ for every $u \in \mathbb{R}^d \setminus V_x$;

for $\mu$-almost every $x$. By Proposition 4.11, condition (i) implies

(iii) $\lim_n \frac{1}{n} \log \|A^n(x) \mid V_x\| \le \alpha(x)$.

Let $V_x^\perp$ denote the orthogonal complement of $V_x$. Note that $x \mapsto V_x^\perp$ is measurable, since $x \mapsto V_x$ is taken to be measurable and the orthogonal complement map $\perp : \mathrm{Gr}(l, d) \to \mathrm{Gr}(d - l, d)$ is a diffeomorphism for every $l$. Let
$$A(x) = \begin{pmatrix} B(x) & 0 \\ C(x) & D(x) \end{pmatrix} \tag{4.11}$$
be the expression of $A$ relative to the direct sum decomposition $\mathbb{R}^d = V_x^\perp \oplus V_x$. Clearly, $D(x)$ is just the restriction $A(x) \mid V_x$ of $A(x)$ to the invariant sub-bundle. It is also clear that the norms of $B(x)$, $C(x)$ and $D(x)$, and their inverses, are bounded by the norms of $A(x)^{\pm 1}$. Thus, the hypotheses ensure that
$$\log \|B^{\pm 1}\|, \ \log \|C^{\pm 1}\|, \ \log \|D^{\pm 1}\| \in L^1(\mu). \tag{4.12}$$

Proposition 4.14 For $\mu$-almost every $x$, any $u \in V_x^\perp \setminus \{0\}$ and any $v \in V_x$,

(a) $\limsup_n \frac{1}{n} \log \|B^n(x)u\| = \limsup_n \frac{1}{n} \log \|A^n(x)(u + v)\|$;
(b) if $\lim_n \frac{1}{n} \log \|B^n(x)u\|$ exists then $\lim_n \frac{1}{n} \log \|A^n(x)(u + v)\|$ exists for all $v \in V_x$, and the two limits coincide.

Proof First, we prove (a). By Exercise 3.6 and the assumptions (i) and (ii),
$$\limsup_n \frac{1}{n} \log \|A^n(x)(u + v)\| \le \limsup_n \frac{1}{n} \log \big( \|A^n(x)u\| + \|A^n(x)v\| \big) = \max\Big\{ \limsup_n \frac{1}{n} \log \|A^n(x)u\|, \ \limsup_n \frac{1}{n} \log \|A^n(x)v\| \Big\} = \limsup_n \frac{1}{n} \log \|A^n(x)u\|$$
and, analogously,
$$\limsup_n \frac{1}{n} \log \|A^n(x)u\| \le \limsup_n \frac{1}{n} \log \big( \|A^n(x)(u + v)\| + \|A^n(x)v\| \big) = \max\Big\{ \limsup_n \frac{1}{n} \log \|A^n(x)(u + v)\|, \ \limsup_n \frac{1}{n} \log \|A^n(x)v\| \Big\} = \limsup_n \frac{1}{n} \log \|A^n(x)(u + v)\|.$$
This proves that, for any $u \in V_x^\perp \setminus \{0\}$ and $v \in V_x$,
$$\limsup_n \frac{1}{n} \log \|A^n(x)(u + v)\| = \limsup_n \frac{1}{n} \log \|A^n(x)u\|.$$
So, from now on we consider $v = 0$. We will need the following fact:

Lemma 4.15 Given any $\varepsilon > 0$, there exists a measurable function $d_\varepsilon(x) > 0$ such that
$$\|D^n(f^m(x))\| \le d_\varepsilon(x) e^{\alpha(x) n + (m + n)\varepsilon} \quad\text{for every } m, n \ge 0.$$

Proof Define $b_\varepsilon(x) = \sup\{\|D^n(x)\| e^{-n(\alpha(x) + \varepsilon)} : n \ge 0\}$.

Observe that $1 \le b_\varepsilon(x) < \infty$ at $\mu$-almost every point, by condition (iii). From
$$b_\varepsilon(f(x)) = \sup\{\|D^n(f(x))\| e^{-n(\alpha(x) + \varepsilon)} : n \ge 0\}$$
we get that
$$b_\varepsilon(f(x)) \ge \|D(x)\|^{-1} e^{\alpha(x) + \varepsilon} \sup\{\|D^n(x)\| e^{-n(\alpha(x) + \varepsilon)} : n \ge 1\}. \tag{4.13}$$
There are two possibilities. If the supremum on the right-hand side coincides with $b_\varepsilon(x)$, then this yields
$$\log b_\varepsilon(f(x)) \ge \log b_\varepsilon(x) + \log \|D(x)\|^{-1} + \alpha(x) + \varepsilon.$$
Otherwise, if the supremum in the definition of $b_\varepsilon(x)$ is attained at $n = 0$, then $b_\varepsilon(x) = 1 \le b_\varepsilon(f(x))$. In any event,
$$\log b_\varepsilon(f(x)) - \log b_\varepsilon(x) \ge \min\{-\log^+ \|D(x)\| + \alpha(x) + \varepsilon, \ 0\}. \tag{4.14}$$
Similarly, (4.13) yields $b_\varepsilon(f(x)) \le \|D(x)^{-1}\| e^{\alpha(x) + \varepsilon} b_\varepsilon(x)$, and so
$$\log b_\varepsilon(f(x)) - \log b_\varepsilon(x) \le \log^+ \|D(x)^{-1}\| + \alpha(x) + \varepsilon. \tag{4.15}$$
The inequalities (4.14)–(4.15), together with (4.12), ensure that $\log b_\varepsilon \circ f - \log b_\varepsilon$ is in $L^1(\mu)$. Then, using Corollary 3.11,
$$\lim_m \frac{1}{m} \log b_\varepsilon(f^m(x)) = 0.$$
Let $d_\varepsilon(x) = \sup\{b_\varepsilon(f^m(x)) e^{-\varepsilon m} : m \ge 0\}$. The previous relation ensures that $d_\varepsilon(x)$ is finite at $\mu$-almost every point. By definition, $b_\varepsilon(f^m(x)) \le d_\varepsilon(x) e^{\varepsilon m}$ and
$$\|D^n(f^m(x))\| \le b_\varepsilon(f^m(x)) e^{n(\alpha(x) + \varepsilon)} \le d_\varepsilon(x) e^{\alpha(x) n + (m + n)\varepsilon}$$
for every $m, n \ge 0$.

Going back to the proof of Proposition 4.14, observe that
$$A^n(x) = \begin{pmatrix} B^n(x) & 0 \\ C_n(x) & D^n(x) \end{pmatrix} \quad\text{for every } n \ge 0,$$
where
$$C_n(x) = \sum_{j=0}^{n-1} D^{n-j-1}(f^{j+1}(x)) \, C(f^j(x)) \, B^j(x). \tag{4.16}$$
Given $x \in M$ and $u \in V_x^\perp \setminus \{0\}$, consider
$$\gamma = \max\Big\{ \alpha(x), \ \limsup_n \frac{1}{n} \log \|B^n(x)u\| \Big\}.$$
In particular,
$$\limsup_n \frac{1}{n} \log \|B^n(x)u\| \le \gamma \tag{4.17}$$
which implies that, given any $\varepsilon > 0$, there exists a real number $b_\varepsilon$ such that
$$\|B^j(x)u\| \le b_\varepsilon e^{j(\gamma + \varepsilon)} \quad\text{for every } j. \tag{4.18}$$

By Corollary 3.11 and (4.12), there exists a measurable function $c_\varepsilon$ such that
$$\|C(f^j(x))\| \le c_\varepsilon(x) e^{j\varepsilon} \quad\text{for every } j. \tag{4.19}$$
By Lemma 4.15, there exists a measurable function $d_\varepsilon$ such that
$$\|D^i(f^j(x))\| \le d_\varepsilon(x) e^{i\alpha(x) + (i + j)\varepsilon} \quad\text{for every } i \text{ and } j. \tag{4.20}$$
Substituting the estimates (4.18)–(4.20) in (4.16), we find that
$$\|C_n(x)u\| \le \sum_{j=0}^{n-1} d_\varepsilon(x) e^{(n-j-1)\alpha(x) + n\varepsilon} \, c_\varepsilon(x) e^{j\varepsilon} \, b_\varepsilon e^{j(\gamma + \varepsilon)} \le n a_\varepsilon e^{n(\gamma + 3\varepsilon)},$$
where $a_\varepsilon = b_\varepsilon c_\varepsilon(x) d_\varepsilon(x)$. Consequently,
$$\limsup_n \frac{1}{n} \log \|C_n(x)u\| \le \gamma + 3\varepsilon. \tag{4.21}$$
Since $A^n(x)u = (B^n(x)u, C_n(x)u)$, we have that
$$\|A^n(x)u\|^2 = \|B^n(x)u\|^2 + \|C_n(x)u\|^2.$$
So, by Exercise 3.6, the inequalities (4.17) and (4.21) give
$$\limsup_n \frac{1}{n} \log \|B^n(x)u\| \le \limsup_n \frac{1}{n} \log \|A^n(x)u\| \le \gamma + 3\varepsilon.$$
Since $\varepsilon > 0$ is arbitrary, this implies that
$$\limsup_n \frac{1}{n} \log \|B^n(x)u\| \le \limsup_n \frac{1}{n} \log \|A^n(x)u\| \le \gamma. \tag{4.22}$$
Now, the assumption (ii) yields
$$\alpha(x) < \beta(x) \le \limsup_n \frac{1}{n} \log \|A^n(x)u\| \le \gamma$$
and so $\alpha(x)$ is strictly smaller than $\gamma$. So, by the definition of $\gamma$,
$$\limsup_n \frac{1}{n} \log \|B^n(x)u\| = \gamma. \tag{4.23}$$
The two relations (4.22) and (4.23) yield part (a) of the proposition.

Now we can readily deduce part (b). Note that
$$\|A^n(x)(u + v)\|^2 = \|B^n(x)u\|^2 + \|C_n(x)u + D^n(x)v\|^2,$$
since $A^n(x)(u + v) = (B^n(x)u, \ C_n(x)u + D^n(x)v)$. So, by Exercise 3.6,
$$\liminf_n \frac{1}{n} \log \|A^n(x)(u + v)\| \ge \max\Big\{ \liminf_n \frac{1}{n} \log \|B^n(x)u\|, \ \liminf_n \frac{1}{n} \log \|C_n(x)u + D^n(x)v\| \Big\} \ge \liminf_n \frac{1}{n} \log \|B^n(x)u\| = \limsup_n \frac{1}{n} \log \|B^n(x)u\|.$$
Then the claim follows immediately from part (a) of the proposition. This finishes the proof of Proposition 4.14.
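The block-triangular structure used in this proof can be illustrated with scalar blocks. In the sketch below the constants B, C, D are hypothetical (a constant cocycle on R^2 with V spanned by the second coordinate); it compares the off-diagonal entry of A^n with formula (4.16) and the growth rate of A^n(u + v) with that of B^n u.

```python
import math

B, C, D = 2.0, 1.0, 0.5   # hypothetical constant blocks with |D| < |B|


def C_n(n):
    """Off-diagonal block of A^n computed from formula (4.16)."""
    return sum(D ** (n - j - 1) * C * B ** j for j in range(n))


def apply_power(n, u, v):
    """Apply A^n to (u, v) by iterating A = [[B, 0], [C, D]]."""
    for _ in range(n):
        u, v = B * u, C * u + D * v
    return u, v


def growth_rate(n, u, v):
    """(1/n) log ||A^n (u + v)|| for the Euclidean norm."""
    un, vn = apply_power(n, u, v)
    return math.log(math.hypot(un, vn)) / n
```

With u = 1 and v = 0 the second coordinate of A^n(u, v) is exactly C_n, and for any v the rate approaches log B as n grows, as part (a) of Proposition 4.14 predicts when u is non-zero.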

4.2.6 Completion of the proof

Keep in mind that our goal, to finish the proof of Theorem 4.1, is to show that
$$\lim_n \frac{1}{n} \log \|A^n(x)v\| \tag{4.24}$$
exists for $\mu$-almost every $x \in M$, every $v \in V_x^i \setminus V_x^{i+1}$ and every $1 \le i \le k$. It will follow that the limit is equal to $\lambda(x, v) = \lambda_i(x)$, of course.

Up to replacing $M$ by suitable invariant measurable subsets, and considering the normalized restriction of $\mu$ to each of such subsets, we may suppose that $k(x)$ is independent of $x$, and so is the dimension $l \ge 1$ of the invariant sub-bundle $V_x = V_x^k$. Let $\alpha(x) = \lambda_k(x)$ and $\beta(x) = \lambda_{k-1}(x)$. It is clear that conditions (i) and (ii) in Section 4.2.5 are satisfied in this context, and so we will be able to use Proposition 4.14. Moreover, Proposition 4.11 implies that, for $\mu$-almost every $x$,
$$\lim_n \frac{1}{n} \log \|(A^n(x) \mid V_x)^{-1}\|^{-1} = \lambda_k(x) = \lim_n \frac{1}{n} \log \|A^n(x) \mid V_x\|$$
and so,
$$\lim_n \frac{1}{n} \log \|A^n(x)v\| = \lambda_k(x) \quad\text{for all } v \in V_x \setminus \{0\}. \tag{4.25}$$
Consider the expression (4.11) of $A$ relative to the direct sum decomposition $\mathbb{R}^d = V_x^\perp \oplus V_x$. Using a measurable orthonormal basis $\{w_1(x), \dots, w_{d-l}(x)\}$, we may identify $V_x^\perp$ with $\mathbb{R}^{d-l}$ and view each $B(x)$ as an element of $GL(d - l)$, depending measurably on the point $x$. Recall that $\log^+ \|B^{\pm 1}\| \in L^1(\mu)$, by (4.12). Define
$$U_x^i = V_x^\perp \cap V_x^i \quad\text{for every } i = 1, \dots, k.$$
By Proposition 4.14(a), for every $1 \le i \le k - 1$ and $u \in U_x^i \setminus U_x^{i+1}$,
$$\limsup_n \frac{1}{n} \log \|B^n(x)u\| = \limsup_n \frac{1}{n} \log \|A^n(x)u\| = \lambda_i(x).$$
Thus, $\mathbb{R}^{d-l} = U_x^1 \supsetneq \cdots \supsetneq U_x^{k-1} \supsetneq \{0\}$ is the Oseledets flag of $B$, and the Lyapunov exponents are $\lambda_1(x), \dots, \lambda_{k-1}(x)$. By induction on $k$,
$$\lim_n \frac{1}{n} \log \|B^n(x)u\| = \lambda_i(x) \quad\text{for all } u \in U_x^i \setminus U_x^{i+1}$$
and every $i = 1, \dots, k - 1$. By Proposition 4.14(b), it follows that
$$\lim_n \frac{1}{n} \log \|A^n(x)v\| = \lambda_i(x) \quad\text{for all } v \in V_x^i \setminus V_x^{i+1} \tag{4.26}$$
and every $i = 1, \dots, k - 1$. The relations (4.25) and (4.26) contain (4.24). The proof of Theorem 4.1 is complete.

4.3 Proof of the two-sided theorem

Now let us prove Theorem 4.2. We have seen in Remark 4.12 that the limits are not affected when one takes $n \to -\infty$ instead. The main remaining steps for proving Theorem 4.2 are: to upgrade the Oseledets flag to an invariant decomposition (Section 4.3.1) and to prove the subexponential decay of the angles (Section 4.3.2).

4.3.1 Upgrading to a decomposition

We pick up where we left off, in Section 4.2.6. It is no restriction to suppose that $k(x)$ and the dimension $l \ge 1$ of the invariant sub-bundle $V_x = V_x^{k(x)}$ are independent of $x$. Let $\alpha(x) = \lambda_k(x)$ and $\beta(x) = \lambda_{k-1}(x)$. Consider the expression (4.11) of $A$ relative to the decomposition $\mathbb{R}^d = V_x^\perp \oplus V_x$. By Remark 4.12,
$$\lim_{n \to \pm\infty} \frac{1}{n} \log \|D^n(x)^{-1}\|^{-1} = \lim_{n \to \pm\infty} \frac{1}{n} \log \|D^n(x)\| = \alpha(x) \tag{4.27}$$
for $\mu$-almost every $x$. Similarly, since $\lambda_1, \dots, \lambda_{k-1}$ are the Lyapunov exponents of $B$ (see Section 4.2.6), Remark 4.12 gives
$$\lim_{n \to \pm\infty} \frac{1}{n} \log \|B^n(x)^{-1}\|^{-1} = \beta(x) \quad\text{for } \mu\text{-almost every } x. \tag{4.28}$$
We are going to use these facts in the proof of the following result.

Proposition 4.16 If $f : M \to M$ is invertible then there exists a measurable invariant sub-bundle $x \mapsto W_x$ such that $\mathbb{R}^d = W_x \oplus V_x$ for $\mu$-almost every $x$.

Proof Let $\mathcal{L}$ be the space of all measurable maps $L : x \mapsto L_x$ assigning to $\mu$-almost every $x$ a linear map $L_x : V_x^\perp \to V_x$. The graph transform $T : \mathcal{L} \to \mathcal{L}$ is the transformation characterized by the condition that, for every $x \in M$, the image of the graph of $L_x$ under $A(x)$ coincides with the graph of $T(L)_{f(x)}$. Using the expression (4.11), this condition translates to the following relation:
$$T(L)_{f(x)} = \big( C(x) + D(x) L_x \big) B(x)^{-1} \quad\text{for every } x. \tag{4.29}$$
Clearly, the graph of some $L \in \mathcal{L}$ is an invariant sub-bundle if and only if $T(L)_x = L_x$ for $\mu$-almost every $x$. Thus, to prove the proposition it suffices to find such an (essentially) fixed point of the graph transform. With that in mind, let us rewrite (4.29) as
$$T(L)_y = R_y + S(L)_y \quad\text{for every } y, \tag{4.30}$$
where $R : y \mapsto R_y = C(f^{-1}(y)) B(f^{-1}(y))^{-1} = C(f^{-1}(y)) B^{-1}(y)$ is an element of $\mathcal{L}$ and $S : \mathcal{L} \to \mathcal{L}$ is the linear operator defined by
$$S(L)_y = D(f^{-1}(y)) \, L_{f^{-1}(y)} \, B^{-1}(y).$$
We claim that there exist measurable functions $a(x) > 0$ and $\varepsilon(x) > 0$ such that
$$\|S^k(R)_y\| \le a(y) e^{-k\varepsilon(y)} \quad\text{for every } k \ge 0 \text{ and } \mu\text{-almost every } y. \tag{4.31}$$
Assume this fact for a while. Then the sum $L : y \mapsto L_y = \sum_{k=0}^{\infty} S^k(R)_y$ is well defined $\mu$-almost everywhere, and it is a measurable map; in other words, $L \in \mathcal{L}$. Moreover, $L$ is a fixed point of the graph transform:
$$R_y + S(L)_y = R_y + S\Big( \sum_{k=0}^{\infty} S^k(R) \Big)_y = R_y + \sum_{k=1}^{\infty} S^k(R)_y = \sum_{k=0}^{\infty} S^k(R)_y = L_y.$$

We are left to prove the claim (4.31). Begin by observing that, for any $k \ge 0$,
$$S^k(R)_y = D^k(f^{-k}(y)) \, R_{f^{-k}(y)} \, B^{-k}(y).$$
For each $y$, take
$$\varepsilon(y) = \frac{1}{5} \big( \beta(y) - \alpha(y) \big).$$
By (4.27) and Lemma 4.15, there exists a measurable function $d(\cdot)$ such that
$$\|D^k(f^{-k}(y))\| \le d(y) e^{k(\alpha(y) + 2\varepsilon(y))} \quad\text{for every } k \ge 0. \tag{4.32}$$
Similarly, (4.28) and Lemma 4.15 imply that there exists a measurable function $b(\cdot)$ such that
$$\|B^k(y)^{-1}\| \le b(y) e^{k(-\beta(y) + \varepsilon(y))} \quad\text{for every } k \ge 0. \tag{4.33}$$
The observation (4.12) implies that $\log^+ \|R\| \le \log \|C \circ f^{-1}\| + \log^+ \|B^{-1}\|$ is in $L^1(\mu)$. Then we may use Corollary 3.11: there exists a measurable function $r$ such that
$$\|R_{f^{-k}(y)}\| \le r(y) e^{k\varepsilon(y)} \quad\text{for every } k \ge 0. \tag{4.34}$$
Let $a(y) = b(y) d(y) r(y)$. Putting (4.32)–(4.34) together,
$$\|S^k(R)_y\| \le b(y) d(y) r(y) e^{k(\alpha(y) - \beta(y) + 4\varepsilon(y))} = a(y) e^{-k\varepsilon(y)}.$$
This proves (4.31), and that completes the proof of the proposition.

The relation (4.27) implies that
$$\lim_{n \to \pm\infty} \frac{1}{n} \log \|A^n(x)v\| = \alpha(x) = \lambda_k(x) \quad\text{for all } v \in V_x \setminus \{0\} \tag{4.35}$$

and $\mu$-almost every $x$. Identify $W_x$ with $\mathbb{R}^{d-l}$ through some measurable orthonormal basis and let $\tilde{A} : M \to GL(d - l)$ be given by $\tilde{A}(x) = A(x) \mid W_x$. Define
$$W_x^i = W_x \cap V_x^i \quad\text{for every } i = 1, \dots, k.$$
By Proposition 4.14(a), for every $1 \le i \le k - 1$ and $u \in W_x^i \setminus W_x^{i+1}$,
$$\limsup_n \frac{1}{n} \log \|\tilde{A}^n(x)u\| = \limsup_n \frac{1}{n} \log \|A^n(x)u\| = \lambda_i(x).$$
Thus, $W_x = W_x^1 \supsetneq \cdots \supsetneq W_x^{k-1} \supsetneq \{0\}$ is the Oseledets flag of $\tilde{A}$, and the Lyapunov exponents are $\lambda_1(x), \dots, \lambda_{k-1}(x)$. By induction on $k$, there exists an $\tilde{A}$-invariant splitting $W_x = E_x^1 \oplus \cdots \oplus E_x^{k-1}$ such that
$$W_x^j = \bigoplus_{i=j}^{k-1} E_x^i \quad\text{for every } j = 1, \dots, k - 1 \tag{4.36}$$
and
$$\lim_{n \to \pm\infty} \frac{1}{n} \log \|A^n(x)u\| = \lambda_i(x) \quad\text{for all } u \in E_x^i \setminus \{0\} \tag{4.37}$$
and $\mu$-almost every $x$. Denote $E_x^k = V_x = V_x^k$. Then $\mathbb{R}^d = E_x^1 \oplus \cdots \oplus E_x^{k-1} \oplus E_x^k$ is an $A$-invariant splitting and (4.36) leads to
$$V_x^j = V_x^k \oplus W_x^j = \bigoplus_{i=j}^{k} E_x^i \quad\text{for every } j = 1, \dots, k.$$
This proves part (a) of the theorem. Part (b) is contained in (4.35) and (4.37).
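The fixed-point construction in the proof of Proposition 4.16 becomes a geometric series when the blocks are constant scalars. The sketch below uses hypothetical values B, C, D with |D/B| < 1; in that case R = C/B, S(L) = D L / B, and the series for L sums to C/(B - D).

```python
B, C, D = 2.0, 1.0, 0.5   # hypothetical scalar blocks with |D / B| < 1


def graph_transform(L):
    """T(L) = (C + D L) B^{-1}, the constant-coefficient form of (4.29)."""
    return (C + D * L) / B


def fixed_point_by_series(terms=60):
    """Partial sum of L = sum_k S^k(R), with R = C / B and S(L) = D L / B."""
    total, term = 0.0, C / B
    for _ in range(terms):
        total += term
        term = D * term / B
    return total


L_star = fixed_point_by_series()

# Image of the graph vector (1, L_star) under A = [[B, 0], [C, D]];
# invariance of the graph means the image has the same slope.
image = (B * 1.0, C * 1.0 + D * L_star)
slope_of_image = image[1] / image[0]
```

The line spanned by (1, L_star) plays the role of W in this toy case: it is invariant under A and complementary to V, which is spanned by (0, 1).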

4.3.2 Subexponential decay of angles

All that is left to do is to prove part (c) of Theorem 4.2. As explained before, it is no restriction to assume that $k(x)$ and the dimensions of the Oseledets subspaces are constant $\mu$-almost everywhere. Given disjoint subsets $I$ and $J$ of $\{1, \dots, k\}$, define
$$\varphi(x) = \Big| \sin \angle \Big( \bigoplus_{i \in I} E_x^i, \ \bigoplus_{j \in J} E_x^j \Big) \Big|.$$
Since the Oseledets subspaces are invariant,
$$\frac{\varphi(f(x))}{\varphi(x)} = \frac{\big| \sin \angle \big( A(x) \bigoplus_{i \in I} E_x^i, \ A(x) \bigoplus_{j \in J} E_x^j \big) \big|}{\big| \sin \angle \big( \bigoplus_{i \in I} E_x^i, \ \bigoplus_{j \in J} E_x^j \big) \big|}.$$
By elementary linear algebra (Exercise 3.5), the right-hand side is bounded above by $\|A(x)\| \, \|A(x)^{-1}\|$ and below by $(\|A(x)\| \, \|A(x)^{-1}\|)^{-1}$. In particular, the hypotheses $\log^+ \|A^{\pm 1}\| \in L^1(\mu)$ imply that $\log \varphi \circ f - \log \varphi$ is $\mu$-integrable. Equivalently, $\log \varphi \circ f^{-1} - \log \varphi$ is $\mu$-integrable. Applying Corollary 3.11, both to $f$ and to $f^{-1}$, we conclude that
$$\lim_{n \to \pm\infty} \frac{1}{n} \log \varphi(f^n(x)) = 0,$$
which is what we wanted to prove. The proof of Theorem 4.2 is complete.

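4.3.3 Consequences of subexponential decay

Let us begin by pointing out that we proved a stronger fact than was stated in part (b) of the theorem, namely:
$$\lim_{n \to \pm\infty} \frac{1}{n} \log \|(A^n(x) \mid E_x^i)^{-1}\|^{-1} = \lim_{n \to \pm\infty} \frac{1}{n} \log \|A^n(x) \mid E_x^i\| = \lambda_i(x) \tag{4.38}$$
for $\mu$-almost every $x$ (this implies, for instance, that the limit in (b) is uniform over all unit vectors). It follows (Exercise 4.7) that
$$\lim_{n \to \pm\infty} \frac{1}{n} \log \big| \det(A^n(x) \mid E_x^i) \big| = \lambda_i(x) \dim E_x^i$$
for $\mu$-almost every $x$. Moreover, using Theorem 4.2(c) and Exercise 4.7,
$$\lim_{n \to \pm\infty} \frac{1}{n} \log \Big| \det \Big( A^n(x) \mid \bigoplus_{i \in I} E_x^i \Big) \Big| = \sum_{i \in I} \lambda_i(x) \dim E_x^i \tag{4.39}$$
for any $I \subset \{1, \dots, k(x)\}$. For example,
$$\lim_{n \to \pm\infty} \frac{1}{n} \log |\det A^n(x)| = \sum_{i=1}^{k(x)} \lambda_i(x) \dim E_x^i \tag{4.40}$$
for $\mu$-almost every $x \in M$. In particular, if $|\det A| \equiv 1$ then the sum of all Lyapunov exponents, counted with multiplicity, is identically zero. Analogously,
$$\lim_{n \to \pm\infty} \frac{1}{n} \log \big| \det(A^n(x) \mid E_x^u) \big| = \sum_{\lambda_i(x) > 0} \lambda_i(x) \dim E_x^i \quad\text{and}\quad \lim_{n \to \pm\infty} \frac{1}{n} \log \big| \det(A^n(x) \mid E_x^s) \big| = \sum_{\lambda_i(x) < 0} \lambda_i(x) \dim E_x^i,$$
where $E_x^u = \bigoplus_{\lambda_i(x) > 0} E_x^i$ is the unstable bundle and $E_x^s = \bigoplus_{\lambda_i(x) < 0} E_x^i$ [...]

[...] Let $\lambda_1(x) > \cdots > \lambda_k(x)$ be the Lyapunov exponents and $\mathbb{R}^d = E_x^1 \oplus \cdots \oplus E_x^k$ be the Oseledets decomposition of $F$. We may suppose that $k$ and the dimensions $d_m = \dim E_x^m$ are constant $\mu$-almost everywhere. For each $(l_1, \dots, l_k)$ with $0 \le l_m \le d_m$ and $\sum_{m=1}^{k} l_m = l$, define $E_x^{l_1, \dots, l_k}$ to be the subspace of $\Lambda^l(\mathbb{R}^d)$ generated by the $l$-vectors $v_1 \wedge \cdots \wedge v_l$ with
$$v_j \in E_x^m \quad\text{for}\quad \sum_{i=1}^{m-1} l_i < j \le \sum_{i=1}^{m} l_i. \tag{4.45}$$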

Every $x \mapsto E_x^{l_1, \dots, l_k}$ is a $\Lambda^l F$-invariant sub-bundle. Moreover, these sub-bundles are in general position with respect to each other, and
$$\dim E_x^{l_1, \dots, l_k} = \binom{d_1}{l_1} \cdots \binom{d_k}{l_k}.$$
Then (Exercise 4.5), we have $\bigoplus_{l_1, \dots, l_k} E_x^{l_1, \dots, l_k} = \Lambda^l(\mathbb{R}^d)$. Take $v_1 \wedge \cdots \wedge v_l$ as in (4.45) and let $V_x$ be the subspace generated by $v_1, \dots, v_l$. Using (4.44),
$$\lim_{n \to \pm\infty} \frac{1}{n} \log \big\| \Lambda^l A^n(x) (v_1 \wedge \cdots \wedge v_l) \big\| = \lim_{n \to \pm\infty} \frac{1}{n} \log \big| \det(A^n(x) \mid V_x) \big|.$$
For each $m = 1, \dots, k$, take $V_x^m$ to be the subspace generated by the vectors $v_j$ with $\sum_{i=1}^{m-1} l_i < j \le \sum_{i=1}^{m} l_i$. Then $V_x = \bigoplus_{m=1}^{k} V_x^m$ and, using Theorem 4.2(c) and Exercise 4.7,
$$\lim_{n \to \pm\infty} \frac{1}{n} \log \big| \det(A^n(x) \mid V_x) \big| = \sum_{m=1}^{k} \lim_{n \to \pm\infty} \frac{1}{n} \log \big| \det(A^n(x) \mid V_x^m) \big|.$$
By (4.38) and Exercise 4.7,
$$\lim_{n \to \pm\infty} \frac{1}{n} \log \big| \det(A^n(x) \mid V_x^m) \big| = l_m \lambda_m(x).$$
Putting these three relations together,
$$\lim_{n \to \pm\infty} \frac{1}{n} \log \big\| \Lambda^l A^n(x) (v_1 \wedge \cdots \wedge v_l) \big\| = \sum_{m=1}^{k} l_m \lambda_m(x).$$
Since these decomposable $l$-vectors $v_1 \wedge \cdots \wedge v_l$ generate the space $E_x^{l_1, \dots, l_k}$, it follows that the latter is contained in some Oseledets subspace, with Lyapunov exponent $\sum_{m=1}^{k} l_m \lambda_m(x)$. Thus, the Lyapunov exponents of $\Lambda^l F$ are the numbers $\sum_{m=1}^{k} l_m \lambda_m(x)$, with $0 \le l_m \le d_m$ and $\sum_{m=1}^{k} l_m = l$, each with multiplicity equal to $\dim E_x^{l_1, \dots, l_k}$ (some of these numbers may coincide, in which case the multiplicities add up). This is a rephrasing of the conclusion of Proposition 4.17.
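The description of the exponents of the exterior power just obtained is easy to tabulate. The sketch below is an illustration with a hypothetical spectrum: listing the exponents of F with multiplicity and summing over all l-element subsets reproduces the numbers and multiplicities described above, since the number of subsets taking l_m copies of the m-th exponent is the product of the binomial coefficients.

```python
from itertools import combinations
from math import comb


def exterior_power_exponents(exponents, l):
    """Lyapunov exponents of the l-th exterior power, with multiplicity.

    `exponents` lists the exponents of F counted with multiplicity.
    """
    sums = [sum(c) for c in combinations(exponents, l)]
    return sorted(sums, reverse=True)


exps = [2.0, 0.5, 0.5, -3.0]      # hypothetical spectrum, d = 4
second = exterior_power_exponents(exps, 2)

# Differences of the top exponents of consecutive exterior powers.
tops = [exterior_power_exponents(exps, l)[0] for l in range(len(exps) + 1)]
recovered = [tops[l] - tops[l - 1] for l in range(1, len(exps) + 1)]
```

In this example `recovered` gives back the original list in decreasing order, which is the observation that the l-th largest exponent is the difference of the top exponents of two consecutive exterior powers.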

4.4 Two useful constructions

Here we present a couple of constructions concerning Lyapunov exponents that will be useful later. The reader may choose to skip this section at first reading, and come back to it when these ideas are called for.

4.4.1 Inducing and Lyapunov exponents

Let $Z \subset M$ be a positive $\mu$-measure subset and $g : Z \to Z$ be the first return map, defined by
$$g(x) = f^{r(x)}(x), \qquad r(x) = \inf\{n \ge 1 : f^n(x) \in Z\} \tag{4.46}$$
($r(x)$ is finite for $\mu$-almost every $x$, by Poincaré recurrence). Let $\nu$ be the normalized restriction of $\mu$ to $Z$; that is,
$$\nu(E) = \frac{\mu(E)}{\mu(Z)} \quad\text{for every measurable set } E \subset Z. \tag{4.47}$$
Let $G : Z \times \mathbb{R}^d \to Z \times \mathbb{R}^d$ be the linear cocycle over $g$ defined by the function
$$B : Z \to GL(d), \qquad B(x) = A^{r(x)}(x).$$
We are going to see that the Lyapunov exponents of $G$ at each point $x$ are obtained from the Lyapunov exponents of $F$ by multiplication by a constant $c(x) \ge 1$.

Proposition 4.18 Take the transformation $f : M \to M$ to be invertible.

(1) The probability measure $\nu$ is $g$-invariant and we have $\log^+ \|B^{\pm 1}\| \in L^1(\nu)$ whenever $\log^+ \|A^{\pm 1}\| \in L^1(\mu)$.
(2) The Oseledets decomposition of $G$ coincides with the restriction of the Oseledets decomposition of $F$.
(3) For $\nu$-almost every $x \in Z$ there exists $c(x) \ge 1$ such that the Lyapunov exponents satisfy $\lambda_j(G, x) = c(x) \lambda_j(F, x)$ for every $j$.

Proof For each $j \ge 1$, let $Z_j$ be the subset of points $x \in Z$ such that $r(x) = j$. Then $\{Z_j : j \ge 1\}$ and $\{f^j(Z_j) : j \ge 1\}$ are partitions of full measure subsets of $Z$. Note also that $g \mid Z_j = f^j \mid Z_j$ for all $j \ge 1$. For any measurable set $E \subset Z$ and any $j \ge 1$,
$$\mu\big( g^{-1}(E \cap f^j(Z_j)) \big) = \mu\big( f^{-j}(E \cap f^j(Z_j)) \big) = \mu\big( E \cap f^j(Z_j) \big),$$
because $\mu$ is invariant under $f$. It follows that
$$\mu\big( g^{-1}(E) \big) = \sum_{j=1}^{\infty} \mu\big( g^{-1}(E \cap f^j(Z_j)) \big) = \sum_{j=1}^{\infty} \mu\big( E \cap f^j(Z_j) \big) = \mu(E).$$

So, the normalized restriction $\nu$ is invariant under $g$. Next, from the definition $B(x) = A^{r(x)}(x)$, we get
$$\int_Z \log^+ \|B\| \, d\mu = \sum_{j=1}^{\infty} \int_{Z_j} \log^+ \|A^j\| \, d\mu \le \sum_{j=1}^{\infty} \sum_{i=0}^{j-1} \int_{Z_j} \log^+ \|A \circ f^i\| \, d\mu.$$
Since $\mu$ is invariant under $f$ and the domains $f^i(Z_j)$ are pairwise disjoint for all $0 \le i \le j - 1$, it follows that
$$\int_Z \log^+ \|B\| \, d\mu \le \sum_{j=1}^{\infty} \sum_{i=0}^{j-1} \int_{f^i(Z_j)} \log^+ \|A\| \, d\mu \le \int \log^+ \|A\| \, d\mu.$$
The corresponding bound for the norm of the inverse is obtained in the same way. This proves that $\log^+ \|B^{\pm 1}\|$ is $\nu$-integrable. It is clear that the restriction of the Oseledets decomposition of $F$ is invariant under $G$. Define
$$c(x) = \lim_{k \to \infty} \frac{1}{k} \sum_{j=0}^{k-1} r(g^j(x)).$$
Note that $r$ is integrable relative to $\nu$:
$$\int_Z r \, d\mu = \sum_{j=1}^{\infty} j \mu(Z_j) = \sum_{j=1}^{\infty} \sum_{i=0}^{j-1} \mu(f^i(Z_j)) \le 1.$$
Thus, by the ergodic theorem, $c(x)$ is well defined at $\nu$-almost every $x$. It is clear from the definition that $c(x) \ge 1$. Now, given any non-zero vector $v$ and a generic point $x \in Z$,
$$\lim_{k \to \pm\infty} \frac{1}{k} \log \|B^k(x)v\| = c(x) \lim_{n \to \pm\infty} \frac{1}{n} \log \|A^n(x)v\|.$$
(Theorem 4.2 ensures that the limit on the right-hand side exists.) This implies that the Oseledets decomposition of $G$ coincides with the restriction of the Oseledets decomposition of $F$, and that the Lyapunov exponents of the two cocycles are related by the factor $c(x)$, as claimed in the proposition.
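The factor c(x) can be estimated numerically as an average return time. The sketch below is an illustration under assumptions not taken from the text: f is an irrational circle rotation, Z = [0, 1/2), and the cocycle is the constant matrix diag(2, 1/2), so that the top exponent of F is log 2 and the top exponent of the induced cocycle is c(x) log 2.

```python
import math

ALPHA = (math.sqrt(5) - 1) / 2    # hypothetical irrational rotation number


def f(x):
    """Rotation of the circle R/Z."""
    return (x + ALPHA) % 1.0


def in_Z(x):
    """Z = [0, 1/2), a set of Lebesgue measure 1/2 (illustrative choice)."""
    return x < 0.5


def first_return(x):
    """Return (g(x), r(x)) for a point x in Z."""
    r, y = 1, f(x)
    while not in_Z(y):
        y, r = f(y), r + 1
    return y, r


def average_return_time(x, k):
    """(1/k) * sum of r(g^j(x)) for j < k, approximating c(x)."""
    total = 0
    for _ in range(k):
        x, r = first_return(x)
        total += r
    return total / k


c = average_return_time(0.1, 5000)
top_exponent_F = math.log(2.0)         # constant cocycle A = diag(2, 1/2)
top_exponent_G = c * top_exponent_F    # induced cocycle B = A^r
```

For an ergodic system, Kac's lemma gives that the average return time equals 1/μ(Z), so in this example c should be close to 2 and the induced exponents are roughly doubled.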

4.4.2 Invariant cones

The cone of radius $a > 0$ around a subspace $V$ of $\mathbb{R}^d$ is defined by
$$C(V, a) = \{v_1 + v_2 \in V \oplus V^\perp : \|v_2\| < a \|v_1\|\}. \tag{4.48}$$

Proposition 4.19 Assume there exist $\delta > 0$, $b > a > 0$, and an $F$-invariant measurable decomposition $\mathbb{R}^d = V_x \oplus W_x$, defined at $\mu$-almost every point, such that the dimensions of $V_x$ and $W_x$ are constant,
$$|\sin \angle(V_x, W_x)| \ge \delta \quad\text{and}\quad A(x)(C(V_x, b)) \subset C(V_{f(x)}, a)$$

for $\mu$-almost every $x \in M$. Then the Lyapunov exponents of $F$ along the sub-bundle $V_x$ are strictly larger than the Lyapunov exponents of $F$ along $W_x$.

Proof It is no restriction to suppose that the subspace $V_x$ is constant: pick any measurable orthonormal basis $\{v_1(x), \dots, v_d(x)\}$ of the space $\mathbb{R}^d$ such that $\{v_1(x), \dots, v_l(x)\}$ is a basis of $V_x$ and use it to identify $\mathbb{R}^d$ with itself; then $V_x$ is identified with $\mathbb{R}^l \times \{0\}$. So, from now on we suppose $V_x = V$ for every $x \in M$.

The cone $C(V, b)$ is the union of all the graphs of linear maps $u : V \to V^\perp$ such that $\|u\| < b$. In the space $\mathcal{H}(V, b)$ of such maps, consider the Hilbert metric (see Birkhoff [26] and [114, Section 12.3.1]), defined by the logarithm of the cross-ratio
$$d(\varphi, \psi) = \log \frac{\|\varphi - \psi_0\| \, \|\psi - \varphi_0\|}{\|\varphi - \varphi_0\| \, \|\psi - \psi_0\|}, \tag{4.49}$$
where $\varphi_0$ and $\psi_0$ are the points where the line through $\varphi$ and $\psi$ hits the boundary $\{u : \|u\| = b\}$, denoted in such a way that $\varphi_0$ is closer to $\varphi$ and $\psi_0$ is closer to $\psi$ (see Figure 4.1).

[Figure 4.1 Cross-ratio and projective metric]

The hypothesis implies that each $A(x)(\mathrm{graph}(\varphi))$, $\varphi \in \mathcal{H}(V, b)$, may be written as $\mathrm{graph}(\psi)$ for some $\psi \in \mathcal{H}(V, a) \subset \mathcal{H}(V, b)$. Since $\mathcal{H}(V, a)$ is relatively compact in $\mathcal{H}(V, b)$, it has finite diameter for the Hilbert metric. By a crucial property of the Hilbert metric (see [114, Proposition 12.3.6]), the action $\mathcal{H}(V, b) \to \mathcal{H}(V, b)$ of $A(x)$ thus defined is a $\theta$-contraction with respect to this metric, for some $\theta < 1$ that depends only on an upper bound for the diameter and, hence, can be expressed in terms of $a$ and $b$ alone. It follows (Exercise 4.11) that the width of the iterates $A^n(f^{-n}(x))(C(V, b))$ decays as $K e^{-\kappa n}$, for some $K > 0$ and $\kappa > 0$ that depend only on $a$ and $b$. Of course, the intersection of the iterates satisfies
$$\bigcap_{n=1}^{\infty} A^n(f^{-n}(x))(C(V, b)) = V. \tag{4.50}$$

We are going to deduce that any unit vector $v \in V$ is expanded at a definitely faster rate than any unit vector $w \in W_x$. Indeed, one can find $c_0 > 0$ depending only on $b$ and such that $v + c_0 w \in C(V, b)$. Then,
$$A^n(x)v + c_0 A^n(x)w \in C(V, K e^{-\kappa n}) \quad\text{for every } n \ge 1.$$
Since $A^n(x)w \in W_{f^n(x)}$ and $|\sin \angle(W_{f^n(x)}, V_{f^n(x)})| \ge \delta$, the previous relation implies that
$$\|A^n(x)w\| \le K_0 e^{-\kappa n} \|A^n(x)v\| \quad\text{for every } n \ge 1,$$
where $K_0 > 0$ depends only on $K$, $c_0$, and $\delta$, and so is completely determined by $a$, $b$, and $\delta$. It follows that
$$\lim_n \frac{1}{n} \log \|A^n(x)w\| \le \lim_n \frac{1}{n} \log \|A^n(x)v\| - \kappa$$
for every $v \in V$ and $w \in W_x$, and $\mu$-almost every $x$. This proves that any Lyapunov exponent of a vector in $W_x$ is smaller, by a definite amount $\kappa$, than any Lyapunov exponent of a vector in $V$.
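The cone contraction behind this proof can be observed in the plane. The sketch below assumes a hypothetical constant cocycle A on R^2 with invariant line V spanned by the first basis vector, and describes a vector (1, t) by its slope t, so that the cone C(V, b) corresponds to slopes in (-b, b).

```python
A = ((2.0, 0.1), (0.0, 0.5))   # hypothetical constant cocycle; V = span(e1) is invariant


def slope_after(t):
    """Slope v2/v1 of the image A(1, t)."""
    v1 = A[0][0] + A[0][1] * t
    v2 = A[1][0] + A[1][1] * t
    return v2 / v1


def cone_is_mapped_inside(b, a, samples=1001):
    """Check A(C(V, b)) is inside C(V, a) on a grid of slopes in (-b, b)."""
    ts = [-b + 2 * b * (i + 0.5) / samples for i in range(samples)]
    return all(abs(slope_after(t)) < a for t in ts)


def width_after(n, b):
    """Largest |slope| in A^n(C(V, b)), estimated from the two boundary slopes."""
    t_plus, t_minus = b, -b
    for _ in range(n):
        t_plus, t_minus = slope_after(t_plus), slope_after(t_minus)
    return max(abs(t_plus), abs(t_minus))
```

For this matrix the slope map is increasing on (-1, 1), so tracking the two boundary slopes is enough. The width shrinks by a factor of about 1/4 per iterate, matching the ratio of the contraction rate 1/2 along the complementary eigendirection to the expansion rate 2 along V.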

4.5 Notes

Theorems 4.1 and 4.2 are part of the main result in Oseledets [92]: the results of Oseledets are stated in the slightly broader context of linear cocycles on finite-dimensional vector bundles. Other proofs were given by Raghunathan [101], Ruelle [105] and Ledrappier [80], among others. While the early proofs relied mostly on linear algebra calculations, dynamical proofs were later found by Mañé [89] and Walters [117]. Our presentation is closest to [117] and to its extension to the invertible case by Bochi [28].

The assumption $\log^+ \|A^{\pm 1}\| \in L^1(\mu)$ is unnecessarily strong for Theorem 4.1: it suffices to assume that the function $\log^+ \|A\|$ is integrable; however, in this case the smallest Lyapunov exponent may be $-\infty$. The multiplicative ergodic theorem also extends to some infinite-dimensional cocycles with finitely many non-negative Lyapunov exponents: see Ruelle [106] and Mañé [87].

Proposition 4.17 shows that the subadditive ergodic theorem suffices to identify all the Lyapunov exponents, including their multiplicities, not just the extremal ones: for every $1 \le l \le d$, the $l$-th largest Lyapunov exponent (counted with multiplicity) coincides with $\lambda_+(\Lambda^l F) - \lambda_+(\Lambda^{l-1} F)$. This observation was part of the early proofs of the Oseledets theorem.

Most of what follows in this book revolves around the multiplicative ergodic theorem. Chapter 6 shows that the Lyapunov exponents may be expressed, in an explicit way, in terms of the invariant measures and the stationary measures of the projective cocycle $\mathbb{P}F$. The invariance principle in Chapter 7 is a general tool for analysing the special situation when all the Lyapunov exponents are equal (and to prove that this is seldom the case). In Chapter 8 we will investigate conditions under which all the Lyapunov exponents are distinct. Finally, in Chapters 9 and 10 we will investigate how Lyapunov exponents vary with the linear cocycle and with the invariant measure $\mu$.

The constructions in Sections 4.4.1 and 4.4.2 will be used in the proofs of Propositions 9.13 and 8.3, respectively.

Downloaded from University Publishing Online. This is copyrighted material http://dx.doi.org/10.1017/CBO9781139976602.005


4.6 Exercises

Exercise 4.1 Prove that a map $x \mapsto V_x$ with values in $\mathrm{Gr}(d)$ is measurable if and only if $M_l = \{x \in M : \dim V_x = l\}$ is a measurable set for every $0 \le l \le d$ and, for each $l$, there exist measurable vector fields $v_i : M_l \to \mathbb{R}^d$, $i = 1, \dots, l$ such that $\{v_1(x), \dots, v_l(x)\}$ is a basis of $V_x$ for every $x \in M_l$.

Exercise 4.2 Prove that (4.5) defines a complete separable norm in $\mathcal{F}$:
(1) Check that $\|\cdot\|_1$ is indeed a norm in $\mathcal{F}$.
(2) Show that any Cauchy sequence in $\mathcal{F}$ admits some subsequence $(\Psi_n)_n$ such that $\|\Psi_n - \Psi_{n+1}\|_1 \le 2^{-2n}$ for every $n$. Deduce that there exists $N \subset M$ with $\mu(N) = 1$ such that $\Psi(x, v) = \lim_n \Psi_n(x, v)$ exists for every $(x, v) \in N \times P$. Argue that $\Psi \in \mathcal{F}$ and use bounded convergence to get that $\|\Psi_n - \Psi\|_1 \to 0$.
(3) Use the fact that $C^0(P)$ is separable, and the assumption that the probability space $(M, \mathcal{B}, \mu)$ is separable, to conclude that $(\mathcal{F}, \|\cdot\|_1)$ is separable (compare Theorem 2 in [120, page 137]).

Exercise 4.3 Show that two probability measures $\eta, \zeta \in \mathcal{M}(\mu)$ coincide if and only if $\int \Psi \, d\eta = \int \Psi \, d\zeta$ for every $\Psi \in \mathcal{F}$.

Exercise 4.4 Prove that the weak$^*$ topology on $\mathcal{M}(\mu)$ is metrizable and compact:
(1) Fix a countable dense subset $\{\Psi_k : k \ge 1\}$ of the unit ball of $\mathcal{F}$ and define
\[ d(\xi, \eta) = \sum_{k=1}^{\infty} 2^{-k} \left| \int \Psi_k \, d\xi - \int \Psi_k \, d\eta \right| \quad \text{for any } \xi, \eta \in \mathcal{M}(\mu). \]
Show that $d$ is a distance compatible with the weak$^*$ topology.
(2) Given any sequence in $\mathcal{M}(\mu)$, use a diagonal argument to find a subsequence $(\eta_n)_n$ such that $\lim_n \int \Psi_k \, d\eta_n$ exists for every $k$. Conclude that there exists a unique bounded linear operator $g : \mathcal{F} \to \mathbb{R}$ such that
\[ g(\Psi_k) = \lim_n \int \Psi_k \, d\eta_n \quad \text{for every } k. \]
Then $\int \Psi \, d\eta = g(\Psi)$ defines a probability measure $\eta$ on $M \times P$ that projects down to $\mu$. Moreover, $(\eta_n)_n$ converges to $\eta$ in the weak$^*$ topology.

Exercise 4.5 Check that if $d_1 + \cdots + d_k = d$ then
\[ \binom{d}{l} = \sum_{l_1, \dots, l_k} \binom{d_1}{l_1} \cdots \binom{d_k}{l_k}, \]
where the sum is over all $0 \le l_i \le d_i$ with $\sum_{i=1}^{k} l_i = l$.
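The identity in Exercise 4.5 is a generalized Vandermonde identity, and it can be sanity-checked numerically before attempting a proof. The following is a minimal sketch in Python; the function name and the sample dimensions are illustrative choices, not notation from the text, and a finite check is of course not a proof.

```python
from itertools import product
from math import comb, prod

def block_binomial_sum(dims, l):
    """Sum of prod_i C(d_i, l_i) over all 0 <= l_i <= d_i with l_1 + ... + l_k = l."""
    total = 0
    for ls in product(*(range(d + 1) for d in dims)):
        if sum(ls) == l:
            total += prod(comb(d, li) for d, li in zip(dims, ls))
    return total

# Hypothetical block dimensions d_1, d_2, d_3 with d = 6.
dims = (2, 3, 1)
d = sum(dims)
for l in range(d + 1):
    # Compare with the left-hand side C(d, l) of the identity.
    assert block_binomial_sum(dims, l) == comb(d, l)
```

Both sides count the $l$-element subsets of a set of size $d$ that has been split into blocks of sizes $d_1, \dots, d_k$, which is also the idea behind a combinatorial proof.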


Exercise 4.6 Show that if $\{v_1, \dots, v_k\}$ and $\{w_1, \dots, w_k\}$ are orthonormal bases for the same subspace $V$ then the matrix $(v_i \cdot w_j)_{i,j}$ is orthogonal.

Exercise 4.7 Show that if $L : \mathbb{R}^d \to \mathbb{R}^d$ is a linear isomorphism and $V_1, V_2$ are linear subspaces of $\mathbb{R}^d$ then
(1)
\[ \frac{1}{\|L\| \, \|L^{-1}\|} \le \frac{|\sin \angle(L(V_1), L(V_2))|}{|\sin \angle(V_1, V_2)|} \le \|L\| \, \|L^{-1}\|; \]
(2) $\|(L \mid V_1)^{-1}\|^{-d_1} \le |\det(L \mid V_1)| \le \|L \mid V_1\|^{d_1}$, where $d_1 = \dim V_1$;
(3) $|\sin \angle(V_1, V_2)| = \|\pi^{-1}\|^{-1}$, where $\pi : V_1 \to V_2^{\perp}$ is the orthogonal projection;
(4)
\[ |\sin \angle(L(V_1), L(V_2))|^{d_1} \le \frac{|\det(L \mid V_1 \oplus V_2)|}{|\det(L \mid V_1)| \, |\det(L \mid V_2)|} \le |\sin \angle(V_1, V_2)|^{-d_1}. \]

Exercise 4.8 Prove that if $\mu$ is ergodic for $f$ then $\nu$ is ergodic for the induced map $g$, and $c(x) = 1/\mu(Z)$ for $\nu$-almost every $x$.

Exercise 4.9 Extend Proposition 4.18 and Exercise 4.8 to the non-invertible case, using the concept of natural extension (see [114, Section 2.4.2]).

Exercise 4.10 Let $V, W$ be subspaces of $\mathbb{R}^d$ with $\dim V = \dim W$. Prove that:
(1) given $a > 0$ there is $b > 0$ such that $C(W, b) \subset C(V, 2a)$ if $W \subset C(V, a)$;
(2) if $b < 1/10$ then $W \subset C(V, b)$ implies $C(V, b) \subset C(W, 3b)$.

Exercise 4.11 Let $\rho > 0$ be small. Show that if $H \subset \mathcal{H}(V, b)$ is contained in the $\rho$-neighborhood of $\varphi \equiv 0$, relative to the Hilbert metric $d$, then the union $C$ of all $\operatorname{graph}(\varphi)$, $\varphi \in H$ is contained in $C(V, \rho/b)$. More generally, relate the $d$-diameter of a subset $H$ of $\mathcal{H}(V, b)$ to the width of the corresponding cone $C$.

Exercise 4.12 Check that one may replace the assumption $|\sin \angle(V_x, W_x)| \ge \delta$ in Proposition 4.19 by a condition of subexponential decay of the angles:
\[ \lim_n \frac{1}{n} \log |\sin \angle(A^n(x) V_x, A^n(x) W_x)| = 0. \]

Exercise 4.13 Let $B \in \mathrm{GL}(d)$ be such that $B(C(V, b)) \subset C(V, a)$ for some subspace $V \subset \mathbb{R}^d$ and some $0 < a < b$. Prove that:
(1) $B$ admits a dominated decomposition $\mathbb{R}^d = U \oplus S$ with $\dim U = \dim V$; that is, $B(U) = U$, $B(S) = S$, and every eigenvalue of $B$ restricted to $U$ is larger, in norm, than every eigenvalue of $B$ restricted to $S$;
(2) there is $r > 0$ such that $\tilde{B} = B_1 \cdots B_\ell$ admits a dominated decomposition $\mathbb{R}^d = U_{\tilde{B}} \oplus S_{\tilde{B}}$, for any $\ell \ge 1$ and any $B_1, \dots, B_\ell$ in the $r$-neighborhood of $B$; moreover, $U_{\tilde{B}}$ is uniformly close to $U$ if $r$ is small.


5 Stationary measures

Further development of the theory requires that we exploit in more depth the connection between the Lyapunov exponents of a linear cocycle and the invariant measures of the corresponding projective cocycle, of which we had a brief glimpse in the proof of the multiplicative ergodic theorem. In this chapter we introduce a general formalism that will be very useful towards that end. Linearity is not relevant at this stage, so we formulate the results for a class of systems more general than linear and projective cocycles, that we call random transformations. The definition and fundamental properties of such systems are discussed in Section 5.1. In Section 5.2 we introduce the key notion of stationary measure for a random transformation. We will see in the next chapter that the measures stationary under the projective cocycle completely determine the Lyapunov exponents of the linear cocycle. The properties of stationary measures of general (possibly non-invertible) random transformations are studied in Sections 5.2 and 5.3. The invertible case is treated in more detail in Section 5.4 and leads to the important concepts of u-state and s-state, which are invariant probability measures whose disintegrations along the fibers have special invariance properties. These disintegrations are revisited, from a different angle, in Section 5.5.

5.1 Random transformations

Denote by $M$ the space of sequences $X^{\mathbb{N}}$ (or $X^{\mathbb{Z}}$) in some probability space $(X, \mathcal{X}, p)$, endowed with the product $\sigma$-algebra $\mathcal{A} = \mathcal{X}^{\mathbb{N}}$ (or $\mathcal{X}^{\mathbb{Z}}$) and the product measure $\mu = p^{\mathbb{N}}$ (or $\mu = p^{\mathbb{Z}}$). Moreover, let $f : M \to M$ be the shift map on $M$. Let $(N, \mathcal{B})$ be a measurable space and take $M \times N$ to be endowed with the product $\sigma$-algebra $\mathcal{A} \otimes \mathcal{B}$. Given a set $E \subset M \times N$ and points $x \in M$ and $v \in N$, we will consider the slices defined by
\[ E_v = \{x \in M : (x, v) \in E\} \quad \text{and} \quad E^x = \{v \in N : (x, v) \in E\}. \]
If $E \subset M \times N$ is measurable then so are $E_v \subset M$ and $E^x \subset N$ (Exercise 5.1).

A random transformation (or locally constant skew product) over $f$ is a measurable transformation of the form
\[ F : M \times N \to M \times N, \quad F(x, v) = (f(x), F_x(v)), \]
where $F_x$ depends only on the zeroth coordinate of $x \in M$.

Example 5.1 (Random matrices) Let $X = \{A_1, \dots, A_m\} \subset \mathrm{GL}(d)$. Consider the probability measure defined on $X$ by $p = p_1 \delta_{A_1} + \cdots + p_m \delta_{A_m}$, with $p_1, \dots, p_m > 0$ and $p_1 + \cdots + p_m = 1$. Let $N = \mathbb{R}^d$ and $F_x = x_0$ for every $x$ in $M = X^{\mathbb{Z}}$. Two random transformations associated with this data are relevant for our purposes: the locally constant linear cocycle
\[ F : M \times \mathbb{R}^d \to M \times \mathbb{R}^d, \quad \big((\alpha_n)_n, v\big) \mapsto \big((\alpha_{n+1})_n, \alpha_0(v)\big), \]
and the corresponding projective cocycle
\[ \mathbb{P}F : M \times \mathbb{P}\mathbb{R}^d \to M \times \mathbb{P}\mathbb{R}^d, \quad \big((\alpha_n)_n, [v]\big) \mapsto \big((\alpha_{n+1})_n, [\alpha_0(v)]\big). \]

Example 5.2 Let $F_1, F_2 : S^1 \to S^1$ be homeomorphisms of the circle $N = S^1$. Consider $X = \{1, 2\}$ and $M = X^{\mathbb{N}}$. Let $f : M \to M$ be the shift map and $\mu = p^{\mathbb{N}}$ with $p = p_1 \delta_1 + p_2 \delta_2$. Let $F : M \times N \to M \times N$, $F(x, v) = (f(x), F_{x_0}(v))$. Then
\[ F^n(x, v) = \big(f^n(x), F_{i(n-1)} \cdots F_{i(0)}(v)\big), \]
where the $F_{i(j)} \in \{F_1, F_2\}$ are independent random variables with identical distribution $p$.

Let $F : M \times N \to M \times N$ be a random transformation. The transition probabilities associated with $F$ are defined by
\[ p(v, B) = \mu\big(\{x \in M : F_x(v) \in B\}\big) = \mu\big(F^{-1}(M \times B)_v\big), \]
for each $v \in N$ and each measurable set $B \subset N$. The set on the right-hand side is measurable (Exercise 5.1) and so $p(v, B)$ is well defined. It is clear that $p(v, B)$ is countably additive on the second variable, and so each $p(v, \cdot)$ is a probability measure on $N$. Moreover (Exercise 5.1), the function $N \to \mathbb{R}$, $v \mapsto p(v, B)$ is measurable, for any measurable $B \subset N$.

The transition operator $\mathcal{P}$ associated with the random transformation $F$ is the linear map $\mathcal{P}$ acting on the space of bounded measurable functions $\varphi : N \to \mathbb{R}$ by
\[ \mathcal{P}\varphi : N \to \mathbb{R}, \quad \mathcal{P}\varphi(v) = \int \varphi(F_x(v)) \, d\mu(x). \tag{5.1} \]


Note that $x \mapsto \varphi(F_x(v))$ is measurable (Exercise 5.2), and so $\mathcal{P}\varphi(v)$ is well defined. It is clear that the function $\mathcal{P}\varphi$ defined by (5.1) is bounded. Moreover, $\mathcal{P}\varphi$ is measurable. To see this, begin by supposing that $\varphi$ is the characteristic function of some set $B \subset N$. Then
\[ \mathcal{P}\varphi(v) = \mu\big(\{x : F_x(v) \in B\}\big) = p(v, B) \tag{5.2} \]
is a measurable function of $v$, as observed in the previous paragraph. By the linearity of the transition operator, it follows that $\mathcal{P}\varphi$ is measurable whenever $\varphi$ is a simple function; that is, a linear combination of characteristic functions. In general, there exists a sequence $(\varphi_n)_n$ of simple functions converging uniformly to $\varphi$. Thus, $\mathcal{P}\varphi$ is the uniform limit of measurable functions $\mathcal{P}\varphi_n$, $n \ge 1$, and so it is measurable.

The adjoint transition operator $\mathcal{P}^*$ associated with the random transformation $F$ acts on the space of probability measures $\eta$ on $N$ by
\[ \mathcal{P}^*\eta(B) = \int (F_x)_*\eta(B) \, d\mu(x) = \int \eta\big(F_x^{-1}(B)\big) \, d\mu(x), \tag{5.3} \]
for any measurable set $B \subset N$. Observe that
\[ \eta\big(F_x^{-1}(B)\big) = \eta\big(F^{-1}(M \times B)^x\big) \tag{5.4} \]
is a measurable function of $x$, by Exercise 5.1. Thus, the integral in (5.3) is well defined. Moreover, it is clear from the definition (5.3) that $\mathcal{P}^*\eta$ is countably additive and satisfies $\mathcal{P}^*\eta(N) = 1$. Thus, $\mathcal{P}^*\eta$ is a probability measure on $N$.

We will take the space $\mathcal{T}(N)$ of measurable transformations $N \to N$ to be endowed with some $\sigma$-algebra relative to which the map $\mathcal{F} : M \to \mathcal{T}(N)$, $\mathcal{F}(x) = F_x$ is measurable. For instance, this could be the push-forward of the $\sigma$-algebra $\mathcal{A}$ under the map $\mathcal{F}$. Then we denote by $\nu$ the push-forward $\mathcal{F}_*\mu$ of the probability measure $\mu$ under $\mathcal{F}$. The definitions (5.1) and (5.3) translate to
\[ \mathcal{P}\varphi(v) = \int \varphi(g(v)) \, d\nu(g) \quad \text{and} \quad \mathcal{P}^*\eta(B) = \int \eta(g^{-1}(B)) \, d\nu(g). \tag{5.5} \]

So, the transition operators are completely characterized by the probability measure $\nu$. We have the following duality relation:

Lemma 5.3 Let $\varphi : N \to \mathbb{R}$ be a bounded measurable function. Then
\[ \int \varphi \, d(\mathcal{P}^*\eta) = \int (\mathcal{P}\varphi) \, d\eta. \]

Proof Suppose first that $\varphi$ is the characteristic function of some measurable set $B \subset N$. Then, using (5.4),
\[ \int \varphi \, d(\mathcal{P}^*\eta) = \mathcal{P}^*\eta(B) = \int \eta\big(F^{-1}(M \times B)^x\big) \, d\mu(x) = (\mu \times \eta)\big(F^{-1}(M \times B)\big) \]
and, using (5.2),
\[ \int (\mathcal{P}\varphi) \, d\eta = \int \mu\big(\{x : F_x(v) \in B\}\big) \, d\eta(v) = (\mu \times \eta)\big(F^{-1}(M \times B)\big). \]
This proves the equality in this case. By linearity, we get that the equality holds for every simple function $\varphi$. The general case follows, because every bounded measurable function is a uniform limit of simple functions.
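When $N$ is a finite set and $p$ has finite support, the integrals in (5.1) and (5.3) reduce to finite sums, and the duality of Lemma 5.3 can be verified by direct computation. The sketch below does this in Python with exact rational arithmetic. The alphabet, the weights, the maps and the names `transition`, `adjoint` and `integrate` are all hypothetical choices made for illustration; they do not come from the text.

```python
from fractions import Fraction as Fr

# Hypothetical data: N = {0, 1, 2}, alphabet X = {"a", "b"} with law p,
# and one map F_a : N -> N per symbol, stored as the tuple (F_a(0), F_a(1), F_a(2)).
p = {"a": Fr(1, 3), "b": Fr(2, 3)}
maps = {"a": (1, 2, 0), "b": (0, 0, 2)}
N = range(3)

def transition(phi):
    """(P phi)(v) = sum_a p(a) * phi(F_a(v)): the finite form of (5.1)."""
    return [sum(p[a] * phi[maps[a][v]] for a in p) for v in N]

def adjoint(eta):
    """(P* eta)({w}) = sum_a p(a) * eta(F_a^{-1}({w})): the finite form of (5.3)."""
    out = [Fr(0)] * len(N)
    for a in p:
        for v in N:
            out[maps[a][v]] += p[a] * eta[v]
    return out

def integrate(phi, eta):
    """Integral of the function phi with respect to the measure eta on N."""
    return sum(phi[v] * eta[v] for v in N)

phi = [Fr(5), Fr(-1), Fr(2)]              # an arbitrary bounded function on N
eta = [Fr(1, 2), Fr(1, 4), Fr(1, 4)]      # an arbitrary probability measure on N

assert sum(adjoint(eta)) == 1                                        # P* eta is a probability
assert integrate(phi, adjoint(eta)) == integrate(transition(phi), eta)  # Lemma 5.3
```

In this finite setting the transition operator is a row-stochastic matrix acting on functions, and the adjoint operator is its transpose acting on measures, so the duality is the usual pairing identity for a matrix and its transpose.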

5.2 Stationary measures

A probability measure $\eta$ on $N$ is called stationary for the random transformation $F$ if $\mathcal{P}^*\eta = \eta$; that is, if
\[ \eta(B) = \int \eta\big(F_x^{-1}(B)\big) \, d\mu(x) = \int \eta\big(g^{-1}(B)\big) \, d\nu(g) \tag{5.6} \]
for every measurable set $B \subset N$. The probability measure $\nu = \mathcal{F}_*\mu$ was introduced in Section 5.1. The expression on the right-hand side of (5.6) shows that $\nu$ determines the set of stationary measures entirely.

Recall that a measure $\eta$ is said to be invariant under a transformation $g$ if $\eta(g^{-1}(B)) = \eta(B)$ for every measurable set $B$. The definition (5.6) means that $\eta$ is stationary if such an equality holds on average over all $F_x$, the average being with respect to $\mu$. Thus, clearly, any probability measure that is invariant under $F_x$ for $\mu$-almost every $x$ is also stationary. The following example (see also Exercise 6.8) illustrates the fact that the converse is far from being true.

Example 5.4 Let $f : M \to M$ be the shift map on $M = \{1, 2\}^{\mathbb{Z}}$ and $\mu$ be a Bernoulli measure supported on the whole $M = \{1, 2\}^{\mathbb{Z}}$. Let $A : M \to \mathrm{SL}(2)$ be a locally constant function,
\[ A \mid [0; 1] \equiv A_1 \quad \text{and} \quad A \mid [0; 2] \equiv A_2, \]
where $A_1$ and $A_2$ are hyperbolic matrices with no common eigenspace. Let $F : M \times \mathbb{P}\mathbb{R}^2 \to M \times \mathbb{P}\mathbb{R}^2$ be the projective cocycle defined by $A$ over $f$. For each $i = 1, 2$, the probability measures on $\mathbb{P}\mathbb{R}^2$ invariant under the action of $A_i$ are the convex combinations of the Dirac masses on the corresponding eigenspaces. Thus, the hypotheses imply that $A_1$ and $A_2$ have no common invariant probability measure. On the other hand, Proposition 5.6 below ensures that $F$ does have some stationary measure.

The following characterization of stationary measures is specific to one-sided random transformations; that is, such that $M = X^{\mathbb{N}}$ and $f : M \to M$ is the one-sided shift. The two-sided situation will be analyzed later.


Proposition 5.5 Let $F : M \times N \to M \times N$ be a one-sided random transformation. A probability measure $\eta$ on $N$ is stationary for $F$ if and only if the probability measure $\mu \times \eta$ on $M \times N$ is $F$-invariant.

Proof First, we prove the ‘if’ claim. Suppose that $\mu \times \eta$ is invariant under $F$. Given any bounded measurable function $\varphi : N \to \mathbb{R}$, define $\psi : M \times N \to \mathbb{R}$ by $\psi(x, v) = \varphi(v)$. Then, using Lemma 5.3,
\[
\begin{aligned}
\int \varphi(v) \, d(\mathcal{P}^*\eta)(v) = \int \mathcal{P}\varphi(v) \, d\eta(v)
&= \iint \psi(x, F_x(v)) \, d\mu(x) \, d\eta(v) \\
&= \iint \psi(f(x), F_x(v)) \, d\mu(x) \, d\eta(v) \\
&= \iint \psi(x, v) \, d\mu(x) \, d\eta(v) = \int \varphi(v) \, d\eta(v).
\end{aligned}
\]
Since $\varphi$ is arbitrary, this shows that $\mathcal{P}^*\eta = \eta$, as we wanted to prove.

Now we prove the ‘only if’ claim. Let $\eta$ be a stationary measure on $N$. Given any bounded measurable function $\psi : M \times N \to \mathbb{R}$, consider the function $\varphi : N \to \mathbb{R}$ defined by $\varphi(v) = \int \psi(x, v) \, d\mu(x)$. Then
\[
\begin{aligned}
\iint \psi(x, v) \, d\mu(x) \, d\eta(v) = \int \varphi(v) \, d\eta(v)
&= \int \mathcal{P}\varphi(v) \, d\eta(v) \\
&= \iint \varphi(F_x(v)) \, d\mu(x) \, d\eta(v) \\
&= \iiint \psi(y, F_x(v)) \, d\mu(y) \, d\mu(x) \, d\eta(v).
\end{aligned}
\]
Write $x = (x_0, x_1, \dots)$ and $y = (y_0, y_1, \dots)$. Since $F_x$ depends only on $x_0$, the triple integral may be written as
\[ \iiint \psi(y, F_{x_0}(v)) \, dp^{\mathbb{N}}(y) \, dp(x_0) \, d\eta(v). \]
Let us write $z = (x_0, y_0, y_1, \dots)$. Then $f(z) = y$, and so the previous integral becomes
\[ \iint \psi(f(z), F_{z_0}(v)) \, dp^{\mathbb{N}}(z) \, d\eta(v) = \iint \psi(f(z), F_z(v)) \, d\mu(z) \, d\eta(v). \]
This proves that
\[ \iint \psi \, d\mu \, d\eta = \iint (\psi \circ F) \, d\mu \, d\eta. \]
Since $\psi$ is arbitrary, this shows that $\mu \times \eta$ is invariant under $F$.

There is more than one useful topology in the space of probability measures on a measurable space $N$. The uniform topology is defined by the total variation norm
\[ \|\xi - \eta\| = \sup_{|\psi| \le 1} \left| \int \psi \, d\xi - \int \psi \, d\eta \right|, \]
where the supremum is over all measurable functions $\psi : N \to \mathbb{R}$ with $|\psi| \le 1$. The pointwise topology is the smallest topology such that
\[ \xi \mapsto \xi(E) \text{ is continuous for every measurable } E \subset N. \]
It follows that $\xi \mapsto \int \psi \, d\xi$ is continuous for every bounded measurable function $\psi : N \to \mathbb{R}$. When $N$ is a metric space, we may also consider the weak$^*$ topology, which is the smallest topology such that
\[ \xi \mapsto \int \varphi \, d\xi \text{ is continuous for each bounded uniformly continuous function } \varphi. \]
It is clear from the definitions that uniform convergence implies pointwise convergence which, in turn, implies weak$^*$ convergence. Both converses are false, in general. The weak$^*$ topology is especially useful, not least because it is compact if $N$ is compact (see [114, Theorem 2.1.5]).

Proposition 5.6 Assume that $N$ is a compact metric space and $F_x : N \to N$ is continuous for every $x \in M$. Then there is some stationary measure for $F$.

Proof The first step is

Lemma 5.7 If $\varphi : N \to \mathbb{R}$ is continuous then $\mathcal{P}\varphi : N \to \mathbb{R}$ is also continuous.

Proof Given $n \in \mathbb{N}$, $v \in N$, and $\delta > 0$, define
\[ B(v, n, \delta) = \Big\{ x \in M : F_x\Big(B\Big(v, \frac{1}{n}\Big)\Big) \subset B(F_x(v), \delta) \Big\}. \]
Fix $v \in N$ and $\varepsilon > 0$. Then fix $\delta > 0$ such that
\[ d(z, y) < \delta \ \Rightarrow \ |\varphi(z) - \varphi(y)| < \frac{\varepsilon}{2}. \]
The fact that every $F_x$ is continuous ensures that $\mu(B(v, n, \delta)^c) \to 0$ as $n \to \infty$. Fix $n$ large enough so that

μ (B(v, n, δ )c )
0. By definition, for any v ∈ N,

\[
\begin{aligned}
\mathcal{P}_k\varphi(v) - \mathcal{P}\varphi(v)
&= \int \varphi(F_{k,x}(v)) \, dp_k(x) - \int \varphi(F_x(v)) \, dp_k(x) \\
&\quad + \int \varphi(F_x(v)) \, dp_k(x) - \int \varphi(F_x(v)) \, dp(x).
\end{aligned}
\tag{5.7}
\]
Since $\varphi$ is (uniformly) continuous and $F_k$ is uniformly close to $F$,
\[ |\varphi(F_{k,x}(v)) - \varphi(F_x(v))| < \varepsilon \quad \text{for every } (x, v) \in M \times N \]
if $k$ is large enough. Then the first difference on the right-hand side of (5.7) is smaller than $\varepsilon$ in absolute value. The last difference in (5.7) is also smaller than $\varepsilon$ in absolute value if $k$ is large, because $(p_k)_k$ converges to $p$ and the function $x \mapsto \varphi(F_x(v))$ is bounded and measurable (respectively, continuous). This proves the lemma.

Proceeding with the proof of Proposition 5.9, consider any sequence $(\eta_k)_k$ converging to some $\eta$. For any continuous function $\varphi : N \to \mathbb{R}$,
\[
\begin{aligned}
\int \varphi \, d(\mathcal{P}_k^*\eta_k) - \int \varphi \, d(\mathcal{P}^*\eta)
&= \int (\mathcal{P}_k\varphi) \, d\eta_k - \int (\mathcal{P}\varphi) \, d\eta_k \\
&\quad + \int \varphi \, d(\mathcal{P}^*\eta_k) - \int \varphi \, d(\mathcal{P}^*\eta).
\end{aligned}
\]
By Lemma 5.10, the first difference on the right-hand side goes to zero when $k \to \infty$. The second difference goes to zero as well because, by Lemma 5.8, $\mathcal{P}^*\eta_k$ converges to $\mathcal{P}^*\eta$ in the weak$^*$ topology. Since $\varphi$ is arbitrary, this proves that $\mathcal{P}_k^*\eta_k$ converges to $\mathcal{P}^*\eta$ in the weak$^*$ topology, as $k \to \infty$. When each $\eta_k$ is stationary for $F_k$ this means that $\eta_k$ converges to $\mathcal{P}^*\eta$. Then $\mathcal{P}^*\eta = \eta$, as we wanted to prove.
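The remark earlier in this section, that a stationary measure need not be invariant under the individual maps $F_x$, already shows up in a two-point toy model. The following Python sketch is a hypothetical finite analogue and not Example 5.4 itself: two constant maps on $N = \{0, 1\}$ share no invariant probability measure, yet the averaged equation $\mathcal{P}^*\eta = \eta$ has a solution. The data and the names `push` and `adjoint` are assumptions made for illustration.

```python
from fractions import Fraction as Fr

# Hypothetical data: X = {"a", "b"} with law p; F_a collapses N = {0, 1}
# to the point 0 and F_b collapses it to the point 1.
p = {"a": Fr(1, 3), "b": Fr(2, 3)}
maps = {"a": (0, 0), "b": (1, 1)}

def push(eta, F):
    """Push-forward F_* eta of a measure eta on N = {0, 1} under the map F."""
    out = [Fr(0), Fr(0)]
    for v, mass in enumerate(eta):
        out[F[v]] += mass
    return out

def adjoint(eta):
    """P* eta = sum_a p(a) * (F_a)_* eta: the finite form of (5.3)."""
    pushed = {a: push(eta, maps[a]) for a in p}
    return [sum(p[a] * pushed[a][w] for a in p) for w in range(2)]

eta = [Fr(1, 3), Fr(2, 3)]

assert adjoint(eta) == eta               # eta is stationary
assert push(eta, maps["a"]) != eta       # but eta is not invariant under F_a
assert push(eta, maps["b"]) != eta       # nor under F_b
```

Here the only $F_a$-invariant probability measure is the Dirac mass at 0 and the only $F_b$-invariant one is the Dirac mass at 1, so stationarity holds only on average over the two maps.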

5.3 Ergodic stationary measures

A bounded measurable function $\varphi : N \to \mathbb{R}$ is stationary if it satisfies $\mathcal{P}\varphi = \varphi$. A set $B \subset N$ is stationary if its characteristic function $\mathcal{X}_B$ is stationary. Let $\eta$ be a stationary measure. A bounded measurable function $\varphi : N \to \mathbb{R}$ is $\eta$-stationary if $\mathcal{P}\varphi(v) = \varphi(v)$ for $\eta$-almost every $v \in N$. A set $B \subset N$ is $\eta$-stationary if $\mathcal{X}_B$ is $\eta$-stationary.

Proposition 5.11 Let $\eta$ be a stationary measure. The following conditions are equivalent:
(a) every $\eta$-stationary function is constant on some set with full $\eta$-measure;
(b) if $B \subset N$ is an $\eta$-stationary set then $\eta(B)$ is either 0 or 1.

Proof Property (b) is a special case of property (a). To prove that (b) also implies (a), let $\varphi : N \to \mathbb{R}$ be $\eta$-stationary. We claim that $B(c) = \{v \in N : \varphi(v) > c\}$ is an $\eta$-stationary set for every $c \in \mathbb{R}$. Consequently, $\eta(B(c))$ is either 0 or 1, for every $c \in \mathbb{R}$. Let $\bar{c}$ be the supremum of the set of values of $c$ such that $\eta(B(c)) = 1$. Then $\varphi(v) = \bar{c}$ for $\eta$-almost every $v \in B(\bar{c})$. We are left to prove the claim.

Lemma 5.12 If $\eta$ is a stationary measure and $\psi_1, \psi_2 : N \to \mathbb{R}$ are $\eta$-stationary then so are the functions $\max\{\psi_1, \psi_2\}$ and $\min\{\psi_1, \psi_2\}$.

Proof Begin by noting that if $\psi$ is $\eta$-stationary then $|\psi| = |\mathcal{P}\psi| \le \mathcal{P}|\psi|$ at $\eta$-almost every point and $\int (\mathcal{P}|\psi| - |\psi|) \, d\eta = 0$, because $\eta$ is stationary. These two facts together imply that $|\psi|$ is $\eta$-stationary. It is clear that the sum and the difference of $\eta$-stationary functions are also $\eta$-stationary. Then, for any $\psi_1$ and $\psi_2$ as in the statement,
\[ \max\{\psi_1, \psi_2\} = \frac{\psi_1 + \psi_2}{2} + \frac{|\psi_1 - \psi_2|}{2} \quad \text{and} \quad \min\{\psi_1, \psi_2\} = \frac{\psi_1 + \psi_2}{2} - \frac{|\psi_1 - \psi_2|}{2} \]
are $\eta$-stationary. This proves the lemma.

It follows from Lemma 5.12 that $\varphi_n = \min\{1, n \max\{\varphi - c, 0\}\}$ is $\eta$-stationary for every $n \ge 1$. Now, $\varphi_n$ converges monotonically to the characteristic function of $B(c)$. So (Exercise 5.7), the characteristic function is $\eta$-stationary, as claimed. This finishes the proof of the proposition.

A stationary measure $\eta$ is ergodic if it satisfies conditions (a) and (b) in Proposition 5.11. It follows from Exercise 5.6 that the definition does not change if one considers, instead, sets or functions that are actually stationary (not just $\eta$-stationary). In the one-sided case, we have the following complement to Proposition 5.5:

Proposition 5.13 Let $F : M \times N \to M \times N$ be a one-sided random transformation and $\eta$ be a stationary measure. Then $\eta$ is ergodic if and only if the $F$-invariant probability measure $m = \mu \times \eta$ is ergodic.

Proof Suppose that $m = \mu \times \eta$ is ergodic and let $B \subset N$ be a stationary set. The latter means that
\[ \mathcal{X}_B(v) = \mathcal{P}\mathcal{X}_B(v) = \int \mathcal{X}_B(F_x(v)) \, d\mu(x) \]
for every $v \in N$. This may be read as follows: $v \in B$ implies $F_x(v) \in B$ for $\mu$-almost every $x$ and $v \notin B$ implies $F_x(v) \notin B$ for $\mu$-almost every $x$. Then the set $M \times B$ is $(F, m)$-invariant, in the sense that the symmetric difference $(M \times B) \, \Delta \, F^{-1}(M \times B)$ has zero $m$-measure. By ergodicity, it follows that $\eta(B) = m(M \times B)$ is either 0 or 1. This proves the “if” part of the statement.

To prove the “only if” part, let $\psi : M \times N \to \mathbb{R}$ be any bounded measurable $F$-invariant function; that is, such that $\psi \circ F = \psi$. Let $\varphi : N \to \mathbb{R}$ be defined by $\varphi(v) = \int \psi(y, v) \, d\mu(y)$. Then, for any $v \in N$,
\[ \mathcal{P}\varphi(v) = \int \varphi(F_x(v)) \, d\mu(x) = \iint \psi(y, F_x(v)) \, d\mu(y) \, d\mu(x). \]
By assumption, $F_x$ depends only on the zeroth coordinate $x_0$. On the other hand, $f(x) = (x_1, \dots, x_n, \dots)$ is independent of $x_0$. Thus, recalling that $\mu = p^{\mathbb{N}}$, the last integral coincides with
\[ \int \psi(f(x), F_x(v)) \, d\mu(x) = \int \psi(x, v) \, d\mu(x) = \varphi(v) \]
for every $v \in N$. This means that $\varphi$ is stationary. Hence, by hypothesis, there exists $C \in \mathbb{R}$ such that $\varphi(v) = C$ for $\eta$-almost every $v$. Moreover, for each $k \ge 1$ and $x_0, x_1, \dots, x_{k-1} \in X$,
\[ \int \psi(x, v) \, d\mu(x_k, x_{k+1}, \dots) = \int \psi(f^k(x), F_x^k(v)) \, d\mu(x_k, x_{k+1}, \dots). \]
Changing variables $y = f^k(x) = (x_k, x_{k+1}, \dots)$, and observing that $F_x^k$ depends only on $x_0, \dots, x_{k-1}$, we conclude that
\[ \int \psi(x, v) \, d\mu(x_k, x_{k+1}, \dots) = \int \psi(y, F_x^k(v)) \, d\mu(y) = \varphi(F_x^k(v)). \]
This proves that
\[ \int \psi(x, v) \, d\mu(x_k, x_{k+1}, \dots) = C \quad \text{for every } k, x_0, \dots, x_{k-1}, v. \]
Then (Exercise 5.8), we have $\psi(x, v) = C$ for $m$-almost all $(x, v) \in M \times N$. This proves that $m$ is ergodic, and so the proof of the proposition is complete.

The next theorem is an analogue for random transformations of the ergodic decomposition theorem in deterministic dynamics (see [114, Theorem 5.1.3]): every stationary measure can be written as a convex combination of ergodic stationary measures.

Theorem 5.14 (Ergodic decomposition) Take $N$ to be a separable complete metric space. Then the space of ergodic stationary measures is a measurable subset of the space of all stationary measures. Moreover, every stationary measure $\eta$ may be represented as $\eta = \int \xi \, d\hat{\eta}(\xi)$, meaning that
\[ \eta(B) = \int \xi(B) \, d\hat{\eta}(\xi) \quad \text{for every Borel set } B \subset N, \]
where $\hat{\eta}$ is a probability measure in the space of stationary measures giving full weight to the subset of ergodic stationary measures.

See Kifer [71, Appendix A.1] for a proof. The statement extends, immediately, to the case when $N$ is just a Borel subset of a separable complete metric space, because any measure on $N$ may be identified with a measure on the larger metric space.
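For a finite set $N$, ergodicity of a stationary measure can be tested by brute force through condition (b) of Proposition 5.11, and the decomposition of Theorem 5.14 becomes a finite convex combination. The Python sketch below uses a hypothetical system on four points in which the blocks $\{0, 1\}$ and $\{2, 3\}$ are each preserved by both maps. The data and the helper `is_ergodic` are illustrative assumptions, not constructions from the text.

```python
from fractions import Fraction as Fr
from itertools import combinations

# Hypothetical data: F_a swaps points inside each block, F_b is the identity.
p = {"a": Fr(1, 2), "b": Fr(1, 2)}
maps = {"a": (1, 0, 3, 2), "b": (0, 1, 2, 3)}
N = range(4)

def transition(phi):
    """(P phi)(v) = sum_a p(a) * phi(F_a(v))."""
    return [sum(p[a] * phi[maps[a][v]] for a in p) for v in N]

def adjoint(eta):
    """(P* eta)({w}) = sum_a p(a) * eta(F_a^{-1}({w}))."""
    out = [Fr(0)] * len(N)
    for a in p:
        for v in N:
            out[maps[a][v]] += p[a] * eta[v]
    return out

def is_ergodic(eta):
    """Check condition (b) of Proposition 5.11 over all subsets B of N."""
    for r in range(len(N) + 1):
        for B in combinations(N, r):
            chi = [Fr(1) if v in B else Fr(0) for v in N]
            P_chi = transition(chi)
            # B is eta-stationary if P(chi_B) = chi_B at eta-almost every point.
            eta_stationary = all(P_chi[v] == chi[v] for v in N if eta[v] > 0)
            if eta_stationary and sum(eta[v] for v in B) not in (0, 1):
                return False
    return True

xi1 = [Fr(1, 2), Fr(1, 2), Fr(0), Fr(0)]      # uniform on the block {0, 1}
xi2 = [Fr(0), Fr(0), Fr(1, 2), Fr(1, 2)]      # uniform on the block {2, 3}
eta = [Fr(1, 3) * x + Fr(2, 3) * y for x, y in zip(xi1, xi2)]

assert adjoint(xi1) == xi1 and adjoint(xi2) == xi2 and adjoint(eta) == eta
assert is_ergodic(xi1) and is_ergodic(xi2)
assert not is_ergodic(eta)                    # {0, 1} is stationary with measure 1/3
```

In this toy model the measure `eta` is stationary but not ergodic, and its decomposition into ergodic stationary measures gives weight 1/3 to `xi1` and weight 2/3 to `xi2`.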

5.4 Invertible random transformations

We have seen in Proposition 5.5 that, in the one-sided case, a probability measure $\eta$ is stationary if and only if the probability measure $\mu \times \eta$ is invariant. Here we discuss the relations between these two concepts when the random transformation $F$ is invertible and, in particular, the shift $f : M \to M$ is two-sided.

Throughout, we assume that $X$ is a separable complete metric space. Then (Exercise 5.12), $M = X^{\mathbb{Z}}$ (and $M = X^{\mathbb{N}}$) itself is also a separable complete metric space. Let $f : (M, \mu) \to (M, \mu)$ be a two-sided Bernoulli shift: $M = X^{\mathbb{Z}}$ and the invariant probability measure $\mu = p^{\mathbb{Z}}$ for some probability measure $p$ on $X$. Denote
\[ \mathbb{Z}_+ = \{n \in \mathbb{Z} : n \ge 0\} \quad \text{and} \quad \mathbb{Z}_- = \{n \in \mathbb{Z} : n < 0\}. \]
Then write $M^{\pm} = X^{\mathbb{Z}_{\pm}}$ and $\mu^{\pm} = p^{\mathbb{Z}_{\pm}}$. Let $\pi^{\pm} : M \to M^{\pm}$ be the canonical projections, and define $f^{\pm} : M^{\pm} \to M^{\pm}$ by
\[ f^+ \circ \pi^+ = \pi^+ \circ f \quad \text{and} \quad f^- \circ \pi^- = \pi^- \circ f^{-1}. \]
Then $f^+ : (M^+, \mu^+) \to (M^+, \mu^+)$ and $f^- : (M^-, \mu^-) \to (M^-, \mu^-)$ are one-sided Bernoulli shifts and $\mu = \mu^- \times \mu^+$.

Now let $F : M \times N \to M \times N$ be an invertible random transformation over $f : (M, \mu) \to (M, \mu)$. There are two non-invertible random transformations naturally associated with $F$, namely:
\[
\begin{aligned}
F^+ : M^+ \times N \to M^+ \times N, \quad & F^+(x^+, v) = (f^+(x^+), F^+_{x^+}(v)) \quad \text{over } f^+, \\
F^- : M^- \times N \to M^- \times N, \quad & F^-(x^-, v) = (f^-(x^-), F^-_{x^-}(v)) \quad \text{over } f^-,
\end{aligned}
\]
where $F^+_{x^+} = F_x$ for any $x \in M$ with $\pi^+(x) = x^+$ and $F^-_{x^-} = \big(F_{f^{-1}(x)}\big)^{-1}$ for any $x \in M$ with $\pi^-(x) = x^-$. Since $F_x : N \to N$ depends only on the zeroth coordinate $x_0$ of the point $x \in M$, these transformations are indeed well defined and locally constant: $F^+_{x^+}$ depends only on the coordinate $x^+_0$ and $F^-_{x^-}$ depends only on the coordinate $x^-_{-1}$.

A probability measure $\eta$ on $N$ is forward stationary for $F$ if it is stationary for $F^+$; that is, if
\[ \eta(B) = \int \eta\big((F^+_{x^+})^{-1}(B)\big) \, d\mu^+(x^+) = \int \eta\big(F_x^{-1}(B)\big) \, d\mu(x) \]
for every measurable set $B$. Analogously, $\eta$ is backward stationary for $F$ if it is stationary for $F^-$; that is, if
\[ \eta(B) = \int \eta\big((F^-_{x^-})^{-1}(B)\big) \, d\mu^-(x^-) = \int \eta\big(F_{f^{-1}(x)}(B)\big) \, d\mu(x) = \int \eta\big(F_x(B)\big) \, d\mu(x) \]
for every measurable set $B$ (the last equality uses the fact that the measure $\mu$ is invariant under $f$).


In what follows we are going to analyze these notions from several different angles. The conclusions can be summarized as follows (some of the notions involved have yet to be defined):

$\eta$ is forward stationary
⇔ the product $m^+ = \mu^+ \times \eta$ is $F^+$-invariant
⇔ the lift $m$ of $\eta$ (and $m^+$) is a u-state
⇔ the disintegration of $m$ factors through $M^-$

$\eta$ is backward stationary
⇔ the product $m^- = \mu^- \times \eta$ is $F^-$-invariant
⇔ the lift $m$ of $\eta$ (and $m^-$) is an s-state
⇔ the disintegration of $m$ factors through $M^+$.

In general, the sets of forward stationary measures and backward stationary measures are distinct and even disjoint. That is the case, for instance, for locally constant projective cocycles that are pinching and twisting (Exercise 5.24).

5.4.1 Lift of an invariant measure

The invariant probabilities of $F$ over $\mu$ are in one-to-one correspondence with the invariant probabilities of $F^+$ over $\mu^+$, and the same is true for $F^-$ and $\mu^-$. To state this fact precisely, consider the canonical projections
\[
\begin{aligned}
P : M \times N \to M \quad & \text{and} \quad Q : M \times N \to N, \\
P^{\pm} : M^{\pm} \times N \to M^{\pm} \quad & \text{and} \quad Q^{\pm} : M^{\pm} \times N \to N,
\end{aligned}
\tag{5.8}
\]
and let $\Pi^{\pm} : M \times N \to M^{\pm} \times N$ be given by $\Pi^{\pm}(x, v) = (\pi^{\pm}(x), v)$. Recall that $\pi^{\pm} : M \to M^{\pm}$ are the canonical projections between the shift spaces. The next two lemmas remain true if one replaces all $+$-signs by $-$-signs.

Lemma 5.15
(1) If $m$ is an $F$-invariant probability measure with $P_* m = \mu$ then $m^+ = \Pi^+_* m$ is an $F^+$-invariant probability measure with $P^+_* m^+ = \mu^+$.
(2) Given any $F^+$-invariant probability measure $m^+$ with $P^+_* m^+ = \mu^+$ there exists a unique $F$-invariant probability measure $m$ such that $\Pi^+_* m = m^+$ and $P_* m = \mu$.

Proof The first claim is an immediate consequence of the obvious relations $F^+ \circ \Pi^+ = \Pi^+ \circ F$ and $P^+ \circ \Pi^+ = \pi^+ \circ P$. To prove the second claim, let $\mathcal{C}$ be the $\sigma$-algebra of $M \times N$ and, for each $n \ge 0$, denote by $\mathcal{C}_n$ the sub-$\sigma$-algebra of sets of the form $F^n(M^- \times G)$ for some $G \subset M^+ \times N$. Note that
\[ \mathcal{C}_0 \subset \mathcal{C}_1 \subset \cdots \subset \mathcal{C}_n \subset \cdots \quad \text{and} \quad \bigcup_{n=0}^{\infty} \mathcal{C}_n \text{ generates } \mathcal{C}. \tag{5.9} \]
Suppose that $m$ is an $F$-invariant probability measure such that $\Pi^+_* m = m^+$. Then
\[ m(F^n(M^- \times G)) = m(M^- \times G) = m^+(G) \tag{5.10} \]
for any measurable set $G \subset M^+ \times N$. This proves that $m$ is uniquely determined on $\mathcal{C}_n$ for every $n$. Then, by (5.9), $m$ is uniquely determined on $\mathcal{C}$. To prove existence, take (5.10) to define $m$ on each $\sigma$-algebra $\mathcal{C}_n$. These definitions are consistent: if $F^n(M^- \times G_n) = F^{n-1}(M^- \times G_{n-1})$ then
\[ M^- \times G_n = F^{-1}(M^- \times G_{n-1}) = M^- \times (F^+)^{-1}(G_{n-1}) \]
and, using that $m^+$ is $F^+$-invariant, we find that $m^+(G_{n-1}) = m^+(G_n)$. So, (5.10) does define a probability measure on $\mathcal{C}$. By construction, this probability measure is $F$-invariant.

The probability measure $m$ is called the lift of $m^+$. Note that the operations in parts (1) and (2) of Lemma 5.15 are inverse to each other: by definition, the $\Pi^+_*$-projection of the lift of $m^+$ coincides with $m^+$ and, by uniqueness, the lift of the $\Pi^+_*$-projection of $m$ coincides with $m$.

Lemma 5.16 An $F^+$-invariant probability measure $m^+$ is ergodic for $F^+$ if and only if its lift $m$ is ergodic for $F$.

Proof Let $G \subset M^+ \times N$ be a measurable set invariant under $F^+$. Then $M^- \times G$ is invariant under $F$. If $m$ is ergodic, it follows that $m^+(G) = m(M^- \times G)$ is either 0 or 1. Thus, $m^+$ is ergodic.

To prove the converse, let $\psi : M \times N \to \mathbb{R}$ be a bounded measurable function. We claim that the Birkhoff time average $\tilde{\psi}$, relative to the transformation $F$, is constant $m$-almost everywhere. It follows that $m$ is ergodic for $F$. To prove the claim, suppose first that there exists $k \ge 1$ such that $\psi(x, v)$ depends only on $(x_{-k}, \dots, x_0, \dots)$ and $v \in N$. Then, for every $n \ge k$,
\[ \psi(F^n(x, v)) = \psi(f^n(x), F_x^n(v)) \]
depends only on $x^+ = (x_0, \dots, x_n, \dots)$ and $v$. Consequently, the time average $\tilde{\psi}$ is constant on each set $M^- \times \{(x^+, v)\}$. This means that $\tilde{\psi} : M \times N \to \mathbb{R}$ may be identified with a function $\hat{\psi} : M^+ \times N \to \mathbb{R}$, defined on a subset with full $m^+$-measure. Moreover, the fact that $\tilde{\psi} \circ F = \tilde{\psi}$ at $m$-almost every point entails that $\hat{\psi} \circ F^+ = \hat{\psi}$ at $m^+$-almost every point. By ergodicity of $m^+$, it follows that $\hat{\psi}$ is constant on a subset with full $m^+$-measure. Equivalently, $\tilde{\psi}$ is constant on a subset with full $m$-measure. This proves the claim in this case.

In general, for each $k \ge 1$, let $\psi_k : M \times N \to \mathbb{R}$ be defined by
\[ \psi_k(x, v) = \int \psi(x, v) \, d\mu(\dots, x_{-m}, \dots, x_{-k+1}). \]
By the previous special case of our claim, the time average $\tilde{\psi}_k$ is constant $m$-almost everywhere. Now, $(\psi_k)_k$ converges to $\psi$ at $m$-almost every point (Exercise 5.16). By the bounded convergence theorem, it follows that $(\psi_k)_k$ converges to $\psi$ in the space $L^1(m)$. Then the time averages $(\tilde{\psi}_k)_k$ also converge to $\tilde{\psi}$ in $L^1(m)$. It follows that $\tilde{\psi}$ is constant $m$-almost everywhere.

5.4.2 s-states and u-states In view of the previous observations, to every forward stationary measure η one can associate an F-invariant probability measure m, namely the lift of the F + -invariant probability measure m+ = μ + × η ; similarly for every backward stationary measure. In either case, we call m the lift of the stationary measure η . Our purpose now is to characterize the classes of F-invariant measures one finds in this way. Let m be an F-invariant probability measure on M × N with P∗ m = μ . We call m an s-state if, for measurable sets A− ⊂ M − , A+ ⊂ M + , and B ⊂ N, m(A− × A+ × B) μ (A− × A+ )

does not depend on A− .

(5.11)

Analogously, m is called a u-state if, for measurable sets A− ⊂ M − , A+ ⊂ M + , and B ⊂ N, m(A− × A+ × B) μ (A− × A+ )

does not depend on A+ .

(5.12)

We call m an su-state if it is both an s-state and a u-state. These are measures of a very special kind: Proposition 5.17 If m is a u-state then η = Q∗ m is a forward stationary measure. Conversely, if η is a forward stationary measure on N then its lift m is a u-state. Moreover, m is ergodic if and only if η is ergodic. The same is true replacing, simultaneously, forward stationary by backward stationary and u-state by s-state.

Downloaded from University Publishing Online. This is copyrighted material http://dx.doi.org/10.1017/CBO9781139976602.006


Proof Let m be a u-state and $m^+ = \Pi^+_* m$. For any measurable sets $A^+ \subset M^+$ and $B \subset N$,

\[ \frac{m^+(A^+ \times B)}{\mu^+(A^+)} = \frac{m(M^- \times A^+ \times B)}{\mu(M^- \times A^+)} \]

depends only on B. Let this expression be denoted η(B). It is clear from the definition that η is a probability measure on N. Moreover, $m^+ = \mu^+ \times \eta$ and, since $m^+$ is invariant under $F^+$, Proposition 5.5 asserts that η is forward stationary for F. Finally, $Q_* m = Q^+_* m^+ = \eta$. This proves the first claim.

To prove the converse, suppose η is a forward stationary measure and let m be its lift. We want to prove that

\[ \frac{m(A^- \times A^+ \times B)}{\mu(A^- \times A^+)} \]

does not depend on $A^+$. It is no restriction to take $A^-$ to be a cylinder,

\[ A^- = \{x^- \in M^- : x_{-k} \in X_{-k}, \ldots, x_{-1} \in X_{-1}\}, \]

for some measurable sets $X_{-k}, \ldots, X_{-1} \subset X$, because the cylinders generate the σ-algebra of $M^-$. Then, since m is invariant under F,

\[ m(A^- \times A^+ \times B) = m(F^{-k}(A^- \times A^+ \times B)) = m(M^- \times G) = (\mu^+ \times \eta)(G) \]

where G is the set of $(x^+, v) \in M^+ \times N$ with $x_0 \in X_{-k}, \ldots, x_{k-1} \in X_{-1}$, $(x_k, \ldots, x_n, \ldots) \in A^+$, and $v \in (F_x^k)^{-1}(B)$. Note that $F_x^k$ is a function of $x^+$; indeed, it depends only on $x_0, \ldots, x_{k-1}$. We conclude that $m(A^- \times A^+ \times B)$ is given by

\[ \int_{X_{-k} \times \cdots \times X_{-1}} \mu^+(A^+)\, \eta\big((F_x^k)^{-1}(B)\big)\, dp^k(x_0, \ldots, x_{k-1}). \]

Consequently,

\[ \frac{m(A^- \times A^+ \times B)}{\mu(A^- \times A^+)} = \frac{\int_{X_{-k} \times \cdots \times X_{-1}} \eta\big((F_x^k)^{-1}(B)\big)\, dp^k(x_0, \ldots, x_{k-1})}{\mu^-(A^-)} \]

does not depend on $A^+$, as we wanted to prove.

By Lemma 5.16, m is ergodic if and only if $m^+$ is ergodic. By Proposition 5.13, $m^+$ is ergodic if and only if η is ergodic.

For the remainder of this section we take N to be a compact metric space, and $F_x : N \to N$ to be a homeomorphism for every x ∈ M. Moreover, $x_0 \mapsto F_{x_0}$ is a measurable map from X to the space of continuous transformations on N,


endowed with the topology of uniform convergence. Keep in mind that we take X and M to be separable complete metric spaces.

It follows from Propositions 5.6 and 5.17 that the set of u-states (and s-states) is non-empty. Exercise 5.19 outlines an alternative, more direct proof of this fact, and similar ideas will be developed in Section 6.1. The following stronger statement will be useful later.

Let $(p_k)_k$ be a sequence of probability measures on X and $(\mu_k)_k$ be the corresponding Bernoulli measures on M. Let $F_k : M \times N \to M \times N$, k ≥ 1, be random transformations over the Bernoulli shifts $f : (M, \mu_k) \to (M, \mu_k)$ and, for each k ≥ 1, let $m_k$ be an s-state (respectively, a u-state) for $F_k$.

Proposition 5.18 Suppose that $(p_k)_k$ converges to p in the uniform topology, $(F_k)_k$ converges to F uniformly on M × N, and $(m_k)_k$ converges to m in the weak∗ topology. Then m is an s-state (respectively, a u-state) for F. Moreover, the assumption on $(p_k)_k$ is not necessary when F is continuous.

Proof Since each $m_k$ projects to $\mu_k$ and $(m_k)_k \to m$ in the weak∗ topology, the projections $(\mu_k)_k$ must converge to the projection of m in the weak∗ topology. Now, the hypothesis $(p_k)_k \to p$ implies that $(\mu_k)_k \to \mu$ in the weak∗ topology (Exercise 5.14). Thus, m projects down to μ.

Lemma 5.19 Let $(m_k)_k$ be a sequence of probability measures on M × N projecting to Bernoulli measures $(\mu_k)_k$ on M and satisfying (5.11). If $(m_k)_k$ converges to some probability measure m and $(\mu_k)_k$ converges to some Bernoulli measure μ, in the weak∗ topology, then m satisfies (5.11). Analogously for condition (5.12).

Proof Write $m_k = \mu_k^- \times m_k^+$ where $m_k^+$ is the projection of $m_k$ to $M^+ \times N$ (Exercise 5.17). The hypothesis gives that $(m_k)_k \to m$ and $(\mu_k^-)_k \to \mu^-$ in the weak∗ topology. Restricting to some subsequence, we may assume that $(m_k^+)_k$ converges to some probability measure $m^+$ on $M^+ \times N$. Then (Exercise 5.14) $m = \mu^- \times m^+$ and this means that m satisfies (5.11). Exchanging ±-signs, one gets the same claim for (5.12).

To finish the proof of Proposition 5.18, we still have to prove that m is F-invariant. We claim that

\[ \int (\varphi \circ F_k)\, dm_k - \int (\varphi \circ F)\, dm \to 0 \quad \text{as } k \to \infty, \tag{5.13} \]

for any bounded uniformly continuous function ϕ : M × N → R. This means that $(F_k)_* m_k$ converges to $F_* m$ in the weak∗ topology. The hypothesis also implies that $(F_k)_* m_k = m_k$ converges to m in the weak∗ topology. It follows


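that $F_* m = m$, as we wanted to prove. Observe that

\[ \int (\varphi \circ F_k)\, dm_k - \int (\varphi \circ F)\, dm_k \to 0 \quad \text{as } k \to \infty, \]

because $\varphi \circ F_k - \varphi \circ F$ converges uniformly to zero. Thus, to prove (5.13) we only have to show that

\[ \int (\varphi \circ F)\, dm_k - \int (\varphi \circ F)\, dm \to 0 \quad \text{as } k \to \infty, \tag{5.14} \]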

for any bounded uniformly continuous function ϕ : M × N → R. This is clear if F is continuous, because $m_k$ is assumed to converge to m in the weak∗ topology. This proves the second claim in the proposition.

Now we prove the first claim, where F is not assumed to be continuous. Arguing as in Exercise 5.13, it is no restriction to suppose that ϕ : M × N → R is such that ϕ(x, v) depends only on $(x_{-n}, \ldots, x_0, \ldots, x_{n-1}, v)$, for some fixed n ≥ 1. For each x ∈ M, consider the continuous function $\Phi_x : N \to \mathbb{R}$ defined by $\Phi_x(v) = \varphi \circ F(x, v) = \varphi(f(x), F_x(v))$. Since $\Phi_x$ depends only on $(x_{-n+1}, \ldots, x_0, \ldots, x_n)$, we may view $\Phi : x \mapsto \Phi_x$ as a (measurable) map from $X^{2n}$ to the set $C^0(N)$ of continuous functions on N. This is a separable metric space for the norm of uniform convergence. Thus, we may apply the theorem of Lusin (see [114, Appendix A.3]) to find, for any given ε > 0, a compact set $K \subset X^{2n}$ such that $p^{2n}(K^c) < \varepsilon$ and Φ is continuous on K. By construction, the restriction of ϕ ◦ F to K × N is continuous. Then, by the extension theorem of Tietze, there exists a continuous function $\psi : X^{2n} \times N \to \mathbb{R}$ such that sup |ψ| ≤ sup |ϕ| and ψ(x, v) = ϕ ◦ F(x, v) for every x ∈ K and v ∈ N. Then (functions on $X^{2n} \times N$ are also viewed as functions on M × N that depend only on $(x_{-n+1}, \ldots, x_0, \ldots, x_n)$),

\[ \int (\varphi \circ F)\, dm_k - \int (\varphi \circ F)\, dm = \int \psi\, dm_k - \int \psi\, dm + \int_{K^c \times N} (\varphi \circ F - \psi)\, dm_k - \int_{K^c \times N} (\varphi \circ F - \psi)\, dm. \tag{5.15} \]

The first difference on the right-hand side is less than ε in absolute value if k is large, because ψ is continuous and $m_k$ converges to m in the weak∗ topology. The remainder terms in (5.15) are bounded by

\[ 2 \sup|\varphi| \big( m_k(K^c \times N) + m(K^c \times N) \big) = 2 \sup|\varphi| \big( p_k^{2n}(K^c) + p^{2n}(K^c) \big). \]

The hypothesis that $p_k$ converges to p implies that $p_k^{2n}$ converges to $p^{2n}$, in the uniform sense (Exercise 5.14). In particular, $p_k^{2n}(K^c) < \varepsilon$ for all large k.


Substituting these estimates in (5.15) we find that

\[ \left| \int (\varphi \circ F)\, dm_k - \int (\varphi \circ F)\, dm \right| < \varepsilon + 4 \sup|\varphi|\, \varepsilon \]

if k is large enough. This completes the proof of (5.14) and the proposition.

5.5 Disintegrations of s-states and u-states

For future use, we give an alternative characterization of s-states and u-states, in terms of their disintegrations along the vertical fibers. Take X, $M = X^{\mathbb{Z}}$ and N to be separable complete metric spaces.

5.5.1 Conditional probabilities

Let m be any probability measure on M × N that projects down to μ. A disintegration of m along vertical fibers is a measurable family $\{m_x : x \in M\}$ of probability measures on N satisfying

\[ m(E) = \int_M m_x\big(\{v : (x, v) \in E\}\big)\, d\mu(x) \]

for any measurable set E ⊂ M × N.

The $m_x$ are called conditional probabilities of m. The family $\mathcal{P} = \{\{x\} \times N : x \in M\}$ of vertical fibers is a measurable partition of M × N, in the sense of Rokhlin; that is, it is the limit of some increasing sequence of finite partitions $\mathcal{P}_1 \prec \cdots \prec \mathcal{P}_n \prec \cdots$. To see this, just consider any countable basis of open sets $\{U_k : k \in \mathbb{N}\}$ of M, and take

\[ \mathcal{P}_n = \bigvee_{k=1}^{n} \{U_k \times N,\; U_k^c \times N\}. \]

Then, Rokhlin’s disintegration theorem (see [114, Section 5.2]) gives that a disintegration along vertical fibers does exist. Moreover, it is essentially unique: any two disintegrations coincide on a full μ-measure subset. Indeed, the proof of Rokhlin’s theorem provides the following explicit characterization of the conditional probabilities:

\[ m_x(B) = \lim_{U_k \to x} \frac{m(U_k \times B)}{\mu(U_k)} \quad \text{for } \mu\text{-almost every } x \in M \tag{5.16} \]

and any B ⊂ N in some countable algebra $\mathcal{A}_N$ generating the Borel σ-algebra of N. The limit is taken over suitable decreasing sequences of open sets $U_k$ whose intersection consists of the point x only.
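When M and N are finite sets, the disintegration can be written down by hand, and the limit in (5.16) reduces to conditioning on the atom {x}. The following minimal sketch checks the disintegration formula in that setting; the sets, the weights and the function names are all hypothetical choices made for illustration.

```python
from fractions import Fraction

# Hypothetical finite example: M = {'a', 'b'}, N = {0, 1, 2}.
m = {('a', 0): Fraction(1, 8), ('a', 1): Fraction(1, 8), ('a', 2): Fraction(1, 4),
     ('b', 0): Fraction(1, 4), ('b', 1): Fraction(1, 4), ('b', 2): Fraction(0)}

def projection(m):
    """Marginal mu of m on the base M."""
    mu = {}
    for (x, _), w in m.items():
        mu[x] = mu.get(x, Fraction(0)) + w
    return mu

def disintegrate(m):
    """Conditional probabilities m_x({v}) = m({x} x {v}) / mu({x})."""
    mu = projection(m)
    return {x: {v: w / mu[x] for (y, v), w in m.items() if y == x}
            for x in mu if mu[x] > 0}

def reassemble(E, m):
    """Integrate m_x(E^x) against mu, for a set E of pairs (x, v)."""
    mu, cond = projection(m), disintegrate(m)
    return sum(mu[x] * sum(cond[x][v] for (y, v) in E if y == x) for x in cond)
```

For any set E of pairs, `reassemble(E, m)` should agree with the direct sum of the weights of m over E, which is the finite version of the defining identity of a disintegration.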


Proposition 5.20 An F-invariant probability measure m on M × N with $P_* m = \mu$ is an s-state if and only if it admits a disintegration that factors through $M^+$; that is, such that each $m_x$ depends only on $x^+ = \pi^+(x)$. Dually, m is a u-state if and only if it admits a disintegration that factors through $M^-$.

Proof Suppose that m admits a disintegration $\{m_x : x \in M\}$ where each $m_x$ depends only on $x^+ = \pi^+(x)$: we write $m_x = m_{x^+}$. For any $A^-$, $A^+$, B,

\[ \frac{m(A^- \times A^+ \times B)}{\mu(A^- \times A^+)} = \frac{\int_{A^-} d\mu^-(x^-) \int_{A^+} m_{x^+}(B)\, d\mu^+(x^+)}{\mu^-(A^-)\, \mu^+(A^+)}. \]

The right-hand side may be rewritten as

\[ \frac{\int_{A^+} m_{x^+}(B)\, d\mu^+(x^+)}{\mu^+(A^+)} = \frac{m(M^- \times A^+ \times B)}{\mu(M^- \times A^+)}, \]

and so it does not depend on $A^-$.

Conversely, suppose m is an s-state. Fix countable bases of open sets $\{U_k^- : k \in \mathbb{N}\}$ of $M^-$ and $\{U_l^+ : l \in \mathbb{N}\}$ of $M^+$. The products $U_k^- \times U_l^+$ form a basis of open sets of M. As observed in (5.16), there exists a full μ-measure set $M_0 \subset M$ and a countable algebra $\mathcal{A}_N$ generating the Borel σ-algebra of N, such that

\[ m_x(B) = \lim_{U_k^- \times U_l^+ \to x} \frac{m(U_k^- \times U_l^+ \times B)}{\mu(U_k^- \times U_l^+)} \quad \text{for all } x \in M_0 \text{ and } B \in \mathcal{A}_N, \tag{5.17} \]

where the limit is taken over any sequence of basis elements $U_k^- \times U_l^+$ shrinking down to x. Given any pair $x, y \in M_0$ such that $\pi^+(x) = \pi^+(y)$, we may consider the same neighborhoods $U_l^+$ for both points. Then, as the right-hand side of (5.17) does not depend on $U_k^-$, we get that $m_x(B) = m_y(B)$ for every $B \in \mathcal{A}_N$. Since the algebra $\mathcal{A}_N$ is generating, this implies that $m_x = m_y$. We have shown that $m_x$ factors through $M^+$, restricted to the full measure subset $M_0$. Then we can force this property on the whole M by redefining the $m_x$ appropriately for $x \in M \setminus M_0$ (since this is a zero measure set, the family $m_x$ continues to be a disintegration of m). This proves the claim for s-states. The dual claim, for u-states, is analogous.

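5.5.2 Martingale construction

Let $(X, \mathcal{B}, \mu)$ be a probability space. The conditional expectation $E(\Psi \mid \mathcal{A})$ of an integrable function Ψ : X → R relative to a σ-algebra $\mathcal{A} \subset \mathcal{B}$ is the unique $\mathcal{A}$-measurable function on X such that

\[ \int_A E(\Psi \mid \mathcal{A})\, d\mu = \int_A \Psi\, d\mu \quad \text{for all } A \in \mathcal{A}. \]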


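A martingale is a sequence $(\Psi_n, \mathcal{B}_n)$, n ≥ 0, where $\mathcal{B}_n$, n ≥ 0, is a non-decreasing sequence of sub-σ-algebras of $\mathcal{B}$, each $\Psi_n : X \to \mathbb{R}$, n ≥ 0, is $\mathcal{B}_n$-measurable and integrable, and

\[ E(\Psi_{n+1} \mid \mathcal{B}_n) = \Psi_n \quad \text{almost everywhere, for every } n \ge 0. \tag{5.18} \]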
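A standard way to produce martingales is to take conditional expectations of a fixed function with respect to an increasing family of σ-algebras. The sketch below checks property (5.18) exactly for such a sequence on a finite Bernoulli space of words of length 4. The probability vector, the word length and the test function `psi` are arbitrary assumptions chosen only to keep the example small.

```python
from fractions import Fraction
from itertools import product

P_ONE = Fraction(1, 3)   # probability of the symbol 1 (arbitrary choice)
LENGTH = 4               # number of coordinates of the toy sequence space

def weight(word):
    """Bernoulli probability of a finite word of 0s and 1s."""
    w = Fraction(1)
    for s in word:
        w *= P_ONE if s == 1 else 1 - P_ONE
    return w

def psi(x):
    """An arbitrary test function of the full word."""
    return x[0] + 2 * x[1] * x[2] - x[3]

def conditional_expectation(prefix):
    """E(psi | first len(prefix) coordinates), evaluated at the given prefix."""
    n = len(prefix)
    return sum(weight(tail) * psi(prefix + tail)
               for tail in product((0, 1), repeat=LENGTH - n))

def is_martingale():
    """Check E(Psi_{n+1} | B_n) = Psi_n on every cylinder of length n."""
    for n in range(LENGTH):
        for prefix in product((0, 1), repeat=n):
            averaged = sum(weight((s,)) * conditional_expectation(prefix + (s,))
                           for s in (0, 1))
            if averaged != conditional_expectation(prefix):
                return False
    return True
```

Here $\mathcal{B}_n$ is generated by the first n coordinates, so $\Psi_0$ is the constant $E(\psi)$ and $\Psi_4 = \psi$; the check uses exact rational arithmetic, so no tolerance is needed.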

Theorem 5.21 (martingale convergence theorem) Given any martingale $(\Psi_n, \mathcal{B}_n)$, n ≥ 0, such that $\sup_n E(|\Psi_n|) < \infty$, there exists a measurable function Ψ : X → R with E(|Ψ|) < ∞, such that
(1) $\Psi_n \to \Psi$ almost everywhere;
(2) $E(\Psi \mid \mathcal{B}_n) = \Psi_n$ almost everywhere, for every n ≥ 0;
(3) Ψ is $\mathcal{B}_\infty$-measurable, where $\mathcal{B}_\infty$ denotes the σ-algebra generated by the union of all $\mathcal{B}_n$, n ≥ 0.

A proof of this classical theorem can be found in Durrett [49, § 5.2], for example. We are going to use the martingale convergence theorem to give a useful alternative proof of Proposition 5.17 in the case when N is a compact metric space.

Let F : M × N → M × N and $F^+ : M^+ \times N \to M^+ \times N$ be as in the previous section. Let $m^+$ be any $F^+$-invariant probability measure that projects down to $\mu^+$ and let $\{m^+_y : y \in M^+\}$ be a disintegration of $m^+$. Given any x ∈ M, define $m^+_x = m^+_{x^+}$ where $x^+ = \pi^+(x)$.

Lemma 5.22 The weak∗ limit $m_x = \lim_n A^n(f^{-n}(x))_* m^+_{f^{-n}(x)}$ exists for μ-almost every x ∈ M.

Proof Given any continuous function ϕ : N → R and n ≥ 0, define

\[ \Psi_{\varphi,n}(x) = \int \varphi\, d\big( A^n(f^{-n}(x))_* m^+_{f^{-n}(x)} \big). \tag{5.19} \]

For each n ≥ 0, let $\mathcal{F}_n$ be the σ-algebra generated by the cylinders

\[ [-n : \Delta_{-n}, \ldots, \Delta_l] = \{x \in M : x_i \in \Delta_i \text{ for } i = -n, \ldots, l\} \]

with l ≥ −n and $\Delta_i \subset X$ a measurable set for i = −n, . . . , l. It is clear that $\mathcal{F}_n \subset \mathcal{F}_{n+1}$ for every n ≥ 0. Since $m^+_y$ depends only on $y^+ = \pi^+(y)$ and A(x) depends only on the zeroth coordinate $x_0$, the value of $\Psi_{\varphi,n}(x)$ depends only on $x_k$, k ≥ −n. Hence, every $\Psi_{\varphi,n}$ is $\mathcal{F}_n$-measurable. Moreover, given any cylinder $C = [-n : \Delta_{-n}, \ldots, \Delta_l]$,

\[ \begin{aligned} \int_C \Psi_{\varphi,n+1}(x)\, d\mu(x) &= \int_C \int \varphi \circ A^{n+1}(f^{-n-1}(x))\, dm^+_{f^{-n-1}(x)}\, d\mu(x) \\ &= \int_C \int \varphi \circ A^{n}(f^{-n}(x))\, dq_{f^{-n-1}(x)}\, dp(x_{-n}) \cdots dp(x_{-1}) \end{aligned} \]


where $q_y = \int_X A(y)_* m^+_y\, dp(y_0)$. Observe that $q_y = m^+_{f(y)}$ for μ-almost every y, because $m^+$ is $F^+$-invariant (Exercise 5.20). Therefore,

\[ \int_C \Psi_{\varphi,n+1}(x)\, d\mu(x) = \int_C \int \varphi \circ A^{n}(f^{-n}(x))\, dm^+_{f^{-n}(x)}\, d\mu(x) = \int_C \Psi_{\varphi,n}(x)\, d\mu(x). \]

Since these cylinders C generate the σ-algebra $\mathcal{F}_n$, this proves that $(\Psi_{\varphi,n}, \mathcal{F}_n)$ is a martingale. Then, by the martingale convergence theorem, there exists a measurable function $\Psi_\varphi : M \to \mathbb{R}$ such that

\[ \Psi_\varphi(x) = \lim_n \Psi_{\varphi,n}(x) \tag{5.20} \]

for every x in some full μ-measure set $M_\varphi \subset M$. Now we use once more the fact that the space of continuous functions on the compact metric space N admits countable dense subsets. Taking the intersection of the $M_\varphi$ over all ϕ in such a dense subset one obtains a full μ-measure set $M_* \subset M$ such that (5.20) holds for every $x \in M_*$ and every continuous function ϕ : N → R. Every map $\varphi \mapsto \Psi_\varphi(x)$ is linear and non-negative and it sends ϕ ≡ 1 to $\Psi_1(x) = 1$. Thus, by the Riesz representation theorem (see [114, Theorem A.3.11]), it defines a probability measure $m_x$ on N:

\[ \int \varphi\, dm_x = \Psi_\varphi(x), \quad \text{for each continuous } \varphi : N \to \mathbb{R}. \]

The definition gives that $m_x$ is the weak∗ limit of $A^n(f^{-n}(x))_* m^+_{f^{-n}(x)}$.

The corollary that follows means that the lift of $m^+$ is precisely the probability measure m on M × N that projects down to μ and admits the family $\{m_x : x \in M\}$ in Lemma 5.22 as a disintegration.

Corollary 5.23 The lift of $m^+$ is the probability measure m defined on M × N by

\[ m(E) = \int m_x\big(E \cap (\{x\} \times N)\big)\, d\mu(x) \]

for every measurable set E ⊂ M × N.

Proof Recall the proof of Lemma 5.22. The definition (5.19) implies

\[ m_{f(x)} = \lim_n A^{n+1}(f^{-n}(x))_* m^+_{f^{-n}(x)} = A(x)_* m_x \]

for μ-almost every x. Therefore (Exercise 5.20), the probability measure m is


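invariant under F. Moreover, given any cylinder $C = [0 : \Delta_0, \ldots, \Delta_l]$ in $\mathcal{F}_0$ and any continuous function ϕ : N → R,

\[ \begin{aligned} \int \mathcal{X}_C\, \varphi\, dm &= \int_C \int \varphi\, dm_x\, d\mu(x) = \int_C \Psi_\varphi(x)\, d\mu(x) \\ &= \int_C \Psi_{\varphi,0}(x)\, d\mu(x) = \int_C \int \varphi\, dm^+_x\, d\mu(x) = \int \mathcal{X}_C\, \varphi\, dm^+ \end{aligned} \]

because $E(\Psi_\varphi \mid \mathcal{F}_0) = \Psi_{\varphi,0}$. This proves that m projects to $m^+$ under the map $\Pi^+ : M \times N \to M^+ \times N$.

Corollary 5.24 Suppose that $m^+ = \mu^+ \times \eta$ for some forward stationary measure η. Then the lift m is a u-state.

Proof Recall the proof of Lemma 5.22. In the present case, $m^+_x \equiv \eta$ and so each

\[ \Psi_{\varphi,n}(x) = \int \varphi\, d\big( A^n(f^{-n}(x))_* \eta \big) \]

depends only on $x_{-n}, \ldots, x_{-1}$. Thus, the limit $\Psi_\varphi(x) = \int \varphi\, dm_x$ depends only on $x^- = \pi^-(x)$. This means that m is a u-state.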
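The mechanism in the proof of Corollary 5.24 can be watched numerically. Since A is locally constant, $A^n(f^{-n}(x)) = A(x_{-1}) \cdots A(x_{-n})$ involves only the past of x. The sketch below pushes a finitely supported measure η forward under these products and measures how far apart the images of its support points are. The two positive matrices, the sampled past and the support of η are illustrative assumptions; for positive matrices the images of the positive cone are nested and shrink, so here the push-forwards collapse to a single direction, which need not happen for a general cocycle.

```python
import math
import random

# Two positive matrices (an illustrative choice, not taken from the text).
MATRICES = (((2.0, 1.0), (1.0, 1.0)), ((1.0, 1.0), (1.0, 2.0)))

def apply(a, v):
    """Apply the 2x2 matrix a to v and normalise the result."""
    x = a[0][0] * v[0] + a[0][1] * v[1]
    y = a[1][0] * v[0] + a[1][1] * v[1]
    r = math.hypot(x, y)
    return (x / r, y / r)

def pushed_directions(past, n, directions):
    """Images of the given directions under A(x_{-1}) ... A(x_{-n}).

    past[k] stands for the coordinate x_{-k-1}, so A(x_{-n}) acts first.
    """
    out = []
    for v in directions:
        for k in reversed(range(n)):
            v = apply(MATRICES[past[k]], v)
        out.append(v)
    return out

def spread(vectors):
    """Largest |sin(angle)| between two of the given unit vectors."""
    return max(abs(v[0] * w[1] - v[1] * w[0]) for v in vectors for w in vectors)

rng = random.Random(0)
past = [rng.randrange(2) for _ in range(60)]   # sample past x_{-1}, x_{-2}, ...
support = [(math.cos(t), math.sin(t)) for t in (0.1, 0.5, 0.9, 1.3)]  # support of eta
```

Comparing `spread(pushed_directions(past, 40, support))` with `spread(support)` shows the collapse, and the limiting direction is a function of the list `past` alone, in line with the statement that $m_x$ depends only on $x^-$.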

5.5.3 Remarks on 2-dimensional linear cocycles

Let $F : M \times \mathbb{R}^2 \to M \times \mathbb{R}^2$, F(x, v) = (f(x), A(x)v) be an invertible locally constant linear cocycle and $PF : M \times P\mathbb{R}^2 \to M \times P\mathbb{R}^2$ be the associated projective cocycle. By definition, the base dynamics f : (M, μ) → (M, μ) is a two-sided Bernoulli shift. The function A : M → GL(2) is assumed to satisfy the integrability conditions $\log^+ \|A^{\pm 1}\| \in L^1(\mu)$. Take the Lyapunov exponents $\lambda_+$ and $\lambda_-$ to be distinct and let $\mathbb{R}^2 = E^+_x \oplus E^-_x$ be the Oseledets decomposition. Let $m^s$ and $m^u$ be defined on the product space $M \times P\mathbb{R}^2$ by

\[ m^s(C \times D) = \mu\big(\{x \in C : E^-_x \in D\}\big) = \int_C \delta_{E^-_x}(D)\, d\mu(x) \]
\[ m^u(C \times D) = \mu\big(\{x \in C : E^+_x \in D\}\big) = \int_C \delta_{E^+_x}(D)\, d\mu(x) \]

for all measurable sets C ⊂ M and $D \subset P\mathbb{R}^2$. In other words, $m^s$ and $m^u$ are the probability measures on $M \times P\mathbb{R}^2$ that project down to μ and admit, respectively, $\delta_{E^-_x}$ and $\delta_{E^+_x}$ as conditional probabilities along the projective fibers. Note that $m^s$ and $m^u$ are PF-invariant, because (Exercise 5.20)

\[ A(x)_* \delta_{E^-_x} = \delta_{E^-_{f(x)}} \quad \text{and} \quad A(x)_* \delta_{E^+_x} = \delta_{E^+_{f(x)}} \quad \mu\text{-almost everywhere.} \]

Let $G \subset M \times P\mathbb{R}^2$ be a measurable set invariant under PF. Then the set $C_s = \{x \in M : (x, E^-_x) \in G\}$ is invariant under f, and so $\mu(C_s)$ is either 0 or 1.


Consequently, $m^s(G)$ is either 0 or 1, and this proves that $m^s$ is ergodic. A similar argument proves that $m^u$ is ergodic. Moreover, $m^s$ is an s-state, because $E^s_x$ depends only on $x^+ = \pi^+(x)$; similarly, $m^u$ is a u-state.

Lemma 5.25 Any PF-invariant probability measure m that projects down to μ may be written as a convex combination $m = a m^s + b m^u$, with a, b ≥ 0 and a + b = 1.

Proof For each k ≥ 1, consider the set

\[ B_k = \Big\{ (x, v) \in M \times P\mathbb{R}^2 : |\sin \angle(v, E^*_x)| \ge \frac{1}{k} |\sin \angle(E^s_x, E^u_x)| \text{ for } * = s, u \Big\}. \]

Consider any $(x, v) \in B_k$. Since $\lambda_- < \lambda_+$, the angle between $A^n(x)v$ and $E^u_{f^n(x)}$ decays exponentially fast as n → +∞ (Exercise 3.5). On the other hand, by the theorem of Oseledets, the angle between $E^s_{f^n(x)}$ and $E^u_{f^n(x)}$ decays subexponentially. This implies that $F^n(x, v)$ eventually leaves $B_k$. So, by Poincaré recurrence, $m(B_k) = 0$ for every F-invariant probability measure m and every k ≥ 1. This means that m is concentrated on the set of points $(x, v) \in M \times P\mathbb{R}^2$ with $v \in \{E^s_x, E^u_x\}$, and so its disintegration has the form $m_x = a(x)\delta_{E^s_x} + b(x)\delta_{E^u_x}$ with a(x), b(x) ≥ 0 and a(x) + b(x) = 1. Moreover, the functions a(·) and b(·) are (f, μ)-invariant and so, by ergodicity of μ, they are constant on a full μ-measure set.

Remark 5.26 The hypotheses that F is locally constant and the system (f, μ) is a Bernoulli shift are not used in the proof of Lemma 5.25. Thus, the conclusion extends to any 2-dimensional linear cocycle over an ergodic system satisfying the integrability conditions in Theorem 3.20 and such that $\lambda_- < \lambda_+$. This will be used in the proof of Proposition 9.13.

Thus, $m^s$ and $m^u$ are the unique ergodic PF-invariant probability measures. The next result also follows from the lemma.

Corollary 5.27 Either $m^s$ is the unique s-state or all PF-invariant probability measures that project down to μ are s-states. Either $m^u$ is the unique u-state or all PF-invariant probability measures that project down to μ are u-states.

Proof Suppose that there exists some s-state $m \ne m^s$. Then $m = a m^s + b m^u$ with b ≠ 0, and so we may write

\[ m^u = \frac{1}{b}\, m - \frac{a}{b}\, m^s. \]

This implies that $m^u$ is an s-state, and therefore so is any convex combination


of $m^s$ and $m^u$. By Lemma 5.25, this proves the first claim. The second one is analogous.

In the next chapter, especially in Section 6.1, we will push this kind of picture to linear cocycles in arbitrary dimension.
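The alignment mechanism behind the proof of Lemma 5.25 can be made concrete for matrices of determinant 1. In that case $\sin\angle(Av, Aw)\,\|Av\|\,\|Aw\| = \sin\angle(v, w)\,\|v\|\,\|w\|$, so the angle between the images of two directions shrinks exactly as fast as the norms grow. The sketch below verifies this identity along a random word; the matrices, the word and the initial vectors are illustrative assumptions, not data from the text.

```python
import math
import random

# Illustrative matrices; both have determinant 1.
MATRICES = (((2.0, 1.0), (1.0, 1.0)), ((1.0, 1.0), (1.0, 2.0)))

def step(a, v):
    """Return the unit vector a v / |a v| together with log |a v|."""
    x = a[0][0] * v[0] + a[0][1] * v[1]
    y = a[1][0] * v[0] + a[1][1] * v[1]
    r = math.hypot(x, y)
    return (x / r, y / r), math.log(r)

def iterate(word, v):
    """Follow the direction of v under A^n(x); also return log |A^n(x) v|."""
    total = 0.0
    for symbol in word:
        v, growth = step(MATRICES[symbol], v)
        total += growth
    return v, total

def sine(v, w):
    """|sin| of the angle between two unit vectors."""
    return abs(v[0] * w[1] - v[1] * w[0])

rng = random.Random(1)
word = [rng.randrange(2) for _ in range(8)]   # x_0, ..., x_7, drawn with p = (1/2, 1/2)
v0, w0 = (1.0, 0.0), (0.0, 1.0)
vn, log_norm_v = iterate(word, v0)
wn, log_norm_w = iterate(word, w0)
```

With these choices, `math.log(sine(vn, wn))` agrees with `math.log(sine(v0, w0)) - log_norm_v - log_norm_w` up to rounding, so the angle decays at the exponential rate $\lambda_- - \lambda_+ = -2\lambda_+$ whenever the norms grow at rate $\lambda_+$. The word is kept short because the sine is computed by a difference that loses accuracy once the angle approaches machine precision.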

5.6 Notes

The books of Kifer [72] and Arnold [5] contain systematic developments of the theory of random transformations, from two different perspectives. Our definition is more restrictive than Arnold’s, as we consider only the Bernoulli case. However, several parts of the material we present here have been developed in much more generality.

The notions of s-state and u-state were introduced by Bonatti, Gomez-Mont and Viana [37] and were further developed in a series of papers by Avila and Viana [16], by Avila, Santamaria and Viana [13] and by Avila, Viana and Wilkinson [17]. The analysis of the invertible case presented in Section 5.4 is probably new.

The idea of disintegration into conditional probabilities relies on the disintegration theorem of Rokhlin [102]; see also [114, Section 5.2]. The martingale construction in Section 5.5.2 goes back to Furstenberg [55].

The remarks in Section 5.5.3 are from Avila and Viana [16]. They hold more generally: the cocycle need not be locally constant and it suffices that the invariant measure μ has local product structure, meaning that the restriction of μ to a neighborhood of every point x is equivalent to the product $\mu^+_x \times \mu^-_x$ of a measure $\mu^+_x$ on the unstable manifold of x by a measure $\mu^-_x$ on the stable manifold of x.

5.7 Exercises

Exercise 5.1 Show that if E is a measurable subset of M × N then:
(1) the sets $E_v = \{x \in M : (x, v) \in E\}$ and $E^x = \{v \in N : (x, v) \in E\}$ are measurable subsets of M and N, respectively, for every v ∈ N and x ∈ M;
(2) the functions N → R, $v \mapsto \xi(E_v)$ and M → R, $x \mapsto \eta(E^x)$ are measurable, for any probability measures ξ on $(M, \mathcal{A})$ and η on $(N, \mathcal{B})$;
(3) moreover, $(\xi \times \eta)(E) = \int \xi(E_v)\, d\eta(v) = \int \eta(E^x)\, d\xi(x)$.

Exercise 5.2 Show that if F : M × N → M × N, $F(x, v) = (f(x), F_x(v))$ is measurable then the maps M → N, $x \mapsto F_x(v)$ and $F_x : N \to N$ are measurable, for every v ∈ N and x ∈ M.

Exercise 5.3 Check that the transition operators can be defined in terms of the transition probabilities alone:
(1) $\mathcal{P}\varphi(v) = \int \varphi(w)\, dp(v, w)$ for every v ∈ N, and
(2) $\mathcal{P}^*\eta(B) = \int p(v, B)\, d\eta(v)$ for every measurable set B ⊂ N.

Exercise 5.4 Consider the situation in Example 5.1. Show that a probability measure η is stationary for the projective cocycle $PF : M \times P\mathbb{R}^d \to M \times P\mathbb{R}^d$ if and only if it satisfies $\eta = \sum_{i=1}^m p_i (A_i)_* \eta$.

Exercise 5.5 Calculate all the stationary measures of the projective cocycle $PF : X^{\mathbb{N}} \times P\mathbb{R}^2 \to X^{\mathbb{N}} \times P\mathbb{R}^2$ given by $PF\big((\alpha_n)_{n \in \mathbb{N}}, [v]\big) = \big((\alpha_{n+1})_{n \in \mathbb{N}}, [\alpha_0 v]\big)$ where $X = \{A_1, A_2\}$ and $p = p_1 \delta_{A_1} + p_2 \delta_{A_2}$, with

\[ A_1 = A_2^{-1} = \begin{pmatrix} \sigma & 0 \\ 0 & \sigma^{-1} \end{pmatrix} \quad \text{for some } \sigma > 1 \text{ and } p_1, p_2 > 0. \]

Exercise 5.6 Let η be a stationary measure and ϕ : N → R be an η-stationary function. Show that there is a stationary function ψ : N → R with ϕ(v) = ψ(v) for η-almost every v ∈ N. Show that if ϕ is a characteristic function of some subset of N then ψ may also be taken to be a characteristic function.

Exercise 5.7 Let η be a stationary measure and $\varphi_n : N \to \mathbb{R}$, n ≥ 1, be a monotone sequence of η-stationary functions converging η-almost everywhere to some bounded measurable function ϕ : N → R. Check that ϕ is η-stationary.

Exercise 5.8 Let φ : M → R be a bounded measurable function and C ∈ R.
(1) Show that if {x ∈ M : φ(x) > C} has positive μ-measure then there exist $k, x_0, \ldots, x_{k-1}$ such that $\int \phi(x)\, d\mu(x_k, x_{k+1}, \ldots) > C$.
(2) Deduce that if $\int \phi(x)\, d\mu(x_k, x_{k+1}, \ldots) = C$ for every $k, x_0, \ldots, x_{k-1}$ then φ(x) = C for μ-almost every x ∈ M.

Exercise 5.9 Check that the projection to N of a probability measure m invariant under a random transformation F : M × N → M × N need not be a stationary measure for F.

Exercise 5.10 Check that a stationary measure η is ergodic if and only if it is extremal; that is, if it cannot be written as a convex combination $\eta = \alpha_1 \eta_1 + \alpha_2 \eta_2$ where $\alpha_1, \alpha_2 > 0$ with $\alpha_1 + \alpha_2 = 1$, and the probability measures $\eta_1$, $\eta_2$ are stationary and mutually singular (two measures $\eta_1$ and $\eta_2$ are mutually singular if there is a measurable subset A such that $\eta_1(A) = 0$ and $\eta_2(A^c) = 0$).


Exercise 5.11 Let F : M × N → M × N be a one-sided random transformation and η be a stationary measure. Let $\int \xi\, d\hat\eta(\xi)$ be an ergodic decomposition of η. Show that $\int (\mu \times \xi)\, d\hat\eta(\xi)$ is an ergodic decomposition of the F-invariant probability measure m = μ × η.

Exercise 5.12 Let $(X_i, d_i)$, i ∈ I, be separable, complete metric spaces, where $I \subset \mathbb{Z}$. Let X be the cartesian product $\prod_{i \in I} X_i$, endowed with the product topology. Show that X is a separable space and

\[ d(x, y) = \sum_{i \in I} 2^{-|i|} \min\{1, d_i(x_i, y_i)\}, \quad x, y \in X, \tag{5.21} \]

defines a complete metric, compatible with the product topology on X.

Exercise 5.13 Let $(X_i, d_i)$, i ∈ I, be metric spaces, where $I \subset \mathbb{Z}$, and let the cartesian product $X = \prod_{i \in I} X_i$ be endowed with the distance (5.21). For any bounded uniformly continuous function ϕ : X → R and any ε > 0, find a finite set J ⊂ I and a bounded uniformly continuous function ψ : X → R such that sup |ϕ − ψ| < ε and ψ(x) depends only on the coordinates $x_j$ with j ∈ J.

Exercise 5.14 Let $(X_i, d_i)$, i ∈ I, be separable metric spaces, where $I \subset \mathbb{Z}$, and let the cartesian product $X = \prod_{i \in I} X_i$ be endowed with the distance (5.21). Prove that:
(1) If I is finite then the following is true relative to the uniform topology, the pointwise topology and the weak∗ topology: if $(p_{n,i})_n$ converges to $p_i$ for every i ∈ I then $(p_n = \prod_{i \in I} p_{n,i})_n$ converges to $p = \prod_{i \in I} p_i$. Indeed:
(a) $\|p_n - p\| \le \sum_{i \in I} \|p_{n,i} - p_i\|$;
(b) $\{E \subset X : p_n(E) \to p(E)\}$ is a monotone class and contains every finite union of measurable rectangles $\prod_{i \in I} E_i$;
(c) for any bounded uniformly continuous ϕ : X → R and ε > 0 there is a countable partition $\{R_k : k \ge 1\}$ of X into rectangles with $p(\partial R_k) = 0$ and |ϕ(z) − ϕ(w)| ≤ ε if z and w are in the same $R_k$.
(2) When I is infinite, the statement remains true for the weak∗ topology (use Exercise 5.13) but, in general, not for the pointwise topology nor the uniform topology: take $X_i = [0, 1]$ and $p_{n,i} = (1 - n^{-1})\delta_0 + n^{-1}\delta_1$.

Exercise 5.15 Consider the situation in Example 5.1. Let η be a probability measure on N. Show that:
(1) η is forward stationary if and only if $\eta(B) = \sum_{i=1}^m p_i\, \eta(A_i^{-1}(B))$ for every measurable B ⊂ N;
(2) η is backward stationary if and only if $\eta(B) = \sum_{i=1}^m p_i\, \eta(A_i(B))$ for every measurable B ⊂ N.


The sets of backward and forward stationary measures may be disjoint.

Exercise 5.16 Let φ : M → R be a bounded measurable function and

\[ \phi_k : M \to \mathbb{R}, \quad \phi_k(x) = \int \phi(x)\, d\mu(x_k, x_{k+1}, \ldots) \]

for each k ≥ 1. Use the martingale convergence theorem to prove that $(\phi_k(x))_k$ converges to φ(x) for μ-almost every x.

Exercise 5.17 Prove that an F-invariant probability measure m on M × N is
(1) an s-state if and only if $m = \mu^- \times m^+$ for some probability measure $m^+$ on $M^+ \times N$; then $m^+ = \Pi^+_* m$ and it is $F^+$-invariant;
(2) a u-state if and only if $m = \mu^+ \times m^-$ for some probability measure $m^-$ on $M^- \times N$; then $m^- = \Pi^-_* m$ and it is $F^-$-invariant;
(3) an su-state if and only if m = μ × η where η, the projection of m to N, is both forward stationary and backward stationary.

Exercise 5.18 Check that the conditions (5.11) and (5.12) are preserved by, respectively, backward and forward iteration under the random transformation. Namely,
(1) if m is a probability measure on M × N that projects down to μ and satisfies (5.11), then the same is true for $F_*^{-1} m$;
(2) if m is a probability measure on M × N that projects down to μ and satisfies (5.12), then the same is true for $F_* m$.

Exercise 5.19 Prove the following.
(1) The space $\mathcal{M}(\mu)$ of probability measures on M × N that project down to μ is non-empty and compact for the weak∗ topology.
(2) The push-forward $F_* : \mathcal{M}(\mu) \to \mathcal{M}(\mu)$ is continuous.
(3) The subset of measures $m \in \mathcal{M}(\mu)$ satisfying (5.12) is invariant under $F_*$ and is closed in the weak∗ topology.
(4) Any Cesàro limit of the forward iterates of any element of this subset is an F-invariant probability measure and a u-state for F.
(5) A dual construction yields s-states.

Exercise 5.20 Let m be a probability measure on M × N that projects down to an f-invariant probability measure μ. Prove that:
(1) if f : M → M is invertible (two-sided shift) then m is F-invariant if and only if $(F_x)_* m_x = m_{f(x)}$ for μ-almost every x ∈ M;


(2) if f : M → M is a one-sided shift then m is F-invariant if and only if

\[ m_{f(x)} = \int (F_{y_0})_* m_{(y_0, x_1, \ldots, x_n, \ldots)}\, dp(y_0) \]

for μ-almost every $x = (x_0, x_1, \ldots, x_n, \ldots) \in M$.

Exercise 5.21 Let m be the lift of an $F^+$-invariant probability measure $m^+$. Show that m is an s-state if and only if any of the following conditions holds:
(1) $m = \mu^- \times m^+$;
(2) $m_x = m^+_{x^+}$ for μ-almost every x ∈ M, where $x^+ = \pi^+(x)$;
(3) $(F_{x^+})_* m^+_{x^+} = m^+_{f^+(x^+)}$ for $\mu^+$-almost every $x^+ \in M^+$;
where $\{m_x : x \in M\}$ and $\{m^+_{x^+} : x^+ \in M^+\}$ are disintegrations of m and $m^+$.

Exercise 5.22 Prove that if m is an F-invariant probability measure on M × N that projects down to μ then its ergodic components also project down to μ.

Exercise 5.23 Prove that if m is an s-state (respectively, a u-state) then its ergodic components are s-states (respectively, u-states).

Exercise 5.24 Let $F : M \times \mathbb{R}^2 \to M \times \mathbb{R}^2$ be an invertible locally constant linear cocycle over a two-sided Bernoulli shift. Show that if F is pinching and twisting then the projective cocycle PF admits a unique u-state and a unique s-state, and they are distinct. Conclude that the forward stationary measure and the backward stationary measure are also unique and distinct.


6 Exponents and invariant measures

Downloaded from University Publishing Online. This is copyrighted material http://dx.doi.org/10.1017/CBO9781139976602.007

The main theme in the present chapter is that the Lyapunov exponents and Oseledets subspaces of a linear cocycle F may be retrieved from the invariant measures and the stationary measures of the associated projective cocycle PF. This theme will be substantiated by three important results.

A theorem of Ledrappier [80], which we discuss in Section 6.1, states that the Lyapunov exponents of F coincide with the integrals of the function

\[ \Phi : M \times P\mathbb{R}^d \to \mathbb{R}, \quad \Phi(x, [v]) = \log \frac{\|A(x)v\|}{\|v\|} \tag{6.1} \]

with respect to the ergodic invariant probability measures of PF. Moreover, any such probability measure is concentrated on the corresponding Oseledets sub-bundle.

When the cocycle is strongly irreducible and locally constant, the largest Lyapunov exponent is given by the integral of Φ with respect to any PF-invariant measure of the form μ × η. That is the content of Furstenberg’s formula [54], which we present in Section 6.2.

Finally, Section 6.3 is devoted to Furstenberg’s [54] criterion for the extremal Lyapunov exponents of locally constant SL(2)-valued cocycles to coincide: if $\lambda_- = \lambda_+$ then either the cocycle lives in a compact subgroup or it leaves invariant some finite subset of $P\mathbb{R}^2$.

Throughout, $PF : M \times P\mathbb{R}^d \to M \times P\mathbb{R}^d$ is the projective cocycle associated with a linear cocycle $F : M \times \mathbb{R}^d \to M \times \mathbb{R}^d$ defined over an ergodic transformation f : (M, μ) → (M, μ) by a measurable function A : M → GL(d). We always take M to be a separable complete metric space and μ to be a Borel measure on M such that $\log^+ \|A^{\pm 1}\| \in L^1(\mu)$. The extremal Lyapunov exponents are denoted, indifferently, as $\lambda_\pm(F, \mu)$ or $\lambda_\pm(A, \mu)$.

For some of the results we take (f, μ) to be a Bernoulli shift and A to be locally constant; that is, such that A(x) depends only on the zeroth coordinate of x ∈ M. Then F and PF are random transformations, in the sense of the previous chapter.
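The link between Φ and Lyapunov exponents rests on a telescoping identity: the sum of Φ along the first n points of the PF-orbit of (x, [v]) equals $\log(\|A^n(x)v\| / \|v\|)$. The sketch below checks this numerically for a locally constant cocycle; the two matrices, the random word standing for $x_0, \ldots, x_{29}$ and the vector v are assumptions chosen for illustration, with positive entries so that no cancellation occurs in floating point.

```python
import math
import random

# Hypothetical values of A on a two-symbol alphabet (both matrices are invertible).
MATRICES = (((2.0, 1.0), (1.0, 1.0)), ((1.0, 3.0), (0.5, 1.0)))

def mat_vec(a, v):
    """Product of a 2x2 matrix with a vector."""
    return (a[0][0] * v[0] + a[0][1] * v[1], a[1][0] * v[0] + a[1][1] * v[1])

def phi(a, v):
    """Phi(x, [v]) = log(|A(x) v| / |v|); it depends only on the direction of v."""
    return math.log(math.hypot(*mat_vec(a, v)) / math.hypot(*v))

def birkhoff_sum(word, v):
    """Sum of Phi along the projective orbit of [v]; v is renormalised at each step."""
    total = 0.0
    for symbol in word:
        a = MATRICES[symbol]
        total += phi(a, v)
        w = mat_vec(a, v)
        r = math.hypot(*w)
        v = (w[0] / r, w[1] / r)
    return total

def log_growth(word, v):
    """log(|A^n(x) v| / |v|), computed from the unnormalised product."""
    w = v
    for symbol in word:
        w = mat_vec(MATRICES[symbol], w)
    return math.log(math.hypot(*w) / math.hypot(*v))

rng = random.Random(2)
word = [rng.randrange(2) for _ in range(30)]
v = (0.3, 1.7)
```

Dividing either quantity by `len(word)` gives a finite-time estimate of a Lyapunov exponent along this particular orbit; the renormalised version is the one that stays numerically safe for long words.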

6.1 Representation of Lyapunov exponents

Let $\lambda_1 > \cdots > \lambda_k$ be the Lyapunov exponents of the linear cocycle F (keep in mind that we take (f, μ) to be ergodic) and then let $\mathbb{R}^d = V^1_x \supset \cdots \supset V^k_x \supset \{0\}$ be the Oseledets flag of F (Theorem 4.1). When the cocycle is invertible, $\mathbb{R}^d = E^1_x \oplus \cdots \oplus E^k_x$ denotes the Oseledets decomposition of F (Theorem 4.2).

Theorem 6.1 (Ledrappier) Given any PF-invariant ergodic probability measure m on $M \times P\mathbb{R}^d$ that projects down to μ, there exists j ∈ {1, . . . , k} such that

\[ \int \Phi\, dm = \lambda_j \quad \text{and} \quad m\big(\{(x, [v]) : v \in V^j_x \setminus V^{j+1}_x\}\big) = 1. \tag{6.2} \]

Conversely, given j ∈ {1, . . . , k} there is a PF-invariant ergodic probability measure m projecting to μ and satisfying (6.2). When F is invertible, one may replace $V^j_x \setminus V^{j+1}_x$ by $E^j_x$ in (6.2).

Proof Let m be PF-invariant and ergodic. On the one hand, by ergodicity,

\[ \lim_n \frac{1}{n} \log \frac{\|A^n(x)v\|}{\|v\|} = \lim_n \frac{1}{n} \sum_{i=0}^{n-1} \Phi \circ PF^i(x, [v]) = \int \Phi\, dm \tag{6.3} \]

for m-almost every (x, [v]). On the other hand, for μ-almost every x ∈ M and every $[v] \in P\mathbb{R}^d$, the expression on the left-hand side of (6.3) converges to some Lyapunov exponent $\lambda_j$. It follows that $\int \Phi\, dm = \lambda_j$ and m gives full weight to the set $\{(x, [v]) \in M \times P\mathbb{R}^d : v \in V^j_x \setminus V^{j+1}_x\}$ of pairs (x, [v]) for which the limit in (6.3) is $\lambda_j$. When F is invertible, one can take both limits n → ±∞ in (6.3). We conclude that m gives full weight to the set $\{(x, [v]) \in M \times P\mathbb{R}^d : v \in E^j_x\}$ of pairs (x, [v]) for which the limit in (6.3) is $\lambda_j$ for both n → +∞ and n → −∞.

To prove the converse statement we need a few auxiliary lemmas. Let $\mathcal{M}(\mu)$ denote the space of probability measures on $M \times P\mathbb{R}^d$ that project down to μ.

Lemma 6.2 The push-forward $F_* : \mathcal{M}(\mu) \to \mathcal{M}(\mu)$ is well defined and continuous relative to the weak∗ topology.


Proof Let $(m_n)_n$ be a sequence in $\mathcal{M}(\mu)$ converging, in the weak$^*$ sense, to some probability measure $m$. We want to show that $F_*m_n$ converges to $F_*m$; that is,
\[
\int (\varphi\circ F)\,dm_n \to \int (\varphi\circ F)\,dm \tag{6.4}
\]
for every bounded continuous function $\varphi : M\times P\mathbb{R}^d\to\mathbb{R}$. Since $f : M\to M$ and $A : M\to GL(d)$ are measurable, we can use Lusin's theorem to find, for every $\varepsilon>0$, a compact set $K\subset M$ such that $\mu(K^c)<\varepsilon$ and the restriction of $F$ to $L = K\times P\mathbb{R}^d$ is continuous. Then $\varphi\circ F$ is continuous restricted to $L$ and, by the Tietze extension theorem, there is some continuous function $\psi : M\times P\mathbb{R}^d\to\mathbb{R}$ such that $\sup|\psi|\le\sup|\varphi|$ and $\psi = \varphi\circ F$ restricted to $L$. It follows that
\[
\Big|\int(\varphi\circ F)\,dm_n - \int(\varphi\circ F)\,dm\Big| \le \Big|\int\psi\,dm_n - \int\psi\,dm\Big| + \Big|\int_{L^c}\psi\,dm_n - \int_{L^c}\psi\,dm\Big| + \Big|\int_{L^c}(\varphi\circ F)\,dm_n - \int_{L^c}(\varphi\circ F)\,dm\Big|.
\]
The first term on the right-hand side is smaller than $\varepsilon$ if $n$ is large enough, because $m_n\to m$ and $\psi$ is continuous. The remaining terms are bounded by
\[
2\sup|\varphi|\,\big(m_n(L^c) + m(L^c)\big) = 4\sup|\varphi|\,\mu(K^c) \le 4\sup|\varphi|\,\varepsilon.
\]
This proves (6.4), and so the argument is complete.

Lemma 6.3 $\mathcal{M}(\mu)$ is sequentially compact, relative to the weak$^*$ topology.

Proof We are going to use the fact that every Borel measure in a separable, completely metrizable space is tight (see [114, Appendix A.3]). This means that for any $\varepsilon>0$ we may find a compact set $K\subset M$ such that $\mu(K) > 1-\varepsilon$. Then $K\times P\mathbb{R}^d$ is compact and $m(K\times P\mathbb{R}^d) = \mu(K) > 1-\varepsilon$ for every $m\in\mathcal{M}(\mu)$. This proves that the set $\mathcal{M}(\mu)$ is tight. Hence, by the theorem of Prohorov (see [114, Theorem 2.1.8]), it is sequentially compact.

Lemma 6.4 Let $x\mapsto V_x$ be a measurable sub-bundle of $M\times\mathbb{R}^d$. Then the subset of probability measures $m\in\mathcal{M}(\mu)$ such that $m(\{(x,[v]) : v\in V_x\}) = 1$ is closed in the weak$^*$ topology.

Proof Let $(m_n)_n$ be a sequence in $\mathcal{M}(\mu)$ such that
\[
m_n(\{(x,[v]) : v\in V_x\}) = 1 \quad\text{for all } n,
\]
and which converges to some $m$ in the weak$^*$ topology. We want to show that $m(\{(x,[v]) : v\in V_x\}) = 1$. Since the sub-bundle $V$ is assumed to be measurable, for every $\varepsilon>0$ one can use Lusin's theorem to find a compact set $K\subset M$ with $\mu(K) > 1-\varepsilon$ and such that the restriction of $V$ to $K$ is continuous. Then $\{(x,[v])\in K\times P\mathbb{R}^d : v\in V_x\}$ is closed and so its $m$-measure is bounded below by
\[
\limsup_n m_n(\{(x,[v])\in K\times P\mathbb{R}^d : v\in V_x\}) \ge \mu(K) > 1-\varepsilon.
\]

Since $\varepsilon$ is arbitrary, this proves that $m(\{(x,[v]) : v\in V_x\}) = 1$.

We are now ready to conclude the proof of Theorem 6.1. Let $j$ be fixed. Let us start with the invertible case. Since the Oseledets sub-bundle $x\mapsto E_x^j$ is measurable, it follows from Exercise 4.1 that it admits a measurable section; that is, there exists a measurable vector field $x\mapsto\sigma(x)$ such that $\sigma(x)\in E_x^j$ for every $x$. Let $m_0$ be the probability measure in $\mathcal{M}(\mu)$ that admits $\delta_{\sigma(x)}$ as a disintegration. In other words,
\[
m_0(B) = \mu\big(\{x\in M : (x,\sigma(x))\in B\}\big)
\]
for every measurable $B\subset M\times P\mathbb{R}^d$. Then define, for $n\ge1$,
\[
m_n = \frac1n\sum_{i=0}^{n-1} (PF)^i_* m_0.
\]
It is clear that $m_n(\{(x,[v]) : v\in E_x^j\}) = 1$ for every $n\ge0$. Then, by Lemma 6.4, the same is true for every accumulation point of the sequence $(m_n)_n$. Using Lemma 6.2, every accumulation point is a $PF$-invariant probability measure. By Lemma 6.3, accumulation points do exist. Considering ergodic components, one gets that there exists some $PF$-invariant ergodic probability measure $m$ such that $m(\{(x,[v]) : v\in E_x^j\}) = 1$. Then $\tilde\Phi(x,[v]) = \lambda_j$ for $m$-almost every $(x,[v])$, and so $\int\Phi\,dm = \int\tilde\Phi\,dm = \lambda_j$. This finishes the proof in the invertible case.

The general case can be deduced easily, through the following invertible extension of $F : M\times\mathbb{R}^d\to M\times\mathbb{R}^d$. Let (see [114, Section 2.4.2])
\[
\hat f : \hat M\to\hat M, \qquad (\dots, x_n, \dots, x_{-1}, x_0) \mapsto (\dots, x_n, \dots, x_{-1}, x_0, f(x_0))
\]
be the natural extension of $f : M\to M$, defined on the space of pre-orbits $\hat M = \{(x_n)_{n\le0} : f(x_n) = x_{n+1} \text{ for every } n<0\}$. Denote by $\pi : \hat M\to M$ the projection to the zeroth coordinate. There is a unique $\hat f$-invariant measure $\hat\mu$ on $\hat M$ such that $\pi_*\hat\mu = \mu$. Define $\hat A = A\circ\pi : \hat M\to GL(d)$ and then let $\hat F : \hat M\times\mathbb{R}^d\to\hat M\times\mathbb{R}^d$ be the linear cocycle defined by $\hat A$ over $\hat f$. Note that
\[
\int \log\|\hat A^{\pm1}\|\,d\hat\mu = \int \log\|A^{\pm1}\|\,d\mu.
\]


It is also clear that $F$ and $\hat F$ have the same Lyapunov exponents, and their Oseledets flags $(V^i)_i$ and $(\hat V^i)_i$ are related by
\[
\hat V^i_{\hat x} = V^i_{\pi(\hat x)} \quad\text{for every } \hat x\in\hat M \text{ and every } i.
\]
By the previous paragraph, there exists some $P\hat F$-invariant probability measure $\hat m$ on $\hat M\times P\mathbb{R}^d$ that projects down to $\hat\mu$ and satisfies $\hat m(\{(\hat x,[v]) : v\in\hat E^j_{\hat x}\}) = 1$. Let $m$ be the image of $\hat m$ under the map $\pi\times\mathrm{id} : \hat M\times P\mathbb{R}^d\to M\times P\mathbb{R}^d$. Then $m$ is a $PF$-invariant probability measure and
\[
m\big(\{(x,[v]) : v\in V_x^j\setminus V_x^{j+1}\}\big) = 1,
\]
because $\hat E^j_{\hat x}\subset V^j_{\pi(\hat x)}\setminus V^{j+1}_{\pi(\hat x)}$ for every $\hat x\in\hat M$. Considering ergodic components, one concludes that there exists some $PF$-invariant ergodic probability measure $m$ such that $m(\{(x,[v]) : v\in V_x^j\setminus V_x^{j+1}\}) = 1$. The other claim in (6.2) follows immediately, just as in the invertible case.

The following application of Theorem 6.1 will be useful later on. Take the linear cocycle $F : M\times\mathbb{R}^d\to M\times\mathbb{R}^d$ to be locally constant and, for the time being, invertible.

Proposition 6.5 For invertible locally constant cocycles,
(1) $\lambda_+(F,\mu) = \max\{\int\Phi\,dm : m \text{ a u-state for } PF\}$,
(2) $\lambda_-(F,\mu) = \min\{\int\Phi\,dm : m \text{ an s-state for } PF\}$.

Proof

It follows from Theorem 6.1 and Exercise 5.22 that
\[
\lambda_-(F,\mu) \le \int\Phi\,dm \le \lambda_+(F,\mu) \tag{6.5}
\]
for every $PF$-invariant probability measure $m$ that projects down to $\mu$. Thus, we need to show that the maximum is realized by some u-state, and the minimum is realized by some s-state.

Consider the Oseledets sub-bundle $E^1$ corresponding to the largest Lyapunov exponent $\lambda_1 = \lambda_+(F,\mu)$. As observed in Remark 4.3, the subspace $E_x^1$ depends only on the negative part $x^- = \pi^-(x)$ of the point $x$. So we may find a measurable section $x\mapsto\sigma(x)$ to the sub-bundle $E^1$ such that $\sigma(x)$ depends only on $x^-$. Let $m_0$ be the probability measure in $\mathcal{M}(\mu)$ that admits $\{\delta_{\sigma(x)} : x\in M\}$ as a disintegration. In other words,
\[
m_0(B) = \mu(\{x\in M : (x,\sigma(x))\in B\})
\]
for every measurable $B\subset M\times P\mathbb{R}^d$. Then define, for $n\ge1$,
\[
m_n = \frac1n\sum_{j=0}^{n-1} (PF)^j_* m_0.
\]


Using Lemma 6.2 one easily gets that every accumulation point of this sequence is a $PF$-invariant probability measure that projects down to $\mu$. Accumulation points do exist, by Lemma 6.3. By Exercise 5.18, every $m_n$ satisfies the u-state condition (5.12). Hence, according to Lemma 5.19, every accumulation point $m$ is a u-state for $PF$. Note also that $m_n(\{(x,[v]) : v\in E_x^1\}) = 1$ for every $n\ge0$. Then, in view of Lemma 6.4, the same is true for every accumulation point $m$. This implies that
\[
\lim_n \frac1n\sum_{j=0}^{n-1}\Phi(F^j(x,[v])) = \lambda_1
\]
for $m$-almost every $(x,[v])$, and so $\int\Phi\,dm = \lambda_1 = \lambda_+(F,\mu)$. This shows that $m$ realizes the maximum in (6.5), as required. A dual argument shows that the minimum in (6.5) is realized by some s-state.

Now consider the non-invertible cocycles $F^\pm$ and $PF^\pm$ associated with $F$ and $PF$, as introduced in Section 5.4. We have seen that every $PF$-invariant probability measure $m$ is the lift of some $PF^+$-invariant probability measure $m^+$. Note that
\[
\int_{M\times P\mathbb{R}^d}\Phi\,dm = \int_{M^+\times P\mathbb{R}^d}\Phi\,dm^+
\]
because $\Phi$ depends only on the zeroth coordinate. Moreover, $m$ is a u-state if and only if $m^+ = \mu^+\times\eta$ for some forward stationary measure $\eta$. In this case,
\[
\int_{M\times P\mathbb{R}^d}\Phi\,dm = \int_{M^+\times P\mathbb{R}^d}\Phi\,d(\mu^+\times\eta) = \int_{M\times P\mathbb{R}^d}\Phi\,d(\mu\times\eta).
\]

Similar remarks apply to $PF^-$-invariant probabilities and backward stationary measures. Thus, Proposition 6.5 immediately gives

Corollary 6.6 For invertible locally constant cocycles,
(a) $\lambda_+(F,\mu) = \max\{\int\Phi\,d(\mu\times\eta) : \eta \text{ forward stationary for } PF\}$,
(b) $\lambda_-(F,\mu) = \min\{\int\Phi\,d(\mu\times\eta) : \eta \text{ backward stationary for } PF\}$.

This also leads to the one-sided version of Proposition 6.5:

Proposition 6.7 For general locally constant cocycles,
\[
\lambda_+(F,\mu) = \max\Big\{\int\Phi\,d(\mu\times\eta) : \eta \text{ a stationary measure for } PF\Big\}.
\]

Proof This follows from the previous corollary applied to the invertible extension $\hat F : \hat M\times\mathbb{R}^d\to\hat M\times\mathbb{R}^d$ of the linear cocycle $F : M\times\mathbb{R}^d\to M\times\mathbb{R}^d$. More precisely, let $\hat f : \hat M\to\hat M$ be the two-sided shift on $\hat M = X^{\mathbb{Z}}$ and let $\hat A : \hat M\to GL(d)$ be given by $\hat A(\hat x) = A(\pi(\hat x))$, where $\pi : \hat M\to M$ is the canonical projection. Define $\hat F(\hat x, v) = (\hat f(\hat x), \hat A(\hat x)v)$. It is clear that $\lambda_+(F,\mu) = \lambda_+(\hat F,\hat\mu)$, where $\hat\mu = p^{\mathbb{Z}}$. Moreover, a probability measure $\eta$ on $P\mathbb{R}^d$ is stationary for $PF$ if and only if it is stationary for $P\hat F$.

6.2 Furstenberg's formula

In this section we take the linear cocycle $F : M\times\mathbb{R}^d\to M\times\mathbb{R}^d$ to be locally constant and the shift $f : (M,\mu)\to(M,\mu)$ to be one-sided.

6.2.1 Irreducible cocycles

A linear cocycle $F$ is irreducible if there is no proper subspace of $\mathbb{R}^d$ invariant under $A(x)$ for $\mu$-almost every $x\in M$; and it is strongly irreducible if there is no finite family of proper subspaces invariant by $A(x)$ for $\mu$-almost every $x$. Proposition 6.7 implies that the integral of the function (6.1) with respect to some $PF$-invariant probability measure of the form $m = \mu\times\eta$ is equal to the largest Lyapunov exponent. We are going to see that if one assumes strong irreducibility then the same is true for every invariant probability measure of the form $\mu\times\eta$. That also implies that for $j>1$ the measure $m$ in Theorem 6.1 cannot be a product measure $\mu\times\eta$.

Theorem 6.8 (Furstenberg's formula) If $F : M\times\mathbb{R}^d\to M\times\mathbb{R}^d$ is strongly irreducible then
\[
\lambda_+(F,\mu) = \int\Phi\,d(\mu\times\eta)
\]
for any stationary measure $\eta$ of the associated projective cocycle $PF$.

Proof Let $m = \mu\times\eta$. By the ergodic theorem, the Birkhoff average
\[
\Psi(x,[v]) = \lim_n \frac1n\log\frac{\|A^n(x)v\|}{\|v\|} = \lim_n \frac1n\sum_{i=0}^{n-1}\Phi\circ PF^i(x,[v])
\]
exists at $m$-almost every point, and it satisfies $\int\Psi\,dm = \int\Phi\,dm$. By the theorem of Oseledets, there exists $M_0\subset M$ such that $\mu(M_0) = 1$ and
\[
\lim_n \frac1n\log\frac{\|A^n(x)v\|}{\|v\|} = \lambda_1 \quad\text{for every } x\in M_0 \text{ and } v\notin V_x^2.
\]
So, it suffices to check that the set of pairs $(x,[v])$ with $v\in V_x^2$ has zero $m$-measure. For this we need the lemma that follows. A measure $\xi$ in a measurable space $(X,\mathcal{B})$ is non-atomic if $\xi(\{x\}) = 0$ for every $x\in X$.


Lemma 6.9 If the cocycle $F$ is strongly irreducible then $\eta(V) = 0$ for any proper projective subspace $V$ of $P\mathbb{R}^d$ and any $PF$-stationary measure $\eta$. In particular, every $PF$-stationary measure $\eta$ is non-atomic.

Proof Suppose that there is some proper projective subspace $V$ with $\eta(V)>0$. Let $d_0\ge1$ be the smallest dimension of such subspaces, $c$ be the maximum value of $\eta(V)$ over all subspaces of dimension $d_0$, and $\mathcal{V}$ be the family of all subspaces $V$ of dimension $d_0$ such that $\eta(V) = c$. Since subspaces are closed subsets of $P\mathbb{R}^d$, we have $\eta(V) = c$ for any accumulation point $V$ of any sequence $(V_n)_n$ of subspaces of dimension $d_0$ such that $\eta(V_n)\to c$. By compactness of the Grassmannian manifold $G(d_0,d)$, accumulation points do exist, and so the family $\mathcal{V}$ is non-empty. Moreover, $\mathcal{V}$ is finite: since $\eta$ is a probability measure and the elements of $\mathcal{V}$ are essentially disjoint (due to the choice of $d_0$) we have that $\#\mathcal{V}\le 1/c$. We write $\mathcal{V} = \{V_1,\dots,V_n\}$. Since
\[
c = \eta(V_i) = \int \eta\big(A(x)^{-1}(V_i)\big)\,d\mu(x)
\]
and $\eta(A(x)^{-1}(V_i))\le c$ for all $x\in M$, we must have $\eta(A(x)^{-1}(V_i)) = c$ for $\mu$-almost every $x$. In view of the choice of $\mathcal{V}$, this implies that $A(x)^{-1}(V_i)\in\mathcal{V}$ for $\mu$-almost every $x\in M$. This means that the finite family $\mathcal{V}$ is invariant under $A(x)$ for $\mu$-almost every $x$, which contradicts the strong irreducibility hypothesis. This contradiction proves the first part of the lemma. The last part is a special case (subspaces of dimension 1).

Hence, $\eta(V_x^2) = 0$ for every $x\in M$ and so $m(\{(x,[v]) : v\in V_x^2\}) = 0$. It follows that $\int\Phi\,dm = \int\Psi\,dm = \lambda_1$ for any $PF$-invariant probability measure of the form $m = \mu\times\eta$, as we wanted to prove.
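Furstenberg's formula can be explored numerically. The following Python sketch is only an illustration under stated assumptions: the two matrices, their weights, the number of bins and the number of iterations are hypothetical choices, not taken from the text. It approximates a stationary measure $\eta$ on $P\mathbb{R}^2$ by a histogram, iterating $\eta\mapsto\sum_i p_i (B_i)_*\eta$ from the uniform distribution, and then evaluates the double integral of $\Phi(B,[v]) = \log(\|Bv\|/\|v\|)$ against $\nu\times\eta$.

```python
import math

# Hypothetical example (not from the text): two symmetric matrices in SL(2),
# each chosen with probability 1/2.
MATRICES = [((2.0, 1.0), (1.0, 1.0)), ((1.0, 1.0), (1.0, 2.0))]
WEIGHTS = [0.5, 0.5]


def apply(B, v):
    """Return B v for a 2x2 matrix B stored as a pair of rows."""
    return (B[0][0] * v[0] + B[0][1] * v[1], B[1][0] * v[0] + B[1][1] * v[1])


def bin_center(k, bins):
    """Unit vector at the centre of the k-th bin of PR^2, identified with [0, pi)."""
    angle = math.pi * (k + 0.5) / bins
    return (math.cos(angle), math.sin(angle))


def bin_index(v, bins):
    """Index of the bin that contains the projective class [v]."""
    angle = math.atan2(v[1], v[0]) % math.pi
    return min(int(angle * bins / math.pi), bins - 1)


def stationary_measure(matrices, weights, bins=1000, steps=100):
    """Iterate eta -> sum_i p_i (B_i)_* eta on a histogram, starting from uniform."""
    targets = [[bin_index(apply(B, bin_center(k, bins)), bins) for k in range(bins)]
               for B in matrices]
    eta = [1.0 / bins] * bins
    for _ in range(steps):
        pushed = [0.0] * bins
        for p, target in zip(weights, targets):
            for k in range(bins):
                pushed[target[k]] += p * eta[k]
        eta = pushed
    return eta


def furstenberg_integral(matrices, weights, eta):
    """Approximate the integral of Phi(B, [v]) = log(|B v| / |v|) against nu x eta."""
    bins = len(eta)
    total = 0.0
    for k in range(bins):
        v = bin_center(k, bins)  # unit vector, so |v| = 1
        for p, B in zip(weights, matrices):
            total += eta[k] * p * math.log(math.hypot(*apply(B, v)))
    return total


eta = stationary_measure(MATRICES, WEIGHTS)
print(furstenberg_integral(MATRICES, WEIGHTS, eta))
```

The histogram is a crude discretization, so the printed value should be read as a rough approximation of $\lambda_+$ for this particular choice of matrices, not as a certified bound.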

6.2.2 Continuity of exponents for irreducible cocycles

Let $\mathcal{I}$ be the space of pairs $(A,p)$ where $A : X\to SL(d)$ is a measurable function and $p$ is a probability measure on $X$ such that $\log^+\|A^{\pm1}\|\in L^1(p)$ and $A$ is strongly irreducible: no finite family of subspaces of $\mathbb{R}^d$ is invariant under $A(x)$ for $p$-almost every $x$. For any $(A,p)\in\mathcal{I}$ we denote $\mu = p^{\mathbb{N}}$ and $\nu = A_*p$. Sometimes, we let ourselves view $A$ as a function $M\to SL(d)$ depending only on the zeroth coordinate. Then $\nu = A_*\mu$.

A sequence of measures $(\xi_k)_k$ in $GL(d)$ is called uniformly integrable if given $\varepsilon>0$, there exist $R>0$ and $k_0\ge1$ such that
\[
\int_{\{\|B\|>R\}} \log^+\|B^{\pm1}\|\,d\xi_k(B) < \varepsilon \tag{6.6}
\]
for all $k\ge k_0$ and for both choices of the sign. This is the case, for example, if there is some compact set $K\subset GL(d)$ such that $\xi_k(K) = 1$ for every $k$. Theorem 6.8 has the following interesting consequence:

Corollary 6.10 Let $(A,p)\in\mathcal{I}$ and $(A_k,p_k)$, $k\ge1$ be a sequence in $\mathcal{I}$ such that the probability measures $\nu_k = (A_k)_*p_k$ converge to $\nu = A_*p$ in the weak$^*$ topology and are uniformly integrable. Then $\lambda_+(A_k,\mu_k)$, $k\ge1$ converges to $\lambda_+(A,\mu)$ as $k\to\infty$, where $\mu_k = p_k^{\mathbb{N}}$.

Proof For each $k\ge1$, define $F_k(x,v) = (f(x), A_k(x_0)v)$ and then let $\eta_k$ be any $F_k$-stationary measure. By Theorem 6.8,
\[
\lambda_+(A_k,\mu_k) = \int\!\!\int \Phi(B,[v])\,d\nu_k(B)\,d\eta_k([v]), \tag{6.7}
\]
where $\Phi(B,[v]) = \log\big(\|B(v)\|/\|v\|\big)$. Up to restricting to a subsequence of an arbitrary subsequence, we may suppose that $\eta_k$ converges to some probability measure $\eta$ on $P\mathbb{R}^d$. Note that $\eta$ is $F$-stationary, by Proposition 5.9. Thus, Theorem 6.8 also gives
\[
\lambda_+(A,\mu) = \int\!\!\int \Phi(B,[v])\,d\nu(B)\,d\eta([v]). \tag{6.8}
\]

So, we only have to show that the right-hand side of (6.7) converges to the right-hand side of (6.8) when $k\to\infty$. Let $\varepsilon>0$. Since $\log^+\|A^{\pm1}\|\in L^1(p)$, we may find $R>0$ such that
\[
\int_{\{\|B\|>R\}} \log^+\|B^{\pm1}\|\,d\nu(B) = \int_{\{\|A(x)\|>R\}} \log^+\|A(x)^{\pm1}\|\,dp(x) < \varepsilon.
\]
The assumption of uniform integrability ensures that, increasing $R$ if necessary, the same holds for $\nu_k$ if $k$ is sufficiently large. Now observe that
\[
|\Phi(B,[v])| \le \log^+\|B\| + \log^+\|B^{-1}\| \quad\text{for every } (B,[v])\in GL(d)\times P\mathbb{R}^d.
\]
Thus,
\[
\Big|\int\!\!\int_{\{\|B\|>R\}} \Phi\,d\nu_k\,d\eta_k - \int\!\!\int_{\{\|B\|>R\}} \Phi\,d\nu\,d\eta\Big| \le 4\varepsilon \tag{6.9}
\]
for every large $k$. Next, observe that $\nu_k\times\eta_k$ converges to $\nu\times\eta$ in the weak$^*$ topology (Exercise 5.14). Since the function $\Phi$ is continuous, using Exercise 6.1 we get that
\[
\Big|\int\!\!\int_{\{\|B\|\le R\}} \Phi\,d\nu_k\,d\eta_k - \int\!\!\int_{\{\|B\|\le R\}} \Phi\,d\nu\,d\eta\Big| \le \varepsilon \tag{6.10}
\]
for every large $k$. Combining (6.7)–(6.10) we get that $|\lambda_+(A_k,\mu_k) - \lambda_+(A,\mu)| \le 5\varepsilon$ for every $k$ sufficiently large.


6.3 Theorem of Furstenberg

Let $F : M\times\mathbb{R}^2\to M\times\mathbb{R}^2$ be a locally constant linear cocycle defined by a measurable function $A : X\to SL(2)$ over a one-sided shift $f : M\to M$. We continue to denote $M = X^{\mathbb{N}}$ and $\mu = p^{\mathbb{N}}$ and $\nu = A_*p$. The result that follows was stated before, in Theorem 1.2, but this time the hypotheses are formulated directly in terms of the support of the measure $\nu$, rather than the monoid $\mathcal{B}$ generated by the support. Exercises 6.6 and 6.7 show that the two formulations are equivalent. In the next chapter we will discuss extensions of the statement to arbitrary dimensions.

Theorem 6.11 (Furstenberg) Assume that
(a) the support of $\nu$ is not contained in a compact subgroup of $SL(2)$ and
(b) there is no non-empty finite set $L\subset P\mathbb{R}^2$ such that $B(L) = L$ for every $B$ in the support of $\nu$.
Then the extremal Lyapunov exponents are distinct: $\lambda_- < 0 < \lambda_+$.

Recall that the support of a measure $\theta$ on a topological space is the (closed) subset $\operatorname{supp}\theta$ of points whose neighborhoods all have positive measure. It follows from the theorem (and Exercises 6.6 and 6.7) that there are four possibilities for $SL(2)$-cocycles:
(1) the cocycle is non-pinching: $\operatorname{supp}\nu$ is contained in some compact subgroup of $SL(2)$;
(2) the cocycle is non-twisting, with either one or two invariant subspaces: $\operatorname{supp}\nu$ is contained in a triangular subgroup or a diagonal subgroup;
(3) the cocycle is non-twisting, with an invariant set formed by two subspaces which are interchanged by some of the matrices;
(4) in all other cases, the Lyapunov exponents are distinct and non-zero.

This discussion is much refined by the following theorem of Thieullen [111], which classifies $SL(2)$-cocycles up to conjugacy. A function $\phi : M\to\mathbb{R}$ is cohomologous to zero mod $r$, for a given $r>0$, if there exists a measurable function $u : M\to\mathbb{R}$ such that
\[
\phi + u\circ f - u \in r\mathbb{Z} \quad\text{almost everywhere.}
\]
Given a set $E\subset M$, we define $R_E : M\to SL(2)$ by
\[
R_E(x) = \begin{cases} \text{rotation of } \pi/2 & \text{if } x\in E,\\ \mathrm{id} & \text{otherwise.} \end{cases} \tag{6.11}
\]

Theorem 6.12 (Thieullen) Let $(f,\mu)$ be ergodic. If $A : M\to SL(2)$ satisfies $\log^+\|A^{\pm1}\|\in L^1(\mu)$ then there exists $P : M\to SL(2)$ such that the function $B(x) = P^{-1}(f(x))\cdot A(x)\cdot P(x)$ has one of the following forms, $\mu$-almost everywhere:
(a) $B(x) = \begin{pmatrix} \cos\theta(x) & -\sin\theta(x)\\ \sin\theta(x) & \cos\theta(x)\end{pmatrix}$ with $\theta$ not cohomologous to zero mod $\pi$.
(b) $B(x) = \begin{pmatrix} a(x) & b(x)\\ 0 & 1/a(x)\end{pmatrix}$ with $\int\log|a|\,d\mu = 0$ and $\log|b|\in L^1(\mu)$.
(c) $B(x) = R_E(x)\begin{pmatrix} a(x) & 0\\ 0 & 1/a(x)\end{pmatrix}$ where $\log|a|\in L^1(\mu)$ and the characteristic function of $E\subset M$ is not cohomologous to zero mod 2.
(d) $B(x) = \begin{pmatrix} a(x) & 0\\ 0 & 1/a(x)\end{pmatrix}$ with $\lambda = \int\log|a|\,d\mu > 0$.
Moreover, $\|B(x)\|$ and $\|P(f(x))P^{-1}(x)\|$ are bounded above by $\|A(x)\|$. In particular, the Lyapunov exponents of $B$ are well defined and coincide with the Lyapunov exponents of $A$. The Lyapunov exponents vanish in cases (a)–(c), and they are $\pm\lambda$ in case (d).

In the remainder of the present section we prove Theorem 6.11.
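The dichotomy described above can be illustrated with a small Monte Carlo experiment. The Python sketch below is an illustration only; the two pairs of matrices, the equal weights, the step counts and the seed are hypothetical choices that do not come from the text. It estimates $\lambda_+$ as $\frac1n\log(\|A^n(x)v\|/\|v\|)$ along one random orbit. In the first pair both hypotheses of Theorem 6.11 hold; in the second pair the set $\{[e_1],[e_2]\}$ is invariant (a diagonal matrix and the rotation of $\pi/2$), so hypothesis (b) fails and the estimate is expected to be close to zero.

```python
import math
import random


def lyapunov_estimate(matrices, weights, n_steps, seed=0):
    """Estimate lambda_+ by (1/n) log(|A^n(x) v| / |v|) along one random orbit."""
    rng = random.Random(seed)
    v = (1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0))  # unit starting vector
    log_growth = 0.0
    for _ in range(n_steps):
        B = rng.choices(matrices, weights=weights)[0]
        w = (B[0][0] * v[0] + B[0][1] * v[1], B[1][0] * v[0] + B[1][1] * v[1])
        norm = math.hypot(w[0], w[1])
        log_growth += math.log(norm)      # Phi at the current point of PR^2
        v = (w[0] / norm, w[1] / norm)    # renormalise to avoid overflow
    return log_growth / n_steps


# Hypothetical data: both hypotheses of Theorem 6.11 hold for this pair.
TWISTING = [((1.0, 1.0), (0.0, 1.0)), ((1.0, 0.0), (1.0, 1.0))]
# Hypothetical data: L = {[e1], [e2]} is invariant, so hypothesis (b) fails.
NON_TWISTING = [((2.0, 0.0), (0.0, 0.5)), ((0.0, -1.0), (1.0, 0.0))]

print(lyapunov_estimate(TWISTING, [0.5, 0.5], 20000))
print(lyapunov_estimate(NON_TWISTING, [0.5, 0.5], 50000))
```

A single orbit gives only a statistical estimate; averaging over several seeds or increasing `n_steps` would reduce the fluctuations.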

6.3.1 Non-atomic measures

Let $B : \mathbb{R}^2\to\mathbb{R}^2$ be a non-zero linear map. When $B$ is invertible, we denote by $B_*\theta$ the push-forward of a probability measure $\theta$ under the projective action $[v]\mapsto[B(v)]$. When $B$ is non-invertible (of rank 1, necessarily) the projective action is defined at all points except $[\ker B]\in P\mathbb{R}^2$, and so $B_*\theta$ is still defined for every non-atomic probability measure $\theta$. In this case, $B_*\theta$ is the Dirac mass at the image $[B(\mathbb{R}^2)]\in P\mathbb{R}^2$.

Lemma 6.13 The map $(B,\theta)\mapsto B_*\theta$ is continuous, for $B$ in the space of non-zero linear maps of $\mathbb{R}^2$, with the coefficients topology, and $\theta$ in the space of non-atomic probability measures on $P\mathbb{R}^2$, with the weak$^*$ topology.

Proof Let $(B_n)_n$ converge to $B$ and $(\theta_n)_n$ converge to $\theta$. Consider any continuous function $\varphi : P\mathbb{R}^2\to\mathbb{R}$. Then $|\int(\varphi\circ B_n)\,d\theta_n - \int(\varphi\circ B)\,d\theta|$ is bounded by
\[
\Big|\int(\varphi\circ B_n)\,d\theta_n - \int(\varphi\circ B)\,d\theta_n\Big| + \Big|\int(\varphi\circ B)\,d\theta_n - \int(\varphi\circ B)\,d\theta\Big|. \tag{6.12}
\]


Suppose that $B$ is invertible. Then the projective action of $B_n$ converges to the projective action of $B$, uniformly on $P\mathbb{R}^2$. The first difference in (6.12) is bounded by $\sup|\varphi\circ B_n - \varphi\circ B|$, which goes to zero when $n\to\infty$. The second also goes to zero, because $\varphi\circ B$ is continuous and $\theta_n$ converges to $\theta$. Since $\varphi$ is arbitrary, this proves that $(B_n)_*\theta_n$ converges to $B_*\theta$ in the weak$^*$ topology.

Now suppose $B$ is non-invertible, of rank 1. Given $\varepsilon>0$, fix a closed neighborhood $U$ of $[\ker B]\in P\mathbb{R}^2$ such that $\theta(U)<\varepsilon$. Then $\theta_n(U)<\varepsilon$ for every large $n$. The difference $|\int(\varphi\circ B_n)\,d\theta_n - \int(\varphi\circ B)\,d\theta|$ is bounded by
\[
\Big|\int_{U^c}(\varphi\circ B_n)\,d\theta_n - \int_{U^c}(\varphi\circ B)\,d\theta\Big| + \Big|\int_{U}(\varphi\circ B_n)\,d\theta_n - \int_{U}(\varphi\circ B)\,d\theta\Big|. \tag{6.13}
\]
The projective action of $B_n$ converges to the projective action of $B$ uniformly on $U^c$. So the same argument as before shows that the first difference in (6.13) is less than $\varepsilon$ if $n$ is large enough. The second is bounded by $2\varepsilon\sup|\varphi|$. Since $\varepsilon$ is arbitrary, this proves that $(B_n)_*\theta_n$ converges to $B_*\theta$ also in this case.

We use $B^t : \mathbb{R}^d\to\mathbb{R}^d$ to denote the adjoint of a linear map $B : \mathbb{R}^d\to\mathbb{R}^d$: $B^t(u)\cdot v = u\cdot B(v)$ for every $u,v\in\mathbb{R}^d$.

Lemma 6.14 Let $\theta$ be a non-atomic probability measure on $P\mathbb{R}^2$ and $(B_n)_n$ be a sequence in $SL(2)$. The following conditions are equivalent:
(1) $\|B_n\|\to\infty$ as $n\to\infty$;
(2) every limit point of $(B_n)_*\theta$ in the weak$^*$ topology is a Dirac mass $\delta_{[v]}$.
If $(B_n)_*\theta\to\delta_{[v]}$ then $\|B_n^t(u)\|/\|B_n\| \to |u\cdot v|/\|v\|$ for every $u\in\mathbb{R}^2$.

Proof Suppose that $\|B_n\|$ does not converge to $\infty$. Then $(B_n)_n$ admits some subsequence converging to an invertible linear map $B : \mathbb{R}^2\to\mathbb{R}^2$. Then, by Lemma 6.13, the corresponding push-forwards of $\theta$ converge to $B_*\theta$, which is non-atomic. This shows that (2) implies (1).

Now let us prove that (1) implies (2) and the last conclusion in the statement. Suppose that $\|B_n\|\to\infty$. Up to restricting to a subsequence, we may suppose that $L_n = B_n/\|B_n\|$ converges to some $L : \mathbb{R}^2\to\mathbb{R}^2$. Note that $\|L\| = 1$ and $|\det L| = \lim_n 1/\|B_n\|^2 = 0$, and so $L$ has rank 1. Let $V = L(\mathbb{R}^2)$. By Lemma 6.13, the image $(B_n)_*\theta = (L_n)_*\theta$ converges to $\delta_V$. Now fix unit vectors $v\in V$ and $w\in V^\perp$. Any vector $u\in\mathbb{R}^2$ may be written as $u = (u\cdot v)v + (u\cdot w)w$. Then
\[
\frac{\|B_n^t u\|}{\|B_n\|} = \|L_n^t(u)\| \to \|L^t(u)\| = |u\cdot v|
\]
because $\|L^t(v)\| = 1$ and $L^t(w) = 0$.
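Lemma 6.14 can be checked on a simple example. The Python sketch below is an illustration under assumptions that are not in the text: the sequence $B_n = \operatorname{diag}(n, 1/n)$, the equidistributed sample standing in for a non-atomic measure $\theta$, and the tolerance `eps` are all hypothetical choices. It reports the fraction of the sample that $B_n$ sends within `eps` of $[e_1]$, which should approach 1, and the ratio $\|B_n^t u\|/\|B_n\|$, which should approach $|u\cdot e_1|$.

```python
import math


def operator_norm(B):
    """Largest singular value of a 2x2 matrix B stored as a pair of rows."""
    frob2 = sum(e * e for row in B for e in row)
    det = B[0][0] * B[1][1] - B[0][1] * B[1][0]
    disc = max(frob2 * frob2 - 4.0 * det * det, 0.0)
    return math.sqrt((frob2 + math.sqrt(disc)) / 2.0)


def mass_near(B, v, eps, samples=10000):
    """Fraction of an equidistributed sample of PR^2 sent by B to within eps of [v].

    The sample stands in for a non-atomic measure theta; the distance between
    two projective points is |sin| of the angle between representatives.
    """
    hits = 0
    for k in range(samples):
        angle = math.pi * (k + 0.5) / samples
        u = (math.cos(angle), math.sin(angle))
        w = (B[0][0] * u[0] + B[0][1] * u[1], B[1][0] * u[0] + B[1][1] * u[1])
        cross = abs(w[0] * v[1] - w[1] * v[0])
        if cross <= eps * math.hypot(*w) * math.hypot(*v):
            hits += 1
    return hits / samples


def adjoint_ratio(B, u):
    """Return |B^t u| / |B|, the quantity in the last claim of Lemma 6.14."""
    bt_u = (B[0][0] * u[0] + B[1][0] * u[1], B[0][1] * u[0] + B[1][1] * u[1])
    return math.hypot(*bt_u) / operator_norm(B)


def diag(n):
    """Hypothetical test sequence B_n = diag(n, 1/n) in SL(2)."""
    return ((float(n), 0.0), (0.0, 1.0 / n))


for n in (1, 10, 100):
    print(n, mass_near(diag(n), (1.0, 0.0), 0.01), adjoint_ratio(diag(n), (3.0, 4.0)))
```

For this diagonal sequence the limit direction is $[e_1]$ by inspection; for a general sequence one would first have to extract a convergent subsequence of $B_n/\|B_n\|$, as in the proof.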


Corollary 6.15 The stabilizer $H(\theta) = \{B\in SL(2) : B_*\theta = \theta\}$ of any non-atomic probability measure $\theta$ on $P\mathbb{R}^2$ is a compact subgroup of $SL(2)$.

Proof The fact that $H(\theta)$ is a subgroup of $SL(2)$ is clear from the definition and the fact that it is closed follows from Lemma 6.13. So, we only have to prove that the norm of the elements of $H(\theta)$ is bounded. That is a simple consequence of Lemma 6.14: if there exists a sequence $A_n\in H(\theta)$, $n\ge1$ with $\|A_n\|\to\infty$ then the sequence $(A_n)_*\theta$ accumulates on some Dirac mass; that is not possible because this sequence is constant equal to $\theta$, which is non-atomic. This contradiction proves that $H(\theta)$ is indeed bounded in norm.

6.3.2 Convergence to a Dirac mass

The adjoint $F^t$ of a linear cocycle $F(x,v) = (f(x), A(x)v)$ is the linear cocycle defined over the inverse $f^{-1} : M\to M$ by $F^t(x,v) = (f^{-1}(x), A^t(x)v)$, where $A^t(x) = A(f^{-1}(x))^t$. The $n$th iterate is given by $(x,v)\mapsto(f^{-n}(x), (A^t)^n(x)v)$, where
\[
(A^t)^n(x) = A^t(f^{-n+1}(x))\cdots A^t(f^{-1}(x))A^t(x) = A(f^{-n}(x))^t\cdots A(f^{-2}(x))^t A(f^{-1}(x))^t.
\]
The adjoint $F^t$ is not locally constant, because $A^t(x)$ depends on the first (not the zeroth) coordinate of $x$. However, $F^t$ is conjugate to the locally constant cocycle $F^\star(x,v) = (f^{-1}(x), A(x)^t v)$ by the transformation $(x,v)\mapsto(f^{-1}(x),v)$. We denote by $PF^t$ and $PF^\star$ the projective cocycles associated with $F^t$ and $F^\star$, respectively. The notions of stationary measures for $PF^t$ and $PF^\star$ coincide.

Let $\eta^t$ be any $PF^t$-stationary probability measure on $P\mathbb{R}^2$. We are going to analyze the iterates of $\eta^t$ under the matrices
\[
A^n(x)^t = A(x)^t\cdots A(f^{n-1}(x))^t = (A^t)^n(f^n(x)).
\]

Lemma 6.16 For $\mu$-almost every $x\in M$, there exists a probability measure $m_x$ on $P\mathbb{R}^2$ such that $A^n(x)^t_*\eta^t\to m_x$ in the weak$^*$ topology.

Proof This is analogous to Lemma 5.22; we just describe the main steps. Given any continuous function $\varphi : P\mathbb{R}^2\to\mathbb{R}$ and $n\ge0$, define
\[
\Psi_{\varphi,n}(x) = \int \varphi\circ A^n(x)^t\,d\eta^t.
\]

Let $\mathcal{F}_0 = \{\emptyset, M\}$ and, for each $n\ge1$, let $\mathcal{F}_n$ be the $\sigma$-algebra generated by the cylinders $[0 : \Delta_0,\dots,\Delta_{n-1}]$ with $\Delta_i\subset X$ a measurable set for $i = 0,\dots,n-1$. Then $\mathcal{F}_n\subset\mathcal{F}_{n+1}$ for every $n\ge0$. Since $A^n(x)^t$ depends only on $x_0,\dots,x_{n-1}$, so does the value of $\Psi_{\varphi,n}(x)$. Hence, every $\Psi_{\varphi,n}$ is $\mathcal{F}_n$-measurable. Moreover, for any cylinder $C\in\mathcal{F}_n$,
\[
\int_C \Psi_{\varphi,n+1}(x)\,d\mu(x) = \int_C\!\int \varphi\circ A^{n+1}(x)^t\,d\eta^t\,d\mu(x) = \int_C\!\int \varphi\circ A^n(x)^t\,d\eta^t\,d\mu(x)
\]
because $\eta^t$ is $PF^t$-stationary. This proves that $(\Psi_{\varphi,n},\mathcal{F}_n)$ is a martingale. It follows that there exists a measurable function $\Psi_\varphi : M\to\mathbb{R}$ such that $\Psi_\varphi(x) = \lim_n\Psi_{\varphi,n}(x)$ for every $x$ in some full $\mu$-measure set $M_0\subset M$. Using the fact that the space $C^0(P\mathbb{R}^2)$ is separable, one can choose the full measure subset $M_0$ to be the same for every continuous function $\varphi$. By the Riesz representation theorem, each linear map $\varphi\mapsto\Psi_\varphi(x)$, $x\in M_0$ defines a probability measure $m_x$ on $P\mathbb{R}^2$. By construction, $m_x$ is the weak$^*$ limit of $A^n(x)^t_*\eta^t$.

Lemma 6.17 Let $m_x$ be as in Lemma 6.16. Then $A^n(x)^t_* B^t_*\eta^t\to m_x$ for $\mu$-almost every $x$ and $\nu$-almost every $B\in SL(2)$.

Proof Given any continuous function $\varphi : P\mathbb{R}^2\to\mathbb{R}$ and any $n\ge0$, define $\Psi_{\varphi,n} : M\to\mathbb{R}$ as in the proof of Lemma 6.16:
\[
\Psi_{\varphi,n}(x) = \int \varphi\circ A^n(x)^t\,d\eta^t.
\]



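We claim that $\int\Psi_{\varphi,n}\Psi_{\varphi,n+1}\,d\mu = \int\Psi_{\varphi,n}^2\,d\mu$ for every $n\ge0$. Indeed, the expression on the left-hand side may be written as
\[
\int \Psi_{\varphi,n}(x)\int \varphi\circ A^n(x)^t\circ A(f^n(x))^t\,d\eta^t\,d\mu(x).
\]
Since $\Psi_{\varphi,n}(x)$ and $A^n(x)^t$ depend only on $x_0,\dots,x_{n-1}$, and $A(f^n(x))^t$ depends only on $x_n$, this is the same as
\[
\int\!\!\int \Psi_{\varphi,n}(x)\int \varphi\circ A^n(x)^t\circ A(f^n(x))^t\,d\eta^t\,dp(x_n)\,dp^n(x_0,\dots,x_{n-1}).
\]
Using the assumption that $\eta^t$ is $PF^t$-stationary, this expression becomes
\[
\int \Psi_{\varphi,n}(x)\int \varphi\circ A^n(x)^t\,d\eta^t\,dp^n(x_0,\dots,x_{n-1}) = \int \Psi_{\varphi,n}^2\,d\mu.
\]
That proves our claim. Now it follows that
\[
\int \big(\Psi_{\varphi,n+1} - \Psi_{\varphi,n}\big)^2\,d\mu = \int \big(\Psi_{\varphi,n+1}^2 - 2\Psi_{\varphi,n}\Psi_{\varphi,n+1} + \Psi_{\varphi,n}^2\big)\,d\mu = \int \Psi_{\varphi,n+1}^2\,d\mu - \int \Psi_{\varphi,n}^2\,d\mu.
\]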


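Therefore, by telescopic cancellation,
\[
\sum_{n=0}^{l-1}\int \big(\Psi_{\varphi,n+1} - \Psi_{\varphi,n}\big)^2\,d\mu = \int \Psi_{\varphi,l}^2\,d\mu - \int \Psi_{\varphi,0}^2\,d\mu \le \sup|\varphi|^2 \tag{6.14}
\]
for every $l\ge1$. The integral on the left-hand side may be rewritten as
\[
\int\!\!\int \Big(\int \varphi\circ A^n(x)^t A(f^n(x))^t\,d\eta^t - \int \varphi\circ A^n(x)^t\,d\eta^t\Big)^2 dp^n(x_0,\dots,x_{n-1})\,dp(x_n);
\]
that is,
\[
\int\!\!\int \Big(\int \varphi\circ A^n(x)^t B^t\,d\eta^t - \int \varphi\circ A^n(x)^t\,d\eta^t\Big)^2 dp^n(x_0,\dots,x_{n-1})\,d\nu(B)
= \int\!\!\int \Big(\int \varphi\circ A^n(x)^t B^t\,d\eta^t - \int \varphi\circ A^n(x)^t\,d\eta^t\Big)^2 d\mu(x)\,d\nu(B).
\]
Thus, (6.14) means that
\[
\sum_{n=0}^{\infty}\int\!\!\int \Big(\int \varphi\circ A^n(x)^t B^t\,d\eta^t - \int \varphi\circ A^n(x)^t\,d\eta^t\Big)^2 d\mu(x)\,d\nu(B) < \infty.
\]
By Borel–Cantelli, this implies that
\[
\lim_n \Big(\int \varphi\circ A^n(x)^t B^t\,d\eta^t - \int \varphi\circ A^n(x)^t\,d\eta^t\Big) = 0 \tag{6.15}
\]
for $(\mu\times\nu)$-almost every $(x,B)$. Taking the intersection of those full measure sets over all functions $\varphi$ in some countable dense subset of $C^0(P\mathbb{R}^2)$, we find a full $(\mu\times\nu)$-measure set of pairs $(x,B)$ such that (6.15) holds; that is,
\[
\lim_n \int \varphi\circ A^n(x)^t B^t\,d\eta^t = \int \varphi\,dm_x,
\]
for all continuous functions $\varphi : P\mathbb{R}^2\to\mathbb{R}$. This is the claim of the lemma.

Lemma 6.18 For $\mu$-almost any $x$ there is $V(x)\in P\mathbb{R}^2$ such that $m_x = \delta_{V(x)}$.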

Proof By Lemma 6.9, the $PF^t$-stationary measure $\eta^t$ is non-atomic. Hence, by Corollary 6.15, the stabilizer $H(\eta^t)$ is a compact subgroup of $SL(2)$. By Lemma 6.17, for $\mu$-almost every $x\in M$,
\[
\lim_n A^n(x)^t_* B^t_*\eta^t = \lim_n A^n(x)^t_*\eta^t = m_x \quad\text{for } \nu\text{-almost every } B.
\]
Fix $x$ and let $P$ be any accumulation point of the sequence $A^n(x)^t/\|A^n(x)^t\|$. Then $\|P\| = 1$ and, by Lemma 6.13, the previous relation implies
\[
P_* B^t_*\eta^t = P_*\eta^t = m_x \quad\text{for } \nu\text{-almost every } B.
\]
If $P$ were invertible then $B^t_*\eta^t = \eta^t$ for $\nu$-almost every $B$, contradicting the fact that the stabilizer of $\eta^t$ is compact. Thus, $P$ has rank 1 and so $m_x = P_*\eta^t$ is the Dirac measure $\delta_{V(x)}$ with $V(x) = P(\mathbb{R}^2)$.
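The convergence in Lemmas 6.16 and 6.18 can be observed numerically. The Python sketch below is an illustration only, under assumptions that are not in the text: the two matrices, the equal weights and the seed are hypothetical, and the top left-singular direction of the product is used as a stand-in for the point where the push-forward measure concentrates. It forms $A^n(x)^t = A(x_0)^t\cdots A(x_{n-1})^t$ for one random word and compares the dominant image directions at times $n$ and $2n$.

```python
import math
import random

# Hypothetical example data (not from the text): two matrices in SL(2).
MATS = [((2.0, 1.0), (1.0, 1.0)), ((1.0, 1.0), (1.0, 2.0))]


def mat_mul(P, Q):
    """Product P Q of two 2x2 matrices stored as pairs of rows."""
    return tuple(
        tuple(sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def transpose(B):
    """Transpose of a 2x2 matrix stored as a pair of rows."""
    return ((B[0][0], B[1][0]), (B[0][1], B[1][1]))


def image_direction(matrices, weights, n, seed=0):
    """Unit vector spanning the dominant image direction of A^n(x)^t.

    Here A^n(x)^t = A(x_0)^t ... A(x_{n-1})^t for a random word x_0, ..., x_{n-1},
    and the direction returned is the top left-singular vector of that product.
    """
    rng = random.Random(seed)
    P = ((1.0, 0.0), (0.0, 1.0))
    for _ in range(n):
        B = rng.choices(matrices, weights=weights)[0]
        P = mat_mul(P, transpose(B))          # new factors enter on the right
        scale = max(abs(e) for row in P for e in row)
        P = tuple(tuple(e / scale for e in row) for row in P)  # avoid overflow
    S = mat_mul(P, transpose(P))              # symmetric matrix P P^t
    angle = 0.5 * math.atan2(2.0 * S[0][1], S[0][0] - S[1][1])
    return (math.cos(angle), math.sin(angle))


def projective_distance(u, w):
    """|sin| of the angle between the lines spanned by u and w."""
    return abs(u[0] * w[1] - u[1] * w[0]) / (math.hypot(*u) * math.hypot(*w))


for n in (5, 10, 20, 40):
    print(n, projective_distance(image_direction(MATS, [0.5, 0.5], n),
                                 image_direction(MATS, [0.5, 0.5], 2 * n)))
```

Because the same seed is used, the word of length $2n$ extends the word of length $n$, so the printed distances compare two stages of the same sequence $A^n(x)^t$.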


6.3.3 Proof of Theorem 6.11

Let $\eta$ be any $PF$-stationary measure. By Theorem 6.8, the largest Lyapunov exponent $\lambda_+$ is given by
\[
\lambda_+ = \int \Phi\,d\eta\,d\mu. \tag{6.16}
\]
Our goal is to show that the integral is positive. Let $V(x)$ be as in Lemma 6.18.

Corollary 6.19 $\|A^n(x)u\|\to\infty$ for every $u\notin V(x)^\perp$ and $\mu$-almost every $x\in M$.

Proof We have seen in Lemma 6.18 that $A^n(x)^t_*\eta^t$ converges to a Dirac mass, at $\mu$-almost every point. Write $V(x) = [v]$. Then Lemma 6.14 gives that
\[
\|A^n(x)^t\|\to\infty \quad\text{and}\quad \frac{\|A^n(x)u\|}{\|A^n(x)^t\|} \to \frac{|u\cdot v|}{\|v\|}
\]
for every $u$. In particular, $\|A^n(x)u\|\to\infty$ for every $u\notin V(x)^\perp$.

Since $\eta$ is non-atomic, by Lemma 6.9, the set $\{(x,V(x)^\perp) : x\in M\}$ has zero $\mu\times\eta$-measure. So, Corollary 6.19 implies that
\[
\lim_n \sum_{j=0}^{n-1} \Phi(F^j(x,[v])) = \lim_n \log\frac{\|A^n(x)v\|}{\|v\|} = \infty \tag{6.17}
\]

for $\mu\times\eta$-almost every $(x,[v])$. We need the following abstract result:

Lemma 6.20 Let $(X,\mathcal{X},m)$ be a probability space and $T : X\to X$ a measure-preserving map. If $\varphi\in L^1(X,m)$ is such that
\[
\lim_n \sum_{j=0}^{n-1} \varphi(T^j(x)) = +\infty \quad\text{for } m\text{-almost every } x\in X,
\]
then $\int\varphi\,dm > 0$.

Proof Let $S_n = \sum_{j=0}^{n-1}\varphi\circ T^j$ and, for each fixed $\varepsilon>0$, define
\[
A_\varepsilon = \{x\in X : S_n(x) > \varepsilon \text{ for all } n\ge1\} \quad\text{and}\quad B_\varepsilon = \bigcup_{k\ge0} T^{-k}(A_\varepsilon).
\]
We claim that if $S_n(x)\to\infty$ then $x$ belongs to $B_\varepsilon$ for some $\varepsilon>0$. Indeed, suppose otherwise. Then $x\notin B_{1/l^2}$ for any $l\ge1$. This means that $T^k(x)\notin A_{1/l^2}$ for any $k\ge0$ and $l\ge1$. Equivalently, for any $k\ge0$ and $l\ge1$ there exists $n\ge1$ such that $S_n(T^k(x))\le 1/l^2$. Then we may construct a sequence $(n_l)_l$ with $n_0 = 0$ and
\[
S_{n_l}\big(T^{n_0+\cdots+n_{l-1}}(x)\big) \le 1/l^2 \quad\text{for every } l\ge1.
\]


By construction, $S_{n_1+\cdots+n_l}(x) \le 1 + 1/4 + \cdots + 1/l^2$ does not converge to $\infty$. This proves our claim. It follows from the hypothesis of the lemma that the union of all $B_\varepsilon$, $\varepsilon>0$ has full measure.

Now let $\psi$ be the Birkhoff average of $\varphi$ and let $\tau_\varepsilon$ be the sojourn time in $A_\varepsilon$; that is, the Birkhoff average of the characteristic function $\mathcal{X}_{A_\varepsilon}$. The hypothesis of the lemma implies that $\psi(x)\ge0$ for almost every $x$, and so $\int\varphi\,dm = \int\psi\,dm \ge 0$. Suppose, for contradiction, that the integrals vanish. Then $\psi = 0$ almost everywhere. Given $x\in B_\varepsilon$, let $k\ge0$ be the smallest integer such that $T^k(x)\in A_\varepsilon$. Then
\[
\sum_{j=0}^{n-1}\varphi(T^j x) \ge \sum_{j=0}^{k-1}\varphi(T^j x) + \sum_{j=k}^{n-1}\varepsilon\,\mathcal{X}_{A_\varepsilon}(T^j x), \quad\text{for all } n\ge1.
\]
Dividing by $n$ and sending $n\to\infty$, we obtain $0 = \psi(x) \ge \varepsilon\tau_\varepsilon(x)$ for every $x\in B_\varepsilon$. It is clear that $\tau_\varepsilon(x) = 0$ for every $x\notin B_\varepsilon$. So, we conclude that $m(A_\varepsilon) = \int\tau_\varepsilon\,dm = 0$. Then $m(B_\varepsilon)$ also vanishes, for every $\varepsilon>0$. This contradicts the claim in the previous paragraph. This contradiction proves the lemma.

In view of (6.16) and (6.17), Lemma 6.20 gives that $\lambda_+ = \int\Phi\,d\eta\,d\mu$ is strictly positive. This finishes the proof of Theorem 6.11.

6.4 Notes

Our presentation in Section 6.1 is a variation of the original proof of Theorem 6.1, which was due to Ledrappier [80, § I.5].

Theorem 6.8 is taken from Furstenberg [54]. Corollary 6.10 is contained in results of Furstenberg and Kifer [57] and of Hennion [61]. Related results, for some reducible cocycles, were obtained by Kifer and Slud [70, 73]. We will return to the topic of continuity of Lyapunov exponents in Chapter 10.

Theorem 6.11 is also due to Furstenberg [54]. Several extensions have been obtained, by Virtser [116], Guivarc'h [59], Royer [103], Ledrappier and Royer [82] and others. This includes the proof that the statement remains true when the base dynamics $(f,\mu)$ is just a Markov shift. More recently, Bonatti, Gomez-Mont and Viana [37] extended the criterion to Hölder-continuous cocycles with invariant holonomies over hyperbolic homeomorphisms. The invariant measure only needs to have local product structure. Currently, the strongest statements are due to Viana [113], for Hölder-continuous cocycles over (non-uniformly) hyperbolic systems, and Avila, Santamaria and Viana [13], when the base dynamics is partially hyperbolic.

Two different generalizations of Furstenberg's theorem are at the heart of the next two chapters: the extension of the Furstenberg criterion to arbitrary dimension, which we discuss in Chapter 7, and the criterion for simplicity of the whole Lyapunov spectrum, to be presented in Chapter 8.

6.5 Exercises

Exercise 6.1 Let $Y \subset X$ be either open or closed. Suppose that $(\mu_n)_n$ converges to $\mu$ in the weak$^*$ topology on $X$ and let $(\mu_{Y,n})_n$ and $\mu_Y$ be their normalized restrictions to $Y$. Conclude that $(\mu_{Y,n})_n$ converges to $\mu_Y$ in the weak$^*$ topology on $Y$.

Exercise 6.2 Use Exercise 5.22 to check that for each $j = 1, \dots, k$, the probability measure $m$ in the second part of Theorem 6.1 may be chosen to be ergodic.

Exercise 6.3 Use Exercise 5.23 to check that Proposition 6.5 remains true if one restricts to ergodic s-states and ergodic u-states.

Exercise 6.4 Show that Proposition 6.7 and the expressions (a) and (b) for the Lyapunov exponents remain true if one restricts to ergodic stationary measures.

Exercise 6.5 Suppose that $X = SL(2)$ and the probability measure $p$ has finite support; that is, it is of the form $p = p_1 \delta_{A_1} + \cdots + p_m \delta_{A_m}$. Moreover, let $A : M \to SL(2)$ be the projection to the zeroth coordinate. Show that the associated linear cocycle is strongly irreducible if and only if it is twisting.

Exercise 6.6 Prove that the following conditions are equivalent.
(1) The support of $\nu$ is not contained in any compact subgroup of $SL(2)$ (hypothesis (a) in Theorem 6.11).
(2) The support of $\nu$ is not contained in a set of the form $G R G^{-1}$ where $G \in SL(2)$ and $R \subset SL(2)$ is the group of rotations.
(3) The linear cocycle $F$ is pinching: the monoid $\mathcal{B}$ generated by $\operatorname{supp} \nu$ contains matrices with arbitrarily large norm.

Exercise 6.7 Let $F$ be pinching. Prove that the following conditions are equivalent.
(1) There is no finite set $L \subset \mathbb{P}\mathbb{R}^2$ such that $B(L) = L$ for $\nu$-almost every $B \in SL(2)$ (hypothesis (b) in Theorem 6.11).
(2) There is no set $L \subset \mathbb{P}\mathbb{R}^2$ with one or two elements such that $B(L) = L$ for $\nu$-almost every $B \in SL(2)$.
(3) The linear cocycle $F$ is twisting: for any $V_0, V_1, \dots, V_k \in \mathbb{P}\mathbb{R}^2$ there is $B \in \mathcal{B}$ such that $B(V_0) \ne V_i$ for all $i = 1, \dots, k$.

Exercise 6.8 Calculate the stationary measures for the product of random matrices in the second part of Exercise 1.2. Deduce that this cocycle is a point of continuity for the Lyapunov exponents among locally constant cocycles.

Exercise 6.9 Let $F$ be an $SL(2)$-cocycle. Use Lemma 6.9 and Corollary 6.15 to show that if $F$ satisfies conditions (a) and (b) in Theorem 6.11 then there exists no probability measure $\eta$ on $\mathbb{P}\mathbb{R}^2$ such that $B_* \eta = \eta$ for every $B \in \operatorname{supp} \nu$.

Exercise 6.10 Let $F$ be an $SL(2)$-cocycle. Prove the converse to Exercise 6.9 and deduce that if $\lambda_\pm = 0$ then there exists some probability measure $\eta$ on $\mathbb{P}\mathbb{R}^2$ such that $B_* \eta = \eta$ for every $B \in \operatorname{supp} \nu$.

Exercise 6.11 Show that the cocycle $F'$ in Section 6.3.2 satisfies the hypotheses of Theorem 6.11 if and only if $F$ does.


7 Invariance principle

The main result in this chapter takes our understanding of the invariant measures of linear and projective cocycles, and their links to Lyapunov exponents, one step further. Most importantly, it applies precisely when the information provided by the theorem of Oseledets is poor, namely, when all Lyapunov exponents coincide and so the Oseledets decomposition is trivial.

As a motivation for the statement, let us start with a brief discussion of the 2-dimensional case. Let $F : M \times \mathbb{R}^2 \to M \times \mathbb{R}^2$ be an invertible locally constant linear cocycle over a Bernoulli shift $f : (M, \mu) \to (M, \mu)$, and let $\mathbb{P}F : M \times \mathbb{P}\mathbb{R}^2 \to M \times \mathbb{P}\mathbb{R}^2$ be the associated projective cocycle. Suppose first that the Lyapunov exponents $\lambda_\pm$ of $F$ are distinct. Then, as we have seen in Section 5.5.3, there exist an s-state $m^s$ and a u-state $m^u$ such that every $\mathbb{P}F$-invariant probability measure that projects down to $\mu$ is a convex combination of $m^s$ and $m^u$. Moreover, these are the unique $\mathbb{P}F$-ergodic probability measures, and they determine the Lyapunov exponents:

$$\lambda_+ = \int \Phi\,dm^u \quad\text{and}\quad \lambda_- = \int \Phi\,dm^s. \tag{7.1}$$

When the Lyapunov exponents coincide, we still get that $\lambda_\pm = \int \Phi\,dm$ for every $\mathbb{P}F$-invariant probability measure. Moreover, the invariance principle that we are going to prove in this chapter provides a surprisingly precise characterization of these measures: every $\mathbb{P}F$-invariant probability measure that projects down to $\mu$ is a product measure $m = \mu \times \eta$, where $\eta$ is both forward and backward stationary.

The invariance principle holds in arbitrary dimension, as we are going to see in Sections 7.1 and 7.2. In Section 7.3 we deduce the following result, which was originally due to Furstenberg [54]: if $\lambda_- = \lambda_+$ then there exists some probability measure $\eta$ on projective space that is invariant under $A(x)$ for almost every $x$. In Section 7.4 we observe that the latter is a very restrictive condition on the cocycle. Thus, one has $\lambda_- < \lambda_+$ for generic locally constant cocycles.

Downloaded from University Publishing Online. This is copyrighted material http://dx.doi.org/10.1017/CBO9781139976602.008
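The dichotomy between $\lambda_- = \lambda_+$ and $\lambda_- < \lambda_+$ can be explored numerically in dimension 2. The following is only a minimal sketch, not part of the text's argument: it assumes a finite alphabet with i.i.d. choices (a locally constant cocycle over a Bernoulli shift), estimates $\lambda_+$ from the growth of a generic vector, and recovers $\lambda_-$ from the identity $\lambda_+ + \lambda_- = \int \log|\det A|\,d\mu$, valid for 2-dimensional cocycles. The function name and its parameters are hypothetical.

```python
import math
import random


def lyapunov_exponents_2d(matrices, weights, n_steps=20000, seed=0):
    """Monte Carlo estimate of (lambda_plus, lambda_minus) for an i.i.d.
    product of 2x2 matrices. Each matrix is ((a, b), (c, d)); `weights`
    are the probabilities of the Bernoulli measure (an assumption of
    this sketch)."""
    rng = random.Random(seed)
    v = (math.cos(1.0), math.sin(1.0))  # a fixed, hopefully generic, unit vector
    log_growth = 0.0  # accumulates log ||A^n(x) v||
    log_det = 0.0     # accumulates log |det A^n(x)|
    for _ in range(n_steps):
        (a, b), (c, d) = rng.choices(matrices, weights=weights)[0]
        w = (a * v[0] + b * v[1], c * v[0] + d * v[1])
        norm = math.hypot(w[0], w[1])
        log_growth += math.log(norm)
        log_det += math.log(abs(a * d - b * c))
        v = (w[0] / norm, w[1] / norm)  # renormalize to avoid overflow
    lam_plus = log_growth / n_steps
    # In dimension 2 the two exponents add up to the mean of log |det A|.
    lam_minus = log_det / n_steps - lam_plus
    return lam_plus, lam_minus
```

For instance, calling `lyapunov_exponents_2d([((2.0, 1.0), (1.0, 1.0)), ((1.0, 1.0), (0.0, 1.0))], [0.5, 0.5])` returns a pair with a strictly positive first entry and a second entry equal to its negative, since both matrices have determinant 1. Such an estimate is only numerical evidence; it does not by itself decide whether two exponents coincide.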

7.1 Statement and proof

Let $\hat F : \hat M \times \mathbb{R}^d \to \hat M \times \mathbb{R}^d$ be an invertible linear cocycle defined over a two-sided Bernoulli shift $\hat f : (\hat M, \hat\mu) \to (\hat M, \hat\mu)$ by a measurable $\hat A : \hat M \to GL(d)$ that depends only on the zeroth coordinate. Let $\mathbb{P}\hat F : \hat M \times \mathbb{P}\mathbb{R}^d \to \hat M \times \mathbb{P}\mathbb{R}^d$ be the associated projective cocycle. Write $\hat M = X^{\mathbb{Z}}$, where $X$ and $\hat M$ are taken to be separable complete metric spaces and $\hat\mu = p^{\mathbb{Z}}$. Assume that $\log^+ \|\hat A^{\pm 1}\| \in L^1(\hat\mu)$ and let $\lambda_-$ and $\lambda_+$ be the extremal Lyapunov exponents of $\hat F$.

Theorem 7.1 (Linear invariance principle) If $\lambda_- = \lambda_+$ then every $\mathbb{P}\hat F$-invariant probability measure $\hat m$ that projects down to $\hat\mu$ is both an s-state and a u-state. Consequently, $\hat m = \hat\mu \times \eta$ for some probability measure $\eta$ on $\mathbb{P}\mathbb{R}^d$ that is both forward and backward stationary.

Theorem 7.1 is a consequence of the following theorem of Ledrappier [81]. Let $F : M \times \mathbb{R}^d \to M \times \mathbb{R}^d$ be a linear cocycle defined over a measure-preserving transformation $f : (M, \mu) \to (M, \mu)$ by a measurable function $A : M \to GL(d)$ such that $\log^+ \|A^{\pm 1}\| \in L^1(\mu)$. Let $\mathbb{P}F : M \times \mathbb{P}\mathbb{R}^d \to M \times \mathbb{P}\mathbb{R}^d$ be the associated projective cocycle. Note that $F$, $\mathbb{P}F$, and $f$ are not assumed to be invertible. Moreover, $f$ is arbitrary (not necessarily a shift) and $A$ need not be locally constant either. We do assume that $M$ is a separable complete metric space (and so $(M, \mu)$ is a Lebesgue space, see [114, Section 8.5.2]).

Theorem 7.2 (Ledrappier) Assume that $\lambda_-(F, x) = \lambda_+(F, x)$ for $\mu$-almost every $x \in M$. Then

$$m_{f(x)} = A(x)_* m_x \quad\text{for } \mu\text{-almost every } x \in M,$$

for any disintegration $\{m_x : x \in M\}$ of any $\mathbb{P}F$-invariant probability measure $m$ that projects down to $\mu$.

Proof of Theorem 7.1 Let $M = X^{\mathbb{N}}$, $\mu = p^{\mathbb{N}}$ and $f : (M, \mu) \to (M, \mu)$ be the one-sided Bernoulli shift. Define $A : M \to GL(d)$ by $\hat A = A \circ \pi^+$ and then take $F : M \times \mathbb{R}^d \to M \times \mathbb{R}^d$ to be the linear cocycle and $\mathbb{P}F : M \times \mathbb{P}\mathbb{R}^d \to M \times \mathbb{P}\mathbb{R}^d$ to be the projective cocycle defined by $A$ over $f$. Then $\lambda_-(F, x) = \lambda_-$ and $\lambda_+(F, x) = \lambda_+$ for $\mu$-almost every $x \in M$. Given any $\mathbb{P}\hat F$-invariant probability measure $\hat m$, let $m = \Pi^+_* \hat m$ be its projection to $M \times \mathbb{P}\mathbb{R}^d$. We know from Corollary 5.23 that the disintegrations $\{\hat m_{\hat x} : \hat x \in \hat M\}$ and $\{m_x : x \in M\}$ of $\hat m$ and $m$, respectively, are related by

$$\hat m_{\hat x} = \lim_n \hat A^n(\hat f^{-n}(\hat x))_* m_{\pi^+(\hat f^{-n}(\hat x))} \quad\text{for } \hat\mu\text{-almost every } \hat x \in \hat M.$$

Now, using Theorem 7.2 and the relations $f \circ \pi^+ = \pi^+ \circ \hat f$ and $\hat A = A \circ \pi^+$,

$$\hat A^n(\hat f^{-n}(\hat x))_* m_{\pi^+(\hat f^{-n}(\hat x))} = A^n(\pi^+(\hat f^{-n}(\hat x)))_* m_{\pi^+(\hat f^{-n}(\hat x))} = m_{f^n(\pi^+(\hat f^{-n}(\hat x)))} = m_{\pi^+(\hat x)}$$

for $\hat\mu$-almost every $\hat x$. Substituting this in the previous expression we conclude that $\hat m_{\hat x} = m_{\pi^+(\hat x)}$ for $\hat\mu$-almost every $\hat x$. In particular, $\hat m_{\hat x}$ depends only on $\pi^+(\hat x)$, and that proves that $\hat m$ is an s-state. An entirely dual argument, using backward iterates instead, proves that $\hat m$ is a u-state.

7.2 Entropy is smaller than exponents

In this section we prove Theorem 7.2. Let $m$ be a probability measure on $M \times \mathbb{P}\mathbb{R}^d$ that projects down to $\mu$ and let $\{m_x : x \in M\}$ be a disintegration of $m$. For each $x$, let $J(x, \cdot)$ be the Radon–Nikodym derivative

$$J(x, v) = \frac{d\big(A(x)^{-1}_* m_{f(x)}\big)}{dm_x}(v).$$

This means that

$$A(x)^{-1}_* m_{f(x)} = J(x, \cdot)\,m_x + \xi_x \tag{7.2}$$

for some positive measure $\xi_x$ (totally) singular with respect to $m_x$. The fibered entropy of $m$ is defined by

$$h(m) = \int -\log J\,dm = \int_{\{J>0\}} -\log J\,dm + \infty\, m(\{J = 0\}). \tag{7.3}$$

Note that $\int_{\{J>0\}} J\,dm = \int J\,dm \le 1$, because the left-hand side of (7.2) is a probability measure and $\xi_x$ is a positive measure. Then, by convexity,

$$\int_{\{J>0\}} -\log J\,dm \ge -\log \int_{\{J>0\}} J\,dm \ge 0. \tag{7.4}$$

This shows that the integral in (7.3) is well defined and $h(m) \in [0, +\infty]$.

Proposition 7.3 If $h(m) = 0$ then $A(x)_* m_x = m_{f(x)}$ for $\mu$-almost every $x$.

Proof The first inequality in (7.4) is strict unless $\log J$ is constant $m$-almost everywhere. The second one is strict unless $\int J\,dm = 1$ (equivalently, $\xi_x = 0$ for $\mu$-almost every $x$). Therefore, $h(m) = 0$ if and only if $m(\{J = 0\}) = 0$, and $\int J\,dm = 1$, and $\log J$ is constant on a full $m$-measure set. In particular, if $h(m) = 0$ then $J = 1$ on a full $m$-measure set. That is, $A(x)^{-1}_* m_{f(x)} = m_x$ or, equivalently, $A(x)_* m_x = m_{f(x)}$ for $\mu$-almost every $x \in M$. This proves the claim.

Proposition 7.4 $0 \le h(m) \le d(\lambda_+ - \lambda_-)$.

Theorem 7.2 is a direct consequence of Propositions 7.3 and 7.4. In the remainder of this section we prove Proposition 7.4.

7.2.1 The volume case

As a motivation for the proof, and for the definition of $h(m)$, we begin by discussing the case when $m = \mu \times \lambda$, where $\lambda$ is the Haar measure on $\mathbb{P}\mathbb{R}^d$. This will not be used in the proof of the general case, so the reader may choose to skip this section altogether.

In the case under consideration $m_x = \lambda$ for every $x \in M$, and so the Radon–Nikodym derivative

$$J(x, [v]) = \frac{d\big(A(x)^{-1}_* \lambda\big)}{d\lambda}([v]) = |\det D\Phi_x([v])|$$

where $\Phi_x : \mathbb{P}\mathbb{R}^d \to \mathbb{P}\mathbb{R}^d$, $\Phi_x([v]) = [A(x)v]$ denotes the projective action of $A(x)$. It follows that $J(x, [v]) = |\det A(x)| \big(\|v\| / \|A(x)v\|\big)^d$ at every point, and so

$$h(m) = d \int\!\!\int \log \frac{\|A(x)v\|}{\|v\|}\,d\mu(x)\,d\lambda([v]) - \int \log|\det A(x)|\,d\mu(x) = d \int \Phi\,d\mu\,d\lambda - \int \log|\det A(x)|\,d\mu(x),$$

where $\Phi$ is the function in (6.1). Let $m = \int m_\alpha\,d\alpha$ be the ergodic decomposition of $m = \mu \times \lambda$. By Theorem 6.1, for each $\alpha$ there exists $j = j(\alpha)$ such that $\int \Phi\,dm_\alpha = \lambda_j$. Thus,

$$\int \Phi\,d\mu\,d\lambda = \sum_{j=1}^{k} c_j \lambda_j, \qquad \sum_{j=1}^{k} c_j = 1,$$

where $c_j$ is the total weight of ergodic components with $j(\alpha) = j$. By (4.40),

$$\int \log|\det A(x)|\,d\mu(x) = \sum_{j=1}^{k} \lambda_j \dim E^j.$$

Therefore,

$$h(m) = \sum_{j=1}^{k} (d c_j - \dim E^j)\lambda_j \le d(\lambda_+ - \lambda_-). \tag{7.5}$$

The last inequality is, obviously, not sharp.
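The formula $J = |\det A|\,(\|v\|/\|Av\|)^d$ can be checked numerically when $d = 2$. The sketch below is an illustration only, under the assumption that $\mathbb{P}\mathbb{R}^2$ is parametrized by the angle $\theta \in [0, \pi)$ of the line spanned by $(\cos\theta, \sin\theta)$, so that the Haar measure is $d\theta/\pi$. It compares a finite-difference derivative of the projective action with the closed formula; all function names are hypothetical.

```python
import math


def projective_angle(A, theta):
    """Angle of the image A(cos theta, sin theta); the projective point
    is this angle modulo pi."""
    (a, b), (c, d) = A
    x, y = math.cos(theta), math.sin(theta)
    return math.atan2(c * x + d * y, a * x + b * y)


def jacobian_numeric(A, theta, h=1e-6):
    """Central finite-difference estimate of |d(theta')/d(theta)|."""
    diff = projective_angle(A, theta + h) - projective_angle(A, theta - h)
    # Undo a possible jump across the branch cut of atan2.
    diff = (diff + math.pi) % (2 * math.pi) - math.pi
    return abs(diff) / (2 * h)


def jacobian_formula(A, theta):
    """|det A| (||v|| / ||A v||)^2 evaluated at the unit vector v."""
    (a, b), (c, d) = A
    x, y = math.cos(theta), math.sin(theta)
    norm_sq = (a * x + b * y) ** 2 + (c * x + d * y) ** 2
    return abs(a * d - b * c) / norm_sq
```

Averaging `jacobian_formula(A, theta)` over equally spaced angles in $[0, \pi)$ gives a value close to 1, which reflects the fact that the projective action of an invertible matrix is a bijection of $\mathbb{P}\mathbb{R}^2$ and hence its Jacobian integrates to 1 against the Haar measure.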

7.2.2 Proof of Proposition 7.4

Consider any $\Delta > \lambda_+ - \lambda_- \ge 0$. We want to prove that $h(m) \le d\Delta$ for any $F$-invariant probability measure $m$ that projects down to $\mu$. For the time being, let us assume that $m$ is ergodic for $F$; the general case will be deduced at the end.

Given $\varepsilon > 0$, define $h_\varepsilon(m) = \int -\log J_\varepsilon\,dm$ where $J_\varepsilon = J + \varepsilon$. By the monotone convergence theorem, $h_\varepsilon(m)$ increases to $h(m)$ as $\varepsilon$ decreases to 0. Suppose, for contradiction, that $h(m) > d\Delta$. Then, for any $\varepsilon > 0$ small enough,

$$h_\varepsilon(m) > d\Delta + 4\varepsilon. \tag{7.6}$$

Lemma 7.5 Each fiber $\{x\} \times \mathbb{P}\mathbb{R}^d$ admits partitions $\mathcal{P}_n(x)$, defined for every large $n$, such that
(1) $\#\mathcal{P}_n(x) \le e^{n(d\Delta + 2\varepsilon)}$,
(2) $\operatorname{diam} \mathcal{P}_n(x) \le e^{-n(\Delta + 2\varepsilon)}$,
(3) $m_x(\partial \mathcal{P}_n(x, v)) = 0$ for every $v \in \mathbb{P}\mathbb{R}^d$.

Figure 7.1 A refining sequence of partitions of $\mathbb{P}\mathbb{R}^d$

Proof We start from a regular triangulation of the sphere $S^{d-1}$ into $2^d$ simplices, as pictured in Figure 7.1 for the case $d = 3$. We take the sphere to carry the standard metric with curvature $\equiv 1$. By identifying antipodes, one obtains a triangulation $\mathcal{T}_1$ of the projective space $\mathbb{P}\mathbb{R}^d$ into $2^{d-1}$ simplices. Each simplex can be split into $2^{d-1}$ regular simplices, each of which is the image of the original simplex by a $(1/2)$-contraction, as illustrated in Figure 7.1. By repeating this procedure successively, we obtain an increasing sequence $\mathcal{T}_k$ of triangulations of $\mathbb{P}\mathbb{R}^d$ satisfying
(a) $\#\mathcal{T}_k = 2^{k(d-1)}$ for every $k \ge 1$,
(b) $C^{-1} \le 2^k \operatorname{diam} \mathcal{T}_k(v) \le C$ for every $k \ge 1$ and $v \in \mathbb{P}\mathbb{R}^d$,
where the constant $C > 1$ is independent of $k$ and $v$. By construction, the boundaries of the atoms of every $\mathcal{T}_k$ are contained in finitely many projective hyperplanes. Thus, for each $x \in M$ and $k \ge 1$, we may find an orthogonal transformation $B_{x,k} : \mathbb{R}^d \to \mathbb{R}^d$ such that the partition $\mathcal{T}_{x,k} = B_{x,k}(\mathcal{T}_k)$ satisfies
(c) $m_x(\partial \mathcal{T}_{x,k}(v)) = 0$ for every $v \in \mathbb{P}\mathbb{R}^d$.
Moreover, properties (a) and (b) are not affected when one replaces $\mathcal{T}_k$ by $\mathcal{T}_{x,k}$, because the projective action of $B_{x,k}$ is an isometry. Assuming $\varepsilon$ is small enough relative to $\Delta$, for every large $n \ge 1$ there exists some integer $k \ge 1$ such that

$$n\,\frac{d\Delta + 2\varepsilon}{(d-1)\log 2} \ge k \ge n\,\frac{\Delta + 2\varepsilon}{\log 2} + \frac{\log C}{\log 2}.$$

Then the partition $\mathcal{P}_n(x) = \mathcal{T}_{x,k}$ satisfies the conditions in the conclusion of the lemma.

Let $\mathcal{P}_n(x, v)$ denote the atom of the partition $\mathcal{P}_n(x)$ that contains the point $v$. For each $0 \le k \le n$ let $\mathcal{P}_{n,k}(x)$ be the partition of $\{x\} \times \mathbb{P}\mathbb{R}^d$ given by the pull-back of $\mathcal{P}_n(f^k(x))$ under $A^k(x)$:

$$\mathcal{P}_{n,k}(x, v) = A^k(x)^{-1}\big(\mathcal{P}_n(F^k(x, v))\big) \quad\text{for each } v \in \mathbb{P}\mathbb{R}^d.$$

Observe that $\mathcal{P}_{n,0}(x, v) = \mathcal{P}_n(x, v)$. Then define, for $0 \le k < n$,

$$J_{n,k}(x, v) = \frac{m_{f(x)}\big(\mathcal{P}_{n,k}(F(x, v))\big)}{m_x\big(\mathcal{P}_{n,k+1}(x, v)\big)}, \qquad J_n(x, v) = \frac{m_{f^n(x)}\big(\mathcal{P}_n(F^n(x, v))\big)}{m_x\big(\mathcal{P}_{n,n}(x, v)\big)} = \prod_{k=0}^{n-1} J_{n,k}(x, v)$$

and, for each $\varepsilon > 0$,

$$J_{n,k,\varepsilon}(x, v) = J_{n,k}(x, v) + \varepsilon \quad\text{and}\quad J_{n,\varepsilon}(x, v) = \prod_{k=0}^{n-1} J_{n,k,\varepsilon}(x, v).$$

Lemma 7.6 $\sup_{0 \le k \le n} \|\log J_{n,k,\varepsilon} - \log J_\varepsilon\|_{L^1(m)}$ converges to zero as $n \to \infty$.

Let us assume this fact for a while and use it to complete the proof of Proposition 7.4. Since we assume $m$ to be ergodic,

$$\lim_n \frac{1}{n} \log J_{n,\varepsilon} = \lim_n \frac{1}{n} \sum_{k=0}^{n-1} \log J_{n,k,\varepsilon} \circ F^k = \lim_n \frac{1}{n} \sum_{k=0}^{n-1} \log J_\varepsilon \circ F^k = \int \log J_\varepsilon\,dm = -h_\varepsilon(m)$$

at $m$-almost every point (the second equality uses Lemma 7.6). Consequently,

$$\limsup_n \frac{1}{n} \log m_{f^n(x)}\big(\mathcal{P}_n(F^n(x, v))\big) \le \limsup_n \frac{1}{n} \log J_n(x, v) \le \lim_n \frac{1}{n} \log J_{n,\varepsilon}(x, v) = -h_\varepsilon(m)$$

for $m$-almost every $(x, v)$. In particular, for every large $n \ge 1$ there exists $E_n \subset M \times \mathbb{P}\mathbb{R}^d$ such that $m(E_n) > 1/2$ and

$$m_{f^n(x)}\big(\mathcal{P}_n(F^n(x, v))\big) \le e^{n(-h_\varepsilon(m) + \varepsilon)} \quad\text{for all } (x, v) \in E_n.$$

Since every $\mathcal{P}_n(y)$ has at most $e^{n(d\Delta + 2\varepsilon)}$ atoms (Lemma 7.5), and recalling (7.6), it follows that the $m_{f^n(x)}$-measure of the intersection of $F^n(E_n)$ with the fiber of $f^n(x)$ does not exceed

$$e^{n(d\Delta + 2\varepsilon)}\, e^{n(-h_\varepsilon(m) + \varepsilon)} \le e^{-n\varepsilon}.$$

Since $x$ is arbitrary, we conclude that $m(F^n(E_n)) \le e^{-n\varepsilon}$, contradicting the fact that $m(E_n) > 1/2$ for every $n$. This contradiction reduces the proof of Proposition 7.4 to proving Lemma 7.6.

Lemma 7.7 $\sup_{0 \le k \le n} \sup\{\operatorname{diam} \mathcal{P}_{n,k}(x, v) : v \in \mathbb{P}\mathbb{R}^d\}$ converges to zero when $n \to \infty$, for $\mu$-almost every $x \in M$.

Proof The derivative of the action of $A^k(x)^{-1}$ in projective space is bounded by $\|A^k(x)\|\,\|A^k(x)^{-1}\|$ (Exercise 7.1). Hence, using Lemma 7.5,

$$\operatorname{diam} \mathcal{P}_{n,k}(x, v) \le \|A^k(x)\|\,\|A^k(x)^{-1}\| \operatorname{diam} \mathcal{P}_n(x, v) \le \|A^k(x)\|\,\|A^k(x)^{-1}\|\, e^{-n(\Delta + 2\varepsilon)}.$$

For $\mu$-almost every $x \in M$,

$$\lim_k \frac{1}{k} \log \big(\|A^k(x)\|\,\|A^k(x)^{-1}\|\big) = \lambda_+ - \lambda_-.$$

Hence, there exists $k_0(x) \ge 1$ such that

$$\operatorname{diam} \mathcal{P}_{n,k}(x, v) \le e^{k\Delta} e^{-n(\Delta + 2\varepsilon)} \le e^{-2n\varepsilon} \quad\text{for every } k_0(x) < k \le n.$$

Fix $k_0(x)$ and let $K_0(x) = \max\{\|A^k(x)\|\,\|A^k(x)^{-1}\| : 0 \le k \le k_0(x)\}$. Then

$$\operatorname{diam} \mathcal{P}_{n,k}(x, v) \le K_0(x)\, e^{-n(\Delta + 2\varepsilon)} \le e^{-2n\varepsilon}$$

for every $0 \le k \le k_0(x)$ and every $n$ sufficiently large. This proves that, for $\mu$-almost every $x \in M$, there exists $n_0(x) \ge 1$ such that $\sup_{0 \le k \le n} \sup_v \operatorname{diam} \mathcal{P}_{n,k}(x, v) \le e^{-2n\varepsilon}$ for every $n \ge n_0(x)$. That implies the conclusion of the lemma.

Lemma 7.8 Let $X$ be a complete metric space and $\eta_0$, $\eta_1$ be Borel probability measures on $X$ such that $\eta_1 \ge c\,\eta_0$. Let $\rho : X \to [0, +\infty]$ be the Radon–Nikodym derivative $\rho = d\eta_1/d\eta_0$ and, given any countable partition $\mathcal{P}$ of $X$, define $\rho_{\mathcal{P}}(x) = \eta_1(\mathcal{P}(x))/\eta_0(\mathcal{P}(x))$ for $x \in X$. Then:



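(1) $\int \log \rho\,d\eta_0 \le \int \log \rho_{\mathcal{P}}\,d\eta_0 \le 0$;
(2) given $\varepsilon > 0$ there is $\delta > 0$ such that $\|\log \rho_{\mathcal{P}} - \log \rho\|_{L^1(\eta_0)} \le \varepsilon$ for any countable partition $\mathcal{P}$ with $\eta_0(\partial \mathcal{P}) = 0$ and $\operatorname{diam} \mathcal{P} < \delta$.

Proof By convexity, $\int \log \rho_{\mathcal{P}}\,d\eta_0 \le \log \int \rho_{\mathcal{P}}\,d\eta_0 = 0$. Similarly,

$$\int_{\mathcal{P}(x)} \log \rho\,d\eta_0 \le \eta_0(\mathcal{P}(x)) \log \rho_{\mathcal{P}}(x)$$

for every atom $\mathcal{P}(x)$, and so $\int \log \rho\,d\eta_0 \le \int \log \rho_{\mathcal{P}}\,d\eta_0$. This proves part (1) of the lemma.

Next, note that the functions $\rho_{\mathcal{P}}$ satisfy a uniform integrability condition: for all $Y \subset X$ with $\eta_0(Y) < 1/e$,

$$\int_Y |\log \rho_{\mathcal{P}}|\,d\eta_0 \le -\eta_0(Y)\big(\log \eta_0(Y) + \log c\big). \tag{7.7}$$

Indeed, the assumption implies $-\log \rho_{\mathcal{P}} \le -\log c$ and so the claim is trivial if $\log \rho_{\mathcal{P}}$ happens to be negative on $Y$. When $\log \rho_{\mathcal{P}} \ge 0$ on the set $Y$, the claim follows from convexity:

$$\int_Y \log \rho_{\mathcal{P}}\,\frac{d\eta_0}{\eta_0(Y)} \le \log \int_Y \rho_{\mathcal{P}}\,\frac{d\eta_0}{\eta_0(Y)} \le \log \frac{\eta_1(\mathcal{P}(Y))}{\eta_0(Y)} \le \log \frac{1}{\eta_0(Y)},$$

where $\mathcal{P}(Y)$ denotes the union of all atoms of $\mathcal{P}$ that intersect $Y$. The general case of (7.7) is handled by splitting $Y$ into two subsets where $\log \rho_{\mathcal{P}}$ has constant sign.

Next, we claim that if $\mathcal{R}$ refines $\mathcal{Q}$ then

$$\|\log \rho_{\mathcal{R}} - \log \rho_{\mathcal{Q}}\|_{L^1(\eta_0)} \le \|\log \rho - \log \rho_{\mathcal{Q}}\|_{L^1(\eta_0)}. \tag{7.8}$$

To see that this is so, write

$$\int |\log \rho_{\mathcal{R}} - \log \rho_{\mathcal{Q}}|\,d\eta_0 = \sum_{R \subset Q} \int_R |\log \rho_{\mathcal{R}} - \log \rho_{\mathcal{Q}}|\,d\eta_0,$$

where the sum is over the pairs of atoms $R \in \mathcal{R}$ and $Q \in \mathcal{Q}$ with $R \subset Q$. Since $\rho_{\mathcal{R}}$ and $\rho_{\mathcal{Q}}$ are constant on each $R \in \mathcal{R}$, this may be rewritten as

$$\sum_{R \subset Q} \eta_0(R)\,|\log \rho_{\mathcal{R}} - \log \rho_{\mathcal{Q}}| = \sum_{R \subset Q} \Big| \int_R \log \rho\,d\eta_0 - \eta_0(R) \log \rho_{\mathcal{Q}} \Big| \le \sum_{R \subset Q} \int_R |\log \rho - \log \rho_{\mathcal{Q}}|\,d\eta_0.$$

The combination of these two relations proves (7.8).

We are ready to prove part (2) of the lemma. Let $(\mathcal{Q}_n)_n$ be any refining sequence of partitions with $\eta_0(\partial \mathcal{Q}_n) = 0$ and $\operatorname{diam} \mathcal{Q}_n \to 0$. By the martingale convergence theorem (Theorem 5.21), $\rho_{\mathcal{Q}_n} \to \rho$ at $\eta_0$-almost every point. By uniform integrability (7.7), it follows that $\log \rho_{\mathcal{Q}_n} \to \log \rho$ in $L^1(\eta_0)$. Given $\varepsilon > 0$, fix $n$ sufficiently large so that

$$\|\log \rho_{\mathcal{Q}_n} - \log \rho\|_{L^1(\eta_0)} < \varepsilon/4. \tag{7.9}$$

Given any partition $\mathcal{P}$ with $\eta_0(\partial \mathcal{P}) = 0$, let $\mathcal{R} = \mathcal{P} \vee \mathcal{Q}_n$ (the coarsest partition that refines both $\mathcal{P}$ and $\mathcal{Q}_n$). Combining (7.8) with (7.9), we get

$$\|\log \rho_{\mathcal{R}} - \log \rho\|_{L^1(\eta_0)} < \varepsilon/2. \tag{7.10}$$

Now, let $Y$ be the set of points $x \in X$ such that $\mathcal{P}(x) \not\subset \mathcal{Q}_n(x)$. Since $\eta_0(\partial \mathcal{Q}_n) = 0$ we may find $\delta > 0$ such that $\operatorname{diam} \mathcal{P} < \delta$ implies $\eta_0(Y)$ is small enough that $-\eta_0(Y)(\log \eta_0(Y) + \log c) < \varepsilon/4$. Then, combining (7.7) with the fact that $\mathcal{R}(x) = \mathcal{P}(x)$ for all $x \in X \setminus Y$, we find

$$\|\log \rho_{\mathcal{P}} - \log \rho_{\mathcal{R}}\|_{L^1(\eta_0)} \le \int_Y \big(|\log \rho_{\mathcal{P}}| + |\log \rho_{\mathcal{R}}|\big)\,d\eta_0 < \varepsilon/2.$$

Together with (7.10), this implies $\|\log \rho_{\mathcal{P}} - \log \rho\|_{L^1(\eta_0)} < \varepsilon$, as claimed.

Proof of Lemma 7.6 First, apply Lemma 7.8 with $X = \mathbb{P}\mathbb{R}^d$ and $c = \varepsilon$ and $\mathcal{P} = \mathcal{P}_{n,k}(x)$ and $\eta_0 = m_x$ and $\eta_1 = A(x)^{-1}_* m_{f(x)} + \varepsilon m_x$. Note that

$$\rho = J + \varepsilon = J_\varepsilon \quad\text{and}\quad \rho_{\mathcal{P}} = J_{n,k} + \varepsilon = J_{n,k,\varepsilon}.$$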

Moreover, Lemma 7.7 ensures that the diameter of $\mathcal{P}$ is small if $n$ is large enough. It follows from Lemma 7.8 that, for any $\gamma > 0$ and $\mu$-almost every $x$, one has

$$\|\log J_{n,k,\varepsilon} - \log J_\varepsilon\|_{L^1(m_x)} < \gamma \quad\text{for all } 0 \le k < n$$

if $n$ is sufficiently large. Note that $\|\log J_\varepsilon\|_{L^1(m_x)} \le 1 - \log \varepsilon$, because $\int_{\{J_\varepsilon \ge 1\}} \log J_\varepsilon\,dm_x \le 1$ and $\int_{\{J_\varepsilon < 1\}}$ […]

[…] 0 for every $i \in X$. Let $f : M \to M$ be the shift map on $M = X^{\mathbb{N}}$ and consider $\mu = p^{\mathbb{N}}$. The set of functions $A : X \to GL(d)$ may be identified with $GL(d)^m$ which, in turn, may be viewed as an open subset of a Euclidean space. Let $\lambda_\pm(A)$ denote the extremal Lyapunov exponents of the locally constant linear cocycle $F : M \times \mathbb{R}^d \to M \times \mathbb{R}^d$ defined by $A$ over $(f, \mu)$.

Theorem 7.12 The subset $Z$ of the functions $A : X \to GL(d)$ such that $\lambda_-(A) = \lambda_+(A)$ is contained in a finite union of closed proper submanifolds of $GL(d)^m$. In particular, the closure of $Z$ is nowhere dense and has volume zero.

We prove this theorem later. Let us point out now that the conclusion can be sharpened considerably. Exercise 7.8 is one example of this. Another is that the arguments we are going to present remain valid if one replaces $GL(d)$ by the subgroup $SL(d)$: just note that the curves $B(t)$ defined in (7.12) and (7.14) lie in $SL(d)$ if the initial matrix $A$ does.

7.4.1 Eigenvalues and eigenspaces

For each $r, s \ge 0$ with $r + 2s = d$, let $G(r, s)$ be the subset of matrices $A \in GL(d)$ having $r$ real eigenvalues and $s$ pairs of (strictly) complex conjugate eigenvalues, such that all the eigenvalues that do not belong to the same complex conjugate pair have distinct norms. Every $G(r, s)$ is open and the complement of the union $G = \bigcup_{r,s} G(r, s)$ is a small subset of $GL(d)$, meaning that it is contained in a finite union of closed proper submanifolds.

For $A \in G(r, s)$, let $\lambda_1(A), \dots, \lambda_r(A)$ be the real eigenvalues, in decreasing order of the absolute value, and $\mu_1(A), \bar\mu_1(A), \dots, \mu_s(A), \bar\mu_s(A)$ be the pairs of complex eigenvalues, also in decreasing order of the absolute value. Moreover, let $\xi_j(A) \in Gr(1, d)$ be the eigenspace corresponding to each real eigenvalue $\lambda_j(A)$ and $\eta_k(A) \in Gr(2, d)$ be the invariant plane corresponding to each pair of complex conjugate eigenvalues $\mu_k(A)$ and $\bar\mu_k(A)$. We need to analyse how these eigenvalues and invariant subspaces vary inside $G(r, s)$. Let us start with the case when all eigenvalues are real:

Lemma 7.13 The maps $A \mapsto \lambda_j(A)$ and $A \mapsto \xi_j(A)$ are smooth on $G(d, 0)$ for $j = 1, \dots, d$. Moreover, the map $G(d, 0) \to Gr(1, d)^d$, $A \mapsto (\xi_1(A), \dots, \xi_d(A))$ is a submersion.

Proof Each $\lambda_j(A)$ is a simple root of the polynomial $\det(A - \lambda\,\mathrm{id})$, and so it has a smooth continuation on $G(d, 0)$, given by the implicit function theorem. Denote $L_j(A) = A - \lambda_j(A)\,\mathrm{id}$. This matrix depends smoothly on $A \in G(d, 0)$ and, since $\lambda_j(A)$ remains a simple eigenvalue throughout, it always has rank $d - 1$. Let $L_j(A)^a$ be the adjoint matrix. Its entries are the cofactors of $L_j(A)$, and so the adjoint is non-zero and varies smoothly with $A$. In particular, the columns of $L_j(A)^a$ are smooth functions of $A$. Moreover,

$$L_j(A) \cdot L_j(A)^a = \det L_j(A)\,\mathrm{id} = 0,$$

which implies that every non-zero column of $L_j(A)^a$ is an eigenvector for $A$, associated with the eigenvalue $\lambda_j(A)$. This shows that every $A \mapsto \xi_j(A)$ is smooth.

To check that the derivative of $\xi : A \mapsto (\xi_1(A), \dots, \xi_d(A)) \in Gr(1, d)^d$ is onto, consider any differentiable curve $\beta(t) = (\beta_1(t), \dots, \beta_d(t))$ on $Gr(1, d)^d$ such that $\beta(0) = \xi(A)$. Take $t \mapsto P(t)$ to be a differentiable curve in $GL(d)$ such that the columns of $P(t)$ are non-zero vectors in the direction of the $\beta_j(t)$. Define

$$B(t) = P(t)\,\operatorname{diag}[\lambda_1(A), \dots, \lambda_d(A)]\,P(t)^{-1}. \tag{7.12}$$

Then $t \mapsto B(t)$ is also a differentiable curve in $GL(d)$ with $B(0) = A$ and $\xi(B(t)) = \beta(t)$ for all $t$. In particular, the derivative $D\xi(A)$ maps $B'(0)$ to $\beta'(0)$. So the derivative of $\xi$ is surjective, as claimed.

Next we deal with complex eigenvalues. Given $A \in GL(d)$, let $\mu$ be a complex solution of $\det(A - \mu\,\mathrm{id}) = 0$ and $v \in \mathbb{C}^d \setminus \{0\}$ be such that $Av = \mu v$. Then, since the matrix $A$ is taken to be real, $\det(A - \bar\mu\,\mathrm{id}) = 0$ and $A\bar v = \bar\mu \bar v$. Assuming that $\mu \notin \mathbb{R}$, the conjugate vectors $v$ and $\bar v$ are linearly independent and so (Exercise 7.9) the real vectors $\Re v$ and $\Im v$ are also linearly independent. Moreover, the real plane $\psi(v) = \operatorname{span}\{\Re v, \Im v\}$ is invariant under $A$. This plane depends only on the complex projective class of $v$ and so we may consider

$$\psi : \{[v] \in \mathbb{P}\mathbb{C}^d : v \text{ and } \bar v \text{ are linearly independent}\} \to Gr(2, d). \tag{7.13}$$

We invite the reader to check that this map is smooth and is a submersion (Exercise 7.10).

Lemma 7.14 The maps $A \mapsto \lambda_j(A)$ and $A \mapsto \mu_k(A)$ and $A \mapsto \xi_j(A)$ and $A \mapsto \eta_k(A)$ are smooth on $G(r, s)$, for $j = 1, \dots, r$ and $k = 1, \dots, s$. Furthermore,

$$G(r, s) \to Gr(1, d)^r \times Gr(2, d)^s, \quad A \mapsto (\xi_1(A), \dots, \xi_r(A), \eta_1(A), \dots, \eta_s(A))$$

is a submersion.

Proof Smoothness of the eigenvalues $\lambda_j$ and $\mu_k$ follows from the implicit function theorem. Recall that $\xi_j(A) \in \mathbb{P}\mathbb{R}^d$ denotes the eigenspace associated with each real eigenvalue $\lambda_j(A)$. Let $\zeta_k(A) \in \mathbb{P}\mathbb{C}^d$ be the eigenspace associated with each complex eigenvalue $\mu_k(A)$. The same arguments as in Lemma 7.13 imply that $A \mapsto \xi_j(A)$ and $A \mapsto \zeta_k(A)$ are smooth maps. By the observations preceding this lemma, it follows that $\eta_k(A) = \psi(\zeta_k(A)) \in Gr(2, d)$ is also a smooth function of $A$. Moreover (Exercise 7.10), to prove that the map in the statement is a submersion it suffices to show that

$$\zeta : A \mapsto (\xi_1(A), \dots, \xi_r(A), \zeta_1(A), \dots, \zeta_s(A)) \in (\mathbb{P}\mathbb{R}^d)^r \times (\mathbb{P}\mathbb{C}^d)^s$$

is a submersion. Let $\alpha(t) = (\beta_1(t), \dots, \beta_r(t), \gamma_1(t), \dots, \gamma_s(t))$ be any differentiable curve in $(\mathbb{P}\mathbb{R}^d)^r \times (\mathbb{P}\mathbb{C}^d)^s$ with $\alpha(0) = \zeta(A)$. Define

$$B(t) = P(t)\,\operatorname{diag}[\lambda_1(A), \dots, \lambda_r(A), \mu_1(A), \bar\mu_1(A), \dots, \mu_s(A), \bar\mu_s(A)]\,P(t)^{-1}, \tag{7.14}$$

where $t \mapsto P(t)$ is a differentiable curve such that the columns of $P(t)$ are non-zero vectors $u_j(t) \in \beta_j(t)$ and $v_k(t), \bar v_k(t)$ with $v_k(t) \in \gamma_k(t)$. Observe that $t \mapsto B(t)$ is a differentiable curve in $GL(d)$, even if $P(t)$ need not be real. Moreover, $B(0) = A$ and $\zeta(B(t)) = \alpha(t)$ for all $t$. So $D\zeta(A)$ maps $B'(0)$ to $\alpha'(0)$. This shows that the derivative $D\zeta(A)$ is indeed surjective.
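The adjoint-matrix argument in the proof of Lemma 7.13 is constructive: when $\lambda$ is a simple eigenvalue of $A$, any non-zero column of the adjoint (adjugate) of $A - \lambda\,\mathrm{id}$ is an eigenvector. The following sketch illustrates this for $3 \times 3$ real matrices given as nested lists; it assumes that `lam` is supplied exactly (or to good accuracy) and that it is a simple eigenvalue. The helper names are hypothetical.

```python
def adjugate3(M):
    """Adjugate (transpose of the cofactor matrix) of a 3x3 matrix."""
    def minor(i, j):
        # Determinant of the 2x2 matrix obtained by deleting row i, column j.
        r0, r1 = [r for r in range(3) if r != i]
        c0, c1 = [c for c in range(3) if c != j]
        return M[r0][c0] * M[r1][c1] - M[r0][c1] * M[r1][c0]

    # Entry (i, j) of the adjugate is the cofactor of entry (j, i) of M.
    return [[(-1) ** (i + j) * minor(j, i) for j in range(3)] for i in range(3)]


def eigenvector_from_adjugate(A, lam):
    """Return the column of adj(A - lam*id) with the largest norm.
    If lam is a simple eigenvalue of A, this column is an eigenvector."""
    L = [[A[i][j] - (lam if i == j else 0.0) for j in range(3)] for i in range(3)]
    adj = adjugate3(L)
    columns = [[adj[i][j] for i in range(3)] for j in range(3)]
    return max(columns, key=lambda col: sum(x * x for x in col))
```

Since the entries of the adjugate are polynomials in the entries of $A$ and in $\lambda$, this also makes the smooth dependence of the eigenvector on $A$ visible, as long as the chosen column stays away from zero.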


7.4.2 Proof of Theorem 7.12

Since the complement of $G$ is a small subset of $GL(d)$, the set of $A \in GL(d)^m$ such that $A_i \notin G$ for some $i = 1, \dots, m$ is a small subset of $GL(d)^m$. So, we may restrict ourselves to the open set

$$\mathcal{G} = \{A \in GL(d)^m : A_i \in G \text{ for every } i = 1, \dots, m\}.$$

In view of Theorem 7.9, it suffices to show that the subset of $A \in \mathcal{G}$ such that the matrices $A_i$, $i = 1, \dots, m$ admit a common invariant probability is small. If $A_i \in G(r_i, s_i)$ then, as all the eigenvalues have different absolute values, every $A_i$-invariant probability measure on $\mathbb{P}\mathbb{R}^d$ is a convex combination of Dirac masses on the eigenspaces $\xi_j(A_i)$, $j = 1, \dots, r_i$ and of probability measures supported in the invariant planes $\eta_k(A_i)$, $k = 1, \dots, s_i$ (each $\eta_k(A_i)$ is naturally identified with a 1-dimensional subspace of the projective space). In particular, the support must be contained in

$$\Sigma(A_i) = \{\xi_1(A_i), \dots, \xi_{r_i}(A_i)\} \cup \eta_1(A_i) \cup \cdots \cup \eta_{s_i}(A_i).$$

Suppose first that $d \ge 4$. Since the $\xi_j(A_i)$ are points and the $\eta_k(A_i)$ are lines in the projective space, and $\dim \mathbb{P}\mathbb{R}^d \ge 3$, it follows from Lemmas 7.13 and 7.14 that there exists a small subset $Z_1$ of $G(r_1, s_1) \times G(r_2, s_2) \times GL(d)^{m-2}$ such that

$$\xi_a(A_1) \ne \xi_b(A_2) \tag{7.15}$$
$$\xi_a(A_1) \notin \eta_e(A_2) \quad\text{and}\quad \xi_b(A_2) \notin \eta_c(A_1) \tag{7.16}$$
$$\eta_c(A_1) \cap \eta_e(A_2) = \emptyset \tag{7.17}$$

for every $A \in G(r_1, s_1) \times G(r_2, s_2) \times GL(d)^{m-2} \setminus Z_1$ and $1 \le a \le r_1$, $1 \le b \le r_2$, $1 \le c \le s_1$ and $1 \le e \le s_2$. This implies that $\Sigma(A_1) \cap \Sigma(A_2) = \emptyset$ which, by the remarks in the previous paragraph, implies that $A_1$ and $A_2$ have no common invariant probability measure. This proves the theorem in dimension $d \ge 4$.

Now suppose that $d = 3$. The previous arguments extend to this case when either $s_1 = 0$ or $s_2 = 0$; that is, when the condition (7.17) is void. When $s_1 = s_2 = 1$, there is one difficulty: since the projective space $\mathbb{P}\mathbb{R}^3$ is only 2-dimensional, one cannot force the pair of 1-dimensional submanifolds $\eta_1(A_1)$ and $\eta_1(A_2)$ to be disjoint, as required in (7.17). This can be bypassed as follows. By the same arguments as before, there exists a small subset $Z_2$ of $G(1, 1)^2 \times GL(d)^{m-2}$ such that every $A \in G(1, 1)^2 \times GL(d)^{m-2} \setminus Z_2$ satisfies (7.15) and (7.16) and $\eta_1(A_1) \ne \eta_1(A_2)$ (instead of (7.17)). Suppose that $A_1$ and $A_2$ have some common invariant probability measure $\theta$. Observing that $\Sigma(A_1) \cap \Sigma(A_2) = \eta_1(A_1) \cap \eta_1(A_2)$ consists of a single point $z$ in the projective space, $\theta$ has to be the Dirac mass at $z$. Then $z$ has to be a fixed point for $A_i$ inside $\eta_1(A_i)$, for $i = 1, 2$. This is impossible, because the eigenspace $\eta_1(A_i)$ contains no $A_i$-invariant line. This contradiction proves the theorem in dimension $d = 3$.

Now we treat the case $d = 2$. If $s_1 = s_2 = 0$ then the conditions (7.16) and (7.17) are void and we can use the same arguments as we did for $d \ge 4$. If $s_1 = 0$ and $s_2 = 1$ then $\Sigma(A_1) \cap \Sigma(A_2)$ consists of the two points $\xi_1(A_1)$ and $\xi_2(A_1)$. So, if $A_1$ and $A_2$ have a common invariant probability, then this must be a convex combination of the Dirac masses at the two eigenspaces of $A_1$. Then the union of these eigenspaces has to be invariant under $A_2$, which is impossible because the action of $A_2 \in G(0, 1)$ on the projective space is a rotation whose angle is not a multiple of $\pi$. Up to exchanging the roles of the matrices, this also covers the case $d = 2$ with $s_1 = 1$ and $s_2 = 0$.

The only remaining case, $d = 2$ with $s_1 = s_2 = 1$, requires an essentially new ingredient that we are going to borrow from Douady and Earle [50]. Recall that every matrix $B \in GL(2)$ with positive determinant induces an automorphism $h_B$ of the Poincaré half-plane $\mathbb{H}$:

$$B = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \longmapsto h_B(z) = \frac{az + b}{cz + d}. \tag{7.18}$$

The action of $B$ on the projective plane may be identified with the action of $h_B$ on the boundary of $\mathbb{H}$, via

$$\partial\mathbb{H} \to \mathbb{P}\mathbb{R}^2, \quad x \mapsto [(x, 1)]$$

(including $x = \infty$) so that $B$-invariant measures on the projective plane may be seen as $h_B$-invariant measures sitting on the real axis. It is also easy to check that $h_B$ has a fixed point in the open disc $\mathbb{H}$ if and only if $B \in G(0, 1)$. Define $\phi(B)$ to be this (unique) fixed point. It is easy to see that $B \mapsto \phi(B)$ is a smooth submersion: use the explicit expression for the fixed point extracted from (7.18).

Lemma 7.15 If $A_1, A_2 \in G(0, 1)$ have a common invariant probability measure $\mu$ on $\partial\mathbb{H}$ then $\phi(A_1) = \phi(A_2)$.

Proof It is clear that $A_1$ and $A_2$ have no invariant measures with atoms of mass larger than 1/3: such atoms would correspond to periodic points in the projective plane, with periods 1 or 2, which are forbidden by the definition of $G(0, 1)$. In Proposition 1 of [50] a map $\mu \mapsto B(\mu)$ is constructed that assigns to each probability measure $\mu$ with no atoms of mass $\ge 1/2$ (see Remark 2 in [50, page 26]) a point $B(\mu)$ in the half-plane $\mathbb{H}$, called the conformal barycenter of $\mu$, such that $B(h_* \mu) = h(B(\mu))$ for every conformal automorphism $h : \mathbb{H} \to \mathbb{H}$. When $\mu$ is $A_i$-invariant this implies $h_{A_i}(B(\mu)) = B((h_{A_i})_* \mu) = B(\mu)$, and so the conformal barycenter must coincide with the fixed point $\phi(A_i)$ of the automorphism $h_{A_i}$. Thus, if $\mu$ is an invariant measure for both $A_1$ and $A_2$ then $\phi(A_1) = B(\mu) = \phi(A_2)$.

Since $\phi$ is a submersion, $Z_3 = \{A \in G(0, 1)^2 \times GL(d)^{m-2} : \phi(A_1) = \phi(A_2)\}$ is a small subset of $G(0, 1)^2 \times GL(d)^{m-2}$. By Lemma 7.15, if $A$ is in the complement of $Z_3$ then there are no common invariant probability measures.

Altogether, this proves that $Z$ is a small subset of $GL(d)^m$, as claimed in the first part of Theorem 7.12. The second part is an immediate consequence, because the closure of a small subset is still a small subset, and small subsets have volume zero and, consequently, are nowhere dense.
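The "explicit expression for the fixed point extracted from (7.18)" can be written out as follows. This is a sketch under the assumption that $B$ is a real $2 \times 2$ matrix with positive determinant and non-real eigenvalues: the fixed points of $h_B$ are the roots of $cz^2 + (d - a)z - b = 0$, whose discriminant $(a - d)^2 + 4bc = (\operatorname{tr} B)^2 - 4\det B$ is negative in that case, and $\phi(B)$ is the root with positive imaginary part. The function names are hypothetical.

```python
import math


def mobius(B, z):
    """Action h_B(z) = (a z + b) / (c z + d) of B = ((a, b), (c, d))."""
    (a, b), (c, d) = B
    return (a * z + b) / (c * z + d)


def fixed_point_in_H(B):
    """Fixed point of h_B in the upper half-plane, for a real matrix B
    with positive determinant and a pair of non-real eigenvalues."""
    (a, b), (c, d) = B
    det = a * d - b * c
    disc = (a + d) ** 2 - 4 * det  # equals (a - d)^2 + 4 b c
    if det <= 0 or disc >= 0:
        raise ValueError("expected det > 0 and non-real eigenvalues")
    # disc < 0 forces b*c < 0, so c is non-zero and the division is safe.
    z = complex(a - d, math.sqrt(-disc)) / (2 * c)
    # The two roots are complex conjugates; keep the one in the half-plane.
    return z if z.imag > 0 else z.conjugate()
```

In the setting of Lemma 7.15, comparing `fixed_point_in_H(A1)` with `fixed_point_in_H(A2)` gives a quick numerical test: if the two values differ, the matrices cannot share an invariant probability measure on the boundary. For a rotation matrix the function returns the point $i$, as expected.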

7.5 Notes

Theorem 7.2 is a special case of the main result of Ledrappier [81]. Ledrappier shows how to deduce both Furstenberg's theorem (Theorem 7.9) and a celebrated theorem of Kotani [77]. The proof in Section 7.2 is from Avila and Viana [16], where a much more general statement is obtained.

The invariance principle for linear cocycles goes back to Bonatti, Gomez-Mont and Viana [37]. They considered both locally constant cocycles and Hölder-continuous cocycles with invariant holonomies over hyperbolic homeomorphisms with invariant probabilities having local product structure (the notion of local product structure was defined in the Notes to Chapter 5 and that of invariant holonomies will be discussed in Section 10.6). Viana [113] used a non-uniform version of these methods to prove that almost all linear cocycles over any hyperbolic system, uniform or not, have some non-vanishing Lyapunov exponent.

The expression invariance principle was coined by Avila and Viana [16], who extended the statement to smooth cocycles over hyperbolic homeomorphisms; that is, cocycles that act by diffeomorphisms on the fibers. Avila, Santamaria and Viana [13] further extended the statement to smooth cocycles over partially hyperbolic volume-preserving diffeomorphisms. This turned the invariance principle into a very flexible tool, with several applications in the theory of partially hyperbolic systems. Among these, we mention the rigidity results of Avila and Viana [16], for symplectic diffeomorphisms, and Avila, Viana and Wilkinson [17], for time-one maps of geodesic flows, the construction of measures of maximal entropy


by Rodriguez-Hertz, Rodriguez-Hertz, Tahzibi and Ures [64] and the results of Viana and Yang [115] on the existence and finiteness of physical measures.

Theorem 7.9 and Corollary 7.11 extend Theorem 6.11 to arbitrary dimension. They were originally due to Furstenberg [54] and also hold more generally than stated (see [81, Corollary 1]): it suffices to assume that $\mathcal{A}^+ \cap \mathcal{A}^-$ contains only zero and full measure sets, where $\mathcal{A}^{\pm}$ is the $\sigma$-algebra generated by the functions $A \circ f^{\pm n}$, $n \ge 1$. Clearly, this is the case if $(f, \mu)$ is a Bernoulli shift and $A$ is locally constant.

Section 7.4 is a simplified version of material from Bonatti, Gomez-Mont and Viana [37], which has also been extended by Avila, Santamaria and Viana [13] to cocycles over partially hyperbolic diffeomorphisms. In fact, for the 2-dimensional case we follow the latter paper more closely.

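7.6 Exercises

Exercise 7.1 Let $N = \mathbb{P}\mathbb{R}^d$ and $\Psi : N \to N$ be the projective action of some $B \in \mathrm{GL}(d)$. Prove that, for every $[v] \in N$ and $\xi \in T_{[v]}N$,
(1) $T_{[v]}N$ = orthogonal complement of $[v]$;
(2) $D\Psi([v])\xi$ = orthogonal projection of $\dfrac{\|v\|}{\|B(v)\|}\, B(\xi)$ to $T_{[B(v)]}N$;
(3) $\|D\Psi([v])\| \le \|B\|\,\|B^{-1}\|$ and $\|D\Psi([v])^{-1}\| \le \|B\|\,\|B^{-1}\|$;
(4) $|\det D\Psi([v])| = |\det B| \left(\dfrac{\|v\|}{\|B(v)\|}\right)^{d}$.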

Exercise 7.2 Let $\theta$ be a probability measure on $\mathbb{P}\mathbb{R}^d$ and consider any finite family of projective hyperplanes $H_1, \dots, H_n \subset \mathbb{P}\mathbb{R}^d$. Show that there exists some orthogonal transformation $B : \mathbb{R}^d \to \mathbb{R}^d$ such that $\theta(B(H_j)) = 0$ for every $j = 1, \dots, n$.

Exercise 7.3 Prove that if $m$ is an $F$-invariant probability measure that projects down to $\mu$ then almost every ergodic component $m_\alpha$ projects down to $\mu$.

Exercise 7.4 Prove that Corollary 6.15 extends to any dimension $d \ge 2$. Namely, that the stabilizer $H(\theta) = \{B \in \mathrm{GL}(d) : B_*\theta = \theta\}$ of any non-atomic probability measure $\theta$ on $\mathbb{P}\mathbb{R}^d$ is a compact subgroup of $\mathrm{GL}(d)$.

Exercise 7.5 Show that Exercise 6.9 extends to any dimension $d \ge 2$. Namely, that if (a) the support of $\nu$ is contained in no compact subgroup of $\mathrm{GL}(d)$ and (b) the cocycle is strongly irreducible, then there exists no probability measure $\eta$ on $\mathbb{P}\mathbb{R}^d$ such that $B_*\eta = \eta$ for every $B \in \operatorname{supp}\nu$.


Exercise 7.6 Show that if the support of $\nu$ is contained in a compact subgroup of $\mathrm{GL}(d)$, $d \ge 2$, then there exists some probability measure $\eta$ on $\mathbb{P}\mathbb{R}^d$ such that $B_*\eta = \eta$ for every $B \in \operatorname{supp}\nu$.

Exercise 7.7 Show that there exist $\mathrm{SL}(3)$-cocycles that are not irreducible and yet admit no probability measure $\eta$ on $\mathbb{P}\mathbb{R}^3$ with $B_*\eta = \eta$ for every $B \in \operatorname{supp}\nu$. Check that this is the case for any random product of $A_1$ = shear along the $xy$-plane, $A_2$ = irrational rotation around the $z$-axis and $A_3$ = hyperbolic map on the $xy$-plane cross identity along the $z$-axis, with probabilities $p_1, p_2, p_3 > 0$.

Exercise 7.8 Check that the submanifolds in Theorem 7.12 may be chosen with codimension $\ge [m/2]$.

Exercise 7.9 Given $v \in \mathbb{C}^d$, prove that the following conditions are equivalent:
(1) $v$ and $\bar v$ are $\mathbb{C}$-linearly dependent;
(2) $\Re v$ and $\Im v$ are $\mathbb{R}$-linearly dependent;
(3) $v = e^{i\theta} w$ for some $\theta \in \mathbb{R}$ and $w \in \mathbb{R}^d$.

Exercise 7.10 Show that the set $R = \{[w] \in \mathbb{P}\mathbb{C}^d : w \in \mathbb{R}^d \setminus \{0\}\}$ is closed in $\mathbb{P}\mathbb{C}^d$ and the map $\psi : \mathbb{P}\mathbb{C}^d \setminus R \to \mathrm{Gr}(2, d)$ in (7.13) is a submersion. This can be done in the following steps.
(1) Let $1 \le i < j \le d$ be fixed. Given vectors $u, v \in \mathbb{R}^d$, let $\pi_{i,j}(u, v)$ be the $2 \times 2$-matrix whose rows are $(u_i, v_i)$ and $(u_j, v_j)$, and $\pi^*_{i,j}(u, v)$ be the $(d-2) \times 2$-matrix whose rows are $(u_l, v_l)$ for $l \notin \{i, j\}$. Denote by $G_{i,j}$ the subset of planes $L \in \mathrm{Gr}(2, d)$ admitting a basis $\{u, v\}$ such that $\pi_{i,j}(u, v)$ is invertible. Check that the matrix $\phi_{i,j}(L) = \pi^*_{i,j}(u, v)\,\pi_{i,j}(u, v)^{-1}$ does not depend on the choice of the basis, and these maps $\phi_{i,j} : G_{i,j} \to \mathbb{R}^{2(d-2)}$ form an atlas of $\mathrm{Gr}(2, d)$.
(2) Observe that the map $\psi$ is given by $\psi([w]) = \pi^*_{i,j}(\Re w, \Im w)\,\pi_{i,j}(\Re w, \Im w)^{-1}$ in such local coordinates. Deduce that $\psi$ is smooth and the derivative is given by
$$D\psi([w])\dot w = \pi^*_{i,j}(\Re \dot w, \Im \dot w)\,\pi_{i,j}(\Re w, \Im w)^{-1} - \pi^*_{i,j}(\Re w, \Im w)\,\pi_{i,j}(\Re w, \Im w)^{-1}\,\pi_{i,j}(\Re \dot w, \Im \dot w)\,\pi_{i,j}(\Re w, \Im w)^{-1},$$
for every $\dot w \in T_{[w]}\mathbb{P}\mathbb{C}^d$. Given $\dot B \in T_{\psi([w])}\mathrm{Gr}(2, d)$ (a real $(d-2) \times 2$ matrix), consider $\dot u, \dot v \in \mathbb{R}^d$ such that $\pi^*_{i,j}(\dot u, \dot v) = \dot B\, \pi_{i,j}(\Re w, \Im w)$ and $\pi_{i,j}(\dot u, \dot v) = 0$. Let $\dot w = \dot u + i \dot v$ and check that $D\psi([w])\dot w = \dot B$. Conclude that the derivative of $\psi$ is indeed surjective.


8 Simplicity

In the previous two chapters we found sufficient conditions for the largest and smallest Lyapunov exponents $\lambda_+$ and $\lambda_-$ to be distinct. We are now going to see that, under only mildly stronger conditions, all Lyapunov exponents are distinct.

Let $F : M \times \mathbb{R}^d \to M \times \mathbb{R}^d$ be the linear cocycle defined over a measure-preserving transformation $f : (M, \mu) \to (M, \mu)$ by some measurable function $A : M \to \mathrm{GL}(d)$ such that $\log^+ \|A^{\pm 1}\| \in L^1(\mu)$. The Lyapunov spectrum of $F$ is simple if all the Lyapunov exponents have multiplicity 1; that is, if
$$\dim V_x^j - \dim V_x^{j+1} = 1 \quad \text{for every } j, \tag{8.1}$$
where $\mathbb{R}^d = V_x^1 \supsetneq \cdots \supsetneq V_x^k \supsetneq \{0\}$ is the Oseledets flag in Theorem 4.1. In other words, $F$ has $d$ distinct Lyapunov exponents. When $F$ is invertible, condition (8.1) may be written as $\dim E_x^j = 1$ for every $j$, where $E_x^1 \oplus \cdots \oplus E_x^k$ is the Oseledets decomposition in Theorem 4.2.

The main result in this chapter (Theorem 8.1) is a general criterion for simplicity. We restrict ourselves to the case when $F : M \times \mathbb{R}^d \to M \times \mathbb{R}^d$ is locally constant, although the statement is a lot more general. That is, we suppose that $(f, \mu)$ is a Bernoulli shift, either one-sided or two-sided, and the function $A : M \to \mathrm{GL}(d)$ depends only on the zeroth coordinate. Let $\nu = A_*\mu$ be the probability measure on $\mathrm{GL}(d)$ obtained by push-forward of $\mu$ under $A$. We need to recall a few elementary notions from linear algebra.

8.1 Pinching and twisting

The singular values of a linear operator $B \in \mathrm{GL}(d)$ are the positive square roots $\sigma_1, \dots, \sigma_d$ of the eigenvalues of the self-adjoint operator $B^t B$. They have the following geometric interpretation: the image of the unit sphere under $B : \mathbb{R}^d \to$

Downloaded from University Publishing Online. This is copyrighted material http://dx.doi.org/10.1017/CBO9781139976602.009


$\mathbb{R}^d$ is an ellipsoid with semi-axes of length $\sigma_1, \dots, \sigma_d$. We always take the singular values to be numbered in non-increasing order: $\sigma_1(B) \ge \cdots \ge \sigma_d(B)$. The eccentricity of $B$ is defined by
$$\operatorname{ecc}(B) = \inf_{1 \le k \le d-1} \frac{\sigma_k(B)}{\sigma_{k+1}(B)}.$$

Given a $(d-k)$-dimensional subspace $G$ of $\mathbb{R}^d$, the hyperplane section dual to $G$ is the set of all $F \in \mathrm{Gr}(k, d)$ such that $F \cap G \ne \{0\}$. This is a closed, nowhere-dense subset of $\mathrm{Gr}(k, d)$.

A monoid is a set $\mathcal{B}$ endowed with an associative binary operation that admits a unit element. The monoid associated with the linear cocycle $F$ is the set $\mathcal{B} \subset \mathrm{GL}(d)$ of all products $B_1 \cdots B_n$ with $B_i \in \operatorname{supp}\nu$ for $1 \le i \le n$ and $n \ge 0$ (for $n = 0$ interpret the product to be the identity). A monoid $\mathcal{B} \subset \mathrm{GL}(d)$ is
• pinching if it contains matrices with arbitrarily large eccentricity;
• twisting if, given any $1 \le k \le d-1$, any $F \in \mathrm{Gr}(k, d)$, and any finite family $G_1, \dots, G_n \in \mathrm{Gr}(d-k, d)$, there exists $B \in \mathcal{B}$ such that $B(F) \cap G_i = \{0\}$ for every $1 \le i \le n$.
We leave it to the reader to check that for $d = 2$ these conditions coincide with those in the definitions of pinching and twisting in Section 1.2.

Theorem 8.1 Suppose that the monoid $\mathcal{B}$ associated with $F$ is pinching and twisting. Then the Lyapunov spectrum of the linear cocycle $F$ is simple.

The hypotheses of the theorem are typical among locally constant cocycles (Exercise 8.1). It is also interesting to compare these hypotheses with those of the high-dimensional Furstenberg theorem. It is clear that pinching implies the non-compactness condition (a) in Corollary 7.11 and the implication is strict if $d \ge 3$. Moreover, twisting implies the strong irreducibility condition (b) in Corollary 7.11. Lemma 8.7 below contains a kind of converse to this last observation.
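To get a feeling for the twisting condition, one can test it by brute force in dimension $d = 2$, where $F$ and the $G_i$ are lines and $B(F) \cap G_i = \{0\}$ just means that two vectors are not parallel. The Python sketch below searches the words of bounded length in a finite set of generators; it is a finite experiment, not a proof of twisting, and the generators $H$ and $R$, the tolerance and all function names are assumptions made for the illustration.

```python
import itertools
import math

IDENTITY = ((1.0, 0.0), (0.0, 1.0))

def matmul(A, B):
    """Product of two 2x2 matrices given as tuples of rows."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def apply(B, v):
    """Image of the vector v under the matrix B."""
    return (B[0][0] * v[0] + B[0][1] * v[1], B[1][0] * v[0] + B[1][1] * v[1])

def transverse(u, g, tol=1e-9):
    """True if the lines spanned by u and g meet only at 0 (non-zero cross product)."""
    cross = u[0] * g[1] - u[1] * g[0]
    return abs(cross) > tol * math.hypot(*u) * math.hypot(*g)

def find_twisting_word(generators, f, gs, max_len=4):
    """Search products B = B_1 ... B_n (n <= max_len) of generators with
    B(F) transverse to every G_i; return the tuple of generator indices, or None."""
    for n in range(max_len + 1):
        for word in itertools.product(range(len(generators)), repeat=n):
            B = IDENTITY
            for i in word:
                B = matmul(B, generators[i])
            image = apply(B, f)
            if all(transverse(image, g) for g in gs):
                return word
    return None

# Hypothetical generators: a hyperbolic matrix and a rotation by 1 radian.
H = ((2.0, 0.0), (0.0, 0.5))
R = ((math.cos(1.0), -math.sin(1.0)), (math.sin(1.0), math.cos(1.0)))
```

With $F$ spanned by $e_1$ and $G_1, G_2$ the two coordinate axes, the monoid generated by $H$ alone never moves $F$ off $G_1$, while adding the rotation $R$ produces a suitable element immediately.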

8.2 Proof of the simplicity criterion

For proving Theorem 8.1, it is no restriction to suppose the Bernoulli shift $f : (M, \mu) \to (M, \mu)$ to be two-sided, $M = X^{\mathbb{Z}}$ and $\mu = p^{\mathbb{Z}}$, and we do so. Given $B \in \mathrm{GL}(d)$ and $1 \le k \le d-1$ such that $\sigma_k(B) > \sigma_{k+1}(B)$, define $E^u_k(B) \in \mathrm{Gr}(k, d)$ and $E^s_k(B) \in \mathrm{Gr}(d-k, d)$ to be the subspaces given by:
• $E^u_k(B)$ is spanned by the eigenvectors of $B^t B$ with eigenvalue $\ge \sigma_k(B)^2$;


• $E^s_k(B)$ is spanned by the eigenvectors of $B^t B$ with eigenvalue $\le \sigma_{k+1}(B)^2$.
In geometric terms: $E^u_k(B)$ is the pre-image of the subspace generated by the $k$ largest semi-axes, and $E^s_k(B)$ is the pre-image of the subspace generated by the $d-k$ smallest semi-axes of the ellipsoid $\{B(v) : \|v\| = 1\}$.

Proposition 8.2 For each $1 \le k \le d-1$ there exists a measurable function $\xi^u : M^- \to \mathrm{Gr}(k, d)$ such that $\hat\xi^u = \xi^u \circ \pi^- : M \to \mathrm{Gr}(k, d)$ satisfies
(1) $\hat\xi^u(f(x)) = A(x)\hat\xi^u(x)$ for $\mu$-almost every $x \in M$.
(2) For $\mu$-almost every $x \in M$,
$$\frac{\sigma_{d-k}(A^{-n}(x))}{\sigma_{d-k+1}(A^{-n}(x))} \to \infty \quad \text{and} \quad E^s_{d-k}(A^{-n}(x)) \to \hat\xi^u(x).$$
(3) For every hyperplane section $S \subset \mathrm{Gr}(k, d)$ there exists a set of $x^- \in M^-$ with positive $\mu^-$-measure such that $\xi^u(x^-) \notin S$.

The proof of this proposition will be given in a while. Replacing $k$ by $d-k$ and reversing time, we also obtain:

Proposition 8.3 For each $1 \le k \le d-1$ there exists a measurable function $\xi^s : M^+ \to \mathrm{Gr}(d-k, d)$ such that $\hat\xi^s = \xi^s \circ \pi^+ : M \to \mathrm{Gr}(d-k, d)$ satisfies
(1) $\hat\xi^s(f(x)) = A(x) \cdot \hat\xi^s(x)$ for $\mu$-almost every $x \in M$.
(2) For $\mu$-almost every $x \in M$,
$$\frac{\sigma_k(A^n(x))}{\sigma_{k+1}(A^n(x))} \to \infty \quad \text{and} \quad E^s_k(A^n(x)) \to \hat\xi^s(x).$$
(3) For every hyperplane section $S \subset \mathrm{Gr}(d-k, d)$ there exists a set of $x^+ \in M^+$ with positive $\mu^+$-measure such that $\xi^s(x^+) \notin S$.

Proof of Theorem 8.1 Let $1 \le k \le d-1$ be fixed. Let $\xi^u : M^- \to \mathrm{Gr}(k, d)$ be as in Proposition 8.2 and $\xi^s : M^+ \to \mathrm{Gr}(d-k, d)$ be as in Proposition 8.3. We write each $x \in M$ as $(x^-, x^+)$ with $x^- = \pi^-(x)$ and $x^+ = \pi^+(x)$. We claim that
$$\xi^u(x^-) \cap \xi^s(x^+) = \{0\} \quad \text{for } \mu\text{-almost every } x \in M. \tag{8.2}$$
Indeed, the set of $x \in M$ for which this property holds is $(f, \mu)$-invariant. So, by ergodicity, either the claim is true or else $\xi^u(x^-) \cap \xi^s(x^+) \ne \{0\}$ for $\mu$-almost every $x$. By Fubini, the latter would imply that there exists $x^+ \in M^+$ such that $\xi^u(x^-)$ is contained in the hyperplane section dual to $\xi^s(x^+)$ for $\mu^-$-almost every $x^- \in M^-$, and that would contradict part (3) of Proposition 8.2. This contradiction proves (8.2).

Thus, we have an $F$-invariant measurable decomposition $\mathbb{R}^d = \hat\xi^s(x) \oplus \hat\xi^u(x)$ defined at almost every point. We want to prove that the Lyapunov exponents of


$F$ along $\hat\xi^u$ are strictly bigger than those along $\hat\xi^s$. Denote $\xi^s_n(x) = E^s_k(A^n(x))$. By Proposition 8.3, the sequence $\xi^s_n$ converges to $\hat\xi^s$ at $\mu$-almost every point. In particular, the angle between $\xi^s_n(x)$ and $\hat\xi^u(x)$ is bounded from zero for all large $n$. So, we may choose $\alpha > 0$, $n_0 \ge 1$, and a set $Z \subset M$ with $\mu(Z) > 0$ such that
$$\hat\xi^u(x) \subset C(\xi^s_n(x)^\perp, \alpha) \quad \text{and} \quad \hat\xi^s(x) \cap C(\xi^s_n(x)^\perp, 4\alpha) = \emptyset \tag{8.3}$$
for every $n \ge n_0$ and $x \in Z$. Note that $C(V, a)$ denotes the cone of radius $a > 0$ around a subspace $V$ of $\mathbb{R}^d$, as defined in Section 4.4.2. Fix $\beta \in (0, 1/10)$ such that (Exercise 4.10) $C(\hat\xi^u(x), 4\beta) \subset C(\xi^s_n(x)^\perp, 2\alpha)$. Then, for any $x \in Z$ and $n \ge n_0$,
$$A^n(x)\big(C(\hat\xi^u(x), 4\beta)\big) \subset C\big(A^n(x)(\xi^s_n(x)^\perp),\, 2\alpha E_n(x)\big) \tag{8.4}$$
where
$$E_n(x) = \frac{\sigma_{k+1}(A^n(x))}{\sigma_k(A^n(x))}.$$
By Proposition 8.3, $E_n(x)$ goes to zero as $n \to \infty$. Thus, increasing $n_0$ and reducing $Z$ if necessary, we may suppose $2\alpha E_n(x) \le \beta$ for every $n \ge n_0$ and $x \in Z$. Now, (8.4) and the fact that $\hat\xi^u$ is $F$-invariant imply that
$$\hat\xi^u(f^n(x)) \in C\big(A^n(x)(\xi^s_n(x)^\perp), \beta\big).$$
Then $C\big(A^n(x)(\xi^s_n(x)^\perp), \beta\big) \subset C\big(\hat\xi^u(f^n(x)), 3\beta\big)$ (Exercise 4.10). Altogether, this proves that
$$A^n(x)\big(C(\hat\xi^u(x), 4\beta)\big) \subset C(\hat\xi^u(f^n(x)), 3\beta) \tag{8.5}$$
for every $n \ge n_0$ and $x \in Z$. Reducing $Z$ once more, if necessary, we may assume that its first $n_0$ iterates are pairwise disjoint. Equivalently, the first return time $r(x) \ge n_0$ for every $x \in Z$. Let $g : Z \to Z$, $g(x) = f^{r(x)}(x)$ be the first return map and $G : Z \times \mathbb{R}^d \to Z \times \mathbb{R}^d$ be the linear cocycle $G(x, v) = (g(x), B(x)v)$ with $B(x) = A^{r(x)}(x)$. Then property (8.5), applied with $n = r(x)$, implies that every $B(x)$ maps the cone $C(\hat\xi^u(x), 4\beta)$ inside a strictly smaller cone $C(\hat\xi^u(g(x)), 3\beta)$. By (8.3), these cones are disjoint from $\hat\xi^s(x)$. So, we are in a position to apply Proposition 4.19 to conclude that the Lyapunov exponents of $G$ along $\hat\xi^u(x)$ are strictly bigger than the Lyapunov exponents of $G$ along $\hat\xi^s(x)$. Then, by Proposition 4.18, the same is true for the original cocycle $F$: the first $k$ exponents are strictly larger than all the remaining ones. Since $k$ is arbitrary, this proves that all the Lyapunov exponents of $F$ are distinct, as claimed in the theorem.
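The passage from $F$ to the induced cocycle $G$ over the first return map can be made concrete along a finite orbit segment. In the Python sketch below, which is only an illustration in dimension 2, the list `matrices` plays the role of $A(f^j(x))$ and the flags `in_Z` record whether $f^j(x) \in Z$; both inputs, and the function names, are hypothetical and chosen by us.

```python
IDENTITY = ((1, 0), (0, 1))

def matmul(A, B):
    """Product of two 2x2 matrices given as tuples of rows."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def induced_cocycle(matrices, in_Z):
    """For each visit time t to Z that is followed by another visit, return
    (t, r, B), where r is the first return time of f^t(x) and
    B = A(f^{t+r-1}(x)) ... A(f^t(x)) is the induced matrix A^r(f^t(x))."""
    visits = [j for j, flag in enumerate(in_Z) if flag]
    returns = []
    for t, t_next in zip(visits, visits[1:]):
        B = IDENTITY
        for j in range(t, t_next):
            B = matmul(matrices[j], B)   # later matrices multiply on the left
        returns.append((t, t_next - t, B))
    return returns
```

For instance, with a shear $S$ at times 0 and 3, a diagonal matrix $H$ at the other times, and visits to $Z$ at times 0 and 3, the single induced matrix is the product $H H S$.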


Now let us prove Proposition 8.2 and Proposition 8.3.

8.3 Invariant section

Let $F_k : M \times \mathrm{Gr}(k, d) \to M \times \mathrm{Gr}(k, d)$ be the Grassmannian cocycle associated with the linear cocycle $F : M \times \mathbb{R}^d \to M \times \mathbb{R}^d$. That is, $F_k(x, V) = (f(x), A(x)V)$ for every $(x, V) \in M \times \mathrm{Gr}(k, d)$. For $k = 1$ this is just the projective cocycle $\mathbb{P}F : M \times \mathbb{P}\mathbb{R}^d \to M \times \mathbb{P}\mathbb{R}^d$. For the proofs of Propositions 8.2 and 8.3, we need to expand a bit more on the formalism introduced in Section 4.3.2.

8.3.1 Grassmannian structures

Let $E = \mathbb{R}^d$ and $\Lambda E$ be the disjoint union of the exterior powers $\Lambda^k E$ with $0 \le k \le d$. The exterior product on $E$ is the bilinear map $\wedge : \Lambda E \times \Lambda E \to \Lambda E$, characterized by
$$(u_1 \wedge \cdots \wedge u_k) \wedge (v_1 \wedge \cdots \wedge v_l) = u_1 \wedge \cdots \wedge u_k \wedge v_1 \wedge \cdots \wedge v_l.$$
Let $H$ be a hyperplane (that is, a codimension-1 linear subspace) of the vector space $\Lambda^k E$. Then $H$ may be written as $H = \{\omega \in \Lambda^k E : \omega \wedge \upsilon = 0\}$ for some non-zero $\upsilon \in \Lambda^{d-k} E$. We call the hyperplane geometric if $\upsilon$ may be chosen to be a decomposable $(d-k)$-vector; that is, $\upsilon = \upsilon_{k+1} \wedge \cdots \wedge \upsilon_d$ for some choice of vectors $\upsilon_i$ in $E$. Observe that
$$\omega \in H \iff \omega \wedge \upsilon = 0 \iff \Psi(\omega) \cap \Psi(\upsilon) \ne \{0\}$$
for any $\omega = \omega_1 \wedge \cdots \wedge \omega_k$ (the map $\Psi$ was introduced in Section 4.3.2). Thus, the intersection of $\mathrm{Gr}(k, d)$ with the projective image of the geometric hyperplane $H$ coincides with the hyperplane section of $\mathrm{Gr}(k, d)$ dual to the $(d-k)$-dimensional subspace $\Psi(\upsilon)$ generated by the vectors $\upsilon_{k+1}, \dots, \upsilon_d$. This shows that the hyperplane sections of the Grassmannian manifold are, precisely, the intersections of $\mathrm{Gr}(k, d)$ with the projective images of geometric hyperplanes of $\Lambda^k E$. As observed before, hyperplane sections are closed nowhere-dense subsets of the Grassmannian manifold.

A geometric subspace of $\Lambda^k E$ is a finite intersection of geometric hyperplanes. Similarly, a linear section of $\mathrm{Gr}(k, d)$ is a finite intersection of hyperplane sections. Equivalently, a linear section is the intersection of the Grassmannian with the projective image of some geometric subspace of the exterior power.


We call linear arrangement in $\Lambda^k E$ any finite union of geometric subspaces. A linear arrangement in $\mathrm{Gr}(k, d)$ is a finite union of linear sections. Thus, a linear arrangement in $\mathrm{Gr}(k, d)$ is the intersection of the Grassmannian manifold with the projective image of some linear arrangement in $\Lambda^k E$. In particular, every linear arrangement is a closed nowhere-dense subset of the Grassmannian.

Lemma 8.4 If $\{\mathcal{L}_\alpha : \alpha \in I\}$ is an arbitrary family of linear arrangements in $\mathrm{Gr}(k, d)$ then $\bigcap_{\alpha \in I} \mathcal{L}_\alpha$ coincides with the intersection of the $\mathcal{L}_\alpha$ over a finite subfamily.

Proof By definition, for each $\alpha \in I$ there exists some linear arrangement $L_\alpha$ in $\Lambda^k E$ such that $\mathcal{L}_\alpha = \mathbb{P}L_\alpha \cap \mathrm{Gr}(k, d)$. Let $I_1 \subset I$ be an arbitrary finite subset. We may write
$$\bigcap_{\alpha \in I_1} L_\alpha = V_1 \cup \cdots \cup V_m$$
where every $V_j \subset \Lambda^k E$ is a geometric subspace. Renumbering the $V_j$ if necessary, we may suppose that there is $s \in \{0, \dots, m\}$ such that $V_j \subset \bigcap_{\alpha \in I} L_\alpha$ if and only if $1 \le j \le s$. If $s = m$ then $\bigcap_{\alpha \in I_1} L_\alpha = \bigcap_{\alpha \in I} L_\alpha$ and so the claim is proved. Otherwise, for each $j \in \{s+1, \dots, m\}$ there is $\alpha_j \in I$ such that $V_j$ is not contained in $L_{\alpha_j}$. Let $I_2 = I_1 \cup \{\alpha_{s+1}, \dots, \alpha_m\}$. Then
$$\bigcap_{\alpha \in I_2} L_\alpha = W_1 \cup \cdots \cup W_n$$
where $W_i = V_i$ for $i \le s$ and each $W_i$ with $i > s$ is a proper subspace of some $V_j$ with $j > s$. In particular,
$$\max\Big\{\dim W_j : W_j \not\subset \bigcap_{\alpha \in I} L_\alpha\Big\} \le \max\Big\{\dim V_j : V_j \not\subset \bigcap_{\alpha \in I} L_\alpha\Big\} - 1.$$
Thus, repeating this procedure not more than $\dim \Lambda^k E$ times, we find a finite set $I_r \subset I$ such that $\bigcap_{\alpha \in I_r} L_\alpha = \bigcap_{\alpha \in I} L_\alpha$. Then
$$\bigcap_{\alpha \in I} \mathcal{L}_\alpha = \mathbb{P}\Big(\bigcap_{\alpha \in I} L_\alpha\Big) \cap \mathrm{Gr}(k, d) = \mathbb{P}\Big(\bigcap_{\alpha \in I_r} L_\alpha\Big) \cap \mathrm{Gr}(k, d) = \bigcap_{\alpha \in I_r} \mathcal{L}_\alpha,$$
which proves the claim.

Corollary 8.5 The family of linear arrangements in $\mathrm{Gr}(k, d)$ is closed under finite unions and arbitrary intersections.

Proof The statement about finite unions and finite intersections is an immediate consequence of the definition of linear arrangement. The claim for arbitrary intersections follows directly from Lemma 8.4.


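Corollary 8.6 Let $\mathcal{L}$ be a linear arrangement in $\mathrm{Gr}(k, d)$ and let $B \in \mathrm{GL}(d)$. If $B(\mathcal{L}) \subset \mathcal{L}$ then $B(\mathcal{L}) = \mathcal{L}$.

Proof Consider the non-increasing sequence $B^n(\mathcal{L})$, $n \ge 1$. By Lemma 8.4, the intersection of this sequence coincides with the intersection over finitely many of its terms, and so there exists $p \ge 1$ such that $B^n(\mathcal{L}) = B^p(\mathcal{L})$ for every $n \ge p$. In particular, $B^{p+1}(\mathcal{L}) = B^p(\mathcal{L})$ and, applying $B^{-p}$ to both sides, this gives $B(\mathcal{L}) = \mathcal{L}$.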

8.3.2 Linear arrangements and the twisting property

A linear arrangement is proper if it is neither empty nor the whole $\mathrm{Gr}(k, d)$.

Lemma 8.7 A monoid $\mathcal{B}$ is twisting if and only if, for any $1 \le k \le d-1$, there is no proper linear arrangement $\mathcal{L}$ in $\mathrm{Gr}(k, d)$ with $B(\mathcal{L}) = \mathcal{L}$ for all $B \in \mathcal{B}$.

Proof Suppose that there exists a proper linear arrangement $\mathcal{L} \subset \mathrm{Gr}(k, d)$ invariant under every $B \in \mathcal{B}$. Fix $F \in \mathcal{L}$ and consider $G_1, \dots, G_m \in \mathrm{Gr}(d-k, d)$ such that $\mathcal{L}$ is contained in the union of the hyperplane sections $S_1, \dots, S_m$ dual to $G_1, \dots, G_m$. Then, for every $B \in \mathcal{B}$, there exists $j$ such that $B(F) \in S_j$ or, equivalently, $B(F) \cap G_j \ne \{0\}$. This shows that $\mathcal{B}$ is not twisting.

Conversely, suppose that $\mathcal{B}$ is not twisting. Then, there exist $F, G_1, \dots, G_m$ such that, for every $B \in \mathcal{B}$, there exists $j \in \{1, \dots, m\}$ such that $B(F)$ belongs to the hyperplane section $S_j$ dual to $G_j$. Consider
$$\mathcal{L} = \bigcap_{B \in \mathcal{B}} B^{-1}\big(S_1 \cup \cdots \cup S_m\big).$$
Then $\mathcal{L}$ is a linear arrangement, by Corollary 8.5, and it is proper, since $F \in \mathcal{L}$. Moreover, $B^{-1}(\mathcal{L}) \supset \mathcal{L}$ for every $B \in \mathcal{B}$. Using Corollary 8.6 we conclude that $B(\mathcal{L}) = \mathcal{L}$ for every $B \in \mathcal{B}$. This finishes the proof.

Corollary 8.8 The inverse monoid $\mathcal{B}^{-1} = \{B^{-1} : B \in \mathcal{B}\}$ is pinching or twisting if and only if $\mathcal{B}$ is pinching or twisting, respectively.

Proof By Exercise 8.2, any $B$ and $B^{-1}$ have the same eccentricity. Thus, $\mathcal{B}^{-1}$ is pinching if and only if $\mathcal{B}$ is pinching. The claim about twisting is a direct consequence of Lemma 8.7, because $B(\mathcal{L}) = \mathcal{L}$ if and only if $B^{-1}(\mathcal{L}) = \mathcal{L}$.

The proof of the following corollary mimics that of Lemma 6.9.

Corollary 8.9 If $\mathcal{B}$ is twisting then any $F_k$-stationary measure gives zero weight to every proper linear arrangement in $\mathrm{Gr}(k, d)$.


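Proof Let $\eta$ be an $F_k$-stationary measure and suppose $\eta(\mathcal{S}) > 0$ for some linear section $\mathcal{S} \subset \mathrm{Gr}(k, d)$. By definition, $\mathcal{S} = \mathrm{Gr}(k, d) \cap \mathbb{P}S$ for some geometric subspace $S \subset \Lambda^k E$. Let $d_0 \ge 1$ be the smallest dimension of a geometric subspace $S$ such that $\eta(\mathrm{Gr}(k, d) \cap \mathbb{P}S) > 0$ and let $c > 0$ be the supremum of $\eta(\mathrm{Gr}(k, d) \cap \mathbb{P}S)$ over all geometric subspaces $S$ of dimension $d_0$. Let $\mathcal{V}$ be the family of all $\mathcal{S} = \mathrm{Gr}(k, d) \cap \mathbb{P}S$ such that $S$ is a geometric subspace with $\dim S = d_0$ and $\eta(\mathcal{S}) = c$.

We claim that $\mathcal{V}$ is non-empty. Indeed, consider any sequence $\mathcal{S}_n = \mathrm{Gr}(k, d) \cap \mathbb{P}S_n$ where $S_n$ is a geometric subspace with $\dim S_n = d_0$ and $\eta(\mathcal{S}_n) \to c$. Up to restricting to a subsequence, we may suppose (Exercise 8.8) that $(S_n)_n$ converges to some geometric subspace $S$ of dimension $d_0$. Since $\mathcal{S} = \mathrm{Gr}(k, d) \cap \mathbb{P}S$ is a closed subset of the Grassmannian, it follows that $\eta(\mathcal{S}) = c$. This proves our claim.

We are also going to show that $\mathcal{V}$ is finite. Indeed, if $\mathcal{S}_i = \mathrm{Gr}(k, d) \cap \mathbb{P}S_i$, $i = 1, 2$ are two distinct elements of $\mathcal{V}$ then $S_1 \ne S_2$, hence $\dim(S_1 \cap S_2) < d_0$ and so $\mathcal{S}_1 \cap \mathcal{S}_2 = \mathrm{Gr}(k, d) \cap \mathbb{P}(S_1 \cap S_2)$ has zero $\eta$-measure. Therefore, $\#\mathcal{V} \le 1/c < \infty$.

Now, consider any $\mathcal{S} \in \mathcal{V}$. Since $\eta$ is $F_k$-stationary,
$$c = \eta(\mathcal{S}) = \int \eta(B^{-1}(\mathcal{S}))\, d\nu(B)$$
and so, since $\eta(B^{-1}(\mathcal{S})) \le c$ by the definition of $c$, we get $\eta(B^{-1}(\mathcal{S})) = c$ for $\nu$-almost every $B$. It follows that $\eta(B^{-1}(\mathcal{S})) = c$ for every $B \in \operatorname{supp}\nu$. This proves that $B^{-1}(\mathcal{S}) \in \mathcal{V}$ for all $B \in \mathcal{B}$ and $\mathcal{S} \in \mathcal{V}$. Consequently, $\mathcal{L}_0 = \bigcup_{\mathcal{S} \in \mathcal{V}} \mathcal{S}$ is a proper linear arrangement in $\mathrm{Gr}(k, d)$, and $\mathcal{L}_0$ is invariant under every $B \in \mathcal{B}$. By Lemma 8.7, this contradicts the assumption that $\mathcal{B}$ is twisting. This contradiction proves that $\eta(\mathcal{S}) = 0$ for every linear section $\mathcal{S}$, and so $\eta(\mathcal{L}) = 0$ for every linear arrangement $\mathcal{L} \subset \mathrm{Gr}(k, d)$.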

8.3.3 Control of eccentricity

Let $1 \le k \le d-1$ be fixed. Recall that $E^u_k(B) \in \mathrm{Gr}(k, d)$ is the subspace of dimension $k$ most expanded by $B$ and $E^s_k(B) \in \mathrm{Gr}(d-k, d)$ is the subspace of dimension $d-k$ most contracted by $B$. They are defined when $\sigma_k(B) > \sigma_{k+1}(B)$.

Lemma 8.10 Let $(B_n)_n$ be a sequence in $\mathrm{GL}(d)$ with $\sigma_k(B_n)/\sigma_{k+1}(B_n) \to \infty$ and $B_n(E^u_k(B_n)) \to E^u_k$ and $E^s_k(B_n) \to E^s_k$ as $n \to \infty$. Let $K$ be any compact subset of $\mathrm{Gr}(k, d)$ which does not intersect the hyperplane section dual to $E^s_k$. Then $B_n(K) \to E^u_k$ as $n \to \infty$.

Proof Suppose that the conclusion is not true. Then, restricting to a subsequence if necessary, there exist $F_n \in K$ such that $B_n(F_n)$ converges to some $F' \ne E^u_k$. Take $h_n \in F_n$ with $\|h_n\| = 1$ such that
$$\frac{B_n(h_n)}{\|B_n(h_n)\|} \to h \notin E^u_k.$$
By Exercise 8.3, there exists $c_1 > 0$ depending only on the distance from $K$ to the hyperplane section dual to $E^s_k$ such that
$$\|B_n(h_n)\| \ge c_1 \sigma_k(B_n) \quad \text{for every } n. \tag{8.6}$$
Since (Exercise 8.2) $B_n(E^u_k(B_n)) = E^s_{d-k}(B_n^{-1})$, there exists $c_2 > 0$ depending only on the distance from $h$ to $E^u_k$ such that
$$1 = \|h_n\| = \|B_n^{-1}(B_n(h_n))\| \ge c_2\, \sigma_{d-k}(B_n^{-1})\, \|B_n(h_n)\|$$
for all large $n$. Consequently, since $\sigma_{d-k}(B_n^{-1}) = \sigma_{k+1}(B_n)^{-1}$ (Exercise 8.2),
$$1 \ge c_1 c_2 \frac{\sigma_k(B_n)}{\sigma_{k+1}(B_n)}$$
for all $n$ large, which contradicts the hypothesis.

Lemma 8.11 Let $(B_n)_n$ be a sequence in $\mathrm{GL}(d)$. Suppose that there is $F \in \mathrm{Gr}(k, d)$ such that the set $\{F' \in \mathrm{Gr}(k, d) : B_n(F') \to F\}$ is not contained in a hyperplane section. Then $\sigma_k(B_n)/\sigma_{k+1}(B_n) \to \infty$ and $F = \lim_n B_n(E^u_k(B_n))$.

Proof Suppose that $\sigma_k(B_n)/\sigma_{k+1}(B_n)$ is bounded along some subsequence. Passing to a subsequence, and replacing $B_n$ by $Y_n B_n Z_n$, with convenient $Y_n, Z_n \in \mathrm{GL}(d)$ such that $\|Y_n^{\pm 1}\|$ and $\|Z_n^{\pm 1}\|$ are bounded, we may assume that there exist $l < k < m$ such that
• $\sigma_j(B_n)/\sigma_k(B_n) \to \infty$ for $j \in \{1, \dots, l\}$,
• $\sigma_j(B_n) = \sigma_k(B_n)$ for $j \in \{l+1, \dots, m\}$,
• $\sigma_j(B_n)/\sigma_k(B_n) \to 0$ for $j \in \{m+1, \dots, d\}$,
and there exists an orthonormal basis $\{e_1, \dots, e_d\}$, independent of $n$, such that
• $B_n(e_i) = \sigma_i(B_n) e_i$ for every $i$ and $n$.
Let $E^u$, $E^c$, $E^s$ be the spans of $\{e_1, \dots, e_l\}$, $\{e_{l+1}, \dots, e_m\}$, $\{e_{m+1}, \dots, e_d\}$, respectively. We claim that the set of $F' \in \mathrm{Gr}(k, d)$ such that $B_n(F') \to F$ is contained in the hyperplane section dual to some $G \in \mathrm{Gr}(d-k, d)$. We split the argument into three cases.

If $F$ is not contained in $E^u \oplus E^c$, take $G$ to be any subspace of dimension $d-k$ containing $E^s$. Observe that $F' \cap E^s \ne \{0\}$, necessarily, and so $F' \cap G \ne \{0\}$. That proves the claim in this case.


From now on, suppose $F \subset E^u \oplus E^c$. If $F$ does not contain $E^u$, take $G$ to be any subspace of dimension $d-k$ contained in $E^c \oplus E^s$. Observe that $F'$ cannot be transverse to $E^c \oplus E^s$, and so $\dim F' \cap (E^c \oplus E^s) > k - l$. Then $\dim F' \cap (E^c \oplus E^s) + \dim G > k - l + d - k = \dim E^c \oplus E^s$ and that implies that $F' \cap G \ne \{0\}$.

Finally, suppose $E^u \subset F \subset E^c \oplus E^u$. Then $F = E^u \oplus F^c$ for some subspace $F^c \subset E^c$ with dimension $k - l$. Take $G$ to be any subspace of dimension $d-k$ that contains $E^s$ and intersects $F^c$. Observe that $F'$ must be transverse to $E^c \oplus E^s$ and $F' \cap (E^c \oplus E^s)$ must be the graph of a linear map $F^c \to E^s$. Hence, $F' \cap G \ne \{0\}$. The proof of our claim is complete.

Thus, we have shown that if $\{F' \in \mathrm{Gr}(k, d) : B_n(F') \to F\}$ is not contained in a hyperplane section, for some $F \in \mathrm{Gr}(k, d)$, then $\sigma_k(B_n)/\sigma_{k+1}(B_n) \to \infty$. Moreover, passing to a subsequence if necessary, we may assume that $E^s_k(B_n)$ converges to some $E^s_k$. Since $\{F' \in \mathrm{Gr}(k, d) : B_n(F') \to F\}$ is not contained in the hyperplane section dual to $E^s_k$, Lemma 8.10 gives that $F = \lim B_n(E^u_k(B_n))$. That completes the argument.

The statement and proof of the next result are akin to those of Lemma 6.14.

Lemma 8.12 Let $(B_n)_n$ be a sequence in $\mathrm{GL}(d)$ and $\beta$ be any probability measure on $\mathrm{Gr}(k, d)$.
(1) Assume that $\beta$ gives zero weight to every hyperplane section and assume $\sigma_k(B_n)/\sigma_{k+1}(B_n) \to \infty$ and $B_n(E^u_k(B_n)) \to E^u_k$. Then $(B_n)_*\beta$ converges in the weak$^*$ topology to a Dirac mass on $E^u_k$.
(2) Assume that $\beta$ is not supported in a hyperplane section and assume $(B_n)_*\beta$ converges in the weak$^*$ topology to a Dirac mass on some $E^u_k \in \mathrm{Gr}(k, d)$. Then $\sigma_k(B_n)/\sigma_{k+1}(B_n) \to \infty$ and $B_n(E^u_k(B_n)) \to E^u_k$.

Proof Let us first prove (1). Up to restricting to some subsequence of an arbitrary subsequence, we may assume that $E^s_k(B_n)$ also converges to some $E^s_k$. Take a compact set $K$ disjoint from the hyperplane section dual to $E^s_k$ and such that $\beta(K) > 1 - \varepsilon$. Then $(B_n)_*\beta(B_n(K)) > 1 - \varepsilon$ and, by Lemma 8.10, $B_n(K)$ is close to $E^u_k$ for all large $n$. This shows that $(B_n)_*\beta$ converges to the Dirac measure at $E^u_k$.

Now we prove (2). For each $m \ge 1$, let $V_m$ denote the neighborhood of radius $1/m$ around $E^u_k$. The hypothesis implies that for each $m \ge 1$ there exists $n_m \ge 1$ such that
$$\beta\big(B_n^{-1}(V_m)\big) > 1 - 2^{-m} \quad \text{for every } n \ge n_m.$$
Define $X = \bigcup_{l=1}^{\infty} \bigcap_{m=l}^{\infty} B_{n_m}^{-1}(V_m)$. Then $\beta(X) = 1$ and so, by the hypothesis on $\beta$, the set $X$ is not contained in any hyperplane section. Moreover, for every $F' \in X$ one has $B_{n_m}(F') \to E^u_k$ as $m \to \infty$. Using Lemma 8.11, we conclude that $\sigma_k(B_n)/\sigma_{k+1}(B_n) \to \infty$ and $B_n(E^u_k(B_n)) \to E^u_k$ restricted to the subsequence $(n_m)_m$. The same argument can be applied starting from any subsequence of $(B_n)_n$. Thus, we have shown that $\sigma_k(B_n)/\sigma_{k+1}(B_n) \to \infty$ and $B_n(E^u_k(B_n)) \to E^u_k$ restricted to some subsequence of every subsequence. This implies the claim, and so the proof of the lemma is complete.
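Part (1) of Lemma 8.12 is easy to observe numerically in the simplest case $d = 2$, $k = 1$. The Python sketch below takes the hypothetical sequence $B_n = \operatorname{diag}(2^n, 2^{-n})$, for which $E^u_1$ is the horizontal axis and $E^s_1$ the vertical axis, and approximates a measure $\beta$ without atoms by an equally spaced sample of lines; the sample, the threshold and the function names are our assumptions, chosen only for illustration.

```python
import math

def push_angle(theta, n):
    """Angle of the image of the line at angle theta under B_n = diag(2**n, 2**-n)."""
    x = math.cos(theta) * 2.0 ** n
    y = math.sin(theta) * 2.0 ** -n
    return math.atan2(y, x)

def mass_near_unstable(n, eps, samples=10000):
    """Fraction of an equally spaced sample of lines (angles in (-pi/2, pi/2))
    whose image under B_n lies within angle eps of the horizontal axis."""
    count = 0
    for j in range(samples):
        theta = -math.pi / 2 + (j + 0.5) * math.pi / samples
        if abs(push_angle(theta, n)) < eps:
            count += 1
    return count / samples

if __name__ == "__main__":
    for n in (0, 2, 5, 10):
        print(n, mass_near_unstable(n, 0.1))
```

As $n$ grows, the fraction of the sample pushed into any fixed neighborhood of the horizontal axis approaches 1, which is the discrete counterpart of $(B_n)_*\beta$ converging to a Dirac mass.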

8.3.4 Convergence of conditional probabilities

We call homogeneous measure any probability measure $\beta$ in $\mathrm{Gr}(k, d)$ such that $\beta(\mathcal{L}) = 0$ for every proper linear arrangement $\mathcal{L}$. Corollary 8.9 states that every $F_k$-stationary measure is homogeneous, if the associated monoid $\mathcal{B}$ is twisting. Clearly, the support of a homogeneous measure can never be contained in a hyperplane section.

Lemma 8.13 There exist $\theta > 0$, $m \ge 1$, and matrices $P$, $Q$, $Q_i$, $1 \le i \le m$ in $\mathcal{B}$, with the following properties:
(1) $P$ admits an invariant decomposition $\mathbb{R}^d = U_P \oplus S_P$ where $\dim U_P = k$ and all eigenvalues of $P$ restricted to $U_P$ are larger, in norm, than all eigenvalues of $P$ restricted to $S_P$.
(2) $Q(U_P) \cap S_P = \{0\}$.
(3) For every $G \in \mathrm{Gr}(d-k, d)$ there exists $i \in \{1, \dots, m\}$ such that the angle between $Q_i(U_P)$ and $G$ is at least $2\theta$.

Proof Since $\mathcal{B}$ is pinching, we may find $B_n \in \mathcal{B}$ such that $\sigma_k(B_n)/\sigma_{k+1}(B_n)$ goes to infinity. Up to restricting to a subsequence, we may suppose that $B_n(E^u_k(B_n))$ and $E^s_k(B_n)$ converge to subspaces $E^u_k$ and $E^s_k$, respectively. Since $\mathcal{B}$ is twisting, we may find $B_0 \in \mathcal{B}$ such that $B_0(E^u_k) \cap E^s_k = \{0\}$. We are going to take $P = B_0 B_n$ for some large $n$. Fix $c > b > a > 0$, large enough so that $P(E^u_k(B_n))$ is contained in the cone $C(E^u_k(B_n), a)$ for every large $n$. Since $\sigma_k(B_n)$ is much larger than $\sigma_{k+1}(B_n)$, the map $P = B_0 B_n$ sends $C(E^u_k(B_n), c)$ to a thin cone around $P(E^u_k(B_n))$. Hence, assuming $n$ is large enough,
$$P\big(C(E^u_k(B_n), c)\big) \subset C(E^u_k(B_n), b).$$
Then (Exercise 4.13), $P$ admits a dominated decomposition as in claim (1).

Since $\mathcal{B}$ is twisting, there exists $Q \in \mathcal{B}$ as in claim (2). For the same reason, given any $G \in \mathrm{Gr}(d-k, d)$ there exists $Q_G \in \mathcal{B}$ such that $Q_G(U_P) \cap G = \{0\}$. Moreover, one may take $Q_{G'} = Q_G$ for every $G' \in \mathrm{Gr}(d-k, d)$ in a neighborhood of $G$. So, by compactness of $\mathrm{Gr}(d-k, d)$, one can choose matrices $Q_1, \dots, Q_m$ such that
$$\max_{1 \le i \le m} \big|\sin \angle\big(Q_i(U_P), G\big)\big| > 0 \quad \text{for every } G \in \mathrm{Gr}(d-k, d).$$
Using compactness once more, the expression on the left is bounded below by some $2\theta > 0$.

By definition, $P = P_1 \cdots P_{N(P)}$ for some $P_j \in \operatorname{supp}\nu$ for $1 \le j \le N(P)$ and $N(P) \ge 1$. Fix any such factorization of $P$ and let $r > 0$. Define $V_P$ to be the set of all products
$$P' = P'_1 \cdots P'_{N(P)} \quad \text{with } P'_j \text{ in the } r\text{-neighborhood of } P_j \text{ for } 1 \le j \le N(P).$$
Define $N(Q)$, $V_Q$ and $N(Q_i)$, $V_{Q_i}$, $1 \le i \le m$ analogously, starting from factorizations of $Q$ and $Q_i$, $1 \le i \le m$ instead. Take $r > 0$ to be small enough that the $V_{Q_i}$, $1 \le i \le m$ are pairwise disjoint and the conclusions of Lemma 8.13 remain true, for slightly smaller $\theta > 0$, when $P$, $Q$, and $Q_i$, $1 \le i \le m$ are replaced by any $P' \in V_P$, $Q' \in V_Q$, and $Q'_i \in V_{Q_i}$, $1 \le i \le m$. For each $l \ge 1$, let $V_P^l$ denote the set of all products
$$\tilde P = P^{(1)} \cdots P^{(l)}, \quad \text{with } P^{(j)} \in V_P \text{ for all } 1 \le j \le l.$$

Assuming $r > 0$ is small, every $\tilde P \in \bigcup_l V_P^l$ admits a dominated decomposition $\mathbb{R}^d = U_{\tilde P} \oplus S_{\tilde P}$ as in part (1) of Lemma 8.13 (Exercise 4.13). Moreover, $U_{\tilde P}$ is uniformly close to $U_P$, so that parts (2) and (3) remain valid as well.

Let $C_P$ denote the subset of $x \in M$ such that $A(f^{j-1}(x))$ is in the $r$-neighborhood of $P_j$ for every $j = 1, \dots, N(P)$. Note that $C_P$ is a cylinder of $M$, since $A$ is locally constant, and $C_P$ has positive $\mu$-measure, because every $P_j \in \operatorname{supp}\nu$. Analogously, starting from the chosen factorizations of $Q$ and $Q_i$, $1 \le i \le m$, we define cylinders $C_Q$ and $C_{Q_i}$, $1 \le i \le m$ with positive $\mu$-measure. For each $l \ge 1$, let $C_P^l$ denote the subset of $x \in M$ such that $f^{jN(P)}(x) \in C_P$ for $j = 0, \dots, l-1$. Let $n_{l,i} = lN(P) + N(Q) + lN(P) + N(Q_i)$ and define $Y_{l,i}$ to be the subset of points $x \in M$ such that
$$\begin{aligned} f^{-n_{l,i}}(x) \in C_P^l \quad &\text{and} \quad f^{-n_{l,i} + lN(P)}(x) \in C_Q \quad \text{and} \\ f^{-n_{l,i} + lN(P) + N(Q)}(x) \in C_P^l \quad &\text{and} \quad f^{-n_{l,i} + lN(P) + N(Q) + lN(P)}(x) \in C_{Q_i}. \end{aligned} \tag{8.7}$$

Then Yl,i is an nl,i -cylinder of M with μ (Yl,i ) > 0, for every l and i. For each l fixed, the cylinders Yl,i , 1 ≤ i ≤ m are pairwise disjoint. Given ε > 0, we say a probability measure ρ in Gr(k, d) is ε -concentrated if there exists some ε -neighborhood Bε ⊂ Gr(k, d) such that ρ (Bε ) > 1 − ε . Lemma 8.14 Given ε > 0 and any homogeneous measure β in Gr(k, d), there is l0 ≥ 1 and, given any B0 ∈ B, there is i = i(B0 ) ∈ {1, . . . , m} such that B∗ β

is $\varepsilon$-concentrated, for every $B = B_0 Q'_i P'' Q' P'$ with $Q'_i \in V_{Q_i}$, $P'' \in V_P^l$, $Q' \in V_Q$, $P' \in V_P^l$, and $l > l_0$.

Proof Since $\beta$ is homogeneous, it follows from (1) in Lemma 8.13 that, if $l$ is large, most of the mass of $P'_*\beta$ is concentrated near the sum $U_{P'} \in \mathrm{Gr}(k,d)$ of the eigenspaces of $P'$ associated with the $k$ eigenvalues with largest norms. Then most of the mass of $(Q'P')_*\beta$ is concentrated near $Q'(U_{P'})$. By (2) in Lemma 8.13, the latter is transverse to $S_{P''}$. So, if $l$ is large then most of the mass of $(P''Q'P')_*\beta$ is concentrated near the sum $U_{P''} \in \mathrm{Gr}(k,d)$ of the eigenspaces associated with the $k$ eigenvalues of $P''$ with largest norms. Choose $\theta > 0$ as in Lemma 8.13 and let $G_{B_0} \in \mathrm{Gr}(d-k,d)$ be as in Exercise 8.6. The action of all $B_0 \in \mathcal{B}$ is equicontinuous restricted to the set of $k$-dimensional subspaces whose angle to $G_{B_0}$ is at least $\theta$. Let $i = i(B_0)$ be as in (3) of Lemma 8.13, for $G = G_{B_0}$. Then most of the mass of $(Q'_i P'' Q' P')_*\beta$ is concentrated near $Q'_i(U_{P''})$, and so most of the mass of $B_*\beta$ is concentrated near $B_0 Q'_i(U_{P''})$, if $l$ is large enough. Moreover, the equicontinuity property allows us to take the condition on $l$ uniform on $B_0$. This proves the lemma.

In what follows, take $l_0 = l_0(\varepsilon, \beta) \ge 1$ to be as in Lemma 8.14.

Lemma 8.15 There is $c_0 > 0$ and measurable sets $Z_l \subset M$ with $\mu(Z_l) \ge c_0$ for every $l \ge 1$ such that, for every $\varepsilon > 0$, every homogeneous measure $\beta$ in $\mathrm{Gr}(k,d)$, and every $x \in Z_l$ with $l > l_0$, there exists $n > l$ such that $A^n(f^{-n}(x))_*\beta$ is $\varepsilon$-concentrated.

Proof For each $l \ge 1$ and $1 \le i \le m$, let $n_{l,i}$ and $Y_{l,i}$ be as defined in (8.7). Define $n_l = \max_i n_{l,i}$. By ergodicity of $(f^{-n_l}, \mu)$, for $\mu$-almost every $x \in M$ there exist infinitely many values of $k \ge 1$ such that $f^{-kn_l}(x) \in Y_{l,1} \cup \cdots \cup Y_{l,m}$. Choose $k = k(x)$ minimum with this property and then let $n = kn_l + n_{l,i}$ and
\[ B_0 = A^{kn_l}(f^{-kn_l}(x)) \quad\text{and}\quad B = A^n(f^{-n}(x)). \]

The definitions give that $B = B_0 Q'_i P'' Q' P'$ with $P', P'' \in V_P^l$, $Q' \in V_Q$, and $Q'_i \in V_{Q_i}$ for some $i \in \{1, \dots, m\}$. Moreover (Exercise 8.7),
\[ \mu(Z_l) \ge \min_{1 \le j \le m} \frac{\mu(Y_{l,j})}{\mu(Y_{l,1}) + \cdots + \mu(Y_{l,m})} = \min_{1 \le j \le m} \frac{\mu(C_{Q_j})}{\mu(C_{Q_1}) + \cdots + \mu(C_{Q_m})} \]

where Zl is the set of points x ∈ M for which this i = i(B0 ). Let c0 > 0 denote the expression on the right-hand side. To conclude, apply Lemma 8.14. Proof of Proposition 8.2 Let η be any stationary measure for Fk : stationary measures do exist, by Proposition 5.6. Let m be the lift of η to M × Gr(k, d).

By Corollaries 5.23 and 5.24, $m$ is an $F_k$-invariant $u$-state and its disintegration is given by
\[ m_x = \lim_n A^n(f^{-n}(x))_*\eta \quad\text{for $\mu$-almost every $x$.} \tag{8.8} \]

Let $Z_l$, $l \ge 1$ be as in Lemma 8.15 and $Z$ be the set of points $x$ that belong to $Z_l$ for infinitely many values of $l$. Then $\mu(Z) \ge c_0$ and $A^n(f^{-n}(x))_*\eta$ accumulates at a Dirac mass for all $x \in Z$. So, $m_x$ is a Dirac mass for every $x \in Z$. By the ergodicity of $(f, \mu)$, it follows that $m_x$ is a Dirac mass for $\mu$-almost every $x$. Denote the support of this Dirac mass by $\hat\xi^u(x)$. Since $m$ is a $u$-state, we can factorize $\hat\xi^u = \xi^u \circ \pi^-$ for some measurable $\xi^u : M^- \to \mathrm{Gr}(k,d)$. The fact that $m$ is $F_k$-invariant means that $\hat\xi^u(f(x)) = A(x)(\hat\xi^u(x))$ for $\mu$-almost every $x$. That proves claim (1).

By Corollary 8.9, the support of $\eta$ cannot be contained in any hyperplane section. It follows from part (2) of Lemma 8.12 that

\[ \frac{\sigma_k(A^n(f^{-n}(x)))}{\sigma_{k+1}(A^n(f^{-n}(x)))} \to \infty \quad\text{and}\quad A^n(f^{-n}(x))E^u_k(A^n(f^{-n}(x))) \to \hat\xi^u(x), \]
for $\mu$-almost every $x$. In view of Exercise 8.2, this is the same as
\[ \frac{\sigma_{d-k}(A^{-n}(x))}{\sigma_{d-k+1}(A^{-n}(x))} \to \infty \quad\text{and}\quad E^s_{d-k}(A^{-n}(x)) \to \hat\xi^u(x), \]
which is precisely claim (2) in the proposition. Moreover, claim (3) follows directly from the observation that $m$ projects to $\eta$ in $\mathrm{Gr}(k,d)$ and the support of $\eta$ is not contained in any hyperplane section. This finishes the proof of Proposition 8.2.

To deduce Proposition 8.3, consider the inverse $F^{-1} : M \times \mathbb{R}^d \to M \times \mathbb{R}^d$, which is the linear cocycle defined over $f^{-1} : M \to M$ by
\[ A^{-1} : M \to GL(d), \qquad A^{-1}(x) = A(f^{-1}(x))^{-1}. \]

The associated monoid is $\mathcal{B}^{-1} = \{B^{-1} : B \in \mathcal{B}\}$ and, by Corollary 8.8, this monoid is pinching and twisting. Strictly speaking, $F^{-1}$ is not locally constant, because $A^{-1}(x)$ depends on the first (not the zeroth) coordinate of $x \in M$. However, the cocycle $F' : M \times \mathbb{R}^d \to M \times \mathbb{R}^d$ defined over $f^{-1} : M \to M$ by
\[ A' : M \to GL(d), \qquad A'(x) = A(x)^{-1} \]
is locally constant, and it is conjugate to $F^{-1}$ by the map $(x, v) \mapsto (f^{-1}(x), v)$ on $M \times \mathbb{R}^d$. In particular, the two cocycles have the same associated monoid $\mathcal{B}^{-1}$. Applying Proposition 8.2 to the cocycle $F'$, with $k$ replaced by $d - k$, and then translating the conclusions to $F^{-1}$ through the conjugacy, we obtain the claims in Proposition 8.3. Now the proof of Theorem 8.1 is complete.
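The concentration mechanism behind Lemma 8.14 can be visualized in the simplest case $d = 2$, $k = 1$. The following Python sketch is only an illustration under simplifying assumptions that are not in the text: $P$ is taken to be a diagonal matrix whose eigenvalue ratio `ratio` is less than 1, the homogeneous measure is replaced by a uniform grid of lines, and the names `pushed_angle` and `mass_near_unstable` are ours.

```python
import math

def pushed_angle(t, ratio, l):
    """Angle of the image, under P^l, of the line with angle t in (-pi/2, pi/2).

    P = diag(lam_u, lam_s) is an assumed toy matrix with ratio = lam_s / lam_u < 1,
    so the slope tan(t) of a line is multiplied by `ratio` at each iterate.
    """
    return math.atan(math.tan(t) * ratio ** l)

def mass_near_unstable(eps, l, ratio=0.5, n_lines=10000):
    """Fraction of a uniform grid of lines that P^l sends eps-close to U_P (the x-axis)."""
    hits = 0
    for j in range(n_lines):
        # Midpoint grid on (-pi/2, pi/2); it never contains the stable direction.
        t = -math.pi / 2 + (j + 0.5) * math.pi / n_lines
        if abs(pushed_angle(t, ratio, l)) < eps:
            hits += 1
    return hits / n_lines

if __name__ == "__main__":
    for l in (0, 5, 10):
        print(l, mass_near_unstable(0.05, l))
```

In this toy model, with `eps = 0.05` the mass inside the neighborhood grows from about 3% at `l = 0` to more than 95% at `l = 10`, which is the sense in which the push-forward becomes $\varepsilon$-concentrated when $l$ is large.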

8.4 Notes

Simplicity criteria were first proven in the special case of products of random matrices. Guivarc'h and Raugi [60] obtained sufficient conditions (strong irreducibility together with the contraction property) for the extremal Lyapunov exponents $\lambda_\pm$ to be simple. Using exterior powers as in Section 4.3.2, this yields sufficient conditions for simplicity of all Lyapunov exponents. A more direct criterion for simplicity of the whole Lyapunov spectrum of a product of random matrices was given by Gol'dsheid and Margulis [58], based on considering the action of the cocycle on the Grassmannian manifolds. The contraction property in [60] is replaced by a more explicit condition on the Zariski closure of the group generated by the support of the probability measure $p$.

Bonatti and Viana [38] extended the criterion in [60] to Hölder-continuous linear cocycles with invariant holonomies and Avila and Viana [14] obtained a similar extension for [58]. In either case, the base dynamics is assumed to be a hyperbolic homeomorphism endowed with an invariant probability measure with local product structure. A main application was the proof of the Zorich–Kontsevich conjecture on the Lyapunov spectrum of the Teichmüller flow on strata of Abelian differentials, by Avila and Viana [15]. See also the proofs, by Cambrainha [42], that typical symplectic cocycles have only non-zero Lyapunov exponents, and by Herrera [63], that certain multidimensional continued fraction algorithms have simple Lyapunov spectra.

The presentation in this chapter was adapted from Avila and Viana [14, 15].

8.5 Exercises

Exercise 8.1 Let $m \ge 2$ and $S$ be the set of all $(A_1, \dots, A_m) \in GL(d)^m$ such that the monoid generated by the set $\{A_1, \dots, A_m\}$ is pinching and twisting. Prove that $S$ is open and has full Lebesgue measure in $GL(d)^m$.

Exercise 8.2 Prove that, for every $B \in GL(d)$ and $1 \le k \le d$,
\[ \sigma_k(B^{-1}) = \sigma_{d-k+1}(B)^{-1} \quad\text{and}\quad \sigma_k(B^t) = \sigma_k(B). \]
Moreover, assuming that $\sigma_k(B) > \sigma_{k+1}(B)$, show:
(1) $E^s_k(B)$ and $E^u_k(B)$ are orthogonal complements to each other;
(2) $B(E^s_k(B))$ and $B(E^u_k(B))$ are orthogonal complements to each other;
(3) $E^s_k(B^{-1}) = B(E^u_{d-k}(B))$ and $E^u_k(B^{-1}) = B(E^s_{d-k}(B))$;

(4) $E^s_k(B^t) = B(E^s_k(B))$ and $E^u_k(B^t) = B(E^u_k(B))$.

Exercise 8.3 Let $B \in GL(d)$ and $1 \le k \le d - 1$ be such that $\sigma_k(B) > \sigma_{k+1}(B)$. Prove that:
(1) $\|B(h)\| \ge \sigma_k(B)\|h^+\|$ for every $h \in \mathbb{R}^d$, where $h^+$ represents the component of $h$ along $E^u_k(B)$;
(2) if $F \in \mathrm{Gr}(k,d)$ is transverse to $E^s_k(B)$ then $\|B(h)\| \ge c\,\sigma_k(B)\|h\|$ for all $h \in F$, where $c > 0$ does not depend on $B$, but only on the distance between $F$ and the hyperplane section dual to $E^s_k(B)$.

Exercise 8.4 Let $\mathcal{B}$ be a monoid. Prove that:
(1) if there is $B_1 \in \mathcal{B}$ whose eigenvalues are all distinct in norm, then $\mathcal{B}$ is pinching;
(2) if there exists $B_1$ as above and there exists $B_2 \in \mathcal{B}$ such that $B_2(V) \cap W = \{0\}$ for any pair of $B_1$-invariant subspaces with complementary dimensions, then $\mathcal{B}$ is twisting.

Exercise 8.5 Show that the adjoint monoid $\mathcal{B}^t = \{B^t : B \in \mathcal{B}\}$ is pinching or twisting if and only if $\mathcal{B}$ is pinching or twisting, respectively.

Exercise 8.6 For each $B \in GL(d)$ let $G_B$ be some subspace of dimension $d - k$ spanned by eigenvectors associated with the $d - k$ smallest eigenvalues of $B^t B$. If $\sigma_k(B) > \sigma_{k+1}(B)$ then the only choice is $G_B = E^s_k(B)$. Let $\theta > 0$ be fixed. Show that for any $\varepsilon > 0$ there exists $\delta > 0$ such that
\[ |\sin\angle(F_1, F_2)| < \delta \quad\Rightarrow\quad |\sin\angle(B(F_1), B(F_2))| < \varepsilon \]
for any $B \in GL(d)$ and $F_1, F_2 \in \mathrm{Gr}(k,d)$ with $|\sin\angle(F_i, G_B)| > \theta$ for $i = 1, 2$.

Exercise 8.7 Let $f : (M, \mu) \to (M, \mu)$ be a Bernoulli shift, $M = X^{\mathbb{Z}}$ and $\mu = p^{\mathbb{Z}}$. Let $m, n \ge 1$, and $Y_1, \dots, Y_m$ be positive measure subsets of $M$ defined by imposing conditions on coordinates $x_0, \dots, x_{n-1}$ only. Define $g(x) = f^{kn}(x)$ with $k = k(x) \ge 1$ minimum such that $f^{kn}(x) \in Y_1 \cup \cdots \cup Y_m$. Prove that, given any sequence $(i_k)_k$ with values in $\{1, \dots, m\}$,
\[ \mu(Z) \ge \min_{1 \le j \le m} \frac{\mu(Y_j)}{\mu(Y_1) + \cdots + \mu(Y_m)} \]
where $Z$ is the set of all points $x \in M$ for which $g(x) \in Y_{i_{k(x)}}$.

Exercise 8.8 Prove that:
(1) the set of decomposable $k$-vectors is a closed subset of $\Lambda^k(\mathbb{R}^d)$ and its intersection with the unit ball is compact;

(2) the set of geometric hyperplanes is a compact subset of the space of hyperplanes of $\Lambda^k(\mathbb{R}^d)$;
(3) for every fixed $l$, the set of geometric subspaces is a compact subset of the space of $l$-dimensional subspaces of $\Lambda^k(\mathbb{R}^d)$.

Exercise 8.9 Prove that if $\mathcal{B}$ is pinching and twisting then each associated Grassmannian cocycle $F_k : M \times \mathrm{Gr}(k,d) \to M \times \mathrm{Gr}(k,d)$, $1 \le k \le d - 1$ has a unique invariant $u$-state; namely, the probability measure $m^u$ defined on $M \times \mathrm{Gr}(k,d)$ by $m^u(E) = \mu\big(\{x \in M : (x, \hat\xi^u(x)) \in E\}\big)$. Analogously, each $F_k$ has a unique invariant $s$-state.
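Some of these exercises are easy to explore numerically in dimension $d = 2$ before attempting a proof. The sketch below is an illustration only, not a solution: it uses the closed formula for the singular values of a $2 \times 2$ matrix to check the identities of Exercise 8.2 on one sample matrix, and to watch the ratio $\sigma_1/\sigma_2$ of the powers of a matrix with eigenvalues of distinct norms grow, as in part (1) of Exercise 8.4. The helper names and the sample matrices are our own choices.

```python
import math

def singular_values(B):
    """Singular values s1 >= s2 of a 2x2 matrix B = ((a, b), (c, d))."""
    (a, b), (c, d) = B
    frob = a * a + b * b + c * c + d * d      # equals s1^2 + s2^2
    det = abs(a * d - b * c)                  # equals s1 * s2
    disc = math.sqrt(max(frob * frob - 4 * det * det, 0.0))
    s1 = math.sqrt((frob + disc) / 2)
    return s1, det / s1

def inverse(B):
    (a, b), (c, d) = B
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))

def transpose(B):
    (a, b), (c, d) = B
    return ((a, c), (b, d))

def mat_mul(X, Y):
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def singular_ratio_of_power(B, n):
    """sigma_1(B^n) / sigma_2(B^n) for n >= 1."""
    P = B
    for _ in range(n - 1):
        P = mat_mul(B, P)
    s1, s2 = singular_values(P)
    return s1 / s2

if __name__ == "__main__":
    B = ((3.0, 1.0), (0.5, 2.0))
    print(singular_values(B), singular_values(inverse(B)))
    B1 = ((2.0, 1.0), (0.0, 0.5))             # eigenvalues 2 and 1/2
    print([singular_ratio_of_power(B1, n) for n in (1, 2, 5, 10)])
```

For the sample matrix `B1`, the ratio is at least $4^n$, since $\sigma_1(B_1^n) \ge 2^n$ and $\sigma_1 \sigma_2 = |\det B_1^n| = 1$.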

9 Generic cocycles

In Chapters 7 and 8 we came to the conclusion that, for a significant class of linear cocycles, the Lyapunov exponents are most of the time distinct. The results were stated for locally constant cocycles over Bernoulli shifts but, as observed at the end of both chapters, the conclusions extend much beyond: roughly speaking, they remain valid for Hölder-continuous cocycles with invariant holonomies, assuming that the base dynamics is sufficiently "chaotic".

Rather in contrast, in the early 1980s Mañé [88] announced that generic (that is, a residual subset of all) area-preserving $C^1$ diffeomorphisms on any surface have $\lambda_\pm = 0$ at almost every point, or else they are Anosov diffeomorphisms. Actually, as observed in Example 2.10, the second alternative is possible only if the surface is the torus $\mathbb{T}^2$. A complete proof of Mañé's claim was first given by Bochi [30], based on an unpublished draft by Mañé himself.

This family of ideas is the subject of the present chapter. In Section 9.1, we make a few useful observations about semi-continuity of Lyapunov exponents. Then, in Section 9.2, we state and prove a version of the Mañé–Bochi theorem for continuous linear cocycles (Theorem 9.5). This can be extended in several ways: to the original setting of diffeomorphisms (derivative cocycles); to higher dimensions; and to continuous time systems. Some of these are briefly discussed in Section 9.2.4.

This theory is also connected to the question of how Lyapunov exponents vary with the linear cocycle, which will be the central topic of Chapter 10. Indeed, Theorem 9.5 yields (in Corollary 9.8) a complete characterization of the continuity points of the Lyapunov exponents in the realm of continuous linear cocycles. Similar ideas allow us, in Section 9.3, to give examples of discontinuity of the Lyapunov exponents among Hölder-continuous cocycles.

Downloaded from University Publishing Online. This is copyrighted material http://dx.doi.org/10.1017/CBO9781139976602.010

9.1 Semi-continuity

Let $f : M \to M$ be a measurable transformation on a separable complete metric space $M$ and let $\mu$ be a Borel probability measure on $M$ invariant under $f$. Let $F : M \times \mathbb{R}^d \to M \times \mathbb{R}^d$ be the linear cocycle defined over $(f, \mu)$ by a bounded measurable function $A : M \to GL(d)$.

To begin with, we are going to show that the extremal Lyapunov exponents are semi-continuous functions of the corresponding cocycle. We write $\lambda_\pm(A, \mu)$ to mean $\lambda_\pm(F, \mu)$. Similarly, let $\chi_1(A, \mu) \ge \cdots \ge \chi_d(A, \mu)$ denote all the Lyapunov exponents of $F$, counted with multiplicity. The first lemma and its corollaries hold for pairs $(A, \mu)$ in any of the following situations:
• $A$ bounded and continuous, with the topology of uniform convergence (given by the $C^0$-norm), and $\mu \in \mathcal{M}(M)$ with the weak$^*$ topology;
• $A$ bounded and measurable, with the topology of uniform convergence, and $\mu \in \mathcal{M}(M)$ with the pointwise topology.

Lemma 9.1 The function $(A, \mu) \mapsto \lambda_+(A, \mu)$ is upper semi-continuous and the function $(A, \mu) \mapsto \lambda_-(A, \mu)$ is lower semi-continuous. More generally, for any $0 \le k \le d$, the function $(A, \mu) \mapsto \sum_{i=1}^{k} \chi_i(A, \mu)$ is upper semi-continuous and the function $(A, \mu) \mapsto \sum_{i=d-k+1}^{d} \chi_i(A, \mu)$ is lower semi-continuous.

Proof Observe that $(A, \mu) \mapsto \int \log\|A^n\|\,d\mu$ is continuous for every $n \ge 1$. Indeed,
\[ \Big| \int \log\|A^n\|\,d\mu - \int \log\|B^n\|\,d\nu \Big| \le \Big| \int \log\|A^n\|\,d(\mu - \nu) \Big| + \int \big| \log\|A^n\| - \log\|B^n\| \big|\,d\nu \]
and the first term goes to zero when $\mu \to \nu$, because $\log\|A^n\|$ is bounded, and the second term goes to zero when $\|A - B\| \to 0$, because the difference $\log\|A^n\| - \log\|B^n\|$ converges uniformly to zero. So, the case $k = 1$ of the claim follows directly from the identities
\[ \lambda_+(A, \mu) = \inf_{n \ge 1} \frac{1}{n} \int \log\|A^n\|\,d\mu \quad\text{and}\quad \lambda_-(A, \mu) = \sup_{n \ge 1} \frac{1}{n} \int \log\|(A^n)^{-1}\|^{-1}\,d\mu \]
in Theorem 3.12. To deduce the general case, consider the cocycle $\Lambda^k F$ induced by $F$ in the exterior $k$-power $\Lambda^k(\mathbb{R}^d)$. From Proposition 4.17 we get

\[
\begin{aligned}
\chi_1(A, \mu) + \cdots + \chi_k(A, \mu) &= \lambda_+(\Lambda^k A, \mu)\\
\chi_{d-k+1}(A, \mu) + \cdots + \chi_d(A, \mu) &= \lambda_-(\Lambda^k A, \mu).
\end{aligned}
\]
Then it suffices to observe that $A \mapsto \Lambda^k A$ is continuous with respect to the topology of uniform convergence (Exercise 9.1).

A subset of a topological space is called residual if it contains a countable intersection of open dense subsets, and it is called meager if its complement is residual or, in other words, if it is contained in a countable union of nowhere-dense closed sets. The ambient space is called a Baire space if every residual subset is dense or, equivalently, if every meager subset has empty interior. Complete metric spaces and locally compact topological spaces are Baire spaces.

Corollary 9.2 The set of discontinuity points $(A, \mu)$ for the Lyapunov exponents $\lambda_\pm$ is a meager subset of the domain.

Proof This is a general feature of semi-continuous functions. We recall the argument. Let $Q$ be any countable dense subset of $\mathbb{R}$ and, for each $q \in Q$, define $F_q = \partial\{(A, \mu) : \lambda_+(A, \mu) < q\}$. The assumption gives that $F_q$ is the boundary of an open set, and so it is closed and nowhere dense. Now let $(A, \mu)$ be a point of discontinuity for $\lambda_+$. Then there exists $(A_n, \mu_n) \to (A, \mu)$ such that $\lim_n \lambda_+(A_n, \mu_n) < \lambda_+(A, \mu)$. Let $q$ be any element of $Q$ between these two numbers. Then $(A, \mu) \in F_q$. This proves that the set of discontinuity points of $\lambda_+$ is contained in $\bigcup_{q \in Q} F_q$. Analogously, one gets that the set of discontinuity points of $\lambda_-$ is contained in a countable union of closed nowhere-dense subsets.

Corollary 9.3 If $(A, \mu)$ is such that all the Lyapunov exponents are equal, then $(A, \mu)$ is a continuity point for the Lyapunov exponents.

Proof Denote $\lambda = \lambda_+(A, \mu) = \lambda_-(A, \mu)$. By upper semi-continuity of the largest exponent and lower semi-continuity of the smallest exponent,

\[ \lambda_+(B, \nu) < \lambda + \varepsilon \quad\text{and}\quad \lambda_-(B, \nu) > \lambda - \varepsilon \]

for every (B, ν ) close to (A, μ ). Then |χ j (B, ν )− χ j (A, μ )| = |χ j (B, ν )− λ | < ε for every j = 1, . . . , d, and so (A, μ ) is indeed a continuity point. Another rather general class of continuity points for the Lyapunov exponents are the hyperbolic cocycles introduced in Section 2.2.

Lemma 9.4 The Lyapunov exponents vary continuously with $A$ in the open subset of hyperbolic cocycles in $C^0(M, SL(2))$.

Proof Let $\mathbb{R}^d = E^s_{A,x} \oplus E^u_{A,x}$ be the hyperbolic decomposition for the cocycle defined by $A$. Given $x \in M$, define
\[ g^s_A(x) = \frac{\|A(x)v^s\|}{\|v^s\|} \quad\text{and}\quad g^u_A(x) = \frac{\|A(x)v^u\|}{\|v^u\|} \]
for any non-zero $v^s \in E^s_x$ and $v^u \in E^u_x$ (the definition does not depend on the choice of these vectors). The Lyapunov exponents are given by
\[ \lambda_+(A, \mu) = \int \log g^u_A\,d\mu \quad\text{and}\quad \lambda_-(A, \mu) = \int \log g^s_A\,d\mu. \tag{9.1} \]

By Proposition 2.6, the invariant sub-bundles EAs and EAu depend continuously on A. Thus, gsB and guB are uniformly close to gsA and guA , respectively, if B is close to A. Then, by (9.1), the exponents λ± (B, μ ) are close to λ± (A, μ ).
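For a constant cocycle, both the identity $\lambda_+ = \inf_n \frac{1}{n} \int \log\|A^n\|\,d\mu$ used in the proof of Lemma 9.1 and formula (9.1) can be checked by hand or by machine. The following Python sketch assumes, purely for illustration, the constant hyperbolic cocycle $A(x) = A_0$ with $A_0 \in SL(2)$ upper triangular with eigenvalues $2$ and $1/2$, so that $g^u_A \equiv 2$ and $\lambda_+ = \log 2$; the function names are ours.

```python
import math

def mat_mul(X, Y):
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def op_norm(B):
    """Operator norm (largest singular value) of a 2x2 matrix."""
    (a, b), (c, d) = B
    frob = a * a + b * b + c * c + d * d
    det = a * d - b * c
    disc = math.sqrt(max(frob * frob - 4 * det * det, 0.0))
    return math.sqrt((frob + disc) / 2)

def growth_rate(A0, n):
    """(1/n) log ||A0^n||, the n-th term of the infimum, for a constant cocycle."""
    P = ((1.0, 0.0), (0.0, 1.0))
    for _ in range(n):
        P = mat_mul(A0, P)
    return math.log(op_norm(P)) / n

if __name__ == "__main__":
    A0 = ((2.0, 1.0), (0.0, 0.5))   # assumed sample matrix, det = 1
    for n in (1, 10, 100):
        print(n, growth_rate(A0, n), math.log(2))
```

Each term is bounded below by $\log 2$, the logarithm of the spectral radius, and the terms decrease towards it as $n$ grows; semi-continuity of $\lambda_+$ reflects the fact that it is an infimum of such continuous quantities.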

9.2 Theorem of Mañé–Bochi

An invariant probability measure of a transformation $f : M \to M$ is aperiodic if the set of periodic points of $f$ has zero measure. Given an invariant set $\Lambda \subset M$, we say that a cocycle $F : M \times \mathbb{R}^d \to M \times \mathbb{R}^d$ is hyperbolic over $\Lambda$ if the restriction of $F$ to $\Lambda \times \mathbb{R}^d$ is hyperbolic.

Theorem 9.5 Let $f : M \to M$ be a homeomorphism and $\mu$ be an aperiodic ergodic probability measure on some compact metric space $M$. For any continuous function $A : M \to SL(2)$, either the cocycle associated with $A$ is hyperbolic over the support of $\mu$ or $A$ is approximated in $C^0(M, SL(2))$ by continuous functions $C : M \to SL(2)$ such that $\lambda_\pm(C, \mu) = 0$.

Before proving this theorem, let us list a few applications:

Corollary 9.6 In the setting of Theorem 9.5, the set of $A \in C^0(M, SL(2))$ for which either the linear cocycle is hyperbolic over the support of $\mu$ or else $\lambda_\pm(A, \mu) = 0$ is a residual subset of $C^0(M, SL(2))$.

Proof By Proposition 2.6, the subset $H$ of functions $A$ for which the linear cocycle is hyperbolic is open in $C^0(M, SL(2))$. By Lemma 9.1, the subset $L(\delta)$ of continuous functions $A : M \to SL(2)$ for which $\lambda_+(A, \mu) < \delta$ is also open. Fix any sequence $(\delta_n)_n \to 0$. Then the set in the statement coincides with $\bigcap_n \big(H \cup L(\delta_n)\big)$. By Theorem 9.5, the intersection is dense and, hence, so is each of the open sets $H \cup L(\delta_n)$.

In some cases, the hyperbolicity alternative can be excluded a priori. Then one gets an abundance of vanishing exponents:

Example 9.7 Let $S^1 = \mathbb{R}/\mathbb{Z}$ and $f : S^1 \to S^1$ be a continuous transformation, with degree $\deg(f) \in \mathbb{Z}$. Let $A : S^1 \to SL(2)$ be of the form $A(x) = A_0 R_{2\pi\alpha(x)}$ where $A_0 \in SL(2)$ and $\alpha : S^1 \to S^1$ is a continuous function with degree $\deg(\alpha) \in \mathbb{Z}$ and $R_\theta$ denotes the rotation of angle $\theta$. Assume that $\mu$ is supported on the whole $S^1$. Assume that $2\deg(\alpha)$ is not a multiple of $\deg(f) - 1$. Then, continuous cocycles with zero Lyapunov exponents form a residual subset of the isotopy class of $A$ for, as we have seen in Example 2.9, no element of the isotopy class can be hyperbolic.

Theorem 9.5 also yields a complete characterization of the continuity points for the Lyapunov exponents among continuous cocycles:

Corollary 9.8 In the setting of Theorem 9.5, a function $A : M \to SL(2)$ is a continuity point for the Lyapunov exponents in $C^0(M, SL(2))$ if and only if the corresponding linear cocycle is hyperbolic over the support of $\mu$ or else the Lyapunov exponents $\lambda_\pm(A, \mu)$ vanish.

Proof By Lemma 9.4, Lyapunov exponents are continuous at every hyperbolic cocycle. By Corollary 9.3, the same is true for every cocycle with vanishing Lyapunov exponents. Theorem 9.5 implies that there are no other continuity points.

Let us give an outline of the proof of Theorem 9.5. We will fill in the details in the next three sections. Start from any $A \in C^0(M, SL(2))$ whose associated linear cocycle is not hyperbolic over $\operatorname{supp}\mu$. Proposition 9.10 below states that if the Lyapunov exponents $\lambda_\pm(A, \mu)$ are non-zero then one can find a nearby measurable function $B : M \to SL(2)$ that preserves the union of the Oseledets sub-bundles of $A$ but exchanges the two Oseledets sub-bundles over some positive measure set $Z \subset M$. Another key point, that will be established in Proposition 9.12, is that $Z$ may be chosen so that the second return map $Z \to Z$ is ergodic.
According to Proposition 9.13 below, it follows that $\lambda_\pm(B, \mu) = 0$. This is not quite the end yet, because $B$ need not be continuous. However, using Lusin's theorem, there exist continuous functions $C : M \to SL(2)$ with uniformly bounded norm and coinciding with $B$ outside sets that have arbitrarily small measure. Then the Lyapunov exponents $\lambda_\pm(C, \mu)$ are arbitrarily close to zero. This proves that every non-hyperbolic cocycle is approximated by continuous cocycles with arbitrarily small Lyapunov exponents. A Baire argument wraps up the proof of the theorem.

9.2.1 Interchanging the Oseledets subspaces

Let us now present the arguments in detail.

Lemma 9.9 Let $f : M \to M$ be an invertible transformation and $\mu$ be an invariant aperiodic probability measure. For any measurable set $Y \subset M$ with $\mu(Y) > 0$ and any $m \ge 1$ there exists a measurable set $Z \subset Y$ such that $\mu(Z) > 0$ and $Z, f(Z), \dots, f^{m-1}(Z)$ are pairwise disjoint.

Proof Since fixed points form a zero measure subset of $Y$, we may find $Y_1 \subset Y$ such that $\mu(Y_1 \Delta f(Y_1)) > 0$. Then $Z_1 = Y_1 \setminus f(Y_1)$ has positive measure and it is disjoint from its image $f(Z_1)$. Similarly, we may find $Y_2 \subset Z_1$ such that $\mu(Y_2 \Delta f^2(Y_2)) > 0$. Then $Z_2 = Y_2 \setminus f^2(Y_2)$ has positive measure and $Z_2$, $f(Z_2)$, and $f^2(Z_2)$ are pairwise disjoint. Continuing in this way we find $Z = Z_m$ as in the statement.

Proposition 9.10 Let $A \in C^0(M, SL(2))$ be such that the associated linear cocycle $F$ is not hyperbolic over $\operatorname{supp}\mu$. Then for every $\varepsilon > 0$ there exists $m \ge 1$ and $Z \subset M$ with $\mu(Z) > 0$ such that $Z, f(Z), \dots, f^{m-1}(Z)$ are pairwise disjoint, and there exists a measurable map $B : M \to SL(2)$, satisfying
(a) $\|A(x) - B(x)\| < \varepsilon$ for every $x \in M$ and $A = B$ outside $\bigcup_{j=0}^{m-1} f^j(Z)$;
(b) $B^m(x)E^s_x = E^u_{f^m(x)}$ and $B^m(x)E^u_x = E^s_{f^m(x)}$ for $x \in Z$,
where $\mathbb{R}^d = E^s_x \oplus E^u_x$ is the Oseledets decomposition of the linear cocycle $F$ associated with $A$.

Proof There are two parts. First, we explain how to construct a perturbation that maps $E^u$ to $E^s$. More precisely, we find $Y \subset M$, $p \ge 1$, and $J : M \to SL(2)$ satisfying (a) and $J^p(x)E^u_x = E^s_{f^p(x)}$ but, possibly, not $J^p(x)E^s_x = E^u_{f^p(x)}$. Next, we explain how to modify the construction and obtain $Z$, $m$, and $B$ satisfying all the conditions in the statement.

Mapping $E^u$ to $E^s$: Fix $\delta > 0$, much smaller than $\varepsilon > 0$. If the set of points $x$ such that $|\sin\angle(E^u_x, E^s_x)| < \delta$ has positive measure, we may take $Y$ to be that set, $p = 1$, and $J(x) = A(x)R(x)$ for $x \in Y$, where $R(x)$ is a small rotation such that $R(x)E^u_x = E^s_x$. From now on, we suppose that $|\sin\angle(E^u_x, E^s_x)| \ge \delta$ for almost every $x$. For each $p \ge 1$, consider the set
\[ \Gamma_p = \big\{ x \in M : \|A^p(x) \mid E^u_x\| < 2\|A^p(x) \mid E^s_x\| \big\}. \]

The assumption that the cocycle is non-hyperbolic implies that $\mu(\Gamma_p) > 0$ for every $p$. Indeed, suppose there exists $p \ge 1$ such that
\[ \frac{\|A^p(x) \mid E^u_x\|}{\|A^p(x) \mid E^s_x\|} \ge 2 \quad\text{for $\mu$-almost every $x \in M$.} \]
Then (Exercise 2.4), for any $k \ge 1$ and $\mu$-almost every $x$
\[ \|A^{kp}(x)\|^2 \ge \frac{\|A^{kp}(x) \mid E^u_x\|}{\|A^{kp}(x) \mid E^s_x\|} \ge 2^k. \]

Then, by continuity, $\|A^{kp}(x)\| \ge 2^{k/2}$ for every $k \ge 1$ and $x$ in the support of $\mu$. It follows, by Proposition 2.1, that the linear cocycle $F^p$ is hyperbolic over $\operatorname{supp}\mu$. Then (Exercise 2.5), $F$ is hyperbolic over $\operatorname{supp}\mu$. This contradiction proves that $\mu(\Gamma_p) > 0$ for every $p$, as claimed.

Now fix $\theta > 0$ much smaller than $\delta$ and $p \ge 1$ much larger than $|\log\theta|$. By Lemma 9.9, we may find a measurable set $Y \subset \Gamma_p$ with $\mu(Y) > 0$ and such that $Y, f(Y), \dots, f^{p-1}(Y)$ are pairwise disjoint. For each $x \in Y$ and $0 < j < p - 1$, define $J(f^j(x)) = A(f^j(x))L_j(x)$, where $L_j(x)$ is a linear map that fixes $E^s_{f^j(x)}$ and $E^u_{f^j(x)}$, expanding the former and contracting the latter by a definite factor $e^{\pm\theta}$:
\[ L_j(x) : E^s_{f^j(x)} \oplus E^u_{f^j(x)} \to E^s_{f^j(x)} \oplus E^u_{f^j(x)}, \qquad v^s + v^u \mapsto e^{\theta}v^s + e^{-\theta}v^u. \]
Since the angle between the two sub-bundles is bounded from zero, one has that $\|A(f^j(x)) - J(f^j(x))\| < \varepsilon$ for all $0 < j < p - 1$ and $x \in Y$, as long as $\theta$ is small enough. Then
\[
\begin{aligned}
\frac{\|A(f^{p-1}(x))J(f^{p-2}(x)) \cdots J(f(x))A(x) \mid E^s_x\|}{\|A(f^{p-1}(x))J(f^{p-2}(x)) \cdots J(f(x))A(x) \mid E^u_x\|}
&\ge e^{2(p-2)\theta}\,\frac{\|A^p(x) \mid E^s_x\|}{\|A^p(x) \mid E^u_x\|}\\
&\ge e^{2(p-2)\theta - 1} > 2\delta^{-1}\theta^{-2},
\end{aligned}
\]
as long as $p$ is large enough. Since the angle between the Oseledets subspaces is bounded below by $\delta$, this implies that there exists some line $r \subset \mathbb{R}^2$ such that, denoting $r_p = A(f^{p-1}(x))J(f^{p-2}(x)) \cdots J(f(x))A(x)r$,
\[ |\sin\angle(r, E^u_x)| < \theta \quad\text{and}\quad |\sin\angle(r_p, E^s_{f^p(x)})| < \theta. \]
Define $J(x) = A(x)R(x)$ and $J(f^{p-1}(x)) = L(x)A(f^{p-1}(x))$ where $R(x)$ and $L(x)$ are small rotations mapping $E^u_x$ to $r$ and $r_p$ to $E^s_{f^p(x)}$, respectively. This preserves $\|A - J\| < \varepsilon$ and also gives $J^p(x)E^u_x = E^s_{f^p(x)}$, as required.

Mapping $E^s$ to $E^u$: The latter property implies that $J^p(x)E^s_x \ne E^s_{f^p(x)}$. Then (Exercise 3.15),
\[ \lim_m |\sin\angle(A^{m-p}(f^p(x))J^p(x)E^s_x, E^u_{f^m(x)})| = 0. \]
Moreover, by Poincaré recurrence,
\[ \limsup_m |\sin\angle(E^u_{f^m(x)}, E^s_{f^m(x)})| > 0. \]

Thus, we may find $m > p$ and a positive measure set $Z \subset Y$ such that, for every $x \in Z$, the angle between $A^{m-p}(f^p(x))J^p(x)E^s_x$ and $E^u_{f^m(x)}$ is much smaller than the angle between the two Oseledets subspaces at $f^m(x)$. Hence, there exists a linear map $S(f^m(x))$ close to the identity that fixes $E^s_{f^m(x)}$ and maps $A^{m-p}(f^p(x))J^p(x)E^s_x$ to $E^u_{f^m(x)}$. By Lemma 9.9, $Z, f(Z), \dots, f^{m-1}(Z)$ may be assumed to be pairwise disjoint. Let $B : M \to SL(2)$ be given by
\[
B(x) =
\begin{cases}
J(x) & \text{if } x \in Z \cup f(Z) \cup \cdots \cup f^{p-1}(Z)\\
S(f(x))A(x) & \text{if } x \in f^{m-1}(Z)\\
A(x) & \text{in all other cases.}
\end{cases}
\]
It is clear from the construction that $Z$, $m$, and $B$ satisfy all the properties in the conclusion of the proposition.
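To get a feeling for why exchanging the two invariant directions is relevant for the exponents, it may help to look at a caricature that is much cruder than the construction above. In the following Python sketch we assume a constant hyperbolic matrix $D \in SL(2)$ whose invariant directions are the coordinate axes, and we compose it at every step with a rotation by $\pi/2$ (which is not a small perturbation); the names and the choice of matrices are ours and serve only as an illustration.

```python
import math

def mat_mul(X, Y):
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def op_norm(B):
    """Operator norm (largest singular value) of a 2x2 matrix."""
    (a, b), (c, d) = B
    frob = a * a + b * b + c * c + d * d
    det = a * d - b * c
    disc = math.sqrt(max(frob * frob - 4 * det * det, 0.0))
    return math.sqrt((frob + disc) / 2)

def finite_time_exponent(B, n):
    """(1/n) log ||B^n|| for a constant matrix B."""
    P = ((1.0, 0.0), (0.0, 1.0))
    for _ in range(n):
        P = mat_mul(B, P)
    return math.log(op_norm(P)) / n

D = ((2.0, 0.0), (0.0, 0.5))    # expands the x-axis, contracts the y-axis
R = ((0.0, -1.0), (1.0, 0.0))   # rotation by pi/2: exchanges the two axes
B = mat_mul(R, D)               # B^2 = -identity, so no growth survives

if __name__ == "__main__":
    print(finite_time_exponent(D, 20), finite_time_exponent(B, 20))
```

Here expansion along one axis is cancelled by contraction along the other at the next step, so the growth rate of $B^n$ is zero, while that of $D^n$ is $\log 2$. In the actual proof the exchange happens only over a set $Z$ of small measure and with small perturbations, which is why additional arguments are needed.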

9.2.2 Coboundary sets

We want to deduce from Proposition 9.10 that the Lyapunov exponents of $B$ vanish. The following observation shows that additional information is needed for this.

Let $A : M \to SL(2)$ be such that the Lyapunov exponents of the associated cocycle are non-zero, and let $\mathbb{R}^d = E^s_x \oplus E^u_x$ be the corresponding Oseledets decomposition. Fix a measurable set $W \subset M$ and let $H(x) : \mathbb{R}^2 \to \mathbb{R}^2$ be defined as follows: if $x \in W$ then $H(x)$ is an orthonormal map that exchanges $E^s_x$ with $E^u_x$; otherwise, $H(x) = \mathrm{id}$. Then take $B : M \to SL(2)$ to be given by $B(x) = H(f(x))A(x)H(x)^{-1}$ for every $x \in M$. By construction,
\[
\begin{aligned}
B(x)E^s_x = E^u_x \quad&\text{and}\quad B(x)E^u_x = E^s_x && \text{for } x \in W \Delta f(W)\\
B(x)E^s_x = E^s_x \quad&\text{and}\quad B(x)E^u_x = E^u_x && \text{for } x \notin W \Delta f(W).
\end{aligned}
\tag{9.2}
\]

On the other hand, since the norm of H ±1 is bounded, the Lyapunov exponents of A and B coincide. Hence, the property of exchanging the Oseledets subspaces of A alone cannot force the Lyapunov exponents of B to vanish. This problem is handled by the notion of coboundary set, introduced by Knill [75] in this context. A measurable set Z ⊂ M is a coboundary for f if

there exists some measurable set $W \subset M$ such that $Z$ coincides with $W \Delta f(W)$ up to a zero measure set.

Given any measurable set $Z \subset M$ with $\mu(Z) > 0$, let $\nu$ be the normalized restriction of $\mu$ to $Z$, and let $g : Z \to Z$ be the first return map: $g(x) = f^{r(x)}(x)$ where $r(x) \ge 1$ is the first time the forward orbit of $x$ hits $Z$ (actually, $g$ is defined on a full measure subset of $Z$). By Proposition 4.18, the map $g$ is ergodic for the probability measure $\nu$.

Lemma 9.11 Let $f : M \to M$ be an invertible transformation and $\mu$ be an ergodic probability measure. A set $Z \subset M$ with $\mu(Z) > 0$ is a coboundary for $f$ if and only if the second return map $g^2 : Z \to Z$ is not ergodic for $\nu$.

Proof Suppose that there exists $W \subset M$ such that $Z = W \Delta f(W)$ up to a zero measure set. Let $U = W \setminus f(W)$ and $V = f(W) \setminus W$. Given $x \in U$, let $n \ge 1$ be minimum such that $f^n(x) \notin W$. Then $g(x) = f^n(x) \in V$. Given $x \in V$, let $n \ge 1$ be minimum such that $f^n(x) \in W$. Then $g(x) = f^n(x) \in U$. This shows that $g(U) \subset V$ and $g(V) \subset U$. It follows that $\nu(U) = \nu(V) = 1/2$ and $g^2(U) \subset U$. In particular, $g^2$ is not ergodic.

To prove the converse, suppose there exists $U \subset Z$ such that $0 < \nu(U) < 1$ and $g^2(U) = U$. Then $U \cup g(U)$ is a $(g, \nu)$-invariant set with positive measure. Since $g$ is ergodic, it follows that $\nu(U \cup g(U)) = 1$. In other words, $Z = U \cup g(U)$ up to a zero measure set. Define $W = \{f^j(x) : x \in U \text{ and } 0 \le j < r(x)\}$. Then $f(W) = \{f^j(x) : x \in U \text{ and } 0 < j \le r(x)\}$ and so
\[ W \Delta f(W) = \{f^j(x) : x \in U \text{ and } j \in \{0, r(x)\}\} = U \cup g(U) = Z \]
up to a zero measure set. Thus, $Z$ is a coboundary set.

Proposition 9.12 Assume that $\mu$ is aperiodic for $f$. Then any set with positive measure has some subset with positive measure that is not a coboundary set.

Proof Since $M$ is a compact metric space, the probability space $(M, \mathcal{B}, \mu)$ is a Lebesgue space (see [114, Section 8.5.2]). Moreover, $\mu$ is non-atomic, because it is aperiodic. So, $(M, \mu)$ is isomorphic to the interval $[0, 1]$, endowed with the Lebesgue measure.
Theorem 9.3 in Friedman [53] asserts that, for any ergodic measure-preserving transformation in a Lebesgue space, the family W ( f , μ ), of measurable sets Z ⊂ M such that the first return map fZ : Z → Z is weak mixing, is dense in the σ -algebra of M, relative to the distance d(A1 , A2 ) = μ (A1 ΔA2 ). In particular, W ( f , μ ) contains some measurable set Z with μ (Z) > 0. Now, given any Y ⊂ M with μ (Y ) > 0, let h = fY : Y → Y be the first return map and ν be the normalized restriction of μ to Y . Then ν is aperiodic for h. By the previous paragraph, there is Z ∈ W (h, ν ) with ν (Z) > 0. This means

that $h_Z : Z \to Z$ is weak mixing and $\mu(Z) > 0$. Since weak mixing is preserved under iteration, the second return map $h_Z^2$ is also weak mixing and, hence, ergodic. Clearly, $h_Z$ coincides with the first return map $f_Z : Z \to Z$ under the transformation $f$. Thus, $f_Z^2$ is ergodic and so $Z$ is not a coboundary set.

Therefore, we may always choose the set $Z$ in Proposition 9.10 in such a way that it is not a coboundary.

Proposition 9.13 Let $Z \subset M$, $m \ge 1$, and $B : M \to SL(2)$ be as in the conclusion of Proposition 9.10 and assume that $Z$ is not a coboundary set. Then $\lambda_\pm(B, \mu) = 0$.

Proof Let $\mathbb{R}^d = E^s_x \oplus E^u_x$ denote the Oseledets decomposition for $A$. Suppose that $\lambda_\pm(B, \mu)$ are different from zero and let $\mathbb{R}^d = E^s_{B,x} \oplus E^u_{B,x}$ be the corresponding Oseledets decomposition. We are going to use the inducing construction in Section 4.4.1. More precisely, we consider the linear cocycle
\[ G : Z \times \mathbb{R}^2 \to Z \times \mathbb{R}^2, \qquad G(x, v) = (f^{r(x)}(x), B^{r(x)}v) \]
over the first return map $g : Z \to Z$. We also consider the corresponding projective cocycle $PG : Z \times P\mathbb{R}^2 \to Z \times P\mathbb{R}^2$. By construction,
\[ B^{r(x)}(E^s_x) = E^u_x \quad\text{and}\quad B^{r(x)}(E^u_x) = E^s_x \quad\text{$\nu$-almost everywhere.} \tag{9.3} \]

By Proposition 4.18, the Lyapunov exponents λ± (G, ν ) are different from zero, s ⊕ E u restricted and the Oseledets decomposition of G is given by Rd = EB,x B,x to the domain Z. Let m be the probability measure defined on Z × PR2 by 1 1 m(X) = ν ({x ∈ Z : (x, Exs ) ∈ X}) + ν ({x ∈ Z : (x, Exu ) ∈ X}) . 2 2 In other words, m projects down to μ and its disintegration is given by 1 1 x → δExs + δExu . 2 2 It follows from (9.3) that m is invariant under PG. Then, by Lemma 5.25 and Remark 5.26, m is a linear combination of the probability measures msB and muB defined on Z × PR2 by   s msB (X) = ν {x ∈ Z : (x, EB,x ) ∈ X}   u ) ∈ X} . muB (X) = ν {x ∈ Z : (x, EB,x Lemma 9.14

The probability measure m is ergodic for PG.

Proof Suppose that there is an invariant set $X \subset Z \times P\mathbb{R}^2$ with $0 < m(X) < 1$. Let $Z_0$ be the set of $x \in Z$ whose fiber $X \cap (\{x\} \times P\mathbb{R}^2)$ contains neither $E^s_x$ nor $E^u_x$. In view of (9.3), $Z_0$ is a $(g, \nu)$-invariant set and so its $\nu$-measure is either 0 or 1. Since $m(X) > 0$, we must have $\nu(Z_0) = 0$. Similarly, $m(X) < 1$ implies that $\nu(Z_2) = 0$, where $Z_2$ is the set of $x \in Z$ whose fiber contains both $E^s_x$ and $E^u_x$. Now let $Z_s$ be the set of $x \in Z$ whose fiber contains $E^s_x$ but not $E^u_x$, and let $Z_u$ be the set of $x \in Z$ whose fiber contains $E^u_x$ but not $E^s_x$. The previous observations show that $Z_s \cup Z_u$ has full $\nu$-measure. Moreover, (9.3) implies
\[ g(Z_s) = Z_u \quad\text{and}\quad g(Z_u) = Z_s \]
up to zero measure sets. Thus, $\nu(Z_s) = \nu(Z_u) = 1/2$ and $g^2(Z_s) = Z_s$. This implies that $g^2$ is not ergodic. Since $Z$ is not a coboundary set, that contradicts Lemma 9.11. This contradiction proves the present lemma.

It follows that $m$ coincides with either $m^s_B$ or $m^u_B$. This is a contradiction, because the conditional probabilities of $m$ are supported on exactly two points on each fiber, whereas the conditional probabilities of both $m^u_B$ and $m^s_B$ are Dirac masses on a single point. This contradiction proves that the Lyapunov exponents $\lambda_\pm(B, \mu)$ are zero, as claimed.
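Lemma 9.11, which was used in the proof above, can be tested by brute force in a finite toy model. The sketch below is only an illustration under assumptions of our own: $M = \mathbb{Z}/N$ with the uniform measure, $f(x) = x + 1 \pmod N$ (invertible and ergodic), zero measure sets being empty, and all function names hypothetical. For every nonempty $Z$ it compares "$Z$ is a coboundary" with "$g^2$ is not ergodic".

```python
from itertools import combinations


def first_return_map(Z, N):
    """First return map g to the nonempty set Z of f(x) = x + 1 (mod N)."""
    g = {}
    for x in Z:
        y = (x + 1) % N
        while y not in Z:
            y = (y + 1) % N
        g[x] = y
    return g


def is_ergodic_permutation(perm):
    """For the uniform measure on a finite set, a permutation is ergodic
    exactly when it has a single orbit."""
    start = next(iter(perm))
    orbit, x = {start}, perm[start]
    while x != start:
        orbit.add(x)
        x = perm[x]
    return len(orbit) == len(perm)


def is_coboundary(Z, N):
    """Search all subsets W of Z/N for one with Z = W symmetric-difference f(W)."""
    for size in range(N + 1):
        for W in combinations(range(N), size):
            W = set(W)
            fW = {(x + 1) % N for x in W}
            if W ^ fW == Z:
                return True
    return False


def check_lemma(N):
    """True if 'coboundary <=> g^2 not ergodic' holds for every nonempty Z."""
    for size in range(1, N + 1):
        for Z in combinations(range(N), size):
            Z = set(Z)
            g = first_return_map(Z, N)
            g2 = {x: g[g[x]] for x in Z}
            if is_coboundary(Z, N) != (not is_ergodic_permutation(g2)):
                return False
    return True
```

In this model the first return map is a cyclic permutation of $Z$, so both conditions reduce to $Z$ having an even number of points; calling `check_lemma(6)` runs the comparison over all 63 nonempty subsets.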

9.2.3 Proof of Theorem 9.5

So far we have shown that any continuous cocycle $F$ which is not hyperbolic over the support of $\mu$ can be approximated, in the uniform norm, by measurable cocycles with vanishing exponents. To complete the proof of Theorem 9.5 we must replace this by continuous cocycles. We are going to use the following semi-continuity result:

Lemma 9.15 Let $B : M \to SL(d)$, $d \ge 2$, be such that $\log \|B\| \in L^1(\mu)$. Given $L > 0$ and $\delta > 0$, there exists $\rho > 0$ such that $\lambda_+(J, \mu) < \lambda_+(B, \mu) + \delta$ for any measurable function $J : M \to SL(d)$ such that $\mu(\{x \in M : B(x) \ne J(x)\}) < \rho$ and $\|J\|_\infty < L$.

Proof Given $\delta > 0$, fix $n \ge 1$ such that
\[ \frac{1}{n} \int \log \|B^n\| \, d\mu \le \lambda_+(B, \mu) + \frac{\delta}{2}. \]
Given $J : M \to SL(d)$ such that $\|J\|_\infty < L$, denote $\Delta = \{x \in M : B(x) \ne J(x)\}$ and $\Delta_n = \Delta \cup f(\Delta) \cup \cdots \cup f^{n-1}(\Delta)$. Then
\[ \lambda_+(J, \mu) \le \frac{1}{n} \int \log \|J^n\| \, d\mu = \frac{1}{n} \int_{\Delta_n} \log \|J^n\| \, d\mu + \frac{1}{n} \int_{M \setminus \Delta_n} \log \|B^n\| \, d\mu. \]
Since $\|J^n\| \le L^n$, $\log \|B^n\| \ge 0$ and $\mu(\Delta_n) \le n \mu(\Delta)$, the right-hand side is bounded by
\[ \mu(\Delta_n) \log L + \lambda_+(B, \mu) + \frac{\delta}{2} \le n\, \mu(\Delta) \log L + \lambda_+(B, \mu) + \frac{\delta}{2}. \]
Recall that $n$ is fixed. It follows that $\lambda_+(J, \mu) \le \lambda_+(B, \mu) + \delta$ as long as $\mu(\Delta) < \rho$ for some $\rho > 0$ sufficiently small. This proves the lemma.

Let us proceed with the proof of Theorem 9.5. Previously, we have shown that for any $\varepsilon > 0$ there exist measurable functions $B : M \to SL(2)$ such that $\|A - B\|_\infty < \varepsilon$ and $\lambda_+(B, \mu) = 0$. Fix any $\delta > 0$. Given any $\rho > 0$ there exist continuous functions $J : M \to SL(2)$ such that $\|J\|_\infty \le \|B\|_\infty$ and $J = B$ outside a set with measure less than $\rho$ (Exercise 9.4). By Lemma 9.15, this implies that $\lambda_+(J, \mu) < \delta$, as long as $\rho$ is chosen small enough.

Let $\mathcal{H}$ be the subset of continuous functions $A : M \to SL(2)$ such that the associated linear cocycle is hyperbolic over $\operatorname{supp} \mu$ and, for any $\delta > 0$, let $\mathcal{L}(\delta)$ be the subset of continuous functions $A : M \to SL(2)$ such that $\lambda_+(A, \mu) < \delta$. By Propositions 2.6 and 9.1, $\mathcal{H}$ and the $\mathcal{L}(\delta)$ are all open subsets of $C^0(M, SL(2))$. The reasoning in the previous paragraph shows that $\mathcal{L}(\delta)$ is dense in the closed set $C^0(M, SL(2)) \setminus \mathcal{H}$, for every $\delta > 0$. Since $C^0(M, SL(2))$ is a complete metric space, it follows from Baire's theorem (Exercise 9.3) that $\bigcap_{k=1}^{\infty} \mathcal{L}(1/k)$ is dense in $C^0(M, SL(2)) \setminus \mathcal{H}$. In other words, every non-hyperbolic continuous cocycle is approximated in $C^0(M, SL(2))$ by another continuous cocycle, whose Lyapunov exponents are zero. This finishes the proof of Theorem 9.5.
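The proof of Lemma 9.15 starts by fixing $n$ with $\frac{1}{n} \int \log \|B^n\| \, d\mu \le \lambda_+(B, \mu) + \delta/2$. Such an $n$ exists because the integrals $\int \log \|B^n\| \, d\mu$ form a subadditive sequence, so that $\lambda_+(B, \mu)$ is the infimum of these averages. The following sketch illustrates the monotonicity along $n = 1, 2, 4, 8$ for a locally constant $SL(2)$-cocycle over a Bernoulli shift. The two matrices, the weights and all function names are assumptions made for this illustration only.

```python
import math
from itertools import product


def matmul(P, Q):
    """Product P*Q of 2x2 matrices stored as nested tuples."""
    (a, b), (c, d) = P
    (e, f), (g, h) = Q
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))


def op_norm(P):
    """Operator norm of a 2x2 real matrix (its largest singular value)."""
    (a, b), (c, d) = P
    s = a * a + b * b + c * c + d * d      # trace of P^T P
    det = a * d - b * c                    # det(P^T P) = det^2
    disc = max(s * s - 4.0 * det * det, 0.0)
    return math.sqrt((s + math.sqrt(disc)) / 2.0)


def averaged_log_norm(matrices, probs, n):
    """(1/n) * integral of log ||B^n||, computed exactly by summing over all
    words of length n with their Bernoulli weights."""
    total = 0.0
    for word in product(range(len(matrices)), repeat=n):
        weight = 1.0
        prod_ = ((1.0, 0.0), (0.0, 1.0))
        for i in word:
            weight *= probs[i]
            prod_ = matmul(matrices[i], prod_)
        total += weight * math.log(op_norm(prod_))
    return total / n


# Hypothetical example data: two matrices of determinant 1, equal weights.
MATRICES = [((2.0, 1.0), (1.0, 1.0)), ((1.0, 0.0), (1.0, 1.0))]
PROBS = [0.5, 0.5]
AVERAGES = [averaged_log_norm(MATRICES, PROBS, n) for n in (1, 2, 4, 8)]
```

Each entry of `AVERAGES` is an upper bound for $\lambda_+$, and subadditivity forces the entries to be non-increasing along this doubling sequence of $n$.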

9.2.4 Derivative cocycles and higher dimensions

Recall that a $C^1$ diffeomorphism $f : M \to M$ is Anosov if the derivative cocycle $F = Df$ is hyperbolic. We denote by $\mathrm{Diff}^1_m(M)$ the space of volume-preserving diffeomorphisms on a compact Riemannian manifold, endowed with the $C^1$ topology: two diffeomorphisms are $C^1$-close if they are uniformly close and their derivatives are also uniformly close. Theorem 9.5 is still true, albeit much harder, for derivative cocycles of area-preserving diffeomorphisms:

Theorem 9.16 (Mañé [88], Bochi [30]) Let $M$ be a compact Riemannian manifold of dimension 2. For every $f \in \mathrm{Diff}^1_m(M)$, either $f$ is an Anosov diffeomorphism or $f$ is approximated in $\mathrm{Diff}^1_m(M)$ by diffeomorphisms whose Lyapunov exponents vanish almost everywhere.

Corollary 9.17 Let $M$ be a compact Riemannian manifold of dimension 2. If $f \in \mathrm{Diff}^1_m(M)$ is a continuity point for the Lyapunov exponents then either $f$ is an Anosov diffeomorphism or the Lyapunov exponents vanish almost everywhere. The continuity points form a residual subset of $\mathrm{Diff}^1_m(M)$.

The 2-dimensional torus is the only compact surface that carries Anosov diffeomorphisms (see [52, 91] and Example 2.10). In all other cases, Corollary 9.17 gives that the area-preserving diffeomorphisms in some residual subset have zero Lyapunov exponents almost everywhere.

Now we discuss extensions of the previous results to arbitrary dimension. For the statements we need the notion of dominated decomposition. Let $M$ be a compact metric space and $F : M \times \mathbb{R}^d \to M \times \mathbb{R}^d$ be a continuous cocycle over some homeomorphism $f : M \to M$. Let
\[ \mathbb{R}^d = V^1_x \oplus \cdots \oplus V^l_x, \qquad x \in \Lambda \]
be an $F$-invariant decomposition, defined for $x$ in some invariant set $\Lambda \subset M$ and such that every $x \mapsto \dim V^i_x$ is constant on $\Lambda$. We call the decomposition dominated over $\Lambda$ if there exist $C > 0$ and $\theta < 1$ such that
\[ \|F^n(x) v_i\| \le C \theta^n \|F^n(x) v_{i-1}\| \quad\text{for every unit vector } v_i \in V^i_x,\ v_{i-1} \in V^{i-1}_x, \]
and for every $n \ge 1$, $x \in \Lambda$ and $1 < i \le l$. In other words, the cocycle is more contractive along each $V^i_x$ than along $V^{i-1}_x$, by a definite factor. By convention, the trivial decomposition into a single subspace (the case $l = 1$) is dominated.

In the next statement $G$ is any subgroup of $GL(d)$ that acts transitively on the projective space; that is, such that $\{g(r) : g \in G\} = P\mathbb{R}^d$ for any $r \in P\mathbb{R}^d$. That includes many interesting subgroups, such as the special linear group $SL(d)$ and the symplectic group (if $d$ is even). When $d$ is even, it also includes $SL(d/2, \mathbb{C})$ and $GL(d/2, \mathbb{C})$, viewed as subgroups of $GL(d)$.

Theorem 9.18 (Bochi and Viana [34]) Let $f : M \to M$ be a homeomorphism on a compact metric space $M$ and $\mu$ be an aperiodic ergodic probability measure. Then $A : M \to G$ is a continuity point for the Lyapunov exponents among $G$-valued continuous cocycles if and only if the Oseledets decomposition of the associated linear cocycle over $f$ is dominated (possibly trivially) over the support of $\mu$. The continuity points form a residual subset of the space of all continuous $G$-valued cocycles.

The statement actually proven in [34] is a bit stronger: in particular, it includes the non-ergodic case.

Theorem 9.19 (Bochi and Viana [34]) For any compact manifold $M$, there exists a residual subset $\mathcal{R}$ of the space $\mathrm{Diff}^1_m(M)$ of volume-preserving diffeomorphisms such that for every $f \in \mathcal{R}$ the Oseledets decomposition of $F = Df$ is dominated (possibly trivially) on every orbit of $f$.

Example 9.20 Bonatti and Viana [38] constructed open sets $\mathcal{U}$ of volume-preserving diffeomorphisms on $M = \mathbb{T}^4$ that are not Anosov and yet admit a unique decomposition $TM = E^s \oplus E^u$ that is invariant and dominated for the

derivative cocycle. Both subspaces $E^s$ and $E^u$ are 2-dimensional. Tahzibi [110] showed that every $C^2$ diffeomorphism in $\mathcal{U}$ is ergodic. By Avila [8], the set of $C^2$ diffeomorphisms is dense in $\mathrm{Diff}^1_m(M)$. By Oxtoby and Ulam [93], the set of ergodic diffeomorphisms is always a countable intersection of open sets. Combining these three facts we get that ergodic diffeomorphisms form a residual subset $\mathcal{R}$ of $\mathcal{U}$. For any $f \in \mathcal{R}$, Theorem 9.19 and Exercise 9.5 give that the Oseledets decomposition extends continuously to a dominated decomposition on the whole $M$. Then, by uniqueness, the Oseledets decomposition must coincide with $E^s \oplus E^u$ at almost every point. In particular, every $f \in \mathcal{R}$ has exactly one positive exponent and one negative exponent, both with multiplicity 2.

Now take $M$ to be endowed with a symplectic form; that is, a closed non-degenerate differential 2-form $\omega$. Existence of such a form implies that the dimension of $M$ is even, $d = 2k$. Moreover, $\omega^k = \omega \wedge \cdots \wedge \omega$ is a volume form on $M$. Let $m$ be the corresponding volume measure, normalized so that $m(M) = 1$. A diffeomorphism $f : M \to M$ is symplectic if the derivative preserves the symplectic form; that is, if
\[ \omega_x(v, w) = \omega_{f(x)}(Df(x)v, Df(x)w) \quad\text{for every } x \in M \text{ and } v, w \in T_x M. \]
Every symplectic diffeomorphism preserves the volume measure $m$. The space $\mathrm{Diff}^1_\omega(M)$ of symplectic diffeomorphisms is a closed subset of $\mathrm{Diff}^1_m(M)$. In dimension $d = 2$ the two spaces coincide.

Theorem 9.21 (Bochi and Viana [34], Bochi [31]) There exists a residual subset $\mathcal{R}$ of the space $\mathrm{Diff}^1_\omega(M)$ of symplectic diffeomorphisms such that for every $f \in \mathcal{R}$ the Oseledets decomposition of $F = Df$ is dominated (possibly trivially) on every orbit of $f$.

As a consequence, one gets the following dichotomy for a residual subset of symplectic diffeomorphisms that was also announced by Mañé [88]:

(1) either $f$ is an Anosov diffeomorphism, with invariant decomposition $TM = E^u \oplus E^s$ such that $\dim E^u = \dim E^s$ and $Df \mid E^s$ is uniformly contracting and $Df \mid E^u$ is uniformly expanding;
(2) or, for almost every $x$, there is a decomposition $E^s \oplus E^c \oplus E^u$ dominated on the orbit of $x$, with $\dim E^s = \dim E^u$ (possibly equal to zero) and $\dim E^c \ge 2$, such that $Df \mid E^s$ is uniformly contracting, $Df \mid E^u$ is uniformly expanding, and the Lyapunov exponents of $Df \mid E^c$ vanish.

For continuous time systems new difficulties arise, especially in the presence of equilibrium points of the flow. In his thesis, Bessa [20] obtained a version

of Theorem 9.5 for 2-dimensional continuous time linear cocycles. The extension to arbitrary dimension, corresponding to Theorem 9.18, was obtained by Bessa [22]. As happens for the discrete time systems, the case of derivative cocycles is much harder. A version of Theorem 9.16 for 3-dimensional flows was proven by Bessa [21], in the case when the flow has no equilibrium points. This restriction was removed soon afterwards by Araújo and Bessa [3]. Concerning extensions to higher dimensions, the continuous time version of Theorem 9.19 was settled by Bessa and Rocha [24] and a version of Theorem 9.21 for Hamiltonian flows with two degrees of freedom was obtained by Bessa and Dias [23].

9.3 Hölder examples of discontinuity

Corollary 9.8 shows that continuity of the Lyapunov exponents, although corresponding to residual subsets of the domain (by Corollary 9.2), can only occur at cocycles that are rather special from a dynamical viewpoint. While this is perhaps specific to the $C^0$-topology in the space of cocycles (see the discussion in Section 10.6), it is interesting to point out that a variation of the proof of Theorem 9.5 also yields examples of discontinuity in the space of Hölder-continuous cocycles over a Bernoulli shift, with arbitrary Hölder constant. That is the purpose of the present section and also motivates some of the ideas we will put forward in Section 10.6.

Let $f : M \to M$ be the shift map on the space $M = X^{\mathbb{Z}}$, with $X = \{1, 2\}$, endowed with the metric
\[ d(x, y) = 2^{-N(x,y)}, \qquad N(x, y) = \sup\{N \ge 0 : x_n = y_n \text{ whenever } |n| < N\}. \]
Let $\mu = p^{\mathbb{Z}}$, where $p = p_1 \delta_1 + p_2 \delta_2$ with $p_1, p_2$ positive and distinct. Let $r \in (0, \infty)$ be fixed. A function $A : M \to GL(d)$ is $r$-Hölder-continuous if there exists $C > 0$ such that $\|A(x) - A(y)\| \le C d(x, y)^r$ for any $x, y \in M$. The space $H^r(M)$ of all $r$-Hölder-continuous functions $A : M \to GL(d)$ comes with a natural topology, defined by the $r$-Hölder norm
\[ \|A\|_{H^r} = \sup\{\|A(x)\| : x \in M\} + \sup\left\{ \frac{\|A(x) - A(y)\|}{d(x, y)^r} : x \ne y \right\}. \tag{9.4} \]
Given $\sigma > 1$, consider the locally constant cocycle associated with the function $A : M \to SL(2)$ given by
\[ A(x) = \begin{pmatrix} \sigma & 0 \\ 0 & \sigma^{-1} \end{pmatrix} \text{ if } x_0 = 1 \quad\text{and}\quad A(x) = \begin{pmatrix} \sigma^{-1} & 0 \\ 0 & \sigma \end{pmatrix} \text{ if } x_0 = 2. \]

Note that $A$ is $r$-Hölder continuous for any $r > 0$. The Lyapunov exponents $\lambda_\pm(A, \mu) = \pm |p_1 - p_2| \log \sigma$ are non-zero.

Theorem 9.22 For any $r > 0$ such that $2^{2r} < \sigma$ there exist $r$-Hölder-continuous cocycles $B : M \to SL(2)$ with vanishing Lyapunov exponents and such that $\|A - B\|_{H^r}$ is arbitrarily close to zero. Hence, $A$ is a point of discontinuity for the Lyapunov exponents on $H^r(M)$.

The hypothesis on the Hölder constant $r$ is probably not sharp. For reasons to be discussed in Section 10.6, it would be interesting to weaken it to $2^r < \sigma^2$.

Here is an outline of the proof of Theorem 9.22. Note that the original cocycle $A$ preserves the horizontal and vertical line bundles $H_x = \mathbb{R}(1, 0)$ and $V_x = \mathbb{R}(0, 1)$. Then the Oseledets subspaces must coincide with $H_x$ and $V_x$ almost everywhere. We choose cylinders $Z_n \subset M$ whose first $n$ iterates $f^i(Z_n)$, $0 \le i \le n - 1$, are pairwise disjoint. Then we construct cocycles $B_n$ by modifying $A$ on some of these iterates so that
\[ B_n^n(x) H_x = V_{f^n(x)} \quad\text{and}\quad B_n^n(x) V_x = H_{f^n(x)} \quad\text{for all } x \in Z_n. \]
We deduce from this property that the Lyapunov exponents of $B_n$ vanish. Moreover, by construction, each $B_n$ is constant on every atom of some finite partition of $M$ into cylinders. In particular, $B_n$ is Hölder continuous for every $r > 0$. From the construction we also get that
\[ \|B_n - A\|_{H^r} \le \mathrm{const}\, \left( 2^{2r} / \sigma \right)^{n/2} \tag{9.5} \]
decays to zero as $n \to \infty$. This is how we get the claims in the theorem.

In the remainder of this section we fill the details in this outline, to prove Theorem 9.22. Let $n = 2k + 1$ for some $k \ge 1$ and $Z_n = [0; 2, \dots, 2, 1, \dots, 1, 1]$ where the symbol 2 appears $k$ times and the symbol 1 appears $k + 1$ times. Note that the $f^i(Z_n)$, $0 \le i \le 2k$, are pairwise disjoint. Let
\[ \varepsilon_n = \sigma^{-k} \quad\text{and}\quad \delta_n = \arctan \varepsilon_n. \]
Define $R_n : M \to SL(2)$ by
\[ R_n(x) = \begin{cases} \text{rotation of angle } \delta_n & \text{if } x \in f^k(Z_n) \\[2pt] \begin{pmatrix} 1 & 0 \\ \varepsilon_n & 1 \end{pmatrix} & \text{if } x \in Z_n \cup f^{2k}(Z_n) \\[2pt] \mathrm{id} & \text{in all other cases} \end{cases} \tag{9.6} \]
and then take $B_n = A R_n$.

Lemma 9.23 $B_n^n(x) H_x = V_{f^n(x)}$ and $B_n^n(x) V_x = H_{f^n(x)}$ for all $x \in Z_n$.

Proof Note that for any $x \in Z_n$,
\[ B_n^k(x) H_x = \mathbb{R}(\varepsilon_n, 1) \quad\text{and}\quad B_n^k(x) V_x = V_{f^k(x)}, \]
\[ B_n^{k+1}(x) H_x = V_{f^{k+1}(x)} \quad\text{and}\quad B_n^{k+1}(x) V_x = \mathbb{R}(-\varepsilon_n, 1), \]
\[ B_n^{2k}(x) H_x = V_{f^{2k}(x)} \quad\text{and}\quad B_n^{2k}(x) V_x = \mathbb{R}(-1, \varepsilon_n). \]
The claim follows by iterating one more time.

Lemma 9.24 There is $C > 0$ such that $\|B_n - A\|_{H^r} \le C \left( 2^{2r} / \sigma \right)^k$ for every $n$.

Proof Let $L_n = B_n - A$. Clearly, $\sup \|L_n\| \le \sup \|A\| \, \|R_n - \mathrm{id}\|$ and this is bounded by $\sigma \varepsilon_n$. Now let us estimate the second term in the definition (9.4). If $x$ and $y$ are not in the same cylinder $[0; a]$ then $d(x, y) = 1$, and so
\[ \frac{\|L_n(x) - L_n(y)\|}{d(x, y)^r} \le 2 \sup \|L_n\| \le 2 \sigma \varepsilon_n. \tag{9.7} \]
From now on we suppose $x$ and $y$ belong to the same cylinder. Then, since $A$ is constant on cylinders,
\[ \frac{\|L_n(x) - L_n(y)\|}{d(x, y)^r} = \frac{\|A(x)(R_n(x) - R_n(y))\|}{d(x, y)^r} \le \sigma\, \frac{\|R_n(x) - R_n(y)\|}{d(x, y)^r}. \]
If neither $x$ nor $y$ belong to $Z_n \cup f^k(Z_n) \cup f^{2k}(Z_n)$ then $R_n(x) = R_n(y)$ and so the expression on the right vanishes. The same holds if $x$ and $y$ belong to the same $f^i(Z_n)$ with $i \in \{0, k, 2k\}$. We are left to consider the case when one of the points belongs to some $f^i(Z_n)$ with $i \in \{0, k, 2k\}$ and the other one does not. Then $d(x, y) \ge 2^{-2k}$ and so, using once more that $\|R_n - \mathrm{id}\| \le \varepsilon_n$ at every point, we have
\[ \frac{\|L_n(x) - L_n(y)\|}{d(x, y)^r} \le \sigma\, \frac{\|R_n(x) - R_n(y)\|}{d(x, y)^r} \le 2 \sigma \varepsilon_n 2^{2kr}. \]
Combining this with (9.7), we conclude that
\[ \|L_n\|_{H^r} \le \sigma \varepsilon_n + 2 \sigma \varepsilon_n 2^{2kr} \le 3 \sigma \left( 2^{2r} / \sigma \right)^k. \]
Now it suffices to take $C = 3\sigma$ to complete the proof of the lemma.

Now we want to prove that $\lambda_\pm(B_n) = 0$ for every $n$. Let $\mu_n$ be the normalized restriction of $\mu$ to $Z_n$ and $f_n : Z_n \to Z_n$ be the first return map (defined on a full measure subset). We have the following explicit description:
\[ Z_n = \bigcup_{b \in \Omega} [0; w, b, w] \quad\text{(up to a zero measure subset)} \]
where $w = (2, \dots, 2, 1, \dots, 1, 1)$ and the union is over the set $\Omega$ of all finite words $b = (b_1, \dots, b_s)$ not having $w$ as a subword; moreover,
\[ f_n \mid [0; w, b, w] = f^{n+s} \mid [0; w, b, w] \quad\text{for each } b \in \Omega. \]
Thus, $(f_n, \mu_n)$ is a Bernoulli shift with an infinite alphabet $\Omega$ and probability vector given by $p_b = \mu_n([0; w, b, w])$. Let $\hat{B}_n : Z_n \to SL(2)$ be the cocycle induced by $B_n$ over $f_n$; that is,
\[ \hat{B}_n \mid [0; w, b, w] = B_n^{n+s} \mid [0; w, b, w] \quad\text{for each } b \in \Omega. \]
By Proposition 4.18, the Lyapunov spectrum of the induced cocycle is obtained multiplying the Lyapunov spectrum of the original cocycle by the average return time. In our setting this means that
\[ \lambda_\pm(\hat{B}_n) = \frac{1}{\mu(Z_n)}\, \lambda_\pm(B_n). \]
Therefore, it suffices to prove that $\lambda_\pm(\hat{B}_n) = 0$ for every $n$.

Suppose that the Lyapunov exponents of $\hat{B}_n$ are different from zero and let $\mathbb{R}^2 = E^u_x \oplus E^s_x$ be the Oseledets decomposition (defined almost everywhere in $Z_n$). The key observation is that, as a consequence of Lemma 9.23, the cocycle $\hat{B}_n$ permutes the vertical and horizontal sub-bundles:
\[ \hat{B}_n(x) H_x = V_{f_n(x)} \quad\text{and}\quad \hat{B}_n(x) V_x = H_{f_n(x)} \quad\text{for all } x \in Z_n. \tag{9.8} \]
Let $m_n$ be the probability measure defined on $M \times P\mathbb{R}^2$ by
\[ m_n(G) = \frac{1}{2}\, \mu_n(\{x \in Z_n : (x, V_x) \in G\}) + \frac{1}{2}\, \mu_n(\{x \in Z_n : (x, H_x) \in G\}). \]
In other words, $m_n$ projects down to $\mu_n$ and its disintegration is given by
\[ x \mapsto \frac{1}{2} (\delta_{H_x} + \delta_{V_x}). \]
It is clear from (9.8) that $m_n$ is $\hat{B}_n$-invariant. Using Lemma 5.25 we get that $m_n$ is a linear combination of the probability measures $m^s_n$ and $m^u_n$ defined on $M \times P\mathbb{R}^2$ by
\[ m^s_n(G) = \mu_n(\{x \in Z_n : (x, E^s_x) \in G\}), \qquad m^u_n(G) = \mu_n(\{x \in Z_n : (x, E^u_x) \in G\}). \]

Lemma 9.25 The probability measure $m_n$ is ergodic.

Proof Suppose that there is an invariant set $\mathcal{X} \subset M \times P\mathbb{R}^2$ with $m_n(\mathcal{X}) \in (0, 1)$. Let $X_0$ be the set of $x \in Z_n$ whose fiber $\mathcal{X} \cap (\{x\} \times P\mathbb{R}^2)$ contains neither $H_x$ nor $V_x$. In view of (9.8), $X_0$ is an $(f_n, \mu_n)$-invariant set and so its $\mu_n$-measure is either 0 or 1. Since $m_n(\mathcal{X}) > 0$, we must have $\mu_n(X_0) = 0$. The same kind of argument shows that $\mu_n(X_2) = 0$, where $X_2$ is the set of $x \in Z_n$ whose fiber contains both $H_x$ and $V_x$. Now let $X_H$ be the set of $x \in Z_n$ whose fiber contains $H_x$ but not $V_x$, and let $X_V$ be the set of $x \in Z_n$ whose fiber contains $V_x$ but not $H_x$. The previous observations show that $X_H \cup X_V$ has full $\mu_n$-measure and it follows from (9.8) that
\[ f_n(X_H) = X_V \quad\text{and}\quad f_n(X_V) = X_H. \]
Thus, $\mu_n(X_H) = 1/2 = \mu_n(X_V)$ and $f_n^2(X_H) = X_H$ and $f_n^2(X_V) = X_V$. This is a contradiction because $f_n$ is Bernoulli and, in particular, the second iterate $f_n^2$ is ergodic. This proves the lemma.

Thus, $m_n$ must coincide with either $m^s_n$ or $m^u_n$. This is a contradiction, because the conditional probabilities of $m_n$ are supported on exactly two points on each fiber, whereas the conditional probabilities of both $m^u_n$ and $m^s_n$ are Dirac masses on a single point. This contradiction proves that the Lyapunov exponents of $\hat{B}_n$ vanish for every $n$, and that concludes the proof of Theorem 9.22.
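The key algebraic fact in this section, Lemma 9.23, is easy to check numerically. The sketch below multiplies the matrices $B_n = A R_n$ along the first $n = 2k + 1$ iterates of a point of $Z_n$ and tests that the horizontal and vertical directions get exchanged. It is an illustration only: the function names are ours, and we assume that "rotation of angle $\delta_n$" in (9.6) means the counterclockwise rotation.

```python
import math


def matmul(P, Q):
    """Product P*Q of 2x2 matrices stored as nested tuples."""
    (a, b), (c, d) = P
    (e, f), (g, h) = Q
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))


def apply(P, v):
    """Image of the vector v under the matrix P."""
    (a, b), (c, d) = P
    return (a * v[0] + b * v[1], c * v[0] + d * v[1])


def unit(v):
    """Normalize v, so that only its direction matters."""
    n = math.hypot(v[0], v[1])
    return (v[0] / n, v[1] / n)


def A(symbol, sigma):
    """The locally constant cocycle of Section 9.3."""
    if symbol == 1:
        return ((sigma, 0.0), (0.0, 1.0 / sigma))
    return ((1.0 / sigma, 0.0), (0.0, sigma))


def product_along_Zn(k, sigma):
    """B_n^n(x) for x in Z_n, where n = 2k + 1 and B_n = A R_n."""
    eps = sigma ** (-k)
    delta = math.atan(eps)
    ident = ((1.0, 0.0), (0.0, 1.0))
    shear = ((1.0, 0.0), (eps, 1.0))
    rot = ((math.cos(delta), -math.sin(delta)),
           (math.sin(delta), math.cos(delta)))  # counterclockwise (assumed)
    word = [2] * k + [1] * (k + 1)               # symbols x_0, ..., x_{2k}
    total = ident
    for i, symbol in enumerate(word):
        if i in (0, 2 * k):
            R = shear          # f^i(x) lies in Z_n or in f^{2k}(Z_n)
        elif i == k:
            R = rot            # f^k(x) lies in f^k(Z_n)
        else:
            R = ident
        total = matmul(matmul(A(symbol, sigma), R), total)
    return total


M_EXAMPLE = product_along_Zn(3, 2.0)
H_IMAGE = unit(apply(M_EXAMPLE, (1.0, 0.0)))   # should be vertical
V_IMAGE = unit(apply(M_EXAMPLE, (0.0, 1.0)))   # should be horizontal
```

Up to rounding errors, the first coordinate of `H_IMAGE` and the second coordinate of `V_IMAGE` vanish, in agreement with the lemma.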

9.4 Notes

The original announcement of Theorem 9.16 was made by Mañé in his invited address to the ICM 1983, in Warsaw [88]. A draft of the proof circulated for several years, but it remained incomplete when Mañé passed away in 1995. The first complete proof was provided by Bochi in his thesis, based on that draft, and was published in [30].

A few related results had been obtained in the meantime. Knill [75, 76] considered $L^\infty$ cocycles with values in $SL(2)$ and proved that, as long as the base dynamics is aperiodic, the set of cocycles with non-zero exponents is never open. This was refined to the $C^0$ case by Bochi [27], who proved that an $SL(2)$-cocycle is a continuity point for the Lyapunov exponents in $C^0(M, SL(2))$ if and only if it is hyperbolic or else the exponents vanish (Corollary 9.8). Our presentation in Section 9.2 is adapted from the texts of Bochi [27] and Avila and Bochi [9]. A different strategy was used by Bochi [30], extending to the special case of derivative cocycles.

Related results were also obtained for $L^p$ cocycles with $p < \infty$. Arbieto and Bochi [4] proved that Lyapunov exponents are still semi-continuous on the cocycle relative to the $L^p$-norm, for any $1 \le p < \infty$. Then, using a result of Arnold and Cong [6], they concluded that the cocycles whose exponents are all equal are precisely the continuity points for the Lyapunov exponents on $L^p(\mu)$. It follows that these cocycles form a residual subset of $L^p(\mu)$. See also Bessa and Vilarinho [25].

Bochi and Viana [34] extended Theorems 9.5 and 9.16 to arbitrary dimension (Theorems 9.18 and 9.19). They also proved a version of Theorem 9.19 for symplectic diffeomorphisms which was later improved by Bochi [31] to obtain Theorem 9.21. Moreover, Bessa and his coauthors [3, 20, 21, 22, 23, 24] extended most of these statements to the continuous-time setting, as we explained at the end of Section 9.2.4.

The examples in Section 9.3 were constructed by Bocker and Viana [35], based on the method of proof of Theorem 9.5. Such examples notwithstanding, it should be stressed that the behavior of Hölder-continuous cocycles is usually very different from the behavior of typical continuous cocycles. Example 9.7 provides a striking illustration of this fact: one can use the invariance principle to show (see [36, Corollary 12.34]) that, under two mild additional assumptions, every Hölder-continuous cocycle in a $C^0$-neighborhood has non-vanishing Lyapunov exponents. See Chapter 12 of Bonatti, Díaz and Viana [36] and the survey papers of Bochi and Viana [32, 33] for more detailed discussions.

9.5 Exercises

Exercise 9.1 Prove that the maps $A \mapsto \Lambda^k A$, $1 \le k \le d - 1$, are continuous for the topology of uniform convergence ($C^0$-norm). Moreover, the same is true for the $L^1$-norm, restricted to maps $A : M \to GL(d)$ with $\|A\|_\infty \le C$, for any constant $C > 1$.

Exercise 9.2 Let $A : M \to GL(d)$ be such that $\|A^{\pm 1}\| \in L^\infty(\mu)$. Show that for every $\varepsilon > 0$ and $C > 1$ there exists $\delta > 0$ such that
\[ \lambda_+(B, \mu) \le \lambda_+(A, \mu) + \varepsilon \quad\text{and}\quad \lambda_-(B, \mu) \ge \lambda_-(A, \mu) - \varepsilon \]
for any $B : M \to GL(d)$ with $\|B^{\pm 1}\|_\infty \le C$ and $\|A - B\|_1 \le \delta$. Moreover, this statement remains true if one replaces $\lambda_+$ and $\lambda_-$ by, respectively, $\sum_{i=1}^{k} \chi_i$ and $\sum_{i=d-k+1}^{d} \chi_i$, for any $0 \le k \le d$.

Exercise 9.3 Let $X$ be a complete metric space, $F$ be a closed subset and $A_n$, $n \ge 1$, be open subsets whose closure contains $F$ for every $n$. Show that the closure of $\bigcap_n A_n$ contains $F$.

Exercise 9.4 Use the theorem of Lusin and the Tietze extension theorem to show that, given any measurable function $B : M \to SL(d)$ and any $\rho > 0$, there exist continuous functions $J : M \to SL(d)$ such that $\|J\|_\infty \le \|B\|_\infty$ and $J = B$ outside a set with measure less than $\rho$.

Exercise 9.5 Suppose that the linear cocycle $F$ admits a dominated decomposition $\mathbb{R}^d = V^1_x \oplus \cdots \oplus V^l_x$, $x \in \Lambda$, on some invariant set $\Lambda$. Conclude that:

(1) The decomposition is continuous (the subspaces $V^i_x$ depend continuously on $x \in \Lambda$), and even admits a continuous extension to the closure of $\Lambda$.
(2) Any linear cocycle sufficiently close to $F$ in the uniform convergence norm admits a dominated decomposition on $\Lambda$ into subspaces with the same dimensions.

Exercise 9.6 Show that an $SL(2)$-cocycle admits a dominated decomposition over an invariant set $\Lambda \subset M$ if and only if it is hyperbolic over $\Lambda$.

Exercise 9.7 Prove the following fact, which was implicit in Section 9.3: If $Z$ is a cylinder in a Bernoulli shift space $(M, \mu)$ then $Z$ is not a coboundary set.


10 Continuity

We have seen in Chapter 9 that the Lyapunov exponents may depend in a complicated way on the underlying linear cocycle. The theme of the present chapter is that, in the context of products of random matrices, this dependence is always continuous.

Let $\mathcal{G}(d)$ denote the space of compactly supported probability measures $p$ on $GL(d)$, endowed with the following topology: $p'$ is close to $p$ if it is close in the weak$^*$-topology and $\operatorname{supp} p'$ is contained in a small neighborhood of $\operatorname{supp} p$. Let $\lambda_+(p)$ and $\lambda_-(p)$ denote the extremal Lyapunov exponents of the product of random matrices associated with a given $p \in \mathcal{G}(d)$, in the sense of Section 2.1.1. In other words, $\lambda_\pm(p) = \lambda_\pm(A, \mu)$, where $A : GL(d)^{\mathbb{N}} \to GL(d)$, $(\alpha_k)_k \mapsto \alpha_0$ and $\mu = p^{\mathbb{N}}$. A probability measure $\eta$ on $P\mathbb{R}^d$ will be called $p$-stationary if it is stationary for this cocycle. We are going to prove:

Theorem 10.1 (Bocker and Viana) The functions $\mathcal{G}(2) \to \mathbb{R}$, $p \mapsto \lambda_\pm(p)$ are continuous at every point in the domain.

Avila, Eskin and Viana announced recently that this statement remains true in arbitrary dimension. Even more, for any $d \ge 2$, all the Lyapunov exponents depend continuously on the probability distribution $p \in \mathcal{G}(d)$. The proof will appear in [12].

It is easy to see that Theorem 10.1 implies Theorem 1.3. Indeed, given $(A_{i,j}, p_j)_{i,j} \in GL(2)^m \times \Delta_m$, let us consider $p = \sum_j p_j \delta_{A_j} \in \mathcal{G}(2)$. For any nearby element $(A'_{i,j}, p'_j)_{i,j}$ of $GL(2)^m \times \Delta_m$, the corresponding probability measure $p' = \sum_j p'_j \delta_{A'_j}$ is close to $p$ inside $\mathcal{G}(2)$; note that the assumption that $p_j > 0$ for every $j$ is crucial for ensuring that $\operatorname{supp} p'$ is contained in a small neighborhood of $\operatorname{supp} p$. So, the Lyapunov exponents $\lambda_\pm(A'_{i,j}, p'_j)_{i,j} = \lambda_\pm(p')$ are close to $\lambda_\pm(A_{i,j}, p_j)_{i,j} = \lambda_\pm(p)$, as claimed in Theorem 1.3.

Here is another direct consequence of Theorem 10.1. Let $X$ be a separable complete metric space and $f : M \to M$ be the shift map on $M = X^{\mathbb{N}}$. Consider the space of measurable functions $A : X \to GL(2)$ such that $\log \|A^{\pm 1}\|$ are bounded, with the topology of uniform convergence. Fix any probability measure $p$ on $X$ and let $\lambda_\pm(A, p)$ be the extremal Lyapunov exponents relative to $\mu = p^{\mathbb{N}}$ of the locally constant linear cocycle defined by $A$ over $f$. Observe that $\lambda_\pm(A, p) = \lambda_\pm(A_* p)$ and $A \mapsto A_* p$ is continuous. Thus, it follows from Theorem 10.1 that the functions $A \mapsto \lambda_\pm(A, p)$ are continuous at every point.

The proof of Theorem 10.1 occupies Sections 10.1 through 10.5. Then, in Section 10.6, we try to put this theorem and the results of Chapter 9 together into a consistent possible scenario.

10.1 Invariant subspaces

According to Corollary 9.3, we only need to consider the case $\lambda_-(p) < \lambda_+(p)$. Let $(p_k)_k$ be a sequence converging to some $p$ in $\mathcal{G}(2)$. In other words, given any continuous functions $\varphi_1, \dots, \varphi_l : GL(d) \to \mathbb{R}$ and any $\varepsilon > 0$,
\[ \left| \int \varphi_j \, dp_k - \int \varphi_j \, dp \right| < \varepsilon \quad\text{for } j = 1, \dots, l \]
and $\operatorname{supp} p_k$ is contained in the $\varepsilon$-neighborhood of $\operatorname{supp} p$, for every large $k$. We are going to show that $\lambda_+(p_k) \to \lambda_+(p)$ as $k \to \infty$. Then, by Exercise 10.1, we also have $\lambda_-(p_k) \to \lambda_-(p)$ as $k \to \infty$.

For each $k$, let $\eta_k$ be a $p_k$-stationary measure that realizes the largest Lyapunov exponent for $p_k$:
\[ \lambda_+(p_k) = \int \Phi \, d\eta_k \, dp_k. \]
Up to restricting to a subsequence, we may suppose that $(\eta_k)_k$ converges to some probability measure $\eta$ on $P\mathbb{R}^2$. By Proposition 5.9(b), this is a $p$-stationary measure. Moreover, since $\Phi$ is continuous,
\[ \lambda_+(p_k) = \int \Phi \, d\eta_k \, dp_k \quad\text{converges to}\quad \int \Phi \, d\eta \, dp. \]
Thus, if $\eta$ realizes the largest Lyapunov exponent for $p$ then we are done. In what follows we suppose that
\[ \int \Phi \, d\eta \, dp < \lambda_+(p) \tag{10.1} \]
and we prove that this leads to a contradiction. We will use $\alpha = (\alpha_n)_n$ to represent a generic point of $M = GL(d)^{\mathbb{N}}$ and we will write $\alpha^{(n)} = \alpha_{n-1} \cdots \alpha_0$ for $n \ge 1$.

Proposition 10.2 There exists some $L \in P\mathbb{R}^2$ such that

(a) $\eta(\{L\}) > 0$;
(b) $\alpha(L) = L$ for every $\alpha \in \operatorname{supp} p$;
(c) for $\mu$-almost all $\alpha \in GL(2)^{\mathbb{N}}$,
\[ \lim_n \frac{1}{n} \log \|\alpha^{(n)}(v)\| = \begin{cases} \lambda_-(p) & \text{if } v \in L \setminus \{0\} \\ \lambda_+(p) & \text{if } v \in \mathbb{R}^2 \setminus L. \end{cases} \]

Proof According to Proposition 5.5, the product measure $\mu \times \eta$ is invariant under the projective cocycle $PF : M \times P\mathbb{R}^2 \to M \times P\mathbb{R}^2$ defined by the function $A : M \to GL(2)$ over the shift $f : M \to M$ on $M = GL(2)^{\mathbb{N}}$. So, we may use the ergodic theorem to conclude that
\[ \tilde{\Phi}(\alpha, [v]) = \lim_n \frac{1}{n} \sum_{j=0}^{n-1} \Phi\big(PF^j(\alpha, [v])\big) = \lim_n \frac{1}{n} \log \|\alpha^{(n)}(v)\| \]
exists for $\mu \times \eta$-almost every point, is constant on the orbits of $PF$ and satisfies $\int \tilde{\Phi} \, d\eta \, d\mu = \int \Phi \, d\eta \, d\mu$. Thus, the assumption (10.1) implies that $\tilde{\Phi} < \lambda_+(p)$ on some subset with positive measure for $\mu \times \eta$.

We claim that for $\mu$-almost every $\alpha \in M$ there exists a unique $L(\alpha) \in P\mathbb{R}^2$ such that $\tilde{\Phi}(\alpha, L(\alpha)) < \lambda_+(p)$. Moreover,
\[ L(f(\alpha)) = \alpha_0(L(\alpha)) \quad\text{for } \mu\text{-almost every } \alpha \in M. \tag{10.2} \]
Indeed, consider
\[ \mathcal{Z} = \{(\alpha, [v]) \in M \times P\mathbb{R}^2 : \tilde{\Phi}(\alpha, [v]) < \lambda_+(p)\}. \]
This is a measurable, $PF$-invariant set with positive measure for $\mu \times \eta$. Hence, the projection $Z = \{\alpha \in M : (\alpha, [v]) \in \mathcal{Z} \text{ for some } [v]\}$ is a measurable (Proposition 4.5), $f$-invariant set with positive measure for $\mu$. Since $(f, \mu)$ is ergodic, it follows that $\mu(Z) = 1$. This proves the existence part of the claim. Next, given any $\alpha \in M$, suppose that there exist two distinct points $[v_1]$ and $[v_2]$ in projective space such that $\tilde{\Phi}(\alpha, [v_i]) < \lambda_+(p)$ for $i = 1, 2$. Since every vector in $\mathbb{R}^2$ may be written as a linear combination of $v_1$ and $v_2$, it follows that
\[ \lim_n \frac{1}{n} \log \|\alpha^{(n)}\| < \lambda_+(p). \]
This can only happen on a subset with zero measure for $\mu$, because the largest Lyapunov exponent is equal to $\lambda_+(p)$ at $\mu$-almost every point $\alpha \in M$. That proves the uniqueness part of the claim. Finally, (10.2) is a direct consequence of the fact that the function $\tilde{\Phi}$ is constant on the orbits of $PF$.

Let $\eta_0$ be any ergodic component (Theorem 5.14) of $\eta$ such that $(\mu \times \eta_0)(\mathcal{Z})$ is positive. Then $(\mu \times \eta_0)(\mathcal{Z}) = 1$, by ergodicity, and so $\eta_0(\{L(\alpha)\}) = 1$ for $\mu$-almost every $\alpha \in M$. It follows that there exists $L \in P\mathbb{R}^2$ such that $L(\alpha) = L$ for $\mu$-almost every $\alpha \in M$. This subspace $L$ satisfies the conditions in the statement. Indeed, condition (a) is given by $\eta(\{L\}) = (\mu \times \eta)(\mathcal{Z}) > 0$. The relation (10.2) gives that $\alpha(L) = L$ for $p$-almost every $\alpha$, which is equivalent to condition (b). By construction, $\tilde{\Phi}(\alpha, [v]) = \lambda_+(p)$ for $[v] \ne L$ and $\tilde{\Phi}(\alpha, L) < \lambda_+(p)$ for $\mu$-almost all $\alpha \in M$. Since $\tilde{\Phi}$ only takes the values $\lambda_\pm(p)$, by the Oseledets theorem, it follows that $\tilde{\Phi}(\alpha, L) = \lambda_-(p)$ for $\mu$-almost all $\alpha \in M$. Thus, $L$ satisfies condition (c) as well.
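To get a feeling for the situation described by Proposition 10.2, one may look at a toy example of our own choosing (it does not appear in the text): two lower triangular matrices, taken with equal probabilities, which share the invariant line $L = \mathbb{R}(0, 1)$. Here $\delta_L$ is a $p$-stationary measure and the growth rate along $L$ is the smaller exponent, while every vector outside $L$ grows at the larger rate, as in condition (c). The sketch below estimates both rates along one random sequence; the matrices, weights, seed and function name are all assumptions made for the illustration.

```python
import math
import random


def growth_rate(matrices, probs, v, n, seed=0):
    """Estimate (1/n) log ||alpha^(n)(v)|| along one random sequence,
    renormalizing the vector at each step to avoid overflow."""
    rng = random.Random(seed)
    x, y = v
    log_growth = 0.0
    for _ in range(n):
        (a, b), (c, d) = rng.choices(matrices, weights=probs)[0]
        x, y = a * x + b * y, c * x + d * y
        norm = math.hypot(x, y)
        log_growth += math.log(norm)
        x, y = x / norm, y / norm
    return log_growth / n


# Hypothetical data: both matrices fix the vertical line L = R(0, 1).
MATRICES = [((2.0, 0.0), (1.0, 0.5)), ((3.0, 0.0), (0.0, 1.0 / 3.0))]
PROBS = [0.5, 0.5]
N_STEPS = 20000

RATE_ON_L = growth_rate(MATRICES, PROBS, (0.0, 1.0), N_STEPS)
RATE_OFF_L = growth_rate(MATRICES, PROBS, (1.0, 1.0), N_STEPS)

# For triangular matrices the exponents are the averages of the logarithms
# of the diagonal entries.
LAMBDA_PLUS = 0.5 * (math.log(2.0) + math.log(3.0))
LAMBDA_MINUS = -LAMBDA_PLUS
```

Because the same seed is used in both calls, the two estimates are computed along the same sequence of matrices, so `RATE_ON_L` and `RATE_OFF_L` should be close to `LAMBDA_MINUS` and `LAMBDA_PLUS`, respectively, up to the sampling error of a finite orbit.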

10.2 Expanding points in projective space

Let L be the subspace given by Proposition 10.2. We are going to view L as a kind of repelling fixed point for the random walk defined by p on the projective space and to analyse such a repeller from the point of view of (the random walks defined by) nearby probability measures. The first step will be to establish in which sense L is repelling. For that, we need to introduce some notation. Most of the time the dimension d will be arbitrary, although our interest is in d = 2.

Let P(d) be the space of probability measures on PR^d, endowed with the weak∗ topology. Given p ∈ G(d) and η ∈ P(d), we denote by p ∗ η the pushforward of the product measure p × η under the map (α, x) → α(x). That is,

\[ p * \eta(D) = \int \eta(\alpha^{-1}(D))\,dp(\alpha) \quad\text{for every measurable set } D \subset \mathbb{P}\mathbb{R}^d. \tag{10.3} \]
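The operation (10.3) can be illustrated on finitely supported measures. The sketch below is only an illustration under stated assumptions: d = 2, matrices with integer entries (so that projective points have exact representatives), and the two matrices and all function names are hypothetical choices, not taken from the text. A measure η is p-stationary exactly when `convolve(p, eta) == eta`.

```python
from fractions import Fraction
from math import gcd


def direction(v):
    """Canonical integer representative of the projective point [v], v in Z^2 minus {0}."""
    a, b = v
    g = gcd(a, b)
    a, b = a // g, b // g
    if a < 0 or (a == 0 and b < 0):
        a, b = -a, -b
    return (a, b)


def act(alpha, x):
    """Action of a 2x2 integer matrix (tuple of rows) on a projective point."""
    (a, b), (c, d) = alpha
    return direction((a * x[0] + b * x[1], c * x[0] + d * x[1]))


def convolve(p, eta):
    """p * eta as in (10.3): push-forward of p x eta under (alpha, x) -> alpha(x)."""
    out = {}
    for alpha, weight in p.items():
        for x, mass in eta.items():
            y = act(alpha, x)
            out[y] = out.get(y, 0) + weight * mass
    return out


# Hypothetical example: two matrices that both fix the direction [e1].
A = ((1, 0), (0, 2))
B = ((1, 1), (0, 3))
p = {A: Fraction(1, 2), B: Fraction(1, 2)}

dirac_e1 = {(1, 0): Fraction(1)}
dirac_e2 = {(0, 1): Fraction(1)}

print(convolve(p, dirac_e1) == dirac_e1)  # the Dirac mass at [e1] is p-stationary
print(convolve(p, dirac_e2))              # the Dirac mass at [e2] is not
```

Exact rational weights are used so that the stationarity test is an equality of dictionaries rather than a floating-point comparison.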

Observe that a probability η is p-stationary if and only if p ∗ η = η. Similarly, given p1, p2 ∈ G(d), we denote by p1 ∗ p2 ∈ G(d) the pushforward of the product measure p1 × p2 under the map (α1, α2) → α1α2. In other words,

\[ p_1 * p_2(E) = \int p_2(\alpha_1^{-1}E)\,dp_1(\alpha_1) = \int p_1(E\alpha_2^{-1})\,dp_2(\alpha_2) \tag{10.4} \]

for every measurable set E ⊂ GL(d). We call p1 ∗ p2 the convolution of p1 and p2. Clearly, p1 ∗ p2 is compactly supported if p1 and p2 are. Given p ∈ G(d), define p^(1) = p and p^(n+1) = p ∗ p^(n) for every n ≥ 1.

Remark 10.3 The operations (10.3) and (10.4) make sense, more generally, for measures (not necessarily probabilities) p1, p2 and p in GL(d) and η in PR^d.

Given α ∈ GL(d) and [v] ∈ PR^d, we denote by Dα([v]) the derivative at the point [v] of the action of α in the projective space. In explicit terms (see Exercise 10.3):

\[ D\alpha([v])\dot v = \frac{\operatorname{proj}_{\alpha(v)}\alpha(\dot v)}{\|\alpha(v)\|/\|v\|} \quad\text{for every } \dot v \in T_{[v]}\mathbb{P}\mathbb{R}^d = \{v\}^{\perp} \tag{10.5} \]

where proj_w : u → u − w(u · w)/(w · w) denotes the orthogonal projection to the hyperplane orthogonal to w. When d = 2, the tangent hyperplanes are lines and so Dα([v]) is a real number.

Given p ∈ G(d) and x ∈ PR^d, we say that x is p-invariant if it is a fixed point for p-almost every α ∈ GL(d) or, equivalently, for every α ∈ supp p. Then we say that x is p-expanding if there exist ℓ ≥ 1 and c > 0 such that

\[ \int \log\|D\beta(x)\dot v\|\,dp^{(\ell)}(\beta) \ge 2c \quad\text{for every unit vector } \dot v \in T_x\mathbb{P}\mathbb{R}^d. \tag{10.6} \]
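The definitions (10.4), (10.5) and (10.6) can be checked numerically for a finitely supported p. The following is a minimal sketch, assuming d = 2 and a measure given as a dictionary from matrices to weights; the matrices `A`, `B` and every function name are hypothetical choices made for the illustration. Both matrices fix the direction x = [e1], so x is p-invariant, and the code evaluates the integral in (10.6) for ℓ = 1 and ℓ = 2.

```python
import math


def matmul(a, b):
    """Product of two 2x2 matrices stored as tuples of rows."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def convolve(p1, p2):
    """p1 * p2 as in (10.4): push-forward of p1 x p2 under (a1, a2) -> a1 a2."""
    out = {}
    for a1, w1 in p1.items():
        for a2, w2 in p2.items():
            prod = matmul(a1, a2)
            out[prod] = out.get(prod, 0.0) + w1 * w2
    return out


def power(p, ell):
    """Convolution power: p^(1) = p and p^(n+1) = p * p^(n)."""
    q = p
    for _ in range(ell - 1):
        q = convolve(p, q)
    return q


def apply(alpha, v):
    return (alpha[0][0] * v[0] + alpha[0][1] * v[1],
            alpha[1][0] * v[0] + alpha[1][1] * v[1])


def proj(w, u):
    """Orthogonal projection of u to the line orthogonal to w."""
    s = (u[0] * w[0] + u[1] * w[1]) / (w[0] ** 2 + w[1] ** 2)
    return (u[0] - s * w[0], u[1] - s * w[1])


def derivative_norm(alpha, v):
    """Norm of D alpha([v]) on the unit tangent vector, following (10.5) with d = 2."""
    n = math.hypot(*v)
    vdot = (-v[1] / n, v[0] / n)  # unit vector orthogonal to v
    av = apply(alpha, v)
    return math.hypot(*proj(av, apply(alpha, vdot))) / (math.hypot(*av) / n)


def expansion_integral(p, v, ell=1):
    """Left-hand side of (10.6) for a finitely supported p."""
    return sum(w * math.log(derivative_norm(beta, v))
               for beta, w in power(p, ell).items())


# Hypothetical example: both matrices fix [e1] and expand transversally.
A = ((1, 0), (0, 2))
B = ((1, 1), (0, 3))
p = {A: 0.5, B: 0.5}
x = (1, 0)

print(expansion_integral(p, x, ell=1))
print(expansion_integral(p, x, ell=2))
```

For this choice the derivatives at x are 2 and 3, so the integral for ℓ = 1 equals (1/2) log 6 up to rounding and (10.6) holds with ℓ = 1 and c = (log 6)/4. Because x is fixed and d = 2, the derivative is multiplicative along products, which is why the value for ℓ = 2 is twice as large.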

Part (b) of Proposition 10.2 means that L is p-invariant. In view of part (c), the next proposition implies that L is also p-expanding:

Proposition 10.4 Let x ∈ PR^d be a p-invariant point and suppose that there exist a < b such that

\[ \lim_n \frac{1}{n}\log\|\alpha^{(n)}(v)\| \;\begin{cases} \le a & \text{if } v \in x \setminus \{0\} \\ \ge b & \text{if } v \in \mathbb{R}^d \setminus x \end{cases} \]

for μ-almost all α ∈ M. Then x is a p-expanding point.

Proof By Proposition 4.14 and the hypothesis,

\[ \lim_n \frac{1}{n}\log\|\operatorname{proj}_{\alpha^{(n)}(x)}\alpha^{(n)}(\dot v)\| = \lim_n \frac{1}{n}\log\|\alpha^{(n)}(\dot v)\| \ge b \]

for every unit vector v̇ ∈ T_x PR^d = {x}⊥ and μ-almost every α ∈ M. By (10.5), the derivative is this projection divided by ‖α^(n)(v)‖/‖v‖ for v ∈ x \ {0}, whose exponential growth rate is at most a. Then,

\[ \lim_n \frac{1}{n}\log\|D\alpha^{(n)}(x)\dot v\| \ge b - a > 0 \]

for every unit vector v̇ ∈ {x}⊥ and μ-almost every α ∈ M. Since the support of p is compact, the inequalities in Exercise 10.3 ensure that there exists C > 0 such that −C ≤ log ‖Dα(x)v̇‖ ≤ C for every α ∈ supp p and every unit vector v̇ ∈ {x}⊥. It follows that

\[ -C \le \frac{1}{n}\log\|D\alpha^{(n)}(x)\dot v\| \le C \]

for every α ∈ supp p^(n), every unit vector v̇ ∈ {x}⊥ and every n ≥ 1. So, using the bounded convergence theorem, the previous inequality implies that

\[ \lim_n \int \frac{1}{n}\log\|D\beta(x)\dot v\|\,dp^{(n)}(\beta) = \lim_n \int \frac{1}{n}\log\|D\alpha^{(n)}(x)\dot v\|\,d\mu(\alpha) \ge b - a \]

for every unit vector v̇ ∈ {x}⊥. Let c = (b − a)/2 and, for each unit vector v̇ ∈ {x}⊥, let n(v̇) ≥ 1 be the smallest value of n such that

\[ \frac{1}{n}\int \log\|D\beta(x)\dot v\|\,dp^{(n)}(\beta) > c. \]

Note that n(v̇) depends upper semi-continuously on v̇, and so n0 = sup{n(v̇) : v̇ ∈ {x}⊥ with ‖v̇‖ = 1} is finite. Fix C > 0 such that

\[ \int \log\|D\beta(x)\dot v\|\,dp^{(n)}(\beta) \ge -C \]

for 1 ≤ n ≤ n0 and every unit vector v̇ ∈ {x}⊥. Then

\[ \int \log\|D\beta(x)\dot v\|\,dp^{(n)}(\beta) \ge c\left[\frac{n}{n_0}\right] n_0 - C \ge 2c \]

for every unit vector v̇ ∈ {x}⊥, if n is sufficiently large.

The heart of the proof of Theorem 10.1 is the statement that p-expanding points are invisible for the p-stationary measures that are limits of pk-stationary non-atomic measures with pk → p. In precise terms:

Theorem 10.5 Suppose that x ∈ PR^d is a p-expanding point. Let (pk)k be a sequence in G(d) converging to p and, for each k ≥ 1, let ηk ∈ P(d) be a pk-stationary measure. Suppose that (ηk)k converges to some η. If ηk is non-atomic for k arbitrarily large, then η({x}) = 0.

The proof of this theorem will be given in Sections 10.4 and 10.5. Right now, let us explain how it can be used to complete the proof of Theorem 10.1.

10.3 Proof of the continuity theorem

The existence of an invariant subspace L implies that no matrix α ∈ supp p is elliptic. If the support consisted only of parabolic matrices (with the same invariant subspace) and multiples of the identity then we would have λ−(p) = λ+(p), which is assumed not to be the case. Thus, supp p contains some hyperbolic matrix. Since the set of hyperbolic matrices is open, it follows that it has positive measure for p. This fact will be useful in a while.

By Proposition 10.2 and Theorem 10.5, we may suppose that ηk is atomic for every large k. Then, by Lemma 6.9, there exists a finite set ℒk ⊂ PR² such that α(ℒk) = ℒk for every α ∈ supp pk and ηk({y}) > 0 for every y ∈ ℒk. The open set of hyperbolic matrices has positive measure for pk, for every large k, because pk is close to p in the weak∗ topology. Observe, furthermore, that for a hyperbolic matrix any finite invariant set consists of either one or two eigenspaces. Thus, for every large k, the set ℒk can have no more than 2 elements.

First, suppose that ℒk has a single element Lk. We claim that

\[ \int \Phi(\alpha, L_k)\,dp_k(\alpha) = \lambda_+(p_k). \tag{10.7} \]

To prove this, let ζk be the Dirac mass at the point Lk. The product measure μk × ζk is PF-invariant, and so it follows from the ergodic theorem that

\[ \int \Phi(\alpha, L_k)\,dp_k(\alpha) = \int \tilde\Phi(\alpha, L_k)\,d\mu_k(\alpha). \tag{10.8} \]

We also have that Φ̃(α, [v]) = λ+(pk) for μk × ηk-almost every (α, [v]), because ∫ Φ̃(α, [v]) dηk([v]) dμk(α) = λ+(pk) and Φ̃(α, [v]) ≤ λ+(pk) for μk-almost every α ∈ M and every [v] ∈ PR². Since ηk({Lk}) > 0, this implies Φ̃(α, Lk) = λ+(pk) for μk-almost every α ∈ M. Together with (10.8), this implies the claim (10.7).

Up to restricting to a subsequence, we may assume that (Lk)k converges to some subspace L0 ∈ PR². Let ζ0 be the Dirac mass at L0. Since supp pk converges to supp p in the Hausdorff distance (Exercise 10.4), it follows that α(L0) = L0 for every α ∈ supp p. Thus, the product measure μ × ζ0 is invariant under the projective cocycle PF. Since Φ is a continuous function, passing to the limit in (10.7) we get that

\[ \lambda_+(p_k) \to \int \Phi(\alpha, L_0)\,dp(\alpha) = \int \tilde\Phi(\alpha, L_0)\,d\mu(\alpha). \tag{10.9} \]

According to Proposition 10.2, there are two possibilities:

(i) If L0 ≠ L then the right-hand side of (10.9) is equal to λ+(p). Then ∫ Φ dηk dμk = λ+(pk) converges to λ+(p), which contradicts (10.1).
(ii) If L0 = L then the right-hand side of (10.9) is λ−(p). Then (Exercise 10.1) we also have that λ−(pk) → λ+(p). This is a contradiction, because λ−(pk) ≤ λ+(pk) for every k and λ−(p) < λ+(p).

We have shown that, for k large, ℒk cannot consist of a single point.

Finally, suppose that, for all k large, ℒk consists of two points, Lk and L′k, with positive measure for ηk. Let ζk = (δ_{Lk} + δ_{L′k})/2. Arguing as we did for (10.7), we find that

\[ \frac{1}{2}\int \Phi(\alpha, L_k)\,dp_k(\alpha) + \frac{1}{2}\int \Phi(\alpha, L_k')\,dp_k(\alpha) = \lambda_+(p_k). \tag{10.10} \]

We may assume that (Lk)k and (L′k)k converge to subspaces L0 and L′0, respectively. Let ζ0 = (δ_{L0} + δ_{L′0})/2. Passing to the limit in (10.10),

\[ \lambda_+(p_k) \to \frac{1}{2}\int \Phi(\alpha, L_0)\,dp(\alpha) + \frac{1}{2}\int \Phi(\alpha, L_0')\,dp(\alpha) = \frac{1}{2}\int \tilde\Phi(\alpha, L_0)\,d\mu(\alpha) + \frac{1}{2}\int \tilde\Phi(\alpha, L_0')\,d\mu(\alpha). \tag{10.11} \]

If L0 and L′0 are both different from L then the right-hand side is equal to λ+(p) and we reach a contradiction just as we did in case (i) of the previous paragraph. Now suppose that L0 = L. We may assume that, for each k large, there exists αk ∈ supp pk such that αk(Lk) = L′k: otherwise, we could take ℒk = {Lk} instead, and that case has already been dealt with. Passing to the limit along a convenient subsequence, we find α0 ∈ supp p such that α0(L0) = L′0. By part (b) of Proposition 10.2, this implies that L′0 is also equal to L. Then the right-hand side of (10.11) is equal to λ−(p), and we reach a contradiction just as we did in case (ii) of the previous paragraph. So, ℒk cannot consist of two points either.

We have reduced the proof of Theorem 10.1 to proving Theorem 10.5.

10.4 Couplings and energy

For simplicity, we will write P = PR^d and G = GL(d). Let d be the distance on P defined by the angle between two directions. For any Borel measure ξ on P × P and δ > 0, define the δ-energy of ξ to be

\[ E_\delta(\xi) = \int d(x,y)^{-\delta}\,d\xi(x,y). \]

Let π^i : P × P → P be the projection on the i-th coordinate, for i = 1, 2. By definition, the mass of a measure η on P is the total measure ‖η‖ = η(P) of the ambient space. If η1 and η2 are measures on P with the same mass, a coupling of η1 and η2 is a measure on P × P that projects to ηi on the i-th coordinate for i = 1, 2. For instance,

\[ \xi = \frac{1}{\|\eta_1\|}\,\eta_1 \times \eta_2 = \frac{1}{\|\eta_2\|}\,\eta_1 \times \eta_2 \]

is a coupling of η1 and η2. Define:

\[ e_\delta(\eta_1, \eta_2) = \inf\{E_\delta(\xi) : \xi \text{ a coupling of } \eta_1 \text{ and } \eta_2\}. \tag{10.12} \]

The infimum is always achieved, because the function ξ → Eδ (ξ ) is lower semi-continuous (Exercise 10.5).
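For finitely supported measures, the coupling condition and the δ-energy reduce to finite sums. The sketch below is an illustration only: it replaces the projective space by the interval [0, 2] with the usual distance and uses the three-atom measure that also appears in Example 10.6; the function names are hypothetical.

```python
from fractions import Fraction


def marginals(xi):
    """Coordinate projections of a finitely supported measure on P x P."""
    first, second = {}, {}
    for (x, y), w in xi.items():
        first[x] = first.get(x, 0) + w
        second[y] = second.get(y, 0) + w
    return first, second


def is_coupling(xi, eta1, eta2):
    """True if xi projects to eta1 on the first and eta2 on the second coordinate."""
    first, second = marginals(xi)
    return first == eta1 and second == eta2


def energy(xi, delta):
    """E_delta(xi) = sum of w * d(x, y)^(-delta); infinite if xi has an atom on the diagonal."""
    total = 0.0
    for (x, y), w in xi.items():
        if w == 0:
            continue
        if x == y:
            return float("inf")
        total += float(w) * abs(x - y) ** (-delta)
    return total


third, sixth = Fraction(1, 3), Fraction(1, 6)
eta = {0: third, 1: third, 2: third}

# Product self-coupling: charges the diagonal, so its energy is infinite.
product = {(x, y): eta[x] * eta[y] for x in eta for y in eta}
# Self-coupling supported off the diagonal: finite energy.
off_diagonal = {(x, y): sixth for x in eta for y in eta if x != y}

for name, xi in (("product", product), ("off-diagonal", off_diagonal)):
    print(name, is_coupling(xi, eta, eta), energy(xi, 1.0))
```

With δ = 1 the off-diagonal coupling has energy 4 · (1/6) + 2 · (1/6) · (1/2) = 5/6, while the product coupling has infinite energy; the infimum in (10.12) only needs one coupling of finite energy to be finite.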


A self-coupling of a measure η ∈ P(d) is a coupling of η and η . We call a self-coupling symmetric if it is invariant under the involution ι : (x, y) → (y, x). Define the δ -energy of η to be: eδ (η ) = eδ (η , η ) = inf{Eδ (ξ ) : ξ a self-coupling of η }.

(10.13)

We call p ∗ η the convolution of η by p. Observe that a probability η is p-stationary if and only if p ∗ η = η. If ξ is any self-coupling then ξ′ = (ξ + ι∗ξ)/2 is a symmetric self-coupling and Eδ(ξ) = Eδ(ξ′). Thus, the infimum in (10.13) is always achieved at some symmetric self-coupling of η. Such symmetric self-couplings are called δ-optimal.

Example 10.6 Let P = [0, 2] and η be the convex combination of the Dirac masses at 0, 1 and 2; that is,

\[ \eta = \frac{1}{3}\delta_0 + \frac{1}{3}\delta_1 + \frac{1}{3}\delta_2. \]

Since η is a probability, the product ξ1 = η × η is a (symmetric) self-coupling of η. Observe that the product ξ1 has atoms on the diagonal of P × P and, thus, its δ-energy is infinite. Now, consider

\[ \xi_2 = \frac{1}{6}\delta_{(0,1)} + \frac{1}{6}\delta_{(0,2)} + \frac{1}{6}\delta_{(1,0)} + \frac{1}{6}\delta_{(1,2)} + \frac{1}{6}\delta_{(2,0)} + \frac{1}{6}\delta_{(2,1)}. \]

Observe that ξ2 is a symmetric self-coupling of η. Moreover, the support of ξ2 does not intersect the diagonal and, consequently, Eδ(ξ2) is finite. This proves that eδ(η) < ∞. It is easy to check that ξ2 is a δ-optimal self-coupling of η.

This notion of energy allows us to characterize the presence of atoms:

Lemma 10.7 For any η ∈ P(d) and δ > 0:

(1) if η({x}) > ‖η‖/2 for some x ∈ P then eδ(η) = ∞;
(2) if η({x}) < ‖η‖/2 for every x ∈ P then eδ(η) < ∞.

Proof First we prove (1). Take x as in the hypothesis. For any self-coupling ξ of η, we have

\[ \xi(\{x\}^c \times P) = \eta(\{x\}^c) = \xi(P \times \{x\}^c). \]

Taking the union, ξ({(x, x)}^c) ≤ 2η({x}^c) < η(P) = ξ(P × P). This implies that (x, x) is an atom for ξ, and so Eδ(ξ) = ∞. Now we prove (2). The hypothesis implies (Exercise 10.7) that there exists ρ > 0 such that η(B(x, 2ρ)) < η(B(x, 2ρ)^c) for every x ∈ P. Let x1, . . . , xl be


such that U = {B(xi, ρ) : i = 1, . . . , l} covers P. We claim that there exists a finite sequence ξ0, . . . , ξl of symmetric self-couplings of η such that

\[ \xi_j\big(B(x_i,\rho) \times B(x_i,\rho)\big) = 0 \quad\text{for every } 1 \le i \le j. \tag{10.14} \]

Denote Δr = {(y, z) ∈ P × P : d(y, z) < r}. Take r > 0 to be a Lebesgue number for the open cover U. Then, by definition,

\[ \Delta_r \subset \bigcup_{i=1}^{l} B(x_i,\rho) \times B(x_i,\rho). \]

It follows that ξl(Δr) = 0 and so Eδ(ξl) < ∞. We are left to prove our claim. We are going to construct the sequence ξj by induction on j. Let ξ0 be an arbitrary self-coupling of η. For each j = 1, . . . , l, suppose that ξj−1 has been constructed in such a way that B(xi, ρ) × B(xi, ρ) has zero measure for 1 ≤ i ≤ j − 1. Denote ξ = ξj−1 and B = B(xj, ρ) and C = B(xj, 2ρ) and D = B(xj, 2ρ)^c. Note that (see Figure 10.1)

• ξ(C × C) + ξ(C × D) = η(C), because π¹∗ξ = η;
• ξ(D × C) + ξ(D × D) = η(D), because π¹∗ξ = η;
• ξ(C × D) = ξ(D × C), because ξ is symmetric.

Since η(C) < η(D), this implies that ξ(B × B) ≤ ξ(C × C)

10.5 Conclusion of the proof

0 and ℓ ≥ 1 as in the definition (10.6). Up to restricting to a subsequence, we may suppose that (ηk)k converges to η. Assume, for contradiction, that η({x}) > 0. Then let U0 ⊂ P be an open neighborhood of x such that

\[ \eta(\{x\}) > \frac{9}{10}\,\eta(\bar U_0). \]

Lemma 10.8 Assuming that δ > 0 is small enough and k ≥ 1 is large enough, then

\[ \int \|D\beta(x)\dot v\|^{-\delta}\,dp_k^{(\ell)}(\beta) \le 1 - c\delta \quad\text{for every unit vector } \dot v \in T_x P. \]

Proof Fix some compact neighborhood V of the support of p. Take k to be large enough that supp pk ⊂ V. Each function

\[ \psi_{\dot v,k}(\delta) = \int \|D\beta(x)\dot v\|^{-\delta}\,dp_k^{(\ell)}(\beta) \]

is differentiable and the derivative is given by

\[ \psi'_{\dot v,k}(\delta) = -\int \|D\beta(x)\dot v\|^{-\delta}\log\|D\beta(x)\dot v\|\,dp_k^{(\ell)}(\beta). \]

Note that ψ′_{v̇,k}(0) = −∫ log ‖Dβ(x)v̇‖ dp_k^(ℓ)(β). Since p_k^(ℓ) → p^(ℓ) in the weak∗ topology, and the function β → log ‖Dβ(x)v̇‖ is continuous on V, we get that

\[ \lim_k \psi'_{\dot v,k}(0) = -\int \log\|D\beta(x)\dot v\|\,dp^{(\ell)}(\beta) \le -2c \quad\text{for every } \dot v. \]


This convergence is uniform on v̇, as the sequence of functions v̇ → ψ′_{v̇,k}(0) is equicontinuous. So, assuming that k is large enough, ψ′_{v̇,k}(0) ≤ −3c/2 for every v̇. Since β and v̇ live inside compact sets V and T¹_x P, respectively, the factor ‖Dβ(x)v̇‖^{−δ} is uniformly close to 1 if δ is close to 0. Thus, in particular, ψ′_{v̇,k}(δ) ≤ −c for every v̇, every large k and every small δ. Since ψ_{v̇,k}(0) = 1, this gives the claim.
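The role of the smallness of δ in Lemma 10.8 can be seen on a toy example. The sketch below assumes a measure with two atoms of weight 1/2 on which the derivative norm ‖Dβ(x)v̇‖ takes the hypothetical values 2 and 3; the names `psi` and `atoms` are illustrative only. It compares ψ(δ) with the bound 1 − cδ, where 2c is the value of the integral in (10.6).

```python
import math


def psi(delta, atoms):
    """psi(delta) = integral of ||D beta(x) v||^(-delta) for a finitely supported measure.

    `atoms` is a list of (weight, derivative_norm) pairs.
    """
    return sum(w * r ** (-delta) for w, r in atoms)


# Hypothetical data: derivative norms 2 and 3, each with weight 1/2.
atoms = [(0.5, 2.0), (0.5, 3.0)]

# -psi'(0) is the integral of log||D beta(x) v||; take it as 2c, as in (10.6).
two_c = sum(w * math.log(r) for w, r in atoms)
c = two_c / 2

for delta in (0.01, 0.1, 0.5, 2.0):
    bound = 1 - c * delta
    print(delta, psi(delta, atoms), bound, psi(delta, atoms) <= bound)
```

In this example the inequality ψ(δ) ≤ 1 − cδ holds for the three small values of δ and fails at δ = 2, which is consistent with the lemma requiring δ to be small: ψ is convex, so it eventually rises above any line of negative slope through a point below (0, 1).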

Fix δ as in Lemma 10.8 and take k to be large enough that the conclusion of the lemma holds; further conditions will be imposed on k along the way. For notational simplicity, we write Π = p^(ℓ) and Πk = p_k^(ℓ) for each k. Let U1 ⊂ U0 be an open neighborhood of x such that

\[ \int d(\beta(y),\beta(z))^{-\delta}\,d\Pi_k(\beta) \le (1 - c\delta)\,d(y,z)^{-\delta}, \tag{10.16} \]

for any pair of distinct points y, z ∈ U1. Since x is a p-invariant point, β(x) = x for every β ∈ supp Π. Let K be a compact neighborhood of the support of Π and U1 ⊃ U2 ⊃ U3 ⊃ U4 ⊃ U5 be open neighborhoods of x such that

\[ \beta^{-1}(U_2) \subset U_1 \ \text{ and } \ \bar U_3 \subset U_2 \ \text{ and } \ \bar U_4 \subset U_3 \ \text{ and } \ \beta(U_5) \subset U_4 \quad\text{for every } \beta \in K. \]

Take k to be large enough that supp Πk ⊂ K. The δ-energy of every ηk | U1 is finite, because the ηk are assumed to be non-atomic. The strategy for the proof of Theorem 10.5 is to use the expansion property (10.16) to find a uniform bound for these δ-energies. That ensures that the δ-energy remains finite at k = ∞, contradicting the existence of a fat atom for η | U1 at x. The main step for realizing this strategy is the following proposition:

Proposition 10.9 For each k ≥ 1, let ξk be any symmetric self-coupling of ηk | U1. Then there exists C > 0 such that

\[ e_\delta(\eta_k \mid U_1) \le (1 - c\delta)\,E_\delta(\xi_k) + C \quad\text{for every } k \text{ sufficiently large.} \]

Theorem 10.5 can be deduced as follows. Suppose there exists a subsequence (kj)j → ∞ such that ηkj has no atoms in U1 and, in particular, eδ(ηkj | U1) is finite. Let ξkj be a δ-optimal symmetric self-coupling of ηkj | U1; that is, such that Eδ(ξkj) = eδ(ηkj | U1). Then Proposition 10.9 gives that eδ(ηkj | U1) ≤ C/(cδ). Let η̂ be any accumulation point of the sequence (ηkj | U1)j. On the one hand, by lower semi-continuity,

\[ e_\delta(\hat\eta) \le \limsup_j e_\delta(\eta_{k_j} \mid U_1) \le \frac{C}{c\delta} < \infty. \]

On the other hand, η | U1 ≤ η̂ ≤ η | Ū1 and, in particular,

\[ \hat\eta(\{x\}) > \frac{9}{10}\,\|\hat\eta\|. \]

The latter implies that eδ(η̂) = ∞, which contradicts the previous conclusion. This contradiction proves Theorem 10.5.
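The passage from Proposition 10.9 to the uniform bound C/(cδ) used above can be spelled out as follows; the only ingredients are the proposition, the δ-optimality of the chosen self-coupling, and the finiteness of the energy along the subsequence.

```latex
\begin{align*}
e_\delta(\eta_{k_j}\mid U_1)
  &\le (1-c\delta)\,E_\delta(\xi_{k_j}) + C
   = (1-c\delta)\,e_\delta(\eta_{k_j}\mid U_1) + C, \\
c\delta\, e_\delta(\eta_{k_j}\mid U_1)
  &\le C
  \qquad\text{(subtracting, which is allowed because } e_\delta(\eta_{k_j}\mid U_1) < \infty\text{)}, \\
e_\delta(\eta_{k_j}\mid U_1)
  &\le \frac{C}{c\delta}.
\end{align*}
```

The finiteness assumption is essential: if the energy were infinite, the first inequality would hold trivially and give no information.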

10.5.1 Proof of Proposition 10.9

We need to construct a (symmetric) self-coupling of ηk | U1 whose δ-energy is significantly smaller than that of ξk. A good starting point is the diagonal convolution ξ̃k of Πk and ξk; that is, the image of Πk × ξk under the diagonal push-forward (β, (y, z)) → (β(y), β(z)). Equivalently,

\[ \tilde\xi_k(D_1 \times D_2) = \int \xi_k\big(\beta^{-1}(D_1) \times \beta^{-1}(D_2)\big)\,d\Pi_k(\beta) \tag{10.17} \]

for any measurable sets D1, D2 ⊂ PR^d. Indeed, it is clear that ξ̃k is a symmetric measure and the expansion property (10.16) implies that

\[
\begin{aligned}
E_\delta(\tilde\xi_k) = \int d(y,z)^{-\delta}\,d\tilde\xi_k(y,z)
  &= \int\!\!\int d(\beta(y),\beta(z))^{-\delta}\,d\Pi_k(\beta)\,d\xi_k(y,z) \\
  &\le (1 - c\delta)\int d(y,z)^{-\delta}\,d\xi_k(y,z) = (1 - c\delta)\,E_\delta(\xi_k).
\end{aligned}
\tag{10.18}
\]

However, the restriction of ξ̃k to U1 × U1 is not a self-coupling of ηk | U1. Indeed, let ηk¹ be the projection of ξ̃k | U1 × U1 on either coordinate (thus, ξ̃k | U1 × U1 is a symmetric self-coupling of ηk¹). The next lemma shows that ηk¹ ≤ ηk | U1 and it describes the difference between the two measures precisely. There are two terms, ik and ok, that correspond to displacements of mass in opposite directions across the border of U1 under the diagonal push-forward: ik accounts for mass that is mapped from the outside into U1 × U1 whereas ok comes from mass that overflows U1 × U1. See Figure 10.2.

Lemma 10.10 (ηk | U1) − ηk¹ = ik + ok where ik is the restriction to U1 of Πk ∗ (ηk | U1^c) and ok is the projection of ξ̃k | (U1 × U1^c) on the first coordinate. Moreover,

\[ \|o_k\| \le \|i_k\| \quad\text{and}\quad \limsup_k \|i_k\| \le \frac{1}{10}\,\eta(\{x\}). \]

Proof

It follows from (10.17) that

\[ \pi^j_*\tilde\xi_k = \int (\eta_k \mid U_1) \circ \beta^{-1}\,d\Pi_k(\beta) = \Pi_k * (\eta_k \mid U_1) \quad\text{for } j = 1, 2. \]

Figure 10.2 Diagonal push-forward: mass in region Ik projects to ik and mass in region Ok projects to ok

Combining this with the fact that ηk is Πk-stationary, we find

\[ \pi^1_*\tilde\xi_k = \Pi_k * \eta_k - \Pi_k * (\eta_k \mid U_1^c) = \eta_k - \Pi_k * (\eta_k \mid U_1^c), \]

and so π¹∗ξ̃k | U1 = (ηk | U1) − ik. Moreover,

\[ \pi^1_*\tilde\xi_k \mid U_1 = \pi^1_*\big(\tilde\xi_k \mid U_1 \times U_1\big) + \pi^1_*\big(\tilde\xi_k \mid U_1 \times U_1^c\big) = \eta_k^1 + o_k. \]

The first claim is a direct consequence of these two equalities. Next, observe that

\[ \big(\Pi_k * (\eta_k \mid U_1)\big)(U_1^c) + \big(\Pi_k * (\eta_k \mid U_1)\big)(U_1) = \big(\Pi_k * (\eta_k \mid U_1)\big)(P) = \eta_k(U_1) \]

and, using that ηk is Πk-stationary,

\[ \big(\Pi_k * (\eta_k \mid U_1^c)\big)(U_1) + \big(\Pi_k * (\eta_k \mid U_1)\big)(U_1) = \big(\Pi_k * \eta_k\big)(U_1) = \eta_k(U_1). \]

This implies that

\[ \big(\Pi_k * (\eta_k \mid U_1^c)\big)(U_1) = \big(\Pi_k * (\eta_k \mid U_1)\big)(U_1^c). \tag{10.19} \]

The left-hand side is precisely ‖ik‖. As for the right-hand side,

\[ \big(\Pi_k * (\eta_k \mid U_1)\big)(U_1^c) = (\pi^2_*\tilde\xi_k)(U_1^c) = \tilde\xi_k(P \times U_1^c) \ge \tilde\xi_k(U_1 \times U_1^c) = \|o_k\|. \]

This proves that ‖ok‖ ≤ ‖ik‖. Finally, the first part of the lemma implies that ik ≤ ηk | U1. Moreover, the choice of U2 ensures that U1^c ∩ β^{−1}(U2) = ∅ for every β ∈ supp Πk and so ik(U2) = 0. Thus,

\[ \limsup_k \|i_k\| \le \limsup_k \eta_k(\bar U_1 \setminus U_2) \le \eta(\bar U_1 \setminus U_2) \le \frac{1}{10}\,\eta(\{x\}). \]

This finishes the proof of the lemma.


According to Lemma 10.10, the difference (ηk | U1) − ηk¹ is positive and relatively small. This suggests that we try to construct the self-coupling of ηk | U1 we are looking for, with small δ-energy, by adding suitable correcting terms to the restriction of ξ̃k to U1 × U1. These correcting terms should be concentrated outside a neighborhood of the diagonal, if possible, so that their contribution to the total energy is bounded. We choose:

\[
\begin{aligned}
\bar\xi_k ={}& \Big[(\tilde\xi_k \mid U_1 \times U_1) - \frac{\|\zeta_k\|}{\|\eta_k^4\|}\,(\tilde\xi_k \mid U_4 \times U_4)\Big] \\
 &+ \frac{1}{\|i_k\|}\Big[(o_k \mid U_3) \times i_k + i_k \times (o_k \mid U_3)\Big] \\
 &+ \frac{1}{\|\eta_k^4\|}\Big[(\zeta_k \times \eta_k^4) + (\eta_k^4 \times \zeta_k)\Big]
\end{aligned}
\tag{10.20}
\]

where ηk⁴ is the projection (on either coordinate) of ξ̃k | U4 × U4 and

\[ \zeta_k = \Big(1 - \frac{\|o_k \mid U_3\|}{\|i_k\|}\Big)\,i_k + (o_k \mid U_3^c). \]

Lemma 10.11 Assume that k ≥ 1 is large enough. Then ξ̄k is a symmetric self-coupling of ηk | U1. Moreover, Eδ(ξ̄k) ≤ Eδ(ξ̃k | U1 × U1) + C, where C is a positive constant independent of k.

Proof It is clear that the three terms on the right-hand side of (10.20) are symmetric. Moreover, ‖ok | U3‖ ≤ ‖ok‖ ≤ ‖ik‖. This ensures that ζk is a positive measure. Now it is clear that the last two terms in (10.20) are positive. To see that the first one is also positive, it suffices to check that ‖ζk‖ < ‖ηk⁴‖. That is easily done as follows. We have ηk(U5) ≥ (9/10)ηk(U1) and that implies ξk(U5 × U5) ≥ (8/10)ηk(U1). Since β^{−1}(U4) ⊃ U5 for every β ∈ supp Πk, the definition (10.17) gives that ξ̃k(U4 × U4) ≥ ξk(U5 × U5). Thus,

\[ \|\eta_k^4\| = \tilde\xi_k(U_4 \times U_4) \ge \frac{8}{10}\,\eta_k(U_1) \ge \frac{8}{10}\,\eta(\{x\}). \]

Furthermore, assuming k is sufficiently large,

\[ \|\zeta_k\| \le \|i_k\| + \|o_k\| \le 2\|i_k\| \le \frac{3}{10}\,\eta(\{x\}). \]

In particular, ‖ζk‖ < ‖ηk⁴‖ as we wanted to prove. Next, observe that the projection of ξ̄k on either coordinate is equal to

\[ \Big[\eta_k^1 - \frac{\|\zeta_k\|}{\|\eta_k^4\|}\,\eta_k^4\Big] + \Big[(o_k \mid U_3) + \frac{\|o_k \mid U_3\|}{\|i_k\|}\,i_k\Big] + \Big[\zeta_k + \frac{\|\zeta_k\|}{\|\eta_k^4\|}\,\eta_k^4\Big] \]

and that adds up to ηk¹ + ik + ok = ηk | U1. So, ξ̄k is a self-coupling of ηk | U1 as claimed. The final step is to estimate the δ-energy of ξ̄k. It is important to note that the two correcting terms

\[ \frac{1}{\|i_k\|}\Big[(o_k \mid U_3) \times i_k + i_k \times (o_k \mid U_3)\Big] \quad\text{and}\quad \frac{1}{\|\eta_k^4\|}\Big[(\zeta_k \times \eta_k^4) + (\eta_k^4 \times \zeta_k)\Big] \]

are supported away from the diagonal of P × P. Indeed, ok | U3 is concentrated on U3 while ik is concentrated on U2^c, as we have seen. Similarly, ζk is concentrated on U3^c while ηk⁴ is concentrated on U4. See Figure 10.3. Moreover, their masses are uniformly bounded (by 1, say). Thus, the δ-energy of the sum of these two terms is bounded by some constant C > 0, for every large k. As for the first term in (10.20), it is clear that its δ-energy is bounded by Eδ(ξ̃k | U1 × U1). Thus, Eδ(ξ̄k) ≤ Eδ(ξ̃k | U1 × U1) + C, as claimed.

Figure 10.3 Coupling terms far from the diagonal

Combining (10.18) with Lemma 10.11, we get

\[ e_\delta(\eta_k \mid U_1) \le E_\delta(\bar\xi_k) \le E_\delta(\tilde\xi_k \mid U_1 \times U_1) + C \le E_\delta(\tilde\xi_k) + C \le (1 - c\delta)\,E_\delta(\xi_k) + C \]

as claimed in Proposition 10.9.

10.6 Final comments

We saw in Chapter 9 that discontinuity of the Lyapunov exponents is common among cocycles with low regularity: continuous or Hölder-continuous cocycles with small Hölder constant. On the other hand, Theorems 1.3 and 10.1 show that, at least in dimension 2, for locally constant cocycles (cocycles with “infinite Hölder constant”) one always has continuity. In what follows, we present a possible global scenario for 2-dimensional cocycles over “chaotic” systems, encompassing these two contrasting conclusions.

Let X = {1, . . . , m} and f : M → M be the shift map on M = X^Z. Fix θ ∈ (0, 1) and endow M with the metric dθ(x, y) = θ^{N(x,y)}, where N(x, y) = max{N ≥ 0 : xn = yn for all |n| < N}. Let A : M → GL(2) be r-Hölder continuous for this metric. In other words, there exists C1 > 0 such that

\[ \|A(x) - A(y)\| \le C_1\,\theta^{Nr} \quad\text{for any } x, y \text{ with } x_n = y_n \text{ for all } |n| < N. \tag{10.21} \]
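The metric dθ is easy to experiment with. The following sketch assumes that a bi-infinite sequence is represented by a function from integers to symbols and that the search for the first disagreement is truncated at a hypothetical `cutoff`; none of these names comes from the text.

```python
def agreement_radius(x, y, cutoff=1000):
    """N(x, y) = max{N >= 0 : x_n = y_n for all |n| < N}, searched up to `cutoff`.

    If no disagreement is found below the cutoff, the cutoff itself is returned,
    so the result is only a lower bound for N(x, y) in that case.
    """
    for k in range(cutoff):
        if x(k) != y(k) or x(-k) != y(-k):
            return k
    return cutoff


def d_theta(x, y, theta=0.5, cutoff=1000):
    """d_theta(x, y) = theta ** N(x, y)."""
    return theta ** agreement_radius(x, y, cutoff)


def shift(x):
    """The shift map f: (f(x))_n = x_{n+1}."""
    return lambda n: x(n + 1)


# Two sequences over the alphabet {1, 2} that differ only at position 3.
x = lambda n: 1
y = lambda n: 2 if n == 3 else 1

print(agreement_radius(x, y), d_theta(x, y))
print(agreement_radius(shift(x), shift(y)), d_theta(shift(x), shift(y)))
```

In this example the shift moves the disagreement from position 3 to position 2, so the distance grows by the factor 1/θ; this is the kind of expansion rate that the Hölder condition (10.21) is measured against.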

The cocycle F defined by A over f is fiber-bunched if there exist C2 > 0 and λ < 1 such that

\[ \|A^n(x)\|\,\|A^n(x)^{-1}\|\,\theta^{nr} \le C_2\,\lambda^n \quad\text{for every } x \in M \text{ and } n \ge 1. \tag{10.22} \]

This notion was introduced in [37], under a different name.

Conjecture 10.12 Lyapunov exponents vary continuously restricted to the subset of fiber-bunched elements A : M → GL(2) of the space of r-Hölder-continuous cocycles.

Observe that in Section 9.3 we had ‖A‖ = ‖A^{−1}‖ = σ and θ = 1/2, and so the cocycle is fiber-bunched if and only if σ² < 2^r. Thus, the hypothesis σ > 2^{2r} in Theorem 9.22 is incompatible with fiber-bunching. It is clear from the definition that fiber-bunched cocycles form an open subset of the space of r-Hölder-continuous cocycles for any r > 0.

An important feature of fiber-bunched cocycles is that they admit invariant stable and unstable holonomies. Let us explain this. The local stable set W^s_loc(x) of a point x ∈ M is the set of all y ∈ M that have the same positive part; that is, such that π+(x) = π+(y). The local unstable set is defined analogously, requiring π−(x) = π−(y) instead. The stable and unstable sets are defined by

\[ W^s(x) = \bigcup_{n=0}^{\infty} f^{-n}\big(W^s_{loc}(f^n(x))\big) \quad\text{and}\quad W^u(x) = \bigcup_{n=0}^{\infty} f^{n}\big(W^u_{loc}(f^{-n}(x))\big), \]

respectively. An s-holonomy for the cocycle associated with A is a family of linear maps h^s_{x,y} : R^d → R^d, defined for every x and y in the same stable set, such that

(1) h^s_{y,z} ◦ h^s_{x,y} = h^s_{x,z} and h^s_{x,x} = id for every x, y, z in the same stable set;
(2) A(y) ◦ h^s_{x,y} = h^s_{f(x),f(y)} ◦ A(x) for every x, y in the same stable set;
(3) (x, y, v) → h^s_{x,y}(v) is continuous, when x and y vary in the set of points in the same local stable set;
(4) there are C > 0 and t > 0 such that h^s_{x,y} is (C, t)-Hölder for every x and y in the same local stable set.

Analogously, one defines u-holonomy for the cocycle. If A is fiber-bunched then

\[ h^s_{x,y} = \lim_n A^n(y)^{-1}A^n(x) \quad\text{and}\quad h^u_{x,y} = \lim_n A^{-n}(y)^{-1}A^{-n}(x) \]

are well defined and they yield an s-holonomy and a u-holonomy for the cocycle. Fiber-bunching is not necessary for the existence of invariant holonomies, of course. For instance, for locally constant cocycles one may always take

\[ h^s_{x,y} = \mathrm{id} \ \text{ for all } y \in W^s_{loc}(x) \quad\text{and}\quad h^u_{x,y} = \mathrm{id} \ \text{ for all } y \in W^u_{loc}(x). \]

Thus, the following statement contains both Theorem 1.3 and Conjecture 10.12:

Conjecture 10.13 Let H be a family of r-Hölder-continuous A : M → GL(2) such that stable and unstable holonomies, h^s_{A,x,y} and h^u_{A,x,y}, exist for all A ∈ H and vary continuously on H. Then the Lyapunov exponents vary continuously on H.

Here, the continuity condition means that (x, y, v, A) → h^s_{A,x,y}(v) is continuous when x and y vary in the set of points in the same local stable set, and (x, y, v, A) → h^u_{A,x,y}(v) is continuous when x and y vary in the set of points in the same local unstable set. Moreover, the Hölder constants C and t in condition (4) are assumed to be locally uniform in A.

Hyperbolic cocycles need not be fiber-bunched, and neither do the cocycles with λ− = λ+. Thus, Corollary 9.3 and Lemma 9.4 imply that the converse to Conjecture 10.12 is not true in general. Instead, we state:

Conjecture 10.14 Consider any r-Hölder-continuous function A : M → GL(2) such that λ−(A, μ) < λ+(A, μ) and the cocycle defined by A over f is not hyperbolic. If A is not fiber-bunched then there exists a sequence (An)n converging to A in the space of r-Hölder-continuous functions such that limn λ+(An, μ) < λ+(A, μ).

Young [118] constructed a C¹-open set of SL(2)-cocycles over an expanding circle map which are neither hyperbolic nor fiber-bunched and whose Lyapunov exponents are uniformly bounded away from zero. The estimates are not sufficient to decide whether the Lyapunov exponents vary continuously in this set. Still, if such an open set can be constructed for Hölder-continuous cocycles, this could be a good test case for Conjecture 10.14.


10.7 Notes

Theorem 10.1 was proven by Bocker and Viana [35], based on certain estimates of the random walk induced by the cocycle on PR², which they obtained through a suitable discretization of the projective space. Avila, Eskin and Viana [12] use a very different analysis of the random walk to extend the theorem to arbitrary dimension d ≥ 2. The presentation in Sections 10.1 through 10.5 is based on the latter.

The dependence of Lyapunov exponents on the linear cocycle or the base dynamics has been studied by many authors. Ruelle [104] proved real-analytic dependence of the largest exponent on the cocycle, for linear cocycles admitting an invariant convex cone field. Shortly afterwards, Furstenberg and Kifer [57, 70] and Hennion [61] studied the dependence of the largest exponent of random matrices on the probability distribution, proving continuity with respect to the weak∗ topology in the almost irreducible case; that is, when there is at most one invariant subspace. Kifer [70] observed that Lyapunov exponents may jump when the probability vector degenerates (Exercise 1.8) and Johnson [67] also found examples of discontinuous dependence of the exponent on the energy E, for Schrödinger cocycles over quasi-periodic flows.

For random matrices satisfying strong irreducibility and the contraction property, Le Page [94, 95] proved local Hölder continuity and even smooth dependence of the largest exponent on the cocycle; the assumptions ensure that the largest exponent is simple (multiplicity 1), by work of Guivarc’h and Raugi [60] and Gol’dsheid and Margulis [58]. Le Page’s result cannot be improved: a construction of Halperin (see Simon and Taylor [108]) shows that for every α > 0 one can find random Schrödinger cocycles near which the exponents fail to be α-Hölder continuous.

For random matrices with finitely many values and, more generally, for locally constant cocycles over Markov shifts, Peres [98] showed that simple exponents are locally real-analytic functions of the transition data. Recently, Bourgain and Jitomirskaya [40, 41] proved continuous dependence of the exponents on the energy E, for quasi-periodic Schrödinger cocycles.

10.8 Exercises

Exercise 10.1 Prove that if (pk)k converges to p in the space G(2) then (λ+(pk) + λ−(pk))k converges to λ+(p) + λ−(p). Conclude that λ+(pk) → λ+(p) if and only if λ−(pk) → λ−(p).

Exercise 10.2 Prove the following properties of the convolution operations:

(1) ∗ : G(d) × P(d) → P(d) defined in (10.3) is continuous and it satisfies (p1 ∗ p2) ∗ η = p1 ∗ (p2 ∗ η).
(2) ∗ : G(d) × G(d) → G(d) defined in (10.4) is commutative, associative and continuous. Is there a unit element?

Exercise 10.3 Verify the equality (10.5) and prove that

\[ \frac{\|\dot v\|}{\|\alpha\|\,\|\alpha^{-1}\|} \le \|D\alpha([v])\dot v\| \le \|\alpha\|\,\|\alpha^{-1}\|\,\|\dot v\| \]

for every [v] ∈ PR^d and v̇ ∈ {v}⊥.

Exercise 10.4 Prove that if (pk)k converges to p in G(d) then (supp pk)k converges to supp p relative to the Hausdorff distance, defined in the space of compact subsets of GL(d) by

\[ d_H(K_1, K_2) = \inf\{r > 0 : K_1 \subset B_r(K_2) \text{ and } K_2 \subset B_r(K_1)\}. \]

Exercise 10.5 Prove the following.

(1) The function ξ → Eδ(ξ) is lower semi-continuous with respect to the weak∗ topology in the space of measures with a given mass on N × N.
(2) The function η → eδ(η) is lower semi-continuous with respect to the weak∗ topology in the space of measures with a given mass on N.

Exercise 10.6 Let η be a probability measure on a compact metric space P. Check that the function U → eδ(η | U) may not be monotone.

Exercise 10.7 Let η be a Borel measure on some compact metric space P such that η({x}) < ‖η‖/2 for every x ∈ P. Show that there exists τ > 0 such that η(B(z, τ)) < η(B(z, τ)^c) for every z ∈ P.

Exercise 10.8 Let η be a probability on [0, 1] of the form η = (δ0 + μ)/2, where μ is a probability on (0, 1]. Check that, depending on μ, the δ-energy of η may be finite or infinite.

Exercise 10.9 State and prove an analogue of Lemma 10.7 for couplings of any two measures η1 and η2 with the same mass.

Exercise 10.10 Check that the fiber-bunching property (10.22) does not depend on the choice of θ in the definition of the distance.


References

[1] P. Anderson. Absence of diffusion in certain random lattices. Phys. Rev., 109:1492–1505, 1958. [2] D. V. Anosov. Geodesic flows on closed Riemannian manifolds of negative curvature. Proc. Steklov Math. Inst., 90:1–235, 1967. [3] V. Ara´ujo and M. Bessa. Dominated splitting and zero volume for incompressible three-flows. Nonlinearity, 21:1637–1653, 2008. [4] A. Arbieto and J. Bochi. L p -generic cocycles have one-point Lyapunov spectrum. Stoch. Dyn., 3:73–81, 2003. [5] L. Arnold. Random Dynamical Systems. Springer-Verlag, 1998. [6] L. Arnold and N. D. Cong. On the simplicity of the Lyapunov spectrum of products of random matrices. Ergod. Theory Dynam. Sys., 17:1005–1025, 1997. [7] L. Arnold and V. Wishtutz (Eds.). Lyapunov Exponents (Bremen, 1984), volume 1186 of Lecture Notes in Math. Springer, 1986. [8] A. Avila. Density of positive Lyapunov exponents for quasiperiodic SL(2, R)cocycles in arbitrary dimension. J. Mod. Dyn., 3:631–636, 2009. [9] A. Avila and J. Bochi. Lyapunov exponents: parts I & II. Notes of mini-course given at the School on Dynamical Systems, ICTP, 2008. [10] A. Avila and J. Bochi. Proof of the subadditive ergodic theorem. Preprint www.mat.puc-rio.br/∼jairo/. [11] A. Avila and J. Bochi. A formula with some applications to the theory of Lyapunov exponents. Israel J. Math., 131:125–137, 2002. [12] A. Avila, A .Eskin and M. Viana. Continuity of Lyapunov exponents of random matrix products. In preparation. [13] A. Avila, J. Santamaria, and M. Viana. Holonomy invariance: rough regularity and Lyapunov exponents. Ast´erisque, 358:13–74, 2013. [14] A. Avila and M. Viana. Simplicity of Lyapunov spectra: a sufficient criterion. Port. Math., 64:311–376, 2007. [15] A. Avila and M. Viana. Simplicity of Lyapunov spectra: proof of the ZorichKontsevich conjecture. Acta Math., 198:1–56, 2007. [16] A. Avila and M. Viana. Extremal Lyapunov exponents: an invariance principle and applications. Inventiones Math., 181:115–178, 2010. [17] A. Avila, M. Viana, and A. 
Wilkinson. Absolute continuity, Lyapunov exponents and rigidity I: geodesic flows. Preprint www.preprint.impa.br 2011.

191

Downloaded from University Publishing Online. This is copyrighted material http://dx.doi.org/10.1017/CBO9781139976602.012

192

References

[18] L. Barreira and Ya. Pesin. Nonuniform Hyperbolicity: Dynamics of Systems with Nonzero Lyapunov Exponents. Cambridge University Press, 2007. [19] L. Barreira, Ya. Pesin and J. Schmeling. Dimension and product structure of hyperbolic measures. Ann. of Math., 149:755–783, 1999. [20] M. Bessa. Dynamics of generic 2-dimensional linear differential systems. J. Differential Equations, 228:685–706, 2006. [21] M. Bessa. The Lyapunov exponents of generic zero divergence threedimensional vector fields. Ergodic Theory Dynam. Systems, 27:1445–1472, 2007. [22] M. Bessa. Dynamics of generic multidimensional linear differential systems. Adv. Nonlinear Stud., 8:191–211, 2008. [23] M. Bessa and J. L. Dias. Generic dynamics of 4-dimensional C2 Hamiltonian systems. Comm. Math. Phys., 281:597–619, 2008. [24] M. Bessa and J. Rocha. Contributions to the geometric and ergodic theory of conservative flows. Ergodic Theory Dynam. Sys., 33:1709–1731, 2013. [25] M. Bessa and H. Vilarinho. Fine properties of lp-cocycles. J. Differential Equations, 256:2337–2367, 2014. [26] G. Birkhoff. Lattice Theory, volume 25. A.M.S. Colloq. Publ., 1967. [27] J. Bochi. Discontinuity of the Lyapunov exponents for non-hyperbolic cocycles. Preprint www.mat.puc-rio.br/∼jairo/. [28] J. Bochi. The multiplicative ergodic theorem of Oseledets. Preprint www.mat.puc-rio.br/∼jairo/. [29] J. Bochi. Proof of the Oseledets theorem in dimension 2 via hyperbolic geometry. Preprint www.mat.puc-rio.br/∼jairo/. [30] J. Bochi. Genericity of zero Lyapunov exponents. Ergod. Theory Dynam. Sys., 22:1667–1696, 2002. [31] J. Bochi. C1 -generic symplectic diffeomorphisms: partial hyperbolicity and zero centre Lyapunov exponents. J. Inst. Math. Jussieu, 8:49–93, 2009. [32] J. Bochi and M. Viana. Pisa lectures on Lyapunov exponents. In Dynamical Systems. Part II, Pubbl. Cent. Ric. Mat. Ennio Giorgi, pages 23–47. Scuola Norm. Sup., 2003. [33] J. Bochi and M. Viana. Lyapunov exponents: how frequently are dynamical systems hyperbolic? 
In Modern Dynamical Systems and Applications, pages 271–297. Cambridge University Press, 2004. [34] J. Bochi and M. Viana. The Lyapunov exponents of generic volume-preserving and symplectic maps. Ann. of Math., 161:1423–1485, 2005. [35] C. Bocker and M. Viana. Continuity of Lyapunov exponents for 2D random matrices. Preprint www.impa.br/∼viana/out/bernoulli.pdf. [36] C. Bonatti, L. J. D´ıaz, and M. Viana. Dynamics Beyond Uniform Hyperbolicity, volume 102 of Encyclopaedia of Mathematical Sciences. Springer-Verlag, 2005. [37] C. Bonatti, X. G´omez-Mont, and M. Viana. G´en´ericit´e d’exposants de Lyapunov non-nuls pour des produits d´eterministes de matrices. Ann. Inst. H. Poincar´e Anal. Non Lin´eaire, 20:579–624, 2003. [38] C. Bonatti and M. Viana. SRB measures for partially hyperbolic systems whose central direction is mostly contracting. Israel J. Math., 115:157–193, 2000. [39] N. Bourbaki. Algebra. I. Chapters 1–3. Elements of Mathematics (Berlin). Springer-Verlag, 1989. Translated from the French, Reprint of the 1974 edition.

Downloaded from University Publishing Online. This is copyrighted material http://dx.doi.org/10.1017/CBO9781139976602.012

References

193

[40] J. Bourgain. Positivity and continuity of the Lyapounov exponent for shifts on Td with arbitrary frequency vector and real analytic potential. J. Anal. Math., 96:313–355, 2005. [41] J. Bourgain and S. Jitomirskaya. Continuity of the Lyapunov exponent for quasiperiodic operators with analytic potential. J. Statist. Phys., 108:1203–1218, 2002. [42] M. Cambrainha. Generic symplectic cocycles are hyperbolic. PhD thesis, IMPA, 2013. [43] C. Castaing and M. Valadier. Convex Analysis and Measurable Multifunctions. Lecture Notes in Mathematics, Vol. 580. Springer-Verlag, 1977. [44] J. E. Cohen, H. Kesten, and C. M. Newman (Eds.). Random Matrices and their Applications (Brunswick, Maine, 1984), volume 50 of Contemp. Math. Amer. Math. Soc., 1986. [45] J. Conway. Functions of One Complex Variable. II, volume 159 of Graduate Texts in Mathematics. Springer-Verlag, 1995. [46] A. Crisanti, G.Paladin, and A. Vulpiani. Products of Random Matrices in Statistical Physics, volume 104 of Springer Series in Solid-State Sciences. SpringerVerlag, 1993. With a foreword by Giorgio Parisi. [47] D. Damanik. Schr¨odinger operators with dynamically defined potentials. In preparation. [48] David Damanik. Lyapunov exponents and spectral analysis of ergodic Schr¨odinger operators: a survey of Kotani theory and its applications. In Spectral Theory and Mathematical Physics: a Festschrift in Honor of Barry Simon’s 60th birthday, volume 76 of Proc. Sympos. Pure Math., pages 539–563. Amer. Math. Soc., 2007. [49] L. J. D´ıaz and R. Ures. Persistent homoclinic tangencies at the unfolding of cycles. Ann. Inst. H. Poincar´e, Anal. Non-lin´eaire, 11:643–659, 1996. [50] A. Douady and J. C. Earle. Conformally natural extension of homeomorphisms of the circle. Acta Math., 157:23–48, 1986. ´ [51] R. Dudley, H. Kunita, F. Ledrappier, and P. Hennequin (Eds.). Ecole d’´et´e de probabilit´es de Saint-Flour, XII—1982, volume 1097 of Lecture Notes in Math. Springer, 1984. [52] J. Franks. Anosov diffeomorphisms. 
In Global Analysis (Proc. Sympos. Pure Math., Vol. XIV, Berkeley, Calif., 1968), pages 61–93. Amer. Math. Soc., 1970. [53] N. Friedman. Introduction to Ergodic Theory. Van Nostrand, 1969. [54] H. Furstenberg. Non-commuting random products. Trans. Amer. Math. Soc., 108:377–428, 1963. [55] H. Furstenberg. Boundary theory and stochastic processes on homogeneous spaces. In Harmonic Analysis in Homogeneous Spaces, volume XXVI of Proc. Sympos. Pure Math. (Williamstown MA, 1972), pages 193–229. Amer. Math. Soc., 1973. [56] H. Furstenberg and H. Kesten. Products of random matrices. Ann. Math. Statist., 31:457–469, 1960. [57] H. Furstenberg and Yu. Kifer. Random matrix products and measures in projective spaces. Israel J. Math, 10:12–32, 1983. [58] I. Ya. Gol’dsheid and G. A. Margulis. Lyapunov indices of a product of random matrices. Uspekhi Mat. Nauk., 44:13–60, 1989.

Downloaded from University Publishing Online. This is copyrighted material http://dx.doi.org/10.1017/CBO9781139976602.012

194

References

[59] Y. Guivarc’h. Marches al´eatories a` pas markovien. Comptes Rendus Acad. Sci. Paris, 289:211–213, 1979. [60] Y. Guivarc’h and A. Raugi. Products of random matrices: convergence theorems. Contemp. Math., 50:31–54, 1986. [61] H. Hennion. Loi des grands nombres et perturbations pour des produits r´eductibles de matrices al´eatoires ind´ependantes. Z. Wahrsch. Verw. Gebiete, 67:265–278, 1984. [62] M. Herman. Une m´ethode nouvelle pour minorer les exposants de Lyapunov et quelques exemples montrant le caract`ere local d’un th´eor`eme d’Arnold et de Moser sur le tore de dimension 2. Comment. Math. Helvetici, 58:453–502, 1983. [63] A. Herrera. Simplicity of the Lyapunov spectrum for multidimensional continued fraction algorithms. PhD thesis, IMPA, 2009. [64] F. Rodriguez Hertz, M. A. Rodriguez Hertz, A. Tahzibi, and R. Ures. Maximizing measures for partially hyperbolic systems with compact center leaves. Ergodic Theory Dynam. Sys., 32:825–839, 2012. [65] E. Hopf. Statistik der geod¨atischen Linien in Mannigfaltigkeiten negativer Kr¨ummung. Ber. Verh. S¨achs. Akad. Wiss. Leipzig, 91:261–304, 1939. [66] S. Jitomirskaya and C. Marx. Dynamics and spectral theory of quasi-periodic Schr¨odinger type operators. In preparation. [67] R. Johnson. Lyapounov numbers for the almost periodic Schr¨odinger equation. Illinois J. Math., 28:397–419, 1984. [68] A. Katok. Lyapunov exponents, entropy and periodic points of diffeomorphisms. Publ. Math. IHES, 51:137–173, 1980. [69] Y. Katznelson and B. Weiss. A simple proof of some ergodic theorems. Israel J. Math., 42:291–296, 1982. [70] Yu. Kifer. Perturbations of random matrix products. Z. Wahrsch. Verw. Gebiete, 61:83–95, 1982. [71] Yu. Kifer. Ergodic Theory of Random Perturbations. Birkh¨auser, 1986. [72] Yu. Kifer. General random perturbations of hyperbolic and expanding transformations. J. Analyse Math., 47:11–150, 1986. [73] Yu. Kifer and E. Slud. Perturbations of random matrix products in a reducible case. Ergodic Theory Dynam. 
Sys., 2:367–382 (1983), 1982. [74] J. Kingman. The ergodic theorem of subadditive stochastic processes. J. Royal Statist. Soc., 30:499–510, 1968. [75] O. Knill. The upper Lyapunov exponent of SL(2, R) cocycles: discontinuity and the problem of positivity. In Lyapunov Exponents (Oberwolfach, 1990), volume 1486 of Lecture Notes in Math., pages 86–97. Springer-Verlag, 1991. [76] O. Knill. Positive Lyapunov exponents for a dense set of bounded measurable SL(2, R)-cocycles. Ergod. Theory Dynam. Sys., 12:319–331, 1992. [77] S. Kotani. Lyapunov indices determine absolutely continuous spectra of stationary random one-dimensional Schr¨odinger operators. In Stochastic Analysis, pages 225–248. North Holland, 1984. [78] H. Kunz and Bernard B. Souillard. Sur le spectre des op´erateurs aux diff´erences finies al´eatoires. Comm. Math. Phys., 78:201–246, 1980/81. [79] F. Ledrappier. Propri´et´es ergodiques des mesures de Sina¨ı Publ. Math. I.H.E.S., 59:163–188, 1984.

Downloaded from University Publishing Online. This is copyrighted material http://dx.doi.org/10.1017/CBO9781139976602.012

References

195

[80] F. Ledrappier. Quelques propri´et´es des exposants caract´eristiques. volume 1097 of Lect. Notes in Math., pages 305–396, Springer-Verlag, 1984. [81] F. Ledrappier. Positivity of the exponent for stationary sequences of matrices. In Lyapunov Exponents (Bremen, 1984), volume 1186 of Lect. Notes in Math., pages 56–73. Springer-Verlag, 1986. [82] F. Ledrappier and G. Royer. Croissance exponentielle de certain produits al´eatorires de matrices. Comptes Rendus Acad. Sci. Paris, 290:513–514, 1980. [83] F. Ledrappier and L.-S. Young The metric entropy of diffeomorphisms. I. Characterization of measures satisfying Pesin’s entropy formula. Ann. of Math., 122:509–539, 1985. [84] F. Ledrappier and L.-S. Young The metric entropy of diffeomorphisms. II. Relations between entropy, exponents and dimension. Ann. of Math., 122:540–574, 1985. [85] A. M. Lyapunov. The General Problem of the Stability of Motion. Taylor & Francis Ltd., 1992. Translated from Edouard Davaux’s French translation (1907) of the 1892 Russian original and edited by A.T. Fuller, with an introduction and preface by Fuller, a biography of Lyapunov by V.I. Smirnov, and a bibliography of Lyapunov’s works compiled by J.F. Barrett, Lyapunov centenary issue, Reprint of Internat. J. Control 55 (1992), no. 3 [MR1154209 (93e:01035)], With a foreword by Ian Stewart. [86] R. Ma˜ne´ . A proof of Pesin’s formula. Ergod. Theory Dynam. Sys., 1:95–101, 1981. [87] R. Ma˜ne´ . Lyapunov exponents and stable manifolds for compact transformations. In Geometric Dynamics, volume 1007 of Lect. Notes in Math., pages 522–577. Springer-Verlag, 1982. [88] R. Ma˜ne´ . Oseledec’s theorem from the generic viewpoint. In Procs. International Congress of Mathematicians, Vol. 1, 2 (Warsaw, 1983), pages 1269–1276, Warsaw, 1984. PWN Publ. [89] R. Ma˜ne´ . Ergodic Theory and Differentiable Dynamics. Springer-Verlag, 1987. [90] A. Manning. There are no new Anosov diffeomorphisms on tori. Amer. J. Math., 96:422–429, 1974. [91] S. Newhouse. 
On codimension one Anosov diffeomorphisms. Amer. J. Math., 92:761–770, 1970. [92] V. I. Oseledets. A multiplicative ergodic theorem: Lyapunov characteristic numbers for dynamical systems. Trans. Moscow Math. Soc., 19:197–231, 1968. [93] J. C. Oxtoby and S. M. Ulam. Measure-preserving homeomorphisms and metrical transitivity. Ann. of Math., 42:874–920, 1941. ´ Le Page. Th´eor`emes limites pour les produits de matrices al´eatoires. In [94] E. Probability Measures on Groups (Oberwolfach, 1981), volume 928 of Lecture Notes in Math., pages 258–303. Springer-Verlag, 1982. ´ Le Page. R´egularit´e du plus grand exposant caract´eristique des produits de [95] E. matrices al´eatoires ind´ependantes et applications. Ann. Inst. H. Poincar´e Probab. Statist., 25:109–142, 1989. [96] J. Palis and S. Smale. Structural stability theorems. In Global Analysis, volume XIV of Proc. Sympos. Pure Math. (Berkeley 1968), pages 223–232. Amer. Math. Soc., 1970.

Downloaded from University Publishing Online. This is copyrighted material http://dx.doi.org/10.1017/CBO9781139976602.012

196

References

[97] L. Pastur. Spectral properties of disordered systems in the one-body approximation. Comm. Math. Phys., 75:179–196, 1980. [98] Y. Peres. Analytic dependence of Lyapunov exponents on transition probabilities. In Lyapunov Exponents (Oberwolfach, 1990), volume 1486 of Lecture Notes in Math., pages 64–80. Springer-Verlag, 1991. [99] Ya. B. Pesin. Families of invariant manifolds corresponding to non-zero characteristic exponents. Math. USSR. Izv., 10:1261–1302, 1976. [100] Ya. B. Pesin. Characteristic Lyapunov exponents and smooth ergodic theory. Russian Math. Surveys, 324:55–114, 1977. [101] M. S. Raghunathan. A proof of Oseledec’s multiplicative ergodic theorem. Israel J. Math., 32:356–362, 1979. [102] V. A. Rokhlin. On the fundamental ideas of measure theory. A. M. S. Transl., 10:1–54, 1962. Transl. from Mat. Sbornik 25 (1949), 107–150. First published by the A. M. S. in 1952 as Translation Number 71. [103] G. Royer. Croissance exponentielle de produits markoviens de matrices. Ann. Inst. H. Poincar´e, 16:49–62, 1980. [104] D. Ruelle. Analyticity properties of the characteristic exponents of random matrix products. Adv. in Math., 32:68–80, 1979. [105] D. Ruelle. Ergodic theory of differentiable dynamical systems. Inst. Hautes ´ Etudes Sci. Publ. Math., 50:27–58, 1979. [106] D. Ruelle. Characteristic exponents and invariant manifolds in Hilbert space. Annals of Math., 115:243–290, 1982. [107] M. Shub. Global Stability of Dynamical Systems. Springer-Verlag, 1987. [108] B. Simon and M. Taylor. Harmonic analysis on SL(2, R) and smoothness of the density of states in the one-dimensional Anderson model. Comm. Math. Phys., 101:1–19, 1985. [109] S. Smale. Differentiable dynamical systems. Bull. Am. Math. Soc., 73:747–817, 1967. [110] A. Tahzibi. Stably ergodic diffeomorphisms which are not partially hyperbolic. Israel J. Math., 142:315–344, 2004. [111] Ph. Thieullen. Ergodic reduction of random products of two-by-two matrices. J. Anal. Math., 73:19–64, 1997. [112] M. 
Viana. Lyapunov exponents and strange attractors. In J.-P. Franc¸oise, G. L. Naber, and S. T. Tsou, editors, Encyclopedia of Mathematical Physics. Elsevier, 2006. [113] M. Viana. Almost all cocycles over any hyperbolic system have nonvanishing Lyapunov exponents. Ann. of Math., 167:643–680, 2008. [114] M. Viana and K. Oliveira. Fundamentos da Teoria Erg´odica. Colec¸a˜ o Fronteiras da Matem´atica. Sociedade Brasileira de Matem´atica, 2014. [115] M. Viana and J. Yang. Physical measures and absolute continuity for onedimensional center directions. Ann. Inst. H. Poincar´e Anal. Non Lin´eaire, 30:845–877, 2013. [116] A. Virtser. On products of random matrices and operators. Th. Prob. Appl., 34:367–377, 1979. [117] P. Walters. A dynamical proof of the multiplicative ergodic theorem. Trans. Amer. Math. Soc., 335:245–257, 1993.

Downloaded from University Publishing Online. This is copyrighted material http://dx.doi.org/10.1017/CBO9781139976602.012

References

197

[118] L.-S. Young. Some open sets of nonuniformly hyperbolic cocycles. Ergod. Theory Dynam. Sys., 13(2):409–415, 1993. [119] L.-S. Young. Ergodic theory of differentiable dynamical systems. In Real and Complex Dynamical Systems, volume NATO ASI Series, C-464, pages 293–336. Kluwer Acad. Publ., 1995. [120] A. C. Zaanen. Integration. North-Holland Publishing Co., 1967. Completely revised edition of An Introduction to the Theory of Integration.

Downloaded from University Publishing Online. This is copyrighted material http://dx.doi.org/10.1017/CBO9781139976602.012

Index

∗ convolution
B^t adjoint linear operator
C(V, a) cone of width a around V
C^0(P) space of continuous functions
Df derivative of a smooth map
E^s, E^u stable and unstable bundles
E_δ(ξ), e_δ(η) δ-energy of a measure
E_v, E^x slices of a set in a product space
G(d), G(k, d) Grassmannian
L^1(μ) space of integrable functions
R_θ rotation
V_x^⊥ orthogonal complement
W^s(x), W^u(x) (global) stable and unstable sets
W^s_loc(x), W^u_loc(x) local stable and unstable sets
Δ^m open simplex
Δ_r r-neighborhood of the diagonal
Λ^l E exterior l-power
Λ^l_d E decomposable l-vectors
GL(d), GL(d, C) linear group
PR^d, PC^d projective space, real and complex
SL(d), SL(d, C) special linear group
T^d d-dimensional torus
ℬ(Y) Borel σ-algebra
𝒢(d) space of compactly supported measures on GL(d)
𝒦(Y) space of compact subsets
ℳ(μ) space of measures projecting down to μ
𝒫(d) space of probability measures in PR^d
ecc(B) eccentricity of a linear map
ℓ^2 space of square integrable sequences
λ(x, v) Lyapunov exponent function
λ_+, λ_- extremal Lyapunov exponents
PF projective cocycle
proj_w u orthogonal projection
α generic element of GL(d)^N
ϕ^+, ϕ^- positive and negative parts of a function
p ∗ η, p_1 ∗ p_2 convolution
s_n(x) most contracted vector
u_n(x) most expanded vector


To Tania, Miguel and Anita, for their understanding.
