
CAMBRIDGE STUDIES IN ADVANCED MATHEMATICS 213

Editorial Board
J. Bertoin, B. Bollobás, W. Fulton, B. Kra, I. Moerdijk, C. Praeger, P. Sarnak, B. Simon, B. Totaro

HARMONIC FUNCTIONS AND RANDOM WALKS ON GROUPS Research in recent years has highlighted the deep connections between the algebraic, geometric, and analytic structures of a discrete group. New methods and ideas have resulted in an exciting field, with many opportunities for new researchers. This book is an introduction to the area from a modern vantage point. It incorporates the main basics, such as Kesten’s amenability criterion, the Coulhon and Saloff-Coste inequality, random walk entropy and bounded harmonic functions, the Choquet–Deny theorem, the Milnor–Wolf theorem, and a complete proof of Gromov’s theorem on polynomial growth groups. The book is especially appropriate for young researchers, and those new to the field, accessible even to graduate students. An abundance of examples, exercises, and solutions encourage self-reflection and the internalization of the concepts introduced. The author also points to open problems and possibilities for further research. Ariel Yadin is Professor in the Department of Mathematics at Ben-Gurion University of the Negev, Israel. His research is focused on the interplay between random walks and the geometry of groups. He has taught a variety of courses on the subject and has been part of a new wave of investigation into the structure of spaces of unbounded harmonic functions on groups.

Published online by Cambridge University Press

Published online by Cambridge University Press

CAMBRIDGE STUDIES IN ADVANCED MATHEMATICS

Editorial Board
J. Bertoin, B. Bollobás, W. Fulton, B. Kra, I. Moerdijk, C. Praeger, P. Sarnak, B. Simon, B. Totaro

All the titles listed below can be obtained from good booksellers or from Cambridge University Press. For a complete series listing, visit www.cambridge.org/mathematics.

Already Published
173 P. Garrett  Modern Analysis of Automorphic Forms by Example, I
174 P. Garrett  Modern Analysis of Automorphic Forms by Example, II
175 G. Navarro  Character Theory and the McKay Conjecture
176 P. Fleig, H. P. A. Gustafsson, A. Kleinschmidt & D. Persson  Eisenstein Series and Automorphic Representations
177 E. Peterson  Formal Geometry and Bordism Operators
178 A. Ogus  Lectures on Logarithmic Algebraic Geometry
179 N. Nikolski  Hardy Spaces
180 D.-C. Cisinski  Higher Categories and Homotopical Algebra
181 A. Agrachev, D. Barilari & U. Boscain  A Comprehensive Introduction to Sub-Riemannian Geometry
182 N. Nikolski  Toeplitz Matrices and Operators
183 A. Yekutieli  Derived Categories
184 C. Demeter  Fourier Restriction, Decoupling and Applications
185 D. Barnes & C. Roitzheim  Foundations of Stable Homotopy Theory
186 V. Vasyunin & A. Volberg  The Bellman Function Technique in Harmonic Analysis
187 M. Geck & G. Malle  The Character Theory of Finite Groups of Lie Type
188 B. Richter  Category Theory for Homotopy Theory
189 R. Willett & G. Yu  Higher Index Theory
190 A. Bobrowski  Generators of Markov Chains
191 D. Cao, S. Peng & S. Yan  Singularly Perturbed Methods for Nonlinear Elliptic Problems
192 E. Kowalski  An Introduction to Probabilistic Number Theory
193 V. Gorin  Lectures on Random Lozenge Tilings
194 E. Riehl & D. Verity  Elements of ∞-Category Theory
195 H. Krause  Homological Theory of Representations
196 F. Durand & D. Perrin  Dimension Groups and Dynamical Systems
197 A. Sheffer  Polynomial Methods and Incidence Theory
198 T. Dobson, A. Malnič & D. Marušič  Symmetry in Graphs
199 K. S. Kedlaya  p-adic Differential Equations
200 R. L. Frank, A. Laptev & T. Weidl  Schrödinger Operators: Eigenvalues and Lieb–Thirring Inequalities
201 J. van Neerven  Functional Analysis
202 A. Schmeding  An Introduction to Infinite-Dimensional Differential Geometry
203 F. Cabello Sánchez & J. M. F. Castillo  Homological Methods in Banach Space Theory
204 G. P. Paternain, M. Salo & G. Uhlmann  Geometric Inverse Problems
205 V. Platonov, A. Rapinchuk & I. Rapinchuk  Algebraic Groups and Number Theory, I (2nd Edition)
206 D. Huybrechts  The Geometry of Cubic Hypersurfaces
207 F. Maggi  Optimal Mass Transport on Euclidean Spaces
208 R. P. Stanley  Enumerative Combinatorics, II (2nd Edition)
209 M. Kawakita  Complex Algebraic Threefolds
210 D. Anderson & W. Fulton  Equivariant Cohomology in Algebraic Geometry
211 G. Pineda Villavicencio  Polytopes and Graphs
212 R. Pemantle, M. C. Wilson & S. Melczer  Analytic Combinatorics in Several Variables (2nd Edition)


Harmonic Functions and Random Walks on Groups

Ariel Yadin
Ben-Gurion University of the Negev


Shaftesbury Road, Cambridge CB2 8EA, United Kingdom One Liberty Plaza, 20th Floor, New York, NY 10006, USA 477 Williamstown Road, Port Melbourne, VIC 3207, Australia 314–321, 3rd Floor, Plot 3, Splendor Forum, Jasola District Centre, New Delhi – 110025, India 103 Penang Road, #05–06/07, Visioncrest Commercial, Singapore 238467 Cambridge University Press is part of Cambridge University Press & Assessment, a department of the University of Cambridge. We share the University’s mission to contribute to society through the pursuit of education, learning and research at the highest international levels of excellence. www.cambridge.org Information on this title: www.cambridge.org/9781009123181 DOI: 10.1017/9781009128391 © Ariel Yadin 2024 This publication is in copyright. Subject to statutory exception and to the provisions of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press & Assessment. When citing this work, please include a reference to the DOI 10.1017/9781009128391 First published 2024 A catalogue record for this publication is available from the British Library A Cataloging-in-Publication data record for this book is available from the Library of Congress ISBN 978-1-009-12318-1 Hardback Cambridge University Press & Assessment has no responsibility for the persistence or accuracy of URLs for external or third-party Internet websites referred to in this publication and does not guarantee that any content on such websites is, or will remain, accurate or appropriate.


Contents

Preface  page xi
Acknowledgments  xv
Notation  xvii

Part I  Tools and Theory  1

1  Background  3
1.1  Basic Notation  4
1.2  Spaces of Sequences  5
1.3  Group Actions  7
1.4  Discrete Group Convolutions  9
1.5  Basic Group Notions  10
1.6  Measures on Groups and Harmonic Functions  26
1.7  Bounded, Lipschitz, and Polynomial Growth Functions  31
1.8  Additional Exercises  33
1.9  Solutions to Exercises  34

2  Martingales  50
2.1  Conditional Expectation  51
2.2  Martingales: Definition and Examples  54
2.3  Optional Stopping Theorem  56
2.4  Applications of Optional Stopping  59
2.5  L^p Maximal Inequality  61
2.6  Martingale Convergence  63
2.7  Bounded Harmonic Functions  65
2.8  Solutions to Exercises  67

3  Markov Chains  75
3.1  Markov Chains  76
3.2  Irreducibility  78
3.3  Random Walks on Groups  81
3.4  Stopping Times  81
3.5  Excursion Decomposition  84
3.6  Recurrence and Transience  85
3.7  Positive Recurrence  88
3.8  Null Recurrence  98
3.9  Finite Index Subgroups  100
3.10  Solutions to Exercises  108

4  Networks and Discrete Analysis  117
4.1  Networks  118
4.2  Gradient and Divergence  119
4.3  Laplacian  121
4.4  Path Integrals  121
4.5  Voltage and Current  123
4.6  Effective Conductance  125
4.7  Thompson's Principle and Rayleigh Monotonicity  126
4.8  Green Function  128
4.9  Finite Energy Flows  132
4.10  Paths and Summable Intersection Tails  133
4.11  Capacity  139
4.12  Transience and Recurrence of Groups  141
4.13  Additional Exercises  144
4.14  Remarks  146
4.15  Solutions to Exercises  147

Part II  Results and Applications  159

5  Growth, Dimension, and Heat Kernel  161
5.1  Amenability  162
5.2  Spectral Radius  164
5.3  Isoperimetric Dimension  169
5.4  Nash Inequality  172
5.5  Operator Theory for the Heat Kernel  174
5.6  The Varopoulos–Carne Bound  180
5.7  Additional Exercises  184
5.8  Remarks  184
5.9  Solutions to Exercises  186

6  Bounded Harmonic Functions  196
6.1  The Tail and Invariant σ-Algebras  197
6.2  Parabolic and Harmonic Functions  198
6.3  Entropic Criterion  202
6.4  Triviality of Invariant and Tail σ-Algebras  207
6.5  An Entropy Inequality  209
6.6  Coupling and Liouville  213
6.7  Speed and Entropy  215
6.8  Amenability and Liouville  219
6.9  Lamplighter Groups  220
∗6.10  An Example: Infinite Permutation Group S∞  223
6.11  Additional Exercises  227
6.12  Remarks  229
6.13  Solutions to Exercises  231

7  Choquet–Deny Groups  246
7.1  The Choquet–Deny Theorem  247
7.2  Centralizers  249
7.3  ICC Groups  251
7.4  JNVN Groups  255
7.5  Choquet–Deny Groups Are Virtually Nilpotent  257
7.6  Additional Exercises  262
7.7  Remarks  264
7.8  Solutions to Exercises  265

8  The Milnor–Wolf Theorem  273
8.1  Growth  274
8.2  Growth of Nilpotent Groups  275
8.3  The Milnor Trick  280
8.4  Characteristic Subgroups  282
8.5  Z-extensions and Nilpotent Groups  284
8.6  Proof of the Milnor–Wolf Theorem  290
8.7  Additional Exercises  291
8.8  Remarks  293
8.9  Solutions to Exercises  293

9  Gromov's Theorem  301
9.1  A Reduction  302
9.2  Unitary Actions  303
9.3  Harmonic Cocycles  309
9.4  Diffusivity  317
9.5  Ozawa's Theorem  318
9.6  Proof of Gromov's Theorem  328
9.7  Classification of Recurrent Groups  330
9.8  Kleiner's Theorem  332
9.9  Additional Exercises  341
9.10  Remarks  341
9.11  Solutions to Exercises  343

Appendices  355

Appendix A  Hilbert Space Background  357
A.1  Inner Products and Hilbert Spaces  358
A.2  Normed Vector Spaces  360
A.3  Orthonormal Systems  361
A.4  Solutions to Exercises  363

Appendix B  Entropy  365
B.1  Shannon Entropy Axioms  366
B.2  A Different Perspective on Entropy  367

Appendix C  Coupling and Total Variation  370
C.1  Total Variation Distance  371
C.2  Couplings  371
C.3  Solutions to Exercises  373

References  374
Index  378


Preface




What This Book Is About

In recent years I have become more and more interested in the connections between the different possible mathematical structures on a set, given by algebra, by geometry, and by analysis. A group, by definition, has an obvious algebraic structure – its multiplication table. Finitely generated groups come equipped with a metric (inherited from the graph structure of the Cayley graph). Also, since Pólya, Feller, and Kesten it has been noted that the stochastic process known as a random walk constrains and is constrained by these structures. One may say that this book is about the interplay between the above structures.

Chapter 1 introduces the basic concepts we will work with throughout the book: random walks, harmonic functions, group properties, and basic examples. In Chapter 2 we review an important probabilistic object: the martingale. This chapter is largely based on the highly influential book Probability: Theory and Examples by Rick Durrett (2019). Chapter 3 introduces the fundamentals of Markov chains, with a focus on reversible chains. It is based on two main sources: Markov Chains by James Norris (1998) and Probability on Trees and Networks by Lyons and Peres (2016). In Chapter 4 we get into what I like to call discrete analysis. These are discrete counterparts to objects from classical analysis such as gradients, divergence, Laplacian, Green's function, and Dirichlet energy. Also, mathematical notions arising from models of electrical networks are introduced. This chapter is largely based on Lyons and Peres (2016), which provides a much more comprehensive study of electrical networks and random walks.

The above chapters, which form Part I of the book, provide us with the necessary tools to study harmonic functions and random walks on groups. Part II of the book deals with the basic questions regarding these objects and how to apply them to the study of the geometric and algebraic properties of the group.
We start with the relationship of isoperimetry and amenability with the random walk, and specifically heat kernel decay. This is done in Chapter 5. The main source for this chapter is Random Walks on Infinite Graphs and Groups by Woess (2000). In this chapter, Kesten’s thesis is presented, which relates the algebraic property of amenability to the analytic property of a spectral gap for the random walk Laplacian. In Chapter 6 we study the space of bounded harmonic functions. This is a well-studied topic, and has a huge body of literature expanding into many other areas of mathematics not mentioned here. This chapter only presents the tip of


the iceberg. The main result here is the well-known entropic criterion for the existence of nonconstant bounded harmonic functions (the Liouville property).

Still related to bounded harmonic functions, in Chapter 7 we discuss the Choquet–Deny phenomenon, sometimes known as the strong Liouville property, where a group never has any nonconstant bounded harmonic functions (with respect to any random walk). This chapter includes some very recent work by Frisch, Hartman, Tamuz, and Vahidi-Ferdowsi (2019), which shows that the (seemingly analytic) Choquet–Deny property is equivalent to a group being virtually nilpotent (an algebraic property).

The notion of growth is studied further in Chapter 8. The main result of the chapter is a dichotomy, originally due to results of Milnor and Wolf, for solvable groups. The algebra of these groups constrains their geometry, and the Milnor–Wolf theorem states that they can only have growth that is either polynomial or exponential, but not in between. The relationship between algebra and geometry is strengthened here. This chapter is based on material from Woess (2000) and Geometric Group Theory by Druţu and Kapovich (2018).

Finally, we culminate with Chapter 9, in which Gromov's theorem is proved. The theorem relates the geometric property of polynomial growth to the algebraic property of virtual nilpotence, telling us that these properties are actually equivalent. The proof presented for Gromov's theorem is elementary, and utilizes the tools previously developed. This proof is based on ideas from Shalom and Tao (2010) and from Tao's blog What's New, as well as the paper by Ozawa (2018) proving Gromov's theorem.

This book has been written from the perspective of someone trained in probability. Many times probabilistic proofs are used, even if other proofs are available. Also, an effort has been made to keep everything as elementary as possible, and facilitate reading for students and beginners.
This sometimes leads to repetition of ideas and more “hands-on” proofs. The chapters give only a taste of the material mentioned. More depth can be achieved by going to one of the relevant sources mentioned in this section. An advantage of this book is that it brings together topics that usually appear in separate texts in a unified and self-contained fashion. Another advantage is that it is elementary in the sense that it is suitable for newcomers. For example, I have avoided the use of more advanced or less known topics such as the Tits Alternative, Zariski topology, property (T), and ultraproducts. However, these are all extremely interesting and important notions: I do not wish anyone to be discouraged from reading about these in other sources!


How to Read This Book

This book grew out of lecture notes written for courses I have taught over the years about random walks, harmonic functions, and their relationship to the geometry of finitely generated groups. The main objective in writing the book is to enable a beginner to enter the field of random walks on groups. The book provides the main examples and basic results in the field. An effort has been made to keep everything elementary, in the sense that a good third-year undergraduate or first-year graduate student can read the book in a self-contained fashion.

The reader should be familiar with the basics of mathematics as taught in most undergraduate studies at universities around the world. Specifically, one requires linear algebra, the basics of probability and measure theory, as well as a very basic introduction to abstract group theory. A beginner's familiarity with Hilbert spaces, Shannon entropy, and couplings is an advantage; the main definitions and results required are in the appendices.

Part I of the book contains all the necessary tools for the bigger theorems presented in Part II. Some of the topics covered in Part I may be well known to some readers; I have opted to make this book accessible to more people and to keep the book self-contained.

For many of the theorems in the book, the proofs are broken down into steps that are spelt out through lemmas and various exercises. I encourage the reader to attempt to solve the exercises along the way themselves, which will assist not only in internalizing the notions but also in grasping the main ideas. Many technical steps from proofs are also delegated to exercises, so that hopefully in the actual proofs one can see the big picture without getting lost in the details.

After the table of contents there is a list of notation and definitions. This is to facilitate finding the original place where some notion or notation is defined.


Acknowledgments

As mentioned, this book evolved through lecture notes for courses. Naturally, many of the proofs have not been preserved in the original form in which they appeared for the first time. The typical evolutionary path a proof would have taken could be described generically as follows:

• A proof was published in a paper.
• Someone gave a course or wrote a book including the proof.
• I read the proof or learned it in some book or course as a graduate student.
• I explained the proof to some peers, so had to recall what I read or heard in a past lecture.
• The proof was eventually written down by me in some lecture notes I handed out.
• The proof was edited in subsequent lecture notes, most likely influenced by remarks from colleagues and students.

At this point in time, it is very hard for me to trace back exactly where the different proofs in this book came from, the intermediate “fossils” having been lost to time (sitting somewhere on an old lost hard drive). However, I can list the sources that have definitely influenced and taught me over the years, which probably cover the vast majority of the material in this book. I was introduced to random walks in my graduate studies. Definitely the most influential resource I had then for random walks was Probability on Trees and Networks by Russell Lyons and Yuval Peres. At the time it was yet unpublished, and I had different versions throughout the years. I was seduced into the world of random walks on groups and harmonic functions by a course I took in 2011 at Tel Aviv University by Yehuda Shalom, where he presented an elementary proof of Gromov’s theorem based on his xv


work with Terence Tao (Shalom and Tao, 2010). I took notes during that course, which I supplemented with ideas from Tao's blog What's New.

Gabor Pete's book in preparation, Probability and Geometry on Groups, is one of my go-to resources for anything under that topic. It is not easy for me to isolate Gabor's specific fingerprint from my notes. Most likely his influence is diffused throughout everything I have written. Another source I often use is Random Walks on Infinite Graphs and Groups by Wolfgang Woess, especially regarding Varopoulos's theorem, recurrence, and transience.

Anything appearing here is most likely some kind of logical descendant of the above, if not even more directly related. I am greatly in debt to the authors mentioned.

Finally, many people have offered me their comments and assistance in different stages of writing, teaching, and presenting the material included. I will make an attempt to list them all, but most likely someone will be left out, and I apologize in advance to anyone for whom this is the case. The material presented has benefitted from many discussions I had with many people. Among them are Itai Benjamini, Hugo Duminil-Copin, Yair Glasner, Yair Hartman, Gady Kozma, Tom Meyerovitch, Yuval Peres, Idan Perl, Gabor Pete, Yehuda Shalom, Maud Szusterman, Matthew Tointon, Wolfgang Woess, Amir Yehudayoff, and Ofer Zeitouni.

At different stages, many people have contributed remarks on versions of the manuscript. These include Adam Dor-On, Guy Drori, Dor Elimelech, Yair Glasner, Yair Hartman, Gady Kozma, Tom Meyerovitch, Christophe Pittet, Liran Ron-George, Guy Salomon, Matthew Tointon, Vasiliki Velona-Anastasiou, Yeari Vigder, and Jiayan Ye.

I also thank the team of editors and typesetters at Cambridge University Press, who gave me many comments and corrections to typos. Some worked in the background and are unknown to me, and with some I have had direct correspondence.
Among these, let me especially mention Eleanor Bolton, Clare Dennison, Jasintha Jacob Srinivasan, Anna Scriven, and David Tranah. Itai Benjamini has graciously decorated the book with his wonderful illustrations, and I thank him for this nice touch. A very special thanks goes to my partner Tovi, for all her support and encouragement. “The book won’t write itself.”


Notation

basic notation  4
cylinder set  5
group action  7
canonical left action on a function  9
G-invariant subset  9
orbit of an action  9
stabilizer  9
convolution  10
M_n(R) = n × n matrices with entries in the ring R  12
GL_n(R), GL_n(C) = invertible real / complex matrices  12
GL_n(Z) = integer matrices with determinant ±1  13
SL_n(Z) = integer matrices with determinant 1  13
Abelian/commutative group  14
finitely generated group  14
group property  16
virtual group property  16
indicable group  16
central series  16
nilpotent group  18
derived series  19
solvable group  20
free group  22
finitely presented group  25
semi-direct product  28
Cayley graph  31
dist_S(x, y) = dist_{G,S}(x, y) = Cayley graph distance  31
|x| = |x|_S = distance to 1  31
B_S(x, r) = B_{G,S}(x, r) = ball of radius r  31
symmetric, adapted measures  32
SA(G, k) = symmetric, adapted measures with kth moment  32
SA(G, ∞) = symmetric, adapted, exponential tail  32
random walk  33
harmonic function  34
BHF = bounded harmonic functions  37
LHF = Lipschitz harmonic functions  38
HF_k = harmonic functions of degree-k polynomial growth  39
conditional expectation  63
filtration  67
martingale  67
stopping time  68
uniform integrability  70
sub-martingale, super-martingale, predictable process  77
Markov chain  94
Markov property  95
irreducible Markov chain  97
aperiodic Markov chain  97
T_A, T_A^+, T_x, T_x^+ = hitting and return times  100
σ(X_0, …, X_T) for a stopping time T  101
strong Markov property  102
recurrence and transience  104
positive/null recurrent  108
stationary distribution  108
µ_H = hitting measure  121
network, conductance  141
ℓ²(G, c), ℓ²(E(G, c))  142
∇ = gradient, div = divergence  143
∆ = Laplacian  144
E = Dirichlet energy (bilinear form)  145
paths in networks  145
flow  147
voltage, current  148
maximum principle for harmonic functions  148
C_eff(a, Z) = effective conductance from a to Z  150
R_eff(a, Z) = effective resistance from a to Z  150
C_eff(a, ∞) = effective conductance to infinity  150
R_eff(a, ∞) = effective resistance to infinity  151
g_Z = Green function  153
ℓ₀(G, c) = ℓ₀(G)  166
capacity  166
amenable group  194
Følner sequence  195
Cheeger constant  196
ρ = ρ(P) = ρ(G, c) = spectral radius  197
dim_iso(G, c) = isoperimetric dimension  203
isoperimetric inequality  207
Nash inequality  207
T = tail σ-algebra  235
I = invariant σ-algebra  235
parabolic function  237
Liouville = bounded harmonic functions are constant  241
H(X) = Shannon entropy  242
H(X | σ) = conditional entropy  242
h(G, µ) = random walk entropy  246
D(·||·) = Kullback–Leibler divergence  250
I(X, Y) = mutual information  250
speed of a random walk  256
just-not virtually nilpotent (JNVN)  304
preorder on growth functions  326
characteristic subgroup  336
γ̄_n(G)  337
eigenvalue of a group automorphism  339
cocycle  367
|| · ||_HS = Hilbert–Schmidt norm  379
H* ⊗ H = space of Hilbert–Schmidt operators  379
weak* convergence in a Hilbert space  380


PART I Tools and Theory


1 Background




1.1 Basic Notation

The sets N, Z, Q, R, and C denote the sets of natural, whole, rational, real, and complex numbers, respectively. We assume that N contains the number 0. For real numbers a, b ∈ R we use a ∧ b = min{a, b} and a ∨ b = max{a, b}. For sets A, B the notation A^B is used to denote all functions from B to A. A ⊎ B is used to denote disjoint unions; that is, this notation includes the claim that A ∩ B = ∅. We use 1_A to denote the indicator function of a set A; so 1_A(ω) = 1 for ω ∈ A and 1_A(ω) = 0 for ω ∉ A. For linear operators we use I to denote the identity operator. In a generic probability space, we use P to denote the probability measure and E to denote expectation.

A graph is a pair (V, E) where V is a set (whose elements are called vertices) and E ⊂ {{x, y} : x, y ∈ V}. A subset {x, y} ∈ E is called an edge. Sometimes we write x ∼ y to denote the case that {x, y} ∈ E. A graph is naturally equipped with the notion of paths: a finite path in a graph G is a sequence x_0, …, x_n of vertices such that x_j ∼ x_{j+1} for all 0 ≤ j < n. For such a sequence, n is the length of the path; this is the number of edges traversed by the path. An infinite such sequence is called an infinite path. A graph is connected if for every pair of vertices x, y there is some finite path starting at x and ending at y. A connected graph comes with a natural metric on it: dist_G(x, y) is the minimal length of a path between x and y.

For a sequence (a_n)_n we use the notation a[m, n] = (a_m, …, a_n).

For two measures µ, ν on a measurable space (Ω, F), we write µ ≪ ν if µ is absolutely continuous with respect to ν. That is, for any A ∈ F it holds that if ν(A) = 0 then µ(A) = 0.

If µ is a probability measure on a measurable space (Ω, F), then an i.i.d.-µ sequence of elements means a sequence of elements (ω_t)_t such that each one has law µ and that are all independent.
(Sometimes this is just called i.i.d., omitting µ from the notation; "i.i.d." stands for independent and identically distributed.)

In a group G we use 1, and sometimes 1_G, to denote the identity element. For elements x, y ∈ G we denote x^y = y^{-1}xy and [x, y] = x^{-1}y^{-1}xy = x^{-1}x^y. The latter is called the commutator of x, y. Iterated commutators are defined inductively by [x_1, …, x_n] := [[x_1, …, x_{n-1}], x_n]. The centralizer of x ∈ G is defined to be C_G(x) = {y ∈ G : [x, y] = 1}.

For A ⊂ G we write A^x = {a^x : a ∈ A} and A^{-1} = {a^{-1} : a ∈ A}. A is called symmetric if A = A^{-1}. A group G is generated by a subset S ⊂ G if every element of G can be written as a product of finitely many elements from


S ∪ S^{-1}. Also, ⟨A⟩ denotes the subgroup generated by the elements of A; that is, all elements that can be written as a product of finitely many elements from A ∪ A^{-1}. For two subsets A, B ⊂ G we write [A, B] = ⟨[a, b] : a ∈ A, b ∈ B⟩ (note that this is the group generated by all commutators, not just the set of commutators). We also denote AB = {ab : a ∈ A, b ∈ B}.
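The graph metric dist_G above can be computed by breadth-first search, since BFS explores vertices in order of increasing path length. A minimal sketch (the 4-cycle used here is an arbitrary example, not one taken from the text):

```python
from collections import deque

def dist(adj, x, y):
    """Minimal length of a path from x to y, via breadth-first search."""
    if x == y:
        return 0
    seen = {x}
    queue = deque([(x, 0)])
    while queue:
        v, d = queue.popleft()
        for w in adj[v]:
            if w == y:
                return d + 1
            if w not in seen:
                seen.add(w)
                queue.append((w, d + 1))
    return float("inf")  # x and y lie in different connected components

# A 4-cycle on vertices 0..3, i.e. edges {0,1}, {1,2}, {2,3}, {3,0}.
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
```

On the 4-cycle, opposite vertices are at distance 2, reflecting that dist_G counts edges along a shortest path rather than any notion of Euclidean distance.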

1.2 Spaces of Sequences

Let G be a countable set. Let us briefly review the formal setup of the canonical probability spaces on G^N. This is the space of sequences (ω_n)_{n=0}^∞ where ω_n ∈ G for all n ∈ N. A cylinder set is a set of the form

C(J, ω) = {η ∈ G^N : ∀ j ∈ J, η_j = ω_j},   J ⊂ N, 0 < |J| < ∞, ω ∈ G^N.

It is also natural to define C(∅, ω) = G^N. Let X_j : G^N → G be the map X_j(ω) = ω_j projecting onto the jth coordinate. For times t > s we also use the notation X[s, t] = (X_s, X_{s+1}, …, X_t). Define the cylinder σ-algebra

F = σ(X_0, X_1, X_2, …) = σ(X_n^{-1}(g) : n ∈ N, g ∈ G).
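Since a cylinder set constrains only finitely many coordinates, membership can be checked on a finite truncation of a sequence. A small illustration (the sequences and the index set below are arbitrary choices, with G = {0, 1}):

```python
def in_cylinder(eta, J, omega):
    """eta ∈ C(J, omega) iff eta agrees with omega on every coordinate in J."""
    return all(eta[j] == omega[j] for j in J)

# Finite truncations of two sequences over G = {0, 1}.
omega = (0, 1, 1, 0, 1)
eta   = (0, 1, 0, 0, 1)
```

Note that membership is a mutual-agreement condition: if η ∈ C(J, ω) then also ω ∈ C(J, η), and the two sequences define the same cylinder over J.
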

Exercise 1.1

Show that

F = σ(X_0, X_1, X_2, …) = σ(C(J, ω) : 0 < |J| < ∞, J ⊂ N, ω ∈ G^N) = σ(C({0, …, n}, ω) : n ∈ N, ω ∈ G^N).

Exercise 1.2  Show that η ∈ C(J, ω) if and only if C(J, ω) = C(J, η). For t ≥ 0 we denote F_t = σ(X_0, …, X_t). Show that F_t ⊂ F_{t+1} ⊂ F. (A sequence of σ-algebras with this property is called a filtration.) Conclude that F = σ(⋃_t F_t).

The theorems of Carathéodory and Kolmogorov tell us that the probability measure P on (G^N, F) is completely determined by knowing the marginal probabilities P[X_0 = g_0, …, X_n = g_n] for all n ∈ N, g_0, …, g_n ∈ G. That is, when G is countable, Kolmogorov's extension theorem implies the following:


Theorem 1.2.1  Let (P_t)_t be a sequence of probability measures, where each P_t is defined on F_t. Assume that these measures are consistent in the sense that for all t,

P_{t+1}[(X_0, …, X_t) = (g_0, …, g_t)] = P_t[(X_0, …, X_t) = (g_0, …, g_t)]   (1.1)

for any g_0, …, g_t ∈ G. Then, there exists a unique probability measure P on (G^N, F) such that for any A ∈ F_t we have P(A) = P_t(A). Details can be found in Durrett (2019, appendix A).

Exercise 1.3  Let (P_t)_t be a sequence of probability measures, where each P_t is defined on F_t. Show that (1.1) holds if and only if for any t < s and any A ∈ F_t, we have P_s(A) = P_t(A).
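Condition (1.1) can be verified directly in small examples. The sketch below builds the marginal laws P_t of an i.i.d.-p sequence on a two-element set (the measure p is an arbitrary choice) and checks that P_{t+1}, summed over its last coordinate, recovers P_t:

```python
from itertools import product

def marginal(t, p):
    """Law of (X_0, ..., X_t) for an i.i.d.-p sequence, as a dict on words."""
    law = {}
    for word in product(p, repeat=t + 1):
        q = 1.0
        for g in word:
            q *= p[g]
        law[word] = q
    return law

def consistent(law_t, law_t1):
    """Condition (1.1): projecting the (t+1)-step law gives the t-step law."""
    return all(
        abs(q - sum(q1 for w1, q1 in law_t1.items() if w1[: len(w)] == w)) < 1e-12
        for w, q in law_t.items()
    )

p = {"a": 0.5, "b": 0.5}  # uniform measure on a two-element set
```

Kolmogorov's extension theorem then guarantees a unique measure P on the full sequence space agreeing with every P_t.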

The space GN comes equipped with a natural shift operator: θ : GN → GN given by θ(ω)t = ωt+1 for all t ∈ N. Exercise 1.4

Show that θ^t(A) ∈ F for any A ∈ F.

Exercise 1.5  Let K ⊂ F be a collection of events. Show that if G = σ(K) is the σ-algebra generated by K, then θ^{-t}G := {θ^{-t}(A) : A ∈ G} is a σ-algebra, and in fact θ^{-t}G = σ(θ^{-t}(K) : K ∈ K).

Exercise 1.6  Show that θ^{-1}(A) ∈ F for any A ∈ F.

Exercise 1.7  Define

σ(X_t, X_{t+1}, …) = σ(X_{t+j}^{-1}(g) : g ∈ G, j ≥ 0).

Show that σ(X_t, X_{t+1}, …) = θ^{-t}F = {θ^{-t}(A) : A ∈ F}.

Exercise 1.8  If ∼ is an equivalence relation on Ω, we say that a subset A ⊂ Ω respects ∼ if for any ω ∼ η in Ω we have ω ∈ A ⟺ η ∈ A. Show that the collection of subsets A that respect the equivalence relation ∼ forms a σ-algebra on Ω.

Exercise 1.9  Define an equivalence relation on G^N by ω ∼_t ω′ if ω_j = ω′_j for all j = 0, 1, …, t. Show that this is indeed an equivalence relation. Show that σ(X_0, X_1, …, X_t) = {A : A respects ∼_t}.
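The content of Exercise 1.9 can be checked by brute force on a finite model: over words of length 3 in G = {0, 1}, a set respects ∼_1 exactly when membership depends only on the first two coordinates. (The finite truncation, and the particular events below, are illustrative choices only.)

```python
from itertools import product

# All words of length 3 over G = {0, 1}, standing in for truncated sequences.
words = list(product((0, 1), repeat=3))

def respects(A, t):
    """A ⊂ words respects ∼_t iff membership depends only on coordinates 0..t."""
    return all(
        (v in A) == (w in A)
        for w in words for v in words
        if v[: t + 1] == w[: t + 1]
    )

# The event {X_0 = 0} is determined by coordinate 0, so it respects ∼_1;
# the event {X_2 = 0} looks at coordinate 2, so it does not.
A = {w for w in words if w[0] == 0}
B = {w for w in words if w[2] == 0}
```

Closure of the respecting sets under complement (and, similarly, countable unions) is what Exercise 1.8 asserts in general.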


1.3 Group Actions

A (left) group action G ↷ X is a function from G × X to X, (γ, x) ↦ γ.x, that is compatible in the sense that (γη).x = γ.(η.x), and such that 1.x = x for all x ∈ X. We usually write γ.x or γx for the action of γ ∈ G on x ∈ X. A right action is defined analogously for (x, γ) ↦ x.γ; compatibility is x.(γη) = (x.γ).η (and x.1 = x for all x ∈ X).

Exercise 1.10 Let G ↷ X be a left group action. For any γ ∈ G and x ∈ X define x.γ := γ^{-1}.x. Show that this defines a right action of G on X. Conversely, show that if G acts on X from the right, then γ.x := x.γ^{-1} defines a left action.

The bijections of a set X form a group, with the group operation given by composition of functions. A (left) group action G ↷ X can be thought of as a homomorphism from G into the group of bijections of X. Sometimes, when X has some additional structure, we wish to restrict to some subgroup of bijections of X. For example, if X is a topological space, we say that G acts on X by homeomorphisms if every element of G is a homeomorphism of X, when elements of G are identified with their corresponding bijections of X. That is, an action by homeomorphisms is a group homomorphism from G into the group of homeomorphisms of X. Similarly, if H is some Hilbert space, then a group G acts on H by unitary operators if every element of G is mapped to a unitary operator on H; this is just a group homomorphism from G into the group of unitary operators on H.

Exercise 1.11 Show that any group acts on itself by left multiplication; that is, G ↷ G by x.y := xy.

Exercise 1.12 Let ℂ^G be the set of all functions from G to ℂ. Show that G ↷ ℂ^G by (x.f)(y) := f(x^{-1}y). Show that f^x(y) := f(yx^{-1}) defines a right action.

Exercise 1.13 Generalize the previous exercise as follows. Suppose that G ↷ X, and consider ℂ^X, the set of all functions from X to ℂ. Show that G ↷ ℂ^X by (γ.f)(x) := f(γ^{-1}.x), for all f ∈ ℂ^X, γ ∈ G, x ∈ X. Show that the action G ↷ ℂ^X is linear; that is, γ.(ζf + h) = ζ(γ.f) + γ.h, for all f, h : X → ℂ, ζ ∈ ℂ, and γ ∈ G. Show that f^γ(x) := f(γ.x) defines a right action of G on ℂ^X. Show that this right action is linear as well.
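The compatibility law (γη).f = γ.(η.f) in Exercise 1.12 can be checked mechanically on a small example. The following sketch (not from the book) uses the additive group ℤ/6ℤ, with functions represented as Python dictionaries; the test function f is arbitrary:

```python
# Check that (x.f)(y) := f(x^{-1} y) defines a left action of Z/6Z
# on the space of functions f : Z/6Z -> C (here integer-valued, for simplicity).
N = 6
inv = lambda x: (-x) % N            # group inverse in Z/6Z
mul = lambda x, y: (x + y) % N      # group operation (written additively)

def act(x, f):
    """Return the function x.f, where (x.f)(y) = f(x^{-1} y)."""
    return {y: f[mul(inv(x), y)] for y in range(N)}

f = {y: y * y % 7 for y in range(N)}   # an arbitrary test function

# Compatibility: (x*z).f == x.(z.f) for all x, z; the identity acts trivially.
assert all(act(mul(x, z), f) == act(x, act(z, f))
           for x in range(N) for z in range(N))
assert act(0, f) == f
```

The same dictionary-based pattern works verbatim for any finite group once `mul` and `inv` are supplied.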


Exercise 1.14 Let G ↷ X. Let M₁(X) be the set of all probability measures on (X, F), where F is some σ-algebra on X. Suppose that for any g ∈ G the function g : X → X given by g(x) = g.x is a measurable function. (In this case we say that G acts on X by measurable functions.) Show that G ↷ M₁(X) by

    (g.µ)(A) := µ(g^{-1}.A)  for all A ∈ F,

where g^{-1}.A := {g^{-1}.x : x ∈ A}.

Exercise 1.15 Let F = {f : G → ℂ : f(1) = 0}. Show that

    (x.f)(y) := f(x^{-1}y) − f(x^{-1})

defines a left action of G on F.

Notation Throughout this book, unless specified otherwise, we will always use the left action γ.f(x) = f(γ^{-1}x) for G ↷ X and f : X → ℂ.

Definition 1.3.1 Let G ↷ X be a (left) action. For A ⊂ X and γ ∈ G define γ.A = {γ.x : x ∈ A}. For F ⊂ G denote F.A = {γ.x : γ ∈ F, x ∈ A}. A subset A ⊂ X is called G-invariant if γ.A ⊂ A for all γ ∈ G; equivalently, G.A = A.

Definition 1.3.2 For a group action G ↷ X and some x ∈ X, the set G.x := {g.x : g ∈ G} is called the orbit of x under G. The stabilizer of x is the subgroup stab(x) = {g ∈ G : g.x = x}.

Exercise 1.16 Show that for G ↷ X any stabilizer stab(x) is indeed a subgroup.

Exercise 1.17 (Orbit-stabilizer theorem) Let G ↷ X. Show that |G.x| = [G : stab(x)].
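The equality in the orbit-stabilizer theorem can be confirmed by brute force for a small action. A sketch (not from the book), with the symmetric group S_3 acting on {0, 1, 2} and permutations stored as tuples:

```python
from itertools import permutations

# S_3 acting on X = {0, 1, 2}: a permutation p acts by p.x = p[x].
G = list(permutations(range(3)))    # all 6 elements of S_3

x = 0
orbit = {p[x] for p in G}                 # G.x
stab = [p for p in G if p[x] == x]        # stab(x), the permutations fixing 0

# Orbit-stabilizer: |G.x| = [G : stab(x)] = |G| / |stab(x)|.
assert len(orbit) == len(G) // len(stab)
```

Here the orbit is all of {0, 1, 2} (size 3) and the stabilizer is the copy of S_2 permuting {1, 2} (size 2), so 3 = 6/2 as the theorem predicts.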

One nice consequence of the orbit-stabilizer theorem is that intersections of finite index subgroups have finite index.

Proposition 1.3.3 Let G be a group and H, N ≤ G be subgroups. Then [G : H ∩ N] ≤ [G : H] · [G : N].

Proof If either [G : H] = ∞ or [G : N] = ∞ there is nothing to prove, because the right-hand side is infinite. So assume that [G : H] < ∞ and [G : N] < ∞.


Let X = G/H × G/N. That is, elements of X are pairs of cosets (αH, βN). Therefore X is finite, since |X| = |G/H| · |G/N|. The group G acts on X by g.(αH, βN) = (gαH, gβN). The stabilizer of (H, N) is easily computed: stab(H, N) = H ∩ N. Thus, by the orbit-stabilizer theorem, [G : H ∩ N] = |G.(H, N)| ≤ |X| = [G : H] · [G : N]. □

1.4 Discrete Group Convolutions

Throughout this book we will almost exclusively deal with countable groups. Given a countable group G, one may define the convolution of functions f, g : G → ℂ as follows.

Definition 1.4.1 Let G be a countable group. Let f, g : G → ℂ. The convolution of f and g is the function f ∗ g : G → ℂ defined by

    (f ∗ g)(x) := Σ_y f(y) g(y^{-1}x) = Σ_y f(y) (y.g)(x),

as long as the above sum converges absolutely.

This is the analogue of the usual convolution of functions on the group ℝ:

    (f ∗ g)(x) = ∫ f(y) g(x − y) dy.

However, on a general group the convolution need not be commutative, as it is for Abelian groups.

Exercise 1.18 Show that

    (f ∗ g)(x) = Σ_y f(xy^{-1}) g(y).

Give an example for which f ∗ g ≠ g ∗ f.

Exercise 1.19 (Left action and convolutions) Show that x.(f ∗ g) = (x.f) ∗ g for the canonical left action x.f(y) = f(x^{-1}y).
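The failure of commutativity is easy to exhibit concretely. The sketch below (not from the book) computes the convolution on S_3 of the indicator functions of two non-commuting permutations; by the definition, 1_a ∗ 1_b is the indicator of the product ab:

```python
from itertools import permutations

# S_3 as tuples; p*q means "apply q, then p": (p*q)[i] = p[q[i]].
G = list(permutations(range(3)))
mul = lambda p, q: tuple(p[q[i]] for i in range(3))
inv = lambda p: tuple(sorted(range(3), key=lambda i: p[i]))  # p^{-1}

def conv(f, g):
    """Discrete convolution: (f*g)(x) = sum_y f(y) g(y^{-1} x)."""
    return {x: sum(f[y] * g[mul(inv(y), x)] for y in G) for x in G}

# Indicator functions of two non-commuting transpositions.
a, b = (1, 0, 2), (0, 2, 1)
f = {x: 1 if x == a else 0 for x in G}
g = {x: 1 if x == b else 0 for x in G}

# 1_a * 1_b is the indicator of ab; since ab != ba, convolution is
# noncommutative on this group.
assert conv(f, g)[mul(a, b)] == 1
assert conv(f, g) != conv(g, f)
```

This also answers the last part of Exercise 1.18: any pair of non-commuting group elements gives f ∗ g ≠ g ∗ f via indicators.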

When G is countable, a probability measure µ on G may be thought of as a function µ : G → [0, 1] with µ(A) = Σ_{a∈A} µ(a).

Exercise 1.20 Let µ be a probability measure on a countable group G, and let X be a random element of G with law µ. Show that

    E[f(x · X^{-1})] = (f ∗ µ)(x)

whenever the above quantities are well defined.

Definition 1.4.2 Let µ be a probability measure on G. We will use the notation µ^t to denote the t-fold convolution of µ with itself. Specifically, µ^1 = µ and µ^{t+1} = µ ∗ µ^t = µ^t ∗ µ.

Exercise 1.21 Let G be a countable group. Let µ, ν be probability measures on G. Let X, Y be independent random elements of G such that X has law µ and Y has law ν. Show that the law of X · Y is µ ∗ ν.
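Exercise 1.21 can be verified exactly on a small cyclic group by enumerating all pairs. A sketch (not from the book), using exact rational arithmetic so the comparison is equality, not approximation:

```python
from fractions import Fraction

# Two probability measures on the group Z/6Z (written additively).
N = 6
mu = {x: Fraction(1, 3) if x in (1, 2, 5) else Fraction(0) for x in range(N)}
nu = {x: Fraction(1, 2) if x in (0, 3) else Fraction(0) for x in range(N)}

# Convolution: (mu * nu)(x) = sum_y mu(y) nu(y^{-1} x) = sum_y mu(y) nu(x - y).
conv = {x: sum(mu[y] * nu[(x - y) % N] for y in range(N)) for x in range(N)}

# Law of X + Y for independent X ~ mu, Y ~ nu, by direct enumeration of pairs.
law = {x: Fraction(0) for x in range(N)}
for x in range(N):
    for y in range(N):
        law[(x + y) % N] += mu[x] * nu[y]

assert law == conv                 # the law of X . Y is exactly mu * nu
assert sum(conv.values()) == 1     # mu * nu is again a probability measure
```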

Exercise 1.22 Show that for any p ≥ 1 we have ‖x.f‖_p = ‖f‖_p. Here, ‖f‖_p^p = Σ_x |f(x)|^p and ‖f‖_∞ = sup_x |f(x)|. Show that ‖f̌‖_p = ‖f‖_p, where f̌(x) = f(x^{-1}).

Exercise 1.23 Prove Young's inequality for products: for all a, b ≥ 0 and any p, q > 0 such that p + q = 1, we have ab ≤ p a^{1/p} + q b^{1/q}.

Exercise 1.24 Prove the generalized Hölder inequality: for all p_1, …, p_n ∈ [1, ∞] such that Σ_{j=1}^n 1/p_j = 1, we have

    ‖f_1 · · · f_n‖_1 ≤ Π_{j=1}^n ‖f_j‖_{p_j}.

Exercise 1.25 Prove Young's inequality for convolutions: for any p, q ≥ 1 and 1 ≤ r ≤ ∞ such that 1/p + 1/q = 1/r + 1, we have

    ‖f ∗ g‖_r ≤ ‖f‖_p · ‖g‖_q.
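Young's inequality for convolutions can be sanity-checked numerically on a small group, say ℤ/12ℤ with p = q = 3/2 and r = 3 (so that 1/p + 1/q = 4/3 = 1/r + 1). A sketch, not from the book; the small tolerance only guards against floating-point rounding:

```python
import random

# Numeric check of Young's inequality ||f*g||_r <= ||f||_p ||g||_q on Z/12Z,
# with exponents satisfying 1/p + 1/q = 1/r + 1.
N, p, q, r = 12, 1.5, 1.5, 3.0
random.seed(0)

def norm(f, s):
    return sum(abs(v) ** s for v in f) ** (1 / s)

def conv(f, g):
    return [sum(f[y] * g[(x - y) % N] for y in range(N)) for x in range(N)]

for _ in range(100):
    f = [random.uniform(-1, 1) for _ in range(N)]
    g = [random.uniform(-1, 1) for _ in range(N)]
    assert norm(conv(f, g), r) <= norm(f, p) * norm(g, q) + 1e-9
```

Of course a finite test proves nothing, but a single violating pair would disprove a mis-stated inequality, which makes such checks a cheap way to catch sign or exponent errors.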

1.5 Basic Group Notions

Here we briefly recall some basic notions and examples from group theory. Further depth on any of these notions can be found in any basic textbook on group theory.


1.5.1 Basic Linear Groups

If R is a ring, we use the notation M_n(R) to denote the set of n × n matrices with entries in R. For example, M_n(ℤ) is the set of all n × n matrices with integer entries. These do not necessarily form a group. By GL_n(ℝ) we denote the group of n × n invertible matrices with real entries; the group operation here is matrix multiplication. More generally, GL_n(ℂ) is the group of invertible n × n matrices with complex entries, so that GL_n(ℝ) ≤ GL_n(ℂ).

Exercise 1.26 Show that GL_2(ℝ) ∩ M_2(ℤ) is not a group under matrix multiplication.

A nontrivial fact is that if we restrict to integer entries with determinant ±1, we do obtain a group. We denote GL_n(ℤ) = {M ∈ M_n(ℤ) : |det(M)| = 1}.

Proposition 1.5.1 GL_n(ℤ) is a group under matrix multiplication.

Proof The main property we will use is that for any M ∈ GL_n(ℤ) the number det(M) is invertible in the ring ℤ. (This proof generalizes to matrices over a commutative ring with unit whose determinants are invertible in the ring; see Exercise 1.102.)

Recall Cramer's rule: for b ∈ ℝ^n and A ∈ GL_n(ℝ), we may compute the solution to Ax = b by x_i = det(A(i, b))/det(A) for each i = 1, …, n, where A(i, b) is the matrix A with its ith column replaced by the vector b.

Let e_i denote the standard basis of ℝ^n; so e_i is the vector with 1 in the ith position and 0 everywhere else. Now let A ∈ GL_n(ℤ). We want to compute A^{-1} and show that it has integer entries. Let x^i be the ith column of A^{-1}. Then Ax^i = e_i. Consequently,

    (A^{-1})_{j,i} = (x^i)_j = det(A(j, e_i))/det(A).

Note that since A and e_i have integer entries, so does A(j, e_i). Since det(A) ∈ {−1, 1}, we conclude that A^{-1} has integer entries. Thus, if A ∈ GL_n(ℤ) then A^{-1} ∈ GL_n(ℤ). The fact that GL_n(ℤ) is closed under matrix multiplication is easy to prove, and is left to the reader. □

The following notation is also standard. Define SL_n(ℤ) = {A ∈ GL_n(ℤ) : det(A) = 1}.
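The proof's computation can be replayed on a concrete matrix. The sketch below (not from the book) applies Cramer's rule to one matrix in GL_3(ℤ) and checks that the resulting inverse is integral:

```python
from fractions import Fraction

# A matrix in GL_3(Z): integer entries, det = ±1 (here det = 1).
A = [[2, 3, 1],
     [1, 2, 1],
     [1, 1, 1]]

def det3(M):
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
          - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
          + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

d = det3(A)
assert abs(d) == 1

def replace_col(M, j, col):
    """Return M with its jth column replaced by the vector col."""
    return [[col[row] if c == j else M[row][c] for c in range(3)]
            for row in range(3)]

# Cramer's rule: the (j, i) entry of A^{-1} is det(A(j, e_i)) / det(A).
e = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
Ainv = [[Fraction(det3(replace_col(A, j, e[i])), d) for i in range(3)]
        for j in range(3)]

# All entries of A^{-1} are integers, and A * A^{-1} = I.
assert all(v.denominator == 1 for row in Ainv for v in row)
prod = [[sum(A[i][k] * Ainv[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)]
assert prod == [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
```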

Exercise 1.27 Show that SL_n(ℤ) ⊴ GL_n(ℤ) and that [GL_n(ℤ) : SL_n(ℤ)] = 2.

Exercise 1.28 For 1 ≤ i, j ≤ n let E_{i,j} denote the n × n matrix with 1 in the (i, j) entry only, and 0 in all other entries. Show that SL_n(ℤ) = ⟨ I + E_{i,j} : 1 ≤ i ≠ j ≤ n ⟩, where I is the n × n identity matrix.

1.5.2 Abelian Groups

A group G is called Abelian, or commutative, if xy = yx for all x, y ∈ G. A group G is called finitely generated if there exists a finite generating set for G; that is, if there exists a finite set S ⊂ G, |S| < ∞, such that for any x ∈ G there are s_1, …, s_n ∈ S ∪ S^{-1} with x = s_1 · · · s_n. We will come back to finitely generated groups in Section 1.5.7.

Exercise 1.29 Show that the group ℤ^d (with vector addition as the group operation) is a finitely generated Abelian group, with the standard basis serving as a finite generating set.

Finitely generated Abelian groups have a special structure. The classification of these groups is given by the so-called fundamental theorem of finitely generated Abelian groups. We will prove a simplified version of this theorem.

Theorem 1.5.2 Let G be a finitely generated Abelian group. Then there exist a finite Abelian group F and an integer d ≥ 0 such that G ≅ ℤ^d × F. Also, d > 0 if and only if |G| = ∞.

Proof Let U = {u_1, …, u_n} be a finite generating set for G. Consider the vector space V = ℚ^n. Define a map ψ : ℤ^n → G by ψ(z_1, …, z_n) = (u_1)^{z_1} · · · (u_n)^{z_n}. Since G is Abelian and U generates G, the map ψ is surjective. Also, it is simple to check that, because G is Abelian, ψ is a homomorphism.

Let K = Ker ψ = {z⃗ ∈ ℤ^n : ψ(z⃗) = 1}. Let W = span(K), which is a subspace of V. The quotient vector space V/W has dimension d ≤ n, so we can choose b⃗_1, …, b⃗_d ∈ V such that {b⃗_j + W : 1 ≤ j ≤ d} forms a (linear) basis for V/W. Let w⃗_1, …, w⃗_k be a basis for W. By multiplying by a large enough integer,


we can assume without loss of generality that b⃗_j ∈ ℤ^n for all 1 ≤ j ≤ d and w⃗_j ∈ ℤ^n for all 1 ≤ j ≤ k. Define s_j = ψ(b⃗_j) for all 1 ≤ j ≤ d and f_j = ψ(w⃗_j) for all 1 ≤ j ≤ k.

We claim that the map (z_1, …, z_d) ↦ (s_1)^{z_1} · · · (s_d)^{z_d} is an isomorphism from ℤ^d onto Z = ⟨s_1, …, s_d⟩. It is immediate to verify that this is a surjective homomorphism. To show it is injective, assume that (s_1)^{z_1} · · · (s_d)^{z_d} = 1. Then

    ψ(z_1 b⃗_1 + · · · + z_d b⃗_d) = (s_1)^{z_1} · · · (s_d)^{z_d} = 1,

implying that z_1 b⃗_1 + · · · + z_d b⃗_d ∈ K ⊂ W. Since b⃗_1 + W, …, b⃗_d + W are linearly independent, it must be that z_1 = · · · = z_d = 0. This proves injectivity, showing that Z ≅ ℤ^d.

Now fix some z⃗ ∈ ℤ^n ∩ W. Since W = span(K), there exist q_1, …, q_m ∈ ℚ and z⃗_1, …, z⃗_m ∈ K such that z⃗ = q_1 z⃗_1 + · · · + q_m z⃗_m. So there exists a large enough integer r ≠ 0 such that r z⃗ ∈ K, implying that ψ(z⃗)^r = ψ(r z⃗) = 1. This implies that any element of F = ⟨f_1, …, f_k⟩ is torsion; that is, for any x ∈ F there exists an integer r ≠ 0 such that x^r = 1. (One can check that in fact F is exactly the subgroup of all torsion elements of G.) So we may take r > 0 large enough so that (f_j)^r = 1 for all 1 ≤ j ≤ k. Since F is generated by f_1, …, f_k, and since F is Abelian, the map {0, …, r − 1}^k → F given by (a_1, …, a_k) ↦ (f_1)^{a_1} · · · (f_k)^{a_k} is surjective, and thus F is a finite group.

Let x ∈ Z ∩ F. So x^r = 1 for some integer r > 0. Then

    ψ(r z_1 b⃗_1 + · · · + r z_d b⃗_d) = ((s_1)^{z_1} · · · (s_d)^{z_d})^r = x^r = 1,

for some integers z_1, …, z_d ∈ ℤ. This implies that r z_1 b⃗_1 + · · · + r z_d b⃗_d ∈ K ⊂ W, and as before we get that z_1 = · · · = z_d = 0, so that x = 1. That is, we have shown that Z ∩ F = {1}.

Finally, recall that the map ψ : ℤ^n → G is surjective. Any z⃗ ∈ ℤ^n can be written as z⃗ = v⃗ + w⃗, where v⃗ = z_1 b⃗_1 + · · · + z_d b⃗_d and w⃗ = a_1 w⃗_1 + · · · + a_k w⃗_k for integers z_1, …, z_d, a_1, …, a_k. Thus, for any x ∈ G there exist ψ(v⃗) ∈ Z and ψ(w⃗) ∈ F such that x = ψ(v⃗) · ψ(w⃗).

To conclude, we have Z ⊴ G with Z ≅ ℤ^d and F ⊴ G with |F| < ∞, and these have the following properties:
• G = ZF = {zf : z ∈ Z, f ∈ F},
• Z ∩ F = {1},
• for any z ∈ Z, f ∈ F we have zf = fz.
It is an exercise to show that this implies that G ≅ Z × F. □


Exercise 1.30 Let G be a group and let Z, F be subgroups such that G = ZF, Z ∩ F = {1}, and zf = fz for all z ∈ Z, f ∈ F. Show that G ≅ Z × F.

1.5.3 Virtual Properties

A group property is a class of groups P such that if G ≅ H and G ∈ P, then also H ∈ P. For G ∈ P we sometimes say that G is P. For a group property P, we may define the property virtually P: a group G is virtually P if there exists a finite index subgroup H ≤ G, [G : H] < ∞, such that H is P.

Example 1.5.3 A group G is virtually finitely generated if there exists a finite index subgroup H ≤ G, [G : H] < ∞, such that H is finitely generated.

Exercise 1.31 Show that if G is virtually finitely generated then G is finitely generated.

Example 1.5.4 A group G is called indicable if there exists a surjective homomorphism from G onto ℤ. A group G is therefore virtually indicable if there exist a finite index subgroup H ≤ G, [G : H] < ∞, and a surjective homomorphism ϕ : H → ℤ.

Exercise 1.32 Show that if G is finitely generated and there exists a homomorphism ϕ : G → A, where A is an Abelian group and |ϕ(G)| = ∞, then G is indicable.

Exercise 1.33 Let G be a finitely generated group. Show that |G/[G, G]| = ∞ if and only if there exists a surjective homomorphism ϕ : G → ℤ.

Every group G ∈ P is also virtually P, since G has index 1 in itself. But not every property P coincides with virtually P. For example, the infinite dihedral group D∞ (see Exercise 1.72) is virtually ℤ (i.e. it contains a finite index subgroup isomorphic to ℤ) but is not Abelian, and so certainly not isomorphic to ℤ.


1.5.4 Nilpotent Groups

Definition 1.5.5 For a group G, we define the lower central series inductively as follows: γ_0 = γ_0(G) = G and γ_{n+1} = γ_{n+1}(G) = [γ_n(G), G] for all n ≥ 0. We define the upper central series by Z_0 = {1} and, for all n ≥ 0,

    Z_{n+1} = Z_{n+1}(G) := {x ∈ G : [x, y] ∈ Z_n for all y ∈ G}.

Z_1 = Z_1(G) is called the center of G, and is sometimes denoted just Z(G).

Exercise 1.34 Assume that G = ⟨S⟩ for some set of elements S ⊂ G. Show that

    γ_n(G) = ⟨ [s_0, …, s_n]^x : s_0, …, s_n ∈ S, x ∈ G ⟩,

where [s_0, …, s_n] denotes the iterated commutator.

Exercise 1.35 Let ϕ be an automorphism of a group G. Show that ϕ(γ_n(G)) = γ_n(G) and that ϕ(Z_n(G)) = Z_n(G). Conclude that γ_n(G), Z_n(G) are normal subgroups of G.

Exercise 1.36 Show that for k ≤ n we have Z_k(G) ⊴ Z_n(G). Show that Z_n(G)/Z_k(G) = Z_{n−k}(G/Z_k(G)).

Exercise 1.37 Show that if γ_n(G) = {1} then Z_n(G) = G.

Exercise 1.38 Show that if Z_n(G) = G then γ_n(G) = {1}.

Exercise 1.39 Show that if G is finitely generated, then γ_k/γ_{k+1} is also finitely generated for any k ≥ 0.

Definition 1.5.6 A group G is called n-step nilpotent if γ_n(G) = {1} and γ_{n−1}(G) ≠ {1}. (By convention, 0-step nilpotent is just the trivial group.) A group is called nilpotent if it is n-step nilpotent for some n ≥ 0.

Note that 0-step nilpotent is the trivial group {1}. Note too that 1-step nilpotent is just Abelian.

Exercise 1.40 Show that a group is n-step nilpotent if and only if Z_n(G) = G and Z_{n−1}(G) ≠ G. Show that G is (n + 1)-step nilpotent if and only if G/Z_1(G) is n-step nilpotent.

Exercise 1.41 Show that G/γ_n(G) is at most n-step nilpotent.

Exercise 1.42 Show that if G is a nilpotent group and H ≤ G, then H is nilpotent as well.

Exercise 1.43 Show that if G is nilpotent and N ⊴ G, then G/N is also nilpotent.

Let us go through some basic examples of nilpotent groups. Some readers may have seen the following definition: an n × n matrix M ∈ M_n(ℝ) is called k-step nilpotent if M^{k−1} ≠ 0 and M^k = 0. This is related to nilpotence of groups, as the following exercises show.

Exercise 1.44 Let T_n(ℝ) denote the set of all n × n upper triangular matrices with real entries. For 1 ≤ k ≤ n define

    D_k = {M ∈ T_n(ℝ) : M_{i,j} = 0 for all j ≤ i + k − 1}.

That is, all the first k diagonals of M are 0. (So e.g. D_0 = T_n(ℝ).) Show that if M ∈ D_k and N ∈ D_ℓ then MN ∈ D_{k+ℓ}.

Exercise 1.45 Fix n > 1. Let T_n(ℝ) denote the set of all n × n upper triangular matrices. For 1 ≤ k ≤ n define

    D_k = {M ∈ T_n(ℝ) : M_{i,j} = 0 for all j ≤ i + k − 1},

and define D_k(ℤ) = D_k ∩ M_n(ℤ) (recall that M_n(ℤ) is the set of n × n matrices with integer entries). Set Q_{n,k} = {I + N : N ∈ D_k(ℤ)}. Show that Q_{n,k} is a group (with the usual matrix multiplication).

Exercise 1.46 Let n > 1. Let H_n(ℤ) be the collection of all upper triangular n × n matrices with 1 on the diagonal and only integer entries. Show that H_n(ℤ) is a group (with the usual matrix multiplication). Show that for 0 ≤ k ≤ n − 1 we have

    γ_k(H_n(ℤ)) ⊂ Q_{n,k+1} ⊂ Z_{n−k−1}(H_n(ℤ)).
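For n = 3 the containment γ_1(H_3(ℤ)) ⊂ Q_{3,2} says that every commutator in the integer Heisenberg group differs from the identity only in the top-right corner. A brute-force sketch (not from the book):

```python
import random

# The 3x3 integer Heisenberg group H_3(Z): upper triangular, 1s on diagonal.
def H(a, b, c):
    return ((1, a, c), (0, 1, b), (0, 0, 1))

def mul(A, B):
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(3))
                       for j in range(3)) for i in range(3))

def inv(A):
    # (I + N)^{-1} = I - N + N^2 when N is strictly upper triangular.
    a, b, c = A[0][1], A[1][2], A[0][2]
    return H(-a, -b, a * b - c)

random.seed(1)
for _ in range(50):
    A = H(*[random.randint(-5, 5) for _ in range(3)])
    B = H(*[random.randint(-5, 5) for _ in range(3)])
    C = mul(mul(inv(A), inv(B)), mul(A, B))    # the commutator [A, B]
    # [A, B] lies in Q_{3,2}: identity except possibly the (0, 2) corner.
    assert C[0][1] == 0 and C[1][2] == 0
```

In fact the corner entry of [A, B] is the "symplectic" quantity A[0][1]·B[1][2] − B[0][1]·A[1][2], which is the standard way to see that H_3(ℤ) is 2-step nilpotent.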

1.5.5 Solvable Groups

Definition 1.5.7 Let G be a group. The derived series is defined inductively as follows: G^(0) = G, and G^(n+1) = [G^(n), G^(n)].

Definition 1.5.8 A group G is n-step solvable if G^(n) = {1} and G^(n−1) ≠ {1}. (By convention, 0-step solvable is the trivial group.) A group is solvable if it is n-step solvable for some n ≥ 0.

Note that the properties of 1-step solvable, 1-step nilpotent, and Abelian all coincide.

Exercise 1.47 Show that any nilpotent group is solvable.

Exercise 1.48 Show that if G is 2-step solvable, then G^(1) is Abelian.

Exercise 1.49 Show that the following are equivalent:
• G is a solvable group.
• G^(n) is solvable for all n ≥ 0.
• G^(n) is solvable for some n ≥ 0.

Exercise 1.50 Show that if G is solvable and infinite then [G : [G, G]] = ∞.

Exercise 1.51 Show that if G is a solvable group and H ≤ G then H is solvable.

Exercise 1.52 Let Δ⁺_n denote the collection of all n × n diagonal matrices with real entries and only positive values on the diagonal. Show that Δ⁺_n is an Abelian group (with the usual matrix multiplication).

Exercise 1.53 Fix n > 1, and recall D_k, the collection of all n × n upper triangular matrices with first k diagonals equal to 0 (from Exercise 1.44). Recall also Δ⁺_n, the collection of all n × n diagonal matrices with only positive values on the diagonal. For k ≥ 1 define

    P_{n,k} := {T + M : T ∈ Δ⁺_n, M ∈ D_k}.

Show that P_{n,k} is a group (with the usual matrix multiplication). Show that [P_{n,k}, P_{n,k}] ⊂ {I + M : M ∈ D_k}. Show that P_{n,k} is solvable, of step at most ⌈log₂(n)⌉ + 1. Show that P_{n,k} is not nilpotent when k < n.
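The containment [P_{n,k}, P_{n,k}] ⊂ {I + M : M ∈ D_k} can be spot-checked for n = 3, k = 1: the commutator of two invertible upper triangular matrices is unipotent. A sketch (not from the book), using exact rational arithmetic:

```python
from fractions import Fraction
import random

random.seed(2)

# Random elements of P_{3,1}: upper triangular, positive diagonal entries.
def rand_P():
    M = [[Fraction(0)] * 3 for _ in range(3)]
    for i in range(3):
        M[i][i] = Fraction(random.randint(1, 5))          # positive diagonal
        for j in range(i + 1, 3):
            M[i][j] = Fraction(random.randint(-5, 5))     # arbitrary above
    return M

def mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def inv_upper(A):
    """Invert an upper triangular matrix by back substitution per column."""
    n = 3
    X = [[Fraction(0)] * n for _ in range(n)]
    for j in range(n):
        for i in range(n - 1, -1, -1):
            rhs = Fraction(1) if i == j else Fraction(0)
            rhs -= sum(A[i][k] * X[k][j] for k in range(i + 1, n))
            X[i][j] = rhs / A[i][i]
    return X

for _ in range(20):
    A, B = rand_P(), rand_P()
    C = mul(mul(inv_upper(A), inv_upper(B)), mul(A, B))   # [A, B]
    # The commutator is in I + D_1: upper triangular with all-1 diagonal.
    assert all(C[i][i] == 1 for i in range(3))
    assert all(C[i][j] == 0 for i in range(3) for j in range(i))
```

The underlying reason is that the diagonal of a product of upper triangular matrices is the entrywise product of the diagonals, which is commutative.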

Exercise 1.54 Let r > 1 and consider ω = e^{2πi/r}, an rth root of unity. Define

    D = { Σ_{k=0}^{r−1} a_k ω^k : a_k ∈ ℤ }

and

    G = { [ ω^z  d ; 0  1 ] : z ∈ ℤ, d ∈ D },

where [ ω^z d ; 0 1 ] denotes the 2 × 2 matrix with rows (ω^z, d) and (0, 1). Show that G is a finitely generated virtually Abelian group that is not nilpotent.

1.5.6 Free Groups

Let S be a finite set. For each element s ∈ S, consider a new element s̄, and define S̄ = {s̄ : s ∈ S}. Consider all possible finite words in the letters S ∪ S̄, including the empty word ∅, and denote this set by Ω_S. That is,

    Ω_S := {a_1 · · · a_n : n ∈ ℕ, a_j ∈ S ∪ S̄} ∪ {∅}.

Define the reduction operation R : Ω_S → Ω_S as follows. Call a word a_1 · · · a_n ∈ Ω_S reduced if for all 1 ≤ j < n we have (a_j, a_{j+1}) ∉ {(s, s̄), (s̄, s) : s ∈ S}. The empty word ∅ is reduced by convention. Let F_S denote the collection of all reduced words. Now, for a word ω ∈ F_S, define R(ω) = ω. For a word a_1 · · · a_n ∉ F_S, let j be the smallest index for which (a_j, a_{j+1}) ∈ {(s, s̄), (s̄, s) : s ∈ S}, and define R(a_1 · · · a_n) = a_1 · · · a_{j−1} a_{j+2} · · · a_n (if j = 1 this means R(a_1 · · · a_n) = a_3 · · · a_n). It is easy to see that for any word a_1 · · · a_n ∈ Ω_S, at most n applications of R will result in a reduced word. Let R^∞(a_1 · · · a_n) denote this reduced word. So R^∞ : Ω_S → F_S, and R^∞ fixes any word in F_S.

Define a product structure on F_S: for two reduced words a_1 · · · a_n and b_1 · · · b_m define ∅ a_1 · · · a_n = a_1 · · · a_n ∅ = a_1 · · · a_n and a_1 · · · a_n · b_1 · · · b_m = R^∞(a_1 · · · a_n b_1 · · · b_m). It is easily verified that this turns F_S into a group with identity element ∅.
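The reduction map R^∞ and the product on F_S are easy to implement. The sketch below (not from the book) encodes a letter s_i as the integer +i and its formal inverse s̄_i as −i; a single left-to-right pass with a stack produces the fully reduced word, which agrees with iterating R:

```python
# Free reduction in F_S: a word is a tuple of nonzero integers,
# with +i standing for s_i and -i for s_i-bar; () is the empty word.
def reduce_word(w):
    """Return the reduced word R^infinity(w)."""
    out = []
    for a in w:
        if out and out[-1] == -a:   # adjacent pair (s, s-bar) or (s-bar, s)
            out.pop()               # ... cancels
        else:
            out.append(a)
    return tuple(out)

def mul(u, v):
    """Product in F_S: concatenate, then reduce."""
    return reduce_word(u + v)

a, b = (1,), (2,)
ainv, binv = (-1,), (-2,)

assert mul(a, ainv) == ()                       # a . a-bar is the empty word
assert reduce_word((1, 2, -2, -1, 1)) == (1,)   # a b b-bar a-bar a -> a
assert mul(mul(a, b), binv) == a                # (ab) b-bar = a
```

The stack trick works because cancelling an innermost pair can only create a new cancellable pair immediately to its left, which the stack top already tracks.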

Definition 1.5.9 F_S is called the free group on generators S.

Since the actual letters generating the free group are not important, we will usually write F_d for the free group generated by d elements.

If G is a finitely generated group, generated by a finite set S, then consider F_S and define a map ϕ : F_S → G by ϕ(∅) = 1, for s ∈ S set ϕ(s) = s and ϕ(s̄) = s^{-1}, and finally for general reduced words set ϕ(a_1 · · · a_n) = ϕ(a_1) · · · ϕ(a_n). This is easily seen to be a surjective homomorphism, so G ≅ F_S/Ker ϕ.

Remark 1.5.10 Let G be a group generated by a finite set S. We have seen that there exists a normal subgroup R ⊴ F_S such that F_S/R ≅ G. In this case we write G = ⟨S | R⟩. Moreover, suppose there exist (r_n)_n ⊂ R such that R is the smallest normal subgroup containing all (r_n)_n. Then we write G = ⟨S | (r_n)_n⟩. We will come back to this presentation in Section 1.5.8.

There is a classical method of proving that certain groups (or subgroups) are isomorphic to a free group. We will not require it, but we include it for its educational value.

Exercise 1.55 (Ping-pong lemma) Let G be a group acting on some set X. Let a, b ∈ G. Suppose that there exist disjoint non-empty subsets A, B ⊂ X, A ∩ B = ∅, such that for all 0 ≠ z ∈ ℤ we have a^z(B) ⊂ A and b^z(A) ⊂ B. (This is known as: a, b play ping-pong.) Then H = ⟨a, b⟩ ≤ G is isomorphic to F_2.


Exercise 1.56 Consider a = [ 1 2 ; 0 1 ] and b = [ 1 0 ; 2 1 ] in SL_2(ℤ) (matrices written row by row). Show that S = ⟨a, b⟩ is a free group generated by 2 elements.

Remark 1.5.11 The group S above, generated by a = [ 1 2 ; 0 1 ] and b = [ 1 0 ; 2 1 ], is sometimes called the Sanov subgroup. Note that SL_2(ℤ) is generated by x = [ 1 1 ; 0 1 ] and y = [ 1 0 ; 1 1 ], and that a = x^2 and b = y^2.

Exercise 1.57 Let I ∈ SL_2(ℤ) denote the 2 × 2 identity matrix. Show that {−I, I} ⊴ SL_2(ℤ).

Denote PSL_2(ℤ) = SL_2(ℤ)/{−I, I}.

Exercise 1.58 Let x = [ 1 1 ; 0 1 ] and y = [ 1 0 ; 1 1 ], and let a = x^2, b = y^2. Set t = [ 0 −1 ; 1 0 ] and s = xt.
Show that t^2 = s^3 = −I, where I is the 2 × 2 identity matrix. Show that x = −st and y = −s^2 t. Let π : SL_2(ℤ) → PSL_2(ℤ) be the canonical homomorphism. Show that PSL_2(ℤ) = ⟨π(t), π(s)⟩. Show that for any z ∈ SL_2(ℤ) there exist ε_1, …, ε_n ∈ {−1, 1} and α, β ∈ {0, 1} such that z ≡ t^α s^{ε_1} t s^{ε_2} · · · t s^{ε_n} t^β (mod {−I, I}).

Exercise 1.59 Let x, y, a, b, s, t be as in Exercise 1.58. Let S = ⟨a, b⟩ ≤ SL_2(ℤ) be the Sanov subgroup (from Exercise 1.56). Show that a = stst and b = s^2 t s^2 t. Let π : SL_2(ℤ) → PSL_2(ℤ) be the canonical projection. Show that for any z ∈ SL_2(ℤ) there exist w ∈ S and p ∈ {1, s, s^2, t, st, s^2 t} such that π(z) = π(w)π(p). Show that [PSL_2(ℤ) : π(S)] ≤ 6. Conclude that [SL_2(ℤ) : S] ≤ 12.
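The identities in Exercises 1.58 and 1.59 are finite matrix computations, so they can be checked directly. A sketch (not from the book):

```python
# 2x2 integer matrices as ((a, b), (c, d)); verify t^2 = s^3 = -I,
# a = stst, and b = s^2 t s^2 t for the Sanov generators.
def mul(A, B):
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

x = ((1, 1), (0, 1))
y = ((1, 0), (1, 1))
t = ((0, -1), (1, 0))
s = mul(x, t)
negI = ((-1, 0), (0, -1))

a = mul(x, x)            # a = x^2, one Sanov generator
b = mul(y, y)            # b = y^2, the other Sanov generator

assert mul(t, t) == negI                         # t^2 = -I
assert mul(s, mul(s, s)) == negI                 # s^3 = -I
assert mul(mul(s, t), mul(s, t)) == a            # a = stst
s2 = mul(s, s)
assert mul(mul(s2, t), mul(s2, t)) == b          # b = s^2 t s^2 t
```

Note that st = x t^2 = −x and s^2 t = −y, which is why squaring these products recovers a = x^2 and b = y^2.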

1.5.7 Finitely Generated Groups

Exercise 1.60 Let H ≤ G and let S be a finite generating set for G. Let T be a right-traversal of H in G; that is, a set of representatives for the right-cosets of H, with 1 ∈ T. So G = ⊔_{t∈T} Ht. Show that H is generated by TST^{-1} ∩ H.


Exercise 1.61 Show that a finite index subgroup of a finitely generated group is also finitely generated.

Exercise 1.62 Let H ⊴ G and let π : G → G/H be the canonical projection. Assume that H is generated by U and G/H is generated by S̃. Show that if S ⊂ G is such that π(S) = S̃, then U ∪ S generates G. Conclude that if H and G/H are finitely generated, then so is G.

A nice property of finitely generated groups is that there cannot be too many subgroups of a given finite index.

Theorem 1.5.12 Let G be a finitely generated group, generated by d elements. Then for any n, the set {H ≤ G : [G : H] = n} has size at most (n!)^d.

Proof Assume that S ⊂ G is a finite generating set for G of size |S| = d. Let Π_n be the group of permutations of the set {1, 2, …, n}. Let X = {H ≤ G : [G : H] = n}. If X = ∅ then it is of course finite. So assume that X ≠ ∅.

Consider H ∈ X. Write G/H = {xH : x ∈ G} = {x_1 H, x_2 H, …, x_n H}, where x_1 = 1. G acts on G/H by x(yH) = xyH. Define ψ_H : G → Π_n by x ↦ π_x ∈ Π_n, where π_x is the permutation for which π_x(i) = j for the unique 1 ≤ i, j ≤ n such that x x_i H = x_j H. Note that π_x(1) = 1 if and only if x ∈ H. It is easy to see that ψ_H is a homomorphism from G into Π_n.

We claim that H ↦ ψ_H is an injective map from X into Hom(G, Π_n). Indeed, if H ≠ K ∈ X, then without loss of generality we may take x ∈ H\K (otherwise x ∈ K\H, and reverse the roles of H and K in what follows). Let π = ψ_H(x) and σ = ψ_K(x). Since x ∈ H we have that π(1) = 1. Since x ∉ K we have that σ(1) ≠ 1. So ψ_H(x) ≠ ψ_K(x), implying that ψ_H ≠ ψ_K.

We conclude that |X| ≤ |Hom(G, Π_n)|, so we only need to bound the size of this last quantity. Any homomorphism ψ ∈ Hom(G, Π_n) is completely determined by the values {ψ(s) : s ∈ S}. Thus, |Hom(G, Π_n)| ≤ |Π_n|^{|S|} = (n!)^d. □

1.5.8 Finitely Presented Groups

Definition 1.5.13 Let G be a group generated by a finite set S. Consider the free group F_S on the generators S. If it is possible to find a normal subgroup R ⊴ F_S and finitely many r_1, …, r_k ∈ R such that R is the smallest normal subgroup containing r_1, …, r_k, we write G = ⟨S | r_1, …, r_k⟩, and in this special case we say that G is finitely presented. The elements of S are called generators of G, and the elements of R are called relations of G.

The next lemma shows how the property of finite presentation passes from a normal subgroup and its quotient to the mother group.

Lemma 1.5.14 Let G be a group and N ⊴ G. Assume that both N and G/N are finitely presented. Then G is finitely presented as well.

Proof Assume that

    N = ⟨s_1, …, s_k | r_1, …, r_ℓ⟩  and  G/N = ⟨a_1, …, a_d | p_1, …, p_m⟩.

Let F_{d+k} be the free group on d + k generators. Denote the generators of F_{d+k} by {f_1, …, f_d, t_1, …, t_k}. Let F = ⟨f_1, …, f_d⟩ ≤ F_{d+k} and T = ⟨t_1, …, t_k⟩ ≤ F_{d+k}. So F is a free group on d generators, and T is a free group on k generators.

For any 1 ≤ j ≤ d choose an element g_j ∈ G such that g_j is mapped to a_j under the canonical projection G → G/N (i.e. a_j = N g_j). Let ϕ : F_{d+k} → G be the homomorphism defined by ϕ(f_j) = g_j for 1 ≤ j ≤ d and ϕ(t_j) = s_j for 1 ≤ j ≤ k.

By our assumptions on the presentation of N, there exist words r_1, …, r_ℓ ∈ T such that if R is the smallest normal subgroup of T containing r_1, …, r_ℓ, then ϕ restricted to T maps onto N with Ker(ϕ|_T) = Ker ϕ ∩ T = R. Also, by our assumptions on the presentation of G/N, there exist words p_1, …, p_m ∈ F such that if P is the smallest normal subgroup of F containing p_1, …, p_m, then ϕ^{-1}(N) ∩ F = P.

For any 1 ≤ i ≤ k and 1 ≤ j ≤ d, we have that ϕ((t_i)^{f_j}) = (s_i)^{g_j} ∈ N, since N is normal in G. So there exists u_{i,j} ∈ T such that ϕ((t_i)^{f_j}) = ϕ(u_{i,j}). Define q_{i,j} = (t_i)^{f_j} (u_{i,j})^{-1}. Observe that q_{i,j} ∈ Ker ϕ for all i, j.

For any 1 ≤ j ≤ m we have that ϕ(p_j) ∈ N, by our assumptions on the presentation of G/N. So there exists w_j ∈ T such that ϕ(p_j) = ϕ(w_j). Define z_j = p_j (w_j)^{-1}. Observe that z_j ∈ Ker ϕ for all j.

Denote K := Ker ϕ. Let Q be the smallest normal subgroup of F_{d+k} containing {q_{i,j} : 1 ≤ i ≤ k, 1 ≤ j ≤ d}. Let Z be the smallest normal subgroup of F_{d+k} containing z_1, …, z_m. Let M ⊴ F_{d+k} be any normal subgroup containing

    {r_1, …, r_ℓ, z_1, …, z_m} ∪ {q_{i,j} : 1 ≤ i ≤ k, 1 ≤ j ≤ d}.

Since M is an arbitrary normal subgroup containing the above relations, we only need to show that K ⊂ M for all such M; this will prove that G is finitely presented, since G ≅ F_{d+k}/K.


To this end, we will prove that

    K ⊂ RQZ := {rqz : r ∈ R, q ∈ Q, z ∈ Z} ⊂ M.   (1.2)

It will be convenient to use the notations

    AB = {ab : a ∈ A, b ∈ B}  and  A^B = {a^b : a ∈ A, b ∈ B}

for subsets A, B ⊂ F_{d+k}.

Step I. Let t ∈ T and f ∈ F. Then, replacing (t_i)^{f_j} = u_{i,j} q_{i,j} and using that Q ⊴ F_{d+k}, we have that t^f = uq for some u ∈ T and q ∈ Q. That is, T^F ⊂ TQ.

Step II. For any 1 ≤ j ≤ m and any f ∈ F, we have that (p_j)^f = (z_j w_j)^f. Since Z ⊴ F_{d+k} and since P = ⟨(p_j)^f : 1 ≤ j ≤ m, f ∈ F⟩, we have that P ⊂ T^F Z ⊂ TQZ.

Step III. For any x ∈ F_{d+k} we can write x = h_1 v_1 · · · h_n v_n for some h_1, …, h_n ∈ F and v_1, …, v_n ∈ T. By conjugating the v_j, we have that x = (u_1)^{d_1} · · · (u_n)^{d_n} f for some u_1, …, u_n ∈ T and d_1, …, d_n, f ∈ F. Since Q ⊴ F_{d+k}, we conclude that F_{d+k} ⊂ TQF.

Step IV. Let x ∈ K. Write x = tqf for t ∈ T, q ∈ Q, and f ∈ F. So ϕ(tf) = 1, implying that f ∈ ϕ^{-1}(N) ∩ F = P. This implies that K ⊂ TQP ⊂ TQTQZ ⊂ TQZ.

Hence, for any x ∈ K we can write x = tqz for some t ∈ T, q ∈ Q, and z ∈ Z. Since Q, Z ⊂ K, we have that t ∈ T ∩ K = R. So we have shown that K ⊂ RQZ, which is (1.2). □

Theorem 1.5.15 Suppose G is a group, and suppose that there exists a sequence of subgroups G = H_0 ⊵ H_1 ⊵ · · · ⊵ H_n = {1} with the property that every quotient H_j/H_{j+1} is finitely presented. Then G is finitely presented.

Proof This is proved by induction on n. If n = 1, then G = H_0 is finitely presented by assumption. For n > 1, let H = H_1. By induction, considering the sequence H = H_1 ⊵ · · · ⊵ H_n = {1}, we have that H is finitely presented. Also, by assumption G/H_1 is finitely presented. So G is finitely presented by Lemma 1.5.14, completing the induction. □

Exercise 1.63 Show that if G is a finite group then it is finitely presented.


Exercise 1.64 Assume that a group G is virtually-Z; that is, there exists a finite index normal subgroup H ◁ G, [G : H] < ∞, such that H ≅ Z. Show that G is finitely presented. B solution C

Exercise 1.65 Let G be an n-step solvable group. Assume that G^{(k)}/G^{(k+1)} is virtually-Z for every 0 ≤ k < n. Show that G is finitely presented. B solution C

Exercise 1.66 Show that Z^d is finitely presented. Show that any finitely generated virtually Abelian group is finitely presented. B solution C

Exercise 1.67 Show that if G is a finitely generated nilpotent group, then G is finitely presented. B solution C

1.5.9 Semi-direct Products

Exercise 1.68 In this exercise, we introduce the notion of semi-direct products. Recall that a direct product of groups G, H is the group whose elements are the pairs G × H and the group operation is given by (g, h)(g′, h′) = (gg′, hh′) for all g, g′ ∈ G and h, h′ ∈ H. Let G, H be groups. Assume that G acts on H by automorphisms. That is, each g ∈ G can be thought of as an automorphism of H. A different way of thinking of this is that there is a homomorphism ρ : G → Aut(H); that is, g.h = (ρ(g))(h) for any g ∈ G and h ∈ H. Define the semi-direct product of G acting on H (with respect to ρ) as the group G ⋉ H (also sometimes denoted H ⋊_ρ G), whose elements are G × H = {(g, h) | g ∈ G, h ∈ H} and where multiplication is defined by
(g, h)(g′, h′) = (gg′, h · g.h′).
Show that this defines a group structure. Determine the identity element in G ⋉ H and the inverse of (g, h). Show that the set {1_G} × H is an isomorphic copy of H sitting as a normal subgroup inside G ⋉ H. Show that (G ⋉ H)/({1_G} × H) ≅ G. B solution C

A useful (but not completely precise) way to think about the semi-direct product G ⋉ H is to think of matrices of the form ( g  h ; 0  1 ), g ∈ G, h ∈ H. This is especially aesthetic when H is Abelian, so that multiplication in H can be


written additively. Indeed, when multiplying two such matrices we have
( g  h ; 0  1 ) · ( g′  h′ ; 0  1 ) = ( gg′  gh′ + h ; 0  1 ),
which is reminiscent of (g, h)(g′, h′) = (gg′, h + gh′). In the non-Abelian case matrix multiplication must be interpreted properly:
( g  h ; 0  1 ) · ( g′  h′ ; 0  1 ) = ( gg′  h · g.h′ ; 0  1 ).
Also, it may be worth pointing out that the notation G ⋉ H hints at which group is acting on which: ⋉ has a small triangle, similar to the symbol ◁, which reminds us that H ≅ {1_G} × H ◁ G ⋉ H.

Exercise 1.69 Let G, H be groups. Define an action ρ : G → Aut(H) of G on H by ρ(g).h = h for all h ∈ H and g ∈ G. Show that G ⋉ H = G × H.

So a semi-direct product generalizes the notion of a direct product of groups.

Exercise 1.70 Recall from Sections 1.5.4 and 1.5.5 the following groups of n × n matrices: for 1 ≤ k ≤ n, the group D_k is the additive group of all upper-triangular n × n real matrices A such that A_{i,j} = 0 for all j ≤ i + k − 1 (so the first k diagonals are 0). Here, Δ_n^+ is the multiplicative group of diagonal matrices with only strictly positive entries on the diagonal. Show that Δ_n^+ acts on D_k by left multiplication. Show that Δ_n^+ ⋉ D_k is 2-step solvable. Show that if Δ_n^+ ⋉ D_k is nilpotent, then k ≥ n. B solution C

Exercise 1.71 Let V be a vector space over C. A map ϕ : V → V is an affine transformation if ϕ(v) = αv + u for some fixed scalar 0 ≠ α ∈ C and fixed vector u (α is called the dilation and u the translation). Let A be the collection of all affine transformations on V. Show that A is a group with multiplication given by composition. Show that A ≅ C* ⋉ V, where C* is the multiplicative group C\{0} and V is considered as an additive group. Is A Abelian? Nilpotent? Solvable? B solution C

Exercise 1.72 The infinite dihedral group is D∞ = ⟨a, b | baba, b²⟩. Let ϕ ∈ Aut(Z) be given by ϕ(x) = −x. Let Z₂ = {−1, 1} be the group on 2 elements (the group operation given by multiplication). Show that D∞ ≅ Z₂ ⋉ Z, where Z₂ acts on Z via ε.x = ε · x for ε ∈ {−1, 1}.


Show that D∞ is not nilpotent. Show that D∞ is 2-step solvable. Show that D∞ is virtually Z.

B solution C
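The semi-direct law above is easy to experiment with. Below is a minimal sketch (the encoding and function names are ours, not the book's) of Z₂ ⋉ Z from Exercise 1.72, with ε ∈ {−1, 1} acting on Z by ε.x = ε·x:

```python
# Elements of Z2 ⋉ Z are pairs (eps, x) with eps in {-1, 1}, x in Z.
def mult(a, b):
    """(eps, x)(eps', x') = (eps*eps', x + eps.x') -- the semi-direct law."""
    (e1, x1), (e2, x2) = a, b
    return (e1 * e2, x1 + e1 * x2)

identity = (1, 0)

def inverse(a):
    e, x = a
    # check: mult((e, x), (e, -e*x)) = (e*e, x + e*(-e*x)) = (1, 0)
    return (e, -e * x)

# Non-commutativity: the reflection (-1, 0) and the translation (1, 1).
r, t = (-1, 0), (1, 1)
assert mult(r, t) != mult(t, r)
assert mult(r, r) == identity          # b^2 = 1
rt = mult(r, t)
assert mult(rt, rt) == identity        # (ba)^2 = 1, i.e. baba = 1
```

The last two assertions recover the relations of the presentation D∞ = ⟨a, b | baba, b²⟩, consistent with the isomorphism D∞ ≅ Z₂ ⋉ Z of Exercise 1.72.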

Exercise 1.73 Consider the following group: Let S_d be the group of permutations on d elements. Let S_d act on Z^d by permuting the coordinates; that is, σ(z_1, ..., z_d) = (z_{σ^{-1}(1)}, ..., z_{σ^{-1}(d)}). Show that this is indeed a left action. Consider the group G = S_d ⋉ Z^d. Show that there exists H ◁ G such that G/H ≅ S_d and H ≅ Z^d. (Specifically, H is Abelian.) Show that G is not Abelian for d > 2. B solution C

1.6 Measures on Groups and Harmonic Functions

1.6.1 Metric and Measure Structures on a Group

Definition 1.6.1 (Cayley graph) Let G be a finitely generated group. Let S ⊂ G be a finite generating set. Assume that S is symmetric; that is, S = S^{-1} := {s^{-1} : s ∈ S}. The Cayley graph of G with respect to S is the graph with vertex set G and edges defined by the relation x ∼ y ⟺ x^{-1}y ∈ S. The distance in this Cayley graph is denoted by dist_S.

Exercise 1.74 Show that distS (x, y) is invariant under the diagonal G-action. That is, distS (gx, gy) = distS (x, y) for any g ∈ G.

Due to this fact, we may denote |x| = |x|_S := dist_S(1, x), so that dist_S(x, y) = |x^{-1}y|. Balls of radius r in this metric are denoted B(x, r) = B_S(x, r) = {y : dist_S(x, y) ≤ r}. Throughout the book, the underlying generating set will be implicit, and we will not specify it explicitly in the notation. If we wish to stress a specific generating set (or, sometimes, a specific group), we will use the notation dist_{G,S}(x, y) = dist_S(x, y) = dist_G(x, y) and B_{G,S}(x, r) = B_S(x, r) = B_G(x, r).

Exercise 1.75 Let S, T be two finite symmetric generating sets of G. Show that there exists a constant κ = κ_{S,T} > 0 such that for all x, y ∈ G,
κ^{-1} · dist_T(x, y) ≤ dist_S(x, y) ≤ κ · dist_T(x, y).


B solution C
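The bi-Lipschitz comparison of Exercise 1.75 can be seen concretely on G = Z, computing word lengths by breadth-first search for S = {±1} and T = {±1, ±2}, where κ = 2 works. A sketch (the function name and the truncation `limit` are our own choices):

```python
from collections import deque

def word_length(x, gens, limit=100):
    """Distance from 0 to x in the Cayley graph of Z w.r.t. the finite
    symmetric generating set `gens`, by breadth-first search.  In Z,
    geodesics stay within a bounded window, so truncating at `limit`
    does not affect distances of the points tested below."""
    if x == 0:
        return 0
    dist = {0: 0}
    q = deque([0])
    while q:
        v = q.popleft()
        for s in gens:
            w = v + s
            if abs(w) <= limit and w not in dist:
                dist[w] = dist[v] + 1
                if w == x:
                    return dist[w]
                q.append(w)
    return None

S = [1, -1]
T = [1, -1, 2, -2]
for x in range(-20, 21):
    dS, dT = word_length(x, S), word_length(x, T)
    assert dS == abs(x)
    assert dT == (abs(x) + 1) // 2          # |x|_T = ceil(|x|/2)
    assert dT <= dS <= 2 * dT               # kappa = 2 in Exercise 1.75
```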

Definition 1.6.2 Let µ be a probability measure on G.

• We say that µ is adapted (to G) if any element x ∈ G can be written as a product x = s_1 ··· s_k, where s_1, ..., s_k ∈ supp(µ).
• µ is symmetric if µ(x) = µ(x^{-1}) for all x ∈ G.
• µ has an exponential tail if for some ε > 0,
E_µ[e^{ε|X|}] = Σ_x µ(x) e^{ε|x|} < ∞.
• We say that µ has kth moment if
E_µ[|X|^k] = Σ_x µ(x) |x|^k < ∞.

By SA(G, k) we denote the collection of symmetric, adapted measures on G with kth moment. By SA(G, ∞) we denote the collection of symmetric, adapted, exponential tail measures on G.

Exercise 1.76 Show that if µ has kth moment with respect to a finite symmetric generating set S, then µ has kth moment with respect to any finite symmetric generating set. Show that if µ has an exponential tail with respect to a finite symmetric generating set S, then µ has an exponential tail with respect to any finite symmetric generating set.

The most basic example of µ ∈ SA(G, ∞) is when µ is the uniform measure on some finite symmetric generating set S of a finitely generated group G.

Exercise 1.77 Show that if µ is a symmetric, adapted measure on G with finite support, then µ ∈ SA(G, ∞).

Exercise 1.78 Show that if µ, ν are symmetric probability measures on G, then pµ + (1 − p)ν is also symmetric for p ∈ (0, 1).

Exercise 1.79 Show that if µ is an adapted probability measure on G and ν is any probability measure on G, then for any p ∈ (0, 1] we have that pµ + (1 − p)ν is also adapted.

Exercise 1.80 Let p ∈ (0, 1). Show that if µ ∈ SA(G, k) then ν = pδ_1 + (1 − p)µ ∈ SA(G, k). (Such a measure ν is called a lazy version of µ.) B solution C


1.6.2 Random Walks

Given a group G with a probability measure µ, define the µ-random walk on G started at x ∈ G as the sequence X_t = xU_1U_2···U_t, where (U_j)_j are i.i.d. with law µ. The probability measure and expectation on G^N (with the canonical cylinder-set σ-algebra) are denoted P_x, E_x. When we omit the subscript x we refer to P = P_1, E = E_1. Note that the law of (X_t)_t under P_x is the same as the law of (xX_t)_t under P. For a probability measure ν on G we denote P_ν = Σ_x ν(x) P_x, and similarly E_ν = Σ_x ν(x) E_x. More precisely, given some probability measure ν on G, we define P_ν to be the measure obtained by Kolmogorov's extension theorem, via the sequence of measures
P_t[(X_0, ..., X_t) = (g_0, ..., g_t)] = ν(g_0) · ∏_{j=1}^t µ(g_{j−1}^{-1} g_j).

Exercise 1.81 Show that P_t above indeed defines a probability measure on F_t = σ(X_0, ..., X_t).
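The product formula for P_t can be checked directly in a small case. The sketch below (our own, for µ uniform on {±1} in Z and ν = δ_0, written additively so that g_{j−1}^{-1} g_j is the increment g_j − g_{j−1}) enumerates all length-t paths:

```python
from itertools import product

mu = {1: 0.5, -1: 0.5}   # step measure on Z
t = 4

total = 0.0
dist_t = {}              # the induced law of X_t
for steps in product(mu, repeat=t):
    path = [0]           # nu = delta_0: the walk starts at g_0 = 0
    p = 1.0
    for s in steps:
        path.append(path[-1] + s)
        p *= mu[s]       # factor mu(g_{j-1}^{-1} g_j)
    total += p
    dist_t[path[-1]] = dist_t.get(path[-1], 0.0) + p

assert abs(total - 1.0) < 1e-12          # P_t is a probability measure
assert abs(dist_t[0] - 6 / 16) < 1e-12   # P[X_4 = 0] = C(4,2)/2^4
```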

Exercise 1.82 Show that the µ-random walk on G is a Markov chain with transition matrix P(x, y) = µ(x^{-1}y). (Markov chains will be defined and studied in Chapter 3. For the unfamiliar reader, this exercise may be skipped in the meantime.) Show that the corresponding Laplacian operator, usually defined ∆ := I − P, and the averaging operator P are given by
P f(x) = f ∗ µ̌(x),   ∆f(x) = f ∗ (δ_1 − µ̌)(x),
where µ̌(y) = µ(y^{-1}).

Exercise 1.83 Consider the matrix P(x, y) = µ(x^{-1}y) from the previous exercise. Show that if P^t is the tth matrix power of P, then E_x[f(X_t)] = P^t f(x).

Exercise 1.84 Let µ be a probability measure on G, and let P(x, y) = µ(x^{-1}y), where µ̌(y) = µ(y^{-1}).

• Show that P^t(1, x) = µ̌^{∗t}(x), where µ̌^{∗t} is the convolution of µ̌ with itself t times.


• Show that µ is adapted if and only if for every x, y ∈ G there exists t ≥ 0 such that P^t(x, y) > 0. (This property is also called irreducible.)
• Show that µ is symmetric if and only if P is a symmetric matrix (if and only if µ̌ = µ).

We will investigate random walks in more depth in Chapter 3.
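The identity P^t(1, x) = µ̌^{∗t}(x) of Exercise 1.84 can be verified numerically. The sketch below does so on the finite cyclic group Z_n — a simplifying assumption of ours for computability; the text's G is a general finitely generated group:

```python
# Transition matrix P(x, y) = mu(x^{-1} y) = mu((y - x) mod n) on Z_n.
n = 7
mu = [0.0] * n
mu[1] = mu[n - 1] = 0.5   # simple random walk step; symmetric, so mu-check = mu

P = [[mu[(y - x) % n] for y in range(n)] for x in range(n)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def convolve(a, b):
    """(a * b)(z) = sum_x a(x) b(x^{-1} z), written additively on Z_n."""
    return [sum(a[x] * b[(z - x) % n] for x in range(n)) for z in range(n)]

t = 5
Pt, conv = P, mu
for _ in range(t - 1):
    Pt = matmul(Pt, P)        # P^t
    conv = convolve(conv, mu)  # mu^{*t}

# Compare the row of the identity element (0 in Z_n) with the t-fold convolution:
assert all(abs(Pt[0][x] - conv[x]) < 1e-12 for x in range(n))
```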

1.6.3 Harmonic Functions

In classical analysis, a function f : R^n → R is harmonic at x if for any small enough ball B(x, r) around x, it satisfies the mean value property:
(1/|∂B(x, r)|) ∫_{∂B(x,r)} f(y) dy = f(x).
Another definition is that ∆f(x) = 0, where ∆ = Σ_j ∂²/∂x_j² is the Laplace operator. (Why these two definitions should coincide is a deep fact, outside the scope of our current discussion.)

Definition 1.6.3 Let G be a finitely generated group and µ a probability measure on G. A function f : G → C is µ-harmonic (or simply, harmonic) at x ∈ G if
Σ_y µ(y) f(xy) = f(x)
and the above sum converges absolutely. A function is harmonic if it is harmonic at every x ∈ G.

Exercise 1.85 Show that f is µ-harmonic at x if and only if E_µ[f(xU)] = f(x), if and only if ∆f(x) = 0. (Here E_µ is expectation with respect to µ, and U is a random element of G with law µ.)

Exercise 1.86 Prove the maximum principle for harmonic functions: Consider an adapted probability measure µ on G. If f is harmonic, and there exists x such that f(x) = sup_y f(y), then f is constant.

Exercise 1.87 (L² harmonic functions) Consider the space ℓ²(G) of functions f : G → C such that Σ_y |f(y)|² < ∞. This space is a Hilbert space with the inner product ⟨f, g⟩ = Σ_y f(y) ḡ(y). Prove the following “integration by parts” identity: for any f, g ∈ ℓ²(G),
Σ_{x,y} P(x, y)(f(x) − f(y))(ḡ(x) − ḡ(y)) = 2⟨∆f, g⟩.
(The left-hand side above is ⟨∇f, ∇g⟩, appropriately interpreted, hence the name


“integration by parts”. This is also sometimes understood as Green’s identity.) Here, as usual, P(x, y) = µ(x^{-1}y) for a symmetric measure µ. Show that any f ∈ ℓ²(G) that is harmonic must be constant. B solution C

Example 1.6.4 Consider the group Z and the measure µ = ½δ_1 + ½δ_{−1}. Suppose that f is a µ-harmonic function. Then, for any z ∈ Z, f(z − 1) + f(z + 1) = 2f(z), which implies that
f(z + 1) = 2f(z) − f(z − 1),
f(z − 1) = 2f(z) − f(z + 1).
So the values of f are determined by the two numbers f(0), f(1). This implies that the space HF(Z, µ) = {f : Z → C : ∆f ≡ 0} of all harmonic functions has dimension at most 2. Moreover, any function f(z) = αz + β is a µ-harmonic function (check this!). Thus, we conclude that HF(Z, µ) is the (2-dimensional) space of all linear maps z ↦ αz + β for α, β ∈ C.

Exercise 1.88 Show that if G = Z and µ is the uniform measure on {−1, 1, −2, 2}, then the space of all µ-harmonic functions has dimension at least 2. Is this dimension finite? B solution C
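Example 1.6.4 is easy to reproduce numerically: extending f from f(0), f(1) by the harmonic recursion yields exactly the linear function with β = f(0), α = f(1) − f(0). A sketch (the helper name is ours):

```python
# mu = (delta_1 + delta_{-1})/2 on Z: harmonicity means
# f(z+1) = 2 f(z) - f(z-1) and f(z-1) = 2 f(z) - f(z+1).
def extend_harmonic(f0, f1, zmax):
    f = {0: f0, 1: f1}
    for z in range(1, zmax):            # extend to the right
        f[z + 1] = 2 * f[z] - f[z - 1]
    for z in range(0, -zmax, -1):       # extend to the left
        f[z - 1] = 2 * f[z] - f[z + 1]
    return f

f0, f1 = 3.0, 5.0
f = extend_harmonic(f0, f1, 10)
alpha, beta = f1 - f0, f0
# The extension agrees with z -> alpha*z + beta everywhere computed:
assert all(abs(f[z] - (alpha * z + beta)) < 1e-9 for z in f)
```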

Exercise 1.89 Consider the group G = Z² and the measure µ, which is uniform on the standard generators {(±1, 0), (0, ±1)}. Show that the functions f(x, y) = x, h(x, y) = y, g(x, y) = x² − y², and k(x, y) = xy are all µ-harmonic. Consider a different measure ν, which is uniform on {(±1, 0), (0, ±1), ±(1, 1)}. Which of the above functions is harmonic with respect to ν? B solution C
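A numerical spot check of the first claim in Exercise 1.89 — not a proof, since harmonicity is only tested on a finite window of points:

```python
# mu uniform on the four standard generators of Z^2: harmonicity at v means
# (1/4) sum_s f(v + s) = f(v).
gens = [(1, 0), (-1, 0), (0, 1), (0, -1)]

def is_harmonic_at(f, v, gens):
    avg = sum(f(v[0] + s[0], v[1] + s[1]) for s in gens) / len(gens)
    return abs(avg - f(*v)) < 1e-9

fns = [lambda x, y: x, lambda x, y: y,
       lambda x, y: x * x - y * y, lambda x, y: x * y]
pts = [(x, y) for x in range(-5, 6) for y in range(-5, 6)]
assert all(is_harmonic_at(f, v, gens) for f in fns for v in pts)
# Rerunning with gens extended by (1, 1) and (-1, -1) (the measure nu in the
# exercise) shows which of the four functions remain harmonic -- try it.
```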

Exercise 1.90 Let G be a finitely generated group. Let µ ∈ SA(G, 1). Show that any homomorphism from G to the additive group (C, +) is a µ-harmonic function. B solution C

Exercise 1.91 Let µ be a symmetric and adapted probability measure on a finitely generated group G. Let p ∈ (0, 1) and let ν = pδ_1 + (1 − p)µ be a lazy version of µ. Show that a function f : G → C is µ-harmonic if and only if it is ν-harmonic. B solution C


1.7 Bounded, Lipschitz, and Polynomial Growth Functions

1.7.1 Bounded Functions

Recall that for f : G → C and p > 0 we have
||f||_p^p = Σ_x |f(x)|^p,   ||f||_∞ = sup_x |f(x)|.
Recall that ||x.f||_p = ||f||_p for all p ∈ (0, ∞].

Exercise 1.92 Show that ||f||_∞ ≤ ||f||_p for any p > 0.

For a finitely generated group G and a probability measure µ on G, we use BHF (G, µ) to denote the set of bounded µ-harmonic functions on G; that is, BHF (G, µ) = { f : G → C : || f ||∞ < ∞ , ∆ f ≡ 0}.

Exercise 1.93 Show that BHF (G, µ) is a vector space over C. Show that it is a G-invariant subspace; that is, G.BHF (G, µ) ⊂ BHF (G, µ).

Any constant function is in BHF (G, µ), so dim BHF (G, µ) ≥ 1. The question of whether BHF (G, µ) consists of more than just constant functions is an important one, and we will dedicate Chapter 6 to this investigation.

1.7.2 Lipschitz Functions

Definition 1.7.1 For a group G and a function f : G → C, define the right-derivative in direction y, ∂_y f : G → C, by ∂_y f(x) = f(xy^{-1}) − f(x). Given a finite symmetric generating set S, define the gradient ∇f = ∇_S f : G → C^S by (∇f(x))_s = ∂_s f(x). We define the Lipschitz semi-norm by
||∇_S f||_∞ := sup_{s ∈ S} sup_{x ∈ G} |∂_s f(x)|.
A function f : G → C is called Lipschitz if ||∇_S f||_∞ < ∞.

Exercise 1.94 Show that for any two symmetric generating sets S_1, S_2, there exists C > 0 such that
||∇_{S_1} f||_∞ ≤ C · ||∇_{S_2} f||_∞.
Conclude that the definition of a Lipschitz function does not depend on the choice of specific generating set.

Exercise 1.95 What is the set {f ∈ C^G : ||∇_S f||_∞ = 0}?

We use LHF(G, µ) to denote the set of Lipschitz µ-harmonic functions; that is, LHF(G, µ) = {f : G → C : ||∇_S f||_∞ < ∞, ∆f ≡ 0}.

Exercise 1.96 Show that LHF(G, µ) is a G-invariant vector space, by showing that for all x ∈ G,
||∇_S x.f||_∞ = ||∇_S f||_∞.

Exercise 1.97 (Horofunctions) Let G be a finitely generated group with a metric given by some fixed finite symmetric generating set S. Consider the space
L = {h : G → C : ||∇_S h||_∞ ≤ 1, h(1) = 0}.
Show that L is compact under the topology of pointwise convergence. Show that x.h(y) = h(x^{-1}y) − h(x^{-1}) defines a left action of G on L. Show that if h is fixed under the G-action (i.e. x.h = h for all x ∈ G), then h is a homomorphism from G into the group (C, +). Show that if h is a homomorphism from G into (C, +), then there exists α > 0 such that αh ∈ L. For every x ∈ G let b_x(y) = dist_S(x, y) − dist_S(x, 1) = |x^{-1}y| − |x|. Show that b_x ∈ L for any x ∈ G. Prove that the map x ↦ b_x from G into L is an injective map. B solution C
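For G = Z with S = {±1}, the horofunctions b_x can be computed explicitly: b_x(y) = |x − y| − |x| (additively, with identity 0). The sketch below (our own) checks that each b_x lies in L and that, pointwise, the b_x stabilize along each end of Z to a homomorphism:

```python
def b(x, y):
    """b_x(y) = dist(x, y) - dist(x, identity) on Z with S = {+1, -1}."""
    return abs(x - y) - abs(x)

# Along x -> +infinity, b_x(y) stabilizes to -y; along x -> -infinity, to +y.
# These pointwise limits are homomorphisms Z -> Z, points of L that are not
# of the form b_x.
for y in range(-10, 11):
    assert b(10**6, y) == -y
    assert b(-10**6, y) == y

# Each b_x is 1-Lipschitz and vanishes at the identity 0, so b_x lies in L:
for x in range(-20, 21):
    assert b(x, 0) == 0
    assert all(abs(b(x, y + 1) - b(x, y)) <= 1 for y in range(-20, 20))
```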

1.7.3 Polynomially Growing Functions

Let S be a finite, symmetric generating set for a group G. For f : G → C and k ≥ 0, define the kth degree polynomial semi-norm by
||f||_{S,k} := limsup_{r→∞} r^{-k} · sup_{|x| ≤ r} |f(x)|.


Let
HF_k(G, µ) = {f ∈ C^G : f is µ-harmonic, ||f||_{S,k} < ∞}.

Exercise 1.98 Show that ||·||_{S,k} is indeed a semi-norm. Show that ||x.f||_{S,k} = ||f||_{S,k}. Show that HF_k(G, µ) is a G-invariant vector space. B solution C

Exercise 1.99 Show that if S, T are two finite symmetric generating sets for G, then there exists some constant C = C(S, T, k) > 0 such that for any f : G → C we have ||f||_{S,k} ≤ C · ||f||_{T,k}. In particular, the space HF_k(G, µ) does not depend on the specific choice of generating set. B solution C

Exercise 1.100 Show that if ||f||_{S,k} < ∞, then there exists C > 0 such that for all x ∈ G we have |f(x)| ≤ C(|x|^k + 1).

Exercise 1.101 Show that
C ≤ BHF(G, µ) ≤ LHF(G, µ) ≤ HF_1(G, µ) ≤ HF_k(G, µ) ≤ HF_{k+1}(G, µ),
for all k ≥ 1 (where C denotes the space of constant functions, and ≤ denotes linear subspace).

1.8 Additional Exercises

Exercise 1.102 Let R be a commutative ring. Define GL_n(R) to be the collection of all n × n matrices M with entries in R such that det(M) is an invertible element in R. Show that GL_n(R) is a group. B solution C

Exercise 1.103 Let I be the n × n identity matrix. Show that {I, −I} ◁ GL_n(Z). Define PGL_n(Z) = GL_n(Z)/{−I, I}. Show that GL_{2n+1}(Z) ≅ {−1, 1} × PGL_{2n+1}(Z). Show that SL_{2n+1}(Z) ≅ PGL_{2n+1}(Z). B solution C


Exercise 1.104 Let S ≤ SL₂(Z) be the Sanov subgroup (see Exercise 1.56 and Remark 1.5.11). Show that if A ∈ S then
A = ( 4k+1  2n ; 2m  4ℓ+1 )
for some integers n, m, k, ℓ ∈ Z. B solution C

Exercise 1.105 Show that the Sanov subgroup is exactly
S = { A = ( 4k+1  2n ; 2m  4ℓ+1 ) : det(A) = 1, k, ℓ, n, m ∈ Z }. B solution C

Exercise 1.106 Show that the Sanov subgroup S has finite index in SL2 (Z). (Hint: use the map taking the matrix entries modulo 4.) B solution C
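The matrix shape claimed in Exercise 1.104 can be spot-checked on random words in the standard Sanov generators ( 1 2 ; 0 1 ) and ( 1 0 ; 2 1 ); the word length and sample count below are arbitrary choices of ours:

```python
import random

def mul(A, B):
    return [[A[0][0]*B[0][0] + A[0][1]*B[1][0], A[0][0]*B[0][1] + A[0][1]*B[1][1]],
            [A[1][0]*B[0][0] + A[1][1]*B[1][0], A[1][0]*B[0][1] + A[1][1]*B[1][1]]]

a     = [[1, 2], [0, 1]]
b     = [[1, 0], [2, 1]]
a_inv = [[1, -2], [0, 1]]
b_inv = [[1, 0], [-2, 1]]

random.seed(0)
for _ in range(200):
    M = [[1, 0], [0, 1]]
    for g in random.choices([a, b, a_inv, b_inv], k=10):
        M = mul(M, g)
    # Shape ( 4k+1  2n ; 2m  4l+1 ): diagonal entries 1 mod 4, off-diagonal even.
    assert M[0][0] % 4 == 1 and M[1][1] % 4 == 1
    assert M[0][1] % 2 == 0 and M[1][0] % 2 == 0
    # And det = 1, consistent with S <= SL_2(Z):
    assert M[0][0]*M[1][1] - M[0][1]*M[1][0] == 1
```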

1.9 Solutions to Exercises

Solution to Exercise 1.4 :( Let G = {A : θ_t(A) ∈ F}. G is easily seen to be a σ-algebra. For any t ≤ n ∈ N and g ∈ G, we have that
θ_t(X_n^{-1}(g)) = {θ_t(ω) : ω_n = g} = X_{n−t}^{-1}(g) ∈ F,
and if t > n ∈ N then θ_t(X_n^{-1}(g)) = G^N ∈ F. So X_n^{-1}(g) ∈ G for all n ∈ N and g ∈ G. This implies that F ⊂ G, which completes the proof. :) X

Solution to Exercise 1.5 :( θ^{-t}G is a σ-algebra because θ^{-t}(G^N) = G^N, θ^{-t}(∪_n A_n) = ∪_n θ^{-t}(A_n), and θ^{-t}(A^c) = (θ^{-t}(A))^c. For any K ∈ K we have that θ^{-t}(K) ∈ θ^{-t}G by definition. So let H be any σ-algebra containing {θ^{-t}(K) : K ∈ K}. Define G′ = {A : θ^{-t}(A) ∈ H}. Then, similarly to the above, it is easy to see that G′ is a σ-algebra. Moreover, K ⊂ G′, so it must be that G ⊂ G′. But then, θ^{-t}G ⊂ θ^{-t}G′ ⊂ H. Since H was any σ-algebra containing {θ^{-t}(K) : K ∈ K}, this implies that θ^{-t}G = σ(θ^{-t}(K) : K ∈ K). :) X

Solution to Exercise 1.6 :( This is immediate from θ^{-1}F = σ(θ^{-1}(X_n(g)) : n ∈ N, g ∈ G) = σ(X_{n+1}(g) : n ∈ N, g ∈ G) ⊂ F. :) X

Solution to Exercise 1.7 :( Note that
θ^{-t}(X_n^{-1}(g)) = {ω : θ_t(ω) ∈ X_n^{-1}(g)} = {ω : ω_{t+n} = g} = X_{t+n}^{-1}(g).
Since F = σ(X_n^{-1}(g) : n ∈ N, g ∈ G), we have that
θ^{-t}F = σ(θ^{-t}(X_n^{-1}(g)) : n ∈ N, g ∈ G) = σ(X_{n+t}^{-1}(g) : n ∈ N, g ∈ G) = σ(X_t, X_{t+1}, ...). :) X

Solution to Exercise 1.16 :( If g, γ ∈ stab(x), then γg.x = γ.x = x, and also g^{-1}.x = g^{-1}g.x = x.


:) X


Solution to Exercise 1.17 :( Let x ∈ X and S = stab (x) . The map of cosets of S into G.x given by gS 7→ g.x is a well-defined bijection. Indeed, if gS = γS then g = γs for some s ∈ S . So g.x = γs.x = γ.x , and the map is well defined. It is obviously surjective, and if g.x = γ.x then γ −1 g ∈ S , so gS = γS , implying that the map is injective as well. :) X Solution to Exercise 1.21 :( Compute:

P[X · Y = z] = Σ_x P[X = x, Y = x^{-1}z] = Σ_x µ(x) ν(x^{-1}z) = (µ ∗ ν)(z). :) X

Solution to Exercise 1.23 :( If a = 0 or b = 0 there is nothing to prove. So assume that a, b > 0. Consider the random variable X that satisfies P[X = a^{1/p}] = p, P[X = b^{1/q}] = q. Then E[log X] = log a + log b. Also, E[X] = p a^{1/p} + q b^{1/q}. Jensen's inequality tells us that E[log X] ≤ log E[X], which results in log(ab) = log a + log b ≤ log(p a^{1/p} + q b^{1/q}). :) X

Solution to Exercise 1.24 :( The proof is by induction on n. For n = 1 there is nothing to prove. For n = 2, this is the “usual” Hölder inequality, which is proved as follows: denote f = f_1, g = f_2, p = p_1, q = p_2 and f̃ = f/||f||_p, g̃ = g/||g||_q. Then,
||fg||_1 = ||f||_p · ||g||_q · Σ_x |f̃(x)| · |g̃(x)| ≤ ||f||_p · ||g||_q · Σ_x ( (1/p)|f̃(x)|^p + (1/q)|g̃(x)|^q )
= ||f||_p · ||g||_q · ( (1/p)||f̃||_p^p + (1/q)||g̃||_q^q ) = ||f||_p · ||g||_q,
where the inequality is just Young's inequality for products: ab ≤ (1/p)a^p + (1/q)b^q. A similar (and simpler) argument proves the case where p = 1, q = ∞.
Now for the induction step, n > 2. Let q_n = p_n/(p_n − 1) and q_j = p_j · (1 − 1/p_n) = p_j/q_n for 1 ≤ j < n. Then,
1/p_n + 1/q_n = 1   and   Σ_{j=1}^{n−1} 1/q_j = (1 − 1/p_n)^{-1} · Σ_{j=1}^{n−1} 1/p_j = 1.
By the induction hypothesis (for n = 2 and n − 1),
||f_1 ··· f_n||_1 ≤ ||f_n||_{p_n} · ||f_1 ··· f_{n−1}||_{q_n} = ||f_n||_{p_n} · || |f_1|^{q_n} ··· |f_{n−1}|^{q_n} ||_1^{1/q_n}
≤ ||f_n||_{p_n} · ( ∏_{j=1}^{n−1} || |f_j|^{q_n} ||_{q_j} )^{1/q_n} = ||f_n||_{p_n} · ∏_{j=1}^{n−1} ||f_j||_{p_j}. :) X

Solution to Exercise 1.25 :( For any x, since 1/r + (r−p)/(pr) + (r−q)/(qr) = 1/p + 1/q − 1/r = 1,
|f ∗ g(x)| ≤ Σ_y |f(y)| |g(y^{-1}x)| = Σ_y ( |f(y)|^p |g(y^{-1}x)|^q )^{1/r} · |f(y)|^{(r−p)/r} · |g(y^{-1}x)|^{(r−q)/r}
= ||f_1 · f_2 · f_3||_1 ≤ ||f_1||_r · ||f_2||_{pr/(r−p)} · ||f_3||_{qr/(r−q)},
where the second inequality is the generalized Hölder inequality with
f_1(y) = ( |f(y)|^p |g(y^{-1}x)|^q )^{1/r},   f_2(y) = |f(y)|^{(r−p)/r},   f_3(y) = |g(y^{-1}x)|^{(r−q)/r}.
Now,
||f_1||_r = ( Σ_y |f(y)|^p |g(y^{-1}x)|^q )^{1/r},
||f_2||_{pr/(r−p)} = ( Σ_y |f(y)|^p )^{(r−p)/(pr)} = ||f||_p^{(r−p)/r},
||f_3||_{qr/(r−q)} = ( Σ_y |g(y^{-1}x)|^q )^{(r−q)/(qr)} = ||x.ǧ||_q^{(r−q)/r} = ||g||_q^{(r−q)/r},
recalling that ǧ(z) = g(z^{-1}) and that ||x.ǧ||_q = ||g||_q. Combining all the above,
||f ∗ g||_r^r = Σ_x |f ∗ g(x)|^r ≤ Σ_{x,y} |f(y)|^p |g(y^{-1}x)|^q · ||f||_p^{r−p} · ||g||_q^{r−q}
= ||f||_p^{r−p} · ||g||_q^{r−q} · Σ_y |f(y)|^p ||y.g||_q^q = ||g||_q^r · ||f||_p^{r−p} · Σ_y |f(y)|^p = ||g||_q^r · ||f||_p^r. :) X

Solution to Exercise 1.26 :( Inverses of invertible matrices with integer entries do not necessarily have integer entries. For example, take M = ( 1 2 ; 2 1 ). The inverse is M^{-1} = (1/3)( −1 2 ; 2 −1 ). :) X

Solution to Exercise 1.27 :( The map A ↦ det(A) is a homomorphism from GL_n(Z) onto {−1, 1}. SL_n(Z) is the kernel of this map. :) X

Solution to Exercise 1.28 :( We use e_1, ..., e_n to denote the standard basis of R^n. For a matrix A we write c_j(A) for the jth column of A, and r_j(A) for the jth row of A. It is easy to see that AE_{i,j} is a matrix with c_k(AE_{i,j}) = 0 for k ≠ j and c_j(AE_{i,j}) = c_i(A). Thus, multiplying A on the right by I + E_{i,j} results in adding c_i(A) to c_j(A). That is,
c_k(A(I + E_{i,j})) = c_k(A) for k ≠ j,   and   c_j(A(I + E_{i,j})) = c_j(A) + c_i(A).
Specifically, (I + E_{i,j})^{-1} = I − E_{i,j}. Applying (I + E_{i,j})^z we see that we can add a z-multiple of column i to column j. By transposing the matrices, we see that
r_k((I + E_{i,j})A) = r_k(A) for k ≠ i,   and   r_i((I + E_{i,j})A) = r_i(A) + r_j(A).
Thus, we can add a multiple of one row to another row. Also, for i ≠ j, set S_{i,j} = (I + E_{i,j})(I − E_{j,i})(I + E_{i,j}). One may compute that
c_k(AS_{i,j}) = c_k(A) for k ∉ {i, j},   c_i(AS_{i,j}) = −c_j(A),   c_j(AS_{i,j}) = c_i(A).

That is, we can swap columns at the price of changing the sign of one of them. Multiplying by S_{i,j} on the left we can also swap rows, changing the sign of one. Denote G_k = ⟨I + E_{i,j} | 1 ≤ i ≠ j ≤ k⟩. We claim by induction on k that for any A ∈ GL_k(Z) there exist M, N ∈ G_k such that, for any diagonal (n − k) × (n − k) matrix D with integer entries, if we consider the n × n matrix A′ = ( A 0 ; 0 D ), we find that


MA′N is a diagonal matrix. The base case, where k = 1, is just the case where A′ is already diagonal, so we may choose M = N = I. So assume 1 < k ≤ n, and let A ∈ GL_k(Z). Let D be any diagonal (n − k) × (n − k) matrix with integer entries, and define A′ = ( A 0 ; 0 D ). By swapping columns and/or rows, we may assume without loss of generality that A′_{k,k} ≠ 0. Now, suppose that A′_{i,k} ≠ 0 for some 1 ≤ i < k. Adding appropriate multiples of r_k(A′) to r_i(A′) and appropriate multiples of r_i(A′) to r_k(A′) sequentially, we arrive at a matrix M ∈ G_k for which (MA′)_{k,k} ≠ 0 and (MA′)_{i,k} = 0. Continuing this way for all 1 ≤ i < k, we find that there exists M ∈ G_k such that (MA′)_{k,k} ≠ 0 and (MA′)_{i,k} = 0 for all 1 ≤ i < n. The same procedure with columns instead of rows yields a matrix N ∈ G_k such that (MA′N)_{k,k} ≠ 0 and (MA′N)_{i,k} = (MA′N)_{k,i} = 0 for all 1 ≤ i < n. Let B be the (k − 1) × (k − 1) matrix given by B_{i,j} = (MA′N)_{i,j} for all 1 ≤ i, j ≤ k − 1. Let D′ be the (n − k + 1) × (n − k + 1) diagonal matrix given by D′_{1,1} = (MA′N)_{k,k} and D′_{1+i,1+i} = (MA′N)_{k+i,k+i} = D_{i,i} for all 1 ≤ i ≤ n − k. We find that MA′N = ( B 0 ; 0 D′ ). Moreover,

det(A) · det(D) = det(A′) = det(MA′N) = det(B) · (MA′N)_{k,k} · det(D),
which implies that det(B) · (MA′N)_{k,k} = det(A). As these are all integers, and |det(A)| = 1, we also find that |det(B)| = 1, so that B ∈ GL_{k−1}(Z). By induction, there exist M′, N′ ∈ G_{k−1} such that M′MA′NN′ is a diagonal matrix. Since G_{k−1} ≤ G_k, we have that M′M, NN′ ∈ G_k, completing the induction step.
Taking k = n in the above induction claim, we see that for any A ∈ GL_n(Z) there exist M, N ∈ G_n such that MAN is a diagonal matrix. Since det(A) = det(MAN), and since MAN has integer entries, we find that a_i := (MAN)_{i,i} ∈ {−1, 1} for all 1 ≤ i ≤ n. Also, det(A) = ∏_{i=1}^n a_i. Now, if A ∈ SL_n(Z), then ∏_{i=1}^n a_i = 1. Let J = {1 ≤ i ≤ n : a_i = −1}. If J ≠ ∅, then since (−1)^{|J|} = ∏_{j∈J} a_j = 1, it must be that |J| ≥ 2. Take any i ≠ j ∈ J and consider the matrix B = S_{j,i}MANS_{i,j}. B is a diagonal matrix, with B_{j,j} = −a_i = 1 and B_{i,i} = −a_j = 1 and B_{k,k} = a_k for all k ∉ {i, j}. Continuing this way, we find some matrices S, T ∈ G_n such that TMANS = I. So A = M^{-1}T^{-1}S^{-1}N^{-1} ∈ G_n, and we are done. :) X

Solution to Exercise 1.31 :( Let G be virtually finitely generated. So there exists H ≤ G, [G : H] < ∞, such that H is finitely generated. Let R ⊂ G be a set of representatives for the cosets of H in G; that is, G = ⊔_{r∈R} Hr, and |R| = [G : H]. Let S be a finite symmetric generating set for H. Let x ∈ G. There are unique y ∈ H and r ∈ R such that x = yr. Since S generates H, there are s_1, ..., s_n ∈ S such that y = s_1···s_n. Thus, x = s_1···s_n·r. This implies that S ∪ R is a finite generating set for G. :) X

Solution to Exercise 1.32 :( Since G is finitely generated, the image ϕ(G) is a finitely generated Abelian group. By Theorem 1.5.2, ϕ(G) ≅ Z^d × F for a finite Abelian group F. If d = 0 then |ϕ(G)| < ∞. So under our assumptions, d > 0.
Since |ϕ(G)| = ∞, there must exist 0 ≠ z ∈ Z^d and f ∈ F such that (z, f) ∈ ϕ(G) ≤ Z^d × F. Since z ≠ 0, there must exist 1 ≤ j ≤ d such that z_j ≠ 0. Let π : Z^d × F → Z be the homomorphism given by π(w, f) = w_j for all w ∈ Z^d and f ∈ F. Then ψ = π ∘ ϕ is a homomorphism from G into Z. Since 0 ≠ z_j ∈ ψ(G), we obtain that z_j Z ≤ ψ(G), implying that |ψ(G)| = ∞. Since ψ(G) ≤ Z, it can only be trivial or isomorphic to Z. Thus, ψ maps G onto the group ψ(G) ≅ Z. :) X

Solution to Exercise 1.33 :( Let π : G → G/[G, G] be the canonical projection. If G/[G, G] is infinite, then π(G) is an infinite Abelian group, so Exercise 1.32 provides a surjective homomorphism onto Z. If on the other hand there exists a surjective homomorphism ϕ : G → Z, then [G, G] ◁ Ker ϕ. Thus, [G : [G, G]] ≥ [G : Ker ϕ] = ∞. :) X

Solution to Exercise 1.34 :( We prove this by induction on n. Note that

[xy, z] = y^{-1}x^{-1}z^{-1}xyz = ([x, z])^y · [y, z],


so if x = s_1···s_m, then for any y ∈ G there exist z_1, ..., z_m such that
[x, y] = ([s_1, y])^{z_1} ··· ([s_m, y])^{z_m}.
Expanding out y in a similar fashion shows that
γ_1(G) = ⟨[s, s′]^x : s, s′ ∈ S, x ∈ G⟩,
proving the claim for n = 1. Assume now that n > 1. Recall that
γ_n(G) = ⟨[x, z] : x ∈ γ_{n−1}(G), z ∈ G⟩.
By induction on n, any x ∈ γ_{n−1}(G) can be written as
x = [s_{1,1}, ..., s_{n−1,1}]^{z_1} ··· [s_{1,m}, ..., s_{n−1,m}]^{z_m}
for s_{i,j} ∈ S and z_j ∈ G. Thus, for any s ∈ S there exist w_1, ..., w_m ∈ G such that
[x, s] = [s_{1,1}, ..., s_{n−1,1}, s]^{w_1} ··· [s_{1,m}, ..., s_{n−1,m}, s]^{w_m}.
Also, for any y = r_1···r_ℓ with r_j ∈ S there exist u_1, ..., u_ℓ such that
[x, y]^{-1} = [y, x] = [r_1, x]^{u_1} ··· [r_ℓ, x]^{u_ℓ}.
All this implies that for any x ∈ γ_{n−1}(G) and any y ∈ G, we can write [x, y] as a finite product of elements of the form [s_1, ..., s_n]^z, where s_j ∈ S and z ∈ G. In other words, this proves the induction step. :) X

Solution to Exercise 1.35 :( This is shown by induction on n. For n = 0 it is immediate that we have ϕ(γ_0(G)) = ϕ(G) = G and ϕ(Z_0(G)) = ϕ({1}) = {1}. For n > 0, note that ϕ([x, y]) = [ϕ(x), ϕ(y)] for all x, y ∈ G. So by induction

ϕ([γn−1 (G), G]) = [ϕ(γn−1 (G)), ϕ(G)] = [γn−1 (G), G] = γn (G). Also by induction, [ϕ(x), y] ∈ Zn−1 (G) for all y ∈ G if and only if ϕ([x, y]) ∈ Zn−1 (G) = ϕ(Zn−1 (G)) for all y ∈ G , which is if and only if [x, y] ∈ Zn−1 (G) for all y ∈ G . So ϕ(Zn (G)) = Zn (G) . This completes the proof by induction. Finally, for any y ∈ G , the map ϕ y (x) = x y is an automorphism of G , so that γn (G) y = γn (G) and Zn (G) y = Zn (G) for all y ∈ G ; that is, these are normal subgroups. :) X Solution to Exercise 1.36 :( Since Zk (G) is a normal subgroup, for any x ∈ Zk (G) and any y ∈ G we have that [x, y] = x −1 x y ∈ Zk (G) . So Zk (G) C Zk+1 (G) . This proves the first assertion. Now, the second assertion we prove by induction on m := n − k . Fix k ≥ 0. The base step is m = 0, which is just Zk (G)/Zk (G) = {1} = Z0 (G/Zk (G)) . For the induction step, let m > 0. Let H = G/Zk (G) and let π : G → H be the canonical projection. Since Zk (G) C Zk+m (G) , it suffices to prove that π(Zk+m (G)) = Zm (H ) . Indeed, we have by induction that for x, y ∈ G ,

[x, y] ∈ Zk+m−1 (G) ⇐⇒ [π (x), π(y)] = π ([x, y]) ∈ Zk+m−1 (G)/Zk (G) = Zm−1 (H ), so

π (Zk+m (G)) = {π (x) : ∀ y ∈ G [x, y] ∈ Zk+m−1 (G) } = {π (x) : ∀ z ∈ H [π(x), z] ∈ Zm−1 (H ) } = Zm (H ), completing the induction step.

:) X

Solution to Exercise 1.37 :( We do this by induction on n. For n = 0 this is obvious. For n > 0, assume that γn = {1}. Then, [γn−1, G] = {1} implies that γn−1 C Z1 . Let H = G/Z1 , and let

https://doi.org/10.1017/9781009128391.003 Published online by Cambridge University Press


π : G → H be the canonical projection. It is easy to verify that for all k ≥ 1, we have π(γ_k) = [π(γ_{k−1}), π(G)] = [γ_{k−1}(H), H] = γ_k(H), so γ_{n−1}(H) = {1}. By induction and a previous exercise,

G/Z1 = H = Zn−1 (H ) = Zn−1 (G/Z1 ) = Zn /Z1 . As Z1 C Zn , this can only happen if G = Zn .

:) X

Solution to Exercise 1.38 :( Again this is by induction, where the base step n = 0 is obvious. Assume for n > 0 that Zn = G . Set H = G/Z1 . Since Zn−1 (H ) = Zn /Z1 = H , we have by induction that γn−1 (H ) = {1}. As before, if π : G → H is the canonical projection, then π(γn−1 ) = γn−1 (H ) = {1}, so γn−1 C Z1 . Thus,

γn = [γn−1, G] C [Z1, G] = {1}.

:) X

Solution to Exercise 1.39 :( Let k ≥ 1. We know that γ_k = ⟨[x, y] : x ∈ γ_{k−1}, y ∈ G⟩. Consider γ_k/γ_{k+1} as a subgroup of G/γ_{k+1}. Note that since [γ_k, G] = γ_{k+1}, we have that γ_k/γ_{k+1} ≤ Z(G/γ_{k+1}). Thus, for any x ∈ γ_{k−1} and y, z ∈ G we get that

[x, yz] = x^{−1} z^{−1} y^{−1} x y z = [x, z] z^{−1} [x, y] z ≡ [x, z] · [x, y]   (mod γ_{k+1}).

Also, if x, y ∈ γ_{k−1} and z ∈ G, then

[xy, z] = y^{−1} x^{−1} z^{−1} x y z = y^{−1} [x, z] z^{−1} y z ≡ [x, z] · [y, z]   (mod γ_{k+1}).

We conclude that if γ_{k−1} = ⟨X⟩ and G = ⟨S⟩, then

γ_k/γ_{k+1} = ⟨[γ_{k+1} x, γ_{k+1} s] : x ∈ X, s ∈ S⟩.

By induction on k, this proves that as long as G is finitely generated, the group γ_k/γ_{k+1} is finitely generated for all k. :) X

Solution to Exercise 1.40 :( G is n-step nilpotent if and only if γ_n(G) = {1} and γ_{n−1}(G) ≠ {1}, which, by Exercises 1.37 and 1.38, holds if and only if Z_n = G and Z_{n−1} ≠ G. The second assertion follows from the fact that Z_{n+1}(G)/Z_1 = Z_n(G/Z_1) and Z_n(G)/Z_1 = Z_{n−1}(G/Z_1). :) X

Solution to Exercise 1.41 :( One verifies that γ_k(G/γ_n) ≤ γ_k/γ_n, so γ_n(G/γ_n) = {1}.

:) X

Solution to Exercise 1.42 :( This follows from γ_n(H) ≤ γ_n(G) for all n, which is easily shown by induction, since for any subgroups A ≤ B ≤ G and C ≤ D ≤ G we have [A, C] ≤ [B, D]. :) X

Solution to Exercise 1.43 :( Let π : G → G/N be the canonical projection. Note that γ_k(G/N) ≤ π(γ_k(G)). So if γ_n(G) = {1} ≤ N, then γ_n(G/N) = {1}. :) X

Solution to Exercise 1.44 :( Let 1 ≤ j ≤ i + k + ℓ − 1 ≤ n. Compute for M ∈ D_k, N ∈ D_ℓ:

(MN)_{i,j} = Σ_{t=1}^{n} M_{i,t} N_{t,j} 1{t ≥ i+k} 1{j ≥ t+ℓ} = 0,

because j − ℓ < i + k. :) X
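A quick numerical sanity check of this band computation (our own illustration, not part of the text; rows and columns are 0-indexed here, but the condition j ≥ i + k is shift-invariant):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_D(n, k):
    """A random integer matrix in D_k: entries vanish unless j >= i + k."""
    M = rng.integers(-5, 6, size=(n, n))
    return M * np.fromfunction(lambda i, j: j >= i + k, (n, n))

def in_D(M, k):
    """Check membership in D_k."""
    n = M.shape[0]
    below = np.fromfunction(lambda i, j: j < i + k, (n, n))
    return bool(np.all(M[below] == 0))

# Exercise 1.44 says the product of D_k and D_l lands in D_{k+l}
n, k, l = 6, 2, 1
for _ in range(100):
    M, N = sample_D(n, k), sample_D(n, l)
    assert in_D(M @ N, k + l)
```

Random sampling of course proves nothing; the point is only that the index bookkeeping above is easy to confirm mechanically.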

Solution to Exercise 1.45 :( If M, N ∈ D k (Z) then

(I + M )(I + N ) = I + M + N + M N ∈ Q n, k


Background

because MN ∈ D_{2k}(Z) ⊂ D_k(Z). Moreover, since D_n contains only the 0 matrix, we have that for any N ∈ D_k(Z) we may choose M = Σ_{j=1}^{n−1} (−N)^j ∈ D_k(Z), and we have that

(I + N)(I + M) = (I + N) · Σ_{j=0}^{n−1} (−N)^j = I,

implying that (I + N)^{−1} = I + M for this choice of M. This proves that Q_{n,k} is a group.

:) X

Solution to Exercise 1.46 :( Let H = H_n(Z). Note that H = Q_{n,1} from the previous exercise, so it is indeed a group. We now show that for 0 ≤ k ≤ n − 1 we have γ_k(H) ⊂ Q_{n,k+1} ⊂ Z_{n−k−1}(H). The case k = 0 is exactly what was shown above. For k > 0, if I + M ∈ H and N ∈ D_k(Z), then

[(I + M), (I + N)] = (I + M)^{−1} (I + N)^{−1} (I + M)(I + N) = (I + M)^{−1} ((I + M) + L + MN) = I + (I + M)^{−1}(L + MN),

where

L = ((I + N)^{−1} − I) M (I + N) = Σ_{j=1}^{n} (−N)^j M (I + N) ∈ D_{k+1}(Z).

Since MN ∈ D_{k+1}(Z) as well, we conclude inductively that γ_k(H) ⊂ Q_{n,k+1}. Also, since D_n only contains the 0 matrix, it is immediate that Q_{n,k+1} ⊂ Z_{n−k−1}(H) holds when k = n − 1. For k < n − 1 and N ∈ D_{k+1}(Z), for any I + M ∈ H, we have seen that [(I + M), (I + N)] ∈ Q_{n,k+2} ⊂ Z_{n−k−2}(H) (inductively). Thus, I + N ∈ Z_{n−k−1}(H) for any N ∈ D_{k+1}(Z), as required. :) X

Solution to Exercise 1.47 :( This follows since if H ≤ G then [G, H] ≤ [G, G]. So for any group G we have that G^{(n)} ≤ γ_n(G), inductively.

:) X
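The inclusion γ_k(H) ⊆ Q_{n,k+1} can be sanity-checked numerically for small n (our own illustration; random sampling, not a proof):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4  # unipotent upper-triangular n x n integer matrices: H = Q_{n,1}

def sample_Q(k):
    """A random element I + N with N in D_k(Z)."""
    N = rng.integers(-3, 4, size=(n, n)) * np.fromfunction(lambda i, j: j >= i + k, (n, n))
    return np.eye(n, dtype=int) + N

def comm(A, B):
    # [A, B] = A^{-1} B^{-1} A B; inverses of unipotent integer matrices
    # are integral, so rounding recovers them exactly
    Ai = np.round(np.linalg.inv(A)).astype(int)
    Bi = np.round(np.linalg.inv(B)).astype(int)
    return Ai @ Bi @ A @ B

def in_Q(A, k):
    N = A - np.eye(n, dtype=int)
    return bool(np.all(N[np.fromfunction(lambda i, j: j < i + k, (n, n))] == 0))

# commutators with H push the band index up by one: [Q_{n,1}, Q_{n,k}] <= Q_{n,k+1}
for k in range(1, n):
    for _ in range(50):
        assert in_Q(comm(sample_Q(1), sample_Q(k)), k + 1)
```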

Solution to Exercise 1.48 :( If G is 2-step solvable then [G^{(1)}, G^{(1)}] = G^{(2)} = {1}.

:) X

Solution to Exercise 1.49 :( This follows since (G^{(n)})^{(k)} = G^{(n+k)}.

:) X

Solution to Exercise 1.50 :( There exists n such that G^{(n)} ≠ {1} = G^{(n+1)}. We prove this by induction on n. If n = 0 then G is infinite Abelian, in which case [G, G] = {1}. For n > 0, let H = G/G^{(n)}. We have that H^{(n)} = {1}, so by induction [H : [H, H]] = ∞. Also, [H, H] = G^{(1)}/G^{(n)}, so [G : [G, G]] = [H : [H, H]] = ∞, completing the induction. :) X

Solution to Exercise 1.51 :( This follows from H^{(n)} ≤ G^{(n)}, which can be easily shown inductively.

:) X

Solution to Exercise 1.52 :( For A, B ∈ ∆⁺_n we have that

(AB)_{i,j} = Σ_{ℓ=1}^{n} A_{i,ℓ} B_{ℓ,j} = 1{i=j} A_{i,i} B_{i,i}.

This immediately shows that AB = BA. Also, since A_{i,i} > 0 for all i, we can choose B_{i,i} = 1/A_{i,i} to get AB = I, so B = A^{−1}.


:) X


1.9 Solutions to Exercises

Solution to Exercise 1.53 :( For T + M, S + N ∈ P_{n,k} we have (T + M)(S + N) = TS + TN + MS + MN. Since TN, MS, MN ∈ D_k, we get that P_{n,k} is closed under matrix multiplication. Choosing M = Σ_{j=1}^{n} (−S^{−1}N)^j · S^{−1} and T = S^{−1} will give us that

(T + M)(S + N) = (Σ_{j=0}^{n} (−S^{−1}N)^j) · S^{−1}(S + N) = (Σ_{j=0}^{n} (−S^{−1}N)^j)(I + S^{−1}N) = I,

so (S + N)^{−1} = S^{−1} + M for this choice of M. Now consider the map ϕ : P_{n,k} → ∆⁺_n given by ϕ(T + M) = T. One easily checks that this is a surjective homomorphism, and that Ker ϕ = {I + N : N ∈ D_k}. Since ∆⁺_n is an Abelian group, it must be that [P_{n,k}, P_{n,k}] ≤ Ker ϕ. As in Exercise 1.46, we compute commutators: for any M ∈ D_ℓ, N ∈ D_k we have [(I + M), (I + N)] = I + (I + M)^{−1}(L + MN), where

L = Σ_{j=1}^{n} (−N)^j M (I + N) ∈ D_{ℓ+k},

and also MN ∈ D_{ℓ+k} (by Exercise 1.44). This implies inductively that

(P_{n,k})^{(ℓ+1)} = ([P_{n,k}, P_{n,k}])^{(ℓ)} ≤ {I + N : N ∈ D_{2^ℓ k}}

for all ℓ ≥ 0. Since D_n contains only the 0 matrix, P_{n,k} is solvable of step at most ⌈log₂(n/k)⌉ + 1.

Finally, to show that P_{n,k} is not nilpotent, we will show that Z_1(P_{n,k}) = {1}, which implies that Z_ℓ(P_{n,k}) = {1} for all ℓ ≥ 0. Indeed,

(T + M)(S + N) − (S + N)(T + M) = TN − NT + MS − SM + MN − NM.

Suppose S + N ∈ Z_1(P_{n,k}). By choosing M ∈ D_{n−1}, that is, M_{i,j} = 1{i=1, j=n}, we have that NM = MN = 0, and an easy computation gives

MS − SM = (S_{n,n} − S_{1,1}) · M.

Also, there exist t, s such that N_{t,s} ≠ 0; necessarily s > t. We choose T diagonal with T_{i,i} = α · 1{i=t} + 1{i≠t} for some α > 0, so that (TN − NT)_{i,j} = (α − 1)(1{i=t} − 1{j=t}) N_{i,j}. Hence

((T + M)(S + N) − (S + N)(T + M))_{t,s} = (α − 1)N_{t,s} + (S_{n,n} − S_{1,1}) M_{t,s}.

Since we can choose α > 0 such that this is nonzero, we find that S + N does not commute with T + M in this case. :) X

Solution to Exercise 1.54 :( It is easy to compute that

((T + M )(S + N ) − (S + N )(T + M ))t, s = αNt, s + Sn, n − S1,1 . Since we can choose α > 0 such that this is nonzero, we find that S + N does not commute with T + M in this case. :) X Solution to Exercise 1.54 :( It is easy to compute that ωz d 0 1

f

so that

f

ωz d 0 1

For d =

f

g −1

ω −z −ω −z d 0 1 k k=0 a k ω where a0,

=

f

g

Pr −1

ωz d 0 1

g

=

f

1 d 0 1

g

·

f

ω 0 0 1

g f w · ω0

g

f

=

ω z+w ω z c+d 0 1

,

showing that G is a group.

g z

=

r −1 Y

1 ak ω k 0 1

implying that G is generated by the finite set Df g f G = ω0 01 , 10

ωk 1

·

f

ω 0 0 1

g z

=

r −1 f Y

1 ωk 0 1

g ak

·

f

ω 0 0 1

g z

,

k=0

g

E : 0 ≤ k ≤ r −1 .

Computing commutators we see that g f w g g f −z−w g f z+w −ω−z−w c−ω −z d ω , ω0 c1 = ω 0 0 1

ωz d 0 1

g

. . . , ar −1 ∈ Z, and z ∈ Z, we have that

k=0

ff

c 1

ω z c+d 1

g

=

https://doi.org/10.1017/9781009128391.003 Published online by Cambridge University Press

f

1 ω −z−w ((ω z −1)c−(ω w −1)d) 0 1

g

.

42

Background

As above, this shows that G is 2-step solvable, but not nilpotent, since Z 1f (G) =g {1}. z However, consider the map ϕ : G → {0, 1, . . . , r − 1} given by ϕ ω0 d1 = z (mod r ) . This is easily f z g seen to be a well-defined surjective homomorphism, so [G : Kerϕ] = r . Moreover, ω0 d1 ∈ Kerϕ if and only if z = 0 (mod r ) . Thus (f g ) 1 d : d ∈ D , Kerϕ = 0 1 :) X

which is an Abelian group of finite index in G .

Solution to Exercise 1.55 :( Let S = {a, b}. Define ϕ : F_S → H by ϕ(a) = a and ϕ(b) = b, extending in the canonical way to words in F_S. This is a surjective homomorphism, and we want to show that it is injective as well.

Step I. Let h = a^{z_1} b^{w_1} · · · a^{z_n} be an element in H such that z_n, z_k, w_k ∈ Z ∖ {0} for all 1 ≤ k ≤ n − 1. For any x ∈ B,

h.x = a^{z_1} b^{w_1} · · · a^{z_n}.x ∈ a^{z_1} b^{w_1} · · · a^{z_{n−1}} b^{w_{n−1}}(A) ⊂ A,

so it is impossible that h.x = x, implying that h ≠ 1.

Step II. Now consider a general element h = a^{z_1} b^{w_1} · · · a^{z_n} b^{w_n}, where w_1, z_n, z_k, w_k ∈ Z ∖ {0} for 2 ≤ k ≤ n − 1, but possibly z_1 = 0 or w_n = 0. In this case we can define:

g = a^{−z_n} h a^{z_n}   if z_1 = w_n = 0,
g = a^{−1} h a           if z_1 = 0 ≠ w_n,
g = h                    if z_1 ≠ 0 = w_n.

We see that in each of the above cases, the element g falls into the conditions of Step I, so g ≠ 1. Since in every case g is a conjugate of h, also h ≠ 1. :) X

Solution to Exercise 1.56 :(

SL_2(Z) acts on Z². Let A = {(x, y) ∈ Z² : |y| < |x|} and B = {(x, y) ∈ Z² : |x| < |y|}. Note that

a^z = [ 1  2z ; 0  1 ]   and   b^z = [ 1  0 ; 2z  1 ]

for any z ∈ Z. We have that a^z(x, y) = (x + 2zy, y). So if (x, y) ∈ B, since |y| > |x| we get that

|x + 2zy| ≥ 2|z| |y| − |x| > (2|z| − 1) |y| ≥ |y|

if z ≠ 0. So a^z(x, y) ∈ A for all z ≠ 0 and (x, y) ∈ B. Similarly, if (x, y) ∈ A then

|2zx + y| ≥ 2|z| |x| − |y| > (2|z| − 1) |x| ≥ |x|,

so b^z(x, y) ∈ B for all z ≠ 0 and (x, y) ∈ A. This implies that ⟨a, b⟩ is isomorphic to F_2 by the Ping-Pong Lemma. :) X
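The two inclusions feeding the Ping-Pong Lemma are easy to spot-check on a grid of sample vectors (our own illustration, not part of the text):

```python
def a_pow(z, v):
    """a^z = [1 2z; 0 1] acting on (x, y)."""
    x, y = v
    return (x + 2 * z * y, y)

def b_pow(z, v):
    """b^z = [1 0; 2z 1] acting on (x, y)."""
    x, y = v
    return (x, 2 * z * x + y)

def in_A(v):
    return abs(v[1]) < abs(v[0])

def in_B(v):
    return abs(v[0]) < abs(v[1])

for x in range(-10, 11):
    for y in range(-10, 11):
        for z in (-3, -2, -1, 1, 2, 3):
            if in_B((x, y)):
                assert in_A(a_pow(z, (x, y)))  # a^z maps B into A
            if in_A((x, y)):
                assert in_B(b_pow(z, (x, y)))  # b^z maps A into B
```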

Solution to Exercise 1.59 :( It was shown in Exercise 1.58 that a = x² = (−st)² = stst and b = y² = (−s²t)² = s²ts²t. Now, let z ∈ SL_2(Z). By Exercise 1.58, there exist n ≥ 0 and ε_1, . . . , ε_n ∈ {−1, 1} and α, β ∈ {0, 1} such that z ≡ t^α s^{ε_1} t s^{ε_2} · · · t s^{ε_n} t^β (mod {−I, I}). Choose a minimal n = n(z) as above. We prove the assertion that there exist w ∈ S and p ∈ {1, s, s², t, st, s²t} such that z ≡ wp (mod {−I, I}) by induction on n.

The base case is n(z) = 0, for which z ≡ t^{α+β} (mod {−I, I}) for some α, β ∈ {0, 1}. In all cases one sees that the assertion holds with w = 1 and p ∈ {1, t}.

For the induction step, we have that z ≡ t^α s^{ε_1} t s^{ε_2} · · · t s^{ε_n} t^β (mod {−I, I}) and n ≥ 1. Set z̃ = t^α s^{ε_1} t s^{ε_2} · · · s^{ε_{n−1}} t. By induction, there exist w̃ ∈ S and p̃ ∈ {1, s, s², t, st, s²t} such that z̃ ≡ w̃ p̃ (mod {−I, I}). Note that modulo {−I, I},

p̃ s^{−1} t ≡ s²t                      if p̃ = 1,
p̃ s^{−1} t ≡ t                        if p̃ = s,
p̃ s^{−1} t ≡ st                       if p̃ = s²,
p̃ s^{−1} t = t s^{−1} t ≡ a^{−1} · s      if p̃ = t,
p̃ s^{−1} t = s t s^{−1} t ≡ a b^{−1} · s²  if p̃ = st,
p̃ s^{−1} t = s² t s^{−1} t ≡ b · 1        if p̃ = s²t,

and

p̃ s t ≡ st                       if p̃ = 1,
p̃ s t ≡ s²t                      if p̃ = s,
p̃ s t ≡ t                        if p̃ = s²,
p̃ s t = t s t ≡ b^{−1} · s²         if p̃ = t,
p̃ s t = s t s t = a · 1             if p̃ = st,
p̃ s t = s² t s t ≡ b a^{−1} · s      if p̃ = s²t,

which completes the induction step. This immediately shows that the number of cosets of π(S) is at most 6, that is, [PSL_2(Z) : π(S)] ≤ 6. Finally, we also have that for any z ∈ SL_2(Z) there exist w ∈ S and p ∈ {1, s, s², t, st, s²t} such that π(z) = π(wp). This implies that for some ε ∈ {−1, 1} we have that z = εwp. Hence, there are at most 12 cosets for S in SL_2(Z); that is, [SL_2(Z) : S] ≤ 12. :) X

Solution to Exercise 1.60 :( For every x ∈ G there are unique elements y_x ∈ H and t_x ∈ T such that x = y_x t_x. For any u ∈ T and s ∈ S one has that us(t_{us})^{−1} = y_{us} ∈ TST^{−1} ∩ H. We will show that {y_{us} : u ∈ T, s ∈ S} generates H. To this end, fix some x ∈ G and write x = s_1 · · · s_n for s_j ∈ S. Define inductively u_1 = s_1 and u_{k+1} = t_{u_k} s_{k+1}. Then,

x = y_{s_1} t_{s_1} s_2 · · · s_n = y_{u_1} y_{u_2} t_{u_2} s_3 · · · s_n = · · · = y_{u_1} y_{u_2} · · · y_{u_n} t_{u_n}.

Note that u_j ∈ TS, so y_{u_j} ∈ TST^{−1} ∩ H. In particular, if x ∈ H then it must be that t_{u_n} = 1 and x = y_{u_1} · · · y_{u_n}. :) X

Solution to Exercise 1.62 :( For any x ∈ G, we can write π(x) = π(s_1) · · · π(s_n) for some s_j ∈ S. Thus, there exists h ∈ H such that x = h s_1 · · · s_n. Writing h = u_1 · · · u_m for u_i ∈ U, we have that U ∪ S generates G. :) X

Solution to Exercise 1.63 :( Let G = {g_1, . . . , g_n}. Let F = F_n be the free group on n generators, and denote the generators by {s_1, . . . , s_n}. Consider the homomorphism ϕ : F → G defined by setting ϕ(s_j) = g_j. For every 1 ≤ i, j ≤ n there exists 1 ≤ k = k(i, j) ≤ n such that g_i g_j = g_k. Define the relation r_{i,j} = s_i s_j (s_k)^{−1} for k = k(i, j). Let K = Ker ϕ and let R ⊴ F be the smallest normal subgroup containing {r_{i,j} : 1 ≤ i, j ≤ n}. Note that R ≤ K. Let π : F/R → G be the homomorphism defined by π(Rx) = ϕ(x); this is well defined because R ≤ K. The relations imply that every element of F/R is of the form Rs_i (note that if g_e = 1 then r_{e,e} = s_e s_e s_e^{−1} = s_e ∈ R, and if g_k = g_i^{−1} then Rs_i^{−1} = Rs_k). So F/R and K/R are finite groups, with |F/R| ≤ n = |G|. Since (F/R)/(K/R) ≅ F/K ≅ G, we have that |G| · |K/R| = |F/R| ≤ |G|, which can only mean that K = R. Hence G = ⟨s_1, . . . , s_n | r_{i,j}, 1 ≤ i, j ≤ n⟩ is a finitely presented group. :) X

Solution to Exercise 1.64 :( Z is finitely presented, as it is just the free group on 1 generator. Since G/H is finite, it is also finitely presented. Thus, G is finitely presented by Lemma 1.5.14. :) X

Solution to Exercise 1.65 :( This follows directly from Theorem 1.5.15 and the fact that virtually-Z groups are finitely presented.

:) X

Solution to Exercise 1.66 :( If e_1, . . . , e_d are the standard basis vectors spanning Z^d, then defining H_k = ⟨e_1, . . . , e_{d−k}⟩ for 0 ≤ k < d, and H_d = {1}, we have that H_{k+1} ⊴ H_k and H_k/H_{k+1} ≅ Z for all 0 ≤ k < d. Thus Z^d is finitely presented by Theorem 1.5.15. If G is a finitely generated virtually Abelian group, then by Theorem 1.5.2 there exists a normal subgroup N ⊴ G such that N ≅ Z^d and G/N ≅ F for some d and some finite group F. Since both N and F are finitely presented, so is G by Lemma 1.5.14. :) X

Solution to Exercise 1.67 :( Assume that G is n-step nilpotent. We prove the claim by induction on n. If n = 1 then G is Abelian, and since it was assumed to be finitely generated, G is finitely presented, completing the induction base. For n > 1, consider the lower central series G = γ_0 ▷ γ_1 ▷ · · · ▷ γ_n = {1}. Consider the group H = γ_{n−1}. Since [G, H] = {1}, we have that H is Abelian. By Exercise 1.39, H = γ_{n−1}/γ_n is finitely generated. Thus, H is finitely presented. Also, G/H is at most (n − 1)-step nilpotent and finitely generated, so G/H is finitely presented by induction. Thus, G is also finitely presented, completing the induction step. :) X


Solution to Exercise 1.68 :( Multiplication is associative since

((g, h)(g′, h′))(g″, h″) = (gg′, h·g(h′))(g″, h″) = (gg′g″, h·g(h′)·gg′(h″)),
(g, h)((g′, h′)(g″, h″)) = (g, h)(g′g″, h′·g′(h″)) = (gg′g″, h·g(h′)·gg′(h″)).

The identity is easily seen to be (1_G, 1_H). Inverses are given by (g, h)^{−1} = (g^{−1}, g^{−1}(h^{−1})). The map (g, h) ↦ g is a homomorphism onto G with kernel {(1, h) | h ∈ H}, which is isomorphic to H. :) X

Solution to Exercise 1.70 :( Since TM ∈ D_k for all T ∈ ∆⁺_n and M ∈ D_k, it is obvious that ∆⁺_n acts on the set D_k. Also, T(M + N) = TM + TN, so this action is indeed by group automorphisms (recall that D_k has an additive operation). For T ∈ ∆⁺_n, M ∈ D_k define a 2n × 2n matrix by Ψ(T, M) = [ T  M ; 0  I ]. Multiplying two such matrices by blocks gives Ψ(T, M)Ψ(S, N) = Ψ(TS, TN + M). This immediately leads to the conclusion that ∆⁺_n ⋉ D_k ≅ G := {Ψ(T, M) : T ∈ ∆⁺_n, M ∈ D_k}. Now, since Ψ(T, M)^{−1} = Ψ(T^{−1}, −T^{−1}M), we have that

[Ψ(T, M), Ψ(S, N)] = Ψ(T^{−1}S^{−1}, −T^{−1}S^{−1}N − T^{−1}M) Ψ(TS, TN + M) = Ψ(I, T^{−1}S^{−1}(TN + M) − T^{−1}S^{−1}N − T^{−1}M) = Ψ(I, (I − T^{−1})S^{−1}N − (I − S^{−1})T^{−1}M).

Thus, G^{(1)} ⊂ {Ψ(I, M) : M ∈ D_k}. However, computing the commutator again (when S = T = I) we get that G^{(2)} = {I}, so G is 2-step solvable. If k ≥ n then D_k contains only the 0 matrix, so ∆⁺_n ⋉ D_k ≅ ∆⁺_n, which is Abelian.

To show that G is not nilpotent when k < n, we first compute the center Z = Z_1(G). If Ψ(S, N) ∈ Z, then the commutator computation above implies that (T − I)N = (S − I)M for all T ∈ ∆⁺_n, M ∈ D_k. Choosing T = I and M_{i,j} = 1{j ≥ i+k}, we get that S_{j,j} = 1 for all j ≤ n − k. Thus, (S − I)M = 0 for any M ∈ D_k. This leads to (T − I)N = 0 for all T ∈ ∆⁺_n, which cannot hold unless N = 0. We conclude that

Z = {Ψ(S, 0) : S_{j,j} = 1 for all j ≤ n − k}.

Now, we compute the second center Z_2 = Z_2(G) = {x ∈ G : ∀ y ∈ G, [x, y] ∈ Z_1(G)}. Using the commutator formula above, and noting that a commutator Ψ(I, X) lies in Z if and only if X = 0, we see that if Ψ(S, N) ∈ Z_2, then again (T − I)N = (S − I)M for all T ∈ ∆⁺_n, M ∈ D_k, which leads to N = 0 and S_{j,j} = 1 for all j ≤ n − k, as before. But then we get that Z_2 = Z, so the upper central series stabilizes at Z, and G cannot be nilpotent. :) X

Solution to Exercise 1.71 :( For α ≠ 0 and u ∈ V, denote the transformation v ↦ αv + u by the “matrix” [ α  u ; 0  1 ]. (If V is finite dimensional, then this is an actual (dim V + 1) × (dim V + 1) matrix.) One sees that the usual matrix multiplication provides us with composition of transformations:

[ α  u ; 0  1 ] · [ β  v ; 0  1 ] = [ αβ  αv + u ; 0  1 ].

The inverse transformation is given by

[ α  u ; 0  1 ]^{−1} = [ α^{−1}  −α^{−1}u ; 0  1 ].

This provides the group structure for the affine transformations of V. In fact, note that the multiplicative group C* = C ∖ {0} acts on the additive group V, so the collection of affine transformations is just C* ⋉ V. It is now straightforward to compute commutators:

[ [ α  u ; 0  1 ], [ β  v ; 0  1 ] ] = [ α^{−1}β^{−1}  −α^{−1}β^{−1}v − α^{−1}u ; 0  1 ] · [ αβ  αv + u ; 0  1 ] = [ 1  α^{−1}β^{−1}((α − 1)v − (β − 1)u) ; 0  1 ].

Just as before, one sees that C* ⋉ V is 2-step solvable, if V ≠ {0}. Also, if [ α  u ; 0  1 ] ∈ Z = Z_1(C* ⋉ V), then (α − 1)v = (β − 1)u for all β ∈ C*, v ∈ V. If V ≠ {0}, this is only possible if u = 0 and α = 1. Hence Z = {1}. That is, the only case where C* ⋉ V is nilpotent is when V = {0} and C* ⋉ V ≅ C*, which is Abelian. :) X


Solution to Exercise 1.72 :( First, note that the collection {a^x b^η : η ∈ {0, 1}, x ∈ Z} forms a subgroup of D_∞. Since the generators a, b of D_∞ are contained in this subgroup, we get that any element of D_∞ is of the form a^x b^η for some η ∈ {0, 1} and x ∈ Z. It is not difficult to verify that the map (ε, x) ↦ a^x b^{(1−ε)/2} is a surjective homomorphism with trivial kernel. Now,

[(ε, x), (δ, y)] = (ε, −εx)(δ, −δy)(ε, x)(δ, y) = (εδ, −εx − εδy)(εδ, x + εy) = (ε²δ², −εx − εδy + εδx + ε²δy) = (1, ε(δ − 1)x + δ(1 − ε)y).

Thus [(1, x), (−1, 0)] = (1, −2x) and [(−1, x), (1, 1)] = (1, 2). Hence Z(Z_2 ⋉ Z) = {(1, 0)}, so D_∞ ≅ Z_2 ⋉ Z is not nilpotent. Also, the above commutator calculation shows that [(1, x), (1, y)] = (1, 0), so D_∞ ≅ Z_2 ⋉ Z is 2-step solvable. Finally, the surjective homomorphism (ε, x) ↦ ε shows that H = {(1, x) : x ∈ Z} is a normal subgroup of Z_2 ⋉ Z isomorphic to Z, and of index 2 because (Z_2 ⋉ Z)/H ≅ Z_2. :) X

Solution to Exercise 1.73 :( The group structure is easy to verify. The identity in G is 1_G = (1_{S_d}, 0) and (σ, z)^{−1} = (σ^{−1}, −σ^{−1}z). Now, note that

(σ, z)^{(τ, w)} = (τ, w)^{−1}(σ, z)(τ, w) = (τ^{−1}, −τ^{−1}w)(στ, z + σw) = (σ^τ, τ^{−1}z + σ^τ τ^{−1}w − τ^{−1}w).

Let H = {(1_{S_d}, z) : z ∈ Z^d}. Then it is immediate that H ≅ Z^d, and from the above, (1_{S_d}, z)^{(τ, w)} = (1_{S_d}, τ^{−1}z), so H ⊴ G. Also, the map π : G → S_d given by (σ, z) ↦ σ is a homomorphism with kernel H. So G/H ≅ S_d. Finally,

[(σ, z), (τ, w)] = (σ^{−1}, −σ^{−1}z)(σ^τ, τ^{−1}z + σ^τ τ^{−1}w − τ^{−1}w) = ([σ, τ], −σ^{−1}z + σ^{−1}τ^{−1}z + σ^{−1}σ^τ τ^{−1}w − σ^{−1}τ^{−1}w).

Since S_d is non-Abelian (for d > 2), we may find σ, τ ∈ S_d such that [σ, τ] ≠ 1_{S_d}.

:) X

Solution to Exercise 1.75 :( It suffices to show only one inequality, as the other will follow by reversing the roles of S, T. For any t ∈ T let s_{t,1}, . . . , s_{t,n(t)} ∈ S be such that s_{t,1} · · · s_{t,n(t)} = t and n(t) = |t|_S = dist_S(1, t). Let κ = max_{t∈T} n(t). Now, for any x ∈ G let t_1, . . . , t_m ∈ T be such that t_1 · · · t_m = x and m = |x|_T = dist_T(1, x). Then,

x = t_1 · · · t_m = s_{t_1,1} · · · s_{t_1,n(t_1)} · s_{t_2,1} · · · s_{t_2,n(t_2)} · · · s_{t_m,1} · · · s_{t_m,n(t_m)},

so

|x|_S ≤ Σ_{j=1}^{m} n(t_j) ≤ κ · m = κ · |x|_T.

Hence, for general x, y ∈ G we have that

dist_S(x, y) = |x^{−1}y|_S ≤ κ · |x^{−1}y|_T = κ · dist_T(x, y).

:) X
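As an illustration of the argument (our own sketch, not from the text), take G = Z with S = {±1} and T = {±2, ±3}; here κ = max_{t∈T} |t|_S = 3, and word lengths can be computed by breadth-first search:

```python
from collections import deque

def word_lengths(generators, radius):
    """BFS word lengths |x|_S in the group Z for a symmetric generating set,
    restricted to |x| <= radius."""
    dist = {0: 0}
    queue = deque([0])
    while queue:
        x = queue.popleft()
        for s in generators:
            y = x + s
            if abs(y) <= radius and y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist

dS = word_lengths([1, -1], 50)
dT = word_lengths([2, -2, 3, -3], 50)
# every T-generator has S-length at most 3, so |x|_S <= 3 |x|_T
assert all(dS[x] <= 3 * dT[x] for x in dT)
```

The truncation to |x| ≤ radius can only overestimate the T-lengths, so the asserted inequality is unaffected.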

Solution to Exercise 1.80 :( Symmetry and adaptedness of ν follow from the previous exercises. Let U be a random element of law µ, and let V = U with probability 1 − p, and V = 1 with probability p. Then,

E[|V|^k] = (1 − p) E[|U|^k] < ∞,

implying that ν ∈ SA(G, k).


:) X


Solution to Exercise 1.87 :( Compute, using the symmetry of P:

2⟨∆f, g⟩ = 2 Σ_x ∆f(x) ḡ(x) = 2 Σ_x Σ_y P(x, y)(f(x) − f(y)) ḡ(x)
  = Σ_{x,y} P(x, y)(f(x) − f(y)) ḡ(x) + Σ_{y,x} P(y, x)(f(y) − f(x)) ḡ(y)
  = Σ_{x,y} P(x, y)(f(x) − f(y))(ḡ(x) − ḡ(y)).

We have used that f, g ∈ ℓ², so that the above sums converge absolutely, and so can be summed together. Thus, if f is ℓ² and harmonic, we have that

Σ_{x,y} P(x, y) |f(x) − f(y)|² = 2⟨∆f, f⟩ = 0.

Thus, |f(x) − f(y)|² = 0 for all x, y such that P(x, y) > 0. Since P is irreducible (i.e. µ is adapted), this implies that f is constant. :) X
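The summation-by-parts identity behind this computation is easy to test on a finite symmetric kernel (a numerical illustration of ours, not from the book; real-valued functions, so the conjugates play no role):

```python
import numpy as np

# Simple random walk on the cycle Z/6Z: a symmetric, irreducible kernel P.
n = 6
P = np.zeros((n, n))
for x in range(n):
    P[x, (x + 1) % n] = P[x, (x - 1) % n] = 0.5

rng = np.random.default_rng(2)
f = rng.normal(size=n)
g = rng.normal(size=n)

# Delta f = f - P f, i.e. Delta f(x) = sum_y P(x, y)(f(x) - f(y))
Df = f - P @ f

lhs = 2 * np.dot(Df, g)
rhs = sum(P[x, y] * (f[x] - f[y]) * (g[x] - g[y])
          for x in range(n) for y in range(n))
assert abs(lhs - rhs) < 1e-10  # 2 <Delta f, g> equals the Dirichlet form
```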

f(z + 2) = 4f(z) − f(z − 1) − f(z + 1) − f(z − 2),
f(z − 2) = 4f(z) − f(z − 1) − f(z + 1) − f(z + 2).

So if f(z − 1) = f(z) = f(z + 1) = f(z + 2) = 0 for some z, then f is identically 0.

:) X

Solution to Exercise 1.89 :( It is easy to verify µ-harmonicity. As for ν, one may check that f, h are ν-harmonic. Also,

g ∗ ν̌(x, y) = (1/6) [ ((x+1)² − y²) + ((x−1)² − y²) + (x² − (y+1)²) + (x² − (y−1)²) + ((x+1)² − (y+1)²) + ((x−1)² − (y−1)²) ] = x² − y² = g(x, y),

k ∗ ν̌(x, y) = (1/6) [ (x+1)y + (x−1)y + x(y+1) + x(y−1) + (x+1)(y+1) + (x−1)(y−1) ] = xy + 1/3,

so g is ν-harmonic, but k is not ν-harmonic.

:) X
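Reading off the measure ν from the computation above as the uniform measure on the six steps (±1, 0), (0, ±1), ±(1, 1), the two identities can be confirmed in exact rational arithmetic (our own illustration):

```python
from fractions import Fraction

# the six steps of nu, each with probability 1/6
STEPS = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, 1), (-1, -1)]

def avg(f, x, y):
    """Average of f over the six nu-steps from (x, y)."""
    return sum(Fraction(f(x + dx, y + dy)) for dx, dy in STEPS) / 6

def g(x, y):
    return x * x - y * y

def k(x, y):
    return x * y

for x in range(-5, 6):
    for y in range(-5, 6):
        assert avg(g, x, y) == g(x, y)                   # g is nu-harmonic
        assert avg(k, x, y) == k(x, y) + Fraction(1, 3)  # k is off by 1/3
```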

Solution to Exercise 1.90 :( Let ϕ : G → C be a homomorphism. Then, using the symmetry of µ,

Σ_y µ(y) ϕ(xy) = ϕ(x) + Σ_y µ(y) · ½(ϕ(y) + ϕ(y^{−1})) = ϕ(x).

The above sum converges absolutely because µ has finite first moment, and since |ϕ(xy)| ≤ |ϕ(x)| + |ϕ(y)| ≤ max_{s∈S} |ϕ(s)| · (|x| + |y|), where S is the finite symmetric generating set used to determine the metric on G. :) X

Solution to Exercise 1.91 :( For any x ∈ G,

Σ_y ν(y) f(xy) = p f(x) + (1 − p) Σ_y µ(y) f(xy),

where the sums on both sides converge absolutely together.


:) X


Solution to Exercise 1.97 :( The fact that L is compact is basically the Arzelà–Ascoli theorem. However, let us give a self-contained proof. The space L with the topology of pointwise convergence is metrizable; for example, one may consider the metric

dist(f, h) = exp(−R(f, h)),   where   R(f, h) := sup{r ≥ 0 : h(x) = f(x) for all |x| ≤ r}.

So compactness will follow by showing that any sequence has a converging subsequence. Let (f_n)_n be a sequence in L. Denote G = {x_1, x_2, . . .}. We will inductively construct a sequence of subsets N ⊃ I_1 ⊃ I_2 ⊃ I_3 ⊃ · · ·, all infinite, such that for all m ≥ 1 the limit lim_{I_m ∋ k → ∞} f_k(x_m) exists. Indeed, if m = 1, then since |f_n(x)| ≤ |x| for all n, the sequence (f_n(x_1))_n is bounded, and thus has a converging subsequence; let I_1 be the indices of this converging subsequence. For m > 1, given I_{m−1}, we consider (f_k(x_m))_{k ∈ I_{m−1}}. Since this sequence is bounded, it too has a converging subsequence, and we denote by I_m ⊂ I_{m−1} the indices of this new subsequence. With this construction, write I_m = (n_k^{(m)})_k for each m ≥ 1, and consider the diagonal sequence h_k = f_{n_k^{(k)}}.

For any m ≥ 1, the sequence (h_k)_{k ≥ m} is a subsequence of (f_k)_{k ∈ I_m}. Thus, h(x_m) := lim_{k→∞} h_k(x_m) exists. This shows that (h_k)_k converges pointwise to h, proving that L is compact.

The fact that x.h(y) = h(x^{−1}y) − h(x^{−1}) defines a left action is easily shown. Also, if x.h = h for all x ∈ G, then h(xy) = x^{−1}.h(y) + h(x) = h(y) + h(x) for all x, y ∈ G. If h : G → C is a homomorphism, then choose α = 1 / max_{s∈S} |h(s)|. Then

||∇_S h||_∞ = sup_{s∈S} sup_{x∈G} |h(xs) − h(x)| = max_{s∈S} |h(s)|,

so that ||∇_S(αh)||_∞ = 1. Hence, αh ∈ L.

Now for the functions b_x. Note that z.b_x(y) = b_x(z^{−1}y) − b_x(z^{−1}) = |x^{−1}z^{−1}y| − |x^{−1}z^{−1}| = b_{zx}(y). By the triangle inequality, |b_x(y)| ≤ |y|. So,

|b_x(ys) − b_x(y)| = |y^{−1}.b_x(s)| = |b_{y^{−1}x}(s)| ≤ |s|,

which implies that ||∇_S b_x||_∞ ≤ 1. Finally, if b_x = b_y, then

dist_S(x, y) = b_x(y) + |x| = b_y(y) + |x| = |x| − |y|.

Reversing the roles of x, y, we have that dist_S(x, y) = −dist_S(x, y), implying that dist_S(x, y) = 0, so that x = y. :) X

| |x. f | | S, k = lim sup r −k sup f x −1 y ≤ lim sup (r + |x |) −k r →∞ r →∞ |y |≤r

sup

|z |≤r +| x |

| f (z) | ·

r +| x | k r

= | | f | | S, k .

Repeating this for x −1 , we have that | |x. f | | S, k ≤ | | f | | S, k =

x −1 .x. f

≤ | |x. f | | S, k , which implies S, k equality. It is now immediate that HF k (G, µ) is a G -invariant vector space. :) X Solution to Exercise 1.99 :( We know that there exists κ > 0 such that |x |T ≤ κ |x | S for all x ∈ G . Hence,

||f||_{S,k} = limsup_{r→∞} r^{−k} sup_{|x|_S ≤ r} |f(x)| ≤ limsup_{r→∞} r^{−k} sup_{|x|_T ≤ κr} |f(x)| ≤ κ^k ||f||_{T,k}.

:) X


Solution to Exercise 1.102 :( This is similar to the proof of Proposition 1.5.1. Let M ∈ GL_n(R). Let us recall the cofactor matrix c(M) and the adjugate matrix adj(M), given as follows: for every 1 ≤ i, j ≤ n let M^{i,j} be the (n−1) × (n−1) matrix obtained from M by deleting the ith row and jth column. Define c(M) to be the n × n matrix given by c(M)_{i,j} = (−1)^{i+j} det(M^{i,j}). It is well known that for any fixed 1 ≤ i ≤ n we have det(M) = Σ_{j=1}^{n} M_{i,j} c(M)_{i,j}. Define adj(M) = c(M)^τ (the transpose). Thus, M adj(M) = adj(M) M = det(M) · I, where I is the n × n identity matrix. This implies that if det(M) is invertible in R, then M^{−1} = (det(M))^{−1} · adj(M). :) X

Solution to Exercise 1.103 :( It is easy to see that I, −I commute with any A ∈ GL_n(Z). Thus, {I, −I} ⊴ GL_n(Z). For A ∈ GL_{2n+1}(Z) we have that det(det(A) · A) = det(A)^{2n+1} · det(A) = 1. Thus, the map A ↦ (det(A), det(A) · A) is an isomorphism from GL_{2n+1}(Z) onto {−1, 1} × SL_{2n+1}(Z). Also, the map A ↦ {−I, I}A is an isomorphism from SL_{2n+1}(Z) onto PGL_{2n+1}(Z). :) X

Solution to Exercise 1.104 :( Since S is generated by a = [ 1  2 ; 0  1 ] and b = [ 1  0 ; 2  1 ], it suffices to show that for any matrix A = [ 4k+1  2n ; 2m  4ℓ+1 ], the products Aa and Ab are both still of this form. For A as above, compute

Aa = [ 4k+1  2n ; 2m  4ℓ+1 ] · [ 1  2 ; 0  1 ] = [ 4k+1  2(4k+1)+2n ; 2m  2·2m+4ℓ+1 ] = [ 4k+1  2(4k+1+n) ; 2m  4(m+ℓ)+1 ],

which is of the correct form. Similarly,

Ab = [ 4k+1  2n ; 2m  4ℓ+1 ] · [ 1  0 ; 2  1 ] = [ 4k+1+2·2n  2n ; 2m+2(4ℓ+1)  4ℓ+1 ] = [ 4(k+n)+1  2n ; 2(m+4ℓ+1)  4ℓ+1 ],

completing the proof. :) X

Solution to Exercise 1.105 :( Let

H = { A = [ 4k+1  2n ; 2m  4ℓ+1 ] : det(A) = 1, k, ℓ, n, m ∈ Z }.

We have already seen that S ⊂ H. Let a = [ 1  2 ; 0  1 ] and b = [ 1  0 ; 2  1 ] be the generators of S. Let A = [ 4k+1  2n ; 2m  4ℓ+1 ], and denote ||A|| = max{|4k+1|, |4ℓ+1|}. Since A^{−1} = [ 4ℓ+1  −2n ; −2m  4k+1 ], by possibly replacing A with A^{−1} we may assume that |4k+1| ≥ |4ℓ+1|, so that ||A|| = |4k+1|. We will prove by induction on ||A|| that if det(A) = 1 then A ∈ S.

The base case is ||A|| = 1, which is k = ℓ = 0. Then 1 = det(A) = 1 − 4nm implies that nm = 0, so that either n = 0 or m = 0. If n = 0 then A = b^m ∈ S, and if m = 0 then A = a^n ∈ S. This completes the base case.

For ||A|| > 1 we proceed by induction as follows. Note that det(A) = 1 implies that |(4k+1)(4ℓ+1)| = |4nm + 1|. If 2 min{|n|, |m|} > |4k+1|, then

|4k+1|² ≥ |(4k+1)(4ℓ+1)| ≥ 4|nm| − 1 ≥ (|4k+1| + 1)² − 1 > |4k+1|²,

a contradiction! So it must be that 2 min{|n|, |m|} ≤ |4k+1|. Since 2 min{|n|, |m|} is even and |4k+1| is odd, equality cannot hold, so we conclude that 2 min{|n|, |m|} < |4k+1|.

We now have two cases.

Case I: 2|n| < |4k+1|. In this case we see that for some z ∈ {−1, 1} we have |4(k + zn) + 1| < |4k+1|. Since

A b^z = [ 4(k+zn)+1  2n ; 2(m+z(4ℓ+1))  4ℓ+1 ],

if |4ℓ+1| < |4k+1|, then ||A b^z|| < ||A||, and by induction A b^z ∈ S, implying that A ∈ S as well.


If |4ℓ+1| = |4k+1|, then we can find w ∈ {−1, 1} such that |4(ℓ + wn) + 1| < |4ℓ+1|. So the matrix b^w A b^z satisfies

||b^w A b^z|| = max{|4(k+zn)+1|, |4(ℓ+wn)+1|} < ||A||.

Again by induction b^w A b^z ∈ S, so that A ∈ S as well.

Case II: 2|m| < |4k+1|. Similarly to the previous case, taking z ∈ {−1, 1} such that |4(k + zm) + 1| < |4k+1|, we find that

a^z A = [ 4(k+zm)+1  2(n+z(4ℓ+1)) ; 2m  4ℓ+1 ].

If |4ℓ+1| < |4k+1|, then ||a^z A|| < ||A||, so that a^z A ∈ S by induction, implying that A ∈ S. If |4ℓ+1| = |4k+1|, then taking w ∈ {−1, 1} such that |4(ℓ + wm) + 1| < |4ℓ+1|, we obtain that ||a^z A a^w|| = max{|4(k+zm)+1|, |4(ℓ+wm)+1|} < ||A||. As before, by induction a^z A a^w ∈ S, so that A ∈ S as well. :) X

Solution to Exercise 1.106 :( Let ϕ : SL_2(Z) → SL_2(Z/4Z) be the map given by taking the matrix entries modulo 4. This is easily seen to be a surjective homomorphism. Let K = Ker ϕ ⊴ SL_2(Z). By the above exercises, K ≤ S. So

[SL_2(Z) : S] = [SL_2(Z)/K : S/K] = [SL_2(Z/4Z) : ϕ(S)] < ∞. :) X
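The closure statement of Exercises 1.104–1.105 (that words in a, b and their inverses keep the displayed mod-4 form) can be spot-checked on random words (a sketch of ours, not part of the text):

```python
import random

def mul(A, B):
    """2x2 integer matrix product, matrices as pairs of row tuples."""
    (p, q), (r, s) = A
    (t, u), (v, w) = B
    return ((p * t + q * v, p * u + q * w), (r * t + s * v, r * u + s * w))

a, ai = ((1, 2), (0, 1)), ((1, -2), (0, 1))
b, bi = ((1, 0), (2, 1)), ((1, 0), (-2, 1))

def good_form(A):
    # A = [4k+1, 2n; 2m, 4l+1]; Python's % gives nonnegative residues
    (p, q), (r, s) = A
    return p % 4 == 1 and s % 4 == 1 and q % 2 == 0 and r % 2 == 0

random.seed(0)
for _ in range(500):
    A = ((1, 0), (0, 1))
    for _ in range(random.randint(1, 20)):
        A = mul(A, random.choice([a, ai, b, bi]))
    assert good_form(A)
```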


2 Martingales


2.1 Conditional Expectation

2.1 Conditional Expectation Martingales are a central object in probability. To define them properly we need to develop the notion of conditional expectation. Proposition 2.1.1 (Existence of conditional expectation) Let (Ω, F , P) be a probability space. Let G ⊂ F be a sub-σ-algebra. Let X : Ω → R be an integrable random variable (i.e. E |X | < ∞). Then, there exists an a.s. unique G-measurable and integrable random variable Y such that for all A ∈ G we have E[X1 A] = E[Y 1 A].

Definition 2.1.2 Let (Ω, F, P) be a probability space. Let G ⊂ F be a sub-σ-algebra. Let X : Ω → R be an integrable random variable (i.e. E|X| < ∞). Denote by E[X | G] the (a.s. unique) random variable such that for all A ∈ G, we have E[X 1_A] = E[E[X | G] 1_A]. For an event A ∈ F, we denote P[A | G] := E[1_A | G]. If P[A] > 0, we also define

E[X | A, G] := E[X 1_A | G] / P[A]   and   P[B | A, G] := P[B ∩ A | G] / P[A].

It is important to note that conditional expectation produces a random variable and not a number. One may think of E[X | G] as the “best guess” for X given the information G. Uniqueness is a simple exercise:

Exercise 2.1 Let (Ω, F, P) be a probability space. Let G ⊂ F be a sub-σ-algebra. Let X be an integrable random variable. Let Y, Z : Ω → R be G-measurable random variables, and assume that for any A ∈ G the expectations E[Y 1_A] = E[Z 1_A] = E[X 1_A] exist and are equal. Show that Y, Z are integrable, and that Y = Z a.s. B solution C

We now prove the existence of conditional expectation.

Proof of Proposition 2.1.1 The existence of conditional expectation utilizes a powerful theorem from measure theory: the Radon–Nikodym theorem. It states that if µ, ν are σ-finite measures on a measurable space (M, Σ), and if ν ≪ µ (i.e. for any A ∈ Σ, if µ(A) = 0 then ν(A) = 0), then there exists a measurable function dν/dµ such that for any ν-integrable function f, the function f · (dν/dµ) is µ-integrable and ∫ f dν = ∫ f (dν/dµ) dµ. (See Theorem A.4.6 in Durrett, 2019.)


This is a deep theorem, but from it the existence of conditional expectation is straightforward. We start with the case X ≥ 0. Let µ = P on (Ω, F), and define ν(A) = E[X 1_A] for all A ∈ G. One may easily check that ν is a measure on (Ω, G) and that ν ≪ P|_G. Thus there exists a G-measurable function dν/dµ (a G-measurable function is just a random variable measurable with respect to G) such that for any A ∈ G we have

E[X 1_A] = ν(A) = ∫ 1_A dν = ∫ 1_A (dν/dµ) dµ.

So we may take Y = dν/dµ, and we have E[X 1_A] = E[Y 1_A] for all A ∈ G.

For a general X (not necessarily nonnegative) we may write X = X⁺ − X⁻ for X⁺, X⁻ nonnegative. One may check that Y := E[X⁺ | G] − E[X⁻ | G] has the required properties.

The uniqueness property described in Exercise 2.1 is a good tool for computing the conditional expectation in many cases; usually one "guesses" the correct random variable and verifies the guess by showing that it admits the properties guaranteeing it is a.s. equal to the conditional expectation.

Let us summarize some of the most basic properties of conditional expectation with the following exercises.

Exercise 2.2 Let (Ω, F, P) be a probability space, X an integrable random variable, and G ⊂ F a sub-σ-algebra. Show that if X is G-measurable, then E[X | G] = X a.s. Show that if X is independent of G, then E[X | G] = E[X] a.s. Show that if P[X = c] = 1, then E[X | G] = c a.s. B solution C

Exercise 2.3 Let (Ω, F, P) be a probability space, X an integrable random variable, and G ⊂ F a sub-σ-algebra. Show that E[E[X | G]] = E[X]. B solution C

Exercise 2.4 Recall that for A ∈ F, we defined P[A | G] = E[1_A | G]. Prove Bayes' formula for conditional probabilities: show that for any B ∈ G and A ∈ F with P[A] > 0, we have

P[B | A] = E[1_B P[A | G]] / P[A].

B solution C

Exercise 2.5 Show that conditional expectation is linear; that is, a.s.,

E[aX + Y | G] = a E[X | G] + E[Y | G].


Show that if X ≤ Y a.s., then E[X | G] ≤ E[Y | G] a.s. Show that if X_n ↑ X a.s., X_n ≥ 0 for all n a.s., and X is integrable, then E[X_n | G] ↑ E[X | G]. B solution C

Exercise 2.6 Let (Ω, F, P) be a probability space, X an integrable random variable, and G ⊂ F a sub-σ-algebra. Show that if Y is G-measurable and E|XY| < ∞, then E[XY | G] = Y E[X | G] a.s. B solution C

Exercise 2.7 Let (Ω, F, P) be a probability space and X an integrable random variable. Suppose that (A_n)_n is a sequence of pairwise disjoint events such that Σ_n P[A_n] = 1 (i.e. (A_n)_n form an almost-partition of Ω). Let G = σ((A_n)_n). Show that for all n,

E[X | G] 1_{A_n} = (E[X 1_{A_n}] / P[A_n]) · 1_{A_n}   a.s.

Use this to conclude that

E[X | G] = Σ_n (E[X 1_{A_n}] / P[A_n]) · 1_{A_n}   a.s.

B solution C
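The partition formula of Exercise 2.7 is easy to verify numerically on a finite probability space. The following sketch is our own illustration (not from the book; the helper name `cond_exp` is made up): it builds E[X | G] for the σ-algebra generated by a finite partition and checks the defining property E[X 1_A] = E[E[X | G] 1_A] exactly, using rational arithmetic.

```python
from fractions import Fraction

# Toy finite probability space: Omega = {0,...,5}, uniform measure.
omega = range(6)
P = {w: Fraction(1, 6) for w in omega}
X = {w: w * w for w in omega}              # an integrable random variable
partition = [{0, 1, 2}, {3, 4, 5}]         # the events A_n generating G

def cond_exp(X, partition, P):
    """E[X | G] for G = sigma(partition): equals E[X 1_A]/P[A] on each A."""
    Y = {}
    for A in partition:
        pA = sum(P[w] for w in A)
        avg = sum(X[w] * P[w] for w in A) / pA
        for w in A:
            Y[w] = avg
    return Y

Y = cond_exp(X, partition, P)

# The defining property E[X 1_A] = E[Y 1_A] holds for every A in the partition.
for A in partition:
    assert sum(X[w] * P[w] for w in A) == sum(Y[w] * P[w] for w in A)

print(sorted(set(Y.values())))  # the two partition-wise averages: 5/3 and 50/3
```

Using `Fraction` keeps the arithmetic exact, so the defining identity holds as an equality rather than only up to floating-point error.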

Definition 2.1.3 Let X be an integrable (real-valued) random variable, and Y another random variable, not necessarily real-valued. Define E[X | Y] := E[X | σ(Y)].

Exercise 2.8 Show that if X is an integrable random variable, and Y is a random variable taking on countably many values, then

E[X | Y] = Σ_{y ∈ R_Y} (E[X 1_{Y=y}] / P[Y = y]) · 1_{Y=y},

where R_Y = {y : P[Y = y] > 0}.

Exercise 2.9 Prove Chebyshev's inequality for conditional expectation: show that if X ∈ L²(Ω, F, P), then a.s.

P[|X| ≥ a | G] ≤ a⁻² · E[X² | G].


Exercise 2.10 Prove Cauchy–Schwarz for conditional expectation: show that if X, Y ∈ L²(Ω, F, P), then XY is integrable and a.s.

E[XY | G]² ≤ E[X² | G] · E[Y² | G].

Proposition 2.1.4 (Jensen's inequality) If ϕ is a convex function such that X, ϕ(X) are integrable, then a.s. E[ϕ(X) | G] ≥ ϕ(E[X | G]).

Proof As in the usual proof of Jensen's inequality, we know that ϕ(x) = sup_{(a,b)∈S} (ax + b), where S = {(a, b) ∈ Q² : ∀ y, ay + b ≤ ϕ(y)}. If (a, b) ∈ S, then monotonicity of conditional expectation gives

E[ϕ(X) | G] ≥ a E[X | G] + b   a.s.

Taking the supremum over (a, b) ∈ S, since S is countable, we have that

E[ϕ(X) | G] ≥ ϕ(E[X | G])   a.s.

Proposition 2.1.5 (Tower property) Let (Ω, F, P) be a probability space, X an integrable random variable, and H ⊂ G ⊂ F sub-σ-algebras. Then

E[E[X | G] | H] = E[E[X | H] | G] = E[X | H]   a.s.

Proof Note that E[X | H] is H-measurable and thus G-measurable. So E[E[X | H] | G] = E[X | H] a.s. For the other assertion, since E[X | H] is H-measurable, we only need to show the second defining property: for any A ∈ H, since A ∈ G as well,

E[E[X | H] 1_A] = E[E[X 1_A | H]] = E[X 1_A] = E[E[X 1_A | G]] = E[E[X | G] 1_A].
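On a finite space the tower property can also be checked by hand. Here is a small sketch of our own (not from the book; `cond_exp` is a made-up helper): H ⊂ G is realized by a coarse partition that a finer one refines, and the two iterated conditional expectations agree.

```python
from fractions import Fraction

# H = sigma(coarse partition) is contained in G = sigma(finer partition),
# since every coarse block is a union of fine blocks.
omega = range(8)
P = {w: Fraction(1, 8) for w in omega}
X = {w: w for w in omega}
G_part = [{0, 1}, {2, 3}, {4, 5}, {6, 7}]   # finer partition (generates G)
H_part = [{0, 1, 2, 3}, {4, 5, 6, 7}]       # coarser partition (generates H)

def cond_exp(X, partition, P):
    """E[X | sigma(partition)]: the partition-wise average of X."""
    Y = {}
    for A in partition:
        avg = sum(X[w] * P[w] for w in A) / sum(P[w] for w in A)
        for w in A:
            Y[w] = avg
    return Y

lhs = cond_exp(cond_exp(X, G_part, P), H_part, P)   # E[E[X | G] | H]
rhs = cond_exp(X, H_part, P)                        # E[X | H]
assert lhs == rhs                                   # the tower property
print(sorted(set(rhs.values())))                    # [3/2, 11/2]
```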

2.2 Martingales: Definition and Examples

Definition 2.2.1 Let (Ω, F, P) be a probability space. A filtration is a sequence (F_t)_t of nested sub-σ-algebras F_t ⊂ F_{t+1} ⊂ F.

Example 2.2.2 The basic example of a filtration is one induced by a sequence of random variables. If (X_t)_t is a sequence of random variables in a probability space (Ω, F, P), then F_t = σ(X_0, ..., X_t) is easily seen to be a filtration.


Definition 2.2.3 Let (Ω, F, P) be a probability space. Let (F_t)_t be a filtration. Let (M_t)_t be a sequence of complex-valued random variables. The sequence (M_t)_t is said to be a martingale with respect to the filtration (F_t)_t if the following conditions hold for all t:

• M_t is measurable with respect to F_t,
• E|M_t| < ∞, and
• E[M_{t+1} | F_t] = M_t a.s.

Exercise 2.11 Let µ be a probability measure on Z such that for (U_t)_{t≥1} i.i.d.-µ we have E[U_t] = 0 and E|U_t| < ∞. Let M_0 = 0 and M_t := Σ_{k=1}^t U_k. Show that (M_t)_t is a martingale with respect to the filtration F_t = σ(U_1, ..., U_t). B solution C What about the filtration F'_t = σ(M_0, ..., M_t)?

Exercise 2.12 Let (M_t)_t be a martingale with respect to a filtration (F_t)_t. Show that (M_t)_t is also a martingale with respect to the canonical filtration F'_t = σ(M_0, ..., M_t). B solution C

In light of Exercise 2.12, we do not really need to specify the filtration when speaking about a martingale (M_t)_t, since we can always refer to the canonical filtration σ(M_0, ..., M_t). Thus, whenever we speak of a martingale without specifying the filtration, we are referring to the canonical filtration.

Exercise 2.13 Let (X_t)_t be the simple random walk on Z^d; that is, X_t = Σ_{j=1}^t U_j, where the U_j are i.i.d. uniform on the standard basis of Z^d and their inverses. Show that M_t = ⟨X_t, v⟩ is a martingale, for any v ∈ R^d. Show that M_t = ||X_t||² − t is a martingale. B solution C
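As a quick Monte Carlo sanity check (our own sketch, not from the book), one can simulate the second martingale of Exercise 2.13 for d = 2 and observe that E[M_t] = E[||X_t||² − t] stays near E[M_0] = 0.

```python
import random

def srw_step(pos, rng):
    """One step of simple random walk on Z^d: +/- a uniformly chosen basis vector."""
    d = len(pos)
    j = rng.randrange(d)
    s = rng.choice((-1, 1))
    return pos[:j] + (pos[j] + s,) + pos[j + 1:]

def sample_M(t, rng, d=2):
    """Sample M_t = ||X_t||^2 - t for SRW on Z^d started at the origin."""
    pos = (0,) * d
    for _ in range(t):
        pos = srw_step(pos, rng)
    return sum(c * c for c in pos) - t

rng = random.Random(0)
n, t = 20000, 25
mean = sum(sample_M(t, rng) for _ in range(n)) / n
print(mean)  # fluctuates around 0, as the martingale property predicts
assert abs(mean) < 1.0
```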

Definition 2.2.4 Let (Ω, F, P) be a probability space and (F_t)_t a filtration. A stopping time with respect to the filtration (F_t)_t is a random variable T with values in N ∪ {∞} such that {T ≤ t} ∈ F_t for every t. A stopping time with respect to a process (X_t)_t is defined as being a stopping time with respect to the canonical filtration of the process, σ(X_0, ..., X_t).

Usually we will not specify the filtration, since it will be obvious from the context.

Example 2.2.5 Some examples: if (X_t)_t is a process and we take the canonical filtration,

• T = inf {t : X_t ∈ A} is a stopping time;


• E = sup {t : X_t ∈ A} is typically not a stopping time.

Exercise 2.14 Show that if (M_t)_t is a martingale and T is a stopping time, then (M_{T∧t})_t is also a martingale. B solution C

Exercise 2.15 Show that if T, T' are both stopping times with respect to a filtration (F_t)_t, then so is T ∧ T'. B solution C

The relation of probability and harmonic functions is via martingales, as the following exercise shows.

Exercise 2.16 Let G be a finitely generated group. Let µ be an adapted probability measure on G. Let (X_t)_t be the µ-random walk. Show that f : G → C is µ-harmonic if and only if (f(X_t))_t is a martingale (with respect to the canonical filtration F_t = σ(X_0, ..., X_t)). B solution C

2.3 Optional Stopping Theorem

It follows from the definition that E[M_t] = E[M_0] for a martingale (M_t)_t. We would like to conclude that this also holds for random times. However, this is not true in general.

Example 2.3.1 Let M_t = Σ_{j=1}^t X_j, where the (X_j)_j are i.i.d. with distribution P[X_j = 1] = P[X_j = −1] = 1/2 (i.e. (M_t)_t is the simple random walk on Z). Let T = inf {t : M_t = 1}. We have seen that (M_t)_t is a martingale and T is a stopping time. In Section 2.4 we will prove that T < ∞ a.s. (i.e. the simple random walk on Z is recurrent). So M_T is well defined, and actually, by definition, M_T = 1 a.s. However, M_0 = 0 a.s., so E[M_T] = 1 ≠ 0 = E[M_0].

In contrast to the general case, uniform integrability is a condition under which E[M_T] = E[M_0] for stopping times.

Definition 2.3.2 (Uniform integrability) Let (X_α)_{α∈I} be a collection of random variables. We say that the collection (X_α)_α is uniformly integrable if

lim_{K→∞} sup_α E[|X_α| 1_{|X_α|>K}] = 0.
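A simulation makes Example 2.3.1 concrete (our own sketch, not from the book). For the capped time T ∧ cap, optional stopping does apply (T ∧ cap is bounded), so the empirical mean of M_{T∧cap} stays near 0, even though nearly every path has already hit 1.

```python
import random

def walk_until_one(cap, rng):
    """Run SRW on Z until it hits 1 or `cap` steps elapse; return M_{T ∧ cap}."""
    m = 0
    for _ in range(cap):
        if m == 1:
            return m
        m += rng.choice((-1, 1))
    return m

rng = random.Random(1)
n, cap = 20000, 10000
samples = [walk_until_one(cap, rng) for _ in range(n)]
hit = sum(s == 1 for s in samples) / n
mean_stopped = sum(samples) / n

print(hit)           # most paths hit 1 within the cap (recurrence)
print(mean_stopped)  # near 0 = E[M_0]: the few unstopped paths sit far below 0
assert hit > 0.9
assert abs(mean_stopped) < 1.0
```

The rare paths that have not yet hit 1 are typically far below zero, and they exactly balance the many paths frozen at 1; that balance is what fails in the limit cap → ∞.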


Exercise 2.17 Show that if X is integrable, then the collection X_α := X is uniformly integrable. B solution C

Exercise 2.18 Show that if (X_α)_α is uniformly integrable, then sup_α E|X_α| < ∞. B solution C

Exercise 2.19 Show that if for some ε > 0 we have sup_α E[|X_α|^{1+ε}] < ∞, then (X_α)_α is uniformly integrable. B solution C

Exercise 2.20 Show that if (F_α)_α is a collection of σ-algebras and X is an integrable random variable, then (E[X | F_α])_α is uniformly integrable. B solution C

Exercise 2.21 Show that if (X_n)_n is uniformly integrable and X_n → X a.s., then X is integrable. B solution C

The following is not the strongest form of optional stopping theorem that it is possible to prove, but it is sufficient for our purposes.

Theorem 2.3.3 (Optional stopping theorem) Let (M_t)_t be a martingale and T a stopping time, both with respect to a filtration (F_t)_t. We have that E|M_T| < ∞ and E[M_T] = E[M_0] if one of the following holds:

• The stopping time T is a.s. bounded; that is, there exists t ≥ 0 such that T ≤ t a.s.
• T < ∞ a.s., (M_t)_t is uniformly integrable, and E|M_T| < ∞.

We will actually see (in Exercise 2.25) that in the last condition, the requirement E|M_T| < ∞ is redundant.

Proof For the first case, if T ≤ t a.s., then

E[M_T] = E[Σ_{j=0}^{t−1} 1_{T>j} · (M_{j+1} − M_j)] + E[M_0].

Since {T > j} = {T ≤ j}^c ∈ F_j, we get that

E[(M_{j+1} − M_j) 1_{T>j}] = E[E[(M_{j+1} − M_j) 1_{T>j} | F_j]] = E[1_{T>j} E[M_{j+1} − M_j | F_j]] = 0.

So E[M_T] = E[M_0] in this case.

For the second case, note that since E|M_T| < ∞, we have E[|M_T| 1_{|M_T|>K}] → 0 as K → ∞.


Now, (M_t)_t is uniformly integrable, so sup_t E[|M_t| 1_{|M_t|>K}] → 0 as K → ∞. Thus,

E[|M_{T∧t}| 1_{|M_{T∧t}|>K}] ≤ E[|M_T| 1_{|M_T|>K} 1_{T≤t}] + E[|M_t| 1_{|M_t|>K} 1_{T>t}] ≤ E[|M_T| 1_{|M_T|>K}] + E[|M_t| 1_{|M_t|>K}],

so sup_t E[|M_{T∧t}| 1_{|M_{T∧t}|>K}] → 0 as K → ∞. Let

ϕ_K(x) = K if x > K;  x if |x| ≤ K;  −K if x < −K.

Note that |ϕ_K(x) − x| ≤ |x| 1_{|x|>K}. Since M_{T∧t} → M_T a.s. as t → ∞ (because we assumed that T < ∞ a.s.), we also have that ϕ_K(M_{T∧t}) → ϕ_K(M_T) a.s. as t → ∞. Since ϕ_K(M_{T∧t}), ϕ_K(M_T) are uniformly bounded by K, we can apply dominated convergence to obtain

lim_{t→∞} E|ϕ_K(M_{T∧t}) − ϕ_K(M_T)| = 0.

Thus,

E|M_T − M_{T∧t}| ≤ E|ϕ_K(M_T) − M_T| + E|ϕ_K(M_{T∧t}) − M_{T∧t}| + E|ϕ_K(M_{T∧t}) − ϕ_K(M_T)|
≤ E[|M_T| 1_{|M_T|>K}] + sup_s E[|M_{T∧s}| 1_{|M_{T∧s}|>K}] + E|ϕ_K(M_{T∧t}) − ϕ_K(M_T)|.

Taking t → ∞ and then K → ∞, we get that E|M_T − M_{T∧t}| → 0. Since T ∧ t is an a.s. bounded stopping time, E[M_{T∧t}] = E[M_0]. In conclusion,

|E[M_T − M_0]| = |E[M_T − M_{T∧t}]| → 0.

Exercise 2.22 Show that if a stopping time T is a.s. finite and a martingale (M_t)_t is a.s. uniformly bounded (i.e. there exists m such that |M_t| ≤ m a.s. for all t), then E[M_T] = E[M_0]. B solution C

Exercise 2.23 Assume that (X_n)_n, X are random variables such that X_n → X a.s. Show that if (X_n)_n is uniformly integrable, then X_n → X also in L¹. B solution C

Exercise 2.24 Show that for a martingale (M_t)_t and for any a.s. finite stopping time T, we have E|M_{T∧t}| ≤ E|M_t|. B solution C


Exercise 2.25 A specific case of the martingale convergence theorem states that if (M_t)_t is a martingale with sup_t E|M_t| < ∞, then there exists a random variable M_∞ such that M_t → M_∞ a.s. and E|M_∞| < ∞. (We will prove the martingale convergence theorem in Theorem 2.6.3.) Use this to show that if (M_t)_t is a uniformly integrable martingale and T is an a.s. finite stopping time, then E|M_T| < ∞ (so this last condition is redundant in the optional stopping theorem). B solution C

2.4 Applications of Optional Stopping

Let us give some applications of the optional stopping theorem (OST) to the study of random walks on Z. We consider Z = ⟨−1, 1⟩, with the usual Cayley graph on Z: neighbors are adjacent integers. We take the measure µ = (1/2)(δ_1 + δ_{−1}); that is, uniform on {−1, 1}. Thus the µ-random walk (X_t)_t can be represented as X_t = Σ_{j=1}^t U_j, where the (U_j)_j are i.i.d. and P[U_j = 1] = P[U_j = −1] = 1/2.

First, it is simple to see that (X_t)_t is a martingale (we have already seen this above). Now let T_z := inf {t : X_t = z}. Note that T_z is a stopping time. Also, for a < 0 < b we have the stopping time T_{a,b} := T_a ∧ T_b, which is the first exit time of (a, b).

Now, the martingale M_t = X_{T_{a,b}∧t} is a.s. uniformly bounded (by |a| ∨ |b|). It is left as an exercise to the reader to show that T_{a,b} < ∞ P_0-a.s. Thus,

0 = E[M_{T_{a,b}}] = P[T_a < T_b] · a + P[T_b < T_a] · b = P[T_a < T_b](a − b) + b.

We deduce that

P_0[T_a < T_b] = b / (b − a).

Now note that (x + X_t)_t has the distribution of a random walk started at x. Thus, for all n > x > 0,

P_x[T_0 < T_n] = P_0[T_{−x} < T_{n−x}] = (n − x)/n = 1 − x/n.

This is the probability that a gambler starting with x dollars will go bankrupt before reaching n dollars in wealth, and is known as the gambler's ruin estimate.

One of the extraordinary (albeit classical) facts about the random walk on Z is now obtained by taking n → ∞:

P_x[T_0 = ∞] = lim_{n→∞} P_x[T_0 > T_n] = 0.
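The gambler's ruin estimate is straightforward to confirm by Monte Carlo (our own sketch, not from the book):

```python
import random

def ruined(x, n, rng):
    """Run SRW from x until it hits 0 or n; return True on ruin (hitting 0)."""
    while 0 < x < n:
        x += rng.choice((-1, 1))
    return x == 0

rng = random.Random(2)
x, n, trials = 3, 10, 20000
p_hat = sum(ruined(x, n, rng) for _ in range(trials)) / trials
print(p_hat)  # close to 1 - x/n = 0.7
assert abs(p_hat - 0.7) < 0.02
```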


That is, no matter how much money you have entering the casino, you always eventually reach 0 (and this in the case of a fair game!). In other words, the random walk on Z is recurrent: it reaches 0 a.s. But how long does it take to reach 0? Note that since the random walk takes steps of size 1, we have that for n > x > 0, under P_x, the event T_0 > T_n implies that T_0 ≥ 2n − x. Thus,

E_x[T_0] = Σ_{n=0}^∞ P_x[T_0 > n] ≥ Σ_{n>x} P_x[T_0 ≥ 2n − x] ≥ Σ_{n>x} P_x[T_0 > T_n] = Σ_{n>x} x/n = ∞.

So E_x[T_0] = ∞. Indeed the walker reaches 0 a.s., but the time it takes is infinite in expectation. That is, the random walk on Z is null-recurrent. We will expand on the notions of recurrence and null-recurrence in Chapter 3 (and specifically in Section 3.8).

Exercise 2.26 Show that for a < 0 < b, we have that P_0[T_{a,b} < ∞] = 1. In fact, strengthen this to show that for all a < 0 < b there exists a constant c = c(a, b) > 0 such that for all t and any a < x < b,

P_x[T_{a,b} > t] ≤ e^{−ct}.

Conclude that E_x[T_{a,b}] < ∞ for all a < x < b. B solution C
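The exponential tail asserted in Exercise 2.26 is visible numerically (our own rough sketch, not a proof): estimated tail probabilities of T_{a,b} shrink at successive thresholds, roughly geometrically.

```python
import random

def exit_time(x, a, b, rng, cap=10**6):
    """First exit time of (a, b) for SRW started at x (capped for safety)."""
    t = 0
    while a < x < b and t < cap:
        x += rng.choice((-1, 1))
        t += 1
    return t

rng = random.Random(6)
a, b, x, trials = -3, 3, 0, 20000
times = [exit_time(x, a, b, rng) for _ in range(trials)]
p20 = sum(t > 20 for t in times) / trials
p40 = sum(t > 40 for t in times) / trials
p60 = sum(t > 60 for t in times) / trials
print(p20, p40, p60)  # decaying tail, consistent with P[T > t] <= e^{-ct}
assert p40 < p20 and p60 < p40
```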

Let us now consider a different martingale. Since

E[X²_{t+1} | X_t] = (1/2)(X_t + 1)² + (1/2)(X_t − 1)² = X_t² + 1,

the process M_t := X_t² − t is a martingale. If we apply the OST (Theorem 2.3.3), we get

1 = E_{−1}[M_{T_0}] = E_{−1}[X²_{T_0}] − E_{−1}[T_0] = −E_{−1}[T_0].

So E_{−1}[T_0] = −1, a contradiction! The reason is that we applied the OST in a case where we could not: (M_t)_t is not necessarily bounded, and this in fact shows that (M_t)_t is not uniformly integrable.

One may note that for −n < x < n, under P_x, the martingale (M_{t∧T_{−n,n}})_t admits

|M_{t∧T_{−n,n}}| ≤ |X²_{T_{−n,n}} − T_{−n,n}| 1_{T_{−n,n}≤t} + |X_t² − t| 1_{T_{−n,n}>t} ≤ 2n² + T_{−n,n} + t 1_{T_{−n,n}>t}.


Thus, using (a + b)² ≤ 2a² + 2b²,

E_x[M²_{t∧T_{−n,n}}] ≤ 2 E_x[(2n² + T_{−n,n})²] + 2t² · P_x[T_{−n,n} > t] ≤ 2 E_x[(2n² + T_{−n,n})²] + 2t² · e^{−ct},

for some c = c(n) > 0. This implies that sup_t E_x[|M_{t∧T_{−n,n}}|²] < ∞, so (M_{t∧T_{−n,n}})_t is a uniformly integrable martingale. Given this, we may apply the OST to get that for any −n < x < n,

x² = E_x[M_{T_{−n,n}}] = E_x[X²_{T_{−n,n}}] − E_x[T_{−n,n}] = n² − E_x[T_{−n,n}],

so E_x[T_{−n,n}] = n² − x². Specifically, E[T_{−n,n}] = n². This property is sometimes referred to as the random walk on Z being diffusive.

Similarly to the above, one may easily see that the martingale (M_{t∧T_{0,n}})_t is P_x-a.s. bounded for any 0 < x < n. So

x² + E_x[T_{0,n}] = E_x[|X_{T_{0,n}}|²] = P_x[T_0 > T_n] · n² = xn.

Thus, E_x[T_{0,n}] = (n − x)x. For general a < x < b, note that under P_x, the walk (X_t)_t has the same distribution as (a + X_t)_t under P_{x−a}. Thus, for a < x < b,

E_x[T_{a,b}] = E_{x−a}[T_{0,b−a}] = (b − x)(x − a).
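The identity E_x[T_{a,b}] = (b − x)(x − a) is also easy to check by simulation (our own sketch, not from the book):

```python
import random

def exit_time(x, a, b, rng):
    """First exit time of (a, b) for SRW on Z started at x."""
    t = 0
    while a < x < b:
        x += rng.choice((-1, 1))
        t += 1
    return t

rng = random.Random(3)
a, b, x, trials = -4, 6, 1, 20000
mean_t = sum(exit_time(x, a, b, rng) for _ in range(trials)) / trials
print(mean_t)  # close to (b - x)(x - a) = 5 * 5 = 25
assert abs(mean_t - 25) < 2
```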

2.5 L^p Maximal Inequality

The goal of this section is to prove the following theorem, which shows how to control the maximum of a martingale up to a certain time, using only the last value.

Theorem 2.5.1 (L^p maximal inequality) Let (M_t)_t be a martingale. Then, for any 1 < p < ∞ and any t,

E[max_{k≤t} |M_k|^p] ≤ (p/(p−1))^p · E|M_t|^p.

Proof Let F_t = σ(M_0, ..., M_t) be the natural filtration. Let N_t = max_{k≤t} |M_k|. As a first step we show that for any r > 0,

r · P[N_t ≥ r] ≤ E[|M_t| 1_{N_t≥r}].

(This is also known as Doob's inequality.) Indeed, fix r > 0 and t > 0. Let

T = inf {s ≥ 0 : |M_s| ≥ r} ∧ t,


which is a stopping time. Since {T = s} ∈ F_s, we have that for all s ≤ t,

E[|M_s| 1_{T=s}] = E[|E[M_t | F_s]| 1_{T=s}] ≤ E[|M_t| 1_{T=s}].

Summing over s ≤ t, using that P[T ≤ t] = 1, we have E|M_T| ≤ E|M_t|. Finally, note that {N_t ≥ r} = {|M_T| ≥ r}, and 1_{N_t≥r} 1_{T=s} is F_s-measurable; so the same argument restricted to this event gives

r · P[N_t ≥ r] ≤ E[|M_T| 1_{N_t≥r}] ≤ E[|M_t| 1_{N_t≥r}],

which is Doob's inequality.

For p > 1 we integrate Doob's inequality. Fix R > 0 and let K_t = N_t ∧ R. Then

E[K_t^p] = ∫_0^R p r^{p−1} P[K_t ≥ r] dr ≤ ∫_0^R p r^{p−2} E[|M_t| 1_{N_t≥r}] dr = E[|M_t| · ∫_0^R p r^{p−2} 1_{r≤N_t} dr] = (p/(p−1)) · E[|M_t| · K_t^{p−1}] ≤ (p/(p−1)) · (E|M_t|^p)^{1/p} · (E[K_t^p])^{(p−1)/p},

where the last inequality is Hölder's inequality. Since E[K_t^p] ≤ R^p < ∞, we may divide by (E[K_t^p])^{(p−1)/p} to obtain

(E[K_t^p])^{1/p} ≤ (p/(p−1)) · (E|M_t|^p)^{1/p}.

Recalling that K_t = N_t ∧ R, taking R → ∞, and using monotone convergence, we have

(E[N_t^p])^{1/p} ≤ (p/(p−1)) · (E|M_t|^p)^{1/p},

which is the required assertion.
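Doob's inequality from the first step of the proof can be checked by Monte Carlo for the simple random walk (our own sketch, not from the book; the parameters are arbitrary):

```python
import random

# Estimate both sides of r * P[N_t >= r] <= E[|M_t| 1_{N_t >= r}]
# for SRW on Z with t = 100 and r = 8.
rng = random.Random(7)
t, r, trials = 100, 8, 20000
lhs = rhs = 0.0
for _ in range(trials):
    m, n_t = 0, 0
    for _ in range(t):
        m += rng.choice((-1, 1))
        n_t = max(n_t, abs(m))
    if n_t >= r:          # contribute only on the event {N_t >= r}
        lhs += r
        rhs += abs(m)
lhs /= trials
rhs /= trials
print(lhs, rhs)           # lhs stays below rhs, as Doob's inequality predicts
assert lhs <= rhs + 0.1   # small slack for Monte Carlo error
```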

Exercise 2.27 Let (X_t)_t be a lazy random walk on Z; that is, the µ-random walk for µ(1) = µ(−1) = (1/2)(1 − p) and µ(0) = p, for some p ∈ [0, 1). Let M_t = max_{k≤t} |X_k|. Show that E[M_t²] ≤ 4t. B solution C

Exercise 2.28 Let (X_t)_t be a lazy random walk on Z as in Exercise 2.27, and let M_t = max_{k≤t} |X_k|. Prove that there exist C, c > 0 such that for all t > 0 and all m > 0,

c · exp(−C t/m²) ≤ P[M_t ≤ m] ≤ C · exp(−c(1 − p) t/m²).

B solution C
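The bound in Exercise 2.27 can be seen empirically (our own sketch; the holding probability p = 0.3 is an arbitrary choice):

```python
import random

def max_abs(t, p, rng):
    """max_{k<=t} |X_k| for a lazy walk on Z with holding probability p."""
    x, m = 0, 0
    for _ in range(t):
        if rng.random() >= p:        # move with probability 1 - p
            x += rng.choice((-1, 1))
        m = max(m, abs(x))
    return m

rng = random.Random(4)
t, p, trials = 100, 0.3, 5000
second_moment = sum(max_abs(t, p, rng) ** 2 for _ in range(trials)) / trials
print(second_moment)  # comfortably below the bound 4t = 400
assert second_moment <= 4 * t
```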


2.6 Martingale Convergence

One amazing property of martingales is that they converge under appropriate conditions.

Definition 2.6.1 A sub-martingale is a process (M_t)_t such that E|M_t| < ∞ and E[M_{t+1} | M_0, ..., M_t] ≥ M_t for all t. A super-martingale is a process (M_t)_t such that E|M_t| < ∞ and E[M_{t+1} | M_0, ..., M_t] ≤ M_t for all t. A process (H_t)_t is called predictable (with respect to (M_t)_t) if H_t is measurable with respect to σ(M_0, ..., M_{t−1}) for all t.

Of course any martingale is both a sub-martingale and a super-martingale.

Exercise 2.29 Show that if (M_t)_t is a sub-martingale, then X_t := (M_t − a) 1_{M_t>a} is also a sub-martingale. B solution C

Exercise 2.30 Show that if (M_t)_t is a sub-martingale (respectively, super-martingale) and (H_t)_t is a bounded nonnegative predictable process, then the process

(H · M)_t := Σ_{s=1}^t H_s (M_s − M_{s−1})

is a sub-martingale (respectively, super-martingale). Show that when (M_t)_t is a martingale and (H_t)_t is bounded and predictable but not necessarily nonnegative, then ((H · M)_t)_t is a martingale. B solution C

Exercise 2.31 Show that if (M_t)_t is a sub-martingale and T is a stopping time, then (M_{T∧t})_t is a sub-martingale. B solution C

Lemma 2.6.2 (Upcrossing lemma) Let (M_t)_t be a sub-martingale. Fix a < b ∈ R and let U_t be the number of upcrossings of the interval (a, b) up to time t; more precisely, define N_0 = −1 and inductively

N_{2k−1} = inf {t > N_{2k−2} : M_t ≤ a}   and   N_{2k} = inf {t > N_{2k−1} : M_t ≥ b}.

Set U_t = sup {k : N_{2k} ≤ t}. Then

(b − a) · E[U_t] ≤ E[(M_t − a) 1_{M_t>a}] − E[(M_0 − a) 1_{M_0>a}].


Proof Define X_t = a + (M_t − a) 1_{M_t>a}. Set H_t = 1_{∃ k : N_{2k−1} < t ≤ N_{2k}}; that is, H_t = 1 exactly when N_{2k−1} ≤ t − 1 < N_{2k} for some k, so (H_t)_t is predictable. Now, one verifies that

Σ_{s=1}^t H_s (X_s − X_{s−1}) ≥ Σ_{k=1}^{U_t} Σ_{s=N_{2k−1}+1}^{N_{2k}} (X_s − X_{s−1}) = Σ_{k=1}^{U_t} (M_{N_{2k}} − a) ≥ (b − a) U_t.

Note that (X_t)_t is a sub-martingale by Exercise 2.29. By Exercise 2.30, since H_s ∈ [0, 1], both A_t := Σ_{s=1}^t H_s (X_s − X_{s−1}) and B_t := Σ_{s=1}^t (1 − H_s)(X_s − X_{s−1}) are also sub-martingales. Specifically, E[B_t] ≥ E[B_0] = 0. We have that

(b − a) E[U_t] ≤ E[A_t] ≤ E[A_t + B_t] = E[X_t − X_0].

This is the required form.

Theorem 2.6.3 (Martingale convergence theorem) Let (M_t)_t be a sub-martingale such that sup_t E[M_t 1_{M_t>0}] < ∞. Then there exists a random variable M_∞ such that M_t → M_∞ a.s. and E|M_∞| < ∞.

Proof Since (M_t − a) 1_{M_t>a} ≤ M_t 1_{M_t>0} + |a|, we have by the upcrossing lemma that

(b − a) E[U_t] ≤ E[M_t 1_{M_t>0}] + |a|,

where U_t is the number of upcrossings of the interval (a, b). Let U = U_{(a,b)} = lim_{t→∞} U_t be the total number of upcrossings of (a, b). By Fatou's lemma,

E[U] ≤ lim inf_{t→∞} E[U_t] ≤ (|a| + sup_t E[M_t 1_{M_t>0}]) / (b − a) < ∞.

Specifically, U < ∞ a.s. Since this holds for all a < b ∈ R, taking a union bound over all a < b ∈ Q, we have that

P[∃ a < b ∈ Q : lim inf_{t→∞} M_t ≤ a < b ≤ lim sup_{t→∞} M_t] ≤ Σ_{a<b∈Q} P[U_{(a,b)} = ∞] = 0.

So (M_t)_t converges a.s., perhaps to ±∞; call the limit M_∞. By Fatou's lemma,

E[M_∞ 1_{M_∞>0}] ≤ lim inf_{t→∞} E[M_t 1_{M_t>0}] ≤ sup_t E[M_t 1_{M_t>0}] < ∞.

Another application of Fatou's lemma gives

E[−M_∞ 1_{M_∞<0}] ≤ lim inf_{t→∞} (E[M_t 1_{M_t>0}] − E[M_t]) ≤ lim inf_{t→∞} E[M_t 1_{M_t>0}] − E[M_0] ≤ sup_t E[M_t 1_{M_t>0}] − E[M_0] < ∞.


E|M_∞| = E[M_∞ 1_{M_∞>0}] − E[M_∞ 1_{M_∞<0}] < ∞.

2.7 Bounded Harmonic Functions

Let h ∈ BHF(G, µ). Let x ∈ G and k > 0 be such that P[X_k = x] = α > 0. Exercise 2.36 proves that P[X_{t+k} = X_t x | X_t] = α a.s. for all t (this is known as the Markov property, which will be discussed in Section 3.1, Exercise 3.1). Thus, for any ε > 0,

P[|h(X_t x) − h(X_t)| > ε] ≤ α^{−1} · P[|h(X_{t+k}) − h(X_t)| > ε] → 0.


Now assume that dim BHF(G, µ) < ∞. Then there exists a ball B = B(1, r) such that for all f, f' ∈ BHF(G, µ), if f|_B = f'|_B then f = f'. Define a norm on BHF(G, µ) by ||f||_B = max_{x∈B} |f(x)|. Since all norms on finite-dimensional spaces are equivalent, there exists a constant K > 0 such that ||f||_B ≤ ||f||_∞ ≤ K · ||f||_B for all f ∈ BHF(G, µ). Now, since ||y.h||_∞ = ||h||_∞, for any t we have a.s.

inf_{c∈C} ||h − c||_∞ = inf_{c∈C} ||X_t^{−1}.h − c||_∞ ≤ K · inf_{c∈C} ||X_t^{−1}.h − c||_B ≤ K · inf_{c∈C} max_{x∈B} |h(X_t x) − c| ≤ K · max_{x∈B} |h(X_t x) − h(X_t)|.

Since this last term converges to 0 in probability, inf_{c∈C} ||h − c||_∞ = 0 must hold. Thus, h is constant.

Exercise 2.36

P[Xt+k = Xt x | Xt ] = P[Xk = x] a.s. for all t, k.

B solution C

Exercise 2.37 Check that ||f||_B in the proof of Theorem 2.7.1 is indeed a norm, for the specific B chosen. (In general, it is only a semi-norm.)

The following is a major open problem in the theory of bounded harmonic functions. It basically states that the property of having only constant bounded harmonic functions should not change if we restrict to "nice" random walk measures. We will return to this conjecture in Chapter 6.

Conjecture 2.7.2 Let G be a finitely generated group. Then for any two µ, ν ∈ SA(G, 2), we have dim BHF(G, µ) = dim BHF(G, ν).

Exercise 2.38 Let G be a finitely generated group and µ ∈ SA(G, 1). Recall the space of Lipschitz harmonic functions LHF(G, µ). Show that if there exists a nonconstant positive h ∈ LHF(G, µ), then dim LHF(G, µ) = ∞. (Hint: consider LHF modulo the constant functions.) B solution C


2.8 Solutions to Exercises

Solution to Exercise 2.1 Let A = {Y > 0} and B = {Y ≤ 0}. Note that A, B ∈ G. Thus E[|Y| 1_A] = E[Y 1_A] = E[X 1_A] ≤ E[|X| 1_A], and similarly E[|Y| 1_B] = −E[Y 1_B] = −E[X 1_B] ≤ E[|X| 1_B]. Thus E|Y| ≤ E|X| < ∞, so Y is integrable. Similarly for Z.

Now let A = {Y − Z > 1/n}. Then, since A ∈ G, we have

0 = E[(Y − Z) 1_A] ≥ P[A] · (1/n),

implying that P[Y > Z + 1/n] = 0. A union bound over n implies that P[Y > Z] = 0. Reversing the roles of Y, Z, we have that P[Z > Y] = 0 as well, culminating in P[Y ≠ Z] = 0.

Solution to Exercise 2.2 When X is G-measurable, since for any A ∈ G we have E[X 1_A] = E[X 1_A] trivially, we have that E[X | G] = X a.s. If X is independent of G, then for any A ∈ G we have E[X 1_A] = E[X] · E[1_A] = E[E[X] · 1_A]. Since a constant random variable is measurable with respect to any σ-algebra (and specifically E[X] is G-measurable), we have the second assertion. The third assertion is a direct consequence, since a constant is always independent of G.

Solution to Exercise 2.3 Since Ω ∈ G, we have

E[X] = E[X 1_Ω] = E[E[X | G] 1_Ω] = E[E[X | G]].

Solution to Exercise 2.4 Since B ∈ G,

E[1_B P[A | G]] = E[1_A 1_B] = P[B | A] · P[A].

Solution to Exercise 2.5 Linearity: let Z = a E[X | G] + E[Y | G]. So Z is G-measurable. Also, for any A ∈ G,

E[(aX + Y) 1_A] = a E[X 1_A] + E[Y 1_A] = a E[E[X | G] 1_A] + E[E[Y | G] 1_A] = E[Z 1_A].

Monotonicity: by linearity it suffices to show that if X ≥ 0 a.s. then E[X | G] ≥ 0 a.s. Indeed, for X ≥ 0 a.s., let Y = E[X | G]. Then we may consider {Y < 0} ∈ G, and we have that 0 ≤ E[X 1_{Y<0}] = E[Y 1_{Y<0}] ≤ 0, so that Y 1_{Y<0} = 0 a.s., i.e. Y ≥ 0 a.s.

Solution to Exercise 2.14 We have

E|M_{T∧t}| ≤ E[Σ_{j=0}^{t−1} 1_{T>j} · (|M_{j+1}| − |M_j|)] + E|M_0| < ∞.

It now suffices to show that

E[M_{T∧(t+1)} | M_0, ..., M_t] = M_{T∧t}.

Indeed, since {T = j} = {T ≤ j} \ {T ≤ j − 1} ∈ σ(M_0, ..., M_j) ⊆ σ(M_0, ..., M_t) for j ≤ t, we have that

E[M_{T∧(t+1)} | M_0, ..., M_t] = E[M_{t+1} 1_{T>t} | M_0, ..., M_t] + Σ_{j=0}^t E[M_j 1_{T=j} | M_0, ..., M_t] = E[M_{t+1} | M_0, ..., M_t] · 1_{T>t} + Σ_{j=0}^t M_j 1_{T=j} = M_{T∧t}.


Solution to Exercise 2.15 For any t we have that {T ∧ T' ≤ t} = {T ≤ t} ∪ {T' ≤ t} ∈ F_t.

Solution to Exercise 2.16 If f is µ-harmonic, then for any x ∈ G,

E[|f(X_t)| 1_{X_{t−1}=x}] = Σ_y µ(y) |f(xy)| · P[X_{t−1} = x] < ∞,

so that f(X_t) is integrable for every t. Also,

E[f(X_{t+1}) | F_t] = E[X_t^{−1}.f(U_{t+1}) | F_t] = Σ_y µ(y) X_t^{−1}.f(y) = Σ_y µ(y) f(X_t y) = f(X_t),

which shows that (f(X_t))_t is a martingale.

Now assume that f is such that (f(X_t))_t is a martingale. Since µ is adapted, for any x ∈ G there exists t > 0 such that P[X_t = x] > 0. So

f(x) = E[f(X_{t+1}) | X_t = x] = Σ_y µ(y) f(xy),

which implies that f is harmonic at x. As before, the above sum converges absolutely because f(X_{t+1}) is integrable.

Solution to Exercise 2.17 Set Y_K = |X| 1_{|X|>K}. So Y_K → 0 a.s. as K → ∞. Since 0 ≤ Y_K ≤ |X| for all K, by dominated convergence we have that

lim_{K→∞} E[|X_α| 1_{|X_α|>K}] = lim_{K→∞} E[Y_K] = 0.

Solution to Exercise 2.18 Uniform integrability implies that there exists K such that for all α we have E[|X_α| 1_{|X_α|>K}] < 1. Since E[|X_α| 1_{|X_α|≤K}] ≤ K, we arrive at

sup_α E|X_α| ≤ sup_α (E[|X_α| 1_{|X_α|>K}] + E[|X_α| 1_{|X_α|≤K}]) ≤ 1 + K.

Solution to Exercise 2.19 Choose p = 1 + ε and q = (1 + ε)/ε. Hölder's inequality gives, for any α,

E[|X_α| 1_{|X_α|>K}] ≤ (E[|X_α|^p])^{1/p} · (P[|X_α| > K])^{1/q} ≤ (E[|X_α|^{1+ε}])^{1/p} · (E[|X_α|^{1+ε}])^{1/q} · K^{−(1+ε)/q} = E[|X_α|^{1+ε}] · K^{−ε}.

Thus,

lim_{K→∞} sup_α E[|X_α| 1_{|X_α|>K}] ≤ sup_α E[|X_α|^{1+ε}] · lim_{K→∞} K^{−ε} = 0.

Solution to Exercise 2.20 Let ε > 0. There exists δ > 0 such that if A ∈ F and P[A] < δ, then E[|X| 1_A] < ε. (Otherwise we could find (A_n)_n ⊂ F such that P[A_n] < n^{−1} and E[|X| 1_{A_n}] ≥ ε. But since X is integrable, E[|X| 1_{A_n}] → 0 by the dominated convergence theorem, a contradiction.) Take K > δ^{−1} E|X|. Then, since {E[|X| | F_α] > K} ∈ F_α,

E[|E[X | F_α]| 1_{|E[X|F_α]|>K}] ≤ E[E[|X| | F_α] 1_{E[|X||F_α]>K}] = E[E[|X| 1_{E[|X||F_α]>K} | F_α]] = E[|X| 1_{E[|X||F_α]>K}].

Using A = {E[|X| | F_α] > K}, we have that

P[A] ≤ E[E[|X| | F_α]] · K^{−1} = E|X| · K^{−1} < δ,

so E[|X| 1_A] < ε. This was uniform over α, so we conclude that for all ε > 0 there exists δ > 0 such that if K > δ^{−1} E|X|, then

sup_α E[|E[X | F_α]| 1_{|E[X|F_α]|>K}] < ε.

This is exactly uniform integrability.

Solution to Exercise 2.21 By Fatou's lemma,

E|X| = E[lim_n |X_n|] ≤ lim inf_{n→∞} E|X_n| ≤ sup_n E|X_n| < ∞.

Solution to Exercise 2.22 Note that if |M_t| ≤ m a.s., then obviously (M_t)_t is uniformly integrable. Also, since T < ∞ a.s., we have that |M_{T∧t}| → |M_T| a.s. as t → ∞. Thus |M_T| ≤ m a.s., implying that E|M_T| < ∞. By the optional stopping theorem, E[M_T] = E[M_0].

Solution to Exercise 2.23 This is similar to the proof of Theorem 2.3.3. Define

ϕ_K(x) = K if x > K;  x if |x| ≤ K;  −K if x < −K.

Note that |ϕ_K(x) − x| ≤ |x| 1_{|x|>K}. Since ϕ_K(X_n) → ϕ_K(X) a.s., and |ϕ_K(X_n)| ≤ K for all n, by dominated convergence we have that ϕ_K(X_n) → ϕ_K(X) in L¹. Thus,

E|X_n − X| ≤ E|ϕ_K(X_n) − ϕ_K(X)| + E[|X_n| 1_{|X_n|>K}] + E[|X| 1_{|X|>K}],

implying that

lim sup_{n→∞} E|X_n − X| ≤ sup_n E[|X_n| 1_{|X_n|>K}] + E[|X| 1_{|X|>K}].

Since E|X| < ∞ (as the a.s. limit of a uniformly integrable sequence), this goes to 0 as K → ∞.

Solution to Exercise 2.24 Set X_t := |M_t| − |M_{T∧t}|. Using Jensen's inequality, we have a.s.

E[|M_{t+1}| | M_t, ..., M_0] ≥ |E[M_{t+1} | M_t, ..., M_0]| = |M_t|.

(That is, (|M_t|)_t is a sub-martingale.) Note that

|M_{T∧(t+1)}| − |M_{T∧t}| = (|M_{t+1}| − |M_t|) 1_{T>t},

so

X_{t+1} − X_t = (|M_{t+1}| − |M_t|) 1_{T≤t}.

Thus,

E[X_{t+1} − X_t | M_0, ..., M_t] = 1_{T≤t} · E[|M_{t+1}| − |M_t| | M_0, ..., M_t] ≥ 0.

(That is, (X_t)_t is also a sub-martingale.) Taking expectations, we get that E[X_t] ≥ E[X_{t−1}] ≥ ··· ≥ E[X_0] = E[|M_0| − |M_{T∧0}|] = 0. Hence E|M_t| ≥ E|M_{T∧t}|, as required.

Solution to Exercise 2.25 Since sup_t E|M_{T∧t}| ≤ sup_t E|M_t| < ∞, we have that M_{T∧t} → M_∞ a.s., for some integrable M_∞. But M_{T∧t} → M_T a.s. as well, which implies that M_T = M_∞ a.s., so M_T is integrable.

https://doi.org/10.1017/9781009128391.004 Published online by Cambridge University Press


Solution to Exercise 2.26. Let $K = b - a$. Compute, for any $a < x < b$,
$$ \mathbb{P}[X_{t+K} \notin (a,b) \mid X_t = x,\ T_{a,b} > t] \ge \mathbb{P}[\forall\, 0 \le j < K,\ U_{t+j+1} = 1 \mid X_t = x,\ T_{a,b} > t] = 2^{-K}, $$
using the fact that $(U_{t+j+1})_{j=0}^\infty$ are all independent of $\mathcal{F}_t$, and that $\{T_{a,b} > t\} = \{T_{a,b} \le t\}^c \in \mathcal{F}_t$. Since $T_{a,b} > t + K$ implies that $T_{a,b} > t$ and that $X_{t+K} \in (a,b)$, we may bound
$$ \mathbb{P}[T_{a,b} > t+K] = \sum_{x=a+1}^{b-1} \mathbb{P}[T_{a,b} > t+K \mid X_t = x,\ T_{a,b} > t] \cdot \mathbb{P}[X_t = x,\ T_{a,b} > t] \le \left(1 - 2^{-K}\right) \sum_{x=a+1}^{b-1} \mathbb{P}[X_t = x,\ T_{a,b} > t] = \left(1 - 2^{-K}\right) \mathbb{P}[T_{a,b} > t]. $$
Inductively we obtain that
$$ \mathbb{P}[T_{a,b} > Kn] \le \left(1 - 2^{-K}\right)^n. $$
□

Solution to Exercise 2.27. This is just the $L^p$ maximal inequality, with $p = 2$, together with the fact that $\mathbb{E}|X_t|^2 = (1-p)t \le t$, which stems from the OST applied to the martingale $|X_t|^2 - (1-p)t$. □

Solution to Exercise 2.28. We start with the upper bound. Consider $|X_t|^2 - (1-p)t$. This is easily seen to be a martingale. Started at $|x| \le m$ and run up to the stopping time
$$ T = T_{m+1} \wedge T_{-m-1} = \inf\{t \ge 0 : |X_t| = m+1\}, $$
this is a bounded martingale, so by the OST (Theorem 2.3.3),
$$ |x|^2 = \mathbb{E}_x\left[|X_T|^2 - (1-p)T\right] = (m+1)^2 - (1-p)\,\mathbb{E}_x[T]. $$
Hence, by Markov's inequality, uniformly over $|x| \le m$, we have $\mathbb{P}_x\left[T > \frac{2}{1-p}(m+1)^2\right] \le \frac12$.

Let $U_t = X_t - X_{t-1}$ for all $t \ge 1$, so that $(U_t)_{t\ge1}$ are i.i.d.-$\mu$. Since $(U_{s+k})_{k\ge1}$ are independent of $\mathcal{F}_s$, for any $|x| \le m$ and any $t > s$ we have that
$$ \mathbb{P}_x[T > t \mid \mathcal{F}_s] = \mathbf{1}_{\{T>s\}} \cdot \mathbb{P}_x\left[\forall\,1\le k\le t-s,\ \left|X_s + \sum_{j=1}^k U_{s+j}\right| \le m \;\middle|\; \mathcal{F}_s\right] = \mathbf{1}_{\{T>s\}} \cdot \mathbb{P}_{X_s}[T > t-s] \le \mathbf{1}_{\{T>s\}} \cdot \sup_{|y|\le m} \mathbb{P}_y[T > t-s]. $$
Thus, for $|x| \le m$ and any $t > \frac{2}{1-p}(m+1)^2$,
$$ \mathbb{P}_x[M_t \le m] = \mathbb{P}_x[T > t] = \mathbb{P}_x\left[T > t,\ T > \tfrac{2}{1-p}(m+1)^2\right] \le \mathbb{P}_x\left[T > \tfrac{2}{1-p}(m+1)^2\right] \cdot \sup_{|y|\le m} \mathbb{P}_y\left[T > t - \tfrac{2}{1-p}(m+1)^2\right] \le \cdots \le 2^{-\left\lfloor \frac{(1-p)t}{2(m+1)^2} \right\rfloor},

which gives the desired upper bound.

Now for the lower bound. For $0 \le x \le m$ we have
$$ \mathbb{P}_x[X_t < -m] \le \mathbb{P}_x[|X_t - X_0| \ge m+1] \le \mathbb{P}[|X_t| \ge m+1] \le \frac{\mathbb{E}|X_t|^2}{(m+1)^2} \le \frac{t}{(m+1)^2}. $$
Similarly, for $-m \le x \le 0$,
$$ \mathbb{P}_x[X_t > m] \le \frac{t}{(m+1)^2}. $$
Under $\mathbb{P}_0$, both $X_t$ and $-X_t$ have the same distribution. Shifting by $x$, for some $|x| \le m$, we get that $\mathbb{P}_x[X_t \le x] = \mathbb{P}[X_t \le 0] \ge \frac12$, and similarly $\mathbb{P}_x[X_t \ge x] \ge \frac12$. Recall that by the $L^p$ maximal inequality we know that
$$ \mathbb{E}_x|M_t|^2 \le 4\,\mathbb{E}_x|X_t|^2 = 4\left(x^2 + (1-p)t\right) \le 4\left(x^2 + t\right), $$
since $|X_t|^2 - (1-p)t$ is a martingale.

Putting all this together we have for any $0 \le x \le m$,
$$ \mathbb{P}_x[M_t \le km,\ |X_t| \le m] \ge \mathbb{P}_x[X_t \le x] - \mathbb{P}_x[X_t < -m] - \mathbb{P}_x[M_t > km] \ge \frac12 - \frac{t}{(m+1)^2} - \frac{\mathbb{E}_x|M_t|^2}{k^2 m^2} \ge \frac12 - \frac{t}{(m+1)^2} - \frac{4(x^2+t)}{k^2 m^2}, $$
and similarly for $-m \le x \le 0$ we have
$$ \mathbb{P}_x[M_t \le km,\ |X_t| \le m] \ge \frac12 - \frac{t}{(m+1)^2} - \frac{4(x^2+t)}{k^2 m^2}. $$
Choosing $k = 8$ and $\ell = \left\lfloor \tfrac18 m^2 \right\rfloor$, we arrive at the conclusion that for all $|x| \le m$ we have
$$ \mathbb{P}_x[M_\ell \le 8m,\ |X_\ell| \le m] > \tfrac14. $$
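As a quick numerical sanity check (a verification sketch, not part of the original argument), one can confirm that with $k = 8$ and $\ell = \lfloor m^2/8 \rfloor$ the quantity $\frac12 - \frac{\ell}{(m+1)^2} - \frac{4(x^2+\ell)}{k^2 m^2}$, evaluated at the worst case $|x| = m$, indeed exceeds $\frac14$:

```python
# Check the constants chosen in Exercise 2.28: with k = 8 and
# ell = floor(m^2 / 8), the bound 1/2 - ell/(m+1)^2 - 4(x^2 + ell)/(k^2 m^2)
# at the worst case |x| = m stays above 1/4.
def lower_bound(m, k=8):
    ell = m * m // 8
    return 0.5 - ell / (m + 1) ** 2 - 4 * (m * m + ell) / (k * k * m * m)

for m in [1, 2, 5, 10, 100, 1000]:
    assert lower_bound(m) > 0.25, m
print("bound exceeds 1/4 for all tested m")
```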

Similarly to the proof of the upper bound, for any $|x| \le m$ and any $t > s$ we have
$$ \mathbb{P}_x[M_t \le 8m,\ |X_s| \le m \mid \mathcal{F}_s] = \mathbf{1}_{\{M_s \le 8m,\ |X_s| \le m\}} \cdot \mathbb{P}_x\left[\forall\,1\le k\le t-s,\ \left|X_s + \sum_{j=1}^k U_{s+j}\right| \le 8m \;\middle|\; \mathcal{F}_s\right] = \mathbf{1}_{\{M_s \le 8m,\ |X_s| \le m\}} \cdot \mathbb{P}_{X_s}[M_{t-s} \le 8m] \ge \mathbf{1}_{\{M_s \le 8m,\ |X_s| \le m\}} \cdot \inf_{|y|\le m} \mathbb{P}_y[M_{t-s} \le 8m]. $$
We obtain that for any $|x| \le m$ and $t > \ell$,
$$ \mathbb{P}_x[M_t \le 8m] \ge \mathbb{P}_x[M_t \le 8m,\ |X_\ell| \le m] \ge \mathbb{P}_x[M_\ell \le 8m,\ |X_\ell| \le m] \cdot \inf_{|y|\le m} \mathbb{P}_y[M_{t-\ell} \le 8m] \ge \cdots \ge 4^{-\lfloor t/\ell \rfloor}, $$
which completes the proof of the lower bound. □

Solution to Exercise 2.29. The function $\varphi(x) = x \mathbf{1}_{\{x > 0\}}$ is convex and nondecreasing, so by Jensen's inequality
$$ \mathbb{E}[\varphi(M_{t+1} - a) \mid M_0, \ldots, M_t] \ge \varphi\left(\mathbb{E}[M_{t+1} - a \mid M_0, \ldots, M_t]\right) \ge \varphi(M_t - a). $$
□

Solution to Exercise 2.30. We write a solution only for the sub-martingale case, since all cases are very similar. We have
$$ \mathbb{E}[(H \cdot M)_{t+1} - (H \cdot M)_t \mid M_0, \ldots, M_t] = H_{t+1}\, \mathbb{E}[M_{t+1} - M_t \mid M_0, \ldots, M_t] \ge 0. $$
We have used the fact that $H_{t+1}$ is $\sigma(M_0, \ldots, M_t)$-measurable. □

Solution to Exercise 2.31. Let $H_t = \mathbf{1}_{\{T \ge t\}}$, which is a bounded predictable process. Then $M_{T \wedge t} = M_0 + \sum_{s=1}^t H_s (M_s - M_{s-1})$, which is a sub-martingale by Exercise 2.30. □

Solution to Exercise 2.32. The process $X_t := -M_t$ is a sub-martingale and $\sup_t \mathbb{E}\left[X_t \mathbf{1}_{\{X_t > 0\}}\right] \le 0 < \infty$. So $X_t$ converges a.s., which implies the a.s. convergence of $M_t$. □


Solution to Exercise 2.33. Since $(M_t)_t$ is uniformly integrable,
$$ \sup_t \mathbb{E}\left[M_t \mathbf{1}_{\{M_t > 0\}}\right] \le \sup_t \mathbb{E}|M_t| < \infty, $$
so by martingale convergence $M_t \to M_\infty$ a.s., for some integrable $M_\infty$. By uniform integrability again, $M_t \to M_\infty$ in $L^1$ as well. □

Solution to Exercise 2.34. Let $M_t = \mathbb{E}[X \mid \mathcal{F}_t]$. Since $(M_t)_t$ is a uniformly integrable martingale, it converges a.s. and in $L^1$ to some integrable $M_\infty$. Now, for any event $A$, we have that $M_t \mathbf{1}_A \to M_\infty \mathbf{1}_A$ a.s. and in $L^1$ as well. Thus, if $A \in \mathcal{F}_n$ for some $n$,
$$ \mathbb{E}[M_\infty \mathbf{1}_A] = \lim_{t\to\infty} \mathbb{E}[M_t \mathbf{1}_A] = \mathbb{E}[X \mathbf{1}_A]. $$

Consider the probability measures
$$ \mu(A) := \frac{\mathbb{E}\left[\left((M_\infty)^+ + X^-\right) \mathbf{1}_A\right]}{\mathbb{E}\left[(M_\infty)^+ + X^-\right]} \qquad \text{and} \qquad \nu(A) := \frac{\mathbb{E}\left[\left((M_\infty)^- + X^+\right) \mathbf{1}_A\right]}{\mathbb{E}\left[(M_\infty)^- + X^+\right]}. $$
Since $M_\infty, X$ are integrable, these are indeed probability measures, and $\mu, \nu$ agree on the $\pi$-system $\bigcup_t \mathcal{F}_t$. Thus, by Dynkin's lemma (also known as the $\pi$-$\lambda$ theorem), $\mu, \nu$ must agree on all of $\mathcal{F}_\infty$. Hence $M_\infty = \mathbb{E}[X \mid \mathcal{F}_\infty]$ a.s. □

Solution to Exercise 2.35. Set $X_n = \mathbb{E}[X \mid \sigma_n]$. Fix $n$ and consider $M_t := X_{n-t}$ for $t \le n$ and $M_t = X_0$ for $t \ge n$. Then $(M_t)_t$ is a martingale. If $U_n$ is the number of upcrossings of the interval $(a,b)$ by $M_0, \ldots, M_n$, then
$$ (b-a)\,\mathbb{E}[U_n] \le \mathbb{E}\left[(M_n - a) \mathbf{1}_{\{M_n > a\}}\right] = \mathbb{E}\left[(X_0 - a) \mathbf{1}_{\{X_0 > a\}}\right]. $$
Now, let $U_\infty$ be the number of upcrossings of the interval $(a,b)$ by $(X_n)_n$. Then $U_n \nearrow U_\infty$, so by monotone convergence, $\mathbb{E}[U_\infty] < \infty$. Exactly as in the proof of the martingale convergence theorem, this holding for all $a < b \in \mathbb{Q}$ implies that $X_n \to X_\infty$ a.s. for some integrable $X_\infty$. Since $(X_n)_n$ is uniformly integrable, we get that $X_n \to X_\infty$ in $L^1$ as well.

Set $Y = \limsup_{n\to\infty} X_n$, so $Y = X_\infty$ a.s. Note that for any $n$, all $(X_t)_{t \ge n}$ are $\sigma_n$-measurable, so $Y = \limsup_{n \le t \to \infty} X_t$ is also $\sigma_n$-measurable. This implies that $Y$ is measurable with respect to $\sigma_\infty = \bigcap_n \sigma_n$. For any $A \in \sigma_\infty$, we have that $X_n \mathbf{1}_A \to X_\infty \mathbf{1}_A$ in $L^1$, so
$$ \mathbb{E}[Y \mathbf{1}_A] = \mathbb{E}[X_\infty \mathbf{1}_A] = \lim_{n\to\infty} \mathbb{E}\left[\mathbb{E}[X \mid \sigma_n] \mathbf{1}_A\right] = \lim_{n\to\infty} \mathbb{E}[X \mathbf{1}_A]. $$
Thus, $\mathbb{E}[X \mid \sigma_\infty] = Y = X_\infty$ a.s. □

Solution to Exercise 2.36. Write $X_t = U_1 \cdots U_t$ for $(U_t)_{t\ge1}$ i.i.d.-$\mu$ elements. Set $Y = X_t^{-1} X_{t+k}$, and note that $Y$ is independent of $\mathcal{F}_t$, and specifically that $Y$ is independent of $X_t$. Thus,
$$ \mathbb{P}[X_{t+k} = X_t x \mid X_t] = \mathbb{P}[Y = x \mid X_t] = \mathbb{P}[Y = x] \quad \text{a.s.} $$
Since $(U_t)_{t\ge1}$ all have the same distribution, we find that $Y = U_{t+1} \cdots U_{t+k}$ has the same distribution as $X_k = U_1 \cdots U_k$. □

Solution to Exercise 2.38. Let $V = \mathrm{LHF}(G,\mu)/\mathbb{C}$ (modulo the constant functions). Fix some finite symmetric generating set $S$ of $G$. Assume that $\dim \mathrm{LHF}(G,\mu) < \infty$, so $\dim V < \infty$. Recall the Lipschitz semi-norm $\|\nabla_S f\|_\infty := \sup_{x \in G,\, s \in S} |f(xs) - f(x)|$. We have that $\|\nabla_S f\|_\infty = 0$ if and only if $f$ is constant. Also, $\|\nabla_S (f+c)\|_\infty = \|\nabla_S f\|_\infty$ for any constant $c$. Thus, $\|\nabla_S \cdot\|_\infty$ induces a norm on $V$.

Another semi-norm is given by $\|f\|_B := \max_{x \in B} |f(x) - f(1)|$, where $B$ is some finite subset of $G$. Note that $\|f + c\|_B = \|f\|_B$ for any constant $c$. Because $\dim \mathrm{LHF}(G,\mu) < \infty$, if $B = B(1,r)$ for $r$ large enough then $\|\cdot\|_B$ is a semi-norm on $\mathrm{LHF}(G,\mu)$ such that $\|f\|_B = 0$ if and only if $f$ is constant. Thus, $\|\cdot\|_B$ induces a norm on $V$ as well.


Since $V$ is finite dimensional, all norms on it are equivalent. Thus, there exists a constant $K > 0$ such that $\|\nabla_S v\|_\infty \le K \cdot \|v\|_B$ for any $v \in V$. Since these semi-norms are invariant to adding constants, this implies that $\|\nabla_S f\|_\infty \le K \cdot \|f\|_B$ for all $f \in \mathrm{LHF}(G,\mu)$.

Now, let $h \in \mathrm{LHF}(G,\mu)$ be a positive harmonic function. Then $(h(X_t))_t$ is a positive martingale, implying that it converges a.s. Thus, for any fixed $k$, we have $h(X_{t+k}) - h(X_t) \to 0$ a.s. Fix $x \in G$ and let $k$ be such that $\mathbb{P}[X_k = x] = \alpha > 0$. By Exercise 2.36, $\mathbb{P}[X_{t+k} = X_t x \mid X_t] = \alpha$, independently of $t$. Since a.s. convergence implies convergence in probability, for any $\varepsilon > 0$,
$$ \mathbb{P}[|h(X_t x) - h(X_t)| > \varepsilon] \le \alpha^{-1}\, \mathbb{P}[|h(X_{t+k}) - h(X_t)| > \varepsilon] \to 0. $$
So $h(X_t x) - h(X_t) \to 0$ in probability, for any $x \in G$. Since $B$ is a finite ball this implies that $\max_{x \in B} |h(X_t x) - h(X_t)| \to 0$ in probability. Now we also use the fact that $\|\nabla_S (x.h)\|_\infty = \|\nabla_S h\|_\infty$. Thus, for all $t$, we have a.s. that
$$ \|\nabla_S h\|_\infty = \left\|\nabla_S \left(X_t^{-1}.h\right)\right\|_\infty \le K \cdot \left\|X_t^{-1}.h\right\|_B = K \cdot \max_{x \in B} |h(X_t x) - h(X_t)|. $$
Since this converges to $0$ in probability, we have $\|\nabla_S h\|_\infty = 0$ and $h$ is constant. □

3 Markov Chains


As we have already started to see, random walks play a fundamental role in the study of harmonic functions. In this chapter (and in Chapter 4) we will review fundamentals of random walks on groups and, more generally, Markov chains.

3.1 Markov Chains

Recall Section 1.2. Let $G$ be a countable set. Let $(P(x,y))_{x,y \in G}$ be a matrix (albeit, perhaps, infinite dimensional). We say $P$ is stochastic if $P(x,y) \ge 0$ and $\sum_z P(x,z) = 1$, for all $x, y \in G$.

Let $\nu$ be a probability measure on $G$. Let $\mathcal{F}$ be the cylinder $\sigma$-algebra on $G^{\mathbb{N}}$. Let $X_t(\omega) = \omega_t$ for $\omega \in G^{\mathbb{N}}$. For $t \ge 0$, let $\mathbb{P}_{\nu,t} \colon \mathcal{F}_t \to [0,1]$ be the probability measure defined by
$$ \mathbb{P}_{\nu,t}\left[C(\{0,1,\ldots,t\},\omega)\right] = \nu(\omega_0) \cdot \prod_{j=1}^{t} P(\omega_{j-1}, \omega_j). $$
One easily verifies that these measures are consistent, and by Kolmogorov's extension theorem there exists a unique probability measure $\mathbb{P}_\nu$ on $(G^{\mathbb{N}}, \mathcal{F})$ such that $\mathbb{P}_\nu[A] = \mathbb{P}_{\nu,t}[A]$ for any $A \in \sigma(X_0,\ldots,X_t)$ and any $t \ge 0$.

Definition 3.1.1 A sequence $(X_t)_t$ with distribution $\mathbb{P}_\nu$ as above is called a Markov chain with transition matrix $P$ and starting distribution $\nu$. Sometimes we say that $(X_t)_t$ is Markov-$(\nu, P)$. When $\nu = \delta_x$, we use the notation $\mathbb{P}_x = \mathbb{P}_{\delta_x}$, and write $\mathbb{E}_\nu, \mathbb{E}_x$ for the expectation with respect to $\mathbb{P}_\nu, \mathbb{P}_x$, respectively. The set $G$ in which the random process $(X_t)_t$ takes its values is called the state space.

To specify a Markov chain, one can specify either the probability space, or the transition matrix and starting distribution. Somewhat carelessly, we will not distinguish these, and use both as the object designated by the name “Markov chain.”

Remark 3.1.2 Recall that if $P$ is a $G \times G$ matrix, and $f \colon G \to \mathbb{R}$, we define $Pf \colon G \to \mathbb{R}$ by $Pf(x) = \sum_y P(x,y) f(y)$ whenever the sum converges. Similarly, $fP(x) = \sum_y f(y) P(y,x)$ whenever the sum converges.

The following property is called the Markov property.


Exercise 3.1 (Markov property) Let $P$ be a stochastic matrix and $(X_t)_t$ be the corresponding Markov chain. Show that
$$ \mathbb{P}[X_{t+1} = y \mid \mathcal{F}_t] = \mathbb{P}[X_{t+1} = y \mid X_t] = P(X_t, y) \qquad \text{a.s.} $$
Conclude that for any event $A \in \sigma(X_0,\ldots,X_t)$, any $t, n \ge 0$, and any $x, y \in G$,
$$ \mathbb{P}[X_{t+n} = y,\ X_t = x,\ A] = P^n(x,y) \cdot \mathbb{P}[X_t = x,\ A]. $$
▶ solution ◀

Note that this exercise tells us that conditioned on $X_t = x$, the process $(X_{t+n})_n$ is distributed as a Markov chain with transition matrix $P$ started at $x$. Moreover, conditioned on $X_t = x$ the process $(X_{t+n})_n$ is conditionally independent of $X_0, \ldots, X_t$.

Exercise 3.2 Let $(X_t)_t$ be a Markov chain on $G$ with transition matrix $P$. Show that for any bounded nonnegative function $f \colon G \to [0,\infty)$ and any probability measure $\mu$ on $G$, we have that
$$ \mathbb{E}_\mu[f(X_t)] = \mu P^t f = \sum_{x,y} \mu(x) P^t(x,y) f(y). $$
Conclude that if $f \colon G \to \mathbb{C}$ is a function such that $\mathbb{E}_\mu |f(X_t)| < \infty$, then $\mathbb{E}_\mu[f(X_t)] = \mu P^t f$.

Exercise 3.3 Let $G$ be a countable set. Let $(X_t)_t$ be a sequence of $G$-valued random variables. Suppose that for any $t \ge 0$ and all $x_0, \ldots, x_t = x$ and $y \in G$, it holds that
$$ \mathbb{P}[\forall\, 0 \le j \le t,\ X_j = x_j,\ X_{t+1} = y] = \mathbb{P}[\forall\, 0 \le j \le t,\ X_j = x_j] \cdot P(x,y) $$
for some matrix $P$. Show that $(X_t)_t$ is a Markov chain with transition matrix $P$.

Example 3.1.3 Consider a bored mathematician. She has a (possibly biased) coin and two chairs in her office, say chair $a$ and chair $b$. Every minute, out of boredom, she tosses the coin. If it comes out heads, she moves to the other chair. Otherwise, she does nothing.

This can be modeled by a Markov chain on the state space $\{a, b\}$. At each time, with some probability $1-p$ the mathematician does not move, and with probability $p$ she jumps to the other state. The corresponding transition matrix would be
$$ P = \begin{pmatrix} 1-p & p \\ p & 1-p \end{pmatrix}. $$
What is the probability $\mathbb{P}_a[X_n = b]$? For this we need to calculate $P^n$.


A complicated way would be to analyze the eigenvalues of $P$, perhaps using the Jordan form... But a small trick gives us an easier calculation. Let $\mu_n = P^n(a,\cdot)$. Check that $\mu_{n+1} = \mu_n P$. Consider the vector $\pi = (1/2, 1/2)$. Then $\pi P = \pi$. Now, consider $a_n = (\mu_n - \pi)(a)$. Since $\mu_n$ is a probability measure, we get that $\mu_n(b) = 1 - \mu_n(a)$, so
$$ a_n = \left((\mu_{n-1} - \pi) P\right)(a) = (1-p)\,\mu_{n-1}(a) + p\,\mu_{n-1}(b) - \tfrac12 = (1-2p)\,\mu_{n-1}(a) + p - \tfrac12 = (1-2p)\, a_{n-1}. $$
So $a_n = (1-2p)^n a_0 = (1-2p)^n \cdot \tfrac12$, and $P^n(a,a) = \mu_n(a) = \frac{1 + (1-2p)^n}{2}$. This also implies that $P^n(a,b) = 1 - P^n(a,a) = \frac{1 - (1-2p)^n}{2}$. We see that (for $0 < p < 1$)
$$ P^n \to \frac12 \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} = \begin{pmatrix} \pi \\ \pi \end{pmatrix}. $$
This was just a practice example, to illustrate some of the properties that we will prove in greater generality.

Example 3.1.4 Another basic example is a random walk on a graph. Let $G = (V,E)$ be a graph. We may define a stochastic matrix $P$ by $P(x,y) = \frac{1}{\deg(x)} \mathbf{1}_{\{x \sim y\}}$ (recall that $x \sim y$ means that $\{x,y\}$ is an edge in the graph). The corresponding Markov chain is the process that at each step chooses uniformly among the neighbors of the current position and moves to that new position.

Sometimes, one may be interested in the situation where the random walk has some probability of staying in place. This is known as the lazy random walk. It requires another parameter to be specified, namely the probability $\alpha \in (0,1)$ of staying in the same place. In this case, the transition matrix would be $Q(x,y) = \alpha \mathbf{1}_{\{x=y\}} + (1-\alpha) \frac{1}{\deg(x)} \mathbf{1}_{\{x \sim y\}}$. So in matrix form $Q = \alpha I + (1-\alpha) P$. The parameter $\alpha$ is sometimes called the holding probability.
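The closed form $P^n(a,a) = \frac{1+(1-2p)^n}{2}$ and the convergence of $P^n$ to the matrix with both rows equal to $\pi = (1/2, 1/2)$ are easy to confirm numerically. A small sketch in plain Python (the value $p = 0.3$ is an arbitrary choice):

```python
# Verify P^n(a,a) = (1 + (1-2p)^n)/2 for the two-state chain of
# Example 3.1.3, and that P^n approaches the matrix with rows (1/2, 1/2).
def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

p = 0.3  # arbitrary jump probability, 0 < p < 1
P = [[1 - p, p], [p, 1 - p]]

Pn = [[1.0, 0.0], [0.0, 1.0]]  # P^0 = identity
for n in range(1, 30):
    Pn = mat_mul(Pn, P)
    assert abs(Pn[0][0] - (1 + (1 - 2 * p) ** n) / 2) < 1e-9

# After many steps every entry is close to 1/2:
assert all(abs(entry - 0.5) < 1e-3 for row in Pn for entry in row)
print(Pn)
```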

3.2 Irreducibility

When we speak about graphs, we have the notion of connectivity. We now wish to generalize this notion to Markov chains. We want to say that a state $x$ is connected to a state $y$ if there is a way to get from $x$ to $y$; note that for general Markov chains this does not necessarily imply that one can get from $y$ to $x$.

Definition 3.2.1 Let $P$ be the transition matrix of a Markov chain on $G$; $P$ is called irreducible if for every pair of states $x, y \in G$ there exists $t > 0$ such that $P^t(x,y) > 0$. A Markov chain is called irreducible if its transition matrix is irreducible.

This means that for any two states $x, y$, there is a large enough time $t$ such that with positive probability the chain can go from state $x$ to state $y$ in $t$ steps.

Example 3.2.2 Consider the cycle $\mathbb{Z}/n\mathbb{Z}$, for $n$ even. Consider the Markov chain moving $+1$ or $-1 \pmod n$ with equal probability at each step. That is, $P(x,y) = \frac12$ for all $|x-y| = 1 \pmod n$, and $P(x,y) = 0$ otherwise. This is an irreducible chain: for any $x, y$, taking $t = |x-y| \pmod n$, if $\gamma = (x = x_0, x_1, \ldots, x_t = y)$ is a path of length $t$ from $x$ to $y$, then
$$ P^t(x,y) \ge \mathbb{P}_x[(X_0, \ldots, X_t) = \gamma] = 2^{-t} > 0. $$
Note that at each step, the Markov chain moves from the current position $+1$ or $-1 \pmod n$. Thus, since $n$ is even, at even times the chain must be at even vertices, and at odd times the chain must be at odd vertices. Thus, it is not true that there exists $t > 0$ such that for all $x, y$, $P^t(x,y) > 0$. The main reason for this is that the chain has a period: at even times it is on some set, and at odd times on a different set. Similarly, the chain cannot be back at its starting point at odd times, only at even times.

Definition 3.2.3 Let $P$ be a transition matrix of a Markov chain on $G$.
• A state $x$ is called periodic if $\gcd\{t \ge 1 : P^t(x,x) > 0\} > 1$, and this gcd is called the period of $x$.
• If $\gcd\{t \ge 1 : P^t(x,x) > 0\} = 1$ then $x$ is called aperiodic.
• $P$ is called aperiodic if all $x \in G$ are aperiodic. Otherwise $P$ is called periodic.

Example 3.2.4 Note that in the even-length cycle example,
$$ \gcd\{t \ge 1 : P^t(x,x) > 0\} = \gcd\{2, 4, 6, \ldots\} = 2. $$
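The period computation can be checked mechanically: collect the return times $\{t \ge 1 : P^t(x,x) > 0\}$ up to some horizon and take their gcd. A small sketch for the cycle $\mathbb{Z}/n\mathbb{Z}$ with $n = 6$ (the horizon 20 is an arbitrary cutoff):

```python
from math import gcd
from functools import reduce

n = 6  # an even cycle length, as in Example 3.2.2
P = [[0.0] * n for _ in range(n)]
for x in range(n):
    P[x][(x + 1) % n] += 0.5
    P[x][(x - 1) % n] += 0.5

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Collect return times {t >= 1 : P^t(0,0) > 0} up to an arbitrary horizon.
Pt = P  # P^1
return_times = []
for t in range(1, 21):
    if Pt[0][0] > 0:
        return_times.append(t)
    Pt = mat_mul(Pt, P)

print(return_times)               # only even times appear
print(reduce(gcd, return_times))  # period 2
```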

Remark 3.2.5 If $P$ is periodic, then there is an easy way to "fix" $P$ to become aperiodic: namely, let $Q = \alpha I + (1-\alpha) P$ be some "lazy version" of $P$. Here $I$ is the identity matrix, and $\alpha \in (0,1)$. Then $Q(x,x) \ge \alpha$ for all $x$, and thus $Q$ is aperiodic. Such a $Q$ is called a lazy version of $P$ because it is basically the same Markov chain, except that at some times it stays put (with probability $\alpha$).


Proposition 3.2.6 Let $P$ be a transition matrix of a Markov chain on state space $G$.
• $x$ is aperiodic if and only if there exists $t(x)$ such that for all $t > t(x)$, $P^t(x,x) > 0$.
• If $P$ is irreducible, then $P$ is aperiodic if and only if there exists an aperiodic state $x$.
• Consequently, if $P$ is irreducible and aperiodic, and if $G$ is finite, then there exists $t_0$ such that for all $t > t_0$ all $x, y$ admit $P^t(x,y) > 0$.

Proof We start with the first assertion. Assume that $x$ is aperiodic. Let $R = \{t \ge 1 : P^t(x,x) > 0\}$. Since $P^{t+s}(x,x) \ge P^t(x,x) P^s(x,x)$, we get that $t, s \in R$ implies $t+s \in R$; that is, $R$ is closed under addition. A number-theoretic result tells us that since $\gcd R = 1$, it must be that $R^c$ is finite. The other direction is simpler: if $R^c$ is finite, then $R$ contains two consecutive integers $n, n+1$, so $\gcd R = \gcd(n, n+1) = 1$.

For the second assertion, if $P$ is irreducible and $x$ is aperiodic, let $t(x)$ be such that for all $t > t(x)$, $P^t(x,x) > 0$. For any $z, y$ let $t(z,y)$ be such that $P^{t(z,y)}(z,y) > 0$ (which exists by irreducibility). Then, for any $t > t(y,x) + t(x) + t(x,y)$ we get that
$$ P^t(y,y) \ge P^{t(y,x)}(y,x)\, P^{t - t(y,x) - t(x,y)}(x,x)\, P^{t(x,y)}(x,y) > 0. $$
So for all large enough $t$, $P^t(y,y) > 0$, which implies that $y$ is aperiodic. This holds for all $y$, so $P$ is aperiodic. The other direction is trivial from the definition.

For the third assertion, for any $z, y$ let $t(z,y)$ be such that $P^{t(z,y)}(z,y) > 0$. Let $T = \max_{z,y} t(z,y)$. Let $x$ be an aperiodic state and let $t(x)$ be such that for all $t > t(x)$, $P^t(x,x) > 0$. We get that for any $t > 2T + t(x)$ we have $t - t(z,x) - t(x,y) \ge t - 2T > t(x)$, so
$$ P^t(z,y) \ge P^{t(z,x)}(z,x)\, P^{t - t(z,x) - t(x,y)}(x,x)\, P^{t(x,y)}(x,y) > 0. \qquad \square $$

Exercise 3.4 Let $G$ be a finite connected graph, and let $Q$ be the lazy random walk on $G$ with holding probability $\alpha$; that is, $Q = \alpha I + (1-\alpha) P$ where $P(x,y) = \frac{1}{\deg(x)}$ if $x \sim y$ and $P(x,y) = 0$ if $x \nsim y$. Show that $Q$ is aperiodic. Show that for
$$ \operatorname{diam}(G) = \max\{\operatorname{dist}(x,y) : x, y \in G\}, $$
all $t > \operatorname{diam}(G)$ and all $x, y \in G$ admit $Q^t(x,y) > 0$.


3.3 Random Walks on Groups

The main type of Markov chain we will be studying is the random walk on $G$, where $G$ is a finitely generated group. Let $\mu$ be a probability measure on a finitely generated group $G$. Recall that we defined the $\mu$-random walk as the process $(X_t)_t$ given by $X_t = X_0 U_1 \cdots U_t$, where $X_0$ has some distribution on $G$, and $(U_t)_{t\ge1}$ are i.i.d.-$\mu$.

Exercise 3.5 Let $G$ be a finitely generated group and let $\mu$ be a probability measure on $G$. Let $P(x,y) = \mu(x^{-1}y)$.
• Show that the $\mu$-random walk $(X_t)_t$ is a Markov chain with transition matrix $P$.
• Show that $P$ is irreducible if and only if $\mu$ is adapted.
• Show that if $\mu(1) > 0$ then $P$ is aperiodic.
▶ solution ◀

Example 3.3.1 Let $G$ be a finitely generated group, and let $S$ be a finite symmetric generating set. Let $\mu(x) = \frac{1}{|S|} \mathbf{1}_{\{x \in S\}}$. The corresponding $\mu$-random walk is called the simple random walk on the Cayley graph of $G$ with respect to $S$.

Example 3.3.2 Let $\mathbb{Z} = \langle -1, 1 \rangle$, and consider the simple random walk, so $\mu(1) = \mu(-1) = \frac12$. Note that for any sequence of integers $z_0 = 0, z_1, \ldots, z_n$ such that $|z_k - z_{k-1}| = 1$, we have that
$$ \mathbb{P}_0[X_1 = z_1, \ldots, X_n = z_n] = \prod_{k=1}^n P(z_{k-1}, z_k) = \prod_{k=1}^n \mu(z_k - z_{k-1}) = 2^{-n}. $$
It may be noted that if we set $U_k = X_k - X_{k-1}$, then the sequence $(U_k)_{k\ge1}$ is i.i.d. with distribution $\mu$. So $X_t = \sum_{k=1}^t U_k$ is a sum of i.i.d. random variables.
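Since every nearest-neighbour path of length $n$ has probability exactly $2^{-n}$, probabilities of events for this walk can be computed by counting paths; for instance $\mathbb{P}_0[X_t = 0] = \binom{t}{t/2} 2^{-t}$ for even $t$, an identity used again in Example 3.6.3. A brute-force enumeration sketch:

```python
from itertools import product
from math import comb

t = 8
# Enumerate all 2^t sign sequences; each path carries probability 2^-t.
paths_at_zero = sum(1 for steps in product([-1, 1], repeat=t)
                    if sum(steps) == 0)
prob = paths_at_zero / 2 ** t

assert prob == comb(t, t // 2) / 2 ** t
print(prob)  # 0.2734375
```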

3.4 Stopping Times

If $(X_t)_t$ is a Markov chain on $G$ with transition matrix $P$, we can define the following. Let $A \subset G$. Define
$$ T_A = \inf\{t \ge 0 : X_t \in A\} \qquad \text{and} \qquad T_A^+ = \inf\{t \ge 1 : X_t \in A\}. $$
These are the hitting time of $A$ and return time to $A$, respectively. (We use the convention that $\inf \emptyset = \infty$.) If $A = \{x\}$ we write $T_x = T_{\{x\}}$ and similarly $T_x^+ = T_{\{x\}}^+$.

The hitting and return times have the property that their value can be determined by the history of the chain; that is, the event $\{T_A \le t\}$ is determined by


$(X_0, X_1, \ldots, X_t)$. Recall that this is precisely the definition of a stopping time (with respect to the canonical filtration $\mathcal{F}_t = \sigma(X_0,\ldots,X_t)$).

Exercise 3.6 Show that $\{T_A \le t\}$ and $\{T_A^+ \le t\}$ are events in $\mathcal{F}_t = \sigma(X_0,\ldots,X_t)$, for any $t \ge 0$. That is, show that $T_A, T_A^+$ are stopping times.

Example 3.4.1 Consider the simple random walk on $\mathbb{Z}^3$. Let $T = \sup\{t : X_t = 0\}$. This is the last time the walk is at $0$. One can show that $T$ is a.s. finite, and we will prove this in Exercise 4.39. However, $T$ is not a stopping time, since, for example,
$$ \{T \le 0\} = \{\forall\, t > 0,\ X_t \ne 0\} = \bigcap_{t=1}^{\infty} \{X_t \ne 0\} \notin \sigma(X_0). $$

Example 3.4.2 Let $(X_t)_t$ be a Markov chain and let $T = \inf\{t \ge T_A : X_t \in A'\}$, where $A, A' \subset G$. Then $T$ is a stopping time, since
$$ \{T \le t\} = \bigcup_{k=0}^{t} \bigcup_{m=0}^{k} \left\{X_m \in A,\ X_k \in A'\right\}. $$

Exercise 3.7 Let $T, T'$ be stopping times. Show that the following hold:
• Any constant $t \in \mathbb{N}$ is a stopping time.
• $T \wedge T'$ and $T \vee T'$ are stopping times.
• $T + T'$ is a stopping time.
▶ solution ◀

3.4.1 Conditioning on a Stopping Time

We have already seen important uses for stopping times in the optional stopping theorem (Theorem 2.3.3). Another important property we want is the strong Markov property. For a fixed time $t$, the Markov property tells us that the process $(X_{t+n})_n$ is a Markov chain with starting distribution $X_t$, independent of $\sigma(X_0,\ldots,X_t)$. We want to do the same thing for stopping times.

Let $T$ be a stopping time. The information captured by $X_0, \ldots, X_T$ is the $\sigma$-algebra $\sigma(X_0,\ldots,X_T)$. This is defined as the collection of all events $A$ such that for all $t$, $A \cap \{T \le t\} \in \sigma(X_0,\ldots,X_t)$. That is,
$$ \sigma(X_0,\ldots,X_T) = \left\{A :\ A \cap \{T \le t\} \in \sigma(X_0,\ldots,X_t) \text{ for all } t\right\}. $$


One can check that this is indeed a $\sigma$-algebra.

Exercise 3.8 Show that $\sigma(X_0,\ldots,X_T)$ is a $\sigma$-algebra.

Important examples are:
• For any $t$, $\{T \le t\} \in \sigma(X_0,\ldots,X_T)$.
• Thus, $T$ is measurable with respect to $\sigma(X_0,\ldots,X_T)$.
• $X_T \mathbf{1}_{\{T < \infty\}}$ is measurable with respect to $\sigma(X_0,\ldots,X_T)$.

The strong Markov property states that, conditioned on $\{T < \infty\}$ and $X_T = x$, the process $(X_{T+n})_n$ is a Markov chain with transition matrix $P$ started at $x$, independent of $\sigma(X_0,\ldots,X_T)$.

Proof By the Markov property, for any $m > k$ and any event $A \in \sigma(X_0,\ldots,X_k)$,
$$ \mathbb{P}[X_m = y,\ A,\ X_k = x] = P^{m-k}(x,y) \cdot \mathbb{P}[A,\ X_k = x]. $$
Since $T+t$ is also a stopping time, to prove the theorem it suffices to show that for all $t$, and any $A \in \sigma(X_0,\ldots,X_T)$,
$$ \mathbb{P}[X_{T+t+1} = y,\ X_{T+t} = x,\ A,\ T < \infty] = P(x,y) \cdot \mathbb{P}[X_{T+t} = x,\ A,\ T < \infty]. $$
Note that $A \cap \{T = k\} \in \sigma(X_0,\ldots,X_k) \subset \sigma(X_0,\ldots,X_{k+t})$ for all $k$, so
$$ \mathbb{P}[X_{T+t+1} = y,\ A,\ X_{T+t} = x,\ T < \infty] = \sum_{k=0}^\infty \mathbb{P}[X_{k+t+1} = y,\ X_{k+t} = x,\ A,\ T = k] = \sum_{k=0}^\infty P(x,y)\, \mathbb{P}[X_{k+t} = x,\ A,\ T = k] = P(x,y)\, \mathbb{P}[X_{T+t} = x,\ A,\ T < \infty]. \qquad \square $$


3.5 Excursion Decomposition

We now use the strong Markov property to prove the following.

Example 3.5.1 Let $P$ be the transition matrix of an irreducible Markov chain $(X_t)_t$ on $G$. Fix $x \in G$. Define inductively the following stopping times: $T_x^{(0)} = 0$ and
$$ T_x^{(k)} = \inf\left\{t \ge T_x^{(k-1)} + 1 : X_t = x\right\}. $$
So $T_x^{(k)}$ is the time of the $k$th return to $x$. Let $V_t(x)$ be the number of visits to $x$ up to time $t$; that is, $V_t(x) = \sum_{k=1}^t \mathbf{1}_{\{X_k = x\}}$. It is immediate that $V_t(x) \ge k$ if and only if $T_x^{(k)} \le t$.

Now let us look at the excursions to $x$: the $k$th excursion is
$$ X\left[T_x^{(k-1)}, T_x^{(k)}\right] = \left(X_{T_x^{(k-1)}},\ X_{T_x^{(k-1)}+1},\ \ldots,\ X_{T_x^{(k)}}\right). $$
These excursions are paths of the Markov chain ending at $x$ and starting at $x$ (except, possibly, the first excursion, which starts at $X_0$). For $k > 0$ define $\tau_x^{(k)} = T_x^{(k)} - T_x^{(k-1)}$ if $T_x^{(k)} < \infty$, and $0$ otherwise. When $T_x^{(k)} < \infty$, this is the length of the $k$th excursion.

The strong Markov property tells us that conditioned on $T_x^{(k-1)} < \infty$, the excursion $X\left[T_x^{(k-1)}, T_x^{(k)}\right]$ is independent of $\sigma\left(X_0, \ldots, X_{T_x^{(k-1)}}\right)$, and has the distribution of the first excursion $X[0, T_x^+]$ conditioned on $X_0 = x$.

The above gives rise to the following relation.

Theorem 3.5.2 Let $P$ be an irreducible Markov chain on $G$. Let $V_t(x)$ and $T_x^{(k)}$ be defined as in Example 3.5.1. Then,
$$ \left(\mathbb{P}_x[T_x^+ < \infty]\right)^k = \mathbb{P}_x[V_\infty(x) \ge k] = \mathbb{P}_x\left[T_x^{(k)} < \infty\right]. $$

Consequently,
$$ 1 + \mathbb{E}_x[V_\infty(x)] = \frac{1}{\mathbb{P}_x[T_x^+ = \infty]}, $$
where $1/0 = \infty$.

Proof As in Example 3.5.1, $\{V_\infty(x) \ge k\} = \left\{T_x^{(k)} < \infty\right\}$. We also saw in the example that for any $m$,
$$ \mathbb{P}\left[T_x^{(m)} < \infty \;\middle|\; T_x^{(m-1)} < \infty\right] = \mathbb{P}\left[\exists\, t \ge 1,\ X_{T_x^{(m-1)}+t} = x \;\middle|\; T_x^{(m-1)} < \infty\right] = \mathbb{P}_x[T_x^+ < \infty]. $$
Since $\left\{T_x^{(m)} < \infty\right\} = \left\{T_x^{(m)} < \infty,\ T_x^{(m-1)} < \infty\right\}$, we can inductively conclude that
$$ \mathbb{P}_x\left[T_x^{(k)} < \infty\right] = \mathbb{P}_x\left[T_x^{(k)} < \infty \;\middle|\; T_x^{(k-1)} < \infty\right] \cdot \mathbb{P}\left[T_x^{(k-1)} < \infty\right] = \cdots = \left(\mathbb{P}_x[T_x^+ < \infty]\right)^k. $$
This proves the first assertion. The second assertion follows from the fact that
$$ 1 + \mathbb{E}_x[V_\infty(x)] = \sum_{k=0}^\infty \mathbb{P}_x[V_\infty(x) \ge k] = \frac{1}{1 - \mathbb{P}_x[T_x^+ < \infty]}, $$
where this holds even if $\mathbb{P}_x[T_x^+ < \infty] = 1$, interpreting $1/0 = \infty$. □
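The identity $1 + \mathbb{E}_x[V_\infty(x)] = 1/\mathbb{P}_x[T_x^+ = \infty]$ can be checked exactly on a toy chain (not irreducible, but the same excursion argument applies): from state $1$, return to $1$ with probability $a$, or get absorbed in state $0$ with probability $1-a$, so that $\mathbb{P}_1[T_1^+ < \infty] = a$ and $\mathbb{E}_1[V_\infty(1)] = \sum_{t \ge 1} P^t(1,1) = \sum_{t \ge 1} a^t$. A sketch:

```python
# Toy absorbing chain: from state 1 stay at 1 w.p. a, move to the
# absorbing state 0 w.p. 1-a.  Then P^t(1,1) = a^t, so
# 1 + E_1[V_inf(1)] should equal 1 / P_1[T_1^+ = infinity] = 1/(1-a).
a = 0.75  # arbitrary return probability

expected_visits = sum(a ** t for t in range(1, 2000))  # truncated series
lhs = 1 + expected_visits
rhs = 1 / (1 - a)

assert abs(lhs - rhs) < 1e-9
print(lhs)  # ≈ 4.0
```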

Exercise 3.9 Let $(X_t)_t$ be a Markov chain on $G$ with transition matrix $P$. Assume that $P$ is irreducible. Let $Z \subset G$, and recall $T_Z$, the first hitting time of the set $Z$. Let $x \notin Z$ and define $V = \sum_{t=0}^{T_Z} \mathbf{1}_{\{X_t = x\}}$ (so that $V = V_{T_Z}(x) + \mathbf{1}_{\{X_0 = x\}}$). Show that under $\mathbb{P}_x$, the random variable $V$ has geometric distribution with mean $1/p$, for $p = \mathbb{P}_x[T_Z < T_x^+]$. ▶ solution ◀

3.6 Recurrence and Transience

Definition 3.6.1 Let $P$ be a transition matrix of a Markov chain $(X_t)_t$ on $G$. A state $x \in G$ is called recurrent if $\mathbb{P}_x[T_x^+ < \infty] = 1$. Otherwise, if $\mathbb{P}_x[T_x^+ = \infty] > 0$, we say that $x$ is transient.

Excursion decomposition provides us with the following important characterization of recurrence in Markov chains.

Theorem 3.6.2 The following are equivalent for an irreducible Markov chain $(X_t)_t$ on $G$:
(1) $x$ is recurrent.
(2) $\mathbb{P}_x[V_\infty(x) = \infty] = 1$.
(3) For any state $y$, $\mathbb{P}_x[T_y^+ < \infty] = 1$.
(4) $\mathbb{E}_x[V_\infty(x)] = \infty$.

Proof (1) ⇒ (2): If $x$ is recurrent, then by definition $\mathbb{P}_x[T_x^+ < \infty] = 1$. So for any $k$, by Theorem 3.5.2 we have $\mathbb{P}_x[V_\infty(x) \ge k] = 1$. Taking $k$ to infinity, we get that $\mathbb{P}_x[V_\infty(x) = \infty] = 1$.

(2) ⇒ (3): Let $y \in G$. Let $E_k = X\left[T_x^{(k-1)}, T_x^{(k)}\right]$ be the $k$th excursion from $x$. We assumed that $\mathbb{P}_x\left[\forall\, k,\ T_x^{(k)} < \infty\right] = \mathbb{P}_x[V_\infty(x) = \infty] = 1$. So under $\mathbb{P}_x$, all $(E_k)_k$ are independent and identically distributed. Since $P$ is irreducible, there exists $t > 0$ such that $\mathbb{P}_x[X_t = y,\ t < T_x^+] > 0$ (this is the subject of Exercise 3.10). Thus, we have that
$$ p := \mathbb{P}_x[T_y < T_x^+] \ge \mathbb{P}_x[X_t = y,\ t < T_x^+] > 0. $$
This implies by the strong Markov property that
$$ \mathbb{P}_x\left[T_y < T_x^{(k+1)} \;\middle|\; T_y > T_x^{(k)},\ T_x^{(k)} < \infty\right] \ge p > 0. $$
So, using the fact that $\mathbb{P}_x\left[\forall\, k,\ T_x^{(k)} < \infty\right] = 1$,
$$ \mathbb{P}_x\left[T_y \ge T_x^{(k)}\right] = \mathbb{P}_x\left[T_y \ge T_x^{(k)} \;\middle|\; T_y > T_x^{(k-1)},\ T_x^{(k-1)} < \infty\right] \cdot \mathbb{P}_x\left[T_y > T_x^{(k-1)}\right] \le (1-p) \cdot \mathbb{P}_x\left[T_y \ge T_x^{(k-1)}\right] \le \cdots \le (1-p)^k. $$
Thus,
$$ \mathbb{P}_x[T_y^+ = \infty] \le \mathbb{P}_x\left[\forall\, k,\ T_y \ge T_x^{(k-1)}\right] = \lim_{k\to\infty} (1-p)^k = 0. $$

(3) ⇒ (1): If for any $y$ we have $\mathbb{P}_x[T_y^+ < \infty] = 1$, then taking $y = x$ shows that $x$ is recurrent.

This shows that (1), (2), (3) are equivalent. It is obvious that (2) implies (4). Since $\mathbb{P}_x[T_x^+ = \infty] = \frac{1}{\mathbb{E}_x[V_\infty(x)] + 1}$, we get that (4) implies (1). □

Exercise 3.10 Show that if $P$ is irreducible, then for any $x \ne y$ there exists $t > 0$ such that $\mathbb{P}_x[X_t = y,\ t < T_x^+] > 0$. ▶ solution ◀

Exercise 3.11 Let $P$ be an irreducible transition matrix of a recurrent Markov chain on state space $G$. Show that for any starting distribution $\nu$ and any non-empty subset $A \subset G$ it holds that $\mathbb{P}_\nu[T_A^+ < \infty] = 1$. ▶ solution ◀

Example 3.6.3 A gambler plays a fair game. Each round she wins a dollar with probability $1/2$, and loses a dollar with probability $1/2$; all rounds are independent. What is the probability that she never goes bankrupt, if she starts with $N$ dollars?

We have already seen that this is just a simple random walk $(X_t)_t$ on $\mathbb{Z}$. Now, $X_t = \sum_{k=1}^t U_k$ where $(U_k)_k$ are i.i.d., $\mathbb{P}[U_k = -1] = \mathbb{P}[U_k = 1] = \frac12$.


Define
$$ R_t = \#\{1 \le k \le t : U_k = 1\} = \sum_{k=1}^t \mathbf{1}_{\{U_k = 1\}}, \qquad L_t = \#\{1 \le k \le t : U_k = -1\} = \sum_{k=1}^t \mathbf{1}_{\{U_k = -1\}} = t - R_t. $$
So $X_t - X_0 = R_t - L_t = 2R_t - t$. Also, $R_t, L_t$ both have binomial-$(t, \frac12)$ distribution. Now, it is immediate that
$$ \mathbb{P}_0[X_t = 0] = \begin{cases} 0 & t \text{ odd}, \\ \mathbb{P}_0\left[R_t = \frac{t}{2}\right] = 2^{-t} \binom{t}{t/2} & t \text{ even}. \end{cases} $$
Stirling's approximation (or other classical computations) tells us that
$$ \lim_{t\to\infty} \sqrt{t}\, 2^{-t} \binom{t}{t/2} = c > 0 $$
for some constant $c > 0$. So $\mathbb{P}_0[X_t = 0] \ge c' t^{-1/2}$ for some $c' > 0$ and all even $t > 0$. This implies that
$$ \mathbb{E}_0[V_\infty(0)] \ge \mathbb{E}_0[V_t(0)] \ge \sum_{k=1}^{\lfloor t/2 \rfloor} \mathbb{P}_0[X_{2k} = 0] \ge c'' \sqrt{t}. $$
Thus, taking $t \to \infty$ we get that $\mathbb{E}_0[V_\infty(0)] = \infty$, and so $0$ is recurrent.

Note that $0$ here was not special, since all vertices look the same. This symmetry implies that $\mathbb{P}_x[T_x^+ < \infty] = 1$ for all $x \in \mathbb{Z}$. Thus, for any $N$, $\mathbb{P}_N[T_0^+ = \infty] = 0$. That is, no matter how much money the gambler starts with, she will always go bankrupt eventually.

Indeed, we have already seen the recurrence of the simple random walk on $\mathbb{Z}$, as an application of the optional stopping theorem (Theorem 2.3.3). There we calculated for $0 < x < n$ that $\mathbb{P}_x[T_0 > T_n] = \frac{x}{n}$. Taking $n \to \infty$ proves that $\mathbb{P}_x[T_0 = \infty] = 0$.

Corollary 3.6.4 Let $P$ be an irreducible transition matrix on a state space $G$. Then, for any $x, y \in G$, $x$ is transient if and only if $y$ is transient.

Proof By irreducibility, for any pair of states $z, w$ we can find $t(z,w) > 0$ such that $P^{t(z,w)}(z,w) > 0$. Fix $x, y \in G$ and suppose that $x$ is transient. For any $t > 0$,
$$ P^{t + t(x,y) + t(y,x)}(x,x) \ge P^{t(x,y)}(x,y)\, P^t(y,y)\, P^{t(y,x)}(y,x). $$


Thus,
$$ \mathbb{E}_y[V_\infty(y)] = \sum_{t=1}^\infty P^t(y,y) \le \frac{1}{P^{t(x,y)}(x,y)\, P^{t(y,x)}(y,x)} \sum_{t=1}^\infty P^{t + t(x,y) + t(y,x)}(x,x) < \infty. $$
So $y$ is transient as well. □

To conclude, we see that for an irreducible Markov chain, recurrence and transience are properties of the chain and not of specific states. We say an irreducible Markov chain is recurrent if at least one state, and hence all states, are recurrent, and transient if all states are transient.

Exercise 3.12 Let $P$ be an irreducible transition matrix on a state space $G$. Let $Q = \sum_{k=0}^\infty \frac{1}{e \cdot k!} P^k$. Show that $Q$ is an irreducible stochastic matrix. Show that $P$ is transient if and only if $Q$ is transient. ▶ solution ◀

∗ Exercise 3.13 Let $P$ be an irreducible transition matrix on a state space $G$. For every $x \in G$ let $p_x \in [0,1)$. Define
$$ Q(x,y) = p_x \mathbf{1}_{\{x=y\}} + (1 - p_x) P(x,y). $$
Show that $Q$ is the transition matrix of an irreducible Markov chain. Show that $Q$ is recurrent if and only if $P$ is recurrent. ▶ solution ◀

3.7 Positive Recurrence

We have seen that recurrence and transience are governed by the return time $T_x^+$ being finite a.s. or not. A natural question in the recurrent case is whether this time has finite expectation. This has to do with the theory of stationary distributions.

Theorem 3.7.1 Let $G$ be a finitely generated group and let $\mu$ be an adapted measure on $G$. The following are equivalent.

(1) There exists $x \in G$ such that $\mathbb{E}_x[T_x^+] < \infty$.
(2) For all $x \in G$ we have $\mathbb{E}_x[T_x^+] < \infty$.
(3) $G$ is a finite group.

The goal of this section is to prove this theorem.


Note that if $\mathbb{E}_x[T_x^+] < \infty$ then of course $\mathbb{P}_x[T_x^+ < \infty] = 1$. Thus, it makes sense to state the following definition. If a state $x$ satisfies $\mathbb{E}_x[T_x^+] < \infty$, we say that $x$ is positive recurrent. Otherwise, if $x$ is recurrent but $\mathbb{E}_x[T_x^+] = \infty$, we say that $x$ is null recurrent.

3.7.1 Stationary Distributions

Suppose that $P$ is a transition matrix of a Markov chain $(X_t)_t$ on state space $G$ such that for some starting distribution $\nu$, we have that $\mathbb{P}_\nu[X_n = x] \to \pi(x)$ for all $x \in G$, where $\pi$ is some limiting distribution. One immediately checks that in this case we must have
$$\pi P(x) = \lim_{n\to\infty} \sum_y \nu P^n(y)\, P(y,x) = \lim_{n\to\infty} \nu P^{n+1}(x) = \pi(x),$$
or $\pi P = \pi$. (That is, $\pi$ is a left eigenvector for $P$ with eigenvalue 1.)

Definition 3.7.2 Let $P$ be a transition matrix. If $\pi$ is a distribution satisfying $\pi P = \pi$, then $\pi$ is called a stationary distribution.

Example 3.7.3 Recall from Example 3.13 the two-state chain $P = \begin{pmatrix} 1-p & p \\ p & 1-p \end{pmatrix}$. We saw that $P^n \to \frac{1}{2} \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}$. Indeed, it is simple to check that $\pi = (1/2, 1/2)$ is a stationary distribution in this case. ♦

Example 3.7.4 Consider a finite graph $G$. Let $P$ be the transition matrix of a simple random walk on $G$. So $P(x,y) = \frac{1}{\deg(x)} 1_{\{x \sim y\}}$, or $\deg(x)\, P(x,y) = 1_{\{x \sim y\}}$. Thus,
$$\sum_x \deg(x)\, P(x,y) = \deg(y).$$
So $\deg$ is a left eigenvector for $P$ with eigenvalue 1. Since
$$\sum_x \deg(x) = \sum_x \sum_y 1_{\{x \sim y\}} = \sum_{e \in E(G)} 2 = 2|E(G)|,$$
we normalize $\pi(x) = \frac{\deg(x)}{2|E(G)|}$ to get a stationary distribution for $P$. ♦

The above stationary distribution has a special property, known as the detailed balance equation. A distribution $\pi$ is said to satisfy the detailed balance equation with respect to a transition matrix $P$ if for all states $x, y$,
$$\pi(x)\, P(x,y) = \pi(y)\, P(y,x).$$
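Both the formula $\pi(x) = \deg(x)/2|E(G)|$ and detailed balance can be checked mechanically. A minimal sketch, on a small graph chosen arbitrarily for illustration:

```python
import numpy as np

# Simple random walk on a small graph (a path 0-1-2-3 plus the chord 1-3;
# an arbitrary illustrative choice).
edges = [(0, 1), (1, 2), (2, 3), (1, 3)]
n = 4
A = np.zeros((n, n))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0

deg = A.sum(axis=1)
P = A / deg[:, None]           # P(x, y) = 1_{x~y} / deg(x)
pi = deg / (2 * len(edges))    # pi(x) = deg(x) / (2 |E|)

assert np.isclose(pi.sum(), 1.0)
assert np.allclose(pi @ P, pi)                            # stationarity
F = pi[:, None] * P                                       # F(x,y) = pi(x) P(x,y)
assert np.allclose(F, F.T)                                # detailed balance
```

Note that $\pi(x) P(x,y) = 1_{\{x\sim y\}}/2|E(G)|$ is manifestly symmetric, which is exactly what the last assertion checks.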


Exercise 3.14 Show that if $\pi$ satisfies the detailed balance equation, then $\pi$ is a stationary distribution. Give an example of an irreducible and aperiodic transition matrix $P$ with a stationary distribution that does not satisfy the detailed balance equation. ▷ solution ◁

3.7.2 Stationary Distributions and Hitting Times

There is a deep connection between stationary distributions and return times. The main result here is the following theorem.

Theorem 3.7.5 Let $P$ be an irreducible transition matrix on state space $G$. Then the following are equivalent:

• $P$ has a stationary distribution $\pi$.
• Every $x$ is positive recurrent.
• Some $x$ is positive recurrent.
• $P$ has a unique stationary distribution, $\pi(x) = \frac{1}{\mathbb{E}_x[T_x^+]}$.

The proof of this theorem goes through the following few lemmas.

In the next lemma we will consider a function (vector) $v : G \to [0,\infty]$. Although it may take the value $\infty$, since we are only dealing with nonnegative numbers we can write $vP(x) = \sum_y v(y) P(y,x)$ without confusion (with the convention that $0 \cdot \infty = 0$).

Lemma 3.7.6 Let $P$ be an irreducible transition matrix on a state space $G$. Let $v : G \to [0,\infty]$ be such that $vP = v$. Then:

• If there exists a state $x$ such that $v(x) < \infty$, then $v(y) < \infty$ for all states $y$.
• If $v$ is not the zero vector, then $v(y) > 0$ for all states $y$.

Note that this implies that if $\pi$ is a stationary distribution then all the entries of $\pi$ are strictly positive and finite.

Proof Assume that $v(x) < \infty$. For any $t$, using the fact that $v \ge 0$,
$$v(x) = \sum_z v(z) P^t(z,x) \ge v(y) P^t(y,x).$$
Thus, for a suitable choice of $t$, since $P$ is irreducible, we know that $P^t(y,x) > 0$, and so $v(y) \le \frac{v(x)}{P^t(y,x)} < \infty$.


For the second assertion, if $v$ is not the zero vector, since it is nonnegative, there exists a state $x$ such that $v(x) > 0$. Thus, for any state $y$ and for $t$ such that $P^t(x,y) > 0$, we get
$$v(y) = \sum_z v(z) P^t(z,y) \ge v(x) P^t(x,y) > 0.$$

Recall that for a Markov chain $(X_t)_t$ we denote by $V_t(x) = \sum_{k=1}^{t} 1_{\{X_k = x\}}$ the number of visits to $x$.

Lemma 3.7.7 Let $(X_t)_t$ be an irreducible Markov chain with transition matrix $P$. Let $\mu$ be some starting distribution. Assume $T$ is a stopping time such that
$$\mathbb{P}_\mu[X_T = x] = \mu(x), \quad \text{for all } x.$$

Assume further that $1 \le T < \infty$ $\mathbb{P}_\mu$-a.s. Let $v(x) = \mathbb{E}_\mu[V_T(x)]$. Then, $vP = v$. Moreover, if $\mathbb{E}_\mu[T] < \infty$ then $P$ has a stationary distribution $\pi(x) = \frac{v(x)}{\mathbb{E}_\mu[T]}$.

Proof The assumptions on $T$ give that, for any $y$,
$$\mathbb{P}_\mu[X_0 = y] = \mu(y) = \mathbb{P}_\mu[X_T = y] = \sum_{j=1}^{\infty} \mathbb{P}_\mu[X_j = y, T = j],$$
so that
$$\begin{aligned}
\sum_{j=0}^{\infty} \mathbb{P}_\mu[X_j = y, T > j] &= \mathbb{P}_\mu[X_0 = y] + \sum_{j=1}^{\infty} \mathbb{P}_\mu[X_j = y, T > j] \\
&= \sum_{j=1}^{\infty} \big( \mathbb{P}_\mu[X_j = y, T = j] + \mathbb{P}_\mu[X_j = y, T > j] \big) = \sum_{j=1}^{\infty} \mathbb{P}_\mu[X_j = y, T \ge j] = v(y).
\end{aligned}$$
Thus we have that
$$\begin{aligned}
v(x) &= \sum_{j=1}^{\infty} \mathbb{P}_\mu[X_j = x, T \ge j] = \sum_{j=0}^{\infty} \mathbb{P}_\mu[X_{j+1} = x, T > j] \\
&= \sum_{j=0}^{\infty} \sum_y \mathbb{P}_\mu[X_{j+1} = x, X_j = y, T > j] = \sum_y \sum_{j=0}^{\infty} \mathbb{P}_\mu[X_j = y, T > j]\, P(y,x) = (vP)(x).
\end{aligned}$$
That is, $vP = v$.


Since
$$\sum_x v(x) = \mathbb{E}_\mu\Big[\sum_x V_T(x)\Big] = \mathbb{E}_\mu[T],$$
if $\mathbb{E}_\mu[T] < \infty$, then $\pi(x) = \frac{v(x)}{\mathbb{E}_\mu[T]}$ defines a stationary distribution.

Example 3.7.8 Consider an irreducible Markov chain $(X_t)_t$ with transition matrix $P$, and let $v(y) = \mathbb{E}_x[V_{T_x^+}(y)]$. If $x$ is recurrent, then $\mathbb{P}_x$-a.s. we have $1 \le T_x^+ < \infty$, and
$$\mathbb{P}_x[X_{T_x^+} = y] = 1_{\{y=x\}} = \mathbb{P}_x[X_0 = y].$$
So we conclude that $vP = v$. Since $\mathbb{P}_x$-a.s. $V_{T_x^+}(x) = 1$, we have that $0 < v(x) = 1 < \infty$, so $0 < v(y) < \infty$ for all $y$.

Note that although it may be that $\mathbb{E}_x[T_x^+] = \infty$, that is, $x$ is null recurrent, we still have that for any $y$, $\mathbb{E}_x[V_{T_x^+}(y)] < \infty$; that is, the expected number of visits to $y$ until returning to $x$ is finite.

If $x$ is positive recurrent, then $\pi(y) = \frac{\mathbb{E}_x[V_{T_x^+}(y)]}{\mathbb{E}_x[T_x^+]}$ is a stationary distribution for $P$. ♦
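The identity $\mathbb{E}_x[V_{T_x^+}(y)] = \pi(y)/\pi(x)$, which is implicit in this example combined with Theorem 3.7.5, can be verified numerically. A sketch, assuming an arbitrary $3$-state chain: the expected visits before return are computed via the chain killed at $x$, using the fundamental matrix $(I-Q)^{-1}$.

```python
import numpy as np

# An arbitrary irreducible 3-state transition matrix (illustrative choice).
P = np.array([[0.1, 0.6, 0.3],
              [0.4, 0.2, 0.4],
              [0.5, 0.3, 0.2]])

# Stationary distribution: solve pi P = pi together with sum(pi) = 1.
A = np.vstack([P.T - np.eye(3), np.ones(3)])
b = np.array([0.0, 0.0, 0.0, 1.0])
pi, *_ = np.linalg.lstsq(A, b, rcond=None)

# v(y) = E_x[ V_{T_x^+}(y) ]: expected visits to y before returning to x.
# For y != x this is sum_z P(x,z) N(z,y), where N = (I - Q)^{-1} and Q is
# P restricted to the states != x (the chain killed at x); v(x) = 1 always.
x = 0
others = [1, 2]
Q = P[np.ix_(others, others)]
N = np.linalg.inv(np.eye(2) - Q)
v = np.ones(3)
v[others] = P[x, others] @ N

assert np.allclose(v, pi / pi[x], atol=1e-8)   # v(y) = pi(y) / pi(x)
```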

This vector plays a special role, as in the next lemma.

Lemma 3.7.9 Let $P$ be an irreducible transition matrix. Let $u(y) = \mathbb{E}_x[V_{T_x^+}(y)]$. Let $v \ge 0$ be a nonnegative vector such that $vP = v$ and $v(x) = 1$. Then, $v \ge u$. Moreover, if $x$ is recurrent, then $v = u$.

Proof Let $(X_t)_t$ denote the Markov chain. If $y = x$ then $v(x) = 1 \ge u(x)$, so we can assume that $y \ne x$. We will prove by induction that for all $t$, for any $y \ne x$,
$$\sum_{k=1}^{t} \mathbb{P}_x[X_k = y, T_x^+ \ge k] \le v(y). \tag{3.1}$$
Indeed, for $t = 1$ this is just
$$\mathbb{P}_x[X_1 = y, T_x^+ \ge 1] = P(x,y) \le \sum_z v(z) P(z,y) = v(y),$$
since $v \ge 0$, $v(x) = 1$, and $y \ne x$. For general $t > 0$, we rely on the fact that by the Markov property, for any $y \ne x$,
$$\mathbb{P}_x[X_{k+1} = y, T_x^+ \ge k+1] = \sum_{z \ne x} \mathbb{P}_x[X_{k+1} = y, X_k = z, T_x^+ \ge k] = \sum_{z \ne x} \mathbb{P}_x[X_k = z, T_x^+ \ge k]\, P(z,y).$$


So by induction,
$$\begin{aligned}
\sum_{k=1}^{t+1} \mathbb{P}_x[X_k = y, T_x^+ \ge k] &= P(x,y) + \sum_{k=1}^{t} \mathbb{P}_x[X_{k+1} = y, T_x^+ \ge k+1] \\
&= P(x,y) + \sum_{z \ne x} P(z,y) \sum_{k=1}^{t} \mathbb{P}_x[X_k = z, T_x^+ \ge k] \\
&\le P(x,y) + \sum_{z \ne x} P(z,y)\, v(z) = \sum_z v(z) P(z,y) = v(y).
\end{aligned}$$
This completes the proof of (3.1) by induction.

Now, one notes that the left-hand side of (3.1) is just the expected number of visits to $y$ started at $x$, up to time $T_x^+ \wedge t$. Taking $t \to \infty$, using monotone convergence,
$$v(y) \ge \sum_{k=1}^{t} \mathbb{P}_x[X_k = y, T_x^+ \ge k] = \mathbb{E}_x[V_{T_x^+ \wedge t}(y)] \nearrow u(y).$$

This proves that $v \ge u$.

Since $x$ is recurrent, we have $uP = u$, and $u(x) = 1 = v(x)$. We have seen that $v - u \ge 0$, and of course $(v-u)P = v - u$. Until now we have not actually used irreducibility; we will use it to show that $v - u = 0$. Indeed, let $y$ be any state. If $v(y) > u(y)$ then $v - u$ is a nonzero nonnegative left eigenvector for $P$, so it must be positive everywhere. This contradicts $v(x) - u(x) = 0$. So it must be that $v - u \equiv 0$.

We are now in good shape to prove Theorem 3.7.5.

Proof of Theorem 3.7.5 Assume that $\pi$ is a stationary distribution for $P$. Fix any state $x$. Recall that $\pi(x) > 0$. Define the vector $v(z) = \frac{\pi(z)}{\pi(x)}$. We have that $v \ge 0$, $vP = v$, and $v(x) = 1$. Hence, $v(z) \ge \mathbb{E}_x[V_{T_x^+}(z)]$ for all $z$. That is,
$$\mathbb{E}_x[T_x^+] = \sum_y \mathbb{E}_x[V_{T_x^+}(y)] \le \sum_y v(y) = \sum_y \frac{\pi(y)}{\pi(x)} = \frac{1}{\pi(x)} < \infty.$$
So $x$ is positive recurrent. This holds for a generic $x$.

The second bullet of course implies the third.

Now assume some state $x$ is positive recurrent. Let $v(y) = \mathbb{E}_x[V_{T_x^+}(y)]$. Since $x$ is recurrent, we know that $vP = v$ and $\sum_y v(y) = \mathbb{E}_x[T_x^+] < \infty$. So $\pi = \frac{v}{\mathbb{E}_x[T_x^+]}$ is a stationary distribution for $P$.

Since $P$ has a stationary distribution, by the first implication all states are positive recurrent. Thus, for any state $z$, if $v = \frac{\pi}{\pi(z)}$ then $vP = v$ and $v(z) = 1$. So $z$ being recurrent, we get that $v(y) = \mathbb{E}_z[V_{T_z^+}(y)]$ for all $y$. Specifically,


$$\mathbb{E}_z[T_z^+] = \sum_y v(y) = \frac{1}{\pi(z)},$$
which holds for all states $z$.

For the final implication, if $P$ has a unique stationary distribution, then of course it has a stationary distribution.

Corollary 3.7.10 (Stationary distributions are unique) If an irreducible Markov chain $P$ has two stationary distributions $\pi$ and $\pi'$, then $\pi = \pi'$.

Exercise 3.15 Let $P$ be an irreducible transition matrix. Show that for positive recurrent states $x, y$,
$$\mathbb{E}_x\big[V_{T_x^+}(y)\big] \cdot \mathbb{E}_y\big[V_{T_y^+}(x)\big] = 1.$$
▷ solution ◁

Exercise 3.16 Let $P$ be an irreducible transition matrix on a state space $G$. Let $A \subset G$, and consider the return time $T_A^+$. Show that if $P$ is positive recurrent then $\mathbb{E}_\nu[T_A^+] < \infty$, for any non-empty subset $A \subset G$ and any starting distribution $\nu$. ▷ solution ◁

Example 3.7.11 Consider the following Markov chain on state space $\mathbb{N}$: the transition probabilities are given by $P(x, x+1) = 1 - P(x, x-1) = p$ for $x > 0$ and $P(0,1) = 1$. Assume that $p < \frac12$. Define
$$\pi(x) = \begin{cases} c & x = 0, \\ \frac{c}{p} \left(\frac{p}{1-p}\right)^x & x > 0, \end{cases}$$
for an appropriate constant $c > 0$, chosen so that $\sum_x \pi(x) = 1$. Compute, for $x > 1$,
$$\begin{aligned}
\pi P(x) &= \pi(x-1) P(x-1,x) + \pi(x+1) P(x+1,x) \\
&= \frac{c}{p}\left(\frac{p}{1-p}\right)^{x-1} \cdot p + \frac{c}{p}\left(\frac{p}{1-p}\right)^{x+1} (1-p) = \frac{c}{p}\left(\frac{p}{1-p}\right)^{x-1} \cdot \frac{p(1-p)+p^2}{1-p} = \pi(x).
\end{aligned}$$
Also, $\pi(0) = c = \pi(1) P(1,0)$, and
$$\pi(1) = \frac{c}{1-p} = c + \frac{cp}{(1-p)^2}(1-p) = \pi(0) P(0,1) + \pi(2) P(2,1).$$
So $\pi$ is a stationary distribution.


Thus, this Markov chain is positive recurrent. In fact, we immediately have the formula $\mathbb{E}_x[T_x^+] = \frac{1}{\pi(x)}$ (which grows exponentially in $x$, since $\pi(x)$ decays exponentially). It is not difficult to compute $c = \frac{1-2p}{2(1-p)}$. Specifically, $\mathbb{E}_0[T_0^+] = \frac{1}{c}$, and so $\mathbb{E}_1[T_0] = \frac{1}{c} - 1$ (since the first step from $0$ is always to $1$). ♦
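The computations of this example can be verified numerically on a truncated state space; $p = 1/4$ and the truncation level below are illustrative choices:

```python
import numpy as np

# Example 3.7.11 with p = 1/4: pi(0) = c, pi(x) = (c/p) (p/(1-p))^x,
# with c = (1-2p) / (2(1-p)).  Truncate N at level 60; the tail mass
# beyond the truncation is of order (p/(1-p))^60 and is negligible.
p = 0.25
c = (1 - 2 * p) / (2 * (1 - p))

N = 60
pi = np.array([c] + [(c / p) * (p / (1 - p)) ** x for x in range(1, N)])
assert abs(pi.sum() - 1.0) < 1e-12

P = np.zeros((N, N))
P[0, 1] = 1.0
for x in range(1, N - 1):
    P[x, x + 1] = p
    P[x, x - 1] = 1 - p
P[N - 1, N - 2] = 1 - p  # boundary row of the truncation

# pi P = pi holds exactly away from the truncation boundary.
assert np.allclose((pi @ P)[: N - 2], pi[: N - 2])

# Expected return time to 0 is 1/pi(0) = 2(1-p)/(1-2p) = 3 for p = 1/4.
assert np.isclose(1 / pi[0], 3.0)
```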

Theorem 3.7.5 shows that the property of positive recurrence is a property of the whole (irreducible) Markov chain, not just of a specific state. So we conclude that we may call an irreducible Markov chain positive recurrent if there exists a positive recurrent state, and we call the chain null recurrent if there exists a null recurrent state.

We are now ready to prove Theorem 3.7.1.

Proof of Theorem 3.7.1 By Theorem 3.7.5 we have that (1) $\iff$ (2).

Now, note that if $(X_t)_t$ is a $\mu$-random walk on $G$ with $X_0 = x$, then $(Y_t = y X_t)_t$ is a $\mu$-random walk on $G$ with $Y_0 = yx$. Thus, $\mathbb{E}_x[T_x^+] = \mathbb{E}_{yx}[T_{yx}^+]$ for any $y \in G$, which implies that the quantity $\mathbb{E}_x[T_x^+] = \mathbb{E}_1[T_1^+]$ is a constant independent of $x$.

Hence, the vector $\pi(x) = \frac{1}{\mathbb{E}_x[T_x^+]}$ is a strictly positive left eigenvector of $P_\mu$ if and only if $\mathbb{E}_x[T_x^+] = \mathbb{E}_1[T_1^+] < \infty$. Also, since $\pi$ is a nonzero constant vector, it is a distribution (i.e. $\sum_x \pi(x) = 1$) if and only if $G$ is finite. So by Theorem 3.7.5, $(X_t)_t$ is positive recurrent if and only if $G$ is finite, which is (2) $\iff$ (3).

Exercise 3.17 Show that if $(X_t)_t$ is a $\mu$-random walk on a group $G$ with $X_0 = x$, then $(Y_t = y X_t)_t$ is a $\mu$-random walk on $G$ with $Y_0 = yx$.

3.7.3 Summary

Let us sum up what we know so far about irreducible chains. If $P$ is an irreducible transition matrix, then:

• $\mathbb{E}_x[V_\infty(x)] + 1 = \frac{1}{\mathbb{P}_x[T_x^+ = \infty]}$.
• For all states $x, y$, we have that $x$ is transient if and only if $y$ is transient.
• If $P$ is recurrent, the vector $v(z) = \mathbb{E}_x[V_{T_x^+}(z)]$ is a positive left eigenvector for $P$, and any nonnegative left eigenvector for $P$ is proportional to $v$.
• $P$ has a stationary distribution if and only if $P$ is positive recurrent.
• If $P$ is positive recurrent then there exists a unique stationary distribution $\pi$, satisfying $\pi(x) \cdot \mathbb{E}_x[T_x^+] = 1$.

Example 3.7.12 Consider the random walk on $\mathbb{Z}$ given by taking $\mu(1) = p = 1 - \mu(-1)$, so that $P(x, x+1) = p$ and $P(x, x-1) = 1-p$ for all $x$. Suppose $vP = v$. Then $v(x) = v(x-1)\, p + v(x+1)(1-p)$, or $v(x+1) = \frac{1}{1-p}\big(v(x) - p\, v(x-1)\big)$. Solving such recursions is simple: set $u_x = \begin{pmatrix} v(x+1) \\ v(x) \end{pmatrix}$, so that $u_{x+1} = \frac{1}{1-p} A u_x$, where
$$A = \begin{pmatrix} 1 & -p \\ 1-p & 0 \end{pmatrix}.$$
Since the characteristic polynomial of $A$ is $\lambda^2 - \lambda + p(1-p) = (\lambda - p)(\lambda - (1-p))$, we get that the eigenvalues of $A$ are $p$ and $1-p$. One can easily check that $A$ is diagonalizable (when $p \ne 1/2$), and so
$$v(x) = u_x(2) = (1-p)^{-x} (A^x u_0)(2) = (1-p)^{-x} \cdot [0\ \ 1]\, M D^x M^{-1} u_0 = a \left(\tfrac{p}{1-p}\right)^x + b,$$
where $D$ is diagonal with $p, 1-p$ on the diagonal, and $a, b$ are constants that depend on the matrix $M$ and on $u_0$ (but are independent of $x$). Thus, $\sum_{x \in \mathbb{Z}} v(x)$ will only converge for $a = 0, b = 0$, which gives $v = 0$. That is, there is no stationary distribution, and $P$ is not positive recurrent. In Exercise 4.27 we will in fact see that $P$ is transient for $p \ne 1/2$, and for $p = 1/2$ we have already seen that $P$ is null recurrent, since this is the simple random walk on the Cayley graph of $\mathbb{Z}$ with respect to the generating set $\{-1, 1\}$. ♦

Example 3.7.13 (This is a solution to an exercise from Aldous and Fill (2002).) A chess knight moves on a chess board; each step it chooses uniformly among the possible moves. Suppose the knight starts at the corner. What is the expected time it takes the knight to return to its starting point?

At first, this looks difficult. However, let $G$ be the graph whose vertices are the squares of the chess board, $V(G) = \{1, 2, \dots, 8\}^2$. Let $x = (1,1)$ be the starting point of the knight. For edges, we will connect two vertices if the knight can jump from one to the other in a legal move. Thus, for example, a vertex in the "center" of the board has 8 adjacent vertices. A corner, on the other hand, has 2 adjacent vertices. In fact, we can determine the degree of all vertices.

[Diagram: the legal knight moves from a given square.] The degrees of the squares in the knight-move graph are:
$$\begin{matrix}
2 & 3 & 4 & 4 & 4 & 4 & 3 & 2 \\
3 & 4 & 6 & 6 & 6 & 6 & 4 & 3 \\
4 & 6 & 8 & 8 & 8 & 8 & 6 & 4 \\
4 & 6 & 8 & 8 & 8 & 8 & 6 & 4 \\
4 & 6 & 8 & 8 & 8 & 8 & 6 & 4 \\
4 & 6 & 8 & 8 & 8 & 8 & 6 & 4 \\
3 & 4 & 6 & 6 & 6 & 6 & 4 & 3 \\
2 & 3 & 4 & 4 & 4 & 4 & 3 & 2
\end{matrix}$$


Summing all the degrees, one sees that $2|E(G)| = 4 \cdot (4 \cdot 8 + 4 \cdot 6 + 5 \cdot 4 + 2 \cdot 3 + 2) = 4 \cdot 84 = 336$. Thus, the stationary distribution is $\pi(i,j) = \deg(i,j)/336$. Specifically, $\pi(x) = 2/336$, and so $\mathbb{E}_x[T_x^+] = 168$. ♦
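The degree count and the expected return time are easy to confirm computationally by generating the knight moves directly:

```python
# Knight-move graph on the 8x8 board (Example 3.7.13): the expected
# return time to the corner is 1/pi(corner) = 2|E| / deg(corner).
moves = [(1, 2), (2, 1), (-1, 2), (-2, 1), (1, -2), (2, -1), (-1, -2), (-2, -1)]

def deg(i, j):
    # number of legal knight moves from square (i, j), 1 <= i, j <= 8
    return sum(1 for di, dj in moves if 1 <= i + di <= 8 and 1 <= j + dj <= 8)

total_degree = sum(deg(i, j) for i in range(1, 9) for j in range(1, 9))
assert total_degree == 336               # = 2 |E(G)|
assert deg(1, 1) == 2                    # the corner square
assert total_degree / deg(1, 1) == 168.0  # expected return time to the corner
```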

3.7.4 Convergence to Stationarity

We have already seen that if $\mathbb{P}_\nu[X_t = x] \to \pi(x)$ for all $x \in G$, then $\pi$ must be a stationary distribution. The following states that the converse is also true for aperiodic irreducible chains.

Theorem 3.7.14 Let $P$ be an aperiodic irreducible transition matrix of a Markov chain $(X_t)_t$ on a state space $G$. If $P$ is positive recurrent, then for any starting distribution $\nu$ the limit
$$\pi(x) = \lim_{t\to\infty} \mathbb{P}_\nu[X_t = x]$$
exists, and $\pi$ is the stationary distribution of $P$.

Proof Since $P$ is positive recurrent, there exists a stationary distribution $\pi$. Let $(X_t)_t$ be the Markov chain with transition matrix $P$, started with distribution $\nu$. Let $(Y_t)_t$ be an independent chain with transition matrix $P$ on the same state space, started with distribution $\pi$.

Consider the process $Z_t = (X_t, Y_t) \in G \times G$. This is easily verified to be a Markov chain with transition matrix $Q((x,y),(x',y')) = P(x,x')\, P(y,y')$. We use the fact that $P$ is irreducible and aperiodic to prove that $Q$ is irreducible. For any $x, y, x', y'$ there exists $t_0$ such that for all $t > t_0$ we have
$$Q^t((x,y),(x',y')) = \mathbb{P}_{(x,y)}[(X_t, Y_t) = (x',y')] = \mathbb{P}_x[X_t = x'] \cdot \mathbb{P}_y[Y_t = y'] = P^t(x,x')\, P^t(y,y') > 0.$$
Moreover, one checks readily that $\lambda(x,y) := \pi(x)\pi(y)$ is a stationary distribution for $Q$. Thus, by Theorem 3.7.5 we know that $Q$ is positive recurrent. Specifically, if we fix some $x \in G$, then by Exercise 3.16 we have that $\mathbb{E}_{(\nu,\pi)}[T] < \infty$, where $T = \inf\{t \ge 0 : X_t = Y_t = x\}$, and $(\nu,\pi)$ is the starting distribution of $(X_0, Y_0)$. Specifically, $\mathbb{P}_{(\nu,\pi)}[T < \infty] = 1$.

Define
$$A_t = \begin{cases} X_t & t \le T, \\ Y_t & t > T, \end{cases} \qquad B_t = \begin{cases} Y_t & t \le T, \\ X_t & t > T. \end{cases}$$
An important observation is that under $\mathbb{P}_{(x,x)}$, both chains $(X_t)_t$ and $(Y_t)_t$ have the exact same distribution and are independent. Thus, under $\mathbb{P}_{(x,x)}$, the chains $(X_t, Y_t)_t$ and $(Y_t, X_t)_t$ have the same distribution. The strong Markov property at time $T$ implies that under $\mathbb{P}_{(\nu,\pi)}$, the chain $(A_{T+t}, B_{T+t})_t$ is independent


of $(A_k, B_k)_{k \le T}$, and has the same distribution as $(Y_t, X_t)_t$ under $\mathbb{P}_{(x,x)}$. Also, $(A_k, B_k)_{k \le T} = (X_k, Y_k)_{k \le T}$ by definition. We conclude that under $\mathbb{P}_{(\nu,\pi)}$, the following chains have the same distribution:
$$(A_t, B_t)_t = ((X_0, Y_0), \dots, (X_T, Y_T), (Y_{T+1}, X_{T+1}), \dots)$$
and
$$((X_0, Y_0), \dots, (X_T, Y_T), (X_{T+1}, Y_{T+1}), \dots) = (X_t, Y_t)_t.$$
Specifically, $(A_t)_t$ and $(X_t)_t$ have the same distribution under $\mathbb{P}_{(\nu,\pi)}$. So $(A_t)_t$ is a Markov chain with transition matrix $P$. Because $\pi$ is a stationary distribution, $\mathbb{P}_{(\nu,\pi)}[Y_t = y] = \pi P^t(y) = \pi(y)$. We have that
$$\mathbb{P}_\nu[X_t = y] = \mathbb{P}_{(\nu,\pi)}[A_t = y] = \mathbb{P}_{(\nu,\pi)}[X_t = y, T \ge t] + \mathbb{P}_{(\nu,\pi)}[Y_t = y, T < t] = \mathbb{P}_{(\nu,\pi)}[X_t = y, T \ge t] - \mathbb{P}_{(\nu,\pi)}[Y_t = y, T \ge t] + \pi(y),$$
so
$$|\mathbb{P}_\nu[X_t = y] - \pi(y)| = \big|\mathbb{P}_{(\nu,\pi)}[X_t = y, T \ge t] - \mathbb{P}_{(\nu,\pi)}[Y_t = y, T \ge t]\big| \le \mathbb{P}_{(\nu,\pi)}[T \ge t] \to 0,$$
as $t \to \infty$, because $\mathbb{P}_{(\nu,\pi)}[T < \infty] = 1$. Since $P^t(x,y) = \mathbb{P}_x[X_t = y]$, this is a special case of the above with $\nu = \delta_x$.

Remark 3.7.15 Note that if $P$ is transient, then $\sum_{t=0}^{\infty} P^t(x,y) < \infty$, so $P^t(x,y) \to 0$ for all $x, y$. When we study null recurrent chains, we will see in Theorem 3.8.1 (and Exercise 3.19) that $P^t(x,y) \to 0$ in the null recurrent case. So we see that for irreducible and aperiodic chains, $(P^t(x,y))_t$ always converges. In the positive recurrent case it converges to the stationary distribution, whereas in the other cases it converges to 0.
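Theorem 3.7.14 can be illustrated numerically: for a small aperiodic irreducible chain (the matrix below is an arbitrary illustrative choice), the total variation distance of $\nu P^t$ from $\pi$ tends to 0.

```python
import numpy as np

# A small aperiodic irreducible chain (arbitrary illustrative choice).
P = np.array([[0.5, 0.5, 0.0],
              [0.2, 0.3, 0.5],
              [0.4, 0.0, 0.6]])

# Stationary distribution: left eigenvector of P for eigenvalue 1.
w, V = np.linalg.eig(P.T)
pi = np.real(V[:, np.argmin(np.abs(w - 1))])
pi = pi / pi.sum()

nu = np.array([1.0, 0.0, 0.0])  # arbitrary starting distribution
tv = [0.5 * np.abs(nu @ np.linalg.matrix_power(P, t) - pi).sum()
      for t in (1, 5, 20, 60)]

assert tv[0] > 0.1      # far from stationarity at t = 1
assert tv[-1] < 1e-6    # essentially converged by t = 60
```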

3.8 Null Recurrence

We have seen above that an irreducible aperiodic Markov chain converges to a stationary distribution if and only if it is positive recurrent. What about null recurrent chains?

Exercise 3.18 Let $P$ be an irreducible transition matrix. Show that the following are equivalent:

(1) $P^t(x,x) \to 0$ for all $x$.
(2) $P^t(x,x) \to 0$ for some $x$.


(3) $P^t(x,y) \to 0$ for some $x, y$.
(4) $P^t(x,y) \to 0$ for all $x, y$.

▷ solution ◁

Theorem 3.8.1 Let $P$ be an irreducible null recurrent Markov chain. Then, $P^t(x,y) \to 0$ for all $x, y$.

Proof We will prove this under the additional assumption that $P$ is also aperiodic. Exercise 3.19 shows that this extends to the general irreducible null recurrent case.

Fix $x$. Set $L = \limsup_{t\to\infty} P^t(x,x)$. By Exercise 3.18, it suffices to show that $L = 0$.

Let $v(y) = \mathbb{E}_x[V_{T_x^+}(y)]$. We have already seen in Example 3.7.8 that $v > 0$ and $vP = v$. If $\sum_y v(y) < \infty$ then $P$ would be positive recurrent, so $\sum_y v(y) = \infty$ by assumption.

Fix $\varepsilon > 0$ and choose a finite subset $A$ of states such that $v(A) = \sum_{a \in A} v(a) > \varepsilon^{-1}$. Define a probability measure $\nu_A(y) = 1_{\{y \in A\}} \frac{v(y)}{v(A)}$. Then,
$$\nu_A P^t(y) \le \frac{1}{v(A)}\, v P^t(y) = \frac{v(y)}{v(A)} < \varepsilon\, v(y).$$
Since $v(x) = 1$, we have $\nu_A P^t(x) < \varepsilon$ for any $t$.

Now, since $P$ is irreducible and aperiodic, and since $A$ is finite, we know that there exists $n$ such that for all $t \ge n$ we have $P^t(x,a) > 0$ for all $a \in A$. Let $0 < \delta < \varepsilon$ be small enough so that $P^n(x,a) - \delta \nu_A(a) > 0$ for all $a \in A$. Since $\nu_A(y) = 0$ for $y \notin A$, we may define a probability measure
$$\eta(y) = \frac{P^n(x,y) - \delta \nu_A(y)}{1 - \delta}.$$
By Exercise 3.11, $\mathbb{P}_\eta[T_x^+ = \infty] = 0$. So we may choose $k$ large enough such that $\mathbb{P}_\eta[T_x^+ > k] < \delta^2$. Then for $t > k$,
$$\mathbb{P}_\eta[X_t = x] = \sum_{j=1}^{t} \mathbb{P}_\eta[T_x^+ = j] \cdot P^{t-j}(x,x) \le \sum_{j=1}^{k} \mathbb{P}_\eta[T_x^+ = j] \cdot P^{t-j}(x,x) + \mathbb{P}_\eta[T_x^+ > k],$$
so
$$\limsup_{t\to\infty} \eta P^t(x) = \limsup_{t\to\infty} \mathbb{P}_\eta[X_t = x] \le L + \delta^2.$$


However,
$$P^{t+n}(x,x) = \sum_y P^n(x,y)\, P^t(y,x) = (1-\delta)\, \eta P^t(x) + \delta\, \nu_A P^t(x) < (1-\delta)\, \eta P^t(x) + \delta \varepsilon,$$
so taking a $\limsup$, we get that $L \le (1-\delta)(L + \delta^2) + \delta\varepsilon$, implying that $L \le (1-\delta)\delta + \varepsilon < 2\varepsilon$. Since this holds for arbitrary $\varepsilon > 0$, we get that $\limsup_{t\to\infty} P^t(x,x) = 0$.

Exercise 3.19 Show that if $P$ is irreducible and null recurrent, then $P^t(x,y) \to 0$ for any $x, y$. ▷ solution ◁
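For the simple random walk on $\mathbb{Z}$, which is null recurrent, the conclusion of Theorem 3.8.1 is explicit: $P^{2k}(0,0) = \binom{2k}{k} 2^{-2k} \sim 1/\sqrt{\pi k} \to 0$. A quick exact check:

```python
import math

# Return probabilities of the SRW on Z: P^{2k}(0,0) = C(2k, k) / 4^k.
# These decrease to 0 like 1/sqrt(pi k), consistent with null recurrence.
def p_return(k):
    return math.comb(2 * k, k) / 4**k

vals = [p_return(k) for k in (10, 100, 1000, 10000)]
assert all(a > b for a, b in zip(vals, vals[1:]))  # strictly decreasing to 0
assert abs(p_return(10000) * math.sqrt(math.pi * 10000) - 1) < 1e-4
```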

3.9 Finite Index Subgroups

In this section we will apply the optional stopping theorem and the basic theory of Markov chains to investigate the behavior of harmonic functions when moving to a finite index subgroup.

Let $G$ be a finitely generated group with a symmetric adapted probability measure $\mu$. Let $H \le G$ be a finite index subgroup. Given the $\mu$-random walk on $G$, we may consider the random walk on the coset space, $(H X_t)_t$. Indeed, it is a Markov process: if $Hx = Hy$ then $x^{-1} H = y^{-1} H$, so
$$\mathbb{P}[H X_{t+1} = Hz \mid H X_t = Hx] = \mathbb{P}[U_{t+1} \in x^{-1} H z] = \mathbb{P}[U_{t+1} \in y^{-1} H z] = \mathbb{P}[H X_{t+1} = Hz \mid H X_t = Hy].$$
So $(H X_t)_t$ is a Markov chain on $G/H$, which is a finite set. The transition matrix is given by
$$P(Hx, Hy) = \mathbb{P}[H X_1 = Hy \mid H X_0 = Hx] = \mathbb{P}[X_1 \in x^{-1} H y].$$
Let us show that the uniform distribution is stationary for this Markov chain. For $n = [G:H]$, let $x_1, \dots, x_n \in G$ be such that $G = \biguplus_{j=1}^n x_j^{-1} H$. Note that $H x_j = H x_i$ if and only if $x_i^{-1} H = x_j^{-1} H$, so that $G/H = \{H x_1, \dots, H x_n\}$. Because $G = Gy$, we know that $G = \biguplus_{j=1}^n x_j^{-1} H y$ for any $y \in G$. Hence,
$$\sum_{j=1}^{n} P(H x_j, H y) = \sum_{j=1}^{n} \mathbb{P}[X_1 \in x_j^{-1} H y] = 1.$$


This Markov chain is irreducible because $\mu$ is adapted; indeed, $P^t(Hx, Hy) \ge \mathbb{P}_x[X_t = y] > 0$ for large enough $t$. We have proved above, in the theory of finite state space Markov chains, that the stationary distribution is unique for an irreducible finite chain, and is exactly equal to the inverse of the expected return time; that is, we have shown the following proposition.

Proposition 3.9.1 Let $G$ be a finitely generated group and let $\mu$ be an adapted probability measure on $G$. Let $H \le G$ be a finite index subgroup, $[G:H] < \infty$. Recall $T_H^+ := \inf\{t \ge 1 : X_t \in H\}$, the return time to $H$, where $(X_t)_t$ is the $\mu$-random walk on $G$. Then,
$$\mathbb{E}_1[T_H^+] = [G : H].$$

Proof If we consider the random walk on $G/H$ obtained by projecting, $(H X_t)_t$, we have that $T_H^+ = \inf\{t \ge 1 : H X_t = H\}$, which is the first return time of the projected walk to the origin. So $\mathbb{E}_1[T_H^+]$ is the reciprocal of the stationary distribution at $H = H1$, which by the above is exactly the number of states; that is, $\mathbb{E}_1[T_H^+] = [G:H]$.
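Proposition 3.9.1 is easy to see concretely. A sketch under illustrative assumptions ($G = \mathbb{Z}$, $H = 5\mathbb{Z}$, $\mu$ uniform on $\{\pm 1\}$ — not from the text): the projected walk is the simple random walk on the 5-cycle, and the expected return time to the coset $H$, computed exactly by a linear solve, equals the index 5.

```python
import numpy as np

# Projected chain on G/H = Z/5Z for the SRW on Z with H = 5Z.
n = 5
P = np.zeros((n, n))
for i in range(n):
    P[i, (i + 1) % n] = P[i, (i - 1) % n] = 0.5

# h(i) = E_i[T_0] solves (I - Q) h = 1 on the states i != 0.
Q = P[1:, 1:]
h = np.linalg.solve(np.eye(n - 1) - Q, np.ones(n - 1))
expected_return = 1 + P[0, 1:] @ h   # one step, then hit 0

assert np.isclose(expected_return, n)   # = [G : H]
```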

Proof If we consider the random walk on G/H by projecting (H Xt )t , we have that TH+ = inf {t ≥ 1 : H Xt = Hf }, which is the first return time of the g + projected walk to the origin. So E1 TH is the reciprocal of the stationary distribution f g at H = H1, which by the above is exactly the number of states; that is, E1 TH+ = [G : H]. Let G be a finitely generated group and let µ be an adapted probability measure on G. Let H ≤ G be a finite index subgroup [G : H] < ∞. Show that there exist constants C, c > 0 such that for any x ∈ G and all t, f g Px TH+ > t ≤ Ce−ct . B solution C Exercise 3.20

Definition 3.9.2 Let G be a finitely generated group and µ an adapted probability measure on G. Let H ≤ G be a finite index subgroup [G : H] < ∞. Define a probability measure on H, called the hitting measure with respect to µ, by

$$\mu_H(x) := \mathbb{P}_1\big[X_{T_H^+} = x\big],$$
where $T_H^+$ is the return time to $H$.

This measure arises naturally in the study of harmonic functions. We will see that natural spaces of $\mu$-harmonic functions on $G$ are isomorphic to the corresponding spaces of $\mu_H$-harmonic functions on $H$, via the restriction map. Exercise 3.21 provides an example.

Exercise 3.21 Show that if $H \le G$ is a finite index subgroup, $[G:H] < \infty$, and $f \in \mathrm{BHF}(G, \mu)$, then $f|_H \in \mathrm{BHF}(H, \mu_H)$. ▷ solution ◁


When studying the spaces $\mathrm{BHF}(G, \mu)$, we would often like to consider properties that are "geometric" in nature. That is, they should be invariant under slight changes of the geometry, for example passing to a finite index subgroup. A naive choice may be to consider only measures with finite support. But this fails to be preserved when passing to a finite index subgroup.

Exercise 3.22 Consider the Cayley graph of $\mathbb{Z}^2$ with respect to the standard generating set $\{\pm(0,1), \pm(1,0)\}$. Let $\mu$ be the uniform measure on this generating set. Let $H = \{(2x, 2y) : x, y \in \mathbb{Z}\}$. Show that $H$ is a finite index subgroup in $\mathbb{Z}^2$. Show that $\mu_H$ is not finitely supported. ▷ solution ◁
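Exercise 3.22 can be explored by simulation. The sketch below (sample size and seed are arbitrary choices) samples the hitting measure $\mu_H$ for $H = 2\mathbb{Z} \times 2\mathbb{Z}$ by running the walk until its first return to $H$; the number of distinct return points observed keeps growing, consistent with $\mu_H$ not being finitely supported.

```python
import random

# SRW on Z^2; H = 2Z x 2Z (both coordinates even).
random.seed(0)
steps = [(1, 0), (-1, 0), (0, 1), (0, -1)]

def sample_hit():
    # Run the walk from the identity until the first time t >= 1 with X_t in H.
    x, y = 0, 0
    while True:
        dx, dy = random.choice(steps)
        x, y = x + dx, y + dy
        if x % 2 == 0 and y % 2 == 0:
            return (x, y)

support = {sample_hit() for _ in range(20000)}
# Far more distinct return points than any fixed finite generating set.
assert len(support) > 10
```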

Contrary to the above, the next theorem shows that the category of symmetric, adapted measures with exponential tails is invariant under moving to hitting measures on finite index subgroups, and is thus a natural choice for the geometric study of groups via their harmonic functions.

Exercise 3.23 Let $G$ be a finitely generated group and $\mu$ an adapted probability measure on $G$. Let $H \le G$ be a finite index subgroup, $[G:H] < \infty$. Show that the hitting measure $\mu_H$ is also adapted (to $H$). Show that if $\mu$ is symmetric then $\mu_H$ is also symmetric. ▷ solution ◁

Theorem 3.9.3 Let $G$ be a finitely generated group and let $\mu$ be an adapted probability measure on $G$ with an exponential tail. Let $H \le G$ be a subgroup of finite index $[G:H] < \infty$. Let $\mu_H$ be the hitting measure on $H$. Then $\mu_H$ is also adapted and has an exponential tail.

Proof We have that $\mu_H$ is adapted to $H$ because $\mu$ is adapted to $G$, by Exercise 3.23. We only need to show that $\mu_H$ has an exponential tail.

By Exercise 3.20, we may choose $\delta > 0$ so that for some constant $K > 0$ we have $\mathbb{P}[T_H^+ \ge t] \le K e^{-2\delta t}$ for any $t \ge 1$. Let $(X_t)_t$ denote the $\mu$-random walk, and let $U_t = X_{t-1}^{-1} X_t$ be the independent "jumps" of the walk. Since $\mu$ has an exponential tail, we have that $\mathbb{E}[e^{\varepsilon |U_1|}] < \infty$ for some $\varepsilon > 0$. By dominated convergence, $\mathbb{E}[e^{\varepsilon |U_1|}] \to 1$ as $\varepsilon \to 0$. So we may choose $\varepsilon > 0$ small enough so that $\mathbb{E}[e^{2\varepsilon |U_1|}] \le e^\delta$, for $\delta > 0$ as above. Using Cauchy–Schwarz appropriately, we can calculate:
$$\mathbb{E}\Big[e^{\varepsilon |X_{T_H^+}|}\Big] \le \sum_{t=1}^{\infty} \mathbb{E}\Big[ 1_{\{T_H^+ = t\}} \prod_{j=1}^{t} e^{\varepsilon |U_j|} \Big] \le \sum_{t=1}^{\infty} \sqrt{\mathbb{P}[T_H^+ = t]} \cdot \mathbb{E}\big[e^{2\varepsilon |U_1|}\big]^{t/2} \le \sum_{t=1}^{\infty} \sqrt{K e^{-2\delta t}} \cdot e^{\delta t / 2} < \infty.$$

So µ H has an exponential tail.


3.9.1 Random Walks with $k$ Moments

We have seen that if $\mu \in \mathrm{SA}(G, \infty)$, then $\mu_H \in \mathrm{SA}(H, \infty)$ when $[G:H] < \infty$. In this section we will prove the slightly more complicated result stating that if $\mu \in \mathrm{SA}(G, k)$ then $\mu_H \in \mathrm{SA}(H, k)$ when $[G:H] < \infty$. That is, the property of having a finite $k$th moment is preserved when passing to a finite index subgroup. We require two lemmas first.

Lemma 3.9.4 Let $G$ be a finitely generated group and $\mu$ an adapted measure on $G$. Assume that $\mu$ has finite $k$th moment. Let $(X_t)_t$ be a $\mu$-random walk on $G$. Then, there exists a constant $C = C(k, \mu) > 0$ such that for every $t \ge 0$, and for any $x \in G$,
$$\mathbb{E}_x\big[|X_t|^k\big] \le C \cdot (t + |x|)^k.$$

Proof By the triangle inequality we have $|X_t| \le |X_0| + \sum_{j=1}^{t} |U_j|$, where $U_j = X_{j-1}^{-1} X_j$ is the jump at time $j$. So
$$\mathbb{E}\big[|X_t|^k\big] \le \mathbb{E}\Big[ \Big( \sum_{j=1}^{t} |U_j| \Big)^k \Big] = \sum_{\vec{j} \in \{1,\dots,t\}^k} \mathbb{E}\Big[ \prod_{i=1}^{k} |U_{j_i}| \Big]. \tag{3.2}$$
Note that by Jensen's inequality,
$$\max_{1 \le n \le k} \mathbb{E}[|U_1|^n]^{k/n} \le \mathbb{E}\big[|U_1|^k\big] =: C_k.$$
It follows that
$$\mathbb{E}\Big[ \prod_{i=1}^{k} |U_{j_i}| \Big] \le C_k \quad \text{for all } \vec{j} \in \{1,\dots,t\}^k. \tag{3.3}$$
It now follows from (3.2) and (3.3) that
$$\mathbb{E}_x\big[|X_t|^k\big] = \mathbb{E}\big[|x X_t|^k\big] \le \mathbb{E}\big[(|x| + |X_t|)^k\big] = \sum_{j=0}^{k} \binom{k}{j} |x|^j \cdot \mathbb{E}\big[|X_t|^{k-j}\big] \le \sum_{j=0}^{k} \binom{k}{j} |x|^j \cdot C_{k-j}\, t^{k-j}.$$
Taking $C = \max_{j \le k} C_j$, we get that $\mathbb{E}_x[|X_t|^k] \le C \cdot (t + |x|)^k$.

Lemma 3.9.5 Let $G$ be a finitely generated group and $\mu$ an adapted measure on $G$. Assume that $\mu$ has finite $k$th moment. Let $H \le G$ be a subgroup of finite index $[G:H] < \infty$. Let $(X_t)_t$ denote the $\mu$-random walk.


Then, there exist i.i.d. nonnegative, integer-valued random variables $(R_t)_t$, also independent of $T_H^+$, such that a.s.
$$\big|X_{T_H^+}\big| \le \max_{1 \le t \le T_H^+} |X_t| \le |X_0| + \sum_{t=1}^{T_H^+} R_t,$$
and such that $\mathbb{E}[R_1^k] < \infty$.

Proof Let $n = [G:H]$ and choose a set $g_1, \dots, g_n$ of representatives for the cosets of $H$, so that $G = \biguplus_{j=1}^n H g_j$. Let $1 \le m \le n$ be such that $H g_m = H$. For all $t \ge 1$ and $1 \le i, j \le n$, let $S(t,i,j)$ be independent elements of $G$, with distribution given by
$$\mathbb{P}[S(t,i,j) = x] = \begin{cases} \frac{\mu(x)}{\mu(g_i^{-1} H g_j)} \cdot 1_{\{x \in g_i^{-1} H g_j\}} & \mu(g_i^{-1} H g_j) > 0, \\ 1_{\{x = 1\}} & \mu(g_i^{-1} H g_j) = 0. \end{cases}$$
For each $t \ge 1$ set $R_t = \max\{|S(t,i,j)| : 1 \le i, j \le n\}$. It is Exercise 3.24 to verify that $\mathbb{E}[R_t^k] = \mathbb{E}[R_1^k] < \infty$ (because $\mu$ has finite $k$th moment).

Let $(I_t)_t$ be a Markov chain on $\{1, \dots, n\}$ with transition matrix
$$\mathbb{P}[I_{t+1} = j \mid I_t = i] = P(i,j) = \mu\big(g_i^{-1} H g_j\big),$$
started at $I_0 = 1$. Choose $(I_t)_t$ so that it is independent of $(S(t,i,j))_{t,i,j}$. Recalling that $H g_m = H$, define $T = \inf\{t \ge 1 : I_t = m\}$.

Since $R_t$ is measurable with respect to $\sigma(S(t,i,j) : 1 \le i, j \le n)$, we get that $(R_t)_t$ is independent of $(I_t)_t$. Also, since $(S(t,i,j))_{t,i,j}$ are independent, and since $T$ is measurable with respect to $\sigma(I_t : t \ge 0)$, we conclude that $(R_t)_t, T$ are all independent, as required.

Now, consider the sequence $Y_0 = X_0$ and $Y_t = Y_{t-1} S(t, I_{t-1}, I_t)$ for $t \ge 1$. Exercise 3.26 shows that $(Y_t)_t$ is a Markov chain on $G$ with transition matrix $\mathbb{P}[Y_{t+1} = y \mid Y_t = x] = \mu(x^{-1} y)$, so that $(Y_t)_t$ has the distribution of a $\mu$-random walk on $G$. Exercise 3.25 shows that $I_t = i$ if and only if $Y_t \in H g_i$. Thus,
$$T = \inf\{t \ge 1 : I_t = m\} = \inf\{t \ge 1 : Y_t \in H\},$$


which implies that $T$ and $T_H^+$ have the same distribution. Finally, note that
$$\max_{1 \le t \le T} |Y_t| \le |Y_0| + \sum_{t=1}^{T} |S(t, I_{t-1}, I_t)| \le |Y_0| + \sum_{t=1}^{T} R_t.$$

Exercise 3.24 Show that $\mathbb{E}[R_1^k] < \infty$ in the proof of Lemma 3.9.5. ▷ solution ◁

Exercise 3.25 Show that in the proof of Lemma 3.9.5, $I_t = i$ if and only if $Y_t \in H g_i$. ▷ solution ◁

Exercise 3.26 Show that $(Y_t)_t$ from the proof of Lemma 3.9.5 is a Markov chain on $G$ with transition matrix $\mathbb{P}[Y_{t+1} = y \mid Y_t = x] = \mu(x^{-1} y)$. ▷ solution ◁

Using Lemma 3.9.5, we can now deduce that the $k$th moment property is preserved when passing to the hitting measure of a finite index subgroup.

Corollary 3.9.6 Let $H \le G$ be a subgroup of finite index $[G:H] < \infty$ in a finitely generated group $G$. If $\mu$ is an adapted probability measure on $G$ with finite $k$th moment, then the hitting measure $\mu_H$ is also an adapted probability measure on $H$ with finite $k$th moment.

Proof Let $(X_t)_t$ denote the $\mu$-random walk. Let $T = T_H^+$, so $X_T$ has law $\mu_H$. We need to prove that $\mathbb{E}[|X_T|^k] < \infty$.

By Lemma 3.9.5, $\mathbb{P}$-a.s., $|X_T| \le \sum_{t=1}^{T} R_t$ for some i.i.d. $(R_t)_t$, independent of $T$, all nonnegative integer-valued random variables. Also, $\mathbb{E}[R_1^k] < \infty$, and by Exercise 3.20, $\mathbb{E}[T^k] < \infty$. Using that $T$ is independent of $(R_t)_t$, we have that
$$\begin{aligned}
\mathbb{E}\big[|X_T|^k\big] &\le \mathbb{E}\Big[ \Big( \sum_{t=1}^{T} R_t \Big)^k \Big] = \sum_{s=1}^{\infty} \mathbb{P}[T = s] \cdot \mathbb{E}\Big[ \Big( \sum_{t=1}^{s} R_t \Big)^k \Big] \\
&= \sum_{s=1}^{\infty} \mathbb{P}[T = s] \sum_{\vec{j} \in \{1,\dots,s\}^k} \mathbb{E}\Big[ \prod_{i=1}^{k} R_{j_i} \Big] \le \sum_{s=1}^{\infty} \mathbb{P}[T = s] \cdot s^k \cdot \mathbb{E}\big[R_1^k\big] = \mathbb{E}[T^k] \cdot \mathbb{E}[R_1^k] < \infty.
\end{aligned}$$
We have used Jensen's inequality, so that $\mathbb{E}[R_1^m] \le \mathbb{E}[R_1^k]^{m/k}$ for all $m \le k$, to bound each $\mathbb{E}\big[\prod_{i=1}^k R_{j_i}\big] \le \mathbb{E}[R_1^k]$.


Exercise 3.27 Let G be a finitely generated group, µ an adapted probability measure on G with finite kth moment, and H ≤ G a subgroup of finite index [G : H] < ∞. Show thatg there exists C = C(k, µ, H) > 0 such that for any x ∈ G, we have f Ex |XTH+ | k ≤ C · (1 + |x|) k . B solution C

3.9.2 HFn Restricted to Finite Index Subgroups Recall that HFk (G, µ) is the space of µ-harmonic functions, with growth bounded by a polynomial of degree at most k. We now investigate the change in HFk as we move from a group to a finite index subgroup. Theorem 3.9.7 shows that µ-harmonic functions on G correspond bijectively to µ H -harmonic functions on H, even in the unbounded case. It is a generalization of Exercise 3.21. Theorem 3.9.7 Let G be a finitely generated group. Let k ≥ 0 and let µ be an adapted probability measure on G with finite kth moment. Let H ≤ G be a subgroup of finite index [G : H] < ∞. Then, the restriction of any f ∈ HFk (G, µ) to H is in HFk (H, µ H ). Conversely, any f˜ ∈ HFk (H, µ H ) is the restriction of a unique f ∈ HFk (G, µ). Thus, the restriction map is a linear bijection (isomorphism) of the vector spaces HFk (G, µ) HFk (H, µ H ).

Proof Step I: Extension Let f˜ ∈ HFk (H, µ H ). Define f : G → C by f (x) := Ex [ f˜(XT )], where T = TH is the hitting time of H and (Xt )t is a µ-random walk on G. Note that this coincides with f˜ for x ∈ H, since T =f 0 a.s. when g starting at X0 = x ∈ H. Also, for x < H we have that f (x) = Ex f˜ XTH+ . We now wish to show that f is well defined (i.e. the expectation Ex | f˜(XT )| is finite), and that f ∈ HFk (G, µ). Note that since [G : H] < ∞ and G is finitely generated, then so is H (Exercise 1.61). Also, with respect to any choices of finite symmetric generating ˜ the corresponding metrics are bi-Lipschitz. Namely, sets G = hSi and H = h Si, there exist C > 1 so that for any x ∈ H we have that C −1 |x|S ≤ |x|S˜ ≤ C · |x|S (see Exercise 1.75). Thus, since XT ∈ H and f˜ ∈ HFk (H, µ H ), we have that ˜ | f (XT )| ≤ C · 1 + |XT | k for some constant C > 0. Also, by Exercise 3.27 we have that Ex |XT | k ≤ C(1 + |x|) k for some constant C > 0 (because µ has finite kth moment). This proves that Ex | f˜(XT )| < ∞, so that f is well defined, and also that | f (x)| ≤ C(1 + |x|) k , for some (perhaps adjusted) constant C > 0 and

https://doi.org/10.1017/9781009128391.005 Published online by Cambridge University Press

3.9 Finite Index Subgroups

107

all x ∈ G. Exercise 3.28 shows that f is indeed µ-harmonic. So we conclude that f ∈ HFk (G, µ). To recap: for every f˜ ∈ HFk (H, µ H ) the extension f (x) := Ex [ f˜(XT )] is a well-defined function f ∈ HFk (G, µ), with f H ≡ f˜. Step II: Restriction To show that this is indeed a unique extension, it suffices to show that if f H ≡ 0 and f ∈ HFk (G, µ) then f ≡ 0 on all of G. Indeed, if f ∈ HFk (G, µ) then ( f (XT ∧t ))t is a martingale. This martingale is uniformly integrable. Indeed, there is a constant C > 0 such that | f (y)| ≤ C|y| k for all 1 , y ∈ G. If we denote M = max1≤t ≤T |Xt |, then we know from Lemma 3.9.5 that we can find i.i.d. nonnegative integer-valued random variables (Rt )t , P with finite kth moment, also independent of T, such that M ≤ |X0 | + Tt=1 Rt . Thus, f g f g Ex | f (XT ∧t )|1 { | f (XT ∧t ) |>K } ≤ C Ex M k 1{C M k >K } . By Jensen’s inequality, as in the solution to Exercise 3.27, one may show that ! X m ∞ k X f g X s k Ex M k ≤ P[T = s] E Rt · |x| k−m m t=1 s=1 m=0 ! ∞ k X X f g k m f m g k−m ≤ P[T = s] s E R1 |x| = E (T R1 + |x|) k m s=1 m=0 f g f g k ≤ |x| · P[R1 = 0] + E T k · E R1k · (1 + |x|) k < ∞. Since 1{C M k >K } → 0 as K → ∞, by dominated convergence we obtain that f g f g lim sup Ex | f (XT ∧t )|1 { | f (XT ∧t ) |>K } ≤ C · lim Ex M k 1{C M k >K } = 0. K→∞ t

K→∞

This proves that ( f (XT ∧t ))t is a uniformly integrable martingale. Thus, we may apply the optional stopping theorem (Theorem 2.3.3) to obtain Ex [ f (XT )] = f (x). If f (x) = 0 for all x ∈ H, then f (XT ) = 0 a.s., and so f ≡ 0 on all of G. This shows that the linear map f 7→ f H is injective on HFk (G, µ). Let G be a finitely generated group. Let k ≥ 0 and let µ be an adapted probability measure on G with finite kth moment. Let H ≤ G be a subgroup of finite index [G : H] < ∞. Show that if f˜ ∈ HFk (H, µ H ) then f (x) = Ex [ f˜(XTH )] defines a µ-harmonic function. B solution C Exercise 3.28

https://doi.org/10.1017/9781009128391.005 Published online by Cambridge University Press

108

Markov Chains

Let G be a finitely generated group and let µ be a probability measure on G. Let H ≤ G be a subgroup such that the hitting measure µ H is well defined; that is, TH+ < ∞ a.s. Show that (G, µ) is recurrent if and only if (H, µ H ) is recurrent. B solution C

Exercise 3.29

For examples of finite index subgroups recall Exercises 1.68, 1.72, and 1.73. Consider the infinite dihedral group D E D∞ = x, y | y 2 , yx yx . Let µ = 13 (δ x + δ x −1 + δ y ) i.e. uniform on the standard generatorsx, x −1, y . Show that H = hxi is a subgroup of index [D∞ : H] = 2 that is isomorphic to H Z. What is µ H in this case? Show that dim HF1 (D∞, µ) ≥ 2. B solution C Exercise 3.30

3.10 Solutions to Exercises Solution to Exercise 3.1 :( Let Ft = σ(X0, . . . , Xt ) . Let Rt = {x ∈ G : P[Xt = x] > 0}. Using Exercise 2.8, we have that a.s. X P[Xt +1 = y | ∀ 0 ≤ j ≤ t, X j = x j ]1 (∀ 0≤ j ≤t , X =x ) P[Xt +1 = y | Ft ] = j

x j ∈R j 0≤ j ≤t

X ν(x0 ) ·

=

x j ∈R j 0≤ j ≤t

X

=

Qt

j

t P(x j−1, x j ) · P(xt , y) Y 1 ( X =x ) Qt j j P(x , x ) j j−1 j=1 j=0

j=1

ν(x0 ) ·

P(xt , y)1{X t =x t } = P[Xt +1 = y | Xt ] = P(Xt , y).

x t ∈R t

This implies that for any bounded function ϕ : G → [0, ∞) , we have a.s. X E[ϕ(Xt +1 ) | Ft ] = P[Xt +1 = y | Ft ]ϕ(y) = Pϕ(Xt ). y

Inductively, we can now show that

E[ϕ(Xt +n ) | Ft ] = E E[ϕ(Xt +n ) | Ft +n−1 ] | Ft = E[Pϕ(Xt +n−1 ) | Ft ] = · · · = P n ϕ(Xt ). When ϕ = δ y we get P n ϕ = P n (·, y) . Now, if A ∈ Ft then f g f g P[Xt +n = y, A, Xt = x] = E P[Xt +n = y | Ft ]1 A 1{X t =x } = E P n (Xt , y)1 A 1{X t =x }

= P n (x, y) · P[A, Xt = x].

https://doi.org/10.1017/9781009128391.005 Published online by Cambridge University Press

:) X

109

3.10 Solutions to Exercises Solution to Exercise 3.5 :( Since (Ut )t are i.i.d.-µ and independent of the starting distribution, we have that for x0, . . . , xt ∈ G , f g P x0 [∀ 0 ≤ j ≤ t, X j = x j ] = P x0 X0 = x0, ∀ 1 ≤ j ≤ t, U j = (x j−1 ) −1 x j

=

t Y

t Y µ (x j−1 ) −1 x j = P(x j−1, x j ).

j=1

j=1

This shows that (Xt )t is a Markov chain with transition matrix P . Assume that P is irreducible. Let x ∈ G . There exists t > 0 such that P[Xt = x] = P t (1, x) > 0. Under P we have that Xt = U1 · · · Ut for i.i.d.-µ elements (Uk ) k . Thus, there must exist u1, . . . , ut ∈ supp (µ) such that u1 · · · ut = x . This holds for any x ∈ G , so supp (µ) generates G . Now, assume that µ is adapted. Let x, y ∈ G . Write x −1 y = u1 · · · ut for some u1, . . . , ut ∈ supp (µ) . Thus,

P t (x, y) = P x [Xt = y] ≥ P x [X0 = x, X1 = xu1, . . . , Xt = xu1 · · · ut ] =

t Y

µ(u j ) > 0.

j=1

Finally, if µ(1) > 0, then P(x, x) ≥ µ(1) > 0 for any x ∈ G , which immediately implies that x is aperiodic. :) X Solution to Exercise 3.7 :( Since {t ≤ k } ∈ {∅, Ω} (the trivial σ -algebra), we get that {t ≤ k } ∈ σ(X0, . . . , Xk ) for any k . So constants are stopping times. For the minimum: [ T ∧ T 0 ≤ t = {T ≤ t } T 0 ≤ t ∈ σ(X0, . . . , Xt ). The maximum is similar:

\ T ∨ T 0 ≤ t = {T ≤ t } T 0 ≤ t ∈ σ(X0, . . . , Xt ). For the addition, t [ T + T0 ≤ t = T = k, T 0 ≤ t − k . k=0

Since {T = k } = {T ≤ k }\{T ≤ k − 1} ∈ σ(X0, . . . , Xk ) , we get that T + T 0 is a stopping time.

:) X

Solution to Exercise 3.9 :( ( ) Let Tx(0) = 0 and Tx(k+1) = inf t > Tx(k ) : Xt = x be the successive times the chain is at x . ( (k ) ) Let A k = Tx < TZ . The strong Markov property tells us that for any k ≥ 1,

P x [V ≥ k + 1] = P x [A k ] = P x [A k | A k−1 ] · P x [A k−1 ] = · · · = (P x [A1 ]) k , and P x [V ≥ 1] = 1 = (P x [A1 ]) 0 trivially. This is exactly a geometric distribution; that is, one obtains that for any k ≥ 1,

P x [V = k] = P x [V ≥ k] − P x [V ≥ k + 1] = (P x [A1 ]) k−1 (1 − P x [A1 ]). Note that this gives the correct mean since 1 − P[A1 ] = P x TZ ≤ Tx+ = P x TZ < Tx+ (as x < Z so + TZ , Tx ). :) X

https://doi.org/10.1017/9781009128391.005 Published online by Cambridge University Press

110

Markov Chains

Solution to Exercise 3.10 :( There exists n such that P n (x, y) > 0 (because P is irreducible). Thus, there is a sequence x = x0, x1, . . . , x n = y such that P(x j , x j+1 ) > 0 for all 0 ≤ j < n. Let m = max{0 ≤ j < n : x j = x }, and let t = n − m and y j := x m+ j for 0 ≤ j ≤ t . Then we have the sequence x = y0, . . . , yt = y so that y j , x for all 0 < j ≤ t , and we know that P(y j , y j+1 ) > 0 for all 0 ≤ j < t . Thus,

P x [Xt = y, t < Tx+ ] ≥ P x [∀ 0 ≤ j ≤ t, X j = y j ] = P(y0, y1 ) · · · P(yt −1, yt ) > 0.

:) X

Solution to Exercise 3.11 :( Note that if y ∈ A then TA+ ≤ Ty+ , so that it suffices to prove that Pν [Ty+ < ∞] = 1 for any y ∈ G and any ν . Also, X f g X Pν [Ty+ < ∞] = ν(x) P x Ty+ < ∞ = ν(x) = 1 x

x

:) X

when the chain is recurrent. Solution to Exercise 3.12 :( Q is easily seen to be a stochastic matrix since

X

Q(x, y) =

y

∞ X

1 ek!

X

P k (x, y) = 1.

y

k=0

Also, since Q(x, y) ≥ e−1 P(x, y) for all x, y ∈ G , we obtain inductively that Q t (x, y) ≥ e−t P t (x, y) for all x, y ∈ G and all t , implying that Q is irreducible (because P P is). Let (Λt )t∞=1 be i.i.d. Poisson of mean 1 and let St = tj=1 Λ j (with S0 = 0). If (Xt )t is the Markov chain with transition matrix P , independent of (Λt )t , then one verifies that (Yt = XS t )t is a Markov chain with transition matrix Q . Indeed, f g P[Yt +1 = y, Yt = x] = P XS t +Λt +1 = y, XS t = x

=

∞ X

P[St = s] · P[Λt +1 = k] · P[Xs+k = y, Xs = x]

s, k=0

=

∞ X

P[St = s] ·

s=0

∞ X

k 1 ek! P (x,

y) = Q(x, y).

k=0

For every t define Lt = {k : Sk = t }. Note that if k, k + n ∈ Lt then Sk+n = Sk , so Λ k+1 = · · · = Λ k+n = 0. Also,

{Lt , ∅, min Lt = k } ∈ σ(Λ1, . . . , Λ k ). Thus, for any ` ≥ 1,

P[ |Lt | ≥ `] = ≤ =

∞ X k=0 ∞ X k=0 ∞ X

P[ |Lt | ≥ `, min Lt = k] P[Lt , ∅, min Lt = k, ∀ 1 ≤ j ≤ ` − 1, Λ k+ j = 0] P[Lt , ∅, min Lt = k] · e1−` ≤ e1−` .

k=0

This implies that

E[|Lt |] =

∞ X `=0

P[ |Lt | > `] ≤

∞ X

e−` < 2.

`=0

https://doi.org/10.1017/9781009128391.005 Published online by Cambridge University Press

111

3.10 Solutions to Exercises Now, we may bound the number of visits of (Yt )t to a specific site x by ∞ X

1 {Y

k =x }

≤

∞ X

1{X t =x } · |Lt |.

t =0

k=0

Hence ∞ X

P[Yt = x] ≤

∞ X

P[Xt = x] · E |Lt | ≤ 2 · R

Also, note that since St is Poisson of mean t , and since ∞ X

P[Xt = x].

t =0

t =0

t =0

∞ X

P[St = n] =

t =0

∞ X

∞ 0

e−t

t =0

e−t t n dt = n!, we have tn n!

≥ c,

for some constant c > 0 independent of n. Summing over n we have

2

∞ X

P[Xt = x] ≥

t =0

∞ X

P[Yt = x] =

t =0

∞ X ∞ X

P[St = n] · P[Xn = x] ≥ c ·

t =0 n=0

∞ X

P[Xn = x],

n=0

where we have used the independence of Xt , Lt . Thus Q is transient if and only if P is transient.

:) X

Solution to Exercise 3.13 :( Let (Ut )t∞=1 be i.i.d. random variable, each uniform on [0, 1]. Define a Markov chain (Yt , Nt ) on G × N as follows. Given Yt = x, Nt = n, let ( (x, n) if Ut +1 ≤ p x , (Yt +1, Nt +1 ) = (y, n + 1) if Ut > p x , with probability P(x, y). One checks easily that (Yt )t is a Markov chain with transition matrix Q . Also, note that Nt ≤ Nt +1 ≤ Nt + 1 for all t , and

P[Nt +1 = Nt + 1 | Yt = x] = 1 − p x > 0. So we may define

Tn = inf {t ≥ 0 : Nt = n}, with T0 = 0 and Tn < ∞ a.s. Set Xn = YTn . One then can also easily check that (Xn ) n is a Markov chain with transition matrix P . Moreover, this coupling gives that if Xn = x for infinitely many n then Yt = x for infinitely many t . Conversely, note that for any Tn ≤ t < Tn+1 , we have that YTn = Yt . So if Yt = x for infinitely many t and Xn = x for only finitely many n, it must be that Tn+1 − Tn = ∞ for some n, which happens with probability 0. We conclude that (Xn ) n is recurrent if and only if (Yt )t is recurrent. :) X Solution to Exercise 3.14 :( If π (x)P(x, y) = π(y)P(y, x) for all x, y ∈ G , then X X π P(y) = π (x)P(x, y) = π(y)P(y, x) = π(y). x

x

For the example, consider 1

4 P = 12 1 4

1 4 1 4 1 4

1 2 1 , 4 1 4

which is easily seen to have the uniform distribution on three states, π (x) = 13 , as a stationary distribution. However, if π(x)P(x, y) = π(y)P(y, x) , in this case P(x, y) = P(y, x) , which is easily seen to be false. :) X

https://doi.org/10.1017/9781009128391.005 Published online by Cambridge University Press

112

Markov Chains

Solution to Exercise 3.15 :( g f Let v(z) = E x VT + (z) and u(z) = E y VT + (z) . Since x, y are positive recurrent we know that P has a x

y

unique stationary distribution given by π (z) =

1 . E z [Tz+ ]

Also, since x, y are recurrent we know that π (x)v and

π(y)u are stationary distributions, as in Example 3.7.8. But then π(x)v(y) = π(y)u(y) = π(y)

and

π(x) = π(x)v(x) = π(y)u(x),

so that

v(y)u(x) =

π (y) π(x) · = 1. π(x) π (y)

:) X

Solution to Exercise 3.16 :( P If x ∈ A, then TA+ ≤ Tx+ . Also, Eν Tx+ = y ν(y) E y Tx+ . So it suffices to show that E y Tx+ < ∞ for all x, y ∈ G . If y = x this is immediate from Theorem 3.7.5. For x , y ∈ G , by Exercise 3.10 there exists t > 0 such that P x Xt = y, Tx+ > t > 0. The Markov property and positive recurrence give that f g f g f g ∞ > E x Tx+ ≥ P x Xt = y, Tx+ > t · E y Tx+ + t ,

so E y Tx+ < ∞.

:) X

Solution to Exercise 3.18 :( (1) ⇒ (2), (4) ⇒ (1) are trivial. (2) ⇒ (3) follows from the fact that given x, y there exists k such that P k (y, x) > 0, so as t → ∞,

P t (x, y) ≤

P t +k (x, x) → 0. P k (y, x)

(3) ⇒ (4) : Let z, w be such that P t (z, w) → 0. Then, for any x, y there exist k, n such that P k (z, x) > 0 and P n (y, w) > 0. Thus, P t (x, y) ≤

P t +k+n (z, w) → 0. x)P n (y, w)

P k (z,

:) X

Solution to Exercise 3.19 :( By Exercise 3.18 it suffices to show that P t (x, x) → 0 for some x . Fix x . Let n = gcd{t > 0 : P t (x, x) > 0}. Let Q = P n . Note that {t > 0 : P t (x, x) > 0} ⊂ {nt : t > 0}, so lim supt →∞ P t (x, x) ≤ lim supt →∞ Q t (x, x) . Set Y = {y : ∃ t ≥ 0, Q t (x, y) > 0}. Then Q restricted to Y is an irreducible Markov chain. Moreover, by definition of gcd, we know that

gcd{t > 0 : Q t (x, x) > 0} = gcd{t > 0 : P nt (x, x) > 0}. Moreover, the definition of gcd implies that

gcd{t > 0 : Q t (x, x) > 0} = gcd{t > 0 : P nt (x, x) > 0} = 1. So Q restricted to Y is aperiodic and irreducible, and thus Q t (x, x) → 0 by Theorem 3.8.1.

:) X

Solution to Exercise 3.20 :( Consider the projected random walk (H Xt )t . Let H x ∈ G/H . Because this chain is irreducible, there exists k > 0 such that P k−1 (H x, H ) > 0. Thus, f g + P x TH < k ≥ P x [Xk−1 ∈ H] ≥ P H x [H Xk−1 = H] = P k−1 (H x, H ). Since f there are g only finitely many different cosets, we may find k and α > 0 such that for any x ∈ G , we have + < k ≥ α. P x TH

https://doi.org/10.1017/9781009128391.005 Published online by Cambridge University Press

113

3.10 Solutions to Exercises Now, for any t > k , using the Markov property, f g f g f g f g + + + + P x TH > t ≤ P x TH > t − k · sup P y TH ≥ k ≤ P x TH > t − k · (1 − α) y

≤ · · · ≤ (1 − α) bt /kc .

:) X

Solution to Exercise 3.21 :( + . We have seen that T + < ∞ a.s. If f ∈ BHF (G, µ) then ( f (X )) is a bounded martingale. Thus, Let T = TH t t H by the optional stopping theorem (Theorem 2.3.3), f (x) = E x [ f (XT )] for all x ∈ G . Note that for any x, y ∈ H , since x −1 H = H and x −1 y ∈ H , we have that

P x [XT = y] = =

∞ X t =1 ∞ X

P x [X0 = x, Xt = y, ∀ 1 ≤ j < t, X j < H] f g P X0 = 1, Xt = x −1 y, ∀ 1 ≤ j < t, X j < H

t =1

f g = P XT = x −1 y . Thus, for x ∈ H ,

f (x) = E x [ f (XT )] =

X

P x [XT = y] f (y) =

y

=

X y

P[XT = y] f (xy) =

X f g P XT = x −1 y f (y) y

X

µ H (y) f (xy).

y

So f H ∈ BHF (H, µ H ) .

:) X

Solution to Exercise 3.22 :( Consider the homomorphism ϕg: Z2 → {0, 1}2 given by ϕ(x, y) = (x (mod 2), y (mod 2)) . We have H = f Kerϕ , so H C Z2 with Z2 : H = 4. We have that µ H is not finitely supported since for any (2x, 2y) ∈ H , we can find a path in the Cayley graph of Z2 from (0, 0) to (2x, 2y) that avoids H until reaching (2x, 2y) . For example, if x > 0, y > 0, then we take the path

γ = ((0, 0), (1, 0), (1, 1), (1, 2), . . . , (1, 2y − 1), (2, 2y − 1), . . . , (2x, 2y − 1), (2x, 2y)). Similarly in the other cases. Thus, µ H (2x, 2y) > 0 for all (2x, 2y) ∈ H .

:) X

Solution to Exercise 3.23 :( Let (Xt )t denote the µ -random walk, and let Ut = Xt−1 −1 Xt be the independent “jumps” of the walk. For any x ∈ H we can write x = s1 · · · sn for s1, . . . , sn ∈ supp (µ) because µ is adapted. Define x0 = 1 and x j = s1 · · · x j . Let J = {0 ≤ j ≤ n : x j ∈ H }. Write J = {0 = j0 < j1 < · · · < jk = n}. For 1 ≤ ` ≤ k set −1 u` = x j `−1 x j ` = s j `−1 +1 · · · s j ` . So u1 = x j1 and x = u1 · · · uk . For any 1 ≤ ` ≤ k it is simple to see inductively that u` ∈ H . Moreover, since x j `−1 +1, . . . , x j ` −1 < H , we have that

f g P XT + = u` ≥ P U1 = s j `−1 +1, . . . , U j ` − j `−1 = s j ` = H

j` Y − j `−1

µ(s j i ) > 0.

i=1

Thus, u` ∈ supp (µ H ) for all ` . As x = u1 · · · uk , this shows that H is generated by supp (µ H ) .

https://doi.org/10.1017/9781009128391.005 Published online by Cambridge University Press

114

Markov Chains

Now, if µ is symmetric, then for any n, (Xt )tn=1 , and (Xt−1 )tn=1 have the same distribution. Thus, for any x ∈ H, ∞ X P XT + = x = P[Xt = x, ∀ 1 ≤ j ≤ t − 1, X j < H] H

t =1 ∞ X f g = P Xt = x −1, ∀ 1 ≤ j ≤ t − 1, X j < H = P XT + = x −1 . H

t =1

:) X

Solution to Exercise 3.24 :( Since µ has finite k th moment, it follows that as long as µ gi−1 H g j > 0, we have

" −1 g f · E |X1 | k 1 ( X E |S(1, i, j) | k = µ gi−1 H g j Since R1k ≤

P

i, j ≤n

# −1 1 ∈g i H g j

)

−1 g f · E |X1 | k < ∞. ≤ µ gi−1 H g j

f g f g |S(1, i, j) | k , we have E R1k ≤ n2 · E |X1 | k < ∞.

:) X

Solution to Exercise 3.25 :( This is by induction on t . For t = 0 this is by definition, since I0 = 1 and H g1 = H g and Y0 = X0 = g. For t > 0, note that S(t, i, j) ∈ gi−1 H g j if µ gi−1 H g j > 0, and S(t, i, j) = 1 otherwise. Also, by H g I t > 0, hence, the induction hypothesis, Yt −1 ∈ H g I t −1 . By definition of (It )t , it must be that µ g−1 I t −1

−1 S(t, It −1, It ) ∈ g−1 I t −1 H g I t = Yt −1 H g I t . This implies that Yt = Yt −1 S(t, It −1, It ) ∈ H g I t , completing the induction. :) X

Solution to Exercise 3.26 :( Note that It = j for 1 ≤ j ≤ n such that Yt ∈ H g j . Let x, y ∈ G and let 1 ≤ i, j ≤ n be such that x ∈ H gi and y ∈ Hg j . Recall that S(t, i, j) is independent of (It )t . For any t , since It , It −1 are measurable with respect to Yt , Yt −1 ,

P[Yt +1 = y | Yt = x, Yt −1, . . . , Y0 ] = P[S(t + 1, It , It +1 ) = x −1 y | It , It −1, Yt = x, Yt −1, . . . , Y0 ] = P[It +1 = j | It , It −1, Yt = x, Yt −1, . . . , Y0 ] · P[S(t + 1, i, j) = x −1 y] µ x −1 y = µ x −1 y . = µ gi−1 H g j −1 µ gi H g j

:) X

Solution to Exercise 3.27 :( Fix x ∈ G . As in the proof of Corollary 3.9.6 we know that

k T X f g E x |XT | k ≤ E |x | + Rt , t =1 f g f g + are independent, positive, and integer-valued, and E R k = E R k < ∞. where (Rt )t , T = TH t 1 Similarly to the proof of Corollary 3.9.6, this last expectation can be bounded as follows. First, for any m, using Jensen’s inequality, we have the bound m X s E Rt ≤ t =1

Y m E R j i ≤ ~ j ∈{1, ..., s} m i=1 X

X

E[R1m ] = s m E[R1m ].

~ j ∈{1, ..., s} m

https://doi.org/10.1017/9781009128391.005 Published online by Cambridge University Press

3.10 Solutions to Exercises

115

! X m ∞ k X f g X s k E x |XT | k ≤ Rt · |x | k−m P[T = s] E m t =1 s=1 m=0 ! k ∞ X X g f k m f mg s E R1 |x | k−m = E (T R1 + |x |) k P[T = s] ≤ m m=0 s=1 f g f g ≤ |x | k P[R1 = 0] + E T k · E R1k · (1 + |x |) k .

:) X

Then,

Solution to Exercise 3.28 :(

+ , so that f (x) = E First, note that if x < H then P x -a.s. we have TH = TH for x < H . x f˜ XT + H

˜ ˜ On the other hand, when x ∈ H , we have, using the fact that f is µ H harmonic, f (x) = f (x) = E x f˜ XT + . H

+ ≥ 1 and the Markov property, we have Finally, using TH X f (x) = E x f˜ XT + = µ(y) E x f˜ XT + | X1 = xy H

=

X

µ(y) E x y

H

y

f g X f˜ XTH = µ(y) f (xy).

y

:) X

y

Solution to Exercise 3.29 :( Let (Xt )t denote the µ -random walk. Let τ0 = 0 and inductively,

τn+1 = inf {t > τn : Xt ∈ H }. So (Yn := Xτ n ) n is a µ H -random walk on H . If (H, µ H ) is recurrent, then Yn = 1 infinitely many times a.s. So Xt = 1 infinitely many times as well, since (Yn ) n is a subsequence of (Xt )t . Conversely, if Xt = 1 infinitely many times a.s., then since 1 ∈ H , we have that

{t : Xt = 1} ⊂ {τn : n ≥ 0}, implying that Yn = 1 infinitely many times as well.

:) X

Solution to Exercise 3.30 :( It is easy to see that any element in D∞ can be uniquely written as y ε x z for ε ∈ {0, 1} and z ∈ Z. This shows that H Z and [D∞ : H] = 2. Let (Xt )t be the µ -random walk. Starting from the origin, note that with probability 32 we have X1 ∈ ( ) x, x −1 . With the remaining probability 31 we have X1 = y . Now, given that Xt = yx z for some z ∈ Z, we have that with probability 31 we get Xt +1 = Xt y = x −z ∈ H . ( ) The other probability of 32 is split over two events, {Xt +1 = Xt x } and Xt +1 = Xt x −1 , each with probability 1 3,

and in both these events we have Xt +1 < H . This shows that for any t > 0,

f g + + P TH > t + 1 | TH > t = 23 . Thus, inductively,

f g f g f g t −1 f g t −1 + + + + + 1 P TH > 1 = 32 P TH > t = P TH > t | TH > t − 1 · P TH > t − 1 = · · · = 23 3. f g + =1 = Also, P TH

2 3. f g + − 1 is just the process that moves by One may also note that conditioned on X1 = y , we have that X 1, TH

x, x −1 with probability

1 2

each. (In the last step, one must have the final jump with y into H .)

https://doi.org/10.1017/9781009128391.005 Published online by Cambridge University Press

116

Markov Chains

Thus, we may describe the distribution of µ H as follows. Let Bt ∼ Bin(t, 12 ) . Note that Zt = 2Bt − t has the distribution of a simple random walk on Z of t steps. Then,

µ H (x z ) =

∞ ∞ X X f g f g + + 1 2 2 t −2 P TH = t · P Xt = x z | TH = t = 31 1{z=1} + 13 1{z=−1} + · P[Zt −2 = −z]. 3 3 t =1

t =2

Finally, since the restriction map from D∞ to H is a linear isomorphism of HF1 (D∞, µ) and HF1 (H, µ H ) , it suffices to show that the latter has dimension at least 2. This has already been shown in a previous exercise, but for completeness, recall that the maps x z 7→ αz + β are µ H -harmonic (actually for any symmetric measure). :) X

https://doi.org/10.1017/9781009128391.005 Published online by Cambridge University Press

4 Networks and Discrete Analysis

117

https://doi.org/10.1017/9781009128391.006 Published online by Cambridge University Press

118

Networks and Discrete Analysis

The theory of random walks is deeply connected to the theory of electrical networks. In this chapter, we will go through the basics of the theory. The main application at the end of the chapter is that transience or recurrence is, in essence, a group property. More precisely, these properties remain stable when changing between different symmetric, adapted measures with finite second moment.

4.1 Networks Definition 4.1.1 A network is a pair (G, c) where G is a countable set and c : G× G → [0, ∞) is a nonnegative function, called the conductance, satisfying:

• c(x, y) = c(y, x) for all x, y. P • 0 < cx := y c(x, y) < ∞ for all x. We write x ∼ y if c(x, y) > 0. Any network comes with an associated Markov chain; specifically, the Markov chain on G with transition matrix P(x, y) := c(x,y) c x . We call this Markov chain the random walk on (G, c). Example 4.1.2 Let µ be a symmetric probability measure on a finitely generated group G. Define c(x, y) = µ x −1 y . Because µ is symmetric this defines a symmetric function c. Also, X X cx = c(x, y) = µ x −1 y = 1. y

y

So c is a conductance. One may check that the associated Markov chain is exactly the µ-random walk on G. Thus, networks are a generalization of symmetric random walks on groups. µ we write (G, µ) to denote the network (G, c) with c(x, y) = For a symmetric µ x −1 y . (This is the reason we use G instead of N; our main example will be networks arising from symmetric probability measures on groups.) 454 Throughout, given some network (G, c), we will use (Xt )t to denote the random walk on (G, c). As in the case of random walks on groups, Px, Ex will denote probability measure and expectation on GN conditioned on X0 = x. Similarly, Pν, Eν are conditioned on X0 having law ν. P will denote the transition matrix of the Markov chain; that is, P(x, y) = c(x,y) cx .

https://doi.org/10.1017/9781009128391.006 Published online by Cambridge University Press

119

4.2 Gradient and Divergence

Definition 4.1.3 A network (G, c) is called connected if the associated transition matrix P(x, y) = c(x,y) c x is irreducible.

Exercise 4.1 Let (G, c) be a connected network. Let Z ⊂ G be a subset such that G\Z is finite. Recall TZ = inf{t ≥ 0 : Xt ∈ Z } where (Xt )t is the random walk on (G, c). Show that TZ has an exponential tail; that is, there exists K, δ > 0 such that for all t and all x ∈ G,

Px [TZ > t] ≤ K e−δt .

B solution C

From now on, unless otherwise specified, by network we will always refer to a connected network.

4.2 Gradient and Divergence Let (G, c) be a network. Let E = E(G, c) := {(x, y) ∈ G × G : c(x, y) > 0}. We write x ∼ y to denote the fact that (x, y) ∈ E(G, c) (this implicitly depends on the specific conductance c). We consider two spaces of functions. For f : G → C and F : E → C, define X X 2 1 || f ||c2 := cx | f (x)| 2, ||F ||c2 := c(x,y) |F (x, y)| , x

(x,y) ∈E

and let ` 2 (G, c) = { f : G → C : || f ||c < ∞}, ` 2 (E(G, c)) = {F : E(G, c) → C : ||F ||c < ∞}.

Definition 4.2.1

Define

f , f 0 ∈ ` 2 (G, c),

f, f 0

c

:=

X

cx f (x) f 0 (x),

x

and F, F 0 ∈ ` 2 (E(G, c)),

F, F 0

c

:=

X

1 c(x,y) F (x,

y)F 0 (x, y).

(x,y) ∈E (G,c)

The next exercise shows the classical fact that ` 2 (G, c) and ` 2 (E(G, c)) are Hilbert spaces with inner product given by the above forms, respectively.

https://doi.org/10.1017/9781009128391.006 Published online by Cambridge University Press

120

Networks and Discrete Analysis

Exercise 4.2 Let N be a countable set. Let m : N → (0, ∞) be a positive function. P Show that for f , g : N → C, the form h f , gi := x ∈N m(x) f (x)g(x) defines an inner product structure when restricted to

f , g ∈ ` 2 (N, m) = { f : G → C : h f , f i < ∞}. Show that, in fact, ` 2 (N, m) with this inner product is a Hilbert space.

B solution C

Note that if µ is a symmetric probability measure and c(x, y) = µ x −1 y , then cx = 1 for all x. So actually ` 2 (G, µ) = ` 2 (G) = { f : G → ) P C : x | f (x)| 2 < ∞ .

Remark 4.2.2

Let (G, c) be a network. For functions f : G → C and F : E(G, c) → C define two operators: X 1 ∇ f (x, y) := c(x, y)( f (x) − f (y)), div F (x) := c x (F (x, y) − F (y, x)).

Definition 4.2.3

y∼x

Show that ∇ : ` 2 (G, c) → ` 2 (E(G, c)) and div : ` 2 (E(G, c)) → ` 2 (G, c), and that these are bounded linear operators. B solution C Exercise 4.3

The following is an extremely useful identity, sometimes called integration by parts, or Green’s identity. Let (G, c) be a network. For any f ∈ ` 2 (G, c) and F ∈ we have

Theorem 4.2.4

` 2 (E(G, c)),

h f , div Fic = h∇ f , Fic . That is, the operators ∇, div are adjoints of each other. Proof Compute: X X cx f (x) h f , div Fic = x

=

1 cx

(F (x, y) − F (y, x))

y∼x

X

f (x)F (x, y) −

(x,y) ∈E (G,c)

=

X

X

f (x)F (y, x)

(x,y) ∈E (G,c) 1 c(x,y) c(x,

y)( f (x) − f (y))F (x, y) = h∇ f , Fic .

(x,y) ∈E (G,c)

https://doi.org/10.1017/9781009128391.006 Published online by Cambridge University Press

4.3 Laplacian

121

4.3 Laplacian Recall that a network (G, c) comes with an associated Markov chain with transition matrix P(x, y) = c(x,y) c x . The Laplace operator, or Laplacian, is defined to be ∆ = I − P where I is the identity operator. Exercise 4.4

that ∆ =

Show that ∆ is a self-adjoint bounded operator on ` 2 (G, c), and

1 2 div∇.

B solution C

Recall that a function is called harmonic at x if ∆ f (x) = 0, and harmonic if ∆ f ≡ 0. Define a bilinear form on ` 2 (G, c) by X E ( f , g) := h∇ f , ∇gic = c(x, y)( f (x) − f (y))(g(x) − g(y)). x,y

Prove that E is a nonnegative definite bilinear form. That is, show that for f , g ∈ ` 2 (G, c) and α ∈ C,

Exercise 4.5

• E (α f + g, h) = αE ( f , h) + E (g, h), • E ( f , g) = E (g, f ), • E ( f , f ) ≥ 0. Show that E ( f , f ) = 0 if and only if f is constant. Integration by parts shows that E ( f , g) = 2 h∆ f , gic for f , g ∈ ` 2 (G, c). The quantity X E ( f , f ) = 2 h∆ f , f ic = c(x, y)| f (x) − f (y)| 2 x,y

is usually called the Dirichlet energy of f . Let (G, c) be a connected network. Show that if f ∈ ` 2 (G, c) and f is harmonic, then f is constant. B solution C

Exercise 4.6

4.4 Path Integrals Let (G, c) be a network. A finite path in (G, c) is a finite sequence γ = (γ0, γ1, . . . , γn ) of elements of G satisfying c(γ j , γ j+1 ) > 0 for all 0 ≤ j < n. If γ0 = x, γn = y, we write γ : x → y. The length of a path γ = (γ0, γ1, . . . , γn ) is |γ| = n. If γ : x → x (i.e. γ0 = γ |γ | ), we say that γ is a cycle.

https://doi.org/10.1017/9781009128391.006 Published online by Cambridge University Press

122

Networks and Discrete Analysis

Accordingly, an infinite path in (G, c) is an infinite sequence (γ0, γ1, . . .) such that c(γ j , γ j+1 ) > 0 for all j ≥ 0. The length of an infinite path is |γ| = ∞. A path in (G, c) is either an infinite path or a finite path. If γ = (γ0, γ1, . . . , γn ) is a finite path, the reversal of γ is defined to be the path γˇ := (γn, γn−1, . . . , γ1, γ0 ). We also define the reversal a function F : E(G, c) → C to be Fˇ (x, y) = F (y, x). Show that div Fˇ = −div F. Show that || Fˇ ||c = ||F ||c .

Exercise 4.7

If F = Fˇ we say that F is symmetric. If F = −Fˇ we say that F is antisymmetric. Exercise 4.8 Show that (G, c) is connected if and only if for all x, y ∈ G there exists a finite path γ : x → y. B solution C

For a finite path γ in a network (G, c) and a function F ∈ ` 2 (E(G, c)), define I |γ |−1 X 1 F (γ j , γ j+1 ) · c(γ j ,γ F := . j+1 ) γ

Exercise 4.9

j=0

Let F ∈ ` 2 (E(G, c)) and let γ be a finite path in (G, c). Show

that: H H ˇ • γˇ F = γ F. • If F = ∇ f then

H

Proposition 4.4.1

Suppose that (G, c) is a connected network. If ∇ f = ∇g for then there exists a constant η ∈ C such that f ≡ g + η.

some f , g ∈

γ

F = f (γ0 ) − f (γ |γ | ).

B solution C

` 2 (G, c)

Proof Let x, y ∈ G and let γ : x → y (which is possible since (G, c) is connected). Then, I I f (x) − f (y) = ∇f = ∇g = g(x) − g(y). γ

γ

Thus, f (x) − g(x) = f (y) − g(y) =: η for all x, y.

A function F ∈ ` 2 (E(G, c)) His said to respect Kirchhoff’s cycle law if for any cycle γ : x → x we have γ F = 0 (a cycle is just a path starting and ending at the same point). For example, if F = ∇ f then by the above it respects Kirchhoff’s cycle law. As we shall see, this is the only example.

https://doi.org/10.1017/9781009128391.006 Published online by Cambridge University Press

123

4.5 Voltage and Current

Let (G, c) be a connected network. Then F ∈ ` 2 (E(G, c)) respects Kirchhoff’s cycle law if and only if there exists f ∈ ` 2 (G, c) such that F = ∇f.

Proposition 4.4.2

Proof First note that F = −Fˇ since for any x, y with c(x, y) > 0, we may consider the cycle (x, y, x) to get I 0= F = c(x, y)(F (x, y) + F (y, x)). (x,y,x)

If $\alpha : x \to y$ and $\beta : x \to y$ are two paths, then we may concatenate them to get a path $\gamma = \alpha \check\beta = (\alpha_0, \ldots, \alpha_{|\alpha|}, \beta_{|\beta|-1}, \ldots, \beta_0)$. Note that $\gamma : x \to x$, so
$$\oint_\alpha F - \oint_\beta F = \oint_\alpha F + \oint_{\check\beta} F = \oint_\gamma F = 0.$$
Thus, for any $\alpha : x \to y$, the quantity $\oint_\alpha F$ does not depend on $\alpha$, but only on $x, y$.

Now, fix some $o \in G$ and for any $x$ define $f(x) := \oint_\alpha F$ for some path $\alpha : x \to o$ (so $f(o) = 0$). This is well defined by the above. It is immediate that if $\gamma : y \to o$ and $\gamma_1 = x$, then denoting $\gamma' = (\gamma_1, \ldots, \gamma_{|\gamma|})$ we have $\gamma' : x \to o$ and
$$f(x) - f(y) = \oint_{\gamma'} F - \oint_\gamma F = -\tfrac{1}{c(x,y)} F(y,x) = \tfrac{1}{c(x,y)} F(x,y).$$
That is, $F = \nabla f$. $\square$

4.5 Voltage and Current

Definition 4.5.1 Let $(G,c)$ be a network. Let $A, Z$ be disjoint subsets of $G$. A flow on $(G,c)$ from $A$ to $Z$ is a function $F : E(G,c) \to \mathbb{R}$ satisfying $F = -\check F$ (i.e. $F$ is anti-symmetric) and $\operatorname{div} F(x) = 0$ for all $x \notin A \cup Z$.

A function having $0$ divergence is sometimes said to satisfy Kirchhoff's node law. If $A = \{a\}$ we say that $F$ is a flow from $a$ to $Z$. If $Z = \emptyset$ we say that $F$ is a flow from $A$ to infinity. If $A = \{a\}$ and $\operatorname{div} F(a) = 1$ we say that $F$ is a unit flow from $a$ to $Z$ (or, respectively, from $a$ to infinity in the case that $Z = \emptyset$).

Example 4.5.2 Let $(G,c)$ be a network and let $A, Z$ be disjoint subsets of $G$. Let $v : G \to \mathbb{R}$ be a function. Assume that $\Delta v(x) = 0$ for all $x \notin A \cup Z$ (that is, $v$ is harmonic off $A \cup Z$). Let $I = \nabla v$. Then $I$ is anti-symmetric, and for any $x \notin A \cup Z$ we have $\operatorname{div} I(x) = 2\Delta v(x) = 0$. So $I$ is a flow from $A$ to $Z$.

This example is a central one.


Networks and Discrete Analysis

Definition 4.5.3 Let $(G,c)$ be a network. Let $A, Z$ be disjoint subsets of $G$. A voltage from $A$ to $Z$ is a function $v : G \to \mathbb{R}$ that is harmonic outside $A \cup Z$; that is, $\Delta v(x) = 0$ for any $x \notin A \cup Z$. If, in addition, $v(a) = 1$ for all $a \in A$ and $v(z) = 0$ for all $z \in Z$, we say that $v$ is a unit voltage from $A$ to $Z$. If $v$ is a voltage, the function $I = \nabla v$ is called the current (from $A$ to $Z$) induced by $v$.

Exercise 4.10 Show that any current $I$ is a flow that satisfies Kirchhoff's cycle law. Show that if $F$ is a flow from $A$ to $Z$ satisfying Kirchhoff's cycle law, then $F$ is a current. B solution C

Theorem 4.5.4 (Maximum principle) Let $A, Z$ be disjoint subsets of a connected network $(G,c)$ with $G \setminus (A \cup Z)$ finite. Let $v$ be a voltage from $A$ to $Z$. If $\sup_x v(x) = v(z) < \infty$ for some $z \in G$, then there exists $y \in A \cup Z$ such that $v(y) = \sup_x v(x)$. That is, the maximal value of a voltage (if it exists) is always attained on the "boundary" $A \cup Z$.

Proof Let $v$ be any voltage from $A$ to $Z$. Let $T = T_{A \cup Z} = \inf\{t : X_t \in A \cup Z\}$, where $(X_t)_t$ is the random walk on $(G,c)$. Now, if $\sup_x v(x) < \infty$, then $(v(X_{T \wedge t}))_t$ is a martingale, bounded from above. By the martingale convergence theorem (Theorem 2.6.3), $v(X_{T \wedge t}) \to M_\infty$ a.s. as $t \to \infty$, for some integrable random variable $M_\infty$. However, $v(X_{T \wedge t}) \to v(X_T)$ a.s. as well. So $v(X_T)$ is integrable.

Using that $G \setminus (A \cup Z)$ is finite, define $b = \max\{|v(x)| : x \notin A \cup Z\}$. Then the event $\{|v(X_{T \wedge t})| > b\}$ implies the event $\{T \le t\}$. Thus, for $K > b$ we have
$$\mathbb{E}_x\big[|v(X_{T \wedge t})| \mathbf{1}_{\{|v(X_{T \wedge t})| > K\}}\big] \le \mathbb{E}_x\big[|v(X_T)| \mathbf{1}_{\{|v(X_T)| > K\}}\big] \le \mathbb{E}_x |v(X_T)| < \infty.$$
Taking a supremum over $t$ and then a limit $K \to \infty$, we see that $(v(X_{T \wedge t}))_t$ is uniformly integrable. Since $G \setminus (A \cup Z)$ is finite and $(G,c)$ is connected, we know that $T < \infty$ a.s. (In fact, we have seen that $T$ has an exponential tail.) The optional stopping theorem (Theorem 2.3.3) now gives that $v(x) = \mathbb{E}_x[v(X_T)]$ for any $x \in G$. Hence, if $v(z) = \sup_x v(x)$ then it must be that there is some $y \in A \cup Z$ with $\mathbb{P}_z[X_T = y] > 0$ such that $v(y) \ge v(z)$. $\square$


Exercise 4.11 Let $(G,c)$ be a connected network. Let $v$ be a voltage from $A$ to $Z$. (We do not necessarily assume that $G \setminus (A \cup Z)$ is finite.) Assume that $T = T_{A \cup Z} < \infty$, $\mathbb{P}_x$-a.s., for all $x$. Assume further that $v(X_T)$ is integrable under $\mathbb{P}_x$ for all $x$. Show that if $\inf_x v(x) = v(z) > -\infty$ for some $z \in G$, then there exists $y \in A \cup Z$ such that $v(y) = \inf_x v(x)$. B solution C

Exercise 4.12 Let $A, Z$ be disjoint subsets of a connected network $(G,c)$ with $G \setminus (A \cup Z)$ finite. Show that the only voltage from $A$ to $Z$ that is $0$ on $A \cup Z$ is the identically $0$ function. Show that if two voltages from $A$ to $Z$ have the same values on $A \cup Z$, then they are equal. Conclude that there is a unique unit voltage from $A$ to $Z$. B solution C

If $(G,c)$ is a network, we can consider the function $r(x,y) = \frac{1}{c(x,y)}$; $r$ is called the resistance. Note that if $v$ is a voltage and $I$ the induced current, then $v(x) - v(y) = I(x,y) \, r(x,y)$, which is reminiscent of Ohm's law from classical physics.

4.6 Effective Conductance

Definition 4.6.1 Let $(G,c)$ be a network. Let $a \notin Z$, and assume $G \setminus Z$ is finite. Let $v$ be the unit voltage from $a$ to $Z$, with induced current $I = \nabla v$. The effective conductance between $a$ and $Z$ is defined to be
$$C_{\mathrm{eff}}(a,Z) = \frac{c_a}{2} \operatorname{div} I(a) = \sum_x I(a,x) = \sum_x c(a,x)\big(v(a) - v(x)\big) = c_a \Delta v(a).$$
The effective resistance is defined to be $R_{\mathrm{eff}}(a,Z) = \frac{1}{C_{\mathrm{eff}}(a,Z)}$.

Exercise 4.13 Let $(G,c)$ be a network. Let $a \notin Z$, and assume $G \setminus Z$ is finite. Let $v$ be a voltage from $a$ to $Z$ such that $v|_Z \equiv 0$ and $v(a) \ne 0$. Show that $C_{\mathrm{eff}}(a,Z) = \frac{c_a}{v(a)} \Delta v(a)$. B solution C

Exercise 4.14 Let $(G,c)$ be a network. Let $a \notin Z$, and assume $G \setminus Z$ is finite. Let $v$ be a unit voltage from $a$ to $Z$. Show that $2 C_{\mathrm{eff}}(a,Z) = \mathcal{E}(v,v)$. B solution C


Exercise 4.15 Let $(G,c)$ be a network. Let $a \notin Z$. Assume that $G \setminus Z$ is finite. Show that $C_{\mathrm{eff}}(a,Z) = c_a \, \mathbb{P}_a\big[T_Z < T_a^+\big]$. B solution C

Theorem 4.6.2 Let $(G,c)$ be a network. Let $(G_n)_n$ be an exhaustion of $G$; that is, a nondecreasing sequence $G_n \subset G_{n+1}$ of finite subsets of $G$ such that $G = \bigcup_n G_n$. Let $Z_n = G \setminus G_n$. Assume that $a \in \bigcap_n G_n$. Then the limit
$$C_{\mathrm{eff}}(a,\infty) := \lim_{n \to \infty} C_{\mathrm{eff}}(a, Z_n)$$
exists and does not depend on the specific choice of exhaustion. In fact, $C_{\mathrm{eff}}(a,\infty) = c_a \, \mathbb{P}_a\big[T_a^+ = \infty\big]$.

Proof Since $G_n \subset G_{n+1}$, the sequence $(\{T_{Z_n} < T_a^+\})_n$ is a decreasing sequence of events, whose intersection is a.s. $\{T_a^+ = \infty\}$. Thus, by Exercise 4.15,
$$c_a \cdot \mathbb{P}_a\big[T_a^+ = \infty\big] = \lim_{n \to \infty} c_a \cdot \mathbb{P}_a\big[T_{Z_n} < T_a^+\big] = \lim_{n \to \infty} C_{\mathrm{eff}}(a, Z_n). \qquad \square$$

$C_{\mathrm{eff}}(a,\infty)$ is called the effective conductance to infinity. We define the effective resistance to infinity by $R_{\mathrm{eff}}(a,\infty) = \frac{1}{C_{\mathrm{eff}}(a,\infty)} = \lim_{n\to\infty} R_{\mathrm{eff}}(a, Z_n)$. We then immediately obtain the following characterization of recurrence for networks.

Corollary 4.6.3 A connected network $(G,c)$ is recurrent if and only if $R_{\mathrm{eff}}(a,\infty) = \infty$ for some $a$.
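Concretely, on a finite network $C_{\mathrm{eff}}(a,Z)$ can be computed by solving the linear system for the unit voltage. The following is an illustrative sketch (not from the book; the series network is a hypothetical example):

```python
# Sketch: solve the discrete Dirichlet problem for the unit voltage v
# (v(a) = 1, v|_Z = 0, v harmonic elsewhere), then read off
# C_eff(a, Z) = sum_x c(a, x) (v(a) - v(x)).
import numpy as np

def effective_conductance(n, edges, a, Z):
    """edges: {(x, y): conductance} on vertices {0, ..., n-1}, undirected."""
    C = np.zeros((n, n))
    for (x, y), w in edges.items():
        C[x, y] = C[y, x] = w
    L = np.diag(C.sum(axis=1)) - C            # weighted graph Laplacian
    v = np.zeros(n)
    v[a] = 1.0
    interior = [x for x in range(n) if x != a and x not in Z]
    A = L[np.ix_(interior, interior)]
    b = -L[interior, a] * v[a]                # v|_Z = 0 contributes nothing
    v[interior] = np.linalg.solve(A, b)
    return float(sum(C[a, x] * (v[a] - v[x]) for x in range(n)))

# Series network 0-1-2-3 with conductances 1, 2, 4: resistances add, so
# C_eff = 1 / (1 + 1/2 + 1/4) = 4/7.
edges = {(0, 1): 1.0, (1, 2): 2.0, (2, 3): 4.0}
print(abs(effective_conductance(4, edges, a=0, Z={3}) - 4 / 7) < 1e-12)  # True
```

The series test mirrors the classical rule that resistances in series add, which is a special case of Theorem 4.6.2's monotone limits.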

4.7 Thompson’s Principle and Rayleigh Monotonicity In this section we may want to consider effective conductance of different networks on the same set of points G. Hence, if (G, c), (G, c 0 ) are two networks, we use the notation Ceff (a, Z | c), Ceff (a, Z | c 0 ) to specify the conductance function used. Let (G, c) be a network. Let A, Z be disjoint subsets such that G\( A ∪ Z ) is finite. Let v be the unit voltage from A to Z. For any function f : G → R such that f A ≡ 1 and f Z ≡ 0, we have E ( f , f ) ≥ E (v, v).

Theorem 4.7.1 (Thompson’s principle)

That is, the function minimizing the Dirichlet energy is the voltage. Proof Since f − v A∪Z ≡ 0 and since v is harmonic off A ∪ Z, we have that ( f − v)∆v ≡ 0. Also, f − v ∈ ` 2 (G, c). Thus,


4.7 Thompson’s Principle and Rayleigh Monotonicity 1 2 E( f

127

− v, v) = h∆( f − v), vi = h f − v, ∆vi = 0.

This implies that E ( f , f ) = E ( f − v + v, f − v + v) = E ( f − v, f − v) + E (v, v) − 2E ( f − v, v) ≥ E (v, v).

Theorem 4.7.2 (Thompson's principle, dual form) Let $(G,c)$ be a network. Let $a \notin Z$ be such that $G \setminus Z$ is finite. Let $v$ be a voltage from $a$ to $Z$ such that $v|_Z \equiv 0$ and such that the current $I = \nabla v$ is a unit flow from $a$ to $Z$. Then, for any unit flow $F$ from $a$ to $Z$, we have $\|F\|_c \ge \|I\|_c$.

Proof Since $\operatorname{div}(F - I)|_{G \setminus Z} \equiv 0$ and $v|_Z \equiv 0$, we have that $\operatorname{div}(F-I) \cdot v \equiv 0$. Thus,
$$\langle F - I, I \rangle_c = \langle F - I, \nabla v \rangle_c = \langle \operatorname{div}(F-I), v \rangle_c = 0.$$
So
$$\|F\|_c^2 = \|F - I + I\|_c^2 = \|F-I\|_c^2 + \|I\|_c^2 + 2\langle F-I, I\rangle_c \ge \|I\|_c^2. \qquad \square$$

Theorem 4.7.3 (Rayleigh monotonicity) Let $(G,c)$, $(G,c')$ be two networks on the same set of points. Let $a \notin Z$ be such that $G \setminus Z$ is finite. Suppose that $c(x,y) \le c'(x,y)$ for all $x, y$. Then,
$$C_{\mathrm{eff}}(a, Z \mid c) \le C_{\mathrm{eff}}(a, Z \mid c').$$

Proof Let $v$ be the unit voltage from $a$ to $Z$ with respect to the network $(G,c)$ and let $u$ be the unit voltage from $a$ to $Z$ with respect to the network $(G,c')$. By Exercise 4.14 we know that $\mathcal{E}_c(v,v) = 2 C_{\mathrm{eff}}(a,Z \mid c)$ and $\mathcal{E}_{c'}(u,u) = 2 C_{\mathrm{eff}}(a,Z \mid c')$. Here, $\mathcal{E}_c, \mathcal{E}_{c'}$ are the Dirichlet energies with respect to the conductances $c, c'$, respectively. We now use Thompson's principle (Theorem 4.7.1), which tells us that $v$ minimizes $\mathcal{E}_c$ over functions with value $1$ on $a$ and $0$ on $Z$. Hence,
$$2 C_{\mathrm{eff}}(a,Z \mid c) = \mathcal{E}_c(v,v) \le \mathcal{E}_c(u,u) = \sum_{x,y} c(x,y)|u(x)-u(y)|^2 \le \sum_{x,y} c'(x,y)|u(x)-u(y)|^2 = \mathcal{E}_{c'}(u,u) = 2 C_{\mathrm{eff}}(a,Z \mid c'). \qquad \square$$

Perhaps the most important application of Rayleigh monotonicity is the following corollary. It states that increasing the conductances of a network (in any way imaginable) cannot change a transient network into a recurrent one.


Corollary 4.7.4 Let $(G,c)$, $(G,c')$ be two connected networks on the same set of points. Suppose that $c(x,y) \le c'(x,y)$ for all $x, y$. If $(G,c)$ is transient then $(G,c')$ is transient.

Exercise 4.16 Prove Corollary 4.7.4. B solution C

Exercise 4.17 Let $(G,c)$ be a network. Let $G' \subset G$ and let $c'(x,y) = c(x,y)\mathbf{1}_{\{x,y \in G'\}}$. Assume that both $(G,c)$ and $(G',c')$ are connected. Show that if $(G',c')$ is transient then $(G,c)$ is transient. B solution C

4.8 Green Function

Definition 4.8.1 Let $(G,c)$ be a connected network. Let $Z \subset G$ be a subset, possibly empty. Define
$$g_Z(x,y) = \sum_{t=0}^{\infty} \mathbb{P}_x\big[X_t = y,\ t < T_Z\big] = \mathbb{E}_x\Big[\sum_{t < T_Z} \mathbf{1}_{\{X_t = y\}}\Big].$$

[…] by transience. So $I$ is a flow from $a$ to infinity. To estimate the energy of $I$, let $(G_n)_n$ be an exhaustion of $G$ as above, with $a \in G_1$ and $Z_n = G \setminus G_n$. Let $v_n(x) = \mathbb{P}_x\big[T_a < T_{Z_n}\big]$. So $v_n(x) \nearrow v(x)$ for all


$x$, and $v_n(a) = v(a) = 1$. Thus, for any $x, y$ we have $\nabla v_n(x,y) \to I(x,y)$. By Fatou's lemma we now have
$$\|I\|_c^2 = \sum_{x,y} r(x,y)|I(x,y)|^2 = \sum_{x,y} r(x,y) \lim_{n\to\infty} |\nabla v_n(x,y)|^2 \le \liminf_{n\to\infty} \sum_{x,y} r(x,y)|\nabla v_n(x,y)|^2 = \liminf_{n\to\infty} \|\nabla v_n\|_c^2.$$
Since $g_{Z_n}(x,a) = v_n(x) \cdot g_{Z_n}(a,a)$, we get that
$$\|\nabla v_n\|_c^2 = 2\langle \Delta v_n, v_n\rangle_c = 2\sum_x c_x \Delta v_n(x) v_n(x) = \frac{2 c_a}{g_{Z_n}(a,a)}.$$
Because $g_{Z_n}(a,a) \to g(a,a)$ we conclude that $\|I\|_c^2 \le \frac{2c_a}{g(a,a)} < \infty$. $\square$

4.10 Paths and Summable Intersection Tails

Let $(G,c)$ be a network. Recall that an infinite path is a sequence $\gamma = (\gamma_0, \gamma_1, \ldots)$ such that $c(\gamma_j, \gamma_{j+1}) > 0$ for all $j$. A path $\gamma$ is called simple if $\gamma_j \ne \gamma_i$ for all $i \ne j$. Let $\Gamma_a$ be the collection of all infinite simple paths $\gamma$ with $\gamma_0 = a$. For two infinite simple paths $\alpha, \beta$, let
$$E(\alpha,\beta) = \{(x,y) : \exists\, i, j,\ \alpha_i = \beta_j = x,\ \alpha_{i+1} = \beta_{j+1} = y\}.$$
That is, $E(\alpha,\beta)$ is the set of all $(x,y)$ such that $c(x,y) > 0$ and both $\alpha$ and $\beta$ traverse $(x,y)$. Recall the resistance $r(x,y) = \frac{1}{c(x,y)}$, and let
$$R(\alpha,\beta) = \sum_{(x,y) \in E(\alpha,\beta)} r(x,y).$$

A probability measure $\lambda$ on $\Gamma_a$ is called SIT (summable intersection tails) if the following holds: for $\alpha, \beta$ two independent paths, each of law $\lambda$, we have $\mathbb{E}_{\lambda\otimes\lambda}[R(\alpha,\beta)] < \infty$. That is, the total resistance of the edges traversed by two independent paths has finite expectation. (Here $\lambda \otimes \lambda$ is the product measure on $\Gamma_a \times \Gamma_a$ of two independent paths.) A network $(G,c)$ is called SIT if there exist $a \in G$ and a probability measure $\lambda$ on $\Gamma_a$ such that $\lambda$ is SIT.


Theorem 4.10.1 A connected network $(G,c)$ is transient if and only if $(G,c)$ is SIT.

The proof of the theorem is split into two parts. The simpler part is given by Proposition 4.10.2.

Proposition 4.10.2 If $(G,c)$ is SIT then $(G,c)$ is transient.

Proof Assume that $(G,c)$ is SIT. Let $a$ be such that some probability measure $\lambda$ on $\Gamma_a$ is SIT. Let $\alpha$ be a random path drawn from $\lambda$. Define
$$F(x,y) = \mathbb{P}_\lambda[\exists\, j,\ \alpha_j = x, \alpha_{j+1} = y] - \mathbb{P}_\lambda[\exists\, j,\ \alpha_j = y, \alpha_{j+1} = x].$$
Note that $F = -\check F$. Since $\alpha$ is simple, we have that $\alpha_0 = a$ and $\alpha_j \ne a$ for any $j > 0$. Thus, $F(a,x) = \mathbb{P}_\lambda[\alpha_1 = x]$, and
$$\operatorname{div} F(a) = \frac{2}{c_a} \sum_x F(a,x) = \frac{2}{c_a}.$$
Since $\alpha$ is a simple path,
$$\sum_{y \sim x} \mathbb{P}_\lambda[\exists\, j,\ \alpha_j = x, \alpha_{j+1} = y] = \mathbb{P}_\lambda[\exists\, j,\ \alpha_j = x],$$
and
$$\sum_{y \sim x} \mathbb{P}_\lambda[\exists\, j,\ \alpha_j = y, \alpha_{j+1} = x] = \mathbb{P}_\lambda[\exists\, j,\ \alpha_{j+1} = x].$$
So for any $x \ne a$ we get that
$$\operatorname{div} F(x) = \frac{2}{c_x} \sum_{y \sim x} \big( \mathbb{P}_\lambda[\exists\, j,\ \alpha_j = x, \alpha_{j+1} = y] - \mathbb{P}_\lambda[\exists\, j,\ \alpha_j = y, \alpha_{j+1} = x] \big) = 0.$$
Hence, $F$ is a flow from $a$ to infinity. Finally, since
$$|F(x,y)|^2 \le \mathbb{P}_\lambda[\exists\, j,\ \alpha_j = x, \alpha_{j+1} = y]^2 + \mathbb{P}_\lambda[\exists\, j,\ \alpha_j = y, \alpha_{j+1} = x]^2,$$
summing over all edges we obtain
$$\|F\|_c^2 \le 2 \sum_{x \sim y} r(x,y)\, \mathbb{P}_\lambda[\exists\, j,\ \alpha_j = x, \alpha_{j+1} = y]^2 = 2 \sum_{x \sim y} r(x,y)\, \mathbb{P}_{\lambda\otimes\lambda}\big[(x,y) \in E(\alpha,\beta)\big] = 2\, \mathbb{E}_{\lambda\otimes\lambda}[R(\alpha,\beta)] < \infty.$$
So $F$ has finite energy and hence $(G,c)$ is transient. $\square$


We now move to the proof of the second part of Theorem 4.10.1, namely that transience implies SIT.

For a flow $F$ from $a$ to infinity, we say that a cycle $\gamma : x \to x$ is a positive flow cycle if $F(\gamma_j, \gamma_{j+1}) > 0$ for all $0 \le j < |\gamma|$. Note that if $F$ admits some positive flow cycle $\gamma : x \to x$, then
$$\oint_\gamma F = \sum_{j=0}^{|\gamma|-1} F(\gamma_j, \gamma_{j+1}) \frac{1}{c(\gamma_j, \gamma_{j+1})} > 0.$$
So $F$ does not respect Kirchhoff's cycle law.

Lemma 4.10.3 If $(G,c)$ is a connected transient network, then there exists a unit flow $F$ from some $a \in G$ to infinity such that no cycle is a positive flow cycle. Moreover, $F(x,a) \le 0$ for all $x \in G$ (i.e. there is no flow into $a$).

Proof Define $v(x) = \frac12 g(x,a)$ and $F = \nabla v$. So $F = -\check F$, and
$$\operatorname{div} F(x) = 2\Delta v(x) = \mathbf{1}_{\{x = a\}}.$$
So $F$ is a unit flow. Also,
$$g(x,a) = \mathbb{P}_x[T_a < \infty] \cdot g(a,a) \le g(a,a),$$
so $F(x,a) = \frac12 c(x,a)\big(g(x,a) - g(a,a)\big) \le 0$ for all $x$. Finally, since $F = \nabla v$ we know that $F$ respects Kirchhoff's cycle law, so $F$ cannot admit a positive flow cycle. $\square$

Let $F$ be a unit flow in $(G,c)$ from $a$ to infinity. Suppose that $F$ does not admit any positive flow cycle. We define a new Markov chain on $G$, which is different from the random walk with transition matrix $P$. Define
$$f(x) = \sum_{y:\ F(x,y) > 0} F(x,y)$$
and
$$Q(x,y) = \begin{cases} \frac{1}{f(x)} F(x,y) & F(x,y) > 0, \\ 0 & F(x,y) \le 0. \end{cases}$$
In order to differentiate from the random walk, denoted by $(X_t)_t$, we will use $(Y_t)_t$ to denote the Markov chain with transition matrix $Q$, and we will use $Q_x$ to denote the corresponding probability measures. Note that $T_x$ will still denote hitting times for this chain.

Claim 4.10.4 Since the flow $F$ does not admit positive flow cycles, we get that the chain $(Y_t)_t$ is a simple path in the network $(G,c)$, $Q_x$-a.s. for any $x$.
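To make the construction concrete, here is an illustrative sketch (the 5-vertex flow below is hypothetical, not an example from the book) of the chain $Q$ built from the positive part of a flow:

```python
# Sketch: build Q(x, y) = F(x, y) / f(x) on the positively-flowed edges of a
# hypothetical unit flow: the source 0 splits to 1 and 2, which merge at 3,
# then the flow continues toward "infinity" through 4 (a truncation).
F = {
    (0, 1): 0.5, (0, 2): 0.5,
    (1, 3): 0.5, (2, 3): 0.5,
    (3, 4): 1.0,
}
F.update({(y, x): -w for (x, y), w in list(F.items())})  # anti-symmetry F = -F^

def f(x):
    # total flow out of x along positively-flowed edges
    return sum(w for (u, v), w in F.items() if u == x and w > 0)

def Q(x, y):
    w = F.get((x, y), 0.0)
    return w / f(x) if w > 0 else 0.0

# Q is a genuine transition matrix on the vertices with outgoing flow:
for x in [0, 1, 2, 3]:
    assert abs(sum(Q(x, y) for y in range(5)) - 1.0) < 1e-12

print(Q(0, 1), Q(3, 4))  # 0.5 1.0
```

Since this flow has no positive flow cycle (it is acyclic), every trajectory of the chain is a simple path, as Claim 4.10.4 asserts in general.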


Proof First, note that $(Y_t)_t$ is indeed a path in $(G,c)$, because $Q(x,y) > 0$ only if $c(x,y) > 0$. If $(Y_t)_t$ is not a simple path with some positive probability, then there must exist a point $x$ and a time $t > 0$ such that $Q^t(x,x) = Q_x[Y_t = x] > 0$. Thus, there is a cycle $\gamma : x \to x$ such that
$$0 < Q_x[(Y_0, \ldots, Y_t) = \gamma] = \prod_{j=0}^{|\gamma|-1} Q(\gamma_j, \gamma_{j+1}) = \prod_{j=0}^{|\gamma|-1} \frac{1}{f(\gamma_j)} F(\gamma_j, \gamma_{j+1}) \mathbf{1}_{\{F(\gamma_j, \gamma_{j+1}) > 0\}}.$$
This implies that $F(\gamma_j, \gamma_{j+1}) > 0$ for all $0 \le j < |\gamma|$, meaning that $\gamma$ is a positive flow cycle, a contradiction! $\square$

Claim 4.10.5 For all $x \in G$ we have
$$\sum_{j=0}^{\infty} Q^j(a,x) = Q_a[T_x < \infty] \le \frac{2}{c_a} f(x).$$

Proof Since $F = -\check F$,
$$f(x) = \sum_{y:\ F(x,y)>0} F(x,y) = \frac{c_x}{2}\operatorname{div} F(x) - \sum_{y:\ F(x,y)<0} F(x,y) = \frac{c_x}{2}\operatorname{div} F(x) + \sum_{y:\ F(y,x)>0} F(y,x).$$
Thus, since $\operatorname{div} F(x) = \mathbf{1}_{\{x=a\}}$ for the unit flow $F$,
$$(fQ)(x) = \sum_y f(y) Q(y,x) = \sum_{y:\ F(y,x)>0} F(y,x) = f(x) - \frac{c_a}{2}\delta_a(x).$$
Iterating this inductively,
$$(fQ^t)(x) = (fQ^{t-1})(x) - \frac{c_a}{2}\,(\delta_a Q^{t-1})(x) = \cdots = f(x) - \frac{c_a}{2} \sum_{j=0}^{t-2} Q^j(a,x) - \frac{c_a}{2} Q^{t-1}(a,x) = f(x) - \frac{c_a}{2} \sum_{j=0}^{t-1} Q^j(a,x).$$


Since $fQ^t$ has only nonnegative entries, taking $t \to \infty$ we arrive at
$$\sum_{j=0}^{\infty} Q^j(a,x) \le \frac{2}{c_a} f(x).$$

Now, let $A$ be the event that $(Y_t)_t$ is a simple path, and recall that $Q_a[A] = 1$. Note that for any $x$, the events $(\{Y_j = x\} \cap A)_{j \ge 0}$ are mutually disjoint. Thus,
$$\frac{2}{c_a} f(x) \ge \sum_{j=0}^{\infty} Q^j(a,x) = \sum_{j=0}^{\infty} Q_a[Y_j = x,\ A] = Q_a[\exists\, j \ge 0,\ Y_j = x] = Q_a[T_x < \infty]. \qquad \square$$

Claim 4.10.6 For all $x \sim y$ we have
$$Q_a[\exists\, j \ge 0,\ Y_j = x, Y_{j+1} = y] \le \frac{2}{c_a} F(x,y) \mathbf{1}_{\{F(x,y)>0\}}.$$

Proof As before, let $A$ be the event that $(Y_t)_t$ is a simple path, and recall that $Q_a[A] = 1$. Since the events $(\{Y_j = x, Y_{j+1} = y\} \cap A)_{j \ge 0}$ are disjoint, we use the previous claim to compute:
$$Q_a[\exists\, j \ge 0,\ Y_j = x, Y_{j+1} = y] = \sum_{j=0}^{\infty} Q^j(a,x) Q(x,y) = \sum_{j=0}^{\infty} Q^j(a,x) \frac{1}{f(x)} F(x,y) \mathbf{1}_{\{F(x,y)>0\}} \le \frac{2}{c_a} F(x,y) \mathbf{1}_{\{F(x,y)>0\}}. \qquad \square$$

Claim 4.10.7 If $\|F\|_c < \infty$ and $F$ does not admit a positive flow cycle, then the measure $Q_a$ is SIT.

Proof Since $F$ does not admit a positive flow cycle, $Q_a$ is indeed supported on simple paths. Since $F$ has finite energy, if $Y = (Y_t)_t$, $Z = (Z_t)_t$ are two independent chains under $Q_a$, then by Claim 4.10.6,
$$Q_a \otimes Q_a[\exists\, j, i \ge 0,\ Y_j = x, Y_{j+1} = y,\ Z_i = x, Z_{i+1} = y] \le \Big| \tfrac{2}{c_a} F(x,y) \Big|^2.$$
So
$$\mathbb{E}_{Q_a \otimes Q_a}[R(Y,Z)] \le \frac{4}{c_a^2} \cdot \sum_{x \sim y} r(x,y)|F(x,y)|^2 = \frac{4}{c_a^2} \cdot \|F\|_c^2 < \infty. \qquad \square$$

This completes the construction of a SIT measure from the finite energy flow $F$ above, which was obtained under the assumption that the network $(G,c)$ is transient. Combined with Proposition 4.10.2, this proves Theorem 4.10.1.


Exercise 4.20 Let $(G,c)$ be a network. Let $\Gamma_a^*$ be the set of all infinite paths $\gamma$ (not necessarily simple) that start at $\gamma_0 = a$ and that visit any point only finitely many times; that is, for all $x \in G$,
$$\sum_j \mathbf{1}_{\{\gamma_j = x\}} < \infty.$$
We call $\Gamma_a^*$ the set of transient paths. Show that for any $\gamma \in \Gamma_a^*$ there exists an infinite simple path $\alpha$ such that for all $j \ge 0$, there exists $k \ge 0$ with $\gamma_k = \alpha_j$, $\gamma_{k+1} = \alpha_{j+1}$. That is, any edge that $\alpha$ traverses, $\gamma$ also traverses. B solution C

Exercise 4.21 Let $(G,c)$ be a network. Let $\Gamma_a^*$ be the set of transient paths from Exercise 4.20. Suppose that $\lambda^*$ is a probability measure on $\Gamma_a^*$ such that $\mathbb{E}_{\lambda^* \otimes \lambda^*}[R(\alpha,\beta)] < \infty$. Show that there exists a SIT measure $\lambda$ on $\Gamma_a$. B solution C

Exercise 4.22 Let $(G,c)$ be a connected network. Let $x, y \in G$. Show that if there is a SIT measure on $\Gamma_x$, then there exists a SIT measure on $\Gamma_y$. B solution C

Exercise 4.23 Let $\varepsilon > 0$. Let $(G,c)$ be a network, and define conductances $c'(x,y) = c(x,y) + \varepsilon$ for all $(x,y) \in E(G,c)$ and $c'(x,y) = 0$ for $(x,y) \notin E(G,c)$. (Note that $c'(x,y) > 0$ if and only if $c(x,y) > 0$. That is, $E(G,c) = E(G,c')$.) Show that if $(G,c)$ is transient then $(G,c')$ is transient. Show that if $\inf_{(x,y) \in E(G,c)} c(x,y) > 0$ and $(G,c')$ is transient, then $(G,c)$ is transient. B solution C

Exercise 4.24 Let $\varepsilon > 0$. Let $(G,c)$ be a network, and define conductances $c'(x,y) = c(x,y) + \varepsilon$ for all $(x,y) \in E(G,c)$ and $c'(x,y) = 0$ for $(x,y) \notin E(G,c)$. Give an example of a network $(G,c)$ and $\varepsilon > 0$ such that $(G,c)$ is recurrent and $(G,c')$ is transient. B solution C

Exercise 4.25 This is called the Nash-Williams criterion. Let $(G,c)$ be a network, and fix some $a \in G$. Assume that $(K_n)_n$ is a sequence of subsets of edges $K_n \subset E(G,c)$ that are pairwise disjoint, $K_n \cap K_m = \emptyset$ for all $n \ne m$. Assume that for all $n$ the set $K_n$ is a cutset for $a$; that is, any infinite simple path starting from $a$ must traverse some edge in $K_n$. Precisely, for any $\gamma \in \Gamma_a$ and any $n$, there exists $j$ such that $(\gamma_j, \gamma_{j+1}) \in K_n$.


Prove that for any probability measure $\lambda$ on $\Gamma_a$, and $\alpha, \beta$ independent paths of law $\lambda$, we have that
$$\mathbb{E}_{\lambda\otimes\lambda}[R(\alpha,\beta)] \ge \sum_n \frac{1}{c(K_n)},$$
where $c(K_n) = \sum_{(x,y) \in K_n} c(x,y)$. B solution C

Exercise 4.26 Show that the simple random walk on $\mathbb{Z}^2$ is recurrent. That is, the $\mu$-random walk on $\mathbb{Z}^2$ for $\mu$ uniform on $\{\pm(1,0), \pm(0,1)\}$. B solution C

Exercise 4.27 Consider the group $\mathbb{Z}$ with the nonsymmetric measure $\mu(1) = 1 - \mu(-1) = p$, for some $p \in (0,1)$. Show that the $\mu$-random walk can be realized as the random walk on a network $(\mathbb{Z},c)$ where conductances are given by $c(x, x+1) = \big(\frac{p}{1-p}\big)^x$. Show that if $p \ne \frac12$ then $(\mathbb{Z},\mu)$ is transient. B solution C
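For the simple random walk on $\mathbb{Z}^2$ (unit conductances), the Nash-Williams criterion applies with $K_n$ the set of edges joining the box $[-n,n]^2$ to its complement: these cutsets are pairwise disjoint, $c(K_n) = 4(2n+1)$, and $\sum_n 1/c(K_n)$ diverges, so no measure on $\Gamma_0$ can be SIT and, by Theorem 4.10.1, the walk is recurrent, solving Exercise 4.26. An illustrative sketch:

```python
# Sketch: K_n = edges from the box [-n, n]^2 in Z^2 (unit conductances) to its
# complement; c(K_n) = |K_n| = 4(2n + 1), so sum_n 1/c(K_n) grows like (1/8) log N.
import math

def cutset_size(n):
    box = {(x, y) for x in range(-n, n + 1) for y in range(-n, n + 1)}
    count = 0
    for (x, y) in box:
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            if (x + dx, y + dy) not in box:
                count += 1
    return count

assert all(cutset_size(n) == 4 * (2 * n + 1) for n in range(5))

partial = sum(1.0 / cutset_size(n) for n in range(100))
print(partial > (1 / 8) * math.log(100))  # True: partial sums grow without bound
```

The logarithmic divergence here is exactly why $\mathbb{Z}^2$ sits at the recurrence/transience boundary, while in $\mathbb{Z}^3$ cutset sizes grow quadratically and the criterion gives no obstruction.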

4.11 Capacity

Recall that for a network $(G,c)$ we consider the spaces $\ell^2(G,c)$ and $\ell^2(E(G,c))$. Also, we denote by $\ell^0(G,c) = \{f : G \to \mathbb{C} \ :\ |\mathrm{supp}(f)| < \infty\}$ the subspace of finitely supported functions. This space does not actually depend on $c$, so sometimes we will just write $\ell^0(G)$.

Exercise 4.28 Show that $\ell^0(G,c)$ is dense in $\ell^2(G,c)$ under the norm $\|\cdot\|_c$. B solution C

Recall the Dirichlet energy of a function $f \in \ell^2(G,c)$, defined by $\mathcal{E}(f,f) = \langle \nabla f, \nabla f \rangle_c = 2\langle \Delta f, f \rangle_c$.

Definition 4.11.1 Let $(G,c)$ be a network. Let $A \subset G$ be a finite subset. Define the capacity of $A$ to be
$$\mathrm{cap}(A) = \inf\Big\{ \tfrac12 \mathcal{E}(f,f) \ :\ f|_A \equiv 1,\ f \in \ell^0(G,c) \Big\}.$$
Also, denote $\mathrm{cap}(x) = \mathrm{cap}(\{x\})$.

Theorem 4.11.2 Let $(G,c)$ be a network. For any $a \in G$ we have that $\mathrm{cap}(a) = \frac{c_a}{g(a,a)} = c_a \mathbb{P}_a[T_a^+ = \infty]$. Consequently, the following are equivalent:


(1) $(G,c)$ is transient.
(2) There exists $x \in G$ such that $\mathrm{cap}(x) > 0$.
(3) For all $x \in G$ we have $\mathrm{cap}(x) > 0$.

Proof Let $Z \subset G$ be such that $G \setminus Z$ is finite. Fix $a \notin Z$ and let $v(x) = \frac{g_Z(x,a)}{g_Z(a,a)}$. Then $v(a) = 1$ and $\mathrm{supp}(v) \subset G \setminus Z$, so it is easy to calculate that
$$\mathcal{E}(v,v) = 2\langle \Delta v, v \rangle_c = 2 c_a \Delta v(a) \cdot v(a) = \frac{2 c_a}{g_Z(a,a)}.$$
Thus, $\mathrm{cap}(a) \le \tfrac12 \mathcal{E}(v,v) = \frac{c_a}{g_Z(a,a)}$. Taking a monotone sequence $(Z_n)_n$, $Z_n \supset Z_{n+1}$, $\bigcap_n Z_n = \emptyset$, we arrive at $\mathrm{cap}(a) \le \frac{c_a}{g(a,a)}$.

Conversely, $v$ above is the unit voltage from $a$ to $Z$, so by Thompson's principle (Theorem 4.7.1), for any $f : G \to \mathbb{C}$ such that $f(a) = 1$ and $\mathrm{supp}(f) \subset G \setminus Z$, we have $\mathcal{E}(f,f) \ge \mathcal{E}(v,v) = \frac{2c_a}{g_Z(a,a)}$. Again, taking a monotone sequence $(Z_n)_n$, $Z_n \supset Z_{n+1}$, $\bigcap_n Z_n = \emptyset$, we arrive at
$$\mathrm{cap}(a) = \inf\Big\{ \tfrac12 \mathcal{E}(f,f) \ :\ f(a) = 1,\ f \in \ell^0(G,c) \Big\} \ge \inf_n \frac{c_a}{g_{Z_n}(a,a)} \ge \frac{c_a}{g(a,a)}. \qquad \square$$
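The identities driving this proof can be checked numerically. The following is an illustrative sketch (not from the book): on a finite network, $g_Z(a,a)$ is the $(a,a)$ entry of the fundamental matrix of the walk killed on $Z$, and $g_Z(a,a) \cdot \mathbb{P}_a[T_Z < T_a^+] = 1$, so $C_{\mathrm{eff}}(a,Z) = c_a/g_Z(a,a)$.

```python
# Sketch: on the path 0-1-2-3 with unit conductances, a = 0, Z = {3}, check
# g_Z(a, a) * P_a[T_Z < T_a^+] = 1, hence C_eff(a, Z) = c_a / g_Z(a, a).
import numpy as np

n, a, Z = 4, 0, {3}
C = np.zeros((n, n))
for x in range(n - 1):
    C[x, x + 1] = C[x + 1, x] = 1.0
P = C / C.sum(axis=1, keepdims=True)      # random walk transition matrix

# g_Z(a, a): expected visits to a before T_Z, via the fundamental matrix
alive = [x for x in range(n) if x not in Z]
Qk = P[np.ix_(alive, alive)]              # walk killed on Z
N = np.linalg.inv(np.eye(len(alive)) - Qk)
g_aa = N[alive.index(a), alive.index(a)]

# escape probability P_a[T_Z < T_a^+]: one step, then hit Z before a
h = np.zeros(n)
for z in Z:
    h[z] = 1.0
interior = [x for x in range(n) if x not in Z and x != a]
A = np.eye(len(interior)) - P[np.ix_(interior, interior)]
b = P[np.ix_(interior, sorted(Z))].sum(axis=1)
h[interior] = np.linalg.solve(A, b)
escape = float(P[a] @ h)

print(abs(g_aa * escape - 1.0) < 1e-12)                 # True
print(abs(C.sum(axis=1)[a] * escape - 1 / 3) < 1e-12)   # series: C_eff = 1/3
```

The second check matches the series computation $C_{\mathrm{eff}} = 1/(1+1+1) = 1/3$ from the effective-conductance section.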

Exercise 4.29 Show that
$$\mathrm{cap}(A) = \inf\Big\{ \tfrac12 \mathcal{E}(v_Z, v_Z) \ :\ v_Z(x) = \mathbb{P}_x[T_A < T_Z],\ A \subset Z^c,\ |Z^c| < \infty \Big\}.$$
B solution C

Definition 4.11.3 Let $(G,c)$ be a transient network, and let $A \subset G$ be a finite subset. Define the equilibrium measure of $A$ by
$$e_A(x) = \mathbf{1}_{\{x \in A\}} \mathbb{P}_x\big[T_A^+ = \infty\big].$$

Exercise 4.30 Show that in a transient network $(G,c)$, for any finite subset $A$,
$$\sum_y g(x,y) e_A(y) = \mathbb{P}_x[T_A < \infty].$$
(Hint: consider $L = \sup\{t \ge -1 : X_t \in A\}$.) B solution C

Exercise 4.31 Show that for a finite subset $A$ in a network $(G,c)$,
$$\mathrm{cap}(A) = \sum_x c_x e_A(x).$$
B solution C

Exercise 4.32 Show that for finite subsets $A, B$ in a network $(G,c)$, the capacity satisfies $\mathrm{cap}(A \cup B) \le \mathrm{cap}(A) + \mathrm{cap}(B)$. B solution C


Exercise 4.33 Show that if $A \subset B$ are finite subsets in a network $(G,c)$ then $\mathrm{cap}(A) \le \mathrm{cap}(B)$. B solution C

Exercise 4.34 Let $A$ be a finite subset in a transient network $(G,c)$. Define
$$\Sigma_{\ge}(A) := \Big\{ 0 \le \varphi \in \ell^0(G,c) \ \Big|\ \forall\, x \in A:\ \sum_y g(x,y)\varphi(y) \ge 1 \Big\},$$
$$\Sigma_{\le}(A) := \Big\{ 0 \le \varphi \le \mathbf{1}_A \ \Big|\ \forall\, x \in A:\ \sum_y g(x,y)\varphi(y) \le 1 \Big\}.$$
Show that
$$\mathrm{cap}(A) = \min_{\varphi \in \Sigma_{\ge}(A)} \sum_x c_x \varphi(x) = \max_{\varphi \in \Sigma_{\le}(A)} \sum_x c_x \varphi(x).$$
B solution C

Exercise 4.35 Let $(G,c)$ be a transient network. Let $A$ be a finite subset. Define $g(x,A) = \sum_{a \in A} g(x,a)$. Prove that
$$\inf_{a \in A} g(a,A) \le \frac{\sum_{a \in A} c_a}{\mathrm{cap}(A)} \le \sup_{a \in A} g(a,A).$$
B solution C

4.12 Transience and Recurrence of Groups

In this section we apply the relations between Dirichlet energy, capacity, and transience to prove that transience/recurrence is essentially a group property.

Theorem 4.12.1 Let $G$ be a finitely generated group. Let $\mu, \nu \in \mathrm{SA}(G,2)$ be two symmetric, adapted probability measures on $G$ with finite second moment. Then, the $\mu$-random walk is transient if and only if the $\nu$-random walk is transient.

In light of this theorem we may state the following definition.

Definition 4.12.2 A finitely generated group $G$ is called recurrent if some (and hence, any) symmetric, adapted random walk on $G$ with finite second moment is a recurrent Markov chain. Otherwise, we say that $G$ is transient.

We have already seen that $\mathbb{Z}, \mathbb{Z}^2$ are recurrent (see Section 2.4, Example 3.6.3, and Exercise 4.26). Thus, by Exercise 3.29, any group that is virtually $\mathbb{Z}$ or $\mathbb{Z}^2$ is recurrent. In Theorem 9.7.1, we will classify all (finitely generated)


recurrent groups, and we will see that finite groups and groups that are virtually $\mathbb{Z}$ or virtually $\mathbb{Z}^2$ are the only examples. Exercise 4.27 shows that there exist nonsymmetric random walks on $\mathbb{Z}$ that are transient. In Exercise 4.44 we will give an example of $\mu \in \mathrm{SA}(\mathbb{Z}^2, 2-\varepsilon)$ such that $(\mathbb{Z}^2, \mu)$ is transient. These examples show that the conditions in Theorem 4.12.1 cannot be relaxed in general.

Exercise 4.36 Let $G$ be a finitely generated group, and let $\mu$ be any symmetric adapted probability measure on $G$. Assume that the $\mu$-random walk is recurrent. Show that there exists a finitely supported, symmetric, adapted probability measure $\nu$ on $G$ such that the $\nu$-random walk is recurrent. Show that $\nu$ can be chosen so that $\mathrm{supp}(\nu) \subset \mathrm{supp}(\mu)$. B solution C

Our main tool to prove Theorem 4.12.1 is the notion of a canonical path ensemble. Let $(G,c)$, $(G,c')$ be two networks on the same state space $G$. A canonical path ensemble is a collection of finite paths $\Gamma = (\gamma^{xy})_{(x,y) \in E(G,c)}$, one for each edge $(x,y) \in E(G,c)$, such that $\gamma^{xy} : x \to y$ is a path in $(G,c')$. For a canonical path ensemble $\Gamma$, define the maximal load
$$L_\Gamma(c,c') = \sup_{(z,w) \in E(G,c')} \frac{1}{c'(z,w)} \sum_{x,y} c(x,y) |\gamma^{xy}| \mathbf{1}_{\{(z,w) \in \gamma^{xy}\}}.$$
Furthermore, define $L(c,c') = \inf_\Gamma L_\Gamma(c,c')$, where the infimum is over all possible choices of canonical path ensembles $\Gamma = (\gamma^{xy})_{(x,y) \in E(G,c)}$.

Lemma 4.12.3 Let $(G,c)$, $(G,c')$ be two networks on the same state space $G$. Then, for any $f \in \ell^2(G,c) \cap \ell^2(G,c')$ we have $\mathcal{E}_c(f,f) \le L(c,c') \cdot \mathcal{E}_{c'}(f,f)$.

Proof If $\Gamma = (\gamma^{xy})_{(x,y) \in E(G,c)}$ is a canonical path ensemble, then for any $f \in \ell^2(G,c) \cap \ell^2(G,c')$, the Cauchy–Schwarz inequality along each path gives
$$\mathcal{E}_c(f,f) = \sum_{x,y} c(x,y)|f(x) - f(y)|^2 \le \sum_{x,y} c(x,y)|\gamma^{xy}| \sum_{j=0}^{|\gamma^{xy}|-1} |f(\gamma^{xy}_j) - f(\gamma^{xy}_{j+1})|^2$$
$$= \sum_{z,w} c'(z,w)|f(z) - f(w)|^2 \cdot \sum_{x,y} \frac{c(x,y)|\gamma^{xy}|}{c'(z,w)} \mathbf{1}_{\{(z,w) \in \gamma^{xy}\}} \le L_\Gamma(c,c') \cdot \mathcal{E}_{c'}(f,f).$$
Taking the infimum over all possible choices of $\Gamma$ proves the lemma. $\square$
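A minimal numerical check of Lemma 4.12.3 (an illustrative sketch; the triangle networks and the canonical paths are my own choices, not from the book): $c$ has the extra edge $\{0,2\}$, which $c'$ lacks, and that edge is routed through vertex $1$.

```python
# Sketch: verify E_c(f, f) <= L_Gamma(c, c') * E_{c'}(f, f) on a toy triangle.
import random

c = {(0, 1): 1.0, (1, 2): 1.0, (0, 2): 1.0}
c.update({(y, x): w for (x, y), w in list(c.items())})
cp = {(0, 1): 1.0, (1, 2): 1.0}          # c' is missing the edge {0, 2}
cp.update({(y, x): w for (x, y), w in list(cp.items())})

# canonical paths: each c-edge is a path in c'; {0,2} detours through 1
paths = {(0, 1): [0, 1], (1, 2): [1, 2], (0, 2): [0, 1, 2]}
paths.update({(y, x): list(reversed(p)) for (x, y), p in list(paths.items())})

def load(z, w):
    total = 0.0
    for (x, y), p in paths.items():
        if (z, w) in zip(p, p[1:]):
            total += c[(x, y)] * (len(p) - 1)
    return total / cp[(z, w)]

L = max(load(z, w) for (z, w) in cp)     # maximal load L_Gamma(c, c')

def energy(cond, f):
    return sum(w * (f[x] - f[y]) ** 2 for (x, y), w in cond.items())

random.seed(0)
for _ in range(100):
    f = [random.uniform(-1, 1) for _ in range(3)]
    assert energy(c, f) <= L * energy(cp, f) + 1e-9

print(L)  # 3.0
```

Here $L_\Gamma = 3$ by hand as well: the edge $\{0,1\}$ carries its own length-1 path plus the length-2 detour of $\{0,2\}$, each with unit conductance.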


Proof of Theorem 4.12.1 Assume that $(G,\mu)$ is transient. We wish to show that $(G,\nu)$ is transient. For this it suffices to bound the ratio between the respective Dirichlet energies (by the capacity criterion for transience, Theorem 4.11.2). If $(G,\nu)$ is recurrent, then by Exercise 4.36 we may find a finitely supported, symmetric, and adapted probability measure $\tilde\nu$ such that $(G,\tilde\nu)$ is also recurrent. By replacing $\nu$ with $\tilde\nu$, we may assume without loss of generality that $\nu$ has a finite support $S$. We use this generating set $S$ to determine the metric on $G$.

For any $x \in G$ write $x = s_1 \cdots s_{|x|}$ for $s_j \in S$, and let $\alpha^x : 1 \to x$ be the path
$$\alpha^x = (1, s_1, s_1 s_2, \ldots, s_1 \cdots s_{|x|-1}, x).$$
For $x, y \in G$ define $\gamma^{x,y} : x \to y$ by setting $\gamma^{x,y} = x\alpha^{x^{-1}y}$, where $x\alpha = (x\alpha_0, \ldots, x\alpha_{|\alpha|})$. Note that $\gamma^{1,y} = \alpha^y$. The collection $\Gamma = (\gamma^{x,y})_{(x,y) \in E(G,\mu)}$ is our choice of canonical path ensemble.

An important property we will use is that for any edge $(z,w) \in E(G,\nu)$ in the network induced by $\nu$, if $(z,w) \in \gamma^{x,y}$, then also $(v^{-1}z, v^{-1}w) \in E(G,\nu)$ and $(v^{-1}z, v^{-1}w) \in v^{-1}\gamma^{x,y} = \gamma^{v^{-1}x, v^{-1}y}$.

Using the fact that $\mu$ has finite second moment,
$$L_\Gamma(\mu,\nu) = \sup_{z \in \mathrm{supp}(\nu),\, w \in G} \frac{1}{\nu(z)} \sum_{x,y} \mu(x^{-1}y)|\gamma^{x,y}| \mathbf{1}_{\{(w,wz) \in \gamma^{x,y}\}} = \sup_{z \in \mathrm{supp}(\nu),\, w \in G} \frac{1}{\nu(z)} \sum_y \mu(y)|\alpha^y| \sum_x \mathbf{1}_{\{(1,z) \in w^{-1}x\alpha^y\}}$$
$$\le \sup_{z \in \mathrm{supp}(\nu),\, w \in G} \frac{1}{\nu(z)} \sum_y \mu(y)|\alpha^y| \cdot |\alpha^y| \le \sup_{z \in \mathrm{supp}(\nu)} \frac{1}{\nu(z)} \cdot \mathbb{E}_\mu\big[|X_1|^2\big] < \infty,$$
where we have used that for any $z, w, y$,
$$\sum_x \mathbf{1}_{\{(1,z) \in w^{-1}x\alpha^y\}} = \sum_x \mathbf{1}_{\{(x,xz) \in \alpha^y\}} \le |\alpha^y|.$$
Hence $L(\mu,\nu) < \infty$, and by Lemma 4.12.3 the Dirichlet energy $\mathcal{E}_\mu$ is bounded by a constant multiple of $\mathcal{E}_\nu$. By the capacity criterion (Theorem 4.11.2), recurrence of $(G,\nu)$ would force all capacities to vanish in the $\nu$-network, hence also in the $\mu$-network, contradicting the transience of $(G,\mu)$. $\square$

Exercise 4.37 Let $\mu$ be the uniform measure on $\{-1,1\}$. Let $(X_t)_t$ be the $\mu$-random walk on $\mathbb{Z}$. Show that the transition matrix for $(X_t)_t$ is given by $P(z,z+1) = P(z,z-1) = \frac12$. Show that there exists some constant $C > 0$ such that for all $t \ge 1$,
$$C^{-1} t^{-1/2} \le \mathbb{P}[X_{2t} = 0] \le C t^{-1/2}. \tag{4.1}$$
Show that $\mathbb{P}[X_{2t+1} = 0] = 0$. Conclude that $\mathbb{Z}$ is recurrent (we have already proven this fact). B solution C
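The even-time return probability in (4.1) can be checked exactly, using the standard identity $\mathbb{P}[X_{2t} = 0] = \binom{2t}{t} 4^{-t}$ and Stirling's approximation $\binom{2t}{t} 4^{-t} \sim (\pi t)^{-1/2}$. A quick sketch:

```python
# Sketch: exact return probabilities of the simple walk on Z, compared with
# the t^{-1/2} rate of (4.1); the ratio approaches 1 from below.
import math

def p_return(t):
    # P[X_{2t} = 0]: choose which t of the 2t steps go right
    return math.comb(2 * t, t) / 4 ** t

for t in [1, 10, 100, 1000]:
    ratio = p_return(t) * math.sqrt(math.pi * t)
    assert 0.88 < ratio < 1.0

print(round(p_return(10), 4))  # 0.1762
```

Summability fails since $\sum_t t^{-1/2} = \infty$, which is exactly how (4.1) yields recurrence of $\mathbb{Z}$.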

Exercise 4.38 Let $\mu$ be the uniform measure on $\{-1,0,1\}$. Let $(X_t)_t$ be the $\mu$-random walk on $\mathbb{Z}$.


Show that the transition matrix for $(X_t)_t$ is given by $P(z,z+1) = P(z,z-1) = P(z,z) = \frac13$. Show that there exists some constant $C > 0$ such that for all $t \ge 1$,
$$C^{-1} t^{-1/2} \le \mathbb{P}[X_t = 0] \le C t^{-1/2}. \tag{4.2}$$
B solution C

Exercise 4.39 Consider the set $S = \{-1,0,1\}^d$ as a subset of $\mathbb{Z}^d$; that is, $S$ is the set of all $d$-dimensional vectors taking values in $\{-1,0,1\}$. Let $\mu$ be the uniform measure on $S$. Show that $\mu \in \mathrm{SA}(\mathbb{Z}^d, \infty)$.
Let $(Z_t)_t$ denote the $\mu$-random walk. Write $Z_t = (X^1_t, \ldots, X^d_t)$ for every $t$, where $X^j_t \in \mathbb{Z}$. Show that $(X^1_t)_t, \ldots, (X^d_t)_t$ are independent random walks on $\mathbb{Z}$, each with transition matrix $P(z,z+1) = P(z,z-1) = P(z,z) = \frac13$.
Conclude that $\mathbb{Z}^d$ is recurrent if and only if $d \le 2$. B solution C
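The independence claim can be seen by factorizing the step distribution. An illustrative sketch for $d = 3$: the uniform measure on $S = \{-1,0,1\}^d$ is exactly the product of $d$ uniform measures on $\{-1,0,1\}$.

```python
# Sketch: the uniform step on S = {-1,0,1}^d has i.i.d. coordinates, each
# uniform on {-1, 0, 1}, so the coordinate walks are independent lazy walks.
import itertools

d = 3
S = list(itertools.product([-1, 0, 1], repeat=d))
p = 1.0 / len(S)   # uniform measure assigns 3^{-d} to each step vector

# each coordinate marginal is uniform on {-1, 0, 1}:
for j in range(d):
    for v in [-1, 0, 1]:
        marginal = sum(p for s in S if s[j] == v)
        assert abs(marginal - 1 / 3) < 1e-12

# the joint probability of every step equals the product of its marginals,
# which is exactly independence of the d coordinate processes:
for s in S:
    assert abs(p - (1 / 3) ** d) < 1e-12

print(len(S))  # 27
```

Combined with (4.2), this factorization gives $\mathbb{P}[Z_t = \vec 0] \asymp t^{-d/2}$, summable exactly when $d \ge 3$.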

4.13 Additional Exercises

Exercise 4.40 Let $(G,c)$ be a network. Let $f, g \in \ell^2(G,c)$ be functions such that $f \cdot g \in \ell^2(G,c)$. Define $\Gamma(f,g) : G \to \mathbb{C}$ by
$$\Gamma(f,g) := \tfrac12 \big( f \cdot \Delta \bar g + \bar g \cdot \Delta f - \Delta(f \cdot \bar g) \big).$$
Show that $\Gamma(f,g) = \overline{\Gamma(g,f)}$. Show that $\Gamma(\alpha f + h, g) = \alpha \Gamma(f,g) + \Gamma(h,g)$, where $\alpha \in \mathbb{C}$. Show that
$$\Gamma(f,g)(x) = \tfrac12 \sum_y P(x,y)\big(f(x) - f(y)\big)\overline{\big(g(x) - g(y)\big)}.$$
Conclude that $\Gamma(f,f) \ge 0$. Show that $\mathcal{E}(f,g) = 2\langle \Gamma(f,g), 1\rangle_c$, where $1$ is the constant function. B solution C

Exercise 4.41 Show that $\mathcal{E}(f, g^2) = 4\langle \Gamma(f,g), g\rangle_c$. B solution C

Exercise 4.42 Show that for any $x$,
$$|\Gamma(f,g)(x)|^2 \le \Gamma(f,f)(x) \cdot \Gamma(g,g)(x).$$
Conclude that $\|\Gamma(f,g)\|_c^2 \le \langle \Gamma(f,f), \Gamma(g,g)\rangle_c$. B solution C

In the following exercises we work to provide an example of a transient symmetric random walk on the recurrent group $\mathbb{Z}^2$. In light of Theorem 4.12.1 and Exercise 4.39, this random walk cannot have finite second moment. This example shows that Theorem 4.12.1 is tight, in the sense that the number of moments cannot be relaxed.

Exercise 4.43 Fix $\alpha \in (1,2)$. Let $\zeta(\alpha) = \sum_{n=1}^{\infty} n^{-\alpha}$. Define a probability measure $\nu = \nu_\alpha$ on $\mathbb{N}$ by $\nu(n) = \mathbf{1}_{\{n \ge 1\}} \frac{1}{\zeta(\alpha)} n^{-\alpha}$. (This is sometimes called the $\zeta$-distribution.) Let $(U_k)_k$ be i.i.d.-$\nu$, and $T_n = \sum_{k=1}^n U_k$.
Show that for $\varepsilon > 2 - \alpha$ we have that $\mathbb{E}\big[U_1^{1-\varepsilon}\big] < \infty$, but $\mathbb{E}\big[U_1^{1-\varepsilon'}\big] = \infty$ for $\varepsilon' \le 2 - \alpha$.
Show that there exists a constant $C = C(\alpha) > 0$ such that $\mathbb{E}\big[\frac{1}{T_n}\big] \le C n^{\alpha - 3}$.
(Hint: consider $\delta > 0$ and define $B_n(\delta) = \#\{1 \le k \le n : U_k \ge n^\delta\}$. Use Chebyshev's inequality for the binomial random variable $B_n(\delta)$.) B solution C

Exercise 4.44 Fix $\alpha \in (1,2)$. Let $\zeta(\alpha) = \sum_{n=1}^{\infty} n^{-\alpha}$, and define the $\zeta$-distribution $\nu = \nu_\alpha$ on $\mathbb{N}$ by $\nu(n) = \mathbf{1}_{\{n \ge 1\}} \frac{1}{\zeta(\alpha)} n^{-\alpha}$, as in Exercise 4.43. Let $(U_k)_k$ be i.i.d.-$\nu$, and $T_n = \sum_{k=1}^n U_k$. Let $(X_t)_t, (Y_t)_t$ be two independent simple random walks on $\mathbb{Z}$, independent of $(U_k)_k$ as well. Consider the process $Z_n := (X_{T_n}, Y_{T_n}) \in \mathbb{Z}^2$.
Show that $(Z_n)_n$ is a $\mu$-random walk, for some $\mu \in \mathrm{SA}(\mathbb{Z}^2, 2-\varepsilon)$ whenever $\varepsilon > 4 - 2\alpha$.
Show that $\mathbb{P}[Z_n = \vec 0] \le C n^{-\beta}$ for some constants $C = C(\alpha) > 0$, $\beta = \beta(\alpha) > 1$ and all $n \ge 1$. Conclude that $(\mathbb{Z}^2, \mu)$ is transient. B solution C

Exercise 4.45 Let $G$ be a finitely generated group. Let $(X_t)_t$ be a $\mu$-random walk for an adapted symmetric probability measure $\mu$ on $G$. Define the range of the random walk to be
$$R_t = \{X_0, \ldots, X_t\}.$$

https://doi.org/10.1017/9781009128391.006 Published online by Cambridge University Press

146

Networks and Discrete Analysis

Show that f g 1 E |Rt | = P T1+ = ∞ . t→∞ t lim

B solution C

Let µ ∈ SA (Z, 1). Let (Xt )t be the µ-random walk, and let Rt = {X0, . . . , Xt }. Show that 1t |Rt | → 0 a.s. Use Exercise 4.45 to show that (Z, µ) is recurrent. B solution C

Exercise 4.46
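The sublinear range growth in Exercise 4.46 is easy to see in simulation. The sketch below is ours, not the book's; it uses the simple random walk as a convenient stand-in for a walk with finite first moment, and estimates $|R_t|/t$ at two time scales:

```python
import random

def range_fraction(t, rng):
    """Run a simple random walk on Z for t steps; return |R_t| / t."""
    x, visited = 0, {0}
    for _ in range(t):
        x += rng.choice((-1, 1))
        visited.add(x)
    return len(visited) / t

rng = random.Random(1)
avg = {t: sum(range_fraction(t, rng) for _ in range(20)) / 20
       for t in (100, 10_000)}
# |R_t| grows like sqrt(t) for the simple random walk, so |R_t|/t -> 0,
# consistent with P[T_1^+ = infinity] = 0 (recurrence) via Exercise 4.45.
```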

4.14 Remarks

A more comprehensive treatment of network theory can be found in Aldous and Fill (2002) and Lyons and Peres (2016). Much of this chapter is based on the latter.

Benjamini, Pemantle & Peres introduced the notion of exponential intersection tails, or EIT, in Benjamini et al. (1998). Their motivation was transience of percolation clusters; see their paper for details. Recall the framework from Section 4.10, especially the definitions of $\Gamma_a$, $E(\alpha,\beta)$, $R(\alpha,\beta)$. A probability measure $\lambda$ on $\Gamma_a$ is called EIT (exponential intersection tails) if the following holds: for $\alpha, \beta$ two independent paths, each of law $\lambda$, there exists $\varepsilon > 0$ such that
\[ \mathbb{E}_{\lambda\otimes\lambda}\big[ e^{\varepsilon R(\alpha,\beta)} \big] < \infty . \]
That is, the distribution of the total resistance of edges traversed by two independent paths has an exponential tail. A network is called EIT if it admits some EIT measure on $\Gamma_a$ for some $a \in G$. In Benjamini et al. (1998) it was shown that if a graph $G$ is EIT, then there exists some $p < 1$ such that Bernoulli-$p$ percolation clusters on $G$ are transient.

The relation between SIT and transience, namely Theorem 4.10.1, was known but seems to have been unpublished until recently. The following, however, is still open.

Conjecture 4.14.1

Let G be a Cayley graph of a transient group. Then G is EIT.

That is, we conjecture that the seemingly much weaker property of SIT (finite expectation) implies EIT (exponential tails) in Cayley graphs. All exponential growth Cayley graphs are EIT, as a result of Lyons (1995). One may combine Gromov’s theorem (Theorem 9.0.1) and the methods in Chapter 9 with Benjamini et al. (1998) to show that all transient Cayley graphs of polynomial growth are EIT (see also Raoufi and Yadin, 2017).


4.15 Solutions to Exercises

Solution to Exercise 4.1 :( Let $T = T_Z$. Since $(G,c)$ is connected, for any $a \notin Z$ there exists $t(a)$ such that
\[ \mathbb{P}^a[T \le t(a)] \ge \mathbb{P}^a[X_{t(a)} \in Z] > 0 . \]
Let $t_* = \max\{t(a) : a \notin Z\}$ (using that $G \setminus Z$ is finite). Then,
\[ \min_{a \notin Z} \mathbb{P}^a[T \le t_*] \ge \min_{a \notin Z} \mathbb{P}^a[X_{t(a)} \in Z] =: \alpha > 0 . \]
Hence, for any $x$ and $t > t_*$, by the Markov property,
\[ \mathbb{P}^x[T > t] = \mathbb{P}^x[T > t \mid T > t - t_*] \cdot \mathbb{P}^x[T > t - t_*] \le \sup_y \mathbb{P}^y[T > t_*] \cdot \mathbb{P}^x[T > t - t_*] \le (1-\alpha) \cdot \mathbb{P}^x[T > t - t_*] \le \cdots \le (1-\alpha)^{\lfloor t/t_* \rfloor} . \]
:) X

Solution to Exercise 4.2 :( We only sketch the proof, as it is a very classical fact, basically coming from the identification $\ell^2(N, m) = L^2(N, 2^N, m)$, considering $m$ as a (not necessarily finite) measure on $N$. The fact that this is an inner product is very easy to prove. To show that $\ell^2(N, m)$ is a Hilbert space, we only need to show that the induced metric is complete. To this end, let $(f_n)_n$ be a Cauchy sequence in $\ell^2(N, m)$. We wish to find a limit in $\ell^2(N, m)$ for this sequence. One then proceeds in a few steps:

• Show that for any $x \in N$ the sequence $(f_n(x))_n$ is a Cauchy sequence in $\mathbb{C}$, and therefore the pointwise limit $f(x) := \lim f_n(x)$ exists for all $x$. (This uses the fact that $m(x) > 0$.)
• Show that $M := \sup_n \|f_n\| < \infty$.
• Enumerate $N = \{x_1, x_2, \ldots\}$ ($N$ is countable by assumption). Use the above to bound $\sum_{j=1}^r |f(x_j) - f_n(x_j)|^2$ as follows. Fix $\varepsilon > 0$. Let $n_0$ be large enough so that $\|f_n - f_m\|^2 < \varepsilon$ for all $n, m \ge n_0$. For $r > 0$ let $n_r \ge n_0$ be large enough so that $|f(x_j) - f_m(x_j)|^2 < \frac{\varepsilon}{r}$ for all $j \le r$ and $m \ge n_r$. Then for any $n \ge n_0$, any $r > 0$, and any $m \ge n_r$,
\[ \sum_{j=1}^r |f(x_j) - f_n(x_j)|^2 \le 2 \sum_{j=1}^r |f_m(x_j) - f_n(x_j)|^2 + 2 \sum_{j=1}^r |f(x_j) - f_m(x_j)|^2 < 4\varepsilon . \]
• This culminates in showing that $\|f - f_n\| \to 0$.
:) X

Solution to Exercise 4.3 :( Linearity is immediate. If $f \in \ell^2(G,c)$ then
\[ \|\nabla f\|_c^2 = \sum_{x \sim y} \tfrac{1}{c(x,y)} \big| c(x,y)(f(x) - f(y)) \big|^2 = \sum_{x,y} c(x,y) |f(x) - f(y)|^2 \le \sum_x c_x |f(x)|^2 + \sum_y c_y |f(y)|^2 + 2 \sum_{x,y} c(x,y) |f(x)| \cdot |f(y)| . \]
The Cauchy–Schwarz inequality tells us that
\[ \Big( \sum_{x,y} c(x,y) |f(x)| \cdot |f(y)| \Big)^2 \le \sum_{x,y} c(x,y) |f(x)|^2 \cdot \sum_{x,y} c(x,y) |f(y)|^2 = \|f\|_c^2 \cdot \|f\|_c^2 . \]
All in all, this gives the bound
\[ \|\nabla f\|_c^2 \le 4 \|f\|_c^2 < \infty . \]


Also, if $F \in \ell^2(E(G,c))$ then similarly,
\[ \|\operatorname{div} F\|_c^2 = \sum_x \tfrac{1}{c_x} \Big( \sum_{y \sim x} (F(x,y) - F(y,x)) \Big)^2 \le \sum_{x \sim y} \tfrac{1}{c_x} \big( |F(x,y)|^2 + |F(y,x)|^2 + 2|F(x,y)| \cdot |F(y,x)| \big) \le 4 \sum_{x \sim y} \tfrac{1}{c_x} |F(x,y)|^2 \le 4 \|F\|_c^2 , \]
where we have used Cauchy–Schwarz again, and the fact that $c_x \ge c(x,y)$. :) X

Solution to Exercise 4.4 :( Since $\nabla f(y,x) = -\nabla f(x,y)$ we have
\[ \tfrac12 \operatorname{div} \nabla f(x) = \tfrac12 \cdot \tfrac{1}{c_x} \sum_{y \sim x} (\nabla f(x,y) - \nabla f(y,x)) = \tfrac{1}{c_x} \sum_{y \sim x} \nabla f(x,y) = \sum_{y \sim x} P(x,y) (f(x) - f(y)) = \Delta f(x) . \]
Now, $\Delta$ is self-adjoint just because $\nabla, \operatorname{div}$ are adjoints of each other; indeed, for any $f, g \in \ell^2(G,c)$,
\[ 2 \langle \Delta f, g \rangle_c = \langle \nabla f, \nabla g \rangle_c = 2 \langle f, \Delta g \rangle_c . \]
:) X

Solution to Exercise 4.6 :( We have that
\[ \sum_{x,y} c(x,y) |f(x) - f(y)|^2 = \mathcal{E}(f,f) = 2 \langle \Delta f, f \rangle_c = 0 . \]
Since the sum on the left-hand side is of nonnegative terms, all terms must be $0$. Thus, $f(x) = f(y)$ for all $x \sim y$. Since $(G,c)$ is connected, $f$ is constant. If $\sum_x c_x = \infty$ this is only possible in $\ell^2(G,c)$ when $f$ is $0$. :) X

Solution to Exercise 4.8 :( The main observation here is that
\[ P^n(x,y) = \sum_{(x = \gamma_0, \ldots, \gamma_n = y) \in G^{n+1}} \ \prod_{j=0}^{n-1} P(\gamma_j, \gamma_{j+1}) . \]
Assume that for all $x, y \in G$ there exists a finite path $\gamma \colon x \to y$. Then, for some specific $x, y \in G$, let $\gamma = (x = \gamma_0, \ldots, \gamma_n = y)$ be such a path. Note that
\[ P^n(x,y) \ge \prod_{j=0}^{n-1} P(\gamma_j, \gamma_{j+1}) = \prod_{j=0}^{n-1} \frac{c(\gamma_j, \gamma_{j+1})}{c_{\gamma_j}} > 0 . \]
So $P$ is irreducible, meaning that $G$ is connected.

Conversely, assume that $P$ is irreducible. Let $x, y \in G$. So there exists $n$ for which $P^n(x,y) > 0$. This implies that there exists a finite sequence $(x = \gamma_0, \ldots, \gamma_n = y) \in G^{n+1}$ such that
\[ \prod_{j=0}^{n-1} P(\gamma_j, \gamma_{j+1}) = \prod_{j=0}^{n-1} \frac{c(\gamma_j, \gamma_{j+1})}{c_{\gamma_j}} > 0 . \]
But this implies in turn that $c(\gamma_j, \gamma_{j+1}) > 0$ for all $0 \le j < n$. So $\gamma \colon x \to y$ is a path from $x$ to $y$. :) X

Solution to Exercise 4.9 :( The first bullet is immediate.

For the second bullet, just compute the telescopic sum:
\[ \int_\gamma \nabla f = \sum_{j=0}^{|\gamma|-1} \big( f(\gamma_j) - f(\gamma_{j+1}) \big) = f(\gamma_0) - f(\gamma_{|\gamma|}) . \]
:) X

Solution to Exercise 4.10 :( If $I$ is a current then it satisfies Kirchhoff's cycle law by Exercise 4.9. If $F$ is a flow from $A$ to $Z$ satisfying Kirchhoff's cycle law, then Proposition 4.4.2 tells us that $F = \nabla v$ for some $v$. For any $x \notin A \cup Z$ we have that
\[ \Delta v(x) = \tfrac12 \operatorname{div} F(x) = 0 , \]
which shows that $v$ is a voltage from $A$ to $Z$, implying that $F$ is a current. :) X

Solution to Exercise 4.11 :( As in the proof of Theorem 4.5.4, take $b = \sup_{x \notin A \cup Z} |v(x)| < \infty$, and note that for $K > b$ we have
\[ \mathbb{E}^x \big[ |v(X_{T \wedge t})| \mathbf{1}_{\{|v(X_{T \wedge t})| > K\}} \big] \le \mathbb{E}^x \big[ |v(X_T)| \mathbf{1}_{\{|v(X_T)| > K\}} \big] \le \mathbb{E}^x \big[ |v(X_T)| \big] < \infty , \]
so that $(v(X_{T \wedge t}))_t$ is a uniformly integrable martingale. Since $T < \infty$ a.s., by the OST we have $v(x) = \mathbb{E}^x[v(X_T)]$ for all $x$. If $\sup_x v(x) = v(z) < \infty$ for some $z$, then since $v(z) = \mathbb{E}^z[v(X_T)]$, there must exist some $y$ such that $\mathbb{P}^z[X_T = y] > 0$ and $v(y) \ge v(z)$. The minimum principle for $v$ follows by applying the maximum principle to $-v$. :) X

Solution to Exercise 4.12 :( Let $v$ be a voltage from $A$ to $Z$ such that $v(x) = 0$ for all $x \in A \cup Z$. Since $G \setminus (A \cup Z)$ is finite, we get that $v \in \ell^2(G,c)$. Note that $\Delta v(x) \cdot v(x) = 0$ for all $x \in G$. Thus,
\[ \mathcal{E}(v,v) = 2 \langle \Delta v, v \rangle = 2 \sum_x c_x \Delta v(x) v(x) = 0 . \]
But
\[ 0 = \mathcal{E}(v,v) = \sum_{x,y} c(x,y) |v(x) - v(y)|^2 , \]
implying that $v$ is constant (because $(G,c)$ is connected). So $v \equiv 0$.

Let $u, v$ be two voltages from $A$ to $Z$. Then $u - v$ is a voltage from $A$ to $Z$ that is identically $0$ on $A \cup Z$. So we get that $v = u$. :) X

Solution to Exercise 4.13 :( Just note that $u = \frac{v}{v(a)}$ is a unit voltage from $a$ to $Z$.

:) X

Solution to Exercise 4.14 :( Compute:
\[ \mathcal{E}(v,v) = 2 \langle \Delta v, v \rangle = 2 \sum_x c_x \Delta v(x) v(x) = 2 c_a \Delta v(a) . \]
:) X

Solution to Exercise 4.15 :( Consider $v(x) = \mathbb{P}^x[T_a < T_Z]$. It is simple to verify using the Markov property that $v$ is a voltage from $a$ to $Z$. Moreover, $v(a) = 1$ and $v(z) = 0$ for all $z \in Z$. So $v$ is the unique unit voltage from $a$ to $Z$. Since $v(a) - v(x) = 1 - \mathbb{P}^x[T_a < T_Z] = \mathbb{P}^x[T_Z < T_a]$,
\[ C_{\mathrm{eff}}(a,Z) = c_a \Delta v(a) = c_a \sum_x P(a,x) (v(a) - v(x)) = c_a \sum_x P(a,x)\, \mathbb{P}^x[T_Z < T_a] = c_a\, \mathbb{P}^a[T_Z < T_a^+] . \]
In the last line we have used the Markov property at time 1. :) X

Solution to Exercise 4.16 :( Let $(Z_n)_n$ be a decreasing sequence $Z_n \supset Z_{n+1}$ of subsets such that $|G \setminus Z_n| < \infty$ for all $n$, and such that $a \notin Z_0$. Thus,
\[ C_{\mathrm{eff}}(a, \infty \mid c) = \lim_{n\to\infty} C_{\mathrm{eff}}(a, Z_n \mid c) \le \lim_{n\to\infty} C_{\mathrm{eff}}(a, Z_n \mid c') = C_{\mathrm{eff}}(a, \infty \mid c') . \]
Since effective conductance to infinity is positive if and only if the network is transient, we have that if $(G,c)$ is transient, then $0 < C_{\mathrm{eff}}(a, \infty \mid c) \le C_{\mathrm{eff}}(a, \infty \mid c')$, so that $(G,c')$ is also transient. :) X

Solution to Exercise 4.17 :( Because $c'(x,y) \le c(x,y)$ for all $x, y \in G$, this is a direct application of Rayleigh monotonicity, Theorem 4.7.3. :) X

Solution to Exercise 4.18 :( (1) $\Rightarrow$ (2) is trivial, and so is (3) $\Rightarrow$ (1). For (2) $\Rightarrow$ (4), recall that $g(x,y) = \mathbb{P}^x[T_y < \infty] \cdot g(y,y)$ for all $x, y$. Since $(G,c)$ is connected, $\mathbb{P}^x[T_y < \infty] > 0$ for all $x, y$. Thus, if $g(x,y) < \infty$ for some $x, y$, then $g(y,y) < \infty$. But $g(y,y) = \frac{1}{\mathbb{P}^y[T_y^+ = \infty]}$, so it must be that $\mathbb{P}^y[T_y^+ = \infty] > 0$, implying that $(G,c)$ is transient.

Finally, for (4) $\Rightarrow$ (3), assume that $(G,c)$ is transient, and let $x, y \in G$. Since $g(x,y) \le g(y,y)$ it suffices to prove that $g(y,y) < \infty$. As $g(y,y) = \frac{1}{\mathbb{P}^y[T_y^+ = \infty]}$, this follows from transience because $\mathbb{P}^y[T_y^+ = \infty] > 0$.

:) X

Solution to Exercise 4.19 :( If the random walk is transient then $g_Z(x,y) \le g(x,y) < \infty$. If the random walk is recurrent, then we have seen that
\[ g_Z(x,y) \le g_Z(y,y) = \frac{\mathbf{1}_{\{y \notin Z\}}}{\mathbb{P}^y[T_Z < T_y^+]} < \infty , \]
using that $\mathbb{P}^y[T_Z < T_y^+] > 0$. :) X

Solution to Exercise 4.20 :( Erasing the loops of $\gamma$ provides such a path $\alpha$, as follows. Let $A = \{\gamma_j : j \ge 0\}$ be the set of points visited by $\gamma$. For every $x \in A$ define
\[ t(x) = \min\{j \ge 0 : \gamma_j = x\} , \qquad \ell(x) = \max\{j \ge 0 : \gamma_j = x\} , \]
which are well defined because $\gamma$ is a transient path. Note that $\gamma_{t(x)} = \gamma_{\ell(x)} = x$. By definition, $\ell(\gamma_j) \ge j$.

Define inductively $\alpha_0 = a$ and, for all $n > 0$, $\alpha_n = \gamma_{\ell(\alpha_{n-1})+1}$. We have that $(\alpha_n)_{n=0}^\infty$ is a subsequence of $(\gamma_j)_{j=0}^\infty$. Note that
\[ \ell(\alpha_n) = \ell\big( \gamma_{\ell(\alpha_{n-1})+1} \big) \ge \ell(\alpha_{n-1}) + 1 > \ell(\alpha_{n-1}) . \]
If $\alpha_n = \alpha_k$ for some $n > k$, then we would have $\ell(\alpha_n) = \ell(\alpha_k)$, contradicting the above; so $\alpha_n \ne \alpha_k$ for all $n \ne k$. Also, for any $n$, note that $\alpha_{n+1} = \gamma_{\ell(\alpha_n)+1}$ and $\gamma_{\ell(\alpha_n)} = \alpha_n$, so that $(\alpha_n, \alpha_{n+1})$ is always an edge, and one that is traversed by $\gamma$ as well. We conclude that $\alpha$ is an infinite simple path with the required properties. :) X

Solution to Exercise 4.21 :( For $\gamma \in \Gamma_a^*$, fix some infinite simple path $\tilde\gamma \in \Gamma_a$ such that for any $x, y$ with $c(x,y) > 0$: if $\tilde\gamma_j = x$, $\tilde\gamma_{j+1} = y$, then $\gamma_k = x$, $\gamma_{k+1} = y$ for some $k$.

Now, given the measure $\lambda^*$, define the measure $\lambda$ by taking a random $\alpha$ with law $\lambda^*$, and letting $\lambda$ be the law of $\tilde\alpha$. For two independent such paths $\alpha, \beta$, we have that
\[ (x,y) \in E(\tilde\alpha, \tilde\beta) \ \Rightarrow \ (x,y) \in E(\alpha, \beta) , \]
so that a.s. $R(\tilde\alpha, \tilde\beta) \le R(\alpha, \beta)$. Thus, $\lambda$ is SIT. :) X

Solution to Exercise 4.22 :( Let $\lambda$ be a SIT measure on $\Gamma_x$. Let $\alpha, \beta$ be two independent paths in $\Gamma_x$ with law $\lambda$. Let $\gamma \colon y \to x$ be some finite path (using that $(G,c)$ is connected). Define $\tilde\alpha = \gamma\alpha$ and $\tilde\beta = \gamma\beta$. Then $\tilde\alpha, \tilde\beta$ are independent paths in $\Gamma_y^*$ with
\[ R(\tilde\alpha, \tilde\beta) \le \sum_{j=1}^{|\gamma|} r(\gamma_{j-1}, \gamma_j) + R(\alpha, \beta) . \]
By Exercise 4.21 we get that there exists a SIT measure on $\Gamma_y$. :) X

Solution to Exercise 4.23 :( If $(G,c)$ is transient, then that $(G,c')$ is transient just follows from Rayleigh monotonicity (Theorem 4.7.3).

If $(G,c')$ is transient, then there exists a SIT measure $\lambda$ on paths in $(G,c')$ started at some $a \in G$. Note that since $E(G,c) = E(G,c')$, paths in $(G,c')$ are also paths in $(G,c)$. If $\alpha, \beta$ are two independent paths sampled from the distribution $\lambda$, in the network $(G,c)$ we then have
\[ \mathbb{E}_{\lambda\otimes\lambda}[R(\alpha,\beta)] = \sum_{(x,y)\in E(G,c)} \tfrac{1}{c(x,y)}\, \mathbb{P}_{\lambda\otimes\lambda}\big[ (x,y) \in E(\alpha,\beta) \big] = \sum_{(x,y)\in E(G,c')} \frac{c(x,y)+\varepsilon}{c(x,y)} \cdot \tfrac{1}{c'(x,y)}\, \mathbb{P}_{\lambda\otimes\lambda}\big[ (x,y) \in E(\alpha,\beta) \big] \le \Big( 1 + \tfrac{\varepsilon}{\eta} \Big) \sum_{(x,y)\in E(G,c')} \tfrac{1}{c'(x,y)}\, \mathbb{P}_{\lambda\otimes\lambda}\big[ (x,y) \in E(\alpha,\beta) \big] , \]
where $\eta = \inf_{(x,y)\in E(G,c)} c(x,y)$. Thus, when $\eta > 0$, we have that $\lambda$ is a SIT measure on the network $(G,c)$ as well, implying that $(G,c)$ is transient. :) X

Solution to Exercise 4.24 :( Let $G = \{0,1\}^*$ be the set of all finite words in the letters $0, 1$, including the empty word, denoted by $\star$. For a non-empty word $\omega = \omega_1 \cdots \omega_n$, write $\hat\omega = \omega_1 \cdots \omega_{n-1}$ for the unique word that is obtained by removing the last letter of $\omega$. Write $|\omega|$ for the length of the word $\omega$, with $|\star| = 0$. Define conductances $c(\omega, \omega 1) = c(\omega, \omega 0) = 2^{-|\omega|}$. Let $(X_t)_t$ be the random walk on the network $(G,c)$. One easily checks that

\[ \mathbb{P}\big[ |X_{t+1}| = n+1 \,\big|\, |X_t| = n \big] = \mathbb{P}\big[ |X_{t+1}| = n-1 \,\big|\, |X_t| = n \big] = \tfrac12 , \qquad \mathbb{P}\big[ |X_{t+1}| = 1 \,\big|\, |X_t| = 0 \big] = 1 . \]
So $(|X_t|)_t$ is a simple random walk on the natural numbers $\mathbb{N}$, and is easily seen to be recurrent.

Now, consider the conductances $c'(x,y) = c(x,y) + 1$ for any $(x,y) \in E(G,c)$. We will show this new network admits a SIT measure. Choose a random path $\alpha$ inductively as follows. Set $\alpha_0 = \star$. Given $\alpha_n$, let $\alpha_{n+1} = \alpha_n 1$ or $\alpha_{n+1} = \alpha_n 0$ with probability $\tfrac12$ each. Let $\lambda$ be the law of this random path. We claim that $\lambda$ is a SIT measure. Indeed, consider two independent samples $\alpha, \beta$. Note that if $\alpha_n \ne \beta_n$ then $\alpha_{n+1} \ne \beta_{n+1}$, by construction. So if $N = \min\{n : \alpha_n \ne \beta_n\}$, then since $c'(x,y) \ge 1$ for $(x,y) \in E(G,c')$,
\[ R(\alpha, \beta) \le \sum_{k=1}^N \frac{1}{c'(\alpha_{k-1}, \alpha_k)} \le N . \]
Also, since $\alpha, \beta$ are independent,
\[ \mathbb{P}[\alpha_{n+1} \ne \beta_{n+1} \mid \alpha_n = \beta_n] = \tfrac12 . \]
Thus,
\[ \mathbb{E}_{\lambda\otimes\lambda}[R(\alpha,\beta)] \le \mathbb{E}_{\lambda\otimes\lambda}[N] = 2 < \infty . \]
So $\lambda$ is indeed a SIT measure on $(G,c')$, and $(G,c')$ is transient.

:) X

Solution to Exercise 4.25 :( Let $\alpha, \beta \in \Gamma_a$ be two independent random paths with law $\lambda$. We denote $\{(x,y) \in \alpha\} = \{\exists\, j,\ (\alpha_j, \alpha_{j+1}) = (x,y)\}$. Since $\alpha$ must traverse some edge in $K_n$, by Cauchy–Schwarz,
\[ 1 \le \Big( \sum_{(x,y)\in K_n} \mathbb{P}_\lambda[(x,y) \in \alpha] \Big)^2 \le \sum_{(x,y)\in K_n} r(x,y) \big( \mathbb{P}_\lambda[(x,y) \in \alpha] \big)^2 \cdot \sum_{(x,y)\in K_n} c(x,y) . \]
As $(K_n)_n$ are pairwise disjoint,
\[ \mathbb{E}_{\lambda\otimes\lambda}[R(\alpha,\beta)] \ge \sum_n \sum_{(x,y)\in K_n} r(x,y) \big( \mathbb{P}_\lambda[(x,y) \in \alpha] \big)^2 \ge \sum_n \Big( \sum_{(x,y)\in K_n} c(x,y) \Big)^{-1} = \sum_n \frac{1}{c(K_n)} . \]

:) X

Solution to Exercise 4.26 :( For every $n \ge 1$, define
\[ K_n = \big\{ ((\varepsilon n, j), (\varepsilon(n+1), j)),\ ((j, \varepsilon n), (j, \varepsilon(n+1))) \ : \ -n \le j \le n ,\ \varepsilon \in \{-1, 1\} \big\} . \]
That is, $K_n$ is the collection of edges emanating from an $n \times n$ box. One easily checks that $(K_n)_n$ are pairwise disjoint cutsets for $0$, and that $c(K_n) = 4(2n+1)$. Since $\sum_n \frac{1}{c(K_n)} = \infty$, we conclude that if $\lambda$ is any probability measure on $\Gamma_0$, then by Exercise 4.25, $\lambda$ cannot be SIT. By Exercise 4.22 and Theorem 4.10.1 we see that $(\mathbb{Z}^2, \mu)$ is recurrent. :) X

Solution to Exercise 4.27 :( Consider conductances given by $c(x, x+1) = \big( \frac{p}{1-p} \big)^x$. Note that if $P$ is the transition matrix of the induced random walk, then
\[ P(x, x+1) = \frac{c(x,x+1)}{c(x,x+1) + c(x-1,x)} = \frac{p^x}{p^x + (1-p)\,p^{x-1}} = p = \mu(1) \]
and
\[ P(x, x-1) = \frac{c(x-1,x)}{c(x,x+1) + c(x-1,x)} = 1 - p = \mu(-1) . \]
So the induced random walk on $(\mathbb{Z}, c)$ is exactly the $\mu$-random walk.

Now consider the (degenerate) probability measure $\lambda$ on simple infinite paths in $(\mathbb{Z}, c)$ started at $0$, given by choosing (deterministically) the path $\gamma = (0, 1, 2, \ldots)$ if $p > \tfrac12$ and $\gamma = (0, -1, -2, \ldots)$ if $p < \tfrac12$. Note that for $q = \max\{p, 1-p\}$,
\[ R(\gamma, \gamma) = \sum_{n=0}^\infty \Big( \frac{1-q}{q} \Big)^n < \infty , \]
because $q > 1-q$. Thus, we have a SIT on $(\mathbb{Z}, c)$, implying that $(\mathbb{Z}, c)$ is transient by Theorem 4.10.1.

:) X
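A quick simulation is consistent with the transience just proved: for $p \ne \tfrac12$, a standard gambler's-ruin computation gives return probability $2\min\{p, 1-p\} < 1$. The sketch below is our illustration (the horizon of 2000 steps is a practical cutoff, not part of the argument), estimating the return frequency for $p = 0.7$:

```python
import random

def returns_to_zero(p, steps, rng):
    """Run a p-biased walk on Z from 0; report whether it revisits 0 within `steps`."""
    x = 0
    for _ in range(steps):
        x += 1 if rng.random() < p else -1
        if x == 0:
            return True
    return False

rng = random.Random(2)
freq = sum(returns_to_zero(0.7, 2000, rng) for _ in range(500)) / 500
# Expect freq close to 2 * min(p, 1 - p) = 0.6.
```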

Solution to Exercise 4.28 :( Let $f \in \ell^2(G,c)$ and let $\varepsilon > 0$. We know that $\sum_x c_x |f(x)|^2 < \infty$, so there exists a finite set $F \subset G$ such that $\sum_{x \notin F} c_x |f(x)|^2 < \varepsilon$. Define $g = f \mathbf{1}_F$. Then it is easy to check that $\|f - g\|_c \le \sqrt{\varepsilon}$. :) X

Solution to Exercise 4.36 :(

We have seen in Exercise 4.23 that if $(G,\nu)$ is transient then so is $(G,c)$. Since $(G,c)$ is recurrent, it must be that $(G,\nu)$ is recurrent as well. :) X

Solution to Exercise 4.37 :( The transition matrix is just $P(x,y) = \mu(x-y) = \tfrac12 \mathbf{1}_{\{x-y \in \{-1,1\}\}}$.

To prove (4.1), we write $X_t = X_0 + \sum_{k=1}^t U_k$ where $(U_k)_{k\ge1}$ are i.i.d.-$\mu$ elements. Set $R_t = \sum_{k=1}^t \mathbf{1}_{\{U_k = 1\}}$ and $L_t = \sum_{k=1}^t \mathbf{1}_{\{U_k = -1\}}$. So $R_t + L_t = t$ and $X_t = X_0 + R_t - L_t$. Thus,
\[ \mathbb{P}[X_t = 0] = \mathbb{P}[R_t = L_t] = \mathbb{P}[R_t = t/2] = \begin{cases} 0 & \text{if } t = 2n+1 , \\ \binom{2n}{n} 2^{-2n} & \text{if } t = 2n , \end{cases} \]
because $R_t \sim \mathrm{Bin}(t, \tfrac12)$. Stirling's approximation tells us that for all $n \ge 1$,
\[ 1 < \frac{n!\, e^n}{\sqrt{2\pi n}\, n^n} < 2 . \]
Thus, for all $n \ge 1$,
\[ \tfrac14 (\pi n)^{-1/2} \le \binom{2n}{n} 2^{-2n} \le 2 (\pi n)^{-1/2} , \]
which gives (4.1).

Solution to Exercise 4.38 :( By Exercise 4.37, there exists $C > 0$ such that for all $n \ge 0$,
\[ C^{-1} (n+1)^{-1/2} \le \mathbb{P}[Y_{2n} = 0 \mid M] \le C (n+1)^{-1/2} . \]
Also, $\mathbb{P}[Y_{2n+1} = 0 \mid M] = 0$ for all $n \ge 0$.


Finally, for any $t \ge 0$, set $N(t) = n$ such that $t_n \le t < t_{n+1}$. It is simple to see that $N(t)$ has $\mathrm{Bin}(t, \tfrac23)$ distribution. Since $N(t)$ is a function of the set $M$, we conclude that
\[ \mathbb{P}[X_t = 0] = \sum_{n=0}^t \mathbb{P}[N(t) = n] \cdot \mathbb{P}[Y_n = 0] = \sum_{0 \le n \le t/2} \mathbb{P}[N(t) = 2n] \cdot \mathbb{P}[Y_{2n} = 0] . \]
Thus,
\[ C^{-1}\, \mathbb{E}\big[ \mathbf{1}_{\{N(t)\ \text{is even}\}} (N(t)+1)^{-1/2} \big] \le \mathbb{P}[X_t = 0] \le C\, \mathbb{E}\big[ (N(t)+1)^{-1/2} \big] . \]
These last quantities are bounded as follows. For the lower bound, note that for any $t > 1$,
\[ \mathbb{P}[N(t)\ \text{is even}] \ge \mathbb{P}[N(t-1)\ \text{is even},\ U_t = 0] + \mathbb{P}[N(t-1)\ \text{is odd},\ U_t \ne 0] = \tfrac13\, \mathbb{P}[N(t-1)\ \text{is even}] + \tfrac23\, \mathbb{P}[N(t-1)\ \text{is odd}] = \tfrac23 - \tfrac13\, \mathbb{P}[N(t-1)\ \text{is even}] \ge \tfrac13 . \]
Thus,
\[ \mathbb{P}[X_t = 0] \ge C^{-1} \tfrac13 (t+1)^{-1/2} . \]
For the upper bound, by Chebyshev's inequality,
\[ \mathbb{P}\big[ N(t) < \tfrac13 t \big] \le \tfrac{2}{t} , \]
so that
\[ \mathbb{P}[X_t = 0] \le C\, \mathbb{E}\big[ (N(t)+1)^{-1/2} \big] \le C\, \mathbb{P}\big[ N(t) \ge \tfrac13 t \big] \big( \tfrac13 t + 1 \big)^{-1/2} + C\, \mathbb{P}\big[ N(t) < \tfrac13 t \big] \le C \sqrt{3}\, (t+3)^{-1/2} + \frac{2C}{t} . \]
:) X
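The exact return probabilities behind (4.1) are easy to compute directly; the sketch below (function name ours) evaluates $\mathbb{P}[X_{2n} = 0] = \binom{2n}{n} 4^{-n}$ via the numerically stable product $\prod_{k \le n} \frac{2k-1}{2k}$, and checks the $(\pi n)^{-1/2}$ decay rate.

```python
from math import sqrt, pi

def return_prob(n):
    """Exact P[X_{2n} = 0] = C(2n, n) / 4^n for simple random walk on Z,
    computed as (1/2)(3/4)...((2n-1)/(2n)) to avoid huge intermediate values."""
    p = 1.0
    for k in range(1, n + 1):
        p *= (2 * k - 1) / (2 * k)
    return p

ratios = {n: return_prob(n) * sqrt(pi * n) for n in (10, 100, 1000)}
# The ratios approach 1 from below, consistent with P[X_{2n} = 0] ~ (pi n)^{-1/2}.
```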

Solution to Exercise 4.39 :( We have that $S$ generates $\mathbb{Z}^d$, as it contains the standard basis. Also, $\mu$ is obviously symmetric, and has finite support, so it has an exponential tail as well.

To show that $X_t^1, \ldots, X_t^d$ are independent Markov chains with transition matrix $P$, one notes that for any $\vec{z}_t = (x_t^1, \ldots, x_t^d) \in \mathbb{Z}^d$,
\[ \mathbb{P}[Z_{t+1} = \vec{z}_{t+1} \mid Z_t = \vec{z}_t] = \mu(\vec{z}_{t+1} - \vec{z}_t) = \mathbf{1}_{\{\vec{z}_{t+1} - \vec{z}_t \in S\}}\, 3^{-d} = \prod_{j=1}^d \mathbf{1}_{\{x_{t+1}^j - x_t^j \in \{-1,0,1\}\}} \tfrac13 = \prod_{j=1}^d P\big( x_t^j, x_{t+1}^j \big) . \]
This immediately gives the identity
\[ \mathbb{P}[Z_t = 0] = \prod_{j=1}^d \mathbb{P}\big[ X_t^j = 0 \big] = \big( \mathbb{P}[X_t^1 = 0] \big)^d . \]
Thus, using Exercise 4.38, we have that when $d > 2$,
\[ \sum_{t=0}^\infty \mathbb{P}[Z_t = 0] \le 1 + C \sum_{t=1}^\infty t^{-d/2} < \infty , \]
implying that $\mathbb{Z}^d$ is transient for $d > 2$. For $d \le 2$ we find that
\[ \sum_{t=0}^\infty \mathbb{P}[Z_t = 0] \ge C^{-1} \sum_{t=1}^\infty t^{-d/2} = \infty , \]
implying that $\mathbb{Z}^d$ is recurrent for $d \le 2$. :) X
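The dichotomy in Exercise 4.39 can be checked exactly by dynamic programming: one coordinate takes i.i.d. uniform steps in $\{-1, 0, 1\}$, and $\mathbb{P}[Z_t = 0] = p_t^d$. The truncated Green function below (our illustration; the horizon $T = 200$ is arbitrary) stays bounded for $d = 3$ but grows with the horizon for $d \le 2$.

```python
def coord_return_probs(T):
    """p[t] = P[X_t = 0] for the walk on Z with i.i.d. uniform steps in {-1, 0, 1}."""
    dist, probs = {0: 1.0}, [1.0]
    for _ in range(T):
        new = {}
        for x, q in dist.items():
            for s in (-1, 0, 1):
                new[x + s] = new.get(x + s, 0.0) + q / 3.0
        dist = new
        probs.append(dist.get(0, 0.0))
    return probs

p = coord_return_probs(200)
green = {d: sum(pt ** d for pt in p) for d in (1, 2, 3)}
# Sum_t P[Z_t = 0] diverges as T grows for d <= 2, but converges for d = 3.
```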


Solution to Exercise 4.40 :( The first two assertions are very easy. The third assertion follows from the following computation. For $f, g \in \ell^2(G,c)$ with $fg \in \ell^2(G,c)$, and for any $x \in G$,
\[ \sum_y P(x,y) (f(x) - f(y)) \overline{(g(x) - g(y))} = \bar{g}(x) \cdot \Delta f(x) - f(x) \cdot P\bar{g}(x) + P(f\bar{g})(x) = \big( \bar{g} \cdot \Delta f + f \cdot \Delta\bar{g} - \Delta(f\bar{g}) \big)(x) = 2\Gamma(f,g)(x) . \]
The final assertion follows from
\[ \mathcal{E}(f,g) = \sum_{x,y} c(x,y) (f(x) - f(y)) \overline{(g(x) - g(y))} = 2 \sum_x c_x \cdot \tfrac12 \sum_y P(x,y) (f(x) - f(y)) \overline{(g(x) - g(y))} = 2 \langle \Gamma(f,g), \mathbf{1} \rangle_c . \]
:) X

Solution to Exercise 4.41 :( Compute:
\[ 4 \langle \Gamma(f,g), g \rangle_c = \sum_x c_x \bar{g}(x) \sum_y P(x,y) (f(x) - f(y))(g(x) - g(y)) + \sum_y c_y \bar{g}(y) \sum_x P(y,x) (f(y) - f(x))(g(y) - g(x)) = \sum_{x,y} c(x,y) (f(x) - f(y))(g(x) - g(y))(g(x) + g(y)) = \mathcal{E}\big( f, g^2 \big) . \]
:) X

Solution to Exercise 4.42 :( This is just a straightforward application of Cauchy–Schwarz. :) X

Solution to Exercise 4.43 :( We will make repeated use of the following estimates: for any $r \ge 1$,
\[ \sum_{n=r}^\infty n^{-\alpha} \ge \int_r^\infty \xi^{-\alpha}\, d\xi = \tfrac{1}{\alpha-1}\, r^{1-\alpha} , \qquad \sum_{n=r}^\infty n^{-\alpha} \le r^{-\alpha} + \int_r^\infty \xi^{-\alpha}\, d\xi \le \tfrac{\alpha}{\alpha-1}\, r^{1-\alpha} . \]
We have that
\[ \mathbb{E}\big[ U_1^{1-\varepsilon} \big] = \frac{1}{\zeta(\alpha)} \sum_{n=1}^\infty n^{1-\varepsilon-\alpha} , \]
which converges for $\varepsilon > 2-\alpha$ and is infinite for $\varepsilon \le 2-\alpha$.

Also, setting $p = \mathbb{P}[U_1 \ge n^\delta]$ we have that $B_n(\delta) \sim \mathrm{Bin}(n,p)$. Note that
\[ 1 \le \zeta(\alpha)(\alpha-1)\, p\, n^{\delta(\alpha-1)} \le \alpha . \]
By Chebyshev's inequality with $\lambda = \tfrac12 \mathbb{E}[B_n(\delta)]$,
\[ \mathbb{P}\big[ |B_n(\delta) - np| > \lambda \big] \le \frac{np(1-p)}{(np/2)^2} \le \frac{4}{np} . \]
Note that $T_n \ge B_n(\delta) \cdot n^\delta$, so
\[ \mathbb{P}\big[ T_n < \tfrac12 np \cdot n^\delta \big] \le \mathbb{P}\big[ B_n(\delta) < \tfrac12 np \big] \le \frac{4}{np} . \]
Since also $T_n \ge n$ always, we find that (by perhaps updating the constant $c(\alpha) > 0$)
\[ \mathbb{E}\big[ \tfrac{1}{T_n} \big] \le 4 (np)^{-1} n^{-1} + 2 (np)^{-1} n^{-\delta} \le c(\alpha)\, n^{-1+\delta(\alpha-1)} \big( n^{-1} + n^{-\delta} \big) . \]
Choosing $\delta = 1$ completes the proof. :) X

Solution to Exercise 4.44 :( Let $\mu(x,y) = \mathbb{P}\big[ (X_{T_1}, Y_{T_1}) = (x,y) \big]$. It is immediate from the independence of $(U_k)_k$, $(X_t)_t$, $(Y_t)_t$ that $(Z_n)_n$ is a $\mu$-random walk. It is also easy to see that $\mu$ is symmetric.

To show that $\mu$ has the proper moments, let $\varepsilon > 2-\alpha$. Consider the process $M_t = |X_t|^2 - t$, which is easily seen to be a martingale. So by Jensen's inequality,
\[ \mathbb{E}\big[ |X_t|^{2-2\varepsilon} \big] \le \big( \mathbb{E}\big[ |X_t|^2 \big] \big)^{1-\varepsilon} = t^{1-\varepsilon} . \]
Similarly $\mathbb{E}\big[ |Y_t|^{2-2\varepsilon} \big] \le t^{1-\varepsilon}$. Finally, since $|(x,y)| \le |x| + |y|$ for all $(x,y) \in \mathbb{Z}^2$, we get that
\[ |Z_1|^{2-2\varepsilon} \le \big( 2|X_{T_1}|^2 + 2|Y_{T_1}|^2 \big)^{1-\varepsilon} \le 2 \big( |X_{T_1}|^{2-2\varepsilon} + |Y_{T_1}|^{2-2\varepsilon} \big) . \]
Since $T_1$ is independent of $(X_t)_t$ and $(Y_t)_t$, we have that
\[ \mathbb{E}\big[ |Z_1|^{2-2\varepsilon} \big] \le 4\, \mathbb{E}\big[ T_1^{1-\varepsilon} \big] < \infty , \]
by Exercise 4.43, as long as $\varepsilon > 2-\alpha$. Thus, $\mu \in SA(\mathbb{Z}^2, 2-\varepsilon)$ for all $\varepsilon > 4-2\alpha$.

Exercise 4.37 tells us that $\mathbb{P}[(X_t, Y_t) = (0,0)] \le C t^{-1}$ for some constant $C > 0$ and all $t \ge 1$. By Exercise 4.43, for some constants $C' = C'(\alpha) > 0$ and $\beta = \beta(\alpha) > 1$,
\[ \mathbb{P}[Z_n = \vec{0}\,] \le \mathbb{E}\big[ \mathbb{P}[(X_{T_n}, Y_{T_n}) = (0,0) \mid T_n] \big] \le C\, \mathbb{E}\big[ \tfrac{1}{T_n} \big] \le C' n^{-\beta} . \]
As this is summable over $n$, we have the transience of $\mu$. :) X

Solution to Exercise 4.45 :( Let $\tilde{R}_t = \{X_1, \ldots, X_t\}$ (so $\tilde{R}_0 = \emptyset$). Let $T_x^+ = \inf\{t > 0 : X_t = x\}$. If $x \ne 1$ then we have that
\[ \mathbb{P}^1\big[ T_x^+ = t \big] = \mathbb{P}^x\big[ T_x^+ > t ,\ X_t = 1 \big] = \mathbb{P}^1\big[ T_1^+ > t ,\ X_t = x^{-1} \big] . \]
The first equality comes from reversing paths: the paths $s_1 \cdots s_t$ and $(s_t)^{-1} \cdots (s_1)^{-1}$ have the same distribution because $\mu$ is symmetric. The second comes from the fact that we can translate the starting point by $x^{-1}$.

Now, the event $X_t \notin \tilde{R}_{t-1}$ is the event that there exists $x$ such that $T_x^+ = t$. Thus,
\[ \mathbb{P}^1\big[ X_t \notin \tilde{R}_{t-1} \big] = \mathbb{P}^1\big[ T_1^+ = t \big] + \sum_{x \ne 1} \mathbb{P}^1\big[ T_x^+ = t \big] = \mathbb{P}^1\big[ T_1^+ = t \big] + \sum_{x \ne 1} \mathbb{P}^1\big[ T_1^+ > t ,\ X_t = x^{-1} \big] = \mathbb{P}^1\big[ T_1^+ \ge t \big] . \]
Since $\mathbb{P}^1[T_1^+ \ge t] \to \mathbb{P}[T_1^+ = \infty]$, the Cesàro limit is the same:
\[ \frac1t\, \mathbb{E}\big[ |\tilde{R}_t| \big] = \frac1t \sum_{k=1}^t \mathbb{P}\big[ X_k \notin \tilde{R}_{k-1} \big] \to \mathbb{P}\big[ T_1^+ = \infty \big] . \]
Since $0 \le |R_t| - |\tilde{R}_t| \le 1$, we are done. :) X

Solution to Exercise 4.46 :( Let $(U_t)_{t=1}^\infty$ be i.i.d.-$\mu$. Then $X_t = \sum_{k=1}^t U_k$ is a $\mu$-random walk. Fix $\varepsilon > 0$. Let
\[ \tau_\varepsilon = \inf\{t \ge 0 : \forall\, s \ge t ,\ |X_s| \le \varepsilon s\} , \]
with $\inf \emptyset = \infty$. The law of large numbers tells us that $\tfrac1t X_t \to 0$ a.s., which implies that $\mathbb{P}[\tau_\varepsilon < \infty] = 1$. Note that a.s.
\[ |R_t| \le \tau_\varepsilon + \big| \mathbb{Z} \cap [-t\varepsilon, t\varepsilon] \big| \le \tau_\varepsilon + 2t\varepsilon + 1 . \]
Thus,
\[ \mathbb{P}\Big[ \limsup_{t\to\infty} \tfrac1t |R_t| > 2\varepsilon \Big] \le \mathbb{P}[\tau_\varepsilon = \infty] = 0 . \]
As $\varepsilon > 0$ was arbitrary, we have that $\tfrac1t |R_t| \to 0$ a.s. As $|R_t| \le t+1$, dominated convergence implies that
\[ \mathbb{P}\big[ T_1^+ = \infty \big] = \lim_{t\to\infty} \frac1t\, \mathbb{E}\big[ |R_t| \big] = 0 . \]
So the random walk is recurrent. :) X

Part II: Results and Applications


5 Growth, Dimension, and Heat Kernel


In this chapter we will connect the volume growth of the group to the decay of the probabilities $P^t(x,y)$. The quantity $P^t(x,y)$ is sometimes called the heat kernel, where the term off-diagonal is used for $y \ne x$ and on-diagonal is used when $x = y$. One is usually interested in the asymptotics of this sequence as $t \to \infty$ for fixed $x, y$. We will see that the growth of the group provides bounds for the decay of the heat kernel. This should be contrasted with the Varopoulos–Carne bound (Theorem 5.6.1), which gives the asymptotics as $\mathrm{dist}(x,y) \to \infty$.

The main results in this chapter are:

• Kesten's amenability criterion (Theorem 5.2.4), which states that a finitely generated group is non-amenable if and only if the Laplacian of some symmetric adapted random walk has a strictly positive spectral gap.
• The Coulhon–Saloff-Coste inequality (Theorem 5.3.1), relating the growth rate of a finitely generated group to the isoperimetric profile.
• Nash-type inequalities relating isoperimetry to bounds on the decay of $P^t(x,y)$ as $t \to \infty$.
• The Varopoulos–Carne bound (Theorem 5.6.1), controlling the decay of $P^t(x,y)$ as $\mathrm{dist}(x,y) \to \infty$.

5.1 Amenability

Let $G$ be a finitely generated group, with finite symmetric generating set $S$.

Definition 5.1.1  Define the Cheeger constant as
\[ \Phi_S := \inf_{A \subset G \,:\, 0 < |A| < \infty} \frac{|\partial A|}{|A|} . \]

Exercise 5.11  Suppose $\tilde\rho > 0$ is some constant such that for any finitely supported real-valued function $f \colon G \to \mathbb{R}$ we have $\|Pf\|_c \le \tilde\rho \|f\|_c$. Show that this holds for all $f \in \ell^2(G,c)$, implying that $\|P\| \le \tilde\rho$.

B solution C
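For intuition: on $\mathbb{Z}$ with $S = \{\pm 1\}$, intervals have boundary of constant size, so the infimum above is $0$ (amenability). A tiny check (our code, not the book's; it uses the outer vertex boundary as the notion of $\partial A$, which is one standard convention):

```python
def boundary_ratio(n):
    """|dA| / |A| for A = {0, ..., n-1} in Z with generators S = {+1, -1},
    where dA is the outer vertex boundary {x not in A : x = a + s, a in A, s in S}."""
    A = set(range(n))
    boundary = {a + s for a in A for s in (1, -1)} - A
    return len(boundary) / len(A)

ratios = [boundary_ratio(n) for n in (10, 100, 1000)]
# -> [0.2, 0.02, 0.002]: the ratio tends to 0, so the Cheeger constant of Z is 0.
```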

Exercise 5.12  Let $(G,c)$ be a connected network, and let $P(x,y) = \frac{c(x,y)}{c_x}$ be the corresponding transition matrix. Show that
\[ \rho(P) = \sup_{0 \ne f \in \ell^0(G,c)} \frac{\langle Pf, f \rangle_c}{\langle f, f \rangle_c} = \sup_{\substack{f \colon G \to \mathbb{R} \\ 0 \ne f \in \ell^0(G,c)}} \frac{\langle Pf, f \rangle_c}{\langle f, f \rangle_c} . \]

B solution C

Exercise 5.13  Let $(G,c)$ be a connected network, and let $P(x,y) = \frac{c(x,y)}{c_x}$ be the corresponding transition matrix. Let $h \in \ell^2(G,c)$ be such that $Ph = \lambda h$. Show that $|\lambda| \le \rho(P)$. Show that if $\lambda < \rho(P)$ and $h$ is nonnegative, then $h \equiv 0$.

B solution C

We now move to prove some relationship between the spectral radius $\rho$ and amenability.

Proposition 5.2.3 ($\ell^1$-Sobolev inequality)  Let $(G,c)$ be a network and let $\Phi = \Phi_{(G,c)}$ be the Cheeger constant. Then, for any $f \colon G \to \mathbb{R}$ of finite support we have that
\[ \Phi \cdot \sum_x c_x |f(x)| \le \sum_{x,y} c(x,y) |f(x) - f(y)| . \]

Proof  Recall that $c(A) = \sum_{a \in A} c_a$. For $t > 0$ let $S_t = \{x : f(x) > t\}$, which is a finite set, because $f$ has finite support. Note that
\[ \int_0^\infty c(S_t)\, dt = \sum_x c_x f(x) \mathbf{1}_{\{f(x) \ge 0\}} . \]
We have
\[ \int_0^\infty \mathbf{1}_{\{f(y) \le t < f(x)\}}\, dt = |f(x) - f(y)|\, \mathbf{1}_{\{f(x) \ge f(y)\}} . \]


Also,
\[ \sum_{x,y} c(x,y) |f(x) - f(y)| \mathbf{1}_{\{f(x) \ge f(y)\}} = \tfrac12 \sum_{x,y} c(x,y) |f(x) - f(y)| \mathbf{1}_{\{f(x) \ge f(y)\}} + \tfrac12 \sum_{x,y} c(y,x) |f(y) - f(x)| \mathbf{1}_{\{f(y) \ge f(x)\}} = \tfrac12 \sum_{x,y} c(x,y) |f(x) - f(y)| . \]
Thus,
\[ \Phi \cdot \sum_x c_x f(x) \mathbf{1}_{\{f(x) \ge 0\}} = \int_0^\infty \Phi \cdot c(S_t)\, dt \le \int_0^\infty \sum_{x \in S_t,\ y \notin S_t} c(x,y)\, dt = \tfrac12 \sum_{x,y} c(x,y) |f(x) - f(y)| . \]
Together with the same bound applied to $-f$, this completes the proof.

Theorem 5.3.3 (Sobolev inequality)  Let $(G,c)$ be a network, let $d \ge 1$, and set $p = \frac{d}{d-1}$. The Sobolev inequality
\[ \|f\|_p \le \tfrac{1}{\kappa} \sum_{x,y} |\nabla f(x,y)| \qquad \text{for all } f \in \ell^0(G,c) \tag{5.2} \]
holds for some constant $\kappa > 0$ if and only if the $d$-dimensional isoperimetric inequality
\[ c(A, A^c) \ge \kappa_d\, c(A)^{(d-1)/d} \qquad \text{for all } A \subset G ,\ |A| < \infty , \tag{5.3} \]
holds for some constant $\kappa_d > 0$.

Proof  For a finite subset $A \subset G$, note that
\[ \|\mathbf{1}_A\|_p^p = \sum_x c_x |\mathbf{1}_A(x)|^p = c(A) , \qquad \sum_{x,y} |\nabla \mathbf{1}_A(x,y)| = 2 \sum_{x,y} c(x,y) \mathbf{1}_{\{x \in A,\ y \notin A\}} = 2\, c(A, A^c) . \]
So if (5.2) holds, applying it to $f = \mathbf{1}_A$ yields (5.3).

For the converse, assume (5.3). Since $\|f\|_p = \| |f| \|_p$ and $|\nabla |f|| \le |\nabla f|$, it suffices to consider $f \ge 0$. For $t \ge 0$ let $S_t = \{x : f(x) > t\}$; $S_t$ is a finite set since $f$ has finite support.


Using the Fubini–Tonelli theorem,
\[ \sum_{x,y} c(x,y) |f(y) - f(x)| = 2 \sum_{x,y:\ f(y) < f(x)} c(x,y) (f(x) - f(y)) = 2 \int_0^\infty \sum_{x,y:\ f(y) < f(x)} c(x,y)\, \mathbf{1}_{[f(y), f(x))}(t)\, dt = 2 \int_0^\infty \sum_{x,y} c(x,y) \mathbf{1}_{\{x \in S_t,\ y \notin S_t\}}\, dt = 2 \int_0^\infty c(S_t, S_t^c)\, dt \ge 2\kappa_d \int_0^\infty c(S_t)^{(d-1)/d}\, dt , \]
by (5.3). Set $\varphi(t) = c(S_t)^{1/p}$, which is nonincreasing in $t$. Then
\[ \|f\|_p^p = \sum_x c_x f(x)^p = \sum_x c_x \int_0^\infty \mathbf{1}_{\{f(x) > t\}}\, p t^{p-1}\, dt = \int_0^\infty \varphi(t)^p\, p t^{p-1}\, dt \le p \int_0^\infty \Big( \int_0^t \varphi(x)\, dx \Big)^{p-1} \varphi(t)\, dt \le p \Big( \int_0^\infty \varphi(x)\, dx \Big)^{p-1} \int_0^\infty \varphi(t)\, dt = p \Big( \int_0^\infty c(S_t)^{1/p}\, dt \Big)^p , \]
again by Fubini–Tonelli (using $\int_0^t \varphi \ge t \varphi(t)$ in the first inequality). In conclusion, for all $f \in \ell^0(G,c)$,
\[ \|f\|_p \le p^{1/p} \int_0^\infty c(S_t)^{1/p}\, dt \le \frac{p^{1/p}}{2\kappa_d} \sum_{x,y} |\nabla f(x,y)| , \]
establishing (5.2).

5.4 Nash Inequality

We now proceed to consider the isoperimetric dimension of a network with a finer resolution.

Definition 5.4.1  Let $I \colon [0,\infty) \to [0,\infty)$ be a nondecreasing function. Let $\kappa > 0$ be a positive real number. We say that a network $(G,c)$ satisfies an $(I,\kappa)$-isoperimetric inequality (or sometimes we just say that $(G,c)$ is $(I,\kappa)$-isoperimetric) if for any finite subset $A \subset G$, it holds that
\[ c(A, A^c) \ge \kappa \cdot I(c(A)) . \]


If $(G,c)$ satisfies an $(I,\kappa)$-isoperimetric inequality with $I(t) = t^{(d-1)/d}$, we say that $(G,c)$ satisfies a $d$-dimensional isoperimetric inequality. Note that $\dim_{\mathrm{iso}}(G,c) \ge d$ if and only if for any $n < d$, $(G,c)$ satisfies an $n$-dimensional isoperimetric inequality. So this is a finer notion than isoperimetric dimension.

Definition 5.4.2  Let $N \colon [0,\infty) \to [0,\infty)$ be a nondecreasing function. We say that $(G,c)$ satisfies an $N$-Nash inequality if for any $f \in \ell^0(G,c)$,
\[ \|f\|_2^2 \le N\Big( \frac{\|f\|_1^2}{\|f\|_2^2} \Big) \cdot \mathcal{E}(f,f) . \]

Recall that the Sobolev inequality (Theorem 5.3.3) gives a relationship between isoperimetric dimension and an inequality comparing $\|f\|_p$ to $\|\nabla f\|_1$. In a similar fashion, the following theorem relates isoperimetric inequalities to Nash inequalities.

Theorem 5.4.3  Suppose $(G,c)$ satisfies an $(I,\kappa)$-isoperimetric inequality, with $I$ such that $t \mapsto \frac{t}{I(t)}$ is nondecreasing. Then $(G,c)$ satisfies an $N$-Nash inequality, with
\[ N(t) = \frac{2}{\kappa^2} \cdot \frac{(4t)^2}{I(4t)^2} . \]

Proof  Note that since $\|f\|_p = \| |f| \|_p$ and $\mathcal{E}(f,f) \ge \mathcal{E}(|f|,|f|)$, it suffices to prove the inequality for nonnegative $f$. So let $f \in \ell^0(G,c)$ with $f \ge 0$. For $t \ge 0$ let $S_t = \{x : f(x) > t\}$. Note that $S_t$ decreases as $t$ increases, so $\frac{c(S_t)}{I(c(S_t))}$ decreases as well. Since $S_t$ is finite, we may now compute that
\[ f(x) = \int_0^\infty \mathbf{1}_{\{f(x) > t\}}\, dt = \int_0^\infty \mathbf{1}_{\{x \in S_t\}}\, dt , \qquad f(x) - f(y) = \int_0^\infty \big( \mathbf{1}_{\{x \in S_t,\ y \notin S_t\}} - \mathbf{1}_{\{y \in S_t,\ x \notin S_t\}} \big)\, dt . \]
Let $\lambda > 0$ and consider $f_\lambda := (f - \lambda)_+ = \max\{f - \lambda, 0\}$. So $f^2 \le f_\lambda^2 + 2\lambda f$ (because $f \ge 0$). Also, $\operatorname{supp}(f_\lambda) = S_\lambda$ and
\[ \mathcal{E}(f_\lambda, f_\lambda) = \sum_{x,y \in S_\lambda} c(x,y) |f(x) - f(y)|^2 \le \mathcal{E}(f,f) . \]
Thus, for any $\lambda > 0$,
\[ \|f\|_2^2 = \sum_x c_x f(x)^2 \le \|f_\lambda\|_2^2 + 2\lambda \|f\|_1 \le \tfrac12 N\big( \tfrac14 c(S_\lambda) \big) \cdot \mathcal{E}(f,f) + 2\lambda \|f\|_1 \le \tfrac12 N\Big( \frac{\|f\|_1}{4\lambda} \Big) \cdot \mathcal{E}(f,f) + 2\lambda \|f\|_1 , \]
using that $c(S_\lambda) \le \frac{\|f\|_1}{\lambda}$ and that $N$ is nondecreasing. We choose $\lambda = \frac{\|f\|_2^2}{4\|f\|_1}$, so that
\[ \|f\|_2^2 \le \tfrac12 N\Big( \frac{\|f\|_1^2}{\|f\|_2^2} \Big) \cdot \mathcal{E}(f,f) + \tfrac12 \|f\|_2^2 . \]
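As a sanity check, the Nash inequality of Theorem 5.4.3 can be tested numerically on $\mathbb{Z}$ with unit conductances, where finite nonempty sets satisfy $c(A, A^c) \ge 2$, i.e. an $(I,\kappa)$-isoperimetric inequality with $I \equiv 1$ and $\kappa = 2$, giving $N(t) = \frac{2}{\kappa^2}(4t)^2 = 8t^2$. The code below (ours; the test functions $f$ are arbitrary) verifies the inequality, with plenty of slack, for finitely supported $f$:

```python
def nash_holds(values, kappa=2.0):
    """Check ||f||_2^2 <= (2/kappa^2) * (4 ||f||_1^2 / ||f||_2^2)^2 * E(f, f)
    on Z with unit conductances (so c_x = 2 and I(t) = 1)."""
    f = dict(enumerate(values))
    l1 = sum(2 * abs(v) for v in f.values())       # ||f||_1 = sum_x c_x |f(x)|
    l2sq = sum(2 * v * v for v in f.values())      # ||f||_2^2
    # E(f, f) = sum over ordered pairs c(x, y) |f(x) - f(y)|^2
    energy = 2 * sum((f.get(x, 0.0) - f.get(x + 1, 0.0)) ** 2
                     for x in range(min(f) - 1, max(f) + 1))
    return l2sq <= (2 / kappa**2) * (4 * l1**2 / l2sq) ** 2 * energy

checks = [nash_holds(v) for v in ([1.0], [1.0, 2.0, 3.0, 2.0, 1.0], [5.0, -1.0, 2.0])]
```

The bound is far from tight on these examples, which is expected: the Nash inequality only needs to capture the correct asymptotic dependence on $\|f\|_1^2/\|f\|_2^2$.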

5.5 Operator Theory for the Heat Kernel

We now continue our investigation of the operator $P(x,y) = \frac{c(x,y)}{c_x}$ as an operator between the spaces $\ell^p(G,c) = \{f \colon G \to \mathbb{C} : \|f\|_p < \infty\}$ (we will need to be careful with the choices of $p, q$ for which this operator is well defined). Recall that
\[ \|f\|_p^p = \sum_x c_x |f(x)|^p \qquad \text{and} \qquad \|f\|_\infty = \sup_x |f(x)| . \]
Our main focus will be on $p \in \{1, 2, \infty\}$.





Recall that for a linear operator $T \colon \ell^p(G,c) \to \ell^q(G,c)$, we define the $(p \to q)$ operator norm
\[ \|T\|_{p \to q} = \sup_{0 \ne f \in \ell^p(G,c)} \frac{\|Tf\|_q}{\|f\|_p} . \]
With these definitions, we see that $\|f\|_c = \|f\|_2$ and $\rho(P) = \|P\| = \|P\|_{2\to2}$.

Exercise 5.21  Show that
\[ \|P^t\|_{1\to2}^2 = \sup_{x,y} \frac{P^{2t}(x,y)}{c_y} = \sup_x \frac{P^{2t}(x,x)}{c_x} . \]

B solution C
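On a finite network the identity of Exercise 5.21 can be verified directly, since the $\ell^1(G,c)$ unit ball has extreme points $\pm \delta_x / c_x$, so the supremum defining $\|P^t\|_{1\to2}$ is attained at one of them. A small numeric check (our code; the 4-vertex path and its conductances are an arbitrary example):

```python
n = 4
c_edge = {(0, 1): 1.0, (1, 2): 2.0, (2, 3): 1.5}   # conductances on a path
c = [[0.0] * n for _ in range(n)]
for (x, y), w in c_edge.items():
    c[x][y] = c[y][x] = w
cx = [sum(row) for row in c]
P = [[c[x][y] / cx[x] for y in range(n)] for x in range(n)]

def matpow(A, t):
    R = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(t):
        R = [[sum(R[i][k] * A[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]
    return R

t = 3
Pt, P2t = matpow(P, t), matpow(P, 2 * t)
# ||P^t||_{1->2}^2 as a sup over extreme points delta_x / c_x of the l1 ball:
lhs = max(sum(cx[y] * (Pt[y][x] / cx[x]) ** 2 for y in range(n)) for x in range(n))
rhs = max(P2t[x][x] / cx[x] for x in range(n))
# lhs == rhs, and the sup of P^{2t}(x, y) / c_y over all pairs gives the same value.
```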

Exercise 5.22  Let $Q$ be an operator with $\|Q\|_{2 \to 2} \leq 1$. Let $f \in \ell^2(G,c)$. Show that the sequence $(\|Q^t f\|_2)_t$ is nonincreasing. B solution C

Exercise 5.23  Let $Q$ be a self-adjoint operator on the Hilbert space $\ell^2(G,c)$. Let $f \in \ell^2(G,c)$. Show that the sequence $\big( \|Q^t f\|_2^2 - \|Q^{t+1} f\|_2^2 \big)_t$ is nonincreasing. B solution C
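The two monotonicity statements can be observed numerically. The sketch below is illustrative: a random symmetric matrix, rescaled so its operator norm is at most 1 (for a symmetric matrix the spectral norm is bounded by the maximal absolute row sum), plays the role of $Q$.

```python
import random

random.seed(1)
n = 6
A = [[0.0] * n for _ in range(n)]
for i in range(n):
    for j in range(i, n):
        A[i][j] = A[j][i] = random.uniform(-1, 1)
# For symmetric A, ||A||_2 <= max absolute row sum; rescale so ||Q||_2 <= 1.
scale = max(sum(abs(v) for v in row) for row in A)
Q = [[v / scale for v in row] for row in A]

f = [random.uniform(-1, 1) for _ in range(n)]

def apply_op(M, v):
    return [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]

norms2 = []                       # ||Q^t f||_2^2 for t = 0..10
v = f
for _ in range(11):
    norms2.append(sum(x * x for x in v))
    v = apply_op(Q, v)

diffs = [norms2[t] - norms2[t + 1] for t in range(10)]
nonincreasing_norms = all(norms2[t + 1] <= norms2[t] + 1e-12 for t in range(10))
nonincreasing_diffs = all(diffs[t + 1] <= diffs[t] + 1e-12 for t in range(9))
```

Exercise 5.22 only needs the contraction property, while Exercise 5.23 uses self-adjointness; the matrix above satisfies both hypotheses.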

As one consequence of the Nash inequality we obtain the following theorem.

Theorem 5.5.1  Consider the random walk $(X_t)_t$ on a connected network $(G,c)$. If $(G,c)$ satisfies an $(I,\kappa)$-isoperimetric inequality with $I(t) = t^{(d-1)/d}$, then there exists a constant $K = K(d,\kappa) > 0$ such that for any $x, y \in G$ and any $t \geq 2$,
\[ \mathbb{P}_x[X_t = y] \leq K c_y t^{-d/2} . \]

Proof  Let $P$ be the transition matrix $P(x,y) = \frac{c(x,y)}{c_x}$ and let $Q = \frac12 (I + P)$ be the lazy version of $P$. Note that for any $x, y$,
\[ Q^2(x,y) = \tfrac14 \big( I + 2P + P^2 \big)(x,y) \geq \tfrac12 P(x,y) . \]

Because $Q$ is also self-adjoint,
\[ \mathcal{E}(f,f) = \sum_{x,y} c_x P(x,y) |f(x) - f(y)|^2 \leq 2 \sum_{x,y} c_x Q^2(x,y) |f(x) - f(y)|^2 = 4 \left\langle (I - Q^2) f , f \right\rangle_c = 4 \|f\|_2^2 - 4 \|Qf\|_2^2 . \]
By Theorem 5.4.3, for any $f \in \ell^0(G,c)$ we have a Nash-type inequality:
\[ \|f\|_2^2 \leq N\Big( \frac{\|f\|_1^2}{\|f\|_2^2} \Big) \cdot \big( \|f\|_2^2 - \|Qf\|_2^2 \big) , \qquad (5.4) \]


Growth, Dimension, and Heat Kernel

where $N(t) = \frac{8}{\kappa^2} \cdot (4t)^{2/d}$ (and $\kappa$ is the constant from the $d$-dimensional isoperimetric inequality).

Fix $f \in \ell^0(G)$ with $\|f\|_1 = 1$. Let $f_t = Q^t f$ and let $\xi(t) = \|f_t\|_2^2$. Recall that $Q$ is a contraction and that $f_0 = f \in \ell^2(G,c)$, so $f_t \in \ell^2(G,c)$. Fix $\varepsilon > 0$, and let $g_{t,\varepsilon} \in \ell^0(G,c)$ be such that $\|g_{t,\varepsilon} - f_t\|_2 < \varepsilon$ and $\|g_{t,\varepsilon}\|_1 \leq \|f_t\|_1$. For example, this can be achieved by taking $g_{t,\varepsilon} = \mathbf{1}_A f_t$ for some finite but large enough subset $A \subset G$. Because $c_x Q^t(x,y) = c_y Q^t(y,x)$,
\[ \|f_t\|_1 = \sum_x c_x \Big| \sum_y Q^t(x,y) f(y) \Big| \leq \sum_{x,y} Q^t(y,x)\, c_y |f(y)| = \|f\|_1 = 1 . \]
By (5.4),
\[ \|g_{t,\varepsilon}\|_2^{2(d+2)/d} \leq \frac{8 \cdot 4^{2/d}}{\kappa^2} \cdot \big( \|g_{t,\varepsilon}\|_2^2 - \|Q g_{t,\varepsilon}\|_2^2 \big) . \]
Taking $\varepsilon \to 0$, we arrive at
\[ \xi(t)^{(d+2)/d} \leq M \cdot \big( \xi(t) - \xi(t+1) \big) \qquad \text{for} \qquad M = \frac{8 \cdot 4^{2/d}}{\kappa^2} . \qquad (5.5) \]
Now, by Exercises 5.22 and 5.23, the sequences $(\xi(t))_t$ and $(\xi(t) - \xi(t+1))_t$ are both nonincreasing. We thus may interpolate $\xi$ to be a smooth function $\xi : [0,\infty) \to [0,\infty)$ such that $\xi$ is nonincreasing and convex. We conclude for any $t > 0$ that $\xi(t+1) - \xi(t) \geq \xi'(t)$. Plugging this into (5.5) we have the differential inequality
\[ 1 \leq - M \cdot \xi'(t) \cdot \xi(t)^{-(d+2)/d} = \frac{dM}{2} \cdot \frac{\partial}{\partial t}\, \xi(t)^{-2/d} , \]
which by integrating implies that
\[ t \leq \frac{dM}{2} \cdot \big( \xi(t)^{-2/d} - \xi(0)^{-2/d} \big) \leq \frac{dM}{2} \cdot \xi(t)^{-2/d} . \]

Hence, for the constant $K = \frac{2}{dM} > 0$ we get that for any $f \in \ell^0(G)$ with $\|f\|_1 = 1$,
\[ \|Q^t f\|_2^2 = \xi(t) \leq (Kt)^{-d/2} . \]
Choosing $f = \frac{1}{c_x} \delta_x$ we arrive at
\[ \frac{Q^{2t}(x,x)}{c_x} = (c_x)^{-2} \cdot \left\langle Q^{2t} \delta_x , \delta_x \right\rangle_c = \|Q^t f\|_2^2 \leq (Kt)^{-d/2} . \]
Finally, since $c_x P^{2t}(x,x) = \|P^t \delta_x\|_2^2$, we have, by Exercise 5.22, that $\big( P^{2t}(x,x) \big)_t$ is a nonincreasing sequence. Thus,
\[ Q^{2t}(x,x) = 2^{-2t} \sum_{j=0}^{2t} \binom{2t}{j} P^j(x,x) \geq 2^{-2t} \sum_{k=0}^{t} \binom{2t}{2k} P^{2t}(x,x) \geq \tfrac12\, P^{2t}(x,x) , \]


where we have used that
\[ 2^{-2t} \sum_{k=0}^{t} \binom{2t}{2k} = \mathbb{P}\big[ \operatorname{Bin}\big(2t, \tfrac12\big) \text{ is even} \big] = \tfrac12 , \]

which is a simple exercise to prove (see Exercise 5.29). Since $\|P\| \leq 1$, we have by Cauchy–Schwarz that
\[ c_x P^{2t+1}(x,y) = \left\langle P^t \delta_y , P^{t+1} \delta_x \right\rangle_c \leq \|P^t \delta_y\|_2 \cdot \|P^{t+1} \delta_x\|_2 \leq \|P^t \delta_y\|_2 \cdot \|P^t \delta_x\|_2 = \sqrt{ c_y P^{2t}(y,y) \cdot c_x P^{2t}(x,x) } . \]
Similarly,
\[ c_x P^{2t}(x,y) \leq \sqrt{ c_y P^{2t}(y,y) \cdot c_x P^{2t}(x,x) } . \]
Hence both $c_x P^{2t}(x,y)$ and $c_x P^{2t+1}(x,y)$ are bounded by $2 c_x c_y (Kt)^{-d/2}$. Dividing by $c_x$, we obtain, for any $t \geq 2$,
\[ P^t(x,y) \leq 2^{1+d/2}\, c_y\, \big( K (t-1) \big)^{-d/2} , \]
which completes the proof.

Since any infinite network satisfies a 1-dimensional isoperimetric inequality (with implicit constant $\kappa = 1$), we obtain the following corollary.

Corollary 5.5.2  There is a universal constant $K > 0$ such that for any infinite network $(G,c)$ the induced random walk $(X_t)_t$ satisfies
\[ \mathbb{P}_x[X_t = y] \leq K c_y t^{-1/2} \]
for all $t \geq 2$ and all $x, y$.

Exercise 5.24  Let $G$ be a finitely generated group. Assume that $|B_r| \geq \alpha r^d$ for all $r$ and some constant $\alpha > 0$, where $B_r = B_S(1,r)$ is the ball of radius $r$ around $1$ in some fixed Cayley graph of $G$. Show that for any symmetric, adapted $\mu$-random walk $(X_t)_t$ on $G$ we have that
\[ \mathbb{P}_x[X_t = y] \leq C t^{-d/2} \]
for some constant $C = C(\alpha, d, \mu) > 0$ and any $t > 0$.

B solution C
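For the simplest infinite network, the lazy simple random walk on $\mathbb{Z}$, the $t^{-1/2}$ decay of the corollary can be checked exactly by convolving the one-step distribution; the local CLT in fact gives $P^t(0,0) \approx 1/\sqrt{\pi t}$, so the constant $K = 1$ suffices in this illustrative range.

```python
# Distribution of the lazy simple random walk on Z after t steps, by convolution.
def lazy_step(dist):
    # dist: dict position -> probability; stay w.p. 1/2, move +-1 w.p. 1/4 each.
    out = {}
    for x, p in dist.items():
        out[x] = out.get(x, 0.0) + p / 2
        out[x + 1] = out.get(x + 1, 0.0) + p / 4
        out[x - 1] = out.get(x - 1, 0.0) + p / 4
    return out

dist = {0: 1.0}
bounded = True
for t in range(1, 201):
    dist = lazy_step(dist)
    if t >= 2 and dist[0] > 1.0 * t ** -0.5:    # check P^t(0,0) <= K t^{-1/2} with K = 1
        bounded = False
```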

Exercise 5.25  Let $G$ be a finitely generated group, with a finite symmetric generating set $S$. Recall that $B_S(x,r)$ denotes the ball of radius $r$ about $x$ in the Cayley graph with respect to $S$. We say that $G$ has exponential growth if there exists $c = c_S > 0$ such that for all $r > 0$ we have $|B_S(1,r)| \geq e^{cr}$. (We will revisit the notion of growth in the future, in Section 8.1.)
Show that the definition of exponential growth does not depend on the specific choice of finite symmetric generating set $S$ (although the implicit constant $c$ does). B solution C

Exercise 5.26  Show that if $G$ is a group of exponential growth, then for any symmetric, adapted measure $\mu$ on $G$, the network $(G,\mu)$ satisfies an $(I,\kappa)$-isoperimetric inequality, where $I(t) = \frac{t}{\log t}$ and $\kappa = \kappa_\mu > 0$ is some constant. B solution C

Exercise 5.27  Show that if $G$ is a group of exponential growth, then for any symmetric, adapted measure $\mu$ on $G$, the $\mu$-random walk $(X_t)_t$ satisfies
\[ \mathbb{P}[X_t = 1] \leq 2 \exp\big( - c_\mu t^{1/3} \big) . \]
B solution C

Example 5.5.3  We now provide an example of a finitely supported random walk on an exponential growth group for which $\mathbb{P}[X_t = 1] \geq c e^{-c t^{1/3}}$ for some $c > 0$ and all $t > 0$, showing that the bound obtained from the Nash inequality cannot be improved for general exponential growth groups. This example will be expanded upon in Section 6.9.

Consider
\[ \Sigma = \bigoplus_{\mathbb{Z}} \{0,1\} = \{ \sigma : \mathbb{Z} \to \{0,1\} \, : \, |\operatorname{supp}(\sigma)| < \infty \} , \]
with pointwise addition modulo 2. So $(\sigma + \tau)(z) = \sigma(z) + \tau(z) \pmod 2$. $\mathbb{Z}$ acts on $\Sigma$ via $z.\sigma(x) = \sigma(x - z)$. We may construct a group via this action, which is actually the semi-direct product $L = L(\mathbb{Z}) = \mathbb{Z} \ltimes \Sigma$ (see Exercise 1.68). Recall that this is the group whose elements are $\mathbb{Z} \times \Sigma$ and multiplication is given by
\[ (x, \sigma)(y, \tau) = (x + y, \sigma + x.\tau) . \]
One can easily verify that this constitutes a group.

Now, let $S = \{(1,0), (-1,0), (0,\delta_0), (0,0)\}$, which is easily seen to be a generating set for $L(\mathbb{Z})$. It is simple to verify that for a general element $(z,\sigma) \in L$, multiplying on the right by a generator $(\pm 1, 0)$ will change the $\mathbb{Z}$-coordinate by $\pm 1$, and not affect the $\Sigma$-coordinate. Multiplying on the right by $(0, \delta_0)$ will change the $\Sigma$-coordinate by flipping the value of $\sigma$ at position $z$. Multiplying by the identity element $(0,0)$ does nothing, of course.

Let $\mu$ be the uniform measure on $S$, and let $(X_t, \sigma_t)_t$ denote the $\mu$-random walk on $L$. A $\mu$-random walk on $L$ can be thought of as follows: a "lamplighter" is walking on the integers $\mathbb{Z}$, where a "lamp" is placed at each integer. The $\mathbb{Z}$-coordinate is the position of the lamplighter, and $\sigma(x)$ gives the state of the lamp placed at $x$ (either 1 or 0). At each step, the lamplighter either moves left or right, without changing any lamps, or the lamplighter switches the state of the lamp at the current position, or the lamplighter does nothing, each of these four possibilities with equal probability.

Define
\[ Q_t = \{ x \in \mathbb{Z} \, : \, \exists\ 0 \leq k \leq t-1, \ X_{k+1} = X_k = x \} . \]
That is, $Q_t$ is the set of all lamps such that the lamplighter has stayed at that lamp for at least one time step up to $t$. Note that $Q_t$ is measurable with respect to $\sigma(X_0, X_1, \ldots, X_t)$. It is quite intuitive that every time the lamplighter stays at some lamp, there is an equal chance that the lamp state is switched or that nothing happens, independently of the other time steps. Thus, the states of the lamps in $Q_t$ should have the distribution of independent Bernoulli random variables. This is the content of the next exercise.

Exercise 5.28

Show that for $(L,\mu)$ and $Q_t$ as above, for any $\xi : Q_t \to \{0,1\}$, we have
\[ \mathbb{P}\big[ \sigma_t(x) = \xi(x) \text{ for all } x \in Q_t \ \big|\ (X_n)_n \big] = 2^{-|Q_t|} \qquad \text{a.s.} \]
B solution C
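The semi-direct product multiplication and the role of $Q_t$ are easy to explore in code. The sketch below is illustrative (the function name `mul` and the set representation of lamp configurations are mine): an element of $L(\mathbb{Z})$ is a pair (position, frozenset of lit lamps), and the simulation checks the deterministic fact underlying the exercise, namely that lamps can only change on steps where the walker stays put, so the lit lamps always lie inside $Q_t$.

```python
import random

# (x, sigma)(y, tau) = (x + y, sigma + x.tau): shift tau's lamps by x, then
# add mod 2, which on supports is symmetric difference.
def mul(a, b):
    (x, sigma), (y, tau) = a, b
    return (x + y, sigma ^ frozenset(z + x for z in tau))

e = (0, frozenset())
R, L, F = (1, frozenset()), (-1, frozenset()), (0, frozenset({0}))
gens = [R, L, F, e]          # S = {(1,0), (-1,0), (0, delta_0), (0,0)}

# Right-multiplying by F flips the lamp at the walker's current position:
flipped = mul((5, frozenset()), F)          # the lamp at 5 gets lit

# Simulate mu-random walks and check supp(sigma_t) is contained in Q_t.
random.seed(3)
ok = True
for _ in range(200):
    g, Q = e, set()
    for _ in range(60):
        s = random.choice(gens)
        if s[0] == 0:        # this step has X_{k+1} = X_k
            Q.add(g[0])
        g = mul(g, s)
    if not g[1] <= Q:
        ok = False
```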

In the solution of Exercise 5.28 one may require the following.

Exercise 5.29  Show that for $X \sim \operatorname{Bin}\big(n, \tfrac12\big)$ the probability that $X$ is even is $\tfrac12$. B solution C
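The parity claim is immediate to confirm numerically: the even binomial coefficients sum to $2^{n-1}$.

```python
from math import comb

# P[Bin(n, 1/2) is even] = (sum over even k of C(n, k)) / 2^n.
def prob_even(n):
    return sum(comb(n, k) for k in range(0, n + 1, 2)) / 2 ** n

vals = [prob_even(n) for n in range(1, 30)]
```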

The set $Q_t$ is a bit complicated, and it will be useful to note that $Q_t \subset R_{t-1}$, where $R_t = \{X_0, \ldots, X_t\}$. Now, for any $z \in \mathbb{Z}$ with $|z| \leq m+1$, we have
\[ \mathbb{P}[(X_{2t}, \sigma_{2t}) = (z, 0)] \geq \mathbb{E}\big[ \mathbb{P}[\sigma_{2t}(x) = 0 \ \forall\, x \in Q_{2t} \mid (X_n)_n] \, \mathbf{1}_{\{X_{2t} = z\}} \big] = \mathbb{E}\big[ 2^{-|Q_{2t}|} \mathbf{1}_{\{X_{2t} = z\}} \big] \geq \mathbb{P}[|Q_{2t}| \leq m,\, X_{2t} = z] \cdot 2^{-m} \geq \mathbb{P}[|R_{2t-1}| \leq m,\, X_{2t} = z] \cdot 2^{-m} . \]
Summing over $|z| \leq m+1$, and noting that in a group the heat kernel is always maximized on the diagonal, we arrive at
\[ \mathbb{P}[(X_{2t}, \sigma_{2t}) = (0, 0)] \geq \frac{1}{2m+3} \sum_{|z| \leq m+1} \mathbb{P}[|R_{2t-1}| \leq m,\, X_{2t} = z] \cdot 2^{-m} = \mathbb{P}[|R_{2t-1}| \leq m] \cdot \frac{1}{2m+3} \cdot 2^{-m} . \]

One notes that $(X_t)_t$ has the distribution of a lazy random walk on $\mathbb{Z}$, so if we define $M_t = \max_{k \leq t} |X_k|$, we may estimate
\[ \mathbb{P}[|R_{2t-1}| \leq 2m+1] \geq \mathbb{P}[M_{2t} \leq m] . \]
We have seen in Exercise 2.28 that for the lazy random walk on $\mathbb{Z}$,
\[ \mathbb{P}[M_{2t} \leq m] \geq c \exp\Big( - c\, \frac{t}{m^2} \Big) , \]
for some constant $c > 0$ and all $t > 0$. So, perhaps by modifying the constant $c > 0$, we may conclude with the bound
\[ \mathbb{P}[(X_{2t}, \sigma_{2t}) = (0,0)] \geq \sup_{m \geq 1}\ c \exp\Big( - c\, \frac{t}{m^2} - (2m+1) \log 2 - \log(4m+5) \Big) , \]
and by choosing $m = c' t^{1/3}$ for an appropriate $c' > 0$, we arrive at the required bound
\[ \mathbb{P}[(X_{2t}, \sigma_{2t}) = (0,0)] \geq c'' \exp\big( - c'' t^{1/3} \big) . \]
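Up to constants, the exponent being optimized above is $f(m) = t/m^2 + m$; calculus gives the minimizer $m_* = (2t)^{1/3}$, which is the source of the $t^{1/3}$ exponent. A quick grid search (purely illustrative) confirms this:

```python
# Minimize f(m) = t/m^2 + m over integers m; calculus predicts m* = (2t)^(1/3).
def best_m(t):
    vals = {m: t / m ** 2 + m for m in range(1, 1000)}
    return min(vals, key=vals.get)

t = 10 ** 6
m_star = best_m(t)
predicted = round((2 * t) ** (1 / 3))
```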

5.6 The Varopoulos–Carne Bound

We have seen how to bound the heat kernel $P^t(x,y)$ as a function of $t$ (time), using Nash-type inequalities and volume growth. We now bound $P^t(x,y)$ as a function of space, not time; that is, the bounds will depend on the distance between $x$ and $y$, for fixed $t$. We begin with some motivation from classical results on martingale concentration.

Exercise 5.30 (Hoeffding's inequality)  Let $X$ be a (real-valued) random variable, with $\mathbb{E}[X] = 0$ and $|X| \leq 1$ a.s. Show that for any $\varepsilon > 0$ we have
\[ \mathbb{E}\big[ e^{\varepsilon X} \big] \leq \exp\big( \tfrac{\varepsilon^2}{2} \big) . \]
(Hint: the function $x \mapsto e^{\varepsilon x}$ is convex.)

B solution C
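The extreme case of Hoeffding's bound is $X$ uniform on $\{-1,+1\}$, where $\mathbb{E}[e^{\varepsilon X}] = \cosh \varepsilon$; the inequality $\cosh \varepsilon \leq e^{\varepsilon^2/2}$ (true term by term in the Taylor series) can be checked on a grid:

```python
from math import cosh, exp

# For X uniform on {-1, +1}: E[e^{eps X}] = cosh(eps) <= exp(eps^2 / 2).
checks = []
for i in range(1, 500):
    eps = i / 50.0              # eps in (0, 10)
    checks.append(cosh(eps) <= exp(eps ** 2 / 2) + 1e-12)
```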

Exercise 5.31 (Azuma's inequality)  Let $(M_t)_t$ be a martingale. Assume that $|M_{t+1} - M_t| \leq b$ a.s. for all $t$ (i.e. the martingale has a.s. bounded differences).


Show that for any $\lambda > 0$ we have
\[ \mathbb{P}[M_t - M_0 \geq \lambda] \leq \exp\Big( - \frac{\lambda^2}{2 b^2 t} \Big) \]
for all $t > 0$. (Hint: bound $\mathbb{E}[\exp(\varepsilon M_t)]$.)

B solution C
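For the $\pm 1$ simple random walk $S_t$ (a martingale with $b = 1$, $M_0 = 0$), the exact tail can be computed from binomial coefficients and compared with Azuma's bound $\exp(-\lambda^2/2t)$, which holds by Hoeffding's classical theorem:

```python
from math import comb, exp

# Exact tail P[S_t >= lam] for the +-1 walk: S_t = 2*Heads - t,
# so S_t >= lam iff Heads >= ceil((t + lam) / 2).
def tail(t, lam):
    k0 = (t + lam + 1) // 2
    return sum(comb(t, k) for k in range(k0, t + 1)) / 2 ** t

ok = all(
    tail(t, lam) <= exp(-lam ** 2 / (2 * t)) + 1e-12
    for t in range(1, 40)
    for lam in range(0, t + 1)
)
```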

Exercise 5.32  Let $\mu$ be a symmetric and adapted measure on $\mathbb{Z}$ of finite support. Let $(X_t)_t$ be the $\mu$-random walk. Show that there exists $c = c_\mu > 0$ such that for all $t > 0$ and all $x \in \mathbb{Z}$,
\[ \mathbb{P}[X_t = x] \leq \exp\Big( - c\, \frac{|x|^2}{t} \Big) . \]
B solution C

Compare this to Exercise 2.28. We see that it is difficult for the random walk on $\mathbb{Z}$ to go further than order $\sqrt{t}$ at time $t$. (Compare this with the central limit theorem.) However, in larger graphs or groups, a priori it may be that to reach a vertex $x$ in $t$ steps there are many more possible paths, so that the probability to be at $x$ at time $t$ could be larger than $\exp(-c r^2 / t)$, where $r$ is the distance between $x$ and the origin. We will now see that (perhaps surprisingly) this is actually never the case.

Theorem 5.6.1 (Varopoulos–Carne bound)  Let $G$ be a finitely generated group, and fix some finite symmetric generating set $S$ of $G$. Let $\mu \in SA(G,\infty)$ and let $(X_t)_t$ be the $\mu$-random walk. Then there exists $c = c_{S,\mu} > 0$ such that for all $t$,
\[ \mathbb{P}_x[X_t = y] \leq \begin{cases} e^{-\frac12 (d-1) c} & t < (d-1) c + 1 , \\[2pt] \exp\Big( - \dfrac{(d-1)^2 c^2}{2(t-1)} \Big) & t \geq (d-1) c + 1 , \end{cases} \]
where $d = \operatorname{dist}_S(x,y) = |x^{-1} y|_S$. Specifically, there is a constant $c' = c'_{S,\mu}$ such that for all $t > 0$,
\[ \mathbb{P}_x[X_t = y] \leq \exp\Big( - \frac{c' |x^{-1} y|^2}{t} \Big) . \]
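Carne's classical form of this bound, specialized to the simple $\pm 1$ walk on $\mathbb{Z}$ (a reversible chain with constant conductances), reads $\mathbb{P}_0[X_t = x] \leq 2 \exp(-x^2/2t)$. This known special case can be verified exactly by dynamic programming:

```python
from math import exp

# Carne's bound for the simple +-1 walk on Z: P_0[X_t = x] <= 2 exp(-x^2 / (2t)).
dist = {0: 1.0}
ok = True
for t in range(1, 100):
    new = {}
    for x, p in dist.items():
        new[x + 1] = new.get(x + 1, 0.0) + p / 2
        new[x - 1] = new.get(x - 1, 0.0) + p / 2
    dist = new
    for x, p in dist.items():
        if p > 2 * exp(-x * x / (2 * t)) + 1e-12:
            ok = False
```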

First, we require a variation on Hoeffding's inequality that holds for unbounded random variables.

Lemma 5.6.2  Let $X$ be a (real-valued) random variable of mean $\mathbb{E}[X] = 0$, and assume that $C, c > 0$ are such that for all $r > 0$ we have $\mathbb{P}[|X| > r] \leq C e^{-cr}$. Then, for any $0 < \lambda < c$ we have
\[ \mathbb{E}\big[ e^{\lambda X} \big] \leq \exp\Big( \frac{C \lambda^2}{c (c - \lambda)} \Big) . \]


Proof  It is quite simple to show that for any integer $k > 0$,
\[ \mathbb{E}\big[ |X|^k \big] = \int_0^\infty \mathbb{P}\big[ |X|^k > \xi \big] \, d\xi = \int_0^\infty \mathbb{P}[|X| > \xi] \, k \xi^{k-1} \, d\xi \leq C \int_0^\infty e^{-c\xi} k \xi^{k-1} \, d\xi = \frac{Ck}{c} \cdot \mathbb{E}\big[ \operatorname{Exp}(c)^{k-1} \big] = C \cdot k! \cdot c^{-k} . \]

Since $\mathbb{E}[X] = 0$,
\[ \mathbb{E}\big[ e^{\lambda X} \big] = \sum_{k=0}^{\infty} \frac{\lambda^k\, \mathbb{E}[X^k]}{k!} \leq 1 + C \cdot \sum_{k=2}^{\infty} \Big( \frac{\lambda}{c} \Big)^k = 1 + C \cdot \frac{\lambda^2}{c (c-\lambda)} \leq \exp\Big( \frac{C \lambda^2}{c (c-\lambda)} \Big) , \]
as long as $|\lambda| < c$.
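The lemma can be checked against a distribution where everything is explicit: for the two-sided exponential with density $\frac{c}{2} e^{-c|x|}$ one has $\mathbb{P}[|X| > r] = e^{-cr}$ (so $C = 1$) and $\mathbb{E}[e^{\lambda X}] = \frac{c^2}{c^2 - \lambda^2}$ exactly, which indeed stays below the bound for all $0 < \lambda < c$:

```python
from math import exp

# Two-sided exponential with rate c: exact mgf vs. the bound of Lemma 5.6.2 (C = 1).
c = 2.0
checks = []
for i in range(1, 100):
    lam = c * i / 100.0                    # lam in (0, c)
    mgf = c * c / (c * c - lam * lam)      # exact E[e^{lam X}]
    bound = exp(lam ** 2 / (c * (c - lam)))
    checks.append(mgf <= bound + 1e-12)
```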

Exercise 5.33  Let $\mu$ be a symmetric and adapted probability measure on a finitely generated group $G$. Let $(X_t)_t$ be the $\mu$-random walk. Let $x, y \in G$. Let $t > 0$ be such that $\mathbb{P}_x[X_t = y] > 0$. Show that $\mathbb{P}_y[X_t = x] = \mathbb{P}_x[X_t = y] > 0$. Show that for any $z$ and any $0 < s < t$,
\[ \mathbb{P}_x[X_s = z \mid X_t = y] = \mathbb{P}_y[X_{t-s} = z \mid X_t = x] . \]

Proof of the Varopoulos–Carne bound, Theorem 5.6.1  Fix $x, y$ and let $d = \operatorname{dist}(x,y) = |x^{-1} y|$. Let $\varphi : G \to \mathbb{R}$ be any function that is 1-Lipschitz ($\|\nabla \varphi\|_\infty \leq 1$) with $\varphi(x) = 0$ and $\varphi(y) = d$. For example, one may take $\varphi(z) = \operatorname{dist}(x,z)$. Set
\[ \delta(z) = \mathbb{E}_z[\varphi(X_1) - \varphi(X_0)] = \mathbb{E}_z[\varphi(X_1)] - \varphi(z) . \]
Now, let $(X_t)_t$ be a random walk started at $X_0 = x$, and $(Y_t)_t$ be an independent random walk started at $Y_0 = y$. Define the process $M_1 = 0$ and
\[ M_t = \varphi(X_t) - \varphi(Y_t) + \varphi(Y_1) - \varphi(X_1) - \sum_{j=1}^{t-1} \big( \delta(X_j) - \delta(Y_j) \big) = \sum_{j=1}^{t-1} \big( \varphi(X_{j+1}) - \mathbb{E}[\varphi(X_{j+1}) \mid X_j] \big) - \sum_{j=1}^{t-1} \big( \varphi(Y_{j+1}) - \mathbb{E}[\varphi(Y_{j+1}) \mid Y_j] \big) . \]

It is simple to check that $(M_t)_t$ is a martingale. In fact,
\[ M_{t+1} - M_t = \varphi(X_{t+1}) - \mathbb{E}[\varphi(X_{t+1}) \mid X_t] + \mathbb{E}[\varphi(Y_{t+1}) \mid Y_t] - \varphi(Y_{t+1}) . \]
Denote
\[ Z_t = \varphi(X_t) - \mathbb{E}[\varphi(X_t) \mid X_{t-1}] \qquad \text{and} \qquad W_t = \mathbb{E}[\varphi(Y_t) \mid Y_{t-1}] - \varphi(Y_t) . \]
Let $\mathcal{F}_t = \sigma(M_1, \ldots, M_t)$.


Because $\varphi$ is 1-Lipschitz, and because $\mu \in SA(G,\infty)$, there exists $c > 0$ such that for any $t$ and any $r > 0$,
\[ \mathbb{P}[|Z_t| > r \mid \mathcal{F}_{t-1}, W_t] \leq \mathbb{P}\big[ |(X_{t-1})^{-1} X_t| > r \big] \leq e^{-2cr} , \qquad \mathbb{P}[|W_t| > r \mid \mathcal{F}_{t-1}, Z_t] \leq \mathbb{P}\big[ |(Y_{t-1})^{-1} Y_t| > r \big] \leq e^{-2cr} . \]
Also, it is immediate that $\mathbb{E}[Z_t \mid \mathcal{F}_{t-1}, W_t] = \mathbb{E}[W_t \mid \mathcal{F}_{t-1}, Z_t] = 0$. Thus, by the variation on Hoeffding's inequality, namely Lemma 5.6.2, for any $0 < \lambda \leq c$ we have
\[ \mathbb{E}\big[ e^{\lambda Z_t} \mid \mathcal{F}_{t-1}, W_t \big] \leq \exp\Big( \frac{\lambda^2}{2c^2} \Big) \qquad \text{and} \qquad \mathbb{E}\big[ e^{\lambda W_t} \mid \mathcal{F}_{t-1}, Z_t \big] \leq \exp\Big( \frac{\lambda^2}{2c^2} \Big) . \]
Since $M_t = \sum_{j=2}^{t} (Z_j + W_j)$, we conclude that for $0 < \lambda \leq c$,
\[ \mathbb{E}\big[ e^{\lambda M_t} \big] \leq \exp\Big( \frac{\lambda^2}{c^2} \cdot (t-1) \Big) . \]
Let $E = \{X_t = y,\, Y_t = x\}$. Note that because $\mu$ is symmetric,
\[ \mathbb{P}[E] = \mathbb{P}[X_t = y \mid X_0 = x] \cdot \mathbb{P}[Y_t = x \mid Y_0 = y] = \big( \mathbb{P}_x[X_t = y] \big)^2 , \]
so it suffices to bound this probability. Since $\mathbb{P}_x[X_s = z \mid X_t = y] = \mathbb{P}_y[Y_{t-s} = z \mid Y_t = x]$, we find that on the event $E$ we have
\[ \sum_{j=1}^{t-1} \big( \delta(X_j) - \delta(Y_j) \big) = 0 . \]
Thus, on the event $E$ we have $M_t \geq 2(d-1)$. Hence,
\[ \exp\Big( \frac{\lambda^2}{c^2} (t-1) \Big) \geq \mathbb{E}\big[ e^{\lambda M_t} \mathbf{1}_E \big] \geq e^{2\lambda (d-1)} \cdot \mathbb{P}[E] , \]
implying that
\[ \mathbb{P}[E] \leq \exp\Big( \frac{(t-1)}{c^2} \lambda^2 - 2(d-1) \lambda \Big) . \]
Optimizing over $\lambda$, we would like to choose $\lambda = \frac{(d-1) c^2}{t-1}$. We still require that $\lambda \leq c$, however. When $t - 1 < (d-1) c$, we choose $\lambda = c$, resulting in a bound of
\[ \mathbb{P}_x[X_t = y] = \sqrt{\mathbb{P}[E]} \leq \exp\Big( - (d-1) \frac{c}{2} \Big) . \]
For $t - 1 \geq (d-1) c$, we choose $\lambda = \frac{(d-1) c^2}{t-1} \leq c$ to obtain
\[ \mathbb{P}_x[X_t = y] = \sqrt{\mathbb{P}[E]} \leq \exp\Big( - \frac{(d-1)^2 c^2}{2 (t-1)} \Big) . \]


5.7 Additional Exercises

Exercise 5.34  Let $G$ be a finitely generated group, generated by a finite symmetric set $S$. Let $\mu \in SA(G,\infty)$, and let $(X_t)_t$ denote the $\mu$-random walk. Recall the Green function (from Section 4.8):
\[ g(x,y) = \sum_{t=0}^{\infty} \mathbb{P}_x[X_t = y] . \]
Assume $G$ has exponential growth. Show that there exists some constant $C = C_{S,\mu} > 0$ such that for all $x, y \in G$,
\[ g(x,y) \leq C \exp\Big( - \tfrac{1}{C} \sqrt{\, |x^{-1} y|_S \,} \Big) . \]
B solution C

Exercise 5.35  Show that if $G$ is a non-amenable finitely generated group and $\mu \in SA(G,1)$, then $\liminf_{t\to\infty} \frac{|X_t|}{t} > 0$ a.s. B solution C

Note that Fatou's lemma tells us that in the setting above,
\[ \liminf_{t\to\infty} \frac{\mathbb{E}\,|X_t|}{t} \geq \mathbb{E}\Big[ \liminf_{t\to\infty} \frac{|X_t|}{t} \Big] > 0 . \]
This is not an equivalence; that is, there are amenable groups of positive speed, for example some lamplighter groups; see Section 6.9.

Exercise 5.36  Let $G$ be an infinite finitely generated group. Show that for any $\varepsilon > 0$, there exists $\mu \in SA\big(G, \frac13 - \varepsilon\big)$ such that $(G,\mu)$ is transient. (Hint: recall Exercises 4.43 and 4.44.) B solution C

This is not the best possible. It can be shown that for any finitely generated group G and any ε > 0, there is µ ∈ SA (G, 1 − ε) such that (G, µ) is transient. However, this requires methods outside the scope of those presented here.

5.8 Remarks Kesten was one of the founding fathers of random walks on general groups (departing from the more classical Euclidean setting considered by Pólya, 1921). He proved the amenability criterion, Theorem 5.2.4, in his PhD thesis (see also Kesten, 1959).


Bounds of the Varopoulos–Carne type, Theorem 5.6.1, were first proved by Varopoulos (1985) and Carne and Varopoulos (1985). Many proofs in the literature are specialized to $\mu$ of finite support. The proof presented is an elegant probabilistic argument by Rémi Peyre (2008), which is useful for the generalization to measures with exponential tails, even of infinite support. Theorem 5.3.1 was proved by Coulhon and Saloff-Coste (1993). Our treatment of isoperimetric inequalities, Nash inequalities, and heat kernel bounds is based on Woess (2000).

A more careful analysis of heat kernel decay in polynomial growth Cayley graphs can be used to prove that the expected number of visits to a ball of radius $r$ in the graph is at most $O(r^2)$ (when the graph is transient). It is expected that this phenomenon is universal, at least for transient transitive graphs.

Conjecture 5.8.1  Let $G$ be a transient Cayley graph. Fix some $x \in G$ and let $B_r = B(x,r)$ be the ball of radius $r$ around $x$. Recall the Green function $g(x,y)$, which is the expected number of visits to $y$ started at $x$. Then, there exists a constant $C = C(G) > 0$ such that for all $r \geq 0$,
\[ g(x, B_r) = \sum_{y \in B_r} g(x,y) \leq C r^2 . \]

To our knowledge, the state of the art at the time of writing is a bound in Lyons et al. (2017):
\[ g(x, B_r) \leq C r^2 \sqrt{\log |B_r|} \leq C' r^{5/2} \]
(the last inequality from the fact that Cayley graphs grow at most exponentially). This is also related to results we will prove in the following chapters. In Chapter 9, Theorem 9.4.1 tells us that on a Cayley graph the simple random walk $(X_t)_t$ is always at least diffusive; that is, $\mathbb{E}\big[ |X_t|^2 \big] \geq c t$. Thus, for $T_r = \inf\{ t \geq 0 : |X_t| > r \}$, we have that
\[ c t \leq \mathbb{E}\big[ |X_t|^2 \big] \leq \mathbb{P}[|X_t| \leq r] \cdot r^2 + \mathbb{P}[|X_t| > r] \cdot t^2 , \]
so that
\[ \mathbb{P}[|X_t| > r] \geq \frac{c t - r^2}{t^2} , \]
which can be made at least some fixed $\varepsilon > 0$ by choosing $t = C r^2$ for appropriate $C > 0$. Thus, with probability at least $\varepsilon > 0$, the walk exits a ball of radius $Kr$ by time $t = r^2$. As the walk is transient, one would intuitively expect the random walk to then escape and never return to the ball of radius $r$, with some positive probability. Thus, the expected total number of visits to the ball should be at most of order $r^2$. This is not a proof, of course.


5.9 Solutions to Exercises

Solution to Exercise 5.1 :( Let $S, T$ be two finite symmetric generating sets for a group $G$. Assume that $\Phi_S = 0$. Let $A \subset G$ be a finite subset. Assume that $a \in A$ and $at \notin A$ for some $t \in T$. Then, there exist $s_1, \ldots, s_k \in S$ such that $|t|_S = k$ and $t = s_1 \cdots s_k$. Since $a \in A$ and $at \notin A$, it must be that $a s_1 \cdots s_{j-1} \in A$ and $a s_1 \cdots s_j \notin A$ for some $1 \leq j \leq k$, where $s_0 = 1$. Thus,
\[ \sum_a \mathbf{1}_{\{a \in A\}} \mathbf{1}_{\{a t \notin A\}} \leq \sum_a \sum_{j=1}^{k} \mathbf{1}_{\{a s_1 \cdots s_{j-1} \in A\}} \mathbf{1}_{\{a s_1 \cdots s_j \notin A\}} \leq |t|_S \cdot \sum_a \sum_{s \in S} \mathbf{1}_{\{a \in A\}} \mathbf{1}_{\{a s \notin A\}} = |t|_S \cdot |AS \setminus A| . \]
Summing this over $t \in T$, we have that
\[ |AT \setminus A| \leq \max_{t \in T} |t|_S \cdot |T| \cdot |AS \setminus A| . \]
Taking an infimum over finite subsets $A$, we see that
\[ 0 \leq \Phi_T \leq \max_{t \in T} |t|_S \cdot |T| \cdot \inf_A \frac{|AS \setminus A|}{|A|} \leq \max_{t \in T} |t|_S \cdot |T| \cdot |S| \cdot \Phi_S = 0 . \]
:) X

Solution to Exercise 5.2 :( Suppose first that $(F_n)_n$ is a Følner sequence for $G$, and fix a finite symmetric generating set $S$. Let $\varepsilon > 0$. Take $n$ large enough so that for all $s \in S$ we have

\[ \frac{|F_n s \,\triangle\, F_n|}{|F_n|} < \varepsilon . \]
Then,
\[ |F_n S \setminus F_n| \leq \sum_{s \in S} |F_n s \,\triangle\, F_n| < |S| \cdot |F_n| \cdot \varepsilon , \]
so $\Phi_S < \varepsilon$. Since this holds for all $\varepsilon > 0$, we have that $G$ is amenable.
For the other direction, assume that $G$ is amenable. Let $S$ be a finite symmetric generating set. We know that $\Phi_S = 0$. So there is a sequence of finite subsets $(F_n)_n$ such that
\[ \lim_{n \to \infty} \frac{|F_n S \setminus F_n|}{|F_n|} = 0 . \]
Note that for any $x, y \in G$ and any finite subset $F \subset G$,
\[ F x y \setminus F \subseteq (F x y \setminus F y) \cup (F y \setminus F) . \]
Since $|A y| = |A|$ for any $A \subset G$ and $y \in G$, it is simple to conclude that
\[ |F x \,\triangle\, F| \leq |F x \setminus F| + |F \setminus F x| \leq 2 |x| \cdot \sup_{s \in S} |F s \setminus F| \leq 2 |x| \cdot |F S \setminus F| . \]
This leads to
\[ \frac{|F_n x \,\triangle\, F_n|}{|F_n|} \to 0 \]
as $n \to \infty$, for any fixed $x \in G$. Thus, $(F_n)_n$ is a Følner sequence. :) X

Solution to Exercise 5.3 :( Let $(F_n)_n$ be a Følner sequence; that is,
\[ \lim_{n \to \infty} \frac{|F_n x \,\triangle\, F_n|}{|F_n|} = 0 \]
for any $x \in G$. Set $\tilde{F}_n = (F_n)^{-1} = \{ y^{-1} : y \in F_n \}$. Then, for any $x \in G$,
\[ |x \tilde{F}_n \,\triangle\, \tilde{F}_n| = \big| (F_n x^{-1})^{-1} \,\triangle\, (F_n)^{-1} \big| = |F_n x^{-1} \,\triangle\, F_n| . \]

−1 4(Fn ) −1 = Fn x −1 4Fn . |x F˜n 4 F˜n | = Fn x −1 Since |Fn | = | F˜n | , this completes the equivalence of the existence of right and left Følner sequences.

:) X

Solution to Exercise 5.4 :( The first assertion is immediate by definition. For the second assertion, let $\mu$ be a symmetric adapted probability measure on $G$. Let $S \subset \operatorname{supp}(\mu)$ be a finite symmetric generating set for $G$. Write $\varepsilon := \min_{s \in S} \mu(s) > 0$. Note that
\[ \sum_{a \in A} \mathbb{P}_a[X_1 \notin A] = \sum_{a \in A} \sum_x \mu(x) \mathbf{1}_{\{a x \notin A\}} \geq \varepsilon\, |AS \setminus A| . \]
So dividing by $|A|$ and taking the infimum over finite subsets $A \subset G$, we arrive at
\[ |S| \cdot \Phi_S \leq \frac{1}{\varepsilon}\, \Phi_\mu . \]
So if $\Phi_\mu = 0$ then $G$ is amenable.
For the other direction, assume that $G$ is amenable. Fix some finite symmetric generating set $S$ for $G$. Let $\varepsilon > 0$, and choose $r > 0$ large enough so that
\[ \sum_{|x| > r} \mu(x) < \varepsilon . \]
Let $T = B_S(1,r) = \{ x : |x| \leq r \}$. $T$ generates $G$, so $\Phi_T = 0$. We now have that
\[ \sum_{a \in A} \mathbb{P}_a[X_1 \notin A] = \sum_{a \in A} \sum_x \mu(x) \mathbf{1}_{\{a x \notin A\}} \leq |AT \setminus A| + |A| \cdot \varepsilon . \]
Dividing by $|A|$ and taking an infimum over finite subsets $A$ we get that
\[ \Phi_\mu \leq |T| \cdot \Phi_T + \varepsilon = \varepsilon . \]
As this holds for any $\varepsilon > 0$, we get that $\Phi_\mu = 0$. This implies that $\Phi_\mu = 0$ if and only if $G$ is amenable. :) X

Solution to Exercise 5.5 :( We have, using the symmetry of $\mu$,
\[ \mathbb{P}_a[X_1 \notin A] = \sum_{x \notin A} \mu(a^{-1} x) = \sum_{x \notin A} \mathbb{P}_{x^{-1}}[X_1 = a] . \]
Summing over $a$ we get
\[ \sum_{a \in A} \mathbb{P}_a[X_1 \notin A] = \sum_{x \notin A^{-1}} \mathbb{P}_x[X_1 \in A] . \]
:) X

Solution to Exercise 5.8 :( Compute:
\[ \|P f\|_c^2 = \sum_x c_x \Big| \sum_y P(x,y) f(y) \Big|^2 \leq \sum_{x,y} c_x P(x,y) |f(y)|^2 = \sum_{x,y} c(x,y) |f(y)|^2 \leq \sum_y c_y |f(y)|^2 = \|f\|_c^2 . \]
:) X

Solution to Exercise 5.9 :( Since (G, c) is connected, P is irreducible. So, for any x , y , z , w , there exist n, k > 0 such that P n (x, z) , P k (w, y) > 0. Hence, for any t ,

P t +n+k (x, y) ≥ P n (x, z)P t (z, w)P k (w, y),


which implies that

\[ \limsup_{t \to \infty} \big( P^t(z,w) \big)^{1/t} \leq \limsup_{t \to \infty} \big( P^{t+n+k}(x,y) \big)^{1/t} = \limsup_{t \to \infty} \big( P^t(x,y) \big)^{1/t} . \]
:) X

Solution to Exercise 5.11 :( First, we show that $\|P f\|_c \leq \tilde{\rho}\, \|f\|_c$ for any $f \in \ell^0(G,c)$ (that is, including complex-valued functions). Indeed, for any $f \in \ell^0(G,c)$,
\[ \|P f\|_c^2 = \sum_x c_x \Big| \sum_y P(x,y) \big( \operatorname{Re} f(y) + i \operatorname{Im} f(y) \big) \Big|^2 = \|P \operatorname{Re} f\|_c^2 + \|P \operatorname{Im} f\|_c^2 \leq \tilde{\rho}^2 \cdot \big( \|\operatorname{Re} f\|_c^2 + \|\operatorname{Im} f\|_c^2 \big) . \]
Since
\[ \|f\|_c^2 = \sum_x c_x |f(x)|^2 = \sum_x c_x |\operatorname{Re} f(x)|^2 + \sum_x c_x |\operatorname{Im} f(x)|^2 = \|\operatorname{Re} f\|_c^2 + \|\operatorname{Im} f\|_c^2 , \]
we conclude that $\|P f\|_c \leq \tilde{\rho}\, \|f\|_c$ for all $f \in \ell^0(G,c)$.
Now, fix $\varepsilon > 0$ and $f \in \ell^2(G,c)$. Since $\sum_x c_x |f(x)|^2 < \infty$, there exists a finite subset $S_\varepsilon \subset G$ such that ...

If $\dim_{\mathrm{iso}}(G,c) \geq d + \varepsilon$ for some $\varepsilon > 0$, then we can find a constant $\kappa > 0$ such that $c(A_n, A_n^c)^{d+\varepsilon} \geq \kappa \cdot c(A_n)^{d+\varepsilon-1}$ for all $n$. But this gives the contradiction

\[ d = \lim_{n \to \infty} L(A_n) \geq \lim_{n \to \infty} \frac{\log c(A_n)}{\log c(A_n) - \frac{d+\varepsilon-1}{d+\varepsilon} \log c(A_n) - \frac{1}{d+\varepsilon} \log \kappa} = d + \varepsilon . \]
On the other hand, if $\dim_{\mathrm{iso}}(G,c) \leq d - \varepsilon$ for some $\varepsilon > 0$, then for any $n$ we can find a finite set $D_n$ such that $c(D_n, D_n^c)^{d-\varepsilon} < c(D_n)^{d-1-\varepsilon}$. This gives the contradiction
\[ d \leq \liminf_{n \to \infty} \frac{\log c(D_n)}{\log c(D_n) - \frac{d-1-\varepsilon}{d-\varepsilon} \log c(D_n)} = d - \varepsilon . \]
:) X

Solution to Exercise 5.17 :( For any finite subset $A$, in any Cayley graph $\Gamma(G,S)$, considered as a network with conductance 1 on every edge, we have that
\[ |\partial A| \leq c(A, A^c) \leq |S| \cdot |\partial A| , \]
where $\partial A = \{ x \in A : \exists\, y \sim x,\ y \notin A \}$. Also, as in the proof of the Coulhon–Saloff-Coste inequality (Theorem 5.3.1), $|A \setminus A y| \leq |y| \cdot |\partial A|$. Since $x \in A, x y \notin A$ is equivalent to $x \in A \setminus A y^{-1}$, we have that
\[ \#\{ x \in A : \exists\, u \in S',\ x u \notin A \} \leq \max_{u \in S'} |u|_S \cdot \#\{ x \in A : \exists\, s \in S,\ x s \notin A \} . \]
The conclusion from this is that for any two finite symmetric generating sets $S, S'$, the ratio between the corresponding values of $c(A, A^c)$ in the two Cayley graphs is bounded between two constants depending only on $S, S'$ and not on $A$. This implies that the limit
\[ \liminf_{|A| \to \infty} \frac{\log |A|}{\log \frac{|A|}{c(A, A^c)}} \]
will be the same in both Cayley graphs. :) X

Solution to Exercise 5.18 :( If $G$ is non-amenable, and $S$ is some finite symmetric generating set, then $\Phi_S > 0$. Thus, there exists a constant $\kappa > 0$ such that for any finite subset $A \subset G$ we have that
\[ |\partial A| \geq \frac{|AS \setminus A|}{|S|} \geq \kappa |A| \geq \kappa |A|^{(d-1)/d} , \]
for all $d > 0$. :) X

Solution to Exercise 5.19 :( A Taylor expansion gives that $r^d \leq (r-1)^d + d r^{d-1}$ for any $r \geq 1$. For $x \in \Gamma$, let $S(x,r) = \{ y \in \Gamma : \operatorname{dist}(x,y) = r \}$. Since $\dim_{\mathrm{iso}}(\Gamma) \geq d$ we know that there is a constant $\varepsilon > 0$ such that for any finite subset $A$ we have that $|\partial A| \geq \varepsilon \cdot c(A)^{(d-1)/d}$. Here, $\partial A = \{ (x,y) : x \sim y,\ x \in A,\ y \notin A \}$ and $c(A) = c(A,G) = \#\{ x \sim y : x \in A,\ y \in G \}$. Specifically, for any $r \geq 2$ and any $x \in G$, we have
\[ c(B(x,r)) = \sum_{y \sim z} \mathbf{1}_{\{\operatorname{dist}(y,x) \leq r\}} = c(B(x,r-1)) + \sum_{y \sim z} \mathbf{1}_{\{\operatorname{dist}(y,x) = r\}} \geq c(B(x,r-1)) + |\partial S(x,r)| \geq c(B(x,r-1)) + \varepsilon\, c(B(x,r-1))^{(d-1)/d} . \]
Summing this from 1 to $r$, we have by induction on $r$ that
\[ c(B(x,r)) \geq \sum_{k=1}^{r} \big( c(B(x,k)) - c(B(x,k-1)) \big) \geq \varepsilon \cdot \sum_{k=1}^{r} c(B(x,k-1))^{(d-1)/d} \geq \varepsilon \cdot \alpha^{(d-1)/d} \cdot \sum_{k=1}^{r-1} k^{d-1} \geq \varepsilon \cdot \alpha^{(d-1)/d} \cdot \frac{1}{d} \cdot \sum_{k=1}^{r-1} \big( k^d - (k-1)^d \big) = \varepsilon \cdot \alpha^{(d-1)/d} \cdot \frac{1}{d} \cdot (r-1)^d \geq \varepsilon \cdot \alpha^{(d-1)/d} \cdot \frac{1}{d} \cdot 2^{-d} \cdot r^d . \]
In order for this to complete the induction, $\alpha$ must be chosen so that $\varepsilon \cdot \alpha^{(d-1)/d} \cdot \frac{1}{d} \cdot 2^{-d} \geq \alpha$, which is equivalent to $\alpha \leq \varepsilon^d\, d^{-d}\, 2^{-d^2}$. :) X

Solution to Exercise 5.20 :( Since the ball of radius $r$, $B_r$, has volume $|B_r| \geq c r^d$ and boundary $|\partial B_r| \leq C r^{d-1}$, we get that $\dim_{\mathrm{iso}}(\mathbb{Z}^d) \leq d$. Also, by the Coulhon–Saloff-Coste inequality, since $|B_r| \geq c r^d$, we get that for any finite subset $A \subset \mathbb{Z}^d$ we have $|\partial A| \geq \frac{|A|}{2 r(2|A|)}$, where $r(n) \leq C' n^{1/d}$. Thus, $|\partial A| \geq c' \cdot |A|^{(d-1)/d}$, and we conclude that $\dim_{\mathrm{iso}}(\mathbb{Z}^d) \geq d$. :) X

Solution to Exercise 5.21 :( Recall that $P$ is self-adjoint with respect to the inner product $\langle \cdot, \cdot \rangle_c$. Using Cauchy–Schwarz,
\[ c_x P^{2t}(x,y) = \left\langle P^{2t} \delta_y , \delta_x \right\rangle_c = \left\langle P^t \delta_y , P^t \delta_x \right\rangle_c \leq \|P^t \delta_y\|_2 \cdot \|P^t \delta_x\|_2 \leq \|P^t\|_{1\to2}^2 \cdot \|\delta_y\|_1 \cdot \|\delta_x\|_1 = \|P^t\|_{1\to2}^2 \cdot c_y \cdot c_x . \]
Also, for any $f \in \ell^1(G,c)$ with $\|f\|_1 = 1$, we have
\[ \left\langle P^t f , P^t f \right\rangle_c = \left\langle P^{2t} f , f \right\rangle_c = \sum_{x,y} c_x P^{2t}(x,y) f(x) f(y) \leq \sup_{x,y} \frac{P^{2t}(x,y)}{c_y} \cdot \sum_{x,y} c_x |f(x)|\, c_y |f(y)| = \sup_{x,y} \frac{P^{2t}(x,y)}{c_y} \cdot \|f\|_1^2 = \sup_{x,y} \frac{P^{2t}(x,y)}{c_y} . \]
Together these show that
\[ \sup_{x,y} \frac{P^{2t}(x,y)}{c_y} = \|P^t\|_{1\to2}^2 . \]

Also,
\[ \|P^t \delta_x\|_c^2 = \left\langle P^{2t} \delta_x , \delta_x \right\rangle_c = c_x P^{2t}(x,x) , \]
so by Cauchy–Schwarz,
\[ c_x P^{2t}(x,y) = \left\langle P^t \delta_y , P^t \delta_x \right\rangle_c \leq \sqrt{ c_x P^{2t}(x,x) \cdot c_y P^{2t}(y,y) } , \]
so
\[ \sup_{x,y} \frac{P^{2t}(x,y)}{c_y} \leq \sup_x \frac{P^{2t}(x,x)}{c_x} , \]
which actually implies equality. :) X

Solution to Exercise 5.22 :( Just note that since $\|Q\|_{2\to2} \leq 1$,
\[ \|Q^{t+1} f\|_2 \leq \|Q\|_{2\to2} \cdot \|Q^t f\|_2 \leq \|Q^t f\|_2 . \]
:) X

Solution to Exercise 5.23 :( First note that
\[ \|Q^t f\|_2^2 - \|Q^{t+1} f\|_2^2 = \left\langle (I - Q^2) Q^t f , Q^t f \right\rangle . \]
Compute:
\begin{align*}
\left\langle (I - Q^2) Q^{t+1} f , Q^{t+1} f \right\rangle - \left\langle (I - Q^2) Q^t f , Q^t f \right\rangle
&= \left\langle (I - Q^2) Q^{t+1} f , Q^{t+1} f - Q^t f \right\rangle + \left\langle (I - Q^2) \big( Q^{t+1} f - Q^t f \big) , Q^t f \right\rangle \\
&= \left\langle (I - Q^2) Q^{t+1} f , (Q - I) Q^t f \right\rangle + \left\langle (Q - I) Q^t f , (I - Q^2) Q^t f \right\rangle \\
&= \left\langle (Q - I) Q^t f , (I - Q^2) \big( Q^{t+1} f + Q^t f \big) \right\rangle = \left\langle (Q - I) Q^t f , (I - Q^2) (Q + I) Q^t f \right\rangle \\
&= \left\langle (Q - I)(I + Q) Q^t f , (I - Q^2) Q^t f \right\rangle = - \left\| (I - Q^2) Q^t f \right\|_2^2 \leq 0 .
\end{align*}
:) X

Solution to Exercise 5.24 :( Let $S \subset \operatorname{supp}(\mu)$ be a finite symmetric generating set. By Exercise 1.75 and the assumptions, we have that $|B_S(1,r)| \geq \alpha' r^d$ for some $\alpha' > 0$ and all $r$. Thus, $r(n) = \inf\{ r : |B_S(1,r)| \geq n \}$ satisfies $r(n) \leq C n^{1/d}$ for some $C = C(\alpha) > 0$. Set $\varepsilon := \min_{s \in S} \mu(s)$. Then, by the Coulhon–Saloff-Coste inequality (Theorem 5.3.1), we know that for any finite subset $A \subset G$,
\[ c(A, A^c) \geq \varepsilon \cdot |\partial A| \geq \frac{\varepsilon |A|}{2 C (2|A|)^{1/d}} \geq \kappa\, c(A)^{(d-1)/d} , \]
where $c = c_\mu$ is the conductance of the network $(G,\mu)$, given by $c_\mu(x,y) = \mu(x^{-1} y)$, and $\kappa = \kappa(d,\mu,\alpha) > 0$. That is, $(G,\mu)$ satisfies a $d$-dimensional isoperimetric inequality. Thus, Theorem 5.5.1 guarantees the proper conclusion. :) X

Solution to Exercise 5.25 :( This is a straightforward application of Exercise 1.75. :) X

Solution to Exercise 5.26 :( Fix some finite generating set $S \subset \operatorname{supp}(\mu)$ of $G$, and consider the Cayley graph with respect to this generating set $S$. Set $\alpha = \min\{ \mu(s) : s \in S \} > 0$. Let $A \subset G$ be a finite subset. For any $x \in A, y \notin A$, we have that $\mu(x^{-1} y) \geq \mathbf{1}_{\{x^{-1} y \in S\}}\, \alpha$, so that
\[ \sum_{x \in A,\, y \notin A} \mu(x^{-1} y) \geq \sum_{x \in A} \mathbf{1}_{\{x \in \partial A\}}\, \alpha = |\partial A|\, \alpha \geq \alpha\, \frac{|A|}{2\, r(2|A|)} , \]
where $r(n) = \inf\{ r \geq 0 : |B_r| \geq n \}$ is the inverse growth rate, and we have used the Coulhon–Saloff-Coste inequality, Theorem 5.3.1. Because $G$ has exponential growth, for some constant $C > 0$ we have that $r(n) \leq C \log n$, so that for some constant $\kappa > 0$ we have that
\[ \sum_{x \in A,\, y \notin A} \mu(x^{-1} y) \geq \kappa\, \frac{|A|}{\log |A|} . \]
:) X

Solution to Exercise 5.27 :( As in the proof of Theorem 5.5.1, we set $P(x,y) = \mu(x^{-1} y)$ and $Q = \frac12 (I + P)$, the lazy version of $P$. So
\[ \mathcal{E}(f,f) \leq 4 \|f\|_2^2 - 4 \|Q f\|_2^2 \]
for all $f \in \ell^2(G,\mu)$. By the previous exercise, $(G,\mu)$ satisfies an $(I,\kappa)$-isoperimetric inequality with $I(t) = \frac{t}{\log t}$. Since $t \mapsto \log t$ is nondecreasing, we get that $(G,\mu)$ satisfies an $N$-Nash inequality with $N(t) = \frac{2}{\kappa^2} \log^2(4t)$. So any $f \in \ell^0(G,\mu)$ admits
\[ \|f\|_2^2 \leq \frac{32}{\kappa^2} \cdot \log^2 \frac{\|f\|_1}{\|f\|_2} \cdot \big( \|f\|_2^2 - \|Q f\|_2^2 \big) . \]
Approximating by a finitely supported function implies that the above also holds for all $f \in \ell^2(G,\mu)$, and specifically for the function $f = Q^t \delta_1$. This function has $\|f\|_1 = 1$, so for $\xi(t) = \|f\|_2^2 = Q^{2t}(1,1)$, we arrive at
\[ \xi(t) \leq \frac{8}{\kappa^2} \cdot \log^2 \xi(t) \cdot \big( \xi(t) - \xi(t+1) \big) . \]
By Exercises 5.22 and 5.23, the functions $t \mapsto \xi(t)$ and $t \mapsto (\xi(t) - \xi(t+1))$ are nonincreasing, so we may interpolate $\xi$ to be a function $\xi : [0,\infty) \to [0,\infty)$ that is nonincreasing and convex. Specifically, $\xi(t+1) - \xi(t) \geq \xi'(t)$ and we arrive at
\[ \xi(t) \leq - \frac{8}{\kappa^2} \cdot \log^2 \xi(t) \cdot \xi'(t) . \]
Setting $\zeta(t) = \log \xi(t)$, we have
\[ 1 \leq - \frac{8}{\kappa^2}\, \zeta(t)^2\, \zeta'(t) . \]
Integrating from $0$ to $t$, we arrive at
\[ t \leq - \frac{8}{3\kappa^2} \big( \zeta(t)^3 - \zeta(0)^3 \big) = - \frac{8}{3\kappa^2} \log^3 Q^{2t}(1,1) , \]
which implies that
\[ Q^{2t}(1,1) \leq \exp\big( - c t^{1/3} \big) , \qquad c = \Big( \frac{3\kappa^2}{8} \Big)^{1/3} . \]
Finally, as in the proof of Theorem 5.5.1, we know that $P^{2t}(1,1) \leq 2 Q^{2t}(1,1)$ and also $P^{2t}(x,y) \leq P^{2t}(1,1)$ for all $x, y$. Since for $t > 0$,
\[ \mathbb{P}[X_t = 1] = P^t(1,1) = \sum_x \mu(x) P^{t-1}(x,1) , \]
the proof is complete. :) X

Solution to Exercise 5.28 Let (U_t)_{t≥1} be i.i.d., each uniform on {−1, 1}, and let (J_t, I_t)_{t≥1} be i.i.d. Bernoulli-½ random variables, independent of (U_t)_t. Define (X_t, σ_t) ∈ L inductively by X_0 = 0, σ_0 = 0, and for t > 0,

(X_t, σ_t) = (X_{t−1}, σ_{t−1}) ·
    (U_t, 0)    if I_t = 1,
    (0, δ_0)    if J_t = 1 ≠ I_t,
    (0, 0)      if J_t = I_t = 0.


It is simple to verify that ((X_t, σ_t))_t is a µ-random walk, for µ uniform on {(1, 0), (−1, 0), (0, δ_0), (0, 0)}. Define V_t = U_t I_t. It is immediate that X_t = V_1 + · · · + V_t. Since (J_t)_t and (U_t, I_t)_t are independent, we have that (J_t)_t and (X_t)_t are also independent. Thus, (J_t)_t and (Q_t)_t are independent. Note that X_{k+1} = X_k if and only if I_{k+1} = 0. For x ∈ Z define the following:

K(x, t) = {0 ≤ k ≤ t−1 : X_{k+1} = X_k = x},

L_t(x) = ∑_{k∈K(x,t)} J_{k+1} = ∑_{k=0}^{t−1} 1_{X_{k+1} = X_k = x} J_{k+1}.

So L_t(x) is the number of times the lamp at x has been flipped up to time t. Thus, σ_t(x) = L_t(x) (mod 2). Note that x ∈ Q_t if and only if K(x, t) ≠ ∅. Also, the sets (K(x, t))_{x∈Q_t} are measurable with respect to σ(X_0, . . . , X_t), and these sets are pairwise disjoint. We have that L_t(x) is determined by (J_{k+1})_{k∈K(x,t)}. Since (J_n)_n are mutually independent, we have that conditioned on (X_n)_n, the random variables (L_t(x))_{x∈Q_t} are mutually independent. Since (J_n)_n and (X_n)_n are independent, we conclude that, conditioned on (X_n)_n, the conditional distribution of (L_t(x))_{x∈Q_t} is that of independent Binomial random variables, each L_t(x) having Binomial-(|K(x, t)|, ½) distribution. It is a simple exercise to show that

P[L_t(x) = 1 (mod 2) | x ∈ Q_t] = P[Bin(|K(x, t)|, ½) = 1 (mod 2)] = ½.

We conclude that the conditional distribution of (σ_t(x))_{x∈Q_t}, conditioned on (X_n)_n, is that of independent Bernoulli-½ random variables. □

Solution to Exercise 5.29 This can be done by induction on n. The base case, n = 1, is just P[X is even] = P[X = 0] = ½. For n > 1, take X = Y + Z for Y, Z independent, with Y ∼ Bin(n−1, ½) and Z ∼ Ber(½). Let E_X be the event that X is even and O_X the event that X is odd. Similarly, let E_Y be the event that Y is even and O_Y the event that Y is odd. Then, by induction, P[E_Y] = 1 − P[O_Y] = ½. So

P[E_X] = P[E_Y, Z = 0] + P[O_Y, Z = 1] = ½(P[Z = 0] + P[Z = 1]) = ½,

completing the induction step. □
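The parity computation above is easy to confirm exactly (a quick sketch, not part of the book's text): the even-index binomial coefficients sum to 2^{n−1}, so the probability that Bin(n, ½) is even is exactly ½ for every n ≥ 1.

```python
from math import comb

def prob_even(n):
    """Exact probability that a Bin(n, 1/2) random variable is even."""
    return sum(comb(n, k) for k in range(0, n + 1, 2)) / 2 ** n

# For every n >= 1 the probability is exactly 1/2, matching the induction.
print([prob_even(n) for n in range(1, 6)])  # [0.5, 0.5, 0.5, 0.5, 0.5]
```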

Solution to Exercise 5.30 Note that since the function x ↦ e^{εx} is convex, by Jensen's inequality we have, for any |x| ≤ 1, by writing x = ((x+1)/2) · 1 + ((1−x)/2) · (−1),

e^{εx} ≤ ((x+1)/2) e^ε + ((1−x)/2) e^{−ε}.

Taking expectations, we have

E[e^{εX}] ≤ ½(e^ε + e^{−ε}) = ∑_{k=0}^∞ ε^{2k}/(2k)! ≤ exp(ε²/2),

where we have used that (2k)! ≥ 2^k k!. □
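The inequality ½(e^ε + e^{−ε}) = cosh ε ≤ exp(ε²/2) used in the solution above can be sanity-checked numerically (a sketch, not from the book; the grid of ε values is an arbitrary choice):

```python
import math

# cosh(eps) = (e^eps + e^-eps)/2 <= exp(eps^2/2), which follows termwise from
# eps^(2k)/(2k)! <= eps^(2k)/(2^k k!); here we just check it on a grid.
checks = [math.cosh(0.1 * k) <= math.exp((0.1 * k) ** 2 / 2) for k in range(1, 50)]
print(all(checks))  # True
```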

Solution to Exercise 5.31 By replacing M_t with (M_t − M_0)/b, we may assume without loss of generality that M_0 = 0 and b = 1. Since E[M_{t+1} − M_t | F_t] = 0, we have that for any ε > 0,

E[e^{ε(M_{t+1} − M_t)} | F_t] ≤ exp(ε²/2),

by Hoeffding's inequality (Exercise 5.30). This leads to

E[e^{εM_t}] = E[ E[e^{ε(M_t − M_{t−1})} | F_{t−1}] · e^{εM_{t−1}} ] ≤ exp(ε²/2) E[e^{εM_{t−1}}] ≤ · · · ≤ exp(tε²/2).

Thus, for any λ, ε > 0,

P[M_t ≥ λ] = P[e^{εM_t} ≥ e^{ελ}] ≤ exp(tε²/2 − ελ),

by Markov's inequality. Optimizing over the choice of ε > 0, we choose ε = λ/t to arrive at the required conclusion. □

Solution to Exercise 5.32 Since µ is symmetric, (X_t)_t is easily seen to be a martingale. Since µ has finite support, there exists b > 0 such that |X_{t+1} − X_t| ≤ b a.s. for all t. Since X_t and −X_t have the same distribution, by Azuma's inequality we have that

P[X_t = x] = P[X_t = |x|] ≤ P[X_t ≥ |x|] ≤ exp(−|x|² / (2b²t)). □
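The Azuma tail bound in the solution above can be compared with simulation for the ±1 simple random walk on Z, where b = 1 (a sketch, not from the book; the values of t, x and the trial count are arbitrary choices):

```python
import math
import random

random.seed(0)

def srw_tail_estimate(t, x, trials):
    """Empirical estimate of P[X_t >= x] for the +-1 simple random walk on Z."""
    hits = 0
    for _ in range(trials):
        if sum(random.choice((-1, 1)) for _ in range(t)) >= x:
            hits += 1
    return hits / trials

t, x = 100, 20
est = srw_tail_estimate(t, x, 20000)
azuma = math.exp(-x * x / (2 * t))  # b = 1 for +-1 steps
print(est, "<=", azuma)
```

The empirical tail probability sits comfortably below the Azuma bound, which is not sharp but has the correct Gaussian order.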

Solution to Exercise 5.34 By the Varopoulos–Carne bound (Theorem 5.6.1), C may be chosen so that for all t > 0 and x, y ∈ G,

P^x[X_t = y] ≤ C exp(−|x⁻¹y|² / (Ct)).

By modifying C, and by Exercise 5.27, for all t > 0,

P^x[X_t = x] ≤ C exp(−Ct^{1/3}).

As in the proof of Theorem 5.5.1, Cauchy–Schwarz tells us that for all x, y ∈ G and any t ≥ 3, we have

P^x[X_t = y] ≤ P[X_{2⌊(t−1)/2⌋} = 1].

Thus, we may compute:

g(1, x) = ∑_{t=0}^m P[X_t = x] + ∑_{t=m+1}^∞ P[X_t = x]
        ≤ C ∑_{t=0}^m exp(−C|x|²/t) + C ∑_{t=m+1}^∞ exp(−Ct^{1/3})
        ≤ C(m+1) exp(−C|x|²/m) + C ∫_m^∞ exp(−Cξ^{1/3}) dξ.

Note that

∫_m^∞ exp(−Cξ^{1/3}) dξ = 3 ∫_{m^{1/3}}^∞ ξ² exp(−Cξ) dξ
    ≤ 3 exp(−(C/2) m^{1/3}) · (2/C) · ∫_0^∞ ξ² (C/2) exp(−(C/2)ξ) dξ
    ≤ (6/C) · (8/C²) · exp(−(C/2) m^{1/3}).

Thus, by choosing m = ⌊|x|^{3/2}⌋ we are done, because g(x, y) = g(1, x⁻¹y). □

Solution to Exercise 5.35 Let ρ < 1 and v > 1 be such that for all t > 0 and r > 0,

sup_x P[X_t = x] ≤ Cρ^t    and    |B(1, r)| ≤ Cv^r.

Let α > 0 be small enough so that ρv^α < 1 (e.g. α < −log ρ / log v). Then,

P[|X_t| ≤ αt] ≤ Cρ^t · |B(1, αt)| ≤ C²(ρv^α)^t.

Since this is exponentially decaying in t, it is summable, and by Borel–Cantelli,

P[lim inf_{t→∞} |X_t|/t ≤ α] = 0. □

Solution to Exercise 5.36 Fix a finite symmetric generating set S for G, and let (X_t)_t be a simple random walk on G (i.e. each step is uniform on S). Fix α ∈ (1, 4/3). As in Exercise 4.43, we may choose an i.i.d.-ν sequence (U_k)_{k≥1}, for

ν(n) = 1_{n≥1} ζ(α)^{−1} n^{−α}    and    ζ(α) = ∑_{n=1}^∞ n^{−α}.

Set T_n = ∑_{k=1}^n U_k. It is easy to see that E[|U_1|^{1/3−ε}] < ∞ as long as ε > 4/3 − α. Also, just as in Exercise 4.43, we can show that

E[(T_n)^{−1/2}] ≤ C(α) n^{−β},    β = (2−α) / (2(α−1)).

For α < 4/3 we have that β > 1. By Corollary 5.5.2, we know that there exists some constant C > 0 such that for all t ≥ 1 we have P[X_t = 1] ≤ Ct^{−1/2}. Since (X_t)_t and T_n are independent, we get that

P[X_{T_n} = 1] = E[ P[X_{T_n} = 1 | T_n] ] ≤ C E[(T_n)^{−1/2}] ≤ C · C(α) n^{−β}.

Since β > 1, the process (X_{T_n})_n is transient. It is easy to prove that for µ(x) = P[X_{T_1} = x], the process (X_{T_n})_n is a µ-random walk, similarly to Exercise 4.44. □


6 Bounded Harmonic Functions


https://doi.org/10.1017/9781009128391.009 Published online by Cambridge University Press


In this chapter we study the space of bounded harmonic functions. We will introduce the Liouville property (Definition 6.2.4), that is, the property that all bounded harmonic functions are constant, and we will consider different necessary and sufficient conditions for this property, related to speed, amenability, and entropy. Theorem 6.4.3 summarizes the main results of this chapter.

6.1 The Tail and Invariant σ-Algebras

Let G be a finitely generated group. Let µ be a symmetric and adapted probability measure on G. Let (X_t)_t denote the µ-random walk on G. We consider the measurable space over which the probability measures P_x (for the random walk on G) are defined. Recall that this is the space G^N equipped with the σ-algebra F spanned by cylinder sets; see Section 1.2. As usual, we define F_t = σ(X_0, . . . , X_t). On the space G^N we have the shift operator θ : G^N → G^N defined by θ(ω)_t := ω_{t+1}. An event A ∈ F is called invariant if θ^{−1}(A) = A.

Exercise 6.1

Show that the collection of all invariant events is a σ-algebra. B solution C

Definition 6.1.1 The tail σ-algebra is defined as

T = ⋂_{t=0}^∞ σ(X_t, X_{t+1}, . . .).

An event A ∈ T is called a tail event. The invariant σ-algebra, denoted I, is defined to be the collection of all invariant events.

Exercise 6.2 Show that the event A = {∃ t_0, ∀ t > t_0 : X_t ≠ 1} is a tail event and an invariant event. (This event is just the event that the walk eventually stops returning to the origin; i.e. it is just what is known as transience.) Show that the event A′ = {∃ t_0, ∀ t > t_0 : X_{2t} ≠ 1} is a tail event, but not an invariant event. Give an example of an event that is not a tail event.

There is another way to construct these σ-algebras. Recall that given an equivalence relation ∼ on G^N, we say that an event A respects ∼ if for any ω ∼ ω′ ∈ G^N we have that ω ∈ A ⟺ ω′ ∈ A. Define two equivalence relations on G^N:


• ω ∼ ω′ if and only if there exist n, k ∈ N such that θ^k(ω) = θ^n(ω′);
• ω ≈ ω′ if and only if there exists n such that θ^n(ω) = θ^n(ω′).

Of course, ω ≈ ω′ implies ω ∼ ω′, but not the other way around.

Exercise 6.3 Show that A ∈ σ(X_t, X_{t+1}, . . .) if and only if A respects the relation ∼_t on G^N, which is defined by ω ∼_t ω′ if and only if θ^t(ω) = θ^t(ω′). B solution C

Exercise 6.4 Show that an event A is a tail event if and only if for any ω ∈ A and ω′ ≈ ω we have ω′ ∈ A as well (i.e. A respects ≈). Show that an event A is an invariant event if and only if for any ω ∈ A and ω′ ∼ ω we have ω′ ∈ A as well (i.e. A respects ∼). B solution C

Exercise 6.5 Show that I ⊂ T (any invariant event is also a tail event).

Exercise 6.6

Show that if A ∈ T then θ −1 ( A) ∈ T as well.

B solution C

Exercise 6.7 Assume that A ∈ T \ I. Show that there exists ∅ ≠ B ∈ T such that B ∩ θ(B) = ∅. B solution C

Exercise 6.8 Show that a random variable Y on (G^N, F) is I-measurable if and only if Y = Y ∘ θ. B solution C

Exercise 6.9 Let Y be a random variable on (G^N, F). Show that the following are equivalent.

(1) Y is measurable with respect to σ(X_n, X_{n+1}, . . .).
(2) For any ω, ω′ ∈ G^N such that θ^n(ω) = θ^n(ω′), we have Y(ω) = Y(ω′).
(3) There exists a random variable Z such that Y = Z ∘ θ^n. B solution C

Exercise 6.10 Show that a random variable Y on (G^N, F) is T-measurable if and only if there is a sequence of random variables (Y_n)_n such that Y = Y_n ∘ θ^n for every n.

6.2 Parabolic and Harmonic Functions

Recall the averaging operator P(x, y) = µ(x⁻¹y); P f(x) = ∑_y P(x, y) f(y) = E_x[f(X_1)].


Definition 6.2.1 A function f : G × N → C is called parabolic if P f(·, n+1) = f(·, n) for all n. By this we mean: for all x ∈ G and n ∈ N,

∑_y P(x, y) f(y, n+1) = f(x, n),

and all sums converge absolutely.

We may view a function f : G × N → C as a sequence of functions f_n : G → C by f_n(x) = f(x, n). So a parabolic function is a sequence of functions admitting P f_{n+1} = f_n. Recall that f : G → C is harmonic if P f = f. Given a harmonic function f we may define f_n = f for all n, and we see that any harmonic function induces a parabolic function.

Exercise 6.11

Give an example of a parabolic function that is not harmonic.

Exercise 6.12 Show that the function f : G × N → C is parabolic if and only if (f(X_t, t))_t is a martingale. Show that if f : G × N → C is a parabolic function, then (f(X_t, t+n))_t is a martingale, for any fixed n ≥ 0.

Exercise 6.13 Show that f : G × N → C is parabolic if and only if Re f and Im f are both parabolic.

We now provide the relation between bounded parabolic functions and the tail σ-algebra, and also between bounded harmonic functions and the invariant σ-algebra.

Exercise 6.14 Let µ be a symmetric, adapted probability measure on a finitely generated group G. Let (X_t)_t be the µ-random walk. Show that if h ∈ BHF(G, µ) is a bounded harmonic function, then there exists an I-measurable integrable random variable L such that

L = lim_{t→∞} h(X_t)    a.s.

(In particular, the limit above exists a.s.)

B solution C

Exercise 6.15 Let µ be a symmetric, adapted probability measure on a finitely generated group G. Let (X_t)_t be the µ-random walk. Show that if h : G × N → C is a bounded parabolic function, then there exists a T-measurable integrable random variable L such that

L = lim_{t→∞} h(X_t, t)    a.s.

(In particular, the limit above exists a.s.)


B solution C


Exercise 6.16 Let µ be a symmetric, adapted probability measure on a finitely generated group G. Let (X_t)_t be the µ-random walk. Let L be an integrable random variable. Show that E_y[L] = E_x[L ∘ θ^t | X_t = y]. B solution C

Proposition 6.2.2 (Poisson formula (parabolic case)) Let G be a finitely generated group and µ an adapted probability measure on G. There is a correspondence between bounded T-measurable random variables and bounded parabolic functions on G. If L is a bounded T-measurable (complex) random variable, then f_L(x, t) := E_x[L_t] is a bounded parabolic function. Here, L_t are random variables such that L = L_t ∘ θ^t. Conversely, if f is a bounded parabolic function, then

L_f := lim sup_{t→∞} f(X_t, t)

is T-measurable (and obviously bounded). These mappings are "inverses" of one another, in the sense that f_{L_f} = f and P_x[L_{f_L} = L] = 1 for all x ∈ G.

Proof If L is T-measurable and bounded, by the Markov property,

f_L(X_t, t) = E_{X_t}[L_t] = E_x[L_t ∘ θ^t | F_t] = E_x[L | F_t].

So (f_L(X_t, t))_t is a martingale (by the tower property), implying that f_L is indeed a bounded parabolic function. If f is a bounded parabolic function, then L_f = lim sup_{t→∞} f(X_t, t) is T-measurable by Exercise 6.15, and L_f = lim_{t→∞} f(X_t, t) a.s. by the martingale convergence theorem (Theorem 2.6.3). Note that L_f = lim_{n→∞} f(X_{t+n}, t+n) a.s., so that for

K_t = lim sup_{n→∞} f(X_n, t+n),

we have L_f = K_t ∘ θ^t a.s. Also, f(·, · + t) is a bounded parabolic function, so that (f(X_n, n+t))_n is a martingale. This implies by dominated convergence that

f_{L_f}(x, t) = E_x[K_t] = E_x[lim_n f(X_n, t+n)] = lim_n E_x[f(X_n, t+n)] = f(x, t).

Also, if L = L_t ∘ θ^t is a bounded T-measurable random variable, then a.s.,

L_{f_L} = lim_{t→∞} f_L(X_t, t) = lim_{t→∞} E_x[L | F_t] = E_x[L | F] = L,

where we have used that σ(⋃_t F_t) = F. □


Exercise 6.17 (Poisson formula) Let G be a finitely generated group and µ an adapted probability measure on G. Show that there is a correspondence between bounded I-measurable random variables and bounded harmonic functions on G, as follows. If L is a bounded I-measurable (complex) random variable, then h_L(x) := E_x[L] is a bounded harmonic function. Conversely, if h is a bounded harmonic function, then L_h := lim sup_{t→∞} h(X_t) is I-measurable (and obviously bounded). These maps are "inverses" of one another, in the sense that h_{L_h} = h and P_x[L_{h_L} = L] = 1 for all x ∈ G. B solution C

Definition 6.2.3 A sub-σ-algebra G ⊂ F is called trivial if for every x ∈ G and any event A ∈ G, we have that Px [A] ∈ {0, 1}.

In general, the value 0 or 1 for the probability of a given event may depend on the specific measure P_x. However, if the invariant σ-algebra is trivial, then this value cannot depend on the specific starting point, as the next exercise shows.

Exercise 6.18 Show that I is trivial if and only if for every bounded I-measurable random variable L, there is a constant c such that P_x[L = c] = 1 for all x ∈ G. B solution C

Exercise 6.19 Give an example of a tail event A ∈ T such that P_x[A] ∈ {0, 1} for any x ∈ G, but there exist x, y such that P_x[A] ≠ P_y[A].

Exercise 6.20 (Kolmogorov 0-1 law) Let A be a tail event, and suppose that A is independent of F_t for all t. Prove that P_x[A] ∈ {0, 1} for all x ∈ G. Conclude that if T is independent of F_t for all t, then T is trivial. B solution C

Recall Liouville's theorem (which was actually first proved by Cauchy): any bounded harmonic function on R^n is constant. Here, harmonicity is with respect to the classical Laplace operator ∆ = ∑_j ∂²/∂x_j². Inspired by this theorem, we have the following definition.

Definition 6.2.4 Let G be a finitely generated group and µ a probability measure on G. We call (G, µ) Liouville if every bounded µ-harmonic function is constant.


Recall Theorem 2.7.1, which tells us that the space of bounded µ-harmonic functions, BHF (G, µ), is either only the constant functions, or has infinite dimension. That is, a restatement of Theorem 2.7.1 is: (G, µ) is Liouville if and only if dim BHF (G, µ) < ∞. Our first connection is between the Liouville property and the invariant σ-algebra. Proposition 6.2.5 Let G be a finitely generated group and µ a symmetric and adapted probability measure on G. Then I is trivial if and only if every bounded harmonic function is constant (the Liouville property).

Proof The correspondence between bounded harmonic functions and bounded I-measurable random variables, together with Exercise 6.18, proves the proposition. If every bounded harmonic function is constant, then any bounded I-measurable random variable can be written as L = lim sup_{t→∞} h(X_t) for some bounded harmonic function h. Since h is constant, so is L. Thus I is trivial, by Exercise 6.18. For the other direction, if I is trivial, then for any bounded harmonic function h, we have that h(x) = E_x[L] for some bounded I-measurable random variable L. By Exercise 6.18 there is c such that P_x[L = c] = 1 for all x ∈ G, implying that h ≡ c is constant. □

Exercise 6.21 Let G be a finitely generated group, and let µ be an adapted probability measure on G. Show that if (G, µ) is recurrent, then (G, µ) is Liouville. B solution C

6.3 Entropic Criterion

6.3.1 Entropy

Let X be a discrete random variable, meaning that X takes only countably many possible values. For any x, let p(x) be the probability that X = x. The (Shannon) entropy of X is defined as H(X) = E[−log p(X)] = −∑_x p(x) log p(x), with the convention 0 log 0 = 0. Strictly speaking, H is not a function of the specific random variable X, rather only of the distribution of X (so a function of p in the notation above). For a probability measure ν on a countable set Ω, one may similarly define H(ν) = −∑_x ν(x) log ν(x), again with 0 log 0 = 0. The notation H(X) is convenient, if somewhat misleading. See Section B.1 for


some motivation behind this definition, and Cover and Thomas (1991) for a more in-depth discussion of entropy.

Let σ be a σ-algebra on Ω. For a discrete random variable X, we can define the conditional probability p_σ(x) = E[1_{X=x} | σ]. Since a.s. ∑_x p_σ(x) = 1, we have a "random" entropy

H_σ(X) = −∑_x p_σ(x) log p_σ(x).

Taking expectations, we define H(X | σ) = E[H_σ(X)]. If Y is another discrete random variable, then we define H(X | Y) = H(X | σ(Y)). It is important to note that H_σ(X) is a random variable, but H(X | σ) is just some number (possibly infinite). In the next few exercises we work out some basic properties of entropy. Note that if X, Y are discrete random variables, then (X, Y) is also a discrete random variable.

Exercise 6.22 Show that H(X | Y) = H(X, Y) − H(Y). B solution C

Exercise 6.23

Show that H (X ) − H (X |Y ) = H (Y ) − H (Y |X ).

B solution C
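The chain rule of Exercise 6.22 can be checked numerically on a small joint distribution (a sketch, not from the book; the joint law below is an arbitrary choice):

```python
import math

def H(dist):
    """Shannon entropy (natural logarithm) of a dict of probabilities."""
    return -sum(p * math.log(p) for p in dist.values() if p > 0)

# An arbitrary small joint distribution of (X, Y) for the check.
joint = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.4, (1, 1): 0.2}
margY = {}
for (x, y), p in joint.items():
    margY[y] = margY.get(y, 0.0) + p

# H(X | Y) computed from its definition as an average of conditional entropies...
H_X_given_Y = sum(
    margY[y] * H({x: joint[(x, y)] / margY[y] for x in (0, 1)})
    for y in (0, 1)
)
# ...agrees with the chain rule H(X | Y) = H(X, Y) - H(Y).
print(abs(H_X_given_Y - (H(joint) - H(margY))) < 1e-12)  # True
```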

Exercise 6.24 Let X, Y be discrete random variables taking values in a set G. Let K_G denote the collection of all finite subsets of G. Assume that there exists a function ϕ : G → K_G such that P[X ∈ ϕ(Y)] = 1. Prove that H(X | Y) ≤ E[log |ϕ(Y)|]. Conclude that for a random variable X taking values in some finite set F, we have that H(X) ≤ log |F|. B solution C

Exercise 6.25 Show that if σ ⊆ G then H(X | G) ≤ H(X | σ). B solution C

Exercise 6.26 Show that H(X | σ) = H(X) if and only if X is independent of σ. B solution C

Exercise 6.27 Show that if G is a σ-algebra such that for any A ∈ G we have P[A] ∈ {0, 1}, then H(X | G) = H(X). B solution C

Exercise 6.28 Show that H(X, Y) ≤ H(X) + H(Y). B solution C

Exercise 6.29 Show that H(X) ≤ H(X, Y).

B solution C

We will also require the following convergence theorem.

Lemma 6.3.1 Let (F_n) be an increasing sequence of σ-algebras, and let F_∞ = σ(⋃_n F_n). Let (σ_n) be a decreasing sequence of σ-algebras, and let σ_∞ = ⋂_n σ_n. Then, for any random variable X with H(X) < ∞,

lim_{n→∞} H(X | F_n) = H(X | F_∞),    lim_{n→∞} H(X | σ_n) = H(X | σ_∞).

Proof Let Y_n = E[1_{X=x} | F_n]. The tower property shows that this is a bounded martingale. So Y_n → Y_∞ a.s. for some integrable Y_∞, by martingale convergence (Theorem 2.6.3). For any A ∈ F_n we have that

E[Y_∞ 1_A] = E[lim_m Y_m 1_A] = lim_m E[Y_m 1_A] = lim_m E[Y_n 1_A] = E[1_{X=x} 1_A],

where we have used dominated convergence of Y_m 1_A → Y_∞ 1_A and the fact that E[Y_m | F_n] = Y_n for all m > n, by the tower property. (This is where it is used that (F_n)_n are increasing.) Since ⋃_n F_n is a π-system generating the σ-algebra F_∞ = σ(⋃_n F_n), we get that E[Y_∞ 1_A] = E[1_{X=x} 1_A] for all A ∈ F_∞ (by Dynkin's lemma). Also, Y_∞ is F_∞-measurable, so E[1_{X=x} | F_∞] = Y_∞ = lim_n E[1_{X=x} | F_n]. We conclude that E[1_{X=x} | F_n] → E[1_{X=x} | F_∞] for all x.

Let φ(ξ) = −ξ log ξ. For any x, the sequence (φ(E[1_{X=x} | F_n]))_n is bounded above by e^{−1} and below by 0. Since φ(E[1_{X=x} | F_n]) → φ(E[1_{X=x} | F_∞]) a.s., it also converges in L¹, by dominated convergence. Let f_n(x) = E[φ(E[1_{X=x} | F_n])] and f_∞(x) = E[φ(E[1_{X=x} | F_∞])]. Let g(x) = φ(P[X = x]). Note that the above is the assertion that f_n → f_∞ pointwise. Also, by Jensen,

f_n(x) ≤ φ(E[E[1_{X=x} | F_n]]) = φ(P[X = x]) = g(x).

Since ∑_x g(x) = H(X) < ∞, we get by dominated convergence that ∑_x f_n(x) → ∑_x f_∞(x). Finally, note that since f_n(x) ≥ 0, we have that

H(X | F_n) = ∑_x E[φ(E[1_{X=x} | F_n])] = ∑_x f_n(x),

and similarly for H(X | F_∞). So we conclude that H(X | F_n) → H(X | F_∞).

For the second assertion, we use the backward martingale convergence theorem (Exercise 2.35) to obtain that E[1_{X=x} | σ_n] → E[1_{X=x} | σ_∞]. Similarly to the first case, we can easily deduce from this the second assertion. □


Exercise 6.30 Show that there exists a universal constant C > 0 such that the following holds. Let X be a random variable taking values in the natural numbers N (P[X ∈ N] = 1). Then, H(X) ≤ C E[log(X + 1)] + C. B solution C

Exercise 6.31 Give an example of a discrete real-valued bounded random variable X with H(X) = ∞. B solution C

6.3.2 The Entropy Criterion

Proposition 6.3.2 Let G be a finitely generated group, and µ a probability measure on G with finite entropy. Let (X_t)_t denote the µ-random walk. Denote F_t = σ(X_0, . . . , X_t) and σ_t = σ(X_t, X_{t+1}, . . .). For any k < n and any m > 0,

H(X_1, . . . , X_k | X_n) = H(X_1, . . . , X_k | X_n, . . . , X_{n+m}) = H(X_1, . . . , X_k | σ_n).

Also, H(X_1, . . . , X_k, X_n) = k H(X_1) + H(X_{n−k}).

Proof Fix some ℓ ≥ 1. Let Y_t^{(ℓ)} = (X_ℓ)⁻¹ X_{t+ℓ}. Note that (Y_t^{(ℓ)})_t is a µ-random walk, which is independent of X_1, . . . , X_ℓ. Thus,

H(X_1, . . . , X_k, X_n, . . . , X_{n+m}) = H(X_1, . . . , X_k, X_n, . . . , X_{n+m−1}, Y_1^{(n+m−1)})
    = H(X_1, . . . , X_k, X_n, . . . , X_{n+m−1}) + H(X_1).

Similarly, H(X_n, . . . , X_{n+m}) = H(X_n, . . . , X_{n+m−1}) + H(X_1), implying that

H(X_1, . . . , X_k | X_n, . . . , X_{n+m}) = H(X_1, . . . , X_k | X_n, . . . , X_{n+m−1})

for all m ≥ 1. Since the σ-algebras (σ(X_n, . . . , X_{n+m}))_{m≥0} increase to σ_n, by Lemma 6.3.1 we conclude that for all m ≥ 0,

H(X_1, . . . , X_k | X_n) = H(X_1, . . . , X_k | X_n, . . . , X_{n+m}) = H(X_1, . . . , X_k | σ_n).

The second assertion follows inductively on k, using that

H(X_1, . . . , X_k, X_n) = H(X_1, Y_1^{(1)}, . . . , Y_{k−1}^{(1)}, Y_{n−1}^{(1)}) = H(X_1) + H(X_1, . . . , X_{k−1}, X_{n−1}). □


Definition 6.3.3 For a probability measure µ on a finitely generated group G, define the random walk entropy to be

h(G, µ) = lim_{t→∞} H(X_t)/t.

(Note that h(G, µ) is also known as Avez entropy.)
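For the simple random walk on Z, H(X_t) grows only like (1/2) log t, so the random walk entropy vanishes; this can be seen in a quick numerical sketch (not from the book; natural logarithms, as in the text):

```python
import math
from math import comb

def H_t(t):
    """Exact entropy of the simple random walk on Z at time t (X_t = 2k - t)."""
    probs = [comb(t, k) / 2**t for k in range(t + 1)]
    return -sum(p * math.log(p) for p in probs if p > 0)

# H(X_t)/t decreases towards 0, so h(Z, mu) = 0 for this walk.
for t in (10, 100, 1000):
    print(t, H_t(t) / t)
```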

Exercise 6.32 Show that the limit h(G, µ) is well defined and nonnegative. Show that h(G, µ) ≤ H(X_1). B solution C

Exercise 6.33 Show that if µ ∈ SA(G, 1) then H(µ) < ∞.

B solution C

Exercise 6.34 Give an example of a finitely generated group G and a symmetric, adapted probability measure µ such that µ ∈ SA(G, 1 − ε) for any ε > 0, and H(µ) = ∞. B solution C

We now make another connection, this time between random walk entropy and the tail σ-algebra. Proposition 6.3.4 (Kaimanovich–Vershik entropic criterion) Let µ be a probability measure on a finitely generated group G. Assume that H (µ) < ∞. Then, h(G, µ) = 0 if and only if T is trivial.

Proof Let (X_t)_t be the µ-random walk. Let σ_n = σ(X_n, X_{n+1}, . . .). Let h = h(G, µ). For any k < n,

H(X_1, . . . , X_k | X_n) = k H(X_1) − H(X_n) + H(X_{n−k}),

by Proposition 6.3.2. This leads to the fact that

H(X_{n+1}) − H(X_n) = H(X_{n+1}) − H(X_{n+1} | X_1) = H(X_1) − H(X_1 | X_{n+1}) = H(X_1) − H(X_1 | σ_{n+1})

is a nonincreasing nonnegative sequence. Thus, it converges to a limit given by

lim_{n→∞} (H(X_n) − H(X_{n−1})) = lim_{n→∞} (1/n) ∑_{k=1}^n (H(X_k) − H(X_{k−1})) = lim_{n→∞} H(X_n)/n = h.


Thus, for any k ≥ 1,

k h = lim_{n→∞} ∑_{j=0}^{k−1} (H(X_{n−j}) − H(X_{n−j−1})) = lim_{n→∞} (H(X_n) − H(X_{n−k}))
    = k H(X_1) − H(X_1, . . . , X_k | T) = H(X_1, . . . , X_k) − H(X_1, . . . , X_k | T).

We conclude that if h = 0 then T is independent of any (X_1, . . . , X_k). This implies that T is trivial, by the Kolmogorov 0-1 law (Exercise 6.20). On the other hand, if T is trivial, then H(X_1 | T) = H(X_1) and so h = 0. □

6.4 Triviality of Invariant and Tail σ-Algebras

In this section we will compare bounded parabolic and harmonic functions. We will require the following estimate regarding couplings of binomial distributions. For background on couplings, see Appendix C.

Exercise 6.35 Show that there exists a constant c > 0 such that for any p ∈ (0, 1) and any n ≥ 1, there exists a coupling of B ∼ Bin(n, p) and B′ ∼ Bin(n+1, p) such that P[B ≠ B′] ≤ c n^{−1/2}. B solution C
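The quantity in Exercise 6.35 is governed by the total variation distance between Bin(n, p) and Bin(n+1, p): an optimal coupling achieves P[B ≠ B′] equal to that distance. The following sketch (not from the book; p = ½ is an arbitrary choice) computes the distance exactly and exhibits the n^{−1/2} decay:

```python
from math import comb

def binom_pmf(n, p, k):
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def tv_bin(n, p):
    """Total variation distance between Bin(n, p) and Bin(n+1, p); an optimal
    coupling of the two laws has P[B != B'] equal to this distance."""
    return 0.5 * sum(
        abs(binom_pmf(n, p, k) - binom_pmf(n + 1, p, k)) for k in range(n + 2)
    )

# The rescaled distance tv * sqrt(n) stays of constant order as n grows.
for n in (10, 100, 1000):
    print(n, tv_bin(n, 0.5), tv_bin(n, 0.5) * n**0.5)
```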

Exercise 6.36 Let ν be a probability measure on G (not necessarily symmetric or adapted). Let p ∈ (0, 1) and let µ = pδ_1 + (1−p)ν be a lazy version of ν. Show that if (X_t)_t is a ν-random walk and B_t ∼ Bin(t, 1−p) is independent of (X_t)_t, then X_{B_t} has law µ^t (the tth step of a µ-random walk). B solution C

Lemma 6.4.1 Let ν be a probability measure on G (not necessarily symmetric or adapted). Let p ∈ (0, 1) and let µ = pδ_1 + (1−p)ν be a lazy version of ν. Then, any bounded µ-parabolic function (f_n)_n is actually a µ-harmonic function: f_n = f_0 for all n.

Proof If (X_t)_t is a ν-random walk and B ∼ Bin(t, 1−p), then X_B has the distribution of the tth step of a µ-random walk, by Exercise 6.36. Let B ∼ Bin(t, 1−p) and B′ ∼ Bin(t+1, 1−p), coupled so that P[B ≠ B′] ≤ ct^{−1/2} for some constant c > 0, as in Exercise 6.35. Then, if f is µ-parabolic, f(x, n) = E_x[f(X_{B′}, n+t+1)] and f(x, n+1) = E_x[f(X_B, n+1+t)]. If f is bounded by M, then

|f(x, n) − f(x, n+1)| ≤ E_x[ |f(X_{B′}, n+t+1) − f(X_B, n+1+t)| · 1_{B≠B′} ] ≤ 2M · P[B ≠ B′] ≤ 2Mc t^{−1/2}.


Taking t → ∞, we have that f (x, n) = f (x, n + 1) for all x ∈ G and all n. This immediately implies that f (·, n) = f (·, 0) is µ-harmonic.

Proposition 6.4.2 Let µ be an adapted probability measure on a finitely generated group G. For any tail event A ∈ T, there exists an invariant event B ∈ I such that P_x[A △ B] = 0 for all x ∈ G. Specifically, if I is trivial then T is also trivial.

Proof Let k = inf{t ≥ 1 : P[X_t = 1] > 0}. Since µ is adapted, k < ∞. Also, set p = P[X_k = 1], and define the probability measure

ν = (µ^k − p δ_1) / (1 − p),

recalling that µ^k denotes the kth convolution power of µ. Now, let A ∈ T be a tail event. By Proposition 6.2.2, we can find a bounded parabolic function (f_n)_n such that for any x ∈ G we have P_x[f_n(X_n) → 1_A] = 1. We have that (f_{kn+ℓ})_n is a bounded µ^k-parabolic function, for any fixed 0 ≤ ℓ < k. Since µ^k = pδ_1 + (1−p)ν, by Lemma 6.4.1 we conclude that f_{kn+ℓ} = f_ℓ for all 0 ≤ ℓ < k and all n ≥ 0, and f_ℓ is a µ^k-harmonic function.

Also, the function g = (1/k) ∑_{ℓ=0}^{k−1} f_ℓ is µ-harmonic; indeed,

P g = (1/k) ∑_{ℓ=0}^{k−1} P f_ℓ = (1/k) ∑_{ℓ=0}^{k−1} P f_{k+ℓ} = (1/k) ∑_{ℓ=0}^{k−1} f_{k+ℓ−1} = (1/k) ∑_{ℓ=0}^{k−1} f_ℓ = g.

The above argument tells us that for any 0 ≤ ℓ < k and any x ∈ G,

P_x[ lim_{n→∞} f_ℓ(X_{kn+ℓ}) = 1_A ] = 1.

This implies that

P_x[ lim_{n→∞} f_ℓ(X_{kn}) = 1_A ] = ∑_y P_x[X_{k−ℓ} = y] · P_y[ lim_{n→∞} f_ℓ(X_{kn+ℓ}) = 1_A ] = 1.

Averaging over ℓ, we find that g(X_{kn}) → 1_A P_x-a.s. for any x ∈ G. However, since g is a bounded µ-harmonic function, the random variable L = lim sup_n g(X_n) is I-measurable, and by Exercise 6.17 we know that g(X_n) → L P_x-a.s. for any x ∈ G. Thus, for any x ∈ G we have that P_x[L = 1_A] = 1, which implies that P_x[A △ B] = 0 for B = {L = 1} ∈ I. □


Exercise 6.37 Let G be a finitely generated group, and let µ be a symmetric and adapted probability measure on G. Let (f_n = f(·, n))_n be a bounded parabolic function. Show that h = f_0 + f_1 is a µ-harmonic function.

We summarize this section so far with the following theorem. Theorem 6.4.3 Let G be a finitely generated group and µ an adapted probability measure on G with finite entropy. The following are equivalent:

• Every bounded µ-harmonic function is constant (i.e. (G, µ) is Liouville).
• dim BHF(G, µ) < ∞.
• dim BHF(G, µ) = 1.
• The invariant σ-algebra I is trivial.
• The tail σ-algebra T is trivial.
• h(G, µ) = 0.

Exercise 6.38 Let µ be an adapted probability measure on a finitely generated group G, with H(µ) < ∞. Show that (G, µ^n) is Liouville if and only if (G, µ) is Liouville. B solution C

6.5 An Entropy Inequality

In this section we quantify one direction of the entropic criterion for the Liouville property. We begin by introducing more notions related to entropy and information theory.

Definition 6.5.1 Let µ, ν be two probability measures on a countable set Ω. The Kullback–Leibler divergence, denoted D(µ||ν), is defined as

D(µ||ν) = ∑_x µ(x) log( µ(x) / ν(x) ),

where p log(p/0) is interpreted as ∞ for p > 0, and 0 log(0/0) is interpreted as 0 (so that D(µ||ν) is finite only if µ ≪ ν). If X, Y are discrete random variables with laws µ(x) = P[X = x] and ν(y) = P[Y = y], we define D(X||Y) = D(µ||ν). Be careful! D(µ||ν) does not necessarily equal D(ν||µ).

Definition 6.5.2 Let X, Y be discrete random variables with finite entropy. The mutual information of X and Y is defined as


I(X, Y) := H(X) + H(Y) − H(X, Y).

Exercise 6.39 Show that I(X, Y) = I(Y, X). Give an example where D(X||Y) ≠ D(Y||X), but both quantities are finite.
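A concrete instance of the asymmetry asked for in Exercise 6.39 (a sketch, not from the book; the two measures below are arbitrary choices):

```python
import math

def D(mu, nu):
    """Kullback-Leibler divergence D(mu || nu) for dicts over a common support."""
    total = 0.0
    for x, p in mu.items():
        if p > 0:
            q = nu.get(x, 0.0)
            if q == 0.0:
                return math.inf
            total += p * math.log(p / q)
    return total

mu = {0: 0.5, 1: 0.5}
nu = {0: 0.9, 1: 0.1}
print(D(mu, nu), D(nu, mu))  # two different finite values
```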

The relationship between Kullback–Leibler divergence and mutual information is given in the following exercise.

Exercise 6.40 Let X, Y be discrete random variables with finite entropy. For every y in the support of Y, let (X | Y = y) be the random variable with distribution P[(X | Y = y) = x] = P[X = x | Y = y]. Show that

I(X, Y) = ∑_y P[Y = y] · D( (X | Y = y) || X ). B solution C

Proposition 6.5.3 Let X and Y be two discrete random variables (i.e. taking values in some countable set). Let f be some complex-valued function defined on the range of X and Y such that E[|f(X)|²], E[|f(Y)|²] are finite. Then,

|E[f(X)] − E[f(Y)]|² ≤ 2D(X||Y) · (E[|f(X)|²] + E[|f(Y)|²]).

Proof Let

J(X, Y) = Σ_z (P[X = z] − P[Y = z])² / (P[X = z] + P[Y = z]).

Let R_X = {z : P[X = z] > 0} and R_Y = {z : P[Y = z] > 0}. If there exists z ∈ R_X \ R_Y, then D(X||Y) = ∞ and there is nothing to prove. So let us assume that R_X ⊆ R_Y. Hence we can write p(z) := P[X = z] / P[Y = z] for z ∈ R_Y, and

J(X, Y) = Σ_{z∈R_Y} P[Y = z] · (1 − p(z))² / (1 + p(z)).

Consider the function φ(ξ) = ξ log ξ (with φ(0) = 0). We have that φ′(ξ) = log ξ + 1 and φ″(ξ) = 1/ξ. Thus, expanding around 1, we have for all ξ > 0 that

ξ log ξ − ξ + 1 ≥ (ξ − 1)² / (2(1 + ξ)).

This implies

J(X, Y) ≤ 2 Σ_z P[Y = z] (1 − p(z)) + 2D(X||Y) = 2D(X||Y).


Using Cauchy–Schwarz, one obtains

|E[f(X)] − E[f(Y)]| ≤ Σ_{z∈R_Y} |P[X = z] − P[Y = z]| · |f(z)|
  = Σ_{z∈R_Y} ( |P[X = z] − P[Y = z]| / √(P[X = z] + P[Y = z]) ) · √(P[X = z] + P[Y = z]) · |f(z)|
  ≤ √(J(X, Y)) · √(E[|f(X)|²] + E[|f(Y)|²])
  ≤ √(2D(X||Y)) · √(E[|f(X)|²] + E[|f(Y)|²]),

as required.

For background on total variation distance see Appendix C.

Exercise 6.41 (Pinsker's inequality) Prove Pinsker's inequality: for any two finite entropy probability measures µ, ν, we have

||µ − ν||_TV ≤ √(2 D(µ||ν)). B solution C
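Pinsker's inequality is also easy to test numerically. The sketch below is ours; it assumes the normalization ||µ − ν||_TV = sup_A |µ(A) − ν(A)| = (1/2) Σ_x |µ(x) − ν(x)| (Appendix C fixes the book's convention), and checks the bound on randomly generated pairs of measures.

```python
import math, random

def kl(mu, nu):
    # Kullback-Leibler divergence for two strictly positive probability vectors.
    return sum(p * math.log(p / q) for p, q in zip(mu, nu) if p > 0)

def tv(mu, nu):
    # ||mu - nu||_TV = sup_A |mu(A) - nu(A)| = (1/2) sum_x |mu(x) - nu(x)|.
    return 0.5 * sum(abs(p - q) for p, q in zip(mu, nu))

random.seed(0)
for _ in range(1000):
    n = random.randint(2, 6)
    w1 = [random.random() + 1e-9 for _ in range(n)]
    w2 = [random.random() + 1e-9 for _ in range(n)]
    mu = [w / sum(w1) for w in w1]
    nu = [w / sum(w2) for w in w2]
    # Pinsker: the TV distance is controlled by sqrt(2 D(mu||nu)).
    assert tv(mu, nu) <= math.sqrt(2 * kl(mu, nu)) + 1e-12
```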

Exercise 6.42 Let G be a finitely generated group and µ an adapted measure on G. Let (X_t)_t denote the µ-random walk. For t ≥ 0 and x ∈ G, define

K_t(x) = ||P_x[X_t = ·] − P[X_{t+n(x)} = ·]||_TV,

where n(x) = inf{n ≥ 1 : µ^n(x) > 0}. Show that (G, µ) is Liouville if and only if for any x ∈ G we have K_t(x) → 0 as t → ∞. B solution C

Exercise 6.43 Give an example of a finitely generated group G and µ ∈ SA(G, ∞) for which (G, µ) is Liouville, but there exists some x with

||P_x[X_t = ·] − P[X_t = ·]||_TV = 1 for all t. B solution C

Corollary 6.5.4 Let X, Y be discrete random variables. Let f be some complex-valued function on the range of X such that E[|f(X)|²] < ∞. Then,

E| E[f(X)|Y] − E[f(X)] | ≤ 2 √(I(X, Y)) · √(Var[f(X)]).

Proof By subtracting the constant E[f(X)] from f we may assume without loss of generality that E[f(X)] = 0. Recall that for any y such that P[Y = y] > 0 we have the random variable (X|Y = y), which has law P[(X|Y = y) = x] = P[X = x | Y = y]. By


Proposition 6.5.3 applied to X and (X|Y = y),

|E[f(X|Y = y)] − E[f(X)]|² ≤ 2D((X|Y = y)||X) · (E[|f(X|Y = y)|²] + E[|f(X)|²]).

Note that by Exercise 6.40,

I(X, Y) = Σ_y P[Y = y] D((X|Y = y)||X),

and also

Σ_y P[Y = y] E[|f(X|Y = y)|²] = E[|f(X)|²].

Thus, the Cauchy–Schwarz inequality implies that

E| E[f(X)|Y] − E[f(X)] | ≤ Σ_y P[Y = y] · |E[f(X|Y = y)] − E[f(X)]|
  ≤ Σ_y P[Y = y] √( 2D((X|Y = y)||X) · (E[|f(X|Y = y)|²] + E[|f(X)|²]) )
  ≤ √2 · √(I(X, Y)) · √(2 E[|f(X)|²]),

which is the required conclusion.

Theorem 6.5.5 Let G be a finitely generated group and µ a probability measure on G, with H(µ) < ∞. Let (X_t)_{t≥0} be the µ-random walk on G. Let h : G → C be a µ-harmonic function such that E[|h(X_t)|²] < ∞. Then,

(E_z|h(X_1) − h(z)|)² ≤ 4 E_z[|h(X_t) − h(z)|²] · (H(X_t) − H(X_{t−1})).

Proof Since h is harmonic, we have

|h(X_1) − h(z)| = |E_z[h(X_t) | X_1] − E_z[h(X_t)]|   a.s.

Using Corollary 6.5.4 (with X being X_t, Y being X_1, and f(x) = h(x) − h(z)), we find that

E_z|h(X_1) − h(z)| ≤ 2 √(I(X_t, X_1)) · √(E_z[|h(X_t) − h(z)|²]).

Proposition 6.3.2 implies that I(X_t, X_1) = H(X_t) − H(X_{t−1}), which completes the proof.

The inequality in Theorem 6.5.5 provides a quantitative estimate on the growth of harmonic functions, quantifying one direction of the entropic criterion for the Liouville property.


Exercise 6.44 Let µ be an adapted probability measure on a finitely generated group G, with H(µ) < ∞. Let h : G → R be a real-valued harmonic function. Show that if for some x ∈ G,

lim inf_{t→∞} (1/t) Var_x[h(X_t)] · H(X_t) = 0,

then h is constant. B solution C

Exercise 6.45 Let µ be an adapted probability measure on a finitely generated group G, with H(µ) < ∞. Show that (G, µ) is not Liouville if and only if there exists a nonconstant real-valued harmonic function h : G → R such that sup_t Var[h(X_t)] < ∞. B solution C

6.6 Coupling and Liouville

Recall the definition of a coupling and of the total variation distance (see Appendix C).

Definition 6.6.1 Let G be a finitely generated group and µ an adapted probability measure on G. Fix some Cayley graph of G. For x, y ∈ G and r > 0, let D_r(x, y) be the total variation distance between the exit measures of B(1, r) started from x and from y; that is,

D_r(x, y) = ||P_x[X_{E_r} = ·] − P_y[X_{E_r} = ·]||_TV,

where E_r = inf{t : |X_t| > r}.

Theorem 6.6.2 Let G be a finitely generated group and µ an adapted probability measure on G. Then (G, µ) is Liouville if and only if D_r(x, y) → 0 as r → ∞ for all x, y. In fact, for any x, y ∈ G we have that D_r(x, y) → 0 as r → ∞ if and only if f(x) = f(y) for all f ∈ BHF(G, µ).

Proof Let x, y ∈ G, and for r > 0 let (X, Y) be a coupling of X_{E_r} started at x and started at y, respectively, such that P[X ≠ Y] = D_r(x, y). If f is a bounded harmonic function then (f(X_t))_t is a bounded martingale. The optional stopping theorem (Theorem 2.3.3) guarantees that f(x) =


E_x f(X_{E_r}) = E[f(X)] and f(y) = E_y f(X_{E_r}) = E[f(Y)]. Thus,

|f(x) − f(y)| = |E[(f(X) − f(Y)) 1_{X≠Y}]| ≤ 2||f||_∞ · P[X ≠ Y] = 2||f||_∞ D_r(x, y).

Hence, if D_r(x, y) → 0 then f(x) = f(y) for any bounded harmonic function f.

The other direction is slightly more involved. If for some x, y ∈ G we have D_r(x, y) ↛ 0, then let d := lim sup_{r→∞} D_r(x, y) > 0. Let (r_k)_k be a subsequence such that D_{r_k}(x, y) → d. For any r > 0, let A_r be a set such that

P_x[X_{E_r} ∈ A_r] − P_y[X_{E_r} ∈ A_r] = D_r(x, y).

For any r > 0 we may define f_r(z) = P_z[X_{E_r} ∈ A_r]. This function is harmonic in the ball of radius r around 1. Since (f_r)_r are uniformly bounded, there exists a subsequence (r′_n)_n = (r_{k_n})_n along which f_{r′_n} → f pointwise. It is immediate that f is a bounded harmonic function. Also,

|f(x) − f(y)| = lim_{n→∞} |f_{r′_n}(x) − f_{r′_n}(y)| = lim_{n→∞} D_{r_{k_n}}(x, y) = d > 0,

so f(x) ≠ f(y).
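For the simple random walk on Z, the exit measures appearing in Definition 6.6.1 are explicit: started from x inside the ball, the walk exits at ±(r + 1), and the gambler's-ruin formula gives the exit law. The sketch below (our notation, with the ball centred at 0) shows D_r(x, y) = |x − y|/(2r + 2) → 0, consistent with Theorem 6.6.2 and the fact that Z is Liouville.

```python
# Exit measures for the simple random walk on Z (which is Liouville).
# Started from x with |x| <= r, the walk exits B(0, r) at +(r+1) or -(r+1);
# the gambler's-ruin formula gives P_x[exit at r+1] = (x + r + 1) / (2r + 2).

def exit_up_probability(x, r):
    return (x + r + 1) / (2 * r + 2)

def D(x, y, r):
    # Total variation distance between the two exit measures (two-point laws).
    return abs(exit_up_probability(x, r) - exit_up_probability(y, r))

# D_r(x, y) = |x - y| / (2r + 2) -> 0 as r -> infinity, for fixed x, y.
vals = [D(0, 3, r) for r in (10, 100, 1000)]
assert vals[0] > vals[1] > vals[2]
assert abs(D(0, 3, 10) - 3 / 22) < 1e-12
```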

Exercise 6.46 Let S be a finite symmetric generating set for a group G. Let µ be a probability measure supported on S. Show that for any µ-harmonic function f on G, we have

|f(x) − f(y)| ≤ 2D_r(x, y) · max_{|z|=r+1} |f(z)|. B solution C

Remark 6.6.3 We see from the above proof that for fixed x, y ∈ G we have

lim_{r→∞} D_r(x, y) = 0 ⟺ ∀ f ∈ BHF(G, µ), f(x) = f(y).

Tointon (2016) proved that for fixed x, y ∈ G we have f(x) = f(y) for all µ-harmonic functions f ⟺ there exists r_0 such that D_r(x, y) = 0 for all r ≥ r_0. That is, convergence to 0 of D_r(x, y) is equivalent to the fact that bounded harmonic functions cannot differentiate between x and y. But the (stronger) property that the sequence (D_r(x, y))_r stabilizes at 0 is equivalent to the fact that no harmonic function whatsoever can differentiate between x and y.


6.7 Speed and Entropy

Recall that in Exercise 6.30 we showed that H(|X_t|) ≤ C E[log(|X_t| + 1)] + C for some universal constant C > 0.

Exercise 6.47 Show that for a probability measure µ on a finitely generated group G, the following limit always exists:

lim_{t→∞} t⁻¹ E|X_t|,

where (X_t)_t is the µ-random walk. We call this limit the speed of the random walk. B solution C

Exercise 6.48 Let µ be an adapted probability measure on a finitely generated group G. Show that if µ has speed 0, then (G, µ) is Liouville. B solution C

Proposition 6.7.1 (Entropy and speed) Let G be a finitely generated group and µ ∈ SA(G, ∞). Consider the µ-random walk (X_t)_t. Then there exists a constant C_µ > 0 such that

E|X_t|² / (C_µ t) ≤ H(X_t) ≤ C_µ · E|X_t|.

Proof We know that H(X_t) ≤ C E|X_t| for some constant C > 0, by Exercise 6.30, as in the previous exercise.

For the other inequality, we use the Varopoulos–Carne bound (Theorem 5.6.1). Since H(X_t) ≥ H(X_{t−1}) ≥ ··· ≥ H(X_1) > 0, we only need to consider those t for which E|X_t|² ≥ t; otherwise the inequality is trivial. Now, by the Varopoulos–Carne bound (Theorem 5.6.1), for some constant c = c_µ > 0, using m = c⁻¹(2t − 1),

H(X_t) = −Σ_x P[X_t = x] log P[X_t = x]
  ≥ Σ_{2≤|x|≤m+1} P[X_t = x] · c²|x|²/(8t) + Σ_{|x|>m+1} P[X_t = x] · (c/4)|x|²
  = (c²/8) · E|X_t|²/t + E[|X_t|² 1_{|X_t|>m+1}] · (c/4)(1 − c/(2t)) − P[|X_t| = 1] · c²/(8t).

Note that P[|X_t| = 1] ≤ #{x : |x| = 1} · P[X_t = 1] → 0 as t → ∞. Thus, we may choose C_µ appropriately so that the assertion holds.

Corollary 6.7.2 (Liouville and speed) Let µ ∈ SA(G, ∞) for a finitely generated group G. Then (G, µ) is Liouville if and only if µ has speed 0.


Proof If the speed is 0 then (G, µ) is Liouville, even without the exponential tail assumption on µ. If (G, µ) is Liouville and µ ∈ SA(G, ∞), then by Proposition 6.7.1 (which relied on the Varopoulos–Carne bound, Theorem 5.6.1),

lim_{t→∞} (E|X_t| / t)² ≤ lim sup_{t→∞} E|X_t|² / t² ≤ C_µ · lim sup_{t→∞} H(X_t) / t = C_µ · h(G, µ) = 0.

Example 6.7.3 Consider a symmetric and adapted probability measure µ on the group Z^d (with E||X_1||² < ∞, so that the computation below makes sense). Let (X_t)_t denote the µ-random walk. The symmetry of µ implies that

E[||X_{t+1}||² | F_t] = ||X_t||² + E||X_{t+1} − X_t||² + 2 E[⟨X_{t+1} − X_t, X_t⟩ | F_t] = ||X_t||² + E||X_1||²,

so that M_t = ||X_t||² − ct is a martingale for c = E||X_1||². (We have used that the increment X_{t+1} − X_t is independent of F_t.) This implies that E||X_t||² = ct for all t.

Note that the norm ||·|| is the Euclidean distance in R^d, but it may easily be compared to the Cayley graph distance (with the standard generators) in Z^d. Indeed, for any z = (z_1, ..., z_d) ∈ Z^d, we have that

|z| = Σ_{j=1}^d |z_j| ≤ √d · ||z||.

Thus,

lim_{t→∞} E|X_t| / t ≤ √d · lim sup_{t→∞} √(E||X_t||²) / t = 0.

So a symmetric random walk on Z^d always has 0 speed, and thus is always Liouville. We will see that this is part of a broader phenomenon, known as the Choquet–Deny theorem (Corollary 7.1.2).

Example 6.7.4 Consider the free group F_d on d ≥ 2 generators, with the standard Cayley graph; that is, the generating set is S = {a_1^{±1}, ..., a_d^{±1}}. Let (X_t)_t be the simple random walk on F_d. Note that for any x ≠ 1, multiplying by exactly one generator from S will bring us closer to 1, and all the rest will take us farther away. Thus,

P[|X_{t+1}| = |X_t| + 1 | |X_t| > 0] = (2d − 1)/(2d)

and

P[|X_{t+1}| = |X_t| − 1 | |X_t| > 0] = 1/(2d).

Of course, when X_t = 1, any generator will take us farther from the origin.


This implies that the process (|X_t|)_t is a Markov chain on N, with transition matrix P(n, n+1) = 1 − 1/(2d) = 1 − P(n, n−1) for n > 0, and P(0, 1) = 1. This Markov chain is precisely the weighted random walk induced by the network on N given by placing conductance c(n, n+1) = (2d − 1)^n.

Let us provide a lower bound on the speed of the above random walk:

E|X_{t+1}| = E[|X_{t+1}| 1_{|X_t|>0}] + E[|X_{t+1}| 1_{|X_t|=0}]
  = ((2d−1)/(2d)) · E[(|X_t| + 1) 1_{|X_t|>0}] + (1/(2d)) · E[(|X_t| − 1) 1_{|X_t|>0}] + P[X_t = 1]
  = E|X_t| + 1 − (1/d) P[X_t ≠ 1],

so that

E|X_t| ≥ E|X_{t−1}| + (d−1)/d ≥ ··· ≥ t · (d−1)/d.

Hence, the speed of the simple random walk on F_d is at least (d−1)/d, which is strictly positive, implying that this walk is not Liouville.
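The contrast between Examples 6.7.3 and 6.7.4 is visible in simulation. The sketch below (our code, with arbitrary horizon and sample-size parameters) estimates the speed of |X_t| for the simple random walk on F_2, via the birth-and-death chain just described, and of the simple random walk on Z².

```python
import random

random.seed(1)
T, N = 2000, 200  # time horizon and number of sample paths (arbitrary choices)

def fd_distance_walk(d, T):
    # The process |X_t| for simple random walk on the free group F_d, as above:
    # from n > 0 step to n+1 w.p. (2d-1)/(2d), to n-1 w.p. 1/(2d); from 0 go to 1.
    n = 0
    for _ in range(T):
        if n == 0 or random.random() < (2 * d - 1) / (2 * d):
            n += 1
        else:
            n -= 1
    return n

def z2_walk_distance(T):
    # Simple random walk on Z^2 with standard generators; word distance is |x| + |y|.
    x = y = 0
    for _ in range(T):
        dx, dy = random.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
        x += dx
        y += dy
    return abs(x) + abs(y)

speed_f2 = sum(fd_distance_walk(2, T) for _ in range(N)) / (N * T)
speed_z2 = sum(z2_walk_distance(T) for _ in range(N)) / (N * T)

# F_2 has speed (d-1)/d = 1/2; Z^2 has speed 0 (E|X_t| is only of order sqrt(t)).
assert 0.4 < speed_f2 < 0.6
assert speed_z2 < 0.1
```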

6.7.1 The Karlsson–Ledrappier Theorem

In the above we used the Varopoulos–Carne bound (Theorem 5.6.1), which required random walks with exponential tails (at least by our method of proof). However, the relationship between the Liouville property and the speed of the random walk is much broader. A fundamental result is the following theorem from Karlsson and Ledrappier (2007).

Theorem 6.7.5 (Karlsson–Ledrappier theorem) Let (X_t)_t be a µ-random walk for some adapted measure µ on a group G; µ does not necessarily have to be symmetric. Assume that µ has finite first moment, so E|X_1| < ∞, and assume that (G, µ) is Liouville; that is, every bounded µ-harmonic function is constant. Then there exists a group homomorphism ϕ : G → R such that

lim_{t→∞} (1/t) E|X_t| = E ϕ(X_1).

In particular, if µ is symmetric then E ϕ(X_1) = 0, so the speed is 0.

Proof To prove this theorem we will actually be dealing with the theory of horofunctions, although it is not necessary to be familiar with these objects. We have already encountered them in Exercise 1.97. Fix a finite symmetric generating set S for G. Recall the Lipschitz semi-norm:

||∇_S f||_∞ = sup_{s∈S} sup_{x∈G} |f(xs) − f(x)|.

Consider the set of functions L = { f : G → R : ||∇S f ||∞ ≤ 1, f (1) = 0}.


We topologize this set with pointwise convergence. Exercise 1.97 shows that with this topology, L is a compact topological space. G acts on L by x.f(y) = f(x⁻¹y) − f(x⁻¹). An important observation here is that a fixed point for this action is a homomorphism; that is, x.f = f for all x ∈ G if and only if f(xy) = f(x) + f(y) for all x, y ∈ G. The action is also continuous.

An important subset of L is given by the so-called Busemann functions: for x ∈ G we have b_x ∈ L defined by b_x(y) := |y⁻¹x| − |x|. Now, let (X_t)_t be the µ-random walk. Define

f_n(x) = (1/n) Σ_{k=0}^{n−1} E[b_{X_k}(x)].

As a convex combination of Busemann functions, f_n ∈ L. By compactness, there is a subsequence n_j → ∞ such that f_{n_j} → f for some f ∈ L. Set ψ(x) = f̌(x) := f(x⁻¹). Note that

Σ_y µ(y) E[|xyX_k|] = E[|xX_{k+1}|],

which gives us that

Σ_y µ(y) f_n((xy)⁻¹) = (1/n) Σ_{k=0}^{n−1} Σ_y µ(y) E[|xyX_k| − |X_k|]
  = (1/n) Σ_{k=0}^{n−1} E[|xX_{k+1}| − |X_{k+1}|] + (1/n) Σ_{k=0}^{n−1} E[|X_{k+1}| − |X_k|]
  = (1/n) Σ_{k=1}^{n} E[|xX_k| − |X_k|] + (1/n) E[|X_n|]
  = f_n(x⁻¹) + (1/n)(E|xX_n| − |x|).

Since |X_n| − |x| ≤ |xX_n| ≤ |X_n| + |x|, we have that

σ := lim_{n→∞} (1/n) E|xX_n| = lim_{n→∞} (1/n) E|X_n|.

Taking the limit along n_j → ∞, we get that

Σ_y µ(y) ψ(xy) = ψ(x) + σ.

Hence, for any fixed x ∈ G, the function

h_x(y) := ψ(xy) − ψ(y)


is a harmonic function. Note that h_x(y) = y.f(x⁻¹). Since ||∇_S y.f||_∞ = ||∇_S f||_∞ ≤ 1 for all y ∈ G, we get that |h_x(y)| ≤ |x| for all x, y ∈ G. That is, the h_x are all bounded harmonic functions. By the assumption that (G, µ) is Liouville, we have that y.f(x⁻¹) = h_x(y) = h_x(1) = f(x⁻¹) for all x, y ∈ G; that is, f is a fixed point for the G-action, implying that f and ψ are actually homomorphisms. Finally, as computed above, E ψ(X_1) = σ. If µ is symmetric this must be 0.

Corollary 6.7.6 Let G be a finitely generated group and µ ∈ SA(G, 1). Then (G, µ) is Liouville if and only if µ has 0 speed.

6.8 Amenability and Liouville

In this section we connect the notions of amenability and Liouville. In fact, the next theorem proves that any finite entropy (symmetric, adapted) random walk on a non-amenable group is non-Liouville.

Theorem 6.8.1 (Liouville implies amenable) Let G be a non-amenable finitely generated group. Then, for any symmetric, adapted probability measure µ on G with finite entropy H(µ) < ∞, we have that (G, µ) is not Liouville.

Proof It suffices to prove that the entropy grows linearly. Since G is non-amenable, by Kesten's amenability criterion (Theorem 5.2.4), we may choose ρ < 1, C > 0 such that P_x[X_t = y] ≤ Cρ^t for all x, y ∈ G and t > 0. So

H(X_t) = −Σ_y P_t(x, y) log P_t(x, y) ≥ −Σ_y P_t(x, y) log(Cρ^t) = −log C + t · log(1/ρ).

Thus H(X_t)/t → h(G, µ) ≥ −log ρ > 0, so (G, µ) is not Liouville by the entropic criterion.

Exercise 6.49 Recall Exercise 5.35, which shows that random walks on non-amenable groups have positive speed. Give an alternative proof of this fact using the Liouville property. That is, show that if G is a non-amenable finitely generated group, then any µ ∈ SA(G, 1) must have positive speed.


Exercise 6.50 Let G be a finitely generated group such that G has subexponential growth; that is, (1/r) log |B(1, r)| → 0 as r → ∞. Show that (G, µ) is Liouville for any adapted µ with finite first moment. Conclude that G is amenable. B solution C

6.9 Lamplighter Groups

In this section we review a useful class of examples, known as lamplighter groups. These will typically be examples of amenable groups that are non-Liouville.

First, let us recall the notion of a semi-direct product from Section 1.5.9. Let G, H be groups and suppose that G acts on H by group automorphisms. That is, every element g ∈ G can be thought of as an automorphism of H, and we denote by g.h ∈ H the image of h ∈ H under the automorphism g ∈ G. Define the group G ⋉ H to be the set {(g, h) : g ∈ G, h ∈ H}, with the following product:

(g, h)(g′, h′) := (g · g′, h · g.h′).

In Exercise 1.68 we saw that this is indeed a group, that the identity element of G ⋉ H is (1_G, 1_H), and that inverses are given by (g, h)⁻¹ = (g⁻¹, g⁻¹.h⁻¹). Some examples can be found in Exercises 1.70, 1.71, and 1.72.

The example we wish to consider here is the lamplighter group over a group G. Let G be a finitely generated group. Consider the group

Σ(G) = ⊕_G {0, 1} = {σ : G → {0, 1} : |supp(σ)| < ∞}.

Equipped with pointwise addition modulo 2, this is an Abelian group (when G is infinite it is not finitely generated), and G acts on these functions naturally by translation: x.σ(y) = σ(x⁻¹y). The lamplighter group over G is the group

L(G) := G ⋉ Σ(G).

That is, elements of L(G) are pairs (x, σ) for x ∈ G and σ : G → {0, 1}. Multiplication is given by (x, σ)(y, τ) = (xy, σ + x.τ). Inverse elements are given by (x, σ)⁻¹ = (x⁻¹, x⁻¹.σ). We have already encountered the group L(Z) in Example 5.5.3.

Exercise 6.51 Consider the group L(G). Compute the conjugation (x, σ)^{(y,τ)}. What is (1, σ)^{(y,τ)}? Show that Σ(G) ≅ {1} × Σ(G) is a normal subgroup of L(G). What is L(G)/({1} × Σ(G)) isomorphic to?


Show that if G = ⟨S⟩ then L(G) is generated by {(s, 0), (1, δ_1) : s ∈ S}, where δ_x ∈ Σ(G) is given by δ_x(y) = 1_{x=y}. Show that the commutator subgroup of L(G) satisfies [L(G), L(G)] ⊂ [G, G] × Σ(G) (as sets). B solution C

Let us interpret this group in a probabilistic way. We think of G as a "street" on which "lamps" are placed at each site x ∈ G. The lamps can be on or off, indicated by 1 and 0, respectively, so the configuration of lamps is σ. A "lamplighter" walks on the street G and can switch the states of the lamps. So (x, σ) indicates the position x of the lamplighter and the state σ of all the lamps. Multiplying (x, σ) on the right by an element (s, 0) moves the lamplighter on the street G by s. Multiplying (x, σ) on the right by (1, δ_z) flips the state of the lamp at xz. So under the generators (s, 0), s ∈ S, and (1, δ_1), we have a lamplighter moving around on G and flipping 0–1 lamps on G.

Exercise 6.52 Show that L(G) has exponential growth; that is, show that there exists c > 0 such that |B(1, r)| ≥ e^{cr} for all r > 0. Conclude that L(G) is transient. B solution C
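The group law of L(G) is easy to experiment with. The following sketch (ours, not the book's) realizes L(Z), encoding a lamp configuration σ as the finite set of lamps that are on; addition of configurations modulo 2 is then symmetric difference of sets, and the action of x ∈ Z is translation.

```python
# The group law of L(Z) = Z ⋉ Σ(Z), following the formulas above. A lamp
# configuration σ is encoded as the finite set of lamps that are on.

def mul(a, b):
    (x, sigma), (y, tau) = a, b
    return (x + y, sigma ^ {x + z for z in tau})  # (x,σ)(y,τ) = (xy, σ + x.τ)

def inv(a):
    x, sigma = a
    return (-x, {-x + z for z in sigma})          # (x,σ)^{-1} = (x^{-1}, x^{-1}.σ)

move = (1, frozenset())       # the generator (s, 0): step right on the street
flip = (0, frozenset({0}))    # the generator (1, δ_1): flip the lamp at the walker

g = mul(mul(move, flip), move)   # step right, flip the lamp there, step right
assert g == (2, {1})             # lamplighter at 2, the lamp at 1 is on
assert mul(g, inv(g)) == (0, set())
assert mul(inv(g), g) == (0, set())
```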

In Exercise 6.63 we will precisely compute the entropy of certain random walks on L(G). The more general phenomenon, not requiring an exact computation, is that when the “lamplighter” is walking recurrently on the “street” G, the random walk on L(G) is Liouville. This is made precise in Theorem 6.9.1.


Exercise 6.53 Let G be a finitely generated group and µ an adapted probability measure on G. Show that the following are equivalent:

(1) The µ-random walk (X_t)_t is transient.
(2) For every radius r > 0,
  lim sup_{|x|→∞} P_x[∀ t ≥ 0, |X_t| > r] = 1.
(3) There exists a radius r > 0 such that
  lim sup_{|x|→∞} P_x[∀ t ≥ 0, |X_t| > r] > 0.

B solution C

Theorem 6.9.1 Let G be a finitely generated group. Let ν be an adapted measure on L(G). Assume that

r := sup{|y| : σ(y) = 1, (x, σ) ∈ supp(ν)} < ∞.

(That is, the ν-random walk cannot change lamps at distance more than r from the location of the lamplighter.) Define the measure µ on G by µ(x) = Σ_{σ∈Σ(G)} ν(x, σ). Then (L(G), ν) is Liouville if and only if the µ-random walk on G is recurrent.

Proof Let ((X_t, σ_t))_t be the ν-random walk. Note that (X_t)_t is a µ-random walk on G. For (x, σ) ∈ L(G) define

h(x, σ) = P_{(x,σ)}[lim sup_{t→∞} σ_t(1) = 0].

That is, h(x, σ) is the probability, starting at (x, σ), that the lamp at 1 is eventually off. Note that, by our assumption, if |X_t| > r then σ_{t+1}(1) = σ_t(1). Consider the event

A = {∀ t ≥ 0, |X_t| > r};

that is, the walk never enters the ball of radius r (so on A the walk cannot change the lamp at 1). By our assumption, A ⊂ {∀ t ≥ 0, σ_t(1) = σ_0(1)}.

If the µ-random walk is transient, then by Exercise 6.53 there exists x ∈ G such that P_{(x,σ)}[A] ≥ 3/4 for all σ ∈ Σ(G). However, in that case, h(x, 0) ≥ P_{(x,0)}[A] ≥ 3/4 and h(x, δ_1) ≤ P_{(x,δ_1)}[A^c] ≤ 1/4. So h is a nonconstant, bounded harmonic function.

Now, for the case that the µ-random walk on G is recurrent: let f : L(G) → C be a bounded harmonic function. Fix some σ ∈ Σ(G). Since ν is adapted, there


exists k ≥ 0 and ε > 0 such that P[(X_k, σ_k) = (1, σ)] = ε. By the Markov property we conclude that for all t,

P[(X_{t+k}, σ_{t+k}) = (X_t, σ_t)(1, σ) | F_t] = ε   a.s.

Since (X_t)_t is a µ-random walk on G, it is recurrent. Define T_0 = 0 and inductively T_{n+1} = inf{t > T_n + k : X_t = 1}. By recurrence, all T_n are finite stopping times a.s. Also, the strong Markov property implies that for all n,

P[(X_{T_n+k}, σ_{T_n+k}) = (1, σ_{T_n} + σ) | F_{T_n}] = ε   a.s.

Thus, the set Γ = {n : (X_{T_n+k}, σ_{T_n+k}) = (1, σ_{T_n} + σ)} is a.s. infinite. Specifically, since X_{T_n} = 1 a.s.,

lim inf_{t→∞} |f(X_{t+k}, σ_{t+k}) − f(X_t, σ_t + σ)| = 0   a.s.

However, since (f(X_{t+k}, σ_{t+k}) − f(X_t, σ_t + σ))_t is a bounded martingale, it must converge a.s. and in L¹ by the martingale convergence theorem (Theorem 2.6.3 and Exercise 2.33). Specifically, for any x ∈ G and any σ ∈ Σ(G), we have

|f(x, 0) − f(x, σ)| = |f(x, 0) − (1, σ).f(x, 0)| ≤ E_{(x,0)}[|f(X_{t+k}, σ_{t+k}) − (1, σ).f(X_t, σ_t)|] → 0,

implying that f(x, 0) = f(x, σ) for all x ∈ G and σ ∈ Σ(G). This immediately leads to the fact that h(x) := f(x, 0) is a µ-harmonic function on G. The assumption that µ is recurrent implies that (G, µ) is Liouville, by Exercise 6.21. So h must be constant. Thus, f(x, σ) = h(x) = h(1) = f(1, 0) for all (x, σ) ∈ L(G).

Exercise 6.54 Show that if G is a finitely generated amenable group then L(G) is also amenable. Conclude that L(Z³) is an amenable group that is non-Liouville for any finitely supported, adapted random walk. B solution C

∗6.10 An Example: Infinite Permutation Group S∞∗

In light of Conjecture 2.7.2, we may wish to point out that something similar does not hold for non-finitely generated groups, as can be seen by the following example.


Let S∞∗ denote the group of all finitely supported permutations of a countably infinite set Ω. Specifically, if σ : Ω → Ω, let supp(σ) = {ω ∈ Ω : σ(ω) ≠ ω} and define

S∞∗ = {σ : Ω → Ω : |supp(σ)| < ∞ and σ is a bijection}.

(Since all countably infinite sets are in bijection with one another, the precise set Ω is not important, as all versions of S∞∗ are isomorphic.)

Exercise 6.55 Show that S∞∗ is not finitely generated.
Hint: show that S∞∗ is locally finite; that is, any finitely generated subgroup of S∞∗ is finite. B solution C

Proposition 6.10.1 There exist adapted symmetric probability measures µ, ν on S∞∗ such that h(S∞∗, µ) = 0 and h(S∞∗, ν) > 0.

The specific constructions of µ, ν as above are given in the following two examples.

Example 6.10.2 This is an example of a random walk on S∞∗ with 0 entropy. Identify Ω with N and consider the transpositions π_n = (n n+1). Let µ be the measure µ(π_n) = 2^{−n−1} for all n ≥ 0. Since π_n = π_n⁻¹, µ is symmetric. Also, (π_n)_n generate S∞∗, so µ is adapted.

Let (X_t)_t be the µ-random walk. Let U_{t+1} = X_t⁻¹ X_{t+1} be the jump at time t+1. So U_t ∈ {π_n : n ∈ N}. Set |U_t| = n for the n such that U_t = π_n. Let

J_t = max{|U_k| : 1 ≤ k ≤ t}

be the "maximal jump" up to time t. Let us bound the entropy of X_t. Note that if J_t ≤ r then supp(X_t) ⊂ {0, 1, ..., r+1}. On this event, the number of possibilities for X_t is at most (r+2)! ≤ (r+2)^{r+2}. Thus,

H(X_t) ≤ H(X_t | J_t) + H(J_t) ≤ E[(J_t + 2) log(J_t + 2)] + H(J_t).

Now, P[J_t ≥ r] ≤ P[∃ 1 ≤ k ≤ t, |U_k| ≥ r] ≤ t 2⁻ʳ. It is a simple computation that

E[(J_t + 2) log(J_t + 2)] ≤ C log t · log log t,

for a large enough constant C > 0. Moreover, since p ↦ −p log p is maximized at p = e⁻¹ on (0, 1), increasing on (0, e⁻¹), and decreasing on (e⁻¹, 1), we obtain


H(J_t) = −Σ_{r=0}^∞ P[J_t = r] log P[J_t = r]
  ≤ C log t · (1/e) − Σ_{r>C log t} (t 2⁻ʳ) log(t 2⁻ʳ)
  ≤ C log t · (1/e) + O(t 2^{−C log t}) = O(log t).

Thus, altogether, for some constant C > 0, we have H(X_t) ≤ C log t · log log t. Hence h(S∞∗, µ) = 0.

Example 6.10.3 This is an example of a positive-entropy random walk on S∞∗. For every n ≥ 1 define

Ω_n^+ := {ω = (ω_1, ..., ω_n) ∈ {−1, 1}^n : ω_n = +1},
Ω_n^− := {ω = (ω_1, ..., ω_n) ∈ {−1, 1}^n : ω_n = −1},
Ω_n := Ω_n^+ ⊎ Ω_n^−,
Ω := ⊎_n Ω_n.

Consider S∞∗ as permutations of Ω. For each n ≥ 1 define

τ_n^±(ω) = (ω_1, ..., ω_n)       if ω = (ω_1, ..., ω_n, ±1) ∈ Ω_{n+1}^±,
τ_n^±(ω) = (ω_1, ..., ω_n, ±1)   if ω = (ω_1, ..., ω_n) ∈ Ω_n,
τ_n^±(ω) = ω                     otherwise.

Note that (τ_n^±)² = 1. Also, let ρ be the permutation

ρ(ω) = −ω   if ω ∈ Ω_1,
ρ(ω) = ω    otherwise.

Note also that ρ² = 1. Finally, define µ by setting

µ(ρ) = 1 − α,    µ(τ_n^±) = ((A − 1)/(2Aⁿ)) · α,

for some fixed 1 < A < 2 and α ∈ (0, 1) arbitrary. So µ is symmetric and adapted. Also, A has been chosen so that for all n > 1,

(µ(τ_n^+) + µ(τ_n^−)) / (µ(τ_n^+) + µ(τ_n^−) + µ(τ_{n−1}^+)) = (µ(τ_n^+) + µ(τ_n^−)) / (µ(τ_n^+) + µ(τ_n^−) + µ(τ_{n−1}^−)) = 1/(1 + A/2) > 1/2.

Set o = (+1) ∈ Ω_1^+. For σ ∈ S∞∗, if σ⁻¹(o) = (ω_1, ..., ω_n), define ||σ|| = n and sgn(σ) = ω_1.


Let (σ_t)_t be the µ-random walk on S∞∗, and consider the processes X_t = ||σ_t|| and Y_t = sgn(σ_t). It is an exercise (Exercise 6.56) to show that (X_t, Y_t)_t forms a Markov chain with the properties:

• P[Y_{t+1} ≠ Y_t | X_t > 1] = 0,
• P[X_{t+1} = X_t + 1 | X_{t+1} ≠ X_t, X_t > 1] ≥ 1/(1 + A/2) > 1/2.

Let τ = inf{t : X_t = 1}. Comparing with a random walk on a rooted binary tree (see Exercise 6.57), we may then see that for any n > 1,

P[τ = ∞ | X_0 = n > 1] ≥ P[∀ t, X_t ≥ X_0 | X_0 = n > 1]
  ≥ 2 − 1 / inf_t P[X_{t+1} = X_t + 1 | X_{t+1} ≠ X_t, X_t > 1]
  ≥ 2 − (1 + A/2) = 1 − A/2.

Since (Y_t)_t can only change sign when X_t = 1, we conclude that, starting at any X_0 = n > 1, with probability at least 1 − A/2 the process (X_t, Y_t)_t never changes the sign of Y_t. That is,

P[∀ t, Y_t = Y_0 | X_0 = n > 1] ≥ 1 − A/2.

Thus, the event

E = {∃ t_0, ∀ t > t_0, Y_t = 1}

is an invariant event with probability not in {0, 1}. Hence the invariant σ-algebra is nontrivial, and hence the tail σ-algebra as well. This can be used in Corollary 6.5.4 (as in the proof of Theorem 6.5.5) to show that the entropy h(S∞∗, µ) is positive.

Exercise 6.56 Show that (X_t, Y_t)_t defined above form a Markov chain on {1, 2, ...} × {−1, 1} with transition probabilities given by:

P((1, y), (1, −y)) = µ(ρ) = 1 − α,
P((1, y), (2, y)) = µ(τ_1^+) + µ(τ_1^−) = ((A−1)/A) α,
P((1, y), (1, y)) = α/A,
P((n, y), (n+1, y)) = µ(τ_n^+) + µ(τ_n^−) = ((A−1)/Aⁿ) α,
P((n, y), (n−1, y)) = µ(τ_{n−1}^+) = µ(τ_{n−1}^−) = ((A−1)/(2A^{n−1})) α,
P((n, y), (n, y)) = 1 − α · (A−1)(A+2)/(2Aⁿ),

for all n > 1 and y ∈ {−1, 1}.
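The rows in Exercise 6.56 can be sanity-checked numerically. The sketch below (ours; A and α are arbitrary values in the allowed ranges) verifies that each row sums to 1 and that the conditional probability of stepping up, given a move, equals 1/(1 + A/2) > 1/2.

```python
# Sanity check of the transition rows in Exercise 6.56, for arbitrary admissible
# parameters 1 < A < 2 and alpha in (0, 1).
A, alpha = 1.5, 0.3

def up(n):    # P((n, y), (n+1, y))
    return (A - 1) / A**n * alpha

def down(n):  # P((n, y), (n-1, y)), for n > 1
    return (A - 1) / (2 * A**(n - 1)) * alpha

for n in range(2, 12):
    hold = 1 - alpha * (A - 1) * (A + 2) / (2 * A**n)
    assert abs(up(n) + down(n) + hold - 1) < 1e-12       # each row sums to 1
    ratio = up(n) / (up(n) + down(n))                    # P[step up | some move]
    assert abs(ratio - 1 / (1 + A / 2)) < 1e-12 and ratio > 0.5

# The n = 1 row: flip the sign, move up, or stay put.
row1 = (1 - alpha) + (A - 1) / A * alpha + alpha / A
assert abs(row1 - 1) < 1e-12
```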


Show that P[Y_{t+1} ≠ Y_t | X_t > 1] = 0, and that

P[X_{t+1} = X_t + 1 | X_{t+1} ≠ X_t, X_t > 1] ≥ 1/(1 + A/2) > 1/2.

Exercise 6.57 Let T be the rooted binary tree; that is, T = {(ω_1, ..., ω_n) ∈ Ω : ω_1 = 1}, for Ω as in the previous example. For x = (ω_1, ..., ω_n), y = (ω_1, ..., ω_n, ω_{n+1}) we write ŷ = x. We also write x ± 1 = (ω_1, ..., ω_n, ±1). Suppose that P is a transition matrix on T such that P(x, y) > 0 only if ŷ = x or x̂ = y. Also, assume that

P(x, x+1) + P(x, x−1) ≥ α,

for some α > 1/2 independent of x. Show that this is a transient Markov chain. Prove that for any x ≠ (1), that is, x is not the root,

P_x[∀ t, X_t ≠ x̂] ≥ 2 − 1/p ≥ 2 − 1/α,

where p = inf_y (P(y, y+1) + P(y, y−1)) ≥ α. (Hint: couple the walk on the tree with a walk on N.) B solution C
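The hint of Exercise 6.57 reduces to a walk on N that steps up with probability p > 1/2; started from 1, gambler's ruin gives survival probability 1 − (1 − p)/p = 2 − 1/p, matching the bound in the exercise. A Monte Carlo sketch (ours, truncating at a finite horizon):

```python
import random

# A walk on N stepping up with probability p > 1/2 and down otherwise, started
# at 1, avoids 0 forever with probability 1 - (1-p)/p = 2 - 1/p (gambler's ruin).
random.seed(7)
p, T, N = 0.75, 1500, 2000   # bias, truncation horizon, number of trials

def survives():
    n = 1
    for _ in range(T):
        n += 1 if random.random() < p else -1
        if n == 0:
            return False
    return True

estimate = sum(survives() for _ in range(N)) / N
exact = 2 - 1 / p   # = 2/3 for p = 3/4
assert abs(estimate - exact) < 0.05
```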

Exercise 6.58 Define an explicit bounded nonconstant harmonic function for the non-Liouville random walk µ defined in Example 6.10.3. B solution C

6.11 Additional Exercises

Let G be a finitely generated group. Recall the lamplighter group L(G) from Section 6.9. In Exercises 6.59–6.63, we will precisely compute the entropy of certain random walks on L(G). We start with a generalization of Exercise 5.28.

Exercise 6.59 Let G be a finitely generated group, and let µ be a symmetric, adapted measure on G. Assume that µ(1) = 0. Fix ε, p ∈ (0, 1). Let ν be the measure on L(G) given by ν(s, 0) = ε µ(s), ν(1, δ_1) = (1/2)(1 − ε), and ν(1, 0) = (1/2)(1 − ε). Let ((X_t, σ_t))_t be the ν-random walk on L(G). For t ≥ 1 set

Q_t = {x ∈ G : ∃ 0 ≤ k ≤ t−1, X_{k+1} = X_k = x}.

Show that for any t > 0, the conditional distribution of (σ_t(x))_{x∈Q_t} conditioned on (X_n)_n is that of independent Bernoulli-1/2 random variables. That is, for any ξ : Q_t → {0, 1},


P[σ_t(x) = ξ(x) for all x ∈ Q_t | (X_n)_n] = 2^{−|Q_t|}   a.s.

B solution C

Exercise 6.60 Let G be a finitely generated group and µ a symmetric adapted measure on G. Assume that µ(1) > 0. Let (X_t)_t be the µ-random walk and set Q_t = {x ∈ G : ∃ 0 ≤ k ≤ t−1, X_{k+1} = X_k = x}. Show that the limit ℓ := lim_{t→∞} (1/t) E|Q_t| exists. (Hint: recall Exercise 4.45.) Show that

µ(1) p_esc ≤ ℓ ≤ p_esc := P[∀ t > 0, X_t ≠ 1].

Conclude that ℓ = 0 if and only if (G, µ) is recurrent. B solution C

Exercise 6.61 Let G be a finitely generated group and µ a symmetric adapted measure on G. Assume that µ(1) > 0. Let (X_t)_t be the µ-random walk and set Q_t = {x ∈ G : ∃ 0 ≤ k ≤ t−1, X_{k+1} = X_k = x}. Show that

ℓ := lim_{t→∞} (1/t) E|Q_t| = p_esc µ(1) / (µ(1) + p_esc). B solution C

Exercise 6.62 Let G be a finitely generated group and µ a symmetric adapted measure on G, with finite entropy. Let ν = (1 − ε)δ1 + ε µ be a lazy version of µ. Show that

h(G, ν) = ε · h(G, µ).

B solution C

Exercise 6.63 Let G be a finitely generated group, and let µ be a symmetric, adapted measure on G, with finite entropy. Assume that µ(1) = 0. Fix ε, p ∈ (0, 1). Let ν be the measure on L(G) given by ν(s, 0) = εµ(s), ν(1, δ_1) = (1/2)(1 − ε), and ν(1, 0) = (1/2)(1 − ε). Compute the random walk entropy h(L(G), ν), and show that

h(L(G), ν) = log 2 · (1 − ε) p_esc / (1 − ε + p_esc) + ε h(G, µ),

where p_esc = P[∀ t ≥ 1, X_t ≠ 1] and ((X_t, σ_t))_t is the ν-random walk.

B solution C

Exercise 6.64 Show that there exists a finitely supported adapted but nonsymmetric measure µ on Z such that (Z, µ) is transient. Conclude that there exists a finitely supported adapted but nonsymmetric measure ν on L(Z) such that (L(Z), ν) is non-Liouville. B solution C


Exercise 6.65 Let ε > 0. Show that there exists ν ∈ SA(L(Z²), 2 − ε) such that (L(Z²), ν) is non-Liouville. B solution C

For the following exercises, let G be a finitely generated group, and µ an adapted probability measure on G. Let (X_t)_t denote the µ-random walk. Define

A(G, µ) = {f ∈ ℓ^∞(G) : (f(X_t))_t converges a.s.}.

Exercise 6.66 Show that BHF(G, µ) ⊂ A(G, µ).

B solution C

Exercise 6.67 Show that A(G, µ) is an algebra with pointwise addition and multiplication of functions (i.e. A(G, µ) is a subalgebra of ℓ^∞(G)). B solution C

Exercise 6.68 Define

I(G, µ) = {f ∈ A(G, µ) : f(X_t) → 0 a.s.}.

Show that I(G, µ) is an ideal. B solution C

Exercise 6.69 Show that for any f ∈ A(G, µ) the random variable L_f := lim sup_{t→∞} f(X_t) is measurable with respect to the invariant σ-algebra I. Define h_f(x) = E_x[L_f]. Show that h_f ∈ BHF(G, µ). Show that f − h_f ∈ I(G, µ). Conclude that A(G, µ) = BHF(G, µ) ⊕ I(G, µ). B solution C

6.12 Remarks

In this chapter we have only just glimpsed the rich world of bounded harmonic functions. A deep notion that we have not mentioned at all is that of the Poisson boundary. Given an adapted random walk µ on a group G, the Poisson boundary is a measure space with a G-action that is maximal with respect to certain equivalent properties. This deserves a treatment of its own, and we refer the reader to Lyons and Peres (2016, chapter 14) and references therein. The construction and study of the Poisson boundary was initiated in Furstenberg (1963, 1971). A brief description of the construction from the latter paper is as follows. We start with the von Neumann algebra ℓ^∞(G). Let µ be some adapted probability measure on G. Recall A(G, µ) = BHF(G, µ) ⊕ I(G, µ) from Exercise 6.69. Since I(G, µ) is an ideal (Exercise 6.68), we have that BHF(G, µ)


≅ A(G, µ)/I(G, µ) as commutative unital von Neumann algebras. A classical result by von Neumann states that BHF(G, µ) ≅ L^∞(Π, Σ, ν) for some probability space (Π, Σ, ν). It can be shown that the G-action on BHF(G, µ) induces an action on (Π, Σ, ν). It turns out that the resulting action on ν, given by gν(A) = ν(g^{−1}A), is µ-stationary; that is, Σ_g µ(g) · gν = ν. Furstenberg called the space (Π, Σ, ν) the Poisson boundary, since it is the space on which we may specify boundary values (a bounded function) to obtain some bounded harmonic function on the group. Specifically, the so-called Furstenberg transform is the map L^∞(Π, Σ, ν) ∋ f ↦ h_f ∈ BHF(G, µ) given by

h_f(g) = ∫_Π f(gξ) dν(ξ).

This defines a µ-harmonic function because ν is µ-stationary. Blackwell (1955) first noted the Poisson formula (Exercise 6.17 and Proposition 6.2.2), relating the invariant σ-algebra to bounded harmonic functions. Avez (1972, 1976) proved that zero random walk entropy implies the Liouville property (formally, for finitely supported random walks). Vershik and Kaimanovich (1979) and Kaimanovich and Vershik (1983) proved the equivalence of zero entropy and triviality of the tail σ-algebra, Proposition 6.3.4, and the proof presented is basically the one from Kaimanovich and Vershik (1983). The equivalence of triviality of the tail and invariant σ-algebras, Proposition 6.4.2, was proven independently by Derriennic et al. (1980), Vershik and Kaimanovich (1979), and Kaimanovich and Vershik (1983), each providing proofs of Theorem 6.4.3. Theorem 6.8.1, stating that finite entropy symmetric adapted random walks on non-amenable groups are always non-Liouville, was proved by Furstenberg (1973). In fact, it is shown in Furstenberg (1973) that any adapted random walk on a non-amenable group is non-Liouville. The characterization of such "strongly non-Liouville" groups as the non-amenable groups was shown by Rosenblatt (1981) and Kaimanovich and Vershik (1983). They prove that any amenable group admits some Liouville random walk. Chapter 7 deals with the polar case: for which groups are all adapted random walks Liouville? This is known as the Choquet–Deny property. Versions of the inequality from Theorem 6.5.5 appear independently in Benjamini et al. (2015, 2017), Erschler and Karlsson (2010), and Ozawa (2018). We have provided the exposition from Benjamini et al. (2017). Regarding the Liouville property, we have already mentioned Conjecture 2.7.2, which may be restated as follows:


Conjecture 6.12.1 (Restatement of Conjecture 2.7.2) Let G be a finitely generated group. Then, for any two µ, ν ∈ SA (G, 2), we have that (G, µ) is Liouville if and only if (G, ν) is Liouville.

That is, the conjecture states that the Liouville property is invariant among symmetric random walks with finite second moment. For finitely supported, symmetric, and adapted µ, ν the above was conjectured by Kaimanovich and Vershik (1983). Note that this conjecture is "tight" in the sense that the conditions of symmetry and second moment are necessary, by Exercises 6.64 and 6.65. The examples on S*_∞ in Section 6.10 are from Kaimanovich and Vershik (1983).

6.13 Solutions to Exercises

Solution to Exercise 6.1 :( We have that G^ℕ is obviously invariant. Since θ^{−1}(A^c) = θ^{−1}(A)^c and θ^{−1}(∪_n A_n) = ∪_n θ^{−1}(A_n), we have that the invariant events form a σ-algebra. :) X

Solution to Exercise 6.3 :( In Section 1.2 we saw that σ(X_t, X_{t+1}, . . .) = θ^{−t}F. We have that σ(X_t, X_{t+1}, . . .) is generated by the sets {X_{t+j}^{−1}(g) : j ≥ 0, g ∈ G}. For any j ≥ 0, g ∈ G, we have that X_{t+j}^{−1}(g) respects ∼_t. Indeed, if ω ∼_t ω′ and ω ∈ X_{t+j}^{−1}(g), then ω_{t+j} = g, implying that ω′_{t+j} = ω_{t+j} = g (because ω ∼_t ω′), which in turn implies that ω′ ∈ X_{t+j}^{−1}(g) as well. This shows that any event in σ(X_t, X_{t+1}, . . .) respects ∼_t. For the other direction, assume that A respects ∼_t. Note that

θ^{−t}(θ^t(A)) = {ω : θ^t(ω) ∈ θ^t(A)} = {ω : ∃ ω′ ∈ A, θ^t(ω) = θ^t(ω′)} = A.

Since θ^t(A) ∈ F, we get that A = θ^{−t}(θ^t(A)) ∈ θ^{−t}F = σ(X_t, X_{t+1}, . . .). :) X

Solution to Exercise 6.4 :( Define a relation ω ≈_t ω′ if and only if θ^t(ω) = θ^t(ω′). If A respects ≈, then for any t, A respects ≈_t. So if A respects ≈ then A ∈ σ(X_t, X_{t+1}, . . .) for all t, implying that A ∈ T. Conversely, if A ∈ T, then for any t, it must be that A respects ≈_t. Now, if ω ≈ ω′ and ω ∈ A, then for some t we have ω ≈_t ω′, which implies that ω′ ∈ A as well. This shows that any tail event A also respects ≈. Now, assume that A is an invariant event. Note that if ω ∈ A = θ^{−1}(A) then θ(ω) ∈ A. Iterating this, if ω ∼ ω′ and ω ∈ A, then taking k, n so that θ^k(ω) = θ^n(ω′), we have that ω′ ∈ θ^{−n}(θ^n(ω′)) = θ^{−n}(θ^k(ω)) ⊂ θ^{−n}(A) = A. So any invariant event A respects ∼. Finally, assume that A respects ∼. Since ω ∼ θ(ω) by definition, it must be that ω ∈ A if and only if θ(ω) ∈ A. This shows that A = θ^{−1}(A). :) X

Solution to Exercise 6.6 :( That A ∈ T implies that A respects ≈ was shown in Exercise 6.4.


If ω ≈ ω′ and ω ∈ θ^{−1}(A), then θ(ω) ∈ A and θ(ω) ≈ θ(ω′), so θ(ω′) ∈ A, implying that ω′ ∈ θ^{−1}(A) as well. Hence θ^{−1}(A) respects ≈, so θ^{−1}(A) ∈ T. :) X

Solution to Exercise 6.7 :( Let B = A\θ^{−1}(A). Note that θ(B) ⊂ A^c, so θ(B) ∩ B = ∅. Similarly, if B′ = θ^{−1}(A)\A, then θ(B′) ∩ B′ = ∅. If B = B′ = ∅, then A = θ^{−1}(A), so A ∈ I. :) X

Solution to Exercise 6.8 :( Just note that for any Borel subset B ⊂ C, we have (Y ∘ θ)^{−1}(B) = θ^{−1}(Y^{−1}(B)). :) X

Solution to Exercise 6.9 :( Assume Y is measurable with respect to σ(X_n, X_{n+1}, . . .) = θ^{−n}F. For any Borel subset B ⊂ C, there exists A = A_B ∈ F such that Y^{−1}(B) = θ^{−n}(A). If B = {c} then this implies that Y(ω) = c if and only if θ^n(ω) ∈ A_{{c}}. So if θ^n(ω) = θ^n(ω′) then Y(ω) = Y(ω′). Now assume that for any ω, ω′ such that θ^n(ω) = θ^n(ω′), we have Y(ω) = Y(ω′). By this assumption, Y(ω) ∈ B if and only if θ^{−n}(θ^n(ω)) ⊂ Y^{−1}(B). For η ∈ θ^{−n}(ω), define Z(ω) = Y(η), which is well defined by the assumption. Here, Z is a random variable, since for any Borel subset B ⊂ C we have

Z^{−1}(B) = {ω : ∃ η ∈ Y^{−1}(B), ω = θ^n(η)} = θ^n(Y^{−1}(B)) ∈ F.

Note that we have Y = Z ∘ θ^n for this random variable Z. Now, if Y = Z ∘ θ^n for some random variable Z, then for any Borel subset B ⊂ C, we get Y^{−1}(B) = θ^{−n}(Z^{−1}(B)) ∈ θ^{−n}F, implying that Y is measurable with respect to θ^{−n}F = σ(X_n, X_{n+1}, . . .). :) X

Solution to Exercise 6.14 :( We have that (h(X_t))_t is a bounded martingale, so it converges to an integrable random variable a.s. by the martingale convergence theorem (Theorem 2.6.3). So we just define L = lim sup_{t→∞} h(X_t). To see that L is I-measurable, note that

L ∘ θ(ω) = lim sup_{t→∞} h(ω_{t+1}) = L(ω).

So L is I-measurable by Exercise 6.8. :) X

Solution to Exercise 6.15 :( As before, the limit exists because (h(X_t, t))_t is a bounded martingale, and we can take L = lim sup_{t→∞} h(X_t, t) (by the martingale convergence theorem, Theorem 2.6.3). To see that L is T-measurable, let n ∈ N and assume that θ^n(ω) = θ^n(ω′). Then ω_t = ω′_t for all t ≥ n, so

L(ω) = lim_{t→∞} h(ω_t, t) = lim_{t→∞} h(ω′_t, t) = L(ω′).

Therefore L is σ(X_n, X_{n+1}, . . .)-measurable by Exercise 6.9, and since this holds for any n, we have that L is T-measurable. :) X

Solution to Exercise 6.16 :( Define Y_n := X_{t+n} for all n ≥ 0. The Markov property tells us that conditioned on X_t = y, we have that (Y_n)_n has the distribution of a µ-random walk started at Y_0 = y. Also, θ^t(X_0, X_1, . . .) = (Y_0, Y_1, . . .) by definition. Thus,

E_x[L ∘ θ^t | X_t = y] = E_y[L]. :) X

Solution to Exercise 6.17 :( For a bounded I-measurable L, we know that L = L ∘ θ. Note that by the Markov property, P_x-a.s.,

E_{X_t}[L] = E_x[L ∘ θ^t | F_t].

So

E_x[h_L(X_1)] = E_x[E_{X_1}[L]] = E_x[L ∘ θ] = E_x[L] = h_L(x).


Thus h_L ∈ BHF(G, µ). The inverse mapping is well defined because if h ∈ BHF(G, µ) then (h(X_t))_t is a bounded martingale and thus converges a.s. In Exercise 6.14 we saw that L_h = lim sup_{t→∞} h(X_t) is I-measurable, so that lim_{t→∞} h(X_t) = L_h a.s. To show this is indeed the inverse mapping, we show that E_x[L_h] = h(x). Indeed, since (h(X_t))_t converges a.s.,

E_x[L_h] = E_x[lim sup_{t→∞} h(X_t)] = E_x[lim_{t→∞} h(X_t)] = lim_{t→∞} E_x[h(X_t)] = h(x).

The exchange of limit and expectation is justified by dominated convergence, since (h(X_t))_t is uniformly bounded. Also, for a bounded I-measurable L, we have that P_x-a.s.,

lim sup_{t→∞} h_L(X_t) = lim sup_{t→∞} E_{X_t}[L] = lim sup_{t→∞} E_x[L ∘ θ^t | F_t] = lim sup_{t→∞} E_x[L | F_t] = E_x[L | F] = L.

We have used that σ(∪_t F_t) = F. :) X

Solution to Exercise 6.18 :( If every bounded I-measurable random variable is constant P_x-a.s. for all x ∈ G, then applying this to indicators completes one direction. For the other direction, assume I is trivial. If L ≥ 0 is bounded and I-measurable, then for any x, y such that P_x[X_1 = y] > 0, we have

E_x[L] = E_x[L ∘ θ] ≥ P_x[X_1 = y] · E_y[L].

Thus, E_x[L] = 0 implies E_y[L] = 0. Since this holds for any pair x, y with P(x, y) = µ(x^{−1}y) > 0, and since µ is adapted, this implies that E_x[L] is either always 0 or always nonzero, for all x ∈ G. Now, if L = 1_A for A ∈ I, this argument implies that P_x[A] = P_y[A] ∈ {0, 1} for all x, y ∈ G (as we assumed that I is trivial). Thus, this holds for finite linear combinations of indicators as well. If Y is a nonnegative I-measurable random variable, then we can approximate Y by Y_n ↗ Y, which converges P_x-a.s. for all x, where Y_n is a finite linear combination of indicators (e.g. take Y_n = 2^{−n}⌊2^n Y⌋ ∧ n). Thus, P_x[Y = c] = 1 for some c ≥ 0 and all x ∈ G. If Y is a general (complex) I-measurable random variable, we may write Y = (Y_1 − Y_2) + i(Y_3 − Y_4) where the Y_j are all I-measurable and nonnegative. Thus, P_x[Y_j = c_j] = 1 for all x ∈ G and some constants c_j ≥ 0, which implies that P_x[Y = (c_1 − c_2) + i(c_3 − c_4)] = 1 for all x ∈ G. :) X

Solution to Exercise 6.20 :( A tail event is measurable with respect to σ(X_t, X_{t+1}, X_{t+2}, . . .) for all t. Since A is independent of F_t we have that E_x[1_A] = E_x[1_A | F_t]. Now, M_t = E_x[1_A | F_t] is a bounded martingale. Thus, it converges a.s. and in L^1 to the integrable random variable M_∞ = lim sup_{t→∞} M_t. Now, since M_∞ = lim sup_{t→∞} M_t, and M_t is measurable with respect to F_t for all t, we have that M_∞ is measurable with respect to T. Also, for any B ∈ T we have M_t 1_B → M_∞ 1_B in L^1, so

E_x[1_A 1_B] = E_x[E_x[1_A | F_t] 1_B] → E_x[M_∞ 1_B].

Thus, M_∞ = E_x[1_A | T]. Because A ∈ T, we have that E_x[1_A] = E_x[1_A | T] = 1_A ∈ {0, 1} a.s. So 1_A is a.s. constant in {0, 1}. :) X

Solution to Exercise 6.21 :( If h : G → C is a bounded harmonic function, then (h(X_t))_t is a bounded martingale. Since (G, µ) is recurrent, T_1 = inf{t ≥ 1 : X_t = 1} is a.s. finite. Thus, the optional stopping theorem (Theorem 2.3.3) tells us that h(x) = E_x[h(X_{T_1})] = h(1) for any x ∈ G, implying that h is constant. :) X


Solution to Exercise 6.22 :( Using that the random variables are discrete, since

E[1_{X=x} | σ(Y)] = Σ_y 1_{Y=y} P[X = x, Y = y] / P[Y = y],

we get that

H(X|Y) = −Σ_{x,y} P[X = x, Y = y] log P[X = x | Y = y].

So

H(X, Y) = −Σ_{x,y} P[X = x, Y = y] log P[X = x, Y = y]
= −Σ_{x,y} P[X = x, Y = y] log P[X = x | Y = y] − Σ_{x,y} P[X = x, Y = y] log P[Y = y]
= H(X|Y) + H(Y). :) X

Solution to Exercise 6.23 :( Compute:

H(X) − H(X|Y) = H(X) + H(Y) − H(X, Y) = H(Y) − H(Y|X). :) X
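The chain rule just proved is easy to sanity-check numerically. The following sketch (not from the book; the small joint distribution is an arbitrary choice) verifies H(X, Y) = H(X|Y) + H(Y) for a discrete pair.

```python
import math

def H(p):
    # Shannon entropy (in nats) of a distribution given as {outcome: probability}.
    return -sum(q * math.log(q) for q in p.values() if q > 0)

# An arbitrary small joint distribution P[X = x, Y = y].
joint = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.4}

# Marginal law of Y.
pY = {}
for (x, y), q in joint.items():
    pY[y] = pY.get(y, 0.0) + q

# H(X|Y) = -sum_{x,y} P[x, y] log P[x | y].
H_X_given_Y = -sum(q * math.log(q / pY[y]) for (x, y), q in joint.items() if q > 0)

# The chain rule of Exercise 6.22: H(X, Y) = H(X|Y) + H(Y).
assert abs(H(joint) - (H_X_given_Y + H(pY))) < 1e-12
```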

Solution to Exercise 6.24 :( Let R_Y = {y ∈ G : P[Y = y] > 0}. In the solution to Exercise 6.22 we have seen that

H(X | Y) = −Σ_{y∈R_Y} Σ_x P[X = x, Y = y] log P[X = x | Y = y].

So using this computation, and the assumption P[X ∈ φ(Y)] = 1, we find using Jensen's inequality that

H(X | Y) = Σ_{y∈R_Y} P[Y = y] Σ_{x∈φ(y)} P[X = x | Y = y] log(1 / P[X = x | Y = y])
≤ Σ_{y∈R_Y} P[Y = y] · log|φ(y)| = E[log|φ(Y)|]. :) X

Solution to Exercise 6.25 :( The function φ(x) = −x log x is a concave function on (0, 1). Thus, Jensen's inequality tells us that

H(X | G) = Σ_x E[φ(E[1_{X=x} | G])] = Σ_x E[E[φ(E[1_{X=x} | G]) | σ]] ≤ Σ_x E[φ(E[1_{X=x} | σ])] = H(X | σ). :) X

Solution to Exercise 6.26 :( We use the "equality version" of Jensen's inequality. The function φ(x) = −x log x is strictly concave on (0, 1). Since H(X|σ) ≤ H(X), equality holds if and only if E[φ(E[1_{X=x} | σ])] = φ(P[X = x]) for every x. Thus, with Z = E[1_{X=x} | σ], we have that this holds if and only if E[1_{X=x} | σ] = P[X = x] for every x a.s., which is if and only if X is independent of σ. :) X

Solution to Exercise 6.27 :( Since P[A] ∈ {0, 1} for all A ∈ G, any X is independent of G, so H(X | G) = H(X). :) X

Solution to Exercise 6.28 :( This is equivalent to H (X |Y ) ≤ H (X) , which follows from the fact that {∅, Ω} ⊂ σ(Y ) , so that H (X |Y ) ≤ H (X | {∅, Ω}) = H (X) . :) X


Solution to Exercise 6.29 :( Just note that H(X, Y) − H(X) = H(Y|X) ≥ 0.

:) X

Solution to Exercise 6.30 :( Let p_n = P[X = n]. Define A = {n ∈ N : p_n > (n + 1)^{−2}}. The function φ(ξ) = −ξ^{1/4} log ξ is maximized on the interval [0, 1] at the point ξ = e^{−4}, obtaining the value φ(e^{−4}) = 4/e. Thus, for all n ∈ N,

−p_n log p_n = (p_n)^{3/4} · φ(p_n) ≤ (4/e) · (p_n)^{3/4}.

We conclude that

H(X) = −Σ_{n=0}^∞ p_n log p_n = −Σ_{n∈A} p_n log p_n − Σ_{n∉A} p_n log p_n
≤ −Σ_{n∈A} p_n log (n + 1)^{−2} + (4/e) Σ_{n∉A} (p_n)^{3/4}
≤ 2 Σ_{n=0}^∞ p_n log(n + 1) + (4/e) Σ_{n=0}^∞ (n + 1)^{−3/2},

as required. :) X

Solution to Exercise 6.31 :( For integers n > 1, let P[X = 1/n] = p_n, where p_n = C / (n(log n)^2), with C > 0 chosen so that Σ_{n=2}^∞ p_n = 1. Then,

H(X) = −Σ_{n=2}^∞ p_n log p_n = C · Σ_{n=2}^∞ (log n + 2 log log n − log C) / (n(log n)^2),

which is infinite. We have used the fact that Σ_{n=2}^∞ 1/(n(log n)^2) < ∞ but Σ_{n=2}^∞ 1/(n log n) = ∞. :) X
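The two series behind this example are easy to probe numerically. A sketch (not from the book; the cutoff N is arbitrary): the convergent series stays bounded, while the divergent one grows like log log N, which is exactly why H(X) is infinite.

```python
import math

# sum 1/(n (log n)^2) converges, while sum 1/(n log n) grows like log log N.
N = 10 ** 5
conv = sum(1.0 / (n * math.log(n) ** 2) for n in range(2, N))
div = sum(1.0 / (n * math.log(n)) for n in range(2, N))
assert conv < 2.5   # bounded partial sums of the convergent series
assert div > 2.5    # already larger, and still growing with N
```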

Solution to Exercise 6.32 :( By Proposition 6.3.2, H(X_{t+s} | X_t) = H(X_s). This implies that

H(X_{t+s}) ≤ H(X_{t+s}, X_t) = H(X_{t+s} | X_t) + H(X_t) ≤ H(X_s) + H(X_t),

so that the sequence (H(X_t))_t is subadditive. Hence, the limit

h(G, µ) = inf_t H(X_t)/t = lim_{t→∞} H(X_t)/t

exists by Fekete's lemma. Finally, iterating subadditivity, H(X_t) ≤ t · H(X_1), showing that h(G, µ) ≤ H(X_1). :) X
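The subadditivity just used can be confirmed exactly in the simplest example. A sketch (not from the book): for simple random walk on Z, X_t = t − 2·Bin(t, 1/2), so H(X_t) equals the entropy of the Binomial(t, 1/2) distribution.

```python
import math
from math import comb

# H(X_t) for simple random walk on Z, computed exactly from the binomial law.
def H_walk(t):
    probs = [comb(t, k) / 2.0 ** t for k in range(t + 1)]
    return -sum(q * math.log(q) for q in probs if q > 0)

# Subadditivity H(X_{t+s}) <= H(X_t) + H(X_s), as in Exercise 6.32.
for t in range(1, 10):
    for s in range(1, 10):
        assert H_walk(t + s) <= H_walk(t) + H_walk(s) + 1e-9
```

Here h(Z, µ) = 0, consistent with H(X_t) growing only logarithmically.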

Solution to Exercise 6.33 :( Since G has at most exponential growth, there exists D > 0 such that for any r > 0 we have log|B(1, r)| ≤ Dr. (Here B(1, r) is the ball of radius r in some fixed Cayley graph of G.) Let U have law µ. Since U ∈ B(1, |U|), by Exercise 6.24,

H(U) ≤ H(|U|) + H(U | |U|) ≤ H(|U|) + E log|B(1, |U|)| ≤ H(|U|) + D E|U|,

so it suffices to prove that H(|U|) < ∞. We have that |U| is a random variable with nonnegative integer values such that E[|U|] < ∞ (because µ has finite 1st moment). So H(|U|) < ∞ by Exercise 6.30. :) X

Solution to Exercise 6.34 :( Let G = F_d be the free group on d generators. Consider the Cayley graph with respect to the standard generators. Let S_r = {x ∈ G : |x| = r}. So |S_r| ≥ e^{cr} for some constant c = c(d) > 0 and all r ≥ 0.


Define a probability measure µ on G by

µ(x) = α · |x|^{−2} · |S_{|x|}|^{−1} · 1_{|x|>0},

for α^{−1} = Σ_{r=1}^∞ r^{−2}. Let U be a random element of G with law µ. For any ε > 0 we have that

E[|U|^{1−ε}] = Σ_{r=1}^∞ Σ_{|x|=r} r^{1−ε} · α r^{−2} |S_r|^{−1} = α · Σ_{r=1}^∞ r^{−1−ε} < ∞.

So µ ∈ SA(G, 1 − ε). However,

H(µ) = Σ_{r=1}^∞ Σ_{|x|=r} α r^{−2} |S_r|^{−1} log(α^{−1} r^2 |S_r|) ≥ α Σ_{r=1}^∞ r^{−2} log(α^{−1} r^2 e^{cr}) ≥ cα Σ_{r=1}^∞ r^{−1} = ∞. :) X

Solution to Exercise 6.35 :( Let p(k) = C(n, k) p^k (1 − p)^{n−k} and q(k) = C(n+1, k) p^k (1 − p)^{n+1−k} be the laws of B, B′, respectively. It suffices to prove that ||p − q||_TV ≤ c n^{−1/2} (see Appendix C), where ||p − q||_TV is the total variation distance; that is,

2||p − q||_TV = Σ_{k=0}^n |C(n, k) p^k (1 − p)^{n−k} − C(n+1, k) p^k (1 − p)^{n+1−k}| + p^{n+1}
= Σ_{k=0}^{n+1} C(n+1, k) p^k (1 − p)^{n−k} · |(n+1−k)/(n+1) − (1 − p)|
= (1/(n+1)) Σ_{k=0}^{n+1} C(n+1, k) p^k (1 − p)^{n−k} · |k − (n+1)p|
≤ √(Var[Bin(n+1, p)]) / (n+1) = √((n+1)p(1−p)) / (n+1) ≤ 1/(2√(n+1)),

using p(1 − p) ≤ 1/4. :) X
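The n^{−1/2} decay can be confirmed by computing the total variation distance exactly. A sketch (not from the book): the constant degrades as p approaches 1, so only moderate values of p are tested against the 1/(2√(n+1)) form here.

```python
import math
from math import comb

# Exact total variation distance between Bin(n, p) and Bin(n+1, p).
def tv_bin(n, p):
    a = [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 2)]
    b = [comb(n + 1, k) * p ** k * (1 - p) ** (n + 1 - k) for k in range(n + 2)]
    # comb(n, n+1) == 0, so the last entry of a vanishes automatically.
    return 0.5 * sum(abs(x - y) for x, y in zip(a, b))

for n in (5, 20, 100):
    for p in (0.1, 0.5):
        assert tv_bin(n, p) <= 1 / (2 * math.sqrt(n + 1)) + 1e-12
```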

Solution to Exercise 6.36 :( We prove this by induction on t. If t = 1 then

P[X_{B_1} = x] = p P[X_0 = x] + (1 − p) P[X_1 = x] = p δ_1(x) + (1 − p)ν(x) = µ(x).

Now, for t ≥ 1, let B_t ∼ Bin(t, 1 − p) and let M ∼ Ber(1 − p) be independent of B_t. So B_t + M ∼ Bin(t + 1, 1 − p). By induction we have that

P[X_{B_t + M} = x] = p P[X_{B_t} = x] + (1 − p) P[X_{B_t + 1} = x]
= p µ^t(x) + (1 − p) Σ_y P[X_{B_t} = y] ν(y^{−1}x)
= Σ_y µ^t(y) (p δ_1(y^{−1}x) + (1 − p)ν(y^{−1}x)) = µ^t ∗ µ(x) = µ^{t+1}(x),

which completes the induction. :) X
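This identity can be verified exactly in a concrete case. A sketch (not from the book), specialized to G = Z written additively (so the identity element is 0; p and t are arbitrary choices): the t-step law of the lazy measure µ = pδ_0 + (1 − p)ν equals the law of a ν-walk run for an independent Binomial(t, 1 − p) number of steps.

```python
from math import comb

# Convolution of two finitely supported measures on Z.
def convolve(a, b):
    out = {}
    for x, qa in a.items():
        for y, qb in b.items():
            out[x + y] = out.get(x + y, 0.0) + qa * qb
    return out

p, t = 0.3, 6
nu = {-1: 0.5, 1: 0.5}
mu = {0: p, -1: (1 - p) / 2, 1: (1 - p) / 2}

# mu^t by direct convolution.
mu_t = {0: 1.0}
for _ in range(t):
    mu_t = convolve(mu_t, mu)

# Mixture of nu^k with Binomial(t, 1 - p) weights, i.e. the law of X_{B_t}.
mix = {}
nu_k = {0: 1.0}
for k in range(t + 1):
    w = comb(t, k) * (1 - p) ** k * p ** (t - k)
    for x, q in nu_k.items():
        mix[x] = mix.get(x, 0.0) + w * q
    nu_k = convolve(nu_k, nu)

assert all(abs(mu_t.get(x, 0.0) - mix.get(x, 0.0)) < 1e-12
           for x in set(mu_t) | set(mix))
```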

Solution to Exercise 6.38 :( We have that (H(X_{tn})/(tn))_t is a subsequence of (H(X_t)/t)_t, so

h(G, µ^n) = lim_{t→∞} H(X_{tn})/t = n · lim_{t→∞} H(X_{tn})/(tn) = n · h(G, µ). :) X


Solution to Exercise 6.40 :( Just a straightforward computation:

Σ_y P[Y = y] · D((X | Y = y) || X) = Σ_y P[Y = y] Σ_x P[X = x | Y = y] log (P[X = x, Y = y] / (P[Y = y] P[X = x]))
= Σ_{x,y} P[X = x, Y = y] log P[X = x, Y = y] − Σ_{x,y} P[X = x, Y = y] log P[Y = y] − Σ_{x,y} P[X = x, Y = y] log P[X = x]
= −H(X, Y) + H(Y) + H(X) = I(X, Y). :) X

Solution to Exercise 6.41 :( Let X have law µ and Y have law ν. Fix an event A. Let f = 1_A. So E[f(X)] = P[X ∈ A] and E[f(Y)] = P[Y ∈ A]. Also, E[f(X)^2] + E[f(Y)^2] ≤ 2. By Proposition 6.5.3,

|P[X ∈ A] − P[Y ∈ A]| ≤ √(2D(X||Y) · 2) = 2√(D(X||Y)).

Taking the supremum over events A, we obtain the inequality.

Solution to Exercise 6.42 :( Let x ∈ G and let n = n(x), so µ^n(x) > 0. Fix t > 0. Recall that µ^t(z) = P[X_t = z]. So

P[X_{t+n} = z | X_n = y] = P[yX_t = z] = µ^t(y^{−1}z) = y.µ^t(z).

Using Pinsker's inequality (Exercise 6.41), Jensen's inequality, Proposition 6.3.2, and Exercise 6.40, we find that

µ^n(x) · K_t(x) = µ^n(x) · ||x.µ^t − µ^{t+n}||_TV ≤ Σ_y µ^n(y) ||y.µ^t − µ^{t+n}||_TV
≤ 2 Σ_y µ^n(y) √(D(y.µ^t || µ^{t+n})) ≤ 2 (Σ_y µ^n(y) D((X_{t+n} | X_n = y) || X_{t+n}))^{1/2}
= 2 √(I(X_{t+n}, X_n)) = 2 √(H(X_{t+n}) − H(X_{t+n} | X_n))
= 2 (Σ_{k=0}^{n−1} (H(X_{t+k+1}) − H(X_{t+k})))^{1/2}.

We have seen in the proof of Proposition 6.3.4 that (H(X_{t+1}) − H(X_t))_t is a nonincreasing sequence converging to h(G, µ). Thus, for any x ∈ G we have that

lim sup_{t→∞} K_t(x) ≤ (2 / µ^{n(x)}(x)) · √(n(x) h(G, µ)).

Hence, if (G, µ) is Liouville then K_t(x) → 0 for any x ∈ G. For the other direction, assume that K_t(x) → 0 as t → ∞ for all x ∈ G. Let f ∈ BHF(G, µ). For t > 0, let (X_t, Y_{t+1}) be a coupling of two µ-random walks at times t and t + 1 such that P[xX_t ≠ Y_{t+1}] = K_t(x). Then,

|f(x) − f(1)| = |E[f(xX_t) − f(Y_{t+1})]| ≤ 2||f||_∞ P[xX_t ≠ Y_{t+1}] = 2||f||_∞ K_t(x).

Taking t → ∞ shows that f(x) = f(1) for all x ∈ G. So any f ∈ BHF(G, µ) is constant.

:) X
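The Pinsker-type bound of Exercise 6.41 is easy to test numerically. A sketch (not from the book; the classical Pinsker inequality is sharper, and the random test distributions are arbitrary): with ||·||_TV = sup_A |µ(A) − ν(A)|, we check ||µ − ν||_TV ≤ 2√(D(µ||ν)).

```python
import math
import random

random.seed(0)

def normalize(w):
    s = sum(w)
    return [x / s for x in w]

for _ in range(100):
    # Two random, fully supported distributions on 6 points.
    p = normalize([random.random() + 0.01 for _ in range(6)])
    q = normalize([random.random() + 0.01 for _ in range(6)])
    tv = 0.5 * sum(abs(a - b) for a, b in zip(p, q))          # sup_A |p(A) - q(A)|
    kl = sum(a * math.log(a / b) for a, b in zip(p, q))       # D(p || q)
    assert tv <= 2 * math.sqrt(kl) + 1e-12
```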

Solution to Exercise 6.43 :( Let G = Z and µ uniform on {−1, 1}. Let x = 1. Since

P[X_{2t} ∈ 2Z] = P[X_{2t+1} ∈ 2Z + 1] = 1

for all t, and since P_1[X_t ∈ A] = P[X_t ∈ A − 1] for all A ⊂ Z, we have that

||P_x[X_{2t} = ·] − P[X_{2t} = ·]||_TV ≥ |P_1[X_{2t} ∈ 2Z] − P[X_{2t} ∈ 2Z]| = 1,
||P_x[X_{2t+1} = ·] − P[X_{2t+1} = ·]||_TV ≥ |P_1[X_{2t+1} ∈ 2Z] − P[X_{2t+1} ∈ 2Z]| = 1. :) X


Solution to Exercise 6.44 :( Assume that h : G → R is a harmonic function and x ∈ G is such that

lim inf_{t→∞} (1/t) Var_x[h(X_t)] · H(X_t) = 0.

Note that since (h(X_t))_t is a martingale, we have that

Var_x[h(X_t)] = E_x[|h(X_t) − h(X_{t−1})|^2] + E_x[|h(X_{t−1}) − h(x)|^2] + 2 E_x[(h(X_{t−1}) − h(x)) · E[(h(X_t) − h(X_{t−1})) | F_{t−1}]]
= E_x[|h(X_t) − h(X_{t−1})|^2] + Var_x[h(X_{t−1})],

which implies that (Var_x[h(X_t)])_t is a nondecreasing sequence. Thus,

lim sup_{t→∞} (1/t) Var_x[h(X_t)] · H(X_t) = 0.

Now, let y ∈ G and consider m ≥ 1 large enough so that P_x[X_m = y] > 0 (using that µ is adapted). Note that h is µ^m-harmonic as well. Also, (Y_t = X_{mt})_t is a µ^m-random walk, with

(1/t) Var_x[h(Y_t)] · H(Y_t) = m · (1/(mt)) Var_x[h(X_{mt})] · H(X_{mt}) → 0.

Thus, by Theorem 6.5.5, we then have that E_x|h(Y_1) − h(x)| = 0, implying that h(Y_1) = h(x) P_x-a.s. Since P_x[Y_1 = y] > 0, this implies that h(y) = h(x). This holds for arbitrary y ∈ G, implying that h is constant. :) X

Solution to Exercise 6.45 :( If (G, µ) is not Liouville then there exists a bounded nonconstant harmonic function h : G → C. It is immediate that Re h is also a bounded harmonic function, and must therefore have bounded variance. For the other direction, let h : G → R be a nonconstant harmonic function with sup_t Var[h(X_t)] ≤ M < ∞. Since h is nonconstant and since µ is adapted, there exist x ∈ G and u ∈ supp(µ) such that h(xu) ≠ h(x). By replacing h with x.h we may assume that h(u) ≠ h(1). So

E|h(X_1) − h(1)| ≥ µ(u)|h(u) − h(1)| > 0.

By Theorem 6.5.5, we have

(E|h(X_1) − h(1)|)^2 ≤ 4 E|h(X_t) − h(1)|^2 · (H(X_t) − H(X_{t−1})) ≤ 4M · (1/t) H(X_t).

Taking t → ∞ we have that h(G, µ) ≥ C/(4M), where C = (E|h(X_1) − h(1)|)^2 > 0. So h(G, µ) is positive, which implies that (G, µ) is non-Liouville. :) X

Solution to Exercise 6.46 :( Let (X, Y) be a coupling of X_{E_r} started at x and at y such that P[X ≠ Y] = D_r(x, y). Because µ is supported on S, we have that |X| = |Y| = r + 1. Also, |X_t| ≤ r for all t < E_r. Let f be a µ-harmonic function. Note that M_t = f(X_{E_r ∧ t}) is a martingale, uniformly bounded by max_{|z|≤r+1} |f(z)|. Thus, the optional stopping theorem (Theorem 2.3.3) guarantees that

E[f(X)] = E_x[f(X_{E_r})] = f(x)  and  E[f(Y)] = E_y[f(X_{E_r})] = f(y).

So,

|f(x) − f(y)| ≤ E[|f(X) − f(Y)| 1_{X≠Y}] ≤ 2 max_{|z|=r+1} |f(z)| · D_r(x, y). :) X

Solution to Exercise 6.47 :( Note that for all integers t, s > 0,

E|X_{t+s}| ≤ E|X_t| + E|(X_t)^{−1} X_{t+s}| = E|X_t| + E|X_s|.

Thus, the limit exists by subadditivity (Fekete's lemma). :) X

Solution to Exercise 6.48 :( By Exercise 6.30,

H(|X_t|) ≤ C E[log(|X_t| + 1)] + C ≤ C′ E|X_t|


for some universal constants C, C′ > 0. Also, there exists a constant C > 0 such that for all r ≥ 0 we have |B(1, r)| ≤ exp(Cr), so that by Exercise 6.24,

H(X_t) ≤ H(|X_t|) + H(X_t | |X_t|) ≤ H(|X_t|) + E log|B(1, |X_t|)| ≤ C′ E|X_t| + C E|X_t|,

so that

h(G, µ) ≤ (C′ + C) · lim_{t→∞} E|X_t| / t = 0. :) X

Solution to Exercise 6.50 :( Let µ ∈ SA(G, 1). Since µ has finite first moment, we know that E|X_t| ≤ Ct for some constant C > 0 and all t ≥ 0. Let ε > 0, and let R > 0 be such that for all r > R we have log|B(1, r)| ≤ εr. As in Exercise 6.48, using Exercises 6.24 and 6.30, we have that for some constant C′ > 0,

H(X_t | |X_t|) ≤ E[log|B(1, |X_t|)|] ≤ ε E|X_t| + log|B(1, R)|,
H(|X_t|) ≤ C′ E[log(|X_t| + 1)] + C′ ≤ C′ ε E|X_t| + C′ log(R + 1) + C′,

which implies that

h(G, µ) = lim_{t→∞} H(X_t) / t ≤ (C′ + 1)Cε.

Since ε > 0 was arbitrary, this proves that h(G, µ) = 0. :) X

Solution to Exercise 6.51 :( We have that

(x, σ)^{(y,τ)} = (y^{−1}, y^{−1}.τ)(x, σ)(y, τ) = (x^y, y^{−1}.(τ + σ) + y^{−1}x.τ),

and since y^{−1}.τ + y^{−1}.τ = 0,

(1, σ)^{(y,τ)} = (1, y^{−1}.σ).

Note that this implies that {1} × Σ(G) ⊂ L(G) is a normal subgroup. Also, the map (x, σ) ↦ x is a homomorphism from L(G) onto G, with kernel exactly {1} × Σ(G), so L(G)/({1} × Σ(G)) ≅ G.

Note that for any y ∈ supp(σ), we have that (1, δ_1)^{(y^{−1},0)} = (1, δ_y). Thus, if supp(σ) = {y_1, . . . , y_k} then

(1, σ) = (1, δ_1)^{(y_1^{−1},0)} · · · (1, δ_1)^{(y_k^{−1},0)}.

So we only have to show that {(s, 0) : s ∈ S} generates any element of the form (y, 0) for y ∈ G. Indeed, if y ∈ G then y = s_1 · · · s_n for some s_j ∈ S, so

(y, 0) = (s_1, 0) · · · (s_n, 0).

Using the above computation of conjugation in L(G), we can compute

[(x, σ), (y, τ)] = (x^{−1}, x^{−1}.σ)(x, σ)^{(y,τ)} = ([x, y], x^{−1}.(σ + y^{−1}.σ + y^{−1}.τ + y^{−1}x.τ)).

So [L(G), L(G)] ⊂ [G, G] × Σ(G) as sets. :) X

Solution to Exercise 6.52 :( Transience follows from exponential growth by the Nash inequality; see Exercise 5.27. For exponential growth, consider a finite symmetric generating set S for G, and for L(G) use the generating set S′ = {(s, 0), (1, δ_1) : s ∈ S}. Let (s_n)_n be a sequence of generators in S and denote x_n = s_1 · · · s_n for all n. Choose (s_n)_n so that |x_n| = n for all n, and set x_0 = 1. Then, consider

F_n = {(x_n, σ) : supp(σ) ⊂ {x_0, x_1, . . . , x_n}}.

It is immediate that |F_n| = 2^{n+1}. It is also easy to show that for any (x_n, σ) ∈ F_n one has |(x_n, σ)| ≤ 2(n + 1). Thus, |B_{S′}((1, 0), 2r)| ≥ 2^r for all r. :) X

Solution to Exercise 6.53 :( If the random walk is recurrent then

P_x[∀ t ≥ 0, |X_t| > r] ≤ P_x[∀ t ≥ 0, X_t ≠ 1] = P_x[T_1 = ∞] = 0.

This proves (3) ⇒ (1). (2) ⇒ (3) is trivial. To prove (1) ⇒ (2), let Z_r = G\B(1, r) be the complement of the ball of radius r. If the walk is transient then

0 < P[T_1^+ = ∞] ≤ P[T_{Z_r} < T_1^+] · sup_{x∈Z_r} P_x[T_1 = ∞].

For every r > 0 let z_r ∈ Z_r be such that

P_{z_r}[T_1 = ∞] ≥ sup_{x∈Z_r} P_x[T_1 = ∞] · (1 − r^{−1}).

Then, as r → ∞,

P_{z_r}[T_1 = ∞] ≥ (1 − r^{−1}) · P[T_1^+ = ∞] / P[T_{Z_r} < T_1^+] → 1.

Since for any r > 0 the set B(1, r) is finite, and since µ is adapted, we have sup_{|x|≤r} P_x[T_1 = ∞] < 1. Thus, it must be that |z_r| → ∞ and we are done. :) X

Solution to Exercise 6.54 :( Let F ⊂ G be a finite set. Define F̃ ⊂ L(G) by

F̃ = {(x, σ) : x ∈ F, supp(σ) ⊂ F}.

It is easy to compute that

|F̃| = 2^{|F|} · |F|,
|∂F̃| = #{(x, σ) : supp(σ) ⊂ F, x ∈ ∂F} = 2^{|F|} · |∂F|,

so that |∂F̃| / |F̃| = |∂F| / |F|. Thus, a Følner sequence in G gives rise to a Følner sequence in L(G). :) X
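These counts can be verified by brute-force enumeration in a concrete case. A sketch (not from the book), for G = Z and F = {0, . . . , m − 1}: listing F̃ and ∂F̃ directly recovers |F̃| = 2^{|F|}|F|, |∂F̃| = 2^{|F|}|∂F|, and the equality of isoperimetric ratios.

```python
from itertools import product

m = 4
F = list(range(m))
bdF = [-1, m]                                  # outer boundary of the interval in Z
configs = list(product([0, 1], repeat=m))      # lamp configurations supported in F

# F~ = {(x, sigma) : x in F, supp(sigma) in F}; boundary copies sit over bd F.
F_tilde = [(x, s) for x in F for s in configs]
bd_F_tilde = [(x, s) for x in bdF for s in configs]

assert len(F_tilde) == 2 ** m * len(F)
assert len(bd_F_tilde) == 2 ** m * len(bdF)
assert len(bd_F_tilde) / len(F_tilde) == len(bdF) / len(F)
```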

Solution to Exercise 6.55 :( Let σ_1, . . . , σ_n be elements in S*_∞ and let G be the subgroup generated by these elements. Let M ⊂ Ω be defined by

M = ∪_{j=1}^n supp(σ_j).

Note that for any j we have that σ_j restricted to M is a bijection of M. Thus, for any element g ∈ G also g restricted to M is a bijection of M (because composing bijections gives a bijection). Thus, the map g ↦ g|_M is a homomorphism from G to the group of permutations of M. Also, the kernel of this homomorphism is precisely those g ∈ G such that g|_M is the identity on M. However, since supp(g) ⊂ M for any g ∈ G, it must be that the kernel is trivial. So G admits an injective homomorphism into a finite group (the group of permutations of the finite set M), implying that G is finite.

Now, S*_∞ is not finite, because, for example, if we identify Ω with N and consider the transpositions π_n = (n, n + 1), then there are infinitely many different permutations with support of size 2. :) X

Solution to Exercise 6.57 :( For x = (ω_1, . . . , ω_n) ∈ T denote |x| = n and consider the process Y_t = |X_t|. Note that for any t we have a.s.

P[Y_{t+1} = Y_t + 1 | X_t] ≥ P(X_t, X_t + 1) + P(X_t, X_t − 1) ≥ α.


Thus, we may couple (Y_t)_t with a process (Z_t)_t on N\{0} such that Z_t ≤ Y_t a.s. for all t, with Z_0 = Y_0, and such that (Z_t)_t is a Markov chain with transition probabilities P(z, z + 1) = 1 − P(z, z − 1) = α for all z > 1. Under such a coupling we have the inclusion of events

{∀ t, Z_t ≥ Z_0} ⊂ {∀ t, Y_t ≥ Y_0} ⊂ {∀ t, X_t ≠ X̂_0}.

Thus we only need to prove a lower bound on P[∀ t, Z_t ≥ Z_0 | Z_0 = n > 1]. Denote φ(n) = P[∀ t, Z_t ≥ Z_0 | Z_0 = n]. Since (Z_t)_t moves only by ±1 steps, we have that φ(n) = φ(n + 1) = φ for all n > 1. We have:

φ(n) = P[Z_1 = n + 1 | Z_0 = n] · P[∀ t, Z_t ≥ n | Z_0 = n + 1]
= α · P[∀ t, Z_t ≥ n + 1 | Z_0 = n + 1] + α · P[∃ t, Z_t = n | Z_0 = n + 1] · P[∀ t, Z_t ≥ n | Z_0 = n]
= α · φ + α · (1 − φ) · φ.

We thus arrive at the equation 1 = α(2 − φ), which gives us, for any |x| > 1,

P_x[∀ t, X_t ≠ x̂] ≥ φ = 2 − 1/α. :) X
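The value φ = 2 − 1/α is easy to corroborate by simulation of the comparison chain. A Monte Carlo sketch (not from the book; the horizon T, start point, and trial count are arbitrary choices):

```python
import random

# Estimate phi = P[forall t, Z_t >= Z_0] for the birth-and-death chain
# with upward probability alpha, and compare with 2 - 1/alpha.
random.seed(1)
alpha, start, T, trials = 0.7, 10, 400, 20000

def stays_above_start():
    z = start
    for _ in range(T):
        z += 1 if random.random() < alpha else -1
        if z < start:
            return False
    return True

est = sum(stays_above_start() for _ in range(trials)) / trials
assert abs(est - (2 - 1 / alpha)) < 0.05
```

The finite horizon slightly overestimates φ, but for α = 0.7 the upward drift makes a late dip below the start point extremely unlikely.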

Solution to Exercise 6.58 :( Set

h(σ) = P^σ[lim sup_t Yt = 1],

where Xt, Yt are defined in Example 6.10.3. Exercise 6.57 shows that

P[∀ t, Xt ≥ X0 | X0 > 1] ≥ ε := 1 − A/2 > 0.

Let Tn = inf{t : Xt = n}. Since Xt+1 − Xt ∈ {−1, 0, 1} always, conditioned on X0 ≥ n we have that Tn < Tn−1 < Tn−2 < · · · < T2 < T1. Since P[Tn−1 = ∞ | X0 = n] ≥ ε, we have by repeated use of the strong Markov property,

P[T1 < ∞ | X0 = n] = P[Tn−1 < ∞ | X0 = n] · P[T1 < ∞ | X0 = n − 1] ≤ · · · ≤ (1 − ε)^{n−1}.

Since P[Yt+1 ≠ Yt | Xt > 1] = 0, we conclude that

P[∀ t, Yt = Y0 | X0 = n] ≥ P[T1 = ∞ | X0 = n] ≥ 1 − (1 − ε)^{n−1}.

Thus, if σ0 = σ is such that X0 = n, Y0 = −1 then h(σ) ≤ (1 − ε)^{n−1}, and if σ0 = σ′ is such that X0 = n, Y0 = 1 then h(σ′) ≥ 1 − (1 − ε)^{n−1}. Taking n large enough so that (1 − ε)^{n−1} < 1/2, we obtain σ ≠ σ′ for which h(σ) < 1/2 < h(σ′). :) X

Solution to Exercise 6.59 :( Let (Ut, Jt, It)_{t≥1} be mutually independent random variables such that Ut ∈ G has distribution µ, Jt is a Bernoulli-1/2 random variable, and It is a Bernoulli-ε random variable. Define (Xt, σt) ∈ L(G) inductively by X0 = 1, σ0 = 0, and for t > 0,

(Xt, σt) = (Xt−1, σt−1) · (Ut, 0)    if It = 1,
(Xt, σt) = (Xt−1, σt−1) · (1, δ1)    if Jt = 1, It = 0,
(Xt, σt) = (Xt−1, σt−1) · (1, 0)     if Jt = It = 0.

It is simple to verify that ((Xt, σt))t is a ν-random walk, for ν given by ν(s, 0) = εµ(s), ν(1, δ1) = (1/2)(1 − ε), and ν(1, 0) = (1/2)(1 − ε). Define

Vt = Ut if It = 1,    Vt = 1 if It = 0.

It is immediate that Xt = V1 · · · Vt. Since (Jt)t and (Ut, It)t are independent, we have that (Jt)t and (Xt)t are also independent. Thus, (Jt)t and (Qt)t are independent. Note that Xk+1 = Xk if and only if Ik+1 = 0.
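The three-case update rule can be sketched in code. Here is a hedged illustration with G = ℤ and µ uniform on {+1, −1} (both choices ours, purely for concreteness); by construction a lamp toggle never coincides with a walker move, which the returned history records:

```python
import random

def lamplighter_walk(T, eps=0.3, seed=1):
    """Sketch of the coupling in Exercise 6.59 with G = Z and mu uniform on
    {+1, -1}: with probability eps move the walker (I_t = 1); otherwise
    flip a fair coin J_t and, if J_t = 1, toggle the lamp at the position."""
    rng = random.Random(seed)
    x, lamps = 0, set()          # walker position, set of lamps that are on
    history = []
    for _ in range(T):
        I = rng.random() < eps   # Bernoulli-eps
        J = rng.random() < 0.5   # Bernoulli-1/2
        moved = flipped = False
        if I:
            x += rng.choice([1, -1])   # multiply by (U_t, 0)
            moved = True
        elif J:
            lamps ^= {x}               # multiply by (1, delta_1)
            flipped = True
        # J = I = 0: multiply by the identity (1, 0), nothing changes
        history.append((moved, flipped))
    return x, lamps, history
```

In each step exactly one of the three cases fires, mirroring the decomposition of ν into εµ(s), (1/2)(1 − ε) and (1/2)(1 − ε).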


Bounded Harmonic Functions

For x ∈ G define the following.

K(x, t) = {0 ≤ k ≤ t − 1 : Xk+1 = Xk = x},
Lt(x) = Σ_{k∈K(x,t)} Jk+1 = Σ_{k=0}^{t−1} 1{Xk+1 = Xk = x} · Jk+1.

So Lt(x) is the number of times the lamp at x has been flipped up to time t. Thus, σt(x) = Lt(x) (mod 2). Note that x ∈ Qt if and only if K(x, t) ≠ ∅. Also, the sets (K(x, t))_{x∈Qt} are measurable with respect to σ(X0, . . . , Xt), these sets are pairwise disjoint, and Lt(x) is determined by (Jk)_{k∈K(x,t)}. Since (Jn)n are mutually independent, we have that conditioned on (Xn)n, the random variables (Lt(x))_{x∈Qt} are mutually independent. Since (Jn)n and (Xn)n are independent, we conclude that: conditioned on (Xn)n, the conditional distribution of (Lt(x))_{x∈Qt} is that of independent Binomial random variables, each Lt(x) having Binomial(|K(x, t)|, 1/2) distribution. It is a simple exercise to show that

P[σt(x) = 1 | x ∈ Qt] = P[Bin(|K(x, t)|, 1/2) ≡ 1 (mod 2)] = 1/2;

see Exercise 5.29. Since (Lt(x))_{x∈Qt} are conditionally independent conditioned on (Xn)n, it is also true that (σt(x))_{x∈Qt} are conditionally independent. We conclude that the conditional distribution of (σt(x))_{x∈Qt}, conditioned on (Xn)n, is that of independent Bernoulli-1/2 random variables. :) X

Solution to Exercise 6.60 :( For t, n ≥ 1 define

Qt,n = {x ∈ G : ∃ t ≤ k ≤ t + n − 1, Xk+1 = Xk = x}.

Thus, Qt+n ⊂ Qt ∪ Qt,n. Also, note that P^x[y ∈ Qn] = P[x^{−1}y ∈ Qn], so that E^x|Qn| = E|Qn|. The Markov property tells us that

E|Qt,n| = Σ_x P[Xt = x] · E^x|Qn| = E|Qn|,

so that (E|Qt|)t is a subadditive sequence. Thus, the limit ℓ := lim_{t→∞} (1/t) E|Qt| exists by Fekete's lemma. Note that Qt ⊂ Rt−1, implying that E|Qt| ≤ E|Rt−1|. The strong Markov property at time Tx ∧ t implies that

P[x ∈ Qt] ≥ P[Tx < t] · µ(1) = P[x ∈ Rt−1] · µ(1),

so that E|Qt| ≥ µ(1) · E|Rt−1|. Thus, by Exercise 4.45,

µ(1) · pesc ≤ lim_{t→∞} (1/t) E|Qt| ≤ pesc.

Since pesc = 0 if and only if (G, µ) is recurrent, and since µ(1) > 0, this completes the proof.

:) X
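The solution above leans on Fekete's lemma: a subadditive sequence a(t) (meaning a(s + t) ≤ a(s) + a(t)) satisfies a(t)/t → inf_t a(t)/t. A toy numerical illustration, with a deliberately simple sequence of our choosing:

```python
import math

def is_subadditive(a, n):
    """Check a(s + t) <= a(s) + a(t) for all 1 <= s, t <= n."""
    return all(a(s + t) <= a(s) + a(t)
               for s in range(1, n + 1) for t in range(1, n + 1))

def a(t):
    # a toy subadditive sequence: ceil is subadditive on sums
    return math.ceil(1.5 * t)

# Fekete's lemma: a(t)/t converges to the infimum of a(t)/t.
inf_ratio = min(a(t) / t for t in range(1, 2001))
```

Here a(t)/t converges (and in fact equals, along even t) the infimum 1.5, exactly as Fekete's lemma predicts for the sequence (E|Qt|)t.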

Solution to Exercise 6.61 :( If (G, µ) is recurrent, then as we have seen in Exercise 6.60, ℓ = pesc = 0. So assume that (G, µ) is transient. Let g(x, y) = Σ_{t=0}^∞ P^x[Xt = y] be the Green function. Let T0 = 0 and Tn+1 = inf{t > Tn : Xt = 1}, with the convention that inf ∅ = ∞. These are the successive return times to 1 ∈ G. Let N = max{n ≥ 1 : Tn < ∞}, which is a.s. finite because of transience. We have seen in Example 3.5.1 that the random variables (Tn+1 − Tn)_{n=0}^{N−1} are independent.
Also, note that 1 ∈ Qt if and only if there exists n < N such that Tn + 1 = Tn+1 ≤ t. Thus, the event that Tn+1 > Tn + 1 for all n < N implies that 1 ∉ Qt. Note that the strong Markov property at time T1 gives that

P[∀ n < N, Tn+1 > Tn + 1] = P[1 < T1 < ∞] · P[∀ n < N, Tn+1 > Tn + 1] + P[T1 = ∞]
= (P[T1 < ∞] − µ(1)) · P[∀ n < N, Tn+1 > Tn + 1] + P[T1 = ∞],


which implies that

P[∀ n < N, Tn+1 > Tn + 1] = P[T1 = ∞] / (µ(1) + P[T1 = ∞]) = pesc / (µ(1) + pesc).

Denote r := pesc / (µ(1) + pesc). Thus,

P[1 ∉ Qt] ≥ P[∀ n < N, Tn+1 > Tn + 1] = r.

On the other hand, as t → ∞,

P[1 ∉ Qt] ≤ P[∀ n < N with Tn+1 ≤ t, Tn+1 > Tn + 1] ↘ P[∀ n < N, Tn+1 > Tn + 1].

Now, fix ε > 0 and let t0 be large enough so that for all t ≥ t0 we have r ≤ P[1 ∉ Qt] ≤ (1 + ε) · r. Now, for any x ∈ G, using the strong Markov property at time Tx ∧ n, for t ≥ n we have that

P[x ∈ Rn−1\Qt] ≥ P[Tx < n] · P^x[x ∉ Qt],
P[x ∈ Rn−1\Qt] ≤ P[Tx < n] · P^x[x ∉ Qt−n+1].

Since P^x[x ∉ Qt] = P[1 ∉ Qt], as long as t − n ≥ t0 we have that

1 ≤ P[x ∈ Rn−1\Qt] / (r · P[Tx < n]) ≤ 1 + ε.

Summing over x, we have that for any t ≥ t0 + n,

E|Rn−1| · r ≤ E|Rn−1\Qt| ≤ E|Rn−1| · r(1 + ε).

Recalling that Qt ⊂ Rt−1 and that |Rt−1\Rn−1| ≤ t − n, this implies that

(E|Rt−1| − (t − n)) · r ≤ E|Rt−1\Qt| ≤ E|Rn−1\Qt| + (t − n) ≤ E|Rt−1| · r(1 + ε) + (t − n).

Fixing t = n + t0 and taking t → ∞, we have

pesc · r ≤ pesc − ℓ ≤ pesc · r(1 + ε).

Since this holds for all ε > 0, we obtain ℓ = pesc(1 − r).

:) X

Solution to Exercise 6.62 :( Let (Ut, It)_{t≥1} be mutually independent, with each Ut having the distribution of µ and each It a Bernoulli-ε random variable. Set Bt = Σ_{k=1}^t Ik, and note that Bt is a Binomial-(t, ε) random variable. Let Xt = U1 · · · Ut and Yt = X_{Bt}. We have that

P[Yt+1 = Yt x | Yt] = P[X_{Bt+It+1} = X_{Bt} x | X_{Bt}]
= P[It+1 = 0, x = 1 | X_{Bt}] + P[It+1 = 1, U_{Bt+1} = x | X_{Bt}]
= (1 − ε)δ1(x) + εµ(x) = ν(x).

Thus, (Yt)t is a ν-random walk. Finally, note that

H(Yt) = H(X_{Bt}) ≥ H(X_{Bt} | Bt) = Σ_{k=0}^t P[Bt = k] H(Xk),

H(Yt) = H(X_{Bt}) ≤ H(X_{Bt} | Bt) + H(Bt) ≤ Σ_{k=0}^t P[Bt = k] H(Xk) + log(t + 1),

h(G, ν) = lim_{t→∞} (1/t) H(Yt) = lim_{t→∞} (1/t) Σ_{k=0}^t P[Bt = k] · k · (1/k) H(Xk).


Now, let η > 0 and choose K large enough so that |(1/k) H(Xk) − h(G, µ)| < η for all k ≥ K. Then,

h(G, ν) ≥ (h(G, µ) − η) · lim_{t→∞} (1/t) Σ_{k=K}^t P[Bt = k] · k
≥ (h(G, µ) − η) · ε − lim_{t→∞} (h(G, µ) − η) · K(K + 1)/t = (h(G, µ) − η) · ε,

h(G, ν) ≤ (h(G, µ) + η) · ε + (h(G, µ) + η) · lim_{t→∞} (1/t) Σ_{k=0}^K P[Bt = k] · k
≤ (h(G, µ) + η) · ε + lim_{t→∞} (h(G, µ) + η) · K(K + 1)/t = (h(G, µ) + η) · ε.

Taking η → 0 completes the proof.

:) X
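The entropy sandwich H(X_{Bt} | Bt) ≤ H(X_{Bt}) ≤ H(X_{Bt} | Bt) + H(Bt) used in Solution 6.62 holds for any joint distribution, and can be verified by brute force on a tiny example (all parameters here are illustrative: two lazy steps with ε = 0.3 and fair ±1 increments):

```python
from itertools import product
from math import log2
from collections import defaultdict

def H(dist):
    """Shannon entropy (bits) of a {outcome: probability} dict."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)

# Joint law of (B, X_B): B ~ Bin(2, eps) counts the non-lazy steps,
# and X_k is a sum of k fair +-1 increments.
eps = 0.3
joint = defaultdict(float)
for i1, i2 in product([0, 1], repeat=2):        # lazy indicators I_1, I_2
    b = i1 + i2
    p_b = (eps if i1 else 1 - eps) * (eps if i2 else 1 - eps)
    for steps in product([-1, 1], repeat=b):    # the b real increments
        joint[(b, sum(steps))] += p_b * 0.5 ** b

pB = defaultdict(float)
pX = defaultdict(float)
for (b, x), p in joint.items():
    pB[b] += p
    pX[x] += p

H_X = H(pX)                  # H(X_B)
H_X_given_B = H(dict(joint)) - H(pB)   # chain rule: H(X_B | B) = H(B, X_B) - H(B)
```

Conditioning can only decrease entropy, and H(X_B) ≤ H(B, X_B) = H(X_B | B) + H(B); both inequalities are exactly the ones driving the computation of h(G, ν).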

Solution to Exercise 6.63 :( Let (Xt, σt)t be the ν-random walk on L(G). Let Qt = {x ∈ G : ∃ 0 ≤ k ≤ t − 1, Xk+1 = Xk = x}. We have

H(σt | Xt) + H(Xt) = H(Xt, σt) ≤ H(σt) + H(Xt).

Also, since |Qt| independent Bernoulli-1/2 random variables have entropy |Qt| · log 2, by Exercise 6.59,

H(σt | X0, . . . , Xt) = log 2 · E|Qt|,

and similarly,

H(σt | Xt) ≥ H(σt | X0, . . . , Xt) = log 2 · E|Qt|.

Note that (Xt)t is a µ̃-random walk with µ̃ = (1 − ε)δ1 + εµ, and specifically µ̃(1) = 1 − ε. So, by Exercises 6.61 and 6.62,

h(L(G), ν) = log 2 · lim_{t→∞} (1/t) E|Qt| + lim_{t→∞} (1/t) H(Xt) = log 2 · (1 − ε)pesc/(1 − ε + pesc) + ε · h(G, µ).

Here pesc = P[∀ t ≥ 1, Xt ≠ 1].

:) X

Solution to Exercise 6.64 :( As in Exercise 4.27, if we consider the random walk measure µ(1) = (1/2)p, µ(−1) = (1/2)(1 − p), and µ(0) = 1/2 for some 1/2 < p < 1, then the deterministic path γ = (0, 1, 2, . . .) is a SIT in a network inducing the µ-random walk. Thus (Z, µ) is transient. Define ν on L(Z) by

ν(1, 0) = (1/2)p,    ν(−1, 0) = (1/2)(1 − p),    ν(0, δ1) = 1/2.

So

r := sup{|y| : σ(y) = 1, (x, σ) ∈ supp(ν)} = 0,

and

µ(x) = Σ_σ ν(x, σ).

Thus, by Theorem 6.9.1, we know that (L(Z), ν) is non-Liouville.

:) X

Solution to Exercise 6.65 :( Exercise 4.44 provides us with µ ∈ SA(Z², 2 − ε) such that (Z², µ) is transient. Define ν on L(Z²) by ν(z, 0) = (1/2)µ(z) and ν(0, δ1) = 1/2. Note that for µ̃(z) = Σ_σ ν(z, σ), we have that µ̃(z) = (1/2)µ(z) + (1/2)·1{z=0}. So the µ̃-random walk is just a lazy version of the µ-random walk. Exercise 3.13 tells us that (Z², µ̃) is transient if and only if (Z², µ) is transient. Since max{|y| : σ(y) = 1, (x, σ) ∈ supp(ν)} = 0, by Theorem 6.9.1 we get that (L(Z²), ν) is non-Liouville. :) X


Solution to Exercise 6.66 :( This is just the martingale convergence theorem (Theorem 2.6.3 and Exercise 2.32).

:) X

Solution to Exercise 6.67 :( This follows since pointwise sums and products of converging sequences are convergent.

:) X

Solution to Exercise 6.68 :( If f ∈ I(G, µ) and h ∈ A(G, µ) , then h(Xt ) f (Xt ) → 0 a.s., because f (Xt ) → 0 a.s.

:) X

Solution to Exercise 6.69 :( We have that L_f is I-measurable just as in Exercise 6.14, and h_f ∈ BHF(G, µ) just as in Exercise 6.17. By Exercise 6.17, h_f(Xt) → L_f a.s. Thus, (f − h_f)(Xt) → L_f − L_f = 0 a.s., implying that f − h_f ∈ I(G, µ). If h ∈ BHF(G, µ) ∩ I(G, µ), then by Exercise 6.17 we see that h ≡ 0. So A(G, µ) = BHF(G, µ) ⊕ I(G, µ). :) X


7 Choquet–Deny Groups


7.1 The Choquet–Deny Theorem

In this chapter we depart from the usual framework of this book and work with general probability measures on a finitely generated group G; that is, the measures considered in this section are not necessarily assumed to be symmetric or to have moment conditions. We have already seen in Theorem 6.8.1 that if G is a finitely generated non-amenable group, then every symmetric adapted random walk on G with finite entropy is non-Liouville. (In fact, on a non-amenable group any adapted random walk is non-Liouville; see Kaimanovich and Vershik, 1983; Rosenblatt, 1981.) The main objective of this section is to understand the extreme opposite case: what are the finitely generated groups G for which (G, µ) is Liouville for all adapted probability measures µ? For this section it is wise to be familiar with the basic notions of virtually nilpotent groups; see Section 1.5.4.
In 1960, Choquet and Deny proved that Abelian groups never admit nonconstant bounded harmonic functions. We will basically present their result, using a probabilistic proof.

Theorem 7.1.1 Let G be a group and let µ be an adapted measure on G. Let K be the kernel of the canonical left action of G on BHF(G, µ). Let Z = Z(G) be the center of G (i.e. Z = {x ∈ G : ∀ y ∈ G, [x, y] = 1}). Then, Z ◁ K. That is, the center of G acts trivially on BHF(G, µ).

Proof Let x ∈ Z. Let (Xt)t be the µ-random walk on G. Since µ is adapted, there exist k = k(x) > 0 and α > 0 such that for any t we have, for all y ∈ G,

P_y[Xt+k = Xt x^{−1} | Xt] = P[Xk = x^{−1}] =: α > 0.

Specifically, this implies that for any f ∈ BHF(G, µ),

E_y[|f(Xt+k) − f(Xt)|] ≥ α · E_y[|f(Xt x^{−1}) − f(Xt)|] = α · E_y[|x.f(Xt) − f(Xt)|] ≥ α · |x.f(y) − f(y)|.

We have used that f(Xt x^{−1}) = f(x^{−1} Xt) = x.f(Xt), since x ∈ Z. On the other hand, since f ∈ BHF(G, µ), the sequence (f(Xt))t is a bounded martingale, and thus converges a.s. and in L¹ by the martingale convergence theorem (Theorem 2.6.3 and Exercise 2.33). Thus, E_y[|f(Xt+k) − f(Xt)|] → 0. We obtain that for any y ∈ G, any f ∈ BHF(G, µ), and any x ∈ Z, we have x.f(y) = f(y). This implies that Z acts trivially on BHF(G, µ).

Corollary 7.1.2 (Choquet–Deny theorem) If G is an Abelian group then (G, µ) is Liouville for any adapted µ.


Proof When G is Abelian, Z(G) = G, so by Theorem 7.1.1 all of G acts trivially on BHF(G, µ); that is, every bounded harmonic function is constant.

Exercise 7.1 Let G be a finitely generated group and µ an adapted probability measure on G. Show that if K ◁ G is a normal subgroup acting trivially on BHF(G, µ), then BHF(G, µ) ≅ BHF(G/K, µ̄), where µ̄ is the (projected) measure on G/K given by µ̄(Kx) = Σ_{y∈K} µ(yx). B solution C
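The projected measure µ̄(Kx) = Σ_{y∈K} µ(yx) is easy to compute in a concrete case. A minimal sketch, with the illustrative choice G = ℤ and K = nℤ (so the quotient is ℤ/nℤ and cosets are residues mod n):

```python
from collections import defaultdict

def project_measure(mu, n):
    """Push a probability measure on Z forward to Z/nZ:
    mubar(x + nZ) = sum of mu over the coset x + nZ."""
    mubar = defaultdict(float)
    for x, p in mu.items():
        mubar[x % n] += p
    return dict(mubar)

mu = {-1: 0.25, 1: 0.25, 4: 0.5}   # an illustrative adapted measure on Z
mubar = project_measure(mu, 3)     # -1 and 4 land in cosets 2 and 1 mod 3
```

Total mass is preserved, so µ̄ is again a probability measure, and adaptedness passes to the quotient since the projection is a surjective homomorphism.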

In light of the Choquet–Deny theorem for Abelian groups, we have the following definition.

Definition 7.1.3 We say that a group G is Choquet–Deny if for any adapted measure µ on G we have that (G, µ) is Liouville.

Corollary 7.1.4 Let G be a virtually nilpotent group. Then, G is Choquet–Deny; that is, (G, µ) is Liouville for any adapted µ.

Proof By Theorem 3.9.7, it suffices to prove that any nilpotent group is Choquet–Deny, because passing to a finite index subgroup results in an isomorphic space of bounded harmonic functions. Also, the Abelian (or 1-step nilpotent) case is Corollary 7.1.2. Now, if G is an n-step nilpotent group, then Z(G) acts trivially on BHF(G, µ), by Theorem 7.1.1. So Exercise 7.1 tells us that BHF(G, µ) ≅ BHF(G/Z(G), µ̄). Also, G/Z(G) is (n−1)-step nilpotent. Thus, inductively, we have that BHF(G, µ) is just the space of constant functions.

Naturally the question arises whether virtually nilpotent groups are the only examples of Choquet–Deny groups. Although the Choquet–Deny theorem has basically been known since 1960, the converse was only shown in 2018.

Theorem 7.1.5 (Frisch, Hartman, Tamuz, Vahidi-Ferdowsi) Let G be a finitely generated group. The following are equivalent:

(1) G is virtually nilpotent,
(2) G is Choquet–Deny,
(3) (G, µ) is Liouville for every symmetric, adapted measure µ on G with finite entropy H(µ) < ∞.

We have seen that every virtually nilpotent group is Choquet–Deny, so to prove Theorem 7.1.5, it suffices to show the following.


Theorem 7.1.6 Let G be a finitely generated group that is not virtually nilpotent. Then there exists a symmetric, adapted measure µ on G, with finite entropy H(µ) < ∞, such that h(G, µ) > 0.

The proof of Theorem 7.1.6 is carried out in a few steps over the remainder of this chapter.

7.2 Centralizers Let G be a group, and let N C G. Define CGN (x) = {y ∈ G : [x, y] ∈ N }. So CG (x) := CG{1} (x) is the centralizer of x; that is, all elements of G that commute with x. Exercise 7.2

Show that CGN (x) is a subgroup.

B solution C

Exercise 7.3

Show that N C CGN (x).

B solution C

Let G be a group, and let N C M C G be normal subgroups of G. Show that CGN (x) ≤ CGM (x). M/N Show that CGM (x)/N = CG/N (N x). B solution C

Exercise 7.4

Exercise 7.5

y Show that CGN (x y ) = CGN (x) .

B solution C
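For a concrete feel for C_G^N(x), here is a brute-force check in the small group S₃ (represented as permutation tuples; the representation and the commutator convention [x, y] = x⁻¹y⁻¹xy are our illustrative choices). With N = {1}, the set C_G^N(x) is just the centralizer, and the subgroup property of Exercise 7.2 can be verified directly:

```python
from itertools import permutations

def compose(p, q):
    """(p o q)(i) = p[q[i]] for permutations stored as tuples."""
    return tuple(p[i] for i in q)

def inverse(p):
    inv = [0] * len(p)
    for i, pi in enumerate(p):
        inv[pi] = i
    return tuple(inv)

def commutator(x, y):
    # convention [x, y] = x^{-1} y^{-1} x y (illustrative)
    return compose(compose(inverse(x), inverse(y)), compose(x, y))

G = list(permutations(range(3)))   # S_3
e = (0, 1, 2)
x = (1, 0, 2)                      # a transposition

# C_G^N(x) with N = {e}: all y with [x, y] in N, i.e. the centralizer of x
C = [y for y in G if commutator(x, y) == e]

# subgroup check: contains e, closed under composition and inverses
closed = all(compose(a, b) in C and inverse(a) in C for a in C for b in C)
```

The centralizer of a transposition in S₃ has order 2, consistent with its conjugacy class having 3 = 6/2 elements.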

Definition 7.2.1 Let G be a group. Let N(G) denote the collection of all normal subgroups of G. Define τ = τ_G : N(G) → N(G) by

τ(N) = τ_G(N) = {x ∈ G : [G : C_G^N(x)] < ∞}.

Exercise 7.6 Show that τ is well defined. That is, show that τ(N) is always a normal subgroup. B solution C

Exercise 7.7 Show that N ◁ τ(N). B solution C

Exercise 7.8 Show that if N ◁ G is of finite index [G : N] < ∞, then τ(N) = G. B solution C

Exercise 7.9 Show that if N ◁ M ◁ G are normal subgroups of G then τ(N) ◁ τ(M). B solution C

Definition 7.2.2 Let G be a group. Define τ^0 = τ_G^0 = {1}, and inductively τ^{n+1} = τ_G^{n+1} = τ_G(τ^n). Define

τ^∞ = τ_G^∞ = ⋃_n τ^n.

Exercise 7.10 Let (Nn)n be a sequence of normal subgroups of G such that Nn ⊂ Nn+1 for all n. Show that N∞ = ⋃_n Nn is also a normal subgroup of G. Conclude that τ_G^∞ is a normal subgroup of G. B solution C

Exercise 7.11 Let G be a group, and let N ◁ M ◁ G be normal subgroups of G. Show that τ_G(M)/N = τ_{G/N}(M/N). Conclude that for n > k, τ_G^n/τ_G^k = τ_{G/τ_G^k}^{n−k}. B solution C

Exercise 7.12 Show that Z_n(G) ◁ τ_G^n. B solution C

Exercise 7.13 Show that if H ≤ G has finite index [G : H] < ∞, then τ_H^n = τ_G^n ∩ H. B solution C

The connection of τ^∞ to nilpotence is given by the following.

Theorem 7.2.3 Let G be a finitely generated group. Then G is virtually nilpotent if and only if τ^∞ = G.

Proof Assume that τ^∞ = G. If S is a finite generating set for G, then there exists n such that S ⊂ τ^n. So τ^n = G for some n. We will show by induction on n that G admits a finite index subgroup H ≤ G, [G : H] < ∞, such that γ_n(H) = {1}. If n = 1, then we claim that [G : Z] < ∞, for Z = Z_1(G). (This suffices since Z is Abelian, so γ_1(Z) = {1}.) Indeed, if s_1, . . . , s_d generate G, then Z = ⋂_{j=1}^d C_G(s_j). Since [G : C_G(s_j)] < ∞ for all j (because τ^1 = G), we get


that [G : Z] < ∞ (by Proposition 1.3.3), so G is virtually Abelian. This proves the base step of the induction.
Now assume that n > 1 and τ^n = G. Let N = τ^1. Since G/N = τ^n/τ^1 = τ_{G/N}^{n−1}, by induction we have that there exists a finite index subgroup N ≤ H ≤ G, [G : H] < ∞, such that γ_{n−1}(H/N) = {1}. Thus, γ_{n−1}(H) ≤ N = τ_G({1}). This implies that for any x ∈ γ_{n−1}(H) we have that [G : C_G(x)] < ∞. Since C_H(x) = C_G(x) ∩ H and since [G : H] < ∞, we conclude (by Proposition 1.3.3) that [H : C_H(x)] ≤ [G : C_G(x) ∩ H] < ∞, for any x ∈ γ_{n−1}(H).
Since [G : H] < ∞ and G is finitely generated, H is also finitely generated (Exercise 1.61). Now, if H = ⟨S⟩ for some finite set S, then by Exercise 1.34,

γ_{n−1}(H) = ⟨[s_1, . . . , s_{n−1}]^x : s_j ∈ S, x ∈ H⟩.

For any s_1, . . . , s_{n−1} ∈ S, we have that [H : C_H(y)] < ∞ for y = [s_1, . . . , s_{n−1}]. This implies that the set y^H = {y^x : x ∈ H} is finite (by the orbit-stabilizer theorem, Exercise 1.17). Thus, the generating set {[s_1, . . . , s_{n−1}]^x : s_j ∈ S, x ∈ H} is a finite set, implying that γ_{n−1}(H) is finitely generated.
So let U = {u_1, . . . , u_d} be a finite generating set for γ_{n−1}(H). Let K = ⋂_{j=1}^d C_H(u_j). Since [H : C_H(u_j)] < ∞ for all j, we have that [H : K] < ∞ (Proposition 1.3.3), and so [G : K] < ∞. Also, γ_{n−1}(K) ≤ γ_{n−1}(H). Since K was chosen such that its elements commute with those of γ_{n−1}(H), we get that γ_n(K) = [K, γ_{n−1}(K)] = {1}, completing the induction step. Thus, we have proved that if τ^n = G then G is virtually nilpotent, which is one direction of the theorem.
For the other direction, assume that G is virtually nilpotent. So there exists H ≤ G, [G : H] < ∞, such that Z_n(H) = H for some n. By Exercise 7.12 we find that τ_H^n = H. By Exercise 7.13, we get that τ_G^n ⊃ τ_H^n = H, so [G : τ_G^n] < ∞, which implies that τ_G^{n+1} = τ_G(τ_G^n) = G by Exercise 7.8, completing the proof.

7.3 ICC Groups

Definition 7.3.1 Let G be a group. Consider T = {N ◁ G : τ(N) = N}, the set of all fixed points of τ = τ_G. Define

τ^∗ = τ_G^∗ = ⋂_{N∈T} N.


Since τ(G) = G, this is well defined. Note that if τ(N) = N for some N ◁ G, then τ^∗ ◁ N.

Exercise 7.14 Show that for any n ≥ 1 we have that τ^n ◁ τ^∗. Conclude that τ^∞ ◁ τ^∗. B solution C

Exercise 7.15 Show that τ(τ^∗) = τ^∗. B solution C

That is, τ^∗ is the smallest fixed point of τ, and it contains all the finite applications of τ to the trivial subgroup.

Definition 7.3.2 A group G has the infinite conjugacy class (ICC) property if [G : C_G(x)] = ∞ for all 1 ≠ x ∈ G.

This name comes from the fact that [G : C_G(x)] = |x^G| (where x^G = {x^y : y ∈ G}). This fact is just the orbit-stabilizer theorem (Exercise 1.17), since C_G(x) is the stabilizer of x for the G-action by conjugation. It is immediate that an infinite group G is ICC if and only if τ_G({1}) = {1}.

Exercise 7.16 Show that G/τ^∗ is either trivial or ICC. B solution C

Exercise 7.17 Let G be a finitely generated group. Show that if τ^∞ = G then there exists n such that τ^∗ = τ^n = G. B solution C
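The orbit-stabilizer identity |x^G| = [G : C_G(x)] behind the ICC definition can be verified exhaustively in a small finite group; a sketch in S₄ (permutations as tuples, an illustrative choice):

```python
from itertools import permutations

def compose(p, q):
    """(p o q)(i) = p[q[i]] for permutations stored as tuples."""
    return tuple(p[i] for i in q)

def inverse(p):
    inv = [0] * len(p)
    for i, pi in enumerate(p):
        inv[pi] = i
    return tuple(inv)

G = list(permutations(range(4)))   # S_4, order 24

def conj_class(x):
    """x^G = {g^{-1} x g : g in G}."""
    return {compose(compose(inverse(g), x), g) for g in G}

def centralizer(x):
    return [g for g in G if compose(x, g) == compose(g, x)]

# orbit-stabilizer: |x^G| * |C_G(x)| = |G| for every x
ok = all(len(conj_class(x)) * len(centralizer(x)) == len(G) for x in G)
```

Of course no finite group is ICC; the identity only turns infinite conjugacy classes into infinite-index centralizers in the infinite setting.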

The relationship of τ^∗ to bounded harmonic functions is given by the following theorem.

Theorem 7.3.3 Let G be a finitely generated group and µ an adapted measure on G. Let K = {x ∈ G : ∀ h ∈ BHF(G, µ), x.h = h} be the kernel of the G-action on BHF(G, µ). Then τ(K) = K. Consequently, τ^∗ ◁ K.

Proof Since K ◁ τ(K), we only need to show that τ(K) ◁ K. To this end, assume that x ∈ τ(K). Then, [G : C_G^K(x)] < ∞. Set C = C_G^K(x).
Let (Xt)t be a µ-random walk, let y ∈ G, and consider the Markov chain (CyXt)t. This is a Markov chain on a finite network, so it must be recurrent. Let ε > 0 and k be such that P[Xk = x] = ε > 0, which is possible as µ is adapted. Then, for any t we have that P[Xt+k = Xt x | Xt] = ε. Define inductively T0 = 0 and

Tn+1 = inf{t > Tn + k : yXt ∈ C}.


Recurrence of the process (CyXt)t ensures that Tn < ∞ for all n, a.s. Also, the strong Markov property tells us that

P[yX_{Tn+k} = yX_{Tn} x | F_{Tn}] = ε

for all n. Since yX_{Tn} ∈ C = C_G^K(x) by definition, we have that yX_{Tn} x ≡ x yX_{Tn} (mod K). Because K acts trivially on BHF(G, µ), this implies that for any h ∈ BHF(G, µ) we have x^{−1}.h(yX_{Tn}) = h(x yX_{Tn}) = h(yX_{Tn} x), so that

P[x^{−1}.h(yX_{Tn}) ≠ h(yX_{Tn+k}) | F_{Tn}] ≤ 1 − ε.

Thus, x^{−1}.h(yX_{Tn}) = h(yX_{Tn+k}) for infinitely many n, a.s.
Now, for h ∈ BHF(G, µ), the process Mt := x^{−1}.h(yXt) − h(yXt+k) is a bounded martingale. So it converges a.s. and in L¹ by the martingale convergence theorem (Theorem 2.6.3 and Exercise 2.33). Since

lim inf_{t→∞} |Mt| ≤ lim inf_{n→∞} |x^{−1}.h(yX_{Tn}) − h(yX_{Tn+k})| = 0 a.s.,

it must be that E|Mt| → 0. Thus,

|x^{−1}.h(y) − h(y)| = |E[x^{−1}.h(yXt) − h(yXt+k)]| ≤ E|Mt| → 0,

implying that x^{−1}.h(y) = h(y). This holds for any y ∈ G, any h ∈ BHF(G, µ), and all x ∈ τ(K). We conclude that τ(K) acts trivially on BHF(G, µ), so τ(K) ◁ K.

The key property of ICC groups is the following.

Definition 7.3.4 Let G be a finitely generated group, and fix some Cayley graph of G. Let F ⊂ G be a finite subset. An element x ∈ G is said to shatter F if for all η ∈ {−1, 1} we have x^η F x ∩ F ⊂ {1}.

Note that if x shatters F, then F ∩ (xFx ∪ x^{−1}Fx ∪ xFx^{−1} ∪ x^{−1}Fx^{−1}) ⊂ {1}.

Lemma 7.3.5 Let G be a finitely generated ICC group. For any finite subset F ⊂ G and any r > 0 there exists x ∈ G with |x| > r such that x shatters F.

Proof Fix some finite, symmetric generating set S for G, and let (Xt)t be a simple random walk on G; that is, the jumps X_{t−1}^{−1} X_t are i.i.d. uniform on S.


Fix a finite subset F ⊂ G. We denote by F^∗ the set of all x ∈ G such that x shatters F. Define

F1 = {x ∈ G : x^{−1}Fx ∩ F ⊄ {1}},    F2 = {x ∈ G : xFx ∩ F ⊄ {1}}.

So G\F^∗ ⊂ F1 ∪ F2. Let I = {x ∈ G : x² = 1}. Note that I ∩ (G\F1) ⊂ F^∗, so I ⊂ F^∗ ∪ F1.
Assume that H ≤ G is an infinite index subgroup, [G : H] = ∞. We have seen in Corollary 5.5.2, as a consequence of the Nash inequality, that there exists a universal constant C > 0 such that for all t ≥ 0 we have

P[Xt ∈ xHy] = P[x^{−1}Xt ∈ Hy] ≤ P[Hx^{−1}Xt = Hy] ≤ C(t + 1)^{−1/2},

where we have crucially used here that [G : H] = ∞. This fact will be used repeatedly in what follows, with different subgroups.
Recall that y^G = {y^x : x ∈ G}. For 1 ≠ y ∈ G and z ∈ y^G fix some x_{y,z} ∈ G such that y^{x_{y,z}} = z. Note that if y^x = z then x ∈ C_G(y) x_{y,z}. This implies that

F1 ⊂ ⋃_{y,z∈F\{1}} C_G(y) x_{y,z}.    (7.1)

Now, let K = {(y, z) ∈ F² : ∃ x ∈ G, xyx = z}. For any (y, z) ∈ K fix some w_{y,z} such that w_{y,z} y w_{y,z} = z. Note that for any x ∈ G and w = w_{y,z}, if (wx)y(wx) = z as well, then wxywx = wyw, so ywx = x^{−1}yw, implying that x^{yw} = x^{−1}. Thus, x^{(yw)²} = x, implying that x ∈ C_G((yw)²). In conclusion, for any (y, z) ∈ K we have that

w_{y,z}^{−1} {v : vyv = z} ⊂ C_G((y w_{y,z})²).

If (y w_{y,z})² = 1, then z = w_{y,z} y w_{y,z} = y^{−1}, so (yx)² = 1 for any x ∈ G such that xyx = z. That is, (y w_{y,z})² = 1 implies that {x : xyx = z} = {x : (yx)² = 1} = y^{−1}I.
Set K′ = {(y, z) ∈ K : (y w_{y,z})² ≠ 1}. We then have that

F2 ⊂ ⋃_{(y,z)∈K′} w_{y,z} C_G((y w_{y,z})²) ∪ ⋃_{y∈F} y^{−1}I.    (7.2)

Denote by Br the ball of radius r around 1, and Fr^∗ = F^∗\Br. Then, combining (7.1) and (7.2), and using that [G : C_G(y)] = ∞ for all y ≠ 1, and that [G : C_G((y w_{y,z})²)] = ∞ for all (y, z) ∈ K′, we arrive at

P[Xt ∉ Fr^∗] ≤ P[Xt ∈ F1] + P[Xt ∈ F2\Br] + P[|Xt| ≤ r]
≤ P[Xt ∈ F1] + Σ_{(y,z)∈K′} P[Xt ∈ w_{y,z} C_G((y w_{y,z})²)] + Σ_{v∈F} P[vXt ∈ I\Br] + P[|Xt| ≤ r]
≤ (|F| + 1) · max_{v∈F∪{1}} Σ_{y,z∈F\{1}} P[Xt ∈ v^{−1} C_G(y) x_{y,z}] + Σ_{(y,z)∈K′} P[Xt ∈ w_{y,z} C_G((y w_{y,z})²)] + |F| · max_{v∈F} P[vXt ∈ F^∗\Br] + Σ_{|v|≤r} P[Xt = v]
≤ C · ((|F| + 1)|F|² + |K′| + |Br|) / √(t + 1) + |F| · max_{v∈F} P[vXt ∈ Fr^∗].

For

p = max_{y∈F∪{1}} P[yXt ∈ Fr^∗],

this implies that

1 − p ≤ P[Xt ∉ Fr^∗] ≤ C · ((|F| + 1)|F|² + |K′| + |Br|) / √(t + 1) + |F| · p,

or

p ≥ (1 − C · ((|F| + 1)|F|² + |K′| + |Br|) / √(t + 1)) · 1/(|F| + 1).

So when t is large enough (only as a function of |F| and |Br|), we have that p > 0, implying that Fr^∗ = F^∗\Br cannot be empty.

7.4 JNVN Groups

Definition 7.4.1 A group G is called just not virtually nilpotent, or JNVN, if G is not virtually nilpotent and for every nontrivial normal subgroup {1} ≠ N ◁ G, the quotient G/N is virtually nilpotent.

For example, since nilpotent groups are never simple, an infinite simple group is always JNVN, because every nontrivial quotient of a simple group is the trivial group. The following lemma tells us that every finitely generated group that is not virtually nilpotent has a JNVN quotient.


Lemma 7.4.2 Let G be an infinite finitely generated group. If G is not virtually nilpotent, then there exists N ◁ G such that G/N is JNVN.

Proof We will use two properties of finitely generated virtually nilpotent groups.
Fact I. Quotients of virtually nilpotent groups are virtually nilpotent. See Exercise 1.43.
Fact II. Every finitely generated virtually nilpotent group is finitely presented. See Exercise 1.67.
Now to continue the proof of Lemma 7.4.2. Consider the collection

N = {N ◁ G : G/N is not virtually nilpotent}.

If (Nk)k is an increasing chain in N, then N∞ := ⋃_k Nk is a normal subgroup (Exercise 7.10). We will show that N∞ is in N. Indeed, if N∞ ∉ N, then G/N∞ is virtually nilpotent. So G/N∞ is finitely presented (Fact II). If G is generated by the finite set S, then there exists a free group F = F_S, and finitely many words r_1, . . . , r_m ∈ F, such that if we let R ◁ F be the smallest normal subgroup containing r_1, . . . , r_m, then under the canonical projection ϕ : F → G, we have that ϕ(R) = N∞. Since N∞ = ⋃_k Nk, there must exist some k for which ϕ(r_j) ∈ Nk for all 1 ≤ j ≤ m. But then ϕ^{−1}(Nk) ◁ F is a normal subgroup containing r_1, . . . , r_m, implying that R ◁ ϕ^{−1}(Nk), so that N∞ = Nk, a contradiction!
We conclude that any nondecreasing chain in N has an upper bound in N, so by Zorn's lemma N contains a maximal element. That is, there exists a maximal element N ∈ N. So G/N is not virtually nilpotent.
Now, assume that G/N has a non-virtually nilpotent quotient. That implies the existence of some N ◁ M ◁ G such that G/M is not virtually nilpotent. By the maximality of N, we have that N = M. So the only non-virtually nilpotent quotient of G/N is by the trivial group. That is, G/N is JNVN.

Theorem 7.4.3 If G is a finitely generated JNVN group, then G is ICC.

Proof Let N = τ_G({1}) ◁ G. If N ≠ {1} then, since G is JNVN, we have that G/N is virtually nilpotent. So by Theorem 7.2.3, we know that for some n,

τ_{G/N}^n = τ_G^{n+1}/N = G/N.

But this implies that τ_G^{n+1} = G, so G is virtually nilpotent by Theorem 7.2.3 again, a contradiction! Hence we deduce that τ_G({1}) = {1}, which is to say that G is ICC.


Exercise 7.18 Let P be a group property. Assume that any group that is P is also finitely presented. Assume that any quotient of a group that is P is also P. (That is, P is closed under quotients.) Show that for any finitely generated group G that is not P, there exists a normal subgroup N ◁ G such that G/N is not P, but all nontrivial quotients of G/N are P. (That is, G/N is just not P.) B solution C

Exercise 7.19 Show that the group Z is just-not-finite (i.e. just-infinite); that is, Z is an infinite group such that any nontrivial quotient of Z is finite. B solution C

Exercise 7.20 Show that if G is a finitely generated infinite group, then there exists N ◁ G such that [G : N] = ∞, but for any N ◁ M ◁ G such that N ≠ M, we have that [G : M] < ∞. B solution C

7.5 Choquet–Deny Groups Are Virtually Nilpotent

Exercise 7.21 Let G be a finitely generated group and let N ◁ G. Show that dim BHF(G, µ) ≥ dim BHF(G/N, µ̄), where µ̄ is the projection of µ onto the group G/N. Conclude that if (G/N, µ̄) is not Liouville, then neither is (G, µ). B solution C

Exercise 7.22 Let G be a group and N ◁ G. Let ν be an adapted probability measure on G/N. Show that there exists an adapted probability measure µ on G such that the projection µ̄ on G/N satisfies µ̄ = ν. Show that if ν is symmetric, then we can choose µ to be symmetric. Show that if H(ν) < ∞ then µ can be chosen so that H(µ) < ∞. B solution C

In order to prove Theorem 7.1.6, we will need to construct some probability measure on a finitely generated ICC group G. The construction will utilize a suitable sequence of group elements (gn)n, choosing gn with probability roughly n^{−α} for some α ∈ (1, 2). Thus, there is some underlying probability measure on ℕ governing which element in the sequence is sampled.
We begin with the construction of the sequence (gn)n. Fix some parameter α ∈ (1, 2). Start the construction by fixing a finite, symmetric generating set S for G, and consider the Cayley graph with respect to S. We assume 1 ∉ S. Let Br denote the ball of radius r around 1 ∈ G.
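The reason weights of order n^{−α} with α ∈ (1, 2) are useful is that the normalized measure on ℕ has finite entropy but infinite mean. A quick numerical sketch of this dichotomy (the truncation levels are illustrative):

```python
from math import log

def zeta_measure(alpha, N):
    """Truncated measure p(n) proportional to n^{-alpha} on {1, ..., N},
    illustrating the weights used to sample the index n of g_n."""
    w = [n ** -alpha for n in range(1, N + 1)]
    Z = sum(w)
    return [x / Z for x in w]

def entropy(p):
    return sum(-q * log(q) for q in p if q > 0)

def mean(p):
    return sum((n + 1) * q for n, q in enumerate(p))  # index 0 <-> n = 1

alpha = 1.5
# For alpha in (1, 2): entropy stays bounded as N grows, the mean blows up.
```

Since Σ n^{−α} converges for α > 1, the entropy Σ p(n)·(α log n + log Z) is finite, while Σ n·p(n) = Σ n^{1−α}/Z diverges for α < 2; this is exactly the regime exploited in the remainder of the chapter.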


Set G0 = {1} and G1 = S. Let r0 = 0. For n > 1, given G1, . . . , Gn−1, we let

r_{n−1} = 4n^{2(α−1)} · max{|x| : x ∈ ⋃_{k=1}^{n−1} Gk},

and choose gn such that |gn| > r_{n−1} and gn shatters B_{r_{n−1}}. We have required that G is ICC to be able to find such a gn, using Lemma 7.3.5. Set Gn = {gn, gn^{−1}}. Let G_∗ = ⋃_{n=0}^∞ Gn. (The exact constant 4 above is not important, but it needs to be large enough for what follows.)
For n ≥ 0, we write ||x|| = n for any x ∈ Gn. Note that if x = u1 · · · ut where uk ∈ G_∗ for all k, then writing m = max_{k≤t} ||uk||, we have that |x| ≤ t · max_{k≤m} |gk|.
For u1, . . . , ut ∈ G_∗, let M(u1, . . . , ut) = max{||uj|| : 1 ≤ j ≤ t}. Denote

J(u1, . . . , ut) = Σ_{j=1}^t 1{||uj|| = M(u1, . . . , ut)},

and note that J(u1, . . . , ut) ≥ 1. Define

Ut = {(u1, . . . , ut) ∈ G_∗^t : J(u1, . . . , ut) = 1, M(u1, . . . , ut)^{2(α−1)} > t}.

Lemma 7.5.1 Let t, s ≥ 2. Let (u1, . . . , ut) ∈ Ut and (v1, . . . , vs) ∈ Us. Let a ∈ S and assume that au1 · · · ut = v1 · · · vs. Then, there exist k < t and ℓ < s such that au1 · · · uk = v1 · · · vℓ.

Proof  We can write u_1 ⋯ u_t = xgz where:
• x = u_1 ⋯ u_k for some 1 ≤ k < t,
• g = u_{k+1} admits ||g|| = M(u_1, ..., u_t) > 0,
• z = u_{k+2} ⋯ u_t (and z = 1 if k + 1 = t),
• for all j ≠ k + 1 we have ||u_j|| < ||g||.

Similarly, we write v_1 ⋯ v_s = x′g′z′ where:
• x′ = v_1 ⋯ v_ℓ for some 1 ≤ ℓ < s,
• g′ = v_{ℓ+1} admits ||g′|| = M(v_1, ..., v_s) > 0,
• z′ = v_{ℓ+2} ⋯ v_s (and z′ = 1 if ℓ + 1 = s),
• for all j ≠ ℓ + 1 we have ||v_j|| < ||g′||.

If ||g|| > ||g′||, then axgz = x′g′z′ implies that for n = ||g||,

r_{n−1} = 4n^{2(α−1)} · max_{k≤n−1} |g_k| < |g| ≤ |a| + |x| + |z| + |x′| + |z′| + |g′|
  ≤ 1 + (t − 1 + s) · max_{k≤n−1} |g_k| ≤ (t + s) · max_{k≤n−1} |g_k|,

7.5 Choquet–Deny Groups Are Virtually Nilpotent


which implies that

t + s < M(u_1, ..., u_t)^{2(α−1)} + M(v_1, ..., v_s)^{2(α−1)} ≤ 2n^{2(α−1)} < t + s,

a contradiction! Similarly, by interchanging the roles of g, g′, we also get a contradiction if we assume that ||g|| < ||g′||. So it must be that ||g|| = ||g′||, and that for n = M(u_1, ..., u_t) = M(v_1, ..., v_s) we have axg_n^η z = x′g_n^ξ z′ for some η, ξ ∈ {−1, 1}. Since

|(x′)^{−1}ax| ≤ (t + s − 1) · max_{k≤n−1} |g_k| < 2n^{2(α−1)} · max_{k≤n−1} |g_k| ≤ r_{n−1},

and similarly,

|z′z^{−1}| ≤ (t + s − 2) · max_{k≤n−1} |g_k| < r_{n−1},

and since g_n shatters B_{r_{n−1}}, the identity g_n^{−ξ}(x′)^{−1}axg_n^η = z′z^{−1} actually implies that au_1 ⋯ u_k = ax = x′ = v_1 ⋯ v_ℓ. □

We now fix an additional integer parameter K > 0, which will later be taken to be large enough. Define β = (α−1)/2 and

E_t(K) = { (u_1, ..., u_t) ∈ G_*^t : ∀ 1 ≤ j < K^β, u_j = 1, and ∀ K^β ≤ k ≤ t, (u_1, ..., u_k) ∈ U_k }.

Lemma 7.5.2  Let t, s ≥ 1 and let (u_1, ..., u_t) ∈ E_t(K) and (v_1, ..., v_s) ∈ E_s(K). Let a ∈ S. Then au_1 ⋯ u_t ≠ v_1 ⋯ v_s.

Proof  Let N = ⌊K^β⌋. Assume for a contradiction that au_1 ⋯ u_t = v_1 ⋯ v_s. Define

k = min{ 0 ≤ j ≤ t : ∃ 1 ≤ ℓ ≤ s, au_1 ⋯ u_j = v_1 ⋯ v_ℓ },

where k = 0 implies that v_1 ⋯ v_ℓ = a for some 1 ≤ ℓ ≤ s. After fixing k, let ℓ = min{ 1 ≤ j ≤ s : au_1 ⋯ u_k = v_1 ⋯ v_j } (which satisfies ℓ ≥ 1 since a ≠ 1).

If k, ℓ ≥ N, then (u_1, ..., u_k) ∈ U_k and (v_1, ..., v_ℓ) ∈ U_ℓ, so by Lemma 7.5.1 there exist k′ < k and ℓ′ < ℓ such that au_1 ⋯ u_{k′} = v_1 ⋯ v_{ℓ′}, contradicting the minimality of k, ℓ. We conclude that it must be that either k < N or ℓ < N.

If k < N ≤ ℓ, then since u_1 = ⋯ = u_{N−1} = 1, we have v_1 ⋯ v_ℓ = a. As in the proof of Lemma 7.5.1, we may find ℓ′ < ℓ such that ||v_{ℓ′+1}|| = n := M(v_1, ..., v_ℓ) and for all 1 ≤ j ≤ ℓ with j ≠ ℓ′ + 1 we have ||v_j|| < n. But this would imply that

4ℓ · max_{j≤n−1} |g_j| < 4n^{2(α−1)} · max_{j≤n−1} |g_j| = r_{n−1} < |v_{ℓ′+1}| ≤ 1 + (ℓ − 1) · max_{j≤n−1} |g_j|,

a contradiction! If ℓ < N ≤ k we would arrive at a similar contradiction. Hence, it must be that k < N and ℓ < N. But then a = au_1 ⋯ u_k = v_1 ⋯ v_ℓ = 1, a contradiction! Therefore, we have shown that it is impossible that au_1 ⋯ u_t = v_1 ⋯ v_s for (u_1, ..., u_t) ∈ E_t(K) and (v_1, ..., v_s) ∈ E_s(K). □

We now define the required probability measure on G. Using the above lemmas, we will prove it has the necessary properties, completing the proof of Theorem 7.1.6. Let κ = ζ(α)^{−1} for ζ(α) = Σ_{n=1}^∞ n^{−α}. For n ≥ 1 and x ∈ G, define

µ(x) = 1_{{x ∈ G_n}} · (κ / |G_n|) · (n + K)^{−α},

and define

µ(1) = 1 − κ · Σ_{n=K+1}^∞ n^{−α}.

Since G_1 = S, we see that µ is adapted. Since (G_n)^{−1} = G_n for all n, we see that µ is symmetric. The crucial property of this µ is given by the following lemma. Define G_t(K) := { u_1 ⋯ u_t ∈ G : (u_1, ..., u_t) ∈ E_t(K) }.

Lemma 7.5.3  Let (X_t)_t be the µ-random walk. For any ε > 0, there exists K_0 such that for all K ≥ K_0 and all t ≥ 1 we have that P[X_t ∉ G_t(K)] < ε.

Proof  Let (U_t)_{t=1}^∞ be i.i.d.-µ elements, and X_t = U_1 ⋯ U_t. We bound:

P[X_t ∉ G_t(K)] ≤ P[(U_1, ..., U_t) ∉ E_t(K)]
  ≤ Σ_{j<K^β} P[U_j ≠ 1] + Σ_{j>K^β} P[(U_1, ..., U_j) ∉ U_j]
  ≤ K^β (1 − µ(1)) + Σ_{j>K^β} P[J(U_1, ..., U_j) ≥ 2] + Σ_{j>K^β} P[M(U_1, ..., U_j)^{2(α−1)} ≤ j].    (7.3)


If we can choose K large enough to make each of the three terms above smaller than ε/3, we are done. We will make repeated use of the following estimate: for any r ≥ 1,

Σ_{n=r}^∞ n^{−α} ≥ ∫_r^∞ ξ^{−α} dξ = (1/(α−1)) · r^{1−α},

Σ_{n=r}^∞ n^{−α} ≤ r^{−α} + ∫_r^∞ ξ^{−α} dξ ≤ (α/(α−1)) · r^{1−α}.

(This was also used in the solution to Exercise 4.43.)

For the first term in (7.3), note that by our choice of β = (α−1)/2,

K^β (1 − µ(1)) < K^β · Σ_{n=K+1}^∞ n^{−α} ≤ (α/(α−1)) · K^{−β},

which can be made as small as required by taking K large enough.

For the second term in (7.3), we compute for any t ≥ 2 and n ≥ 1, using the independence of (U_j)_j,

P[J(U_1, ..., U_t) ≥ 2, M(U_1, ..., U_t) = n]
  ≤ (t(t−1)/2) · (P[||U_1|| = n])² · (1 − P[||U_1|| ≥ n])^{t−2}
  ≤ ½ t² · κ² (n+K)^{−2α} · (1 − (κ/(α−1)) (n+K)^{1−α})^{t−2}.

Also, for any t ≥ 1,

P[M(U_1, ..., U_t) = 0] ≤ µ(1)^t ≤ (1 − (κ/(α−1)) (1+K)^{1−α})^t.

Hence, for t ≥ 2 and any r ≥ 1, by summing over n ≤ r and n > r separately, we have that

P[J(U_1, ..., U_t) ≥ 2] ≤ (1 − (κ/(α−1)) (1+K)^{1−α})^t + ½ t² κ² · (2α/(2α−1)) · (K+r)^{1−2α}
  + ½ t² · r · exp(−(κ/(α−1)) (K+r)^{1−α} (t−2)).

By choosing r so that r + K = ⌈(C_1 t / log t)^{1/(α−1)}⌉ for a suitable constant C_1 = C_1(α) > 0, we have that for some constants C_2 = C_2(α) > 0 and C_3 = C_3(α) > 0,

P[J(U_1, ..., U_t) ≥ 2] ≤ C_2 · t^{−γ} (log t)^{(2α−1)/(α−1)} ≤ C_3 · t^{−γ+η},

where γ = (2α−1)/(α−1) − 2 = 1/(α−1) and η = (2−α)/(2(α−1)). Since γ − η = α/(2(α−1)) > 1 for α ∈ (1, 2), summing over t > K^β, we have that for some constant C_4 = C_4(α) > 0,

Σ_{j>K^β} P[J(U_1, ..., U_j) ≥ 2] ≤ C_4 · K^{−β(2−α)/(2(α−1))},


which takes care of the second term in (7.3).

Finally, for the third term in (7.3), we compute for r^{2(α−1)} = t,

P[M(U_1, ..., U_t) ≤ r] ≤ (1 − P[||U_1|| > r])^t ≤ exp(−C_5 √t),

where C_5 = C_5(α) > 0 is some constant. Thus, for some constants C_6 = C_6(α) and C_7 = C_7(α) > 0 we have

Σ_{t>K^β} P[M(U_1, ..., U_t)^{2(α−1)} ≤ t] ≤ C_6 · exp(−C_7 K^{β/2}).

This takes care of the third term in (7.3), completing the proof. □
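The integral-comparison estimate used repeatedly in this proof, and the normalization of the measure µ, are easy to sanity-check numerically. The following sketch is ours, not from the book; it simply truncates the series and checks the two-sided bound on Σ_{n≥r} n^{−α}, and that the radial weights κ(n+K)^{−α} leave a genuine probability atom at the identity (here with α = 1.5, K = 10):

```python
def power_tail(alpha, r, terms=200_000):
    # Approximates sum_{n >= r} n^{-alpha} by a long partial sum plus an
    # integral estimate for the (tiny) remaining tail.
    s = sum(n ** -alpha for n in range(r, r + terms))
    return s + (r + terms) ** (1 - alpha) / (alpha - 1)

alpha, K = 1.5, 10

# Two-sided integral-test bound:
#   r^{1-a}/(a-1) <= sum_{n>=r} n^{-a} <= (a/(a-1)) r^{1-a}.
for r in (1, 2, 5, 10, 100):
    t = power_tail(alpha, r)
    lower = r ** (1 - alpha) / (alpha - 1)
    upper = alpha / (alpha - 1) * r ** (1 - alpha)
    assert lower <= t <= upper, (r, lower, t, upper)

# Normalization: kappa = 1/zeta(alpha); the moving mass kappa * sum_{n>K} n^{-a}
# leaves a genuine probability atom mu(1) at the identity.
kappa = 1 / power_tail(alpha, 1)          # power_tail(alpha, 1) = zeta(alpha)
atom = 1 - kappa * power_tail(alpha, K + 1)
assert 0 < atom < 1
```

Of course a finite truncation is only a sanity check, not a proof; the slack in the two-sided bound (a factor of α/(α−1)) comfortably absorbs the truncation error.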

We are now ready to prove that BHF(G, µ) contains a nonconstant function.

Proof of Theorem 7.1.6  Fix some a ∈ S. Let K_t(a) denote the total variation distance

K_t(a) = || P_a[X_t = ·] − P[X_{t+1} = ·] ||_TV.

By Exercise 6.42, it suffices to prove that K_t(a) is bounded away from 0 as t → ∞. Fix some t ≥ 1. We know that there is a coupling (X_t, Y_{t+1}) of two µ-random walks, at times t and t + 1, such that P[aX_t ≠ Y_{t+1}] = K_t(a). Choose K in Lemma 7.5.3 such that P[X_t ∉ G_t(K)] < 1/4 and P[Y_{t+1} ∉ G_{t+1}(K)] < 1/4. Note that if X_t ∈ G_t(K) and Y_{t+1} ∈ G_{t+1}(K), then by Lemma 7.5.2 it must be that aX_t ≠ Y_{t+1}. Thus,

1 − K_t(a) = P[aX_t = Y_{t+1}] ≤ P[X_t ∉ G_t(K)] + P[Y_{t+1} ∉ G_{t+1}(K)] < 1/2.

So K_t(a) > 1/2 for all t ≥ 1, and we are done. □
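For contrast, on the Choquet–Deny group ℤ the analogous total variation distance tends to 0, so no such lower bound is possible there. A small illustrative computation (our example, not from the book) for the lazy simple random walk on ℤ:

```python
# Lazy simple random walk on Z: mu(0) = 1/2, mu(1) = mu(-1) = 1/4.
# K_t(a) = || P_a[X_t = .] - P[X_{t+1} = .] ||_TV for the generator a = 1.
MU = {0: 0.5, 1: 0.25, -1: 0.25}

def convolve(p, q):
    # Convolution of finitely supported distributions on Z (as dicts).
    out = {}
    for x, px in p.items():
        for y, qy in q.items():
            out[x + y] = out.get(x + y, 0.0) + px * qy
    return out

def walk_law(t):
    law = {0: 1.0}
    for _ in range(t):
        law = convolve(law, MU)
    return law

def K(t, a=1):
    p = {x + a: v for x, v in walk_law(t).items()}   # walk started at a
    q = walk_law(t + 1)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in set(p) | set(q))

ks = [K(t) for t in (1, 10, 50)]
assert ks[0] > ks[1] > ks[2]   # K_t(1) decreases ...
assert ks[2] < 0.15            # ... towards 0: the lazy walk on Z is Liouville
```

The laziness is needed only to avoid the parity obstruction (for the non-lazy walk, X_t and X_{t+1} have disjoint supports and the distance is always 1).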

7.6 Additional Exercises

Exercise 7.23  Let G be a finitely generated group and µ an adapted symmetric probability measure on G. Let h be a nonnegative µ-harmonic function. Show that there exists c > 0 such that for all x ∈ G we have h(x) ≤ e^{c|x|} h(1). Conclude that if h is a nonnegative µ-harmonic function with h(x) = 0 for some x ∈ G, then h ≡ 0. B solution C

Exercise 7.24  Let G be a finitely generated group and µ an adapted probability measure on G. Show that the center Z = Z(G) acts trivially on any positive µ-harmonic function. B solution C


Exercise 7.25  Recall the definitions of τ, τ^n, τ* (Definitions 7.2.1 and 7.3.1). Let G be a finitely generated group and µ an adapted probability measure on G. Let P be the collection of all positive µ-harmonic functions. Let K = {x ∈ G : ∀ h ∈ P, x.h = h}. Show that K ◁ G. Show that τ(K) = K. Conclude that τ^∞ ◁ τ* ◁ K. B solution C

Exercise 7.26  Let G be a finitely generated virtually nilpotent group. Let µ be an adapted probability measure on G. Show that any positive µ-harmonic function is constant. B solution C

Exercise 7.27  Recall Definition 6.6.1 and Theorem 6.6.2. Let G be an infinite finitely generated group, and µ an adapted symmetric probability measure on G. Let (X_t)_t be the µ-random walk. For a finite subset A ⊂ G, let E_A = inf{t ≥ 0 : X_t ∉ A} be the exit time from A. Define M_{A,x}(z) = P_x[X_{E_A} = z] and

D_A(x | y) = sup_{z : M_{A,x}(z)>0} | 1 − M_{A,y}(z) / M_{A,x}(z) |.

Show that if A ⊂ A′ ⊂ G are finite subsets, then for any x, y ∈ G it holds that D_{A′}(x | y) ≤ D_A(x | y). B solution C
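For intuition, the exit measures M_{A,x} and the quantities D_A(x | y) can be computed in closed form for simple random walk on ℤ with A_n = {−n, ..., n}, using the classical gambler's-ruin formula. The sketch below (ours, not part of the text) exhibits the monotonicity asserted in the exercise, with D_{A_n}(0 | 1) = 1/(n+1) → 0:

```python
# Simple random walk on Z, exiting A_n = {-n, ..., n}.
# Gambler's ruin: starting from x in A_n, the walk exits at n+1 with
# probability (x + n + 1)/(2n + 2), and at -(n+1) otherwise.
def exit_measure(n, x):
    up = (x + n + 1) / (2 * n + 2)
    return {n + 1: up, -(n + 1): 1.0 - up}

def D(n, x, y):
    Mx, My = exit_measure(n, x), exit_measure(n, y)
    return max(abs(1.0 - My[z] / Mx[z]) for z in Mx if Mx[z] > 0)

ds = [D(n, 0, 1) for n in range(1, 20)]
assert all(a >= b for a, b in zip(ds, ds[1:]))   # monotone in n (Exercise 7.27)
assert abs(ds[4] - 1 / 6) < 1e-12                # in fact D_{A_n}(0|1) = 1/(n+1)
```

Since D_{A_n}(0 | 1) → 0, Exercise 7.28 then forces h(0) = h(1) for every positive harmonic function h on ℤ, consistent with the Choquet–Deny theorem.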

Exercise 7.28  For this exercise assume that µ has finite support. Let x, y ∈ G. Assume that there exists a nondecreasing sequence of finite subsets A_n ⊂ A_{n+1} such that ⋃_n A_n = G and such that

lim_{n→∞} D_{A_n}(x | y) = 0.

Show that for any positive µ-harmonic function h it holds that h(x) = h(y). B solution C

Exercise 7.29  Let x, y ∈ G. Assume that there exists a nondecreasing sequence of finite subsets A_n ⊂ A_{n+1} such that ⋃_n A_n = G and such that

lim sup_{n→∞} D_{A_n}(x | y) > 0.

Prove that there exists a positive µ-harmonic function h such that h(x) ≠ h(y). B solution C

Exercise 7.30  Assume that any positive µ-harmonic function on G is constant. Let S ⊂ supp(µ) be a finite symmetric generating set and let B_r = B_S(1, r) denote the ball of radius r around 1 ∈ G in the Cayley graph with respect to S. Show that for any ε > 0 there exists c > 0 such that for all r > 0 and any |z| = r + 1,

M_{B_r,1}(z) = P[X_{E_{B_r}} = z] ≥ c e^{−εr}.

Conclude that G has sub-exponential growth; that is,

lim_{r→∞} (1/r) · log |B_r| = 0.

B solution C

7.7 Remarks

The Choquet–Deny theorem (Corollary 7.1.2) was first shown by Blackwell (1955) for ℤ^d, by Choquet and Deny (1960) for Abelian groups, and by Dynkin and Malyutov (1961) for finitely generated nilpotent groups. It took quite a few decades before the converse was obtained by Frisch, Hartman, Tamuz, and Vahidi-Ferdowsi (2019). They prove that for any non-virtually nilpotent group there exists a finite-entropy symmetric adapted random walk that is non-Liouville (Theorem 7.1.5). This characterizes the Choquet–Deny groups as the virtually nilpotent groups, tying together the analytic phenomenon and the algebraic property. The equivalent condition for the existence of nonconstant positive harmonic functions (in Exercises 7.28 and 7.29), as well as the proof that any group of exponential growth admits a nonconstant positive harmonic function (Exercise 7.30), was shown in Amir and Kozma (2017). Note that by Exercise 7.26, finitely generated virtually nilpotent groups (which, by Gromov's theorem, Theorem 9.0.1, are the same as polynomial growth groups) do not admit nonconstant positive harmonic functions. In contrast, Exercise 7.30 tells us that any group of exponential growth admits some


nonconstant positive harmonic function. The following question was raised in Amir and Kozma (2017), and is still open at the time of writing.

Question 7.7.1  Do groups of intermediate growth admit nonconstant positive harmonic functions? Specifically, does the Grigorchuk group (from Grigorchuk, 1980, 1984) admit nonconstant positive harmonic functions?

In Perl and Yadin (2023) it is shown that if HFk (G, µ) contains a nonconstant positive harmonic function then dim HFk (G, µ) = ∞. See Exercise 2.38 for the case where LHF (G, µ) contains a nonconstant positive harmonic function. Question 7.7.2 Let G be a finitely generated group and µ ∈ SA (G, ∞). Assume that dim HFk (G, µ) = ∞. Does HFk (G, µ) contain a nonconstant positive harmonic function?

7.8 Solutions to Exercises

Solution to Exercise 7.1 :(
If f ∈ BHF(G, µ) then f̄(Kx) := f(x) is a well-defined function on G/K (because K acts trivially on f). It is also simple to verify that f̄ is µ̄-harmonic. Finally, this is a linear map, and its kernel is trivial. Indeed, if f̄ ≡ 0, then for any x ∈ G we have that f(x) = f̄(Kx) = 0, so f ≡ 0. :) X

Solution to Exercise 7.2 :(
If y, z ∈ C_G^N(x), then since N ◁ G,

[x, yz] = x^{−1}z^{−1}y^{−1}xyz = [x, z] · ([x, y])^z ∈ N,
[x, y^{−1}] = x^{−1}yxy^{−1} = ([y, x])^{y^{−1}} ∈ N,

because [y, x] = ([x, y])^{−1}. :) X

Solution to Exercise 7.3 :(
If y ∈ N then [x, y] = (y^{−1})^x y ∈ N, so y ∈ C_G^N(x). :) X

Solution to Exercise 7.4 :(
The first assertion is immediate. For the second assertion, let π : G → G/N be the canonical projection, and let ϕ be the restriction of π to C_G^M(x).
If y ∈ C_G^M(x) then [π(x), π(y)] = π([x, y]) ∈ M/N, because [x, y] ∈ M. Thus ϕ is a homomorphism from C_G^M(x) into C_{G/N}^{M/N}(π(x)).
Now, if Ny ∈ C_{G/N}^{M/N}(π(x)), then π([x, y]) = [π(x), π(y)] ∈ M/N, so N[x, y] = Nm for some m ∈ M. Since N ◁ M, this implies that [x, y] ∈ M. Thus, ϕ : C_G^M(x) → C_{G/N}^{M/N}(Nx) is surjective, completing the proof. :) X


Solution to Exercise 7.5 :(
We have that

z ∈ C_G^N(x^y) ⟺ [x^y, z] ∈ N ⟺ ([x, z^{y^{−1}}])^y ∈ N ⟺ z^{y^{−1}} ∈ C_G^N(x);

that is, C_G^N(x^y) = (C_G^N(x))^y. :) X

Solution to Exercise 7.6 :(
Since

[x^{−1}, y] = xy^{−1}x^{−1}y = (([x, y])^{−1})^{x^{−1}} = ([y, x])^{x^{−1}},

we have that C_G^N(x^{−1}) = C_G^N(x).
Also, because

[xy, z] = y^{−1}x^{−1}z^{−1}xyz = ([x, z])^y · [y, z],

we know that C_G^N(xy) ⊇ C_G^N(x) ∩ C_G^N(y), so x, y ∈ τ(N) implies that xy ∈ τ(N) (by Proposition 1.3.3). This shows that τ(N) is indeed a subgroup.
Now let y ∈ G. The automorphism z ↦ z^y of G maps C_G^N(x) onto C_G^N(x^y). Thus, [G : C_G^N(x)] = [G : C_G^N(x^y)], implying that x ∈ τ(N) if and only if x^y ∈ τ(N). This shows that τ(N) ◁ G. :) X

Solution to Exercise 7.7 :(
If x ∈ N, then [x, y] ∈ N for all y ∈ G. That is, C_G^N(x) = G, which implies that x ∈ τ(N). :) X
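The commutator identities used in the last few solutions, [x, yz] = [x, z] · ([x, y])^z and [xy, z] = ([x, z])^y · [y, z], hold in every group and are easy to machine-check in a concrete one. A quick sketch (ours, not part of the text) over the symmetric group S_4, with the conventions [x, y] = x^{−1}y^{−1}xy and a^g = g^{−1}ag:

```python
import random
from itertools import permutations

# Permutations of {0,1,2,3} as tuples; (p * q)(i) = p[q[i]] (apply q, then p).
def mul(p, q):
    return tuple(p[q[i]] for i in range(len(q)))

def inv(p):
    out = [0] * len(p)
    for i, v in enumerate(p):
        out[v] = i
    return tuple(out)

def comm(x, y):   # [x, y] = x^{-1} y^{-1} x y
    return mul(mul(inv(x), inv(y)), mul(x, y))

def conj(a, g):   # a^g = g^{-1} a g
    return mul(mul(inv(g), a), g)

S4 = list(permutations(range(4)))
random.seed(0)
for _ in range(200):
    x, y, z = (random.choice(S4) for _ in range(3))
    # [x, yz] = [x, z] * ([x, y])^z    and    [xy, z] = ([x, z])^y * [y, z]
    assert comm(x, mul(y, z)) == mul(comm(x, z), conj(comm(x, y), z))
    assert comm(mul(x, y), z) == mul(conj(comm(x, z), y), comm(y, z))
```

Since both identities are consequences of associativity and inverses alone, they hold regardless of which composition convention one uses for permutations.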

Solution to Exercise 7.8 :(
Since N ◁ C_G^N(x) for any x, we have that [G : C_G^N(x)] ≤ [G : N] < ∞ for all x, implying that τ(N) = G. :) X

Solution to Exercise 7.9 :(
Just note that since N ◁ M, we have that C_G^N(x) ≤ C_G^M(x). Thus, [G : C_G^M(x)] ≤ [G : C_G^N(x)]. This easily implies the assertion. :) X

Solution to Exercise 7.10 :(
Let x, y ∈ N_∞ and z ∈ G. Since (N_n)_n is a nondecreasing sequence, there exists n such that x, y ∈ N_n. So x^{−1}y ∈ N_n ⊂ N_∞, and also x^z ∈ N_n ⊂ N_∞. :) X

Solution to Exercise 7.11 :(
Since C_G^M(x)/N = C_{G/N}^{M/N}(Nx), we know that [G : C_G^M(x)] = [G/N : C_{G/N}^{M/N}(Nx)]. So we have that x ∈ τ_G(M) if and only if Nx ∈ τ_{G/N}(M/N), which is to say that τ_G(M)/N = τ_{G/N}(M/N).
The second assertion follows inductively on n − k by noting that by induction, for N = τ^k,

τ^{n+1}/N = τ(τ^n)/N = τ_{G/N}(τ^n/N) = τ_{G/N}(τ_{G/N}^{n−k}) = τ_{G/N}^{n+1−k}. :) X

Solution to Exercise 7.12 :(
This is done by induction on n. The base case of n = 1 is just the observation that if z ∈ Z_1 then C_G(z) = G. For n > 1, we know that Z_n/Z_{n−1} = Z_1(G/Z_{n−1}). So

Z_n/Z_{n−1} ⊂ τ_{G/Z_{n−1}}^1 = τ_G(Z_{n−1})/Z_{n−1}.

Since Z_{n−1} ◁ Z_n ∩ τ_G(Z_{n−1}), we get that Z_n ◁ τ_G(Z_{n−1}). By induction we now have that Z_n ◁ τ_G(Z_{n−1}) ◁ τ_G(τ_G^{n−1}) = τ_G^n, completing the proof. :) X


Solution to Exercise 7.13 :(
If N ◁ G, then for any x ∈ H we have that C_H^{H∩N}(x) = C_G^N(x) ∩ H. As [G : H] < ∞ we conclude that τ_H(H ∩ N) = τ_G(N) ∩ H. Thus, inductively,

τ_H^n = τ_H(τ_H^{n−1}) = τ_H(τ_G^{n−1} ∩ H) = τ_G(τ_G^{n−1}) ∩ H = τ_G^n ∩ H. :) X

Solution to Exercise 7.14 :(
We prove this by induction on n. For n = 1 this follows since {1} ◁ N for any N ◁ G. So if τ(N) = N then τ^1 ◁ τ(N) = N. Thus, τ^1 ◁ τ*.
For the induction step, let n > 1. Note that by induction, τ^{n−1} ◁ τ*. So for any N ◁ G such that τ(N) = N, since τ* ◁ N, we have that τ^n ◁ τ(τ*) ◁ τ(N) = N. Intersecting over all such N completes the proof. :) X

Solution to Exercise 7.15 :(
For any N ◁ G such that τ(N) = N we have that τ* ◁ N, hence τ(τ*) ◁ τ(N) = N. Intersecting over all such N gives that τ(τ*) ◁ τ*. The other inclusion is true for any normal subgroup. :) X

Solution to Exercise 7.16 :(
If [G : τ*] < ∞, then τ* = τ(τ*) = G, so we may assume that [G : τ*] = ∞. Since

τ_{G/τ*}({1}) = τ_G(τ*)/τ* = τ*/τ*,

we conclude that G/τ* is ICC as long as it is infinite. :) X

Solution to Exercise 7.17 :(
Let s_1, ..., s_d be generators of G. Since G = τ^∞ = ⋃_n τ^n, there exists n such that s_1, ..., s_d ∈ τ^n, which implies that G = τ^n ◁ τ*. :) X

Solution to Exercise 7.18 :(
This is almost identical to the proof of Lemma 7.4.2. Consider the collection

N = {N ◁ G : G/N is not P}.

If (N_k)_k is an increasing chain in N, then N_∞ := ⋃_k N_k is a normal subgroup. If N_∞ ∉ N, then G/N_∞ is P. So G/N_∞ is finitely presented. Say G/N_∞ = ⟨s_1, ..., s_d | r_1, ..., r_m⟩. Just as in the proof of Lemma 7.4.2, there exists k such that all relations r_j would be mapped into N_k by the canonical projection onto G. This would imply that N_∞ = N_k, meaning that G/N_∞ is not P, a contradiction! We conclude that any nondecreasing chain in N has an upper bound in N, so by Zorn's lemma N contains a maximal element. That is, there exists a maximal element N ∈ N. So G/N is not P.
Now, assume that G/N has a non-P quotient. That implies the existence of some N ◁ M ◁ G such that G/M is not P. By the maximality of N, we have that N = M. So the only non-P quotient of G/N is by the trivial group, as required. :) X

Solution to Exercise 7.19 :(
If 0 ≠ z ∈ N ◁ ℤ, it must be that {nz : n ∈ ℤ} ⊂ N, so [ℤ : N] ≤ |z|. :) X

Solution to Exercise 7.20 :(
All finite groups are finitely presented, and quotients of finite groups are also finite. So by Exercise 7.18, there exists N ◁ G such that G/N is just-infinite. That is, G/N is infinite, but any nontrivial quotient of G/N is finite. So if N ≠ M for N ◁ M ◁ G, then [G/N : M/N] = [G : M] < ∞. :) X

Solution to Exercise 7.21 :(
Let f_1, ..., f_n ∈ BHF(G/N, µ̄) be linearly independent bounded harmonic functions. For each 1 ≤ k ≤ n, define h_k : G → ℂ by h_k(x) = f_k(Nx). It is easy to check that h_k is µ-harmonic because f_k is µ̄-harmonic. It is also straightforward that h_1, ..., h_n are linearly independent. :) X


Solution to Exercise 7.22 :(
Fix some Cayley graph of G. Let ρ be the probability measure on N given by

ρ(x) = 1_{{x∈N}} · C / ((|x| + 1)³ · |S_{|x|} ∩ N|)

for an appropriate constant C > 0, where S_r = {x ∈ G : |x| = r}. Fix a set of representatives R ⊂ G so that G = ⊎_{r∈R} Nr. For any x ∈ G, there are unique elements n_x ∈ N and r_x ∈ R such that x = n_x r_x. Denote n̄_x = n_{x^{−1}}. Thus, the following is well defined:

µ(x) := ½ (ρ(n_x) + ρ(n̄_x)) · ν(Nx).

Since r^{−1}N = Nr^{−1}, we have that for any r ∈ R,

Σ_{n∈N} ρ(n_{r^{−1}n^{−1}}) = Σ_{x∈r^{−1}N} ρ(n_x) = Σ_{n∈N} ρ(n) = 1.

Thus,

µ̄(Nr) = Σ_{n∈N} ½ (ρ(n) + ρ(n_{r^{−1}n^{−1}})) · ν(Nr)
  = ½ Σ_{n∈N} ρ(n) ν(Nr) + ½ Σ_{x∈r^{−1}N} ρ(n_x) ν(Nr)
  = ½ Σ_{n∈N} ρ(n) ν(Nr) + ½ Σ_{n∈N} ρ(n) ν(Nr) = ν(Nr),

which implies that µ̄ = ν and that µ is indeed a probability measure. If ν is symmetric, then ν(Nx) = ν(Nx^{−1}), so that µ is also symmetric.
For any r ∈ R, let ρ_r(n) := ρ(n_{r^{−1}n^{−1}}), which we have seen above is a probability measure. If X is a random element of law µ, then one easily checks that r_X has law ν, and (X | r_X = r) has law ½(ρ + ρ_r).
Note that |S_k| ≤ D^k for some D > 1 and all k ≥ 0. Thus,

H(ρ) ≤ Σ_{k=0}^∞ (C log |S_k| + 3C log(k+1) − C log C) / (k+1)³
  ≤ Σ_{k=0}^∞ (C log D · k + 3C log(k+1) − C log C) / (k+1)³ < ∞.

Also,

H(ρ_r) = −Σ_{x∈r^{−1}N} ρ(n_x) log ρ(n_x) = −Σ_{x∈Nr^{−1}} ρ(n_x) log ρ(n_x) = H(ρ).

So for any r ∈ R we have that H(½(ρ + ρ_r)) ≤ log 2 + H(ρ), implying that if X is a random element of G with law µ, then

H(X) ≤ H(X | r_X) + H(r_X) ≤ log 2 + H(ρ) + H(ν).

Finally, to show that µ is adapted, let n ∈ N and r ∈ R. Write Nr = Np_1 ⋯ p_k for Np_j ∈ supp(ν) and p_j ∈ R for all j. Let m ∈ N be such that mp_1 ⋯ p_k = nr. Note that ρ(m) > 0 (since ρ is supported on all of N). Since µ(mp_1) ≥ ½ ρ(m) ν(Np_1) > 0 and µ(p_j) ≥ ½ ρ(1) ν(Np_j) > 0 for all j, we have that mp_1, p_2, ..., p_k ∈ supp(µ). Since nr = mp_1 p_2 ⋯ p_k, and this holds for arbitrary n ∈ N, r ∈ R, we find that µ is adapted. :) X

Solution to Exercise 7.23 :(
Let S be a finite symmetric generating set for G such that S ⊂ supp(µ). Consider the Cayley graph of G with respect to S. Set µ∗ = min_{s∈S} µ(s) > 0.


Let x ∈ G and write x = s_1 ⋯ s_n for s_j ∈ S and n = |x|. Then, because h is nonnegative,

h(1) = Σ_y µ(y)h(y) ≥ µ∗ h(s_1) = µ∗ Σ_y µ(y)h(s_1 y) ≥ (µ∗)² h(s_1 s_2) ≥ ⋯ ≥ (µ∗)^n h(s_1 ⋯ s_n) = (µ∗)^{|x|} h(x).

This proves the first assertion, with c = log(1/µ∗).
Now, assume that h is a nonnegative µ-harmonic function and that h(x) = 0. Set f = x.h. So f is a nonnegative µ-harmonic function with f(1) = 0. By the first assertion, 0 ≤ f(y) ≤ e^{c|y|} · f(1) = 0 for some c > 0 and all y ∈ G. This implies that h ≡ 0. :) X

f g E y [| f (Xt +k ) − f (Xt ) |] ≥ α E y | f (Xt x −1 ) − f (Xt ) | = α E y [ |x. f (Xt ) − f (Xt ) |] ≥ α |x. f (y) − f (y) |. We have used that f Xt =f t = x. f (Xt ) since x ∈ Z . On the other hand, since the sequence ( f (Xt ))t is a nonnegative martingale, it converges a.s. and in L 1 by the martingale convergence theorem (Theorem 2.6.3). This implies that ( f (Xt +k ) − f (Xt ))t converges to 0 a.s. and in L 1 . Thus, E y [| f (Xt +k ) − f (Xt ) |] → 0. We obtain that for any y ∈ G and any x ∈ Z we have x. f (y) = f (y) . This implies that Z acts trivially on f . :) X

x −1

x −1 X

Solution to Exercise 7.25 :(
It is easily shown that K ◁ G. For the rest of the proof, since K ◁ τ(K) we only need to show that τ(K) ◁ K. This is exactly as in the proof of Theorem 7.3.3, with P replacing BHF(G, µ), and using the fact that positive martingales converge a.s. and in L¹. :) X

Solution to Exercise 7.26 :(
By Theorem 7.2.3, since G is virtually nilpotent, τ^∞ = G, so also G = τ*. By Exercise 7.25, G acts trivially on any positive µ-harmonic function, so any such function must be constant. :) X

Solution to Exercise 7.27 :(
Note that since A ⊂ A′, we have that E_A ≤ E_{A′}. Thus, by the strong Markov property,

P_a[X_{E_{A′}} = b] = Σ_w P_a[X_{E_A} = w] · P_w[X_{E_{A′}} = b].

This leads to

|M_{A′,x}(z) − M_{A′,y}(z)| = | Σ_w M_{A′,w}(z) · (M_{A,x}(w) − M_{A,y}(w)) |
  ≤ D_A(x | y) · Σ_w M_{A′,w}(z) · M_{A,x}(w) = D_A(x | y) · M_{A′,x}(z).

Maximizing over z proves the claim. :) X

Solution to Exercise 7.28 :(
Let A be a finite subset, and consider the exit time E_A. If h is a positive µ-harmonic function and (X_t)_t is a µ-random walk, then (M_t := h(X_{t∧E_A}))_t is a bounded martingale. Indeed,

0 < M_t ≤ max_{x∈A, s∈supp(µ)} h(xs) < ∞.

Thus, by the optional stopping theorem (Theorem 2.3.3), h(x) = E_x[h(X_{E_A})] for any x ∈ G.
Now, let n(x, y) be large enough so that x, y ∈ A_n for all n ≥ n(x, y), which is possible because (A_n)_n are nondecreasing and ⋃_n A_n = G. Using that h is positive, we can now compute

|h(x) − h(y)| = | Σ_z h(z) · (P_x[X_{E_{A_n}} = z] − P_y[X_{E_{A_n}} = z]) |
  ≤ Σ_z h(z) M_{A_n,x}(z) · D_{A_n}(x | y) = h(x) · D_{A_n}(x | y) → 0.

This implies that h(x) = h(y). :) X

Solution to Exercise 7.29 :(
Let ε > 0 and (z_n)_n be such that for all n we have

|M_{A_n,x}(z_n) − M_{A_n,y}(z_n)| > ε · M_{A_n,x}(z_n).

For each n define

h_n(w) = M_{A_n,w}(z_n) / M_{A_n,x}(z_n).

Let S be a finite symmetric generating set for G such that S ⊂ supp(µ). Consider the Cayley graph of G with respect to S. Set µ∗ = min_{s∈S} µ(s) > 0.
Let w ∈ G. Because ⋃_n A_n = G and A_n ⊂ A_{n+1} for all n, there exists n(w) such that S ∪ {w} ⊂ A_n for any n ≥ n(w). For any s ∈ S and n ≥ n(w), we have that

P_{ws}[X_{E_{A_n}} = z_n, E_{A_n} > 1, X_1 = w] ≥ µ∗ · P_w[X_{E_{A_n}} = z_n].

Thus, for any n ≥ n(w),

h_n(ws) ≥ µ∗ · h_n(w).

Since S generates G, there exist s_1, ..., s_{|w|} ∈ S such that ws_1 ⋯ s_{|w|} = 1. We can choose m(w) large enough so that

{ w, ws_1, ws_1 s_2, ..., ws_1 ⋯ s_{|w|} = 1 } ⊂ A_n

for all n ≥ m(w). Thus, for all n ≥ m(w) we have that

h_n(1) ≥ (µ∗)^{|w|} · h_n(w).

Since this holds for any w ∈ G, an Arzelà–Ascoli type argument (as in the solution to Exercise 1.97) proves that there is a subsequence (h_{n_k})_k such that the limit

h(w) := lim_{k→∞} h_{n_k}(w)

exists for each w ∈ G. We now have that

h(x) = lim_{k→∞} h_{n_k}(x) = 1,   |1 − h(y)| = lim_{k→∞} |1 − h_{n_k}(y)| > ε.

So h(x) ≠ h(y). We are left with showing that h is µ-harmonic. Let w ∈ G. Fix δ > 0 small, and let F = F_δ be a finite subset such that P[X_1 ∉ F] < δ. Since F is finite, there exists n_{w,F} such that for all n ≥ n_{w,F} we have wF ∪ {w} ⊂ A_n. Thus, for any n ≥ n_{w,F},


| P_w[X_{E_{A_n}} = z_n] − Σ_{z∈F} P_w[X_{E_{A_n}} = z_n, X_1 = wz] | ≤ Σ_{z∉F} P_w[X_1 = wz] ≤ P[X_1 ∉ F] < δ.

Since δ > 0 was arbitrary, comparing with Σ_z µ(z) h_n(wz) and passing to the limit along the subsequence (n_k)_k shows that h(w) = Σ_z µ(z) h(wz); that is, h is µ-harmonic. :) X

Solution to Exercise 7.30 :(
Set µ∗ = min_{s∈S} µ(s) > 0. For any s_1, ..., s_n ∈ S and any x ∈ G, we have that

P_x[∀ 1 ≤ k ≤ n, X_k = xs_1 ⋯ s_k] ≥ (µ∗)^n.

For any |x| ≤ r and any |z| = r + 1, there is a path in the Cayley graph from x to z of length at most |x| + |z| ≤ 2r + 1, which stays inside B_r until hitting z (e.g. follow a geodesic from x to 1, and then a geodesic from 1 to z). We thus have that M_{B_r,x}(z) ≥ (µ∗)^{2r+1} > 0.
Fix ε > 0. Since any positive harmonic function is constant, by Exercise 7.29 we can choose R large enough so that for any r ≥ R we have

D_r := max_{s∈S} ( D_{B_r}(1 | s) ∨ D_{B_r}(s | 1) ) < ε.

Note that x ↦ yx is an automorphism of the Cayley graph, and since M_{A,x}(z) = M_{yA,yx}(yz), we get that

D_{B(x,r)}(x | xs) ∨ D_{B(x,r)}(xs | x) ≤ D_r.

Now, let s ∈ S and x ∈ G. For any r ≥ R + |x|, since B(x, R) ⊂ B_r, by Exercise 7.27 we have that

D_{B_r}(x | xs) ∨ D_{B_r}(xs | x) ≤ D_R < ε.

Now, let r ≥ R + |x| and choose s_1, ..., s_{|x|} ∈ S such that x = s_1 ⋯ s_{|x|}. Write y_k = s_1 ⋯ s_k for 1 ≤ k ≤ |x|, and y_0 = 1. By the above, for any |z| = r + 1,

M_{B_r,x}(z) = M_{B_r, y_{|x|−1} s_{|x|}}(z) ≤ (1 + ε) M_{B_r, y_{|x|−1}}(z) ≤ ⋯ ≤ (1 + ε)^{|x|} · M_{B_r,1}(z) ≤ e^{ε|x|} · M_{B_r,1}(z).

Now, let r > R and let z ∈ G be such that |z| = r + 1. Write z = s_1 ⋯ s_{r+1} for s_j ∈ S. Set y_0 = 1 and y_j = s_1 ⋯ s_j for all 1 ≤ j ≤ r + 1. Note that |y_j| = j ≤ r. Let x = y_{r−R}. We have that

M_{B_r,x}(z) ≥ P_x[∀ 1 ≤ t ≤ R + 1, X_t = y_{r−R+t}] ≥ (µ∗)^{R+1}.


Thus we have that

M_{B_r,1}(z) ≥ e^{−ε(r−R)} · (µ∗)^{R+1} = µ∗ · (e^ε µ∗)^R · e^{−εr}.

This holds for all r > R and all |z| = r + 1. Choosing an appropriate constant c > 0 gives that for all r > 0 and all |z| = r + 1 it holds that M_{B_r,1}(z) > c e^{−εr}. Summing over z, we find

1 ≥ #{z : |z| = r + 1} · c e^{−εr}.

Summing over r we conclude that

|B_r| = 1 + Σ_{k=0}^{r−1} #{z : |z| = k + 1} ≤ 1 + (1/c) Σ_{k=0}^{r−1} e^{εk} ≤ 1 + (1/(c(e^ε − 1))) · e^{εr}.

Thus, for some constant C > 0 and all r > 0 we have that log |B_r| ≤ C + εr. Dividing by r and taking r → ∞ completes the proof. :) X


8 The Milnor–Wolf Theorem


https://doi.org/10.1017/9781009128391.011 Published online by Cambridge University Press


This chapter is devoted to proving a special dichotomy that holds in the case of solvable groups: any finitely generated solvable group is either of exponential growth or of polynomial growth. So solvable groups do not exhibit intermediate growth. The result, Theorem 8.2.4, is due to Milnor (1968a) and Wolf (1968). This result is a very nice example of the algebra of the group (e.g. the property of solvability) constraining the geometry of the group (e.g. growth). Wolf also proved in his 1968 paper that finitely generated nilpotent groups have polynomial growth (Theorem 8.2.1). In Chapter 9 we will prove Gromov’s theorem (Theorem 9.0.1), which states that a finitely generated group is virtually nilpotent if and only if it has polynomial growth.

8.1 Growth

Let us define some notions regarding growth in finitely generated groups. For (nondecreasing) functions f, g : ℕ → [0, ∞), define a preorder (i.e. a transitive and reflexive relation): f ≼ g if there exists a constant C > 0 such that f(n) ≤ C·g(Cn) for all n. We write f ∼ g for f ≼ g and g ≼ f. We write f ≺ g for f ≼ g and g ⋠ f. If f ∼ g then we say that f, g are equivalent.

Exercise 8.1  Show that ≼ is a preorder; that is, show that if f ≼ g and g ≼ h then f ≼ h, and show that f ≼ f. Show that ∼ is an equivalence relation.

Exercise 8.2  Show that for any nonnegative polynomial p(x) = Σ_{j=0}^d a_j x^j (restricted to ℕ) with a_d ≠ 0, we have p ∼ (n ↦ n^d). Show that for d > k we have (n ↦ n^k) ≺ (n ↦ n^d).
Show that all stretched exponential functions n ↦ e^{cn^α} for different c > 0 (and the same α) are equivalent; that is, show that (n ↦ e^{n^α}) ∼ (n ↦ e^{cn^α}) for all c > 0.
Show that the stretched exponential functions n ↦ e^{n^α} for different α > 0 are not equivalent. In fact, show that (n ↦ e^{n^α}) ≺ (n ↦ e^{n^β}) for α < β.
Show that (n ↦ n^d) ≺ (n ↦ e^{n^α}) for any α > 0.
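The defining inequality f(n) ≤ C·g(Cn) can be explored numerically with explicit witness constants C. A finite-range check is of course only illustrative, not a proof; the sketch below (ours, not from the book) is consistent with the claims of Exercise 8.2:

```python
import math

def dominates(f, g, C, N=1000):
    # Check the witness inequality f(n) <= C * g(C * n) for n = 1..N.
    return all(f(n) <= C * g(C * n) for n in range(1, N + 1))

p = lambda n: 3 * n ** 2 + n        # degree-2 polynomial
sq = lambda n: n ** 2

assert dominates(p, sq, C=4) and dominates(sq, p, C=1)   # p ~ (n -> n^2)

# n^2 is dominated by the stretched exponential e^{sqrt(n)} (witness C = 9) ...
assert dominates(sq, lambda n: math.exp(math.sqrt(n)), C=9)
# ... but not conversely: even C = 100 fails on this range.
assert not dominates(lambda n: math.exp(math.sqrt(n)), sq, C=100)
```

The asymmetry in the last two checks is exactly what the strict relation (n ↦ n^d) ≺ (n ↦ e^{n^α}) asserts; turning the finite failure into a proof requires the limit arguments of the exercise.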

Definition 8.1.1  Let G be a finitely generated group, and fix some Cayley graph of G. Let f(r) = |B_r|, where B_r = {x : |x| ≤ r} is the ball of radius r around 1. We say that G has polynomial growth if f(r) ≼ (r ↦ r^d) for some d > 0. We say that G has exponential growth if (r ↦ e^r) ≼ f(r).


Otherwise we say that G has intermediate growth. If f(r) ≺ (r ↦ e^r) we say that G has sub-exponential growth.
By the growth of a finitely generated group G, we refer to the equivalence class of f(r). Note that when we say that a group has some growth, we implicitly assume it is finitely generated, and we are assuming some implicit Cayley graph (or finite symmetric generating set) in the background. The following exercise shows that the specific Cayley graph is not important.

Exercise 8.3  Show that the growth of a group does not depend on the choice of finite symmetric generating set. Show that the growth of a group cannot exceed exponential. B solution C

Exercise 8.4  Let G be a finitely generated group. Show that a finitely generated subgroup of G has growth not more than that of G (under the preorder on growth functions). Show that a quotient group of G has growth not more than that of G. B solution C

Exercise 8.5  Show that if H is a finite-index subgroup of a finitely generated group G, then the growth of H is the same as that of G. B solution C

Exercise 8.6  Show that a finitely generated Abelian group has polynomial growth. B solution C

8.2 Growth of Nilpotent Groups

In this section we will prove that any finitely generated nilpotent group has polynomial growth, generalizing Exercise 8.6. Recall from Section 1.5.4 the definition of a nilpotent group: G is nilpotent of step n if the lower central series terminates at step n. That is, γ_n(G) = {1}, where γ_0(G) = G and γ_{k+1}(G) = [G, γ_k(G)].

Theorem 8.2.1 Let G be a finitely generated nilpotent group. Then G has polynomial growth.

We prove Theorem 8.2.1 via the following steps.


The Milnor–Wolf Theorem

Exercise 8.7 Let G be a finitely generated nilpotent group. Show that for any n, the subgroup γn (G) is finitely generated. (Hint: use Exercises 1.39 and 1.62). B solution C

Exercise 8.8 Let S_0 = S be a finite generating set for an n-step nilpotent group G (i.e. γ_n(G) = {1}). For 1 ≤ k ≤ n − 1, define

S_k = {[s_0, . . . , s_k] : s_0, . . . , s_k ∈ S}.

Define T_{n−1} = S_{n−1}, and for k < n − 1 define inductively T_k = S_k ∪ T_{k+1}. Show that γ_k(G) = ⟨T_k⟩ for all 0 ≤ k ≤ n − 1. B solution C

Lemma 8.2.2 Let S be a finite symmetric generating set for an infinite nilpotent group G. For any finite symmetric generating set T for [G, G], there exists a constant κ > 0 such that for any x ∈ [G, G], we have |x|_T ≤ κ(|x|_S)².

That is, Lemma 8.2.2 states that the internal metric on [G, G] does not distort much the external metric inherited from the one on G. For more on the phenomenon of distortion, see Exercise 8.35.

Proof We have that G/[G, G] is an infinite finitely generated Abelian group, so isomorphic to Z^d × F for some d ≥ 1 and some finite Abelian group F, by Theorem 1.5.2. Thus there exists a subgroup [G, G] ◁ H ◁ G such that [G : H] = |F| and H/[G, G] ≅ Z^d. Since [G : H] < ∞, we know that H is finitely generated (Exercise 1.61) and nilpotent. It is also simple to see that [H, H] = Ker π = [G, G], where π : H → Z^d is the canonical projection. By adjusting the constant κ, if we replace G by H we may assume without loss of generality that G/[G, G] ≅ Z^d.

Let n be the nilpotent step of G. Let T = {[s_0, . . . , s_k] : s_0, . . . , s_k ∈ S, k ≥ 1}\{1}, which is a finite set since γ_n(G) = {1}. Exercise 8.8 tells us that T generates [G, G]. Let π : G → Z^d be the canonical projection (with Ker π = [G, G]). Let a_1, . . . , a_d ∈ G be such that π(a_j) = e_j, where e_1, . . . , e_d is the standard basis for R^d, so they generate Z^d. Let S̃ = {a_1^{±1}, . . . , a_d^{±1}}. Note that S̃ ∩ T ⊆ S̃ ∩ Ker π = ∅. By adjusting the constant κ, we may modify S so that S̃ ⊆ S. Exercise 1.62 tells us that S̃ ∪ T generates G. So it suffices to prove that there exists κ > 0 such that for all x ∈ [G, G] we have |x|_T ≤ κ(|x|_{S̃∪T})².


To this end, fix some x ∈ [G, G]. For k ≥ 1, define

T_k = {[s_0, . . . , s_k] : s_0, . . . , s_k ∈ S}\{1},

noting that T_k = ∅ for k ≥ n and that T = ∪_{k=1}^{n−1} T_k. Define

W_x = {(u_1, . . . , u_m) : m ≥ 1, u_j ∈ S̃ ∪ T, u_1 · · · u_m = x}.

For ω = (u_1, . . . , u_m) and 1 ≤ j ≤ d and 1 ≤ k ≤ n − 1, we define

|ω| = m,
‖ω‖_{0,j} = Σ_{i=1}^m 1{u_i ∈ {a_j, a_j^{−1}}},
‖ω‖_0 = Σ_{j=1}^d ‖ω‖_{0,j} = Σ_{i=1}^m 1{u_i ∈ S̃},
‖ω‖_k = Σ_{i=1}^m 1{u_i ∈ T_k},
‖ω‖_T = Σ_{k=1}^{n−1} ‖ω‖_k = Σ_{i=1}^m 1{u_i ∈ T}.

Note that |ω| = ‖ω‖_0 + ‖ω‖_T. Moreover, if ‖ω‖_0 = 0 then |x|_T ≤ |ω| = ‖ω‖_T.

Let ω = (u_1, . . . , u_m) ∈ W_x. Assume that ‖ω‖_{0,j} > 0. Then, since x ∈ [G, G] = Ker π,

Σ_{i=1}^m π(u_i) = π(u_1 · · · u_m) = π(x) = 0,

so there must exist 1 ≤ ℓ′ < ℓ ≤ m such that u_ℓ = u_{ℓ′}^{−1} ∈ {a_j, a_j^{−1}}. Consider

ω′ = (u_1, . . . , u_{ℓ′−1}, u_{ℓ′+1}, [u_{ℓ′+1}, u_ℓ], . . . , u_{ℓ−1}, [u_{ℓ−1}, u_ℓ], u_{ℓ+1}, . . . , u_m).

Since yz = zy[y, z] and since u_{ℓ′} u_ℓ = 1, we see that ω′ ∈ W_x. Also, we have that

‖ω′‖_{0,j} ≤ ‖ω‖_{0,j} − 2,
‖ω′‖_{0,i} ≤ ‖ω‖_{0,i} for all 1 ≤ i ≤ d,
‖ω′‖_k ≤ ‖ω‖_k + ‖ω‖_{k−1} for all k ≥ 1.

This implies that

|ω′| ≤ |ω| + ‖ω‖_0 − 2,
‖ω′‖_0 ≤ ‖ω‖_0 − 2.


Applying this procedure r times, we obtain ω″ ∈ W_x with

‖ω″‖_{0,j} ≤ ‖ω‖_{0,j} − 2r,
‖ω″‖_{0,i} ≤ ‖ω‖_{0,i} for all 1 ≤ i ≤ d,
|ω″| ≤ |ω| + (‖ω‖_0 − 2)r.

(Hence it must be that 2r ≤ ‖ω‖_{0,j}.) We find that for any ω ∈ W_x with ‖ω‖_{0,j} > 0, there exists ω′ ∈ W_x with

‖ω′‖_{0,j} = 0,  ‖ω′‖_0 ≤ ‖ω‖_0,  and  |ω′| < |ω| + ½(‖ω‖_0)².

Repeating this at most d times, once for every 1 ≤ j ≤ d, we conclude with the following fact. For any ω ∈ W_x, there exists ω′ ∈ W_x with

‖ω′‖_0 = 0  and  |ω′| < |ω| + (d/2)(‖ω‖_0)².   (8.1)

Finally, note that since S̃ ∪ T generates G, we know that there exists ω ∈ W_x with |ω| = |x|_{S̃∪T}. Hence, using (8.1), there exists ω′ ∈ W_x with

|x|_T ≤ ‖ω′‖_T = |ω′| < |ω| + (d/2)(‖ω‖_0)² ≤ (d/2 + 1)(|x|_{S̃∪T})².  □

Proof of Theorem 8.2.1 We prove the theorem by induction on the nilpotent step n. If G is 1-step nilpotent, then G is Abelian, which is done in Exercise 8.6.

Assume that G is n-step nilpotent for n > 1. Let Q be a set of representatives for the cosets of [G, G]; that is, G = ⊔_{q∈Q} [G, G]q. So, for any x ∈ G there exist unique q_x ∈ Q and g_x ∈ [G, G] such that x = g_x q_x. We may choose Q so that for any x ∈ G we have that |x|_S ≥ |q_x|_S. Thus, |g_x|_S ≤ 2|x|_S. By Lemma 8.2.2, there exists a finite symmetric generating set T for [G, G] and a constant κ > 0 such that |g_x|_T ≤ κ(|g_x|_S)² ≤ 4κ(|x|_S)². This implies that the map ψ_r : B_S(1, r) → B_T(1, 4κr²) × (Q ∩ B_S(1, r)) given by ψ_r(x) = (g_x, q_x) is injective, so we have that |B_S(1, r)| ≤ |B_T(1, 4κr²)| · |Q ∩ B_S(1, r)|.

Let π : G → G/[G, G] be the canonical projection, and let S̃ = π(S). So S̃ is a finite symmetric generating set for G/[G, G]. Also, π maps Q ∩ B_S(1, r) injectively into B_S̃([G, G], r) ⊂ G/[G, G]. So it suffices to show that both [G, G] and G/[G, G] are of polynomial growth. Since γ_k([G, G]) ≤ γ_{k+1}(G), we have that [G, G] is a finitely generated (n − 1)-step nilpotent group. Thus, [G, G] has polynomial growth by induction. Also, G/[G, G] is a finitely generated Abelian group, so is also of polynomial growth.  □


For the sake of completeness, we present the correct order of growth for nilpotent groups, although we will not prove the precise bounds. These are due to Bass (1972) and Guivarc’h (1973).

Theorem 8.2.3 (Bass–Guivarc’h Formula) Let G be a finitely generated nilpotent group with γ_n(G) = {1}. Recall the lower central series γ_0(G) = G and γ_{k+1}(G) = [γ_k(G), G]. For every 0 ≤ k ≤ n − 1 we have that γ_k(G)/γ_{k+1}(G) is finitely generated and Abelian, so we may write d_k for the integer such that γ_k(G)/γ_{k+1}(G) ≅ Z^{d_k} × F_k with F_k finite. The growth of G is then given by: there exists a constant C > 0 such that for all r > 0 we have C^{−1} r^D ≤ |B(1, r)| ≤ C r^D, where D = Σ_{k=0}^{n−1} (k + 1) d_k.
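As an illustration of the Bass–Guivarc’h formula, consider the discrete Heisenberg group H_3(Z) of Exercise 8.31 below: there d_0 = 2 (the Abelianization is Z²) and d_1 = 1 (the center is Z), so D = 1·2 + 2·1 = 4. The sketch below computes small balls by breadth-first search in the generating set {A^{±1}, B^{±1}}; the two-sided bounds it checks are crude sanity bounds consistent with |B(1, r)| ≍ r⁴, not a verification of the theorem:

```python
from collections import deque

def heis_mul(x, y):
    """Multiply (a, b, c), encoding the matrix [[1,a,c],[0,1,b],[0,0,1]]."""
    a, b, c = x
    a2, b2, c2 = y
    return (a + a2, b + b2, c + c2 + a * b2)

def ball_sizes(radius):
    gens = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0)]  # A^{±1}, B^{±1}
    dist = {(0, 0, 0): 0}
    queue = deque([(0, 0, 0)])
    while queue:
        x = queue.popleft()
        if dist[x] == radius:
            continue
        for s in gens:
            y = heis_mul(x, s)
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    sizes = [0] * (radius + 1)
    for d in dist.values():
        sizes[d] += 1
    for r in range(1, radius + 1):
        sizes[r] += sizes[r - 1]
    return sizes

# Bass–Guivarc'h exponent: D = 1*d_0 + 2*d_1 = 1*2 + 2*1 = 4.
sizes = ball_sizes(10)
assert sizes[0] == 1 and sizes[1] == 5
# the ball contains all A^a B^b with |a|+|b| <= r, hence the lower bound;
# |a|,|b| <= r and |c| <= r^2 give the (very crude) upper bound
assert all(2 * r * r + 2 * r + 1 <= sizes[r] <= (2 * r + 1) ** 4
           for r in range(1, 11))
```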

Specifically, the growth exponent is an integer.

Recall that any non-amenable group must have exponential growth (see e.g. Exercise 6.50). We have already seen examples of polynomial growth groups (e.g. nilpotent groups) and of exponential growth groups (e.g. free groups). In 1968, Milnor asked if groups of intermediate growth are in fact possible (Milnor, 1968b). This was answered affirmatively only in 1980 by Grigorchuk (see Grigorchuk, 1980, 1984). Prior to Grigorchuk’s famous construction, various families of groups were shown to not contain any intermediate growth group. The Milnor–Wolf theorem tells us that in the class of solvable groups there are no groups of intermediate growth.

Theorem 8.2.4 (Milnor–Wolf theorem) If G is a finitely generated solvable group then G either has exponential growth or polynomial growth. In fact, G is a finitely generated solvable group of sub-exponential growth if and only if G is virtually nilpotent.

Recall that a group G is solvable if the derived series terminates after finitely many steps: G = G^{(0)} ▷ G^{(1)} ▷ · · · ▷ G^{(n)} = {1}, where G^{(j+1)} = [G^{(j)}, G^{(j)}]. A group G is nilpotent if the lower central series terminates after finitely many steps: G = γ_0(G) ▷ γ_1(G) ▷ · · · ▷ γ_n(G) = {1}, where γ_{j+1}(G) = [G, γ_j(G)]. This is a good place for the reader to recall these notions; see Section 1.5.

The Milnor–Wolf theorem is related to Gromov’s celebrated theorem, Theorem 9.0.1, to which Chapter 9 is devoted. Gromov’s theorem states that a finitely generated group has polynomial growth if and only if it is virtually nilpotent. So the second assertion in Theorem 8.2.4 is a special case of Gromov’s theorem for solvable groups. We now proceed to develop the required tools for the proof of the Milnor–Wolf theorem.


8.3 The Milnor Trick

Proposition 8.3.1 (Milnor’s trick) Let G be a finitely generated group of sub-exponential growth. Then, for any x, y ∈ G the subgroup ⟨x^{−n} y x^n : n ∈ Z⟩ is finitely generated.

Proof Fix x, y ∈ G and a finite symmetric generating set for G containing x, y. Consider the set

A = {x y^{j_1} x y^{j_2} · · · x y^{j_n} : j_1, . . . , j_n ∈ {0, 1}}.

One sees that the map (j_1, . . . , j_n) ↦ x y^{j_1} x y^{j_2} · · · x y^{j_n} maps {0, 1}^n into A, and that A ⊂ B(1, 2n). Since |B(1, 2n)| grows sub-exponentially, this implies that there must exist n and (j_1, . . . , j_n) ≠ (i_1, . . . , i_n) ∈ {0, 1}^n such that x y^{j_1} · · · x y^{j_n} = x y^{i_1} · · · x y^{i_n}. We may assume without loss of generality that j_n ≠ i_n (by cancelling out the terms on the right until arriving at j_n ≠ i_n). Using the notation y_k = x^k y x^{−k}, we get that

(y_1)^{j_1} · · · (y_n)^{j_n} = (y_1)^{i_1} · · · (y_n)^{i_n}.

We obtain that

(y_n)^{j_n − i_n} = (y_{n−1})^{−j_{n−1}} · · · (y_1)^{−j_1} · (y_1)^{i_1} · · · (y_{n−1})^{i_{n−1}}.

But since j_n ≠ i_n, this tells us that y_n ∈ ⟨y_1, . . . , y_{n−1}⟩. Conjugating by x^{−1} we have that y_{n+1} ∈ ⟨y_2, . . . , y_n⟩ ≤ ⟨y_1, y_2, . . . , y_{n−1}⟩. Continuing inductively we have that y_{n+k} ∈ ⟨y_1, . . . , y_n⟩ for all k. Repeating this argument with x^{−1} instead of x we get that for some n we have y_m ∈ ⟨y_{−n}, . . . , y, . . . , y_n⟩ for all m ∈ Z.  □

Exercise 8.9 Let G be a finitely generated group. Let H ◁ G such that G/H ≅ Z. Show that there exists a generating set S = {s_1, . . . , s_n} of G such that s_1 maps to 1 ∈ Z under the canonical projection, and such that s_j ∈ H for all j > 1. B solution C
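The pigeonhole step in the proof of Proposition 8.3.1 can be watched concretely in the Heisenberg group H_3(Z) of Exercise 8.31 (a test case chosen here purely for illustration, with the coordinate multiplication rule of that exercise). Taking x = A and y = B, the product x y^{j_1} · · · x y^{j_n} depends only on (n, Σ_i j_i, Σ_i i·j_i), so the 2^n words cannot stay distinct for long:

```python
from itertools import product

def heis_mul(x, y):
    """(a, b, c) encodes the matrix [[1,a,c],[0,1,b],[0,0,1]]."""
    a, b, c = x
    a2, b2, c2 = y
    return (a + a2, b + b2, c + c2 + a * b2)

A, B, E = (1, 0, 0), (0, 1, 0), (0, 0, 0)

def word_value(js):
    """Evaluate x y^{j_1} x y^{j_2} ... x y^{j_n} with x = A, y = B."""
    g = E
    for j in js:
        g = heis_mul(g, A)
        if j:
            g = heis_mul(g, B)
    return g

# Find the first n at which the 2^n products are no longer distinct.
n = 1
while len({word_value(js) for js in product((0, 1), repeat=n)}) == 2 ** n:
    n += 1
assert n == 4  # polynomial growth forces a collision already at n = 4
```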

Exercise 8.10 Let G be a finitely generated group of sub-exponential growth. Let H ◁ G such that G/H ≅ Z. Show that H is finitely generated. (Hint: use Exercise 1.60.) B solution C


Proposition 8.3.2 If G is a finitely generated group of sub-exponential growth, and H ◁ G such that G/H is Abelian, then H is finitely generated.

Proof Since G/H is finitely generated and Abelian, we have G/H ≅ Z^d × F for a finite group F. For d = 0 we have that [G : H] < ∞, and so H is finitely generated. The proof is by induction on d ≥ 1.

Let π : G → Z^d × F be the canonical projection. Let K = π^{−1}({0} × F), which is easily seen to be a normal subgroup of G. Also, K/H ≅ {0} × F ≅ F, so [K : H] < ∞. Thus, it suffices to prove that K is finitely generated. Note that G/K ≅ (G/H)/(K/H) ≅ Z^d. So we have a surjective homomorphism ψ : G → Z^d with K = Ker ψ.

The base case, d = 1, is Exercise 8.10. For d > 1 we proceed by induction as follows. We note that Z^d = Z^{d−1} × Z. Let ϕ : Z^d → Z^{d−1} be the projection onto the first d − 1 coordinates, ϕ(z_1, . . . , z_d) = (z_1, . . . , z_{d−1}). Let N = Ker(ϕ ◦ ψ). So K ◁ N ◁ G and G/N ≅ Z^{d−1}. By induction, N is finitely generated. Thus, N has sub-exponential growth, as a finitely generated subgroup of G, by Exercise 8.4. Now, ψ(N) = {z ∈ Z^d : z_1 = · · · = z_{d−1} = 0} ≅ Z, so N/K ≅ Z. This implies that K is finitely generated as well, again by Exercise 8.10, completing the induction step.  □

The above proposition will suffice for our purposes, but let us give an easy corollary of it.

Theorem 8.3.3 (Rosset’s theorem) If G is a finitely generated group of sub-exponential growth, and H ◁ G is such that G/H is solvable, then H is finitely generated.

Proof The proof is by induction on the length of the derived series of G/H. Recall from Section 1.5.5 that the derived series of a group Γ is Γ^{(0)} = Γ and Γ^{(k+1)} = [Γ^{(k)}, Γ^{(k)}]. Γ is n-step solvable if Γ^{(n)} = {1} and Γ^{(n−1)} ≠ {1}.

If G/H is 1-step solvable, then G/H is Abelian, which is exactly Proposition 8.3.2. Now assume that G/H is n-step solvable for some n > 1. If π : G → N = G/H is the canonical projection, let M = π^{−1}([N, N]). So H ◁ M and M/H ≅ [N, N] is (n − 1)-step solvable. Since G/M ≅ N/[N, N] is Abelian, we know that M is finitely generated by Proposition 8.3.2. So M has sub-exponential growth as a subgroup of G by Exercise 8.4. Since M/H is (n − 1)-step solvable, H is finitely generated by induction.  □


Exercise 8.11 A group G is called torsion if every element of G has finite order. Show that if G is a finitely generated solvable group of sub-exponential growth and G is torsion, then G is finite. B solution C

Now, since finitely generated nilpotent groups have polynomial growth, we can use Rosset’s theorem (Theorem 8.3.3) to show that all their subgroups are finitely generated.

Theorem 8.3.4 Let G be a finitely generated nilpotent group. Then, any subgroup of G is finitely generated.

Proof This is by induction on the nilpotent step n of G. The base case, where γ_1(G) = {1}, is the case where G is finitely generated and Abelian. In this case any subgroup is normal, so if H ≤ G, then since G/H is a finitely generated Abelian group and G has sub-exponential growth, H is finitely generated by Rosset’s theorem (Theorem 8.3.3; in this case even Proposition 8.3.2 suffices).

For the induction step, when n > 1, let H be any subgroup. Consider H ∩ [G, G]. By Rosset’s theorem (Theorem 8.3.3), since G has polynomial growth and G/[G, G] is Abelian, we know that [G, G] is finitely generated. Also, γ_{n−1}([G, G]) ≤ γ_n(G) = {1}, so [G, G] is finitely generated and nilpotent of step at most n − 1. By induction we know that H ∩ [G, G] is finitely generated. Also, if π : G → G/[G, G] is the canonical projection, then π(H) ≅ H/(H ∩ [G, G]) is a subgroup of the finitely generated Abelian group G/[G, G]. Thus, H/(H ∩ [G, G]) is finitely generated. By Exercise 1.62, H is finitely generated, completing the induction step.  □

8.4 Characteristic Subgroups

Recall that a subgroup H ≤ G is normal if H^g = H for all g ∈ G. That is, the inner automorphisms of G (conjugations) preserve the subgroup H. Another notion is that of a characteristic subgroup.

Definition 8.4.1 A subgroup H ≤ G is characteristic in G if any automorphism ϕ ∈ Aut(G) preserves H; that is, ϕ(H) = H.

Since conjugation by g ∈ G is an automorphism, any characteristic subgroup is also normal. However, being characteristic is a stronger notion than normality.


Exercise 8.12 Show that if H is a characteristic subgroup of N and N ◁ G is a normal subgroup of G, then H is normal in G. B solution C

Theorem 8.4.2 Let G be a finitely generated group. Let H ≤ G be a finite-index subgroup, [G : H] < ∞. Then, there exists a subgroup N ≤ H such that [G : N] < ∞ and N is characteristic in G.

Proof Assume that [G : H] = n. By Theorem 1.5.12, the set X = {K ≤ G : [G : K] = n} is a finite set (in fact |X| ≤ (n!)^d, where d is the number of generators of G). Let N = ∩_{K∈X} K. Since [G : A ∩ B] ≤ [G : A] · [G : B] (Proposition 1.3.3), we have that [G : N] ≤ n^{|X|} < ∞. Moreover, for any automorphism ϕ ∈ Aut(G) and any K ∈ X, we have that [G : ϕ(K)] = [G : K] = n, so that ϕ(K) ∈ X as well. Thus, ϕ(N) = N, proving that N is a characteristic subgroup of G.  □

Example 8.4.3 We have already seen in Exercise 1.35 that γ_n(G) and Z_n(G) are characteristic subgroups of G.

Exercise 8.13 Recall that for two subsets A, B ⊂ G, we define

[A, B] = ⟨[a, b] : a ∈ A, b ∈ B⟩.

Show that if N, K ≤ G are two characteristic subgroups of G, then [N, K] is also a characteristic subgroup of G. B solution C

Exercise 8.14 For a group G and n > 0 define

γ̄_n(G) = {x ∈ γ_{n−1}(G) : ∃ 0 ≠ k ∈ Z, x^k ∈ γ_n(G)}.

Show that γ̄_n(G) is a characteristic subgroup of G. B solution C

Exercise 8.15 Show that if G is finitely generated, then γ_n(G)/γ̄_{n+1}(G) ≅ Z^{d_n} for some integer d_n = d_n(G) ≥ 0. B solution C

Exercise 8.16 Show that if d_n(G) = 0 then γ_{n+k}(G) = γ̄_{n+k+1}(G) = γ_{n+k+1}(G) for all k ≥ 1. Conclude that if d_n(G) = 0 then d_{n+1}(G) = 0. B solution C

Exercise 8.17 Show that if ϕ ∈ Aut(G) then it induces an automorphism ϕ_n ∈ Aut(γ_n(G)/γ̄_{n+1}(G)) given by

ϕ_n(γ̄_{n+1}(G)x) := γ̄_{n+1}(G)ϕ(x). B solution C


8.5 Z-extensions and Nilpotent Groups

Recall the notion of a semi-direct product of groups from Section 1.5.9 and Exercise 1.68. We have already seen examples in Exercises 1.70, 1.71, and 1.72, as well as in the lamplighter groups in Section 6.9. One particular case is known as a Z-extension of a group, which we now explain in more detail. If ϕ ∈ Aut(G) then Z acts on G via ϕ (specifically, a ∈ Z acts on x ∈ G by ϕ^a(x)). This gives a new group H = Z ⋉_ϕ G, which is defined as follows. Elements of H are pairs (a, x) for a ∈ Z, x ∈ G. Multiplication is given by (a, x)(b, y) = (a + b, x ϕ^a(y)).

Exercise 8.18 Show that in Z ⋉_ϕ G, for any x ∈ G and a, b ∈ Z, we have (a, x)^{(b,1)} = (a, ϕ^{−b}(x)). Show that if ϕ ∈ Aut(G) and ψ = ϕ^m for m ∈ Z\{0}, then there is a normal subgroup H ◁ Z ⋉_ϕ G such that (Z ⋉_ψ G) ≅ H and [(Z ⋉_ϕ G) : H] = |m|. Show that if G = ⟨S⟩ then Z ⋉_ϕ G is generated by S̃ = {(0, s), (±1, 1) : s ∈ S}. Show that with this generating set, if S is a finite symmetric generating set for G, we have |(a, x)|_{S̃} ≤ |a| + |x|_S. B solution C

Exercise 8.19 Let G be a finitely generated group. Let K ◁ G be such that G/K ≅ Z. Show that G ≅ Z ⋉_ϕ K for some ϕ ∈ Aut(K). Show that {0} × K ⊂ Z ⋉_ϕ K is a subgroup isomorphic to K. Show that if G has growth ∼ (r ↦ r^d) then K is finitely generated with growth ≽ (r ↦ r^{d−1}). B solution C

Exercise 8.20 Let G be a group and let ϕ ∈ Aut(G). Let H ◁ G be a subgroup such that ϕ(H) = H. Show that the map ψ(Hx) = Hϕ(x) is a well-defined automorphism of G/H. Let N = {0} × H ⊂ Z ⋉_ϕ G. Show that N is a normal subgroup of Z ⋉_ϕ G, and that N ≅ H. Show that (Z ⋉_ϕ G)/N = Z ⋉_ψ (G/H). B solution C

Exercise 8.21 Show that if F ◁ G is a finite normal subgroup such that G/F ≅ Z, then there exists a finite-index subgroup H ≤ G, [G : H] < ∞, such that H ≅ Z. B solution C
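The multiplication rule (a, x)(b, y) = (a + b, x·ϕ^a(y)) is easy to experiment with. Below is a minimal sketch of Z ⋉_ϕ Z², where ϕ is given by an integer matrix M ∈ GL_2(Z) (the particular matrix is an assumption chosen for illustration); it checks associativity on sample triples and the conjugation identity (a, x)^{(b,1)} = (a, ϕ^{−b}(x)) of Exercise 8.18:

```python
# Z ⋉_ϕ Z^2 with ϕ(x) = M x for M in GL_2(Z); a minimal sketch.
M = ((2, 1), (1, 1))          # det = 1, so M is invertible over Z
M_inv = ((1, -1), (-1, 2))    # inverse of M, also integral

def apply(mat, v):
    return (mat[0][0] * v[0] + mat[0][1] * v[1],
            mat[1][0] * v[0] + mat[1][1] * v[1])

def phi_pow(a, v):
    """ϕ^a(v), using M for a > 0 and M^{-1} for a < 0."""
    mat = M if a > 0 else M_inv
    for _ in range(abs(a)):
        v = apply(mat, v)
    return v

def mul(g, h):
    (a, x), (b, y) = g, h
    fy = phi_pow(a, y)
    return (a + b, (x[0] + fy[0], x[1] + fy[1]))

def inv(g):
    """(a, x)^{-1} = (-a, ϕ^{-a}(x^{-1}))."""
    a, x = g
    return (-a, phi_pow(-a, (-x[0], -x[1])))

e = (0, (0, 0))
g, h, k = (2, (1, 0)), (-1, (0, 3)), (1, (-2, 5))
assert mul(g, inv(g)) == e and mul(inv(g), g) == e
assert mul(mul(g, h), k) == mul(g, mul(h, k))   # associativity (sampled)
# conjugation identity of Exercise 8.18: (a,x)^{(b,1)} = (a, ϕ^{-b}(x))
a, b, x = 2, 3, (1, -1)
conj = mul(mul(inv((b, (0, 0))), (a, x)), (b, (0, 0)))
assert conj == (a, phi_pow(-b, x))
```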


We require some results regarding matrices with integer entries. Recall that GL_n(Z) is the group of n × n matrices with integer entries and determinant in {−1, 1}.

Exercise 8.22 Let M ∈ M_n(Z) be an n × n matrix with integer entries. Show that if all (possibly complex) eigenvalues of M have modulus 1, then all eigenvalues of M are roots of unity. B solution C
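A concrete instance of Exercise 8.22 (an illustrative example, not the proof): the rotation matrix M = [[0, −1], [1, 0]] ∈ GL_2(Z) has eigenvalues ±i, both of modulus 1, and they are indeed 4th roots of unity; equivalently M⁴ = I. The contrasting matrix below has an eigenvalue of modulus greater than 1, and no power of it is the identity:

```python
def mat_mul(X, Y):
    """2x2 integer matrix product."""
    return ((X[0][0]*Y[0][0] + X[0][1]*Y[1][0], X[0][0]*Y[0][1] + X[0][1]*Y[1][1]),
            (X[1][0]*Y[0][0] + X[1][1]*Y[1][0], X[1][0]*Y[0][1] + X[1][1]*Y[1][1]))

I2 = ((1, 0), (0, 1))
M = ((0, -1), (1, 0))  # characteristic polynomial λ^2 + 1; eigenvalues ±i

P = I2
for _ in range(4):
    P = mat_mul(P, M)
assert P == I2  # M^4 = I, so the eigenvalues are 4th roots of unity

# contrast: ((2,1),(1,1)) has an eigenvalue (3+√5)/2 > 1; no power is I
Q, N = I2, ((2, 1), (1, 1))
for _ in range(12):
    Q = mat_mul(Q, N)
    assert Q != I2
```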

Exercise 8.23 Show that if ϕ is an automorphism of Z^d, then there exists a matrix M ∈ GL_d(Z) such that ϕ(x) = Mx for all x ∈ Z^d. B solution C

Since any M ∈ GL_d(Z) is an automorphism of Z^d, we conclude that Aut(Z^d) ≅ GL_d(Z).

Exercise 8.24 Let G be a group isomorphic to Z^d. Let i, j : G → Z^d be group isomorphisms. Let ϕ ∈ Aut(G) and consider ϕ_i = i ◦ ϕ ◦ i^{−1} and ϕ_j = j ◦ ϕ ◦ j^{−1}. Note that ϕ_i, ϕ_j are automorphisms of Z^d. Let M_i, M_j ∈ GL_d(Z) be matrices such that ϕ_i(z) = M_i z and ϕ_j(z) = M_j z for all z ∈ Z^d. Prove that M_i = U^{−1} M_j U for some invertible matrix U. B solution C

Using the above, we now proceed to define the notion of eigenvalues of a group automorphism.

Definition 8.5.1 Let G be a finitely generated group. Let ϕ ∈ Aut(G). Set γ_n = γ_n(G) and γ̄_n = γ̄_n(G). Consider the induced automorphisms guaranteed by Exercise 8.17, ϕ_n ∈ Aut(γ_n/γ̄_{n+1}); that is, satisfying ϕ_n(γ̄_{n+1}x) = γ̄_{n+1}ϕ(x) for all x ∈ γ_n. By Exercise 8.15, there exists a group isomorphism i_n : γ_n/γ̄_{n+1} → Z^{d_n}, where d_n > 0 if and only if γ_n ≠ γ̄_{n+1}. Thus, ψ_n := i_n ◦ ϕ_n ◦ (i_n)^{−1} is an automorphism of Z^{d_n}. If d_n > 0, then there exists M_n ∈ GL_{d_n}(Z) such that ψ_n(z) = M_n z for all z ∈ Z^{d_n} (Exercise 8.23). If d_n > 0, then let λ_{n,1}, . . . , λ_{n,d_n} denote the (possibly complex) eigenvalues of M_n, with multiplicities. Note that these eigenvalues are independent of the specific choice of isomorphism i_n, by Exercise 8.24. Let J = inf{j : d_j = 0}, where J = ∞ if d_j > 0 for all j. Exercise 8.16 tells us that if J < ∞ then d_j = 0 for all j ≥ J.


The (possibly infinite) list of numbers (λ_{j,1}, . . . , λ_{j,d_j})_{j=0}^{J−1} are called the eigenvalues of the automorphism ϕ. (If d_0 = 0 then ϕ has no eigenvalues.)

We will now study Z-extensions of nilpotent groups and see that the eigenvalues of the implicit automorphism provide a lot of information on the structure of the Z-extension.

Lemma 8.5.2 Let G be a finitely generated nilpotent group. Let ϕ ∈ Aut(G). If ϕ has an eigenvalue λ with |λ| ≠ 1 then Z ⋉_ϕ G has exponential growth.

Proof By Exercise 8.18, for an appropriate m ∈ Z we have, with ψ = ϕ^m, that H ≅ Z ⋉_ψ G is of finite index in Z ⋉_ϕ G and ψ has an eigenvalue λ with |λ| < 1/2. It suffices to prove that Z ⋉_ψ G has exponential growth. So we may assume without loss of generality that ϕ has an eigenvalue λ with 0 < |λ| < 1/2.

The eigenvalue λ must be λ = λ_{j,i} for some 0 ≤ j < n and 1 ≤ i ≤ d_j = d_j(G); that is, an eigenvalue of the matrix M_j ∈ GL_{d_j}(Z) corresponding to the induced automorphism ϕ_j on γ_j(G)/γ̄_{j+1}(G) ≅ Z^{d_j}. In other words, for the canonical projection π : γ_j(G) → γ_j(G)/γ̄_{j+1}(G), there is some isomorphism α : γ_j(G)/γ̄_{j+1}(G) → Z^{d_j} such that for any y ∈ γ_j(G), we have that ϕ_j(γ̄_{j+1}(G)y) = ϕ_j(π(y)) = α^{−1} M_j α(π(y)). Set d = d_j, M = M_j, and ψ = ϕ_j. So ψ = α^{−1} ◦ M ◦ α. Recall that by Exercise 8.17, for any y ∈ γ_j(G) we have that ψ(π(y)) = π(ϕ(y)).

Let v ∈ C^d be an eigenvector of M* with eigenvalue λ̄ (recall that the eigenvalues of M* are the complex conjugates of the eigenvalues of M). Scaling v by an appropriate scalar, we may assume that ⟨e_ℓ, v⟩ = 1 for one of the standard basis vectors e_ℓ ∈ Z^d. Let x ∈ γ_j(G) be such that α ◦ π(x) = e_ℓ. So for all a ∈ Z, M^a e_ℓ = α ◦ ψ^a ◦ π(x) = α ◦ π(ϕ^a(x)). Thus,

⟨α ◦ π(ϕ^a(x)), v⟩ = ⟨M^a e_ℓ, v⟩ = ⟨e_ℓ, (M*)^a v⟩ = λ^a.

Now, assume that (0, x), (0, 1), (−1, 1), (1, 1) are all in the generating set of H = Z ⋉_ϕ G. Then, in the group H, for any ε = (ε_0, . . . , ε_k) ∈ {0, 1}^{k+1}, consider the elements

g_ε = (0, x^{ε_0})(−1, 1)(0, x^{ε_1}) · · · (−1, 1)(0, x^{ε_k}) · (k, 1).


Note that |g_ε| ≤ 2(k + 1) − 1 + k + 1 < 3(k + 1). Also, by Exercise 8.18,

g_ε = (0, x^{ε_0})^{(1,1)^0} · (0, x^{ε_1})^{(1,1)^1} · · · (0, x^{ε_k})^{(1,1)^k} = (0, ϕ^0(x^{ε_0}) · ϕ^{−1}(x^{ε_1}) · · · ϕ^{−k}(x^{ε_k})).

Assume that for some ε, ε′ ∈ {0, 1}^{k+1} we have g_ε = g_{ε′}. Then,

ϕ^0(x^{ε_0}) · · · ϕ^{−k}(x^{ε_k}) = ϕ^0(x^{ε′_0}) · · · ϕ^{−k}(x^{ε′_k}).

Using the fact that α ◦ π is a homomorphism, we obtain

Σ_{i=0}^k ε_i λ^{−i} = Σ_{i=0}^k ⟨α ◦ π(ϕ^{−i}(x^{ε_i})), v⟩ = ⟨α ◦ π(ϕ^0(x^{ε_0}) · · · ϕ^{−k}(x^{ε_k})), v⟩
= ⟨α ◦ π(ϕ^0(x^{ε′_0}) · · · ϕ^{−k}(x^{ε′_k})), v⟩ = Σ_{i=0}^k ⟨α ◦ π(ϕ^{−i}(x^{ε′_i})), v⟩ = Σ_{i=0}^k ε′_i λ^{−i}.

Since 0 < |λ| < 1/2, this is only possible if ε = ε′. We conclude that the map ε ↦ g_ε from {0, 1}^{k+1} into B_H(1, 3(k + 1)) is injective, and hence |B_H(1, 3(k + 1))| ≥ 2^{k+1}. This implies exponential growth of H = Z ⋉_ϕ G.  □
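The counting scheme in the proof of Lemma 8.5.2 can be checked numerically in the simplest setting G = Z² with ϕ(x) = Mx, M = [[2, 1], [1, 1]] (a hypothetical example; this is the mirrored variant of the proof, with top eigenvalue (3 + √5)/2 of modulus greater than 2 rather than |λ| < 1/2). The 2^{k+1} integer vectors Σ_i ε_i M^i e_1 are pairwise distinct, which is the linear-algebra heart of the injectivity of ε ↦ g_ε:

```python
from itertools import product

M = ((2, 1), (1, 1))  # eigenvalues (3 ± √5)/2, both of modulus ≠ 1

def apply(mat, v):
    return (mat[0][0]*v[0] + mat[0][1]*v[1],
            mat[1][0]*v[0] + mat[1][1]*v[1])

def signed_sum(eps):
    """Σ_i eps_i * M^i e_1  for eps ∈ {0,1}^{k+1}."""
    v, total = (1, 0), (0, 0)
    for e in eps:
        if e:
            total = (total[0] + v[0], total[1] + v[1])
        v = apply(M, v)  # advance v to M^{i+1} e_1 for the next step
    return total

k = 9
values = {signed_sum(eps) for eps in product((0, 1), repeat=k + 1)}
assert len(values) == 2 ** (k + 1)  # all 1024 sums are distinct
```

Pairing each distinct sum with a word of length O(k), as in the proof, yields exponentially many elements in linearly sized balls.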

Exercise 8.25 Show that if N ◁ G is such that G/N is virtually nilpotent and [G, N] = {1}, then G is virtually nilpotent. B solution C

Exercise 8.26 Show that if F C G is a finite normal subgroup and G/F is virtually nilpotent, then G is virtually nilpotent. B solution C

Exercise 8.27 Let Γ ≅ Z^d and let ϕ ∈ Aut(Γ). Assume all eigenvalues of ϕ are exactly 1. Show that there exists a nontrivial y ∈ Γ such that ϕ(y) = y. B solution C

Exercise 8.28 Let K ≤ Z^d. Show that if K is nontrivial, then Z^d/K ≅ Z^m × F for some m < d and a finite Abelian group F. B solution C

Exercise 8.29 Let G be a finitely generated Abelian group. Let ϕ ∈ Aut(G). Assume that K ≤ G is a subgroup such that ϕ(K) = K and γ̄_1(G) ≤ K. Let ψ ∈ Aut(G/K) be the automorphism given by ψ(Kx) = Kϕ(x). Show that any eigenvalue of ψ is an eigenvalue of ϕ. B solution C


Exercise 8.30 Let G be an infinite finitely generated (n + 1)-step nilpotent group, so γ_n(G) ≠ {1} = γ_{n+1}(G). Let N be a subgroup such that N ≤ γ_n(G). Show that N ◁ G. Let π : G → G/N be the canonical projection. Show that for all k ≥ 0 we have γ_k(π(G)) = π(γ_k(G)). Show that γ̄_{k+1}(π(G)) = π(γ̄_{k+1}(G)) for all 0 ≤ k ≤ n. Show that if k > n then γ̄_{k+1}(π(G)) = {1}. B solution C

Lemma 8.5.3 Let G be an infinite finitely generated nilpotent group. Let ϕ ∈ Aut(G). Assume that all eigenvalues of ϕ are 1. Then Z ⋉_ϕ G is virtually nilpotent.

Proof Let n be such that γ_n(G) ≠ {1} = γ_{n+1}(G) (i.e. G is (n + 1)-step nilpotent). Recall that d_k(G) ∈ N is the nonnegative integer for which γ_k(G)/γ̄_{k+1}(G) ≅ Z^{d_k(G)} (Exercise 8.15). Set D(G) = Σ_{k=0}^∞ d_k(G). Since γ_k(G) = {1} for all k ≥ n + 1, we know that D(G) = Σ_{k=0}^n d_k(G). We will prove the lemma by induction on D(G).

Denote H = Z ⋉_ϕ G, Γ = γ_n(G), and Γ̄ = γ̄_{n+1}(G). Let π : G → G/Γ̄ be the canonical projection. We have that Γ̄ is a finitely generated Abelian group all of whose elements have finite order. Indeed, by Exercise 8.7, Γ is a finitely generated group. Since [Γ, Γ] ≤ γ_{n+1}(G) = {1}, we have that Γ is an Abelian group. By Theorem 1.5.2, we know that Γ ≅ Z^d × F for some finite Abelian group F. Since any element of Γ̄ has finite order, if i : Γ → Z^d × F is an isomorphism, then i(Γ̄) ⊂ {0} × F. We conclude that Γ̄ is a finite Abelian group.

Since Γ̄ is a characteristic subgroup of G (Exercise 8.14), Exercise 8.20 gives that {0} × Γ̄ ◁ H is a finite normal subgroup with H/({0} × Γ̄) ≅ Z ⋉_ψ π(G), where ψ ∈ Aut(π(G)) is given by ψ(π(x)) = π(ϕ(x)). By Exercise 8.26 it suffices to prove that Z ⋉_ψ π(G) is virtually nilpotent.

By Exercise 8.29, ψ has all eigenvalues equal to 1. Since π(Γ) ≅ Z^{d_n(G)}, by Exercise 8.27 there exists y ∈ Γ\Γ̄ such that ψ(π(y)) = π(y). By the definition of Γ̄, if y^m ∈ Γ̄ for some 0 ≠ m ∈ Z, then (y^m)^ℓ = 1 for some ℓ ∈ Z, so y ∈ Γ̄. Hence, π(y) ∈ π(Γ) is an element of infinite order. Specifically, if we define N = ⟨Γ̄, y⟩, we have that π(N) = ⟨π(y)⟩ ≅ Z.

For a, m ∈ Z and x ∈ G we can compute commutators in Z ⋉_ψ π(G), using that π(y^m) is central in π(G) (as γ_n(G) is central in G) and fixed by ψ:

[(a, π(x)), (0, π(y^m))] = (−a, ψ^{−a}(π(x^{−1})) ψ^{−a}(π(y^{−m}))) · (a, π(x) ψ^a(π(y^m))) = (0, ψ^{−a}(π(x^{−1})) ψ^{−a}(π(x))) = 1_{Z⋉_ψ π(G)}.


That is, [Z ⋉_ψ π(G), {0} × π(N)] = {1}. Thus, by Exercises 8.20 and 8.25, it suffices to prove that (Z ⋉_ψ π(G))/({0} × π(N)) ≅ Z ⋉_φ (π(G)/π(N)) is virtually nilpotent, where φ(π(N)π(x)) = π(N)ψ(π(x)). Since

π(G)/π(N) ≅ (G/Γ̄)/(N/Γ̄) ≅ G/N,

by Exercise 8.30 we may conclude the following facts:

γ_k(π(G)/π(N))/γ̄_{k+1}(π(G)/π(N)) ≅ γ_k(G/N)/γ̄_{k+1}(G/N) ≅ π(γ_k(G))/π(γ̄_{k+1}(G)) ≅ γ_k(G)/γ̄_{k+1}(G)

for all 0 ≤ k < n. So d_k(π(G)/π(N)) = d_k(G) for all 0 ≤ k < n. If k > n then γ_k(π(G)) = π(γ_k(G)) = {1}, so d_k(π(G)) = 0. Finally, for k = n we have that, since Γ̄ = γ̄_{n+1}(G) ≤ N,

γ̄_{n+1}(π(G)/π(N)) ≅ γ̄_{n+1}(G/N) = {1}.

Also, γ_n(G/Γ̄) ≅ γ_n(G)/Γ̄ = Γ/Γ̄, so

γ_n(G/N) ≅ γ_n(G)/N ≅ γ_n(G/Γ̄)/(N/Γ̄) ≅ Γ/N.

Note that π(Γ) ≅ Γ/Γ̄ ≅ Z^{d_n(G)}. Since Γ̄ is a finite subgroup of the infinite group N, π(N) cannot be trivial. Since π(N) ≤ π(Γ) ≅ Z^{d_n(G)} is a nontrivial subgroup, by Exercise 8.28 we have that Γ/N ≅ π(Γ)/π(N) ≅ Z^m × F for a finite Abelian group F and some m < d_n(G). Specifically, since γ_n(π(G)/π(N)) ≅ γ_n(G/N) ≅ Γ/N, we find that d_n(π(G)/π(N)) = d_n(G/N) < d_n(G).

In conclusion, we have that D(π(G)/π(N)) < D(G). Thus, Z ⋉_φ (π(G)/π(N)) is virtually nilpotent by induction, completing the induction step.  □

Lemma 8.5.4 Let G be an infinite finitely generated nilpotent group. Let ϕ ∈ Aut(G). If all eigenvalues of ϕ have modulus 1 then Z ⋉_ϕ G is virtually nilpotent.


Proof Since G is nilpotent, ϕ has finitely many eigenvalues (Exercise 8.16). By Exercise 8.22, the eigenvalues of ϕ are all roots of unity. Thus, there exists k > 0 such that all eigenvalues of ψ = ϕ^k ∈ Aut(G) are exactly 1. So Lemma 8.5.3 tells us that Z ⋉_ψ G is virtually nilpotent. By Exercise 8.18, Z ⋉_ψ G is isomorphic to a finite-index subgroup of Z ⋉_ϕ G. So Z ⋉_ϕ G is virtually nilpotent as well.  □

Let us summarize Lemmas 8.5.2 and 8.5.4. A Z-extension of a finitely generated nilpotent group G can either be virtually nilpotent (and thus have polynomial growth), or must have exponential growth. The former case occurs exactly when all eigenvalues of the implicit automorphism of G have modulus 1. Note that this is a special case of the Milnor–Wolf theorem (Theorem 8.2.4) for groups that are Z-extensions of finitely generated nilpotent groups.

8.6 Proof of the Milnor–Wolf Theorem

One consequence of Lemmas 8.5.2 and 8.5.4 is the following.

Corollary 8.6.1 Let G be a finitely generated group of sub-exponential growth. Assume that the commutator subgroup [G, G] is virtually nilpotent. Then G is virtually nilpotent.

Proof Since G/[G, G] is Abelian and since G has sub-exponential growth, we know that [G, G] is finitely generated by Rosset’s theorem (Theorem 8.3.3). Denote

Γ̄ = γ̄_1(G) = {x ∈ G : ∃ 0 ≠ m ∈ Z, x^m ∈ [G, G]}.

Since Γ̄/[G, G] is an Abelian group whose elements are all of finite order, we know that [Γ̄ : [G, G]] < ∞. Specifically, Γ̄ is virtually nilpotent by our assumption on [G, G]. By Exercises 8.15 and 1.50, we know that G/Γ̄ ≅ Z^d for some d > 0. We may find normal subgroups Γ̄ = K_d ◁ · · · ◁ K_1 ◁ K_0 = G such that K_j/K_{j+1} ≅ Z for all 0 ≤ j < d. Using Rosset’s theorem (Theorem 8.3.3) on these we get that K_1, . . . , K_d are all finitely generated. Exercise 8.19 gives that K_j ≅ Z ⋉_{ψ_j} K_{j+1} for all 0 ≤ j < d, for some ψ_j ∈ Aut(K_{j+1}).

Assume that G is not virtually nilpotent. Then there exists some 0 ≤ j < d such that K_j is not virtually nilpotent, but K_{j+1} is virtually nilpotent (recall that Γ̄ = K_d is virtually nilpotent). Then, since K_j ≅ Z ⋉_{ψ_j} K_{j+1} has sub-exponential growth (Exercise 8.4), Lemma 8.5.2 tells us that it must be that all eigenvalues


of ψ_j are of modulus 1. Thus, by Lemma 8.5.4 we have that K_j is virtually nilpotent, a contradiction!  □

We are now ready to prove the Milnor–Wolf theorem, Theorem 8.2.4, which basically states that if G is a finitely generated solvable group of sub-exponential growth, then G is virtually nilpotent.

Proof of Theorem 8.2.4 Recall the derived series of G, which is G^{(0)} = G, G^{(j+1)} = [G^{(j)}, G^{(j)}]. Let n be such that G^{(n+1)} = {1} and G^{(n)} ≠ {1}. The proof is by induction on n.

Base case, n = 0. In this case G is Abelian, which has polynomial growth by Exercise 8.6.

For the induction step, assume n > 0. Since G has sub-exponential growth, and since G/G^{(1)} is Abelian, we know that G^{(1)} is finitely generated by Rosset’s theorem (Theorem 8.3.3). So G^{(1)} is finitely generated of sub-exponential growth (Exercise 8.4). Since (G^{(1)})^{(n)} = G^{(n+1)} = {1}, by induction G^{(1)} = [G, G] is virtually nilpotent. By Corollary 8.6.1, this implies that G is virtually nilpotent as well.  □

8.7 Additional Exercises

Recall the groups H_n(Z) from Exercise 1.46. Consider the case n = 3, which is the so-called Heisenberg group,

H_3(Z) = {((c, b, a)) : a, b, c ∈ Z}, where ((c, b, a)) =
[ 1 a c ]
[ 0 1 b ]
[ 0 0 1 ].

Exercise 8.31 Let A = ((0, 0, 1)), B = ((0, 1, 0)), and C = ((1, 0, 0)). Show that ((c, b, a)) = C^c B^b A^a. Show that [A, B] = C. Conclude that A, B generate H_3(Z).

Exercise 8.32 Show that Z(H_3(Z)) = ⟨C⟩ = {((c, 0, 0)) : c ∈ Z}.

⊲ solution ⊳

⊲ solution ⊳

Exercise 8.33 Let G be a group. Let x, y ∈ G be such that [x, y] ∈ Z(G). Show that for all integers n, m we have that [x^n, y^m] = [x, y]^{nm}. ⊲ solution ⊳

Exercise 8.34 Consider the Cayley graph of H_3(Z) with respect to the symmetric generating set {A, A^{-1}, B, B^{-1}}, for A, B given in Exercise 8.31. Show that


there exists a constant κ > 0 such that |((c, b, a))| ≤ κ(√(|c| + 1) + |b| + |a|).

⊲ solution ⊳

Exercise 8.35 Let G be a finitely generated group with finite symmetric generating sets S, S′. Let H ◁ G be a finitely generated normal subgroup, with finite symmetric generating sets T, T′. The distortion of H in G is defined to be

δ_{(G,S),(H,T)}(r) = max{|x|_T : x ∈ H, |x|_S ≤ r},

which is easily seen to be a nondecreasing function. Show that δ_{(G,S),(H,T)} ∼ δ_{(G,S′),(H,T′)} (where ∼ is the equivalence relation on growth functions). Show that if the growth of H is at most ψ, the distortion of H in G is at most δ, and the growth of G/H is at most φ, then the growth of G is at most φ · (ψ ∘ δ). Conclude that if H has polynomial distortion in G and G/H has polynomial growth, then G has polynomial growth. ⊲ solution ⊳

Exercise 8.36 Show that the growth of H_3(Z) is r ↦ r^4. ⊲ solution ⊳

Exercise 8.37 Let G be a group and let H, K be subgroups of G such that:
• ⟨K, H⟩ = G,
• H ◁ G,
• H ∩ K = {1_G}.
Show that K acts on H by conjugation and that K ⋉ H ≅ G. ⊲ solution ⊳

Exercise 8.38 Let G = K ⋉ H for some K acting on H. Let K̃ = K × {1_H} ⊂ G and H̃ = {1_K} × H ⊂ G. Show that H̃ ∩ K̃ = {1_G}. Show that ⟨H̃, K̃⟩ = G. Show that H̃ ◁ G and H̃ ≅ H. ⊲ solution ⊳


8.8 Remarks

Nilpotent groups were shown to have polynomial growth by Wolf (1968). Later, Bass (1972) and Guivarc'h (1973) provided the precise degree of the polynomial growth rate (and showed that it is in fact an integer). The Milnor–Wolf theorem (Theorem 8.2.4) was published in Milnor (1968a) and Wolf (1968), and is a special case of Gromov's theorem (Theorem 9.0.1) for the class of solvable groups. Our treatment of the results in this chapter is based on Druţu and Kapovich (2018). According to Druţu and Kapovich (2018), the provided proof of Lemmas 8.5.3 and 8.5.4 is due to B. Plotkin.

Related to the Milnor–Wolf theorem, but outside the scope of this book, is another dichotomy known as the Tits alternative, proved by Jacques Tits (1972).

Theorem 8.8.1 (Tits alternative) Let V be a finite-dimensional vector space over some field F. Let G be a finitely generated subgroup of GL(V). Then, either G contains a subgroup isomorphic to F_2 (the free group generated by two elements), or G is virtually solvable.

Together with the Milnor–Wolf theorem, the Tits alternative yields Gromov's theorem for finitely generated linear groups (i.e. subgroups of GL(V) for finite-dimensional V). Note that it implies that any finitely generated amenable linear group is virtually solvable.

8.9 Solutions to Exercises

Solution to Exercise 8.3 :( We showed in Section 1.6.1 (Exercise 1.75) that if S, T are two finite symmetric generating sets of G, then there exists a constant κ = κ_{S,T} > 0 such that for all x, y ∈ G,

κ^{-1} · dist_T(x, y) ≤ dist_S(x, y) ≤ κ · dist_T(x, y)

(i.e. the metrics are bi-Lipschitz). Thus, B_S(1, r) ⊂ B_T(1, κr) for all r ≥ 0. So all Cayley graphs (of the same group) have equivalent growth functions. Since |B_S(1, r)| ≤ |S|^r, it is immediate that exponential growth is the maximal possible. :) X

Solution to Exercise 8.4 :( If H ≤ G is finitely generated, then we may choose T ⊂ S such that S is a finite symmetric generating set of G and T is a finite symmetric generating set of H. It is then immediate that for any x ∈ H we have |x|_S ≤ |x|_T (because any word in T is also a word in S). Thus, B_T(1, r) ⊂ B_S(1, r) ∩ H, which immediately implies the first assertion.

For the second assertion, let N ◁ G and let π : G → N\G be the canonical projection. If S is a finite symmetric generating set for G, then π(S) is a finite symmetric generating set for N\G. Now, if |Nx|_{π(S)} = r, then there exist s_1, …, s_r ∈ S such that Nx = N s_1 ⋯ s_r. Set x_j = s_1 ⋯ s_j for 1 ≤ j ≤ r. Note that |Ny|_{π(S)} ≤ |y|_S for all y ∈ G, and hence

r ≥ |x_r|_S ≥ |N x_r|_{π(S)} = |N x|_{π(S)} = r,


implying that |x_r|_S = r. We conclude that π maps B_S(1, r) onto B_{π(S)}(N, r). Thus, |B_{π(S)}(N, r)| ≤ |B_S(1, r)|, implying the second assertion. :) X

Solution to Exercise 8.5 :( The fact that H is finitely generated was shown in Exercise 1.61. So the growth of H is at most the growth of G, by Exercise 8.4. Now, consider some finite symmetric generating set S̃ for H. Write G = ⊔_{t∈T} Ht, for some finite set T with |T| = [G : H] < ∞. For any x ∈ G there exist h_x ∈ H and t_x ∈ T such that x = h_x t_x. Thus, S := S̃ ∪ T ∪ T^{-1} is a finite symmetric generating set for G. Moreover, since S̃ ⊂ S we have that |h_x|_{S̃} ≤ |h_x|_S ≤ |x|_S + 1. Thus,

|B_S(1, r)| ≤ #{(h_x, t_x) : |x| ≤ r} ≤ |B_{S̃}(1, r + 1)| · |T|,

which implies that the growth of G is at most the growth of H. :) X

Solution to Exercise 8.6 :( If A is a finitely generated Abelian group then A ≅ Z^d × F for some d ≥ 0 and a finite group F, |F| < ∞, by Theorem 1.5.2. Since Z^d has finite index in Z^d × F, by Exercise 8.5 it suffices to show that Z^d has polynomial growth for any d ≥ 0. This is immediate when d = 0 (as the group is finite), and for d > 0 we can take the standard basis elements to generate Z^d, namely Z^d = ⟨±e_1, …, ±e_d⟩, where e_j is the d-dimensional vector with all coordinates 0 except the jth coordinate, which equals 1. With these generators, it is very easy to see that

B(0, r) = {(x_1, …, x_d) ∈ Z^d : Σ_{j=1}^d |x_j| ≤ r}

(recalling that 0 ∈ Z^d is the identity element). This can easily be bounded by

B(0, r) ⊂ {(x_1, …, x_d) ∈ Z^d : max_j |x_j| ≤ r}, so that |B(0, r)| ≤ (2r + 1)^d,

and by

{(x_1, …, x_d) ∈ Z^d : max_j |x_j| ≤ r/d} ⊂ B(0, r), so that |B(0, r)| ≥ (2r/d)^d. :) X
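The two bounds in this solution are easy to check numerically. The following sketch (my addition, not from the book) counts the ball B(0, r) ⊂ Z^d directly and compares it with (2r/d)^d and (2r + 1)^d.

```python
from itertools import product

def l1_ball_size(d, r):
    """|B(0, r)| in the Cayley graph of Z^d with generators ±e_1, ..., ±e_d,
    i.e. the number of x in Z^d with |x_1| + ... + |x_d| <= r."""
    return sum(1 for x in product(range(-r, r + 1), repeat=d)
               if sum(abs(c) for c in x) <= r)

# (2r/d)^d <= |B(0, r)| <= (2r + 1)^d, as in the solution above
for d in (1, 2, 3):
    for r in (1, 2, 4, 8):
        n = l1_ball_size(d, r)
        assert (2 * r / d) ** d <= n <= (2 * r + 1) ** d
```

In particular, both bounds are polynomial in r of the same degree d, which is all the equivalence of growth functions requires.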

Solution to Exercise 8.7 :( Denote γ_k = γ_k(G). Let n be the nilpotent step of G; so γ_n = {1}. Assume for a contradiction that there exists 1 ≤ k ≤ n − 1 such that γ_k is not finitely generated. Choose a maximal such k. So γ_k is not finitely generated, but γ_{k+j} is finitely generated for all j ≥ 1. By Exercise 1.39 we know that γ_k/γ_{k+1} is finitely generated. Since γ_{k+1} is finitely generated by the choice of k, by Exercise 1.62 we have that γ_k is also finitely generated, a contradiction! :) X

Solution to Exercise 8.8 :( Denote γ_k = γ_k(G). We prove the claim by induction on n − k. The base case is k = n − 1. By Exercise 1.34, we know that

γ_{n−1}(G) = ⟨[s_0, …, s_{n−1}]^x : s_0, …, s_{n−1} ∈ S, x ∈ G⟩.

Since γ_{n−1}(G) ≤ Z(G), we have that [s_0, …, s_{n−1}]^x = [s_0, …, s_{n−1}]. Thus, γ_{n−1} = ⟨S_{n−1}⟩. Now, for k < n − 1, by Exercise 1.34,

γ_k(G) = ⟨[s_0, …, s_k]^x : s_0, …, s_k ∈ S, x ∈ G⟩.

Since

[s_0, …, s_k]^x = [s_0, …, s_k] · [s_0, …, s_k, x],

and since [s_0, …, s_k, x] ∈ γ_{k+1}, we have that if T generates γ_{k+1} then S_k ∪ T generates γ_k. By induction we have that T_k = S_k ∪ T_{k+1} generates γ_k. :) X


Solution to Exercise 8.9 :( Let π : G → Z be a surjective homomorphism with Ker π = H. Choose some element s ∈ G such that π(s) = 1 ∈ Z. Let S̃ be any finite generating set for G. For u ∈ S̃ there exist s_u ∈ H and z_u ∈ Z such that π(u) = z_u and u = s_u s^{z_u}. Note that the set S = {s, s_u : u ∈ S̃} generates S̃ and thus generates all of G, so S satisfies the necessary requirements. :) X

Solution to Exercise 8.10 :( Using Exercise 8.9, let S = {s_1, …, s_d} be a finite generating set for G such that s_1 is mapped to 1 ∈ Z under the canonical projection (and s_j ∈ H for all j > 1). Note that ⟨s_1⟩ = {(s_1)^n : n ∈ Z} is a right-transversal for H; that is, 1 ∈ ⟨s_1⟩ and G = ⊔_{t∈⟨s_1⟩} Ht. Exercise 1.60 tells us that H is generated by ⟨s_1⟩S⟨s_1⟩^{-1} ∩ H. Note that ⟨s_1⟩S⟨s_1⟩^{-1} ∩ H = {(s_1)^{-n} s_j (s_1)^n : 1 < j ≤ d, n ∈ Z}. The group generated by this last set is finitely generated by Milnor's trick (Proposition 8.3.1). :) X

Solution to Exercise 8.11 :( The proof is by induction on the length of the derived series. If G is 1-step solvable, then G is Abelian, and being finitely generated is of the form Z^d × F for a finite F. Since Z^d only has elements of infinite order, it must be that d = 0. This is the base case. Now assume that G is n-step solvable for some n > 1. If G is torsion, then [G, G] is torsion as well, and it is easy to see that G/[G, G] is also torsion. Thus, by induction G/[G, G] is finite. Also, [G, G] is finitely generated by Rosset's theorem (Theorem 8.3.3). We have that [G, G] is (n − 1)-step solvable, so [G, G] must be finite by induction. The induction is completed via |G| ≤ [G : [G, G]] · |[G, G]|. :) X

Solution to Exercise 8.12 :( Let g ∈ G. The map x ↦ x^g is an automorphism of G, and thus, restricted to N, is an automorphism of N since N ◁ G. Thus, it preserves H. So H^g = H for any g ∈ G, implying that H ◁ G. :) X

Solution to Exercise 8.13 :( Let ϕ be any automorphism of G. Note that ϕ([x, y]) = [ϕ(x), ϕ(y)] for all x, y ∈ G. Thus, for any x ∈ N and any y ∈ K we have that ϕ([x, y]) ∈ [ϕ(N), ϕ(K)] = [N, K]. Now, any ϕ(z) ∈ ϕ([N, K]) can be written via z = [x_1, y_1] ⋯ [x_n, y_n] for some x_1, …, x_n ∈ N and y_1, …, y_n ∈ K. Thus, ϕ(z) ∈ [N, K]. This proves that ϕ([N, K]) ⊂ [N, K] for any automorphism ϕ of G. But then, we also have that [N, K] = ϕ(ϕ^{-1}([N, K])) ⊂ ϕ([N, K]) ⊂ [N, K], so equality holds. :) X

Solution to Exercise 8.14 :( For any automorphism ϕ ∈ Aut(G), we have that

ϕ(x)^k ∈ γ_n(G) ⟺ x^k ∈ γ_n(G),

because γ_n(G) is a characteristic subgroup. So we only need to show that γ̄_n(G) is indeed a subgroup. To this end, for any x, y ∈ γ̄_n(G), we have x^k ∈ γ_n(G) and y^m ∈ γ_n(G) for some k, m ∈ Z. Also, since x, y ∈ γ_{n−1}(G), we know that [x, y] ∈ γ_n(G), so there exists g ∈ γ_n(G) such that

(x^{-1}y)^{km} = g · x^{-km} · y^{km} = g · (x^{-k})^m · (y^m)^k ∈ γ_n(G).

This implies that x^{-1}y ∈ γ̄_n(G), proving that γ̄_n(G) is a subgroup. :) X

Solution to Exercise 8.15 :( Since [γ_n(G), γ_n(G)] ≤ [γ_n(G), G] = γ_{n+1}(G) ≤ γ̄_{n+1}(G), we have that γ_n(G)/γ̄_{n+1}(G) is Abelian. We know that γ_n(G)/γ_{n+1}(G) is finitely generated (Exercise 1.39). Note that γ_{n+1}(G) is a normal subgroup of both γ̄_{n+1}(G) and γ_n(G). Thus,

γ_n(G)/γ̄_{n+1}(G) ≅ (γ_n(G)/γ_{n+1}(G))/(γ̄_{n+1}(G)/γ_{n+1}(G)),

implying that γ_n(G)/γ̄_{n+1}(G) is a finitely generated Abelian group. By Theorem 1.5.2, γ_n(G)/γ̄_{n+1}(G) ≅ Z^{d_n} × F for some finite Abelian group F and some integer d_n ≥ 0. For any f ∈ F we have that (0, f) ∈ Z^{d_n} × F has finite order in Z^{d_n} × F, namely (0, f)^{|F|} = (0, f^{|F|}) = (0, 1_F). We want to show that F = {1}. So it suffices to prove that no nontrivial element of γ_n(G)/γ̄_{n+1}(G) has finite order.


Indeed, assume that x ∈ γ_n(G). If γ̄_{n+1}(G)x has finite order, then x^k ∈ γ̄_{n+1}(G) for some k. But then (x^k)^m = x^{km} ∈ γ_{n+1}(G) for some m, implying that x ∈ γ̄_{n+1}(G). So the only coset γ̄_{n+1}(G)x of finite order in γ_n(G)/γ̄_{n+1}(G) is the trivial coset. :) X

Solution to Exercise 8.16 :( Let γ_k = γ_k(G), γ̄_k = γ̄_k(G), and d_k = d_k(G) for all k. Let x ∈ γ_n = γ̄_{n+1} and y ∈ G. There exists k ≥ 1 such that x^k ∈ γ_{n+1}. We have that

[x^k, y] = x^{-1}[x^{k−1}, y]y^{-1}xy = [x^{k−1}, y] · [[x^{k−1}, y], x] · [x, y] = ⋯ = [[x, y], x] ⋯ [[x^{k−1}, y], x] · [x, y].

Since [x^k, y] ∈ γ_{n+2} and [[x^m, y], x] ∈ γ_{n+2} for all m ∈ N, we have that [x, y] ∈ γ_{n+2}. This holds for all x ∈ γ_n and all y ∈ G. Thus, γ_{n+1} = ⟨[x, y] : x ∈ γ_n, y ∈ G⟩ ≤ γ_{n+2}, implying that γ_{n+1} = γ_{n+2} = γ̄_{n+2}. Thus, d_{n+1} = 0, and we can repeat the argument inductively for n + k for all k ≥ 1. :) X

Solution to Exercise 8.17 :( We have that ϕ_n is well defined since if γ̄_{n+1}(G)x = γ̄_{n+1}(G)y, then x = gy for some g ∈ γ̄_{n+1}(G). Hence ϕ(x) = ϕ(g)ϕ(y), and since γ̄_{n+1}(G) is a characteristic subgroup, we get that ϕ(g) ∈ γ̄_{n+1}(G), so that γ̄_{n+1}(G)ϕ(x) = γ̄_{n+1}(G)ϕ(y). It is immediate to see that ϕ_n is an automorphism of γ_n(G)/γ̄_{n+1}(G). :) X

Solution to Exercise 8.18 :( First, conjugation in Z ⋉_ϕ G is given by

(a, x)^{(b, y)} = (−b, ϕ^{−b}(y^{−1})) · (a + b, xϕ^a(y)) = (a, ϕ^{−b}(y^{−1}x)ϕ^{a−b}(y)).

This proves the first assertion. For the second assertion, consider the map λ : Z ⋉_ψ G → Z ⋉_ϕ G given by λ(a, x) = (ma, x). Note that

λ(a + b, xψ^a(y)) = (m(a + b), xϕ^{ma}(y)) = (ma, x) · (mb, y) = λ(a, x) · λ(b, y).

Also, if λ(a, x) = (0, 1) then a = 0 and x = 1. Thus, λ is an injective homomorphism. The homomorphism π : Z ⋉_ϕ G → Z/|m|Z given by π(a, x) = a (mod |m|) has kernel Ker π = H = {(ma, x) : a ∈ Z, x ∈ G}. Thus, [Z ⋉_ϕ G : H] = |m|. Note that H = λ(Z ⋉_ψ G). The last assertion is easy, since (a, 1) = (1, 1)^a and (0, xy) = (0, x)(0, y) and (a, x) = (0, x)(a, 1). :) X

Solution to Exercise 8.19 :( Let g ∈ G be such that g is mapped to 1 ∈ Z under the canonical projection G → Z ≅ G/K. Define ϕ ∈ Aut(K) by ϕ(k) = gkg^{−1}. It is simple to check that this induces an isomorphism of groups Z ⋉_ϕ K ≅ G by mapping λ(a, k) = kg^a. Indeed, for a, b ∈ Z and k, n ∈ K,

λ((a, k)(b, n)) = λ(a + b, kg^a ng^{−a}) = kg^a ng^{−a} g^{a+b} = λ(a, k)λ(b, n).

The kernel of λ is trivial because kg^a = 1 if and only if a = 0 and k = 1. Also, λ is surjective because given x ∈ G, there exists a ∈ Z such that Kx = Kg^a. So choosing k = xg^{−a}, we have that λ(a, k) = kg^a = x. Note that this isomorphism maps {0} × K onto K, so these are isomorphic subgroups. Now, if G has growth r^d, then K is finitely generated by Rosset's theorem (Theorem 8.3.3). Let S be a finite symmetric generating set for K, and consider the generating set {(0, s), (±1, 1) : s ∈ S} for Z ⋉ K. Note that |(a, k)| ≤ |a| + |k| (by Exercise 8.18). Consider the identity map of Z × K into Z ⋉ K (as sets). This is an injective function, which takes {1, …, r} × {k ∈ K : |k| ≤ r} into the ball of radius 2r in Z ⋉ K. Thus, r · #{k ∈ K : |k| ≤ r} ≤ Cr^d for some C > 0, implying that K has growth r^{d−1}. :) X

Solution to Exercise 8.20 :( If Hx = Hy then x^{−1}y ∈ H, so also ϕ(x)^{−1}ϕ(y) = ϕ(x^{−1}y) ∈ H, implying that ψ is well defined. It is easy to see that ψ is a homomorphism, with trivial kernel (because x ∈ H ⟺ ϕ(x) ∈ H), and ψ is surjective because ϕ is surjective. So ψ ∈ Aut(G/H).


Consider the map π : Z ⋉_ϕ G → Z ⋉_ψ (G/H) given by π(a, x) = (a, Hx). We have

π(a + b, xϕ^a(y)) = (a + b, Hxϕ^a(y)) = (a + b, (Hx)ψ^a(Hy)) = (a, Hx)(b, Hy),

so π is a homomorphism. It is simple to see that π is surjective, and

Ker π = {(a, x) : a = 0, x ∈ H} = N.

So N ◁ Z ⋉_ϕ G and (Z ⋉_ϕ G)/N ≅ Z ⋉_ψ (G/H). The map (0, x) ↦ x is an isomorphism of N with H.

:) X

Solution to Exercise 8.21 :( Since G/F ≅ Z, we know that G ≅ Z ⋉_ϕ F for some ϕ ∈ Aut(F) (Exercise 8.19). So assume without loss of generality that G = Z ⋉_ϕ F. Let H = {(z, 1) : z ∈ Z} ⊂ G. It is easy to see that H is a (not necessarily normal) subgroup. Also, since (a, f) = (0, f)(a, 1) for all f ∈ F, a ∈ Z, we see that [G : H] ≤ |F| < ∞. :) X

Solution to Exercise 8.22 :( Let λ_1, …, λ_n be the eigenvalues of M (with multiplicities). Let v_k := (λ_1^k, …, λ_n^k) ∈ (S^1)^n (here S^1 ⊂ C is the unit circle). So trace M^k = Σ_{j=1}^n λ_j^k = ⟨v_k, 1⟩, where 1 is the all ones vector. Since (S^1)^n is compact, there is a convergent subsequence (w_j := v_{k_j})_j, which implies that for m_j := k_j − k_{j−1} we have that λ_i^{m_j} → 1 for all i, as j → ∞. Thus, trace M^{m_j} → n as j → ∞. However, trace M^{m_j} is an integer for any m_j (this is where it is used that M has integer values). Hence, it must be that there exists j_0 such that for all j > j_0 we have Σ_{i=1}^n λ_i^{m_j} = trace M^{m_j} = n. Since |λ_i| = 1 for all i, the only way this can happen is if λ_i^{m_j} = 1 for all i and j > j_0. :) X

Solution to Exercise 8.23 :( Let e_1, …, e_d denote the standard basis of Z^d. So e_j is the d-dimensional vector with all 0's except at the jth coordinate, which is 1. Then any x ∈ Z^d can be written as x = Σ_{j=1}^d ⟨x, e_j⟩e_j. For an automorphism ϕ ∈ Aut(Z^d) (recalling that the group operation is vector addition), we have that ϕ(ne_j) = nϕ(e_j) for any n ∈ Z. Also, ϕ(x + y) = ϕ(x) + ϕ(y) for any x, y ∈ Z^d. Define M_{i,j} = ⟨ϕ(e_j), e_i⟩ ∈ Z for 1 ≤ i, j ≤ d. If x ∈ Z^d, note that ⟨x, e_j⟩ ∈ Z for all j, so that

⟨ϕ(x), e_i⟩ = Σ_{j=1}^d ⟨x, e_j⟩⟨ϕ(e_j), e_i⟩ = Σ_{j=1}^d M_{i,j} x_j = (Mx)_i.

Thus, ϕ(x) = Mx. So we are left with showing that |det(M)| = 1. Since ϕ is an automorphism, ϕ^{−1} is also an automorphism. It is easy to see that M is invertible, and ϕ^{−1}(x) = M^{−1}x for all x ∈ Z^d. So det(M) · det(M^{−1}) = 1. Since both these determinants must be integers, the only possibilities are det(M) = det(M^{−1}) ∈ {−1, 1}. :) X

Solution to Exercise 8.24 :( Consider ψ = j ∘ i^{−1} ∈ Aut(Z^d). By Exercise 8.23, there is a matrix U ∈ GL_d(Z) such that ψ(z) = Uz for all z ∈ Z^d. Note that for any z ∈ Z^d we have

U^{−1}M_j Uz = ψ^{−1} ∘ ϕ_j ∘ ψ(z) = i ∘ j^{−1} ∘ j ∘ ϕ ∘ j^{−1} ∘ j ∘ i^{−1}(z) = ϕ_i(z) = M_i z. :) X
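The mechanism of Exercise 8.22 can be seen concretely in a small numerical aside (mine, not the book's): the integer matrix M = [[0, −1], [1, 1]] has characteristic polynomial x² − x + 1, so its eigenvalues e^{±iπ/3} both have modulus 1. As the exercise predicts, they are roots of unity: M⁶ = 1, and trace M^k = 2 cos(kπ/3) is an integer for every k.

```python
def mat_mul(A, B):
    """Product of 2x2 integer matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_pow(M, n):
    R = [[1, 0], [0, 1]]  # identity
    for _ in range(n):
        R = mat_mul(R, M)
    return R

M = [[0, -1], [1, 1]]  # eigenvalues e^{i pi/3} and e^{-i pi/3}, both of modulus 1
assert mat_pow(M, 6) == [[1, 0], [0, 1]]  # both eigenvalues are 6th roots of unity
# trace(M^k) = 2 cos(k pi/3) is an integer for every k, as used in the solution
assert [mat_pow(M, k)[0][0] + mat_pow(M, k)[1][1] for k in range(7)] == [2, 1, -1, -2, -1, 1, 2]
```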

Solution to Exercise 8.25 :( Note that if γ_n(G/N) = {1} then γ_n(G) ≤ N. But then γ_{n+1}(G) = [G, γ_n(G)] ≤ [G, N] = {1}, so G is nilpotent. If G/N is virtually nilpotent, then there is a finite index subgroup H ◁ G such that N ◁ H and H/N is nilpotent. So H is nilpotent and thus G is virtually nilpotent. :) X

Solution to Exercise 8.26 :( F ◁ G implies that G acts on F by conjugation. The kernel of this action is the subgroup


C_F = {x ∈ G : ∀ f ∈ F, f^x = f}.

So G/C_F is isomorphic to a subgroup of the permutations of the finite set F, and thus C_F has finite index in G. So it suffices to show that C_F is virtually nilpotent. Let H = FC_F = {fx : f ∈ F, x ∈ C_F}, which is a subgroup of G. The second isomorphism theorem tells us that C_F/(F ∩ C_F) ≅ H/F ≤ G/F, which is virtually nilpotent as a (finite index) subgroup of a virtually nilpotent group. Since [C_F, F] = {1} by definition, we can use Exercise 8.25 to conclude that C_F is virtually nilpotent. :) X

Solution to Exercise 8.27 :( By passing to an isomorphic group, we can assume that ϕ = M ∈ GL_d(Z), and that all eigenvalues of M are 1 (Exercise 8.23). Thus, there necessarily exists a vector v ∈ Q^d such that Mv = v. By multiplying v by an appropriate constant, we obtain an eigenvector in Z^d. :) X

Solution to Exercise 8.28 :( Since Z^d/K is a finitely generated Abelian group, Z^d/K ≅ Z^m × F for some m ≤ d and a finite Abelian group F (Theorem 1.5.2). There exists M ≤ Z^d with K ≤ M such that Z^d/M ≅ Z^m and M/K ≅ F. Let π : Z^d → Z^m be the canonical projection with Ker π = M. Using the standard basis e_1, …, e_d ∈ Z^d, π can be extended to a surjective homomorphism of vector spaces π : R^d → R^m. If K is nontrivial, then Ker π is nontrivial, so m = dim π(R^d) < dim R^d = d. :) X

Solution to Exercise 8.29 :( Since γ̄_1(G) ≤ K, by Exercise 8.30 we have that γ̄_1(G/K) = {1}. Thus, G/K ≅ Z^d for some d. Let π : G → G/K be the canonical projection with Ker π = K, and let η : G/K → Z^d be an isomorphism. Let φ ∈ Aut(G/γ̄_1(G)) be defined by φ(γ̄_1(G)x) = γ̄_1(G)ϕ(x) (this is well defined because γ̄_1(G) is a characteristic subgroup). Since (G/γ̄_1(G))/(K/γ̄_1(G)) ≅ G/K ≅ Z^d, there exist surjective homomorphisms α : G → G/γ̄_1(G) and β : G/γ̄_1(G) → Z^d with Ker α = γ̄_1(G) and Ker β = α(K) such that β ∘ α = η ∘ π. Note that this implies that η(Kx) = β(γ̄_1(G)x) for all x ∈ G.

Now, let λ be an eigenvalue of ψ, so we can choose x ∈ G such that η(ψ(Kx)) = λη(Kx). Then,

λβ(γ̄_1(G)x) = λη(Kx) = η(ψ(Kx)) = η ∘ π(ϕ(x)) = β(γ̄_1(G)ϕ(x)).

Thus, λ is also an eigenvalue of ϕ. :) X

Solution to Exercise 8.30 :( Since [G, γ_n(G)] = {1}, we know that [G, N] = {1}, so N ◁ G. It is obvious that γ_0(π(G)) = π(G) = π(γ_0(G)). Note that for any homomorphism π([x, y]) = [π(x), π(y)], so [π(H), π(K)] = π([H, K]). By induction, for k > 0 we have that

γ_k(π(G)) = [π(G), γ_{k−1}(π(G))] = π([G, γ_{k−1}(G)]) = π(γ_k(G)).

Also, for 0 ≤ k ≤ n, since N ≤ γ_n(G), we have that π(x) ∈ π(γ_k(G)) if and only if x ∈ γ_k(G). Thus, for 0 ≤ k ≤ n,

γ̄_{k+1}(π(G)) = {π(x) ∈ π(γ_k(G)) : ∃ m ∈ Z, π(x^m) ∈ π(γ_{k+1}(G))} = π(γ̄_{k+1}(G)).

For k ≥ n + 1, note that γ_k(G) = {1}, so γ̄_{k+1}(G) = π(N) = {1}. :) X

Solution to Exercise 8.31 :( The product in H_3(Z) is readily computed to be

((c, b, a))((z, y, x)) = ((c + z + ay, y + b, x + a)). Thus

((c, b, a))^{−1} = ((−c + ab, −b, −a)), and thus,

[A, B] = ((0, 0, −1))((0, −1, 0))((0, 0, 1))((0, 1, 0)) = ((1, −1, −1))((1, 1, 1)) = ((1, 0, 0)) = C.


Finally, it is easy to show that

C^c B^b A^a = ((c, 0, 0))((0, b, 0))((0, 0, a)) = ((c, 0, 0))((0, b, a)) = ((c, b, a)). :) X
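All of these identities can be verified mechanically with 3×3 matrices. The following sketch (my addition, not part of the book) encodes ((c, b, a)) as the upper triangular matrix from Section 8.7 and checks the product formula, the inverse formula, [A, B] = C, and ((c, b, a)) = C^c B^b A^a.

```python
def mat(c, b, a):
    """((c, b, a)) in H3(Z) as the tuple of rows ((1, a, c), (0, 1, b), (0, 0, 1))."""
    return ((1, a, c), (0, 1, b), (0, 0, 1))

def mul(X, Y):
    return tuple(tuple(sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3))
                 for i in range(3))

def inv(M):
    # ((c, b, a))^{-1} = ((-c + ab, -b, -a))
    a, c, b = M[0][1], M[0][2], M[1][2]
    return mat(-c + a * b, -b, -a)

def power(M, n):
    R, X = mat(0, 0, 0), (M if n >= 0 else inv(M))
    for _ in range(abs(n)):
        R = mul(R, X)
    return R

A, B, C = mat(0, 0, 1), mat(0, 1, 0), mat(1, 0, 0)

# the product formula ((c,b,a))((z,y,x)) = ((c + z + ay, y + b, x + a)):
for (c, b, a), (z, y, x) in [((1, 2, 3), (4, 5, 6)), ((-1, 0, 2), (3, -2, 1))]:
    assert mul(mat(c, b, a), mat(z, y, x)) == mat(c + z + a * y, y + b, x + a)
    assert mul(mat(c, b, a), inv(mat(c, b, a))) == mat(0, 0, 0)

# [A, B] = A^{-1} B^{-1} A B = C, and ((c, b, a)) = C^c B^b A^a:
assert mul(mul(inv(A), inv(B)), mul(A, B)) == C
for c, b, a in [(2, -3, 5), (0, 1, -4)]:
    assert mul(mul(power(C, c), power(B, b)), power(A, a)) == mat(c, b, a)
```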

Solution to Exercise 8.32 :( One easily verifies that

((z, 0, 0))((c, b, a)) = ((c, b, a))((z, 0, 0)). Moreover, for a general element ((c, b, a)) ∈ H3 (Z) :

• ((c, b, a))((0, 0, 1)) = ((c, b, a + 1)) and ((0, 0, 1))((c, b, a)) = ((c + b, b, a + 1)). So ((c, b, a)) ∉ Z(H_3(Z)) if b ≠ 0.
• ((c, b, a))((0, 1, 0)) = ((c + a, b + 1, a)) and ((0, 1, 0))((c, b, a)) = ((c, b + 1, a)). So ((c, b, a)) ∉ Z(H_3(Z)) if a ≠ 0. :) X

Solution to Exercise 8.33 :( Let z = [x, y] ∈ Z(G). Then, for any positive integers n, m,

[x^n, y] = x^{−n}y^{−1}x^n y = x^{1−n}zy^{−1}x^{n−1}y = z[x^{n−1}, y],

so inductively on n we have that [x^n, y] = z^n. Since z^n ∈ Z(G), it is also true that [y, x^n] ∈ Z(G), so the same argument applied to y and m tells us that [y^m, x^n] = [y, x^n]^m = z^{−nm}, so [x^n, y^m] = z^{nm} for all positive integers n, m. For negative integers, note that if [x, y] ∈ Z(G), then

[x^{−1}, y] = xy^{−1}x^{−1}y = x[y, x]x^{−1} ∈ Z(G),

so we can obtain the result for negative n, and negative m as well. :) X

Solution to Exercise 8.34 :( Since C = [A, B] ∈ Z(H_3(Z)), by Exercise 8.33 we have that C^{ab} = [A^a, B^b], implying that |C^{ab}| ≤ 2(|a| + |b|) for all integers a, b. Specifically,

|((c, b, a))| = |C^c B^b A^a| ≤ κ(√(|c| + 1) + |b| + |a|),

for some constant κ > 0. :) X

Solution to Exercise 8.35 :( That δ_{(G,S),(H,T)} ∼ δ_{(G,S′),(H,T′)} is immediate from Exercise 1.75. Let T be a finite symmetric generating set for H for which |B_T(1, r)| ≤ ψ(r). Let π : G → G/H be the canonical projection, and let S be a finite symmetric subset of G such that π(S) generates G/H and such that |B_{π(S)}(H, r)| ≤ φ(r). Exercise 1.62 tells us that S ∪ T generates G. We write |x| = |x|_{S∪T}. The definition of distortion tells us that for some constant κ > 0 we have for all x ∈ H that |x|_T ≤ κδ(κ|x|). Let Q be a set of representatives of the cosets of H, so that G = ⊔_{q∈Q} Hq. So for any x ∈ G there exist unique q_x ∈ Q and h_x ∈ H such that x = h_x q_x. Moreover, we may choose Q so that |q_x| ≤ |x| for all x. Thus, for any x we have that |Hq_x|_{π(S)} ≤ |q_x| ≤ |x|. Also, |h_x| ≤ |q_x| + |x| ≤ 2|x|, so |h_x|_T ≤ κδ(2κ|x|). We conclude that the map x ↦ (h_x, Hq_x) maps B_{S∪T}(1, r) injectively into B_T(1, κδ(2κr)) × B_{π(S)}(H, r). This immediately implies that

|B_{S∪T}(1, r)| ≤ φ(r) · ψ(κδ(2κr)),

which proves that the growth of G is at most φ · (ψ ∘ δ), as required. The last assertion is just the fact that products and compositions of polynomials are still polynomials. :) X

Solution to Exercise 8.36 :( Exercise 8.34 implies that

{((c, b, a)) : |c| ≤ r^2 − 1, |a|, |b| ≤ r} ⊂ B(1, 3κr),


implying that the growth of H_3(Z) is at least r ↦ (2(r^2 − 1) + 1)(2r + 1)^2, which is equivalent to r ↦ r^4. This provides a lower bound on the growth. For the upper bound, denote H = H_3(Z) and Z = Z(H). Exercise 8.32 tells us that H/Z ≅ Z^2 (which has quadratic growth) and that Z ≅ Z (which has linear growth). Exercise 8.34 implies that the distortion of Z in H is at most quadratic. By Exercise 8.35 with ψ(r) = r and δ(r) = φ(r) = r^2, we have that the growth of H is at most r ↦ r^4 = φ(r) · (ψ ∘ δ)(r). :) X

Solution to Exercise 8.37 :( Because H ◁ G, we know that h^k ∈ H for all h ∈ H and k ∈ K. Thus, K acts on H by k.h = khk^{−1}. Also, consider HK = {hk : h ∈ H, k ∈ K}. Since ⟨K, H⟩ = G, for any g ∈ G we can write g as a finite product of elements from K and H. Moving all such elements from K to the right, replacing kh = (khk^{−1})k, we may write any g ∈ G as g = hk for some h ∈ H and k ∈ K. Thus, HK = G. Also, if hk = h′k′ for h, h′ ∈ H and k, k′ ∈ K, then h^{−1}h′ = k(k′)^{−1} ∈ H ∩ K = {1}, so h = h′ and k = k′. Thus, for h ∈ H and k ∈ K, the map ϕ(hk) = (k, h) is a well-defined surjective map onto the set K × H. Since for h, h′ ∈ H and k, k′ ∈ K we have hkh′k′ = h(kh′k^{−1})kk′ = h(k.h′) · kk′, we get that ϕ is a surjective homomorphism from G = HK onto K ⋉ H. We have that ϕ is an isomorphism since ϕ(hk) = (k, h) = 1_{K⋉H} if and only if k = h = 1_G. :) X
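Both the r^4 growth and the quadratic distortion of the center can be observed empirically. The following BFS sketch (mine, not from the book) computes word lengths in the Cayley graph of H_3(Z) with generators {A, A^{−1}, B, B^{−1}}, using the coordinate product formula from the solution to Exercise 8.31.

```python
from collections import deque

def mul(p, q):
    """Product in coordinates: ((c,b,a))((z,y,x)) = ((c + z + a*y, y + b, x + a))."""
    (c, b, a), (z, y, x) = p, q
    return (c + z + a * y, y + b, x + a)

# generators A, A^{-1}, B, B^{-1} in ((c, b, a)) coordinates
gens = [(0, 0, 1), (0, 0, -1), (0, 1, 0), (0, -1, 0)]

def word_lengths(radius):
    """BFS from the identity: word length of every element within `radius`."""
    dist = {(0, 0, 0): 0}
    queue = deque([(0, 0, 0)])
    while queue:
        g = queue.popleft()
        if dist[g] == radius:
            continue
        for s in gens:
            h = mul(g, s)
            if h not in dist:
                dist[h] = dist[g] + 1
                queue.append(h)
    return dist

dist = word_lengths(12)
sizes = [sum(1 for v in dist.values() if v <= r) for r in range(13)]
assert sizes[1] == 5 and sizes[2] == 17  # |B(1,1)| = 5, |B(1,2)| = 17
# quadratic distortion of the center: C^{n^2} = [A^n, B^n] has word length <= 4n
for n in (1, 2, 3):
    assert dist[(n * n, 0, 0)] <= 4 * n
```

For larger radii the ratios |B(1, 2r)|/|B(1, r)| hover around 2^4 = 16, which is consistent with growth of order r^4.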


9 Gromov’s Theorem


https://doi.org/10.1017/9781009128391.012 Published online by Cambridge University Press


This chapter is dedicated to the following famous theorem of Gromov, connecting the geometric property of polynomial growth to the algebraic property of (virtual) nilpotence.

Theorem 9.0.1 (Gromov's theorem) A finitely generated group G has polynomial growth if and only if G is virtually nilpotent.

We have already seen the “easy” direction in Theorem 8.2.1, that every finitely generated virtually nilpotent group has polynomial growth. We will now work to prove the other direction of Gromov’s theorem. We begin with some reductions.

9.1 A Reduction

Recall that a group G is virtually indicable if there exists a finite index subgroup H ≤ G, [G : H] < ∞, such that H admits a surjective homomorphism ϕ : H → Z. The main induction argument for Gromov's theorem is contained in the following.

Proposition 9.1.1 Assume that every infinite finitely generated group of polynomial growth is virtually indicable. Then Gromov's theorem (Theorem 9.0.1) holds.

Proof For a finitely generated group G, define

d(G) = inf{d ∈ N : G has growth r ↦ r^d},

with the convention that inf ∅ = ∞. When G has polynomial growth, d(G) < ∞. Let G be a finitely generated group of polynomial growth. We will prove by induction on d(G) that G is virtually nilpotent. If d(G) = 0 then G is finite, which is virtually nilpotent (virtually trivial, in fact). This is the base case.

For the induction step, assume that d(G) > 0. Our assumption that G is virtually indicable means that we are guaranteed some H ≤ G with [G : H] < ∞ and H/K ≅ Z for some K ◁ H, by the assumptions of the proposition. Since H is of finite index in G it has the same (polynomial) growth as G. Specifically, it has growth r ↦ r^d, for d = d(G). Since H/K ≅ Z, Exercise 8.19 shows that in this case K is finitely generated and has growth r ↦ r^{d−1}, so that d(K) ≤ d(G) − 1. Hence, by induction, K is virtually nilpotent. So we can find a finite index subgroup N ≤ K, [K : N] < ∞, such that N is nilpotent. By Theorem 8.4.2, we may assume that N is characteristic


in K, and thus N ◁ H. Since (H/N)/(K/N) ≅ H/K ≅ Z and K/N is finite, by Exercise 8.21 there exists a finite index subgroup N ◁ M ≤ H, [H : M] < ∞, such that M/N ≅ Z. Since N is nilpotent and [M, M] ≤ N, we have that M is solvable. Since M is finite index in G, we find that M has polynomial growth. By the Milnor–Wolf theorem (Theorem 8.2.4) M is virtually nilpotent, implying that G is virtually nilpotent, completing the induction step.

9.2 Unitary Actions

Recall the operator norm for a linear operator a : V → V, where V is a complex normed vector space: ||a||_op = sup{||av|| : ||v|| = 1}. If ||a||_op < ∞, we say that a is a bounded operator. It is easy to verify that ||ab||_op ≤ ||a||_op · ||b||_op, and that if a is invertible with ||a||_op < ∞, then also ||a^{−1}||_op < ∞. So the collection of bounded invertible operators on V forms a group (with composition as the group operation). We denote this group by GL(V). The most basic example is when V = C^n, for which the invertible linear operators are just the invertible matrices GL_n(C) = GL(C^n). In this case all linear operators are bounded because the dimension is finite.

Exercise 9.1 Let H be a complex Hilbert space. Let a be a bounded linear operator. Show that the following are equivalent:
(1) ||av|| = ||v|| for all v ∈ H.
(2) ⟨au, av⟩ = ⟨u, v⟩ for all u, v ∈ H.
(3) a*a = 1. (Here, 1 denotes the identity operator.)

⊲ solution ⊳

Let H be a Hilbert space. An operator a ∈ GL(H) is called unitary if a*a = aa* = 1, where a* denotes the adjoint of a. A unitary operator is necessarily bounded with operator norm 1 (by Exercise 9.1). We denote by U(H) the collection of unitary operators on H. Since a* = a^{−1} for a unitary a, and since (ab)* = b*a*, we see that U(H) is a group. We use the notation U_n(C) = U(C^n).

Exercise 9.2 Prove that ||xa||_op = ||ax||_op = ||a||_op for all a ∈ GL(H), x ∈ U(H). ⊲ solution ⊳


It is a basic theorem from linear algebra that any unitary matrix x ∈ U_n(C) is unitarily diagonalizable. That is, there exists u ∈ U_n(C) such that u*xu is a diagonal matrix. The eigenvalues of x are precisely the elements on this diagonal. Also, the corresponding orthonormal basis of eigenvectors are the columns of u. (This follows from the fact that x is normal, that is, x*x = xx*, which is a more general property than being unitary. Diagonalization of normal operators also generalizes to operators over infinite-dimensional spaces, via the spectral theorem. We will only utilize the finite-dimensional case, which may be found in any linear algebra textbook.)

Exercise 9.3 Let x ∈ U_d(C). Suppose that λ_1, λ_2, …, λ_d are the eigenvalues of x. Show that |λ_j| = 1 for all 1 ≤ j ≤ d. Prove that ||x − 1||_op = max_j |λ_j − 1|. ⊲ solution ⊳

Exercise 9.4 Let x ∈ U_d(C). Show that ||x^n − 1||_op ≤ n · ||x − 1||_op for any integer n > 0. ⊲ solution ⊳
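Both exercises are easy to sanity-check numerically for diagonal matrices in U_d(C), where ||x − 1||_op is simply max_j |λ_j − 1| (a quick script of mine, not part of the text):

```python
import cmath

def dist_to_identity(lams):
    # for a diagonal (hence normal) matrix, ||x - 1||_op = max_j |lambda_j - 1|
    return max(abs(l - 1) for l in lams)

x = [cmath.exp(1j * t) for t in (0.3, -1.2, 2.0)]  # eigenvalues of a diagonal unitary
assert all(abs(abs(l) - 1) < 1e-12 for l in x)     # Exercise 9.3: |lambda_j| = 1
for n in range(1, 20):
    xn = [l ** n for l in x]                        # eigenvalues of x^n
    # Exercise 9.4: ||x^n - 1||_op <= n * ||x - 1||_op
    assert dist_to_identity(xn) <= n * dist_to_identity(x) + 1e-12
```

The inequality of Exercise 9.4 follows from |λ^n − 1| = |λ − 1| · |λ^{n−1} + ⋯ + 1| ≤ n|λ − 1| when |λ| = 1, which is exactly what the script observes eigenvalue by eigenvalue.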

Lemma 9.2.1 Set c = 6^{−1/2}. Let 1 ≠ x ∈ U_d(C). Then, for all 1 ≤ n ≤ cπ/||x − 1||_op we have that ||x^n − 1||_op ≥ cn||x − 1||_op. Specifically, x^n ≠ 1.

Proof Let θ ∈ [0, π] and λ = e^{iθ}. A simple calculation reveals that |λ − 1| = √(2(1 − cos θ)). Also, a Taylor expansion of cos provides the inequality

cθ ≤ θ·√(1 − θ²/12) ≤ √(2(1 − cos θ)) ≤ θ,

because c² = 1/6 < 1 − π²/12. For any λ ∈ C with |λ| = 1, we have |λ − 1| = |λ̄ − 1|. Thus, for any θ ∈ [−π, π], we get that c|θ| ≤ |e^{iθ} − 1| ≤ |θ|.

Now, let λ_1, . . . , λ_d be the eigenvalues of x ∈ Ud(C). For every j write λ_j = e^{iθ_j} for θ_j ∈ [−π, π]. So

c max_j |θ_j| ≤ ||x − 1||op = max_j |λ_j − 1| ≤ max_j |θ_j|.

Since the eigenvalues of x^n are precisely λ_1^n, . . . , λ_d^n, we get that, if n||x − 1||op ≤ cπ, then n|θ_j| ≤ π for all j, so that

||x^n − 1||op = max_j |λ_j^n − 1| ≥ cn max_j |θ_j| ≥ cn||x − 1||op.
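As a sanity check, the two-sided bound of Exercise 9.4 and Lemma 9.2.1 can be tested numerically. The sketch below (not from the book; the dimension, random seed, and phase range are arbitrary illustrative choices) builds a random unitary matrix close to the identity and verifies c·n·||x − 1||op ≤ ||x^n − 1||op ≤ n·||x − 1||op for all n in the range covered by the lemma:

```python
import numpy as np

rng = np.random.default_rng(0)
d, c = 4, 6 ** -0.5

# Random unitary u (QR of a complex Gaussian) and small eigenvalue phases theta_j,
# so that x = u diag(e^{i theta_j}) u* is unitary and close to the identity.
g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
q, r = np.linalg.qr(g)
u = q * (np.diag(r) / np.abs(np.diag(r)))
theta = rng.uniform(-0.05, 0.05, size=d)
x = u @ np.diag(np.exp(1j * theta)) @ u.conj().T

op = lambda a: np.linalg.norm(a, 2)              # operator norm = largest singular value
dist = op(x - np.eye(d))
for n in range(1, int(c * np.pi / dist) + 1):    # the range where Lemma 9.2.1 applies
    dn = op(np.linalg.matrix_power(x, n) - np.eye(d))
    assert dn <= n * dist + 1e-9                 # Exercise 9.4
    assert dn >= c * n * dist - 1e-9             # Lemma 9.2.1
```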


Lemma 9.2.2 Let G ≤ Ud(C) be a finitely generated infinite group of unitary matrices. If G has polynomial growth, then there exists a finite index normal subgroup H ◁ G, [G : H] < ∞, such that the following holds. Either

• H is Abelian, or
• there exists an element a ∈ Z(H) (the center of H) such that a ≠ λ · 1 for all λ ∈ C (a is nonscalar).

Proof We will take advantage of the ring structure of linear operators (or matrices in this case). Namely, we have the ability to add them. For any x, y ∈ Ud(C),

||[x, y] − 1||op = ||xy − yx||op = ||(x − 1)(y − 1) − (y − 1)(x − 1)||op ≤ 2||x − 1||op · ||y − 1||op.    (9.1)
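Inequality (9.1) holds for all unitary x, y, and can be checked numerically. The following sketch (not from the book; dimension, seed, and the way random unitaries are generated are illustrative choices) tests it on random unitaries of the form exp(i·s·h) for Hermitian h:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 5

def rand_unitary(scale):
    # exp(i * scale * h) for a random Hermitian h normalized to ||h||_op = 1.
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    h = (g + g.conj().T) / 2
    h /= np.linalg.norm(h, 2)
    w, v = np.linalg.eigh(h)
    return (v * np.exp(1j * scale * w)) @ v.conj().T

op = lambda a: np.linalg.norm(a, 2)
eye = np.eye(d)
for _ in range(100):
    x, y = rand_unitary(0.3), rand_unitary(0.3)
    comm = np.linalg.inv(x) @ np.linalg.inv(y) @ x @ y   # [x, y] = x^{-1} y^{-1} x y
    assert op(comm - eye) <= 2 * op(x - eye) * op(y - eye) + 1e-9
```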

(This “distortion” tells us that commutators are roughly closer to the identity than their original components.) Another useful observation is as follows. Suppose that x = [a, b] = λ · 1 is some unitary scalar matrix that is also a commutator. Since λ^d = det(x) = det[a, b] = 1, it must be that λ = e^{i2πk/d} for some 0 ≤ k ≤ d − 1. If k ≥ 1, then ||x − 1||op = |λ − 1| ≥ ξ_d := c·2π/d for c = 6^{−1/2} (as shown in the proof of Lemma 9.2.1). We conclude that if x = [a, b] = λ · 1 and ||x − 1||op < ξ_d, then x = 1.

Lemma 9.2.1 tells us that if x ≠ 1, then for 1 ≤ n ≤ cπ/||x − 1||op, we have ||x^n − 1||op ≥ cn||x − 1||op.

Now, we assume that G has polynomial growth. Fix some Cayley graph of G, and let B_G(1, r) denote the ball of radius r in this Cayley graph. Let D > 0 be such that |B_G(1, r)| ≤ r^D for all r ≥ 1. Fix a large enough integer n so that n + 1 > 2^D, and such that ε := c/(8n) < min{1/4, ξ_d}.

Let G_ε = ⟨x ∈ G : ||x − 1||op ≤ ε⟩. Since ||x^y − 1||op = ||(x − 1)^y||op = ||x − 1||op for any y ∈ Ud(C), we have that G_ε ◁ G. Also, if ||x − y||op ≤ ε then x^{−1}y ∈ G_ε. It is an exercise below to show that this implies that [G : G_ε] < ∞. We have that G_ε is finitely generated (as a finite index subgroup of the finitely generated group G), and we may write the finite number of generators of G_ε as finite products of elements x ∈ G with ||x − 1||op ≤ ε. Taking all elements (and their inverses) appearing in these products, there exists a finite symmetric generating set S_ε for G_ε = ⟨S_ε⟩ such that ||s − 1||op ≤ ε for all s ∈ S_ε and such that 1 ∉ S_ε.

We now assume for a contradiction the negation of the conclusion of the lemma for H = G_ε. That is, we assume that G_ε is not Abelian, and for any


z ∈ G_ε such that [z, x] = 1 for all x ∈ G_ε we have z = λ · 1 for some λ ∈ C. If we reach a contradiction, then taking H = G_ε will prove the lemma.

Consider the following inductive construction.

First step. Start by considering all s ∈ S_ε. If all such s are scalar, then G_ε = ⟨S_ε⟩ contains only scalars, and thus is Abelian, contradicting our assumption. Hence, we may assume that there exists some x_1 := y_1 ∈ S_ε such that x_1 is nonscalar.

Inductive step. Suppose we have defined a sequence y_1, . . . , y_k ∈ S_ε and y_1 = x_1, x_2, . . . , x_k with the following properties for all 1 ≤ j ≤ k − 1:

• x_{j+1} = [y_{j+1}, x_j].
• x_j, x_{j+1} are nonscalar.
• ||x_{j+1} − 1||op ≤ 2ε · ||x_j − 1||op.

Then, since x_k is nonscalar, it is not in the center, by our assumption on G_ε. Hence, there must exist y_{k+1} ∈ S_ε such that x_{k+1} := [y_{k+1}, x_k] ≠ 1. Note that

||x_{k+1} − 1||op ≤ 2 · ||y_{k+1} − 1||op · ||x_k − 1||op ≤ 2ε||x_k − 1||op ≤ (1/2)(2ε)^{k+1} ≤ ε < ξ_d.

Since det(x_{k+1}) = 1 and x_{k+1} ≠ 1, we have that x_{k+1} cannot be scalar. This completes the inductive construction, with the above properties.

Now, note that for all k, we have x_k ≠ 1 and ||x_k − 1||op ≤ (1/2)(2ε)^k ≤ ε.

Claim. For any k ≥ 1, the map ϕ : {0, 1, . . . , n}^k → G_ε given by ϕ(j_1, . . . , j_k) = x_1^{j_1} · · · x_k^{j_k} is injective.

To see this, note that for all r and j ≠ i ∈ {0, . . . , n}, by Exercise 9.4,

||x_r^j − x_r^i||op = ||x_r^{|j−i|} − 1||op ≤ |j − i| · ||x_r − 1||op ≤ n · ||x_r − 1||op,

and, for r ≥ ℓ,

||x_r − 1||op ≤ (2ε)^{r−ℓ} · ||x_ℓ − 1||op.

Also, by Lemma 9.2.1 (which applies since |j − i| ≤ n ≤ cπ/||x_ℓ − 1||op),

||x_ℓ^j − x_ℓ^i||op = ||x_ℓ^{|j−i|} − 1||op ≥ c|j − i| · ||x_ℓ − 1||op.

So,

||x_ℓ^{j_ℓ} · · · x_k^{j_k} − x_ℓ^{i_ℓ} · · · x_k^{i_k}||op ≥ ||x_ℓ^{j_ℓ} · · · x_{k−1}^{j_{k−1}} − x_ℓ^{i_ℓ} · · · x_{k−1}^{i_{k−1}}||op − ||x_k^{j_k} − x_k^{i_k}||op
≥ · · · ≥ ||x_ℓ^{j_ℓ} − x_ℓ^{i_ℓ}||op − n · Σ_{r=ℓ+1}^{k} ||x_r − 1||op
≥ c · |j_ℓ − i_ℓ| · ||x_ℓ − 1||op − n · ||x_ℓ − 1||op · Σ_{r=ℓ+1}^{∞} (2ε)^{r−ℓ}.

Since n · Σ_{r=1}^{∞} (2ε)^r ≤ c/2 (by our choice of ε), we get that if x_ℓ^{j_ℓ} · · · x_k^{j_k} = x_ℓ^{i_ℓ} · · · x_k^{i_k} then j_ℓ = i_ℓ. Since this holds for any 1 ≤ ℓ ≤ k, this proves the injectivity of ϕ.

Now, note that in the Cayley graph of G_ε with respect to the generating set S_ε, we have that |x_1| = 1 and |x_{k+1}| ≤ 2(|x_k| + |y_{k+1}|) = 2|x_k| + 2. So |x_k| ≤ 3 · 2^{k−1} − 2 by induction. Thus, for 0 ≤ j_1, . . . , j_k ≤ n we have

|x_1^{j_1} · · · x_k^{j_k}| ≤ Σ_{ℓ=1}^{k} j_ℓ (3 · 2^{ℓ−1} − 2) ≤ 3n · 2^k.

This proves the claim.

From this claim we deduce that in the Cayley graph of G_ε with respect to the generating set S_ε we have |B_{S_ε}(1, 3n·2^k)| ≥ (n + 1)^k for all k (n is large but fixed). Since [G : G_ε] < ∞, there exists a constant C = C(ε) > 0 such that B_{S_ε}(1, r) ⊆ B_G(1, Cr) for all r ≥ 0. We conclude that for all k ≥ 1 we have

(n + 1)^k ≤ |B_G(1, C·3n·2^k)| ≤ (C·3n·2^k)^D = (C3n)^D · 2^{Dk}.

Since n + 1 > 2^D we arrive at a contradiction for large enough k. Thus, the proof of the lemma is concluded by taking H = G_ε.

Exercise 9.5 Show that Ud(C) is compact with the metric induced by the norm || · ||op. Show that in the proof of Lemma 9.2.2, [G : G_ε] ≤ C for some C = C(d, ε) > 0. B solution C

Our goal in this section is to prove the following result, basically due to Jordan. It is Gromov's theorem for the special case of unitary matrix groups.

Theorem 9.2.3 Let G ≤ Ud(C) be a finitely generated group of unitary matrices. Then G has polynomial growth if and only if G is virtually Abelian.


Proof If G is virtually Abelian then G has a finite index subgroup isomorphic to Z^d (by Theorem 1.5.2), so G has polynomial growth. So we are left with proving the other direction.

Assume G ≤ Ud(C) is a finitely generated group of polynomial growth. We prove by induction on the dimension d that G is virtually Abelian. The base case d = 1 is where G ≤ U_1(C) is just an Abelian group.

For the induction step, assume d > 1. Let H ◁ G be a finite index normal subgroup, [G : H] < ∞, guaranteed by Lemma 9.2.2. If H is Abelian, we are done. Otherwise, there is an element a ∈ H such that a ≠ λ · 1 for any λ ∈ C and [x, a] = 1 for all x ∈ H. By conjugating H (and thus moving to an isomorphic group) we may assume that a is a diagonal matrix. (Indeed, a is unitary and thus diagonalizable over C. Conjugate all of H by the unitary matrix that conjugates a to a diagonal matrix.) That is, C^d = W_1 ⊕ · · · ⊕ W_m for some m ≤ d, where W_j = Ker(a − λ_j · 1) for some λ_1, . . . , λ_m ∈ C. Note that since a − λ_j · 1 commutes with any x ∈ H, we get that xW_j ⊆ W_j for all x ∈ H.

For every j set N_j = {x ∈ H : xw = w, ∀w ∈ W_j} ◁ H. Note that if x ∈ N_1 ∩ · · · ∩ N_m then xv = v for all v ∈ C^d, implying that x = 1. Thus, N_1 ∩ · · · ∩ N_m = {1}. It is then an easy exercise to show that the map H → H/N_1 × · · · × H/N_m given by x ↦ (N_1 x, . . . , N_m x) is an injective homomorphism.

Since a is not scalar, there exists j such that λ_j ≠ λ_1. Thus, m > 1, and so dim W_j < d for all 1 ≤ j ≤ m. For any 1 ≤ j ≤ m, note that H/N_j is isomorphic to a subgroup of U(W_j), the unitary operators on W_j. We have that H/N_j is finitely generated and of polynomial growth (as a quotient of the finite index subgroup H ≤ G, [G : H] < ∞), so H/N_j is virtually Abelian by induction. Hence, also the direct product H/N_1 × · · · × H/N_m is virtually Abelian. Thus, H is isomorphic to a subgroup of a virtually Abelian group, and hence H is virtually Abelian, which implies that G is virtually Abelian.

Exercise 9.6 Show that if N_1, . . . , N_m ◁ H are normal subgroups with N_1 ∩ · · · ∩ N_m = {1}, then the map x ↦ (N_1 x, . . . , N_m x) is an injective homomorphism. B solution C

Exercise 9.7 Show that the direct product of virtually Abelian groups is virtually Abelian. Show that any subgroup of a virtually Abelian group is virtually Abelian. B solution C


From Theorem 9.2.3 and Exercise 1.32 we immediately obtain the following corollary.

Corollary 9.2.4 Let V be a nontrivial finite-dimensional normed vector space over C. Let G ≤ U(V) be a finitely generated infinite group of unitary operators on V. If G has polynomial growth, then there exists a finite index normal subgroup H ◁ G, [G : H] < ∞, such that H admits a surjective homomorphism onto Z.

9.3 Harmonic Cocycles

Suppose that a group G acts on a Hilbert space H. Suppose that for any x ∈ G the map v ↦ x.v is a unitary operator. We say that G acts on H by unitary operators. This is equivalent to the existence of a homomorphism U : G → U(H), with the action defined as x.v := U(x)v for v ∈ H and x ∈ G.

Definition 9.3.1 Let G be a group acting on a Hilbert space H by unitary operators. That is, U : G → U(H) is a homomorphism and x.v = U(x)v for v ∈ H and x ∈ G. A map c : G → H is called a cocycle (with respect to U) if

c(xy) = c(x) + x.c(y), for all x, y ∈ G.

As in the complex-valued case, a map h : G → H is µ-harmonic if Σ_y µ(y)h(xy) = h(x) for all x ∈ G, where the above sum converges absolutely.

Exercise 9.8 Let G be a finitely generated group acting on a Hilbert space H by unitary operators. Show that if c : G → H is a cocycle and if S is a finite symmetric generating set for G, then for any x ∈ G we have ||c(x)|| ≤ |x| · max_{s∈S} ||c(s)||. B solution C

Exercise 9.9 Let G be a finitely generated group acting on a Hilbert space H by unitary operators. Let c : G → H be a cocycle. Show that c(1) = 0. Show that c(x^{−1}) = −x^{−1}.c(x). B solution C

Recall that SA (G, 1) is the collection of symmetric, adapted probability measures on G with finite first moment.


Exercise 9.10 Let G be a finitely generated group acting on a Hilbert space H by unitary operators. Let c : G → H be a cocycle. Show that for µ ∈ SA(G, 1), c is µ-harmonic if and only if Σ_y µ(y)c(y) = 0. B solution C

Exercise 9.11 Let G be a finitely generated group acting on a Hilbert space H by unitary operators. Fix v ∈ H and define cv (x) := x.v − v. Show that cv is a cocycle. (These cocycles are sometimes called co-boundaries.) B solution C
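The cocycle identity for co-boundaries can be illustrated concretely. The sketch below (not from the book; the representation, seed, and dimension are illustrative choices) takes a homomorphism U from Z into the unitary group, generated by a single random unitary a, and checks that c(x) = U(x)v − v satisfies c(xy) = c(x) + x.c(y):

```python
import numpy as np

rng = np.random.default_rng(2)
d = 3
g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
q, r = np.linalg.qr(g)
a = q * (np.diag(r) / np.abs(np.diag(r)))   # one unitary generator; here G = <a> = Z
v = rng.normal(size=d) + 1j * rng.normal(size=d)

U = lambda n: np.linalg.matrix_power(a, n)  # homomorphism Z -> U_d(C), n -> a^n
c = lambda n: U(n) @ v - v                  # the co-boundary cocycle c_v

for m in rng.integers(-5, 6, size=10):
    for n in rng.integers(-5, 6, size=10):
        lhs = c(m + n)                      # c(xy) for x = a^m, y = a^n
        rhs = c(m) + U(m) @ c(n)            # c(x) + x.c(y)
        assert np.allclose(lhs, rhs)
```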

Exercise 9.12 Let G be a finitely generated group acting on a Hilbert space H by unitary operators. Let c : G → H be a cocycle. Let v ∈ H. Define h(x) := ⟨c(x), v⟩. Let µ ∈ SA(G, 1). Show that h ∈ LHF(G, µ). (Recall that LHF is the space of Lipschitz harmonic functions.)

9.3.1 Construction

This section is devoted to proving the following fundamental result. (Recall that a Hilbert space is separable if it has a countable orthonormal basis.)

Theorem 9.3.2 Let G be a finitely generated infinite amenable group, and µ ∈ SA(G, 1). Then, there exists a separable Hilbert space H such that G acts on H by unitary operators, and such that there exists a nonzero µ-harmonic cocycle c : G → H.

Remark 9.3.3 In fact, the proof of Theorem 9.3.2 will show that we may take H = ℓ2(G, µ). However, the G action on ℓ2(G, µ) will not necessarily be the regular action, but some other homomorphism U : G → U(ℓ2(G, µ)).

Recall that ` 0 (G) denotes the space of finitely supported complex-valued functions on G. For a probability measure µ on G, we define the transition matrix, Laplacian, and Green function:

P(x, y) = µ(x^{−1}y),    ∆ = I − P,    g(x, y) = Σ_{t=0}^{∞} P^t(x, y),

where the latter is only defined if (G, µ) is transient. These operators are always defined on ℓ0(G), but in some cases can be extended to larger spaces. Recall also the space ℓ2(G, µ), which consists of functions f : G → C with


||f||² = Σ_x |f(x)|² < ∞. Also, ℓ2(G, µ) is a Hilbert space with the inner product ⟨f, h⟩ = Σ_x f(x)·h̄(x). See Chapter 4.

Exercise 9.13 Let G be an infinite finitely generated group and let µ be a symmetric adapted probability measure on G. Assume that the µ-random walk on G is transient. Fix f ∈ ℓ0(G). Show that ϕ := Σ_{k=0}^{∞} P^k f = g f is well defined. Show that ∆ϕ = f. Show that if f is a nonnegative function, then for any n,

⟨∆ϕ, ϕ⟩ ≥ Σ_{k=0}^{n} ⟨P^k f, f⟩.

B solution C

Exercise 9.14 Let G be a finitely generated group and let µ be a symmetric adapted probability measure on G. We say that (G, µ) satisfies the bubble condition if (G, µ) is transient and the Green function g satisfies Σ_y |g(1, y)|² < ∞. Show that if (G, µ) satisfies the bubble condition, then there exists a linear operator ∆^{−1} : ℓ0(G) → ℓ2(G, µ) such that ∆∆^{−1} f = f for any f ∈ ℓ0(G). B solution C

Now, consider some finite subset A ⊂ G. Note that for f = |A|^{−1/2} 1_A, we have ||f|| = 1 and

⟨P^k f, f⟩ = (1/|A|) Σ_{a∈A} P_a[X_k ∈ A] = 1 − (1/|A|) Σ_{a∈A} P_a[X_k ∉ A].

If G is amenable, then for any n we may find a finite set F_n such that with f_n := |F_n|^{−1/2} 1_{F_n}, we have ⟨P^k f_n, f_n⟩ ≥ 1/2 for all 0 ≤ k < n. Also, since f_n is a nonnegative function, if we could define ϕ_n := Σ_{k=0}^{∞} P^k f_n, then the above exercises would give that

||∆ϕ_n|| = 1 and ⟨∆ϕ_n, ϕ_n⟩ ≥ n/2.

Specifically,

||∆ϕ_n||² / ⟨∆ϕ_n, ϕ_n⟩ → 0.

The only obstacles are the convergence of the series defining ϕ_n, for example, when (G, µ) is recurrent, and also ensuring that ϕ_n ∈ ℓ2(G, µ), for example, the bubble condition. This is the content of the next lemma, where we replace the infinite sum by a truncated finite sum.
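This Følner-set computation can be seen concretely for G = Z. The sketch below (not from the book; the choice F_n = {0, . . . , L − 1} with L = 4√n is one convenient Følner family for the simple random walk, and the window size is an implementation convenience) verifies that the overlaps ⟨P^k f_n, f_n⟩ stay at least 1/2 for all 0 ≤ k < n:

```python
import numpy as np

n = 100
L = int(4 * np.sqrt(n))          # Folner set size; sufficient for simple random walk on Z
width = L + n + 10               # enough room so the walk never wraps around
size = 2 * width
f = np.zeros(size)
f[width:width + L] = L ** -0.5   # f = |F|^{-1/2} * indicator of F = {0, ..., L-1}

P = lambda g: 0.5 * (np.roll(g, 1) + np.roll(g, -1))  # simple random walk operator

pk = f.copy()
overlaps = []
for k in range(n):
    overlaps.append(np.dot(pk, f))   # <P^k f, f>
    pk = P(pk)

assert min(overlaps) >= 0.5
```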


Lemma 9.3.4 Let G be a finitely generated infinite group, and let µ be a symmetric adapted probability measure on G. Let f ∈ ℓ2(G, µ) be a real-valued function such that ||f|| = 1, ||∆f|| ≤ 1/2, and

lim_{k→∞} (1/k) Σ_{j=0}^{k−1} ⟨P^j f, f⟩ = 0.

Then there exists ϕ ∈ ℓ2(G, µ) such that

||∆ϕ||² / ⟨∆ϕ, ϕ⟩ ≤ 16||∆f||.

Proof Define ϕ_k = Σ_{j=0}^{k−1} P^j f. Since P is a contraction,

||∆ϕ_k|| = ||(I − P^k) f|| ≤ 2||f|| = 2.

Also, because µ is symmetric, P is self-adjoint, and so is I − P^k, and thus,

⟨∆ϕ_k, ϕ_k⟩ = ⟨(I − P^k) f, ϕ_k⟩ = ⟨f, (I − P^k) ϕ_k⟩ = ⟨f, 2ϕ_k − ϕ_{2k}⟩,    (9.2)

where we have used that

P^k ϕ_k = Σ_{j=0}^{k−1} P^{j+k} f = ϕ_{2k} − ϕ_k.

For any m ≥ 1 we have that, since P is a contraction,

||(I − P^m) f|| ≤ Σ_{j=0}^{m−1} ||(P^j − P^{j+1}) f|| ≤ m||∆f||,

so by Cauchy–Schwarz,

1 − ⟨P^m f, f⟩ = ⟨(I − P^m) f, f⟩ ≤ m||∆f||.

This implies that for any m, if we set a_m := ⟨ϕ_{2^m}, f⟩ then

a_m = Σ_{j=0}^{2^m − 1} ⟨P^j f, f⟩ ≥ Σ_{j=0}^{2^m − 1} (1 − j||∆f||) ≥ 2^m (1 − 2^{m−1} ||∆f||).

By the assumptions of the lemma, 2^{−m} a_m → 0, so we may write

2^{−m} a_m = Σ_{k≥m} (2^{−k} a_k − 2^{−(k+1)} a_{k+1}), implying a_m = Σ_{k≥m} 2^{m−k−1} (2a_k − a_{k+1}).


Since Σ_{k≥m} 2^{m−k−1} = 1, and since a_k ∈ R (f is assumed to be real-valued), there must exist k ≥ m such that 2a_k − a_{k+1} ≥ a_m. Choose m ≥ 1 such that 2^{m−1} ≤ 1/(2||∆f||) ≤ 2^m (recall that we assume that 2||∆f|| ≤ 1). So

a_m = ⟨ϕ_{2^m}, f⟩ ≥ 2^m (1 − 2^{m−1}||∆f||) ≥ (4||∆f||)^{−1}.

Thus, we conclude that there exists k ≥ m such that 2a_k − a_{k+1} ≥ (4||∆f||)^{−1}, so that by (9.2),

⟨∆ϕ_{2^k}, ϕ_{2^k}⟩ = 2a_k − a_{k+1} ≥ (4||∆f||)^{−1},

and consequently, for ϕ := ϕ_{2^k},

||∆ϕ||² / ⟨∆ϕ, ϕ⟩ ≤ 4 · 4||∆f|| = 16||∆f||.

The next lemma tells us that in order to find a harmonic cocycle for a unitary action of G on a Hilbert space H, it suffices to find a sequence of “almost harmonic” cocycles and take an appropriate limit. The price to pay, however, is that the unitary action for the harmonic cocycle may be different from the original action.

Lemma 9.3.5 Let G be a finitely generated group acting on a separable Hilbert space H by unitary operators. Let µ ∈ SA(G, 1). Let c_n : G → H be a sequence of cocycles such that

• (c_n)_n are uniformly Lipschitz, that is, sup_n max_{s∈S} ||c_n(s)|| < ∞, for some finite symmetric generating set S, and
• (c_n)_n are almost harmonic, that is,

lim_{n→∞} ||Σ_x µ(x) c_n(x)|| = 0.

Then, there exists a homomorphism U : G → U(H) and a map c : G → H such that c is a µ-harmonic cocycle with respect to U; that is, for all x, y ∈ G,

c(xy) = c(x) + U(x)c(y) and Σ_x µ(x)c(x) = 0.

Moreover, for any fixed x ∈ G we may choose c such that

||c(x)|| ≥ lim sup_{n→∞} ||c_n(x)||.

Proof Fix some enumeration G = {x_1, x_2, . . .}. Let κ := sup_n max_{s∈S} ||c_n(s)||. Note that ||c_n(x)|| ≤ κ|x| for all n and all x ∈ G (see Exercise 9.8). There exists an infinite set J_0 ⊂ N such that

lim_{J_0 ∋ n→∞} ||c_n(x_1)|| = lim sup_{n→∞} ||c_n(x_1)||.


Let (b_n)_n be an orthonormal basis for H. For each n let V_n = span{b_1, . . . , b_n}. So V_n ≤ V_{n+1} for all n. Fix n, and consider the subspaces W_{n,k} = span{c_n(x_j) : j ≤ k}. So W_{n,k} ≤ W_{n,k+1}. Since dim W_{n,k} ≤ k, we may find a unitary operator U_n : H → H such that U_n(W_{n,k}) ≤ V_k for all k ≤ n. Define c̃_n(x) = U_n c_n(x).

Note that for any k ≤ n we have that c̃_n(x_k) ∈ V_k, which is a finite-dimensional space. Also, we have that ||c̃_n(x_k)|| = ||c_n(x_k)|| ≤ κ|x_k|. Thus the sequence (c̃_n(x_k))_n is inside the set R_k = {v ∈ V_k : ||v|| ≤ κ|x_k|}. Since dim V_k < ∞, we know that R_k is compact. This implies that for any infinite set J ⊂ N, there exists another infinite subset I ⊂ J such that the subsequence (c̃_n(x_k))_{n∈I} converges in H.

By taking further and further subsequences, we obtain by induction that there exists a sequence J_k ⊃ J_{k+1} of infinite subsets of N such that for any ℓ ≤ k the subsequence (c̃_n(x_ℓ))_{n∈J_k} converges in H. Write J_k = {m_{k,1} < m_{k,2} < · · · }. For any n let m(n) = m_{n,n}, and consider the subsequence (c̃_{m(n)})_n. Since this is a subsequence of (c̃_n)_{n∈J_k} for any k, we obtain that the limit

c(x_k) := lim_{n→∞} c̃_{m(n)}(x_k) = lim_{J_k ∋ n→∞} c̃_n(x_k)

exists in H for all k ≥ 1. (This was basically the Arzelà–Ascoli theorem; see also Exercise 1.97.) Note that for any k ≥ 1,

||c(x_k)|| = lim_{J_k ∋ n→∞} ||c_n(x_k)||,

so that ||c(x_1)|| = lim sup_{n→∞} ||c_n(x_1)||.

We now show that c is µ-harmonic at 1 ∈ G. Fix some small ε > 0. Recalling that µ ∈ SA(G, 1), we can choose A ⊂ G a large enough but finite subset such that Σ_{x∉A} µ(x) …

Theorem 9.4.1 … there exists a constant c > 0 such that for all t ≥ 1 we have E|X_t|² ≥ ct.

Proof If G is non-amenable then we have already seen in Exercise 5.35 that E|X_t| ≥ ct for some c > 0. So assume that G is amenable.

Let c : G → H be a harmonic cocycle into a Hilbert space H on which G acts by unitary operators. Let U_t = X_{t−1}^{−1} X_t be the “jumps” of the random walk. Then, for t > 0,

E||c(X_t) − c(X_{t−1})||² = E||X_{t−1}.c(U_t)||² = E||c(U_t)||² = E||c(X_1)||².

Also, since U_t is independent of F_{t−1},

E⟨c(X_t) − c(X_{t−1}), c(X_{t−1})⟩ = E E[⟨c(U_t), X_{t−1}^{−1}.c(X_{t−1})⟩ | F_{t−1}] = E⟨Σ_x µ(x)c(x), X_{t−1}^{−1}.c(X_{t−1})⟩ = 0.

So

E||c(X_t)||² = E[||c(X_t) − c(X_{t−1})||² + ||c(X_{t−1})||² + 2Re⟨c(X_t) − c(X_{t−1}), c(X_{t−1})⟩] = E||c(X_1)||² + E||c(X_{t−1})||² = · · · = t · E||c(X_1)||².

By Exercise 9.8, we conclude that

t · E||c(X_1)||² ≤ E||c(X_t)||² ≤ (sup_{s∈S} ||c(s)||)² · E|X_t|².
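The identity E||c(X_t)||² = t · E||c(X_1)||² can be illustrated in the simplest possible setting (not from the book; the step distribution below is an arbitrary symmetric choice): take G = Z acting trivially on H = R, with the cocycle c(n) = n. Then E[c(X_t)²] is just the variance of the walk, which grows exactly linearly. We verify this with exact distributions computed by convolution:

```python
import numpy as np

mu = {-2: 0.1, -1: 0.2, 0: 0.4, 1: 0.2, 2: 0.1}   # symmetric, so sum_y mu(y)*c(y) = 0
step_var = sum(p * y * y for y, p in mu.items())   # E[c(X_1)^2]

# Exact distribution of X_t by repeated convolution of mu with itself.
supp = np.arange(-2, 3)
dist = np.array([mu[y] for y in supp])
vals = supp.copy()
for t in range(1, 6):
    second_moment = float(np.dot(dist, vals.astype(float) ** 2))
    assert np.isclose(second_moment, t * step_var)      # E[c(X_t)^2] = t * E[c(X_1)^2]
    dist = np.convolve(dist, [mu[y] for y in supp])     # distribution of X_{t+1}
    vals = np.arange(vals[0] - 2, vals[-1] + 3)
```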

9.5 Ozawa's Theorem

In this section we will prove the following theorem (basically due to Ozawa, also based on works of Shalom), which is the main step in proving Gromov's theorem (Theorem 9.0.1).

Theorem 9.5.1 Let G be a finitely generated infinite group and µ ∈ SA(G, 2). Let (X_t)_t denote the µ-random walk. Assume that G is amenable. Then, at least one of the following holds:


• There exists a normal subgroup N ◁ G of infinite index [G : N] = ∞ such that G/N is isomorphic to a subgroup of Ud(C), or
• there exist normal subgroups K ◁ N ◁ G with index [G : N] < ∞ and [N : K] = ∞ such that N/K is Abelian, or
• there exists h ∈ LHF(G, µ), a nonconstant Lipschitz harmonic function on G, such that

lim_{t→∞} (1/t) · sup_x Var_x[h(X_t)] = 0.

The step from Ozawa's theorem to Gromov's theorem is not a difficult one. If G has polynomial growth then it is amenable. Using the polynomial growth together with Theorem 6.5.5, one rules out the third option in Ozawa's theorem. Either one of the first two options in Ozawa's theorem shows that G is virtually indicable. This will be spelt out in full detail at the end of the chapter; see Section 9.6.

For now, we build the necessary machinery to prove Ozawa's theorem. Suppose that T is a bounded linear operator on a Hilbert space H. We denote by H_T = {v ∈ H : Tv = v} = Ker(I − T), which we call the subspace of T-invariant vectors. Since T is continuous, H_T is a closed subspace, so there exists an orthogonal projection onto H_T. The main tool used is the classical von Neumann ergodic theorem.

Theorem 9.5.2 (von Neumann ergodic theorem) Let T be a bounded self-adjoint operator on a Hilbert space H. Assume that ||T|| ≤ 1. Let H_T be the subspace of T-invariant vectors. Let π be the orthogonal projection onto H_T. Then, for any v ∈ H we have

lim_{n→∞} (1/n) Σ_{k=0}^{n−1} T^k v = πv.    (9.3)

Proof Let V be the set of all v ∈ H such that (9.3) holds for v. We wish to show that V = H. It is an exercise to show that V is a closed subspace, and that HT ≤ V . Now, let U = {T v − v : v ∈ H}. Another exercise is to show that U is a subspace (but this time it is not guaranteed to be closed). Now, if u = T v − v ∈ U then for any w ∈ HT we have hu, wi = hv, T wi − hv, wi = 0. So U ⊥ HT , which implies that πu = 0 for all u ∈ U. Now, for any u = T v − v ∈ U, we have


||(1/n) Σ_{k=0}^{n−1} T^k u|| = (1/n) ||T^n v − v|| ≤ 2||v||/n → 0.

So we obtain that U ≤ V. Since V is closed, we get that W := cl(U + H_T) ≤ V. Now assume that w ⊥ W. Then w ⊥ U. Specifically, w ⊥ Tw − w. Since ||Tw|| ≤ ||w||,

||Tw − w||² = ||Tw||² + ||w||² − 2Re⟨Tw, w⟩ ≤ −2Re⟨Tw − w, w⟩ = 0,

so Tw = w. But then w ∈ H_T, and so w ⊥ w, implying that w = 0. We arrive at the conclusion that W^⊥ = {0}, which is to say that H = W ≤ V, proving the theorem.

Exercise 9.17 Show that V in the proof of Theorem 9.5.2 is a closed subspace of H. Show that H_T ≤ V. Show that U = {Tv − v : v ∈ H} is a subspace of H. B solution C
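The convergence (9.3) can be watched numerically in finite dimensions. The sketch below (not from the book; the spectrum {1, 1, 0.5, −0.7} and the averaging length are illustrative choices) builds a self-adjoint T with ||T|| ≤ 1 whose fixed subspace is two-dimensional, and checks that the Cesàro averages approach the projection of v onto that subspace:

```python
import numpy as np

rng = np.random.default_rng(3)
q, _ = np.linalg.qr(rng.normal(size=(4, 4)))       # random orthogonal eigenbasis
T = q @ np.diag([1.0, 1.0, 0.5, -0.7]) @ q.T       # self-adjoint, ||T||_op = 1

v = rng.normal(size=4)
pi_v = q[:, :2] @ (q[:, :2].T @ v)                 # projection onto H_T (eigenvalue 1)

n = 2000
avg, tkv = np.zeros(4), v.copy()
for _ in range(n):                                 # (1/n) sum_{k=0}^{n-1} T^k v
    avg += tkv
    tkv = T @ tkv
avg /= n

assert np.linalg.norm(avg - pi_v) < 1e-2
```

The eigencomponents with |λ| < 1 are killed by the averaging at rate O(1/n), while the eigenvalue-1 components survive untouched, which is exactly the content of the theorem in this toy case.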

9.5.1 Hilbert–Schmidt Operators

In the following, we develop the theory of Hilbert–Schmidt spaces for general Hilbert spaces. We will, however, only require the theory in the separable case. Some readers may find it simpler to proceed with the assumption that all Hilbert spaces mentioned have a countable orthonormal basis (i.e. are separable). See Section A.3 for more details.

Let H be a Hilbert space. Consider the space of Hilbert–Schmidt operators on H: for a bounded linear operator T : H → H define the Hilbert–Schmidt norm

||T||²_HS = Σ_{b∈B} ||Tb||²

(which may be infinite), where B is some orthonormal basis for H. Note that this definition is independent of the specific choice of orthonormal basis.

Exercise 9.18 Show that if B, B̃ are two orthonormal bases for H, then for any bounded linear operator T : H → H,

Σ_{b∈B} ||Tb||² = Σ_{b∈B̃} ||Tb||² = Σ_{b,b′∈B} |⟨Tb, b′⟩|².

Show that ||T||op ≤ ||T||HS. B solution C
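In finite dimensions the Hilbert–Schmidt norm is the Frobenius norm, and both claims of Exercise 9.18 can be verified directly. A sketch (not from the book; the dimension and seed are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
d = 5
T = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))

def hs_norm_sq(T, basis):
    # basis: columns form an orthonormal basis; sum of ||T b||^2 over basis vectors.
    return sum(np.linalg.norm(T @ basis[:, j]) ** 2 for j in range(d))

std = np.eye(d)
g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
q, r = np.linalg.qr(g)
q = q * (np.diag(r) / np.abs(np.diag(r)))            # a second orthonormal basis

hs1, hs2 = hs_norm_sq(T, std), hs_norm_sq(T, q)
assert np.isclose(hs1, hs2)                          # basis independence
assert np.isclose(hs1, np.linalg.norm(T, 'fro') ** 2)
assert np.linalg.norm(T, 2) <= np.sqrt(hs1) + 1e-9   # ||T||_op <= ||T||_HS
```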


Definition 9.5.3 Let H be a Hilbert space. A bounded linear operator A : H → H is called a Hilbert–Schmidt operator if ||A||HS < ∞. The space of all Hilbert–Schmidt operators is denoted by H∗ ⊗ H.

Exercise 9.19 Show that H∗ ⊗ H is a Hilbert space under the inner product

⟨A, A′⟩_HS := Σ_{b∈B} ⟨Ab, A′b⟩,

where B is some orthonormal basis for H. Show that this defines an inner product and is independent of the choice of specific orthonormal basis. Show that ⟨A∗, (A′)∗⟩_HS = ⟨A, A′⟩_HS. B solution C

Exercise 9.20 If v, u ∈ H then define the linear operator v ⊗ u : H → H by

(v ⊗ u)w := ⟨w, v⟩ u.

Show that v ⊗ u ∈ H∗ ⊗ H for any v, u ∈ H. Show that

⟨v ⊗ u, v′ ⊗ u′⟩_HS = ⟨v′, v⟩ · ⟨u, u′⟩,

so that ||v ⊗ u||HS = ||v|| · ||u||. Show that if B is an orthonormal basis for H, then {b ⊗ b′ : b, b′ ∈ B} is an orthonormal basis for H∗ ⊗ H. B solution C

One way to think about Hilbert–Schmidt operators is as a generalization of matrices. One can think of ⟨Ab, b′⟩ as the coefficients of the operator A in the basis B, just like with matrices, where M_{i,j} = ⟨Me_j, e_i⟩ are the coefficients in the standard basis.

Exercise 9.21 Let H be a Hilbert space and let A ∈ H∗ ⊗ H be a Hilbert–Schmidt operator. Let B be an orthonormal basis. Show that there exists a countable subset E = {e_1, e_2, . . .} ⊂ B such that Ab = 0 for any b ∈ B\E. For any n, define A_n : H → H by

A_n v = Σ_{k=1}^{n} ⟨v, e_k⟩ Ae_k.

Show that for any ε > 0 there exists n_ε such that ||A − A_n||HS < ε for all n ≥ n_ε. B solution C
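The identities of Exercise 9.20 can be checked in coordinates. With the convention ⟨a, b⟩ = Σ_i a_i·conj(b_i), the operator v ⊗ u is represented by the matrix outer(u, conj(v)), and ⟨A, B⟩_HS = tr(B∗A). A sketch (not from the book; dimension and seed are illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)
d = 4
inner = lambda a, b: np.vdot(b, a)       # <a, b>, linear in a, conjugate-linear in b

def tensor(v, u):
    return np.outer(u, v.conj())         # matrix of the map w -> <w, v> u

v, u, v2, u2 = (rng.normal(size=d) + 1j * rng.normal(size=d) for _ in range(4))
w = rng.normal(size=d) + 1j * rng.normal(size=d)

assert np.allclose(tensor(v, u) @ w, inner(w, v) * u)   # (v (x) u)w = <w, v> u

hs = np.trace(tensor(v2, u2).conj().T @ tensor(v, u))   # <A, B>_HS = tr(B* A)
assert np.isclose(hs, inner(v2, v) * inner(u, u2))      # = <v', v> <u, u'>
```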


The natural topology on a Hilbert space is the one induced by the metric given by the norm. However, the unit ball is not necessarily compact in this topology.

Exercise 9.22 Let H be a Hilbert space. Show that the unit ball {u ∈ H : ||u|| ≤ 1} is compact if and only if dim H < ∞. B solution C

Definition 9.5.4 Consider a different topology on H. We say that (u_n)_n converges weak∗ to u (or, in the weak∗ topology), if for any v ∈ H we have ⟨u_n, v⟩ → ⟨u, v⟩.

Exercise 9.23 Show that if (u_n)_n converges weak∗ to u then

||u|| ≤ lim inf_{n→∞} ||u_n||.

B solution C

Exercise 9.24 Let B be an orthonormal basis for H. Show that if (u_n)_n is a sequence in the unit ball, then (u_n)_n converges weak∗ to u if and only if for any b ∈ B we have ⟨u_n − u, b⟩ → 0. B solution C
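The standard example separating the two topologies is the orthonormal basis sequence (e_n): it converges weak∗ to 0 although ||e_n|| = 1 for every n. A small numerical caricature (not from the book; truncated to a finite window, with an arbitrary ℓ² test vector):

```python
import numpy as np

N = 5000
v = 1.0 / (1 + np.arange(N))              # a fixed square-summable vector (truncated)
e = lambda n: np.eye(1, N, n).ravel()     # basis vector e_n

# <e_n, v> = v[n] -> 0 as n grows, while each ||e_n|| = 1 exactly.
inners = [abs(np.dot(e(n), v)) for n in (10, 100, 1000, 4999)]
assert inners == sorted(inners, reverse=True) and inners[-1] < 1e-3
assert all(np.linalg.norm(e(n)) == 1.0 for n in (10, 100, 1000, 4999))
```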

The weak∗ topology has the advantage that the unit ball is compact (see Lemma 9.5.5). However, unlike the norm topology, not every bounded operator is continuous with respect to the weak∗ topology. That is, if (u_n)_n converges weak∗ to u, it does not necessarily hold that ||A(u_n − u)|| → 0 for a general bounded operator A. One feature of Hilbert–Schmidt operators is that they are continuous even with respect to this weaker topology (i.e. ||A(u_n − u)|| → 0 if ||A||HS < ∞).

Exercise 9.25 Let H be a Hilbert space and let A ∈ H∗ ⊗ H be a Hilbert–Schmidt operator. Show that A : H → H is a continuous function from H with the weak∗ topology to H with the norm topology. That is, show that if (u_n)_n is a sequence in H such that for some u ∈ H and any v ∈ H, we have ⟨u_n, v⟩ → ⟨u, v⟩, then ||A(u_n − u)|| → 0. B solution C

Lemma 9.5.5 Let H be a Hilbert space and let A ∈ H∗ ⊗ H be a self-adjoint Hilbert–Schmidt operator. Consider ϕ : H → R given by ϕ(u) = ⟨Au, u⟩. Then ϕ is maximized and minimized on the unit ball in H. That is, there exist

https://doi.org/10.1017/9781009128391.012 Published online by Cambridge University Press

9.5 Ozawa’s Theorem

323

v, v′ ∈ H with ||v||, ||v′|| ≤ 1 such that

ϕ(v) = max{ϕ(u) : ||u|| ≤ 1} and ϕ(v′) = min{ϕ(u) : ||u|| ≤ 1}.

Proof This is essentially due to the Banach–Alaoglu theorem, which tells us that the unit ball is compact in the weak∗ topology, and the fact that ϕ is continuous on this compact set. However, we will provide a self-contained proof.

Let K = {u ∈ H : ||u|| ≤ 1} denote the unit ball. Set M = sup_{u∈K} ϕ(u). Let (u_n)_n be a sequence in K such that ϕ(u_n) → M. We claim that there exists a subsequence of (u_n)_n that converges weak∗ in H.

Let B be an orthonormal basis. Since Σ_b |⟨u_n, b⟩|² = ||u_n||² < ∞ for all n, we have that there exists a countable subset E = {e_1, e_2, . . .} ⊂ B such that ⟨u_n, b⟩ = 0 for all n and any b ∈ B\E. The sequence (⟨u_n, e_1⟩)_n is a bounded sequence. Thus there exists a sequence (n_{1,j})_j such that

lim_{j→∞} ⟨u_{n_{1,j}}, e_1⟩ = λ_1 ∈ C.

Inductively, for k > 1, suppose we have a sequence (n_{k−1,j})_j such that for all 1 ≤ ℓ ≤ k − 1 the limits

lim_{j→∞} ⟨u_{n_{k−1,j}}, e_ℓ⟩ = λ_ℓ

exist. Considering the bounded sequence (⟨u_{n_{k−1,j}}, e_k⟩)_j, we may extract a further subsequence of (n_{k−1,j})_j, say (n_{k,j})_j, so that for all 1 ≤ ℓ ≤ k we have the limits

lim_{j→∞} ⟨u_{n_{k,j}}, e_ℓ⟩ = λ_ℓ,

for some λ_k ∈ C. Constructing this for every k, we now consider the diagonal subsequence v_j := u_{n_{j,j}}. We have that for any k ≥ 1,

lim_{j→∞} ⟨v_j, e_k⟩ = lim_{j→∞} ⟨u_{n_{k,j}}, e_k⟩ = λ_k.

By Fatou's lemma,

Σ_k |λ_k|² ≤ lim inf_{j→∞} Σ_k |⟨v_j, e_k⟩|² = lim inf_{j→∞} ||v_j||² ≤ 1.

Thus, u = Σ_k λ_k e_k is a well-defined vector in H, with ||u||² ≤ 1. Note that for any k ≥ 1 we have that

⟨u, e_k⟩ = λ_k = lim_{j→∞} ⟨v_j, e_k⟩.


Also, ⟨u, b⟩ = 0 = ⟨v_j, b⟩ for all b ∈ B\E. By Exercise 9.24, (v_j)_j is a subsequence of (u_n)_n that converges weak∗ to u. Now,

|ϕ(v_j) − ϕ(u)| = |⟨A(v_j − u), v_j⟩ + ⟨Au, v_j − u⟩| ≤ 2||A(v_j − u)|| → 0,

by Exercise 9.25. Since ϕ(v_j) → M = sup_{v∈K} ϕ(v) as j → ∞, and since u ∈ K, we conclude that ϕ(u) = M. This proves that ϕ is maximized on the unit ball in H. Replacing A by −A proves that ϕ is also minimized.

The next lemma shows that, just like in the matrix case, a self-adjoint Hilbert–Schmidt operator always has a nontrivial eigenvector.

Lemma 9.5.6 Let 0 ≠ A ∈ H∗ ⊗ H be self-adjoint. Then there exists an eigenvector 0 ≠ v ∈ H with Av = λv for some λ ≠ 0.

Proof This is due to the fact that Hilbert–Schmidt operators are compact operators (on a Banach space). But we will give a self-contained proof.

Let M = max{⟨Au, u⟩ : ||u|| ≤ 1}. Since A ≠ 0, by perhaps replacing A by −A we may assume without loss of generality that M > 0. Fix v, u ∈ H and let p : R → C be

p(ξ) = ⟨A(v + ξu), v + ξu⟩ / ⟨v + ξu, v + ξu⟩ = (⟨Av, v⟩ + ξ² · ⟨Au, u⟩ + ξ · (⟨Av, u⟩ + ⟨Au, v⟩)) / (||v||² + ξ² · ||u||² + ξ · (⟨v, u⟩ + ⟨u, v⟩)).

Since the numerator and denominator are quadratic polynomials in ξ, we get that

p′(0) = ((⟨Av, u⟩ + ⟨Au, v⟩) · ||v||² − (⟨v, u⟩ + ⟨u, v⟩) · ⟨Av, v⟩) / ||v||⁴.

Consider ϕ(u) = ⟨Au, u⟩ and the Rayleigh quotient ψ(u) = ϕ(u)/||u||², defined for any u ≠ 0. Note that ψ(αu) = ψ(u) for all u ∈ H\{0} and all α ∈ C\{0}. By Lemma 9.5.5, we know that ϕ is maximized on the unit ball by some vector u ∈ H with ||u|| ≤ 1. Since ϕ(u) = M = max{ϕ(w) : ||w|| ≤ 1} > 0, we know that u ≠ 0. Set v = u/||u||. Then, for any w ∈ H with 0 < ||w|| ≤ 1,

ψ(v) = ψ(u) = ϕ(u)/||u||² ≥ ϕ(w/||w||) = ψ(w/||w||) = ψ(w).

So v maximizes ψ on the unit ball. Then, with this v, the function p(ξ) = ψ(v + ξu) defined above is maximized at ξ = 0. By the above formula for p′(0), we have for any u ∈ H that

⟨Av, u⟩ + ⟨Au, v⟩ = (⟨v, u⟩ + ⟨u, v⟩) · ⟨Av, v⟩.


Taking λ = ⟨Av, v⟩ we have that for any u ∈ H,

⟨Av − λv, u⟩ + ⟨u, A∗v − λv⟩ = 0.

Since A = A∗ we obtain Re⟨Av − λv, u⟩ = 0 for any u ∈ H. So also

Im⟨Av − λv, u⟩ = Re⟨Av − λv, iu⟩ = 0

for any u ∈ H. Thus, Av = λv.

Now we add to the mix a group G acting on the Hilbert space H by unitary operators.

Exercise 9.26 Let G be a group acting on H by unitary operators. Show that G acts on H∗ ⊗ H by conjugation. Show that this action is a unitary action on the Hilbert space H∗ ⊗ H.

B solution C

Exercise 9.27 Show that x(v ⊗ u)y⁻¹ = yv ⊗ xu (as operators on H). Specifically, x(v ⊗ u)x⁻¹ = xv ⊗ xu. B solution C

Lemma 9.5.7 Let G be a group. Let H be a Hilbert space on which G acts by unitary operators. Assume that 0 ≠ L ∈ H∗ ⊗ H is a G-invariant vector; that is, xLx⁻¹ = L for all x ∈ G. Then there exists a finite-dimensional subspace V ≤ H, dim V < ∞, and some v ∈ V such that Lv ≠ 0 and such that GV ≤ V (i.e. V is a G-invariant subspace).

Proof Assume that 0 ≠ L ∈ H∗ ⊗ H is a G-invariant vector. Since xL*x⁻¹ = L* as well (because the action is unitary), if we define A = ½(L + L*), we have that A is a self-adjoint G-invariant vector in H∗ ⊗ H. Since A is a self-adjoint Hilbert–Schmidt operator, there exists an eigenvector 0 ≠ v ∈ H with Av = λv for λ ≠ 0, by Lemma 9.5.6. Let V = {u ∈ H : Au = λu}. If dim V ≥ n then we may find an orthonormal basis B for H such that b₁, …, bₙ ∈ B ∩ V. However, in this case,

‖A‖²_HS ≥ Σ_{j=1}^{n} ‖Ab_j‖² = n|λ|².

Hence dim V ≤ |λ|⁻² · ‖A‖²_HS < ∞.

We now claim that V is G-invariant. Indeed, for any u ∈ V and any x ∈ G we have that Axu = xAu = λxu, so xu ∈ V as well.




Finally, note that since Av = λv, we have that

Re⟨Lv, v⟩ = (⟨Lv, v⟩ + ⟨L*v, v⟩)/2 = ⟨Av, v⟩ = λ‖v‖² ≠ 0,

which implies that Lv ≠ 0.

We now prove Ozawa's theorem.

Proof of Theorem 9.5.1 Let H be a Hilbert space on which G acts by unitary operators, with the guaranteed µ-harmonic cocycle c : G → H from Theorem 9.3.2. Let (Xt)t be the µ-random walk. For any map f : G → H∗ ⊗ H we will use the notation

E[f(Xt)] = Σ_x P[Xt = x] · f(x).

Note that for this to be defined, it is required that the above sum converges in H∗ ⊗ H. If E[‖f(Xt)‖_HS] < ∞ then E[f(Xt)] converges absolutely. Note also that for any x, y ∈ G,

c(xy) ⊗ c(xy) = (c(x) + x.c(y)) ⊗ (c(x) + x.c(y))
= c(x) ⊗ c(x) + x(c(y) ⊗ c(y))x⁻¹ + c(x) ⊗ x.c(y) + x.c(y) ⊗ c(x).

Fix a finite symmetric generating set S for G. Let κ = max_{s∈S} ‖c(s)‖, so ‖c(x)‖ ≤ κ|x| for all x ∈ G. Note that since

‖x.c(Xt) ⊗ c(x)‖_HS = ‖c(Xt)‖ · ‖c(x)‖ ≤ κ²|x||Xt|,

and since µ has finite first moment, the sum E[x.c(Xt) ⊗ c(x)] converges absolutely. Similarly, since µ has finite second moment, for any x ∈ G also E[c(xXt) ⊗ c(xXt)] converges absolutely, as ‖c(xXt) ⊗ c(xXt)‖ ≤ κ²(|x| + |Xt|)².

Since c is µ-harmonic, E[x.c(Xt) ⊗ c(x)] = E[c(x) ⊗ x.c(Xt)] = x.c(1) ⊗ c(x) = 0, so

E[c(xXt) ⊗ c(xXt)] = c(x) ⊗ c(x) + x E[c(Xt) ⊗ c(Xt)] x⁻¹.

Let T be the operator on H∗ ⊗ H given by TA = Σ_x µ(x) x A x⁻¹. Define At := E[c(Xt) ⊗ c(Xt)]. Averaging the above over x according to µ, we obtain that

At = A₁ + T At₋₁ = A₁ + T A₁ + T² At₋₂ = · · · = Σ_{k=0}^{t−1} T^k A₁.



It is an exercise to show that T is a bounded self-adjoint operator on H∗ ⊗ H with ‖T‖ ≤ 1. Thus, by the von Neumann ergodic theorem (Theorem 9.5.2),

lim_{t→∞} (1/t) At = L,

where L is some T-invariant vector in H∗ ⊗ H (in fact, L is the orthogonal projection of A₁ onto the subspace of T-invariant vectors in H∗ ⊗ H). We now have three cases.

Case I: L ≠ 0. By normalizing L, one may assume without loss of generality that ‖L‖_HS = 1. Since TL = Σ_x µ(x) x L x⁻¹ = L is a convex combination of unit vectors, we obtain that L must be a G-invariant vector; see Exercise 9.30. By Lemma 9.5.7, there exists a subspace V ≤ H such that 0 < dim V < ∞ and GV ≤ V. Let N = {x ∈ G : ∀ v ∈ V, x.v = v}. It is an easy exercise to show that N is a normal subgroup. Then G/N is isomorphic to a subgroup of U(V) ≅ U_d(C), where d = dim V.

Case Ia: If [G : N] = ∞, then G/N is isomorphic to an infinite subgroup of U_d(C).

Case Ib: So assume that [G : N] < ∞. There exists v ∈ V such that Lv ≠ 0 (by Lemma 9.5.7). Consider the function h(x) = h_v(x) := ⟨c(x), v⟩. Then

0 ≠ Lv = lim_{t→∞} (1/t) Σ_{k=0}^{t−1} Σ_x µ^k(x) ⟨v, c(x)⟩ c(x),

which implies that there exists x ∈ G such that h(x) = ⟨c(x), v⟩ ≠ h(1) = 0. So h is nonconstant. It is an exercise to show that since the action of G is by unitary operators, and since x.v = v for all x ∈ N, we have that h|_N is a homomorphism into C. Consider the subgroup Ker(h). Note that h|_N ≡ 0 if and only if N/Ker(h) ≅ h(N) is finite. Since we assumed that [G : N] < ∞, if [N : Ker(h)] < ∞, then h ≡ 0. So it cannot be that [N : Ker(h)] < ∞. Thus, N/Ker(h) is isomorphic to an infinite subgroup of C, which must then be Abelian.

Case II: L = 0. In this case, we find a Lipschitz harmonic function with sublinear variance. This is because c is a nontrivial cocycle, so there exists y ∈ supp(µ) such that c(y) ≠ 0. Define h(x) := ⟨c(x), c(y)⟩. It is immediate that h(1) = 0 ≠ h(y) and that h ∈ LHF(G, µ). Since c is µ-harmonic, we have that E[⟨x.c(Xt), c(y)⟩] = 0. Using

⟨v ⊗ u, v′ ⊗ u′⟩_HS = ⟨v′, v⟩ ⟨u, u′⟩,




we arrive at

E[|h(xXt)|²] = E⟨(c(x) + x.c(Xt)) ⊗ (c(x) + x.c(Xt)), c(y) ⊗ c(y)⟩
= |⟨c(x), c(y)⟩|² + ⟨x E[c(Xt) ⊗ c(Xt)] x⁻¹, c(y) ⊗ c(y)⟩
  + ⟨c(y), c(x)⟩ · E[⟨x.c(Xt), c(y)⟩] + ⟨c(x), c(y)⟩ · E[⟨c(y), x.c(Xt)⟩]
= |h(x)|² + ⟨x At x⁻¹, c(y) ⊗ c(y)⟩,

which implies

(1/t) Var_x[h(Xt)] = (1/t) E[|h(xXt)|²] − (1/t)|h(x)|² = (1/t) ⟨x At x⁻¹, c(y) ⊗ c(y)⟩
≤ ‖(1/t) At‖_HS · ‖c(y)‖² → 0.
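The ergodic averaging used in this proof, (1/t) Σ_{k<t} T^k v converging to the projection of v onto the T-invariant subspace, can be illustrated on a toy unitary operator. A sketch (not from the text; the operator and vector are made up): T rotates the first two coordinates of R³ and fixes the third, so the only T-invariant vectors lie on the third coordinate axis.

```python
# Sketch (not from the text): von Neumann ergodic averaging for a simple
# unitary operator T on R^3 = (rotation plane) + (fixed axis). The averages
# (1/t) * sum_{k<t} T^k v converge to the orthogonal projection of v onto
# the T-invariant subspace, which here is the third coordinate axis.
import math

theta = 1.0  # rotation angle; irrational multiple of pi, so no other fixed vectors

def T(v):
    x, y, z = v
    return (math.cos(theta) * x - math.sin(theta) * y,
            math.sin(theta) * x + math.cos(theta) * y,
            z)

v = (0.7, -0.2, 0.5)
t = 20000
s = [0.0, 0.0, 0.0]
w = v
for _ in range(t):
    s = [si + wi for si, wi in zip(s, w)]
    w = T(w)
avg = [si / t for si in s]
print(avg)  # close to (0, 0, 0.5), the projection onto the fixed subspace
```

The rotating components average out at rate O(1/t), while the fixed component is preserved exactly, matching the statement of Theorem 9.5.2.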

Exercise 9.28 Let G be a group acting on a Hilbert space H by unitary operators. Let V ≤ H be a subspace. Let N = {x ∈ G : ∀ v ∈ V, x.v = v}. Show that N is a normal subgroup of G.

Exercise 9.29 Let G be a group acting by unitary operators on a Hilbert space H. Let c : G → H be a cocycle. Let v ∈ H be such that x.v = v for all x ∈ G. Show that h(x) := ⟨c(x), v⟩ is a homomorphism into C. Show that h(G) is finite if and only if h ≡ 0.

Exercise 9.30 Let G act by unitary operators on a Hilbert space H. Let µ be a symmetric and adapted probability measure on G. Let T : H → H be defined by Tv = Σ_x µ(x) x.v. Show that if ‖v‖ = 1 and Tv = v, then x.v = v for all x ∈ supp(µ). Conclude that x.v = v for all x ∈ G. (Hint: consider the random variable ⟨X.v, b⟩ for some fixed vector b, and X distributed according to µ. What is the variance of this random variable? What is the sum over b ranging over an orthonormal basis?) B solution C

9.6 Proof of Gromov's Theorem

Finally we put all the pieces together to obtain Gromov's theorem (Theorem 9.0.1). In light of the reduction in Proposition 9.1.1, we only need to show that any infinite polynomial growth group admits a finite index subgroup with a




surjective homomorphism onto Z. That is, it suffices to prove the following theorem.

Theorem 9.6.1 Let G be an infinite finitely generated group of polynomial growth. Then there exists a subgroup H ≤ G of finite index, [G : H] < ∞, such that H admits a surjective homomorphism onto Z.

Proof Let G be a finitely generated group of polynomial growth. Choose any µ ∈ SA(G, ∞). Since G is of sub-exponential growth, it is amenable (Exercise 6.50). By Theorem 9.5.1 and Exercise 1.32, one of three cases holds:

(1) There exists a finite index subgroup H ◁ G, [G : H] < ∞, such that H admits a surjective homomorphism onto Z.
(2) There exists a nonconstant h ∈ LHF(G, µ) such that for any x,

lim_{t→∞} (1/t) Var_x[h(Xt)] = 0.

(3) There exists a homomorphism ϕ : G → U_d(C) such that |ϕ(G)| = ∞.

If the first case holds, we are done.

If h ∈ LHF(G, µ), then Theorem 6.5.5 tells us that for all t,

E_x[|h(X₁) − h(x)|²] ≤ 4 · Var_x[h(Xt)] · (H(Xt) − H(Xt₋₁)).

Using Exercises 6.24 and 6.30, since G has polynomial growth we have that H(Xt) ≤ C log(t+1) for some C > 0 and all t > 0. So there are infinitely many t for which H(Xt) − H(Xt₋₁) ≤ 2C/(t+1). But then

E_x[|h(X₁) − h(x)|²] ≤ 8C · lim inf_{t→∞} (1/t) Var_x[h(Xt)].

If (1/t) Var_x[h(Xt)] → 0 for all x, then h is constant. This implies that the second case above cannot hold for groups of polynomial growth.

So we are left with the third case: there exists a homomorphism ϕ : G → U_d(C) such that |ϕ(G)| = ∞. The group ϕ(G) is a finitely generated infinite subgroup of U_d(C) of polynomial growth, so by Theorem 9.2.3 it is virtually Abelian and infinite. That is, there exists a finite index subgroup of ϕ(G) with a surjective homomorphism onto Z. Lifting this subgroup to a subgroup of G, we obtain a finite index subgroup of G with a surjective homomorphism onto Z.

This completes the proof of Gromov's theorem, Theorem 9.0.1.
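The pigeonhole step used above (a nondecreasing entropy bounded by C log(t+1) must have increments at most 2C/(t+1) infinitely often) can be sanity-checked numerically. A sketch (not from the text), using an adversarial H that spends its whole logarithmic budget on jumps at powers of 2:

```python
# Sketch (not from the text): if H is nondecreasing with H(t) <= C*log(t+1),
# then H(t) - H(t-1) <= 2C/(t+1) for "most" t. We test an adversarial H that
# is constant except for jumps at powers of 2, staying below C*log(t+1).
import math

C = 1.0

def H(t):
    # value of the last jump at the largest power of 2 not exceeding t
    p = 1
    tot = 0.0
    while p <= t:
        tot = C * math.log(p + 1)
        p *= 2
    return tot

good = [t for t in range(1, 100000)
        if H(t) - H(t - 1) <= 2 * C / (t + 1)]
print(len(good))  # all but a handful of t (the jump points) are good
```

Even in this extreme case only the finitely many jump points per dyadic scale can violate the increment bound, which is what the proof exploits.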




9.7 Classification of Recurrent Groups

Recall the notions of a recurrent and transient group from Section 4.12. We have seen that any group that is finite, virtually Z, or virtually Z² is recurrent. As an application of Gromov's theorem, we will now prove that these are the only recurrent groups.

Recall Exercise 8.19, which tells us that if G is a group of polynomial growth r ↦ r^d, and if K ◁ G is such that G/K ≅ Z, then K is finitely generated and has polynomial growth r ↦ r^{d−1}. Also, Exercise 8.21 tells us that if G/F ≅ Z for some finite subgroup F ◁ G, then G is virtually Z.

Exercise 9.31 Let G be an infinite finitely generated group. Assume that G has growth r ↦ r (i.e. linear growth). Show that G contains a finite index subgroup H such that H ≅ Z; that is, G is virtually Z. B solution C

Exercise 9.32 Let G be an infinite finitely generated group. Assume that G has growth r ↦ r². Show that G contains a finite index subgroup H such that H ≅ Z or H ≅ Z². B solution C

Exercise 9.33 Let G be an infinite finitely generated group. Assume that G has growth r ↦ r^{3−ε} for some ε > 0. Show that G is either virtually Z or virtually Z². (Specifically, G actually has growth (r ↦ r²).) B solution C

Theorem 9.7.1 (Varopoulos' theorem) Let G be a finitely generated recurrent group. Then G is virtually Z², or G is virtually Z, or G is finite.

Proof We have seen in Exercise 9.33 that if G has growth r ↦ r^{3−ε} for some ε > 0, then G is either finite, virtually Z, or virtually Z². Thus we may assume that G has growth at least r ↦ r^{3−ε} for ε > 0 as small as we want. We then want to prove that under this condition G is transient. By Theorem 4.12.1, it suffices to find a probability measure µ ∈ SA(G, 2) such that the µ-random walk is transient.

Fix some small ε > 0 (in fact ε < ½ will suffice). Fix a Cayley graph of G, and let B_r = B(1, r) denote the ball of radius r about the origin 1. Let (α_n)_{n≥1} be a sequence such that α_n > 0 for all n and


Σ_n α_n n² < ∞ and Σ_n α_n = 1. Define a measure µ by

µ(x) = Σ_{r=1}^{∞} (α_r / |B_r|) · 1_{{|x| ≤ r}}.

Then µ is fully supported (so obviously adapted) and easily seen to be symmetric (since |x⁻¹| = |x|). Also,

Σ_x µ(x)|x|² = Σ_x |x|² Σ_{r ≥ |x|} α_r/|B_r| = Σ_{r=1}^{∞} (α_r/|B_r|) Σ_{|x| ≤ r} |x|²
= Σ_{r=1}^{∞} (α_r/|B_r|) Σ_{k=1}^{r} k² · #{x : |x| = k} ≤ Σ_{r=1}^{∞} α_r r² < ∞.

So µ has a finite second moment. We now show that (G, µ) is transient. Let R ≥ 1, and define

µ_R(x) = Σ_{r=1}^{R} (α_r/|B_r|) · 1_{{|x| ≤ r}}

and ν_R = µ − µ_R (note that these measures are not probability measures). We have that µ_R(G) ≤ Σ_{r=1}^{R} α_r. Also, for any x ∈ G,

ν_R(x) = Σ_{r>R} (α_r/|B_r|) · 1_{{|x| ≤ r}} ≤ (1/|B_{R+1}|) · Σ_{r>R} α_r.

It is an exercise to use these two estimates to prove by induction that

P[Xt = 1] ≤ (1 − δ)^t + t · δ/|B_{R+1}|,

where δ = Σ_{r>R} α_r (which can be made as small as desired by choosing R large enough). Given this bound, we want to choose R = R(t) so that Σ_t P[Xt = 1] < ∞, which proves transience.

Let k ≥ 1, β > 0 to be determined. Choose α_r = C r^{−k} for some constant C > 0, so that δ = O(R^{−(k−1)}). Then choose R = R(t) = t^β. It follows that

(1 − δ)^t ≤ e^{−δt} ≤ exp(−c t^{1−β(k−1)}),
tδ/|B_{R+1}| ≤ O(t^{1−β(2+k−ε)}).

In order for the above to be summable, and to satisfy the requirements for the sequence (α_r)_r, we require the conditions

2/(2+k−ε) < β < 1/(k−1) and k > 3.

If we take k = 3 + ε and ε < ½, these can be satisfied by choosing β strictly between 2/5 and 1/(2+ε).
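A quick numeric check (not from the text; it encodes the summability conditions as read off above: β(k−1) < 1 from the first term, β(2+k−ε) > 2 from the second, and k > 3 for the second moment) confirms that the window of admissible β is nonempty exactly when ε < ½:

```python
# Sketch (not from the text): checking that the exponent window for beta is
# nonempty. Summability of exp(-c t^{1-beta(k-1)}) needs beta < 1/(k-1);
# summability of t^{1-beta(2+k-eps)} needs beta > 2/(2+k-eps); the finite
# second moment of mu needs k > 3, so we take k = 3 + eps.
def beta_window(eps):
    k = 3.0 + eps
    lo = 2.0 / (2.0 + k - eps)   # from the t*delta/|B_{R+1}| term (always 2/5 here)
    hi = 1.0 / (k - 1.0)         # from the (1-delta)^t term
    return lo, hi

for eps in (0.1, 0.25, 0.49):
    lo, hi = beta_window(eps)
    print(eps, lo, hi, lo < hi)  # nonempty window for every eps < 1/2

print(beta_window(0.6))  # for eps >= 1/2 the window closes
```

With k = 3 + ε the lower endpoint is always 2/5, and the upper endpoint 1/(2+ε) stays above it precisely for ε < ½, which is why the proof fixes ε that small.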




Exercise 9.34 Complete the details of the proof of Theorem 9.7.1. Show by induction: since µ_R, ν_R satisfy µ_R(G) ≤ 1 − δ and sup_{x∈G} ν_R(x) ≤ δ/|B_{R+1}|, for all t and any x ∈ G we may bound

P[Xt = x] ≤ (1 − δ)^t + t · δ/|B_{R+1}|,

where δ = Σ_{r>R} α_r. B solution C

9.8 Kleiner's Theorem

In light of the fact that dim BHF(G, µ) ∈ {1, ∞} (Theorem 2.7.1), we may naively think that the same holds for HF_k(G, µ). In this section we will see that HF_k(G, µ) can be finite dimensional (we have already seen in Corollary 9.3.6 that dim LHF(G, µ) > 1). In fact, Kleiner's theorem states that this is always the case when the growth of the group is polynomial.

Theorem 9.8.1 (Kleiner's theorem) Let G be a finitely generated group of polynomial growth. Fix k > 0 and let µ ∈ SA(G, 2(k+1)). Then the space HF_k(G, µ) has finite dimension.

The proof requires two fundamental inequalities: the Saloff-Coste–Poincaré inequality and the reverse Poincaré inequality. Let G be a finitely generated group with finite symmetric generating set S, and let µ be a symmetric adapted probability measure on G. For a function f : G → C, define

A_r f := (1/|B(1,r)|) Σ_{x∈B(1,r)} f(x),
‖∇f‖²_{µ,r} := Σ_{x,y∈B(1,r)} µ(x⁻¹y) |f(x) − f(y)|².

Proposition 9.8.2 (Saloff-Coste–Poincaré inequality) Let G be a finitely generated group, S a finite symmetric generating set, and µ an adapted probability measure on G such that S ⊂ supp(µ). Consider the Cayley graph of G with respect to S. Then there exists a constant Cµ > 0 such that for all f : G → C and all r > 0,

Σ_{x∈B(1,r)} |f(x) − A_r f|² ≤ Cµ r² · (|B(1,2r)|² / |B(1,r)|²) · ‖∇f‖²_{µ,3r}.



Proof Set α⁻¹ := min_{s∈S} µ(s). If |y| ≤ 2r, then for any s ∈ S,

Σ_{|x|≤r} |f(xys) − f(xy)|² ≤ Σ_{|z|≤3r} Σ_{s∈S} |f(zs) − f(z)|² ≤ α · ‖∇f‖²_{µ,3r}.

Since any y ∈ B(1,2r) can be written as y = s₁ ⋯ s_n for n = |y| ≤ 2r and s₁, …, s_n ∈ S, using the notation y₀ = 1 and y_j = s₁ ⋯ s_j, we have by Cauchy–Schwarz,

Σ_{|x|≤r} |f(xy) − f(x)|² ≤ Σ_{|x|≤r} n Σ_{j=1}^{n} |f(xy_j) − f(xy_{j−1})|²
≤ 2r Σ_{j=1}^{n} Σ_{|x|≤r} |f(xy_{j−1}s_j) − f(xy_{j−1})|² ≤ 4r²α · ‖∇f‖²_{µ,3r}.

This holds for any |y| ≤ 2r. Summing over all such y,

Σ_{|y|≤2r} Σ_{|x|≤r} |f(xy) − f(x)|² ≤ |B(1,2r)| · 4r²α · ‖∇f‖²_{µ,3r}.

Fix |x| ≤ r. Any |z| ≤ r can be written as z = xy for some |y| ≤ 2r. Thus,

|f(x) − A_r f| ≤ (1/|B(1,r)|) Σ_{|z|≤r} |f(x) − f(z)| ≤ (1/|B(1,r)|) Σ_{|y|≤2r} |f(xy) − f(x)|.

Summing the squares, and using another Cauchy–Schwarz,

Σ_{|x|≤r} |f(x) − A_r f|² ≤ (1/|B(1,r)|²) Σ_{|x|≤r} ( Σ_{|y|≤2r} |f(xy) − f(x)| )²
≤ (|B(1,2r)| / |B(1,r)|²) Σ_{|x|≤r} Σ_{|y|≤2r} |f(xy) − f(x)|²
≤ 4r²α · (|B(1,2r)|² / |B(1,r)|²) · ‖∇f‖²_{µ,3r}.
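The inequality can be tested on the simplest Cayley graph, G = Z with S = {±1} and µ uniform on S, using the explicit constant 4r²α · |B(1,2r)|²/|B(1,r)|² obtained in the proof (here α = 1/min_s µ(s) = 2). A sketch (not from the text; the test function is arbitrary):

```python
# Sketch (not from the text): checking the Saloff-Coste-Poincare inequality
# on G = Z with S = {+1,-1}, mu uniform on S, and the proof's explicit
# constant 4 * r^2 * alpha * |B(1,2r)|^2 / |B(1,r)|^2 with alpha = 2.
def check(f, r):
    ball = lambda rho: range(-rho, rho + 1)
    Br = list(ball(r))
    Arf = sum(f(x) for x in Br) / len(Br)
    lhs = sum((f(x) - Arf) ** 2 for x in Br)
    # |grad f|^2_{mu,3r}: ordered pairs in B(1,3r) at distance 1, mu = 1/2 each
    grad = sum(0.5 * (f(x) - f(y)) ** 2
               for x in ball(3 * r) for y in ball(3 * r)
               if abs(x - y) == 1)
    alpha = 2.0
    b2r, br = 2 * (2 * r) + 1, 2 * r + 1
    rhs = 4 * r * r * alpha * (b2r ** 2 / br ** 2) * grad
    return lhs, rhs

for r in (2, 5, 10):
    lhs, rhs = check(lambda x: x * x, r)
    print(r, lhs <= rhs)
```

For f(x) = x² both sides grow like r⁵, and the proof's constant holds with plenty of room, as expected from the lossy Cauchy–Schwarz steps.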

We also have the "reverse" inequality, but only for harmonic functions. This is also sometimes known as the Caccioppoli inequality.

Proposition 9.8.3 (Reverse Poincaré inequality) Let G be a finitely generated group. Let µ ∈ SA(G, 2(k+1)) for some k > 0. Then there exists a constant Cµ > 0 such that for any f ∈ HF_k(G, µ) and all r > 0,

‖∇f‖²_{µ,r} ≤ Cµ r⁻² · Σ_{|x|≤3r} |f(x)|².




Proof Since |ζ|² = |Re ζ|² + |Im ζ|² for all ζ ∈ C, we have that

‖∇f‖²_{µ,r} = ‖∇Re f‖²_{µ,r} + ‖∇Im f‖²_{µ,r}

and

Σ_{|x|≤3r} |f(x)|² = Σ_{|x|≤3r} |Re f(x)|² + Σ_{|x|≤3r} |Im f(x)|².

So, without loss of generality, it suffices to bound ‖∇f‖²_{µ,r} for real-valued functions f : G → R.

By Hölder's inequality with p = (k+1)/k and q = k+1, since µ ∈ SA(G, 2(k+1)) and since f ∈ HF_k(G, µ), there are constants C, Cµ > 0 such that for all r > 0, we have

E[(1 + |f(X₁)|²) 1_{{|X₁| > r}}] ≤ C · E[(1 + |X₁|)^{2kp}]^{1/p} · (P[|X₁| > r])^{1/q}
≤ C · E[(1 + |X₁|)^{2(k+1)}] · r⁻² ≤ Cµ r⁻².

This implies that

Σ_{|z|>r} µ(z) ≤ Σ_{|z|>r} µ(z)(1 + |f(z)|²) ≤ Cµ r⁻².

Define

ϕ(x) = 1 if |x| ≤ r,   ϕ(x) = 2 − |x|/r if r < |x| ≤ 2r,   ϕ(x) = 0 if |x| > 2r.

Note that ϕ : G → [0, 1] and supp(ϕ) ⊂ B(1, 2r). Also, |ϕ(xs) − ϕ(x)| ≤ 1/r for any s ∈ S, x ∈ G. Thus, for any x, y ∈ G we have that |ϕ(x) − ϕ(y)| ≤ |x⁻¹y|/r.

To ease the notation, we write ϕ(x) = ϕ_x, f(x) = f_x, and P(x, y) = µ(x⁻¹y). Note that P(x, y) = P(y, x) because µ is symmetric. For a finite subset F ⊂ G, define

J_F = Σ_{x,y∈F} P(x,y) ϕ_y² (f_x − f_y)²,
A_F = Σ_{x,y∈F} P(x,y) (f_x ϕ_x² − f_y ϕ_y²)(f_x − f_y),
B_F = 2 Σ_{x,y∈F} P(x,y) f_x ϕ_y (ϕ_x − ϕ_y)(f_y − f_x),
C_F = Σ_{x,y∈F} P(x,y) f_x (ϕ_x − ϕ_y)² (f_y − f_x),
D_F = Σ_{x,y∈F} P(x,y) (f_x² + f_y²)(ϕ_x − ϕ_y)².



The identity

ϕ_y²(f_x − f_y) = f_x ϕ_x² − f_y ϕ_y² − 2 f_x ϕ_y (ϕ_x − ϕ_y) − f_x (ϕ_x − ϕ_y)²

implies that J_F = A_F + B_F + C_F. We have

‖∇f‖²_{µ,r} = Σ_{x,y∈B(1,r)} P(x,y) ½(ϕ_x² + ϕ_y²)(f_x − f_y)² ≤ J_F,

for any F ⊃ B(1, r).

We bound A_F by utilizing the inequality 2|ab| ≤ a² + b². Using the harmonicity of f, and the symmetry P(x,y) = P(y,x),

A_F = −2 Σ_{x∈F} Σ_{y∉F} P(x,y) ϕ_x² f_x (f_x − f_y),

which is bounded using the estimate Σ_{|z|>r} µ(z)(1 + |f(z)|²) ≤ Cµ r⁻² (the inner sum over x runs over |x| ≤ 2r, as ϕ vanishes outside B(1, 2r)).

To bound B_F, we use the inequality |ab| ≤ ¼a² + b², so

|f_x ϕ_y (ϕ_x − ϕ_y)(f_y − f_x)| ≤ ¼(f_x − f_y)² ϕ_y² + f_x² (ϕ_x − ϕ_y)²,

which implies

B_F ≤ ½ Σ_{x,y∈F} P(x,y) ϕ_y² (f_x − f_y)² + 2 Σ_{x,y∈F} P(x,y) f_x² (ϕ_x − ϕ_y)² = ½ J_F + D_F.

To bound C_F we use the inequality |a(a − b)| ≤ (3/2)a² + ½b². We have that

C_F ≤ (3/2) Σ_{x,y∈F} P(x,y) (f_x² + f_y²)(ϕ_x − ϕ_y)² = (3/2) D_F.

We conclude that J_F ≤ A_F + ½ J_F + (5/2) D_F, implying that J_F ≤ 2A_F + 5D_F.




Finally, to bound D_F: since |ϕ_x − ϕ_y| ≤ |x⁻¹y|/r, we have that

D_F ≤ (1/r²) Σ_{x,y∈F} P(x,y) |x⁻¹y|² (f_x² + f_y²) ≤ (2/r²) Σ_{x∈F} f_x² · Σ_z µ(z)|z|².

Also, since f is a nonconstant harmonic function, f ∉ ℓ²(G, µ), and specifically Σ_x |f(x)|² = ∞. By adjusting the constant Cµ, and combining all the above, we have that for all r > 0,

J_F ≤ 2A_F + 5D_F ≤ Cµ r⁻² · Σ_{|x|≤3r} |f(x)|²,

which completes the proof.
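For a concrete check (not from the text): on G = Z with µ uniform on {±1}, the function f(x) = x is harmonic, and both sides of the reverse Poincaré inequality grow linearly in r, so the inequality holds with a modest constant.

```python
# Sketch (not from the text): the reverse Poincare inequality for the
# harmonic function f(x) = x on Z with mu uniform on {+1,-1}. The
# Dirichlet-type quantity over B(1,r) is ~2r, while r^{-2} times the sum of
# f^2 over B(1,3r) is ~18r, so the inequality holds even with constant 1.
def ball(rho):
    return range(-rho, rho + 1)

def grad_sq(f, r):
    # ordered adjacent pairs inside [-r, r], each with mu-weight 1/2
    return sum(0.5 * (f(x) - f(y)) ** 2
               for x in ball(r) for y in ball(r) if abs(x - y) == 1)

f = lambda x: x
for r in (5, 10, 20):
    lhs = grad_sq(f, r)
    rhs = (1.0 / r ** 2) * sum(f(x) ** 2 for x in ball(3 * r))
    print(r, lhs, rhs, lhs <= rhs)
```

Note the 3r on the right-hand side is what makes this work for a linear (hence unbounded) harmonic function: the mass of f² over the larger ball compensates for the gradient over the smaller one.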

9.8.1 Proof of Kleiner's Theorem

In this section we prove Kleiner's theorem (Theorem 9.8.1). We will show that when G has polynomial growth, there exists some constant K = K(G, µ) such that any finite-dimensional subspace V ≤ HF_k(G, µ) has dimension bounded by K. Thus, any collection of more than K vectors in HF_k(G, µ) is linearly dependent, implying that dim HF_k(G, µ) ≤ K.

It will be very helpful to consider the following canonical finite-dimensional Hilbert spaces. For r > 0 and f, g : G → C define the semi-inner product

Q_r(f, g) = Σ_{|x|≤r} f(x) ḡ(x).

Implicitly, these depend on some fixed Cayley graph of the group G. If B = (b1, . . . , bd ) is a basis for a finite-dimensional vector space V ≤ CG , then Qr may be represented as matrices: the matrix of Qr in the basis B is defined to be [Qr ]B ∈ Md (C) with entries given by ([Qr ]B )i, j = Qr (bi, b j ). Different bases give different matrices; however, all matrices of the same form are related via the following exercise. Exercise 9.35 Show that if B = (b1, . . . , bd ) and C = (c1, . . . , cd ) are bases for a vector space V , then there exists an invertible matrix M = M (B, C) ∈ Md (C) such that [Qr ]C = M[Qr ]B M ∗ . B solution C
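The congruence [Q_r]_C = M [Q_r]_B M* is exactly what makes det(Q_R/Q_r) basis independent. A small numeric illustration (not from the text; the space V = span{1, x} of functions on G = Z and the change-of-basis matrix M are made up):

```python
# Sketch (not from the text): Gram matrices [Q_r]_B on G = Z for the
# 2-dimensional space V spanned by f1(x) = 1 and f2(x) = x. Under a change
# of basis, det[Q_r] picks up |det M|^2, so det(Q_R / Q_r) is basis
# independent; it is also >= 1 for R >= r (Exercise 9.39).
def Q(r, f, g):
    return sum(f(x) * g(x) for x in range(-r, r + 1))  # real-valued: no conjugate

def gram(r, basis):
    return [[Q(r, f, g) for g in basis] for f in basis]

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

B = [lambda x: 1.0, lambda x: float(x)]
M = [[2.0, 1.0], [0.0, 3.0]]  # invertible change of basis
C = [lambda x, M=M, i=i: sum(M[i][j] * B[j](x) for j in range(2))
     for i in range(2)]

r, R = 5, 11
ratio_B = det2(gram(R, B)) / det2(gram(r, B))
ratio_C = det2(gram(R, C)) / det2(gram(r, C))
print(ratio_B, ratio_C)  # equal: det(Q_R/Q_r) does not depend on the basis
```

Both determinant ratios agree because the |det M|² factors cancel, and the common value is at least 1 since the Gram matrices only accumulate mass as the ball grows.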




Exercise 9.36 Show that for any finite-dimensional vector space V ≤ C^G, there exists r₀ = r₀(V) such that for all r ≥ r₀, the space V with the inner product structure given by Q_r is a Hilbert space. B solution C

A bilinear form Q on a vector space V is called positive definite if Q(v, v) > 0 for any 0 ≠ v ∈ V.

Exercise 9.37 Show that when Q_r is positive definite on a finite-dimensional V ≤ C^G, for any basis B of V, the matrix [Q_r]_B is invertible. B solution C

In light of this, for two positive definite bilinear forms Q_r, Q_R, the determinant

det(Q_R / Q_r) := det[Q_R]_B / det[Q_r]_B

is well defined and independent of the choice of basis.

Exercise 9.38 Show that if Q_r, Q_R are both positive definite on a finite-dimensional V ≤ C^G, then one may choose a basis B = (v₁, …, v_d) for V for which [Q_r]_B is diagonal and [Q_R]_B is the identity matrix. B solution C

Exercise 9.39 Assume that R ≥ r > 0 and that Q_r is positive definite on a finite-dimensional V ≤ C^G. Show that for any basis B of V we have that det[Q_R]_B ≥ det[Q_r]_B. B solution C

We now move to prove Kleiner's theorem (Theorem 9.8.1). Let G be a finitely generated group of polynomial growth. Fix some k > 0. Let µ ∈ SA(G, 2(k+1)). Let V ≤ HF_k(G, µ) be a finite-dimensional subspace of dimension dim V = d. Set Λ = 21 (this just needs to be some large enough number). We will show that dim V ≤ 2Λ^{4Dn} for some large enough n = n(G, µ). To this end, we fix a basis B of V such that |b(x)| ≤ (|x| + 1)^k for any x ∈ G and all b ∈ B (using the fact that V ≤ HF_k(G, µ)). Let r₀ = r₀(V) be large enough so that Q_r is positive definite on V for all r ≥ r₀. It will be useful to consider the function h(r) = |B(1,r)| · (det[Q_r]_B)^{1/d}. Note that for any v ∈ B we have that Q_r(v, v) ≤ |B(1,r)| · (r+1)^{2k}, so by Hadamard's inequality, (det[Q_r]_B)^{1/d} ≤ |B(1,r)| · (r+1)^{2k}. Thus, h(r) ≤ |B(1,r)|² (r+1)^{2k}. Set g(r) = log h(Λ^r). Since G has polynomial growth, there exists D > 0 such that

lim inf_{r→∞} (g(r) − Dr log Λ) = 0.


Remark 9.8.4 Note that this only assumes that

lim inf_{r→∞} r^{−m} |B(1, r)| < ∞

for some m > 0, which is an a priori weaker assumption than polynomial growth.

Claim 9.8.5 (Step I) There exists an integer n₀ = n₀(G, µ, V) such that the following holds for all n ≥ n₀. First, Q_{Λ^n} is positive definite on V for all n ≥ n₀. Additionally, there exist b > a ≥ n₀ such that:

• b − a ∈ (n, 3n),
• g(b + 1) − g(a) < n · 4D log Λ,
• g(a + 1) − g(a) < 4D log Λ and g(b + 1) − g(b) < 4D log Λ.

Proof By Exercise 9.36, we can choose n₀ so that Q_{Λ^n} is positive definite on V for all n ≥ n₀. Fix any n ≥ n₀. Let ℓ be such that g(n₀ + 3nℓ) − g(n₀) < 4nℓ · D log Λ. (This can be done when, for example, 2n₀ < nℓ and 2|g(n₀)| < nℓ · D log Λ.) This implies (by a telescoping sum) that there exists n₀ ≤ j ≤ n₀ + 3n(ℓ − 1) such that g(j + 3n) − g(j) < 4n · D log Λ. Telescoping again,

4n · D log Λ > g(j + 3n) − g(j) = g(j + 3n) − g(j + 2n) + g(j + 2n) − g(j + n) + g(j + n) − g(j),

we see that since g is nondecreasing, there must exist b ∈ [j + 2n, j + 3n − 1] and a ∈ [j, j + n − 1] such that g(b + 1) − g(b) and g(a + 1) − g(a) are both less than 4D log Λ. Also, g(b + 1) − g(a) ≤ g(j + 3n) − g(j) < 4n · D log Λ, by our choice of j. This proves Step I.

Step II Now choose some fixed n ≥ n₀ so that (Cµ)² Λ^{−2n+12D} < ½ Λ^{−8D}, where Cµ is the maximum of the constants from the Poincaré and reverse Poincaré inequalities (Propositions 9.8.2 and 9.8.3). Using a, b from Step I, we now choose r = Λ^a and R = Λ^b. Thus,

(Cµ)² · (|B(1,7r)|³ / |B(1,r)|³) · (r/R)² ≤ (Cµ)² exp(3(g(a+1) − g(a))) Λ^{−2n} < ½ Λ^{−8D},   (9.4)

and also

|B(1,ΛR)| / |B(1,R)| ≤ exp(g(b+1) − g(b)) ≤ Λ^{4D},   |B(1,2R)| / |B(1,r)| ≤ exp(g(b+1) − g(a)) < Λ^{4Dn}.

By Exercise 9.38, choose a basis (v₁, …, v_d) for V such that [Q_R] is the identity matrix and [Q_{ΛR}] is diagonal, and let ℓ = #{j : Q_{ΛR}(v_j, v_j) > Λ^{8D} · Q_R(v_j, v_j)}. Then, since det(Q_{ΛR}/Q_R) is independent of the choice of basis,

Λ^{4D} ≥ exp(g(b+1) − g(b)) = h(ΛR)/h(R) ≥ det(Q_{ΛR}/Q_R)^{1/d} = (∏_{j=1}^{d} Q_{ΛR}(v_j, v_j))^{1/d} > Λ^{8Dℓ/d}.

Thus ℓ < d/2. We conclude that by taking

U_R = span{v_j : Q_{ΛR}(v_j, v_j) ≤ Λ^{8D} · Q_R(v_j, v_j)},

then U_R ≤ V has dimension dim U_R ≥ ½ dim V, and also for any u ∈ U_R we have

Q_{ΛR}(u, u) ≤ Λ^{8D} · Q_R(u, u).   (9.5)

Exercise 9.40 (Step III) Fix R > r > 0. Show that one can find m = m(r, R) ≤ |B(1,2R)|/|B(1,r)| and points x₁, …, x_m ∈ B(1, R) such that:

• B(1, R) ⊂ ∪_{j=1}^{m} B(x_j, 2r) (the small 2r-balls cover the big R-ball).
• The overlap is not too big: for any x ∈ B(1, R), the number of 6r-balls containing x can be bounded:

#{1 ≤ j ≤ m : x ∈ B(x_j, 6r)} ≤ |B(1,7r)| / |B(1,r)|.

B solution C
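The covering can be constructed greedily: a maximal 2r-separated subset of B(1, R) works, since the 2r-balls around its points must cover, while the disjoint r-balls around the centers give the counting bounds. A sketch (not from the text) on Z² with the ℓ¹ word metric:

```python
# Sketch (not from the text): the covering of Exercise 9.40 on G = Z^2 with
# the l^1 word metric. A greedy maximal 2r-separated subset of B(0,R) yields
# 2r-balls covering B(0,R), with m <= |B(0,2R)|/|B(0,r)| centers and
# 6r-ball multiplicity at most |B(0,7r)|/|B(0,r)|.
def ball(c, rho):
    cx, cy = c
    return [(x, y) for x in range(cx - rho, cx + rho + 1)
            for y in range(cy - rho, cy + rho + 1)
            if abs(x - cx) + abs(y - cy) <= rho]

def dist(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

R, r = 12, 2
BR = ball((0, 0), R)
centers = []
for p in BR:  # greedy maximal 2r-separated set
    if all(dist(p, c) > 2 * r for c in centers):
        centers.append(p)

m = len(centers)
covered = all(any(dist(p, c) <= 2 * r for c in centers) for p in BR)
max_mult = max(sum(1 for c in centers if dist(p, c) <= 6 * r) for p in BR)
bound_m = len(ball((0, 0), 2 * R)) // len(ball((0, 0), r))
bound_mult = len(ball((0, 0), 7 * r)) // len(ball((0, 0), r))
print(m, covered, m <= bound_m, max_mult, max_mult <= bound_mult)
```

Maximality forces covering (an uncovered point could be added), and separation makes the r-balls around centers disjoint, which is exactly where the two volume-ratio bounds come from.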

Step IV Now we combine the Poincaré and reverse Poincaré inequalities (Propositions 9.8.2 and 9.8.3). Recall m = m(r, R) from Step III, the number of balls in the covering of B(1, R) by radius-2r balls. Define Ψ : V → C^m by

Ψ(v) = (A_{x₁,2r}v, …, A_{x_m,2r}v),

where A_{x,r}v = (1/|B(1,r)|) Σ_{y∈B(x,r)} v(y). Note that Ψ is a linear map, so dim Ker Ψ = d − dim Im Ψ ≥ d − m. Let K = Ker Ψ.

Because B(1, R) ⊂ ∪_{j=1}^{m} B(x_j, 2r), instead of summing over x ∈ B(1, R) we may sum over j = 1, …, m and then over x ∈ B(x_j, 2r). Because of the properties of the overlaps in the covering,

Σ_{j=1}^{m} Σ_{x,y∈B(x_j,6r)} µ(x⁻¹y) |u(x) − u(y)|²
= Σ_{x,y∈B(1,R+6r)} µ(x⁻¹y) |u(x) − u(y)|² · #{1 ≤ j ≤ m : x, y ∈ B(x_j, 6r)}
≤ (|B(1,7r)| / |B(1,r)|) · ‖∇u‖²_{µ,7R}.

Let u ∈ K, so A_{x_j,2r}u = 0 for all 1 ≤ j ≤ m. Specifically, by combining the Poincaré and reverse Poincaré inequalities (Propositions 9.8.2 and 9.8.3), and using (9.4),

Q_R(u, u) ≤ Σ_{j=1}^{m} Σ_{x∈B(x_j,2r)} (u(x) − A_{x_j,2r}u)²
≤ Σ_{j=1}^{m} 4Cµ r² (|B(1,4r)|² / |B(1,2r)|²) Σ_{x,y∈B(x_j,6r)} µ(x⁻¹y) |u(x) − u(y)|²
≤ (|B(1,7r)| / |B(1,r)|) · 4Cµ r² (|B(1,4r)|² / |B(1,2r)|²) · ‖∇u‖²_{µ,7R}
≤ (Cµ)² (|B(1,7r)|³ / |B(1,r)|³) · (r/R)² Q_{ΛR}(u, u) ≤ ½ Λ^{−8D} · Q_{ΛR}(u, u),

by our choices of r, R. By (9.5), if u ∈ U_R ∩ K then we get

Q_R(u, u) ≤ ½ Λ^{−8D} Q_{ΛR}(u, u) ≤ ½ · Q_R(u, u).

Thus U_R ∩ K = {0}, so Ψ|_{U_R} is injective, which gives

dim V ≤ 2 dim U_R ≤ 2m ≤ 2 |B(1,2R)| / |B(1,r)| < 2Λ^{4Dn}.

Since this bound depends only on G, µ, this completes the proof of Kleiner's theorem.




9.9 Additional Exercises

Exercise 9.41 Let G be a finitely generated group, and let µ ∈ SA(G, 1). Define

K = {x ∈ G : ∀ h ∈ LHF(G, µ), x.h − h is constant}.

Show that K ◁ G. Show that for any h ∈ LHF(G, µ), the function f = h|_K − h(1) is a homomorphism from K into C. Show that if h ∈ LHF(G, µ) satisfies h|_K ≡ 0, then h̄(Kx) = h(x) is a well-defined function on G/K, and that h̄ ∈ LHF(G/K, µ̄), where µ̄(Kx) = Σ_{y∈K} µ(yx) is the projection of µ onto G/K. Conclude that

dim LHF(G, µ) ≤ dim Hom(K, C) + 1 + dim LHF(G/K, µ̄).

B solution C

Exercise 9.42 Let G be a finitely generated group, and let µ ∈ SA(G, 1) be such that (G, µ) is Liouville. Define

K = {x ∈ G : ∀ h ∈ LHF(G, µ), x.h − h is constant}.

Show that Z₁(G) ◁ K (recall that Z₁(G) is the center of G). Similarly to Exercise 9.41, conclude that

dim LHF(G, µ) ≤ dim Hom(Z₁(G), C) + 1 + dim LHF(G/Z₁(G), µ̄),

where µ̄(Z₁(G)x) = µ(Z₁(G)x) is the projection of µ onto G/Z₁(G).

B solution C

Exercise 9.43 Show that if G is a finitely generated group and U is a finite generating set for G, then dim Hom(G, C) ≤ |U|. B solution C

Exercise 9.44 Let G be a finitely generated nilpotent group, and let µ ∈ SA(G, 1). Show (without using Kleiner's theorem, Theorem 9.8.1) that dim LHF(G, µ) < ∞. B solution C

9.10 Remarks

Milnor was a major initiator of the study of growth of finitely generated groups. The Milnor–Wolf theorem (Theorem 8.2.4), published in Milnor (1968a) and Wolf (1968), was a special case of Gromov's theorem for the class of solvable groups. In the same year, Milnor (1968b) posed the following two questions:




(1) Do there exist groups of intermediate growth?
(2) What is the (algebraic) characterization of the groups of polynomial growth?

The first question was answered by Grigorchuk (1980, 1984), in which he constructed an intermediate growth group, now known as the Grigorchuk group. The second question was answered by Gromov (1981). This is Gromov's theorem (Theorem 9.0.1), stating that a finitely generated group has polynomial growth if and only if it is virtually nilpotent.

Kleiner (2010) gave a new proof of Gromov's theorem by showing that the space of Lipschitz harmonic functions has finite dimension (Theorem 9.8.1), and using the action on that space to provide a representation of the group as a linear group. The proof presented is from Kleiner (2010). The Poincaré inequality, Proposition 9.8.2, is attributed in Kleiner (2010) to Saloff-Coste. It is similar to an inequality appearing in Coulhon and Saloff-Coste (1993).

Ozawa (2018) provided yet another proof for Gromov's theorem; it is basically the one presented here. Both Kleiner's and Ozawa's proofs rely on the existence of harmonic cocycles (Theorem 9.3.2). This follows from results of Mok (1995) or Korevaar and Schoen (1997). Kleiner provides a proof using property (T). Ozawa provides a very short proof using the spectral theorem. The proof of Theorem 9.3.2 we present here is due to Lee and Peres (2013). This proof has the advantage of avoiding the use of ultrafilters and ultraproducts. Theorem 9.4.1, stating that random walks on groups are never sub-diffusive, is due to Erschler, and appears in Lee and Peres (2013). The proof of the von Neumann ergodic theorem (Theorem 9.5.2) presented is due to Riesz, and we have taken it from Tao's blog What's New.

Shalom and Tao (2010) gave a quantitative proof of Gromov's theorem, based on Kleiner's proof, which also shows that there is some ε > 0 such that any finitely generated group of growth r ↦ r^{(log log r)^ε} is actually of polynomial growth.
Thus, there is a "gap" for growth functions, where no groups exist. It is actually conjectured by Grigorchuk (1990) that there is a much more serious "gap."

Conjecture 9.10.1 (Grigorchuk's gap conjecture) Any finitely generated group of growth r ↦ exp(r^α) for some α < ½ is actually of polynomial growth.

For ε > 0, let K be large enough so that ‖v_K − v‖ < ε/4. Let N be large enough so that ‖A_n v_K − πv_K‖ < ε/2 for all n ≥ N. We then have that for all n ≥ N,

‖A_n v − πv‖ ≤ ‖A_n v_K − πv_K‖ + ‖A_n(v − v_K) − π(v − v_K)‖ < ε/2 + 2‖v − v_K‖ < ε.

Thus, ‖A_n v − πv‖ → 0 as n → ∞, implying that v ∈ V. Hence V is closed. The other two assertions are very easy. :) X

Solution to Exercise 9.18 :( For any b ∈ B we have

Σ_{b′∈B} |⟨Tb, b′⟩|² = ‖Tb‖²,

making the last identity immediate (by Tonelli's theorem, Exercise A.13).


9.11 Solutions to Exercises

Now, using this and Tonelli's theorem repeatedly,

Σ_{a∈B̃} ‖T*a‖² = Σ_{a∈B̃} Σ_{b∈B} |⟨Tb, a⟩|² = Σ_{b∈B} ‖Tb‖²,

and also

Σ_{a∈B̃} ‖T*a‖² = Σ_{a,a′∈B̃} |⟨T*a, a′⟩|² = Σ_{a′∈B̃} ‖Ta′‖².

Also, if v ∈ H is such that ‖v‖ = 1, then v is contained in some orthonormal basis for H. Hence, ‖Tv‖² ≤ ‖T‖²_HS. This holds for all unit-length v, so ‖T‖_op ≤ ‖T‖_HS. :) X

Solution to Exercise 9.19 :( If ‖A‖_HS < ∞ and ‖A′‖_HS < ∞, then by Cauchy–Schwarz,

Σ_{b∈B} |⟨Ab, A′b⟩| ≤ Σ_{b∈B} ‖Ab‖ · ‖A′b‖ ≤ (Σ_{b∈B} ‖Ab‖²)^{1/2} (Σ_{b∈B} ‖A′b‖²)^{1/2} = ‖A‖_HS · ‖A′‖_HS,

so ⟨A, A′⟩_HS converges absolutely. Thus, linearity of the Lebesgue integral will provide the properties of an inner product. Also, if B̃ is another orthonormal basis for H, then since Σ_{b̃∈B̃} ⟨Ab̃, A′b̃⟩ is absolutely convergent, the set {b̃ ∈ B̃ : ⟨Ab̃, A′b̃⟩ ≠ 0} is countable. Also, since

| | A| |HS =

X

| | Ab˜ | | 2 < ∞,

˜ B˜ b∈

and for any b˜ ∈ B˜ ,

| | Ab˜ | | 2 =

X D E ˜ b | 2 < ∞, | Ab, b∈B

we may find countable subsets C˜ ⊂ B˜ and C ⊂ B such that XD E X XD ED E ˜ A0 b˜ = ˜ b b, A0 b˜ Ab, Ab, ˜ B˜ b∈

˜ C˜ b∈C b∈

=

X XD

˜ b Ab,

ED

E X

( A0 ) ∗ b, A∗ b b, A0 b˜ =

˜ C˜ b∈C b∈

=

X

( A0 ) ∗ b, b0

b∈B

X

b0, A∗ b = Ab0, A0 b0 . b 0 ∈B

b, b 0 ∈C

A0 iHS

So hA, does not depend on the specific choice of orthonormal basis. To show that H∗ ⊗ H is a Hilbert space, we need to show that it is complete (with the Hilbert–Schmidt norm). The proof of this fact is basically the same as the proof that ` 2 is a complete space. Let ( A n ) n be a Cauchy sequence in H∗ ⊗ H. Let B be an orthonormal basis of H. Let v ∈ H. Note that 2 | | A n+m v − A n v | | 2 ≤ | | A n+m − A n | |HS · | |v | | 2,

so ( A n v) n is a Cauchy sequence in H. Thus the limit Av := lim n A n v exists in H. This also immediately shows that A is a linear map. Since | | A n+m | |HS − | | A n | |HS ≤ | | A n+m − A n | |HS, we get that lim n | | A n | |HS exists, and specifically, M := sup n | | A n | |HS < ∞. If F ⊂ B is a finite subset then X X X | | Ab | | 2 ≤ 2 | | An b | |2 + 2 | |( A − A n )b | | 2 b∈F

b∈F

b∈F

2 2 ≤ 2 sup | | A n | |HS + 2|F | · | |v | | 2 · | | A − A n | |HS . n

2 ≤ 2M 2 < ∞. So A ∈ H∗ ⊗ H. Taking n → ∞ and the supremum over F we get that | | A| |HS Now, since | | A n | |HS < ∞ and | | A| |HS < ∞, there exists a countable subset C ⊂B such that for all b ∈ B\C we have | | A n b | | = | | Ab | | = 0 for all n. Write C = {c1, c2, . . . }.

https://doi.org/10.1017/9781009128391.012 Published online by Cambridge University Press

348

Gromov’s Theorem

2 < ε. Let ε > 0. Let n0 be large enough so that for all n, m ≥ n0 , we have | | A n − A m | |HS Fix some r > 0. Let n1 ≥ n0 be large enough so that for any m ≥ n1 , we have | |( A − A m )c j | | 2 < all 1 ≤ j ≤ r . We then have that for all n ≥ n0 and all m ≥ n1 , r X

| |( A − A n )c j | | 2 ≤ 2

j=1

r X

| |( A − A m )c j | | 2 + 2

j=1

r X

ε r

for

2 | |( A n − A m )c j | | 2 < 2ε + 2| | A n − A m | |HS < 4ε.

j=1

2 < 4ε for all n ≥ n , implying that A → A in H∗ ⊗ H. Taking r → ∞, we have that | | A − A n | |HS n 0

:) X
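The basis-independence of the Hilbert–Schmidt norm, and the bound ||T||_op ≤ ||T||_HS from Exercise 9.18, can be sanity-checked numerically in C². The sketch below is not from the book; the matrix T and the rotation angle are arbitrary illustrative choices.

```python
# Sanity check in C^2: sum_b ||Tb||^2 agrees for two orthonormal bases,
# and ||Tv||^2 <= ||T||_HS^2 for unit vectors v (so ||T||_op <= ||T||_HS).
import math

T = [[1 + 2j, 0.5j], [-1.0, 3 - 1j]]  # an arbitrary 2x2 complex matrix

def apply(T, v):
    # matrix-vector product in standard coordinates
    return [sum(T[i][k] * v[k] for k in range(2)) for i in range(2)]

def norm_sq(v):
    return sum(abs(x) ** 2 for x in v)

def hs_sq(T, basis):
    # Hilbert-Schmidt norm squared computed in the given orthonormal basis
    return sum(norm_sq(apply(T, b)) for b in basis)

std = [[1, 0], [0, 1]]
t = 0.7  # any angle; the rows below form an orthonormal basis of C^2
rot = [[math.cos(t), math.sin(t)], [-math.sin(t), math.cos(t)]]

assert abs(hs_sq(T, std) - hs_sq(T, rot)) < 1e-12
for v in ([1, 0], [0, 1], [1 / math.sqrt(2), 1j / math.sqrt(2)]):
    assert norm_sq(apply(T, v)) <= hs_sq(T, std) + 1e-12
```
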

Solution to Exercise 9.20 :( Note that if B is an orthonormal basis then

⟨v ⊗ u, v' ⊗ u'⟩ = Σ_{b∈B} ⟨b, v⟩ ⟨v', b⟩ ⟨u, u'⟩ = ⟨v', v⟩ · ⟨u, u'⟩,

||v ⊗ u||²_HS = Σ_{b∈B} ||(v ⊗ u)b||² = Σ_{b∈B} |⟨b, v⟩|² · ||u||² = ||v||² · ||u||².

If b, b', a, a' ∈ B then

⟨b ⊗ b', a ⊗ a'⟩_HS = ⟨a, b⟩ · ⟨b', a'⟩ = 1_{a=b} · 1_{a'=b'}.

So {b ⊗ b' : b, b' ∈ B} forms an orthonormal system. Also, for A ∈ H* ⊗ H and b, b' ∈ B, we have

⟨A, b ⊗ b'⟩_HS = Σ_{b''∈B} ⟨Ab'', (b ⊗ b')b''⟩ = Σ_{b''∈B} ⟨Ab'', b'⟩ · ⟨b, b''⟩ = ⟨Ab, b'⟩.

So if ⟨A, b ⊗ b'⟩_HS = 0 for all b, b' ∈ B then

||A||²_HS = Σ_{b,b'∈B} |⟨Ab, b'⟩|² = 0.

Thus, {b ⊗ b' : b, b' ∈ B} forms an orthonormal basis. :) X

Solution to Exercise 9.21 :( Since

Σ_{b∈B} ||Ab||² = ||A||²_HS < ∞,

there exists a countable subset E = {e₁, e₂, ...} ⊂ B, as required. For the second assertion, note that for any b ∈ B we have that A_n b = Ab if b ∈ {e₁, ..., e_n} and A_n b = 0 otherwise, so

||A − A_n||²_HS = Σ_{b∈B} ||Ab − A_n b||² = Σ_{k=n+1}^∞ ||Ae_k||².

As the tail of a converging series, this can be made as small as required by taking n large enough. :) X

Solution to Exercise 9.22 :( If dim H < ∞ then H is isomorphic to C^n for some n, so the unit ball is compact. If dim H = ∞ then let (u_n)_n be an orthonormal sequence. So ||u_n|| = 1 for all n and, specifically, the sequence is in the unit ball; however, for any m ≠ n,

||u_n − u_m||² = ||u_n||² + ||u_m||² = 2,

by the Pythagorean theorem, so the sequence does not have a converging subsequence. :) X

Solution to Exercise 9.23 :( For any subsequence (u_{n_k})_k we have

||u||² = ⟨u, u⟩ = lim_{k→∞} ⟨u_{n_k}, u⟩ ≤ limsup_{k→∞} ||u_{n_k}|| · ||u||.

Taking the subsequence for which limsup_{k→∞} ||u_{n_k}|| = liminf_{n→∞} ||u_n|| completes the proof. :) X

Solution to Exercise 9.24 :( Let v ∈ H. Fix ε > 0. Since ||v||² = Σ_{b∈B} |⟨v, b⟩|² < ∞, there is a finite subset F ⊂ B such that for v_F = Σ_{b∈F} ⟨v, b⟩ b, we have that ||v − v_F|| < ε. Since v_F is a finite linear combination of basis elements, ⟨u_n − u, v_F⟩ → 0 as n → ∞. Thus,

|⟨u_n − u, v⟩| ≤ |⟨u_n − u, v_F⟩| + |⟨u_n − u, v − v_F⟩| ≤ |⟨u_n − u, v_F⟩| + sup_n ||u_n − u|| · ε.

Taking n → ∞ and then ε → 0 completes the proof, utilizing the fact that ||u_n − u|| ≤ 2. :) X

Solution to Exercise 9.25 :( Let B be an orthonormal basis. For a finite subset F ⊂ B define A_F : H → H by

A_F v = Σ_{b∈F} ⟨v, b⟩ Ab.

Let ε > 0. By Exercise 9.21, there exists a finite subset F_ε ⊂ B such that ||A − A_{F_ε}||_HS < ε. Let (u_n)_n be a sequence converging weak* to u. For any finite F ⊂ B, we have ||A_F (u_n − u)|| → 0 as n → ∞. Thus,

||A(u_n − u)|| ≤ ||(A − A_F)(u_n − u)|| + ||A_F (u_n − u)|| ≤ 2 ||A − A_F||_HS + ||A_F (u_n − u)||.

This implies that for any ε > 0,

limsup_{n→∞} ||A(u_n − u)|| ≤ 2ε,

implying that ||A(u_n − u)|| → 0 as n → ∞. :) X

Solution to Exercise 9.26 :( The action x.A := xAx⁻¹ is obviously a left action. It is unitary because for some orthonormal basis B of H, since G acts unitarily on H, the collection xB = {xb : b ∈ B} also forms an orthonormal basis. Hence,

⟨xAx⁻¹, xA'x⁻¹⟩_HS = Σ_{b∈B} ⟨xAx⁻¹ b, xA'x⁻¹ b⟩ = Σ_{b∈B} ⟨Ax⁻¹ b, A'x⁻¹ b⟩ = ⟨A, A'⟩_HS. :) X

Solution to Exercise 9.27 :( For any w ∈ H,

x(v ⊗ u)y⁻¹ w = ⟨y⁻¹ w, v⟩ xu = (yv ⊗ xu)w. :) X

Solution to Exercise 9.30 :( This is basically some form of Jensen's inequality. Let X = X₁ have the distribution of μ. Fix any vector b ∈ H. We have that E⟨X.v − v, b⟩ = ⟨Tv − v, b⟩ = 0. So using that E⟨X.v, b⟩ = ⟨Tv, b⟩ = ⟨v, b⟩, we arrive at

0 ≤ Var⟨X.v, b⟩ = E|⟨X.v, b⟩|² − |⟨v, b⟩|².

Summing b over an orthonormal basis, and using the fact that ||X.v|| = ||v|| because the action is unitary,

0 = E||X.v||² − ||v||² = Σ_b ( E|⟨X.v, b⟩|² − |⟨v, b⟩|² ) ≥ 0.

So for any b we have Var⟨X.v, b⟩ = 0, implying that ⟨X.v, b⟩ = ⟨v, b⟩ a.s. That is, for any b we find that ⟨x.v, b⟩ = ⟨v, b⟩ for any x ∈ supp(μ). This implies that x.v = v for any x ∈ supp(μ). Finally, since μ is adapted, any x ∈ G is the product of finitely many elements from supp(μ), so x.v = v for all x ∈ G. :) X


Solution to Exercise 9.31 :( By Theorem 9.6.1, we know that G has a finite index subgroup [G : H] < ∞ with a surjective homomorphism ϕ : H → Z. Set K = ker ϕ. Since H is finite index in G it has linear growth. Since H/K ≅ Z, by Exercise 8.19, it must be that K is finite. By Exercise 8.21, H is virtually Z, implying that G is virtually Z as well. :) X

Solution to Exercise 9.32 :( As before, by Theorem 9.6.1, G contains a finite index subgroup [G : H] < ∞ with a surjective homomorphism ϕ : H → Z. Set K = ker ϕ. By Exercise 8.19, K must be finitely generated and has growth (r ↦ r). Hence, by Exercise 9.31, K is either virtually Z or K is finite. In the latter case, we have seen in Exercise 8.21 that this implies that H is virtually Z, so that G is virtually Z as well. In the former case, when K is virtually Z, we have that G contains a finite index subgroup H such that for some N ◁ H we have H/N ≅ Z and N ≅ Z. Now, Exercise 8.19 tells us that H = Z ⋉ N for some action of Z on N. Since N ≅ Z, we only need to show that if Z acts on Z, then Z ⋉ Z is always virtually Z². Let Z ≅ ⟨a⟩ act on the additive group of integers Z. That is, a is an automorphism of the additive group of integers. There are two possibilities for a(1): either a(1) = 1, so the action is trivial, or a(1) = −1 and the action is a reflection around 0. If the action is trivial, a(1) = 1, then it is easy to check that (a^z, x) ↦ (z, x) defines an isomorphism of ⟨a⟩ ⋉ Z to Z². If the action is a reflection, a(1) = −1, then consider the subgroup N = ⟨a²⟩ ◁ ⟨a⟩. We have that N acts trivially on the integers, and N ⋉ Z has index 2 inside ⟨a⟩ ⋉ Z. So N ⋉ Z ≅ Z² is a finite index subgroup in ⟨a⟩ ⋉ Z. :) X

Solution to Exercise 9.33 :( By Theorem 9.6.1, G contains a finite index subgroup [G : H] < ∞ with a surjective homomorphism ϕ : H → Z. Let K = ker ϕ. By Exercise 8.19, K must be finitely generated and have growth r ↦ r^(2−ε) ≺ r ↦ r².
Thus, K is either finite or virtually Z or virtually Z², by Exercise 9.32. If K is virtually Z², then K has growth r ↦ r², which is not dominated by r ↦ r^(2−ε), a contradiction. So K is either finite or virtually Z. If K is finite, then H/K ≅ Z implies that G is virtually Z, by Exercise 8.21. If K is virtually Z, then H/K ≅ Z implies that G actually has growth r ↦ r², so G is virtually Z² or virtually Z by Exercise 9.32. :) X

Solution to Exercise 9.34 :( For t = 0 there is nothing to prove. Assume for t, and we prove by induction for t + 1. Compute using the Markov property at time t,

P[X_{t+1} = x] = Σ_y P[X₁ = y] · P_y[X_t = x] = Σ_y μ_R(y) P[X_t = y⁻¹x] + Σ_y ν_R(y) P[X_t = y⁻¹x]
≤ Σ_y μ_R(y) · ( (1 − δ)^t + t · δ/|B_{R+1}| ) + sup_{y∈G} ν_R(y) · Σ_z P[X_t = z]     (by induction)
≤ (1 − δ) · (1 − δ)^t + t · δ/|B_{R+1}| + δ/|B_{R+1}|
≤ (1 − δ)^{t+1} + (t + 1) · δ/|B_{R+1}|,

which completes the induction. :) X

Solution to Exercise 9.35 :( Since B is a basis, for every j we can write

c_j = Σ_{k=1}^d M_{j,k} b_k,

for some M_{j,k} ∈ C. The linearity of the form Q_r gives that

Q_r(c_i, c_j) = Σ_{k=1}^d Σ_{ℓ=1}^d M_{i,k} Q_r(b_k, b_ℓ) M̄_{j,ℓ} = (M [Q_r]_B M*)_{i,j}. :) X
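The change-of-basis identity of Exercise 9.35 is easy to confirm numerically. The sketch below is not from the book; it uses a real symmetric 2×2 form, so that M* is simply the transpose, and both matrices are arbitrary examples.

```python
# For a form Q with matrix [Q]_B in basis B, and new vectors
# c_j = sum_k M[j][k] b_k, the matrix of Q in the c's is M [Q]_B M*.
QB = [[2.0, 1.0], [1.0, 3.0]]   # [Q]_B, symmetric positive definite
M = [[1.0, 2.0], [0.0, 1.0]]    # change-of-basis coefficients

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(A):
    return [[A[j][i] for j in range(2)] for i in range(2)]

# direct computation: Q(c_i, c_j) = sum_{k,l} M[i][k] QB[k][l] M[j][l]
direct = [[sum(M[i][k] * QB[k][l] * M[j][l]
               for k in range(2) for l in range(2))
           for j in range(2)] for i in range(2)]

via_matrices = mat_mul(mat_mul(M, QB), transpose(M))

for i in range(2):
    for j in range(2):
        assert abs(direct[i][j] - via_matrices[i][j]) < 1e-12
```
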


Solution to Exercise 9.36 :( Let B = (b₁, ..., b_d) be a basis of V. Since these are nonzero functions, there exists r₀ such that for every j there exists |x| ≤ r₀ with |b_j(x)| > 0. Then, for any r ≥ r₀, the restriction map v ↦ v|_{B(1,r)} is a linear isomorphism from V into C^{B(1,r)}. Thus, if 0 ≠ v ∈ V then v(x) ≠ 0 for some |x| ≤ r, implying that Q_r(v, v) > 0. This turns Q_r into an inner product on V. Being finite dimensional, V is then obviously a Hilbert space with this inner product. :) X

Solution to Exercise 9.37 :( Choose a basis E for V that is orthonormal with respect to the inner product. Then [Q]_E is the identity matrix. So for any other basis B = (b₁, ..., b_d) of V, we have that [Q]_B = MM* for some matrix M. Since MM* is self-adjoint, there exists a unitary matrix U and a diagonal matrix D such that [Q]_B = MM* = UDU*. So for any j,

D_{j,j} = (U*[Q]_B U)_{j,j} = Σ_{k,ℓ} (U*)_{j,k} Q(b_k, b_ℓ) U_{ℓ,j} = Q(c_j, c_j),

where

c_j = Σ_{k=1}^d (U*)_{j,k} b_k ∈ V.

Because U is invertible, c_j ≠ 0 for all j, implying that det([Q]_B) = det(D) > 0. :) X

Solution to Exercise 9.38 :( Let E = (e₁, ..., e_d) be a basis that is orthonormal with respect to Q_R (this can be done since Q_R is positive definite, so defines a Hilbert space structure on V). So [Q_R]_E = I. Then [Q_r]_E is a self-adjoint matrix, so can be unitarily diagonalized. That is, U[Q_r]_E U* = D is a diagonal matrix for some unitary matrix U. Let B = (b₁, ..., b_d) be the basis given by

b_j = Σ_{k=1}^d U_{j,k} e_k.

Then one may compute that

[Q_r]_B = U[Q_r]_E U* = D  and  [Q_R]_B = U[Q_R]_E U* = I. :) X

Solution to Exercise 9.39 :( Since det(Q_R/Q_r) = det[Q_R]_B / det[Q_r]_B does not depend on B, we may assume that B is such that [Q_r]_B and [Q_R]_B are both diagonal matrices. For b ∈ B we have that

Q_R(b, b) = Σ_{|x|≤R} |b(x)|² ≥ Q_r(b, b).

Thus,

det[Q_R]_B = Π_{b∈B} Q_R(b, b) ≥ Π_{b∈B} Q_r(b, b) = det[Q_r]_B. :) X

Solution to Exercise 9.40 :( Inductively choose x₁, ..., x_m as follows. Start with some x₁ ∈ B(1, R). For any k ≥ 1, assume that we have chosen x₁, ..., x_k such that the balls (B(x_j, r))_{j=1}^k are pairwise disjoint. If B(1, R) ⊂ ∪_{j=1}^k B(x_j, 2r), then set m = k and the construction is complete. Otherwise, there exists x_{k+1} ∈ B(1, R) such that x_{k+1} ∉ B(x_j, 2r) for all 1 ≤ j ≤ k. Hence B(x_{k+1}, r) ∩ B(x_j, r) = ∅ for all 1 ≤ j ≤ k, and we can continue inductively. With the above procedure, we have found x₁, ..., x_m ∈ B(1, R) such that B(1, R) ⊂ ∪_{j=1}^m B(x_j, 2r) and such that the balls (B(x_j, r))_{j=1}^m are pairwise disjoint. Since the balls (B(x_j, r))_{j=1}^m are pairwise disjoint, we have that m |B(1, r)| ≤ |B(1, R + r)| ≤ |B(1, 2R)|. We bound the overlap. Fix x ∈ B(1, R). Let J_x = {1 ≤ j ≤ m : x ∈ B(x_j, 6r)}. Note that for any j ∈ J_x we have that x_j ∈ B(x, 6r). Because the balls (B(x_j, r))_{j=1}^m are pairwise disjoint, we have that |J_x| · |B(1, r)| ≤ |B(1, 6r + r)| = |B(1, 7r)|. :) X


Solution to Exercise 9.41 :( We have that

xy.h − h = x.(y.h − h) + x.h − h,
x⁻¹.h − h = −x⁻¹.(x.h − h),
x⁻¹yx.h − h = x⁻¹.(y.x.h − x.h),

and since the action on a constant function is trivial, this shows that K ◁ G. Note that for any y ∈ K and x ∈ G, we have that

h(yx) − h(1) = (y⁻¹.h − h)(x) + h(x) − h(1) = (y⁻¹.h − h)(1) + h(x) − h(1) = h(y) − h(1) + h(x) − h(1).

This shows that f = h|_K − h(1) is a homomorphism from K to C.

Moreover, if h|_K ≡ 0, then for any x ∈ G and y ∈ K, we have that h(yx) = h(x), so that h̄(Kx) = h(x) is well defined. It is easy to see that h̄ ∈ LHF(G/K, μ̄). Indeed,

|h̄(Kxs) − h̄(Kx)| = |h(xs) − h(x)| ≤ ||∇h||_∞ · |s|,

and

Σ_{Kx∈G/K} μ̄(Kx) h̄(Kz Kx) = Σ_{Kx∈G/K} Σ_{y∈K} μ(yx) h(zx) = Σ_{Kx∈G/K} Σ_{y∈K} μ(yx) h(zy⁻¹z⁻¹ · zyx) = Σ_{g∈G} μ(g) h(zg) = h(z) = h̄(Kz),

where we have used that cosets are pairwise disjoint, and that h(zx) = h(zy⁻¹z⁻¹ · zyx) = h(zyx) for all y ∈ K and x, z ∈ G.

Consider the linear map Ψ(h) = h|_K on LHF(G, μ). The image of Ψ is contained in the space V of homomorphisms from K to C plus a constant function. The kernel of Ψ, denoted Ker Ψ, is the space of all h ∈ LHF(G, μ) with h|_K ≡ 0. The map Φ(h) = h̄ from Ker Ψ into LHF(G/K, μ̄) is easily seen to be an injective linear map. So we conclude that

dim LHF(G, μ) ≤ dim V + dim Ker Ψ ≤ dim Hom(K, C) + 1 + dim LHF(G/K, μ̄). :) X

Solution to Exercise 9.42 :( Let Z = Z₁(G). If z ∈ Z then for h ∈ LHF(G, μ) and x ∈ G we have that

|z.h(x) − h(x)| = |h(xz⁻¹) − h(x)| ≤ ||∇h||_∞ · |z|,

so z.h − h is a bounded μ-harmonic function. Since (G, μ) is assumed to be Liouville, z.h − h is constant, implying that z ∈ K. Proceeding exactly as in Exercise 9.41, we obtain

dim LHF(G, μ) ≤ dim Hom(Z, C) + 1 + dim LHF(G/Z, μ̄). :) X

Solution to Exercise 9.43 :( The map ψ ↦ (ψ(u))_{u∈U} is a linear map from Hom(G, C) into C^U. Since U generates G, if a homomorphism ψ satisfies ψ(u) = 0 for all u ∈ U, then ψ ≡ 0. Thus, the above linear map is injective, so dim Hom(G, C) ≤ dim C^U. :) X

Solution to Exercise 9.44 :( The proof is by induction on the nilpotent step.

Base case: If G is 1-step nilpotent, then G is Abelian. So (G, μ) is Liouville (by the Choquet–Deny theorem (Corollary 7.1.2)), and Z₁(G) = G. By Exercise 9.43, we know that dim Hom(G, C) < ∞, since G is finitely generated. Since G/Z₁(G) is the trivial group, LHF(G/Z₁(G), μ̄) is just the space of constants. Thus, by Exercise 9.42 we have that

dim LHF(G, μ) ≤ dim Hom(G, C) + 1 + 1 < ∞.


Induction step: Let n > 1. Assume the claim for nilpotent groups of step less than n, and let G be n-step nilpotent. We have that (G, μ) is Liouville because nilpotent groups are Choquet–Deny (Corollary 7.1.4). Also, dim LHF(G/Z₁(G), μ̄) < ∞ by induction, because G/Z₁(G) is at most (n − 1)-step nilpotent. Since G is a finitely generated nilpotent group, we know that Z₁(G) is finitely generated, for example, by Rosset's theorem (Theorem 8.3.3) together with the fact that nilpotent groups have polynomial growth (Theorem 8.2.1). Thus, by Exercise 9.42,

dim LHF(G, μ) ≤ dim Hom(Z₁(G), C) + 1 + dim LHF(G/Z₁(G), μ̄) < ∞. :) X

Appendices


Appendix A Hilbert Space Background


A.1 Inner Products and Hilbert Spaces

We record the necessary background regarding inner products and Hilbert spaces. Any basic book should contain the details. Most exercises and theorems are given without solutions or proofs in this section.

Definition A.1.1 A complex inner product space is a vector space V over the complex number field C with the additional structure of an inner product. This is a function ⟨·, ·⟩ : V × V → C such that for all vectors u, v, w ∈ V and scalars α ∈ C,

• ⟨αv + u, w⟩ = α⟨v, w⟩ + ⟨u, w⟩,
• ⟨v, u⟩ is the complex conjugate of ⟨u, v⟩,
• R ∋ ⟨v, v⟩ ≥ 0, and
• ⟨v, v⟩ = 0 if and only if v = 0.

An inner product defines a norm on V given by ||v|| = √⟨v, v⟩ for all v ∈ V.

Sometimes there can be more than one space being considered. If we wish to stress which inner product space is being used we may write ⟨·, ·⟩_V or ||·||_V.

Exercise A.1 Let V be an inner product space, and fix some v, u ∈ V. Consider the function ϕ : R → R given by ϕ(ξ) = ||ξv − u||². Compute the minimum of ϕ. Use it to prove the Cauchy–Schwarz inequality:

|⟨v, u⟩| ≤ ||v|| · ||u|| for all v, u ∈ V.

Example A.1.2 In Chapter 4 we consider examples of ℓ²(G, c) for a network (G, c). These are inner product spaces.

Exercise A.2 Show that the norm on an inner product space V satisfies the following. For all vectors v, u ∈ V and scalars α ∈ C,

• ||αv|| = |α| · ||v||,
• ||v + u|| ≤ ||v|| + ||u||, and
• ||v|| = 0 if and only if v = 0.

Exercise A.3 Show that dist(u, v) = ||v − u|| defines a metric on an inner product space V.


Exercise A.4 Let V be an inner product space and fix some u ∈ V. Show that the functions v ↦ ⟨v, u⟩ and v ↦ ⟨u, v⟩ and v ↦ ||v|| are three uniformly continuous functions on V.

Definition A.1.3 An inner product space H is a Hilbert space if the metric induced by the norm on H is a complete metric; that is, every Cauchy sequence in H converges to a limit in H.

Example A.1.4 We have that ℓ²(G, c) is a Hilbert space; see Exercise 4.2 (Chapter 4).

Definition A.1.5 Let H be a Hilbert space. We write v ⊥ u if ⟨u, v⟩ = 0. For subsets S, T ⊂ H we write S ⊥ T if for all s ∈ S and t ∈ T we have s ⊥ t. For a subset S ⊂ H we define S^⊥ = {u ∈ H : ∀ v ∈ S, u ⊥ v}. We also write v^⊥ = {v}^⊥.

Exercise A.5

Show that S ⊥ is a subspace.

Exercise A.6

Show that S ⊥ is closed.

Definition A.1.6 Let H be a Hilbert space and V, W ≤ H be closed subspaces. We write H = V ⊕ W if V ∩ W = {0} and for every u ∈ H there exist v ∈ V, w ∈ W such that u = v + w.

Theorem A.1.7 (Orthogonal projection) Let V ≤ H be a subspace. Then (V^⊥)^⊥ = cl(V). (So if V is closed then (V^⊥)^⊥ = V.) Also, H = V ⊕ V^⊥. Moreover, consider the maps L : H → V and R : H → V^⊥ given by Lu = v and Ru = w, where u = v + w is the unique decomposition for which v ∈ V and w ∈ V^⊥. Then these are both surjective linear maps, with

Lv 0 = v 0,

Lw 0 = 0,

Rv 0 = 0,

Rw 0 = w 0,

for all v 0 ∈ V and w 0 ∈ V ⊥ . We call L the orthogonal projection onto V (and similarly R is the orthogonal projection onto V ⊥ ). Finally, for any u ∈ H we have that ||u − Lu|| = min{||u − v|| : v ∈ V }.


Exercise A.7 Show that if V ≤ H is a closed subspace of a Hilbert space, and π : H → V is the orthogonal projection, then ||πu|| ≤ ||u|| for all u ∈ H.

Definition A.1.8 Let V be a complex vector space. A linear functional is a linear map from V to C.

Example A.1.9 We have seen that in a Hilbert space H, for u ∈ H, the map v ↦ ⟨v, u⟩ is a continuous linear functional. In fact, the next theorem states that these are the only possibilities.

Theorem A.1.10 (Riesz representation theorem) Let H be a Hilbert space. Let ϕ : H → C be a continuous linear functional. Then there exists u ∈ H such that ϕ(v) = ⟨v, u⟩ for all v ∈ H.

A.2 Normed Vector Spaces

Definition A.2.1 Let V be a vector space over C. A norm on V is a function V ∋ v ↦ ||v|| ∈ [0, ∞) such that for all vectors u, v ∈ V and all scalars α ∈ C,

• ||αv|| = |α| · ||v||,
• ||v + u|| ≤ ||v|| + ||u||,
• ||v|| ≥ 0, and
• ||v|| = 0 if and only if v = 0.

Such a space V is called a normed space. If the metric induced by the norm is complete, V is called a Banach space. We have seen that any inner product induces a norm. But not every norm is induced by an inner product.

Exercise A.8 (Parallelogram law) Show that if V is a complex inner product space, then for the norm induced by the inner product we have

||u + v||² + ||u − v||² = 2 (||u||² + ||v||²)

for all u, v ∈ V.

Exercise A.9 (Polarization identity) Assume that V is a normed complex vector space such that for all u, v ∈ V we have

||u + v||² + ||u − v||² = 2 (||u||² + ||v||²).

Show that

⟨u, v⟩ = ¼ ( ||u + v||² − ||u − v||² + i||u + iv||² − i||u − iv||² )

is a well-defined inner product on V. Show that ⟨u, u⟩ = ||u||². So a norm is induced by an inner product if and only if it satisfies the parallelogram law.
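The polarization identity of Exercise A.9 can be checked numerically in C² with the standard inner product ⟨u, v⟩ = Σ u_k conj(v_k). The vectors below are arbitrary illustrative choices, not from the book.

```python
# Check that (1/4)(||u+v||^2 - ||u-v||^2 + i||u+iv||^2 - i||u-iv||^2)
# recovers <u, v> for the standard inner product on C^2.
u = [1 + 2j, -0.5j]
v = [0.3 - 1j, 2.0]

def inner(u, v):
    # linear in the first argument, conjugate-linear in the second
    return sum(uk * vk.conjugate() for uk, vk in zip(u, v))

def nsq(w):
    return sum(abs(x) ** 2 for x in w)

def add(u, v, c=1):
    # u + c*v, componentwise
    return [uk + c * vk for uk, vk in zip(u, v)]

polar = 0.25 * (nsq(add(u, v)) - nsq(add(u, v, -1))
                + 1j * nsq(add(u, v, 1j)) - 1j * nsq(add(u, v, -1j)))
assert abs(polar - inner(u, v)) < 1e-12
```
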

Definition A.2.2 Let V, W be two normed spaces. For a linear operator L : V → W, define

||L||_{V→W} = sup{ ||Lv||_W : v ∈ V, ||v||_V ≤ 1 }.

If ||L||_{V→W} < ∞, then L is called bounded.

Exercise A.10 Let V, W be two normed spaces. Let L : V → W be a linear operator. Show that L is bounded if and only if L is continuous (as a function between two metric spaces). Show that L is continuous if and only if L is continuous at 0 ∈ V.

Definition A.2.3 Let V be a normed space, and let T : V → V be a bounded linear operator. Define the subspace of T-invariant vectors to be

V^T := {v ∈ V : Tv = v}.

Exercise A.11 Let V be a normed space, and let T : V → V be a bounded linear operator. Show that V^T is always a closed subspace of V.

A.3 Orthonormal Systems

Lemma A.3.1 Let X be a non-empty set (possibly uncountable). Consider the counting measure μ on X; that is, μ(A) = |A| for any A ⊂ X. For any ϕ : X → [0, ∞], the Lebesgue integral satisfies

∫_X ϕ dμ = sup_{F ⊂ X, |F| < ∞} Σ_{x∈F} ϕ(x).

For any R > 0 we may choose a finite subset F_{a,R} ⊂ B such that Σ_{b∈F_{a,R}} f_a(b) > R. By considering the finite subset {a} × F_{a,R} ⊂ A × B, we get that

S ≥ Σ_{b∈F_{a,R}} f(a, b) > R,

so that S = S_A = ∞ by taking R → ∞. So assume that S_a < ∞ for all a ∈ A. Fix ε > 0. For any a ∈ A choose a finite subset F_a = F_{a,ε} ⊂ B such that Σ_{b∈F_a} f_a(b) > S_a − ε. Let F_A ⊂ A be any finite subset. Define F_B ⊂ B by F_B = ∪_{a∈F_A} F_a. We then have that

S ≥ Σ_{(a,b)∈F_A×F_B} f(a, b) ≥ Σ_{a∈F_A} ( Σ_{b∈F_a} f_a(b) ) ≥ Σ_{a∈F_A} S_a − ε · |F_A|.

Taking ε → 0 and then a supremum over F_A we find that S ≥ S_A. :) X

Solution to Exercise A.14 :( Since f ∈ ℓ²(B), there is a countable subset C ⊂ B such that f(b) = 0 for all b ∈ B\C. Enumerate C = {c₁, c₂, ...}. For every n, define v_n = Σ_{k=1}^n f(c_k) c_k. Note that by the Pythagorean theorem,

||v_{n+m} − v_n||² = Σ_{k=1}^m |f(c_{n+k})|² ≤ Σ_{k>n} |f(c_k)|²,

which shows that (v_n)_n forms a Cauchy sequence in H. Thus, v = lim_n v_n ∈ H exists. Also, if b ∈ C then

⟨v, b⟩ = lim_{n→∞} ⟨v_n, b⟩ = f(b),

and if b ∉ C then, similarly, ⟨v, b⟩ = 0 = f(b). If w is any vector with ⟨w, b⟩ = f(b) for all b ∈ B, then ⟨v − w, b⟩ = 0 for all b ∈ B\C, and ⟨v − w, b⟩ = f(b) − f(b) = 0 for all b ∈ C, which implies that v = w. :) X


Appendix B Entropy


B.1 Shannon Entropy Axioms

A distribution on finitely many objects can be characterized by a vector (p₁, ..., p_n) such that p_j ∈ [0, 1] and Σ_j p_j = 1. We want to associate a "measure of uncertainty" to such distributions, say denoted by H(p₁, ..., p_n). This is a number that depends only on the probabilities p₁, ..., p_n, not on the specific objects being sampled. For an integer n, let P_n = { (p₁, ..., p_n) : p_j ∈ [0, 1], Σ_j p_j = 1 } be the set of all probability distributions on n objects. We want a measure of uncertainty. That is, we want to measure how hard it is to guess, or predict, or sample, a certain measure. That is, we want functions H_n : P_n → [0, ∞) such that:

• More variables have strictly larger uncertainty: H_n(1/n, ..., 1/n) < H_{n+1}(1/(n+1), ..., 1/(n+1)).

• Grouping: If we have n variables to choose uniformly from, we could play the game in two steps. First, divide them into k blocks, each of size b₁, ..., b_k. Then, choose each block with probability b_j/n, and in that block choose uniformly among the b_j variables. That is, for natural numbers b₁ + ··· + b_k = n,

H_n(1/n, ..., 1/n) = H_k(b₁/n, ..., b_k/n) + Σ_{j=1}^k (b_j/n) H_{b_j}(1/b_j, ..., 1/b_j).

• Continuity: H_n(p₁, ..., p_n) is continuous.

Theorem B.1.1 Such a family of functions H must satisfy

H_n(p₁, ..., p_n) = −C Σ_{j=1}^n p_j log p_j,

for some constant C > 0. Note that this implies that the order of the p_j's does not matter.

Proof Set u(n) = H_n(1/n, ..., 1/n).

Step 1: u(mn) = u(m) + u(n). Indeed, take b₁ = ··· = b_m = n in the grouping axiom. Then,

u(nm) = u(m) + Σ_{j=1}^m (1/m) u(n) = u(m) + u(n).


So we get that u(n^k) = k u(n).

Step 2: u(n) = C log n for C = u(2)/log 2. If n = 1 then grouping implies that

u(n) = u(n) + Σ_{j=1}^n (1/n) H₁(1),

so u(1) = H₁(1) = 0. So we can assume n > 1. Let a > 0 be some integer, and let k ∈ N be such that n^k ≤ 2^a < n^{k+1}. That is, k ≤ a log 2 / log n < k + 1, so k = ⌊a log 2 / log n⌋. Here, we use that u is strictly increasing: we get

k u(n) = u(n^k) ≤ u(2^a) = a u(2) < u(n^{k+1}) = (k + 1) u(n).

So we have that for any a > 0,

k/a ≤ log 2 / log n < (k + 1)/a  and  k/a ≤ u(2)/u(n) < (k + 1)/a.

Thus,

| u(2)/u(n) − log 2 / log n | ≤ 1/a.

Taking a → ∞ proves Step 2.

Step 3: Let us prove the theorem for p_j ∈ Q. If p_j = a_j/b_j ∈ Q then define Π_j = Π_{i≠j} b_i and Π = Π_{j=1}^n b_j. So p_j = a_j Π_j / Π and Σ_j a_j Π_j = Π. Thus, by grouping,

u(Π) = H_n(a₁Π₁/Π, ..., a_nΠ_n/Π) + Σ_{j=1}^n (a_jΠ_j/Π) u(a_jΠ_j).

Rearranging, we have

H_n(p₁, ..., p_n) = C log Π − C Σ_{j=1}^n p_j log(a_jΠ_j) = −C Σ_{j=1}^n p_j log(a_jΠ_j/Π).

This is the theorem for rational values.

Step 4: The theorem follows for all values by continuity.
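The grouping axiom is indeed satisfied by H(p) = −Σ_j p_j log p_j, which can be confirmed numerically; in the sketch below (not from the book) the block sizes are arbitrary, and C = 1 because we use the natural logarithm.

```python
# Check the grouping axiom for Shannon entropy with n = b_1 + ... + b_k:
#   H_n(1/n,...,1/n) = H_k(b_1/n,...,b_k/n) + sum_j (b_j/n) H_{b_j}(uniform),
# and that u(n) = H_n(uniform) = log n.
import math

def H(p):
    return -sum(x * math.log(x) for x in p if x > 0)

blocks = [2, 3, 5]          # arbitrary block sizes
n = sum(blocks)             # n = 10

lhs = H([1.0 / n] * n)
rhs = H([b / n for b in blocks]) + sum((b / n) * H([1.0 / b] * b)
                                       for b in blocks)
assert abs(lhs - rhs) < 1e-12
assert abs(lhs - math.log(n)) < 1e-12
```
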

B.2 A Different Perspective on Entropy Suppose that we are scientists, and we wish to predict the next state of some physical system. Of course, this is done using previous observations. Suppose


that the distribution of the next state of the system is distributed over n possible states, say {1, 2, ..., n}. Assume that the (unknown) probability for the system to be in state j is q_j. Some scientist provides a prediction for the distribution: p(1), ..., p(n). We want to "score" the possible prediction, or rather penalize the prediction based on "how far off" it is from the actual distribution q₁, ..., q_n. If B(p) is the penalization for predicting a probability p ∈ [0, 1], then the expected penalization for a prediction p(1), ..., p(n) is

Σ_{j=1}^n q_j B(p(j)).

How would this penalization work? First, it should be continuous in changing the parameters. Second, one can predict the distribution after two time steps. So this would be a prediction of a distribution on {(i, j) : 1 ≤ i, j ≤ n}; denote it by p(i, j). Now, of course scientists all know logic and mathematics, so their prediction for two time steps ahead is consistent with their prediction for one time step ahead. Specifically, for all i, j we have p(i, j) = p(j|i) p(i), where p(j|i) denotes the predicted probability that state j will follow a state i. Also, the penalization for predicting two time steps ahead should be the same as that for predicting one time step and then predicting the second step based on that. That is, B(p(i, j)) = B(p(i)) + B(p(j|i)). Together these become: B(xy) = B(x) + B(y) for all x, y ∈ [0, 1]. It is a simple exercise to prove that B(x) = log_m x (where the base m of the logarithm is a choice). Thus, the expected penalization is

H(p(1), ..., p(n)) = − Σ_{j=1}^n q_j log_m p(j).

(The negative sign is to show that this is a penalization.)

Proposition B.2.1 Let p_j, q_j ∈ (0, 1] be such that Σ_{j=1}^n p_j = Σ_{j=1}^n q_j = 1. Then, for any m > 1,

− Σ_{j=1}^n p_j log_m p_j ≤ − Σ_{j=1}^n p_j log_m q_j.

So p_j = q_j minimizes the expected penalization.

Proof Note that log_m(x) has second derivative −1/(x² log m) < 0 on (0, ∞). So by Jensen's inequality,

Σ_{j=1}^n p_j log_m(q_j/p_j) ≤ log_m Σ_{j=1}^n p_j · (q_j/p_j) = log_m 1 = 0.

So if predicting an outcome distribution p₁, ..., p_n "costs" the scientist

D(p|q) := Σ_{j=1}^n p_j log(p_j/q_j),

where q₁, ..., q_n is the real distribution, then the cost is always nonnegative, and the cost is minimized by choosing the correct distribution.

Exercise B.1

Let p_j ∈ (0, 1] be such that Σ_{j=1}^n p_j = 1. Define r_j = 1/n for all 1 ≤ j ≤ n. Show that for any m > 1,

− Σ_{j=1}^n p_j log_m p_j ≤ − Σ_{j=1}^n r_j log_m r_j = log_m n.
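Proposition B.2.1 (Gibbs' inequality) and the maximum-entropy bound of Exercise B.1 can be checked numerically. The two distributions in the sketch below are arbitrary examples, not from the book; natural logarithms are used (m = e).

```python
# Check: -sum p_j log p_j <= -sum p_j log q_j, D(p|q) >= 0, and
# entropy is at most log n (attained by the uniform distribution).
import math

p = [0.5, 0.3, 0.2]   # arbitrary example distributions
q = [0.4, 0.4, 0.2]

def cross(p, q):
    # -sum p_j log q_j; cross(p, p) is the Shannon entropy of p
    return -sum(pj * math.log(qj) for pj, qj in zip(p, q))

assert cross(p, p) <= cross(p, q) + 1e-12        # Proposition B.2.1
D = sum(pj * math.log(pj / qj) for pj, qj in zip(p, q))
assert D >= 0
assert abs(D - (cross(p, q) - cross(p, p))) < 1e-12
assert cross(p, p) <= math.log(len(p)) + 1e-12   # Exercise B.1
```
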

Appendix C Coupling and Total Variation


In this appendix we briefly review the connection between the total variation distance on probability measures, and coupling.

C.1 Total Variation Distance

Definition C.1.1 For two probability measures μ, ν on a countable set Ω, define the total variation distance between μ and ν to be

||μ − ν||_TV = ½ Σ_x |μ(x) − ν(x)|.

Exercise C.1 Let μ, ν be two probability measures on a countable set Ω. Show that

sup_{A⊂Ω} |μ(A) − ν(A)| = Σ_{x: μ(x)>ν(x)} (μ(x) − ν(x)) = Σ_{x: ν(x)>μ(x)} (ν(x) − μ(x)) = ||μ − ν||_TV. B solution C

Exercise C.2 Let μ, ν be two probability measures on a countable set Ω. Show that

Σ_x (μ(x) ∧ ν(x)) = 1 − ||μ − ν||_TV. B solution C
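The identities in Exercises C.1 and C.2 can be verified numerically on a small finite space. The sketch below is not from the book; the two distributions are arbitrary examples.

```python
# Check the equivalent formulas for total variation distance on a
# three-point space, including the supremum over all events A.
from itertools import combinations

mu = {'a': 0.5, 'b': 0.3, 'c': 0.2}
nu = {'a': 0.2, 'b': 0.3, 'c': 0.5}

tv = 0.5 * sum(abs(mu[x] - nu[x]) for x in mu)
pos_part = sum(mu[x] - nu[x] for x in mu if mu[x] > nu[x])
overlap = sum(min(mu[x], nu[x]) for x in mu)

assert abs(tv - pos_part) < 1e-12        # Exercise C.1
assert abs(overlap - (1 - tv)) < 1e-12   # Exercise C.2

# the supremum over all events A equals the TV distance
best = max(abs(sum(mu[x] - nu[x] for x in A))
           for r in range(len(mu) + 1) for A in combinations(mu, r))
assert abs(best - tv) < 1e-12
```
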

C.2 Couplings

Given two probability measures μ, ν on a countable set Ω, a coupling of (μ, ν) is a probability measure λ on Ω × Ω such that the marginals of λ have the distributions μ, ν respectively; that is, for all x, y ∈ Ω,

Σ_y λ(x, y) = μ(x)  and  Σ_x λ(x, y) = ν(y).

If X, Y are Ω-valued random variables with laws µ and ν, respectively, a coupling of (X, Y ) is defined to be a coupling of (µ, ν). We say that (X, Y ) is a coupling of (µ, ν) if X has law µ and Y has law ν, so that their joint distribution is a coupling of (µ, ν). Couplings enable us to put two probability measures into the same space, so we can compare them. This usually requires creativity in constructing the coupling, but finding a good coupling can be a very powerful tool.


Exercise C.3 A coupling of two probability measures always exists: show that the product measure $\lambda = \mu \otimes \nu$ given by $\lambda(x, y) = \mu(x) \nu(y)$ is a coupling of $(\mu, \nu)$.
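In code, the product coupling and its marginals can be checked directly; a minimal sketch with illustrative distributions:

```python
# Sketch: the product coupling of Exercise C.3, with both marginals
# verified numerically. Example values are made up.

mu = {"x": 0.7, "y": 0.3}
nu = {"x": 0.4, "y": 0.6}

lam = {(a, b): mu[a] * nu[b] for a in mu for b in nu}

for a in mu:  # left marginal recovers mu
    assert abs(sum(lam[(a, b)] for b in nu) - mu[a]) < 1e-12
for b in nu:  # right marginal recovers nu
    assert abs(sum(lam[(a, b)] for a in mu) - nu[b]) < 1e-12
print("product measure is a coupling")
```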

Exercise C.4 Let $\mu, \nu$ be two probability measures on a countable set $\Omega$. Show that if $(X, Y)$ is a coupling of $(\mu, \nu)$, then for any event $A$,
$$|\mu(A) - \nu(A)| \le \mathbb{P}[X \neq Y] .$$
Conclude that
$$\sup_{A \subset \Omega} |\mu(A) - \nu(A)| \le \inf \mathbb{P}[X \neq Y] ,$$
where the infimum is over all possible couplings of $(\mu, \nu)$.

Proposition C.2.1 Let $\mu, \nu$ be two probability measures on a countable set $\Omega$. Then there exists a coupling $(X, Y)$ of $(\mu, \nu)$ such that $\|\mu - \nu\|_{TV} = \mathbb{P}[X \neq Y]$. Thus,
$$\|\mu - \nu\|_{TV} = \min \mathbb{P}[X \neq Y] ,$$
where the minimum is over all possible couplings $(X, Y)$ of $(\mu, \nu)$.

Proof Since
$$\|\mu - \nu\|_{TV} = \sup_{A \subset \Omega} |\mu(A) - \nu(A)| \le \inf \mathbb{P}[X \neq Y] ,$$
where the infimum is over all couplings of $(\mu, \nu)$, we only need to find a specific coupling $(X, Y)$ of $(\mu, \nu)$ satisfying $\mathbb{P}[X \neq Y] = \|\mu - \nu\|_{TV}$.

For this, set $\varepsilon = \sum_{x} (\mu(x) \wedge \nu(x))$. We have seen in Exercise C.2 that $1 - \varepsilon = \|\mu - \nu\|_{TV}$. Define a coupling $(X, Y)$ as follows. Let $\xi$ be a Bernoulli-$\varepsilon$ random variable, and conditionally on $\xi$ let $(X, Y)$ be distributed according to
$$\begin{cases} \dfrac{\mu(x) \wedge \nu(x)}{\varepsilon} & x = y , \ \xi = 1 , \\[6pt] \dfrac{(\mu(x) - \nu(x))_+}{\|\mu - \nu\|_{TV}} \cdot \dfrac{(\nu(y) - \mu(y))_+}{\|\mu - \nu\|_{TV}} & \xi = 0 . \end{cases}$$
Writing $\lambda$ for the resulting (unconditional) joint law of $(X, Y)$, note that
$$\sum_{x} \lambda(x, y) = (\mu(y) \wedge \nu(y)) + \frac{1}{\|\mu - \nu\|_{TV}} \sum_{x} (\mu(x) - \nu(x))_+ \cdot (\nu(y) - \mu(y))_+ = \nu(y) ,$$
$$\sum_{y} \lambda(x, y) = (\mu(x) \wedge \nu(x)) + \frac{1}{\|\mu - \nu\|_{TV}} \sum_{y} (\nu(y) - \mu(y))_+ \cdot (\mu(x) - \nu(x))_+ = \mu(x) ,$$
so that $\lambda$ is indeed a coupling of $(\mu, \nu)$. For $(X, Y)$ with law $\lambda$ we obtain that
$$\mathbb{P}[X \neq Y] = \mathbb{P}[\xi = 0] = 1 - \varepsilon = \|\mu - \nu\|_{TV} ,$$
which is what we wanted to prove. □
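The coupling built in the proof can be sketched numerically. Here the unconditional joint law is assembled directly (mass $\mu \wedge \nu$ on the diagonal, the normalized product of excesses off it), with illustrative distributions and assuming $\|\mu - \nu\|_{TV} > 0$:

```python
# Sketch of the coupling from the proof of Proposition C.2.1, in
# unconditional form: diagonal mass mu(x) ∧ nu(x), off-diagonal mass
# (mu - nu)_+ (nu - mu)_+ / ||mu - nu||_TV. Assumes TV > 0; values made up.

mu = {1: 0.5, 2: 0.5, 3: 0.0}
nu = {1: 0.2, 2: 0.3, 3: 0.5}

tv = 0.5 * sum(abs(mu[x] - nu[x]) for x in mu)

lam = {}
for x in mu:
    for y in nu:
        mass = max(mu[x] - nu[x], 0.0) * max(nu[y] - mu[y], 0.0) / tv
        if x == y:
            mass += min(mu[x], nu[x])
        lam[(x, y)] = mass

p_not_equal = sum(m for (x, y), m in lam.items() if x != y)
assert abs(p_not_equal - tv) < 1e-12  # the coupling attains P[X != Y] = TV
print("optimal coupling attains the TV distance")
```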

As an example, Exercise 6.35 bounds the total variation distance between binomials. We recall that exercise here.

Exercise C.5 Show that the total variation distance between $\mathrm{Bin}(n, p)$ and $\mathrm{Bin}(n+1, p)$ is bounded by $\frac{1}{\sqrt{2(n+1)}}$. ▷ solution ◁
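The bound can be probed numerically. The sketch below checks it for the symmetric case $p = 1/2$ over a range of $n$ (the binomial pmf is computed from scratch; this is an illustration, not the solution to the exercise):

```python
# Sketch: numerically probing the bound of Exercise C.5 for p = 1/2,
# comparing TV(Bin(n, p), Bin(n+1, p)) against 1 / sqrt(2(n+1)).
from math import comb, sqrt

def binom_pmf(n, p):
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

p = 0.5
for n in range(1, 30):
    a = binom_pmf(n, p) + [0.0]      # pad to the common support {0, ..., n+1}
    b = binom_pmf(n + 1, p)
    tv = 0.5 * sum(abs(x - y) for x, y in zip(a, b))
    assert tv <= 1 / sqrt(2 * (n + 1)) + 1e-12
print("bound holds for p = 1/2, n = 1..29")
```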

C.3 Solutions to Exercises

Solution to Exercise C.1 Let $E = \{x : \mu(x) > \nu(x)\}$ and let $F = \{x : \nu(x) > \mu(x)\}$. We have
$$\sum_{x \in E} \mu(x) - \nu(x) + \sum_{x \in F} \mu(x) - \nu(x) = \sum_{x} \mu(x) - \nu(x) = 0 ,$$
and we obtain that
$$2 \|\mu - \nu\|_{TV} = \sum_{x \in E} \mu(x) - \nu(x) + \sum_{x \in F} \nu(x) - \mu(x) = 2 \sum_{x \in E} \mu(x) - \nu(x) = 2 \sum_{x \in F} \nu(x) - \mu(x) .$$
For any $A \subset \Omega$, we have that
$$\mu(A) - \nu(A) = \sum_{x \in A} \mu(x) - \nu(x) \le \sum_{x \in A \cap E} \mu(x) - \nu(x) = \mu(A \cap E) - \nu(A \cap E)$$
$$\le \mu(A \cap E) - \nu(A \cap E) + \sum_{x \in E \setminus A} \mu(x) - \nu(x) = \sum_{x \in E} \mu(x) - \nu(x) = \|\mu - \nu\|_{TV} ,$$
and similarly,
$$\nu(A) - \mu(A) \le \|\mu - \nu\|_{TV} ,$$
so that
$$\sup_{A \subset \Omega} |\mu(A) - \nu(A)| \le \|\mu - \nu\|_{TV} .$$
Since $\|\mu - \nu\|_{TV} = |\mu(E) - \nu(E)| = |\nu(F) - \mu(F)|$, we are done.

Solution to Exercise C.2 Let $E = \{x : \mu(x) > \nu(x)\}$. Compute
$$1 - \|\mu - \nu\|_{TV} = \frac{1}{2} \sum_{x \in E} (\mu(x) + \nu(x)) - (\mu(x) - \nu(x)) + \frac{1}{2} \sum_{x \notin E} (\mu(x) + \nu(x)) - (\nu(x) - \mu(x))$$
$$= \sum_{x \in E} \nu(x) + \sum_{x \notin E} \mu(x) = \sum_{x} (\mu(x) \wedge \nu(x)) .$$