This introduction to random walks on infinite graphs gives particular emphasis to graphs with polynomial volume growth.



London Mathematical Society Lecture Note Series: 438

Random Walks and Heat Kernels on Graphs M A RT I N T. BA R L OW University of British Columbia, Canada

University Printing House, Cambridge CB2 8BS, United Kingdom
One Liberty Plaza, 20th Floor, New York, NY 10006, USA
477 Williamstown Road, Port Melbourne, VIC 3207, Australia
4843/24, 2nd Floor, Ansari Road, Daryaganj, Delhi – 110002, India
79 Anson Road, #06–04/06, Singapore 079906

Cambridge University Press is part of the University of Cambridge. It furthers the University's mission by disseminating knowledge in the pursuit of education, learning, and research at the highest international levels of excellence.

www.cambridge.org
Information on this title: www.cambridge.org/9781107674424
DOI: 10.1017/9781107415690

© Martin T. Barlow 2017

This publication is in copyright. Subject to statutory exception and to the provisions of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press.

First published 2017
Printed in the United Kingdom by Clays, St Ives plc

A catalogue record for this publication is available from the British Library.

Library of Congress Cataloging-in-Publication Data
Names: Barlow, M. T.
Title: Random walks and heat kernels on graphs / Martin T. Barlow, University of British Columbia, Canada.
Description: Cambridge : Cambridge University Press, [2017] | Series: London Mathematical Society lecture note series ; 438 | Includes bibliographical references and index.
Identifiers: LCCN 2016051295 | ISBN 9781107674424
Subjects: LCSH: Random walks (Mathematics) | Graph theory. | Markov processes. | Heat equation.
Classification: LCC QA274.73 .B3735 2017 | DDC 511/.5–dc23
LC record available at https://lccn.loc.gov/2016051295

ISBN 978-1-107-67442-4 Paperback

Cambridge University Press has no responsibility for the persistence or accuracy of URLs for external or third-party Internet Web sites referred to in this publication and does not guarantee that any content on such Web sites is, or will remain, accurate or appropriate.

To my mother, Yvonne Barlow, and in memory of my father, Andrew Barlow.

Contents

Preface  ix

1 Introduction  1
  1.1 Graphs and Weighted Graphs  1
  1.2 Random Walks on a Weighted Graph  6
  1.3 Transition Densities and the Laplacian  11
  1.4 Dirichlet or Energy Form  15
  1.5 Killed Process  21
  1.6 Green's Functions  22
  1.7 Harmonic Functions, Harnack Inequalities, and the Liouville Property  26
  1.8 Strong Liouville Property for R^d  31
  1.9 Interpretation of the Liouville Property  32

2 Random Walks and Electrical Resistance  38
  2.1 Basic Concepts  38
  2.2 Transience and Recurrence  42
  2.3 Energy and Variational Methods  44
  2.4 Resistance to Infinity  55
  2.5 Traces and Electrical Equivalence  61
  2.6 Stability under Rough Isometries  67
  2.7 Hitting Times and Resistance  70
  2.8 Examples  73
  2.9 The Sierpinski Gasket Graph  75

3 Isoperimetric Inequalities and Applications  80
  3.1 Isoperimetric Inequalities  80
  3.2 Nash Inequality  85
  3.3 Poincaré Inequality  91
  3.4 Spectral Decomposition for a Finite Graph  97
  3.5 Strong Isoperimetric Inequality and Spectral Radius  101

4 Discrete Time Heat Kernel  106
  4.1 Basic Properties and Bounds on the Diagonal  106
  4.2 Carne–Varopoulos Bound  111
  4.3 Gaussian and Sub-Gaussian Heat Kernel Bounds  116
  4.4 Off-diagonal Upper Bounds  124
  4.5 Lower Bounds  128

5 Continuous Time Random Walks  132
  5.1 Introduction to Continuous Time  132
  5.2 Heat Kernel Bounds  140

6 Heat Kernel Bounds  149
  6.1 Strongly Recurrent Graphs  149
  6.2 Gaussian Upper Bounds  155
  6.3 Poincaré Inequality and Gaussian Lower Bounds  160
  6.4 Remarks on Gaussian Bounds  168

7 Potential Theory and Harnack Inequalities  172
  7.1 Introduction to Potential Theory  172
  7.2 Applications  179

Appendix  183
  A.1 Martingales and Tail Estimates  183
  A.2 Discrete Time Markov Chains and the Strong Markov Property  186
  A.3 Continuous Time Random Walk  190
  A.4 Invariant and Tail σ-fields  197
  A.5 Hilbert Space Results  202
  A.6 Miscellaneous Estimates  205
  A.7 Whitney Type Coverings of a Ball  206
  A.8 A Maximal Inequality  211
  A.9 Poincaré Inequalities  213

References  219
Index  224

Preface

The topic of random walks on graphs is a vast one, and has close connections with many other areas of probability, as well as analysis, geometry, and algebra. In the probabilistic direction, a random walk on a graph is just a reversible or symmetric Markov chain, and many results on random walks on graphs also hold for more general Markov chains. However, pursuing this generalisation too far leads to a loss of concrete interest, and in this text the context will be restricted to random walks on graphs where each vertex has a finite number of neighbours. Even with these restrictions, there are many topics which this book does not cover – in particular the very active field of relaxation times for finite graphs. This book is mainly concerned with infinite graphs, and in particular those which have polynomial volume growth. The main topic is the relation between geometric properties of the graph and asymptotic properties of the random walk. A particular emphasis is on properties which are stable under minor perturbations of the graph – for example the addition of a number of diagonal edges to the Euclidean lattice Z2 . The precise definition of ‘minor perturbation’ is given by the concept of a rough isometry, or quasi-isometry. One example of a property which is stable under these perturbations is transience; this stability is proved in Chapter 2 using electrical networks. A considerably harder theorem, which is one of the main results of this book, is that the property of satisfying Gaussian heat kernel bounds is also stable under rough isometries. The second main theme of this book is deriving bounds on the transition density of the random walk, or the heat kernel, from geometric information on the graph. Once one has these bounds, many properties of the random walk can then be obtained in a straightforward fashion. Chapter 1 gives the basic definition of graphs and random walks, as well as that of rough isometry. Chapter 2 explores the close connections between


random walks and electrical networks. The key contribution of network theory is to connect the hitting properties of the random walk with the effective resistance between sets. The effective resistance can be bounded using variational principles, and this introduces tools which have no purely probabilistic counterpart. Chapter 3 introduces some geometric and analytic inequalities which will play a key role in the remainder of the book – two kinds of isoperimetric inequality, Nash inequalities, and Poincaré inequalities. It is shown how the two analytic inequalities (i.e. Nash and Poincaré) can be derived from the geometric information given by the isoperimetric inequalities. Chapter 4 studies the transition density of the discrete time random walk, or discrete time heat kernel. Two initial upper bounds on this are proved – an ‘on-diagonal’ upper bound from a Nash inequality, and a long range bound, the Carne–Varopoulos bound, which holds in very great generality. Both these proofs are quite short. To obtain full Gaussian or sub-Gaussian bounds much more work is needed. The second half of Chapter 4 makes the initial steps, and introduces conditions (bounds on exit times from balls, and a ‘near-diagonal lower bound’) which if satisfied will lead to full Gaussian bounds. Chapter 5 introduces the continuous time random walk, which is in many respects an easier and more regular object to study than the discrete time random walk. In this chapter it is defined from the discrete time walk and an independent Poisson process – another construction, from the transition densities, is given in the Appendix. It is proved that Gaussian or sub-Gaussian heat kernel bounds hold for the discrete time random walk if and only if they hold for the continuous time walk. Chapter 6 proves sub-Gaussian or Gaussian bounds for two classes of graphs. 
The first are ‘strongly recurrent’ graphs; given the work done in Chapter 4 the results for these come quite easily from volume and resistance estimates. The second class are graphs which satisfy a Poincaré inequality – these include Zd and graphs which are roughly isometric to Zd . The proof of Gaussian bounds for these graphs is based on the tools developed by Nash to study divergence form PDE; these methods also work well in the graph context. Chapter 7 gives a brief introduction to potential theory. Using the method of balayage, the heat kernel bounds of Chapter 6 lead to a Harnack inequality, and this in turn is used to prove the strong Liouville property (i.e. all positive harmonic functions are constant) for graphs which are roughly isometric to Zd . This book is based on a course given at the first Pacific Institute for the Mathematical Sciences (PIMS) summer school in Probability, which was held at UBC in 2004. It is written for first year graduate students, though it is probably too long for a one semester course at most institutions. A much shorter


version of this course was also given in Kyoto in 2005, and I wish to thank both PIMS and the Research Institute for the Mathematical Sciences (RIMS) for their invitations to give these courses. I circulated a version of these notes several years ago, and I would like to thank those who have read them and given feedback, and in particular, Alain Sznitman and Gerard Ben-Arous, for encouraging me to complete them. I also wish to thank Alain for making available to me some lecture notes of his own on this material. I have learnt much from a number of previous works in this area – the classical text of Spitzer [Spi], as well as the more recent works [AF, LL, LP, Wo]. I am grateful to those who have sent me corrections, and in particular to Yoshihiro Abe, Alma Sarai Hernandez Torres, Guillermo Martinez Dibene, Jun Misumi, and Zichun Ye.

1 Introduction

1.1 Graphs and Weighted Graphs

We start with some basic definitions. To be quite formal initially, a graph is a pair Γ = (V, E). Here V is a finite or countably infinite set, and E is a subset of

    P_2(V) = {subsets A of V with two elements}.

Given a set A we write |A| for the number of elements of A. The elements of V are called vertices, and the elements of E edges. In practice, we think of the elements of V as points (often embedded in some space such as R^d), and the edges as lines between the vertices. Now for some general definitions.

(1) We write x ∼ y to mean {x, y} ∈ E, and say that y is a neighbour of x. Since all edges have two distinct elements, we have not allowed edges between a point x and itself. Also, we do not allow multiple edges between points, though we could; anything which can be done with multiple edges can be done better with weights, which will be introduced shortly.
(2) A path γ in Γ is a sequence x_0, x_1, ..., x_n with x_{i−1} ∼ x_i for 1 ≤ i ≤ n. The length of a path γ is the number of edges in γ. A path γ = (x_0, ..., x_n) is a cycle or loop if x_0 = x_n. A path is self-avoiding if the vertices in the path are distinct.
(3) We define d(x, y) to be the length n of the shortest path x = x_0, x_1, ..., x_n = y. If there is no such path then we set d(x, y) = ∞. (This is the graph or chemical metric on Γ.) For x ∈ V, A ⊂ V we also write d(x, A) = min{d(x, y) : y ∈ A}.
(4) Γ is connected if d(x, y) < ∞ for all x, y.
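Concretely, the graph metric in (3) can be computed by breadth-first search. The following Python sketch is our own illustration (the adjacency-list format and function name are assumptions, not from the text):

```python
from collections import deque

def graph_distance(adj, x, y):
    """Length of the shortest path from x to y in the graph given by
    adjacency lists adj; returns float('inf') if no path exists."""
    if x == y:
        return 0
    dist = {x: 0}
    queue = deque([x])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                if w == y:
                    return dist[w]
                queue.append(w)
    return float('inf')

# A 4-cycle plus an isolated vertex: opposite corners are at distance 2,
# and the isolated vertex is at distance infinity from everything else.
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0], 4: []}
```

With this representation a graph is connected, in the sense of (4), exactly when `graph_distance` never returns infinity.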


(5) Γ is locally finite if N(x) = {y : y ∼ x} is finite for each x ∈ V, that is, every vertex has a finite number of neighbours. Γ has bounded geometry if sup_x |N(x)| < ∞.
(6) Define balls in Γ by

    B(x, r) = {y : d(x, y) ≤ r},    x ∈ V, r ∈ [0, ∞).

Note that while the metric d(x, ·) is integer valued, we allow r here to be real. Of course B(x, r) = B(x, ⌊r⌋).
(7) We define the exterior boundary of A by

    ∂A = {y ∈ A^c : there exists x ∈ A with x ∼ y}.

Set also

    ∂_i A = ∂(A^c) = {y ∈ A : there exists x ∈ A^c with x ∼ y},    Ā = A ∪ ∂A,    A° = A − ∂_i A.

We use the notation A_n ↑↑ V to mean that A_n is an increasing sequence of finite sets with ∪_n A_n = V.

Notation. We write c_i, C_i for (strictly) positive constants, which will remain fixed in each argument. In some proofs we will write c, c′ etc. for positive constants, which may change from line to line. Given functions f, g : A → R we write f ≍ g to mean that there exists c ≥ 1 such that

    c^{−1} f(x) ≤ g(x) ≤ c f(x)    for all x ∈ A.

We write 1 for the function which is identically 1, and 1_A for the indicator function (or characteristic function) of the set A. We write 1_x for 1_{x}.

From now on we will always assume:

    (H1) Γ is locally finite;
    (H2) Γ is connected;

and in addition from time to time we will assume:

    (H3) Γ has bounded geometry.

We will treat weighted graphs.

Definition 1.1 We assume there exist weights (also called conductances) μ_{xy}, x, y ∈ V satisfying:


(i) μ_{xy} = μ_{yx},
(ii) μ_{xy} ≥ 0 for all x, y ∈ V,
(iii) if x ≠ y then μ_{xy} > 0 if and only if x ∼ y.

We call (Γ, μ) a weighted graph. Note that condition (iii) allows us to have μ_{xx} > 0. Since E = {{x, y} : x ≠ y, μ_{xy} > 0}, we can recover the set of edges from μ. The natural weights on Γ are given by

    μ_{xy} = 1 if x ∼ y, and 0 otherwise.

Whenever we discuss a graph without explicit mention of weights, we will assume we are using the natural weights. Let μ_x = μ(x) = Σ_y μ_{xy}, and extend μ to a measure on V by setting

    μ(A) = Σ_{x∈A} μ_x.    (1.1)

Since Γ is locally finite we have: (i) |B(x, r)| < ∞ for any x and r, (ii) μ(A) < ∞ for any finite set A.

We will sometimes need additional conditions on the weights.

Definition 1.2 We say (Γ, μ) has bounded weights if there exists C_1 < ∞ such that C_1^{−1} ≤ μ_e ≤ C_1 for all e ∈ E, and μ_{xx} ≤ C_1 for all x ∈ V. We say (Γ, μ) has controlled weights if there exists C_2 < ∞ such that

    μ_{xy}/μ_x ≥ 1/C_2    whenever x ∼ y.

We introduce the conditions:

    (H4) (Γ, μ) has bounded weights;
    (H5) (Γ, μ) has controlled weights.

Controlled weights is called 'the p_0 condition' in [GT1, GT2]; note that it holds if (H3) and (H4) hold. While a little less obvious than (H4), it turns out that controlled weights is the 'right' condition in many circumstances. As an example of its use, we show how it gives bounds on both the cardinality and the measure of a ball.


Lemma 1.3 Suppose (Γ, μ) satisfies (H5) with constant C_2. Then (H3) holds and, for n ≥ 0,

    |B(x, n)| ≤ 2 C_2^n,    μ(B(x, n)) ≤ 2 C_2^n μ_x.

Proof Since μ_{xy} ≥ μ_x/C_2 if x ∼ y, we have |N(x)| ≤ C_2 for all x, and thus (H3) holds. Further, we have C_2 ≥ 2 unless |V| ≤ 2. Then writing S(x, n) = {y : d(x, y) = n}, we have |S(x, n)| ≤ C_2 |S(x, n − 1)|, and so by induction

    |B(x, n)| ≤ (C_2^{n+1} − 1)/(C_2 − 1) ≤ 2 C_2^n.

Also

    μ(N(x)) = Σ_{y∼x} μ_y ≤ C_2 Σ_{y∼x} μ_{xy} = C_2 μ_x.

Hence again by induction μ(B(x, n)) ≤ 2 C_2^n μ_x.

Examples 1.4
(1) The Euclidean lattice Z^d. Here V = Z^d, and x ∼ y if and only if |x − y| = 1. Strictly speaking we should denote this graph by (Z^d, E_d), where E_d is the set of edges defined above.
(2) The d-ary tree. ('Binary' if d = 2.) This is the unique infinite graph with |N(x)| ≡ d + 1, and with no cycles.
(3) We will also consider the 'rooted binary tree' B. Let B_0 = {o}, and for n ≥ 1 let B_n = {0, 1}^n. Then the vertex set is given by B = ∪_{n=0}^∞ B_n. For x = (x_1, ..., x_n) ∈ B_n with n ≥ 2 set a(x) = (x_1, ..., x_{n−1}) – we call a(x) the ancestor of x. (We set a(z) = o for z ∈ B_1.) Then the edge set of the rooted binary tree is given by E(B) = {{x, a(x)} : x ∈ B − B_0}.
(4) The canopy tree is a subgraph of B defined as follows. Let U = {o} ∪ {x = (x_1, ..., x_n) ∈ B : x_1 = ··· = x_n = 0}. Define f : B → B by taking f(x) to be the point in U closest to x, and set B_Can = {x ∈ B : d(x, f(x)) ≤ d(o, f(x))}. Then the canopy tree is the subgraph of (B, E(B)) generated by B_Can. The canopy tree has exponential volume growth, but only one self-avoiding path from o to infinity.
(5) Let G be a finitely generated group, and let Σ = {g_1, ..., g_n} be a set of generators, not necessarily minimal. Write Σ* = {g_1, ..., g_n, g_1^{−1}, ..., g_n^{−1}}. Let V = G and let {g, h} ∈ E if and only if g^{−1}h ∈ Σ*. Then Γ = (V, E) is the Cayley graph of the group G with generators Σ. Z^d is the Cayley graph of the group Z ⊕ ··· ⊕ Z, with generators g_k, 1 ≤ k ≤ d; here g_k has 1 in the kth place and is zero elsewhere. Since different sets of generators will give rise to different Cayley graphs, in general a group G will have many different Cayley graphs. The d-ary tree


is the Cayley graph of the group generated by Σ = {x_1, x_2, ..., x_{d+1}} with the relations x_1^2 = ··· = x_{d+1}^2 = 1. Since the ternary tree is also the Cayley graph of the free group with two generators, two distinct (non-isomorphic) groups can have the same Cayley graph.
(6) Let α > 0. Consider the graph Z_+ = {0, 1, ...}, but with weights μ^{(α)}_{n,n+1} = α^n. This satisfies (H1)–(H3) and (H5), but (H4) fails unless α = 1. If α < 1 then (Z_+, μ^{(α)}) is an infinite graph, but μ^{(α)}(Z_+) < ∞.
(7) Given a weighted graph (V, E, μ), the lazy weighted graph (V, E, μ^{(L)}) is defined by setting μ^{(L)}_{xy} = μ_{xy} if y ≠ x, and

    μ^{(L)}_{xx} = Σ_{y≠x} μ_{xy}.

Definition 1.5 We introduce the following operations on graphs or weighted graphs. Let (Γ, μ) be a weighted graph.
(1) Subgraphs. Let H ⊂ V. Then the subgraph induced by H is the (weighted) graph with vertex set H, edge set E_H = {{x, y} ∈ E : x, y ∈ H}, and weights (if appropriate) given by

    μ^H_{xy} = μ_{xy},    x, y ∈ H.

We define μ^H_x = Σ_{y∈H} μ^H_{xy}; note that μ^H_x ≤ μ_x. Clearly (H, E_H) need not be connected, even if Γ is.
(2) Collapse of a subset to a point. Let A ⊂ V. The graph obtained by collapsing A to a point a (where a ∉ V) is the graph Γ′ = (V′, E′) given by

    V′ = (V − A) ∪ {a},
    E′ = {{x, y} ∈ E : x, y ∈ V − A} ∪ {{x, a} : x ∈ ∂A}.

We define weights on (V′, E′) by setting μ′_{xy} = μ_{xy} if x, y ∈ V − A, and

    μ′_{xa} = Σ_{b∈A} μ_{xb},    x ∈ V − A.

Note that μ′_{xa} ≤ μ_x < ∞. In general this graph need not be locally finite, but will be if |∂A| < ∞. It is easy to check that if Γ is connected then so is Γ′.


(3) Join of two graphs. Let Γ_i = (V_i, E_i), i = 1, 2 be two graphs. (We regard V_1, V_2 as being distinct sets.) If x_i ∈ V_i, the join of Γ_1 and Γ_2 at x_1, x_2 is the graph (V, E) given by V = V_1 ∪ V_2, E = E_1 ∪ E_2 ∪ {{x_1, x_2}}. We define weights on (V, E) by keeping the weights on Γ_1 and Γ_2, and giving the new edge {x_1, x_2} weight 1.
(4) Graph products. Let Γ_i be graphs. There are two natural products, called the Cartesian product (denoted Γ_1 □ Γ_2) and the direct or tensor product Γ_1 × Γ_2. In both cases we take the vertex set of the product to be V_1 × V_2 = {(x_1, x_2) : x_1 ∈ V_1, x_2 ∈ V_2}. For the Cartesian product we define

    E_□ = {{(x, y_1), (x, y_2)} : x ∈ V_1, {y_1, y_2} ∈ E_2} ∪ {{(x_1, y), (x_2, y)} : {x_1, x_2} ∈ E_1, y ∈ V_2},

and for the tensor product

    E_× = {{(x_1, y_1), (x_2, y_2)} : {x_1, x_2} ∈ E_1, {y_1, y_2} ∈ E_2}.

Note that Z^2 = Z □ Z. If μ^{(i)} are weights on Γ_i then we define weights on the product by

    μ^□_{(x,y_1),(x,y_2)} = μ^{(2)}_{y_1,y_2},    μ^□_{(x_1,y),(x_2,y)} = μ^{(1)}_{x_1,x_2},    μ^×_{(x_1,y_1),(x_2,y_2)} = μ^{(1)}_{x_1,x_2} μ^{(2)}_{y_1,y_2}.
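Two of the operations above translate directly into code. The Python sketch below is our own illustration (the dict-of-dicts weight format, frozenset edges, and the choice to simply drop edges inside the collapsed set are assumptions, not from the text):

```python
def collapse(mu, A, a):
    """Collapse the vertex set A to a single new vertex a (assumed not in V).
    mu is a symmetric dict of dicts, mu[x][y] = mu_{xy}.  The new weight
    mu'_{xa} sums mu_{xb} over b in A; edges inside A are discarded."""
    A = set(A)
    new = {}
    for x, nbrs in mu.items():
        if x in A:
            continue
        new[x] = {y: w for y, w in nbrs.items() if y not in A}
        to_a = sum(w for y, w in nbrs.items() if y in A)
        if to_a > 0:
            new[x][a] = to_a
    # add a's neighbour dict symmetrically
    new[a] = {x: d[a] for x, d in new.items() if a in d}
    return new

def cartesian_edges(V1, E1, V2, E2):
    """Edge set of the Cartesian product: a copy of G2 at each vertex of G1
    plus a copy of G1 at each vertex of G2 (edges as frozensets of pairs)."""
    E = {frozenset([(x, y1), (x, y2)]) for x in V1 for (y1, y2) in E2}
    E |= {frozenset([(x1, y), (x2, y)]) for (x1, x2) in E1 for y in V2}
    return E
```

For instance, the Cartesian product of two single edges is a 4-cycle, the local picture behind Z² = Z □ Z.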

1.2 Random Walks on a Weighted Graph

Set

    P(x, y) = μ_{xy}/μ_x.    (1.2)

Let X be the discrete time Markov chain X = (X_n, n ≥ 0, P^x, x ∈ V) with transition matrix (P(x, y)), defined on a space (Ω, F) – see Appendix A.2. Here P^x is the law of the chain with X_0 = x, and the transition probabilities are given by

    P^z(X_{n+1} = y | X_n = x) = P(x, y).

We call X the simple random walk (SRW) on (Γ, μ) and (P(x, y)) the transition matrix of X. At each time step X moves from a vertex x to a neighbour y with probability proportional to μ_{xy}. Since we can have μ_{xx} > 0 we do in general allow X to remain at a vertex x with positive probability. If ν
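The transition rule (1.2) is straightforward to simulate. Here is a minimal Python sketch of ours (the dict-of-dicts weight format and the seeding convention are assumptions, not the book's notation):

```python
import random

def srw_step(mu, x, rng=random):
    """One step of the SRW on a weighted graph: from x, move to y with
    probability mu_{xy} / mu_x, where mu_x = sum_y mu_{xy}."""
    nbrs = list(mu[x].items())
    total = sum(w for _, w in nbrs)
    u = rng.random() * total
    acc = 0.0
    for y, w in nbrs:
        acc += w
        if u < acc:
            return y
    return nbrs[-1][0]  # guard against floating point rounding

def srw_path(mu, x0, n, seed=0):
    """A trajectory (X_0, ..., X_n) of the SRW started at x0."""
    rng = random.Random(seed)
    path = [x0]
    for _ in range(n):
        path.append(srw_step(mu, path[-1], rng))
    return path
```

Allowing a key `x` in `mu[x]` with positive weight realises μ_{xx} > 0, so the walk may stay put, exactly as remarked above.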


is a probability measure on V we write P^ν for the law of X started with distribution ν. Note that (P(x, y)) is μ-symmetric, that is:

    μ_x P(x, y) = μ_y P(y, x) = μ_{xy}.    (1.3)

Set P_n(x, y) = P^x(X_n = y). The condition (1.3) implies that X is reversible.

Lemma 1.6 Let x_0, x_1, ..., x_n ∈ V. Then

    μ_{x_0} P^{x_0}(X_0 = x_0, X_1 = x_1, ..., X_n = x_n) = μ_{x_n} P^{x_n}(X_0 = x_n, X_1 = x_{n−1}, ..., X_n = x_0).

Proof Using the μ-symmetry of P it is easy to check that both sides equal ∏_{i=1}^n μ_{x_{i−1},x_i}.

Definition 1.7 Let A ⊂ V. Define the hitting times

    T_A = min{n ≥ 0 : X_n ∈ A},    T_A^+ = min{n ≥ 1 : X_n ∈ A}.

Here we adopt the convention that min ∅ = +∞, so that T_A = ∞ if and only if X never hits A. Note also that if X_0 ∉ A then T_A = T_A^+. Write T_x = T_{x} and

    τ_A = T_{A^c} = min{n ≥ 0 : X_n ∉ A}    (1.4)

for the time of the first exit from A. If A, B ⊂ V we will write {T_A < T_B} for the event that could more precisely be denoted {T_A < T_B, T_A < ∞}. If A, B are disjoint then the sample space can be written as the disjoint union

    Ω = {T_A < T_B, T_A < ∞} ∪ {T_B < T_A, T_B < ∞} ∪ {T_A = T_B = ∞} ∪ {T_A = T_B < ∞}.

The following theorem is proved in most first courses on Markov chains – see for example [Dur, KSK, Nor], or Appendix A.2.

Theorem 1.8 Let Γ be connected, locally finite, and infinite.
(T) The following five conditions are equivalent:


(a) There exists x ∈ V such that Px (Tx+ < ∞) < 1. (b) For all x ∈ V, Px (Tx+ < ∞) < 1. (c) For all x ∈ V, ∞ Pn (x, x) < ∞. n=0

(d) For all x, y ∈ V with x = y either Px (Ty < ∞) < 1 or P y (Tx < ∞) < 1. (e) For all x, y ∈ V, ∞ Px (X hits y only finitely often) = Px 1(X n =y) < ∞ = 1. n=0

(R) The following five conditions are equivalent:
(a) There exists $x \in V$ such that $P^x(T_x^+ < \infty) = 1$.
(b) For all $x \in V$, $P^x(T_x^+ < \infty) = 1$.
(c) For all $x \in V$, $\sum_{n=0}^\infty P_n(x,x) = \infty$.
(d) For all $x, y \in V$, $P^x(T_y < \infty) = 1$.
(e) For all $x, y \in V$,
$$P^x(X \text{ hits } y \text{ infinitely often}) = P^x\Big(\sum_{n=0}^\infty 1_{(X_n = y)} = \infty\Big) = 1.$$

Definition 1.9  If (T)(a)–(e) of Theorem 1.8 hold we say $\Gamma$ (or $X$) is transient; otherwise (R)(a)–(e) of Theorem 1.8 hold and we say $\Gamma$ (or $X$) is recurrent.

The type problem for a graph is to determine whether it is transient or recurrent. A special case is given by the Cayley graphs of groups, and is sometimes called Kesten's problem: which groups have recurrent Cayley graphs? In this case, recall that the Cayley graph depends on both the group $G$ and the set of generators. Let us start with the Euclidean lattices. Pólya [Pol] proved the following in 1921.

Theorem 1.10  $\mathbb{Z}^d$ is recurrent if $d \le 2$ and transient if $d \ge 3$.

Pólya used the fact that $X_n$ is a sum of independent random variables, and that therefore the Fourier transform of $P_n(0,x)$ is a product. Inverting the Fourier transform, he deduced that $P_{2n}(0,0) \sim c_d n^{-d/2}$, and hence proved his theorem.
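The asymptotics $P_{2n}(0,0) \sim c_d n^{-d/2}$ can be checked numerically for $d = 1, 2$ from the exact return probabilities (a sketch; the function names and the choice $n = 500$ are mine):

```python
from math import comb, pi, sqrt

def p_return_1d(n):
    # P_{2n}(0,0) on Z: choose which n of the 2n steps go right
    return comb(2 * n, n) / 4**n

def p_return_2d(n):
    # P_{2n}(0,0) on Z^2, via the rotation trick Z_n = ((X_n+Y_n)/2, (X_n-Y_n)/2)
    # with X, Y independent SRWs on Z: the return probability factorises
    return (comb(2 * n, n) / 4**n) ** 2

n = 500
print(p_return_1d(n) * sqrt(pi * n))  # ratio to (pi n)^{-1/2}: close to 1
print(p_return_2d(n) * pi * n)        # ratio to (pi n)^{-1}:   close to 1
```

The ratios approach 1 as $n$ grows, consistent with $c_1 = \pi^{-1/2}$, $c_2 = \pi^{-1}$; summing over $n$ shows divergence of $\sum_n P_{2n}(0,0)$ for $d = 1, 2$.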


Most modern textbooks give a combinatorial proof. For $d = 1$ we have
$$P_{2n}(0,0) = 2^{-2n}\binom{2n}{n} \sim \frac{1}{(\pi n)^{1/2}};$$
this is easy using Stirling's formula. For $d = 2$ we have
$$P_{2n}\big((0,0),(0,0)\big) = 4^{-2n}\binom{2n}{n}\sum_{k=0}^n \binom{n}{k}^2 = 4^{-2n}\binom{2n}{n}^2 \sim \frac{1}{\pi n},$$
after a bit more work. (Or we can use the trick of writing $Z_n = (\tfrac12(X_n + Y_n), \tfrac12(X_n - Y_n))$, where $X$ and $Y$ are independent SRW on $\mathbb{Z}$.) In general we obtain $P_{2n}(0,0) \sim c_d n^{-d/2}$, so the series $\sum_n P_n(0,0)$ converges exactly when $d \ge 3$. This expression is to be expected: by the local limit theorem (a variant of the central limit theorem), the distribution of $X_n$ is approximately the multivariate Gaussian $N(0, n d^{-1} I_d)$. (Of course the fact that, after rescaling, the SRW $X$ on $\mathbb{Z}^d$ with $d \ge 3$ converges weakly to a transient process does not prove that $X$ is transient, since transience/recurrence is not preserved by weak convergence.)

The advantage of the combinatorial argument is that it is elementary. On the other hand, the details are a bit messy for $d \ge 3$. More significantly, the technique is not robust. Consider the following three examples:

(1) The SRW on the hexagonal lattice in $\mathbb{R}^2$.
(2) The SRW on a graph derived from a Penrose tiling.
(3) The graph $(\Gamma, \mu)$ where $\Gamma = \mathbb{Z}^d$, and the weights $\mu$ satisfy $\mu_{xy} \in [c^{-1}, c]$.

Example (1) could be handled by the combinatorial method, but the details would be awkward, since one now has to count six kinds of steps. Also it is plainly a nuisance to have to give a new argument for each new lattice. Pólya's method does work for this example, but both methods look hopeless for (2) or (3), since they rely on having an exact expression for $P_n(x,x)$ or its transform. (For the convergence of SRW on a Penrose tiling to Brownian motion see [T3, BT].)

We will be interested in how the geometry of $\Gamma$ is related to the long run behaviour of $X$. As far as possible we want techniques which are stable under various perturbations of the graph.

Definition 1.11  Let $P$ be some property of a weighted graph $(\Gamma, \mu)$, or the SRW $X$ on it. $P$ is stable under bounded perturbation of weights (weight stable) if whenever $(\Gamma, \mu)$ satisfies $P$, and $\mu'$ are weights on $\Gamma$ such that
$$c^{-1}\mu_{xy} \le \mu'_{xy} \le c\,\mu_{xy}, \qquad x, y \in V,$$
then $(\Gamma, \mu')$ satisfies $P$. (We say the weights $\mu$ and $\mu'$ are equivalent.)

Definition 1.12  Let $(X_i, d_i)$, $i = 1, 2$ be metric spaces. A map $\varphi : X_1 \to X_2$ is a rough isometry if there exist constants $C_1, C_2$ such that
$$C_1^{-1}\big(d_1(x,y) - C_2\big) \le d_2(\varphi(x), \varphi(y)) \le C_1\big(d_1(x,y) + C_2\big), \qquad (1.5)$$
$$\bigcup_{x \in X_1} B_{d_2}(\varphi(x), C_2) = X_2. \qquad (1.6)$$

If there exists a rough isometry between two spaces they are said to be roughly isometric. (One can check that this is an equivalence relation.) This concept was introduced for groups by Gromov [Grom1, Grom2] under the name quasi-isometry, and in the context of manifolds by Kanai [Kan1, Kan2]. A rough isometry between two spaces implies that the two spaces have the same large scale structure. However, to get any useful consequences of two spaces being roughly isometric one also needs some kind of local regularity. This is usually done by considering rough isometries within a family of spaces satisfying some fixed local regularity condition. (For example, Kanai assumed the manifolds had bounded geometry: that is, that the Ricci curvature was bounded below by a constant.) Sometimes one has to be careful not to forget such 'hidden' side conditions.

Example  $\mathbb{Z}^d$, $\mathbb{R}^d$, and $[0,1] \times \mathbb{R}^d$ are all roughly isometric.

The following will be proved in Chapter 2 – see Proposition 2.59.

Proposition 1.13  Let $G$ be a finitely generated infinite group, and let two sets of generators be given, with associated Cayley graphs $\Gamma$ and $\Gamma'$. Then $\Gamma$ and $\Gamma'$ are roughly isometric.

For rough isometries of weighted graphs, the natural additional regularity condition is to require that both graphs have controlled weights. Using Lemma 1.3 and the condition (1.7) below this allows one to relate the measures of balls in the two graphs.

Definition 1.14  Let $(\Gamma_i, \mu_i)$, $i = 1, 2$ be weighted graphs satisfying (H5). A map $\varphi : V_1 \to V_2$ is a rough isometry (between $(\Gamma_1, \mu_1)$ and $(\Gamma_2, \mu_2)$) if:
(1) $\varphi$ is a rough isometry between the metric spaces $(V_1, d_1)$ and $(V_2, d_2)$ (with constants $C_1$ and $C_2$);
(2) there exists $C_3 < \infty$ such that for all $x \in V_1$
$$C_3^{-1}\mu_1(x) \le \mu_2(\varphi(x)) \le C_3\,\mu_1(x). \qquad (1.7)$$


Two weighted graphs are roughly isometric if there is a rough isometry between them; one can check that this is also an equivalence relation. We define stability under rough isometries of a property $P$ of $\Gamma$ or $X$ in the obvious way.

Example  Consider the graphs $(\mathbb{Z}_+, \mu^{(\alpha)})$ defined in Examples 1.4, and let $\varphi : \mathbb{Z}_+ \to \mathbb{Z}_+$ be the identity map. Then $\varphi$ is (of course) a rough isometry between the graphs $\mathbb{Z}_+$ and $\mathbb{Z}_+$, but if $\alpha \ne \beta$ it is not a rough isometry between the weighted graphs $(\mathbb{Z}_+, \mu^{(\alpha)})$ and $(\mathbb{Z}_+, \mu^{(\beta)})$. We will see in Chapter 2 that $(\mathbb{Z}_+, \mu^{(\alpha)})$ is recurrent if and only if $\alpha \le 1$, and that the type of a graph is stable under rough isometries (of weighted graphs).

Question  Is there any 'interesting' (global) property of the random walk on a weighted graph $(\Gamma, \mu)$ which is weight stable but not stable under rough isometries? (Properties such as being bipartite or a tree are not global.)

1.3 Transition Densities and the Laplacian

Rather than working with the transition probabilities $P(x,y)$, it is often more convenient to consider the transition density of $X$ with respect to the measure $\mu$; we will also call this the (discrete time) heat kernel of the graph $\Gamma$. We set
$$p_n(x,y) = \frac{P_n(x,y)}{\mu_y} = \frac{P^x(X_n = y)}{\mu_y}.$$
We write $p(x,y) = p_1(x,y)$, and note that
$$p_0(x,y) = \frac{1_x(y)}{\mu_x}, \qquad p(x,y) = p_1(x,y) = \frac{\mu_{xy}}{\mu_x \mu_y}.$$

It is easy to verify:

Lemma 1.15  The transition densities of $X$ satisfy
$$p_{n+m}(x,y) = \sum_z p_n(x,z)\, p_m(z,y)\,\mu_z,$$
$$p_n(x,y) = p_n(y,x), \qquad \sum_y p_n(x,y)\,\mu_y = \sum_x \mu_x\, p_n(x,y) = 1.$$
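Lemma 1.15 is easy to verify numerically as well; a sketch with arbitrary random symmetric weights (all names here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n_v = 5
# Random symmetric weights on a complete graph (an arbitrary test case)
W = rng.uniform(0.5, 2.0, (n_v, n_v))
W = (W + W.T) / 2
mu = W.sum(axis=1)                  # mu_x = sum_y mu_xy
P = W / mu[:, None]                 # P(x,y) = mu_xy / mu_x

def p_n(n):
    # heat kernel p_n(x,y) = P_n(x,y) / mu_y, as a matrix
    return np.linalg.matrix_power(P, n) / mu[None, :]

# Chapman-Kolmogorov with respect to mu, symmetry, and normalisation
n, m = 3, 4
lhs = p_n(n + m)
rhs = p_n(n) @ np.diag(mu) @ p_n(m)     # sum_z p_n(x,z) p_m(z,y) mu_z
assert np.allclose(lhs, rhs)
assert np.allclose(p_n(5), p_n(5).T)                   # p_n(x,y) = p_n(y,x)
assert np.allclose((p_n(5) * mu[None, :]).sum(1), 1)   # sum_y p_n(x,y) mu_y = 1
```

The symmetry of $p_n$ is exactly where the $\mu$-symmetry (1.3) of $P$ enters: as a matrix, $p_n = D^{-1}(W D^{-1})^n$ with $D = \mathrm{diag}(\mu)$, a palindrome of symmetric factors.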

Define the function spaces
$$C(V) = \mathbb{R}^V = \{f : V \to \mathbb{R}\}, \qquad C_0(V) = \{f : V \to \mathbb{R} \text{ such that } f \text{ has finite support}\}. \qquad (1.8)$$

Let $C_+(V) = \{f : V \to \mathbb{R}_+\}$, and write $C_{0,+} = C_0 \cap C_+$ etc. For $f \in C(V)$ we write $\mathrm{supp}(f) = \{x : f(x) \ne 0\}$. For $p \in [1, \infty)$ and $f \in C(V)$ let
$$\|f\|_p^p = \sum_{x \in V} |f(x)|^p\, \mu_x,$$
and $L^p(V, \mu) = \{f \in C(V) : \|f\|_p < \infty\}$. We set $\|f\|_\infty = \sup_x |f(x)|$ and $L^\infty(V,\mu) = \{f : \|f\|_\infty < \infty\}$. If $p \in [1, \infty)$ and $A_n \uparrow\uparrow V$ then $f 1_{A_n} \to f$ in $L^p$ for any $f \in L^p$, so that $C_0$ is dense in $L^p$. We write $\langle f, g \rangle$ for the inner product on $L^2(V, \mu)$:
$$\langle f, g \rangle = \sum_{x \in V} f(x)\, g(x)\, \mu_x.$$

We define the operators
$$P_n f(x) = \sum_{y \in V} P_n(x,y)\, f(y) = \sum_{y \in V} p_n(x,y)\, f(y)\,\mu_y = E^x f(X_n), \qquad (1.9)$$
and write $P = P_1$. Since we are assuming (H1), we have that $P_n f$ is defined for any $f \in C(V)$. Using Lemma 1.15 we obtain $P_n = P^n$.

The (probabilistic) Laplacian is defined on $C(V)$ by
$$\Delta f(x) = \frac{1}{\mu_x}\sum_{y \in V} \mu_{xy}\,(f(y) - f(x)) = \sum_y p(x,y)\,(f(y) - f(x))\,\mu_y = E^x f(X_1) - f(x).$$
In operator terms we have $\Delta = P - I$.

Notation  We write $\|A\|_{p \to p}$ to denote the norm of $A$ as an operator from $L^p$ to $L^p$, that is, $\|A\|_{p \to p} = \sup\{\|Af\|_p : \|f\|_p \le 1\}$.
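As a sanity check that the expressions for $\Delta$ agree, a sketch on a randomly weighted finite graph (the weights are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
n_v = 6
W = rng.uniform(0, 1, (n_v, n_v)); W = (W + W.T) / 2   # symmetric weights
mu = W.sum(axis=1)
P = W / mu[:, None]

f = rng.normal(size=n_v)

# Probabilistic Laplacian, directly from the definition:
# (Delta f)(x) = (1/mu_x) * sum_y mu_xy (f(y) - f(x))
delta_f = (W @ f) / mu - f
# ... and as the operator Delta = P - I:
assert np.allclose(delta_f, P @ f - f)
# <Delta f, 1> = 0: the total "flow" vanishes by symmetry of mu_xy
assert abs((mu * delta_f).sum()) < 1e-9
```

The last assertion is the $n = 0$ case of the Gauss–Green identity proved below: pairing $\Delta f$ against a constant gives zero.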

Let $q$ be the conjugate index of $p$. Since, by the duality of $L^p$ and $L^q$,
$$\|f\|_p = \sup\{\|fg\|_1 : \|g\|_q \le 1\},$$
we also have
$$\|A\|_{p \to p} = \sup\{\|g\, Af\|_1 : \|f\|_p \le 1, \|g\|_q \le 1\}. \qquad (1.10)$$

Proposition 1.16
(a) $P\mathbf{1} = \mathbf{1}$.
(b) For $f \in C(V)$, $|Pf| \le P|f|$.
(c) For each $p \in [1, \infty]$, $\|P\|_{p \to p} \le 1$, and $\|\Delta\|_{p \to p} \le 2$.

Proof  (a) and (b) are easy. For (c), let first $p \in [1, \infty)$ and $q$ be the conjugate index. Let $f \in L^p(V,\mu)$. Then, using Hölder's inequality,
$$\|Pf\|_p^p = \sum_x |Pf(x)|^p\,\mu_x = \sum_x \Big|\sum_y p(x,y) f(y)\mu_y\Big|^p \mu_x \le \sum_x \Big[\Big(\sum_y p(x,y)|f(y)|^p\mu_y\Big)^{1/p}\Big(\sum_y p(x,y)\,1^q\,\mu_y\Big)^{1/q}\Big]^p \mu_x$$
$$= \sum_x \sum_y p(x,y)\,|f(y)|^p\,\mu_x\,\mu_y = \sum_y |f(y)|^p\,\mu_y = \|f\|_p^p,$$
since $\sum_y p(x,y)\mu_y = 1$ and $\sum_x p(x,y)\mu_x = 1$. If $p = \infty$ the result is easy:
$$|Pf(x)| = \Big|\sum_y p(x,y) f(y)\mu_y\Big| \le \|f\|_\infty \sum_y p(x,y)\mu_y = \|f\|_\infty.$$
As $\Delta = P - I$, the second assertion is immediate.

Remark 1.17  Note that this proof did not actually use the symmetry of $p(x,y)$, but just the fact that $\sum_x \mu_x\, p(x,y) = 1$, that is, that $\mu$ is an invariant measure for $P$.

The combinatorial Laplacian $\Delta_{\mathrm{Com}}$ is defined by
$$\Delta_{\mathrm{Com}} f(x) = \sum_{y \in V} \mu_{xy}\,(f(y) - f(x)). \qquad (1.11)$$

In matrix terms $\Delta_{\mathrm{Com}} = A - D$, where $A_{xy} = \mu_{xy}$, and $D$ is the diagonal matrix with $D_{xx} = \mu_x$. If $\Gamma$ has its natural weights, then $A$ is the adjacency matrix of $\Gamma$. It is easy to check that $A$, and hence $\Delta_{\mathrm{Com}}$, is self-adjoint with respect to counting measure on $V$. If $M = \sup_x \mu_x < \infty$ then one has

$$\|Af\|_p^p = \sum_x \mu_x \Big|\sum_y A_{xy} f(y)\Big|^p \le \sum_x \mu_x \Big(\sum_y A_{xy}|f(y)|^p\Big)\Big(\sum_y A_{xy}\,1^q\Big)^{p/q}$$
$$\le \sum_x \mu_x \sum_y A_{xy}|f(y)|^p\, M^{p/q} \le M^{1+p/q} \sum_y |f(y)|^p\,\mu_y = M^p\,\|f\|_p^p,$$
which implies that $\|A\|_{p \to p} \le M$. Since $\|D\|_{p \to p} \le M$ this gives $\|\Delta_{\mathrm{Com}}\|_{p \to p} \le 2M$. One can then define the 'delayed RW' on $(\Gamma, \mu)$ by considering the random walk associated with the operator
$$\widetilde P = I + \frac{1}{M}\,\Delta_{\mathrm{Com}}.$$

This is the SRW on the weighted graph $(V, E, \mu')$, where $\mu'_{xy} = M^{-1}\mu_{xy}$ if $x \ne y$ and $\mu'_{xx} = 1 - \sum_{y \ne x} \mu'_{xy}$. If $M = \infty$ then the operator $\Delta_{\mathrm{Com}}$ is not naturally associated with any discrete time random walk. However it can be associated in a natural way with a continuous time random walk – see Remark 5.7.

Proposition 1.18  $P$ and $\Delta$ are self-adjoint on $L^2$.

Proof  This is clear for $P$ since
$$\langle Pf, g \rangle = \sum_x \Big(\sum_y p(x,y) f(y)\mu_y\Big) g(x)\,\mu_x = \sum_y \sum_x p(x,y)\, f(y)\, g(x)\,\mu_x\,\mu_y = \sum_y \Big(\sum_x p(y,x) g(x)\mu_x\Big) f(y)\,\mu_y = \langle f, Pg \rangle.$$
Hence $\Delta = P - I$ is also self-adjoint.

Remark 1.19  See Example 1.35 in Section 1.6 for a graph and functions $u, v$ such that both $\Delta u$ and $\Delta v$ have finite support, but
$$\sum_x v(x)\,\Delta u(x)\,\mu_x \ne \sum_x u(x)\,\Delta v(x)\,\mu_x.$$
(Of course the functions $u, v$ are not in $L^2$.)

On probability spaces one has $L^{p_2} \subset L^{p_1}$ if $p_2 > p_1$. On discrete spaces such as $\Gamma$ the inclusions are in the opposite direction, provided that $\mu_x$ is bounded below.


Lemma 1.20  Suppose that $\mu_x \ge a > 0$ for each $x \in V$. Then if $p_2 > p_1$ and $f \in C(V)$,
$$\|f\|_{p_2} \le a^{1/p_2 - 1/p_1}\,\|f\|_{p_1}.$$

Proof  By replacing $f$ by $f/\|f\|_{p_1}$ we can assume that $\|f\|_{p_1} = 1$. If $x \in V$ then $|f(x)|^{p_1} \le 1/\mu_x \le 1/a$. Hence
$$\|f\|_{p_2}^{p_2} = \sum_x |f(x)|^{p_1}\,|f(x)|^{p_2 - p_1}\,\mu_x \le \sum_x |f(x)|^{p_1}\, a^{-(p_2 - p_1)/p_1}\,\mu_x \le a^{1 - p_2/p_1}.$$

Write $p_n^x(\cdot) = p_n(x, \cdot)$. We can rewrite the equations in Lemma 1.15 as:
$$p_{n+m}(x,y) = \langle p_n^x, p_m^y \rangle, \qquad (1.12)$$
$$P p_n^x(y) = p_{n+1}^x(y), \qquad (1.13)$$
$$\Delta p_n^x(y) = (P - I)\, p_n^x(y) = p_{n+1}^x(y) - p_n^x(y). \qquad (1.14)$$

Equation (1.14) is the discrete time heat equation. Note that (1.12) implies that $\|p_n^x\|_2^2 \le \mu_x^{-1}$ for all $x \in V$, $n \ge 0$.
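Equations (1.13), (1.14) and the bound above can be checked numerically; a sketch with arbitrary symmetric weights (all names are mine):

```python
import numpy as np

rng = np.random.default_rng(2)
n_v = 7
W = rng.uniform(0.2, 1.0, (n_v, n_v)); W = (W + W.T) / 2  # symmetric weights
mu = W.sum(axis=1)
P = W / mu[:, None]

def p_n(n):
    # heat kernel p_n(x,y) = P_n(x,y) / mu_y as a matrix
    return np.linalg.matrix_power(P, n) / mu[None, :]

x, n = 0, 4
pnx = p_n(n)[x]          # the function p_n^x(.)
pn1x = p_n(n + 1)[x]

# (1.13): P p_n^x = p_{n+1}^x, where (P f)(y) = sum_z P(y,z) f(z)
assert np.allclose(P @ pnx, pn1x)
# (1.14): Delta p_n^x = p_{n+1}^x - p_n^x (the discrete time heat equation)
assert np.allclose(P @ pnx - pnx, pn1x - pnx)
# ||p_n^x||_2^2 = p_{2n}(x,x) <= 1/mu_x
assert (pnx**2 * mu).sum() <= 1 / mu[x] + 1e-12
```

The second assertion is just the first rewritten, which mirrors the fact that (1.14) follows from (1.13) and $\Delta = P - I$.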

1.4 Dirichlet or Energy Form

For $f, g \in C(V)$ define the quadratic form
$$\mathcal{E}(f,g) = \tfrac12 \sum_{x \in V}\sum_{y \in V} \mu_{xy}\,(f(x) - f(y))(g(x) - g(y)), \qquad (1.15)$$
whenever this sum converges absolutely. This is the discrete analogue of $\int \nabla f \cdot \nabla g$. Let
$$H^2(V,\mu) = H^2(V) = H^2 = \{f \in C(V) : \mathcal{E}(f,f) < \infty\}.$$
Choose a 'base point' $o \in V$ and define
$$\|f\|_{H^2}^2 = \mathcal{E}(f,f) + f(o)^2.$$
The sum in (1.15) is well defined for $f, g \in H^2(V)$. If $A \subset V$ we will sometimes use the notation
$$\mathcal{E}_A(f,g) = \tfrac12 \sum_{x \in A}\sum_{y \in A} (f(x) - f(y))(g(x) - g(y))\,\mu_{xy}. \qquad (1.16)$$

Proposition 1.21
(a) If $x \sim y$ then
$$|f(x) - f(y)| \le \mu_{xy}^{-1/2}\,\mathcal{E}(f,f)^{1/2}.$$
(b) $\mathcal{E}(f,f) = 0$ if and only if $f$ is constant.
(c) If $f \in L^2$,
$$\mathcal{E}(f,f) \le 2\|f\|_2^2. \qquad (1.17)$$
In particular $L^2 \subset H^2$.
(d) If $(f_n)$ is Cauchy in $H^2$ then $(f_n)$ converges pointwise and in $H^2$ to a function $f \in H^2$.
(e) Convergence in $H^2$ implies pointwise convergence.
(f) $H^2(V)$ is a Hilbert space with inner product $\langle f, g \rangle_H = \mathcal{E}(f,g) + f(o)g(o)$.
(g) Let $(f_n)$ be a sequence of functions with $\sup_n \|f_n\|_{H^2} < \infty$. Then there exist a subsequence $(f_{n_k})$ and a function $f \in H^2$ such that $f_{n_k} \to f$ pointwise, and
$$\|f\|_{H^2} \le \liminf_{k \to \infty} \|f_{n_k}\|_{H^2}. \qquad (1.18)$$

(h) The conclusion of (g) holds if $A_n \uparrow\uparrow V$ with $o \in A_1$, and $f_n : A_n \to \mathbb{R}$ with $\sup_n \big(|f_n(o)|^2 + \mathcal{E}_{A_n}(f_n, f_n)\big) < \infty$.

Proof  (a) This is immediate from the definition of $\mathcal{E}$.
(b) As $\Gamma$ is connected, this is immediate from (a).
(c) We have
$$\mathcal{E}(f,f) = \tfrac12\sum_x\sum_y (f(x) - f(y))^2\,\mu_{xy} \le \sum_x\sum_y (f(x)^2 + f(y)^2)\,\mu_{xy} = 2\|f\|_2^2.$$
(d) Let $(f_n)$ be Cauchy in $H^2$. Then $f_n(o)$ is Cauchy in $\mathbb{R}$, and so converges. Let $A = \{x \in V : f_n(x) \text{ converges}\}$. If $x \in A$ and $y \sim x$ then, applying (a) to $f_n - f_m$,
$$|f_n(y) - f_m(y)| \le \mu_{xy}^{-1/2}\,\mathcal{E}(f_n - f_m, f_n - f_m)^{1/2} + |f_n(x) - f_m(x)|.$$
Letting $m, n \to \infty$ it follows that $y \in A$, and so, as $\Gamma$ is connected, $A = V$ and there exists a function $f$ such that $f_n$ converges to $f$ pointwise. By Fatou's lemma we have
$$\|f_n - f\|_{H^2}^2 \le \liminf_{m \to \infty} \|f_n - f_m\|_{H^2}^2,$$
and hence $\|f_n - f\|_{H^2} \to 0$.
(e) This is immediate from (d).
(f) It is clear that $H^2$ is an inner product space; (d) then implies that $H^2$ is complete, so is a Hilbert space.


(g) and (h) We have that $(|f_n(o)|, n \ge 1)$ is bounded. Hence, using (a), it follows that $(|f_n(x)|, n \ge 1)$ is bounded for each $x \in V$. A standard diagonalisation argument then implies that there exists a subsequence $(n_k)$ such that $(f_{n_k})$ converges pointwise. Using Fatou's lemma then gives (1.18).

Let $H_0^2(V)$ be the closure of $C_0(V)$ in $H^2(V)$. As $C_0(V) \subset L^2(V) \subset H^2(V)$, $H_0^2(V)$ is also the closure of $L^2(V)$ in $H^2(V)$. One always has $L^2 \subset H_0^2 \subset H^2$. If $\mu(V)$ is infinite then $H^2$ is strictly larger than $L^2$, since the constant function $\mathbf{1}$ is in $H^2$. All three spaces can be distinct – see Example 2.67, which shows that this holds for the join of $\mathbb{Z}_+$ and $\mathbb{Z}^3$. We will see later (Theorem 2.36) that $H^2 = H_0^2$ if and only if $\Gamma$ is recurrent.

Exercise 1.22
(a) Show that $H_0^2(\mathbb{Z}_+) = H^2(\mathbb{Z}_+)$.
(b) Show that if $f \in H_0^2$ then $f_+, f_- \in H_0^2$.
(c) Show that if $f \in H_{0,+}^2$ and $f_n \to f$ in $H^2$ then $f_n \wedge f \to f$ in $H^2$.

Proposition 1.16 shows that $\Delta$ is defined on $L^2$. However, in fact it maps the larger space $H^2$ to $L^2$.

Proposition 1.23  Let $f \in H^2$. Then $\Delta f \in L^2$, and
$$\|\Delta f\|_2^2 \le 2\mathcal{E}(f,f) \le 2\|f\|_{H^2}^2.$$

Proof  We have, using Cauchy–Schwarz,
$$\|\Delta f\|_2^2 = \sum_x |\Delta f(x)|^2\,\mu_x = \sum_x \mu_x^{-1}\Big(\sum_y (f(y) - f(x))\,\mu_{xy}\Big)^2 \le \sum_x \mu_x^{-1}\Big(\sum_y \mu_{xy}\Big)\Big(\sum_y (f(y) - f(x))^2\,\mu_{xy}\Big) = \sum_x \sum_y (f(y) - f(x))^2\,\mu_{xy} = 2\mathcal{E}(f,f).$$

We will frequently use the following.

Theorem 1.24 (Discrete Gauss–Green Theorem)  Let $f, g \in C(V)$ satisfy
$$\sum_{x \in V}\sum_{y \in V} |g(x)|\,|f(x) - f(y)|\,\mu_{xy} < \infty. \qquad (1.19)$$
Then
$$\mathcal{E}(f,g) = -\sum_x (\Delta f(x))\, g(x)\,\mu_x, \qquad (1.20)$$
and the sum in (1.20) converges absolutely. In particular (1.20) holds if $f \in H^2$ and $g \in L^2$, or if at least one of $f$, $g$ is in $C_0(V)$.

Proof  The condition (1.19) implies that both sides of (1.20) converge absolutely. We have
$$\mathcal{E}(f,g) = \tfrac12\sum_x\sum_y (f(y) - f(x))\,g(y)\,\mu_{xy} - \tfrac12\sum_x\sum_y (f(y) - f(x))\,g(x)\,\mu_{xy},$$
and both sums on the right side converge absolutely. Interchanging the indices $x$ and $y$ in the second sum, and using the fact that $\mu_{xy} = \mu_{yx}$, gives
$$\mathcal{E}(f,g) = -\sum_x\sum_y (f(y) - f(x))\,g(x)\,\mu_{xy} = -\sum_x g(x)\,\Delta f(x)\,\mu_x.$$
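On a finite graph (1.19) holds trivially, so (1.20) can be verified directly; a sketch with random weights and functions (all choices here are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
n_v = 8
W = rng.uniform(0, 1, (n_v, n_v)); W = (W + W.T) / 2
np.fill_diagonal(W, 0)
mu = W.sum(axis=1)
P = W / mu[:, None]

f = rng.normal(size=n_v)
g = rng.normal(size=n_v)

# Energy form (1.15): E(f,g) = (1/2) sum_{x,y} mu_xy (f(x)-f(y))(g(x)-g(y))
df = f[:, None] - f[None, :]
dg = g[:, None] - g[None, :]
energy = 0.5 * (W * df * dg).sum()

# Gauss-Green (1.20): E(f,g) = - sum_x (Delta f)(x) g(x) mu_x
delta_f = P @ f - f
assert np.isclose(energy, -(delta_f * g * mu).sum())
```

Since $\mathcal{E}$ is symmetric, the same value is obtained by putting $\Delta$ on $g$ instead, which is the self-adjointness of $\Delta$ seen through the form.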

Remark
(1) Let $V = \mathbb{N}$, $f(n) = \sum_{k=1}^n (-1)^k k^{-1}$, and $g(n) = 1$ for all $n$. Then $f, g \in H^2$, but $|\Delta f(n)| \sim n^{-1}$. Thus while the sum in $\mathcal{E}(f,g)$ converges absolutely, the sum on the right side of (1.20) does not. Since $|f(n+1) - f(n)| \sim n^{-1}$, (1.19) fails. In fact $f, g \in H_0^2$ – this is easy to verify directly, or one can use Theorem 2.36.
(2) Example 1.35 (Section 1.6) shows that even if both sides of (1.20) converge absolutely, equality need not hold.
(3) Applying (1.20) to $p_n^x$, $p_m^y$ (which are both in $C_0$) gives
$$\mathcal{E}(p_n^x, p_m^y) = -\langle \Delta p_n^x, p_m^y \rangle = \langle p_n^x - p_{n+1}^x, p_m^y \rangle = p_{n+m}(x,y) - p_{n+m+1}(x,y).$$
(4) $\mathcal{E}(f,f)$ can be regarded as the energy of the function $f$.
(5) The bilinear form $\mathcal{E}$, with domain $L^2$, is a regular Dirichlet form on $L^2(V,\mu)$ – see [FOT].

Corollary 1.25  We have
$$\mathcal{E}(1_x, 1_y) = \begin{cases} -\mu_{xy}, & y \ne x, \\ \mu_x - \mu_{xx}, & y = x. \end{cases}$$

Proof  We have
$$\mathcal{E}(1_x, 1_y) = \langle 1_x, -\Delta 1_y \rangle = -\mu_x\,\Delta 1_y(x) = \sum_z \mu_{xz}\,\big(1_y(x) - 1_y(z)\big),$$
and a simple calculation completes the proof.

As another application of (1.20) we give the following:

Lemma 1.26  Let $(\Gamma, \mu)$ be a weighted graph with $\mu(V) < \infty$. Then $(\Gamma, \mu)$ is recurrent.

Proof  Let $z \in V$, and $\varphi(x) = P^x(T_z = \infty)$. Then $0 \le \varphi \le 1$, $\varphi(z) = 0$, and $\Delta\varphi(x) = 0$ if $x \ne z$ – see Theorem 2.5. As $\mu(V) < \infty$, the function $\varphi \in L^2(V)$ and so
$$-\mathcal{E}(\varphi, \varphi) = \langle \Delta\varphi, \varphi \rangle = \Delta\varphi(z)\,\varphi(z)\,\mu_z = 0.$$
So by Proposition 1.21 $\varphi$ is constant, and hence is identically zero.

Lemma 1.27
(a) Let $f \in H^2(V)$, $a \in \mathbb{R}$, and $g = (f - a)_+$, $h = f \wedge a$. Then $g, h \in H^2(V)$ and $\mathcal{E}(g,g) \le \mathcal{E}(f,f)$, $\mathcal{E}(h,h) \le \mathcal{E}(f,f)$.
(b) Let $f_1, \dots, f_n \in H^2(V)$ and $g = \min\{f_1, \dots, f_n\}$. Then
$$\mathcal{E}(g,g) \le \sum_{i=1}^n \mathcal{E}(f_i, f_i).$$
(c) Let $f \in H^2(V)$. Then
$$\mathcal{E}(f_+, f_+) + \mathcal{E}(f_-, f_-) \le \mathcal{E}(f,f) \le 2\mathcal{E}(f_+, f_+) + 2\mathcal{E}(f_-, f_-). \qquad (1.21)$$

Proof  (a) Since $|(b_1 - a)_+ - (b_2 - a)_+| \le |b_1 - b_2|$ for $a, b_1, b_2 \in \mathbb{R}$, we have for $x, y \in V$
$$(g(x) - g(y))^2 \le (f(x) - f(y))^2,$$
and summing over $x, y$ gives the inequality $\mathcal{E}(g,g) \le \mathcal{E}(f,f)$. Since $f \wedge a = a - (a - f)_+$ we have
$$\mathcal{E}(f,f) = \mathcal{E}(a - f, a - f) \ge \mathcal{E}\big((a-f)_+, (a-f)_+\big) = \mathcal{E}\big(a - (a-f)_+, a - (a-f)_+\big) = \mathcal{E}(h,h).$$
(b) Let $x \sim y$, and suppose that $g(x) \le g(y)$, and $g(x) = f_j(x)$. Then
$$0 \le g(y) - g(x) = g(y) - f_j(x) \le f_j(y) - f_j(x).$$

Hence
$$|g(y) - g(x)|^2 \le \max_i |f_i(y) - f_i(x)|^2 \le \sum_i |f_i(y) - f_i(x)|^2,$$
and again summing over $x, y$ completes the proof of (b).
(c) If $a, b \in \mathbb{R}$ then $(a - b)^2 \ge (a_+ - b_+)^2 + (a_- - b_-)^2$; summing this inequality as in (a) gives the left side of (1.21). The right side is immediate from Cauchy–Schwarz.

Property (a) of Lemma 1.27 gives what is called in [FOT] the Markov property of the quadratic form $\mathcal{E}$. To explain the significance of this, let $V$ be a countable set, and let $(a_{xy}, x, y \in V)$ satisfy $a_{xy} = a_{yx}$, $a_{xx} = 0$, $\sum_y |a_{xy}| < \infty$ for all $x$. For $f, g \in C_0(V)$ define the quadratic form $\mathcal{A}(f,g)$ associated with $(a_{xy})$ in the same way as in (1.15):
$$\mathcal{A}(f,g) = \tfrac12 \sum_{x \in V}\sum_{y \in V} a_{xy}\,(f(x) - f(y))(g(x) - g(y)). \qquad (1.22)$$

(Note that this sum converges absolutely when $f, g \in C_0(V)$.) If $a_{xy} \ge 0$ for all $x, y$ then of course $\mathcal{A}(f,f) \ge 0$ for all $f \in C_0(V)$, but the converse is false.

Lemma 1.28  Let $\mathcal{A}$ be the quadratic form on $C_0(V)$ associated with $(a_{xy})$ by (1.22). Then $a_{xy} \ge 0$ for all $x, y$ if and only if $\mathcal{A}$ has the property:
$$\mathcal{A}(f_+, f_+) \le \mathcal{A}(f,f) \qquad \text{for all } f \in C_0(V). \qquad (1.23)$$

Proof  The forward implication is immediate from Lemma 1.27(a). Suppose now that (1.23) holds. Then if $\bar a_{xy} = a_{xy}$, $y \ne x$, and $\bar a_{xx} = -\sum_y a_{xy}$,
$$\mathcal{A}(f,g) = -\sum_x\sum_y f(x)\,\bar a_{xy}\,g(y).$$
Let $x, y \in V$, with $x \ne y$, $\lambda > 0$, and set $f_\lambda = 1_y - \lambda 1_x$. Then
$$\mathcal{A}(f_\lambda, f_\lambda) = -\lambda^2 \bar a_{xx} + 2\lambda \bar a_{xy} - \bar a_{yy} = \lambda^2 |\bar a_{xx}| + 2\lambda a_{xy} + |\bar a_{yy}|.$$
Since $\lambda > 0$ we have $f_{\lambda+} = f_0$, and so by (1.23)
$$0 \le \mathcal{A}(f_\lambda, f_\lambda) - \mathcal{A}(f_{\lambda+}, f_{\lambda+}) = \lambda^2 |\bar a_{xx}| + 2\lambda a_{xy},$$
and thus $a_{xy} \ge 0$.

We will not make much use of the next result, but include it since, combined with Theorems 2.36 and 2.37, it will show that $H_0^2 = H^2$ if and only if $\Gamma$ is recurrent.


Proposition 1.29  We have $\mathbf{1} \in H_0^2$ if and only if $H^2 = H_0^2$.

Proof  The backward implication is obvious. Suppose that $\mathbf{1} \in H_0^2$. Then there exist $u_n \in C_0(V)$ with $\|\mathbf{1} - u_n\|_{H^2} \le n^{-3}$. By Lemma 1.27(a) we can assume that $0 \le u_n \le 1$. Set $v_n = n u_n$; then $\mathcal{E}(v_n, v_n) \le n^{-1}$, but since $u_n \to 1$ pointwise, $v_n(x) \to \infty$ for all $x \in V$. Now let $f \in H_+^2$, and set $f_n = f \wedge v_n$, so that $f_n \in C_0(V)$, and for each $x \in V$ we have $f_n(x) = f(x)$ for all sufficiently large $n$. Let $\varepsilon > 0$, and choose a finite set $A$ such that if $B = V - A$ then $\mathcal{E}_B(f,f) < \varepsilon$. Now choose $n > 1/\varepsilon$ such that $f_n(x) = f(x)$ for all $x \in A$. Then since $f - f_n = (f - v_n)_+$,
$$\mathcal{E}(f - f_n, f - f_n) = \mathcal{E}_B(f - f_n, f - f_n) \le \mathcal{E}_B(f - v_n, f - v_n) \le 2\mathcal{E}_B(f,f) + 2\mathcal{E}_B(v_n, v_n) \le 4\varepsilon.$$
We therefore deduce that $f_n \to f$ in $H^2$, so that $f \in H_0^2$.

1.5 Killed Process

Let $A \subset V$, and recall from (1.4) the definition of $\tau_A$, the time of the first exit from $A$. Set
$$p_n^A(x,y) = \mu_y^{-1}\, P^x(X_n = y,\ n < \tau_A). \qquad (1.24)$$
This is the heat kernel for the process $X$ killed on exiting from $A$. Define also the operators
$$I_A f(x) = 1_A(x) f(x), \qquad P_A f(x) = \sum_{y \in A} p_1^A(x,y)\, f(y)\,\mu_y,$$
$$\Delta_A = P_A - I_A, \qquad P_n^A f(x) = E^x\big(f(X_n);\ n < \tau_A\big) = \sum_y p_n^A(x,y)\, f(y)\,\mu_y.$$
Note that $P_0^A = I_A$.

Lemma 1.30
(a) $p_n^A(x,y) = 0$ if either $x \notin A$ or $y \notin A$.
(b) If $x, y \in A$ then
$$p_{n+1}^A(x,y) = \sum_{z \in V} p_n^A(x,z)\, p_1^A(z,y)\,\mu_z = \sum_{z \in A} p_n^A(x,z)\, p(z,y)\,\mu_z, \qquad (1.25)$$

$$p_{n+1}^A(x,y) - p_n^A(x,y) = \Delta_y\, p_n^A(x,y), \qquad p_n^A(x,y) = p_n^A(y,x), \qquad (1.26)$$
where $\Delta_y$ denotes $\Delta$ acting on the variable $y$. Thus $p_n^A(x, \cdot)$ satisfies the discrete time heat equation on $A$ with zero boundary conditions on $\partial A$.
(c) We have $P_n^A = (P_A)^n$.
(d) We have
$$P_A f = I_A P I_A f, \qquad \Delta_A f = I_A \Delta I_A f.$$

Proof  (a) This is clear from the definition. Using this, we have
$$\mu_y\, p_{n+1}^A(x,y) = P^x(X_{n+1} = y,\ \tau_A > n+1) = \sum_{z \in V} P^x(X_{n+1} = y,\ \tau_A > n+1 \mid X_n = z,\ \tau_A > n)\, p_n^A(x,z)\,\mu_z$$
$$= \sum_{z \in A} P^x(X_{n+1} = y,\ \tau_A > n+1 \mid X_n = z,\ \tau_A > n)\, p_n^A(x,z)\,\mu_z,$$
giving the first equality in (1.25). The second follows since $p_1^A(x,z) = p(x,z)$ if $x, z \in A$. The other equalities in (b), as well as (c), then follow easily from (1.25). (d) is again clear from the definition.

1.6 Green's Functions

Let $A \subset V$. We define the Green's function (or potential kernel density) by
$$g_A(x,y) = \sum_{n=0}^\infty p_n^A(x,y).$$
It is clear from (1.26) that $g_A(x,y)$ is symmetric in $x, y$. We write $g$ for $g_V$, and as above we may write $g_A^x(\cdot) = g_A(x, \cdot)$. Define the number of hits by $X$ on $y$ (or local time of $X$ at $y$) by $L_0^y = 0$ and
$$L_n^y = \sum_{r=0}^{n-1} 1_{\{X_r = y\}}, \qquad n \ge 1.$$
Then we also have
$$E^x L_{\tau_A}^y = \sum_{n=0}^\infty P^x(X_n = y,\ \tau_A > n) = \sum_{n=0}^\infty p_n^A(x,y)\,\mu_y = g_A(x,y)\,\mu_y. \qquad (1.27)$$
If $\Gamma$ is recurrent then Theorem 1.8 implies that $g(x,y) = \infty$ for all $x, y \in V$.
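On a finite graph the Green's function of a proper subset $A$ can be computed as a matrix inverse, since $g_A(x,y)\mu_y = \sum_n (P_A)^n(x,y) = (I - P_A)^{-1}(x,y)$; a sketch (the graph and the choice of $A$ are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)
n_v = 9
W = rng.uniform(0.1, 1.0, (n_v, n_v)); W = (W + W.T) / 2
np.fill_diagonal(W, 0)
mu = W.sum(axis=1)
P = W / mu[:, None]

A = [0, 1, 2, 3]                       # a proper subset, so tau_A < infinity a.s.
PA = P[np.ix_(A, A)]                   # substochastic matrix of the killed walk
# g_A(x,y) mu_y = sum_n (P_A)^n(x,y) = (I - P_A)^{-1}(x,y)
GA = np.linalg.inv(np.eye(len(A)) - PA)
gA = GA / mu[A][None, :]               # Green density g_A(x,y)

# Symmetry of g_A, and the consequence of the strong Markov property
# g_A(x,y) = P^x(T_y < tau_A) g_A(y,y), so g_A(x,y) <= g_A(y,y)
assert np.allclose(gA, gA.T)
assert (gA <= gA.diagonal()[None, :] + 1e-12).all()
```

The Neumann series converges because the killed transition matrix $P_A$ has spectral radius strictly less than 1 when $A$ is a proper subset of a connected graph.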


Theorem 1.31  Let $A \subset V$ and suppose that either $\Gamma$ is transient or $A \ne V$. Then
$$g_A(x,y) = P^x(T_y < \tau_A)\, g_A(y,y) \qquad \text{for all } x, y \in V, \qquad (1.28)$$
$$g_A(y,y)\,\mu_y = \frac{1}{P^y(\tau_A \le T_y^+)} < \infty \qquad \text{for all } y \in A. \qquad (1.29)$$
In particular $g_A(x,y) < \infty$ for all $x, y \in V$.

Proof  Applying the strong Markov property of $X$ at $T_y$ (see Appendix A.2),
$$g_A(x,y)\,\mu_y = E^x(L_{\tau_A}^y) = E^x\big[1_{(T_y < \tau_A)}\, E^{X_{T_y}}(L_{\tau_A}^y)\big] = P^x(T_y < \tau_A)\, g_A(y,y)\,\mu_y,$$
which gives (1.28). Now let $y \in A$ and set $p = P^y(T_y^+ < \tau_A)$; since either $\Gamma$ is transient or $A \ne V$, we have $P^y(\tau_A \le T_y^+) \ge P^y(T_y^+ > T_z) > 0$ for a suitable $z$, so $p < 1$. By the strong Markov property of $X$,
$$P^y(L_{\tau_A}^y = k) = (1-p)\, p^{k-1}, \qquad k \ge 1.$$
Thus $L_{\tau_A}^y$ has a geometric distribution, and, summing the series in $k$,
$$\mu_y\, g_A(y,y) = E^y(L_{\tau_A}^y) = \sum_{k=1}^\infty k\, p^{k-1}(1-p) = \frac{1}{1-p} = \frac{1}{P^y(\tau_A \le T_y^+)} < \infty.$$

Note that our hypotheses allow $P^y(\tau_A = T_y^+ = \infty) > 0$. The Green's function satisfies the following equations.

Lemma 1.32  Let $x, y \in A$. Then
$$P g_A^x(y) = g_A(x,y) - \mu_x^{-1} 1_x(y), \qquad (1.30)$$
$$\Delta g_A^x(y) = -\mu_x^{-1} 1_x(y). \qquad (1.31)$$

Proof  We have
$$P g_A^x(y) = \sum_{z \in V} p(y,z)\,\mu_z\, g_A(x,z) = \sum_{z \in V}\sum_{n=0}^\infty p(y,z)\, p_n^A(x,z)\,\mu_z = \sum_{n=0}^\infty p_{n+1}^A(x,y) = g_A(x,y) - p_0^A(x,y).$$
Substituting for $p_0^A(x,y)$ gives (1.30). Since $\Delta = P - I$, (1.31) is immediate.

Remark  The term $\mu_x^{-1}$ in (1.31) may look odd, but arises because we are considering densities with respect to $\mu$.

We also define the potential operator
$$G_A f(x) = \sum_{n=0}^\infty P_n^A f(x),$$
and write $G = G_V$. We have
$$G_A f(x) = \sum_y g_A(x,y)\, f(y)\,\mu_y = E^x\Big(\sum_{n=0}^{\tau_A - 1} f(X_n)\Big).$$
Unless $\Gamma$ is recurrent and $A = V$, we have $G_A : L^1(V) \to C(V)$, since by Theorem 1.31
$$|G_A f(x)| \le \sum_y g_A(x,y)\,|f(y)|\,\mu_y \le g_A(x,x)\,\|f\|_1. \qquad (1.32)$$

Remark  For $\mathbb{Z}^d$ (with natural weights) one has, if $d \ge 3$,
$$g(x,y) \sim \frac{c_d}{|x-y|^{d-2}}.$$
Hence $g(x, \cdot) \in L^2$ if $d \ge 5$, but not for $d = 3, 4$. Also, as we will see in Corollary 3.44, $G$ does not map $L^2(\mathbb{Z}^d)$ to $L^2(\mathbb{Z}^d)$ for any $d$.

Lemma 1.33  (a) Let $A \subset V$ with $A \ne V$ if $\Gamma$ is recurrent. Then
$$\Delta_A G_A f = -I_A f, \qquad \text{for } f \in L^1(V).$$
In particular $\Delta G_A f(x) = -f(x)$ for $x \in A$.
(b) If $\Gamma$ is transient then
$$\Delta G f = -f, \qquad \text{for } f \in L^1(V).$$

Proof  (a) We have
$$\Delta_A G_A = (P_A - I_A)\sum_{n=0}^\infty (P_A)^n = -I_A;$$
the sums here converge by (1.32). To prove the second assertion, since $G_A f$ has support $A$, by Lemma 1.30 $-I_A f = \Delta_A G_A f = I_A \Delta G_A f$, so $-f$ and $\Delta G_A f$ agree on $A$.
(b) This is immediate from (a).


Theorem 1.34  Let $A \subset V$ with $A \ne V$ if $\Gamma$ is recurrent.
(a) If $A_n \uparrow\uparrow A$ then $g_{A_n}(x,y) \uparrow g_A(x,y)$ pointwise.
(b) For any $x \in A$, $g_A(x, \cdot) \in H_0^2(V)$.
(c) If $f \in H_0^2(V)$ with $\mathrm{supp}(f) \subset A$ then
$$\mathcal{E}(f, g_A^x) = f(x). \qquad (1.33)$$
In particular $\mathcal{E}(g_A^x, g_A^y) = g_A(x,y)$.

Proof  (a) Since $\tau_{A_n} \uparrow \tau_A$ this is clear from (1.27) by monotone convergence.
(b) Suppose first that $A$ is finite, and let $f : V \to \mathbb{R}$ with $\mathrm{supp}(f) \subset A$. By (1.31) we have $\Delta g_A^x(y) = 0$ for $y \in A$, $y \ne x$, while $f(y) = 0$ if $y \in V - A$. Therefore by Theorem 1.24
$$\mathcal{E}(f, g_A^x) = \langle f, -\Delta g_A^x \rangle = \langle f, \mu_x^{-1} 1_x \rangle = f(x). \qquad (1.34)$$
Now let $A$ be any set satisfying the hypotheses of the theorem. The conditions on $\Gamma$ and $A$ imply that $g_A^x$ is finite. Let $A_n = B(x,n) \cap A$, so that $A_n \uparrow\uparrow A$. If $m \le n$ then $\mathrm{supp}(g_{A_m}^x) \subset A_n$, so by (1.34) $\mathcal{E}(g_{A_m}^x, g_{A_n}^x) = g_{A_m}(x,x)$. Therefore
$$\mathcal{E}(g_{A_m}^x - g_{A_n}^x,\ g_{A_m}^x - g_{A_n}^x) = g_{A_n}(x,x) - g_{A_m}(x,x),$$
which implies that $(g_{A_n}^x)$ is Cauchy in $H^2$. Since $g_{A_n}^x$ converges pointwise to $g_A^x$, by Proposition 1.21(d) we also have convergence in $H^2$, and thus $g_A^x \in H_0^2$.
(c) Let $f \in H_0^2$ with $\mathrm{supp}(f) \subset A$. Choose $f_n \in C_0(V)$ with $f_n \to f$ in $H^2$. As $f_n$ has finite support, the support of $f_n$ is contained in $A_m$ for $m$ large enough. So by (1.34) $\mathcal{E}(f_n, g_{A_m}^x) = f_n(x)$, and taking the limit, first as $m \to \infty$, and then as $n \to \infty$, gives (1.33).

Example 1.35  Suppose that $\Gamma$ is transient; by Lemma 1.26 we have $\mu(V) = \infty$. Fix $x \in V$ and let $f = \mathbf{1}$. By the above we have $f \in H^2$, $g^x \in H_0^2$, and therefore $\mathcal{E}(f, g^x)$ is defined. (We will see in Theorem 2.37 that $f \notin H_0^2$.) Since $\Delta f = 0$ and $\Delta g^x = -\mu_x^{-1} 1_x$, both $\Delta f$ and $\Delta g^x$ have finite support, and therefore both sums $\langle \Delta f, g^x \rangle$ and $\langle f, \Delta g^x \rangle$ are defined and converge absolutely. However, by direct calculation one has
$$\mathcal{E}(f, g^x) = 0, \qquad \sum_y (-\Delta f(y))\, g^x(y)\,\mu_y = 0, \qquad \sum_y f(y)\,(-\Delta g^x(y))\,\mu_y = f(x) = 1.$$
Thus (1.33) may fail if $f \in H^2(V) \setminus H_0^2(V)$.

1.7 Harmonic Functions, Harnack Inequalities, and the Liouville Property

Definition 1.36  Let $A \subset V$ and $f$ be defined on $\bar A = A \cup \partial A$. We say $f$ is
superharmonic in $A$ if $\Delta f \le 0$ (i.e. $Pf \le f$),
subharmonic in $A$ if $\Delta f \ge 0$ (i.e. $Pf \ge f$),
harmonic in $A$ if $\Delta f = 0$ (i.e. $Pf = f$).

Clearly $f$ is superharmonic if and only if $-f$ is subharmonic. By Lemma 1.32, $g_A(x, \cdot)$ is superharmonic in $A$, and harmonic in $A - \{x\}$. If $f \ge 0$ then by Lemma 1.33 $G_A f$ is superharmonic in $A$. If $h_1, h_2$ are superharmonic then so is $h = h_1 \wedge h_2$. For if $h(x) = h_1(x)$ then
$$h(x) = h_1(x) \ge \sum_y p(x,y)\, h_1(y)\,\mu_y \ge \sum_y p(x,y)\, h(y)\,\mu_y.$$
If $g$ is superharmonic then
$$E^x\big(g(X_n) \mid X_{n-1}, \dots, X_0\big) = \sum_y p(X_{n-1}, y)\, g(y)\,\mu_y = Pg(X_{n-1}) \le g(X_{n-1}),$$
so $g(X_n)$ is a supermartingale. (See Appendix A.1.)

so g(X n ) is a supermartingale. (See Appendix A.1.) Theorem 1.37 (Maximum principle) subharmonic in A.

Let A ⊂ V be connected, and h be

(a) If there exists x ∈ A such that h(x) = max h(z) z∈A

then h is constant on A. (b) If A is finite then h attains its maximum on A on ∂ A, that is, max h = max h. ∂A

A

Proof (a) Let B = {y ∈ A : h(y) = h(x)}. If y ∈ B ∩ A then h(z) ≤ h(y) whenever z ∼ y, so (h(z) − h(y))μ yz ≤ 0. 0 ≤ μ y h(y) = z∼y

Thus if y ∈ A ∩ B and z ∼ y then z ∈ B. Hence, as A is connected, B = A.

(b) Let $x$ be a point in $\bar A$ where $h$ attains its maximum. If $x \in \partial A$ we are done; if not, then $x \in A$ and by (a) $h$ is constant in $\bar A$.

Example  The maximum principle fails in general if $A$ is infinite. Let $\Gamma = \mathbb{Z}^3$, $A = \mathbb{Z}^3 - \{0\}$, and $h(x) = P^x(T_0 = \infty)$. Then $h$ is harmonic in $A$, but $\sup_A h = 1$ while $\sup_{\partial A} h = h(0) = 0$.

Remark 1.38  If $h$ is superharmonic then applying the maximum principle to $-h$ gives a corresponding minimum principle: if there exists $x \in A$ such that $h(x) = \min_{y \in \bar A} h(y)$ then $h$ is constant on $\bar A$.

Definition 1.39  $(\Gamma, \mu)$ has the Liouville property if all bounded harmonic functions on $\Gamma$ are constant. $(\Gamma, \mu)$ has the strong Liouville property if all positive harmonic functions on $\Gamma$ are constant.

For the probabilistic interpretation of the Liouville property in terms of the tail behaviour of the random walk $X$ see the end of this chapter. Example 2.67 shows that the join of $\mathbb{Z}_+$ and $\mathbb{Z}^3$ at their origins satisfies the Liouville property but not the strong Liouville property.

Theorem 1.40  Let $(\Gamma, \mu)$ be recurrent.
(a) Any positive superharmonic function on $V$ is constant.
(b) $(\Gamma, \mu)$ has the strong Liouville property.

Proof  (a) There is a quick probabilistic proof using the martingale convergence theorem (see Theorem A.4). If $h \ge 0$ is superharmonic then $M_n = h(X_n)$ is a positive supermartingale. So $M_n$ converges $P^x$-a.s. to a limit $M_\infty$, and as for any $y \in V$ the set $\{n : X_n = y\}$ is $P^x$-a.s. unbounded, we have $P^x(h(y) = M_\infty) = 1$ for all $y \in V$. Thus $h$ is constant.

One can also give an analytic proof based on the maximum principle. Let $u \ge 0$ be superharmonic. So $Pu \le u$, and iterating, $P^{n+1} u \le P^n u \le \cdots \le Pu \le u$. Set $v = u - Pu \ge 0$. Then $P^k v \ge 0$, and
$$\sum_{k=0}^n P^k v = u - P^{n+1} u \le u.$$

Hence
$$\sum_y g(x,y)\, v(y)\,\mu_y = Gv(x) = \lim_{n \to \infty} \sum_{k=0}^n P^k v(x) \le u(x) < \infty.$$
Since $g(x,y) = \infty$ for all $x, y$ we must have $v(y) = 0$ for all $y$, so that $u = Pu$, and therefore $u$ is harmonic. We have proved that if $u \ge 0$ is superharmonic then $u$ is harmonic. Now let $f \ge 0$ be superharmonic. Suppose $f$ is non-constant, so there exist $x, y \in V$, $\lambda > 0$ such that $f(x) < \lambda < f(y)$. Then the function $v = f \wedge \lambda$ is superharmonic, so harmonic. Further, $v$ attains its maximum at $y$, so by the maximum principle (applied on any finite connected set containing $x$ and $y$) $v$ is constant, a contradiction.
(b) Since a harmonic function is superharmonic, this is immediate from (a).

Proposition 1.41 ('Foster's criterion' – see [Fos, Mer])  Let $A \subset V$ be finite. $(\Gamma, \mu)$ is recurrent if and only if there exists a non-negative function $h$ which is superharmonic in $V - A$ and which satisfies
$$|\{x : h(x) < M\}| < \infty \quad \text{for all } M. \qquad (1.35)$$

Proof  By collapsing $A$ to a single point, we can assume that $A = \{o\}$. Suppose a function $h$ with the properties above exists. Then for any initial point $x \in V$ the process $M_n = h(X_{n \wedge T_o})$ is a non-negative supermartingale, so converges $P^x$-a.s. to a r.v. $M_\infty$. If $X$ is transient then we can find $x \in V$ so that $P^x(T_o < \infty) < 1$, and using (1.35) it follows that $P^x(M_\infty = \infty) > 0$, which contradicts the convergence theorem for supermartingales. (See Corollary A.5.)

Now assume that $(\Gamma, \mu)$ is recurrent. Let $B_n = B(o, n)$, and let $h_n(x) = P^x(\tau_{B_n} < T_o)$. Then $h_n$ is superharmonic in $V - \{o\}$, equals 1 on $B_n^c$, and satisfies
$$\lim_{n \to \infty} h_n(x) = 0 \quad \text{for all } x.$$
Since $B_k$ is finite for each $k$, there exists a sequence $n_k \to \infty$ such that
$$h_{n_k}(x) \le 2^{-k} \quad \text{for all } x \in B_k.$$
Let $h = \sum_k h_{n_k}$. Then $h(x) < \infty$ for all $x$, and $h$ is non-negative and superharmonic. Let $M \in \mathbb{N}$. Then $h_{n_j}(x) = 1$ if $x \in B_{n_M}^c$ and $j \le M$, so $h(x) \ge M$ for all $x \in B_{n_M}^c$. Thus $h$ satisfies (1.35).

We can extend the minimum principle to infinite sets if $\Gamma$ is recurrent.

Theorem 1.42  Let $\Gamma$ be recurrent, and $A \subset V$. If $h$ is bounded and superharmonic in $A$ then
$$\min_{\partial A} h = \min_{\bar A} h.$$


Proof Let b = min_{∂A} h, and set g = h ∧ b on Ā, and g = b on (Ā)^c. Then g is superharmonic in A, while if y ∈ A^c then b = g(y) ≥ Pg(y). So g is superharmonic in the whole of V, and thus g is constant and equal to b. Hence h ≥ b on A.

Proposition 1.43 (Local Harnack inequality) Let (Γ, μ) satisfy (H5) with constant C_1. Then if h ≥ 0 is harmonic in {x, y} with x ∼ y,

C_1^{−1} h(y) ≤ h(x) ≤ C_1 h(y).    (1.36)

Proof We have

h(x) = ∑_{z∼x} (μ_{xz}/μ_x) h(z) ≥ C_1^{−1} h(y).

Interchanging x and y gives the second inequality.
Iterating (1.36) we deduce that if h ≥ 0 is harmonic on B(x, r) and y ∈ B(x, r), then

h(y) ≤ C_1^{d(x,y)} h(x).
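The iterated bound above is easy to sanity-check numerically. The sketch below (our construction, not from the text) uses the SRW on Z, where (H5) holds with C_1 = 2 since μ_{xy}/μ_x = 1/2 for every edge, and an arbitrarily chosen non-negative harmonic (affine) function.

```python
# A numerical sanity check (not from the text) of the iterated Harnack bound
# h(y) <= C1^{d(x,y)} h(x) for the SRW on Z, where (H5) holds with C1 = 2 since
# mu_xy / mu_x = 1/2 for every edge.
def is_harmonic_on_ball(h, x, r):
    """h(n) = (h(n-1) + h(n+1))/2 for every n in B(x, r) = {x-r, ..., x+r}."""
    return all(abs(h(n) - 0.5 * (h(n - 1) + h(n + 1))) < 1e-12
               for n in range(x - r, x + r + 1))

def harnack_ok(h, x, r, C1=2.0):
    """h(y) <= C1^{d(x,y)} h(x) for every y in B(x, r)."""
    return all(h(y) <= C1 ** abs(y - x) * h(x) + 1e-12
               for y in range(x - r, x + r + 1))

# h(n) = n + 10 is non-negative and harmonic on B(0, 5) = {-5, ..., 5}.
h = lambda n: n + 10.0
assert is_harmonic_on_ball(h, 0, 5) and harnack_ok(h, 0, 5)
```

The helper names here (`is_harmonic_on_ball`, `harnack_ok`) are of course ours, used only for this illustration.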

Definition 1.44 (Γ, μ) satisfies a (scale invariant) elliptic Harnack inequality (EHI) if there exists a constant C_H such that, given x_0 ∈ V, R ≥ 1, and h ≥ 0 on B(x_0, 2R) and harmonic in B(x_0, 2R), then

h(x) ≤ C_H h(y) for all x, y ∈ B(x_0, R),

or equivalently

max_{B(x_0,R)} h ≤ C_H min_{B(x_0,R)} h.    (1.37)

Remark (1) Note that h(x) ≥ 0 for x ∈ B(x_0, 2R) does not imply that h ≥ 0 on the closure B̄(x_0, 2R). We sometimes have to be careful in this way in the discrete setup.
(2) We have considered the balls B(x, R) ⊂ B(x, 2R) for simplicity. An easy exercise shows that if K > 1 and the EHI holds in the form stated above, then (1.37) holds (with a different constant C_H(K)) for any non-negative function h defined on B(x_0, KR) and harmonic in B(x_0, KR), provided that R ≥ 2/(K − 1).
If f : A → R write Osc(f, A) = sup_A f − inf_A f.


Proposition 1.45 Suppose (Γ, μ) satisfies the EHI with constant C_H. Then if ρ = 1/(2C_H) and h is harmonic in B(x_0, 2R),

Osc(h, B(x_0, R)) ≤ (1 − ρ) Osc(h, B(x_0, 2R)).    (1.38)

Proof Write B = B(x_0, R), D = B(x_0, 2R). Replacing h by a + bh (where a, b ∈ R) we can arrange that

min_D h = 0,  max_D h = 1.

Suppose that h(x_0) ≥ ½. Then by the EHI

½ ≤ max_B h ≤ C_H min_B h.

Thus h(x) ≥ 1/(2C_H) on B, so Osc(h, B) ≤ 1 − 1/(2C_H) = (1 − ρ) Osc(h, D). If h(x_0) ≤ ½ then apply the argument above to h′ = 1 − h.

Remark The oscillation inequality (1.38) is slightly weaker than the EHI: [Ba2] shows that some random walks on ‘lamplighter’ type graphs satisfy (1.38) without satisfying the EHI. A fairly weak lower bound on hitting probabilities combined with (1.38) is enough to give the EHI – see for example [FS].

Theorem 1.46 Suppose (Γ, μ) satisfies the EHI. Then it satisfies the strong Liouville property.

Proof Let h ≥ 0 be harmonic in V. Fix x_0 ∈ V and let A_n = B(x_0, 2^n), λ_n = Osc(h, A_n). We begin by using the EHI in A_n ⊂ A_{n+1}:

max_{A_n} h ≤ C_H min_{A_n} h ≤ C_H h(x_0).

As n is arbitrary it follows that h is bounded by c_1 = C_H h(x_0). Now by Proposition 1.45,

λ_n ≤ (1 − ρ) Osc(h, A_{n+1}) ≤ (1 − ρ) Osc(h, A_{n+2}) = (1 − ρ)λ_{n+2}.

Iterating, and using the fact that λ_m ≤ Osc(h, V) ≤ c_1 for any m, we deduce that, for any n ≥ 1 and k ≥ 1,

λ_n ≤ (1 − ρ)^k λ_{n+2k} ≤ (1 − ρ)^k c_1.

So λ_n = 0 for all n, proving that h is constant.
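The final step of the proof rests on the geometric decay of (1 − ρ)^k c_1; the following short arithmetic check (with illustrative constants of our choosing, not from the text) makes that decay concrete.

```python
# Arithmetic behind the last step of Theorem 1.46 (illustrative constants, not from
# the text): with rho = 1/(2*C_H), lambda_n <= (1 - rho)^k * c1 for every k, and the
# right-hand side decays geometrically, forcing lambda_n = 0.
C_H = 4.0                    # a hypothetical EHI constant
rho = 1.0 / (2.0 * C_H)
c1 = 10.0                    # a hypothetical bound on Osc(h, V)
bounds = [(1.0 - rho) ** k * c1 for k in range(200)]
assert all(b2 < b1 for b1, b2 in zip(bounds, bounds[1:]))   # strictly decreasing
assert bounds[-1] < 1e-5                                    # -> 0 as k -> infinity
```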


Remark T. Lyons [Ly2] proved that the Liouville property and the strong Liouville property are not weight stable. The EHI has recently been proved to be weight stable – see [BMu].

Example An example of a graph which does not satisfy the Liouville property is the join of two copies of Z³ at their origins. Write Z³_{(i)}, i = 1, 2, for the two copies, and 0_i for their origins. Let

F = {X is ultimately in Z³_{(1)}} = ⋃_{n=1}^{∞} ⋂_{m=n}^{∞} {X_m ∈ Z³_{(1)}},

and let h(x) = P^x(F). Then h is harmonic, and is non-constant since

h(x) ≥ P^x(X never hits 0_1) for x ∈ Z³_{(1)},  h(x) ≤ P^x(X hits 0_2) for x ∈ Z³_{(2)}.

1.8 Strong Liouville Property for R^d

In general, proving the EHI requires some hard work. In Chapter 7 we will see how it can be proved from estimates on the transition density of X. A corollary is that the Liouville property and the strong Liouville property hold for graphs which are roughly isometric to Z^d. Here is a proof of the strong Liouville property for R^d, d ≥ 3, using the same ideas.
We say a function h ∈ C²(R^d) is harmonic on R^d if Δh(x) = 0 for all x; here Δ is the usual Laplacian on R^d. Let S_r be the sphere with centre 0 and radius r, and let σ_r be surface measure on S_r. By Green’s third identity, if v(x) = c_1(|x|^{2−d} − r^{2−d}) is the fundamental solution of the Laplace equation inside S_r, then

h(0) = ∫_{S_r} h(y) (∂v/∂n)(y) dσ_r(y),

and it follows that

h(0) = σ(S_r)^{−1} ∫_{S_r} h dσ_r.

Let k_t(x, y) = (2πt)^{−d/2} exp(−|x − y|²/2t) be the usual Gaussian density, and

K_t f(x) = ∫ k_t(x, y) f(y) dy.


Then the spherical symmetry of k_t(x, ·) implies that h(x) = K_t h(x) for all x ∈ R^d and t > 0. We have

k_t(x, y)/k_{2t}(0, y) = 2^{d/2} exp( (|y|² − 2|x − y|²)/4t ) = 2^{d/2} exp( (2|x|² − |y − 2x|²)/4t ) ≤ 2^{d/2} e^{|x|²/2t} ≤ 2^{d/2} e

if 2t ≥ |x|². Now let h ≥ 0 be harmonic in R^d, and suppose h is non-constant. Replacing h by h − inf h if necessary, we can assume that inf h = 0. Let x ∈ R^d, and choose t ≥ ½|x|²; then

h(x) = ∫ k_t(x, y) h(y) dy ≤ 2^{d/2} e ∫ k_{2t}(0, y) h(y) dy = 2^{d/2} e · h(0).

So h is bounded, and (using translation invariance) we also have that h(x) ≤ 2^{d/2} e · h(y) for all x, y ∈ R^d. Since h is non-constant there exists x_0 ∈ R^d with h(x_0) > 0, so we deduce that h(y) ≥ 2^{−d/2} e^{−1} h(x_0) for all y ∈ R^d, and hence that inf h > 0, a contradiction.
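The kernel inequality used above is easy to test numerically; the sketch below (our construction) samples points y at random and checks k_t(x, y) ≤ 2^{d/2} e · k_{2t}(0, y) whenever 2t ≥ |x|².

```python
# Numerical spot-check (ours, not the book's) of the Gaussian kernel bound
# k_t(x, y) <= 2^{d/2} * e * k_{2t}(0, y) whenever 2t >= |x|^2.
import math
import random

def gaussian_kernel(t, x, y):
    """k_t(x, y) = (2*pi*t)^{-d/2} * exp(-|x - y|^2 / (2t))."""
    d = len(x)
    r2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return (2.0 * math.pi * t) ** (-d / 2.0) * math.exp(-r2 / (2.0 * t))

def ratio_bound_holds(x, t, n_samples=500):
    """Check the bound on randomly sampled y, given 2t >= |x|^2."""
    assert 2.0 * t >= sum(a ** 2 for a in x)
    d = len(x)
    bound = 2.0 ** (d / 2.0) * math.e
    zero = tuple(0.0 for _ in x)
    rng = random.Random(0)
    ys = [tuple(rng.uniform(-10.0, 10.0) for _ in x) for _ in range(n_samples)]
    return all(
        gaussian_kernel(t, x, y) <= bound * gaussian_kernel(2.0 * t, zero, y) * (1 + 1e-9)
        for y in ys
    )

assert ratio_bound_holds(x=(1.0, 2.0, 0.5), t=3.0)   # here 2t = 6 >= |x|^2 = 5.25
```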

1.9 Interpretation of the Liouville Property

We will need more of the theory of Markov processes than in the rest of this chapter: see Appendix A.2 for further details. As before we assume that the random walk X is defined on a probability space (Ω, F), with probability measures (P^x, x ∈ V) such that P^x(X_0 = x) = 1. We also assume that we have the shift operators θ_n : Ω → Ω satisfying

X_m(θ_n ω) = X_{n+m}(ω),  n, m ≥ 0, ω ∈ Ω.

For a r.v. ξ we set (ξ ∘ θ_n)(ω) = ξ(θ_n ω). We have

1_F ∘ θ_n = 1_{θ_n^{−1}(F)}.    (1.39)

We will sometimes write F ∘ θ_n for θ_n^{−1}(F). We define the σ-fields

F_n = σ(X_k, k ≤ n),  G_n = σ(X_k, k ≥ n),  F = G_0 = σ(X_k, k ≥ 0).    (1.40)

The invariant σ-field I is defined by: F ∈ I if and only if F ∈ F and θ_n^{−1}(F) = F for all n.


The tail σ-field T is defined by:

T = ⋂_{n=0}^{∞} G_n.    (1.41)

Lemma 1.47 I ⊂ T.

Proof We have that F ∈ G_n if and only if F = θ_n^{−1}(F′) for some F′ ∈ F; see Lemma A.25. Let F ∈ I. Then F = θ_n^{−1}(F), so F ∈ G_n for any n, and hence F ∈ T.

A σ-field G is P^x-trivial if P^x(G) ∈ {0, 1} for every G ∈ G.

Theorem 1.48 The following are equivalent:
(a) Γ satisfies the Liouville property.
(b) I is P^ν-trivial for all probability distributions ν on V.
(c) There exists x ∈ V such that I is P^x-trivial.

Proof Let F ∈ I, and set h(x) = P^x(F). Then as F = F ∘ θ_n, by the Markov property of X (see Theorem A.10),

E^ν(1_F | F_n) = E^ν(1_F ∘ θ_n | F_n) = E^{X_n}(1_F) = h(X_n).

Thus, taking expectations with respect to E^x and setting n = 1,

h(x) = E^x h(X_1) = ∑_y P(x, y) h(y),

so that h is harmonic. Now assume (a). Then h is constant, while by the martingale convergence theorem

c = h(X_n) = E^ν(1_F | F_n) → 1_F,  P^ν-a.s.

So P^ν(1_F = c) = 1, and thus F is P^ν-trivial, so that (b) holds. That (b) implies (c) is immediate.
Suppose (c) holds, and let h be a bounded harmonic function. Let M_k = h(X_k), so M is a bounded martingale, and by the martingale convergence theorem (see Theorem A.7), for each x ∈ V, M_k → M_∞ P^x-a.s. and in L¹. Let n ≥ 0. Now M_k ∘ θ_n = M_{k+n}, so M_∞ = M_∞ ∘ θ_n, and thus M_∞ is I-measurable. As I is P^x-trivial, it follows that P^x(M_∞ = c) = 1 for some constant c. Further, as M_n → M_∞ in L¹,

h(X_n) = M_n = E^x(M_∞ | F_n) = c


for any n, so h(y) = c for any y such that P^x(X_n = y) > 0 for some n ≥ 0. As Γ is connected we deduce (a).
Combining Theorems 1.40 and 1.48 we obtain:

Corollary 1.49 Let Γ be recurrent. Then I is P^ν-trivial for all ν.

While in general the Liouville property can be hard to prove, the special structure of the SRW on Z^d as a sum of i.i.d. random variables enables us to give two quick proofs of the Liouville property; one direct, and one via the Hewitt–Savage 0–1 law. We begin with a direct argument from [DM].

Theorem 1.50 Let G be a countable Abelian group and let S be a set of generators. Let ν be a probability measure on S, with ν_z = ν(z) > 0 for all z ∈ S. If h is bounded and harmonic then h is constant.

Remark In this context h harmonic means that h = Ph, where

Ph(x) = ∑_{y∈S} h(x + y) ν_y.

In order that the associated random walk be given by edge weights, we need the additional conditions that if x ∈ S then −x ∈ S, and that ν_x = ν_{−x} for all x ∈ S. We can then set μ_{x,x+y} = μ_{x+y,(x+y)+(−y)} = ν_y.

Proof Suppose that h is non-constant. Then since S generates G there exist w ∈ S and x′ ∈ G such that h(x′) ≠ h(x′ + w). Set h_w(x) = h(x) − h(x + w); replacing h by −h if necessary we can assume that M = sup h_w > 0. Then

Ph_w(x) = ∑_{y∈S} h_w(x + y) ν_y = ∑_{y∈S} (h(x + y) − h(x + w + y)) ν_y = h(x) − h(x + w) = h_w(x),

so that h_w is also harmonic. Then h_w(x) ≤ ν_w h_w(x + w) + (1 − ν_w)M, which gives that

M − h_w(x + w) ≤ ν_w^{−1}(M − h_w(x)).


Iterating this equation we obtain

M − h_w(x + nw) ≤ ν_w^{−n}(M − h_w(x)), for all x and n ≥ 0.    (1.42)

Let N ≥ 1. Since inf_x (M − h_w(x)) = 0, there exists x_0 ∈ G such that M − h_w(x_0) ≤ ½ M ν_w^N. So by (1.42), for 0 ≤ n ≤ N,

M − h_w(x_0 + nw) ≤ ν_w^{−n}(M − h_w(x_0)) ≤ ½ M,

which implies that

h_w(x_0 + nw) ≥ ½ M,  0 ≤ n ≤ N.

Then

2‖h‖_∞ ≥ h(x_0) − h(x_0 + Nw) = ∑_{n=1}^{N} ( h(x_0 + (n−1)w) − h(x_0 + nw) ) = ∑_{n=1}^{N} h_w(x_0 + (n−1)w) ≥ ½ N M,
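In the simplest Abelian case G = Z with generators {−1, 1}, the mechanism of the proof is visible directly: harmonicity forces constant increments, so a non-constant harmonic function grows linearly and cannot be bounded. The illustration below is ours, not the book’s argument.

```python
# For the SRW on Z, h harmonic means h(n) = (h(n-1) + h(n+1))/2, i.e. the increments
# h(n+1) - h(n) are constant; a non-constant harmonic h is therefore unbounded, which
# is the conclusion of Theorem 1.50 in the simplest case.  (Our illustration.)
def extend_harmonic(h0, h1, N):
    """Extend h(0) = h0, h(1) = h1 harmonically to {0, ..., N} via h(n+1) = 2h(n) - h(n-1)."""
    h = [h0, h1]
    for _ in range(N - 1):
        h.append(2.0 * h[-1] - h[-2])
    return h

h = extend_harmonic(0.0, 0.1, 100)
increments = [b - a for a, b in zip(h, h[1:])]
assert all(abs(d - 0.1) < 1e-9 for d in increments)   # constant increments
assert h[-1] > 9.9                                    # non-constant => linear growth
```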

a contradiction if N is chosen large enough.

Remark The example of the free group with two generators shows this result is not true for non-Abelian groups. To see where the proof fails, suppose that G is non-Abelian, and the random walk acts by multiplication on the right, so that Ph(x) = ∑_{y∈S} h(xy) ν_y. If we define h_w(x) = h(x) − h(xw) then h_w is not harmonic. If instead we set h_w(x) = h(x) − h(wx) then, while h_w is harmonic, the relation h(xw^{n−1}) − h(xw^n) = h_w(xw^{n−1}), used in the last line of the proof, fails in general.

We now introduce the Hewitt–Savage 0–1 law. A finite permutation π : N → N is a 1–1 onto map from N to N such that {i : π(i) ≠ i} is finite. It follows that there exists m = m_π such that π(j) = j for all j ≥ m. Now let (ξ_n, n ∈ N) be random variables on a probability space (Ω, F, P). For simplicity we assume that the ξ_i are discrete, with values in a finite set S, that Ω = S^N, and that the ξ_i are the coordinate maps, so that if ω = (ω_1, ω_2, . . . ) ∈ Ω, then ξ_i(ω) = ω_i. Given a finite permutation π we define the map M_π : Ω → Ω by M_π(ω) = (ω_{π(1)}, ω_{π(2)}, . . . ); unpacking this notation we have ξ_i(M_π(ω)) = ξ_{π(i)}(ω). If F ∈ F then M_π^{−1}(F) = {ω : M_π(ω) ∈ F}. We say that F is permutable if F = M_π^{−1}(F) for every π.


Theorem 1.51 (Hewitt–Savage 0–1 law – see [Dur], Th. 4.1.1) (a) The collection of permutable events is a σ-field, F_E, called the exchangeable σ-field.
(b) Let (ξ_n, n ∈ N) be i.i.d. Then F_E is trivial.

Corollary 1.52 Let X be the SRW on Z^d. Then T and I are trivial, and X satisfies the Liouville property.

Proof Let S be the set of the 2d neighbours of 0 in Z^d, and let (ξ_n, n ∈ N) be i.i.d.r.v. with distribution given by P(ξ_n = x) = 1/2d for each x ∈ S. Define X_0 = 0, and

X_n = ∑_{i=1}^{n} ξ_i,  n ≥ 1.

Let G_n and T be as in (1.40) and (1.41), and let G ∈ T. Let π be a finite permutation and let

X_n^π = ∑_{i=1}^{n} ξ_{π(i)},  n ≥ 1.

Then there exists m ≥ 1 such that π(i) = i for all i ≥ m. Hence if n ≥ m then X_n^π = X_n. Since G ∈ G_m it follows that M_π^{−1}(G) = G. Therefore G is permutable, and so G ∈ F_E. Hence T ⊂ F_E, and so by Theorem 1.51 T is trivial. Thus I is trivial, and so by Theorem 1.48 X satisfies the Liouville property.

Since the Liouville property is not stable under rough isometries, this result is of no help in proving the Liouville property for graphs which are roughly isometric to Z^d.

Example It is easy to give examples of graphs for which T ≠ I. Let Γ = Z³, with the usual edges except for one additional edge e* between (0, 0, 0) and (1, 1, 0). Let A_0 = {x = (x_1, x_2, x_3) ∈ Z³ : x_1 + x_2 + x_3 is even}, and A_1 = Z³ − A_0. As the SRW on Z³ is transient, it only traverses e* a finite number of times. So if

F_i = {X_{2n} ∈ A_i for all large n},


then P^x(F_0 ∪ F_1) = 1. The events F_i are in T and have P^x(F_i) > 0 for all x, so T is non-trivial. On the other hand I is trivial by Theorem 7.19.

Remark 1.53 Appendix A.4 gives a ‘0–2 law’ which gives necessary and sufficient conditions to have T = I. As a corollary we have that T = I for the lazy random walk, that is the simple random walk on the lazy graph defined in Examples 1.4(7).

2 Random Walks and Electrical Resistance

2.1 Basic Concepts

Doyle and Snell [DS] made explicit the connection between random walks and electrical resistance, which had been implicit in many previous works, such as [BD, NW]. Informally an electrical network is a set of wires and nodes. A weighted graph Γ = (V, E, μ) can be interpreted as an electrical network: the vertices are ‘nodes’ and the edge {x, y} corresponds to a wire with conductivity μ_{xy}. (We only consider ‘pure resistor’ networks – no impedances or capacitors.)

Definition 2.1 A flow on Γ is a map I : V × V → R such that I_{xy} = 0 if x ≁ y, and I_{xy} = −I_{yx} for all x, y.

Consider a single wire {x, y}. We take its resistance to be μ_{xy}^{−1} (which is why we called the weights μ_{xy} conductances). If x, y are at potential f(x), f(y) respectively, then Ohm’s law (‘I = V/R’) gives the current as

I_{xy} = μ_{xy}(f(y) − f(x))    (2.1)

(my currents flow uphill), and the power dissipation (‘E = V²/R’) is

μ_{xy}(f(y) − f(x))².    (2.2)

Note that I defined by (2.1) is a flow, and that summing (2.2) over edges gives E(f, f). Electrical engineering books regard currents as primary (they correspond to physical processes) and potentials as derived from the currents by (2.1). But in this mathematical treatment it is easier to take potentials as our main object.
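A one-line numerical illustration of (2.1) and (2.2), with made-up values of our own, shows the sign convention at work.

```python
# Ohm's law (2.1) and the power dissipation (2.2) on a single wire, with arbitrary
# illustrative values: the sign convention makes current flow "uphill".
mu_xy = 2.0                       # conductance of the wire {x, y}
f_x, f_y = 1.0, 4.0               # potentials at x and y
I_xy = mu_xy * (f_y - f_x)        # (2.1): current from x to y
power = mu_xy * (f_y - f_x) ** 2  # (2.2): power dissipated on the wire
assert I_xy == 6.0 and power == 18.0
assert power == I_xy ** 2 / mu_xy   # 'E = I^2 R' with R = 1/mu_xy
```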

Definition 2.2

(a) Let f ∈ C(V). Define the flow ∇f by

(∇f)_{xy} = μ_{xy}(f(y) − f(x)).    (2.3)

(b) Given a flow I set

Div I(x) = ∑_{y∼x} I_{xy};  div I(x) = Div I(x)/μ_x.    (2.4)

Div I(x) is the net flux out of x, while div I is the density of Div I with respect to μ. For a flow I and A ⊂ V define the net flux out of the set A by

Flux(I; A) = ∑_{x∈A} Div I(x).    (2.5)

Lemma 2.3 For f ∈ C(V), div ∇f(x) = Δf(x).

Proof By (2.3) and (2.4),

div ∇f(x) = μ_x^{−1} ∑_{y∼x} (∇f)_{xy} = μ_x^{−1} ∑_{y∼x} μ_{xy}(f(y) − f(x)) = Δf(x).
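The gradient/divergence bookkeeping is easy to exercise on a toy network (the three-vertex example and its weights below are our choice, not the text’s); Lemma 2.3 then reduces to a mechanical identity.

```python
# Discrete gradient/divergence on a toy 3-vertex weighted graph (our choice of
# weights), checking Lemma 2.3: div(grad f)(x) = Delta f(x).
edges = {(0, 1): 2.0, (1, 2): 1.0, (2, 0): 3.0}   # conductances mu_xy
mu_xy = {}
for (a, b), w in edges.items():
    mu_xy[(a, b)] = w
    mu_xy[(b, a)] = w
V = {0, 1, 2}
mu = {x: sum(w for (a, _), w in mu_xy.items() if a == x) for x in V}

def grad(f):
    """(grad f)_{xy} = mu_xy * (f(y) - f(x)); an antisymmetric flow."""
    return {(x, y): w * (f[y] - f[x]) for (x, y), w in mu_xy.items()}

def Div(I, x):
    """Net flux out of x: Div I(x) = sum_{y ~ x} I_{xy}."""
    return sum(v for (a, _), v in I.items() if a == x)

def laplacian(f, x):
    return sum(w * (f[y] - f[x]) for (a, y), w in mu_xy.items() if a == x) / mu[x]

f = {0: 1.0, 1: -2.0, 2: 0.5}
I = grad(f)
assert all(abs(I[(x, y)] + I[(y, x)]) < 1e-12 for (x, y) in I)           # flow property
assert all(abs(Div(I, x) / mu[x] - laplacian(f, x)) < 1e-12 for x in V)  # Lemma 2.3
```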

The behaviour of the currents in a network is described by Kirchhoff’s two laws (see [Kir1, Kir2]). The first (the ‘current law’) states that if x ∈ V, and no external current flows into or out of x, then

Div I(x) = ∑_{y∼x} I_{xy} = 0.

The second (the ‘voltage law’) is that, given a cycle x_0, x_1, . . . , x_n = x_0 in V, then

∑_{i=1}^{n} μ_{x_{i−1},x_i}^{−1} I_{x_{i−1},x_i} = 0.    (2.6)

It follows easily from this condition (a discrete analogue of curl(I) = 0) that there exists a function (or ‘potential’) f on V such that I = ∇f; f is unique up to an additive constant. The proof is straightforward: let x_0 ∈ V and set f(x_0) = 0. Suppose f has been defined on a set A ⊂ V. If x ∈ A and y ∼ x with y ∈ A^c then let f(y) = f(x) + μ_{xy}^{−1} I_{xy}. The condition (2.6) ensures that f is well defined.
Plainly, if we just have a bunch of wires connected together, nothing will happen – that is, no currents will flow and the whole network will remain at the same potential. Electrical engineers therefore allow, as well as wires between x and y, the possibility of voltage or current sources. However, from our viewpoint, which is to use electrical networks to explore the properties of


a graph Γ, it is better to fix the ‘passive’ network (or a subset of it) and then allow ‘external’ sources (voltage or current) to be imposed in various ways.
Let D ⊂ V. Suppose we impose a potential f_0 on V − D (by adding ‘external’ inputs of currents), and then allow currents to flow in D according to Kirchhoff’s laws. Let f ∈ C(V) be the resulting potential; by Ohm’s law the current from x to y is (∇f)_{xy}. For x ∈ D conservation of current then gives div ∇f(x) = 0, so that f is harmonic in D. Thus f is the solution to the following Dirichlet problem:

f(x) = f_0(x) on V − D,  Δf(x) = 0 on D.    (2.7)

Of course it is only the values of f_0 on ∂D which will affect f inside D.

Lemma 2.4 Let f be any solution of the Dirichlet problem (2.7). Then I = ∇f satisfies Kirchhoff’s laws.

Proof Since I is the gradient of a function, (2.6) holds. For x ∈ D we have, by Lemma 2.3, Div I(x) = μ_x Δf(x) = 0.

Assume now that f_0 is bounded, recall the definition of the exit time τ_D from (1.4), and set

ϕ(x) = E^x f_0(X_{τ_D}) 1_{{τ_D < ∞}}.    (2.8)

Then ϕ is a solution of (2.7) (Theorem 2.5). Set h_D(x) = P^x(τ_D = ∞), and suppose that h_D(x) > 0 for some x ∈ D. In this case uniqueness will fail for (2.7) since, for any λ, the function ϕ + λh_D is also a solution of (2.7). Thus the Dirichlet problem (2.7) is not enough to specify the potential f and current I arising from a potential f_0 on ∂D. We will see that if we impose the additional condition that f has ‘minimal energy’, then we do have uniqueness.

Lemma 2.6 Let I be a flow on V with finite support. Then

∑_{x∈V} Div I(x) = 0.

Proof Let A be the set of vertices x such that I_{xy} ≠ 0 for some y ∼ x. Since A is finite,

∑_{x∈V} Div I(x) = ∑_{x∈A} Div I(x) = ∑_{x∈A} ∑_{y∈A} I_{xy} = −∑_{x∈A} ∑_{y∈A} I_{yx} = −∑_{y∈A} Div I(y),

and since the sum equals its own negative it must be zero.

Remark If I has infinite support then the interchange of sums above may not be valid, and the result is false in general. Consider for example the graph N with flow I_{n,n+1} = 1 for all n; then Div I(n) = 0 for all n ≥ 2 but Div I(1) = 1.
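Lemma 2.6 and the remark above can be checked on concrete flows (the numbers below are arbitrary choices of ours): a finite-support antisymmetric flow has total divergence zero, while a truncation of the flow on N shows the boundary contribution Div I(1) = 1.

```python
# Checking Lemma 2.6 on a made-up antisymmetric flow with finite support:
# the total divergence sum_x Div I(x) vanishes.
I = {(0, 1): 1.0, (1, 0): -1.0,
     (1, 2): 0.25, (2, 1): -0.25,
     (2, 0): -0.5, (0, 2): 0.5}

def Div(I, x):
    return sum(v for (a, _), v in I.items() if a == x)

support = {a for (a, _) in I}
assert abs(sum(Div(I, x) for x in support)) < 1e-12

# A truncation of the flow on N with I_{n,n+1} = 1 (edges {1,2}, ..., {5,6}):
# Div I(n) = 0 in the interior but Div I(1) = 1, as in the remark.
J = {(n, n + 1): 1.0 for n in range(1, 6)}
J.update({(n + 1, n): -1.0 for n in range(1, 6)})
assert Div(J, 1) == 1.0 and all(Div(J, n) == 0.0 for n in range(2, 6))
```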


2.2 Transience and Recurrence

Let B_0, B_1 be disjoint non-empty subsets of V, and assume that D = V − (B_0 ∪ B_1) is finite. Let f_0 = 0 on B_0 and f_0 = 1 on B_1 (that is, f_0 = 1_{B_1}), and let ϕ be the unique solution, given by (2.8), to the Dirichlet problem (2.7). Then

ϕ(x) = P^x(T_{B_1} < T_{B_0}).    (2.9)

Let I = ∇ϕ be the associated current. As 0 ≤ ϕ ≤ 1 and Div I(x) = μ_x Δϕ(x), we have

Div I(x) ≥ 0 for x ∈ B_0,  Div I(y) ≤ 0 for y ∈ B_1.

Using Lemma 2.6 we deduce that the net current flowing between B_0 and B_1 is given by

Flux(I; B_0) = ∑_{x∈B_0} Div I(x) = −∑_{x∈B_1} Div I(x) = −Flux(I; B_1).

Definition 2.7 Let V be finite. The (finite network) effective resistance between B_0 and B_1, denoted R^{FN}_eff(B_0, B_1), is Flux(I; B_0)^{−1}, where I is the current flowing from B_0 to B_1 when B_0 is held at potential 0, and B_1 at potential 1.

Theorem 2.8 Let B_0 ⊂ A ⊂ V, with A finite. Then

∑_{x∈B_0} μ_x P^x(τ_A < T^+_{B_0}) = 1 / R^{FN}_eff(B_0, V − A).    (2.10)

Further, for x_0 ∈ A,

R^{FN}_eff(x_0, V − A) = g_A(x_0, x_0).    (2.11)

Proof Let D = A − B_0, B_1 = V − A, let ϕ be given by (2.9), and let I = ∇ϕ. Let x ∈ B_0. Then ϕ(x) = 0 and

Div I(x) = ∑_{y∼x} ϕ(y) μ_{xy} = μ_x ∑_{y∼x} P(x, y) P^y(T_{B_1} < T_{B_0}) = μ_x P^x(T_{B_1} < T^+_{B_0}).

Thus the left side of (2.10) is Flux(I; B_0), and (2.10) follows from the definition of effective resistance.


Taking B_0 = {x_0} we deduce that

μ_{x_0} P^{x_0}(τ_A < T^+_{x_0}) = 1 / R^{FN}_eff(x_0, V − A),    (2.12)

and comparing (2.12) with (1.29) gives (2.11).

Definition 2.9 Let (Γ, μ) be an infinite weighted graph, let B_0 ⊂ V, and let A_n ↑↑ V with B_0 ⊂ A_1. We define

R_eff(B_0, ∞) = lim_n R^{FN}_eff(B_0, A_n^c).    (2.13)

Lemma 2.10 (a) If B_0 ⊂ D_1 ⊂ D_2, and D_2 is finite, then

R^{FN}_eff(B_0, D_1^c) ≤ R^{FN}_eff(B_0, D_2^c).

(b) R_eff(B_0, ∞) does not depend on the sequence (A_n) used.

Proof As τ_{D_1} ≤ τ_{D_2}, (a) is immediate from (2.10). (We will see a better proof later.)
(b) Let A′_n be another such sequence, and denote the limits in (2.13) by R and R′. Given n there exists m_n such that A_n ⊂ A′_{m_n}. So, by (a),

R^{FN}_eff(B_0, A_n^c) ≤ R^{FN}_eff(B_0, (A′_{m_n})^c) ≤ lim_m R^{FN}_eff(B_0, (A′_m)^c) = R′.

Thus R ≤ R′ and similarly R′ ≤ R.

Theorem 2.11 Let (Γ, μ) be an infinite weighted graph.
(a) For x ∈ V,

μ_x P^x(T_x^+ = ∞) = 1 / R_eff(x, ∞).

(b) (Γ, μ) is recurrent if and only if R_eff(x, ∞) = ∞.
(c) (Γ, μ) is transient if and only if R_eff(x, ∞) < ∞.

Proof This is immediate from Theorem 2.8 and the definitions of R_eff(x, ∞), transience, and recurrence.

Remark 2.12 The connection between the probabilistic property of transience or recurrence, and resistance to infinity, is striking. Its usefulness comes from two sources. First, R_eff has monotonicity properties which follow from its characterisation via variational problems, but are not evident from a probabilistic viewpoint. Second, there exist techniques of ‘circuit reduction’, which allow (some) complicated networks to be reduced to simple ones. These include the


well known formulae for calculating the effective resistance due to resistors in series or parallel, and the less familiar Y–∇ transform.

Examples (1) Z is recurrent since R^{FN}_eff({0}, [−n, n]^c) = ½(n + 1).
(2) For Z² one reduces effective resistance if one ‘shorts’ boxes around 0. (A precise mathematical justification for this operation will be given later in this chapter.) For x = (x_1, x_2) ∈ Z², define ‖x‖_∞ = max{|x_1|, |x_2|}, and note that S_n = {x : ‖x‖_∞ = n} is a square with centre 0 and side 2n. Each side of S_n contains 2n + 1 vertices, and so the total number of edges between S_n and S_{n+1} is 4(2n + 1). If we ‘short’ all the edges between vertices in S_n, for each n, then we obtain a new network Γ̃ with ‘vertices’ S_0, S_1, . . . , such that S_n is only connected to S_{n−1} and S_{n+1}. Using the formulae for resistors in parallel and series we have R̃^{FN}_eff(S_n, S_{n+1}) = ¼(2n + 1)^{−1}, and hence

R^{Z²}_eff(0, ∞) ≥ R̃_eff(S_0, ∞) = lim_{N→∞} R̃_eff(S_0, S_N) = lim_{N→∞} ¼ ∑_{k=0}^{N−1} (2k + 1)^{−1} = ∞.

(3) A similar argument shows that the triangular lattice is recurrent.
(4) Consider the graph (Γ, μ) with vertex set Z_+ and weights μ_{n,n+1} = a_n. Then R^{FN}_eff(0, n) = ∑_{r=0}^{n−1} a_r^{−1}, so that (Γ, μ) is transient if and only if ∑_n a_n^{−1} < ∞. In particular the graph considered in Examples 1.4(6) with a_n = α^n is transient if and only if α > 1.

Remark 2.13 Let A be finite, x_0 ∈ A, and consider the flow I associated with the potential g_A(·, x_0). Taking ϕ(x) = P^x(T_{x_0} < τ_A) as in Theorem 2.8 we have g_A(x, x_0) = ϕ(x) g_A(x_0, x_0). By (1.31) and Lemma 2.3 we have Div I(x_0) = μ_{x_0} Δg^A_{x_0}(x_0) = −1, so that I is a unit flow from A^c to x_0.
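The series formula in Example (4) can be cross-checked without assuming it: solve the Dirichlet problem (2.7) for the chain directly and recover R_eff as 1/Flux, as in (2.10). The little solver below is a sketch of ours in pure Python; the function names and the choice of α are illustrative, not from the text.

```python
# Cross-check of Example (4): for the chain on Z_+ with conductances a_r on the
# edges {r, r+1}, series resistors give R_eff(0, n) = sum_{r<n} 1/a_r.  Here we
# instead solve phi(0) = 0, phi(n) = 1, phi harmonic in between, and use
# R_eff = 1/Flux with Flux out of {0} equal to a_0 * phi(1).
def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(M[r][i]))
        M[i], M[p] = M[p], M[i]
        for r in range(i + 1, n):
            fac = M[r][i] / M[i][i]
            for c in range(i, n + 1):
                M[r][c] -= fac * M[i][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][c] * x[c] for c in range(i + 1, n))) / M[i][i]
    return x

def resistance_by_dirichlet(a, n):
    """R_eff(0, n) for the chain with edge conductances a[0..n-1]."""
    m = n - 1                              # unknowns phi(1), ..., phi(n-1)
    A = [[0.0] * m for _ in range(m)]
    b = [0.0] * m
    for i in range(m):
        k = i + 1                          # vertex index
        A[i][i] = a[k - 1] + a[k]
        if i > 0:
            A[i][i - 1] = -a[k - 1]
        if i < m - 1:
            A[i][i + 1] = -a[k]
        if k == n - 1:                     # boundary term from phi(n) = 1
            b[i] = a[n - 1]
    phi1 = solve(A, b)[0]
    return 1.0 / (a[0] * phi1)

for alpha in (0.5, 1.0, 2.0):
    n = 10
    a = [alpha ** r for r in range(n)]
    series = sum(1.0 / ar for ar in a)
    assert abs(resistance_by_dirichlet(a, n) - series) < 1e-8 * series
```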

2.3 Energy and Variational Methods

In the examples above we used shorting and cutting techniques to calculate resistance without justification. But facts like ‘shorts decrease resistance’


should be provable from the definition, so we will now go on to discuss in more detail the mathematical background to effective resistance. In addition, we wish to study infinite networks, where not surprisingly some additional care is needed.
Let (Γ, μ) be a weighted graph, and let B_0, B_1 ⊂ V with B_0 ∩ B_1 = ∅. Set

V(B_0, B_1) = { f ∈ H²(V) : f|_{B_1} = 1, f|_{B_0} = 0},    (2.14)

and consider the variational problem:

C_eff(B_0, B_1) = inf{ E(f, f) : f ∈ V(B_0, B_1) }.    (VP1)

We call Ceff (B0 , B1 ) the ‘effective conductance’ between B0 and B1 , and will FN (of course) find that it is the reciprocal of Reff (B0 , B1 ) in those cases when FN Reff (B0 , B1 ) has been defined. In general we may have Ceff (B0 , B1 ) = ∞. We call a function f a feasible function for (VP1) if it satisfies the constraints; that is, f ∈ V (B0 , B1 ). We write Ceff (x, y) for Ceff ({x}, {y}). Whenever we consider variational problems of this kind we will assume that B0 and B1 are non-empty. Remark 2.14 If B0 ∩ B1 = ∅ then V (B0 , B1 ) = ∅, so with the usual convention that inf ∅ = ∞ we can take Ceff (B0 , B1 ) = ∞. An immediate consequence of the definition is: Lemma 2.15 (a) Ceff (B0 , B1 ) = Ceff (B1 , B0 ). (b) Ceff (B0 , B1 ) < ∞ if either B0 or B1 is finite and B0 ∩ B1 = ∅. (c) C eff (B0 , B1 ) is increasing in each of B0 , B1 . Remark general.

If instead we had used H02 (V) in (2.14) then (a) would fail in

We will make frequent use of the following theorem from the theory of Hilbert spaces. Theorem 2.16 Let H be a Hilbert space, and A ⊂ H be a non-empty closed convex subset. (a) There exists a unique element g ∈ A with minimal norm. (b) If f n is any sequence in A with || f n || → ||g|| then f n → g in H . (c) If h ∈ H and there exists δ > 0 such that g + λh ∈ A for all |λ| < δ then h, g = 0.

46

Random Walks and Electrical Resistance

Proof (a) Let M = inf{|| f ||2 : f ∈ A}. Suppose g1 , g2 ∈ A with ||gi ||2 ≤ M + ε, where ε ≥ 0. As A is convex, h = 12 (g1 + g2 ) ∈ A, and so 4M ≤ 2h, 2h = ||g1 ||2 + 2g1 , g2 + ||g2 ||2 ≤ 2M + 2ε + 2g1 , g2 , which implies that −2g1 , g2 ≤ −2M + 2ε. So g1 − g2 , g1 − g2 ≤ 2(M + ε) − 2g1 , g2 ≤ 4ε,

(2.15)

and it follows that the minimal element is unique. (b) Let f n be a sequence in A with || f n ||2 → M, and set δn = || f n ||2 − M. Then by (2.15) we have || f n − f m ||2 ≤ 4(δn ∨ δm ), so that ( f n ) is Cauchy and converges to a limit f ; since || f ||2 = M, then f = g. (c) We have 0 ≤ g + λh, g + λh − g, g = 2λg, h + λ2 h, h, and as this holds for |λ| < δ we must have g, h = 0. Theorem 2.17

Suppose that C eff (B0 , B1 ) < ∞.

(a) There exists a unique minimiser f for (VP1). (b) f is a solution of the Dirichlet problem: f (x) = 0 on B0 , f (x) = 1 on B1 ,

(2.16)

f (x) = 0 on V − (B0 ∪ B1 ). Proof (a) Consider the Hilbert space H 2 (V) where we choose the base point x 0 ∈ B0 . Clearly V = V (B0 , B1 ) is a convex subset of the Hilbert space H 2 (V). Further, if f n ∈ V and f n → f in H 2 (V), then (since f n → f pointwise), f ∈ V , so V is closed. Hence, by Theorem 2.16 V has a unique element f which minimises g H 2 , and so E (g, g), over g ∈ V . (b) It is clear that f satisfies the boundary conditions in (2.16). Let x ∈ V − (B0 ∪ B1 ), and let h(y) = 1{x} (y). Then for any λ ∈ R the function f + λh ∈ V (B0 , B1 ), so by Theorem 2.16 E ( f, h) = 0. Since h ∈ C0 (V), 0 = E ( f, h) = f, h = μx f (x), so that f is harmonic in V − (B0 ∩ B1 ).

2.3 Energy and Variational Methods

47

Proposition 2.18 Let Vn ↑↑ V. Write Ceff (·, ·; Vn ) for effective conductance in the subgraph Vn . Then if B0 , B1 ⊂ V, lim C eff (B0 ∩ Vn , B1 ∩ Vn ; Vn ) = Ceff (B0 , B1 ). n

(2.17)

Proof Note first that Ceff (B0 ∩ Vn , B1 ∩ Vn ; Vn ) ≤ Ceff (B0 ∩ Vn , B1 ∩ Vn ; Vn+1 ) ≤ C eff (B0 ∩ Vn+1 , B1 ∩ Vn+1 ; Vn+1 ) ≤ Ceff (B0 ∩ Vn+1 , B1 ∩ Vn+1 ) ≤ Ceff (B0 , B1 ). The first and third inequalities hold because for any f ∈ C(V) we have EVn ( f, f ) ≤ EVn+1 ( f, f ) ≤ E ( f, f ), while the second and fourth are immediate by the monotonicity given by Lemma 2.15. Thus the limit on the left side of (2.17) exists, and we have proved the upper bound on this limit. Let ϕn be the optimal function for Ceff (B0 ∩ Vn , B1 ∩ Vn ; Vn ); we extend ϕn to V by taking it to be zero outside Vn . Let C = limn Ceff (B0 ∩ Vn , B1 ∩ Vn ; Vn ). If C = ∞ there is nothing to prove. Otherwise EVn (ϕn , ϕn ) are uniformly bounded, and by Proposition 1.21(h) there exists a subsequence n k such that ϕn k converges pointwise in V to a function ϕ with E (ϕ, ϕ) ≤ lim inf EVn (ϕn , ϕn ) = C. n

We have ϕ = k on Bk , k = 0, 1 and so ϕ is feasible for (VP1), and therefore C eff (B0 , B1 ) ≤ E (ϕ, ϕ) ≤ C, proving equality in (2.17). Remark 2.19 Since the function ϕ satisfies E (ϕ, ϕ) = Ceff (B0 , B1 ), ϕ is the unique minimiser of (VP1). Further, since any subsequence of ϕn will have a sub-subsequence converging to ϕ, we have ϕn → ϕ pointwise, and deduce that ϕ(x) = lim Px (TB1 ∩Vn < TB0 ∩Vn ). n

We will shortly introduce a second variational problem in terms of flows. To − → formulate this, we need to define a Hilbert space for flows. Let E be the set of oriented edges in V × V: − → E = {(x, y) : {x, y} ∈ E}. Given flows I and J write E(I, J ) =

1 2

x

y∼x

μ−1 x y I x y Jx y

48

Random Walks and Electrical Resistance

− → for the inner product in the space L 2 ( E , μ−1 ). We call E(I, I ) the energy dissipation of the current I . Note that if f, g ∈ H 2 then E(∇ f, ∇g) = E ( f, g). Lemma 2.20

(2.18)

Let I be a flow. Then ||divI ||22 ≤ 2E(I, I ).

Proof We have 2 2 −1/2 ||divI ||22 = μx μ−1 = μ−1 μx y I x y μ x y x Ix y x x

≤

y∼x

μ−1 x

x

Lemma 2.21 f ∈ L 2,

μx y

y∼x

x 2 μ−1 x y Ix y

y∼x

= 2E(I, I ).

y∼x

− → Let f ∈ C(V), and I be a flow. Then if I ∈ L 2 ( E ) and E(I, ∇ f ) ≤ E(I, I )1/2 E ( f, f )1/2 , −divI, f ≤ divI, divI

Further we have E(I, ∇ f ) = −

1/2

f, f

1/2

(2.19) .

(2.20)

divI (x) f (x)μx

(2.21)

x

in the following cases: (a) if either I or f has finite support, − → (b) if I ∈ L 2 ( E ) and f ∈ L 2 , − → (c) if I ∈ L 2 ( E ), divI ∈ C0 , and f ∈ H02 . Proof The inequalities (2.19) and (2.20) are immediate from Cauchy– Schwarz. (a) Suppose that either I or f has support in a finite set A. Set A1 = A. Then I x y ( f (y) − f (x)), E(I, ∇ f ) = 12 x∈A1 y∈A1

=

1 2

I x y f (y) −

x∈A1 y∈A1

=−

1 2

Ix y f (x)

x∈A1 y∈A

I x y f (x)

x∈A1 y∈A1

=−

x∈A1

f (x)divI (x)μx = −divI, f .

2.3 Energy and Variational Methods

49

(b) Let f n ∈ C0 with f n → f in L 2 . Thus f n → f in H 2 also. Then E(I, ∇ fn ) = −divI, f n

(2.22)

by (a), and using (2.19) and (2.20) we obtain (2.21). (c) Let f n ∈ C0 with f n → f in H 2 . Then again (2.22) holds and we have |E(I, ∇ f n − ∇ f )|2 ≤ E(I, I )E ( f − f n , f − f n ) → 0

as n → ∞,

so the left side of (2.22) converges to E(I, ∇ f ). Since f n → f pointwise, and divI has finite support, the right side of (2.22) also converges. Remark 2.22 See Remark 2.33 for examples which show that (2.21) can fail − → − → if f ∈ H 2 and I ∈ L 2 ( E ) with divI ∈ C 0 . In addition, if I ∈ L 2 ( E ) and f ∈ H02 then the sum on the right side of (2.21) may not converge absolutely. FN . The next result connects Ceff with our previous definition of Reff

Proposition 2.23 Let V be finite, and B0 ∩ B1 = ∅. (a) The unique minimiser for (VP1) is given by ϕ(x) = P x (TB1 < TB0 ). (b) The flow I = ∇ϕ satisfies Flux(I ; B0 ) =

Div I (x) = E (ϕ, ϕ) = Ceff (B0 , B1 ).

x∈B0 FN (c) Ceff (B0 , B1 ) = Reff (B0 , B1 )−1 .

Proof (a) By Theorem 2.17 a unique minimiser exists, and satisfies (2.16). Since ϕ also satisfies (2.16), and by Theorem 2.5 this Dirichlet problem has a unique solution, we deduce that ϕ is this minimiser. (b) Let I = ∇ϕ. Note that divI (x) ≥ 0 if x ∈ B0 , divI (x) ≤ 0 if x ∈ B1 , and divI (x) = 0 for x ∈ V − B. We apply Lemma 2.21 with f = ϕ and I = ∇ϕ, to obtain

50

Random Walks and Electrical Resistance E (ϕ, ϕ) = −

ϕ(x)divI (x)μx

x

=−

Div I (x) = − Flux(I ; B1 ) = Flux(I ; B0 ).

x∈B1 FN (c) By definition Reff (B0 , B1 ) = Flux(I ; B0 )−1 .

Definition 2.24

We now define for B0 , B1 ⊂ V, with B0 ∩ B1 = ∅, Reff (B0 , B1 ) = Ceff (B0 , B1 )−1 .

By Proposition 2.2.3(c), this agrees with our previous definition when V − FN . (B0 ∪ B1 ) is finite, and we will now drop the temporary notation Reff Lemma 2.25

For f ∈ H 2 , x, y ∈ V | f (x) − f (y)|2 ≤ E ( f, f )Reff (x, y).

(2.23)

Proof This is immediate from the definition of Reff via (VP1). If f (x) = f (y) there is nothing to prove; otherwise set g(·) =

f (·) − f (x) . f (y) − f (x)

Then g(x) = 0, g(y) = 1 so g is feasible for (VP1). Hence Reff (x, y)−1 ≤ E (g, g) = | f (y) − f (x)|−2 E ( f, f ); rearranging then proves (2.23). Definition 2.26 (‘Shorts’ and ‘cuts’) Let (, μ) be a weighted graph, and e0 = {x, y} be an edge in . The graph ( (c) , μ(c) ) obtained by cutting the edge e0 is the weighted graph with vertex set V, edge set E (c) = E − {e0 }, and weights given by μe , e = e0 , μ(c) e = 0, e = e0 . (Note that this graph may not be connected.) The graph ( (s) , μ(s) ) obtained by shorting the edge {x, y} is the weighted graph obtained by identifying the vertices x and y: that is, by collapsing the set {x, y} to a single point a, as in Definition 1.5(2). for effective Corollary 2.27 Let μ, μ be weights on , and write Reff , Reff resistance in the weighted graphs (, μ) and (, μ ).

2.3 Energy and Variational Methods


(a) If μ′ ≤ c1 μ then

C′eff(B0, B1) ≤ c1 Ceff(B0, B1),   R′eff(B0, B1) ≥ c1^{−1} Reff(B0, B1).

(b) Cuts increase Reff(B0, B1).
(c) Shorts decrease Reff(B0, B1).

Proof (a) Since E′ ≤ c1 E, this is immediate from the definition of Ceff. Now fix an edge e0 = {x, y}, and write Eλ(f, f) for the energy of f in the graph (Γ, μ^λ), where μ^λ_e = μ_e if e ≠ e0 and μ^λ_{e0} = λ. Write Ceff(·, ·, λ) for the effective conductance in the graph (Γ, μ^λ), and C^{(c)}_eff(·, ·) and C^{(s)}_eff(·, ·) for effective conductance in (Γ^{(c)}, μ^{(c)}) and (Γ^{(s)}, μ^{(s)}).
(b) We have C^{(c)}_eff(B0, B1) = Ceff(B0, B1, 0), and the result follows from (a).
(c) We continue with the notation from (a). Define F : V → V′ by taking F to be the identity on V − {x, y}, and F(x) = F(y) = a, and let B′_j = F(B_j) for j = 0, 1. If B′0 and B′1 are disjoint then

C^{(s)}_eff(B′0, B′1) = lim_{λ→∞} Ceff(B0, B1, λ),   (2.24)

so again the monotonicity follows from (a). If B′0 ∩ B′1 ≠ ∅ then the right side of (2.24) is infinite, and by Remark 2.14 we should also take the left side to be infinite.

We can also regard Ceff(B0, B1) as a capacity, or, following [Duf], as the modulus of a family of curves. The connection with capacity will be made in Chapter 7 – see Proposition 7.9. To see the connection with moduli, let P be a family of (finite) paths in V. For a path γ = (z0, z1, ..., zn) ∈ P and f ∈ C(V) set

Oγ(f) = Σ_{i=1}^n | f(z_i) − f(z_{i−1}) |,

and let F(P) = { f ∈ C(V) : Oγ(f) ≥ 1 for all γ ∈ P }. Define the modulus of P by M(P) = inf{ E(f, f) : f ∈ F(P) }. The quantity M(P)^{−1} is the extremal length of the family of curves P. (This concept has a long history in complex analysis – see for example [Ah].)
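The conductance Ceff(B0, B1) of (VP1) is easy to compute on small finite networks. The sketch below (illustrative Python, not part of the text; the function name is ours) solves the Dirichlet problem underlying (VP1) — ϕ = 0 on B0, ϕ = 1 on B1, ϕ harmonic elsewhere — and returns the energy of the minimiser.

```python
import numpy as np

def effective_conductance(n, edges, B0, B1):
    """C_eff(B0, B1) on a finite weighted graph with vertices 0..n-1.

    edges: dict {(x, y): mu_xy}.  Solves the Dirichlet problem of (VP1)
    and returns E(phi, phi) = sum_e mu_e (phi(x) - phi(y))^2.
    """
    L = np.zeros((n, n))                      # weighted graph Laplacian
    for (x, y), w in edges.items():
        L[x, x] += w; L[y, y] += w
        L[x, y] -= w; L[y, x] -= w
    phi = np.zeros(n)
    phi[list(B1)] = 1.0                       # boundary values
    free = [v for v in range(n) if v not in B0 and v not in B1]
    if free:
        fixed = sorted(set(B0) | set(B1))
        # harmonicity: L[free, free] @ phi_free = -L[free, fixed] @ phi_fixed
        A = L[np.ix_(free, free)]
        b = -L[np.ix_(free, fixed)] @ phi[fixed]
        phi[free] = np.linalg.solve(A, b)
    return sum(w * (phi[x] - phi[y]) ** 2 for (x, y), w in edges.items())

# three unit resistors in series: C_eff = 1/3
path = {(0, 1): 1.0, (1, 2): 1.0, (2, 3): 1.0}
assert abs(effective_conductance(4, path, {0}, {3}) - 1 / 3) < 1e-9
```

On a 4-cycle with unit weights the two 2-edge paths act in parallel, so the same routine returns Ceff = 1 between opposite corners.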


Proposition 2.28 Let B0, B1 be disjoint subsets of V and let P be the set of all finite paths with one endpoint in B0 and the other in B1. Then M(P) = Ceff(B0, B1).

Proof Let ϕ be an optimal solution for (VP1). Then if γ = (x0, ..., xn) is any path in P with x0 ∈ B0, xn ∈ B1,

Oγ(ϕ) = Σ_{i=1}^n |ϕ(x_i) − ϕ(x_{i−1})| ≥ Σ_{i=1}^n (ϕ(x_i) − ϕ(x_{i−1})) = ϕ(xn) − ϕ(x0) = 1.

So ϕ ∈ F(P), and hence M(P) ≤ E(ϕ, ϕ) = Ceff(B0, B1).
To prove the other inequality, let f ∈ F(P). Define

g(x) = min{ Oγ(f) : γ is a path from a point x0 ∈ B0 to x }.

Clearly g(x) = 0 if x ∈ B0, and (as f ∈ F(P)) g(x) ≥ 1 for x ∈ B1. Now let x ∼ y, and suppose that g(x) ≥ g(y). Then since a path to y can be extended by one step to give a path to x,

g(x) ≤ g(y) + | f(x) − f(y) |,

proving that |g(x) − g(y)| ≤ | f(x) − f(y)| whenever x ∼ y.
Let ψ = g ∧ 1; ψ is feasible for (VP1). Then from the above and Lemma 1.27,

Ceff(B0, B1) ≤ E(ψ, ψ) ≤ E(g, g) ≤ E(f, f),

and hence Ceff(B0, B1) ≤ M(P).

Now we consider a second variational problem, given by minimising the energy of flows in V. Recall from (2.5) the definition of Flux(I; A).

Definition 2.29 Let B0, B1 be disjoint non-empty subsets of V. Let I0(B0, B1) be the set of flows I with finite support such that

div I(x) = 0,  x ∈ V − (B0 ∪ B1),   (2.25)
Flux(I; B0) = 1,   (2.26)
Flux(I; B1) = −1.   (2.27)

Let Ī0(B0, B1) be the closure of I0(B0, B1) with respect to E(·, ·). By taking a unit flow along a finite path between x0 ∈ B0 and x1 ∈ B1 we see that I0(B0, B1) and Ī0(B0, B1) are non-empty. The conditions (2.26) and (2.27) are not preserved by L² convergence, so need not be satisfied by flows in Ī0(B0, B1); see the examples in Remark 2.33. It is easy to check that:


Lemma 2.30 Let I ∈ Ī0(B0, B1) and let In ∈ I0(B0, B1) with E(I − In, I − In) → 0. Then (In)_{xy} → I_{xy} for each x, y, and I satisfies (2.25). In addition, if B0 ∪ B1 is finite then I satisfies (2.26) and (2.27).

The variational problem is:

minimise E(I, I) over I ∈ Ī0(B0, B1).   (VP2)

As before, we say that a flow is feasible for (VP2) if it is in Ī0(B0, B1).

Theorem 2.31 (Thomson's principle) Let B0 and B1 be disjoint non-empty subsets of V. Then (VP2) has a unique minimiser I*, and

Reff(B0, B1) = E(I*, I*) = inf{ E(I, I) : I ∈ Ī0(B0, B1) }.   (2.28)

If Ceff(B0, B1) < ∞ let ϕ be the unique minimiser for (VP1). Then

I* = Reff(B0, B1) ∇ϕ.   (2.29)

If Ceff(B0, B1) = ∞ then I* = 0.

Proof Since Ī0 is closed and convex, a unique minimiser for (VP2) exists by Theorem 2.16. If J ∈ I0(B0, B1) then by Lemma 2.21(a)

E(J, ∇ϕ) = −⟨div J, ϕ⟩ = − Σ_{x∈B1} div J(x) μ_x = 1.

Hence if Ceff(B0, B1) < ∞ then 1 = E(J, ∇ϕ) ≤ E(J, J)^{1/2} E(ϕ, ϕ)^{1/2}, and therefore E(J, J) ≥ Reff(B0, B1). This last inequality is trivial if Ceff(B0, B1) = ∞. If J ∈ Ī0(B0, B1) then there exists a sequence Jn ∈ I0(B0, B1) with E(J − Jn, J − Jn) → 0. Since E(Jn, Jn) ≥ Reff(B0, B1) for each n, we have E(J, J) ≥ Reff(B0, B1), and we have proved that

Reff(B0, B1) ≤ inf{ E(J, J) : J ∈ Ī0(B0, B1) }.   (2.30)

Now let I* be defined by (2.29). Then since E(∇ϕ, ∇ϕ) = E(ϕ, ϕ),

E(I*, I*) = Reff(B0, B1)² E(ϕ, ϕ) = Reff(B0, B1).   (2.31)

It remains to prove that I* ∈ Ī0(B0, B1). Suppose first that V is finite. Then, by the computation in (2.31),

1 = E(I*, ∇ϕ) = −⟨div I*, ϕ⟩ = − Σ_{x∈B1} div I*(x) μ_x = − Flux(I*; B1).

Then Flux(I*; B0) = 1, and so I* ∈ I0(B0, B1).


Now suppose V is infinite. Let Vn ↑↑ V, and let In be the optimal flow for (VP2) between B0 ∩ Vn and B1 ∩ Vn in the graph Vn. Then each In ∈ I0(B0, B1), and by Proposition 2.18

E(In, In) = Reff(B0 ∩ Vn, B1 ∩ Vn; Vn) → Reff(B0, B1) = E(I*, I*).

By Theorem 2.16, E(In − I*, In − I*) → 0, and so I* ∈ Ī0(B0, B1).

Remark 2.32 (1) The two variational principles (VP1) and (VP2) allow one to estimate Reff(B0, B1) in cases when an exact calculation may not be possible. If f and I are feasible functions and flows for (VP1) and (VP2), respectively, then

E(f, f)^{−1} ≤ Reff(B0, B1) ≤ E(I, I).

It is usually easier to find a 'good' function f than a 'good' flow I.
(2) Let B be the rooted binary tree given in Examples 1.4, and recall that Bn is the set of 2^n points at distance n from the root o. Let I be the flow which sends a flux of 2^{−n} into each point x in Bn from its ancestor a(x). Then

E(I, I) = Σ_{n=1}^∞ Σ_{x∈Bn} (2^{−n})² = Σ_{n=1}^∞ 2^{−n} = 1.

Using the flow I we have

Reff(o, Bn) ≤ Σ_{k=1}^n 2^{−k} = 1 − 2^{−n}.

In fact equality holds, as can be seen by considering the function f(x) = (1 − 2^{−d(o,x)})/(1 − 2^{−n}).

Remark 2.33 The following examples of infinite networks show why in (VP2) we take the space of flows to be Ī0(B0, B1).
(1) Let V = {0, 1} × Z+, and let Γ = (V, E) be the subgraph of Z² induced by V; Γ is like a ladder. Since it is a subgraph of Z², Γ is recurrent by Corollary 2.39 in Section 2.4. Let Bj = {j} × Z+. If In is the flow which sends current 1/n from (0, k) to (1, k) for k = 1, ..., n, then E(In, In) = 1/n. The flows In converge to I* = 0, which is therefore in Ī0(B0, B1). Hence the infimum in (VP2) is zero, but there is clearly no minimising flow which satisfies the flux conditions (2.26) and (2.27).
(2) In the same graph, let J be the flow which sends current 1/n from (0, n) to (1, n) for each n ≥ 1. Then E(J, J) = Σ_n (1/n²) < ∞. If f = 1 then, since Γ is recurrent, we have f ∈ H²₀ by Theorem 2.36. Then since ∇f = 0 we have E(J, ∇f) = 0, while the sum in the expression for −⟨div J, f⟩ does not converge absolutely.
(3) Let Γ consist of two copies of a rooted binary tree, denoted V0, V1, with roots x0 and x1, connected by a path A = (x0, y1, y2, x1) of length 3. Then if Bj = {xj} it is clear that Ceff(B0, B1) = 1/3. However, one can build a flow I (going from x0 to infinity in V0, coming from infinity to x1 in V1) with E(I, I) = 2. Of course I is not in Ī0. Let f be the function which equals 0 on V0 and 1 on V1, and is linear between x0 and x1. Then I and ∇f have disjoint supports, but

−⟨div I, f⟩ = − Σ_{i=0}^1 Div I(x_i) f(x_i) = −Div I(x1) = −1,

so that (2.21) fails. This example suggests that to handle flows on general infinite graphs one needs to impose the correct boundary behaviour at infinity. For more on this see [LP, So2, Z].
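The binary tree computation of Remark 2.32(2) can be checked with exact arithmetic. The sketch below (illustrative, not from the text) evaluates Reff(o, Bn) by the series/parallel recursion R_n = (1 + R_{n−1})/2, and compares it with the energy of the flow truncated at level n; both equal 1 − 2^{−n}.

```python
from fractions import Fraction

def tree_resistance(n):
    """R_eff(o, B_n) on the rooted binary tree with unit conductances.

    From the root, the two subtrees are identical branches in parallel,
    each consisting of one unit edge in series with a copy of the
    (n-1)-level problem: R_n = (1 + R_{n-1}) / 2, R_0 = 0.
    """
    R = Fraction(0)
    for _ in range(n):
        R = (1 + R) / 2
    return R

def flow_energy(n):
    """Energy of the Remark 2.32(2) flow restricted to the first n levels:
    flux 2^{-k} on each of the 2^k edges into level k."""
    return sum(Fraction(2) ** k * Fraction(1, 2 ** k) ** 2
               for k in range(1, n + 1))

for n in range(1, 8):
    assert tree_resistance(n) == flow_energy(n) == 1 - Fraction(1, 2 ** n)
```

Letting n → ∞ in either expression recovers E(I, I) = 1, i.e. Reff(o, ∞) = 1 for the binary tree.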

2.4 Resistance to Infinity

We now take B1 to be a finite set, and B0 to be 'infinity'. We consider the two variational problems

C(B1, ∞) = inf{ E(f, f) : f|_{B1} = 1, f ∈ H²₀ }   (VP1′)

and

R(B1, ∞) = inf{ E(I, I) : Flux(I; B1) = −1, div I(x) = 0 for x ∈ V − B1 }.   (VP2′)

Set

ϕ(x) = P^x(T_{B1} < ∞).   (2.32)

Recall from Definition 2.9 the definition of Reff(B1, ∞). We set Ceff(B1, ∞) = Reff(B1, ∞)^{−1}.

Proposition 2.34 (a) The function ϕ ∈ H²₀(V), and attains the minimum in (VP1′). We have E(ϕ, ϕ) = C(B1, ∞) = Ceff(B1, ∞).
(b) If f ∈ H²₀ with f = 1 on B1 then

E(ϕ, f) = −⟨Δϕ, 1_{B1}⟩ = Reff(B1, ∞)^{−1}.   (2.33)


Proof (a) By Theorem 2.16 there exists a unique minimiser ϕ̃ for (VP1′). Let ϕ̃n ∈ C0 with ϕ̃n → ϕ̃ in H². Then E(ϕ̃n, ϕ̃n) → C(B1, ∞). Since B1 is finite and the ϕ̃n converge pointwise to ϕ̃, we can assume that ϕ̃n = 1 on B1. Choose x0 ∈ B1, and let An = supp(ϕ̃k, k ≤ n) and Vn = B(x0, n) ∪ An, so that Vn ↑↑ V. Let ϕn = P^x(T_{B1} < τ_{Vn}). By Proposition 2.23, ϕn is the minimiser for (VP1) for the sets ∂Vn and B1 in the finite graph Vn ∪ ∂Vn. Since ϕ̃n is feasible for this problem, we have E(ϕn, ϕn) ≤ E(ϕ̃n, ϕ̃n). Since also ϕn is feasible for (VP1′),

E(ϕ̃, ϕ̃) = C(B1, ∞) ≤ E(ϕn, ϕn) ≤ E(ϕ̃n, ϕ̃n).

By Theorem 2.16 we have ϕn → ϕ̃ in H²₀ and therefore pointwise. However, we also have ϕn(x) ↑ ϕ(x) for all x, so ϕ = ϕ̃ and ϕ ∈ H²₀. By the definition of Ceff(B1, ∞) we have

Ceff(B1, ∞) = lim_n Ceff(B1, Vn^c) = lim_n E(ϕn, ϕn) = E(ϕ, ϕ),

completing the proof of (a).
(b) Since B1 is finite, using Proposition 2.23(b),

−⟨Δϕ, 1_{B1}⟩ = lim_n −⟨Δϕn, 1_{B1}⟩ = lim_n Ceff(B1, Vn^c) = Ceff(B1, ∞).

Let f ∈ H²₀ with f = 1 on B1, and let fn ∈ C0(V) with fn → f in H²(V); as in (a) we can assume that fn = 1 on B1. Then

E(ϕ, fn) = −⟨Δϕ, fn⟩ = −⟨Δϕ, 1_{B1}⟩,

and letting n → ∞ gives (2.33).

Lemma 2.35 Let I be any feasible flow for (VP2′). Then E(I, I) ≥ Reff(B1, ∞).

Proof Let Vn ↑↑ V, and let ϕn be as in Proposition 2.34. By Lemma 2.21,

E(I, I)^{1/2} E(ϕn, ϕn)^{1/2} ≥ E(I, ∇ϕn) = −⟨div I, ϕn⟩ = − Flux(I; B1) = 1,

so E(I, I) ≥ Reff(B1, Vn^c).

We can now summarise the behaviour of these variational problems; it is easier to deal separately with the recurrent and transient cases.

Theorem 2.36 Let (Γ, μ) be recurrent, and B1 ⊂ V be finite. Then:
(a) 1 ∈ H²₀,
(b) H²₀ = H²,
(c) Reff(B1, ∞) = ∞,
(d) R(B1, ∞) = ∞.

Proof As ϕ ≡ 1, (a) is immediate from Proposition 2.34, (b) follows by Proposition 1.29, (c) holds by Theorem 2.11, and (d) follows from Lemma 2.35.

Theorem 2.37 Let (Γ, μ) be transient, and B1 ⊂ V be finite. Then:
(a) the function ϕ given by (2.32) attains the minimum in (VP1′),
(b) E(ϕ, ϕ) = Ceff(B1, ∞) = Reff(B1, ∞)^{−1},
(c) Reff(B1, ∞) < ∞,
(d) 1 ∉ H²₀,
(e) I* = Reff(B1, ∞)∇ϕ attains the minimum in (VP2′), and E(I*, I*) = R(B1, ∞) = Reff(B1, ∞) < ∞.

Proof (a) and (b) have already been proved in Proposition 2.34, and (c) was proved in Theorem 2.8. For (d), if 1 ∈ H²₀ then the infimum in (VP1′) would be zero, which contradicts (c). For (e) we have

Flux(I*; B1) = Σ_{x∈B1} div I*(x) μ_x = Reff(B1, ∞) ⟨Δϕ, 1_{B1}⟩ = −1,

by Proposition 2.34. Thus I* is feasible for (VP2′). Since E(I*, I*) = Reff(B1, ∞)² E(ϕ, ϕ) = Reff(B1, ∞), by Lemma 2.35 the flow I* attains the minimum in (VP2′).

Corollary 2.38 (See [Ly1]) (Γ, μ) is transient if and only if there exist x1 ∈ V and a flow J from x1 to infinity with finite energy.

Proof Theorem 2.36 and Theorem 2.37 together imply that (Γ, μ) is transient if and only if R(B1, ∞) < ∞ for every finite set B1 ⊂ V.

Corollary 2.39 Let Γ′ be a subgraph of Γ.
(a) If Γ′ is transient then Γ is transient.
(b) If Γ is recurrent then Γ′ is recurrent.

Proof (a) is clear from Corollary 2.38 since if J is a flow from x1 to infinity with finite energy in Γ′ then J is also such a flow in Γ. (b) follows immediately from (a).
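Corollary 2.39 has the same monotone flavour as Corollary 2.27(b): deleting an edge can only increase effective resistance. A small finite check (an illustrative sketch, not from the text; it uses the standard Laplacian-pseudoinverse formula Reff(x, y) = (1_x − 1_y)ᵀ L⁺ (1_x − 1_y)):

```python
import numpy as np

def reff(n, edges, x, y):
    """Effective resistance R_eff(x, y) via the Laplacian pseudoinverse."""
    L = np.zeros((n, n))
    for (a, b), w in edges.items():
        L[a, a] += w; L[b, b] += w
        L[a, b] -= w; L[b, a] -= w
    G = np.linalg.pinv(L)
    return G[x, x] + G[y, y] - 2 * G[x, y]

# unit 4-cycle: two 2-edge paths between opposite corners in parallel
square = {(0, 1): 1.0, (1, 2): 1.0, (2, 3): 1.0, (3, 0): 1.0}
cut = {e: w for e, w in square.items() if e != (2, 3)}  # cut one edge

r_before = reff(4, square, 0, 2)   # 2 || 2 = 1
r_after = reff(4, cut, 0, 2)       # only the path 0-1-2 remains: 2
assert r_after > r_before          # cuts increase R_eff
```

The same monotonicity, read through Corollary 2.38, is exactly why a flow of finite energy in a subgraph certifies transience of the larger graph.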


The following criterion often provides an easy way to prove recurrence.

Corollary 2.40 (Nash–Williams criterion for recurrence [NW]) Let (Γ, μ) be an infinite weighted graph. Suppose there exist Ak ↑↑ V with Ak ⊂ Ak+1 such that

Σ_{k=1}^∞ Reff(Ak, Ak^c) = ∞.

Then Γ is recurrent.

Proof Let Ek be the set of edges e = {x, y} with x ∈ Ak and y ∈ Ak^c. The condition on the Ak implies that the edge sets Ek are disjoint. We have

Reff(Ak, Ak^c)^{−1} = Ceff(Ak, Ak^c) = Σ_{e∈Ek} μe.

If I is any unit flow from A1 to infinity then Σ_{e∈Ek} |I(e)| ≥ 1. So, by Cauchy–Schwarz,

1 ≤ Σ_{e∈Ek} |Ie| μe^{−1/2} μe^{1/2} ≤ ( Σ_{e∈Ek} Ie² μe^{−1} )^{1/2} ( Σ_{e∈Ek} μe )^{1/2} = ( Σ_{e∈Ek} Ie² μe^{−1} )^{1/2} (Reff(Ak, Ak^c))^{−1/2}.

Hence

E(I, I) ≥ Σ_k Σ_{e∈Ek} μe^{−1} Ie² ≥ Σ_k Reff(Ak, Ak^c) = ∞,

and so Γ is recurrent by Corollary 2.38.

Examples 2.41 (1) Z^d, d ≥ 3, is transient. Since Z³ can be regarded as a subgraph of Z^d if d > 3, it is enough to prove this for d = 3. Note first that a continuum approximation suggests that a finite energy flow from zero to infinity should exist. Let I be the symmetric flow on R³ from the unit ball to infinity. Then, at a distance r from the origin, I has density (4πr²)^{−1}, so the energy of the flow I is:

∫_1^∞ (4πr²)^{−2} 4πr² dr = (4π)^{−1} ∫_1^∞ r^{−2} dr < ∞.


To prove that Z³ is transient we exhibit the flow in Z³₊ given in [DS]. For (x, y, z) ∈ Z³₊ set A = x + y + z, and let

I_{(x,y,z),(x+1,y,z)} = (x + 1) / ( (A + 1)(A + 2)(A + 3) ),

with I_{(x,y,z),(x,y+1,z)} and I_{(x,y,z),(x,y,z+1)} defined in a symmetric fashion. Then it is straightforward to verify that div I = 0 except at 0, and E(I, I) < ∞.
(2) (Feller's problem.) The following problem is given on p. 425 of [Fe]: 'Let Γ = (G, E) be a recurrent graph. Prove that any subgraph of Γ is also recurrent.' Discussions following Peter Doyle's solution of this problem using resistance techniques led to the book [DS]. The solution is given by Corollary 2.39. Note, however, that we do not need the full strength of results such as this: one can also give a simple argument using just Corollary 2.27 and Theorem 2.11.
(3) In Remark 2.32(2) we constructed a flow I on the rooted binary tree with E(I, I) = 1; it follows that the binary tree is transient.

The following reformulation of Theorem 2.37, using the language of Sobolev inequalities rather than resistance, is useful and connects with the ideas in Chapter 3.

Theorem 2.42 Let x ∈ V. (Γ, μ) is transient if and only if there exists c1 = c1(x) < ∞ such that

| f(x) |² ≤ c1 E(f, f)  for all f ∈ C0(V).   (2.34)

Proof Suppose first that (Γ, μ) is transient. Let f ∈ C0(V) ⊂ H²₀(V). If f(x) = 0 there is nothing to prove. Otherwise, set g = f / f(x). Then g is feasible for (VP1′), so

E(f, f) = f(x)² E(g, g) ≥ f(x)² Ceff(x, ∞).

If (2.34) holds, let ϕn be as in the proof of Proposition 2.34 (with B1 = {x}). Then E(ϕn, ϕn) ≥ c1^{−1}, so E(ϕ, ϕ) = lim_n E(ϕn, ϕn) ≥ c1^{−1}.

As remarked above, it is usually easier to find a good feasible function for (VP1′) than a good feasible flow for (VP2′). It is therefore of interest to find other ways of bounding Reff(·, ·) from above. One method is described in [Bov, Sect. 6]. Let A be a set, and let q_e^a, a ∈ A, e ∈ E, satisfy

q_e^a ≥ 0,   Σ_a q_e^a ≤ 1.

Set μ_e^a = q_e^a μ_e, and let E^a be the quadratic form associated with μ^a. Write H for the set of functions in C0(V) which are feasible for (VP1′). Then

C(B1, ∞) = inf_{f∈H} E(f, f) ≥ inf_{f∈H} Σ_a E^a(f, f) ≥ Σ_a inf_{f∈H} E^a(f, f).

If one can choose the q_e^a so that the final term can be calculated, or estimated, then this gives a lower bound on C(B1, ∞) and hence an upper bound on Reff(B1, ∞). Note that this generalises the method of cuts: cutting the edge e is equivalent to taking A = {a} and q_e^a = 0, q_{e′}^a = 1 for e′ ≠ e.

We now describe another method, the technique of unpredictable paths introduced in [BPP]. Fix a vertex o ∈ V, and let Π be the set of paths π = (o, π1, π2, ...) from o to infinity with the property that d(o, πn) = n for all n ≥ 1. Let P be a probability measure on Π, and let π and π′ be independent paths chosen according to P. We regard π, π′ as sets of edges.

Exercise Prove that Π is non-empty.

Theorem 2.43 (See [BPP], Sect. 2) Suppose that

E Σ_e μ_e^{−1} 1_{(e ∈ π ∩ π′)} < ∞.   (2.35)

Then (Γ, μ) is transient.

Proof We construct a flow I as follows. Let e = {x, y} be an edge, and assume that d(o, x) ≤ d(o, y). If d(o, x) = d(o, y) we take I_{xy} = 0, and otherwise we set I_e = I_{xy} = P(e ∈ π). It is easy to check that I is a flow from o to infinity. (Each path in Π gives a unit flow from o to infinity, and I is the flow obtained by averaging these with respect to the probability measure P.) Then

E(I, I) = Σ_{e∈E} μ_e^{−1} I_e² = Σ_{e∈E} μ_e^{−1} E(1_{(e∈π)}) E(1_{(e∈π′)}) = E Σ_{e∈E} μ_e^{−1} 1_{(e∈π∩π′)}.

Thus I has finite energy and so (Γ, μ) is transient.

If Γ has natural weights then the expression in (2.35) is just the mean number of edges in both π and π′.

Example 2.44 We can use this method to give another proof of the transience of Z³. Let θ be chosen uniformly on the sphere S², and let Rθ be the ray from zero to infinity in the direction θ.


Then there exists C such that we can define a path π = π(θ) from zero to infinity in Z³ in the direction θ so that for each n we have inf_{r∈Rθ} |r − πn| ≤ C. (In fact we can take C = 1.) It follows that |π(θ) ∩ π(θ′)| ≤ c d_S(θ, θ′)^{−1}, where d_S is the usual metric on S². The random variable d_S(θ, θ′) has a density f(t) on [0, π] with f(t) ≤ ct. Hence

E|π ∩ π′| ≤ ∫_0^π c t^{−1} f(t) dt ≤ c′,

and we deduce that Z³ is transient.

Exercise 2.45 Use a similar construction to show that there exists C < ∞ such that if x, y ∈ Z^d, with d ≥ 3, then Reff(x, y) ≤ C.
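Exercise 2.45 can be explored numerically (an illustrative sketch, not a proof): computing Reff(0, e1) in growing finite boxes of Z³ gives values that decrease as the box grows — each box is a subgraph of the next, so Rayleigh monotonicity applies — while staying bounded away from 0, consistent with a uniform bound C.

```python
import numpy as np
from itertools import product

def box_reff(m):
    """R_eff(0, e_1) in the box [-m, m]^3 of Z^3 with unit conductances,
    via the Laplacian pseudoinverse."""
    pts = list(product(range(-m, m + 1), repeat=3))
    idx = {p: i for i, p in enumerate(pts)}
    n = len(pts)
    L = np.zeros((n, n))
    for p in pts:
        for d in range(3):              # add the three positive-direction edges
            q = list(p); q[d] += 1; q = tuple(q)
            if q in idx:
                i, j = idx[p], idx[q]
                L[i, i] += 1; L[j, j] += 1
                L[i, j] -= 1; L[j, i] -= 1
    G = np.linalg.pinv(L)
    i, j = idx[(0, 0, 0)], idx[(1, 0, 0)]
    return G[i, i] + G[j, j] - 2 * G[i, j]

# resistances decrease with m and stay bounded below, as transience suggests
vals = [box_reff(m) for m in (1, 2, 3)]
assert vals[0] >= vals[1] >= vals[2] > 0
```

By translation invariance the same computation bounds Reff(x, y) for any pair of neighbours, and chaining along a path handles general x, y.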

2.5 Traces and Electrical Equivalence

A standard result in electrical network theory is that n unit resistors in series can be replaced by a single resistor of strength n. In this subsection we will see how circuit reduction results of this kind can be formulated and proved in our framework.

Let (V, E, μ) be a weighted graph. Let D ⊂ V be finite, and let B = V − D. If f ∈ C(B) let Z_D f denote the function in C(V) given by

Z_D f(x) = f(x) if x ∈ B,  Z_D f(x) = 0 if x ∈ D.

Recall from (1.16) the definition of E_B and set H²(B) = { f ∈ C(B) : E_B(f, f) < ∞ }. Since D is finite, Z_D f ∈ H²(V) if and only if f ∈ H²(B). For f ∈ H²(B) set

Ẽ_B(f, f) = inf{ E(u, u) : u ∈ H²(V), u|_B = f }.   (2.36)

Remark 2.46 We could consider infinite D, but all our applications just use finite sets. See [Kum] for a more general situation.

Proposition 2.47 The infimum in (2.36) is attained by a unique function. Writing H_B f for this function, so that

Ẽ_B(f, f) = E(H_B f, H_B f),  f ∈ H²(B),

H_B f is the unique solution to the Dirichlet problem

H_B f = f on B,  Δ H_B f(x) = 0 on D.   (2.37)


Consequently the map H_B : H²(B) → C(V) is linear, and

H_B f(x) = E^x f(X_{τ_D}).   (2.38)

Further, Ẽ_B(f, f) has the Markov property: Ẽ_B(1 ∧ f⁺, 1 ∧ f⁺) ≤ Ẽ_B(f, f) for f ∈ H²(B).

Proof The existence and uniqueness of the minimiser is immediate from Theorem 2.16. The proof that H_B f satisfies the Dirichlet problem (2.37) is exactly as in Theorem 2.17(b); uniqueness follows since D is finite, and this also proves (2.38). This characterisation implies that H_B is linear. To prove the Markov property, let f ∈ H²(B) and g = 1 ∧ f⁺. If u|_B = f then v = 1 ∧ u⁺ satisfies v|_B = g. The Markov property for E (see Lemma 1.27) implies that E(v, v) ≤ E(u, u), and it follows that Ẽ_B(g, g) ≤ Ẽ_B(f, f).

We now define Ẽ_B(f, g) = E(H_B f, H_B g) for f, g ∈ H²(B). We call Ẽ_B the trace of E on B – see [FOT], Sect. 6.2. Let E(B) be the set of edges e ∈ E such that both endpoints are in B. Let E⁺_D(B) be the set of {x, y} in B with x ≠ y such that there exists a path γ = (x = z0, z1, ..., zk = y) in Γ with z_i ∈ D, i = 1, ..., k − 1. That is, x and y can be connected by a path which lies in D, except for its endpoints. Let E_D(B) = E(B) ∪ E⁺_D(B).

Proposition 2.48 There exist (symmetric) conductances μ̃_xy such that, for f, g ∈ H²(B),

Ẽ_B(f, g) = ½ Σ_{x∈B} Σ_{y∈B} ( f(x) − f(y) )( g(x) − g(y) ) μ̃_xy.   (2.39)

We have

μ̃_xy = μ_x P^x(X_{T_B^+} = y)  if x ≠ y.   (2.40)

In particular μ̃_xy = μ_xy unless x, y ∈ ∂_i B.

Proof Ẽ_B is a symmetric bilinear form, and Ẽ_B(f, f) ≥ 0 for all f. Further, for f ∈ L²(B),

Ẽ_B(f, f) ≤ E(Z_D f, Z_D f) ≤ 2 ||Z_D f||₂² = 2 || f ||²_{L²(B)}.


Since Ẽ_B has the Markov property, by Theorem A.33 there exist (μ̃_xy) with μ̃_xy ≥ 0 for x ≠ y, and k_x ≥ 0, such that, for f, g ∈ L²(B),

Ẽ_B(f, g) = ½ Σ_{x∈B} Σ_{y∈B} ( f(x) − f(y) )( g(x) − g(y) ) μ̃_xy + Σ_{x∈B} f(x) g(x) k_x.   (2.41)

Let x ∈ B. Then for sufficiently large m we have D ⊂ B(x, m), and so H_B 1_{B(x,m)} = 1 on B(x, m − 1). Then as H_B 1_x has support contained in {x} ∪ D,

Ẽ_B(1_x, 1_{B(x,m)}) = E(H_B 1_x, H_B 1_{B(x,m)}) = 0.

Consequently we have k_x = 0, and we obtain the representation (2.39). Now let x, y ∈ B, with x ≠ y. Then

μ̃_xy = −Ẽ_B(1_x, 1_y) = −E(H_B 1_x, H_B 1_y) = ⟨Δ H_B 1_y, H_B 1_x⟩.

Since H_B 1_x = 0 on B − {x}, while Δ H_B 1_y = 0 on D,

μ̃_xy = μ_x Δ H_B 1_y(x) = Σ_{z∼x} μ_xz H_B 1_y(z) = Σ_{z∼x} μ_xz P^z(X_{T_B} = y),

and (2.40) then follows. Finally, since μ̃_xy = μ_xy except for finitely many pairs (x, y), the representation (2.39) also holds for f, g ∈ H²(B).

Note that Proposition 2.48 does not specify μ̃_xx. Summing (2.40) we have

μ_x P^x(X_{T_B^+} = x) = μ_x − Σ_{y∈B, y≠x} μ̃_xy.

It is therefore convenient to take

μ̃_xx = μ_x P^x(X_{T_B^+} = x),

so that Σ_{y∈B} μ̃_xy = μ_x. We then have that (B, E_D(B), μ̃) is a weighted graph.

Definition 2.49 We say that the two networks (V, E, μ) and (B, E_D(B), μ̃) are electrically equivalent on B. The intuition for this definition is that an electrical engineer who is only able to access the vertices in B (imposing voltages and measuring currents there) would be unable to distinguish between the two networks. Writing Γ̃ for the new network, Proposition 2.50 gives that resistances between any sets outside D are the same in the old network Γ and the new network Γ̃; hence if it is simpler we can do the calculations in the new network Γ̃.


Proposition 2.50 Let B0, B1 be disjoint non-empty subsets of B, and let B2 be a finite subset of B. Write R̃eff(B0, B1) and R̃eff(B2, ∞) for effective resistances in (B, E_D(B), μ̃). Then

R̃eff(B0, B1) = Reff(B0, B1),   R̃eff(B2, ∞) = Reff(B2, ∞).

Proof To prove the first equality, let f be the function which attains the minimum for (VP1) in the network (Γ, μ), and set g = f|_B. Then since D ∩ (B0 ∪ B1) = ∅, f is harmonic in D and thus f = H_B g. Then

C̃eff(B0, B1) ≤ Ẽ_B(g, g) = E(f, f) = Reff(B0, B1)^{−1}.

On the other hand, let u ∈ H²(B) be the function which attains the minimum for (VP1) in (B, E_D(B), μ̃). Then, since H_B u is feasible for (VP1) in (Γ, μ),

C̃eff(B0, B1) = Ẽ_B(u, u) = E(H_B u, H_B u) ≥ Ceff(B0, B1).

The second equality is proved in the same way.

Example 2.51 (Resistors in series) Suppose Γ contains a subset A = {x0, x1, ..., xn} with the property that for i = 1, ..., n − 1 the only neighbours of x_i are x_{i−1} and x_{i+1}. Let

μ_{x_{i−1}, x_i} = R_i^{−1},  i = 1, ..., n.

A standard procedure in electrical network theory is to replace the resistors in A with a single wire (i.e. edge) with resistance R = Σ_{i=1}^n R_i. This can be justified using Proposition 2.50 as follows. Set D = {x1, ..., x_{n−1}}, and take B = V − D. If g ∈ H²(B) then it is straightforward to verify that the unique solution to the Dirichlet problem (2.37) is given by

H_B g(x_k) = g(x0) + ( g(xn) − g(x0) ) ( Σ_{i=1}^k R_i ) / ( Σ_{i=1}^n R_i ).

So, with R as above and taking g(x0) = 0, g(xn) = 1,

Σ_{k=1}^n R_k^{−1} ( H_B g(x_{k−1}) − H_B g(x_k) )² = Σ_{k=1}^n R_k R^{−2} = R^{−1},

and thus μ̃_{x0,xn} = μ_{x0,xn} + R^{−1}. (Note that we did not exclude the possibility that there was already an edge between x0 and xn.)
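The hitting-probability formula (2.40) can be checked directly for a pure series chain (V = A, B = {x0, xn}): the trace weight μ̃_{x0,xn} = μ_{x0} P^{x0}(X_{T_B^+} = xn) equals R^{−1}. A small numerical sketch (illustrative; the specific resistances are arbitrary):

```python
import numpy as np

R = [2.0, 3.0, 5.0]                # series resistances R_1..R_n (illustrative)
c = [1.0 / r for r in R]           # edge conductances mu_{x_{i-1}, x_i}
n = len(R)

# h(x_k) = P^{x_k}(hit x_n before x_0): harmonic at interior vertices
A = np.zeros((n - 1, n - 1)); b = np.zeros(n - 1)
for k in range(1, n):              # interior vertex x_k, row k-1
    mu = c[k - 1] + c[k]           # mu_{x_k}
    A[k - 1, k - 1] = 1.0
    if k - 2 >= 0:
        A[k - 1, k - 2] = -c[k - 1] / mu   # neighbour x_{k-1}
    if k < n - 1:
        A[k - 1, k] = -c[k] / mu           # neighbour x_{k+1}
    if k == n - 1:
        b[k - 1] += c[k] / mu              # neighbour x_n has h = 1
h = np.linalg.solve(A, b)          # h(x_1), ..., h(x_{n-1})

# (2.40): mu~_{x0,xn} = mu_{x0} P^{x0}(X_{T_B^+} = x_n) = c_1 * h(x_1)
mu_trace = c[0] * h[0]
assert abs(mu_trace - 1.0 / sum(R)) < 1e-12   # equivalent single resistor R
```

The computed h also reproduces the 'gambler's ruin with resistances' formula h(x_k) = (Σ_{i≤k} R_i)/R, matching the harmonic extension H_B g above.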


Remark 2.52 (Resistors in parallel) In electrical network theory one combines resistances in parallel by adding their conductances. In this book we have not treated graphs with multiple edges. If we did allow these, then the process X and Dirichlet form E would not be changed by replacing two edges between vertices x and y, with conductances μ1 and μ2, by a single edge with conductance μ1 + μ2.

Example 2.53 (Y–∇ or Wye–Delta transform; see Figure 2.1) Let Γ be a network, and suppose V contains a set A = {a1, a2, a3, b} such that the only neighbours of b are a1, a2, a3, and

μ_{a_j, b} = β_j,  j = 1, 2, 3.

Assume for simplicity that there is no edge between a_i and a_j for i ≠ j. Let D = {b}. If we remove D we obtain a network where there are now edges on the triangle {a1, a2, a3}. Let α_ij = μ̃_{a_i, a_j}, i ≠ j. Then

α_ij = β_i β_j / (β1 + β2 + β3).   (2.42)

Equivalently, if i, j, k are distinct then

β_k = ( α12 α23 + α23 α31 + α31 α12 ) / α_ij.

To prove (2.42) take f = 1_{a1}, g = 1_{a2}. Then, writing S = β1 + β2 + β3,

H_B f(b) = Σ_j β_j f(a_j) / S = β1 / S,

and similarly H_B g(b) = β2 / S. Then

−α12 = E(H_B f, H_B g) = Σ_{i=1}^3 β_i ( f(a_i) − f(b) )( g(a_i) − g(b) ) = −β1 g(b) − β2 f(b) + f(b) g(b) S = −β1 β2 / S.

Figure 2.1. The ∇–Y transform.
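Formula (2.42) can be verified numerically by checking electrical equivalence on B = {a1, a2, a3}: the star and the triangle it produces have the same effective resistance between every terminal pair. An illustrative sketch (the β values are arbitrary):

```python
import numpy as np

def reff(n, edges, x, y):
    """Effective resistance via the Laplacian pseudoinverse."""
    L = np.zeros((n, n))
    for (a, b), w in edges.items():
        L[a, a] += w; L[b, b] += w
        L[a, b] -= w; L[b, a] -= w
    G = np.linalg.pinv(L)
    return G[x, x] + G[y, y] - 2 * G[x, y]

b1, b2, b3 = 2.0, 3.0, 5.0        # star conductances beta_j (illustrative)
S = b1 + b2 + b3
star = {(0, 3): b1, (1, 3): b2, (2, 3): b3}   # vertex 3 is the centre b
delta = {(0, 1): b1 * b2 / S,                  # alpha_ij from (2.42)
         (1, 2): b2 * b3 / S,
         (0, 2): b1 * b3 / S}

for x, y in [(0, 1), (1, 2), (0, 2)]:
    assert abs(reff(4, star, x, y) - reff(3, delta, x, y)) < 1e-9
```

For instance Reff(a1, a2) = 1/β1 + 1/β2 in the star, which with these values is 5/6, and the triangle gives the same number.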


Interchanging 'new' and 'old' networks, this transformation also allows a triangle in the original network to be replaced by a 'Y' in the new one. A minor modification allows us to extend this to the case when a_i ∼ a_j: use the previous example to first add an extra vertex a_ij between a_i and a_j, then apply the Y–∇ transform, and then remove a_ij. We shall see in the following section an application of this transform to the computation of effective resistances on the Sierpinski gasket graph.

Remark Any planar network can be reduced to a single resistor by a sequence of these three transformations – that is, the Y–∇ transform, and the formulae for resistors in series and parallel – see [Tru].

The 'trace operation' on the Dirichlet form E corresponds to time changing X so that the time spent in D is cut out. We continue to assume that D is finite, and B = V − D. Define the stopping times

T0 = 0,  T_{k+1} = min{ n ≥ T_k + 1 : X_n ∈ B }.

(If X starts in B, hits D for the first time at time n, and then stays in D until it hits B again at time n + m, we would have T_k = k for k = 0, ..., n − 1, and T_n = n + m.) Set

X̃_k = X_{T_k},  k ≥ 0.

Note that if X̃_0 = x ∈ ∂_i B we will have X̃_1 = x with positive probability, so jumps from a point to itself are possible for X̃.

Theorem 2.54 Let D, B, μ̃ be as above. The process X̃ is a SRW on the weighted graph (B, E_D(B), μ̃).

Proof Since D is finite we have P^x(T_k < ∞) = 1 for all k. The (strong) Markov property of X implies that P^w(X̃_{k+1} = y | X̃_k = x, X_0, X_1, ..., X_{T_k}) depends only on x and y. So it is enough to prove that, for x, y ∈ B,

P̃(x, y) = P^x(X̃_1 = y) = μ̃_xy / μ̃_x.

However, since μ̃_x = μ_x, this is immediate from (2.40).


Proposition 2.55 Let D ⊂ V be finite, and let Γ̃ be the graph obtained from Γ = (V, E) by collapsing D^c to a single point. Write R̃eff for effective resistance in Γ̃. Then, for x, y ∈ D,

g_D(x, x) + g_D(y, y) − 2 g_D(x, y) = R̃eff(x, y),   (2.43)

2 g_D(x, y) = Reff(x, D^c) + Reff(y, D^c) − R̃eff(x, y).

Proof Note that the Green's function g_D(x, y) is the same in the original network Γ and the collapsed one Γ̃. Since g_D(x, x) = Reff(x, D^c) = R̃eff(x, D^c), the second formula follows easily from the first. To prove (2.43), write d for the vertex D^c. Let I^x be the current in Γ̃ associated with the potential g^x_D. Then, by Lemma 2.6,

Σ_{z∼d} μ_{zd} ( g^x_D(z) − g^x_D(d) ) = Div I^x(d) = − Div I^x(x) = 1.

Similarly we have Div I^y(d) = 1, and so if f = g^x_D − g^y_D,

Σ_{z∼d} μ_{zd} ( f(z) − f(d) ) = 0,

so that f is harmonic at d. Since f is harmonic at all points z ∈ D − {x, y}, the associated flow I^x − I^y is optimal for (VP2), and we have

R̃eff(x, y) = E(I^x − I^y, I^x − I^y) = E(g^x_D − g^y_D, g^x_D − g^y_D) = g_D(x, x) + g_D(y, y) − 2 g_D(x, y).
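Identity (2.43) is easy to test on a concrete example. Below (an illustrative sketch, not from the text) we take Γ the path 0–1–2–3–4 with unit weights and D = {1, 2, 3}, so collapsing D^c = {0, 4} makes Γ̃ a 4-cycle; g_D is computed from the expected-visits matrix (I − P_D)^{−1} of the killed walk.

```python
import numpy as np

# SRW on the path 0-1-2-3-4, unit weights; D = {1, 2, 3}.
P_D = np.array([[0, .5, 0],        # transition matrix restricted to D
                [.5, 0, .5],       # (the walk is killed on exiting D)
                [0, .5, 0]])
G = np.linalg.inv(np.eye(3) - P_D)  # expected visits before exiting D
mu = 2.0                            # mu_x = 2 for each interior vertex
g = G / mu                          # g_D(x, y) = G[x, y] / mu_y

# Collapsed graph: 4-cycle d-1-2-3-d; R~_eff(1, 3) = 2 || 2 = 1
R_tilde = 1.0
lhs = g[0, 0] + g[2, 2] - 2 * g[0, 2]   # matrix indices 0,1,2 <-> vertices 1,2,3
assert abs(lhs - R_tilde) < 1e-12       # identity (2.43)

# second identity: 2 g_D(1,3) = R_eff(1, D^c) + R_eff(3, D^c) - R~_eff(1,3)
assert abs(2 * g[0, 2] - (g[0, 0] + g[2, 2] - R_tilde)) < 1e-12
```

Here g_D(1, 1) = 3/4, which is also Reff(1, D^c) (the edge to 0 in parallel with the three-edge path to 4), as the proposition predicts.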

2.6 Stability under Rough Isometries

We now consider the stability of transience/recurrence under rough isometries (of weighted graphs). First note that if (Γ, μ) has controlled weights (i.e. satisfies (H5)) and x ∼ y, y ∼ z, then μ_xy ≥ C2^{−1} μ_y ≥ C2^{−1} μ_yz. Iterating, it follows that if d(x, y) = k and x ∼ x′, y ∼ y′ then

μ_{yy′} ≤ C2^{k+1} μ_{xx′}.   (2.44)

Proposition 2.56 Let (Γi, μi), i = 1, 2, be weighted graphs satisfying (H5), and let ϕ : V1 → V2 be a rough isometry. There exists K < ∞ such that, for f2 ∈ C0(V2),

E1(f2 ∘ ϕ, f2 ∘ ϕ) ≤ K E2(f2, f2).


Remark 2.57 It is easy to see that the converse is false; that is, one cannot control E2(f2, f2) in terms of E1(f2 ∘ ϕ, f2 ∘ ϕ). Let for example Γi = Z, ϕ(n) = 2n, and f2(n) = 1_{{n=1}}. Then f2 ∘ ϕ = 0, while f2 has non-zero energy.

Proof Let C1, C2, C3 be the constants associated with ϕ in (1.5)–(1.7), and let C5 be a constant so that (H5) holds with constant C5 for both Γ1 and Γ2. We use the notation μi(x, y) to denote conductances in the graphs Γi. Set f1 = f2 ∘ ϕ. Let x ∼ y in V1, and x′ = ϕ(x), y′ = ϕ(y). Then 0 ≤ d2(x′, y′) ≤ C1(1 + C2). Let k = C1(1 + C2), so there exists a path x′ = z0, ..., zk = y′ connecting x′ and y′. Then

| f1(x) − f1(y) | = | f2(x′) − f2(y′) | ≤ Σ_{i=1}^k | f2(z_{i−1}) − f2(z_i) | ≤ k^{1/2} ( Σ_{i=1}^k | f2(z_{i−1}) − f2(z_i) |² )^{1/2}.

Using (2.44) we have, for 1 ≤ i ≤ k,

μ1(x, y) ≤ μ1(x) ≤ C3 μ2(x′) ≤ C3 C5 μ2(z0, z1) ≤ C3 C5^i μ2(z_{i−1}, z_i).

Hence,

| f1(x) − f1(y) |² μ1(x, y) ≤ ( k C3 C5^k ) Σ_{i=1}^k | f2(z_{i−1}) − f2(z_i) |² μ2(z_{i−1}, z_i).   (2.45)

Summing over x ∼ y ∈ V1, the left hand side gives E1(f1, f1), while the right hand side is less than k C3 C5^k M E2(f2, f2), where M is the maximum number of pairs x ∼ y in V1 such that the same edge w ∼ w′ in V2 occurs in a sum of the form (2.45). It remains to prove that M < ∞. Fix an edge {w, w′} in Γ2. If {w, w′} is on a path connecting ϕ(x) and ϕ(y), where x ∼ y are in V1, then d2(ϕ(x), w) ≤ k. So it remains to bound |A|, where A = ϕ^{−1}(B2(w, k)). Let z1, z2 ∈ A. Then by (1.5),

C1^{−1}( d1(z1, z2) − C2 ) ≤ d2(ϕ(z1), ϕ(z2)) ≤ 2k,

so d1(z1, z2) ≤ c1 = c1(C1, C2). Hence |A| ≤ c2 = c2(C1, C2), and M ≤ c2².


Theorem 2.58 Transience and recurrence of weighted graphs is stable under rough isometries.

Proof Let (Γi, μi), i = 1, 2, be as in Proposition 2.56, and let ϕ : V1 → V2 be a rough isometry. Suppose that (Γ1, μ1) is transient. Let f2 ∈ C0(V2) and f1 = f2 ∘ ϕ. Then if x1 ∈ V1 and x2 = ϕ(x1), using Theorem 2.42 for Γ1 and Proposition 2.56,

| f2(x2) |² = | f1(x1) |² ≤ C E1(f1, f1) ≤ K C E2(f2, f2).

Thus, by Theorem 2.42, (Γ2, μ2) is transient.

Proposition 2.59 Let G be a finitely generated infinite group, and let Σ1, Σ2 be two finite sets of generators. Let Γ1, Γ2 be the associated Cayley graphs. Then Γ1 and Γ2 are roughly isometric.

Proof Let ϕ(x) = x for x ∈ G. Let di be the metric for the Cayley graph of (G, Σi). The property (1.6) of rough isometries is immediate, so it remains to prove (1.5). For this it is enough to prove that there exists C21 ≥ 1 such that

d2(x, y) ≤ C21 d1(x, y),  x, y ∈ G,   (2.46)

since interchanging d1 and d2 in the proof we also have that there exists C12 ≥ 1 such that d1(x, y) ≤ C12 d2(x, y) for all x, y. Taking C1 = C12 ∨ C21 and C2 = 0 then gives (1.5).
To prove (2.46), recall that given a set of generators Σ we set Σ* = {g, g^{−1} : g ∈ Σ}. Let

Σ2(k) = { Π_{j=1}^{k′} x_j : x_j ∈ Σ2*, k′ ≤ k }

be the set of all products of k or fewer elements of Σ2*. It is easy to check that d2(x, y) ≤ k if and only if x^{−1}y ∈ Σ2(k), and that if w1 ∈ Σ2(k1) and w2 ∈ Σ2(k2) then w1w2 ∈ Σ2(k1 + k2). Since the group generated by Σ2* is the whole of G, for each g ∈ Σ1* there exists k(g) such that g ∈ Σ2(k(g)). Set C21 = max{ k(g) : g ∈ Σ1* }. Now let x, y ∈ G with d1(x, y) = m. Then x^{−1}y = z1 ⋯ zm, where each z_j ∈ Σ1*, and therefore each z_j ∈ Σ2(C21). By the above, x^{−1}y ∈ Σ2(mC21), and thus d2(x, y) ≤ C21 m, proving (2.46).

Corollary 2.60 The type of a Cayley graph does not depend on the choice of the (finite) set of generators.

Proof This is immediate from Proposition 2.59 and Theorem 2.58.
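Proposition 2.59 can be illustrated on G = Z with Σ1 = {1} and Σ2 = {2, 3}: since 1 = 3 − 2 we get C21 = 2, and since 2, 3 ∈ Σ1(3) we get C12 = 3. The sketch below (illustrative; BFS on a bounded window of Z) computes the two word metrics and checks the bi-Lipschitz bounds.

```python
from collections import deque

def word_metric(n, gens):
    """d(0, n) in the Cayley graph of (Z, gens), by breadth-first search."""
    steps = sorted({g for s in gens for g in (s, -s)})   # gens* = {g, -g}
    bound = abs(n) + max(abs(g) for g in steps)          # window keeping 0..n reachable
    dist = {0: 0}
    q = deque([0])
    while n not in dist:
        x = q.popleft()
        for g in steps:
            y = x + g
            if y not in dist and abs(y) <= bound:
                dist[y] = dist[x] + 1
                q.append(y)
    return dist[n]

# check d2 <= C21 d1 and d1 <= C12 d2 with C21 = 2, C12 = 3
for n in range(1, 30):
    d1, d2 = word_metric(n, {1}), word_metric(n, {2, 3})
    assert d2 <= 2 * d1 and d1 <= 3 * d2
```

For example d1(0, 7) = 7 while d2(0, 7) = 3 (7 = 2 + 2 + 3), well within the bounds.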


Random Walks and Electrical Resistance

Remark  A group G has recurrent Cayley graphs if and only if it is a finite extension of Z or Z². The ‘if’ part is immediate from Corollary 2.60. The ‘only if’ part requires a deep theorem on the structure of groups due to Gromov [Grom1], and is beyond the scope of this book.

2.7 Hitting Times and Resistance

In this section we show how the resistance between points and sets gives information on hitting times of a random walk. We begin by extending Theorem 2.8 to an infinite graph.

Lemma 2.61  Let A ⊂ V and x ∈ V − A. Then

    (1 − P^x(T_x^+ = T_A = ∞)) / (μ_x R_eff(x, A)) ≤ P^x(T_x^+ > T_A) ≤ 1 / (μ_x R_eff(x, A)).      (2.47)

Proof  Recall that by P^x(T_x^+ > T_A) we mean P^x(T_x^+ > T_A, T_A < ∞). Let V_n ↑↑ V with x ∈ V_1. Let X^{(n)} be the random walk on V_n, and write T^{(n)}_· for hitting times of X^{(n)}. We can define X and X^{(n)} on the same probability space so that they agree until S_n = T_{∂_i V_n} = T^{(n)}_{∂_i V_n}. Thus {T_x^+ > T_A, T_A < S_n} = {T_x^{(n),+} > T_A^{(n)}, T_A^{(n)} < S_n}. Since {T_A < ∞} = ∪_{n=1}^∞ {T_A < S_n}, we have

    P^x(T_x^+ > T_A) = lim_n P^x(T_x^{(n),+} > T_A^{(n)}, T_A^{(n)} < S_n)
                     = lim_n ( P^x(T_x^{(n),+} > T_A^{(n)}) − P^x(T_x^{(n),+} > T_A^{(n)}, T_A^{(n)} ≥ S_n) ).

Dropping the second term, we have

    P^x(T_x^+ > T_A) ≤ lim_n 1 / (μ_x R_eff(x, A ∩ V_n : V_n)),

which gives the upper bound in (2.47) by Proposition 2.18. For the lower bound, we have

    P^x(T_x^{(n),+} > T_A^{(n)}, T_A^{(n)} ≥ S_n) ≤ P^x(T_A ≥ S_n, T_x^+ ≥ S_n),

and these probabilities have limit P^x(T_A = T_x^+ = ∞).
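On a finite graph there is no escape to infinity, so P^x(T_x^+ = T_A = ∞) = 0 and the two bounds in (2.47) coincide. A numerical sketch of this, not from the text: the 5-vertex unit-weight path with x = 0, A = {4}, and the use of numpy, are my illustrative choices.

```python
import numpy as np

# Check P^x(T_x^+ > T_A) = 1 / (mu_x * R_eff(x, A)) on a finite weighted
# graph (here a unit-weight path 0-1-2-3-4 with x = 0, A = {4}).
n, x, A = 5, 0, [4]
C = np.zeros((n, n))                     # symmetric conductance matrix
for i in range(n - 1):
    C[i, i + 1] = C[i + 1, i] = 1.0
mu = C.sum(axis=1)                       # mu_x = sum_y mu_xy

# R_eff(x, A) via the harmonic function h(y) = P^y(T_x < T_A):
# h(x) = 1, h = 0 on A, and L h = 0 on the remaining ("free") vertices.
L = np.diag(mu) - C                      # graph Laplacian
free = [v for v in range(n) if v != x and v not in A]
h = np.zeros(n)
h[x] = 1.0
h[free] = np.linalg.solve(L[np.ix_(free, free)], -L[np.ix_(free, [x])][:, 0])
C_eff = C[x] @ (h[x] - h)                # effective conductance = energy of h
R_eff = 1.0 / C_eff

# Escape probability via the absorbing chain: u(y) = P^y(T_A < T_x),
# solved as a separate linear system in the transition matrix P.
P = C / mu[:, None]
u = np.zeros(n)
u[A] = 1.0
u[free] = np.linalg.solve(np.eye(len(free)) - P[np.ix_(free, free)],
                          P[np.ix_(free, A)].sum(axis=1))
escape = P[x] @ u                        # one step from x, then hit A before x

print(escape, 1.0 / (mu[x] * R_eff))     # both 0.25 for this path
```

For the path, R_eff(0, {4}) = 4 and μ_0 = 1, so both sides equal 1/4, matching the gambler's-ruin computation.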


Lemma 2.62  Let x ∉ A ∪ B.

(a) If P^x(T_{A∪B} < ∞) = 1 then

    P^x(T_A < T_B) ≤ R_eff(x, A ∪ B)/R_eff(x, A) ≤ R_eff(x, B)/R_eff(x, A).

(b) If P^x(T_A < ∞) = 1 then

    P^x(T_A < T_B) ≥ R_eff(x, A ∪ B)/R_eff(x, A) − R_eff(x, A ∪ B)/R_eff(x, B).

Proof  We have

    P^x(T_A < T_B) = P^x(T_A < T_B, T_{A∪B} < T_x^+) + P^x(T_A < T_B, T_{A∪B} > T_x^+).

By the Markov property at T_x^+ the second term is P^x(T_A < T_B) P^x(T_{A∪B} > T_x^+), so

    P^x(T_A < T_B) = P^x(T_A < T_B, T_{A∪B} < T_x^+) / P^x(T_{A∪B} < T_x^+) ≤ P^x(T_A < T_x^+) / P^x(T_{A∪B} < T_x^+),

and using Lemma 2.61 gives the upper bound. For the lower bound,

    P^x(T_A < T_B, T_{A∪B} < T_x^+) ≥ P^x(T_A < T_x^+, T_B > T_x^+) ≥ P^x(T_A < T_x^+) − P^x(T_B < T_x^+),

and again using Lemma 2.61 completes the proof.

The following result, called the commute time identity, was discovered in 1989 – see [CRR]. Other proofs can be found in [Tet, BL].

Theorem 2.63  Let (Γ, μ) be a weighted graph with μ(V) < ∞. Then

    E^{x_0} T_{x_1} + E^{x_1} T_{x_0} = R_eff(x_0, x_1) μ(V).

Proof  Let ϕ(x) = P^x(T_{x_1} < T_{x_0}), I = ∇ϕ, and write R = R_eff(x_0, x_1). Then

    Δϕ(x) = 0,   x ∈ V − {x_0, x_1},
    div I(x_1) = μ_{x_1} Δϕ(x_1) = −R^{-1}.

Let f_0(x) = E^x T_{x_0}. Then f_0(x_0) = 0, and if x ≠ x_0 then

    Δf_0(x) = E^x f_0(X_1) − f_0(x) = −1.



By the self-adjointness of Δ we have

    ⟨ϕ, −Δf_0⟩ = ⟨−Δϕ, f_0⟩.      (2.48)

We calculate each side of (2.48). The left side is

    Σ_{x≠x_0} ϕ(x)μ_x + ϕ(x_0)(−Δf_0(x_0))μ_{x_0} = Σ_x ϕ(x)μ_x,

since ϕ(x_0) = 0. The right side is

    (−Δϕ(x_0)) f_0(x_0) μ_{x_0} + (−Δϕ(x_1)) f_0(x_1) μ_{x_1} = R^{-1} f_0(x_1).

Thus

    E^{x_1} T_{x_0} = R_eff(x_0, x_1) Σ_x ϕ(x)μ_x.      (2.49)

If ϕ′(x) = P^x(T_{x_1} > T_{x_0}) then ϕ(x) + ϕ′(x) = 1 and

    E^{x_0} T_{x_1} = R_eff(x_0, x_1) Σ_x ϕ′(x)μ_x.

Adding these two equations completes the proof of the theorem.

Remark  (1) In an unweighted graph μ(V) = 2|E|.
(2) See Theorem 7.10 for the commute time between two sets.

Theorem 2.64 (See [Ki], Th. 1.6)  Let Γ be a graph (finite or infinite). Then R_eff is a metric on V.

Proof  Symmetry is clear, so it remains to prove the triangle inequality. Let x, y, z ∈ V. If μ(V) < ∞ then by the Markov property

    E^x T_y + E^y T_z ≥ E^x T_z,   E^z T_y + E^y T_x ≥ E^z T_x.

Adding, and using Theorem 2.63, one deduces that R_eff(x, y) + R_eff(y, z) ≥ R_eff(x, z). If V is infinite, let V_n ↑↑ V, with {x, y, z} ⊂ V_1. Then for each n

    R_eff(x, z; V_n) ≤ R_eff(x, y; V_n) + R_eff(y, z; V_n).      (2.50)

Taking the limit and using Proposition 2.18 completes the proof.

Remark 2.65  An alternative proof of (2.50) is to use electrical equivalence. Using the Y–∇ relation, the inequality is clear for a graph with three points. It then follows, for a general finite graph, by considering the network on B = {x, y, z} which is equivalent to the original network.
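The commute time identity is easy to test numerically: solve the two mean hitting-time systems and one Dirichlet problem on any small weighted graph. The graph below (a weighted triangle with a pendant vertex, conductances chosen arbitrarily) is an illustrative instance, not one from the text.

```python
import numpy as np

# Verify E^{x0} T_{x1} + E^{x1} T_{x0} = R_eff(x0, x1) * mu(V) on a small
# weighted graph.
edges = {(0, 1): 2.0, (1, 2): 1.0, (0, 2): 1.0, (2, 3): 3.0}
n = 4
C = np.zeros((n, n))
for (a, b), w in edges.items():
    C[a, b] = C[b, a] = w
mu = C.sum(axis=1)
P = C / mu[:, None]

def expected_hitting_time(target, start):
    # Solve t = 1 + P t on V - {target}, with t(target) = 0.
    others = [v for v in range(n) if v != target]
    t = np.linalg.solve(np.eye(n - 1) - P[np.ix_(others, others)],
                        np.ones(n - 1))
    return t[others.index(start)]

def r_eff(x0, x1):
    # Energy of the unit-voltage harmonic function between x0 and x1.
    L = np.diag(mu) - C
    free = [v for v in range(n) if v not in (x0, x1)]
    h = np.zeros(n)
    h[x0] = 1.0
    h[free] = np.linalg.solve(L[np.ix_(free, free)],
                              -L[np.ix_(free, [x0])][:, 0])
    return 1.0 / (C[x0] @ (h[x0] - h))

commute = expected_hitting_time(1, 0) + expected_hitting_time(0, 1)
print(commute, r_eff(0, 1) * mu.sum())   # equal: the commute time identity
```

Here the exact values are E^{x_0}T_{x_1} = E^{x_1}T_{x_0} = 2.8, R_eff(x_0, x_1) = 0.4 and μ(V) = 14, so both sides equal 5.6.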



The following corollary was proved in [T1], shortly before the discovery of Theorem 2.63.

Corollary 2.66  Let (Γ, μ) be a graph, and let A ⊂ V. Then

    E^{x_0} τ_A ≤ R_eff(x_0, A^c) μ(A).      (2.51)

Proof  If μ(A) = ∞ there is nothing to prove. If μ(A) < ∞ this follows from (2.49) on collapsing A^c to a single point.

Example  Suppose (Γ, μ) is a transient weighted graph, {x_0, x_1} is an edge in Γ, and there is a finite subgraph V_1 = {x_1, …, x_n} ⊂ V ‘hanging’ from x_1. More precisely, if y ∈ V_1 then every path from y to x_0 passes through x_1. Then we can ask how much the existence of V_1 ‘delays’ the SRW X, that is, what is E^{x_1} T_{x_0}? Let V_0 = {x_0, …, x_n}, and work with the subgraph generated by V_0, except that we eliminate any edges between x_0 and itself. Write μ_0 for the measure on V_0 given by (1.1), and for clarity denote the hitting times for the SRW on V_0 by T^0_y. Then, by Theorem 2.63,

    E^{x_0} T^0_{x_1} + E^{x_1} T^0_{x_0} = R_eff(x_0, x_1) μ_0(V_0).

Since E^{x_0} T^0_{x_1} = 1, R_eff(x_0, x_1) = μ_{x_0,x_1}^{-1}, μ_0(V_0) = μ(V_1) + μ_{x_0,x_1}, and E^{x_1} T^0_{x_0} = E^{x_1} T_{x_0}, we obtain

    E^{x_1} T_{x_0} = μ(V_1) / μ_{x_0,x_1}.
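A minimal sketch of this formula, with a hypothetical hanging set V_1 = {x_1, x_2, x_3}: take μ_{x_0,x_1} = 2 and unit-weight edges {x_1, x_2}, {x_1, x_3}, so μ(V_1) = (2+1+1) + 1 + 1 = 6 and the predicted delay is 6/2 = 3.

```python
import numpy as np

# A subgraph V1 = {x1, x2, x3} "hanging" from x1, attached to x0 by an
# edge of weight 2 (all weights are illustrative).  Claim:
#   E^{x1} T_{x0} = mu(V1) / mu_{x0,x1}.
C = np.zeros((4, 4))            # vertices x0=0, x1=1, x2=2, x3=3
C[0, 1] = C[1, 0] = 2.0
C[1, 2] = C[2, 1] = 1.0
C[1, 3] = C[3, 1] = 1.0
mu = C.sum(axis=1)
P = C / mu[:, None]

# E^y T_{x0}: solve t = 1 + P t on {1, 2, 3}, with t(x0) = 0.
others = [1, 2, 3]
t = np.linalg.solve(np.eye(3) - P[np.ix_(others, others)], np.ones(3))
mu_V1 = mu[1] + mu[2] + mu[3]   # = 6
print(t[0], mu_V1 / C[0, 1])    # E^{x1} T_{x0} = 3.0 = 6/2
```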

2.8 Examples

Example 2.67  Consider the graph Γ obtained by identifying the origins of Z_+ and Z³. Since Z³ is transient, Γ is transient, so by Theorem 2.37 the identity function 1 is not in H²_0. Let f_n be zero on Z³ − {0} and define f_n on Z_+ by

    f_n(k) = 1 − k/n  if k ≤ n,    f_n(k) = 0  if k > n.

Let f = 1_{Z_+} be the pointwise limit of the f_n. Then f_n ∈ C_0 and E(f − f_n, f − f_n) = 1/n, so that f_n → f in H² and thus f ∈ H²_0. So 1 is in H² but not H²_0, while f is in H²_0 but not L², showing that in general all three spaces are distinct.



In addition, this graph satisfies the Liouville property but not the strong Liouville property. To show the latter, let

    h(x) = P^x(T_0 < ∞),   x ∈ Z³.

Then h is harmonic in Z³ − {0}. Let a > 0 and define h on Z_+ by h(n) = 1 + an; clearly h is also harmonic in Z_+ − {0}. Let p_0 > 0 be the probability that a SRW on Z³ started at any neighbour of 0 does not return to 0. Then, writing d = 3,

    Δh(0) = (2d + 1)^{-1}( [(1 + a) − 1] + 2d[(1 − p_0) − 1] ) = (2d + 1)^{-1}(a − 2d p_0).

So if a = 2d p_0 then h is a positive non-constant harmonic function on Γ. To prove the Liouville property, let h be a bounded harmonic function on Γ; we can assume h(0) = 1. Then h must be of the form h(n) = 1 + bn on Z_+, for some b ∈ R. But since h is bounded, b = 0, and h is constant on Z_+. As h is harmonic at 0, writing A = {x ∈ Z³ : x ∼ 0},

    1 = h(0) = (2d + 1)^{-1}( 1 + Σ_{x∈A} h(x) ),

and therefore

    h(0) = (2d)^{-1} Σ_{x∈A} h(x).

So the restriction of h to Z³ is harmonic in Z³, and by the Liouville property for Z³ (Corollary 1.52) it follows that h is constant.

Example 2.68 (Binary tree with weights)  Recall from Examples 1.4(3) the definition of the rooted binary tree. Let (r_n, n ≥ 0) be strictly positive, and for an edge {x, y} with x ∈ B_n, y ∈ B_{n+1} set

    μ_{xy} = 1/r_n.

Thus R_eff(B_n, B_{n+1}) = 2^{-n} r_n, and the weighted tree is transient if and only if

    R_eff(o, ∞) = lim_n R_eff(o, B_n) = Σ_{k=0}^∞ 2^{-k} r_k < ∞.

Example 2.69 (Spherically symmetric trees)  Again we start from the rooted binary tree and a sequence r_n, n ≥ 0, of positive integers. The tree is obtained by replacing each edge {x, y} with x ∈ B_n, y ∈ B_{n+1} by a path of r_n edges. Again we have that the tree is transient if and only if Σ_k 2^{-k} r_k < ∞.
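The transience criterion Σ_k 2^{-k} r_k < ∞ is easy to explore numerically: the partial sums are the resistances R_eff(o, B_n) of Example 2.68. The two weight sequences below are my illustrative choices — a polynomial r_k gives a convergent sum (transient), while r_k = 2^k makes every term equal to 1 (recurrent).

```python
# R_eff(o, infinity) = sum_k 2^{-k} r_k for the weighted binary tree of
# Example 2.68: the tree is transient iff this sum is finite.
def resistance_to_level(r, n):
    # Partial sum R_eff(o, B_n) = sum_{k<n} 2^{-k} r(k).
    return sum(2.0 ** (-k) * r(k) for k in range(n))

r_poly = lambda k: (k + 1) ** 2      # sum 2^{-k}(k+1)^2 = 12: transient
r_geom = lambda k: 2.0 ** k          # every term is 1, sum diverges: recurrent

print(resistance_to_level(r_poly, 50))   # approx 12, already stable
print(resistance_to_level(r_geom, 50))   # = 50, grows without bound
```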



2.9 The Sierpinski Gasket Graph

The ‘true’ Sierpinski gasket is a compact fractal subset of R², which can be constructed by a procedure analogous to that for the Cantor set. Starting with the (closed) unit triangle K_0 in R², one removes the open downward facing triangle whose vertices are the midpoints of the sides of K_0. The resulting set, K_1, consists of three closed triangles of side 2^{-1}. Repeating this procedure, after n steps one obtains a set K_n consisting of 3^n triangles each of side 2^{-n} (see Figure 2.2). The Sierpinski gasket is the set K_∞ = ∩_n K_n; it is a compact connected subset of R² with Hausdorff dimension log 3/log 2 – see [Fa].

A more formal description of the sets K_n can be given using an iterated function scheme – see p. 113 of [Fa]. Let V_0 = {(0,0), (1,0), (1/2, √3/2)} = {a_1, a_2, a_3} be the vertices of the unit triangle in R², and let K_0 be the closed convex hull of V_0. Define the functions ψ_i : R² → R² by ψ_i(x) = a_i + ½(x − a_i). Given a compact set K ⊂ R², let

    Ψ(K) = ∪_{i=1}^3 ψ_i(K).

It is easy to check that setting K_n = Ψ(K_{n−1}) for n ≥ 1 defines a decreasing sequence of compact sets with the properties described above.

Figure 2.2. The set K_4.

Using this construction as a guide, we can define an infinite graph with a similar structure. Note that K_n consists of 3^n triangles each of side 2^{-n}. We define the set T_n to be the set of (the vertex sets of) these triangles, that is,

    T_n = { ψ_{i_1} ∘ ⋯ ∘ ψ_{i_n}(V_0) : (i_1, …, i_n) ∈ {1, 2, 3}^n }.

Let U_n = Ψ^n(V_0) be the set of vertices of these triangles, and for x, y ∈ U_n let x ∼_n y if there exists a triangle T ∈ T_n with x, y ∈ T. Define a graph G_n = (V_n, E_n) by

    V_n = 2^n U_n = {2^n x : x ∈ U_n},    E_n = { {2^n x, 2^n y} : x ∼_n y, x, y ∈ U_n }.

The vertices in V_n are all contained in 2^n K_0, and V_{n+1} consists of V_n together with two copies of V_n, which meet V_n at the vertices 2^n a_1 and 2^n a_2. We define V = ∪_n V_n, and for x, y ∈ V we let {x, y} ∈ E if {x, y} ∈ E_m for all sufficiently large m. The graph S G = (V, E) is the Sierpinski gasket graph (Figure 2.3). We call a subset A of V a triangle of side 2^n if it is isometric to V_n. The extreme points of A are the three vertices in A such that A is contained in the convex hull of these vertices. We say that two triangles of side 2^n are adjacent if they share an extreme point. Let D_n be the set of extreme points of triangles of side 2^n. The following lemma gives some basic properties of S G. We write B(x, r) for balls in S G in the usual graph metric d(x, y), and set α = log 3/log 2. Write B_E(x, r) = {y ∈ R² : |x − y| ≤ r} for (closed) balls in the usual Euclidean metric in R².

Lemma 2.70  (a) If A is a triangle of side 2^n then |A| = ½(3^{n+1} + 3).
(b) If x ∈ V then B_E(x, 2^{n−2}) intersects at most two triangles of side 2^n, and these triangles are adjacent.
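The finite graphs G_n are easy to generate by recursive subdivision, which also checks the vertex count of Lemma 2.70(a). In the sketch below (my implementation convenience, not the book's notation) a vertex is an integer pair (a, b) standing for a·e_1 + b·e_2 in the triangular lattice.

```python
# Build the level-n Sierpinski gasket graph G_n by recursive subdivision.
# Unit-length edges join the three vertices of each small (side-1) triangle.
def sg_graph(n):
    edges = set()
    def subdivide(p, q, r, size):
        if size == 1:
            for u, v in ((p, q), (q, r), (p, r)):
                edges.add(frozenset((u, v)))
            return
        h = size // 2
        mid = lambda u, v: ((u[0] + v[0]) // 2, (u[1] + v[1]) // 2)
        pq, qr, pr = mid(p, q), mid(q, r), mid(p, r)
        subdivide(p, pq, pr, h)      # the three corner sub-triangles;
        subdivide(pq, q, qr, h)      # the middle (downward) triangle is
        subdivide(pr, qr, r, h)      # omitted, as in the gasket
    subdivide((0, 0), (2 ** n, 0), (0, 2 ** n), 2 ** n)
    verts = {v for e in edges for v in e}
    return verts, edges

verts, edges = sg_graph(4)
print(len(verts), (3 ** 5 + 3) // 2)    # both 123, as in Lemma 2.70(a)
```

The edge count is 3^{n+1} (here 243), and every vertex has degree 2 or 4, matching the proof of Lemma 2.70.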

Figure 2.3. Part of the Sierpinski gasket graph SG .



(c) We have, for x, y ∈ V,  |x − y| ≤ d(x, y) ≤ c_1 |x − y|.
(d) If x ∈ V and r ≥ 1 then

    |B(x, r)| ≍ μ(B(x, r)) ≍ r^α.      (2.52)

Proof  (a) is easy on noting that A contains 3^{n+1} edges and that all but three of the vertices in A lie in two triangles of side 1.
(b) Let x ∈ V, and let H be a triangle of side 2^{n−1} containing x. Write w_i, i = 1, 2, 3, for the extreme points of H. Then the set ∪_{i=1}^3 B_E(w_i, 2^{n−2}) intersects at most two triangles of side 2^n, and it follows that the same holds for B_E(x, 2^{n−2}).
(c) Since S G is embedded in the plane and all edges are of length 1, it is clear that d(x, y) ≥ |x − y|. It is easy to prove by induction that if A is a triangle of side 2^n and z is an extreme point of A then d(x, z) ≤ 2^n for all x ∈ A. Hence if x, y are in two adjacent triangles of side 2^n and w is the common extreme point of the two triangles, then d(x, y) ≤ 2·2^n; using (b), (c) then follows.
(d) If x ∈ V then μ_x = 4 unless x = 0, when we have μ_0 = 2. So |B(x, r)| ≍ μ(B(x, r)). If r ≥ 1 let m be the smallest integer such that r ≤ 2^m. Then r > 2^{m−1}, B(x, r) is contained in two triangles of side 2^{m+1}, and so |B(x, r)| ≤ 3^{m+2} + 3 ≤ c r^α. For the lower bound, x is contained in at least one triangle A of side 2^{m−2}; let z be an extreme point of this triangle. Then for y ∈ A we have

    d(x, y) ≤ d(x, z) + d(y, z) ≤ 2^{m−1} ≤ r,

so that A ⊂ B(x, r) and thus |B(x, r)| ≥ c′ r^α.

Lemma 2.71  The SRW on S G is recurrent.

Proof  As S G is a subgraph of the triangular lattice in R², this is immediate from Corollary 2.39. It is also easy to give a direct proof using the Nash–Williams criterion, since for each n we have |∂V_n| = 4.

Theorem 2.72

Let β = log 5/log 2. Then for x, y ∈ V,

    c_1 d(x, y)^{β−α} ≤ R_eff(x, y) ≤ c_2 d(x, y)^{β−α}.      (2.53)

Proof  It is sufficient to prove this in the finite graph G_m for large enough m.



Figure 2.4. Sequence of Y –∇ transforms for the Sierpinski gasket.

If d(x, y) = 1 then the unit flow from x to y has energy 1, so R_eff(x, y) ≤ 1. If f = 1_y then R_eff(x, y)^{-1} ≤ E(f, f) ≤ 4, so we deduce that

    ¼ ≤ R_eff(x, y) ≤ 1.

Now consider a triangle A of side 2 in S G, with vertices z_1, z_2, z_3. Then by a sequence of Y–∇ transformations (see Figure 2.4) we can remove the vertices inside A, and replace the edges {z_i, z_j} by edges which (a short calculation shows) have conductance 3/5. Repeating this operation on every triangle of side 2, we can replace G_m by G_{m−1}, but with every edge having conductance 3/5. Continuing in this way, we can replace G_m by G_{m−k} with edge conductances (3/5)^k. Thus if x, y are distinct extreme points of a triangle of side 2^k, then by (2.9) we have

    ¼(5/3)^k ≤ R_eff(x, y) ≤ (5/3)^k.      (2.54)

Let A_n be a triangle of side 2^n, with extreme points w_1, w_2, w_3. This calculation also shows that the unique harmonic function in A_n with boundary values given by h_n(w_1) = 1, h_n(w_2) = h_n(w_3) = 0 satisfies E_{A_n}(h_n, h_n) = 2(3/5)^n.

Now let A^k be a triangle of side 2^k, and y ∈ A^k. Choose z_k to be an extreme point of A^k closest to y. Then we can find a sequence of triangles A^j, 0 ≤ j ≤ k, and extreme points z_j of A^j, such that z_0 = y and for each j both z_j and z_{j−1} are extreme points of A^{j−1}. (Here A^j is a triangle of side 2^j.) Since R_eff is a metric it follows that

    R_eff(y, z_k) ≤ Σ_{j=1}^k R_eff(z_{j−1}, z_j) ≤ Σ_{j=1}^k (5/3)^{j−1} ≤ (3/2)(5/3)^k;

the upper bound in (2.53) then follows.

For the lower bound, suppose that d(x, y) ≥ 2^k. Then there exist disjoint triangles of side 2^{k−2}, A_x and A_y say, with x ∈ A_x and y ∈ A_y. There are either two or three triangles of side 2^{k−2} adjacent to A_x; call the union of these triangles A. Consider the function f which is 1 on A_x, harmonic in A^o, and zero on ∂(A ∪ A_x). As A_y is disjoint from A^o, we have f(x) = 1, f(y) = 0. The function f has energy 2(3/5)^{k−2} in each of the triangles of side 2^{k−2} in A, so we have E(f, f) ≤ 6(3/5)^{k−2}. Hence

    C_eff(x, y) ≤ E(f, f) ≤ c(3/5)^k,

proving the lower bound in (2.53).

We will see in Chapter 6 that these calculations for the resistance metric on S G are the input needed to be able to obtain bounds on the heat kernel for the SRW.
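The conductance 3/5 produced by the Y–∇ reduction can be checked by a Schur complement on the Laplacian of a single side-2 triangle: eliminating the three midpoint vertices leaves the Laplacian of a triangle on the corners whose off-diagonal entries are −3/5. (The vertex labels below are mine, not the book's.)

```python
import numpy as np

# One reduction step: a side-2 gasket triangle (corners z1,z2,z3, midpoints
# a,b,c, unit conductances) reduces, after eliminating the midpoints, to a
# triangle on the corners with conductance 3/5 per edge (resistance 5/3).
idx = {'z1': 0, 'z2': 1, 'z3': 2, 'a': 3, 'b': 4, 'c': 5}
edge_list = [('z1', 'a'), ('z1', 'b'), ('z2', 'a'), ('z2', 'c'),
             ('z3', 'b'), ('z3', 'c'), ('a', 'b'), ('a', 'c'), ('b', 'c')]
C = np.zeros((6, 6))
for u, v in edge_list:
    C[idx[u], idx[v]] = C[idx[v], idx[u]] = 1.0
L = np.diag(C.sum(axis=1)) - C

# Schur complement onto the corners = the electrically equivalent network.
corners, inner = [0, 1, 2], [3, 4, 5]
schur = (L[np.ix_(corners, corners)]
         - L[np.ix_(corners, inner)]
         @ np.linalg.inv(L[np.ix_(inner, inner)])
         @ L[np.ix_(inner, corners)])
print(-schur[0, 1])    # equivalent corner-to-corner conductance: 0.6 = 3/5
```

The Schur complement of a Laplacian is again a Laplacian, so the off-diagonal entry is minus the equivalent conductance; iterating this step gives the factor (5/3)^k in (2.54).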

3 Isoperimetric Inequalities and Applications

3.1 Isoperimetric Inequalities

The study of the close connections between random walks and isoperimetric inequalities was opened up by Varopoulos – see in particular [V1, V2]. We continue to assume that (Γ, μ) is a locally finite connected weighted graph.

Definition 3.1  For A, B ⊂ V set

    μ_E(A; B) = Σ_{x∈A} Σ_{y∈B} μ_{xy}.

Let ψ : R_+ → R_+ be increasing. We say that (Γ, μ) satisfies the ψ-isoperimetric inequality (I_ψ) if there exists C_0 < ∞ such that

    μ_E(A; V − A) ≥ C_0^{-1} ψ(μ(A))   for every finite non-empty A ⊂ V.      (I_ψ)

For α ∈ [1, ∞) we write (I_α) for (I_ψ) with ψ(t) = t^{1−1/α}, and (I_∞) for (I_ψ) with ψ(t) = t. (I_∞) is also called the strong isoperimetric inequality.

Examples 3.2  (1) The Euclidean lattice Z^d satisfies (I_d).
(2) The binary tree satisfies (I_∞) with C_0 = 3.
(3) The Sierpinski gasket graph does not satisfy (I_α) for any α > 1.
(4) Any infinite connected graph (with natural weights) satisfies (I_1).
(5) If (Γ, μ) satisfies (I_{α+δ}) then it satisfies (I_α).

(1) If A ⊂ Z^d is finite, construct a subset D ⊂ R^d by replacing each x ∈ A with a unit cube. Then, writing λ_k for the k-dimensional Lebesgue measure, μ(A) = 2d λ_d(D) and μ_E(A; A^c) = λ_{d−1}(∂D); thus the isoperimetric inequality for Z^d follows from that for R^d.

However, while intuitive, the isoperimetric inequality for R^d is not elementary. There is a very extensive literature on this topic, much of it related to determining the optimal constant and proving that the sets for which equality is attained are balls. See [Ga] for an easy proof via the Brunn–Minkowski inequality, and also the same article for a (fairly) quick proof of the Brunn–Minkowski inequality. Alternatively, see Theorem 3.26 in Section 3.3 for a proof which does not give the optimal constant.
(2) To prove (I_∞) for the binary tree, it is enough to consider connected sets A. The inequality is then easy by induction on |A|; each point added increases μ(A) by 3 and μ_E(A; A^c) by at least 1.
(3) If A_n is the triangle of side 2^n containing the origin, then μ(A_n) = 4 + 2·3^{n+1}, while μ_E(A_n; A_n^c) = 4. So μ_E(A_n; A_n^c)/μ(A_n)^{1−1/α} ≤ c 3^{−n(1−1/α)}.
(4) If A is a finite set then μ_E(A; V − A) ≥ 1, and so (I_ψ) with ψ(t) = 1 and C_0 = 1 holds.
(5) By Lemma 3.3(a) we have μ(A) ≥ C_0^{−(α+δ)} for all (non-empty) subsets A. Hence μ(A)^{−1/(α+δ)} ≥ C_0^{−δ/α} μ(A)^{−1/α}, and (I_α) follows.
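A quick sketch of (I_2) for Z² in the spirit of Example (1): for boxes A = [0,n)² with natural weights, μ(A) = 4n² and μ_E(A; A^c) = 4n, so the ratio μ_E(A; A^c)/μ(A)^{1/2} equals 2 for every n. (Boxes are not extremal, but they show the ratio is bounded below uniformly; the code is my illustration.)

```python
# (I_2) in Z^2, tested on square boxes A = [0,n)^2 with natural weights:
# mu(A) = 4 n^2 (each vertex has degree 4) and mu_E(A; A^c) = 4n.
def box_ratio(n):
    A = {(i, j) for i in range(n) for j in range(n)}
    mu_A = 4 * len(A)
    boundary = sum(1 for (i, j) in A
                   for (di, dj) in ((1, 0), (-1, 0), (0, 1), (0, -1))
                   if (i + di, j + dj) not in A)
    return boundary / mu_A ** 0.5

print([round(box_ratio(n), 3) for n in (2, 4, 8, 16)])   # all 2.0
```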

The condition (I_α) gives a lower bound on the sizes of balls.

Lemma 3.3  Let (Γ, μ) satisfy (I_α).

(a) For any x ∈ V,

    μ_x ≥ C_0^{−α}.      (3.1)

(b) If α ∈ [1, ∞) then

    μ(B(x, n)) ≥ c_1(α) n^α,   n ≥ 1.

(c) If α = ∞ then μ(B(x, n)) ≥ (1 + C_0^{-1})^n μ_x, so (Γ, μ) has (at least) exponential volume growth.

Proof  (a) Taking A = {x} we have μ(A) = μ_E(A; A^c) = μ_x, and (3.1) is then immediate.
(b), (c) Let B_n = B(x_0, n), and b_n = μ(B_n). Then

    μ(∂B_n) = Σ_{x∈∂B_n} μ_x ≥ Σ_{x∈∂B_n} Σ_{y∼x, y∈B_n} μ_{xy} = μ_E(B_n; B_n^c).



So,

    b_{n+1} = μ(B_{n+1}) = μ(B_n) + μ(∂B_n) ≥ μ(B_n) + C_0^{-1} μ(B_n)^{1−1/α} = b_n (1 + C_0^{-1} b_n^{−1/α}).

If α = ∞ then it follows immediately that b_n ≥ (1 + C_0^{-1})^n b_0. If α ∈ [1, ∞) then we have 1 + (2^α − 1)t ≥ (1 + t)^α for t ∈ [0, 1]. Choose A ∈ (0, C_0^{−α}) small enough so that ½ C_0^{-1} A^{−1/α} ≥ 2^α − 1. We have b_1 ≥ C_0^{−α} ≥ A. We now prove (b) with c_1(α) = A by induction. Suppose that b_n ≥ A n^α. If b_n ≥ A(n + 1)^α then b_{n+1} ≥ b_n ≥ A(n + 1)^α and we are done. If not, then b_n ≤ A(n + 1)^α ≤ 2^α A n^α, and hence

    b_{n+1} ≥ A n^α (1 + ½ C_0^{-1} A^{−1/α} n^{-1}) ≥ A n^α (1 + n^{-1})^α = A(n + 1)^α.

Theorem 3.4  The property (I_α) is weight stable, and stable under rough isometries.

Proof  Stability under bounded perturbation of weights is immediate from the definition. To prove stability under rough isometries, let (Γ_i, μ_i) be weighted graphs, and let ϕ : V_1 → V_2 be a rough isometry. Suppose that (Γ_1, μ_1) satisfies (I_α). If A_2 ⊂ V_2, let f_2 = 1_{A_2}, A_1 = ϕ^{-1}(A_2), and f_1 = 1_{A_1} = f_2 ∘ ϕ. Then E_i(f_i, f_i) = μ_E(A_i; V_i − A_i) for i = 1, 2. Since ϕ is a rough isometry of weighted graphs, μ_1(A_1) ≍ μ_2(A_2), while Proposition 2.56 gives that μ_E(A_2; V_2 − A_2) ≥ K^{-1} μ_E(A_1; V_1 − A_1). Combining these inequalities, we have that (Γ_2, μ_2) also satisfies (I_α).

Exercise 3.5  Let Γ_i, i = 1, 2, be graphs, with natural weights, which satisfy (I_{α_i}). Then the join of Γ_1 and Γ_2 satisfies (I_{α_1 ∧ α_2}).

Definition 3.6  Now define

    ‖∇f‖_p^p = ½ Σ_x Σ_y μ_{xy} |f(y) − f(x)|^p.

We will be mainly interested in the cases p = 1 and p = 2; of course, ‖∇f‖_2^2 = E(f, f). Note that if f = 1_A then

    ‖∇f‖_p^p = Σ_{x∈A} Σ_{y∈A^c} μ_{xy} = μ_E(A; A^c).      (3.2)



We introduce the following inequalities:

    ‖f‖_{αp/(α−p)} ≤ C ‖∇f‖_p   for f ∈ C_0(V),      (S_α^p)

    ‖f‖_p ≤ C μ(D)^{1/α} ‖∇f‖_p   if supp(f) ⊂ D.      (F_α^p)

The inequality (S_α^p) only makes sense when p < α, and is the graph analogue of a classical Sobolev inequality in R^d. The inequalities (F_α^p) were introduced by Coulhon in [Co1], and it is proved there (in the context of manifolds) that when α > p the inequalities (S_α^p) and (F_α^p) are equivalent. This equivalence also holds in the graph context, but we will just discuss the cases p = 1 and p = 2. The connection with the Sobolev inequalities (S_α^p) is included mainly for historical interest; as we will see, the (F_α^p) inequalities are on the whole easier to use, and contain the same information. For α = ∞ both inequalities above take the form

    ‖f‖_p ≤ C ‖∇f‖_p   if f ∈ C_0(V).      (F_∞^p)

Theorem 3.7

(a) (, μ) satisfies (Iα ) with constant C0 . (b) (, μ) satisfies (Sα1 ) with constant C0 . (c) (, μ) satisfies (Fα1 ) with constant C0 . In addition (I1 ) and (F11 ) are equivalent. Proof (c) ⇒ (a) Let α ∈ [1, ∞]. Given a finite subset A ⊂ V set f = 1 A . Then f 1 = μ(A), while ||∇ f ||1 = μ E (A; Ac ). So by (Fα1 ) μ(A) ≤ C0 μ(A)1/α μ E (A; Ac ), which is (Iα ). (b) ⇒ (c) This is immediate from Hölder’s inequality; note that p = α/(α−1) and p = α are conjugate indices. If f has support D then || f ||1 = || f 1 D ||1 ≤ || f ||α/(α−1) ||1 D ||α ≤ C 0 ||∇ f ||1 μ(D)1/α . The final implication, that (a) implies (b), requires a bit more work. First define t ( f ) = {x : f (x) ≥ t}.

84

Isoperimetric Inequalities and Applications

Lemma 3.8 (Co-area formula) Let f : V → R+ . Then ∞ μ E (t ( f ); t ( f )c )dt. ||∇ f ||1 = 0

Proof We have ||∇ f ||1 =

x

=

x

dt

0

∞

0

x

∞

=

μx y

y ∞

=

μx y ( f (y) − f (x))+

y

1( f (y)≥t> f (x)) dt μx y 1( f (y)≥t> f (x))

y

dtμ E (t ( f ); t ( f )c ).

0

Proof of the remainder of Theorem 3.7  We begin by proving that (a) ⇒ (b). First, note that it is enough to consider f ≥ 0: if g = |f| then ‖g‖_p = ‖f‖_p and ‖∇g‖_1 ≤ ‖∇f‖_1. The idea of the proof is to use (I_α) on the sets Λ_t(f). However, if we do what comes naturally we find we need to use the Hardy–Littlewood–Polya inequality: for a non-negative decreasing function f,

    ∫_0^∞ p t^{p−1} f(t)^p dt ≤ ( ∫_0^∞ f(t) dt )^p.

We can avoid this by using a trick from Rothaus [Ro]. First let 1 < α < ∞, and let p = α/(α−1), p′ = α. Let g ∈ L^{p′}_+(V, μ) with ‖g‖_{p′} = 1. Then

    C_0 ‖∇f‖_1 = C_0 ∫_0^∞ μ_E(Λ_t(f); Λ_t(f)^c) dt ≥ ∫_0^∞ μ(Λ_t(f))^{1/p} dt
               = ∫_0^∞ ‖1_{Λ_t(f)}‖_p dt ≥ ∫_0^∞ ‖g 1_{Λ_t(f)}‖_1 dt
               = ∫_0^∞ dt Σ_x g(x) μ_x 1_{Λ_t(f)}(x) = Σ_x g(x) μ_x ∫_0^∞ 1_{(f(x) ≥ t)} dt
               = Σ_x f(x) g(x) μ_x = ‖fg‖_1.

So ‖f‖_p = sup{ ‖fg‖_1 : ‖g‖_{p′} = 1 } ≤ C_0 ‖∇f‖_1. For α = ∞ take p = 1 and g = 1 in the calculation above.



Finally, we need to show that (I_1) implies (F_1^1). Let f ≥ 0 have finite support D and let M = max f. Then

    ‖∇f‖_1 = ∫_0^M μ_E(Λ_t(f); Λ_t(f)^c) dt ≥ C_0^{-1} M ≥ C_0^{-1} μ(D)^{-1} ‖f‖_1.

Lemma 3.9  Let α ∈ [1, ∞]. Then (F_α^1) implies (F_α^2).

Proof  As before it is enough to prove this for f ≥ 0. Let f have support D, and set g = f². Applying (F_α^1) to g, and writing C_0 for the constant in (F_α^1), we obtain

    ‖f‖_2^2 = ‖g‖_1 ≤ C_0 μ(D)^{1/α} ‖∇g‖_1.      (3.3)

If we were working with functions on R^d we would have

    ‖∇g‖_1 = ∫ |∇g| = ∫ |∇f²| = 2 ∫ |f| |∇f| ≤ 2 ( ∫ |∇f|² )^{1/2} ‖f‖_2.

However, in a discrete space one cannot use the Leibnitz rule to obtain |∇f²| = 2|f||∇f|. This is not a serious difficulty, since

    ‖∇g‖_1 = ½ Σ_{x,y} μ_{xy} |f(x)² − f(y)²|
            = ½ Σ_{x,y} μ_{xy} |f(x) − f(y)| |f(x) + f(y)|
            ≤ Σ_{x,y} μ_{xy} |f(x) − f(y)| |f(x)| ≤ 2^{1/2} E(f, f)^{1/2} ‖f‖_2.      (3.4)

Combining (3.3) and (3.4) gives (F_α^2) with constant 2^{1/2} C_0.

3.2 Nash Inequality

We now introduce another inequality, which following [CKS] is called a Nash inequality. Nash used inequalities of this type in his 1958 paper [N] to prove Hölder continuity of solutions of divergence form PDEs – see the first inequality on p. 936 of [N]. In Section 4.1, which closely follows the ideas of [CKS], we see that Nash inequalities are closely related to the behaviour of the heat kernel on (Γ, μ).

Definition 3.10  Let α ∈ [1, ∞). (Γ, μ) satisfies the Nash inequality (N_α) if for all f ∈ L¹ ∩ L²

    E(f, f) ≥ C_N ‖f‖_2^{2+4/α} ‖f‖_1^{−4/α}.      (N_α)



Remark 3.11  (1) Suppose we wish to prove (N_α), and have proved it for all f ∈ C_{0,+}(V). Then if f ∈ C_0(V), writing f = f_+ − f_−, we have ‖f‖_p^p = ‖f_+‖_p^p + ‖f_−‖_p^p, and E(f, f) ≥ E(f_+, f_+) + E(f_−, f_−) by Lemma 1.27. Hence we obtain (N_α) (with a different constant) for all f ∈ C_0(V). If now f ∈ L¹ ∩ L², let f_n ∈ C_0(V) with f_n → f in L¹ and L². Then by (1.17) f_n → f in H²(V), and hence (N_α) holds for f. It is also enough just to prove (N_α) for f with ‖f‖_1 = 1.
(2) Since E(f, f) ≤ 2‖f‖_2^2 by Proposition 1.21, (N_α) also implies that ‖f‖_2 ≤ c_1 ‖f‖_1 for all f.
(3) Suppose (N_α) holds. Let A ⊂ V. Applying (N_α) to f = 1_A and using (3.2) gives μ_E(A; V − A) ≥ C_N μ(A)^{1−2/α}, so that (I_{α/2}) holds. Taking A = {x} we have

    μ_x ≥ Σ_{y≠x} μ_{xy} ≥ C_N μ_x^{1−2/α},

so that μ_x ≥ C_N^{α/2}. Thus the Nash inequality gives a global lower bound on μ_x.

Exercise 3.12  Let (Γ, μ) be a weighted graph, let λ > 0, and let μ^λ = λμ. If (Γ, μ) satisfies (N_α) with constant C_N then (Γ, μ^λ) satisfies (N_α) with constant λ^{2/α} C_N.

The following simple proof of the Nash inequality in Z^d is based on an argument of Stein – see p. 935 of [N].

Lemma 3.13

The Nash inequality (N_d) holds in Z^d.

Proof  Note that since μ_{xy} = 1 when x ∼ y, we have μ_x = 2d for each x ∈ Z^d, so that expressions for ‖f‖_p will differ from the familiar ones by powers of 2d. Write I = [−π, π]^d, and let f ∈ L¹(Z^d). Then f ∈ L²(Z^d) by Lemma 1.20. For θ ∈ R^d let f̂(θ) be the Fourier transform of f:

    f̂(θ) = Σ_{x∈Z^d} e^{iθ·x} f(x).

Clearly |f̂(θ)| ≤ (2d)^{-1} ‖f‖_1, while by Parseval’s formula

    (2d)^{-1} ‖f‖_2^2 = (2π)^{-d} ∫_I |f̂(θ)|² dθ.



Let e_j, j = 1, …, d, denote the unit vectors in Z^d, and let h_j(x) = f(x + e_j) − f(x). Then

    ĥ_j(θ) = Σ_x e^{iθ·(x−e_j)} f(x) − f̂(θ) = (e^{−iθ·e_j} − 1) f̂(θ).

Since there exists a constant c_0 such that |1 − e^{iθ·e_j}|² ≥ c_0 |θ_j|² for θ ∈ I,

    (2d)^{-1} ‖h_j‖_2^2 = (2π)^{-d} ∫_I |ĥ_j(θ)|² dθ ≥ (2π)^{-d} c_0 ∫_I |θ_j|² |f̂(θ)|² dθ,

and hence

    E(f, f) = Σ_{j=1}^d Σ_x (f(x + e_j) − f(x))² = (2d)^{-1} Σ_{j=1}^d ‖h_j‖_2^2 ≥ (2π)^{-d} c_0 ∫_I |θ|² |f̂(θ)|² dθ.

Let r > 0, and let C_d be the volume of the unit ball B(0, 1) in R^d. Then

    (2π)^d ‖f‖_2^2 = 2d ∫_{B(0,r)} |f̂(θ)|² dθ + 2d ∫_{I−B(0,r)} |f̂(θ)|² dθ
                  ≤ C_d (2d)^{-1} r^d ‖f‖_1^2 + 2d ∫_{I−B(0,r)} |θ/r|² |f̂(θ)|² dθ
                  ≤ C_d (2d)^{-1} r^d ‖f‖_1^2 + 2d r^{−2} c_0^{-1} (2π)^d E(f, f).

If we choose r so that the last two terms are equal up to constants, then r^{d+2} = E(f, f) ‖f‖_1^{−2}, and we obtain

    ‖f‖_2^2 ≤ C_d′ E(f, f)^{d/(d+2)} ‖f‖_1^{4/(d+2)},

which is (N_d). (Note that the proof works even if r is large enough so that I − B(0, r) is empty.)

Theorem 3.14  Let α ∈ [1, ∞]. The following are equivalent:

(a) (Γ, μ) satisfies (F_α^2).
(b) (Γ, μ) satisfies (N_α).

In addition, if α > 2 then (a) and (b) are equivalent to:

(c) (Γ, μ) satisfies (S_α^2).

Proof  If α = ∞ then for f ∈ C_0(V) all three inequalities take the form ‖f‖_2^2 ≤ c E(f, f). An easy approximation argument then gives the general equivalence.



Now suppose that α < ∞. The implication (b) ⇒ (a) is easy. Let supp(f) = D. Then ‖f‖_1 = ‖f 1_D‖_1 ≤ ‖f‖_2 ‖1_D‖_2 = ‖f‖_2 μ(D)^{1/2}. So by (N_α)

    C_N ‖f‖_2^{2+4/α} ≤ ( ‖f‖_2 μ(D)^{1/2} )^{4/α} E(f, f),

and (F_α^2) follows immediately.

For the implication (a) ⇒ (b) we use an argument of Grigor’yan [Gg2] – see also [Co2]. By Remark 3.11 we can suppose that f ∈ C_{0,+}(V). For λ ≥ 0 we have

    f² ≤ 2λ f 1_{(f ≤ 2λ)} + 4(f − λ)_+² 1_{(f > 2λ)} ≤ 2λ f + 4(f − λ)_+².      (3.5)

Applying (F_α^2) to g = (f − λ)_+, and using Lemma 1.27(a) and Markov’s inequality,

    ‖g‖_2^2 ≤ C² μ({f > λ})^{2/α} E(g, g) ≤ C² E(f, f) ( ‖f‖_1 / λ )^{2/α}.

Let λ = ‖f‖_2^2 / 4‖f‖_1. Then using (3.5)

    ‖f‖_2^2 ≤ 2λ‖f‖_1 + 4‖g‖_2^2 ≤ ½‖f‖_2^2 + 4C² E(f, f) (‖f‖_1 λ^{-1})^{2/α}
            = ½‖f‖_2^2 + 4^{1+2/α} C² E(f, f) ‖f‖_1^{4/α} ‖f‖_2^{−4/α},

which is (N_α).

Now let α > 2. The implication (c) ⇒ (a) is easy. Let D = supp(f) and p = α/(α−2), so that the conjugate index is p′ = α/2. Then using (S_α^2),

    ‖f‖_2^2 = ‖f² 1_D‖_1 ≤ ‖f²‖_p ‖1_D‖_{p′} = ‖f‖_{2α/(α−2)}^2 μ(D)^{2/α} ≤ C E(f, f) μ(D)^{2/α}.

That (a) implies (c) needs a bit more work. We include the argument for completeness, but will not need it. For k ∈ Z define Φ_k : R_+ → R_+ by

    Φ_k(t) = (t − 2^k)_+ ∧ 2^k = { 2^k if t ≥ 2^{k+1};  t − 2^k if 2^k ≤ t < 2^{k+1};  0 if t < 2^k }.

So if n is such that 2^n ≤ t < 2^{n+1} then Φ_k(t) = (t − 2^n) 1_{(k=n)} + 2^k 1_{(k<n)}. Let f_k = Φ_k ∘ f, γ_k = μ({f > 2^k}), and p = 2α/(α−2). We have

    ‖f‖_p^p = Σ_k Σ_x f(x)^p 1_{(2^k < f(x) ≤ 2^{k+1})} μ_x ≤ Σ_k 2^{p(k+1)} Σ_x 1_{(2^k < f(x) ≤ 2^{k+1})} μ_x ≤ Σ_k 2^{p(k+1)} γ_k.

For q ≥ 2,

    ‖f_k‖_q^q ≥ 2^{qk} μ({f > 2^{k+1}}) = 2^{qk} γ_{k+1}.      (3.6)

Applying (F_α^2) to f_k gives

    ‖f_k‖_2^2 ≤ C μ({f > 2^k})^{2/α} E(f_k, f_k) = C γ_k^{2/α} E(f_k, f_k),

and using (3.6), first with q = 2 and then with q = p,

    2^{2k} γ_{k+1} ≤ ‖f_k‖_2^2 ≤ C γ_k^{2/α} E(f_k, f_k) ≤ C ( ‖f_{k−1}‖_p^p 2^{−(k−1)p} )^{2/α} E(f_k, f_k).

Therefore, since p − 2p/α = 2 and |f_k| ≤ |f|,

    ‖f‖_p^p ≤ 2^{2p} Σ_k 2^{pk} γ_{k+1}
            ≤ 2^{2p+2p/α} C Σ_k 2^{(p−2)k} ‖f‖_p^{2p/α} 2^{−2kp/α} E(f_k, f_k)
            ≤ C′ ‖f‖_p^{2p/α} Σ_k E(f_k, f_k) ≤ C ‖f‖_p^{p−2} E(f, f).

Since f ∈ C_0(V) we have ‖f‖_p < ∞, and dividing the last inequality by ‖f‖_p^{p−2} gives (S_α^2).

Corollary 3.15  Let α ∈ [1, ∞). If (Γ, μ) satisfies (I_α) then it satisfies (N_α).

Proof  Combining Theorem 3.7, Lemma 3.9, and Theorem 3.14, we have the following chain of implications: (I_α) ⇒ (F_α^1) ⇒ (F_α^2) ⇒ (N_α).



Remark 3.16  In number theory a constant in an equation or inequality is called effective if it could in principle be calculated. Most proofs of inequalities in analysis and probability yield constants which are effective, but occasionally a ‘pure existence’ proof will show that some inequality holds without giving any control on the constants. If one inspects the proof of Theorem 3.14, one sees that the constants in the results are effective, in the sense that they depend only on the constants in the ‘input data’. So if, for example, (Γ, μ) satisfies (F_α^2) with a constant C_F, then the proof of the implication (a) ⇒ (b) in Theorem 3.14 shows that (Γ, μ) satisfies (N_α) with a constant C which depends only on α and C_F. The same holds for the other implications, and, although we do not remark on this again, it holds for the remaining results in this book.

Corollary 3.17  Let (Γ, μ) satisfy (I_α) with α ∈ (2, ∞]. Then (Γ, μ) is transient.

Proof  Let x_0 ∈ V and p = 2α/(α−2). Then using (S_α^2),

    |f(x_0)|² ≤ μ_{x_0}^{−2/p} ‖f‖_p^2 ≤ c E(f, f),

and so (Γ, μ) is transient by Theorem 2.42.

Lemma 3.18  (a) Let α ∈ (2, ∞). If (Γ, μ) satisfies (S_α^2) then (Γ, μ) satisfies (I_{α/2}).
(b) If (Γ, μ) satisfies (S_∞^2) then (Γ, μ) satisfies (I_∞).

Proof  (a) Let A ⊂ V and f = 1_A. Then ‖f‖_{2α/(α−2)}^2 = μ(A)^{(α−2)/α}, and by (3.2) E(f, f) = μ_E(A; A^c). So by (S_α^2),

    μ_E(A; A^c) ≥ c μ(A)^{(α−2)/α} = c μ(A)^{1−2/α},

giving (I_{α/2}).
(b) Again taking f = 1_A, we have ‖f‖_2^2 = μ(A), so that (I_∞) is immediate.

Remark 3.19  Z² satisfies (S_α^1) with α = 2. One might hope that one would then obtain the α ↓ 2 limit of (S_α^2), that is,

    ‖f‖_∞^2 ≤ C_1 E(f, f),   f ∈ C_0(Z²),      (3.7)

but since Z² is recurrent, using Theorem 2.42 one sees that (3.7) cannot hold. There can be a loss of information in passing from (S_α^1) to (S_α^2).



Exercise 3.20  For i = 1, 2 let Γ_i satisfy (N_{α_i}). Then the join of Γ_1 and Γ_2 satisfies (N_{α_1 ∧ α_2}).

3.3 Poincaré Inequality

We now introduce a second kind of isoperimetric inequality, and begin by considering finite graphs.

Definition 3.21  A finite weighted graph (Γ, μ) satisfies a relative isoperimetric inequality (with constant C_R) if, for any A ⊂ V satisfying μ(A) ≤ ½μ(V), one has

    μ_E(A; V − A) / μ(A) ≥ C_R.      (3.8)

We write R_I = R_I(Γ) = R_I(Γ, μ) for the largest constant C_R such that (3.8) holds, and call R_I the relative isoperimetric constant of Γ. This is closely related to the Cheeger constant for a finite graph, defined by

    χ(Γ) = min_{A ≠ ∅, V} μ(V) μ_E(A; V − A) / ( μ(A) μ(V − A) ).      (3.9)

Since for any A with 0 < μ(A) ≤ ½μ(V) one has

    μ_E(A; V − A)/μ(A) ≤ μ(V) μ_E(A; V − A)/( μ(A) μ(V − A) ) ≤ 2 μ_E(A; V − A)/μ(A),

it follows that R_I(Γ) ≤ χ(Γ) ≤ 2 R_I(Γ).

Proposition 3.22  Let Γ = (V, E) be a finite graph. The following are equivalent:

(a) (Γ, μ) satisfies a relative isoperimetric inequality with constant C_R.
(b) For any f : V → R,

    min_a Σ_{y∈V} |f(y) − a| μ_y ≤ C_R^{-1} ‖∇f‖_1.      (3.10)

Consequently

    R_I = inf{ ‖∇f‖_1 / min_a Σ_{y∈V} |f(y) − a| μ_y : f ∈ C(V), f non-constant }.

Proof  Note first that the left side of (3.10) is minimised by choosing a to be a median of f, that is, so that μ({x : f(x) > a}) and μ({x : f(x) < a}) are both less than or equal to ½μ(V).


(a) ⇒ (b) Choose a as above, and let g = f − a. Since |g(x) − g(y)| = |g₊(x) − g₊(y)| + |g₋(x) − g₋(y)|, we have ||∇g||_1 = ||∇g₊||_1 + ||∇g₋||_1. Using (3.8) and the co-area formula (Lemma 3.8),

    Σ_{x∈V} g₊(x) μ_x = ∫_0^∞ μ({x : g₊(x) > t}) dt
                      ≤ C_R^{-1} ∫_0^∞ μ_E({g₊ > t}; {g₊ ≤ t}) dt
                      = C_R^{-1} ||∇g₊||_1.

Adding this and a similar inequality for g₋ gives (3.10).
(b) ⇒ (a) Let μ(A) ≤ ½μ(V) and f = 1_A. Then f has median zero, and (3.10) gives μ(A) ≤ C_R^{-1} μ_E(A; V − A), which is the relative isoperimetric inequality with constant C_R.
The final characterisation of R_I is immediate from the equivalence of (a) and (b), and the definition of R_I.

Given a (finite) graph it is often easy to give a good upper bound for R_I(Γ) by exhibiting a suitable set A. For lower bounds one can use a dual argument which finds a good (i.e. well spread out) family of paths on the graph.

Definition 3.23 (See [DiS, SJ]) A family of paths M covers Γ if for each pair of distinct points x, y ∈ V there exists a path γ = γ(x, y) ∈ M from x to y. Given such a family of paths, set

    κ(M) = max_{e∈E} μ_e^{-1} Σ_{(x,y) : e∈γ(x,y)} μ_x μ_y.

We wish to choose the family of paths M so that κ(M) is small.

Theorem 3.24 Let Γ be a finite graph and M be a covering family of paths. Then

    R_I(Γ) ≥ μ(V)/κ(M).

Proof Let f : V → R. Then

    μ(V) min_λ Σ_x |f(x) − λ| μ_x = Σ_y μ_y min_λ Σ_x |f(x) − λ| μ_x
                                  ≤ Σ_y μ_y Σ_x |f(x) − f(y)| μ_x.


But since, for any x, y ∈ V,

    f(y) − f(x) = Σ_{i=1}^k ( f(z_i) − f(z_{i−1}) ),

where (z_0, z_1, . . . , z_k) is the path γ(x, y), we have

    Σ_y Σ_x |f(x) − f(y)| μ_x μ_y ≤ Σ_x Σ_y μ_x μ_y Σ_{e∈γ(x,y)} |∇f(e)|
        = Σ_{e∈E} |∇f(e)| μ_e ( μ_e^{-1} Σ_{x,y} 1_{(x,y): e∈γ(x,y)} μ_x μ_y )
        ≤ κ(M) ||∇f||_1.

Using Proposition 3.22 completes the proof.

Theorem 3.25 Let Γ be the cube Q = {1, . . . , R}^d in Z^d. Then R_I(Γ) ≥ c_d/R.

Proof Note that R^d ≤ μ(Q) ≤ (2dR)^d. For each pair x, y ∈ Q let γ(x, y) be the path which first matches the first coordinate, then the second, and so on. Now let e = {z, z + e_k} be an edge in Γ, where 1 ≤ k ≤ d. If e ∈ γ(x, y) then the first k − 1 coordinates of x must already have been set to those of y, while coordinates k + 1, . . . , d will not yet have been altered. So we must have z_i = y_i for 1 ≤ i ≤ k − 1, and z_i = x_i for k + 1 ≤ i ≤ d. Thus x lies in a set of size at most R^k and y in a set of size at most R^{d−(k−1)}. So |{(x, y) : e ∈ γ(x, y)}| ≤ R^k R^{d−k+1} = R^{d+1}. Thus κ(M) ≤ (2d)^2 R^{d+1} and hence

    R_I(Γ) ≥ μ(Q)/((2d)^2 R^{d+1}) ≥ 1/((2d)^2 R).
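As a quick sanity check on Theorem 3.25, the relative isoperimetric constant of a small cube can be computed by brute force over all subsets. The sketch below is illustrative only (the helper names are mine, not from the text); with natural weights (μ_e = 1, μ_x = deg x) the search over the 3 × 3 square in Z^2 returns R_I = 1/3, attained by a 2 × 2 corner block, consistent with the lower bound of order 1/R.

```python
import itertools

def grid_graph(R, d=2):
    # vertices of the cube {0, ..., R-1}^d with nearest-neighbour edges
    verts = list(itertools.product(range(R), repeat=d))
    edges = []
    for v in verts:
        for k in range(d):
            w = list(v)
            w[k] += 1
            if w[k] < R:
                edges.append((v, tuple(w)))
    return verts, edges

def relative_isoperimetric_constant(verts, edges):
    # natural weights: mu_e = 1 for every edge, mu_x = deg(x)
    deg = {v: 0 for v in verts}
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    total = sum(deg.values())
    best = float("inf")
    for r in range(1, len(verts)):
        for A in itertools.combinations(verts, r):
            As = set(A)
            muA = sum(deg[v] for v in A)
            if 2 * muA > total:          # only sets with mu(A) <= mu(V)/2
                continue
            cut = sum(1 for a, b in edges if (a in As) != (b in As))
            best = min(best, cut / muA)
    return best

verts, edges = grid_graph(3)
print(relative_isoperimetric_constant(verts, edges))   # 1/3 for the 3x3 square
```

Brute force is exponential in |V|, so this is only feasible for very small cubes; the canonical-paths bound of Theorem 3.24 is exactly what replaces it for large R.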

This leads easily to a proof of the isoperimetric inequality in Z^d.

Theorem 3.26 Let A be a finite subset of Z^d. Then

    μ_E(A; Z^d − A) ≥ C_d μ(A)^{(d−1)/d}.

Proof Choose r such that ¼r^d ≤ μ(A) ≤ ½r^d. Let (Q_n, n ∈ N) be a tiling of Z^d by disjoint cubes each containing r^d points, and let A_n = A ∩ Q_n. Set a_n = μ(A_n), b_n = μ_E(A_n; Q_n − A_n). Then μ(A) = Σ a_n, while μ_E(A; Z^d − A) ≥ Σ b_n. Since μ(A_n) ≤ ½μ(Q_n), by Theorem 3.25 we have b_n ≥ (c/r)a_n for each n. Hence

    μ(A) = Σ a_n ≤ (r/c) Σ b_n ≤ cr μ_E(A; Z^d − A),

and as r has been chosen so that r^d ≍ μ(A), the inequality (I_d) follows.

Proposition 3.27

Let Γ = (V, E) be a finite graph. Then for any f : V → R,

    min_λ Σ_{y∈V} |f(y) − λ|^2 μ_y ≤ 2 R_I^{-2} E(f, f).        (3.11)

Proof Note that the left side of (3.11) is minimised by taking λ to be the average of f, that is,

    λ = f̄_V = μ(V)^{-1} Σ_{x∈V} f(x) μ_x.

Let f : V → R. By subtracting a constant we can assume that the median of f is zero. Set g(x) = f(x)^2 sgn(f(x)), where sgn(x) = 1_{(x>0)} − 1_{(x<0)}. […] λ_2 > 0, and for all f : V → R,

    min_b Σ_x |f(x) − b|^2 μ_x ≤ λ_2^{-1} E(f, f).        (3.20)

Proof For any f ∈ C(V) we have, using (3.17) and the orthogonality of the ϕ_i,

    E(f, f) = −⟨Δf, f⟩ = Σ_{i=1}^N λ_i ⟨f, ϕ_i⟩^2,        (3.21)

    ||f||_2^2 = Σ_{i=1}^N ⟨f, ϕ_i⟩^2.        (3.22)

Given (3.21) and (3.22) we obtain (3.19) by induction in k. Taking f = 1 when k = 1 gives λ_1 = 0 and shows that ϕ_1 is a constant a. Then 1 = ⟨ϕ_1, ϕ_1⟩ = a^2 μ(V), so a = μ(V)^{-1/2}. Since the space of functions is finite dimensional, the infimum for the variational problem when k = 2 is attained by some function f_2; since f_2 is orthogonal to ϕ_1, the function f_2 must be non-constant, and so (since Γ is connected) we have E(f_2, f_2) > 0, and thus λ_2 > 0. The final inequality follows on setting b = a⟨f, ϕ_1⟩ and using (3.21) and (3.22) for g = f − b.

The size of the eigenvalues controls the speed of convergence of the transition density to a constant, in the L^2 sense.


Lemma 3.37 Let V be finite, ν be a probability measure on V, and f_n be the P^ν-density of X_n:

    f_n(y) = Σ_x ν_x p_n(x, y).

Then, if a = μ(V)^{-1/2} and ρ∗ = max_{i≥2} |ρ_i|,

    ||f_k − a^2||_2^2 ≤ (ρ∗)^k ||f_0 − a^2||_2^2.

Proof We have f_0(x) = ν_x/μ_x, so that

    f_k(y) = Σ_x f_0(x) p_k(x, y) μ_x = ⟨f_0, p_k(·, y)⟩ = Σ_{i=1}^N ⟨f_0, ϕ_i⟩ ϕ_i(y) ρ_i^k.

Since ϕ_1 = a and ρ_1 = 1, we have ⟨f_0, ϕ_1⟩ ϕ_1(y) ρ_1^k = a^2, and thus

    f_k − a^2 = Σ_{i=2}^N ⟨f_0, ϕ_i⟩ ρ_i^k ϕ_i.

So

    ||f_k − a^2||_2^2 = Σ_{i=2}^N ⟨f_0, ϕ_i⟩^2 ρ_i^{2k} ≤ (ρ∗)^{2k} ||f_0 − a^2||_2^2 ≤ (ρ∗)^k ||f_0 − a^2||_2^2,

since ρ∗ ≤ 1.

In the above ρ∗ is either ρ_2 or |ρ_N|. In considering rates of convergence to equilibrium for discrete time random walks, the possibility of negative values for ρ_i is a nuisance. One way of avoiding this is to look at walks in continuous time; another is to consider the lazy walk defined in Remark 1.53. It is easy to check that the semigroup operator for the lazy walk is given by P^{(L)} = ½(I + P), and so if ϕ_i is an eigenfunction for P then it is also one for P^{(L)}, with eigenvalue

    ρ_i^{(L)} = ½(1 + ρ_i).

As ρ_i ∈ [−1, 1] we have ρ_i^{(L)} ∈ [0, 1], and max_{i≥2} |ρ_i^{(L)}| = ρ_2^{(L)}. By Lemma 3.37, applied to the lazy walk, we obtain

    ||(P^{(L)})^k f_0 − a^2||_2^2 ≤ (½ + ½ρ_2)^k ||f_0 − a^2||_2^2.

The quantity λ_2 = 1 − ρ_2 is often called the spectral gap.
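The lazy-walk eigenvalue formula can be seen concretely on the triangle with unit edge weights (an illustrative example of mine, not from the text): P = (J − I)/2 has eigenvalues 1, −1/2, −1/2, so ρ_2^{(L)} = ½(1 − ½) = 1/4, and the density of the lazy walk contracts towards the uniform distribution by exactly this factor at each step.

```python
# Lazy walk on the triangle with unit weights: P = (J - I)/2 has
# eigenvalues 1, -1/2, -1/2, so P_L = (I + P)/2 has second
# eigenvalue rho_2^(L) = (1 + (-1/2))/2 = 1/4.
P = [[0.0, 0.5, 0.5], [0.5, 0.0, 0.5], [0.5, 0.5, 0.0]]
PL = [[0.5 * (i == j) + 0.5 * P[i][j] for j in range(3)] for i in range(3)]

def step(M, v):
    # one application of the (symmetric) transition matrix to a density
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]

v = [1.0, 0.0, 0.0]                 # walk started at vertex 0
for k in range(1, 6):
    v = step(PL, v)
    gap = max(abs(vi - 1 / 3) for vi in v)
    # distance to equilibrium contracts by exactly 1/4 per step here
    assert abs(gap - (2 / 3) * 0.25 ** k) < 1e-12
print(v)
```

On this tiny symmetric example the contraction is exact because the initial error lies entirely in the second eigenspace; in general Lemma 3.37 gives only the upper bound.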

Lemma 3.38 (Cheeger’s inequality) Let Γ be finite and λ_2 be the smallest non-zero eigenvalue of −Δ. Then

    ½R_I^2 ≤ λ_2 ≤ 2R_I.        (3.23)

Proof By (3.20) and (3.21) we have λ_2 = inf{E(f, f) : ||f||_2 = 1, f̄_V = 0}. Thus the first inequality is immediate from (3.11). To prove the second inequality, let A be a set which attains the minimum in (3.8), set b = μ(A)/μ(V), and let f = 1_A − b. Then f̄_V = 0, E(f, f) = μ_E(A; V − A), and ||f||_2^2 = μ(A)(1 − b). So

    λ_2 ≤ E(f, f)/||f||_2^2 = μ_E(A; V − A)/( μ(A)(1 − b) ) ≤ 2R_I.
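Cheeger’s inequality is easy to check numerically on small examples. The sketch below (mine, not from the text) uses the n-cycle with unit edge weights, relying on two standard facts: the eigenvalues of P on the n-cycle are cos(2πk/n), so λ_2 = 1 − cos(2π/n), and for even n an arc of n/2 vertices attains R_I = 2/n.

```python
import math

def cycle_gap_and_RI(n):
    # n-cycle, unit edge weights (n even): eigenvalues of P are
    # cos(2*pi*k/n), so lambda_2 = 1 - cos(2*pi/n); an arc A of n/2
    # vertices has mu(A) = 2*(n/2) = n and boundary 2, so R_I = 2/n
    lam2 = 1 - math.cos(2 * math.pi / n)
    RI = 2.0 / n
    return lam2, RI

for n in (4, 6, 10, 50):
    lam2, RI = cycle_gap_and_RI(n)
    assert 0.5 * RI ** 2 <= lam2 <= 2 * RI   # Cheeger's inequality (3.23)
print("ok")
```

Note how for large n both sides of (3.23) are of the right order: λ_2 ≈ 2π²/n² sits between R_I²/2 = 2/n² and 2R_I = 4/n, illustrating that the lower (quadratic) bound is the sharp side for the cycle.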

3.5 Strong Isoperimetric Inequality and Spectral Radius

By Theorem 3.7, Lemma 3.9, and Lemma 3.18 we have

    (I_∞) ⇔ (F^1_∞) ⇔ (S^1_∞) ⇔ (S^2_∞).

Definition 3.39 Let x, y ∈ V. The spectral radius ρ is defined by

    ρ = ρ(Γ, μ) = lim sup_{n→∞} P_n(x, y)^{1/n} = lim sup_{n→∞} p_n(x, y)^{1/n}.        (3.24)

Proposition 3.40 (a) The definition in (3.24) is independent of x, y.
(b) 0 < ρ ≤ 1.
(c) For all x, y ∈ V, n ≥ 0,

    p_n(x, y)(μ_x μ_y)^{1/2} ≤ ρ^n.

(d) For each x ∈ V, ρ = lim_{n→∞} P_{2n}(x, x)^{1/2n}.

Proof (a) Write ρ(x, y) = lim sup P_n(x, y)^{1/n}. Let x, x′, y ∈ V. Then as Γ is connected there exists k such that p = P_k(x, x′) > 0. Hence P_{n+k}(x, y) ≥ P_k(x, x′)P_n(x′, y), so

    ρ(x, y) = lim sup_n P_{n+k}(x, y)^{1/(n+k)} ≥ lim sup_n p^{1/(n+k)} P_n(x′, y)^{1/(n+k)} = ρ(x′, y).

It follows that ρ(x, y) is the same for all x, y.


(b) Since P_n(x, x) ≤ 1 it is clear that ρ ≤ 1. Also, P_{2n}(x, x) ≥ P_2(x, x)^n, so ρ ≥ P_2(x, x)^{1/2} > 0.
(c) Consider the case y = x first. Suppose for some k and ε > 0 that P_k(x, x) = p_k(x, x)μ_x ≥ (1 + ε)ρ^k. Then P_{nk}(x, x) ≥ P_k(x, x)^n ≥ (1 + ε)^n ρ^{nk}, so ρ ≥ lim sup_n P_{nk}(x, x)^{1/nk} ≥ (1 + ε)^{1/k} ρ, a contradiction. If x ≠ y then ρ^{2n} ≥ p_{2n}(x, x)μ_x ≥ p_n(x, y)^2 μ_y μ_x.
(d) Let a_n = P_{2n}(x, x), so that a_n ≥ P_2(x, x)^n > 0. Now a_{n+m} ≥ a_n a_m, so that the sequence (a_n) is supermultiplicative. Hence by Fekete’s lemma (i.e. the convergence of a subadditive sequence, applied to −log a_n) there exists κ such that

    lim_n a_n^{1/n} = sup_n a_n^{1/n} = κ;

thus ρ = κ^{1/2} = lim_n P_{2n}(x, x)^{1/2n} = sup_n P_{2n}(x, x)^{1/2n}.

Proposition 3.41 ||P||_{2→2} = ρ.

Proof Write ||P|| = ||P||_{2→2}. First, ||P^n|| ≤ ||P||^n, so if f = 1_{x_0} then P^n f(x) = p_n(x, x_0)μ_{x_0} and

    ||P||^{2n} ||f||_2^2 ≥ ||P^n f||_2^2 = Σ_x p_n(x, x_0)^2 μ_x μ_{x_0}^2 = p_{2n}(x_0, x_0) μ_{x_0}^2.

Thus ||P|| ≥ lim_n p_{2n}(x_0, x_0)^{1/2n} = ρ, using Proposition 3.40(d). Let f ∈ C_0^+(V) and b_n = ⟨P^n f, P^n f⟩ = ⟨P^{2n} f, f⟩. Then

    lim_n b_n^{1/n} = lim_n ( Σ_{x,y} p_{2n}(x, y) f(x) f(y) μ_x μ_y )^{1/n} = ρ^2,

by Proposition 3.40(c) and (d). On the other hand, using Cauchy–Schwarz,

    b_{n+1}^2 = ⟨P^{n+2} f, P^n f⟩^2 ≤ ⟨P^{n+2} f, P^{n+2} f⟩ ⟨P^n f, P^n f⟩ = b_{n+2} b_n.

So

    d_n = b_{n+1}/b_n ≤ b_{n+2}/b_{n+1} = d_{n+1},

and thus d_n ↑ D. Since b_{n+1} ≤ Db_n it follows that b_n ≤ D^n b_0, and so ρ^2 ≤ D. But also b_{k+n} ≥ d_k^n b_k, so ρ^2 = lim_n b_{k+n}^{1/n} ≥ d_k, and hence D = ρ^2. Finally,

    ||Pf||_2^2 = ⟨Pf, Pf⟩ = b_1 ≤ Db_0 = ρ^2 ||f||_2^2,

so ||P|| ≤ ρ.
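Proposition 3.40(d) suggests a direct numerical estimate of ρ: compute P_{2n}(x, x)^{1/2n} for moderately large n. The sketch below (illustrative, not part of the text) does this for the SRW on the 3-regular tree, using the fact that the distance from the root is a birth–death chain; by Kesten’s theorem the d-regular tree has ρ = 2√(d−1)/d, which the estimate approaches from below.

```python
import math

def return_prob(d, n):
    # P(X_n = root) for SRW on the infinite d-regular tree, via the
    # distance-from-root birth-death chain: from 0 go to 1 w.p. 1;
    # from k >= 1 go up w.p. (d-1)/d and down w.p. 1/d
    probs = {0: 1.0}
    for _ in range(n):
        nxt = {}
        for k, p in probs.items():
            if k == 0:
                nxt[1] = nxt.get(1, 0.0) + p
            else:
                nxt[k + 1] = nxt.get(k + 1, 0.0) + p * (d - 1) / d
                nxt[k - 1] = nxt.get(k - 1, 0.0) + p / d
        probs = nxt
    return probs.get(0, 0.0)

d = 3
rho = 2 * math.sqrt(d - 1) / d           # Kesten: spectral radius of the d-regular tree
est = return_prob(d, 400) ** (1 / 400)   # P_{2n}(x,x)^{1/2n} with 2n = 400
print(est, rho)                          # est approaches rho from below
```

The polynomial prefactor in P_{2n}(x, x) ≈ c n^{-3/2} ρ^{2n} means the convergence is only logarithmically fast in the exponent, which is why the estimate at 2n = 400 is still a few per cent below ρ.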


Definition 3.42 A weighted graph (Γ, μ) is amenable if there exists a sequence of sets A_n with

    μ_E(A_n; V − A_n)/μ(A_n) → 0 as n → ∞.

Thus (Γ, μ) is amenable if and only if it fails to satisfy (I_∞).

Theorem 3.43 (a)

    1 − ρ = inf { E(f, f)/⟨f, f⟩ : f ≠ 0, f ∈ C_0(V) }.

(b) ρ < 1 if and only if (Γ, μ) satisfies (I_∞).
(c) ρ = 1 if and only if (Γ, μ) is amenable.
(d) The property that ρ < 1 is stable under rough isometries.

Proof (a) By Lemma A.32,

    ρ = ||P|| = sup_{||f||_2 = 1} ⟨Pf, f⟩,

so

    1 − ρ = inf { ( ⟨f, f⟩ − ⟨Pf, f⟩ )/⟨f, f⟩ : f ≠ 0 } = inf { E(f, f)/⟨f, f⟩ : f ≠ 0 }.

(b) and (c) From (a), if ρ < 1 then (S^2_∞) holds with constant (1 − ρ)^{-1}, while if (S^2_∞) holds with constant C_1 then (a) implies that 1 − ρ ≥ C_1^{-1}.
(d) follows from (b) and Theorem 3.4.

It follows from this and Lemma 3.3 that the only graphs with ρ < 1 are those with exponential (or faster) volume growth. The converse is not true – ‘lamplighter graphs’ give an example.

Example (Lamplighter graphs) Write a typical element x = (n, ξ), where n ∈ Z and ξ = (. . . , ξ_{−1}, ξ_0, ξ_1, . . . ) ∈ {0, 1}^Z is a doubly infinite sequence of 0s and 1s. We take

    V = { (n, ξ) ∈ Z × {0, 1}^Z : Σ_{i=−∞}^∞ ξ_i < ∞ }.

We think of n as the location of the lamplighter, and ξ as giving the state of the lamps – of which only finitely many are in state 1. We define edges so that (n, ξ) ∼ (n + 1, ξ′) if and only if ξ_k = ξ′_k for all k ∉ {n, n + 1}. Thus each point x has eight neighbours. At each time step the lamplighter moves according to


a SRW on Z, and then randomises the state of the lamps at his old and new location. Let η be the configuration with η_k = 0 for all k, and set 0 = (0, η). Let n ≥ 1 and F be the event that the lamplighter returns to 0 at time 2n, and has visited no more than (2n)^{1/2} points of Z. Then we have P^0(F) ≥ c_0 n^{−1/2}. Let G be the event that each lamp visited at a time in {0, . . . , 2n} is switched off at the last visit before time 2n. If k is the number of lamps visited, then P(G) = 2^{−k}. Hence

    P_{2n}(0, 0) = P^0(X_{2n} = 0) ≥ P^0(F ∩ G) ≥ c_0 n^{−1/2} 2^{−(2n)^{1/2}}.

Hence P_{2n}(0, 0)^{1/2n} → 1 as n → ∞, so ρ = 1 and the lamplighter graph is amenable. On the other hand, B(0, n) contains 2^{n+1} points of the form (n, ξ), where ξ is any configuration with ξ_k = 0 if k ∉ {0, . . . , n}, so that Γ has exponential volume growth. This graph is vertex transitive, but has some surprising properties. For example, there are many points x with d(x, 0) = n such that x has no neighbours at distance n + 1 from 0. See [Rev] for the behaviour of the heat kernel p_n(x, y) on this graph.

Recall from Chapter 1 the definition of the potential operator G.

Corollary 3.44 ||G||_{2→2} < ∞ if and only if ρ < 1. In particular ||G||_{2→2} = ∞ for Z^d.

Proof Since G = Σ_n P^n, we have ||G|| = ||G||_{2→2} ≤ Σ_n ρ^n = (1 − ρ)^{-1} if ρ < 1.
If ρ = 1 then ||P^n|| = 1 for all n ≥ 0. Let ε > 0, and choose m ≥ 1. Since P|f| ≥ |Pf| there exists f ∈ L^2_+ with ||f||_2 = 1 such that ||P^{2m} f||_2 ≥ (1 − ε). Hence we have ||P^k f||_2 ≥ (1 − ε) for each 0 ≤ k ≤ 2m. Since P^j f ≥ 0 for each j, we have Gf ≥ Σ_{k=0}^m P^{2k} f, and thus

    ||Gf||_2^2 ≥ || Σ_{k=0}^m P^{2k} f ||_2^2 = ⟨ ( Σ_{k=0}^m P^{2k} )^2 f, f ⟩ = Σ_{j=0}^{2m} a_j ⟨P^{2j} f, f⟩,

where (1 + t^2 + · · · + t^{2m})^2 = Σ_{j=0}^{2m} a_j t^{2j}. Since ⟨P^{2j} f, f⟩ = ||P^j f||_2^2 ≥ (1 − ε)^2 for 0 ≤ j ≤ 2m, we obtain

    ||Gf||_2^2 ≥ (1 − ε)^2 Σ_{j=0}^{2m} a_j ≥ (1 − ε)^2 m^2.

As m was arbitrary, ||G||_{2→2} must be infinite.

Definition 3.45 We say the random walk X is ballistic if

    lim inf_{n→∞} (1/n) d(x, X_n) > 0,   P^x-a.s. for each x ∈ V.

Theorem 3.46 Suppose that (Γ, μ) satisfies (H5). If ρ < 1 then X is ballistic.

Proof This follows from Proposition 3.40(c) and Lemma 1.3 by a simple ‘exhaustion of mass’ argument. Let x ∈ V, λ > 0 and set B_n = B(x, λn). Set F_n = {X_n ∈ B_n}. Then

    P^x(X_n ∈ B_n) = Σ_{y∈B_n} p_n(x, y) μ_y ≤ ρ^n μ_x^{-1/2} Σ_{y∈B_n} μ_y^{1/2}
                   ≤ μ_x^{-1/2} ρ^n |B_n|^{1/2} μ(B_n)^{1/2}
                   ≤ c C_1^{λn} ρ^n.

Thus if λ is chosen small enough so that ρC_1^λ < 1 then Σ_n P^x(F_n) < ∞, and by Borel–Cantelli

    P^x(F_n occurs infinitely often) = 0.

Hence lim inf_n n^{-1} d(x, X_n) ≥ λ.

Examples (1) Theorem 3.46 gives a sufficient condition for ballistic behaviour, but it is far from necessary. Consider for example the graph V which is the join of Z^d with the binary tree B. Since B is transient, X can cross between the two parts only a finite number of times. If d ≥ 3 then X will ultimately escape to infinity either in B, in which case it moves ballistically, or else in Z^d, in which case it escapes non-ballistically. If d ≤ 2 then X is ultimately in B, so is ballistic. However, it is easy to see, by looking at sets in the Z^d part of the graph, that V does not satisfy (I_∞), so that ρ(V) = 1.
(2) The random walk on the lamplighter graph is not ballistic, although this graph has exponential volume growth.

4 Discrete Time Heat Kernel

4.1 Basic Properties and Bounds on the Diagonal

We will now use the inequalities in Chapter 3 to prove ‘heat kernel’ bounds, that is, bounds on the transition density p_n(x, y). We begin with some basic properties of the heat kernel.

Lemma 4.1 For x, y ∈ V, n ≥ 0,

    p_{2n+1}(x, x) ≤ p_{2n}(x, x),        (4.1)
    p_{2n+2}(x, x) ≤ p_{2n}(x, x),        (4.2)
    p_{2n}(x, y) ≤ p_{2n}(x, x)^{1/2} p_{2n}(y, y)^{1/2},        (4.3)
    p_{2n+1}(x, y) ≤ p_{2n}(x, x)^{1/2} p_{2n+2}(y, y)^{1/2}.        (4.4)

Further, if r_n(x, y) = p_n(x, y) + p_{n+1}(x, y), then

    E(r_n^x, r_m^y) = r_{n+m}(x, y) − r_{n+m+2}(x, y),        (4.5)
    ⟨r_n^x, r_m^y⟩ = r_{n+m}(x, y) + r_{n+m+1}(x, y).        (4.6)

In particular r_{2n}(x, x) is decreasing in n.

Proof By (1.12)–(1.14),

    p_{2n}(x, x) − p_{2n+1}(x, x) = ⟨p_n^x − p_{n+1}^x, p_n^x⟩ = −⟨Δp_n^x, p_n^x⟩ = E(p_n^x, p_n^x) ≥ 0,

proving (4.1). To prove (4.2) we use Proposition 1.16:

    p_{2n+2}(x, x) = ⟨p_{n+1}^x, p_{n+1}^x⟩ = ⟨Pp_n^x, Pp_n^x⟩ = ||Pp_n^x||_2^2 ≤ ||p_n^x||_2^2 = p_{2n}(x, x).

Equations (4.3) and (4.4) are immediate from (1.12) and Cauchy–Schwarz. (Note that we cannot expect p_k(x, y)^2 ≤ p_k(x, x)p_k(y, y) in general when k is odd; any bipartite graph gives a counterexample.)


For the equalities involving r_n^x, note first that

    Δr_n^x = Δ(p_n^x + p_{n+1}^x) = (p_{n+1}^x − p_n^x) + (p_{n+2}^x − p_{n+1}^x) = p_{n+2}^x − p_n^x.

So,

    E(r_n^x, r_m^y) = −⟨Δr_n^x, r_m^y⟩
                   = ⟨p_n^x − p_{n+2}^x, p_m^y + p_{m+1}^y⟩
                   = p_{n+m}(x, y) + p_{n+m+1}(x, y) − p_{n+m+2}(x, y) − p_{n+m+3}(x, y)
                   = r_{n+m}(x, y) − r_{n+m+2}(x, y).

A similar calculation gives (4.6).

Lemma 4.2

Let f ∈ L^2. Then for n ≥ 0,

    0 ≤ ||P^{n+1} f||_2^2 − ||P^{n+2} f||_2^2 ≤ ||P^n f||_2^2 − ||P^{n+1} f||_2^2.        (4.7)

Further, for n ≥ 1,

    ||f||_2^2 − ||P^n f||_2^2 ≤ 2nE(f, f).        (4.8)

Proof The first inequality in (4.7) holds since ||P||_{2→2} ≤ 1 by Proposition 1.16. For the second, writing g = P^n f, we have

    0 ≤ ⟨P^2 g − g, P^2 g − g⟩ = ⟨g, g⟩ − 2⟨g, P^2 g⟩ + ⟨P^2 g, P^2 g⟩,

and rearranging gives ⟨Pg, Pg⟩ − ⟨P^2 g, P^2 g⟩ ≤ ⟨g, g⟩ − ⟨Pg, Pg⟩, which proves the second inequality. Since P = I + Δ, we have

    ⟨f, f⟩ − ⟨Pf, Pf⟩ = −2⟨Δf, f⟩ − ⟨Δf, Δf⟩ ≤ −2⟨Δf, f⟩ = 2E(f, f),

which proves (4.8) when n = 1. If n ≥ 2 then, using (4.7),

    ||f||_2^2 − ||P^n f||_2^2 = Σ_{k=0}^{n−1} ( ||P^k f||_2^2 − ||P^{k+1} f||_2^2 ) ≤ n ( ||f||_2^2 − ||Pf||_2^2 ),

and (4.8) follows.

Theorem 4.3

Let α ≥ 1. The following are equivalent.

(a) Γ satisfies the Nash inequality (N_α).
(b) There exists C_H such that

    p_n(x, x) ≤ C_H/(1 ∨ n)^{α/2},   n ≥ 0, x ∈ V.        (4.9)


(c) There exists C′_H such that

    p_n(x, y) ≤ C′_H/(1 ∨ n)^{α/2},   n ≥ 0, x, y ∈ V.        (4.10)

Proof (a) ⇒ (b) Assume the Nash inequality holds with constant C_N. If n = 0 we have p_0(x, x) = μ_x^{-1} ≤ C_N^{-α/2} by Remark 3.11(3). For n ≥ 1 we work with the function r_n(y) = r_n(x, y) = p_n(x, y) + p_{n+1}(x, y) introduced in Lemma 4.1. We have ||r_n^x||_1 = 2, while ||r_n^x||_2^2 ≥ r_{2n}(x, x) by (4.6). Set ϕ_n = r_{2n}(x, x); then by (4.5),

    ϕ_n − ϕ_{n+1} = E(r_n^x, r_n^x) ≥ C_N ||r_n^x||_1^{-4/α} ||r_n^x||_2^{2+4/α} ≥ 2^{-4/α} C_N ϕ_n^{1+2/α}.

So we obtain the difference inequality

    ϕ_{n+1} − ϕ_n ≤ −c_1 ϕ_n^{1+2/α}.        (4.11)

While we could treat (4.11) directly, a simple comparison with a continuous function will give the bound we need. Let f be the continuous function on [0, ∞) which is linear on each interval (n, n + 1) and satisfies f(n) = ϕ_n for each n ∈ Z_+. Since ϕ_n is decreasing, so is f, and f(t) ≤ ϕ_n for t ∈ [n, n + 1]. Then

    f′(t) = ϕ_{n+1} − ϕ_n ≤ −c_1 ϕ_n^{1+2/α} ≤ −c_1 f(t)^{1+2/α},   n < t < n + 1.

Now let g(t) = f(t)^{-2/α}, so

    g′(t) = −(2/α) f′(t) f(t)^{-1−2/α} ≥ 2c_1/α.

Hence g(t) ≥ g(0) + c_2 t ≥ c_2 t since g(0) ≥ 0, and thus

    p_{2n}(x, x) ≤ r_{2n}(x, x) = ϕ_n = f(n) = g(n)^{-α/2} ≤ c_2^{-α/2} n^{-α/2},

proving (b). Inspecting the constants we can take C_H = 1 ∨ 4(α/(2C_N))^{α/2}.
(b) ⇔ (c) Clearly (c) implies (b). If (b) holds, and n = 2m is even, then (4.10) follows immediately from (4.9) and (4.3). If n = 2m + 1 is odd, since 2m + 1 ≤ (3/2)(1 ∨ 2m), using (4.4) gives

    p_{2m+1}(x, y) ≤ p_{2m}(x, x)^{1/2} p_{2m+2}(y, y)^{1/2} ≤ c(1 ∨ 2m)^{-α/4}(1 ∨ (2m + 2))^{-α/4} ≤ c′(2m + 1)^{-α/2},

which is (c).
(c) ⇒ (a) Write π_n = C′_H (1 ∨ n)^{-α/2} for the upper bound in (4.10). Taking n = 0 we obtain μ_x^{-1} = p_0(x, x) ≤ C′_H. For f ∈ C_0(V),

    ||P^n f||_∞ = sup_x Σ_y p_n(x, y) f(y) μ_y ≤ π_n ||f||_1,


and so

    ||P^n f||_2^2 = ⟨P^{2n} f, f⟩ ≤ ||P^{2n} f||_∞ ||f||_1 ≤ π_{2n} ||f||_1^2.

So by Lemma 4.2,

    2E(f, f) ≥ ( ||f||_2^2 − ||P^n f||_2^2 )/n ≥ ( ||f||_2^2 − π_{2n} ||f||_1^2 )/n.        (4.12)

We can assume that ||f||_1 = 1. Choose n to be the smallest integer k ≥ 1 such that π_{2k} ≤ ½||f||_2^2. If n = 1 then we have, using (1.17), Lemma 1.20, and the lower bound on μ_x,

    c_2 2^{-α/2} ≤ ½||f||_2^2 ≤ 2E(f, f) ≤ 2||f||_2^2 ≤ c||f||_1^2 = c.

Hence we obtain E(f, f) ≥ c||f||_2^{2+4/α}. If n ≥ 2 then π_{2(n−1)} > ½||f||_2^2, so that n ≤ c||f||_2^{-4/α}, and substituting in (4.12) gives (N_α).

Remark The argument in continuous time is slightly simpler – see Remark 5.15.

Corollary 4.4
(a) Suppose that (Γ, μ) satisfies (I_α). Then p_n(x, y) ≤ c(1 ∨ n)^{-α/2} for n ≥ 0, x, y ∈ V.
(b) Let V be infinite, and suppose that μ_e ≥ c_0 > 0 for all edges e. Then p_n(x, y) ≤ c(1 ∨ n)^{-1/2} for n ≥ 0, x, y ∈ V.

Proof By Corollary 3.15 we have (N_α), and hence (a) holds. For (b), Example 3.2(5) gives that (I_1) holds, and the bound follows from (a).

Remark (1) Bounds on p_n(x, x) are called ‘on diagonal’, since they control the function p_n : V × V → R on the diagonal {(x, x), x ∈ V}. Corollary 4.4 shows that (up to constants) the on-diagonal decay of p_n on Z is the slowest possible for an infinite graph with natural weights. Note also that (4.3) and (4.4) give global upper bounds on p_n(x, y) from the on-diagonal upper bounds.
(2) Corollary 4.4 is one of the main results in [V1]; in this and other papers in the early 1980s Varopoulos opened up in a very fruitful way the connections between the geometry of a graph and the properties of a random walk on it.

Examples (1) Let Γ = Z^d, let μ^{(0)}_{xy} be the natural weights on Z^d, and let μ^{(1)}_{xy} be weights on Z^d satisfying c_0 μ^{(0)}_{xy} ≤ μ^{(1)}_{xy} ≤ c_1 μ^{(0)}_{xy}. Let E_i be the Dirichlet forms associated with (Γ, μ^{(i)}), write X^{(i)} for the associated SRW and p_n^{(i)}(x, y) for their transition densities. By Lemma 3.13 the Nash inequality (N_d) holds for Z^d. Since E_1(f, f) ≥ c_0 E_0(f, f), we have also (N_d) with respect to E_1, and therefore the bound p_n^{(1)}(x, y) ≤ c′n^{-d/2} holds.
(2) Let Γ be the join of two copies of Z^d at their origins. By Exercise 3.20, Γ satisfies (N_d), and so by Theorem 4.3 the heat kernel on V satisfies the bound p_n(x, y) ≤ Cn^{-d/2}.
(3) The graphical (pre-)Sierpinski carpet SC is an infinite connected graph (a subset of Z^2_+) which is roughly isometric with the (unbounded) Sierpinski carpet (Figure 4.1: the graphical Sierpinski carpet). For a precise definition see [BB1]. Let A_n = V ∩ [0, ½·3^n]^2. Then μ(A_n) ≈ c8^n, and μ_E(A_n; V − A_n) ≍ 2^n. So if V satisfies (I_α) then 2^n ≥ c(8^n)^{(α−1)/α}, which implies that α ≤ 3/2. In fact (as is intuitively plausible) the sets A_n are the optimal case as far as isoperimetric properties of V are concerned.

Theorem 4.5 (Osada; see [Os]) SC satisfies (I_α) with α = 3/2.

Using this and Theorem 4.3 it follows that SC satisfies

    p_n(x, y) ≤ cn^{-3/4}.        (4.13)


However, this is not the correct rate of decay; in [BB1] it is proved that p_n(x, y) ≍ n^{-γ}, where γ = (log 8)/(log 8ρ). Here the exact value of ρ is unknown, but it is close to 5/4, so that γ ≈ 0.90 > 0.75. The reason (4.13) is not the best possible is that there is a loss of information in passing from (I_α) to (N_α). The inequality (I_α) implies that (in a certain sense) the graph has ‘good geometry’, while (N_α) or (S^2_α) imply ‘good heat flow’. Good geometry is sufficient, but not necessary, for good heat flow.
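For Z^2 the on-diagonal decay n^{-α/2} with α = 2 from Theorem 4.3 can be seen exactly: rotating the SRW on Z^2 by 45° turns it into two independent SRWs on Z, so the return probability factorises into a squared binomial term. The snippet below is an illustration added here (not part of the text); it checks that n·P_{2n}(0, 0) is essentially constant, approaching 1/π.

```python
import math

def p2n_Z2(n):
    # Return probability P_{2n}(0,0) for SRW on Z^2: after a 45-degree
    # rotation the coordinates are independent SRWs on Z, so
    # P_{2n}(0,0) = P(S_{2n} = 0)^2 = (C(2n, n) / 4^n)^2
    q = math.comb(2 * n, n) / 4 ** n
    return q * q

for n in (10, 100, 1000):
    print(n, n * p2n_Z2(n))   # roughly constant, approaching 1/pi ~ 0.318
```

Since μ_x = 4 on Z^2, the heat kernel p_{2n}(0, 0) differs from this return probability only by a constant factor, so the computation confirms the n^{-d/2} decay with d = 2.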

4.2 Carne–Varopoulos Bound

The bounds in Theorem 4.3 do not give any information on the behaviour of p_n(x, y) for fixed n when d(x, y) is large. The next few results will lead to a very simple upper bound for p_n(x, y) in this regime, by a surprising argument. For the motivation and history, see Remark 4.10 later in this section. We begin with a few lemmas.

Lemma 4.6 Let S_n be the simple symmetric random walk on Z with S_0 = 0. Then

    P(S_n ≥ D) ≤ exp(−D^2/2n),        (4.14)

    Σ_{r∈Z} λ^r P(S_n = r) = 2^{-n} Σ_{r=0}^n λ^{2r−n} C(n, r).        (4.15)

Proof We first note that cosh λ ≤ exp(½λ^2) – one proof is by comparison of the power series. Then, since E(e^{λS_1}) = cosh(λ),

    P(S_n ≥ D) = P(e^{λS_n} ≥ e^{λD}) ≤ e^{−λD} E e^{λS_n} = e^{−λD}(E e^{λS_1})^n = e^{−λD}(cosh λ)^n ≤ exp(−λD + ½nλ^2).

Setting λ = D/n gives (4.14). For (4.15), fix n and r = S_n, and let u and d be the number of upward and downward steps by (S_1, . . . , S_n). Thus u + d = n and u − d = r, and so if n + r is odd then S_n cannot equal r, while if n + r is even


    P(S_n = r) = 2^{-n} C(n, u) = 2^{-n} C(n, ½(n + r)).

So

    Σ_{r∈Z} λ^r P(S_n = r) = Σ_{r=0}^n λ^{2r−n} P(S_n = 2r − n) = 2^{-n} Σ_{r=0}^n λ^{2r−n} C(n, r).
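Inequality (4.14) is a Chernoff-type bound and is easy to test exactly, since the binomial tail can be summed in closed form; the check below is an illustration added here, not part of the text.

```python
import math

def srw_tail(n, D):
    # exact P(S_n >= D) for the simple symmetric random walk on Z
    total = 0.0
    for u in range(n + 1):          # u up-steps give S_n = 2u - n
        if 2 * u - n >= D:
            total += math.comb(n, u) / 2 ** n
    return total

n, D = 100, 30
print(srw_tail(n, D), math.exp(-D * D / (2 * n)))   # tail <= Chernoff bound
```

For n = 100, D = 30 the exact tail is roughly an order of magnitude below exp(−D²/2n), reflecting the slack in the inequality cosh λ ≤ exp(½λ²).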

Lemma 4.7 Let q(t) = Σ_{r=0}^k a_r t^r be a polynomial with |q(t)| ≤ 1 for t ∈ [−1, 1]. If R = q(P) = Σ_{r=0}^k a_r P^r then ||Rf||_2 ≤ ||f||_2.

Proof We first assume that Γ is finite. Using the spectral representation (3.18),

    Rf = Σ_{r=0}^k a_r P^r f = Σ_{r=0}^k a_r Σ_{i=1}^N ⟨f, ϕ_i⟩ ρ_i^r ϕ_i = Σ_{i=1}^N q(ρ_i) ⟨f, ϕ_i⟩ ϕ_i.

Since |q(ρ_i)| ≤ 1,

    ||Rf||_2^2 = Σ_{i=1}^N q(ρ_i)^2 ⟨f, ϕ_i⟩^2 ≤ Σ_{i=1}^N ⟨f, ϕ_i⟩^2 = ||f||_2^2.

The case of infinite Γ could be handled by spectral integrals, but there is an easier way. Let f ∈ C_0(V), with (finite) support A. Let A_k = {x : d(x, A) ≤ k}, and note that Rf(x) = 0 if x ∉ A_{k+1}. So ||f||_2 and ||Rf||_2 are unchanged if we replace Γ by the subgraph generated by A_{k+1}, and we deduce that ||Rf||_2 ≤ ||f||_2. A limiting argument now handles the case of general f ∈ L^2.

We now define the Chebyshev polynomial

    H_k(t) = ½(t + i(1 − t^2)^{1/2})^k + ½(t − i(1 − t^2)^{1/2})^k,   −1 ≤ t ≤ 1.

Lemma 4.8 For each k, H_k(t) is a real polynomial in t of degree k, and

    |H_k(t)| ≤ 1,   −1 ≤ t ≤ 1.

Further, for each n ≥ 0,

    t^n = Σ_{k∈Z} P(S_n = k) H_{|k|}(t),   −1 ≤ t ≤ 1.        (4.16)


Proof Fix t and let s = (1 − t^2)^{1/2}. Then the binomial theorem gives

    2H_k(t) = Σ_{j=0}^k C(k, j) t^{k−j} ((is)^j + (−is)^j).

The terms with j odd cancel, and so 2H_k(t) is a real polynomial in t, which is of degree k. Let z_1 = t + is, z_2 = t − is. Then |z_i|^2 = 1, z_1 z_2 = 1, and H_k(t) = ½(z_1^k + z_2^k) = H_{−k}(t). So |H_k(t)| ≤ ½|z_1|^k + ½|z_2|^k = 1. Since t = 2^{-1}(z_1 + z_2) we have

    t^n = 2^{-n} Σ_{k=0}^n C(n, k) z_1^k z_2^{n−k} = 2^{-n} Σ_{k=0}^n C(n, k) z_1^{2k−n} = Σ_{r∈Z} z_1^r P(S_n = r).

Similarly t^n = Σ_r z_2^r P(S_n = r), and adding gives (4.16).

Theorem 4.9 (‘Carne–Varopoulos bound’ [Ca, V2]) Let (Γ, μ) be a weighted graph. Then

    p_n(x, y) ≤ 2(μ_x μ_y)^{-1/2} exp( −d(x, y)^2/2n ),   x, y ∈ V, n ≥ 1.        (4.17)

Proof Let x_1, x_2 ∈ V and set f_i(x) = 1_{x_i}(x) μ_{x_i}^{-1/2}. Thus ||f_i||_2 = 1 for i = 1, 2. Let R = d(x_1, x_2), and note that

    ⟨f_1, P^k f_2⟩ = (μ_{x_1} μ_{x_2})^{1/2} p_k(x_1, x_2).        (4.18)

If k < R then this is zero, and so ⟨f_1, H_{|k|}(P) f_2⟩ = 0. The polynomial identity (4.16) implies that P^n = Σ_{k∈Z} P(S_n = k) H_{|k|}(P), and therefore that

    ⟨f_1, P^n f_2⟩ = Σ_{k∈Z} P(S_n = k) ⟨f_1, H_{|k|}(P) f_2⟩
                  = Σ_{|k|≥R} P(S_n = k) ⟨f_1, H_{|k|}(P) f_2⟩
                  ≤ Σ_{|k|≥R} P(S_n = k) ||f_1||_2 ||H_{|k|}(P) f_2||_2.

Using Lemma 4.7 we have ||H_{|k|}(P) f_2||_2 ≤ 1, so

    ⟨f_1, P^n f_2⟩ ≤ Σ_{|k|≥R} P(S_n = k) = P(|S_n| ≥ R) ≤ 2e^{−R^2/2n},

and combining this with (4.18) completes the proof.
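On Z with natural weights (μ_x = 2) the theorem reads p_n(0, r) ≤ e^{−r²/2n}, and since p_n(0, r) = P(S_n = r)/2, it can be verified exactly for small n. The check below is an illustration added here, not part of the text.

```python
import math

def check_carne_varopoulos(n):
    # SRW on Z with natural weights: mu_x = 2, so p_n(0, r) = P(S_n = r)/2,
    # and Theorem 4.9 gives p_n(0, r) <= 2*(mu_0*mu_r)^{-1/2} e^{-r^2/2n}
    #                                  = e^{-r^2/2n}
    for u in range(n + 1):
        r = 2 * u - n                       # S_n = (#up) - (#down) = 2u - n
        p = math.comb(n, u) / 2 ** n / 2    # heat kernel p_n(0, r)
        if p > math.exp(-r * r / (2 * n)):
            return False
    return True

print(check_carne_varopoulos(50))
```

The bound holds here with room to spare: for |r| of order n^{1/2} it misses the true Gaussian prefactor n^{-1/2}, which is exactly the gap discussed in Remark 4.13 below.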


Remark 4.10 (1) The proof of Theorem 4.9 does not make it very clear why the argument works. Let k_t(x) be the density of the N(0, t) distribution, so that the Fourier transform of k_t is

    ∫_{−∞}^∞ (2πt)^{-1/2} e^{−x^2/(2t)} e^{iθx} dx = ∫_{−∞}^∞ (2πt)^{-1/2} e^{−x^2/(2t)} cos(θx) dx = e^{−½θ^2 t}.

Setting λ = ½θ^2 gives the identity

    ∫_{−∞}^∞ (2πt)^{-1/2} e^{−x^2/(2t)} cos(x√(2λ)) dx = e^{−λt}.

Formally, if λ = −Δ one obtains

    e^{tΔ} = ∫_{−∞}^∞ k_t(s) cos(s√(−2Δ)) ds.        (4.19)

On the left side of (4.19) is the heat semigroup, while cos(s√(−2Δ)) is the wave semigroup: this is Hadamard’s transmutation formula. The wave equation in R has finite velocity a, so there is no contribution to the heat kernel p_t(x, y) from s such that as < d(x, y). Varopoulos [V2] used this idea for graphs, replacing the graph by what he called its ‘cable system’. (One thinks of each edge as a wire of length 1, and runs a Brownian motion on the wires, with suitable behaviour at the junctions. Many people now call these quantum graphs, perhaps because it makes them sound more important.) Carne then found the fully discrete argument given here.
(2) If f_i have support A_i and d(A_1, A_2) = R then the proof of Theorem 4.9 gives the ‘Davies–Gaffney bound’ (see [Da])

    ⟨P^n f_1, f_2⟩ ≤ 2||f_1||_2 ||f_2||_2 e^{−R^2/2n}.        (4.20)

These bounds lead to an upper bound on the rate of escape of X_n in graphs with polynomial volume growth.

Definition 4.11 We say (Γ, μ) has polynomial volume growth if there exist C_V and θ such that

    |B(x, r)| ∨ μ(B(x, r)) ≤ C_V r^θ,   x ∈ V, r ≥ 1.

This condition involves both the number of points in balls and the measure of balls; of course these are comparable if, for example, the μ_x are bounded away from zero and infinity.


Lemma 4.12 Suppose (Γ, μ) has polynomial volume growth, with index θ. If r^2 > 4n(1 + θ log 2) then

    P^x(d(x, X_n) > r) ≤ cr^θ exp(−r^2/4n).        (4.21)

Hence there exists c_2 such that d(x, X_n) ≤ c_2(n log n)^{1/2} for all large n, P^x-a.s.

Proof Let r ≥ 1, and set D_k = B(x, 2^k r) − B(x, 2^{k−1}r) for k ≥ 1. Then, using Theorem 4.9,

    P^x(d(X_n, x) > r) = Σ_{k=1}^∞ Σ_{y∈D_k} p_n(x, y) μ_y ≤ 2μ_x^{-1/2} Σ_{k=1}^∞ Σ_{y∈D_k} μ_y^{1/2} e^{−(2^{k−1}r)^2/2n}.

By Cauchy–Schwarz and the hypotheses on (Γ, μ),

    Σ_{y∈D_k} μ_y^{1/2} ≤ |D_k|^{1/2} μ(D_k)^{1/2} ≤ c(2^k r)^θ.

Since 4^k ≥ 4k for k ≥ 1, if r^2 > 4n(1 + θ log 2) then

    P^x(d(X_n, x) > r) ≤ cr^θ Σ_{k=1}^∞ 2^{kθ} e^{−(2^k r)^2/8n}
                       ≤ cr^θ e^{−r^2/4n} Σ_{k=1}^∞ exp(kθ log 2 − kr^2/4n) ≤ c′r^θ e^{−r^2/4n},

which proves (4.21). Now set r_n^2 = 4An log n, so that the bound (4.21) holds for all large n. Then

    P^x(d(X_n, x) > r_n) ≤ cr_n^θ n^{−A} ≤ c′(log n)^{θ/2} n^{θ/2−A}.

Taking A large enough, Σ_n P^x(d(X_n, x) > r_n) converges, and the result follows by Borel–Cantelli.

Remark 4.13

On Z^d we expect Gaussian upper bounds of the form

    p_n(x, y) ≤ c_1 n^{-d/2} e^{−c_2 d(x,y)^2/n}.        (4.22)

(These bounds do indeed hold – see Theorem 6.28 in Chapter 6.) Let R = d(x, y). If R ≤ n^{1/2} then Corollary 4.4 gives

    p_n(x, y) ≤ c_3 n^{-d/2} ≤ c_3 n^{-d/2} e^{1−R^2/2n} = c_4 n^{-d/2} e^{−R^2/2n}.

On the other hand if R ≥ (2dn log n)^{1/2} then Theorem 4.9 gives

    p_n(x, y) ≤ e^{−R^2/2n} = e^{−R^2/4n − R^2/4n} ≤ e^{−R^2/4n} e^{−(d/2) log n} = n^{-d/2} e^{−R^2/4n}.


So for fixed n, the two bounds we have proved so far – that is, the global upper bound in Theorem 4.3 and the ‘long range’ Carne–Varopoulos bound – prove (4.22) for all values of R except for the narrow range cn^{1/2} ≤ R ≤ c(n log n)^{1/2}. Unfortunately there is no simple way to combine these bounds so as to cover this interval, and this is the range of R which determines many important properties of the random walk. Bridging this gap, as well as obtaining matching lower bounds, will occupy much of the remainder of this book.

4.3 Gaussian and Sub-Gaussian Heat Kernel Bounds

We now introduce the main family of heat kernel bounds that we will treat in the remainder of this book. It includes Gaussian bounds, but also the type of heat kernel bound which holds on regular fractal type graphs, such as the Sierpinski gasket graph.

Definition 4.14 Define the volume growth function by

    V(x, r) = μ(B(x, r)),   x ∈ V, r ≥ 0.

(1) (Γ, μ) satisfies the condition (V_α) if there exist positive constants C_L, C_U such that C_L ≤ μ_x ≤ C_U for all x ∈ V, and

    C_L r^α ≤ V(x, r) ≤ C_U r^α,   for all x ∈ V, r ≥ 1.

(2) (Γ, μ) satisfies UHK(α, β) if there exist constants c_1, c_2 such that for all n ≥ 0, x, y ∈ V,

    p_n(x, y) ≤ ( c_1/(n ∨ 1)^{α/β} ) exp( −c_2 ( d(x, y)^β/(n ∨ 1) )^{1/(β−1)} ).        (4.23)

(Γ, μ) satisfies LHK(α, β) if there exist constants c_3, c_4 such that, for all n ≥ 0, x, y ∈ V with d(x, y) ≤ n,

    p_n(x, y) + p_{n+1}(x, y) ≥ ( c_3/(n ∨ 1)^{α/β} ) exp( −c_4 ( d(x, y)^β/(n ∨ 1) )^{1/(β−1)} ).        (4.24)

(Γ, μ) satisfies HK(α, β) if it satisfies both UHK(α, β) and LHK(α, β). Note that in this case (Γ, μ) is transient if and only if α > β.

Remarks 4.15 (1) We need to consider p_n + p_{n+1} in the lower bound because of bipartite graphs, or more generally graphs in which large regions are nearly bipartite. It is also clear that the lower bound cannot hold if d(x, y) > n.


(2) If β = 2 then these are the familiar Gaussian heat kernel bounds.
(3) If HK(α, β) holds with β > 2 then one says that Γ satisfies sub-Gaussian heat kernel bounds; the decay of the bound on p_n(x, y) as d(x, y) → ∞ is slower than Gaussian.
(4) We will see in Theorem 6.28 that the heat kernel on any graph roughly isometric to Z^d satisfies HK(d, 2), and in Corollary 6.11 that the Sierpinski gasket graph satisfies HK(log 3/log 2, log 5/log 2).
(5) If (V, E) is an infinite graph with natural weights then |B(x, r)| ≥ r, so the condition (V_α) cannot hold for any α < 1.
(6) It is easy to check that if (V_α) holds then (H3) (bounded geometry) holds, and the two conditions (H4) and (H5) are then equivalent.

Exercise 4.16 Show that the condition (V_α) is stable under rough isometries of weighted graphs.

Lemma 4.17 Suppose that (Γ, μ) satisfies HK(α, β). Then (Γ, μ) satisfies (H3), (H4), and (H5).

Proof Taking n = 0 in (4.23) we have μ_x^{-1} = p_0(x, x) ≤ c_1 for all x, so μ_x is bounded below. Let y ∼ x; then taking n = 0 in (4.24) gives

    μ_{xy}/(μ_x μ_y) = p_1(x, y) ≥ c_3 e^{−c_4}.        (4.25)

Thus μx ≥ μx y ≥ cμx μ y , and it follows that there exists c5 such that c5−1 ≤ μ y ≤ c5 for all y ∈ V. Using (4.25) again then gives that c−1 ≤ μe ≤ c for all e ∈ E, so that (, μ) satisfies (H3), (H4), and (H5). In the remainder of this section we will explore some consequences of HK(α, β), starting with the information it gives on the exit times from balls. In the final two sections of this chapter we will introduce some conditions on which are sufficient to give HK(α, β). Definition 4.18

Define the stopping times τ (x, r ) = inf{n ≥ 0 : X n ∈ B(x, r )}.

Lemma 4.19 (a) We have

Let (, μ) be any infinite connected graph. Let r ≥ 1, n ≥ 1. Px X n ∈ B(x, 2r ) ≤ Px τ (x, 2r ) ≤ n .


Discrete Time Heat Kernel

(b) Suppose that P^y(X_m ∉ B(y, r)) ≤ p for all y ∈ V, 0 ≤ m ≤ n. Then P^x(τ(x, 2r) < n) ≤ 2p.

Proof (a) is clear by inclusion of events. (b) Writing τ = τ(x, 2r), B = B(x, r), and noting that X_τ ∈ ∂B(x, 2r), so that d(X_τ, B) ≥ r,

 P^x(τ < n) = P^x(τ < n, X_n ∉ B) + P^x(τ < n, X_n ∈ B)
  ≤ p + E^x[1_{(τ<n)} P^{X_τ}(X_{n−τ} ∈ B)] ≤ p + p = 2p.

 P^y(τ_B > kn_1) ≤ 2^{−k},  y ∈ B.

This proves (4.28), and also gives E^x τ_B ≤ cn_1.

Lemma 4.21 Suppose that (Γ, μ) satisfies (4.26) and UHK(α, β).

(a) For λ ≥ 1,

 P^x(X_n ∉ B(x, λn^{1/β})) ≤ c_1 exp(−c_2 λ^{β/(β−1)}),  (4.30)
 P^x(τ(x, 2λn^{1/β}) < n) ≤ 2c_1 exp(−c_2 λ^{β/(β−1)}).  (4.31)

(b) For x ∈ V, r ≥ 1, c_1 r^β ≤ E^x τ(x, r) ≤ c_2 r^β.

Proof (a) Let r = λn^{1/β}, B_k = B(x, kr), and D_k = B_{k+1} − B_k for k ≥ 1. Then writing γ = 1/(β − 1), and noting that βγ > 1,

 P^x(X_n ∉ B(x, r)) ≤ Σ_{k=1}^∞ P^x(X_n ∈ D_k) ≤ Σ_{k=1}^∞ μ(D_k) max_{z∈D_k} p_n(x, z)
  ≤ c Σ_{k=1}^∞ r^α (k + 1)^α n^{−α/β} exp(−c(kr)^{βγ} n^{−γ})
  ≤ cλ^α Σ_{k=1}^∞ (k + 1)^α exp(−cλ^{βγ} k) ≤ c′λ^α exp(−c′λ^{βγ}).

This proves (4.30), and (4.31) is then immediate by Lemma 4.19.

(b) The upper bound was proved in the preceding lemma. For the lower bound, let y ∈ B(x, r/2) and choose λ large enough so that the right side of (4.31) is less than 1/2. Then choose n_2 so that 2λn_2^{1/β} = r/2. We have τ(y, 2λn_2^{1/β}) ≤ τ_B, and so

 E^y τ_B ≥ E^y τ(y, 2λn_2^{1/β}) ≥ n_2 P^y(τ(y, 2λn_2^{1/β}) ≥ n_2) ≥ n_2/2 ≥ cr^β.

Lemma 4.22

Let (Γ, μ) satisfy HK(α, β). Then (V_α) holds.

Proof By Lemma 4.17 we have c^{−1} ≤ μ_x ≤ c for all x ∈ V. Let r ≥ 1 and n = r^β. Then

 2 ≥ P^x(X_n ∈ B(x, r)) + P^x(X_{n+1} ∈ B(x, r)) ≥ Σ_{y∈B(x,r)} (p_n(x, y) + p_{n+1}(x, y)) μ_y ≥ cV(x, r) n^{−α/β},

so we obtain V(x, r) ≤ c_1 r^α. For the lower bound we use the upper bound in Lemma 4.21: choose λ ≥ 1 large enough so that the right hand side of (4.30) is less than 1/2. Then if r = λn^{1/β},

 1 = P^x(X_n ∈ B(x, r)) + P^x(X_n ∉ B(x, r)) ≤ 1/2 + Σ_{y∈B(x,r)} p_n(x, y) μ_y ≤ 1/2 + cV(x, r) n^{−α/β},

which gives V(x, r) ≥ c′r^α.

Exercise 4.23 Show that if Γ satisfies HK(α, β) then E^x d(x, X_n) ≍ n^{1/β}.

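As a numerical illustration of Exercise 4.23 in the Gaussian case β = 2 (a sketch, not part of the text): for the simple random walk on Z, X_n has the law of 2·Binomial(n, 1/2) − n, so E^0|X_n| can be computed exactly and compared with n^{1/2}.

```python
from math import comb, sqrt

def mean_abs_position(n):
    # exact E^0 |X_n| for the SRW on Z, using X_n = 2*Binomial(n, 1/2) - n
    return sum(comb(n, k) * abs(2 * k - n) for k in range(n + 1)) / 2 ** n

# E^0 |X_n| / n^{1/2} should approach sqrt(2/pi) ~ 0.798 (so beta = 2 here)
for n in (16, 100, 400):
    print(n, mean_abs_position(n) / sqrt(n))
```

The ratio stabilises near sqrt(2/π), confirming E^x d(x, X_n) ≍ n^{1/2} on Z.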

Exercise 4.24 Suppose that Γ satisfies HK(α, β), and let f(n) = n^{1/β}(log log n)^{(β−1)/β}. Show that

 0 < lim inf_{n→∞} d(x_0, X_n)/f(n) ≤ lim sup_{n→∞} d(x_0, X_n)/f(n) < ∞.

(When β = 2 this gives the familiar law of the iterated logarithm, but without the exact constant.) To prove this let n_k = e^k, λ_k = (A log log n_k)^{(β−1)/β}, and use HK(α, β) and Lemma 4.21 to bound the probability of the events

 {X_m ∈ B(x_0, n_k^{1/β} λ_k) for all m ≤ n_{k±1}}.

Let D ⊂ V, and recall from (1.24) the definition of the heat kernel for X killed on exiting D. To simplify notation write γ = 1/(β − 1) and

 Φ(T, R) = T^{−α/β} exp(−(R^β/T)^γ).

It is clear from the definition that p_n^D(x, y) ≤ p_n(x, y), so our interest is in obtaining lower bounds on p^D.

Theorem 4.25 Suppose (Γ, μ) satisfies HK(α, β), and let B = B(x_0, R), B′ = B(x_0, R/2). Then for x, y ∈ B′ and d(x, y) ≤ n ≤ R^β,

 p_n^B(x, y) + p_{n+1}^B(x, y) ≥ c_1 n^{−α/β} exp(−c_2 (d(x, y)^β/n)^{1/(β−1)}).  (4.32)

Proof The following observation will be useful. Let θ > 0 and

 f(t) = t^{−θ} exp(−s^{βγ} t^{−γ}).

Then f′(t) = t^{−1} f(t)(γ s^{βγ} t^{−γ} − θ), so if t_0 is defined by t_0^γ = γ s^{βγ}/θ then f is unimodal with a maximum at t_0.

Let x, y ∈ B(x_0, 2R/3) with d(x, y) ≤ δR, where δ will be chosen later. Then

 P^x(X_n = y) = P^x(X_n = y, τ_B > n) + P^x(X_n = y, τ_B ≤ n).

For the second term we have

 P^x(X_n = y, τ_B ≤ n) = E^x[1_{(τ_B≤n)} P^{X_{τ_B}}(X_{n−τ_B} = y)] ≤ max_{z∈∂B} max_{1≤m≤n} P^z(X_m = y).

Thus

 p_n^B(x, y) ≥ p_n(x, y) − max_{z∈∂B} max_{1≤m≤n} p_m(z, y),

with a similar lower bound on p_{n+1}^B(x, y). Using the observation above and the upper bounds on p_m, there exists c_3 > 0 such that if n ≤ c_3 R^β then

 max_{z∈∂B} max_{1≤m≤n} p_m(z, y) ≤ c_4 Φ(n, R/3).

Hence

 p_n^B(x, y) + p_{n+1}^B(x, y) ≥ c_6 n^{−α/β} e^{−c_7((δR)^β/n)^γ} − c_8 n^{−α/β} e^{−c_9(R^β/(3^β n))^γ}.

Choosing δ small enough we obtain the lower bound (4.32) when d(x, y) ≤ δR and n ≤ c_3 R^β. A chaining argument similar to that in Proposition 4.38 then concludes the proof.

These bounds enable us to control the Green's function for X. In the case when α ≤ β we have Σ_n p_n(x, x) = ∞, so X is recurrent and it only makes sense to bound g_D for proper subsets D of V. We will do it just for balls.

Theorem 4.26 Suppose (Γ, μ) satisfies HK(α, β). Let R ≥ 2, B = B(x_0, R), x, y ∈ B′ = B(x_0, R/2), and r = 1 ∨ d(x, y).

(a) If α > β then

 c_1/r^{α−β} ≤ g_B(x, y) ≤ g(x, y) ≤ c_2/r^{α−β}.

(b) If α < β then c_1 R^{β−α} ≤ g_B(x, y) ≤ c_2 R^{β−α}.
(c) If α = β then c_1 log(R/r) ≤ g_B(x, y) ≤ c_2 log(R/r).

Proof By Theorem 4.25,

 g_B(x, y) ≥ Σ_{n=r^β}^{R^β} p_n^B(x, y) ≥ c Σ_{n=r^β}^{R^β} n^{−α/β} ≥ c ∫_{r^β}^{R^β} s^{−α/β} ds,

and the lower bounds in (a)–(c) follow easily.

(a) The bound g_B(x, y) ≤ g(x, y) is immediate. If y = x then

 g(x, x) = Σ_{n=0}^∞ p_n(x, x) ≤ c(1 + Σ_{n=1}^∞ n^{−α/β}) < ∞,

since α > β. If y ≠ x then note first that

 ∫_0^∞ t^{−α/β} exp(−c_2(r^β/t)^γ) dt = r^{β−α} ∫_0^∞ s^{−α/β} exp(−c_2 s^{−γ}) ds = cr^{β−α}.

The upper bound on g(x, y) then follows by comparison between the sum for g(x, y) and the above integral.

(b) Let m = R^β. By Lemma 4.21 we have, for x, y ∈ B′,

 p_{m+n}^B(x, y) = Σ_{z∈B} p_m^B(x, z) p_n^B(z, y) μ_z ≤ cm^{−α/β} P^y(τ_B > n) ≤ cR^{−α} exp(−cn/R^β).

So,

 g_B(x, y) = Σ_{k=0}^∞ Σ_{n=km}^{(k+1)m} p_n^B(x, y) ≤ Σ_{n=0}^m c(1 ∨ n)^{−α/β} + Σ_{k=1}^∞ cR^{β−α} e^{−c′(k−1)} ≤ cR^{β−α}.

(c) Let m be as in (b) and m′ = r^β. Then

 g_B(x, y) ≤ c + Σ_{n=1}^{m′} p_n(x, y) + Σ_{n=m′}^{m} p_n(x, y) + Σ_{k=1}^∞ cR^{β−α} e^{−c′(k−1)}
  ≤ c + Σ_{n=1}^{m′} p_n(x, y) + c Σ_{n=m′}^{m} n^{−1} ≤ c log(R/r) + Σ_{n=0}^{m′} p_n(x, y).  (4.33)

Now

 ∫_0^{r^β} t^{−1} e^{−c_1(r^β/t)^γ} dt = ∫_0^1 s^{−1} e^{−c_1 s^{−γ}} ds = c_2,

and comparing the final sum in (4.33) with this integral gives the upper bound in (c).

We conclude this section by obtaining some general bounds on the time derivative of p_n. The example of a bipartite graph shows that we cannot expect to have good control of |p_{n+1} − p_n| in general, but we can bound |p_{n+2} − p_n|.

Lemma 4.27

Let 1 ≤ k < n. Then

 ||p_n^x − p_{n+2}^x||_2 ≤ 2k^{−1} ||p_{n−k}^x||_2.


Proof Since X cannot visit B(x, n + 3)^c before time n + 3, we can assume that V = B(x, n + 3). Then if N = |B(x, n + 3)| the spectral decomposition (3.16) gives

 p_n^x(y) − p_{n+2}^x(y) = Σ_{i=1}^N φ_i(x) φ_i(y) ρ_i^n (1 − ρ_i^2).

Now if t ∈ [0, 1],

 2 ≥ (1 + t) = (1 + t)(1 − t) Σ_{j=0}^∞ t^j ≥ (1 − t^2) kt^k.

So taking t = |ρ_i|,

 ||p_n^x − p_{n+2}^x||_2^2 = Σ_i φ_i(x)^2 ρ_i^{2n} (1 − ρ_i^2)^2
  ≤ Σ_i φ_i(x)^2 ρ_i^{2n−2k} max_j (|ρ_j|^k (1 − ρ_j^2))^2
  ≤ 4k^{−2} Σ_i φ_i(x)^2 ρ_i^{2n−2k} = 4k^{−2} ||p_{n−k}^x||_2^2.

Theorem 4.28 Let n = n_1 + n_2 and 1 ≤ k < n_2. Then

 |p_n(x, y) − p_{n+2}(x, y)| ≤ 2k^{−1} p_{2n_1}(x, x)^{1/2} p_{2n_2−2k}(y, y)^{1/2}.

In particular, if UHK(α, β) holds then

 |p_n(x, y) − p_{n+2}(x, y)| ≤ c_1 n^{−1−α/β}.

Proof We have

 |p_n(x, y) − p_{n+2}(x, y)| = |Σ_z p_{n_1}(x, z)(p_{n_2}(z, y) − p_{n_2+2}(z, y)) μ_z|
  ≤ (Σ_z p_{n_1}(x, z)^2 μ_z)^{1/2} (Σ_z (p_{n_2}(z, y) − p_{n_2+2}(z, y))^2 μ_z)^{1/2}
  = p_{2n_1}(x, x)^{1/2} ||p_{n_2}^y − p_{n_2+2}^y||_2
  ≤ 2k^{−1} p_{2n_1}(x, x)^{1/2} p_{2n_2−2k}(y, y)^{1/2}.

This proves the first part; the second then follows on choosing n_1 = k = ⌊n/3⌋. (If n ≤ 3 we can adjust the constant c_1 so that the inequality holds.) Using (4.5) we obtain


Corollary 4.29 Let r_n^x = p_n^x + p_{n+1}^x. If UHK(α, β) holds then

 E(r_n^x, r_n^x) ≤ cn^{−1−α/β}.

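The decay rate n^{−1−α/β} of Theorem 4.28 can be checked numerically on Z, where α = 1 and β = 2, so the exponent is −3/2. A sketch (not from the text; the normalisation of p_n by μ only changes constants, not the exponent):

```python
from math import comb

def p_return(n):
    # P^0(X_n = 0) for the SRW on Z (zero for odd n); a constant multiple of p_n(0,0)
    return comb(n, n // 2) / 2 ** n if n % 2 == 0 else 0.0

# |p_n - p_{n+2}| should decay like n^{-(1 + alpha/beta)} = n^{-3/2} on Z,
# so the rescaled differences below should stabilise (near sqrt(2/pi))
for n in (100, 400, 1600):
    print(n, (p_return(n) - p_return(n + 2)) * n ** 1.5)
```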
4.4 Off-diagonal Upper Bounds

We now introduce some conditions which, if satisfied, will imply UHK(α, β). In Chapter 6 we will see how these conditions can be proved from geometric or resistance properties of the graph.

Definition 4.30 Let β ≥ 1. Following [GT1, GT2] we say (Γ, μ) satisfies the condition (E_β) if there exist positive constants c_1, c_2 such that

 c_1 r^β ≤ E^x τ(x, r) ≤ c_2 r^β,  for x ∈ V, r ≥ 1.

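The condition (E_β) with β = 2 can be checked exactly for the simple random walk on Z (a sketch, not from the text): E^y τ(0, r) solves the discrete Dirichlet problem E[y] = 1 + (E[y−1] + E[y+1])/2 on B(0, r), with E = 0 outside the ball. Solving the tridiagonal system gives E^0 τ(0, r) = (r + 1)^2 ≍ r^2.

```python
def expected_exit_time(r):
    # Solve E[x] = 1 + (E[x-1] + E[x+1])/2 on {-r,...,r}, E = 0 at +/-(r+1),
    # by the Thomas algorithm for the tridiagonal system (stdlib only).
    n = 2 * r + 1
    a = [-0.5] * n      # sub-diagonal
    b = [1.0] * n       # diagonal
    c = [-0.5] * n      # super-diagonal
    d = [1.0] * n       # right-hand side
    for i in range(1, n):           # forward elimination
        w = a[i] / b[i - 1]
        b[i] -= w * c[i - 1]
        d[i] -= w * d[i - 1]
    e = [0.0] * n
    e[-1] = d[-1] / b[-1]
    for i in range(n - 2, -1, -1):  # back substitution
        e[i] = (d[i] - c[i] * e[i + 1]) / b[i]
    return e[r]                     # E^0 tau(0, r), the centre of the ball

for r in (5, 10, 20):
    print(r, expected_exit_time(r) / r ** 2)   # ratio tends to 1
```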
Lemma 4.31 Suppose that (Γ, μ) satisfies (E_β). Then there exist constants c_3, c_4, c_5 > 0 and p > 0 such that for all x, r

 E^y τ(x, r) ≤ c_3 r^β,  y ∈ B(x, r),  (4.34)
 E^y τ(x, r) ≥ c_4 r^β,  y ∈ B(x, r/2),  (4.35)
 P^x(τ(x, r) ≥ c_5 r^β) ≥ p.  (4.36)

Proof If y ∈ B(x, r) then B(x, r) ⊂ B(y, 2r), so E^y τ(x, r) ≤ E^y τ(y, 2r) ≤ c_2 2^β r^β. Similarly, if y ∈ B(x, r/2) then E^y τ(x, r) ≥ E^y τ(y, r/2) ≥ c_1 2^{−β} r^β.

Let t > 0, and write τ = τ(x, r), B = B(x, r). Using the Markov property of X we have

 c_1 r^β ≤ E^x τ = E^x 1_{(τ≤t)} τ + E^x 1_{(τ>t)} τ ≤ tP^x(τ ≤ t) + E^x 1_{(τ>t)}(t + E^{X_t} τ)
  ≤ t + P^x(τ > t) sup_{y∈B} E^y τ ≤ t + P^x(τ > t) c_3 r^β.

Hence, taking t = (1/2)c_1 r^β,

 P^x(τ > t) ≥ (c_1 r^β − t)/(c_3 r^β) = c_1/(2c_3).

Lemma 4.32 Suppose that (Γ, μ) has polynomial growth and satisfies (E_β). Then β ≥ 2.


Proof Let c_i be as in Lemma 4.31. Then by Markov's inequality P^x(τ(x, r) > 2c_3 r^β) ≤ 1/2. Suppose now that β < 2, and let n = 2c_3 r^β. Then r^2/n ≥ cr^{2−β}, so if r is large enough we can use (4.21) to obtain

 1/2 ≤ P^x(τ(x, r) ≤ n) ≤ Σ_{k=1}^n P^x(X_k ∉ B(x, r)) ≤ n max_{0≤k≤n} cr^θ e^{−r^2/4k} ≤ cr^{β+θ} e^{−cr^{2−β}},

a contradiction for large r.

Proposition 4.33 Let β ≥ 2 and suppose that there exist constants c_1 > 0, p > 0 such that for all x ∈ V, r ≥ 1

 P^x(τ(x, r) ≥ c_1 r^β) ≥ p.  (4.37)

Then there exist constants c_2, c_3 such that for T ≥ 1, R ≥ 1

 P^x(τ(x, R) < T) ≤ c_2 exp(−c_3 (R^β/T)^{1/(β−1)}).  (4.38)

In particular, (4.38) holds if Γ satisfies (E_β).

Proof We can assume c_1 ≤ 1. Choose positive integers m and r, and let t_0 = c_1 r^β. Define the stopping times

 S_0 = 0,  S_i = inf{n ≥ S_{i−1} : d(X_{S_{i−1}}, X_n) = r},  i ≥ 1.

Write F_n = σ(X_m, m ≤ n) for the filtration of X. Since d(X_{S_{i−1}}, X_k) < r for S_{i−1} ≤ k < S_i, we have d(X_0, X_k) < mr for 0 ≤ k < S_m, and hence τ(x, mr) ≥ S_m. We call the parts of the path of X between times S_{i−1} and S_i 'journeys', and will call a journey 'slow' if S_i − S_{i−1} ≥ t_0. Let

 ξ_i = 1_{(S_i − S_{i−1} ≥ t_0)},  N = Σ_{i=1}^m ξ_i.

Thus N is the number of slow journeys, and τ(x, mr) ≥ S_m ≥ Nt_0. The bound (4.37) implies that P^x(ξ_i = 1 | F_{S_{i−1}}) ≥ p, and using a martingale inequality (see Lemma A.8) to bound N we obtain

 P^x(τ(x, mr) ≤ (1/2)mpt_0) ≤ P^x(Nt_0 < (1/2)mpt_0) ≤ e^{−cm}.  (4.39)

This bound then leads to (4.38). The basic idea is that, given R and T, if r = R/m and mt_0 = T then

 T = c_1 mr^β = c_1 R^β m^{1−β},


giving (4.38). However, since we need both m and r to be positive integers, the argument needs a little care. We wish to choose m (and so r) such that

 mr ≤ R,  (1/2)pt_0 m = (1/2)pc_1 mr^β ≥ T.

If these hold, then

 P(τ(x, R) ≤ T) ≤ P(τ(x, mr) ≤ T) ≤ P(τ(x, mr) ≤ (1/2)pmt_0) ≤ e^{−cm}.

Given m ∈ {1, ..., R}, set r = ⌊R/m⌋. Then r ≤ R/m ≤ r + 1 ≤ 2r, and so, taking c_4 = 2^{−1−β}pc_1,

 (1/2)pc_1 mr^β ≥ c_1 p2^{−1−β} R^β m^{1−β} = c_4 R^β m^{1−β}.

We now choose m as large as possible, subject to the constraints

 m ≤ R,  m^{β−1} ≤ c_4 R^β/T.  (4.40)

We consider three regimes for R and T, and in each case we obtain a bound of the form (4.38), but with different constants c_2(j), c_3(j) for cases j = 1, 2, 3. The bound (4.38) then follows by taking the weakest constant in each case, that is, c_2 = max_j c_2(j) and c_3 = min_j c_3(j).

Case 1: c_4 R^β/T ≤ 2^{β−1}. If this holds then the right side of (4.38) is of order 1, so the bound gives no information. However, it still does hold, since

 P(τ(x, R) ≤ T) ≤ 1 ≤ e^2 e^{−(c_4 R^β/T)^{1/(β−1)}}.

Case 2: T ≤ c_4 R. Since c_4 < c_1 ≤ 1 we have T < R, so the left side of (4.38) is zero, and the bound therefore holds automatically.

Case 3: c_4 R^β/T > 2^{β−1}, and c_4 R ≤ T. (This is the only non-trivial case.) We choose m = ⌊(c_4 R^β/T)^{1/(β−1)}⌋, so that m ≥ 2 and also 2m ≥ m + 1 ≥ (c_4 R^β/T)^{1/(β−1)}. Then m^{β−1} ≤ c_4 R^β/T ≤ R^{β−1}, so we also have m ≤ R. Hence this choice of m and r satisfies both the constraints in (4.40), and (4.38) follows from (4.39).

Remarks (1) The bound (4.38) is (up to constants) the best possible. This may seem surprising, since the preceding argument appears to give a lot away; usually it will take many more than m journeys of length r for X to leave B(x, mr). (In Z^d one would expect that roughly m^2 journeys would be needed.) However, the bound (4.38) only gives much information when


R^β ≫ T, and in this case it is very unlikely that X will exit B(x, mr) in a short time; on the rare occasions when it does so, it moves more or less directly to the boundary.

(2) If β = 2 then 1/(β − 1) = 1, and the term in the exponential in (4.38) is −cR^2/T.

The following theorem gives conditions for UHK(α, β).

Theorem 4.34 Let α ≥ 1, β ≥ 2, and suppose that (Γ, μ) satisfies (V_α). The following are equivalent.

(a) There exist constants c_1, c_2 > 0, p_1 ∈ (0, 1) such that

 P^x(τ(x, r) ≤ c_1 r^β) ≤ p_1,  x ∈ V, r ≥ 1,  (4.41)
 p_n(x, x) ≤ c_2(1 ∨ n)^{−α/β},  n ≥ 0, x ∈ V.  (4.42)

(b) UHK(α, β) holds; that is, there exist constants c_3, c_4 such that for x, y ∈ V, n ≥ 0,

 p_n(x, y) ≤ (c_3/(n ∨ 1)^{α/β}) exp(−c_4 (d(x, y)^β/(n ∨ 1))^{1/(β−1)}).  (4.43)

(c) (E_β) and the on-diagonal upper bound (4.42) hold.

Proof The implication (b) ⇒ (c) was proved in Lemma 4.20, while (c) ⇒ (a) is immediate from Lemma 4.31.

(a) ⇒ (b) The condition (4.42) implies that μ_x^{−1} = p_0(x, x) ≤ c_2 for all x. Hence it is easy to check that (4.23) holds if n = 0 or n = 1. Let 2R = d(x, y). If R^β/n ≤ 1 then the term in the exponential in UHK(α, β) is of order 1, and so the bound is immediate from (4.42). Now let n ≥ 2, and choose m such that n/3 ≤ m ≤ 2n/3. Let A_x = {z : d(x, z) ≤ d(y, z)} and A_y = V − A_x. Then

 μ_x P^x(X_n = y) = μ_x P^x(X_n = y, X_m ∈ A_y) + μ_x P^x(X_n = y, X_m ∈ A_x).  (4.44)

The second term in (4.44) is

 μ_x Σ_{z∈A_x} p_m(x, z) μ_z p_{n−m}(z, y) μ_y = μ_y Σ_{z∈A_x} p_{n−m}(y, z) μ_z p_m(z, x) μ_x = μ_y P^y(X_n = x, X_{n−m} ∈ A_x).

So the two terms in (4.44) are of the same form, and it is sufficient to bound the first. For this we note that if z ∈ A_y then 2d(x, z) ≥ d(x, z) + d(y, z) ≥


d(x, y) = 2R, so d(x, z) ≥ R. Hence also if X_0 = x and X_m ∈ A_y then τ(x, R) ≤ m. So,

 μ_x P^x(X_n = y, X_m ∈ A_y) = μ_x P^x(τ(x, R) ≤ m, X_m ∈ A_y, X_n = y)
  ≤ μ_x E^x[1_{{τ(x,R)≤m, X_m∈A_y}} P^{X_m}(X_{n−m} = y)]
  ≤ μ_x P^x(τ(x, R) ≤ m) sup_{z∈A_y} p_{n−m}(z, y) μ_y.

By Proposition 4.33, P^x(τ(x, R) ≤ m) ≤ c exp(−c(R^β/n)^{1/(β−1)}), while by (4.42) p_{n−m}(z, y) ≤ c(n/3)^{−α/β}; combining these bounds gives UHK(α, β).

4.5 Lower Bounds

We now turn to lower bounds for p_n(x, y). On-diagonal lower bounds follow immediately from the upper bounds.

Lemma 4.35 Suppose (Γ, μ) satisfies (V_α) and UHK(α, β). Then

 p_{2n}(x, x) ≥ c_1 n^{−α/β},  n ≥ 1.

Proof Choose λ so that the right side of (4.30) is less than 1/2, and let B = B(x, λn^{1/β}). Then

 1/2 ≤ P^x(X_n ∈ B) = Σ_{y∈B} p_n(x, y) μ_y ≤ (Σ_{y∈B} p_n(x, y)^2 μ_y)^{1/2} μ(B)^{1/2}
  ≤ (Σ_y p_n(x, y)^2 μ_y)^{1/2} μ(B)^{1/2} = p_{2n}(x, x)^{1/2} μ(B)^{1/2}.

Thus 4p_{2n}(x, x) ≥ μ(B)^{−1} ≥ cn^{−α/β}.

This argument does not give any lower bound on p_n(x, y) with y ≠ x. To obtain such a bound, even if d(x, y) ≪ n^{1/β}, requires more information on the graph Γ.

Definition 4.36 Let β ≥ 2. We say (Γ, μ) satisfies the near-diagonal lower bound (NDLB(β)) if there exist constants c_0, c_1 ∈ (0, 1) such that

 p_n(x, y) + p_{n+1}(x, y) ≥ c_0/V(x, n^{1/β})  if 1 ∨ d(x, y) ≤ c_1 n^{1/β}.

If (V_α) holds then we can write this as

 p_n(x, y) + p_{n+1}(x, y) ≥ c_2 n^{−α/β}  if 1 ∨ d(x, y) ≤ c_1 n^{1/β}.  (4.45)


Even when the upper bound UHK(α, β) holds, NDLB(β) may fail – an example is the join of two copies of Z^d (with d ≥ 2). We will see in Chapter 6 how the near-diagonal lower bound can be proved from the Poincaré inequality or from resistance bounds; for the moment we show that it leads to full lower bounds.

Lemma 4.37 Suppose (V_α) and NDLB(β) hold. Let x, y ∈ V, and r ≥ 2. Then if r + d(x, y) ≤ c_2 n^{1/β},

 P^x(X_n ∈ B(y, r)) ≥ cr^α n^{−α/β}.

Proof Let B′ = B(y, r/2). Then summing (4.45) over B′ we have

 P^x(X_n ∈ B′) + P^x(X_{n+1} ∈ B′) ≥ cn^{−α/β}(r/2)^α.  (4.46)

If X_{n+1} ∈ B′ then X_n ∈ B(y, r), so each of the probabilities on the left side of (4.46) is bounded above by P^x(X_n ∈ B(y, r)).

Proposition 4.38 Let (Γ, μ) be a graph satisfying (H5) and (V_α). Suppose that the near-diagonal lower bound (4.45) holds for some β ≥ 2. Then if T ≥ d(x, y) ∨ 1,

 p_T(x, y) + p_{T+1}(x, y) ≥ c_3(T ∨ 1)^{−α/β} exp(−c_4 (d(x, y)^β/(T ∨ 1))^{1/(β−1)}).  (4.47)

Proof If T = 0 and x = y this bound is immediate from the condition (V_α). We continue to write c_1, c_2 for the constants in (4.45). We use a classical chaining argument which works quite generally. At first sight, like the argument in the upper bound, the proof appears to give rather a lot away. As in the upper bound, we will need to treat three cases, of which the first two are easy and the third gives the main estimate. Let R = d(x, y) and c_5 = 1 ∧ 2^{−3β}c_1^β.

Case 1: c_5T ≤ R ≤ T. We can find a path γ = (z_0, ..., z_m) between x and y, where m ∈ {T, T + 1}; this path does not have to be self-avoiding. As (H5) holds, we have P(z_{i−1}, z_i) ≥ c_6^{−1} > 0 for each i. Then

 P^x(X_m = y) ≥ P^x(X_k = z_k, 1 ≤ k ≤ m) ≥ c_6^{−(T+1)} ≥ ce^{−c′T},

which gives the bound (4.47). (Note that in this regime the term T^{−α/β} plays no role.)

Case 2: R^β/T ≤ c_1^β. In this case the exponential term in (4.47) is of order 1, and since R ≤ c_1 T^{1/β} the bound follows immediately from (4.45).

Case 3: R^β/T > c_1^β, R ≤ c_5T. The basic idea is to choose m with m^{β−1} ≍ R^β/T, divide the journey from x to y into m steps, and use the NDLB on each


step; this is possible since this choice of m means that (T/m)^{1/β} ≍ R/m. However, since both m and the duration of the time steps have to be integers, the details of the argument need a little work.

Let 1 ≤ m ≤ R, and r = R/m, s = T/m, r_0 = ⌊r⌋, s_0 = ⌊s⌋. As m ≤ R ≤ T, we have r/2 ≤ r_0 ≤ r, s/2 ≤ s_0 ≤ s. Since mr_0 ≤ R ≤ m(r_0 + 1), we can find a chain x = z_0, z_1, ..., z_m = y with d(z_{j−1}, z_j) ∈ {r_0, r_0 + 1} for each j. Set B_j = B(z_j, r_0). Note that if w_i ∈ B_i, then d(w_{i−1}, w_i) ≤ 3r_0 + 1 ≤ 4r. Now choose s_i ∈ {s_0, s_0 + 1} so that Σ_{i=1}^m s_i = T. Set

 p_j = min_{w∈B_{j−1}} P^w(X_{s_j} ∈ B_j),  j = 1, ..., m − 1.

Then if t_j = s_1 + ⋯ + s_j,

 P^x(X_T = y) ≥ P^x(X_{t_j} ∈ B_j, j = 1, ..., m − 1, X_T = y)
  = Σ_{w∈B_1} P^x(X_{s_1} = w) P^w(X_{t_j−s_1} ∈ B_j, j = 2, ..., m − 1, X_{T−s_1} = y)
  ≥ P^x(X_{s_1} ∈ B_1) min_{w∈B_1} P^w(X_{t_j−s_1} ∈ B_j, j = 2, ..., m − 1, X_{T−s_1} = y)
  ≥ ⋯ ≥ p_1 ⋯ p_{m−1} min_{w∈B_{m−1}} P^w(X_{s_m} = y).

Suppose we can choose m ∈ {1, ..., R} so that

 3r_0 + 1 ≤ c_1 s_0^{1/β} ≤ 16r_0.  (4.48)

Then, using (4.45) and the bound μ(B_j) ≥ cr_0^α,

 p_j ≥ c_2 s_0^{−α/β} r_0^α ≥ c_7,
 min_{w∈B_{m−1}} P^w(X_{s_m} = y) ≥ c_2 s_0^{−α/β} μ_y ≥ c_2 T^{−α/β} μ_y.

Combining the bounds above we obtain

 p_T(x, y) ≥ c_7^{m−1} c_2 T^{−α/β} = c_2 c_7^{−1} T^{−α/β} e^{−m log(1/c_7)}.  (4.49)

It remains to check that we can choose m so that (4.48) holds, and to bound the right hand side of (4.49). Using the bounds on r_0, s_0 above, (4.48) holds if

 4r = 4(R/m) ≤ c_1(T/2m)^{1/β},  c_1(T/m)^{1/β} ≤ 8(R/m),

that is, if an integer m can be found with

 2^{1+2β} c_1^{−β} R^β/T ≤ m^{β−1} ≤ 2^{β−1}(2^{1+2β} c_1^{−β} R^β/T).  (4.50)

As R^β T^{−1} c_1^{−β} ≥ 1, we can find m satisfying (4.50). Further, since R ≤ c_5T, we have

 m^{β−1} ≤ c_1^{−β} 2^{3β} R^β/T ≤ c_5 c_1^{−β} 2^{3β} R^{β−1} ≤ R^{β−1},

so that m ≤ R. Substituting the bound from (4.50) into (4.49) completes the proof.

For convenience we summarise our sufficient conditions for UHK(α, β) and LHK(α, β). Write Φ(n, r) = (r^β/n)^{1/(β−1)}.

Theorem 4.39 Let (Γ, μ) satisfy (V_α).

(a) If (Γ, μ) satisfies (E_β) and the on-diagonal upper bound

 p_n(x, x) ≤ c_0(1 ∨ n)^{−α/β},  n ≥ 0, x ∈ V,

then there exist constants c_1, c_2 such that for x, y ∈ V, n ≥ 0,

 p_n(x, y) ≤ c_1(n ∨ 1)^{−α/β} exp(−c_2 Φ(n ∨ 1, d(x, y))).

(b) If (Γ, μ) satisfies (H5) and NDLB(β) then there exist constants c_3, c_4 such that for x, y ∈ V, n ≥ d(x, y),

 p_n(x, y) + p_{n+1}(x, y) ≥ c_3(n ∨ 1)^{−α/β} exp(−c_4 Φ(n ∨ 1, d(x, y))).

5 Continuous Time Random Walks

5.1 Introduction to Continuous Time

In Chapter 4 we studied the transition density p_n(x, y) of the discrete time random walk X. It is also of interest to consider the related continuous time simple random walk (CTSRW) on (Γ, μ). While the essential ideas are the same in both the discrete and continuous time contexts, discrete time does introduce some extra difficulties, of which the most significant are connected to parity issues. In addition, the difference equations used in Chapter 4 take a simpler form in the continuous time context. There is, however, an initial penalty in dealing in continuous time, in that one is dealing with a more complicated object, and some extra work is needed to verify that certain interchanges of derivatives and infinite sums are valid. A further (minor) source of difficulty is that, while the support of the discrete time heat kernel p_n(x, ·) is finite, the continuous time kernel q_t(x, ·) is strictly positive everywhere.

We denote the continuous time walk by Y = (Y_t, t ∈ [0, ∞)), and will construct it from the discrete time walk X_n, n ∈ Z_+, and an independent Poisson process (N_t, t ≥ 0) with rate 1. (For details on the Poisson process see Appendix A.3 or [Nor].) Given the processes X and N we set

 Y_t = X_{N_t},  t ∈ [0, ∞).  (5.1)

The process Y waits an exp(1) time at each vertex x, and then jumps to some y ∼ x with the same jump probabilities as X, that is, with the probability P(x, y) = μ_xy/μ_x given by (1.2). Informally, if Y_t = x then the probability that Y jumps to y ∼ x in the time interval (t, t + h] is P(x, y)h + O(h^2). We use P^x to denote the law of Y with initial position Y_0 = x, and write q_t(x, y) for the transition density of Y_t with respect to μ, given by

 q_t(x, y) = P^x(Y_t = y)/μ_y.

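The construction (5.1) is easy to see in action numerically (a sketch, not from the text; the 4-cycle with unit weights is a toy example of my choosing): simulate Y by subordinating the discrete walk to a rate-1 Poisson process, and compare the empirical law of Y_t with the Poisson mixture Σ_n (e^{−t}t^n/n!) P^n(x_0, ·), which is the law of Y_t since N_t is Poisson(t) and independent of X.

```python
import random
from math import exp, factorial

# continuous time walk on the 4-cycle with unit weights: Y_t = X_{N_t}
NEIGHBOURS = {x: ((x - 1) % 4, (x + 1) % 4) for x in range(4)}

def sample_Y(t, x0, rng):
    # jumps of X happen at the event times of a rate-1 Poisson process
    x, s = x0, rng.expovariate(1.0)
    while s < t:
        x = rng.choice(NEIGHBOURS[x])
        s += rng.expovariate(1.0)
    return x

def law_Y(t, x0, nmax=40):
    # law of Y_t as the Poisson mixture sum_n e^{-t} t^n/n! P^n(x0, .)
    v = [1.0 if x == x0 else 0.0 for x in range(4)]
    dist = [0.0] * 4
    for n in range(nmax):
        w = exp(-t) * t ** n / factorial(n)
        dist = [dist[x] + w * v[x] for x in range(4)]
        v = [0.5 * (v[(x - 1) % 4] + v[(x + 1) % 4]) for x in range(4)]
    return dist

rng = random.Random(1)
trials = 20000
counts = [0] * 4
for _ in range(trials):
    counts[sample_Y(1.0, 0, rng)] += 1
empirical = [c / trials for c in counts]
print(empirical)
print(law_Y(1.0, 0))
```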
We also call q_t the (continuous time) heat kernel on (Γ, μ). As in the case of X, we write

 T_D = inf{t ≥ 0 : Y_t ∈ D},  τ_D = T_{D^c}.

If it is necessary to specify the process, we write T_D(Y), τ_D(Y), etc. These are stopping times for the process Y. Lemma 5.1 summarises some basic properties of q_t, which follow easily from the definition (5.1). As before we use the notation q_t^x(·) = q_t(x, ·).

Lemma 5.1

(a) For t ≥ 0,

 q_t(x, y) = Σ_{n=0}^∞ (e^{−t}t^n/n!) p_n(x, y).  (5.2)

(b) q_t(x, y) = q_t(y, x).

(c) For all x, z ∈ V, s, t ≥ 0,

 q_{t+s}(x, z) = Σ_{y∈V} q_t(x, y) q_s(y, z) μ_y = ⟨q_t^x, q_s^z⟩.  (5.3)

In particular, ||q_t^x||_2^2 ≤ μ_x^{−1} for each x ∈ V, t ≥ 0, so that q_t^x ∈ L^2.

(d)

 (∂/∂t) q_t(x, y) = Δq_t(x, y).

Proof (a) Since X and N are independent,

 P^x(Y_t = y) = Σ_{n=0}^∞ P^x(X_n = y, N_t = n) = Σ_{n=0}^∞ p_n(x, y) μ_y P^x(N_t = n),

and, as N_t has a Poisson distribution with mean t, this gives (5.2).

(b) This is immediate from the symmetry of p_n(x, y).

(c) This follows from (5.2) and the discrete Chapman–Kolmogorov equation (1.8). We have

 q_{t+s}(x, z) = Σ_{n=0}^∞ (e^{−t−s}(s + t)^n/n!) p_n(x, z)
  = Σ_{n=0}^∞ e^{−t−s} Σ_{k=0}^n (s^k t^{n−k}/(k!(n − k)!)) p_{k+(n−k)}(x, z)


  = Σ_{i=0}^∞ Σ_{j=0}^∞ e^{−t−s} (s^i t^j/(i! j!)) Σ_y p_i(x, y) p_j(y, z) μ_y
  = Σ_y q_t(x, y) q_s(y, z) μ_y.

(d) This follows by differentiating (5.2) – note that, since p_n(x, y) ≤ 1/μ_x for all n, it is easy to justify the interchange of derivative and sum.

Let t ≥ 0. We define the semigroup operator Q_t by setting

 Q_t f(x) = E^x f(Y_t),  f ∈ C_+(V),  (5.4)

and then let Q_t f = Q_t f^+ − Q_t f^− whenever this is defined. Thus we have

 Q_t f(x) = Σ_y q_t(x, y) f(y) μ_y  (5.5)

whenever the sum on the right hand side of (5.5) converges (absolutely). Unlike the discrete time operators P_n, which were defined for all f ∈ C(V) in (1.9), the expectation in (5.4) can be infinite. We therefore need to be more careful with the domain of these operators; for most purposes L^2(V) will be sufficient, but we include the other L^p spaces for completeness.

Lemma 5.2 Let p ∈ [1, ∞].

(a) For each t ≥ 0,

 Q_t f = Σ_{n=0}^∞ (e^{−t}t^n/n!) P^n f,  f ∈ L^p.  (5.6)

(b) For each t ≥ 0 the operator Q_t maps L^p to L^p, and ||Q_t||_{p→p} ≤ 1.
(c) If f ≥ 0 then Q_t f ≥ 0.
(d) Q_t 1 = 1 for each t ∈ [0, ∞).
(e) For s, t ≥ 0 we have Q_t Q_s = Q_{s+t}, so that (Q_t, t ∈ [0, ∞)) is a semigroup on L^p(V).
(f) For f ∈ L^p,

 |Q_t f| ≤ Q_t |f|.

(g) We have Q_t Δf = ΔQ_t f for all f ∈ L^p.


Proof (a) and (b) For f ∈ C_0(V), (5.6) follows immediately from (5.2). By Minkowski's inequality,

 ||Q_t f||_p ≤ Σ_{n=0}^∞ (e^{−t}t^n/n!) ||P^n f||_p ≤ ||f||_p,

and we have ||Q_t||_{p→p} ≤ 1. So (5.6) extends to f ∈ L^p.

(c) and (d) These are immediate from the definition (5.4), and the semigroup property (e) follows from (5.3).

(f) Since −|f| ≤ f ≤ |f|, this follows from (c).

(g) Since Δ = P − I commutes with P^n, this follows by (a).

Remark The Markov property of Y at a fixed time t follows from (5.3). Later, we will also use the strong Markov property of Y, that is, the Markov property at stopping times T. As a first encounter with the precise formulation of the Markov property can produce distress, even to experienced mathematicians, the details have been put into Appendices A.2 and A.3.

Lemma 5.3 states that Δ is the infinitesimal generator of the semigroup Q_t.

Lemma 5.3 Let f ∈ L^2(V, μ). Then

 lim_{t↓0} (Q_t f − f)/t = Δf,  pointwise and in L^2(V, μ).  (5.7)

Proof It is sufficient to prove this for f ∈ L^2_+. By (5.6),

 t^{−1}(Q_t f − f) = t^{−1} Σ_{n=0}^∞ (e^{−t}t^n/n!)(P^n f − f) = Σ_{n=1}^∞ (e^{−t}t^{n−1}/n!)(P^n f − f).

Since P^1 f − f = Δf, we obtain

 t^{−1}(Q_t f − f) = e^{−t}Δf + tR_t f,  (5.8)

where R_t f = Σ_{n=2}^∞ (e^{−t}t^{n−2}/n!)(P^n f − f). Then

 ||R_t f||_2 ≤ Σ_{n=2}^∞ (e^{−t}t^{n−2}/n!)(||P^n f||_2 + ||f||_2) ≤ 2||f||_2 Σ_{n=2}^∞ (e^{−t}t^{n−2}/(n − 2)!) ≤ 2||f||_2,

and letting t ↓ 0 in (5.8) gives the convergence in L^2. The pointwise convergence then follows.

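The generator relation (5.7) is easy to test on a finite graph, where Q_t can be computed from the series (5.6). A sketch (not from the text; the 4-cycle and the test function f are toy choices of mine), checking that the difference quotient (Q_t f − f)/t approaches Δf = Pf − f as t ↓ 0:

```python
from math import exp, factorial

def apply_P(v):
    # one step of the SRW transition operator P on the 4-cycle
    return [0.5 * (v[(x - 1) % 4] + v[(x + 1) % 4]) for x in range(4)]

def Q(t, v, nmax=40):
    # Q_t v via the series (5.6): sum_n e^{-t} t^n/n! P^n v
    out = [0.0] * 4
    for n in range(nmax):
        w = exp(-t) * t ** n / factorial(n)
        out = [out[i] + w * v[i] for i in range(4)]
        v = apply_P(v)
    return out

f = [1.0, 0.0, 2.0, 0.0]
pf = apply_P(f)
lap = [pf[i] - f[i] for i in range(4)]   # Delta f = P f - f
for t in (0.1, 0.01, 0.001):
    q = Q(t, f)
    err = max(abs((q[i] - f[i]) / t - lap[i]) for i in range(4))
    print(t, err)   # error shrinks roughly linearly in t
```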

Corollary 5.4 Let f ∈ L^2(V, μ) and t ≥ 0. Then

 (d/dt) Q_t f = ΔQ_t f.  (5.9)

The limit in the derivative holds both pointwise and in L^2(V, μ).

Proof The case t = 0 is given in Lemma 5.3. By (5.8) applied to g = Q_t f, if h ∈ (0, 1) then

 ||Q_{t+h} f − Q_t f||_2 ≤ h||ΔQ_t f||_2 + h^2||R_h Q_t f||_2 ≤ 4h||f||_2.

Hence if t > 0, and 0 < |h| < 1 ∧ t, then by first considering separately the cases h > 0 and h < 0 we have

 ||h^{−1}(Q_{t+h} f − Q_t f) − ΔQ_t f||_2 ≤ ||ΔQ_t f − ΔQ_{t−|h|} f||_2 + 2h||f||_2,

which proves that the limit in (5.9) holds in L^2. Again, the existence of the pointwise limit is then immediate.

Lemma 5.5 Let f ∈ L^2. Then

 (d/dt)||Q_t f||_2^2 = −2E(Q_t f, Q_t f),  (5.10)
 (d^2/dt^2)||Q_t f||_2^2 = 4||ΔQ_t f||_2^2 ≥ 0.  (5.11)

Proof Using the L^2 convergence in Corollary 5.4, and then Lemma 5.2(g),

 (d/dt)||Q_t f||_2^2 = lim_{h→0} h^{−1}⟨Q_{2t+2h} f − Q_{2t} f, f⟩ = 2⟨ΔQ_{2t} f, f⟩ = −2E(Q_t f, Q_t f).

Similarly,

 −(d/dt)E(Q_t f, Q_t f) = lim_{h→0} h^{−1}⟨ΔQ_{2t+2h} f − ΔQ_{2t} f, f⟩ = 2⟨Δ^2 Q_{2t} f, f⟩ = 2||ΔQ_t f||_2^2.

Corollary 5.6 For f ∈ L^2 and t ≥ 0,

 ||f||_2^2 − ||Q_t f||_2^2 ≤ 2tE(f, f).

Proof Set φ(t) = ||Q_t f||_2^2. Then φ′ is increasing by (5.11), so

 φ(0) − φ(t) = −∫_0^t φ′(s) ds ≤ −tφ′(0) = 2tE(f, f).

Remark 5.7 We now return to the relation (5.7). Let (Γ, μ) be a weighted graph, and let ν_x > 0, so ν defines a measure on V. We can associate an operator L^{(ν)} to the pair (E, ν) by requiring

 E(f, g) = ⟨−L^{(ν)} f, g⟩_{L^2(ν)} = −Σ_x L^{(ν)} f(x) g(x) ν_x,  f, g ∈ C_0(V).

Using the discrete Gauss–Green formula (1.20) we also have

 E(f, g) = −Σ_x Δf(x) g(x) μ_x,

and therefore it follows that

 L^{(ν)} f(x) = (μ_x/ν_x) Δf(x) = (1/ν_x) Σ_y μ_xy(f(y) − f(x)).

The Hille–Yosida theory of semigroups of operators (see [RW]) allows us to define a semigroup (Q_t^{(ν)}) on L^2(ν) such that L^{(ν)} is the infinitesimal generator of (Q_t^{(ν)}); that is,

 lim_{t→0} t^{−1}(Q_t^{(ν)} f − f) = L^{(ν)} f,  f ∈ L^2(ν).

Formally one can write

 Q_t^{(ν)} = exp(tL^{(ν)}),  t ≥ 0.

The process Y^{(ν)} with generator L^{(ν)} waits at x for an exponential time with mean ν_x/μ_x, and then jumps to y ∼ x with probability P(x, y) = μ_xy/μ_x. In the general terminology of Dirichlet forms (see [FOT]), one says that Y^{(ν)} is the Markov process associated with the Dirichlet form E on L^2(ν). Note that the quadratic form E alone is not enough to specify the process: one needs the measure ν also. Given two different measures ν, ν′, the processes Y^{(ν)}, Y^{(ν′)} have the same jump probabilities as X, and it follows that they can be expressed as time changes of each other.

Unfortunately, in general the semigroup (Q_t^{(ν)}) is not unique. This is because of the possibility of 'explosion' of the process Y^{(ν)} – that is, it can escape to infinity in finite time. Let τ_n = inf{t ≥ 0 : Y_t^{(ν)} ∉ B(x, n)} and ζ = lim_n τ_n; we call ζ the explosion time of Y^{(ν)}. In general one may have P^x(ζ < ∞) > 0, and in this case the generator L^{(ν)} only determines Y^{(ν)} on the interval [0, ζ).


If the explosion time ζ is always +∞ (i.e. P^x(ζ = ∞) = 1 for all x ∈ V) then the process Y is called stochastically complete. For the particular case ν = μ explosion does not occur, since (5.1) defines Y_t for all t ∈ [0, ∞). The intuitive explanation of why Y does not explode is that it has to wait an average of one time unit at a vertex x before jumping to the next vertex.

While any ν is possible, there are two particularly natural choices of ν. The first (chosen above) is ν = μ, while the second is ν_x ≡ 1. If we wish to distinguish these processes, we call them the constant speed and variable speed continuous time simple random walks on (Γ, μ), or CSRW and VSRW for short. In this book we discuss just the CSRW. Explosion in finite time is a possibility for the VSRW on graphs with unbounded weights or vertex degree; for more on the general case and some sufficient conditions for stochastic completeness, see [GHM, Fo2]. Note also that the infinitesimal generator of the VSRW is the combinatorial Laplacian Δ_Com given by (1.11).

Remarks

(1) Taking f = 1_y in (5.7) gives

 lim_{t→0} P^x(Y_t = y)/t = P(x, y) = μ_xy/μ_x,  y ≠ x,
 lim_{t→0} (1 − P^y(Y_t = y))/t = (μ_y − μ_yy)/μ_y.

Thus Y has a 'Q-matrix' (see [Nor]) given by

 Q_xy = P(x, y),  x ≠ y,
 Q_xx = P(x, x) − 1.

(2) If Y_t = x then the probability that Y jumps to y ∼ x in the time interval (t, t + h] is P(x, y)h + O(h^2).

(3) For discrete time walks, allowing μ_xx > 0 makes a possibly important qualitative difference, since it permits one to have X_n = X_{n+1} = x with positive probability. This is helpful in removing difficulties arising in bipartite graphs, and is one reason to study the 'lazy' random walk. For the continuous time walk, the jump time out of a point x has a smooth (exponential) distribution, and it is more natural to take μ_xx = 0 for all x. However, as having μ_xx > 0 creates no difficulties, we will continue to allow this possibility.


Corollary 5.8 Suppose that |V| = N , let ϕi , ρi be as in Corollary 3.35, and set λi = 1 − ρi . Then qt (x, y) =

N

ϕi (x)ϕi (y)e−λi t .

i=1

Proof Since
\[ \sum_{n=0}^{\infty} \rho_i^n\, \frac{e^{-t} t^n}{n!} = e^{-t+\rho_i t} = e^{-\lambda_i t}, \]
this is immediate from Corollary 3.35 and (5.2).

The following result on speed of convergence to equilibrium is proved in the same way as Lemma 3.37.

Corollary 5.9 Let V be finite, and let 0 = λ_1 < λ_2 ≤ ··· ≤ λ_N be the eigenvalues of −Δ. Let ν_x be a probability measure on V, b = μ(V)^{−1}, and f_t(y) = Σ_x ν_x q_t(x,y). Then
\[ \| f_t - b \|_2 \le e^{-\lambda_2 t}\, \| f_0 - b \|_2. \]

Let D ⊂ V, and define the killed heat kernel by
\[ q_t^D(x,y) = \mu_y^{-1}\, P^x(Y_t = y,\ \tau_D > t). \tag{5.12} \]

We write q_t^{D,x} = q_t^D(x, ·). It is straightforward to verify

Lemma 5.10 For s, t ≥ 0, x, y, z ∈ V,
\[ q_t^D(x,y) = \sum_{n=0}^{\infty} \frac{e^{-t} t^n}{n!}\, p_n^D(x,y), \]
\[ q_t^D(x,y) = q_t^D(y,x), \]
\[ q_{t+s}^D(x,z) = \langle q_t^{D,x},\, q_s^{D,z} \rangle, \]
\[ \frac{\partial}{\partial t} q_t^D(x,y) = \Delta_D\, q_t^D(x,y). \]

The following relation gives one key to the control of q_t(x,x).

Lemma 5.11 Let x ∈ V, and ψ(t) = ⟨q_t^x, q_t^x⟩ = q_{2t}(x,x). Then
\[ \psi'(t) = 2 \langle \Delta q_t^x, q_t^x \rangle = -2\, \mathcal{E}(q_t^x, q_t^x) \le 0, \tag{5.13} \]
\[ \psi''(t) = 4 \langle \Delta q_t^x, \Delta q_t^x \rangle \ge 0. \tag{5.14} \]
In particular, ψ is decreasing and ψ' is increasing.
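The identities in Lemma 5.10 lend themselves to a direct numerical check. The sketch below is not from the text: the path graph, the killed domain D, and the series truncation N are arbitrary illustrative choices.

```python
import math

# Numerical check of Lemma 5.10 (illustrative sketch): SRW on the path
# 0-1-2-3-4 with unit edge weights, killed outside D = {1,2,3}.
edges = {(0, 1): 1.0, (1, 2): 1.0, (2, 3): 1.0, (3, 4): 1.0}
mu = [sum(w for e, w in edges.items() if x in e) for x in range(5)]

def P(x, y):
    return edges.get((x, y), edges.get((y, x), 0.0)) / mu[x]

D = [1, 2, 3]
PD = [[P(x, y) for y in D] for x in D]   # substochastic matrix of the killed walk

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def qD(t, N=80):
    """q_t^D(x,y) = mu_y^{-1} sum_n e^{-t} t^n/n! * p_n^D(x,y), truncated at N."""
    Pn = [[float(i == j) for j in range(3)] for i in range(3)]   # P_D^0
    q = [[0.0] * 3 for _ in range(3)]
    for n in range(N):
        a = math.exp(-t) * t**n / math.factorial(n)              # a_t(n)
        for i in range(3):
            for j in range(3):
                q[i][j] += a * Pn[i][j] / mu[D[j]]
        Pn = matmul(Pn, PD)
    return q

q1, q2, q3 = qD(1.0), qD(2.0), qD(3.0)
# symmetry: q_t^D(x,y) = q_t^D(y,x)
assert all(abs(q1[i][j] - q1[j][i]) < 1e-10 for i in range(3) for j in range(3))
# semigroup: q_{t+s}^D(x,z) = sum_y q_t^D(x,y) q_s^D(z,y) mu_y
for i in range(3):
    for k in range(3):
        inner = sum(q1[i][j] * q2[k][j] * mu[D[j]] for j in range(3))
        assert abs(q3[i][k] - inner) < 1e-9
print("Lemma 5.10 identities hold numerically")
```

The symmetry assertion reflects the reversibility μ_x P^n(x,y) = μ_y P^n(y,x), and the semigroup assertion uses the Poisson convolution a_{t+s}(k) = Σ_{n+m=k} a_t(n)a_s(m).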


Continuous Time Random Walks

Proof This follows from Lemma 5.5 by taking f = μ_x^{−1} 1_x, so that Q_t f(y) = q_t(y,x) = q_t^x(y).

Exercise Show that if (Γ, μ) is transient then
\[ g(x,y) = \int_0^\infty q_t(x,y)\, dt. \]

5.2 Heat Kernel Bounds

We will use some basic properties of the Poisson distribution to compare the discrete and continuous time heat kernels p_n(x,y) and q_t(x,y). For s ∈ (−1, ∞) set f(s) = (1+s) log(1+s) − s.

Lemma 5.12 For s ∈ [0, 1),
\[ f(-s) \ge f(s) \ge \tfrac14 s^2. \]
For s ≥ 1,
\[ f(s) \ge c_0 (1+s)\log(1+s), \quad \text{where } c_0 = 1 - (2\log 2)^{-1}. \]

Proof Note that f'(s) = log(1+s). Since −log(1−u) ≥ log(1+u) for u ∈ (0,1), we have, for s ∈ (0,1),
\[ f(-s) = -\int_0^s \log(1-u)\,du \ge \int_0^s \log(1+u)\,du = f(s) \ge \int_0^s \tfrac12 u\,du = \tfrac14 s^2. \]
Here we used the fact that log(1+u) ≥ u/2 for 0 < u < 1. For the second assertion, set h(s) = (1+s)log(1+s) and g(s) = h(s)/s. Then g'(s) ≥ 0 for s ≥ 1, so that inf_{s≥1} g(s) = g(1) = 2 log 2. Therefore, writing b = 1/(2 log 2), we have
\[ f(s) = b\, h(s) - s + (1-b)\, h(s) \ge (1-b)\, h(s). \]

Write
\[ a_t(k) = P(N_t = k) = \frac{e^{-t} t^k}{k!}. \tag{5.15} \]
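The bounds in Lemma 5.12 are elementary to test numerically; here is a quick sanity check (illustrative only — the grids are arbitrary choices):

```python
import math

# Sanity check of the bounds in Lemma 5.12 on a grid.
def f(s):
    # f(s) = (1+s) log(1+s) - s, defined for s > -1
    return (1.0 + s) * math.log(1.0 + s) - s

c0 = 1.0 - 1.0 / (2.0 * math.log(2.0))

for k in range(1, 100):            # s in (0, 1)
    s = k / 100.0
    assert f(-s) >= f(s) >= 0.25 * s * s
for k in range(1, 201):            # s in (1, 21]
    s = 1.0 + k / 10.0
    assert f(s) >= c0 * (1.0 + s) * math.log(1.0 + s)
print("Lemma 5.12 bounds hold on the test grid")
```

Note that the second bound is an equality at s = 1 (both sides equal 2 log 2 − 1), so the grid starts just above 1 to avoid floating-point ties.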

Lemma 5.13 (a) For any t ≥ 0, k ∈ Z_+,
\[ P(N_t \in 2\mathbb{Z},\ N_t \le 2k) \ge \tfrac13\, P(N_t \le 2k). \]
(b) For any t > 0,
\[ P(N_t \ge (1+s)t) \le \exp(-t f(s)), \quad s > 0, \tag{5.16} \]
\[ P(N_t \le (1-s)t) \le \exp(-t f(-s)), \quad 0 < s < 1. \tag{5.17} \]


(c) We have
\[ P(N_t \ge (e-1)t) \le e^{-t}, \qquad P(N_t \le t/e) \le e^{-t(e-2)/e}. \]
(d) For R > t,
\[ P(N_t \ge R) \le \exp\bigl(-t + R - R\log(R/t)\bigr) = \exp\bigl(-t - R\log(R/et)\bigr). \]
(e) For |λ| < 1, t ≥ 0,
\[ P(|N_t - t| \ge \lambda t) \le 2\, e^{-\lambda^2 t/4}. \]
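The tail bounds in Lemma 5.13 can be compared with exact Poisson tail probabilities. The sketch below is illustrative only (the value t = 20, the grids of s, and the truncation N are arbitrary); the pmf is computed in log form to avoid overflow.

```python
import math

def f(s):
    return (1.0 + s) * math.log(1.0 + s) - s

def a(t, k):
    # Poisson pmf a_t(k) = e^{-t} t^k / k!, in log form to avoid overflow
    return math.exp(k * math.log(t) - t - math.lgamma(k + 1))

def pois_tail(t, k0, N=400):
    """P(N_t >= k0), the series truncated at N (ample for t = 20)."""
    return sum(a(t, k) for k in range(k0, N))

t = 20.0
for s in (0.25, 0.5, 1.0, 2.0):        # (5.16)
    assert pois_tail(t, math.ceil((1.0 + s) * t)) <= math.exp(-t * f(s))
for s in (0.25, 0.5, 0.75):            # (5.17)
    low = 1.0 - pois_tail(t, math.floor((1.0 - s) * t) + 1)   # P(N_t <= (1-s)t)
    assert low <= math.exp(-t * f(-s))
lam = 0.5                              # part (e)
two_sided = (1.0 - pois_tail(t, math.floor((1 - lam) * t) + 1)) \
            + pois_tail(t, math.ceil((1 + lam) * t))
assert two_sided <= 2.0 * math.exp(-lam * lam * t / 4.0)
print("Poisson tail bounds (5.16), (5.17), (e) verified at t = 20")
```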

Proof (a) We have t² − (2k+2)t + (2k+1)(2k+2) > t² − 2(k+1)t + (k+1)² ≥ 0; rearranging and multiplying by e^{−t}t^{2k}/(2k+1)! gives a_t(2k+1) ≤ a_t(2k) + a_t(2k+2). So
\[ P(N_t \in 2\mathbb{Z},\ N_t \le 2k) = \tfrac13 \sum_{m=0}^{k} a_t(2m) + \tfrac13 \sum_{m=0}^{k} 2 a_t(2m) \ge \tfrac13 \sum_{m=0}^{k} a_t(2m) + \tfrac13 \sum_{m=0}^{k-1} a_t(2m+1) = \tfrac13\, P(N_t \le 2k). \]

(b) This is proved by a standard exponential moment estimate. Let λ > 0. Then
\[ P(N_t \ge t(1+s)) = P(e^{\lambda N_t} \ge e^{\lambda(1+s)t}) \le e^{-\lambda(1+s)t}\, \mathbb{E} e^{\lambda N_t} = \exp\bigl(-\lambda t(1+s) - t + t e^{\lambda}\bigr). \]
The minimum on the right is attained by taking λ = log(1+s), and this gives (5.16). Similarly, for (5.17) we write P(N_t ≤ (1−s)t) = P(e^{−λN_t} ≥ e^{−λ(1−s)t}), use Markov's inequality, and finally set λ = −log(1−s). The remaining estimates are immediate from (b) and Lemma 5.12.

Theorem 5.14 Let α ≥ 1. The following are equivalent.
(a) Γ satisfies the Nash inequality (N_α).
(b) There exists c_1 such that
\[ p_n(x,y) \le c_1 (1 \vee n)^{-\alpha/2}, \quad n \ge 0,\ x, y \in V. \]


(c) There exists c_2 such that
\[ q_t(x,y) \le \frac{c_2}{t^{\alpha/2}}, \quad t > 0,\ x, y \in V. \tag{5.18} \]

Proof The equivalence of (a) and (b) has already been proved in Theorem 4.3, so it remains to prove (b) ⇔ (c). First assume (b). Then q_t(x,x) = E p_{N_t}(x,x) ≤ c E(1 ∨ N_t)^{−α/2}. By Lemma 5.13(c),
\[ \mathbb{E}(1 \vee N_t)^{-\alpha/2} \le P(N_t < t/e) + (t/e)^{-\alpha/2}\, P(N_t \ge t/e) \le e^{-ct} + c' t^{-\alpha/2}, \]
which gives (c). Now suppose (c) holds. By Lemma 4.1 it is enough to prove
\[ p_{2n}(x,x) \le c (1 \vee n)^{-\alpha/2}. \tag{5.19} \]
Using the notation a_t(k) given in (5.15), we have, for t ≥ 0, n ≥ 0,
\[ c t^{-\alpha/2} \ge q_t(x,x) = \sum_{k=0}^{\infty} \bigl( a_t(2k)\, p_{2k}(x,x) + a_t(2k+1)\, p_{2k+1}(x,x) \bigr) \ge \sum_{k=0}^{n} a_t(2k)\, p_{2k}(x,x) \ge p_{2n}(x,x)\, P(N_t \le 2n,\ N_t \in 2\mathbb{Z}) \ge \tfrac13\, p_{2n}(x,x)\, P(N_t \le 2n). \]

If n = 0 then set t = 1; we obtain c_2 ≥ q_1(x,x) ≥ (1/3) p_0(x,x) e^{−1}, which gives (5.19). If n ≥ 1 then set t = 2n/(e−1), so that t ≥ 1. Lemma 5.13(c) gives P(N_t ≥ 2n) ≤ e^{−t} ≤ 1/2, and thus p_{2n}(x,x) ≤ 6c t^{−α/2} = c' n^{−α/2}, proving (b).

Remarks 5.15 (1) Note the following quick proof of (a) ⇒ (c). Set ψ(t) = ⟨q_t^x, q_t^x⟩ = q_{2t}(x,x); then ||q_t^x||_1 = 1 and ||q_t^x||_2^2 = ψ(t). So by Lemma 5.11 and (N_α)
\[ \psi'(t) = -2\mathcal{E}(q_t^x, q_t^x) \le -c_N\, \|q_t^x\|_2^{2+4/\alpha}\, \|q_t^x\|_1^{-4/\alpha} = -c_N\, \psi(t)^{1+2/\alpha}. \tag{5.20} \]
We have ψ(0) = q_0(x,x) = μ_x^{−1}. Let φ(t) = ψ(t)^{−2/α}; then
\[ \varphi'(t) = -\frac{2}{\alpha}\, \psi'(t)\, \psi(t)^{-1-2/\alpha} \ge c_2, \]
so φ(t) ≥ φ(0) + c_2 t ≥ c_2 t, and thus ψ(t) ≤ (c_2 t)^{−α/2}.
(2) We could also prove the implication (c) ⇒ (a) by using the same argument as in Theorem 4.3.
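As a numerical illustration of the relation between the Poisson series for q_t and the heat equation q_t' = Δq_t used throughout this section, one can compare the series with direct time-stepping. This is a sketch only: the 5-cycle, the truncation N, the Euler step count, and the tolerance are all arbitrary choices, not from the text.

```python
import math

# The Poissonized kernel u(t) = e^{-t} sum_k (t^k/k!) P^k delta_x solves
# u' = (P - I)u, the heat equation with generator Delta = P - I.
n = 5
P = [[0.0] * n for _ in range(n)]
for x in range(n):
    P[x][(x + 1) % n] = P[x][(x - 1) % n] = 0.5     # SRW on the 5-cycle

def poisson_kernel(t, x, N=120):
    """P^x(Y_t = .) as a list, i.e. q_t(x,.) times mu_."""
    row = [float(j == x) for j in range(n)]          # row of P^k, k = 0
    u = [0.0] * n
    for k in range(N):
        ak = math.exp(k * math.log(t) - t - math.lgamma(k + 1))
        u = [u[j] + ak * row[j] for j in range(n)]
        row = [sum(row[i] * P[i][j] for i in range(n)) for j in range(n)]
    return u

def euler(t, x, steps=20000):
    """explicit Euler for u' = (P - I)u started from delta_x"""
    h = t / steps
    u = [float(j == x) for j in range(n)]
    for _ in range(steps):
        Pu = [sum(u[i] * P[i][j] for i in range(n)) for j in range(n)]
        u = [u[j] + h * (Pu[j] - u[j]) for j in range(n)]
    return u

a, b = poisson_kernel(2.0, 0), euler(2.0, 0)
assert abs(sum(a) - 1.0) < 1e-9          # mass conservation
assert max(abs(a[j] - b[j]) for j in range(n)) < 1e-3
print("Poisson series and direct integration of u' = (P - I)u agree")
```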

Corollary 5.16 We have
\[ \mathcal{E}(q_t^x, q_t^x) \le t^{-1}\, q_t(x,x). \]

Proof Using the notation of Remarks 5.15, by Lemma 5.11 ψ'' ≥ 0, so that ψ' is increasing. Thus
\[ \psi(t) - \psi(t/2) = \int_{t/2}^{t} \psi'(s)\, ds \le \tfrac12 t\, \psi'(t) = -t\, \mathcal{E}(q_t^x, q_t^x). \]
So
\[ \mathcal{E}(q_t^x, q_t^x) \le t^{-1}\, \psi(t/2) = t^{-1}\, q_t(x,x). \]
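Corollary 5.16 can be checked numerically on a small graph. The sketch below is illustrative only (the 5-cycle and the time grid are arbitrary choices):

```python
import math

# Check E(q_t^x, q_t^x) <= t^{-1} q_t(x,x) on the 5-cycle (unit weights, mu_x = 2).
n = 5
P = [[0.0] * n for _ in range(n)]
for x in range(n):
    P[x][(x + 1) % n] = P[x][(x - 1) % n] = 0.5

def heat_row(t, x, N=150):
    """the function y -> q_t(x,y), from the Poisson series"""
    row = [float(j == x) for j in range(n)]
    q = [0.0] * n
    for k in range(N):
        ak = math.exp(k * math.log(t) - t - math.lgamma(k + 1))
        for y in range(n):
            q[y] += ak * row[y] / 2.0                 # divide by mu_y = 2
        row = [sum(row[i] * P[i][j] for i in range(n)) for j in range(n)]
    return q

def dirichlet(f):
    # E(f,f) = (1/2) sum_{x,y} mu_xy (f(x)-f(y))^2; mu_xy = 1 on cycle edges
    return 0.5 * sum((f[x] - f[(x + 1) % n]) ** 2 + (f[x] - f[(x - 1) % n]) ** 2
                     for x in range(n))

for t in (0.5, 1.0, 3.0, 10.0):
    qt = heat_row(t, 0)
    assert dirichlet(qt) <= qt[0] / t + 1e-12        # Corollary 5.16
print("Corollary 5.16 verified on the 5-cycle")
```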

The tail of the Poisson decays more rapidly than an exponential. Combining the Carne–Varopoulos bound with our bounds on the tail of the Poisson, we obtain the following.

Theorem 5.17 Let x, y ∈ V and R = d(x,y) ≥ 1. Then
(a)
\[ q_t(x,y) \le \begin{cases} 4 (\mu_x \mu_y)^{-1/2}\, e^{-R^2/(e^2 t)}, & \text{if } R \le et, \\[4pt] (\mu_x \vee \mu_y)^{-1} \exp\bigl(-t - R \log(R/et)\bigr), & \text{if } R \ge et; \end{cases} \tag{5.21} \]
(b) if Γ satisfies (H5) then
\[ q_t(x,y) \ge c_1 (\mu_x \wedge \mu_y)^{-1} \exp\bigl(-c_2 R - R \log(R/t)\bigr), \quad \text{for } R \ge t > 0. \tag{5.22} \]

Proof (a) Note that both bounds in (5.21) are of the form e^{−cR} if t ≤ R ≤ et. By Theorem 4.9 and Lemma 5.13(c) we have, writing b = 2(μ_x μ_y)^{−1/2},
\[ q_t(x,y) \le b \sum_{n=0}^{\infty} a_t(n)\, e^{-R^2/2n} = b \sum_{n \le et} a_t(n)\, e^{-R^2/2n} + b \sum_{n > et} a_t(n)\, e^{-R^2/2n} \le b\, e^{-R^2/2et} \sum_{n \le et} a_t(n) + b \sum_{n > et} a_t(n) \le b\, e^{-R^2/2et} + b\, e^{-t}, \]
and since t ≥ e^{−2}R²/t this gives the first bound. If R ≥ et then by Lemma 5.13(b)
\[ q_t(x,y)\, \mu_y \le P(N_t \ge R) \le \exp\bigl(-t - R\log(R/et)\bigr). \]
Interchanging x and y gives the same bound for q_t(x,y)μ_x, which implies the second bound in (5.21).


(b) Since d(x,y) = R ≥ 1 there exists a path x = z_0, z_1, …, z_R = y of length R connecting x and y. Using (H5) there exists p_0 > 0 such that P(z_{n−1}, z_n) ≥ p_0 for n = 1, …, R. Hence p_R(x,y)μ_y ≥ p_0^R, and therefore
\[ q_t(y,x)\mu_x = q_t(x,y)\mu_y \ge p_R(x,y)\mu_y\, P(N_t = R) \ge p_0^R\, e^{-t} t^R R^{-R} = \exp\bigl(-R\log p_0^{-1} - t - R\log(R/t)\bigr), \]
and as t ≤ R this implies (5.22).

Remarks 5.18 (1) The bounds in this theorem show that if d(x,y) = k then q_t(x,y) ≍ t^k for small t.
(2) If we do not need the best constant for the R log(R/t) term then, writing U(R,t) = R + R log(R/t), it is easy to obtain from (5.21) and (5.22) the simpler-looking bounds
\[ c_1 (\mu_x \wedge \mu_y)^{-1} e^{-c_2 U(R,t)} \le q_t(x,y) \le 4(\mu_x \mu_y)^{-1/2} e^{-c_3 U(R,t)}. \]

We now consider heat kernel bounds of the form HK(α, β) for q_t(x,y). In view of the transition between 'bulk' and 'Poisson' behaviour, and the Poisson type bounds in Theorem 5.17, we restrict to t ≥ d(x,y).

Definition 5.19 (Γ, μ) satisfies UHKC(α, β) if there exist constants c_1, c_2 such that for all x, y ∈ V, t ≥ d(x,y),
\[ q_t(x,y) \le \frac{c_1}{t^{\alpha/\beta}} \exp\Bigl(-c_2 \Bigl(\frac{d(x,y)^\beta}{t}\Bigr)^{\frac{1}{\beta-1}}\Bigr). \tag{5.23} \]
We say (Γ, μ) satisfies LHKC(α, β) if there exist constants c_3, c_4 such that for all x, y ∈ V and t ≥ 1 ∨ d(x,y),
\[ q_t(x,y) \ge \frac{c_3}{t^{\alpha/\beta}} \exp\Bigl(-c_4 \Bigl(\frac{d(x,y)^\beta}{t}\Bigr)^{\frac{1}{\beta-1}}\Bigr). \tag{5.24} \]
We say (Γ, μ) satisfies HKC(α, β) if both UHKC(α, β) and LHKC(α, β) hold.

We now use similar arguments to those in Chapter 4 to obtain UHKC(α, β) from bounds on volume and hitting times.

Lemma 5.20 Let A ⊂ V, x ∈ V. Then
\[ \mathbb{E}^x T_A(X) = \mathbb{E}^x T_A(Y). \]
In particular, the condition (E_β) is the same for both processes.


Proof Recall that Y_t = X_{N_t}, where N is a Poisson process independent of X. Write S_1, S_2, … for the jump times of N. Then the S_k − S_{k−1} are i.i.d. exponential mean-1 random variables, and so E S_k = k. We have T_A(Y) = S_{T_A(X)}, and so
\[ \mathbb{E}^x T_A(Y) = \sum_{k=0}^{\infty} \mathbb{E}^x\bigl( S_k;\ T_A(X) = k \bigr) = \mathbb{E}^x T_A(X). \]

Write τ_Y(x,r) = τ_{B(x,r)}(Y).

Lemma 5.21 Let β ≥ 2 and suppose that (E_β) holds. There exist c_i > 0 such that if t ≥ r ≥ 1 then
\[ P^x\bigl( \tau_Y(x,r) < t \bigr) \le c_4 \exp\bigl( -c_5 (r^\beta/t)^{1/(\beta-1)} \bigr). \tag{5.25} \]

Proof We could follow the arguments in Lemma 4.31 and Proposition 4.33, but a direct proof from the discrete time result is simpler. Write τ_X, τ_Y for the exit times of X and Y from B(x,r), and set γ = 1/(β−1). Note that {τ_Y ≤ t, τ_X > 2t} ⊂ {N_t > 2t}. So
\[ P^x(\tau_Y \le t) = P^x(\tau_Y \le t,\ \tau_X \le 2t) + P^x(\tau_Y \le t,\ \tau_X > 2t) \le P^x(\tau_X \le 2t) + P(N_t > 2t) \le c\, e^{-c(r^\beta/2t)^\gamma} + e^{-ct}. \]
Here we used Proposition 4.33 and Lemma 5.13(b) in the final line. Since t ≥ r, we have t ≥ (r^β/t)^γ, giving (5.25).

The proofs of Lemmas 4.19, 4.20, 4.21, and 4.22 all go through for Y with essentially no changes. In particular, we have the following lemma.

Lemma 5.22 Let Γ satisfy HKC(α, β). Then Γ satisfies (V_α) and (E_β).

The following two theorems give the equivalence of upper and lower heat kernel bounds in discrete and continuous time.

Theorem 5.23 Let α ≥ 1, β ≥ 2, and let (Γ, μ) satisfy (V_α). The following are equivalent.
(a) (Γ, μ) satisfies (E_β) and the on-diagonal upper bound p_n(x,x) ≤ c(1 ∨ n)^{−α/β}, n ≥ 0.
(b) (Γ, μ) satisfies (E_β) and the on-diagonal upper bound q_t(x,x) ≤ c t^{−α/β}, t > 0.
(c) (Γ, μ) satisfies UHK(α, β).
(d) (Γ, μ) satisfies UHKC(α, β).


Proof The equivalence of (a) and (c) is proved in Theorem 4.34, and of (a) and (b) in Theorem 5.14 and Lemma 5.20. Suppose (b) holds. Let x, y ∈ V with d(x, y) = r ≥ 1, and let t > 0. If t < r then there is nothing to prove, so it remains to consider the case t ≥ 1 ∨ r . Lemma 5.21 gives the bound (5.25) on the tail of τY (x, r ), and the same argument as in the proof of (a) ⇒ (b) in Theorem 4.34 then gives (5.23). The proof that (d) ⇒ (b) is the same as in the discrete time case – see Lemma 4.21. Theorem 5.24 Let α ≥ 1, β ≥ 2. Suppose that (, μ) satisfies (Vα ), (H5), and UHK(α, β). The following are equivalent. (a) pn (x, y) satisfies the near-diagonal lower bound (Definition 4.36). (b) (, μ) satisfies HK(α, β). (c) qt (x, y) satisfies a near-diagonal lower bound: there exist c1 , c2 ∈ (0, 1) such that qt (x, y) ≥ c1 t −α/β if 1 ∨ d(x, y) ≤ c2 t 1/β .

(5.26)

(d) (, μ) satisfies HKC(α, β). Proof The implications (b) ⇒ (a) and (d) ⇒ (c) are trivial, and Proposition 4.38 proved that (a) ⇒ (b). The same chaining argument also works in continuous time, so that (c) ⇒ (d). Thus it remains to prove that the discrete and continuous time near-diagonal lower bounds are equivalent. The harder implication is that (c) ⇒ (a), since this uses the bounds on | pk+2 (x, y) − pk (x, y)| obtained in Theorem 4.28. Assume that (c) holds, fix x, y, and let c2 n 1/β ≥ 1 ∨ d(x, y). Let δ ∈ (0, 12 ), and recall the notation at (k) from (5.15). Then by (5.26) c1 n −α/β ≤ qn (x, y) = an (k) pk (x, y) + an (k) pk (x, y) ≤

|k−n|≤δn

|k−n|>δn

an (k) pk (x, y) + P(|Nn − n| > δn).

|k−n|≤δn

Let J0 = {k : |k −n| ∈ 2Z, |k −n| ≤ δn}, J1 = {k : |k −n −1| ∈ 2Z, |k −n| ≤ δn}, and write an (J ) = k∈J an (k). Then by Theorem 4.28, for k ∈ J0 , | pk (x, y) − pn (x, y)| ≤ | pi+2 (x, y) − pi (x, y)| |i|≤δn/2

≤ nδc3 (n − nδ)−1−α/β ≤ c4 δn −α/β .

5.2 Heat Kernel Bounds

Thus

147

an (k)( pk (x, y) − pn (x, y)) ≤ c4 δn −α/β ,

k∈J0

and a similar bound holds for using Lemma 5.13(e),

k∈J1

an (k)( pk (x, y) − pn+1 (x, y)). Hence,

c1 n −α/β ≤ pn (x, y)an (J0 ) + pn+1 (x, y)an (J1 ) + c4 δn −α/β + P(|Nn − n| > δn) ≤ pn (x, y) + pn+1 (x, y) + c4 δn −α/β + 2e−δ

2 n/4

.

Choosing δ so that c4 δ ≤ c1 /3, and c5 so that if c2 n 1/β ≥ c5 then 2e−δ 1 −α/β , we obtain 3 c1 n

2 n/4

≤

pn (x, y) + pn+1 (x, y) ≥ 13 c1 n −α/β , provided c2 n 1/β ≥ c5 ∨ d(x, y), which is the discrete time near-diagonal lower bound. Now assume that (a) holds. If t ≥ 1 and |t − k| ≤ 12 t then at (k + 1)/at (k) = t/(k + 1) ≤ 2. Since a ∧ b ≥ 13 (a + b) if b ≤ 2a, we have, for t ≥ 1, writing J = {k ∈ 2Z : |t − k| ≤ 12 t}, qt (x, y) ≥ at (k) pk (x, y) + at (k + 1) pk+1 (x, y) k∈J

≥

1 3

(at (k) + at (k + 1))( pk (x, y) + pk+1 (x, y))

k∈J

≥ (1 − 2e−t/16 ) min( pk (x, y) + pk+1 (x, y)). k∈J

Hence there exist c6 , c7 such that if c6 t 1/β ≥ 1 ∨ d(x, y) then qt (x, y) ≥ c7 t −α/β , giving the continuous time NDLB (5.26). Remark See Remark A.52 for an example which shows that if (H5) does not hold then HK(α, β) and HKC(α, β) are not equivalent. (The difficulty arises with the discrete time lower bound at small times.) The following theorem summarises our sufficient conditions for the heat kernel bounds. Theorem 5.25 Let α ≥ 1 and β ≥ 2, and let (, μ) be a weighted graph satisfying (H5). The following are equivalent. (a) (, μ) satisfies (Vα ), (Nα ), (E β ), N DL B(β). (b) HK(α, β) holds. (c) HKC(α, β) holds.

148

Continuous Time Random Walks

Proof If (, μ) satisfies (Vα ) then the equivalence of (a), (b), and (c) is immediate from Theorems 4.39 and 5.24. It therefore remains to prove that each of (b) and (c) implies (Vα ), and this is given by Lemma 4.22, in discrete time, and by a similar argument in continuous time. We conclude this section by stating a lower bound on the killed heat kernel in continuous time. Let D ⊂ V, and set x qtD (x, y) = μ−1 y P (Yt = y, τ D > t).

Then the same argument as in the discrete time case (see Theorem 4.25) gives: Proposition 5.26 Suppose (, μ) satisfies HK(α, β), and let B = B(x 0 , R), B = B(x0 , R/2). Then for x, y ∈ B , d(x, y) ≤ t ≤ R β ⎛ ⎞ 1

β β−1 d(x, y) ⎠. qtB (x, y) ≥ c1 t −α/β exp ⎝−c2 t

6 Heat Kernel Bounds

6.1 Strongly Recurrent Graphs Chapters 4 and 5 showed that one can obtain Gaussian bounds, or the more general bounds HK(α, β) given three main inputs: a Nash inequality, a bound on the exit times from balls, and the near-diagonal lower bound. Of these, we have seen in Chapter 3 how one can prove the Nash inequality from geometric data on the graph. The other two conditions were introduced in an axiomatic fashion, and so far we have seen no indication of how they can be proved. In this chapter we will study two different situations where these conditions can be established. The easier but less familiar case, which we will do first, is a family of graphs which we call ‘strongly recurrent’, and where the heat kernel bounds will follow from volume and resistance estimates. There is no generally agreed definition of ‘strongly recurrent’ – see [D2, Ba1, T2] for two definitions. The general idea is that if a graph (, μ) satisfies a condition of the form Reff (x, y) ≥ cd(x, y)ε ,

as d(x, y) → ∞,

then, with some additional control on the volume growth, one can use effective resistance to obtain good control of the SRW on (, μ). The standard example of a graph which is recurrent but not ‘strongly recurrent’ is Z2 , where Reff (x, y) log d(x, y). Definition 6.1 (, μ) satisfies the condition (Rβ ) if there exist constants C1 , C2 such that C2 d(x, y)β C1 d(x, y)β ≤ Reff (x, y) ≤ , μ(B(x, d(x, y))) μ(B(x, d(x, y))) 149

for all x, y ∈ V.

150

Heat Kernel Bounds

If (Vα ) holds and β > α then we can rewrite this condition as C0 d(x, y)β−α ≤ Reff (x, y) ≤ C 0 C A d(x, y)β−α ,

for all x, y ∈ V. (6.1)

We will study graphs satisfying the two conditions (Vα ) and (Rβ ), with β > α, and will write ζ = β − α > 0. By Theorem 2.72 the graphical Sierpinski gasket satisfies these conditions with α = log 3/ log 2 and β = log 5/ log 2. We will make frequent use of the following inequality, proved in Lemma 2.25: | f (x) − f (y)|2 ≤ E ( f, f )Reff (x, y),

f ∈ C0 (V).

(6.2)

We begin with the following covering result; the constants CU and C L are as in Definition 4.14. Lemma 6.2 Assume (Vα ). Let r ≥ 1. Then there exists a covering of V by balls Bi = B(xi , r ) such that (a) B(xi , r/2) are disjoint; (b) Let θ ≥ 1. Any y ∈ V is in at most CU (2θ +1)α /C L of the balls B(xi , θr ). Proof First, we can choose a maximal set {xi } of points in V such that B(xi , r/2) are disjoint. (‘Maximal’ means that no extra points can be added to the set {xi , i ∈ N} – there is no assumption of optimal packing.) To do this, write V as a sequence {z 1 , z 2 , . . . }, and let x1 = z 1 . Given x1 , . . . , xn−1 , choose x n to be the first z j such that B(z j , r/2) is disjoint from n−1 ∪i=1 B(xi , r/2). Write Bi = B(xi , r ), Bi∗ = B(xi , θr ). We now show that V = ∪Bi . Let m ≥ 2, and let k be the smallest integer such that xk was chosen from {z m+1 , . . . }. Since xk = z m , we must have that B(z m , r/2) ∩ B(xi , r/2) = ∅ for some i ≤ k − 1; hence z m ∈ B(xi , r ). Now let y ∈ V be in M of the balls Bi∗ ; without loss of generality ∗ ∗ . So d(y, x ) ≤ θr for 1 ≤ i ≤ M. Thus B(x , r/2) ⊂ B1 , . . . , B M i i B(y, (θ + 12 )r ) for i ≤ M. So since B(xi , r/2) are disjoint, M B(xi , r/2)) CU (θ + 12 )α r α ≥ V (y, (θ + 12 )r ) ≥ μ(∪i=1

=

M

V (xi , r/2) ≥ MC L (r/2)α ,

i=1

which gives the upper bound on M.

6.1 Strongly Recurrent Graphs

151

Lemma 6.3 Let (, μ) satisfy (Vα ), and (Rβ ) with β > α. Then there exist constants ci such that c1r β−α ≤ Reff (x, B(x, r )c ) ≤ c2 r β−α ,

for all x ∈ V, r ≥ 1.

Proof Write B = B(x, r ). Let y ∈ ∂ B. Then {y} ⊂ B c and so by Corollary 2.27 we have Reff (x, B c ) ≤ Reff (x, y) ≤ C0 C A (1+r )ζ , which gives the upper bound. To prove the converse let λ ∈ (0, 14 ). Let B(yi , λr ) be the cover of V given by Lemma 6.2, with λr replacing r . If B(yi , λr ) ∩ B(x, r ) = ∅ then x ∈ B(yi , (1 + λ)λ−1r ). Thus if A0 = {i : B(yi , λr ) ∩ B(x, r ) = ∅} then | A0 | ≤ cλ−α . Let A = {i ∈ A0 : d(x, yi ) ≥ 12 r }. For each i ∈ A let h i be the harmonic function h i (z) = Pz (Tx < Tyi ); thus h i (x) = 1 and h i (yi ) = 0. Further h i is the function which attains the minimum in the variational problem for Ceff (x, yi ), so E (h i , h i ) = Reff (x, yi )−1 ≤ cr −ζ . For i ∈ A and z ∈ B(yi , λr ) we have |h i (z)−h i (yi )|2 ≤ E (h i , h i )Reff (yi , z) =

C 0 C A (λr )ζ Reff (yi , z) ≤ = C A (2λ)ζ . Reff (yi , x) C 0 ( 12 r )ζ

So taking λ small enough we have h i (z) ≤ then z = yi and so h(z) = 1.) Now let

1 2

on each B(yi , λr ). (If λr < 1

g = 2 min(h i − 12 )+ . i∈A

Then g(x) = 1, and g = 0 on D = ∪i∈A B(yi , λr ). Note that D contains all points z with 34 r ≤ d(x, z) ≤ r . So if g = g1 B then E (g , g ) ≤ E (g, g). However, by Lemma 1.27(b), E (h i , h i ) ≤ cmr −ζ . E (g, g) ≤ i∈A

Since g (x) = 1 and g = 0 on B c , we have Reff (x, B c ) ≥ E (g , g )−1 ≥ cr ζ . Remark rent.

This implies that Reff (x, ∞) = ∞, so that (, μ) is indeed recur-

152

Heat Kernel Bounds

Lemma 6.4 Assume (Vα ), (Rβ ) with β > α, and (H5). There exists p S > 0 such that if x ∈ V then (6.3) P y Tz < τ (x, r ) ≥ p S if y, z ∈ B(x, r/2). Proof Write B = B(x, r ) and let m ≥ 1. If d(y, z) ≤ r/m then by Lemma 2.62 we have P y (TB c < Tz ) ≤

Reff (y, z) C0 C A d(y, z)ζ ≤ ≤ c1 m −ζ . c Reff (y, B ) Reff (y, B(y, r/2)c )

(6.4)

Choose m so that the left side of (6.4) is less than 12 . If r/m ≥ 1 then, given any y, z ∈ B(x, r/2), we can find a chain of at most 2m points y = w0 , w1 , . . . , wm = z such that wi ∈ B(x, r/2) and d(wi−1 , wi ) ≤ r/m for i = 1, . . . , 2m. For each i we have Pwi−1 (Twi < TB c ) ≥ 12 , so P y (Tz < c T B ) ≥ 2−m . If r/m < 1 then we can use (H5) directly to obtain (6.3). Corollary 6.5 Let (, μ) satisfy (Vα ), (Rβ ) with β > α, and (H5). Then the elliptic Harnack inequality holds for (, μ). Proof Let x ∈ V, R ≥ 2, B = B(x, 2R), B = B(x, R), and let h : B → R+ be harmonic in B. Let y, z ∈ B and write τ = τ B , Mn = h(Yn∧τ ). Then M is a martingale, and so, using Lemma 6.4, h(y) = E y M0 = E y M Tz ∧τ ≥ E y (MTz ∧τ ; Tz < τ ) = h(z)P y (Tz < τ ) ≥ p S h(z).

An alternative proof uses the maximum principle. Let z, y be the points in B where h attains its maximum and minimum, respectively, and suppose that h(z) = 1. Set h z = P(Tz ≤ τ ), so that h z is harmonic in A = B − {z} and h z (y) ≥ p S by Lemma 6.4. Set f = h z − h. If x ∈ ∂ B then h z (x) = 0, while f (z) = 0. Thus max f = max f = 0, A

∂A

so that f ≤ 0 in A. Thus p S = h z (y) ≤ h(y), proving the EHI. Lemma 6.6

Let (, μ) satisfy (Vα ) and (Rβ ), and let β > α. Then qt (x, x) ≤ c0 t −α/β ,

x ∈ V, t > 0.

(6.5)

Proof Recall that (Vα ) includes the condition C L ≤ μx ≤ CU for all x. So qt (x, x) ≤ 1/μx ≤ 1/C L for all t, and it is enough to prove (6.5) for t ≥ 1.

6.1 Strongly Recurrent Graphs

153

Let r = t 1/β , fix x ∈ V, and set D = B(x, r ). Let f t (y) = qt (x, y), and let yt ∈ D be chosen so that f t (yt ) = min f t (y). y∈D

Then f t (yt )V (x, r ) ≤

f t (y)μ y ≤ 1,

y∈D

so that f t (yt ) ≤ V (x, r )−1 . By Corollary 5.16 we have tE ( f t , f t ) ≤ ϕ(t/2) = f t (x), so f t (x)2 ≤ 2 f t (yt )2 + 2( f t (x) − f t (yt ))2 ≤ 2V (x, r )−2 + 2Reff (x, yt )E ( f t , f t ) ≤ c1r −2α + c1

r β−α f t (x) ≤ 2c1 r −α max(r −α , r β t −1 f t (x)). t

If the first term in the maximum is larger, then f t (x)2 ≤ 2c1 r −2α , while if the second is larger then f t (x) ≤ tr −α−β = r −α = t −α/β , so in either case the bound (6.5) holds. Lemma 6.7 Let β > α and suppose that (, μ) satisfies (Vα ) and (Rβ ). Then (E β ) and UHK(α, β) hold. Proof Let B = B(x, r ). We have E y τ (x, r ) = g B (y, z)μz ≤ V (x, r )g B (y, y) = V (x, r )Reff (y, B c ). z∈B

As B ⊂ B(y, 2r ) we have Reff (y, B c ) ≤ c(2r )β−α , so, combining this with the upper bound on V (x, r ), we obtain the upper bound on E y τ . For the lower bound let B = B(x, r/2), and note that by (1.28) and Lemma 6.4 we have for y, z ∈ B g B (y, z) = P y (Tz < τ B )g B (z, z) ≥ p S g B (z, z). So, Ey τB ≥

g B (y, z)μz ≥ p S g B (z, z)V (x, r/2).

z∈B

As g B (z, z) = Reff (z, B c ) ≥ Reff (z, B(z, r/2)c ) ≥ cr β we obtain the lower bound in the condition (E β ). UHK(α, β) then follows from Lemma 6.6 and Theorem 5.23.

154

Heat Kernel Bounds

In general it can be hard to obtain the near-diagonal lower bound, but in the strongly recurrent case it follows easily from (6.2). Lemma 6.8 Suppose β > α and (Vα ) and (Rβ ) hold. Then there exist constants c1 , c2 such that qt (x, y) ≥ c1 t −α/β ,

if d(x, y) ≤ c2 t 1/β .

(6.6)

Proof Let f t (y) = qt (x, y) and d(x, y) ≤ δt 1/β . Since UHK(α, β) holds we also have the on-diagonal lower bound qt (x, x) ≥ ct −α/β by the same argument as in Lemma 4.35. Using Corollary 5.16 we have E ( f t , ft ) ≤ c2 t −1−α/β . So by (6.2) |qt (x, y) − qt (x, x)|2 ≤ cd(x, y)β−α E ( f t , f t ) ≤ c3 δ β−α t −2α/β ≤ c4 δ β−α qt (x, x)2 . Choose δ small enough so that c4 δ β−α ≤ 14 . Then |1 − qt (x, y)/qt (x, x)| ≤ 12 , which gives (6.6). Let β > α. The following are equivalent.

Theorem 6.9

(a) (H5), (Vα ), and (Rβ ) hold (b) HK(α, β) holds. Proof (a) ⇒ (b) The upper bound was proved in Lemma 6.7, while the lower bound follows from Lemma 6.8 and Theorem 5.24. (b) ⇒ (a) (H5) and (Vα ) hold by Lemmas 4.17 and 4.22. To prove (Rβ ) let x, y ∈ V, with r = d(x, y) ≥ 1. By Theorem 2.8 and Theorem 4.26(b), Reff (x, B(x, r − 1)c ) = g B(x,r −1) (x, x) r β−α . So Reff (x, y) ≥ Reff (x, B(x, r − 1)c ) ≥ cr β−α . Now let B ∗ = B(x, 2r ). By Theorem 4.26(b) again we have g B ∗ (x, y) β−α r , and g B ∗ (y, y) = Reff (y, (B ∗ )c ) r β−α . So by Lemma 2.62 0 0 for all edges e. Thus Reff (x, y) ≤ c whenever x ∼ y, and so Reff (x, B(x, r − 1)c ) ≤ cr . If β > α we have from Theorem 6.9 that Reff (x, B(x, r − 1)c ) ≥ cr β−α , and so β − α ≤ 1. Using the estimates (2.52) and (2.53) we have Corollary 6.11 The Sierpinski gasket graph SG satisfies HK(α, β) with α = log 3/ log 2 and β = log 5/ log 2. Remarks 6.12 (1) The strongly recurrent graphs considered here also satisfy a Poincaré inequality, with the power R β rather than R 2 – see Proposition 6.26 in Section 6.3. (2) If (α, β) satisfy the conditions of Corollary 6.10 then there exists a graph which satisfies HK(α, β) – see [Ba1]. (3) The techniques used in this chapter were introduced in [BCK], and handle the case α < β, which is in some sense ‘low dimensional’. As we have seen the proofs are fairly quick. In Section 6.2 we will see how quite different methods, based on Poincaré inequalities, enable us to prove HK(α, 2), that is, the case when β = 2. The final case is of ‘anomalous diffusion’ (i.e. β > 2) when α ≥ β; this turns out to be much harder than the cases already described, and will not be dealt with in this book. (For characterisations of HK(α, β) in this case, see [GT1, GT2, BB2].) However, the most interesting applications of anomalous diffusion are in critical models in statistical physics, such as the incipient infinite percolation cluster, and it turns out that for many models one has α < β. 
(Because of random fluctuations the conditions (Vα ), (Rβ ) only hold with high probability, but the methods described in this chapter can be modified to handle this.) For examples of random graphs of this kind see [BK, BJKS, BM, KN].

6.2 Gaussian Upper Bounds We begin by giving (stable) conditions for Gaussian upper bounds for graphs satisfying (Vα ) and (Nα ). The main work is to prove the condition (E 2 ), and for this we will use the method of Nash [N] as improved by Bass [Bas]. The first step is an elementary but hard real variable lemma.

156

Heat Kernel Bounds

Lemma 6.13 (See [N], p. 938) Let α > 0 and m(t), q(t) be real functions defined on [1, ∞) with m(1) ≤ 1, and suppose that for t ≥ 1 q(t) ≥ c1 + 12 α log t,

(6.7)

,

(6.8)

c32 q (t).

(6.9)

1 + m(t) ≥ c2 e

2

m (t) ≤

q(t)/α

Then c4 t 1/2 ≤ 1 + m(t) ≤ c5 t 1/2

for t ≥ 1.

Proof We first note that if g(t) ≥ 0 then using the inequality (u + v)1/2 ≤ u/2v 1/2 + v 1/2 , which holds for real positive u, v, we have t t 1/2 g (s)/2(b/s)1/2 + (b/s)1/2 ds (g (s) + b/s) ds ≤ 1 1 t 1 = 1/2 g (s)s 1/2 ds + 2b1/2 (t 1/2 − 1) 2b 1 t g(t)t 1/2 − g(1) 1 = − 1/2 g(s) 12 s −1/2 ds + 2b1/2 (t 1/2 − 1) 2b1/2 2b 1 ≤ t 1/2 2b1/2 + g(t)/2b1/2 . We used integration by parts to obtain the third line. Now set r (t) = α −1 (q(t) − c1 − 12 α log t), so that r (t) ≥ 0 for t ≥ 1. Then c2 ec1 /α t 1/2 er (t) = c2 eq(t)/α ≤ 1 + m(t) t q (s)1/2 ds ≤ 1 + m(1) + c3 1 t 1 1/2 = 1 + m(1) + c3 α 1/2 r (s) + ds 2s 1 ≤ 2 + ct 1/2 (1 + r (t)). Thus r (t) is bounded, and the bounds on m follow. Fix x0 ∈ V. Following [N] define for t ≥ 1 d(x0 , y)qt (x0 , y)μ y , M(t) = Ex0 d(x0 , Yt ) = Q(t) = −

y

qt (x0 , y) log qt (x0 , y)μ y .

y

We now verify that Q and M satisfy (6.7)–(6.9).

6.2 Gaussian Upper Bounds

157

Lemma 6.14 Let (, μ) satisfy (Vα ) and (Nα ). Then M(t) ≤ t for all t and, in particular, M(1) ≤ 1. Further, there exist constants ci such that Q(t) ≥ c1 + 12 d log t, 1 + M(t) ≥ c2 e Q(t)/d ,

t ≥ 1,

(6.10)

t ≥ 1.

(6.11)

Proof The representation (5.1) gives that d(x0 , Yt ) ≤ Nt , and therefore M(t) ≤ ENt = t. The bound (6.10) follows directly from the upper bound (5.18). The proof of (6.11) is similar to that in [N, Bas]. Let a > 0, set D0 = B(x0 , 1), and Dn = B(x 0 , 2n ) − B(x0 , 2n−1 ) for n ≥ 1. Then using (Vα ) to bound μ(Dn ) we have y∈V

e−ad(x0 ,y) μ y ≤ c +

∞

exp(−a2n−1 )μ y ≤ c +

n=1 y∈Dn

∞

CU 2nα exp(−a2n−1 ).

n=1

By Lemma A.36 this sum is bounded by ca −α if a ∈ (0, 1]. If λ > 0 then u(log u + λ) ≥ −e−1−λ for u > 0. So, setting λ = ad(x0 , y) + b, where a ≤ 1 and b ≥ 0, −Q(t) + a M(t) + b = qt (x0 , y) log qt (x0 , y) + ad(x0 , y) + b μ y y

≥−

e−1−ad(x0 ,y)−b μ y

y

= −e

−1−b

e−ad(x0 ,y) μ y ≥ −c5 e−b a −α .

y

Setting a = 1/(1 + M(t)) and eb = (1 + M(t))α = a −α we obtain −Q(t) +

M(t) + α log(1 + M(t)) ≥ −c5 . 1 + M(t)

Since M/(1 + M) < 1, rearranging gives (6.11). If f (λ) = (1 + λ) log λ − λ then f is increasing on [1, ∞). So, writing λ = u/v ≥ 1, we have (1 + u/v) log(u/v) − u/v ≥ f (1) = −1, and rearranging gives (u − v)2 ≤ (u − v)(log u − log v), for u, v > 0. u+v

(6.12)

Lemma 6.15 Let (, μ) be a weighted graph, with subexponential volume growth: for each λ > 0, x ∈ V, lim V (x, R)e−λR = 0.

R→∞

(6.13)

158

Heat Kernel Bounds

Then Q (t) ≥ M (t)2 ,

t > 0.

Proof Set f t (x) = qt (x0 , x), and let bt (x, y) = f t (x) + f t (y). Neglecting for the moment the need to justify the interchange of limits, we have M (t) =

y

= − 12

x

≤

1 2

1 2

(6.14)

μx y (d(x0 , y) − d(x 0 , x))( f t (y) − f t (x))

(6.15)

y

x

≤

∂ f t (y) d(x0 , y) f t (y)μ y μy = ∂t y

d(x0 , y)

μx y | f t (y) − f t (x)|

y

x

= 2−1/2

μx y bt (x, y)

y

1/2 x

x

y

μx y

y

μx y

( f t (y) − f t (x))2 1/2 bt (x, y)

( f t (y) − f t (x))2 1/2 . f t (x) + f t (y)

Using (6.12) we deduce M (t)2 ≤ 12 μx y ( f t (y) − f t (x))(log f t (y) − log f t (x)). x

y

On the other hand (again using discrete Gauss–Green), Q (t) = − (1 + log f t (y)) f t (y)μ y

(6.16)

y

=

1 2

x

μx y (log f t (y) − log f t (x))( f t (y) − f t (x))

y

≥ M (t)2 .

(6.17)

It remains to justify the interchanges of derivatives and sums in calculating M and Q , and the use of discrete Gauss–Green to obtain (6.15) and (6.17) from (6.14) and (6.16). Let Q n (t), Mn (t) be the approximations to Q and M obtained by summing only over y ∈ B(x 0 , n). By Theorem 5.17 if R = d(x 0 , y) > 3t then c1 exp(−c2 R(1 + log(R/t)) ≤ f t (y) ≤ c3 exp(−c4 R). So, using the subexponential volume growth, it follows that, for any T ≥ 1, Mn and Q n converge uniformly for t ∈ [1, T ] to the right sides of (6.14) and

6.2 Gaussian Upper Bounds

159

(6.16), and Mn and Q n converge uniformly to M and Q. We also have that the sums d(x 0 , x)| f t (x)− f t (y)|μx y , (1+| log f t (y)|)| f t (x)− f t (y)|μx y x

y

x

y

converge absolutely, so we can use discrete Gauss–Green to obtain (6.15) and (6.17). Thus Q and M satisfy the hypotheses of Lemma 6.13, and we obtain Proposition 6.16 Then

Suppose that (, μ) satisfies (Vα ) and (Nα ). Let x 0 ∈ V. M(t) ≤ c1 t 1/2 ,

t ≥ 1.

Corollary 6.17 Let (, μ) satisfy (Vα ) and (Nα ). Then the condition (E 2 ) holds for (, μ). Proof Let x ∈ V, r ≥ 1 and write τ = τ (x, r ). The upper bound E y τ ≤ cr 2 follows from the upper bound qt (x, y) ≤ ct −α/2 and Lemma 4.20. By the triangle inequality, d(x, Yt∧τ ) ≤ d(x, Y2t ) + d(Yt∧τ , Y2t ). Now choose t so that 23/2 c1 t 1/2 = 12 r . Then r Px (τ ≤ t) ≤ Ex d(x, Yt∧τ ) ≤ Ex d(x, Y2t ) + Ex d(Yt∧τ , Y2t ) ≤ c1 (2t)1/2 + Ex EYt∧τ d(Y0 , Y2t−t∧τ ) ≤ c1 (2t)1/2 +

sup

z∈B(x,r ),s≤t

Ez d(z, Y2t−s )

≤ 2c1 (2t)1/2 = 12 r. Thus Ex τ (x, r ) ≥ tPx (τ ≥ t) ≥ 12 t = c r 2 , which gives the lower bound. Theorem 6.18 Let (, μ) satisfy (Vα ). The following are equivalent. (a) (, μ) satisfies (Nα ). (b) UHK(α, 2) holds. (c) UHKC(α, 2) holds. Proof Assume that (Nα ) holds. By Corollary 6.17 (, μ) satisfies (E 2 ), and hence by Theorem 5.23 it satisfies UHK(α, 2) and UHKC(α, 2). The implications (b) ⇒ (a) and (c) ⇒ (a) are immediate from Theorem 5.14.

160

Heat Kernel Bounds

6.3 Poincaré Inequality and Gaussian Lower Bounds In this section we will assume that (, μ) satisfies the (weak) PI, and we will write λ P for the constant by which we have to multiply the size of the balls. The main result of this section is that for graphs which satisfy (Vα ) and (H5) the PI leads to the Gaussian bounds HK(α, 2). Theorem 6.19

Let (, μ) satisfy (H5). The following are equivalent.

(a) (, μ) satisfies (Vα ) and the (weak) PI. (b) The continuous time heat kernel qt (x, y) satisfies HKC(α, 2). (c) The discrete time heat kernel pn (x, y) satisfies HK(α, 2). In consequence, the property HK(α, 2) is stable under rough isometries. We begin by proving that the Poincaré inequality can be used to obtain ondiagonal upper bound. Proposition 6.20

Let (, μ) satisfy (Vα ) and the PI. Then qt (x, y) ≤ c2 t −α/2 ,

x, y ∈ V, t ≥ 0.

Proof  As in Theorem 5.14 it is enough to prove this for x = y. Fix x₀ ∈ V, let f_t(y) = q_t(x₀, y), and let ψ(t) = ⟨f_t, f_t⟩ = q_{2t}(x₀, x₀). Choose r ≥ 2, and let B_i = B(x_i, r) be a covering of V given by Lemma 6.2. Set M = (2λ_P + 1)^α C_U/C_L; we have that any x ∈ V is in at most M of the balls B_i* = B(x_i, λ_P r). It follows that any edge {x, y} is contained in at most M of these balls. By (5.13)

  −(1/2)ψ′(t) = E(f_t, f_t) ≥ M^{−1} Σ_{i=1}^∞ (1/2) Σ_{x∈B_i*} Σ_{y∈B_i*} (f_t(x) − f_t(y))² μ_{xy}.

Set

  f̄_{t,i} = μ(B_i*)^{−1} Σ_{x∈B_i*} f_t(x) μ_x.

Then using the PI in the balls B_i ⊂ B_i* one obtains

  (1/2) Σ_{x∈B_i*} Σ_{y∈B_i*} (f_t(x) − f_t(y))² μ_{xy} ≥ (C_P r²)^{−1} Σ_{x∈B_i} (f_t(x) − f̄_{t,i})² μ_x
    ≥ (C_P r²)^{−1} [ Σ_{x∈B_i} f_t(x)² μ_x − μ(B_i)^{−1} ( Σ_{x∈B_i} f_t(x) μ_x )² ]
    ≥ (C_P r²)^{−1} [ Σ_{x∈B_i} f_t(x)² μ_x − C_L^{−1} r^{−α} ( Σ_{x∈B_i} f_t(x) μ_x )² ].

Now

  Σ_{i=1}^∞ ( Σ_{x∈B_i} f_t(x) μ_x )² ≤ ( Σ_{i=1}^∞ Σ_{x∈B_i} f_t(x) μ_x )² ≤ M²,

and thus

  −C_P M r² (1/2) ψ′(t) ≥ Σ_{i=1}^∞ Σ_{x∈B_i} f_t(x)² μ_x − C_L^{−1} r^{−α} M² ≥ ψ(t) − C_L^{−1} r^{−α} M².

So,

  −(1/2)ψ′(t) ≥ (M C_P r²)^{−1} (ψ(t) − M² C_L^{−1} r^{−α}).

Choosing r = r(t) such that ψ(t) = 2M² C_L^{−1} r^{−α} gives ψ′(t) ≤ −c ψ(t)^{1+2/α}, and the rest of the argument is as in Remark 5.15(1).

Combining Proposition 6.20 with Theorem 5.14 we deduce:

Corollary 6.21  (Vα) and the PI imply (Nα).
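To make the final step above self-contained, here is a sketch (ours, with the constants left generic, following the argument cited as Remark 5.15(1)) of how the differential inequality integrates to the on-diagonal decay:

```latex
% From \psi'(t) \le -c\,\psi(t)^{1+2/\alpha} with \psi > 0:
\frac{d}{dt}\,\psi(t)^{-2/\alpha}
   = -\frac{2}{\alpha}\,\psi(t)^{-1-2/\alpha}\,\psi'(t)
   \ge \frac{2c}{\alpha},
\qquad\text{so integrating over }[0,t]:\qquad
\psi(t)^{-2/\alpha} \ge \frac{2c}{\alpha}\,t .
% Hence \psi(t) = q_{2t}(x_0,x_0) \le (2ct/\alpha)^{-\alpha/2} = c'\,t^{-\alpha/2},
% which is the bound asserted in Proposition 6.20.
```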

It is also possible to give a direct proof – see [SC1]. We now turn to lower bounds, and will use a weighted Poincaré inequality and the technique of [FS] to obtain the near-diagonal lower bound (5.26). We begin by stating the weighted Poincaré inequality. Let x₀ ∈ V, R ≥ 1, B = B(x₀, R), and define

  ϕ(y) = ϕ_B(y) = 1 ∧ ( (R + 1 − d(x₀, y))² / R² ) if y ∈ B, and ϕ(y) = 0 if y ∈ V − B.   (6.18)

Note that ϕ has support B, that R^{−2} ≤ ϕ(y) ≤ 1 on B, that ϕ(y) = R^{−2} if y ∈ ∂_iB, and that if x ∼ y with x, y ∈ B then ϕ(y) ≤ 4ϕ(x) and |ϕ(x) − ϕ(y)| ≤ R^{−2}. The following weighted Poincaré inequality is proved in Appendix A.9.


Proposition 6.22  Let (Γ, μ) satisfy (H5), (Vα), and the PI. Then there exists a constant C_W such that for each ball B = B(x₀, R) and f : B → R,

  inf_a Σ_{x∈B} (f(x) − a)² ϕ_B(x) μ_x ≤ (1/2) C_W R² Σ_{x,y∈B} (f(x) − f(y))² (ϕ_B(x) ∧ ϕ_B(y)) μ_{xy}.   (6.19)

We will need the following elementary inequality.

Lemma 6.23 (See [SZ])  Let a, b, c, d be positive reals. Then

  −(d/b − c/a)(b − a) ≥ (1/2)(c ∧ d)(log b − log a)² − (d − c)²/(2(c ∧ d)).

Proof  First note that the inequality is preserved by the transformations (a, b, c, d) → (b, a, d, c) and (a, b, c, d) → (sa, sb, tc, td), where s, t > 0. Using the first, we can assume that d ≥ c, and by the second we can take a = c = 1. So, writing x = b, the inequality reduces to proving

  f(x) := (1/2)(log x)² + (d/x − 1)(x − 1) ≤ (1/2)(d − 1)²,   (6.20)

for x > 0, d ≥ 1. Since f(x) − f(x^{−1}) = (d − 1)(x − x^{−1}), it is enough to prove (6.20) for x ≥ 1. Now, by Cauchy–Schwarz,

  (log x)² = ( ∫₁^x 1·t^{−1} dt )² ≤ ( ∫₁^x dt )( ∫₁^x t^{−2} dt ) = (x − 1)(1 − x^{−1}) = (x − 1)²/x.

Hence it is enough to prove that

  (x − 1)²/x ≤ (d − 1)² + 2(x − 1)(1 − d/x).

Finally,

  2(x − 1)(1 − d/x) = 2(x − 1)²/x − 2 ((x − 1)/x)(d − 1) ≥ (x − 1)²/x − (d − 1)²,

where we used −2AB ≥ −A² − B² with A = (x − 1)/√x and B = (d − 1)/√x in the final step.

We now prove the near-diagonal lower bound for Y, using an argument of Fabes and Stroock [FS]. Recall from Chapter 5 the notation q_t^B(x, y) for the transition density of Y killed on exiting B.
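Since Lemma 6.23 is used repeatedly below, here is a quick numerical sanity check of it (our own sketch, not part of the text): we test the inequality on many random positive quadruples (a, b, c, d).

```python
import math
import random

def lemma_lhs(a, b, c, d):
    # left-hand side of Lemma 6.23: -(d/b - c/a)(b - a)
    return -(d / b - c / a) * (b - a)

def lemma_rhs(a, b, c, d):
    # right-hand side: (1/2)(c ∧ d)(log b - log a)^2 - (d - c)^2 / (2(c ∧ d))
    m = min(c, d)
    return 0.5 * m * (math.log(b) - math.log(a)) ** 2 - (d - c) ** 2 / (2 * m)

random.seed(1)
for _ in range(10000):
    a, b, c, d = (random.uniform(0.01, 10.0) for _ in range(4))
    # allow a tiny tolerance for floating-point rounding
    assert lemma_lhs(a, b, c, d) >= lemma_rhs(a, b, c, d) - 1e-9
```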


Theorem 6.24  Suppose Γ satisfies (H5), (Vα), and the weak PI. Then for any R ≥ 1 and x₀ ∈ V,

  q_t^{B(x₀,R)}(x₁, x₂) ≥ c₁ t^{−α/2}  if x₁, x₂ ∈ B(x₀, R/2) and (1/4)R² ≤ t ≤ R².   (6.21)

In consequence, Γ satisfies HK(α, 2).

Proof  The bounds UHK(α, 2) hold by Theorem 6.18 and Corollary 6.21, and Lemma 4.21 then gives that (E₂) holds. Write T = R² and B = B(x₀, R). Let x₁ ∈ B(x₀, R/2). By Lemma 5.21

  Σ_{x∈B(x₀,2R/3)} q_t^B(x₁, x) μ_x = P^{x₁}(Y_t ∈ B(x₀, 2R/3), τ_B > t) ≥ P^{x₁}(τ(x₁, R/6) > t) ≥ 1 − c e^{−cR²/t} ≥ 1/2,   (6.22)

provided t ≤ c₂T, where c₂ is chosen small enough. We take c₂ ≤ 1/8.

Let ϕ be defined by (6.18), and let

  V₀ = Σ_{x∈B} ϕ(x) μ_x.

Then, as ϕ ≥ 1/4 on B(x₀, R/2),

  c₃ R^α ≤ V₀ ≤ μ(B) ≤ c₄ R^α.   (6.23)

Write u_t(x) = q_t^B(x₁, x), and let

  w(t, y) = w_t(y) = log(V₀ u(t, y)),   H(t) = H(x₁, t) = V₀^{−1} Σ_{y∈B} ϕ(y) w_t(y) μ_y.

Note that w_t(y) is well defined for y ∈ B and t > 0. It is easy to see by Jensen's inequality that H(t) < 0, but we will not need to use this fact. Set

  f_t(x) = ϕ(x)/u_t(x) if x ∈ B, and f_t(x) = 0 if x ∉ B.

Then

  V₀ H′(t) = Σ_{y∈B} ϕ(y) (∂w_t(y)/∂t) μ_y = Σ_{y∈B} (ϕ(y)/u_t(y)) (∂u_t(y)/∂t) μ_y = ⟨f, Δu_t⟩ = −E(f, u_t)
    = −(1/2) Σ_{x∈V} Σ_{y∈V} (f(x) − f(y))(u_t(x) − u_t(y)) μ_{xy}.


Now we use Lemma 6.23. If x, y ∈ B then f(x) > 0 and f(y) > 0, and

  −(f(x) − f(y))(u_t(x) − u_t(y)) = −( ϕ(x)/u_t(x) − ϕ(y)/u_t(y) )(u_t(x) − u_t(y))
    ≥ (1/2)(ϕ(x) ∧ ϕ(y))(log u_t(x) − log u_t(y))² − (ϕ(x) − ϕ(y))²/(2(ϕ(x) ∧ ϕ(y))).

If x, y ∈ B^c then f(x) = f(y) = 0, while if x ∈ B and y ∈ B^c then f(y) = u_t(y) = 0 and −(f(x) − f(y))(u_t(x) − u_t(y)) = −ϕ(x). We therefore have

  V₀ H′(t) ≥ (1/4) Σ_{x∈B} Σ_{y∈B} (ϕ(x) ∧ ϕ(y))(w_t(x) − w_t(y))² μ_{xy}   (6.24)
    − (1/4) Σ_{x∈B} Σ_{y∈B} ( (ϕ(x) − ϕ(y))² / (ϕ(x) ∧ ϕ(y)) ) μ_{xy}   (6.25)
    − (1/2) Σ_{x∈B} Σ_{y∈B^c} ϕ(x) μ_{xy}.   (6.26)

Call the sums in (6.24)–(6.26) S₁, S₂, and S₃, respectively. To bound S₂ let x ∼ y with x, y ∈ B, and assume that d(y, x₀) ≤ d(x, x₀). Writing k = R + 1 − d(x, x₀), we have R²ϕ(x) = k², and R²ϕ(y) is either k² or (k + 1)². Hence

  (ϕ(x) − ϕ(y))² / (ϕ(x) ∧ ϕ(y)) ≤ ((k + 1)² − k²)² / (k² R²) ≤ 9/R².

Thus

  S₂ = −(1/4) Σ_{x∈B} Σ_{y∈B} ( (ϕ(x) − ϕ(y))² / (ϕ(x) ∧ ϕ(y)) ) μ_{xy} ≥ −(9/4) R^{−2} Σ_{x∈B} Σ_{y∈B} μ_{xy} ≥ −(9/4) R^{−2} μ(B) ≥ −c R^{−2} V₀.

Also, if x ∈ ∂_iB then ϕ(x) = R^{−2}, so that

  S₃ = −(1/2) Σ_{x∈∂_iB} Σ_{y∈B^c} μ_{xy} R^{−2} ≥ −R^{−2} Σ_{x∈B} μ_x ≥ −c R^{−2} V₀.

Thus the terms S₂ and S₃ are controlled by bounds of the same size, and so, using the weighted Poincaré inequality Proposition 6.22 to bound S₁, and recalling that T = R², we obtain

  V₀ H′(t) ≥ −c₅ T^{−1} V₀ + c₅ T^{−1} Σ_{x∈B} (w_t(x) − H(t))² ϕ(x) μ_x.   (6.27)


It follows immediately from (6.27) that t ↦ H(t) + c₅t/T is increasing. We now prove that if H(t) is small enough then the final term in (6.27) is greater than cH(t)². Let I = [(1/2)c₂T, c₂T]. By Theorem 6.18 we have V₀u_t(x) ≤ c₆ for t ∈ I, x ∈ B. Write B′ = B(x₀, 2R/3), choose c₇ < 1 so that c₆c₇μ(B′) ≤ (1/4)V₀, and let A = {y ∈ B′ : V₀u_t(y) ≥ c₆c₇}. Then by (6.22)

  1/2 ≤ Σ_{y∈B′∩A} u_t(y)μ_y + Σ_{y∈B′−A} u_t(y)μ_y
    ≤ c₆V₀^{−1}μ(A) + c₆c₇V₀^{−1}(μ(B′) − μ(A)) ≤ c₆V₀^{−1}μ(A) + c₆c₇V₀^{−1}μ(B′) ≤ c₆V₀^{−1}μ(A) + 1/4,

so that V₀^{−1}μ(A) ≥ 1/(4c₆). For x ∈ A and t ∈ I we have c₆c₇ ≤ V₀u_t(x) ≤ c₆, so there exists c₈ such that |w_t(x)| ≤ c₈ for x ∈ A, t ∈ I.

Now let t₁ = inf{t : H(t) ≥ −2c₈}. If t₁ < c₂T then, as H(t) + c₅t/T is increasing,

  H(c₂T) = H(t₁) + ∫_{t₁}^{c₂T} H′(s) ds ≥ −2c₈ − c₅(c₂T − t₁)/T ≥ −(2c₈ + c₅c₂).

If t₁ > c₂T then for x ∈ A, t ∈ I,

  w_t(x) − H(t) ≥ −c₈ + |H(t)| ≥ (1/2)|H(t)|,

since (1/2)|H(t)| ≥ c₈ for t ∈ I. So, for t ∈ I,

  T H′(t) ≥ −c₅ + c₅ V₀^{−1} Σ_{x∈A} (w_t(x) − H(t))² ϕ(x)μ_x
    ≥ −c₅ + (1/4)c₅ V₀^{−1} Σ_{x∈A} (w_t(x) − H(t))² μ_x
    ≥ −c₅ + (1/16)c₅ μ(A)V₀^{−1} H(t)²
    ≥ −c₅ + (1/128)c₅c₆^{−1}(2c₈)² + (1/128)c₅c₆^{−1}H(t)².   (6.28)

We can assume that c₈ is large enough so that the second term in (6.28) is greater than c₅, so that there exists c₉ such that

  T H′(t) ≥ c₉ H(t)²  for t ∈ I.   (6.29)

So, since H((1/2)c₂T) < −2c₈ < 0,

  −1/H(c₂T) ≥ −1/H(c₂T) + 1/H((1/2)c₂T) = ∫_{(1/2)c₂T}^{c₂T} H′(s)H(s)^{−2} ds ≥ (1/2)c₂c₉,

and hence H(c₂T) ≥ −2/(c₂c₉).


Thus in both cases we deduce that H(c₂T) ≥ −c₁₀. Using again the fact that H(t) + c₅T^{−1}t is increasing, it follows that for x₁ ∈ B(x₀, R/2)

  H(x₁, t) ≥ −c₁₁,   (1/8)T ≤ t ≤ (1/2)T.   (6.30)

To turn this into a pointwise bound, note first that

  V₀ q_t^B(x₁, x₂) ≥ V₀^{−1} Σ_{y∈B} (V₀ q_{t/2}^B(x₁, y))(V₀ q_{t/2}^B(x₂, y)) ϕ(y)μ_y.

So, using Jensen's inequality (V₀^{−1}ϕ(y)μ_y is a probability measure) and (6.30), for x₁, x₂ ∈ B(x₀, R/2), t ∈ [(1/4)T, T],

  log(V₀ q_t^B(x₁, x₂)) ≥ V₀^{−1} Σ_{y∈B} [ log(V₀ q_{t/2}^B(x₁, y)) + log(V₀ q_{t/2}^B(x₂, y)) ] ϕ(y)μ_y
    = H(x₁, t/2) + H(x₂, t/2) ≥ −2c₁₁.

By (6.23) V₀ ≍ R^α, and we obtain the near-diagonal lower bound for q_t^B. The Gaussian bound HK(α, 2) then follows immediately by Theorem 5.25.

Remarks 6.25  (1) It is not clear how to run this argument in discrete time. However, if Γ satisfies the hypotheses of Theorem 6.24 then the discrete time NDLB follows from Theorem 5.24.
(2) In Theorem 6.24 the hypothesis (H5) is not used in the proof of the continuous time near-diagonal lower bound (6.21) from the weighted PI. However, it is used to obtain HK(α, 2) from (6.21), and it is also used in the proof of the weighted PI from the weak PI. With some more work one can use the PI to remove the need for (H5) in the continuous time case – see [BC].

We now see how these Gaussian bounds, or more generally HK(α, β), lead to a Poincaré inequality.

Proposition 6.26  Suppose that HK(α, β) holds. Let B = B(x₀, R) and B′ = B(x₀, R/2). Then if f : B → R,

  Σ_{x∈B′} (f(x) − f_{B′})² μ_x ≤ c₁ R^β E_B(f, f).

Proof  Write B = (B, E_B, μ^B) for the weighted subgraph induced by B, with the weights μ_{xy} restricted to B. Let Ỹ be the continuous time random walk on B, and write q̃_t(x, y) for the (continuous time) heat kernel associated with Ỹ. Then, using Proposition 5.26, if T = R^β,

  q̃_T(x, y) ≥ q_T^B(x, y) ≥ c₂ T^{−α/β},  for x, y ∈ B′.   (6.31)

Write Q̃_t for the semigroup of Ỹ, given by

  Q̃_t f(x) = Σ_{y∈B} q̃_t(x, y) f(y) μ_y^B.

Now let f : B → R. For x ∈ B′, set a_x = Q̃_T f(x), and let h^{(x)} = Q̃_T (f − a_x)². Then

  h^{(x)}(y) = (Q̃_T f²)(y) − 2a_x Q̃_T f(y) + a_x²,

and therefore h^{(x)}(x) = (Q̃_T f²)(x) − a_x² = (Q̃_T f²)(x) − (Q̃_T f(x))². So, writing ⟨·,·⟩_B for the inner product in L²(μ^B),

  Σ_x h^{(x)}(x) μ_x^B = Σ_x (Q̃_T f²)(x) μ_x^B − Σ_x (Q̃_T f(x))² μ_x^B
    = ⟨Q̃_T f², 1⟩_B − ⟨Q̃_T f, Q̃_T f⟩_B = ⟨f, f⟩_B − ⟨Q̃_T f, Q̃_T f⟩_B ≤ 2T E_B(f, f).

Here we used Corollary 5.6 in the final line. On the other hand, using the lower bound (6.31), for x ∈ B′,

  h^{(x)}(x) = Σ_{y∈B} (f(y) − a_x)² q̃_T(x, y) μ_y^B ≥ cT^{−α/β} Σ_{y∈B′} (f(y) − a_x)² μ_y^B ≥ cR^{−α} Σ_{y∈B′} (f(y) − f_{B′})² μ_y^B,

since the mean f_{B′} minimises a ↦ Σ_{y∈B′}(f(y) − a)²μ_y^B. Combining the bounds above,

  Σ_{y∈B′} (f(y) − f_{B′})² μ_y^B ≤ c R^α min_{x∈B′} h^{(x)}(x) ≤ c Σ_{x∈B} h^{(x)}(x) μ_x ≤ c R^β E_B(f, f),

which proves the weak Poincaré inequality with λ_P = 2.

Proof of Theorem 6.19  The equivalence between (b) and (c) is proved in Theorem 5.24. The implication (a) ⇒ (b) was proved in Theorem 6.24, and (b) ⇒ (a) follows by Lemma 4.22 and Proposition 6.26. By Proposition 3.33 the weak Poincaré inequality is stable under rough isometries. Since (Vα) is also stable under rough isometries, the equivalence of (a) and (b) implies that HK(α, 2) is stable under rough isometries.

Remark 6.27  Note that all that was needed to prove the Poincaré inequality in Proposition 6.26 was the near-diagonal lower bound (6.31) for the killed


heat kernel, together with (Vα). Suppose that (Γ, μ) satisfies (H5) and (Vα). If (6.31) holds (with β = 2), then the PI holds, and so, by Theorem 6.19, HK(α, 2) holds.

For convenience we summarise the heat kernel bounds which one has for Z^d and for graphs which are roughly isometric to Z^d.

Theorem 6.28  Let (Γ, μ) be connected, locally finite, satisfy the condition (H5) of controlled weights:

  μ_{xy}/μ_x ≥ 1/C₂  whenever x ∼ y,   (6.32)

and be roughly isometric to Z^d. Let p_n(x, y) and q_t(x, y) be the discrete and continuous time heat kernels on Γ. There exist constants c_i = c_i(d) such that the following estimates hold.

(a) For x, y ∈ V, n ≥ d(x, y) ∨ 1,
  p_n(x, y) ≤ c₁ n^{−d/2} exp(−c₂ d(x, y)²/n),
  p_n(x, y) + p_{n+1}(x, y) ≥ c₃ n^{−d/2} exp(−c₄ d(x, y)²/n).
(b) If t ≥ 1 ∨ d(x, y) then
  c₃ t^{−d/2} exp(−c₄ d(x, y)²/t) ≤ q_t(x, y) ≤ c₁ t^{−d/2} exp(−c₂ d(x, y)²/t).
(c) If d(x, y) ≥ 1 and t ≤ d(x, y) then, writing R = d(x, y),
  c₃ exp(−c₄ R(1 + log(R/t))) ≤ q_t(x, y) ≤ c₁ exp(−c₂ R(1 + log(R/t))).
(d) μ_x^{−1}(1 − e^{−t}) ≤ q_t(x, x) ≤ μ_x^{−1} for t ∈ (0, 1).

Proof  It is clear that Γ satisfies (V_d) and (H5). Z^d satisfies the (weak) PI by Corollary 3.30, so by the stability of the PI (Proposition 3.33), Γ satisfies the PI. Theorem 6.19 then gives that HK(d, 2) holds in both discrete and continuous time, proving (a) and (b). The bound (d) is elementary, and (c) follows from Theorem 5.17 and Remark 5.18(2).
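The bounds in Theorem 6.28 can be seen concretely for the simple random walk on Z (d = 1), where p_n(0, x) is binomial and can be computed exactly. The sketch below is ours; the constants 2, 1/2 and the Hoeffding-type tail bound are deliberately generous choices for illustration, not the optimal c_i of the theorem.

```python
import math

def p_n(n, x):
    """Transition probability P^0(X_n = x) for simple random walk on Z."""
    if (n + x) % 2 != 0 or abs(x) > n:
        return 0.0
    return math.comb(n, (n + x) // 2) / 2 ** n

# On-diagonal bounds as in Theorem 6.28(a) with d = 1 (generous constants):
#   0.5 n^{-1/2} <= p_n(0,0) + p_{n+1}(0,0) <= 2 n^{-1/2}
for n in range(1, 400):
    s = p_n(n, 0) + p_n(n + 1, 0)
    assert 0.5 * n ** -0.5 <= s <= 2.0 * n ** -0.5

# Gaussian upper tail (via Hoeffding's inequality): P(X_n = x) <= exp(-x^2/(2n))
for n in range(1, 100):
    for x in range(0, n + 1):
        assert p_n(n, x) <= math.exp(-x * x / (2 * n)) + 1e-12
```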

6.4 Remarks on Gaussian Bounds

We have seen that, starting with an isoperimetric inequality, it is fairly straightforward to obtain on-diagonal upper bounds, but that we had to work much harder in order to obtain full Gaussian upper bounds. There are at least seven methods in the literature for obtaining Gaussian upper bounds:


(1) The approach given in Section 6.3, using differential inequalities, originally due to Nash and improved by Bass [Bas].
(2) An analytic interpolation argument due to Hebisch and Saloff-Coste [HSC].
(3) Davies' method, using a perturbation of the semigroup Q_t – see [CKS].
(4) A 'two-point' method due to Grigor'yan – see [CGZ, Fo1, Ch] for the graph case. (This approach works well for random graphs, where global information may not be available.)
(5) A proof from a parabolic Harnack inequality, as was done by Aronson in [Ar] for divergence form diffusions. See [D1] for such arguments in the graph context. (The parabolic Harnack inequality can be proved by Moser's iteration argument [Mo1, Mo2, Mo3].)
(6) Combining the Carne–Varopoulos bound and its extensions with a mean value inequality, as in [CG].
(7) For manifolds, there is a complex interpolation method due to Coulhon and Sikora [CS]. However, the method relies on having a full Gaussian tail and does not seem to work in the graph context.

We now state without proof two additional theorems on Gaussian bounds. The first, due to Hebisch and Saloff-Coste, gives that on-diagonal upper bounds (or equivalently a Nash inequality) imply Gaussian upper bounds, with no additional geometric hypotheses.

Theorem 6.29 (See [HSC])  Let (Γ, μ) be a weighted graph and α ≥ 1. The following are equivalent.

(a) p_n(x, x) ≤ cn^{−α/2}, n ≥ 1.
(b) The Gaussian upper bounds UHK(α, 2) hold.

Theorem 6.34 below gives a characterisation of Gaussian upper bounds and the parabolic Harnack inequality.

Definition 6.30  We say (Γ, μ) satisfies the volume doubling (VD) condition if there exists a constant C such that

  V(x, 2r) ≤ C V(x, r)  for all x ∈ V, r ≥ 1.

Remarks  (1) If (Vα) holds with constants C_L and C_U then, for r ≥ 1,

  V(x, 2r) ≤ C_U 2^α r^α = (C_U 2^α/C_L) C_L r^α ≤ (C_U 2^α/C_L) V(x, r),

and thus VD holds.
(2) Z^d satisfies (V_d) and hence volume doubling.
(3) If (Γ, μ) satisfies VD then, iterating, one deduces that V(x, 2^k) ≤ C^k V(x, 1), from which it follows that V(x, r) ≤ cr^α V(x, 1); that is, (Γ, μ) has polynomial volume growth.
(4) The binary tree, which has exponential volume growth, therefore does not satisfy VD.
(5) Let Γ_i, i = 1, 2, be graphs satisfying (V_{α_i}), with α₁ ≠ α₂. Then the join of Γ₁ and Γ₂ does not satisfy VD.

Theorem 6.31  Let (Γ, μ) satisfy (H5).

(a) VD and PI are weight stable.
(b) VD is stable under rough isometries.
(c) The (weak) PI is stable under rough isometries.
(d) VD and the (weak) PI imply the strong PI.

Proof  (a) and (b) are easy, and (c) was proved in Proposition 3.33. For (d) see Appendix A.9 and Corollary A.51.

Remark 6.32  On its own it is not clear that the strong PI is stable under rough isometries: a rough isometry ϕ need not map balls into balls. However, using Theorem 6.31(b), (c), and (d), one deduces that the combined condition '(VD) + (strong PI)' is stable under rough isometries.

Definition 6.33  We say (Γ, μ) satisfies the parabolic Harnack inequality (PHI(β)) if, writing T = R^β, whenever u(t, x) ≥ 0 is defined on the cylinder Q = [0, 4T] × B(y, 2R) and satisfies

  ∂u(t, x)/∂t = Δu(t, x),   (t, x) ∈ [0, 4T] × B(y, 2R),   (6.33)

then, with Q₋ = [T, 2T] × B(y, R) and Q₊ = [3T, 4T] × B(y, R),

  sup_{(t,x)∈Q₋} u(t, x) ≤ c₁ inf_{(t,x)∈Q₊} u(t, x).   (6.34)


A function which satisfies (6.33) is called caloric. If h(x) is harmonic then u(t, x) = h(x) is caloric, so that PHI(β) immediately implies the EHI. Somewhat surprisingly, the standard parabolic Harnack inequality (with β = 2) is easier to characterise than the EHI; this was done by Grigor'yan [Gg1] and Saloff-Coste [SC1] for manifolds. Here is the graph version.

Theorem 6.34 (See [D1])  Let (Γ, μ) be a weighted graph satisfying (H5). The following are equivalent.

(a) (Γ, μ) satisfies VD and PI.
(b) There exist constants C₁, C₂ such that for t ≥ d(x, y) ∨ 1,

  (C₁/V(x, t^{1/2})) e^{−C₂d(x,y)²/t} ≤ q_t(x, y) ≤ (C₂/V(x, t^{1/2})) e^{−C₁d(x,y)²/t}.   (6.35)

(c) (Γ, μ) satisfies PHI(2).

7 Potential Theory and Harnack Inequalities

7.1 Introduction to Potential Theory

We will use Green's functions, so throughout this section we consider a domain D ⊂ V, with the assumption that D ≠ V if (Γ, μ) is recurrent.

Definition 7.1  Let A ⊂ D ⊂ V. We define the hitting measure and escape measure (or capacitary measure) by

  H_A(x, y) = P^x(X_{τ_A} = y; τ_A < ∞),
  e_{A,D}(x) = P^x(T_A⁺ ≥ τ_D) μ_x if x ∈ A, and e_{A,D}(x) = 0 otherwise.

We write e_A for e_{A,V}. If ν is a finite measure on V we define

  νG_D(y) = Σ_{x∈V} ν_x g_D(x, y).   (7.1)

Recall that we count the event {T_A⁺ ≥ τ_D} as occurring if T_A⁺ = τ_D = ∞. In this discrete context, the distinction between measures and functions is not as pronounced as in general spaces. However, it is still helpful to distinguish between the two; note that there is no term involving μ_x in (7.1). We adopt the convention that G acts to the right of measures and to the left of functions. If ν_x = f(x)μ_x then the symmetry of g_D gives that νG_D = G_D f.

Proposition 7.2 (Last exit decomposition)  Let A ⊂ D with A finite. Then

  P^x(T_A < τ_D) = Σ_{y∈A} g_D(x, y) P^y(T_A⁺ ≥ τ_D) μ_y = e_{A,D}G_D(x).   (7.2)

In particular, e_{A,D}G_D = 1 on A.


Proof  Note that both sides of (7.2) are zero if x ∉ D. Set σ = max{n ≥ 0 : X_n ∈ A, n < τ_D}. As A is finite, P^y(σ < ∞) = 1 for all y. Since A ⊂ D we have {X_j ∉ A, j = 1, …, τ_D} = {T_A⁺ ≥ τ_D}. So

  P^x(T_A < τ_D) = Σ_{n=0}^∞ Σ_{y∈A} P^x(σ = n, X_n = y, τ_D > n)
    = Σ_{n=0}^∞ Σ_{y∈A} P^x(X_n = y, τ_D > n, X_j ∉ A, j = n + 1, …, τ_D)
    = Σ_{n=0}^∞ Σ_{y∈A} P^x(X_n = y, τ_D > n) P^y(X_j ∉ A, j = 1, …, τ_D)
    = Σ_{y∈A} g_D(x, y) μ_y P^y(T_A⁺ ≥ τ_D).

Remark 7.3  This result can fail if P^·(σ = ∞) > 0. For example, consider Z with edge weights μ_{n,n+1} = λ^n for all n, where λ > 1. Let D = N and A = {k, k + 1, …}, where k ≥ 2. If x ≥ k then P^x(T_A < τ_D) = 1. If y ≥ k + 1 then P^y(T_A⁺ ≥ τ_D) = 0, so the right-hand side of (7.2) equals g_D(x, k)P^k(T_A⁺ ≥ τ_D)μ_k. Since g_D(x, k)/g_D(k, k) = P^x(T_k < ∞) = λ^{k−x} for x ≥ k, the right-hand side of (7.2) is not constant, and converges to zero as x → ∞, showing that (7.2) cannot hold.

Taking D = V we deduce:

Corollary 7.4  Let (Γ, μ) be transient and A be finite. Then

  P^x(T_A < ∞) = e_AG(x), and e_AG = 1 on A.

Lemma 7.5  Let A ⊂ V with A ≠ V. Then if x ∈ A, y ∈ ∂A,

  H_A(x, y) = Σ_{z∈∂_iA} g_A(x, z) p₁(z, y) μ_z μ_y.

Proof  We have

  P^x(X_{τ_A} = y) = Σ_{n=0}^∞ Σ_{z∈∂_iA} P^x(τ_A = n + 1, X_n = z, X_{n+1} = y)
    = Σ_{n=0}^∞ Σ_{z∈∂_iA} P^x(τ_A > n, X_n = z, X_{n+1} = y)
    = Σ_{n=0}^∞ Σ_{z∈∂_iA} P^x(τ_A > n, X_n = z) P^z(X_1 = y)
    = Σ_{z∈∂_iA} g_A(x, z) μ_z p₁(z, y) μ_y.

(This argument does not use the last exit decomposition.)

Proposition 7.6 ('Principle of domination')  Let A ⊂ D with A finite, and for i = 1, 2 let ν^{(i)} be measures with support A. Let f_i = ν^{(i)}G_D, and suppose that f₁ ≥ f₂ on A. Then f₁ ≥ f₂ on V, and ν^{(1)}(A) ≥ ν^{(2)}(A).

Proof  Let D_n ↑↑ D, with A ⊂ D₁, and write f_{i,n} = ν^{(i)}G_{D_n}. Then by Theorem 1.34 f_{i,n} converges to f_i pointwise. Let ε > 0. As A is finite there exists n such that (1 + ε)f_{1,n} ≥ f_{2,n} on A. Since f_{i,n} is harmonic in D_n − A and is zero on ∂D_n, the maximum principle implies that (1 + ε)f_{1,n} ≥ f_{2,n} on V. As ε is arbitrary, it follows that f₁ ≥ f₂.

By Proposition 7.2 we have e_{A,D}G_D = 1 on A. So for i = 1, 2,

  ν^{(i)}(A) = Σ_{x∈A} ν^{(i)}_x Σ_{y∈A} e_{A,D}(y) g_D(y, x) = Σ_{y∈A} e_{A,D}(y)(ν^{(i)}G_D(y)) = Σ_{y∈A} e_{A,D}(y) f_i(y).

Since f₁ ≥ f₂ on A it follows that ν^{(1)}(A) ≥ ν^{(2)}(A).

Definition 7.7  Let A ⊂ D, with A finite. Define the capacity of A by

  Cap_D(A) = Σ_{x∈A} e_{A,D}(x) = e_{A,D}(A).

We write Cap(A) for Cap_V(A).

Theorem 7.8  Let A ⊂ D, with A finite, and set h = e_{A,D}G_D. Then h = 1 on A, 0 < h ≤ 1 on V, and h is harmonic on D − A. Further,

  Cap_D(A) = sup{ν(A) : supp(ν) ⊂ A, νG_D ≤ 1 on D}   (7.3)
    = inf{ν(A) : supp(ν) ⊂ A, νG_D ≥ 1 on A}.

Proof  The properties of h are immediate from Proposition 7.2. To prove the final assertion let ν be a measure satisfying the conditions in (7.3).


Then νG_D ≤ e_{A,D}G_D on A, so by Proposition 7.6 ν(A) ≤ e_{A,D}(A) = Cap_D(A). Thus the supremum in (7.3) is attained by e_{A,D}. If ν′ is a measure on A and ν′G_D ≥ 1 on A then ν′G_D ≥ e_{A,D}G_D on A, so by Proposition 7.6 ν′(A) ≥ e_{A,D}(A).

We can also connect capacity with the variational problems of Chapter 2.

Proposition 7.9
(a) Suppose that Γ is transient. Let A be finite. Then h = e_AG is the unique minimiser of (VP1) with B₁ = A, and C_eff(A, ∞) = Cap(A).
(b) Suppose that D is finite, A ⊂ D. Then h = e_{A,D}G_D is the unique minimiser for (VP1) with B₁ = A, B₀ = V − D, and C_eff(A, V − D) = Cap_D(A).

Proof  In case (a) we set D = V. By Theorem 2.37 h is the unique minimiser for (VP1), and E(h, h) = C_eff(A, ∞). In case (b), since D is finite the minimiser is the unique solution to the Dirichlet problem Δh = 0 in D − A, h = 1 on A, and h = 0 on V − D. Since h also satisfies these conditions, h is the minimiser, and E(h, h) = C_eff(A, V − D).

In both cases h has support D, so, writing g_D^x = g_D(x, ·), by Theorem 1.34 we have E(g_D^x, h) = h(x) = 1 for x ∈ A. Thus

  E(h, h) = E( Σ_{x∈A} e_{A,D}(x) g_D^x, h ) = Σ_{x∈A} e_{A,D}(x) E(g_D^x, h) = e_{A,D}(A).
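Proposition 7.9(b) can be checked directly on the simplest example (our sketch, not from the text): the path 0 − 1 − ⋯ − N with unit weights, A = {0}, D = {0, …, N − 1}. Here R_eff(0, N) is N unit resistors in series, so Cap_D({0}) should equal 1/N. We compute e_{A,D}(0) = μ₀ P^0(T₀⁺ ≥ τ_D) by solving the harmonic equations with a tridiagonal (Thomas) solve.

```python
def escape_prob(N):
    """P^1(T_N < T_0) for SRW on the path 0..N, via the Thomas algorithm
    applied to h(x) = (h(x-1) + h(x+1))/2, h(0) = 0, h(N) = 1."""
    if N == 1:
        return 1.0
    n = N - 1                      # unknowns h(1), ..., h(N-1)
    a = [-1.0] * n                 # sub-diagonal
    b = [2.0] * n                  # diagonal
    c = [-1.0] * n                 # super-diagonal
    d = [0.0] * n
    d[-1] = 1.0                    # boundary value h(N) = 1
    for i in range(1, n):          # forward elimination
        w = a[i] / b[i - 1]
        b[i] -= w * c[i - 1]
        d[i] -= w * d[i - 1]
    h = [0.0] * n                  # back substitution
    h[-1] = d[-1] / b[-1]
    for i in range(n - 2, -1, -1):
        h[i] = (d[i] - c[i] * h[i + 1]) / b[i]
    return h[0]

def cap_path(N):
    # from 0 the walk must first step to 1, then reach N before returning to 0;
    # mu_0 = 1 since vertex 0 carries a single unit edge
    return 1.0 * escape_prob(N)

for N in range(1, 60):
    assert abs(cap_path(N) - 1.0 / N) < 1e-9   # Cap_D({0}) = C_eff = 1/R_eff
```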

We can now generalise the commute time identity, Theorem 2.63, to sets.

Theorem 7.10  Let Γ be finite and A₀, A₁ be disjoint subsets of V. Then there exist probability measures π_i on A_i such that

  E^{π₀}T_{A₁} + E^{π₁}T_{A₀} = μ(V) R_eff(A₀, A₁).   (7.4)

Further, π₀ = R_eff(A₀, A₁) e_{A₀,V−A₁}.

Proof  Set D = V − A₁ and let h₀(x) = P^x(T_{A₀} < T_{A₁}). Then by Proposition 7.2, Theorem 7.8, and Proposition 7.9 we have h₀ = e_{A₀,D}G_D and e_{A₀,D}(A₀) = R_eff(A₀, A₁)^{−1}. Set π₀ = R_eff(A₀, A₁) e_{A₀,D}. Then

  Σ_{y∈V} h₀(y) μ_y = Σ_{y∈V} Σ_{x∈A₀} e_{A₀,D}(x) g_D(x, y) μ_y

    = R_eff(A₀, A₁)^{−1} Σ_{x∈A₀} π₀(x) Σ_{y∈V} g_D(x, y) μ_y
    = R_eff(A₀, A₁)^{−1} Σ_{x∈A₀} π₀(x) E^x T_{A₁} = R_eff(A₀, A₁)^{−1} E^{π₀} T_{A₁}.

Similarly, setting h₁(x) = P^x(T_{A₁} < T_{A₀}) and π₁ = R_eff(A₀, A₁) e_{A₁,V−A₀}, we have

  Σ_{y∈V} h₁(y) μ_y = R_eff(A₀, A₁)^{−1} E^{π₁} T_{A₀}.

Since h₀ + h₁ = 1, adding the two identities gives (7.4).

Remark 7.11  Since hitting times and Green's functions are the same for the discrete and continuous time random walks, this result applies to both X and Y. Now consider a more general continuous time random walk Z = (Z_t, t ≥ 0) with generator

  L f(x) = θ_x^{−1} Σ_{y∈V} μ_{xy}(f(y) − f(x)).

The potential operator G_D^{(Z)} and Green's functions g_D^{(Z)}(x, y) for Z are defined by

  G_D^{(Z)} f(x) = E^x ∫₀^{τ_D} f(Z_s) ds = Σ_{y∈D} g_D^{(Z)}(x, y) f(y) θ_y;   (7.5)

thus g_D^{(Z)}(x, y) is the density of G_D^{(Z)} with respect to the measure θ. Straightforward calculations give that g_D^{(Z)}(x, y) satisfies the same equations as g_D(x, y), and therefore we have g_D^{(Z)} = g_D; this is not so evident from (7.5). Since Green's functions and capacitary measures are the same for Z and X, as above we obtain

  Σ_{y∈V} h₀(y) θ_y = C_eff(A₀, A₁) Σ_{x∈A₀} π₀(x) Σ_{y∈V} g_D(x, y) θ_y.

Since E^x T_{A₁}(Z) = Σ_{y∈V} g_D(x, y) θ_y, this gives

  E^{π₀} T_{A₁}(Z) + E^{π₁} T_{A₀}(Z) = θ(V) R_eff(A₀, A₁).

We now give a version of the Riesz representation theorem. Recall from Chapter 1 the definition of the operator Δ_D.

Eπ0 T A1 (Z ) + Eπ1 T A2 (Z ) = θ (V)Reff (A0 , A1 ). We now give a version of the Riesz representation theorem. Recall from Chapter 1 the definition of the operator D .

7.1 Introduction to Potential Theory

177

Theorem 7.12 Let D ⊂ V with D = V if (, μ) is recurrent. Let u : D → R be non-negative on D and superharmonic in D. Then there exist non-negative functions w, h with support in D, with h harmonic, such that u(x) = G D w(x) + h(x),

x ∈ D.

(7.6)

Further, w = D u, and this representation is unique. Proof Since u ≥ 0 on D, for x ∈ D, 0 ≤ PD u(x) ≤ Pu(x) ≤ u(x). D u ≤ P D u, and, since these functions are non-negative, we Consequently, Pn+1 n can define

h(x) = lim PkD u. k→∞

By monotone convergence, D PD h = PD ( lim PkD u) = lim Pk+1 u = h on D. k→∞

k→∞

we have PD h = Ph on D, so that h is harmonic on D. Since h = 0 on Now define w by setting Dc

w = I D u − PD u. Then G Dw =

∞ k=0

PkD w = lim

N →∞

N k=0

PkD w = lim (I D u − PND+1 u) = I D u − h, N →∞

which gives the representation (7.6). Applying D to both sides of (7.6) gives

D u = D G D w + D h = −w, which implies that the representation is unique. Remark 7.13 Note that h = 0 on ∂ D and h = 0 in D, so that h = 0 in D is finite. If is a graph for which the strong Liouville property fails, and u is a non-constant positive harmonic function, then h = u and w = 0. We now prove a ‘balayage’ formula, which will be used to obtain an elliptic Harnack inequality from Green’s functions bounds. Let D ⊂ V be finite, and let A ⊂ D. Let h be non-negative in D and harmonic in D. Define the reduite h A by h A (x) = Ex (h(X TA ); T A < τ D ).

178

Potential Theory and Harnack Inequalities

Lemma 7.14 (a) The function h A satisfies h A ≥ 0 on V, h A = 0 on D c , h A = h on A, and 0 ≤ h A ≤ h on D − A. (b) Let g = 1 A h. Then h A (x) = Ex g(X τ D−A ). (c) h A is harmonic in D − A, superharmonic in D, and is the smallest nonnegative superharmonic function f in D such that f ≥ h1 A . Proof (a) and (b) are immediate from the definition. (c) The representation of h A in (b) as the solution to a Dirichlet problem gives that h A is harmonic in D − A. Further, the definition of h A gives that

h A (x) = 0 if x ∈ Ao ∪ (D − A). If x ∈ ∂i A, then h A (x) = h(x) = Ph(x) ≥ Ph A (y), so h A is superharmonic at x. Now let f ≥ h1 A be superharmonic in D and non-negative in V. Then by the optional sampling theorem f (x) ≥ E x f (X TA ); T A < τ D ≥ Ex h(X TA ); T A < τ D = h A (x). Alternatively, we can use the minimum principle. Let f be superharmonic in D with f ≥ 1 A h. Since h A is harmonic in D − A, f − h A is superharmonic in D − A. Then f − h A ≥ 0 on (D − A)c , and so min D−A ( f − h A ) = min∂(D−A) ( f − h A ) ≥ 0. Theorem 7.15 (Balayage formula) Let D ⊂ V be finite, and let A ⊂ D. Let h be non-negative in D and harmonic in D. Then there exists w ≥ 0 with supp(w) ⊂ ∂i (A) such that h A (x) = G D w(x),

x ∈ D.

(7.7)

In particular, h = G D w on A. Proof By Lemma 7.14, h A is non-negative and superharmonic in D, and vanishes on ∂ D. So by the Riesz representation, and the remark following, we have h A = G D w, where w = D h A . Since h A = 0 on ∂ D, we have w = h A , and so by the calculations in Lemma 7.14 w(x) = 0 unless x ∈ ∂i (A). This concludes our survey of potential theory. We now apply these results to graphs with ‘well-behaved’ Green’s functions.

7.2 Applications

179

7.2 Applications Definition 7.16 We say that (, μ) satisfies the condition (UG) (‘uniform Green’) if there exists a constant c1 such that for all balls B = B(x, R) and x i , yi ∈ B(x, 2R/3) with d(xi , yi ) ≥ R/8 for i = 1, 2 g B (x1 , y1 ) ≤ c1 g B (x2 , y2 ). Using Theorem 4.26 we have Corollary 7.17

If (, μ) satisfies HK(α, β) then (UG) holds.

Theorem 7.18 Let (, μ) satisfy (UG). Then (, μ) satisfies the elliptic Harnack inequality. Proof Let B = B(x0 , R), and let h be harmonic in B and non-negative in B. Set A = B(x0 , 2R/3) and B = B(x 0 , R/2). Then the balayage formula (7.7) applied with D = B gives that there exists f ≥ 0 with support ∂i A such that g D (x, y) f (y)μ y , x ∈ A. h(x) = y∈∂i A

Let x1 , x2 ∈ B . Then, since d(xi , y) ≥ R/6 for y ∈ ∂i A, using the condition (UG), g B (x1 , y) f (y)μ y ≤ c1 g B (x2 , y) f (y)μ y = c1 h(x2 ), h(x 1 ) ≤ y∈∂i A

y∈∂i A

giving the EHI. Theorem 7.19 Let (, μ) satisfy (H5) and be roughly isometric to Zd , with d ≥ 1. Then (, μ) satisfies the EHI and the strong Liouville property. Proof By Theorem 6.28 (, μ) satisfies HK(d, 2), and therefore the EHI. The strong Liouville property then follows by Theorem 1.46. Remark 7.20 One can also use a balayage argument to show that HK(α, β) leads to PHI(β). See [BH] for the case β = 2; the case β > 2 is very similar. We now look at hitting probabilities in graphs which satisfy HK(α, β) with α > β; note that this condition implies these graphs are transient.

180

Potential Theory and Harnack Inequalities Suppose that (, μ) satisfies HK(α, β) with α > β.

Lemma 7.21

(a) There exist constants ci such that c1 r α−β ≤ Cap(B(x, r )) ≤ c2 r α−β . (b) If A ⊂ B(x, r/2) then Cap(A) ≤ Cap B (A) ≤ c2 Cap(A). Proof (a) By Theorem 4.26 we have g(x, y) d(x, y)β−α when x = y. Let B = B(x, r ); then 1 ≥ e B G(x) ≥ e B (B) min g(x, y) ≥ Cap(B)cr β−α , y∈B

giving the upper bound. For the lower bound, it is sufficient to consider a measure ν on B which puts mass λ on each point in B. For y ∈ B, g(z, y). νG(y) ≤ λ z∈B

To bound this sum, choose k as small as possible so that 2r ≤ 2k . Then, since (, μ) satisfies (Vα ), writing B j = B(y, 2 j ),

g(z, y) ≤ c +

k

g(y, z) ≤ c + c

j=1 z∈B j+1 −B j

z∈B

k

(2 j )β−α 2α j ≤ cr β .

j=1

So taking λ = cr −β we have νG ≤ 1 and hence Cap(B) ≥ ν(B) ≥ cr α−β . (b) We have e A G B ≤ eG ≤ 1 on A, so Cap B (A) ≥ e A (A) = Cap(A), which proves the first inequality. For the second, by Theorem 4.26(a) there exists c2 such that g(y, z) ≤ c2 g B (y, z)

for y, z ∈ B(x, r/2).

Then for y ∈ A c2−1 e A,B G(y) ≤ e A,B G B (y) = 1, and so Cap(A) ≥ c2−1 e A,B (A) = c2−1 Cap B (A). Lemma 7.22 Let (, μ) satisfy HK(α, β), and let B(y, r ) ⊂ B(x0 , R/2). Write B = B(x0 , R). (a) If α < β then there exist constants ci such that c1 c2 ≤ Cap B (B(y, r )) ≤ β−α . R β−α R

7.2 Applications

181

(b) If α = β then there exist constants ci such that c1 c2 ≤ Cap B (B(y, r )) ≤ . log(R/r ) log(R/r ) Proof (a) follows easily by the same methods as in Lemma 7.21, on noting that by Theorem 4.26 we have g B (x, y) R β−α for x, y ∈ B(y, r ). (b) Let A = B(y, r ). Then 1 = e A,B G B ≥ Cap B (A) minz,z ∈A g B (z, z ) ≥ c Cap B (A) log(R/r ). Let z ∈ A and A j = {z ∈ A : 2 j ≤ d(z, z ) < 2 j+1 }. Then

g B (z, z ) ≤ c + c

∞

z ∈A

|A j | log(R/2 j ) ≤ c r α log(R/r ).

j=1

So if ν places a mass λ = c r α log(R/r )−1 at each point in A then νG B ≤ 1 everywhere, ν(A) = cr α λ, and hence Cap B (A) ≥ c/ log(R/r ). Theorem 7.23 (Wiener’s test) Suppose that (, μ) satisfies HK(α, β) with α > β. Let o ∈ V. Let A ⊂ V, and An = A ∩ (B(o, 2n+1 ) − B(o, 2n )). Then Po (X hits A infinitely often) is 0 or 1 according to whether ∞

2−k(α−β) Cap(Ak )

(7.8)

k=1

is finite or infinite. Proof As (, μ) is transient, X will only hit each set An a finite number of times, and so, writing Fn = {X hits An }, {X hits A infinitely often} = {Fn occurs for infinitely many n}. By Theorem 4.26 we have if x, y ∈ V with d(x, y) = r ≥ 1, c1 r −(α−β) ≤ g(x, y) ≤ c2 r −(α−β) . Further, if x, y ∈ B(o, R/2) then we have g B(o,R) (x, y) ≥ c1 r −(α−β) . One implication is easy. We have Po (Fn ) = e An G(o) ≤ e An (An ) max g(o, y) ≤ c Cap(An )2−n(α−β) . y∈An

So if the sum in (7.8) is finite then A finitely often.

n

Po (Fn ) < ∞, and thus, Po -a.s., X hits



Now suppose that the sum in (7.8) is infinite. Set B_k = B(o, 2^k), and let G_k = {T_{A_k} < τ_{B_{k+2}}}, so that G_k ⊂ F_k. Then if y ∈ ∂B_k,

  P^y(G_k) = e_{A_k,B_{k+2}} G_{B_{k+2}}(y) ≥ Cap_{B_{k+2}}(A_k) min_{z∈A_k} g_{B_{k+2}}(y, z) ≥ c Cap(A_k) 2^{−(α−β)k}.

Here we used Lemma 7.21 and the fact that both y, z ∈ B_{k+2}. Let F_k = σ(X₁, X₂, …, X_{T_k}). Then the bound above implies that

  P^o(G_k | F_k) ≥ c Cap(A_k) 2^{−(α−β)k},

so that Σ_k P^o(G_k | F_k) = ∞ a.s. The conclusion that F_k occurs infinitely often, P^o-a.s., then follows by a conditional version of the second Borel–Cantelli lemma – see [Dur], Chap. 5.3.1.
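As a classical illustration of Wiener's test (our addition; the capacity asymptotics for lattice segments are standard facts assumed here, not proved in the text), take Γ = Z^d with d ≥ 3, so that HK(d, 2) holds, and let A = {(n, 0, …, 0) : n ≥ 1} be a coordinate half-axis:

```latex
% A_k is a segment of about 2^k points; for a segment of length L in Z^d one has
%   \mathrm{Cap} \asymp L/\log L \ (d = 3), \qquad \mathrm{Cap} \asymp L \ (d \ge 4).
% Hence the Wiener series (7.8), with \alpha = d and \beta = 2, behaves as
\sum_k 2^{-k(d-2)}\,\mathrm{Cap}(A_k) \asymp
\begin{cases}
\sum_k 1/k = \infty, & d = 3,\\[2pt]
\sum_k 2^{k(3-d)} < \infty, & d \ge 4,
\end{cases}
% so the walk hits the axis infinitely often in Z^3,
% but only finitely often in Z^d for d >= 4.
```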

Appendix

This appendix contains a number of results which are used in this book but are not part of the main flow of the arguments.

A.1 Martingales and Tail Estimates

Let (Ω, F, P) be a probability space. A filtration is an increasing sequence of sub-σ-fields (F_n, n ∈ Z₊) of F. We call (Ω, F, (F_n), P) a filtered probability space.

Definition A.1  A random time is a r.v. T : Ω → Z₊ ∪ {+∞}. A random time is a stopping time if {T ≤ n} ∈ F_n for each n ∈ Z₊. We also define the σ-field F_T by

  F_T = {F ∈ F : F ∩ {T ≤ n} ∈ F_n for all n ∈ Z₊}.

If T is a stopping time then it is easy to check that {T = n} ∈ F_n.

Definition A.2  Let I ⊂ Z. Let X = (X_n, n ∈ I) be a stochastic process on a filtered probability space. We say X is a supermartingale if X is in L¹ (so that E|X_n| < ∞ for each n), X_n is F_n-measurable for all n, and

  E(X_n | F_m) ≤ X_m for each n ≥ m, n, m ∈ I.   (A.1)

(Thus a supermartingale is decreasing on average.) We say X is a submartingale if −X is a supermartingale, and X is a martingale if X is both a supermartingale and a submartingale, or equivalently if X is in L¹ and equality holds in (A.1).


The two cases of interest here are $I = \mathbb{Z}_+$ (a 'forwards supermartingale', or just a supermartingale) and $I = \mathbb{Z}_-$ (a 'backwards supermartingale'). The following are the main results on martingales used in this book. For proofs see [Dur, W].

Theorem A.3 (Optional sampling theorem; see [W], Th. 10.10) Let $X$ be a forward supermartingale and $T \ge 0$ be a stopping time. Suppose one of the following conditions holds:
(a) $T$ is bounded (i.e. there exists $K$ such that $T \le K$ a.s.);
(b) $X$ is bounded (i.e. there exists $K$ such that $|X_n| \le K$ a.s.).
Then $E(X_T) \le E(X_0)$.

The convergence theorem for supermartingales is as follows.

Theorem A.4 (See [W], Th. 11.5) Let $X$ be a supermartingale bounded in $L^1$.
(a) If $I = \mathbb{Z}_+$ then there exists a r.v. $X_\infty$ with $P(|X_\infty| < \infty) = 1$ such that $X_n \to X_\infty$ a.s. as $n \to \infty$.
(b) If $I = \mathbb{Z}_-$ then there exists a r.v. $X_{-\infty}$ with $P(|X_{-\infty}| < \infty) = 1$ such that $X_n \to X_{-\infty}$ a.s. and in $L^1$ as $n \to -\infty$.

Corollary A.5 Let $X$ be a non-negative (forwards) supermartingale. Then $X$ converges a.s. to a r.v. $X_\infty$, and $EX_\infty \le EX_0$. In particular, $P(X_\infty < \infty) = 1$.

Since $E|X_n| = EX_n \le EX_0$ this is immediate from the convergence theorem. The hypotheses in Theorem A.4(a) are not strong enough to ensure that $EX_n \to EX_\infty$.

Definition A.6 $X = (X_n, n \ge 0)$ is uniformly integrable if for all $\varepsilon > 0$ there exists $K$ such that $E(|X_n|; |X_n| > K) \le \varepsilon$ for all $n \ge 0$.

This condition is slightly stronger than being bounded in $L^1$. Using Hölder's inequality it is easy to verify that if $X$ is bounded in $L^p$ for some $p > 1$ then


$X$ is uniformly integrable. In particular, a bounded martingale is uniformly integrable.

Theorem A.7 (See [W], Th. 13.7) Let $M$ be a uniformly integrable martingale. Then there exists $M_\infty$ such that

$$M_n \to M_\infty \quad \text{a.s. and in } L^1.$$

We conclude this subsection with an estimate on the tail of a sum of r.v.

Lemma A.8 Let $\xi_i$ be random variables taking values in $\{0, 1\}$, and suppose that there exists $p > 0$ such that $E(\xi_k \mid \mathcal{F}_{k-1}) \ge p$, where $\mathcal{F}_k = \sigma(\xi_1, \dots, \xi_k)$. Let $b < p$. Then there exists a constant $c_1 = c_1(b, p)$ such that

$$P\Big(\sum_{k=1}^n \xi_k < bn\Big) \le e^{-c_1 n}.$$

Proof Let

$$M_n = \sum_{i=1}^n (p - \xi_i),$$

so that $M$ is a supermartingale. If $\lambda > 0$ then

$$E(e^{-\lambda \xi_n} \mid \mathcal{F}_{n-1}) = E\big(e^{-\lambda} + (1 - e^{-\lambda})\, 1_{(\xi_n = 0)} \mid \mathcal{F}_{n-1}\big) \le e^{-\lambda} + (1 - p)(1 - e^{-\lambda}) = (1 - p) + p e^{-\lambda}.$$

So

$$E e^{\lambda M_n} \le \big((1 - p) e^{\lambda p} + p e^{-\lambda(1 - p)}\big)^n.$$

Set $\psi(\lambda) = \log\big((1 - p) e^{\lambda p} + p e^{-\lambda(1 - p)}\big)$. Then it is straightforward to check that $\psi(0) = \psi'(0) = 0$, so that $\psi(\lambda) \le c_2 \lambda^2$ on $[0, 1]$. So

$$P\Big(\sum_{k=1}^n \xi_k < bn\Big) = P\big(M_n > (p - b)n\big) = P\big(e^{\lambda M_n} > e^{\lambda(p - b)n}\big) \le e^{-\lambda(p - b)n} E(e^{\lambda M_n}) \le \exp\big(-n(\lambda(p - b) - \psi(\lambda))\big).$$

Taking $\lambda > 0$ small enough then gives the bound.

Corollary A.9 Let $X \sim \mathrm{Bin}(n, p)$ and $a > p$. Then there exists a constant $b = b(a, p)$ such that $P(X > an) \le e^{-bn}$.
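Corollary A.9 can be checked numerically. The sketch below compares the exact binomial tail with an exponential bound; as the explicit exponent it uses the relative-entropy (Chernoff) form $P(X \ge an) \le e^{-n D(a\|p)}$, a standard optimised version of the bound that is not stated in the text and is introduced here only for illustration.

```python
import math

def binom_tail(n, p, k0):
    # P(X >= k0) for X ~ Bin(n, p), computed exactly
    return sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k0, n + 1))

def chernoff_bound(n, p, a):
    # exp(-n * D(a || p)): the optimised exponential bound for P(X >= a*n), a > p
    d = a * math.log(a / p) + (1 - a) * math.log((1 - a) / (1 - p))
    return math.exp(-n * d)

for n in (20, 50, 100):
    p, a = 0.3, 0.5
    tail = binom_tail(n, p, math.ceil(a * n))
    assert tail <= chernoff_bound(n, p, a)   # the exact tail sits below the bound
```

Both the tail and the bound decay exponentially in $n$, which is exactly the content of Corollary A.9.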


A.2 Discrete Time Markov Chains and the Strong Markov Property

While the Markov property for a process is quite intuitive ('only the value of $X$ at time $n$ affects the evolution of $X$ after time $n$'), its precise formulation can be quite troublesome, particularly as for many applications one needs the strong Markov property, that is, the Markov property at stopping times. In the discrete context one does not have to worry about null sets or measurability problems, but at first sight the setup still appears unnecessarily elaborate.

Let $V$ be a countable set and $(\Omega, \mathcal{F})$ be a measure space. We assume that we have random variables $X_n : \Omega \to V$, and also that we have shift operators $\theta_n : \Omega \to \Omega$ which are measurable (so $\theta_n^{-1}(F) \in \mathcal{F}$ for all $F \in \mathcal{F}$) and relate to $(X_n)$ by

$$X_n(\theta_m \omega) = X_{n+m}(\omega) \quad \text{for all } n, m \in \mathbb{Z}_+,\ \omega \in \Omega. \qquad (A.2)$$

We write $\xi \circ \theta_n$ for the random variable given by $\xi \circ \theta_n(\omega) = \xi(\theta_n(\omega))$. One way of constructing these is to take $\Omega$ to be the canonical space $\Omega = V^{\mathbb{Z}_+}$. Thus an element $\omega \in \Omega$ is a function $\omega : \mathbb{Z}_+ \to V$. We then take $X_n$ to be the coordinate maps and $\theta_n$ to be a shift by $n$, so that if $\omega = (\omega_0, \omega_1, \dots)$ then

$$X_n(\omega) = \omega_n, \quad n \in \mathbb{Z}_+, \qquad \theta_n(\omega) = (\omega_n, \omega_{n+1}, \dots).$$

Let $\mathcal{F}$ be the Borel $\sigma$-field on $\Omega$, that is, the smallest $\sigma$-field which contains all the sets $X_n^{-1}(A)$, $n \in \mathbb{Z}_+$, $A \subset V$. We define the filtration $\mathcal{F}_n = \sigma(X_0, X_1, \dots, X_n)$. We write $\xi \in b\mathcal{F}_n$ to mean that $\xi$ is a bounded $\mathcal{F}_n$-measurable r.v. To make our notation more compact we will sometimes write

$$E^x_{\mathcal{F}_n}(\xi) = E^x(\xi \mid \mathcal{F}_n).$$
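The canonical-space relation (A.2) is easy to sketch in code; the names `X` and `theta` below are illustrative stand-ins for the coordinate maps and shift operators, and the sample path is an arbitrary finite tuple.

```python
# Canonical space sketch: omega is a sequence of states, X_n the coordinate
# map, theta_n the shift by n places.
def X(n, omega):
    return omega[n]

def theta(n, omega):
    return omega[n:]

omega = ('a', 'b', 'c', 'd', 'e')
# The defining relation (A.2): X_n(theta_m omega) = X_{n+m}(omega)
for n in range(2):
    for m in range(2):
        assert X(n, theta(m, omega)) == X(n + m, omega)
```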

Let $P(x, y)$ be a transition matrix on $V \times V$, so that $P(x, y) \ge 0$ for all $x, y$, and $\sum_{y \in V} P(x, y) = 1$ for all $x$. Now let $\{P^x, x \in V\}$ be the probabilities on $(\Omega, \mathcal{F})$ so that, for all $x, z, n$,

$$P^x(X_0 = x) = 1, \qquad (A.3)$$
$$P^x(X_{n+1} = z \mid \mathcal{F}_n) = P(X_n, z). \qquad (A.4)$$

We call the collection $(X_n, n \in \mathbb{Z}_+, P^x, x \in V)$ which satisfies the conditions above a Markov chain with transition probabilities $P$. As in (1.9) we define the semigroup operator $P_n$ by $P_n f(x) = E^x f(X_n)$, and write $P = P_1$. The (simple) Markov property, that is, at a fixed time $n$, is formulated as follows.

Theorem A.10 Let $n \ge 0$, $\xi \in b\mathcal{F}_n$, $\eta \in b\mathcal{F}$. Then

$$E^x(\xi(\eta \circ \theta_n)) = E^x(\xi\, E^{X_n}\eta). \qquad (A.5)$$

Remark A.11 The notation in (A.5) is concise but can be confusing, particularly if $\eta$ is written as a function of $X$. (For example, if $\eta = \varphi(X_n)$ then the right-hand side of (A.5) would be $E^x(\xi\, E^{X_n}\varphi(X_n))$. The two $X_n$ in this expression are not the same!) A clearer way of writing this example is to set $f(y) = E^y(\eta)$. Then the right side of (A.5) is just $E^x(\xi f(X_n))$. An alternative way of stating (A.5) is that

$$E^x(\eta \circ \theta_n \mid \mathcal{F}_n) = E^{X_n}(\eta), \quad P^x\text{-a.s.} \qquad (A.6)$$

Proof Let $g_k : V \to \mathbb{R}$ be bounded and $\xi \in b\mathcal{F}_n$. First let

$$\eta = \prod_{k=0}^m g_k(X_k).$$

Then if $\omega = (\omega_0, \omega_1, \dots)$, using (A.2),

$$(\eta \circ \theta_n)(\omega) = \prod_{k=0}^m g_k\big(X_k(\theta_n(\omega))\big) = \prod_{k=0}^m g_k\big(X_{n+k}(\omega)\big).$$

(This shows how $\theta_n$ shifts the process $X$ to the left by $n$ steps.) Now take $g_k(x) = 1_{(x = y_k)}$, so that $(\eta \circ \theta_n)(\omega) = 1_{(X_n = y_0, \dots, X_{n+m} = y_m)}$. It is then easy to check by induction that both $E^{X_n}(\eta)$ and $E^x(\eta \circ \theta_n \mid \mathcal{F}_n)$ equal

$$1_{(X_n = y_0)} \prod_{i=1}^m P(y_{i-1}, y_i),$$

which proves (A.6) in this case. As both sides of (A.6) are linear in $\eta$, (A.6) then follows for general bounded $g_k$. A monotone class argument, as in [Dur], Chap. 6.3, then concludes the proof.


The Markov property extends to stopping times. We define the shift $\theta_T$ on $\{\omega : T(\omega) < \infty\}$ as follows: $\theta_T(\omega) = \theta_{T(\omega)}(\omega)$. So $\theta_T = \theta_n$ on $\{\omega : T(\omega) = n\}$.

Theorem A.12 (Strong Markov property) Let $T$ be a stopping time, $\xi \in b\mathcal{F}_T$, and $\eta \in b\mathcal{F}$. Then

$$E^x(\xi(\eta \circ \theta_T); T < \infty) = E^x(\xi\, E^{X_T}(\eta); T < \infty). \qquad (A.7)$$

Proof Set $A_n = \{T = n\}$, and note that $\xi 1_{A_n} \in b\mathcal{F}_n$. Then, since $T = n$ on $A_n$, using the simple Markov property,

$$E^x(\xi(\eta \circ \theta_T); T < \infty) = \sum_{n \in \mathbb{Z}_+} E^x\big(\xi 1_{A_n}(\eta \circ \theta_T)\big) = \sum_{n \in \mathbb{Z}_+} E^x\big(\xi 1_{A_n}(\eta \circ \theta_n)\big) = \sum_{n \in \mathbb{Z}_+} E^x\big(\xi 1_{A_n} E^{X_n}(\eta)\big) = \sum_{n \in \mathbb{Z}_+} E^x\big(\xi 1_{A_n} E^{X_T}(\eta)\big) = E^x(\xi\, E^{X_T}(\eta); T < \infty).$$

We will now give a proof of Theorem 1.8, and, as an example of the use of the strong Markov property, begin by giving a careful proof of a preliminary result. Let

$$p(x) = P^x(T_x^+ < \infty), \qquad H(x, y) = P^x(T_y < \infty), \qquad L^x_n = \sum_{k=0}^{n-1} 1_{(X_k = x)}.$$

Lemma A.13
(a) We have

$$E^x L^y_\infty = H(x, y)\, E^y L^y_\infty. \qquad (A.8)$$

(b) If $p(y) = 1$ then $P^y(L^y_\infty = \infty) = 1$, while if $p(y) < 1$ then

$$E^y L^y_\infty = \frac{1}{1 - p(y)}.$$

Proof (a) If $x = y$ then $H(x, x) = 1$ and (A.8) is immediate. Now let $x \ne y$, and write $T = T_y$. Then $L^y_\infty(\theta_T \omega) = L^y_\infty(\omega)$ if $T(\omega) < \infty$, while $L^y_\infty(\omega) = 0$ if $T(\omega) = \infty$. Set $\xi = 1$, let $K > 0$, and set $\eta = L^y_\infty \wedge K$. Then

$$E^x(K \wedge L^y_\infty) = E^x\big((K \wedge L^y_\infty) \circ \theta_T;\ T < \infty\big) = E^x\big(E^y(K \wedge L^y_\infty);\ T < \infty\big) = H(x, y)\, E^y(K \wedge L^y_\infty).$$

Here we used the strong Markov property to obtain the second equality, and also the fact that $X_T = y$. Let $K \to \infty$ to obtain (A.8).
(b) For $k \ge 0$ let $A_k = \{L^y_\infty \ge k\}$. Write $T = T_y^+$. If $T(\omega) < \infty$ then

$$L^y_\infty(\theta_T \omega) = \sum_{k=0}^\infty 1_{(X_k \circ \theta_T(\omega) = y)} = \sum_{k=0}^\infty 1_{(X_{k+T}(\omega) = y)} = L^y_\infty(\omega) - L^y_T(\omega).$$

If $X_0(\omega) = y$ then $L^y_T(\omega) = 1$, and so on $\{T < \infty\} \cap \{X_0 = y\}$ we have $\omega \in A_k$ if and only if $\theta_T \omega \in A_{k-1}$. So, for $k \ge 1$,

$$P^y(A_k) = E^y 1_{A_k} = E^y(1_{A_{k-1}} \circ \theta_T;\ T < \infty) = E^y\big(E^y(1_{A_{k-1}});\ T < \infty\big) = p(y)\, P^y(A_{k-1}).$$

Since $P^y(A_0) = 1$ we obtain

$$P^y(L^y_\infty \ge k) = p(y)^k.$$

If $p(y) = 1$ this gives $P^y(A_k) = 1$ for all $k$ and so $P^y(L^y_\infty = \infty) = 1$. If $p(y) < 1$ then under $P^y$ the r.v. $L^y_\infty$ has a geometric distribution, and therefore has expectation $(1 - p(y))^{-1}$.

Proof of Theorem 1.8 Note that as $\Gamma$ is connected we have $H(x, y) > 0$ for all $x, y$. Further, if $F_{xy} = \{T_y < T_x^+\}$ then, by considering the event that $X$ travels from $x$ to $y$ along a shortest path between these points, we have $P^x(F_{xy}) > 0$. With the convention that $1/0 = \infty$, we have from Lemma A.13

$$\frac{1}{1 - p(x)} = E^x L^x_\infty = E^x \sum_{k=0}^\infty 1_{(X_k = x)} = \sum_{k=0}^\infty P_k(x, x).$$

Thus $\sum_k P_k(x, x)$ converges if and only if $p(x) < 1$. Let $x, y \in V$ and $m = d(x, y)$. Then $P_m(x, y) > 0$ and we have $P_{n+2m}(y, y) \ge P_m(x, y) P_m(y, x) P_n(x, x)$, so that $\sum_k P_k(x, x)$ converges if and only if $\sum_k P_k(y, y)$ converges.


Combining these observations, we obtain the equivalence of (a), (b), and (c) in both case (R) and case (T). Suppose (a)–(c) hold in case (T). Then

$$1 > P^x(T_x^+ < \infty) \ge P^x\big(T_y < \infty,\ T_x^+ \circ \theta_{T_y} < \infty\big) = H(x, y) H(y, x),$$

so (d) holds. If (d) holds then choose $x, y$ so that $H(x, y) < 1$. Then

$$P^y(T_y^+ < \infty) = P^y(T_y^+ < \infty;\ F_{yx}) + P^y(T_y^+ < \infty;\ F_{yx}^c) \le P^y(F_{yx}) H(x, y) + P^y(F_{yx}^c) < 1,$$

so (a) holds. Hence (d) is equivalent to (a)–(c) in case (T), and therefore also in case (R). By Lemma A.13 and the above we have

$$P^x(L^x_\infty = \infty) = 1 \text{ if and only if } p(x) = 1, \qquad (A.9)$$
$$P^x(L^x_\infty = \infty) = 0 \text{ if and only if } p(x) < 1. \qquad (A.10)$$

Further, by the strong Markov property,

$$P^x(L^y_\infty = \infty) = H(x, y)\, P^y(L^y_\infty = \infty).$$

Hence (a)–(d) implies (e) in both case (R) and case (T). Taking $x = y$ and using (A.9) and (A.10) gives that (e) implies (a).
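The dichotomy '$\sum_k P_k(x,x)$ converges if and only if $p(x) < 1$' can be illustrated numerically for walks on $\mathbb{Z}$, where $P_{2m}(0,0) = \binom{2m}{m}\big(p(1-p)\big)^m$ is explicit. The transient biased walk below is an example chosen for illustration, not taken from the text.

```python
# For the walk on Z with steps +1 w.p. p and -1 w.p. 1-p:
# P_{2m}(0,0) = C(2m, m) (p(1-p))^m exactly.  The fair walk (p = 1/2) is
# recurrent (the series diverges); a biased walk (p = 0.8) is transient.
def return_prob_sum(p, mmax):
    """Partial sum of P_k(0,0) over even k = 0, 2, ..., 2*mmax."""
    q = p * (1 - p)
    term, total = 1.0, 1.0                     # m = 0 term is 1
    for m in range(mmax):
        # ratio C(2m+2, m+1)/C(2m, m) = 2(2m+1)/(m+1), applied in floats
        term *= 2.0 * (2 * m + 1) / (m + 1) * q
        total += term
    return total

assert return_prob_sum(0.5, 1000) > 20     # grows like sqrt(mmax): divergence
assert return_prob_sum(0.8, 1000) < 3      # bounded partial sums: transience
```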

A.3 Continuous Time Random Walk

In this section we give an outline of the construction of the continuous time random walk.

Definition A.14 Let $f : \mathbb{R}_+ \to M$, where $M$ is a metric space. We say $f$ is right continuous with left limits, or cadlag, if for all $t \in \mathbb{R}_+$

$$\lim_{s \downarrow t} f(s) = f(t), \quad \text{and} \quad \lim_{s \uparrow t} f(s) \text{ exists}.$$

(If $t = 0$ we do not consider the second property above.) The term cadlag is from the French abbreviation (of 'continue à droite, limites à gauche'), which is easier to pronounce than the English one.

Let $D = D(\mathbb{R}_+, V)$ be the set of functions $f : \mathbb{R}_+ \to V$ which are right continuous with left limits. We write $Y_t$ for the coordinate functions on $D$ defined by $Y_t(f) = f(t)$, $f \in D$. We also let $\mathcal{D}$ be the Borel $\sigma$-field on $D$, which is the smallest $\sigma$-field which contains all the sets $\{f : f(t) = x\}$ for $t \in \mathbb{R}_+$ and $x \in V$.


Definition A.15 Let $V_i$, $i \in \mathbb{N}$, be independent exponential random variables with mean 1, so that $P(V_i \le t) = 1 - e^{-t}$. Set

$$T_k = \sum_{i=1}^k V_i, \qquad N_t = \sum_{k=1}^\infty 1_{(T_k \le t)},$$

so that $N_t = k$ if and only if $T_k \le t < T_{k+1}$. The process $N = (N_t, t \in \mathbb{R}_+)$ is the Poisson process.

The definition gives that $T_k$ is the time of the $k$th jump of $N$, and that for all $\omega$ the function $t \mapsto N_t(\omega)$ is right continuous with left limits. We define the filtration of $N$ by

$$\mathcal{F}^N_t = \sigma(N_s, 0 \le s \le t), \quad t \in \mathbb{R}_+.$$
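Definition A.15 translates directly into code: $N_t$ is the count of partial sums $T_k \le t$. In the sketch below the $V_i$ are fixed illustrative numbers rather than sampled exponentials, so the jump structure can be checked by hand.

```python
import itertools, bisect

# Sketch of Definition A.15: T_k are the partial sums of the V_i, and
# N_t counts how many T_k are <= t.
V = [0.5, 1.2, 0.3, 2.0]           # stand-ins for i.i.d. Exp(1) variables
T = list(itertools.accumulate(V))  # jump times: [0.5, 1.7, 2.0, 4.0]

def N(t):
    # N_t = #{k : T_k <= t}
    return bisect.bisect_right(T, t)

assert N(0.4) == 0
assert N(0.5) == 1    # the path is right continuous: N jumps at T_1 = 0.5
assert N(1.9) == 2
assert N(2.0) == 3
```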

Theorem A.16 (a) $N_t$ has a Poisson distribution with parameter $t$, so that

$$P(N_t = m) = \frac{e^{-t} t^m}{m!}, \quad m \in \mathbb{Z}_+.$$

(b) Let $s \ge 0$. The process $N^{(s)}$ defined by $N^{(s)}_t = N_{t+s} - N_s$ is independent of $\mathcal{F}^N_s$, and has the same distribution as $N$.
(c) The process $N$ has stationary independent increments.

Proof (a) The r.v. $T_{m+1}$ has the gamma distribution with parameters 1 and $m + 1$, so integrating by parts we have

$$P(N_t \ge m+1) = P(T_{m+1} \le t) = \int_0^t \frac{e^{-s} s^m}{m!}\, ds = \Big[\frac{-e^{-s} s^m}{m!}\Big]_0^t + \int_0^t \frac{e^{-s} s^{m-1}}{(m-1)!}\, ds = \frac{-e^{-t} t^m}{m!} + P(N_t \ge m),$$

which proves (a).
(b) This follows by basic properties of the exponential distribution – see [Nor], Th. 2.4.1.
(c) Let $0 = t_0 < t_1 < \cdots < t_m$. Applying (b) with $s = t_{m-1}$ we have that $N_{t_m} - N_{t_{m-1}}$ is Poisson with parameter $t_m - t_{m-1}$ and is independent of $\mathcal{F}_{t_{m-1}}$. Continuing, it follows that the random variables $N_{t_0}, N_{t_1} - N_{t_0}, \dots, N_{t_m} - N_{t_{m-1}}$ are independent and that $N_{t_k} - N_{t_{k-1}}$ is Poisson with parameter $t_k - t_{k-1}$.


Remark We have by (a) that $\sup_m T_m = \infty$, so that the random times $T_m$ do not accumulate, and $N_t$ is finite $P$-a.s. for all $t$.

Definition A.17 Let $(\Gamma, \mu)$ be a weighted graph and $X = (X_n, n \in \mathbb{Z}_+)$ be the discrete time random walk with $X_0 = x$ defined on a probability space $(\Omega, \mathcal{F}, P^x)$. We assume that this space also carries a Poisson process $N$ independent of $X$. Define

$$\widetilde{Y}_t = X_{N_t}, \quad t \in [0, \infty). \qquad (A.11)$$

We call $\widetilde{Y}$ the continuous time random walk on $(\Gamma, \mu)$.

Before we prove the Markov property, it will be convenient to 'move' the process $\widetilde{Y}$ to the canonical space $(D, \mathcal{D})$ given in Definition A.14. Note that $\widetilde{Y}$ as defined by (A.11) is cadlag – this is immediate from the corresponding property for the Poisson process $N$. We thus have a map $\widetilde{Y} : \Omega \to D$, defined by $\widetilde{Y}(\omega) = (\widetilde{Y}_t(\omega), t \ge 0)$. To prove that this function is measurable, it is enough to check that $\widetilde{Y}^{-1}(A) \in \mathcal{F}$ for sets $A$ given by $A = \{f \in D : f(t) = x\}$, where $t \ge 0$ and $x \in V$. However,

$$\{\omega : \widetilde{Y}(\omega) \in A\} = \{\omega : \widetilde{Y}_t(\omega) = x\} = \bigcup_{n=0}^\infty \{N_t(\omega) = n\} \cap \{X_n(\omega) = x\}.$$

We define the law of $Y$ started at $x$ on the space $D$ by setting

$$\mathbb{P}^x(F) = P^x\big(\widetilde{Y}^{-1}(F)\big), \quad F \in \mathcal{D}.$$

It is then straightforward to verify that $\widetilde{Y}$ under $P^x$ and $Y$ under $\mathbb{P}^x$ have the same law. As in the discrete time situation, we define the shift operators $\theta_t$ on $D$ by setting $\theta_t(f)$ to be the function given by $\theta_t(f)(s) = f(t+s)$. We also define the filtration of $Y$ by $\mathcal{F}_t = \sigma(Y_s, s \le t)$.

Theorem A.18

The process Y defined above satisfies the following.

(a) The function $t \to Y_t$ is right continuous with left limits.
(b) The process $Y$ has semigroup given by

$$Q_t f(x) = E^x f(Y_t) = \sum_{n=0}^\infty \frac{e^{-t} t^n}{n!}\, P^n f(x),$$

for $f \in C(V)$ with $f$ bounded.
(c) $Y$ satisfies the Markov property. That is, if $\xi \in b\mathcal{F}_t$ and $\eta \in b\mathcal{F}$ then

$$E^x(\xi(\eta \circ \theta_t)) = E^x(\xi\, E^{Y_t}\eta). \qquad (A.12)$$

Proof (a) and (b) are immediate from the construction of $\widetilde{Y}$ and $Y$.
(c) Let $t > 0$, $0 \le t_1 \le \cdots \le t_m < t$, $0 \le s_1 \le \cdots \le s_n$, let $g_j$, $f_i$ be bounded, and set

$$\xi = \prod_{j=1}^m g_j(Y_{t_j}), \qquad \eta = \prod_{i=1}^n f_i(Y_{s_i}). \qquad (A.13)$$

It is sufficient to prove that

$$E^x\Big(\prod_{j=1}^m g_j(\widetilde{Y}_{t_j}) \prod_{i=1}^n f_i(\widetilde{Y}_{t+s_i})\Big) = E^x\Big(\prod_{j=1}^m g_j(\widetilde{Y}_{t_j})\, E^{\widetilde{Y}_t} \prod_{i=1}^n f_i(\widetilde{Y}_{s_i})\Big). \qquad (A.14)$$

Since $\widetilde{Y}$ and $Y$ are equal in law, this implies (A.12) for $\xi$ and $\eta$ given by (A.13), and a monotone class argument then gives the general case. To prove (A.14) it is sufficient to prove, for integers $0 \le k_1 \le \cdots \le k_m \le k$ and $z \in V$, that

$$E^x\Big(\prod_{j=1}^m g_j(\widetilde{Y}_{t_j})\, \prod_{i=1}^m 1_{(N_{t_i} = k_i)}\, 1_{(N_t = k)}\, 1_{(\widetilde{Y}_t = z)}\, f_1(\widetilde{Y}_{t+s_1})\Big) = E^x\Big(\prod_{j=1}^m g_j(\widetilde{Y}_{t_j})\, \prod_{i=1}^m 1_{(N_{t_i} = k_i)}\, 1_{(N_t = k)}\, 1_{(\widetilde{Y}_t = z)}\, Q_{s_1} f_1(z)\Big). \qquad (A.15)$$

An induction argument on $n$ then gives the general case. Write $f = f_1$, $s = s_1$, and let

$$\xi_1 = 1_{(X_k = z)} \prod_{j=1}^m g_j(X_{k_j}), \qquad \xi_2 = 1_{(N_t = k)} \prod_{i=1}^m 1_{(N_{t_i} = k_i)},$$

so that $\xi_1 \in \sigma(X_j, j \le k)$ and $\xi_2 \in \sigma(N_s, s \le t)$. Then

$$E^x\big(\xi_1 \xi_2\, f(\widetilde{Y}_{t+s})\big) = E^x\Big(\xi_1 \xi_2 \sum_{i=0}^\infty 1_{(N_{t+s} - N_t = i)}\, f(X_{k+i})\Big) = \sum_{i=0}^\infty E^x\big(\xi_1 \xi_2\, 1_{(N_{t+s} - N_t = i)}\, f(X_{k+i})\big) = \sum_{i=0}^\infty E^x\big(\xi_2\, 1_{(N_{t+s} - N_t = i)}\big)\, E^x\big(\xi_1 f(X_{k+i})\big).$$

Here we used the independence of $X$ and $N$ in the last line. Using the Markov properties for $N$ and $X$ we then obtain

$$E^x\big(\xi_1 \xi_2\, f(\widetilde{Y}_{t+s})\big) = \sum_{i=0}^\infty E^x(\xi_2)\, e^{-s} \frac{s^i}{i!}\, E^x\big(\xi_1 E^z f(X_i)\big) = E^x(\xi_1 \xi_2) \sum_{i=0}^\infty e^{-s} \frac{s^i}{i!}\, P^i f(z) = E^x\big(\xi_1 \xi_2\, Q_s f(z)\big).$$

This completes the proof of (A.15), and hence of the Markov property.

A second method of constructing $Y$ is from the transition densities $q_t(x, y)$ defined in (5.2). This method has the advantage of not requiring as much detailed work on the exponential distribution, but it is more abstract and needs more measure theory. Let $\Omega = V^{\mathbb{R}_+}$, the space of all functions from $\mathbb{R}_+$ to $V$. As before we define coordinate maps $Z_t$ by setting $Z_t(\omega) = \omega(t)$. Let $\mathcal{F}^o = \sigma\big(Z_t^{-1}(\{x\}), x \in V, t \in \mathbb{R}_+\big)$. Let $\mathcal{T}$ be the set of all finite subsets of $\mathbb{R}_+$. For $T \in \mathcal{T}$ let $\mathcal{F}^o(T) = \sigma(Z_t, t \in T)$. Let $T \in \mathcal{T}$, and write $T = \{t_1, \dots, t_m\}$ where $0 < t_1 < \cdots < t_m$. For $x_0 \in V$ define the probability measure $P_T^{x_0}$ on $(\Omega, \mathcal{F}^o(T))$ by

$$P_T^{x_0}\big(\{\omega : \omega(t_i) = x_i, 1 \le i \le m\}\big) = \prod_{i=1}^m q_{t_i - t_{i-1}}(x_{i-1}, x_i)\, \mu_{x_i}.$$

Using the Chapman–Kolmogorov equation (5.3) one has

$$\sum_z q_{t_i - t_{i-1}}(x_{i-1}, z)\, q_{t_{i+1} - t_i}(z, x_{i+1})\, \mu_z = q_{t_{i+1} - t_{i-1}}(x_{i-1}, x_{i+1}),$$

and it follows that the family $(P_T^x, T \in \mathcal{T})$ is consistent – that is, if $T_1 \subset T_2$ then

$$P_{T_2}^x \big|_{\mathcal{F}^o(T_1)} = P_{T_1}^x.$$

The Kolmogorov extension theorem (see [Dur], App. A.3) then implies that there exists a probability $P^x$ on $(\Omega, \mathcal{F}^o)$ such that $P^x \big|_{\mathcal{F}^o(T)} = P_T^x$ for each

$T \in \mathcal{T}$. Under the probability measure $P^x$ the process $Z$ is then a Markov process with semigroup $Q_t$.

A reader who sees this construction here for the first time may feel a sense of unease. On one hand, the space $\Omega$ is very big – much larger than the spaces usually studied in probability, which have the same cardinality as $\mathbb{R}$. Further, the construction appears to be too quick and general, and does not use any specific properties of the process. This unease is justified: while what has been done above is correct, it does not give us a useful Markov process. In Lemma 5.21, for example, we look at the first exit time $\tau_Y(x, r)$ of $Y$ from a ball. However, the corresponding time $\tau_Z(x, r) = \inf\{t > 0 : Z_t \notin B(x, r)\}$ is not measurable with respect to $\mathcal{F}^o$. This is easy to prove: the event $\{\tau_Z(x, r) < 1\}$ depends on $Z$ at uncountably many time coordinates, while it is not hard to verify that any event in $\mathcal{F}^o$ can only depend on countably many time coordinates.

The solution is to regularise the process $Z$. For simplicity we just do this on the interval $[0, 1]$, but the same argument works on any compact interval, and so (by an easy argument) on $\mathbb{R}_+$. Let $D_n = \{k 2^{-n}, k \in \mathbb{Z}_+\}$, and set $D = \cup_n D_n$. We write $Z^D$ for the process $Z$ restricted to the time set $D_+ = D \cap [0, \infty)$. Let $J(a, b)$ be the number of jumps of $Z^D$ in the interval $[a, b]$, defined by

$$J(a, b, n) = \max\{k : \text{there exist } t_1 < t_2 < \cdots < t_k \text{ with } t_i \in D_n \cap [a, b] \text{ and } Z_{t_{i+1}} \ne Z_{t_i} \text{ for } i = 1, \dots, k-1\},$$
$$J(a, b) = \sup_n J(a, b, n).$$

Lemma A.19 (a) For any $n \ge 1$ we have $P(J(t, t+h, n) \ge 1) \le h$.
(b) For $t \ge 0$, $h > 0$, $n \ge 0$, $P(J(t, t+h, n) \ge 2) \le h^2$.
(c) For $h \le \tfrac12$ we have

$$P^x\Big(\sup_{0 \le t \le 1} J(t, t+h) \ge 2\Big) \le ch.$$

(d) There exists a r.v. $H$ on $(\Omega, \mathcal{F}^o, P^x)$ with $P^x(H > 0) = 1$ such that $J(t, t+H) \in \{0, 1\}$ for all $t \in [0, 1]$.

Proof (a) An easy calculation using the transition density of $Z$ gives that $P^x(Z_{t+h} \ne Z_t) \le h$. Write $D_n \cap [t, t+h] = \{t_1, t_2, \dots, t_m\}$ with $t_{i+1} = t_i + 2^{-n}$, and set $F_j = \{Z_{t_{j+1}} \ne Z_{t_j}\}$. Then

$$P^x(J(t, t+h, n) \ge 1) = P^x\big(\cup_{j=1}^{m-1} F_j\big) \le m 2^{-n} \le h.$$


(b) If $J(t, t+h, n) \ge 2$ then two distinct events $F_i$ must occur. The Markov property gives that $P^x(F_i \cap F_j) \le 2^{-2n}$, so

$$P^x(J(t, t+h, n) \ge 2) \le \sum_i \sum_{j > i} P^x(F_i \cap F_j) \le m^2 2^{-2n} \le h^2.$$

(c) As $J(t, t+h, n)$ is increasing in $n$, it follows from (b) that $P^x(J(t, t+h) \ge 2) \le h^2$. Let $G_k = \{J(kh, (k+2)h) \ge 2\}$, for $k = 0, 1, \dots, \lceil h^{-1} \rceil$. Then

$$P^x\Big(\sup_{0 \le t \le 1} J(t, t+h) \ge 2\Big) \le \sum_k P^x(G_k) \le (2h)^2 h^{-1} \le ch.$$

(d) This follows from (c) by a straightforward argument using Borel–Cantelli, by considering the events $\{\sup_{0 \le t \le 1} J(t, t + 2^{-k}) \ge 2\}$.

Lemma A.19 proves that on the event $\{H > 0\}$, which has probability one, the jumps of $Z|_{D \cap [0,1]}$ do not accumulate. Therefore we can define

$$\overline{Z}_t = \lim_{p \in D,\ p \downarrow t} Z_p, \quad t \in [0, 1).$$

Lemma A.20 (a) The process $\overline{Z}$ is right continuous with left limits on $[0, 1)$, $P^x$-a.s.
(b) For $t \in [0, 1)$, $P^x(\overline{Z}_t = Z_t) = 1$.

We call the process $\overline{Z}$ a right-continuous modification of $Z$. We can now move the process $(\overline{Z}, P^x)$ onto the canonical space $D(\mathbb{R}_+, V)$ to obtain the continuous time simple random walk. We note that with this construction the Markov property at a fixed time $t$ is straightforward to verify.

We now give the strong Markov property for $Y$.

Definition A.21 A random variable $T : \Omega \to [0, \infty]$ is a stopping time if $\{T \le t\} \in \mathcal{F}_t$ for all $t \ge 0$. We define the $\sigma$-field $\mathcal{F}_T$ by $\mathcal{F}_T = \{F \in \mathcal{F} : F \cap \{T \le t\} \in \mathcal{F}_t \text{ for all } t \ge 0\}$. One can check that if $S \le T$ then $\mathcal{F}_S \subset \mathcal{F}_T$, and that $T$ is $\mathcal{F}_T$-measurable. As in the discrete time case we define $\theta_T(\omega) = \theta_{T(\omega)}(\omega)$ when $T(\omega) < \infty$.


Theorem A.22 (Strong Markov property in continuous time) Let $Y$ be as above, $T$ be a stopping time, and $\xi \in b\mathcal{F}_T$ and $\eta \in b\mathcal{F}$. Then

$$E^x(\xi(\eta \circ \theta_T); T < \infty) = E^x(\xi\, E^{Y_T}(\eta); T < \infty). \qquad (A.16)$$

Proof Replacing $T$ by $M \wedge T$ we can assume that $P^x(T < \infty) = 1$. Let $T_n = \min\{p \in D_n : p \ge T\}$. Then $T_n$ is a stopping time, $T_n \ge T$, and $\lim_n T_n = T$. By a monotone class argument, as in the case of the Markov property at a fixed time, it is enough to prove (A.16) when

$$\eta = \prod_{i=1}^n f_i(Y_{s_i}),$$

where the $f_i$ are bounded. We first prove (A.16) for $T_n$. Let $U_k = 1_{(T_n = k 2^{-n})}$. Then $U_k \in \mathcal{F}_{k 2^{-n}}$ and, since $\xi \in b\mathcal{F}_{T_n}$, we have $\xi U_k \in \mathcal{F}_{k 2^{-n}}$. So, by the Markov property for $Y$ at the times $k 2^{-n}$,

$$E^x\big(\xi(\eta \circ \theta_{T_n})\big) = \sum_k E^x\big(\xi U_k (\eta \circ \theta_{k 2^{-n}})\big) = \sum_k E^x\big(\xi U_k\, E^{Y_{k 2^{-n}}}(\eta)\big) = E^x\big(\xi\, E^{Y_{T_n}}(\eta)\big).$$

Since $Y$ is right continuous, $Y_{T_n} \to Y_T$, and therefore we have

$$\lim_n \eta \circ \theta_{T_n} = \eta \circ \theta_T, \qquad \lim_n E^{Y_{T_n}}(\eta) = E^{Y_T}(\eta).$$

Using dominated convergence, we then obtain (A.16).
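Returning to Definition A.17, the construction $\widetilde{Y}_t = X_{N_t}$ can be sketched as a simulation: run the discrete walk, taking one step at each arrival of an independent rate-1 Poisson clock. The sampler below is an illustration only (with SRW on $\mathbb{Z}$ as the discrete skeleton); the function names are not from the text.

```python
import random

random.seed(7)

def ctrw_position(t, step, x0=0):
    """Sample Y_t = X_{N_t}: one discrete step per arrival of a rate-1 clock."""
    pos, s = x0, random.expovariate(1.0)   # s = time of next Poisson jump
    while s <= t:
        pos = step(pos)                    # one discrete-walk step per jump
        s += random.expovariate(1.0)
    return pos

# SRW on Z as the discrete skeleton
step = lambda x: x + random.choice([-1, 1])
samples = [ctrw_position(2.0, step) for _ in range(2000)]
assert all(isinstance(y, int) for y in samples)
```

By symmetry $E^0 Y_t = 0$, so the empirical mean of `samples` should be close to zero; its variance is $E N_t = t$, matching Theorem A.16(a).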

A.4 Invariant and Tail σ-fields

Given probability measures $P$ and $Q$ on a measure space $(M, \mathcal{M})$, we define the total variation norm by

$$\|P - Q\|_{TV(\mathcal{M})} = \sup_{A \in \mathcal{M}} \big(P(A) - Q(A)\big). \qquad (A.17)$$

When the $\sigma$-field is clear from the context we will sometimes write TV rather than TV($\mathcal{M}$). It is straightforward to verify that there exists a set $A \in \mathcal{M}$ which attains the supremum in (A.17). If $M = V$ and $\mathcal{M}$ is the $\sigma$-field of all subsets of $V$, and


$\nu$ and $\nu'$ are measures on $V$, then we can take $A = \{x : \nu_x > \nu'_x\}$ and deduce that

$$\|\nu - \nu'\|_{TV} = \sum_{x \in V} (\nu_x - \nu'_x)^+ = \tfrac12 \sum_{x \in V} |\nu_x - \nu'_x|.$$
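The half-$\ell^1$ identity above is easy to check on a finite state space; the distributions below are arbitrary illustrations.

```python
# On a countable V the supremum in (A.17) is attained at A = {x: nu > nu'},
# giving the positive-part and half-L1 formulas.
def tv(nu, nu2):
    support = set(nu) | set(nu2)
    return 0.5 * sum(abs(nu.get(x, 0.0) - nu2.get(x, 0.0)) for x in support)

nu  = {'a': 0.5, 'b': 0.3, 'c': 0.2}
nu2 = {'a': 0.2, 'b': 0.3, 'c': 0.5}
# sum of positive parts over A = {x : nu_x > nu2_x}
pos = sum(max(nu[x] - nu2[x], 0.0) for x in nu)
assert abs(tv(nu, nu2) - pos) < 1e-12
assert abs(tv(nu, nu2) - 0.3) < 1e-12
```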

Remark A.23 There are different normalisations of the total variation distance. For example, [De] defines it by

$$\sup\{E_P(Z) - E_Q(Z) : |Z| \le 1,\ Z \text{ is } \mathcal{M}\text{-measurable}\}, \qquad (A.18)$$

which gives a value twice that in (A.17).

Lemma A.24 Let $(\Omega, \mathcal{G}_0)$ be a measure space and $P$, $Q$ be probability measures on $(\Omega, \mathcal{G}_0)$. Let $(\mathcal{G}_n, n \in \mathbb{Z}_+)$ be a decreasing filtration, and let $\mathcal{G}_\infty = \cap_n \mathcal{G}_n$. Then

$$\lim_n \|P - Q\|_{TV(\mathcal{G}_n)} = \|P - Q\|_{TV(\mathcal{G}_\infty)}.$$

Proof Suppose initially that $Q \ll P$ on $\mathcal{G}_0$. Let $Z_n$ be the Radon–Nikodym derivative $dQ/dP$ on $\mathcal{G}_n$. Since for $G \in \mathcal{G}_n$ we have $Q(G) = E_P(1_G Z_n)$, it is easy to check that a set $A$ which attains the supremum in (A.17) is given by $A = \{Z_n < 1\}$. So

$$\|P - Q\|_{TV(\mathcal{G}_n)} = P(Z_n < 1) - E_P\big(Z_n 1_{(Z_n < 1)}\big).$$

Lemma A.25
(a) $F \in \mathcal{G}_n$ if and only if there exists $F_n \in \mathcal{G}_0$ such that $F = \theta^{-n}(F_n)$.
(b) If $F \in \mathcal{T}$ and $P^\nu(F) = 0$ then $P^\nu(\theta^{-n}(F)) = 0$ for all $n \ge 0$.
(c) If $F \in \mathcal{T}$ and $P^\nu(F \triangle \theta^{-1}(F)) = 0$ then

$$\widetilde{F} = \limsup_n \theta^{-n}(F) = \bigcap_{n=0}^\infty \bigcup_{m=n}^\infty \theta^{-m}(F) \in \mathcal{I},$$

and $P^\nu(F \triangle \widetilde{F}) = 0$.

Proof (a) If $F = \{X_{n+k} \in A_k, 1 \le k \le m\}$ then $F = \theta^{-n}(F_n)$, where $F_n = \{X_k \in A_k, 1 \le k \le m\}$. The general case then follows by a monotone class argument.
(b) Let $F \in \mathcal{T}$ with $P^\nu(F) = 0$. Since $\nu \in \mathcal{M}_>$ we have $P^x(F) = 0$ for all $x$. So by the Markov property $E^\nu(1_F \circ \theta_n) = E^\nu\big(E^{X_n}(1_F)\big) = 0$, and thus, using (1.39), $P^\nu(\theta^{-n}(F)) = 0$.
(c) If $\omega \in \widetilde{F}$ then there exists $n$ such that $\omega \in \theta^{-m}(F)$ for all $m \ge n$. So $\theta(\omega) \in \theta^{-(m-1)}(F)$ for all $m - 1 \ge n - 1$, and thus $\theta(\omega) \in \widetilde{F}$. Similarly if $\theta(\omega) \in \widetilde{F}$ then $\omega \in \widetilde{F}$. So $\theta^{-1}(\widetilde{F}) = \widetilde{F}$, and $\widetilde{F} \in \mathcal{I}$. We have, using (b),

$$P^\nu(F \triangle \theta^{-2}(F)) = E^\nu|1_F - 1_{\theta^{-2}(F)}| \le E^\nu|1_F - 1_{\theta^{-1}(F)}| + E^\nu|1_{\theta^{-1}(F)} - 1_{\theta^{-2}(F)}| = 0,$$


and similarly $P^\nu(\theta^{-i}(F) \triangle \theta^{-j}(F)) = 0$ if $i \ne j$. Now let

$$G_{n,m} = \bigcup_{k=n}^m \theta^{-k}(F).$$

Then

$$E^\nu\big|1_F - 1_{G_{n,m}}\big| = E^\nu\Big|1_F - \max_{n \le k \le m} 1_{\theta^{-k}(F)}\Big| = 0.$$

Taking the limits as $m \to \infty$ and then $n \to \infty$ gives that $P^\nu(F \triangle \widetilde{F}) = 0$.

Let $\nu$, $\nu'$ be probability measures on $V$.

Lemma A.26 (See [Kai], Lem. 2.1) Then

$$\|P^\nu - P^{\nu'}\|_{TV(\mathcal{G}_n)} = \|\nu P^n - \nu' P^n\|_{TV}. \qquad (A.19)$$

Further,

$$\lim_n \|\nu P^n - \nu' P^n\|_{TV} = \|P^\nu - P^{\nu'}\|_{TV(\mathcal{T})}.$$

Proof Let $n \ge 0$. Then, since $\sigma(X_n) \subset \mathcal{G}_n$, we have

$$\|P^\nu - P^{\nu'}\|_{TV(\mathcal{G}_n)} \ge \|P^\nu - P^{\nu'}\|_{TV(\sigma(X_n))} = \|\nu P^n - \nu' P^n\|_{TV}.$$

Let $G \in \mathcal{G}_n$. Then $G = \theta^{-n}(G_n)$ for some $G_n \in \mathcal{G}_0$, and

$$P^\nu(G) - P^{\nu'}(G) = \sum_{y \in V} \big((\nu P^n)_y - (\nu' P^n)_y\big) P^y(G_n) \le \sum_{y \in V} \big((\nu P^n)_y - (\nu' P^n)_y\big)^+ = \|\nu P^n - \nu' P^n\|_{TV},$$

which proves (A.19). Using Lemma A.24 completes the proof.

which proves (A.19). Using Lemma A.24 completes the proof. Definition A.27 Let ν be a measure on V. We say that I = T Pν -a.s. if for any F ∈ T there exists F ∈ I such that Eν |1 F − 1 F | = 0. Write δ x for the probability measure on V which assigns mass 1 to {x}. Set α(x) = sup ||δ x P n − δ x P n+1 ||TV , n

α = sup α(x). x∈V

The following result is often called the ‘02 law’; see [De]. We have ‘1’ rather than ‘2’ since we defined the total variation distance by (A.17) rather than (A.18).

Theorem A.28 We have

$$\alpha = \sup_\nu \lim_n \|\nu P^n - \nu P^{n+1}\|_{TV}, \quad \text{and } \alpha \text{ is either } 0 \text{ or } 1.$$

Further, $\alpha = 0$ if and only if $\mathcal{I} = \mathcal{T}$ $P^\nu$-a.s. for all probability measures $\nu$ on $V$.

Proof Suppose first that $\mathcal{I} = \mathcal{T}$ $P^\nu$-a.s. for all $\nu$. Let $\nu$ be a probability measure on $V$, and suppose initially that $\nu P \ll \nu$. Then by Lemma A.26

$$\lim_n \|\nu P^n - \nu P^{n+1}\|_{TV} = \|P^\nu - P^{\nu P}\|_{TV(\mathcal{T})} = \sup_{F \in \mathcal{T}} \big(E^\nu(1_F) - E^{\nu P}(1_F)\big). \qquad (A.20)$$

Let $F \in \mathcal{T}$. Then by hypothesis there exists $F' \in \mathcal{I}$ such that $P^\nu(F \triangle F') = 0$. Let $Z = 1_F$, $Z' = 1_{F'}$. We have $P^\nu(Z = Z') = 1$, and so since $\nu P \ll \nu$ it follows that $P^{\nu P}(Z = Z') = 1$. Then by the Markov property

$$E^{\nu P}(Z) = E^{\nu P}(Z') = \sum_x \sum_y \nu_x P(x, y) E^y(Z') = E^\nu E^{X_1} Z' = E^\nu(Z' \circ \theta_1) = E^\nu(Z') = E^\nu(Z).$$

Thus the left side of (A.20) is zero. For the general case we can replace $\nu$ by $\sum_n a_n \nu P^n$, where $\sum_n a_n = 1$ and $a_n > 0$ for all $n$.

Now suppose that there exists $\nu$ such that $\mathcal{I} \ne \mathcal{T}$ $P^\nu$-a.s. Replacing $\nu$ by $\sum_n a_n \nu P^n$ if necessary, we can assume that $\nu \in \mathcal{M}_>$ and so $\nu P \ll \nu$. Let $F \in \mathcal{T}$ be such that $P^\nu(F \triangle F') > 0$ for all $F' \in \mathcal{I}$. Then by Lemma A.25(c) we must have $P^\nu(F \triangle \theta^{-1}(F)) > 0$. Let $G = F - \theta^{-1}(F)$, so that $0 < P(G) < 1$ and $G \cap \theta^{-1}(G) = \emptyset$. Let $n \ge 0$. Since $G \in \mathcal{G}_n$ there exists $G_n \in \mathcal{G}_0$ such that $1_G = 1_{G_n} \circ \theta_n$. Set $h_n(x) = E^x(1_{G_n})$. Let $M_n = E^\nu(1_G \mid \mathcal{F}_n)$. Then $M$ is a bounded martingale and so $M_n \to 1_G$ a.s. and in $L^1$. By the Markov property

$$M_n = E^\nu(1_G \mid \mathcal{F}_n) = E^\nu(1_{G_n} \circ \theta_n \mid \mathcal{F}_n) = E^{X_n}(1_{G_n}) = h_n(X_n).$$

Similarly,

$$E^\nu(1_{\theta^{-1}(G)} \mid \mathcal{F}_n) = E^\nu(1_{G_{n-1}} \circ \theta_n \mid \mathcal{F}_n) = E^{X_n}(1_{G_{n-1}}) = h_{n-1}(X_n).$$

So we have $(h_n(X_n), h_{n-1}(X_n)) \to (1_G, 1_{\theta^{-1}(G)})$ a.s. and in $L^1(P^\nu)$. Hence, given $\varepsilon > 0$ there exist $n \ge 1$ and $x \in V$ such that $h_n(x) > 1 - \varepsilon$ and $h_{n-1}(x) < \varepsilon$. So for $k \ge 0$

$$1 - 2\varepsilon < h_n(x) - h_{n-1}(x) = E^x h_{n+k}(X_k) - E^x h_{n+k}(X_{k+1})$$

$$= P^{\delta^x P^k}(G_{n+k}) - P^{\delta^x P^{k+1}}(G_{n+k}) \le \|P^{\delta^x P^k} - P^{\delta^x P^{k+1}}\|_{TV(\mathcal{G}_0)} = \|\delta^x P^k - \delta^x P^{k+1}\|_{TV}.$$

This implies that $\alpha > 1 - 2\varepsilon$.

Corollary A.29 Let $(\Gamma, \mu)$ be a weighted graph. Then $\mathcal{T} = \mathcal{I}$ for the lazy random walk on $(\Gamma, \mu)$.

Proof Let $x \in V$. Since $P(x, x) \ge \tfrac12$, we have $\delta^x \ge \tfrac12 \delta^x P$, and therefore $P^x \ge \tfrac12 P^{\delta^x P}$. So $\alpha(x) \le \tfrac12$, and thus $\alpha \le \tfrac12$. By Theorem A.28 therefore $\alpha = 0$ and $\mathcal{T} = \mathcal{I}$.

Corollary A.30 Let $(\Gamma, \mu)$ be recurrent, and not bipartite. Then for any $\nu$ both $\mathcal{T}$ and $\mathcal{I}$ are trivial.

Proof $\mathcal{I}$ is trivial by Theorem 1.48. Since $(\Gamma, \mu)$ is not bipartite there exist $k \ge 1$ and a cycle $C$ of length $2k + 1$. Let $z$ be a point in $C$ and $w \sim z$. Let $x \in V$, $X_n$ be the SRW started at $x$, and $X'_n$ be the SRW started with distribution $\delta^x P$. We can couple $X$ and $X'$ so that $X'_k = X_{k+1}$ for $k + 1 \le T_z(X)$. As $(\Gamma, \mu)$ is recurrent, $X$ will hit $z$ with probability 1. Let $S = T_z(X) + 2k$. After the hit on $z$, we run the processes $X$ and $X'$ independently. With positive probability, $X'$ will move back and forth $k$ times between $z$ and $w$, while $X$ will go round the cycle $C$. So there exists $p > 0$ such that $P(X_S = X'_S) \ge 2p$. On the event $\{X_S = X'_S\}$ we then run $X$ and $X'$ together. We can find $n$ large enough so that $P(S > n) \le p$, and so we have $P(X_n = X'_n) \ge p$. This implies that $\alpha(x) \le 1 - p$ for all $x$, and so by Theorem A.28 we have that $\mathcal{T}$ is trivial.

A.5 Hilbert Space Results

We begin with some basic facts about operators in Hilbert spaces.

Lemma A.31 Let $T$ be a self-adjoint operator in a Hilbert space $H$. Then for all $n \ge 0$

$$\|T^n\| = \|T\|^n. \qquad (A.21)$$

Proof Since $\|AB\| \le \|A\| \cdot \|B\|$ for any linear operators on a Banach space, we have $\|T^n\| \le \|T\|^n$. Let $B = \{f \in H : \|f\| = 1\}$. If $n = 2m$ is even then

$$\|T^{2m}\| = \sup\{\langle T^{2m} f, g\rangle : f, g \in B\} \ge \sup\{\langle T^m f, T^m f\rangle : f \in B\} = \|T^m\|^2.$$

Hence (A.21) holds if $n = 2^k$ for any $k \ge 0$. If for some $n$ we have $\|T^n\| < \|T\|^n$ then, taking $k$ so that $n < 2^k$,

$$\|T^{2^k}\| \le \|T^n\|\, \|T^{2^k - n}\| < \|T\|^n\, \|T^{2^k - n}\| \le \|T\|^{2^k},$$

a contradiction.

For the operator P defined by (1.9), P2→2 = sup{P f, f : f 2 = 1}.

(A.22)

Proof Let B = { f : f 2 = 1}, B + = B ∩ C+ (V), and write ρ = P2→2 . Note that P f, g ≤ P| f |, |g|, so that using (1.10) we have ρ = sup{P f, g : f, g ∈ B + }. Let ε > 0 and f, g ∈ B + with P f, g ≥ ρ − ε2 . Then (as P is self-adjoint) ||P f − ρg||22 = P f − ρg, P f − ρg = P f 22 + ρ 2 g22 − 2ρP f, g ≤ 2ρ 2 − 2ρ(ρ − ε2 ) = 2ρε2 ≤ 2ε2 . Similarly Pg − ρ f 22 ≤ 2ε2 . So P f = ρg + h 1 and Pg = ρ f + h 2 with h j 2 ≤ 21/2 ε for j = 1, 2. Let u = ( f + g); as f, g ≥ 0 we have 1 ≤ u2 ≤ 2. So Pu = ρu + h, where h2 ≤ 23/2 ε. Therefore Pu, u = ρu22 + h, u ≥ ρu22 − 2(23/2 ε)u2 . So if v = u/u2 we have Pv, v ≥ ρ − 25/2 ε, proving (A.22). The following is a discrete version of the representation theorem of Beurling, Deny, and LeYan – see [FOT], Th. 3.2.1. Let V be a countable set and μ be a measure on V. Let A be a symmetric (real) bilinear form on L 2 (V) with the following properties: (D1) A ( f, f ) ≤ C0 || f ||22 , f ∈ L 2 , (D2) A ( f, f ) ≥ 0 for all f ∈ L 2 , (D3) A (1 ∧ f + , 1 ∧ f + ) ≤ A ( f + , f + ) ≤ A( f, f ), f ∈ L 2 . Property (D3) is called the Markov property of the quadratic form A . Theorem A.33 There exist Jx y , k x with Jx x = 0, Jx y = Jyx ≥ 0, x = y, k x ≥ 0 such that for f, g ∈ L 2 (V, μ) ( f (x)− f (y))(g(x)−g(y))Jx y + f (x)g(x)k x . (A.23) A ( f, g) = x

y

x
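The recipe extracted in the proof ($J_{xy} = -A_{xy}$ off the diagonal, $k_x = \sum_y A_{xy}$) can be checked on a finite state space. The $3 \times 3$ matrix below is an arbitrary illustration of a symmetric form matrix with nonpositive off-diagonal entries, not an example from the text.

```python
n = 3
# A[x][y] plays the role of A_{xy} = A(1_x, 1_y): symmetric, with
# nonpositive off-diagonal entries and nonnegative row sums.
A = [[1.0, -0.4, -0.6],
     [-0.4, 0.9, -0.5],
     [-0.6, -0.5, 1.2]]

J = [[-A[x][y] if x != y else 0.0 for y in range(n)] for x in range(n)]
k = [sum(A[x][y] for y in range(n)) for x in range(n)]   # killing terms

def form(f, g):
    # A(f, g) = sum_{x,y} f(x) A_{xy} g(y)
    return sum(f[x] * A[x][y] * g[y] for x in range(n) for y in range(n))

def representation(f, g):
    # right-hand side of (A.23)
    jump = 0.5 * sum(J[x][y] * (f[x] - f[y]) * (g[x] - g[y])
                     for x in range(n) for y in range(n))
    kill = sum(f[x] * g[x] * k[x] for x in range(n))
    return jump + kill

f, g = [1.0, -2.0, 0.5], [0.3, 1.1, -0.7]
assert abs(form(f, g) - representation(f, g)) < 1e-12
```

The identity holds exactly for any symmetric matrix with this $J$, $k$; the content of Theorem A.33 is that (D1)–(D3) force $J_{xy} \ge 0$ and $k_x \ge 0$.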


Proof Set $A_{xy} = \mathcal{A}(1_x, 1_y)$, so that for $f, g \in C_0(V)$

$$\mathcal{A}(f, g) = \sum_x \sum_y f(x) A_{xy} g(y).$$

Let $x, y \in V$ with $x \ne y$, and let $f = 1_x - \lambda 1_y$, where $\lambda > 0$. Then $f^+ = 1_x$, so by (D3)

$$A_{xx} = \mathcal{A}(1_x, 1_x) \le \mathcal{A}(f, f) = A_{xx} - 2\lambda A_{xy} + \lambda^2 A_{yy}.$$

Hence $A_{xy} \le 0$, so define

$$J_{xy} = -A_{xy}, \quad x \ne y, \qquad J_{xx} = 0.$$

Note that $J_{xy} = J_{yx}$ by the symmetry of $\mathcal{A}$. Now let $B \subset V$ be finite and $g \ge 0$ with $\mathrm{supp}(g) \subset B$. If $f = 1_B + \lambda g$, where $\lambda > 0$, then $f \wedge 1 = 1_B$. Using (D3) again we have

$$\mathcal{A}(1_B, 1_B) \le \mathcal{A}(1_B, 1_B) + 2\lambda \mathcal{A}(g, 1_B) + \lambda^2 \mathcal{A}(g, g),$$

and therefore

$$0 \le \mathcal{A}(1_B, g) = \sum_{x' \in B} g(x') \sum_{y \in B} A_{x'y}.$$

Taking $g = 1_x$ with $x \in B$ it follows that $\sum_{y \in B} A_{xy} \ge 0$ for any finite $B$, and therefore that

$$\sum_{y \ne x} J_{xy} \le A_{xx}.$$

Now set

$$k_x = \sum_y A_{xy} = A_{xx} - \sum_{y \ne x} J_{xy} \ge 0.$$

Then if $f, g \in C_0(V)$

$$\mathcal{A}(f, g) = \sum_x f(x)\Big(A_{xx} g(x) + \sum_{y \ne x} A_{xy} g(y)\Big) = \sum_x f(x) g(x) k_x + \sum_x f(x) \sum_{y \ne x} J_{xy}\big(g(x) - g(y)\big)$$
$$= \sum_x f(x) g(x) k_x + \tfrac12 \sum_{x,y} f(x) J_{xy}\big(g(x) - g(y)\big) + \tfrac12 \sum_{y,x} f(y) J_{yx}\big(g(y) - g(x)\big)$$
$$= \sum_x f(x) g(x) k_x + \tfrac12 \sum_{x,y} J_{xy}\big(g(x) - g(y)\big)\big(f(x) - f(y)\big).$$


This gives (A.23) for $f, g \in C_0(V)$. We now extend this to $L^2(V, \mu)$. For $B \subset V$ write

$$S(f, f; B) = \sum_{x \in B} f(x)^2 k_x + \tfrac12 \sum_{x \in B,\, y \in B} J_{xy}\big(f(x) - f(y)\big)^2.$$

Now let $B_n \uparrow\uparrow V$. Let $f \in L^2$ and $f_n = f 1_{B_n}$, so that $f_n \to f$ in $L^2(V, \mu)$. Hence, by (D1) and Fatou,

$$\mathcal{A}(f, f) = \lim_n \mathcal{A}(f_n, f_n) = \lim_n S(f_n, f_n; V) \le S(f, f; V).$$

On the other hand, if $m \ge n$ then $S(f_m, f_m; B_n) \le \mathcal{A}(f_m, f_m)$, and so

$$S(f, f; B_n) = \lim_{m \to \infty} S(f_m, f_m; B_n) \le \lim_{m \to \infty} \mathcal{A}(f_m, f_m) = \mathcal{A}(f, f).$$

Taking the limit as $n \to \infty$ we deduce that (A.23) holds when $f = g$. Finally by polarization (A.23) extends to $f, g \in L^2$.

Since (A.23) gives $\mathcal{A}(1_{B(x,2)}, 1_{\{x\}}) = k_x$ we have:

Corollary A.34 Let $\mathcal{A}$ satisfy (D1)–(D3). If $\mathcal{A}(1_{B(x,m)}, 1_{\{x\}}) = 0$ for some $m \ge 2$ then $\mathcal{A}$ satisfies the representation (A.23) with $k_x = 0$.

A.6 Miscellaneous Estimates

Lemma A.35  Let

$$I(\gamma, x) = \int_1^\infty e^{-x t^\gamma}\, dt.$$

Then

$$I(\gamma, x) \asymp x^{-1/\gamma} \quad \text{for } 0 < x \le 1, \tag{A.24}$$
$$I(\gamma, x) \asymp x^{-1} e^{-x} \quad \text{for } x \ge 1. \tag{A.25}$$

Proof  Let $x \in (0, 1]$. Then

$$x^{1/\gamma} I(\gamma, x) = \int_{x^{1/\gamma}}^\infty e^{-s^\gamma}\, ds,$$

and (A.24) follows. Now let $x \ge 1$. Then, making the substitution $v = x(t^\gamma - 1)$,

$$I(\gamma, x) = \gamma^{-1} x^{-1} e^{-x} \int_0^\infty e^{-v} (1 + v/x)^{-1 + 1/\gamma}\, dv,$$

giving (A.25).

Lemma A.36  Let $\lambda > 1$, $\gamma > 0$, $x > 0$, and

$$S(\lambda, \gamma, x) = \sum_{n=0}^\infty \lambda^n \exp(-x \lambda^{n\gamma}).$$

Then there exist $c_i = c_i(\lambda, \gamma)$ such that

$$c_1 x^{-1/\gamma} \le S(\lambda, \gamma, x) \le c_2 x^{-1/\gamma}, \qquad x \in (0, 1],$$
$$c_1 x^{-1} e^{-x} \le S(\lambda, \gamma, x) \le c_2 x^{-1} e^{-x/\lambda^\gamma}, \qquad x \ge 1.$$

Proof  We have

$$I(\gamma, x) = \sum_{n=0}^\infty \int_{\lambda^n}^{\lambda^{n+1}} e^{-x t^\gamma}\, dt,$$

and estimating each term in the sum gives

$$(\lambda - 1)^{-1} I(\gamma, x) \le S(\lambda, \gamma, x) \le (\lambda - 1)^{-1} I(\gamma, x \lambda^{-\gamma}).$$

The bounds on $S$ then follow easily from Lemma A.35.
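As a quick numerical sanity check of Lemma A.36 (illustrative code, not part of the text; the parameter values and tolerance are ours), one can sum the series directly and verify that $S(\lambda, \gamma, x)\, x^{1/\gamma}$ stays within a bounded range as $x \downarrow 0$:

```python
import math

def S(lam, gamma, x):
    """S(lam, gamma, x) = sum_{n>=0} lam^n exp(-x * lam^(n*gamma))."""
    total, n = 0.0, 0
    while n < 10000:
        term = lam ** n * math.exp(-x * lam ** (n * gamma))
        total += term
        n += 1
        if term == 0.0:          # exp has underflowed; the tail is negligible
            break
    return total

# For x in (0, 1] the lemma gives c1 <= S * x^(1/gamma) <= c2.
lam, gamma = 2.0, 2.0
ratios = [S(lam, gamma, x) * x ** (1 / gamma)
          for x in (1e-6, 1e-4, 1e-2, 1.0)]
assert max(ratios) / min(ratios) < 10.0
```

The sum converges after a handful of terms because the Gaussian-type factor $\exp(-x\lambda^{n\gamma})$ decays doubly exponentially in $n$.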

A.7 Whitney Type Coverings of a Ball

In this section we construct a covering of a ball with some quite specific properties, which will be used to prove a weighted Poincaré inequality. The construction is similar to that in [SC2], but has some significant differences close to the boundary of the ball. We assume that $(\Gamma, \mu)$ is a weighted graph which satisfies volume doubling; see Definition 6.30. If $C_V$ is the doubling constant, set

$$\theta = \frac{\log C_V}{\log 2}. \tag{A.26}$$

Lemma A.37  Suppose that $(\Gamma, \mu)$ satisfies volume doubling. Then if $x, y \in V$ and $1 \le r \le R$,

$$\frac{V(y, R)}{V(x, r)} \le C_V^2 \Big( \frac{R + d(x, y)}{r} \Big)^\theta.$$

Proof  This is a straightforward and informative exercise in the use of the volume doubling condition.

Lemma A.38  Suppose that $(\Gamma, \mu)$ satisfies volume doubling. Let

$$\mathcal B = \{ B(x_i, r_i), \; i \in I \}$$

be a family of disjoint balls in $(\Gamma, \mu)$, with $R_1 \le r_i \le R_2$ for all $i$. Let $\lambda \ge 1$. Then

$$|\{ i : x \in B(x_i, \lambda r_i) \}| \le C_V^2 (1 + 2\lambda)^\theta \Big( \frac{R_2}{R_1} \Big)^\theta.$$

Proof  Let $I' = \{ i : x \in B(x_i, \lambda r_i) \}$ and $N = |I'|$. If $i \in I'$ then $B(x_i, r_i) \subset B(x, (1 + \lambda) r_i)$. So

$$\mu(B(x, (1 + \lambda) R_2)) \ge \mu\Big( \bigcup_{i \in I'} B(x_i, r_i) \Big) = \sum_{i \in I'} \mu(B(x_i, r_i)) \ge N \min_{i \in I'} V(x_i, r_i).$$

Using Lemma A.37 to bound $V(x, (1 + \lambda) R_2)/V(x_i, r_i)$ we obtain

$$N \le C_V^2 \max_i \Big( \frac{\lambda r_i + (1 + \lambda) R_2}{r_i} \Big)^\theta,$$

from which the result follows easily.

For the rest of this section we fix a ball $B = B(o, R)$. Set

$$\rho(x) = R - d(o, x), \tag{A.27}$$

so that $d(x, \partial_i B) \ge \rho(x)$. We now fix $\lambda_1 \ge 10$ and a constant $R_0 \ge 10$, and assume that $\lambda_1 R_0 \le R$. We also assume that both $R$ and $\lambda_1 R_0$ are integers. We now choose balls $B_i = B(x_i, r_i)$, $i = 1, \dots, N$ as follows. We take $x_1 = o$ and $r_1 = R/\lambda_1$. Given $B_i$ for $i = 1, \dots, k-1$ set $A_k = B(o, R - \lambda_1 R_0) - \bigcup_{i=1}^{k-1} B_i$. If $A_k$ is empty then we let $N = k - 1$ and stop; otherwise we choose $x_k \in A_k$ to minimise $d(o, x_k)$, and set

$$r_k = \frac{\rho(x_k)}{\lambda_1} = \frac{R - d(o, x_k)}{\lambda_1}.$$

We also define $\widetilde B_i = B(x_i, \tfrac12 r_i)$.

Lemma A.39  (a) The balls $B_i$ cover $B(o, R - \lambda_1 R_0)$.
(b) We have $r_1 \ge r_2 \ge \dots \ge r_N \ge R_0$.
(c) The balls $\widetilde B_i$ are disjoint.
(d) If $y \in B(x_i, \theta r_i)$ then

$$(\lambda_1 - \theta) r_i \le \rho(y) \le (\lambda_1 + \theta) r_i. \tag{A.28}$$

Proof  (a) is immediate from the construction.
(b) That the $r_i$ are decreasing is clear from the construction. Since $x_N \in B(o, R - \lambda_1 R_0)$ we have $\rho(x_N) \ge \lambda_1 R_0$ and hence $r_N \ge R_0$.


(c) Suppose that $y \in \widetilde B_i \cap \widetilde B_j$ with $i < j$. Then

$$d(x_i, x_j) \le d(x_i, y) + d(y, x_j) \le \tfrac12 (r_i + r_j) \le r_i,$$

a contradiction since then $x_j \in B_i$.
(d) If $y \in B(x_i, \theta r_i)$ then $|\rho(y) - \rho(x_i)| \le \theta r_i$, and (A.28) follows.

We call a ball $B_i$ a boundary ball if

$$r_i \le \frac{\lambda_1 R_0}{\lambda_1 - 1}.$$

It is clear from the construction that there exists $M$ such that $B_i$ is a boundary ball if and only if $M + 1 \le i \le N$. For $x, y \in V$ we write $\gamma(x, y)$ for a shortest path connecting $x$ to $y$. (If this path is not unique, then we make some choice among the set of shortest paths.) For each $i \ge M + 1$ we define

$$\widehat B_i = B_i \cup \{ z \in B : \text{there exists } y \in \gamma(o, z) \cap B_i \text{ such that } \rho(y) = \lambda_1 R_0 \}.$$

Lemma A.40  (a) If $B_i$ is not a boundary ball then $\rho(x) > \lambda_1 R_0$ for all $x \in B_i$.
(b) If $z \in B \cap \{ x : \rho(x) \le \lambda_1 R_0 \}$ then there exists a boundary ball $B_i$ such that $z \in \widehat B_i$.
(c) If $B_i$ is a boundary ball and $z \in \widehat B_i$ then there exists a path inside $\widehat B_i$ of length at most $(\lambda_1 + 2) R_0$ connecting $x_i$ and $z$. In particular, $\widehat B_i \subset B(x_i, (\lambda_1 + 2) R_0)$.

Proof  (a) This is immediate from (A.28).
(b) Let $z \in B$. Let $y$ be the unique point on the path $\gamma(z, o)$ such that $\rho(y) = \lambda_1 R_0$. By Lemma A.39(a) there exists $i$ such that $y \in B_i$. Then $d(y, x_i) \le r_i$, so

$$\lambda_1 r_i = \rho(x_i) \le \rho(y) + r_i \le \lambda_1 R_0 + r_i,$$

and thus $B_i$ is a boundary ball and $z \in \widehat B_i$.
(c) Let $z \in \widehat B_i - B_i$ and let $y$ be as above. Since $B_i$ is a boundary ball,

$$d(z, x_i) \le d(z, y) + d(y, x_i) \le \lambda_1 R_0 + r_i \le \frac{\lambda_1^2 R_0}{\lambda_1 - 1} \le (\lambda_1 + 2) R_0.$$

Further, the whole path $\gamma(z, y)$ is contained in $\widehat B_i$, and so $z$ is connected to $x_i$ by a path in $\widehat B_i$ of length at most $(2 + \lambda_1) R_0$.

We now define an ancestor function $a : \{2, \dots, N\} \to \{1, \dots, N\}$ which will give a tree structure on the set of indices $\{1, \dots, N\}$. Let $i \ge 2$. We need to consider two cases.


Case 1: If $d(o, x_i) \ge 2 r_i$ then let $z_i$ be the first point on the path $\gamma(x_i, o)$ with $d(z_i, x_i) \ge 2 r_i$, so that $2 r_i \le d(z_i, x_i) < 2 r_i + 1$. Since $\rho(z_i) > \rho(x_i) \ge \lambda_1 R_0$, there exists $j$ such that $z_i \in B_j$. (If there is more than one such $j$ we choose the smallest.) We have $|\rho(z_i) - \rho(x_j)| \le r_j$, and therefore

$$2 r_i + \lambda_1 r_i \le d(z_i, x_i) + \rho(x_i) = \rho(z_i) \le 2 r_i + \lambda_1 r_i + 1,$$
$$2 r_i + \lambda_1 r_i - r_j \le \lambda_1 r_j = \rho(x_j) \le 2 r_i + \lambda_1 r_i + 1 + r_j.$$

Thus

$$\frac{2 + \lambda_1}{\lambda_1 + 1}\, r_i \le r_j \le \frac{2 + \lambda_1}{\lambda_1 - 1}\, r_i + \frac{1}{\lambda_1 - 1}.$$

In particular, we have $r_j > r_i$, so $j < i$. We set $a(i) = j$.

Case 2: If $d(o, x_i) < 2 r_i$ then we set $a(i) = 1$. Since $2 r_i > d(x_i, o) > r_1$, we obtain

$$\lambda_1 r_1 - 2 r_i = R - 2 r_i \le \lambda_1 r_i \le \lambda_1 r_1 - r_1,$$

and hence

$$\frac{\lambda_1}{\lambda_1 - 1}\, r_i \le r_1 \le \frac{\lambda_1 + 2}{\lambda_1}\, r_i.$$

Thus in both cases there exist $\delta_1$ and $\delta_2$ depending only on $\lambda_1$ so that if $j = a(i)$ then

$$(1 + \delta_1) r_i \le r_j \le (1 + \delta_2) r_i; \tag{A.29}$$

further, we can take $\delta_1 = (1 + \lambda_1)^{-1}$ and $\delta_2 = 4/9$. We write $a^k$ for the $k$th iterate of $a$. Using this ancestor function, starting at $B_i$ we obtain a sequence $B_i, B_{a(i)}, B_{a^2(i)}, \dots, B_{a^m(i)}$ of successively larger balls, which ends with the ball $B_1$.

Lemma A.41  (a) If $j = a(i)$ then $d(x_i, x_j) \le 1 + 2 r_i + r_j$, and

$$\mu\big( B(x_i, 2 r_i) \cap B(x_j, 2 r_j) \big) \ge \frac{\mu(B_i) \vee \mu(B_j)}{C_V^3}.$$

(b) If $j = a^m(i)$ for some $m \ge 1$ then

$$d(x_i, x_j) \le \frac{4 r_j}{\delta_1}.$$

Consequently $B_i \subset B(x_j, K_1 r_j)$, where $K_1 = 5 + 4\lambda_1$.


Proof  (a) The first assertion is immediate from the triangle inequality and (A.29). In Case 1 above let $w$ be the first point on $\gamma(x_i, o)$ with $d(x_i, w) \ge 3 r_i/2$, and in Case 2 let $w = o$. Then $B(w, r_i/3) \subset B(x_i, 2 r_i) \cap B(x_j, 2 r_j)$, while $B_i \cup B_j \subset B(w, 8 r_i/3)$. Hence

$$\mu\big( B(x_i, 2 r_i) \cap B(x_j, 2 r_j) \big) \ge \mu(B(w, r_i/3)) \ge C_V^{-3} \mu(B(w, 8 r_i/3)) \ge C_V^{-3} \big( \mu(B_i) \vee \mu(B_j) \big).$$

(b) Set $s_k = r_{a^k(i)}$ for $k = 0, \dots, m$. Then since $s_k \le s_{k+1} (1 + \delta_1)^{-1}$, using (a),

$$d(x_i, x_j) \le \sum_{k=0}^{m-1} d(x_{a^k(i)}, x_{a^{k+1}(i)}) \le \sum_{k=0}^{m-1} (2 s_k + s_{k+1} + 1) \le \frac{7}{2} \sum_{k=0}^{m} s_k \le \frac{7}{2}\, s_m \sum_{k=0}^{m} (1 + \delta_1)^{k - m} \le \frac{4 s_m}{\delta_1}.$$

The final assertion now follows easily.

Definition A.42  Given the construction above, set

$$B_i^* = B(x_i, 2 r_i), \quad D_i = B(x_i, \tfrac12 \lambda_1 r_i), \quad D_i^* = B(x_i, K_1 r_i), \qquad \text{if } 1 \le i \le M,$$
$$B_i^* = D_i = \widehat B_i, \quad D_i^* = B(x_i, K_1 r_i), \qquad \text{if } M + 1 \le i \le N.$$

Note that we have for each $i$

$$B_i \subset B_i^* \subset D_i \subset B \cap D_i^*.$$

Lemma A.43  There exists a constant $K_2$ such that any $x \in B$ is contained in at most $K_2$ of the sets $D_i$.

Proof  It is enough to consider the families $\{D_i, i \le M\}$ and $\{D_i, i > M\}$ separately. For the first family, the result follows from Lemma A.38 on using Lemma A.39 to control $r_i$. For the second, since each $D_i$ is contained in a ball of radius $(2 + \lambda_1) R_0$, this is immediate by Lemma A.38.


A.8 A Maximal Inequality

We begin with a result, known as the 5B theorem, on ball coverings. See [Zie], Th. 1.3.1 for the case of a general metric space; in the discrete context here we can replace 5 by 3.

Notation  In this section and Section A.9 we will use the notation $\lambda B$ to denote the ball with the same centre as $B$ and $\lambda$ times the radius.

Theorem A.44  Let $\Gamma = (V, E)$ be a graph, and let $\mathcal B = \{ B(x_i, r_i), i \in I \}$ be a family of balls, with $R = \sup_i r_i < \infty$. Then there exists $J \subset I$ such that $B(x_j, r_j)$, $j \in J$, are disjoint, and

$$\bigcup_{i \in I} B(x_i, r_i) \subset \bigcup_{j \in J} B(x_j, 3 r_j).$$

Indeed, for any $i$ there exists $j \in J$ such that $B_i \subset B(x_j, 3 r_j)$.

Proof  Write $B_i = B(x_i, r_i)$. We can assume that the $r_i$ are integers. Now set for $1 \le r \le R$

$$I_r = \{ i : r_i = r \}.$$

We can find a maximal subset $I_R'$ of $I_R$ such that $\{ B_i, i \in I_R' \}$ are disjoint. For a general metric space one may need to use the Hausdorff maximal principle, but here, since $I_R$ is countable, it can be done by a simple explicit procedure. Given $I_R', \dots, I_{R-m+1}'$, let $I_{R-m}'$ be a maximal subset of $I_{R-m}$ such that $\{ B_i, i \in I_{R-m}' \}$ are disjoint, and also disjoint from

$$\bigcup \Big( B_i : i \in \bigcup_{j = R-m+1}^{R} I_j' \Big).$$

Now set $J = \bigcup_{j=1}^{R} I_j'$. By construction, the balls $B_j$, $j \in J$, are disjoint. We now show that $\{ B(x_j, 3 r_j), j \in J \}$ cover $\bigcup_I B_i$. Let $i \in I$. If $i \in J$ we are done. If not, then there exists $k \in J$ with $r_k \ge r_i$ such that $B_i \cap B_k \ne \emptyset$. Let $z \in B_i \cap B_k$. Then for any $y \in B_i$

$$d(y, x_k) \le d(y, x_i) + d(x_i, z) + d(z, x_k) \le 2 r_i + r_k \le 3 r_k,$$

so that $y \in B(x_k, 3 r_k)$.

We now let $\Gamma$ be a graph satisfying volume doubling. For $f \in C(V)$ set

$$M f(x) = \sup \Big\{ \frac{1}{V(y, r)} \sum_{z \in B(y, r)} |f(z)| \mu_z : x \in B(y, r) \Big\}.$$
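The selection in the proof of Theorem A.44 can be sketched in code (a simple sequential variant of the maximal-subset construction; the example data are ours), here on $\mathbb Z$, where two balls $B(x, r)$, $B(x', r')$ meet if and only if $|x - x'| \le r + r'$:

```python
import random

random.seed(1)
balls = [(random.randint(-50, 50), random.randint(1, 8)) for _ in range(40)]

# Greedily keep a ball (largest radii first) whenever it is disjoint
# from all balls kept so far.
kept = []
for x, r in sorted(balls, key=lambda b: -b[1]):
    if all(abs(x - xj) > r + rj for xj, rj in kept):
        kept.append((x, r))

# The kept balls are pairwise disjoint, and every B(x_i, r_i) lies in
# B(x_j, 3 r_j) for some kept ball, as in the theorem.
for x, r in balls:
    assert any(abs(x - xj) + r <= 3 * rj for xj, rj in kept)
```

The containment check mirrors the last step of the proof: a rejected ball meets a kept ball of at least its own radius, so $|x - x_j| \le r + r_j \le 2 r_j$ and hence $B(x, r) \subset B(x_j, 3 r_j)$.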

Theorem A.45  Let $(\Gamma, \mu)$ satisfy volume doubling. For all $f \in C_0(V)$ we have

$$\|M f\|_2 \le c \|f\|_2. \tag{A.30}$$

Proof  We can assume $f \ge 0$. Let

$$A_t(f) = \{ x : M f(x) > t \}, \qquad t > 0.$$

If $x \in A_t(f)$ then there exists a ball $B_x$ containing $x$ such that

$$\sum_{z \in B_x} f(z) \mu_z > t \mu(B_x). \tag{A.31}$$

The balls $\{ B_x, x \in A_t(f) \}$ cover $A_t(f)$, and as $f$ has finite support they have a maximal radius. So, by Theorem A.44, there exists $G \subset A_t(f)$ so that $\{ B_x, x \in G \}$ are disjoint, and

$$A_t(f) \subset \bigcup_{x \in G} 3 B_x.$$

Then, using volume doubling and (A.31),

$$\mu(A_t(f)) \le \sum_{x \in G} \mu(3 B_x) \le c_1 \sum_{x \in G} \mu(B_x) \le \frac{c_1}{t} \sum_{x \in G} \sum_{z \in B_x} f(z) \mu_z \le \frac{c_1}{t} \|f\|_1. \tag{A.32}$$

Thus we have proved that $M$ is of weak $L^1$ type. Since also $\|M f\|_\infty \le \|f\|_\infty$ we can apply the Marcinkiewicz interpolation theorem to deduce (A.30). However, a direct proof is easy, so we give it here.

Fix (for the moment) $t > 0$ and let $g = t \wedge f$, $h = f - g$. Since $M f = M(g + h) \le M g + M h$ and $g \le t$ we have

$$A_{2t}(f) \subset \{ M g > t \} \cup \{ M h > t \} = \{ M h > t \} = A_t(h).$$

Applying (A.32) to $A_t(h)$, and using the facts that $h = 0$ on $\{ f \le t \}$ and $h \le f$, we obtain

$$\mu(A_{2t}(f)) \le \mu(A_t(h)) \le \frac{c_1}{t} \sum_x h(x) \mu_x \le \frac{c_1}{t} \sum_{x : f(x) > t} f(x) \mu_x.$$

Hence

$$\|M f\|_2^2 = \int_0^\infty 8 t\, \mu(M f > 2t)\, dt \le 8 c_1 \int_0^\infty \sum_{x : f(x) > t} f(x) \mu_x\, dt = 8 c_1 \sum_{x \in V} f(x) \mu_x \int_0^\infty 1_{(t < f(x))}\, dt = 8 c_1 \|f\|_2^2.$$
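As an illustration of Theorem A.45 (code and parameters ours, with generous numerical constants), on a finite path with unit weights one can compute $Mf$ by scanning all balls and check the weak $L^1$ bound (A.32) and the $L^2$ bound (A.30):

```python
# Restricted maximal function on the path {0, ..., N-1} with mu_x = 1:
# Mf(x) is the largest average of f over a ball B(y, r) containing x.
N = 100
f = [0.0] * N
f[30], f[31], f[70] = 5.0, 3.0, 7.0        # sparse non-negative f

prefix = [0.0]
for v in f:
    prefix.append(prefix[-1] + v)

M = [0.0] * N
for y in range(N):
    for r in range(N):
        a, b = max(0, y - r), min(N - 1, y + r)   # B(y, r) as an interval
        avg = (prefix[b + 1] - prefix[a]) / (b - a + 1)
        for x in range(a, b + 1):
            if avg > M[x]:
                M[x] = avg

norm1 = sum(f)
assert all(M[x] >= f[x] for x in range(N))        # r = 0 ball gives Mf >= f
for t in (0.5, 1.0, 2.0, 4.0):
    assert t * sum(1 for x in range(N) if M[x] > t) <= 10 * norm1   # (A.32)
assert sum(m * m for m in M) <= 24 * sum(v * v for v in f)          # (A.30)
```

The constants 10 and 24 are crude placeholders for $c_1$ and $8 c_1$; the point is only that the weak-type and $L^2$ quantities stay bounded by fixed multiples of $\|f\|_1$ and $\|f\|_2^2$.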


Lemma A.46  Let $(\Gamma, \mu)$ satisfy volume doubling. Let $B_i = B(x_i, r_i)$, $1 \le i \le m$, be balls in $\Gamma$. Let $\lambda \ge 1$. There exists a constant $C(\lambda)$ such that, for any $a_i \ge 0$,

$$\Big\| \sum_i a_i 1_{\lambda B_i} \Big\|_2 \le C(\lambda) \Big\| \sum_i a_i 1_{B_i} \Big\|_2.$$

Proof  Let

$$f = \sum_i a_i 1_{\lambda B_i}, \qquad g = \sum_i a_i 1_{B_i}.$$

Since $f \ge 0$, by duality it is sufficient to show that for any $h \in C_0(V)$ with $h \ge 0$

$$\|h f\|_1 \le c \|g\|_2 \|h\|_2. \tag{A.33}$$

Given $h$ let

$$H_i = \frac{1}{\mu(\lambda B_i)} \sum_{x \in \lambda B_i} h(x) \mu_x.$$

Then, using volume doubling,

$$\|h f\|_1 = \sum_x \sum_i a_i 1_{\lambda B_i}(x) h(x) \mu_x = \sum_i a_i \mu(\lambda B_i) H_i \le c \sum_i a_i H_i \mu(B_i).$$

Now $1_{B_i}(x) H_i \le 1_{B_i}(x) M h(x)$. So

$$\|h f\|_1 \le c \Big\| \sum_i a_i H_i 1_{B_i} \Big\|_1 \le c \Big\| \sum_i a_i 1_{B_i} M h \Big\|_1 \le c \|g\|_2 \|M h\|_2 \le c \|g\|_2 \|h\|_2,$$

by Theorem A.45. This proves (A.33) and completes the proof.

A.9 Poincaré Inequalities

The main result in this section is the proof of the weighted PI, Proposition 6.22, which plays an essential role in the proof of the near-diagonal lower bound. We remark that there is a quick proof of a weighted PI from the strong PI in [DK], but this argument does not work if one only has the weak PI. Since we wish to prove stability for the Gaussian bounds HK(α, 2), we need stable hypotheses, and it is not clear that the strong PI, on its own, is stable. We therefore need to derive the weighted PI from the weak PI. This was done by Jerison [Je], and in fact the same technique can be used to prove both the strong and the weighted PI from the weak PI; see Corollary A.51 later in this section. The approach used here is a simplification of that in [Je]; see [SC2] for its presentation in the metric space context, and also for historical notes.


Let $(\Gamma, \mu)$ be a weighted graph and $B = B(o, R)$ be a ball. Recall the definition of $\varphi_B$:

$$\varphi_B(y) = \Big( 1 \wedge \frac{(R + 1 - d(o, y))^2}{R^2} \Big) 1_B(y), \tag{A.34}$$

and set for $A \subset B$

$$\mathcal E_A^{(B)}(f, f) = \tfrac12 \sum_{x, y \in A} (f(x) - f(y))^2 \mu_{xy} \big( \varphi_B(x) \wedge \varphi_B(y) \big).$$

Recall from (1.16) the notation $E_B(f, f)$, and that the (weak) PI states that for any function $f$ we have

$$\inf_a \sum_{y \in B(x, r)} (f(y) - a)^2 \mu_y \le C_P r^2 E_{B(x, \lambda r)}(f, f). \tag{A.35}$$

The main result of this section is the following.

Theorem A.47  Let $(\Gamma, \mu)$ satisfy (H5), volume doubling, and the PI with expansion factor $\lambda$. Then for any ball $B = B(x_0, R)$ the following weighted PI holds:

$$\inf_a \sum_{x \in B} (f(x) - a)^2 \varphi_B(x) \mu_x \le C_1(\lambda) R^2 \mathcal E_B^{(B)}(f, f).$$

If $f : A \to \mathbb R$ then we write

$$f_A = \mu(A)^{-1} \sum_{x \in A} f(x) \mu_x$$

for the mean value of $f$ on $A$. For finite sets $A \subset A'$ we write $P(A, A')$ for the smallest real number $C$ such that the following PI holds:

$$\sum_{x \in A} (f(x) - f_A)^2 \mu_x \le C E_{A'}(f, f). \tag{A.36}$$

The next result enables us to use the PI to chain the means of a function $f$ along a sequence of sets.

Lemma A.48  Let $A_i \subset A_i^*$, $i = 1, 2$, be finite sets in $V$ and $f \in C(V)$. Then

$$|f_{A_1} - f_{A_2}|^2 \le \frac{2 P(A_1, A_1^*) E_{A_1^*}(f, f) + 2 P(A_2, A_2^*) E_{A_2^*}(f, f)}{\mu(A_1 \cap A_2)}.$$


Proof  We have

$$\mu(A_1 \cap A_2) |f_{A_1} - f_{A_2}|^2 = \sum_{x \in A_1 \cap A_2} |(f(x) - f_{A_1}) - (f(x) - f_{A_2})|^2 \mu_x$$
$$\le 2 \sum_{x \in A_1 \cap A_2} |f(x) - f_{A_1}|^2 \mu_x + 2 \sum_{x \in A_1 \cap A_2} |f(x) - f_{A_2}|^2 \mu_x$$
$$\le 2 \sum_{x \in A_1} |f(x) - f_{A_1}|^2 \mu_x + 2 \sum_{x \in A_2} |f(x) - f_{A_2}|^2 \mu_x$$
$$\le 2 P(A_1, A_1^*) E_{A_1^*}(f, f) + 2 P(A_2, A_2^*) E_{A_2^*}(f, f).$$

For the remainder of this section we fix a ball $B = B(o, R)$ in a graph $(\Gamma, \mu)$ which satisfies the hypotheses of Theorem A.47. We can assume that $\lambda \ge 10$, and set $\lambda_1 = 4\lambda$, $R_0 = 10$. If $\lambda_1 R_0 \le R$ then let $B_i \subset B_i^* \subset D_i \subset D_i^*$, $1 \le i \le N$, be the sets given in Definition A.42. If $R < \lambda_1 R_0$ then we just take $M = 0$, $N = 1$, and $B_1^* = D_1 = B(o, R)$. We take $\theta = \log C_V / \log 2$, as in (A.26). Since $B$ is fixed we will write $\mathcal E_A(f, f)$ for $\mathcal E_A^{(B)}(f, f)$, and $\varphi$ for $\varphi_B$.

Lemma A.49  We have

$$P(B_i^*, D_i) \le C_P r_i^2, \qquad 1 \le i \le M,$$
$$P(B_i^*, D_i) \le C (\lambda R_0)^{2\theta}, \qquad M + 1 \le i \le N.$$

Proof  If $i \le M$ then $B_i^* = B(x_i, 2 r_i)$ and $B(x_i, 2\lambda r_i) \subset D_i$, so this is immediate from the weak PI (A.35). Now let $i \ge M + 1$, and recall that $B_i^* = D_i = \widehat B_i$. Using (H5) and Lemma 1.3 we have $V(x, 1) \le c \mu_{xy}$ for any $x \sim y$ in $D_i$. Then by Lemmas A.37 and A.40(c),

$$\mu(D_i) \le V(x, 2(2 + \lambda_1) R_0) \le c R_0^\theta \lambda^\theta V(x, 1) \le c_1 R_0^\theta \lambda^\theta \mu_{xy}.$$

Let $\mathcal M$ be any family of paths which covers $D_i$ and let $\kappa(\mathcal M)$ be as in Definition 3.23. Then

$$\kappa(\mathcal M) = \max_{e \in E(D_i)} \mu_e^{-1} \sum_{\{(x, y) : e \in \gamma(x, y)\}} \mu_x \mu_y \le \max_{e \in E(D_i)} \mu_e^{-1} \mu(D_i)^2.$$

Hence $\kappa(\mathcal M)/\mu(D_i) \le c (\lambda R_0)^\theta$, and so by Proposition 3.27 we have $P(B_i^*, D_i) = P(D_i, D_i) \le c (\lambda R_0)^{2\theta}$. Set

$$P_i = P(B_i^*, D_i), \qquad P^* = \max_i P_i;$$


by the preceding lemma we have $P^* \le C R^2$. Let

$$\bar\varphi_i = \max_{D_i} \varphi(x).$$

Lemma A.50  There exists $c_1$, depending only on $\lambda_1$ and $R_0$, such that

$$\bar\varphi_i \le c_1 \varphi(x), \qquad x \in D_i.$$

Hence for any $i$ we have $\bar\varphi_i E_{D_i}(f, f) \le c_1 \mathcal E_{D_i}(f, f)$.

Proof  If $i \le M$ this follows easily from Lemma A.39(d) with $\theta = \tfrac12 \lambda_1$. If $i > M$ then $\min_{D_i} \varphi(x) \ge R^{-2}$. By Lemma A.40(c), $\rho(x) \le 2(1 + \lambda_1) R_0$ for all $x \in D_i$, so $\max_{D_i} \varphi(x)/\min_{D_i} \varphi(x) \le c \lambda_1^2 R_0^2$.

Proof of Theorem A.47  If $R \le \lambda_1 R_0$ then the weighted PI holds with a constant $c = c(\lambda)$ using Lemmas A.49 and A.50. So now assume that $R > \lambda_1 R_0$. Let $f : B \to \mathbb R$ and write $f_i = f_{B_i^*}$. Then

$$\sum_{x \in B} |f(x) - f_1|^2 \varphi(x) \mu_x \le \sum_i \sum_{x \in B_i^*} |f(x) - f_1|^2 \varphi(x) \mu_x$$
$$\le 2 \sum_i \sum_{x \in B_i^*} |f(x) - f_i|^2 \varphi(x) \mu_x + 2 \sum_i \sum_{x \in B_i^*} |f_i - f_1|^2 \varphi(x) \mu_x = S_1 + S_2.$$

Using the weak PI for $B_i^* \subset D_i$, and Lemmas A.50 and A.43,

$$S_1 \le 2 \sum_{i=1}^N \bar\varphi_i \sum_{x \in B_i^*} |f(x) - f_i|^2 \mu_x \le 2 \sum_{i=1}^N \bar\varphi_i P_i E_{D_i}(f, f) \le 2 c_1 \sum_{i=1}^N P_i \mathcal E_{D_i}(f, f) \le 2 c_1 K_2 P^* \mathcal E_B(f, f).$$

The sum $S_2$ is harder. Set

$$g(x) = \sum_i \bar\varphi_i^{1/2} |f_1 - f_i| 1_{\widetilde B_i}(x).$$

As the balls $\widetilde B_i$ are disjoint,

$$g(x)^2 = \sum_i \bar\varphi_i |f_1 - f_i|^2 1_{\widetilde B_i}(x),$$

and

$$S_2 \le 2 \sum_i \bar\varphi_i |f_1 - f_i|^2 \mu(B_i^*) \le c \sum_i \bar\varphi_i |f_1 - f_i|^2 \mu(\widetilde B_i) = c \sum_x g(x)^2 \mu_x.$$

To bound $g$ we first fix $j \in \{1, \dots, N\}$. Let $m = m_j$ be such that $a^m(j) = 1$, and write $j_k = a^k(j)$ for $k = 0, \dots, m$. Then by Lemma A.48

$$\bar\varphi_j^{1/2} |f_1 - f_j| \le \sum_{k=0}^{m-1} \bar\varphi_j^{1/2} |f_{j_k} - f_{j_{k+1}}| \le c P^{*1/2} \sum_{k=0}^{m-1} \bar\varphi_j^{1/2} \Big( \frac{E_{D_{j_k}}(f, f) + E_{D_{j_{k+1}}}(f, f)}{\mu(B_{j_k}^* \cap B_{j_{k+1}}^*)} \Big)^{1/2}.$$

Let

$$b_i^2 = \frac{\mathcal E_{D_i}(f, f)}{\mu(B_i^*)}.$$

Then by Lemma A.41

$$\bar\varphi_j\, \frac{E_{D_{j_k}}(f, f) + E_{D_{j_{k+1}}}(f, f)}{\mu(B_{j_k}^* \cap B_{j_{k+1}}^*)} \le c \bar\varphi_j \Big( \frac{E_{D_{j_k}}(f, f)}{\mu(B_{j_k}^*)} + \frac{E_{D_{j_{k+1}}}(f, f)}{\mu(B_{j_{k+1}}^*)} \Big) \le c (b_{j_k}^2 + b_{j_{k+1}}^2) \le c (b_{j_k} + b_{j_{k+1}})^2.$$

In the final line we used Lemma A.50 and the fact that $\bar\varphi_j \le c \bar\varphi_{j_k}$ for $k \ge 1$. By Lemma A.41, $B_j \subset D_{j_k}^*$, so

$$\bar\varphi_j^{1/2} |f_1 - f_j| 1_{\widetilde B_j}(x) \le c P^{*1/2} \sum_{k=0}^{m} b_{j_k} 1_{\widetilde B_j}(x) \le c 1_{\widetilde B_j}(x) P^{*1/2} \sum_k b_{j_k} 1_{D_{j_k}^*}(x) \le c 1_{\widetilde B_j}(x) P^{*1/2} \sum_{i=1}^{N} b_i 1_{D_i^*}(x).$$

Surprisingly, the constant in the final bound does not depend on $j$. Using again the fact that the $\widetilde B_i$ are disjoint, so that $\sum_j 1_{\widetilde B_j} \le 1$,

$$g(x) = \sum_j \bar\varphi_j^{1/2} |f_1 - f_j| 1_{\widetilde B_j}(x) \le c \sum_j 1_{\widetilde B_j}(x) P^{*1/2} \sum_{i=1}^{N} b_i 1_{D_i^*}(x) \le c P^{*1/2} \sum_{i=1}^{N} b_i 1_{D_i^*}(x).$$


Hence, using Lemma A.46,

$$S_2 \le c \|g\|_2^2 \le c P^* \Big\| \sum_i b_i 1_{D_i^*} \Big\|_2^2 \le c P^* \Big\| \sum_i b_i 1_{\widetilde B_i} \Big\|_2^2 = c P^* \sum_i b_i^2 \mu(\widetilde B_i) \le c P^* \sum_i \mathcal E_{D_i}(f, f) \le c K_2 P^* \mathcal E_B(f, f).$$

Since $P^* \le c R^2$ this completes the proof.

Corollary A.51  Let $(\Gamma, \mu)$ satisfy (H5), volume doubling, and the weak PI with scale constant $\lambda$. Then $(\Gamma, \mu)$ satisfies the strong PI.

Proof  The proof is the same as that of Theorem A.47, except that we replace $\varphi_B$ by 1.

Remark A.52  Here is an example which shows that we need the condition (H5) to obtain either the strong or the weighted PI from the weak PI. Let $\Gamma = \mathbb Z^2$, and let $\varepsilon_n$ be any sequence of positive reals with $\lim_n \varepsilon_n = 0$. Let $e_1 = (0, 1)$ and $e_2 = (1, 0)$. Choose weights so that $\mu_{xy} = 1$, except that, writing $x_n = (10n, 0)$, $y_n = x_n + e_2$, we set

$$\mu_{x_n + z,\, y_n + z} = \varepsilon_n \qquad \text{for } z = -e_1, 0, e_1.$$

Then one can verify that the weak Poincaré inequality holds with $\lambda = 5$ and a constant $C_P$ of order 1. For example, if $B_n = B(x_n, 1)$ then $B(x_n, 5)$ contains a path of length 5, consisting of edges of weight 1, which connects $x_n$ and $y_n$. However, if $A_n = \{(x, y) \in B_n : y \ge 1\}$ then $\mu_E(A_n; B_n - A_n) = 3\varepsilon_n$. It follows that the PI in $B_n$ only holds with a constant $C_P$ of order $\varepsilon_n^{-1}$, and consequently the strong PI does not hold for this weighted graph. The discrete time heat kernel bounds fail, since $\inf_n p_1(x_n, y_n) = 0$. However, one can prove that the continuous time bounds HK(2, 2) do hold; see [BC].

References

[AF] D.J. Aldous and J. Fill. Reversible Markov Chains and Random Walks on Graphs. See http://www.stat.berkeley.edu/~aldous/RWG/book. html

[Ah] L.V. Ahlfors. Conformal Invariants: Topics in Geometric Function Theory. New York, NY: McGraw-Hill Book Co. (1973). [Ar] D.G. Aronson. Bounds on the fundamental solution of a parabolic equation. Bull. Amer. Math. Soc. 73 (1967) 890–896. [Ba1] M.T. Barlow Which values of the volume growth and escape time exponent are possible for a graph? Rev. Mat. Iberoamer. 20 (2004), 1–31. [Ba2] M.T. Barlow. Some remarks on the elliptic Harnack inequality. Bull. London Math. Soc. 37 (2005), no. 2, 200–208. [BB1] M.T. Barlow and R.F. Bass. Random walks on graphical Sierpinski carpets. In: Random Walks and Discrete Potential Theory, eds. M. Piccardello and W. Woess. Symposia Mathematica XXXIX. Cambridge: Cambridge University Press (1999). [BB2] M.T. Barlow and R.F. Bass. Stability of parabolic Harnack inequalities. Trans. Amer. Math. Soc. 356 (2003), no. 4, 1501–1533. [BC] M.T. Barlow and X. Chen. Gaussian bounds and parabolic Harnack inequality on locally irregular graphs. Math. Annalen 366 (2016), no. 3, 1677–1720. [BCK] M.T. Barlow, T. Coulhon, and T. Kumagai. Characterization of sub-Gaussian heat kernel estimates on strongly recurrent graphs. Comm. Pure. Appl. Math. LVIII (2005), 1642–1677. [BH] M.T. Barlow and B.M. Hambly. Parabolic Harnack inequality and local limit theorem for percolation clusters. Electron. J. Prob. 14 (2009), no. 1, 1–27. [BK] M.T. Barlow and T. Kumagai. Random walk on the incipient infinite cluster on trees. Illinois J. Math. 50 (Doob volume) (2006), 33–65. [BJKS] M.T. Barlow, A.A. Járai, T. Kumagai and G. Slade. Random walk on the incipient infinite cluster for oriented percolation in high dimensions. Commun. Math. Phys. 278 (2008), no. 2, 385–431. [BM] M.T. Barlow and R. Masson. Spectral dimension and random walks on the two dimensional uniform spanning tree. Commun. Math. Phys. 305 (2011), no. 1, 23–57.


[BMu] M.T. Barlow and M. Murugan. Stability of elliptic Harnack inequality. In prep. [BT] Zs. Bartha and A. Telcs. Quenched invariance principle for the random walk on the Penrose tiling. Markov Proc. Related Fields 20 (2014), no. 4, 751–767. [Bas] R.F. Bass. On Aronsen’s upper bounds for heat kernels. Bull. London Math. Soc. 34 (2002), 415–419. [BPP] I. Benjamini, R. Pemantle, and Y. Peres. Unpredictable paths and percolation. Ann. Prob. 26 (1998), 1198–1211. [BD] A. Beurling and J. Deny. Espaces de Dirichlet. I. Le cas élémentaire. Acta Math. 99 (1958), 203–224. [BL] K. Burdzy and G.F. Lawler. Rigorous exponent inequalities for random walks. J. Phys. A 23 (1990), L23–L28. [Bov] A. Bovier. Metastability: a potential theoretic approach. Proc. ICM Madrid 2006. Zurich: European Mathematical Society. [CKS] E.A. Carlen, S. Kusuoka, and D.W. Stroock. Upper bounds for symmetric Markov transition functions. Ann. Inst. H. Poincaré Suppl. no. 2 (1987), 245– 287. [Ca] T.K. Carne. A transmutation formula for Markov chains. Bull. Sci. Math. 109 (1985), 399–405. [CRR] A.K. Chandra, P. Raghavan, W.L. Ruzzo, R. Smolensky, and P. Tiwari. The electrical resistance of a graph captures its commute and cover times. In: Proc. 21st Annual ACM Symposium on Theory of Computing, Seattle, WA, ed. D.S. Johnson. New York, NY: ACM (1989). [Ch] X. Chen. Pointwise upper estimates for transition probability of continuous time random walks on graphs. In prep. [Co1] T. Coulhon. Espaces de Lipschitz et inégalités de Poincaré. J. Funct. Anal. 136 (1996), no. 1, 81–113. [Co2] T. Coulhon. Heat kernel and isoperimetry on non-compact Riemannian manifolds. In: Heat Kernels and Analysis on Manifolds, Graphs, and Metric Spaces (Paris, 2002), eds. P. Auscher, T. Coulhon, and A. Grigor’yan. Contemporary Mathematics vol. 338. Providence, RI: AMS (2003), pp. 65–99. [CG] T. Coulhon and A. Grigor’yan. Random walks on graphs with regular volume growth. GAFA 8 (1998), 656–701. [CGZ] T. Coulhon, A. Grigoryan, and F. 
Zucca. The discrete integral maximum principle and its applications. Tohoku Math. J. (2) 57 (2005), no. 4, 559–587. [CS] T. Coulhon and A. Sikora. Gaussian heat kernel upper bounds via the Phragmén-Lindelöf theorem. Proc. Lond. Math. Soc. (3) 96 (2008), no. 2, 507–544. [Da] E.B. Davies. Large deviations for heat kernels on graphs. J. London Math. Soc.(2) 47 (1993), 65–72. [D1] T. Delmotte. Parabolic Harnack inequality and estimates of Markov chains on graphs. Rev. Math. Iberoamer. 15 (1999), 181–232. [D2] T. Delmotte. Graphs between the elliptic and parabolic Harnack inequalities. Potential Anal. 16 (2002), no. 2, 151–168. [De] Y. Derriennic. Lois ‘zero ou deux’ pour les processus de Markov. Ann. Inst. Henri Poincaré Sec. B 12 (1976), 111–129.


[DiS] P. Diaconis and D. Stroock. Geometric bounds for eigenvalues of Markov chains. Ann. Appl. Prob. 1 (1991), 36–61. [DS] P. Doyle and J.L. Snell. Random Walks and Electrical Networks. Washington D.C.: Mathematical Association of America (1984). Arxiv: .PR/0001057. [Duf] R.J. Duffin. The extremal length of a network. J. Math. Anal. Appl. 5 (1962), 200–215. [Dur] R. Durrett. Probability: Theory and Examples, 4th edn. Cambridge: Cambridge University Press (2010). [DK] B. Dyda and M. Kassmann. On weighted Poincaré inequalities. Ann. Acad. Sci. Fenn. Math. 38 (2013), 721–726. [DM] E.B. Dynkin and M.B. Maljutov. Random walk on groups with a finite number of generators. (In Russian.) Dokl. Akad. Nauk SSSR 137 (1961), 1042–1045. [FS] E.B. Fabes and D.W. Stroock. A new proof of Moser’s parabolic Harnack inequality via the old ideas of Nash. Arch. Mech. Rat. Anal. 96 (1986), 327– 338. [Fa] K. Falconer. Fractal Geometry. Chichester: Wiley (1990). [Fe] W. Feller. An Introduction to Probability Theory and its Applications. Vol. I, 3rd edn. New York, NY: Wiley (1968). [Fo1] M. Folz. Gaussian upper bounds for heat kernels of continuous time simple random walks. Elec. J. Prob. 16 (2011), 1693–1722, paper 62. [Fo2] M. Folz. Volume growth and stochastic completeness of graphs. Trans. Amer. Math. Soc. 366 (2014), 2089–2119. [Fos] F.G. Foster. On the stochastic matrices associated with certain queuing processes. Ann. Math. Statist. 24 (1953), 355–360. [FOT] M. Fukushima, Y. Oshima, and M. Takeda, Dirichlet Forms and Symmetric Markov Processes. Berlin: de Gruyter (1994). [Ga] R.J. Gardner. The Brunn-Minkowski inequality. Bull. AMS 39 (2002), 355– 405. [Gg1] A.A. Grigor’yan. The heat equation on noncompact Riemannian manifolds. Math. USSR Sbornik 72 (1992), 47–77. [Gg2] A.A. Grigor’yan. Heat kernel upper bounds on a complete non-compact manifold. Revista Math. Iberoamer. 10 (1994), 395–452. [GT1] A. Grigor’yan and A. Telcs. Sub-Gaussian estimates of heat kernels on infinite graphs. 
Duke Math. J. 109 (2001), 452–510. [GT2] A. Grigor’yan and A. Telcs. Harnack inequalities and sub-Gaussian estimates for random walks. Math. Annal. 324 (2002), no. 3, 521–556. [GHM] A. Grigor’yan, X.-P. Huang, and J. Masamune. On stochastic completeness of jump processes. Math. Z. 271 (2012), no. 3, 1211–1239. [Grom1] M. Gromov. Groups of polynomial growth and expanding maps. Publ. Math. IHES 53 (1981), 53–73. [Grom2] M. Gromov. Hyperbolic groups. In: Essays in Group Theory, ed. S.M. Gersten. New York, NY: Springer (1987), pp. 75–263. [HSC] W. Hebisch and L. Saloff-Coste. Gaussian estimates for Markov chains and random walks on groups. Ann. Prob. 21 (1993), 673–709. [Je] D. Jerison. The weighted Poincaré inequality for vector fields satisfying Hörmander’s condition. Duke Math. J. 53 (1986), 503–523.


[Kai] V.A. Kaimanovich. Measure-theoretic boundaries of 0-2 laws and entropy. In: Harmonic Analysis and Discrete Potential Theory (Frascati, 1991). New York, NY: Plenum (1992), pp. 145–180. [Kan1] M. Kanai. Rough isometries and combinatorial approximations of geometries of non-compact riemannian manifolds. J. Math. Soc. Japan 37 (1985), 391–413. [Kan2] M. Kanai. Analytic inequalities, and rough isometries between non-compact reimannian manifolds. In: Curvature and Topology of Riemannian Manifolds. Proc. 17th Intl. Taniguchi Symp., Katata, Japan, eds. K. Shiohama, T. Sakai, Takashi, and T. Sunada. Lecture Notes in Mathematics 1201. New York, NY: Springer (1986), pp. 122–137. [KSK] J.G. Kemeny, J.L. Snell, and A.W. Knapp. Denumerable Markov Chains. New York, NY: Springer (1976). [Ki] J. Kigami. Harmonic calculus on limits of networks and its application to dendrites. J. Funct. Anal. 128 (1995), no. 1, 48–86. [Kir1] G. Kirchhoff. Über die Auflösung der Gleichungen, auf welche man bei der Untersuchungen der linearen Vertheilung galvanischer Ströme geführt wird. Ann. Phys. Chem. 72 (1847), 497–508. [Kir2] G. Kirchhoff. On the solution of the equations obtained from the investigation of the linear distribution of galvanic currents. (Translated by J.B. O’Toole.) IRE Trans. Circuit Theory 5 (1958), 4–7. [KN] G. Kozma and A. Nachmias. The Alexander-Orbach conjecture holds in high dimensions. Invent. Math. 178 (2009), no. 3, 635–654. [Kum] T. Kumagai. Random Walks on Disordered Media and their Scaling Limits. École d’Été de Probabilités de Saint-Flour XL – 2010. Lecture Notes in Mathematics 2101. Chan: Springer International (2014). [LL] G.F. Lawler and V. Limic. Random Walk: A Modern Introduction. Cambridge: Cambridge University Press (2010). [LP] R. Lyons and Y. Peres. Probability on Trees and Networks. Cambridge: Cambridge University Press, 2016. [Ly1] T. Lyons. A simple criterion for transience of a reversible Markov chain. Ann. Prob. 11 (1983), 393–402. [Ly2] T. Lyons. 
Instability of the Liouville property for quasi-isometric Riemannian manifolds and reversible Markov chains. J. Diff. Geom. 26 (1987), 33–66. [Mer] J.-F. Mertens, E. Samuel-Cahn, and S. Zamir. Necessary and sufficient conditions for recurrence and transience of Markov chains in terms of inequalities. J. Appl. Prob. 15 (1978), 848–851. [Mo1] J. Moser. On Harnack’s inequality for elliptic differential equations. Commun. Pure Appl. Math. 14 (1961), 577–591. [Mo2] J. Moser. On Harnack’s inequality for parabolic differential equations. Commun. Pure Appl. Math. 17 (1964), 101–134. [Mo3] J. Moser. On a pointwise estimate for parabolic differential equations. Commun. Pure Appl. Math. 24 (1971), 727–740. [N] J. Nash. Continuity of solutions of parabolic and elliptic equations. Amer. J. Math. 80 (1958), 931–954. [Nor] J. Norris. Markov Chains. Cambridge: Cambridge University Press (1998).


[NW] C. St J.A. Nash-Williams. Random walks and electric currents in networks. Proc. Camb. Phil. Soc. 55 (1959), 181–194. [Os] H. Osada. Isoperimetric dimension and estimates of heat kernels of preSierpinski carpets. Prob. Theor Related Fields 86 (1990), 469–490. [Pol] G. Polya. Über eine Ausgabe der Wahrscheinlichkeitsrechnung betreffend die Irrfahrt im Strassennetz. Math. Ann. 84 (1921), 149–160. [Rev] D. Revelle. Heat kernel asymptotics on the lamplighter group. Elec. Commun. Prob. 8 (2003), 142–154. [RW] L.C.G. Rogers and D.W. Williams. Diffusions, Markov Processes, and Martingales. Vol. 1. Foundations. 2nd edn. Chichester: Wiley (1994). [Ro] O.S. Rothaus. Analytic inequalities, isoperimetric inequalities and logarithmic Sobolev inequalities. J. Funct. Anal. 64 (1985), 296–313. [SC1] L. Saloff-Coste. A note on Poincaré, Sobolev, and Harnack inequalities. Inter. Math. Res. Not 2 (1992), 27–38. [SC2] L. Saloff-Coste. Aspects of Sobolev-type Inequalities. LMS Lecture Notes 289. Cambridge: Cambridge University Press (2002). [SJ] A. Sinclair and M. Jerrum. Approximate counting, uniform generation and rapidly mixing Markov chains. Inform. Comput. 82 (1989), 93–133. [So1] P.M. Soardi. Rough isometries and Dirichlet finite harmonic functions on graphs. Proc. AMS 119 (1993), 1239–1248. [So2] P.M. Soardi. Potential Theory on Infinite Networks. Berlin: Springer (1994). [Spi] F. Spitzer. Principles of Random Walk. New York, NY: Springer (1976). [SZ] D.W. Stroock and W. Zheng. Markov chain approximations to symmetric diffusions. Ann. Inst. H. Poincaré Prob. Stat. 33 (1997), 619–649. [T1] A. Telcs. Random walks on graphs, electric networks and fractals. Prob. Theor. Related Fields 82 (1989), 435–451. [T2] A. Telcs. Local sub-Gaussian estimates on graphs: the strongly recurrent case. Electron. J. Prob. 6 (2001), no. 22, 1–33. [T3] A. Telcs. Diffusive limits on the Penrose tiling. J. Stat. Phys. 141 (2010), 661–668. [Tet] P. Tetali. 
Random walks and the effective resistance of networks. J. Theor. Prob. 4 (1991), 101–109. [Tru] K. Truemper. On the delta-wye reduction for planar graphs. J. Graph Theory 13 (1989), 141–148. [V1] N.Th. Varopoulos. Isoperimetric inequalities and Markov chains. J. Funct. Anal. 63 (1985), 215–239. [V2] N.Th. Varopoulos. Long range estimates for Markov chains. Bull. Sci. Math., 2e serie 109 (1985), 225–252. [W] D. Williams. Probability with Martingales. Cambridge: Cambridge University Press (1991). [Wo] W. Woess. Random Walks on Infinite Graphs and Groups. Cambridge: Cambridge University Press (2000). [Z] A.H. Zemanian. Infinite Electrical Networks. Cambridge: Cambridge University Press (1991). [Zie] W.P. Ziemer. Weakly Differentiable Functions. New York, NY: Springer (1989).

Index

adjacency matrix, 13
amenable graph, 103, 104
balayage, 177, 178
ballistic random walk, 105
binary tree, with weights, 74
boundary, 2
bounds, on diagonal, 109
cadlag, 190
caloric function, 171
capacitary measure, 172
capacity, 51, 174
Carne–Varopoulos bound, 113
Cayley graph, 4
chaining argument, 129
Chapman–Kolmogorov equation, 133, 194
Cheeger constant, 91
Cheeger's inequality, 101
circuit reduction, 61
co-area formula, 83, 92
collapse of subset in a graph, 5
commute time identity, 71
  for sets, 175
conductance, 2, 38
continuous time simple random walk, 132
current (in a circuit), 38
cutting (an edge), 50
cycle, 1
Dirichlet form, 18, 137
  trace, 62, 66
Dirichlet problem, 40
discrete Gauss–Green, 17
(Eβ), 124
effective conductance, 45
effective constant, 90
effective resistance, finite network, 42
electrical equivalence of networks, 63
electrical network, 38
  circuit reduction, 61
energy
  dissipation (of a current), 48
  of flow, 48
  of function, 18
escape measure, 172
Euclidean lattice, 4
explosion time, 137
exterior boundary, 2
extremal length, 51
feasible function, 45
Feller's problem, 59
filtration, 183
5B theorem, 211
flow, 38
  on a graph, 47
flux out of a set, 39
Foster's criterion, 28
function
  cadlag, 190
  harmonic, 26
  right continuous with left limits, 190
  subharmonic, 26
  superharmonic, 26
Gaussian distribution, 9
graph
  balls, 2
  bounded geometry, 2
  connected, 1, 2
  lazy, 5
  locally finite, 2
  metric, 1
  natural weights, 3
  strongly recurrent, 149
  type problem, 8
  weighted, 2
graph products, 6
Green's function, 22, 67, 121
group, 4
Hadamard's transmutation formula, 114
harmonic, 26
Harnack inequality
  elliptic, 29
  local, 29
heat equation, discrete time, 15
heat kernel, 11
  discrete time, 11
  for killed process, 21
  global upper bounds, 109
  on-diagonal upper bounds, 109
heat kernel bounds
  Gaussian, 117, 160
  sub-Gaussian, 117
Hewitt–Savage 0–1 law, 35
hitting measure, 172
hitting times, 7
HK(α, β), 116, 146, 147, 154
HKC(α, β), 144, 146, 147
inequality
  Cheeger's, 101
  isoperimetric, 80
  Nash, 85
  Poincaré, 94, 96, 160, 167
infinitesimal generator, 135
isoperimetric inequality, 80
  for Z^d, 93
  relative, 91, 92
  strong, 80
iterated function scheme, 75
join of two graphs, 6
Kesten's problem, 8
killed process, 21
Kirchhoff's laws, 39
Kolmogorov extension theorem, 194
lamplighter graph, 103
Laplacian, 12
  combinatorial, 13
  probabilistic, 12
last exit decomposition, 172
law of the iterated logarithm, 120
lazy random walk, 37, 100, 138, 202
LHK(α, β), 116, 131
LHKC(α, β), 144
Liouville property, 27
local time, 22
loop, 1
Markov chain, 6
Markov process associated with Dirichlet form, 137
Markov property
  of quadratic form, 20, 62, 203
  simple, discrete time, 187
  strong, 23, 186
    continuous time, 197
    discrete time, 188
martingale, 183
  convergence theorem, 27
  optional sampling theorem, 184
maximum principle, 26
minimum principle, 27
modulus of a family of curves, 51
near-diagonal lower bound, 128, 146, 162
Ohm's law, 38
optional sampling theorem, 184
oscillation inequality, 30
path, 1
  self-avoiding, 1
permutable set, 35
Poincaré inequality, 94, 96, 160, 167
  expansion factor, 95
  for Z^d, 95
  strong, 95, 218
  weak, 94
  weighted, 161, 213, 214
Poisson distribution, 143
Poisson process, 132, 191
polynomial volume growth, 114
potential (in electrical network), 38
potential kernel density, 22
potential operator, 24
products (of graphs), 6
quasi-isometry, 10
random walk
  continuous time, 192
  discrete time, 6
recurrence, Foster's criterion, 28
recurrent graph, 8
  strong Liouville property, 27
reduite, 177
resistors
  in parallel, 65
  in series, 64
  Y–∇ transform, 65
reversible (Markov chain), 7
Riesz representation theorem, 176
rough isometry, 10
semigroup, 114, 135
  continuous time, 192
shift operators
  continuous time, 192
  discrete time, 186
shorting (an edge), 50
Sierpinski gasket, 75
Sierpinski gasket graph, 76, 150, 155
σ-field
  exchangeable, 36
  invariant, 32
  tail, 33, 199
spectral gap, 100
spectral radius, 101
stability under rough isometries, 11, 36, 69, 82, 96, 103, 170, 213
stable under perturbations of weights, 9
stochastically complete, 138
stopping time, 117, 133, 183, 196
  in continuous time, 196
strong Liouville property, 27, 30
  for R^d, 31
subgraph, 5
subharmonic, 26
submartingale, 183
superharmonic, 26
supermartingale, 27, 183
  convergence, 184
Thomson's principle, 53
time derivative of p_n, 122
total variation norm, 197
trace (of Dirichlet form), 62, 66
transient graph, 8
transition density, 11
transition matrix, 6
tree
  binary, 4
    with weights, 74
  spherically symmetric, 74
UHK(α, β), 116, 124, 127, 131, 146
UHKC(α, β), 144, 146
uniformly integrable process, 184
unpredictable paths, 60
(Vα), 116
variational problem, 45
  for functions, 45
volume doubling, 169, 206
volume growth, 116
  exponential, 81, 103
  polynomial, 114
wave equation, 114
weight stable property, 9
weighted graph, 3
weights (on a graph), 2
  bounded, 3
  controlled, 3
  equivalent, 9
  natural, 3
Wiener's test, 181
Wye–Delta transform, 65
Y–∇ transform, 65
0–2 law, 200