Random Explorations (Student Mathematical Library, 98) 1470467666, 9781470467661

The title “Random Explorations” has two meanings. First, a few topics of advanced probability are deeply explored. Second, a prevailing theme is the study of random fields by exploration, that is, by traveling through the field and using what has been observed so far to predict the parts not yet seen.


English Pages 199 [215] Year 2022


Table of contents:
Cover
Title page
Contents
Preface
Chapter 1. Markov Chains
1.1. Definition
1.2. Laplacian and harmonic functions
1.3. Markov chain with boundary
1.4. Green’s function
1.5. An alternative formulation
1.6. Continuous time
Further Reading
Chapter 2. Loop-Erased Random Walk
2.1. Loop erasure
2.2. Loop-erased random walk
2.3. Determinant of the Laplacian
2.4. Laplacian random walk
2.5. Putting the loops back on the path
2.6. Wilson’s algorithm
Further Reading
Chapter 3. Loop Soups
3.1. Introduction
3.2. Growing loop at a point
3.3. Growing loop configuration in 𝐴
3.4. Rooted loop soup
3.5. (Unrooted) random walk loop measure
3.6. Local time and currents
3.7. Negative weights
3.8. Continuous time
Further Reading
Chapter 4. Random Walk in Z^d
4.1. Introduction
4.2. Local central limit theorem
4.3. Green’s function
4.4. Harmonic functions
4.5. Capacity for 𝑑≥3
4.6. Capacity in two dimensions
Further reading
Chapter 5. LERW and Spanning Trees on Z^d
5.1. LERW in Z^d
5.2. Marginal distributions for UST in Z^d
5.3. Uniform spanning tree (UST) in Z^d
5.4. The dual lattice in Z^2
5.5. The uniform spanning tree (UST) in Z^2
Further Reading
Chapter 6. Gaussian Free Field
6.1. Introduction
6.2. Multivariate normal distribution
6.3. Gaussian fields coming from Markov chains
6.4. A Gibbs measure perspective
6.5. One-dimensional GFF
6.6. Square of the field
Further reading
Chapter 7. Scaling Limits
7.1. The idea of a scaling limit
7.2. Brownian motion
7.3. Conformal invariance in two dimensions
7.4. Brownian loop soup
7.5. Scaling limit for LERW
7.6. Loewner differential equation
7.7. Self-avoiding walk: c = 0
7.8. Continuous GFF for 𝑑=1,2
Further Reading
Appendix A. Some Background and Extra Topics
A.1. Borel-Cantelli lemma
A.2. Second moment method
A.3. Compound Poisson process
A.4. Negative binomial process
A.5. Increasing jump processes
A.6. Gamma process
A.7. Lévy processes
Bibliography
Index
Back Cover


STUDENT MATHEMATICAL LIBRARY Volume 98

Random Explorations

Gregory F. Lawler

EDITORIAL COMMITTEE
John McCleary
Rosa C. Orellana (Chair)
Paul Pollack
Kavita Ramanan

2020 Mathematics Subject Classification. Primary 60-02, 60G60, 60J10, 60K35.

For additional information and updates on this book, visit www.ams.org/bookpages/stml-98

Library of Congress Cataloging-in-Publication Data
Names: Lawler, Gregory F., 1955- author.
Title: Random explorations / Gregory F. Lawler.
Description: Providence, Rhode Island : American Mathematical Society, [2022] | Series: Student mathematical library, 1520-9121 ; volume 98 | Includes bibliographical references and index.
Identifiers: LCCN 2022034479 | ISBN 9781470467661 (paperback) | ISBN 9781470472214 (ebook)
Subjects: LCSH: Stochastic processes. | Random walks (Mathematics) | Markov processes. | AMS: Probability theory and stochastic processes – Research exposition (monographs, survey articles). | Probability theory and stochastic processes – Stochastic processes – Random fields. | Probability theory and stochastic processes – Markov processes – Markov chains (discrete-time Markov processes on discrete state spaces). | Probability theory and stochastic processes – Special processes – Interacting random processes; statistical mechanics type models; percolation theory.
Classification: LCC QA274 .L387 2022 | DDC 519.2/3–dc23/eng20221014
LC record available at https://lccn.loc.gov/2022034479
DOI: https://doi.org/10.1090/stml/98

Copying and reprinting. Individual readers of this publication, and nonprofit libraries acting for them, are permitted to make fair use of the material, such as to copy select pages for use in teaching or research. Permission is granted to quote brief passages from this publication in reviews, provided the customary acknowledgment of the source is given. Republication, systematic copying, or multiple reproduction of any material in this publication is permitted only under license from the American Mathematical Society. Requests for permission to reuse portions of AMS publication content are handled by the Copyright Clearance Center. For more information, please visit www.ams.org/publications/pubpermissions. Send requests for translation rights and licensed reprints to reprint-permission@ams.org.

© 2022 by the American Mathematical Society. All rights reserved. The American Mathematical Society retains all rights except those granted to the United States Government. Printed in the United States of America.

∞ The paper used in this book is acid-free and falls within the guidelines established to ensure permanence and durability.

Visit the AMS home page at https://www.ams.org/

Contents

Preface  ix

Chapter 1. Markov Chains  1
§1.1. Definition  1
§1.2. Laplacian and harmonic functions  7
§1.3. Markov chain with boundary  8
§1.4. Green’s function  13
§1.5. An alternative formulation  16
§1.6. Continuous time  20
Further Reading  21

Chapter 2. Loop-Erased Random Walk  23
§2.1. Loop erasure  23
§2.2. Loop-erased random walk  25
§2.3. Determinant of the Laplacian  29
§2.4. Laplacian random walk  32
§2.5. Putting the loops back on the path  36
§2.6. Wilson’s algorithm  37
Further Reading  43

Chapter 3. Loop Soups  45
§3.1. Introduction  45
§3.2. Growing loop at a point  46
§3.3. Growing loop configuration in A  50
§3.4. Rooted loop soup  54
§3.5. (Unrooted) random walk loop measure  55
§3.6. Local time and currents  58
§3.7. Negative weights  63
§3.8. Continuous time  65
Further Reading  66

Chapter 4. Random Walk in Z^d  67
§4.1. Introduction  67
§4.2. Local central limit theorem  70
§4.3. Green’s function  76
§4.4. Harmonic functions  81
§4.5. Capacity for d ≥ 3  93
§4.6. Capacity in two dimensions  101
Further reading  102

Chapter 5. LERW and Spanning Trees on Z^d  103
§5.1. LERW in Z^d  103
§5.2. Marginal distributions for UST in Z^d  108
§5.3. Uniform spanning tree (UST) in Z^d  117
§5.4. The dual lattice in Z^2  122
§5.5. The uniform spanning tree (UST) in Z^2  130
Further Reading  131

Chapter 6. Gaussian Free Field  133
§6.1. Introduction  133
§6.2. Multivariate normal distribution  133
§6.3. Gaussian fields coming from Markov chains  139
§6.4. A Gibbs measure perspective  143
§6.5. One-dimensional GFF  148
§6.6. Square of the field  149
Further reading  152

Chapter 7. Scaling Limits  153
§7.1. The idea of a scaling limit  153
§7.2. Brownian motion  156
§7.3. Conformal invariance in two dimensions  159
§7.4. Brownian loop soup  161
§7.5. Scaling limit for LERW  164
§7.6. Loewner differential equation  169
§7.7. Self-avoiding walk: c = 0  172
§7.8. Continuous GFF for d = 1, 2  173
Further Reading  177

Appendix A. Some Background and Extra Topics  179
§A.1. Borel-Cantelli lemma  179
§A.2. Second moment method  180
§A.3. Compound Poisson process  182
§A.4. Negative binomial process  184
§A.5. Increasing jump processes  187
§A.6. Gamma process  189
§A.7. Lévy processes  192

Bibliography  195
Index  197

Preface

This book is an outgrowth of lectures that I gave in the summer of 2020 as part of the Research Experiences for Undergraduates (REU) at the University of Chicago. The REU lectures are not intended to be standard courses but rather tastes of graduate and research level mathematics for advanced undergraduates.

The title of the book can be interpreted in two ways. First, this is not a comprehensive survey of an area but rather a “random” sampling of some objects that arise in models in probability and statistical mechanics. The second meaning refers to a prevailing theme in many of these models: random fields can be studied by exploration, that is, by traveling (perhaps randomly) through the field, observing what one has seen so far, and using that to predict the parts that have not been observed.

In order to keep the material accessible to students who have not had graduate coursework, I have concentrated on discrete models where “measure theoretic” probability is not needed. The formal prerequisites for these notes are advanced calculus, linear algebra, and a calculus-based course in probability. It is also expected that students have sufficient mathematical maturity to understand rigorous arguments. While those are the only formal prerequisites, the intent of these lectures was to give a taste of research level mathematics, and I allow myself to venture occasionally a bit beyond these prerequisites.


The first chapter introduces Markov chains and ideas that permeate the book. The focus is on transient chains, or recurrent chains with “killing”, for which there is a finite Green’s function representing the expected number of visits to sites. The Green’s function can be seen to be the inverse of an important operator, the Laplacian. Harmonic functions (functions whose Laplacian equals zero) and the determinant of the Laplacian figure prominently in the later chapters. We concentrate mainly on discrete time chains but we also discuss how to get continuous time chains by putting on exponential waiting times. A probabilistic approach dominates our treatment, but much can be done purely from a linear algebra perspective. The latter approach allows measures on paths that take negative and complex values. Such path measures come up naturally in a number of models in mathematical physics, although they are not emphasized much here.

Chapter 2 introduces an object that has been a regular part of my research. I introduced the loop-erased random walk (LERW) in my doctoral dissertation with the hope of getting a better understanding of a very challenging problem, the self-avoiding random walk (SAW). While the differences between the LERW and SAW have prevented the former from being a tool to solve the latter problem, it has proved to be a very interesting model in itself. One very important application is the relation between LERW and another model, the uniform spanning tree (UST). This relationship is most easily seen in an algorithm due to David Wilson [20] to generate such trees. Analysis of the loop-erasing procedure leads to consideration both of the loops erased and the LERW itself.

Chapter 3 gives an introduction to loop measures and the soups that arise from them. We view a collection of loops as a random field that is growing with time as loops are added. The distribution of the loops at time 1 corresponds to what is erased from loop-erased random walks.
The loop soup at time 1/2 is related to the Gaussian free field (GFF). This chapter introduces the discrete time loop soup, which is an interesting mathematical model in itself. This discrete model has characteristics of a number of fields in statistical mechanics. In particular, the distribution of the field does not depend on how one orders the elements, but to investigate the field one can order the sites and then investigate the field one site at a time. For this model, when one visits a site, one sees all the loops that visit that site. This “growing loop” model, which depends on the order of the vertices, turns out to be equivalent to an “unrooted loop soup” that does not depend on the order.

While we have used the generality of Markov chains for our setup, one of the most important chains is the simple random walk on the integer lattice. In order to appreciate paths and fields arising from random walk, it is necessary to understand the walk. Chapter 4 discusses the simple random walk on the lattice, giving some more classical results that go beyond what one would normally see at an undergraduate level.

We return to the spanning tree in Chapter 5 and consider the infinite spanning tree in the integer lattice as a limit of spanning trees on finite subsets. Whether this gives an infinite tree or a forest (a collection of disconnected trees) depends on the dimension. We also give an example of duality on the integer lattice.

Another classical field is the topic of Chapter 6. The multivariate normal distribution is a well known construction and is the model upon which much of classical mathematical statistics, such as linear regression, is based. The GFF is an example of such a distribution where some geometry comes into the picture. Here we discuss the GFF coming from a Markov chain. The idea of exploration comes in again as one “samples” or “explores” the field at some sites and uses that to determine distributions at other sites. The global object is independent of the ordering of the vertices but the sampling rule is not. The relation between the GFF and the growing loop defined in Chapter 3 is discussed in Section 6.6.

In Chapter 7 we introduce some of the continuous models that arise as scaling limits. A proper treatment of this material would require more mathematical background than I am assuming, so this should be viewed as an enticement to learn more.
The scaling limits we discuss are: Brownian motion, the Brownian loop soup, the Schramm-Loewner evolution, and the continuous GFF.

In the Appendix, we discuss a couple of topics that arise in the previous chapters but have sufficient independent interest that it seems appropriate to separate them. The first is a basic technique for research probabilists often called the “second moment method”. The second, which arises for us primarily in the analysis of the loop models, is an introduction to Lévy processes with an emphasis on the negative binomial and Gamma processes.

There are a number of exercises scattered through the text. It is recommended that the serious reader, that is, those who are considering doing research in this or related areas of mathematics, do as many as possible. I also suggest being prepared to draw pictures to help understand some of the constructions and the arguments. Of course, the more casual reader can do whatever they please!

I have focused on the mathematics in these lectures and have not discussed the history of the development of these ideas. Clearly, the mathematics in this book is the work of many researchers, including many who are active today. I have included a few references for further reading. Many of these works also have extensive bibliographies which can be a good source of original articles.

All of the chapters depend on the material in Chapter 1. Chapter 3 uses Chapter 2, while Chapters 4 and 6 are independent and need only Chapter 1. Chapter 5 uses all the chapters preceding it.

I would like to thank the “random fields” group during the 2020 REU: Nixia Chen, Victor Gardner, Jinwoo Sung, Stephen Yearwood (mentors) and Sam Craig, Nitya Gadhiwala, Jingyi Jia, Ethan Lewis, Mishal Mrinal, Ethan Naegele, Vedant Pathak, Sivakorn Sanguamoo, Rachit Surana, Haozhe Yu, Lingyue Yu, Stanley Zhu (participants), as well as some participants from other groups that interacted with our group: Jake Fiedler, Jessica Metzger, Lucca Prado, Ben Rapport. Among others who commented and sent corrections on early drafts of the notes are Charley Devlin, Vladislav Guskov, Seyhun Ji, Fredrik Viklund, and Zijian Wang. Research was supported by National Science Foundation grant DMS-1513036.

Chapter 1

Markov Chains

1.1. Definition

Suppose A is a countable (finite or countably infinite) set called the state space. A (time-homogeneous) Markov chain on A is a sequence of random variables X0, X1, X2, ... taking values in A that satisfies the following (time-homogeneous) Markov property:

    P{Xn+1 = y | X0 = x0, ..., Xn = xn} = P{Xn+1 = y | Xn = xn} = P{X1 = y | X0 = xn}.

The first equality is the Markov property, which says that the only information about the past and the present that is useful for predicting the future is the present value of the chain. The second equality is time homogeneity. All of our Markov chains will be time homogeneous unless we specify otherwise. To describe the evolution of a time-homogeneous Markov chain we need only specify the initial state and the transition probabilities

    p(x, y) = P{X1 = y | X0 = x}.

If A is finite with N elements, this is called a finite Markov chain and the transition probabilities can be written as an N × N transition matrix P = [p(x, y)].


More generally, the transition probabilities are given by a function p : A × A → [0, 1] satisfying

    Σ_{y∈A} p(x, y) = 1.

In other words, starting at any state x, the probability of being somewhere after one step is one. In the case of a finite chain, this says that the entries are nonnegative and the sum of the entries of every row equals 1. Square matrices with this property are called stochastic matrices. In the infinite case, we can still refer to P as the transition matrix as long as we understand it as an ∞ × ∞ matrix. The transition probabilities p(x, y) give the one-step transitions. We can also define the n-step transitions by

    pn(x, y) = P{Xn = y | X0 = x}.

This includes n = 0, for which

    p0(x, y) = 1 if x = y,  0 if x ≠ y.
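These definitions are easy to experiment with numerically. A minimal sketch in Python (the 3-state matrix is a made-up illustration, not one from the text): it checks the row-sum condition for a stochastic matrix and estimates a 2-step probability by simulating the chain.

```python
import random

# A hypothetical 3-state chain on A = {0, 1, 2}.
P = [[0.5, 0.5, 0.0],
     [0.25, 0.5, 0.25],
     [0.0, 0.5, 0.5]]

# Each row sums to one, so P is a stochastic matrix.
assert all(abs(sum(row) - 1.0) < 1e-12 for row in P)

def step(x, rng):
    """Take one step of the chain from state x."""
    return rng.choices(range(3), weights=P[x])[0]

# Estimate the 2-step probability p_2(0, 2) by simulation.
rng = random.Random(0)
trials = 200_000
hits = sum(step(step(0, rng), rng) == 2 for _ in range(trials))
print(hits / trials)  # close to sum_z p(0,z) p(z,2) = 0.125
```

The exact value 0.125 comes from summing over the intermediate state, which is the content of the Chapman-Kolmogorov equations discussed next.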

The first important result about Markov chains is that the n-step probabilities can be given by taking the transition matrix and raising it to the nth power. We state this in the next proposition without using matrix notation.

Proposition 1.1 (Chapman-Kolmogorov equations). If n, m ∈ N and x, y ∈ A,

    pn+m(x, y) = Σ_{z∈A} pn(x, z) pm(z, y).


Proof. This is a simple application of the law of total probability.

    pn+m(x, y) = P{Xn+m = y | X0 = x}
               = Σ_{z∈A} P{Xn+m = y, Xn = z | X0 = x}
               = Σ_{z∈A} P{Xn = z | X0 = x} P{Xn+m = y | Xn = z, X0 = x}
               = Σ_{z∈A} P{Xn = z | X0 = x} P{Xn+m = y | Xn = z}
               = Σ_{z∈A} pn(x, z) pm(z, y).

The penultimate equality uses the Markov property and the last uses time homogeneity. □

Since the matrix of n-step probabilities is the same as P^n, the study of finite Markov chains makes heavy use of linear algebra. One key theorem in linear algebra is that there is a one-to-one correspondence between matrices and linear transformations. The linear transformation framework is more convenient for extending to infinite state spaces so we will use it here. For the matrix P, this transformation is defined as follows. If f : A → R is a function, then P f : A → R is the function

    (1.1)    P f(x) = Σ_{y∈A} p(x, y) f(y).

If A is finite, this is the same as multiplying the matrix P by the vector f on the right to get the vector P f . Writing it as (1.1) extends the definition to infinite state spaces. It is a linear transformation, P [c1 f + c2 g] = c1 P f + c2 P g. Many of the extensions of linear algebra to infinite matrices are expressed in terms of transformations and are studied in functional analysis. We say that a Markov chain is irreducible if one can get from any state to any other state. More precisely, it is irreducible if for every


x, y ∈ A there exists n ≥ 0 such that pn(x, y) > 0. It is symmetric if for every x, y, p(x, y) = p(y, x). An irreducible Markov chain is recurrent if it returns to every point. To be more precise, if x ∈ A and τx = min{k ≥ 1 : Xk = x}, then the irreducible chain is recurrent if for all x, y ∈ A, P{τx < ∞ | X0 = y} = 1. There are several equivalent ways of defining recurrence. Let Vx denote the total number of visits to the state x,

    Vx = Σ_{n=0}^∞ 1{Xn = x}.

Here we use the indicator function notation: if E is an event, then 1E is the random variable that equals 1 if E occurs and equals 0 if E does not occur. Note that E[1E] = P(E) and hence

    E[Vy | X0 = x] = Σ_{n=0}^∞ E[1{Xn = y} | X0 = x] = Σ_{n=0}^∞ pn(x, y).

Before stating the next proposition it will be useful to introduce some notation. If E is an event and Y is a random variable we write P^x(E) and E^x(Y) for the probability of E and the expectation of Y assuming X0 = x. We can rewrite the equality above as

    E^x[Vy] = E^x[ Σ_{n=0}^∞ 1{Xn = y} ] = Σ_{n=0}^∞ pn(x, y).

We will talk about “visits” and “returns” to a state x for a Markov chain starting at x. They are similar, but a visit includes the visit at time 0, while a return refers only to the visits after time 0.

Proposition 1.2. If Xn is an irreducible Markov chain and x ∈ A, then

    P^x{τx = ∞} = 1 / E^x[Vx].

In particular, P^x{τx = ∞} = 0 if and only if E^x[Vx] = ∞.


Proof. Assume that X0 = x and let fx = P^x{τx < ∞} be the probability that the chain returns to x. If fx = 1, the chain returns with probability one, and by iterating we can see that it returns infinitely often and Vx = ∞. If fx < 1, then Vx = k means that the chain returns to the state k − 1 times and then does not return again. From this we see that

    P^x{Vx = k} = fx^{k−1} (1 − fx),    k = 1, 2, ....

In other words, Vx has a geometric distribution representing the number of trials needed until a success, where the probability of success is 1 − fx. By doing the sum (or checking a book in probability) we see that

    E^x[Vx] = Σ_{k=1}^∞ k P^x{Vx = k} = 1 / (1 − fx).

□

Exercise 1.3. The following are equivalent for an irreducible Markov chain.

• The chain is recurrent.
• For every x, y ∈ A, E^x[Vy] = ∞.
• There exist x, y ∈ A such that E^x[Vy] = ∞.
• For every x, y ∈ A, P^x{Vy = ∞} = 1.
• There exist x, y ∈ A such that P^x{Vy = ∞} = 1.

A chain that is not recurrent is called transient. All irreducible finite Markov chains are recurrent, but there exist irreducible transient Markov chains with infinite state spaces. One of the most important is simple random walk in the integer lattice Z^d for d ≥ 3.

Exercise 1.4. Consider the Markov chain whose state space is the integers with transition probabilities

    p(x, x + 1) = q,

    p(x, x − 1) = 1 − q,

where 0 < q < 1. Let pn = pn(0, 0) = P^0{Xn = 0}.

(1) Show that pn = 0 if n is odd.

(2) Show that

    p2n = (2n choose n) q^n (1 − q)^n,

where (2n choose n) is the binomial coefficient.

(3) Stirling’s formula states that

    lim_{n→∞} n! / (n^{n+1/2} e^{−n}) = √(2π).

Assuming Stirling’s formula, show that if q = 1/2 then

    lim_{n→∞} n^{1/2} p2n = 1/√π.

(4) Show that for q = 1/2, the chain is recurrent.

(5) Show that for q ≠ 1/2, the chain is transient.

An important class of Markov chains is simple random walk on graphs. We consider directed graphs (digraphs) first. A digraph is a countable collection of vertices (also called sites) and directed edges. Each directed edge is an ordered pair (x, y) where x is the initial vertex and y is the terminal vertex. We say that the edge goes from x to y. In this general definition we allow self-edges, that is, edges with x = y, and multiple edges with the same initial and terminal vertices. We call the digraph simple if there are no self-edges and if for each x, y there is at most one edge from x to y. We define the outdegree outdeg(x) of a vertex to be the number of edges with initial vertex x and the indegree indeg(x) to be the number of edges with terminal vertex x. Suppose that outdeg(x) < ∞ for all x. Simple random walk on the digraph is the Markov chain whose state space is the collection of vertices and in which the walker at each step chooses randomly among the possible edges starting at that site. The transition probabilities are given by

    p(x, y) = #{edges from x to y} / outdeg(x).

This assumes outdeg(x) > 0; if not, we set p(x, x) = 1. The graph is (strongly) connected if and only if one can get from any vertex to any other; this is equivalent to saying that this Markov chain is irreducible.
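The transition probabilities of this walk can be computed mechanically from an edge list. A short sketch, where the digraph and the helper `transition_probs` are our hypothetical example, not the text's:

```python
from collections import Counter

# A small hypothetical digraph as a list of directed edges (x, y);
# multiple edges and self-edges are allowed.
edges = [(0, 1), (0, 1), (0, 2), (1, 2), (2, 0), (2, 2)]

def transition_probs(edges):
    """p(x, y) = #{edges from x to y} / outdeg(x); a vertex with
    outdeg 0 gets p(x, x) = 1, following the convention in the text."""
    outdeg = Counter(x for x, _ in edges)
    count = Counter(edges)
    vertices = {v for e in edges for v in e}
    p = {}
    for x in vertices:
        if outdeg[x] == 0:
            p[(x, x)] = 1.0
        else:
            for (a, b), c in count.items():
                if a == x:
                    p[(a, b)] = c / outdeg[x]
    return p

p = transition_probs(edges)
print(p[(0, 1)])  # two of the three edges leaving 0 go to 1
```

Note that the multi-edge from 0 to 1 contributes twice to the count, which is exactly how the formula above treats multiple edges.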


An (undirected) graph is a collection of vertices and a collection of edges, but in this case edges do not have a direction. An undirected edge is an unordered pair {x, y}. In the general definition we allow self-edges {x, x} and we allow multiple edges connecting two vertices. The graph is simple if there are no self-edges and any two vertices are connected by at most one edge. Given an undirected edge one can associate directed edges in the following way.

• If {x, y} is an unordered edge connecting two distinct vertices, we replace the edge with two directed edges, one from x to y and one from y to x.
• If {x, x} is a self-edge, we replace it with a single directed self-edge at x.

The simple random walk on the undirected graph is the same as the simple random walk on the corresponding digraph. One of the most important examples is simple random walk on the integer lattice, where the vertices are Z^d and the edges connect “nearest neighbors”, that is, points that are distance one from each other.
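The edge-doubling construction can be written out directly; the small edge list below is a hypothetical example of ours:

```python
# Convert an undirected graph to the corresponding digraph as in the
# text: each edge {x, y} with x != y becomes two directed edges, and a
# self-edge {x, x} becomes a single directed self-edge.
undirected = [frozenset({0, 1}), frozenset({1, 2}), frozenset({2, 2})]

directed = []
for e in undirected:
    if len(e) == 1:            # self-edge {x, x}
        (x,) = e
        directed.append((x, x))
    else:
        x, y = sorted(e)
        directed.append((x, y))
        directed.append((y, x))

print(sorted(directed))  # [(0, 1), (1, 0), (1, 2), (2, 1), (2, 2)]
```

Simple random walk on the undirected graph is then, by definition, simple random walk on this directed edge list.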

1.2. Laplacian and harmonic functions

Associated to any Markov chain is its Laplacian, which gives the difference between the value of a function of the chain at time 0 and its expected value at time 1. More precisely, we define the Laplacian to be the linear transformation L = I − P where I denotes the identity matrix,

    Lf(x) = (I − P)f(x) = f(x) − Σ_{y∈A} p(x, y) f(y).

For a random walk on a simple graph we can write

    Lf(x) = f(x) − E^x[f(X1)] = f(x) − MV(f; x),

where MV(f; x) denotes the “mean value” of f over the vertices adjacent to x. This is one of several possible definitions of the Laplacian of a Markov chain or a graph. Sometimes the negative of this quantity, MV(f; x) − f(x), is called the Laplacian to make it more analogous to the Laplacian in analysis, as we describe below. In graph theory, the graph Laplacian of a simple undirected graph is often defined as

    L_gr f(x) = deg(x) f(x) − Σ_{x∼y} f(y),

where x ∼ y means that x and y are adjacent. In other words, the graph Laplacian is the degree matrix minus the adjacency matrix. In order to avoid confusion, our L or −L is sometimes called the random walk Laplacian on the graph. If f : R^d → R is a smooth function, the Laplacian is defined as

    ∆f(x) = Σ_{j=1}^d ∂²f(x)/∂xj².

The relationship to the discrete Laplacian can be seen in the next exercise.

Exercise 1.5. Suppose D ⊂ R^d is open, and f : D → R is a function with continuous first and second derivatives. Show that for x ∈ D,

    ∆f(x) = 2d lim_{ε↓0} [MV(f; x, ε) − f(x)] / ε²,

where MV(f; x, ε) denotes the mean (average) value of f on the sphere of radius ε centered at x, {y ∈ R^d : |x − y| = ε}. Hint: expand f around x in its Taylor polynomial of degree 2.

If A′ ⊂ A, we say that a function f is (P-)harmonic on A′ if for all x ∈ A′, P f(x) = f(x), that is,

    Lf(x) = 0,    x ∈ A′.

Note that the function f is defined on A and not just A′ ; it is generally not sufficient to know f on A′ to compute Lf (x) for all x ∈ A′ .
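As a quick numerical illustration (the 4-cycle chain is our example, not the text's), one can compute the random walk Laplacian L = I − P and check that constant functions are harmonic:

```python
# Simple random walk on a 4-cycle: from x, move to (x ± 1) mod 4 with
# probability 1/2 each.
n = 4
P = [[0.0] * n for _ in range(n)]
for x in range(n):
    P[x][(x - 1) % n] = 0.5   # step "left"
    P[x][(x + 1) % n] = 0.5   # step "right"

def laplacian(f):
    """Lf(x) = f(x) - sum_y p(x, y) f(y)."""
    return [f[x] - sum(P[x][y] * f[y] for y in range(n)) for x in range(n)]

print(laplacian([1.0, 1.0, 1.0, 1.0]))  # constants are harmonic: all zeros
```

On this chain every state is interior, so a harmonic function must have Lf(x) = 0 at every x; the constants are the obvious examples.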

1.3. Markov chain with boundary

For many of our examples the state space Ā will be written as Ā = A ∪ ∂A, where A is the set of interior vertices and ∂A the set of boundary vertices. See Figure 1. The terminology and notation come from analysis and


Figure 1. This is a subset of Z^2. Interior vertices in A are represented by filled circles and boundary vertices in ∂A by empty circles. Full lines represent interior edges and dotted lines represent boundary edges.

point-set topology; here A, ∂A, Ā are the analogues of an open set, its boundary, and its closure, respectively. If P is a transition matrix for an irreducible Markov chain on Ā, then there is a corresponding matrix P̃ obtained by stopping the chain when it reaches ∂A, that is, for each x ∈ ∂A we change the entries in the x-row:

    p̃(x, y) = 1 if x = y,  0 if x ≠ y.

We let PA be the matrix P restricted to the rows and columns in A; this is the same as P̃ restricted to those rows and columns. Assume ∂A ≠ ∅ and define the stopping time

    T = TA = min{k ≥ 0 : Xk ∉ A}.

Note that T is a random variable. If Ā is finite, since the chain is irreducible, P^x{T < ∞} = 1. However, this might not be true if Ā is infinite; see Exercise 1.4(5). Recall that the superscript x indicates that we are assuming that the chain starts at the state x. The term stopping time refers to the fact that to determine whether or not T ≤ n one needs to observe only the values of the chain X0, X1, ..., Xn and not the future values of


the chain. In other words, if we have not stopped before time n, our decision whether or not to stop at time n must be made using only the information available at that time.

If Ā is finite and we start in A, there is a probability density function defined on ∂A, called the Poisson kernel, that describes the probability of leaving the set A at a particular point of ∂A.

Definition 1.6. The Poisson kernel is the function HA : Ā × ∂A → [0, 1] given by

    HA(x, z) = P^x{XT = z}.

If x ∈ ∂A, then T = 0 and hence

    (1.2)    HA(x, z) = 1 if x = z,  0 if x ≠ z,    x ∈ ∂A.
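The Poisson kernel can be estimated by direct simulation. The sketch below uses simple random walk on {0, 1, ..., N} with interior A = {1, ..., N − 1}, a concrete example of ours; for this walk the exit probability HA(x, N) = x/N is the classical gambler's-ruin formula, which the estimate should approach.

```python
import random

def exits_at_top(x, N, rng):
    """Run the walk from x until it leaves the interior {1, ..., N-1};
    return True if it exits at the boundary point N."""
    while 0 < x < N:
        x += rng.choice((-1, 1))
    return x == N

rng = random.Random(1)
N, x, trials = 5, 2, 100_000
est = sum(exits_at_top(x, N, rng) for _ in range(trials)) / trials
print(est)  # near H_A(2, 5) = 2/5
```

The same Monte Carlo scheme works for any finite chain with boundary; only the step rule changes.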

Proposition 1.7. Suppose P is an irreducible transition matrix on Ā = A ∪ ∂A. If z ∈ ∂A and h(x) = HA(x, z), then h is the unique bounded function on Ā that is harmonic in A and satisfies the boundary condition (1.2) on ∂A.

Proof. Clearly h satisfies (1.2) on ∂A. If x ∈ A, then T ≥ 1 and hence, using the law of total probability,

    h(x) = P^x{XT = z}
         = Σ_{y∈Ā} P^x{X1 = y, XT = z}
         = Σ_{y∈Ā} P^x{X1 = y} P^x{XT = z | X1 = y}
         = Σ_{y∈Ā} P^x{X1 = y} P^y{XT = z}
         = Σ_{y∈Ā} p(x, y) h(y).

In other words, h(x) = P h(x). This shows that h is harmonic on A.

We need to show that it is unique. Suppose h is a bounded function, harmonic on A, satisfying (1.2) on the boundary. Let T ∧ n = min{T, n}. We claim that for each n,

    h(x) = E^x[h(X_{T∧n})].


We will prove the claim by induction. It is trivially true for n = 0 since T ∧ 0 = 0. Assume it is true for a particular n. If X_{T∧n} ∈ ∂A, then T ≤ n and X_{T∧(n+1)} = X_{T∧n}. If X_{T∧n} ∈ A, then T > n and X_{T∧n} = Xn, X_{T∧(n+1)} = Xn+1. From this we see that

    E[h(X_{T∧(n+1)}) | X_{T∧n} = z] = h(z),              z ∈ ∂A,
    E[h(X_{T∧(n+1)}) | X_{T∧n} = z] = P h(z) = h(z),     z ∈ A.

The last equality uses the fact that h is harmonic on A. Combining this, we get

    E[h(X_{T∧(n+1)}) | X_{T∧n}] = h(X_{T∧n}),

and our claim follows by taking expectations. Note that with probability one,

    (1.3)    lim_{n→∞} h(X_{T∧n}) = h(XT).

We would like to conclude that

    lim_{n→∞} E^x[h(X_{T∧n})] = E^x[h(XT)].

It is not true that this always follows from (1.3). However, if h is a bounded function it will, by the bounded convergence theorem; see Exercise 1.8. If h satisfies (1.2), then E^x[h(XT)] = HA(x, z). □

Exercise 1.8. We will prove the bounded convergence theorem. Those who already know this can skip this exercise. Suppose X1, X2, ... is a collection of random variables such that with probability one, the limit

    X = lim_{n→∞} Xn

exists. Assume also that there exists J < ∞ such that |Xn| ≤ J for all n. Then

    E[X] = lim_{n→∞} E[Xn].

Hint: (1) Show that it suffices to prove this when X is identically equal to 0. (2) For every ε > 0, write

    E[|Xn|] = E[|Xn| 1{|Xn| ≤ ε}] + E[|Xn| 1{|Xn| > ε}].

Use this to conclude that for n sufficiently large, E[|Xn|] ≤ 2ε.

Remark 1.9.

• If Ā is finite, then any function on Ā satisfying (1.2) will be bounded, so we do not need to include boundedness in the assumptions.
• If Ā is infinite, there may be unbounded solutions. For example, if we consider simple random walk on the integers Z and let A = {1, 2, ...} be the positive integers, then HA(x, 0) = 1 for all x > 0. However, for any r ∈ R, the function hr(x) = 1 + rx for x ≥ 0, with hr(x) = 0 for x < 0, is harmonic in A and satisfies (1.2), but it is unbounded when r ≠ 0.

1.4. Green’s function

For a chain with boundary as above, the Green’s function is defined for x, y ∈ A by

    GA(x, y) = Σ_{n=0}^∞ P^x{Xn = y, T > n},

that is, the probability that the chain starting at x is at y at time n and that the chain has not left A. We also consider GA as a linear transformation,

    GA f(x) = Σ_{y∈A} GA(x, y) f(y) = Σ_{y∈A} Σ_{n=0}^∞ P^x{Xn = y, T > n} f(y).

In other words,

    GA = I + PA + PA² + PA³ + ⋯ = (I − PA)^{−1} = LA^{−1}.

This expression uses the convergence of the infinite sum of matrices. We could justify this directly, but we will take another approach. Note that we can write

    GA(x, y) = E^x[Vy] = 1{x = y} + E^x[ Σ_{n=1}^∞ 1{Xn = y} ].

Also,

E[∑_{n=1}^∞ 1{X_n = y} | X_1 = z] = G_A(z, y).

In matrix form this gives G_A = I + P_A G_A, or G_A[I − P_A] = I. Therefore, G_A = (I − P_A)^{-1} = L_A^{-1}. Some people use (I − P_A)^{-1} as the definition of G_A.

The Green's function is not defined for a recurrent irreducible Markov chain if A is the entire state space. One way to define a Green's function is to give a "killing rate". Suppose 0 < s < 1 and that at each step of the chain, we "stop" or "kill" the chain with probability 1 − s. To be more precise, suppose that p(·, ·) is the transition probability in A. We enlarge the state space by adding a "cemetery" site ∂ and considering the chain on Ā = A ∪ {∂} with transition probabilities

p′(∂, ∂) = 1,

p′(x, y) = s p(x, y),   x, y ∈ A,

p′(x, ∂) = 1 − s,   x ∈ A.

It is easy to check that for all x, y ∈ A, p′_n(x, y) = s^n p_n(x, y), and the Green's function for the killed chain is

G_A(x, y; s) = ∑_{n=0}^∞ p′_n(x, y) = ∑_{n=0}^∞ s^n p_n(x, y).

The right-hand side, viewed as a function of s, is called the generating function of the sequence {pn (x, y)} and GA (x, y; s), which is defined for |s| < 1, is the Green’s generating function for the chain.

A standard way to handle "infinite" quantities in mathematics and physics is to put in an extra parameter that makes the quantity finite and then to study the behavior as one varies the parameter in a way so that the quantity goes to infinity. For example, to study recurrent Markov chains, one can study the behavior of the Green's generating function as s → 1. If X is a random variable taking values in the natural numbers with probability distribution p_n, the generating function is

ϕ(s) = E[s^X] = ∑_{n=0}^∞ s^n p_n.

For s > 0 this is the same as the moment-generating function evaluated at log s.
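The identity G_A = (I − P_A)^{-1} = I + P_A + P_A^2 + ⋯, and its generating-function version, are easy to check numerically. Below is a minimal sketch; the three-state substochastic matrix is a made-up example, with the mass missing from each row being the probability of stepping into ∂A.

```python
import numpy as np

# A hypothetical substochastic transition matrix P_A on A = {0, 1, 2};
# each row sums to less than 1, the missing mass being the probability
# of stepping from that state into the boundary.
P = np.array([[0.0, 0.5, 0.0],
              [0.25, 0.0, 0.5],
              [0.0, 0.5, 0.25]])

# Green's function as a matrix inverse: G_A = (I - P_A)^{-1} ...
G = np.linalg.inv(np.eye(3) - P)

# ... and as the (truncated) series I + P_A + P_A^2 + ...
series = sum(np.linalg.matrix_power(P, n) for n in range(200))
assert np.allclose(G, series)

# The Green's generating function sum_n s^n p_n(x, y) is (I - sP)^{-1}.
s = 0.9
G_s = np.linalg.inv(np.eye(3) - s * P)
```

For a recurrent chain on the full state space the inverse does not exist, which is exactly why the text introduces the killing parameter s < 1.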

Exercise 1.13. Consider a binary tree defined as follows. Let A be the set of finite sequences of 0s and 1s such as 0010110. We include the empty sequence, which we represent as ∅. We say that sequence a is the parent of sequence b if b is of the form ar where r is 0 or 1. All sites have a parent except for the empty sequence. Consider the Markov chain with transition probabilities

p(a, b) = 1/3,   p(b, a) = 1/3,   if a is a parent of b.

In other words, the random walker chooses randomly among its parent and its two children. Since the empty sequence ∅ has no parent, we also set p(∅, ∅) = 1/3.

(1) Show that this chain is transient. Hint: See Exercise 1.4.
(2) For b ≠ ∅, let p(b) be the probability that the chain starting at b ever reaches the empty sequence. Explain why p(b) must be of the form λ^{|b|} for some λ, where |b| is the length of (number of digits in) b.
(3) Find λ.
(4) What is G(∅, ∅)?
(5) If b is any sequence, find G(b, ∅) and G(∅, b).
(6) Find a bounded nonconstant function f on A that is harmonic with respect to this chain.
(7) Generalize to the case p(a, b) = q, p(b, a) = 1 − 2q for various values of q.

1.5. An alternative formulation

We will give an alternative formulation of the results of the last few sections. This formulation includes everything we have done so far and in fact is more general, allowing for "nonpositive transition probabilities". The disadvantage is that the probabilistic intuition disappears. To simplify matters we will restrict our discussion to finite state spaces.

We assume we have a finite collection of vertices Ā = A ∪ ∂A and a finite collection of directed edges Ē = E ∪ ∂E. The edges in E have initial and terminal vertex in A; the edges in ∂E have initial vertex in A and terminal vertex in ∂A. We allow multiple edges connecting the same two vertices as well as self-edges that connect a vertex to itself. For each edge we have a weight q(e) which can take values in the complex numbers C. We allow q(e) = 0, although usually such edges can just be ignored.

A path of length n is an ordered n-tuple of edges written as ω = e_1 ⊕ e_2 ⊕ ⋯ ⊕ e_n where for each j, the terminal vertex of e_{j−1} is the same as the initial vertex of e_j. The initial vertex of ω is the initial vertex of e_1 and the terminal vertex of ω is the terminal vertex of e_n. A path of length


0 is just a vertex and is called a trivial path; other paths are called nontrivial paths. The weight q extends to a weight on paths by

q(e_1 ⊕ e_2 ⊕ ⋯ ⊕ e_n) = q(e_1) q(e_2) ⋯ q(e_n).

We also set q(x) = 1 for each trivial path. We let K_n(x, y) be the set of paths of length n from x to y and

K(x, y) = ∪_{n=0}^∞ K_n(x, y),

K(E) = ∪_{x∈A} ∪_{y∈A} K(x, y),   K(∂E) = ∪_{x∈A} ∪_{y∈∂A} K(x, y),

K(Ē) = K(E) ∪ K(∂E).

Note that K_0(x, y) is empty if x ≠ y and contains the trivial path if x = y. Also, K_1(x, y) is the set of edges with initial vertex x and terminal vertex y. If x = y, K(x, y) contains the trivial path at x, while if x ≠ y, K(x, y) contains no trivial paths.

We note that K(Ē) is a countable set and q : K(Ē) → C gives a weight (or "measure") to each path. We would like to define for each V ⊂ K(Ē) the measure

q(V) = ∑_{ω∈V} q(ω).

There is a worry that the sum may not be convergent. We say that q is integrable if

(1.4)   ∑_{ω∈K(Ē)} |q(ω)| < ∞.

In this case, q gives a (complex-valued) measure to all subsets of K(Ē) that is countably additive: if V_1, V_2, … are disjoint, then

q(V_1 ∪ V_2 ∪ ⋯) = ∑_{j=1}^∞ q(V_j).

For the remainder of this section, we will assume that q is integrable.


In measure theory, one first defines positive (nonnegative) measures. These are allowed to take on the value +∞. Later on, one defines signed and complex measures and for these one does not allow infinities. The condition required is that the “total variation” measure is finite. For countable sets, this criterion is exactly (1.4).

Definition 1.14. If

ω = e_1 ⊕ e_2 ⊕ ⋯ ⊕ e_n,   ω′ = e′_1 ⊕ ⋯ ⊕ e′_k

are paths such that the terminal vertex of ω is the initial vertex of ω′, then we define the concatenation to be the path of length n + k given by

ω ⊕ ω′ = e_1 ⊕ e_2 ⊕ ⋯ ⊕ e_n ⊕ e′_1 ⊕ ⋯ ⊕ e′_k.

We already used the ⊕ notation when we wrote a path of length n as a concatenation of n edges. The trivial path is an "identity" for concatenation. The weight q satisfies

q(ω ⊕ ω′) = q(ω) q(ω′).

We define the matrix Q = [q(x, y)]_{x,y∈A},

q(x, y) = ∑_{e∈K_1(x,y)} q(e),

and the Laplacian L = L_q = I − Q. We write Q^n = [q_n(x, y)] for the usual powers of the matrix, whose entries q_n(x, y) give the "transition probabilities". The Green's function for integrable q is defined by

G_A(x, y) = G^q_A(x, y) = ∑_{ω: x→y} q(ω) = ∑_{n=0}^∞ q_n(x, y).

In the first sum, ω : x → y is another way of writing ω ∈ K(x, y). Integrability guarantees that the sums are absolutely convergent. If A is fixed, we sometimes write just G for G_A.

Proposition 1.15. If q is an integrable weight on a finite state space, then G_A = L^{-1}.


Proof. If n ≥ 1, every path in K_n(x, y) can be written as

ω = e ⊕ ω′,   e ∈ K_1(x, z),   ω′ ∈ K_{n−1}(z, y)

for some z ∈ A. Using this we see that

Q^n(x, y) = ∑_{z∈A} Q(x, z) Q^{n−1}(z, y).

By summing over n we get

Q G_A = Q [∑_{n=0}^∞ Q^n] = ∑_{n=1}^∞ Q^n = G_A − I,

and hence L G_A = (I − Q) G_A = I. □

Let K*(x, y) be the collection of nontrivial paths that start at x, end at y, and have no other visits to y. In other words, the path ω ∈ K*(x, y) if and only if ω = e_1 ⊕ e_2 ⊕ ⋯ ⊕ e_n where n ≥ 1, the initial vertex of e_1 is x, the terminal vertex of e_n is y, and y is not the terminal vertex of any of e_1, e_2, …, e_{n−1}. We let K*_n(x, y) be the set of such paths of length n. Every nontrivial ω ∈ K(x, y) has a unique decomposition ω = ω^1 ⊕ ω^2 where ω^1 ∈ K*(x, y) and ω^2 ∈ K(y, y). Using this we get the formulas

G(x, y) = q[K*(x, y)] G(y, y),   x ≠ y,

G(x, x) = 1 + q[K*(x, x)] G(x, x).

In particular, we have established the following, which is the generalization of Proposition 1.2.

Proposition 1.16. If x ∈ A,

G(x, x) = 1 / (1 − q[K*(x, x)]).

To see that this is the generalization, we note that if q = p comes from a Markov chain, then in the notation of Proposition 1.2, G(x, x) = E^x[V_x] and

p[K*(x, x)] = P^x{τ_x < ∞} = 1 − P^x{τ_x = ∞}.
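Proposition 1.16 can also be tested numerically when the weight q comes from a (sub)stochastic matrix: compute G(x, x) as a matrix inverse and compute q[K*(x, x)] by first-step analysis. The 3 × 3 matrix below is a made-up example.

```python
import numpy as np

# Hypothetical substochastic weight matrix Q on A = {0, 1, 2}.
Q = np.array([[0.1, 0.4, 0.2],
              [0.3, 0.0, 0.4],
              [0.2, 0.3, 0.1]])
G = np.linalg.inv(np.eye(3) - Q)

x, others = 0, [1, 2]
# f(y) = weight of paths from y to x with no intermediate visit to x:
# f solves f = Q' f + Q[., x], with Q' the matrix away from x.
Qp = Q[np.ix_(others, others)]
f = np.linalg.solve(np.eye(2) - Qp, Q[others, x])
# q[K*(x, x)]: one-step return, plus a first step away from x followed
# by a return path that avoids x in between.
first_return = Q[x, x] + Q[x, others] @ f

assert np.isclose(G[x, x], 1.0 / (1.0 - first_return))
```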


1.6. Continuous time

We have been discussing Markov chains indexed by integer times. One can also consider Markov chains Y_t indexed by times t ∈ [0, ∞). These can be constructed from discrete-time chains by specifying that each time the process reaches a state, it waits a certain amount of time before taking the next step. In order to keep the process Markovian, we need to assume that the amount of time spent at a site has an exponential distribution. A continuous random variable T has an exponential distribution with rate λ if it has density

λ e^{−λt},   0 < t < ∞,

or equivalently, if the distribution function is

F(t) = P{T ≤ t} = ∫_0^t λ e^{−λs} ds = 1 − e^{−λt}.

That is, P{T > t} = 1 − F(t) = e^{−λt}. We view T as the time until an event occurs when on average one expects λ events to occur in each unit time interval. In particular,

E[T] = ∫_0^∞ t λ e^{−λt} dt = 1/λ.

The exponential distribution satisfies the memoryless property:

P{T > t + s | T > s} = P{T > t}.

This is important in order to construct a Markov chain: the probability of leaving a site soon does not depend on how long one has been at the site.

Exercise 1.17. Suppose T is a nonnegative random variable with a continuous distribution function F satisfying the memoryless property: for all t, s > 0, P{T > t + s | T > s} = P{T > t}. Show that T has an exponential distribution.

Exercise 1.18. Suppose T_1, T_2, …, T_n are independent exponential random variables with rates λ_1, λ_2, …, λ_n. Let T = min{T_1, …, T_n}.


Show that T has an exponential distribution and give its rate. Explain how this fact is implicitly being used in the construction of the continuous time chain below.

We will now construct a continuous-time Markov chain. Suppose X_n is a (discrete-time) Markov chain on a state space A with transition probabilities p(x, y), and λ : A → (0, ∞) is a bounded function. For ease, let us assume that p(x, x) = 0 for all x.

• Whenever we reach a state x, we wait an exponential amount of time with rate λ(x) and then we move to state y with probability p(x, y).

We can also describe this directly without reference to the discrete chain. Let λ(x, y) = λ(x) p(x, y) denote the rate at which the chain moves from x to y. The (time-homogeneous) continuous-time Markov chain Y_t is a process satisfying the Markov property

P{Y_{t+s} = y | Y_r, r ≤ t} = P{Y_{t+s} = y | Y_t},

and, as s ↓ 0, for y ≠ x,

P{Y_{t+s} = y | Y_t = x} = P{Y_s = y | Y_0 = x} = λ(x, y) s + o(s).

If we let p_t(x, y) = P{Y_t = y | Y_0 = x}, then we get a system of differential equations

(d/dt) p_t(x, y) = −λ(y) p_t(x, y) + ∑_{z≠y} p_t(x, z) λ(z, y).

One well-known example is the Poisson process with rate λ, where A is the nonnegative integers, the discrete chain uses the trivial transition matrix p(k, k + 1) = 1 for all k, and the rate is identically λ.
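The construction above is straightforward to simulate: hold at x for an Exp(λ(x)) time, then jump according to p(x, ·). A minimal sketch for a hypothetical three-state chain (the rates and transition probabilities are made up):

```python
import random

# lam[x] is the exponential holding rate at x; p[x] is the discrete
# jump distribution (with p(x, x) = 0, as assumed in the text).
lam = {0: 1.0, 1: 2.0, 2: 0.5}
p = {0: {1: 0.5, 2: 0.5},
     1: {0: 1.0},
     2: {0: 0.25, 1: 0.75}}

def run_chain(x0, t_max, rng):
    """Return the jump times and states of Y_t on [0, t_max]."""
    t, x, path = 0.0, x0, [(0.0, x0)]
    while True:
        t += rng.expovariate(lam[x])          # wait an Exp(lam(x)) time at x
        if t > t_max:
            return path
        states = list(p[x])
        weights = [p[x][y] for y in states]
        x = rng.choices(states, weights=weights)[0]   # jump with prob p(x, y)
        path.append((t, x))

path = run_chain(0, 10.0, random.Random(42))
```

Exercise 1.18 is implicitly at work here: waiting Exp(λ(x)) and then choosing y with probability p(x, y) is equivalent to running independent exponential clocks with rates λ(x, y) and jumping when the first one rings.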

Further Reading We have chosen the aspects of Markov chains that will be important in this book. There are many introductions to discrete time stochastic processes, e.g., [3, 6]. For a deeper understanding of Markov chains with a view towards their application to computer algorithms, see [12]. For a higher level introduction to a number of discrete processes, see [13].

Chapter 2

Loop-Erased Random Walk

In this chapter we define and give some basic properties of the loop-erased random walk (LERW) and show how the LERW relates to another model, the uniform spanning tree (UST). We will first discuss loop erasure of a Markov chain, which will be used to construct spanning trees, and later we discuss the generalizations to arbitrary edge weights q. Throughout this chapter we will assume that our set of vertices can be written as Ā = A ∪ ∂A. A good example to keep in mind is where A is a finite subset of the integer lattice Z² and the Markov chain is simple random walk (probability 1/4 of moving to each of its nearest neighbors) stopped when the chain leaves A.

2.1. Loop erasure

We will write a path ω of length n as a finite sequence of points

(2.1)   ω = [ω_0, …, ω_n]   with ω_j ∈ A.


In Section 1.5 we wrote a path as ω = e1 ⊕ · · · ⊕ en where ej denotes a directed edge. If for all x, y ∈ A, there is only one directed edge from x to y, then this is equivalent to (2.1) by choosing ej to be the edge connecting ωj−1 to ωj . If there are multiple edges connecting vertices, then the formulation (2.1) loses the information about which edge is being traversed. Since this chapter focuses on the vertices, we will not worry about this. More precisely, if there are multiple edges connecting two vertices we combine these edges into a single edge whose weight is the sum of the weights of the individual edges.

We call a path a self-avoiding walk (SAW) of length n if all of the vertices {ω_0, …, ω_n} are distinct. If ω is a path that is not self-avoiding, there may be many SAWs from ω_0 to ω_n that are subpaths of ω. We will focus on a particular path which is called the (chronological) loop erasure of ω. It is obtained by traversing the path ω until one reaches a vertex that one has already visited and then erasing the loop that was just made. We then continue until we reach another duplicate vertex in this new path and we erase that loop. Eventually we get to ω_n and we stop. Note that if ω_0 = ω_n, we will eventually erase everything and get the trivial path [ω_0]. More generally, if we visit ω_0 many times and we let k be the largest index with ω_0 = ω_k, then we will eventually erase all the vertices in [ω_0, ω_1, …, ω_k] before continuing on. We use this observation to give a more convenient definition of the loop-erasing procedure.

Definition 2.1. If ω = [ω_0, ω_1, …, ω_n] is a path, then its (chronological) loop erasure, denoted by LE(ω), is the SAW η defined as follows.

• Let η_0 = ω_0 and let j_0 be the largest index j such that ω_j = ω_0.


• Recursively, for k ≥ 0 we do the following.
  – If j_k = n, stop and set LE(ω) = [η_0, η_1, …, η_k].
  – Otherwise, set η_{k+1} = ω_{j_k+1} and let j_{k+1} be the largest index j such that ω_j = ω_{j_k+1}.

Note that the definition of LE(ω) depends on the order in which the vertices [ω_0, ω_1, …, ω_n] are traversed. This procedure is sometimes called "forward loop-erasing" to distinguish it from "backward loop-erasing", defined as follows.

• Let ω^R = [ω_n, ω_{n−1}, …, ω_0] be the path ω traversed in the opposite direction.
• Let η̃ = LE(ω^R) and let LE^R(ω) = η̃^R.

Exercise 2.2. Give an example of a path ω for which LE^R(ω) ≠ LE(ω).

Although there are many other possible ways to erase loops from ω, when we write LE(ω) we will always mean the chronological loop erasure. It is important to remember that the mapping ω ↦ LE(ω) is a deterministic operation. Of course, if ω is a random path, then LE(ω) is also a random path.
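Definition 2.1 translates almost verbatim into code. A minimal sketch, with a path given as a list of vertex labels:

```python
def loop_erase(omega):
    """Chronological loop erasure LE(omega) of a path (list of vertices)."""
    eta, j = [], 0
    while True:
        v = omega[j]
        # j_k <- the largest index j with omega[j] equal to the current vertex
        j = max(i for i, w in enumerate(omega) if w == v)
        eta.append(v)
        if j == len(omega) - 1:      # j_k = n: stop
            return eta
        j += 1                       # eta_{k+1} = omega_{j_k + 1}

# The walk 0,1,2,1,3 makes the loop 1,2,1, which is erased:
assert loop_erase([0, 1, 2, 1, 3]) == [0, 1, 3]
# A closed path erases down to the trivial path:
assert loop_erase([0, 1, 0]) == [0]
```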

2.2. Loop-erased random walk

Let X_n be an irreducible Markov chain with transition matrix P = [p(x, y)] on state space Ā = A ∪ ∂A with A finite and ∂A ≠ ∅. We will investigate "LERW from x ∈ A to ∂A", by which we mean: start the Markov chain in state x, stop it when it first reaches ∂A, and then erase the loops to produce a self-avoiding path. Let T denote the first time that the chain reaches ∂A,

T = T_A = min{k ≥ 0 : X_k ∉ A},

and let H_A(x, z) denote the Poisson kernel as in Section 1.3,

H_A(x, z) = P^x{X_T = z},   x ∈ A, z ∈ ∂A.

Note that for each x,

∑_{z∈∂A} H_A(x, z) = 1.

If x ∈ A, let K_A(x, ∂A) denote the set of paths starting at x stopped at the first time that they reach the boundary. In other words, it is the set of ω = [ω_0, …, ω_n] where n ≥ 1, ω_0 = x, ω_n ∈ ∂A, and ω_0, …, ω_{n−1} ∈ A. We write

K_A(x, ∂A) = ∪_{z∈∂A} K_A(x, z),

where K_A(x, z) denotes the set of such paths that end at z ∈ ∂A. The transition matrix P induces a probability measure p on K_A(x, ∂A) with p[K_A(x, z)] = H_A(x, z): if ω = [ω_0, …, ω_n] ∈ K_A(x, ∂A),

p(ω) = P^x{X_1 = ω_1, X_2 = ω_2, …, X_n = ω_n} = ∏_{j=1}^n p(ω_{j−1}, ω_j).

Let R_A(x, ∂A) denote the set of self-avoiding walks (SAWs) in K_A(x, ∂A) and define R_A(x, z) similarly.

Definition 2.3. Loop-erased random walk (LERW) from x to ∂A is the probability measure p̂ on R_A(x, ∂A) given by

p̂(η) = p̂_A(η) = ∑_{ω∈K_A(x,∂A), LE(ω)=η} p(ω).

Since the initial and terminal vertices are fixed in the loop-erasing procedure, we see that for all z ∈ ∂A,

p̂_A[R_A(x, z)] = p[K_A(x, z)] = H_A(x, z).

The next proposition gives an expression for p̂_A(η). Recall the definition of the Green's function G_A(·, ·) from Section 1.4.

Proposition 2.4. Suppose that η = [η_0, η_1, …, η_k] ∈ R_A(x, ∂A) and let A_j = A \ {η_0, …, η_{j−1}}. Then

p̂_A(η) = p(η) ∏_{j=0}^{k−1} G_{A_j}(η_j, η_j).


Proof. Suppose ω ∈ K_A(x, ∂A) satisfies LE(ω) = η. By reviewing the loop-erasing procedure, we can see that ω can be written uniquely as

ℓ_0 ⊕ [η_0, η_1] ⊕ ℓ_1 ⊕ [η_1, η_2] ⊕ ⋯ ⊕ [η_{k−2}, η_{k−1}] ⊕ ℓ_{k−1} ⊕ [η_{k−1}, η_k],

where

• ℓ_0 is the loop erased at η_0 in A, representing the path ω from its first to its last visit to η_0.
• ℓ_1 is the loop erased at η_1, representing the path from its first visit to η_1 after its last visit to η_0 to its last visit to η_1.
• In general, ℓ_j is the loop erased at η_j, representing the path from its first visit to η_j after its last visits to η_0, …, η_{j−1} to its last visit to η_j.

For ℓ_j we can choose any loop that lies in A and does not visit {η_0, …, η_{j−1}}, that is, any loop in K_{A_j}(η_j, η_j). Note that ∑ p(ℓ_j) = G_{A_j}(η_j, η_j), where the sum is over all such ℓ_j. Using this we see that the p-measure of the set of possible ω is given by

G_0 · p(η_0, η_1) · G_1 · p(η_1, η_2) ⋯ p(η_{k−2}, η_{k−1}) · G_{k−1} · p(η_{k−1}, η_k),

where G_j = G_{A_j}(η_j, η_j). Since

p(η) = ∏_{j=1}^k p(η_{j−1}, η_j),

the proposition is proved. □

An important quantity appears in the last proposition. Suppose V = {x_1, x_2, …, x_k} ⊂ A and let A_j = A \ {x_1, …, x_{j−1}}. Let

F(A; x_1, …, x_k) = ∏_{j=1}^k G_{A_j}(x_j, x_j).

The expression on the right-hand side looks like it depends on the order in which we write down the vertices x_1, x_2, …, x_k. The surprising thing is that it does not!


Lemma 2.5. If σ : {1, …, k} → {1, …, k} is a permutation, then

F(A; x_{σ(1)}, …, x_{σ(k)}) = F(A; x_1, …, x_k).

Proof. This is trivial if k = 1, so let us first consider the case k = 2 and write x = x_1, y = x_2.

Let ω = [ω_0, ω_1, …, ω_n] denote a loop rooted at x in A. Let s be the smallest index j such that ω_j = y; if there is no such index, let s = n + 1. Let r be the largest index j less than s with ω_j = x, and write ω = ω^− ⊕ ω^+ where ω^+ = [ω_r, …, ω_n]. If s = n + 1, then ω^+ is the trivial loop at x. This decomposition is unique. We can choose any loop rooted at x in A \ {y} for ω^−, and ω^+ is a loop rooted at x that is either trivial or has the property that it visits y before its first return to x.

Let ω̃ = [ω̃_0, ω̃_1, …, ω̃_m] denote a loop rooted at y in A. In this case, let s be the largest index j such that ω̃_j = x, let r be the smallest index j greater than s such that ω̃_j = y, and write ω̃ = ω̃^− ⊕ ω̃^+, where ω̃^− = [ω̃_0, …, ω̃_r]. If ω̃ does not visit x, we let ω̃^− be the trivial loop at y. This decomposition is unique. We can choose any loop rooted at y in A \ {x} for ω̃^+, and ω̃^− is a loop rooted at y that is either trivial or has the property that it does not visit y at any time after the last visit to x and before the terminal visit to y.

We now see that there is a natural bijection between the choices for ω^+ and ω̃^−. To be more precise, the trivial loop at x is associated to the trivial loop at y, and if

ω^+ = [ω^+_0, ω^+_1, …, ω^+_m],

we associate to ω^+ the loop

ω̃^− = [ω^+_j, ω^+_{j+1}, …, ω^+_m, ω^+_1, …, ω^+_j],


where j is the smallest index with ω^+_j = y. Note that under this pairing p(ω^+) = p(ω̃^−). Using this bijection, we see that F(A; x, y) = F(A; y, x).

For general k, we can use the k = 2 result to establish the result for any permutation σ that just interchanges two adjacent indices. Any permutation can easily be seen to be a product of such transpositions. □

Exercise 2.6. Check the last paragraph of the last proof.

Given this lemma we can make the following definition.

Definition 2.7. If V = {x_1, …, x_k} ⊂ A, let

(2.2)   F_V(A) = ∏_{j=1}^k G_{A_j}(x_j, x_j),

where A_j = A \ {x_1, …, x_{j−1}}. This quantity is independent of the ordering of the vertices. We also make the following conventions.

• If V = A, we write just F(A) for F_A(A).
• If V ⊄ A, then F_V(A) = F_{V∩A}(A).
• If η is a path, F_η(A) = F_V(A) where V is the set of vertices visited by η.

With this definition, we can restate Proposition 2.4.

Proposition 2.8. If η = [η_0, η_1, …, η_k] ∈ R_A(x, ∂A), then

p̂_A(η) = p(η) F_η(A).
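Lemma 2.5, and hence Definition 2.7, invites a numerical sanity check. The sketch below uses a made-up substochastic matrix on four states (the mass missing from each row is the probability of stepping to ∂A) and evaluates F(A; x_1, …, x_k) for every ordering of V = {1, 2, 3}.

```python
import itertools
import numpy as np

# Hypothetical substochastic transition matrix on A = {0, 1, 2, 3}.
P = np.array([[0.0, 0.3, 0.3, 0.2],
              [0.3, 0.0, 0.2, 0.3],
              [0.3, 0.2, 0.0, 0.3],
              [0.2, 0.3, 0.3, 0.0]])
A = [0, 1, 2, 3]

def green(dom):
    """Green's function G_B of the chain killed on leaving dom."""
    sub = P[np.ix_(dom, dom)]
    return np.linalg.inv(np.eye(len(dom)) - sub)

def F(order):
    """F(A; x_1, ..., x_k) = prod_j G_{A_j}(x_j, x_j), A_j = A minus the
    previously listed vertices."""
    val, dom = 1.0, list(A)
    for x in order:
        g = green(dom)
        i = dom.index(x)
        val *= g[i, i]
        dom.remove(x)
    return val

# Every ordering of V = {1, 2, 3} gives the same product (Lemma 2.5).
vals = [F(perm) for perm in itertools.permutations([1, 2, 3])]
assert np.allclose(vals, vals[0])
```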

2.3. Determinant of the Laplacian

We gave a combinatorial argument, sometimes referred to as a "bijective argument" since it gives an explicit bijection, to show that F_V(A) does not depend on the order of the vertices in V. When mathematicians establish results like this, they often decide to look more closely to see if they can give another expression for the same quantity from which the independence of ordering would be more obvious. We will do this by relating F_V(A) to the determinant of the Laplacian, or equivalently, to the determinant of the Green's function.


Proposition 2.9. If V ⊂ A, then

F_V(A) = det G_A / det G_{A\V} = det L_{A\V} / det L = det G̃_V,

where G̃_V is the matrix G_A restricted to the rows and columns indexed by V. In particular,

F(A) = 1 / det L = det G_A.

Here L = L_A = G_A^{-1}.

Recall that G_V = (I − P_V)^{-1} where P_V is the transition matrix restricted to the rows and columns indexed by V. This is not the same as G̃_V, which is defined as the matrix G_A restricted to those rows and columns.

Proof. We will first prove the result about F(A) by induction on the number of elements of A. If A = {x} contains only one element, then P is a 1 × 1 matrix with entry p = p(x, x), and G_A is a 1 × 1 matrix with entry

G_A(x, x) = 1 + p + p^2 + ⋯ = 1/(1 − p) = 1/det(I − P).

More generally, suppose that A = {x} ∪ A′ and that the result holds for A′. By choosing an ordering of A that starts with x, we see that

F(A) = G_A(x, x) F(A′) = G_A(x, x) det G_{A′}.

Consider the function on A given by g(y) = G_A(y, x). Then Lg(y) = 1{y = x}. The last equation can be solved using Cramer's rule (see below), giving

g(x) = det M / det L,

where M is the matrix obtained from L by replacing the column associated to the site x with the column vector that is 1 at x and


0 everywhere else. By expanding along this column, we see that det M = det[L_{A\{x}}]. This gives the relation

G_A(x, x) = det L_{A′} / det L_A = det G_A / det G_{A′}.

Hence F(A) = G_A(x, x) det G_{A′} = det G_A. More generally, by definition, we can see that F(A) = F_V(A) F(A \ V) and hence

F_V(A) = F(A) / F(A \ V) = det G_A / det G_{A\V}.

We leave the equality F_V(A) = det G̃_V as Exercise 2.11. □


Suppose we have a system of equations Qx = v, where Q is an n × n matrix, v = (v_1, …, v_n)^T is a column vector, and x = (x_1, …, x_n)^T are the unknowns. Cramer's rule states that if Q is invertible, then the unique solution is given by

x_j = det Q_j / det Q,

where Q_j is the matrix obtained from Q by replacing the jth column with the vector v. While this is a nice compact formula, it is less efficient numerically than other ways to solve the equation, such as row reduction. For this reason Cramer's rule is often not emphasized in linear algebra courses.
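Cramer's rule as stated in the box takes only a few lines of numpy; the 3 × 3 system below is a made-up example.

```python
import numpy as np

# A hypothetical invertible system Qx = v.
Q = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
v = np.array([1.0, 2.0, 3.0])

x = np.empty(3)
for j in range(3):
    Qj = Q.copy()
    Qj[:, j] = v                  # replace the jth column with v
    x[j] = np.linalg.det(Qj) / np.linalg.det(Q)

assert np.allclose(Q @ x, v)                    # solves the system
assert np.allclose(x, np.linalg.solve(Q, v))    # agrees with row reduction
```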

Exercise 2.10. Suppose Xj , j = 0, 1, . . . , is an irreducible Markov chain on A = A ∪ ∂A and suppose that either the chain is transient or ∂A ̸= ∅. Let x, y be distinct points in A, and let q(x, y) be the probability that the chain starting at x reaches y before leaving A or returning to x. Show that GA (x, y) = q(x, y) FV (A).


where V = {x, y}.

Exercise 2.11. Suppose X_j, j = 0, 1, …, is an irreducible Markov chain on Ā = A ∪ ∂A where A = {x_1, …, x_n} is finite and ∂A ≠ ∅. Let P_A = [p(x, y)]_{x,y∈A} denote the transition matrix restricted to A and G_A = (I − P_A)^{-1} the Green's function. Suppose V = {x_1, …, x_k}. We will consider the Markov chain that corresponds to the original chain "viewed only when visiting points in V". The transition matrix P̃ = [p̃(x, y)]_{x,y∈V} is given by

p̃(x, y) = ∑_{ω: x→y} p(ω),

where the sum is over all paths ω = [ω_0, …, ω_r] with r ≥ 1; ω_0 = x, ω_r = y; and ω_1, …, ω_{r−1} ∈ A \ V.

(1) Show that the Green's function G̃_V := (I − P̃)^{-1} is the same as G_A restricted to rows and columns indexed by vertices in V.
(2) Show that F_V(A) = det G̃_V.
(3) Let P_V be the matrix P restricted to rows and columns indexed by V. Let G_V = (I − P_V)^{-1} be the Green's function. Give an example to show that G_V need not be the same as G̃_V.

Exercise 2.12. Suppose X_j, j = 0, 1, …, is an irreducible, transient Markov chain on a countable state space A and suppose that V is a finite subset of A. Redo Exercise 2.11 in this case.

2.4. Laplacian random walk We have defined the loop-erased walk from x ∈ A to ∂A as a probability measure on paths. If V ⊂ ∂A, the phrase “loop-erased walk from x to V in A” can have several possible meanings depending on whether one is stopping the original chain or the loop-erased process and whether one is normalizing to make it a probability measure. For this section, we will interpret this in the same way as in the previous section.


• Run the Markov chain starting at x, stopped at time T, the first time that X_T ∈ ∂A.
• Condition on the event that X_T ∈ V.
• Erase loops.

We can also view this as a process

X̂_0, X̂_1, …, X̂_T,

where X̂_0 = x and T is a random time with X̂_1, …, X̂_{T−1} ∈ A and X̂_T ∈ V ⊂ ∂A. This is not a Markov process. The conditional distribution of X̂_k given X̂_0, …, X̂_{k−1} depends on the entire past and not just on the value X̂_{k−1}. We can still talk about "transition probabilities", but they depend on the full path. Let η denote a SAW and

p̂(x_k | η) = P{X̂_k = x_k | [X̂_0, …, X̂_{k−1}] = η}.

To compute this, let us return to the proof of Proposition 2.4. Let K_A(z, w) be the set of paths starting at z, ending at w, and such that all other vertices are in A. In other words, it is the set of paths ω = [ω_0, ω_1, …, ω_k] for some k with ω_0 = z, ω_k = w, and ω_1, …, ω_{k−1} ∈ A. We allow either or both of z, w to be in ∂A. We will allow the trivial path z = w, k = 0 if and only if z ∈ A. Note that K_A(z, z) is the set of loops in A rooted at z and

p[K_A(z, z)] = ∑_{ℓ∈K_A(z,z)} p(ℓ) = G_A(z, z).

If V ⊂ ∂A, we set

K_A(z, V) = ∪_{w∈V} K_A(z, w).

If x ∈ A, we will write A_x = A \ {x}; that is, we turn x into a boundary point. If x ∈ A, V ⊂ ∂A, then by splitting at the last visit to x, we can write ω ∈ K_A(x, V) as

ω = ℓ_0 ⊕ ω*


with ℓ_0 ∈ K_A(x, x) and ω* ∈ K_{A_x}(x, V). This is the starting point in the proof of Proposition 2.4. The distribution of ω* is that of a Markov chain starting at x, stopped at the first k ≥ 1 with X_k ∈ ∂A_x, conditioned so that the terminal point is in V. Note that LE(ω) = LE(ω*). From this we get the following "ignore the first loop" principle.

• Consider the probability distribution on SAWs obtained by starting a chain at x ∈ A, stopping when it reaches ∂A, conditioning to exit A at V, and then erasing loops. The distribution is the same if we also condition that the chain does not return to x before reaching ∂A.

We can also write

ω = ℓ_0 ⊕ [x, y] ⊕ ω′,

where p(x, y) > 0 and ω′ ∈ K_{A_x}(y, V). Here y represents the first step of the LERW. With this decomposition we see the following lemma. We first give a definition and a simple exercise.

Definition 2.13. Suppose z, w are distinct points in ∂A. Then the boundary Poisson kernel, denoted H_{∂A}(z, w), is the measure of the set of paths starting at z, ending at w, and otherwise staying in A. If z = w we define H_{∂A}(z, z) = 1. If V ⊂ ∂A, we write

H_{∂A}(z, V) = ∑_{w∈V} H_{∂A}(z, w).

Exercise 2.14. Show that if x ∈ A, z ∈ ∂A,

H_A(x, z) = G_A(x, x) H_{∂A_x}(x, z),

H_{∂A_x}(x, z) = ∑_{y∈A} p(x, y) H_{A_x}(y, z).

More generally, if V ⊂ ∂A,

H_A(x, V) = G_A(x, x) H_{∂A_x}(x, V).

Lemma 2.15. Suppose x ∈ A, V ⊂ ∂A and H_A(x, V) > 0. Let {X_n} be the Markov chain started at x and T = min{k ≥ 0 : X_k ∈ ∂A}. Then for y ∈ A_x ∪ V,

P^x{X_1 = y | X_T ∈ V, x ∉ {X_1, …, X_{T−1}}} = p(x, y) H_{A_x}(y, V) / H_{∂A_x}(x, V).


Here A_x = A \ {x}.

Exercise 2.16. Check this.

Proposition 2.17. Consider LERW from x_0 to V ⊂ ∂A in A. The conditional distribution of X̂_k, X̂_{k+1}, …, X̂_T given X̂_0 = x_0, …, X̂_k = x_k is the same as LERW from x_k to V in A_k := A \ {x_0, …, x_{k−1}}, and also the same as LERW from x_k to V in A_{k+1}. In particular, if h_k(y) = H_{A_k}(y, V), then

P{X̂_{k+1} = y | [X̂_0, …, X̂_k] = [x_0, …, x_k]} = p(x_k, y) h_{k+1}(y) / H_{∂A_{k+1}}(x_k, V).

Since LERW is a process that chooses its next step using a harmonic function, it also goes under the name Laplacian random walk. The harmonic function h_k changes at each step and depends on the path up to that time. For that reason it is analogous to "moving boundary" problems from partial differential equations.

Proposition 2.18. Consider LERW from x to ∂A. Then for each SAW η = [η_0, …, η_k] from x to ∂A in A we have

p̂_A(η) = ∏_{j=0}^{k−1} p(η_j, η_{j+1}) H_{A_{j+1}}(η_{j+1}, ∂A) / H_{∂A_{j+1}}(η_j, ∂A).
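Proposition 2.18 is also a recipe for sampling: at each step, solve for the harmonic function h on the not-yet-visited part of the domain and step with probability proportional to p(x, y) h(y). A minimal sketch for simple random walk on the hypothetical segment A = {1, …, 5} with ∂A = {0, 6}:

```python
import random
import numpy as np

A = [1, 2, 3, 4, 5]          # interior vertices (a made-up example)
boundary = [0, 6]            # the boundary set

def p(x, y):
    """Simple random walk on Z: step left or right with probability 1/2."""
    return 0.5 if abs(x - y) == 1 else 0.0

def hitting_prob(dom):
    """h(y) = H_dom(y, boundary): probability of leaving dom at the true
    boundary rather than at an already-visited (erased) vertex."""
    P = np.array([[p(x, y) for y in dom] for x in dom])
    b = np.array([sum(p(x, z) for z in boundary) for x in dom])
    h = np.linalg.solve(np.eye(len(dom)) - P, b)
    return dict(zip(dom, h))

def laplacian_walk(x0, rng):
    eta = [x0]
    while eta[-1] not in boundary:
        x = eta[-1]
        dom = [v for v in A if v not in eta]     # unvisited interior vertices
        h = hitting_prob(dom)
        h.update({z: 1.0 for z in boundary})     # H(z, .) = 1 on the boundary
        nbrs = [y for y in dom + boundary if p(x, y) > 0]
        weights = [p(x, y) * h[y] for y in nbrs]
        eta.append(rng.choices(nbrs, weights=weights)[0])
    return eta

eta = laplacian_walk(3, random.Random(7))   # a self-avoiding path from 3 to {0, 6}
```

The linear system changes at every step, which is exactly the "moving boundary" flavor mentioned above; by symmetry the path from 3 should end at 0 about half the time.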

We finally can look at the LERW as a Markov process, but a time-inhomogeneous one on an unusual state space.

• The state space is the set of all triples (A, x, V) where A is a subset of the original state space, x ∈ ∂A, and V ⊂ ∂A.
• We assume that H_{∂A}(x, V) > 0.
• The transitions will always be of the form (A, x, V) ⟼ (A_y, y, V), so we only need to give the transition probabilities.
• If x ∈ V, then (A, x, V) is an absorbing state; that is, we stay in that state forever.

• If x ∉ V, then the transition from (A, x, V) to (A_y, y, V) occurs with probability

p(x, y) H_A(y, V) / H_{∂A}(x, V).

The fact that this is a Markov chain is sometimes called the domain Markov property. Here we have chosen x to be a point on ∂A; if x ∈ A, then we choose (Ax , x, V ) as our initial state.

2.5. Putting the loops back on the path

We have decomposed the path of a Markov chain from an interior point to the boundary into two parts: the LERW, which can be considered a "backbone" of the process, and the loops that have been erased. Here we will consider the joint distribution of the LERW and the erased loops.

We go back to the decomposition at the beginning of the proof of Proposition 2.4. A path ω ∈ K_A(x, ∂A) is decomposed into its loop erasure η and a collection of loops {ℓ_0, ℓ_1, …, ℓ_{k−1}} where k = |η| is the length of η. The loop ℓ_j is an element of K_{A_j}(η_j, η_j). If we want to describe the joint distribution of a pair of discrete random variables (X, Y), we can either give all the values

(2.3)   P{X = x, Y = y} = p(x, y),

or we can give the distribution of one and the conditional distribution for the other,

(2.4)   p(x) = P{X = x},   p(y | x) = P{Y = y | X = x}.

In our case X is the LERW η and Y is the k-tuple of loops l = (ℓ_0, ℓ_1, …, ℓ_{k−1}). Proposition 2.4 gives a form of type (2.3),

p(η, l) = p(η) p(l),   where p(l) = ∏_{j=0}^{k−1} p(ℓ_j),

and this is restricted to (η, l) such that ℓ_j is a loop rooted at η_j staying in A \ {η_0, …, η_{j−1}}. We then used this to give the marginal distribution on η,

p̂(η) = p(η) F_η(A).


We can also write this as in (2.4),

p(η, l) = p̂(η) p(l) F_η(A)^{−1}.

Proposition 2.18 gives an expression for p̂(η) that does not involve the loops. Hence we get the conditional distribution of the loops given η. We summarize this in the following proposition.

Proposition 2.19. Consider LERW from x to ∂A; that is, assign to each SAW η = [η_0, ..., η_k] from x to ∂A in A probability p̂(η). Suppose that, given η, we choose loops l with probability

F_η(A)^{−1} p(l) = F_η(A)^{−1} p(ℓ_0) p(ℓ_1) ⋯ p(ℓ_{k−1}),

where ℓ_j ∈ K_{A_j}(η_j, η_j). If we then put the loops on the curve and concatenate as in the proof of Proposition 2.4, the resulting path has the distribution of the Markov chain starting at x stopped at ∂A.

Exercise 2.20. Show that the following gives another way to sample the loops ℓ_0, ..., ℓ_{k−1} given η.
• Start independent Markov chains X^j at each η_j and let T^j = min{n ≥ 0 : X^j_n ∈ ∂A_j}.
• Let ρ_j = max{m < T^j : X^j_m = η_j}.
• Output ℓ_j = [X^j_0, X^j_1, ..., X^j_{ρ_j}].
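The three steps of Exercise 2.20 are easy to simulate for a single loop. A minimal sketch, assuming the chain is simple random walk on Z and taking an illustrative choice of A_j (both assumptions are ours, not the text's):

```python
import random

random.seed(1)

def sample_loop(x, A_j):
    """Run the chain from x until it exits A_j (time T^j), then return the
    segment up to the last visit rho_j to x before exit: the loop at x."""
    path = [x]
    while path[-1] in A_j:                        # stop at the first exit time T^j
        path.append(path[-1] + random.choice([-1, 1]))   # SRW step on Z
    rho = max(i for i, v in enumerate(path[:-1]) if v == x)  # last visit to x
    return path[:rho + 1]

# Loop at eta_j = 1 inside the (assumed) set A_j = {0, 1, 2, 3}
loop = sample_loop(1, {0, 1, 2, 3})
print(loop)
```

The returned list always starts and ends at x; it is the trivial loop [x] when the chain exits A_j before returning to x.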

2.6. Wilson's algorithm

For this section, a graph will be a simple, connected, undirected graph on a finite collection of vertices V. Associated to the graph is the simple random walk, which at vertex x chooses one of its deg(x) nearest neighbors (adjacent vertices) uniformly at random.

Definition 2.21.
• A (connected) tree is a connected simple graph with no loops. Equivalently, a tree is a simple undirected graph such that for any two distinct vertices x, y there is exactly one SAW in the graph with initial vertex x and terminal vertex y.

• A spanning tree of a graph is a subgraph that contains all the vertices and is a tree.

It is easy to see that any tree on n vertices must have exactly n − 1 edges, and hence so must any spanning tree of a connected graph with n vertices. A finite graph has only finitely many spanning trees, so it makes sense to choose one "at random".

Definition 2.22. A uniform spanning tree (UST) of a graph is a random spanning tree chosen from the uniform distribution on all spanning trees.

This terminology is standard but can be misleading at first. The word "uniform" in uniform spanning tree does not refer to a property of the tree but rather to the probability distribution it is chosen from. Choosing randomly from a uniform distribution is an elementary concept, but it can be challenging for large sets that are defined in a way that makes it hard to determine the number of elements. We will give an algorithm, first found by David Wilson [20], for choosing a spanning tree and compute the probability that a particular spanning tree is chosen using this algorithm. We will discover that this probability is the same for all trees! We not only conclude that the algorithm selects from the uniform distribution, but we also get an expression for the number of trees.

We assume that our graph has n + 1 vertices.
• Write the vertices as V = {x_0, x_1, ..., x_n}, where we have chosen an arbitrary ordering of the vertices. Let V_0 = {x_0}.
• Start a simple random walk at x_1 and stop it when it reaches x_0. Erase loops (chronologically) from the path and add the remaining edges to the tree. Let E_1 be the set of edges added so far and let V_1 be the set of vertices adjacent to an edge in E_1.
• Recursively, if V_k = V, stop. Otherwise let x_j be the vertex of smallest index not in V_k. Run a random walk starting at x_j, stopped when it reaches V_k. Erase loops, and add the remaining edges to E_k, giving E_{k+1}; let V_{k+1} be the set of vertices adjacent to an edge in E_{k+1}.
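The bullet points above translate almost line for line into code. A minimal sketch in Python; the adjacency-dictionary encoding, the helper names, and the 4-cycle example graph are our own illustrative choices:

```python
import random

def loop_erase(path):
    """Chronological loop erasure: when a vertex repeats, cut out the loop."""
    erased = []
    for v in path:
        if v in erased:
            erased = erased[:erased.index(v) + 1]
        else:
            erased.append(v)
    return erased

def wilson(adj, root):
    """Sample a uniform spanning tree of the graph given by adjacency dict adj."""
    in_tree = {root}
    edges = set()
    for start in adj:                    # any fixed ordering of the vertices works
        if start in in_tree:
            continue
        v, path = start, [start]
        while v not in in_tree:          # simple random walk until hitting the tree
            v = random.choice(adj[v])
            path.append(v)
        branch = loop_erase(path)        # the loop-erased branch joins the tree
        edges.update(frozenset((a, b)) for a, b in zip(branch, branch[1:]))
        in_tree.update(branch)
    return edges

# 4-cycle on vertices 0..3: every spanning tree has exactly 3 edges
square = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
tree = wilson(square, root=0)
print(len(tree))  # 3
```

Running `wilson` repeatedly and tallying the output trees gives an empirical check that every spanning tree appears with equal frequency, which is the content of the next proposition.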


It is easy to see that this algorithm produces a random spanning tree of the graph. The next proposition computes the probability distribution on trees and shows that it is uniform over all trees.

Proposition 2.23. If T is a spanning tree of a connected graph with vertices V = {x_0, ..., x_n}, then the probability that T is chosen in Wilson's algorithm is

[ ∏_{j=1}^{n} deg(x_j) ]^{−1} F(A),

where A = {x_1, ..., x_n}. Here F(A) = det G_A is as defined in Section 2.2, using simple random walk on the graph as the Markov chain. The surprising thing is that the probability depends neither on how the points were ordered nor on the tree T.

Proof. We fix a tree T and compute the probability that it is output by this algorithm. We start by writing the tree in a useful way; see Figure 2.6. Let V_0 = {x_0}, E_0 = ∅.
• There is a unique SAW from x_1 to x_0 in the tree. Denote it by η^1 and let V_1 be the vertices in η^1, E_1 the edges in η^1.
• Recursively, if V_{k−1} ≠ V, let x be the vertex of smallest index not in V_{k−1}. There is a unique SAW η^k in T starting at x, ending in V_{k−1}, and such that no other vertices of the path are in V_{k−1}. Such a path exists since T is a spanning tree, and if there were more than one such path we would have a loop in the tree. Add the vertices of η^k to V_{k−1} to get V_k and the edges of η^k to E_{k−1} to get E_k.

In this fashion, we have described the tree T as a sequence of SAWs [η^1, η^2, ..., η^m]. This decomposition has used the ordering of the vertices. By repeated use of Proposition 2.8, we see that the probability of choosing T is

∏_{j=1}^{m} [ p(η^j) F_{η^j}(A \ V_{j−1}) ].

Figure 2.6. Decomposition of a spanning tree into SAWs. η^1 = [1, 7, 3, 9, 0], η^2 = [2, 5, 3], η^3 = [4, 5], η^4 = [6, 8, 0].

However,

∏_{j=1}^{m} p(η^j) = ∏_{k=1}^{n} 1/deg(x_k),   ∏_{j=1}^{m} F_{η^j}(A \ V_{j−1}) = F(A).  □

Since we are choosing from a uniform distribution and we know the probability of picking a particular element, we know how many elements are in the set, which gives the following corollary.

Corollary 2.24. The total number of spanning trees is

[ ∏_{j=1}^{n} deg(x_j) ] F(A)^{−1} = [ ∏_{j=1}^{n} deg(x_j) ] det L_A.

Corollary 2.24 is actually a well-known result that dates to the nineteenth century. It has a somewhat nicer formulation if we switch to the graph Laplacian L^gr_A = Deg − Adj, where Deg is the diagonal matrix whose (x, x) entry is deg(x), and Adj is the adjacency matrix whose (x, y) entry is 1 if x, y are adjacent, with all other entries equal to zero.
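With the graph Laplacian in hand, Corollary 2.24 (and the theorem below) is easy to check numerically. A sketch using NumPy for the complete graph on 4 vertices, an illustrative choice of graph for which Cayley's formula gives 4^{4−2} = 16 spanning trees:

```python
import numpy as np

# Complete graph on vertices {x0, x1, x2, x3}
n = 4
adjm = np.ones((n, n)) - np.eye(n)      # adjacency matrix (no self-edges)
degm = np.diag(adjm.sum(axis=1))        # diagonal degree matrix
L_gr = degm - adjm                      # graph Laplacian Deg - Adj

# Delete the row and column of the root x0 to get L^gr_A
L_gr_A = L_gr[1:, 1:]
print(round(np.linalg.det(L_gr_A)))     # 16, matching Cayley's formula
```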


Theorem 2.25. The number of spanning trees of the graph is given by det L^gr_A. In particular, this quantity is independent of which vertex is labeled x_0.

Proof. Since L^gr_A is obtained from L_A by multiplying the row associated to each x by deg(x), we get

det L^gr_A = [ ∏_{j=1}^{n} deg(x_j) ] det L_A.

□

Exercise 2.26. The complete graph on n + 1 vertices {x_0, x_1, ..., x_n} is the (undirected) simple graph with the maximal number of edges; that is, every pair of distinct vertices is connected by an edge. In particular, each vertex has degree n. Find the number of spanning trees of the complete graph on n + 1 vertices by computing

F(A) = ∏_{j=1}^{n} G_{A_j}(x_j, x_j),   A_j = {x_j, x_{j+1}, ..., x_n},

and using Corollary 2.24.

We can generalize Wilson's algorithm to Markov chains. Suppose P is the transition matrix for an irreducible Markov chain on a finite set A̅ = A ∪ ∂A with ∂A ≠ ∅. A wired spanning tree of A̅ is a spanning tree of the graph A ∪ {∂A} where the boundary ∂A has been "wired" into a single point. If we take ∂A as the "root", we can use Wilson's algorithm to generate a random wired spanning tree. For a general Markov chain, this does not give a uniform measure.

Proposition 2.27. If a wired spanning tree of A̅ is chosen using Wilson's algorithm with transition probabilities p(·, ·), then the probability that the tree T is chosen is

(2.5)  p(T) F(A),   where   p(T) = ∏_{e∈T} p(e).

Here p(e) = p(x, y) if the edge e = {x, y} is oriented so that the unique path from x to ∂A in T goes through y. In particular, for simple random walk stopped when reaching ∂A, the probability that the tree is chosen is

[ ∏_{x∈A} deg(x) ]^{−1} F(A),

and the number of wired spanning trees of A is given by

(2.6)  [ ∏_{x∈A} deg(x) ] F(A)^{−1}.

Exercise 2.28. Prove the last proposition by making appropriate adjustments to the proof of Proposition 2.23.

If x_0 is a vertex in a graph, there is a natural one-to-one correspondence between spanning trees of the graph and wired spanning trees viewing x_0 as the boundary. More generally, if T′ is a tree with k vertices and k − 1 edges, there is a one-to-one correspondence between spanning trees of A containing T′ as a subgraph and wired spanning trees of A \ T′ viewing the vertices of T′ as ∂(A \ T′).

Proposition 2.29. Suppose we have a connected graph with vertices A = {x_0, ..., x_n}. Let T′ be a subtree of the graph with vertex set V containing x_0. Then the probability that a uniform spanning tree of the graph contains T′ is

[ ∏_{y∈V\{x_0}} deg(y) ]^{−1} F(A \ {x_0}) / F(A \ V).

Proof. There is a one-to-one correspondence between wired spanning trees in A \ V and spanning trees in A that contain T′. Therefore, the probability that a uniform spanning tree contains T′ is

#{wired spanning trees of A \ V} / #{spanning trees of A}.

The result follows from (2.6).




Further Reading

The LERW was introduced in [5] and Wilson's algorithm to generate spanning trees was given in [20]. For more details on the approach in this book, see [8, 10].

Chapter 3

Loop Soups

3.1. Introduction

We have seen the role that loops play in loop-erased random walk (LERW) and uniform spanning trees (UST). The configurations, the loop-erased random walk and the uniform spanning tree, were obtained by taking a Markov chain and erasing the loops. The distribution of the remaining configuration could be described without reference to the loops. In this chapter we focus on the loops, and we show that the loops erased in the loop-erasure procedure come from viewing an increasing collection of loops at time t = 1. The parameter t is related to another parameter, called the central charge c, that arises in mathematical physics. We will discuss c in Chapter 7; in this chapter we use t as our parameter.

We will define the loop configurations and show how they can be constructed in three different ways: (ordered) growing loops, rooted loops, and unrooted loops. Along the way we will show why t = 1 corresponds to LERW and UST. In Chapter 6, we will see that t = 1/2 is related to the Gaussian free field (GFF).

For this section, we will assume that we have a finite set A of vertices, a collection of directed edges E, which may include self-edges and multiple edges, and a weight, which is a function from E to R or C. If we denote the weight by p, then it will be nonnegative, while if


we use q, the weight can take on complex values. If q is a potentially complex weight, there is a corresponding nonnegative weight |q|. As in Section 1.5, the weight extends to a weight q on paths in K_A. We include the trivial path at each vertex in A and give the trivial paths weight one. For each weight q there is the corresponding Green's function

(3.1)  G^q_A(x, y) = Σ_{ω: x→y} q(ω) = q[K_A(x, y)].

We will omit the q if the weight is known. As before, we call q integrable if G^{|q|}_A(x, y) < ∞ for every x, y. Integrability allows us to justify summations such as the one in (3.1).

• Convention: All weights in this chapter will be integrable unless explicitly stated otherwise.

An important example is simple random walk on a finite A ⊂ Z². Here E = {x⃗y : |x − y| = 1, x, y ∈ A} with weights p(x⃗y) = 1/4. The reader might choose to focus on this example when reading this chapter.

3.2. Growing loop at a point

In this section we give a model for a "growing loop" rooted at a vertex x in A. We first consider positive weights p, for which there is a nice probabilistic interpretation of this model. We let l_t denote the loop at time t ≥ 0. For each t, l_t is a random variable taking values in K := K(x) := K_A(x, x). We allow the trivial loop. We assume that l_0 is the trivial loop and that the process grows at the end; that is, if s ≤ t, then l_s ⪯ l_t. Here we write l ⪯ l̃ if l is an initial segment of l̃, that is, if l̃ = l ⊕ l* for another loop l*. The distribution at time t = 1 will be that of the loops erased in the chronological loop-erasing procedure.

The notation may be a bit confusing. We write l_t for a random variable that takes values in the set of loops. A loop is a finite sequence of vertices l = [ω_0, ..., ω_k].


When we want to refer to the vertices in a loop, we will write the vertices as ω_j or η_j. If we write l_t, this refers to the entire loop that appears at time t and not to a particular vertex of a loop.

The process is a continuous-time Markov chain (see Section 1.6) whose state space is the set of loops. Let K* be the set of nontrivial loops in K and let λ : K* → [0, ∞) with

λ := Σ_{l∈K*} λ(l) < ∞.

The process adds the nontrivial loop l to the end of the growing loop at rate λ(l). More precisely, we assume that as s ↓ 0,

P{l_{t+s} = l_t | l_r, r ≤ t} = 1 − λ s + o(s),

and for each l ∈ K*,

P{l_{t+s} = l_t ⊕ l | l_r, r ≤ t} = λ(l) s + o(s),

where o(s) denotes a function satisfying lim_{s↓0} o(s)/s = 0.

These assumptions are equivalent to saying that l_t is a continuous-time Markov chain with state space K, starting at the trivial path, with rates λ(l, l ⊕ l′) = λ(l′). To specify the model, we need to give the rate λ.

For n > 0, let K_n = K_n(x) be the set of loops starting at x that return to x exactly n times, that is, the set of loops l = [ω_0, ω_1, ..., ω_k] ∈ K* such that #{j ≥ 1 : ω_j = x} = n. We call the loops in K_1 elementary loops and note that each loop in K_k can be written uniquely as

(3.2)  l = l^1 ⊕ l^2 ⊕ ⋯ ⊕ l^k,   l^j ∈ K_1.

The total measure of the set of elementary loops at x is denoted by

f_x := p(K_1) = Σ_{l∈K_1} p(l) = (G_A(x, x) − 1) / G_A(x, x).


If p is the transition matrix of a Markov chain, f_x is the probability that the chain starting at x returns to x without leaving A. Let ν denote the probability measure on K_1 given by ν(l) = p(l)/f_x. Using (3.2) we see that p[K_k] = f_x^k.

We define our growing loop as follows.
• Let K_t be the number of elementary loops in the growing loop at time t. We assume that K_t is a continuous-time Markov chain with state space N and rates

λ(n, n + k) = f_x^k / k.

• Choose l^1, l^2, ... to be independent, identically distributed loops in K_1 with distribution ν.
• Let l_t = l^1 ⊕ l^2 ⊕ ⋯ ⊕ l^{K_t}, where the right-hand side is defined to be the trivial loop if K_t = 0.

An equivalent definition is the following:
• l_t is a continuous-time Markov chain taking values in K, with l_0 the trivial loop and rates

λ(l, l ⊕ l′) = p(l′)/k,   l′ ∈ K_k.

The process K_t is a negative binomial process with parameter G_A(x, x)^{−1} = 1 − f_x. See Section A.4 for the definition and construction. As can be seen there,

P{K_t = k} = (1 / G_A(x, x)^t) · (Γ(k + t) / (k! Γ(t))) · f_x^k,   k = 0, 1, 2, ...,

and hence

(3.3)  P{l_t = l} = (1 / G_A(x, x)^t) · (Γ(k + t) / (k! Γ(t))) · p(l)   if l ∈ K_k.

When t = 1 we get the simpler expression

(3.4)  P{l_1 = l} = p(l) / G_A(x, x);

in other words, a loop is chosen with probability proportional to its weight p(l).
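Since G_A = (I − P)^{−1}, where P is the matrix of weights restricted to A, the quantities G_A(x, x) and f_x can be computed directly. A minimal numerical sketch, assuming the two-point set A = {(0, 0), (1, 0)} ⊂ Z² with simple random walk weights (our illustrative choice of example):

```python
import numpy as np

# Weights restricted to A: only the edge between the two points survives,
# with weight 1/4 in each direction; all other steps leave A.
P = np.array([[0.0, 0.25],
              [0.25, 0.0]])
G = np.linalg.inv(np.eye(2) - P)   # Green's function G_A = (I - P)^{-1}
g = G[0, 0]                        # G_A(x, x) = 16/15
f = (g - 1) / g                    # f_x = 1/16; the only elementary loop is [x, y, x]
print(g, f, 1 / g)                 # by (3.4), 1/g = P{growing loop at t = 1 is trivial}
```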


Exercise 3.1. Suppose we have a Markov chain on the state space A̅ = {1, 2, 3} with transition probabilities

p(1, 3) = p(2, 3) = 1/3,   p(3, 3) = 1,
p(1, 1) = 1/3,   p(1, 2) = 1/3,   p(2, 1) = 1/2,   p(2, 2) = 1/6.

Let A = {1, 2}. Answer the following questions with x = 1 and with x = 2.
(1) Find f_x, G_A(x, x).
(2) What is the probability that the growing loop at time t = 1 is trivial? What is the probability that it is of length two?
(3) Given that the growing loop at time t = 1 is of length 2, what is the probability that it equals the loop [x, x, x]?
(4) Answer the last two questions with t = 1/2.
(5) Let ρ(t) be the probability that the growing loop at time t is of length 2 and let ρ̃(t) be the probability that the growing loop at time t is [x, x, x]. Find

lim_{t↓0} ρ̃(t)/ρ(t)   and   lim_{t→∞} ρ̃(t)/ρ(t).

Exercise 3.2. Suppose we take a square in Z²,

A = {(0, 0), (0, 1), (1, 0), (1, 1)},

with that ordering of points and the usual simple random walk weights p(x, y) = 1/4 if and only if |x − y| = 1. Answer the following questions with x = (0, 0).
(1) Find f_x, G_A(x, x).
(2) What is the probability that the growing loop at time t = 1 is trivial? What is the probability that it is of length 4?
(3) Given that the growing loop at time t = 1 is of length 4, what is the probability that it visits all four vertices?
(4) Answer the last two questions with t = 1/2.
(5) Let ρ(t) be the probability that the growing loop at time t is of length 4 and let ρ̃(t) be the probability that the growing loop at time t is of length 4 and visits all four vertices. Find

lim_{t↓0} ρ̃(t)/ρ(t)   and   lim_{t→∞} ρ̃(t)/ρ(t).

3.3. Growing loop configuration in A

The growing loop configuration in A is a collection of growing loops, one at each vertex. It will depend on an ordering σ of the vertices, and, as before, we let A_j = A \ {x_1, ..., x_{j−1}}. Let us write G = G_{A,σ} for the set of n-tuples of loops l = (l^1, ..., l^n) such that l^j is a loop in A_j rooted at x_j.

Definition 3.3. Given the ordering of A, the growing loop configuration is an n-tuple l_t = (l^1_t, l^2_t, ..., l^n_t), where l^1_t, l^2_t, ..., l^n_t are independent and l^j_t is a growing loop at x_j in A_j as in the previous section.

If p is a weight and l = (l^1, ..., l^n), we define p(l) = p(l^1) p(l^2) ⋯ p(l^n).

Proposition 3.4. If σ is an ordering of A and p is an integrable nonnegative weight, then for l ∈ G,

P{l_t = l} = (p(l) / (det G)^t) ∏_{i=1}^{n} Γ(j_i + t) / (j_i! Γ(t)),

where j_i is the number of times that the loop l^i returns to its root, as in the previous section.

Proof. The loops l^1_t, ..., l^n_t are independent, and by (3.3),

P{l^i_t = l^i} = (1 / G_{A_i}(x_i, x_i)^t) · (Γ(j_i + t) / (j_i! Γ(t))) · p(l^i).

The result follows by recalling from (2.2) and Proposition 2.9 that

det G = ∏_{i=1}^{n} G_{A_i}(x_i, x_i).




The growing loop configuration depends on the ordering of the vertices, and we would like a definition that does not use the ordering. Although this will not give the same distribution on loops, it will give the same distribution on certain functions of the loops, such as the number of times vertices or edges are visited.

Let us start by viewing the growing loop configuration as a "loop soup". We write σ for the particular ordering of the vertices and let

J_σ = ∪_{j=1}^{n} K_{A_j}(x_j, x_j).

For each l ∈ K_{A_j}(x_j, x_j), let β(l) denote the number of times that the loop returns to x_j. For each l ∈ J_σ, let N_t(l) denote the number of times that l has been added onto a loop by time t. Then N_t := {N_t(l) : l ∈ J_σ} are independent Poisson processes with parameters

λ_l = p(l)/β(l),   if l ∈ K_{A_j}(x_j, x_j).

If we know the values of N_t(l) for all t, l, then we can reconstruct l_t. We can also view N_t as a random multiset of elements of J_σ.¹

¹A multiset is a set where elements can appear more than once. We view N_t(l) as the number of times that the loop l appears in N_t.

Definition 3.5.
• We call N_t the ordered loop soup associated to the measure p with ordering σ.
• More generally, if µ is a nonnegative measure on a countable set A, then a Poisson realization or soup with measure µ is a collection of independent Poisson processes N_t = {N_t(a) : a ∈ A}, where N_t(a) has rate µ(a).

Exercise 3.6. Consider the Markov chain in Exercise 3.1 and suppose that the vertices are ordered 1, 2 and l_t = (l^1_t, l^2_t) is the corresponding growing loop configuration.
(1) For each t, find the probability that the loop l^1_t visits the vertex 2.
(2) For each t, find the expected number of times vertex 2 appears in l^1_t.

(3) Answer the same questions using the ordering 2, 1 and the vertex 1.

Exercise 3.7. Consider the Markov chain in Exercise 3.2 with that ordering and let x = (0, 0), y = (0, 1), w = (1, 0), z = (1, 1). Let l_t = (l^1_t, l^2_t, l^3_t, l^4_t) be the corresponding growing loop configuration. Find the following for each t:
(1) the probability that the loop l^1_t visits y;
(2) the probability that the loop l^1_t visits z;
(3) the probability that the loop l^2_t visits z.
Answer the same questions using the ordering {x, y, z, w}.

Exercise 3.8. Order the vertices A = {x_1, ..., x_n} and define the measure m* on ordered loops as follows. Let A_j = A \ {x_1, ..., x_{j−1}}. If l is a loop in A_j rooted at x_j, then

m*(l) = p(l)/β(l),

where β(l) = #{k : 1 ≤ k ≤ |l|, l_k = x_j}, and m*(l) = 0 for all other loops. Show that

Σ_l m*(l) = log det G_A.

We now focus on t = 1, for which (3.4) shows that the conclusion of Proposition 3.4 can be written as

P{l_1 = l} = p(l) / det G = ∏_{i=1}^{n} p(l^i) / G_{A_i}(x_i, x_i).

The next propositions show that we can construct realizations of a Markov chain by starting with realizations of the loop-erased random walk or the uniform spanning tree, taking conditionally independent realizations of the loop soup at time t = 1, and combining them. For LERW, the key fact is given in Proposition 2.19, while for UST one needs to look at the proof of Wilson's algorithm. We leave checking all of this as exercises.

Proposition 3.9. Suppose x ∈ A and we construct a path ω as follows.


• Take a LERW η = [η_0, ..., η_k] from x to ∂A, that is, a Laplacian walk with the distribution given in Proposition 2.18.
• Given η, choose an ordering of A that starts with η_0 = x, η_1, ..., η_{k−1} and then orders the remaining vertices arbitrarily.
• Take a conditionally independent realization of the ordered loop soup at time t = 1 using this ordering. This gives loops l_0, l_1, ..., l_{k−1}. We do not care about the remaining loops, which by construction do not intersect η.
• Let ω = l_0 ⊕ [η_0, η_1] ⊕ l_1 ⊕ [η_1, η_2] ⊕ ⋯ ⊕ l_{k−1} ⊕ [η_{k−1}, η_k].
Then ω has the distribution of the Markov chain starting at x stopped when it reaches ∂A.

Proposition 3.10. Choose an n-tuple of random paths (ω^1, ..., ω^n) indexed by the vertices in A = {x_1, ..., x_n} as follows.
• Choose a random wired spanning tree T using the probability distribution (2.5).
• Given T, let η^1, ..., η^m be the corresponding SAWs as in Figure 2.6.
• Reorder the vertices to match the figure. For example, in Figure 2.6 the new ordering is 1, 7, 3, 9, 2, 5, 4, 6, 8.
• Take a conditionally independent realization of the ordered loop soup at time t = 1 using this new ordering.
• Let ω̃^i be the path rooted at η^i_0 given by adding the loops to η^i in the same way as in the previous proposition. For all vertices that are not an initial vertex of one of the η^i, let ω̃^j be the trivial path at that vertex. This gives an n-tuple of paths ω̃^j indexed by the vertices of A. Each of these paths either stays in A or stops at the first visit to ∂A.
• Independently for each of these paths, extend the path using the Markov chain until the first visit to ∂A. This gives a collection of paths ω^1, ..., ω^n indexed by the vertices of A.


Then ω^1, ..., ω^n are independent realizations of the chain with the property that Wilson's algorithm applied to ω^1, ..., ω^n outputs T.

Exercise 3.11. Verify Propositions 3.9 and 3.10. Hint: we have done all the work; this is a matter of reviewing the material.

3.4. Rooted loop soup

The ordered loop soup depends on the ordering σ; indeed, every loop rooted at x_j lies in A_j. We will consider a different measure, and a corresponding loop soup, that does not depend on the ordering.

If ω = [ω_0, ω_1, ..., ω_n] is a loop, we write |ω| = n for the number of steps in ω. We define the loop τω by translating the components,

τω = [ω_1, ω_2, ..., ω_n, ω_1].

Note that τω traverses the same loop as ω in the same direction but with a different starting point. Clearly p(τω) = p(ω). From a loop ω one can potentially get as many as n different loops by translation,

ω, τω, τ²ω, ..., τ^{n−1}ω.

However, these may not be distinct. For example, if ω = [x, y, x, y, x], then n = 4 but there are only two distinct rooted loops obtained by this procedure. We let J(ω) be the number of distinct loops obtained. It is easy to see that J(ω) is an integer dividing |ω|. Let J(ω; x) be the number of distinct loops obtained that are rooted at x. Then,

J(ω; x) = (J(ω)/|ω|) #{j : 1 ≤ j ≤ n, ω_j = x}.
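The counts J(ω) and J(ω; x) can be computed by enumerating translations. A small sketch, using the example loop [x, y, x, y, x] from the text:

```python
def rotations(loop):
    """Distinct rooted loops obtained by translating loop = [w0, ..., wn], w0 == wn."""
    body = loop[:-1]                     # drop the repeated endpoint
    n = len(body)
    seen = set()
    for k in range(n):
        rot = body[k:] + body[:k]
        seen.add(tuple(rot + rot[:1]))   # re-append the root to close the loop
    return seen

loop = ['x', 'y', 'x', 'y', 'x']         # n = 4
rots = rotations(loop)
print(len(rots))                         # J(loop) = 2, as the text says
roots_at_x = sum(1 for r in rots if r[0] == 'x')
print(roots_at_x)                        # J(loop; x) = (2/4) * 2 = 1
```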

Note that J(ω; x) = J(τω; x).

Definition 3.12. The rooted loop measure m̃ = m̃_p is the measure that assigns measure

m̃(l) = p(l)/|l|


to every loop l ∈ ∪_{j=1}^{n} K_A(x_j, x_j).

The corresponding Poissonian realization is called the rooted loop soup.

Exercise 3.13. Show that the following is a valid way to get the rooted loop soup.
• Choose an ordering σ of the vertices.
• Take a realization of the ordered loop soup with ordering σ.
• For each ω in the soup, choose a translation τ^k ω, where k is chosen uniformly in {0, 1, ..., |ω| − 1}.

The rooted loop measure is not a probability measure; indeed,

(3.5)  Σ_l m̃(l) = log det G_A.

One way to check this is by using Exercise 3.8, but we give a different argument. Note that

Σ_l m̃(l) = Σ_{n=1}^{∞} (1/n) Σ_{x∈A} p_n(x, x) = Σ_{n=1}^{∞} Tr(P^n)/n = −log det(I − P),

where Tr denotes the trace, that is, the sum of the diagonal elements. The last equality is a matrix identity analogous to the Taylor series

−log(1 − r) = Σ_{n=1}^{∞} r^n/n,   |r| < 1.
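This identity is easy to verify numerically by truncating the series, since the weights are integrable. A sketch with a randomly generated strictly substochastic weight matrix (an illustrative choice of ours):

```python
import numpy as np

rng = np.random.default_rng(0)
P = rng.random((3, 3))
P /= P.sum(axis=1, keepdims=True) * 1.5      # row sums 2/3 < 1, so the series converges

# Truncated sum over rooted loops: sum_n Tr(P^n)/n
series = sum(np.trace(np.linalg.matrix_power(P, n)) / n for n in range(1, 200))
identity = -np.log(np.linalg.det(np.eye(3) - P))
G = np.linalg.inv(np.eye(3) - P)             # Green's function G_A

print(np.isclose(series, identity))                    # True
print(np.isclose(identity, np.log(np.linalg.det(G))))  # True: equals log det G_A
```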

3.5. (Unrooted) random walk loop measure

In the last section, we chose a loop and then randomized its starting point. Since the starting point is not important for many applications, it is useful to consider loops without specifying the starting point.

Definition 3.14.
• An unrooted loop ℓ is an equivalence class of rooted loops under the equivalence relation ω ∼ τω for all ω.

• J(ℓ) is the number of rooted loops in the equivalence class of ℓ.
• The unrooted loop measure m = m_p is the measure that assigns to each unrooted loop

m(ℓ) = J(ℓ) p(ℓ)/|ℓ|.

• The (unrooted) loop soup is a Poissonian realization from m.

We write ℓ for unrooted loops and l or ω for rooted loops. Note that p(ℓ) and |ℓ| are well defined for unrooted loops, since the values are the same for all rooted representatives of ℓ. Using (3.5), we see that

Σ_ℓ m(ℓ) = log det G_A,

or equivalently,

(3.6)  F(A) = det G_A = exp{ Σ_ℓ m(ℓ) }.

Exercise 3.15. Explain why each of the following methods is a valid way to get the unrooted loop soup.
• Take a realization directly from m.
• Take a realization of the rooted loop soup and then "forget the root".
• Choose an ordering σ, take a realization of the ordered loop soup with ordering σ, and then "forget the root".

Exercise 3.16. Show that the probability that the unrooted loop soup has no loops in it at time t is given by (det G_A)^{−t}.

In many ways, the unrooted loop soup is the most fundamental quantity, and indeed, it has the most interesting scaling limit. The ordered loop soup is what arises from an exploration of the unrooted soup. We should think as follows.
• At any particular time t an unrooted loop may appear.


• We choose an ordering of the set A = {x_1, x_2, ..., x_n} and use that to see if a loop has appeared.
• When we explore vertex x_1, we observe all unrooted loops that visit x_1. We root these loops so that they start at x_1. If a loop visits x_1 many times, we choose the root uniformly at random from these visits.
• Recursively, when we explore vertex x_j, we see all the unrooted loops, not previously observed, that visit x_j. These loops contain vertices only in {x_j, ..., x_n}. Again, we convert each into a loop rooted at x_j.

The corresponding collection of loops has the distribution of the ordered loop soup with this ordering. The important thing is that we have not changed the loops in any way except for specifying at which vertex to start and end each loop.

Let us be a little more precise. Suppose we have an ordering of the vertices A = {x_1, ..., x_n}. A realization of the unrooted loop soup will produce (random) times 0 < t_1 < t_2 < t_3 < ⋯, where t_j is the time at which the jth loop appears. If we write ℓ_j for the loop that appears at time t_j, then we can associate to ℓ_j a rooted loop l*_j by rooting the unrooted loop at its vertex of smallest index. If there are multiple choices for this rooted loop, we choose uniformly among the possibilities. We then define a growing loop l_t using the ordering by saying that l_t is the concatenation of the rooted loops obtained from the soup by time t, where the order of concatenation is the order in which the loops appeared in the soup. From this we get the following version of Proposition 3.9.

Proposition 3.17. Suppose x ∈ A and we construct a path ω as follows.
• Take a LERW η = [η_0, ..., η_k] from x to ∂A, that is, a Laplacian walk with the distribution given in Proposition 2.18.
• Choose an independent realization of the unrooted loop soup at time t = 1.
• Given η, choose an ordering of A that starts with η_0 = x, η_1, ..., η_{k−1}.

• Use this ordering (and the extra randomness) to convert the realization of the unrooted soup into a growing loop. This gives loops l_0, l_1, ..., l_{k−1}. We do not care about the remaining loops, which by construction do not intersect η.
• Let ω = l_0 ⊕ [η_0, η_1] ⊕ l_1 ⊕ [η_1, η_2] ⊕ ⋯ ⊕ l_{k−1} ⊕ [η_{k−1}, η_k].

Then ω has the distribution of the Markov chain starting at x stopped when it reaches ∂A.

There is also a corresponding version of Proposition 3.10, but we leave this to the reader.

3.6. Local time and currents

Suppose we have a set A and an integrable weight p on directed edges. For ease we assume there is at most one edge with a given initial and terminal vertex, but we do allow self-edges. In other words, p : A × A → [0, ∞). This also gives a weight θ on undirected edges as follows. If e = {a, b} is an undirected edge connecting distinct vertices, then

θ(e) = p(a, b) + p(b, a),   a ≠ b,

and if e = {a, a}, θ(e) = p(a, a). Conversely, if we start with a function θ on undirected edges, we can define the symmetric weight

p(x, y) = θ(e)/2,   e = {x, y}, x ≠ y,
p(x, x) = θ({x, x}).

Associated to the weight p is a loop soup: it can come from the growing loops, the rooted loop measure, or the unrooted loop measure.

Definition 3.18.


• A function C : A × A → N is called a directed current if for each vertex x,

Σ_{y∈A} C(x, y) = Σ_{z∈A} C(z, x).

We write C = C_A for the set of directed currents on A.
• If C ∈ C, let

p(C) = ∏_{(x,y)∈A×A} p(x, y)^{C(x,y)}.

• A function C′ from undirected edges to N is called an undirected current if for each vertex x, the number of edges adjacent to x that are not self-edges is even; that is, for every x,

Σ_{y≠x} C′({x, y})

is even. We write C′ = C′_A for the set of undirected currents on A.

The loop soup induces random currents as well as local times on vertices.
• Directed current: C_t(x, y) is the number of times the directed edge (x, y) has been traversed by time t.
• Undirected current: if e is an undirected edge, C′_t(e) is the number of times that e has been traversed (in either direction) by time t.
• (Discrete vertex) local time: R_t(x) is the number of times that the vertex x has been visited by time t.

Let us be a little more precise. If ω = [ω_0, ω_1, ..., ω_n] is a loop, we let

R(x, ω) = #{j : 1 ≤ j ≤ n, ω_j = x},
C(x, y, ω) = #{j : 1 ≤ j ≤ n, ω_{j−1} = x, ω_j = y}.

Note that R(x, ω) = R(x, τω) and C(x, y, ω) = C(x, y, τω); that is, the local time and the current associated to a loop are functions of the


unrooted loop. Then

R_t(x) = Σ_ω N_t(ω) R(x, ω),   C_t(x, y) = Σ_ω N_t(ω) C(x, y, ω),

where N_t(ω) is the number of times that ω has appeared by time t. If e is an undirected edge joining x and y, we define

C′_t(e) = C_t(x, y) + C_t(y, x),   x ≠ y,
C′_t(e) = C_t(x, x),   x = y.
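Given a list of rooted loops, the local times R, directed currents C, and undirected currents C′ defined above can be accumulated in one pass. A sketch; the list-of-vertices encoding of a loop follows the text, while the helper name is our own:

```python
from collections import Counter

def local_time_and_currents(loops):
    """Accumulate R(x), C(x,y), C'(e) over rooted loops [w0, ..., wn] with w0 == wn."""
    R, C, Cp = Counter(), Counter(), Counter()
    for w in loops:
        for j in range(1, len(w)):
            R[w[j]] += 1                          # visits counted over steps 1..n
            C[(w[j - 1], w[j])] += 1              # directed edge traversals
            Cp[frozenset((w[j - 1], w[j]))] += 1  # undirected traversals
    return R, C, Cp

loops = [['x', 'y', 'x'], ['x', 'y', 'z', 'x']]
R, C, Cp = local_time_and_currents(loops)
print(R['x'], C[('x', 'y')], Cp[frozenset(('x', 'y'))])  # 2 2 3
```

Note that the directed current is automatically conserved at every vertex, since each loop enters a vertex exactly as often as it leaves it.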

The important fact that we have seen in this section is the following:
• The distributions of R_t = {R_t(x) : x ∈ A}, C_t = {C_t(x, y) : x, y ∈ A}, and C′_t = {C′_t(e) : e an undirected edge} are the same whether we use the rooted loop soup, the unrooted loop soup, or the growing loop soup with any ordering of the vertices.

Given the loops, one can determine the directed current; from it, the undirected current; and from that, the local time. However, each step "loses information" in that one cannot go in the other direction.

It will be most convenient to take an ordering of the vertices A = {x_1, ..., x_n} and consider realizations of the growing loop, which can be given as an n-tuple of loops l_t = (l^1_t, ..., l^n_t), where l^j_t is a loop rooted at x_j in A_j = A \ {x_1, ..., x_{j−1}}. Given l, the directed current C(l), the undirected current C′(l), and the vertex local time R(l) are determined. Likewise, if C ∈ C, the local time R(C) = {R_x(C) : x ∈ A} is well defined.

Proposition 3.19. If σ is an ordering of the vertices,

(3.7)  P{C_t = C} = (p(C) / (det G)^t) Σ_{C(l)=C} ∏_{i=1}^{n} Γ(j_i + t) / (j_i! Γ(t)).

Here the sum is over all l = (l^1, ..., l^n) ∈ G_σ with C(l) = C, and j_i is the number of times that the loop l^i returns to its root.


This proposition follows immediately from Proposition 3.4. A nice property of this expression is the fact that the dependence on the weight p has been factored out of the sum. The sum is a combinatorial object independent of p. We will use it to prove the following proposition, whose conclusion is independent of the ordering.

Proposition 3.20. Suppose C ∈ C with vertex local times Rx = Rx(C). Then the number of l ∈ Gσ with C(l) = C is
∏_{x∈A} [ Rx! / ∏_{y∈A} C(x, y)! ].

In particular, it is independent of the ordering σ.

Proof. Let C and hence {Rx} be given. For each x ∈ A, consider all ordered Rx-tuples
ax = (a1, a2, . . . , aRx),   aj ∈ A,
such that the number of times that y ∈ A appears is C(x, y). For each x, the number of choices for the Rx-tuple is given by the multinomial coefficient
Rx! / ∏_{y∈A} C(x, y)!.
Hence the total number of choices of a = {ax : x ∈ A} is given by
∏_{x∈A} [ Rx! / ∏_{y∈A} C(x, y)! ].

We now choose an ordering σ of the vertices A = {x1 , . . . , xn } and we give a bijection from {l ∈ Gσ : C(l) = C} to collections of such Rx -tuples a = {ax : x ∈ A}. If l = (l1 , . . . , ln ) ∈ Gσ , we build the current C along with a by proceeding chronologically first l1 , then l2 , etc. Each time we reach a vertex and take a next step we record that edge so that for each vertex we get a record of the edges traversed in order. For example, we start with l1 = [ω0 , ω1 , . . . , ωn ]. This gives n edges and will give us n of the components in a. When edge (ωj , ωj+1 ) is reached, we attach the component ωj+1 onto aωj . We continue using all of the lj . This gives a.


To show it is a bijection, we give the inverse map. Suppose a is given as well as an ordering {x1, . . . , xn}. To get l1 we start at x1 and go along the graph, always choosing the next vertex using the ordering given by a. We proceed like this until all of the edges going in or out of x1 are exhausted. This gives l1. We now have the remaining edges and start at x2, doing the same procedure to give l2. Continuing gives l(C, a). □

Corollary 3.21. If C ∈ C,
P{C_1 = C} = [p(C) / det G] ∏_{x∈A} [ Rx! / ∏_{y∈A} C(x, y)! ].

Proof. This follows immediately from (3.7) and the last proposition since Γ(ji + 1) = ji!. □

For t ≠ 1 we do not get a simple expression because the terms ji in (3.7) depend on the loop l and not just on C(l). However, for t = 1/2 there is an expression for symmetric weights. Suppose A is a graph, θ is a function on undirected edges (we allow self-edges) and p is defined as in the beginning of this section. If C′ is an undirected current we define
θ(C′) = ∏_e θ(e)^{C′(e)},
where the product is over undirected edges e.

Proposition 3.22.
P{C′_{1/2} = C′} = [θ(C′) / √(det G)] ∏_{x∈A} [Γ(Rx(C′) + 1/2) / √π] ∏_e [1 / C′(e)!].

We will not write down a proof but leave this as a challenge for the reader; see [8]. The joint distribution of {Rt(x) : x ∈ A} is a little complicated, but we have already seen the marginal distribution.

Proposition 3.23. If x ∈ A, then Rt(x) has a negative binomial distribution with parameter GA(x, x)^{−1} = 1 − fx. In particular,
P{Rt(x) = k} = [1 / GA(x, x)^t] [Γ(k + t) / (k! Γ(t))] fx^k,   k = 0, 1, 2, . . . .
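A quick numeric sanity check of this mass function: since GA(x, x)^{−1} = 1 − fx, the identity ∑_k Γ(k+t)/(k! Γ(t)) fx^k = (1 − fx)^{−t} makes the probabilities sum to one, and the mean works out to t fx/(1 − fx). The parameter values below are arbitrary illustrations.

```python
import math

def neg_binom_pmf(k, t, f):
    # P{R_t(x) = k} = (1-f)^t * Gamma(k+t) / (k! Gamma(t)) * f^k,
    # using GA(x,x)^{-1} = 1 - f so that 1 / GA(x,x)^t = (1-f)^t
    log_p = (t * math.log(1.0 - f)
             + math.lgamma(k + t) - math.lgamma(k + 1) - math.lgamma(t)
             + k * math.log(f))
    return math.exp(log_p)

t, f = 0.5, 0.3
probs = [neg_binom_pmf(k, t, f) for k in range(200)]
total = sum(probs)                              # should be 1
mean = sum(k * p for k, p in enumerate(probs))  # should be t f / (1 - f)
```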

where the product is over undirected edges e. Proposition 3.22. θ(C ′ ) ∏ Γ(Rx (C ′ ) + 12 ) ∏ 1 √ . P{C ′1/2 = C} = √ C ′ (e)! π det G x∈A e We will not put a proof down but allow this as a challenge to the reader; see [8]. The joint distribution of {Rt (x) : x ∈ A} is a little complicated, but we have already seen the marginal distribution. Proposition 3.23. If x ∈ A, then Rt (x) has the distribution of a negative binomial distribution with parameter GA (x, x)−1 = 1 − fx . In particular, Γ(k + t) k 1 P{Rt (x) = k} = f , k = 0, 1, 2, . . . , t GA (x, x) k! Γ(t) x


Proof. Choose an ordering of the vertices so that x comes first. Then note that the process Kt from Section 3.2 is the same as Rt (x). □

3.7. Negative weights

Recall that a random variable X has a Poisson distribution with rate λ > 0 if it has distribution
P{X = k} = e^{−λ} λ^k / k!,   k = 0, 1, 2, 3, . . . .
What if we make λ < 0? This does not make sense as a random variable, but we can still consider the signed measure on the nonnegative integers given by
(3.8)   µλ(k) = e^{−λ} λ^k / k!,   k = 0, 1, 2, 3, . . . .
We still have
∑_{k=0}^∞ µλ(k) = 1,

although justifying this requires noting that the sum is absolutely convergent. Indeed, for λ < 0,
∥µλ∥ := ∑_{k=0}^∞ |µλ(k)| = ∑_{k=0}^∞ e^{−λ} (−λ)^k / k! = e^{|λ|−λ}.

We recall that a measure on the nonnegative integers is a function µ : {0, 1, 2, . . .} → R such that
∥µ∥ := ∑_{j=0}^∞ |µ(j)| < ∞.
The norm ∥·∥ is an example of an L¹ norm, and the corresponding normed space is often called ℓ¹.
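For a concrete check of the two identities above (with an arbitrary illustrative value λ = −0.7, truncating the series where the terms are negligible):

```python
import math

lam = -0.7  # a "negative rate"

# mu_lambda(k) = e^{-lambda} lambda^k / k!, now a signed quantity
mu = [math.exp(-lam) * lam**k / math.factorial(k) for k in range(60)]

total = sum(mu)                  # signed sum: should be 1
norm = sum(abs(m) for m in mu)   # total variation: should be e^{|lam| - lam}
```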

More generally, we can consider a “Poisson process” with rate λ to be the corresponding collection of measures {µtλ }.


Exercise 3.24. Define the convolution of two (possibly signed) measures µ, ν on the nonnegative integers by
µ ∗ ν(k) = ∑_{j=0}^k µ(j) ν(k − j).

(1) Show that if µλ is defined as in (3.8), then µλ1 ∗ µλ2 = µλ1+λ2.
(2) Show that ∥µ ∗ ν∥ ≤ ∥µ∥ ∥ν∥. Show this is an equality if µ, ν are nonnegative.
(3) Give an example such that ∥µ ∗ ν∥ < ∥µ∥ ∥ν∥.
(4) Suppose µ, ν are probability functions, that is, there exist random variables X, Y with P{X = j} = µ(j), P{Y = j} = ν(j) for all j. How do we interpret µ ∗ ν in terms of random variables?

Similarly, if we have an integrable weight q(x, y) that generates a weight on paths q with
∑_ω |q(ω)| < ∞,

we can define the corresponding loop measure mq, and we can discuss the distribution of the "loop soup" at time t. More generally, if A is any countable set and m is a function on A with
|m|(A) := ∑_{x∈A} |m(x)| < ∞,

we can define a Poissonian realization or soup from the measure m to be the corresponding collection of measures µt on O, the set of functions k = {kx : x ∈ A} from A to the nonnegative integers with {x : kx > 0} finite. It is given by
µt(k) = ∏_{x∈A} e^{−t m(x)} (t m(x))^{kx} / kx! = e^{−t m(A)} ∏_{x∈A} (t m(x))^{kx} / kx!.


Although this is written as an infinite product, all but a finite number of terms equal one, so it is well defined. This is a signed measure with
∑_{k∈O} |µt(k)| = e^{−t m(A)} ∑_{k∈O} ∏_{x∈A} (t|m(x)|)^{kx} / kx!
= e^{−t m(A)} ∏_{x∈A} ∑_{kx=0}^∞ (t|m(x)|)^{kx} / kx!
= exp{ t [ |m|(A) − m(A) ] }.
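The exchange of sum and product above can be checked numerically on a small example; here A has two points carrying weights of mixed sign (values chosen arbitrarily), and the soup is truncated at a large occupation number:

```python
import math
from itertools import product

t = 0.8
m = {"a": 0.6, "b": -0.4}   # a signed "measure" on a two-point set

K = 40  # truncation: occupation numbers 0..K-1 carry essentially all mass

def mu_t(k):
    # product over x of e^{-t m(x)} (t m(x))^{k_x} / k_x!
    val = 1.0
    for x, kx in k.items():
        val *= math.exp(-t * m[x]) * (t * m[x]) ** kx / math.factorial(kx)
    return val

tv = sum(abs(mu_t({"a": ka, "b": kb}))
         for ka, kb in product(range(K), repeat=2))

m_total = sum(m.values())                 # m(A)
abs_total = sum(abs(v) for v in m.values())  # |m|(A)
# tv should equal exp(t (|m|(A) - m(A)))
```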

3.8. Continuous time

Markov chains with continuous time can be obtained from discrete time chains by adding continuous waiting times at each site. A similar construction can be done with the loop soup by specifying that each time a site is visited, one spends an exponential amount of time with rate 1. This gives a continuous time loop soup. The actual loops are the same but the continuous vertex local time changes. A slight variation of this is more relevant when discussing the relations with the Gaussian free field. Assume that at each state x, there is an independent Gamma process Ttx. This is described in more detail in Section A.6, but the density of Ttx is a Gamma density with parameters (t, 1), that is,
s^{t−1} e^{−s} / Γ(t),   s > 0.

If t is an integer, then Ttx has the same distribution as the sum of t independent exponential random variables with rate 1.
• Take a realization of the (discrete time) loop soup giving local times {Rt(x) : x ∈ A}. Let R̃t(x) = t + Rt(x).
• Take independent Gamma processes with rate 1 at each x ∈ A, {Ttx}.
• Let Yt(x) = T^x_{R̃t(x)}.

Proposition 3.25. For each x, the distribution of Yt(x) is that of a Gamma process with rate GA(x, x)^{−1} = 1 − fx.

Proof. See Proposition A.23. □
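Proposition 3.25 can be probed by simulation. The sketch below samples Rt(x) as a negative binomial (via the standard Gamma-Poisson mixture, consistent with Proposition 3.23), then adds an independent Gamma(t + Rt(x), 1) amount of time; the resulting Yt(x) should be Gamma with rate 1 − fx, whose mean is t/(1 − fx). The values of t and fx are arbitrary illustrations.

```python
import random
import math

random.seed(7)
t, f = 0.5, 0.4          # time parameter and return probability f_x
n_samples = 100_000

def sample_poisson(lam):
    # Knuth's method; fine for the small rates that occur here
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= random.random()
        if p <= L:
            return k
        k += 1

def sample_neg_binom(t, f):
    # NegBin(t, f) as a Poisson with Gamma(t, scale f/(1-f)) random rate
    lam = random.gammavariate(t, f / (1.0 - f))
    return sample_poisson(lam)

total = 0.0
for _ in range(n_samples):
    r = sample_neg_binom(t, f)                  # R_t(x)
    total += random.gammavariate(t + r, 1.0)    # Y_t(x) = T^x at time t + R_t(x)

mean_y = total / n_samples
# a Gamma process with rate 1 - f has mean t / (1 - f) at time t
```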

While this is only a statement about the distribution for fixed x, for t = 1/2, 1, one can describe the joint distributions of {Y1/2 (x) : x ∈ A} and {Y1 (x) : x ∈ A} in terms of the GFF.

Further Reading

Random walk loop measures are discrete analogues of the Brownian loop measure, which was discovered first. See [8] for a more detailed discussion of our main object, the discrete time, discrete space loop measure. Another approach, introduced by Yves Le Jan [11], is to consider discrete space, continuous time loop measures and soups as the fundamental object; he found the relationship with the Gaussian field. There are advantages and disadvantages to each of the approaches.

Chapter 4

Random Walk in Zd

4.1. Introduction

In this chapter we will focus on the integer lattice
Zd = {(z1, . . . , zd) : zj ∈ Z},
viewed as an undirected graph where two vertices z, w are adjacent if they are nearest neighbors, that is, |z − w| = 1. Here and throughout this chapter we use |·| to denote the usual Euclidean distance. If A ⊂ Zd, we write
∂A = {z ∈ Zd \ A : |z − w| = 1 for some w ∈ A},   Ā = A ∪ ∂A.
We let Bn denote the discrete ball of radius n about the origin,
Bn = {z ∈ Zd : |z| < n},
and note that for all w ∈ ∂Bn, n ≤ |w| < n + 1. There are three natural "subgraphs" of Zd associated to a subset A:
• Free boundary: The vertices are A, and the edges are the edges of Zd with both endpoints in A.


• Closure: The vertices are Ā, and the edges are the edges of Zd with at least one endpoint in A.

• Wired boundary: The vertices are A∪{∂A} where all the points on the boundary are identified (“wired”) and considered as a single point. The edges are the same edges as in the closure but now there can be multiple edges from a vertex z ∈ A to the boundary ∂A.

Simple random walk on Zd starting at the origin can be written as Sn = X1 + · · · + Xn


where X1, . . . , Xn are independent random variables with distribution P{Xj = w} = 1/(2d) for all |w| = 1. We will write pn(z, w) for the corresponding n-step transition probabilities
pn(z, w) = Pz{Sn = w},
and pn(w) = pn(0, w) = pn(z, z + w). The transition probabilities are symmetric, pn(z, w) = pn(w, z). We write L for the Laplacian
Lf(z) = (I − P)f(z) = f(z) − (1/2d) ∑_{|w−z|=1} f(w).

The transition probabilities pn(z) satisfy the "discrete heat equation"
(4.1)   pn+1(z) = (1/2d) ∑_{|z−w|=1} pn(w),

which can also be written as
pn+1(z) − pn(z) = −Lpn(z),
where the Laplacian is with respect to the variable z. Simple random walk is a Markov chain with period 2. We can divide Zd into the "even" points and the "odd" points, where the even points are the (z1, . . . , zd) with z1 + · · · + zd even. If one starts at an even point, then after an odd number of steps one is at an odd point, and after an even number of steps one is at an even point. There are other variations of simple random walk that get rid of this periodicity. Two standard ones are:
• Lazy walker: Let 0 < p < 1. At each time step the walker chooses with probability p not to move. If the walker moves, then it chooses its new site as in the simple random walk.
• Continuous time: Let St be a continuous time walk that waits for an exponential amount of time and then takes a step. In this model the components of the walk are independent.
These models are the same if we view them only at the times the walker chooses a new site. There are advantages and disadvantages to each of these.


Our discrete heat equation is a discretization of the usual (continuous) heat equation
∂t p(t, x) = (1/2) ∆x p(t, x).
The latter describes the evolution of the probability density function of the continuous analogue of random walk, which is called Brownian motion.
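The discrete heat equation (4.1) is easy to verify numerically: iterate the transition probabilities of the d = 1 walk starting from a point mass and check pn+1(z) − pn(z) = −Lpn(z) at every site. A small sketch:

```python
def step(p):
    # one step of the discrete heat equation (4.1) in d = 1:
    # p_{n+1}(z) = (1/2)(p_n(z-1) + p_n(z+1))
    return {z: 0.5 * p.get(z - 1, 0.0) + 0.5 * p.get(z + 1, 0.0)
            for z in range(min(p) - 1, max(p) + 2)}

def laplacian(p, z):
    # Lf(z) = f(z) - (1/2d) * (sum over neighbors), with d = 1
    return p.get(z, 0.0) - 0.5 * (p.get(z - 1, 0.0) + p.get(z + 1, 0.0))

p = {0: 1.0}                 # p_0 = point mass at the origin
for n in range(20):
    p_next = step(p)
    # check p_{n+1}(z) - p_n(z) = -L p_n(z) on a window of sites
    for z in range(-25, 26):
        assert abs((p_next.get(z, 0.0) - p.get(z, 0.0)) + laplacian(p, z)) < 1e-12
    p = p_next
```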

4.2. Local central limit theorem

Let Sn denote the position of a simple random walk starting at the origin in Zd. The central limit theorem states that the distribution of n^{−1/2} Sn converges to a normal distribution; in this case, if U is an open ball in Rd,
lim_{n→∞} P{Sn/√n ∈ U} = ∫_U p(x) dx,
where
p(x) = ∏_{j=1}^d [ (1/√(2π(1/d))) exp{ −xj² / (2(1/d)) } ] = (d/2π)^{d/2} exp{ −d|x|²/2 }.
This should be familiar at least for d = 1. For general d, p(x) is the density of d independent normal random variables with mean 0 and variance 1/d. The variance is 1/d because that is the variance of one step for each component; for example, each step in the first component equals 1 with probability 1/2d; −1 with probability 1/2d; and 0 otherwise. We define
(4.2)   p̄n(x) = n^{−d/2} p(x/√n) = (1/n^{d/2}) (d/2π)^{d/2} exp{ −d|x|²/(2n) }.


Using the central limit theorem as a guide, we might conjecture that if n and x = (x1, . . . , xd) have the same "parity", that is, if n + x1 + x2 + · · · + xd is even, then
pn(x) ∼ 2 p̄n(x).
Statements of this kind are called local central limit theorems (LCLT). Such theorems are stronger than the usual central limit theorem, which is not sufficiently precise to estimate probabilities at points. We will state one strong (although not the strongest) version of the LCLT for simple random walk. The basic idea and proof work for all d, but for ease we will discuss the full details of the proof only for d = 1. Let
Td = [−π, π] × [−π, π] × · · · × [−π, π]   (d factors).

If θ = (θ1, . . . , θd) ∈ Td we write dθ for dθ1 · · · dθd. If X is a random variable taking values in Zd, its characteristic function is the function φ : Rd → C defined by
φ(θ) = φX(θ) = E[e^{iθ·X}] = ∑_{x∈Zd} e^{iθ·x} P{X = x}.
Note that since X takes values in Zd and e^{2πi} = 1, the function φ is periodic of period 2π in each variable. In other words, if y ∈ Zd, then e^{i2πy·X} = 1 and hence φ(θ) = φ(θ + 2πy). The next proposition shows that we can give the distribution of Sn in terms of the characteristic function; the idea is a version of "Fourier inversion".

Proposition 4.1. Suppose X is a random variable taking values in Zd with characteristic function φ. Then,
P{X = z} = (1/(2π)^d) ∫_{Td} e^{−iz·θ} φ(θ) dθ.

Proof.
∫_{Td} e^{−iz·θ} φ(θ) dθ = ∫_{Td} e^{−iz·θ} [ ∑_{w∈Zd} e^{iw·θ} P{X = w} ] dθ
= ∑_{w∈Zd} P{X = w} ∫_{Td} e^{−iz·θ} e^{iw·θ} dθ
= (2π)^d P{X = z}.
The third equality uses the identity
∫_{Td} e^{iw·θ} dθ = (2π)^d if w = 0,   and 0 if w ∈ Zd \ {0}.
The interchange of sum and integral in the second equality is valid since
∑_{w∈Zd} ∫_{Td} |e^{−iz·θ} e^{iw·θ}| P{X = w} dθ = ∑_{w∈Zd} P{X = w} (2π)^d < ∞.

□

If Sn = X1 + · · · + Xn is simple random walk in Zd, then
E[e^{iθ·X1}] = (1/2d) ∑_{j=1}^d [e^{iθj} + e^{−iθj}] = (1/d) ∑_{j=1}^d cos θj,
and
E[e^{iθ·Sn}] = E[ ∏_{k=1}^n e^{iθ·Xk} ] = ∏_{k=1}^n E[e^{iθ·Xk}] = [ (1/d) ∑_{j=1}^d cos θj ]^n.
The second equality uses the independence of X1, . . . , Xn. Combining this with the last proposition, we get an exact expression for the distribution of Sn.

Proposition 4.2. The n-step transition probabilities are given by
(4.3)   pn(z) = (1/(2π)^d) ∫_{Td} e^{−iz·θ} φ(θ)^n dθ,
where
φ(θ) = (1/d) ∑_{j=1}^d cos θj.
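In d = 1, formula (4.3) reads pn(x) = (1/2π) ∫_{−π}^{π} e^{−ixθ} cosⁿθ dθ, which can be checked against the exact binomial probability P{Sn = x} = C(n, (n+x)/2)/2ⁿ. A sketch using a plain Riemann (midpoint) sum:

```python
import cmath
import math

def p_exact(n, x):
    # P{S_n = x} for 1-d simple random walk (0 unless n + x is even)
    if (n + x) % 2 or abs(x) > n:
        return 0.0
    return math.comb(n, (n + x) // 2) / 2**n

def p_fourier(n, x, n_grid=20000):
    # (1 / 2 pi) * integral over [-pi, pi] of e^{-i x theta} cos^n(theta)
    h = 2 * math.pi / n_grid
    s = sum(cmath.exp(-1j * x * th) * math.cos(th) ** n
            for th in (-math.pi + (k + 0.5) * h for k in range(n_grid)))
    return (s * h / (2 * math.pi)).real

# e.g. n = 10, x = 2: both should give C(10, 6) / 2^10 = 0.205078125
```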


While (4.3) is an exact expression, the integrand is highly oscillatory for large |z|, which means that there is a lot of cancellation. Hence it takes work to estimate the integral.

Theorem 4.3 (Local Central Limit Theorem (LCLT)). For every integer d ≥ 1, there exists c < ∞ such that for every positive integer n and x = (x1, . . . , xd) ∈ Zd with n + x1 + · · · + xd even,
|pn(x) − 2 p̄n(x)| ≤ c / n^{(d/2)+1}.
Here p̄n(x) is as in (4.2).

Remark 4.4. For a "typical" x with |x| ≤ √n, pn(x) is of order n^{−d/2} and hence we can write
pn(x) = 2 p̄n(x) [1 + O(n^{−1})].
However, if |x| ≫ √n, then pn(x) is of smaller order and the error n^{−((d/2)+1)} can be larger than the dominant term. In this case, while the theorem is valid, it is not very useful. There are other versions of the LCLT that give better estimates for these atypical values of x, but we will not discuss them.

The proof of Theorem 4.3 is similar in all dimensions and involves estimating the integral in (4.3). We will only discuss it in the case d = 1 with n, x even, for which
p̄n(x) = (1/√(2πn)) e^{−x²/2n},
and hence (4.3) gives
P{Sn = x} = (1/2π) ∫_{−π}^{π} e^{−ixθ} cosⁿθ dθ.
Since n and x are both even integers, the function v(θ) = e^{−ixθ} cosⁿθ has period π, and hence the integral from −π to π is the same as twice the integral from −π/2 to π/2,
pn(x) = (1/π) ∫_{−π/2}^{π/2} e^{−ixθ} cosⁿθ dθ.
Let us consider this integral. We know that we expect the left-hand side to be of order n^{−1/2}, at least if x is not too far away from the origin. We also know that cos θ goes from 1 to 0 as |θ| goes from 0 to


π/2. Unless cos θ is very near one, cosⁿθ will be very small for large n. To make this observation precise, we will use the Taylor polynomial approximation of cos y. By Taylor's theorem with remainder we know that there exists C < ∞ such that
(4.4)   |cos y − 1 + y²/2| ≤ C y⁴,   |y| ≤ π/2.
Indeed, we could give an explicit C, but we will not need it. We are letting n go to infinity, so we only need consider n sufficiently large that C ≤ √n/4. We claim that
(4.5)   pn(x) + o(n^{−3/2}) = (1/π) ∫_{−n^{−1/4}}^{n^{−1/4}} e^{−ixθ} cosⁿθ dθ.
To see this, we use (4.4) to see that
cos n^{−1/4} ≤ 1 − (n^{−1/4})²/2 + C (n^{−1/4})⁴ ≤ 1 − 1/(4√n),
and hence
| ∫_{n^{−1/4} ≤ |y| ≤ π/2} e^{−ixy} cosⁿ y dy | ≤ 2 ∫_{n^{−1/4}}^{π/2} cosⁿ y dy ≤ π [1 − 1/(4√n)]^n ≤ π e^{−n^{1/2}/4}.

The first inequality is immediate since |e^{−ixy}| = 1. Note that e^{−n^{1/2}/4} = o(n^{−3/2}). If we do the change of variables θ = s/√n, the right-hand side of (4.5) becomes
(2/√(2πn)) I,   where   I = (1/√(2π)) ∫_{−n^{1/4}}^{n^{1/4}} e^{−ixs/√n} cosⁿ(s/√n) ds.
Note that I = I1 − I2 + I3 where
I1 = (1/√(2π)) ∫_{−∞}^{∞} e^{−ixs/√n} e^{−s²/2} ds,
I2 = (1/√(2π)) ∫_{|s|≥n^{1/4}} e^{−ixs/√n} e^{−s²/2} ds,

I3 = (1/√(2π)) ∫_{−n^{1/4}}^{n^{1/4}} e^{−ixs/√n} [cosⁿ(s/√n) − e^{−s²/2}] ds.

The integral I1 is the characteristic function of a standard normal random variable evaluated at −x/√n; one can compute this or look it up to see that I1 = e^{−x²/2n}. Using |e^{−iy}| = 1, we see that
|I2| ≤ (1/√(2π)) ∫_{|s|≥n^{1/4}} e^{−s²/2} ds ≤ O(e^{−√n/2}) = o(n^{−1}).
Similarly,
√(2π) |I3| ≤ ∫_{−n^{1/4}}^{n^{1/4}} |cosⁿ(s/√n) − e^{−s²/2}| ds.
Using the expansion for the cosine (details omitted) we see that
|cosⁿ(s/√n) − e^{−s²/2}| ≤ c (s⁴/n) e^{−s²/2}.
Hence,
|I3| ≤ (c/n) ∫_{−∞}^{∞} s⁴ e^{−s²/2} ds = O(1/n).
The error term I3 is the largest of the error terms and indeed can be as large as c/n.

There is another approach to the LCLT in one dimension that uses another exact expression,
P{S2n = x} = 2^{−2n} (2n)! / [ (n + x/2)! (n − x/2)! ]   (x even),
and then uses Stirling's formula (with error terms) to evaluate the right-hand side. This is easier than our proof, if one knows Stirling's formula, but the proof we give is easier to adapt to higher dimensions and can also be used for random walks other than the simple walk.
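The exact binomial expression makes the LCLT easy to test numerically in d = 1: compare pn(x) with 2 p̄n(x) for n, x even and watch the error shrink as n grows. The specific values below are arbitrary.

```python
import math

def p_exact(n, x):
    # P{S_n = x} for 1-d SRW, n + x even
    return math.comb(n, (n + x) // 2) / 2**n

def p_bar(n, x):
    # continuum approximation from (4.2) with d = 1
    return math.exp(-x * x / (2 * n)) / math.sqrt(2 * math.pi * n)

for n, x in [(100, 4), (400, 10), (1600, 20)]:
    exact = p_exact(n, x)
    approx = 2 * p_bar(n, x)
    # the error should be O(n^{-3/2}), tiny next to the O(n^{-1/2}) values
    print(n, exact, approx, abs(exact - approx))
```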

The LCLT implies that
p2n(0) = Cd / n^{d/2} + O(n^{−(d/2)−1}),

where Cd = 2^{1−d} (d/π)^{d/2}. In particular,
∑_{n=0}^∞ pn(0) < ∞ if d ≥ 3,   = ∞ if d ≤ 2;
that is, the expected number of returns to the origin is infinite if d ≤ 2 and finite for d ≥ 3.

Theorem 4.5 (Pólya). With probability one, simple random walk in Z1 and Z2 is recurrent. If d ≥ 3, the random walk is transient.

Exercise 4.6. Use Proposition 1.2 to prove this theorem.
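The asymptotics p2n(0) ≈ Cd n^{−d/2} can be checked exactly in d = 1 and d = 2, using the fact (special to Z²) that p2n^{Z²}(0) = [p2n^{Z}(0)]², since the two coordinates of the planar walk become independent one-dimensional walks after a 45-degree rotation:

```python
import math

def p2n_origin_d1(n):
    # P{S_{2n} = 0} = C(2n, n) 2^{-2n}, computed in log space
    return math.exp(math.lgamma(2 * n + 1) - 2 * math.lgamma(n + 1)
                    - 2 * n * math.log(2.0))

def p2n_origin_d2(n):
    # rotated coordinates are independent 1-d walks
    return p2n_origin_d1(n) ** 2

n = 500
c1 = 1 / math.sqrt(math.pi)     # C_1 = 2^0 (1/pi)^{1/2}
c2 = 1 / math.pi                # C_2 = 2^{-1} (2/pi)^{1}
ratio1 = p2n_origin_d1(n) / (c1 / n ** 0.5)
ratio2 = p2n_origin_d2(n) / (c2 / n)
# both ratios should be 1 + O(1/n)
```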

4.3. Green’s function

If d ≥ 3, simple random walk is transient, and the (whole space) Green’s function
G(z, w) = ∑_{n=0}^∞ Pz{Sn = w} = ∑_{n=0}^∞ pn(w − z)

is well defined. Note that G(z, w) = G(w, z) = G(w − z), where we write G(z) for G(0, z). Analysts think of the Green’s function as the "fundamental solution of the Laplacian". The discrete analogue of this viewpoint is the statement in the next proposition. We write δ(z) for the delta function in Zd defined by
δ(z) = 1 if z = 0,   δ(z) = 0 if z ≠ 0.

Proposition 4.7. The Green’s function G satisfies LG(z) = δ(z).


Proof. Using (4.1), we can see that
G(z) = ∑_{n=0}^∞ pn(z) = δ(z) + ∑_{n=1}^∞ pn(z)
= δ(z) + ∑_{n=1}^∞ (1/2d) ∑_{|w−z|=1} pn−1(w)
= δ(z) + (1/2d) ∑_{|w−z|=1} ∑_{n=1}^∞ pn−1(w)
= δ(z) + (1/2d) ∑_{|w−z|=1} G(w)

= δ(z) + G(z) − LG(z). □

We will give the asymptotics of the Green’s function as |x| → ∞. This can be deduced from local central limit theorems, although we would need a stronger version than we have proved here. For this reason, we will not give a complete proof of the asymptotics, but we will show how the leading order term arises from a computation using the LCLT. For ease let us assume that x ∈ Zd \ {0} and that the sum of the components of x is even. We start with
G(x) = ∑_{n=1}^∞ p2n(x) ∼ ∑_{n=1}^∞ 2 p̄2n(x) ≈ ∑_{n=1}^∞ p̄n(x) = (d/2π)^{d/2} ∑_{n=1}^∞ n^{−d/2} e^{−y/n},
where y = d|x|²/2. We write the right-hand side as
(4.6)   [ d |x|^{2−d} / (2 π^{d/2}) ] [ (1/y) ∑_{n=1}^∞ (n/y)^{−d/2} e^{−y/n} ].

We write it this way because the quantity in the brackets is a Riemann sum approximation, using intervals of length y^{−1}, of the integral
∫_0^∞ t^{−d/2} e^{−1/t} dt.


To compute the integral we use the substitution
t = 1/s,   dt = −s^{−2} ds,
to make it
∫_0^∞ s^{d/2} e^{−s} s^{−2} ds = Γ((d/2) − 1) = 2 Γ(d/2) / (d − 2),
where
Γ(r) = ∫_0^∞ z^{r−1} e^{−z} dz
is the Gamma function, which satisfies r Γ(r) = Γ(r + 1). Combining with (4.6) we see that as |x| → ∞,
G(x) ∼ [ d Γ(d/2) / ((d − 2) π^{d/2}) ] |x|^{2−d}.

By more careful analysis, which we omit, one can give a sharp bound on the error in the above asymptotics.

Proposition 4.8. If d ≥ 3, then as |x| → ∞,
G(x) ∼ βd |x|^{2−d},   where   βd = d Γ(d/2) / ((d − 2) π^{d/2}).
In fact,
(4.7)   G(x) = βd |x|^{2−d} + O(|x|^{−d}).

The statement of this proposition uses a convenient shorthand. The conclusion can be written more precisely as: there exists c < ∞ such that for all x,
| G(x) − βd |x|^{2−d} | ≤ c / |x|^d.
Writing things like this is a bit bulky, so we will use the O(·) and o(·) notation. It is important to remember that there is an implicit constant in this notation and that this constant is uniform over all x ∈ Zd. It is useful to know what is worth memorizing and what is not so critical. In the case of the last proposition, the exponent 2 − d is worth committing to memory but not the value of the constant βd. The function f(x) = |x|^{2−d} is harmonic in Rd \ {0} and is the fundamental solution of the continuous Laplacian.


One way to remember the exponent is to use the following heuristic derivation. If Sn = x, then we would expect n to be of order |x|². So there are about |x|² possible times that contribute to the sum. For each of these values, the probability of being at x is of order |x|^{−d}. Therefore the Green’s function is of order |x|² · |x|^{−d} = |x|^{2−d}.

The Green’s function as defined above does not exist if d = 2 because the simple random walk is recurrent. However, there is another quantity that has many of the same properties, the potential kernel. Some authors refer to this as the Green’s function.

Definition 4.9. If d = 2 the potential kernel is defined by
a(x) = lim_{n→∞} [ ∑_{j=0}^n pj(0) − ∑_{j=0}^n pj(x) ].

One has to be careful with this definition. We cannot naively write the limit as
(4.8)   ∑_{j=0}^∞ pj(0) − ∑_{j=0}^∞ pj(x),

since both of these sums are infinite. Let us show why the limit exists. We will do the case where x = (x1, x2) with x1 + x2 even. We write
(4.9)   a(x) = lim_{n→∞} ∑_{j=0}^n [p2j(0) − p2j(x)].
Using the local central limit theorem, we can write
p2n(0) − p2n(x) = (C2/n) [1 − e^{−|x|²/n}] + O(n^{−2}).
If we fix x and let n → ∞, we see that
1 − e^{−|x|²/n} = |x|²/n + O(|x|⁴/n²).
Hence there exists a constant cx such that for all n,
|pn(0) − pn(x)| ≤ cx / n².


This shows that the sum in (4.9) is absolutely convergent, and we can write
a(x) = ∑_{j=0}^∞ [pj(0) − pj(x)].
If x1 + x2 is odd, we can similarly write
a(x) = ∑_{j=0}^∞ [pj(0) − pj+1(x)].

The next proposition shows that this is the fundamental solution of the Laplacian for d = 2, although with a change of sign.

Proposition 4.10. If d = 2, then a(0) = 0, and a(y) = 1 if |y| = 1. Moreover, for all x ∈ Z2,
La(x) = −δ(x).

Exercise 4.11. Prove Proposition 4.10.

We could have defined a for d ≥ 3 using the same definition, but in that case the naive expression (4.8) is valid and a(x) = G(0) − G(x). It is more convenient to use G(x) rather than a(x).

We now consider the asymptotics of the potential kernel in Z2 as |x| → ∞. We will consider the case where x1 + x2 is even and let y = |x|², so that
p2n(x) = (1/πn) e^{−y/n} + O(n^{−2}).
We will ignore the error term for the moment and consider
∑_{n=1}^∞ (1/n) [1 − e^{−y/n}].
Note that
∑_{n≥y} (1/n) [1 − e^{−y/n}] ≤ c ∑_{n≥y} y/n² ≤ c0,
∑_{n≤y} (1/n) e^{−y/n} = (1/y) ∑_{n≤y} e^{−y/n} / (n/y) ∼ ∫_0^1 (e^{−1/t}/t) dt ≤ c0.


Therefore,
a(x) = O(1) + (1/π) ∑_{n≤y} 1/n = (1/π) log y + O(1) = (2/π) log |x| + O(1).

The next proposition gives a more precise version. As in the case of the Green’s function for d ≥ 3, this can be proved from a sufficiently strong LCLT, but we will not prove it here.

Proposition 4.12. If d = 2, as |x| → ∞,
(4.10)   a(x) = (2/π) log |x| + k0 + O(|x|^{−2}),
where
k0 = (1/π) log 8 + (2/π) γ
and γ is Euler's constant.

Euler's constant is defined by
γ = lim_{n→∞} [ ∑_{j=1}^n 1/j − log n ].
The actual value (1/π) log 8 + (2/π) γ is not so important, just the fact that there exists k0 such that
a(x) = (2/π) log |x| + k0 + O(|x|^{−2}).
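These formulas can be tested numerically. The sketch below computes a(x) from the series ∑_j [p2j(0) − p2j(x)], using the fact (special to Z²) that after a 45-degree rotation the coordinates of the walk are independent one-dimensional walks, so p2j(x) = p2j^{Z}(x1 + x2) p2j^{Z}(x1 − x2). It recovers the classical exact value a((1, 1)) = 4/π and matches (4.10) at a moderately distant point; the truncation levels are ad hoc.

```python
import math

def p1d(m, k):
    # P{1-d SRW at time m equals k} (0 unless m + k is even)
    if (m + k) % 2 or abs(k) > m:
        return 0.0
    return math.exp(math.lgamma(m + 1) - math.lgamma((m + k) // 2 + 1)
                    - math.lgamma((m - k) // 2 + 1) - m * math.log(2.0))

def p2d(m, x1, x2):
    # rotate 45 degrees: x1 + x2 and x1 - x2 evolve as independent 1-d walks
    return p1d(m, x1 + x2) * p1d(m, x1 - x2)

def potential_kernel(x1, x2, terms):
    return sum(p2d(2 * j, 0, 0) - p2d(2 * j, x1, x2) for j in range(terms))

a11 = potential_kernel(1, 1, 4000)       # classical exact value: 4 / pi
gamma = 0.5772156649015329               # Euler's constant
k0 = math.log(8) / math.pi + 2 * gamma / math.pi
a_far = potential_kernel(10, 10, 40000)
asym = (2 / math.pi) * math.log(math.hypot(10, 10)) + k0   # formula (4.10)
```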

4.4. Harmonic functions

The study of simple random walk is very closely related to the study of harmonic functions on the lattice Zd. A good starting point for understanding harmonic functions is the pair of sharp estimates of the Green’s function and potential kernel, (4.7) and (4.10). We will assume these even though we have not given complete proofs. Suppose A ⊂ Zd with ∂A ≠ ∅. Let
TA = min{n : Sn ∉ A}.
Recall that the Poisson kernel HA(z, w), for z ∈ A, w ∈ ∂A, is defined by
HA(z, w) = Pz{STA = w}.


For fixed z, this gives a probability measure on ∂A provided that Pz{TA < ∞} = 1. This will always be true if d ≤ 2 or if A is finite. There are examples with d ≥ 3 such that Pz{TA < ∞} < 1, for example, if Zd \ A is finite. The next proposition is a particular case of Proposition 1.10, so we do not need to prove it again.

Proposition 4.13. Suppose A ⊂ Zd is such that for each x ∈ A, Px{TA < ∞} = 1. Suppose F : ∂A → R is a bounded function. Then there exists a unique bounded function f : Ā → R satisfying
Lf(x) = 0,   x ∈ A,
f(x) = F(x),   x ∈ ∂A.
It is given by
(4.11)   f(x) = Ex[F(STA)] = ∑_{y∈∂A} F(y) HA(x, y).

Exercise 4.14. Suppose d ≥ 3 and Zd \ A is finite. Show that (4.11) gives the unique function that is harmonic in A, equals F on ∂A, and satisfies the extra condition
lim_{|x|→∞} f(x) = 0.

If Px{TA < ∞} < 1 for some x, we can get a similar result by adding the point "∞" to ∂A and setting HA(z, ∞) = Pz{TA = ∞}. In this case we must also give the boundary value F(∞). See Exercise 4.31 for a proof of this.
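Formula (4.11) suggests two independent ways to compute the harmonic extension, which can be checked against each other: relax Lf = 0 by repeatedly averaging over neighbors, and estimate Ex[F(S_TA)] by Monte Carlo. A sketch on a small square in Z², with boundary value 1 on the right edge and 0 elsewhere (an arbitrary choice; by symmetry the value at the center is exactly 1/4):

```python
import random

random.seed(1)
N = 10                        # interior A = {1,...,N-1}^2

def F(x, y):                  # boundary data: 1 on the right edge, else 0
    return 1.0 if x == N else 0.0

# method 1: relaxation -- repeatedly replace f by its neighbor average
f = [[0.0] * (N + 1) for _ in range(N + 1)]
for x in range(N + 1):
    for y in range(N + 1):
        if x in (0, N) or y in (0, N):
            f[x][y] = F(x, y)
for _ in range(2000):
    for x in range(1, N):
        for y in range(1, N):
            f[x][y] = 0.25 * (f[x-1][y] + f[x+1][y] + f[x][y-1] + f[x][y+1])

# method 2: Monte Carlo estimate of E^x[F(S_{T_A})] from the center
hits, trials = 0.0, 20000
for _ in range(trials):
    x, y = N // 2, N // 2
    while 0 < x < N and 0 < y < N:
        dx, dy = random.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
        x, y = x + dx, y + dy
    hits += F(x, y)
mc = hits / trials
```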

Recall that Bn is the discrete ball of radius n about the origin and that for z ∈ ∂Bn, n ≤ |z| < n + 1. The estimates (4.7) and (4.10) imply that for all n and all z ∈ ∂Bn,
G(z) = βd n^{2−d} + O(n^{1−d}),   d ≥ 3,
a(z) = (2/π) log n + k0 + O(n^{−1}),   d = 2.


Here we do not use the full force of the asymptotics of the Green’s function. Although we know G(z) up to an error of |z|^{−d}, there is an error of order n^{1−d} when we replace |z| with n, since
|z|^{2−d} = n^{2−d} + O(n^{1−d}),   z ∈ ∂Bn,
log |z| = log n + O(n^{−1}),   z ∈ ∂Bn.

The next proposition expresses the Green’s function GA on a finite set in terms of the whole space Green’s function or the potential kernel.

Proposition 4.15. Suppose A ⊂ Zd is finite. Then for all z, w ∈ A:
• If d ≥ 3,
GA(z, w) = G(z, w) − ∑_{y∈∂A} HA(z, y) G(y, w)
= G(w − z) − ∑_{y∈∂A} HA(z, y) G(w − y).
• If d = 2,
GA(z, w) = −a(w − z) + ∑_{y∈∂A} HA(z, y) a(w − y).

Proof. Without loss of generality we may assume that z = 0 ∈ A, and let T = TA. For d ≥ 3, we write the total number of visits to w as
∑_{j=0}^∞ 1{Sj = w} = ∑_{j=0}^{T−1} 1{Sj = w} + ∑_{j=T}^∞ 1{Sj = w}.
Taking expectations, we get
G(w) = GA(0, w) + ∑_{y∈∂A} HA(0, y) G(y, w).
A similar proof can be given for d = 2, but it takes more work because of the recurrence of the random walk. We give a different proof. Without loss of generality assume that w = 0, and let
g(z) = −a(−z) + ∑_{y∈∂A} HA(z, y) a(−y).
Note that if z ∈ ∂A, then g(z) = 0. Also, if z ∈ A, then Lg(z) = δ(z). The unique function satisfying this is g(z) = GA(z, 0). □




As a corollary, we estimate the expected number of returns to the origin before leaving the ball Bn by a random walker starting at the origin.

Proposition 4.16.
• If d ≥ 3, GBn(0, 0) = G(0) − O(n^{2−d}).
• If d = 2,
GBn(0, 0) = (2/π) log n + k0 + O(n^{−1}),
where k0 is as in (4.10).

• If d = 2 and x ∈ Bn,
(4.12)   GBn(x, 0) = (2/π) log(n/|x|) + O(n^{−1}) + O(|x|^{−2}).

Exercise 4.17. Use Proposition 4.15 to prove the last proposition.

Proposition 4.18. If d ≥ 3 and |x| ≥ n, then the probability that a random walk starting at x ever enters Bn equals
(n/|x|)^{d−2} [1 + O(n^{−1})].

Proof. Let A = Zd \ Bn and let q = q(x, n) be this probability. Note that
q = ∑_{z∈∂A} HA(x, z).
The (whole space) Green’s function G(·) is a bounded function that is harmonic in A and goes to zero as x → ∞. Therefore (see Exercise 4.14),
G(x) = ∑_{z∈∂A} HA(x, z) G(z).
We know that G(x) = βd |x|^{2−d} + O(|x|^{−d}), and since |x| ≥ n,
G(x) = βd |x|^{2−d} [1 + O(n^{−2})].
For z ∈ ∂A,
G(z) = βd n^{2−d} + O(n^{1−d}) = βd n^{2−d} [1 + O(n^{−1})],

and hence,
∑_{z∈∂A} HA(x, z) G(z) = q βd n^{2−d} [1 + O(n^{−1})].
Therefore,
q = (|x|^{2−d} / n^{2−d}) [1 + O(n^{−1})]. □

Proposition 4.19. Suppose d = 2, and let q(n, x) be the probability that a simple random walk starting at x ∈ Z2 leaves Bn before reaching the origin. Then for |x| < n,
q(n, x) = a(x) / [ (2/π) log n + k0 + O(n^{−1}) ].
In particular,
(4.13)   lim_{n→∞} (log n) q(n, x) = (π/2) a(x).

Proof. Let A = Bn \ {0}. The potential kernel is a harmonic function on A with a(0) = 0, and hence
a(x) = ∑_{z∈∂A} a(z) HA(x, z) = ∑_{z∈∂Bn} a(z) HA(x, z).
For z ∈ ∂Bn,
a(z) = (2/π) log n + k0 + O(n^{−1}).
The probability that we want is
∑_{z∈∂Bn} HA(x, z) = a(x) / [ (2/π) log n + k0 + O(n^{−1}) ]. □

Exercise 4.20. Show that if d = 2 and m < |x| < n, then the probability that a random walk starting at x enters Bm before leaving Bn equals
[ log n − log |x| + O(|x|^{−1}) ] / [ log n − log m + O(m^{−1}) ].
Hint: The potential kernel a(·) is a harmonic function in Bn \ Bm.
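A crude Monte Carlo check of this two-radius formula: with m = 4, |x| = 16, n = 64 the leading-order prediction is log(64/16)/log(64/4) = log 4 / log 16 = 1/2, up to the O(1/m) and O(1/|x|) corrections, so only rough agreement should be expected. (All radii are arbitrary test values.)

```python
import random
import math

random.seed(3)

def hits_inner_first(x0, m, n):
    # run 2-d SRW from x0 until it enters B_m or leaves B_n
    x, y = x0
    while True:
        dx, dy = random.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
        x, y = x + dx, y + dy
        r2 = x * x + y * y
        if r2 < m * m:
            return True          # entered B_m first
        if r2 >= n * n:
            return False         # left B_n first

m, n = 4, 64
trials = 1000
hits = sum(hits_inner_first((16, 0), m, n) for _ in range(trials))
estimate = hits / trials
predicted = (math.log(64) - math.log(16)) / (math.log(64) - math.log(4))  # 0.5
```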


The next proposition gives a difference estimate for harmonic functions. Difference estimates are discrete analogues of estimates of derivatives. We will use the following estimates, which follow immediately from (4.7) and (4.10): if x, y ∈ Zd with |x − y| = 1, then
(4.14)   |G(x) − G(y)| ≤ c |x|^{1−d},   d ≥ 3,
|a(x) − a(y)| ≤ c |x|^{−1},   d = 2.

If f is a function on a countable set V, we write
∥f∥∞ = sup{|f(x)| : x ∈ V}.
If V is finite, the supremum is the same as the maximum of |f|. We also write
dist(x, ∂A) = min_{y∈∂A} |x − y|.

Proposition 4.21. There exists c = c(d) < ∞ such that if f : Ā → R is harmonic in A and x, y ∈ A with |x − y| = 1 and dist(x, ∂A) ≥ n, then
|f(x) − f(y)| ≤ (c/n) ∥f∥∞.

It is important to note the order of quantifiers in the proposition. There is a single constant c that works for all subsets A ⊂ Zd and all harmonic functions on A.

Proof. Without loss of generality we will assume that x = 0, and since Bn ⊂ A, we can assume A = Bn. We will show that for every |y| = 1 and z ∈ ∂A,
(4.15)   HA(y, z) = HA(0, z) [1 + O(n^{−1})].
We recall that this is shorthand for the statement that there exists c such that for every n > 0, every z ∈ ∂A, and every |y| = 1,
|HA(0, z) − HA(y, z)| ≤ (c/n) HA(0, z).
To see that (4.15) suffices, we can use (4.11) to write
|f(0) − f(y)| ≤ ∑_{z∈∂A} |HA(0, z) − HA(y, z)| |f(z)|
≤ (c/n) ∥f∥∞ ∑_{z∈∂A} HA(0, z) = (c/n) ∥f∥∞.


Let us fix z ∈ ∂A and write h(x) = HA(x, z). We will use a technique known as a last-exit decomposition. Let
τ = τn = min{j : Sj ∉ Bn},
and let V = ∂Bn/2. For w ∈ V, let q(w) be the probability that a random walker starting at w does not return to V before leaving Bn and that it leaves Bn at z,
q(w) = qn,z(w) = Pw{Sτ = z; Sj ∉ V for j = 1, 2, . . . , τ}.
Then we claim that for all x ∈ Bn/2,
(4.16)   h(x) = ∑_{w∈V} GA(x, w) q(w).
To see this we focus on the last visit to V by the random walker before leaving Bn. Note that if we start in Bn/2, we must visit V before leaving Bn. Let ρ be the largest k with Sk ∈ V and k < τ. Then, using the law of total probability,
h(x) = ∑_{k=1}^∞ ∑_{w∈V} Px{Sτ = z, ρ = k, Sk = w}.

Note that
Px{Sτ = z, ρ = k, Sk = w} = Px{Sk = w, k < τ} Px{Sτ = z, ρ = k | Sk = w, k < τ}.
Using the Markov property, we can see that
Px{Sτ = z, ρ = k | Sk = w, k < τ} = q(w).
Therefore,
Px{Sτ = z} = ∑_{w∈V} q(w) ∑_{k=0}^∞ Px{Sk = w, k < τ} = ∑_{w∈V} q(w) GA(x, w).

Our next step is to claim that for all w ∈ V, we have
(4.17)  G_A(0, w) = G_A(y, w) [1 + O(n^{−1})].

We will show this in the case d ≥ 3; the d = 2 case is done similarly. Proposition 4.15 gives
G_A(x, w) = G_A(w, x) = G(w, x) − ∑_{ζ∈∂B_n} H_A(w, ζ) G(ζ, x).

4. Random Walk in Zd

Using this and (4.7), we see that for w ∈ V and x ∈ {0, y},
G_A(x, w) = [2^{d−2} − 1] β_d n^{2−d} + O(n^{1−d}).
Also, (4.14) gives
|G(ζ, 0) − G(ζ, y)| ≤ c n^{1−d},  |ζ| ≥ n/2.

Combining these two estimates gives (4.17). Finally, we can write
h(0) = ∑_{w∈V} q(w) G_A(0, w) = ∑_{w∈V} q(w) G_A(y, w) [1 + O(n^{−1})] = h(y) [1 + O(n^{−1})]. □

Exercise 4.22. Suppose f : Zd → R is harmonic.
(1) Show that if f is bounded then f is constant.
(2) More generally, show that if
lim_{|x|→∞} |f(x)|/|x| = 0,
then f is constant.

For nonnegative functions we get another important result that says that the values of positive harmonic functions in the interior are comparable. The key point in this lemma is that the constant C_r can be chosen so that the inequality holds for all n and all positive harmonic functions in B_n.

Proposition 4.23 (Harnack inequality). For every 0 < r < 1, there exists C_r = C_r(d) < ∞ such that if f : B̄_n → [0, ∞) is harmonic in B_n, then for all |x|, |y| < rn,
f(x) ≤ C_r f(y).

Proof. Let c_r = c/(1 − r), where c is from the previous proposition. Then if |x|, |y| < rn with |x − y| = 1, (4.15) gives
H_{B_n}(x, w) ≤ H_{B_n}(y, w) [1 + c_r/n].

Since f is nonnegative,
f(x) = ∑_{z∈∂B_n} H_{B_n}(x, z) f(z) ≤ ∑_{z∈∂B_n} H_{B_n}(y, z) [1 + c_r/n] f(z) = f(y) [1 + c_r/n].
If |x|, |y| ≤ rn, then we can connect x to y by a path staying in B_{rn} of at most 2r√d·n steps. Therefore, by repeated application of the above inequality we get
f(x) ≤ [1 + c_r/n]^{2r√d·n} f(y) ≤ C_r f(y),
where C_r = exp{2√d·c_r·r}. □
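These two estimates are easy to probe numerically. The sketch below is an illustration of ours, not part of the text: it computes a discrete harmonic function on a ball in Z² by relaxation and checks that neighboring values at the center differ by O(1/n) and that values well inside the ball are comparable. The radius, boundary data, sweep count, and tolerances are arbitrary choices.

```python
# Numerical illustration (not from the text) of the difference estimate
# and the Harnack inequality for discrete harmonic functions in Z^2.
# The radius n, boundary data, and sweep count are arbitrary choices.

def harmonic_in_ball(n, boundary_value, sweeps=3000):
    """Solve the discrete Dirichlet problem on B_n = {|x| < n} in Z^2 by
    Jacobi relaxation: the value at x is the average of its 4 neighbors."""
    pts = [(i, j) for i in range(-n, n + 1) for j in range(-n, n + 1)
           if i * i + j * j < n * n]
    interior = set(pts)
    f = {p: 0.0 for p in pts}
    for _ in range(sweeps):
        g = {}
        for (i, j) in pts:
            s = 0.0
            for q in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
                s += f[q] if q in interior else boundary_value(q)
            g[(i, j)] = s / 4.0
        f = g
    return f

n = 10
# boundary data: 1 on the right half of the circle, 0 elsewhere
f = harmonic_in_ball(n, lambda q: 1.0 if q[0] > 0 else 0.0)

# difference estimate: neighboring values near the center differ by O(1/n)
diff = abs(f[(0, 0)] - f[(1, 0)])
# Harnack: nonnegative harmonic values well inside the ball are comparable
inner = [f[(i, j)] for i in range(-n // 2, n // 2 + 1)
         for j in range(-n // 2, n // 2 + 1) if i * i + j * j < (n // 2) ** 2]
ratio = max(inner) / min(inner)
print(diff, ratio)
```

Both quantities stay bounded as the proposition and the Harnack inequality predict; shrinking the inner region or growing n tightens them further.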



Proposition 4.24. There exists c < ∞ such that if f : A → [0, ∞) is harmonic in A, then the following holds. Suppose ω is a nearest neighbor path from z to w in A of length k with dist(ω, ∂A) ≥ N. Then,
f(z) ≤ f(w) exp{c k/N}.

Exercise 4.25.
(1) Check that the proof of Proposition 4.23 extends to prove the last proposition.
(2) Use the last proposition to show the following. There exists c < ∞ such that if A = Zd \ B_n and f : A → [0, ∞) is harmonic in A, then for all z, w ∈ ∂B_{2n}, f(z) ≤ c f(w).

Exercise 4.26. Show that there exists c < ∞ such that the following is true for every f : B_n → R that is harmonic in B_n.
• For every y ∈ B_n,
|f(y) − f(0)| ≤ c (|y|/n) ∥f∥_∞.
• If f ≥ 0 on B_n, then for every y ∈ B_{n/2},
(4.18)  |f(y) − f(0)| ≤ c (|y|/n) f(0).


Hint: The first is a consequence of the difference estimate, and the second uses the Harnack inequality as well.

Exercise 4.27. Show that there exist α > 0 and c < ∞ such that the following is true. Let A = Zd \ B_n and z ∈ ∂A. Then if r ≥ 2 and x, y ∈ Zd \ B_{rn},
(4.19)  |H_A(x, z) − H_A(y, z)| ≤ (c/r^α) H_A(x, z).
Hint:
(1) Let V_k = ∂B_{2^k n} for positive integers k. Explain why it suffices to prove (4.19) for x, y ∈ V_k for all k.
(2) Let
λ_k = max{ |H_A(x, z) − H_A(y, z)| / H_A(x, z) : x, y ∈ V_k }.
Show that there exists ρ < 1 (independent of z, n, k) such that if k ≥ 1, then λ_{k+1} ≤ ρ λ_k. Hint: Use Exercise 4.25.

Exercise 4.28. Suppose n ≥ 2m and A ⊂ B_m. Let τ_A and τ_n denote the first time that a random walk hits A and ∂B_n, respectively. Let z ∈ ∂B_n. If x ∈ ∂B_{2m}, define ϵ_A(x, z) by
P^x{S(τ_n) = z | τ_n < τ_A} = P{S(τ_n) = z} [1 + ϵ_A(x, z)].
Show that there exists c = c(d) < ∞ such that for all n ≥ 2m,
|ϵ_A(x, z)| ≤ c (m/n),  d ≥ 3,
|ϵ_A(x, z)| ≤ c (m/n) log(n/m),  d = 2.
Hint: Use (4.18) to show that
P^x{S(τ_n) = z | τ_n > τ_A} = P{S(τ_n) = z} [1 + O(m/n)].

We will use our work so far to show the existence of harmonic measure from infinity. We start by giving the definition, and then we will prove a proposition that shows that the definition is valid.


Definition 4.29. Suppose A ⊂ Zd, d ≥ 2, is finite, and let T = T_A = min{j ≥ 1 : S_j ∈ A}. Then the harmonic measure (from infinity) of A is defined by
hm_A(x) = lim_{w→∞} P^w{S_T = x | T < ∞}.

If d = 2, then P^w{T < ∞} = 1 and we can write simply
hm_A(x) = lim_{w→∞} P^w{S_T = x}.

We will now establish the existence of the limit. Before doing so, we note that the limit does not exist for d = 1. If we consider the set A = {0, 1}, then the probability that a random walk “from infinity” first visits A at 0 depends on whether the walker is coming from the right-hand side or the left-hand side. The proposition below shows that in more than one dimension, the hitting probability is the same (in the limit) regardless of the direction one is coming from.

Proposition 4.30. If A ⊂ Zd, d ≥ 2, is finite, then for each x ∈ A, the limit
hm_A(x) = lim_{w→∞} P^w{S_T = x | T < ∞}
exists and is also given by
lim_{n→∞} P^x{τ_n < T} / ∑_{y∈A} P^y{τ_n < T},
where τ_n = min{j ≥ 1 : S_j ∈ ∂B_n}. Moreover, if A ⊂ B_m and |w| ≥ 2m, then
P^w{S_T = x | T < ∞} = hm_A(x) [1 + O(ϵ)],
where ϵ = ϵ(m, w) = m/|w| for d ≥ 3 and ϵ = (m/|w|) log(|w|/m) for d = 2.

Proof. For x ∈ A, z ∈ ∂B_n, let
r_n(x, z) = P^x{τ_n < T, S(τ_n) = z}.
By reversing paths (check this!) we also see that
r_n(x, z) = P^z{τ_n > T, S(T) = x}.

By Exercise 4.28,
r_n(x, z) = P^x{τ_n < T} P^0{S(τ_n) = z} [1 + O(ϵ_n)],
where ϵ_n = m/n if d ≥ 3 and ϵ_n = (m/n) log(n/m) if d = 2. We now use a last-exit decomposition for |w| > n, focusing on the last visit to ∂B_n before reaching the set A. More precisely, arguing as in (4.16), we get for |w| > n,
P^w{S(T_A) = x} = ∑_{z∈∂B_n} G_{Zd\A}(w, z) r_n(x, z)
  = ∑_{z∈∂B_n} G_{Zd\A}(w, z) P^x{τ_n < T} P^0{S(τ_n) = z} [1 + O(ϵ_n)]
  = J_n(w, A) P^x{τ_n < T} [1 + O(ϵ_n)],
where
J_n(w, A) = ∑_{z∈∂B_n} G_{Zd\A}(w, z) H_{B_n}(0, z).
The term J_n(w, A) is independent of x, and hence if we set
(4.20)  h_n(x) = P^x{τ_n < T} / ∑_{y∈A} P^y{τ_n < T},
then we can write the above as
P^w{S(T_A) = x | T_A < ∞} = h_n(x) [1 + O(ϵ_n)]. □

Exercise 4.31. Suppose A ⊂ Zd (d ≥ 2) with Zd \ A finite. Show that if f : A → R is bounded and harmonic on A, then the limit
L = lim_{z→∞} f(z)
exists.

Hint:
(1) It suffices to prove the result when ∥f∥_∞ = 1.
(2) Let
f̂(z) = ∑_{x∈∂A} H_A(z, x) f(x).
Let f̂(∞) = 0 if d ≥ 3, and for d = 2 let
f̂(∞) = ∑_{x∈∂A} hm_{∂A}(x) f(x).
Show that
lim_{z→∞} f̂(z) = f̂(∞).
(3) Let g = f − f̂ and note that this satisfies the hypotheses with g ≡ 0 on ∂A.
(4) Use Exercise 4.28 to show that g(z) = C H_A(z, ∞) for some C.

4.5. Capacity for d ≥ 3

If A is a finite subset of Zd with d ≥ 3, there are various ways to describe the size of A. One obvious way is the number of points, but this does not distinguish n points that are close together from n points spread apart. We will consider another notion, called capacity, which is related to the probability that a simple random walker ever visits the set. We will start with the definition and then give this interpretation. Let
T_A = min{j ≥ 1 : S_j ∈ A},
where we set T_A = ∞ if S_j ∉ A for all j ≥ 1. Note that T_A ≥ 1 even if we start in A since we are considering j ≥ 1. We let
Es_A(x) = P^x{T_A = ∞}
and call Es_A(x) the escape probability.

Definition 4.32. If d ≥ 3 and A ⊂ Zd is finite, the capacity of A is defined by
cap(A) = ∑_{z∈A} Es_A(z).

In this language we can write (4.20) for d ≥ 3 as
hm_A(x) = Es_A(x) / cap(A).
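The definitions lend themselves to a direct Monte Carlo estimate. The following sketch is illustrative only: the set A, walk truncation, and sample sizes are arbitrary choices of ours, and truncating the walks slightly overestimates the escape probabilities.

```python
# Monte Carlo sketch (illustrative only): estimate the escape
# probabilities Es_A(z), cap(A) = sum of Es_A, and the harmonic measure
# hm_A = Es_A / cap(A) for a small set A in Z^3.  Truncating the walks
# at a finite length slightly overestimates the escape probabilities.
import random

random.seed(1)
STEPS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
A = {(0, 0, 0), (1, 0, 0), (0, 1, 0)}

def escapes(z, length=1000):
    """True if the walk started at z avoids A at times 1, ..., length."""
    x = z
    for _ in range(length):
        dx = random.choice(STEPS)
        x = (x[0] + dx[0], x[1] + dx[1], x[2] + dx[2])
        if x in A:
            return False
    return True

trials = 1500
es = {z: sum(escapes(z) for _ in range(trials)) / trials for z in A}
cap_A = sum(es.values())            # cap(A) = sum_z Es_A(z)
hm = {z: es[z] / cap_A for z in A}  # hm_A(z) = Es_A(z) / cap(A)
print(cap_A)
```

By construction the harmonic-measure estimates sum to one; making A more spread out increases the capacity estimate, as the discussion above suggests.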


We recall that the Green’s function satisfies
G(x) = β_d |x|^{2−d} + O(|x|^{−d}).

Exercise 4.33. Let S_j, j ≥ 0, denote simple random walk in Zd, d ≥ 3, starting at the origin, and let z be a nearest neighbor of the origin. Let p denote the probability that the random walk returns to the origin. We know that
G(0, 0) = 1/(1 − p).
(1) Show that the probability of ever visiting z is p.
(2) Let T = min{j ≥ 1 : S_j ∈ {0, z}} with T = ∞ if no such j exists. Show that
P{S_T = 0} = P{S_T = z} = p/(1 + p).
(3) Show that G_{Zd\{0}}(z, z) = 1 + p, and hence
F_{{0,z}}(Zd) = G(0, 0) G_{Zd\{0}}(z, z) = (1 + p)/(1 − p).
(4) Show that
cap({0, z}) = 2 (1 − p)/(1 + p).
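The quantities in this exercise can be checked by simulation. The sketch below is illustrative: the walks are truncated, which slightly underestimates p (whose true value in Z³ is Pólya's return probability ≈ 0.3405), and the sample size is an arbitrary choice.

```python
# Monte Carlo sketch for Exercise 4.33 in Z^3: estimate the return
# probability p and evaluate cap({0, z}) = 2(1 - p)/(1 + p) from part (4).
# Truncation slightly underestimates p (true value ~ 0.3405 in Z^3).
import random

random.seed(2)
STEPS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

def returns_to_origin(length=1500):
    x = (0, 0, 0)
    for _ in range(length):
        dx = random.choice(STEPS)
        x = (x[0] + dx[0], x[1] + dx[1], x[2] + dx[2])
        if x == (0, 0, 0):
            return True
    return False

trials = 3000
p_hat = sum(returns_to_origin() for _ in range(trials)) / trials
G_hat = 1 / (1 - p_hat)                   # G(0,0) = 1/(1 - p)
cap_hat = 2 * (1 - p_hat) / (1 + p_hat)   # formula from part (4)
print(p_hat, G_hat, cap_hat)
```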

Proposition 4.34. If A ⊂ Zd, d ≥ 3, is a finite set, then
(4.21)  cap(A) = lim_{x→∞} β_d^{−1} |x|^{d−2} P^x{T_A < ∞}.
More precisely, if A ⊂ B_n and |x| ≥ 2n,
P^x{T_A < ∞} = β_d |x|^{2−d} cap(A) [1 + O(n/|x|)].

Proof. We use a last-exit decomposition. Suppose we start a simple random walk at x ∉ A and let
σ = max{k < ∞ : S_k ∈ A}


with σ = ∞ if there is no such k. Then
P^x{T_A < ∞} = P^x{σ < ∞}
 = ∑_{k=1}^{∞} ∑_{z∈A} P^x{σ = k, S_k = z}
 = ∑_{k=1}^{∞} ∑_{z∈A} P^x{S_k = z} P^x{σ = k | S_k = z}
 = ∑_{z∈A} ∑_{k=1}^{∞} P^x{S_k = z} Es_A(z)
 = ∑_{z∈A} Es_A(z) G(x, z)
 = ∑_{z∈A} Es_A(z) β_d |x|^{2−d} [1 + O(n/|x|)]
 = β_d |x|^{2−d} cap(A) [1 + O(n/|x|)]. □

Proposition 4.35. If A, B ⊂ Zd, d ≥ 3, are finite, then
(4.22)  cap(A ∪ B) ≤ cap(A) + cap(B) − cap(A ∩ B).

The inequality (4.22) is what characterizes capacities as opposed to other “measures” of size. Recall that probabilities, and more generally finite measures, satisfy
P(E_1 ∪ E_2) = P(E_1) + P(E_2) − P(E_1 ∩ E_2),
so the capacity condition is weaker than the probability (measure) condition.

Proof. Let x ∈ Zd, start a random walk at x, and let E_V denote the event that the random walk visits V. Note that E_{A∪B} = E_A ∪ E_B and E_{A∩B} ⊂ E_A ∩ E_B. Since it is possible for the walker to visit both A and B without visiting A ∩ B, it is not always true that E_{A∩B} = E_A ∩ E_B. Then,
P^x(E_{A∪B}) = P^x(E_A) + P^x(E_B) − P^x(E_A ∩ E_B) ≤ P^x(E_A) + P^x(E_B) − P^x(E_{A∩B}).


If we multiply both sides by β_d^{−1} |x|^{d−2} and take the limit using (4.21), we get the result. □

Proposition 4.36. If A = B_n, then
cap(A) = β_d^{−1} n^{d−2} + O(n^{d−3}).

Proof. By Proposition 4.18,
P^x{T_A < ∞} = (n/|x|)^{d−2} [1 + O(1/n)].
Therefore,
cap(A) = lim_{x→∞} β_d^{−1} |x|^{d−2} P^x{T_A < ∞} = β_d^{−1} n^{d−2} [1 + O(1/n)].

□

Since a transient random walk visits each point only a finite number of times, it also visits every finite set only finitely often. What about infinite sets?

Exercise 4.37. Let A ⊂ Zd, d ≥ 3, and let
g(x) = g_A(x) = P^x{the random walk visits A infinitely often}.
Show that g ≡ 0 or g ≡ 1. Hint: Show that g is harmonic and use Exercise 4.22 to conclude that g is constant. Now assume that g ≡ q ∈ (0, 1) and derive a contradiction.

Definition 4.38. A subset A ⊂ Zd is called transient if and only if with probability one the simple random walk visits A only finitely many times. Otherwise A is called recurrent.

We can construct infinite transient sets by spacing points far apart. Let {x_1, x_2, . . .} be a sequence of points with
∑_{k=1}^{∞} |x_k|^{2−d} < ∞.


Let S be a simple random walk starting at the origin and let
V = ∑_{n=0}^{∞} 1{S_n ∈ A} = ∑_{n=0}^{∞} ∑_{j=1}^{∞} 1{S_n = x_j}
be the number of times that the random walk visits A. Then,
E[V] = ∑_{j=1}^{∞} ∑_{n=0}^{∞} P{S_n = x_j} = ∑_{j=1}^{∞} G(x_j) < ∞.

Hence, P{V < ∞} = 1.

One can ask the converse: if E[V] = ∞, is it true that P{V = ∞} = 1? The answer turns out to be no. Let us do a construction. Set b = 1 − 1/d and let A = ∪_{n=1}^{∞} A_n, where A_n is the discrete ball of radius r_n = 2^{nb} centered at the point z_n = (2^n, 0, 0, . . . , 0). The number of elements of A_n is comparable to r_n^d = 2^{n(d−1)} and
∑_{x∈A} G(x) = ∑_{n=1}^{∞} ∑_{x∈A_n} G(x) ≍ ∑_{n=1}^{∞} 2^{n(2−d)} r_n^d ≍ ∑_{n=1}^{∞} 2^n = ∞.
Also, Proposition 4.34 shows that cap(A_n) ≍ r_n^{d−2}, and hence the probability of visiting A_n is comparable to
2^{−n(d−2)} r_n^{d−2} ≍ 2^{n(d−2)(b−1)} = 2^{−n(d−2)/d}.
Therefore,
∑_{n=1}^{∞} P{the random walk visits A_n} < ∞,
and by the Borel–Cantelli lemma, with probability one the walk visits only finitely many of the A_n. Hence A is transient even though E[V] = ∞.
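For d = 3 the visiting probabilities above are comparable to 2^{−n/3}, so their sum is a convergent geometric series; a quick numerical check:

```python
# The probability of visiting A_n is comparable to 2^{-n(d-2)/d}; for
# d = 3 this gives a geometric series with ratio 2^{-1/3}, which converges.
d = 3
r = 2 ** (-(d - 2) / d)          # ratio 2^{-1/3}, roughly 0.7937
partial = sum(r ** n for n in range(1, 200))
closed_form = r / (1 - r)        # exact sum of the geometric series
print(partial, closed_form)
```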

We used the word “comparable” and the notation ≍ in the last paragraph. If a_n, b_n are two sequences of positive numbers, we say that a_n and b_n are comparable, written a_n ≍ b_n, if there exists C < ∞ such that for all n,
C^{−1} a_n ≤ b_n ≤ C a_n.

We will give a criterion to determine whether or not a set is transient. We start with an exercise that we will use. We let A ⊂ Zd, d ≥ 3, and
A_n = A ∩ {z : 2^n ≤ |z| < 2^{n+1}}.


Exercise 4.39. Show that there exist 0 < c_1 < c_2 < ∞ such that for every A and every n, if |x| ≤ 2^{n−1} or 2^{n+2} ≤ |x| ≤ 2^{n+3}, then
c_1 cap(A_n) ≤ 2^{n(d−2)} P^x{T_{A_n} < ∞} ≤ c_2 cap(A_n).

Proposition 4.40 (Wiener’s Test). Let A ⊂ Zd, d ≥ 3, and let A_n = A ∩ {z : 2^n ≤ |z| < 2^{n+1}}. Then the set A is transient for random walk if and only if
(4.23)  ∑_{n=1}^{∞} 2^{n(2−d)} cap(A_n) < ∞.

Proof. Let q_n = 2^{n(2−d)} cap(A_n) ≍ P{T_{A_n} < ∞}, and let
Y = ∑_{n=1}^{∞} 1{T_{A_n} < ∞}
denote the number of the sets A_1, A_2, . . . that the random walk visits. The condition (4.23) is equivalent to the condition E[Y] < ∞. If E[Y] < ∞, then with probability one Y is finite and hence the walk is transient. This gives one direction.

To finish, we need to show that if E[Y] = ∞, then P{Y = ∞} = 1. Using Exercise 4.37, it suffices to show that P{Y = ∞} > 0. Assume that the sum in (4.23) is infinite. Then at least one of
∑_{n=1}^{∞} q_{2n},  ∑_{n=1}^{∞} q_{2n−1}
is infinite. We will assume the first is infinite; essentially the same argument holds if the second sum is infinite. Let E_n be the event that the random walk visits A_{2n}. Then, using the exercise immediately above, we get the relation
P(E_n ∩ E_m) ≤ c P(E_n) P(E_m)
for some c. To see that this suffices, we use the second moment method; see Proposition A.4. □

Proposition 4.41. Let S_n be a simple random walk in Zd, let A = {S_j : j = 0, 1, 2, . . .} be the set of points visited by the path, and let Â be the set of points visited by the loop erasure of the path. Then with probability one:

4.5. Capacity for d ≥ 3

99

• If d ≥ 5, A and Â are transient sets.
• If d ≤ 4, A and Â are recurrent sets.

Exercise 4.42. Show that
∑_{x∈Zd\{0}} |x|^{−r} < ∞
if and only if r > d.

Proof. We will only prove the result for A; the result for Â is similar but requires some more work. Let
Y = ∑_{x∈A} G(x) = ∑_{x∈Zd} 1{x ∈ A} G(x),
which is now a random variable since A is a random set. Note that
E[Y] = ∑_{x∈Zd} P{x ∈ A} G(x) = ∑_{x∈Zd} G(x)^2 / G(0).
Since G(x) ≍ |x|^{2−d}, we have G(x)^2 ≍ |x|^{4−2d}. Therefore, by Exercise 4.42, the sum converges if and only if 2d − 4 > d, that is, d > 4. Therefore, if d > 4, then E[Y] < ∞ and hence with probability one Y < ∞. As we have seen, this implies that A is a transient set, and since Â ⊂ A, Â is also a transient set.

The case d ≤ 4 takes more work; we will do only the case of A with d = 4. Let S, S̃ be two independent simple random walks starting at the origin and let
A = {S_j : j = 0, 1, . . .},  Ã = {S̃_j : j = 0, 1, 2, . . .}.
We will show that with probability one, A ∩ Ã is an infinite set. Let
σ_n = min{j : |S_j| ≥ 2^n},  A_n = B_{2^{n+1}} ∩ {S_j : σ_{n−2} ≤ j ≤ σ_{n+2}},
and let σ̃_n, Ã_n be the analogous quantities using the walk S̃. We will show the following:
• With probability one, there exist infinitely many n with A_{4n} ∩ Ã_{4n} ≠ ∅.


We will not give all the details but leave it as an exercise in the ideas of this chapter to put in all the details. However, we will give a sketch of the facts to verify. Let E_n denote the event that A_{4n} ∩ Ã_{4n} ≠ ∅ and let U_n = B_{2^{4n+1}} \ B_{2^{4n}}.
• Show that there exists c_1 > 0 such that for all x ∈ U_n,
P{x ∈ A_{4n}} ≥ c_1 2^{−8n}.
• Show that there exists c_2 < ∞ such that for all distinct x, y ∈ U_n,
P{x, y ∈ A_{4n} ∩ Ã_{4n}} ≤ c_2 2^{−16n} |x − y|^{−4}.
• Show that if
Y_n = ∑_{x∈U_n} 1{x ∈ A_{4n} ∩ Ã_{4n}},
then there exist c_3, c_4 > 0 such that
E[Y_n] ≥ c_3,  E[Y_n^2] ≤ c_4 n.
• Use the second moment method (see Section A.2) to conclude that
P(E_n) = P{Y_n > 0} ≥ c_3^2 / (c_4 n).
• Show that there exists c_6 such that for all m < n,
P(E_m ∩ E_n) ≤ c_6 P(E_m) P(E_n).
• Use the second moment method again to conclude that with probability one,
∑_{n=1}^{∞} 1{E_n} = ∞.

Exercise 4.43. Put in all the details of the last proof!

Exercise 4.44. Show that if d ≥ 5, there exists c < ∞ such that the following holds. Suppose S^1, S^2 are simple random walks starting at 0 and x, respectively. Then,
P{S^1[0, ∞) ∩ S^2[0, ∞) ≠ ∅} ≤ c |x|^{4−d}.
Hint: Let I(y) be the indicator function of the event that there exist j, k with S^1_j = y and S^2_k = y. Let I = ∑_{y∈Zd} I(y). Show that E[I] ≤ c |x|^{4−d}.

□

Many of the results about intersections of random walk paths are reflections of the fact that a random walk path in Zd, d ≥ 2, is a “two-dimensional set”. This is seen by noting that for large R, the number of points of the path that lie in the ball of radius R is of order R^2. Two two-dimensional sets (think, for example, of planes) in Rd typically do not intersect if d > 4 and do intersect if d < 4, with d = 4 being the critical dimension where it is a close call.
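The R² growth of the number of path points in a ball can be seen in a quick simulation. The sketch below is illustrative: the walk length, radii, and seed are arbitrary choices, and truncating the walk makes the counts slight underestimates.

```python
# Sketch: the range of a random walk is "two-dimensional" -- the number
# of distinct points of a long walk in Z^3 lying in B_R grows like R^2,
# so doubling R should roughly quadruple the count.
import random

random.seed(3)
STEPS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
x = (0, 0, 0)
visited = {x}
for _ in range(200000):
    dx = random.choice(STEPS)
    x = (x[0] + dx[0], x[1] + dx[1], x[2] + dx[2])
    visited.add(x)

def count_in_ball(R):
    return sum(1 for p in visited if p[0] ** 2 + p[1] ** 2 + p[2] ** 2 < R * R)

c8, c16 = count_in_ball(8), count_in_ball(16)
print(c8, c16)
```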

4.6. Capacity in two dimensions

There is also a notion of capacity in two dimensions that relates to the probability of hitting a set, but we cannot use the same definition since every nonempty set is hit with probability one. Instead we will take a limit. If A is a finite set, we write τ_A for the first time that we visit A, and we write τ_n for τ_{∂B_n}. We start with (4.13), which can be rewritten as
a(x) = lim_{n→∞} (2/π) (log n) P^x{τ_n < τ_0}.

Proposition 4.45. If A ⊂ Z2 is finite, then the limit
a_A(x) = lim_{n→∞} (2/π) (log n) P^x{τ_n < τ_A}
exists for every x ∈ Z2. Moreover, a_A ≡ 0 on A, and if z ∈ A, x ∈ Z2 \ A, then
a_A(x) = a(x − z) − ∑_{w∈A} P^x{S(τ_A) = w} a(w − z).

Proof. We will do the case z = 0 and leave the general case to the reader. Note that
(4.24)  P^x{τ_n < τ_0} = P^x{τ_n < τ_A} + P^x{τ_A < τ_n < τ_0},

and
P^x{τ_A < τ_n < τ_0} = ∑_{w∈A} P^x{S(τ_A ∧ τ_n) = w} P^w{τ_n < τ_0}.
Using (4.13), we get
lim_{n→∞} (2/π) (log n) P^x{τ_A < τ_n < τ_0} = ∑_{w∈A} P^x{S(τ_A) = w} a(w).
Using this again in (4.24), we get
lim_{n→∞} (2/π) (log n) P^x{τ_n < τ_A} = a(x) − ∑_{w∈A} P^x{S(τ_A) = w} a(w).

□

If 0 ∈ A, we can write
a(x) − a_A(x) = lim_{n→∞} (2/π) (log n) [P^x{τ_n < τ_0} − P^x{τ_n < τ_A}].

Definition 4.46. If x ∈ A ⊂ Z2, then the capacity of A is defined by
cap(A) = lim_{|z|→∞} [a(z) − a_A(z)] = ∑_{y∈A} hm_A(y) a(y − x).

The existence of this limit follows from Exercise 4.31. In some sense capacity is defined only up to an additive constant, and we have chosen the constant so that cap({0}) = 0. Another reasonable choice would be cap({0}) = −k_0. We have the expansion
a_A(z) = (2/π) log |z| + k_0 − cap(A) + o(1),  z → ∞.

Further reading

The classical book by Frank Spitzer [18] includes an extensive bibliography on the early work on random walk. This chapter can be considered a sampler from [10], which is a serious graduate/research-level treatment of the simple random walk.

Chapter 5

LERW and Spanning Trees on Zd

5.1. LERW in Zd

Simple random walk in Zd with d ≥ 3 is transient. This allows us to define the infinite LERW by erasing loops from the infinite path. The definition is essentially the same as in Section 2.1, so we just repeat it here. Let S_j be a simple random walk starting at the origin in Zd, d ≥ 3. Let
j_0 = max{n : S_n = 0},  Ŝ_0 = S_{j_0} = 0,
and for k > 0,
j_k = max{n : S_n = S_{j_{k−1}+1}},  Ŝ_k = S_{j_{k−1}+1} = S_{j_k}.
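For a finite path, the loop-erasing operation can be carried out chronologically; the sketch below is an illustration of ours (for a transient path this produces the same loop erasure Ŝ as the definition above).

```python
# Chronological loop erasure of a finite path: whenever the path
# revisits a vertex, the loop created since the first visit is erased.
def erase_loops(path):
    out, first_index = [], {}
    for v in path:
        if v in first_index:              # a loop just closed: erase it
            k = first_index[v]
            for w in out[k + 1:]:
                del first_index[w]
            out = out[:k + 1]
        else:
            first_index[v] = len(out)
            out.append(v)
    return out

print(erase_loops([0, 1, 2, 1, 3]))   # -> [0, 1, 3]: the loop 1-2-1 is gone
```

The vertices may be any hashable objects, e.g. lattice points of Zd as tuples.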

This gives a probability measure on infinite paths starting at the origin that visit no site more than once. Let us write A_0 = Zd and, for k ≥ 1, A_k = Zd \ Ŝ[0, k − 1]. The following proposition is similar to Proposition 2.4 although we get an extra factor.

Proposition 5.1. If η = [η_0, . . . , η_k] is a SAW in Zd (d ≥ 3) starting at the origin, then the probability that Ŝ[0, k] equals η is
(2d)^{−k} F_η(Zd) Es_η(η_k).


Proof. Similarly to the proof of Proposition 2.4, we write any infinite path whose loop erasure starts with η as
ℓ_0 ⊕ [η_0, η_1] ⊕ · · · ⊕ ℓ_{k−1} ⊕ [η_{k−1}, η_k] ⊕ ℓ_k ⊕ ω,
where ℓ_j is a loop rooted at η_j staying in A_j, and ω is an infinite (not necessarily self-avoiding) path starting at η_k that does not return to any vertex in η. We then have:
• The measure of the possible ℓ_j is G_{A_j}(η_j, η_j).
• Each [η_j, η_{j+1}] contributes a factor of (2d)^{−1}.
• The measure of the possible ω is the probability that a simple random walk starting at η_k never returns to η. By definition, this is Es_η(η_k).
Finally, we recall that by definition
F_η(Zd) = ∏_{j=0}^{k} G_{A_j}(η_j, η_j).

□

If d ≥ 3 and η is a SAW (or, indeed, any finite subset of Zd), let f_η(z) be the probability that a simple random walk starting at z never visits η. If z ∉ η, this is the same as Es_η(z), but for z ∈ η, f_η(z) is defined to be zero. We can characterize f_η as the unique function on Zd that is harmonic in Zd \ η, equals zero on η, and such that f_η(z) → 1 as z → ∞.

Exercise 5.2. Show that if z ∈ η,
Es_η(z) = −L f_η(z).

We will now describe simple random walk conditioned to avoid η (forever). Fix η and suppose that f_η(x) > 0. If S_j is a simple random walk starting at x, then we can condition on the event that the walk never visits a vertex in η. This is well defined because we are conditioning on an event of positive probability.

Proposition 5.3. Simple random walk conditioned to avoid η starting at x_0 with f_η(x_0) > 0 is a (time-homogeneous) Markov chain


X_0, X_1, . . . with transition probabilities
p(x, y) = f_η(y) / (2d f_η(x)),  |x − y| = 1.

Proof. Let S be a simple random walk starting at x and let E be the event {S_j ∉ η : j = 0, 1, 2, . . .}, which has probability f_η(x). Using the definition of conditional probability, we see that
P^x{S_1 = y | E} = P^x[{S_1 = y} ∩ E] / P^x(E) = P^x{S_1 = y} P^x(E | S_1 = y) / P^x(E) = (2d)^{−1} f_η(y) / f_η(x). □

Another way one might try to define “random walk conditioned to avoid a set” is by doing the usual random walk but at each time restricting to those vertices that are not in the set to be avoided. This would be the same as simple random walk on the graph Zd \ η. This is not the same process; this is why we put the parenthetical (forever) in our definition. Our random walk is a particular case of a Doob h-process, where a process computes the probabilities for each step based on the values of a positive harmonic function; in our case this function is f_η. The “weighting” or “tilting” of probability distributions is a very important idea and is seen in many places, e.g., the Girsanov or Cameron–Martin theorem in stochastic calculus or “importance sampling” in statistics.
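The h-process recipe is short in code. As an illustration of ours (not an example from the text), take d = 1 and the positive harmonic function h(x) = x on {1, 2, . . .}; the tilted chain p(x, y) = h(y)/(2 h(x)) is simple random walk conditioned to stay positive.

```python
# Sketch of a Doob h-transform in one dimension: tilt simple random walk
# by the positive harmonic function h(x) = x on {1, 2, ...}.  The tilted
# chain is the walk conditioned to stay positive forever.
from fractions import Fraction

def h(x):
    return Fraction(x)

def transition(x):
    """p(x, y) = h(y) / (2 h(x)) for the neighbors y = x - 1, x + 1."""
    return {x - 1: h(x - 1) / (2 * h(x)), x + 1: h(x + 1) / (2 * h(x))}

# Because h is harmonic, the tilted probabilities sum to exactly one.
for x in range(1, 6):
    assert sum(transition(x).values()) == 1
print(transition(3))   # steps up with probability 2/3, down with 1/3
```

The normalization works precisely because h is harmonic, which is the general mechanism behind the f_η-tilted walk above.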

We defined the random walk conditioned to avoid η starting at a point z with f_η(z) > 0. We can also define the process at x with f_η(x) = 0, provided that Es_η(x) > 0. In this case we define
p(x, y) = f_η(y) / (2d Es_η(x)),  |x − y| = 1.


Although this allows the starting point to be in η, the process never returns to η. The next exercise gives the “Laplacian random walk” viewpoint of the LERW in Zd; see Section 2.4.

Exercise 5.4. Suppose d ≥ 3, η is a k-step SAW starting at the origin, and Ŝ_0, Ŝ_1, . . . is a loop-erased random walk starting at the origin. Then the following is true.
(1) The distribution of {Ŝ_k, Ŝ_{k+1}, . . .} given [Ŝ_0, . . . , Ŝ_k] = η can be obtained as follows:
• Take a random walk starting at Ŝ_k conditioned to never return to η.
• Erase loops from this path.
(2) Given [Ŝ_0, . . . , Ŝ_k] = η, the distribution of Ŝ_{k+1} is the same as that of simple random walk starting at Ŝ_k conditioned to never return to η.

In particular, we have another way of constructing the LERW in Zd, d ≥ 3:
• Take a simple random walk starting at the origin conditioned to never return to the origin.
• Erase loops.

We will use this viewpoint to construct the LERW for d = 2. We need to make sense of the notion “simple random walk conditioned to never return to the origin”. As in Section 4.6, we let S be a simple random walk in Z2 and
τ_A = min{k ≥ 0 : S_k ∈ A},  τ_n = min{k ≥ 0 : |S_k| ≥ n}.

The function a_A is given by
a_A(z) = lim_{n→∞} (2/π) (log n) P^z{τ_n < τ_A}.
If A = {0}, then we write just a(z), and indeed a(z) is the same as the potential kernel in Z2; see Proposition 4.19. For other finite A containing the origin, Proposition 4.45 gives
a_A(z) = a(z) − ∑_{w∈A} P^z{S(τ_A) = w} a(w).


The function a_A can also be characterized by the following properties:
a_A(z) = 0,  z ∈ A,
L a_A(z) = 0,  z ∈ Z2 \ A,
a_A(z) ∼ (2/π) log |z|,  z → ∞.

Definition 5.5. If A ⊂ Z2 is finite and contains the origin, then random walk conditioned to avoid A is simple random walk “tilted” by a_A. More precisely, it is the time-homogeneous Markov chain on {x : a_A(x) > 0} with transition probabilities
p(x, y) = a_A(y) / (4 a_A(x)),  |x − y| = 1.
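A numerical sketch of this definition for A = {0}, using the standard asymptotic expansion a(x) ≈ (2/π) log|x| + k_0. The constant k_0 ≈ 1.0293737 quoted here should be treated as an assumption of this sketch; the harmonicity check below does not actually depend on its exact value.

```python
# Sketch: transition probabilities p(x, y) = a(y)/(4 a(x)) of planar
# random walk conditioned to avoid the origin, using the asymptotics
# a(x) ~ (2/pi) log|x| + k0.  The value k0 ~ 1.0293737 is the standard
# constant in this expansion (assumed here); valid only away from 0.
import math

K0 = 1.0293737

def a(x):
    return (2 / math.pi) * math.log(math.hypot(*x)) + K0

def transition(x):
    """p(x, y) = a(y) / (4 a(x)) for the four neighbors y of x."""
    nbrs = [(x[0] + 1, x[1]), (x[0] - 1, x[1]),
            (x[0], x[1] + 1), (x[0], x[1] - 1)]
    return {y: a(y) / (4 * a(x)) for y in nbrs}

# a is (approximately) harmonic away from 0, so these nearly sum to 1,
# and the chain drifts toward larger |x|, i.e., away from the origin.
p = transition((5, 0))
print(sum(p.values()))
```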

As in d ≥ 3, we can also start the process at x ∈ A provided that −L a_A(x) > 0, that is, if there exists y with |x − y| = 1 and a_A(y) > 0.

Exercise 5.6. Suppose A ⊂ Z2 is finite and let V = {x ∈ Z2 : a_A(x) > 0}. Let S be a simple random walk and
p_n^V(x, y) = P^x{S_n = y; S_0, . . . , S_n ∈ V}.
Let q_n(x, y) be the n-step transition probability for random walk conditioned to avoid A. Show that
q_n(x, y) = p_n^V(x, y) a_A(y) / a_A(x).

Exercise 5.7. Let X_j, j ≥ 0, have the distribution of simple random walk in Z2 conditioned to never hit the origin, with X_0 = x ≠ 0.
(1) Show that if A is a finite set, then the probability that the walker ever reaches A is bounded above by
max_{y∈A} a(y) / a(x).
(2) Show that the process is transient, that is, with probability one
lim_{j→∞} |X_j| = ∞.

Exercise 5.8. Let 0 ∈ A ⊂ Z2 be finite and let x ∈ Z2 \ {0}.

(1) Show that the probability that random walk starting at x conditioned to avoid the origin also avoids all of A is a_A(x)/a(x).
(2) Show that simple random walk conditioned to avoid A can be obtained by starting with simple random walk conditioned to avoid the origin and then conditioning on the event that the walker avoids A.

Another way of thinking of the last exercise is that “tilting random walk by a_A” can be considered as a two-step process: first tilting by a and then tilting the tilted process by a_A/a.

Exercise 5.9. Suppose that η = [η_0, . . . , η_k] is a SAW in Z2 starting at the origin, and Ŝ is a LERW starting at the origin.
(1) Show that the probability that Ŝ[0, k] equals η is
(5.1)  (2d)^{−k} F_η(Z2 \ {0}) [−L a_η(η_k)].
(2) For n > k, consider LERW in the discrete ball B_n; that is, start simple random walk at the origin, stop it when it reaches ∂B_n, and then erase loops. Let p̂_n(η) be the probability that the first k steps of this loop erasure agree with η. Show that for fixed η, as n → ∞, p̂_n(η) converges to (5.1).

5.2. Marginal distributions for UST in Zd

Suppose A ⊂ Zd is finite. The wired uniform spanning tree T is a subset of the edges of A^W = A ∪ {∂A} containing exactly #(A) edges such that no loop appears. Recall from Corollary 2.24 that the number of wired spanning trees of A is (2d)^{#(A)} F(A)^{−1}. We want to extend the definition of the UST to the integer lattice Zd. One cannot do this directly since the number of spanning trees is infinite. We will define it as a limit by specifying the probability that any particular finite set of edges appears.

We start by computing the marginal distributions for the UST on finite A. To be more precise, we will compute P(E; A), where E


Figure 1. There are 12 vertices in A. The 7 edges in E are represented by full lines and the wiring of the boundary is indicated by the dotted lines. The component V0 has 2 interior vertices as well as ∂A. We have k = 3 with two interior components with 3 vertices and one with 2 vertices. There are 2 isolated vertices and hence n = 5.

is a collection of edges in A^W, and P(E; A) denotes the probability that all of these edges appear in the uniform wired spanning tree.

It will be useful to set up some notation. A finite set of edges E partitions A^W into connected components of vertices that we write as {V_0, V_1, . . . , V_n}. We order these components so that V_0 is the component containing ∂A; V_1, . . . , V_k are the remaining components containing more than one vertex in A; and V_{k+1}, . . . , V_n are the components consisting of isolated vertices in A. See Figure 1. Note that
(5.2)  #(E) = #(A) − n.

Let I = V_{k+1} ∪ · · · ∪ V_n be the set of isolated vertices. Let S denote a simple random walk in Zd and T = min{m ≥ 1 : S_m ∈ Zd \ I}. For 0 ≤ r, s ≤ k, define the excursion measure between V_r and V_s in I by
Exc_I(V_r, V_s) = ∑_{z∈V_r} P^z{S_T ∈ V_s}.

Note that if 1 ≤ r ≤ k,
∑_{s=0}^{k} Exc_I(V_r, V_s) = ∑_{z∈V_r} ∑_{s=0}^{k} P^z{S_T ∈ V_s} = #(V_r).
By convention, if V, V′ are disjoint sets, we write just Exc(V, V′) for Exc_I(V, V′) with I = Zd \ (V ∪ V′).

What we call excursion measure is very closely related to effective resistance. We choose to use the excursion terminology to emphasize the “excursions” (random walk paths) going from one set to another.

Exercise 5.10. Show that
Exc_I(V_r, V_s) = ∑_{z∈V_r} ∑_{w∈V_s} H_{∂I}(z, w),
where H_{∂I}(z, w) is the boundary Poisson kernel. Conclude that Exc_I(V_r, V_s) = Exc_I(V_s, V_r).

Exercise 5.11. Suppose A_1 ⊂ A_2 ⊂ · · · is a sequence of finite subsets of Zd, d ≥ 3, with ∪_{j=1}^{∞} A_j = Zd. If V is a finite subset of Zd, show that
(5.3)  lim_{n→∞} Exc(V, ∂A_n) = cap(V).

We can view G = {V_1, . . . , V_n} as a graph with boundary ∂G = {V_0}, using the edges from Zd. This introduces some multiple edges and some self-edges, but this is no problem. There is a bijection between wired spanning trees of G (that is, spanning trees of G ∪ {V_0}) and wired spanning trees of A containing the edges in E. Indeed, the edges of the spanning tree of A are the union of E and the edges in the spanning tree of G. Let J = [J(r, s)]_{1≤r,s≤k} be the matrix
(5.4)  J(r, s) = #(V_r) − Exc_I(V_r, V_r) if r = s,  J(r, s) = −Exc_I(V_r, V_s) if r ≠ s,


and Q = [Q(r, s)]_{1≤r,s≤k} the matrix
(5.5)  Q(r, s) = Exc_I(V_r, V_s) / #(V_r).

Simple random walk on G is sometimes called excursion-reflected random walk or random walk with darning. It treats each connected component of vertices as a single point. When the process is in state V_j, it first chooses a vertex of V_j at random using the uniform distribution and then takes a step from that vertex using simple random walk. If we view it only on the vertices, then the transition probabilities for r ≥ 1, s ≥ 0 are given by
p̃(V_r, V_s) = b(r, s) / ((2d) #(V_r)),

where b(r, s) denotes the number of edges connecting V_r and V_s. The extra rule for the graph is that if the process chooses to jump between two vertices and there is more than one edge connecting them, the process chooses the edge randomly. This process looks like usual simple random walk on the isolated vertices but is different on the components with more than one vertex.

Proposition 5.12. If A is finite and E is a set of edges in the wired graph A^W containing no loop, then the probability that a uniform wired spanning tree in A contains all the edges in E is given by
P(E; A) = (F_V(A)/(2d)^{#(E)}) det[J] = (F_V(A)/(2d)^{#(E)}) ∏_{j=1}^{k} Exc(V_j, V_0 ∪ · · · ∪ V_{j−1}),
where V = A \ I denotes the set of vertices in A that are endpoints of edges in E and J is defined in (5.4). In particular, if E is a tree not including ∂A with vertices V,
P(E; A) = F_V(A) Exc(V, ∂A) / (2d)^{#(E)}.

Proof. Let us apply Wilson’s algorithm to find the wired spanning trees of G using simple random walk on G stopped upon reaching V0 . Recall that the degree of Vr is (2d) #(Vr ). Applying Corollary 2.24 to this random walk, we see that the number of wired spanning trees

of G is
[∏_{r=1}^{n} deg(V_r)] F̂(G)^{−1} = (2d)^n [∏_{r=1}^{k} #(V_r)] F̂(G)^{−1},
where we write F̂ to denote that this is the quantity obtained using simple random walk on G. Taking ratios and using (5.2), we see that
P(E; A) = (2d)^{−#(E)} [∏_{r=1}^{k} #(V_r)] F(A)/F̂(G).
Note that F(A) = F_V(A) F(A \ V) = F_V(A) F(I). Similarly,
F̂(G) = F̂_{{V_1,...,V_k}}(G) F̂({V_{k+1}, . . . , V_n}).
Since random walk on G acts the same as usual simple random walk on the isolated vertices I, we have F̂({V_{k+1}, . . . , V_n}) = F(I), and hence
F(A)/F̂(G) = F_V(A)/F̂_{{V_1,...,V_k}}(G).
We claim that

[∏_{r=1}^{k} #(V_r)]^{−1} det J = 1/F̂_{{V_1,...,V_k}}(G).
Indeed, the left-hand side equals det(I − Q), where Q is defined as in (5.5). These are the transitions for the random walk “viewed only when visiting {V_1, . . . , V_k}”; see Exercise 2.11. From that exercise, we see that det(I − Q) equals 1/F̂_{{V_1,...,V_k}}(G). To finish the proof, we need to show that
(5.6)  det[J] = ∏_{j=1}^{k} Exc(V_j, V_0 ∪ · · · ∪ V_{j−1}).
We leave this as Exercise 5.13. We note that we can also write this as
det[J] = ∏_{j=1}^{k} [#(V_j) − Exc_{I_j}(V_j, V_j)],
where I_j = Zd \ (V_0 ∪ · · · ∪ V_j). □



5.2. Marginal distributions for UST in Zd


Exercise 5.13. The goal of this exercise is to prove (5.6). Let G̃ = [I − Q]^{−1} denote the Green's function for the simple random walk on G stopped upon reaching V_0.
(1) Show that
\[
\tilde G(V_1, V_1) = \frac{1}{1-f}, \qquad\text{where}\qquad f = \frac{1}{\#(V_1)} \sum_{z\in V_1} P^z\{S(T_1)\in V_1\}.
\]
Here S denotes simple random walk in Z^d and T_1 is the first i ≥ 1 with S_i ∈ V_0 ∪ V_1.
(2) More generally, show that
\[
\frac{1}{\det(I-Q)} = \prod_{r=1}^{k} \frac{1}{1-f_r}, \qquad\text{where}\qquad f_r = \frac{1}{\#(V_r)} \sum_{z\in V_r} P^z\{S(T_r)\in V_r\}.
\]
Here S denotes simple random walk in Z^d and T_r is the first i ≥ 1 with S_i ∈ V_0 ∪ V_1 ∪ ··· ∪ V_r.
(3) Use this to conclude (5.6).

We now take the limit as A grows to Z^d. Let us fix a finite set of edges E and let A_1 ⊂ A_2 ⊂ ··· with
\[
\mathbb{Z}^d = \bigcup_{j=1}^{\infty} A_j.
\]

We start with the transient case d ≥ 3. Let V1 , . . . , Vk be the connected components of E as above. Let us make one more definition. Suppose A, B ⊂ Zd with A finite. If S is a simple random walk and z ∈ A, let EsA (z; B) be the probability that the random walk does not return to A without visiting B first. In other words, if T = TA∪B = min{j : Sj ∈ A ∪ B} then EsA (z; B) = Pz {T = ∞ or ST ∈ B}.

We define
\[
\operatorname{cap}(A;B) := \operatorname{Exc}(A,\, B\cup\{\infty\}) = \lim_{n\to\infty} \operatorname{Exc}(A,\, B\cup\partial B_n) = \sum_{z\in A} \operatorname{Es}_A(z;B).
\]

Note that cap(A) = cap(A; ∅) and cap(A; B) = #(A) − Exc_{Z^d \ (A∪B)}(A, A). We omit the simple proof of the following proposition, obtained by taking limits in Proposition 5.12, using (5.3) for the second part.

Proposition 5.14. If d ≥ 3, E is a finite subset of edges containing no loop, and A_1 ⊂ A_2 ⊂ ··· with ∪_{j=1}^∞ A_j = Z^d, then
\[
\lim_{j\to\infty} P(E;A_j) = \frac{F_V(\mathbb{Z}^d)}{(2d)^{\#(E)}}\,\prod_{j=1}^{k} \operatorname{cap}(V_j;\, V_1\cup\cdots\cup V_{j-1}).
\]
Here V_1, ..., V_k are the connected components with more than one vertex and V = V_1 ∪ ··· ∪ V_k. In particular, if E is a tree with vertices V,
\[
\lim_{j\to\infty} P(E;A_j) = \frac{F_V(\mathbb{Z}^d)\,\operatorname{cap}(V)}{(2d)^{\#(E)}}.
\]

Exercise 5.15. Show that if d ≥ 3 and V = {x_1, ..., x_n}, then
\[
F_V(\mathbb{Z}^d) = \det\big[G(x_i, x_j)\big]_{1\le i,j\le n},
\]
where G denotes the Green's function for simple random walk in Z^d.

Exercise 5.16. Use Proposition 5.14 to show that if d ≥ 3 and E consists of a single edge, then
\[
\lim_{j\to\infty} P(E;A_j) = \frac{1}{d}.
\]
Why would you expect this to be the right answer? Hint: Exercise 4.33 may be useful.

The d = 2 case is a little different because the random walk is recurrent. We will restrict ourselves to the case where A = B_n = {z ∈ Z^2 : |z| < n}. We start with a lemma. Recall that
\[
G_{B_n}(0,0) = \frac{2}{\pi}\log n + O(1).
\]


If z ≠ 0, we have G_{B_{n−|z|}}(0,0) ≤ G_{B_n}(z,z) ≤ G_{B_{n+|z|}}(0,0), and hence for all z,
\[
\lim_{n\to\infty} \Big[\frac{2}{\pi}\log n\Big]^{-1} G_{B_n}(z,z) = 1.
\]

Lemma 5.17. Suppose A ⊂ Z^2 is finite. If S is a simple random walk, let T_A = min{j ≥ 1 : S_j ∈ A} and τ_n = min{j ≥ 1 : |S_j| ≥ n}. Then
\[
\lim_{n\to\infty} \frac{2}{\pi}(\log n) \sum_{z\in A} P^z\{\tau_n < T_A\} = 1.
\]

Proof. For ease we will assume that 0 ∈ A. If A = {0} with stopping time T_0, then this follows from Proposition 4.19. In fact,
\[
P^0\{\tau_n < T_0\} = \frac{1}{G_{B_n}(0,0)}.
\]
For any A ⊂ B_n, "path reversal" gives a natural bijection between:
• paths starting at A ending at ∂B_n with all other vertices in B_n \ A;
• paths starting at ∂B_n ending at A with all other vertices in B_n \ A.
Using this bijection, we see that
\[
\sum_{z\in A} P^z\{\tau_n < T_A\} = \sum_{w\in\partial B_n} P^w\{\tau_n > T_A\},
\qquad
P^0\{\tau_n < T_0\} = \sum_{w\in\partial B_n} P^w\{\tau_n > T_0\}.
\]
Also, since 0 ∈ A, we have
\[
P^w\{\tau_n > T_A\} \ge P^w\{\tau_n > T_0\} = P^w\{\tau_n > T_A\}\, P^w\{\tau_n > T_0 \mid \tau_n > T_A\} \ge P^w\{\tau_n > T_A\}\, b_n(A),
\]
where b_n(A) = min_{z∈A} P^z{T_0 < τ_n}.


Using recurrence, we see that b_n(A) → 1 as n → ∞, which gives our result. □

Proposition 5.18. Suppose E is a finite set of edges of Z^2 with no loops, V_1, ..., V_k are the connected components of the corresponding vertices, and V = V_1 ∪ ··· ∪ V_k. Suppose z ∈ V. Then
\[
\lim_{n\to\infty} P(E;B_n) = \frac{F_V(\mathbb{Z}^2\setminus\{z\})}{4^{\#(E)}}\,\prod_{j=2}^{k} \operatorname{Exc}(V_j,\, V_1\cup\cdots\cup V_{j-1}).
\]
In particular, if E is a finite tree, then
\[
\lim_{n\to\infty} P(E;B_n) = \frac{F_V(\mathbb{Z}^2\setminus\{z\})}{4^{\#(E)}}.
\]

Proof. Proposition 5.12 states that P(E;B_n) equals
\[
\frac{F_V(B_n)}{4^{\#(E)}}\,\operatorname{Exc}(V_1, \partial B_n)\,\prod_{j=2}^{k} \operatorname{Exc}(V_j,\, \partial B_n\cup V_1\cup\cdots\cup V_{j-1}).
\]
We write F_V(B_n) Exc(V_1, ∂B_n) as
\[
\Big[\frac{2}{\pi}\log n\Big]^{-1} G_{B_n}(z,z)\; F_V(B_n\setminus\{z\})\;\Big[\frac{2}{\pi}(\log n)\,\operatorname{Exc}(V_1, \partial B_n)\Big].
\]
Then we have
\[
\lim_{n\to\infty} \Big[\frac{2}{\pi}\log n\Big]^{-1} G_{B_n}(z,z) = 1,
\]
and using Lemma 5.17 we have
\[
\lim_{n\to\infty} \frac{2}{\pi}(\log n)\,\operatorname{Exc}(V_1, \partial B_n) = 1.
\]
Since V_1 is nonempty and the random walk is recurrent, we have for each j ≥ 2,
\[
\lim_{n\to\infty} \operatorname{Exc}(V_j,\, \partial B_n\cup V_1\cup\cdots\cup V_{j-1}) = \operatorname{Exc}(V_j,\, V_1\cup\cdots\cup V_{j-1}). \qquad\Box
\]

Exercise 5.19. Show that if E consists of a single edge in Z^2, then
\[
\lim_{n\to\infty} P(E;B_n) = \frac{1}{2}.
\]


Exercise 5.20. Give a direct proof, without using results of this section, that if V is a finite subset of Z^d and z, w ∈ V, then F_V(Z^d \ {z}) = F_V(Z^d \ {w}).

Exercise 5.21. It follows from Proposition 5.18 that
\[
\prod_{j=2}^{k} \operatorname{Exc}(V_j,\, V_1\cup\cdots\cup V_{j-1})
\]
does not depend on the ordering of the components {V_1, V_2, ..., V_k}. Give a direct proof of this fact.

5.3. Uniform spanning tree (UST) in Z^d

The uniform spanning tree of a finite graph is a spanning tree chosen from the uniform distribution on all spanning trees. For an infinite graph such as the integer lattice this does not make sense, at least not directly. In order to define the UST we need to take a limit.

Definition 5.22.
• A spanning tree of Z^d is a subgraph of the integer lattice containing all the vertices of Z^d with the property that for any two vertices there exists a unique self-avoiding path in the subgraph connecting the two points.
• A forest in Z^d is a subgraph containing all the vertices of Z^d with the property that for any two vertices there exists at most one self-avoiding path in the subgraph connecting the points.
• A spanning forest in Z^d is a forest such that for every z ∈ Z^d there is an infinite subpath going to infinity.

Note that we are retaining all the vertices, so a spanning tree, forest, or spanning forest can be viewed as a subset of nearest neighbor edges in Z^d with certain properties. A forest partitions the vertices of the integer lattice into equivalence classes called connected components, where x, y are in the same component if there is a path in the forest connecting x and y. A component can consist of a single isolated vertex; this happens if a vertex has no edges adjacent to it.
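Viewing trees and forests as edge sets suggests a simple computational characterization: an edge set is a forest precisely when adding its edges one at a time never joins two vertices that are already connected. The following union-find sketch (an illustration in Python; the function name is ours, not from the text) checks this for a finite edge set.

```python
def is_forest(edges):
    """Return True iff the given edge set contains no loop.

    Union-find: process edges one at a time; an edge whose two
    endpoints already lie in the same component would close a loop.
    """
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra == rb:
            return False  # a and b already connected: (a, b) closes a loop
        parent[ra] = rb
    return True
```

For a finite edge set in Z^d one can take the vertices to be lattice points; three edges forming an L-shape pass the test, while the four edges of a unit square fail it.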


A spanning forest is a forest such that all of the components contain an infinite number of vertices. A spanning tree is a spanning forest with only one component. A spanning forest can be viewed as a wired spanning tree in Z^d where we add a boundary point ∞ to the lattice. Given any forest F and any finite A ⊂ Z^d, we let F_A be the collection of edges in F with at least one vertex in A. Note that F_A is also a forest in Z^d with the property that every vertex outside of A is isolated. It satisfies the "restriction property": if A ⊂ B, then F_A = [F_B]_A. Any probability measure on forests in Z^d can be described in terms of probability measures P_A on forests in A. Conversely, any collection of probability measures on forests {P_A : A finite} satisfying a certain consistency condition gives rise to a probability measure on forests.

Exercise 5.23. What should the "consistency condition" be?

Exercise 5.24. Suppose P is a probability measure on forests in Z^d. Show that P is actually a probability measure on spanning forests if and only if the following holds with probability one.
• For every finite A ⊂ Z^d and every z ∈ A, there exists a path in F_A connecting z to ∂A.

Definition 5.25. A sequence of probability measures P^{(n)} on forests in Z^d is said to converge to P if for all finite A and all forests F' of A,
\[
(5.7)\qquad \lim_{n\to\infty} P_A^{(n)}(F') = P_A(F').
\]

Exercise 5.26. Suppose P is a probability measure on forests in Z^d and A ⊂ Z^d is finite. For each forest F' in A, let E = E(F', A) be the set of edges not in F' with at least one endpoint in A. Use a version of the inclusion-exclusion principle to show that
\[
(5.8)\qquad P_A(F') = \sum_{j=0}^{\#(E)} (-1)^j \sum_{J\subset E,\;\#(J)=j} P\{F'\cup J \subset F\}.
\]
The following follows immediately from this exercise.
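Formula (5.8) can be sanity-checked on a toy case: take F uniform over the four subsets of two fixed edges e1, e2 (assuming both edges together form a forest), with A large enough that F_A = F. The sketch below is an illustration with hypothetical names, not a construction from the text.

```python
from itertools import combinations

# Toy distribution: F uniform over all subsets of the two edges,
# so P{S ⊂ F} = 2^{-#(S)} for S ⊂ {e1, e2}.
edges = ("e1", "e2")
outcomes = [frozenset(s) for r in range(3) for s in combinations(edges, r)]

def prob_contains(s):
    """P{s ⊂ F} under the uniform distribution on the four outcomes."""
    return sum(1 for f in outcomes if s <= f) / len(outcomes)

def restriction_prob(f_prime):
    """P_A(F') computed via the inclusion-exclusion formula (5.8)."""
    excluded = [e for e in edges if e not in f_prime]  # the set E(F', A)
    total = 0.0
    for j in range(len(excluded) + 1):
        for J in combinations(excluded, j):
            total += (-1) ** j * prob_contains(frozenset(f_prime) | set(J))
    return total
```

For F' = {e1} this evaluates P{e1 ⊂ F} − P{e1, e2 ⊂ F} = 1/2 − 1/4 = 1/4, which is indeed the probability that F equals {e1} exactly.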


Proposition 5.27. Suppose P^{(n)} is a sequence of probability measures on forests in Z^d and for each finite set of edges E the limit
\[
(5.9)\qquad P(E) = \lim_{n\to\infty} P^{(n)}\{E\subset F\}
\]
exists. Then P^{(n)} converges to a probability measure P on forests of Z^d. Indeed, using (5.8),
\[
P_A(F') = \sum_{j=0}^{\#(E)} (-1)^j \sum_{J\subset E,\;\#(J)=j} P(F'\cup J).
\]

There are three equivalent ways to define the uniform spanning forest. Recall that B_n = {z ∈ Z^d : |z| < n} denotes the discrete ball about the origin.
• Free boundary. Take a UST on B_n. Then let n → ∞.
• Wired boundary. Take a UST on the wired graph B_n ∪ {∂B_n}. Then let n → ∞.
• Wilson's algorithm. A natural generalization of Wilson's algorithm to infinite graphs, which we now define.
In the last section, we showed the existence of the infinite spanning forest using the wired boundary definition. Indeed, we gave explicit expressions for the probabilities P(E) in Propositions 5.14 and 5.18 which we now recall. Let V_1, V_2, ..., V_n be the connected vertex components associated to E; let V_1, ..., V_k be the components containing more than one vertex; and V = V_1 ∪ ··· ∪ V_k. Then if d ≥ 3,
\[
P(E) = \frac{F_V(\mathbb{Z}^d)}{(2d)^{\#(E)}}\,\prod_{j=1}^{k} \operatorname{cap}(V_j;\, V_1\cup\cdots\cup V_{j-1}),
\]
and if d = 2 and z ∈ V,
\[
P(E) = \frac{F_V(\mathbb{Z}^2\setminus\{z\})}{4^{\#(E)}}\,\prod_{j=2}^{k} \operatorname{Exc}(V_j,\, V_1\cup\cdots\cup V_{j-1}).
\]
While these expressions are nice, it is not easy to deduce from them some of the properties of the forest, e.g., whether the forest is actually a tree. For this reason, it is useful to describe the spanning forest using Wilson's algorithm.


We will describe Wilson’s algorithm to choose a random spanning forest of Zd . We order the points Zd = {x0 , x1 , . . .}. Although the distribution does not depend on the ordering, it is useful to order the points so that (5.10)

0 = |x0 | ≤ |x1 | ≤ |x2 | ≤ · · · • Take a LERW from 0 to infinity and add those edges to the graph. This gives F0 . Let U0 be the set of vertices that are endpoints of an edge in F0 . • Recursively, assume Fm is given. If xm+1 ∈ Um , do nothing and let Fm+1 = Fm , Um+1 = Um . Otherwise, start a simple random walk at xm+1 and stop it at the first time it hits a vertex in Um . – If this random walk stops in finite time, then we erase loops and add the edges of the loop erasure to Fm to give Fm+1 . – If the random walk goes to infinity without hitting Um , then erase loops from the infinite path and add the edges of the loop erasure to Fm to give Fm+1 . – Let Um+1 be the set of vertices adjacent to at least one edge in Fm+1 . Note that Fm+1 need not be connected. • The final forest is F=

∞ ∪

Fm .

m=0
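The recursive procedure is easiest to experiment with on a finite connected graph, where every walk is stopped at a fixed root rather than run to infinity. The following Python sketch implements that finite, rooted version of Wilson's algorithm (the function and variable names are ours); loop erasure is done implicitly by the successor-pointer trick.

```python
import random

def wilson_ust(vertices, neighbors, root):
    """Sample a uniform spanning tree of a finite connected graph
    by Wilson's algorithm, rooted at `root`.

    `nxt[v]` stores the most recent exit from v, so revisiting a
    vertex overwrites its successor -- this erases the loop made
    since the last visit, giving the loop-erased path for free.
    """
    in_tree = {root}
    parent = {}
    for start in vertices:
        if start in in_tree:
            continue
        nxt = {}
        v = start
        while v not in in_tree:          # random walk until it hits the tree
            w = random.choice(neighbors[v])
            nxt[v] = w
            v = w
        v = start
        while v not in in_tree:          # add the loop-erased path
            in_tree.add(v)
            parent[v] = nxt[v]
            v = nxt[v]
    return parent                        # tree edges: v -> parent[v]
```

On a 3 × 3 grid this returns 8 edges forming a spanning tree, and the resulting distribution is uniform over spanning trees regardless of the order in which `vertices` is scanned — the analogue of Proposition 5.29 below.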

Some important properties of this procedure are the following. • {x0 , . . . , xm } ⊂ Um . • If A ⊂ {x0 , . . . , xm }, then for all n ≥ m, [Fn ]A = FA . In other words, after time m we do not add any edges to the forest with an endpoint in A. • If x, y ∈ Um and x, y are not connected in Fm , then x, y will not be connected in F. Proposition 5.28. The distribution of the forest F in Wilson’s algorithm is the same as the uniform spanning forest defined as the limit of wired spanning trees on Bn .


Proof. We will not give the proof here, but we will suggest the strategy for d ≥ 3 if a reader wishes to put in the details. The d = 2 case will be discussed in Section 5.5. It suffices to show that it gives the same distribution on forests on finite sets A. We fix a finite set A and consider a wired spanning tree derived from Wilson's algorithm using the same order and the same random walks as in the infinite Wilson's algorithm, except that the random walks are stopped when they reach ∂B_n. If we fix A and let n → ∞, we can show that in the limit they give the same distribution on forests in A. □

Proposition 5.29. The distribution of the forest F in Wilson's algorithm does not depend on the ordering of the vertices {x_0, x_1, ...}.

Proof. This fact is true for wired spanning trees and hence can be deduced from the previous proposition. □

Proposition 5.30. If x, y ∈ Z^d, then the probability that x, y are in the same connected component of the infinite spanning forest is the same as the probability that the path of a LERW starting at x and the path of an independent simple random walk starting at y intersect.

Proof. Choose an ordering that starts with x_0 = x, x_1 = y. Then x, y will be connected in F_1 if and only if the simple random walk starting at y intersects the loop erasure of the walk starting at x. As we have already noticed, if x, y are not connected in F_1, then they will not be connected in F. □

We end by stating the following, which comes from Wilson's algorithm and Proposition 4.41.

Proposition 5.31. The uniform spanning forest in Z^d is a tree if and only if d ≤ 4.

Proof. We construct the forest using Wilson's algorithm rooted at infinity, starting with the origin. If d ≤ 4, using Proposition 4.41 we see that with probability one, for every y ∈ Z^d, the simple random walk starting at y intersects the loop erasure of the walk starting at the origin, and hence all the points are connected in the forest.
For d ≥ 5, the previous proposition shows that the probability that 0 and x are connected in the forest is bounded above by the


probability that the paths of independent simple random walks starting at 0 and x intersect. This latter probability is bounded above by c|x|^{4−d}; see Exercise 4.44. □

5.4. The dual lattice in Z^2

In this section we study the two-dimensional lattice Z = Z^2 and write points in Z as complex numbers m + in where m, n ∈ Z. Let S denote the (filled-in) square of side length one centered at the origin,
\[
S = \Big\{x+iy \in \mathbb{C} : -\tfrac12 \le x, y \le \tfrac12\Big\},
\]
and if z ∈ Z, we write S_z = z + S for the corresponding square centered at z. The corners and the edges of these squares combine to form another copy of the integer lattice, shifted by 1/2 + i/2 from the original lattice. We let Z^* denote this new lattice,
\[
Z^* = \Big\{ z + \tfrac12 + \tfrac{i}{2} : z \in \mathbb{Z}^2 \Big\},
\]
and we call it the dual lattice associated to Z. For each edge e ∈ Z, we denote by e^* the edge in Z^* that intersects e. Note that the intersection points of edges in Z with edges in Z^* lie at the midpoints of the edges of the squares S_z. We will show there is a duality between wired spanning trees of subgraphs of Z and free spanning trees of subgraphs of Z^*. We call a finite subset A of Z simply connected if both A and Z \ A are connected. Note that there is a one-to-one correspondence between finite nonempty simply connected subsets of Z and what are known as self-avoiding polygons (SAPs) in Z^*. A SAP is a loop without any self-intersections, where we forget where the loop starts and which way we go around it. Suppose A is a nonempty finite simply connected subset of Z and let ∂A be the boundary as before. Recall that the wired graph A^W associated to A has #(A) + 1 vertices: the vertices of A plus a single vertex that we call ∂A. Let E = E_A denote the edges of A^W, which are partitioned into two sets: E^o are the edges that connect two vertices of A and ∂E are the edges that connect a vertex in A with the wired boundary ∂A. This graph may have some multiple edges


from a vertex in A to ∂A. By definition, a wired spanning tree of A is a spanning tree of A^W. We define the dual of A to be the graph A^* whose edges are given by
\[
E^* = E_A^* = \{e^* : e \in E_A\},
\]
and the vertices of A^* are all the endpoints of edges in E^*. Equivalently, the edges of A^* are the edges of the squares {S_z : z ∈ A} and the vertices of A^* are the corners of {S_z : z ∈ A}. Note that E^* is the collection of edges in Z^* both of whose endpoints are in A^*. By construction there is a natural bijection between E_A and E_A^* given by e ↔ e^*.

Figure 2. A is the filled circles, ∂A the unfilled circles. The full lines give the edges in E_A^* and the dotted lines are the edges in E_A. The corners of the full-line squares give the vertices of A^*.

If T is a collection of edges in E_A, we define
\[
T^* = T^*(A) = \{e^* \in E_A^* : e \in E_A \setminus T\}.
\]
In other words, an edge e^* is in T^* if and only if it is in E_A^* but its dual edge e is not in T.


Proposition 5.32. Suppose A is a nonempty, finite, simply connected subset of Z. A collection of edges T ⊂ E_A is a wired spanning tree of A if and only if T^* is a spanning tree of A^*. In particular, the number of wired spanning trees of A is the same as the number of spanning trees of A^*.

Proof. It suffices to show that:
• T is not connected if and only if T^* has a loop.
• T^* is not connected if and only if T has a loop.
Suppose T is not connected. Then there exists a connected component Ã of vertices that does not include ∂A. Let
\[
S_{\tilde A} = \bigcup_{z\in \tilde A} S_z.
\]

The "outer boundary" of S_Ã, that is, the topological boundary of the unbounded component of R^2 \ S_Ã, is composed of edges in T^* and hence gives a loop in T^*. Conversely, if there is a loop in T^*, there is a SAP all of whose edges are in T^*. This SAP corresponds to a simply connected A' ⊂ A, and A' is disconnected from ∂A in A^W.

Now suppose that T^* is not connected, and let Ã^* be a connected component of vertices in T^*. Let S_{Ã^*} be as above; its outer boundary gives a self-avoiding loop ℓ of edges in Z. Not all of these edges are in E_A since some of them may have both vertices in ∂A. However, there is at least one edge with a vertex in A. If we start at this vertex and go along ℓ in both directions, stopping when we reach ∂A if we get there, we get a loop in E_A. Similarly, if T has a loop and e is an edge on the loop, then the two vertices of e^* are disconnected in T^*. □

Exercise 5.33. We can extend this duality to the case of multiply connected subsets. Suppose A is a finite, connected subset of Z and that the complement Z \ A consists of k connected components ∂_1, ..., ∂_k. Let us change the definition of the wired graph A^W so that boundary points in the same connected component are wired but not points in different components. The vertices of A^W are A ∪ {∂_1, ..., ∂_k} and the edges come from the edges E_A. Note that


Figure 3. The dotted lines give a wired spanning tree T for the A in Figure 2 and the full lines give the corresponding T^*.

different boundary components are not adjacent in this graph. Define A^*, T^* as before. Show that T is a spanning tree of A^W if and only if T^* is a spanning tree of A^*.

To count the number of spanning trees of a simply connected A ⊂ Z, we can either count the number of wired spanning trees in A or the number of spanning trees in A^*. We can use Wilson's algorithm in either case, and this leads to slightly different expressions.
• Wired. We use simple random walk in Z^2 stopped when it leaves A, as before. The number of spanning trees is 4^{#(A)} F(A)^{−1}.
• Dual spanning trees. We need to consider simple random walk on the graph A^*. There are two different ways to do this which lead to the same tree in Wilson's algorithm. One is to take steps with probabilities proportional to the number of nearest neighbors of the vertex in A^*. The other is a "rejection" or "stutter step" approach where the walker tries to take a step according to simple random walk. However, if the step would take the walker outside of A^*, it is


rejected and one stays in place. We will use this latter walk. One then has to choose some vertex z ∈ A^*, and then the number of spanning trees is
\[
4^{\#(A^*)-1}\, \tilde F(A^*\setminus\{z\})^{-1}.
\]

Here we write F̃ to indicate that this is the quantity computed using a "stutter step" walker stopped when it reaches vertex z.

Example 5.34. We will do the calculation for the case A = {0, 1}. The wired graph has three vertices {0, 1, ∂A} and there are seven edges: three connecting 0 to ∂A, three from 1 to ∂A, and one connecting 0 to 1. The number of spanning trees is 15: there are six that include the edge connecting 0 and 1 and nine that do not include it.


Starting at 0, the probability that a random walk returns to 0 without leaving A is 1/16; this is the probability of the path 0 → 1 → 0. Hence
\[
G_A(0,0) = \frac{1}{1-\frac{1}{16}} = \frac{16}{15}.
\]
Clearly, G_{{1}}(1,1) = 1 since the random walk immediately leaves the set {1}. Therefore,
\[
F(A) = G_A(0,0)\, G_{A\setminus\{0\}}(1,1) = \frac{16}{15}.
\]
The number of wired spanning trees is
\[
4^{\#(A)}\, F(A)^{-1} = 4^2 \cdot \frac{15}{16} = 15.
\]
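The count of 15 can be cross-checked with the matrix-tree theorem, a tool not used in the derivation above: the number of spanning trees of a multigraph equals any cofactor of its Laplacian. A pure-Python sketch in exact arithmetic (the helper is ours):

```python
from fractions import Fraction

def det(m):
    """Determinant by fraction-exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in m]
    n, d = len(m), Fraction(1)
    for i in range(n):
        p = next((r for r in range(i, n) if m[r][i] != 0), None)
        if p is None:
            return Fraction(0)
        if p != i:
            m[i], m[p] = m[p], m[i]
            d = -d
        d *= m[i][i]
        for r in range(i + 1, n):
            c = m[r][i] / m[i][i]
            for col in range(i, n):
                m[r][col] -= c * m[i][col]
    return d

# Wired graph of A = {0, 1}: vertices 0, 1, ∂A; three edges 0-∂A,
# three edges 1-∂A, one edge 0-1.  Laplacian = degree minus adjacency,
# with multiplicities counted.
lap = [[4, -1, -3],
       [-1, 4, -3],
       [-3, -3, 6]]
# Matrix-tree theorem: delete the row and column of ∂A, take the determinant.
trees = det([row[:2] for row in lap[:2]])
```

Here det[[4, −1], [−1, 4]] = 16 − 1 = 15, matching the direct count of 6 + 9 spanning trees.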


The dual graph A^* has six vertices, which we write (after translating by −1/2 + i/2) as x + iy where x = −1, 0, 1 and y = 0, 1. We will make z_0 = 0 the root and choose the ordering of the remaining vertices as
\[
z_1 = i, \quad z_2 = -1+i, \quad z_3 = -1, \quad z_4 = 1+i, \quad z_5 = 1.
\]
Let V_j = A^* \ {z_0, z_1, ..., z_{j−1}}. Starting at z_1, the probability that the stutter step walk returns to z_1 before reaching z_0 is given by
\[
0 + \frac14 + \frac12\cdot\frac23 = \frac{7}{12}.
\]
The 0 represents the probability that the first step is to z_0; the 1/4 is the probability that the first step is a stutter at z_1; the 1/2 is the probability that the first step is to z_2 or z_4; and the 2/3 represents the probability that the random walk starting at z_2 or z_4 reaches z_1 before z_0. Therefore,
\[
G_{V_1}(z_1, z_1) = \frac{1}{1-\frac{7}{12}} = \frac{12}{5}.
\]
Starting at z_2, the probability of returning to z_2 before leaving V_2 is
\[
\frac12 + \frac14\cdot\frac12 = \frac58.
\]
The first term is the probability of a stutter step at z_2 and the second term is the probability of a first step to z_3 followed by a return to z_2 before visiting z_0. This gives
\[
G_{V_2}(z_2, z_2) = \frac{1}{1-\frac58} = \frac83.
\]
The probability starting at z_3 of returning to z_3 without leaving V_3 is 1/2, and hence G_{V_3}(z_3, z_3) = 2.

Similarly,
\[
G_{V_4}(z_4, z_4) = G_{V_2}(z_2, z_2), \qquad G_{V_5}(z_5, z_5) = G_{V_3}(z_3, z_3),
\]
and hence
\[
\tilde F(A^*) = \frac{12}{5}\cdot\frac{8}{3}\cdot 2\cdot\frac{8}{3}\cdot 2 = \frac{1024}{15},
\]
and the number of spanning trees of A^* is
\[
4^{\#(A^*)-1}\, \tilde F(A^*)^{-1} = 15.
\]
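The step-by-step values above can be double-checked by linear algebra: for the stutter-step walk killed at z_0, the Green's function is (I − Q)^{−1}, where Q is the substochastic transition matrix on {z_1, ..., z_5}. A sketch in exact fractions (the encoding of the transition rules and the solver are ours):

```python
from fractions import Fraction as Fr

# States z1..z5 of A* \ {z0}; stutter-step walk, killed on hitting z0.
# Each row: 1/4 per lattice neighbor inside A*, plus the stutter
# probability of staying put when a step would exit A*.
Q = [
    # z1       z2       z3       z4       z5
    [Fr(1, 4), Fr(1, 4), 0,       Fr(1, 4), 0      ],  # z1 (1/4 to z0: killed)
    [Fr(1, 4), Fr(1, 2), Fr(1, 4), 0,       0      ],  # z2
    [0,        Fr(1, 4), Fr(1, 2), 0,       0      ],  # z3 (1/4 to z0: killed)
    [Fr(1, 4), 0,        0,       Fr(1, 2), Fr(1, 4)],  # z4
    [0,        0,        0,       Fr(1, 4), Fr(1, 2)],  # z5 (1/4 to z0: killed)
]

def solve(a, b):
    """Solve a x = b by exact Gauss-Jordan elimination."""
    n = len(a)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for i in range(n):
        p = next(r for r in range(i, n) if m[r][i] != 0)
        m[i], m[p] = m[p], m[i]
        piv = m[i][i]
        m[i] = [x / piv for x in m[i]]
        for r in range(n):
            if r != i and m[r][i] != 0:
                f = m[r][i]
                m[r] = [x - f * y for x, y in zip(m[r], m[i])]
    return [row[n] for row in m]

# Expected visits to z1 starting at z1: first entry of (I - Q)^{-1} e_1.
I_minus_Q = [[(1 if i == j else 0) - Q[i][j] for j in range(5)] for i in range(5)]
g11 = solve(I_minus_Q, [1, 0, 0, 0, 0])[0]
```

Solving the system reproduces G_{V_1}(z_1, z_1) = 12/5, and 4^5 divided by the product (12/5)(8/3)(2)(8/3)(2) is again 15.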

Exercise 5.35. Carry out the calculations as in the last example for the two three-point cases A = {−1, 0, 1} and A = {0, 1, 1 + i}.

We now consider the UST in Z^2. We say that a sequence of finite subsets {V_n} increases to Z if V_1 ⊂ V_2 ⊂ ··· and
\[
\bigcup_{n=1}^{\infty} V_n = Z.
\]

Recall the definition of convergence of probability measures on subgraphs of Z given in (5.7). Let P_n be the probability measure given by the wired UST on V_n and Q_n the measure given by the free UST on V_n.

Theorem 5.36. Suppose V_n is a collection of subsets increasing to Z. Then
\[
\lim_{n\to\infty} P_n = \lim_{n\to\infty} Q_n = P,
\]
where P is the UST that can be obtained as in Section 5.3 or, equivalently, by Wilson's algorithm rooted at the origin.

Proof. Let Q denote the probability measure given by Wilson's algorithm rooted at the origin. We will first show
\[
\lim_{n\to\infty} P_n = \lim_{n\to\infty} Q_n = Q.
\]
Let A ⊂ Z be finite with N + 1 elements including the origin, and let ϵ > 0. Choose an ordering of Z = {0, z_1, z_2, ...} such that A = {0, z_1, z_2, ..., z_N}. We will be choosing n sufficiently large so that A ⊂ V_n, and in all three cases we will run Wilson's algorithm rooted at the origin with vertices chosen using an ordering that agrees with this ordering up through z_N.


For each z ∈ A find an R_z > 0 such that the probability that a random walker in Z^2 starting at z goes distance R_z from the origin before visiting the origin is at most ϵ/N. Let R = max_{z∈A} R_z and choose n sufficiently large so that V_n contains the discrete disk of radius R + 1 about the origin. We consider Wilson's algorithm rooted at the origin using the above ordering of the lattice, stopping it after all the vertices in A have been added to the tree. This will only use random walks started at z_1, z_2, ..., z_N stopped when they reach the origin. The behavior of these random walks is the same whether they are in the wired spanning tree, or the free spanning tree, or all of Z, unless the random walker reaches a vertex such that one of the nearest neighbors is not in V_n. By our construction, the probability of at least one of the walks doing this is at most N · (ϵ/N) = ϵ. Therefore, except for an event of probability at most ϵ, the forest that one has at this time is the same for P_n, Q_n, and Q. Since all the vertices of A are in the forest at this time, Wilson's algorithm in all three cases will add no more edges with both endpoints in A. This implies
\[
\sum_{F'} |P_n(F') - Q(F')| \le \epsilon,
\]
and similarly for Q_n, where the sum is over all forests F' in A.

To finish we will show that
\[
\lim_{n\to\infty} P_n = P.
\]
In this case we will choose the wired spanning tree in V_n by rooting at ∂V_n and using the ordering as above. We compare the infinite LERW starting at the origin with the LERW obtained by taking a random walk, stopping it when it reaches ∂V_n, and erasing loops. Let A, ϵ, R be as above. If n is sufficiently large, we can see that the probability measures on forests in A obtained from the infinite LERW and from the LERW stopped at ∂V_n agree except for an event of probability at most ϵ/N. Arguing as above for the next N − 1 steps of Wilson's algorithm, we see that they agree except for an event of probability at most ϵ. □


5.5. The uniform spanning tree (UST) in Z2 Let us now consider the uniform spanning tree T on all of Z = Z2 which also generates a dual tree T ∗ = {e∗ : e ∈ EZ \ T }. Proposition 5.37. With probability one, T ∗ as defined above has the distribution of the UST on Z ∗ . Proof. Let Vn be a sequence of sets increasing to Z. Then Vn∗ is a sequence of sets increasing to Z ∗ . The wired spanning tree measure on Vn induces the free spanning tree measure on Vn∗ . Both measures have the UST as their limit. □ If T is a tree in Z, let T z denote the tree as seen from z, that is, the edge (w1 , w2 ) is in T z if and only if (w1 + z, w2 + z) ∈ T . Proposition 5.38. The distribution of the infinite spanning tree is translation invariant. That is, for every z the distribution of T z is the same as that of T . Proposition 5.39. With probability one for the UST T in Z, for every z ∈ Z there is a unique infinite self-avoiding path ω = [ω0 = z, ω1 , ω2 , . . .], such that each of the edges [ωj−1 , ωj ] is in T . Proof. By translation invariance it suffices to prove this at z = 0. The existence of such a path is immediate from the construction of the tree using Wilson’s algorithm rooted at infinity. Indeed, the distribution of the tree is that of LERW from 0 to ∞. Uniqueness requires a little work. Suppose there were another such path that we write as η = [η0 = 0, η1 , η2 , . . .]. Let k be the smallest integer such that ηk ̸= ωk and consider the doubly infinite path [. . . , ηk+1 , ηk , ηk−1 = ωk−1 , ωk , ωk+1 , . . .].


We claim this must be self-avoiding. Suppose not. Let n be the smallest integer with ω_n = η_m for some m ≥ k, and let m denote the smallest such index. Then
\[
[\eta_m, \eta_{m-1}, \ldots, \eta_k, \eta_{k-1} = \omega_{k-1}, \omega_k, \ldots, \omega_n]
\]
is a self-avoiding loop in T, which is impossible since T is a tree. The doubly infinite self-avoiding path can be viewed as a loop in the plane with infinity included. It divides the plane into two pieces. In particular, if e is the edge [ω_{k−1}, ω_k], the loop disconnects the endpoints of the dual edge e^* in T^*. Since T^* is a tree, this is a contradiction. □

Further Reading

For more on the infinite spanning tree and forest see [?OKPS]. There is also a significant discussion on spanning trees in [13].

Chapter 6

Gaussian Free Field

6.1. Introduction

The normal or Gaussian distribution is one of the most important distributions in probability. The generalization to collections of random variables is called the multivariate normal distribution, and it appears in many models in statistics and mathematical physics. To say that a collection of random variables {Z_x : x ∈ A} has a multivariate normal distribution means more than to say that each individual random variable Z_x has a normal distribution. Given a Markov chain with symmetric transition probabilities on a set Ā = A ∪ ∂A, there is a corresponding multivariate normal distribution on A called the (Dirichlet) Gaussian free field (GFF). It is closely related to the loop measures associated to the chain. In this chapter we first discuss multivariate normal distributions and then consider the case of the GFF.

6.2. Multivariate normal distribution

Recall that a random variable X has a (univariate or one-dimensional) normal distribution with mean µ and variance σ^2 if it has density
\[
f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\exp\Big\{-\frac{(x-\mu)^2}{2\sigma^2}\Big\}, \qquad -\infty < x < \infty.
\]


We write X ∼ N(µ, σ^2). We say X has a centered normal distribution if µ = 0, and we say that X has a standard normal distribution if µ = 0, σ^2 = 1. We state a couple of well known properties.
• If X ∼ N(µ, σ^2), a, b ∈ R, and Y = aX + b, then Y ∼ N(aµ + b, a^2σ^2).
• If X_1, X_2, ..., X_n are independent random variables with X_j ∼ N(µ_j, σ_j^2), then X_1 + X_2 + ··· + X_n ∼ N(µ_1 + ··· + µ_n, σ_1^2 + ··· + σ_n^2).
• In particular, if X_1, X_2, ..., X_n are independent standard normal random variables, then so is
\[
\frac{X_1 + \cdots + X_n}{\sqrt{n}}.
\]
We first define a centered (mean zero) multivariate random vector Z = (Z_1, ..., Z_n). There are several equivalent ways to define this and we will choose one that is convenient for us. A multivariate normal distribution is a linear combination of independent normals; we use matrix notation to make this precise. We write Z for both the row vector (Z_1, ..., Z_n) and the corresponding column vector. We call a random vector N = (N_1, ..., N_m) a standard normal vector if N_1, ..., N_m are independent standard normal random variables.

Definition 6.1. A random vector Z = (Z_1, ..., Z_n) has a centered multivariate normal distribution if there exist a standard normal vector N in R^m and an n × m matrix A = [a_{jk}] such that Z = AN.

Since E[N_k^2] = 1 and E[N_k N_l] = E[N_k] E[N_l] = 0 for k ≠ l, we see that each Z_j is a centered normal random variable with variance
\[
E[Z_j^2] = E\big[(a_{j1}N_1 + \cdots + a_{jm}N_m)^2\big]
= \sum_{k=1}^{m} a_{jk}^2\, E[N_k^2] + \sum_{k\neq l} a_{jk}a_{jl}\, E[N_k N_l]
= a_{j1}^2 + a_{j2}^2 + \cdots + a_{jm}^2.
\]


More generally, the covariances are given by
\[
(6.1)\qquad E[Z_j Z_k] = a_{j1}a_{k1} + a_{j2}a_{k2} + \cdots + a_{jm}a_{km}.
\]

Definition 6.2. If Z = (Z1 , . . . , Zn ) has a centered multivariate normal distribution, its covariance matrix is the n × n matrix Γ = [E(Zj Zk )] . Definition 6.3. An n × n matrix M with real coefficients is positive semidefinite if for each vector v = [v1 , . . . , vn ] ∈ Rn , v · M v ≥ 0. If v · M v > 0 for all v ̸= 0, then the matrix is positive definite. Proposition 6.4. The covariance matrix Γ is symmetric and positive semidefinite. If the random variables Z1 , . . . , Zn are linearly independent, then Γ is positive definite.

To say that Z1 , . . . , Zn are linearly independent is to say that the only constants c1 , . . . , cn such that c1 Z1 + · · · + cn Zn equals the zero random variable are c1 = c2 = · · · = cn = 0. Equivalently, if v is a (non-random) vector with v · Z = 0, then v = 0.

Proof. Symmetry follows immediately from the definition. If v = [v_1, ..., v_n] is a vector, note that
\[
E\big[(v\cdot Z)^2\big] = \sum_{j=1}^{n}\sum_{k=1}^{n} v_j\, E[Z_j Z_k]\, v_k = v\cdot\Gamma v.
\]
The left-hand side is obviously nonnegative and is positive unless v · Z ≡ 0. The latter can happen for nonzero v only if Z_1, ..., Z_n are linearly dependent. □


Since Γ is symmetric, it has a complete set of eigenvalues and eigenvectors. In this case, positive semidefinite implies that the eigenvalues are all nonnegative, and positive definite implies that all the eigenvalues are strictly positive. Using (6.1), we can see that Γ = AA^T where T denotes transpose.

Proposition 6.5. Suppose Z = (Z_1, ..., Z_n) has a centered multivariate normal distribution with matrix A and covariance matrix Γ = AA^T.
• The distribution of Z depends only on the covariance matrix Γ and not otherwise on A. If Γ is invertible, Z has a density,
\[
(6.2)\qquad f(x) = \frac{1}{(2\pi)^{n/2}(\det\Gamma)^{1/2}}\exp\Big\{-\frac{x\cdot\Gamma^{-1}x}{2}\Big\}.
\]
• The matrix A is not unique; that is, for a given Γ there are many matrices A such that Γ = AA^T.
• If Γ is a symmetric, positive semidefinite matrix, then there exists a (not unique) A such that Γ = AA^T. In particular, there exists a centered multivariate normal distribution with covariance matrix Γ.

Note that if n = 1, (6.2) reduces to the familiar
\[
\frac{1}{\sqrt{2\pi\sigma^2}}\exp\Big\{-\frac{x^2}{2\sigma^2}\Big\}.
\]

Proof. To show that the distribution depends only on the covariance matrix Γ we can compute the moment generating function. If u = (u1, . . . , un), then

E[exp{Σ_{j=1}^n uj Zj}] = E[exp{Σ_{j=1}^n Σ_{k=1}^m uj ajk Nk}]
  = Π_{k=1}^m E[exp{Nk Σ_{j=1}^n uj ajk}]
  = Π_{k=1}^m exp{(1/2) (Σ_{j=1}^n uj ajk)²}
  = Π_{k=1}^m exp{(1/2) Σ_{j=1}^n Σ_{i=1}^n uj ui ajk aik}
  = exp{(1/2) Σ_{j=1}^n Σ_{i=1}^n Σ_{k=1}^m uj ui ajk aik}
  = exp{(1/2) u AA^T u^T}
  = exp{(1/2) u Γ u^T}.

If Γ is invertible, a computation that we omit shows that this is also the moment generating function of a random variable with density (6.2).

Given positive semidefinite, symmetric Γ, we can find a diagonal matrix D and an orthogonal (Q^T = Q^{−1}) matrix Q such that Γ = Q D Q^T. Let √D be a diagonal matrix whose entries are square roots (with either choice of sign for each entry) of the entries in D and let A = Q√D. Then AA^T = Γ. Since the choices of signs were arbitrary, we see that A is not unique. □

Definition 6.6. If µ ∈ R^n, then a random vector Y is a multivariate normal random vector with mean µ if Y − µ is a centered multivariate normal random vector. In other words, a multivariate normal vector is specified by giving its mean µ and its covariance matrix Γ.
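The diagonalization step in the proof of Proposition 6.5 (Γ = QDQ^T, A = Q√D) is easy to carry out numerically; the matrix Γ below is an illustrative choice, not from the text.

```python
import numpy as np

# Illustrative symmetric positive definite Gamma (not from the text)
Gamma = np.array([[2.0, 1.0, 0.0],
                  [1.0, 2.0, 1.0],
                  [0.0, 1.0, 2.0]])

# Diagonalize: Gamma = Q D Q^T with Q orthogonal (Q^T = Q^{-1})
eigvals, Q = np.linalg.eigh(Gamma)
A = Q @ np.diag(np.sqrt(eigvals))     # A = Q sqrt(D)
assert np.allclose(A @ A.T, Gamma)

# A is not unique: flipping the sign of a column (a different choice of
# square root) gives another matrix with the same Gamma
A2 = A.copy()
A2[:, 0] *= -1
assert np.allclose(A2 @ A2.T, Gamma)
```

The sign flip in the last lines is exactly the arbitrariness of the square roots mentioned in the proof.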

Extending the definition of multivariate normal vectors to infinite index sets is straightforward after we make this easy observation.

• Suppose {Zx} is a centered multivariate normal random vector indexed by a finite set A with covariance matrix Γ = [Γ(x, y)]_{x,y∈A}. Suppose that A′ ⊂ A. Then {Zx : x ∈ A′} has a multivariate normal distribution with covariance matrix [Γ(x, y)]_{x,y∈A′}.

Definition 6.7. A collection of random variables {Zx : x ∈ A} has a centered multivariate normal distribution if for each finite subset A′ ⊂ A, {Zx : x ∈ A′} has a centered multivariate normal distribution.

Although we will only consider countable A, this definition also applies to uncountable index sets. The distribution is determined by the covariance “matrix” {E[Zx Zy] : x, y ∈ A}.

Exercise 6.8. Let X1, X2, . . . be independent standard normal random variables, let Sn = X1 + · · · + Xn, Yj = Xj − n^{−1} Sn, and

Zk = Y1 + · · · + Yk = Sk − (k/n) Sn.

(1) Explain why (S0, S1, . . . , Sn) has a centered multivariate normal distribution.
(2) Show that if j ≤ k, E[Sj Sk] = j.
(3) Explain why (Z0, Z1, . . . , Zn) has a centered multivariate normal distribution.
(4) Show that if j ≤ k,

E[Zj Zk] = n (j/n) (1 − k/n).

(5) Explain why it makes sense to say that the distribution of (Z0, Z1, . . . , Zn) is the same as the conditional distribution of (S0, S1, . . . , Sn) given Sn = 0.

Exercise 6.9. Suppose Γ is an n × n symmetric positive definite matrix. Show that there is a lower triangular matrix A such that AA^T = Γ. (A = [ajk] is lower triangular if ajk = 0 for all j > k.) In other words we can write

Zj = Σ_{k=1}^j ajk Nk.

(1) Prove this for n = 2, 3 directly.
(2) Try to do the general case. If you get stuck, look up Cholesky decomposition.
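The lower triangular factorization asked for in Exercise 6.9 is the Cholesky decomposition, which numerical libraries provide directly; a quick sketch (the matrix Γ is an illustrative choice):

```python
import numpy as np

# Illustrative symmetric positive definite Gamma (not from the text)
Gamma = np.array([[4.0, 2.0, 0.0],
                  [2.0, 5.0, 1.0],
                  [0.0, 1.0, 3.0]])

L = np.linalg.cholesky(Gamma)      # lower triangular factor with L L^T = Gamma
assert np.allclose(L, np.tril(L))  # L really is lower triangular
assert np.allclose(L @ L.T, Gamma)
```

With A = L, the representation Zj = Σ_{k≤j} ajk Nk follows by sampling N and setting Z = LN.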

6.3. Gaussian fields coming from Markov chains

An important class of Gaussian fields arises from symmetric Markov chains. Suppose Xn is an irreducible Markov chain on Ā = A ∪ ∂A with a symmetric transition matrix PA and assume that either ∂A ≠ ∅ or that the chain is transient. In particular, the Green’s function GA = (I − PA)^{−1} is well defined. We will construct a random field Z = {Zx : x ∈ A} that has a centered multivariate normal distribution with covariance matrix GA. The construction will write GA = AA^T for a lower triangular matrix A.

We assume that we have a collection of independent standard normal random variables {Nj} and we will write the values of the field as linear combinations of these variables. Order the elements of A, A = {x1, x2, . . .}, and let Aj = A \ {x1, . . . , xj−1}, ∂Aj = ∂A ∪ {x1, . . . , xj−1}. The construction will depend on the ordering of the vertices, but the distribution will not. Recall that HAk(z, w) denotes the Poisson kernel, that is, the probability that the chain starting at z first visits ∂Ak at w. For notational ease we write Zj for Zxj. Let

Z1 = √GA(x1, x1) N1,

and for k > 1,

Zk = Σ_{j=1}^{k−1} HAk(xk, xj) Zj + √GAk(xk, xk) Nk.

Note that Zk can also be written as a linear combination of N1, . . . , Nk. The first term

Σ_{j=1}^{k−1} HAk(xk, xj) Zj

is the expected value of Zk given the values Z1, . . . , Zk−1. Using Proposition 1.10 we see that it is the value at xk of the unique function on Ak ∪ ∂Ak that is harmonic in Ak with boundary value Zj at xj, for j = 1, . . . , k − 1, and boundary value 0 at other points in ∂Ak. The second term √GAk(xk, xk) Nk gives the added randomness independent of Z1, . . . , Zk−1.

Proposition 6.10. The random vector Z has a centered multivariate normal distribution with covariance matrix GA. In particular, the distribution is independent of the ordering of the vertices.

Proof. By construction Z has a centered multivariate normal distribution so it suffices to check that E[Zj Zk] = GA(xj, xk) for all j, k. We will do this by induction on n. For n = 1 it is immediate that

E[Z1²] = GA(x1, x1) E[N1²] = GA(x1, x1).

Now assume that E[Zj Zk] = GA(xj, xk) for all j, k ≤ n − 1. Suppose that k < n. Since Zk is a linear combination of N1, . . . , Nk, the variable Nn is independent of Zk and E[Nn Zk] = 0. Therefore,

E[Zk Zn] = E[Zk Σ_{j=1}^{n−1} HAn(xn, xj) Zj]
         = Σ_{j=1}^{n−1} HAn(xn, xj) E[Zk Zj]
         = Σ_{j=1}^{n−1} HAn(xn, xj) GA(xj, xk).

We claim that the last sum equals GA(xn, xk). To see this, note that for fixed k, the function h(y) := GA(y, xk) is bounded by GA(xk, xk); equals 0 on ∂A; and is harmonic in An. By Proposition 1.10, we have

h(xn) = Σ_{j=1}^{n−1} HAn(xn, xj) h(xj).

Similarly, we can see that

E[Zn²] = Σ_{j=1}^{n−1} HAn(xn, xj) GA(xj, xn) + GAn(xn, xn).

We claim that the right-hand side equals GA(xn, xn). Indeed, arguing similarly to above:

• The sum gives the expected number of visits to xn before leaving A but after leaving An.
• GAn(xn, xn) gives the expected number of visits to xn before leaving An. □

If A is finite with n elements then we know that GA = (I − PA)^{−1} and hence we can write the density as

(6.3)    [1 / ((2π)^{n/2} √(det GA))] exp{−(x · (I − PA)x)/2},

which can also be written as

(6.4)    ϕ(x) (det GA)^{−1/2} exp{(x · PA x)/2},

where ϕ is the density of a standard normal in R^n.

Exercise 6.11. Under the assumptions above, show that for all j the following hold:

Zj = Σ_{k=1}^j HAk+1(xj, xk) √GAk(xk, xk) Nk,

Zj = Σ_{k=1}^j √(GAk(xj, xj) − GAk+1(xj, xj)) Nk,

where, as before, we define HAj+1(xj, xj) = 1, GAj+1(xj, xj) = 0.

Definition 6.12. If A ⊂ Z^d, then the Dirichlet GFF on A is the field corresponding to the Markov chain given by PA = [p(x, y)]_{x,y∈A} where p(x, y) = 1/(2d) if |x − y| = 1 and equals zero otherwise.

There are two important extensions of this definition. First, we can choose a different boundary condition by letting {Zy : y ∈ ∂A} be given. We then set

(6.5)    Zk = Zxk = Σ_{x∉Ak} HAk(xk, x) Zx + √GAk(xk, xk) Nk.

Proposition 6.13. Given a bounded function {Zy : y ∈ ∂A}, the distribution of {Zx : x ∈ A} is that of h(x) + Z̃x, where Z̃x is a centered Gaussian field with covariance matrix GA and

h(x) = Σ_{y∈∂A} HA(x, y) Zy.

Note that h is the unique function that is harmonic on A with boundary condition Zy on ∂A.

Exercise 6.14. Prove this proposition. As a preliminary step you may wish to show that for every k,

h(xk) = Σ_{y∉Ak} HAk(xk, y) h(y).

Corollary 6.15. Suppose {Zx : x ∈ A} is a Dirichlet GFF and V ⊂ A. Then the conditional distribution of {Zx : x ∈ A \ V} given {Zx : x ∈ V} is that of h(x) + Z̃x, where {Z̃x : x ∈ A \ V} is a Dirichlet GFF in A \ V and

h(x) = Σ_{y∈V} H_{A\V}(x, y) Zy.

We can extend to the case where the symmetric weight q(x, y) can take negative values provided that it is integrable so that the Green’s function is well defined. We recall that the Poisson kernel H^q_A(z, w) is defined for w ∈ ∂A by

H^q_A(x, w) = Σ_{ω: x→w} q(ω),

where the sum is over all paths starting at x, ending at w, and otherwise staying in A. The Green’s function satisfies GA (x, x) > 0 for all x. Hence we can use (6.5) to define the field.

6.4. A Gibbs measure perspective

We will give an equivalent definition of the GFF. Suppose we have a set of vertices Ā = A ∪ ∂A with A finite and symmetric real weights q(x, y), x, y ∈ Ā. We assume that q(x, y) = 0 if x, y are distinct points in ∂A. Let Q = [q(x, y)]_{x,y∈Ā} and QA = [q(x, y)]_{x,y∈A}. We allow negative weights but we require q to be integrable on A. We will consider random fields (vectors) {Zx : x ∈ A} indexed by elements of A. At times we will also consider Zy, y ∈ ∂A, but these will be fixed boundary values. We say we have Dirichlet boundary conditions if Zy = 0 for y ∈ ∂A.

A standard way to assign probabilities to configurations in statistical mechanics is through Gibbs measures. This uses the principle that configurations of the same energy get the same probability. Here is the set-up.

• There is a base “measure” on fields. This is often a probability measure, but not always. Sometimes it is an infinite measure.
• For each configuration (realization of the field) we define an energy or Hamiltonian E.
• Look at the new field where the weight of each configuration is changed to c exp{−E}. Here c is chosen so that the new measure is a probability measure. This model favors configurations with low energy.

Sometimes the energy is written as a sum of two or more terms, say E = H1 + H2. Weighting by exp{−E} can be viewed as a two-step process: first weight by exp{−H1} and then reweight the weighted field by exp{−H2}.

Suppose X is a continuous real-valued random variable with density f. A “Gibbs formulation” of this would be to take the base measure to be Lebesgue measure (length) on the real line and the Hamiltonian to be H(x) = − log f(x), where we set log 0 = −∞.

It is also standard to consider weights exp{−βE} where β is a parameter, usually positive, that is often called the inverse temperature. When β is small (high temperature) the effect of the new weighting is not very strong, while for β large (low temperature) the effect of the new weighting is much greater. Sometimes the β is just incorporated into the E term; we will do that in this section so that β = 1. The study of critical phenomena deals with the behavior as β approaches a critical value. We will not be dealing with this here, so we have left β out of our notation.

Let us write

exp{−(z · (I − Q)z)/2} = e^{−E(z)} = exp{−[H1(z) + H2(z)]},

where

H1(z) = (1/2) Σ_{x∈A} zx²,    H2(z) = −(1/2) Σ_{x∈Ā} Σ_{y∈Ā} q(x, y) zx zy.

For Dirichlet boundary conditions (zx = 0 if x ∈ ∂A), we can write the sums as sums over x, y ∈ A, but we write it this way to allow different boundary conditions. Either way, we view z as a vector in Rn , that is, only the values indexed by A are considered variables. Tilting by e−H1 makes the random variables {Zx : x ∈ A} independent standard normal random variables. The correlations between the random variables produced by the weight appear in the interaction term H2 .

A ground state for a system is a configuration of minimal energy. The next proposition shows that the ground states for the Gaussian field for a Markov chain are the harmonic functions for the chain.

Proposition 6.16. If {zy : y ∈ ∂A} is given, then the energy E(z) is minimized when

zx = Σ_{y∈Ā} q(x, y) zy,    x ∈ A.

Proof. Since E(z) is a smooth function from R^n to R, any local minimum must be a critical point, that is, the partial derivatives must vanish. If we fix x ∈ A and write z = zx, then

E(z) = (1/2) z² − (1/2) q(x, x) z² − Σ_{y≠x} q(x, y) z zy + C,

where C is a term that does not depend on z. The factor of 1/2 drops from the sum because we have both q(x, y) and q(y, x) contributing. Differentiating and setting equal to zero gives

zx = Σ_y q(x, y) zy.

This establishes that this is the only local minimum, so we need to show that it is, in fact, a global minimum. To do so, we use the fact that I − Q is positive definite on A (see Exercise 6.18) and hence z · (I − Q)z ≥ λ|z|² where λ > 0 is the smallest eigenvalue of I − Q. If a boundary condition is given, we can see that

(6.6)    E(z) ≥ (λ/2)|z|² − c|z|,

where c depends on the boundary value. In particular, E(z) → ∞ as |z| → ∞, and the local minimum is a global minimum. □

Exercise 6.17. Give a simple proof of the estimate (6.6) in the case that q is a nonnegative weight.

Exercise 6.18. To establish (6.6), we used the fact that if Q is a symmetric, real-valued, integrable weight, then I − Q is positive definite. Here is one proof.

(1) Show that for each x ∈ A′ ⊂ A, GA′(x, x) > 0.
(2) Show that for each A′ ⊂ A, det GA′ > 0 and hence det(I − QA′) > 0. Hint: See Section 2.3.
(3) Look up Sylvester’s criterion to determine if symmetric real matrices are positive definite.

Exercise 6.19. Suppose q(x, y) = p(x, y) where p(x, y) denotes the transition probabilities of a symmetric irreducible Markov chain on Ā = A ∪ ∂A. Let us fix the boundary condition {zx : x ∈ ∂A} and let h = {hx : x ∈ Ā} be the harmonic function in A with this boundary value, that is, hx = zx for x ∈ ∂A and

hx = Σ_{y∈Ā} p(x, y) hy,    x ∈ A.

(1) Suppose z̃ = {z̃x : x ∈ Ā} is a function with z̃x = 0 for x ∈ ∂A. Show that

Σ_{x∈Ā} Σ_{y∈Ā} p(x, y) [hy − hx] [z̃y − z̃x] = 0.

(2) Show that if z = {zx : x ∈ Ā} is a function that agrees with h on ∂A, then E(z) = E(h) + E(z − h).

We view the GFF as follows. We are giving a distribution on R^n.

• Start with the base measure, the usual volume measure in R^n, which is called Lebesgue measure.
• Let us first tilt the measure by e^{−H1}. In this new measure the density is

ϕ(z) = (2π)^{−n/2} e^{−|z|²/2}.

In this measure, {Zx : x ∈ A} are independent, standard normal random variables.
• We tilt again by e^{−H2}.

Definition 6.20. The Dirichlet GFF {Zx : x ∈ A} with symmetric weights q(x, y) and boundary value {zy : y ∈ ∂A} is the random variable whose density is c ϕ(z) exp{−H2(z)}, where ϕ is the density for independent standard normal random variables,

(6.7)    H2(z) = −(1/2) Σ_{x∈Ā} Σ_{y∈Ā} q(x, y) zx zy,

and c is the normalization constant needed to make this a probability density.

and c is the normalization constant needed to make this a probability density. Remark 6.21. • We can consider the one point space A = {x} with weight q = q(x, x). Let Zx be a standard normal random variable 2 and Y = eqZx /2 . then ∫ ∞ 2 2 1 eqz /2 e−z /2 dz E[Y ] = √ 2π −∞ { } ∫ ∞ 1 z2 = √ exp − dz. 2[1/(1 − q)] 2π −∞ = (1 − q)1/2 . In order for this integral to be finite, we need q < 1. In the new measure Zx is a centered normal with variance (1−q)−1 . • More generally, if A has n points and the weight matrix is diagonal with qx = q(x, x), then in the new measure {Zx : x ∈ A} are independent centered normal random variables with E[Zx2 ] = (1 − qx )−1 . • For a general weight matrix Q = [q(x, y)], we can consider the new weighting as a two step process: first, weight by the diagonal elements of Q changing the base measure to independent normals with different variances, and then tilting by the nondiagonal elements. The GFF is an example of a Markovian field. The extension of the definition of “Markovian” to fields is a little subtle. Suppose

{Zz : z ∈ A} is a GFF and suppose that V ⊂ A. Suppose also that V “divides” A in the sense that we can partition A as A = A− ∪ V ∪ A+ such that for every x ∈ A−, y ∈ A+ we have q(x, y) = 0. Then the Markov field property states that the random variables {Zx : x ∈ A−} and {Zx : x ∈ A+} are conditionally independent given the values of {Zx : x ∈ V}.
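The normalization (1 − q)^{−1/2} computed in Remark 6.21 is easy to test by Monte Carlo; q = 0.4 below is an illustrative choice (keeping q < 1/2 also keeps the estimator's variance finite).

```python
import numpy as np

rng = np.random.default_rng(2)

# Tilting a single standard normal by exp(q Z^2 / 2): the normalization is
# E[exp(q Z^2 / 2)] = (1 - q)^{-1/2} for q < 1
q = 0.4
Z = rng.standard_normal(1_000_000)
mc = np.exp(q * Z**2 / 2).mean()
exact = (1 - q) ** -0.5

print(mc, exact)   # the two values agree to about two decimal places
```

For q ≥ 1 the sample mean would diverge as the sample grows, matching the integrability condition in the remark.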

6.5. One-dimensional GFF

We consider the Dirichlet GFF on A = {1, 2, . . . , n − 1} with ∂A = {0, n} with respect to the usual simple random walk, q(x, y) = 1/2 if |x − y| = 1. This is a random vector {Zj : j = 0, 1, 2, . . . , n} with Z0 = Zn = 0. We have

E(z) = (1/2) Σ_{j=1}^{n−1} zj² − (1/2) Σ_{j=1}^{n−1} zj zj+1 = (1/4) Σ_{j=0}^{n−1} (zj+1 − zj)²,

where z0 = zn = 0. Then we write

exp{−E(z)} = (4π)^{n/2} Π_{j=0}^{n−1} (1/√(4π)) e^{−(zj+1 − zj)²/4}.

We have written it this way to see that if we set yj = zj − zj−1, then (4π)^{−n/2} exp{−E(z)} is the density of n independent centered normal random variables with variance 2, evaluated at (y1, . . . , yn), conditioned on the event that y1 + · · · + yn = 0. This conditioning is on an event of probability zero but it is straightforward to make sense of this in terms of an appropriate limit. Note that this is very similar to Exercise 6.8 although that exercise had variance 1 rather than variance 2. From this we see that the field is the same as the partial sums of independent centered normal random variables X1, . . . , Xn each with variance 2 “conditioned so that X1 + · · · + Xn = 0”. This factor of 2 can also be seen in the following proposition.

Proposition 6.22. Let S denote a simple random walk on A = {1, 2, . . . , n − 1} with ∂A = {0, n}. Then for x, y ∈ A with x ≤ y,

GA(x, y) = 2n (x/n) (1 − y/n).

Proof. We will use the well-known “gambler’s ruin” calculation for one-dimensional random walk which states that if x ∈ A, then the probability that the random walk leaves A at n is x/n. Suppose X0 = x. Using the gambler’s ruin calculation on both sides of x, we can see that the probability to reach ∂A before returning to x is

(1/2)(1/x) + (1/2)(1/(n − x)) = n / [2x(n − x)].

Therefore,

GA(x, x) = 2x(n − x)/n.

Similarly, if x < y, the probability to visit y before leaving A is x/y and hence

GA(x, y) = (x/y) GA(y, y) = (x/y) · 2y(n − y)/n = 2x(n − y)/n. □

Exercise 6.23. The last proof used the “gambler’s ruin” estimate: if Xn is simple random walk on {0, 1, . . . , n} stopped when it reaches 0 or n, with X0 = x, then the probability that the walk stops at n is x/n. Verify this. (There are many ways to do this exercise!)
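The closed form in Proposition 6.22 can be verified directly against GA = (I − PA)^{−1}; a sketch:

```python
import numpy as np

# Green's function of SRW on A = {1,...,n-1} killed at {0, n}
n = 10
P = np.zeros((n - 1, n - 1))
for j in range(n - 1):
    if j > 0:
        P[j, j - 1] = 0.5
    if j < n - 2:
        P[j, j + 1] = 0.5
G = np.linalg.inv(np.eye(n - 1) - P)

# Closed form G_A(x, y) = 2x(n - y)/n for x <= y (and symmetric in x, y)
x = np.arange(1, n)
formula = 2.0 * np.minimum.outer(x, x) * (n - np.maximum.outer(x, x)) / n
assert np.allclose(G, formula)
```

The outer `minimum`/`maximum` trick encodes "x ≤ y" for every pair at once.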

6.6. Square of the field

There is a relationship between the random walk loop soup at time s = 1/2 and the square of the GFF. Before stating the result, let us first describe what happens when there are no loops, that is, when q ≡ 0.

Exercise 6.24. If N is a standard normal random variable, then the density of N²/2 is

f(t) = (1/√(πt)) e^{−t},    0 < t < ∞.
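As a numerical sanity check (not a proof) of the claim in the exercise: integrating the stated density gives P{N²/2 ≤ t} = erf(√t), which can be compared with an empirical distribution function.

```python
import math
import numpy as np

rng = np.random.default_rng(3)

# Integrating e^{-s}/sqrt(pi s) over (0, t] gives erf(sqrt(t));
# compare with the empirical distribution of N^2/2
N = rng.standard_normal(500_000)
T = N**2 / 2

for t in (0.1, 0.5, 1.0, 2.0):
    print(t, (T <= t).mean(), math.erf(math.sqrt(t)))  # columns nearly equal
```
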

Hint: When trying to find densities, it is often easier to compute the distribution function and then differentiate to get the density.

The distribution of N² is called the χ² (“chi-square”) distribution with one degree of freedom, and the distribution of N1² + · · · + Nn² with N1, . . . , Nn independent standard normals is called the χ² distribution with n degrees of freedom. Such distributions are very important in mathematical statistics. Our dividing by 2 is just a convenience to make the density a little simpler.

The following proposition is an immediate corollary of the exercise; we state it in order to define some notation.

Proposition 6.25. If N = (N1, . . . , Nn) is a standard normal vector then the random vector

(1/2)(N1², . . . , Nn²)

has density

(6.8)    ψ(t) = [Π_{j=1}^n 1/√(π tj)] exp{−Σ_{j=1}^n tj}.

For the remainder we will assume for ease that our weight satisfies q(x, x) = 0 for each x.

Proposition 6.26. Suppose {Zx : x ∈ A} is a Dirichlet GFF with weights q. Then the density of {Zx²/2 : x ∈ A} is given by ψ(t) Φ(t) where ψ is as in (6.8) and

Φ(t) = [det G]^{−1/2} E[exp{Σ_{x∈A} Σ_{y∈A} Jx Jy √(tx ty) q(x, y)}],

where {Jx : x ∈ A} are independent coin-flipping random variables, P{Jx = 1} = P{Jx = −1} = 1/2.

Proof. We leave this as an exercise in change of variables of a density (see (6.8)) using the change of variables tx = zx²/2. Some care is needed since the square is not a one-to-one function. Indeed for any t there exist 2^n choices of z since each component can be positive or negative. □

We will give another expression for the expectation in the last proposition. If e is an undirected edge with endpoints x, y, let te = tx ty and Je = Jx Jy. Recall that θe = 2p(x, y) if x ≠ y and θe = p(x, x) if x = y.

Proposition 6.27. Suppose {Zx : x ∈ A} is a Dirichlet GFF with weights q. Then the density of {Zx²/2 : x ∈ A} is given by ψ(t) Φ(t) where ψ is as in (6.8) and

Φ(t) = [det G]^{−1/2} Σ_k Π_e (√te θe)^{ke} / ke!,

where the sum is over all nonnegative integer-valued functions k = {ke : e ∈ E} such that for each x ∈ A the total kx := Σ_{e: x∈e} ke is even.

Proof. We first calculate

exp{Σ_{x∈A} Σ_{y∈A} Jx Jy √(tx ty) q(x, y)} = exp{Σ_e Je √te θe}
  = Π_e exp{Je √te θe}
  = Π_e Σ_{k=0}^∞ (Je √te θe)^k / k!
  = Σ_k Π_e (Je √te θe)^{ke} / ke!
  = Σ_k [Π_e (√te θe)^{ke} / ke!] Π_{x∈A} Jx^{kx},

and then by taking expectations we get

E[exp{Σ_e Je √te θe}] = Σ_k [Π_e (√te θe)^{ke} / ke!] E[Π_{x∈A} Jx^{kx}].

By independence, E[Π_{x∈A} Jx^{kx}] = Π_{x∈A} E[Jx^{kx}], which equals zero if kx is odd for at least one x and otherwise equals one. This gives the result. □
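The parity argument at the end of the proof is just the statement that E[exp(aJ)] = cosh(a) for a coin flip J, and that the Taylor series of cosh contains only even powers; a quick check (a stands in for √te θe and is an arbitrary illustrative value):

```python
import math

# With P(J = 1) = P(J = -1) = 1/2, odd moments of J vanish, so
# E[exp(a J)] = cosh(a) = sum over even k of a^k / k!
a = 0.7
expectation = 0.5 * (math.exp(a) + math.exp(-a))
even_series = sum(a**k / math.factorial(k) for k in range(0, 30, 2))
print(expectation, even_series)   # both equal cosh(0.7)
```
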

We finish by stating a result relating the loop soup of the previous chapter to the square of the GFF. Suppose we have a realization of the loop soup with symmetric weights q(x, y). Then as in Section 3.8, we let Rt(x) denote the number of times that the vertex x has been visited by time t and let R̃t(x) = t + Rt(x). Let T^x_t denote independent Gamma processes with rate 1 indexed by x ∈ A, and let

Yt(x) = T^x_{R̃t(x)} = T^x_{t + Rt(x)}.

We view this as a “continuous local time” at the vertex x generated by the soup.

Theorem 6.28.
• {Y1/2(x) : x ∈ A} has the same distribution as {Zx²/2} where {Zx : x ∈ A} is the GFF.
• {Y1(x) : x ∈ A} has the same distribution as {(Zx² + Z̃x²)/2} where {Zx : x ∈ A}, {Z̃x : x ∈ A} are independent GFFs.

We leave it to the reader to prove this if they want to give it a try. Here are some starting ideas.

• Show that proving either of the statements will also give the other.
• Use the combinatorial formulas in Corollary 3.21 and Proposition 3.22 to compute the joint density of {Yt(x)} for t = 1 or t = 1/2, respectively.

Further reading The field of Gaussian processes is huge. If we restrict to the relationship between Markov chains (processes) and Gaussian fields, here are some possibilities [8, 11, 14, 17, 19].

Chapter 7

Scaling Limits

7.1. The idea of a scaling limit

In this chapter we introduce scaling limits of some of the processes on the integer lattice that we have been studying. A serious treatment of these continuous processes requires a measure-theoretic background in probability, and even with that it takes significant work to put in all the details. For this reason, we will only introduce some of the ideas and present some of the main results with the goal of enticing the reader to learn more. Theorems stated in this section have been proved but we will not discuss the proofs here.

When studying finite systems as the size goes to infinity, it is standard to scale the processes so that they are objects of diameter of order one. Suppose ω = [ω0, ω1, . . . , ωk] is a nearest neighbor path in Z^d with ω0 = 0 and k large, arising as a realization of some random process. Let N be an integer such that the diameter of the path ω is about N. We scale the path to view it on the scaled lattice ZN = N^{−1} Z^d. This scales space so that the diameter of the path is of order 1. We also want to scale time so that the time to traverse the scaled path is also of order 1. The fractal dimension of the path is the exponent α such that k ≍ N^α. We then define the scaled path

γ^{(N)}(t) = N^{−1} ω(tN^α),    0 ≤ t ≤ kN^{−α}.

This defines a function of t of the form nN^{−α} for some integer n. We define it for other t by linear interpolation to make it a continuous curve. In the limit we hope to get a random continuous curve of “fractal dimension” α. There are various ways of talking about fractal dimension. We will make a precise statement later in this section but for now we think of the scaling rule

(7.1)    |∆x| ≈ (∆t)^{1/α}.

This definition depends not only on the set of points visited but also the order in which they were visited and the parametrization. Sometimes it is useful to concentrate on the “fractal dimension” of subsets of R^d without regard to how they arise as images of curves. We will give one definition of such a dimension for bounded V ⊂ R^d. Define the “ϵ-sausage” about V by

Vϵ = {z ∈ R^d : dist(z, V) ≤ ϵ}.

Definition 7.1. The Minkowski α-content of V ⊂ R^d is defined by

Contα(V) = lim_{ϵ↓0} ϵ^{α−d} Vol(Vϵ),

provided that the limit exists. A bounded subset V has Minkowski dimension α if

Contβ(V) = 0 for β > α,    Contβ(V) = ∞ for β < α.

These definitions assume that the limits exist. If the limits do not exist, we can still define upper Minkowski content, lower Minkowski content, upper Minkowski dimension, and lower Minkowski dimension by taking limsups and liminfs rather than limits.

In order to get some intuition, let us consider a square in R3 , V = {(x, y, 0) ∈ R3 : 0 ≤ x, y ≤ 1}.

The ϵ-sausage Vϵ is the union of {(x, y, z) : 0 ≤ x, y ≤ 1, |z| ≤ ϵ} and a small set around the border of volume O(ϵ²). Hence

Vol[Vϵ] = 2ϵ + O(ϵ²),

and

Cont2(V) = lim_{ϵ↓0} ϵ^{2−3} Vol(Vϵ) = 2.

The fact that the limit is 2 rather than, say, 1 is an artifact of our particular definition. The Minkowski content should be viewed as a quantity defined up to an arbitrary constant.
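To see these dimension counts in action, here is a sketch using the middle-thirds Cantor set (a standard example, not discussed in the text): at scale 3^{−k} it is covered by exactly 2^k intervals, so the box-counting ratio log N / log(1/scale) equals log 2 / log 3 at every scale.

```python
import math

# Intervals remaining after k steps of the middle-thirds construction
def cantor_intervals(k):
    intervals = [(0.0, 1.0)]
    for _ in range(k):
        intervals = [piece
                     for (a, b) in intervals
                     for piece in ((a, a + (b - a) / 3), (b - (b - a) / 3, b))]
    return intervals

k = 10
N = len(cantor_intervals(k))                    # 2^k boxes of side 3^{-k}
dim_estimate = math.log(N) / math.log(3**k)
print(dim_estimate, math.log(2) / math.log(3))  # both about 0.6309
```
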

The fact that the limit is 2 rather than, say, 1 is an artifact of our particular definition. The Minkowski content should be viewed as a quantity defined up to an arbitrary constant. Exercise 7.2. Show that for any bounded set V , the Minkowski content of V is the same as the Minkowski content of its closure. Exercise 7.3. If V is a bounded subset of Rd , let Nn (V ) denote the smallest number of balls of radius 1/n needed to cover V . The box or box counting dimension of V , if it exists, is defined to be log Nn (V ) . n→∞ log n lim

Show that the box dimension is the same as the Minkowski dimension. Exercise 7.4. If V is a bounded subset of Rd let Hϵα (V ) = inf

∞ ∑

[diam Uj ]α ,

j=1

where the infimum is over all countable collections of sets U1 , U2 , . . . with diam[Uj ] ≤ ϵ for each j and V ⊂

∞ ∪

Uj .

j=1

(1) Show that the limit

Hα(V) = lim_{ϵ↓0} Hϵ^α(V)

exists (where 0 and ∞ are possible values of the limit).

(2) Show that for each V, there exists a unique number dimh(V) such that Hα(V) = 0 for α > dimh(V) and Hα(V) = ∞ for α < dimh(V). The number dimh(V) ∈ [0, d] is called the Hausdorff dimension of V.
(3) Show by examples that if α = dimh(V) then any of Hα(V) = 0, 0 < Hα(V) < ∞, Hα(V) = ∞ are possible.
(4) Show that if the box dimension of V exists then it is greater than or equal to the Hausdorff dimension.
(5) Show that if V is a bounded, countable set, then dimh(V) = 0. Hence the Hausdorff dimension can be strictly less than the box dimension.
(6) Does there exist a countable subset of [0, 1] that is closed and has box dimension strictly bigger than zero?

7.2. Brownian motion

Brownian motion, or the Wiener process, is the scaling limit of random walk. Let Sn denote a simple random walk in Z^d starting at the origin and define W^{(N)}_t on the lattice ZN by

W^{(N)}_t = N^{−1} S_{tdN²}.
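A quick simulation suggests the normalization is right: each coordinate of Sn has variance n/d, so W^{(N)}_1 should have coordinate variance about 1. The parameters below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)

# Scaled 2-d simple random walk: W_1^{(N)} = S_{d N^2} / N
d, N, trials = 2, 10, 10_000
steps = d * N**2

dirs = np.vstack([np.eye(d), -np.eye(d)])      # the 2d unit steps
choices = rng.integers(0, 2 * d, size=(trials, steps))
S = dirs[choices].sum(axis=1)                  # endpoints S_{dN^2}
W1 = S / N

print(W1.var(axis=0))                          # each entry close to 1
```
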

The central limit theorem tells us that the limit of the distribution of W^{(N)}_1 is a standard normal random variable in R^d. Brownian motion is the scaling limit if we consider it as a random function of time t.

Definition 7.5. A standard Brownian motion or Wiener process starting at the origin in R^d is a collection of random variables {Wt} satisfying the following.
• W0 = 0;
• For every s < t, the random variable Wt − Ws is independent of {Wr : r ≤ s}.
• If s < t, (t − s)^{−1/2}(Wt − Ws) has a standard normal distribution in R^d.

• With probability one, the function t ↦ Wt is a continuous function of t.

A process Xt that satisfies the first two conditions and such that the distribution of Xt − Xs depends only on t − s is called a Lévy process, see Section A.7. It can be shown that if Xt is a Lévy process with continuous paths, then Xt has the same distribution as rWt + tv for some r ≥ 0 and v ∈ R^d. It takes a little effort to show that Brownian motion exists. While we have come to the definition by taking a “limit” of random walk, one can construct the process without using the approximating random walk.

Exercise 7.6. Suppose Wt¹, . . . , Wt^d are independent standard Brownian motions in R and Wt = (Wt¹, . . . , Wt^d). Show that Wt is a standard Brownian motion in R^d.

Exercise 7.7. Suppose Wt is a standard Brownian motion in R^d, a ∈ R \ {0}, and Yt = a^{−1} W_{ta²}. Show that Yt is a standard Brownian motion in R^d.

Exercise 7.8. Suppose Wt is a standard Brownian motion in R. Show that {Wt : t ≥ 0} has a centered multivariate normal distribution with E[Ws Wt] = min{s, t}.

A Brownian motion for all d scales like a process of fractal dimension α = 2 if we use the rule (7.1). However, if we consider the image W[0, t] = {Ws : 0 ≤ s ≤ t}, then for d = 1, W[0, t] is a closed interval in R and hence has Minkowski dimension 1. As the next theorem states, as long as the motion has at least two dimensions to move in, the path of the motion will be a two-dimensional set.

Theorem 7.9. Suppose Wt is a standard Brownian motion in R^d, d ≥ 2 and t > 0. Then with probability one, the Minkowski dimension of

W[0, t] is 2. Moreover, there exists a constant c = c(d) > 0 such that

Cont2(W[0, t]) = 0 for d = 2,    Cont2(W[0, t]) = ct for d ≥ 3.

We also introduce the idea of a Brownian loop or bridge. Intuitively, it is a standard Brownian motion Wt conditioned so that W0 = W1 = 0 and then considered as a periodic function of period 1. This is conditioning on an event of probability zero so it is not precise. Fortunately there is a direct way to define what we want. We already saw the discrete analogue of this construction in Exercise 6.8.

Definition 7.10. The (standard) Brownian loop or bridge is the process defined by Yt = Wt − tW1, 0 ≤ t ≤ 1, where Wt is a standard Brownian motion.

Using Exercise 7.6, we can see that we can get a d-dimensional Brownian loop by taking a d-tuple of independent one-dimensional Brownian bridges, one in each component.

Proposition 7.11. The one-dimensional Brownian bridge is a centered multivariate normal process with covariance E[Ys Yt] = s(1 − t) if s ≤ t. It is shift invariant in the following sense. Let Xt = Yt+s − Ys where we view X, Y as functions of period 1. Then Xt is a one-dimensional Brownian loop.

Proof. If s ≤ t ≤ 1, then

E[(Ws − sW1)(Wt − tW1)] = E[Ws Wt] − s E[W1 Wt] − t E[Ws W1] + st E[W1²]
                        = s − st − st + st = s(1 − t).

Let 0 < s < 1 and let

Bt = Wt+s − Ws if t + s ≤ 1,    Bt = W1 − Ws + Wt+s−1 if t + s ≥ 1.

Then one can check that Bt, 0 ≤ t ≤ 1, is a standard Brownian motion with B1 = W1. If t + s ≤ 1,

Xt = Yt+s − Ys = Wt+s − (t + s)W1 − [Ws − sW1]
   = Wt+s − Ws − tW1
   = Bt − tB1.

If t + s > 1,

Xt = Yt+s−1 − Ys = Wt+s−1 − (t + s − 1)W1 − [Ws − sW1]
   = Wt+s−1 + W1 − Ws − tW1
   = Bt − tB1. □

7.3. Conformal invariance in two dimensions

When studying functions of two variables, it is often useful to view R² as the complex plane C. Similarly, we can view a standard Brownian motion in R² as a complex-valued Brownian motion Bt = Wt¹ + i Wt² where W¹, W² are independent Brownian motions. There is a close relationship between Brownian motion and conformal maps. We will need some facts from complex analysis.

A domain is a connected, open subset of C. Two important domains are the unit disk D = {z : |z| < 1} and the upper half plane H = {x + iy : y > 0}. A function f : D → C is holomorphic or analytic if it is differentiable at all z ∈ D using the usual definition of derivative extended to complex-valued functions:

f′(z) = lim_{w→z} [f(w) − f(z)] / (w − z).

The fact that the limit exists as w → z in all possible directions says a lot about the function. If we write f(z) = u(z) + iv(z) where u, v are real-valued functions, then u, v satisfy the Cauchy-Riemann equations,

∂x u = ∂y v,    ∂y u = −∂x v.

In particular, u, v are harmonic functions,

∂xx u + ∂yy u = ∂xx v + ∂yy v = 0.


If f′(z) ≠ 0 and we write f′(z) = re^{iθ} with r > 0, then locally at z, f looks like a dilation by a factor r and a rotation by angle θ.

Definition 7.12. A holomorphic function f : D → f(D) is called a conformal transformation if it is a bijection with f′(z) ≠ 0 for all z ∈ D.

The condition that f′ is nonzero in the last definition is not needed. In complex analysis, one learns that a holomorphic bijection will satisfy f′ ≠ 0 everywhere; indeed a holomorphic function f is locally one-to-one at z if and only if f′(z) ≠ 0. If f′(z) ≠ 0 for all z, the function is "locally one-to-one" but might not be globally one-to-one. One example is f(z) = e^z, which satisfies f(z) = f(z + 2πi).

Theorem 7.13.

• If f : D → D′ is a conformal transformation, then f^{−1} : D′ → D is a conformal transformation.

• The only conformal transformations f : D → D with f(0) = 0 are of the form f(z) = wz for some |w| = 1.

• The only conformal transformations f : H → H with f(∞) = ∞ are linear functions of the form

    f(z) = rz + b,   r > 0, b ∈ R.

• If w ∈ D, the map

    f(z) = (z − w) / (1 − w̄ z)

is a conformal transformation of D onto D with f(w) = 0. Here w̄ denotes the complex conjugate.

Theorem 7.14. Suppose D, D′ are domains and f : D → D′ is a conformal transformation. Let B_t be a complex Brownian motion in D and let τ = min{t : B_t ∉ D}. If Y_t = f(B_t), then Y_t is a time-change of a standard Brownian motion in D′.


When we say time-change we mean that the "picture" of the path is the same as a standard Brownian motion but the "rate" of traversing it is different. This time change comes from the fact that when we dilate space by a factor of r we need to change time by a factor of r². We will not need to worry about the time change.

One of the reasons that conformal invariance is useful is that all proper domains of the complex plane "without holes" are conformally equivalent in the sense that one can find a conformal transformation mapping one to the other. To be precise, a domain D ⊊ C is simply connected if every connected component of C ∖ D is unbounded. Note that D and H are simply connected.

Theorem 7.15 (Riemann Mapping Theorem). Suppose D is a simply connected proper subdomain of C containing the origin. Then there exists a unique conformal transformation f : D → D with f(0) = 0, f′(0) > 0.

Combining this with Theorem 7.13, we can see that every conformal transformation g : D → D with g(0) = 0 is of the form g(z) = e^{iθ} f(z) for some θ.
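The disk automorphism appearing in Theorem 7.13 can be checked numerically; a small sketch (the sample point w = 0.3 + 0.4i is an arbitrary choice):

```python
import cmath

def disk_automorphism(w):
    """The map f(z) = (z - w)/(1 - conj(w) z): a conformal transformation of
    the unit disk onto itself with f(w) = 0 (Theorem 7.13, last bullet)."""
    def f(z):
        return (z - w) / (1 - w.conjugate() * z)
    return f

f = disk_automorphism(0.3 + 0.4j)
print(abs(f(0.3 + 0.4j)) < 1e-15)                          # f(w) = 0
circle = [cmath.exp(2j * cmath.pi * k / 12) for k in range(12)]
print(all(abs(abs(f(z)) - 1.0) < 1e-12 for z in circle))   # |f| = 1 on |z| = 1
```

The second check verifies that the unit circle maps to itself, which (with holomorphy) is the heart of why f maps D onto D.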

7.4. Brownian loop soup

The Brownian loop soup in C is the scaling limit of the random walk loop soup in N^{−1} Z². Let us make this precise. We start with the random walk loop measure in Z² as in Chapter 3. We recall that there are three ways to look at this: growing loop measure, rooted loop measure, or unrooted loop measure. We will choose the rooted loop measure that assigns to each nontrivial loop of 2n steps measure (2n)^{−1} 4^{−2n}. This is an infinite measure because we are allowing all possible roots. However, if z ∈ Z², the total measure of loops rooted at z is finite; indeed, it is given by

(7.2)    ∑_{n=1}^∞ (1/(2n)) P{S(2n) = 0},

where S is a simple random walk in Z². As we have seen in Theorem 4.3, P{S(2n) = 0} ∼ 1/(πn) and hence the sum is finite.
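The sum in (7.2) can be evaluated numerically. A sketch, using the standard identity P{S(2n) = 0} = (binom(2n, n) 2^{−2n})² for simple random walk on Z² (the two coordinates of the diagonally rotated walk are independent one-dimensional walks):

```python
def loop_mass_at_origin(nmax):
    """Partial sum of (7.2): sum_{n=1}^{nmax} (1/(2n)) P{S(2n)=0} for SRW on
    Z^2, via P{S(2n)=0} = q_n^2 with q_n = binom(2n,n) 2^{-2n} and the
    recursion q_n = q_{n-1} * (2n-1)/(2n)."""
    q = 1.0
    total = 0.0
    for n in range(1, nmax + 1):
        q *= (2 * n - 1) / (2 * n)
        total += q * q / (2 * n)
    return total

print(round(loop_mass_at_origin(100), 3))
print(round(loop_mass_at_origin(100000), 3))
```

Since the terms decay like 1/(2πn²), the partial sums converge quickly, consistent with the finiteness claim above.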


As before, let Z_N = N^{−1} Z² and associate to each loop ω of length 2n the scaled loop

    γ_ω(t) = N^{−1} ω(2N² t),   0 ≤ t ≤ n N^{−2}.

The random walk loop measure induces a measure on scaled paths γ_ω; here we are scaling the path but not the measure of the path. As N → ∞, if we restrict to paths that lie in the unit disk, we see that the total measure of scaled loops in the disk grows to infinity as N → ∞. The (rooted) Brownian loop measure is the scaling limit of this measure. It can be defined directly without reference to the random walk loop measure.

A continuous loop γ of time duration t_γ is a continuous function γ : [0, t_γ] → C with γ(0) = γ(t_γ). To specify a loop we can give a triple (z, t_γ, γ̂) where z = γ(0) is the root; t_γ is the time duration; and γ̂ is the loop translated so that it is rooted at the origin and scaled so that it has time duration one. Here we use Brownian scaling, that is, the triple (z, t_γ, γ̂) corresponds to the loop

    γ(t) = z + t_γ^{1/2} γ̂(t/t_γ),   0 ≤ t ≤ t_γ.

To give a measure on curves γ, we can give a measure on triples as above. We write the Brownian loop measure as

    (Area) × (dt / (2πt²)) × (Brownian bridge).

In other words, the root z is chosen "uniformly" on C; the curve γ̂ is chosen using the probability distribution of the Brownian bridge as described in Section 7.2; and the time duration t_γ is chosen according to the measure on [0, ∞) with density

    1/(2πt²) = (1/(2πt)) · (1/t).

We factor the density to consider the two terms separately. The factor (2πt)^{−1} is the density of a Brownian motion at time t evaluated at the origin, which roughly can be considered as "the probability that the motion is at the origin at time t", and the 1/t is the analogue of the 1/2n


for the random walk loops. See (7.2). Unlike the discrete case, the corresponding integral

    ∫_0^∞ dt / (2πt²)

blows up at the origin although it converges at infinity. This indicates a property of the loop measures: small loops are more likely to appear, but, of course, they are not as big. We have defined the Brownian loop measure as a measure on rooted loops but it can also be considered as a measure on unrooted loops, which are defined analogously to the definition for random walk loops.

Recall from Section 3.4 that we can define the random walk loop soup by taking a Poissonian realization of the rooted loop measure. This can be viewed as a "soup" of rooted loops or unrooted loops. Let C_t be the collection of loops at time t. As the lattice spacing 1/N shrinks to zero, this converges to the Brownian loop soup. One can define the soup directly as a Poissonian realization from the Brownian loop measure but we will not give the formal definition. If D is a domain, then the soup in D is the Brownian loop soup restricted to loops that stay in D. If D is bounded, there are infinitely many loops in the soup because there are many "small" loops. However, for each ϵ > 0, the number of loops of diameter at least ϵ is finite. This is similar to the increasing Lévy jump processes (see Section A.5) that jump infinitely often in each open interval but take jumps of size at least ϵ only finitely often.

There are two important properties that the loop soup satisfies. The first is immediate from the definition.

• Restriction Property. If D′ ⊂ D, then the soup in D′ can be obtained by taking the soup in D and restricting to loops staying in D′.

The second property is more surprising and is given in the next theorem.

Theorem 7.16 (Conformal Invariance). Suppose f : D → D′ is a conformal transformation. Suppose C_t is a realization of the unrooted


loop soup in D and define Ct′ = {f ◦ γ : γ ∈ Ct }. Then Ct′ is a Brownian loop soup in D′ . In the statement of this theorem we have written f ◦ γ to denote the image of γ under the map f . As in the conformal invariance of Brownian motion, this is really defined “up to a time change” but we will not worry about this detail.
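A minimal sketch of the Poissonian structure just described: restricted to loops rooted in a unit-area region with time duration at least t0, the loop measure has total mass λ ∫_{t0}^∞ dt/(2πt²) = λ/(2πt0) at soup intensity λ, so the number of such loops is Poisson with that mean and their durations follow the normalized density t0/t². The function name and parameters are illustrative, not from the text:

```python
import math
import random

def sample_soup_durations(lam, t0, seed=1):
    """Sample the durations of loops rooted in a unit-area region with time
    duration >= t0 in the Brownian loop soup at intensity lam.  The count is
    Poisson(lam/(2*pi*t0)); each duration has density t0/t^2 on [t0, inf)."""
    rng = random.Random(seed)
    mean = lam / (2 * math.pi * t0)
    # Poisson(mean) via counting unit-rate exponential arrivals up to `mean`.
    count, s = 0, rng.expovariate(1.0)
    while s < mean:
        count += 1
        s += rng.expovariate(1.0)
    # Inverse transform: if U is uniform on (0,1], then t0/U has density t0/t^2.
    return [t0 / (1.0 - rng.random()) for _ in range(count)]

durations = sample_soup_durations(lam=1.0, t0=0.01)
print(len(durations), all(t >= 0.01 for t in durations))
```

Shrinking t0 makes the expected count blow up like 1/t0, mirroring the divergence of the integral at the origin.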

7.5. Scaling limit for LERW

We will find the scaling limit for the LERW. Rather than trying to take the limit directly, we start by listing the properties that a scaling limit might have. Let

    A_N = {x + iy ∈ Z + iZ : |x|, |y| < N}

and consider the boundary points z_N = −N, w_N = N. Note that N^{−1} A_N is an approximation of the square D = {x + iy ∈ C : |x|, |y| < 1} with boundary points z = −1, w = 1. Loop-erased random walk from z_N to w_N in D is obtained by taking a simple random walk starting at z_N, conditioning it to go into A_N and leave A_N at w_N, and then erasing the loops. If η = [η_0 = z_N, η_1, η_2, ..., η_k = w_N] is a SAW in A_N from z_N to w_N, then (see Proposition 2.8) the probability that we get η is

(7.3)    q_N^{−1} 4^{−|η|} F_η(A_N).

Here q_N is the normalization factor to make this a probability measure; in other words, q_N is the probability that a simple random walk starting at z_N goes into A_N and leaves at w_N. Let us suppose that the scaling limit exists and that it produces curves of fractal dimension as in Section 7.1.
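The loop-erasing procedure itself is easy to code. The following sketch (with hypothetical helper names) erases loops chronologically and obtains the conditioned walk by rejection sampling, which is feasible only for small N:

```python
import random

def loop_erase(path):
    """Chronological loop erasure: each time a site is revisited, remove the
    loop created since its first visit."""
    lerw = []
    for v in path:
        if v in lerw:
            del lerw[lerw.index(v) + 1:]
        else:
            lerw.append(v)
    return lerw

def lerw_in_square(N, seed=2):
    """LERW from z_N = (-N, 0) to w_N = (N, 0) across A_N = {|x|,|y| < N} by
    rejection: run SRW until it leaves A_N; accept if it exits at w_N."""
    rng = random.Random(seed)
    while True:
        path = [(-N, 0), (-N + 1, 0)]     # condition the walk to enter A_N
        x, y = -N + 1, 0
        while abs(x) < N and abs(y) < N:
            dx, dy = rng.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
            x, y = x + dx, y + dy
            path.append((x, y))
        if (x, y) == (N, 0):
            return loop_erase(path)

print(loop_erase([0, 1, 2, 1, 3, 0, 4]))   # [0, 4]
eta = lerw_in_square(3)
print(eta[0], eta[-1], len(eta) == len(set(eta)))
```

The accepted, loop-erased path is a SAW from z_N to w_N, as in the description above.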

More generally, we can consider any domain D ⊂ C and distinct boundary points z, w ∈ ∂D. Suppose we have a family of probability measures P(D; z, w) on curves from z to w in D for all such (z, w, D).

[Two figures: the square A_N with corners −N ± Ni and N ± Ni, with the boundary points z_N = −N and w_N = N marked.]

We will first make an assumption that is a limit of something we have observed for LERW.


[Figure: a simply connected domain with boundary points z and w.]

• Domain Markov property. Suppose we have seen an initial segment γ(s) : 0 ≤ s ≤ t of the curve. Then the conditional distribution on the rest of the curve is that of P(D \ γ[0, t]; γ(t), w). Possible scaling limits are conjectured to be rotationally invariant as well as scale invariant (with appropriate changes of time). For this reason we might hope that the limit is actually conformally invariant. For the LERW, one might guess this by noting that the scaling limit of the simple random walk (Brownian motion) is conformally invariant and the loop-erasing procedure depends only on the ordering of the vertices. • Conformal invariance. Suppose f : D → D′ is a conformal transformation. Define a probability measure on curves in D′ by f ◦ P(D; z, w)[V ] = P(D; z, w){γ : f ◦ γ ∈ V }. Then f ◦ P(D; z, w) = P(D′ ; f (z), f (w)).

We have written f ◦ γ as the image of the curve γ under f. As in the case of Brownian motion, there is a question of how the image curve should be parametrized. We will view f ◦ γ as a curve "modulo reparametrization".


It turns out that these two properties determine the possible limit up to a single parameter.

Theorem 7.17. Suppose there is a family of probability measures P(D; z, w) on simple curves from z to w in D, indexed by simply connected proper subdomains of C and distinct z, w, satisfying the domain Markov property and conformal invariance. Then P(D; z, w) is the Schramm-Loewner evolution with parameter κ (SLE_κ) for some 0 < κ ≤ 4. With probability one the following is true.

• The curves have Minkowski dimension α = 1 + κ/8.

• If the boundary is not too bad near z, w, the curves have finite, nonzero Minkowski α-content.

We will define the main character, SLE_κ, in the next section. The theorem states that there is only one possible limit for each choice of fractal dimension α = 1 + β with 0 < β ≤ 1/2. The process SLE_κ is well defined for κ > 4 but it does not give simple curves; we will restrict our discussion to κ ≤ 4. We have not determined which value corresponds to LERW. We can use κ ∈ (0, 4] or β ∈ (0, 1/2] to specify the family. We define another quantity c, called the central charge, by

    c = (3κ − 8)(6 − κ) / (2κ).

Note that −∞ < c ≤ 1.

Theorem 7.18 (Boundary perturbation). Suppose D is a simply connected domain with boundary points z, w, and suppose that D′ ⊂ D is simply connected and agrees with D in neighborhoods of z, w. Let E, E′ denote expectations with respect to P(D; z, w) and P(D′; z, w), respectively. Let

(7.4)    Y(γ) = 1{γ(0, t_γ) ⊂ D′} exp{(c/2) m(D; γ, D ∖ D′)},

where m(D; γ, D ∖ D′) denotes the Brownian loop measure of loops in D that intersect both γ and D ∖ D′. Then for every random variable


X,

    E′[X] = E[X Y] / E[Y].

The conclusion can be expressed precisely in terms of Radon-Nikodym derivatives of measures:

    [dP(D′; z, w) / dP(D; z, w)](γ) = Y(γ) / E[Y].

The boundary perturbation property with c = 0 is called the restriction property.

The term Radon-Nikodym derivative refers roughly to the ratio of the measure at a point in two different measures. This is defined even if individual points have zero measure. As an example, suppose that X and Y are two continuous random variables with densities f, g, respectively. Then we write informally

    P{X = x} / P{Y = x} = f(x) / g(x),

even though the numerator and denominator on the left side equal zero.

In order to understand the statement let us return to the discrete LERW in the square D. Let D′ ⊂ D be a simply connected domain that agrees with D near −1, 1; for example, we could choose D′ = {x + iy : |x| ≤ 1, |y| ≤ 1/2}. Let A′_N denote the corresponding discrete set. Recalling (7.3), we see that the probability that the LERW from z_N to w_N is η is

    [q′_N]^{−1} 4^{−|η|} F_η(A′_N) 1{η ⊂ A′_N}.

If we divide this by the quantity in (7.3) we get

(7.5)    C_N(D′, D) 1{η ⊂ A′_N} F_η(A′_N) / F_η(A_N),

where C_N(D′, D) denotes a normalization constant independent of η. Note that

    F_η(A′_N) / F_η(A_N) = exp{−Λ},


where Λ denotes the random walk loop measure of loops in A_N that intersect η and A_N ∖ A′_N. As N gets large, we see that (7.5) approaches (7.4) with c = −2. This identifies the appropriate limit for LERW as SLE₂ with fractal dimension α = 5/4.

Theorem 7.19. As N → ∞, the scaling limit of the LERW on the square A_N converges to SLE₂ from z to w in the square D. There exists a constant c* such that the limit curve γ satisfies

    Cont_{5/4}(γ[0, t]) = c* t,   0 ≤ t ≤ t_γ.

7.6. Loewner differential equation

In order to define SLE in simply connected domains it suffices to define it in one domain. The most convenient domain is the upper half plane H, with the target point being ∞, which is a boundary point of H. This is a probability measure on random curves from 0 to ∞. Before defining it, we start by considering a fixed simple curve γ : (0, ∞) → H with γ(0+) = 0. For each t, let D_t = H ∖ γ(0, t], which is a simply connected "slit" domain. By the Riemann mapping theorem, we can find a conformal transformation g_t : D_t → H that sends infinity to infinity. We can give an expansion as z → ∞,

    g_t(z) = rz + b + O(|z|^{−1}),

where r > 0, b ∈ R. By composing with a conformal transformation of H onto H, we can find a unique such transformation with r = 1, b = 0, and its expansion at infinity looks like

    g_t(z) = z + a(t)/z + O(|z|^{−2}).

It can be shown that the coefficient a(t) is continuous and strictly increasing in t, and hence the curve can be reparametrized so that

    g_t(z) = z + 2t/z + O(|z|^{−2}).

This is called the (half-plane) capacity parametrization.

Theorem 7.20 (Loewner differential equation). Suppose γ : (0, ∞) → H is a simple curve as above with the capacity parametrization satisfying γ(t) → ∞. Then the maps g_t satisfy the equation

(7.6)    ∂_t g_t(z) = 2 / (g_t(z) − U_t),   g_0(z) = z,


[Figure: the conformal map g_t from H ∖ γ(0, t] onto H, sending the tip γ(t) to the point U_t ∈ R; the origin 0 is marked on the real line.]

where U_t = g_t(γ(t)). Moreover, U_t is a continuous function of t. For fixed z ∈ H, the solution is valid up to the time T_z = min{t : γ(t) = z}, where T_z = ∞ if z ∉ γ(0, ∞).

We will not give a complete proof, but let us see how this equation arises. Let us write g_t(z) = u_t(z) + i v_t(z) and focus on the imaginary part v_t(z). The equation (7.6) applied to v_t gives

    ∂_t v_t(z) = Im[ 2 / (g_t(z) − U_t) ] = −2 Im[g_t(z)] / |g_t(z) − U_t|².

Let us focus at t = 0 and see that this can be written as

(7.7)    v_t(z) = v_0(z) − t · 2 Im(z)/|z|² + o(t),   t ↓ 0.

Using the expansion of gt at infinity, we can see that ht (z) := v0 (z) − vt (z) is a bounded harmonic function on Dt with boundary value ht (z) = Im(z) for z ∈ ∂Dt . Using a continuous analogue of the solution of the Dirichlet problem (see Proposition 4.13), we see that ht (z) = Ez [Im(Bτt )] where Bs is a complex Brownian motion and τt = min{s : Bs ̸∈ Dt }. We write Ez [Im(Bτt )] = Pz {Bτt ̸∈ R} Ez [Im(Bτt ) | Bτt ̸∈ R]. As t goes down to zero, the second term Ez [Im(Bτt ) | Bτt ̸∈ R] becomes independent of the starting point z. In our parametrization,


it becomes asymptotic to c√t for some c. The first term does depend on z; as t ↓ 0, it is asymptotic to

    c′ H_H(z, 0) √t,

where H_H is the Poisson kernel in the upper half plane, corresponding roughly to "the probability that Brownian motion starting at z leaves H at 0". It is known that if z = x + iy, then

    H_H(z, 0) = y / (π(x² + y²)) = (1/π) Im(z)/|z|².

By being careful with the constants, one establishes (7.7). This gives the Loewner equation for the imaginary part of g_t, and the Cauchy-Riemann equations can be used to derive it for the real part as well.

Now suppose that γ is a family of random curves satisfying the domain Markov property and conformal invariance. Then the function U_t is a random continuous function that satisfies the following:

• For s < t, U_t − U_s is independent of {U_r : r ≤ s} with the same distribution as U_{t−s}.

In other words, U_t is a Lévy process with continuous paths, and this means that U_t = bt + √κ W_t for a standard Brownian motion W_t, where b ∈ R and κ ≥ 0. By scaling we can show that b must equal zero. Hence there is only one parameter κ.

Definition 7.21. The Schramm-Loewner evolution with parameter κ (SLE_κ) is the solution to the Loewner equation (7.6) where U_t = √κ W_t and W_t is a standard Brownian motion.

Here we have defined SLE to be the conformal maps g_t and implicitly the domains D_t. This definition is valid for all κ > 0. However, only for κ ≤ 4 do we get a probability measure on simple curves such that D_t = H ∖ γ(0, t]. The main mathematical tool to analyze the curve γ and derive the properties discussed in the previous section is stochastic calculus, which we will not discuss in this book.

We finish with one comment. The Loewner equation produces the curve γ but gives it the capacity parametrization. When taking scaling limits it is natural to consider the curve parametrized so that Cont_α(γ(0, t]) = t, where α = 1 + κ/8


is the fractal dimension. One can do this but these parametrizations are singular with respect to each other.
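The Loewner description suggests a standard numerical sketch for plotting SLE: discretize time, freeze the driving function on each step, and compose the explicit inverse slit maps to locate the tip. The step count and value of κ below are arbitrary choices:

```python
import cmath
import math
import random

def sle_trace(kappa, n_steps, t_max=1.0, seed=0):
    """Approximate SLE_kappa trace points via the 'backward' discretization of
    the Loewner equation: with the driving function frozen on each step, the
    incremental inverse map is f_k(w) = U_k + sqrt((w - U_k)^2 - 4*dt), taking
    the square root with nonnegative imaginary part so w stays in H."""
    rng = random.Random(seed)
    dt = t_max / n_steps
    # Driving function U_t = sqrt(kappa) * W_t sampled on the time grid.
    u = [0.0]
    for _ in range(n_steps):
        u.append(u[-1] + math.sqrt(kappa) * rng.gauss(0.0, math.sqrt(dt)))
    def sqrt_h(x):                    # square root in the closed upper half plane
        s = cmath.sqrt(x)
        return s if s.imag >= 0 else -s
    trace = []
    for n in range(1, n_steps + 1):
        w = complex(u[n], 0.0)        # image of the tip gamma(t_n) is U_{t_n}
        for k in range(n - 1, -1, -1):
            w = u[k] + sqrt_h((w - u[k]) ** 2 - 4 * dt)
        trace.append(w)
    return trace

trace = sle_trace(kappa=2.0, n_steps=200)
print(all(z.imag >= 0 for z in trace))
```

This O(n²) scheme is only a sketch; accurate SLE simulation requires care with step sizes near the tip.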

7.7. Self-avoiding walk: c = 0

Using the boundary perturbation rule for SLE as a guide, we can consider other measures on self-avoiding walks besides the LERW. Consider the discrete square A_N as in Section 7.5. We will modify (7.3) so that each SAW η in A_N from z_N to w_N gets measure (see (3.6))

    q_N^{−1} β^{−|η|} F_η(A_N)^{−c/2} = q_N^{−1} β^{−|η|} [det G_{A_N}]^{−c/2}.

Here q_N = q_N(β, c) is the appropriate scaling factor so that this is a probability measure, and β is a "critical value" of the parameter so that q_N neither grows nor decays exponentially as N → ∞. As we have seen (see (3.6)), the LERW corresponds to c = −2. For values other than c = −2, it is an open problem to determine whether this scaling limit exists.

Let us focus on c = 0, for which each SAW of the same length in A_N gets the same measure. This process is called the self-avoiding walk. The constant β is called the connective constant. It is not known exactly, but it is around 2.64.

Exercise 7.22. Let C_n denote the number of SAWs of length n in the integer lattice Z² starting at the origin.

(1) Explain why 2^n ≤ C_n ≤ 4 · 3^{n−1}.

(2) Explain why C_{n+m} ≤ C_n C_m and hence, if f(n) = log C_n, then f is subadditive: f(n + m) ≤ f(n) + f(m).

(3) Show that for any subadditive function,

    lim_{n→∞} f(n)/n = inf_n f(n)/n.

(4) Show that there exists β ∈ [2, 3] with

    lim_{n→∞} C_n^{1/n} = β.

(5) Show that 2 < β < 3.
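For small n, the counts C_n in Exercise 7.22 can be enumerated directly by backtracking; a sketch:

```python
def count_saws(n):
    """Count self-avoiding walks of length n on Z^2 starting at the origin."""
    steps = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    def extend(pos, visited, remaining):
        if remaining == 0:
            return 1
        total = 0
        for dx, dy in steps:
            nxt = (pos[0] + dx, pos[1] + dy)
            if nxt not in visited:        # enforce self-avoidance
                visited.add(nxt)
                total += extend(nxt, visited, remaining - 1)
                visited.remove(nxt)
        return total
    return extend((0, 0), {(0, 0)}, n)

for n in range(1, 6):
    print(n, count_saws(n))
```

The counts grow roughly like β^n; exact enumeration is only feasible for modest n since the search tree grows exponentially.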


We have used the term self-avoiding walk (SAW) to mean a path without self-intersections. The term "self-avoiding walk" is also used for the model in statistical physics in which all SAWs of the same length get the same measure.

It is an open problem whether or not the scaling limit exists for the self-avoiding walk and, if so, whether it satisfies conformal invariance. However, if it does exist and is conformally invariant, one can see that it should also satisfy the domain Markov property and the restriction property, that is, c = 0 and κ = 8/3.

Prediction 7.23. The scaling limit of the SAW is SLE_{8/3}. In particular, the paths have Minkowski dimension 4/3.

Here we have used the word "prediction". For many people at the interface of mathematics and physics this term has the connotation of something that has not been rigorously proved but for which there is a strong theoretical argument. Much of the work of theoretical physicists in this area is referred to as predictions to indicate that although it is not mathematically rigorous, there is theoretical justification for its validity. For the self-avoiding walk model there are other nonrigorous methods that give the same prediction, and the prediction is supported very strongly by numerical simulations.

7.8. Continuous GFF for d = 1, 2

We consider the continuous GFF on a bounded domain D in R or R² with Dirichlet boundary conditions. Roughly speaking, the GFF {X_z : z ∈ D} is a collection of centered normal random variables with E[X_z X_w] = G_D(z, w), where G_D(z, w) is the Green's function for Brownian motion stopped when it leaves D. The d = 1 case is straightforward to define and is a natural limit of the discrete case discussed in Section 6.5. For ease, let D be the open interval (0, 1).


Definition 7.24. The Dirichlet Gaussian free field on (0, 1) is the process {X_t : 0 ≤ t ≤ 1} where X_t = √2 Y_t and Y_t is the Brownian loop as defined in Definition 7.10.

We have already seen that Y_0 = Y_1 = 0 and that X_t is a centered Gaussian process with E[X_s X_t] = 2s(1 − t), 0 ≤ s ≤ t ≤ 1. We need to interpret the right-hand side as G_D(s, t). Suppose W_t is a standard Brownian motion starting at y ∈ D = (0, 1) and let T be the first time that it reaches 0 or 1. In analogy with the random walk case, the Green's function G_D(y, x) is "the expected amount of time spent at x before time T". This description does not make precise sense, but if 0 ≤ a < b ≤ 1, we can let I(a, b; y) be the expected amount of time spent in [a, b] before time T assuming W_0 = y. Then the Green's function G_D(y, x) satisfies

    I(a, b; y) = ∫_a^b G_D(y, x) dx.
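The occupation-time description can be checked on the discrete analogue. For SRW on {0, ..., N} killed at the endpoints, the Green's function is G(j, k) = 2j(N − k)/N for j ≤ k (the discrete counterpart of 2y(1 − x) after scaling); a Monte Carlo sketch with arbitrary parameters:

```python
import random

def expected_visits(N, j, k, trials=10000, seed=0):
    """Monte Carlo estimate of the Green's function G(j, k): expected number
    of visits to site k by SRW on {0, ..., N} started at j, killed at 0, N."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        x = j
        while 0 < x < N:
            if x == k:
                total += 1
            x += rng.choice((-1, 1))
    return total / trials

# Exact discrete formula: G(j, k) = 2 j (N - k)/N for j <= k.
N, j, k = 10, 3, 6
est = expected_visits(N, j, k)
print(round(est, 2), 2 * j * (N - k) / N)
```

The estimate should land near 2.4 up to Monte Carlo error, matching the exact value.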

By symmetry, we see that G_D(y, x) = G_D(1 − y, 1 − x) and, in fact, we claim that if 0 < y ≤ x < 1,

    G_D(y, x) = 2 y (1 − x).

This can be obtained as a limit of the expression in Proposition 6.22.

The two-dimensional GFF is trickier to define. We start by defining the Green's function for Brownian motion. There are several different ways to define this and there is some arbitrariness in the choice of multiplicative constant; we will make one choice. As before, we will consider R² as the complex plane C whenever it is convenient. Suppose D ⊂ C is a bounded domain and z, w ∈ D are distinct points. For each positive integer N, let A_N be the set of points in the lattice N^{−1} Z² that lie in D. Let z_N, w_N denote lattice points in N^{−1} Z² closest to z, w (if there is more than one closest point, just choose one). The constant π/2 in the following definition is for convenience.

Definition 7.25. The Green's function G_D(z, w) is defined by

    G_D(z, w) = (π/2) lim_{N→∞} G_{A_N}(z_N, w_N).
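Definition 7.25 can be probed by simulation: estimate the lattice Green's function at the center of the disk as expected visits and multiply by π/2. For the unit disk the limit turns out to be −log|z| (as stated in the next theorem); a rough sketch with modest N, so finite-size and Monte Carlo errors remain:

```python
import math
import random

def green_disk(N, w, trials=4000, seed=0):
    """Monte Carlo estimate of the lattice Green's function G(0, w): expected
    visits to w by SRW from 0 before leaving the disk {x^2 + y^2 < N^2} in
    Z^2 (an integer-lattice stand-in for A_N after scaling)."""
    rng = random.Random(seed)
    steps = ((1, 0), (-1, 0), (0, 1), (0, -1))
    total = 0
    for _ in range(trials):
        x, y = 0, 0
        while x * x + y * y < N * N:
            if (x, y) == w:
                total += 1
            dx, dy = rng.choice(steps)
            x, y = x + dx, y + dy
    return total / trials

# With z = 1/2, the prediction is (pi/2) G(0, z_N) near -log(1/2) = 0.693...
N = 20
est = (math.pi / 2) * green_disk(N, (N // 2, 0))
print(round(est, 2))
```

Larger N and more trials would sharpen the agreement.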


Theorem 7.26. For every bounded domain D and distinct z, w ∈ D, the limit exists. Moreover, it can also be obtained in the following ways.

• Let B_t be a complex Brownian motion starting at z. Let T be the first time that B_t ∉ D and let τ_s be the first time that |B_t − w| ≤ e^{−s}. Then,

(7.8)    G_D(z, w) = lim_{s→∞} s P^z{τ_s < T}.

• In analogy with Proposition 4.15,

    G_D(z, w) = −log|z − w| + E^z[log|z − B_T|].

If D is the unit disk, then using (4.12), we can see that

    G_D(0, z) = G_D(z, 0) = −log|z|.

We can compute G_D(z, w) for all simply connected domains by the Riemann mapping theorem and the following theorem.

Theorem 7.27 (Conformal invariance). Suppose f : D → D′ is a conformal transformation and z, w are distinct points in D. Then,

    G_{D′}(f(z), f(w)) = G_D(z, w).

Exercise 7.28. Taking (7.8) as given, try to derive the last theorem.

The GFF in D with Dirichlet boundary conditions is defined loosely to be the Gaussian process {h_z : z ∈ D} with covariance function G_D(z, w). However, this does not make sense because G_D(z, z) = ∞. We will approximate. Let us take D to be the unit disk and let {Z_z : z ∈ A_N} denote the Gaussian field with covariance function G_{A_N}(z, w). We would like to define

    h_z = lim_{N→∞} √(π/2) Z_{z_N},

where z_N is the point in A_N closest to z (we will not be precise about the notion of limit). Recall that Var(Z_0) grows like a constant times log N, and hence one cannot take a limit. If ρ : D → R is a smooth function with compact support, let us define formally

    h_ρ = ∫ h(z) ρ(z) dz.


Here we are writing dz for usual two-dimensional integrals in R² and not as complex integrals. This does not make sense as written because h(z) is not defined, but if it did we could write

    E[h_ρ²] = E[ (∫ h(z) ρ(z) dz)² ]
            = E[ (∫ h(z) ρ(z) dz) (∫ h(w) ρ(w) dw) ]
            = E[ ∫∫ h(z) h(w) ρ(z) ρ(w) dz dw ]
            = ∫∫ E[h(z) h(w)] ρ(z) ρ(w) dz dw
            = ∫∫ G_D(z, w) ρ(z) ρ(w) dz dw.

Although G_D(z, z) is infinite, this integral is finite.

Definition 7.29. The Dirichlet GFF on a domain D ⊂ C is the centered Gaussian process {h_ρ}, indexed by smooth functions ρ with compact support, with covariance function

    E[h_ρ h_ψ] = ∫∫ G_D(z, w) ρ(z) ψ(w) dw dz.

Theorem 7.30. The GFF is conformally invariant.

Let us state what we mean by this. Suppose for a moment that h(z) were a random function so we could write

    h(ρ) = ∫ h(z) ρ(z) dz.

Suppose also that f : D → D̃ is a conformal transformation, g = f^{−1}, and we define h̃(w) = h(g(w)) and ρ̃(w) = ρ(g(w)) |g′(w)|², so that

    h̃(f(z)) = h(z),   ρ̃(f(z)) = ρ(z) |f′(z)|^{−2}.

Then,

    ∫ h(z) ρ(z) dz = ∫ h̃(f(z)) ρ̃(f(z)) |f′(z)|² dz = ∫ h̃(w) ρ̃(w) dw.

This is formal, but we can rigorously define

    h̃(ρ̃) = h(ρ),

and then the statement is that h̃ is a GFF in D̃.
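In the one-dimensional case the pairing h_ρ can be simulated directly: with X_t = √2 Y_t on (0, 1) and ρ ≡ 1, the covariance formula gives E[h_ρ²] = ∫∫ 2 min(s, t)(1 − max(s, t)) ds dt = 1/6. A Monte Carlo sketch with arbitrary grid and sample sizes:

```python
import math
import random

def gff_pairing(n_samples=10000, m=50, seed=0):
    """Monte Carlo estimate of E[h_rho^2] for the 1D Dirichlet GFF on (0,1)
    with test function rho = 1: X_t = sqrt(2)*(W_t - t*W_1), and h_rho is
    approximated by a Riemann sum of X over a grid of [0,1]."""
    rng = random.Random(seed)
    dt = 1.0 / m
    acc = 0.0
    for _ in range(n_samples):
        w = [0.0]
        for _ in range(m):
            w.append(w[-1] + rng.gauss(0.0, math.sqrt(dt)))
        h = sum(math.sqrt(2.0) * (w[k] - k * dt * w[-1])
                for k in range(m + 1)) * dt
        acc += h * h
    return acc / n_samples

# Theory: E[h_rho^2] = 1/6 = 0.1666...
m2 = gff_pairing()
print(round(m2, 2))
```

The estimate should land near 1/6 up to discretization and Monte Carlo error.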

Further Reading


The conformal invariance of the two-dimensional GFF has made it an extremely important object in the study of statistical physics in two dimensions. We will not even try to go into this!

Further Reading Research in probability uses the framework of probability in terms of measure theory. For a classic treatment including the necessary measure theory see [2] and for a short introduction see [4]. There are many other texts as well. Brownian motion is a fascinating subject in itself. An introduction can be found in the books on probability. For a detailed account, see [15]. The Schramm-Loewner evolution (SLE) has been a particular interest of the author. The original paper [16] is worth reading because of its originality and how influential it has been for probability in the twenty-first century. For more details see [7, 9]. Much recent work has been relating SLE, the Brownian loop soup, and the GFF.

Appendix A

Some Background and Extra Topics

A.1. Borel-Cantelli lemma

Suppose E_1, E_2, ... is a sequence of events and I_n = 1_{E_n} is the indicator function of the event E_n. Then the number of events that occur is given by

    V = ∑_{n=1}^∞ I_n.

Note that

    E[V] = ∑_{n=1}^∞ P(E_n).

Probabilists often ask the question: do the events happen infinitely often? An equivalent question is to ask if the random variable V is finite or equals infinity. One of the basic tools for deciding this is the Borel-Cantelli Lemma. We will give it in two parts. The first part is immediate but is still extremely useful.

Proposition A.1 (Borel-Cantelli Lemma 1). If E[V] < ∞, then with probability one V < ∞.

The converse is not true. Indeed, it is possible for a finite random variable to have an infinite expectation. However, if the random variables are independent, we do get the converse.


Proposition A.2 (Borel-Cantelli Lemma 2). If the events E_1, E_2, ... are independent and E[V] = ∞, then with probability one V = ∞.

Proof. If V < ∞, then there must be some n such that none of the events E_n, E_{n+1}, E_{n+2}, ... occur. We will show that for all n, if A_n = E_n ∪ E_{n+1} ∪ ⋯, then P(A_n) = 1. Note that

    A_n^c = E_n^c ∩ E_{n+1}^c ∩ ⋯,

where the superscript c denotes complement. Using independence and the inequality 1 − x ≤ e^{−x},

    P(A_n^c) = ∏_{j=n}^∞ [1 − P(E_j)]
             ≤ ∏_{j=n}^∞ exp{−P(E_j)}
             = exp{ −∑_{j=n}^∞ P(E_j) } = 0.   □
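Both halves of the lemma are easy to see in simulation: with P(E_n) = 1/n² the total count stays small, while with independent events of probability 1/n the count keeps growing. A sketch:

```python
import random

def occurrences(p_fn, n_max, seed=0):
    """Count how many of the independent events E_1, ..., E_{n_max} occur,
    where P(E_n) = p_fn(n)."""
    rng = random.Random(seed)
    return sum(1 for n in range(1, n_max + 1) if rng.random() < p_fn(n))

# sum 1/n^2 < infinity: Lemma 1 predicts finitely many occurrences.
# sum 1/n = infinity (independent): Lemma 2 predicts infinitely many.
for n_max in (10**2, 10**4, 10**6):
    few = occurrences(lambda n: 1 / n**2, n_max)
    many = occurrences(lambda n: 1 / n, n_max, seed=1)
    print(n_max, few, many)
```

The first column of counts stabilizes while the second grows roughly like log n_max.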

A.2. Second moment method

There are many random variables, especially sums of other random variables, for which it is relatively straightforward to calculate expectations. Sometimes what one wants to know is whether or not the random variable equals zero. It is possible for a nonnegative random variable to have a very large expectation and yet be equal to zero most of the time. For example, if P{X_k = k²} = 1/k and P{X_k = 0} = 1 − 1/k, then E[X_k] = k and hence

    lim_{k→∞} E[X_k] = ∞,   lim_{k→∞} P{X_k ≠ 0} = 0.

If one can calculate higher moments, then one can give bounds. From a practical perspective, the second moment, E[X²], can often be estimated and this does give a bound on P{X_k ≠ 0}.


Proposition A.3. Suppose Y is a nonnegative random variable with mean µ. Then if 0 ≤ a ≤ 1,

(A.1)    P{Y ≥ aµ} ≥ (1 − a)² µ² / E(Y²).

Proof. We may assume that µ = 1, for otherwise we can consider Y/µ. Since E[Y; Y < a] ≤ a, we have E[Y; Y ≥ a] ≥ 1 − a. Using the conditional form of the inequality E[X²] ≥ E[X]², we get

    E[Y²] ≥ E[Y²; Y ≥ a] = P{Y ≥ a} E[Y² | Y ≥ a]
          ≥ P{Y ≥ a} (E[Y | Y ≥ a])²
          = P{Y ≥ a} (E[Y; Y ≥ a] / P{Y ≥ a})²
          ≥ (1 − a)² / P{Y ≥ a}.   □

We now give a useful generalization of the second Borel-Cantelli Lemma where the assumption of independence is replaced with the weaker assumption "independent up to a multiplicative constant". Our conclusion is that V = ∞ occurs with positive probability although not necessarily with probability one.

Proposition A.4. Suppose E_1, E_2, ... is a sequence of events and

    V_n = ∑_{j=1}^n 1_{E_j}.

Suppose there exists c_0 < ∞ such that for all j ≠ k,

(A.2)    P(E_j ∩ E_k) ≤ c_0 P(E_j) P(E_k).

If E[V_∞] = ∞, then

    P{V_∞ = ∞} ≥ 1/c_0.


Proof. Note that

    E[V_n²] = ∑_{j=1}^n ∑_{k=1}^n P(E_j ∩ E_k)
            ≤ ∑_{j=1}^n P(E_j) + ∑_{j=1}^n ∑_{k=1}^n c_0 P(E_j) P(E_k)
            = E[V_n] + c_0 E[V_n]².

Therefore, using (A.1),

    P{V_n ≥ a E[V_n]} ≥ (1 − a)² E[V_n]² / (E[V_n] + c_0 E[V_n]²).

Letting n → ∞ and then a ↓ 0, we get the result.   □
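A quick numerical check of (A.1): for Y exponential with rate 1 we have µ = 1 and E[Y²] = 2, so the bound reads P{Y ≥ a} ≥ (1 − a)²/2, while the true value is e^{−a}. A sketch:

```python
import math
import random

def check_second_moment_bound(a, n_samples=100000, seed=0):
    """Compare the empirical P{Y >= a} for Y ~ Exponential(1) against the
    second-moment bound (1 - a)^2 / 2 from (A.1) (mu = 1, E[Y^2] = 2)."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_samples) if rng.expovariate(1.0) >= a)
    empirical = hits / n_samples
    bound = (1 - a) ** 2 / 2
    return empirical, bound

for a in (0.1, 0.5, 0.9):
    emp, bound = check_second_moment_bound(a)
    print(a, round(emp, 3), round(bound, 3), emp >= bound)
```

The bound is far from sharp here; its value is that it needs only the first two moments.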

Exercise A.5. For every c0 > 1, give an example satisfying (A.2) such that P{V∞ = ∞} = 1/c0 . Hence the conclusion of the proposition cannot be strengthened.

A.3. Compound Poisson process

The Poisson process was mentioned in Section 1.6 but we recall it here.

Definition A.6. A process N_t, 0 ≤ t < ∞, is called a (right-continuous) Poisson process with rate λ > 0 if:

(1) For each s < t, N_t − N_s is a Poisson random variable with mean (t − s)λ, independent of {N_r : r ≤ s}.

(2) With probability one, the function t ↦ N_t is right-continuous, that is, for every s, lim_{t↓s} N_t = N_s.

Exercise A.7. Show that the right continuity condition (2) is equivalent to the following, assuming that {Nt } satisfies (1). • With probability one, for every t ≥ 0, there exists ϵ > 0 such that Ns = Nt for t ≤ s ≤ t + ϵ.


In order to construct a Poisson process, let T_1, T_2, ... be independent random variables, each exponential with rate λ, and let τ_0 = 0, τ_k = T_1 + ⋯ + T_k. We view τ_k as the time of the kth jump. Then we set N_t = k if τ_k ≤ t < τ_{k+1}.

Exercise A.8. If you have not seen this, convince yourself that this construction gives a Poisson process.

The Poisson process takes a jump of size +1 whenever it jumps. The compound Poisson process is similar except that when the process jumps it chooses the jump size from a probability distribution. We will write a probability distribution as a measure µ# on the reals, that is, if U ⊂ R, µ#(U) = P{X ∈ U}. The measure µ# is determined by its probability distribution function F:

    F(b) = F_X(b) = P{X ≤ b} = µ#((−∞, b]).

When we say that a random variable has distribution µ#, it is the same as saying it has distribution function F. Let Y_1, Y_2, ... be independent random variables with distribution µ#.

Definition A.9. The compound Poisson process with rate λ and probability distribution µ# is given by

    X_t = Y_1 + Y_2 + ⋯ + Y_{N_t},

where Y_1, Y_2, ... are independent random variables with distribution µ# and N_t is a Poisson process with rate λ independent of Y_1, Y_2, .... Implicit in this definition is X_t = 0 if N_t = 0.

This definition uses two "parameters" (λ, µ#), but these can be combined into one parameter called the Lévy measure: µ := λ µ#. Here µ is the measure with µ(U) = λ µ#(U); in particular, µ(R) = λ.

Definition A.10. The compound Poisson process with finite Lévy measure µ is the compound Poisson process as above with λ = µ(R), µ# = λ^{−1} µ.

Proposition A.11. Suppose X_t is a compound Poisson process and u ∈ R is such that the moment generating function of the Y_j, M(u) = E[exp{uY_j}], is finite. Then

    E[exp{uX_t}] = exp{tλ [M(u) − 1]}.


A. Some Background and Extra Topics

Note that we can write
λ [M(u) − 1] = λ ∫_{−∞}^{∞} [e^{ux} − 1] dF(x) = ∫_{−∞}^{∞} [e^{ux} − 1] µ(dx).

Proof. This is a straightforward computation using the Taylor series for the exponential:
E[exp{uXt}] = Σ_{n=0}^{∞} P{Nt = n} E[exp{uXt} | Nt = n]
= Σ_{n=0}^{∞} P{Nt = n} M(u)^n
= Σ_{n=0}^{∞} e^{−λt} ((λt)^n / n!) M(u)^n
= e^{−λt} Σ_{n=0}^{∞} [λt M(u)]^n / n!
= exp{λt [M(u) − 1]}. □

We note that the proposition is valid even if M(u) = ∞ since in that case both sides of the equation equal infinity.
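Proposition A.11 is easy to test numerically. The following sketch (all parameter values are illustrative choices, not from the text) simulates a compound Poisson process with integer jumps and compares the empirical moment generating function with the closed form exp{tλ[M(u) − 1]}:

```python
import math
import random

def compound_poisson(t, lam, jumps, weights, rng):
    """Sample X_t: jump times from a rate-lam Poisson process on [0, t],
    jump sizes drawn independently from the distribution mu#."""
    x, s = 0.0, rng.expovariate(lam)
    while s <= t:
        x += rng.choices(jumps, weights)[0]
        s += rng.expovariate(lam)
    return x

rng = random.Random(0)
t, lam, u = 1.0, 2.0, 0.3
jumps, weights = [1, 2], [0.6, 0.4]

# M(u) = E[exp(u Y_j)] for the jump distribution mu#
M = sum(w * math.exp(u * y) for y, w in zip(jumps, weights))
exact = math.exp(t * lam * (M - 1))

n = 200_000
mc = sum(math.exp(u * compound_poisson(t, lam, jumps, weights, rng))
         for _ in range(n)) / n

print(exact, mc)  # the two values agree to within Monte Carlo error
```

Since the jump sizes here take only the values 1 and 2, the same run also illustrates the Lévy-measure point of view: µ{1} = 1.2 and µ{2} = 0.8, with µ(R) = λ = 2.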

A.4. Negative binomial process

A compound Poisson process that arises in the analysis of loop measures is the negative binomial process. We first define the negative binomial distribution.

Definition A.12. A nonnegative integer-valued random variable K has a negative binomial distribution with parameters p ∈ (0, 1) and t ≥ 0 if
P{K = n} = (n+t−1 choose n) (1 − p)^n p^t,  n = 0, 1, 2, . . . .
Here the binomial coefficient is defined by
(n+t−1 choose n) = (n + t − 1)(n + t − 2) · · · t / n!.


For shorthand, we will write K ∼ NB(t, p). We will say that K has an augmented NB distribution, written K ∼ ANB(t, p), if it has the distribution of t + K′ where K′ ∼ NB(t, p), that is,
P{K = t + n} = (n+t−1 choose n) (1 − p)^n p^t,  n = 0, 1, 2, . . . .
The terminology and the fact that this is a probability distribution come from the generalization of the binomial theorem to negative exponents:
(A.3)  p^{−t} = Σ_{n=0}^{∞} (n+t−1 choose n) (1 − p)^n.
This is valid for |1 − p| < 1 and is just the Taylor series for g(x) = x^{−t} about 1, evaluated at x = p.

If t = 0, then K = 0 with probability one. If K ∼ NB(1, p), then K has a geometric distribution with parameter p,
P{K = n} = (1 − p)^n p.
This represents the number of failures until a success for independent trials, each with probability p of success. The augmented random variable, K + 1, is also called a geometric and represents the number of trials until a success. More generally, if t is a positive integer, the negative binomial distribution represents the number of failures and the augmented NB distribution represents the number of trials until t successes.

In the literature, the term negative binomial distribution is used for both what we call the negative binomial and the augmented NB distribution. We are using the term augmented so that we can distinguish them.

We can also write
(n+t−1 choose n) = Γ(n + t) / (n! Γ(t)),
where
Γ(r) = ∫_0^∞ x^{r−1} e^{−x} dx


is the Gamma function.

Proposition A.13. If K ∼ NB(t, p), then its moment generating function M(u) is given by
M(u) = E[exp{uK}] = [p / (1 − (1 − p)e^u)]^t,
provided that (1 − p)e^u < 1. In particular,
E[K] = M′(0) = t (1 − p)/p.

Proof. This is a straightforward calculation using (A.3):
E[exp{uK}] = Σ_{n=0}^{∞} (n+t−1 choose n) p^t (1 − p)^n e^{un}
= p^t Σ_{n=0}^{∞} (n+t−1 choose n) [(1 − p)e^u]^n
= p^t [1 − (1 − p)e^u]^{−t}. □

The proof uses the fact that if X has a finite MGF M(u) defined in a neighborhood of the origin, then E[X^n] = M^{(n)}(0). This can be seen formally by differentiating both sides of
E[exp{uX}] = Σ_{n=0}^{∞} (u^n / n!) E[X^n],
and the existence of the MGF justifies this calculation rigorously.
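Both the identity (A.3) and the mean formula can be sanity-checked by summing the NB(t, p) probabilities numerically (the parameter values below are arbitrary; the log-Gamma function is used to avoid overflow):

```python
import math

def nb_pmf(n, t, p):
    """P{K = n} for K ~ NB(t, p), using the Gamma-function form of the
    binomial coefficient: Gamma(n+t) / (n! Gamma(t))."""
    log_coeff = math.lgamma(n + t) - math.lgamma(n + 1) - math.lgamma(t)
    return math.exp(log_coeff + n * math.log(1 - p) + t * math.log(p))

t, p = 2.5, 0.6
total = sum(nb_pmf(n, t, p) for n in range(200))
mean = sum(n * nb_pmf(n, t, p) for n in range(200))

print(total)  # close to 1, confirming (A.3)
print(mean)   # close to t(1-p)/p
```

The truncation at n = 200 is harmless here because the terms decay geometrically like (1 − p)^n.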

Proposition A.14. If X, Y are independent random variables with X ∼ NB(s, p), Y ∼ NB(t, p), then X + Y ∼ NB(s + t, p).

Proof. By independence and Proposition A.13, the MGF of X + Y is
E[e^{u(X+Y)}] = [p/(1 − (1 − p)e^u)]^s [p/(1 − (1 − p)e^u)]^t = [p/(1 − (1 − p)e^u)]^{s+t},
which is the MGF of NB(s + t, p). The distribution of a random variable is determined by its MGF provided that the function exists in a neighborhood of the origin. □


Definition A.15.
• A process Kt taking values in the nonnegative integers is called a negative binomial process with parameter p ∈ (0, 1) if for every s < t, the random variable Kt − Ks is conditionally independent of {Kr : r ≤ s} with distribution NB(t − s, p).
• We say that Yt := t + Kt is an augmented NB process with parameter p.

Proposition A.16. The negative binomial process with parameter 1 − p is the same as the compound Poisson process with Lévy measure supported on the positive integers,
µk = µ{k} = p^k / k,  k = 1, 2, . . . .

Proof. Using the Taylor series for − log(1 − x) about the origin, we see that
Σ_{k=1}^{∞} [e^{uk} − 1] µk = Σ_{k=1}^{∞} (pe^u)^k / k − Σ_{k=1}^{∞} p^k / k = − log[1 − pe^u] + log(1 − p).
Therefore,
E[e^{uKt}] = exp{ t Σ_{k=1}^{∞} [e^{uk} − 1] µk } = [(1 − p) / (1 − pe^u)]^t.
Comparing with Proposition A.13, we see that Kt ∼ NB(t, 1 − p). □
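The identity in this proof can be checked numerically by truncating the sum over k at a large cutoff (the values of p, u, t below are illustrative; we need pe^u < 1):

```python
import math

p, u, t = 0.5, 0.2, 3.0   # need p * e^u < 1

# Lévy exponent of the compound Poisson process with mu_k = p^k / k
levy_sum = sum((math.exp(u * k) - 1) * p ** k / k for k in range(1, 400))

# Closed form from the Taylor series of -log(1 - x)
closed = -math.log(1 - p * math.exp(u)) + math.log(1 - p)

# MGF of NB(t, q) from Proposition A.13, with q = 1 - p
q = 1 - p
mgf_nb = (q / (1 - (1 - q) * math.exp(u))) ** t
mgf_cp = math.exp(t * levy_sum)

print(levy_sum, closed)  # the two values agree
print(mgf_cp, mgf_nb)    # the two MGFs agree
```

The truncation at k = 400 is harmless since the terms decay like (pe^u)^k.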



A.5. Increasing jump processes

We will generalize the idea to processes that take infinitely many jumps in any finite time interval. We will restrict to the case where the jumps are all in the positive direction. To specify the process, we will give a positive measure µ on (0, ∞). This can be a general σ-finite measure, but to avoid measure-theoretic details, we will consider measures with a density, that is, measures µ such that for all x > 0,
F(x) := µ[x, ∞) = ∫_x^∞ f(t) dt,


for some function f. We will allow
µ(0, ∞) = ∫_0^∞ f(t) dt = ∞,
but we will impose the condition
∫_0^1 t f(t) dt < ∞.
Then the process is characterized by the relation
P{Xt+s − Xt ≥ x | {Xr : r ≤ t}} = s F(x) + o(s),  s ↓ 0.
This is usually expressed in terms of the Lévy measure µ defined by µ[x, ∞) = F(x). If F(0+) < ∞, then the process is exactly the compound Poisson process with measure µ. To construct the process, we approximate F by FJ(x) = min{F(x), J}, let Xt,J be the compound Poisson process generated by FJ, and then let
Xt = lim_{J→∞} Xt,J.

This is an increasing limit.

Proposition A.17. Suppose Xt is an increasing Lévy process whose Lévy density satisfies
∫_0^∞ [e^{ux} − 1] f(x) dx < ∞.
Then,
E[exp{uXt}] = exp{ t ∫_0^∞ [e^{ux} − 1] f(x) dx }.

Proof. Let fm(x) = f(x) 1{x ≥ 1/m} and let Xt,m denote the Lévy jump process associated to fm. Then Xt,m is a compound Poisson process and hence we have
E[exp{uXt,m}] = exp{ t ∫_0^∞ [e^{ux} − 1] fm(x) dx }.

Note that
Xt = lim_{m→∞} Xt,m


and that this is an increasing limit. In particular,
lim_{m→∞} E[exp{uXt,m}] = E[exp{uXt}].
More precisely, this follows for u ≥ 0 by the monotone convergence theorem and for u < 0 by the dominated convergence theorem. Similarly,
lim_{m→∞} ∫_0^∞ [e^{ux} − 1] fm(x) dx = ∫_0^∞ [e^{ux} − 1] f(x) dx. □



A.6. Gamma process

The Gamma process is the continuous limit of the negative binomial process. Let us start with the easiest case. Suppose Tn is a geometric random variable,
P{Tn = k} = (1 − 1/n)^k (1/n),  k = 0, 1, 2, . . . .
Then,
lim_{n→∞} n P{Tn = kn} = lim_{n→∞} (1 − 1/n)^{kn} = e^{−k},
and hence
lim_{n→∞} P{Tn/n ≤ x} = ∫_0^x e^{−t} dt.
More generally, if Tn ∼ NB(t, 1/n),
lim_{n→∞} n P{Tn = kn} = lim_{n→∞} (Γ(kn + t)/(Γ(kn + 1) Γ(t))) (1/n^{t−1}) (1 − 1/n)^{kn} = (1/Γ(t)) k^{t−1} e^{−k}.

The Gamma function is defined for r > 0 by
Γ(r) = ∫_0^∞ x^{r−1} e^{−x} dx.

If r is an integer, Γ(r) = (r − 1)!, and the Gamma function is the natural extension of the factorial function to noninteger r. Stirling's formula states that as r → ∞,
Γ(r + 1) ∼ √(2π) r^{r+1/2} e^{−r}.

Using this we can see that for each t > 0,
lim_{r→∞} Γ(t + r) / (r^t Γ(r)) = 1.
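Both Stirling's formula and this last limit are easy to check with the log-Gamma function (the values of r and t below are arbitrary choices):

```python
import math

# Stirling: Gamma(r+1) ~ sqrt(2*pi) * r^(r + 1/2) * e^(-r) as r -> infinity
r = 500.0
stirling_ratio = math.exp(
    math.lgamma(r + 1)
    - (0.5 * math.log(2 * math.pi) + (r + 0.5) * math.log(r) - r)
)

# For fixed t, Gamma(t + r) / (r^t Gamma(r)) -> 1 as r -> infinity
t, big_r = 2.5, 1.0e6
limit_ratio = math.exp(
    math.lgamma(t + big_r) - math.lgamma(big_r) - t * math.log(big_r)
)

print(stirling_ratio)  # close to 1
print(limit_ratio)     # close to 1
```

Working with `lgamma` rather than `gamma` avoids floating-point overflow, since Γ(r) itself overflows a double already for r a little above 171.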

Definition A.18. A random variable T has a Gamma distribution with parameters t and λ if it has density
(λ^t / Γ(t)) x^{t−1} e^{−λx},  x > 0.
The case t = 1 is called the exponential distribution. The parameter λ is called the rate. We write T ∼ Gamma(t, λ).

Proposition A.19.
• If T ∼ Gamma(t, λ) and X = rT with r > 0, then X ∼ Gamma(t, λ/r).
• If u < λ and T ∼ Gamma(t, λ), then
M(u) := E[e^{uT}] = [λ/(λ − u)]^t.
In particular,
E[T] = M′(0) = t/λ.

Definition A.20. A process Xt is a Gamma process with rate λ if for each s < t, the random variable Xt − Xs is independent of {Xr : r ≤ s} with distribution Gamma(t − s, λ).

We will focus on the case λ = 1; the more general case can be handled easily with this exercise.

Exercise A.21. If Xt is a Gamma process with rate λ, r > 0, and Yt = rXt, then Yt is a Gamma process with rate λ/r.

The next proposition is the continuous analogue of Proposition A.16.

Proposition A.22. The Gamma process with rate 1 is the increasing jump process with Lévy measure having density
f(x) = (1/x) e^{−x},  x > 0.


Proof. Note that if |u| < 1,
∫_0^∞ [e^{ux} − 1] f(x) dx = lim_{R→∞} ∫_0^R [e^{ux} − 1] f(x) dx
= lim_{R→∞} ∫_0^R [Σ_{n=1}^{∞} (ux)^n / n!] (1/x) e^{−x} dx
= lim_{R→∞} Σ_{n=1}^{∞} (u^n / n!) ∫_0^R x^{n−1} e^{−x} dx
= Σ_{n=1}^{∞} (u^n / n!) ∫_0^∞ x^{n−1} e^{−x} dx
= Σ_{n=1}^{∞} u^n / n = − log(1 − u).

□

Proposition A.23. If Kt is an augmented NB process with parameter p and Yt is an independent Gamma process with rate 1, then Zt := Y_{Kt} is a Gamma process with rate p.

Proof. One can check that the process has independent, identically distributed increments and hence is a Lévy process. Given Kt = t + n, the random variable Zt = Y_{t+n} has a Gamma(t + n, 1) distribution, so for u < p,
E[e^{uZt}] = Σ_{n=0}^{∞} P{Kt = t + n} E[e^{uZt} | Kt = t + n]
= Σ_{n=0}^{∞} (Γ(n + t)/(n! Γ(t))) p^t (1 − p)^n (1 − u)^{−(t+n)}
= (p^t/(1 − u)^t) Σ_{n=0}^{∞} (Γ(n + t)/(n! Γ(t))) [(1 − p)/(1 − u)]^n
= (p^t/(1 − u)^t) [1 − (1 − p)/(1 − u)]^{−t}
= [p/(p − u)]^t.
This is the MGF of a Gamma(t, p) distribution. □
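The subordination in Proposition A.23 (sampling the rate-1 Gamma process at the augmented NB time, as in the proof) can be checked by simulation. The sketch below uses the standard Gamma-Poisson mixture representation of the negative binomial to handle non-integer t; that representation and the parameter values are assumptions not taken from the text.

```python
import math
import random

def nb_sample(t, p, rng):
    """N ~ NB(t, p) via the Gamma-Poisson mixture: N | L ~ Poisson(L)
    with L ~ Gamma(shape t, rate p/(1-p)) has the NB(t, p) pmf."""
    lam = rng.gammavariate(t, (1 - p) / p)  # gammavariate(shape, scale)
    n, s = 0, rng.expovariate(1.0)
    while s < lam:                          # count Poisson(lam) arrivals
        n += 1
        s += rng.expovariate(1.0)
    return n

rng = random.Random(1)
t, p = 1.5, 0.5
samples = []
for _ in range(100_000):
    k = t + nb_sample(t, p, rng)              # augmented NB time K_t
    samples.append(rng.gammavariate(k, 1.0))  # Z_t = Y_{K_t} ~ Gamma(K_t, 1)

mean = sum(samples) / len(samples)
var = sum((z - mean) ** 2 for z in samples) / len(samples)

print(mean, t / p)       # Gamma(t, p) mean is t/p
print(var, t / p ** 2)   # Gamma(t, p) variance is t/p^2
```

The empirical mean and variance match t/p and t/p², the moments of a Gamma(t, p) distribution, to within Monte Carlo error.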




A.7. Lévy processes

We have been looking at examples of Lévy processes so it makes sense to discuss the general theory.

Definition A.24. A process Xt, t ≥ 0, taking values in the reals is a (right continuous) Lévy process (starting at the origin) if with probability one the function t 7→ Xt is right-continuous and the following holds.
• For each s < t, the random variable Xt − Xs is independent of {Xr : r ≤ s} and has the same distribution as Xt−s.

Looking at Lévy processes leads to describing certain kinds of distributions.

Definition A.25. A random variable X has an infinitely divisible distribution if for every positive integer n we can find independent, identically distributed random variables Y1, . . . , Yn such that X has the same distribution as Y1 + · · · + Yn.

Exercise A.26. We will give some examples of infinitely divisible distributions but it is probably worth some time to show that many distributions are not infinitely divisible.
(1) Suppose that X is an infinitely divisible random variable that takes on only a finite number of values. Show that X takes on only one value, that is, it is a constant.
(2) Let X be a uniform random variable on [0, 1], that is, X has density f(x) = 1 for 0 ≤ x ≤ 1 and f(x) = 0 for other x. Show that X is not infinitely divisible.

Proposition A.27. If Xt is a Lévy process, then Xt has an infinitely divisible distribution.

Proof. We can write
Xt = Xt/n + (X2t/n − Xt/n) + · · · + (Xnt/n − X(n−1)t/n),
and the assumptions imply that the n random variables on the right-hand side are independent and identically distributed. □


Suppose Xt is a Lévy process. Let ϕt denote the characteristic function of Xt,
ϕt(u) = E[e^{iuXt}].
Since Xt+s = Xs + (Xt+s − Xs), we get ϕt+s(u) = ϕs(u) ϕt(u).

Lemma A.28. For all t > 0 and u ∈ R, ϕt(u) ̸= 0.

Proof. We know that
lim_{n→∞} Xt/n = X0 = 0.
This implies that for fixed u,
lim_{n→∞} E[e^{iuXt/n}] = E[e^{iuX0}] = 1.
Hence, at least for n sufficiently large, ϕt/n(u) ̸= 0. But ϕt(u) = [ϕt/n(u)]^n. □

Given this we can define Lt(u) = log ϕt(u). Note that Lt+s(u) = Lt(u) + Ls(u). Using continuity, we conclude that Lt(u) = t L1(u) and hence we can write
ϕt(u) = exp{t L(u)}
where L(u) = log ϕ1(u). Summarizing: the distribution of X1 determines the distribution of the remaining Xt, and the distribution of X1 is determined by its characteristic function (or the logarithm of it).

Here we have defined the logarithm of a characteristic function. In order to define this we need that the characteristic function does not take on the value zero. It requires a little more care because we are taking the logarithm of a complex number, which is multi-valued. However, there is a unique function L(u) such that
E[exp{iuX}] = exp{L(u)},
with the conditions that L is continuous in u and L(0) = 0.


For the jump processes whose Lévy measure has a density f that we have been describing,
L(u) = ∫_{−∞}^{∞} [e^{iux} − 1] f(x) dx.
One might worry about the convergence of this integral, but we claim that ∫_{−∞}^{∞} |e^{iux} − 1| f(x) dx < ∞. Indeed, since |e^{iux} − 1| ≤ 2, for every ϵ > 0,
∫_{|x|≥ϵ} |e^{iux} − 1| f(x) dx ≤ 2 ∫_{|x|≥ϵ} f(x) dx < ∞.
For |x| ≤ ϵ, we can use e^{iux} − 1 = iux + O(u²x²).

Bibliography

[1] Itai Benjamini, Harry Kesten, Yuval Peres, and Oded Schramm, Geometry of the uniform spanning forest: transitions in dimensions 4, 8, 12, . . . , Ann. of Math. (2) 160 (2004), no. 2, 465–491, DOI 10.4007/annals.2004.160.465. MR2123930
[2] Patrick Billingsley, Probability and measure, Wiley Series in Probability and Statistics, John Wiley & Sons, Inc., Hoboken, NJ, 2012. Anniversary edition [of MR1324786]; with a foreword by Steve Lalley and a brief biography of Billingsley by Steve Koppes. MR2893652
[3] Geoffrey R. Grimmett and David R. Stirzaker, Probability and random processes, Oxford University Press, Oxford, 2020. Fourth edition [of 0667520]. MR4229142
[4] Davar Khoshnevisan, Probability, Graduate Studies in Mathematics, vol. 80, American Mathematical Society, Providence, RI, 2007, DOI 10.1090/gsm/080. MR2296582
[5] Gregory F. Lawler, A self-avoiding random walk, Duke Math. J. 47 (1980), no. 3, 655–693. MR587173
[6] Gregory F. Lawler, Introduction to stochastic processes, 2nd ed., Chapman & Hall/CRC, Boca Raton, FL, 2006. MR2255511
[7] Gregory F. Lawler, Scaling limits and the Schramm-Loewner evolution, Probab. Surv. 8 (2011), 442–495, DOI 10.1214/11-PS189. MR2861136
[8] Gregory F. Lawler, Topics in loop measures and the loop-erased walk, Probab. Surv. 15 (2018), 28–101, DOI 10.1214/17-PS293. MR3770886
[9] Gregory F. Lawler, Conformally invariant processes in the plane, Mathematical Surveys and Monographs, vol. 114, American Mathematical Society, Providence, RI, 2005, DOI 10.1090/surv/114. MR2129588
[10] Gregory F. Lawler and Vlada Limic, Random walk: a modern introduction, Cambridge Studies in Advanced Mathematics, vol. 123, Cambridge University Press, Cambridge, 2010, DOI 10.1017/CBO9780511750854. MR2677157
[11] Yves Le Jan, Markov paths, loops and fields, Lecture Notes in Mathematics, vol. 2026, Springer, Heidelberg, 2011. Lectures from the 38th Probability Summer School held in Saint-Flour, 2008; École d'Été de Probabilités de Saint-Flour [Saint-Flour Probability Summer School], DOI 10.1007/978-3-642-21216-1. MR2815763



[12] David A. Levin and Yuval Peres, Markov chains and mixing times, American Mathematical Society, Providence, RI, 2017. Second edition of [MR2466937]; with contributions by Elizabeth L. Wilmer; with a chapter on "Coupling from the past" by James G. Propp and David B. Wilson, DOI 10.1090/mbk/107. MR3726904
[13] Russell Lyons and Yuval Peres, Probability on trees and networks, Cambridge Series in Statistical and Probabilistic Mathematics, vol. 42, Cambridge University Press, New York, 2016, DOI 10.1017/9781316672815. MR3616205
[14] Michael B. Marcus and Jay Rosen, Markov processes, Gaussian processes, and local times, Cambridge Studies in Advanced Mathematics, vol. 100, Cambridge University Press, Cambridge, 2006, DOI 10.1017/CBO9780511617997. MR2250510
[15] Peter Mörters and Yuval Peres, Brownian motion, Cambridge Series in Statistical and Probabilistic Mathematics, vol. 30, Cambridge University Press, Cambridge, 2010. With an appendix by Oded Schramm and Wendelin Werner, DOI 10.1017/CBO9780511750489. MR2604525
[16] Oded Schramm, Scaling limits of loop-erased random walks and uniform spanning trees, Israel J. Math. 118 (2000), 221–288, DOI 10.1007/BF02803524. MR1776084
[17] W. Werner and E. Powell, Lectures on the Gaussian free field, arXiv:2004.04720.
[18] Frank Spitzer, Principles of random walk, 2nd ed., Graduate Texts in Mathematics, vol. 34, Springer-Verlag, New York-Heidelberg, 1976. MR0388547
[19] Alain-Sol Sznitman, Topics in occupation times and Gaussian free fields, Zurich Lectures in Advanced Mathematics, European Mathematical Society (EMS), Zürich, 2012, DOI 10.4171/109. MR2932978
[20] David Bruce Wilson, Generating random spanning trees more quickly than the cover time, Proceedings of the Twenty-eighth Annual ACM Symposium on the Theory of Computing (Philadelphia, PA, 1996), ACM, New York, 1996, pp. 296–303, DOI 10.1145/237814.237880. MR1427525

Index

augmented NB distribution, 185 augmented NB process, 187 binary tree, 15 Borel-Cantelli Lemma, 179, 181 boundary perturbation, 167 boundary Poisson kernel, 34 boundary vertex, 8 bounded convergence theorem, 11 box dimension, 155 Brownian bridge, 158 Brownian motion, 156 capacity in Z2 , 102 in Zd , d ≥ 3, 93 capacity parametrization, 169 Cauchy-Riemann equation, 171 Cauchy-Riemann equations, 159 central charge, 45, 167 central limit theorem, 70 Chapman-Kolmogorov equations, 2 characteristic function, 71, 193 comparable, 97 compound Poisson process, 183, 188 concatenation, 18 conformal invariance, 163, 166 conformal transformation, 160

connective constant, 172 covariance matrix, 135 Cramer’s rule, 31 current directed, 59 undirected, 59 Dirichlet problem, 12, 170 domain Markov property, 36, 166 Doob h-process, 105 dual graph, 123 dual lattice, 122 dual spanning tree, 122, 125 edge self-edge, 7 undirected, 7 elementary loop, 47 escape probability, 93 Euler’s constant, 81 exponential distribution, 20, 190 fractal dimension, 153 free boundary, 67, 119 gambler’s ruin, 149 Gamma distribution, 65, 190 Gamma function, 78, 189 Gamma process, 65, 190 Gaussian free field (GFF), 133 conformal invariance in R2 , 176


continuous, 173 Dirichlet, 142, 146 from Markov chains, 139 generating function, 15 geometric distribution, 5 graph connected, 6 directed (digraph), 6 Laplacian, 8 graphs, 6 Green's function, 13, 18, 26 Brownian motion, 174 conformal invariance, 175 ground state, 145 growing loop at a point, 46 configuration, 50 Hamiltonian, 143 harmonic function, 8 difference estimates, 86 on Zd , 81 harmonic measure, 91 Harnack inequality, 88 Hausdorff dimension, 156 heat equation, 69 holomorphic function, 159 indegree, 6 indicator function, 4 infinitely divisible distribution, 192 integrable weight, 17 interior vertex, 8 Kirchhoff's theorem, 41 Lévy measure, 183, 188 Lévy process, 157, 192 Laplacian, 18, 76 determinant, 29 graph, 8, 40 in Rd , 8 Markov chain, 7 random walk, 8 Laplacian random walk, 32 last-exit decomposition, 87, 94 lazy walker, 69 local central limit theorem (LCLT), 71, 73

local time, 59 Loewner differential equation, 169 loop erasure, 24 backward, 25 forward, 25 loop measure rooted, 54 unrooted, 56 loop soup, 51 Brownian, 161 ordered, 51 rooted, 55 loop-erased random walk (LERW), 25, 103, 168 scaling limit, 164 Markov chain, 1 cemetery site, 14 continuous time, 20 finite, 1 irreducible, 3 killing rate, 14 Laplacian, 7 recurrent, 4 symmetric, 4 time homogeneous, 1 transient, 5 with boundary, 8 Markovian field, 147 measure, 17 memoryless property, 20 Minkowski content, 154, 167 Minkowski dimension, 154, 167 moment generating function (mgf), 15 negative binomial distribution, 62, 184 negative binomial process, 48, 187 normal distribution, 133 centered, 134 multivariate, 133, 134, 137 outdegree, 6 Pólya's theorem, 76 path, 16, 23

length, 16 nontrivial, 17 trivial, 17 period, 69 Poisson kernel, 10, 34, 142, 171 Poisson process, 51, 182 positive definite, 135 positive semidefinite, 135 Poisson process, negative rate, 63 potential kernel, 79, 81 prediction, 173 random walk excursion-reflected, 111 Laplacian, 32, 106 loop-erased, 25 simple Zd , 68 graph, 6 with darning, 111 recurrent, 76 set in Zd , 96, 98 restriction property, 163, 168 Riemann mapping theorem, 161 sausage, 154 scaling limit, 153 Schramm-Loewner evolution (SLE), 167, 171 second moment method, 180 self-avoiding polygon (SAP), 122 self-avoiding walk (SAW), 24, 172 simply connected, 122, 161 spanning forest, 117 spanning tree, 117 Stirling's formula, 6, 75 stochastic matrix, 2 stopping time, 9 transient, 76 set in Zd , 96, 98 transition matrix (probabilities), 1 tree, 37 spanning, 38 wired spanning, 41 Uniform spanning tree (UST)

infinite in Zd , 117 uniform spanning tree (UST), 38 in Z2 , 130 infinite in Z2 , 128 unrooted loop, 55 weight integrable, 17 on edges, 16 on paths, 17 Wiener process, 156 Wiener's test, 98 Wilson's algorithm, 38, 119 wired boundary, 68, 119

SELECTED PUBLISHED TITLES IN THIS SERIES

98 Gregory F. Lawler, Random Explorations, 2022
97 Anthony Bonato, An Invitation to Pursuit-Evasion Games and Graph Theory, 2022
96 Hilário Alencar, Walcy Santos, and Gregório Silva Neto, Differential Geometry of Plane Curves, 2022
95 Jörg Bewersdorff, Galois Theory for Beginners: A Historical Perspective, Second Edition, 2021
94 James Bisgard, Analysis and Linear Algebra: The Singular Value Decomposition and Applications, 2021
93 Iva Stavrov, Curvature of Space and Time, with an Introduction to Geometric Analysis, 2020
92 Roger Plymen, The Great Prime Number Race, 2020
91 Eric S. Egge, An Introduction to Symmetric Functions and Their Combinatorics, 2019
90 Nicholas A. Scoville, Discrete Morse Theory, 2019
89 Martin Hils and François Loeser, A First Journey through Logic, 2019
88 M. Ram Murty and Brandon Fodden, Hilbert's Tenth Problem, 2019
87 Matthew Katz and Jan Reimann, An Introduction to Ramsey Theory, 2018
86 Peter Frankl and Norihide Tokushige, Extremal Problems for Finite Sets, 2018
85 Joel H. Shapiro, Volterra Adventures, 2018
84 Paul Pollack, A Conversational Introduction to Algebraic Number Theory, 2017
83 Thomas R. Shemanske, Modern Cryptography and Elliptic Curves, 2017
82 A. R. Wadsworth, Problems in Abstract Algebra, 2017
81 Vaughn Climenhaga and Anatole Katok, From Groups to Geometry and Back, 2017
80 Matt DeVos and Deborah A. Kent, Game Theory, 2016
79 Kristopher Tapp, Matrix Groups for Undergraduates, Second Edition, 2016
78 Gail S. Nelson, A User-Friendly Introduction to Lebesgue Measure and Integration, 2015
77 Wolfgang Kühnel, Differential Geometry: Curves — Surfaces — Manifolds, Third Edition, 2015
76 John Roe, Winding Around, 2015

For a complete list of titles in this series, visit the AMS Bookstore at www.ams.org/bookstore/stmlseries/.

This book is an outgrowth of lectures by the author in the University of Chicago Research Experiences for Undergraduates (REU) program in 2020. The idea of the course was to expose advanced undergraduates to ideas in probability research.


The title “Random Explorations” has two meanings. First, a few topics of advanced probability are deeply explored. Second, there is a recurring theme of analyzing a random object by exploring a random path.

The book begins with Markov chains, with an emphasis on transient or killed chains that have finite Green's function. This function, and its inverse called the Laplacian, is discussed next to relate two objects that arise in statistical physics, the loop-erased random walk (LERW) and the uniform spanning tree (UST). A modern approach is used, including loop measures and soups. Understanding these approaches as the system size goes to infinity requires a deep understanding of the simple random walk, so that is studied next, followed by a look at the infinite LERW and UST. Another model, the Gaussian free field (GFF), is introduced and related to the loop measure. The emphasis in the book is on discrete models, but the final chapter gives an introduction to the continuous objects: Brownian motion, Brownian loop measures and soups, Schramm-Loewner evolution (SLE), and the continuous Gaussian free field. A number of exercises scattered throughout the text will help a serious reader gain better understanding of the material.

For additional information and updates on this book, visit www.ams.org/bookpages/stml-98

STML/98