Brownian Motion: A Guide to Random Processes and Stochastic Calculus [3rd Edition] 9783110741278, 9783110741254

Stochastic processes occur everywhere in the sciences, economics and engineering, and they need to be understood by (app

305 40 4MB

English Pages 533 [534] Year 2021

Report DMCA / Copyright

DOWNLOAD FILE

Polecaj historie

Brownian Motion: A Guide to Random Processes and Stochastic Calculus [3rd Edition]
 9783110741278, 9783110741254

Table of contents :
Preface
Contents
Dependence chart
1 Robert Brown’s new thing
2 Brownian motion as a Gaussian process
3 Constructions of Brownian motion
4 The canonical model
5 Brownian motion as a martingale
6 Brownian motion as a Markov process
7 Brownian motion and transition semigroups
8 The PDE connection
9 The variation of Brownian paths
10 Regularity of Brownian paths
11 Brownian motion as a random fractal
12 The growth of Brownian paths
13 Strassen’s functional law of the iterated logarithm
14 Skorokhod representation
15 Stochastic integrals: L2-Theory
16 Stochastic integrals: localization
17 Stochastic integrals: martingale drivers
18 Itô’s formula
19 Applications of Itô’s formula
20 Wiener Chaos and iterated Wiener–Itô integrals
21 Stochastic differential equations
22 Stratonovich’s stochastic calculus
23 On diffusions
24 Simulation of Brownian motion by Björn Böttcher
A Appendix
Bibliography
Index

Citation preview

René L. Schilling Brownian Motion

Also of interest Stochastic Models for Fractional Calculus Mark M. Meerschaert, Alla Sikorskii, 2019 ISBN 978-3-11-055907-1, e-ISBN (PDF) 978-3-11-056024-4, e-ISBN (EPUB) 978-3-11-055914-9 in: De Gruyter Studies in Mathematics De Gruyter Series in Probability and Stochastics Edited by Itai Benjamini, Jean Bertoin, Michel Ledoux, René L. Schilling ISSN 2512-9007, e-ISSN 2512-899X

Feynman-Kac-Type Theorems and Gibbs Measures on Path Space Volume 1 Feynman-Kac-Type Formulae and Gibbs Measures József Lörinczi, Fumio Hiroshima, Volker Betz, 2020 ISBN 978-3-11-033004-5, e-ISBN (PDF) 978-3-11-033039-7, e-ISBN (EPUB) 978-3-11-038993-7 Volume 2 Applications in Rigorous Quantum Field Theory Fumio Hiroshima, József Lörinczi, 2020 ISBN 978-3-11-040350-3, e-ISBN (PDF) 978-3-11-040354-1, e-ISBN (EPUB) 978-3-11-040360-2

Bernstein Functions Theory and Applications René L. Schilling, Renming Song, Zoran Vondracek, 2012 ISBN 978-3-11-025229-3, e-ISBN (PDF) 978-3-11-026933-8

René L. Schilling

Brownian Motion

A Guide to Random Processes and Stochastic Calculus With a Chapter on Simulation by Björn Böttcher 3rd Edition

Mathematics Subject Classification 2010 Primary: 60-01, 60J65; Secondary: 60H05, 60H10, 60J35, 60G46, 60J60, 60J25. Author Prof. Dr. René L. Schilling Technische Universität Dresden Institut für Mathematische Stochastik 01062 Dresden Germany [email protected] https://tu-dresden.de/mn/math/stochastik/schilling Online Resources www.motapa.de/brownian_motion Book cover The cover shows a photograph of the Quantum Cloud sculpture by Antony Gormley in London, almost directly on the Greenwich Meridian (51° 30’ 6.48“ N and 0° 00’ 32.76“ E). It is approximately 30 metres high and portrays a figure appearing in a cloud of tetrahedron-shaped metal pieces; the cloud around the figure was constructed with the help of a random walk algorithm.

ISBN 978-3-11-074125-4 e-ISBN (PDF) 978-3-11-074127-8 e-ISBN (EPUB) 978-3-11-074149-0 Library of Congress Control Number: 2021943621 Bibliographic information published by the Deutsche Nationalbibliothek The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at http://dnb.dnb.de. © 2021 Walter de Gruyter GmbH, Berlin/Boston Printing and binding: CPI books GmbH, Leck Cover image: Anthony Gormley Figure; Tony Hisgett from Birmingham, UK, CC BY 2.0 www.degruyter.com

Preface Brownian motion is arguably the single most important stochastic process. Historically it was the first stochastic process in continuous time and with a continuous state space, and thus it shaped the study of Gaussian processes, martingales, Markov processes, diffusions and random fractals. Its central position within mathematics is matched by numerous applications in science, engineering and mathematical finance. The present book grew out of several courses, which I taught at the University of Marburg and TU Dresden, and it owes a great debt to the lecture notes [199] by Lothar Partzsch who was my co-author in the first two editions. Many students are interested in applications of probability theory and it is important to teach Brownian motion and stochastic calculus at an early stage of the curriculum. Such a course is very likely the first encounter with stochastic processes in continuous time, following directly on an introductory course on rigorous (i.e. measure-theoretic) probability theory. Typically, students would be familiar with the classical limit theorems of probability theory and basic discrete-time martingales, as it is treated, for example, by Jacod & Protter Probability Essentials [124], Williams Probability with Martingales [270], or in the more voluminous textbooks by Billingsley [15] and Durrett [67]. General textbooks on probability theory cover, if at all, Brownian motion only briefly. On the other hand, there is a quite substantial gap to more specialized texts on Brownian motion which is not so easy to overcome for the novice. Our aim was to write a book which can be used in the classroom as an introduction to Brownian motion and stochastic calculus, and as a first course in continuous-time and continuous-state Markov processes. My aim was to have a text which is both a self-contained back-up and self-study text for contemporary applications (such as mathematical finance) and a foundation to more advanced monographs, e.g. Ikeda & Watanabe [113], Revuz & Yor [216] or Rogers & Williams [222] (for stochastic calculus), Marcus & Rosen [178] (for Gaussian processes), Peres & Mörters [186] (for random fractals), Chung [33] or Port & Stone [209] (for potential theory) or Blumenthal & Getoor [18] (for Markov processes) to name but a few. Things the readers are expected to know. Our presentation is basically selfcontained, starting from scratch with continuous-time stochastic processes. We do, however, assume some basic measure theory (as in [232]) and a first course on probability theory and discrete-time martingales (as in [124] or [270]). This material can also be found in my lecture notes [235]. How to read this book. Of course, nothing prevents you from reading it linearly. But there is more material here than one could cover in a one-semester course. Depending on your needs and likings, there are at least three possible selections: BM and Itô calculus, BM and its sample paths and BM as a Markov process. The diagram on page XI will give you some ideas how things depend on each other and how to construct your own “Brownian sample path” through this book. https://doi.org/10.1515/9783110741278-202

VI | Preface

Ex. n.m

Whenever special attention is needed and to point out traps & pitfalls, we have used the sign in the margin. Also in the margin, there are cross-references to exercises at the end of each chapter which we think fit (and are sometimes needed) at that point.¹ They are not just drill problems but contain variants, excursions from and extensions of the material presented in the text. The proofs of the core material do not seriously depend on any of the problems. Changes to the third edition. After the publication of the first two editions, I got many positive responses from colleagues and students alike, and quite a few led to improvements in this new edition. I took the opportunity to correct misprints and make many changes and additions scattered throughout the text. The most significant changes are the completely rewritten Chapters 8 and § 23.3 on PDEs, the new sections on orthogonal random measures and white noise § 15.6, local times and Brownian excursions § 19.8 and the martingale problem § 23.4. I have also expanded the presentation of stochastic integrals driven by (local) martingales (Chapters 16 and 17) and added a proof of the Wiener chaos expansion via iterated Itô integrals (Chapter 20). Throughout the book there are exercises which come with a detailed online solution manual www.motapa.de/brownian_motion. Despite of all of these changes I did not try to make the book into an encyclopedia, but I wanted to keep the easily accessible, introductory character of the exposition, and there are still plenty of topics deliberately left out. Acknowledgements. Many people contributed towards the completion of this project: first of all the students who attended my courses and helped – often unwittingly – to shape the presentation of the material. My (former) PhD students Katharina Fischer, Julian Hollender, Felix Lindner, Michael Schwarzenberger, Cailing Li, and my colleagues David Berger, Wojciech Cygan, Paolo Di Tella and Kamil Kaleta did a good job at proofreading at various stages. Without the input of Franziska Kühn and Niels Jacob this text would look very different. A big “thank you” to all of you for weeding out obscure passages and proposing clever new arguments. I owe a great debt to my colleague Björn Böttcher who produced most of the illustrations and contributed Chapter 24 on simulation. The de Gruyter editorial staff around Leonardo Milla and Ute Skambraks did a splendid job to make this project a success. Finally, without the support and patience of my wife, who had to put up with Corona, home office and Wiener space, this endeavour would not have been possible. Thank you! Dresden, summer of 2021

René L. Schilling

1 For the readers’ convenience there is a web page where additional material and solutions are available. The URL is www.motapa.de/brownian_motion.

Contents Preface | V Dependence chart | XI Index of notation | XIII 1

Robert Brown’s new thing | 1

2 2.1 2.2 2.3

Brownian motion as a Gaussian process | 7 The finite dimensional distributions | 7 Brownian motion in ℝd | 11 Invariance properties of Brownian motion | 14

3 3.1 3.2 3.3 3.4 3.5 3.6

Constructions of Brownian motion | 20 A random orthogonal series | 20 The Lévy–Ciesielski construction | 22 Wiener’s construction | 26 Lévy’s original argument | 28 Donsker’s construction | 34 The Bachelier–Kolmogorov point of view | 36

4 4.1 4.2

The canonical model | 39 Wiener measure | 39 Kolmogorov’s construction | 43

5 5.1 5.2 5.3

Brownian motion as a martingale | 48 Some “Brownian” martingales | 48 Stopping and sampling | 52 The exponential Wald identity | 56

6 6.1 6.2 6.3 6.4 6.5 6.6 6.7

Brownian motion as a Markov process | 62 The Markov property | 62 The strong Markov property | 65 Desiré André’s reflection principle | 69 Transience and recurrence | 74 Lévy’s triple law | 76 An arc-sine law | 79 Some measurability issues | 80

7

Brownian motion and transition semigroups | 87

VIII | Contents 7.1 7.2 7.3 7.4 7.5 7.6

The semigroup | 87 The generator | 92 The resolvent | 96 The Hille-Yosida theorem and positivity | 103 The potential operator | 107 Dynkin’s characteristic operator | 114

8 8.1 8.2 8.3 8.4

The PDE connection | 124 The heat equation | 125 The inhomogeneous initial value problem | 128 The Feynman–Kac formula | 130 The Dirichlet problem | 138

9 9.1 9.2 9.3 9.4

The variation of Brownian paths | 152 The quadratic variation | 153 Almost sure convergence of the variation sums | 154 Almost sure divergence of the variation sums | 158 Lévy’s characterization of Brownian motion | 161

10 10.1 10.2 10.3

Regularity of Brownian paths | 167 Hölder continuity | 167 Non-differentiability | 170 Lévy’s modulus of continuity | 171

11 11.1 11.2 11.3 11.4 11.5

Brownian motion as a random fractal | 177 Hausdorff measure and dimension | 177 The Hausdorff dimension of Brownian paths | 180 Local maxima of a Brownian motion | 187 On the level sets of a Brownian motion | 189 Roots and records | 191

12 12.1 12.2

The growth of Brownian paths | 198 Khintchine’s law of the iterated logarithm | 198 Chung’s “other” law of the iterated logarithm | 206

13 13.1 13.2 13.3

Strassen’s functional law of the iterated logarithm | 212 The Cameron–Martin formula | 213 Large deviations (Schilder’s theorem) | 220 The proof of Strassen’s theorem | 225

14

Skorokhod representation | 232

Contents

15 15.1 15.2 15.3 15.4 15.5 15.6

Stochastic integrals: L2 -Theory | 244 Discrete stochastic integrals | 244 Simple integrands | 248 Extension of the stochastic integral to L2T | 252 Evaluating Itô integrals | 256 What is the closure of ST ? | 260 White noise integrals | 263

16

Stochastic integrals: localization | 273

17 17.1 17.2 17.3

Stochastic integrals: martingale drivers | 283 The Doob–Meyer decomposition | 283 The Itô integral driven by L2 martingales | 289 The Itô integral driven by local martingales | 291

18 18.1 18.2 18.3 18.4 18.5 18.6 18.7

Itô’s formula | 297 Itô processes and stochastic differentials | 297 The heuristics behind Itô’s formula | 299 Proof of Itô’s formula (Theorem 18.1) | 300 Itô’s formula for stochastic differentials | 303 Itô’s formula for Brownian motion in ℝd | 307 The time-dependent Itô formula | 309 Tanaka’s formula and local time | 311

19 19.1 19.2 19.3 19.4 19.5 19.6 19.7 19.8

Applications of Itô’s formula | 317 Doléans–Dade exponentials | 317 Lévy’s characterization of Brownian motion | 321 Girsanov’s theorem | 323 Martingale representation – 1 | 327 Martingale representation – 2 | 329 Martingales as time-changed Brownian motion | 332 Burkholder–Davis–Gundy inequalities | 334 Local times and Brownian excursions | 338

20 20.1 20.2 20.3 20.4

Wiener Chaos and iterated Wiener–Itô integrals | 361 Multiple vs. iterated stochastic integrals | 361 An interlude: Hermite polynomials | 367 The Wiener–Itô theorem | 371 Calculating Wiener–Itô decompositions | 377

21 21.1

Stochastic differential equations | 383 The heuristics of SDEs | 383

| IX

X | Contents 21.2 21.3 21.4 21.5 21.6 21.7 21.8 21.9

Some examples | 385 The general linear SDE | 388 Transforming an SDE into a linear SDE | 390 Existence and uniqueness of solutions | 393 Further examples and counterexamples | 398 Solutions as Markov processes | 403 Localization procedures | 404 Dependence on the initial values | 407

22 22.1 22.2

Stratonovich’s stochastic calculus | 417 The Stratonovich integral | 417 Solving SDEs with Stratonovich’s calculus | 421

23 23.1 23.2 23.3 23.4

On diffusions | 427 Kolmogorov’s theory | 429 Itô’s theory | 434 Drift transformations revisited | 439 First steps towards the martingale problem | 445

24 24.1 24.2 24.3 24.4 24.5 24.6

Simulation of Brownian motion by Björn Böttcher | 460 Introduction | 460 Normal distribution | 464 Brownian motion | 466 Multivariate Brownian motion | 467 Stochastic differential equations | 469 Monte Carlo method | 473

A A.1 A.2 A.3 A.4 A.5 A.6 A.7 A.8 A.9 A.10 A.11

Appendix | 474 Kolmogorov’s existence theorem | 474 A property of conditional expectations | 478 From discrete to continuous time martingales | 479 Stopping times | 485 Optional sampling | 488 Remarks on Feller processes | 491 The monotone class theorem | 492 BV functions and Riemann–Stieltjes integrals | 493 Frostman’s theorem: Hausdorff measure, capacity and energy | 497 Gronwall’s lemma | 500 Completeness of the Haar functions | 501

Bibliography | 503 Index | 514

Dependence chart As we have already mentioned in the preface, there are at least three paths through this book which highlight different aspects of Brownian motion: Brownian motion and Itô calculus, Brownian motion as a Markov process, and Brownian motion and its sample paths. Below we suggest some fast tracks “C”, “M” and “S” for each route, and we indicate how the other topics covered in this book depend on these fast tracks. This should help you to find your own personal sample path. Starred sections (in the grey ovals) contain results which can be used without proof and without compromising too much on rigour. Getting started For all three fast tracks you need to read Chapters 1 and 2 first. If you are not too much in a hurry, you should choose one construction of Brownian motion from Chapter 3. For the beginner we recommend either Sections 3.1, 3.2 or Section 3.4. Basic stochastic calculus (C)

6.1–3∗

15.1–4

9.1

5.1–2

6.7∗

16

18.1–3

10.1∗

Basic Markov processes (M) 5.1–2

6.1–3

7

6.4

4∗

6.7∗

Basic sample path properties (S) 5.1–2

6.1–3

4∗

https://doi.org/10.1515/9783110741278-204

9.1+4

10.1–2

12.1

11.3–4

14

XII | Dependence chart Dependence to the sections 5.1–23.2 The following list shows which prerequisites are needed for each section. A star as in 4∗ or 6.7∗ indicates that some result(s) from Chapter 4 or Section 6.7 are used which may be used without proof and without compromising too much on rigour. Starred sections are mentioned only where they are actually needed, while other prerequisites are repeated to indicate the full line of dependence. For example, 6.6: M or S or C, 6.1–3 indicates that the prerequisites for Section 6.6 are covered by either “M” or “S” or “C if you add 6.1, 6.2, 6.3”. Since we do not refer to later sections with higher numbers, you will need only those sections in “M”, “S”, or “C and 6.1, 6.2, 6.3” with section numbers below 6.6. 19.1: C, 18.4–5, 17∗ means that 19.1 requires “C” plus the Sections 18.4 and 18.5. Some results from 17 are used, but they can be quoted without proof. C or M or S C or M or S C or M or S M or S or C M or S or C, 6.1 M or S or C, 6.1–2 M or S or C, 6.1–3 M or S or C, 6.1–3 M or S or C, 6.1–3 M or S or C, 6.1–3 M, 4.2∗or C, 6.1, 4.2∗ M or C, 6.1, 7.1 M or C, 6.1, 7.1–2 M or C, 6.1, 7.1–3 M or C, 6.1, 7.1–4 M or C, 6.1, 7.1–5 M or C, 6.1, 7.1–3 M, 8.1 or C, 6.1, 7.1–3, 8.1 M, 8.1–2 or C, 6.1, 7.1–4, 8.1–2 8.4: M, 6.7∗, 8.1–3∗ or C, 6.1–4, 7, 6.7∗, 8.1–3∗ 9.1: S or C or M 9.2: S or C or M, 9.1 9.3: S or C or M, 9.1 9.4: S or C or M, 9.1 10.1: S or C or M 10.2: S or C or M 10.3: S or C or M 11.1: S or M or C 5.1: 5.2: 5.3: 6.1: 6.2: 6.3: 6.4: 6.5: 6.6: 6.7: 7.1: 7.2: 7.3: 7.4: 7.5: 7.6: 8.1: 8.2: 8.3:

11.2:

11.3: 11.4: 11.5:

12.1: 12.2: 13.1: 13.2: 13.3: 14: 15.1: 15.2: 15.3: 15.4: 15.5: 15.6: 16: 17.1: 17.2: 17.3: 18.1: 18.2: 18.3: 18.4: 18.5: 18.6:

S, 11.1, 10.3∗ or M, 11.1, 10.3∗ or C, 11.1, 10.3∗ S or M or C, 6.1–3 S or M or C, 6.1–3 S, 11.3–4, 11.2∗ or M, 11.3–4, 11.2∗ or C, 6.1–3, 11.3–4, 11.2∗ S, 10.3∗or C, 10.3∗or M, 10.3∗ S or C, 12.1 or M, 12.1 S S, 13.1 S, 13.1–2, 4∗ S or C or M C or M C, 6.7∗or M, 15.1, 6.7∗ C or M, 15.1–2 C or M, 15.1–3, 9.1∗ C, 6.7∗ C, 15.5 C or M, 15.1–4 C C, 15.5, 17.1∗ C, 17.2, 15.5∗, 17.1∗ C C C C C, 18.4 C, 18.4–5

18.7: 19.1: 19.2: 19.3: 19.4: 19.5: 19.6: 19.7: 19.8: 20: 21.1: 21.2: 21.3: 21.4: 21.5: 21.6: 21.7: 21.8: 21.9: 22.1: 22.2: 23.1: 23.2: 23.3: 23.4:

C C, 18.4–5, 17∗ C, 18.4–5 C, 18.4–5, 19.1 C, 18.4–5 C, 15.5, 17, 18.4–5, 19.2 C, 18.4–5, 19.2 C, 18.4–5 C, 18.7, 11.4–5, 19.2∗, 19.7∗, 10.1∗ C, 18.4, 18.5 C C, 21.1 C, 21.1 C, 21.1–3 C, 18.4–5, 21.1 C, 18.4, 18.7, 19.2–3, 21.1–5 C, 6.1, 18.4–5, 21.1, 21.5 C, 18.4–5, 21.1, 21.5–6 C, 18.4–5, 21.1, 21.5–6, 10.1∗, 19.7∗ C, 18.6 C, 18.6, 21.1, 21.5, 22.1 M or C, 6.1, 7 C, 6.1, 7, 18.4–5, 21, 23.1 C, 18.4–5, 19.1, 19.3, 21, 23.1–2, 8∗ C, 16, 17, 18.4–5, 21, 23.1–3, 19.2∗, 19.4∗, 19.5∗

Index of notation This index is intended to aid cross-referencing, so notation that is specific to a single section is generally not listed. Some symbols are used locally, without ambiguity, in senses other than those given below; numbers following an entry are page numbers. Unless otherwise stated, functions are real-valued and binary operations between functions such as f ± g, f ⋅ g, f ∧ g, f ∨ g, comparisons f ⩽ g, f < g or limiting relan→∞ tions f n 󳨀󳨀󳨀󳨀→ f , limn f n , limn f n , limn f n , supn f n or inf n f n are understood pointwise. “Positive” and “negative” always means “⩾ 0” and “⩽ 0”. General notation: analysis

General notation: probability

inf 0



inf 0 = +∞

maximum of a and b

a∨b

minimum of a and b

a≍b

∃C :

a∧b

a+ , a−

a ∨ 0, resp., −(a ∧ 0) C−1 a

⩽ b ⩽ Ca

⌊x⌋

largest integer n ⩽ x

⟨x, y⟩

scalar product in ℝd , ∑dj=1 x j y j

|x|

I d , idd

𝟙A

“is distributed as”

s

“is sample of”, 460

⊥⊥

“is stochastically independent”



d

convergence in law



󳨀󳨀→

convergence in probab.

Lp

a.s.

convergence in L p (ℙ)

almost surely (w.r.t. ℙ)

󳨀󳨀→ ucp

󳨀󳨀→

uniform ℙ-conv., 274

iid

independent and identically distributed

Euclidean norm in ℝd , |x|2 = x21 + ⋅ ⋅ ⋅ + x2d

󳨀󳨀→

unit matrix in ℝd×d

LIL

law of iterated logarithm

ℙ, 𝔼

probability, expectation

{1, x ∈ A 𝟙A (x) = { 0, x ∉ A { scalar product ∫ fg dμ

variance, covariance

1 ⋅ 3 ⋅ . . . ⋅ (2n − 1); (−1)!! := 1

𝕍, Cov

N(m, Σ)

δx

point mass at x

BM

D, R

domain, range

normal law in ℝd , mean m ∈ ℝd , covariance matrix Σ ∈ ℝd×d

⟨f, g⟩L2 (μ)

(2n − 1)!!

Leb λT

Lebesgue measure

Leb. measure on [0, T]



Laplace operator

∂j

partial derivative

∇, ∇x

gradient ( ∂x∂ 1 , . . . ,

∂ ∂x j

https://doi.org/10.1515/9783110741278-205

∂ ⊤ ∂x d )

N(μ,

σ2 )

normal law in ℝ, mean μ, variance σ2

Brownian motion, 4

1

2

BM , BM

1-, 2-dimensional BM, 4

BM

d-dimensional BM, 4

d

(B0)–(B4)

4

(B3󸀠 )

6

(QB3)

13

XIV | Index of notation Sets and σ-algebras Ac

complement of the set A

A

closure of the set A

𝔹(x, r)

open ball, centre x, radius r

𝔹(x, r)

limn A n limn A n

closed ball, centre x, radius r ⋃k⩾1 ⋂n⩾k A n

L0T , L0T (M)

273, 291

L2T , L2T (M) closure of ST , 253, 289 L2T,loc , L2T,loc (M) 273, 291 L2P f ∈ L2 with P mble. representative, 260

L2P = L2T M2 , M2T M2,c T

262, 290

L2 martingales, 244, 249 continuous L2 martingales, 249

M2,c T,loc

B(E)

⋂k⩾1 ⋃n⩾k A n

Borel sets of E

FtX

σ(X s : s ⩽ t)

Spaces of functions

supp f

Ft+ Ft F∞

Fτ , Fτ+ P

support, {f ≠ 0}

⋂u>t Fu

completion of Ft with all subsets of ℙ null sets σ (⋃t⩾0 Ft )

55, 486

progressive σ-algebra, 260

Stochastic processes (X t , Ft )t⩾0

adapted process, 48

ℙx , 𝔼x

law of a process, starting at x, 63, 90–91

τ D , τ∘D

hitting/entrance time, 53

σ, τ

X tτ

⟨X⟩t

⟨X, Y⟩t

B(E)

Borel functions on E

Bb (E)

– – , bounded

C(E)

continuous functions on E

Cb (E)

– – , bounded

C∞ (E)

– – , lim f(x) = 0 |x|→∞

– – , compact support

Cc (E)

– – , f(0) = 0,

C(o) (E)

k times continuously diff’ble functions on E

Ck (E)

Ckb (E)

– – , bounded (with all derivatives)

quadratic variation, 244, 253, 283

Ckc (E)

– – , compact support

quadratic covariation, 247

H1

stopping times: {σ ⩽ t} ∈ Ft , t ⩾ 0

stopped process X t∧τ

varp (f; [0, t]) p-variation on [0, t], 152 ST

local M2,c T martingales, 274

simple processes, 249

k (E) C∞

– – , 0 at infinity (with all derivatives)

C1,2 (I ×E)

f(⋅, x) ∈ C1 (I) and f(t, ⋅) ∈ C2 (E)

L p (E,

μ),

Cameron–Martin space, 213 L p (E) L p space w.r.t. the measure space (E, A , μ)

L p (μ),

1 Robert Brown’s new thing “I have some sea-mice – five specimens – in spirits. And I will throw in Robert Brown’s new thing – ‘Microscopic Observations on the Pollen of Plants’ – if you don’t happen to have it already.” George Eliot: Middlemarch Book II, Chapter xvii

If you observe plant pollen in a drop of water through a microscope, some of them may burst and release even smaller particles which perform an incessant, irregular movement. The Scottish Botanist Robert Brown was not the first to describe this phenomenon – he refers to W. F. Gleichen-Rußwurm as the discoverer of the motions of the Particles of the Pollen [22, p. 164] – but Brown’s 1828 and 1829 papers [21; 22] are the first scientific publications investigating “Brownian motion”. Brown points out that 󳶳 the motion is very irregular, composed of translations and rotations; 󳶳 the particles appear to move independently of each other; 󳶳 the motion is more active the smaller the particles; 󳶳 the composition and density of the particles have no effect; 󳶳 the motion is more active the less viscous the fluid; 󳶳 the motion never ceases; 󳶳 the motion is not caused by flows in the liquid or by evaporation; 󳶳 the particles are not animated. Let us briefly sketch how the story of Brownian motion evolved.

Brownian motion and physics. Following Brown’s observations, several theories emerged, but it was Einstein’s 1905 paper [74] which gave the correct explanation: the atoms of the fluid perform a temperature-dependent movement and bombard the (in this scale) macroscopic particles suspended in the fluid. These collisions happen frequently and they do not depend on position nor time. In the introduction Einstein remarks: It is possible, that the movements to be discussed here are identical with the socalled “Brownian molecular motion”; [. . . ] If the movement discussed here can actually be observed (together with the laws relating to it that one would expect to find), then classical thermodynamics can no longer be looked upon as applicable with precision to bodies even of dimensions distinguishable in a microscope: an exact determination of actual atomic dimensions is then possible.¹ [75, pp. 1–2]. And between the lines: this would settle the then ongoing discussion on the existence of atoms. It was Jean Perrin who

1 Es ist möglich, daß die hier zu behandelnden Bewegungen mit der sogenannten “Brownschen Molekularbewegung” identisch sind; [...] Wenn sich die hier zu behandelnde Bewegung samt den für sie zu erwartenden Gesetzmäßigkeiten wirklich beobachten läßt, so ist die klassische Thermodynamik schon für mikroskopisch unterscheidbare Räume nicht mehr als genau gültig anzusehen und es ist dann eine exakte Bestimmung der wahren Atomgröße möglich [74, p. 549]. https://doi.org/10.1515/9783110741278-001

2 | 1 Robert Brown’s new thing combined in 1909 Einstein’s theory and experimental observations of Brownian motion to prove the existence and determine the size of atoms, cf. [204; 205]. Independently of Einstein, M. von Smoluchowski arrived at an equivalent interpretation of Brownian motion, cf. [241]. Brownian motion and mathematics. As a mathematical object, Brownian motion can be traced back to the not completely rigorous definition of Bachelier [6] who makes no connection to Brown or Brownian motion. Bachelier’s work was rediscovered by economists in the 1960s, cf. [43]. The first rigorous mathematical construction of Brownian motion is due to Wiener [263] who introduces the Wiener measure on the space C[0, 1] (which he calls differential-space) building on Einstein’s and von Smoluchowski’s work. Further constructions of Brownian motion were subsequently given by Wiener [264] (Fourier-Wiener series), Kolmogorov [146; 147] (giving a rigorous justification of Bachelier [6]), Lévy [164; 166, pp. 492–494, 17–20] (interpolation argument), Ciesielski [37] (Haar representation) and Donsker [55] (limit of random walks, invariance principle), see Chapter 3. Let us start with Brown’s observations to build a mathematical model of Brownian motion. To keep things simple, we consider a one-dimensional setting where each particle performs a random walk. We assume that each particle 󳶳 starts at the origin x = 0, 󳶳 changes its position only at discrete times k∆t where ∆t > 0 is fixed and for all k = 1, 2, . . . ; 󳶳 moves ∆x units to the left or to the right with equal probability; 󳶳 and ∆x does not depend on any past positions nor the current position x nor on time t = k∆t. Letting ∆t → 0 and ∆x → 0 in an appropriate way should give a random motion which is continuous in time and space. Let us denote by X t the random position of the particle at time t ∈ [0, T]. During the time [0, T], the particle has changed its position N = ⌊T/∆t⌋ times. Since the decision to move left or right is random, we will model it by independent, identically distributed Bernoulli random variables, ϵ k , k ⩾ 1, where so that

ℙ(ϵ1 = 1) = ℙ(ϵ1 = 0) =

S N = ϵ1 + ⋅ ⋅ ⋅ + ϵ N

and

1 2

N − SN

denote the number of right and left moves, respectively. Thus N

X T = S N ∆x − (N − S N )∆x = (2S N − N)∆x = ∑ (2ϵ k − 1)∆x k=1

1 Robert Brown’s new thing | 3

is the position of the particle at time T = N∆t. Since X0 = 0 we find for any two times t = n∆t and T = N∆t that N

n

k=n+1

k=1

X T = (X T − X t ) + (X t − X0 ) = ∑ (2ϵ k − 1)∆x + ∑ (2ϵ k − 1)∆x.

Since the ϵ k are iid random variables, the two increments X T − X t and X t − X0 are independent and X T − X t ∼ X T−t − X0

(“∼” indicates that the random variables have the same probability distribution). We write σ2 (t) := 𝕍 X t . By Bienaymé’s identity we get 𝕍 X T = 𝕍(X T − X t ) + 𝕍(X t − X0 ) = σ2 (T − t) + σ2 (t)

which means that t 󳨃→ σ2 (t) is linear:

𝕍 X T = σ2 (T) = σ2 T,

where σ > 0 is the so-called diffusion coefficient. On the other hand, since 𝔼 ϵ1 = 𝕍 ϵ1 = 14 we get by a direct calculation that 𝕍 X T = N(∆x)2 =

which reveals that

1 2

and

T (∆x)2 ∆t

(∆x)2 = σ2 = const. ∆t

The particle’s position X T at time T = N∆t is the sum of N iid random variables, N

where

X T = ∑ (2ϵ k − 1)∆x = (2S N − N)∆x = S∗N √Tσ, k=1

S∗N =

2S N − N S N − 𝔼 S N = √N √𝕍 S N

is the normalization – i.e. mean 0, variance 1 – of the random variable S N . A simple application of the central limit theorem now shows that in distribution N→∞

X T = √TσS∗N 󳨀󳨀󳨀󳨀󳨀󳨀󳨀󳨀󳨀󳨀󳨀→ √TσG (i.e. ∆x,∆t→0)

where G ∼ N(0, 1) is a standard normal distributed random variable. This means that, in the limit, the particle’s position B T = lim∆x,∆t→0 X T is normally distributed with law N(0, Tσ2 ). This approximation procedure yields for each t ∈ [0, T] some random variable B t ∼ N(0, tσ2 ). More generally

4 | 1 Robert Brown’s new thing 1.1 Definition. Let (Ω, A , ℙ) be a probability space. A d-dimensional stochastic process indexed by I ⊂ [0, ∞) is a family of random variables X t : Ω → ℝd , t ∈ I. We write X = (X t )t∈I . I is called the index set and ℝd the state space.

The only requirement of Definition 1.1 is that the X t , t ∈ I, are A /B(ℝd ) measurable. This definition is, however, too general to be mathematically useful; more information is needed on (t, ω) 󳨃→ X t (ω) as a function of two variables. Although the family (B t )t∈[0,T] satisfies the condition of Definition 1.1, a realistic model of Brownian motion should have at least continuous trajectories: the sample path [0, T] ∋ t 󳨃→ B t (ω) should be a continuous function for all ω. 1.2 Definition. A d-dimensional Brownian motion B = (B t )t⩾0 is a stochastic process indexed by [0, ∞) taking values in ℝd such that B0 (ω) = 0

for ℙ almost all ω;

(B0)

B t n − B t n−1 , . . . , B t1 − B t0 are independent

(B1)

for all n ⩾ 1, 0 = t0 ⩽ t1 < t2 < ⋅ ⋅ ⋅ < t n < ∞; B t − B s ∼ B t+h − B s+h

B t − B s ∼ N(0, t − s)⊗d , t 󳨃→ B t (ω)

for all 0 ⩽ s < t, h ⩾ −s; N(0, t)(dx) =

is continuous for all ω.

1

√2πt

exp (−

(B2) x2 ) dx; 2t

(B3) (B4)

We use BMd as shorthand for d-dimensional Brownian motion. Ex. 1.8 Ex. 1.9 Ex. 1.10

It is sometimes useful to replace the requirement “for all ω” in (B4) with “for ℙ-almost all ω”. Denote by N ⊂ Ω the exceptional set of those ω where (B4) is violated. If ̃ t (ω) := B t (ω)𝟙Ω\N (ω) is N is measurable and ℙ(N) = 0, it is not hard to see that B a Brownian motion in the sense of Definition 1.2. Things get a bit messy if N is not measurable. One way around is to use the completion of the underlying measure space. We will also speak of a Brownian motion if the index set is an interval of the form [0, T) or [0, T]. We say that (B t + x)t⩾0 , x ∈ ℝd , is a d-dimensional Brownian motion started at x. Frequently we write B(t, ω) and B(t) instead of B t (ω) and B t ; this should cause no confusion. By definition, Brownian motion is an ℝd -valued stochastic process starting at the origin (B0) with independent increments (B1), stationary increments (B2) and continuous paths (B4). We will see later that this definition is redundant: (B0)–(B3) entail (B4) at least for almost all ω. On the other hand, (B0)–(B2) and (B4) automatically imply that the increment B(t) − B(s) has a (possibly degenerate) Gaussian distribution (this is a consequence of the central limit theorem). Before we discuss such details, we should settle the question, if there exists a process satisfying the requirements of Definition 1.2, cf. Chapter 3.

| 5

Problems

1.3 Further reading. A good general survey on the history of continuous-time stochastic processes is the paper [41]. The role of Brownian motion in mathematical finance is explained in [8] and [43]. Good references for Brownian motion in physics are [182] and [188], for applications in modelling and engineering [172]. Bachelier, Davis (ed.), Etheridge (ed.): Louis Bachelier’s Theory of Speculation: [8] The Origins of Modern Finance. Cohen: The history of noise. [41] Cootner (ed.): The Random Character of Stock Market Prices. [43] [172] MacDonald: Noise and Fluctuations. [182] Mazo: Brownian Motion. [188] Nelson: Dynamical Theories of Brownian Motion.

Problems A sequence of random variables X n : Ω → ℝd converges weakly (also: in distribution or d

in law) to a random variable X, X n 󳨀 → X if, and only if, limn→∞ 𝔼 f(X n ) = 𝔼 f(X) for all bounded and continuous functions f ∈ Cb (ℝd ). This is equivalent to the convergence of the characteristic functions limn→∞ 𝔼 e i⟨ξ, X n ⟩ = 𝔼 e i⟨ξ, X⟩ for all ξ ∈ ℝd . 1.

Let X, Y, X n , Y n : Ω → ℝ, n ⩾ 1, be random variables. d

a) If, for all n ⩾ 1, X n ⊥⊥ Y n and if (X n , Y n ) 󳨀 → (X, Y), then X ⊥⊥ Y.

b) Let X ⊥⊥ Y such that X, Y ∼ β1/2 := 12 (δ0 + δ1 ) are Bernoulli random variables. We set X n := X + 1n and Y n := 1 − X n . Then X n 󳨀 → X, Y n 󳨀 → Y, X n + Y n 󳨀 → 1 but (X n , Y n ) does not converge weakly to (X, Y). d

d

2.

d

d

d

c) Assume that X n 󳨀 → X and Y n 󳨀 → Y. Is it true that X n + Y n 󳨀 → X + Y?

(Slutsky’s theorem) Let X n , Y n : Ω → ℝd , n ⩾ 1, be two sequences of random d

3.

d



d

variables such that X n 󳨀 → X and X n − Y n 󳨀 → 0. Then Y n 󳨀 → X.

(Slutsky’s theorem 2) Let X n , Y n : Ω → ℝ, n ⩾ 1, be two sequences of random variables. d ℙ d d a) If X n 󳨀 → X and Y n 󳨀 → c, then X n Y n 󳨀 → cX. Is this still true if Y n 󳨀 → c? ℙ

d

d

d

b) If X n 󳨀 → X and Y n 󳨀 → 0, then X n + Y n 󳨀 → X. Is this still true if Y n 󳨀 → 0?

4. Let X n , X, Y : Ω → ℝ, n ⩾ 1, be random variables. If

lim 𝔼(f(X n )g(Y)) = 𝔼(f(X)g(Y)) for all f ∈ Cb (ℝ), g ∈ Bb (ℝ),

n→∞

d



then (X n , Y) 󳨀 → (X, Y). If X = ϕ(Y) for some ϕ ∈ B(ℝ), then X n 󳨀 → X.

6 | 1 Robert Brown’s new thing 5.

Let δ j , j ⩾ 1, be iid Bernoulli random variables with ℙ(δ j = ±1) = 1/2. We set S0 := 0,

S n := δ1 + ⋅ ⋅ ⋅ + δ n

and

X tn :=

1 S⌊nt⌋ . √n

A one-dimensional random variable G is Gaussian if it has the characteristic function 𝔼 e iξG = exp(imξ − 12 σ2 ξ 2 ) with m ∈ ℝ and σ ⩾ 0. Prove that d

→ G t where t > 0 and G t is a Gaussian random variable. a) X tn 󳨀 d

→ G t−s where t ⩾ s ⩾ 0 and G u is a Gaussian random variable. Do we b) X tn − X sn 󳨀 have G t−s = G t − G s ?

6.

c) Let 0 ⩽ t1 ⩽ . . . ⩽ t m , m ⩾ 1. Determine the limit as n → ∞ of the random vector (X tnm − X tnm−1 , . . . , X tn2 − X tn1 , X tn1 ).

Consider the condition

for all s < t the random variables

B(t) − B(s) √t − s

are

identically distributed, centred and have variance 1.

(B3󸀠 )

Show that (B0), (B1), (B2), (B3) and (B0), (B1), (B2), (B3󸀠 ) are equivalent. Hint. If X ∼ Y are centred and have variance 1, X ⊥⊥ Y and X ∼ Rényi [213, VI.5 Theorem 2].

7.

1 (X √2

+ Y), then X ∼ N(0, 1), cf.

Let G and G󸀠 be two independent real N(0, 1)-random variables. Show that Γ ± :=

G ± G󸀠 ∼ N(0, 1) and √2

Γ − ⊥⊥ Γ + .

8. Let (X t )t⩾0 be a stochastic process on the measure space (Ω, A , ℙ) satisfying (B1)– (B3). Assume further that there is some measurable set Ω0 ⊂ Ω, ℙ(Ω0 ) = 1, such that for all ω ∈ Ω0 we have X0 (ω) = 0 and t 󳨃→ X t (ω) is continuous. Show that a) B t (ω) := X t (ω)𝟙Ω0 (ω) is a Brownian motion on (Ω, A , ℙ); 9.

b) X t is a Brownian motion on the space (Ω0 , Ω0 ∩ A , ℙ0 := ℙ(⋅ ∩ Ω0 )). Notation. Ω0 ∩ A := {Ω0 ∩ A : A ∈ A } is the trace σ-algebra.

Let (Ω, A , ℙ) be a probability space and let ℙ∗ (Q) := inf {ℙ(A) : A ⊃ Q, A ∈ A } be the outer measure. If Ω0 ⊂ Ω is such that ℙ∗ (Ω0 ) = 1, then (Ω0 , A0 , ℙ0 ) with A0 := Ω0 ∩ A and ℙ0 (A0 ) := ℙ(A) for every A0 ∈ A0 of the form A0 = Ω0 ∩ A, defines a new probability space.

10. Let (Ω, A , ℙ) be a probability space. The completion (Ω, A 󸀠 , ℙ󸀠 ) is the smallest extension such that A 󸀠 contains all subsets of A measurable null sets. Show that A 󸀠 := {A󸀠 ⊂ Ω : ∃A1 , A2 ∈ A , A1 ⊂ A󸀠 ⊂ A2 , ℙ(A2 \ A1 ) = 0} , ℙ󸀠 (A󸀠 ) := ℙ(A2 )

(A2 is as in the definition of A 󸀠 ).

2 Brownian motion as a Gaussian process Recall that a one-dimensional random variable Γ is Gaussian if its characteristic function has the form 1 2 2 𝔼 e iξΓ = e imξ − 2 σ ξ (2.1) for some real numbers m ∈ ℝ and σ ⩾ 0. If we differentiate (2.1) two times with respect to ξ and set ξ = 0, we see that and

m = 𝔼Γ

σ2 = 𝕍 Γ.

(2.2)

A random vector Γ = (Γ1 , . . . , Γ n ) ∈ ℝn is Gaussian, if the scalar product ⟨ξ, Γ⟩ is for every ξ ∈ ℝn a one-dimensional Gaussian random variable. This is equivalent to 1

𝔼 e i⟨ξ, Γ⟩ = e i 𝔼⟨ξ, Γ⟩− 2 𝕍⟨ξ, Γ⟩

for all ξ ∈ ℝd .

(2.3)

Setting m = (m1 , . . . , m n ) ∈ ℝn and Σ = (σ jk )j,k=1...,n ∈ ℝn×n where m j := 𝔼 Γ j

and

σ jk := 𝔼(Γ j − m j )(Γ k − m k ) = Cov(Γ j , Γ k ),

we can rewrite (2.3) in the following form

1

𝔼 e i⟨ξ, Γ⟩ = e i⟨ξ, m⟩− 2 ⟨ξ, Σξ⟩

for all ξ ∈ ℝd .

(2.4)

We call m the mean vector and Σ the covariance matrix of Γ.

2.1 The finite dimensional distributions Let us quickly establish some first consequences of the definition of Brownian motion. To keep things simple, we assume throughout this section that (B t )t⩾0 is a onedimensional Brownian motion. 2.1 Proposition. Let (B t )t⩾0 be a one-dimensional Brownian motion. Then B t , t ⩾ 0, are Gaussian random variables with mean 0 and variance t: 𝔼 e iξB t = e−t ξ

2

/2

for all t ⩾ 0, ξ ∈ ℝ.

(2.5)

Proof. Set ϕ t (ξ) = 𝔼 e iξB t . If we differentiate ϕ t with respect to ξ , and use integration by parts we get 2 1 (B3) ϕ󸀠t (ξ) = 𝔼 (iB t e iξB t ) = ∫ e ixξ (ix)e−x /(2t) dx √2πt =

parts

1

√2πt

= −tξ

https://doi.org/10.1515/9783110741278-002



∫ e ixξ (−it)



1

√2πt

= −tξ ϕ t (ξ).

d −x2 /(2t) e dx dx

∫ e ixξ e−x



2

/(2t)

dx

Ex. 2.1

8 | 2 Brownian motion as a Gaussian process Since ϕ t (0) = 1, (2.5) is the unique solution of the differential equation ϕ󸀠t (ξ) = −tξ ϕ t (ξ)

2

2

2 |c|⋅|y| ⩽ e c e y /4 for all From the elementary inequality 1 ⩽ exp([ |y| 2 − |c|] ) we see that e 2 2 2 |c|⋅|y| −y /2 c −y /4 c, y ∈ ℝ. Therefore, e e ⩽e e is integrable. It follows that the expectation in (2.5) exists for all ξ ∈ ℂ and defines an analytic function.

2.2 Corollary. A one-dimensional Brownian motion (B t )t⩾0 has exponential moments of all orders, i.e. 𝔼 e|ζ|⋅|B t | < ∞

𝔼 e ζB t = e t ζ

and

2.3 Moments. Note that for k = 0, 1, 2, . . . 𝔼(B2k+1 )= t

and 𝔼(B2k t )

= =

1 √2πt

∫x

2k

ℝ ∞

e

1

∫ x2k+1 e−x

√2πt

−x2 /(2t)



dx

2k t k ∫ y k−1/2 e−y dy √π 0

= t k ⋅ 2k ⋅

2

x=√2ty

=

=

tk

/2

2

for all ζ ∈ ℂ.

/(2t)

2 √2πt

dx = 0 ∞

∫ (2ty)k e−y 0

(2.6)

(2.7)

2t dy 2√2ty

2k Γ(k + 1/2) √π

(2.8)

3 1 Γ(1/2) 2k − 1 ⋅⋅⋅ ⋅ ⋅ = t k ⋅ (2k − 1) ⋅ ⋅ ⋅ 3 ⋅ 1, 2 2 2 √π

where Γ(⋅) denotes Euler’s Gamma function. The last line follows from the functional equation Γ(x + 1) = xΓ(x) and Γ(1/2) = √π, cf. [232, p. 185, Problem 16.9]. In particular, 𝔼 B t = 𝔼 B3t = 0,

𝕍 B t = 𝔼 B2t = t

and

2.4 Covariance. For s, t ⩾ 0 we have Indeed, if s ⩽ t,

𝔼 B4t = 3t2 .

Cov(B s , B t ) = 𝔼 B s B t = s ∧ t. (B1)

𝔼 B s B t = 𝔼 (B s (B t − B s )) + 𝔼(B2s ) = s = s ∧ t. 2.3

2.5 Definition. A one-dimensional stochastic process (X t )t⩾0 is called a Gaussian process if all vectors Γ = (X t1 , . . . , X t n ), n ⩾ 1, 0 ⩽ t1 < t2 < ⋅ ⋅ ⋅ < t n are (possibly degenerate) Gaussian random vectors. Let us show that a Brownian motion is a Gaussian process.

2.1 The finite dimensional distributions | 9

2.6 Theorem. A one-dimensional Brownian motion (B t )t⩾0 is a Gaussian process. For t0 := 0 < t1 < ⋅ ⋅ ⋅ < t n , n ⩾ 1, the vector Γ := (B t1 , . . . , B t n )⊤ is a Gaussian random variable with mean vector m = 0 ∈ ℝn and a strictly positive definite, symmetric covariance matrix C = (t j ∧ t k )j,k=1...,n : 1

𝔼 e i⟨ξ, Γ⟩ = e− 2 ⟨ξ, Cξ⟩ .

Ex. 2.5

(2.9)

Moreover, the probability distribution of Γ is given by ℙ(Γ ∈ dx) =

=

1 1 1 exp (− ⟨x, C−1 x⟩) dx n/2 2 √det C (2π) 1

(2π)n/2 √∏nj=1 (t j − t j−1 )

with x := (x1 , . . . , x n ) ∈ ℝn and x0 := 0.

exp (−

1 n (x j − x j−1 )2 ) dx ∑ 2 j=1 t j − t j−1

(2.10a) (2.10b)

Proof. Set ∆ := (B t1 − B t0 , B t2 − B t1 , . . . , B t n − B t n−1 )⊤ and observe that we can write B(t k ) − B(t0 ) = ∑kj=1 (B t j − B t j−1 ). Thus, 1 Γ = ( ... 1

..

. ⋅⋅⋅

0

) ∆ = M∆

1

where M ∈ ℝn×n is a lower triangular matrix with entries 1 on and below the diagonal. Therefore, 𝔼 (exp [i⟨ξ, Γ⟩]) = 𝔼 (exp [i⟨M ⊤ ξ, ∆⟩]) (B1)

n

= ∏ 𝔼 (exp [i(B t j − B t j−1 )(ξ j + ⋅ ⋅ ⋅ + ξ n )]) j=1

(2.11)

n 1 = ∏ exp [− (t j − t j−1 )(ξ j + ⋅ ⋅ ⋅ + ξ n )2 ] . 2 (2.5) j=1 (B2)

Observe that n

n

∑ t j (ξ j + ⋅ ⋅ ⋅ + ξ n )2 − ∑ t j−1 (ξ j + ⋅ ⋅ ⋅ + ξ n )2 j=1

j=1

n−1

= t n ξ n2 + ∑ t j ((ξ j + ⋅ ⋅ ⋅ + ξ n )2 − (ξ j+1 + ⋅ ⋅ ⋅ + ξ n )2 ) j=1

=

t n ξ n2

n−1

+ ∑ t j ξ j (ξ j + 2ξ j+1 + ⋅ ⋅ ⋅ + 2ξ n ) j=1

n

n

= ∑ ∑ (t j ∧ t k ) ξ j ξ k . j=1 k=1

(2.12)

Ex. 2.6

10 | 2 Brownian motion as a Gaussian process This proves (2.9). Since C is strictly positive definite and symmetric, the inverse C−1 exists and is again positive definite; both C and C−1 have unique positive definite, symmetric square roots. Using the uniqueness of the Fourier transform, the following calculation proves (2.10a): (

1 −1 1 n/2 1 ) ∫ e i⟨x, ξ⟩ e− 2 ⟨x, C x⟩ dx 2π √det C

y=C

−1/2

=

=

=

x

ℝn

(

( e

1 1/2 2 1 n/2 ) ∫ e i⟨(C y), ξ ⟩ e− 2 |y| dy 2π

1 ) 2π

− 12

|C

n/2

1/2

ℝn

∫ e i⟨y, C

ℝn 2

ξ|

1/2

1

ξ⟩

1

2

e− 2 |y| dy

= e− 2 ⟨ξ, Cξ ⟩ .

Let us finally determine ⟨x, C−1 x⟩ and det C. Since the entries of ∆ are independent N(0, t j − t j−1 ) distributed random variables we get (2.5)

𝔼 e i⟨ξ, ∆⟩ = exp (−

1 n 1 ∑ (t j − t j−1 )ξ j2 ) = exp (− ⟨ξ, Dξ⟩) 2 j=1 2

where D ∈ ℝn×n is a diagonal matrix with entries (t1 − t0 , . . . , t n − t n−1 ). On the other hand, we have 1

Ex. 2.7

e− 2 ⟨ξ, Cξ ⟩ = 𝔼 e i⟨ξ, Γ⟩ = 𝔼 e i⟨ξ, M∆⟩ = 𝔼 e i⟨M



ξ, ∆⟩

1

= e − 2 ⟨M



ξ, DM ⊤ ξ ⟩

.

Thus, C = MDM ⊤ and, therefore C−1 = (M ⊤ )−1 D−1 M −1 . Since M −1 is a two-band matrix with entries 1 on the diagonal and −1 on the first sub-diagonal below the diagonal, we see (x j − x j−1 )2 t j − t j−1 j=1 n

⟨x, C−1 x⟩ = ⟨M −1 x, D−1 M −1 x⟩ = ∑

as well as det C = det(MDM ⊤ ) = det D = ∏nj=1 (t j − t j−1 ). This shows (2.10b).

The proof of Theorem 2.6 actually characterizes Brownian motion among all Gaussian processes. 2.7 Corollary. Let (X t )t⩾0 be a one-dimensional Gaussian process such that for each n ∈ ℕ the vector Γ = (X t1 , . . . , X t n )⊤ is a Gaussian random variable with mean 0 and covariance matrix C = (t j ∧ t k )j,k=1,...,n . If (X t )t⩾0 has continuous sample paths, then (X t )t⩾0 is a one-dimensional Brownian motion. Ex. 2.6

Proof. The properties (B4) and (B0) follow directly from the assumptions; note that X0 ∼ N(0, 0) = δ0 , i.e. X0 = 0 almost surely. Set ∆ = (X t1 − X t0 , . . . , X t n − X t n−1 )⊤ and let M ∈ ℝn×n be the lower triangular matrix with entries 1 on and below the diagonal.

2.2 Brownian motion in ℝd

| 11

Then, as in Theorem 2.6, Γ = M∆ or ∆ = M −1 Γ where M −1 is a two-band matrix with entries 1 on the diagonal and −1 on the first sub-diagonal below the diagonal. Since Γ is Gaussian, we see 𝔼 e i⟨ξ, ∆⟩ = 𝔼 e i⟨ξ, M

−1

Γ⟩

= 𝔼 e i⟨(M

) ξ, Γ ⟩ (2.9)

1

−1 ⊤

1

−1

= e− 2 ⟨(M

−1 ⊤

= e− 2 ⟨ξ, M

) ξ, C(M −1 )⊤ ξ ⟩ C(M −1 )⊤ ξ ⟩

.

A straightforward calculation shows that M −1 C(M −1 )⊤ is a diagonal matrix 1

−1 (

.. ..

. .

..

. −1

t1 t1 )( . .. t1 1

t1 t2 .. . t2

⋅⋅⋅ ⋅⋅⋅ .. . ⋅⋅⋅

1 t1 t2 .. ) ( . tn

−1 .. .

.. ..

. .

t1 − t0 t2 − t1 )=( ). .. . −1 t n − t n−1 1

Thus, ∆ is a Gaussian random vector with uncorrelated, hence independent, components which are N(0, t j − t j−1 ) distributed. This proves (B1), (B3) and (B2).

2.2 Brownian motion in ℝd

We will now show that B t = (B1t , . . . , B dt ) is a BMd if, and only if, its coordinate proj cesses B t are independent one-dimensional Brownian motions. We call two stochastic processes (X t )t⩾0 and (Y t )t⩾0 (defined on the same probability space) independent, if the σ-algebras generated by these processes are independent: where

X F∞ := σ ( ⋃

(2.13)

X Y F∞ ⊥⊥ F∞



n⩾1 0⩽t1 0 have the same finite dimensional distributions. Since Ω W and the analogously defined set Ω B are determined by countably

Problems

| 17

many sets of the form {|W(r)| ⩽ 1n } and {|B(r)| ⩽ 1n }, we conclude that ℙ(Ω W ) = ℙ(Ω B ). Consequently, (B4)

ℙ(Ω W ) = ℙ(Ω B ) = ℙ(Ω) = 1.

This shows that (W t )t⩾0 is, on the smaller probability space (Ω W , Ω W ∩ A , ℙ) equipped with the trace σ-algebra Ω W ∩ A , a Brownian motion. The d-dimensional analogue follows easily from the observation that the coordinate processes of a BMd are independent BM1 , and vice versa, cf. Theorem 2.10. 2.18 Further reading. The literature on Gaussian processes is quite specialized and technical. Among the most accessible monographs are [178] (with a focus on the Dynkin isomorphism theorem and local times) and [169]. The seminal, and still highly readable, paper [61] is one of the most influential original contributions in the field. Dudley: Sample functions of the Gaussian process. [61] [84] Fernique: Fonctions aléatoires Gaussiennes. Vecteurs aléatoires Gaussiens. [169] Lifshits: Gaussian Random Functions [170] Lifshits: Lectures on Gaussian Processes. [178] Marcus, Rosen: Markov Processes, Gaussian Processes, and Local Times.

Problems 1.

2. 3.

Show that there exist a random vector (U, V) such that U and V are one-dimensional Gaussian random variables but (U, V) is not Gaussian. Hint. Try f(u, v) = g(u)g(v)(1 − sin u sin v) where g(u) = (2π)−1/2 e−u

2 /2

.

Let G ∼ N(m, C) be a d-dimensional Gaussian random vector and A ∈ ℝd×d . Show that AG ∼ N(Am, ACA⊤ ).

Let X, Γ ∼ N(0, t). Show that X ⊥⊥ Γ 󳨐⇒ X − Γ ⊥⊥ X + Γ and X ± Γ ∼ N(0, 2t).

4. Let G n ∼ N(0, t n ), n ⩾ 1, be sequence of Gaussian random variables. Show that d

5.

6.

7.

Gn 󳨀 → G if, and only if, the sequence (t n )n⩾1 is convergent.

Show that the covariance matrix C = (t j ∧ t k )j,k=1,...,n appearing in Theorem 2.6 is positive definite.

Verify that the matrix M in the proof of Theorem 2.6 and Corollary 2.7 is a lower triangular matrix with entries 1 on and below the diagonal. Show that the inverse matrix M −1 is a lower triangular matrix with entries 1 on the diagonal and −1 directly below the diagonal. Let A, B ∈ ℝn×n be symmetric matrices. If ⟨Ax, x⟩ = ⟨Bx, x⟩ for all x ∈ ℝn , then A = B.

18 | 2 Brownian motion as a Gaussian process 8. Let (B t )t⩾0 be a BM1 . Decide which of the following processes are Brownian motions: a) X t := 2B t/4 ; b) Y t := B2t − B t ; c) Z t := √t B1 (t ⩾ 0). 9.

Let (B(t))t⩾0 be a BM1 . a) Find the density of the random vector (B(s), B(t)) where 0 < s < t < ∞.

b) (Brownian bridge) Find the conditional density of the vector (B(s), B(t)), 0 < s < t < 1 under the condition B(1) = 0, and use this to determine 𝔼(B(s)B(t) | B(1) = 0). c) Let 0 < t1 < t2 < t3 < t4 < ∞. Determine the conditional density of the bi-variate random variable (B(t2 ), B(t3 )) given that B(t1 ) = B(t4 ) = 0. What is the conditional correlation of B(t2 ) and B(t3 )?

10. Find the correlation function C(s, t) := 𝔼(X s X t ), s, t ⩾ 0, of the stochastic process X t := B2t − t, t ⩾ 0, where (B t )t⩾0 is a BM1 .

11. (Stationary Ornstein–Uhlenbeck process) Let (B t )t⩾0 be a BM1 , α > 0 and consider the stochastic process X t := e−αt/2 B e αt , t ⩾ 0. a) Determine m(t) = 𝔼 X t and C(s, t) = 𝔼(X s X t ), s, t ⩾ 0. b) Find the probability density of (X t1 , . . . , X t n ) where 0 ⩽ t1 < ⋅ ⋅ ⋅ < t n < ∞.

B =⋃ 12. Let (B t )t⩾0 be a BM1 . Show that F∞

J⊂[0,∞) J countable

σ(B(t) : t ∈ J).

13. Let X(t), Y(t) be any two stochastic processes. Show that

(X(u1 ), . . . , X(u p )) ⊥⊥(Y(u1 ), . . . , Y(u p )) for all u1 < ⋅ ⋅ ⋅ < u p , p ⩾ 1

implies

(X(s1 ), . . . , X(s n )) ⊥⊥(Y(t1 ), . . . , Y(t m ))

for all

s1 < ⋅ ⋅ ⋅ < s m , t1 < ⋅ ⋅ ⋅ < t n , m, n ⩾ 1.

14. Let (Ft )t⩾0 and (Gt )t⩾0 be any two filtrations and define F∞ = σ( ⋃t⩾0 Ft ) and G∞ = σ( ⋃t⩾0 Gt ). Show that Ft ⊥⊥ Gt (∀t ⩾ 0) if, and only if, F∞ ⊥⊥ G∞ .

15. Use Lemma 2.8 to show that (all finite dimensional distributions of) a BMd is invariant under rotations. 16. Let B t = (b t , β t ), t ⩾ 0, be a two-dimensional Brownian motion. a) Show that W t := √1 (b t + β t ) is a BM1 . 2

b) Are X t := (W t , β t ) and Y t := Brownian motions?

1 (b √2 t

+ β t , b t − β t ), t ⩾ 0, two-dimensional

17. Let B t = (b t , β t ), t ⩾ 0, be a two-dimensional Brownian motion. For which values of λ, μ ∈ ℝ is the process X t := λb t + μβ t a BM1 ?

18. Let B t = (b t , β t ), t ⩾ 0, be a two-dimensional Brownian motion. Decide whether for s > 0 the process X t := (b t , β s−t − β t ), 0 ⩽ t ⩽ s is a two-dimensional Brownian motion.

Problems

| 19

19. Let B t = (b t , β t ), t ⩾ 0, be a two-dimensional Brownian motion and α ∈ [0, 2π). Show that W t = (b t ⋅ cos α + β t ⋅ sin α, β t ⋅ cos α − b t ⋅ sin α)⊤ is a two-dimensional Brownian motion. Find a suitable d-dimensional generalization of this observation. 20. Let X be a Q-BM. Determine Cov(X t , X sk ), the characteristic function of X(t) and the transition probability (density) of X(t). j

21. Show that (B1) is equivalent to B t − B s ⊥⊥ FsB for all 0 ⩽ s < t.

22. Let (B t )t⩾0 be a BM1 and set τ a = inf{s ⩾ 0 : B s = a} where a ∈ ℝ. Show that τ a ∼ τ−a and τ a ∼ a2 τ1 .

23. A ℝ-valued stochastic process (X t )t⩾0 such that 𝔼(X 2t ) < ∞ is stationary (in the wide sense) if m(t) = 𝔼 X t ≡ const. and C(s, t) = 𝔼(X s X t ) = g(t − s), 0 ⩽ s ⩽ t < ∞ for some even function g : ℝ → ℝ. Which of the following processes is stationary? a) W t = B2t − t; b) X t = e−αt/2 B e αt ; c) Y t = B t+h − B t ; d) Z t = B e t .

24. Let (B t )t∈[0,1] and (β t )t∈[0,1] be independent one-dimensional Brownian motions. Show that the following process is again a Brownian motion: {B t , if t ∈ [0, 1], W t := { B + tβ1/t − β1 , if t ∈ (1, ∞). { 1 25. Find out whether the processes X(t) := B(e t ) and X(t) := e−t/2 B(e t ), t ⩾ 0, have the no-memory property, i.e. σ(X(t) : t ⩽ a) ⊥⊥ σ(X(t + a) − X(a) : t ⩾ 0) for a > 0. 26. Prove the time inversion property from Paragraph 2.15.

27. (Strong law of large numbers) Let (B t )t⩾0 be a BM1 . Use Paragraph 2.17 to show that limt→∞ B t /t = 0 a.s. and in mean square sense. 28. (Fractional Brownian motion) Fractional Brownian motion (fBM, α-fBM) (B αt )t∈ℝ with Hurst index H = α/2 ∈ (0, 1] is a Gaussian process with the following mean and covariance structure 1 𝔼 B αt = 0 and 𝔼 B αs B αt = (|s|α + |t|α − |s − t|α ) . 2 Alternatively one can require that 𝔼 B αt = 0,

B0α = 0

and

𝔼 [(B αs − B αt )2 ] = |s − t|α .

a) Check the equivalence of the two definitions for a fBM.

b) Show that fBM exists, i.e. find the probability distribution of (B αt1 , . . . , B αtn ).

c) Identify the processes relating to the particular cases α = 1 and α = 2.

d) Show that an α-fBM is H-self-similar, i.e. W tα := c−H B αct , t ⩾ 0, defines for each c > 0 an α-fBM.

e) Show that fBM has stationary increments. f)

Show that for α ≠ 1 the increments are dependent

3 Constructions of Brownian motion There are several different ways to construct Brownian motion and we will present a few of them here. Since a d-dimensional Brownian motion can be obtained from d independent one-dimensional Brownian motions, see Section 2.2, we restrict our attention to the one-dimensional setting. If this is your first encounter with Brownian motion, you should restrict your attention to Sections 3.1 and 3.2 or Section 3.4 which contain the most intuitive constructions.

3.1 A random orthogonal series Some of the earliest mathematical constructions of Brownian motion are based on random orthogonal series. The idea is to write the paths [0, 1] ∋ t 󳨃→ B t (ω) for (almost) every ω as a random series with respect to a complete orthonormal system (ONS) in 1 the Hilbert space L2 (dt) = L2 ([0, 1], dt) with scalar product ⟨f, g⟩L2 = ∫0 f(t)g(t) dt. Assume that (ϕ n )n⩾0 is any complete ONS and let (G n )n⩾0 be a sequence of real-valued iid Gaussian N(0, 1)-random variables on the probability space (Ω, A , ℙ). Set N−1

N−1

t

n=0

n=0

0

W N (t) := ∑ G n ⟨𝟙[0,t) , ϕ n ⟩L2 = ∑ G n ∫ ϕ n (s) ds.

Ex. 3.1 Ex. 3.2

We want to show that limN→∞ W N (t) defines a Brownian motion on [0, 1].

3.1 Lemma. The limit W(t) := limN→∞ W N (t) exists for every t ∈ [0, 1] in L2 (ℙ) and the process W(t) satisfies (B0)–(B3).

Proof. Using the independence of the G n ∼ N(0, 1) and Parseval’s identity we get for every t ∈ [0, 1] and M < N N−1

𝔼 ((W N (t) − W M (t))2 ) = 𝔼 [ ∑ G n G m ⟨𝟙[0,t) , ϕ m ⟩L2 ⟨𝟙[0,t) , ϕ n ⟩L2 ] m,n=M

N−1

= ∑ 𝔼(G n G m )⟨𝟙[0,t) , ϕ m ⟩L2 ⟨𝟙[0,t) , ϕ n ⟩L2 m,n=M N−1

=0 (n=m), ̸ or =1 (n=m)

= ∑ ⟨𝟙[0,t) , ϕ n ⟩2L2 󳨀󳨀󳨀󳨀󳨀󳨀󳨀→ 0. n=M

M,N→∞

This shows that W(t) = L2 - limN→∞ W N (t) exists and ∞

𝔼 W 2 (t) = ∑ ⟨𝟙[0,t) , ϕ n ⟩2L2 = ⟨𝟙[0,t) , 𝟙[0,t) ⟩L2 = t. n=0

https://doi.org/10.1515/9783110741278-003

3.1 A random orthogonal series | 21

An analogous calculation yields for s < t and u < v ∞

𝔼(W(t) − W(s))(W(v) − W(u)) = ∑ ⟨𝟙[0,t) − 𝟙[0,s) , ϕ n ⟩L2 ⟨𝟙[0,v) − 𝟙[0,u) , ϕ n ⟩L2 n=0

and now it is easy to see that

= ⟨𝟙[s,t) , 𝟙[u,v) ⟩L2 ,

t − s, { { { 𝔼(W(t) − W(s))(W(v) − W(u)) = {0, { { + {(v ∧ t − u ∨ s) ,

if [s, t) = [u, v);

if [s, t) ∩ [u, v) = 0; in general.

With this calculation we find for all 0 ⩽ s < t ⩽ u < v and ξ, η ∈ ℝ 𝔼 [ exp (iξ(W(t) − W(s)) + iη(W(v) − W(u))) ] N−1

= lim 𝔼 [exp (i ∑ (ξ⟨𝟙[s,t) , ϕ n ⟩ + η⟨𝟙[u,v) , ϕ n ⟩) G n )] N→∞

n=0

N−1

iid

= lim ∏ 𝔼 [exp ((iξ⟨𝟙[s,t) , ϕ n ⟩ + iη⟨𝟙[u,v) , ϕ n ⟩) G n )] N→∞

n=0

1󵄨 󵄨2 = lim ∏ exp (− 󵄨󵄨󵄨ξ⟨𝟙[s,t) , ϕ n ⟩ + η⟨𝟙[u,v) , ϕ n ⟩󵄨󵄨󵄨 ) N→∞ 2 n=0 N−1

(since 𝔼 e iζG n = exp(− 21 |ζ|2 ))

= lim exp(− N→∞

1 N−1 2 ∑ [ξ ⟨𝟙[s,t) , ϕ n ⟩2 + η2 ⟨𝟙[u,v) , ϕ n ⟩2 ]) 2 n=0 N−1

× lim exp(− ∑ ξη⟨𝟙[s,t) , ϕ n ⟩⟨𝟙[u,v) , ϕ n ⟩) N→∞

n=0

󳨀󳨀→0 1 2 1 2 = exp (− ξ (t − s)) exp (− η (v − u)) . 2 2

This calculation shows 󳶳 W(t) − W(s) ∼ N(0, t − s), if we take η = 0; 󳶳 W(t − s) ∼ N(0, t − s), if we take η = 0, s = 0 and replace t by t − s; 󳶳 W(t) − W(s) ⊥⊥ W(v) − W(u), since ξ, η are arbitrary.

The independence of finitely many increments can be seen in the same way. Since W(0) = 0 is obvious, we are done.

The series defining W(t) is a series of independent random variables which converges in L2 (ℙ), hence in probability. From classical probability theory we know that

22 | 3 Constructions of Brownian motion

𝑗

𝑗

22

22 𝑘+1 2𝑗

𝑘 2𝑗

𝑗 1 −2 2 2

1

𝑘+1 2𝑗

𝑘 2𝑗

1

𝑗

𝑗

−2 2

−2 2

Fig. 3.1: The Haar functions H2j +k and the Schauder functions S2j +k .

limN→∞ W N (t) = W(t) almost surely. But this is not enough to ensure that the path t 󳨃→ W(t, ω) is (ℙ almost surely) continuous. The general theorem due to Itô–Nisio [122] would do the trick, but we prefer to give a direct proof using a special complete orthonormal system.

3.2 The Lévy–Ciesielski construction This approach goes back to Lévy [164, pp. 492–494] but it got its definitive form in the hands of Ciesielski, cf. [37; 38]. Following Ciesielski we use the Haar orthonormal system in the random orthogonal series from Section 3.1. Write (H n )n⩾0 and (S n )n⩾0 for the families of Haar and Schauder functions, respectively. If n = 0 and n = 2j + k, j ⩾ 0, k = 0, 1, . . . , 2j − 1, they are defined as H0 (t) = 1,

j

{+2 2 , { { { { { j H2j +k (t) = {−2 2 , { { { { { {0,

on [ 2kj , on

2k+1 ) 2j+1

, k+1 ) [ 2k+1 2j+1 2j

S0 (t) = t

S2j +k (t) = ⟨𝟙[0,t] , H2j +k ⟩L2 t

,

otherwise

= ∫ H2j +k (s) ds, 0

supp S n = supp H n .

For n = 2j + k the graphs of the Haar and Schauder functions are shown in Figure 3.1. 1 Obviously, ∫0 H n (t) dt = 0 for all n ⩾ 1 and H2j +k H2j +ℓ = S2j +k S2j +ℓ = 0

for all j ⩾ 0, k ≠ ℓ.

3.2 The Lévy–Ciesielski construction | 23

The Schauder functions S_{2^j+k} are tent functions with support [k2^{−j}, (k+1)2^{−j}] and maximal value ½·2^{−j/2} at the tip of the tent. The Haar functions are also an orthonormal system in L² = L²([0,1], dt), i.e.

⟨H_n, H_m⟩_{L²} = ∫_0^1 H_n(t)H_m(t) dt = 1 if m = n, and = 0 if m ≠ n,  for all n, m ⩾ 0,

and they are a basis of L²([0,1], ds), i.e. a complete orthonormal system, see Theorem A.43 in the appendix.

3.2 Theorem (Lévy 1940; Ciesielski 1959). There is a probability space (Ω, A, ℙ) and a sequence of iid standard normal random variables (G_n)_{n⩾0} such that

W(t, ω) = ∑_{n=0}^∞ G_n(ω)⟨𝟙_{[0,t)}, H_n⟩_{L²} = ∑_{n=0}^∞ G_n(ω)S_n(t),  t ∈ [0,1],   (3.1)

is a Brownian motion.

Proof. Let W(t) be as in Lemma 3.1 where we use the Haar functions H_n as orthonormal system in L²(dt). As probability space (Ω, A, ℙ) we use the probability space which supports the (countably many) independent Gaussian random variables (G_n)_{n⩾0} from the construction of W(t). It is enough to prove that the sample paths t ↦ W(t, ω) are continuous. By definition, the partial sums

W_N(t, ω) = ∑_{n=0}^{N−1} G_n(ω)⟨𝟙_{[0,t)}, H_n⟩_{L²} = ∑_{n=0}^{N−1} G_n(ω)S_n(t)   (3.2)

are continuous for all N ⩾ 1, and it is sufficient to prove that (a subsequence of) (W_N(t))_{N⩾0} converges uniformly to W(t). The next step of the proof is similar to the proof of the Riesz–Fischer theorem on the completeness of L^p spaces, see e.g. [232, Theorem 13.7]. Consider the random variable

Δ_j(t) := W_{2^{j+1}}(t) − W_{2^j}(t) = ∑_{k=0}^{2^j−1} G_{2^j+k} S_{2^j+k}(t).

If k ≠ ℓ, then S_{2^j+k} S_{2^j+ℓ} = 0, and we see

|Δ_j(t)|⁴ = ∑_{k,ℓ,p,q=0}^{2^j−1} G_{2^j+k}G_{2^j+ℓ}G_{2^j+p}G_{2^j+q} S_{2^j+k}(t)S_{2^j+ℓ}(t)S_{2^j+p}(t)S_{2^j+q}(t)
  = ∑_{k=0}^{2^j−1} G⁴_{2^j+k} |S_{2^j+k}(t)|⁴
  ⩽ ∑_{k=0}^{2^j−1} G⁴_{2^j+k} 2^{−2j}.

Ex. 3.3

Since the right-hand side does not depend on t ∈ [0,1] and since 𝔼 G_n⁴ = 3 (use 2.3 for G_n ∼ N(0,1)) we get

𝔼[sup_{t∈[0,1]} |Δ_j(t)|⁴] ⩽ 3 ∑_{k=0}^{2^j−1} 2^{−2j} = 3·2^{−j}  for all j ⩾ 1.

For n < N we find using Minkowski's inequality in L⁴(ℙ)

‖ sup_{N>n} sup_{t∈[0,1]} |W_{2^N}(t) − W_{2^n}(t)| ‖_{L⁴} ⩽ ‖ ∑_{j=n+1}^∞ sup_{t∈[0,1]} |W_{2^j}(t) − W_{2^{j−1}}(t)| ‖_{L⁴}
  ⩽ ∑_{j=n+1}^∞ ‖ sup_{t∈[0,1]} |Δ_{j−1}(t)| ‖_{L⁴}
  ⩽ 3^{1/4} ∑_{j=n+1}^∞ 2^{−(j−1)/4} → 0  as n → ∞.

By monotone convergence we get

lim_{n→∞} ‖ sup_{N>n} sup_{t∈[0,1]} |W_{2^N}(t) − W_{2^n}(t)| ‖_{L⁴} = ‖ lim_{n→∞} sup_{N>n} sup_{t∈[0,1]} |W_{2^N}(t) − W_{2^n}(t)| ‖_{L⁴} = 0.

Ex. 3.4

Ex. 1.8

This shows that there exists a subset Ω₀ ⊂ Ω with ℙ(Ω₀) = 1 such that

lim_{n→∞} sup_{N>n} sup_{t∈[0,1]} |W_{2^N}(t, ω) − W_{2^n}(t, ω)| = 0  for all ω ∈ Ω₀.

Because of the completeness of the space of continuous functions, the (sub-)sequence (W_{2^j}(t, ω))_{j⩾1} converges uniformly in t ∈ [0,1]. Since we know that W(t, ω) is the limiting function, we conclude that W(t, ω) inherits the continuity of the W_{2^j}(t, ω) for all ω ∈ Ω₀. It remains to define Brownian motion everywhere on Ω. We set

W̃(t, ω) := W(t, ω) if ω ∈ Ω₀,  and  W̃(t, ω) := 0 if ω ∉ Ω₀.

As ℙ(Ω \ Ω₀) = 0, Lemma 3.1 remains valid for W̃, which means that W̃(t, ω) is a one-dimensional Brownian motion indexed by [0,1]. Strictly speaking, W is not a Brownian motion, but it has a restriction W̃ which is a Brownian motion.

Ex. 3.6
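The partial sums (3.2) lend themselves directly to simulation. The following sketch is our own illustration (it is not part of the original text; the function names, the truncation level J and the evaluation grid are arbitrary choices): it evaluates the truncated Schauder series G₀S₀(t) + ∑_{j<J} ∑_k G_{2^j+k}S_{2^j+k}(t) on a grid in [0, 1].

```python
import numpy as np

def schauder(j, k, t):
    """Schauder function S_{2^j+k}: a tent on [k/2^j, (k+1)/2^j] with peak 2^{-j/2}/2."""
    left, mid, right = k / 2**j, (2*k + 1) / 2**(j + 1), (k + 1) / 2**j
    up   = 2**(j / 2) * (t - left)      # rising edge, slope +2^{j/2}
    down = 2**(j / 2) * (right - t)     # falling edge, slope -2^{j/2}
    return np.where((t >= left) & (t < mid), up,
           np.where((t >= mid) & (t < right), down, 0.0))

def levy_ciesielski(J, t, rng):
    """Truncated series (3.1)/(3.2): W(t) ~ G_0*t + sum_{j<J} sum_k G_{2^j+k} S_{2^j+k}(t)."""
    W = rng.standard_normal() * t       # S_0(t) = t
    for j in range(J):
        G = rng.standard_normal(2**j)
        for k in range(2**j):
            W = W + G[k] * schauder(j, k, t)
    return W

rng = np.random.default_rng(1)
t = np.linspace(0, 1, 1025)
path = levy_ciesielski(10, t, rng)      # one approximate Brownian path on [0, 1]
```

Increasing J refines the path on ever finer dyadic scales, exactly as in the uniform-convergence argument of the proof above.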

3.3 Definition. Let (X_t)_{t∈I} and (Y_t)_{t∈I} be two ℝ^d-valued stochastic processes with the same index set. We call X and Y
indistinguishable if they are defined on the same probability space and if ℙ(X_t = Y_t for all t ∈ I) = 1;
modifications if they are defined on the same probability space and ℙ(X_t = Y_t) = 1 for all t ∈ I;
equivalent if they have the same finite dimensional distributions, i.e. (X_{t_1}, ..., X_{t_n}) ∼ (Y_{t_1}, ..., Y_{t_n}) for all t_1, ..., t_n ∈ I, n ⩾ 1 (X, Y need not be defined on the same probability space).

Now it is easy to construct a real-valued Brownian motion for all t ⩾ 0.

3.4 Corollary. Let (W^k(t))_{t∈[0,1]}, k ⩾ 0, be independent real-valued Brownian motions on the same probability space. Then

B(t) := W⁰(t) for t ∈ [0,1);  W⁰(1) + W¹(t−1) for t ∈ [1,2);  ∑_{j=0}^{k−1} W^j(1) + W^k(t−k) for t ∈ [k, k+1), k ⩾ 2,   (3.3)

is a BM¹ indexed by t ∈ [0, ∞).

Proof. Let B^k be a copy of the process W from Theorem 3.2 and denote the corresponding probability space by (Ω_k, A_k, ℙ_k). On the product space (Ω, A, ℙ) = ⨂_{k⩾1}(Ω_k, A_k, ℙ_k), ω = (ω₁, ω₂, ...), we define processes W^k(ω) := B^k(ω_k); by construction, these are independent Brownian motions on (Ω, A, ℙ). Let us check (B0)–(B4) for the process (B(t))_{t⩾0} defined by (3.3). The properties (B0) and (B4) are obvious. Let s < t and assume that s ∈ [ℓ, ℓ+1), t ∈ [m, m+1) where ℓ ⩽ m. Then

B(t) − B(s) = ∑_{j=0}^{m−1} W^j(1) + W^m(t−m) − ∑_{j=0}^{ℓ−1} W^j(1) − W^ℓ(s−ℓ) = ∑_{j=ℓ}^{m−1} W^j(1) + W^m(t−m) − W^ℓ(s−ℓ),

so that B(t) − B(s) = W^m(t−m) − W^m(s−m) ∼ N(0, t−s) if ℓ = m, while for ℓ < m

B(t) − B(s) = (W^ℓ(1) − W^ℓ(s−ℓ)) + ∑_{j=ℓ+1}^{m−1} W^j(1) + W^m(t−m);

in the latter case the summands are independent, hence B(t) − B(s) ∼ N(0, 1−s+ℓ) ⋆ N(0,1)^{⋆(m−ℓ−1)} ⋆ N(0, t−m) = N(0, t−s). This proves (B2). We show (B1) only for two increments B(u) − B(t) and B(t) − B(s) where s < t < u. As before, s ∈ [ℓ, ℓ+1), t ∈ [m, m+1) and u ∈ [n, n+1) where ℓ ⩽ m ⩽ n. Then we have that

B(u) − B(t) = W^m(u−m) − W^m(t−m) if m = n,  and  B(u) − B(t) = (W^m(1) − W^m(t−m)) + ∑_{j=m+1}^{n−1} W^j(1) + W^n(u−n) if m < n.

By assumption, the random variables appearing in the representation of the increments B(u) − B(t) and B(t) − B(s) are independent, which means that the increments are independent. The case of finitely many, not necessarily adjacent, increments is similar.

3.3 Wiener’s construction This is also a series approach, but Wiener used the trigonometric functions (e inπt )n∈ℤ as orthonormal basis for L2 [0, 1]. In this case we set ϕ n (t) := √1 (e inπt + e−inπt ) and 2 obtain Brownian motion on [0, 1] as a Wiener–Fourier series ∞

sin(nπt) G n (ω), nπ n=1

W(t, ω) := tG0 (ω) + √2 ∑

(3.4)

where (G n )n⩾0 are iid standard normal random variables. Lemma 3.1 remains valid for (3.4) and shows that the series converges in L2 and that the limit satisfies (B0)–(B3). 3.5 Theorem (Wiener 1923, 1924). The random process (W t )t∈[0,1] defined by (3.4) is a one-dimensional Brownian motion.

Using Corollary 3.4 we can extend the construction of Theorem 3.5 to get a Brownian motion with time set [0, ∞).

Proof of Theorem 3.5. It is enough to prove that (3.4) defines a continuous random process. Let

W_N(t, ω) := tG₀(ω) + √2 ∑_{n=1}^N (sin(nπt)/(nπ)) G_n(ω).

We will show that (W_{2^n})_{n⩾1} is a Cauchy sequence in L²(ℙ) uniformly for all t ∈ [0,1]. Set Δ_j(t) := W_{2^{j+1}}(t) − W_{2^j}(t). Using |Im z| ⩽ |z| for z ∈ ℂ, we see

|Δ_j(t)|² = (2/π²) (∑_{k=2^j+1}^{2^{j+1}} (sin(kπt)/k) G_k)² ⩽ |∑_{k=2^j+1}^{2^{j+1}} (e^{ikπt}/k) G_k|²,

3.3 Wiener’s construction | 27

and since |z|² = z z̄ we get

|Δ_j(t)|² ⩽ ∑_{k=2^j+1}^{2^{j+1}} ∑_{ℓ=2^j+1}^{2^{j+1}} e^{ikπt} e^{−iℓπt} G_k G_ℓ/(kℓ)
  = ∑_{k=2^j+1}^{2^{j+1}} G_k²/k² + 2 Re ∑_{m=1}^{2^j−1} e^{imπt} ∑_{ℓ=2^j+1}^{2^{j+1}−m} G_ℓ G_{ℓ+m}/(ℓ(ℓ+m))   (substituting m = k − ℓ)
  ⩽ ∑_{k=2^j+1}^{2^{j+1}} G_k²/k² + 2 ∑_{m=1}^{2^j−1} |∑_{ℓ=2^j+1}^{2^{j+1}−m} G_ℓ G_{ℓ+m}/(ℓ(ℓ+m))|.

Note that the right-hand side is independent of t. On both sides we can therefore take the supremum over all t ∈ [0,1] and then the mean value 𝔼(⋯). With Jensen's inequality we conclude that 𝔼|Z| ⩽ √(𝔼|Z|²) and so

𝔼(sup_{t∈[0,1]} |Δ_j(t)|²) ⩽ ∑_{k=2^j+1}^{2^{j+1}} 𝔼(G_k²)/k² + 2 ∑_{m=1}^{2^j−1} 𝔼|∑_{ℓ=2^j+1}^{2^{j+1}−m} G_ℓ G_{ℓ+m}/(ℓ(ℓ+m))|
  ⩽ ∑_{k=2^j+1}^{2^{j+1}} 𝔼(G_k²)/k² + 2 ∑_{m=1}^{2^j−1} √(𝔼(∑_{ℓ=2^j+1}^{2^{j+1}−m} G_ℓ G_{ℓ+m}/(ℓ(ℓ+m)))²).

If we expand the square we get expressions of the form 𝔼(G_ℓ G_{ℓ+m} G_{ℓ'} G_{ℓ'+m}); since the G_n, n ∈ ℕ₀, are independent mean zero random variables, these expectations are zero whenever ℓ ≠ ℓ'. Thus,

𝔼(sup_{t∈[0,1]} |Δ_j(t)|²) ⩽ ∑_{k=2^j+1}^{2^{j+1}} 1/k² + 2 ∑_{m=1}^{2^j−1} √(∑_{ℓ=2^j+1}^{2^{j+1}−m} 1/(ℓ²(ℓ+m)²))
  ⩽ ∑_{k=2^j+1}^{2^{j+1}} 1/(2^j)² + 2 ∑_{m=1}^{2^j−1} √(∑_{ℓ=2^j+1}^{2^{j+1}−m} 1/((2^j)²(2^j)²))
  ⩽ 2^j·2^{−2j} + 2·2^j·√(2^j·2^{−4j}) ⩽ 3·2^{−j/2}.

From this point onwards we can follow the proof of Theorem 3.2: for n < N we find using Minkowski's inequality in L²(ℙ)

‖ sup_{N>n} sup_{t∈[0,1]} |W_{2^N}(t) − W_{2^n}(t)| ‖_{L²} ⩽ ‖ ∑_{j=n+1}^∞ sup_{t∈[0,1]} |W_{2^j}(t) − W_{2^{j−1}}(t)| ‖_{L²}
  ⩽ ∑_{j=n+1}^∞ ‖ sup_{t∈[0,1]} |Δ_{j−1}(t)| ‖_{L²}
  ⩽ √3 ∑_{j=n+1}^∞ 2^{−(j−1)/4} → 0  as n → ∞.

By monotone convergence we get

lim_{n→∞} ‖ sup_{N>n} sup_{t∈[0,1]} |W_{2^N}(t) − W_{2^n}(t)| ‖_{L²} = ‖ lim_{n→∞} sup_{N>n} sup_{t∈[0,1]} |W_{2^N}(t) − W_{2^n}(t)| ‖_{L²} = 0.

Ex. 3.4

Ex. 1.8

This shows that there exists a subset Ω₀ ⊂ Ω with ℙ(Ω₀) = 1 such that

lim_{n→∞} sup_{N>n} sup_{t∈[0,1]} |W_{2^N}(t, ω) − W_{2^n}(t, ω)| = 0  for all ω ∈ Ω₀.

By the completeness of the space of continuous functions, there is a subsequence of (W_{2^j}(t, ω))_{j⩾1} which converges uniformly in t ∈ [0,1]. Since we know that W(t, ω) is the limiting function, we conclude that W(t, ω) inherits the continuity of the W_{2^j}(t, ω) for all ω ∈ Ω₀. It remains to define Brownian motion everywhere on Ω. We set

W̃(t, ω) := W(t, ω) if ω ∈ Ω₀,  and  W̃(t, ω) := 0 if ω ∉ Ω₀.

Since ℙ(Ω \ Ω₀) = 0, Lemma 3.1 remains valid for W̃, which means that W̃(t, ω) is a one-dimensional Brownian motion indexed by [0,1].
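The truncated Fourier series (3.4) is just as easy to simulate as the Schauder series. The sketch below is our own illustration (not from the book; the truncation level N and the grid are arbitrary choices).

```python
import numpy as np

def wiener_series(N, t, rng):
    """Truncated Wiener-Fourier series (3.4):
    W_N(t) = t*G_0 + sqrt(2) * sum_{n=1}^{N} sin(n*pi*t)/(n*pi) * G_n."""
    G = rng.standard_normal(N + 1)
    n = np.arange(1, N + 1)
    basis = np.sin(np.outer(t, n) * np.pi) / (n * np.pi)   # shape (len(t), N)
    return t * G[0] + np.sqrt(2) * basis @ G[1:]

rng = np.random.default_rng(2)
t = np.linspace(0, 1, 1001)
path = wiener_series(2**12, t, rng)     # one approximate Brownian path on [0, 1]
```

The slow 1/n decay of the coefficients is the reason why the uniform-convergence estimate above only yields the rate 2^{−j/2} per dyadic block.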

3.4 Lévy’s original argument

Lévy’s original argument is based on interpolation. Let s < t < u and assume that (W t )t∈[0,1] is a Brownian motion. Then G󸀠 := W(t) − W(s) and

G󸀠󸀠 := W(u) − W(t)

are independent Gaussian random variables with mean 0 and variances t − s and u − t, respectively. In order to find the positions W(s), W(t), W(u) we could either determine W(s) and then G󸀠 and G󸀠󸀠 independently or, equivalently, work out first W(s) and

3.4 Lévy’s original argument | 29

Fig. 3.2: The first four interpolation steps in Lévy's construction of Brownian motion.

W(u) − W(s), and obtain W(t) by interpolation. Let Γ be a further standard normal random variable which is independent of W(s) and W(u) − W(s). Then

((u−t)W(s) + (t−s)W(u))/(u−s) + √((t−s)(u−t)/(u−s)) Γ = W(s) + ((t−s)/(u−s))(W(u) − W(s)) + √((t−s)(u−t)/(u−s)) Γ

is – because of the independence of W(s), W(u) − W(s) and Γ – a Gaussian random variable with mean 0 and variance t. Thus, we get a random variable with the same distribution as W(t), and so

ℙ(W(t) ∈ ⋅ | W(s) = x, W(u) = z) = N(m_t, σ_t²)  with  m_t = ((u−t)x + (t−s)z)/(u−s)  and  σ_t² = (t−s)(u−t)/(u−s).   (3.5)

If t is the midpoint of [s, u], Lévy's method leads to the following prescription:

3.6 Algorithm. Set W(0) = 0 and let W(1) be an N(0,1) distributed random variable. Let n ⩾ 1 and assume that the random variables W(k2^{−n}), k = 1, ..., 2^n − 1, have already been constructed. Then

W(ℓ2^{−n−1}) := W(k2^{−n}) if ℓ = 2k,  and  W(ℓ2^{−n−1}) := ½(W(k2^{−n}) + W((k+1)2^{−n})) + Γ_{2^n+k} if ℓ = 2k+1,

where Γ_{2^n+k} is an independent (of everything else) Gaussian random variable with mean 0 and variance 2^{−n}/4, cf. Figure 3.3. In-between the nodes we use piecewise-linear interpolation:

W_{2^n}(t, ω) := linear interpolation of (W(k2^{−n}, ω), k = 0, 1, ..., 2^n),  n ⩾ 1.

At the dyadic points t = k2^{−n} we get the "true" value of W(t, ω), while the linear interpolation is an approximation, see Figure 3.2. We will now see that these piecewise linear functions converge to the random function t ↦ W(t, ω). The refined increments W(ℓ2^{−(n+1)}) − W((ℓ−1)2^{−(n+1)}) are again independent. This can be seen from the following argument:
1° Fix n. By assumption W(k2^{−n}) − W((k−1)2^{−n}), k = 1, ..., 2^n, are independent.
2° Pick k₀ ∈ ℕ and some Γ = Γ_{2^n+k₀} ∼ N(0, ¼·2^{−n}) which is independent of everything else. Therefore, see e.g. Exercise 2.3,
  X := ½(W(k₀2^{−n}) − W((k₀−1)2^{−n})) ∼ N(0, ¼·2^{−n})  ⟹  X + Γ ⊥⊥ X − Γ.
3° Since both X + Γ ⊥⊥ X − Γ and (X + Γ, X − Γ) ⊥⊥ {W(k2^{−n}) − W((k−1)2^{−n}), k ≠ k₀}, a quick calculation (e.g. using characteristic functions and Kac's theorem) shows that the random variables
  X + Γ = W((2k₀−1)2^{−(n+1)}) − W((2k₀−2)2^{−(n+1)}),
  X − Γ = W(2k₀2^{−(n+1)}) − W((2k₀−1)2^{−(n+1)}),
  W(k2^{−n}) − W((k−1)2^{−n}), for all k ≠ k₀,
are independent.

Repeating these steps for all possible values of k₀ finishes the proof.

3.7 Theorem (Lévy 1940). The series

W(t, ω) := ∑_{n=0}^∞ (W_{2^{n+1}}(t, ω) − W_{2^n}(t, ω)) + W_1(t, ω),  t ∈ [0,1],

converges a.s. uniformly. In particular (W(t))_{t∈[0,1]} is a BM¹.

Using Corollary 3.4 we can extend the construction of Theorem 3.7 to get a Brownian motion with time set [0, ∞).
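Algorithm 3.6 translates into a short bisection routine. The sketch below is our own illustration (names and the number of refinement levels are arbitrary, not from the book): it doubles the dyadic grid in each step, adding an independent centred Gaussian perturbation with variance 2^{−n}/4 at every new midpoint, exactly as prescribed above.

```python
import numpy as np

def levy_bisection(n_levels, rng):
    """Algorithm 3.6: W(0)=0, W(1)~N(0,1); repeatedly insert dyadic midpoints
    W((2k+1)/2^{n+1}) = (W(k/2^n) + W((k+1)/2^n))/2 + Gamma,  Gamma ~ N(0, 2^{-n}/4)."""
    W = np.array([0.0, rng.standard_normal()])          # values at t = 0 and t = 1
    for n in range(n_levels):
        mid = 0.5 * (W[:-1] + W[1:]) \
              + rng.standard_normal(len(W) - 1) * np.sqrt(2.0**(-n) / 4)
        new = np.empty(2 * len(W) - 1)
        new[0::2] = W                                    # keep the old dyadic nodes
        new[1::2] = mid                                  # insert the new midpoints
        W = new
    t = np.linspace(0, 1, len(W))                        # grid of mesh 2^{-n_levels}
    return t, W

t, W = levy_bisection(12, np.random.default_rng(3))     # W_{2^12} on [0, 1]
```

Theorem 3.7 below guarantees that these piecewise linear approximations converge almost surely uniformly to a Brownian path.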

3.4 Lévy’s original argument |


Fig. 3.3: The n + 1st interpolation step in Lévy’s construction. The nodes of the previous generation are s = (k0 − 1)2−n = (2k0 − 2)2−n−1 and u = k0 2−n = 2k0 2−n−1 ; now we add all midpoints t = 12 (s + u) = (2k0 − 1)2−n−1 . Then X = 12 (W(k0 2−n ) − W((k0 − 1)2−n )) ∼ N(0, 14 2−n ), Γ = Γ2n +k0 and ∆ n (t) = W2n+1 (t) − W2n (t).

Proof of Theorem 3.7. Set Δ_n(t, ω) := W_{2^{n+1}}(t, ω) − W_{2^n}(t, ω). By construction,

Δ_n((2k−1)2^{−n−1}, ω) = Γ_{2^n+(k−1)}(ω) ∼ N(0, ¼·2^{−n}),  k = 1, 2, ..., 2^n,

are iid N(0, 2^{−(n+2)}) distributed random variables. Therefore,

ℙ( max_{1⩽k⩽2^n} |Δ_n((2k−1)2^{−n−1})| > x_n/√(2^{n+2}) ) ⩽ 2^n ℙ( |√(2^{n+2}) Δ_n(2^{−n−1})| > x_n ),

and the right-hand side equals

(2·2^n/√(2π)) ∫_{x_n}^∞ e^{−r²/2} dr ⩽ (2^{n+1}/√(2π)) ∫_{x_n}^∞ (r/x_n) e^{−r²/2} dr = (2^{n+1}/(x_n√(2π))) e^{−x_n²/2}.

Choose c > 1 and x_n := c√(2n log 2). Then

∑_{n=1}^∞ ℙ( max_{1⩽k⩽2^n} |Δ_n((2k−1)2^{−n−1})| > x_n/√(2^{n+2}) ) ⩽ ∑_{n=1}^∞ (2^{n+1}/(c√(2π))) e^{−c² n log 2} = (2/(c√(2π))) ∑_{n=1}^∞ 2^{−(c²−1)n} < ∞.

Using the Borel–Cantelli lemma we find a set Ω₀ ⊂ Ω with ℙ(Ω₀) = 1 such that for every ω ∈ Ω₀ there is some N(ω) ⩾ 1 with

max_{1⩽k⩽2^n} |Δ_n((2k−1)2^{−n−1})| ⩽ c√(n log 2 / 2^{n+1})  for all n ⩾ N(ω).

for all n ⩾ N(ω).

Δ_n(t) is the distance between the polygonal arcs W_{2^{n+1}}(t) and W_{2^n}(t); the maximum is attained at one of the midpoints of the intervals [(k−1)2^{−n}, k2^{−n}], k = 1, ..., 2^n, see Figure 3.3. Thus

sup_{0⩽t⩽1} |W_{2^{n+1}}(t, ω) − W_{2^n}(t, ω)| ⩽ max_{1⩽k⩽2^n} |Δ_n((2k−1)2^{−n−1}, ω)| ⩽ c√(n log 2 / 2^{n+1})  for all n ⩾ N(ω),

which shows that the limit

W(t, ω) := lim_{N→∞} W_{2^N}(t, ω) = ∑_{n=0}^∞ (W_{2^{n+1}}(t, ω) − W_{2^n}(t, ω)) + W_1(t, ω)

exists for all ω ∈ Ω₀ uniformly in t ∈ [0,1]. Therefore, t ↦ W(t, ω), ω ∈ Ω₀, inherits the continuity of the polygonal arcs t ↦ W_{2^n}(t, ω). Set

W̃(t, ω) := W(t, ω) if ω ∈ Ω₀,  and  W̃(t, ω) := 0 if ω ∉ Ω₀.

By construction, we find for all 0 ⩽ j ⩽ k ⩽ 2^n ℙ-almost surely

W̃(k2^{−n}) − W̃(j2^{−n}) = W_{2^n}(k2^{−n}) − W_{2^n}(j2^{−n}) = ∑_{ℓ=j+1}^{k} (W_{2^n}(ℓ2^{−n}) − W_{2^n}((ℓ−1)2^{−n})) ∼ N(0, (k−j)2^{−n}),

since the summands are iid N(0, 2^{−n}) random variables.

Ex. 2.4

Since t ↦ W̃(t) is continuous and since the dyadic numbers are dense in [0, t], we conclude that the increments W̃(t_j) − W̃(t_{j−1}), 0 = t₀ < t₁ < ⋯ < t_N ⩽ 1, are independent N(0, t_j − t_{j−1}) distributed random variables. Here we use that the normal distribution is preserved under d-convergence or ℙ-convergence. This shows that (W̃(t))_{t∈[0,1]} is a Brownian motion.

3.8 Remark. The polygonal arc W_{2^n}(t, ω) can be expressed by Schauder functions, cf. Figure 3.1. From (3.2) and the proof of Theorem 3.2 we know that

W(t, ω) = G₀(ω)S₀(t) + ∑_{n=0}^∞ ∑_{k=0}^{2^n−1} G_{2^n+k}(ω) S_{2^n+k}(t),

where the first term is W₁(t, ω) and the inner sum over k equals W_{2^{n+1}}(t, ω) − W_{2^n}(t, ω); here the S_n(t) are the Schauder functions. This shows that Lévy's construction and the Lévy–Ciesielski construction are essentially the same.

3.4 Lévy’s original argument | 33

For dyadic t = k/2^n, the expansion of W_t is finite. Observe that
1° W(1, ω) = G₀(ω)S₀(1);
2° W(½, ω) = G₀(ω)S₀(½) + G₁(ω)S₁(½), where the first term is the interpolation and G₁(ω)S₁(½) = Γ₁ ∼ N(0, 1/4);
3° W(¼, ω) = G₀(ω)S₀(¼) + G₁(ω)S₁(¼) + G₂(ω)S₂(¼), where the first two terms are the interpolation and G₂(ω)S₂(¼) = Γ₂ ∼ N(0, 1/8);
4° W(¾, ω) = G₀(ω)S₀(¾) + G₁(ω)S₁(¾) + G₂(ω)S₂(¾) + G₃(ω)S₃(¾), where S₂(¾) = 0 and G₃(ω)S₃(¾) = Γ₃ ∼ N(0, 1/8);
and so on.

The 2^n-th partial sum, W_{2^n}(t, ω), is therefore a piecewise linear interpolation of the Brownian path W_t(ω) at the points (k2^{−n}, W(k2^{−n}, ω)), k = 0, 1, ..., 2^n.

3.9 Rigorous proofs of (3.5). There are several possibilities to prove Lévy’s formula (3.5). First, one could use a direct attack based on the joint law of (B s , B t , B u ) and the familiar formula for conditional probabilities. Although this is straightforward, it is a bit tedious.

A second possibility is to use the behaviour of Gaussian random variables under conditioning: if (G, Γ) is an ℝ × ℝ^d-valued Gaussian random variable, then the conditional expectation 𝔼(G | Γ) is of the form a + ⟨b, Γ⟩. Thus, G − 𝔼(G | Γ) is centred and normal, e.g. ∼ N(0, σ²). The conditional law is ℙ(G ∈ dx | Γ = γ) = N(a + ⟨b, γ⟩, σ²), see [235, Theorem 36.4] or [233, Satz 16.11]. Statisticians know this fact from linear models.

Let us add yet another proof. Observe that for all 0 ⩽ r < t and ξ ∈ ℝ

𝔼(e^{iξW(t)} | W(r)) = e^{iξW(r)} 𝔼(e^{iξ(W(t)−W(r))} | W(r)) = e^{iξW(r)} 𝔼(e^{iξ(W(t)−W(r))}) = e^{−½(t−r)ξ²} e^{iξW(r)},

where we used (B1), (B2) and (2.5).

If we apply this equality to the projectively reflected (cf. 2.17) Brownian motion (tW(1/t))_{t>0}, we get

𝔼(e^{iξtW(1/t)} | W(1/r)) = 𝔼(e^{iξtW(1/t)} | rW(1/r)) = e^{−½(t−r)ξ²} e^{iξrW(1/r)}.

Now set b := 1/r > 1/t =: a and η := ξ/a = ξt. Then

𝔼(e^{iηW(a)} | W(b)) = exp[−½ a²(1/a − 1/b)η²] exp[iη (a/b) W(b)] = exp[−½ (a/b)(b − a)η²] exp[iη (a/b) W(b)].   (3.6)

Ex. 3.9

This is (3.5) for the case where (s, t, u) = (0, a, b) and x = 0. Now we fix 0 ⩽ s < t < u and observe that W(r+s) − W(s), r ⩾ 0, is again a Brownian motion. Applying (3.6) yields

𝔼(e^{iη(W(t)−W(s))} | W(u) − W(s)) = e^{−½ ((t−s)/(u−s))(u−t)η²} e^{iη ((t−s)/(u−s))(W(u)−W(s))}.

On the other hand, W(s) ⊥⊥ (W(t) − W(s), W(u) − W(s)) and

𝔼(e^{iη(W(t)−W(s))} | W(u) − W(s)) = 𝔼(e^{iη(W(t)−W(s))} | W(u) − W(s), W(s)) = 𝔼(e^{iη(W(t)−W(s))} | W(u), W(s)) = e^{−iηW(s)} 𝔼(e^{iηW(t)} | W(u), W(s)).

This proves

𝔼(e^{iηW(t)} | W(u), W(s)) = e^{−½ ((t−s)/(u−s))(u−t)η²} e^{iη ((t−s)/(u−s))(W(u)−W(s))} e^{iηW(s)},

and (3.5) follows.

3.5 Donsker’s construction Donsker’s invariance theorem shows that Brownian motion is a limit of linearly interpolated random walks – pretty much in the way we have started the discussion in Chapter 1. As before, the difficult point is to prove the sample continuity of the limiting process. On a probability space (Ω, A , ℙ) let ϵ n , n ⩾ 1, be iid Bernoulli random variables such that ℙ(ϵ1 = 1) = ℙ(ϵ1 = −1) = 12 . Then S n := ϵ1 + ⋅ ⋅ ⋅ + ϵ n

is a simple random walk. Interpolate linearly and apply Gaussian scaling S n (t) :=

1 (S⌊nt⌋ − (nt − ⌊nt⌋)ϵ⌊nt⌋+1 ) , √n

t ∈ [0, 1],

see Figure 3.4 illustrating this for n = 5, 10, 100 and 1000 steps. In particular, S n (j/n) = S j /√n. If j = j(n) and j/n = s = const., the central limit d

theorem shows that S n (j/n) = √s S j /√ j 󳨀󳨀󳨀→ √s G as n → ∞ where G is a standard normal random variable. Moreover, for s = j/n and t = k/n, 0 ⩽ j < k ⩽ n, the increment S n (t) − S n (s) = (S k − S j )/√n is independent of ϵ1 , . . . , ϵ j , and therefore of all earlier increments of the same form. Moreover, 𝔼 (S n (t) − S n (s)) = 0

and

𝕍 (S n (t) − S n (s)) =

k−j n

= t − s;

3.5 Donsker’s construction | 35

Fig. 3.4: The first few approximation steps in Donsker's construction: n = 5, 10, 100 and 1000.

in the limit we get a Gaussian increment with mean 0 and variance t − s. Since independence and stationarity of the increments are distributional properties, they are inherited by the limiting process which we will denote by (B_t)_{t∈[0,1]}. We have seen that (B_q)_{q∈[0,1]∩ℚ} would have the properties (B0)–(B3) and it qualifies as a candidate for Brownian motion. If it had continuous sample paths, (B0)–(B3) would hold not only for rational times but for all t ⩾ 0. That the limit exists and is uniform in t is the essence of Donsker's invariance principle.

3.10 Theorem (Donsker 1951). Let (B(t))_{t∈[0,1]} be a one-dimensional Brownian motion, (S_n(t))_{t∈[0,1]}, n ⩾ 1, be as above and Φ : C([0,1], ℝ) → ℝ a uniformly continuous bounded functional. Then

lim_{n→∞} 𝔼 Φ(S_n(⋅)) = 𝔼 Φ(B(⋅)),

i.e. S_n(⋅) converges weakly to B(⋅).

We will give a proof Donsker’s theorem in Chapter 14, Theorem 14.5. Since our proof relies on the existence of a Brownian motion, we cannot use it to construct (B t )t⩾0 . Nevertheless, it is an important result for the intuitive understanding of a Brownian motion as limit of random walks as well as for (simple) proofs of various limit theorems. An elementary and easily accessible proof can be found in [234, Kapitel 15].


Fig. 3.5: Brownian paths meeting (and missing) the sets A₁, ..., A₅ at times t₁, ..., t₅.

3.6 The Bachelier–Kolmogorov point of view Ex. 2.12

The starting point of this construction is the observation that the finite dimensional marginal distributions of a Brownian motion are Gaussian random variables. More precisely, for any number of times t0 = 0 < t1 < ⋅ ⋅ ⋅ < t n , t j ∈ I, and all Borel sets A1 , . . . , A n ∈ B(ℝ) the finite dimensional distributions p t1 ,...,t n (A1 × ⋅ ⋅ ⋅ × A n ) = ℙ(B t1 ∈ A1 , . . . , B t n ∈ A n )

are mean zero normal laws with covariance matrix C = (t_j ∧ t_k)_{j,k=1,...,n}. From Theorem 2.6 we know that they are given by

p_{t_1,...,t_n}(A_1 × ⋯ × A_n) = (2π)^{−n/2} (det C)^{−1/2} ∫_{A_1×⋯×A_n} exp(−½ ⟨x, C^{−1}x⟩) dx
  = (2π)^{−n/2} (∏_{j=1}^n (t_j − t_{j−1}))^{−1/2} ∫_{A_1×⋯×A_n} exp(−½ ∑_{j=1}^n (x_j − x_{j−1})²/(t_j − t_{j−1})) dx.

We will characterize a stochastic process in terms of its finite dimensional distributions – and we will discuss this approach in Chapter 4 below. The idea is to identify a stochastic process with an infinite dimensional measure ℙ on the space of all sample paths Ω such that the finite dimensional projections of ℙ are exactly the finite dimensional distributions. 3.11 Further reading. Yet another construction of Brownian motion, using interpolation arguments and Bernstein polynomials, can be found in the well-written paper [149]. Donsker’s theorem, without assuming the existence of BM, is e.g. proved in the classic monograph [14]; in a more general context it



is contained in [62], the presentations in [136] and [233] are probably the easiest to read. Ciesielski’s construction became popular through his very influential Aarhus lecture notes [38], one of the first books on Brownian motion after Wiener [268] and Lévy [166]. Good sources on random Fourier series and wavelet expansions are [131] and [206]. Constructions of Brownian motion in infinite dimensions, e.g. cylindrical Brownian motions, etc. can be found in [210] and [47], but this needs a good knowledge of functional analysis. Billingsley: Convergence of Probability Measures. [14] [38] Ciesielski: Lectures on Brownian Motion, Heat Conduction and Potential Theory. DaPrato, Zabczyk: Stochastic Equations in Infinite Dimensions. [47] [62] Dudley: Uniform Central Limit Theorems. [131] Kahane: Some Random Series of Functions. [136] Karatzas, Shreve: Brownian Motion and Stochastic Calculus. [149] Kowalski: Bernstein Polynomials and Brownian Motion. [206] Pinsky: Introduction to Fourier Analysis and Wavelets. [210] Prevôt, Röckner: A Concise Course on Stochastic Partial Differential Equations. [233] Schilling: Martingale und Prozesse.

Problems

1. Use the Lévy–Ciesielski representation B(t) = ∑_{n=0}^∞ G_n S_n(t), t ∈ [0,1], to obtain a series representation for X := ∫_0^1 B(t) dt and find the distribution of X.

2. (Random orthogonal measures) Let (G_n)_{n⩾0} be a sequence of iid Gaussian N(0,1) random variables, (ϕ_n)_{n⩾0} be a complete ONS in L²([0,1], dt), B ∈ B[0,1] be a measurable set and define W_N(B) := ∑_{n=0}^{N−1} G_n ⟨𝟙_B, ϕ_n⟩_{L²}.
a) Show that the limit W(B) := lim_{N→∞} W_N(B) exists in L²(ℙ).

b) Find 𝔼 W(A)W(B) for sets A, B ∈ B[0, 1].

c) Show that B 󳨃→ μ(B) := 𝔼 (|W(B)|2 ) is a measure. Is B 󳨃→ W(B) a measure?

d) Let S denote all step functions of the form f(t) := ∑j x) ⩽ 2(2π)−1/2 x−1 e−x

2 /2

, x > 0, and the Borel–Cantelli lemma.

4. Let (S, d) be a complete metric space equipped with the σ-algebra B(S) of its Borel sets. Assume that (X n )n⩾1 is a sequence of S-valued random variables such that lim sup 𝔼 (d(X n , X m )p ) = 0

n→∞ m⩾n

5. 6.

for some p ∈ [1, ∞). Show that there is a subsequence (n k )k⩾1 and a random variable X such that limk→∞ X n k = X almost surely.

Let (X t )t⩾0 and (Y t )t⩾0 be two stochastic processes which are modifications of each other. Show that they have the same finite dimensional distributions. Let (X t )t∈I and (Y t )t∈I be two processes with the same index set I ⊂ [0, ∞) and state space. Show that X, Y indistinguishable 󳨐⇒ X, Y modifications 󳨐⇒ X, Y equivalent.

7.

Assume that the processes are defined on the same probability space and t 󳨃→ X t and t 󳨃→ Y t are right continuous (or that I is countable). Do the reverse implications hold, too?

Let (B t )t⩾0 be a real-valued stochastic process with exclusively continuous sample paths. Assume that (B q )q∈ℚ∩ [0,∞) satisfies (B0)–(B3). Show that (B t )t⩾0 is a BM.

8. Let X 󸀠 , X 󸀠󸀠 , Z be random vectors such that X 󸀠 ⊥⊥ X 󸀠󸀠 and (X 󸀠 , X 󸀠󸀠 ) ⊥⊥ Z. Show that X 󸀠 , X 󸀠󸀠 , Z are independent. 9.

Give a direct proof of the formula (3.5) using the joint probability distribution (W(t0 ), W(t), W(t1 )) of the Brownian motion W(t).

10. (Two-sided Brownian motion) Show that there is a stochastic process (W_t)_{t∈ℝ} satisfying (B0)–(B4). Hint. Glue together, back-to-back, two independent Brownian motions.

4 The canonical model Often we encounter the statement “Let X be a random variable with law μ” where μ is an a priori given probability distribution. There is no reference to the underlying probability space (Ω, A , ℙ), and actually the nature of this space is not important: while X is defined by the probability distribution, there is considerable freedom in the choice of our model (Ω, A , ℙ). The same situation is true for stochastic processes: Brownian motion is defined by distributional properties and our construction of BM, see e.g. Theorem 3.2, not only furnished us with a process but also a suitable probability space. By definition, a d-dimensional stochastic process (X(t))t∈I is a family of ℝd -valued random variables on the space (Ω, A , ℙ). Alternatively, we may understand a process as a map (t, ω) 󳨃→ X(t, ω) from I × Ω to ℝd or as a map ω 󳨃→ {t 󳨃→ X(t, ω)} from Ω into the space (ℝd )I = {w, w : I → ℝd }. If we go one step further and identify Ω with (a subset of) (ℝd )I , we get the so-called canonical model. Of course, it is a major task to identify the correct σ-algebra and measure in (ℝd )I . For Brownian motion it is enough to consider the subspace C(o) [0, ∞) ⊂ (ℝd )[0,∞) ,

Ex. 4.1

Ex. 4.2

C(o) := C(o) [0, ∞) := {w : [0, ∞) → ℝd : w is continuous and w(0) = 0} .

In the first part of this chapter we will see how we can obtain a canonical model for BM defined on the space of continuous functions; in the second part we discuss Kolmogorov’s construction of stochastic processes with given finite dimensional distributions.

4.1 Wiener measure Let I = [0, ∞) and denote by π t : (ℝd )I → ℝd , w 󳨃→ w(t), the canonical projection onto the tth coordinate. The natural σ-algebra on the infinite product (ℝd )I is the product σ-algebra d B I (ℝd ) = σ {π−1 t (B) : B ∈ B(ℝ ), t ∈ I} = σ{π t : t ∈ I};

this shows that B I (ℝd ) is the smallest σ-algebra which makes all projections π t measurable. Since Brownian motion has (almost surely) continuous sample paths, it is natural to replace (ℝd )I by C(o) . Unfortunately, cf. Corollary 4.6, C(o) is not contained in B I (ℝd ); therefore, we have to consider the trace σ-algebra C(o) ∩ B I (ℝd ) = σ (π t |C(o) : t ∈ I) .

If we equip C(o) with the metric of locally uniform convergence,

ρ(w, v) = ∑_{n=1}^∞ (1 ∧ sup_{0⩽t⩽n} |w(t) − v(t)|) 2^{−n},

Ex. 1.8


Ex. 4.4

C(o) becomes a complete separable metric space. Denote by Oρ the topology induced by ρ and consider the Borel σ-algebra B(C(o) ) := σ(Oρ ) on C(o) .

4.1 Lemma. We have C(o) ∩ B I (ℝd ) = B(C(o) ).

Proof. Under the metric ρ the map C(o) ∋ w 󳨃→ π t (w) is (Lipschitz) continuous for every t, hence measurable with respect to the topological σ-algebra B(C(o) ). This means that and so,

σ (π t |C(o) ) ⊂ B(C(o) ) ∀t ∈ I,

Conversely, we have for v, w ∈ C(o) ∞

ρ(v, w) = ∑ (1 ∧ n=1 ∞

= ∑ (1 ∧ n=1

sup

t∈[0,n]∩ ℚ

sup

t∈[0,n]∩ ℚ

σ (π t |C(o) : t ∈ I) ⊂ B(C(o) ).

|w(t) − v(t)|) 2−n

(4.1) −n

|π t (w) − π t (v)|) 2 .

Thus, v 󳨃→ ρ(v, w) is measurable with respect to σ (π t |C(o) : t ∈ ℚ+ ) and, consequently, with respect to C(o) ∩ B I (ℝd ). Since (C(o) , ρ) is separable, there exists a countable dense set D ⊂ C(o) and it is easy to see that every U ∈ Oρ can be written as a countable union of the form U=

(ρ)



𝔹 (w,r)⊂U w∈D, r∈ℚ+

𝔹(ρ) (w, r)

where 𝔹(ρ) (w, r) = {v ∈ C(o) : ρ(v, w) < r} is an open ball in the metric ρ. Since 𝔹(ρ) (w, r) ∈ C(o) ∩ B I (ℝd ), we get Oρ ⊂ C(o) ∩ B I (ℝd )

and therefore B(C(o) ) = σ(Oρ ) ⊂ C(o) ∩ B I (ℝd ).

Let (B t )t⩾0 be a Brownian motion on some probability space (Ω, A , ℙ) and consider the map Ψ : Ω → C(o) , ω 󳨃→ w := Ψ(ω) := B(⋅, ω). (4.2)

The metric balls 𝔹(ρ) (v, r) ⊂ C(o) generate B(C(o) ). From (4.1) we conclude that the map ω 󳨃→ ρ(B(⋅, ω), v) is A measurable; therefore {ω : Ψ(ω) ∈ 𝔹(ρ) (v, r)} = {ω : ρ(B(⋅, ω), v) < r}

shows that Ψ is A /B(C(o) ) measurable. The image measure μ(Γ) := ℙ (Ψ −1 (Γ)) = ℙ(Ψ ∈ Γ),

Γ ∈ B(C(o) ),

is a measure on C(o) . Assume that Γ is a cylinder set, i.e. a set of the form Γ = {w ∈ C(o) : w(t1 ) ∈ C1 , . . . , w(t n ) ∈ C n }

= {w ∈ C(o) : π t1 (w) ∈ C1 , . . . , π t n (w) ∈ C n }

(4.3)



where n ⩾ 1, t1 < t2 < ⋅ ⋅ ⋅ < t n and C1 , . . . , C n ∈ B(ℝd ). Because of w = Ψ(ω) and π t (w) = w(t) = B(t, ω) we see μ(Γ) = μ (π t1 ∈ C1 , . . . , π t n ∈ C n ) = ℙ (B(t1 ) ∈ C1 , . . . , B(t n ) ∈ C n ) .

(4.4)

The family of all cylinder sets is a ∩-stable generator of the σ-algebra C(o) ∩ B I (ℝd ). Therefore, the finite dimensional distributions (4.4) uniquely determine the measure μ (and ℙ). This proves the following theorem. 4.2 Theorem. On the probability space (C(o) , B(C(o) ), μ) the family of projections (π t )t⩾0 satisfies (B0)–(B4), i.e. (π t )t⩾0 is a Brownian motion.

Theorem 4.2 says that every Brownian motion (Ω, A , ℙ, B t , t ⩾ 0) admits an equivalent Brownian motion (W t )t⩾0 on the probability space (C(o) , B(C(o) ), μ). Since W(t) is just the projection π t , we can identify W(t, w) with its sample path w(t).

4.3 Definition. Let (C(o) , B(C(o) ), μ, π t ) be as in Theorem 4.2. The measure μ is called the Wiener measure, (C(o) , B(C(o) ), μ) is called Wiener space or path space, and (π t )t⩾0 is the canonical (model of the) Brownian motion process.

We will write (C(o) (I), B(C(o) (I)), μ) if we consider Brownian motion on an interval I other than [0, ∞).

4.4 Remark. We have seen in Section 2.3 that a Brownian motion is invariant under reflection, scaling, renewal, time inversion and projective reflection at t = ∞. Since these operations leave the finite dimensional distributions unchanged, and since the Wiener measure μ is uniquely determined by the finite dimensional distributions, μ will also be invariant. This observation justifies arguments of the type ℙ(B(⋅) ∈ A) = ℙ(W(⋅) ∈ A)

where A ∈ A and W is the (path-by-path) transformation of B under any of the above operations. Below is an overview on the transformations both at the level of the sample paths and the canonical space C(o) . Note that the maps S : C(o) → C(o) in Table 4.1 are bijective.

Let us finally show that the product σ-algebra B I (ℝd ) is fairly small in the sense that the set C(o) is not measurable. We begin with an auxiliary result, which shows that Γ ∈ B I (ℝd ) is uniquely determined by countably many indices.

4.5 Lemma. Let I = [0, ∞). For every Γ ∈ B I (ℝd ) there is a countable set S = S Γ ⊂ I such that f ∈ (ℝd )I , w ∈ Γ : f|S = w|S 󳨐⇒ f ∈ Γ. (4.5) Proof. Set Σ := {Γ ⊂ (ℝd )I : (4.5) holds for some countable set S = S Γ }. We claim that Σ is a σ-algebra.

Ex. 4.4

Tab. 4.1: Transformations of Brownian motion.

Transformation      time set I   paths W(t, ω)            w ∈ C(o)(I), w ↦ Sw(t)
Reflection          [0, ∞)       −B(t, ω)                 −w(t)
Renewal             [0, ∞)       B(t+a, ω) − B(a, ω)      w(t+a) − w(a)
Time inversion      [0, a]       B(a−t, ω) − B(a, ω)      w(a−t) − w(a)
Scaling             [0, ∞)       √c B(t/c, ω)             √c w(t/c)
Proj. reflection    [0, ∞)       t B(1/t, ω)              t w(1/t)

󳶳 󳶳

Clearly, 0 ∈ Σ; Let Γ ∈ Σ with S = S Γ . Then we find for Γ c and S that

󳶳

(otherwise, we would have f ∈ Γ and then (4.5) would imply that w ∈ Γ). This means that (4.5) holds for Γ c with S = S Γ = S Γ c , i.e. Γ c ∈ Σ. Let Γ1 , Γ2 , ⋅ ⋅ ⋅ ∈ Σ and set S := ⋃j⩾1 S Γ j . Obviously, S is countable and

f ∈ (ℝd )I , w ∈ Γ c , f|S = w|S 󳨐⇒ f ∉ Γ

f ∈ (ℝd )I , w ∈ ⋃ Γ j , f|S = w|S 󳨐⇒ ∃j0 : w ∈ Γ j0 , f|S Γj = w|S Γj j⩾1

0

0

(4.5)

󳨐⇒ f ∈ Γ j0 hence f ∈ ⋃ Γ j . j⩾1

This proves ⋃j⩾1 Γ j ∈ Σ.

Since all sets of the form {π t ∈ C}, t ∈ I, C ∈ B(ℝd ) are contained in Σ, we find B I (ℝd ) = σ(π t : t ∈ I) ⊂ σ(Σ) = Σ.

4.6 Corollary. If I = [0, ∞) then C(o) ∉ B I (ℝd ).

Proof. Assume that C(o) ∈ B I (ℝd ). By Lemma 4.5 there exists a countable set S ⊂ I such that f ∈ (ℝd )I , w ∈ C(o) : f|S = w|S 󳨐⇒ f ∈ C(o) .

Fix w ∈ C(o) , pick t0 ∉ S, γ ∈ ℝd \ {w(t0 )} and define a new function by {w(t), f(t) := { γ, {

t ≠ t 0 t = t0

.

Then f|S = w|S , but f ∉ C(o) , contradicting our assumption.

4.2 Kolmogorov’s construction |


4.2 Kolmogorov’s construction We have seen in the previous section that the finite dimensional distributions of a d-dimensional Brownian motion (B t )t⩾0 uniquely determine a measure μ on the space of all sample paths, such that the projections (π t )t⩾0 are, under μ, a Brownian motion. This measure μ is uniquely determined by the (finite dimensional) projections π t1 ,...,t n (w) := (w(t1 ), . . . , w(t n ))

and the corresponding finite dimensional distributions

p t1 ,...,t n (C1 × ⋅ ⋅ ⋅ × C n ) = μ(π t1 ,...,t n ∈ C1 × ⋅ ⋅ ⋅ × C n )

where t1 , . . . , t n ∈ I and C1 , . . . , C n ∈ B(ℝd ). From (4.4) it is obvious that the conditions p t1 ,...,t n (C1 × ⋅ ⋅ ⋅ × C n ) = p t σ(1) ,...,t σ(n) (C σ(1) × ⋅ ⋅ ⋅ × C σ(n) ) } } } (4.6) for all permutations σ : {1, 2, . . . , n} → {1, 2, . . . , n} } } } d p t1 ,...,t n−1 ,t n (C1 × ⋅ ⋅ ⋅ × C n−1 × ℝ ) = p t1 ,...,t n−1 (C1 × ⋅ ⋅ ⋅ × C n−1 )} are necessary for a family p t1 ,...,t n , n ⩾ 1, t1 , . . . , t n ∈ I to be finite dimensional distributions of a stochastic process.

4.7 Definition. Assume that for every n ⩾ 1 and all t1 , . . . , t n ∈ I, p t1 ,...,t n is a probability measure on ((ℝd )n , B((ℝd )n )). If these measures satisfy the consistency conditions (4.6), we call them consistent or projective. In fact, the consistency conditions (4.6) are even sufficient for p t1 ,...,t n to be finite dimensional distributions of some stochastic process. The following deep theorem is due to Kolmogorov. A proof is given in Appendix A.1.

4.8 Theorem (Kolmogorov 1933). Let I ⊂ [0, ∞) and p t1 ,...,t n be probability measures defined on ((ℝd )n , B((ℝd )n )) for all t1 , . . . , t n ∈ I, n ⩾ 1. If the family of measures is consistent, then there exists a probability measure μ on ((ℝd )I , B I (ℝd )) such that p t1 ,...,t n (C) = μ (π−1 t1 ,...,t n (C))

for all C ∈ B((ℝd )n ).

Using Theorem 4.8 we can construct a stochastic process for any family of consistent finite dimensional probability distributions.

4.9 Corollary (canonical process). Let I ⊂ [0, ∞) and p t1 ,...,t n be probability measures defined on ((ℝd )n , B((ℝd )n )) for all t1 , . . . , t n ∈ I, n ⩾ 1. If the family of measures is consistent, there exist a probability space (Ω, A , ℙ) and a d-dimensional stochastic process (X t )t⩾0 such that for all C1 , . . . , C n ∈ B(ℝd ) ℙ(X(t1 ) ∈ C1 , . . . , X(t n ) ∈ C n ) = p t1 ,...,t n (C1 × ⋅ ⋅ ⋅ × C n ).

Proof. We take Ω = (ℝd )I , the product σ-algebra A = B I (ℝd ) and ℙ = μ, where μ is the measure constructed in Theorem 4.8. Then X t := π t defines a stochastic process with finite dimensional distributions μ ∘ (π t1 , . . . , π t n )−1 = p t1 ,...,t n .

Ex. 4.2 Ex. 4.5


Fig. 4.1: A Brownian path meeting (and missing) the sets A₁, ..., A₅ at times t₁, ..., t₅.

4.10 Kolmogorov’s construction of BM. For d = 1 and t0 = 0 < t1 < ⋅ ⋅ ⋅ < t n the n-variate Gaussian measures p t1 ,...,t n (A1 × ⋅ ⋅ ⋅ × A n ) = (

1 n/2 1 ) 2π √det C



A1 ×⋅⋅⋅×A n

1

e− 2 ⟨x, C

−1

x⟩

dx

(4.7)

with C = (t j ∧t k )j,k=1,...,n , are probability measures. If t1 = 0, we use δ0 ⊗p t2 ,...,t n instead. It is not difficult to check that the family (4.7) is consistent. Therefore, Corollary 4.9 proves that there exists an ℝ-valued stochastic process (B t )t⩾0 . By construction, (B0) and (B3) are satisfied. From (B3) we get immediately (B2); (B1) follows if we read (2.11) backwards: let C = (t j ∧ t k )j,k=1,...,n where 0 < t1 < ⋅ ⋅ ⋅ < t n < ∞; then we have for all ξ = (ξ1 , . . . , ξ n )⊤ ∈ ℝn n 1 𝔼 [exp (i ∑ ξ j B t j )] = exp (− ⟨ξ, Cξ⟩) 2 j=1

n 1 = ∏ exp (− (t j − t j−1 )(ξ j + ⋅ ⋅ ⋅ + ξ n )2 ) 2 j=1

(2.12)

(B2)

n

= ∏ 𝔼 [exp (i(B t j − B t j−1 )(ξ j + ⋅ ⋅ ⋅ + ξ n ))] .

(2.5)

j=1

On the other hand, we can easily rewrite the expression on the left-hand side as n

n

j=1

j=1

𝔼 [exp (i ∑ ξ j B t j )] = 𝔼 [exp (i ∑ (B t j − B t j−1 )(ξ j + ⋅ ⋅ ⋅ + ξ n ))] .

Using the substitution η j := ξ j + ⋅ ⋅ ⋅ + ξ n we see that n

n

𝔼 [exp (i ∑ η j (B t j − B t j−1 ))] = ∏ 𝔼 [exp (iη j (B t j − B t j−1 ))] j=1

j=1



for all η1 , . . . , η n ∈ ℝ which finally proves (B1). This gives a new way to prove the existence of a Brownian motion. The problem is, as in all other constructions of Brownian motion, to check the continuity of the sample paths (B4). This follows from yet another theorem of Kolmogorov. 4.11 Theorem (Kolmogorov 1934; Slutsky 1937; Chentsov 1956). Denote by (X t )t⩾0 a stochastic process on (Ω, A , ℙ) taking values in ℝd . If 𝔼 (|X(t) − X(s)|α ) ⩽ c |t − s|1+β

for all s, t ⩾ 0

(4.8)

holds for some constants c > 0 and α, β > 0, then (X t )t⩾0 has a modification (X 󸀠t )t⩾0 which has only continuous sample paths.

We will give the proof in a different context in Chapter 10 below. For a process satisfying (B0)–(B3), we can take α = 4 and β = 1 since, cf. (2.8), 𝔼 (|B(t) − B(s)|4 ) = 𝔼 (|B(t − s)|4 ) = 3 |t − s|2 .
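The fourth-moment identity behind the choice α = 4, β = 1 is easy to check numerically. The following sketch is our own illustration (the values of s, t and the sample size are arbitrary); it compares the empirical fourth moment of B_t − B_s with 3|t − s|².

```python
import numpy as np

rng = np.random.default_rng(6)
s, t, n = 0.3, 1.1, 10**6
increments = np.sqrt(t - s) * rng.standard_normal(n)   # B_t - B_s ~ N(0, t - s)

emp = np.mean(np.abs(increments)**4)                   # empirical E|B_t - B_s|^4
print(emp, 3 * (t - s)**2)                             # both values are close
```

With such a bound, condition (4.8) holds with any β < 1 as well, so the Kolmogorov–Chentsov theorem even yields Hölder continuous modifications, a point taken up in Chapter 10.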

4.12 Further reading. The concept of Wiener space has been generalized in many directions. Here we only mention Gross’ concept of abstract Wiener space [156], [275], [175] and the Malliavin calculus, i.e. stochastic calculus of variations on an (abstract) Wiener space [176]. The results of Cameron and Martin are nicely summed up in [268]. [156] Kuo: Gaussian Measures in Banach Spaces. [175] Malliavin: Integration and Probability. [176] Malliavin: Stochastic Analysis. [268] Wiener et al.: Differential Space, Quantum Systems, and Prediction. [192] Nualart, Nualart: Introduction to Malliavin Calculus. [275] Yamasaki: Measures on infinite dimensional spaces.

Problems 1.

Let F : ℝ → [0, 1] be a distribution function. a) Show that there exists a probability space (Ω, A , ℙ) and a random variable X such that F(x) = ℙ(X ⩽ x).

b) Show that there exists a probability space (Ω, A , ℙ) and an iid sequence of random variables X n such that F(x) = ℙ(X n ⩽ x). c) State and prove the corresponding assertions for the d-dimensional case.

2.

Show that there exists a stochastic process (X t )t⩾0 such that the random variables X t are independent N(0, t) random variables. This process also satisfies ℙ( lims→t X s exists) = 0 for every t > 0.

Ex. 10.1

46 | 4 The canonical model 3.

Let S be any set and E be a family of subsets of S. Show that σ(E ) = ⋃{σ(C ) : C ⊂ E , C is countable}.

4. Let 0 ≠ E ∈ B(ℝd ) contain at least two elements, and 0 ≠ I ⊂ [0, ∞) be any index set. We denote by E I the product space E I = {f : f : I → E} which we equip with its natural product topology (i.e. f n → f if, and only if, f n (t) → f(t) for every t ∈ I). Then we can define the Borel σ-algebra B(E I ) and the product σ-algebra B I (E) := σ(π t : t ∈ I) where π t : E I → E, π t (f) := f(t) is the canonical projection onto the tth coordinate. a) Show that for every B ∈ B I (E) there exists a countable set J ⊂ I and a set I J A ∈ B J (E) such that B = π−1 J (A). (π J is the coordinate projection π J : E → E .) Hint. Problem 4.3.

b) If I is not countable, then each 0 ≠ B ∈ B I (E) is not countable. Hint. Part a)

c) Show that B I (E) ⊂ B(E I ). Hint. Part b)

d) If I is not countable, then B I (E) ≠ B(E I ).

5.

6.

e) If I is countable, then B I (E) = B(E I ).

Let μ be a probability measure on (ℝd , B(ℝd )), v ∈ ℝd and consider the ddimensional stochastic process X t (ω) := ω + tv, t ⩾ 0, on the probability space (ℝd , B(ℝd ), μ). Give an interpretation of X t if μ = δ x for some x ∈ ℝd and determine the family of finite dimensional distributions of (X t )t⩾0 .

(Brownian bridge) Let B = (B t )t∈[0,1] be a BM1 and set W = (W t )t∈[0,1] where W t := B t − tB1 . a) Show that W is a mean zero Gaussian process and find its covariance function. b) Let 0 = t0 < t1 < ⋅ ⋅ ⋅ < t n < 1. Show that the law of (W t1 , . . . , W t n ) has a density p(x1 , . . . , x n ) which is of the form √2πg t1 (x1 ) ⋅ g t2 −t1 (x2 − x1 ) ⋅ . . . ⋅ g t n −t n−1 (x n−1 − x n ) ⋅ g1−t n (x n )

where g t (x) = (2π)−1/2 exp(−x2 /2t) is the density of N(0, t).

c) Show that the law of (W t1 , . . . , W t n ) coincides with the law of (B t1 , . . . , B t n ) conditioned that B1 = 0. Compare this with Exercise 2.9(b).

7.

d) Show that the processes W and W 󸀠 := (W1−t )t∈[0,1] have the same probability law.

(Shifts) Let (C(o) , B(C(o) ), μ) be the canonical Wiener space (we assume d = 1). We will now consider the space C, again equipped with the metric of locally uniform convergence and the Borel σ-algebra B(C). It is straightforward to check that



Lemma 4.1 remains valid in this context. We define B t : C → ℝ as coordinate projection B t (w) = w(t) where w ∈ C and the corresponding σ-algebras Ft = F[0,t] = σ(B s , s ∈ [0, t]) and F[t,∞) = σ(B s , s ⩾ t),

t ∈ [0, ∞].

Let h > 0. The (canonical) shift operator is the map θ h : C → C, θ h w(⋅) = w(⋅ + h), i.e. (θ h w)(t) = w(t + h). a) Show that B t ∘ θ h = B t+h . Find θ−1 h F for F = {B t ∈ C}. b) Show that θ h : C(o) → C(o) is Ft+h /Ft and F[t+h,∞) /F[t,∞) measurable.

c) Denote by τ C (w) = inf{s ⩾ 0 : B s (w) ∈ C} and set inf 0 := ∞. Find an expression for θ h τ C := τ C ∘ θ h and show that τ C ∘ θ h = τ C − h if we happen to know that τ C ⩾ h.

8. (Brownian families) We continue with the set-up introduced in Problem 7. Denote by (C(o) , B(C(o) ), μ) the canonical Wiener space and set Ω = C, A = B(C), B t (w) = w(t) for w ∈ Ω, and define for every probability measure π on (ℝ, B(ℝ)) a probability measure ℙπ on (Ω, B(C)) first on the “cylinder sets” ℙπ ({w : w(t0 ) ∈ C0 , w(t1 ) ∈ C1 , . . . , w(t n ) ∈ C n })

:= ∫ μ({w : w(t1 ) + x ∈ C1 , . . . , w(t n ) + x ∈ C n }) π(dx), C0

where n ∈ ℕ0 , t0 = 0 < t1 < ⋅ ⋅ ⋅ < t n and C0 , C1 , . . . , C n ∈ B(ℝ). The measure ℙ is then extended, by Kolmogorov’s theorem (Theorem 4.8), onto B(C). a) Show that the map w 󳨃→ B(⋅, w) = {t 󳨃→ B t (w)} is measurable.

b) Show that B t is, under ℙπ and π = δ x , a Brownian motion started in B0 = x, i.e. a process satisfying (B1), (B2), (B4) and B0 = x instead of (B0) and B t ∼ N(x, t) instead of (B3).

9.

c) Give an interpretation as to what ℙπ means at the level of paths.

(Galmarino’s test “light”) Let B = (B t )t⩾0 be a canonical BM1 on Wiener space (Ω, A , ℙ) = (C(o) , B(C(o) ), μ) and Ft := σ(B s , s ⩽ t), F∞ = σ (⋃t⩾0 Ft ), the filtration generated by B. Show that for F ∈ F∞ the following assertions are equivalent: a) F ∈ Ft ;

b) Let ω ∈ F and ω󸀠 ∈ Ω. If we have for all s ⩽ t that B s (ω) = B s (ω󸀠 ), then ω󸀠 ∈ F. Hint. Show first that Ft = a−1 t (F∞ ) holds for the stopping operator a t ω(s) = ω(t ∧ s).

Remark. This topic will be continued in Problem 5.23. This test actually holds for all processes with continuous sample paths (with the same argument as for Brownian motion) and, with some

extra steps, we can relax this to processes whose paths are right continuous with finite left limits.

5 Brownian motion as a martingale Let I be some subset of [0, ∞] and (Ft )t∈I a filtration, i.e. an increasing family of subσ-algebras of A . Recall that a martingale (X t , Ft )t∈I is a real or complex stochastic process X t : Ω → ℝd or X t : Ω → ℂ satisfying a) 𝔼 |X t | < ∞ for all t ∈ I; b) X t is Ft measurable for every t ∈ I; c) 𝔼(X t | Fs ) = X s for all s, t ∈ I, s ⩽ t. If X t is real-valued and if we have in c) “⩾” or “⩽” instead of “=”, we call (X t , Ft )t∈I a sub- or supermartingale, respectively. We call a stochastic process (X t )t∈I adapted to the filtration (Ft )t∈I if X t is for each t ∈ I an Ft measurable random variable. We write briefly (X t , Ft )t⩾0 for an adapted process. Clearly, X is Ft adapted if, and only if, FtX ⊂ Ft for all t ∈ I where FtX is the natural filtration σ(X s : s ∈ I, s ⩽ t) of the process X.

5.1 Some “Brownian” martingales

Brownian motion is a martingale with respect to its natural filtration, i.e. the family of σ-algebras FtB := σ(B s : s ⩽ t). Recall that, by Lemma 2.14, B t − B s ⊥⊥ FsB

Ex. 5.1 Ex. 5.2

for all 0 ⩽ s ⩽ t.

(5.1)

It is often necessary to enlarge the canonical filtration (FtB )t⩾0 of a Brownian motion (B t )t⩾0 . The property (5.1) is equivalent to the independent increments property (B1) of (B t )t⩾0 , see Lemma 2.14 and Problem 2.21, and it is this property which preserves the Brownian character of a filtration. Depending on (B0), the σ-algebra F0B may be trivial ({0, Ω} if B0 (ω) = 0 for all ω), or contain measurable null sets (if B0 = 0 almost everywhere). 5.1 Definition. Let (B t )t⩾0 be a d-dimensional Brownian motion. A filtration (Ft )t⩾0 is called admissible, if a) FtB ⊂ Ft for all t ⩾ 0; b) B t − B s ⊥⊥ Fs for all 0 ⩽ s ⩽ t. If F0 contains all subsets of ℙ null sets, (Ft )t⩾0 is an admissible complete filtration.

The natural filtration (FtB )t⩾0 is always admissible. It is also clear that we may, without loss of generality, add all measurable null sets to F0B , hence to all FtB . We will discuss further examples of admissible filtrations in Lemma 6.21. 5.2 Example. Let (B t )t⩾0 , B t = (B1t , . . . , B dt ), be a d-dimensional Brownian motion and (Ft )t⩾0 an admissible filtration. a) (B t )t⩾0 is a martingale with respect to Ft . https://doi.org/10.1515/9783110741278-005



Indeed: let 0 ⩽ s ⩽ t. Using the conditions 5.1.a) and b) we get

𝔼(B t | Fs ) = 𝔼(B t − B s | Fs ) + 𝔼(B s | Fs ) = 𝔼(B t − B s ) + B s = B s .

b) M(t) := |B t |2 is a positive submartingale with respect to Ft . Indeed: for all 0 ⩽ s < t we can use (the conditional form of) Jensen’s inequality to get d

d

j

j=1

c)

d

𝔼(M t | Fs ) = ∑ 𝔼((B t )2 | Fs ) ⩾ ∑ 𝔼(B t | Fs )2 = ∑ (B s )2 = M s . j

j=1

j

j=1

M t := |B t |2 − d ⋅ t is a martingale with respect to Ft . j Indeed: since |B t |2 − d ⋅ t = ∑dj=1 ((B t )2 − t) it is enough to consider d = 1. For s < t we see 󵄨 𝔼 [B2t − t 󵄨󵄨󵄨 Fs ]

= =

B t −B s ⊥⊥ Fs

=

2 󵄨 𝔼 [((B t − B s ) + B s ) − t 󵄨󵄨󵄨 Fs ]

󵄨 𝔼 [(B t − B s )2 + 2B s (B t − B s ) + B2s − t 󵄨󵄨󵄨 Fs ] 𝔼 [(B t − B s )2 ] +2B s 𝔼 [B t − B s ] +B2s − t B2s

=

=0

= t−s

− s.

d) M ξ (t) := exp (i⟨ξ, B(t)⟩ + 2t |ξ|2 ) is for all ξ ∈ ℝd a complex-valued martingale with respect to Ft . Indeed: for 0 ⩽ s < t we have, cf. Example a), 𝔼(M ξ (t) | Fs )

=

B t −B s ⊥⊥ Fs

2

e 2 |ξ| 𝔼(e i⟨ξ, B(t)−B(s)⟩ | Fs ) e i⟨ξ, B(s)⟩ t

2

=

e 2 |ξ| 𝔼(e i⟨ξ, B(t−s)⟩ ) e i⟨ξ, B(s)⟩

=

M ξ (s).

B t −B s ∼B t−s

=

Ex. 5.5

t

2

e 2 |ξ| e− t

t−s 2

|ξ|2

e i⟨ξ, B(s)⟩

The martingale M ξ (t) is sometimes called exponential martingale. A better way to understand its structure is to write M ξ (t) = exp (i⟨ξ, B(t)⟩) / 𝔼 [exp (i⟨ξ, B(t)⟩)]. e) M ζ (t) := exp [∑dj=1 (ζ j B j (t) − 2t ζ j2 )] is for all ζ ∈ ℂd a complex-valued martingale with respect to Ft . Indeed: since 𝔼 e|ζ||B(t)| < ∞ for all ζ ∈ ℂd , cf. (2.6), we can use the argument of Example d). Pay attention to the fact that (2.6) depends on ζ j2 rather than |ζ j |2 .
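A quick Monte Carlo experiment illustrates the normalization behind the exponential martingales in d) and e): since M_ξ(0) = 1, both expectations should be (approximately) equal to 1. The sketch below is our own illustration (parameter values are arbitrary), not part of the original text.

```python
import numpy as np

rng = np.random.default_rng(5)
t, xi, n = 2.0, 0.7, 10**6
B_t = np.sqrt(t) * rng.standard_normal(n)              # B_t ~ N(0, t)

# complex exponential martingale of Example d):  E M_xi(t) = M_xi(0) = 1
M_complex = np.exp(1j * xi * B_t + 0.5 * t * xi**2)
# real exponential martingale (Example e) with a real parameter)
M_real = np.exp(xi * B_t - 0.5 * t * xi**2)

print(M_complex.mean())   # close to 1 + 0j (Monte Carlo error ~ n^{-1/2})
print(M_real.mean())      # close to 1
```

The same normalization, expectation at time t divided out, reappears in the formula M_ξ(t) = exp(i⟨ξ, B(t)⟩)/𝔼[exp(i⟨ξ, B(t)⟩)] above.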

Since Brownian motion is a martingale we have all martingale tools at our disposal. The following maximal estimate will be particularly useful. It is a direct consequence of Doob’s maximal inequality A.10 and the observation that for a process with continuous sample paths sups∈D∩ [0,t] |B s | = sups∈[0,t] |B s | holds for any dense subset D ⊂ [0, ∞).

Ex. 5.6 Ex. 5.7

5.3 Lemma. Let (B_t)_{t⩾0} be a d-dimensional Brownian motion. Then we have for all t ⩾ 0 and p > 1

𝔼[sup_{s⩽t} |B_s|^p] ⩽ (p/(p−1))^p 𝔼[|B_t|^p].

The following characterization of a Brownian motion through conditional characteristic functions is quite useful. 5.4 Lemma. Let (X t )t⩾0 be a continuous ℝd -valued stochastic process with X0 = 0, and which is adapted to the filtration (Ft )t⩾0 . Then (X t , Ft )t⩾0 is a BMd if, and only if, for all 0 ⩽ s ⩽ t and ξ ∈ ℝd 󵄨 𝔼 [exp (i⟨ξ, X t − X s ⟩) 󵄨󵄨󵄨 Fs ] = exp (− 21 |ξ|2 (t − s)) .

(5.2)

Proof. We verify (2.15), i.e. for all n ⩾ 0, 0 = t0 < t1 < ⋅ ⋅ ⋅ < t n , ξ0 , . . . , ξ n ∈ ℝd and X t−1 := 0: n

𝔼 [exp (i ∑ ⟨ξ j , X t j − X t j−1 ⟩)] = exp [− j=0

Using the tower property we find

1 n ∑ |ξ j |2 (t j − t j−1 )] . 2 j=1

n

𝔼 [exp (i ∑ ⟨ξ j , X t j − X t j−1 ⟩)] j=0

n−1 󵄨󵄨 = 𝔼 [𝔼 (exp (i ⟨ξ n , X t n − X t n−1 ⟩) 󵄨󵄨󵄨󵄨 Ft n−1 ) exp (i ∑ ⟨ξ j , X t j − X t j−1 ⟩)] 󵄨 j=0 n−1 1 = exp (− |ξ n |2 (t n − t n−1 )) 𝔼 [exp (i ∑ ⟨ξ j , X t j − X t j−1 ⟩)] . 2 j=0

Iterating this step we get (2.15), and the claim follows from Lemma 2.8.

Ex. 5.8

All examples in 5.2 were of the form “f(B t )”. We can get many more examples of this type. For this we begin with a real-variable lemma.

5.5 Lemma. Let p(t, x) := (2πt)−d/2 exp(−|x|2 /2t) be the transition density of a ddimensional Brownian motion. Then ∂ 1 p(t, x) = ∆ x p(t, x) for all x ∈ ℝd , t > 0, ∂t 2

where ∆ = ∆ x = ∑dj=1

∂2 ∂x2j

(5.3)

is the d-dimensional Laplace operator. Moreover,

∫ p(t, x)

1 1 ∆ x f(t, x) dx = ∫ f(t, x) ∆ x p(t, x) dx 2 2 ∂ = ∫ f(t, x) p(t, x) dx ∂t

(5.4)


for all functions f ∈ C1,2 ((0, ∞) × ℝd ) ∩ C ([0, ∞) × ℝd ) satisfying

󵄨󵄨 d 󵄨󵄨 2 󵄨󵄨 ∂f(t, x) 󵄨󵄨 d 󵄨󵄨󵄨 ∂f(t, x) 󵄨󵄨󵄨 󵄨󵄨 󵄨 󵄨󵄨 + ∑ 󵄨󵄨󵄨 ∂ f(t, x) 󵄨󵄨󵄨 ⩽ c(t) e C|x| |f(t, x)| + 󵄨󵄨󵄨 󵄨󵄨 + ∑ 󵄨󵄨󵄨󵄨 󵄨 󵄨 󵄨 󵄨󵄨 ∂t 󵄨󵄨 󵄨 ∂x j 󵄨󵄨󵄨 j,k=1 󵄨󵄨󵄨 ∂x j ∂x k 󵄨󵄨󵄨 j=1 󵄨

(5.5)

for all t > 0, x ∈ ℝd , some C > 0 and a locally bounded function c : (0, ∞) → [0, ∞).

Proof. The identity (5.3) can be directly verified. We leave this (lengthy but simple) computation to the reader. The first equality in (5.4) follows if we integrate by parts (twice); note that the condition (5.5) imposed on f(t, x) guarantees that the marginal terms vanish and that all integrals exist. Finally, the second equality in (5.4) follows if we plug (5.3) into the second integral. 5.6 Theorem. Let (B t , Ft )t⩾0 be a d-dimensional Brownian motion and assume that the function f ∈ C1,2 ((0, ∞) × ℝd , ℝ) ∩ C([0, ∞) × ℝd , ℝ) satisfies (5.5). Then t

f Mt

:= f(t, B t ) − f(0, B0 ) − ∫ Lf(r, B r ) dr, 0

is an Ft martingale. Here Lf(t, x) =

∂ ∂t f(t,

x) + 12 ∆ x f(t, x).

t ⩾ 0,

(5.6)

t

Proof. Observe that M t − M s = f(t, B t ) − f(s, B s ) − ∫s Lf(r, B r ) dr for 0 ⩽ s < t. We have f f 󵄨 to show 𝔼 (M t − M s 󵄨󵄨󵄨 Fs ) = 0. Using B t − B s ⊥⊥ Fs we see from Lemma A.3 that f

f

f f 󵄨 𝔼 (M t − M s 󵄨󵄨󵄨 Fs )

t

󵄨󵄨 = 𝔼 (f(t, B t ) − f(s, B s ) − ∫ Lf(r, B r ) dr 󵄨󵄨󵄨󵄨 Fs ) 󵄨 s

t

󵄨󵄨 = 𝔼 (f(t, (B t − B s ) + B s ) − f(s, B s ) − ∫ Lf(r, (B r − B s ) + B s ) dr 󵄨󵄨󵄨󵄨 Fs ) 󵄨 s

󵄨󵄨 󵄨 = 𝔼 (f(t, (B t − B s ) + z) − f(s, z) − ∫ Lf(r, (B r − B s ) + z) dr)󵄨󵄨󵄨󵄨 . 󵄨󵄨z=B s s t

A.3

Set ϕ(t, x) = ϕ s,z (t, x) := f(t + s, x + z), s > 0, and observe that B t − B s ∼ B t−s . Since f ∈ C1,2 ((0, ∞) × ℝd ), the shifted function ϕ satisfies (5.5) on [0, T] × ℝd where we can take C = c(t) ≡ γ and the constant γ depends only on s, T and z. We have to show that for all z ∈ ℝ t−s

ϕ 𝔼 (M t−s



ϕ M0 )

= 𝔼 (ϕ(t − s, B t−s ) − ϕ(0, B0 ) − ∫ Lϕ(r, B r ) dr) = 0. 0

Ex. 5.10 Ex. 18.8

52 | 5 Brownian motion as a martingale Let 0 < ϵ < u. Then we find, using Lemma 5.5, u

ϕ

ϕ

𝔼 (M u − M ϵ ) = 𝔼 (ϕ(u, B u ) − ϕ(ϵ, B ϵ ) − ∫ Lϕ(r, B r ) dr) ϵ

(∗)

= ∫ (p(u, x)ϕ(u, x) − p(ϵ, x)ϕ(ϵ, x)) dx u

− ∫ ∫ p(r, x) ( ϵ

5.5

∂ϕ(r, x) 1 + ∆ x ϕ(r, x)) dx dr ∂r 2

= ∫ (p(u, x)ϕ(u, x) − p(ϵ, x)ϕ(ϵ, x)) dx u

− ∫ ∫ (p(r, x) ϵ

∂ϕ(r, x) ∂p(r, x) + ϕ(r, x) ) dx dr ∂r ∂r u

= ∫ (p(u, x)ϕ(u, x) − p(ϵ, x)ϕ(ϵ, x)) dx − ∫ ∫ ϵ

u

(∗)

= ∫ (p(u, x)ϕ(u, x) − p(ϵ, x)ϕ(ϵ, x)) dx − ∫ ∫ ϵ

∂ (p(r, x)ϕ(r, x)) dx dr ∂r ∂ (p(r, x)ϕ(r, x)) dr dx ∂r

and this equals zero. In the lines marked with an asterisk (∗) we use Fubini’s theorem, which is indeed applicable because of (5.5) and ϵ > 0. Moreover, ϕ󵄨 󵄨󵄨 ϕ 󵄨󵄨M u − M ϵ 󵄨󵄨󵄨 ⩽ γe γ|B u | + γe γ|B ϵ | + (u − ϵ) sup γe γ|B r | ⩽ (2 + u)γ sup e γ|B r | . ϵ⩽r⩽u 0⩽r⩽u

By Doob’s maximal inequality for p = γ – without loss of generality we can assume that γ > 1 – and for the submartingale e|B r | we see that 𝔼 ( sup e γ|B r | ) ⩽ ( 0⩽r⩽u

γ γ ) 𝔼 (e γ|B u | ) < ∞. γ−1

Now we can use dominated convergence to get

𝔼 (M u − M0 ) = lim 𝔼 (M u − M ϵ ) = 0. ϕ

ϕ

ϕ

ϵ↓0

5.2 Stopping and sampling If we want to know when a Brownian motion (B t )t⩾0 󳶳 leaves or enters a set for the first time, 󳶳 hits its running maximum, 󳶳 returns to zero,

ϕ

5.2 Stopping and sampling | 53

we have to look at random times. A random time τ : Ω → [0, ∞] is a stopping time (with respect to (Ft )t⩾0 ) if {τ ⩽ t} ∈ Ft

for all t ⩾ 0.

(5.7)

You may be familiar with stopping times in discrete time; in fact, most results (and proofs) carry over to the continuous time setting. A systematic treatment of stopping times in continuous time is given in Appendix A.4. Typical examples of stopping times are entrance and hitting times of a process (X t )t⩾0 into a set A ∈ B(ℝd ): τ∘A := inf{t ⩾ 0 : X t ∈ A}, τ A := inf{t > 0 : X t ∈ A}, τ A c := inf{t > 0 : X t ∉ A}.

(first) entrance time into A: (first) hitting time of A: (first) exit time from A:

Ex. 5.13

(inf 0 = ∞)

Note that τ∘A ⩽ τ A . If t 󳨃→ X t and (Ft )t⩾0 are sufficiently regular, τ∘A , τ A are stopping times for every Borel set A ∈ B(ℝd ). In this generality, the proof is very hard, cf. [18, Chapter I.10]. For our purposes it is enough to consider closed and open sets A. The natural filtration FtX := σ(X s : s ⩽ t) of a stochastic process (X t )t⩾0 is relatively small. For many interesting stopping times we have to consider the slightly larger filtration X X Ft+ := ⋂ FuX = ⋂ Ft+ 1 . u>t

n⩾1

(5.8)

n

5.7 Lemma. Let (X t )t⩾0 be a d-dimensional stochastic process with exclusively right continuous sample paths and U ⊂ ℝd an open set. The first hitting time τ U satisfies {τ U < t} ∈ FtX

X and {τ U ⩽ t} ∈ Ft+

X i.e. it is a stopping time with respect to Ft+ .

for all t ⩾ 0,

It is not difficult to see that, for open sets U, the stopping times τ∘U and τ U coincide. If the process has only almost surely right continuous paths, we must make sure that the exceptional set is contained in the filtration. Proof. The second assertion follows immediately from the first as X . {τ U ⩽ t} = ⋂ {τ U < t + 1n } ∈ ⋂ F X 1 = Ft+ n⩾1

n⩾1

t+ n

(5.9)

For the first assertion we claim that for all t > 0

{τ U < t} = ⋃ {X(r) ∈ U} ∈ FtX . ℚ+ ∋r t. paths

ℚ+ ∋r⩽t

For τ F we set τ n := inf {s ⩾ 0 : X1/n+s ∈ F}. Then τ F = inf inf {t ⩾ n⩾1

1 n

: X t ∈ F} = inf ( 1n + τ n ) n⩾1

X . and {τ F < t} = ⋃n⩾1 {τ n < t − 1n } ∈ FtX . As in (5.9) we see that {τ F ⩽ t} ∈ Ft+

5.9 Example (First passage time). For a BM1 (B t )t⩾0 the entrance time τ b := τ∘{b} into the closed set {b} is often called first passage time. Instead of B t we consider B n which we interpret as random walk with iid steps X j = B j − B j−1 ∼ N(0, 1). Then S := lim (X1 + X2 + ⋅ ⋅ ⋅ + X n ) = X1 + lim (X2 + ⋅ ⋅ ⋅ + X n ) =: X1 + S󸀠 . n→∞

n→∞

From the iid property of the X j we know that S ∼ S󸀠 and X1 ⊥⊥ S󸀠 . Since X1 ∼ N(0, 1) is a.s. finite, we have {|S| < ∞} = {|S󸀠 | < ∞} up to a null set, and so 𝔼 [e iξS 𝟙{|S| 0.

Let (X t , Ft )t⩾0 be a martingale and denote by Ft∗ be the completion of Ft (completion means to add all subsets of ℙ-null sets). a) Show that (X t , Ft∗ )t⩾0 is a martingale.

̃ t )t⩾0 be a modification of (X t )t⩾0 . Show that (X ̃ t , F ∗ )t⩾0 is a martingale. b) Let (X t

4. Let (X_t, F_t)_{t⩾0} be a submartingale with continuous paths and F_{t+} = ⋂_{u>t} F_u. Show that (X_t, F_{t+})_{t⩾0} is again a submartingale.

5. Let (B_t)_{t⩾0} be a BM¹. Find a polynomial π(t, x) in x and t, which is of order 4 in the variable x, such that π(t, B_t) is a martingale.
Hint. One possibility is to use the exponential Wald identity

𝔼 exp(ξB_τ − ½ξ²τ) = 1,  −1 ⩽ ξ ⩽ 1,

for a suitable stopping time τ and a power-series expansion of the left-hand side in ξ.

6. This exercise contains a recipe how to obtain "polynomial" martingales with leading term B_t^n, where (B_t, F_t)_{t⩾0} is a BM¹.
a) We know that M_t^ξ := exp(ξB_t − ½ξ²t), t ⩾ 0, ξ ∈ ℝ, is a martingale. Differentiate 𝔼(M_t^ξ 𝟙_F) = 𝔼(M_s^ξ 𝟙_F), F ∈ F_s, s ⩽ t, at ξ = 0 and show that

B_t² − t,  B_t³ − 3tB_t,  B_t⁴ − 6tB_t² + 3t²,  ...

are martingales (a symbolic sketch of this expansion can be found after Problem 7 below).
b) Find a general expression for the polynomial b ↦ P_n(b, ξ) of degree n such that P_n(B_t, 0) is a martingale.

c) Let τ = τ_{(−a,b)^c} be the first exit time of a Brownian motion (B_t)_{t⩾0} from the interval (−a, b) with −a < 0 < b. Show that

𝔼(τ²) ⩽ 4 𝔼(B_τ⁴)  and  𝔼(B_τ⁴) ⩽ 36 𝔼(τ²).

7. Let (B_t)_{t⩾0} be a BM^d. Find all c ∈ ℝ such that 𝔼 e^{c|B_t|} and 𝔼 e^{c|B_t|²} are finite.
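The following symbolic sketch is not part of the book; it assumes Python with SymPy and simply carries out the power-series expansion suggested in Problem 6.a): expanding exp(ξb − ½ξ²t) in powers of ξ reproduces the polynomials B_t² − t, B_t³ − 3tB_t, B_t⁴ − 6tB_t² + 3t², ... (the symbol b plays the role of B_t).

```python
import sympy as sp

b, t, xi = sp.symbols('b t xi')
# generating function of the "polynomial" martingales: exp(xi*B_t - xi^2*t/2)
gen = sp.exp(xi*b - sp.Rational(1, 2)*xi**2*t)
ser = sp.series(gen, xi, 0, 6).removeO()

for n in range(1, 6):
    # n! times the coefficient of xi^n gives the polynomial with leading term b^n
    poly = sp.expand(sp.factorial(n) * ser.coeff(xi, n))
    print(n, poly)
# prints: b, b**2 - t, b**3 - 3*b*t, b**4 - 6*b**2*t + 3*t**2, b**5 - 10*b**3*t + 15*b*t**2
```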

8. Let p(t, x) = (2πt)^{−d/2} exp(−|x|²/(2t)), x ∈ ℝ^d, t > 0, be the transition density of a d-dimensional Brownian motion.
a) Show that p(t, x) is a solution for the heat equation, i.e.

∂/∂t p(t, x) = ½ ∆_x p(t, x) = ½ ∑_{j=1}^d ∂²/∂x_j² p(t, x),  x ∈ ℝ^d, t > 0.

b) Let f ∈ C^{1,2}((0, ∞) × ℝ^d) ∩ C([0, ∞) × ℝ^d) be such that for some constants c, c_t > 0 and all x ∈ ℝ^d, t > 0

|f(t, x)| + |∂_t f(t, x)| + ∑_{j=1}^d |∂_{x_j} f(t, x)| + ∑_{j,k=1}^d |∂_{x_j} ∂_{x_k} f(t, x)| ⩽ c_t e^{c|x|}.

Show that

∫ p(t, x) ½ ∆_x f(t, x) dx = ∫ f(t, x) ½ ∆_x p(t, x) dx.

9. Let (B_t, F_t)_{t⩾0} be a BM¹. Show that X_t = exp(aB_t + bt), t ⩾ 0, is a martingale if, and only if, a²/2 + b = 0.

10. Let (B_t, F_t)_{t⩾0} be a one-dimensional Brownian motion. Which of the following processes are martingales?
a) U_t = e^{cB_t}, c ∈ ℝ;
b) V_t = tB_t − ∫_0^t B_s ds;
c) W_t = B_t³ − tB_t;
d) X_t = B_t³ − 3 ∫_0^t B_s ds;
e) Y_t = ⅓ B_t³ − tB_t;
f) Z_t = e^{B_t − ct}, c ∈ ℝ.

11. Let (B_t, F_t)_{t⩾0} be a one-dimensional Brownian motion and f ∈ C¹(ℝ). Show that M_t := f(t)B_t − ∫_0^t f′(s)B_s ds is a martingale.

12. Let (B_t, F_t)_{t⩾0} be a BM^d. Show that X_t = (1/d)|B_t|² − t, t ⩾ 0, is a martingale.

13. Let (X_t, F_t)_{t⩾0} be a d-dimensional stochastic process and A, A_n, C ∈ B(ℝ^d), n ⩾ 1. Then
a) A ⊂ C implies τ°_A ⩾ τ°_C and τ_A ⩾ τ_C;
b) τ°_{A∪C} = min{τ°_A, τ°_C} and τ_{A∪C} = min{τ_A, τ_C};
c) τ°_{A∩C} ⩾ max{τ°_A, τ°_C} and τ_{A∩C} ⩾ max{τ_A, τ_C};
d) A = ⋃_{n⩾1} A_n, then τ°_A = inf_{n⩾1} τ°_{A_n} and τ_A = inf_{n⩾1} τ_{A_n};
e) τ_A = inf_{n⩾1} (1/n + τ°_n) where τ°_n = inf{s ⩾ 0 : X_{s+1/n} ∈ A};
f) find an example of a set A such that τ°_A < τ_A.

14. Let U ⊂ ℝ^d be an open set and assume that (X_t)_{t⩾0} is a stochastic process with continuous paths. Show that τ_U = τ°_U.

15. Show that the function d(x, A) := inf_{y∈A} |x − y|, A ⊂ ℝ^d, is continuous.

16. Let (X t , Ft )t⩾0 be a real-valued process whose paths t 󳨃→ X t (ω) are (a.s.) right continuous and have finite left-hand limits (this property is often called “càdlàg”). Assume, in addition, that X is quasi left continuous, i.e. for every sequence of stopping times (τ n )n⩾1 which increase to a stopping time τ we have limn→∞ X τ n = X τ a.s. on the set {τ < ∞}. Show that the assertion of Lemma 5.8 remains valid for such processes if the filtration contains all subsets of measurable null sets. Hint. Approximate F by open sets U n ↓ F and show that τ U n ↑ τ F .

Remark. If X has continuous paths, it is also quasi left continuous. Note that quasi left continuity does not imply continuity. Even though we can take τ n ≡ t n ↑ t, the limit limn→∞ X t n = X t holds

only up to an exceptional set – and this set may (and will) depend on the approximating sequence (t n )n⩾1 . This means that we cannot control the exceptional sets in the left limit X t− .


17. Assume that the processes in Lemma 5.7 or in Lemma 5.8 are only "almost surely right continuous", resp. "almost surely continuous". Identify all steps in the proofs where this becomes relevant and show that the assertions remain valid if the filtrations F_t^X and F_{t+}^X are completed, so that they contain all subsets of measurable null sets.

18. Let τ be a stopping time. Check that F_τ and F_{τ+} are σ-algebras.

19. Let τ be a stopping time for the filtration (F_t)_{t⩾0}. Show that
a) F ∈ F_{τ+} ⇐⇒ ∀t ⩾ 0 : F ∩ {τ < t} ∈ F_t;
b) τ ∧ t is again a stopping time;
c) {τ ⩽ t} ∈ F_{τ∧t} for all t ⩾ 0.

20. Let τ := τ°_{(−a,b)^c} be the first entrance time of a BM¹ into the set (−a, b)^c.
a) Show that τ has finite moments 𝔼 τ^n of any order n ⩾ 1.
Hint. Use Example 5.2.d) and show that 𝔼 e^{cτ} < ∞ for some c > 0.
b) Evaluate 𝔼 ∫_0^τ B_s ds.
Hint. Find a martingale which contains ∫_0^t B_s ds.

21. Let (B t )t⩾0 be a BMd . Find 𝔼 τ R where τ R = inf {t ⩾ 0 : |B t | = R}.

22. Let (B t )t⩾0 be a BM1 and σ, τ be two stopping times such that 𝔼 τ, 𝔼 σ < ∞. a) Show that σ ∧ τ is a stopping time. b) Show that {σ ⩽ τ} ∈ Fσ∧τ .

c) Show that 𝔼(B τ B σ ) = 𝔼(τ ∧ σ)

Hint. Consider 𝔼(B σ B τ 𝟙{σ⩽τ} ) = 𝔼(B σ∧τ B τ 𝟙{σ⩽τ} ) and use optional stopping.

d) Deduce from c) that 𝔼(|B τ − B σ |2 ) = 𝔼 |τ − σ|.

23. (Galmarino’s test – continuation of Problem 4.9) Let B = (B t )t⩾0 be a canonical BM1 on Wiener space (Ω, A , ℙ) = (C(o) , B(C(o) ), μ) and Ft := σ(B s , s ⩽ t), F∞ = σ (⋃t⩾0 Ft ), the filtration generated by B. Show that the following assertions are equivalent: a) τ is a stopping time for (Ft )t⩾0 , i.e. {τ ⩽ t} ∈ Ft for all t ⩾ 0; b) If ω, ω󸀠 ∈ Ω satisfy τ(ω) = t and B s (ω) = B s (ω󸀠 ) for all s ⩽ t, then τ(ω󸀠 ) = t.

6 Brownian motion as a Markov process

We have seen in 2.13 that for a d-dimensional Brownian motion (B_t)_{t⩾0} and any s > 0 the shifted process W_t := B_{t+s} − B_s, t ⩾ 0, is again a BM^d which is independent of (B_t)_{0⩽t⩽s}. Since B_{t+s} = W_t + B_s, we can interpret this as a renewal property: rather than going from 0 = B_0 straight away to x = B_{t+s} in (t + s) units of time, we stop after time s at x′ = B_s and move, using a further Brownian motion W_t for t units of time, from x′ to x = B_{t+s}. This situation is shown in Figure 6.1.

Fig. 6.1: W_t := B_{t+s} − B_s, t ⩾ 0, is a Brownian motion in the new coordinate grid with origin (s, B_s).

6.1 The Markov property

Ex. 5.2 Ex. 6.1 Ex. 6.5

In fact, we know from Lemma 2.14 that the paths up to time s and the increments after time s are stochastically independent, i.e. F_s^B ⊥⊥ F_∞^W := σ(B_{t+s} − B_s : t ⩾ 0). If we use any admissible filtration (F_t)_{t⩾0} instead of the natural filtration (F_t^B)_{t⩾0}, the argument of Lemma 2.14 remains valid, and we get

6.1 Theorem (Markov property of BM). Let (B_t)_{t⩾0} be a BM^d and (F_t)_{t⩾0} some admissible filtration. For every s > 0, the process W_t := B_{t+s} − B_s, t ⩾ 0, is also a BM^d and (W_t)_{t⩾0} is independent of F_s, i.e. F_∞^W = σ(W_t, t ⩾ 0) ⊥⊥ F_s.

Theorem 6.1 justifies our intuition that we can split a Brownian path into two independent pieces

B(t + s) = B(t + s) − B(s) + B(s) = W(t) + y |_{y=B(s)}.


Observe that W(t) + y is a Brownian motion started at y ∈ ℝ^d. Let us introduce the following notation:

ℙ^x(B_{t_1} ∈ A_1, ..., B_{t_n} ∈ A_n) := ℙ(B_{t_1} + x ∈ A_1, ..., B_{t_n} + x ∈ A_n),  (6.1)

where 0 ⩽ t_1 < ⋯ < t_n and A_1, ..., A_n ∈ B(ℝ^d). We will write 𝔼^x for the corresponding mathematical expectation. Clearly, ℙ^0 = ℙ and 𝔼^0 = 𝔼. This means that ℙ^x(B_s ∈ A) denotes the probability that a Brownian particle starts at time t = 0 at the point x and travels in s units of time into the set A. Since the finite dimensional distributions (6.1) determine ℙ^x uniquely, cf. Theorem 4.8, ℙ^x is a well-defined measure on (Ω, F_∞).¹ Moreover, x ↦ 𝔼^x u(B_t) = 𝔼^0 u(B_t + x) is for all u ∈ B_b(ℝ^d) a measurable function.
We will need the following formulae for conditional expectations, a proof of which can be found in the appendix, Lemma A.3. Assume that 𝒳 ⊂ A and 𝒴 ⊂ A are independent σ-algebras. If X is an 𝒳/B(ℝ^d) and Y a 𝒴/B(ℝ^d) measurable random variable, then²

𝔼[Φ(X, Y) | 𝒳] = 𝔼 Φ(x, Y)|_{x=X} = 𝔼[Φ(X, Y) | X]  (6.2)

for all bounded Borel measurable functions Φ : ℝ^d × ℝ^d → ℝ. If Ψ : ℝ^d × Ω → ℝ is bounded and B(ℝ^d) ⊗ 𝒴/B(ℝ) measurable, then

𝔼[Ψ(X(⋅), ⋅) | 𝒳] = 𝔼 Ψ(x, ⋅)|_{x=X} = 𝔼[Ψ(X(⋅), ⋅) | X].  (6.3)

6.2 Theorem (Markov property). Let (B_t)_{t⩾0} be a BM^d with admissible filtration (F_t)_{t⩾0}. Then

𝔼[u(B_{t+s}) | F_s] = 𝔼[u(B_t + x)]|_{x=B_s} = 𝔼^{B_s}[u(B_t)]  (6.4)

holds for all bounded measurable functions u : ℝ^d → ℝ and

𝔼[Ψ(B_{•+s}) | F_s] = 𝔼[Ψ(B_• + x)]|_{x=B_s} = 𝔼^{B_s}[Ψ(B_•)]  (6.5)

holds for all bounded B(C)/B(ℝ) measurable functionals Ψ : C[0, ∞) → ℝ which may depend on the whole Brownian path.  [Ex. 4.8]

Proof. We write u(B_{t+s}) = u(B_s + B_{t+s} − B_s) = u(B_s + (B_{t+s} − B_s)). Then (6.4) follows from (6.2) and (B1) with Φ(x, y) = u(x + y), X = B_s, Y = B_{t+s} − B_s and 𝒳 = F_s. In the same way we can derive (6.5) from (6.3) and 2.14 if we take 𝒴 = F_∞^W where W_t := B_{t+s} − B_s, 𝒳 = F_s and X = B_s.

¹ The corresponding canonical Wiener measure μ^x from Theorem 4.2 is, consequently, concentrated on the continuous functions w : [0, ∞) → ℝ^d satisfying w(0) = x.
² The notation 𝔼 Φ(x, Y)|_{x=X} means that we first calculate 𝔼 Φ(x, Y) with x playing the role of a parameter, and then insert x = X(ω).

Ex. 6.6

6.3 Remark. a) Theorem 6.1 really is a result for processes with stationary and independent increments, whereas Theorem 6.2 is not necessarily restricted to such processes. In general any d-dimensional, F_t adapted stochastic process (X_t)_{t⩾0} with (right) continuous paths is called a Markov process if it satisfies

𝔼[u(X_{t+s}) | F_s] = 𝔼[u(X_{t+s}) | X_s]  for all s, t ⩾ 0, u ∈ B_b(ℝ^d).  (6.4a)

Since we can express 𝔼[u(X_{t+s}) | X_s] as an (essentially unique) function of X_s, say g_{u,s,t+s}(X_s), we have

𝔼[u(X_{t+s}) | F_s] = 𝔼^{s,X_s}[u(X_{t+s})]  with  𝔼^{s,x}[u(X_{t+s})] := g_{u,s,t+s}(x).  (6.4b)

Note that we can always select a Borel measurable version of g_{u,s,t+s}(⋅). If g_{u,s,t+s} depends only on the difference (t + s) − s = t, we call the Markov process homogeneous and we write

𝔼[u(X_{t+s}) | F_s] = 𝔼^{X_s}[u(X_t)]  with  𝔼^x[u(X_t)] := g_{u,t}(x).  (6.4c)

Obviously (6.4) implies both (6.4a) and (6.4c), i.e. Brownian motion is also a Markov process in this more general sense. For homogeneous Markov processes we have

𝔼[Ψ(X_{•+s}) | F_s] = 𝔼[Ψ(X_{•+s}) | X_s] = 𝔼^{X_s}[Ψ(X_•)]  (6.5a)

for bounded measurable functionals Ψ : C[0, ∞) → ℝ of the sample paths. Although this relation seems to be more general than (6.4a), (6.4b), it is actually possible to derive (6.5a) directly from (6.4a), (6.4b) by standard measure-theoretic techniques. A proof in the discrete-time setting (for Markov chains) can be found in [235, Theorem 54.6].

b) It is useful to write (6.4) in integral form and with all ω:

𝔼[u(B_{t+s}) | F_s](ω) = ∫_Ω u(B_t(ω′) + B_s(ω)) ℙ(dω′)
                       = ∫_Ω u(B_t(ω′)) ℙ^{B_s(ω)}(dω′)
                       = 𝔼^{B_s(ω)}[u(B_t)].

Markov processes are often used in the literature. In fact, the Markov property (6.4a), (6.4b) can be seen as a means to obtain the finite dimensional distributions from one-dimensional distributions. We will show this for a Brownian motion (B_t)_{t⩾0} where we use only the property (6.4). The guiding idea is as follows: if a particle starts at x and is by time t in a set A and by time u in C, we can realize this as

x ──(Brownian motion, time t)──→ A  {stop there at B_t = a, then restart from a ∈ A}  ──(Brownian motion, time u − t)──→ C.

Let us become a bit more rigorous. Let 0 ⩽ s < t < u and let φ, ψ, χ : ℝ^d → ℝ be bounded measurable functions. By the tower property of conditional expectation and a pull-out argument, we see

𝔼[φ(B_s) ψ(B_t) χ(B_u)] = 𝔼[φ(B_s) ψ(B_t) 𝔼(χ(B_u) | F_t)]
    =(6.4)  𝔼[φ(B_s) ψ(B_t) 𝔼^{B_t} χ(B_{u−t})]
    =(6.4)  𝔼[φ(B_s) 𝔼^{B_s}(ψ(B_{t−s}) 𝔼^{B_{t−s}} χ(B_{u−t}))],

where 𝔼^{B_s}(ψ(B_{t−s}) 𝔼^{B_{t−s}} χ(B_{u−t})) = ∫ψ(a) ∫χ(b) ℙ^a(B_{u−t} ∈ db) ℙ^{B_s}(B_{t−s} ∈ da). If we rewrite all expectations as integrals, we get

∭ φ(x)ψ(y)χ(z) ℙ(B_s ∈ dx, B_t ∈ dy, B_u ∈ dz)
    = ∫ [φ(c) ∫ψ(a) ∫χ(b) ℙ^a(B_{u−t} ∈ db) ℙ^c(B_{t−s} ∈ da)] ℙ(B_s ∈ dc).

This shows that we find the joint distribution of the vector (B s , B t , B u ) if we know the one-dimensional distributions of each B t . For s = 0 and ϕ(0) = 1 these relations are also known as Chapman-Kolmogorov equations. By iteration we get

6.4 Theorem. Let (B_t)_{t⩾0} be a d-dimensional Brownian motion (or a general Markov process), x_0 = 0 and t_0 = 0 < t_1 < ⋯ < t_n. Then

ℙ^0(B_{t_1} ∈ dx_1, ..., B_{t_n} ∈ dx_n) = ∏_{j=1}^n ℙ^{x_{j−1}}(B_{t_j − t_{j−1}} ∈ dx_j).  (6.6)
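As an illustration of (6.6) — not part of the original text, and assuming Python with NumPy/SciPy — one can check numerically that the product of one-step Gaussian transition densities coincides with the joint density of (B_{t_1}, B_{t_2}, B_{t_3}), which is a centred Gaussian with covariance matrix (t_i ∧ t_j); the times and the test point below are arbitrary.

```python
import numpy as np
from scipy.stats import multivariate_normal, norm

def transition_density(s, x, y):
    """p_s(x, y) = (2*pi*s)^(-1/2) * exp(-(y-x)^2/(2s)) for a one-dimensional Brownian motion."""
    return norm.pdf(y, loc=x, scale=np.sqrt(s))

t = np.array([0.5, 1.2, 3.0])
x = np.array([0.3, -0.4, 1.1])            # a test point (x1, x2, x3)

# right-hand side of (6.6): product of transition densities, started at x0 = 0
rhs = (transition_density(t[0], 0.0, x[0])
       * transition_density(t[1] - t[0], x[0], x[1])
       * transition_density(t[2] - t[1], x[1], x[2]))

# joint density of (B_{t1}, B_{t2}, B_{t3}): centred Gaussian, Cov_ij = min(t_i, t_j)
cov = np.minimum.outer(t, t)
lhs = multivariate_normal(mean=np.zeros(3), cov=cov).pdf(x)

print(lhs, rhs)   # the two values agree up to floating point error
```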

6.2 The strong Markov property

Let us now show that (6.4) remains valid if we replace s by a stopping time σ(ω). Since stopping times can attain the value +∞, we have to take care that B_σ is defined even in this case. If (X_t)_{t⩾0} is a stochastic process and σ a stopping time, we set

X_σ(ω) := X_{σ(ω)}(ω)        if σ(ω) < ∞,
          lim_{t→∞} X_t(ω)   if σ(ω) = ∞ and if the limit exists,
          0                  in all other cases.

6.5 Theorem (strong Markov property). Let (B t )t⩾0 be a BMd with admissible filtration (Ft )t⩾0 and σ a finite stopping time, i.e. σ(ω) < ∞ for all ω ∈ Ω. Then (W t )t⩾0 , where W t := B σ+t − B σ , is again a BMd , which is independent of Fσ+ .

We will give two proofs of this important result, first a down-to-earth direct attack, and then a second one which relies on an ingenious martingale-plus-optional-sampling trick.

Ex. 6.2

Pedestrian proof of Theorem 6.5. We know, cf. Lemma A.16, that σ_j := (⌊2^j σ⌋ + 1)/2^j is a decreasing sequence of stopping times such that inf_{j⩾1} σ_j = σ. For all 0 ⩽ s < t, ξ ∈ ℝ^d and all F ∈ F_{σ+} we find, using the continuity of the sample paths,

𝔼[e^{i⟨ξ, B(σ+t)−B(σ+s)⟩} 𝟙_F]
  = lim_{j→∞} 𝔼[e^{i⟨ξ, B(σ_j+t)−B(σ_j+s)⟩} 𝟙_F]
  = lim_{j→∞} ∑_{k=1}^∞ 𝔼[e^{i⟨ξ, B(k2^{−j}+t)−B(k2^{−j}+s)⟩} 𝟙_{{σ_j=k2^{−j}}} 𝟙_F]
  = lim_{j→∞} ∑_{k=1}^∞ 𝔼[e^{i⟨ξ, B(k2^{−j}+t)−B(k2^{−j}+s)⟩}] 𝔼[𝟙_{{σ_j=k2^{−j}}} 𝟙_F];

here we use that B(k2^{−j}+t) − B(k2^{−j}+s) ⊥⊥ F_{k2^{−j}} and ∼ B(t−s) by (B1) or 5.1.b), (B2), and that {σ_j = k2^{−j}} ∩ F = {(k−1)2^{−j} ⩽ σ < k2^{−j}} ∩ F ∈ F_{k2^{−j}}, since F ∈ F_{σ+} ⊂ F_{σ_j}. Thus,

𝔼[e^{i⟨ξ, B(t+σ)−B(s+σ)⟩} 𝟙_F] = e^{−(t−s)|ξ|²/2} ℙ(F)  for all 0 ⩽ s < t, F ∈ F_{σ+}.

This remains true if we let s ↓ 0, i.e. we see that B(t + σ) − B(s + σ) ⊥⊥ F_{σ+} for all 0 ⩽ s ⩽ t. Since W = (B(t + σ) − B(σ))_{t⩾0} has independent increments, we conclude that F_∞^W ⊥⊥ F_{σ+}, see Lemma 2.14 for the details of this argument.

We can now use (6.2), (6.3) (or Lemma A.3) to deduce from Theorem 6.5 the following consequences of the strong Markov property.

6.6 Theorem. Let (B_t)_{t⩾0} be a BM^d with admissible filtration (F_t)_{t⩾0} and σ be a (not necessarily finite) stopping time. Then we have for all t ⩾ 0, u ∈ B_b(ℝ^d) and ℙ almost all ω ∈ {σ < ∞}

𝔼[u(B_{t+σ})𝟙_{{σ<∞}} | F_{σ+}](ω) = …

For b > 0,

ℙ(τ_b ⩽ t) = ℙ(τ_b ⩽ t, B_t ⩾ b) + ℙ(τ_b ⩽ t, B_t < b)
           = ℙ(τ_b ⩽ t, B_t ⩾ b) + ℙ(τ_b ⩽ t, B_t > b)
           = ℙ(B_t ⩾ b) + ℙ(B_t > b) = 2ℙ(B_t ⩾ b).

Observe that {τ_b ⩽ t} = {M_t ⩾ b} where M_t := sup_{s⩽t} B_s. Hence,

ℙ(M_t ⩾ b) = ℙ(τ_b ⩽ t) = 2ℙ(B_t ⩾ b)  for all b > 0.  (6.12)
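A quick Monte Carlo sanity check of (6.12) — an illustrative sketch which is not part of the book and assumes Python with NumPy/SciPy; the discrete grid slightly underestimates the running maximum, so the empirical value sits a little below the theoretical one.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
t, b = 1.0, 0.8
n_paths, n_steps = 20_000, 2_000
dt = t / n_steps

increments = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
paths = np.cumsum(increments, axis=1)
running_max = paths.max(axis=1)

empirical = np.mean(running_max >= b)               # estimate of P(M_t >= b)
theoretical = 2 * (1 - norm.cdf(b / np.sqrt(t)))    # 2 P(B_t >= b), cf. (6.12)
print(empirical, theoretical)
```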

From this we can calculate the probability distributions of various functionals of a Brownian motion. We will see, in connection with local times, a much deeper version later in Theorem 19.29.

Ex. 6.8 Ex. 6.9 Ex. 6.10

6.10 Theorem (Lévy 1939). Let (B_t)_{t⩾0} be a BM¹, b ∈ ℝ, and set

τ_b = inf{t ⩾ 0 : B_t = b},  M_t := sup_{s⩽t} B_s,  and  m_t := inf_{s⩽t} B_s.

Then, for all x ⩾ 0,

M_t ∼ |B_t| ∼ −m_t ∼ M_t − B_t ∼ B_t − m_t ∼ √(2/(πt)) e^{−x²/(2t)} dx  (6.13)

and for t > 0

τ_b ∼ (|b|/√(2πt³)) e^{−b²/(2t)} dt.  (6.14)

Proof. The equalities (6.12) immediately show that M_t ∼ |B_t|. Using the symmetry 2.12 of a Brownian motion we get m_t = inf_{s⩽t} B_s = −sup_{s⩽t}(−B_s) ∼ −M_t. The invariance of BM¹ under time reversals, cf. 2.15, gives

M_t − B_t = sup_{s⩽t}(B_s − B_t) = sup_{s⩽t}(B_{t−s} − B_t) ∼ sup_{s⩽t} B_s = M_t,

and if we combine this again with the symmetry 2.12 we get M_t − B_t ∼ B_t − m_t. The formula (6.14) follows for b > 0 from the fact that {τ_b > t} = {M_t < b} if we differentiate the relation

ℙ(τ_b > t) = ℙ(M_t < b) = √(2/(πt)) ∫_0^b e^{−x²/(2t)} dx.

For b < 0 we use the symmetry of a Brownian motion.
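The first passage density (6.14) can also be verified numerically; the following sketch (not from the book, assuming Python with NumPy/SciPy) integrates the density over [0, T] and compares the result with ℙ(τ_b ⩽ T) = 2ℙ(B_T ⩾ b) from (6.12). The level b and the horizons T are arbitrary test values.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

b = 1.5

def tau_density(t):
    """First passage time density (6.14): |b| / sqrt(2*pi*t^3) * exp(-b^2/(2t))."""
    return abs(b) / np.sqrt(2 * np.pi * t**3) * np.exp(-b**2 / (2 * t))

for T in (0.5, 2.0, 10.0):
    prob, _ = quad(tau_density, 1e-12, T)                    # P(tau_b <= T) via (6.14)
    print(T, prob, 2 * (1 - norm.cdf(abs(b) / np.sqrt(T))))  # vs. 2 P(B_T >= b), cf. (6.12)
```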

6.11 Remark. (6.13) tells us that for each single instant of time the random variables, e.g. M_t and |B_t|, have the same distribution. This is not true for the finite dimensional distributions, i.e. the laws of the processes (M_t)_{t⩾0} and (|B_t|)_{t⩾0} are different. In our example this is obvious since t ↦ M_t is a.s. increasing while t ↦ |B_t| is positive but not necessarily monotone. On the other hand, the processes −m and M are equivalent, i.e. they do have the same law.
In order to derive (6.12) we used the Markov property of a Brownian motion, i.e. the fact that F_s^B and F_∞^W, W_t = B_{t+s} − B_s, are independent. Often the proof is based on a naive application of (6.7), but this leads to the following difficulty:

ℙ(τ_b ⩽ t, B_t < b) = 𝔼(𝟙_{{τ_b⩽t}} 𝔼[𝟙_{(−∞,0)}(B_t − b) | F_{τ_b}^B])
   =^{?!} 𝔼(𝟙_{{τ_b⩽t}} 𝔼^{B_{τ_b}}[𝟙_{(−∞,0)}(B_{t−τ_b} − b)])
   = 𝔼(𝟙_{{τ_b⩽t}} 𝔼^b[𝟙_{(−∞,0)}(B_{t−τ_b} − b)])
   = 𝔼(𝟙_{{τ_b⩽t}} 𝔼[𝟙_{(−∞,0)}(B_{t−τ_b})])
   = 𝔼(𝟙_{{τ_b⩽t}} 𝔼[𝟙_{(0,∞)}(B_{t−τ_b})]).

Notice that in the step marked by "?!" there appears the expression t − τ_b. As it is written, we do not know whether t − τ_b is positive – and if it were, our calculation would certainly not be covered by (6.7). In order to make such arguments work we need the following stronger version of the strong Markov property. Mind the dependence of the random times on the fixed path ω in the expression below!

6.12 Theorem. Let (B_t)_{t⩾0} be a BM^d, τ an F_t^B stopping time and η ⩾ τ where η is an F_{τ+}^B measurable random time. Then we have for all ω ∈ {η < ∞} and all u ∈ B_b(ℝ^d)

𝔼[u(B_η) | F_{τ+}^B](ω) = 𝔼^{B_τ(ω)}[u(B_{η(ω)−τ(ω)}(⋅))] = ∫ u(B_{η(ω)−τ(ω)}(ω′)) ℙ^{B_τ(ω)}(dω′).  (6.15)

The requirement that τ is a stopping time and η is an F_{τ+} measurable random time with η ⩾ τ looks artificial. Typical examples of (τ, η) pairs are (τ, τ + t), (τ, τ ∨ t) or (σ ∧ t, t) where σ is a further stopping time.

Proof of Theorem 6.12. Replacing u(B_η) by u(B_η)𝟙_{{η<∞}} …

For x ≠ 0,

ℙ^x(τ_{{0}} < ∞) = 1 if d = 1,  and  = 0 if d ⩾ 2,  (6.18)

ℙ^x(τ_{𝔹(0,r)} < ∞) = 1 if d = 1, 2,  and  = (|x|/r)^{2−d} if d ⩾ 3.  (6.19)

If x = 0, τ_{{0}} is the first return time to 0, and (6.18) is still true:

{τ_{{0}} < ∞} = {∃t > 0 : B_t = 0} = ⋃_{n⩾1} {∃t > 1/n : B_t = 0}.

By the Markov property, and the fact that ℙ^0(B_{1/n} = 0) = 0, we have

ℙ^0(∃t > 1/n : B_t = 0) = 𝔼^0 ℙ^{B_{1/n}}(∃t > 0 : B_t = 0) = 𝔼^0 ℙ^{B_{1/n}}(τ_{{0}} < ∞),

which is 0 or 1 according to d = 1 or d ⩾ 2, by (6.18). By the strong Markov property, a re-started Brownian motion is again a Brownian motion. This is the key to the following result.

6.18 Corollary. Let (B(t))_{t⩾0} be a BM^d and x ∈ ℝ^d.
a) If d = 1, Brownian motion is point recurrent, i.e.

ℙx (B(t) = 0 infinitely often) = 1.

b) If d = 2, Brownian motion is neighbourhood recurrent, i.e.

ℙx (B(t) ∈ 𝔹(0, r) infinitely often) = 1 for every r > 0.

c) If d ⩾ 3, Brownian motion is transient, i.e. ℙx ( limt→∞ |B(t)| = ∞) = 1.

Proof. Let d = 1, 2, r > 0 and denote by τ_{𝔹(0,r)} the first hitting time of 𝔹(0, r) for (B(t))_{t⩾0}. We know from Corollary 6.17 that τ_{𝔹(0,r)} is a.s. finite. By the strong Markov property, Theorem 6.6, we find

ℙ^0(∃t > 0 : B(t + τ_{𝔹^c(0,r)}) ∈ 𝔹(0, r)) = 𝔼^0(ℙ^{B(τ_{𝔹^c(0,r)})}(τ_{𝔹(0,r)} < ∞)) =(6.19) 1.

This means that (B(t))_{t⩾0} visits every neighbourhood 𝔹(0, r) time and again. Since B(t) − y, y ∈ ℝ^d, is also a Brownian motion, the same argument shows that (B(t))_{t⩾0} visits any ball 𝔹(y, r) infinitely often. In one dimension this is, because of the continuity of the paths of a Brownian motion, possible only if every point is visited infinitely often.
Let d ⩾ 3. By the strong Markov property, Theorem 6.6, and Corollary 6.17 we find for all r < R

ℙ^0(∃t ⩾ τ_{𝔹^c(0,R)} : B(t) ∈ 𝔹(0, r)) = 𝔼^0(ℙ^{B(τ_{𝔹^c(0,R)})}(τ_{𝔹(0,r)} < ∞)) = (r/R)^{d−2}.

Take r = √R and let R → ∞. This shows that the return probability of (B(t))t⩾0 to any compact neighbourhood of 0 is zero, i.e. limt→∞ |B(t)| = ∞.

6.5 Lévy's triple law

In this section we show how we can apply the reflection principle repeatedly to obtain for a BM¹, (B_t)_{t⩾0}, the joint distribution of B_t, the running minimum m_t = inf_{s⩽t} B_s and maximum M_t = sup_{s⩽t} B_s. This is known as P. Lévy's loi à trois variables, [166, VI.42.6, 4°, p. 213]. In a different context this formula appears already in Bachelier [7, Nos. 413 and 504]. An interesting application of the triple law can be found in Section 12.2, footnote 1.

6.19 Theorem (Lévy 1948). Let (B_t)_{t⩾0} be a BM¹ and denote by m_t and M_t its running minimum and maximum, respectively. Then

ℙ(m_t > a, M_t < b, B_t ∈ dx) = (dx/√(2πt)) ∑_{n=−∞}^{∞} (e^{−(x+2n(b−a))²/(2t)} − e^{−(x−2a−2n(b−a))²/(2t)})  (6.20)

Proof. Let a < 0 < b, denote by τ = inf{t ⩾ 0 : B t ∉ (a, b)} the first exit time from the interval (a, b), and pick I = [c, d] ⊂ (a, b). We know from Corollary 5.11 that τ is an a.s. finite stopping time. Since {τ > t} = {m t > a, M t < b} we have ℙ(m t > a, M t < b, B t ∈ I) = ℙ(τ > t, B t ∈ I)

= ℙ(B t ∈ I) − ℙ(τ ⩽ t, B t ∈ I)

= ℙ(B t ∈ I) − ℙ(B τ = a, τ ⩽ t, B t ∈ I) − ℙ(B τ = b, τ ⩽ t, B t ∈ I).

For the last step we use the fact that {B τ = a} and {B τ = b} are, up to the null set {τ = ∞}, a disjoint partition of Ω. We want to use the reflection principle repeatedly. Denote by ra x := 2a − x and rb x := 2b − x the reflection in the (x, t)-plane with respect to the line x = a and x = b, respectively, cf. Figure 6.4. Using Theorem 6.12 we find 󵄨 ℙ(B τ = b, τ ⩽ t, B t ∈ I 󵄨󵄨󵄨 Fτ )(ω) = 𝟙{B τ =b}∩ {τ⩽t} (ω)ℙB τ (ω) (B t−τ(ω) ∈ I).

Because of the symmetry of Brownian motion,

ℙb (B t−τ(ω) ∈ I) = ℙ(B t−τ(ω) ∈ I − b) and, therefore,

= ℙ(B t−τ(ω) ∈ b − I)

= ℙb (B t−τ(ω) ∈ rb I),

󵄨 ℙ(B τ = b, τ ⩽ t, B t ∈ I 󵄨󵄨󵄨 Fτ )(ω) = 𝟙{B τ =b}∩ {τ⩽t} (ω)ℙB τ (ω) (B t−τ(ω) ∈ rb I) 󵄨 = ℙ(B τ = b, τ ⩽ t, B t ∈ rb I 󵄨󵄨󵄨 Fτ )(ω).

This shows that

ℙ (B τ = b, τ ⩽ t, B t ∈ I) = ℙ (B τ = b, τ ⩽ t, B t ∈ rb I) = ℙ (B τ = b, B t ∈ rb I) .

Ex. 6.12

Fig. 6.4: A [reflected] Brownian path visiting the [reflected] interval at time t.

Further applications of the reflection principle yield

ℙ(B_τ = b, B_t ∈ r_b I) = ℙ(B_t ∈ r_b I) − ℙ(B_τ = a, B_t ∈ r_b I)
                        = ℙ(B_t ∈ r_b I) − ℙ(B_τ = a, B_t ∈ r_a r_b I)
                        = ℙ(B_t ∈ r_b I) − ℙ(B_t ∈ r_a r_b I) + ℙ(B_τ = b, B_t ∈ r_a r_b I)

and, finally,

ℙ(B_τ = b, τ ⩽ t, B_t ∈ I) = ℙ(B_t ∈ r_b I) − ℙ(B_t ∈ r_a r_b I) + ℙ(B_t ∈ r_b r_a r_b I) − + ⋯

Since the quantities a and b play exchangeable roles in all calculations up to this point, we get

ℙ(m_t > a, M_t < b, B_t ∈ I)
  = ℙ(B_t ∈ I) − ℙ(B_t ∈ r_b I) + ℙ(B_t ∈ r_a r_b I) − ℙ(B_t ∈ r_b r_a r_b I) + − ⋯
      − ℙ(B_t ∈ r_a I) + ℙ(B_t ∈ r_b r_a I) − ℙ(B_t ∈ r_a r_b r_a I) + − ⋯
  = ℙ(B_t ∈ I) − ℙ(r_a B_t ∈ r_a r_b I) + ℙ(B_t ∈ r_a r_b I) − ℙ(r_a B_t ∈ (r_a r_b)² I) + − ⋯
      − ℙ(r_a B_t ∈ I) + ℙ(B_t ∈ r_b r_a I) − ℙ(r_a B_t ∈ r_b r_a I) + − ⋯

In the last step we use that r_a r_a = id. All probabilities in these identities are probabilities of mutually disjoint events, i.e. the series converges absolutely (with a sum < 1); therefore we may rearrange the terms in the series. Since r_a I = 2a − I and r_b r_a I = 2b − (2a − I) = 2(b − a) + I, we get for all n ⩾ 0

(r_b r_a)^n I = 2n(b − a) + I  and  (r_a r_b)^n I = 2n(a − b) + I = 2(−n)(b − a) + I.

Therefore, we get

ℙ(m_t > a, M_t < b, B_t ∈ I)
  = ∑_{n=−∞}^{∞} ℙ(B_t ∈ 2n(a − b) + I) − ∑_{n=−∞}^{∞} ℙ(2a − B_t ∈ 2n(a − b) + I)
  = ∑_{n=−∞}^{∞} ∫_I (e^{−(x−2n(a−b))²/(2t)} − e^{−(2a−x−2n(a−b))²/(2t)}) dx/√(2πt),

which is the same as (6.20).

6.6 An arc-sine law

There are quite a few functionals of a Brownian path which follow an arc-sine distribution. Most of them are more or less connected with the zeros of (B_t)_{t⩾0}. Consider, for example, the largest zero of B_s in the interval [0, t]:

ξ_t := sup{s ⩽ t : B_s = 0}.

Obviously, ξ_t is not a stopping time since {ξ_t ⩽ r}, r < t, cannot be contained in F_r^B; otherwise we could forecast, on the basis of [0, r] ∋ s ↦ B_s, whether B_s = 0 for some future time s ∈ (r, t) or not. Nevertheless, we can determine the probability distribution of ξ_t.

6.20 Theorem. Let (B_t)_{t⩾0} be a BM¹ and write ξ_t for the largest zero of B_s in the interval [0, t]. Then

ℙ(ξ_t < s) = (2/π) arcsin √(s/t)  for all 0 ⩽ s ⩽ t.  (6.21)
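Before the proof, here is a quick simulation cross-check of (6.21); it is not part of the book's text and assumes Python with NumPy. The largest zero is approximated by the last sign change of the path on a discrete grid, which biases the estimate slightly downwards.

```python
import numpy as np

rng = np.random.default_rng(3)
t, n_steps, n_paths = 1.0, 5_000, 20_000
grid = np.linspace(0.0, t, n_steps + 1)
paths = np.concatenate([np.zeros((n_paths, 1)),
                        np.cumsum(rng.normal(0.0, np.sqrt(t / n_steps),
                                             size=(n_paths, n_steps)), axis=1)], axis=1)

# the last sign change of the path approximates the largest zero xi_t
sign_change = paths[:, :-1] * paths[:, 1:] <= 0.0
last_idx = n_steps - np.argmax(sign_change[:, ::-1], axis=1) - 1
last_zero = grid[last_idx]

s = 0.3
print(np.mean(last_zero < s), 2 / np.pi * np.arcsin(np.sqrt(s / t)))  # cf. (6.21)
```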

Proof. Set h(s) := ℙ(ξ_t < s). By the Markov property we get

h(s) = ℙ(B_u ≠ 0 for all u ∈ [s, t])
     = ∫ ℙ^{B_s(ω)}(B_{u−s} ≠ 0 for all u ∈ [s, t]) ℙ(dω)
     = ∫ ℙ^b(B_{u−s} ≠ 0 for all u ∈ [s, t]) ℙ(B_s ∈ db)
     = ∫ ℙ^0(B_v ≠ −b for all v ∈ [0, t − s]) ℙ(B_s ∈ db)
     = ∫ ℙ^0(τ_{−b} ⩾ t − s) ℙ(B_s ∈ db),

where τ_{−b} is the first passage time for −b. Using

ℙ(B_s ∈ db) = ℙ(B_s ∈ −db)  and  ℙ(τ_b ⩾ t − s) = √(2/(π(t−s))) ∫_0^b e^{−x²/(2(t−s))} dx,

Ex. 6.15 Ex. 6.16 Ex. 6.17

cf. Theorem 6.10, we get

h(s) = 2 ∫_0^∞ √(2/(π(t−s))) · (1/√(2πs)) e^{−b²/(2s)} ( ∫_0^b e^{−x²/(2(t−s))} dx ) db
     = (2/π) ∫_0^∞ ( ∫_0^{β√(s/(t−s))} e^{−ξ²/2} dξ ) e^{−β²/2} dβ.

If we differentiate the last expression with respect to s we find

h′(s) = (1/π) ∫_0^∞ e^{−β²/2} e^{−(β²/2)·s/(t−s)} · (β t)/((t−s)√(s(t−s))) dβ = 1/(π√(s(t−s))),  0 < s < t.

Since (d/ds) arcsin √(s/t) = (1/2)·1/√(s(t−s)) = (π/2) h′(s) and 0 = arcsin 0 = h(0), we are done.

Let (B t )t⩾0 be a BM1 . Recall from Definition 5.1 that a filtration (Gt )t⩾0 is admissible if FtB ⊂ Gt

and

B t − B s ⊥⊥ Gs

for all s < t.

Typically, one considers the following types of enlargements:

B Gt := σ(FtB , X) where X ⊥⊥ F∞ is a further random variable,

B Gt := Ft+ := ⋂ FuB , u>t

Gt :=

B Ft

:= σ(FtB , N ),

where N = {M ⊂ Ω : ∃N ∈ A , M ⊂ N, ℙ(N) = 0} is the system of all subsets of B

Ex. 5.1

measurable ℙ null sets; F t is the completion of FtB .

B

6.21 Lemma. Let (B_t)_{t⩾0} be a BM^d. Then F_{t+}^B and F̄_t^B are admissible in the sense of Definition 5.1.

Proof. 1° F_{t+}^B is admissible: let 0 ⩽ t ⩽ u, F ∈ F_{t+}, and f ∈ C_b(ℝ^d). Since F ∈ F_{t+ε} for all ε > 0, we find with (5.1)

𝔼[𝟙_F · f(B_{u+ε} − B_{t+ε})] = 𝔼[𝟙_F] · 𝔼[f(B_{u+ε} − B_{t+ε})].

Letting ε → 0 proves 5.1.b); condition 5.1.a) is trivially satisfied.

2° F̄_t^B is admissible: let 0 ⩽ t ⩽ u. By the very definition of the completion we can find for every F̃ ∈ F̄_t^B some F ∈ F_t^B and N ∈ N such that the symmetric difference (F̃ \ F) ∪ (F \ F̃) is in N. Therefore,

𝔼(𝟙_{F̃} · f(B_u − B_t)) = 𝔼(𝟙_F · f(B_u − B_t)) =(5.1) ℙ(F) 𝔼 f(B_u − B_t) = ℙ(F̃) 𝔼 f(B_u − B_t),

which proves 5.1.b); condition 5.1.a) is clear.

B

F t+ = F t holds for all t ⩾ 0. B

B

Proof. Let F ∈ F t+ . It is enough to show that 𝟙F = 𝔼 [𝟙F | F t ] almost surely. If this is B

B

B

true, 𝟙F is, up to a set of measure zero, F t measurable. Since N ⊂ F t ⊂ F t+ , we get

B F t+

B Ft .

⊂ Fix t < u and pick 0 ⩽ s1 < ⋅ ⋅ ⋅ < s m ⩽ t < u = t0 ⩽ t1 < ⋅ ⋅ ⋅ < t n and η1 , . . . , η m , ξ1 , . . . , ξ n ∈ ℝd , m, n ⩾ 1. Then we have m m m 󵄨 B 󵄨 B 𝔼 [e i ∑j=1 ⟨η j , B(s j )⟩ 󵄨󵄨󵄨 F t ] = e i ∑j=1 ⟨η j , B(s j )⟩ = 𝔼 [e i ∑j=1 ⟨η j , B(s j )⟩ 󵄨󵄨󵄨 F t+ ] .

(6.22)

As in the proof of Theorem 6.1 we get n

n

j=1

k=1

∑ ⟨ξ j , B(t j )⟩ = ∑ ⟨ζ k , B(t k ) − B(t k−1 )⟩ + ⟨ζ1 , B(u)⟩ ,

ζk = ξk + ⋅ ⋅ ⋅ + ξn .

Therefore,

󵄨󵄨 B n n 𝔼 [e i ∑j=1 ⟨ξ j , B(t j )⟩ 󵄨󵄨󵄨 F u ] = 𝔼 [e i ∑k=1 ⟨ζ k , B(t k )−B(t k−1 )⟩ 󵄨 6.21

n

󵄨󵄨 B i⟨ζ1 , B(u)⟩ 󵄨󵄨 F u ] e 󵄨󵄨

= ∏ 𝔼 [e i⟨ζ k , B(t k )−B(t k−1 )⟩ ] e i⟨ζ1 , B(u)⟩

(B1)

(2.15)

k=1 n

1

2

= ∏ e− 2 (t k −t k−1 )|ζ k | e i⟨ζ1 , B(u)⟩ . k=1

Exactly the same calculation with u replaced by t and t0 = t tells us

n 󵄨󵄨 B n 1 2 𝔼 [e i ∑j=1 ⟨ξ j , B(t j )⟩ 󵄨󵄨󵄨 F t ] = ∏ e− 2 (t k −t k−1 )|ζ k | e i⟨ζ1 , B(t)⟩ . 󵄨 k=1

Now we can use the backwards martingale convergence Theorem A.8 and let u ↓ t (along a countable sequence) in the first calculation. This gives 󵄨󵄨 B n n 𝔼 [e i ∑j=1 ⟨ξ j , B(t j )⟩ 󵄨󵄨󵄨 F t+ ] = lim 𝔼 [e i ∑j=1 ⟨ξ j , B(t j )⟩ 󵄨 u↓t n

1

2

󵄨󵄨 B 󵄨󵄨 F u ] 󵄨󵄨

= lim ∏ e− 2 (t k −t k−1 )|ζ k | e i⟨ζ1 , B(u)⟩ u↓t

k=1

82 | 6 Brownian motion as a Markov process n

1

2

= ∏ e− 2 (t k −t k−1 )|ζ k | e i⟨ζ1 , B(t)⟩ k=1

󵄨󵄨 B n = 𝔼 [e i ∑j=1 ⟨ξ j , B(t j )⟩ 󵄨󵄨󵄨 F t ] . 󵄨

Given 0 ⩽ u1 < u2 < ⋅ ⋅ ⋅ < u N , N ⩾ 1, and t ⩾ 0, then t splits the times u j into two subfamilies, s1 < ⋅ ⋅ ⋅ < s m ⩽ t < t1 < ⋅ ⋅ ⋅ < t n , where m + n = N and {u1 , . . . , u N } = {s1 , . . . , s m } ∪ {t1 , . . . , t n }. If we combine the above equality and (6.22), we see for Γ := (B(u1 ), . . . , B(u N )) ∈ ℝNd and all θ ∈ ℝNd 󵄨󵄨 B 𝔼 [e i⟨θ, Γ⟩ 󵄨󵄨󵄨 F t+ ] = 𝔼 [e i⟨θ, Γ⟩ 󵄨

󵄨󵄨 B 󵄨󵄨 F t ] . 󵄨󵄨

Using the definition and then the L2 symmetry of the conditional expectation we get B

for all F ∈ F t+

󵄨󵄨 B ∫ 𝟙F ⋅ e i⟨θ, Γ⟩ dℙ = ∫ 𝟙F ⋅ 𝔼 [e i⟨θ, Γ⟩ 󵄨󵄨󵄨 F t ] dℙ = ∫ 𝔼 [𝟙F 󵄨

󵄨󵄨 B 󵄨󵄨 F t ] ⋅ e i⟨θ, Γ⟩ dℙ. 󵄨󵄨

󵄨󵄨 B This shows that the image measures (𝟙F ℙ) ∘ Γ −1 and (𝔼 [𝟙F 󵄨󵄨󵄨 F t ] ℙ) ∘ Γ −1 have the 󵄨 same characteristic functions, i.e. 󵄨󵄨 B ∫ 𝟙F dℙ = ∫ 𝔼 [𝟙F 󵄨󵄨󵄨 F t ] dℙ 󵄨 A

(6.23)

A

holds for all A ∈ σ(Γ) = σ(B(u_1), ..., B(u_N)) and all 0 ⩽ u_1 < ⋯ < u_N and N ⩾ 1. Since ⋃_{u_1<⋯<u_N} σ(B(u_1), ..., B(u_N)) is an ∩-stable generator of F_∞^B, the uniqueness …

7 Brownian motion and transition semigroups

P_t u(x) := 𝔼^x u(B_t) = ∫ u(y) ℙ^x(B_t ∈ dy),  u ∈ B_b(ℝ^d), t ⩾ 0,  (7.1)

p_t(x, y) := (2πt)^{−d/2} e^{−|x−y|²/(2t)},  x, y ∈ ℝ^d, t > 0.  (7.2)

Throughout this chapter we assume that (B t )t⩾0 is a d-dimensional Brownian motion which is adapted to the filtration (Ft )t⩾0 . We will assume that (Ft )t⩾0 is right continuous, i.e. Ft = Ft+ for all t ⩾ 0, cf. Theorem 6.22 for a sufficient condition. Many results of this chapter will be true, with only minor changes in the proofs, for general Markov processes.

7.1 The semigroup

An (operator) semigroup (P_t)_{t⩾0} on a Banach space (B, ‖·‖) is a family of linear operators P_t : B → B, t ⩾ 0, which satisfies

P_t P_s = P_{t+s}  and  P_0 = id.  (7.3)

In this chapter we will consider the following Banach spaces: Bb (ℝd ) the family of all bounded Borel measurable functions f : ℝd → ℝ equipped with the uniform norm ‖ ⋅ ‖∞ ; C∞ (ℝd ) the family of all continuous functions f : ℝd → ℝ vanishing at infinity, i.e. lim|x|→∞ u(x) = 0, equipped with the uniform norm ‖ ⋅ ‖∞ . Unless it is too ambiguous, we write Bb and C∞ instead of Bb (ℝd ) and C∞ (ℝd ).

7.1 Lemma. Let (B_t)_{t⩾0} be a d-dimensional Brownian motion with filtration (F_t)_{t⩾0}. Then (7.1) defines a semigroup of operators on B_b(ℝ^d).

Proof. It is obvious that P_t is a linear operator. Since x ↦ e^{−|x−y|²/(2t)} is measurable, we see that

P_t u(x) = ∫_{ℝ^d} u(y) ℙ^x(B_t ∈ dy) = (2πt)^{−d/2} ∫_{ℝ^d} u(y) e^{−|x−y|²/(2t)} dy

Ex. 7.1

is again a Borel measurable function. (For a general Markov process we must use (6.4c) with a Borel measurable g_{u,t}(⋅).) The semigroup property (7.3) is now a direct consequence of the Markov property (6.4): for s, t ⩾ 0 and u ∈ B_b(ℝ^d)

P_{t+s} u(x) = 𝔼^x u(B_{t+s}) = 𝔼^x[𝔼^x(u(B_{t+s}) | F_s)]   (tower)
             = 𝔼^x[𝔼^{B_s}(u(B_t))]                         (6.4)
             = 𝔼^x[P_t u(B_s)]
             = P_s P_t u(x).
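The semigroup property can also be seen "by hand" on a computer: P_t acts on functions of one real variable by convolution with an N(0, t) kernel. The following discretisation sketch is not from the book and assumes Python with NumPy/SciPy; the test function and grid are arbitrary, and the small discrepancy printed at the end is purely discretisation and boundary error.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

# P_t acts by convolution with a Gaussian of standard deviation sqrt(t);
# on a uniform grid this is a Gaussian filter with sigma = sqrt(t)/dx samples.
x = np.linspace(-15, 15, 6001)
dx = x[1] - x[0]
u = np.exp(-x**2) * np.cos(3 * x)            # a test function decaying at infinity

def P(t, f):
    return gaussian_filter1d(f, sigma=np.sqrt(t) / dx, mode='nearest')

s, t = 0.3, 0.7
lhs = P(t, P(s, u))      # P_t P_s u
rhs = P(t + s, u)        # P_{t+s} u
print(np.max(np.abs(lhs - rhs)))   # small, up to discretisation error
```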

For many properties of the semigroup the Banach space Bb (ℝd ) is too big, and we have to consider the smaller space C∞ (ℝd ).

7.2 Lemma. A d-dimensional Brownian motion (B_t)_{t⩾0} is uniformly stochastically continuous, i.e.

lim_{t→0} sup_{x∈ℝ^d} ℙ^x(|B_t − x| ⩾ δ) = 0  for all δ > 0.  (7.4)

Proof. This follows immediately from the translation invariance of a Brownian motion and Chebyshev's inequality

ℙ^x(|B_t − x| ⩾ δ) = ℙ(|B_t| ⩾ δ) ⩽ 𝔼|B_t|²/δ² = td/δ².

Ex. 7.2 Ex. 7.3 Ex. 7.4 Ex. 7.5

After this preparation we can discuss the properties of the semigroup (P t )t⩾0 .

7.3 Proposition. Let (P t )t⩾0 be the transition semigroup of a d-dimensional Brownian motion (B t )t⩾0 and C∞ = C∞ (ℝd ), Cb = Cb (ℝd ) and Bb = Bb (ℝd ). For all t ⩾ 0 a) P t is conservative: P t 1 = 1. b) P t is a contraction on Bb : ‖P t u‖∞ ⩽ ‖u‖∞ for all u ∈ Bb . c) P t is positivity preserving: for u ∈ Bb we have u ⩾ 0 ⇒ P t u ⩾ 0; d) P t is sub-Markovian: for u ∈ Bb we have 0 ⩽ u ⩽ 1 ⇒ 0 ⩽ P t u ⩽ 1; e) P t has the Feller property: if u ∈ C∞ then P t u ∈ C∞ ; f) P t is strongly continuous on C∞ : limt→0 ‖P t u − u‖∞ = 0 for all u ∈ C∞ ; g) P t has the strong Feller property: if u ∈ Bb , then P t u ∈ Cb .

7.4 Remark. A semigroup of operators (P t )t⩾0 on Bb (ℝd ) which satisfies a)–d) is called a Markov semigroup; b)–d) is called a sub-Markov semigroup; b)–f) is called a Feller semigroup; b)–d), g) is called a strong Feller semigroup.

Proof of Proposition 7.3. a) We have P t 1(x) = 𝔼x 𝟙ℝd (B t ) = ℙx (B t ∈ ℝd ) = 1 for all x ∈ ℝd . b) We have |P t u(x)| = | 𝔼x u(B t )| ⩽ ∫ |u(B t )| dℙx ⩽ ‖u‖∞ for all x ∈ ℝd .


c) Since u ⩾ 0 we see that P t u(x) = 𝔼x u(B t ) ⩾ 0.

d) Positivity follows from c). The proof of b) shows that P t u ⩽ 1 if u ⩽ 1.

e) Note that P_t u(x) = 𝔼^x u(B_t) = 𝔼 u(B_t + x). Since |u(B_t + x)| ⩽ ‖u‖_∞ is integrable, the dominated convergence theorem can be used to show

lim_{x→y} 𝔼 u(B_t + x) = 𝔼 u(B_t + y)  and  lim_{|x|→∞} 𝔼 u(B_t + x) = 0.

f) Fix ε > 0. Since u ∈ C_∞(ℝ^d) is uniformly continuous, there is some δ > 0 such that |u(x) − u(y)| < ε for all |x − y| < δ. Thus,

‖P_t u − u‖_∞ = sup_{x∈ℝ^d} |𝔼^x u(B_t) − u(x)|
             ⩽ sup_{x∈ℝ^d} 𝔼^x |u(B_t) − u(x)|
             ⩽ sup_{x∈ℝ^d} ( ∫_{{|B_t−x|<δ}} |u(B_t) − u(x)| dℙ^x + ∫_{{|B_t−x|⩾δ}} |u(B_t) − u(x)| dℙ^x )
             ⩽ ε + 2‖u‖_∞ sup_{x∈ℝ^d} ℙ^x(|B_t − x| ⩾ δ),

and by (7.4) the last term tends to 0 as t → 0.

g) Fix R > 0 and consider only points z with |z| ⩽ R. Let u ∈ B_b(ℝ^d). Then P_t u is given by

P_t u(z) = p_t ⋆ u(z) = (2πt)^{−d/2} ∫ u(y) e^{−|z−y|²/(2t)} dy.

If |y| ⩾ 2R we get |z − y|² ⩾ (|y| − |z|)² ⩾ ¼|y|², which shows that the integrand is bounded by

|u(y)| e^{−|z−y|²/(2t)} ⩽ ‖u‖_∞ (𝟙_{𝔹(0,2R)}(y) + e^{−|y|²/(8t)} 𝟙_{𝔹^c(0,2R)}(y))

and the right-hand side is integrable. Therefore we can use dominated convergence to get lim_{z→x} P_t u(z) = P_t u(x). This proves that P_t u ∈ C_b(ℝ^d).

The semigroup notation offers a simple way to express the finite dimensional distributions of a Markov process. Let (B t )t⩾0 be a d-dimensional Brownian motion, s < t and f, g ∈ Bb (ℝd ). Then 𝔼x [f(B s )g(B t )] = 𝔼x [f(B s ) 𝔼B s g(B t−s )]

= 𝔼x [f(B s )P t−s g(B s )] = P s [fP t−s g](x).

If we iterate this, we get the following counterpart of Theorem 6.4.

7.5 Theorem. Let (B_t)_{t⩾0} be a BM^d, t_0 = 0 < t_1 < ⋯ < t_n, C_1, ..., C_n ∈ B(ℝ^d) and x ∈ ℝ^d. If (P_t)_{t⩾0} is the transition semigroup of (B_t)_{t⩾0}, then  [Ex. 7.6]

ℙx (B t1 ∈ C1 , . . . , B t n ∈ C n ) = P t1 [𝟙C1 P t2 −t1 [𝟙C2 . . . P t n −t n−1 𝟙C n ] . . . ](x).
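The nested-operator formula of Theorem 7.5 can be evaluated directly on a grid. The sketch below is not part of the book and assumes Python with NumPy/SciPy; it computes ℙ^x(B_{t_1} ∈ C_1, B_{t_2} ∈ C_2) for an arbitrary choice of x, t_1, t_2, C_1, C_2 by applying the Gaussian kernel twice, and compares the result with a Monte Carlo estimate.

```python
import numpy as np
from scipy.stats import norm

grid = np.linspace(-10, 10, 4001)
dy = grid[1] - grid[0]

def P(t, f_vals, x):
    """(P_t f)(x) = integral of f(y) p_t(x, y) dy, approximated on the grid."""
    return np.sum(f_vals * norm.pdf(grid, loc=x, scale=np.sqrt(t))) * dy

t1, t2 = 0.5, 2.0
C1 = (grid > 0.0)                   # C1 = (0, infinity)
C2 = (np.abs(grid) < 1.0)           # C2 = (-1, 1)

inner = np.array([P(t2 - t1, C2.astype(float), y) for y in grid])   # P_{t2-t1} 1_{C2}
prob = P(t1, C1 * inner, x=0.2)     # P_{t1}[1_{C1} P_{t2-t1} 1_{C2}](x) with x = 0.2

# Monte Carlo comparison
rng = np.random.default_rng(4)
B1 = 0.2 + rng.normal(0, np.sqrt(t1), 200_000)
B2 = B1 + rng.normal(0, np.sqrt(t2 - t1), 200_000)
print(prob, np.mean((B1 > 0) & (np.abs(B2) < 1)))
```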

7.6 Remark (From semigroups to processes). So far we have considered semigroups given by a Brownian motion. It is an interesting question whether we can construct a Markov process starting from a semigroup of operators. If (T t )t⩾0 is a Markov semigroup where p t (x, C) := T t 𝟙C (x) is an everywhere defined kernel such that C 󳨃→ p t (x, C) is a probability measure for all x ∈ ℝd ,

(t, x) 󳨃→ p t (x, C) is measurable for all C ∈ B(ℝd ),

this is indeed possible. We can use the formula from Theorem 7.5 to define for every x ∈ ℝd p xt1 ,...,t n (C1 × ⋅ ⋅ ⋅ × C n ) := T t1 [𝟙C1 T t2 −t1 [𝟙C2 . . . T t n −t n−1 𝟙C n ] . . . ](x),

Ex. 7.7

where 0 ⩽ t1 < t2 < ⋅ ⋅ ⋅ < t n and C1 , . . . , C n ∈ B(ℝd ). For fixed x ∈ ℝd , this defines a family of finite dimensional measures on (ℝd⋅n , B n (ℝd )) which satisfies the Kolmogorov consistency conditions (4.6). The permutation property is obvious, while the projectivity condition follows from the semigroup property: p xt1 ,...,t k ,...,t n (C1 × ⋅ ⋅ ⋅ × ℝd × ⋅ ⋅ ⋅ × C n )

= T t1 [𝟙C1 T t2 −t1 [𝟙C2 . . . 𝟙C k−1 T t k −t k−1 𝟙ℝd T t k+1 −t k 𝟙C k+1 . . . T t n −t n−1 𝟙C n ] . . . ](x) = T t k −t k−1 T t k+1 −t k = T t k+1 −t k−1

= T t1 [𝟙C1 T t2 −t1 [𝟙C2 . . . 𝟙C k−1 T t k+1 −t k−1 𝟙C k+1 . . . T t n −t n−1 𝟙C n ] . . . ](x) = p xt1 ,...,t k−1 ,t k+1 ,...,t n (C1 × ⋅ ⋅ ⋅ × C k−1 × C k+1 × ⋅ ⋅ ⋅ × C n ).

Now we can use Kolmogorov’s construction, cf. Theorem 4.8 and Corollary 4.9, to get measures ℙx and a canonical process (X t )t⩾0 such that ℙx (X0 = x) = 1 for any x ∈ ℝd . We still have to check that the process X = (X t )t⩾0 is a Markov process. Denote by Ft := σ(X s : s ⩽ t) the canonical filtration of X. We will show that p t−s (X s , C) = ℙx (X t ∈ C | Fs ) for all s < t and C ∈ B(ℝd ).

Since Fs is generated by an ∩-stable generator G consisting of sets of the form m

G = ⋂{X s j ∈ C j }, j=1

m ⩾ 1, 0 ⩽ s1 < ⋅ ⋅ ⋅ < s m = s, C1 , . . . , C m ∈ B(ℝd ),

it is enough to show that for every G ∈ G

∫ p t−s (X s , C) dℙx = ∫ 𝟙C (X t ) dℙx . G

G

(7.5)


Because of the form of G and since s = s m , this is the same as m

m

j=1

j=1

∫ ∏ 𝟙C j (X s j )p t−s m (X s m , C) dℙx = ∫ ∏ 𝟙C j (X s j )𝟙C (X t ) dℙx .

By the definition of the finite dimensional distributions we see m

∫ ∏𝟙C j (X s j )p t−s m (X s m , C) dℙx j=1

= T s1 [𝟙C1 T s2 −s1 [𝟙C2 . . . 𝟙C m−1 T s m −s m−1 [𝟙C m p t−s m ( ⋅ , C)] . . . ]](x) = T s1 [𝟙C1 T s2 −s1 [𝟙C2 . . . 𝟙C m−1 T s m −s m−1 [𝟙C m T t−s m 𝟙C ] . . . ]](x)

= p xs1 ,...,s m ,t (C1 × ⋅ ⋅ ⋅ × C m−1 × C m × C) m

= ∫ ∏ 𝟙C j (X s j )𝟙C (X t ) dℙx . j=1

If we take the conditional expectation 𝔼x (⋅ ⋅ ⋅ | X s ) in (7.5), use the tower property and compare the result with (7.5), then we see that ℙx (X t ∈ C | X s ) = p t−s (X s , C) = ℙx (X t ∈ C | Fs )

holds for all s < t and C ∈ B(ℝd ). This gives (6.4a).

7.7 Remark. Feller processes. Denote by (T t )t⩾0 a Feller semigroup on C∞ (ℝd ). Remark 7.6 allows us to construct a d-dimensional Markov process (X t )t⩾0 whose transition semigroup is (T t )t⩾0 . The process (X t )t⩾0 is called a Feller process. Let us briefly outline this construction: by the Riesz representation theorem, cf. e.g. Rudin [225, Theorem 2.14], there exists a unique (compact inner regular) kernel p t (x, ⋅) such that T t u(x) = ∫ u(y) p t (x, dy),

u ∈ C∞ (ℝd ).

(7.6)

Obviously, (7.6) extends T t to Bb (ℝd ), and (T t )t⩾0 becomes a Feller semigroup in the sense of Remark 7.4. Let K ⊂ ℝd be compact. Then U n := {x + y : x ∈ K, y ∈ 𝔹(0, 1/n)} is open and u n (x) :=

d(x, U nc ) , d(x, K) + d(x, U nc )

where

d(x, A) := inf |x − a|, a∈A

Ex. 7.8

(7.7)

is a sequence of continuous functions such that inf n⩾1 u n = 𝟙K . By monotone convergence, p t (x, K) = inf n⩾1 T t u n (x), and (t, x) 󳨃→ p t (x, K) is measurable. This follows from the joint continuity of (t, x) 󳨃→ T t u(x) for u ∈ C∞ (ℝd ), cf. Problem 7.9. Since the kernel is inner regular, we have p t (x, C) = sup{p t (x, K) : K ⊂ C, K compact} for all Borel sets C ⊂ ℝd and, therefore, (t, x) 󳨃→ p t (x, C) is measurable.

Ex. 7.9

Obviously, p_t(x, ℝ^d) ⩽ 1. If p_t(x, ℝ^d) < 1, the following trick can be used to make Remark 7.6 work: consider the one-point compactification of ℝ^d by adding the point ∂ and define

p_t^∂(x, {∂}) := 1 − p_t(x, ℝ^d)   for x ∈ ℝ^d, t ⩾ 0,
p_t^∂(∂, C) := δ_∂(C)              for C ∈ B(ℝ^d ∪ {∂}), t ⩾ 0,    (7.8)
p_t^∂(x, C) := p_t(x, C)           for x ∈ ℝ^d, C ∈ B(ℝ^d), t ⩾ 0.

With some effort we can always get a version of (X t )t⩾0 with right continuous sample paths. Moreover, a Feller process is a strong Markov process, cf. Section A.6 in the appendix.

7.2 The generator

Let φ : [0, ∞) → ℝ be a continuous function which satisfies the functional equation φ(t)φ(s) = φ(t + s) and φ(0) = 1. Then there is a unique a ∈ ℝ such that φ(t) = e^{at}. Obviously,

a = (d⁺/dt) φ(t)|_{t=0} = lim_{t↓0} (φ(t) − 1)/t.

Since a strongly continuous semigroup (P_t)_{t⩾0} on a Banach space (B, ‖·‖) also satisfies the functional equation P_t P_s u = P_{t+s} u, it is not too wild a guess that – in an appropriate sense – P_t u = e^{tA} u for some operator A in B. In order to keep things technically simple, we will consider only Feller semigroups.

7.8 Definition. Let (P_t)_{t⩾0} be a Feller semigroup on C_∞(ℝ^d). Then

Au := lim_{t→0} (P_t u − u)/t  (the limit is taken in (C_∞(ℝ^d), ‖·‖_∞)),  (7.9a)

D(A) := {u ∈ C_∞(ℝ^d) | ∃g ∈ C_∞(ℝ^d) : lim_{t→0} ‖(P_t u − u)/t − g‖_∞ = 0}  (7.9b)

is the (infinitesimal) generator of the semigroup (P_t)_{t⩾0}.

Ex. 7.10

Clearly, (7.9) defines a linear operator A : D(A) → C∞ (ℝd ).

7.9 Example. Let (B_t)_{t⩾0}, B_t = (B_t^1, ..., B_t^d), be a d-dimensional Brownian motion and denote by P_t u(x) = 𝔼^x u(B_t) the transition semigroup and by (A, D(A)) the generator. Then C²_∞(ℝ^d) ⊂ D(A) where

A|_{C²_∞(ℝ^d)} = ½∆ = ½ ∑_{j=1}^d ∂_j²,  ∂_j = ∂/∂x_j,

and C²_∞(ℝ^d) := {u ∈ C_∞(ℝ^d) : ∂_j u, ∂_j ∂_k u ∈ C_∞(ℝ^d), j, k = 1, ..., d}.

Indeed: let u ∈ C²_∞(ℝ^d). By the Taylor formula we get for some point ξ_t = ξ(x, t, ω) between x and x + B_t(ω)

| 𝔼[ (u(B_t + x) − u(x))/t − ½ ∑_{j=1}^d ∂_j² u(x) ] |
  = | 𝔼[ (1/t) ∑_{j=1}^d ∂_j u(x) B_t^j + (1/(2t)) ∑_{j,k=1}^d ∂_j ∂_k u(ξ_t) B_t^j B_t^k − ½ ∑_{j=1}^d ∂_j² u(x) ] |
  = ½ | 𝔼[ ∑_{j,k=1}^d (∂_j ∂_k u(ξ_t) − ∂_j ∂_k u(x)) (B_t^j B_t^k)/t ] |.

In the last equality we use 𝔼 B_t^j = 0 and 𝔼 B_t^j B_t^k = δ_{jk} t. From the integral representation of the remainder term in the Taylor formula, ½ ∂_j ∂_k u(ξ_t) = ∫_0^1 ∂_j ∂_k u(x + rB_t)(1 − r) dr, we see that ω ↦ ∂_j ∂_k u(ξ_t(ω)) is measurable. Applying the Cauchy–Schwarz inequality two times, we find

  ⩽ ½ 𝔼[ ( ∑_{j,k=1}^d |∂_j ∂_k u(ξ_t) − ∂_j ∂_k u(x)|² )^{1/2} ( ∑_{j,k=1}^d (B_t^j B_t^k)²/t² )^{1/2} ]
  ⩽ ½ ( ∑_{j,k=1}^d 𝔼(|∂_j ∂_k u(ξ_t) − ∂_j ∂_k u(x)|²) )^{1/2} ( 𝔼(|B_t|⁴)/t² )^{1/2},

where we used ( ∑_{j,k=1}^d (B_t^j B_t^k)²/t² )^{1/2} = ∑_{l=1}^d (B_t^l)²/t = |B_t|²/t. By the scaling property 2.16, 𝔼(|B_t|⁴) = t² 𝔼(|B_1|⁴) < ∞. Since

|∂_j ∂_k u(ξ_t) − ∂_j ∂_k u(x)|² ⩽ 4‖∂_j ∂_k u‖²_∞  for all j, k = 1, ..., d,

and since the expression on the left converges (uniformly in x) to 0 as t → 0, we get by dominated convergence

lim_{t→0} sup_{x∈ℝ^d} | 𝔼[ (u(B_t + x) − u(x))/t − ½ ∑_{j=1}^d ∂_j² u(x) ] | = 0.

This shows that C2∞ (ℝd ) ⊂ D(A) and A = 21 ∆ on C2∞ (ℝd ). We will see in Example 7.16 below that D(A) = C2∞ (ℝ) and A = 12 ∆ if d = 1. For d > 1 we have D(A) ⊋ C2∞ (ℝd ), cf. [121, pp. 234–5] and Example 7.25. In general, D(A) is the Sobolev-type space {u ∈ C∞ (ℝd ) : u, ∆u ∈ C∞ (ℝd )} where ∆u is understood in the sense of distributions, cf. Example 7.17.
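The convergence (P_t u − u)/t → ½∆u can be observed numerically. The following sketch is not part of the book; it assumes Python with NumPy/SciPy, takes d = 1 and an arbitrary smooth test function, and evaluates P_t u(x_0) by quadrature against the Gaussian transition density for decreasing t.

```python
import numpy as np
from scipy.stats import norm

def u(x):   return np.exp(-x**2)
def upp(x): return (4 * x**2 - 2) * np.exp(-x**2)     # second derivative of u

x0 = 0.5
y = np.linspace(-20, 20, 200_001)
dy = y[1] - y[0]

for t in (1.0, 0.1, 0.01, 0.001):
    Pt_u = np.sum(u(y) * norm.pdf(y, loc=x0, scale=np.sqrt(t))) * dy   # P_t u(x0)
    # the difference quotient approaches (1/2) u''(x0) as t -> 0
    print(t, (Pt_u - u(x0)) / t, 0.5 * upp(x0))
```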

Since every Feller semigroup (P t )t⩾0 is given by a family of measurable kernels, cf. t Remark 7.7, we can define linear operators of the type L = ∫0 P s ds in the following way: for u ∈ Bb (ℝd ) t

Lu is a Borel function given by x 󳨃→ Lu(x) := ∫∫ u(y) p s (x, dy) ds. 0

94 | 7 Brownian motion and transition semigroups The following lemma describes the fundamental relation between generators and semigroups. 7.10 Lemma. Let (P t )t⩾0 be a Feller semigroup with generator (A, D(A)). a) For all u ∈ D(A) and t > 0 we have P t u ∈ D(A). Moreover, d P t u = AP t u = P t Au dt t

for all u ∈ D(A), t > 0.

b) For all u ∈ C∞ (ℝd ) we have ∫0 P s u ds ∈ D(A) and

(7.10)

t

u ∈ C∞ (ℝd ), t > 0

(7.11a)

= ∫ AP s u ds,

u ∈ D(A), t > 0

(7.11b)

= ∫ P s Au ds,

u ∈ D(A), t > 0.

(7.11c)

P t u − u = A ∫ P s u ds, 0

t

0

t

0

Proof. a) Let 0 < ϵ < t and u ∈ D(A). The semigroup and contraction properties give 󵄩󵄩 P u − P u 󵄩󵄩 P P u − P u 󵄩󵄩 󵄩󵄩 t t 󵄩󵄩 󵄩󵄩 󵄩󵄩 t+ϵ 󵄩󵄩󵄩 ϵ t − P Au − P Au = 󵄩 󵄩󵄩 󵄩 t t 󵄩󵄩 󵄩󵄩 󵄩 󵄩󵄩∞ ϵ ϵ 󵄩 󵄩∞ 󵄩󵄩 󵄩󵄩 󵄩󵄩 P u − u ϵ 󵄩 󵄩 − P t Au󵄩󵄩󵄩 = 󵄩󵄩󵄩P t ϵ 󵄩󵄩∞ 󵄩󵄩 󵄩󵄩 P u − u 󵄩󵄩 󵄩󵄩 ϵ 󵄩󵄩 ⩽ 󵄩󵄩 − Au󵄩󵄩 󳨀󳨀󳨀→ 0. 󵄩󵄩 󵄩󵄩∞ ϵ→0 ϵ +

By the definition of the generator, P t u ∈ D(A) and ddt P t u = AP t u = P t Au. Similarly, 󵄩󵄩 P u − P u 󵄩󵄩 t−ϵ 󵄩󵄩 t 󵄩 − P t Au󵄩󵄩󵄩 󵄩󵄩 󵄩󵄩 󵄩󵄩∞ ϵ 󵄩󵄩 󵄩󵄩 P u − u ϵ 󵄩 󵄩 󵄩 󵄩 − P t−ϵ Au󵄩󵄩󵄩 + 󵄩󵄩󵄩P t−ϵ Au − P t−ϵ P ϵ Au󵄩󵄩󵄩∞ ⩽ 󵄩󵄩󵄩P t−ϵ 󵄩󵄩∞ 󵄩󵄩 ϵ 󵄩󵄩󵄩 󵄩󵄩󵄩 P ϵ u − u 󵄩 󵄩 − Au󵄩󵄩󵄩 + 󵄩󵄩󵄩Au − P ϵ Au󵄩󵄩󵄩∞ 󳨀󳨀󳨀→ 0. ⩽ 󵄩󵄩󵄩 ϵ→0 ϵ 󵄩󵄩∞ 󵄩󵄩 where we use the strong continuity in the last step. − This shows that we have ddt P t u = AP t u = P t Au. b) Let u ∈ C∞ (ℝd ) and t, ϵ > 0. By Fubini’s theorem t

t

P ϵ ∫ P s u(x) ds = ∫∫∫ u(y)p s (z, dy) ds p ϵ (x, dz) 0

0

t

t

= ∫∫∫ u(y)p s (z, dy) p ϵ (x, dz) ds = ∫ P ϵ P s u(x) ds, 0

0

7.2 The generator | 95

and so, t

t

0

0

P ϵ − id 1 ∫ P s u(x) ds = ∫ (P s+ϵ u(x) − P s u(x)) ds ϵ ϵ t+ϵ

ϵ

t

0

1 1 = ∫ P s u(x) ds − ∫ P s u(x) ds. ϵ ϵ

The contraction property and the strong continuity give for r ⩾ 0 󵄨󵄨 󵄨󵄨 r+ϵ 󵄨󵄨 1 r+ϵ 󵄨󵄨 1 󵄨 󵄨󵄨 󵄨󵄨 ∫ P s u(x) ds − P r u(x)󵄨󵄨󵄨 ⩽ ∫ |P s u(x) − P r u(x)| ds 󵄨󵄨 ϵ 󵄨󵄨 ϵ 󵄨󵄨 󵄨󵄨 r r ⩽ sup ‖P s u − P r u‖∞

(7.12)

r⩽s⩽r+ϵ

⩽ sup ‖P s u − u‖∞ 󳨀󳨀󳨀→ 0. ϵ→0

s⩽ϵ

t

This shows that ∫0 P s u ds ∈ D(A) as well as (7.11a). If u ∈ D(A), then t

(7.10)

t

t

(7.10)

∫ P s Au(x) ds = ∫ AP s u(x) ds = ∫ 0

0

0

d P s u(x) ds = P t u(x) − u(x) ds t

(7.11a)

= A ∫ P s u(x) ds. 0

This proves (7.11c) and (7.11b).

7.11 Corollary. Let (P t )t⩾0 be a Feller semigroup with generator (A, D(A)). a) D(A) is a dense subset of C∞ (ℝd ). b) (A, D(A)) is a closed operator, i.e. (u n )n⩾1 ⊂ D(A), limn→∞ u n = u,

(Au n )n⩾1 is a Cauchy sequence

}

󳨐⇒

{

u ∈ D(A) and

Au = lim Au n .

Ex. 23.3

(7.13)

n→∞

c) If (T t )t⩾0 is another Feller semigroup with generator (A, D(A)), then P t = T t . ϵ

Proof. a) Let u ∈ C∞ (ℝd ). From Lemma 7.10.b) we see u ϵ := ϵ−1 ∫0 P s u ds ∈ D(A) and, by an argument similar to (7.12), we find limϵ→0 ‖u ϵ − u‖∞ = 0.

b) Let (u n )n⩾1 ⊂ D(A) be such that limn→∞ u n = u and limn→∞ Au n = w uniformly. Therefore P t u(x) − u(x) =

(7.11c)

=

lim (P t u n (x) − u n (x))

n→∞

t

t

lim ∫ P s Au n (x) ds = ∫ P s w(x) ds.

n→∞

0

0

96 | 7 Brownian motion and transition semigroups As in (7.12) we find

t

lim

t→0

P t u(x) − u(x) 1 = lim ∫ P s w(x) ds = w(x) t→0 t t 0

uniformly for all x, thus u ∈ D(A) and Au = w.

c) Let 0 < s < t and assume that 0 < |h| < min{t − s, s}. Then we have for u ∈ D(A) Ex. 7.11

T s u − T s−h u P t−s T s u − P t−s+h T s−h u P t−s − P t−s+h = T s u + P t−s+h . h h h

Letting h → 0, Lemma 7.10 shows that the function s 󳨃→ P t−s T s u, u ∈ D(A), is differentiable, and

Thus,

d d d P t−s T s u = ( P t−s ) T s u + P t−s T s u = −P t−s AT s u + P t−s AT s u = 0. ds ds ds t

0=∫ 0

󵄨󵄨s=t d P t−s T s u(x) ds = P t−s T s u(x)󵄨󵄨󵄨 = T t u(x) − P t u(x). 󵄨s=0 ds

Therefore, T t u = P t u for all t > 0 and u ∈ D(A). Since D(A) is dense in C∞ (ℝd ), we can approximate u ∈ C∞ (ℝd ) by a sequence (u j )j⩾1 ⊂ D(A), i.e. T t u = lim T t u j = lim P t u j = P t u. j→∞

j→∞

7.3 The resolvent Let (P t )t⩾0 be a Feller semigroup. Then the integral ∞

U α u(x) := ∫ e−αt P t u(x) dt, 0

α > 0, x ∈ ℝd , u ∈ Bb (ℝd ),

(7.14)

exists and defines a linear operator U α : Bb (ℝd ) → Bb (ℝd ).

7.12 Definition. Let (P t )t⩾0 be a Feller semigroup and α > 0. The operator U α given by (7.14) is the α-potential operator. The α-potential operators share many properties of the semigroup. In particular, we have the following counterpart of Proposition 7.3.

Ex. 7.13 Ex. 7.14 Ex. 7.16

7.13 Proposition. Let (P t )t⩾0 be a Feller semigroup with α-potential operators (U α )α>0 and generator (A, D(A)). a) αU α is conservative whenever (P t )t⩾0 is conservative: αU α 1 = 1;

7.3 The resolvent | 97

b) c) d) e) f)

αU α is a contraction on Bb : ‖αU α u‖∞ ⩽ ‖u‖∞ for all u ∈ Bb ; αU α is positivity preserving: for u ∈ Bb we have u ⩾ 0 ⇒ αU α u ⩾ 0; αU α has the Feller property: if u ∈ C∞ , then αU α u ∈ C∞ ; αU α is strongly continuous on C∞ : limα→∞ ‖αU α u − u‖∞ = 0; (U α )α>0 is the resolvent: for all α > 0, α id −A is invertible with a bounded inverse U α = (α id −A)−1 ; g) Resolvent equation: U α u − U β u = (β − α)U β U α u for all α, β > 0 and u ∈ Bb . In particular, limβ→α ‖U α u − U β u‖∞ = 0 for all u ∈ Bb ; h) There is a one-to-one relationship between (P t )t⩾0 and (U α )α>0 ; i) The resolvent (U α )α>0 is sub-Markovian if, and only if, (P t )t⩾0 is sub-Markovian: let 0 ⩽ u ⩽ 1. Then 0 ⩽ P t u ⩽ 1 ∀t ⩾ 0 if, and only if, 0 ⩽ αU α u ⩽ 1 ∀α > 0.

Proof. a)–c) follow immediately from the integral formula (7.14) for U α u(x) and the corresponding properties of the Feller semigroup (P t )t⩾0 .

d) Since P t u ∈ C∞ and |αe−αt P t u(x)| ⩽ αe−αt ‖u‖∞ ∈ L1 (dt), this follows from dominated convergence. e) By the integral representation (7.14) and the triangle inequality we find ∞



‖αU α u − u‖∞ ⩽ ∫ αe−αt ‖P t u − u‖∞ dt = ∫ e−s ‖P s/α u − u‖∞ ds. 0

0

As ‖P s/α u − u‖∞ ⩽ 2‖u‖∞ and limα→∞ ‖P s/α u − u‖∞ = 0, the claim follows with dominated convergence. f) For every ϵ > 0, α > 0 and u ∈ C∞ (ℝd ) we have ∞

1 1 (P ϵ (U α u) − U α u) = ∫ e−αt (P t+ϵ u − P t u) dt ϵ ϵ 0

=





1 1 ∫ e−α(s−ϵ) P s u ds − ∫ e−αs P s u ds ϵ ϵ ϵ

0





ϵ

1 1 1 = ∫ e−α(s−ϵ) P s u ds − ∫ e−αs P s u ds − ∫ e−α(s−ϵ) P s u ds ϵ ϵ ϵ 0

=

0



0

ϵ

e αϵ − 1 e αϵ ∫ e−αs P s u ds − ∫ e−αs P s u ds. ϵ ϵ

→α, ϵ→0

0

0

→u, ϵ→0 as in (7.12)

This shows that we have in the sense of uniform convergence lim

ϵ→0

Pϵ Uα u − Uα u = αU α u − u, ϵ

98 | 7 Brownian motion and transition semigroups i.e. U α u ∈ D(A) and AU α u = αU α u − u. Hence, (α id −A)U α u = u for all u ∈ C∞ . Since for u ∈ D(A) ∞

U α Au = ∫ e 0

−αt

(7.11)



P t Au dt = A ∫ e−αt P t u dt = AU α u, 0

it is easy to see that u = (α id −A)U α u = U α (α id −A)u. This shows that U α is the left and right inverse of α id −A. Since U α is bounded, α id −A has a bounded inverse, i.e. α > 0 is in the resolvent set of A, and (U α )α>0 is the resolvent. g) Let α, β > 0 and u ∈ Bb . Using the definition of U α and Fubini’s theorem we see that ∞∞

(β − α)U α U β u(x) = (β − α) ∫ ∫ e−sα e−tβ P s+t u(x) ds dt 0 0 ∞∞

= (β − α) ∫ ∫ e−(r−t)α e−tβ P r u(x) dr dt 0 t ∞ r

= (β − α) ∫ ∫ e−t(β−α) e−rα P r u(x) dt dr 0 0



= ∫ [ − e−t(β−α) ] 0 ∞

t=r

t=0

e−rα P r u(x) dr

= ∫ (e−rα − e−rβ )P r u(x) dr 0

= U α u(x) − U β u(x).

Since αU α and βU β are contractive, we find ‖U α u − U β u‖∞ =

󵄨󵄨 1 1 󵄨󵄨 |β − α| 󵄨 󵄨 ‖αU α βU β u‖∞ ⩽ 󵄨󵄨󵄨 − 󵄨󵄨󵄨 ‖u‖∞ 󳨀󳨀󳨀󳨀→ 0. 󵄨󵄨 α β 󵄨󵄨 βα β→α

h) Formula (7.14) shows that (P t )t⩾0 defines (U α )α>0 . On the other hand, (7.14) means that ν(α) := U α u(x) is the Laplace transform of ϕ(t) := P t u(x) for every x ∈ ℝd . Therefore, ν determines ϕ(t) for Lebesgue almost all t ⩾ 0, cf. [237, Proposition 1.2] for the uniqueness of the Laplace transform. Since ϕ(t) = P t u(x) is continuous for u ∈ C∞ , this shows that P t u(x) is uniquely determined for all u ∈ C∞ , and so is (P t )t⩾0 , cf. Remark 7.7. i) Assume that u ∈ Bb satisfies 0 ⩽ u ⩽ 1. From (7.14) we get that 0 ⩽ P t u ⩽ 1, t ⩾ 0, implies 0 ⩽ αU α u ⩽ 1. Conversely, let 0 ⩽ u ⩽ 1 and 0 ⩽ αU α u ⩽ 1 for all α > 0. From

7.3 The resolvent | 99

the resolvent equation g) we see for all u ∈ Bb with u ⩾ 0

U α u(x) − U β u(x) g) d U α u(x) = lim = − lim U β U α u(x) = −U α2 u(x) ⩽ 0. dα α−β β→α β→α

d Iterating this shows that (−1)n dα n U α u(x) ⩾ 0. This means that ν(α) := U α u(x) is completely monotone, hence the Laplace transform of a positive measure, cf. Bernstein’s ∞ theorem [237, Theorem 1.4]. As U α u(x) = ∫0 e−αt P t u(x) dt, we infer that P t u(x) ⩾ 0 for Lebesgue almost all t > 0. d The formula for the derivative of U α yields dα αU α u(x) = (id −αU α )U α u(x). By n

d n iteration we get (−1)n+1 dα n αU α u(x) = n!(id −αU α )U α u(x). Plug in u ≡ 1; then it follows n



that 𝛶(α) := 1 − αU α 1(x) = ∫0 αe−αt (1 − P t 1(x)) dt is completely monotone. Thus, P t 1(x) ⩽ 1 for Lebesgue almost all t > 0, and using the positivity of P t for the positive function 1 − u, we get P t u ⩽ P t 1 ⩽ 1 for almost all t > 0. Since t 󳨃→ P t u(x) is continuous if u ∈ C∞ , we see that 0 ⩽ P t u(x) ⩽ 1 for all t > 0 if u ∈ C∞ and 0 ⩽ u ⩽ 1, and we extend this to u ∈ Bb with the help of Remark 7.7.

7.14 Example. Let (B t )t⩾0 be a Brownian motion and denote by (U α )α>0 the resolvent. Then for all u ∈ Bb (ℝd )

U α u(x) = G α,d ∗ u(x) = ∫ G α,d (y)u(x − y) dy

with the convolution kernel d 1 √2α 2 −1 K d −1 (√2α |y|), G α,d (y) = d/2 ( ) 2 2|y| π

d ⩾ 1,

(7.15)

where K ν (x) is a Bessel function of the third kind, cf. [194, 10.32.10, p. 253 and 10.39.2 p. 254]. For d = 1, 2, 3 (7.15) becomes G α,1 (y) =

e−√2α |y| , √2α

G α,2 (y) =

1 K0 (√2α |y|) π

and

G α,3 (y) =

e−√2α |y| . 2π|y|

Indeed: let d = 1 and denote by (P t )t⩾0 the transition semigroup. Using Fubini’s theorem we find for α > 0 and u ∈ Bb (ℝ) ∞



U α u(x) = ∫ e−αt P t u(x) dt = ∫ e−αt 𝔼 u(x + B t ) dt 0

0



= ∫ ∫ e−αt u(x + y) 0

2 1 e−y /2t dt dy √2πt

= ∫ r α (y2 ) u(x − y) dy.

The function r α in the integrand is given by ∞

r α (η) = ∫ 0

1 √2πt

e−(αt+η/2t) dt =



2 e−√2αη α ∫ √ e−(√αt−√η/2t) dt. πt √2α

0

Ex. 7.15

100 | 7 Brownian motion and transition semigroups Changing variables according to t = η/(2αs) and dt = −η/(2αs2 ) ds gives ∞

2 e−√2αη η −(√η/2s−√αs) ds. r α (η) = e ∫√ 2πs3 √2α

0

If we add these two equalities and divide by 2, we get r α (η) =



2 e−√2αη 1 1 α η ∫ (√ + √ 3 ) e−(√αs−√η/2s) ds. s 2s √2α √π 2

0

A further change of variables, y = √αs − √η/2s and dy = that

r α (η) =

1 2



η (√ αs + √ 2s3 ) ds, reveals

2 e−√2αη e−√2αη 1 . ∫ e−y dy = √2α √π √2α

−∞

For a d-dimensional Brownian motion, d ⩾ 2, a similar calculation leads to the kernel ∞

r α (η) = ∫ 0

1

1 1 α 4−2 −(αt+η/2t) e dt = K d −1 (√2αη) ( ) 2 (2πt)d/2 π d/2 2η d

which cannot be expressed by elementary functions if d is an even integer, cf. [194, 10.47.9, p. 262 and 10.49.12, p. 264] or [101, 8.468, p. 967]. The following result is often helpful if we want to determine the domain of the generator. 7.15 Theorem (Dynkin 1956; Reuter 1957). Let (A, D(A)) be the generator of a Feller semigroup and assume that (L, D(L)), D(L) ⊂ C∞ (ℝd ), extends (A, D(A)), i.e. If for any u ∈ D(L)

D(A) ⊂ D(L) and

then (A, D(A)) = (L, D(L)).

Proof. Pick u ∈ D(L) and set

g := u − Lu

(7.16)

L|D(A) = A.

Lu = u 󳨐⇒ u = 0, and

(7.17)

h := (id −A)−1 g ∈ D(A).

Since (L, D(L)) extends the operator (A, D(A)), we see that (7.16)

def

h − Lh = h − Ah = (id −A)(id −A)−1 g = g = u − Lu.

Thus, h − u = L(h − u) and, by (7.17), h = u. Since h ∈ D(A) we get that u ∈ D(A), and so D(L) ⊂ D(A).

7.3 The resolvent |

101

7.16 Example. Let (A, D(A)) be the generator of the transition semigroup (P t )t⩾0 of a BMd . We know from Example 7.9 that C2∞ (ℝd ) ⊂ D(A). If d = 1, then D(A) = C2∞ (ℝ) = {u ∈ C∞ (ℝ) : u, u󸀠 , u󸀠󸀠 ∈ C∞ (ℝ)}.

Proof. We know from Proposition 7.13.f) that U α is the inverse of α id −A. In particular, U α (C∞ (ℝ)) = D(A). This means that any u ∈ D(A) can be written as u = U α f for some f ∈ C∞ (ℝ). Using the formula (7.15) for U α from Example 7.14, we see for all f ∈ C∞ (ℝ) ∞

1 √ U α f(x) = ∫ e− 2α |y−x| f(y) dy. √2α −∞

By the differentiability lemma for parameter dependent integrals, cf. [232, Theorem 12.5], we find ∞

d √ U α f(x) = ∫ e− 2α |y−x| sgn(y − x) f(y) dy dx −∞ ∞

= ∫e x

x

−√2α (y−x) ∞

√2α (x−y)

f(y) dy − ∫ e− −∞

f(y) dy,

d2 √ U α f(x) = √2α ∫ e− 2α|y−x| f(y) dy − 2f(x). dx2 −∞

󵄨 󵄨d d2 Since 󵄨󵄨󵄨󵄨 dx U α f(x)󵄨󵄨󵄨󵄨 ⩽ 2√2αU α |f|(x) and dx 2 U α f(x) = 2αU α f(x)−2f(x), Proposition 7.13.d) shows that U α f ∈ C∞ (ℝ), thus C2∞ (ℝ) ⊃ D(A). If d > 1, the arguments of Example 7.16 are no longer valid.

7.17 Example. We have seen in Example 7.9 that the generator (A, D(A)) of a d-dimensional Brownian motion satisfies C2∞ (ℝd ) ⊂ D(A). Using Theorem 7.15 we can describe the generator precisely: (A, D(A)) = ( 12 ∇2x , D(∇2x )) where D(∇2x ) = {u ∈ C∞ (ℝd ) : ∇2x u ∈ D󸀠 (ℝd ) ∩ C∞ (ℝd )}

(7.18)

and ∇2x denotes the distributional (weak) derivatives in the space D󸀠 (ℝd ) of Schwartz distributions or generalized functions. Recall that the Schwartz distributions D󸀠 (ℝd ) are the topological dual of the test d functions D(ℝd ) = C∞ c (ℝ ), see for example [226, Chapter 6]. From Hölder’s inequality (p = 1, q = ∞) it follows immediately that 1 2 ∇ u[ϕ] := ∫ u(x) ⋅ 12 ∆ϕ(x) dx, 2 x

d u ∈ C∞ (ℝd ), ϕ ∈ C∞ c (ℝ )

(7.19)

defines a generalized function 12 ∇2x u ∈ D󸀠 (ℝd ). If u ∈ C2∞ (ℝd ), then (7.19) is just the familiar integration by parts formula ∫ 21 ∆u(x) ⋅ ϕ(x) dx = ∫ u(x) ⋅ 12 ∆ϕ(x) dx,

d u ∈ C2∞ (ℝd ), ϕ ∈ C∞ c (ℝ ).

(7.20)

Ex. 7.12

102 | 7 Brownian motion and transition semigroups 󵄨 󵄨 󵄨 Thus, 12 ∇2x 󵄨󵄨󵄨C∞ (ℝd ) extends 12 ∆󵄨󵄨󵄨C2∞ (ℝd ) = A󵄨󵄨󵄨C2∞ (ℝd ) ; in fact, it even extends (A, D(A)). To see this, we note that by Proposition 7.13.f) D(A) = U1 (C∞ (ℝd )) and every function u ∈ D(A) can be represented as u = U1 f with f = u − Au. Since A = 21 ∆ on d ∞ d C∞ c (ℝ ), we get for ϕ ∈ Cc (ℝ ) (u − 12 ∇2x u)[ϕ] = ∫ u(x) ⋅ (ϕ(x) − 12 ∆ϕ(x)) dx

= ∫ G1,d ∗ f(x) ⋅ (ϕ(x) − 21 ∆ϕ(x)) dx = ∫ f(x) ⋅ G1,d ∗ (ϕ(x) − 21 ∆ϕ(x)) dx = ∫ f(x) ⋅ U1 ( id − 12 ∆)ϕ(x) dx = ∫ f(x) ⋅ ϕ(x) dx.

This shows that u − 21 ∇2x u ∈ D󸀠 (ℝd ) is represented by f = u − Au, i.e. ( 12 ∇2x , D(∇2x )),

D(∇2x ) = {u ∈ C∞ (ℝd ) : ∇2x u ∈ D󸀠 (ℝd ) ∩ C∞ (ℝd )}

extends (A, D(A)). d Now let u ∈ D(∇2x ) such that u − 12 ∇2x u = 0. Therefore, we find for all ϕ ∈ C∞ c (ℝ ) and all z ∈ ℝd (u − 12 ∇2x u)[τ z ϕ] = 0

with τ z ϕ(x) := ϕ(x − z).

Since τ z ϕ has compact support, we can integrate this identity with respect to the kernel G1,d (z) of the convolution operator U1 , and get 0 = ∬ (u(x) − 21 ∇2x u(x))ϕ(x − z)G1,d (z) dz dx = ∫ u(x)U1 ( id − 12 ∆)ϕ(x) dx = ∫ u(x)ϕ(x) dx.

This shows that u = 0 (Lebesgue almost everywhere), and Theorem 7.15 proves that ( 12 ∇2x , D(∇2x )) = (A, D(A)). 7.18 Remark. If we combine Examples 7.16 and 7.17, we find

C2∞ (ℝ) = {u ∈ C∞ (ℝ) : ∇2x u ∈ D󸀠 (ℝ) ∩ C∞ (ℝ)} .

This is also a consequence of the (local) regularity of solutions of Poisson’s equation ∆u = f . For d = 1 the situation is easy since we deal with an ODE u󸀠󸀠 = f . In higher dimensions the situation is entirely different. If u is a radial solution, i.e. depending only on |x|, then ∆u = f ∈ C∞ (ℝd ),

u radial 󳨐⇒ u ∈ C2∞ (ℝd ).

7.4 The Hille-Yosida theorem and positivity |

103

In general, this is wrong, see also the discussion in Example 7.25. As soon as we assume, α (ℝd ) is Hölder continuous of order α ∈ (0, 1], then however, that f ∈ C∞ ∆u = f,

α d f ∈ C∞ (ℝd ) 󳨐⇒ u ∈ C2+α ∞ (ℝ )

d 2 d α d where u ∈ C2+α ∞ (ℝ ) means that u ∈ C∞ (ℝ ) and ∂ j ∂ k u ∈ C∞ (ℝ ) for j, k = 1, . . . , d. Proofs can be found in Dautray–Lions [46, Chapter II.2, pp. 283–296], a classical presentation is in Günter [103, §14, pp. 82–86].

7.4 The Hille-Yosida theorem and positivity We will now discuss the structure of generators of strongly continuous contraction semigroups and Feller semigroups. 7.19 Theorem (Hille 1948; Yosida 1948; Lumer–Phillips 1961). Let (B, ‖ ⋅ ‖) be a Banach space. A linear operator (A, D(A)) is the infinitesimal generator of a strongly continuous contraction semigroup if, and only if, a) D(A) is a dense subset of B; b) A is dissipative, i.e. ∀λ > 0, u ∈ D(A) : ‖λu − Au‖ ⩾ λ‖u‖; c) R(λ id −A) = B for some (or all) λ > 0.

The following proof is based on the ideas of Neveu [189, § I.4].

Proof. 1o The conditions a)–c) are equivalent to saying that (λ − A) has for every λ > 0 a bounded inverse R λ with dense range and satisfying ‖λR λ u‖ ⩽ ‖u‖. In fact, dissipativity means that λu − Au = 0 implies λ‖u‖ ⩽ ‖λu − Au‖ = 0, i.e. u = 0. Thus, (λ − A) is injective. Because of c) it is also surjective. Therefore, λ − A has an inverse R λ : B → D((λ − A)) = D(A) which is, again by b), bounded: take in b) u = R λ w to see that ‖λR λ w‖ ⩽ ‖w‖.

2o From Step 1o and Corollary 7.11.a), Proposition 7.13.b), f) ¹ it is clear that the conditions a)–c) are necessary for (A, D(A)) to be the generator of a strongly continuous contraction semigroup. Let us show in the following steps that these conditions are also sufficient. 3o Assume that a)–c) hold, and let R α , α > 0, be the inverse operators introduced in Step 1o . For α, β > 0 we have (β − A)u = (β − α)u + (α − A)u,

u ∈ D(A).

1 Strictly speaking, we have shown these results for (B, ‖ ⋅ ‖) = (C∞ , ‖ ⋅ ‖∞ ) only, but it is not hard to see that the proofs remain valid in the general case.

104 | 7 Brownian motion and transition semigroups Taking u = R α f for any f ∈ B gives

for all f ∈ B.

(β − A)R α f = (β − α)R α f + (α − A)R α f = (β − α)R α f + f

Applying to both sides the operator R β and rearranging the resulting identity leaves us with the resolvent identity R α f − R β f = (β − α)R β R α f

for all f ∈ B.

(7.21)

This shows, in particular, that the operators R α and R β commute. If we read the resolvent equality as R α f = (β − α)R β R α f + R β f , we see that the range of R α does not depend on α, – this is not surprising since R(R α ) = D(A) – but also that R α (B) = R β (B) 󳨐⇒ R2α (B) = R α R β (B) = R2β (B),

i.e. the range of R2α is independent of α, too.

4o We will now show the strong continuity of the family (αR α )α>0 . Let u ∈ D(A). By construction, there is some g ∈ B such that u = R1 g. Therefore, (7.21)

‖αR α u − u‖ = ‖αR α R1 g − R1 g‖ = ‖R α (R1 g − g)‖ ⩽

2 ‖g‖ 󳨀󳨀󳨀󳨀→ 0. α→∞ α

In the last estimate we use that ‖αR α g‖ ⩽ ‖g‖, see Step 1o . This establishes strong continuity for all u ∈ D(A). Since D(A) is dense in B, we can approximate f ∈ B by some u ∈ D(A), and we get by the linearity and contractivity of αR α ‖αR α f − f‖ ⩽ ‖αR α (f − u)‖ + ‖αR α u − u‖ + ‖f − u‖ ⩽ ‖αR α u − u‖ + 2‖f − u‖.

Letting first α → ∞ and then f → u proves strong continuity on B. As a consequence, limα→∞ α2 R2α u = u, and the independence of R(R2α ) of the parameter α shows that R(R21 ) is dense in B.

5o We can now define the Yosida approximation: A α f := αAR α f = α(αR α − 1)f , α > 0, f ∈ B. This is a family of bounded operators such that ‖A α f‖ ⩽ 2α‖f‖. Therefore, the exponential series ∞

T tα f := exp (tA α ) f = e−αt exp(tα2 R α )f := e−αt ∑

n=0

Ex. 7.9

tn 2 (α R α )n f, n!

t ⩾ 0, f ∈ B,

defines, for each α > 0, a strongly continuous, contractive operator semigroup (T tα )t⩾0 . We leave the elementary proof to the reader, cf. also Problem 7.9 . Note that the Yosida approximation satisfies (A α − A)R21 f = (αR α − id)AR21 f 󳨀󳨀󳨀󳨀→ 0 α→∞

since αR α u → u by Step

4o ;

this will be needed later on.

105

7.4 The Hille-Yosida theorem and positivity |

6o We want to show that T t f = limα→∞ T tα f exists in norm sense for all f ∈ B and locally unformly in t.² Since the limit is linear and continuity is preserved under locally uniform limits, it is clear that (T t )t⩾0 will be a strongly continuous semigroup. Our strategy is to show that (T tα )α>0 is – locally uniformly for t – a Cauchy sequence. Using the resolvent equation (7.21) we see that for all f ∈ B Rα f − Rβ f d R α f = lim = − lim R β R α f = −R2α f. dα α−β β→α β→α

(7.22)

This means that we can apply the usual rules of calculus to all functions of R α , and we deduce d d Aα f = α(αR α − 1)f = 2αR α f − α2 R2α f − f = −(αR α − 1)2 f, dα dα

d d d α T f = exp(tA α )f = tT tα A α f = −tT tα (αR α − 1)2 f. dα t dα dα

If f = R21 g the last identity yields, in particular, the estimate

󵄩󵄩 d 󵄩󵄩 󵄩 󵄩 󵄩󵄩 󵄩 󵄩󵄩 T tα R21 g󵄩󵄩󵄩 = 󵄩󵄩󵄩󵄩tT tα (αR α − id)2 R21 g 󵄩󵄩󵄩󵄩 󵄩󵄩 dα 󵄩󵄩 4t ‖g‖ 1 (7.21) 󵄩󵄩 α 󵄩 󵄩󵄩tT (R1 − αR α )2 g 󵄩󵄩󵄩 ⩽ = 󵄩 (α − 1)2 . (α − 1)2 󵄩 t

Let f ∈ R(R21 ), i.e. f = R21 g. Then we get for 1 < α < β β ‖T t f



T tα f‖

󵄩󵄩 β 󵄩󵄩 β 󵄩󵄩 󵄩󵄩 󵄩󵄩 󵄩󵄩 d 4t 4t 󵄩󵄩 d λ 2 󵄩 󵄩 󵄩 = 󵄩󵄩󵄩∫ T t R1 g dλ󵄩󵄩󵄩󵄩 ⩽ ∫ 󵄩󵄩󵄩 T tλ R21 g 󵄩󵄩󵄩 dλ ⩽ ( − ) ‖g‖. 󵄩󵄩 󵄩󵄩 dλ α−1 β−1 󵄩󵄩 dλ 󵄩󵄩 󵄩󵄩 α 󵄩󵄩 α

This shows that on R(R21 ) the sequence (T tα )α>1 is, locally uniformly in t, a Cauchy sequence. Since R(R21 ) is dense in B, cf. Step 4o , and since the operators T tα , T t are contractions, we get for an arbitrary h ∈ B and f ∈ R(R21 ) β

β

β

β

β

‖T t h − T tα h‖ ⩽ ‖T t h − T t f‖ + ‖T t f − T tα f‖ + ‖T tα f − T tα h‖ ⩽ ‖T t f − T tα f‖ + 2‖h − f‖. β

Letting first α, β → ∞ and then f → h shows that T t h = limα→∞ T tα h exists for all h ∈ B locally uniformly in t. The limit clearly inherits the semigroup and continuity properties of the approximating semigroups (T tα )t⩾0 .

7o Having defined the semigroup (T t )t⩾0 , all that remains to be done is to check that its generator is A. Again, by density, it is enough to check this for all f ∈ R(R21 ), i.e. 2 This means, to be precise, that limα→∞ sups⩽t ‖T sα f − T s f‖ = 0 for any t > 0 and f ∈ B.

106 | 7 Brownian motion and transition semigroups f = R21 g for some g ∈ B. We have

󵄩󵄩 T f − f 󵄩󵄩 󵄩󵄩 t 󵄩 − Af 󵄩󵄩󵄩 󵄩󵄩 󵄩󵄩 t 󵄩󵄩 󵄩 󵄩 󵄩󵄩 󵄩󵄩 󵄩󵄩 󵄩󵄩 T t f − f T tα f − f 󵄩󵄩󵄩 󵄩󵄩󵄩 T tα f − f 󵄩󵄩 + 󵄩󵄩 󵄩󵄩 + ‖A α f − Af ‖ − − A f ⩽ 󵄩󵄩󵄩 α 󵄩 󵄩 󵄩󵄩 t t 󵄩󵄩 󵄩󵄩 󵄩󵄩 t 󵄩 󵄩󵄩 T α f − f 󵄩󵄩 4 󵄩 󵄩󵄩 t 󵄩󵄩 󵄩󵄩 ‖g‖ + 󵄩󵄩󵄩 − A α f 󵄩󵄩󵄩 + 󵄩󵄩󵄩A α R21 g − AR21 g 󵄩󵄩󵄩󵄩 , ⩽ α−1 t 󵄩󵄩 󵄩󵄩

where we use the estimate from Step 6o with β → ∞ in the last estimate. Letting first t → 0 and then α → ∞ shows that 1t (T t − 1)f → Af . Here we use the definition of T tα = exp(tA α ) and the limit from Step 5o .

Theorem 7.19 is too general for transition semigroups of stochastic processes. Let us now show which role is played by the positivity and sub-Markovian property of such semigroups. To keep things simple, we restrict ourselves to the Banach space B = C∞ and Feller semigroups. 7.20 Lemma. Let (P t )t⩾0 be a Feller semigroup with generator (A, D(A)). Then A satisfies the positive maximum principle (PMP) u ∈ D(A), u(x0 ) = sup u(x) ⩾ 0 󳨐⇒ Au(x0 ) ⩽ 0. x∈ℝd

(7.23)

For the Laplace operator (7.23) is quite familiar: at a (global) maximum the second derivative is negative. Proof. Let u ∈ D(A) and assume that u(x0 ) = sup u ⩾ 0. Since P t preserves positivity, we get u+ −u ⩾ 0 and P t (u+ −u) ⩾ 0 or P t u+ ⩾ P t u. Since u(x0 ) is positive, u(x0 ) = u+ (x0 ) and so 1 1 1 (P t u(x0 ) − u(x0 )) ⩽ (P t u+ (x0 ) − u+ (x0 )) ⩽ (‖u+ ‖∞ − u+ (x0 )) = 0. t t t

Letting t → 0 therefore shows Au(x0 ) = limt→0 t−1 (P t u(x0 ) − u(x0 )) ⩽ 0.

In fact, the converse of Lemma 7.20 is also true.

7.21 Lemma. Assume that the linear operator (A, D(A)) satisfies conditions a) and c) of Theorem 7.19 and the PMP (7.23). Then 7.19.b) holds, in particular λ id −A is invertible, and both the resolvent U λ = (λ id −A)−1 and the semigroup generated by A are positivity preserving. Proof. Let u ∈ D(A) and x0 be such that |u(x0 )| = supx∈ℝd |u(x)|. We may assume that u(x0 ) ⩾ 0, otherwise we could consider −u. Then we find for all λ > 0 ‖λu − Au‖∞ ⩾ λu(x0 ) − Au(x0 ) ⩾ λu(x0 ) = λ‖u‖∞ . ⩽0 by PMP

7.5 The potential operator |

107

This is 7.19.b). Because of Proposition 7.13.i) it is enough to show that U λ preserves positivity. If we use the PMP for −u, we get Feller’s minimum principle: v ∈ D(A), v(x0 ) = inf v(x) ⩽ 0 󳨐⇒ Av(x0 ) ⩾ 0. x∈ℝd

Now assume that v ⩾ 0. If U λ v does not attain its infimum, we see that U λ v ⩾ 0 as U λ v is continuous and vanishes at infinity. Otherwise, let x0 be a minimum point of U λ v, then U λ v(x0 ) ⩽ 0 since U λ v vanishes at infinity. By Feller’s minimum principle 7.13.f)

λU λ v(x0 ) − v(x0 ) = AU λ v(x0 ) ⩾ 0

and, therefore, λU λ v(x) ⩾ inf x λU λ v(x) = λU λ v(x0 ) ⩾ v(x0 ) ⩾ 0.

The next result shows that positivity allows us to determine the domain of the generator using pointwise rather than uniform convergence. 7.22 Theorem. Let (P t )t⩾0 be a Feller semigroup generated by (A, D(A)). Then 󵄨󵄨 P t u(x) − u(x) D(A) = {u ∈ C∞ (ℝd ) 󵄨󵄨󵄨 ∃g ∈ C∞ (ℝd ) ∀x : lim = g(x)} . 󵄨 t→0 t

(7.24)

Proof. We define the weak generator of (P t )t⩾0 as A w u(x) := lim

t→0

P t u(x) − u(x) t

for all u ∈ D(A w ) and x ∈ ℝd ,

where D(A w ) is the set on the right-hand side of (7.24). Clearly, (A w , D(A w )) extends (A, D(A)) and the calculation in the proof of Lemma 7.20 shows that (A w , D(A w )) also satisfies the positive maximum principle. Using the argument of (the first part of the proof of) Lemma 7.21 we infer that (A w , D(A w )) is dissipative. In particular, we find for all u ∈ D(A w ) A w u = u 󳨐⇒ 0 = ‖u − A w u‖ ⩾ ‖u‖ 󳨐⇒ ‖u‖ = 0 󳨐⇒ u = 0.

Therefore, Theorem 7.15 shows that (A w , D(A w )) = (A, D(A)).

7.5 The potential operator

Let (P t )t⩾0 be a Feller semigroup with generator (A, D(A)) and α-potential operators (U α )α>0 . We have seen in Proposition 7.13.f) that the α-potential operator U α gives a representation of the solution of the equation αu − Au = f ∈ C∞ (ℝd ) 󳨐⇒ u = U α f ∈ D(A).

Now we are interested in the equation Au = f where α = 0. It is a natural guess that u = − limα→0 U α f ; this is indeed true, but we need a few preparations.

108 | 7 Brownian motion and transition semigroups 7.23 Definition (Yosida 1965). Let (P t )t⩾0 be a Feller semigroup on C∞ (ℝd ) with αpotential operators (U α )α>0 . The operator (U0 , D(U0 )) defined by U0 u := lim U α u

(the limit is taken in (C∞ (ℝd ), ‖ ⋅ ‖∞ ))

α→0

󵄨󵄨 D(U0 ) := {u ∈ C∞ (ℝd ) 󵄨󵄨󵄨 ∃g ∈ C∞ (ℝd ) : lim ‖U α u − g‖∞ = 0} 󵄨 α→0

(7.25a)

(7.25b)

is called the potential operator or zero-resolvent of the semigroup (P t )t⩾0 .

Clearly, (7.25) defines a linear operator U0 : D(U0 ) → C∞ (ℝd ). The potential operator U0 behaves like the α-potential operators (U α )α>0 .

Ex. 7.14

7.24 Lemma. Let (P t )t⩾0 be a Feller semigroup with generator (A, D(A)) and α-potential operators (U α )α>0 , (U0 , D(U0 )). a) U α (D(U0 )) ⊂ D(U0 ) for all α > 0; b) U0 u − U α u = αU0 U α u = αU α U0 u for all u ∈ D(U0 ) and α > 0; c) U0 (D(U0 )) ⊂ D(A) and AU0 u = −u for all u ∈ D(U0 ); d) {u ∈ C∞ (ℝd ) : u ⩾ 0, supα>0 U α u ∈ C∞ (ℝd )} ⊂ D(U0 ); e) If limα→0 ‖αU α u‖∞ = 0 for all u ∈ C∞ (ℝd ), then D(U0 ) is dense in C∞ (ℝd ). Moreover, D(U0 ) = R(A) and U0 = −A−1 . Proof. a) & b) Let u ∈ D(U0 ). By the resolvent equation, Proposition 7.13.g), we get lim U β U α u = lim

β→0

β→0

1 1 (U α u − U β u) = − (U α u − U0 u). β−α α

This proves that U α u ∈ D(U0 ). Moreover, since limβ→0 U β U α u = U0 U α u, we get the first equality in b). The second equality follows from the same calculation if we interchange α and β, and use that U α is a continuous operator, cf. Proposition 7.13.b). c) Since U α = (α id −A)−1 , cf. Proposition 7.13.f), we have AU α u = −u + αU α u. For each u ∈ D(U0 ), we get lim U α u = U0 u

α→0

and

lim AU α u = −u + lim αU α u = −u.

α→0

α→0

As (A, D(A)) is a closed operator, see Corollary 7.11, we conclude that U0 u ∈ D(A) and AU0 u = −u.

d) Since u ⩾ 0, P t u ⩾ 0 and we can use monotone convergence to get a (pointwise) extension of U0 α>0

Ex. 7.16





̃0 u(x) := sup ∫ e−αt P t u(x) dt = ∫ P t u(x) dt U 0

0

for all x ∈ ℝd , u ∈ C+∞ (ℝd ).

̃0 u ∈ C∞ (ℝd ), (a variant of) Dini’s theorem [224, Theorem 7.13, p. 150] shows that If U ̃0 u is uniform, which means that u ∈ D(U0 ) and U0 u = U ̃0 u. the limit limα→0 U α u = U e) Assume that limα→0 αU α u = 0 for all u ∈ C∞ (ℝd ). Since U α is the resolvent, we get AU α u = U α Au = −u + αU α u

for all u ∈ D(A),

7.5 The potential operator |

109

and so limα→0 U α Au = limα→0 (−u + αU α u) = −u. This proves that Au ∈ D(U0 ) and U0 Au = −u. Together with c) we see that U0 and A are inverses of each other and D(U0 ) = R(A). Let u ∈ D(A). As Au ∈ D(U0 ), we have by part a) that U α Au ∈ D(U0 ), and our calculation shows limα→0 AU α u = −u. Since AU α u = U α Au ∈ D(U0 ), we conclude that D(A) ⊂ D(U0 ), hence D(A) = D(U0 ) = C∞ (ℝd ).

7.25 Example (Günter 1934, 1957). Let (B t )t⩾0 be a BMd with semigroup (P t )t⩾0 , generator (A, D(A)) and resolvent (U α )α>0 . Then C2∞ (ℝd ) ⊊ D(A)

for all d ⩾ 3.

Indeed: because of Lemma 7.24.c),d) it is enough to show that there are positive functions u, v ∈ C∞ (ℝd ) such that sup U α u, sup U α v ∈ C∞ (ℝd ) but U0 (u − v)(x) = sup U α u(x) − sup U α v(x) ∉ C2 (ℝd ). α>0

α>0

α>0

α>0

Using monotone convergence and Tonelli’s theorem we get for any u ∈ C+∞ (ℝd ) and d⩾3 sup U α u(x) α>0



=

∫ P t u(x) dt

=

∫∫

=

∫∫

r=|y|2 /2t

=

=

=

0 ∞

0 ℝd ∞

ℝd 0

2 1 e−|y| /2t u(x − y) dy dt d/2 (2πt)

2 1 e−|y| /2t dt u(x − y) dy (2πt)d/2



d 1 ∫ ∫ r 2 −2 e−r dr |y|2−d u(x − y) dy d/2 2π

ℝd



ℝd

0

Γ( 2d

− 1)

2 π d/2

|y|2−d u(x − y) dy

c d ∫ |y|2−d u(x − y) dy ℝd

with c d = π−d/2 Γ( 2d )/(d − 2). Since ∫|y|⩽1 |y|2−d dy < ∞, we see with dominated convergence that the function U0 u(x) is in C∞ (ℝd ) for all u ∈ C+∞ (ℝd ) ∩ L1 (dy). For the following calculation we assume d = 3. In Problem 7.17 we sketch how to deal with d ⩾ 4. Pick any positive f ∈ Cc ([0, 1)) such that f(0) = 0. We denote by (x, y, z) points in 3 ℝ and set r2 := x2 + y2 + z2 . Then let u(x, y, z) :=

3z2 f(r), r2

v(x, y, z) := f(r),

w(x, y, z) := u(x, y, z) − v(x, y, z).

Ex. 7.17

110 | 7 Brownian motion and transition semigroups We will show that there is some f such that w ∈ D(U0 ) and U0 w ∉ C2 (ℝ3 ). The first assertion follows directly from Lemma 7.24.d). Using the integral formula for U0 u and U0 v and, introducing polar coordinates, we get for z ∈ (0, 1/2) 1

π

2

U0 w(0, 0, z) = ∫ r f(r) (∫ 0

0

(3 cos2 θ − 1) sin θ dθ) dr. √z2 + r2 − 2rz cos θ

Changing variables according to t = √z2 + r2 − 2zr cos θ gives for the inner integral π

z+r

0

|z−r|

2

1 z2 + r2 − t2 (3 cos2 θ − 1) sin θ dθ = ) − 1] dt ∫ [3 ( ∫ rz 2zr √z2 + r2 − 2rz cos θ r2 z3

4 { {5 ={ {4 {5

Thus,

z2 r3

for r < z, for r ⩾ z. 1

z

4 1 f(r) dr) . U0 w(0, 0, z) = ( 3 ∫ r4 f(r) dr + z2 ∫ 5 z r 0

z

We may differentiate this function in z and get z

1

0

z

f(r) d 4 3 U0 w(0, 0, z) = (− 4 ∫ r4 f(r) dr + 2z ∫ dr) . dz 5 r z

d U0 w(0, 0, z) = 0. In order to calculate the second It is not hard to see that limz→0+ dz derivative in z-direction at z = 0 we consider d dz U 0 w(0, 0,

Letting z → 0 yields lim

z→0+

z) − z

d dz U 0 w(0, 0,

d dz U 0 w(0, 0, 0)

z) − z

=

z

1

0

z

4 3 f(r) dr) . (− 5 ∫ r4 f(r) dr + 2 ∫ 5 r z

d dz U 0 w(0, 0, 0)

1

4 3 f(r) = (− f(0) + 2 lim ∫ dr) . z→0+ 5 5 r z

1

All we have to do is to pick a positive f ∈ Cc ([0, 1)) such that the integral ∫0+ f(r) drr diverges; a canonical candidate is f(r) = | log r|−1 χ(r) with a suitable cut-off function χ ∈ C+c ([0, 1)) and χ|[0,1/2] ≡ 1. This shows that U0 w ∉ C2 (ℝ3 ), but U0 w is in the domain of the generator of three-dimensional Brownian motion.

7.26 Lemma. Let (B t )t⩾0 be a BMd , and denote the corresponding resolvent by (U α )α>0 . Then the condition of Lemma 7.24.e) holds, i.e. limα→0 ‖αU α u‖∞ = 0 for all u ∈ C∞ (ℝd ).

7.5 The potential operator | 111

Proof. Fix ϵ > 0. Since Cc (ℝd ) is dense in C∞ (ℝd ), there is some u ϵ ∈ Cc (ℝd ) such that ‖u ϵ − u‖∞ < ϵ. By Lemma 7.13.b) we get ‖αU α u‖∞ ⩽ ‖αU α u ϵ ‖∞ + ‖αU α (u − u ϵ )‖∞ ⩽ ‖αU α u ϵ ‖∞ + ϵ. ⩽‖u−u ϵ ‖∞

Thus, it is enough to prove the claim for u ∈ Cc (ℝd ) with, say, supp u ⊂ 𝔹(0, R). From Example 7.14 we know that U α u(x) = G α,d ∗ u(x) for all u ∈ C∞ (ℝd ). Using the explicit representation (7.15) and a simple change of variables yield αU α u(x) =

d 1 ∫ (2|z|)1− 2 K d −1 (|z|) u(x − z/√2α) dz. d/2 2 2π

ℝd

By Proposition 7.13.d), αU α u ∈ C∞ (ℝd ), and so there is some S = S ϵ > 0 such that ‖αU α u‖∞ ⩽ sup |αU α u(x)| + ϵ. |x|⩽S

Observe that

󵄨󵄨 z 󵄨󵄨󵄨󵄨 1 󵄨󵄨 −S 󵄨󵄨x − 󵄨⩾ 󵄨󵄨 √2α 󵄨󵄨󵄨 R√2α

for all |x| ⩽ S, |z| ⩾

Therefore, we have for small α ⩽ c = c(R, S) αU α u(x) =

1 ( 2π d/2



|z| R ϵ is large, sup|x|⩽S |αU α u(x)| ⩽ ϵ. This proves limα→0 ‖αU α u‖∞ = 0 for any u ∈ Cc (ℝd ).

7.27 Theorem (Yosida 1970). Let (B t )t⩾0 be a BMd , and denote the corresponding resolvent by (U α )α>0 . Then the potential operator is densely defined and it is the inverse of the generator of (B t )t⩾0 : U0 = −A−1 . Moreover, U0 u(x) = G0 ∗ u(x) = ∫ G0 (y)u(x − y) dy

(7.26)

112 | 7 Brownian motion and transition semigroups for all u ∈ D(U0 ) ∩ Cc (ℝd ) (with ∫ u(x) dx = 0 if d = 1, 2) where −|y| { { { { { {1 1 G0 (y) = G0,d (y) = { π log |y| { { { { { Γ(d/2) 2−d { π d/2 (d−2) |y|

if d = 1

(linear potential);

if d ⩾ 3

(Newton potential).

if d = 2

(logarithmic potential);

(7.27)

Proof. Lemmas 7.24.e) and 7.26 show that D(U0 ) is dense in C∞ (ℝd ) and U0 = −A−1 . The formula for G0,d (y) in dimension d ⩾ 3 follows from the calculation in Example 7.25: just write u ∈ D(U0 ) ∩ Cc (ℝd ) as u = u+ − u− . Now we consider the case d = 1. Let u ∈ D(U0 ) ∩ Cc (ℝ) with ∫ u(y) dy = 0. Then, by dominated convergence, U0 u(x) = lim ∫ α→0



= lim ∫ α→0



e−√2α|x−y| u(y) dy √2α

e−√2α|x−y| − 1 u(y) dy = ∫ −|x − y|u(y) dy. √2α ℝ

Finally, let d = 2 and u ∈ D(U0 ) ∩ Cc (ℝ2 ) such that ∫ u(y) dy = 0. Observe that π

1 1 √ G α,2 (y) = K0 (√2α|y|) = − 2 ∫ e 2α|y| cos θ [γ + log (2√2α|y| sin2 θ)] dθ π π 0

where γ = 0.57721 . . . is Euler’s constant, cf. [194, 10.32.5, p. 252]. Then, U0 u(x)

π

=−

=−

1 √ lim ∫ ∫ e 2α|x−y| cos θ [γ + log (2√2α sin2 θ) + log |x − y|] dθ u(y) dy π2 α→0 ℝ2 0

1 ∫ log |x − y| u(y) dy π ℝ2

π

1 e√2α|x−y| cos θ − 1 ] − 2 lim √2α log √2α ∫ [∫ dθ u(y) dy π α→0 √2α ] ℝ2 [ 0 1 = − ∫ log |x − y| u(y) dy. π ℝ2

Using the positivity of the semigroup (P t )t⩾0 , there is a natural (pointwise) extension of U0 to positive measurable functions u ∈ B+b (ℝd ). ∞

U0 u(x) = sup ∫ e α>0

0

r

−αt

P t u(x) dt = sup sup ∫ e−αt P t u(x) dt α>0 r>0

0

7.5 The potential operator | 113 r

r

= sup sup ∫ e−αt P t u(x) dt = sup ∫ P t u(x) dt ∈ [0, ∞]. r>0 α>0

r>0

0

0

The last expression yields an alternative way to define the potential operator. 7.28 Definition (Hunt 1957). Let (P t )t⩾0 be a Feller semigroup on C∞ (ℝd ). The operator (G, D(G)) defined by r

Gu := lim ∫ P t u dt r→∞

(the limit is taken in (C∞ (ℝd ), ‖ ⋅ ‖∞ ))

0

󵄨󵄨 󵄩 󵄩 r D(G) := {u ∈ C∞ (ℝd ) 󵄨󵄨󵄨 ∃g ∈ C∞ (ℝd ) : lim 󵄩󵄩󵄩󵄩∫0 P t u dt − g󵄩󵄩󵄩󵄩∞ = 0} r→∞ 󵄨

(7.28a)

(7.28b)

is called Hunt’s potential operator of the semigroup (P t )t⩾0 .

If we consider U0 and G as operators in the Banach space (C∞ (ℝd ), ‖⋅‖∞ ), then Yosida’s definition is more general than Hunt’s.

7.29 Theorem (Berg 1974). Let (P t )t⩾0 be a Feller semigroup on C∞ (ℝd ) with generator (A, D(A)), potential operators (U α )α>0 , (U0 , D(U0 )) and Hunt’s potential operator (G, D(G)). a) (U0 , D(U0 )) extends (G, D(G)), i.e. D(G) ⊂ D(U0 ) and U0 |D(G) = G. b) If limt→∞ ‖P t u‖∞ = 0 for all u ∈ C∞ (ℝd ), then G = U0 = −A−1 . r

Proof. a) Pick u ∈ D(G) and set G r u := ∫0 P t u dt for r > 0. Integrating by parts yields r

r

∫e 0

−tα

P t u dt = [e

−tα

r G t u]t=0

+ ∫ αe−tα G t u dt. 0

Letting r → ∞ shows (all limits are taken in C∞ (ℝd )) ∞



s=tα



U α u = ∫ e−tα P t u dt = ∫ αe−tα G t u dt = ∫ e−s G s/α u ds. 0

0

s/α By assumption, G s/α u = ∫0 P t u dt and this proves u ∈ D(U0 ) and U0 u

0

→ Gu uniformly as α → 0. So, limα→0 U α u = Gu, = Gu.

b) The condition limt→∞ ‖P t u‖∞ = 0 and dominated convergence imply that ∞

‖αU α u‖∞ ⩽ ∫ αe 0

−αt

s=αt



‖P t u‖∞ dt = ∫ e−s ‖P s/α u‖∞ ds 󳨀󳨀󳨀󳨀→ 0. 0

⩽‖u‖∞

α→0

Thus, Lemma 7.24.e) shows that D(U0 ) = R(A) and U0 = −A−1 . Therefore, it is enough t to show that D(G) = R(A) and G = −A−1 . Let G t u := ∫0 P s u ds. From (7.11c) we see t

P t u − u = ∫ P s Au ds = G t Au 0

for all u ∈ D(A)

114 | 7 Brownian motion and transition semigroups and, if t → ∞, we get Au ∈ D(G) and −u = GAu. On the other hand, by (7.11a), t

P t u − u = A ∫ P s u ds = AG t u 0

for all u ∈ D(G) ⊂ C∞ (ℝd ).

Letting t → ∞ yields, because of the closedness of A, that Gu = limt→∞ G t u ∈ D(A) and −u = AGu. Thus, D(G) = R(A) and G = −A−1 .

7.6 Dynkin’s characteristic operator

Ex. 7.18

We will now give a probabilistic characterization of the infinitesimal generator. As so often, a simple martingale relation turns out to be extremely helpful. The following theorem should be compared with Theorem 5.6. Recall that a Feller process is a (strong ³) Markov process with right continuous trajectories whose transition semigroup (P t )t⩾0 is a Feller semigroup. Of course, a Brownian motion is a Feller process. 7.30 Theorem. Let (X t , Ft )t⩾0 be a Feller process on ℝd with transition semigroup (P t )t⩾0 and generator (A, D(A)). Then t

M tu := u(X t ) − u(x) − ∫ Au(X r ) dr 0

is an Ft martingale.

for all u ∈ D(A)

Proof. Let u ∈ D(A), x ∈ ℝd and s, t > 0. By the Markov property (6.4c), s+t

u 󵄨󵄨 𝔼x (M s+t 󵄨󵄨 Ft ) = 𝔼x (u(X s+t ) − u(x) − ∫ Au(X r ) dr 0

󵄨󵄨 󵄨󵄨 󵄨󵄨 Ft ) 󵄨󵄨 󵄨

󵄨󵄨 󵄨 = 𝔼 u(X s ) − u(x) − ∫ Au(X r ) dr − 𝔼 ( ∫ Au(X r ) dr 󵄨󵄨󵄨󵄨 Ft ) 󵄨󵄨 t 0 t

Xt

t

s+t

x

s

= P s u(X t ) − u(x) − ∫ Au(X r ) dr − 𝔼X t (∫ Au(X r ) dr) 0

t

0

s

= P s u(X t ) − u(x) − ∫ Au(X r ) dr − ∫ 𝔼X t (Au(X r )) dr 0

0

t

s

= P s u(X t ) − u(x) − ∫ Au(X r ) dr − ∫ P r Au(X t ) dr 0

0

=P s u(X t )−u(X t ) by (7.11c)

3 Feller processes have the strong Markov property, see Theorem A.23 in the appendix.

7.6 Dynkin’s characteristic operator | 115

t

= u(X t ) − u(x) − ∫ Au(X r ) dr 0

= M tu .

The following result, due to Dynkin, could be obtained from Theorem 7.30 by optional stopping, but we prefer to give an independent proof. 7.31 Proposition (Dynkin 1956). Let (X t , Ft )t⩾0 be a Feller process on ℝd with transition semigroup (P t )t⩾0 , resolvent (U α )α>0 and generator (A, D(A)). If σ is an almost surely finite Ft stopping time, then σ

U α f(x) = 𝔼x (∫ e−αt f(X t ) dt) + 𝔼x (e−ασ U α f(X σ )) , 0

for all x ∈ ℝd . If 𝔼x σ < ∞, then Dynkin’s formula holds

f ∈ Bb (ℝd ),

(7.29)

σ

𝔼 u(X σ ) − u(x) = 𝔼 (∫ Au(X t ) dt) , x

x

0

Proof. Since σ < ∞ a.s., we get

u ∈ D(A).



U α f(x) = 𝔼x ( ∫ e−αt f(X t ) dt) 0 σ

= 𝔼 (∫ e x

0

−αt



f(X t ) dt) + 𝔼 ( ∫ e−αt f(X t ) dt) . x

σ

Using the strong Markov property (6.7), the second integral becomes ∞

𝔼 (∫ e x

σ

−αt



f(X t ) dt) = 𝔼 ( ∫ e−α(t+σ) f(X t+σ ) dt) x

0



= ∫ 𝔼x (e−α(t+σ) 𝔼x [f(X t+σ ) | Fσ+ ]) dt 0 ∞

= ∫ e−αt 𝔼x (e−ασ 𝔼X σ f(X t )) dt 0



= 𝔼x (e−ασ ∫ e−αt P t f(X σ ) dt) 0

= 𝔼x (e−ασ U α f(X σ )) .

(7.30)

Ex. 7.19

116 | 7 Brownian motion and transition semigroups If u ∈ D(A), there is some f ∈ C∞ (ℝd ) such that u = U α f , hence f = αu − Au. Using the first identity, we find by dominated convergence – we use 𝔼x σ < ∞ in the step (∗) below to get an integrable majorant: σ

u(x) = 𝔼x (∫ e−αt f(X t ) dt) + 𝔼x (e−ασ u(X σ )) 0 σ

(∗)

= 𝔼x (∫ e−αt (αu(X t ) − Au(X t )) dt) + 𝔼x (e−ασ u(X σ )) 0

σ

󳨀󳨀󳨀󳨀→ − 𝔼 (∫ Au(X t ) dt) + 𝔼x (u(X σ )) . x

α→0

0

Assume that the filtration (Ft )t⩾0 is right continuous. Then the first hitting time of the open set ℝd \ {x}, τ x := inf{t > 0 : X t ≠ x}, is a stopping time, cf. Lemma 5.7. One can show that ℙx (τ x ⩾ t) = exp(−λ(x)t) for some exponent λ(x) ∈ [0, ∞], see Theorem A.24 in the appendix. For a Brownian motion we have λ(x) ≡ ∞, since for t > 0 ℙx (τ x ⩾ t) = ℙx ( sup |B s − x| = 0) ⩽ ℙx (|B t − x| = 0) = ℙ(B t = 0) = 0. s⩽t

Under ℙx the random time τ x is the waiting time of the process at the starting point. Therefore, λ(x) tells us how quickly a process leaves its starting point, and we can use it to give a classification of the starting points. 7.32 Definition. Let (X t )t⩾0 be a Feller process. A point x ∈ ℝd is said to be a) an exponential holding point if 0 < λ(x) < ∞, b) an instantaneous point if λ(x) = ∞, c) an absorbing point if λ(x) = 0.

7.33 Lemma. Let (X t , Ft )t⩾0 be a Feller process with a right continuous filtration, transition semigroup (P t )t⩾0 and generator (A, D(A)). If x0 is not absorbing, then a) there exists some u ∈ D(A) such that Au(x0 ) ≠ 0; b) there is an open neighbourhood U = U(x0 ) of x0 such that for V := ℝd \ U the first hitting time τ V := inf{t > 0 : X t ∈ V} satisfies 𝔼x0 τ V < ∞. If x is absorbing, then Au(x) = 0 for all u ∈ D(A). Proof. a) We show that Aw(x0 ) = 0 for all w ∈ D(A) implies that x0 is an absorbing point. Let u ∈ D(A). By Lemma 7.10.a) w = P t u ∈ D(A) and d P t u(x0 ) = AP t u(x0 ) = 0 dt

for all t ⩾ 0.

Thus, (7.11b) shows that P t u(x0 ) = u(x0 ). Since D(A) is dense in C∞ (ℝd ) we get 𝔼x0 u(X t ) = P t u(x0 ) = u(x0 ) for all u ∈ C∞ (ℝd ). Using the sequence from (7.7), we

7.6 Dynkin’s characteristic operator | 117

can approximate x 󳨃→ 𝟙{x0 } (x) from above by (u n )n⩾0 ⊂ C∞ (ℝd ). By monotone convergence, ℙx0 (X t = x0 ) = lim 𝔼x0 u n (X t ) = lim u n (x0 ) = 𝟙{x0 } (x0 ) = 1. n→∞

=u n (x0 )

n→∞

Since t 󳨃→ X t is a.s. right continuous, we find

Ex. 7.20

1 = ℙx0 (X q = x0 ∀q ∈ ℚ ∩ [0, ∞)) = ℙx0 (X t = x0 ∀t ∈ [0, ∞)),

which means that x0 is an absorbing point.

b) If x0 is not absorbing, then part a) shows that there is some u ∈ D(A) such that Au(x0 ) > 0. Since Au is continuous, there exists some ϵ > 0 and an open neighbourhood U = U(x0 , ϵ) such that Au|U ⩾ ϵ > 0. Let τ V be the first hitting time of the open set V = ℝd \ U. From Dynkin’s formula (7.30) with σ = τ V ∧ n, n ⩾ 1, we deduce τ V ∧n

ϵ 𝔼 (τ V ∧ n) ⩽ 𝔼 ( ∫ Au(X s ) ds) = 𝔼x0 u(X τ x0

x0

0

V ∧n

Monotone convergence shows that 𝔼x0 τ V ⩽ 2‖u‖∞ /ϵ < ∞.

) − u(x0 ) ⩽ 2‖u‖∞ .

Finally, let x be absorbing. Assume, to the contrary, that Au(x) ≠ 0 for some u ∈ D(A). We can the argument of Part b) to construct a neighbourhood U = U(x) and V = ℝd \ U such that 𝔼x τ V < ∞. This is impossible, since ∞ = τ x ⩽ τ V a.s., thus we must have Au(x) = 0. 7.34 Definition (Dynkin 1956). Let (X(t))t⩾0 be a Feller process and τ xr := τ c

c

𝔹 (x,r)

the

first hitting time of the set 𝔹 (x, r). Dynkin’s characteristic operator is the linear operator defined by 𝔼x u(X(τ xr )) − u(x) { {lim , 𝔼x τ xr Au(x) := {r→0 { 0, {

if x is not absorbing, if x is absorbing,

(7.31)

on the set D(A) consisting of all u ∈ Bb (ℝd ) such that the limit in (7.31) exists for each non-absorbing point x ∈ ℝd . Note that, by Lemma 7.33, 𝔼x τ xr < ∞ for sufficiently small r > 0.

7.35 Theorem. Let (X t )t⩾0 be a Feller process with generator (A, D(A)) and characteristic operator (A, D(A)). a) A is an extension of A; b) (A, D) = (A, D(A)) if D = {u ∈ D(A) : u, Au ∈ C∞ (ℝd )}.

118 | 7 Brownian motion and transition semigroups Proof. a) Let u ∈ D(A) and assume that x is not an absorbing point. Since we have Au ∈ C∞ (ℝd ), there exists for every ϵ > 0 an open neighbourhood U0 = U0 (ϵ, x) of x such that |Au(y) − Au(x)| < ϵ

for all y ∈ U0 .

Since x is not absorbing, we find some open neighbourhood U1 ⊂ U 1 ⊂ U0 of x such that 𝔼x τ c < ∞. By Dynkin’s formula (7.30) for the stopping time σ = τ c where 𝔹 (x,r)

U1

𝔹(x, r) ⊂ U1 ⊂ U0 we see

󵄨󵄨 󵄨󵄨 x 󵄨 󵄨󵄨 𝔼 u(X τ c ) − u(x) − Au(x) 𝔼x τ c 𝔹 (x,r) 󵄨 𝔹 (x,r) ⩽ 𝔼x (

Thus, limr→0 ( 𝔼x u(X τ



[0,τ𝔹c (x,r) )

c 𝔹 (x,r)

|Au(X s ) − Au(x)| ds) ⩽ ϵ 𝔼x τ

) − u(x))/ 𝔼x τ

⩽ϵ c

𝔹 (x,r)

c

𝔹 (x,r)

.

= Au(x).

If x is absorbing, then Au(x) = 0 by Lemma 7.33, hence Au = Au for all u ∈ D(A).

b) Since (A, D) satisfies the positive maximum principle (7.23), the claim follows like Theorem 7.22.

Let us finally show that processes with continuous sample paths are generated by differential operators.

7.36 Definition. Let L : D(L) ⊂ Bb (ℝd ) → Bb (ℝd ) be a linear operator. It is called local if for every x ∈ ℝd and u, w ∈ D(L) satisfying u|𝔹(x,ϵ) ≡ w|𝔹(x,ϵ) in some open neighbourhood 𝔹(x, ϵ) of x we have Lu(x) = Lw(x).

Typical local operators are differential operators and multiplication operators, e.g. u 󳨃→ ∑i,j a ij (⋅)∂ i ∂ j u(⋅)+∑j b j (⋅)∂ j u(⋅)+c(⋅)u(⋅). A typical non-local operator is an integral operator, e.g. u 󳨃→ ∫ℝd k(x, y) u(y) dy. A deep result by J. Peetre says that local operators are essentially differential operators: d k d 7.37 Theorem (Peetre 1960). Suppose that L : C∞ c (ℝ ) → Cc (ℝ ) is a linear operator ∂α where k ⩾ 0 is fixed. If supp Lu ⊂ supp u, then Lu = ∑α∈ℕd a α (⋅) ∂x α u with finitely 0

Ex. 7.21

many, uniquely determined distributions a α ∈ D󸀠 (ℝd ) (the distributions D󸀠 (ℝd ) are the d k topological dual of C∞ c (ℝ )), which are locally represented by functions of class C .

d where (A, D(A)) is the Let L be as in Theorem 7.37 and assume that L = A|C∞ c (ℝ ) generator of a Markov process. Then L has to satisfy the positive maximum principle (7.23). In particular, L is at most a second order differential operator. This follows from d the fact that we can find test functions ϕ ∈ C∞ c (ℝ ) such that x 0 is a global maximum while ∂ j ∂ k ∂ l ϕ(x0 ) has arbitrary sign. Every Feller process with continuous sample paths has a local generator:

7.6 Dynkin’s characteristic operator | 119

7.38 Theorem. Let (X t )t⩾0 be a Feller process with continuous sample paths. Then the generator (A, D(A)) is a local operator.

Proof. Because of Theorem 7.35 it is enough to show that (A, D(A)) is a local operator. If x is absorbing, this is clear. Assume, therefore, that x is not absorbing. If the functions u, w ∈ D(A) coincide in 𝔹(x, ϵ) for some ϵ > 0, we know for all r < ϵ u(X(τ

c

𝔹 (x,r)

)) = w(X(τ

c

𝔹 (x,r)

))

where we use that X(τ𝔹c (x,r) ) ∈ 𝔹(x, r) ⊂ 𝔹(x, ϵ) because of the continuity of the trajectories. Thus, Au(x) = Aw(x).

d If the domain D(A) is sufficiently rich, e.g. if it contains the test functions C∞ c (ℝ ), then the local property of A follows from the speed of the process moving away from its starting point.

d 7.39 Theorem. Let (X t )t⩾0 be a Feller process such that C∞ c (ℝ ) is in the domain of the ∞ d generator (A, D(A)). Then (A, Cc (ℝ )) is a local operator if, and only if,

∀ϵ > 0, r > 0, K ⊂ ℝd compact : lim sup sup h→0 t⩽h x∈K

1 x ℙ (r > |X t − x| > ϵ) = 0. h

(7.32)

d Proof. Assume that (A, C∞ c (ℝ )) is local. Fix ϵ > 0, r > 3ϵ, pick any compact set K and cover it by finitely many balls 𝔹(x j , ϵ), j = 1, . . . , N. d ∞ d Since C∞ c (ℝ ) ⊂ D(A), there exist functions u j ∈ Cc (ℝ ), j = 1, . . . , N, with

and

u j ⩾ 0,

u j |𝔹(x j ,ϵ) ≡ 0

and

󵄩󵄩 1 󵄩󵄩 󵄩 󵄩 lim 󵄩󵄩󵄩 (P t u j − u j ) − Au j 󵄩󵄩󵄩 = 0 󵄩󵄩∞ t→0 󵄩 󵄩t

u j |𝔹(x j ,r+ϵ)\𝔹(x j ,2ϵ) ≡ 1

for all j = 1, . . . , N.

Therefore, there is for every δ > 0 and all j = 1, . . . , N some h = h δ > 0 such that for all t < h and all x ∈ ℝd 1 󵄨󵄨 x 󵄨 󵄨 𝔼 u j (X t ) − u j (x) − tAu j (x)󵄨󵄨󵄨 ⩽ δ. t 󵄨

If x ∈ 𝔹(x j , ϵ), then u j (x) = 0 and, by locality, Au j (x) = 0. Thus 1 x 𝔼 u j (X t ) ⩽ δ; t

for x ∈ 𝔹(x j , ϵ) and r > |X t − x| > 3ϵ we have r + ϵ > |X t − x j | > 2ϵ, therefore

1 x 1 1 ℙ (r > |X t − x| > 3ϵ) ⩽ ℙx (r + ϵ > |X t − x j | > 2ϵ) ⩽ P t u j (x) ⩽ δ. t t t

Ex. 7.8

120 | 7 Brownian motion and transition semigroups This proves lim

h→0

1 sup sup ℙx (r > |X t − x| > 3ϵ) h x∈K t⩽h 1 ⩽ lim sup sup ℙx (r > |X t − x| > 3ϵ) ⩽ δ. h→0 x∈K t⩽h t

d Conversely, fix x ∈ ℝd and assume that u, w ∈ C∞ c (ℝ ) ⊂ D(A) coincide on 𝔹(x, ϵ) for some ϵ > 0. Moreover, pick r in such a way that supp u ∪ supp w ⊂ 𝔹(x, r). Then 󵄨󵄨 𝔼x u(X ) − u(x) 𝔼x w(X ) − w(x) 󵄨󵄨 t t 󵄨 󵄨󵄨 − |Au(x) − Aw(x)| = lim 󵄨󵄨󵄨 󵄨󵄨 󵄨󵄨 t→0 󵄨󵄨 t t 1 󵄨󵄨 x 󵄨 = lim 󵄨󵄨𝔼 (u(X t ) − w(X t ))󵄨󵄨󵄨 t→0 t 1 ⩽ lim ∫ |u(y) − w(y)| ℙx (X t ∈ dy) t→0 t ℝd

⩽ ‖u − w‖∞ lim

t→0

d proving that (A, C∞ c (ℝ )) is a local operator.

=0 for |y − x| < ϵ or |y − x| ⩾ r

1 x (7.32) ℙ (r > |X t − x| > ϵ) = 0, t

7.40 Further reading. Semigroups in probability theory are nicely treated in the first chapter of [79]; the presentation in (the first volume of) [71] is still up-to-date. Both sources approach things from the analysis side. The survey paper [118] gives a brief self-contained introduction from a probabilistic point of view. The Japanese school has a long tradition approaching stochastic processes through semigroups, see for example [119] and [120]. We did not touch the topic of quadratic forms induced by a generator, so-called Dirichlet forms. The classic text is [96]. The monograph by Port–Stone [209] contains a detailed discussion of potentials from Hunt’s perspective. Dynkin: Markov Processes (volume 1). [71] Ethier, Kurtz: Markov Processes. Characterization and Convergence. [79] [96] Fukushima, Oshima, Takeda: Dirichlet Forms and Symmetric Markov Processes. [118] Itô: Semigroups in Probability Theory. [119] Itô: Stochastic Processes. [120] Itô: Essentials of Stochastic Processes. [209] Port, Stone: Brownian Motion and Classical Potential Theory.

Problems 1.

2.

Show that C∞ := {u : ℝd → ℝ : continuous and lim|x|→∞ u(x) = 0} equipped with the uniform topology is a Banach space. Show that in this topology C∞ is the closure of Cc , i.e. the family of continuous functions with compact support.

Let (B t )t⩾0 be a BMd with transition semigroup (P t )t⩾0 . Show, using arguments similar to those in the proof of Proposition 7.3.f), that (t, x, u) 󳨃→ P t u(x) is continuous. As usual, we equip [0, ∞) × ℝd × C∞ (ℝd ) with the norm |t| + |x| + ‖u‖∞ .

Problems

3.

| 121

Let (X t )t⩾0 be a d-dimensional Feller process and let f, g ∈ C∞ (ℝd ). Show that the function x 󳨃→ 𝔼x (f(X t )g(X t+s )) is also in C∞ (ℝd ). Hint. Markov property.

4. Let (B t )t⩾0 be a BMd and set u(t, x) := P t u(x). Adapt the proof of Proposition 7.3.g) and show that for u ∈ Bb (ℝd ) the function u(t, ⋅) ∈ C∞ and u ∈ C1,2 (i.e. once continuously differentiable in t, twice continuously differentiable in x). 5.

Let (P t )t⩾0 be the transition semigroup of a BMd and denote by L p , 1 ⩽ p < ∞, the space of pth power integrable functions with respect to Lebesgue measure on ℝd . Set u n := (−n) ∨ u ∧ n for every u ∈ L p . Verify that a) u n ∈ Bb (ℝd ) ∩ L p (ℝd ) and L p -limn→∞ u n = u. ̃ t u := limn→∞ P t u n extends P t and gives a contraction semigroup on L p . b) P Hint. Use Young’s inequality for convolutions, see e.g. [232, Theorem 15.6].

̃ t is sub-Markovian, i.e. if u ∈ L p and 0 ⩽ u ⩽ 1 a. e., then 0 ⩽ P ̃ t u ⩽ 1 a. e. P ̃ t is strongly continuous, i.e. limt→0 ‖P ̃ t u − u‖L p = 0 for all u ∈ L p . d) P

c)

6.

7.

Hint. For u ∈ L p , y 󳨃→ ‖u(⋅ + y) − u‖L p is continuous, cf. [232, Theorem 15.8].

Let (T t )t⩾0 be a Markov semigroup given by T t u(x) = ∫ℝd u(y) p t (x, dy) where p t (x, C) is a kernel in the sense of Remark 7.6. Show that the semigroup property of (T t )t⩾0 entails the Chapman–Kolmogorov equations p t+s (x, C) = ∫ p t (y, C) p s (x, dy) for all

x ∈ ℝd

and

C ∈ B(ℝd ).

Show that p xt1 ,...,t n (C1 × ⋅ ⋅ ⋅ × C n ) of Remark 7.6 define measures on B(ℝd⋅n ).

8. a) Let d(x, A) := inf a∈A |x − a| be the distance between the point x ∈ ℝd and the set A ∈ B(ℝd ). Show that x 󳨃→ d(x, A) is continuous. b) Let u n be as in (7.7). Show that u n ∈ Cc (ℝd ) and inf n u n (x) = 𝟙K (x). Draw a picture if d = 1 and K = [0, 1].

9.

d c) Let χ n be a sequence of type δ, i.e. χ n ∈ C∞ c (ℝ ), supp χ n ⊂ 𝔹(0, 1/n), χ n ⩾ 0 d and ∫ χ n (x) dx = 1. Show that v n := χ n ⋆ u n ∈ C∞ c (ℝ ) are smooth functions such that limn→∞ v n = 𝟙K .

Let (T t )t⩾0 be a Feller semigroup, i.e. a strongly continuous, positivity preserving sub-Markovian semigroup on (C∞ (ℝd ), ‖ ⋅ ‖∞ ), cf. Remark 7.4. a) Show that (t, x) 󳨃→ T t u(x) is jointly continuous for t ⩾ 0 and x ∈ ℝd . b) Conclude that for every compact set K ⊂ ℝd the function (t, x) 󳨃→ T t 𝟙K (x) is measurable.

c) Show that (t, x, u) 󳨃→ T t u(x) is jointly continuous for all t ⩾ 0, x ∈ ℝd and u ∈ C∞ (ℝd ).

j 10. Let A, B ∈ ℝd×d and set P t := exp(tA) := ∑∞ j=0 (tA) /j!. a) Show that P t is a strongly continuous semigroup. Is it contractive?

122 | 7 Brownian motion and transition semigroups b) Show that c) Show that

d tA d tA = Ae tA = dt e exists and that dt e e tA e tB = e tB e tA ∀t ⇐⇒ AB = BA.

e tA A.

Hint. Differentiate e tA e sB at t = 0 and s = 0.

d) (Trotter product formula) Show that e A+B = limk→∞ (e A/k e B/k )k .

11. Let (P t )t⩾0 and (T t )t⩾0 be two Feller semigroups with generators (A, D(A)), resp. (B, D(B)). d P t−s T s = −P t−s AT s + P t−s BT s on D(B). a) If D(B) ⊂ D(A), then we have ds b) Is it possible that U t := T t P t is again a Feller semigroup?

c) (Trotter product formula) Assume that U t f := limn→∞ (T t/n P t/n )n f exists strongly and locally uniformly for t. Show that U t is a Feller semigroup and determine its generator.

12. Complete the following alternative argument for Example 7.16. Assume that (u n )n⩾1 ⊂ C2∞ (ℝ) such that ( 12 u󸀠󸀠 n )n⩾1 is a Cauchy sequence in C∞ and that u n converges uniformly to u. x y x y a) Show that u n (x) − u n (0) − xu󸀠n (0) = ∫0 ∫0 u󸀠󸀠 n (z) dz dy → ∫0 ∫0 2g(z) dz dy uniformly. Conclude that c󸀠 := limn→∞ u󸀠n (0) exists. x

b) Show that u󸀠n (x) − u󸀠n (0) → ∫0 2g(z) dz uniformly. Deduce from this and part 0

x

a) that u󸀠n (x) converges uniformly to ∫0 2g(z) dz + c󸀠 where c󸀠 = ∫−∞ 2g(z) dz. x

y

c) Use a) and b) to show that u n (x) − u n (0) → ∫0 ∫−∞ 2g(z) dz uniformly and x

y

argue that u n (x) has the uniform limit ∫−∞ ∫−∞ 2g(z) dz dy.

13. Let U α be the α-potential operator of a BMd . Give a probabilistic interpretation of U α 𝟙C and limα→0 U α 𝟙C if C ⊂ ℝd is a Borel set.

14. Let U0 be the potential operator of a BMd in dimension d = 1 or d = 2. Show that every u ∈ D(U0 ) such that u ⩾ 0 is trivial, i.e. u = 0. 15. Let (U α )α>0 be the α-potential operator of a BMd . Use the resolvent equation to prove the following formulae for f ∈ Bb and x ∈ ℝd : and

dn U α f(x) = n!(−1)n U αn+1 f(x) dα n

dn (αU α )f(x) = n! (−1)n+1 (id −αU α )U αn f(x). dα n

16. (Dini’s theorem for C∞ ) Let (f n )n⩾1 ⊂ C∞ (ℝd ) be a sequence of functions such that 0 ⩽ f n ⩽ f n+1 and f := supn f n ∈ C∞ (ℝd ). Show that limn→∞ ‖f − f n ‖∞ = 0.

17. Let (A, D(A)) be the generator of a BMd . Adapt the arguments of Example 7.25 and show that C2∞ (ℝd ) ⊊ D(A) for any dimension d ⩾ 3.

Instructions. Use in Example 7.25 d-dimensional polar coordinates: y1 = r sin ϕ sin θ1 ⋅ . . . ⋅ sin θ d−2 , y2 = r cos ϕ sin θ1 ⋅ . . . ⋅ sin θ d−2 , . . . y d−1 = r cos θ d−3 ⋅ sin θ d−2 , y d = r cos θ d−2 , where (r, ϕ, θ1 , . . . , θ d−2 ) ∈ (0, ∞) × (0, 2π) × (0, π)d−2 .

Problems

| 123

The functional determinant is r d−1 (sin θ1 )1 (sin θ2 )2 . . . (sin θ d−2 )d−2 .

The calculations will lead to the following integrals (|x| < 1) π

∫ 0

(sin θ)d−2 dθ (1 + x2 − 2x cos θ)(d−2)/2

π

and

∫ 0

(sin θ)d dθ, (1 + x2 − 2x cos θ)(d−2)/2

(*)

which have the values a d and b d x2 + c d , respectively, with constants a d , b d , c d ∈ ℝ depending on the dimension, cf. Gradshteyn–Ryzhik [101, 3.665.2, p. 384]. Remark. A simple evaluation of the integrals (*) is possible only in d = 3. For d > 3 these integrals

are most elegantly evaluated using Gegenbauer polynomials, cf. [101, 8.93, pp. 1029–1031] and the discussion on stack exchange http://math.stackexchange.com/questions/395336.

18. Let (B t )t⩾0 be a BM1 and consider the two-dimensional process X t := (t, B t ), t ⩾ 0. a) Show that (X t )t⩾0 is a Feller process. b) Determine its transition semigroup, resolvent and generator.

c) State and prove Theorem 7.30 for this process and compare the result with Theorem 5.6

19. Show that Dynkin’s formula (7.30) in Proposition 7.31 follows from Theorem 7.30. 20. Let t 󳨃→ X t be a right continuous stochastic process. Show that for closed sets F ℙ(X t ∈ F ∀t ∈ ℝ+ ) = ℙ(X q ∈ F ∀q ∈ ℚ+ ).

d d 21. Assume that L : C∞ c (ℝ ) → C(ℝ ) is a local operator (cf. Definition 7.36) which is almost positive, i.e. for all ϕ ∈ C∞ c such that ϕ ⩾ 0 and ϕ(x 0 ) = 0 we have Lϕ(x0 ) ⩾ 0. a) Show that an operator that satisfies the positive maximum principle is almost positive. (This does not require locality).

b) Show that supp Lϕ ⊂ supp ϕ.

c) Show that the operator L is of order two, i.e. ‖Lϕ‖∞ ⩽ C‖ϕ‖(2) holds for all d α ϕ ∈ C∞ c (K) and any compact set K ⊂ ℝ . Here, ‖ϕ‖(2) = ∑0⩽|α|⩽2 ‖∂ ϕ‖∞ and the constant C depends only on K. Hint. Consider f(y) := c‖ϕ‖(2) |x − y|2 χ(y) ± (ϕ(y) − ϕ(x)χ(y) − ∇ϕ(x) ⋅ (y − x)χ(y)) where

𝟙K ⩽ χ ⩽ 1 is a smooth cut-off function and c is a sufficiently large constant. Now apply almost positivity at x0 = x.

d) Use Theorem 7.37 to conclude that L is a differential operator of order ⩽ 2.

8 The PDE connection We want to discuss some relations between partial differential equations (PDEs) and Brownian motion. For many classical PDE problems probability theory yields concrete representation formulae for the solutions in the form of expected values of a Brownian functional. These formulae can be used to get generalized solutions of PDEs (which require less smoothness of the initial/boundary data or the boundary itself) and they are amenable to Monte–Carlo simulations. Purely probabilistic existence proofs for classical PDE problems are, however, rare: classical solutions require smoothness, which does usually not follow from martingale methods.¹ This explains the role of Proposition 7.3.g) and Proposition 8.12. Let us point out that 12 ∆ has two meanings: it is the restriction of the generator (A, D(A)) of a Brownian motion, A|C2∞ = 21 ∆|C2∞ , but it 2 can also be seen as partial differential operator L = ∑dj,k=1 ∂x∂j ∂x k which acts on all C2

functions. Of course, on C2∞ both meanings coincide, and it is this observation which makes the method work. Throughout this chapter (A, D(A)) denotes the generator of a BMd . The origin of the probabilistic approach are the pioneering papers by Kakutani [133; 134], Kac [126; 127] and Doob [57]. As an illustration of the method we begin with the elementary (inhomogeneous) heat equation. In this context, classical PDE methods are certainly superior to the clumsy-looking Brownian motion machinery. Its elegance and effectiveness become obvious in the Feynman–Kac formula and the Dirichlet problem which we discuss in Sections 8.3 and 8.4. Moreover, since many second-order differential operators generate diffusion processes, see Chapter 23, only minor changes in our proofs yield similar representation formulae for the corresponding PDE problems. A martingale relation is the key ingredient. Let (B t , Ft )t⩾0 be a BMd . Recall from Theorem 5.6 that for u ∈ C1,2 ((0, ∞) × ℝd ) ∩ C([0, ∞) × ℝd ) satisfying 󵄨󵄨 d 󵄨󵄨 2 󵄨󵄨 ∂u(t, x) 󵄨󵄨 d 󵄨󵄨󵄨 ∂u(t, x) 󵄨󵄨󵄨 󵄨 󵄨󵄨 󵄨󵄨 + ∑ 󵄨󵄨󵄨 ∂ u(t, x) 󵄨󵄨󵄨 ⩽ c(t) e C|x| |u(t, x)| + 󵄨󵄨󵄨 󵄨󵄨 + ∑ 󵄨󵄨󵄨󵄨 󵄨 󵄨 󵄨 󵄨󵄨 ∂t 󵄨󵄨 󵄨 ∂x j 󵄨󵄨󵄨 j,k=1 󵄨󵄨󵄨 ∂x j ∂x k 󵄨󵄨󵄨 j=1 󵄨

(8.1)

for all t > 0 and x ∈ ℝd with some constant C > 0 and a locally bounded function c : (0, ∞) → [0, ∞), the process s

M su

= u(t − s, B s ) − u(t, B0 ) + ∫ ( 0

∂ 1 − ∆ x ) u(t − r, B r ) dr, s ∈ [0, t), ∂t 2

(8.2)

is an Fs martingale for every measure ℙx , i.e. for every starting point B0 = x of a Brownian motion.

1 A notable exception is, of course, Malliavin’s calculus, cf. [174; 176]. https://doi.org/10.1515/9783110741278-008

8.1 The heat equation | 125 Tab. 8.1: Some PDEs that can be solved probabilistically with a Brownian motion (B t )t⩾0 No.

PDE

1.

∂t u =

2. 3. 4. 5.

6.

1 2 ∆x u

∂t u =

1 2 ∆x u

∂t u =

1 2 ∆x u

∂t u =

1 2 ∆x u

condition

solution u(t, x), resp. w(x)

reference

u(0, ⋅) = f

𝔼x

Thm. 8.3

+g

u(0, ⋅) = f

+c⋅u

u(0, ⋅) = f

+ b ⋅ ∇x u

u(0, ⋅) = f

∆ x w = 0, x ∈ D

8.1 The heat equation

w|∂D = f

[f(B t )]

t

𝔼x [f(B t ) + ∫0 g(B s ) ds] t

𝔼x [f(B t ) + ∫0 g(t − s, B s ) ds] t

𝔼x [f(B t ) exp (∫0 c(B s ) ds)] t

Thm. 8.5 Ex. 8.10 Thm. 8.6, 8.8

𝔼x [f(B t ) exp (∫0 b(B s ) dB s

Thm. 23.13,

𝔼x f(B τ ), τ = inf{s : B s ∉ D}

Thm. 8.20



1 t 2 ∫0

|b(B s )|2 ds)]

Ex. 8.9 Ex. 8.10

Thm. 23.15

In Chapter 7 we have seen that the generator (A, D(A)) of a Brownian motion extends the Laplacian ( 12 ∆, C2∞ (ℝd )). In fact, applying Lemma 7.10 to a Brownian motion yields d P t f(x) = AP t f(x) for all f ∈ D(A). dt

Since P t f(x) = 𝔼x f(B t ) is smooth, cf. Problem 7.4, we get 12 ∆ x P t u(x) = AP t f(x). Setting u(t, x) := P t f(x) = 𝔼x f(B t ) this almost proves the following lemma.

Ex. 7.4

8.1 Lemma. Let (B t )t⩾0 be a BMd , f ∈ D(A) and u(t, x) := 𝔼x f(B t ). Then u(t, x) is the unique bounded solution in C1,2 ((0, ∞) × ℝd ) ∩ C([0, ∞) × ℝd ) and u(t, ⋅) ∈ C2∞ (ℝd ) of the heat equation with initial value f , ∂ 1 u(t, x) − ∆ x u(t, x) = 0, ∂t 2 u(0, x) = f(x).

(8.3a) (8.3b)

Proof. Since f ∈ D(A) ⊂ C∞ (ℝd ), a variation of the calculation in the proof of Proposition 7.3.g) shows that the u(t, x) = P t f(x) is in C1,2 ((0, ∞) × ℝd ) and u(t, ⋅) ∈ C2∞ (ℝd ). It remains to show uniqueness. Let u(t, x) be a bounded solution of (8.3) such that u ∈ C1,2 ((0, ∞) × ℝd ) ∩ C([0, ∞) × ℝd ) and u(t, ⋅) ∈ C2∞ (ℝd ). Thus, u(t, ⋅) ∈ D(A) and the Laplace transform is given by ∞

v λ (x) = ∫ u(t, x) e 0

−λt



dt = lim ∫ u(t, x) e−λt dt. ϵ→0

ϵ

Ex. 8.1

126 | 8 The PDE connection The limit exists in C∞ ; an application of differentiation theorem for parameter depend∞ ent integrals shows that x 󳨃→ ∫ϵ u(t, x) e−λt dt is in C2 and ∞

A ∫ u(t, x) e−λt dt = ϵ









ϵ

ϵ

ϵ

1 1 ∆ x ∫ u(t, x) e−λt dt = ∫ ∆ x u(t, x) e−λt dt = ∫ Au(t, x) e−λt dt. 2 2

Therefore, A ∫ϵ u(t, x) e−λt dt converges as ϵ → 0, and we conclude from the closedness of the operator (A, D(A)) that v λ ∈ D(A) as well as ∞



λv λ (x) − Av λ (x) = λv λ (x) − ∫ Au(t, x) e−λt dt = λv λ (x) − ∫ 0

0

∂ u(t, x) e−λt dt. ∂t

Integration by parts and (8.3b) yield ∞

(λ id −A) v λ (x) = λv λ (x) + ∫ u(t, x) 0

Ex. 8.1 Ex. 8.2

󵄨󵄨t=∞ ∂ −λt e dt − u(t, x)e−λt 󵄨󵄨󵄨 = f(x). 󵄨t=0 ∂t

The equation (λ id −A)v λ = f has for all f ∈ D(A) ⊂ C∞ (ℝd ) and λ > 0 the unique solution v λ = U λ f where U λ is the resolvent operator. The uniqueness of the Laplace transform, cf. [237, Proposition 1.2], shows that u(t, x) is the unique solution to the Problem (8.3) if we replace ( 21 ∆ x , C2∞ ) by its extension (A, D(A)). As u(t, ⋅) ∈ C2∞ (ℝd ), we see that u(t, x) is also the unique solution to the original problem.

Lemma 8.1 is a PDE proof. Let us sketch a further, more probabilistic approach which will be important if we consider more complicated PDEs. It is, admittedly, too complicated for the simple setting considered in Lemma 8.1, but it allows us to remove the restriction that f ∈ D(A), see the discussion in Remark 8.4 below. 8.2 Probabilistic proof of Lemma 8.1. Uniqueness: assume that u(t, x) satisfies (8.1) and |u(t, x)| ⩽ c e c|x| on [0, T] × ℝd where c = c T . By (8.2), s

M su := u(t − s, B s ) − u(t, B0 ) + ∫ ( 0

1 ∂ − ∆ x ) u(t − r, B r ) dr, ∂t 2 =0 by (8.3a)

is a martingale w.r.t. (Fs )s∈[0,t) , and so is N su := u(t − s, B s ), s ∈ [0, t). By assumption, u is exponentially bounded. Therefore, (N su )s∈[0,t) is dominated by the integrable function c sup0⩽s⩽t exp(c |B s |) ∈ L1 (ℙ), hence, uniformly integrable. By the martingale convergence theorem, Theorem A.6, the limit lims→t N su exists a.s. and in L1 . Thus, (8.3b)

u(t, x) = 𝔼x u(t, B0 ) = 𝔼x N0u = lim 𝔼x N su = 𝔼x N tu = 𝔼x f(B t ), s→t

which shows that the initial condition f uniquely determines the solution.

8.1 The heat equation | 127

Existence: set u(t, x) := 𝔼x f(B t ) = P t f(x) and assume that f : ℝd → ℝ is measurable and exponentially bounded: |f(x)| ⩽ Ce C|x| . Lemma 7.10 and a obvious modification of the proof of Proposition 7.3.g) show that u is in C1,2 ((0, ∞) × ℝd ) ∩ C([0, ∞) × ℝd ) and satisfies (8.1). Therefore, (8.2) tells us that s

M su := u(t − s, B s ) − u(t, B0 ) + ∫ ( 0

1 ∂ − ∆ x ) u(t − r, B r ) dr, ∂t 2

is a martingale. On the other hand, the Markov property reveals that u(t − s, B s ) = 𝔼B s f(B t−s ) = 𝔼x (f(B t ) | Fs ),

0 ⩽ s < t,

0 ⩽ s ⩽ t,

∂ is a martingale, and so is ∫0 ( ∂t − 12 ∆ x ) u(t − r, B r ) dr, s ⩽ t. Since it has continuous paths of bounded variation, it is identically zero on [0, t), cf. Proposition 17.2, and (8.3a) follows. s

The proof of Proposition 7.3.g) shows that (t, x) 󳨃→ P t f(x) = 𝔼x f(B t ), t > 0, x ∈ ℝd , is smooth. This allows us to extend the probabilistic approach using a localization technique described in Remark 8.4 below. This proves the following theorem. 8.3 Theorem. Let (B t )t⩾0 be a BMd and f ∈ B(ℝd ) such that |f(x)| ⩽ C e C|x| . Then u(t, x) := 𝔼x f(B t ) is the unique solution to the heat equation (8.3) satisfying the growth estimate |u(t, x)| ⩽ Ce C|x| .

Some kind of growth estimate as in the statement of Theorem 8.3 is essential for the uniqueness, cf. Evans [80, Chapter 2.3, p. 59] or John [125, Chapter 7.1, p. 211]. 8.4 Remark (Localization). Let (B t )t⩾0 be a BMd , u ∈ C1,2 ((0, ∞)×ℝd )∩ C([0, ∞)×ℝd ), fix t > 0, and consider the following sequence of stopping times τ n := inf {s > 0 : |B s | ⩾ n} ∧ (t − n−1 ),

n ⩾ 1/t.

(8.4)

For every n ⩾ 1/t we pick some cut-off function χ n ∈ C1,2 ([0, ∞) × ℝd ) with compact support, χ n |[0,1/2n]×ℝd ≡ 0 and χ n |[1/n,t]×𝔹(0,n) ≡ 1. Clearly, for M su as in (8.2), χ u

n u M s∧τ = M s∧τ n

n

for all s ∈ [0, t)

and

n ⩾ 1/t.

n Since χ n u ∈ C1,2 b satisfies (8.1), (M s , Fs )s∈[0,t) is, for each n ⩾ 1/t, a bounded martingale. By optional stopping, Remark A.21,

u , Fs )s∈[0,t) (M s∧τ n

χ u

Ex. 8.1

is a bounded martingale for all u ∈ C1,2 and n ⩾ 1/t.

(8.5)

Ex. 8.3

128 | 8 The PDE connection Assume that u ∈ C1,2 ((0, ∞) × ℝd ). Replacing in the probabilistic proof 8.2 B s , M su and u u N su by the stopped versions B s∧τ n , M s∧τ and N s∧τ , we see that ² n

n

s∧τ n u M s∧τ n

= u(t − s ∧ τ n , B s∧τ n ) − u(t, B0 ) + ∫ ( 0

∂ 1 − ∆ x ) u(t − r, B r ) dr, ∂t 2 =0 by (8.3a)

u and N s∧τ = u(t − s ∧ τ n , B s∧τ n ), s ∈ [0, t), are for every n ⩾ 1/t bounded martingales. n u In particular, 𝔼x N0u = 𝔼x N s∧τ for all s ∈ [0, t), and by the martingale convergence n theorem we get u u u(t, x) = 𝔼x u(t, B0 ) = 𝔼x N0u = lim 𝔼x N s∧τ = 𝔼x N t∧τ . s→t

As limn→∞ τ n = t a.s., we see

n

n

(8.3b)

u u(t, x) = 𝔼x N t∧τ = lim 𝔼x u(t − t ∧ τ k , B t∧τ ) = 𝔼x u(0, B t ) = 𝔼x f(B t ), n

k→∞

k

which shows that the solution is unique. To see that u(t, x) := P t f(x) is a solution, we assume that |f(x)| ⩽ Ce C|x| . We can adapt the proof of Proposition 7.3.g) to see that u ∈ C1,2 ((0, ∞) × ℝd ). Therefore, (8.5) u shows that (M s∧τ , Fs )s∈[0,t) is, for every n ⩾ 1/t, a bounded martingale. By the Markov n property u(t − s, B s ) = 𝔼B s f(B t−s ) = 𝔼x (f(B t ) | Fs ),

and by optional stopping (u(t −s ∧τ n , B s∧τ n ), Fs )s∈[0,t) is for every n ⩾ 1/t a martingale. s∧τ n

Ex. 8.3

0 ⩽ s ⩽ t,

This implies that ∫0

∂ − 12 ∆ x ) u(t − r, B r ) dr is a martingale with ( ∂t

continuous paths of bounded variation. Hence, it is identically zero on [0, t ∧ τ n ), cf. Proposition 17.2. As limn→∞ τ n = t, (8.3a) follows.

8.2 The inhomogeneous initial value problem

If we replace the right-hand side of the heat equation (8.3a) with some function g(t, x), we get the following inhomogeneous initial value problem. ∂ 1 v(t, x) − ∆ x v(t, x) = g(t, x), ∂t 2 v(0, x) = f(x). s∧τ n

2 Note that ∫0

⋅ ⋅ ⋅ dr = ∫[0,s∧τ ) ⋅ ⋅ ⋅ dr. n

(8.6a) (8.6b)

8.2 The inhomogeneous initial value problem | 129

Assume that u solves the corresponding homogeneous problem (8.3). If w solves the inhomogeneous problem (8.6) with zero initial condition f = 0, then v = u + w is a solution of (8.6). Therefore, it is sufficient to consider ∂ 1 v(t, x) − ∆ x v(t, x) = g(t, x), (8.7a) ∂t 2 v(0, x) = 0 (8.7b) instead of (8.6). Let us use the probabilistic approach of Section 8.1 to develop an educated guess about the solution to (8.7). Let v(t, x) be a solution to (8.7) which is bounded and sufficiently smooth (C1,2 and (8.1) will do). Then, for s ∈ [0, t), s

M sv

:= v(t − s, B s ) − v(t, B0 ) + ∫ ( 0

1 ∂ v(t − r, B r ) − ∆ x v(t − r, B r )) dr ∂t 2 s ∫0

=g(t−r,B r ) by (8.7a)

is a martingale, and so is := v(t − s, B s ) + g(t − r, B r ) dr, s ∈ [0, t); in particular, 𝔼x N0v = 𝔼x N sv for all s ∈ [0, t). If we can use the martingale convergence theorem, e.g. if the martingale is uniformly integrable, we can let s → t and get N sv

t

(8.7b)

v(t, x) = 𝔼x N0v = lim 𝔼x N sv = 𝔼x N tv = 𝔼x (∫ g(t − r, B r ) dr) . s→t

0

This consideration already yields uniqueness. For the existence we have to make sure that we may perform all manipulations in the above calculation. With some effort we can justify everything if g(t, x) is bounded or if g = g(x) is from the so-called Kato class. For the sake of clarity, we content ourselves with the following simple version. 8.5 Theorem. Let (B t )t⩾0 be a BMd and g ∈ D(A). Then the unique solution of the inhomogeneous initial value problem (8.7) with g = g(x) satisfying |v(t, x)| ⩽ C t is given t by v(t, x) = 𝔼x ∫0 g(B s ) ds. t

Analysts know the formula v(t, x) = 𝔼x ∫0 g(B s ) ds as Duhamel’s principle. t

t

Proof of Theorem 8.5. Since v(t, x) = ∫0 𝔼x g(B s ) ds = ∫0 P s g(x) ds, Lemma 7.10 and a variation of the proof of Proposition 7.3.g) imply that v ∈ C1,2 ((0, ∞) × ℝd ). Thus, our heuristic derivation of v(t, x) shows that v(t, x) is the only possible candidate for the of the problem (8.7). To see that v(t, x) is indeed a solution, we use (8.2) and conclude s

M sv

= v(t − s, B s ) − v(t, B0 ) + ∫ ( 0

s

∂ 1 − ∆ x ) v(t − r, B r ) dr ∂t 2

= v(t − s, B s ) + ∫ g(B r ) dr − v(t, B0 ) 0

s

+ ∫ [( 0

1 ∂ − ∆ x ) v(t − r, B r ) − g(B r )] dr ∂t 2

Ex. 8.5 Ex. 8.9

130 | 8 The PDE connection is a martingale. On the other hand, the Markov property gives for all s ⩽ t, s t 󵄨󵄨 󵄨󵄨 x 𝔼 (∫ g(B r ) dr 󵄨󵄨󵄨 Fs ) = ∫ g(B r ) dr + 𝔼 (∫ g(B r ) dr 󵄨󵄨 s 0 0 t

x

󵄨󵄨 󵄨󵄨 󵄨󵄨 Fs ) 󵄨󵄨 󵄨

󵄨󵄨 󵄨 = ∫ g(B r ) dr + 𝔼 ( ∫ g(B r+s ) dr 󵄨󵄨󵄨󵄨 Fs ) 󵄨󵄨 0 0 s

t−s

x

s

t−s

= ∫ g(B r ) dr + 𝔼B s ( ∫ g(B r ) dr) 0

0

s

= ∫ g(B r ) dr + v(t − s, B s ). 0

This shows that the right-hand side is for s ⩽ t a martingale. Consequently, s

∫( 0

Ex. 8.3

Ex. 8.10

∂ 1 v(t − r, B r ) − ∆ x v(t − r, B r ) − g(B r )) dr, ∂t 2

s ∈ [0, t),

is a martingale with continuous paths of bounded variation; by Proposition 17.2 it is zero on [0, t), and (8.7a) follows.

If g = g(x) does not depend on time, we can use the localization technique described in Remark 8.4 to show that Theorem 8.5 remains valid if g ∈ Bb (ℝd ). This requires a t modification of the proof of Proposition 7.3.g) to see that v(t, x) := 𝔼x ∫0 g(B r ) dr is in C1,2 ((0, ∞)×ℝd ). Then the proof is similar to the one for the heat equation described in Remark 8.4. As soon as g depends on time, g = g(t, x), things become messy, requiring arguments like those in Lemma 8.7, cf. [66, Chapter 8.2]. Still, the solution is given by t 𝔼 (∫0 g(t − s, B s ) ds).

8.3 The Feynman–Kac formula

We have seen in Section 8.2 that the inhomogeneous initial value problem (8.7) is t solved by the expectation of the random variable G t = ∫0 g(B r ) dr. It is an interesting question whether we can find out more about the probability distributions of G t and (B t , G t ). A natural approach would be to compute the joint characteristic function 𝔼x [exp(iξG t + i⟨η, B t ⟩)]. Let us consider the more general problem to calculate t

w(t, x) := 𝔼x (f(B t )e C t )

where

C t := ∫ c(B r ) dr 0

8.3 The Feynman–Kac formula | 131

for some functions f, c : ℝd → ℂ. Setting f(x) = e i⟨η,x⟩ and c(x) = iξg(x) brings us back to the original question. Considering real and imaginary parts separately, we may assume that f and c are real-valued. We will show that, under suitable conditions on f and c, the function w(t, x) is the unique solution to the following initial value problem 1 ∂ w(t, x) − ∆ x w(t, x) − c(x)w(t, x) = 0, ∂t 2 w(t, x) ∈ C([0, ∞) × ℝd ) and w(0, x) = f(x).

(8.8a) (8.8b)

To begin with, let us study the solution to the following simple problem for the generator (A, D(A)) of a BMd ∂ w(t, x) − Aw(t, x) − c(x)w(t, x) = 0, ∂t w(t, x) ∈ C([0, ∞) × ℝd ) and w(0, x) = f(x).

(8.9a)

(8.9b)

Note the subtle difference: the solution to (8.9) requires “only” w(t, ⋅) ∈ D(A) and w(⋅, x) ∈ C1 (0, ∞) ∩ C[0, ∞), whereas for the problem (8.8) we need that w ∈ C1,2 ((0, ∞) × ℝd ) ∩ C([0, ∞) × ℝd ); then, of course A and 12 ∆ x coincide. The next theorem can be shown by semigroup methods.

8.6 Theorem (Kac 1949). Let (B t )t⩾0 be a BMd , f ∈ D(A) and c ∈ Cb (ℝd ). Then t

w(t, x) = 𝔼x [f(B t ) exp (∫0 c(B r ) dr)]

(8.10)

is the unique solution to the initial value problem (8.9) such that |w(t, x)| ⩽ ce αt with α = supx∈ℝd c(x).

The representation formula (8.10) is called Feynman–Kac formula.

t

Proof of Theorem 8.6. Denote by P t the Brownian semigroup. Set C t := ∫0 c(B r ) dr and T t u(x) := 𝔼x (u(B t )e C t ). If we could show that (T t )t⩾0 is a Feller semigroup with generator Lu = Au + cu and D(L) = D(A), we are done: the existence of the solution w(t, x) = T t f(x) follows from Lemma 7.10, its uniqueness from the fact that L has a resolvent which is uniquely determined by T t , cf. Proposition 7.13. But there is a problem. From |T t u(x)| ⩽ exp (t supx∈ℝd c(x)) 𝔼x |u(B t )| ⩽ e tα ‖u‖∞ ,

we see that the operator T t is, in general, a contraction only if α ⩽ 0. Let us assume α ⩽ 0 for the time being and show that (T t )t⩾0 is a Feller semigroup. a) Positivity is clear from the very definition. b) Feller property: for all u ∈ C∞ (ℝd ) we have

t

T t u(x) = 𝔼x [u(B t ) exp (∫0 c(B r ) dr)] t

= 𝔼 [u(B t + x) exp (∫0 c(B r + x) dr)] .

132 | 8 The PDE connection Using the dominated convergence theorem yields limy→x T t u(y) = T t u(x) and lim|x|→∞ T t u(x) = 0. Thus, T t u ∈ C∞ (ℝd ).

c) Strong continuity: let u ∈ C∞ (ℝd ) and t ⩾ 0. Using the elementary inequality 1 − e c ⩽ |c| for all c ⩽ 0, we get 󵄨 󵄨 󵄨 󵄨 |T t u(x) − u(x)| ⩽ 󵄨󵄨󵄨󵄨𝔼x (u(B t )(e C t − 1))󵄨󵄨󵄨󵄨 + 󵄨󵄨󵄨𝔼x (u(B t ) − u(x))󵄨󵄨󵄨 ⩽ ‖u‖∞ 𝔼x |C t | + ‖P t u − u‖∞ ⩽ t ‖u‖∞ ‖c‖∞ + ‖P t u − u‖∞ .

Since the Brownian semigroup (P t )t⩾0 is strongly continuous on C∞ (ℝd ), the strong continuity of (T t )t⩾0 follows.

d) Semigroup property: let s, t ⩾ 0 and u ∈ C∞ (ℝd ). Using the tower property of conditional expectation and the Markov property, we see t+s

T t+s u(x) = 𝔼x [u(B t+s ) exp (∫0

c(B r ) dr)]

󵄨󵄨 s c(B r ) dr) exp (∫0 c(B r ) dr) 󵄨󵄨󵄨 Fs )] 󵄨 󵄨󵄨 t s = 𝔼x [𝔼x (u(B t+s ) exp (∫0 c(B r+s ) dr) 󵄨󵄨󵄨 Fs ) exp (∫0 c(B r ) dr)] 󵄨 t+s

= 𝔼x [𝔼x (u(B t+s ) exp (∫s t

s

= 𝔼x [𝔼B s (u(B t ) exp (∫0 c(B r ) dr)) exp (∫0 c(B r ) dr)] s

= 𝔼x [T t u(B s ) exp (∫0 c(B r ) dr)] = T s T t u(x).

Because of Theorem 7.22, we may calculate the generator L using pointwise convergence: for all u ∈ D(A) and x ∈ ℝd lim

t→0

T t u(x) − P t u(x) P t u(x) − u(x) T t u(x) − u(x) = lim + lim . t→0 t→0 t t t

(8.11)

The second limit on the right-hand side equals Au(x); since (e C t − 1)/t → 󳨀 c(B0 ) boundedly – cf. the estimate in the proof of the strong continuity c) – we can use dominated convergence and find for any u ∈ C∞ (ℝd ) lim 𝔼x (u(B t ) t→0

Ex. 7.11

e Ct − 1 ) = 𝔼x (u(B0 )c(B0 )) = u(x)c(x). t

Since this limit always exists, we conclude that the limit on the left-hand side of (8.11) exists if, and only if, the right-most limit exists, and so D(A) = D(L). This completes the proof in the case α ⩽ 0. If α > 0 we can consider the following auxiliary problem: Replace in (8.8) the function c(x) by c(x) − α. From the first part we know that e−αt T t f = 𝔼x (f(B t )e C t −αt ) is the unique solution of the auxiliary problem (

∂ 1 − ∆ − c(⋅) + α) e−αt T t f = 0. ∂t 2

8.3 The Feynman–Kac formula | 133

On the other hand, d −αt d e T t f = −αe−αt T t f + e−αt T t f, dt dt

and this shows that T t f is the unique solution of the original problem. Let us return to the original problem (8.8) and try out the strategy of Sections 8.1 and 8.2. We will see that f ∈ C∞ (ℝd ) will be a good initial datum. Assume, for the time being, that w ∈ C1,2 solves (8.8) and satisfies (8.1). If c(x) is bounded, we can apply (8.2) d to w(t − s, x) exp(C s ) for s ∈ [0, t). Since dt exp(C t ) = c(B t ) exp(C t ), we see that for s ∈ [0, t) s

M sw := w(t − s, B s )e C s − w(t, B0 ) + ∫( 0

∂ 1 − ∆ x − c(B r )) w(t − r, B r ) e C r dr ∂t 2 =0 by (8.8a)

is a uniformly integrable martingale under the filtration (Fs )s∈[0,t) , and the same holds for N w := (w(t − s, B s ) exp(C s ))s∈[0,t) . Since 𝔼x N0w = 𝔼x N sw for all s ∈ [0, t), the martingale convergence theorem, Theorem A.7, gives (8.8b)

w(t, x) = 𝔼x w(t, B0 ) = 𝔼x N0w = lim 𝔼x N sw = 𝔼x N tw = 𝔼x (f(B t )e C t ) . s→t

This reveals both the structure of the solution and its uniqueness. To show with martingale methods that w(t, x) := 𝔼x (f(B t )e C t ) solves (8.8) requires a priori knowledge that w is smooth, e.g. w ∈ C1,2 . 8.7 Lemma. Consider the initial value problem (8.8) where f ∈ Cb (ℝd ) and c is bounded and Hölder continuous with exponent κ ∈ (0, 1], i.e. |c(x) − c(y)| ⩽ C|x − y|κ for all x, y ∈ ℝd . Then w(t, x) defined in (8.10) is in C1,2 ((0, ∞) × ℝd ) ∩ C([0, ∞) × ℝd ) and satisfies (8.1).

Proof. ³ We split the argument into several steps. Throughout the proof, we denote by 2 (j) g t (x) = (2πt)−d/2 e−|x| /(2t) the d-dimensional normal density and x j , resp. B t , are the coordinates of x ∈ ℝd , resp. the BMd B t . We will frequently use the following simple identities (j, k = 1, . . . , d): ∂ j g t (x) = −

xj g t (x), t

∂ j ∂ k g t (x) =

x j x k − tδ jk g t (x), t2

∂ t g t (x) = (

|x|2 d − ) g t (x). 2t2 2t

In what follows, we need the continuity and differentiability lemmas for parameter dependent integrals (see, e.g. [232, Theorems 12.4, 12.5] or [235, § 12]), but we do not check the integrability assumptions in detail. This simple but tedious chore is left to

3 The proof of this lemma is quite technical and should be omitted on first reading. The idea of the proof is taken from Friedman [91, pp. 8–13].

Ex. 8.10

134 | 8 The PDE connection the reader; the reason why things work out is that the normal distribution g t (x), hence Brownian motion B t , has absolute moments of any order.

1o If f, c ∈ Cb (ℝd ), then w ∈ Cb ([0, ∞) × ℝd ). From the definition of w(t, x) and the def

fact that 𝔼x Φ(B∙ ) = 𝔼0 Φ(B∙ + x) = 𝔼 Φ(B∙ + x) we get t

w(t, x) = 𝔼 (f(B t + x) exp [∫0 c(B r + x) dr]) .

Since f and c are bounded and continuous, the claim follows from dominated convergence. Note that |w(t, x)| ⩽ ‖f‖∞ e tα with α = supx∈ℝd c(x). 2o We need a further characterization of w(t, x) in terms of an integral equation. Differt entiating s 󳨃→ exp [∫s c(B r ) dr] and then integrating it again yields t

t exp [∫0

t

c(B r ) dr] = 1 + ∫ c(B s ) exp [∫s c(B r ) dr] ds. 0

We multiply this identity with f(B t ), take expectations and use Fubini, conditioning and the Markov property to get t

󵄨󵄨 t w(t, x) = 𝔼x f(B t ) + ∫ 𝔼x [𝔼x (c(B s ) exp [∫s c(B r ) dr] f(B t ) 󵄨󵄨󵄨 B s )] ds 󵄨 0

t

t−s

= 𝔼 f(B t ) + ∫ 𝔼x [c(B s ) 𝔼B s (exp [∫0 x

0

t

c(B u ) du] f(B t−s ))] ds

= 𝔼x f(B t ) + ∫ 𝔼x [c(B s )w(t − s, B s )] ds =: 𝔼x f(B t ) + J(t, x).

(8.12)

0

3o If f, c ∈ Cb (ℝd ), then w(t, ⋅) ∈ C1 (ℝd ) and ‖∂ j w(t, ⋅)‖∞ ⩽ C(t1/2 + t−1/2 ). Using the representation (8.12), we show the claim separately for 𝔼x f(B t ) and J(t, x). The differentiation lemma for parameter dependent integrals yields 1 (j) ∂ j 𝔼x f(B t ) = ∂ j 𝔼 f(B t + x) = ∂ j (f ∗ g t )(x) = f ∗ (∂ j g t )(x) = − 𝔼 [f(B t + x)B t ] . t

(j) (j) Since f is bounded and 𝔼 |B t | ⩽ √𝔼[|B t |2 ] = √t, we get |∂ j 𝔼x f(B t )| ⩽ ‖f‖∞ t−1/2 . A similar reasoning for the integral term in (8.12) gives t

∂ j J(t, x) = ∫ ∫ c(y)w(t − s, y)∂ j g s (x − y) dy ds 0 ℝd t

= − ∫ ∫ c(y)w(t − s, y) 0 ℝd

xj − yj g s (x − y) dy ds s

8.3 The Feynman–Kac formula | 135

t

1 (j) = − ∫ 𝔼 [c(B s + x)w(t − s, B s + x) B s ] ds. s 0

(j)

Since c(y) ⋅ w(s, y) is bounded, cf. Step 2o , and 𝔼 |B s | ⩽ √s, we get t

󵄨 󵄨󵄨 󵄨󵄨∂ j J(t, x)󵄨󵄨󵄨 ⩽ ‖c‖∞ ‖f‖∞ e tα ∫ s−1/2 ds = 2‖c‖∞ ‖f‖∞ e tα √t. 0

Thus, the first partial derivatives exist; their continuity follows either from Step 4o , or from a direct application of the continuity lemma. 4o If f, c ∈ Cb (ℝd ) and if c is κ-Hölder continuous, then w(t, ⋅) ∈ C2 (ℝd ). Again we differentiate the two terms appearing in the representation (8.12) and get ∂ j ∂ k 𝔼x f(B t ) = f ∗ (∂ j ∂ k g t )(x) =

1 (j) (k) 𝔼 [f(B t + x) (B t B t − tδ jk )] . t2

󵄨󵄨 (j) (k) 󵄨󵄨 Note that 𝔼 󵄨󵄨󵄨B t B t − tδ jk 󵄨󵄨󵄨 < ∞, so everything converges as f is bounded. It is not hard 󵄨 󵄨 to check that the second derivatives are continuous if we use the continuity lemma for parameter dependent integrals for this expectation. For the integral expression we begin as in Step 3o : t

∂ j ∂ k J(t, x) = ∫ ∫ c(y)w(t − s, y)∂ j ∂ k g s (x − y) dy ds 0 ℝd t

= ∫ ∫ c(y)w(t − s, y) 0 ℝd t

(x j − y j )(x k − y k ) − δ jk s g s (x − y) dy ds s2

= ∫ 𝔼 [c(B s + x)w(t − s, B s + x) 0

(j) (k)

B s B s − δ jk s ] ds. s2

󵄨󵄨 (j) (k) 󵄨󵄨 The straightforward estimate s−2 𝔼 󵄨󵄨󵄨B s B s − δ jk s󵄨󵄨󵄨 ≈ s−1 leaves us with a non󵄨 󵄨 integrable singularity at s = 0. Here we can use that c(⋅) is Hölder continuous and that, by Step 3o , w(t − s, ⋅) is Lipschitz continuous with a Lipschitz constant of (j) (k) order (t − s)−1/2 . Since 𝔼 (B s B s − δ jk s) = 0, we can slip in c(x)w(t − s, x) and get t

∂ j ∂ k J(t, x) = ∫ 𝔼 [(c(B s + x)w(t − s, B s + x) − c(x)w(t − s, x)) 0

(j) (k)

B s B s − δ jk s ] ds. s2

Since the product of two bounded Hölder continuous functions is again Hölder continuous with the smaller of the two Hölder indices, we see that y 󳨃→ c(y)w(t − s, y) is κ-Hölder, and the Hölder constant is of order (t − s)−1/2 . Thus, the integral above can

136 | 8 The PDE connection be estimated by ⁴ t

t

󵄨󵄨 󵄨󵄨 (j) (k) ds ds C ∫ 𝔼 [|B s | 󵄨󵄨󵄨B s B s − δ jk s󵄨󵄨󵄨] ⩽ C󸀠 ∫ (𝔼 [|B s |2+κ ] + s 𝔼 [|B s |κ ]) . 2 󵄨 s2 √t − s 󵄨 √ s t−s κ

0

0

The scaling property of Brownian motion, B s ∼ √sB1 , shows that 𝔼 [|B s |2+κ ] ≈ s1+κ/2 and 𝔼 [|B s |κ ] ≈ s κ/2 , so that the above integral converges; thus, the application of the differentiation lemma is indeed justified. The continuity of the second derivatives follows from the continuity lemma for parameter dependent integrals. 5o If f, c ∈ Cb (ℝd ) and if c is κ-Hölder continuous, then w(⋅, x) ∈ C1 (0, ∞). Again we differentiate the two terms appearing in the representation (8.12) and get ∂ t 𝔼x f(B t ) = f ∗ (∂ t g t )(x) =

1 𝔼 [f(B t + x) (|B t |2 − td)] . 2t2

Note that the expectation is finite and that it is such that we can apply the continuity lemma to get continuity of the derivative. For the second term we use again the representation (8.12), but we change first variables s 󴁄󴀼 t − s. Then, t

∂ t J(t, x) = 𝔼 [c(B0 )w(t, B0 )] + ∫ ∫ c(y)w(s, y)∂ t g t−s (x − y) dy ds. x

= c(x)w(t, x)

0 ℝd

Now we can use the elementary identity ∂ t g t (x) = to the integrals t

1 2

∑dj=1 ∂2j g t (x) reducing everything

1 ∫ ∫ c(y)w(s, y)∂2j g t−s (x − y) dy ds 2 0 ℝd

which have already been dealt with in the last step. This completes the proof. We can now wrap up the discussion in the following theorem. 8.8 Theorem. Let (B t )t⩾0 be a BMd and f, c ∈ Cb (ℝd ) such that c is locally Hölder continuous. Then t w(t, x) = 𝔼x [f(B t ) exp (∫0 c(B r ) dr)] (8.13)

is the unique solution to the initial value problem (8.8) such that |w(t, x)| ⩽ Ce αt .

4 If we require only local Hölder continuity of c, we need at this point one more argument, splitting the above integral into a part with |B s | > R and |B s | ⩽ R, etc. This is just a technical point.

8.3 The Feynman–Kac formula | 137

8.9 An Application: Lévy’s arc-sine law (Lévy 1939. Kac 1949). Let (B t )t⩾0 be a BM1 t and G t := ∫0 𝟙(0,∞) (B s ) ds the total amount of time a Brownian motion spends in the positive half-axis up to time t. Lévy showed in [163] that G t has an arc-sine distribution: t

ℙ0 ( ∫ 𝟙(0,∞) (B s ) ds ⩽ x) = 0

x 2 arcsin √ , π t

0 ⩽ x ⩽ t.

(8.14)

We calculate the Laplace transform g α (t) = 𝔼0 e−αG t , α ⩾ 0, to identify the probability distribution of G t . Define for λ > 0 and x ∈ ℝ ∞



t

v λ (x) := ∫ e−λt 𝔼x e−αG t dt = ∫ e−λt 𝔼x (𝟙ℝ (B t ) ⋅ exp [−α ∫0 𝟙(0,∞) (B s ) ds]) dt. 0

0

=w(t,x) as in Thm. 8.6

Assume that we could apply Theorem 8.6 with f(x) = 𝟙ℝ (x) and c(x) = −α𝟙(0,∞) (x). Then λ 󳨃→ v λ (x) is the Laplace transform of w(t, x) which, by Theorem 8.6, is the solution of the initial value problem (8.9). A short calculation shows that the Laplace transform satisfies the following linear ODE: 1 󸀠󸀠 v (x) − α𝟙(0,∞) (x)v λ (x) = λv λ (x) − 𝟙ℝ (x) for all x ∈ ℝ \ {0}. 2 λ

(8.15)

This ODE is, with suitable initial conditions, uniquely solvable in (−∞, 0) and (0, ∞). It will become clear from the approximation below that v λ is of class C1 ∩ Cb on the whole real line. Therefore we can prescribe conditions in x = 0 such that v λ is the unique bounded C1 solution of (8.15). With the usual Ansatz for linear second-order ODEs, we get 1 √ √ { + C1 e 2λx + C2 e− 2λx , { { {λ v λ (x) = { { { 1 { + C3 e√2(α+λ)x + C4 e−√2(α+λ)x , {α + λ

x < 0, x > 0.

Let us determine the constants C j , j = 1, 2, 3, 4. The boundedness of the function v λ gives C2 = C3 = 0. Since v λ and v󸀠λ are continuous at x = 0, we get v λ (0+) =

1 1 + C4 = + C1 = v λ (0−), α+λ λ

v󸀠λ (0+) = −√2(α + λ) C4 = √2λ C1 = v󸀠λ (0−).

Solving for C1 yields C1 = −λ−1 + (λ(λ + α))−1/2 , and so ∞

v λ (0) = ∫ e−λt 𝔼0 (e−αG t ) dt = 0

1 . √ λ(λ + α)

138 | 8 The PDE connection ∞

On the other hand, by Fubini’s theorem and the formula ∫0 (πt)−1/2 e−ct dt = c−1/2 : ∞

∫e 0



π/2

−λt

t

2 2 e−αr dr dt ∫ e−αt sin θ dθ dt = ∫ e−λt ∫ π π√ r(t − r)

0

0 ∞

=∫

0 ∞

=∫ 0

e−αr

√πr

0 ∞

∫ r

e−λt

√ π(t − r)

dt dr



e−(α+λ)r e−λs 1 . dr ∫ ds = √πr √πs √(λ + α)λ 0

Because of the uniqueness of the Laplace transform, cf. [237, Proposition 1.2], we conclude g α (t) = 𝔼0 e−αG t =

π/2

t

0

0

2 2 x 2 ∫ e−αt sin θ dθ = ∫ e−αx d x arcsin √ , π π t

and, again by the uniqueness of the Laplace transform, we get (8.14). Let us now justify the formal manipulations that led to (8.15). In order to apply Theorem 8.6, we cut and smooth the functions f(x) = 𝟙ℝ (x) and g(x) = 𝟙(0,∞) (x). Pick a sequence of cut-off functions χ n ∈ C∞ c (ℝ) such that 𝟙[−n,n] ⩽ χ n ⩽ 1, and define g n (x) = 0 ∨ (nx) ∧ 1. Then χ n ∈ D(A), g n ∈ Cb (ℝ), supn⩾1 χ n = 𝟙ℝ and supn⩾0 g n = g. By Theorem 8.6, t

w n (t, x) := 𝔼x (χ n (B t ) exp [−α ∫0 g n (B s ) ds]) ,

Ex. 8.4

t > 0, x ∈ ℝ,

is, for each n ⩾ 1, the unique bounded solution of the initial value problem (8.9) with f = χ n and c = g n ; note that the bound from Theorem 8.6 does not depend on n ⩾ 1. The Laplace transform λ 󳨃→ v n,λ (x) of w n (t, x) is the unique bounded solution of the ODE 1 󸀠󸀠 (8.16) v (x) − αχ n (x)v n,λ (x) = λv n,λ (x) − g n (x), x ∈ ℝ. 2 n,λ Integrating two times, we see that limn→∞ v n,λ (x) = v λ (x), limn→∞ v󸀠n,λ (x) = v󸀠λ (x) for 󸀠󸀠 1 all x ∈ ℝ and limn→∞ v󸀠󸀠 n,λ (x) = v λ (x) for all x ≠ 0; moreover, we have v λ ∈ C (ℝ), cf. Problem 8.4.

8.4 The Dirichlet problem Up to now we have considered parabolic initial value problems. In this section we focus on the most basic elliptic boundary value problem, Dirichlet’s problem for the Laplace operator with given boundary data. Throughout this section D ⊂ ℝd is a bounded open and connected domain, and we write ∂D = D \ D for its boundary; (B t )t⩾0 is a BMd

8.4 The Dirichlet problem | 139

Fig. 8.1: Exit time and position of a Brownian motion from the set D.

starting from B0 = x, and τ = τ D c = inf{t > 0 : B t ∉ D} is the first exit time from the set D. We assume that the (B t )t⩾0 is adapted to a right continuous filtration (Ft )t⩾0 .⁵ As before, (A, D(A)) denotes the generator of (B t )t⩾0 and ∆ is the Laplace operator on, say, C2 (ℝd ). We are interested in the following question: find some function u on D with the following properties ∆u(x) = 0

for x ∈ D,

(8.17a)

u(x) = f(x) for x ∈ ∂D,

(8.17b)

u(x) is continuous in D.

(8.17c)

To get a feeling for the problem, we use again heuristic arguments as in Sections 8.1–8.3. Assume that u is a solution of the Dirichlet problem (8.17). If ∂D is smooth, we can extend u onto ℝd in such a way that u ∈ C2 (ℝd ) with supp u compact. Therefore, we can apply (8.2) with u = u(x) to see that t

M tu

= u(B t ) − u(B0 ) − ∫

1 ∆u(B r ) dr 2

0

is a bounded martingale. u By the optional stopping theorem, Theorem A.18 and Remark A.21, (M t∧τ , Ft )t⩾0 is a martingale, and we see that ⁶ t∧τ u M t∧τ = u(B t∧τ ) − u(B0 ) − ∫ 0

1 ∆u(B r ) dr = u(B t∧τ ) − u(B0 ). 2

=0 by (8.17a)

u u Therefore, N t∧τ := u(B t∧τ ) is a bounded martingale. In particular, 𝔼x N0u = 𝔼x N t∧τ for all t ⩾ 0. If τ = τ D c < ∞, the martingale convergence theorem yields u u(x) = 𝔼x N0u = lim 𝔼x N t∧τ = 𝔼x N τu = 𝔼x u(B τ ) t→∞

(8.17b)

=

𝔼x f(B τ ).

This shows that 𝔼x f(B τ ) is a good candidate for the solution of (8.17). But before we go on, let us quickly check that τ is indeed a.s. finite. The following result is similar to Lemma 7.33.

5 A condition for this can be found in Theorem 6.22. t∧τ 6 Note that ∫0 ⋅ ⋅ ⋅ dr = ∫[0,t∧τ) ⋅ ⋅ ⋅ dr.

Ex. 8.6

140 | 8 The PDE connection

Ex. 8.7

8.10 Lemma. Let (B t )t⩾0 be a BMd and D ⊂ ℝd be an open and bounded set. Then 𝔼x τ D c < ∞ where τ D c is the first hitting time of D c .

Proof. Because of Lemma 5.8, τ = τ D c is a stopping time. Since D is a bounded set, there exists some r > 0 such that D ⊂ 𝔹(0, r). Define u(x) := − exp(−|x|2 /4r2 ). Then u ∈ C2∞ (ℝd ) ⊂ D(A) and, for all |x| < r, 2 2 1 1 d |x|2 1 d 1 ∆u(x) = ( 2 − 4 ) e−|x| /4r ⩾ ( 2 − 2 ) e−1/4 =: κ > 0. 2 2 2r 2 2r 4r 4r

Therefore, we can use Dynkin’s formula (7.30) with σ = τ ∧ n to get τ∧n

κ 𝔼x (τ ∧ n) ⩽ 𝔼x ( ∫ 0

Ex. 8.8

By Fatou’s lemma,

𝔼x

1 ∆u(B r ) dr) = 𝔼x u(B τ∧n ) − u(x) ⩽ 2‖u‖∞ = 2. 2

τ ⩽ limn→∞ 𝔼x (τ ∧ n) ⩽ 2/κ < ∞.

Let us now show that u(x) = 𝔼x f(B τ ) is a solution of (8.17). For the behaviour in the interior, i.e. (8.17a), the following classical notion turns out to be the key ingredient. 8.11 Definition. Let D ⊂ ℝd be an open set. A function u : D → ℝ is harmonic (in the set D), if u ∈ C2 (D) and ∆u(x) = 0.

In fact, harmonic functions are arbitrarily often differentiable. This is an interesting by-product of the proof of the next result. 8.12 Proposition. Let D ⊂ ℝd be an open and connected set and u : D → ℝ a locally bounded function. Then the following assertions are equivalent. a) u is harmonic on D. b) u has the (spherical) mean value property: for all x ∈ D and r > 0 such that 𝔹(x, r) ⊂ D one has u(x) =



∂𝔹(0,r)

u(x + z) π r (dz)

(8.18)

where π r is the normalized uniform measure on the sphere ∂𝔹(0, r). c)

u is continuous and (u(B t∧τ c ), Ft )t⩾0 is a martingale w.r.t. the law ℙx of a Brownian G motion (B t , Ft )t⩾0 started at any x ∈ G; here G is a bounded open set such that G ⊂ D, and τ G c is the first exit time from the set G.

Proof. c)⇒b): fix x ∈ D and r > 0 such that 𝔹(x, r) ⊂ D, and set σ := τ𝔹c (x,r) .

Clearly, there is some bounded open set G such that 𝔹(x, r) ⊂ G ⊂ D. By assumption, (u(B t∧τ c ), Ft )t⩾0 is a ℙx martingale. G Since σ < ∞, cf. Lemma 8.10, we find by optional stopping (Theorem A.18) that (u(B t∧σ∧τ c ))t⩾0 is a martingale. Using σ ⩽ τ G c we get G

u(x) = 𝔼x u(B0 ) = 𝔼x u(B t∧σ ).

8.4 The Dirichlet problem |

141

Now |u(B t∧σ )| ⩽ supy∈𝔹(x,r) |u(y)| < ∞, and by dominated convergence and the continuity of Brownian paths u(x) = lim 𝔼x u(B t∧σ ) = 𝔼x u(B σ ) = ∫ u(y) ℙx (B σ ∈ dy) t→∞

= ∫ u(z + x) ℙ0 (B τ ∈ dz),

where τ = τ𝔹c (0,r) . By the rotational symmetry of Brownian motion, cf. Lemma 2.8, ℙ0 (B τ ∈ dz) is a uniformly distributed probability measure on the sphere ∂𝔹(0, r), and (8.18) follows.⁷

b)⇒a): fix x ∈ D and δ > 0 such that 𝔹(x, δ) ⊂ D. Pick ϕ δ ∈ C∞ (0, ∞) such that δ ϕ δ |[δ2 ,∞) ≡ 0 and set c d := ∫0 ϕ δ (r2 ) r d−1 dr. By Fubini’s theorem and changing to polar coordinates δ

u(x) =

δ

1 (8.18) 1 ∫ ϕ δ (r2 ) r d−1 u(x) dr = ∫ cd b) c d 0

=

=

1 c󸀠d 1 c󸀠d



0 ∂𝔹(0,r)

ϕ δ (r2 ) r d−1 u(z + x) π r (dz) dr

∫ ϕ δ (|y|2 ) u(y + x) dy

𝔹(0,δ)

∫ ϕ δ (|y − x|2 ) u(y) dy.

𝔹(x,δ)

As u is locally bounded, the differentiation lemma for parameter dependent integrals, e.g. [232, Theorem 12.5], shows that the last integral is arbitrarily often differentiable at the point x. Since x ∈ D is arbitrary, we have u ∈ C∞ (D). Fix x0 ∈ D and assume that ∆u(x0 ) > 0. Because D is open and ∆u continuous, there is some δ > 0 such that 𝔹(x0 , δ) ⊂ D and ∆u|𝔹(x0 ,δ) > 0. Set σ := τ𝔹c (x0 ,δ) and take χ ∈ C∞ c (D) such that 𝟙𝔹(x0 ,δ) ⩽ χ ⩽ 𝟙D . Then we can use (8.2) for the smooth and bounded function u χ := χu. Using an optional stopping argument we conclude that t∧σ



M t∧σ = u χ (B t∧σ ) − u χ (B0 ) − ∫ 0

t∧σ

= u(B t∧σ ) − u(B0 ) − ∫ 0

1 ∆u χ (B r ) dr 2

1 ∆u(B r ) dr 2

7 For a rotation Q : ℝd × ℝd , (QB t )t⩾0 is again a BMd . Let τ = τ𝔹c (0,r) and C ∈ B(∂𝔹(0, r)). Then ℙ0 (B τ ∈ C) = ℙ0 (QB τ ∈ C) = ℙ0 (B τ ∈ Q−1 C), which shows that ℙ0 (B τ ∈ dz) must be the (normalized) surface measure on the sphere ∂𝔹(0, r).

Ex. 2.15

142 | 8 The PDE connection uχ

(use χ|𝔹(x0 ,δ) ≡ 1) is a martingale. Therefore, 𝔼 M t∧σ = 0, and letting t → ∞ we get σ

1 𝔼 u(B σ ) − u(x0 ) = 𝔼x0 (∫ ∆u(B r ) dr) > 0. 2 x0

0

The mean-value property shows that the left-hand side equals 0 and we have a contradiction. Similarly one shows that ∆u(x0 ) < 0 gives a contradiction, thus ∆u(x) = 0 for all x ∈ D. a)⇒c): fix any open and bounded set G such that G ⊂ D and pick a cut-off function χ ∈ C∞ c such that 𝟙G ⩽ χ ⩽ 𝟙D . Set σ := τ G c and u χ (x) := χ(x)u(x). By (8.2), t

uχ Mt

= u χ (B t ) − u χ (B0 ) − ∫ 0

1 ∆u χ (B r ) dr 2

is a uniformly integrable martingale. Therefore, we can use optional stopping and the fact that χ|G ≡ 1 to see that t∧σ

uχ M t∧σ

= u(B t∧σ ) − u(B0 ) − ∫ 0

is a martingale. This proves c).

1 a) ∆u(B r ) dr = u(B t∧σ ) − u(B0 ) 2

Let us note another classical consequence of the mean-value property. 8.13 Corollary (maximum principle). Let D ⊂ ℝd be a bounded, open and connected domain and u ∈ C(D) ∩ C2 (D) be a harmonic function. Then ∃x0 ∈ D : u(x0 ) = sup u(x) 󳨐⇒ u ≡ u(x0 ). x∈D

In other words: a non-constant harmonic function can attain its maximum only on the boundary of D. Proof. Assume that u is harmonic in D and u(x0 ) = supx∈D u(x) for some x0 ∈ D. Since u(x) enjoys the spherical mean-value property (8.18), we see that u|∂𝔹(x0 ,r) ≡ u(x0 ) for all r > 0 such that 𝔹(x0 , r) ⊂ D. Thus, u|𝔹(x0 ,r) ≡ u(x0 ), and we see that the set M := {x ∈ D : u(x) = u(x0 )} is open. Since u is continuous, M is also closed, and the connectedness of D shows that D = M as M is not empty.

When we solve the Dirichlet problem, cf. Theorem 8.20 below, we use Proposition 8.12 to get (8.17a), and Corollary 8.13 for the uniqueness of the solution; the condition (8.17b), i.e. limD∋x→x0 𝔼x f(B τ ) = f(x0 ) for x0 ∈ ∂D, requires some kind of regularity for the hitting time τ = τ D c . 8.14 Definition. A point x0 ∈ ∂D is called regular (for D c ), if ℙx0 (τ D c = 0) = 1. Every non-regular point is called singular.

8.4 The Dirichlet problem | 143

Fig. 8.2: The outer cone condition. A point x ∈ D is regular, if we can touch it with a (truncated) cone which is in D c .

At a regular point x0 , Brownian motion leaves the domain D immediately; typical situations are shown in Example 8.18 below. The point x0 ∈ ∂D is singular for D c if, and only if, ℙx0 (τ D c > 0) = 1. If E ⊃ D such that x0 ∈ ∂E, the point x0 is also singular for E c . This follows from E c ⊂ D c and τ Ec ⩾ τ Dc . 8.15 Lemma. x0 ∈ ∂D is singular for D c if, and only if, ℙx0 (τ D c = 0) = 0. Proof. By definition, any singular point x0 satisfies ℙx0 (τ D c = 0) < 1. On the other hand, ∞



n=1

n=1

{τ D c = 0} = ⋂ {τ D c ⩽ 1/n} ∈ ⋂ F1/n = F0+ , and by Blumenthal’s 0-1–law, Corollary 6.23, ℙx0 (τ D c = 0) ∈ {0, 1}. The following simple regularity criterion is good enough for most situations. Recall that a truncated cone in ℝd with opening r > 0, vertex x0 and direction x1 is the set V = {x ∈ ℝd : x = x0 + h 𝔹(x1 , r), 0 < h < ϵ} for some ϵ > 0. 8.16 Lemma (outer cone condition. Poincaré 1890, Zaremba 1911). Suppose D ⊂ ℝd , d ⩾ 2, is an open domain and (B t )t⩾0 a BMd . If we can touch x0 ∈ ∂D with the vertex of a truncated cone V such that V ⊂ D c , then x0 is a regular point for D c . Proof. Assume that x0 is singular. Denote by τ V := inf{t > 0 : B t ∈ V} the first hitting time of the open cone V. Since V ⊂ D c , we have τ V ⩾ τ D c , i.e. ℙx0 (τ V > 0) ⩾ ℙx0 (τ D c > 0) = 1. Pick ϵ > 0 so small that 𝔹(x0 , ϵ) \ {x0 } is covered by finitely many copies of the cone V = V0 and its rotations V1 , . . . , V n about the vertex x0 . Since a Brownian motion is invariant under rotations – this follows immediately from Lemma 2.8 –, we conclude that ℙx0 (τ V j > 0) = ℙx0 (τ V0 > 0) = 1 for all j = 1, . . . , n. Observe that ⋃nj=0 V j ⊃ 𝔹(x0 , ϵ) \ {x0 }; thus, min0⩽j⩽n τ V j ⩽ τ𝔹(x0 ,ϵ)\{x0 } , i.e. ℙx0 (τ𝔹(x0 ,ϵ)\{x0 } > 0) ⩾ ℙx0 ( min τ V j > 0) = 1 0⩽j⩽n

144 | 8 The PDE connection which is impossible since ℙx0 (B t = x0 ) = 0 for every t > 0.

8.17 Remark. The case of dimension d = 2 is special since we can identify ℝ2 with ℂ. Recall that a set D ⊂ ℂ is simply connected, if its complement (in the extended complex plane ℂ ∪ {∞}) is connected. Equivalently, every loop inside of D can be shrunk to a single point within D. For such sets we have the Riemann mapping theorem (e.g. Rudin [225, Theorem 14.8]): For every simply connected open domain D ⫋ ℂ there is a biholomorphic map ϕ mapping D bijectively onto the open unit disk 𝔹(0, 1) ⊂ ℂ. We also need to know the behaviour of ϕ at the boundary. If D is bounded and ∂D is a Jordan curve (i.e. a homeomorphic image of ∂𝔹(0, 1)), then ϕ can be extended to a homeomorphism D → 𝔹(0, 1), cf. [225, Theorem 14.19]. Brownian motion in ℝ2 is transported with the map ϕ. That is, if W t ≃ β t + iB t is a BM2 (until exiting D), then ϕ(W σ t ) is a BM2 (until exiting 𝔹(0, 1)), where σ t is a t random-time change; σ t (ω) is the inverse of A t (ω) := ∫0 |ϕ󸀠 (W r (ω))|2 dr. This property is called the conformal invariance of Brownian motion; it was discovered by Lévy [165] in 1947, see also [166, § 56]. An accessible proof can be found in [9, Chapter V.1]. If we use this in conjunction with Lemma 8.16, it becomes clear that a simply connected open set in ℝ2 ≃ ℂ with a Jordan boundary has only regular boundary points. Note, however, that 𝔹(0, 1) \ {0} is not simply connected. In fact, more is true: all boundary points of a simply connected, bounded open domain D ⊂ ℂ are regular, see Burckel [24, Theorem 9.40]. 8.18 Example. Unless indicated otherwise, we assume that d ⩾ 2. The following pictures show typical examples of singular points. a) The boundary point 0 of D = 𝔹(0, 1) \ {0} is singular. This was the first example of a singular point and it is due to Zaremba [280]. This is easy to see since τ{0} = ∞ almost surely: by the Markov property ℙ0 (τ{0} < ∞) = ℙ0 (∃t > 0 : B t = 0) = lim ℙ0 (∃t > n→∞

1 n

: B t = 0)

= lim 𝔼0 ( ℙB1/n (∃t > 0 : B t = 0) ) = 0. n→∞

=0, Corollary 6.17

Therefore, ℙ0 (τ D c = 0) = ℙ0 (τ{0} ∧ τ𝔹c (0,1) = 0) = ℙ0 (τ𝔹c (0,1) = 0) = 0.

b) Let (r n )n⩾1 and (c n )n⩾1 be sequences with 0 < r n , c n < 1 which decrease to 0. We set K(c n , r n ) := 𝔹(c n e1 , r n ) ⊂ ℝ2 and e1 = (1, 0) ∈ ℝ2 . Run a Brownian motion in the set D := 𝔹(0, 1) \ ⋃n⩾1 K(c n , r n ) obtained by removing the circles K(c n , r n ) from the unit ball. Clearly, the Brownian motion has to leave 𝔹(0, 1) before it leaves 𝔹(c n e1 , 2), see Figure 8.4. Thus, ℙ0 (τ K(c n ,r n ) < τ𝔹c (0,1) ) ⩽ ℙ0 (τ K(c n ,r n ) < τ𝔹c (c n e1 ,2) )

= ℙ−c n e1 (τ K(0,r n ) < τ𝔹c (0,2) ).

8.4 The Dirichlet problem | 145

a) singular (d ⩾ 2)

b) singular (d ⩾ 2)

c) singular (d ⩾ 3)

e) singular (d ⩾ 3) regular (d = 2)

Fig. 8.3: Examples of singular points.

Since K(0, r n ) = 𝔹(0, r n ), this probability is known from Theorem 6.16: ℙ0 (τ K(c n ,r n ) < τ𝔹c (0,1) ) ⩽

log 2 − log c n . log 2 − log r n

Take c n = 2−n , r n = 2−n , n ⩾ 2, and sum the respective probabilities; then 3





n=2

n=2

∑ ℙ0 (τ K(c n ,r n ) < τ𝔹c (0,1) ) ⩽ ∑

n+1 < ∞. n3 + 1

The Borel–Cantelli lemma shows that Brownian motion visits at most finitely many of the balls K(c n , r n ) before it exits 𝔹(0, 1). Since 0 ∉ K(c n , r n ), this means that ℙ0 (τ D c > 0) = 1, i.e. 0 is a singular point. c) (Lebesgue’s spine) In 1913 Lebesgue gave the following example of a singular point. Consider in d = 3 the open unit ball and remove a(n exponentially) sharp spine, e.g. a closed cusp S which is obtained by rotating an exponentially decaying curve y = h(x) about the positive x–axis: E := 𝔹(0, 1) \ S. Then 0 is a singular point for Ec . For this, we use a similar construction as in b): we remove from 𝔹(0, 1) the sets K n , D := 𝔹(0, 1) \ ⋃∞ n=1 K n , but the sets K n are chosen in such a way that they cover the spine S, i.e. D ⊂ E. Let K n = [2−n , 2−n+1 ]×𝔹(0, r n ) be a cylinder in ℝ3 . The probability that a Brownian motion hits K n can be made arbitrarily small, if we choose r n small enough, see the technical wrap-up at the end of this paragraph. So we can arrange things in such a way that ℙ0 (τ K n < τ𝔹c (0,1) ) ⩽ p n



and

∑ p n < ∞. n=1

As in b) we see that a Brownian motion visits at most finitely many of the cylinders K n before exiting 𝔹(0, 1), i.e. ℙ0 (τ D c > 0) = 1. Therefore, 0 is singular for D c and for E c ⊂ D c . Technical wrap-up. Set K ϵn = [2−n , 2−n+1 ] × 𝔹(0, ϵ), τ n,ϵ := τ K n , B t = (w t , b t , β t ) ϵ and σ n = inf{t > 0 : w t ⩾ 2−n }.

146 | 8 The PDE connection

2

h(x)

cn e1 1 0

B1 (0)

K(cn , rn )

K1 K2

rn

Fig. 8.4: Left panel: Brownian motion has to exit 𝔹(0, 1) before it leaves 𝔹(c n e1 , 2). Right panel: Pacman – a “cubistic” version of Lebesgue’s spine.

From Corollary 6.18 we deduce that 0 < σ n , τ𝔹c (0,1) < ∞ a.s. Moreover, for fixed n ⩾ 1 and any ϵ󸀠 ⩽ ϵ we have σ n ⩽ τ n,ϵ ⩽ τ n,ϵ󸀠 , and so {σ n ⩽ τ n,ϵ󸀠 ⩽ τ𝔹c (0,1) } ⊂ {σ n ⩽ τ n,ϵ ⩽ τ𝔹c (0,1) }.

Since the interval [σ n , τ𝔹c (0,1) ] is compact we get

1

⋂ {σ n ⩽ τ n,ϵ ⩽ τ𝔹c (0,1) } ⊂ {∃t ∈ [σ n , τ𝔹c (0,1) ] : b t = β t = 0}.

ϵ>0

As and

ℙ0 (τ n,ϵ ⩽ τ𝔹c (0,1) ) = ℙ0 (σ n ⩽ τ n,ϵ ⩽ τ𝔹c (0,1) ) ℙ0 (∃t > 0 : b t = β t = 0) = 0,

see Corollary1 6.17, we conclude that limϵ→0 ℙ0 (τ n,ϵ ⩽ τ𝔹c (0,1) ) = 0 and this is all we need. ◻ Our construction shows that the spine will be exponentially sharp. Lebesgue used the function h(x) = exp(−1/x), x > 0, see [140, p. 285, p. 334].

Ex. 8.11

d) Let (b(t))t⩾0 be a BM1 and τ{0} = inf{t > 0 : b(t) = 0} the first return time to the origin. Then ℙ0 (τ{0} = 0) = 1. This means that in d = 1 a Brownian motion path

oscillates infinitely often around its starting point. To see this, we begin with the remark that, because of the symmetry of a Brownian motion, ℙ0 (b(t) ⩾ 0) = ℙ0 (b(t) ⩽ 0) =

1 2

for all t ∈ (0, ϵ], ϵ > 0.

Thus, ℙ0 (b(t) ⩾ 0 ∀t ∈ [0, ϵ]) ⩽ 12 and ℙ0 (τ(−∞,0] ⩽ ϵ) ⩾ 12 for all ϵ > 0. Letting ϵ → 0 we get ℙ0 (τ(−∞,0] = 0) ⩾ 12 and, by Blumenthal’s 0-1–law, Corollary 6.23,

8.4 The Dirichlet problem |

147

ℙ0 (τ(−∞,0] = 0) = 1. The same argument applies to the interval [0, ∞) and we get

Ex. 5.13

ℙ0 (τ{0} = 0) = ℙ0 (τ(−∞,0] ∨ τ[0,∞) = 0) = 1.

e) (Zaremba’s needle) Let e2 = (0, 1, 0, . . . , 0) ∈ ℝd , d ⩾ 2. The argument from c) shows that for the set D = 𝔹(0, 1) \ [0, ∞)e2 the point 0 is singular if d ⩾ 3. In two dimensions the situation is different: 0 is regular! Indeed: let B(t) = (b(t), β(t)), t ⩾ 0, be a BM2 ; we know that the coordinate processes b = (b(t))t⩾0 and β = (β(t))t⩾0 are independent one-dimensional Brownian motions. Set σ n = inf{t > 1/n : b(t) = 0}. Since 0 ∈ ℝ is regular for {0} ⊂ ℝ, see d), we get that limn→∞ σ n = τ{0} = 0 almost surely with respect to ℙ0 . Since β ⊥⊥ b, the random variable β(σ n ) is symmetric, and so

Ex. 8.12

Ex. 8.13

1 . 2 Clearly, B(σ n ) = (b(σ n ), β(σ n )) = (0, β(σ n )) and {β(σ n ) ⩾ 0} ⊂ {τ D c ⩽ σ n }, so ℙ0 (β(σ n ) ⩾ 0) = ℙ0 (β(σ n ) ⩽ 0) ⩾

ℙ0 (τ D c = 0) = lim ℙ0 (τ D c ⩽ σ n ) ⩾ lim ℙ0 (β(σ n ) ⩾ 0) ⩾ n→∞

n→∞

1 . 2

Now Blumenthal’s 0–1-law, Corollary 6.23, applies and gives ℙ0 (τ D c = 0) = 1.

8.19 Theorem. Let x0 ∈ ∂D be a regular point and τ = τ D c . Then lim ℙx (τ > h) = 0 for all h > 0.

D∋x→x0

Proof. Fix h > 0. Since D is open, we find for every x ∈ ℝd ℙx (τ > h) = ℙx (∀s ∈ (0, h] : B(s) ∈ D)

= inf ℙx (∀s ∈ (1/n, h] : B(s) ∈ D) n

= inf 𝔼x (ℙB(1/n) (∀s ∈ (0, h − 1/n] : B(s) ∈ D))

(8.19)

n

= inf 𝔼x (ℙB(1/n) (τ > h − 1/n)) . n

Set ϕ(y) := > h − 1/n). Since ϕ ∈ Bb (ℝd ), the strong Feller property, Proposition 7.3.g), shows that ℙy (τ

x 󳨃→ 𝔼x (ℙB(1/n) (τ > h − 1/n)) = 𝔼x ϕ(B(1/n))

is a continuous function. Therefore, lim ℙx (τ > h) =

D∋x→x0

lim inf 𝔼x (ℙB(1/n) (τ > h − 1/n))

D∋x→x0 n

⩽ inf lim 𝔼x (ℙB(1/k) (τ > h − 1/k)) k D∋x→x0

cts.

= inf 𝔼x0 (ℙB(1/k) (τ > h − 1/k)) k

(8.19)

since x0 is regular.

= ℙx0 (τ > h) = 0,

Ex. 6.6

148 | 8 The PDE connection We are now ready for the main theorem of this section. 8.20 Theorem. Let (B t )t⩾0 be a d-dimensional Brownian motion, D a connected, open and bounded domain and τ = τ D c the first hitting time of D c . If every point in ∂D is regular for D c and if f : ∂D → ℝ is continuous, then u(x) = 𝔼x f(B τ ) is the unique bounded solution of the Dirichlet problem (8.17). Proof. We know already that u(x) = 𝔼x f(B τ ) is well-defined for x ∈ D.

1o u is harmonic in D. Let x ∈ D. Since D is open, there is some δ > 0 such that 𝔹(x, δ) ⊂ D. Set σ := σ r := inf{t > 0 : B t ∉ 𝔹(x, r)} for 0 < r < δ. By the strong Markov property (6.10) u(x) = 𝔼x f(B τ ) = 𝔼x (𝔼B σ f(B τ󸀠 )) = 𝔼x u(B σ )

where τ󸀠 := inf{t > σ : B t ∉ D} is the first exit time from D of an independent Brownian motion (B t )t⩾0 started at B σ . Therefore, u enjoys the spherical mean-value property (8.18) and we infer with Proposition 8.12 that u is harmonic in D, i.e. (8.17a) holds. 2o limx→x0 u(x) = f(x0 ) for all x0 ∈ ∂D: fix ϵ > 0. Since f is continuous, there is some δ > 0 such that |f(x) − f(x0 )| ⩽ ϵ

for all x ∈ 𝔹(x0 , δ) ∩ ∂D.

Therefore, we find for every h > 0 󵄨 󵄨 |u(x) − f(x0 )| = 󵄨󵄨󵄨𝔼x (f(B τ ) − f(x0 ))󵄨󵄨󵄨

⩽ 𝔼x (|f(B τ ) − f(x0 )| ⋅ 𝟙{τ 0 we call SΠ p (f; [0, t]) :=

n



t j−1 ,t j ∈Π

|f(t j ) − f(t j−1 )|p = ∑ |f(t j ) − f(t j−1 )|p

(9.1)

j=1

the p-variation sum for Π. The supremum over all finite partitions is the strong or total p-variation,

Ex. 9.1

VARp (f; [0, t]) := sup {S Π p (f; [0, t]) : Π finite partition of [0, t]} .

(9.2)

varp (f; [0, t]) := lim S p n (f; [0, t]).

(9.3)

If VAR1 (f; [0, t]) < ∞, the function f is said to be of bounded or finite (total) variation. Often the following notion of p-variation along a given sequence of finite partitions (Π n )n⩾1 of [0, t] with limn→∞ |Π n | = 0 is used: Π

|Π n |→0

Ex. 9.3 Ex. 9.4

Note that VARp (f; [0, t]) is always defined in [0, +∞]; if varp (f; [0, t]) exists, then VARp (f; [0, t]) ⩾ varp (f; [0, t]). When a random function X(s, ω) is used in place of f(s), then varp (X(⋅, ω); [0, t]) may be a limit in law, in probability, in kth mean, and a.s. Note, however, that the sequence of partitions (Π n )n⩾1 must not depend on ω. If f is continuous, it is no restriction to require that the endpoints 0, t of the interval [0, t] are contained in every finite partition Π. Moreover, we may restrict ourselves to partitions with rational points or points from any other dense subset (plus the endpoint t). This follows from the fact that we find for every ϵ > 0 and every partition Π = {t0 = 0 < t1 < ⋅ ⋅ ⋅ < t n = t} some Π 󸀠 = {q0 = 0 < q1 < ⋅ ⋅ ⋅ < q n = t} with points from the dense subset such that n

∑ |f(t j ) − f(q j )|p ⩽ ϵ.

j=0

https://doi.org/10.1515/9783110741278-009

9.1 The quadratic variation | 153 󸀠

Π Thus, |S Π p (f; [0, t]) − S p (f; [0, t])| ⩽ c p,d ϵ with a constant depending only on p and the dimension d; since

VARp (f; [0, t]) = sup sup {S Π p (f; [0, t]) : Π partition of [0, t] with # Π = n} , n

we see that it is enough to calculate VARp (f; [0, t]) along rational points.

9.1 The quadratic variation

Let us determine the quadratic variation var2 (B; [0, t]) = var2 (B(⋅, ω); [0, t]) of a Brownian motion B(t), t ⩾ 0. In view of the results of Section 2.2 it is enough to consider a one-dimensional Brownian motion throughout this section. 9.1 Theorem. Let (B t )t⩾0 be a BM1 and (Π n )n⩾1 be any sequence of finite partitions of [0, t] satisfying limn→∞ |Π n | = 0. Then the following mean-square limit exists: var2 (B; [0, t]) = L2 (ℙ)- lim S2 n (B; [0, t]) = t. Π

n→∞

Usually var2 (B; [0, t]) is called the quadratic variation of a Brownian motion.

Proof. Let Π = {t0 = 0 < t1 < ⋅ ⋅ ⋅ < t n = t} be some partition of [0, t]. Then we have Therefore,

n

n

j=1

j=1

𝔼 S2Π (B; [0, t]) = ∑ 𝔼 [(B(t j ) − B(t j−1 ))2 ] = ∑ (t j − t j−1 ) = t. n

2

𝔼 [(S2Π (B; [0, t]) − t) ] = 𝕍 [S2Π (B; [0, t])] = 𝕍 [ ∑ (B(t j ) − B(t j−1 ))2 ] . j=1

))2 ,

By (B1) the random variables (B(t j ) − B(t j−1 j = 1, . . . , n, are independent with mean t j − t j−1 . With Bienaymé’s identity we find 2

(B1)

n

𝔼 [(S2Π (B; [0, t]) − t) ] = ∑ 𝕍 [(B(t j ) − B(t j−1 ))2 ] j=1 n

2

= ∑ 𝔼 [{(B(t j ) − B(t j−1 ))2 − (t j − t j−1 )} ] j=1

(B2)

n

2

= ∑ 𝔼 [{B(t j − t j−1 )2 − (t j − t j−1 )} ] j=1

2.16

n

2

= ∑ (t j − t j−1 )2 𝔼 [{B(1)2 − 1} ] j=1

n

=2 cf. 2.3

⩽ 2 |Π| ∑ (t j − t j−1 ) = 2 |Π| t 󳨀󳨀󳨀󳨀󳨀→ 0. j=1

|Π|→0

Ex. 9.2

154 | 9 The variation of Brownian paths 9.2 Corollary. Almost all Brownian paths are of infinite total variation. In fact, we have VARp (B; [0, t]) = ∞ a.s. for all p < 2. Proof. Let p = 2 − δ for some δ > 0. Let Π n be any sequence of partitions of [0, t] with |Π n | → 0. Then 2 󵄨 󵄨δ 󵄨 󵄨2−δ ∑ (B(t j ) − B(t j−1 )) ⩽ max 󵄨󵄨󵄨B(t j ) − B(t j−1 )󵄨󵄨󵄨 ∑ 󵄨󵄨󵄨B(t j ) − B(t j−1 )󵄨󵄨󵄨 t j−1 ,t j ∈Π n t j−1 ,t j ∈Π n

t j−1 ,t j ∈Π n

󵄨 󵄨δ ⩽ max 󵄨󵄨󵄨B(t j ) − B(t j−1 )󵄨󵄨󵄨 VAR2−δ (B; [0, t]). t j−1 ,t j ∈Π n

The left-hand side converges, at least for a subsequence, almost surely to t. On the other hand, lim|Π n |→0 maxt j−1 ,t j ∈Π n |B(t j ) − B(t j−1 )|δ = 0 since the Brownian paths are (uniformly) continuous on [0, t]. This shows that VAR2−δ (B; [0, t]) = ∞ almost surely.

Ex. 9.6

It is not hard to adapt the above results for intervals of the form [a, b]. Either we argue directly, or we observe that if (B t )t⩾0 is a Brownian motion, then so is the process W(t) := B(t + a) − B(a), t ⩾ 0, see Paragraph 2.13. Recall that a function f : [0, ∞) → ℝ is called (locally() Hölder continuous of order α > 0, if for every compact interval [a, b] there exists a constant c = c(f, α, [a, b]) such that |f(t) − f(s)| ⩽ c |t − s|α for all s, t ∈ [a, b]. (9.4) 9.3 Corollary. Almost all Brownian paths are nowhere (locally) Hölder continuous of order α > 12 .

Proof. Fix some interval [a, b] ⊂ [0, ∞), a < b are rational, α > 12 and assume that (9.4) holds with f(t) = B(t, ω) and some constant c = c(ω, α, [a, b]). Then we find for all finite partitions Π of [a, b] S2Π (B(⋅, ω); [a, b]) =



2

t j−1 ,t j ∈Π

⩽ c(ω)2

(B(t j , ω) − B(t j−1 , ω))

∑ (t j − t j−1 )2α ⩽ c(ω)2 (b − a)|Π|2α−1 .

t j−1 ,t j ∈Π

In view of Theorem 9.1 we know that the left-hand side converges almost surely to b − a for some subsequence Π n with |Π n | → 0. Since 2α − 1 > 0, the right-hand side tends to 0, and we have a contradiction. This shows that (9.4) can hold only on a ℙ-null set N a,b ⊂ Ω. But the union of null sets ⋃0⩽a 0 ∞ 2t ∞ 󵄨 Π 󵄨 ∑ ℙ (󵄨󵄨󵄨S2 n (B; [0, t]) − t󵄨󵄨󵄨 > ϵ) ⩽ 2 ∑ |Π n | < ∞. ϵ n=1 n=1

By the (easy direction of the) Borel–Cantelli lemma we conclude that there is a set Ω ϵ such that ℙ(Ω ϵ ) = 1 and for all ω ∈ Ω ϵ there is some N(ω) such that 󵄨󵄨 Π n 󵄨 󵄨󵄨S2 (B(⋅, ω); [0, t]) − t󵄨󵄨󵄨 ⩽ ϵ for all n ⩾ N(ω).

This proves that var2 (B; [0, t]) = t on Ω0 := ⋂k⩾1 Ω1/k along the sequence (Π n )n⩾1 . Since ℙ(Ω0 ) = 1, we are done. The summability condition ∑∞ n=1 |Π n | < ∞ used in Theorem 9.4 is essentially an assertion on the decay of the sequence |Π n |. If the mesh sizes decrease, we have n



n|Π n | ⩽ ∑ |Π j | ⩽ ∑ |Π j |, j=1

j=1

hence |Π n | = O(1/n). The following result of Dudley, [61, p. 89], shows that the condition |Π n | = o(1/ log n) is sufficient, i.e. limn→∞ |Π n | log n = 0. 9.5 Theorem (Dudley 1973). Let (B t )t⩾0 be a BM1 and assume that Π n is a sequence of partitions of [0, t] such that |Π n | = o(1/ log n). Then var2 (B; [0, t]) = lim S2 n (B; [0, t]) = t Π

almost surely.

n→∞

Proof. To simplify notation, we write in this proof Π

Π

S2 n = S2 n (B; [0, t]),

Then Π

S2 n − t =

=



t j−1 ,t j ∈Π n



t j−1 ,t j ∈Π n

s j = t j − t j−1

and

2

Wj =

[(B(t j ) − B(t j−1 )) − (t j − t j−1 )]

B(t j ) − B(t j−1 ) . √t j − t j−1

2

(t j − t j−1 )[

(B(t j ) − B(t j−1 )) − 1] = ∑ s j [W j2 − 1] . t j − t j−1 t ,t ∈Π j−1

j

n

156 | 9 The variation of Brownian paths Note that the W j are iid N(0, 1) random variables. For every ϵ > 0 󵄨󵄨 󵄨󵄨 Π Π Π ℙ (󵄨󵄨󵄨S2 n − t󵄨󵄨󵄨 > ϵ) ⩽ ℙ (S2 n − t > ϵ) + ℙ (t − S2 n > ϵ) ; 󵄨 󵄨

we estimate the right-hand side with the Markov inequality. For the first term we find ℙ (S2 n − t > ϵ) ⩽ exp(−θϵ) 𝔼 [exp (θ(S2 n − t))] Π

Π

= exp(−θϵ) 𝔼 [exp (θ ∑j s j (W j2 − 1))]

= exp(−θϵ) ∏j 𝔼 [exp (θs j (W j2 − 1))] .

Later on, we will pick θ such that the right-hand side is finite. If we combine the inequality |e λx − λx − 1| ⩽ 12 λ2 x2 e|λx| , x, λ ∈ ℝ, with the fact that 𝔼(W j2 − 1) = 0, we get ℙ (S2 n − t > ϵ) ⩽ exp(−θϵ) ∏j {1 + 𝔼 [exp (θs j (W j2 − 1)) − θs j (W j2 − 1) − 1]} Π

Ex. 9.7

2 󵄨 󵄨 ⩽ exp(−θϵ) ∏j {1 + 𝔼 [ 21 θ2 s2j (W j2 − 1) exp (|θs j | ⋅ 󵄨󵄨󵄨󵄨W j2 − 1󵄨󵄨󵄨󵄨)]} .

A direct calculation reveals that for each 0 < λ0 < 1/2 there is a constant C = C(λ0 ) such that 1 2 󵄨 󵄨 𝔼 [(W j2 − 1) exp (|λ| ⋅ 󵄨󵄨󵄨󵄨W j2 − 1󵄨󵄨󵄨󵄨)] ⩽ C < ∞ for all 1 ⩽ j ⩽ n, |λ| ⩽ λ0 < . 2 Using the estimate 1 + y ⩽ e y we arrive at

ℙ(S2 n − t > ϵ) ⩽ exp(−θϵ) ∏j [ 21 Cθ2 s2j + 1] Π

⩽ exp(−θϵ) ∏j exp ( 21 Cθ2 s2j ) = exp (−ϵθ + 21 Cθ2 s2 )

where s2 := ∑j s2j ; notice that s2 ⩽ t|Π n |. If we choose θ = θ0 := min {

λ0 ϵ , }, Cs2 |Π n |

(9.5)

then |s j θ0 | ⩽ λ0 , and all expectations in the calculations above are finite. Moreover, (9.5) ϵ λ0 1 Π , ℙ (S2 n − t > ϵ) ⩽ exp (θ0 ( 21 Cθ0 s2 − ϵ) ) ⩽ exp (− ϵ min { }) . 2 2Cs2 |Π n | 1 ⩽ − 2 ϵ, by (9.5)

The same estimate holds for ℙ(t − S2 n > ϵ); using s2 ⩽ t|Π n |, we see Π

1 ϵ λ0 󵄨 Π 󵄨 ℙ (󵄨󵄨󵄨S2 n − t󵄨󵄨󵄨 > ϵ) ⩽ 2 exp (− ϵ min { , }) . 2 2Ct|Π n | |Π n |

Since |Π n | = o(1/ log n), we have |Π n | = ϵ n / log n where ϵ n → 0 as n → ∞. Thus, 2

ϵ − min{ 2Ct , λ0 ϵ}/ϵ n 󵄨 Π 󵄨 ℙ (󵄨󵄨󵄨S2 n − t󵄨󵄨󵄨 > ϵ) ⩽ 2n ⩽ 2n−2

for all n ⩾ n0 .

The conclusion follows now, as in the proof of Theorem 9.4, from the Borel–Cantelli lemma.

9.2 Almost sure convergence of the variation sums | 157

Yet another possibility to get almost sure convergence of the variation sums is to consider nested partitions. The proof relies on the a.s. martingale convergence theorem for backwards martingales, cf. Corollary A.8 in the appendix. We begin with an auxiliary result. 9.6 Lemma. Let (B t )t⩾0 be a BM1 and Gs := σ(B r , r ⩽ s; (B u − B v )2 , u, v ⩾ s). Then 󵄨 𝔼 (B t − B s 󵄨󵄨󵄨 Gs ) = 0

for all 0 ⩽ s < t.

B Proof. From Lemma 2.14 we know that the σ-algebras FsB = σ(B r , r ⩽ s) and F[s,∞) := σ(B t − B s , t ⩾ s) are independent. A typical set of a ∩-stable generator of Gs is given by M

N

j=1

k=1

Γ = ⋂ {B r j ∈ C j } ∩ ⋂ {(B v k − B u k )2 ∈ D k },

B ̃ := −B where M, N ⩾ 1, r j ⩽ s ⩽ u k ⩽ v k and C j , D k ∈ B(ℝ). Since FsB ⊥⊥ F[s,∞) and B is again a Brownian motion, we get for t ⩾ s N

M

k=1

j=1

∫(B t − B s ) dℙ = ∫ (B t − B s ) ∏ 𝟙D k ((B v k − B u k )2 ) ⋅ ∏ 𝟙C j (B r j ) dℙ Γ

B F[s,∞)

measurable

FsB measurable

N

M

k=1

j=1

N

M

k=1

j=1

N

M

k=1

j=1

= ∫(B t − B s ) ∏ 𝟙D k ((B v k − B u k )2 ) dℙ ⋅ ∫ ∏ 𝟙C j (B r j ) dℙ

̃ B∼B

̃t − B ̃ s ) ∏ 𝟙D ((B ̃v − B ̃ u )2 ) dℙ ⋅ ∫ ∏ 𝟙C (B r ) dℙ = ∫(B k k k j j

̃ B=−B

= ∫(B s − B t ) ∏ 𝟙D k ((B u k − B v k )2 ) dℙ ⋅ ∫ ∏ 𝟙C j (B r j ) dℙ

= ⋅ ⋅ ⋅ = − ∫(B t − B s ) dℙ. Γ

This yields 𝔼(B t − B s | Gs ) = 0.

9.7 Theorem (Lévy 1940). Let (B t )t⩾0 be a one-dimensional Brownian motion and let Π n be a sequence of nested partitions of [0, t], i.e. Π n ⊂ Π n+1 ⊂ . . . , with limn→∞ |Π n | = 0. Then var2 (B; [0, t]) = lim S2 n (B; [0, t]) = t Π

n→∞

almost surely.

Proof. We can assume that #(Π n+1 \Π n ) = 1, otherwise we add the missing intermediate partitions. Π We show that the variation sums S−n := S2 n (B, [0, t]) are a martingale for the filtration H−n := σ(S−n , S−n−1 , S−n−2 , . . . ). Fix n ⩾ 1; then Π n+1 = Π n ∪ {s} and there

158 | 9 The variation of Brownian paths are two consecutive points t j , t j+1 ∈ Π n such that t j < s < t j+1 . Then

S−n − S−n−1 = (B t j+1 − B t j )2 − (B s − B t j )2 − (B t j+1 − B s )2

Thus, with Lemma 9.6,

= 2(B t j+1 − B s )(B s − B t j ).

󵄨 󵄨 𝔼 ((B t j+1 − B s )(B s − B t j ) 󵄨󵄨󵄨 Gs ) = (B s − B t j ) 𝔼 ((B t j+1 − B s ) 󵄨󵄨󵄨 Gs ) = 0

and, since H−n−1 ⊂ Gs , we can use the tower property to see

󵄨 󵄨 󵄨 𝔼 (S−n − S−n−1 󵄨󵄨󵄨 H−n−1 ) = 𝔼 ( 𝔼 (S−n − S−n−1 󵄨󵄨󵄨 Gs ) 󵄨󵄨󵄨 H−n−1 ) = 0.

Thus (S−n , H−n )n⩾1 is a (backwards) martingale and by the martingale convergence Π theorem, Corollary A.8, we see that the limn→∞ S−n = limn→∞ S2 n (B; [0, t]) exists almost surely. Since the L2 limit of the variation sums is t, cf. Theorem 9.1, we get that the a.s. limit is again t.

9.3 Almost sure divergence of the variation sums So far we have considered variation sums where the underlying partition Π does not depend on ω, i.e. on the particular Brownian path. This means, in particular, that we cannot make assertions on the strong quadratic variation VAR2 (B(⋅, ω); [0, 1]) of a Brownian motion. We are going to show that the strong quadratic variation of almost all Brownian paths is infinite. This is an immediate consequence of the following result. 9.8 Theorem (Lévy 1948). Let (B t )t⩾0 be a BM1 . For almost all ω ∈ Ω there is a nested sequence of random partitions (Π n (ω))n⩾1 such that |Π n (ω)| 󳨀󳨀󳨀󳨀→ 0 n→∞

and

Π (ω)

var2 (B(⋅, ω); [0, 1]) = lim S2 n n→∞

(B(⋅, ω); [0, 1]) = +∞.

The strong variation is, by definition, the supremum of the variation sums for all finite partitions. Thus, 9.9 Corollary (Lévy 1948). Let (B t )t⩾0 be a one-dimensional Brownian motion. Then VAR2 (B(⋅, ω); [0, 1]) = ∞ for almost all ω ∈ Ω.

Proof of Theorem 9.8 (Freedman 1971). Let w(t) := B(t, ω), t ∈ [0, 1], be a fixed trajectory, write Π n := {t j : t j = j/n, j = 0, 1, . . . , n} and set Jn := {[t j−1 , t j ] : j = 1, . . . , n}. An interval I = [a, b] is a k-fast interval for w, if (w(b) − w(a))2 ⩾ 2k(b − a).

We have seen in Theorem 9.1 that Brownian motion has on each interval [0, t] the quadratic variation t. Since the increments (B t j − B t j−1 )2 are independent random

9.3 Almost sure divergence of the variation sums | 159

variables, they fluctuate around their mean t j − t j−1 , and a “fast” interval is an interval where the fluctuation is “large”. If we allow for path-dependent partitions, we can show that Brownian motion has many “fast” intervals, and this will cause the variational sums to diverge. 1o Since a Brownian motion has stationary and independent increments, the random variables (n)

Zl

= # {1 ⩽ j ⩽ n : [B( ml +

j mn , ⋅)

− B( ml +

2 j−1 mn , ⋅)]

⩾ 2k

1 mn }

(l = 0, 1, . . . , m − 1) are iid binomial random variables with the parameters n and p k = ℙ ([B( ml +

j mn , ⋅)

2 j−1 mn , ⋅)]

− B( ml +

⩾ 2k

1 mn )

= ℙ(|B(1)|2 ⩾ 2k).

In the last equality we use the scaling property 2.16 of a Brownian motion. Observe (n) that 𝔼 Z l = p k n; by the central limit theorem, m−1 l=0

and so

(n)

ℙ ( ⋂ {Z l

m−1 l=0

∞ m−1

(n)

ℙ ( ⋃ ⋂ {Z l n=1 l=0

This shows that





∞ m−1

1 2

⩾ ⌊ 12 p k n⌋ + 1) 󳨀󳨀󳨀󳨀→ 1, n→∞

⩾ ⌊ 21 p k n⌋ + 1}) = 1. (n)

ℙ ( ⋂ ⋂ ⋃ ⋂ {Z l k=1 m=1 n=1 l=0

Set α k =

(n)

⩾ ⌊ 21 p k n⌋ + 1}) = ∏ ℙ(Z l

⩾ ⌊ 12 p k n⌋ + 1}) = 1.

ℙ(|B(1)|2 ⩾ 2k). We can restate this observation in the following form:

∃Ω0 ⊂ Ω, ℙ(Ω0 ) = 1,

∀ω ∈ Ω0 ,

∀k ⩾ 1, m ⩾ 1,

∃n = n(k, m, ω) ⩾ k,

∀I ∈ Jm : I contains at least ⌊α k n⌋ + 1 many k-fast intervals from Jmn .

(9.6)

2o We can use (9.6) to construct a nested sequence of partitions (Π n )n⩾1 such that Π |Π n | → 0 and S2 n (w; 1) → ∞. Keep in mind that Π n = Π n (ω). Fix ω ∈ Ω0 , k ⩾ 1, m = m k ⩾ 1 and pick the smallest integer s k ⩾ 1 such that (1 − α k )s k ⩽ 21 . For I ∈ Jm we find some n1 = n1 (k, m, ω) such that each I ∈ Jm contains at least ⌊α k n1 ⌋ + 1 k-fast intervals I 󸀠 ∈ Jmn1 .

Denote these intervals by J∗(m,n1 ) . By construction, Jmn1 \ J∗(m,n1 ) has no more than (n1 − ⌊α k n1 ⌋ − 1)m elements. Applying (9.6) to the intervals from Jmn1 \ J∗(m,n1 ) yields that there is some n2 such that each I ∈ Jmn1 \ J∗(m,n1 ) contains at least ⌊α k n2 ⌋ + 1 k-fast intervals I 󸀠 ∈ Jmn1 n2 .

160 | 9 The variation of Brownian paths Denote these intervals by J∗(m,n1 ,n2 ) . By construction, Jmn1 n2 \ J∗(m,n1 ,n2 ) has no more than (n2 − ⌊α k n2 ⌋ − 1)(n1 − ⌊α k n1 ⌋ − 1)m elements. If we repeat this procedure s k times, there are no more than sk

m ∏(n j − ⌊α k n j ⌋ − 1) j=1

intervals left in Jmn1 ⋅⋅⋅n sk \ J∗(m,n1 ,⋅⋅⋅ ,n s ) ; their total length is at most k

1

sk

m ∏(n j − ⌊α k n j ⌋ − 1) j=1

sk

m ∏ nj j=1

sk

=∏ j=1

(n j − ⌊α k n j ⌋ − 1) 1 ⩽ (1 − α k )s k ⩽ . nj 2 ⩽1−α j

The partition Π k contains all points of the form t = j/m, j = 0, . . . , m, as well as the endpoints of the intervals from J∗(m,n1 ) ∪ J∗(m,n1 ,n2 ) ∪ ⋅ ⋅ ⋅ ∪ J∗(m,n1 ,n2 ,...,n s ) . k

Since all intervals are k-fast and since their total length is at least 21 , we get S2 k (w; [0, 1]) ⩾ 2k Π

1 = k. 2

3o In order to construct Π k+1 , we repeat the construction from the second step with m = m k ⋅ n1 ⋅ n2 ⋅ . . . ⋅ n s k . This ensures that Π k+1 ⊃ Π k , and finishes the proof.

Using ω-dependent partitions we can also achieve the other extreme: quadratic variation becomes zero. 9.10 Theorem (Freedman 1971). Let (B t )t⩾0 be a BM1 . For almost all ω ∈ Ω there is a nested sequence of random partitions (Π n (ω))n⩾1 such that |Π n (ω)| 󳨀󳨀󳨀󳨀→ 0 n→∞

and

Π (ω)

var2 (B(⋅, ω); [0, 1]) = lim S2 n n→∞

(B(⋅, ω); [0, 1]) = 0.

Proof. 1o Let w : [a, b] → ℝ be a continuous function such that w(a) < w(b). Define the times τ j,N ∈ [a, b] when w(t) increases by more than (w(b) − w(a))/N: τ0,N (w) := a,

τ j+1,N (w) := inf {t ⩾ τ j,N : w(t) − w(τ j,N ) ⩾

w(b)−w(a) }. N

By our assumption, we have Π := {a = τ0,N ⩽ τ1,N < . . . ⩽ τ N,N ⩽ b =: τ N+1,N }, hence w(τ0,N ) = w(a) and w(τ N,N ) = w(τ N+1,N ) = w(b), as well as N

2

S2Π (w; [a, b]) = ∑ (w(τ j+1,N ) − w(τ j,N )) = j=0

N(w(b) − w(a))2 (w(b) − w(a))2 = . N N2

This construction remains valid if w(a) = w(b) (use Π = {a, b}) and if w(b) > w(a) (use w 󴁄󴀼 −w instead of w).

9.4 Lévy’s characterization of Brownian motion |

161

2o Let Ω0 ⊂ Ω the set of all ω such that B0 (ω) = 0 and t 󳨃→ B t (ω) is continuous. This is a set of full measure. Let w(t) = B t (ω), ω ∈ Ω0 , be a fixed Brownian path. We are going to construct a sequence of partitions Π n = Π n (w) with mesh |Π n | ⩽ 2−n and Π (w) S2 n (w; [0, 1]) → 0 as n → ∞. Set Π0 := {0, 1} and assume that we have already constructed Π n . Pick any finite set Σ n+1 = {s0 = 0 < s1 < ⋅ ⋅ ⋅ < s N = 1} such that Π n ⊂ Σ n+1 ⊂ [0, 1] and |Σ n+1 | ⩽ 2−n−1 . Now we use Step 1o on each interval [s j−1 , s j ], j = 1, 2, . . . , N. This furnishes a finite set Π n+1 ⊃ Σ n+1 ⊃ Π n such that for every j = 1, 2, . . . , N S2 n+1 (w; [s j−1 , s j ]) = Π



u k−1 ,u k ∈Π n+1 ∩ [s j−1 ,s j ]

(w(u k ) − w(u k−1 ))2 ⩽

(w(s j ) − w(s j−1 ))2 . N

Since w(t) is uniformly continuous on [0, 1] and s j − s j−1 ⩽ 2−n−1 , we see that Π S2 n+1 (w; [0, 1]) ⩽ ϵ n+1 for some null sequence (ϵ n )n⩾1 . Thus, |Π n+1 | ⩽ 2−n−1 → 0, Π Π n ⊂ Π n+1 and S2 n+1 (w; [0, 1]) ⩽ ϵ n+1 → 0.

9.4 Lévy’s characterization of Brownian motion

Theorem 9.1 remains valid for all continuous martingales (X t , Ft )t⩾0 with the property that (X 2t − t, Ft )t⩾0 is a martingale. In fact, this is equivalent to saying that 󵄨 𝔼 [X t − X s 󵄨󵄨󵄨 Fs ] = 0 for all s ⩽ t, (9.7) and, since 𝔼[X s X t | Fs ] = X s 𝔼[X t | Fs ] = X 2s , 󵄨 𝔼 [(X t − X s )2 󵄨󵄨󵄨 Fs ] = t − s

for all s ⩽ t.

(9.8)

9.11 Lemma. Let (X t , Ft )t⩾0 be a real-valued martingale with continuous sample paths such that (X 2t − t, Ft )t⩾0 is a martingale. Then 𝔼 [(X t − X s )4 ] < ∞ for all s < t, and 󵄨 𝔼 [(X t − X s )4 󵄨󵄨󵄨 Fs ] ⩽ 4(t − s)2 for all s < t. (9.9) Proof. Set t j := s + (t − s) nj , j = 0, 1, . . . , n, and δ j := X t j − X t j−1 . Obviously,¹ 4

2

n

2

n

(X t − X s ) = ( ∑ δ j ) ( ∑ δ l ) j=1

n

l=1

n

n

n

= ( ∑ δ2j + 2 ∑ ∑ δ j δ k ) ( ∑ δ2l + 2 ∑ ∑ δ l δ m ) j=1

n

k=1 j 0. Because of the subadditivity of an outer measure it is clearly enough to show that H α (E ∪ F) ⩾ H α (E) + H α (F); in particular, we may assume that H α (E ∪ F) < ∞. Take any δ-cover (A j )j⩾1 of E ∪ F with δ < δ0 . By the definition of δ0 it is clear that each A j can intersect only either E or F, so there is a natural split into δ-covers of E and F, respectively. We get ∑ |A j |α = ∑ |A j |α + ∑ |A j |α ⩾ Hδα (E) + Hδα (F). A j ∩ E=0̸

j⩾1

Thus,

Ex. 11.2

Hδα (E

∪ F) ⩾

Hδα (E)

+

Hδα (F),

A j ∩ F =0̸

and the claim follows as δ → 0.

This, combined with Carathéodory’s extension theorem for measures, shows that H α , restricted to the Borel sets B(ℝd ), is a Borel measure. Since H α is invariant under translations and 0 < H d (𝔹(0, 1)) < ∞, it is clear that H d is a constant multiple of d-dimensional Lebesgue measure λ d . In order to work out the exact constant – we have to show that H d (𝔹(0, 1)) = 1 – we need either the isodiametric inequality, cf. [81, Chapter 2.2, Theorem 1, p. 69], or a clever covering argument [232, Theorem 18.15]. 11.3 Lemma. H

α

is outer regular, i.e.

H α (E) = inf {H α (B) : B ⊃ E, B ∈ B(ℝd )}

for all E ⊂ ℝd .

Proof. We will construct a set B ∈ B(ℝd ) such that B ⊃ E and H α (B) = H α (E). If H α (E) = ∞, take B = ℝd . In all other cases there is for each n ⩾ 1 some δ-cover (E nj )j⩾1 of E satisfying ∑ |E nj |α ⩽ Hδα (E) +

j⩾1 n

1 1 ⩽ H α (E) + . n n

Since |E nj | = |E j | we may assume that the sets E nj are closed, hence Borel sets. Then and we see

B n := ⋃ E nj

and

B := ⋂ B n := ⋂ ⋃ E nj ⊃ E, n⩾1

j⩾1

n⩾1 j⩾1

Hδα (E) ⩽ Hδα (B) ⩽ Hδα (B n ) ⩽ ∑ |E nj |α ⩽ H α (E) + j⩾1

1 . n

The claim follows letting first n → ∞ and then δ → 0.

The function α 󳨃→ H α (E) is most of the time either 0 or +∞. Indeed, if (E j )j⩾1 is a δ-cover of E, then we get Hδα+ϵ (E) ⩽ ∑ |E j |α+ϵ ⩽ δ ϵ ∑ |E j |α , j⩾1

j⩾1

Letting δ → 0 proves the next lemma.

hence, Hδα+ϵ (E) ⩽ δ ϵ ⋅ Hδα (E).

11.1 Hausdorff measure and dimension | 179

11.4 Lemma. If H α (E) < ∞, then H

α+ϵ (E)

= 0 for all α, ϵ > 0.

In particular, # {α : 0 < H α (E) < ∞} ⩽ 1. The value where H α (E) jumps from +∞ to 0 is the dimension of E.

11.5 Definition (Hausdorff 1919). The Hausdorff (or Hausdorff–Besicovitch) dimension of a set E ⊂ ℝd is the unique number

Ex. 11.3

dim E = sup {α ⩾ 0 : H α (E) = ∞} = inf {α ⩾ 0 : H α (E) = 0}

(sup 0 := 0). If α = dim E, the Hausdorff measure H α (E) can have any value in [0, ∞]. Later on, we want to calculate the Hausdorff dimension of certain (random) sets. In the following remark we collect some useful techniques for this. 11.6 Remark (Determining the Hausdorff dimension). Let E ⊂ ℝd and f : ℝd → ℝn . a) Upper bound: H α (E) = 0 󳨐⇒ dim E ⩽ α. Showing that H α (E) = 0 is often simple, since it is enough to construct a sequence of δ-covers (with δ = δ n → 0) such that ∑j⩾1 |E nj |α becomes small as n → ∞.

b) Lower bound: H β (E) > 0 󳨐⇒ dim E ⩾ β. It is usually hard to check that H β (E) > 0 since we have to show that ∑j⩾1 |E j |β is uniformly bounded from below for all possible δ-covers of E. This requires clever covering strategies or capacitary methods. c) If f is Hölder continuous with exponent γ ∈ (0, 1], i.e. |f(x) − f(y)| ⩽ L|x − y|γ for all x, y ∈ ℝd , then dim f(E) ⩽ γ−1 dim E.

Proof. By the Hölder continuity |f(E j )| ⩽ L |E j |γ , and for any δ-cover (E j )j⩾1 of E, (f(E j ))j⩾1 is an Lδ γ -cover of f(E). If αγ > dim E, then H αγ (E) = 0 and, for some suitable δ-cover (E j )j⩾1 , we see L−α ∑ |f(E j )|α ⩽ ∑ |E j |αγ ⩽ ϵ. j⩾1

j⩾1

Using Part a), we get dim f(E) ⩽ α. The claim follows if we let α → γ−1 dim E.

d) Mass distribution principle. Let E ⊂ ℝd , α > 0 and assume that there is a non-trivial finite measure μ supported in E. If there exist constants c > 0 and ϵ > 0 such that μ(F) ⩽ c |F|α

for all closed sets F ⊂ ℝd with |F| ⩽ ϵ,

then H α (E) > 0 and dim E ⩾ α.

Proof. Let δ ⩽ ϵ and consider any δ-covering of E by closed sets (F j )j⩾1 . By assumption ∞



0 < μ(E) ⩽ ∑ μ(F j ) ⩽ ∑ c |F j |α , j=1

j=1

and this implies Hδα (E) ⩾ c−1 μ(E) and H α (E) ⩾ c−1 μ(E) > 0.

Ex. 11.6

180 | 11 Brownian motion as a random fractal

Ex. 11.7

e) Self-similarity. A set E ⊂ ℝ^d is said to be self-similar, if there are finitely many functions f_j : ℝ^d → ℝ^d, j = 1, . . . , n, such that |f_j(x) − f_j(y)| = c_j |x − y|, x, y ∈ ℝ^d, and E = f_1(E) ∪ ⋅ ⋅ ⋅ ∪ f_n(E). If the sets f_j(E) do not overlap, then

H^α(E) = ∑_{j=1}^n c_j^α H^α(E).

Assume that α = dim E and 0 < H^α(E) < ∞. Then we can calculate α from the equality 1 = ∑_{j=1}^n c_j^α.

f) Energy criterion (Frostman 1935; McKean 1955; Blumenthal & Getoor 1960). Let μ be a probability measure on E. Then

∬_{E×E} μ(dx) μ(dy)/|x − y|^α < ∞ ⟹ H^α(E) > 0 ⟹ dim E ⩾ α.

Proof. See Section A.9 in the appendix.

g) Frostman's theorem (Frostman 1935). Let K be a compact set. Then

α < dim K ⟹ H^α(K) > 0 ⟹ ∃ finite measure σ ∀β < α : ∬_{K×K} σ(dx) σ(dy)/|x − y|^β < ∞.

Proof. See Section A.9 in the appendix.
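A standard worked example for e) (added here; cf. Problem 7 below): the middle-thirds Cantor set C satisfies C = f_1(C) ∪ f_2(C) with similarity ratios c_1 = c_2 = 1/3 and non-overlapping images. If α = dim C and 0 < H^α(C) < ∞, the relation 1 = ∑_j c_j^α becomes 1 = 2 ⋅ 3^{−α}, i.e. dim C = log 2 / log 3 ≈ 0.63.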

11.2 The Hausdorff dimension of Brownian paths

Recall that almost all Brownian motion paths satisfy a Hölder condition of any index 0 < γ < 1/2:

|B_t(ω) − B_s(ω)| ⩽ L_γ(ω) |t − s|^γ,   0 < γ < 1/2,   ω ∈ Ω_0.   (11.2)

Here ℙ(Ω0 ) = 1 independently of γ and ℙ(L γ < ∞) = 1. This follows from Corollary 10.2, but for the readers’ convenience we provide a different proof which is closer to the original arguments by Wiener [265, § 13] and Paley & Wiener [198, Theorem XLVII, pp. 160–162]. 11.7 Lemma (Wiener 1930; Paley & Wiener 1934). BMd is almost surely Hölder continuous for any index γ ∈ (0, 21 ).

Proof. The coordinates of a d-dimensional Brownian motion are one-dimensional Brownian motions, cf. Corollary 2.9, and so we may assume that (B t )t⩾0 is real-valued.


Since (B_{t+1} − B_1)_{t⩾0} is again a BM1, it is enough to show Hölder continuity for every t ∈ [0, 1]. Fix γ ∈ (0, 1/2); using B_{t+h} − B_t ∼ B_h ∼ √h B_1, we get

ℙ(|B_{t+h} − B_t| > h^γ) = ℙ(|B_1| > h^{γ−1/2}) = √(2/π) ∫_{h^{γ−1/2}}^∞ e^{−x²/2} dx
  ⩽ √(2/π) h^{1/2−γ} ∫_{h^{γ−1/2}}^∞ x e^{−x²/2} dx = √(2/π) h^{1/2−γ} e^{−h^{2γ−1}/2} ⩽ c h²

for some constant c > 0. For h = 2^{−n}, n ⩾ 1, we find

ℙ(⋃_{n⩾k} ⋃_{1⩽j⩽2^n} {|B_{j2^{−n}} − B_{(j−1)2^{−n}}| > 2^{−nγ}}) ⩽ c ∑_{n=k}^∞ 2^n 2^{−2n} = c 2^{−k+1} → 0   as k → ∞.

By a standard Borel–Cantelli argument, there is a set Ω γ with ℙ(Ω γ ) = 1 such that 󵄨󵄨 −n 󵄨 󵄨󵄨B j2 (ω) − B(j−1)2−n (ω)󵄨󵄨󵄨 ⩽ 2−nγ

for all n ⩾ N(ω), j = 1, . . . , 2n , ω ∈ Ω γ .

If [t, t + h] ⊂ [0, 1], h < 2−N(ω) , we see that (t, t + h) ⊂ ⋃k⩾1 I k where the I k are nonoverlapping intervals of length 2−n(k) and with dyadic endpoints. Note that at most two intervals of any one length appear in the union. Indeed, I1 = [r, s] is the largest


Fig. 11.1: Exhausting the interval [t, t + h] by non-overlapping dyadic intervals. Dots “∙” denote dyadic numbers. Observe that at most two dyadic intervals of any one length can occur.

possible dyadic interval fitting into [t, t + h]; its length is s − r = 2^{−n(1)}, where n(1) is the smallest integer with 2^{−n(1)} < h. Expanding r − t and t + h − s as dyadic series shows how to construct all further dyadic intervals to the left and right, respectively: they are directly adjacent to I_1 and their length is given by the first (second, third, etc.) non-zero coefficient in the dyadic expansion, see Figure 11.1. Applying the above estimate to the (endpoints of the) intervals I_k we see, by the continuity of Brownian motion, that for all ω ∈ Ω_γ

|B_{t+h}(ω) − B_t(ω)| ⩽ 2 ∑_{j=n(1)}^∞ 2^{−jγ} = (2/(1 − 2^{−γ})) 2^{−n(1)γ} ⩽ (2/(1 − 2^{−γ})) h^γ.

This proves Hölder continuity on [0, 1] of order γ < 1/2. Since Ω_0 := ⋂_{k⩾3} Ω_{1/2 − 1/k} has full measure, ℙ(Ω_0) = 1, and since Hölder continuity is inherited by smaller exponents, the above argument shows that all Brownian paths t ↦ B_t(ω), ω ∈ Ω_0, are Hölder continuous of all orders γ < 1/2.
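The Hölder bound (11.2) is easy to probe numerically. The following sketch — added here, not part of the text, and assuming only NumPy — simulates a Brownian path on a dyadic grid and reports the largest ratio |B(t+h) − B(t)|/h^γ over increments of dyadic length; for γ < 1/2 the ratios stay bounded as h decreases.

import numpy as np

rng = np.random.default_rng(1)
n = 2**20                                    # grid points on [0, 1]
dt = 1.0 / n
# Brownian path sampled at t_k = k/n
B = np.concatenate(([0.0], np.cumsum(rng.normal(0.0, np.sqrt(dt), n))))

gamma = 0.45                                 # any exponent < 1/2 should work
for k in range(1, 15):                       # increments of length h = 2^-k
    h = 2.0 ** (-k)
    step = n // 2**k
    incr = np.abs(B[step::step] - B[:-step:step])
    print(f"h = 2^-{k:2d}   max |B(t+h)-B(t)| / h^gamma = {incr.max() / h**gamma:.3f}")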

We are interested in the Hausdorff dimension of the range (or image) B(E, ω) and the graph Gr B(E, ω) of a d-dimensional Brownian motion:

B(E, ω) := {B_t(ω) : t ∈ E},   Gr B(E, ω) := {(t, B_t(ω)) : t ∈ E},   E ∈ B[0, 1].

Dimension results for the range with E = [0, 1] are due to Taylor [254], the general case is due to McKean [183].

11.8 Theorem (Taylor 1953; McKean 1955). Let (B_t)_{t⩾0} be a BMd and F ⊂ [0, 1] a closed set. Then, almost surely,

dim B(F) = min {d, 2 dim F} = 2 dim F if d ⩾ 2,   and   = min {1, 2 dim F} if d = 1.   (11.3)

Proof. 1o The upper bound in (11.3) follows at once from the Hölder continuity of Brownian motion (11.2) using Remark 11.6.c). For all γ < 1/2 we have for any Borel set E ⊂ [0, 1]

dim B(E) ⩽ (1/γ) dim E   a.s.

Letting γ → 1/2 along a countable sequence, we get dim B(E) ⩽ 2 dim E a.s. Since dim B(E) ⩽ d is trivial, the upper bound on Hausdorff dimension follows. Note that, in this case, the exceptional null set is the same for all E ⊂ [0, 1].

2o The proof of the lower bound in (11.3) requires methods from potential theory, cf. Remark 11.6.f) and g). Let F ⊂ [0, 1] be a closed set and pick α < dim F. By definition, H^α(F) > 0 and by Frostman's theorem, Remark 11.6.g), there is a finite measure σ on F such that

∬_{F×F} σ(dt) σ(ds)/|t − s|^α < ∞.

In order to determine the Hausdorff dimension of B(F), we consider the occupation measure, i.e. the time (measured with σ) (B_t)_{t⩾0} spends in a Borel set A ∈ B(ℝ^d):

μ_σ^B(A) = ∫_F 𝟙_A(B_t) σ(dt).

Assume, in addition, that λ := 2α < min {d, 2 dim F}. Then, by Tonelli's theorem,

𝔼[∬_{B(F)×B(F)} μ_σ^B(dx) μ_σ^B(dy)/|x − y|^λ] = 𝔼[∬_{F×F} σ(dt) σ(ds)/|B_t − B_s|^λ] = ∬_{F×F} 𝔼[1/|B_t − B_s|^λ] σ(dt) σ(ds).


Using stationary increments and scaling, B_t − B_s ∼ B_{t−s} ∼ √(t − s) B_1, we see

𝔼[∬_{B(F)×B(F)} μ_σ^B(dx) μ_σ^B(dy)/|x − y|^λ] = 𝔼[|B_1|^{−λ}] ∬_{F×F} σ(dt) σ(ds)/|t − s|^{λ/2}.

It is not hard to check that 𝔼(|B_1|^{−λ}) < ∞ for λ < d, while the double integral appearing on the right is finite since α = λ/2 < dim F. The energy criterion, see Remark 11.6.f), now proves that dim B(F) ⩾ λ a.s. Letting λ → min {d, 2 dim F} along a countable sequence gives the lower bound of (11.3). Note, however, that the null set may depend on F.
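Theorem 11.8 can be made plausible by a rough box-counting experiment (added sketch, not from the text; assumes NumPy). For a planar Brownian motion and F = [0, 1] the theorem gives dim B([0, 1]) = 2, so the number of squares of side 2^{−k} hit by a sampled path should grow roughly like 2^{2k} over moderate scales; the discretisation limits how small the boxes may be taken.

import numpy as np

rng = np.random.default_rng(2)
n = 200_000
steps = rng.normal(0.0, np.sqrt(1.0 / n), size=(n, 2))
path = np.vstack(([0.0, 0.0], np.cumsum(steps, axis=0)))     # planar BM on [0, 1]

ks, counts = [], []
for k in range(2, 9):
    eps = 2.0 ** (-k)
    boxes = np.unique(np.floor(path / eps), axis=0)           # occupied squares of side eps
    ks.append(k)
    counts.append(len(boxes))

slope, _ = np.polyfit(np.array(ks) * np.log(2.0), np.log(counts), 1)
print("box-counting estimate of dim B([0,1]):", round(slope, 2))   # roughly 2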

The graph Gr B(E, ω) of Brownian motion was studied by Taylor [255] for E = [0, 1], for general E ∈ B[0, 1] we refer to Kahane [131, Chapter 18, § 7].

11.9 Theorem (Taylor 1955). Let (B_t)_{t⩾0} be a BMd and F ⊂ [0, 1] a closed set. Then, almost surely,

dim Gr B(F) = min {2 dim F, dim F + d/2}
            = dim F + 1/2   if d = 1 and dim F ⩾ 1/2,
            = 2 dim F        if d = 1 and dim F ⩽ 1/2,
            = 2 dim F        if d ⩾ 2.

Proof. 1o The upper bound uses two different coverings of the graph. Pick α > dim F. Then there is some δ-covering (F_j)_{j⩾1} of F by closed sets with diameters |F_j| =: δ_j ⩽ δ. Observe that Gr B(F_j) ⊂ F_j × B(F_j); using the Hölder continuity (11.2) of Brownian motion, we get for all γ ∈ (0, 1/2)

|Gr B(F_j)| ⩽ |F_j| + |B(F_j)| ⩽ |F_j| + L_γ |F_j|^γ ⩽ (L_γ + 1) δ^γ.

This shows that the closed sets (F_j × B(F_j))_{j⩾1} are an (L_γ + 1)δ^γ-covering of Gr B(F) and

∑_{j=1}^∞ |Gr B(F_j)|^{α/γ} ⩽ (L_γ + 1)^{α/γ} ∑_{j=1}^∞ |F_j|^α < ∞,

i.e. dim Gr B(F) ⩽ γ^{−1} dim F a.s. Letting γ → 1/2 via a countable sequence shows dim Gr B(F) ⩽ 2 dim F a.s.

On the other hand, using the Hölder continuity of t ↦ B_t, it is not hard to see that each F_j × B(F_j) is also covered by cδ_j^{(γ−1)d} many cubes of diameter δ_j. This gives a further δ-covering of Gr B(F) satisfying

∑_{j=1}^∞ (cδ_j^{(γ−1)d}) δ_j^{α+(1−γ)d} = c ∑_{j=1}^∞ δ_j^α < ∞,

thus H^{α+(1−γ)d}(Gr B(F)) < ∞ and dim Gr B(F) ⩽ α + (1 − γ)d a.s. Letting α → dim F and γ → 1/2 along countably many values proves dim Gr B(F) ⩽ dim F + d/2 a.s. Combining both estimates gives the upper bound.

2o For the lower bound we use potential theoretic arguments.

Ex. 11.8

Assume first that dim F ⩽ d/2. Since B(F) is the projection of Gr B(F) onto ℝ^d, and since projections are Lipschitz continuous (i.e. Hölder continuous with γ = 1), we see with Theorem 11.8

dim Gr B(F) ⩾ dim B(F) ⩾ 2 dim F ⩾ min {2 dim F, dim F + d/2}   a.s.

The situation dim F > d/2 can occur only if d = 1; if so, dim Gr B(F) ⩾ dim F + 1/2 a.s. In order to see this, fix γ ∈ (1, dim F + 1/2). Since γ − 1/2 < dim F, we conclude from Frostman's theorem, Remark 11.6.g), that there is some finite measure σ on F with

∬_{F×F} σ(dt) σ(ds)/|t − s|^{γ−1/2} < ∞.

Consider the occupation measure

μ_σ^{Gr B}(A) = ∫_F 𝟙_A(t, B_t) σ(dt),   A ∈ B(ℝ²).

As in the proof of Theorem 11.8 we see

𝔼[∬ μ_σ^{Gr B}(dx) μ_σ^{Gr B}(dy)/|x − y|^γ] = 𝔼[∬_{F×F} σ(ds) σ(dt)/(|t − s|² + |B_t − B_s|²)^{γ/2}] ⩽ 𝔼[∬_{F×F} σ(ds) σ(dt)/|B_t − B_s|^γ].

Using Brownian scaling |B_t − B_s| ∼ √|t − s| |B_1| yields, as in the proof of Theorem 11.8,

𝔼[∬ μ_σ^{Gr B}(dx) μ_σ^{Gr B}(dy)/|x − y|^γ] ⩽ c_γ ∬_{F×F} σ(ds) σ(dt)/|t − s|^{γ/2} ⩽ c_γ ∬_{F×F} σ(ds) σ(dt)/|t − s|^{γ−1/2} < ∞.

The last inequality follows from the fact that |t − s| ⩽ 1 and γ/2 ⩽ γ − 1/2. The energy criterion, cf. Remark 11.6.f), now shows that dim Gr B(F) ⩾ γ a.s., and the claim follows as γ → dim F + 1/2 along countably many values.

The exceptional set in Theorem 11.8 may depend on the set of times F. Kaufman [137] showed that for Brownian motion in dimension d ⩾ 2 this null set can be chosen independently of F. Kaufman’s beautiful proof does not rely on capacitary arguments; at the same time it holds for arbitrary Borel sets E ∈ B[0, 1]. The key ingredient of the proof is the following technical lemma. Denote by D(r) ⊂ ℝd an open ball of radius r > 0 and by B−1 (D, ω) = {t : B t (ω) ∈ D} the pre-image of D.

11.10 Lemma (Kaufman 1969). Let (B t )t⩾0 be a BMd , d ⩾ 2, set I nk = [(k − 1)4−n , k4−n ], k = 1, . . . , 4n , and denote by A n ⊂ Ω the event 󵄨󵄨 A n = {ω ∈ Ω 󵄨󵄨󵄨 ∃D = D(2−n ) : # {k : B−1 (D, ω) ∩ I nk ≠ 0} ⩾ n2d } . 󵄨

Then N0 := limn→∞ A n is a ℙ-null set.


Proof. For sufficiently large n ⩾ 1, e.g. n ⩾ 4d2 , we set 󵄨󵄨 A∗n = {ω ∈ Ω 󵄨󵄨󵄨 ∃D = D(2√n 2−n ) : # {k : B k4−n (ω) ∈ D} ⩾ n2d } . 󵄨 Let us show that A n ⊂ A∗n . Write y0 for the centre of the ball D = D(2−n ) from the definition of A n . Using Lévy’s modulus of continuity, Theorem 10.6, we find for every t0 ∈ B−1 (D, ω) ∩ I nk and any constant c > 1 |B k4−n − B t0 | ⩽ c√2 ⋅ 4−n log 4n ⩽ 32 √n 2−n .

Thus, |B k4−n − y0 | ⩽ |B k4−n − B t0 | + |B t0 − y0 | < 2√n 2−n and B k4−n ∈ D(2√n 2−n ); this proves A n ⊂ A∗n . Set n−1

󵄨 󵄨 A∗(k1 ,...,k n ) := ⋂ {󵄨󵄨󵄨B k j+1 4−n − B k j 4−n 󵄨󵄨󵄨 < 4√n 2−n } ,

(11.4)

j=1

where 1 ⩽ k1 < k2 < ⋅ ⋅ ⋅ < k n ⩽ 4n are arbitrary integers. The probability of each set in the intersection on the right-hand side can be estimated by 󵄨 󵄨 ℙ (󵄨󵄨󵄨B k j+1 4−n − B k j 4−n 󵄨󵄨󵄨 < 4√n 2−n ) = ℙ (|B1 | < 4√n/√ k j+1 − k j ) −d/2

⩽ 4d c d n d/2 (k j+1 − k j )

Here we use B t − B s ∼ √t − s B1 , and the estimate follows from ℙ(|B1 | < r) =

.

1 2 2 1 rd Leb(𝔹(0, 1)) − 21 |x|2 dx = e . ∫ ∫ e− 2 r |y| dy ⩽ r d d/2 d/2 (2π) (2π) (2π)d/2

|x| x − 1n } ∩ {B s < x + 1n } , { {n=1 s∈(q,t]∩ ℚ

if t < q,

if t ⩾ q,

we see that τ x is a stopping time, and from Part b) we know that ℙ(τ x < ∞) = 1. q Set A q := {ω ∈ Ω : τ x (ω) is an accumulation point of T(x, ω)}. Then q



q∈[0,∞)∩ ℚ

q

A q ⊂ {T(x, ⋅) has no isolated points} .

In order to see this inclusion, we assume that for fixed ω the point t0 ∈ T(x, ω) is an isolated point. By its very definition, there exists some rational q ⩾ 0 such that q (q, t0 ) ∩ T(x, ω) = 0, so τ x (ω) = t0 ; since t0 is no accumulation point, ω ∉ A q . By the strong Markov property B τ qx +t − x is again a Brownian motion and from Lemma 11.19 we see that ℙ(A q ) = 1 for all rational q ⩾ 0. Since the above intersection is countable, the claim follows. e) This follows from Theorem 11.14 or Theorem 11.22 below.

The role of the exceptional sets appearing in Theorem 11.21 must not be neglected. For instance, consider the set N x := {ω ∈ Ω : T(x, ω) has isolated points}.

Then ℙ (⋃x∈ℝ N x ) = 1, since every t0 (ω) where a local maximum x0 = x0 (ω) is reached is necessarily an isolated point of T(x0 , ω): this is a consequence of Lemma 11.17 as Brownian local maxima are strict local maxima.

11.5 Roots and records

Let (B_t)_{t⩾0} be a BM1. As in Chapter 6 we write ℙ^x for the law of B + x = (B_t + x)_{t⩾0} (the Brownian motion started at x) and ℙ = ℙ^0. The 0-level set T_{B+x}(0, ω) of B + x is the

(−x)-level set T_B(−x, ω) of B. This means that we can reduce the study of the level sets to studying the zero set

Z(ω) = B^{−1}({0}, ω) = {t ∈ [0, ∞) : B_t(ω) = 0}

under ℙx . Closely related to the zeros of a Brownian motion are the records, i.e. the times where a Brownian motion reaches its running maximum M t = sups⩽t B s , R(ω) = {t ∈ [0, ∞) : B t (ω) = M t (ω)} .

Ex. 6.1

From the reflection principle, Theorem 6.10, we know that M_t − B_t ∼ |B_t|. Both M − B and |B| are Markov processes, cf. Problem 6.1, and we find that the processes (M_t − B_t)_{t⩾0} and (|B_t|)_{t⩾0} have the same distributions. This means, however, that the roots Z(⋅) and the records R(⋅) of a BM1 have the same distributional properties. This was first observed by Lévy [163] who gave a penetrating study of the structure of Z(ω) using R(ω). Here is a typical example of this strategy.

11.22 Theorem. Let (B_t)_{t⩾0} be a BM1. Then ℙ(dim(Z(⋅) ∩ [0, 1]) = 1/2) = 1.

Proof. 1o (Lower bound). Since Z(⋅) and R(⋅) follow the same probability law, we show that dim(R(⋅) ∩ [0, 1]) ⩾ 1/2 a.s. Define a measure μ_ω on R(ω) by setting μ_ω(s, t] = M_t(ω) − M_s(ω). Since (B_t)_{t⩾0} is a.s. nowhere monotone, μ_ω is supported in R(ω). Brownian motion is a.s. Hölder continuous of order γ for any γ ∈ (0, 1/2), cf. (11.2), and so

μ_ω(s, t] = M_t(ω) − M_s(ω) ⩽ sup_{0⩽h⩽t−s} (B_{s+h}(ω) − B_s(ω)) ⩽ L_γ(ω)(t − s)^γ

for all s, t ∈ [0, 1] and ω ∈ Ω_γ where ℙ(Ω_γ) = 1. Now we can use Example 11.6.d) to conclude that dim(R(ω) ∩ [0, 1]) ⩾ γ for all ω ∈ Ω_γ. Letting γ → 1/2 along countably many values yields the lower bound.

2o (Upper bound). Let (β_t)_{t⩾0} be a further BM1 independent of (B_t)_{t⩾0}. Applying Kaufman's theorem in the form of Corollary 11.12 to the two-dimensional Brownian motion W_t = (B_t, β_t), we find for the zero set Z(ω) = B^{−1}({0}, ω) that

dim B^{−1}({0}, ω) = dim W^{−1}({0} × ℝ, ω) ⩽ (1/2) dim({0} × ℝ) = 1/2.
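A numerical sanity check of Theorem 11.22 (added sketch, not from the text; assumes NumPy): count the dyadic intervals of length 2^{−k} on which a sampled path changes sign — such intervals certainly meet Z(ω) — and compare the growth of the count with 2^{k/2}.

import numpy as np

rng = np.random.default_rng(3)
n = 2**20
B = np.concatenate(([0.0], np.cumsum(rng.normal(0.0, np.sqrt(1.0 / n), n))))

for k in range(4, 13):
    m = 2**k                                   # dyadic intervals of length 2^-k
    step = n // m
    segs = B[: m * step].reshape(m, step)
    hits = np.sum((segs.min(axis=1) <= 0.0) & (segs.max(axis=1) >= 0.0))
    print(f"k = {k:2d}   log2 N(2^-k) = {np.log2(hits):5.2f}   k/2 = {k / 2:4.1f}")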

For the rest of this section we will not follow Lévy’s idea, but we use a direct attack on Z(ω) without taking recourse to R(ω). The trouble with Z(ω) is that each Brownian zero is an accumulation point of other zeros, cf. Theorem 11.21, which means that we cannot move from a fixed zero to the “next” zero, see Figure 11.2. Since Z(ω) is a closed set, its complement is open and Zc (ω) = ⋃n⩾1 U n (ω) is a countable union of open intervals U n (ω) ⊂ [0, ∞). We call the intervals U n (ω) excursion intervals. In order to describe U n (ω), we introduce the following random times g t (ω) := sup {s ⩽ t : B s (ω) = 0}

and

d t (ω) := inf {s ⩾ t : B s (ω) = 0}



Fig. 11.2: A typical excursion interval: g t is the last zero before t > 0, and d t is the first zero after t > 0.

which are the last zero before t and the first zero after t; this is shown in Figure 11.2. Since ℙ(B t = 0) = 0 we see that g t < t < d t almost surely. While d t is a stopping time, g t is not since it depends on the knowledge of the whole path in [0, t] (otherwise we could not decide if a given zero is the last zero before time t), see also the discussion in Section 6.6. Clearly, (g t (ω), d t (ω)) is the interval U n (ω) containing t > 0, hence Zc (ω) = ⋃q∈ℚ,q>0 (g q (ω), d q (ω)). One of the main ingredients in studying Z(ω) is the law of the first passage distribution τ x of the level x ∈ ℝ, cf. Theorem 6.10: ℙ0 (τ x ∈ ds) =

|x|/√(2πs³) ⋅ exp(−x²/(2s)) ds = ℙ^x(τ_0 ∈ ds)

(the second equality follows from the symmetry and translation invariance of Brownian motion).

11.23 Lemma (Lévy 1948). Let (B_t)_{t⩾0} be a one-dimensional Brownian motion. Then [Ex. 11.11]

ℙ(B_s = 0 for some s ∈ (t, t + h)) = ℙ(Z(⋅) ∩ (t, t + h) ≠ ∅) = (2/π) arccos √(t/(t + h)).

Proof. Using the Markov property we get

ℙ(B_s = 0 for some s ∈ (t, t + h)) = 𝔼(ℙ^{B_t}(τ_0 < h)),

and since we know the density of τ_0, we get

𝔼(ℙ^{B_t}(τ_0 < h)) = 𝔼(∫_0^h |B_t|/√(2πs³) ⋅ exp(−B_t²/(2s)) ds)
  = ∫_0^h ∫_{−∞}^∞ |y|/(2π√(t s³)) ⋅ exp(−(1/2)[1/s + 1/t] y²) dy ds
  = ∫_0^h √t/(π (s + t)√s) ds.

Changing variables according to s = tu², ds = 2tu du, yields

ℙ(B_s = 0 for some s ∈ (t, t + h)) = (2/π) ∫_0^{√(h/t)} du/(1 + u²) = (2/π) arctan √(h/t).

From t = (t + h) cos²(arctan √(h/t)) we conclude that arctan √(h/t) = arccos √(t/(t + h)). [Ex. 11.13]
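The arccos law of Lemma 11.23 is easy to test by simulation (added sketch, not from the text; assumes NumPy): start each path from an independent N(0, t) value for B_t and record how often the discretised path on (t, t + h] hits zero. The discretisation misses some very short excursions, so the empirical value slightly underestimates the exact one.

import numpy as np

rng = np.random.default_rng(4)
t, h = 1.0, 0.5
n_paths, n_steps = 5_000, 1_000

Bt = rng.normal(0.0, np.sqrt(t), n_paths)                         # B_t ~ N(0, t)
incr = rng.normal(0.0, np.sqrt(h / n_steps), (n_paths, n_steps))
paths = Bt[:, None] + np.cumsum(incr, axis=1)                     # B on (t, t+h]

lo = np.minimum(paths.min(axis=1), Bt)
hi = np.maximum(paths.max(axis=1), Bt)
freq = np.mean((lo <= 0.0) & (hi >= 0.0))                         # a zero occurred
exact = (2.0 / np.pi) * np.arccos(np.sqrt(t / (t + h)))
print("simulated:", round(freq, 3), "   (2/pi) arccos sqrt(t/(t+h)):", round(exact, 3))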

11.24 Lemma. Let (B_t)_{t⩾0} be a BM1 and d_t the first zero after time t > 0. Then

ℙ(B_t ∈ dy, d_t ∈ ds) = |y|/(2π√(t(s − t)³)) ⋅ exp(−s y²/(2t(s − t))) 𝟙_{(t,∞)}(s) ds dy.   (11.7)

Proof. By definition, d t − t is just the first passage time τ0 of the Brownian motion re-started at time t > 0. Using the Markov property we get for any A ∈ B(ℝ) and a > t ℙ (B t ∈ A, τ t ⩾ a) = 𝔼 (𝟙A (B t ) ℙB t (τ0 ⩾ a − t)) ∞

= 𝔼 (𝟙A (B t ) ∫

a−t



= ∫∫ A a

|y|

2π√ t(s −

|B t | B2 exp (− 2st ) ds) √2πs3 2

t)3

2

y exp (− 2(s−t) ) exp (− y2t ) ds dy.

If (B t )t⩾0 is a Brownian motion, so is its projective reflection W t := tB1/t , (t > 0), and W0 = 0, cf. Paragraph 2.17. Since time is running backwards, the roles of the last zero before t and the first zero after t are reversed. This is the idea behind the following fundamental relation. 11.25 Theorem (Lévy 1939; Chung 1976). Let (B t )t⩾0 be a BM1 and g t the last zero before time t > 0. Then ℙ(|B t | ∈ dy, g t ∈ ds) =

y y2 exp (− 2(t−s) ) ds dy, 3 √ π s(t − s)

s < t, y > 0.


Proof. Let W t be the projective reflection of B t and denote by g tW the last zero of W before time t > 0. We will use the following abbreviations: t∗ := 1/t, s∗ = 1/s and u∗ := 1/u. For all x > 0 and s < t we have ℙ (|W t | ⩾ x, g tW ⩽ s) = ℙ (|W t | ⩾ x, W u ≠ 0 ∀u ∈ (s, t])

= ℙ (|B t∗ | ⩾ xt∗ , B u∗ ≠ 0 ∀u∗ ∈ [t∗ , s∗ )) = ℙ (|B t∗ | ⩾ xt∗ , d t∗ ⩾ s∗ ) ∞ ∞

= 2 ∫ ∫ f(B t∗ ,d t∗ ) (y, r) dr dy xt∗ s∗

where f(B t∗ ,d t∗ ) (y, r) is the joint density (11.7) from Lemma 11.24. Differentiating the right-hand side with respect to d/dx and d/ds gives g(|W t |,g tW ) (y, s) = 2f(B t∗ ,d t∗ ) (yt∗ , s∗ )

t∗ . s2

Since (W, g W ) ∼ (B, g), the claim follows, if we substitute s∗ = 1/s and t∗ = 1/t.

From Theorem 11.25 we can get various distributional identities for the zero set of a Brownian motion.

11.26 Corollary. Let g_t and d_t be the last zero before and the first zero after time t > 0. Then

ℙ(g_t ∈ ds, d_t ∈ du) = ds du/(2π√(s(u − s)³)),   0 < s < t < u.

Proof. Using the standard formula for conditional densities, we get

󵄨 ℙ (g t ∈ ds, d t ∈ du, |B t | ∈ dy) = ℙ (d t ∈ du 󵄨󵄨󵄨 g t = s, |B t | = y) ℙ (g t ∈ ds, |B t | ∈ dy) 󵄨󵄨 = ℙ (d t ∈ du 󵄨󵄨 |B t | = y) ℙ (g t ∈ ds, |B t | ∈ dy) .

The last equality follows from the fact that g t is Ft measurable and the Markov property. Since for fixed t, W t := B t+s − B t is a Brownian motion, we can easily work out the conditional probability: y 󵄨 y2 exp (− 2(u−t) ℙ (d t ∈ du 󵄨󵄨󵄨 |B t | = y) = ℙy (τ0 ∈ du − t) = ) du, √2π(u − t)3

u > t,

while Theorem 11.25 gives the density for ℙ (g t ∈ ds, |B t | ∈ dy). Integrating out dy we thus get for 0 < s < t < u ∞

ℙ (g t ∈ ds, d t ∈ du) y y y2 y2 =∫ exp (− 2(u−t) exp (− 2(t−s) ) ) dy 3 3 ds du √2π(u − t) π√ s(t − s) 0

=



1 1 1 ) du ∫ y2 exp (− 12 y2 / (t−s)(u−t) u−s 2π √2π √ s(t − s)3 √(u − t)3 −∞

Ex. 11.12


=

1 1 2π(t − s)(u − t) √ s(u − s)

as we have claimed.

1 √2π√ (t−s)(u−t) u−s

1 2 (t−s)(u−t) 1 1 ∞ ∫−∞ y2 e− 2 y / u−s du = 2π √ s(u − s)3

=𝔼 B2c =c,

c= (t−s)(u−t) u−s

11.27 Further reading. The interplay between Hausdorff dimension and capacity is a classic topic of potential theory, going back to Frostman's PhD thesis [94]. Good modern introductions can be found in [28] and [1], see also [131, Chapter 10]. Much more is known on Hausdorff dimension and (the exact) Hausdorff measure of Brownian paths. Excellent surveys on these topics are [274] and [257]. Fractals and fractal geometry, in general, are nicely covered in [83] (sometimes a bit handwaving, though) and [181]. In the hands of Itô and Williams, excursion theory became a most powerful tool. Lévy's paper [163] and Chapter VI in [166] – which started all this – are more visionary than rigorous. A few first steps can be found in Sections 18.7 and 19.8, for a thorough study we recommend [216, Chapter XII].

[1] Aikawa, Essén: Potential Theory – Selected Topics.
[28] Carleson: Selected Problems on Exceptional Sets.
[83] Falconer: Fractal Geometry.
[94] Frostman: Potentiel d'équilibre et capacité des ensembles avec quelques applications a la théorie des fonctions.
[131] Kahane: Some Random Series of Functions.
[181] Mattila: Geometry of Sets and Measures in Euclidean Spaces.
[216] Revuz, Yor: Continuous Martingales and Brownian Motion.
[257] Taylor: The measure theory of random fractals.
[274] Xiao: Random fractals and Markov processes.

Problems

1. Verify that s-dimensional Hausdorff measure is an outer measure.

2. Show that the d-dimensional Hausdorff measure of a bounded open set U ⊂ ℝ^d is finite and positive: 0 < H^d(U) < ∞. In particular, dim U = d.

Hint. Since Hausdorff measure is monotone, it is enough to show that for an axis-parallel cube Q ⊂ ℝ^d we have 0 < H^d(Q) < ∞. Without loss, assume that Q = [0, 1]^d. The upper bound follows by covering Q with small cubes of side length 1/n. The lower bound follows since every δ-cover (E_j)_{j⩾1} of Q gives rise to a cover by cubes whose side length is at most 2δ.

3. Let E ⊂ ℝ^d. Show that the Hausdorff dimension dim E coincides with the numbers

sup {α ⩾ 0 : H^α(E) = ∞},   inf {α ⩾ 0 : H^α(E) < ∞},   sup {α ⩾ 0 : H^α(E) > 0},   inf {α ⩾ 0 : H^α(E) = 0}.

4. Let (E_j)_{j⩾1} be subsets of ℝ^d and E := ⋃_{j⩾1} E_j. Show that dim E = sup_{j⩾1} dim E_j.

5. Let E ⊂ ℝ^d. Show that dim(E × ℝ^n) = dim E + n. Does a similar formula hold for E × F where E ⊂ ℝ^d and F ⊂ ℝ^n?

6. Let f : ℝ^d → ℝ^n be a bi-Lipschitz map, i.e. both f and f^{−1} are Lipschitz continuous. Show that dim f(E) = dim E. Is this also true for a Hölder continuous map with index γ ∈ (0, 1)?

7. Let C denote Cantor's discontinuum which is obtained if we remove recursively the open middle third of any remaining interval:

[0, 1] ⇝ [0, 1/3] ∪ [2/3, 1] ⇝ [0, 1/9] ∪ [2/9, 1/3] ∪ [2/3, 7/9] ∪ [8/9, 1] ⇝ . . . ,

see e.g. [232, Problem 7.12] or [236, § 2.5]. Show that C = f_1(C) ∪ f_2(C) with f_1(x) = (1/3)x and f_2(x) = (1/3)x + 2/3 and dim C = log 2 / log 3.

8. Let (B t )t⩾0 be a BMd . Show that for λ < d the expectation 𝔼 (|B1 |−λ ) is finite.

9. Let (B_t)_{t⩾0} be a one-dimensional Brownian motion. Show that dim B^{−1}(A) ⩽ 1/2 + (1/2) dim A a.s. for all A ∈ B(ℝ).

Hint. Apply Corollary 11.12 to W t = (B t , β t ) where β is a further, independent BM1 ; observe that B−1 (A) = W −1 (A × ℝ).

10. Let F ⊂ ℝ be a non-void perfect set, i.e. a closed set such that each point in F is an accumulation point of F. Show that a perfect set is uncountable.

11. Let (B_t)_{t⩾0} be a BM1 and denote by g_t the last zero before time t > 0.
a) Use Theorem 11.25 to give a further proof of Lévy's arc-sine law (cf. Theorem 6.20): ℙ(g_t < s) = (2/π) arcsin √(s/t) for all s ∈ [0, t].
b) Find the conditional density ℙ(|B_t| ∈ dy | g_t = s).

c) Integrate the formula appearing in Theorem 11.25 to get

√(2/(πt)) exp(−y²/(2t)) = ∫_0^t √(2/(πs)) ⋅ y/√(2π(t − s)³) ⋅ exp(−y²/(2(t − s))) ds

and give a probabilistic interpretation of this identity.

12. Show that Corollary 11.26 also follows from Lemma 11.23.
Hint. Calculate ∂²/(∂u ∂s) (1 − (2/π) arccos √(s/u)).

13. Let (B t )t⩾0 , d t and g t be as in Corollary 11.26. Define L−t := t − g t

and

L t := d t − g t .

Find the laws of (L−t , L t ), L−t , L t and L t conditional on L−t = r. Find the probability that ℙ(L t > r + s | L−t = r), i.e. the probability that an excursion continues for s units of time if we know that it is already r units of time under way.

12 The growth of Brownian paths

In this chapter we study the question how quickly a Brownian path grows as t → ∞. Since for a BM1 (B(t))_{t⩾0} the process (tB(t^{−1}))_{t>0} is again a Brownian motion, we get from the growth at infinity automatically local results for small t, and vice versa. A first estimate follows from the strong law of large numbers. Write for n ⩾ 1

B(nt) = ∑_{j=1}^n [B(jt) − B((j − 1)t)],   [Ex. 12.1]

and observe that the increments of a Brownian motion are stationary and independent random variables. By the strong law of large numbers we get that B(t, ω) ⩽ ϵt for any ϵ > 0 and all t ⩾ t_0(ϵ, ω). Lévy's modulus of continuity, Theorem 10.6, gives a better bound. For all ϵ > 0 and almost all ω ∈ Ω there is some h_0 = h_0(ϵ, ω) such that

|B(h, ω)| ⩽ sup_{0⩽t⩽1−h} |B(t + h, ω) − B(t, ω)| ⩽ (1 + ϵ)√(2h |log h|)

for all h < h_0. Since tB(t^{−1}) is again a Brownian motion, cf. 2.17, we see that

|B(t, ω)| ⩽ (1 + ϵ)√(2t log t)

holds a.s. for all t > t_0 = h_0^{−1}. But even this bound is not optimal.

12.1 Khintchine’s law of the iterated logarithm

We will show now that the function Λ(t) := √2t log log t, t ⩾ 3, neatly describes the behaviour of the sample paths as t → ∞. Still better results can be obtained from Kolmogorov’s integral test which we state (without proof) at the end of this section.

The key for the proof of the next theorem is a tail estimate for the normal distribution which we know from Lemma 10.5:

Ex. 12.2

(1/√(2π)) ⋅ x/(x² + 1) ⋅ e^{−x²/2} ⩽ ℙ(B(1) > x) ⩽ (1/√(2π)) ⋅ (1/x) ⋅ e^{−x²/2},   x > 0,   (12.1)

and the following maximal estimate which is a consequence of the reflection principle, Theorem 6.10:

ℙ(sup_{s⩽t} B(s) > x) ⩽ 2ℙ(B(t) > x),   x > 0, t > 0.   (12.2)

Because of the form of the normalizing function, the next result is often called the law of the iterated logarithm (LIL).
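Both estimates are easy to check by simulation; the following added sketch (not from the text; assumes NumPy) compares the left-hand side of (12.2) with 2ℙ(B(t) > x). By the reflection principle the two quantities actually agree for Brownian motion; the discretised maximum slightly underestimates the true supremum.

import numpy as np
from math import erfc, sqrt

rng = np.random.default_rng(5)
t, x = 1.0, 1.2
n_paths, n_steps = 10_000, 500

incr = rng.normal(0.0, np.sqrt(t / n_steps), (n_paths, n_steps))
paths = np.cumsum(incr, axis=1)

lhs = np.mean(paths.max(axis=1) > x)         # P(sup_{s<=t} B(s) > x), discretised
rhs = erfc(x / sqrt(2.0 * t))                # = 2 P(B(t) > x)
print("P(sup B > x) ~", round(lhs, 4), "    2 P(B(t) > x) =", round(rhs, 4))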


12.1 Theorem (LIL. Khintchine 1933). Let (B(t))_{t⩾0} be a BM1. Then [Ex. 12.3]

ℙ( lim sup_{t→∞} B(t)/√(2t log log t) = 1 ) = 1.   (12.3)

Since (−B(t))_{t⩾0} and (tB(t^{−1}))_{t>0} are again Brownian motions, we can immediately get versions of the Khintchine LIL for both the limit inferior and t → 0.

12.2 Corollary. Let (B_t)_{t⩾0} be a BM1. Then, almost surely,

lim sup_{t→∞} B(t)/√(2t log log t) = 1,    lim inf_{t→∞} B(t)/√(2t log log t) = −1,
lim sup_{t→0} B(t)/√(2t log |log t|) = 1,    lim inf_{t→0} B(t)/√(2t log |log t|) = −1.
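The LIL scale can be made visible on a single long simulated path (added sketch, not from the text; assumes NumPy): the running record of B(t)/√(2t log log t) settles near 1 — convergence is slow, so only roughly — whereas B(t)/√(2t log t) stays noticeably smaller, in line with the discussion at the beginning of this chapter.

import numpy as np

rng = np.random.default_rng(6)
n, T = 2_000_000, 1.0e7
dt = T / n
B = np.cumsum(rng.normal(0.0, np.sqrt(dt), n))
t = dt * np.arange(1, n + 1)

mask = t > 10.0                                        # keep log log t well defined
ratio_lil = B[mask] / np.sqrt(2 * t[mask] * np.log(np.log(t[mask])))
ratio_log = B[mask] / np.sqrt(2 * t[mask] * np.log(t[mask]))
print("max B(t)/sqrt(2 t log log t):", round(ratio_lil.max(), 3))   # close to 1
print("max B(t)/sqrt(2 t log t)    :", round(ratio_log.max(), 3))   # noticeably smaller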

Proof of Theorem 12.1. 1o (Upper bound). Fix ϵ > 0, q > 1 and set

A n := { sup B(s) ⩾ (1 + ϵ)√2q n log log q n } . 0⩽s⩽q n

From the maximal estimate (12.2), we see that

ℙ(A n ) ⩽ 2ℙ (B(q n ) ⩾ (1 + ϵ)√2q n log log q n ) = 2ℙ (

B(q n ) ⩾ (1 + ϵ)√2 log log q n ) . √q n

Since B(q n )/√q n ∼ B(1), we can use the upper bound from (12.1) and find ℙ(A n ) ⩽

2 n 2 1 e−(1+ϵ) log log q ⩽ c (n log q)−(1+ϵ) . n (1 + ϵ)√π √log log q

This shows that ∑∞ n=1 ℙ(A n ) < ∞. Therefore, we can apply the Borel–Cantelli lemma and deduce that lim

n→∞

sup0⩽s⩽q n B(s)

√2q n log log q n

⩽ (1 + ϵ) a.s.

Every t > 1 is in some interval of the form [q n−1 , q n ], and since Λ(t) = √2t log log t is an increasing function for t > 3, we find for all t ⩾ q n−1 > 3 B(t)

Therefore

√2t log log t lim

t→∞



sups⩽q n B(s)

√2q n log log q n

√2q n log log q n √2q n−1 log log q n−1 B(t)

√2t log log t

⩽ (1 + ϵ)√q

a.s.

.

Letting ϵ → 0 and q → 1 along countable sequences, we get the upper bound in (12.3).

2o If we use the estimate of Step 1o for the Brownian motion (−B(t))t⩾0 and with ϵ = 1, we see that −B(q n−1 ) ⩽ 2√2q n−1 log log q n−1 ⩽

for all sufficiently large n ⩾ n0 (ϵ, ω).

2 √2q n log log q n √q

a.s.

3o (Lower bound). So far we have not used the independence and stationarity of the increments of a Brownian motion. The independence of the increments shows that the sets C n := {B(q n ) − B(q n−1 ) ⩾ √2(q n − q n−1 ) log log q n } ,

are independent, and the stationarity of the increments implies ℙ(C n ) = ℙ (

n ⩾ 1,

B(q n ) − B(q n−1 ) B(q n − q n−1 ) ⩾ √2 log log q n ) = ℙ ( ⩾ √2 log log q n ) . n n−1 √q − q √ q n − q n−1

Since B(q n − q n−1 )/√ q n − q n−1 ∼ B(1) and 2x/(1 + x2 ) ⩾ x−1 for all x > 1, we get from the lower tail estimate of (12.1) that ℙ(C n ) ⩾

n c 1 1 e− log log q ⩾ . n √8π √2 log log q n√log n

Therefore, ∑∞ n=1 ℙ(C n ) = ∞, and since the sets C n are independent, the Borel–Cantelli lemma applies and shows that, for infinitely many n ⩾ 1, B(q n ) ⩾ √2(q n − q n−1 ) log log q n + B(q n−1 ) 2o

⩾ √2(q n − q n−1 ) log log q n −

2 √2q n log log q n . √q

Now we divide both sides by √2q n log log q n and get for all q > 1 lim

t→∞

B(t)

√2t log log t

⩾ lim

n→∞

B(q n )

√2q n

log log q n

⩾ √1 −

1 4 −√ q q

almost surely. Letting q → ∞ along a countable sequence concludes the proof of the lower bound.

12.3 Remark (Fast and slow points). Since W(h) := B(t + h) − B(t) is for every t > 0 again a Brownian motion, Corollary 12.2 shows that lim

h→0

|B(t + h) − B(t)| =1 √2h log | log h|

(12.4)

12.1 Khintchine’s law of the iterated logarithm |

201

almost surely for every fixed t ∈ [0, 1]. Using Fubini’s theorem, we can choose the same exceptional set for (Lebesgue) almost all t ∈ [0, 1]. It is instructive to compare (12.4) with Lévy’s modulus of continuity, Theorem 10.6: lim

h→0

sup0⩽t⩽1−h |B(t + h) − B(t)| √2h| log h|

= 1.

(12.5)

By (12.4), the local modulus of continuity of a Brownian motion is √2h log | log h|; on the other hand, (12.5) means that globally (and at some necessarily random points) the modulus of continuity will be larger than √2h| log h|. Therefore, a Brownian path must violate the LIL at some random points. The random set of all times where the LIL fails, E(ω) := {t ⩾ 0 : lim

h→0

|B(t + h, ω) − B(t, ω)| ≠ 1} , √2h log | log h|

is almost surely uncountable and dense in [0, ∞); this surprising result is due to S. J. Taylor [256], see also Orey and Taylor [196]. At the points τ ∈ E(ω), Brownian motion is faster than usual. Therefore, these points are also called fast points. On the other hand, there are also slow points. A result by Kahane [129; 130] says that almost surely there exists a t ∈ [0, 1] such that lim

h→0

|B(t + h) − B(t)| < ∞. √h

12.4 Remark. Blumenthal’s 0-1–law, Corollary 6.23, implies that for any positive continuous function κ(t) ℙ(B(t) ⩽ κ(t) for all sufficiently small t) ∈ {0, 1}.

If the probability is 1, we call κ an upper function. Equivalently, we can express this as ℙ (lim

t→0

B(t) ⩽ 1) ∈ {0, 1}. κ(t)

The following result due to Kolmogorov gives a sharp criterion whether κ is an upper function. 12.5 Theorem (Kolmogorov’s test). Let (B t )t⩾0 be a BM1 and κ : (0, 1] → (0, ∞) a continuous function such that κ(t) is increasing and κ(t)/√t is decreasing. Then the following alternative holds: 1

∫ 0

1

∫ 0

2 B(t) κ(t) exp (− κ 2t(t) ) dt < ∞ 󳨐⇒ ℙ (lim ⩽ 1) = 1, 3/2 t→0 κ(t) t 2 κ(t) B(t) exp (− κ 2t(t) ) dt = ∞ 󳨐⇒ ℙ (lim ⩾ 1) = 1. t→0 κ(t) t3/2

(12.6)

(12.7)

Ex. 12.4 Ex. 12.5

202 | 12 The growth of Brownian paths

Fig. 12.1: The time t = τ κ(t) is the first instance when the Brownian path [a, b] ∋ s 󳨃→ B(s) reaches the graph of [a, b] ∋ s 󳨃→ κ(s).

Kolmogorov’s original proof was never published, and the first proof of this result appeared in Petrovsky [202]. Motoo’s elegant approach from 1959 [187] can be found in Itô and McKean [121, 1.8, p. 33–36 and 4.12 pp. 161–164]. We use a combination of arguments from Itô and McKean, Fristedt [93, Theorem 11.2] and Khintchine [143, Fundamental Lemma, p. 490]. Proof of (12.6) (Itô–McKean). Obviously, we have {B(t) ⩽ κ(t) for all small t > 0} ⊂ ⋃ {B(t) ⩽ κ(t)

∀t ∈ (0, δ]} ,

δ>0

and we are going to prove that Kolmogorov’s condition (12.6) implies that lim ℙ (∃t ∈ (0, δ] : B(t) ⩾ κ(t)) = 0.

δ→0

Pick any n ⩾ 1 and 0 < a = t1 < t2 < ⋅ ⋅ ⋅ < t n = b ⩽ 1. The monotonicity of κ shows that the first passage times τ κ(t) = inf {s > 0 : B(s) ⩾ κ(t)} ,

t > 0,

are increasing as t increases. Note that t = τ κ(t) is the first instance when the Brownian path s 󳨃→ B(s) reaches the boundary defined by s 󳨃→ κ(s). This is shown in Figure 12.1. Assume that B(t, ω) = κ(t) for some t = t(ω) ∈ [a, b]; by construction there exists some k = k(ω) such that t ∈ (t k , t k+1 ] or t = a. In the latter case, we have τ κ(t1 ) = τ κ(a) ⩽ a. Otherwise, we get from the definition of the first passage time that τ κ(t) ⩽ t ⩽ t k+1 . By the monotonicity of κ we see τ κ(t k ) ⩽ τ κ(t) ⩽ t k+1 . On the other hand, t k ⩾ τ κ(t k ) is not

203

12.1 Khintchine’s law of the iterated logarithm |

possible, since then the path would have crossed the κ(⋅)-threshold before time t. Thus t k < τ κ(t k ) ⩽ t k+1 . Therefore, n−1

{∃t ∈ [a, b] : B(t) ⩾ κ(t)} ⊂ {τ κ(t1 ) ⩽ a} ∪ ⋃ {t j < τ κ(t j ) ⩽ t j+1 } j=1

and, since we know from Theorem 6.10 that |κ|(2πs3 )−1/2 e−κ

2

n−1

/2s

ds is the density of τ κ ,

ℙ(∃t ∈ [a, b] : B(t) ⩾ κ(t)) ⩽ ℙ (τ κ(t1 ) ⩽ a) + ∑ ℙ (t j < τ κ(t j ) ⩽ t j+1 ) j=1

t j+1

a

n−1 κ(t j ) 2 κ(a) κ2 (t ) =∫ exp (− κ 2s(a) ) ds + ∑ ∫ exp (− 2s j ) ds √2πs3 √2πs3 j=1 0

tj

a

t j+1

⩽∫ 0

n−1 2 κ(a) κ(s) κ2 (t ) exp (− κ 2s(a) ) ds + ∑ ∫ exp (− 2s j ) ds. 3 √2πs3 √ 2πs j=1 tj

If the mesh max1⩽j⩽n−1 (t j+1 − t j ) → 0 as n → ∞, we can use dominated convergence and the continuity of κ to get a

b

0

a

2 2 κ(a) κ(s) exp (− κ 2s(a) ) ds + ∫ exp (− κ 2s(s) ) ds. ℙ(∃t ∈ [a, b] : B(t) ⩾ κ(t)) ⩽ ∫ √2πs3 √2πs3

1

2

Since s−1 κ2 (s) is a decreasing function and ∫0 κ(s)s−3/2 exp (− κ 2s(s) ) ds < ∞, we see limt→0 t−1 κ2 (t) = ∞, hence a

2 κ(a) exp (− κ 2s(a) ) ds = ∫ √2πs3

0

and

a/κ2 (a)

∫ 0

1 1 exp (− 2u ) du 󳨀󳨀󳨀󳨀→ 0, a→0 √2πu3 b

ℙ(∃t ∈ (0, b] : B(t) ⩾ κ(t)) ⩽ ∫ 0

The claim follows as b → 0.

2 κ(s) exp (− κ 2s(s) ) ds. 3 √2πs

Before we proceed with the second part of the proof, we need two lemmas.

12.6 Lemma (Khintchine 1938). Let (B t )t⩾0 be a BM1 and κ : (0, 1] → (0, ∞) a continuous function such that κ(t) is increasing and κ(t)/√t is decreasing. Then the following implication holds: ∞

∀a ∈ (0, 1) : ∑ ℙ (B(a j ) > κ(a j )) = ∞ 󳨐⇒ ℙ (lim j=1

t→0

B(t) ⩾ 1) = 1. κ(t)

(12.8)

204 | 12 The growth of Brownian paths Proof (Fristedt 1974). Let a, ϵ ∈ (0, 1) and consider the sets 󵄨󵄨 a j+1 󵄨󵄨󵄨 aj a j+1 A j := {B ( 1−a ) − B ( 1−a ) > κ(a j ), 󵄨󵄨󵄨B ( 1−a )󵄨󵄨 ⩽ ϵκ(a j )} , j ∈ ℕ. 󵄨 󵄨 Since Brownian motion has independent and stationary increments, and scales according to B(t) ∼ √tB(1), we get κ(a ) ℙ(A j ) = ℙ (B(a j ) > κ(a j )) ℙ (|B(1)| ⩽ ϵ√ 1−a a √ j ) j

a

⩾ ℙ (B(a j ) > κ(a j )) ℙ (|B(1)| ⩽ ϵ√ 1−a a κ(1)) ;

in the last estimate we use that κ(t)/√t is decreasing. Using our assumption we conclude that ∑∞ j=1 ℙ(A j ) = ∞. Although the sets A j are not independent, the following observation allows us to employ the Rényi–Spitzer reinforced Borel–Cantelli lemma, Lemma 12.7. With a similar calculation as before, we have for all i ≠ j a a a a ℙ(A i ∩ A j ) ⩽ ℙ (B ( 1−a ) − B ( 1−a ) > κ(a i ) & B ( 1−a ) − B ( 1−a ) > κ(a j )) i



i+1

j

ℙ(A i )ℙ(A j )

2

[ℙ (|B(1)| ⩽ ϵ√ 1−a a κ(1))]

j+1

.

2

Thus, Lemma 12.7 shows that ℙ(limj→∞ A j ) ⩾ [ℙ (|B(1)| ⩽ ϵ√ 1−a a κ(1))] =: p(a) and, if a ↓ 0, the probability on the right-hand side tends to 1. Note, however, that the sets A j appearing on the left side also depend on a. Therefore, we can find a sequence j(n) = j(n, ω) for all ω from a set Ω a of probability at least p(a) such that B ( a1−a ) > κ (a j(n) ) + B ( a1−a ) ⩾ κ (a j(n) ) − ϵκ (a j(n) ) . j(n)

j(n)+1

Dividing by κ (a j(n) ) shows that the upper limit is bounded below by j(n)

B ( a1−a ) B (t) lim ⩾ 1 − ϵ. ⩾ lim t→0 κ (t) j→∞ κ (a j(n) )

Finally, letting ϵ ↓ 0 and a ↓ 0 finishes the proof.

The following version of the “difficult” direction of the Borel–Cantelli lemma does not require independence. 12.7 Lemma (reinforced Borel–Cantelli lemma. Rényi 1962). Let (A n )n⩾1 be a sequence of measurable sets in a probability space (Ω, A , ℙ) such that ∞

∑ ℙ(A n ) = ∞

n=1

and

lim

n→∞

∑ni=1 ∑nj=1 ℙ(A i ∩ A j ) [∑ni=1 ℙ(A i )]

Then one has ℙ (limn→∞ A n ) = ℙ (A n infinitely often) ⩾ 1/c.

2

⩽ c.

12.1 Khintchine’s law of the iterated logarithm |

205

Proof. (Variation of the argument in [235, Theorem 31.1].) Let ϵ ∈ (0, 1) be such that ϵ2 c(c − 1) ⩽ 1 (we use ϵ = 12 if c = 1), and define ∞

n

S n := ∑ 𝟙A i 󳨀󳨀󳨀󳨀→ ∑ 𝟙A i =: S, i=1 n

n→∞

i=1

n

m n := 𝔼 ( ∑ 𝟙A i ) = ∑ ℙ(A i ) 󳨀󳨀󳨀󳨀→ ∞. n→∞

i=1

i=1

2

The usual double-summation trick (∑i p i ) = ∑i ∑j p i p j yields 2

n

n

n

𝕍 S n = 𝔼 [( ∑ (𝟙A i − 𝔼(𝟙A i ))) ] = ∑ ∑ ℙ(A i ∩ A j ) − m2n . ] i=1 j=1 [ i=1

Obviously, we have S n ⩽ S, and so {S ⩽ (1 − ϵ)m n } ⊂ {S n ⩽ (1 − ϵ)m n }. This shows ℙ (S ⩽ (1 − ϵ)m n ) ⩽ ℙ (S n ⩽ (1 − ϵ)m n ) = ℙ (S n − m n ⩽ −ϵm n )

⩽ ℙ (|S n − m n | ⩾ ϵm n ) ⩽

ϵ2 𝕍 Sn . m2n

In the last two steps we use S n − m n ⩽ −ϵm n 󳨐⇒ |S n − m n | ⩾ ϵm n and the Chebyshev– Markov inequality. If we apply the limit inferior to this inequality, we get ℙ(S < ∞) = lim ℙ(S ⩽ ϵm n ) ⩽ ϵ2 lim n→∞

n→∞

∑ni=1 ∑nj=1 ℙ(A i ∩ A j ) m2n

− ϵ2 ⩽ ϵ2 (c − 1).

From the definition of ϵ we see that ℙ(S = ∞) ⩾ 1c . By construction, {S = ∞} and {A n infinitely often} coincide, and the proof is complete.

We can now finish the proof of Theorem 12.5

2

Proof of (12.7). 1o Let 13 < c < 1. The elementary inequality e ay ⩾ 1 + ay2 ⩾ a(1 + y2 ) 2 2 y 2c with a = 21 ( 1c − 1) gives ye−y /2c ⩽ 1−c e−y /2 , and we see y2 +1 c

∫ 0

1

κ(t) κ2 (t) dt κ(cs) κ2 (cs) ds 1 exp (− exp (− = ) ) ∫ 2t t 2cs s √t √c √s 0

1



κ(cs) √s κ2 (cs) + s 0

2√c ∫ 1−c

1

1

exp (−

κ2 (cs) ds ) 2s s

2√2cπ κ(cs) ds ⩽ ) ∫ ℙ (B(1) ⩾ 1−c s √s

(12.1)

0

1

=

2√2cπ ds ∫ ℙ (B(s) ⩾ κ(cs)) ; 1−c s 0

206 | 12 The growth of Brownian paths in the last step we use the scaling property B(s) ∼ √sB(1). Because of our assumption we have 1

∫ ℙ (B(s) ⩾ κ(cs)) 0

ds =∞ s

for all

1 3

< c < 1.

The functions κ c (t) := κ(ct) inherit the monotonicity and continuity properties of κ. Moreover, the fact that κ(t)/√t is decreasing, shows that B(t) κ(t) 1 B(t) B(t) = ⩽ , κ(ct) κ(t) κ(ct) √c κ(t)

i.e. it is enough to prove that limt→0 B(t)/κ(ct) ⩾ 1 a.s. for each done in the following step.

1 3

< c < 1. This will be

2o Let c ∈ ( 13 , 1), κ c (t) = κ(ct) and a ∈ (0, 1). For every t ∈ (0, 1), there is an exponent j = j(t) ∈ ℕ0 such that a j+1 < t ⩽ a j . Since κ c (t)/√t is decreasing, we know that κ c (t)/√t ⩾ κ c (a j )/√a j , and the scaling property of Brownian motion shows ℙ (B(t) ⩾ κ c (t)) = ℙ (B(1) ⩾

Therefore, we get

κ c (a j ) κ c (t) ) ⩽ ℙ (B(1) ⩾ ) = ℙ (B(a j ) ⩾ κ c (a j )) . √t √a j

1

aj

0

a j+1

dt dt ∞ = ∑ ∫ ℙ (B(t) ⩾ κ c (t)) ∫ ℙ (B(t) ⩾ κ c (t)) t t j=0 ∞

aj

⩽∑ ∫ j=0

a j+1

dt ℙ (B(a j ) ⩾ κ c (a j )) t ∞

= | log a| ∑ ℙ (B(a j ) ⩾ κ c (a j )) . j=0

1

From Step 1o we know that ∫0 ℙ (B(t) ⩾ κ c (t)) dtt = ∞ and limt→0 B(t)/κ(ct) ⩾ 1 a.s. follows from Lemma 12.5. Since 31 < c < 1 is arbitrary, the proof is complete.

12.2 Chung’s “other” law of the iterated logarithm

Let (X j )j⩾1 be iid random variables such that 𝔼 X1 = 0 and 𝔼 X12 < ∞. K. L. Chung [31] proved for the maximum of the absolute value of the partial sums the following law of the iterated logarithm lim

n→∞

max1⩽j⩽n |X1 + ⋅ ⋅ ⋅ + X j | √ n/ log log n

=

π . √8

12.2 Chung’s “other” law of the iterated logarithm |

207

A similar result holds for the running maximum of the modulus of a Brownian path. The proof resembles Khintchine’s LIL for Brownian motion, but we need a more subtle estimate for the distribution of the maximum. Using Lévy’s triple law (6.20) for a = −x, b = x and t = 1, we see ℙ (sup |B s | < x) = ℙ (inf B s > −x, sup B s < x, B1 ∈ (−x, x)) s⩽1

s⩽1

s⩽1

x

=



1 ∫ ∑ (−1)j exp (− 21 (y + 2jx)2 ) dy √2π j=−∞ −x

∞ 1 = ∑ (−1)j √2π j=−∞

=



(2j+1)x



(2j−1)x

exp (− 21 y2 ) dy

1 ∫ ∑ (−1)j 𝟙((2j−1)x, (2j+1)x) (y) exp (− 21 y2 ) dy. √2π j=−∞ ℝ

=f(y)

The 4x-periodic even function f is given by the following Fourier cosine series ¹

Therefore, we find

f(y) =

ℙ (sup |B s | < x) = s⩽1

4 ∞ (−1)k 2k + 1 cos ( πy) . ∑ π k=0 2k + 1 2x

2k + 1 1 4 ∞ 1 (−1)k πy) exp (− y2 ) dy ∑ ∫ cos ( π k=0 √2π 2k + 1 2x 2 ℝ

∞ (−1)k π2 (2k + 1)2 (2.5) 4 = exp (− ). ∑ π k=0 2k + 1 8x2

1 Although we have used Lévy’s formula (6.20) for the joint distribution of (m t , M t , B t ), it is not suitable to study the asymptotic behaviour of ℙ(m t < a, M t > b, B t ∈ [c, d]) as t → ∞. Our approach based on a Fourier expansion leads essentially to the following expression which is also due to Lévy [166, III.19–20, pp. 78–84]: ∞

ℙ (m t > a, M t < b, B t ∈ I) = ∑ e−λ j t ϕ j (0) ∫ ϕ j (y) dy j=1

I

where λ j and ϕ j , j ⩾ 1, are the eigenvalues and normalized eigenfunctions of the boundary value problem 1 󸀠󸀠 f (x) = λf(x) for a < x < b 2

and

It is not difficult to see that λ j = j2 π 2 /2(b − a)2 and jπ 2 cos ( b−a ϕ j (x) = √ b−a [x −

for odd and even indices j, respectively.

a+b 2 ])

and

f(x) = 0 for x = a, x = b.

jπ 2 ϕ j (x) = √ b−a sin ( b−a [x −

a+b 2 ]).

208 | 12 The growth of Brownian paths This proves the following lemma. 12.8 Lemma. Let (B(t))t⩾0 be a BM1 . Then

lim ℙ (sup |B(s)| < x) /

x→0

s⩽1

4 π2 exp (− 2 ) = 1. π 8x

12.9 Theorem (Chung 1948). Let (B(t))t⩾0 be a BM1 . Then ℙ ( lim

t→∞

sups⩽t |B(s)| √ t/ log log t

Proof. 1o (Upper bound) Set t n := n n and

=

C n := { sup |B(s) − B(t n−1 )| < t n−1 ⩽s⩽t n

By the stationarity of the increments we see

π ) = 1. √8

(12.9)

(12.10)

tn π √ }. √8 log log t n

supt n−1 ⩽s⩽t n |B(s) − B(t n−1 )| sups⩽t n −t n−1 |B(s)| sups⩽t n −t n−1 |B(s)| ; ∼ ⩽ √t n √t n √t n − t n−1

using Lemma 12.8 for x = π/√8 log log t n , we get for sufficiently large n ⩾ 1 ℙ(C n ) ⩾ ℙ (

sups⩽t n −t n−1 |B(s)| 1 1 4 π . < )⩾c π n log n √t n − t n−1 √8 √log log t n

Therefore, ∑n ℙ(C n ) = ∞; since the sets C n are independent, the Borel–Cantelli lemma applies and shows that supt n−1 ⩽s⩽t n |B(s) − B(t n−1 )|

lim

√ t n / log log t n

n→∞

Ex. 12.3



π . √8

The argument in Step 1o of the proof of Khintchine’s LIL, Theorem 12.1, yields lim

n→∞

sups⩽t n−1 |B(s)| √ t n / log log t n

sups⩽t n−1 |B(s)| √ t n−1 log log t n−1 = 0. n→∞ √ t n−1 log log t n−1 √ t n / log log t n

= lim

Finally, using sups⩽t n |B(s)| ⩽ 2 sups⩽t n−1 |B(s)| + supt n−1 ⩽s⩽t n |B(s) − B(t n−1 )|, we get lim

sups⩽t n |B(s)|

n→∞ √ t n / log log t n

⩽ lim

n→∞

2 sups⩽t n−1 |B(s)| √ t n / log log t n

+ lim

n→∞

supt n−1 ⩽s⩽t n |B(s) − B(t n−1 )| √ t n / log log t n

which proves that limt→∞ sups⩽t |B(s)|/√ t/ log log t ⩽ π/√8 a.s. 2o (Lower bound) Fix ϵ > 0 and q > 1 and set

A n := {sup |B(s)| < (1 − ϵ) s⩽q n

π qn √ }. √8 log log q n



π √8

209

12.2 Chung’s “other” law of the iterated logarithm |

Lemma 12.8 with x = (1 − ϵ)π/√8 log log q n gives for large n ⩾ 1 ℙ(A n ) = ℙ (

sups⩽q n |B(s)| √q n

< (1 − ϵ)

2

1/(1−ϵ) 1 π 4 1 . ) )⩽c ( π n log q √8 √log log q n

Thus, ∑n ℙ(A n ) < ∞, and the Borel–Cantelli lemma yields that ℙ (sup |B(s)| < (1 − ϵ) s⩽q n

i.e. we have almost surely that lim

n→∞

π qn √ for infinitely many n) = 0, √8 log log q n

sups⩽q n |B(s)|

√ q n / log log q n

For t ∈ [q n−1 , q n ] and large n ⩾ 1 we get sups⩽t |B(s)| √ t/ log log t



This proves

sups⩽q n−1 |B(s)|

√ q n / log log q n lim t→∞

=

⩾ (1 − ϵ)

π . √8

sups⩽q n−1 |B(s)|

√ q n−1 / log log q n−1

sups⩽t |B(s)| √ t/ log log t



√ q n−1 / log log q n−1 √ q n / log log q n

.

→1/√q as n → ∞

1−ϵ π √q √8

almost surely. Letting ϵ → 0 and q → 1 along countable sequences, we get the lower bound in (12.10).

If we replace in Theorem 12.9 log log t by log | log t|, and if we use in the proof n−n and q < 1 instead of n n and q > 1, respectively, we get the analogue of (12.10) for t → 0. Combining this with shifting B(t + h) − B(h), we obtain

12.10 Corollary. Let (B(t))t⩾0 be a BM1 . Then lim

lim t→0

sups⩽t |B(s)|

t→0 √ t/ log | log t|

sups⩽t |B(s + h) − B(h)| √ t/ log | log t|

=

=

π √8 π √8

a.s., a.s. for all h ⩾ 0.

Using similar methods Csörgo and Révész [44, pp. 44–47] show that lim inf sup

h→0 s⩽1−h t⩽h

|B(s + t) − B(s)| π = . √8 √ h/| log h|

This is the exact modulus of non-differentiability of a Brownian motion.

(12.11)

210 | 12 The growth of Brownian paths

12.11 Further reading. Path properties of the LIL-type are studied in [44], [88] and [215]. Apart from the original papers which are mentioned in Remark 12.3, exceptional sets are beautifully treated in [186]. [44] Csörgő, Révész: Strong Approximations in Probability and Statistics. [88] Freedman: Brownian Motion and Diffusion. [186] Mörters, Peres: Brownian Motion. [215] Révész: Random Walk in Random and Non-Random Environments.

Problems 1.

Let (B t )t⩾0 be a BM1 . Use the Borel-Cantelli lemma to show that the running maximum M n := sup0⩽t⩽n B t cannot grow faster than C√ n log n for any C > 2. Use this to show that lim

t→∞

2.

Mt

C√ t log t

⩽1

a.s.

Hint. Show that M n (ω) ⩽ C√ n log n for sufficiently large n ⩾ n0 (ω).

Let (B t )t⩾0 be a BM1 . Apply Doob’s maximal inequality (A.13) to the exponential ξ martingale M t := exp (ξB t − 12 ξ 2 t) to show that for all x, ξ > 0 ℙ (sup (B s − 12 ξs) > x) ⩽ e−xξ . s⩽t

(This inequality can be used for the upper bound in the proof of Theorem 12.1, avoiding the combination of (12.2) and the upper bound in (12.1).) 3.

Show that the proof of Khinchine’s LIL, Theorem 12.1, can be modified to give lim

t→∞

sups⩽t |B(s)|

√2t log log t

⩽ 1.

Hint. Use in Step 1o of the proof ℙ (sups⩽t |B(s)| ⩾ x) ⩽ 4ℙ(|B(t)| ⩾ x).

4. Let (B t )t⩾0 be a BM1 . Use Theorem 12.5 to show that κ(t) = (1 + ϵ)√2t log | log t| is an upper function for t → 0. 5.

Let (B t )t⩾0 be a BM1 . Deduce from Theorem 12.5 the following test for upper functions in large time. Assume that κ ∈ C[1, ∞) is a positive function such that κ(t)/t is decreasing and κ(t)/√t is increasing. Then 0

ℙ (B(t) < κ(t) as t → ∞) = {

1



according to ∫ 1

=∞ 2 κ(s) exp (− κ 2s(s) ) ds { . s3/2 0 and τ := inf {t ⩾ 0 : |B t | = b√a + t}. Show that a) ℙ(τ < ∞) = 1;

b) 𝔼 τ = ∞ 7.

| 211

c) 𝔼 τ < ∞

(b ⩾ 1); (b < 1).

Hint. Use in b) Wald’s identities. For c) show that 𝔼(τ ∧ n) ⩽ ab2 (1 − b2 )−1 for n ⩾ 1.

Let X = (X t )t⩾0 be a one-dimensional process satisfying (B0), (B1), (B2). Assume, in addition, that X is continuous in probability, i.e. limt→0 ℙ(|X t − X0 | > ϵ) = 0 for all ϵ > 0. A function g : [0, ∞) → [0, ∞) is said to be of regular growth, if there exist functions κ j (λ), j = 1, 2, such that limλ→1 κ2 (λ) = 1, limλ→∞ κ1 (λ) = ∞ and κ1 (λ)g(t) ⩽ g(λt) ⩽ κ2 (λ)g(t) for all λ ⩾ 0, t > 0.

Show that for every symmetric (i.e. X t ∼ −X t ) process X and increasing g of regular growth ∞

∫ 1

1 ℙ (X t > 13 g(t)) dt < ∞ 󳨐⇒ g is an upper function as t → ∞. t

Remark. X is a so-called Lévy process. Its paths t 󳨃→ X t are not continuous, but they may have jump discontinuities which happen, however, never at fixed times. This is due to the continuity “in probability”. See [231] and [235, Chapter X] for more on this topic. Hint. Use the following inequality due to Etemadi (1985): ℙ(sups⩽t |X s | ⩾ a) ⩽ 3ℙ(|X t | ⩾ a/3)

which is true for any (not necessarily symmetric) Lévy process, cf. [235, Theorem 33.1, p. 209]. Symmetry is used to get an estimate for X rather than |X|.

13 Strassen's functional law of the iterated logarithm

From Khintchine's law of the iterated logarithm (LIL), Corollary 12.2, we know that for a one-dimensional Brownian motion

−1 = lim inf_{s→∞} B(s)/√(2s log log s) < lim sup_{s→∞} B(s)/√(2s log log s) = 1.

Since Brownian paths are a.s. continuous, it is clear that the set of limit points of {B(s)/√2s log log s : s > e} as s → ∞ is the whole interval [−1, 1]. This observation is the starting point for an extension of Khintchine’s LIL, due to Strassen [245], which characterizes the set of limit points as s → ∞ of the family {Z s (⋅) : s > e} ⊂ C(o) [0, 1]

where   Z_s(t) := B(st)/√(2s log log s),   0 ⩽ t ⩽ 1.   [Ex. 13.1]

(C(o) [0, 1] denotes the set of all continuous functions w : [0, 1] → ℝ with w(0) = 0.) If w0 ∈ C(o) [0, 1] is a limit point it is necessary that for every fixed t ∈ [0, 1] the number w0 (t) is a limit point of the family {Z s (t) : s > e} ⊂ ℝ. Using Khintchine’s LIL and the scaling property 2.16 of Brownian motion it is straightforward that the limit points of {Z s (t) : s > e} are the whole interval [−√t, √t], i.e. −√t ⩽ w0 (t) ⩽ √t. In fact, we have 13.1 Theorem (Strassen 1964). Let (B t )t⩾0 be a one-dimensional Brownian motion and set Z s (t, ω) := B(st, ω)/√2s log log s. For almost all ω ∈ Ω, {Z s (⋅, ω) : s > e}

is a relatively compact subset of (C_(o)[0, 1], ‖⋅‖_∞), and the set of all limit points (as s → ∞) is given by

K = {w ∈ C_(o)[0, 1] : w is absolutely continuous and ∫_0^1 |w′(t)|² dt ⩽ 1}.

Ex. 13.2

Using the Cauchy-Schwarz inequality it is not hard to see that every w ∈ K satisfies |w(t)| ⩽ √t for all 0 ⩽ t ⩽ 1. Theorem 13.1 contains two statements: with probability one, a) the family {[0, 1] ∋ t 󳨃→ B(st, ω)/√2s log log s : s > e} ⊂ C(o) [0, 1] contains a sequence which converges uniformly for all t ∈ [0, 1] to a function in the set K; b) every function from the (non-random) set K is the uniform limit of a suitable sequence (B(s n t, ω)/√2s n log log s n )n⩾1 .
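For completeness, the short calculation behind this remark (added here; cf. the margin reference Ex. 13.2): for w ∈ K and 0 ⩽ t ⩽ 1,

|w(t)| = |∫_0^t w′(s) ds| ⩽ (∫_0^t 1 ds)^{1/2} (∫_0^t |w′(s)|² ds)^{1/2} ⩽ √t.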

For a limit point w0 it is necessary that, as s → ∞, the probability ℙ(‖Z_s(⋅) − w0(⋅)‖_∞ < δ) is sufficiently large for any δ > 0. From the scaling property of a Brownian motion we see that

Z_s(⋅) − w0(⋅) ∼ B(⋅)/√(2 log log s) − w0(⋅).

This means that we have to calculate probabilities involving a Brownian motion with a nonlinear drift, B(t) − √2 log log s w0 (t) and estimate it at an exponential scale, as we will see later. Both topics are interesting in their own right, and we will treat them separately in the following two sections. After that we return to our original problem, the proof of Strassen’s Theorem 13.1.

13.1 The Cameron–Martin formula

In Chapter 4 we have introduced the Wiener space (C_(o), B(C_(o)), μ) as canonical probability space of a Brownian motion indexed by [0, ∞). In the same way we can study a Brownian motion with index set [0, 1] and the Wiener space (C_(o)[0, 1], B(C_(o)[0, 1]), μ) where C_(o) = C_(o)[0, 1] denotes the space of continuous functions w : [0, 1] → ℝ such that w(0) = 0. The Cameron–Martin theorem is about the absolute continuity of the Wiener measure under shifts in the space C_(o)[0, 1]. For w0 ∈ C_(o)[0, 1] we set Uw(t) := w(t) + w0(t),

t ∈ [0, 1], w ∈ C(o) [0, 1].

Clearly, U maps C(o) [0, 1] into itself and induces an image measure of μ: μ U (A) = μ(U −1 A) for all A ∈ B(C(o) [0, 1]).

This is equivalent to saying

∫ F(w + w0 ) μ(dw) = ∫ F(Uw) μ(dw) = ∫ F(w) μ U (dw)

C(o)

C(o)

(13.1)

C(o)

for all bounded continuous functionals F : C(o) [0, 1] → ℝ. We will see that μ U is absolutely continuous with respect to μ if the shifts w0 are absolutely continuous (a.c.) functions such that the (Lebesgue a.e. existing) derivative w󸀠0 is square integrable. This is the Cameron–Martin space which we denote by 1

H1 := {u ∈ C(o) [0, 1] : u is a.c. and ∫ |u󸀠 (t)|2 dt < ∞}.

(13.2)

0

13.2 Lemma. For u : [0, 1] → ℝ the following assertions are equivalent: a) u ∈ H1 ; b) u ∈ C(o) [0, 1] and M := supΠ ∑t j−1 ,t j ∈Π |u(t j ) − u(t j−1 )|2 /(t j − t j−1 ) < ∞ where the supremum is taken over all finite partitions Π = {0 = t0 < t1 < ⋅ ⋅ ⋅ < t n = 1} of [0, 1]. 1 If a) or b) is true, then ∫0 |u󸀠 (t)|2 dt = lim|Π|→0 ∑t j−1 ,t j ∈Π |u(t j ) − u(t j−1 )|2 /(t j − t j−1 ).

214 | 13 Strassen’s functional law of the iterated logarithm Proof. a)⇒ b): assume that u ∈ H1 and let 0 = t0 < t1 < ⋅ ⋅ ⋅ < t n = 1 be any partition of [0, 1]. Then 1

u(t j ) − u(t j−1 ) 2 ] (t j − t j−1 ) = ∫ f n (t) dt ∑[ t j − t j−1 j=1 n

0

with

n

f n (t) = ∑ [ j=1

By the Cauchy–Schwarz inequality

u(t j ) − u(t j−1 ) 2 ] 𝟙[t j−1 ,t j ) (t). t j − t j−1

2

tj

tj

n 1 [ 1 ] f n (t) = ∑ [ ∫ u󸀠 (s) ds] 𝟙[t j−1 ,t j ) (t) ⩽ ∑ ∫ |u󸀠 (s)|2 ds 𝟙[t j−1 ,t j ) (t). t − t t − t j j−1 j j−1 j=1 j=1 t j−1 t j−1 [ ] n

If we integrate this inequality with respect to t, we get 1

n

0

j=1 t

1

tj

∫ f n (t) dt ⩽ ∑ ∫ |u󸀠 (s)|2 ds = ∫ |u󸀠 (s)|2 ds,

1

0

j−1

hence M ⩽ ∫0 |u󸀠 (s)|2 ds < ∞, and the estimate in b) follows. b)⇒ a): conversely, assume that b) holds and denote by M the value of the supremum in b). Let 0 ⩽ r1 < s1 ⩽ r2 < s2 ⩽ . . . ⩽ r m < s m ⩽ 1 be the endpoints of the intervals (r k , s k ), k = 1, . . . , m. By the Cauchy-Schwarz inequality 1

1

1

2 2 m |u(s k ) − u(r k )|2 2 m ] [ ∑ (s k − r k )] ⩽ √M [ ∑ (s k − r k )] . ∑ |u(s k ) − u(r k )| ⩽ [ ∑ sk − rk k=1 k=1 k=1 k=1

m

Ex. 13.3

m

This means that u is absolutely continuous and u󸀠 exists a. e. Let f n be as in the first part of the proof and assume that the underlying partitions {0 = t0 < t1 < ⋅ ⋅ ⋅ < t n = 1} have decreasing mesh sizes. Then the sequence (f n )n⩾1 converges to (u󸀠 )2 at all points where u󸀠 exists. By Fatou’s lemma 1

1

2

|u(t j ) − u(t j−1 )|2 ⩽ M. t j − t j−1 n→∞ j=1 n

∫ |u (t)| dt ⩽ lim ∫ f n (t) dt = lim ∑ 0

Finally,

󸀠

n→∞

0

1

1

0

0

1

lim ∫ f n (t) dt ⩽ M ⩽ ∫ |u󸀠 (t)|2 dt ⩽ lim ∫ f n (t) dt.

n→∞

n→∞

0

In a first step we prove the absolute continuity of μ U for shifts from a subset H∘1 of H1 , H∘1 := {u ∈ C(o) [0, 1] : u ∈ C1 [0, 1] and u󸀠 has bounded variation } .

(13.3)

13.1 The Cameron–Martin formula | 215

13.3 Lemma. Let w0 : [0, 1] → ℝ be a function in H∘1 , n ⩾ 1 and t j = j/n. Then the linear functionals n

G n (w) := ∑ (w(t j ) − w(t j−1 )) j=1

w0 (t j ) − w0 (t j−1 ) , t j − t j−1

n ⩾ 1, w ∈ C(o) [0, 1],

are uniformly bounded, i.e. |G n (w)| ⩽ c‖w‖∞ for all n ⩾ 1 and w ∈ C(o) [0, 1]; moreover, the following Riemann–Stieltjes integral exists: 1

G(w) := lim G n (w) = ∫ w󸀠0 (t) dw(t). n→∞

0

Proof. The mean-value theorem shows that n

G n (w) = ∑ (w(t j ) − w(t j−1 )) j=1

n w0 (t j ) − w0 (t j−1 ) = ∑ (w(t j ) − w(t j−1 )) w󸀠0 (θ j ) t j − t j−1 j=1

with suitable intermediate points θ j ∈ [t j−1 , t j ]. It is not difficult to see that n

G n (w) = w󸀠0 (θ n+1 )w(t n ) − ∑ w(t j )(w󸀠0 (θ j+1 ) − w󸀠0 (θ j )) j=1

From this we conclude that

(θ n+1 := 1).

|G n (w)| ⩽ (‖w󸀠0 ‖∞ + VAR1 (w󸀠0 ; [0, 1]))‖w‖∞

as well as

1

lim G n (w) =

n→∞

w󸀠0 (1)w(1)



1

∫ w(t) dw󸀠0 (t) 0

= ∫ w󸀠0 (t) dw(t). 0

We can now state and prove the Cameron–Martin formula for shifts in H∘1 .

13.4 Theorem (Cameron, Martin 1944). Let (C(o) [0, 1], B(C(o) [0, 1]), μ) be the Wiener space and Uw := w + w0 the shift by w0 ∈ H∘1 . The image measure μ U (A) := μ(U −1 A), A ∈ B(C(o) [0, 1]), is absolutely continuous with respect to μ, and for every bounded continuous functional F : C(o) [0, 1] → ℝ we have ∫

C(o) [0,1]

F(w) μ U (dw) =

=

with the Radon-Nikodým density



C(o) [0,1]



F(Uw) μ(dw)

F(w)

C(o) [0,1]

dμ U (w) μ(dw) dμ

1

1

0

0

dμ U (w) 1 = exp (− ∫ |w󸀠0 (t)|2 dt + ∫ w󸀠0 (t) dw(t)) . dμ 2

(13.4)

(13.5)

216 | 13 Strassen’s functional law of the iterated logarithm Proof. In order to deal with the infinite dimensional integrals appearing in (13.4) we use finite dimensional approximations. Set t j = j/n, j = 0, 1, . . . , n, n ⩾ 1, and denote by {w(t j ), if t = t j , j = 0, 1, . . . , n Π n w(t) := { linear, for all other t {

the piecewise linear approximation of w on the grid t0 , t1 , . . . , t n . Obviously, we have Π n w ∈ C(o) [0, 1] and limn→∞ Π n w(t) = w(t) uniformly for all t. Let F : C(o) [0, 1] → ℝ be a bounded continuous functional. Since Π n w is uniquely determined by x = (w(t1 ), . . . , w(t n )) ∈ ℝn , there exists a bounded continuous function F n : ℝn → ℝ such that F n (x) = F(Π n w)

where

x = (w(t1 ), . . . , w(t n )) .

Write x0 = (w0 (t1 ), . . . , w0 (t n )) and x0 = x00 := 0. Then ∫

F(Π n Uw) μ(dw)

C(o) [0,1]

=

(2.10a)

=

=

=

𝔼 [F n (B(t1 ) + w0 (t1 ), . . . , B(t n ) + w0 (t n ))]

n 1 ∫ F n (x + x0 ) exp [− 21 n ∑ (x j − x j−1 )2 ] dx n/2 (2π/n) j=1 n ℝ

n 1 1 n ((y j − y j−1 ) − (x0j − x0j−1 ))2 ] dy F (y) exp [− ∑ ∫ n 2 (2π/n)n/2 n j=1

e

− 12 n

n



∑ (x0j −x0j−1 )2 j=1

(2π/n)n/2

If we set

n

n

n ∑ (y j − y j−1 )(x0j − x0j−1 ) − 21 n ∑ (y j − y j−1 )2 j=1 e dy. ∫ F n (y) e j=1

ℝn

n

G n (w) := ∑ (w(t j ) − w(t j−1 )) j=1 n

α n (w0 ) := ∑

j=1

we find ∫

C(o) [0,1]

w0 (t j ) − w0 (t j−1 ) 1 n

(w0 (t j ) − w0 (t j−1 ))2 1 n

1

F(Π n Uw) μ(dw) = e− 2 α n (w0 )



C(o) [0,1]

,

,

F(Π n w)e G n (w) μ(dw).

(13.6)

Since F is bounded and continuous and limn→∞ Π n Uw = Uw, the left-hand side converges to ∫C [0,1] F(Uw) μ(dw). By construction, t j − t j−1 = 1n ; therefore, Lemmas 13.2 (o)

13.1 The Cameron–Martin formula | 217

and 13.3 show that 1

lim α n (w0 ) =

n→∞

∫ |w󸀠0 (t)|2

1

lim G n (w) = ∫ w󸀠0 (t) dw(t).

and

dt

n→∞

0

Moreover, by Lemma 13.3,

0

e G n (w) ⩽ e c‖w‖∞ ⩽ e c supt⩽1 w(t) + e−c inf t⩽1 w(t) .

By the reflection principle, Theorem 6.10, supt⩽1 B(t) ∼ − inf t⩽1 B(t) ∼ |B(1)|. Thus ∫

C(o) [0,1]

e G n (w) μ(dw) ⩽ 𝔼 [e c supt⩽1 B(t) ] + 𝔼 [e−c inf t⩽1 B(t) ] = 2 𝔼 [e c|B(1)| ] .

Therefore, we can use the dominated convergence theorem to deduce that the righthand side of (13.6) converges to 1

exp [− 12 ∫0 |w󸀠0 (t)|2 dt]



C(o) [0,1]

1

F(w) exp [∫0 w󸀠0 (t) dw(t)] μ(dw).

We want to show that Theorem 13.4 remains valid for w0 ∈ H1 . This is indeed possible, if we use a stochastic approximation technique. The drawback, however, will be that we do not have any more an explicit formula for the limit limn→∞ G n (w) as a Riemann– Stieltjes integral. The trouble is that the integral 1

1

∫ w0 (t) dw(t) = w(1)w0 (1) − ∫ w(t) dw0 (t) 0

0

is for all integrands w ∈ C(o) [0, 1] defined only in the Riemann–Stieltjes sense, if w0 is of bounded variation, cf. Corollary A.36.

13.5 Paley–Wiener–Zygmund integrals (Paley, Wiener, Zygmund 1933). Denote by BV[0, 1] the set of all functions of bounded variation on [0, 1]. For ϕ ∈ BV[0, 1] the Riemann–Stieltjes integral 1

1

w ∈ C(o) [0, 1],

G ϕ (w) = ∫ ϕ(s) dw(s) = ϕ(1)w(1) − ∫ w(s) dϕ(s), 0

0

is well-defined. If we write as the limit of a Riemann–Stieltjes sum, it is not hard to see that, on the Wiener space (C(o) [0, 1], B(C(o) [0, 1]), μ), Gϕ

1

G ϕ is a normal random variable, mean 0, variance ∫ ϕ2 (s) ds;

(13.7a)

G ϕ (w)G ψ (w) μ(dw) = ∫ ϕ(s)ψ(s) ds if ϕ, ψ ∈ BV[0, 1].

(13.7b)

0

1



C(o) [0,1]

0

Ex. 13.4

218 | 13 Strassen’s functional law of the iterated logarithm Therefore, the map L2 [0, 1] ⊃ BV[0, 1] ∋ ϕ 󳨃→ G ϕ ∈ L2 (C(o) [0, 1], B(C(o) [0, 1]), μ)

is a linear isometry. As BV[0, 1] is a dense subset of L2 [0, 1], we find for every function ϕ ∈ L2 [0, 1] a sequence (ϕ n )n⩾1 ⊂ BV[0, 1] such that L2 -limn→∞ ϕ n = ϕ. In particular, (ϕ n )n⩾1 is a Cauchy sequence in L2 [0, 1] and, therefore, (G ϕ n )n⩾1 is a Cauchy sequence in L2 (C(o) [0, 1], B(C(o) [0, 1]), μ). Thus, 1

G ϕ (w) := ∫ ϕ(s) dw(s) := L2 - lim G ϕ n (w), 0

Ex. 13.5 Ex. 13.6 Ex. 15.17

w ∈ C(o) [0, 1]

n→∞

defines a linear functional on C(o) [0, 1]. Note that the integral notation is symbolical and has no meaning as a Riemann–Stieltjes integral unless ϕ ∈ BV[0, 1]. Obviously the properties (13.7) are preserved under L2 limits. The functional G ϕ is often called the Paley–Wiener–Zygmund integral. It is a special case of the Itô integral which we will encounter in Chapter 15. 13.6 Corollary (Cameron, Martin 1944). Let (C(o) [0, 1], B(C(o) [0, 1]), μ) be the Wiener space and Uw := w+w0 the shift by w0 ∈ H1 . Then μ U (A) := μ(U −1 A), A ∈ B(C(o) [0, 1]) is absolutely continuous with respect to μ, and for every bounded Borel measurable functional F : C(o) [0, 1] → ℝ we have ∫

C(o) [0,1]

F(w) μ U (dw) =

=

with the Radon-Nikodým density



F(Uw) μ(dw)

C(o) [0,1]



F(w)

C(o) [0,1]

dμ U (w) μ(dw) dμ

1

1

0

0

dμ U (w) 1 = exp (− ∫ |w󸀠0 (t)|2 dt + ∫ w󸀠0 (t) dw(t)) , dμ 2

(13.8)

(13.9)

where the second integral is a Paley–Wiener–Zygmund integral.

Proof. Let w0 ∈ H1 and (ϕ n )n⩾0 be a sequence in H∘1 with L2 -limn→∞ ϕ󸀠n = w󸀠0 . If F is a continuous functional, Theorem 13.4 shows for all n ⩾ 1 1

∫ F(w + ϕ n ) μ(dw) =

C(o) [0,1]

1

1 ∫ F(w) exp [− ∫ |ϕ󸀠n (t)|2 dt + ∫ ϕ󸀠n (t) dw(t)] μ(dw). 2 0 0 ] [ C(o) [0,1]

It is, therefore, enough to show that the right-hand side converges as n → ∞. Set 1

G(w) := ∫ w󸀠0 (s) dw(s) and 0

1

G n (w) := ∫ ϕ󸀠n (s) dw(s). 0

13.1 The Cameron–Martin formula | 219

Since G n and G are normal random variables on the Wiener space, we find for all c > 0 ∫

C(o) [0,1]

󵄨󵄨 G n (w) 󵄨 󵄨󵄨e − e G(w) 󵄨󵄨󵄨󵄨 μ(dw) 󵄨 =



C(o) [0,1]

󵄨󵄨 G n (w)−G(w) 󵄨 󵄨󵄨e − 1󵄨󵄨󵄨󵄨 e G(w) μ(dw) 󵄨

󵄨 󵄨 = ( ∫ + ∫ ) 󵄨󵄨󵄨󵄨e G n (w)−G(w) − 1󵄨󵄨󵄨󵄨 e G(w) μ(dw) {|G|>c}

⩽(

{|G|⩽c}



C(o) [0,1]

1

+ ec

1



{|G|>c}

󵄨󵄨 G n (w)−G(w) 󵄨 󵄨󵄨e − 1󵄨󵄨󵄨󵄨 μ(dw) 󵄨

C(o) [0,1]

⩽ C( ∫ e

1

2 2 󵄨󵄨 G n (w)−G(w) 󵄨󵄨2 󵄨󵄨e 󵄨󵄨 μ(dw)) ( ∫ e2G(w) μ(dw)) − 1 󵄨 󵄨

2y

|y|>c

e

−y2 /(2a)

1 2

dy) +

1

2 ec 󵄨 󵄨 ∫ 󵄨󵄨e σ n y − 1󵄨󵄨󵄨 e−y /2 dy √2π 󵄨



where σ2n = ∫0 |ϕ󸀠n (s) − w󸀠0 (s)|2 ds and a = ∫0 |w󸀠0 (s)|2 ds. Thus, lim

n→∞



C(o) [0,1]

󵄨󵄨 G n (w) 󵄨 󵄨󵄨e − e G(w) 󵄨󵄨󵄨󵄨 μ(dw) = 0. 󵄨

Using standard approximation arguments from measure theory, the formula (13.8) is easily extended from continuous F to all bounded measurable F. 13.7 Remark. It is useful to rewrite Corollary 13.6 in the language of stochastic processes. Let (B t )t⩾0 be a BM1 and define a new stochastic process by X t (ω) := U −1 B t (ω) = B t (ω) − w0 (t),

w 0 ∈ H1 .

Then X t is a Brownian motion under the new probability measure 1

1

0

0

1 ℚ(dω) = exp (− ∫ |w󸀠0 (s)|2 ds + ∫ w󸀠0 (s) dB s (ω)) ℙ(dω). 2

This is a deterministic version of the Girsanov transformation, cf. Section 19.3. In fact, the converse of Corollary 13.6 is also true. 13.8 Theorem. Let (C(o) [0, 1], B(C(o) [0, 1]), μ) be the Wiener space and Uw := w + h the shift by h ∈ C(o) [0, 1]. If the image measure μ U (A) := μ(U −1 A), A ∈ B(C(o) [0, 1]), is absolutely continuous with respect to μ, then h ∈ H1 .

220 | 13 Strassen’s functional law of the iterated logarithm Proof. Assume that h ∉ H1 . Then there exists a sequence of functions (ϕ n )n⩾1 ⊂ H∘1 such that 1

Ex. 13.5

󵄩󵄩 󵄩󵄩2 󵄩󵄩ϕ n 󵄩󵄩H1 = ∫ |ϕ󸀠n (s)|2 ds = 1 and 0

1

1

⟨ϕ n , h⟩H1 = ∫ ϕ󸀠n (s) dh(s) ⩾ 2n 0

󸀠

where ⟨ϕ n , h⟩H1 = ∫0 ϕ󸀠n (s) dh(s) is a Paley–Wiener–Zygmund integral G ϕ n (h). This follows easily from the well-known formula for the norm in a Hilbert space ‖h‖H1 = supϕ∈H∘1 , ‖ϕ‖H1 =1 ⟨ϕ, h⟩H1 since H∘1 is a dense subset of H1 . We will construct a set A ∈ B(C(o) [0, 1]) such that μ(A) = 1 and μ U (A) = 0, i.e. μ and μ U are singular. Set 1

{ } A n := {w ∈ C(o) [0, 1] : ∫ ϕ󸀠n (s) dw s ⩽ n} 0 { }

and

∞ ∞

A := ⋂ ⋃ A k . n=1 k=n

1

󸀠

By 13.5, the Paley–Wiener–Zygmund integral G ϕ n (w) = ∫0 ϕ󸀠n (s) dw s is a normal random variable with mean 0 and variance 1 on the Wiener space. Thus, μ(A n ) → 1 as n → ∞. Therefore, μ(A) = lim μ (⋃k⩾n A k ) ⩾ lim μ(A n ) = 1. n→∞

On the other hand,

n→∞

1

μ U (A n ) = μ{w : ∫ ϕ󸀠n (s) d(w + h)s ⩽ n} 0

1

1

= μ{w : ∫ ϕ󸀠n (s) dw s ⩽ n − ∫ ϕ󸀠n (s) dh s } 0

0

⩽n−2n=−n

1

⩽ μ{w : − ∫ ϕ󸀠n (s) dw s ⩾ n} ⩽ 0

1 . n2

1

󸀠

In the last step we use again that G ϕ n (w) = ∫0 ϕ󸀠n (s) dw s has a normal distribution under μ with mean zero and variance one, as well as the Markov inequality. Thus, μ U (A) = lim μ U (⋃k⩾n A k ) ⩽ lim ∑ μ U (A k ) ⩽ lim ∑ n→∞

n→∞

k⩾n

n→∞

k⩾n

1 = 0. k2

13.2 Large deviations (Schilder’s theorem) The term large deviations generally refers to small probabilities on an exponential scale. We are interested in lower and upper (exponential) estimates of the probability that

13.2 Large deviations (Schilder’s theorem) | 221

the whole Brownian path is in the neighbourhood of a deterministic function u ≠ 0. Estimates of this type have been studied by Schilder and were further developed by Borovkov, Varadhan, Freidlin and Wentzell; our presentation follows the monograph [90] by Freidlin and Wentzell. 13.9 Definition. Let u ∈ C(o) [0, 1]. The functional I : C(o) [0, 1] → [0, ∞], 1

{ 1 ∫ |u󸀠 (s)|2 ds, if u is absolutely continuous I(u) := { 2 0 +∞, otherwise { is the action functional of the Brownian motion (B t )t∈[0,1] .

The action functional I will be the exponent in the large deviation estimates for B t . For this we need a few properties. 13.10 Lemma. The action functional I : C(o) [0, 1] → [0, ∞] is lower semicontinuous, i.e. for every sequence (w n )n⩾1 ⊂ C(o) [0, 1] which converges uniformly to w ∈ C(o) [0, 1] we have limn→∞ I(w n ) ⩾ I(w).

Proof. Let (w n )n⩾1 and w be as in the statement of the lemma. We may assume that limn→∞ I(w n ) < ∞. For any finite partition Π = {0 = t0 < t1 < ⋅ ⋅ ⋅ < t m = 1} of [0, 1] we have sup



Π t j ,t j−1 ∈Π

|w(t j ) − w(t j−1 )|2 |w n (t j ) − w n (t j−1 )|2 = sup lim ∑ t j − t j−1 t j − t j−1 Π n→∞ t ,t ∈Π j

⩽ lim sup

j−1



n→∞ Π t ,t ∈Π j j−1

Using Lemma 13.2, we see that I(w) ⩽ limn→∞ I(w n ).

|w n (t j ) − w n (t j−1 )|2 . t j − t j−1

As usual, we write

d(w, A) := inf {‖w − u‖∞ : u ∈ A},

w ∈ C(o) [0, 1]

and

for the distance of the point w to the set A. Denote by

Φ(r) := {w ∈ C(o) [0, 1] : I(w) ⩽ r},

the sub-level sets of the action functional I.

r ⩾ 0,

A ⊂ C(o) [0, 1]

13.11 Lemma. The sub-level sets Φ(r), r ⩾ 0, of the action functional I are compact subsets of C(o) [0, 1].

Proof. Since I is lower semicontinuous, the sub-level sets are closed. For any r > 0, 0 ⩽ s < t ⩽ 1 and w ∈ Φ(r) we find using the Cauchy-Schwarz inequality 1 t 󵄨󵄨 t 󵄨󵄨 2 󵄨 󵄨 |w(t) − w(s)| = 󵄨󵄨󵄨󵄨 ∫ w󸀠 (x) dx 󵄨󵄨󵄨󵄨 ⩽ (∫ |w󸀠 (x)|2 dx) √t − s 󵄨󵄨 󵄨󵄨 s s

⩽ √2I(w) √t − s ⩽ √2r √t − s.

222 | 13 Strassen’s functional law of the iterated logarithm This shows that the family Φ(r) is equibounded and equicontinuous. By Ascoli’s theorem, Φ(r) is compact. We are now ready for the upper large deviation estimate. 13.12 Theorem (Schilder 1966). Let (B t )t∈[0,1] be a BM1 and denote by Φ(r) the sublevel sets of the action functional I. For every δ > 0, γ > 0 and r0 > 0 there exists some ϵ0 = ϵ0 (δ, γ, r0 ) such that r−γ ℙ(d(ϵB, Φ(r)) ⩾ δ) ⩽ exp [− 2 ] (13.10) ϵ for all 0 < ϵ ⩽ ϵ0 and 0 ⩽ r ⩽ r0 . Proof. Set t j = j/n, j = 0, 1, . . . , n, n ⩾ 1, and denote by

{ϵB(t j , ω), if t = t j , j = 0, 1, . . . , n, Π n [ϵB](t, ω) := { linear, for all other t, { the piecewise linear approximation of the continuous function ϵB t (ω) on the grid t0 , t1 , . . . , t n . Obviously, for all t ∈ [t k−1 , t k ] and k = 1, . . . , n, Π n [ϵB](t) = ϵB(t k−1 ) + (t − t k−1 )

ϵB(t k ) − ϵB(t k−1 ) . t k − t k−1

Since the approximations Π n [ϵB](t) are absolutely continuous, we can calculate the action functional I (Π n [ϵB]) =

1 n ϵ2 n 2 ∑ nϵ2 (B(t k ) − B(t k−1 ))2 = ∑ξ 2 k=1 2 k=1 k

(13.11)

where ξ k = √n(B(t k ) − B(t k−1 )) ∼ N(0, 1) are iid standard normal random variables. Note that {d(ϵB, Φ(r)) ⩾ δ} = {d(ϵB, Φ(r)) ⩾ δ, d(Π n [ϵB], ϵB) < δ}

∪ {d(ϵB, Φ(r)) ⩾ δ, d(Π n [ϵB], ϵB) ⩾ δ}

⊂ {Π n [ϵB] ∉ Φ(r)} ∪ {d(Π n [ϵB], ϵB) ⩾ δ} . =: A n

=: C n

We will estimate ℙ(A n ) and ℙ(C n ) separately. Combining (13.11) with Chebychev’s inequality gives for all 0 < α < 1 ℙ(A n ) = ℙ (I(Π n [ϵB]) > r) = ℙ (

r 1 n 2 ) ∑ξ > 2 k=1 k ϵ2

= ℙ (exp [

iid

⩽ exp [−

1−α n 2 r ∑ ξ ] > exp [ 2 (1 − α)]) 2 k=1 k ϵ

r 1−α 2 n ξ ]) . (1 − α)] exp (𝔼 [ 2 1 ϵ2 2

= e n(1−α)

/8

by (2.6)

13.2 Large deviations (Schilder’s theorem) | 223

Therefore, we find for all n ⩾ 1, r0 > 0 and γ > 0 some ϵ0 > 0 such that ℙ(A n ) ⩽

r 1 γ exp [− 2 + 2 ] 2 ϵ ϵ

for all 0 < ϵ ⩽ ϵ0 , 0 ⩽ r ⩽ r0 .

In order to estimate ℙ(C n ) we use the fact that Brownian motion has stationary increments. ℙ(C n ) = ℙ ( sup |ϵB(t) − Π n [ϵB](t)| ⩾ δ) 0⩽t⩽1

n

⩽ ∑ ℙ ( sup |ϵB(t) − ϵB(t k−1 ) − (t − t k−1 )n ϵ(B(t k ) − B(t k−1 ))| ⩾ δ) k=1 n

t k−1 ⩽t⩽t k

⩽1

δ δ ⩽ ∑ ℙ ( sup |B(t) − B(t k−1 )| ⩾ ) + ℙ (|B(t k ) − B(t k−1 )| ⩾ ) 2ϵ 2ϵ t ⩽t⩽t k−1 k k=1

⩽ 2nℙ ( sup B(t) ⩾ t⩽1/n

δ ). 2ϵ

We can now use the following maximal estimate which is a consequence of the reflection principle, Theorem 6.10, and the Gaussian tail estimate from Lemma 10.5 ℙ (sup B(s) ⩾ x) ⩽ 2ℙ(B(t) ⩾ x) ⩽ s⩽t

to deduce

ℙ(C n ) ⩽ 4nℙ (B ( 1n ) ⩾

x2 1 exp [− ] , 2t √2πt x 2

x > 0, t > 0,

2ϵ δ δ2 n exp [− 2 ] . ) ⩽ 4n 2ϵ 8ϵ √2π/n δ

Finally, pick n ⩾ 1 such that δ2 n/8 > r0 and ϵ0 small enough such that ℙ(C n ) is less than 12 exp [− ϵr2 + ϵγ2 ] for all 0 < ϵ ⩽ ϵ0 and 0 ⩽ r ⩽ r0 .

13.13 Corollary. Let (B t )t∈[0,1] be a BM1 and I the action functional. Then lim ϵ2 log ℙ (ϵB ∈ F) ⩽ − inf I(w)

ϵ→0

w∈F

for every closed set F ⊂ C(o) [0, 1].

(13.12)

Proof. By the definition of Φ(r) we have Φ(r) ∩ F = 0 for all r < inf w∈F I(w). Since Φ(r) is compact, d(Φ(r), F) = inf d(w, F) =: δ r > 0. w∈Φ(r)

From Theorem 13.12 we know for all 0 < ϵ ⩽ ϵ0 that and so

ℙ[ϵB ∈ F] ⩽ ℙ[d(ϵB, Φ(r)) > δ r ] ⩽ exp [− lim ϵ2 log ℙ(ϵB ∈ F) ⩽ −r.

ϵ→0

Since r < inf w∈F I(w) is arbitrary, the claim follows.

r−γ ], ϵ2

224 | 13 Strassen’s functional law of the iterated logarithm Next we prove the lower large deviation inequality. 13.14 Theorem (Schilder 1966). Let (B t )t∈[0,1] be a BM1 and I the action functional. For every δ > 0, γ > 0 and c > 0 there is some ϵ0 = ϵ0 (δ, γ, c) such that ℙ(‖ϵB − w0 ‖∞ < δ) > exp (−(I(w0 ) + γ)/ϵ2 )

for all 0 < ϵ ⩽ ϵ0 and w0 ∈ C(o) [0, 1] with I(w0 ) ⩽ c.

(13.13)

Proof. By the Cameron–Martin formula (13.8), (13.9) we see that 1

0) ℙ (‖ϵB − w0 ‖∞ < δ) = exp [− I(w ] ∫ exp [− 1ϵ ∫0 w󸀠0 (s)dw(s)] μ(dw), ϵ2

A

where A := {w ∈ C(o) [0, 1] : ‖w‖∞ < δ/ϵ}. 1

Let C := {w ∈ C(o) [0, 1] : ∫0 w󸀠0 (s)dw(s) ⩽ 2√2I(w0 ) }. Then 1

1

∫ exp [− 1ϵ ∫0 w󸀠0 (s)dw(s)] μ(dw) ⩾ ∫ exp [− 1ϵ ∫0 w󸀠0 (s)dw(s)] μ(dw) A

A∩ C

⩾ exp [− 2ϵ √2I(w0 )] μ(A ∩ C).

Pick ϵ0 so small that ℙ(A) = ℙ (sup |B(t)| < t⩽1

δ 3 )⩾ ϵ 4

Now we can use Chebyshev’s inequality to find

for all 0 < ϵ ⩽ ϵ0 .

1

μ(Ω \ C) = ⩽

(13.7a)

=

{ } μ {w ∈ C(o) [0, 1] : ∫ w󸀠0 (s) dw(s) > 2√2I(w0 )} 0 { } 1 8I(w0 )

2

1

∫ [∫ w󸀠0 (s) dw(s)] μ(dw) ] C(o) [0,1] [ 0 1

1 1 ∫ |w󸀠0 (s)|2 ds = . 8I(w0 ) 4 0

Thus, μ(C) ⩾ 34 and, therefore, μ(A ∩ C) = μ(A) + μ(C) − μ(A ∪ C) ⩾ 1/2. If we make ϵ0 even smaller, we can make sure that 1 2 γ exp [− √2I(w0 )] ⩾ exp [− 2 ] 2 ϵ ϵ

for all 0 < ϵ ⩽ ϵ0 and w0 with I(w0 ) ⩽ c.

13.15 Corollary. Let (B t )t∈[0,1] be a BM1 and I the action functional. Then lim ϵ2 log ℙ (ϵB ∈ U) ⩾ − inf I(w)

ϵ→0

for every open set U ⊂ C(o) [0, 1].

w∈U

(13.14)

13.3 The proof of Strassen’s theorem | 225

Proof. For every w0 ∈ U there is some δ0 > 0 such that

{w ∈ C(o) [0, 1] : ‖w − w0 ‖∞ < δ0 } ⊂ U.

Theorem 13.14 shows that

and so

ℙ(ϵB ∈ U) ⩾ ℙ(‖ϵB − w0 ‖ < δ0 ) ⩾ exp (−

1 (I(w0 ) + γ)) , ϵ2

lim ϵ2 log ℙ(ϵB ∈ U) ⩾ −I(w0 );

ϵ→0

since w0 ∈ U is arbitrary, the claim follows.

13.3 The proof of Strassen’s theorem

Before we start with the rigorous argument, we present a quick motivation. 13.16 Some heuristic considerations. Let (B t )t⩾0 be a BM1 and Z s (t, ω) =

B(st, ω) B(t, ω) ∼ , √2s log log s √2 log log s

t ∈ [0, 1], s > e.

In order to decide whether w0 ∈ C(o) [0, 1] is a limit point of {Z s : s > e}, we have to estimate the probability as

ℙ(‖Z s (⋅) − w0 (⋅)‖∞ < δ)

s→∞

for all δ > 0. Using the canonical version of the Brownian motion on the Wiener space (C(o) [0, 1], B(C(o) [0, 1]), μ) we see for ϵ := 1/√2 log log s. ℙ(‖Z s (⋅) − w0 (⋅)‖∞ < δ) = ℙ(‖ϵB(⋅) − w0 (⋅)‖∞ < δ)

= μ {w ∈ C(o) [0, 1] : ‖w − ϵ−1 w0 ‖∞ < δ/ϵ} .

Now assume that w0 ∈ H1 and set A := {w ∈ C(o) [0, 1] : ‖w‖∞ < δ/ϵ}. The Cameron– Martin formula (13.8), (13.9) shows ℙ(‖Z s (⋅) − w0 (⋅)‖∞ < δ) 1

= exp (−

1

1 1 ∫ |w󸀠0 (t)|2 dt) ∫ exp (− ∫ w󸀠0 (s) dw(s)) μ(dw). 2 ϵ 2ϵ 0

0

A

Using the large deviation estimates from Corollaries 13.13 and 13.15 we see that the first term on the right-hand side is the dominating factor as ϵ → 0 (i.e. s → ∞) and that 1

− ∫ |w󸀠0 (t)|2 dt

ℙ (‖Z s (⋅) − w0 (⋅)‖∞ < δ) ≈ (log s)

0

as

s → ∞.

(13.15)

226 | 13 Strassen’s functional law of the iterated logarithm 1

Thus, the probability on the left is larger than zero only if ∫0 |w󸀠0 (t)|2 dt < ∞, i.e. if w0 ∈ H1 . Consider now, as in the proof of Khintchine’s LIL, sequences of the form s = s n = q n , n ⩾ 1, where q > 1. Then (13.15) shows that ∑ ℙ (‖Z s n (⋅) − w0 (⋅)‖∞ < δ) = +∞,

or

< +∞,

n

1

1

according to ∫0 |w󸀠0 (t)|2 dt ⩽ 1, or ∫0 |w󸀠0 (t)|2 dt > 1, respectively. By the Borel–Cantelli lemma we infer that the family 1

1

K = {w ∈ H : ∫ |w󸀠 (t)|2 dt ⩽ 1} 0

is indeed a good candidate for the set of limit points.

We will now give a rigorous proof of Theorem 13.1. As in Section 13.2 we write I(u) for the action functional and Φ(r) = {w ∈ C(o) [0, 1] : I(w) ⩽ r} for its sub-level sets; with this notation K = Φ ( 21 ). Finally, set L(ω) := {w ∈ C(o) [0, 1] : w is a limit point of {Z s (⋅, ω) : s > e} as s → ∞}.

We have to show that L(ω) = Φ ( 12 ). We split the argument in a series of lemmas. 13.17 Lemma. L(ω) ⊂ Φ ( 12 ) for almost all ω ∈ Ω.

Proof. Since Φ ( 21 ) = ⋂η>0 Φ ( 12 + η), it is clearly enough to show that L(ω) ⊂ Φ ( 12 + η)

for all η > 0.

1o Consider the sequence (Z q n (⋅, ω))n⩾1 for some q > 1. From Theorem 13.12 we find for any δ > 0 and γ < η ℙ [d (Z q n , Φ ( 21 + η)) > δ] = ℙ [d ((2 log log q n )−1/2 B, Φ ( 12 + η)) > δ] ⩽ exp [−2(log log q n ) ( 21 + η − γ)] .

Since 2 ( 12 + η − γ) > 1, we see that

∑ ℙ [d (Z q n , Φ ( 21 + η)) > δ] < ∞, n

and the Borel–Cantelli lemma implies that, for almost all ω ∈ Ω, d (Z q n (⋅, ω), Φ ( 12 + η)) ⩽ δ

for all n ⩾ n0 (ω).

2o Let us now “fill the gaps” in the sequence (q n )n⩾1 . Set L(t) = √2t log log t. Then 󵄨󵄨 B(rt) B(rq n ) 󵄨󵄨 󵄨 󵄨󵄨󵄨 sup ‖Z t − Z q n ‖∞ = sup sup 󵄨󵄨󵄨 − L(q n ) 󵄨󵄨󵄨 q n−1 ⩽t⩽q n q n−1 ⩽t⩽q n 0⩽r⩽1 󵄨󵄨 L(t) ⩽

|B(rt)| L(q n ) |B(rt) − B(rq n )| + sup sup − 1) . ( n n L(q ) L(t) q n−1 ⩽t⩽q n 0⩽r⩽1 L(q ) q n−1 ⩽t⩽q n 0⩽r⩽1 sup

sup

=: X n

=: Y n

13.3 The proof of Strassen’s theorem | 227

Since L(t), t > 3, is increasing, we see that for sufficiently large n ⩾ 1 Y n ⩽ sup s⩽q n

L(q n ) |B(s)| − 1) . ( L(q n ) L(q n−1 )

Khintchine’s LIL, Theorem 12.1, shows that the first factor is almost surely bounded for all n ⩾ n0 (ω). Moreover, limn→∞ L(q n )/L(q n−1 ) = 1, which means that we have Yn ⩽

δ 2

for all n ⩾ n0 (ω, δ, q)

with δ > 0 as in Step 1o . For the estimate of X n we use the scaling property of a Brownian motion to get ℙ (X n >

δ 󵄨 󵄨 δ ) = ℙ ( sup sup 󵄨󵄨󵄨B(tq−n r) − B(r)󵄨󵄨󵄨 > √2 log log q n ) 2 2 q n−1 ⩽t⩽q n 0⩽r⩽1 󵄨 󵄨 δ sup 󵄨󵄨󵄨B(cr) − B(r)󵄨󵄨󵄨 > √2 log log q n ) 2 q−1 ⩽c⩽1 0⩽r⩽1

= ℙ ( sup

⩽ ℙ(γB ∈ F)

where γ := 2δ−1 (2 log log q n )−1/2 and F := {w ∈ C(o) [0, 1] :

sup

sup |w(cr) − w(r)| ⩾ 1} .

q−1 ⩽c⩽1 0⩽r⩽1

Since F ⊂ C(o) [0, 1] is a closed subset, we can use Corollary 13.13 and find ℙ (X n >

1 δ ) ⩽ exp [− 2 ( inf I(w) − h)] 2 γ w∈F

for any h > 0. As inf w∈F I(w) = q/(2q − 2), cf. Lemma 13.18 below, we get ℙ (X n >

δ2 q δ log log q n ⋅ ( − 2h)] ) ⩽ exp [− 2 4 q−1

and, therefore, ∑n ℙ [X n > δ/2] < ∞ if q is close to 1. Now the Borel–Cantelli lemma implies that X n ⩽ δ/2 almost surely if n ⩾ n0 (ω) and so sup

q n−1 ⩽s⩽q n

‖Z s (⋅, ω) − Z q n (⋅, ω)‖∞ ⩽ δ

for all n ⩾ n0 (ω, δ, q).

3o Steps 1o and 2o show that for every δ > 0 there is some q > 1 such that there exists for almost all ω ∈ Ω some s0 = s0 (ω) = q n0 (ω)+1 with d (Z s (⋅, ω), Φ ( 12 + η)) ⩽ ‖Z s (⋅, ω) − Z q n (⋅, ω)‖∞ + d (Z q n (⋅, ω), Φ ( 21 + η)) ⩽ 2δ

for all s ⩾ s0 . Since δ > 0 is arbitrary and Φ ( 12 + η) is closed, we get L(ω) ⊂ Φ ( 12 + η) = Φ ( 12 + η) .

Ex. 13.7

228 | 13 Strassen’s functional law of the iterated logarithm 13.18 Lemma. Let F = {w ∈ C(o) [0, 1] : supq−1 ⩽c⩽1 sup0⩽r⩽1 |w(cr) − w(r)| ⩾ 1}, q > 1, and I(⋅) be the action functional of a Brownian motion. Then inf I(w) =

w∈F

Proof. The function w0 (t) :=

qt−1 q−1

1 q . 2 q−1

𝟙[1/q,1] (t) is in F, and 1

2I(w0 ) = ∫ ( 1/q

2 q q . ) dt = q−1 q−1

Therefore, 2 inf w∈F I(w) ⩽ 2I(w0 ) = q/(q − 1). On the other hand, for every w ∈ F there are x ∈ [0, 1] and c ⩾ q−1 such that 1 󵄨󵄨 x 󵄨󵄨2 󵄨󵄨 󵄨󵄨 󸀠 󵄨 󵄨 1 ⩽ |w(x) − w(cx)| = 󵄨󵄨 ∫ w (s) ds 󵄨󵄨 ⩽ (x − cx) ∫ |w󸀠 (s)|2 ds 󵄨󵄨 󵄨󵄨 cx 0 2

⩽ 2(1 − q−1 )I(w).

13.19 Corollary. The set {Z s (⋅, ω) : s > e} is for almost all ω ∈ Ω a relatively compact subset of C(o) [0, 1]. Proof. Let (s j )j⩾1 be a sequence in (0, ∞). Without loss of generality we can assume that (s j )j⩾1 is unbounded. Pick a further sequence (δ n )n⩾1 ⊂ (0, ∞) such that limn→∞ δ n = 0. Because of Step 3o in the proof of Lemma 13.17 there exists for every n ⩾ 1 some w n ∈ Φ ( 12 + η) and an index j(n) such that ‖Z s j(n) (⋅, ω) − w n ‖∞ ⩽ δ n . Since Φ ( 21 + η) is a compact set, there is a subsequence (s󸀠j(n) )n⩾1 ⊂ (s j(n) )n⩾1 such that (Z s󸀠j(n) (⋅, ω))n⩾1 converges in C(o) [0, 1].

13.20 Lemma. L(ω) ⊃ Φ ( 21 ) for almost all ω ∈ Ω.

Proof. 1o Since for every w ∈ Φ ( 21 ) we have √2rw ∈ Φ(r), r < Φ ( 12 )

1 2,

the inclusion

is easy to see; as Φ(r) ⊂ Φ(r󸀠 ) if r ⩽ r󸀠 , it is enough to consider ⋃r 0 a sequence (s n )n⩾1 , s n → ∞, such that ℙ [ lim ‖Z s n (⋅) − w(⋅)‖∞ ⩽ ϵ] = 1. n→∞

Note that the sequence s n will depend measurably on ω.

13.3 The proof of Strassen’s theorem | 229

2o As in the proof of Khintchine’s LIL we set s n = q n , n ⩾ 1, for some q > 1, and L(s) = √2s log log s. Then 󵄨󵄨 B(ts ) 󵄨󵄨 n 󵄨 󵄨 ‖Z s n (⋅) − w(⋅)‖∞ = sup 󵄨󵄨󵄨 − w(t)󵄨󵄨󵄨 󵄨󵄨 0⩽t⩽1 󵄨󵄨 L(s n ) 󵄨󵄨 󵄨󵄨 B(ts ) − B(s ) n n−1 󵄨 󵄨 − w(t)󵄨󵄨󵄨 ⩽ sup 󵄨󵄨󵄨 󵄨󵄨 󵄨 L(s ) −1 n q ⩽t⩽1 󵄨 +

|B(s n−1 )| |B(ts n )| + sup |w(t)| + sup . L(s n ) L(s n ) −1 −1 t⩽q t⩽q

We will estimate the terms separately. Since Brownian motion has independent increments, we see that the sets 󵄨󵄨 B(ts ) − B(s ) 󵄨󵄨 ϵ n n−1 󵄨 󵄨 A n = { sup 󵄨󵄨󵄨 − w(t)󵄨󵄨󵄨 < } , 󵄨 󵄨󵄨 4 L(s ) −1 n q ⩽t⩽1 󵄨

n ⩾ 1,

are independent. Combining the stationarity of the increments, the scaling property 2.16 of a Brownian motion and Theorem 13.14 yields for all n ⩾ n0 󵄨󵄨 󵄨󵄨 󵄨 B(t − q−1 ) 󵄨 ϵ ℙ(A n ) = ℙ ( sup 󵄨󵄨󵄨󵄨 − w(t)󵄨󵄨󵄨󵄨 < ) −1 󵄨󵄨 4 q ⩽t⩽1 󵄨󵄨 √2 log log s n

󵄨󵄨 󵄨󵄨 B(t) 󵄨󵄨 󵄨󵄨 ϵ 󵄨󵄨 󵄨󵄨 < ) − v(t) 󵄨 󵄨󵄨 8 −1 󵄨 √ 0⩽t⩽1−q 󵄨 2 log log s n 󵄨 󵄨󵄨 󵄨󵄨 B(t) 󵄨 󵄨 ϵ ⩾ ℙ ( sup 󵄨󵄨󵄨󵄨 − v(t)󵄨󵄨󵄨󵄨 < ) 󵄨󵄨 8 0⩽t⩽1 󵄨󵄨 √2 log log s n

⩾ ℙ(

sup

⩾ exp ( − 2 log log s n ⋅ (I(v) + η)) =(

2(I(v)+η) 1 ; ) n log q

here we choose q > 1 so large that |w(q−1 )| ⩽ ϵ/8, and use 1 1 {w (t + q ) − w ( q ) , v(t) = { 1 {w(1) − w ( q ) ,

Since η > 0 is arbitrary and I(v) =

if 0 ⩽ t ⩽ 1 − 1q , if 1 −

1 q

1−1/q

1

1

0

1/q

0

⩽ t ⩽ 1.

1 1 1 1 ∫ |v󸀠 (t)|2 dt = ∫ |w󸀠 (s)|2 ds ⩽ ∫ |w󸀠 (s)|2 ds ⩽ r < , 2 2 2 2

we find ∑n ℙ(A n ) = ∞. The Borel–Cantelli lemma implies that for large q > 1 󵄨󵄨 󵄨󵄨 󵄨 B(ts n ) − B(s n−1 ) 󵄨 ϵ lim sup 󵄨󵄨󵄨󵄨 − w(t)󵄨󵄨󵄨󵄨 ⩽ . 󵄨󵄨 4 n→∞ q−1 ⩽t⩽1 󵄨󵄨 √2s n log log s n

230 | 13 Strassen’s functional law of the iterated logarithm

Ex. 13.8

Selecing a (in general, ω-dependent) subsequence of (s n )n⩾1 , we can assume that the lim is actually a limit. 3o Finally,

ℙ(

|B(s n−1 )| ϵ ϵ > ) = ℙ (|B(1)| > √2q log log q n ) , L(s n ) 4 4 1/q

sup |w(t)| ⩽ ∫ |w󸀠 (s)| ds ⩽

t⩽q−1

ℙ ( sup

0⩽t⩽q−1

0

√2r 1 < , √q √q

ϵ |B(ts n )| ϵ > ) ⩽ 2ℙ (|B(1)| > √2q log log q n ) , L(s n ) 4 4

and the Borel–Cantelli lemma shows that for sufficiently large q > 1 lim (

n→∞

This finishes the proof.

|B(ts n )| |B(s n−1 )| 3 + sup |w(t)| + sup ) < ϵ. L(s n ) 4 t⩽q−1 L(s n ) t⩽q−1

13.21 Further reading. The Cameron–Martin formula is the basis for many further developments in the analysis in Wiener and abstract Wiener spaces, see the Further reading section in Chapter 4; worth reading is the early account in [268]. There is an extensive body of literature on the topic of large deviations. A good starting point are the lecture notes [260] to get an idea of the topics and methods behind “large deviations”. Since the topic is quite technical, it is very difficult to find easy introductions. Our favourites are [49] and [90] (which we use here), a standard reference is the classic monograph [50]. [49] Dembo, Zeitouni: Large Deviations Techniques and Applications. Deuschel, Stroock: Large Deviations. [50] [90] Freidlin, Wentzell: Random Perturbations of Dynamical Systems. [260] Varadhan: Large Deviations and Applications. [268] Wiener et al.: Differential Space, Quantum Systems, and Prediction.

Problems 1.

2.

Let w ∈ C(o) [0, 1] and assume that for every fixed t ∈ [0, 1] the number w(t) is a limit point of the family {Z s (t) : s > e} ⊂ ℝ. Show that this is not sufficient for w to be a limit point of C(o) [0, 1].

Let K be the set from Theorem 13.1. Show that for w ∈ K the estimate |w(t)| ⩽ √t, t ∈ [0, 1] holds.

Problems

3.

| 231

Let u ∈ H1 and denote by Π n , n ⩾ 1, a sequence of partitions of [0, 1] such that limn→∞ |Π n | = 0. Show that the functions f n (t) =



t j ,t j−1 ∈Π n

[ t j −t1j−1 ∫t

tj

j−1

2

2

u󸀠 (s) ds] 𝟙[t j−1 ,t j ) (t),

converge for n → ∞ to [u󸀠 (t)] for Lebesgue almost all t.

n ⩾ 1,

4. Let ϕ ∈ BV[0, 1] and consider the following Riemann-Stieltjes integral 1

G ϕ (w) = ϕ(1)w(1) − ∫ w(s) dϕ(s), 0

w ∈ C(o) [0, 1].

Show that G ϕ is a linear functional on (C(o) [0, 1], B(C(o) [0, 1]), μ) satisfying 1 a) G ϕ is a normal random variable, mean 0, variance ∫0 ϕ2 (s) ds; 1

b) ∫C

(o) [0,1]

5.

G ϕ (w)G ψ (w) μ(dw) = ∫0 ϕ(s)ψ(s) ds for all ϕ, ψ ∈ BV[0, 1];

c) If (ϕ n )n⩾1 ⊂ BV[0, 1] converges in L2 to ϕ, then G ϕ := limn→∞ G ϕ n exists as the L2 -limit and defines a linear functional which satisfies a) and b).

Denote by H1 the Cameron–Martin space (13.2) and by H∘1 the set defined in (13.3). 1 a) Show that H1 is a Hilbert space with the canonical norm ‖h‖2H1 := ∫0 |h󸀠 (s)|2 ds 1

and scalar product ⟨ϕ, h⟩H1 = ∫0 ϕ󸀠 (s) dh(s) (this is a Paley–Wiener– Zygmund integral, cf. 13.5).

b) Show that H∘1 is a dense subset of H1 .

c) Show that ‖h‖H1 = supϕ∈H1 , ‖ϕ‖H1 =1 ⟨ϕ, h⟩H1 = supϕ∈H∘1 , ‖ϕ‖H1 =1 ⟨ϕ, h⟩H1 . 6.

d) Let h ∈ C(o) [0, 1]. Show that h ∉ H1 if, and only if, there exists a sequence (ϕ n )n⩾1 ⊂ H∘1 such that ‖ϕ n ‖H1 = 1 and ⟨ϕ n , h⟩H1 ⩾ 2n.

Let w ∈ C(o) [0, 1]. Find the densities of the following bi-variate random variables: t a) (∫1/2 s2 dw(s), w(1/2)) for 1/2 ⩽ t ⩽ 1; b) (∫1/2 s2 dw(s), w(u + 1/2)) for 0 ⩽ u ⩽ t/2, 1/2 ⩽ t ⩽ 1; t

c) (∫1/2 s2 dw(s), ∫1/2 s dw(s)) for 1/2 ⩽ t ⩽ 1; t

t

1

7.

d) (∫1/2 e s dw(s), w(1) − w(1/2)).

Show that F := {w ∈ C(o) [0, 1] : supq−1 ⩽c⩽1 sup0⩽r⩽1 |w(cr) − w(r)| ⩾ 1} is for every q > 1 a closed subset of C(o) [0, 1].

8. Check the (in)equalities from Step 3o in the proof of Lemma 13.20.

14 Skorokhod representation Let (B t )t⩾0 be a one-dimensional Brownian motion. Denote by τ = τ∘(−a,b)c , a, b > 0, the first entrance time into the interval (−a, b)c . We know from Corollary 5.11 that b a δ−a + δ b and a+b a+b b 𝟙[−a,b) (x) + 𝟙[b,∞) (x). F−a,b (x) = ℙ(B τ ⩽ x) = a+b Bτ ∼

(14.1)

We will now study the problem which probability distributions can be obtained in this way. More precisely, let X be a random variable with probability distribution function F(x) = ℙ(X ⩽ x). We want to know if there is a Brownian motion (B t )t⩾0 and a stopping time τ such that X ∼ B τ . If this is the case, we say that X (or F) can be embedded into a Brownian motion. The question and its solution (for continuous F) go back to Skorokhod [240, Chapter 7]. To simplify things, we do not require that τ is a stopping time for the natural filtration (FtB )t⩾0 , FtB = σ(B s : s ⩽ t), but for a larger admissible filtration (Ft )t⩾0 , cf. Definition 5.1. If X can be embedded into (B t )t⩾0 , it is clear that 𝔼 (X k ) = 𝔼 (B kτ )

whenever the moments exist. Moreover, if 𝔼 τ < ∞, Wald’s identities, cf. Theorem 5.10, entail 𝔼 Bτ = 0

and

𝔼 (B2τ ) = 𝔼 τ.

Thus, X is necessarily centred: 𝔼 X = 0. On the other hand, every centred distribution function F can be obtained as a mixture of two-point distribution functions F−a,b . This allows us to reduce the general embedding problem to the case considered in (14.1). The elegant proof of the following lemma is due to Breiman [20]. 14.1 Lemma. Let F be a distribution function with ∫ℝ |x| dF(x) < ∞ and ∫ℝ x dF(x) = 0. Then F is the mixture of two-point distributions of the form (14.1) F(x) = ∬ F u,w (x) ν(du, dw) ℝ2

with the mixing probability distribution on (−∞, 0] × (0, ∞) and α :=

ν(du, dw) =

1 2 ∫ℝ |x| dF(x).

1 (w − u)𝟙{u⩽0 n) n=1

0⩽s⩽τ1

Doob

⩽ 𝔼( sup B2s ) = 𝔼(sup B2s∧τ ) ⩽ 4 𝔼 (B2τ ) = 4 𝔼 (X12 ) < ∞. 0⩽s⩽τ1

s⩾0

1

(A.14)

1

Now we can use the (simple direction of the) Borel–Cantelli lemma to deduce, for sufficiently large values of n, sup

0⩽s⩽τ n −τ n−1

󵄨󵄨 󵄨 󵄨󵄨B s+τ − B τ 󵄨󵄨󵄨 ⩽ C√n. 󵄨 n−1 n−1 󵄨

󵄨 󵄨 󵄨 󵄨 This gives (14.2) since sup0⩽s⩽τ n −τ n−1 󵄨󵄨󵄨󵄨B s+τ n−1 − B τ n−1 󵄨󵄨󵄨󵄨 = sup0⩽s⩽τ n −τ n−1 󵄨󵄨󵄨󵄨B s+τ n−1 − B τ n 󵄨󵄨󵄨󵄨.

We have seen in Section 3.5 that Brownian motion can be constructed as the limit of random walks. For every finite family (B t1 , . . . , B t k ), t1 < ⋅ ⋅ ⋅ < t k , this follows from the

14 Skorokhod representation | 237

classical central limit theorem: let (ϵ j )j⩾1 be iid Bernoulli random variables such that ℙ(ϵ1 = ±1) = 21 and S n := ϵ1 + ⋅ ⋅ ⋅ + ϵ n , then ¹ S⌊nt⌋ weakly 󳨀󳨀󳨀󳨀󳨀󳨀→ B t √n n→∞

for all t ∈ (0, 1].

We are interested in approximations of the whole path [0, 1] ∋ t 󳨃→ B t (ω) of a Brownian motion, and the above convergence will give pointwise but not uniform results. To get uniform results, it is helpful to interpolate the partial sums S⌊nt⌋ /√n S n (t) :=

1 (S⌊nt⌋ − (nt − ⌊nt⌋)ϵ⌊nt⌋+1 ) , √n

t ∈ [0, 1].

This is a piecewise linear continuous function which is equal to S j /√n if t = j/n. We are interested in the convergence of Φ(S n (⋅)) 󳨀󳨀󳨀󳨀→ Φ(B(⋅)) n→∞

where Φ : (C[0, 1], ℝ) → ℝ is a (not necessarily linear) bounded functional which is uniformly continuous with respect to uniform convergence. Although the following theorem implies Theorem 3.10, we cannot use it to construct Brownian motion since the proof already uses the existence of a Brownian motion. Nevertheless the result is important as it helps to understand Brownian motion intuitively, and permits simple proofs of various limit theorems. 14.5 Theorem (invariance principle. Donsker 1951). Let (B t )t∈[0,1] be a one-dimensional Brownian motion, (S n (t))t∈[0,1] , n ⩾ 1, be the sequence of processes from above, and Φ : C([0, 1], ℝ) → ℝ a uniformly continuous bounded functional. Then lim 𝔼 Φ(S n (⋅)) = 𝔼 Φ(B(⋅)),

n→∞

i.e. S n (⋅) converges weakly to B(⋅).

Proof. Denote by (W t )t⩾0 the Brownian motion into which we can embed the random walk (S n )n⩾1 , i.e. W τ n ∼ S n for an increasing sequence of stopping times τ1 ⩽ τ2 ⩽ ⋅ ⋅ ⋅ , cf. Corollary 14.3. Since our claim is a statement on the probability distributions, we can assume that S n = W τ n . By assumption, we find for each ϵ > 0 some η > 0 such that |Φ(f) − Φ(g)| < ϵ for all ‖f − g‖∞ < η. The assertion follows if we can prove, for all ϵ > 0 and for sufficiently large values of n ⩾ N ϵ , that the Brownian motions defined by B n (t) := n−1/2 W nt ∼ B t , t ∈ [0, 1], satisfy 󵄨 󵄨 󵄨 󵄨 ℙ (󵄨󵄨󵄨Φ(S n (⋅)) − Φ(B n (⋅))󵄨󵄨󵄨 > ϵ) ⩽ ℙ ( sup 󵄨󵄨󵄨S n (t) − B n (t)󵄨󵄨󵄨 > η) ⩽ ϵ. 0⩽t⩽1

1 In fact, it is enough to assume that the ϵ j are iid random variables with mean zero and finite variance. This universality is the reason why Theorem 14.5 is usually called invariance principle.

238 | 14 Skorokhod representation This means that Φ(S n (⋅)) − Φ(B n (⋅)) converges in probability to 0, therefore B n ∼B

| 𝔼 Φ(S n (⋅)) − 𝔼 Φ(B(⋅))| = | 𝔼 Φ(S n (⋅)) − 𝔼 Φ(B n (⋅))| 󳨀󳨀󳨀󳨀→ 0. n→∞

In order to see the estimate we need three simple observations. Pick ϵ, η > 0 as above.

1o Since a Brownian motion has continuous sample paths, there is some δ > 0 such that ℙ(∃s, t ∈ [0, 1], |s − t| ⩽ δ : |W t − W s | > η) ⩽

ϵ . 2

2o From Corollary 14.3 and Wald’s identities it follows that (τ n )n⩾1 is a random walk with mean 𝔼 τ n = 𝔼 (B2τ ) = 𝔼(S2n ) < ∞. Therefore, the strong law of large numbers n tells us that lim

n→∞

τn = 𝔼 τ1 = 1 a.s. n

Thus, for every η > 0, there is some M(η) such that for all n > m ⩾ M(η) 󵄨󵄨 τ 󵄨󵄨 1 1 max |τ k − k| ⩽ max |τ k − k| + max 󵄨󵄨󵄨󵄨 k − 1󵄨󵄨󵄨󵄨 n 1⩽k⩽n n 1⩽k⩽m m⩽k⩽n 󵄨 k 󵄨 1 max |τ − k| + η 󳨀󳨀󳨀󳨀→ η 󳨀󳨀󳨀󳨀→ 0. ⩽ n→∞ η→0 n 1⩽k⩽m k

Consequently, for some N(ϵ, η) and all n ⩾ N(ϵ, η) ℙ(

1 δ ϵ max |τ − k| > ) ⩽ . n 1⩽k⩽n k 3 2

3o For every t ∈ [k/n, (k + 1)/n] there is a u ∈ [τ k /n, τ k+1 /n], k = 0, 1, . . . , n − 1, such that S n (t) = n−1/2 W nu = B n (u).

Indeed: S n (t) interpolates B n (τ k /n) and B n (τ k+1 /n) linearly, i.e. S n (t) is between the minimum and the maximum of B n (s), τ k ⩽ s ⋅ n ⩽ τ k+1 . Since s 󳨃→ B n (s) is continuous, there is some intermediate value u such that S n (t) = B n (u). This situation is shown in Figure 14.1. Clearly, we have

󵄨 󵄨 ℙ ( sup 󵄨󵄨󵄨S n (t) − B n (t)󵄨󵄨󵄨 > η) 0⩽t⩽1

󵄨󵄨 τ l 󵄨󵄨 δ δ 1 󵄨 󵄨 ⩽ ℙ ( sup 󵄨󵄨󵄨S n (t) − B n (t)󵄨󵄨󵄨 > η, max 󵄨󵄨󵄨󵄨 l − 󵄨󵄨󵄨󵄨 ⩽ ) + ℙ( max |τ l − l| > ). n󵄨 3 n 1⩽l⩽n 3 1⩽l⩽n 󵄨 n 0⩽t⩽1

14 Skorokhod representation | 239

%Q









τN

W

Q

X

τN  Q

W

Fig. 14.1: For each value S n (t) of the line segment connecting B n (τ k /n) and B n (τ k+1 /n) there is some u such that S n (t) = B n (u).

󵄨 󵄨 If 󵄨󵄨󵄨S n (t) − B n (t)󵄨󵄨󵄨 > η for some t ∈ [k/n, (k + 1)/n], then we find by Step 3o some value 󵄨 󵄨 u ∈ [τ k /n, τ k+1 /n] with 󵄨󵄨󵄨B n (u) − B n (t)󵄨󵄨󵄨 > η. Moreover, 󵄨󵄨 󵄨󵄨 δ δ 1 τ 󵄨󵄨 󵄨󵄨 τ k 󵄨󵄨 󵄨󵄨 k |u − t| ⩽ 󵄨󵄨󵄨󵄨u − k 󵄨󵄨󵄨󵄨 +󵄨󵄨󵄨󵄨 k − 󵄨󵄨󵄨󵄨 + 󵄨󵄨󵄨󵄨 − t󵄨󵄨󵄨󵄨 ⩽ + + ⩽ δ n󵄨 󵄨n n󵄨 󵄨n 󵄨 󵄨 3 3 n o → 0, n → ∞ by 2 󵄨󵄨 τ 󵄨󵄨 if n ⩾ N(ϵ, η) ∨ (3/δ) and max1⩽l⩽n 󵄨󵄨󵄨 nl − nl 󵄨󵄨󵄨 ⩽ δ/3. This, 2o and 1o give for all sufficiently 󵄨 󵄨 large n 󵄨 󵄨 ℙ( sup 󵄨󵄨󵄨S n (t) − B n (t)󵄨󵄨󵄨 > η) 0⩽t⩽1

ϵ 󵄨 󵄨 ⩽ ℙ (∃u, t ∈ [0, 1], |u − t| ⩽ δ : 󵄨󵄨󵄨B n (u) − B n (t)󵄨󵄨󵄨 > η) + ⩽ ϵ. 2 Typical examples of uniformly continuous functionals are Φ0 (f) = f(1), 1

Φ2 (f) = ∫ |f(t)|p dt, 0

Φ1 (f) = sup |f(t) − at|, 0⩽t⩽1

1

Φ3 (f) = ∫ |f(t) − tf(1)|4 dt. 0

With a bit of extra work one can prove Theorem 14.5 for all Φ : C([0, 1], ℝ) → ℝ which are continuous with respect to uniform convergence at almost every Brownian path.

240 | 14 Skorokhod representation This version includes the following functionals Φ4 (f) = 𝟙{sup0⩽t⩽1 |f(t)|⩽1} ,

Φ6 (f) = 𝟙{∀t : a(t) m ⩾ 0, 󵄨 󵄨 󵄨 𝔼 [(X n − X m )2 󵄨󵄨󵄨 Fm ] = 𝔼 [X 2n − X 2m 󵄨󵄨󵄨 Fm ] = 𝔼 [⟨X⟩n − ⟨X⟩m 󵄨󵄨󵄨 Fm ].

(15.3)

In particular,

n

n

j=1

j=1

󵄨 󵄨 ⟨X⟩n = ∑ 𝔼 [(X j − X j−1 )2 󵄨󵄨󵄨 Fj−1 ] = ∑ 𝔼 [X 2j − X 2j−1 󵄨󵄨󵄨 Fj−1 ].

(15.4)

Proof. Note that for all n > m ⩾ 0

󵄨 𝔼 [(X n − X m )2 󵄨󵄨󵄨 Fm ] 󵄨 = 𝔼 [X 2n − 2X n X m + X 2m 󵄨󵄨󵄨 Fm ] 󵄨 󵄨 = 𝔼 [X 2n 󵄨󵄨󵄨 Fm ] − 2X m 𝔼 [X n 󵄨󵄨󵄨 Fm ] +X 2m =

=

=

󵄨 𝔼 [X 2n 󵄨󵄨󵄨 Fm ] − X 2m 󵄨 𝔼 [X 2n − X 2m 󵄨󵄨󵄨 Fm ] 𝔼 [ X 2n − ⟨X⟩n −(X 2m martingale

=X m , martingale

󵄨 󵄨 − ⟨X⟩m ) 󵄨󵄨󵄨 Fm ] + 𝔼 [⟨X⟩n − ⟨X⟩m 󵄨󵄨󵄨 Fm ]

󵄨 = 𝔼 [⟨X⟩n − ⟨X⟩m 󵄨󵄨󵄨 Fm ].

Since ⟨X⟩j is Fj−1 measurable, we get (15.4) from (15.3) if we set n = j, m = j − 1, and sum over j = 1, . . . , n.

15.3 Definition. Let (X n , Fn )n⩾0 be a martingale and (C n , Fn )n⩾0 an adapted process. Then n C ∙ X n := ∑ C j−1 (X j − X j−1 ), C ∙ X0 := 0, n ⩾ 1, (15.5) j=1

is called the martingale transform.

Let us review a few properties of the martingale transform. 15.4 Theorem. Let (M n , Fn )n⩾0 be an L2 martingale and (C n , Fn )n⩾0 be a bounded adapted process. Then a) C ∙ M is an L2 martingale, i.e. C∙ : M2 → M2 ; b) C 󳨃→ C ∙ M is linear; c) ⟨C ∙ M⟩n = C2 ∙ ⟨M⟩n := ∑nj=1 C2j−1 (⟨M⟩j − ⟨M⟩j−1 ) for all n ⩾ 1;

246 | 15 Stochastic integrals: L2 -Theory 󵄨 󵄨 d) 𝔼 [(C ∙ M n − C ∙ M m )2 󵄨󵄨󵄨 Fm ] = 𝔼 [C2 ∙ ⟨M⟩n − C2 ∙ ⟨M⟩m 󵄨󵄨󵄨 Fm ] for n > m ⩾ 0. In particular, 𝔼 [(C ∙ M n )2 ] = 𝔼 [C2 ∙ ⟨M⟩n ],

(15.6)

n

n

j=1

j=1

𝔼 [(C ∙ M n )2 ] = 𝔼 [ ∑ C2j−1 (M j − M j−1 )2 ] = 𝔼 [ ∑ C2j−1 (M 2j − M 2j−1 )] .

(15.7)

Proof. a) By assumption, there is some constant K > 0 such that |C j | ⩽ K a.s. for all n ⩾ 0. Thus, |C j−1 (M j − M j−1 )| ⩽ K(|M j−1 | + |M j |) ∈ L2 , which means that C ∙ M n ∈ L2 . Let us check that C ∙ M is a martingale. For all n ⩾ 1 󵄨 󵄨 𝔼 (C ∙ M n 󵄨󵄨󵄨 Fn−1 ) = 𝔼 (C ∙ M n−1 + C n−1 (M n − M n−1 ) 󵄨󵄨󵄨 Fn−1 ) 󵄨 = C ∙ M n−1 + C n−1 𝔼 ((M n − M n−1 ) 󵄨󵄨󵄨 Fn−1 ) = C ∙ M n−1 .

b) The linearity of C 󳨃→ C ∙ M is obvious.

=0, martingale

c) If we apply Lemma 15.2 to the L2 martingale X = C ∙ M, we get n

(15.4) 󵄨 ⟨C ∙ M⟩n = ∑ 𝔼 [(C ∙ M j − C ∙ M j−1 )2 󵄨󵄨󵄨 Fj−1 ] j=1 n

󵄨 = ∑ 𝔼 [C2j−1 (M j − M j−1 )2 󵄨󵄨󵄨 Fj−1 ] j=1 n

󵄨 = ∑ C2j−1 𝔼 [(M j − M j−1 )2 󵄨󵄨󵄨 Fj−1 ] j=1

(15.3)

n

󵄨 = ∑ C2j−1 𝔼 [⟨M⟩j − ⟨M⟩j−1 󵄨󵄨󵄨 Fj−1 ] j=1 n

= ∑ C2j−1 (⟨M⟩j − ⟨M⟩j−1 ) j=1

= C2 ∙ ⟨M⟩n .

d) Using c) and Lemma 15.2 for the L2 martingale C ∙ M gives

󵄨 󵄨 𝔼 ((C ∙ M n − C ∙ M m )2 󵄨󵄨󵄨 Fm ) = 𝔼 ((C ∙ M n )2 − (C ∙ M m )2 󵄨󵄨󵄨 Fm ) 󵄨 = 𝔼 (C2 ∙ ⟨M⟩n − C2 ∙ ⟨M⟩m 󵄨󵄨󵄨 Fm ).

If we take m = 0, we get (15.6). If we use Lemma 15.2 for the L2 martingale M, we see 𝔼 (⟨M⟩j − ⟨M⟩j−1 | Fj−1 ) = 𝔼 ((M j − M j−1 )2 | Fj−1 ) = 𝔼 (M 2j − M 2j−1 | Fj−1 ). Therefore, (15.7) follows from (15.6) with the tower property.

247

15.1 Discrete stochastic integrals |

Note that M 󳨃→ ⟨M⟩ is a quadratic form. Therefore, we can use polarization to get the quadratic covariation which is a bilinear form: ⟨M, N⟩ :=

1 (⟨M + N⟩ − ⟨M − N⟩) for all M, N ∈ M2 . 4

(15.8)

By construction, ⟨M, M⟩ = ⟨M⟩. Using polarization, we see that (M n N n − ⟨M, N⟩n )n⩾0 is a martingale. We can use the quadratic covariation to characterize the martingale transform intrinsically.

Ex. 15.1

15.5 Lemma. Let (M n , Fn )n⩾0 , (N n , Fn )n⩾0 be L2 martingales and (C n , Fn )n⩾0 a bounded adapted process. Then n

⟨C ∙ M, N⟩n = C ∙ ⟨M, N⟩n = ∑ C j−1 (⟨M, N⟩j − ⟨M, N⟩j−1 ).

(15.9)

j=1

Proof. Since MN − ⟨M, N⟩, M and N are martingales, we find for all j = 1, . . . , n 󵄨 ⟨M, N⟩j − ⟨M, N⟩j−1 = 𝔼 (M j N j − M j−1 N j−1 󵄨󵄨󵄨 Fj−1 ) 󵄨 = 𝔼 (M j N j − M j−1 N j − M j N j−1 + M j−1 N j−1 󵄨󵄨󵄨 Fj−1 ) 󵄨 = 𝔼 ((M j − M j−1 )(N j − N j−1 ) 󵄨󵄨󵄨 Fj−1 )

(15.10)

(compare this calculation with the proof of Lemma 15.2). Replacing M by C ∙ M yields 󵄨 ⟨C ∙ M, N⟩j − ⟨C ∙ M, N⟩j−1 = 𝔼 ((C ∙ M j − C ∙ M j−1 )(N j − N j−1 ) 󵄨󵄨󵄨 Fj−1 ) 󵄨 = 𝔼 (C j−1 (M j − M j−1 )(N j − N j−1 ) 󵄨󵄨󵄨 Fj−1 )

pull

󵄨 = C j−1 𝔼 ((M j − M j−1 )(N j − N j−1 ) 󵄨󵄨󵄨 Fj−1 )

out

= C j−1 (⟨M, N⟩j − ⟨M, N⟩j−1 ).

Summing over j = 1, . . . , n, we get (15.9).¹

15.6 Corollary. Let (M n , Fn )n⩾0 , (N n , Fn )n⩾0 be L2 martingales and (C n , Fn )n⩾0 a bounded adapted process. Then n

󵄨 󵄨 |⟨C ∙ M, N⟩n | ⩽ ∑ |C j−1 | 󵄨󵄨󵄨⟨M, N⟩j − ⟨M, N⟩j−1 󵄨󵄨󵄨 ⩽ √⟨C ∙ M⟩n √⟨N⟩n . j=1

Proof. The first inequality follows from Lemma 15.5 and the triangle inequality. Combining (15.10) from the proof of Lemma 15.5, the conditional Cauchy–Schwarz inequality

1 I learned this short proof of (15.9) from Ms. Franziska Kühn.

Ex. 15.2

248 | 15 Stochastic integrals: L2 -Theory (e.g. [232, Corollary 27.16, Problem 27.6]) and (15.3) yields ⟨M, N⟩j − ⟨M, N⟩j−1

(15.10)

󵄨 = 𝔼 ((M j − M j−1 )(N j − N j−1 ) 󵄨󵄨󵄨 Fj−1 )

󵄨 󵄨 ⩽ √𝔼 ((M j − M j−1 )2 󵄨󵄨󵄨 Fj−1 )√𝔼 ((N j − N j−1 )2 󵄨󵄨󵄨 Fj−1 )

(15.3)

= √⟨M⟩j − ⟨M⟩j−1 √⟨N⟩j − ⟨N⟩j−1 .

Another application of the Cauchy–Schwarz inequality now shows n

󵄨 󵄨 ∑ |C j−1 | 󵄨󵄨󵄨⟨M, N⟩j − ⟨M, N⟩j−1 󵄨󵄨󵄨

j=1

n

⩽ ∑ |C j−1 |√⟨M⟩j − ⟨M⟩j−1 √⟨N⟩j − ⟨N⟩j−1 j=1

n

1 2

C2j−1 (⟨M⟩j

n

1 2

⩽ [∑ − ⟨M⟩j−1 )] [ ∑ (⟨N⟩j − ⟨N⟩j−1 )] j=1 ] [ ] [j=1 = √⟨C ∙ M⟩n √⟨N⟩n .

Lemma 15.5 can also be used to characterize C ∙ M among all L2 martingales.

15.7 Theorem. Let (M n , Fn )n⩾0 be an L2 martingale and (C n , Fn )n⩾0 a bounded adapted process. The martingale transform C ∙ M is the unique L2 martingale I ∈ M2 such that I0 = 0 and ⟨I, N⟩n = C ∙ ⟨M, N⟩n for all N ∈ M2 , n ⩾ 1. (15.11) Proof. Since I = C ∙ M is an L2 martingale, (15.11) follows from Lemma 15.5. Conversely, assume that (15.11) is true. Then (15.11)

(15.9)

⟨I, N⟩ = C ∙ ⟨M, N⟩ = ⟨C ∙ M, N⟩

or ⟨I − C ∙ M, N⟩ = 0 for all N ∈ M2 . Since I − C ∙ M ∈ M2 , we can set N = I − C ∙ M and find ⟨I − C ∙ M, I − C ∙ M⟩ = ⟨I − C ∙ M⟩ = 0.

Thus, 𝔼 [(I n − C ∙ M n )2 ] = 𝔼 [⟨I − C ∙ M⟩n ] = 0 which shows that I n = C ∙ M n a.s. for all n ⩾ 0.

15.2 Simple integrands

We can now consider stochastic integrals with respect to a Brownian motion. Throughout the rest of this chapter (B t )t⩾0 is a BM1 , (Ft )t⩾0 is an admissible filtration such

15.2 Simple integrands |

249 B

that each Ft contains all ℙ-null sets, e.g. the completed natural filtration Ft = F t (cf. Theorem 6.22), and [0, T], T < ∞, a finite interval. Although we do not explicitly state it, most results remain valid for T = ∞ and [0, ∞). In line with the notation introduced in Section 15.1 we use 󳶳 X τ and X sτ := X τ∧s for the stopped process (X t )t⩾0 ; 󳶳

󳶳 󳶳

M2T for the family of Ft martingales (M t )0⩽t⩽T ⊂ L2 (ℙ);

2 M2,c T for the L martingales with (almost surely) continuous sample paths;

L2 (λ T ⊗ ℙ) = L2 ([0, T] × Ω, λ T ⊗ ℙ) for the family of (equivalence classes of) all B[0, T] ⊗ A measurable random functions f : [0, T] × Ω → ℝ equipped with the T norm ‖f‖2L2 (λ T ⊗ℙ) = ∫0 𝔼 (|f(s)|2 ) ds < ∞.

15.8 Definition. A real-valued stochastic process (f(t, ⋅))t∈[0,T] of the form n

(15.12)

f(t, ω) = ∑ ϕ j−1 (ω)𝟙[s j−1 ,s j ) (t) j=1

where n ⩾ 1, 0 = s0 ⩽ s1 ⩽ . . . ⩽ s n ⩽ T and ϕ j ∈ L∞ (Fs j ) are bounded Fs j measurable random variables, j = 0, . . . , n − 1, is called a (right continuous) simple process. We write ST for the family of all simple processes on [0, T].

Many texts on stochastic integration use in (15.12) left continuous simple processes with steps of the form 𝟙(s j−1 ,s j ] (t). This becomes important if one wants to consider stochastic integrals w.r.t. a jump process. If the integrator is a continuous process (like Brownian motion) or a process with continuous quadratic variation (see Chapter 17), then it does not matter if we consider left or right continuous simple processes. For right continuous processes we can rewrite (15.12) in the following form n

f(t, ω) = ∑ f(s j−1 , ω)𝟙[s j−1 ,s j ) (t)

Ex. 15.14

(15.13)

j=1

which allows us to define the Itô integral very intuitively via Riemann sums. 2 15.9 Definition. Let M ∈ M2,c T be a continuous L martingale and f ∈ ST . Then n

f ∙ M T = ∑ f(s j−1 )(M(s j ) − M(s j−1 ))

(15.14)

j=1

T

is called the stochastic integral of f ∈ ST . Instead of f ∙ M T we also write ∫0 f(s) dM s .

Since (15.14) is linear, it is obvious that (15.14) does not depend on the particular representation of the simple process f ∈ ST . It is not hard to see that (f ∙ M s j )j=0,...,n is a martingale transform in the sense of Definition 15.3: just take C j = f(s j ) = ϕ j and M j = M(s j ).

Ex. 15.3

250 | 15 Stochastic integrals: L2 -Theory In order to use the results of Section 15.1 we note that for a Brownian motion (15.3) 2 󵄨 ⟨B⟩s j − ⟨B⟩s j−1 = 𝔼 [(B(s j ) − B(s j−1 )) 󵄨󵄨󵄨 Fs j−1 ] (B1)

2

= 𝔼 [(B(s j ) − B(s j−1 )) ] = s j − s j−1 .

(15.15)

(5.1)

Since ⟨B⟩0 = 0, and since 0 = s0 < s1 < ⋅ ⋅ ⋅ < s n ⩽ T is an arbitrary partition, we see that ⟨B⟩t = t. We can now use Theorem 15.4 to describe the properties of the stochastic integral for simple processes.

15.10 Theorem. Let (B t )t⩾0 be a BM1 , (M t )t⩽T ∈ M2,c T , (Ft )t⩾0 be an admissible filtration and f ∈ ST . Then a) f 󳨃→ f ∙ M T ∈ L2 (ℙ) is linear; T b) ⟨f ∙ B⟩T = f 2 ∙ ⟨B⟩T = ∫0 |f(s)|2 ds; c) Itô’s isometry – L2 -continuity. For all f ∈ ST T

‖f ∙ B T ‖2L2 (ℙ) = 𝔼 [(f ∙ B)2T ] = 𝔼 [∫ |f(s)|2 ds] = ‖f ‖2L2 (λ T ⊗ℙ) . [0 ]

(15.16)

Proof. a) Let f be a simple process given by (15.12). Without loss of generality we can assume that s n = T. All assertions follow immediately from the corresponding statements for the martingale transform, cf. Theorem 15.4. For b) we observe that n

(15.15)

2

n

(15.13)

sn

∑ |f(s j−1 )| (⟨B⟩s j − ⟨B⟩s j−1 ) = ∑ |f(s j−1 )| (s j − s j−1 ) = ∫ |f(s)|2 ds,

j=1

2

j=1

0

and c) follows if we take expectations and use that ((f ∙ B s j )2 − ⟨f ∙ B⟩s j )j=0,...,n is a martingale, i.e. for s n = T we have 𝔼(f ∙ B2T ) = 𝔼⟨f ∙ B⟩T .

Note that for T < ∞ we always have ST ⊂ L2 (λ T ⊗ ℙ), i.e. (15.16) is always finite and f ∙ B T ∈ L2 (ℙ) holds true. For T = ∞ this remains true if we assume that f ∈ L2 (λ T ⊗ ℙ) and s n < T = ∞. Therefore one should understand the equality in Theorem 15.10.c) as an isometry between (ST , ‖•‖L2 (λ T ⊗ℙ) ) and a subspace of L2 (ℙ). T

It is natural to study T 󳨃→ f ∙ M T = ∫0 f(s) dM s as a function of time. Since for

every Ft stopping time τ the stopped martingale M τ = (M s∧τ )s⩽T is again in M2,c T , see Theorem A.18 and Corollary A.20, we can define n

f ∙ M τ := f ∙ (M τ )T = ∑ f(s j−1 )(M τ∧s j − M τ∧s j−1 ). j=1

As usual, we write

def

f ∙ Mτ =

T

∫ f(s) dM sτ 0

τ

= ∫ f(s) dM s . 0

This holds, in particular, for the constant stopping time τ ≡ t, 0 ⩽ t ⩽ T.

(15.17)

15.2 Simple integrands | 251 2,c 15.11 Theorem. Let (B t )t⩾0 be a BM1 , (M t )t⩽T ∈ MT , (Ft )t⩾0 be an admissible filtration and f ∈ ST . Then a) (f ∙ M t )t⩽T is a continuous L2 martingale; in particular t

󵄨 𝔼 [(f ∙ B t − f ∙ B s ) 󵄨󵄨󵄨 Fs ] = 𝔼 [∫ |f(r)|2 dr [s 2

󵄨󵄨 󵄨󵄨 F ] , 󵄨󵄨 s 󵄨 ]

s < t ⩽ T.

b) Localization in time. For every Ft stopping time τ with finitely many values (f ∙ M)τt = f ∙ (M τ )t = (f 𝟙[0,τ) ) ∙ M t .

c) Localization in space. If f, g ∈ ST are simple processes with f(s, ω) = g(s, ω) for all (s, ω) ∈ [0, t] × F for some t ⩽ T and F ∈ F∞ , then (f ∙ M s )𝟙F = (g ∙ M s )𝟙F ,

0 ⩽ s ⩽ t.

Proof. a) Let 0 ⩽ s < t ⩽ T and f ∈ ST . We may assume that s, t ∈ {s0 , . . . , s n } where the s j are as in (15.12); otherwise we add s and t to the partition. From Theorem 15.4 we know that (f ∙ M s j )j is a martingale, thus 󵄨 𝔼 (f ∙ M t 󵄨󵄨󵄨 Fs ) = f ∙ M s .

Since s, t are arbitrary, we conclude that (f ∙ M t )t⩽T ∈ M2T . The very definition of t 󳨃→ f ∙ M t , see (15.17), shows that (f ∙ M t )t⩽T is a continuous process. Using Lemma 15.2 or Theorem 15.4.d) we get 2 󵄨 𝔼 [(f ∙ B t − f ∙ B s ) 󵄨󵄨󵄨 Fs ]

=

15.10.b)

=

󵄨 𝔼 [⟨f ∙ B⟩t − ⟨f ∙ B⟩s 󵄨󵄨󵄨 Fs ] t

s

𝔼 [∫ |f(r)|2 dr − ∫ |f(r)|2 dr 0 [0

󵄨󵄨 󵄨󵄨 F ] . 󵄨󵄨 s 󵄨 ]

b) Write f ∈ ST in the form (15.13). Then we have for all stopping times τ n

(f ∙ M)τt = (f ∙ M)t∧τ = ∑ f(s j−1 )(M s j ∧τ∧t − M s j−1 ∧τ∧t ) j=1 n

= ∑ f(s j−1 )(M sτj ∧t − M sτj−1 ∧t ) = f ∙ (M τ )t . j=1

If τ takes only finitely many values, we may assume that τ(ω) ∈ {s0 , s1 , . . . , s n } with s j from the representation (15.12) of f . Thus, τ(ω) > s j−1 implies τ(ω) ⩾ s j , and n

f(s)𝟙[0,τ) (s) = ∑ f(s j−1 )𝟙[0,τ) (s)𝟙[s j−1 ,s j ) (s) j=1 n

= ∑ (f(s j−1 )𝟙{τ>s j−1 } ) 𝟙[s j−1 ,τ∧s j ) (s) . j=1

Fs j−1 mble

τ∧s j =s j

252 | 15 Stochastic integrals: L2 -Theory This shows that f 𝟙[0,τ) ∈ ST as well as n

(f 𝟙[0,τ) ) ∙ M t = ∑ f(s j−1 )𝟙{τ>s j−1 } (M s j ∧t − M s j−1 ∧t ) j=1 n

= ∑ f(s j−1 ) (M s j ∧τ∧t − M s j−1 ∧τ∧t ) = f ∙ (M τ )t . j=1

=0, if τ⩽s j−1

c) Because of b) we can assume that t = T, otherwise we replace f and g by f 𝟙[0,t) and g𝟙[0,t) , respectively. Refining the partitions, if necessary, f, g ∈ ST can have the same support points 0 = s0 < s1 < ⋅ ⋅ ⋅ < s n ⩽ T in their representations (15.12). Since we know that f(s j )𝟙F = g(s j )𝟙F , we get n

(f ∙ M T )𝟙F = ∑ f(s j−1 )𝟙F ⋅ (M s j − M s j−1 ) j=1 n

= ∑ g(s j−1 )𝟙F ⋅ (M s j − M s j−1 ) = (g ∙ M T )𝟙F . j=1

15.3 Extension of the stochastic integral to L 2T We have seen in the previous section that the map f 󳨃→ f ∙ B is a linear isometry from ST to M2,c T . Therefore, we can use standard arguments to extend this mapping to the closure of ST with respect to the norm in L2 (λ T ⊗ ℙ). As before, most results remain valid for T = ∞ and [0, ∞) but, for simplicity, we restrict ourselves to the case T < ∞. 1

2,c 2 2 2 15.12 Lemma. Let M ∈ M2,c T . Then ‖M‖MT := (𝔼 [sups⩽T |M s | ]) is a norm on MT

and M2,c T is closed under this norm. Moreover,

𝔼 [ sup |M s |2 ] ≍ 𝔼 [|M T |2 ] = sup 𝔼 [|M s |2 ] ; s⩽T

s⩽T

(15.18)

(“≍” indicates a two-sided comparison).

Proof. By Doob’s maximal inequality A.10 we get that (A.14)

(A.14)

𝔼 [|M T |2 ] ⩽ 𝔼 [ sup |M s |2 ] ⩽ 4 sup 𝔼 [|M s |2 ] = 4 𝔼 [|M T |2 ] . s⩽T

Ex. 15.4

s⩽T

|2 )

This proves (15.18) since (|M t t⩽T is a submartingale, i.e. t 󳨃→ 𝔼(|M t |2 ) is increasing. Since ‖ ⋅ ‖M2T is the L2 (Ω, ℙ; C[0, T])-norm, i.e. the L2 (ℙ) norm (of the supremum norm) of C[0, T]-valued random variables, it is clear that it is again a norm.

2,c 2 We will now show that M2,c T is closed. Let N n ∈ MT with limn→∞ ‖N n − N‖MT = 0. In particular we see that for a subsequence n(j)

lim sup |N n(j) (s, ω) − N(s, ω)| = 0 for almost all ω.

j→∞ s⩽T

15.3 Extension of the stochastic integral to L2T

| 253

This shows that s 󳨃→ N(s, ω) is continuous. Since L2 (ℙ)-limn→∞ N n (s) = N(s) for every s ⩽ T, we find ∫ N(s) dℙ = lim ∫ N n (s) dℙ n→∞

F

F

martin-

=

gale

lim ∫ N n (T) dℙ = ∫ N(T) dℙ

n→∞

F

F

for all s ⩽ T and F ∈ Fs . Thus, N is a martingale, and N ∈ M2,c T .

Itô’s isometry (15.16) allows us to extend the stochastic integral from the simple integrands ST ⊂ L2 (λ T ⊗ ℙ) to the closure ST in L2 (λ T ⊗ ℙ).

15.13 Definition. By L2T we denote the closure ST of the simple processes ST with respect to the norm in L2 (λ T ⊗ ℙ). Using (15.16) we find for any sequence (f n )n⩾1 ⊂ ST with limit f ∈ L2T that T

‖f n ∙ B T − f m ∙ B T ‖2L2 (ℙ) = 𝔼 [∫ |f n (s, ⋅) − f m (s, ⋅)|2 ds] [0 ] = ‖f n − f m ‖2L2 (λ T ⊗ℙ) 󳨀󳨀󳨀󳨀󳨀󳨀󳨀→ 0. m,n→∞

Therefore, for each 0 ⩽ t ⩽ T, the limit

L2 (ℙ)- lim f n ∙ B t

(15.19)

n→∞

exists and does not depend on the approximating sequence. Because of Lemma 15.12

Ex. 15.5

M2,c T .

the convergence (15.19) is locally uniform in t, defining an element in We will see later on (cf. Section 15.5) that L2T is the set of all (equivalence classes of) processes from L2 (λ T ⊗ ℙ) which have an progressively measurable representative.

15.14 Definition. Let (B t )t⩾0 be a BM1 and let f ∈ L2T . Then the stochastic integral (or Itô integral) is defined as t

f ∙ B t = ∫ f(s) dB s := L2 (ℙ)- lim f n ∙ B t , 0

n→∞

0 ⩽ t ⩽ T,

(15.20)

where (f n )n⩾1 ⊂ ST is any sequence approximating f ∈ L2T in L2 (λ T ⊗ ℙ).

The stochastic integral is constructed in such a way that all properties from Theorems 15.10 and 15.11 carry over to integrands from f ∈ L2T . 15.15 Theorem. Let (B t )t⩾0 be a BM1 , (Ft )t⩾0 be an admissible filtration and f ∈ L2T . Then we have for all t ⩽ T 2 a) f 󳨃→ f ∙ B is a linear map from L2T into M2,c T , i.e. (f ∙ B t )t⩽T is a continuous L martingale;

Ex. 15.7 Ex. 15.8 Ex. 15.12 Ex. 15.13

254 | 15 Stochastic integrals: L2 -Theory b) (f ∙ B)2 −⟨f ∙ B⟩ is a martingale where ⟨f ∙ B⟩t := f 2 ∙⟨B⟩t = ∫0 |f(s)|2 ds;² in particular, t

t

󵄨 𝔼 [(f ∙ B t − f ∙ B s )2 󵄨󵄨󵄨 Fs ] = 𝔼 [∫ |f(r)|2 dr [s

c) Itô’s isometry – L2 -continuity.

T

󵄨󵄨 󵄨󵄨 Fs ] , 󵄨󵄨 ]

(15.21)

s < t ⩽ T.

‖f ∙ B T ‖2L2 (ℙ) = 𝔼 [(f ∙ B)2T ] = 𝔼 [∫ |f(s)|2 ds] = ‖f‖2L2 (λ T ⊗ℙ) . [0 ]

(15.22)

d) Maximal inequalities. T

t

2

T

𝔼 [∫ |f(s)|2 ds] ⩽ 𝔼 [sup ( ∫ f(s) dB s ) ] ⩽ 4 𝔼 [∫ |f(s)|2 ds] . t⩽T 0 [0 ] [ ] [0 ]

(15.23)

e) Localization in time. Let τ be an Ft stopping time. Then

f)

(f ∙ B)τt = f ∙ (B τ )t = (f 𝟙[0,τ) ) ∙ B t

for all t ⩽ T.

Localization in space. If f, g ∈ L2T are integrands such that f(s, ω) = g(s, ω) for all (s, ω) ∈ [0, t] × F where t ⩽ T and F ∈ F∞ , then (f ∙ B s )𝟙F = (g ∙ B s )𝟙F ,

0 ⩽ s ⩽ t.

Proof. Throughout the proof we fix f ∈ L2T and some sequence f n ∈ ST , such that L2 (λ T ⊗ ℙ)-limn→∞ f n = f . Clearly, all statements are true for f n ∙ B, and we only have to verify that the L2 limits preserve them.

a) follows from Theorem 15.11, Lemma 15.12 (which shows that the L2 -limit is locally uniform in t, preserving continuity) and the linearity of L2 limits. b) Note that

t

(f n ∙ B t )2 − ⟨f n ∙ B⟩t = (f n ∙ B t )2 − ∫ |f n (s)|2 ds 0

is a sequence of martingales, cf. Theorem 15.10.b). Note that t

t

󵄨 󵄨 |⟨f n ∙ B⟩t − ⟨f ∙ B⟩t | ⩽ ∫ 󵄨󵄨󵄨|f n (s)|2 − |f(s)|2 󵄨󵄨󵄨 ds = ∫ |f n (s) + f(s)| ⋅ |f n (s) − f(s)| ds 0

t

⩽ (∫ |f n (s) + f(s)|2 ds) 0

0 1/2

1/2

t

(∫ |f n (s) − f(s)|2 ds)

.

0

2 Observe that we have defined the angle bracket ⟨•⟩ only for discrete martingales. Therefore we have to define the quadratic variation here. For each fixed t this new definition is in line with 15.10.b).

15.3 Extension of the stochastic integral to L2T

| 255

Taking expectations on both sides reveals that the second expression on the right-hand side tends to zero while the first bracket stays uniformly bounded. This follows from our assumption as L2 (λ T ⊗ ℙ)-limn→∞ f n = f . With essentially the same calculation we get that L2 (ℙ)-limn→∞ f n ∙ B t = f ∙ B t implies L1 (ℙ)-limn→∞ (f n ∙ B t )2 = (f ∙ B t )2 . As L1 -limits preserve martingales, we get that (f ∙ B)2 − f 2 ∙ ⟨B⟩ is again a martingale. Finally, noting that f ∙ B is a martingale, the technique used in the proof of Lemma 15.2 applies and yields (15.21). c) is a consequence of the same equalities in Theorem 15.10 and the completion. d) Apply Lemma 15.12 to the L2 martingale (f ∙ B t )t⩽T and use c): t

2

(15.18)

2

T

T

c) 𝔼 [sup ( ∫ f(s) dB s ) ] ≍ 𝔼 [( ∫ f(s) dB s ) ] = 𝔼 [∫ |f(s)|2 ds] . t⩽T 0 [ ] [ 0 ] [0 ]

e) From the proof of Theorem 15.11.b) we know that f n ∙ (B τ )t = (f n ∙ B)τt holds for all stopping times τ and f n ∈ ST . Using the maximal inequality d) we find 󵄨 󵄨2 󵄨 󵄨2 𝔼 [sup 󵄨󵄨󵄨f n ∙ (B τ )s − f m ∙ (B τ )s 󵄨󵄨󵄨 ] = 𝔼 [sup 󵄨󵄨󵄨f n ∙ B s∧τ − f m ∙ B s∧τ 󵄨󵄨󵄨 ] s⩽T

s⩽T

󵄨 󵄨2 ⩽ 𝔼 [sup 󵄨󵄨󵄨f n ∙ B s − f m ∙ B s 󵄨󵄨󵄨 ] s⩽T

T

d)

󵄨 󵄨2 ⩽ 4 𝔼 [∫ 󵄨󵄨󵄨f n (s) − f m (s)󵄨󵄨󵄨 ds] [0 ]

󳨀󳨀󳨀󳨀󳨀󳨀󳨀→ 0. m,n→∞

Since M2,c T is complete, we conclude that

M2,c T

(f n ∙ B)τ = f n ∙ (B τ ) 󳨀󳨀󳨀󳨀→ f ∙ (B τ ). n→∞

On the other hand, limn→∞ f n ∙ B = f ∙ B, hence limn→∞ (f n ∙ B)τ = (f ∙ B)τ . This gives the first equality. For the second equality we assume that τ is a discrete stopping time. Then we know from Theorem 15.11.b) M2,c T

(f n 𝟙[0,τ) ) ∙ B = (f n ∙ B)τ 󳨀󳨀󳨀󳨀→ (f ∙ B)τ . n→∞

Ex. 15.15

256 | 15 Stochastic integrals: L2 -Theory The maximal inequality d) gives 󵄨2 󵄨 𝔼 [ sup󵄨󵄨󵄨(f n 𝟙[0,τ) ) ∙ B s − (f 𝟙[0,τ) ) ∙ B s 󵄨󵄨󵄨 ] s⩽T

T

󵄨2 󵄨 ⩽ 4 𝔼 [∫ 󵄨󵄨󵄨f n (s) − f(s)󵄨󵄨󵄨 𝟙[0,τ) (s) ds] 󳨀󳨀󳨀󳨀→ 0. n→∞ ⩽1 [0 ]

This shows that f 𝟙[0,τ) ∈ L2T and M2,c T -limn→∞ (f n 𝟙[0,τ) ) ∙ B T = (f 𝟙[0,τ) ) ∙ B T . Finally, let τ j ↓ τ be discrete stopping times, e.g. τ j = (⌊2j τ⌋ + 1)/2j as in Lemma A.16. With calculations similar to those above we get 󵄨 󵄨2 𝔼 [ sup󵄨󵄨󵄨(f 𝟙[0,τ) ) ∙ B s − (f 𝟙[0,τ ) ) ∙ B s 󵄨󵄨󵄨 ] j s⩽T

T

󵄨 󵄨2 ⩽ 4 𝔼 [∫ 󵄨󵄨󵄨f(s)𝟙[0,τ) (s) − f(s)𝟙[0,τ ) (s)󵄨󵄨󵄨 ds] j [0 ] T

󵄨 󵄨2 ⩽ 4 𝔼 [∫ 󵄨󵄨󵄨f(s)𝟙[τ,τ ) (s)󵄨󵄨󵄨 ds] j [0 ]

and the last expression tends to zero as j → ∞ by dominated convergence. Since τj

f ∙ B t → f ∙ B τt is obvious (by continuity of the paths), the assertion follows.

f) Let f n , g n ∈ ST be approximations of f, g ∈ L2T . We will see in the proof of Theorem 15.23 below, that we can choose f n , g n in such a way that Since

f n (s, ω) = g n (s, ω) for all n ⩾ 1

and

(s, ω) ∈ [0, t] × F.

󵄨 󵄨2 󵄨 󵄨2 𝔼 [sup 󵄨󵄨󵄨(f n ∙ B s )𝟙F − (f ∙ B s )𝟙F 󵄨󵄨󵄨 ] ⩽ 𝔼 [sup 󵄨󵄨󵄨f n ∙ B s − f ∙ B s 󵄨󵄨󵄨 ] s⩽t

s⩽t

d)

t

⩽ 4 𝔼 [∫ |f n (s) − f(s)|2 ds] , [0 ]

we see that the corresponding assertion for step functions, Theorem 15.11.c), remains valid for the L2 limit.

15.4 Evaluating Itô integrals Using integrands from ST allows us to calculate only very few Itô integrals. For example, T up to now we have the natural formula ∫0 ξ ⋅𝟙[a,b) (s) dB s = ξ ⋅(B b − B a ) just for bounded Fa measurable random variables ξ .

15.4 Evaluating Itô integrals | 257

15.16 Lemma. Let (B t )t⩾0 be a BM1 , 0 ⩽ a < b ⩽ T and ξ ∈ L2 (ℙ) be Fa measurable. Then ξ ⋅ 𝟙[a,b) ∈ L2T and T

∫ ξ ⋅ 𝟙[a,b) (s) dB s = ξ ⋅ (B b − B a ). 0

Proof. Clearly, [(−k) ∨ ξ ∧ k]𝟙[a,b) ∈ ST for every k ⩾ 1. Dominated convergence yields T

󵄨 󵄨2 󵄨 󵄨2 𝔼 [∫ 󵄨󵄨󵄨(−k) ∨ ξ ∧ k − ξ 󵄨󵄨󵄨 𝟙[a,b) (s) ds] = (b − a) 𝔼 [󵄨󵄨󵄨(−k) ∨ ξ ∧ k − ξ 󵄨󵄨󵄨 ] 󳨀󳨀󳨀󳨀→ 0, k→∞ [0 ]

i.e. ξ ⋅ 𝟙[a,b) ∈ L2T . By the definition of the Itô integral we find T

T

∫ ξ ⋅ 𝟙[a,b) (s) dB s = L2 (ℙ)- lim ∫ ((−k) ∨ ξ ∧ k)𝟙[a,b) (s) dB s k→∞

0

0

= L2 (ℙ)- lim ((−k) ∨ ξ ∧ k)(B b − B a ) k→∞

= ξ ⋅ (B b − B a ).

Let us now discuss an example that shows how to calculate a non-trivial stochastic integral explicitly. 15.17 Example. Let (B t )t⩾0 be a BM1 and T > 0. Then T

∫ B t dB t = 0

1 2 (B − T). 2 T

Ex. 15.16

(15.24)

As a by-product of our proof we will see that (B t )t⩽T ∈ L2T . Note that (15.24) differs from what we expect from calculus. If u(t) is differentiable and u(0) = 0, we get T

T

∫ u(t) du(t) = ∫ u(t)u󸀠 (t) dt = 0

0

1 2

1 1 2 󵄨󵄨󵄨T u (t)󵄨󵄨 = u2 (T), 󵄨t=0 2 2

i.e. we would expect rather than 12 (B2T − T). On the other hand, the stochastic integral must be a martingale, see Theorem 15.15.a), and the correction “−T” turns the submartingale B2T into a bona fide martingale. Let us now verify (15.24). Pick any partition Π = {0 = s0 < s1 < ⋅ ⋅ ⋅ < s n = T} with mesh |Π| = max1⩽j⩽n (s j − s j−1 ) and set B2T

n

f Π (t, ω) := ∑ B(s j−1 , ω)𝟙[s j−1 ,s j ) (t). j=1

Applying Lemma 15.16 n times, we get T

f

Π



L2T

and

n

∫ f Π (t) dB t = ∑ B(s j−1 )(B(s j ) − B(s j−1 )). 0

j=1

258 | 15 Stochastic integrals: L2 -Theory Observe that T

sj

n

󵄨2 󵄨 󵄨2 󵄨 𝔼 [∫ 󵄨󵄨󵄨󵄨f Π (t) − B(t)󵄨󵄨󵄨󵄨 dt] = ∑ 𝔼 [ ∫ 󵄨󵄨󵄨B(s j−1 ) − B(t)󵄨󵄨󵄨 dt] ] ] j=1 [s j−1 [0 sj

n

󵄨2 󵄨 = ∑ ∫ 𝔼 [󵄨󵄨󵄨B(s j−1 ) − B(t)󵄨󵄨󵄨 ] dt j=1 s j−1 sj

n

= ∑ ∫ (t − s j−1 ) dt j=1 s j−1

1 n ∑ (s j − s j−1 )2 2 j=1

=

󳨀󳨀󳨀󳨀󳨀→ 0. |Π|→0

Since L2T is by definition closed (w.r.t. the norm in L2 (λ T ⊗ ℙ)), we conclude that (B t )t⩽T ∈ L2T . By Itô’s isometry (15.22) 2

T

T

󵄨 󵄨2 𝔼 [( ∫ (f Π (t) − B t ) dB t ) ] = 𝔼 [∫ 󵄨󵄨󵄨󵄨f Π (t) − B(t)󵄨󵄨󵄨󵄨 dt] [ 0 ] [0 ] =

Therefore, T

1 n ∑ (s j − s j−1 )2 󳨀󳨀󳨀󳨀󳨀→ 0. |Π|→0 2 j=1

n

∫ B t dB t = L2 (ℙ)- lim ∑ B(s j−1 )(B(s j ) − B(s j−1 )), |Π|→0

0

j=1

i.e. we can approximate the stochastic integral in a concrete way by Riemann–Stieltjes sums. Finally, n

n j−1

j=1

j=1 k=1

B2T = ∑ (B s j − B s j−1 )2 + 2 ∑ ∑ (B s j − B s j−1 )(B s k − B s k−1 ) n

2

n

= ∑ (B s j − B s j−1 ) +2 ∑ (B s j − B s j−1 )B s j−1 j=1

→T, Thm. 9.1

j=1

and, therefore, B2T − T = 2 ∫0 B t dB t . T

T →∫0

B t dB t

L2 (ℙ)

󳨀󳨀󳨀󳨀󳨀→ |Π|→0

T

T − 2 ∫ B t dB t 0

In the calculation of Example 15.17 we use (implicitly) the following result which is interesting on its own.

15.18 Proposition. Let f ∈ L²_T be a process which is, as a function of t, mean-square continuous, i.e.

lim_{[0,T]∋s→t} 𝔼 [|f(s,⋅) − f(t,⋅)|²] = 0,   t ∈ [0,T].

Then

∫₀ᵀ f(t) dB_t = L²(ℙ)-lim_{|Π|→0} ∑_{j=1}^n f(s_{j−1})(B_{s_j} − B_{s_{j−1}})

holds for any sequence of partitions Π = {0 = s_0 < s_1 < ⋯ < s_n = T} with mesh |Π| → 0.

Proof. As in Example 15.17 we consider the following discretization of the integrand f:

f^Π(t,ω) := ∑_{j=1}^n f(s_{j−1},ω) 𝟙_[s_{j−1},s_j)(t).

Since f(t) ∈ L²(ℙ), we can use Lemma 15.16 to see f^Π ∈ L²_T. By Itô’s isometry,

𝔼 [|∫₀ᵀ [f(t) − f^Π(t)] dB_t|²] = ∫₀ᵀ 𝔼 [|f(t) − f^Π(t)|²] dt
    = ∑_{j=1}^n ∫_{s_{j−1}}^{s_j} 𝔼 [|f(t) − f(s_{j−1})|²] dt
    ⩽ ∑_{j=1}^n ∫_{s_{j−1}}^{s_j} sup_{u,v∈[s_{j−1},s_j]} 𝔼 [|f(u) − f(v)|²] dt  →  0  as |Π| → 0,

where the last step uses the mean-square continuity of f. Combining this with Lemma 15.16 finally gives

∑_{j=1}^n f(s_{j−1}) · (B_{s_j} − B_{s_{j−1}}) = ∫₀ᵀ f^Π(t) dB_t  →  ∫₀ᵀ f(t) dB_t  in L²(ℙ) as |Π| → 0.
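Proposition 15.18 can be tried out numerically for a non-trivial mean-square continuous integrand, e.g. f(t) = B_t²; by Exercise 15.16 below, the limit of the Riemann–Itô sums should then be ⅓B_T³ − ∫₀ᵀ B_s ds. A minimal sketch, assuming Python with NumPy, an equidistant partition and illustrative sample sizes:

import numpy as np

rng = np.random.default_rng(2)
T, n, paths = 1.0, 2000, 4000
dt = T / n

dB = rng.normal(0.0, np.sqrt(dt), size=(paths, n))
B = np.cumsum(dB, axis=1)
B_left = np.hstack([np.zeros((paths, 1)), B[:, :-1]])

# left-endpoint Riemann-Ito sums for the mean-square continuous integrand f(t) = B_t^2
lhs = np.sum(B_left ** 2 * dB, axis=1)

# closed form from Exercise 15.16: (1/3) B_T^3 - int_0^T B_s ds  (time integral by a left Riemann sum)
rhs = B[:, -1] ** 3 / 3 - np.sum(B_left, axis=1) * dt

print(np.mean((lhs - rhs) ** 2))   # tends to 0 as the mesh T/n shrinks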

15.19 Example. The following examples show that the notation ∫₀ᵗ X_s dB_s is to be understood in a strictly symbolic sense, i.e. a naive manipulation of expressions outside and inside the integral sign may go terribly wrong. Of course, there are Theorem 15.15.e) and f), but do have a look at the not really straightforward proofs of these results.
a) Similar to Lemma 15.16 consider ∫₀ᵀ B_c 𝟙_[a,b)(s) dB_s for 0 ⩽ a ⩽ c < b ⩽ T. It looks reasonable to pull B_c out from under the integral sign, as it does not depend on s:

I_T := ∫₀ᵀ B_c 𝟙_[a,b)(s) dB_s = B_c ∫₀ᵀ 𝟙_[a,b)(s) dB_s = B_c (B_b − B_a).

This is only possible if c = a. Indeed, taking expectations gives

𝔼 [B_c(B_b − B_a)] = 𝔼(B_c B_b) − 𝔼(B_c B_a) = c ∧ b − c ∧ a = c − a,

and this is 0 if, and only if, c = a. Since the stochastic integral is a martingale, we must have 𝔼 I_T = 𝔼 I_0 = 0. Note that the process B_c(ω)𝟙_[a,b)(t) is Itô integrable w.r.t. dB_t if, and only if, c = a.
b) The process f(s,ω) = 𝟙_(0,∞)(B_s(ω)) is Itô integrable – use L²_T = L²_P(λ_T ⊗ ℙ) (Theorem 15.23) or Proposition 15.18 and the fact that f(s,ω) is mean-square continuous (Exercise 15.11). The (random) set {t ⩾ 0 : B_t(ω) > 0} is open and, therefore, we can write it as a union of countably many disjoint random intervals ⋃_{j⩾0} (σ_j(ω), τ_j(ω)); clearly, B_{σ_j}(ω) = B_{τ_j}(ω) = 0. Naively, one would expect that

∫₀¹ 𝟙_(σ_j,τ_j)(s) dB_s = B_{1∧τ_j} − B_{1∧σ_j} = B₁⁺ if σ_j < 1, τ_j > 1 and B₁ > 0, and = 0 otherwise.

Therefore,

∫₀¹ 𝟙_(0,∞)(B_s) dB_s = ∫₀¹ ∑_{j⩾0} 𝟙_(σ_j,τ_j)(s) dB_s = ∑_{j⩾0} ∫₀¹ 𝟙_(σ_j,τ_j)(s) dB_s = B₁⁺.

But this is impossible: 𝔼(B₁⁺) > 0, while t ↦ ∫₀ᵗ 𝟙_(0,∞)(B_s) dB_s is a martingale, hence 𝔼 (∫₀¹ 𝟙_(0,∞)(B_s) dB_s) = 0. The problem is, once again, that the integral of the random indicators 𝟙_(σ_j,τ_j)(s) is not well-defined: given 𝟙_(σ_j,τ_j)(s) = 1, τ_j is the first zero of Brownian motion after time s, thus a stopping time; but σ_j, being the last zero of Brownian motion before s, is not a stopping time.
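The contradiction in b) is easy to observe numerically: the left-endpoint Riemann–Itô sums for 𝟙_(0,∞)(B_s) have sample mean close to 0, whereas 𝔼 B₁⁺ = 1/√(2π) ≈ 0.399. A minimal sketch, assuming Python with NumPy (grid and sample sizes are illustrative):

import numpy as np

rng = np.random.default_rng(3)
n, paths = 1000, 10000
dt = 1.0 / n

dB = rng.normal(0.0, np.sqrt(dt), size=(paths, n))
B = np.cumsum(dB, axis=1)
B_left = np.hstack([np.zeros((paths, 1)), B[:, :-1]])

# Ito integral of 1_{(0,oo)}(B_s) over [0,1], approximated by left-endpoint sums
I = np.sum((B_left > 0) * dB, axis=1)

print(I.mean())                         # close to 0: the stochastic integral has expectation 0
print(np.maximum(B[:, -1], 0).mean())   # close to E[B_1^+] = 1/sqrt(2*pi) ~ 0.399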

15.5 What is the closure of S_T?

Let us now see how big L²_T, the closure of S_T, really is. For this we need some preparations.

15.20 Definition. Let (B_t)_{t⩾0} be a BM¹ and (F_t)_{t⩾0} an admissible filtration. The progressive σ-algebra P is the family of all sets Γ ⊂ [0,T] × Ω such that

Γ ∩ ([0,t] × Ω) ∈ B[0,t] ⊗ F_t   for all t ⩽ T.    (15.25)

By L²_P(λ_T ⊗ ℙ) we denote those equivalence classes in L²([0,T] × Ω, λ_T ⊗ ℙ) which have a P measurable representative.

In Lemma 6.25 we have already seen that Brownian motion itself is P measurable.

15.21 Lemma. Let T > 0. Then (L²_P(λ_T ⊗ ℙ), ‖·‖_{L²(λ_T ⊗ ℙ)}) is a Hilbert space.

Proof. By definition, L²_P(λ_T ⊗ ℙ) ⊂ L²(λ_T ⊗ ℙ) is a subspace and L²(λ_T ⊗ ℙ) is a Hilbert space for the norm ‖·‖_{L²(λ_T ⊗ ℙ)}. It is, therefore, enough to show that for any sequence (f_n)_{n⩾0} ⊂ L²_P(λ_T ⊗ ℙ) such that L²(λ_T ⊗ ℙ)-lim_{n→∞} f_n = f, the limit f has a P measurable representative. We may assume that the processes f_n are P measurable. Since f_n converges in L², there exists a subsequence (f_{n(j)})_{j⩾1} such that lim_{j→∞} f_{n(j)} = f λ_T ⊗ ℙ almost everywhere. Therefore, the upper limit ϕ(t,ω) := lim sup_{j→∞} f_{n(j)}(t,ω) is a P measurable representative of f.

Our aim is to show that L²_T = L²_P(λ_T ⊗ ℙ). For this we need a simple deterministic lemma. As before we write L²(λ_T) = L²([0,T], B[0,T], λ).

15.22 Lemma. Let ϕ ∈ L²(λ_T) and T > 0. Then

Θ_n[ϕ](t) := ∑_{j=1}^{n−1} ξ_{j−1} 𝟙_[t_{j−1},t_j)(t),    (15.26)

where

t_j = jT/n,  j = 0,…,n,   ξ_0 = 0   and   ξ_{j−1} = (n/T) ∫_{t_{j−2}}^{t_{j−1}} ϕ(s) ds,  j ⩾ 2,

is a sequence of step functions with L²(λ_T)-lim_{n→∞} Θ_n[ϕ] = ϕ.

Proof. If ϕ is a step function, it has only finitely many jumps, say σ_1 < σ_2 < ⋯ < σ_m, and set σ_0 := 0. Choose n so large that each interval [t_{j−1}, t_j) contains at most one jump σ_k. From the definition of Θ_n[ϕ] it is easy to see that ϕ and Θ_n[ϕ] differ exactly in the first, the last and in those intervals [t_{j−1}, t_j) which contain some σ_k or whose predecessor [t_{j−2}, t_{j−1}) contains some σ_k. Formally,

ϕ(t) = Θ_n[ϕ](t)   for all   t ∉ ⋃_{j=0}^m [σ_j − T/n, σ_j + 2T/n) ∪ [0, T/n) ∪ [(n−1)T/n, T].

This is shown in Figure 15.1. Therefore, lim_{n→∞} Θ_n[ϕ] = ϕ Lebesgue almost everywhere and in L²(λ_T).

Fig. 15.1: Approximation of a simple function by a left continuous simple function.

Since the step functions are dense in L²(λ_T), we get for every ϕ ∈ L²(λ_T) a sequence of step functions with L²(λ_T)-lim_{k→∞} ϕ_k = ϕ. Now let Θ_n[ϕ] and Θ_n[ϕ_k] be the step functions defined in (15.26). Then Θ_n[⋅] is a contraction,

∫₀ᵀ |Θ_n[ϕ](s)|² ds = ∑_{j=1}^{n−1} ξ_{j−1}² (t_j − t_{j−1})
    = ∑_{j=2}^{n−1} (t_j − t_{j−1}) [ (1/(t_{j−1} − t_{j−2})) ∫_{t_{j−2}}^{t_{j−1}} ϕ(s) ds ]²
    ⩽ ∑_{j=2}^{n−1} ((t_j − t_{j−1})/(t_{j−1} − t_{j−2})) ∫_{t_{j−2}}^{t_{j−1}} |ϕ(s)|² ds      (Jensen)
    ⩽ ∫₀ᵀ |ϕ(s)|² ds,

since (t_j − t_{j−1})/(t_{j−1} − t_{j−2}) = 1, and, by the triangle inequality and the contractivity of Θ_n[⋅],

‖ϕ − Θ_n[ϕ]‖_{L²} ⩽ ‖ϕ − ϕ_k‖_{L²} + ‖ϕ_k − Θ_n[ϕ_k]‖_{L²} + ‖Θ_n[ϕ_k] − Θ_n[ϕ]‖_{L²}
    = ‖ϕ − ϕ_k‖_{L²} + ‖ϕ_k − Θ_n[ϕ_k]‖_{L²} + ‖Θ_n[ϕ_k − ϕ]‖_{L²}
    ⩽ ‖ϕ − ϕ_k‖_{L²} + ‖ϕ_k − Θ_n[ϕ_k]‖_{L²} + ‖ϕ_k − ϕ‖_{L²}.

Letting n → ∞ (with k fixed) we get lim sup_{n→∞} ‖ϕ − Θ_n[ϕ]‖_{L²} ⩽ 2‖ϕ_k − ϕ‖_{L²}; now k → ∞ shows that the whole expression tends to zero.
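The operator Θ_n from (15.26) is easy to implement. The following sketch — assuming Python with NumPy, a numerical quadrature for the averages ξ_{j−1}, and an arbitrary test function ϕ ∈ L²(λ_T) chosen for illustration — shows the L²(λ_T) convergence Θ_n[ϕ] → ϕ.

import numpy as np

def theta_n(phi, T, n):
    """Discretization operator from (15.26): piecewise constant on [t_{j-1}, t_j),
    t_j = jT/n, with xi_0 = 0 and xi_{j-1} = (n/T) * int_{t_{j-2}}^{t_{j-1}} phi(s) ds for j >= 2."""
    t = np.linspace(0.0, T, n + 1)
    xi = np.zeros(n)                       # xi_0 = 0
    for j in range(2, n + 1):              # xi_{j-1} averages phi over the *previous* interval
        s = np.linspace(t[j - 2], t[j - 1], 200)
        xi[j - 1] = np.trapz(phi(s), s) * n / T
    return lambda u: xi[np.minimum((np.asarray(u) * n / T).astype(int), n - 1)]

T = 1.0
phi = lambda s: np.sin(2 * np.pi * s)      # any phi in L^2(lambda_T)
s = np.linspace(0.0, T, 20000)
for n in (10, 100, 1000):
    err = np.trapz((phi(s) - theta_n(phi, T, n)(s)) ** 2, s)
    print(n, err)                          # L^2(lambda_T) error decreases as n grows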

15.23 Theorem. The L²(λ_T ⊗ ℙ) closure of S_T is L²_P(λ_T ⊗ ℙ).

Proof. Write L²_T for the closure of S_T. Obviously, S_T ⊂ L²_P(λ_T ⊗ ℙ). By Lemma 15.21, the set L²_P(λ_T ⊗ ℙ) is closed under the norm ‖·‖_{L²_T}, thus L²_T ⊂ L²_P(λ_T ⊗ ℙ).

All that remains to be shown is L²_P(λ_T ⊗ ℙ) ⊂ L²_T. Let f(t,ω) be (the progressive representative of) a process in L²_P(λ_T ⊗ ℙ). By dominated convergence we see that the truncated process f_C := (−C) ∨ f ∧ C converges in L²(λ_T ⊗ ℙ) to f as C → ∞. Therefore, we can assume that f is bounded. Define for each ω a simple process Θ_n[f](t,ω), where Θ_n is as in Lemma 15.22, i.e.

ξ_{j−1}(ω) := (n/T) ∫_{t_{j−2}}^{t_{j−1}} f(s,ω) ds  if the integral exists, and ξ_{j−1}(ω) := 0 else.

Note that the set of ω for which the integral does not exist is an F_{t_{j−1}} measurable null set, since f(⋅,ω) is P measurable and

𝔼 [∫_{t_{j−2}}^{t_{j−1}} |f(s,⋅)|² ds] ⩽ 𝔼 [∫₀ᵀ |f(s,⋅)|² ds] < ∞,   j ⩾ 2.

Thus, Θ_n[f] ∈ S_T. By the proof of Lemma 15.22, we have for ℙ almost all ω

∫₀ᵀ |Θ_n[f](t,ω)|² dt ⩽ ∫₀ᵀ |f(t,ω)|² dt,    (15.27)

∫₀ᵀ |Θ_n[f](t,ω) − f(t,ω)|² dt → 0  as n → ∞.    (15.28)

If we take expectations in (15.28), we get by dominated convergence – the majorant is guaranteed by (15.27) – that L²(λ_T ⊗ ℙ)-lim_{n→∞} Θ_n[f] = f, thus f ∈ L²_T.

15.6 White noise integrals

Let us now look at the Itô integral from a slightly different angle which is closer, in spirit, to the theory of Lebesgue integration. Throughout this chapter, (Ω, A, ℙ) is a probability space carrying a BM¹ (B_t)_{t⩾0} with admissible filtration (F_t)_{t⩾0}.

Recall that a semi-ring R_0 is a family of sets such that ∅ ∈ R_0, for all R, S ∈ R_0 one has R ∩ S ∈ R_0, and R \ S can be represented as a finite union of disjoint sets from R_0, cf. [232, Chapter 6] or [235, Chapter 5]. We are mainly interested in two semi-rings:
i) the finite intervals R_0 = J_0 = {(s,t] : 0 ⩽ s < t < ∞} in [0,∞); we use on J_0 the one-dimensional Lebesgue measure μ = λ;
ii) the “predictable” rectangles R_0 = Q_0 = {(s,t] × F : 0 ⩽ s < t < ∞, F ∈ F_s} on [0,∞) × Ω; we use on Q_0 the product measure μ = λ ⊗ ℙ.

15.24 Definition. Let R_0 be a semi-ring. A random orthogonal measure with control measure μ is a family of random variables N(ω,R) ∈ ℝ, R ∈ R_0, such that

𝔼 [N²(⋅,R)] < ∞   for all R ∈ R_0,    (15.29)
𝔼 [N(⋅,R) N(⋅,S)] = μ(R ∩ S)   for all R, S ∈ R_0.    (15.30)

The following lemma explains why we call N(R) = N(ω,R) a random measure.

15.25 Lemma. The random set function R ↦ N(R) = N(ω,R), R ∈ R_0, is countably additive in L²(ℙ), i.e.

N (⋃_{n=1}^∞ R_n) = L²-lim_{n→∞} ∑_{k=1}^n N(R_k)    (15.31)

for every sequence (R_n)_{n⩾1} ⊂ R_0 of mutually disjoint sets such that R := ⋃_{n=1}^∞ R_n ∈ R_0. In particular, N(∅) = 0 a.s., and N(R ∪ S) = N(R) + N(S) a.s. for disjoint R, S ∈ R_0 such that R ∪ S ∈ R_0 (notice that the exceptional set may depend on the sets R, S).

Proof. From R = S = ∅ and 𝔼 [N²(∅)] = μ(∅) = 0 we see that N(∅) = 0 a.s. It is enough to prove (15.31), as finite additivity follows if we take (R_1, R_2, R_3, R_4, …) = (R, S, ∅, ∅, …).

If R_n ∈ R_0 are mutually disjoint sets such that R := ⋃_{n=1}^∞ R_n ∈ R_0, then

𝔼 [(N(R) − ∑_{k=1}^n N(R_k))²]
    = 𝔼 N²(R) + ∑_{k=1}^n 𝔼 N²(R_k) − 2 ∑_{k=1}^n 𝔼 [N(R) · N(R_k)] + ∑_{j≠k, j,k=1}^n 𝔼 [N(R_j) · N(R_k)]
    = μ(R) − ∑_{k=1}^n μ(R_k)   →  0  as n → ∞,      (by (15.30))

where we use the σ-additivity of the measure μ.

15.26 Example. a) White noise. The random set function

N(ω, (s,t]) := B_t(ω) − B_s(ω),   (s,t] ∈ J_0,

is a random orthogonal measure with control measure λ. This follows at once from

𝔼 [(B_t − B_s)(B_v − B_u)] = (t ∧ v − s ∨ u)⁺ = λ[(s,t] ∩ (u,v]]

for all 0 ⩽ s < t < ∞ and 0 ⩽ u < v < ∞.

Attention: N is not σ-additive for fixed ω. To see this, take R_n := (1/(n+1), 1/n] for all n ⩾ 1, and observe that ⋃_n R_n = (0,1]. Since B has stationary and independent increments, and scales like B_t ∼ √t B_1, we have

𝔼 (exp [− ∑_{n=1}^∞ |N(R_n)|]) = 𝔼 (exp [− ∑_{n=1}^∞ |B_{1/(n+1)} − B_{1/n}|])
    = ∏_{n=1}^∞ 𝔼 (exp [−(n(n+1))^{−1/2} |B_1|])
    ⩽ ∏_{n=1}^∞ α^{(n(n+1))^{−1/2}},   α := 𝔼 [e^{−|B_1|}] ∈ (0,1),

where the last estimate follows from Jensen’s inequality. As the series ∑_{n=1}^∞ (n(n+1))^{−1/2} diverges, we get 𝔼 (exp [− ∑_{n=1}^∞ |N(R_n)|]) = 0, which means that ∑_{n=1}^∞ |N(ω,R_n)| = ∞ for almost all ω. This shows that N(ω,⋅) cannot be countably additive. Indeed, countable additivity implies that the series

N (ω, ⋃_{n=1}^∞ R_n) = ∑_{n=1}^∞ N(ω, R_n)

converges. The left-hand side, hence the sum, does not depend on rearrangements. This entails absolute convergence of the series ∑_{n=1}^∞ |N(ω,R_n)| < ∞, which does not hold, as we have seen above.

b) White noise 2. The random measure in a) leads to the Paley–Wiener–Zygmund integral, cf. Paragraph 13.5. In order to integrate random integrands, we need a slightly more elaborate construction. Consider the semi-ring of predictable rectangles Q_0, cf. ii) on p. 263. The random set function

N(ω, (s,t] × F) := (B_t(ω) − B_s(ω)) 𝟙_F(ω),   (s,t] × F ∈ Q_0,

is a random orthogonal measure with control measure μ = λ ⊗ ℙ. In order to see this, take (s,t] × F, (u,v] × G ∈ Q_0; we distinguish between two cases.

Case 1: (s,t] ∩ (u,v] = ∅, i.e. s ∨ u ⩾ t ∧ v. Without loss of generality we can assume that s < t ⩽ u < v. In this case the random variable (B_t − B_s)𝟙_F 𝟙_G is F_u measurable and independent of the increment B_v − B_u, so

𝔼 [N((s,t] × F) · N((u,v] × G)] = 𝔼 [(B_t − B_s)𝟙_F 𝟙_G] · 𝔼 [B_v − B_u] = 0,

which coincides with the value of the control measure

μ ([(s,t] × F] ∩ [(u,v] × G]) = μ(∅) = 0.

Case 2: (s,t] ∩ (u,v] ≠ ∅, i.e. s ∨ u < t ∧ v. Since the times s, t, u, v play symmetric roles, we can assume that (s,t] is either contained in (u,v], i.e. u ⩽ s < t ⩽ v, or “left” of (u,v] with a proper intersection, i.e. s < u < t < v. The calculations for both cases being similar, we will check only the latter. We have

𝔼 [N((s,t] × F) · N((u,v] × G)]
    = 𝔼 [((B_t − B_u)𝟙_F + (B_u − B_s)𝟙_F) · ((B_v − B_t)𝟙_G + (B_t − B_u)𝟙_G)].

We multiply out and see, as in the first case, that increments over disjoint intervals do not contribute. Thus,

𝔼 [N((s,t] × F) · N((u,v] × G)] = 𝔼 [(B_t − B_u)² 𝟙_F 𝟙_G] = 𝔼 [(B_t − B_u)²] ℙ(F ∩ G),

since F, G ∈ F_{s∨u} and B_t − B_u is independent of F_{s∨u}. The right-hand side is, however,

(t − u) ℙ(F ∩ G) = λ((s,t] ∩ (u,v]) · ℙ(F ∩ G) = μ ([(s,t] × F] ∩ [(u,v] × G]),

finishing our argument.
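The orthogonality relation (15.30) for the white noise measure on predictable rectangles can also be checked by Monte Carlo simulation. The sketch below — assuming Python with NumPy and the particular (illustrative) choice s = 0.2, t = 0.6, u = 0.4, v = 1, F = {B_s > 0} ∈ F_s, G = {B_u > 0} ∈ F_u — compares an empirical estimate of 𝔼[N((s,t]×F) N((u,v]×G)] with λ((s,t] ∩ (u,v]) · ℙ(F ∩ G).

import numpy as np

rng = np.random.default_rng(4)
paths = 500000

# simulate B at the times 0.2, 0.4, 0.6, 1.0 via independent increments
d1 = rng.normal(0, np.sqrt(0.2), paths)   # B_0.2
d2 = rng.normal(0, np.sqrt(0.2), paths)   # B_0.4 - B_0.2
d3 = rng.normal(0, np.sqrt(0.2), paths)   # B_0.6 - B_0.4
d4 = rng.normal(0, np.sqrt(0.4), paths)   # B_1.0 - B_0.6
B02, B04 = d1, d1 + d2
B06, B10 = B04 + d3, B04 + d3 + d4

s, t, u, v = 0.2, 0.6, 0.4, 1.0
F = B02 > 0                    # F is F_s measurable
G = B04 > 0                    # G is F_u measurable

N1 = (B06 - B02) * F           # N((s,t] x F)
N2 = (B10 - B04) * G           # N((u,v] x G)

print(np.mean(N1 * N2))                           # Monte Carlo estimate of E[N(...)N(...)]
print((min(t, v) - max(s, u)) * np.mean(F & G))   # lambda((s,t] cap (u,v]) * P(F cap G)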

We will now define a stochastic integral in the spirit of Itô’s original construction.

15.27 Definition. A simple process is a step function of the form

f(x) = ∑_{k=1}^n c_k 𝟙_{R_k}(x),   n ∈ ℕ, c_k ∈ ℝ, R_k ∈ R_0.    (15.32)

Although (15.32) looks deterministic, it can be random. This depends on the choice of the semi-ring. If R_0 = J_0, f(x) = f(t) is, indeed, deterministic; if R_0 = Q_0, we have f(x) = f(t,ω), which is random.

Intuitively, ∑_{k=1}^n c_k N(R_k) should be the stochastic integral of a simple process f. The only problem is the well-definedness. Since a random orthogonal measure is a.s. finitely additive, the following lemma has exactly the same proof as the usual well-definedness result for the Lebesgue integral of a step function, see e.g. [232, Lemma 9.1] or [235, Lemma 8.1]; note that finite unions of null sets are again null sets.

15.28 Lemma. Let f be a simple process and assume that f = ∑_{k=1}^n c_k 𝟙_{R_k} = ∑_{j=1}^m b_j 𝟙_{S_j} has two representations as a step function. Then

∑_{k=1}^n c_k N(R_k) = ∑_{j=1}^m b_j N(S_j)   a.s.

15.29 Definition. Let N(R), R ∈ R_0, be a random orthogonal measure with control measure μ. The stochastic integral of a simple process f given by (15.32) is the random variable³

N(ω,f) := ∑_{k=1}^n c_k N(ω, R_k).    (15.33)

The following properties of the stochastic integral are almost immediate from the definition.

15.30 Lemma. Let N(R), R ∈ R_0, be a random orthogonal measure with control measure μ, let f, g be simple processes, and α, β ∈ ℝ.
a) N(𝟙_R) = N(R) for all R ∈ R_0;
b) S ↦ N(𝟙_S) extends N uniquely to the sets S ∈ R from the ring generated by R_0;⁴
c) N(αf + βg) = αN(f) + βN(g);   (linearity)
d) 𝔼 [N²(f)] = ∫ f² dμ.   (Itô’s isometry)

Proof. The properties a) and c) are clear. For b) we note that R can be constructed from R_0 by adding all possible finite unions of (disjoint) sets (see e.g. [232, Proof of Theorem 6.1, Step 2] or [235, Proof of Theorem 5.2, Step 2]). In order to see d), we use (15.32) and the orthogonality relation 𝔼 [N(R_j)N(R_k)] = μ(R_j ∩ R_k) to get

𝔼 [N(f)²] = ∑_{j,k=1}^n c_j c_k 𝔼 [N(R_j) · N(R_k)] = ∑_{j,k=1}^n c_j c_k μ(R_j ∩ R_k)
    = ∫ ∑_{j,k=1}^n c_j 𝟙_{R_j}(x) c_k 𝟙_{R_k}(x) μ(dx) = ∫ f²(x) μ(dx).

³ In abuse of notation we use here ρ(f) for the ρ-integral ∫ f dρ, i.e. we identify the measure and the integral operator. This is quite common in potential theory.
⁴ A ring is a family of sets which contains ∅ and is stable under unions and differences of finitely many sets. Since R ∩ S = R \ (R \ S), it is automatically stable under finite intersections. The ring generated by R_0 is the smallest ring containing R_0.

Itô’s isometry now allows us to extend the stochastic integral to the L²(μ)-closure of the simple processes: L²(E, σ(R_0), μ). For this take f ∈ L²(E, σ(R_0), μ) and any approximating sequence (f_n)_{n⩾1} of simple processes, i.e.

lim_{n→∞} ∫ |f − f_n|² dμ = 0.

In particular, (f_n)_{n⩾1} is an L²(μ) Cauchy sequence, and Itô’s isometry shows that the random variables (N(f_n))_{n⩾1} are a Cauchy sequence in L²(ℙ):

𝔼 [(N(f_n) − N(f_m))²] = 𝔼 [N(f_n − f_m)²] = ∫ (f_n − f_m)² dμ  →  0  as m,n → ∞.

Because of the completeness of L²(ℙ), the limit lim_{n→∞} N(f_n) exists and, by a standard argument, it does not depend on the approximating sequence.

15.31 Definition. Let N(R), R ∈ R_0, be a random orthogonal measure with control measure μ. The stochastic integral of a function f ∈ L²(E, σ(R_0), μ) is the random variable

N(ω,f) := ∫ f(x) N(ω,dx) := L²(ℙ)-lim_{n→∞} N(ω, f_n),    (15.34)

where (f_n)_{n⩾1} is any sequence of simple processes which approximate f in L²(μ).

It is immediate from the definition of the stochastic integral that f ↦ ∫ f dN is linear and satisfies Itô’s isometry

𝔼 [(∫ f(x) N(⋅,dx))²] = ∫ f²(x) μ(dx).    (15.35)

Up to now we have not really used the structure of the white noise orthogonal random measure. Let us briefly collect all relevant formulae for this particular case: fix T ∈ (0,∞), recall that λ_T is Lebesgue measure on [0,T],

N(ω, (s,t] × F) = (B_t(ω) − B_s(ω)) 𝟙_F(ω),   s < t ⩽ T, F ∈ F_s,
μ((s,t] × F) = 𝔼 [N²(⋅, (s,t] × F)] = (t − s) ℙ(F),

and for simple processes ϕ(s,ϖ) = ∑_{k=1}^n c_k 𝟙_{(t_{k−1},t_k]}(s) 𝟙_{F_k}(ϖ) and f ∈ L²(μ) which is the limit of simple processes ϕ_n(s,ϖ) → f(s,ϖ):

N(ω,ϕ) = ∫ ϕ(s,ϖ) N(ω, ds, dϖ) = ∑_{k=1}^n c_k 𝟙_{F_k}(ω)(B_{t_k}(ω) − B_{t_{k−1}}(ω)),
∫ f(s,ϖ) N(ω, ds, dϖ) = L²-lim_{n→∞} ∫ ϕ_n(s,ϖ) N(ω, ds, dϖ).

15.32 Definition. Let N((s,t] × F), (s,t] × F ∈ Q_0, 0 ⩽ s ⩽ t ⩽ T, be the white noise measure with control measure λ_T ⊗ ℙ. The stochastic integral depending on the upper limit of the integral of a function f ∈ L²([0,T) × Ω, σ(Q_0), λ_T ⊗ ℙ) is the random variable

N_t(f) := N(f 𝟙_{(0,t]}) = ∫ f(s,ϖ) 𝟙_{(0,t]}(s) N(ω, ds, dϖ),   t ∈ [0,T).    (15.36)

The following two results show that the integral with respect to a white noise measure and the Itô integral coincide.

15.33 Proposition. Let N((s,t] × F) be the white noise measure with control measure λ_T ⊗ ℙ for any T ∈ (0,∞), f ∈ L²(σ(Q_0), λ_T ⊗ ℙ) and (F_t)_{t⩾0} the filtration of the underlying BM¹. Then (N_t(f), F_t)_{t∈[0,T]} is a continuous L² martingale. Moreover, (Ñ_t(f), F_t)_{t∈[0,T]}, where Ñ_t(f) := N_t²(f) − N̂_t(f) and N̂_t(f) := ∫₀ᵗ f²(s,⋅) ds, is also a martingale, i.e. N̂_t(f) is the compensator of N_t²(f).

Proof. We show the assertion for T = ∞ and write λ = λ_T. L²-limits of L² martingales are again L² martingales. Moreover, in view of Lemma 15.12, L²-convergence entails locally uniform L²-convergence, which means that the continuity is also preserved under L²-limits. Thus, it is enough to consider the most simple integrands of the form c𝟙_{(u,v]}(s)𝟙_F(ϖ) where u < v ⩽ T and F ∈ F_u.

Let f(s,ϖ) = c𝟙_{(u,v]}(s)𝟙_F(ϖ). Then

N_t(ω,f) = ∬ c𝟙_{(u,v]}(r)𝟙_{(0,t]}(r)𝟙_F(ϖ) N(ω, dr, dϖ) = c N(ω, (u∧t, v∧t] × F) = c (B_{v∧t}(ω) − B_{u∧t}(ω)) 𝟙_F(ω).

Let s < t. If s ⩽ u, then we can use the tower property, a pull-out argument, and the independence of B_{v∧t} − B_{u∧t} from F_u – note that 𝔼(B_{v∧t} − B_{u∧t} | F_u) = 𝔼(B_{v∧t} − B_{u∧t}) = 0 – to see

𝔼 [N_t(f) | F_s] = 𝔼 [𝔼 (N_t(f) | F_u) | F_s] = c 𝔼 [𝔼 (B_{v∧t} − B_{u∧t} | F_u) 𝟙_F | F_s] = 0.

If s > u, there is no need for the tower property, and the claim follows directly from pull out and independence. The continuity of t ↦ N_t(f) is obvious from the concrete representation through Brownian motion.

A routine argument will show that N_t² − N̂_t is a continuous martingale. Note that

N̂_t(f) = ∫₀ᵗ 𝟙_{(u,v]}(r) dr · c² 𝟙_F = (v∧t − u∧t) · c² 𝟙_F = N̂_t(𝟙_{(u,v]}) · c² 𝟙_F.

The continuity of Ñ_t(f) is clear. Let s < t and assume that s < u. Mimicking the calculation for the first equality in (15.3) and using the tower property we get

𝔼 [𝔼 (N_t²(f) − N_s²(f) | F_u) | F_s] = 𝔼 [𝔼 ([N_t(f) − N_s(f)]² | F_u) | F_s].

By definition, N_t(f) − N_s(f) = N(f 𝟙_{(s,t]}) = N_t(𝟙_{(u,v]∩(s,t]}) · c𝟙_F, and the first factor is independent of F_u. Thus,

𝔼 [N_t²(f) − N_s²(f) | F_s] = 𝔼 [𝔼 (N_t²(f) − N_s²(f) | F_u) | F_s]
    = 𝔼 [𝔼 ([N_t(f) − N_s(f)]² | F_u) | F_s]
    = 𝔼 [N_t²(𝟙_{(u,v]∩(s,t]})] · 𝔼 [c² 𝟙_F | F_s]
    = 𝔼 [(t ∧ v − s ∨ u) c² 𝟙_F | F_s]
    = 𝔼 [N̂_t(f) − N̂_s(f) | F_s],

proving the claim. The calculation for s ⩾ u is exactly the same, but there is, again, no need to use the tower property.
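For the elementary integrand f = c𝟙_{(u,v]}𝟙_F used in the proof one can also check numerically that 𝔼 N_t(f) = 0 and 𝔼[N_t²(f) − N̂_t(f)] = 0 for several values of t, consistent with the martingale and compensator statements above. A minimal sketch, assuming Python with NumPy and the illustrative parameters u = 0.3, v = 0.7, c = 2, F = {B_u < 0}:

import numpy as np

rng = np.random.default_rng(5)
paths = 400000
u, v, c = 0.3, 0.7, 2.0

B_u = rng.normal(0.0, np.sqrt(u), paths)   # B_u
F = B_u < 0                                # F is F_u measurable, P(F) = 1/2

for t in (0.2, 0.5, 0.9):
    # increment B_{v ^ t} - B_{u ^ t}: it vanishes for t <= u and is N(0, min(v,t)-u) otherwise
    inc = rng.normal(0.0, np.sqrt(max(min(v, t) - u, 0.0)), paths)
    N_t = c * inc * F                                        # N_t(f) for f = c 1_{(u,v]} 1_F
    comp = c ** 2 * max(min(v, t) - min(u, t), 0.0) * F      # compensator c^2 (v ^ t - u ^ t) 1_F
    print(t, N_t.mean(), (N_t ** 2 - comp).mean())           # both sample means are ~ 0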

15.34 Lemma. Let N((s,t] × F) be the white noise measure with control measure λ_T ⊗ ℙ for any T ∈ (0,∞), f ∈ L²(σ(Q_0), λ_T ⊗ ℙ) and (F_t)_{t⩾0} the filtration of the underlying BM¹. For simple processes of the form

ϕ(t,ω) = ∑_{j=1}^n ϕ_{j−1}(ω) 𝟙_{(s_{j−1},s_j]}(t),

where n ⩾ 1, 0 = s_0 ⩽ s_1 ⩽ … ⩽ s_n ⩽ T and ϕ_{j−1} is bounded and F_{s_{j−1}} measurable, the stochastic integral is given by

N(ω,ϕ) = ∑_{j=1}^n ϕ_{j−1}(ω) (B_{s_j}(ω) − B_{s_{j−1}}(ω)).

In particular, the approach using the white noise measure leads to the same Riemann–Itô sums for simple integrands (in the sense of Definition 15.9) and, since both the white noise integral and Itô’s integral are obtained through completion in L², we end up with the same integral.

In general, the closures of the left and right continuous simple processes are different: L²(σ(Q_0), λ_T ⊗ ℙ) ⫋ L²_P(λ_T ⊗ ℙ), since the predictable σ-algebra σ(Q_0) is strictly smaller than the progressive σ-algebra P. If we have a complete, right continuous filtration (F_t)_{t⩾0}, and a control measure μ without atoms, the L²-spaces are essentially the same, see also Exercise 15.14: if f ∈ L²_P(λ_T ⊗ ℙ), then f*(t,ω) := lim_{h↓0} (1/h) ∫_{t−h}^t f(s,ω) ds is in L²(σ(Q_0), λ_T ⊗ ℙ) and differs from f(t,ω) only by a null set, see also [113, pp. 45–46].

Proof of Lemma 15.34. Assume first that the random variables ϕ_j are simple random variables of the form ϕ_j = ∑_{i=1}^{m(j)} c_j^i 𝟙_{F_j^i} where F_j^i ∈ F_{s_j}. As we are dealing with finitely many ϕ_j, we may assume that m(j) = m; otherwise we take m := max_j m(j) and “fill up” the missing summands with 0. Then we have

∑_{j=1}^n ϕ_{j−1}(ω) 𝟙_{(s_{j−1},s_j]}(t) = ∑_{i=1}^m ∑_{j=1}^n c_{j−1}^i 𝟙_{(s_{j−1},s_j] × F_{j−1}^i}(t,ω).

Since this is a linear combination of simple integrands, we can use the linearity and then the very definition of the integral to get

N(ϕ) = ∑_{i=1}^m N (∑_{j=1}^n c_{j−1}^i 𝟙_{(s_{j−1},s_j] × F_{j−1}^i}(t,⋅))
     = ∑_{i=1}^m ∑_{j=1}^n c_{j−1}^i (B_{s_j} − B_{s_{j−1}}) 𝟙_{F_{j−1}^i}
     = ∑_{j=1}^n ϕ_{j−1} (B_{s_j} − B_{s_{j−1}}).

If the ϕ_j are not simple, we use the sombrero lemma [232, Theorem 8.8], [235, Theorem 7.11] to approximate any F_{s_j} measurable random variable ϕ_j by simple random variables of the type we have been using. Since the ϕ_j are bounded, the approximation is uniform, hence also in L²(ℙ) [232, Corollary 8.9]/[235, Theorem 7.11]. Denote the approximations of ϕ_j and ϕ by ϕ_j(ℓ) and ϕ(ℓ), respectively. Since we have ϕ_j(ℓ) → ϕ_j and ϕ(ℓ) → ϕ uniformly, Itô’s isometry shows that the equality

N(ϕ(ℓ)) = ∑_{j=1}^n ϕ_{j−1}(ℓ) (B_{s_j} − B_{s_{j−1}})

remains valid in the limit, as both sides converge in L²(ℙ) for ℓ → ∞.

15.35 Further reading. Most books consider stochastic integrals immediately for continuous martingales. A particularly nice and quite elementary presentation is [152]. The approach of [216] is short and mathematically elegant. For discontinuous martingales standard references are [113] who use a point-process perspective, and [211] who starts out with semimartingales as “good” stochastic integrators. [150] and [151] construct the stochastic integral using random orthogonal measures. This leads to a unified treatment of Brownian motion and stationary processes. A good discussion on left vs. right continuous integrands and their completions can be found in [113, Chapter II.1] and [136, Chapter 3.2.A]. [113] Ikeda, Watanabe: Stochastic Differential Equations and Diffusion Processes. [136] Karatzas, Shreve: Brownian Motion and Stochastic Calculus. [150] Krylov: Introduction to the Theory of Diffusion Processes. [151] Krylov: Introduction to the Theory of Random Processes. [152] Kunita: Stochastic Flows and Stochastic Differential Equations. [211] Protter: Stochastic Integration and Differential Equations. [216] Revuz, Yor: Continuous Martingales and Brownian Motion.

Problems

1. Let (M_n, F_n)_{n⩾0} and (N_n, F_n)_{n⩾0} be L² martingales; then (M_n N_n − ⟨M,N⟩_n, F_n)_{n⩾0} is a martingale.

2. Let (M_n, F_n)_{n⩾0} and (N_n, F_n)_{n⩾0} be L² martingales. Show that |⟨M,N⟩_n| ⩽ √⟨M⟩_n √⟨N⟩_n.
   Hint. Try the classical proof of the Cauchy–Schwarz inequality for scalar products.

3. Let f ∈ S_T be a simple process and M ∈ M^{2,c}_T. Show that the definition of the stochastic integral ∫₀ᵀ f(s) dM_s (cf. Definition 15.9) does not depend on the particular representation of the simple process (15.12).

4. Show that ‖M‖_{M²_T} := (𝔼 [sup_{s⩽T} |M_s|²])^{1/2} is a norm in the family M²_T of equivalence classes of L² martingales (M and M̃ are said to be equivalent if, and only if, sup_{0⩽t⩽T} |M_t − M̃_t| = 0 a.s.).

Show that the limit (15.19) does not depend on the approximating sequence.

6.

Let (B t , Ft )t⩾0 be a BM1 and τ a stopping time. Find ⟨B τ ⟩t .

7.

Let (B t )t⩾0 be a BM1 and f, g ∈ L2T . Show that

t 󵄨 a) 𝔼(f ∙ B t ⋅ g ∙ B t | Fs ) = 𝔼 [∫s f(u, ⋅)g(u, ⋅) du 󵄨󵄨󵄨 Fs ] if f, g vanish on [0, s]. 󵄨 b) 𝔼 [f ∙ B t 󵄨󵄨󵄨 Fs ] = 0 if f vanishes on [0, s].

c)

f(t, ω) = 0 for all ω ∈ A ∈ FT and all t ⩽ T implies f ∙ B(ω) = 0 for all ω ∈ A.

8. Let (B t )t⩾0 be a BM1 and (f n )n⩾1 a sequence in L2T such that f n → f in L2 (λ T ⊗ ℙ). Then f n ∙ B → f ∙ B in M2,c T . 9.

Let (B t )t⩾0 be a BM1 and f ∈ L2 (λ T ⊗ ℙ) for some T > 0. Assume that the limit limϵ→0 f(ϵ) = f(0) exists in L2 (ℙ). Show that ϵ

1 ℙ ∫ f(s) dB s 󳨀󳨀󳨀→ f(0). ϵ→0 Bϵ 0

1

10. Let (B t )t⩾0 be a BM , (Y t )t⩾0 and adapted process such that Y t Y s = Y t for all s ⩽ t and (X t )t⩾0 ∈ L2T . Show that t

t

Y t ∫ X s dB s = Y t ∫ Y s X s dB s . 0

0

11. Let (B t )t⩾0 be a BM1 . Show that (0, ∞) ∋ s 󳨃→ f(s, ω) := 𝟙(0,∞) (B s ) is continuous in L2 (ℙ).

12. Use the fact that any continuous, square-integrable martingale with bounded variation paths is constant (cf. Proposition 17.2) to show the following: t ⟨f ∙ B⟩t := ∫0 |f(s)|2 ds is the unique continuous and increasing process such that (f ∙ B)2 − f 2 ∙ ⟨B⟩ is a martingale. 13. The quadratic covariation of two continuous L2 martingales M, N ∈ M2,c T is defined as in the discrete case by the polarization formula (15.8). t Let (B t )t⩾0 be a BM1 and f, g ∈ L2T . Show that ⟨f ∙ B, g ∙ B⟩t = ∫0 f(s)g(s) ds for all t ∈ [0, T].

272 | 15 Stochastic integrals: L2 -Theory 14. Use Theorem 15.15.c) to show that the stochastic integrals for the right and left continuous simple processes f(t, ω) := ∑nj=1 ϕ j−1 (ω)𝟙[s j−1 ,s j ) (t) and g(t, ω) := ∑nj=1 ϕ j−1 (ω)𝟙(s j−1 ,s j ] (t) coincide. 15. Let (X n )n⩾0 be a sequence of r. v. in L2 (ℙ). Show that L2 -limn→∞ X n = X implies L1 -limn→∞ X 2n = X 2 .

16. Let (B t )t⩾0 be a BM1 . Show that ∫0 B2s dB s = T

1 3

B3T − ∫0 B s ds, T > 0. T

17. Let (B t )t⩾0 be a BM1 and f ∈ BV[0, T], T < ∞, a non-random function. Prove that T T ∫0 f(s) dB s = f(T)B T − ∫0 B s df(s) (the latter integral is understood as the meansquare limit of Riemann–Stieltjes sums). Conclude from this that the Itô integral extends the Paley–Wiener–Zygmund integral, cf. Paragraph 13.5. 18. Adapt the proof of Proposition 15.18 and prove that we also have 󵄨󵄨 t 󵄨󵄨2 n 󵄨 󵄨 [ lim 𝔼 sup 󵄨󵄨󵄨󵄨 ∫ f(s) dB s − ∑ f(s j−1 )(B s j ∧t − B s j−1 ∧t )󵄨󵄨󵄨󵄨 ] = 0. |Π|→0 󵄨󵄨 t⩽T 󵄨󵄨 j=1 0 [ ]

19. Let (B t , Ft )t⩾0 be a one-dimensional Brownian motion, T < ∞ and Π n , n ⩾ 1, a sequence of partitions of [0, T]: 0 = t n,0 < t n,1 < ⋅ ⋅ ⋅ < t n,k(n) = T. Assume that limn→∞ |Π n | = 0 and define intermediate points θ αn,k := t n,k + α(t n,k+1 − t n,k ) for some fixed 0 ⩽ α ⩽ 1. Show that the Riemann–Stieltjes sums k(n)−1

∑ B(θ αn,k )(B(t n,k+1 ) − B(t n,k ))

k=0

converge in L2 (ℙ) and find the limit L T (α). For which α is L t (α), 0 ⩽ t ⩽ T a martingale?

20. Let (B t , Ft )t⩾0 be a one-dimensional Brownian motioin and τ a stopping time. Show that f(s, ω) := 𝟙[0,T∧τ(ω)) (s), 0 ⩽ s ⩽ T < ∞ is progressively measurable. T Use the very definition of the Itô integral to calculate ∫0 f(s) dB s and compare the result with the localization principle of Theorem 15.15. 21. Show that

P = {Γ : Γ ⊂ [0, T] × Ω, Γ ∩ ([0, t] × Ω) ∈ B[0, t] ⊗ Ft

is a σ-algebra.

for all t ⩽ T}

22. Show that every right or left continuous adapted process is P measurable. 23. Let τ be a stopping for a BM1 (B t )t⩾0 and define the stochastic interval ((0, τ]] := {(s, ω) : 0 < s ⩽ τ(ω)}. Show, using the white noise approach from Section 15.6 that N t (𝟙((0,τ]] ) = B τ∧t . Hint. Approximate τ by stopping times with finitely many values as in Lemma A.16.

16 Stochastic integrals: localization Let (B t )t⩾0 be a one-dimensional Brownian motion and Ft an admissible, right continB

uous complete filtration, e.g. Ft = F t , cf. Theorem 6.22. We have seen in Chapter 15 T that the Itô integral ∫0 f(s) dB s exists for all integrands f ∈ L2T = L2P (λ T ⊗ ℙ), i.e. for all stochastic processes f which have a representative which is P measurable, cf. Definition 15.20, and satisfy T

𝔼 [∫ |f(s, ⋅)|2 ds] < ∞. [0

(16.1)

]

It is possible to relax the condition (16.1) and to extend the family of possible integrands. 16.1 Definition. a) The family L0T consists of all P measurable processes f such that T

∫ |f(s, ⋅)|2 ds < ∞

a.s.

(16.2)

0

b) We say that f ∈ L2T,loc , if there exists a sequence (σ n )n⩾0 of Ft stopping times such that σ n ↑ ∞ a.s. and f 𝟙[0,σ n ) ∈ L2T for all n ⩾ 0. The sequence (σ n )n⩾0 is called a localizing sequence.

16.2 Lemma. Let f ∈ L0T . Then ∫0 |f(s, ⋅)|2 ds is Ft measurable for all t ⩽ T. t

Proof. For every f ∈ L0T we see that |f|2 is P measurable. Therefore, the bivariate mapping (s, ω) 󳨃→ |f(s, ω)|2 𝟙[0,t] (s) is B[0, t] ⊗ Ft measurable for all t ∈ [0, T], and t Tonelli’s theorem shows that ∫0 |f(s, ⋅)|2 ds is Ft measurable. 16.3 Lemma. L2T ⊂ L0T ⊂ L2T,loc .

Proof. Since a square-integrable random variable is a.s. finite, the first inclusion is clear. To see the other inclusion, we assume that f ∈ L0T and define t

{ } σ n = inf {t ⩽ T : ∫ |f(s, ⋅)|2 ds ⩾ n} , 0 { }

(inf 0 = ∞). Clearly, σ n ↑ ∞. Moreover,

n ⩾ 1,

(16.3)

t

σ n (ω) > t ⇐⇒ ∫ |f(s, ω)|2 ds < n. 0

By Lemma 16.2, the integral is Ft measurable; thus, {σ n ⩽ t} = {σ n > t}c ∈ Ft . This means that σ n is a stopping time. https://doi.org/10.1515/9783110741278-016

Ex. 16.1

274 | 16 Stochastic integrals: localization

Ex. 15.20

Note that f 𝟙[0,σ n ) is P measurable since it is the product of two P measurable random variables. From σn

T

󵄨 󵄨2 𝔼 [∫ 󵄨󵄨󵄨f(s)𝟙[0,σ n ) (s)󵄨󵄨󵄨 ds] = 𝔼 [ ∫ |f(s)|2 ds] ⩽ 𝔼 n = n, [0 ] [0 ]

we conclude that f 𝟙[0,σ n ) ∈ L2T .

16.4 Lemma. Let f ∈ L2T,loc and set f n := f 𝟙[0,σ n ) for a localizing sequence (σ n )n⩾0 . Then σ

(f n ∙ B)t m = f m ∙ B t

for all m ⩽ n

and

t ⩽ σm .

Proof. Using Theorem 15.15.e) for f n , f m ∈ L2T we get for all t ⩽ σ m

(f n ∙ B)t m = (f n 𝟙[0,σ m ) ) ∙ B t = (f 𝟙[0,σ n ) 𝟙[0,σ m ) ) ∙ B t = (f 𝟙[0,σ m ) ) ∙ B t = f m ∙ B t . σ

Lemma 16.4 allows us to define the stochastic integral for integrands in L2T,loc . 16.5 Definition. Let f ∈ L2T,loc with localizing sequence (σ n )n⩾0 . Then f ∙ B t (ω) := (f 𝟙[0,σ n ) ) ∙ B t (ω)

for all t ⩽ σ n (ω)

(16.4)

is called the (extended) stochastic integral or (extended) Itô integral. We also write t ∫0 f(s) dB s := f ∙ B t .

Lemma 16.4 guarantees that this definition is consistent and independent of the localizing sequence: indeed for any two localizing sequences (σ n )n⩾0 and (τ n )n⩾0 we see easily that ρ n := σ n ∧ τ n is again such a sequence and then the telescoping argument of Lemma 16.4 applies. If we relax (16.1), then we cannot any more use convergence in L2 (λ T ⊗ ℙ). The proper substitute is uniform convergence in probability.

16.6 Definition. Let X n = (X n (t))t⩾0 , n ⩾ 1, be a sequence of stochastic processes. We say that X n converges to a process X = (X t )t⩾0 uniformly in probability (ucp), if ∀ϵ > 0

∀t ∈ [0, T] : lim ℙ (sup |X n (s) − X(s)| > ϵ) = 0. n→∞

s⩽t

(16.5)

ucp

We use the notation X n 󳨀󳨀󳨀→ X. ucp



Clearly, X n 󳨀󳨀󳨀→ X if, and only if, sups⩽T |X n (s) − X(s)| 󳨀 → 0. Thus, ucp-convergence preserves continuity. We also need to localize martingales.

Ex. 16.4

16.7 Definition. A local martingale is an adapted right continuous process (M t , Ft )t⩽T for which there exists a sequence of stopping times (σ n )n⩾0 such that σ n ↑ ∞ a.s. and σ for every n ⩾ 0, (M t n 𝟙{σ n >0} , Ft )t⩽T is a martingale. The sequence (σ n )n⩾0 is called a localizing sequence. We denote by McT,loc the continuous local martingales and by M2,c T,loc those local martingales where M σ n 𝟙{σ n >0} ∈ M2,c T .

16 Stochastic integrals: localization | 275

Let us briefly review the properties of the extended stochastic integral. The following theorem should be compared with its L2T analogue, Theorem 15.15 16.8 Theorem. Let (B t )t⩾0 be a BM1 , (Ft )t⩾0 an admissible filtration and f ∈ L2T,loc . Then we have for all t ⩽ T a) f 󳨃→ f ∙ B is a linear map from L2T,loc to M2,c T,loc , i.e. (f ∙ B t )t⩽T is a continuous local 2 L martingale; t

b) (f ∙

B)2

2

− ⟨f ∙ B⟩ is a local martingale with ⟨f ∙ B⟩t := f ∙ ⟨B⟩t = ∫ |f(s)|2 ds; ucp

ucp

n→∞

n→∞

0

c) Ucp-continuity. If f n 󳨀󳨀󳨀󳨀→ f , then f n ∙ B 󳨀󳨀󳨀󳨀→ f ∙ B for all (f n )n⩾1 ⊂ L2T,loc . T 󵄨󵄨 s 󵄨󵄨 4C 󵄨 󵄨 d) ℙ (sup 󵄨󵄨󵄨󵄨 ∫ f(r) dB r 󵄨󵄨󵄨󵄨 > ϵ) ⩽ 2 + ℙ (∫ |f(s)|2 ds ⩾ C) , ϵ 󵄨󵄨 s⩽T 󵄨󵄨 0 0

e) Localization in time. Let τ be a Ft stopping time. Then

f)

(f ∙ B)τt = f ∙ (B τ )t = (f 𝟙[0,τ) ) ∙ B t

ϵ > 0, C > 0.

for all t ⩽ T.

Localization in space. If the processes f, g ∈ L2T,loc satisfy f(s, ω) = g(s, ω) for all (s, ω) ∈ [0, t] × F for some t ⩽ T and F ∈ F∞ , then (f ∙ B s )𝟙F = (g ∙ B s )𝟙F ,

0 ⩽ s ⩽ t.

Proof. Throughout the proof we fix some localizing sequence (σ n )n⩾1 . We begin with the assertion e). By the definition of the extended Itô integral and Theorem 15.15.e) we have for all n ⩾ 1 ((f 𝟙[0,σ n ) ) ∙ B)τ = (f 𝟙[0,σ n ) 𝟙[0,τ) ) ∙ B = (f 𝟙[0,σ n ) ) ∙ (B τ ). = ((f 𝟙[0,τ) )𝟙[0,σ n ) ) ∙ B

Now a) follows directly from its counterpart in Theorem 15.15. To see b), we note that σ ∧∙ by Theorem 15.15.b), applied to f 𝟙[0,σ n ) , (f ∙ B σ n )2 − ∫0 n |f(s)|2 ds, is a martingale for each n ⩾ 1. t For d) we define τ := inf {t ⩽ T : ∫0 |f(s, ⋅)|2 ds ⩾ C}. As in Lemma 16.3 we see that τ is a stopping time. Therefore, we get with e) and the maximal inequality (15.23) ℙ (sup |f ∙ B s | > ϵ) = ℙ (sup |f ∙ B s | > ϵ, T < τ) + ℙ (sup |f ∙ B s | > ϵ, T ⩾ τ) s⩽T

s⩽T

s⩽T

e)

⩽ ℙ (sup |(f 𝟙[0,τ) ) ∙ B s | > ϵ) + ℙ(T ⩾ τ) s⩽T

T



4 󵄨 󵄨2 𝔼 [󵄨󵄨󵄨(f 𝟙[0,τ) ) ∙ B T 󵄨󵄨󵄨 ] + ℙ (∫ |f(s)|2 ds ⩾ C) ϵ2 0

276 | 16 Stochastic integrals: localization T∧τ

= ⩽

T

4 𝔼 [ ∫ |f(s)|2 ds] + ℙ (∫ |f(s)|2 ds ⩾ C) ϵ2 0 [0 ] T

4C + ℙ (∫ |f(s)|2 ds ⩾ C) , ϵ2 0

and d) follows. We will use d) to prove c). Since the stochastic integal is linear, it is enough to show ucp

ucp

that f n 󳨀󳨀󳨀→ 0 entails f n ∙ B 󳨀󳨀󳨀→ 0. Using d) for f n and t ∈ [0, T) yields for any δ > 0 t

4C ℙ (sup |f n ∙ B s | > ϵ) ⩽ 2 + ℙ (∫ |f n (s)|2 ds ⩾ C) ϵ s⩽t d)

0

t



4C + ℙ (∫ |f n (s)|2 ds ⩾ C, sup |f n (s)| ⩽ δ) + ℙ (sup |f n (s)| > δ) . ϵ2 s⩽t s⩽t 0

If δ is so small that tδ < C, this becomes

ℙ (sup |f n ∙ B s | > ϵ) ⩽ s⩽t

4C + ℙ (sup |f n (s)| > δ) . ϵ2 s⩽t

Letting first n → ∞ and then C → 0 proves the assertion. Let f, g ∈ L2T,loc with localizing sequences (σ n )n⩾0 and (τ n )n⩾0 , respectively. Then σ n ∧ τ n localizes both f and g, and by Theorem 15.15.f), (f 𝟙[0,σ n ∧τ n ) ) ∙ B𝟙F = (g𝟙[0,σ n ∧τ n ) ) ∙ B𝟙F

for all n ⩾ 0.

Because of the definition of the extended Itô integral this is just f).

For the extended Itô integral we also have a Riemann–Stieltjes-type approximation, cf. Proposition 15.18. 16.9 Theorem. Let f be an Ft adapted right continuous process with finite left limits. Then f ∈ L2T,loc . For every sequence (Π k )k⩾1 of partitions of [0, T] such that limk→∞ |Π k | = 0, the Riemann–Stieltjes sums converge in ucp to f ∙ B t , t ⩽ T: Π Yt k

:=



s j ,s j+1 ∈Π k

ucp

t

f(s j ) (B s j+1 ∧t − B s j ∧t ) 󳨀󳨀󳨀󳨀󳨀󳨀󳨀→ ∫ f(s) dB s . |Π k |→0

0

Proof. Since f is right continuous with left limits, it is bounded on compact t-sets. Therefore, τ n := inf {s ⩾ 0 : |f(s)|2 ⩾ n} ∧ n

is a sequence of finite stopping times with τ n ↑ ∞ a.s.

16 Stochastic integrals: localization | 277

Let f Π k (t) := ∑s j−1 ,s j ∈Π k f(s j−1 )𝟙[s j−1 ,s j ) (t). Using the stopping times (τ n )n⩾1 we infer

from Lemma 15.16 that f Π k ∈ L2T,loc and Y t

Πk

τ n ∧T

τ n ∧T

= f Π k ∙ B t . Moreover, T

𝔼 [ ∫ |f(s)|2 ds] ⩽ 𝔼 [ ∫ n ds] ⩽ 𝔼 [∫ n ds] = Tn, [ 0 ] [ 0 ] [0 ]

i.e. τ n is also a localizing sequence for the stochastic integral. Therefore, by Theorem 16.8.a) and e) and Theorem 15.15.c) and d), t∧τ n 󵄨󵄨 󵄨󵄨2 󵄨󵄨 Π k 󵄨 󵄨 󵄨2 [ 𝔼 sup 󵄨󵄨󵄨Y t∧τ − ∫ f(s) dB s 󵄨󵄨󵄨󵄨 ] = 𝔼 [sup 󵄨󵄨󵄨󵄨(f Π k − f) ∙ B t∧τ n 󵄨󵄨󵄨󵄨 ] n 󵄨󵄨 t⩽T 󵄨󵄨 t⩽T 0 [ ] (15.23)

󵄨 󵄨2 ⩽ 4 𝔼 [󵄨󵄨󵄨󵄨(f Π k − f) ∙ B T∧τ n 󵄨󵄨󵄨󵄨 ]

(15.22)

τ n ∧T

= 4 𝔼 [ ∫ |f Π k (s) − f(s)|2 ds] . [ 0 ]

For 0 ⩽ s ⩽ τ n ∧ T the integrand is bounded by 4n and converges to 0 as k → ∞. By dominated convergence we see that the integral tends to 0 as k → ∞. Finally, f is right continuous with finite left-hand limits and, therefore, the set of discontinuity points {s ∈ [0, T] : f(s, ω) ≠ f(s−, ω)} is a Lebesgue null set. Since limn→∞ τ n = ∞ a.s., we see that ∀δ > 0

∃N δ ⩾ 1

∀n ⩾ N δ : ℙ(τ n < T) < δ.

Thus, by Chebyshev’s inequality, Theorem 16.8.e) and the above calculation 󵄨󵄨 󵄨󵄨 t 󵄨󵄨 󵄨󵄨 󵄨󵄨 Π k 󵄨 ℙ (sup 󵄨󵄨Y t − ∫ f(s) dB s 󵄨󵄨󵄨 > ϵ) 󵄨󵄨 t⩽T 󵄨󵄨󵄨 󵄨󵄨 󵄨 0 󵄨󵄨 󵄨󵄨 t 󵄨󵄨 󵄨󵄨 󵄨󵄨 Π k 󵄨 ⩽ ℙ (sup 󵄨󵄨Y t − ∫ f(s) dB s 󵄨󵄨󵄨 > ϵ, τ n ⩾ T) + ℙ(τ n < T) 󵄨󵄨 t⩽T 󵄨󵄨󵄨 󵄨 󵄨󵄨 0 󵄨󵄨 󵄨󵄨 t∧τ n 󵄨󵄨 󵄨󵄨 󵄨󵄨 Π k 󵄨󵄨 󵄨 ⩽ ℙ (sup 󵄨󵄨Y t∧τ − ∫ f(s) dB s 󵄨󵄨󵄨 > ϵ) + ℙ(τ n < T) 󵄨 󵄨󵄨 n t⩽T 󵄨󵄨 󵄨󵄨 󵄨󵄨󵄨 0 T∧τ n

4 ⩽ 2 𝔼 [ ∫ |f Π k (s) − f(s)|2 ds] + δ 󳨀󳨀󳨀󳨀→ δ 󳨀󳨀󳨀󳨀→ 0. k→∞ δ→0 ϵ [ 0 ]

Let (g n )n⩾1 ⊂ L2T,loc be a sequence of adapted processes and assume that there is a common localizing sequence (τ k )k⩾1 for all g n .

278 | 16 Stochastic integrals: localization The sequence (g n )n⩾1 converges locally in L2T,loc , if for each k ⩾ 1 the sequence (g n 𝟙[0,τ k ) )n⩾1 converges in L2T . This is equivalent to saying that T∧τ k

lim 𝔼 [ ∫ |g n (s) − g m (s)|2 ds] = 0 for all k ⩾ 1. m,n→∞ [ 0 ]

This shows that the limit g is in L2T,loc and that (τ k )k⩾1 is a localizing sequence for g. We will see that convergence in L2T,loc yields ucp-convergence of the stochastic integrals.

16.10 Theorem (stochastic dominated convergence). Let (B t )t⩾0 be a BM1 , (Ft )t⩾0 an admissible filtration and (g n )n⩾1 ⊂ L2T,loc with a common localizing sequence (τ k )k⩾1 .¹ If the sequence (g n )n⩾1 converges locally in L2T to g ∈ L2T,loc , then the sequence of stochastic integrals (g n ∙ B)n⩾1 converges uniformly in probability to g ∙ B, i.e. 󵄨󵄨 t 󵄨󵄨 󵄨󵄨 󵄨󵄨 󵄨󵄨 󵄨 lim ℙ (sup 󵄨󵄨∫(g n (s) − g(s)) dB s 󵄨󵄨󵄨 > ϵ) = 0 for all ϵ > 0. (16.6) n→∞ 󵄨󵄨 t⩽T 󵄨󵄨󵄨 󵄨0 󵄨󵄨

Proof. Denote by (τ k )k⩾1 the common localizing sequence for (g n )n⩾1 and g. Using Theorem 16.8.d) we get for all ϵ > 0, C > 0 and k, n ⩾ 1 󵄨󵄨 t 󵄨󵄨 󵄨󵄨 󵄨󵄨 󵄨󵄨 󵄨 ℙ (sup 󵄨󵄨∫(g n (s) − g(s)) dB s 󵄨󵄨󵄨 > ϵ) 󵄨󵄨 t⩽T 󵄨󵄨󵄨 󵄨󵄨 󵄨0 T

4C ⩽ 2 + ℙ ( ∫ |g n (s) − g(s)|2 ds ⩾ C) ϵ 0

T

4C ⩽ 2 + ℙ ( ∫ |g n (s) − g(s)|2 ds ⩾ C, τ k ⩾ T) + ℙ(τ k < T) ϵ 0

T∧τ k



4C + ℙ ( ∫ |g n (s) − g(s)|2 ds ⩾ C) + ℙ(τ k < T) ϵ2 0

T∧τ k

4C 1 ⩽ 2 + 𝔼 ( ∫ |g n (s) − g(s)|2 ds) + ℙ(τ k < T) C ϵ 0

keeping ϵ, C and k fixed, we can let n → ∞ and get 󵄨󵄨 t 󵄨󵄨 󵄨󵄨 󵄨󵄨 4C 4C 󵄨󵄨 󵄨 ℙ (sup 󵄨󵄨∫(g n (s) − g(s)) dB s 󵄨󵄨󵄨 > ϵ) 󳨀󳨀󳨀󳨀→ 2 + ℙ(τ k ⩽ T) 󳨀󳨀󳨀󳨀→ 2 󳨀󳨀󳨀󳨀→ 0, n→∞ ϵ 󵄨󵄨 C→0 k→∞ ϵ t⩽T 󵄨󵄨󵄨 󵄨󵄨 󵄨0

where we use the fact that τ k converges to infinity as k → ∞.

1 A sufficient condition for the existence of a common localizing sequence is, e.g., the existence of a majorant |g n | ⩽ G ∈ L2T,loc or uniform boundedness |g n | ⩽ C < ∞ – this explains the name of the game.

16 Stochastic integrals: localization | 279

16.11 Theorem (stochastic Fubini). Let (B t )t⩾0 be a BM1 , (Ft )t⩾0 an admissible filtration and μ a finite measure on (B(ℝn ), ℝn ). If f : ℝn × [0, T) × Ω → ℝ, (x, s, ω) 󳨃→ f(x, s, ω), is a B(ℝn ) ⊗ P measurable bounded process, then a) ∫ f(x, s) μ(dx) is bounded and P measurable; t

b) ∫ f(x, s) dB s is B(ℝn ) ⊗ P measurable, λ T ⊗ ℙ a.e. in L2 (μ), 0

and ℙ a.s. continuous in t; t

t

c) ∫ ( ∫ f(x, s) dB s ) μ(dx) = ∫ (∫ f(x, s) μ(dx)) dB s . 0

0

Proof. Without loss of generality we may assume that μ is a probability measure; otherwise we normalize μ by setting μ/μ(ℝn ). We want to use the monotone class theorem (Theorem A.27) for the following set-up: E = ℝn × [0, T) × Ω, G = B(ℝn ) ⊗ P, and the family V of all bounded B(ℝn ) ⊗ P measurable processes f(x, s, ω) satisfying a)–c). Since measurability and (stochastic) integration respect linearity, it is clear that V is a vector space. If we can show the conditions A.27.i) and ii) of the monotone class theorem, the monotone class theorem guarantees that the assertion holds for all bounded, B(ℝn ) ⊗ P measurable integrands f(x, s, ω).

i) Assume that f(x, s, ω) = g(x)h(s, ω) where g, resp. h, are bounded B(ℝn ), resp. P, measurable functions. If f ∈ V, then we have, in particular, 𝟙A×Γ ∈ V for all A × Γ ∈ B(ℝn ) ⊗ P = G . The conditions a) and b) are trivially satisfied. To see c), note that t

t

t

∫ f(x, s, ⋅) dB s = ∫ g(x)h(s, ⋅) dB s = g(x) ∫ h(s, ⋅) dB s , 0

0

0

where we take a continuous version of the stochastic integral h ∙ B t . Moreover t

∫ ∫ f(x, s, ⋅) dB s μ(dx) = ∫ g(x)(h ∙ B)t μ(dx) = (h ∙ B)t ∫ g(x) μ(dx) 0

= (h(⋅, ⋅) ∫ g(x) μ(dx)) ∙ B t = (∫ h(⋅, ⋅)g(x) μ(dx)) ∙ B t t

= ∫ ∫ f(x, s, ⋅) μ(dx) dB s . 0

ii) Let (f n )n⩾1 be an increasing sequence in V such that f n ↑ f pointwise for a bounded f . We have to show that f ∈ V. Pointwise convergence preserves B(ℝn ) ⊗ P measurability. Therefore, a) follows from (the measurability assertion in) the standard Fubini–Tonelli theorem, and the stochastic integral in b) exists. A.s. boundedness and more follow from the next calculation: using Jensen’s inequality, Tonelli’s theorem, and then The-

280 | 16 Stochastic integrals: localization orem 15.15.d), we see 2 󵄨󵄨 󵄨󵄨 s 󵄨󵄨 󵄨󵄨 ] [ 󵄨󵄨 󵄨󵄨 𝔼 [(∫ sup 󵄨󵄨∫ (f n (x, r) − f(x, r)) dB r 󵄨󵄨 μ(dx)) ] 󵄨󵄨 s⩽t 󵄨󵄨󵄨 󵄨󵄨 󵄨0 ] [ 󵄨󵄨2 󵄨󵄨 s 󵄨󵄨 󵄨󵄨 [ 󵄨 ] 󵄨 ⩽ ∫ 𝔼 [sup 󵄨󵄨󵄨∫ (f n (x, r) − f(x, r)) dB r 󵄨󵄨󵄨 ] μ(dx) 󵄨󵄨 󵄨 s⩽t 󵄨󵄨 󵄨󵄨 󵄨0 ] [

(16.7)

t

⩽ 4 ∫ ∫ 𝔼 [|f n (x, r) − f(x, r)|2 ] dr μ(dx). 0

The same calculation (assume for a moment that f n ≡ 0) also proves that f(x, ⋅, ⋅) ∙ B t (ω) is μ ⊗ ℙ a.s. finite and λ T ⊗ ℙ a.e. in L2 (μ). Using dominated convergence the expression on the right-hand side of (16.7) converges to zero as f n → f . This shows that f(x, ⋅, ⋅) ∙ B t (ω) inherits a.s. continuity in t from f n (x, ⋅, ⋅) ∙ B t ; in particular it is P measurable for fixed x. The joint B(ℝn ) ⊗ P measurability follows from Lemma 16.12. This proves b) for f . Let us finally show c). By assumption, we know for f n ∈ V that t

t

∫ ∫ f n (x, s) dB s μ(dx) = ∫ ∫ f n (x, s) μ(dx) dB s . 0

0

t

We have already seen in (16.7) that the left-hand side converges to ∫ ∫0 f(x, s) dB s μ(dx), e.g. in probability. For the right-hand side we can either use the stochastic bounded convergence theorem (Corollary 16.10) or the following direct calculation which uses Itô’s isometry (in the first estimate) and then Jensen’s inequality and Tonelli’s theorem (in the second estimate): 󵄨󵄨 t 󵄨󵄨2 t 󵄨 󵄨󵄨 󵄨󵄨 󵄨󵄨2 [󵄨󵄨󵄨 󵄨󵄨 ] [ 𝔼 [󵄨󵄨∫ ∫ (f n (x, s) − f(s, x)) μ(dx) dB s 󵄨󵄨 ] = 𝔼 ∫ 󵄨󵄨󵄨󵄨∫ (f n (x, s) − f(s, x)) μ(dx)󵄨󵄨󵄨󵄨 ds] 󵄨󵄨 󵄨󵄨 󵄨 󵄨 󵄨 󵄨󵄨 [0 ] [󵄨 0 ] t

⩽ ∫ ∫ 𝔼 [(f n (x, s) − f(s, x))2 ] μ(dx) ds. 0

The expression on the right-hand side converges to zero because of dominated convergence. This proves c) for f , hence f ∈ V, and the proof is complete. In the above proof we need the following technical measurability lemma.

16.12 Lemma. Let (E, E ) be a measurable space and (Ω, A , ℙ) a probability space. Assume that (X n (x, ⋅))n⩾1 is a sequence of random variables depending on a parameter x. If (x, ω) 󳨃→ X n (x, ω) is E ⊗ A measurable and if, for every x ∈ E, the sequence X n (x, ω) converges in probability, then there is a version of the limit X(x, ω) such that (x, ω) 󳨃→ X(x, ω) is E ⊗ A measurable.

Problems

| 281

Proof. By assumption, limm,n→∞ ℙ (|X n (x, ω) − X m (x, ω)| > 2−k ) = 0 for all k ⩾ 0 and x ∈ E. We define a sequence n k (x) := inf {m > n k−1 (x) : sup ℙ (|X i (x, ⋅) − X j (x, ⋅)| > 2−k ) ⩽ 2−k } . i,j⩾m

Obviously, x 󳨃→ n k (x) is E measurable and we see that

󵄨 󵄨 ℙ (sup 󵄨󵄨󵄨󵄨X n k (x) (x, ⋅) − X nℓ (x) (x, ⋅)󵄨󵄨󵄨󵄨 > 2−k ) ⩽ 2−k x∈E

for all ℓ > k.

Using the completeness of the convergence in probability, this shows that X n k (x) (x, ω) converges in probability, uniformly in x ∈ E. Set X(x, ω) = ℙ- limk→∞ X n k (x) (x, ω) and note that (x, ω) 󳨃→ X(x, ω) is E ⊗ A measurable.

16.13 Further reading. Localized stochastic integrals are covered by most of the books mentioned in the Further reading section of Chapter 15. Theorem 16.9 has variants where the driving noise is an approximation of Brownian motion. Such Wong–Zakai results are important for practitioners. A good source are the original papers [273] and [112]. There exist more general versions of the stochastic Fubini theorem, see e.g. [211], but their proofs have the same flavour as the proof of Theorem 16.11. [112] Ikeda, Nakao, Yamato: A class of approximations of Brownian motion. [211] Protter: Stochastic Integration and Differential Equations. [273] Wong, Zakai: Riemann–Stieltjes approximations of stochastic integrals.

Problems 1. 2. 3.

Let T < ∞. Do we have L0T = L2T,loc ?

Let τ be a stopping time. Show that 𝟙[0,τ) is P measurable. If σ n is localizing for a local martingale, so is τ n := σ n ∧ n.

4. The factor 𝟙{σ n >0} is used in the definition of a local martingale, in order to avoid integrability requirements for M0 . More precisely, if (σ n )n⩾1 is a localizing sequence, then we have 5.

M0 ∈ L1 (ℙ) & (M t )t⩾0

is a local martingale ⇐⇒ (M σ n ∧t )t⩾0

is a martingale.

Let (I t , Ft )t⩾0 be a continuous adapted process with values in [0, ∞) and a.s. increasing sample paths. Set for u ⩾ 0 σ u (ω) := inf {t ⩾ 0 : I t (ω) > u}

and

τ u (ω) := inf {t ⩾ 0 : I t (ω) ⩾ u}.

Show that a) σ u ⩾ t ⇐⇒ I t ⩽ u and τ u > t ⇐⇒ I t < u. b) τ u is an Ft stopping time and σ u is an Ft+ stopping time.

282 | 16 Stochastic integrals: localization c) u 󳨃→ σ u is right continuous and u 󳨃→ τ u is left continuous. d) τ u = σ u− = limϵ↓0 σ u−ϵ . e) u 󳨃→ τ u , u 󳨃→ σ u are continuous, if t 󳨃→ I t is strictly increasing.

17 Stochastic integrals: martingale drivers Up to now we have defined the stochastic integral with respect to a Brownian motion and for integrands in the closure of the simple processes L2T = ST . For f ∈ ST , we can define a stochastic integral for any continuous L2 martingale M ∈ M2,c T , cf. Definition 15.8; only in the completion procedure, see Section 15.3, we use that (B2t − t, Ft )t⩽T is a martingale, that is ⟨B⟩t = t. If we assume that (M t , Ft )t⩽T is a continuous squareintegrable martingale such that (M 2t − ⟨M⟩t , Ft )t⩽T

is a martingale for some uniquely determined,

adapted, continuous and increasing process ⟨M⟩ = (⟨M⟩t )t⩽T ,

(DM)

then we can define f ∙ M for f from a suitable completion of ST . If you are reading this for the first time (or if you are happy taking (DM) for granted), then you may omit the next section and continue straight with § 17.2 on page 289.

17.1 The Doob–Meyer decomposition The aim of this section is to prove the following decomposition theorem due to P.A. Meyer. Throughout, T ∈ (0, ∞) is some finite but otherwise arbitrary time horizon. 17.1 Theorem (Doob–Meyer decomposition. Meyer 1962, 1963). For all square integrable martingales M ∈ M2,c T there exists a unique adapted, continuous and increasing process ⟨M⟩ = (⟨M⟩t )t⩽T such that M 2 − ⟨M⟩ is a martingale and ⟨M⟩0 = 0. If (Π n )n⩾1 is any refining sequence of partitions of [0, T] with |Π n | → 0, then ⟨M⟩ is given by ⟨M⟩t = lim

n→∞



t j−1 ,t j ∈Π n

2

(M t∧t j − M t∧t j−1 ) ,

(17.1)

where the limit is uniform (on compact t-sets) in probability.

The proof is split in a series of auxiliary results. This elementary approach is due to Kunita [152, Chapter 2.2]. We begin with the uniqueness of the decomposition. The following result will be useful in its own right. 17.2 Proposition. Let (X t , Ft )t⩾0 be a martingale with paths which are continuous and of bounded variation on compact intervals. Then X t = X0 a.s. for all t ⩾ 0. Proof. Without loss of generality we may assume that X0 = 0. Denote by V t := VAR1 (X; [0, t])

and

τ n = inf{t ⩾ 0 : V t ⩾ n}

the strong variation of (X t )t⩾0 and the stopping time when V t exceeds n for the first time. Since t 󳨃→ X t is continuous, so is t 󳨃→ V t , cf. Paragraph A.29.f). https://doi.org/10.1515/9783110741278-017

Ex. 17.1

284 | 17 Stochastic integrals: martingale drivers τ

By optional stopping, Theorem A.18, we know that (X t n , Ft∧τ n )t⩾0 is for all n ⩾ 1 a martingale. An application of Doob’s maximal inequality yields 2

(A.14)

N

2

n 𝔼 [sup (X s n ) ] ⩽ 4 𝔼 [(X t n ) ] = 𝔼 [ ∑ ((X jn )2 − (X j−1 )2 )]

τ

τ

s⩽t

τ

j=1 N

τ

Nt

τ

N

t

2

τ

n = 𝔼 [ ∑ (X jn − X j−1 ) ]

j=1

Nt

N

t

τ n 󵄨󵄨 󵄨 τ ⩽ 𝔼 [V t sup 󵄨󵄨󵄨X jn − X j−1 󵄨󵄨] Nt N t 1⩽j⩽N

τ n 󵄨󵄨 󵄨 τ ⩽ n 𝔼 [ sup 󵄨󵄨󵄨X jn − X j−1 󵄨󵄨] ; t N N t 1⩽j⩽N

in particular, the expectation on the left-hand side is finite. Since the paths are uniformly continuous on compact intervals, we find τ n 󵄨󵄨 󵄨 τ sup 󵄨󵄨󵄨X jn − X j−1 󵄨󵄨 󳨀󳨀󳨀󳨀󳨀→ 0 almost surely. N→∞ Nt N t

1⩽j⩽N

τ n 󵄨󵄨 󵄨 τ On the other hand, sup1⩽j⩽N 󵄨󵄨󵄨X jn − X j−1 󵄨 ⩽ VAR1 (X τ n ; [0, t]) ⩽ V t∧τ n ⩽ n, and domint t󵄨

ated convergence gives

N

N

τ τ n 󵄨󵄨 󵄨 τ 𝔼 [sup(X s n )2 ] ⩽ n lim 𝔼 [ sup 󵄨󵄨󵄨X jn − X j−1 󵄨󵄨] = 0. t N→∞ N N t 1⩽j⩽N

s⩽t

Therefore, X s = 0 a.s. on [0, k ∧ τ n ] for all k ⩾ 1 and n ⩾ 1. The claim follows since τ n ↑ ∞ almost surely as n → ∞.

17.3 Corollary. Let (X t , Ft )t⩾0 be an L2 martingale such that both (X 2t − A t , Ft )t⩾0 and (X 2t − A󸀠t , Ft )t⩾0 are martingales for some adapted processes (A t )t⩾0 and (A󸀠t )t⩾0 having continuous paths of bounded variation on compact sets and A0 = A󸀠0 = 0. Then A and A󸀠 are indistinguishable: ℙ (A t = A󸀠t ∀t ⩾ 0) = 1.

Proof. By assumption A t − A󸀠t = (X 2t − A󸀠t ) − (X 2t − A t ), i.e. A − A󸀠 is a martingale with continuous paths which are of bounded variation on compact sets. The claim follows from Proposition 17.2.

The next step is to show that Theorem 17.1 holds for uniformly bounded martingales M ∈ M2,c T . This requires two lemmas. Let Π = {0 = t 0 < t 1 < ⋅ ⋅ ⋅ < t n = T} be a partition of [0, T], and denote by M Π,t = (M ttj , Ft j ∧t )t j ∈Π , 0 ⩽ t ⩽ T, where M ttj = M t∧t j . Consider the martingale transform n

N TΠ,t := M Π,t ∙ M TΠ,t = ∑ M t∧t j−1 (M t∧t j − M t∧t j−1 ) . j=1

17.1 The Doob–Meyer decomposition | 285 Π 17.4 Lemma. (N TΠ,t , Ft )t⩽T , N TΠ,t = N T∧t , is a continuous martingale, and

𝔼 [(N TΠ,t )2 ] ⩽ κ4 , n

2

Π,t M 2t − M02 = ∑ (M t∧t j − M t∧t j−1 ) + 2N T ,

(17.2) (17.3)

j=1

2

n

2 𝔼 [( ∑ (M t∧t j − M t∧t j−1 ) ) ] ⩽ 16κ4 . ] [ j=1

(17.4)

Proof. It is clear that t 󳨃→ N TΠ,t is continuous. The martingale property follows from Theorem 15.11.a). For (17.2) we use Theorem 15.4: (15.6)

𝔼 [(N TΠ,t )2 ] = 𝔼 [(M Π,t ∙ M TΠ,t )2 ] = 𝔼 [(M Π,t )2 ∙ ⟨M Π,t ⟩T ] ⩽ κ2 𝔼 [⟨M Π,t ⟩T ] = κ2 𝔼 [M 2t − M02 ] ⩽ κ4 .

The identity (17.3) follows from n

M 2t − M02 = ∑ (M 2t∧t j − M 2t∧t j−1 ) j=1 n

= ∑ (M t∧t j + M t∧t j−1 ) (M t∧t j − M t∧t j−1 ) j=1 n

2

= ∑ [(M t∧t j − M t∧t j−1 ) + 2M t∧t j−1 (M t∧t j − M t∧t j−1 )] j=1 n

2

= ∑ (M t∧t j − M t∧t j−1 ) + 2N TΠ,t . j=1

This and the first estimate imply (17.4): n

2

2 Π,t 2 Π,t 𝔼 [( ∑ (M t∧t j − M t∧t j−1 ) ) ] = 𝔼 [( M 2t − M02 −2N T ) ] ⩽ 8κ4 + 8 𝔼 [(N T )2 ] . [ j=1 ] ⩽ 2 κ2 ⩽ 8κ4

In the penultimate inequality we use the fact that (a + b)2 ⩽ 2a2 + 2b2 .

17.5 Lemma. Let (Π n )n⩾1 be a sequence of partitions of [0, T] such that the mesh size |Π n | = maxt j−1 ,t j ∈Π n (t j − t j−1 ) tends to 0 as n → ∞. Then 󵄨󵄨 Π ,t Π ,t 󵄨󵄨2 lim 𝔼 [ sup 󵄨󵄨󵄨N T n − N T m 󵄨󵄨󵄨 ] = 0. 󵄨 󵄨 0⩽t⩽T

m,n→∞

Proof. For any finite partition Π we define Πf ut := ∑t j−1 ,t j ∈Π M t∧t j−1 𝟙[t j−1 ,t j ) (u). Note that for any two partitions Π, Π 󸀠 and all u ⩽ T we have 󵄨󵄨 Π t Π 󸀠 t 󵄨󵄨 󵄨󵄨 f u − f u 󵄨󵄨 ⩽ sup |M t∧s − M t∧r | . 󵄨󵄨 󵄨󵄨 |r−s|⩽|Π|∨|Π 󸀠 |

286 | 17 Stochastic integrals: martingale drivers By the continuity of the sample paths, this quantity tends to zero as |Π|, |Π 󸀠 | → 0. Fix two partitions Π, Π 󸀠 of the sequence (Π n )n⩾1 and denote the elements of Π by t j and those of Π ∪ Π 󸀠 by u k . The process Πf ut is defined in such a way that N TΠ,t = M Π,t ∙ M TΠ,t =



t j−1 ,t j ∈Π

M t∧t j−1 (M t∧t j − M t∧t j−1 )

Π t f u k−1 u k−1 ,u k ∈Π∪ Π 󸀠

=

Thus,



N TΠ,t − N TΠ

󸀠

,t

Π∪ Π ,t . (M t∧u k − M t∧u k−1 ) = Πf t ∙ M T 󸀠

= ( Πf t − Π f t ) ∙ M TΠ∪ Π ,t , 󸀠

󸀠

and we find with Doob’s maximal inequality (Theorem A.10): 󵄨󵄨 󸀠 󵄨 󵄨2 𝔼 [sup 󵄨󵄨󵄨N TΠ,s − N TΠ ,s 󵄨󵄨󵄨 ] 󵄨 s⩽t 󵄨

󵄨󵄨 󸀠 󵄨 󵄨2 ⩽ 4 𝔼 [󵄨󵄨󵄨N TΠ,t − N TΠ ,t 󵄨󵄨󵄨 ] 󵄨 󵄨 󵄨󵄨 Π t Π 󸀠 t 󸀠 󵄨 󵄨2 = 4 𝔼 [󵄨󵄨󵄨( f − f ) ∙ M TΠ∪ Π ,t 󵄨󵄨󵄨 ] 󵄨 󵄨

(15.7)

= 4𝔼[



u k−1 ,u k ∈Π∪ Π 󸀠

󸀠

2

2

( Πf ut k−1 − Π f ut k−1 ) (M t∧u k − M t∧u k−1 ) ]

󵄨󵄨 󵄨2 󸀠 󵄨 ⩽ 4 𝔼 [ sup 󵄨󵄨󵄨 Πf ut − Π f ut 󵄨󵄨󵄨 󵄨 󵄨 u⩽T

2



u k−1 ,u k ∈Π∪ Π 󸀠 1

(M t∧u k − M t∧u k−1 ) ]

2 󵄨󵄨 󸀠 󵄨 󵄨4 ⩽ 4( 𝔼 [ sup 󵄨󵄨󵄨 Πf ut − Π f ut 󵄨󵄨󵄨 ]) ( 𝔼 [( 󵄨 u⩽T 󵄨



u k−1 ,u k ∈Π∪ Π 󸀠

2

2

1 2

(M t∧u k − M t∧u k−1 ) ) ]) .

The first term tends to 0 as |Π|, |Π 󸀠 | → 0, while the second factor is bounded by 4κ2 , cf. Lemma 17.4. 17.6 Proposition. Theorem 17.1 holds for all M ∈ M2,c T such that supt⩽T |M t | ⩽ κ for some constant κ < ∞.

Proof. Let (Π n )n⩾0 be a sequence of refining partitions such that |Π n | → 0 and set Π := ⋃n⩾0 Π n . Π ,t By Lemma 17.5, the limit L2 -limn→∞ N T n exists uniformly for all t ∈ [0, T] and, 2

because of (17.3), so does L2 -limn→∞ ∑t j−1 ,t j ∈Π n (M t∧t j − M t∧t j−1 ) . Therefore, there is a subsequence such that ⟨M⟩t := lim

k→∞



t j−1 ,t j ∈Π n(k)

2

(M t∧t j − M t∧t j−1 )

exists a.s. uniformly for t ∈ [0, T]; due to the uniform convergence, ⟨M⟩t inherits the continuity of the approximating finite sums.

17.1 The Doob–Meyer decomposition | 287

For s, t ∈ Π with s < t there is some k ⩾ 1 such that s, t ∈ Π n(k) . Obviously, 2

(M s∧t j − M s∧t j−1 ) ⩽



t j−1 ,t j ∈Π n(k)

2



t j−1 ,t j ∈Π n(k)

(M t∧t j − M t∧t j−1 )

and, as k → ∞, this inequality also holds for the limiting process ⟨M⟩. Since Π is dense and t 󳨃→ ⟨M⟩t is continuous, we get ⟨M⟩s ⩽ ⟨M⟩t for all s ⩽ t. 2 From (17.3) we see that M 2t − ∑t j−1 ,t j ∈Π n(k) (M t∧t j − M t∧t j−1 ) is a martingale. From Lemma 15.12 we know that the martingale property is preserved under L2 limits, and so M 2 − ⟨M⟩ is a martingale.

We can, finally, remove the assumption that M is bounded, and prove the Doob–Meyer decomposition. Proof of Theorem 17.1. The uniqueness part follows from Corollary 17.3. In order to construct the decomposition, define a sequence of stopping times τ n := inf {t ⩽ T : |M t | ⩾ n}, n ⩾ 1; as usual, inf 0 = ∞. Clearly, limn→∞ τ n = ∞, n M τ n ∈ M2,c T and |M t | ⩽ n for all t ∈ [0, T]. Since for any partition Π of [0, T] and all m⩽n

τ



t j−1 ,t j ∈Π

τ

n (M t∧τ

2

τ

m ∧t j

n − M t∧τ

m ∧t j−1

the following process is well-defined ⟨M⟩t (ω) := ⟨M τ n ⟩t (ω)

) =

τ



t j−1 ,t j ∈Π

2

τ

m m − M t∧t (M t∧t ) , j j−1

for all (ω, t) : t ⩽ τ n (ω) ∧ T, τ

τ

and we find that ⟨M⟩t is continuous and ⟨M⟩t n = ⟨M n ⟩t . Using Fatou’s lemma, the fact that (M τ n )2 − ⟨M τ n ⟩ is a martingale, and Doob’s maximal inequality A.10, we have 𝔼 (⟨M⟩T ) = 𝔼 ( lim ⟨M⟩Tn ) ⩽ lim 𝔼 (⟨M τ n ⟩T ) = lim 𝔼 ((M Tn )2 − M02 ) τ

τ

n→∞

n→∞

n→∞

⩽ 𝔼 ( sup M 2t ) t⩽T

⩽ 4 𝔼 (M 2T ) .

Therefore, ⟨M⟩T ∈ L1 (ℙ). Moreover,

󵄨 󵄨 󵄨 󵄨 𝔼 ( sup 󵄨󵄨󵄨⟨M⟩t − ⟨M τ n ⟩t 󵄨󵄨󵄨) = 𝔼 ( sup 󵄨󵄨󵄨⟨M⟩t − ⟨M⟩τ n 󵄨󵄨󵄨 𝟙{t>τ n } ) ⩽ 𝔼 (⟨M⟩T 𝟙{T>τ n } ) . t⩽T

t⩽T

By the dominated convergence theorem we see that L1 -limn→∞ ⟨M⟩t n = ⟨M⟩t uniformly for t ∈ [0, T]. Similarly, τ

2

2

𝔼 ( sup (M t − M t n ) ) = 𝔼 ( sup (M t − M τ n ) 𝟙{t>τ n } ) ⩽ 4 𝔼 ( sup M 2t 𝟙{T>τ n } ). τ

t⩽T

t⩽T

t⩽T

288 | 17 Stochastic integrals: martingale drivers Using Doob’s maximal inequality we see supt⩽T M 2t ∈ L1 , and dominated convergence

yields L1 -limn→∞ (M t n )2 = M 2t uniformly for all t ∈ [0, T]. Since L1 -convergence preτ

serves martingales, M 2 − ⟨M⟩ is a martingale. Using that ⟨M⟩t = ⟨M⟩t n = ⟨M τ n ⟩t for t ⩽ τ n , we find for every ϵ > 0 τ

󵄨󵄨 󵄨󵄨 ℙ (sup 󵄨󵄨󵄨󵄨⟨M⟩t − ∑ (M t∧t j − M t∧t j−1 )2 󵄨󵄨󵄨󵄨 > ϵ) 󵄨 t⩽T 󵄨 t j−1 ,t j ∈Π

󵄨󵄨 󵄨󵄨 τn τn ⩽ ℙ (sup 󵄨󵄨󵄨󵄨⟨M τ n ⟩t − ∑ (M t∧t − M t∧t )2 󵄨󵄨󵄨󵄨 > ϵ, T ⩽ τ n ) + ℙ(T > τ n ) j j−1 󵄨 t⩽T 󵄨 t j−1 ,t j ∈Π



󵄨󵄨 󵄨󵄨2 1 τn τn 𝔼 (sup 󵄨󵄨󵄨󵄨⟨M τ n ⟩t − ∑ (M t∧t − M t∧t )2 󵄨󵄨󵄨󵄨 ) + ℙ(T > τ n ) j j−1 2 ϵ 󵄨 t⩽T 󵄨 t ,t ∈Π j−1

j

Thm. 17.6

󳨀󳨀󳨀󳨀󳨀󳨀󳨀→ ℙ(T > τ n ) 󳨀󳨀󳨀󳨀→ 0. |Π|→0

n→∞

We close this section with a result which will be needed for the representation of some “Brownian” martingales, cf. § 19.6. 17.7 Corollary. Let (M t , Ft )t⩾0 ⊂ L2 (ℙ) be a continuous martingale with respect to a right continuous filtration. Then M and the quadratic variation ⟨M⟩ are, with probability one, constant on the same t-intervals. Proof. Fix T > 0. We have to show that there is a set Ω∗ ⊂ Ω such that ℙ(Ω∗ ) = 1 and for all [a, b] ⊂ [0, T] and ω ∈ Ω∗ M t (ω) = M a (ω) for all t ∈ [a, b] ⇐⇒ ⟨M⟩t (ω) = ⟨M⟩a (ω) for all t ∈ [a, b].

“⇒”: by Theorem 17.1 there is a set Ω󸀠 ⊂ Ω of full ℙ-measure and a sequence of partitions (Π n )n⩾1 of [0, T] such that limn→∞ |Π n | = 0 and ⟨M⟩t (ω) = lim

n→∞



2

t j−1 ,t j ∈Π n

(M t j ∧t (ω) − M t j−1 ∧t (ω)) ,

If M is constant on [a, b] ⊂ [0, T], we see that for all t ∈ [a, b] ⟨M⟩t (ω) − ⟨M⟩a (ω) = lim

n→∞



t j−1 ,t j ∈Π n ∩ [a,b]

ω ∈ Ω󸀠 . 2

(M t j ∧t (ω) − M t j−1 ∧t (ω)) = 0,

ω ∈ Ω󸀠 .

“⇐”: let Ω󸀠 be as above. The idea is to show that for every rational q ∈ [0, T] there is a set Ω q ⊂ Ω󸀠 with ℙ(Ω q ) = 1 and such that for every ω ∈ Ω q a ⟨M⟩• (ω)-constancy interval of the form [q, b] is also an interval of constancy for M• (ω). Then the claim follows for the set Ω∗ = ⋂q∈ℚ∩ [0,T] Ω q . The shifted process N t := M t+q − M q , 0 ⩽ t ⩽ T − q is a continuous L2 martingale for the filtration (Ft+q )t∈[0,T−q] ; moreover, ⟨N⟩t = ⟨M⟩t+q − ⟨M⟩q . Set τ = inf {s > 0 : ⟨N⟩s > 0}, note that τ is a stopping time (since ⟨N⟩ has continuous paths), and consider the stopped processes ⟨N τ ⟩t = ⟨N⟩t∧τ and N tτ = N t∧τ .

17.2 The Itô integral driven by L2 martingales

| 289

By definition, ⟨N τ ⟩t ≡ 0 almost surely, hence ⟨M⟩t∧τ+q ≡ ⟨M⟩q almost surely. Since 2 𝔼(N τ∧t ) = 𝔼⟨N τ ⟩t = 0, we conclude that M t∧τ+q − M q = N tτ = 0 a.s., i.e. there is a set Ω q of full measure such that for every ω ∈ Ω q every interval of constancy [q, b] of ⟨M⟩ is also an interval of constancy for M. The direction “⇒” shows that a constancy interval of M cannot be longer than that of ⟨M⟩, hence they coincide.

17.2 The Itô integral driven by L2 martingales

As before, (Ft )t⩽T is a complete filtration. Unless otherwise stated, we assume that all processes are adapted to this filtration and that all martingales are martingales w.r.t. this filtration. Let M ∈ M2,c T be such that (DM) holds and 0 = t 0 < t 1 < ⋅ ⋅ ⋅ < t n = T be any partition of [0, T]. Then (M t j )j is a martingale, and the process ⟨M⟩ from (DM) is the (discrete-time) compensator ⟨M⟩t j . This allows us to use the proof of Theorem 15.10.c) and to see that the following Itô isometry holds for all simple processes f ∈ ST : T

‖f ∙ M T ‖2L2 (ℙ) = 𝔼 [(f ∙ M)2T ] = 𝔼 [∫ |f(s)|2 d⟨M⟩s ] =: ‖f‖2L2 (μ T ⊗ℙ) . [0 ]

(17.5)

For fixed ω we denote by μ T (ω, ds) = d⟨M⟩s (ω) the measure on [0, T] induced by the increasing function s 󳨃→ ⟨M⟩s (ω), i.e. μ T (ω, [s, t)) = ⟨M⟩t (ω) − ⟨M⟩s (ω) for all s ⩽ t ⩽ T. We can now follow the procedure set out in Section 15.3 to define a stochastic integral for continuous square-integrable martingales. 2,c

17.8 Definition. Let M ∈ MT satisfying (DM) and set L2T (M) := ST ; the closure is taken in the space L2 (μ T ⊗ ℙ). Then the stochastic integral (or Itô integral) is defined as t

f ∙ M t = ∫ f(s) dM s := L2 (ℙ)- lim f n ∙ M t , 0

n→∞

0 ⩽ t ⩽ T,

(17.6)

where (f n )n⩾1 ⊂ ST is any sequence which approximates f ∈ L2T (M) in the L2 (μ T ⊗ ℙ)norm. Exactly the same arguments as in Section 15.3 can be used to show the analogue of Theorem 15.15.

17.9 Theorem. Let (M t , Ft )t⩽T be in M2,c T satisfying (DM), write μ T for the measure induced by the increasing process ⟨M⟩, and let f ∈ L2T (M). Then we have for all t ⩽ T 2 a) f 󳨃→ f ∙ M is a linear map from L2T (M) into M2,c T , i.e. (f ∙ M t )t⩽T is a continuous L martingale;

Ex. 17.8

290 | 17 Stochastic integrals: martingale drivers b) (f ∙ M)2 − f 2 ∙ A is a martingale where f 2 ∙ ⟨M⟩t := ∫0 |f(s)|2 d⟨M⟩s ; in particular t

t

2 󵄨 𝔼 [(f ∙ M t − f ∙ M s ) 󵄨󵄨󵄨 Fs ] = 𝔼 [∫ |f(r)|2 d⟨M⟩r [s

c) Itô’s isometry – L2 -continuity.

󵄨󵄨 󵄨󵄨 F ] , 󵄨󵄨 s 󵄨 ]

s < t ⩽ T.

(17.7)

T

‖f ∙

M T ‖2L2 (ℙ)

= 𝔼 [(f ∙

M)2T ]

= 𝔼 [∫ |f(s)|2 d⟨M⟩s ] = ‖f‖22

d) Maximal inequalities.

2

t

T

L (μ T ⊗ℙ) .

]

[0

(17.8)

T

𝔼 [∫ |f(s)|2 d⟨M⟩s ] ⩽ 𝔼 [sup ( ∫ f(s) dM s ) ] ⩽ 4 𝔼 [∫ |f(s)|2 d⟨M⟩s ] . (17.9) t⩽T 0 [0 ] [ ] [0 ]

e) Localization in time. Let τ be an Ft stopping time. Then

f)

(f ∙ M)τt = f ∙ (M τ )t = (f 𝟙[0,τ) ) ∙ M t

for all t ⩽ T.

Localization in space. Assume that f, g ∈ L2T (M) satisfy f(s, ω) = g(s, ω) for all (s, ω) ∈ [0, t] × F for some t ⩽ T and F ∈ F∞ . Then (f ∙ M s )𝟙F = (g ∙ M s )𝟙F

for all 0 ⩽ s ⩽ t.

The characterization of ST , see Section 15.5, relies on a deterministic approximation result, Lemma 15.22, and this remains valid in the general martingale setting. 17.10 Theorem. The L2 (μ T ⊗ ℙ) closure of ST is L2P (μ T ⊗ ℙ), i.e. the family of all equivalence classes in L2 (μ T ⊗ ℙ) which have a progressively measurable representative. Sketch of the proof. Replace in Lemma 15.22 the measure λ T by μ T and the partition points t j by the random times τ j (ω) := inf {s ⩾ 0 : ⟨M⟩s (ω) > jT/n} ∧ T, j ⩾ 1, and τ0 (ω) = 0. The continuity of s 󳨃→ ⟨M⟩s ensures that ⟨M⟩τ − ⟨M⟩τ = T/n for all j j−1 1 ⩽ j ⩽ ⌊n⟨M⟩T /T⌋. Therefore, the proof of Lemma 15.22 remains valid. Moreover, {τ j ⩽ t} = {⟨M⟩t ⩾ jT/n} ∈ Ft , i.e. the τ j are stopping times. Since

L2P (μ T ⊗ ℙ) is closed (Lemma 15.21), the inclusion ST ⊂ L2P (μ T ⊗ ℙ) is clear. For the other direction, we can argue as in Theorem 15.23, and see that every f ∈ L2P (μ T ⊗ ℙ) can be approximated by a sequence of random step functions of the form ∞

∑ ξ j−1 (ω)𝟙[τ

j=1

j−1 (ω),τ j (ω))

(t),

ξ j−1 is Fτ

j−1

measurable, |ξ j−1 | ⩽ C.

17.3 The Itô integral driven by local martingales | 291

We are done if we can show that any process ξ(ω)𝟙[σ(ω),τ(ω)) (t), where σ ⩽ τ ⩽ T are stopping times and ξ is a bounded and Fσ measurable random variable, can be approximated in L2 (μ T ⊗ ℙ) by a sequence from ST . ⌊mT⌋ m Set σ m := ∑k=1 mk 𝟙[(k−1)/m, k/m) (σ)+ ⌊mT⌋ m 𝟙[⌊mT⌋/m, T) (σ) and define τ in the same way. Then ⌊mT⌋

ξ(ω)𝟙[σ m (ω),τ m (ω)) (t) = ∑ ξ(ω)𝟙 k=1

{σ< k−1 m ⩽τ}

(ω)𝟙 k−1 , k (t) + ξ(ω)𝟙 ⌊mT⌋ (ω)𝟙 ⌊mT⌋ (t). {σ< m ⩽τ} [ m ,T) [ m m)

For all Borel sets B ⊂ ℝ and k = 1, 2, . . . , ⌊mT⌋ + 1 we have ∈Fσ

{ξ ∈ B} ∩ {σ < ∈F(k−1)/m

k−1 m }∩

{ k−1 m ⩽ τ} ∈ F(k−1)/m ,

and we see that ξ(ω)𝟙[σ m (ω),τ m (ω)) (t) is in ST . Finally, T

󵄨 󵄨2 𝔼 [∫ 󵄨󵄨󵄨ξ 𝟙[σ m ,τ m ) (t) − ξ 𝟙[σ,τ) (t)󵄨󵄨󵄨 d⟨M⟩t] ⩽ 2C2 [𝔼(⟨M⟩τ m − ⟨M⟩τ ) + 𝔼(⟨M⟩σ m − ⟨M⟩σ )] , [0 ] and this tends to zero as m → ∞.

17.3 The Itô integral driven by local martingales In Chapter 16 we have already seen how we can use stopping techniques in order to extend the range of the integrands from L2T to L2T,loc . Now we want to extend both the range of integrands and the range of integrators in a similar way. 17.11 The strategy. Let us briefly explain the underlying strategy: as before, denote by T L0T (M) all P measurable f such that ∫0 |f(s, ⋅)|2 d⟨M⟩s < ∞ a.s. and by L2T,loc (M) all processes which are locally, i.e. after truncation, in L2T (M). Construct localizing sequences (σ n )n⩾0 and (τ n )n⩾0 such that the stopped integrator 2 M σ n is bounded and in M2,c T , and the truncated integrand f 𝟙[0,τ n ) is in LT (M), see Lemma 17.12 below. With the new localizing sequence θ n := σ n ∧ τ n ∧ n we can define the stochastic integral (f 𝟙[0,θ n ) ) ∙ M θ n for each θ n . Let us now stitch together a process f ∙ M = (f ∙ M)t⩾0 by (f 𝟙[0,θ1 ) ) ∙ M t 1 , { { { { { . { ∞ { { .. θn f ∙ M t = ∑ (f 𝟙[0,θ n ) ) ∙ M t 𝟙[θ n−1 ,θ n ) (t) = { {(f 𝟙[0,θ ) ) ∙ M θ n , { n=1 n t { { { { { .. { . θ

if t < θ1 ,

if t < θ n ,

.

292 | 17 Stochastic integrals: martingale drivers This construction is possible since we have the following “telescoping” compatibility property: if θ m ⩽ θ n , then ((f 𝟙[0,θ n ) ) ∙ M θ n )

θ m 17.9.b)

=

(f 𝟙[0,θ n ) 𝟙[0,θ m ) ) ∙ M θ n ∧θ m

17.9.b)

=

(f 𝟙[0,θ m ) ) ∙ M θ m .

c In the same way we can define ⟨M⟩ for any continuous local martingale M ∈ Mloc with continuous paths and 𝔼 |M0 | < ∞.¹ Let (ρ n )n⩾1 be a sequence of stopping times c which reduce M ∈ Mloc to M ρ n ∈ Mc . Define a further sequence of stopping times τ n := inf {t : |M t | ⩾ n}. Since M has continuous paths, τ n → ∞ as n → ∞, and we see with the optional stopping theorem (Theorem A.18) that θ n := ρ n ∧ τ n ∧ n is a localizing sequence which achieves that M θ n ∈ M2,c T for all T > 0. Since ∞

⟨M θ n ⟩θ m = ⟨M θ n ∧θ m ⟩ = ⟨M θ m ⟩, we can define ⟨M⟩t := ∑ ⟨M θ n ⟩t 𝟙[θ n−1 ,θ n ) (t). n=1

Note that t 󳨃→ ⟨M⟩t is increasing and satisfies ⟨M⟩τ = ⟨M τ ⟩ for all stopping times.

17.12 Lemma. For any T > 0, one has McT,loc ∩ {𝔼 |M0 | < ∞} ⊂ M2,c T,loc as well as 0 2 LT (M) ⊂ LT,loc (M).

Proof. Since M is a local martingale with 𝔼 |M0 | < ∞, there is a reducing sequence ρ n → ∞ such that M ρ n ∈ M2,c T ; define, in addition σ n := inf{s : |M s | > n} ∧ ρ n ∧ n and note that σ n → ∞. By optional stopping (Theorem A.18), M σ n = (M ρ n )σ n is still a martingale; as we are stopping M t as soon as it reaches the level n, the first assertion follows from |M t n | = |M t∧σ n | ⩽ n 󳨐⇒ M t n ∈ L2 (ℙ) 󳨐⇒ M σ n ∈ M2,c T . σ

σ

c For M ∈ Mloc the bracket ⟨M⟩ is a well-defined increasing process; thus we can define for f ∈ L0T (M) the stopping times (cf. Lemma 16.3) T

τ n := inf {t ⩽ T : ∫ |f(s, ⋅)| 𝟙[0,t) (s) d⟨M⟩s > n}, 0

(inf 0 = ∞) and we get, because of the very definition of τ n ,

n ⩾ 0,

T

󵄨 󵄨2 𝔼 [∫ 󵄨󵄨󵄨f(s)𝟙[0,τ n ) (s)󵄨󵄨󵄨 d⟨M⟩s ] ⩽ 𝔼[n] = n. [0 ]

This shows that f can be localized, i.e. L0T (M) ⊂ L2T,loc (M) for any T > 0. 1 We assume M0 ∈ L1 (ℙ) in order to simplify the localization procedure, see Definition 16.7 and Problem 16.4. Typically, one would consider M − M0 if this condition does not hold.

17.3 The Itô integral driven by local martingales | 293

A word of caution: The proof of Lemma 17.12 does not work if M t or |f| ∙ ⟨M⟩t have σ τ jumps. In this case, M t n , and |f| ∙ ⟨M⟩t n , are bounded for t ∈ [0, σ n ), and t ∈ [0, τ n ), respectively; but we loose control at t = σ n , and t = τ n , respectively, since the processes may – and will, in general – jump over the threshold n. After these preparations, we are ready for the “martingale” counterpart of Theorem 16.8 resp. the “localization” of Theorem 17.9. 17.13 Theorem. Let (M t , Ft )t⩾0 be continuous local martingale with 𝔼 |M0 | < ∞ and f ∈ L2T,loc (M). Let t ⩽ T ∈ (0, ∞).

a) f 󳨃→ f ∙ M is a linear map from L2T,loc (M) to M2,c T,loc , i.e. (f ∙ M t )t⩽T is a continuous local L2 martingale; b) (f ∙ M)2 − ⟨f ∙ M⟩ is a continuous local martingale where t

2

⟨f ∙ M⟩t = f ∙ ⟨M⟩t = ∫ |f(s)|2 d⟨M⟩s ; 0

ucp

ucp

c) Ucp-continuity. If f n 󳨀󳨀󳨀󳨀→ f , then f n ∙ M 󳨀󳨀󳨀󳨀→ f ∙ M for all (f n )n⩾1 ⊂ L2T,loc (M) n→∞ n→∞ 󵄨󵄨 t 󵄨󵄨 T 󵄨󵄨 󵄨󵄨 4C 󵄨 󵄨 d) ℙ (sup 󵄨󵄨󵄨∫ f(s) dM s 󵄨󵄨󵄨 > ϵ) ⩽ 2 + ℙ (∫ |f(s)|2 d⟨M⟩s ⩾ C) for all C, ϵ > 0. 󵄨󵄨 ϵ t⩽T 󵄨󵄨󵄨 󵄨󵄨 󵄨0 0 e) Localization in time. Let τ be a Ft stopping time. Then f)

(f ∙ M)τt = f ∙ (M τ )t = (f 𝟙[0,τ) ) ∙ M t ,

t ⩽ T.

Localization in space. If f, g ∈ L2T,loc (M) are such that f(s, ω) = g(s, ω) for all (s, ω) ∈ [0, t] × F for some t ⩽ T and F ∈ F∞ , then (f ∙ M s )𝟙F = (g ∙ M s )𝟙F

for all 0 ⩽ s ⩽ t.

The proof of this theorem uses the same arguments as the proof of Theorem 16.8. The only major difference is the proof of Part c) which requires double localization in the integrand and the integrator. The following lemma allows us to deal with this situation: rather than looking at f and M, we can show the assertion for (f 𝟙[0,θ n ) ) ∙ M θ n = (f ∙ M)θ n and a suitable localizing sequence. Then we can follow the proof of Theorem 16.8.c) almost literally. 17.14 Lemma. Let (X t , Ft )t⩽T , (X t , Ft )t⩽T , j ⩾ 1, be adapted processes and (τ n )n⩾0 some localizing sequence. Then j

ucp

X•j 󳨀󳨀󳨀󳨀→ X• j→∞

ucp

if, and only if, ∀n : X∙∧τ n 󳨀󳨀󳨀󳨀→ X∙∧τ n . j

j→∞

Proof. We use the following shorthand: := sups⩽t |Z s | for any process (Z t )t⩾0 . “⇒”: since (X j − X)∗τ n ∧T ⩽ (X j − X)∗T , we get for every ϵ > 0 Z ∗t

ℙ ((X j − X)∗τ n ∧T > ϵ) ⩽ ℙ ((X j − X)∗T > ϵ) 󳨀󳨀󳨀󳨀→ 0. j→∞

Ex. 17.4 Ex. 17.5

294 | 17 Stochastic integrals: martingale drivers “⇐”: note that, since a localizing sequence τ n converges to ∞ a.s.,

ℙ ((X j − X)∗T > ϵ) ⩽ ℙ ((X j − X)∗T > ϵ, τ n ⩾ T) + ℙ (τ n < T) ⩽ ℙ ((X j − X)∗T∧τ n > ϵ) + ℙ (τ n < T) ucp

󳨀󳨀󳨀󳨀→ ℙ (τ n < T) 󳨀󳨀󳨀󳨀→ 0. n→∞

j→∞

Ex. 17.3

Once we have defined the angle bracket ⟨M⟩ for a continuous local martingale, we can use polarization to get the covariation ⟨M, N⟩ := 14 (⟨M + N⟩ − ⟨M − N⟩), see (15.8) for the discrete-time setting and Exercise 17.3 for continuous times. For the next result we only need to know that ⟨M, N⟩ is the unique bounded variation process such that MN −⟨M, N⟩ is a local martingale provided that M, N ∈ McT,loc . The following theorem is the continuous-time analogue of Theorem 15.7, and it gives an intrinsic characterization of the stochastic integral. 17.15 Theorem. Let M ∈ McT,loc , 𝔼 |M0 | < ∞, and f ∈ L2T,loc (M). I = f ∙ M is the unique element in McT,loc such that ⟨I, N⟩ = f ∙ ⟨M, N⟩

for all N ∈ McT,loc .

(17.10)

Proof. Assume that (17.10) holds. We have to identify I with f ∙ M. By assumption, (17.10)

⟨I − f ∙ M, N⟩ = ⟨I, N⟩ − ⟨f ∙ M, N⟩ = 0.

We pick N = I − f ∙ M and get ⟨I − f ∙ M⟩ = 0. Let (τ n )n⩾0 be a localizing sequence for N. τ τ Since 𝔼 ((N t n )2 ) = 𝔼 (⟨N⟩t n ), we see that 2

Doob

𝔼 (sup (I t n − f ∙ M t n ) ) ⩽ 4 𝔼 ((I Tn − f ∙ M Tn )2 ) = 4 𝔼 (⟨I − f ∙ M⟩Tn ) = 0. τ

τ

τ

τ

τ

t⩽T

This shows that ℙ (I t n = f ∙ M t n ∀t ⩾ 0) = 1, hence τ

τ

ℙ (I t = f ∙ M t ∀t ⩾ 0) = ℙ (I t n = f ∙ M t n ∀t ⩾ 0, ∀n ⩾ 0) = 1. τ

τ

For the converse direction, we have to show that ⟨f ∙ M, N⟩ = f ∙ ⟨M, N⟩. Let θ n be a 2 localizing sequence such that M θ n , N θ n ∈ Mc,2 T and f 𝟙[0,θ n ) ∈ LT (M). Since for any stopping time τ ⟨f ∙ M, N⟩τ = ⟨(f ∙ M)τ , N τ ⟩ = ⟨(f 𝟙[0,τ) ) ∙ M τ , N τ ⟩,

(f ∙ ⟨M, N⟩)τ = (f 𝟙[0,τ) ) ∙ ⟨M, N⟩τ = (f 𝟙[0,τ) ) ∙ ⟨M τ , N τ ⟩,

we can assume that M, N ∈ MTc,2 and f ∈ L2T (M). Let f k ∈ ST be a sequence of simple processes which approximate f ∈ L2T (M). Fix k and the underlying partition Π = {t0 = 0 ⩽ t1 ⩽ . . . ⩽ t n = T} (we suppress the dependence on k to ease notation). In abuse of notation we identify for any function or process (Z t )t⩽T the discrete process Z Π := (Z s )s∈Π and the piecewise constant

Problems

| 295

process ∑t j−1 ,t j ∈Π Z t j−1 𝟙[t j−1 ,t j ) (s) + Z T 𝟙{T} (s). Since the compensator ⟨M⟩ of a martingale is unique in discrete and continuous time, and since M Π ⊂ (M t )t⩽T we have Π Π ⟨M⟩Π s = ⟨M ⟩s for all s ∈ Π. Note that the bracket for M is the discrete bracket from Section 15.1, while the bracket for M is from the Doob–Meyer decomposition (DM). This is bequeathed to the covariation via polarization. With these conventions it is easy to see that Π Π ⟨f k ∙ M, N⟩Π T = ⟨(f k ∙ M) , N ⟩T

= ⟨f k ∙ M Π , N Π ⟩T

(15.9)

= f k ∙ ⟨M Π , N Π ⟩T = f k ∙ ⟨M, N⟩Π T.

Thus, as T ∈ Π,

⟨f k ∙ M, N⟩T = f k ∙ ⟨M, N⟩T .

(17.11)

The left-hand side of (17.11) converges in L2 (ℙ) to ⟨f ∙ M, N⟩. This follows from polarization and Itô’s isometry which is the L2 (ℙ) continuity of X 󳨃→ ⟨X⟩T : 4⟨f k ∙ M, N⟩ = ⟨f k ∙ M + N⟩ − ⟨f k ∙ M − N⟩ L2 (ℙ)

󳨀󳨀󳨀󳨀→ ⟨f ∙ M + N⟩ − ⟨f ∙ M − N⟩ = 4⟨f ∙ M, N⟩. k→∞

The argument for the right-hand side of (17.11) uses Stieltjes integrals. Note that Corollary 15.6 shows 󵄨󵄨 󵄨 󵄨󵄨f k ∙ ⟨M, N⟩T 󵄨󵄨󵄨 ⩽ √⟨f k ∙ M⟩T √⟨N⟩T .

This means that the convergence of f k → f in L2 (μ T ⊗ ℙ) and f k ∙ M → f ∙ M in L2 (ℙ) entails the convergence of f k ∙ ⟨M, N⟩T → f ∙ ⟨M, N⟩T . Use for this again the L2 (ℙ) continuity of X 󳨃→ ⟨X⟩T . In order to get the assertion for general t ∈ [0, T], all we have to do is to consider the stopped martingales M t and N t , finishing the proof. 17.16 Further reading. Stochastic integrals driven by martingales are covered by most of the books mentioned in the Further reading section of Chapter 15. Some authors, e.g. [113] and [216], use Theorem 17.15 to get a very elegant definition of the stochastic integral with respect to martingales. [113] Ikeda, Watanabe: Stochastic Differential Equations and Diffusion Processes. [216] Revuz, Yor: Continuous Martingales and Brownian Motion.

Problems 1.

Show that Proposition 17.2 and Corollary 17.3 remain valid for local martingales with continuous paths (cf. Definition 16.7).

296 | 17 Stochastic integrals: martingale drivers 2.

Extend Theorem 17.1 to local martingales with continuous paths.

3.

For M, N ∈ McT,loc we define the covariation by ⟨M, N⟩t := 41 (⟨M + N⟩t − ⟨M − N⟩t ). Show that a) t 󳨃→ ⟨M, N⟩t is well-defined, of bounded variation on compact t-intervals (BV-process).

b) ⟨M, N⟩ is the unique BV-process such that MN − ⟨M, N⟩ ∈ McT,loc .

c) ⟨M, N⟩t = ucp- lim ∑(M t i ∧t − M t i−1 ∧t )(N t i ∧t − N t i−1 ∧t ). |Π|↓0

Π

d) (M, N) 󳨃→ ⟨M, N⟩ is bilinear. e) |⟨M, N⟩| ⩽ √⟨M⟩√⟨N⟩.

4. Use Lemma 17.14 to show the following assertions: ucp

a) limn 𝔼 (supt⩽T |X tn − X t |2 ) = 0 󳨐⇒ X n 󳨀󳨀󳨀󳨀→ X. n→∞

ucp

b) X n 󳨀󳨀󳨀󳨀→ X, n→∞

5.

ucp

ucp

n→∞

n→∞

Y n 󳨀󳨀󳨀󳨀→ Y 󳨐⇒ X n + Y n 󳨀󳨀󳨀󳨀→ X + Y.

c) ET is ucp-dense in L2T,loc (M) for any M ∈ McT,loc .

Let M ∈ McT,loc . Show that the following maps are continuous, if we use ucpconvergence in both range and domain: a) L2T,loc (M) ∋ f 󳨃→ f 2 ∙ ⟨M⟩.

b) L2T,loc (M) ∋ f 󳨃→ f ∙ M ∈ McT,loc . 6.

c) McT,loc ∋ M 󳨃→ ⟨M⟩.

Let M ∈ McT,loc and f(t, ω) be an adapted càdlàg (right continuous, finite left limits) process and t ∈ [0, T]. T a) Show that f ∈ L0T (M), i.e. ∫0 f(t, ω) d⟨M⟩t < ∞ a.s. b) Show that for any sequence of partitions Π of [0, T] with |Π| ↓ 0 we have f Π ∙ M → f ∙ M in ucp-sense. Here f Π (t, ω) := ∑ f(t i−1 , ω)𝟙[t i−1 ,t i ) (t). Π

7.

Let M ∈ McT,loc and f n be a sequence in L0T (M). If f n → f in ucp-sense, then f ∈ L0T (M).

8. Show that the process f 2 ∙ ⟨M⟩t := ∫0 |f(s)|2 d⟨M⟩s appearing in Theorem 17.9.b) is adapted. t

18 Itô’s formula An important consequence of the fundamental theorem of integral and differential calculus is the fact that every differentiation rule has an integration counterpart. Consider, for example, the chain rule (f ∘ g)󸀠 (t) = f 󸀠 (g(t)) ⋅ g󸀠 (t). Then t

t

f(g(t)) − f(g(0)) = ∫ f 󸀠 (g(s)) ⋅ g 󸀠 (s) ds = ∫ f 󸀠 (g(s)) dg(s) 0

0

which is the substitution or change-of-variable rule for integrals. Moreover, it provides a useful calculus formula if we want to evaluate integrals. Let (B t )t⩾0 be a BM1 and (Ft )t⩾0 an admissible, right continuous complete filtration, B

e.g. Ft = F t , cf. Theorem 6.22. Itô’s formula is the stochastic counterpart of the chain rule; it is also known as the change-of-variable-formula or Itô’s Lemma. The unbounded variation of Brownian sample paths leaves, again, its traces: the chain rule has a correction term involving the quadratic variation of (B t )t⩾0 and the second derivative of the function f . Here is the statement of the basic result. 18.1 Theorem (Itô 1942). Let (B t )t⩾0 be a BM1 and let f : ℝ → ℝ be a C2 -function. Then we have for all t ⩾ 0 almost surely t

t

1 f(B t ) − f(B0 ) = ∫ f (B s ) dB s + ∫ f 󸀠󸀠 (B s ) ds. 2 0

󸀠

(18.1)

0

Before we prove Theorem 18.1, we need to discuss Itô processes and Itô differentials.

18.1 Itô processes and stochastic differentials Let T ∈ [0, ∞]. Stochastic processes of the form t

t

X t = X0 + ∫ σ(s) dB s + ∫ b(s) ds, 0

0

t < T,

(18.2)

are called Itô processes. If the integrands satisfy σ ∈ L2T,loc and, for almost all ω, b(⋅, ω) ∈ L1 ([0, T], ds) (or b(⋅, ω) ∈ L1loc ([0, ∞), ds) if T = ∞), then all integrals in (18.2) make sense. Roughly speaking, an Itô process is locally a Brownian motion with standard deviation σ(s, ω) plus a drift with mean value b(s, ω). Note that these parameters change infinitesimally over time. In dimensions other than d = 1, the stochastic integral is defined for each coordinate and we can read (18.2) as matrix and vector equality, i.e. j Xt

=

j X0

d

+∑

t

∫ σ jk (s) dB ks

k=1 0

https://doi.org/10.1515/9783110741278-018

t

+ ∫ b j (s) ds, 0

t ⩽ T, j = 1, . . . , m,

(18.3)

298 | 18 Itô’s formula where X t = (X 1t , . . . , X tm )⊤ , B t = (B1t , . . . , B dt )⊤ is a BMd , σ(t, ω) = (σ jk (t, ω))jk ∈ ℝm×d is a matrix-valued stochastic process and b(t, ω) = (b1 (t, ω), . . . , b m (t, ω))⊤ ∈ ℝm is a vector-valued stochastic process; the coordinate processes σ jk and b j are chosen in such a way that all (stochastic) integrals in (18.3) are defined. 18.2 Definition. Let (B t )t⩾0 be a BMd , (σ t )t⩾0 and (b t )t⩾0 be two progressively measurable ℝm×d and ℝm -valued locally bounded processes. Then t

t

X t = X0 + ∫ σ s dB s + ∫ b s ds, 0

0

is an m-dimensional Itô process.

t ⩾ 0,

18.3 Remark. a) If we know that σ and b are locally bounded, i.e. there is some Ω0 ⊂ Ω with ℙ(Ω0 ) = 1 and max sup |σ jk (t, ω)| + max sup |b j (t, ω)| < ∞ j,k

j

t⩽T

t⩽T

for all T ∈ (0, ∞), ω ∈ Ω0 ,

then b j (⋅, ω) ∈ L1 ([0, T], dt), T > 0, almost surely; moreover, the stopping times t

{ } τ n (ω) := min inf {t ⩾ 0 : ∫ |σ jk (s, ω)|2 ds ⩾ n} j,k 0 { }

satisfy τ n ↑ ∞ almost surely, i.e.

τ n ∧T

𝔼 [ ∫ |σ jk (s, ⋅)|2 ds] ⩽ n, [ 0 ]

which proves that σ jk ∈ L2T,loc for all T > 0. t

t

b) Let X t − X0 = ∫0 σ s dB s + ∫0 b s ds = M t + A t . From Chapter 16 we know that in t

Ex. 18.17

this decomposition the process M t = ∫0 σ s dB s is a (vector-valued) continuous local martingale; the coordinates of the continuous process A t satisfy t

|A t (ω) − A s (ω)| ⩽ ∫ |b j (r, ω)| dr ⩽ C T (ω)(t − s) for all s, t < T, j

j

s

i.e. A t is locally of bounded variation.¹

1 Processes of this form are called continuous semimartingales. The decomposition in the local martingale part and the BV part is (essentially) unique, cf. Exercise 18.17.

18.2 The heuristics behind Itô’s formula | 299

It is convenient to rewrite (18.2) in differential form, i.e. as dX t (ω) = σ(t, ω) dB t (ω) + b(t, ω) dt,

(18.4)

where we use again, if appropriate, vector and matrix notation. In this new notation, Example 15.17 reads t

(B t )2 = 2 ∫ B s dB s + t,

d(B t )2 = 2B t dB t + dt,

hence,

0

and we see that this is a special case of Itô’s formula (18.1) with f(x) = x2 : df(B t ) = f 󸀠 (B t ) dB t +

1 󸀠󸀠 f (B t ) dt. 2

18.2 The heuristics behind Itô’s formula

Let us give an intuitive argument why stochastic calculus differs from the usual calculus. Note that ∆B t = √∆t

B t+∆t − B t √ ∼ ∆t G √∆t

and (∆B t )2 = ∆t G2 ,

where G ∼ N(0, 1) is a standard normal random variable. Since 𝔼 |G| = √2/π and 𝔼 G2 = 1, we get from ∆t → 0 |dB t | ≈ √

2 dt π

and (dB t )2 ≈ dt.

This means that Itô’s formula is essentially a second-order Taylor formula while in ordinary calculus we go up to order one only. Terms of higher order, for example (dB t )3 ≈ (dt)3/2 and dt dB t ≈ (dt)3/2 , do not contribute, since integrals with respect to (dt)3/2 correspond to an integral sum with (∆t)−1 -many terms which are all of the order (∆t)3/2 – and such a sum tends to zero as ∆t → 0. The same argument also explains why we do not have terms of second order, i.e. (dt)2 , in ordinary calculus. In Section 18.5 we consider Itô’s formula for a d-dimensional Brownian motion j B t = (B1t , . . . , B dt ). Then we will encounter differentials of the form dt dB kt and dB t dB kt . k Clearly, dt dB t ≈ (dt)3/2 is negligible, while for j ≠ k j

j

j

(B t+∆t − B t )(B kt+∆t − B kt ) = ∆t

j

(B t+∆t − B t ) (B kt+∆t − B kt ) ∼ ∆t G G󸀠 , √∆t √∆t

where G and G󸀠 are independent N(0, 1) distributed random variables. Note that ∆t G G󸀠 =

2

2

G + G󸀠 G − G󸀠 1 ∆t [( ) −( ) ]. 2 √2 √2

Ex. 18.2

300 | 18 Itô’s formula Tab. 18.1: Multiplication table for the stochastic differentials of a BMd (B1t , . . . , B dt ).

Ex. 1.7

j

dt

dB t

dB kt

dt

0

0

0

j dB t

0

dt

0

dB kt

0

0

dt

j ≠ k

Since √1 (G ± G󸀠 ) are independent N(0, 1) random variables (cf. Problem 1.7), both 2 terms give, as ∆t → 0, the same contribution towards the quadratic variation. This shows – see also the argument at end of the proof of Theorem 18.8: {0, j dB t dB kt ≈ δ jk dt = { dt, {

j ≠ k, j = k,

This can be nicely summed up in a multiplication table, Table 18.1.

18.3 Proof of Itô’s formula (Theorem 18.1) We begin with an auxiliary result on Riemann–Stieltjes-type approximations of (stochastic) integrals. As a by-product we get a rigorous proof for (dB t )2 ≈ dt, compare Section 18.2. 18.4 Lemma. Let (B t )t⩾0 be a BM1 , Π = {0 = t0 < t1 < ⋅ ⋅ ⋅ < t N = T} a generic partition of the interval [0, T] with mesh |Π| = maxl |t l − t l−1 | and g ∈ Cb (ℝ). Then N



∑ g(B t l−1 )(B t l − B t l−1 ) 󳨀󳨀󳨀󳨀󳨀→ ∫ g(B s ) dB s |Π|→0

l=1

and

T

N



T

∑ g(ξ l )(B t l − B t l−1 )2 󳨀󳨀󳨀󳨀󳨀→ ∫ g(B s ) ds |Π|→0

l=1

(18.5)

0

(18.6)

0

where ξ l (ω) = B t l−1 (ω) + θ l (ω)(B t l (ω) − B t l−1 (ω)), 0 ⩽ θ l ⩽ 1, l = 1, . . . , N, are intermediate values such that g(ξ l ) is measurable.

Proof. By Proposition 15.18 (or Theorem 16.9) we may use a Riemann-sum approximation of the stochastic integral provided that s 󳨃→ g(B s ) is continuous in L2 (ℙ)-sense. This, however, follows from ‖g‖∞ < ∞ and the continuity of x 󳨃→ g(x) and s 󳨃→ B s ; for all sequences (s n )n⩾1 with s n → s, we have by dominated convergence 2

lim 𝔼 {(g(B s n ) − g(B s )) } = 0;

n→∞

18.3 Proof of Itô’s formula (Theorem 18.1) |

301

and this proves (18.5). The integral appearing on the right-hand side of (18.6) is, for each ω, a Riemann integral. Since s 󳨃→ g(B s ) is bounded and continuous, we may approximate the integral by Riemann sums N

T

l=1

0

lim ∑ g(B t l−1 )(t l − t l−1 ) = ∫ g(B s ) ds

|Π|→0

almost surely, hence in probability. Therefore, the assertion follows if we can show that the following expression converges to 0 in probability: N

N

∑ g(ξ l )(B t l − B t l−1 )2 − ∑ g(B t l−1 )(t l − t l−1 )

l=1

l=1

N

= ∑ g(B t l−1 ){(B t l − B t l−1 )2 − (t l − t l−1 )} l=1

N

+ ∑ (g(ξ l ) − g(B t l−1 ))(B t l − B t l−1 )2 =: J1 + J2 . l=1

For the second term we have N

2 󵄨 󵄨 |J2 | ⩽ ∑ 󵄨󵄨󵄨g(ξ l ) − g(B t l−1 )󵄨󵄨󵄨(B t l − B t l−1 ) l=1

N

2 󵄨 󵄨 ⩽ max 󵄨󵄨󵄨g(ξ l ) − g(B t l−1 )󵄨󵄨󵄨 ∑ (B t l − B t l−1 ) . 1⩽l⩽N

l=1

=S2Π (B;[0,T]) cf. (9.1)

Taking expectations and using the Cauchy-Schwarz inequality we get 󵄨 󵄨2 𝔼 |J2 | ⩽ √𝔼 ( max 󵄨󵄨󵄨g(ξ l ) − g(B t l−1 )󵄨󵄨󵄨 ) √ 𝔼 [(S2Π (B; [0, T]))2 ]. 1⩽l⩽N

(18.7)

By Theorem 9.1, L2 (ℙ)-lim|Π|→0 S2Π (B; [0, T]) = T; since s 󳨃→ g(B s ) is uniformly continuous on [0, T] and bounded by ‖g‖∞ , we can use dominated convergence to get lim|Π|→0 𝔼 |J2 | = 0; therefore J2 → 0 in L1 (ℙ) and in probability. For the first term we get N

2

𝔼(J21 ) = 𝔼 [( ∑ g(B t l−1 )[(B t l − B t l−1 )2 − (t l − t l−1 )]) ] l=1

N

2 󵄨2 󵄨 = 𝔼 [ ∑ 󵄨󵄨󵄨g(B t l−1 )󵄨󵄨󵄨 [(B t l − B t l−1 )2 − (t l − t l−1 )] ] . l=1

302 | 18 Itô’s formula

Ex. 18.4

The last equality follows from the property (15.7) of the martingale transform if we use in (15.7) the martingale M t j = B2t j − t j . From (the proof of) Theorem 9.1 we get N

2

𝔼(J21 ) ⩽ ‖g‖2∞ 𝔼 [ ∑ [(B t l − B t l−1 )2 − (t l − t l−1 )] ] l=1



N 2‖g‖2∞ |Π| ∑ (t l l=1

− t l−1 ) =

2‖g‖2∞ |Π| T

(18.8)

󳨀󳨀󳨀󳨀󳨀󳨀→ 0. |Π|→0

This proves L2 (ℙ)-convergence and, consequently, convergence in probability.

Proof of Theorem 18.1. 1o Let us first assume that supp f ⊂ [−K, K] is compact. Then C f := ‖f‖∞ + ‖f 󸀠 ‖∞ + ‖f 󸀠󸀠 ‖∞ < ∞.

For any Π := {t0 = 0 < t1 < ⋅ ⋅ ⋅ < t N = t} with mesh |Π| = maxl (t l − t l−1 ) we get with Taylor’s theorem N

f(B t ) − f(B0 ) = ∑ (f(B t l ) − f(B t l−1 )) l=1 N

= ∑ f 󸀠 (B t l−1 )(B t l − B t l−1 ) + l=1

=: J1 + J2 .

1 N 󸀠󸀠 ∑ f (ξ l )(B t l − B t l−1 )2 2 l=1

Here ξ l = ξ l (ω) is an intermediate point between B t l−1 and B t l . From the integral representation of the remainder term in the Taylor formula, 1

1 󸀠󸀠 f (ξ l ) = ∫ f 󸀠󸀠 (B t l−1 + r(B t l − B t l−1 ))(1 − r) dr, 2 0

we see that f 󸀠󸀠 (ξ l ) is measurable.

2o From (18.5) we know that J1 → ∫0 f 󸀠 (B s ) dB s in probability as |Π| → 0.

3o From (18.6) we know that J2 →

t

1 t 󸀠󸀠 2 ∫0 f (B s ) ds

in probability as |Π| → 0.

4o So far we have shown (18.1) for functions f ∈ C2 such that C f < ∞. For a general f ∈ C2 we fix ℓ ⩾ 0, pick a cut-off function χℓ ∈ C2c satisfying 𝟙𝔹(0,ℓ) ⩽ χℓ ⩽ 𝟙𝔹(0,ℓ+1) and set fℓ (x) := f(x)χℓ (x). Clearly, C fℓ < ∞, and therefore t

fℓ (B t ) − fℓ (B0 ) = ∫ fℓ󸀠 (B s ) dB s + 0

Consider the stopping times

t

1 ∫ fℓ󸀠󸀠 (B s ) ds. 2

τ(ℓ) := inf{s > 0 : |B s | ⩾ ℓ},

0

ℓ ⩾ 1,

18.4 Itô’s formula for stochastic differentials |

and note that t⩾0

dj f (B ) dx j ℓ s∧τ(ℓ)

f(B t∧τ(ℓ) ) − f(B0 )

303

= f (j) (B s∧τ(ℓ) ), j = 0, 1, 2. Thus, Theorem 16.8 shows for all

=

t∧τ(ℓ) ∫ fℓ󸀠 (B s ) dB s

16.8.e)

0

t

t∧τ(ℓ)

1 + ∫ fℓ󸀠󸀠 (B s ) ds 2 0

t

1 + ∫ fℓ󸀠󸀠 (B s )𝟙[0,τ(ℓ)) (s) ds 2

=

∫ fℓ󸀠 (B s )𝟙[0,τ(ℓ)) (s) dB s

=

∫ f 󸀠 (B s )𝟙[0,τ(ℓ)) (s) dB s +

=

∫ f (B s ) dB s +

0

t

0

t∧τ(ℓ) 16.8.e) 󸀠

t∧τ(ℓ)

0

0

t

1 ∫ f 󸀠󸀠 (B s )𝟙[0,τ(ℓ)) (s) ds 2 0

1 ∫ f 󸀠󸀠 (B s ) ds. 2 0

Since limℓ→∞ τ(ℓ) = ∞ almost surely – Brownian motion does not explode in finite time – and since the (stochastic) integrals are continuous as functions of their upper boundary, the proof of Theorem 18.1 is complete.

18.4 Itô’s formula for stochastic differentials We will now derive Itô’s formula for one-dimensional Itô processes, i.e. processes of the form dX t = σ(t) dB t + b(t) dt

where σ, b are locally bounded (in t) and progressively measurable. We have seen in Remark 18.3 that under this assumption b ∈ L1 ([0, T], dt) a.s. and σ ∈ L2T,loc for all T > 0. We can also show that b ∈ L2T,loc . Fix some common localizing sequence τ n . Since ST = L2T we find, for every n, sequences of simple processes (b Π )Π , (σ Π )Π ⊂ ST such that L2 (λ T ⊗ℙ)

σ Π 𝟙[0,τ n ) 󳨀󳨀󳨀󳨀󳨀󳨀󳨀󳨀→ σ𝟙[0,τ n ) |Π|→0

and

L2 (λ T ⊗ℙ)

b Π 𝟙[0,τ n ) 󳨀󳨀󳨀󳨀󳨀󳨀󳨀󳨀→ b𝟙[0,τ n ) . |Π|→0

Using a diagonal procedure we can achieve that the sequences (b Π )Π , (σ Π )Π are independent of n.

18.5 Lemma. Let X tΠ = X0 + ∫0 σ Π (s) dB s + ∫0 b Π (s) ds. Then X tΠ → X t uniformly in probability as |Π| → 0, i.e. we have t

t

󵄨 󵄨 lim ℙ (sup󵄨󵄨󵄨X tΠ − X t 󵄨󵄨󵄨 > ϵ) = 0

|Π|→0

t⩽T

for all ϵ > 0, T > 0.

(18.9)

304 | 18 Itô’s formula Proof. The argument is similar to the one used to prove Theorem 16.10. Let τ n be some localizing sequence and fix n ⩾ 0, T > 0 and ϵ > 0. Then 󵄨󵄨 t 󵄨󵄨 󵄨 󵄨 ℙ (sup 󵄨󵄨󵄨󵄨 ∫ (σ Π (s) − σ(s)) dB s 󵄨󵄨󵄨󵄨 > ϵ) 󵄨󵄨 t⩽T 󵄨󵄨 0 ⩽



Chebyshev



Doob (15.22)



n fixed

󵄨󵄨 t 󵄨󵄨 󵄨 󵄨 ℙ (sup 󵄨󵄨󵄨󵄨 ∫ (σ Π (s) − σ(s)) dB s 󵄨󵄨󵄨󵄨 > ϵ, τ n > T) + ℙ(τ n ⩽ T) 󵄨 󵄨󵄨 t⩽T 󵄨 0 󵄨󵄨 t∧τ n 󵄨󵄨 󵄨 󵄨 ℙ (sup 󵄨󵄨󵄨󵄨 ∫ (σ Π (s) − σ(s)) dB s 󵄨󵄨󵄨󵄨 > ϵ) + ℙ(τ n ⩽ T) 󵄨󵄨 t⩽T 󵄨󵄨 0 󵄨󵄨 T∧τ n 󵄨󵄨2 4 󵄨󵄨 󵄨 [ 󵄨󵄨 ∫ (σ Π (s) − σ(s)) dB s 󵄨󵄨󵄨 ] + ℙ(τ n ⩽ T) 𝔼 󵄨󵄨 󵄨󵄨 ϵ2 󵄨 ] [󵄨 0 T∧τ n

4 󵄨 󵄨2 𝔼 [ ∫ 󵄨󵄨󵄨σ Π (s) − σ(s)󵄨󵄨󵄨 ds] + ℙ(τ n ⩽ T) ϵ2 [ 0 ]

󳨀󳨀󳨀󳨀󳨀→ ℙ(τ n ⩽ T) 󳨀󳨀󳨀󳨀󳨀→ 0. |Π|→0

n→∞

A similar, but simpler, calculation yields 󵄨󵄨 t 󵄨󵄨 󵄨 󵄨 ℙ (sup 󵄨󵄨󵄨󵄨 ∫ (b Π (s) − b(s)) ds󵄨󵄨󵄨󵄨 > ϵ) 󳨀󳨀󳨀󳨀󳨀󳨀→ 0. |Π|→0 󵄨󵄨 t⩽T 󵄨󵄨 0

Finally, for any two random variables X, Y : Ω → [0, ∞),

ℙ(X + Y > ϵ) ⩽ ℙ(X > ϵ/2) + ℙ(Y > ϵ/2),

and the claim follows since the supremum is subadditive.

18.6 Lemma. Let X t = X0 +∫0 σ(s) dB s +∫0 b(s) ds and Y t = Y0 +∫0 τ(s) dB s +∫0 c(s) ds, t ∈ [0, T], be two Itô processes with σ, b ∈ L2T and τ, c ∈ ST , respectively. Denote by Π = {0 = t0 < t1 < ⋅ ⋅ ⋅ < t N = T} a partition of the interval [0, T] and g ∈ Cb (ℝ). Then t

N

t



T

t

T

∑ g(X t l−1 )(Y t l − Y t l−1 ) 󳨀󳨀󳨀󳨀󳨀→ ∫ g(X s )τ(s) dB s + ∫ g(X s )c(s) ds

l=1

N

|Π|→0 ℙ

0

|Π|→0

(18.10)

0

T

∑ g(ξ l )(Y t l − Y t l−1 )2 󳨀󳨀󳨀󳨀󳨀→ ∫ g(X s )τ2 (s) ds

l=1

t

(18.11)

0

where ξ l (ω) = X t l−1 (ω) + θ l (ω)(X t l (ω) − X t l−1 (ω)), 0 ⩽ θ l ⩽ 1, l = 1, . . . , N, are intermediate values such that g(ξ l ) is measurable.

18.4 Itô’s formula for stochastic differentials |

We define the stochastic integral for an Itô process dX t = σ(t) dB t + b(t) dt as t

t

t

∫ f(s) dX s := ∫ f(s)σ(s) dB s + ∫ f(s)b(s) ds 0

305

0

(18.12)

0

whenever the right-hand side makes sense. It is possible to define the Itô integral for the integrator dX t directly in the sense of Chapters 15 and 16 and to derive the equality (18.12) rather than to use it as a definition. Proof of Lemma 18.6. Denote by ∆ the common refinement of the partitions of the simple processes τ and c. Let Π = {0 = t0 < t1 < ⋅ ⋅ ⋅ < t N = T} be partitions of [0, T]; we can assume that ∆ ⊂ Π. The processes s 󳨃→ g(X s )τ(s) and s 󳨃→ g(X s )c(s) are bounded, right continuous and have only finitely many discontinuities, and we see from Theorem 16.9 (or Proposition 15.18) and an approximation by Riemann sums that N

N

N

∑ g(X t l−1 )(Y t l − Y t l−1 ) = ∑ g(X t l−1 )τ(t l−1 )(B t l − B t l−1 ) + ∑ g(X t l−1 )c(t l−1 )(t l − t l−1 )

l=1

l=1

l=1

T



T

󳨀󳨀󳨀󳨀󳨀→ ∫ g(X s )τ(s) dB s + ∫ g(X s )c(s) ds; |Π|→0

0

0

this proves (18.10). In order to see (18.11) we observe that N

N

2

∑ g(ξ l )(Y t l − Y t l−1 )2 = ∑ g(ξ l )(τ(t l−1 )(B t l − B t l−1 ) + c(t l−1 )(t l − t l−1 ))

l=1

l=1

N

= ∑ g(ξ l )τ2 (t l−1 )(B t l − B t l−1 )2 l=1

N

+ 2 ∑ g(ξ l )τ(t l−1 )c(t l−1 )(t l − t l−1 )(B t l − B t l−1 ) l=1

N

+ ∑ g(ξ l )c2 (t l−1 )(t l − t l−1 )2 l=1

=: J1 + J2 + J3 .

The proof of (18.6) still works, almost literally, if we have g(ξℓ )τ2 (t l−1 ) instead of g(ξℓ ) where τ(t l−1 ) is bounded and Ft l−1 measurable. Since s 󳨃→ g(X s )τ2 (s) is piecewise continuous with finitely many discontinuities, we see lim|Π|→0 J1 = ∫0 g(X s )τ2 (s) ds in probability. Moreover, T

N

|J2 | ⩽ 2 ∑ ‖g‖∞ ‖τ‖∞ ‖c‖∞ (t l − t l−1 ) max |B t k − B t k−1 | k

l=1

= 2 ‖g‖∞ ‖τ‖∞ ‖c‖∞ T max |B t k − B t k−1 |, k

and, as Brownian motion is a.s. uniformly continuous on [0, T], we get J2 → 0 a.s. and in probability; a similar calculation gives J3 → 0 a.s. and in probability.

306 | 18 Itô’s formula 18.7 Theorem (Itô 1942). Let X t be a one-dimensional Itô process in the sense of Definition 18.2 such that dX t = σ(t) dB t + b(t) dt and let f : ℝ → ℝ be a C2 -function. Then we have for all t ⩾ 0 almost surely t

t

t

f(X t ) − f(X0 ) = ∫ f 󸀠 (X s )σ(s) dB s + ∫ f 󸀠 (X s )b(s) ds + 0

0

t

t

0

0

1 =: ∫ f 󸀠 (X s ) dX s + ∫ f 󸀠󸀠 (X s )σ2 (s) ds. 2

1 ∫ f 󸀠󸀠 (X s )σ2 (s) ds 2 0

(18.13)

Proof. 1o As in the proof of Theorem 18.1 it is enough to show (18.13) locally for t ⩽ τ(ℓ) where τ(ℓ) = inf{s > 0 : |X s | ⩾ ℓ}.

This means that we can assume, without loss of generality, that ‖f‖∞ + ‖f 󸀠 ‖∞ + ‖f 󸀠󸀠 ‖∞ + ‖σ‖∞ + ‖b‖∞ ⩽ C < ∞.

2o Lemma 18.5 shows that we can approximate σ and b by elementary processes σ Π and b Π . Denote the corresponding Itô process by X Π . Since f ∈ C2b , the following limits are uniform in probability: X Π 󳨀󳨀󳨀󳨀󳨀→ X, |Π|→0

Ex. 18.16

f(X Π ) 󳨀󳨀󳨀󳨀󳨀→ f(X). |Π|→0

The right-hand side of (18.13) with σ Π , b Π and X Π also converges uniformly in probability by (an obvious modification of) Lemma 18.5. This means that we have to show (18.13) only for Itô processes where σ and b are from ST for some T > 0.

3o Assume that σ, b ∈ ST and that Π = {t0 = 0 < t1 < ⋅ ⋅ ⋅ < t N = T} is the joint refinement of the underlying partitions. Fix t ⩽ T and assume, without loss of generality, that t = t k ∈ Π. As in Step 1o of the proof of Theorem 18.1, we get by Taylor’s theorem k

f(X t ) − f(X0 ) = ∑ f 󸀠 (X t l−1 )(X t l − X t l−1 ) + l=1

=: J1 + J2

1 k 󸀠󸀠 ∑ f (ξ l )(X t l − X t l−1 )2 2 l=1

where ξ l is an intermediate point between X t l−1 and X t l such that f 󸀠󸀠 (ξ l ) is measurable. Since σ and b are simple processes, we can use Lemma 18.6 to conclude that ℙ

t

󸀠

J1 󳨀󳨀󳨀󳨀󳨀→ ∫ f (X s ) dX s |Π|→0

0

t

1 and J2 󳨀󳨀󳨀󳨀󳨀→ ∫ f 󸀠󸀠 (X s )σ2 (s) ds. |Π|→0 2 ℙ

0

18.5 Itô’s formula for Brownian motion in ℝd

|

307

18.5 Itô’s formula for Brownian motion in ℝd

Assume that B = (B1 , . . . , B d ) is a BMd . Recall that the stochastic integral with respect to dB is defined for each coordinate, cf. Section 18.1. With this convention, the multidimensional version of Itô’s formula reads as follows: 18.8 Theorem (Itô 1942). Let (B t )t⩾0 , B t = (B1t , . . . , B dt ), be a d-dimensional Brownian t t motion and X t = X0 + ∫0 σ(s) dB s + ∫0 b(s) ds an m-dimensional Itô process with the coefficients σ = (σ jk ) ∈ ℝm×d and b = (b j ) ∈ ℝm . For every C2 -function f : ℝm → ℝ one has d

t

m

f(X t ) − f(X0 ) = ∑ ∫ [ ∑ k=1 0

+

m

j=1

∂ j f(X s )σ jk (s)] dB ks

t

t

+ ∑ ∫ ∂ j f(X s )b j (s) ds j=1 0

d 1 m ∑ ∫ ∂ i ∂ j f(X s ) ∑ σ ik (s)σ jk (s) ds. 2 i,j=1 k=1

(18.14)

0

Sketch of the proof. The argument is similar to the proof of the one-dimensional version in Theorem 18.7. Since Lemma 18.5 remains valid for m-dimensional Itô processes, the first two steps, 1o and 2o of the proof of Theorem 18.7 remain the same. In Step 3o we use now Taylor’s formula for functions in ℝm . Let σ jk , b j ∈ ST , j = 1, . . . , m, k = 1, . . . , d and Π = {t0 = 0 < t1 < ⋅ ⋅ ⋅ < t N = T} be the joint refinement of all underlying partitions. Fix t ⩽ T and assume, without loss of generality, that t = t n ∈ Π. As in Step 1o of the proof of Theorem 18.1, we have n

f(X t ) − f(X0 ) = ∑ (f(X t l ) − f(X t l−1 )) l=1

and if we apply Taylor’s theorem to each increment, we see for l = 1, . . . , n f(X t l ) − f(X t l−1 ) m

j

j

= ∑ ∂ j f(X t l−1 )(X t l − X t l−1 ) + j=1

1 m j j ∑ ∂ i ∂ j f(ξ l )(X ti l − X ti l−1 )(X t l − X t l−1 ) 2 i,j=1

where ξ l = ξ l (ω) is an intermediate point between X t l−1 and X t l . From the integral representation of the remainder term in the Taylor formula, 1

1 ∂ i ∂ j f(ξ l ) = ∫ ∂ i ∂ j f(X t l−1 + r(X t l − X t l−1 ))(1 − r) dr 2 0

we see that ∂ i ∂ j f(ξ l ) is measurable. Recall that j

j

d

X t l − X t l−1 = ∑ σ jk (t l−1 )(B ktl − B ktl−1 ) + b j (t l−1 )(t l − t l−1 ). k=1

308 | 18 Itô’s formula Using the shorthand ∆ l B k = B ktl − B ktl−1 and ∆ l t = t l − t l−1 , we see j

d

j

(X ti l − X ti l−1 )(X t l − X t l−1 ) = ∑ σ jk (t l−1 )σ ik󸀠 (t l−1 ) ∆ l B k ∆ l B k k,k󸀠 =1

󸀠

d

+ ∑ σ jk (t l−1 )b i (t l−1 ) ∆ l B k ∆ l t k=1 d

󸀠

+ ∑ σ ik󸀠 (t l−1 )b j (t l−1 ) ∆ l B k ∆ l t k󸀠 =1

+ b i (t l−1 )b j (t l−1 )(∆ l t)2 .

Inserting this into Taylor’s formula, we get six different groups of terms, f(X t ) − f(X0 ) = J0 +

1 (J1 + ⋅ ⋅ ⋅ + J5 ) 2

which we consider separately. By Lemma 18.6, (18.10), m

n



m

t

J0 = ∑ ∑ ∂ j f(X t l−1 )(X t l − X t l−1 ) 󳨀󳨀󳨀󳨀󳨀→ ∑ ∫ ∂ j f(X s ) dX s . j

j

|Π|→0

j=1 l=1

The last three groups, m

j

j=1 0

n

J5 = ∑ ∑ ∂ i ∂ j f(ξ l )b i (t l−1 )b j (t l−1 )(∆ l t)2 i,j=1 l=1 m

n

d

J3 + J4 = 2 ∑ ∑ ∂ i ∂ j f(ξ l )b i (t l−1 ) ∑ σ jk (t l−1 ) ∆ l B k ∆ l t i,j=1 l=1

k=1

tend to zero as |Π| → 0 because of the argument used in the proof of (18.11) from Lemma 18.6. Again by (18.11), m

n

i,j=1 l=1

where

d

2



J1 = ∑ ∑ ∂ i ∂ j f(ξ l ) ∑ σ jk (t l−1 )σ ik (t l−1 )(∆ l B k ) 󳨀󳨀󳨀󳨀󳨀→ Σ m

t

k=1

|Π|→0

d

Σ := ∑ ∫ ∂ i ∂ j f(X s ) ∑ σ jk (s)σ ik (s) ds. i,j=1 0

k=1

Finally, we get for a suitable value Σ 󸀠 m

n

󸀠



J2 = ∑ ∑ ∂ i ∂ j f(ξ l ) ∑ σ jk (t l−1 )σ ik󸀠 (t l−1 ) ∆ l B k ∆ l B k 󳨀󳨀󳨀󳨀󳨀→ Σ󸀠 − Σ󸀠 = 0. i,j=1 l=1

k=k ̸ 󸀠

|Π|→0

18.6 The time-dependent Itô formula |

309

󸀠

This follows from the convergence of J1 and the independence of B k and B k : we have 󸀠

where

1 (B k √2

2 (∆ l B k ) (∆ l B k ) = [∆ l ( 󸀠

󸀠

2

󸀠

2

Bk + Bk Bk − Bk )] − [∆ l ( )] √2 √2

± B k ) are independent one-dimensional Brownian motions.

18.9 Remark. a) The calculations in the proof of Theorem 18.8 show that we can write Itô’s formula using stochastic differentials which is much easier to memorize than (18.14) m 1 m j j (18.15) df(X t ) = ∑ ∂ j f(X t ) dX t + ∑ ∂ i ∂ j f(X t ) dX ti dX t 2 i,j=1 j=1

and the “product” dX ti dX t can be formally expanded using Itô’s multiplication table from Section 18.2, p. 300. Note that the proof of Theorem 18.8 justifies these rules. j

b) In the literature often the following matrix and vector version of (18.14) is used: t

t ⊤

f(X t ) − f(X0 ) = ∫ ∇f(X s ) σ(s) dB s + ∫ ∇f(X s )⊤ b(s) ds 0

0

t

1 + ∫ trace(σ(s)⊤ D2 f(X s )σ(s)) ds 2

(18.16)

0

where ∇ = (∂1 , . . . , ∂ m

)⊤

is the gradient, D2 f = (∂ j ∂ k f)m j,k=1 is the Hessian, and dB s is

understood as a column vector from ℝd . In differential notation this becomes df(X t ) = ∇f(X t )⊤ σ(t) dB t + ∇f(X t )⊤ b(t) dt 1 + trace(σ(t)⊤ D2 f(X t )σ(t)) dt. 2

(18.17)

18.6 The time-dependent Itô formula

We will now come to a further extension of Itô’s formula, where we allow the function f to depend also on time: f = f(t, X t ). With a simple trick we can see that we can treat this in the framework of Section 18.5. If X is an m-dimensional Itô process, t

t

X t = X0 + ∫ σ(s) dB s + ∫ b(s) ds, where σ ∈ ℝm×d , b ∈ ℝm , 0

0

then we see that t

t

0

0

t 0 ̂ ds, where σ ̂ = (1) ∈ ℝm+1 ̂ (s) dB s + ∫ b(s) ̂ = ( ) ∈ ℝ(m+1)×d , b ( ) = ∫σ Xt σ b

310 | 18 Itô’s formula is an (m+1)-dimensional Itô process. Since dt dB kt = 0, cf. Remark 18.9.a) and Table 18.1, we get immediately the following result. 18.10 Lemma (Itô 1942). Let (B t )t⩾0 , B t = (B1t , . . . , B dt ), be a d-dimensional Brownian t t motion and X t = X0 + ∫0 σ(s) dB s + ∫0 b(s) ds an m-dimensional Itô process. For every C2 -function f : [0, ∞) × ℝm → ℝ m

d

m

df(t, X t ) = ∂ t f(t, X t ) dt + ∑ ∂ j f(t, X t )b j (t) dt + ∑ ∑ ∂ j f(t, X t )σ jk (t) dB kt k=1 j=1

j=1

1 ∑ ∂ i ∂ j f(t, X t ) ∑ σ ik (t)σ jk (t) dt. 2 i,j=1 k=1 m

+

d

(18.18)

The second derivative ∂2t f(t, x) does not appear in (18.18), and it is an educated guess that (18.18) remains valid for functions f ∈ C1,2 ([0, ∞) × ℝm ).

18.11 Theorem (Itô 1942). The time-dependent Itô-formula (18.18) holds for all f ∈ C1,2 , i.e. f : [0, ∞) × ℝm → ℝ, (t, x) 󳨃→ f(t, x), which are continuously differentiable in t and twice continuously differentiable in x. Sketch of the proof.. We indicate two possibilities how we can obtain this improvement of Lemma 18.10. Method 1: for every f ∈ C1,2 there is a sequence of functions f n : [0, ∞) × ℝm → ℝ such that f n ∈ C2

and

locally uniformly

f n , ∂ t f n , ∂ j f n , ∂ i ∂ j f n 󳨀󳨀󳨀󳨀󳨀󳨀󳨀󳨀󳨀󳨀󳨀󳨀󳨀→ f, ∂ t f, ∂ j f, ∂ i ∂ j f. n→∞

Set g n (s) := ∂ t f n (s, X s ) (similarly: ∂ j f n (s, X s ), ∂ i ∂ j f n (s, X s )) and note that g n ∈ L2T,loc (cf. Theorem 16.9) and g n (s) → g(s) = ∂ t f(s, X s ) in L2T,loc . Since (18.18) holds for f n ∈ C2 , we can use Theorem 16.10 or the argument from Lemma 18.5 to see that the stochastic integrals appearing on the right-hand side of (18.18) converge in probability as n → ∞. The left-hand side f n (t, X t ) − f n (0, X0 ) converges to f(t, X t ) − f(0, X0 ) a.s., and the proof is finished. Method 2: we can avoid the approximation argument of the first proof if we inspect the proof of Theorem 18.8. The Taylor expansion for the increment f(X t l ) − f(X t l−1 ) is replaced by the following calculation: for l = 1, . . . , n f(t l , X t l ) − f(t l−1 , X t l−1 ) = (f(t l , X t l ) − f(t l−1 , X t l )) + (f(t l−1 , X t l ) − f(t l−1 , X t l−1 )) m

= ∂ t f(θ l , X t l )(t l − t l−1 ) + ∑ ∂ j f(t l−1 , X t l−1 )(X t l − X t l−1 ) j

j=1

1 m j j + ∑ ∂ i ∂ j f(t l−1 , ξ l )(X ti l − X ti l−1 )(X t l − X t l−1 ) 2 i,j=1

j

18.7 Tanaka’s formula and local time | 311

where θ l = θ l (ω) and ξ l = ξ l (ω) are convex combinations of t l−1 , t l and X t l−1 , X t l , respectively, such that ∂ t f(θ l , X t l ) and ∂ i ∂ j f(t l−1 , ξ l ) are measurable. The rest of the proof is now similar to the proof of Theorem 18.8.

18.7 Tanaka’s formula and local time Itô’s formula has been extended in various directions. Here we discuss the prototype of generalizations which relax the smoothness assumptions for the function f . The key point is that we retain some control on the (generalized) second derivative of f , e.g. if f x is convex – as in the Itô–Meyer formula, cf. Section 19.8 – or if f(x) = ∫0 ϕ(y) dy where ϕ ∈ Bb (ℝ) – as in the Bouleau–Yor formula [211, Theorems IV.70, 77]. In this section we consider one-dimensional Brownian motion (B t )t⩾0 and the function f(x) = |x|. Differentiating f (in the sense of generalized functions) yields f 󸀠 (x) = sgn(x) and 󸀠󸀠 f (x) = δ0 (x). A formal application of Itô’s formula (18.1) gives t

t

|B t | = ∫ sgn(B s ) dB s + 0

1 ∫ δ0 (B s ) ds. 2

(18.19)

0

In order to justify (18.19), we have to smooth out the function f(x) = |x|. For ϵ > 0 {|x| − 21 ϵ, f ϵ (x) := { 1 2 x , { 2ϵ

|x| > ϵ, |x| ⩽ ϵ,

{sgn(x), f ϵ󸀠 (x) = { 1 x, {ϵ

|x| > ϵ,

|x| ⩽ ϵ,

{0, f ϵ󸀠󸀠 (x) = { 1 , {ϵ

|x| > ϵ, |x| < ϵ.

The approximations f ϵ are not yet C2 functions. Therefore, we use a Friedrichs mollifier: Pick any χ ∈ C∞ c (−1, 1) with 0 ⩽ χ ⩽ 1, χ(0) = 1 and ∫ χ(x) dx = 1, and set for n ⩾ 1 χ n (x) := nχ(nx) and

𝜖/2

−𝜖

f ϵ,n (x) := f ϵ ⋆ χ n (x) = ∫ χ n (x − y)f ϵ (y) dy.

−𝜖 𝜖

𝜖−1

1 𝜖

Fig. 18.1: Smooth approximation of the function x 󳨃→ |x| and its derivatives.

−𝜖

𝜖

312 | 18 Itô’s formula

Ex. 18.13

󸀠 It is a not hard to see that limn→∞ f ϵ,n = f ϵ and limn→∞ f ϵ,n = f ϵ󸀠 uniformly while 󸀠󸀠 󸀠󸀠 limn→∞ f ϵ,n (x) = f ϵ (x) pointwise for all x ≠ ±ϵ. If we apply Itô’s formula (18.1) we get t

f ϵ,n (B t ) =

t

󸀠 (B s ) dB s ∫ f ϵ,n 0

1 󸀠󸀠 + ∫ f ϵ,n (B s ) ds 2

a.s.

0

On the left-hand side we see limϵ→0 limn→∞ f ϵ,n (B t ) = |B t | a.s. and in probability. For the expression on the right we have because of Itô’s isometry t 󵄨󵄨 t 󵄨󵄨2 󵄨 󵄨 󵄨 󸀠 󵄨2 󸀠 𝔼 [󵄨󵄨󵄨󵄨 ∫(f ϵ,n (B s ) − sgn(B s )) dB s 󵄨󵄨󵄨󵄨 ] = 𝔼 [∫ 󵄨󵄨󵄨󵄨f ϵ,n (B s ) − sgn(B s )󵄨󵄨󵄨󵄨 ds] . 󵄨󵄨 󵄨󵄨 [ 0 ] [0 ]

Letting first n → ∞ and then ϵ → 0, the right-hand side converges to t 󵄨 󵄨󵄨2 󵄨󵄨 dom. conv. 󵄨 [ 𝔼 ∫ 󵄨󵄨󵄨󵄨f ϵ󸀠 (B s ) − sgn(B s )󵄨󵄨󵄨󵄨 ds] 󳨀󳨀󳨀󳨀󳨀󳨀󳨀󳨀→ 0; ϵ→0 󵄨 󵄨󵄨 [0 󵄨 ] t

t

in particular, ℙ-limϵ→0 ∫0 f ϵ󸀠 (B s ) dB s = ∫0 sgn(B s ) dB s . Finally, set 󸀠󸀠 Ω s := {ω : lim f ϵ,n (B s (ω)) = f ϵ󸀠󸀠 (B s (ω))} . n→∞

As ℙ(B s = ±ϵ) = 0, we know that ℙ(Ω s ) = 1 for each s ∈ [0, t]. By Fubini’s theorem t

ℙ ⊗ Leb{(ω, s) : s ∈ [0, t], ω ∉ Ω s } = ∫ ℙ(Ω \ Ω s ) ds = 0, 0

󸀠󸀠 ‖ −1 and so Leb{s ∈ [0, t] : ω ∈ Ω s } = t for ℙ almost all ω. Since ‖f ϵ,n ∞ ⩽ ϵ , the dominated convergence theorem yields t

t

t

󸀠󸀠 L2 - lim ∫ f ϵ,n (B s ) ds = ∫ f ϵ󸀠󸀠 (B s ) ds = n→∞

0

0

Therefore, we have shown that

1 ∫ 𝟙(−ϵ,ϵ) (B s ) ds. ϵ 0

t

t

0

0

1 ∫ 𝟙(−ϵ,ϵ) (B s ) ds = |B t | − ∫ sgn(B s ) dB s ϵ→0 2ϵ

ℙ- lim

exists and defines a stochastic process. The same argument applies for the shifted function f(x) = |x − a|, a ∈ ℝ. Therefore, the following definition makes sense.

18.12 Definition (Brownian local time. Lévy 1939). Let (B t )t⩾0 be a BM1 and a ∈ ℝ. The local time at the level a up to time t is the random process t

L at

1 := ℙ- lim ∫ 𝟙(−ϵ,ϵ) (B s − a) ds. ϵ→0 2ϵ 0

(18.20)

Problems

| 313

The local time L at represents the total amount of time Brownian motion spends at the level a up to time t: t

1 1 Leb{s ∈ [0, t] : B s ∈ (a − ϵ, a + ϵ)}. ∫ 𝟙(−ϵ,ϵ) (B s − a) ds = 2ϵ 2ϵ 0

We are now ready for the next result.

18.13 Theorem (Tanaka’s formula. Tanaka 1963). Let (B t )t⩾0 denote a one-dimensional Brownian motion. Then we have for every a ∈ ℝ t

|B t − a| = |a| + ∫ sgn(B s − a) dB s + L at

(18.21)

0

where is Brownian local time at the level a. In particular, t 󳨃→ L at (has a version which) is continuous. L at

Proof. The formula (18.21) follows from the preceding discussion for the shifted function f(x) = |x − a|. The continuity of t 󳨃→ L at is now obvious, since both Brownian motion and the stochastic integral have continuous versions.

Tanaka’s formula is the starting point for many further investigations. For example, t it is possible to show that β t = ∫0 sgn(B s ) dB s is again a one-dimensional Brownian a = ∞) = 1 for any a ∈ ℝ. motion and L0t = sups⩽t (−β s ). Moreover, ℙ0 (L∞ 18.14 Further reading. All books mentioned in the Further reading section of Chapter 15 contain proofs of Itô’s formula. Local times are treated in much greater detail in [136] and [216]. An interesting alternative proof of Itô’s formula is given in [150], see also Problem 18.5. [136] Karatzas, Shreve: Brownian Motion and Stochastic Calculus. [150] Krylov: Introduction to the Theory of Diffusion Processes. [216] Revuz, Yor: Continuous Martingales and Brownian Motion.

Problems 1.

Let (B t )t⩾0 be a BM1 . Use Itô’s formula to obtain representations of t

X t = ∫ exp(B s ) dB s 0

which do not contain Itô integrals.

t

and

Y t = ∫ B s exp(B2s ) dB s 0

Ex. 18.14

Ex. 18.5

314 | 18 Itô’s formula 2.

Let Π = {0 = t0 < t1 < ⋅ ⋅ ⋅ < t N = T} be a generic partition of the interval [0, T] and denote by |Π| = maxj |t j − t j−1 | the mesh of the partition. Show that the limit N

lim ∑ (t j − t j−1 )γ

|Π|→0

3.

j=1

is ∞, T or 0 according to γ < 1, γ = 1 or γ > 1.

Show that the limits (18.5) and (18.6) actually hold in L2 (ℙ) and uniformly for T from compact sets.

Hint. To show that the limits hold uniformly, use Doob’s maximal inequality. For the L2 -version of

(18.6) show first that 𝔼 [( ∑j (B t j − B t j−1 )2 )4 ] converges to c T 4 for some constant c.

4. Give a direct proof for the identity

2

N

𝔼 [( ∑ g(B t l−1 )[(B t l − B t l−1 )2 − (t l − t l−1 )]) ] l=1

N

2 󵄨2 󵄨 = 𝔼 [ ∑ 󵄨󵄨󵄨g(B t l−1 )󵄨󵄨󵄨 [(B t l − B t l−1 )2 − (t l − t l−1 )] ] . l=1

appearing in the proof of Lemma 18.4. 5.

The following exercise contains an alternative proof of Itô’s formula (18.1) for a one-dimensional Brownian motion (B t )t⩾0 . a) Let f ∈ C1 (ℝ). Show that for Π = {t0 = 0 < t1 < ⋅ ⋅ ⋅ < t n = t} and suitable intermediate values ξ l B t f(B t ) = ∑ B t l (f(B t l ) − f(B t l−1 )) + ∑ f(B t l−1 )(B t l − B t l−1 ) l

l

2

󸀠

+ ∑ f (ξ l )(B t l − B t l−1 ) . l

Conclude from this that the differential expression B t df(B t ) is well-defined.

b) Show that d(B t f(B t )) = B t df(B t ) + f(B t ) dB t + f 󸀠 (B t ) dt.

c) Use (b) to find a formula for dB nt , n ⩾ 2, and show that for every polynomial p(x) Itô’s formula holds: dp(B t ) = p󸀠 (B t ) dB t +

1 󸀠󸀠 p (B t ) dt. 2

d) Show the following stability result for Itô integrals: T

T

2

L (ℙ)- lim ∫ g n (s) dB s = ∫ g(s) dB s n→∞

0

0

for any sequence (g n )n⩾1 which converges in L2T .

Problems

6.

| 315

e) Use a suitable version of Weierstraß’ approximation theorem and the stopping technique from the proof of Theorem 18.1, Step 4o , to deduce from c) Itô’s formula for f ∈ C2 .

a) Use the (two-dimensional, deterministic) chain rule d(F ∘ G) = F 󸀠 ∘ G dG to deduce the formula for integration by parts for Stieltjes integrals: t

t

∫ f(s) dg(s) = f(t)g(t) − f(0)g(0) − ∫ g(s) df(s) for all f, g ∈ C1 ([0, ∞), ℝ). 0

0

b) Use the Itô formula for a Brownian motion (b t , β t )t⩾0 in ℝ2 to show that t

t

∫ b s dβ s = b t β t − ∫ β s db s , 0

0

What happens if b and β are not independent? 7.

t ⩾ 0.

Prove the following time-dependent version of Itô’s formula: let (B t )t⩾0 be a BM1 and f : [0, ∞) × ℝ → ℝ be a function of class C1,2 . Then t

f(t, B t ) − f(0, 0) = ∫ 0

t

∂f ∂f 1 ∂2 f (s, B s )) ds. (s, B s ) dB s + ∫ ( (s, B s ) + ∂x ∂t 2 ∂x2 0

Prove and state the d-dimensional counterpart.

Hint. Use the Itô formula for the (d + 1)-dimensional Itô process (t, B1t , . . . , B dt ).

8. Prove Theorem 5.6 using Itô’s formula. 9.

Let (B t , Ft )t⩾0 be a BM1 . Use Itô’s formula to verify that the following processes are martingales: X t = e t/2 cos B t

and

Y t = (B t + t) exp (−B t − t/2) .

10. Let B t = (b t , β t ), t ⩾ 0 be a BM2 and set r t := |B t | = √ b2t + β2t . t

t

a) Show that the stochastic integrals ∫0 b s /r s db s and ∫0 β s /r s dβ s exist. b) Show that W t := ∫0 b s /r s db s + ∫0 β s /r s dβ s is a BM1 . t

t

11. Let B t = (b t , β t ), t ⩾ 0 be a BM2 and f(x + iy) = u(x, y) + iv(x, y), x, y ∈ ℝ, an analytic function. If u2x + u2y = 1, then (u(b t , β t ), v(b t , β t )), t ⩾ 0, is a BM2 .

12. Show that the d-dimensional Itô formula remains valid if we replace the real-valued function f : ℝd → ℝ by a complex function f = u + iv : ℝd → ℂ.

13. (Friedrichs mollifier) Let χ ∈ C∞ c (−1, 1) with 0 ⩽ χ ⩽ 1, χ(0) = 1 and ∫ χ(x) dx = 1. Set χ n (x) := nχ(nx). a) Show that supp χ n ⊂ [−1/n, 1/n] and ∫ χ n (x) dx = 1.

316 | 18 Itô’s formula b) Let f ∈ C(ℝ) and f n := f ⋆ χ n . Show that |∂ k f n (x)| ⩽ n k supy∈𝔹(x,1/n) |f(y)|‖∂ k χ‖L1 .

c) Let f ∈ C(ℝ) uniformly continuous. Show that limn→∞ ‖f ⋆ χ n − f‖∞ = 0. What can be said if f is continuous but not uniformly continuous?

d) Let f be piecewise continuous. Show that limn→∞ f ⋆ χ n (x) = f(x) at all x where f is continuous.

14. Show that β t = ∫0 sgn(B s ) dB s is a BM1 . t

Hint. Use Lévy’s characterization of a BM1 , Theorem 9.13 or 19.5.

15. Let (B t )t⩾0 be a BM1 , Φ ∈ C2b (ℝ), Φ(0) = 0 and f, g ∈ L2T for some T > 0. Show that for all t ∈ [0, T] t

t

𝔼 [∫ f(r) dB r ⋅ Φ( ∫ g(r) dB r )] 0 [0 ] t

s

= 𝔼 [∫ f(s)g(s)Φ󸀠 ( ∫ g(r) dB r ) ds] 0 [0 ] t s

+

s

1 [ 𝔼 ∫ ∫ f(r) dB r Φ󸀠󸀠 ( ∫ g(r) dB r )g 2 (s) ds] . 2 0 [0 0 ]

16. Let σ Π , b Π ∈ ST which approximate σ, b ∈ L2T as |Π| → 0, and consider the Itô t

t

t

t

processes X tΠ = X0 +∫0 σ Π (s) dB s +∫0 b Π (s) ds and X t = X0 +∫0 σ(s) dB s +∫0 b(s) ds. Adapt the proof of Lemma 18.5 to show that T



T

∫ g(X sΠ ) dX sΠ 󳨀󳨀󳨀󳨀󳨀→ ∫ g(X s ) dX s 0

|Π|→0

0

for all g ∈ Cb (ℝ).

17. Let X t = M t + A t and X t = M 󸀠t + A󸀠t where M, M 󸀠 are continuous local martingales and A, A󸀠 are continuous processes which have locally a.s. bounded variation. Show that M − M 󸀠 and A − A󸀠 are a.s. constant. Hint. Have a look at Corollary 17.3.

19 Applications of Itô’s formula Itô’s formula has many applications, and we restrict ourselves to a few of them. We will use it to obtain a characterization of a Brownian motion as a martingale (Lévy’s theorem 19.5) and to describe the structure of “Brownian” martingales (Theorems 19.12, 19.14 and 19.17). In some sense, this will show that a Brownian motion is both a very particular martingale and the typical martingale. Girsanov’s theorem 19.9 allows us to change the underlying probability measure which will become important if we want to solve stochastic differential equations. The Burkholder–Davis–Gundy inequalities, Theorem 19.18, provide moment estimates for stochastic integrals. In the end, we will discuss some extensions to Brownian local times and Tanaka’s formula. Throughout this chapter (B t , Ft )t⩾0 is a BM with an admissible complete filtration, B

e.g. F t , see Theorem 6.22. Recall that L2T = L2P (λ T ⊗ ℙ) where T ∈ (0, ∞).

19.1 Doléans–Dade exponentials

Let (B t , Ft )t⩾0 be a Brownian motion on ℝ. We have seen in Example 5.2.e)) that ξ ξ (M t , Ft )t⩾0 is a martingale where M t := exp [ξB t − 2t ξ 2 ], ξ ∈ ℝ. Using Itô’s formula

we find that M t satisfies the following integral equation ξ

t

M t = 1 + ∫ ξ M s dB s ξ

or

ξ

0

dM t = ξ M t dB t . ξ

ξ

Classical calculus teaches us that the differential equation dy(t) = y(t) dx(t), y(0) = y0 has the unique solution y(t) = y0 exp [x(t) − x(0)]. If we compare this with the Brownian setting, we see that there appears the additional factor − 21 ξ 2 t = − 12 ⟨ξB⟩t which is due to the second order term in Itô’s formula. It is, however, this factor that makes M ξ into a martingale. Because of this analogy it is customary to call 1 E(M)t := exp (M t − ⟨M⟩t ) , t ⩾ 0, (19.1) 2 for any continuous L2 martingale (M t )t⩾0 the stochastic or Doléans–Dade exponential. If we use Itô’s formula for the stochastic integral for martingales from Chapter 17, it is not hard to see that E(M)t is itself a local martingale. Here we consider only some special cases which can be directly written in terms of a Brownian motion. 19.1 Lemma. Let (B t , Ft )t⩾0 be a BM1 , f ∈ L2P (λ T ⊗ ℙ) for all T > 0, and assume that |f(s, ω)| ⩽ C for some C > 0 and all s ⩾ 0 and ω ∈ Ω. Then t

t

0

0

1 exp (∫ f(s) dB s − ∫ f 2 (s) ds) , 2

is a martingale for the filtration (Ft )t⩾0 .

https://doi.org/10.1515/9783110741278-019

t ⩾ 0,

(19.2)

Ex. 19.1

318 | 19 Applications of Itô’s formula t

Proof. Set X t = ∫0 f(s) dB s − t

e

Xt

1 t 2 2 ∫0 f (s) ds.

Itô’s formula, Theorem 18.7, yields ¹ t

t

1 1 − 1 = ∫ e f(s) dB s − ∫ e X s f 2 (s) ds + ∫ e X s f 2 (s) ds 2 2 Xs

0

0

0

t

s

s

0

0

0

1 = ∫ exp (∫ f(r) dB r − ∫ f 2 (r) dr) f(s) dB s . 2

(19.3)

If we can show that the integrand is in L2P (λ T ⊗ ℙ) for every T > 0, then Theorem 15.15 applies and shows that the stochastic integral, hence e X t , is a martingale. We begin with an estimate for simple processes. Fix T > 0 and assume that g ∈ ST . Then g(s, ω) = ∑nj=1 g(s j−1 , ω)𝟙[s j−1 ,s j ) (s) for some partition 0 = s0 < . . . < s n = T of [0, T] and |g(s, ω)| ⩽ C for some constant C < ∞. Since g(s j−1 ) is Fs j−1 measurable and B s j − B s j−1 ⊥⊥ Fs j−1 , we find n

T

𝔼 [exp (2 ∫0 g(r) dB r )] = 𝔼 [∏ exp (2g(s j−1 )(B s j − B s j−1 ))] j=1

n−1

󵄨 = 𝔼 [ ∏ exp (2g(s j−1 )(B s j − B s j−1 )) 𝔼 (exp (2g(s n−1 )(B s n − B s n−1 )) 󵄨󵄨󵄨 Fs n−1 ) ]

tower

j=1

(A.4) = exp[2g 2 (s n−1 )(s n −s n−1 )] 2.2

n−1

⩽ exp [2C2 (s n − s n−1 )] 𝔼 [ ∏ exp (2g(s j−1 )(B s j − B s j−1 ))] . j=1

Further applications of the tower property yield n

𝔼 [exp (2 ∫0 g(r) dB r )] ⩽ ∏ exp (2C2 (s j − s j−1 )) = exp (2C2 T) . T

j=1

If f ∈ L2P (λ T ⊗ ℙ), there is a sequence of simple processes (f n )n⩾1 ⊂ ST such that

limn→∞ f n = f in L2P (λ T ⊗ ℙ) and limn→∞ ∫0 f n (s) dB s = ∫0 f(s) dB s for any t ∈ [0, T] in L2 (ℙ) and, for a subsequence, almost surely. We can assume that |f n (s, ω)| ⩽ C, otherwise we would consider −C ∨ f n ∧ C. Then t

󵄨󵄨 t 𝔼 [󵄨󵄨󵄨exp (∫0 f(r) dB r − 󵄨

󵄨󵄨2 1 t 2 󵄨 2 ∫0 f (r) dr) f(t)󵄨󵄨󵄨 ]

t

⩽ C2 𝔼 [exp (2 ∫0 f(r) dB r )] . t

By Fatou’s lemma, and the estimate for simple processes from above we get for t ∈ [0, T] 𝔼 [ lim exp (2 ∫0 f n (r) dB r )] ⩽ lim 𝔼 [exp (2 ∫0 f n (r) dB r )] ⩽ exp (2C2 t) . t

t

n→∞

n→∞

1 Compare the following equality (19.3) with (19.1).

19.1 Doléans–Dade exponentials | 319

If we integrate the resulting inequality 󵄨󵄨 t 𝔼 [󵄨󵄨󵄨exp (∫0 f(r) dB r − 󵄨

T

󵄨󵄨2 1 t 2 󵄨 2 ∫0 f (r) dr) f(t)󵄨󵄨󵄨 ]

with respect to ∫0 . . . dt, the claim follows.

⩽ C2 e2C

2

t

It is clear that Lemma 19.1 also holds for BMd and d-dimensional integrands. More interesting is the observation that we can replace the condition that the integrand is bounded by an integrability condition.

19.2 Theorem. Let (B_t, F_t)_{t⩾0}, B_t = (B¹_t, …, B^d_t), be a BM^d and f = (f_1, …, f_d) be a d-dimensional P measurable process such that f_j ∈ L²_P(λ_T ⊗ ℙ) for all T > 0 and j = 1, …, d. Then

    M_t = exp(∑_{j=1}^d ∫_0^t f_j(s) dB^j_s − (1/2)∫_0^t |f(s)|² ds)   (19.4)

is a martingale for the filtration (F_t)_{t⩾0} if, and only if, 𝔼 M_t = 1.

Proof. The necessity is obvious since M_0 = 1. Since f ∈ L²_P(λ_T ⊗ ℙ) for all T > 0, we see with the d-dimensional version of Itô's formula (18.14) applied to the process X_t = ∑_{j=1}^d ∫_0^t f_j(s) dB^j_s − (1/2)∫_0^t |f(s)|² ds that

    e^{X_t} = 1 + ∑_{j=1}^d ∫_0^t f_j(s) e^{X_s} dB^j_s
            = 1 + ∑_{j=1}^d ∫_0^t f_j(s) exp(∑_{k=1}^d ∫_0^s f_k(r) dB^k_r − (1/2)∫_0^s |f(r)|² dr) dB^j_s.

Let τ_n = inf{s ⩾ 0 : |exp(X_s)| ⩾ n} ∧ n; since s ↦ exp(X_s) is continuous, τ_n is a stopping time and exp(X_{s∧τ_n}) ⩽ n. Therefore, f^{τ_n} ⋅ exp(X^{τ_n}) is in L²_P(λ_T ⊗ ℙ) for all T > 0, and Theorem 15.15.e) shows that M^{τ_n} = exp(X^{τ_n}) is a martingale. Clearly, lim_{n→∞} τ_n = ∞, which means that M_t is a local martingale, cf. Definition 16.7. Proposition 19.3 below shows that any positive local martingale with 𝔼 M_t = 1 is already a martingale, and the claim follows.

The last step in the proof of Theorem 19.2 is interesting on its own:

19.3 Proposition. a) Every positive local martingale (M t , Ft )t⩾0 is a supermartingale. b) Let (M t , Ft )t⩾0 be a supermartingale with 𝔼 M t = 𝔼 M0 for all t ⩾ 0, then (M t , Ft )t⩾0 is a martingale.

Proof. a) Let τ_n be a localizing sequence for (M_t)_{t⩾0}. By Fatou's lemma we see that

    𝔼 M_t = 𝔼(lim_{n→∞} M_{t∧τ_n}) ⩽ lim_{n→∞} 𝔼 M_{t∧τ_n} = 𝔼 M_0 < ∞,

since M^{τ_n} is a martingale. The conditional version of Fatou's lemma shows for all s ⩽ t

    𝔼(M_t | F_s) ⩽ lim_{n→∞} 𝔼(M_{t∧τ_n} | F_s) = lim_{n→∞} M_{s∧τ_n} = M_s,

i.e. (M_t)_{t⩾0} is a supermartingale.

b) Let s ⩽ t and F ∈ F_s. By assumption, ∫_F M_s dℙ ⩾ ∫_F M_t dℙ. Subtracting 𝔼 M_s = 𝔼 M_t from both sides gives

    ∫_F M_s dℙ − 𝔼 M_s ⩾ ∫_F M_t dℙ − 𝔼 M_t.

Therefore,

    ∫_{F^c} M_s dℙ ⩽ ∫_{F^c} M_t dℙ   for all s ⩽ t, F^c ∈ F_s.

This means that (M_t)_{t⩾0} is a submartingale, hence a martingale.

We close this section with the Novikov condition which gives a sufficient criterion for 𝔼 M_t = 1. The following proof is essentially the proof which we used for the exponential Wald identity, cf. Theorem 5.14. We prove only the one-dimensional version; the extension to higher dimensions is obvious.

19.4 Theorem (Novikov 1972). Let (B_t)_{t⩾0} be a BM¹ and f ∈ L²_P(λ_T ⊗ ℙ) for some fixed T > 0. If

    𝔼[exp((1/2) ∫_0^T |f(s)|² ds)] < ∞,   (19.5)

then 𝔼 M_t = 1 for all t ∈ [0, T], where M_t is the stochastic exponential

    M_t = exp(∫_0^t f(s) dB_s − (1/2)∫_0^t |f(s)|² ds).

Proof. We have seen in the proof of Theorem 19.2 that M^T_t := M_{t∧T} is a positive local martingale, and by Proposition 19.3.a) it is a supermartingale. In particular, we have 𝔼 M^T_t ⩽ 𝔼 M^T_0 = 1.

For the converse inequality write X_t = ∫_0^t f(s) dB_s and ⟨X⟩_t = ∫_0^t |f(s)|² ds. Then M_t = exp[X_t − (1/2)⟨X⟩_t] = E(X)_t, t ∈ [0, T], and the condition (19.5) reads 𝔼(exp[(1/2)⟨X⟩_T]) < ∞. Let (τ_n)_{n⩾1} be a localizing sequence for the local martingale M^T, fix some c ∈ (0, 1) and pick p = p(c) > 1 such that p < 1/c(2 − c) (⩽ 1/c). Then M^{T∧τ_n}_t = exp[X^{T∧τ_n}_t − (1/2)⟨X⟩^{T∧τ_n}_t] is a martingale with 𝔼 M_{t∧τ_n} = 𝔼 M_0 = 1 for all t ∈ [0, T]. So,

    𝔼[E(cX^{τ_n})_t^p] = 𝔼[exp(pc X^{τ_n}_t − (1/2) pc² ⟨X⟩^{τ_n}_t)]
      = 𝔼[exp(pc(X^{τ_n}_t − (1/2)⟨X⟩^{τ_n}_t)) exp((1/2) pc(1 − c)⟨X⟩^{τ_n}_t)]

for all t ∈ [0, T], and from the Hölder inequality for the exponents 1/pc and 1/(1 − pc) we find

    𝔼[E(cX^{τ_n})_t^p] ⩽ [𝔼 exp(X^{τ_n}_t − (1/2)⟨X⟩^{τ_n}_t)]^{pc} [𝔼 exp((1/2) · pc(1−c)/(1−pc) · ⟨X⟩_t)]^{1−pc}
      = [𝔼 M_{t∧τ_n}]^{pc} [𝔼 exp((1/2) · pc(1−c)/(1−pc) · ⟨X⟩_t)]^{1−pc}
      ⩽ [𝔼 exp((1/2)⟨X⟩_T)]^{1−pc}.

In the last step we use that pc(1 − c)/(1 − pc) ⩽ 1 and 𝔼 M_{t∧τ_n} = 1. This shows that the pth moment of the family M^{T∧τ_n} = (M^{τ_n}_{t∧T})_{n⩾1} is uniformly bounded, hence M^{T∧τ_n} is uniformly integrable. Therefore, we can let n → ∞ and find, using uniform integrability and Hölder's inequality for the exponents 1/c and 1/(1 − c),

    1 = 𝔼[exp(c X^{τ_n}_t − (1/2) c² ⟨X⟩^{τ_n}_t)] ⟶ 𝔼[exp(c(X_t − (1/2)⟨X⟩_t)) exp((1/2) c(1 − c)⟨X⟩_t)]   (n → ∞)
      ⩽ [𝔼 exp(X_t − (1/2)⟨X⟩_t)]^c [𝔼 exp((1/2) c ⟨X⟩_t)]^{1−c}.

Since the last factor is bounded by (𝔼 exp[(1/2)⟨X⟩_T])^{1−c} < ∞, we find as c → 1,

    1 ⩽ 𝔼 exp[X_t − (1/2)⟨X⟩_t] = 𝔼 M_t   for all t ∈ [0, T].

19.2 Lévy’s characterization of Brownian motion

In the hands of Kunita and Watanabe [154] Itô’s formula became a most powerful tool. We will follow their ideas and give an elegant proof of the following theorem due to P. Lévy. 19.5 Theorem (Lévy 1948). Let (X t , Ft )t⩾0 , X0 = 0, be a real-valued, adapted process with continuous sample paths. If both (X t , Ft )t⩾0 and (X 2t − t, Ft )t⩾0 are martingales, then (X t )t⩾0 is a Brownian motion with filtration (Ft )t⩾0 .

Theorem 19.5 means, in particular, that BM1 is the only continuous martingale with quadratic variation ⟨B⟩t ≡ t.

The way we have stated Itô’s formula, we cannot use it for (X t )t⩾0 as in Theorem 19.5. Nevertheless, a close inspection of the construction of the stochastic integral in Chapters 15 and 16 shows that all results remain valid if the integrator dB t is replaced by dX t where (X t )t⩾0 is a continuous martingale with quadratic variation ⟨X⟩t = t. For Steps 2o and 3o of the proof of Theorem 18.1, cf. (18.7) and (18.8) in the proof of Lemma 18.4, we have to replace Theorem 9.1 by Corollary 9.12. In fact, Theorem 19.5 is an ex-post vindication for this since such martingales already are Brownian motions. In order not to end up with a circular conclusion, we have to go through the proofs of Chapters 15–18 again, though.


After this word of caution we can use (18.1) in the setting of Theorem 19.5 and start with the

Proof of Theorem 19.5 (Kunita, Watanabe 1967). We have to show that

    X_t − X_s ⊥⊥ F_s   and   X_t − X_s ∼ N(0, t − s)   for all 0 ⩽ s ⩽ t.

This is equivalent to

    𝔼(exp[iξ(X_t − X_s)] 𝟙_F) = exp[−(1/2)(t − s)ξ²] ℙ(F)   for all s ⩽ t, F ∈ F_s.   (19.6)

Indeed, setting F = Ω yields

    𝔼(exp[iξ(X_t − X_s)]) = exp[−(1/2)(t − s)ξ²],

i.e. X_t − X_s ∼ X_{t−s} ∼ N(0, t − s), and (19.6) becomes

    𝔼(exp[iξ(X_t − X_s)] 𝟙_F) = 𝔼(exp[iξ(X_t − X_s)]) ℙ(F)   for all s ⩽ t, F ∈ F_s;

this proves that X_t − X_s ⊥⊥ F for all F ∈ F_s.

Let us now prove (19.6). Note that f(x) := e^{iξx} is a C²-function with f′(x) = iξe^{ixξ} and f″(x) = −ξ²e^{ixξ}. Applying (18.1) to f(X_t) and f(X_s) and subtracting the respective results yields

    e^{iξX_t} − e^{iξX_s} = iξ ∫_s^t e^{iξX_r} dX_r − (ξ²/2) ∫_s^t e^{iξX_r} dr.   (19.7)

Since |e^{iη}| = 1, the real and imaginary parts of the integrands are L²_P(λ_T ⊗ ℙ)-functions for any t ⩽ T and, using Theorem 15.15.a) with X instead of B, we see that ∫_0^t e^{iξX_r} dX_r is a martingale. Hence,

    𝔼(∫_s^t exp[iξX_r] dX_r | F_s) = 0   a.s.

Multiplying both sides of (19.7) by e^{−iξX_s} 𝟙_F, where F ∈ F_s, and then taking expectations, gives

    𝔼(exp[iξ(X_t − X_s)] 𝟙_F) − ℙ(F) = −(ξ²/2) ∫_s^t 𝔼(exp[iξ(X_r − X_s)] 𝟙_F) dr.

This means that ϕ(t) := 𝔼(exp[iξ(X_t − X_s)] 𝟙_F), s ⩽ t, is a solution of the integral equation

    Φ(t) = ℙ(F) − (ξ²/2) ∫_s^t Φ(r) dr.   (19.8)

It is easy to see that (19.8) is also solved by ψ(t) := ℙ(F) exp[−(1/2)(t − s)ξ²]. Since

    |ϕ(t) − ψ(t)| = (ξ²/2) |∫_s^t (ϕ(r) − ψ(r)) dr| ⩽ (ξ²/2) ∫_s^t |ϕ(r) − ψ(r)| dr,

we can use Gronwall's lemma, Theorem A.42, with u = |ϕ − ψ|, a ≡ 0 and b ≡ ξ²/2, to get |ϕ(t) − ψ(t)| ⩽ 0 for all t ⩾ 0. This shows that ϕ(t) = ψ(t), i.e. (19.8) admits only one solution, and (19.6) follows.

19.3 Girsanov’s theorem

Let (B_t)_{t⩾0} be a real Brownian motion. The process W_t := B_t − ℓt, ℓ ∈ ℝ, is called Brownian motion with drift. It describes a uniform motion with speed −ℓ which is perturbed by a Brownian motion B_t. Girsanov's theorem provides a very useful transformation of the underlying probability measure ℙ such that, under the new measure, W_t is itself a Brownian motion. This is very useful if we want to calculate, e.g. the distribution of the random variable sup_{s⩽t} W_s and other functionals of W.

19.6 Example. Let G ∼ N(m, σ²) be on (Ω, A, ℙ) a real-valued normal random variable with mean m and variance σ². We want to transform G into a mean zero normal random variable. Obviously, G ↦ G − m would do the trick on (Ω, A, ℙ) but this also changes G. Another possibility is to change the probability measure ℙ. Observe that 𝔼 e^{iξG} = exp[−(1/2)σ²ξ² + imξ]. Since G has exponential moments, cf. Corollary 2.2, we can insert ξ = im/σ² and find

    𝔼 exp(−(m/σ²) G) = exp(−(1/2) m²/σ²).

Therefore, ℚ(dω) := exp(−(m/σ²) G(ω) + (1/2) m²/σ²) ℙ(dω) is a probability measure, and we find for the corresponding mathematical expectation 𝔼_ℚ = ∫_Ω ⋯ dℚ

    𝔼_ℚ e^{iξG} = ∫ e^{iξG} exp(−(m/σ²) G + (1/2) m²/σ²) dℙ
                = exp((1/2) m²/σ²) ∫ exp(i(ξ + i m/σ²) G) dℙ
                = exp((1/2) m²/σ²) exp(−(1/2)(ξ + i m/σ²)² σ² + i m(ξ + i m/σ²))
                = exp(−(1/2) σ²ξ²).

This means that we have G ∼ N(0, σ²) under the probability measure ℚ.
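A quick way to see the effect of the new measure ℚ is importance weighting: reweighting samples of G ∼ N(m, σ²) with the density exp(−(m/σ²)G + m²/(2σ²)) should reproduce the moments of N(0, σ²). The following sketch is not from the book; it is a plain NumPy illustration with arbitrarily chosen m and σ.

```python
import numpy as np

rng = np.random.default_rng(2)
m, sigma, n = 1.5, 0.8, 200_000

G = rng.normal(m, sigma, size=n)
w = np.exp(-m / sigma**2 * G + 0.5 * m**2 / sigma**2)   # density dQ/dP

print("E_P[w]   =", w.mean())            # ~ 1, so Q is a probability measure
print("E_Q[G]   =", np.mean(w * G))      # ~ 0
print("E_Q[G^2] =", np.mean(w * G**2))   # ~ sigma^2 = 0.64
```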

The case of a Brownian motion with drift is similar. First we need a formula which shows how conditional expectations behave under changes of measures.

19.7 Lemma. Let β be a positive random variable on the probability space (Ω, A, ℙ) and F ⊂ A a σ-algebra. If 𝔼 β = 1, then ℚ(dω) := β(ω)ℙ(dω) is a new probability measure, such that

    ℚ|_F = 𝔼(β | F) ℙ|_F   and   𝔼_ℚ(X | F) = 𝔼(Xβ | F) / 𝔼(β | F);   (19.9)

𝔼_ℚ := ∫_Ω … dℚ = ∫_Ω … β dℙ denotes the expectation w.r.t. ℚ.

Proof. It is clear that ℚ is a probability measure. Moreover,

    ∫_F 𝔼(β | F) dℙ = ∫_F β dℙ = ℚ(F),   F ∈ F,

shows the formula for ℚ|_F. By the definition of 𝔼_ℚ we see for all F ∈ F, using that 𝟙_F is F measurable,

    𝔼_ℚ(𝟙_F X) = 𝔼(𝟙_F Xβ) = 𝔼(𝟙_F 𝔼(Xβ | F)) = 𝔼(𝟙_F · (𝔼(Xβ | F)/𝔼(β | F)) · 𝔼(β | F)) = 𝔼_ℚ(𝟙_F · 𝔼(Xβ | F)/𝔼(β | F)),

proving the formula for the conditional expectation.

19.8 Example. Let (B_t, F_t)_{t⩾0} be a BM¹ and set β = β_T where β_t = exp[ℓB_t − (1/2)ℓ²t]. From Example 5.2.e) we know that (β_s)_{s⩽T} is a martingale for the filtration (F_t)_{t⩾0}. Therefore, 𝔼 β_T = 1, and ℚ := β_T ℙ is a probability measure. We want to calculate the conditional characteristic function of W_t := B_t − ℓt under 𝔼_ℚ. Let ξ ∈ ℝ and s ⩽ t ⩽ T. With (19.9) and the tower property we get

    𝔼_ℚ[exp(iξ(W_t − W_s)) | F_s]
      = 𝔼[exp(iξ(W_t − W_s)) β_T | F_s] / 𝔼[β_T | F_s]
      = 𝔼[exp(iξ(W_t − W_s)) 𝔼(β_T | F_t) | F_s] / β_s
      = 𝔼[exp(iξ(W_t − W_s)) β_t | F_s] / β_s
      = exp[−iℓξ(t − s) − (1/2)ℓ²(t − s)] 𝔼[exp(iξ(B_t − B_s)) exp(ℓ(B_t − B_s)) | F_s]
      = exp[−iℓξ(t − s) − (1/2)ℓ²(t − s)] 𝔼[exp(iξ(B_t − B_s)) exp(ℓ(B_t − B_s))]   (by (B1), (5.1))
      = exp[−iℓξ(t − s) − (1/2)ℓ²(t − s)] exp[(1/2)(iξ + ℓ)²(t − s)]   (by (2.6))
      = exp[−(1/2)(t − s)ξ²].

Lemma 5.4 shows that (W_t, F_t)_{t⩽T} is a Brownian motion under ℚ.


We will now consider the general case.

19.9 Theorem (Girsanov 1960). Let (B_t, F_t)_{t⩾0}, B_t = (B¹_t, …, B^d_t), be a BM^d and assume that f = (f_1, …, f_d) is a P measurable process such that f_j ∈ L²_P(λ_T ⊗ ℙ) for all 0 < T < ∞ and j = 1, …, d. If

    q_t = exp(∑_{j=1}^d ∫_0^t f_j(s) dB^j_s − (1/2)∫_0^t |f(s)|² ds)   (19.10)

is integrable and 𝔼 q_t ≡ 1, then for every T > 0 the process

    W_t := B_t − ∫_0^t f(s) ds,   t ⩽ T,   (19.11)

is a BM^d under the new probability measure ℚ = ℚ_T,

    ℚ(dω) := q_T(ω) ℙ(dω).   (19.12)

Notice that the new probability measure ℚ in Theorem 19.9 depends on the right endpoint T.

Proof. Since f_j ∈ L²_P(λ_T ⊗ ℙ) for all T > 0, the processes

    X^j_t := ∫_0^t f_j(s) dB^j_s − (1/2)∫_0^t f_j²(s) ds,   j = 1, …, d,

are well-defined, and Theorem 19.2 shows that under the condition 𝔼 q_t = 1 the process (q_t, F_t)_{t⩽T} is a martingale. Set

    β_t := exp[i⟨ξ, W(t)⟩ + (1/2) t |ξ|²]

where W_t = B_t − ∫_0^t f(s) ds is as in (19.11). If we knew that (β_t)_{t⩽T} is a martingale on the probability space (Ω, F_T, ℚ = ℚ_T), then

    𝔼_ℚ(exp[i⟨ξ, W(t) − W(s)⟩] | F_s)
      = 𝔼_ℚ(β_t exp[−i⟨ξ, W(s)⟩] exp[−(1/2) t |ξ|²] | F_s)
      = 𝔼_ℚ(β_t | F_s) exp[−i⟨ξ, W(s)⟩] exp[−(1/2) t |ξ|²]
      = β_s exp[−i⟨ξ, W(s)⟩] exp[−(1/2) t |ξ|²]
      = exp[−(1/2)(t − s) |ξ|²].

Lemma 5.4 shows that W(t) is a Brownian motion under the probability measure ℚ.


Let us check that (β_t)_{t⩽T} is a martingale. Adding the exponent of q_t and the exponent of β_t we have

    ∑_{j=1}^d ∫_0^t f_j(s) dB^j_s − (1/2)∫_0^t |f(s)|² ds + i ∑_{j=1}^d ξ_j B^j_t − i ∑_{j=1}^d ∫_0^t ξ_j f_j(s) ds + (1/2)|ξ|² t
      = ∑_{j=1}^d ∫_0^t (f_j(s) + iξ_j) dB^j_s − (1/2)∫_0^t ⟨f(s) + iξ, f(s) + iξ⟩ ds.

This shows that

    q_t β_t = exp[∑_{j=1}^d ∫_0^t (f_j(s) + iξ_j) dB^j_s − (1/2)∫_0^t ⟨f(s) + iξ, f(s) + iξ⟩ ds],

and, just as for q_t, we find for the product q_t β_t by Itô's formula that

    q_t β_t = 1 + ∑_{j=1}^d ∫_0^t q_s β_s (f_j(s) + iξ_j) dB^j_s.

Therefore, (q_t β_t)_{t⩾0} is a local martingale. If (τ_k)_{k⩾1} is a localizing sequence, we get for k ⩾ 1, F ∈ F_s and s ⩽ t ⩽ T

    𝔼_ℚ(β_{t∧τ_k} 𝟙_F) = 𝔼(q_T β_{t∧τ_k} 𝟙_F)
      = 𝔼[𝔼(q_T β_{t∧τ_k} 𝟙_F | F_t)]                                   (tower)
      = 𝔼[𝔼(q_T | F_t) β_{t∧τ_k} 𝟙_F]                                   (𝔼(q_T | F_t) = q_t, martingale)
      = 𝔼[(q_t − q_{t∧τ_k}) β_{t∧τ_k} 𝟙_F] + 𝔼[q_{t∧τ_k} β_{t∧τ_k} 𝟙_F]
      = 𝔼[(q_t − q_{t∧τ_k}) β_{t∧τ_k} 𝟙_F] + 𝔼[q_{s∧τ_k} β_{s∧τ_k} 𝟙_F].

In the last step we use the fact that (q_t β_t, F_t)_{t⩾0} is a local martingale. Since t ↦ q_t is continuous and q_{t∧τ_k} = 𝔼(q_t | F_{t∧τ_k}), we know that (q_{t∧τ_k}, F_{t∧τ_k})_{k⩾1} is a uniformly integrable martingale, thus lim_{k→∞} q_{t∧τ_k} = q_t a.s. and in L¹(ℙ). Moreover, we have that |β_t| = e^{(1/2)|ξ|²t} ⩽ e^{(1/2)|ξ|²T}. By dominated convergence,

    𝔼_ℚ(β_t 𝟙_F) = lim_{k→∞} 𝔼_ℚ(β_{t∧τ_k} 𝟙_F)
      = lim_{k→∞} 𝔼[(q_t − q_{t∧τ_k}) β_{t∧τ_k} 𝟙_F] + lim_{k→∞} 𝔼[q_{s∧τ_k} β_{s∧τ_k} 𝟙_F]
      = 𝔼[q_s β_s 𝟙_F].

If we use the last equality with t = s, we get

    𝔼_ℚ(β_t 𝟙_F) = 𝔼[q_s β_s 𝟙_F] = 𝔼_ℚ(β_s 𝟙_F)   for all F ∈ F_s, 0 ⩽ s ⩽ T,

i.e. (β_t, F_t)_{0⩽t⩽T} is a martingale under ℚ for every T > 0.
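Girsanov's theorem is also the basis of importance sampling: expectations of functionals of a Brownian motion can be computed by simulating a drifted path and correcting with the likelihood ratio q_T. The following sketch is my own illustration, not the author's; it uses the constant integrand f ≡ ℓ, so only the endpoint B_T enters and no path discretization is needed. It estimates ℙ(B_T > b) for a rare level b once directly and once by tilting with drift ℓ.

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(3)
T, b, ell, n = 1.0, 3.0, 3.0, 100_000      # drift ell chosen to push paths towards b

# plain Monte Carlo for P(B_T > b), B a standard Brownian motion
B_T = rng.normal(0.0, np.sqrt(T), size=n)
plain = np.mean(B_T > b)

# Girsanov / Cameron-Martin: simulate X ~ BM, use the tilted path X_t + ell*t,
# and reweight with exp(-ell*X_T - ell^2*T/2) = dP/dQ
X_T = rng.normal(0.0, np.sqrt(T), size=n)
weights = np.exp(-ell * X_T - 0.5 * ell**2 * T)
tilted = np.mean((X_T + ell * T > b) * weights)

exact = 0.5 * (1.0 - erf(b / sqrt(2 * T)))
print(f"plain MC: {plain:.6f}   importance sampling: {tilted:.6f}   exact: {exact:.6f}")
```

The variance reduction is dramatic: under the tilted measure the event {B_T > b} is no longer rare, and the weights undo the change of measure exactly.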


19.4 Martingale representation – 1

Let (B_t)_{t⩾0} be a BM¹ and (F_t)_{t⩾0} an admissible complete filtration. From Theorem 15.15 we know that for every T ∈ [0, ∞] and X ∈ L²_P(λ_T ⊗ ℙ) the stochastic integral

    M_t = x + ∫_0^t X_s dB_s,   t ∈ [0, T], x ∈ ℝ,   (19.13)

is a continuous L² martingale for the filtration (F_t)_{t⩽T} of the underlying Brownian motion. We will now show the converse: if (M_t, F_t)_{t⩽T} is an L² martingale for the complete Brownian filtration (F_t)_{t⩾0}, then M_t must be of the form (19.13) for some X = (X_s)_{s⩽T} ∈ L²_P(λ_T ⊗ ℙ). For this we define the set

    H²_T := {M_T : M_T = x + ∫_0^T X_s dB_s, x ∈ ℝ, X ∈ L²_P(λ_T ⊗ ℙ)}.

19.10 Lemma. The space H²_T is a closed linear subspace of L²(Ω, F_T, ℙ).

Proof. H²_T is obviously a linear space and, by Theorem 15.15, it is contained in L²(Ω, F_T, ℙ). We show that it is a closed subspace. Let (M^n_T)_{n⩾1} ⊂ H²_T,

    M^n_T = x_n + ∫_0^T X^n_s dB_s,

be an L²(ℙ) Cauchy sequence. Thus, it is an L¹(ℙ) Cauchy sequence, and we see from

    𝔼[M^n_T − M^m_T] = x_n − x_m

that (x_n)_{n⩾1} is a Cauchy sequence in ℝ; denote its limit by x = lim_{n→∞} x_n. This means that M^n_T − x_n = ∫_0^T X^n_s dB_s is again a Cauchy sequence in L²(ℙ). Using Itô's isometry (15.22) we see that

    𝔼[((M^n_T − x_n) − (M^m_T − x_m))²] = 𝔼[(∫_0^T (X^n_s − X^m_s) dB_s)²] = 𝔼[∫_0^T |X^n_s − X^m_s|² ds],

i.e. (X^n_s)_{s⩽T} is a Cauchy sequence in L²(λ_T ⊗ ℙ). Because of the completeness of L²(λ_T ⊗ ℙ) the sequence converges to some process (X_s)_{s⩽T} and, for some subsequence, we get

    lim_{k→∞} X^{n(k)}_s(ω) = X_s(ω)   for λ_T ⊗ ℙ almost all (s, ω) and in L²(λ_T ⊗ ℙ).

Since pointwise limits preserve measurability, we have (X_s)_{s⩽T} ∈ L²_P(λ_T ⊗ ℙ). Using again Itô's isometry, we find

    ∫_0^T X^n_s dB_s ⟶ ∫_0^T X_s dB_s   in L²(Ω, F_T, ℙ) as n → ∞,

hence

    M^n_T = x_n + ∫_0^T X^n_s dB_s ⟶ x + ∫_0^T X_s dB_s   in L²(Ω, F_T, ℙ).

This shows M_T = x + ∫_0^T X_s dB_s ∈ H²_T.



19.11 Lemma. Let f ∈ L²([0, T], dt) be a (non-random) function. Then exp[i∫_0^T f(t) dB_t] is in H²_T ⊕ iH²_T.

Proof. Clearly, f ∈ L²_T so that all (stochastic) integrals are well-defined. Set M_t := i∫_0^t f(s) dB_s + (1/2)∫_0^t f²(s) ds and apply Itô's formula to e^{M_t}:

    e^{M_T} = 1 + ∫_0^T e^{M_s}(i f(s) dB_s + (1/2) f²(s) ds) − (1/2)∫_0^T e^{M_s} f²(s) ds
            = 1 + ∫_0^T i e^{M_s} f(s) dB_s.

Thus,

    exp[i f ∙ B_T] = exp[−(1/2)∫_0^T f²(s) ds] + ∫_0^T i f(t) exp[i f ∙ B_t − (1/2)∫_t^T f²(s) ds] dB_t,

and the integrand is obviously in L²_P(λ_T ⊗ ℙ) ⊕ iL²_P(λ_T ⊗ ℙ).

We are going to use Lemma 19.11 for step functions of the form

    g(s) = ∑_{j=1}^n (ξ_j + ⋯ + ξ_n) 𝟙_{[t_{j−1}, t_j)}(s)   (19.14)

where n ⩾ 1, ξ_1, …, ξ_n ∈ ℝ and 0 = t_0 < t_1 < ⋯ < t_n = T. It is not hard to see that

    g ∙ B_T = ∫_0^T g(s) dB_s = ∑_{j=1}^n ξ_j B_{t_j}.   (19.15)

19.12 Theorem. Let (B_t)_{t⩾0} be a BM¹ and F_t := F̄^B_t the completed natural filtration. Every Y ∈ L²(Ω, F_T, ℙ) is of the form Y = y + ∫_0^T X_s dB_s for some X ∈ L²_P(λ_T ⊗ ℙ) and y ∈ ℝ.


Proof. We have to show that H²_T = L²(Ω, F_T, ℙ). Since H²_T is a closed subspace of L²(Ω, F_T, ℙ), it is enough to prove that H²_T is dense in L²(Ω, F_T, ℙ). To do so, we take any Y ∈ L²(Ω, F_T, ℙ) and show that

    Y ⊥ H²_T ⟹ Y = 0.

Recall that Y ⊥ H²_T means that 𝔼(YH) = 0 for all H ∈ H²_T. Take H = exp[i g ∙ B_T] where g is of the form (19.14). Write ξ = (ξ_1, …, ξ_n) and Γ = (B_{t_1}, …, B_{t_n}). As H ∈ H²_T ⊕ iH²_T, we get with (19.15)

    0 = 𝔼[exp(i ∑_{j=1}^n ξ_j B_{t_j}) Y] = 𝔼[e^{i⟨ξ,Γ⟩} 𝔼(Y | Γ)] = 𝔼[e^{i⟨ξ,Γ⟩} u(Γ)].

Here we use the tower property and the fact that 𝔼(Y | Γ) can be expressed as a function u of Γ = (B_{t_1}, …, B_{t_n}). This shows

    ∫_{ℝ^n} e^{i⟨ξ,x⟩} u(x) ℙ(Γ ∈ dx) = ∫_{ℝ^n} e^{i⟨ξ,x⟩} u(x) p_Γ(x) dx = 0,

where p_Γ(x) is the (strictly positive) density of the random vector Γ = (B_{t_1}, …, B_{t_n}), cf. (2.10b). Therefore, the Fourier transform of the function u(x)p_Γ(x) is zero. This means that u(x)p_Γ(x) = 0, hence u(x) = 0 for Lebesgue almost all x. We conclude that 𝔼(Y | B_{t_1}, …, B_{t_n}) = 0 a.s. for all 0 ⩽ t_1 < ⋯ < t_n = T, and so – for example using Lévy's martingale convergence theorem and the fact that the σ-algebras σ(B_{t_1}, …, B_{t_n}) generate F^B_T – that Y = 𝔼(Y | F_T) = 0 a.s.

19.6 Martingales as time-changed Brownian motion

A key tool in this section is the generalized inverse of an increasing function.

19.16 Lemma. Let a : [0, ∞) → [0, ∞) be an increasing right continuous function and denote by τ(s) := inf{t ⩾ 0 : a(t) > s} its generalized right continuous inverse. Then
a) τ is increasing and right continuous;
b) a(τ(s)) ⩾ s;
c) a(t) ⩾ s if, and only if, τ(s−) ⩽ t;
d) a(t) = inf{s ⩾ 0 : τ(s) > t};
e) a(τ(s)) = sup{u ⩾ s : τ(u) = τ(s)};
f) τ(a(t)) = sup{w ⩾ t : a(w) = a(t)};
g) if a is continuous, a(τ(s)) = s for all s ∈ a([0, ∞)).

Proof. Before we begin with the formal proof, it is helpful to draw a picture of the situation, cf. Figure 19.1. All assertions are clear if t is a point of strict increase for the function a. Moreover, if a is flat, the inverse τ makes a jump, and vice versa.

a) It is obvious from the definition that τ is increasing. Note that

    {t : a(t) > s} = ⋃_{ϵ>0} {t : a(t) > s + ϵ}.

Therefore, inf{t ⩾ 0 : a(t) > s} = inf_{ϵ>0} inf{t ⩾ 0 : a(t) > s + ϵ}, proving right continuity.

b) Since a(u) ⩾ s for all u > τ(s), we see that a(τ(s)) ⩾ s.

c) By the very definition of the generalized inverse τ we have

    a(t) ⩾ s ⟺ a(t) > s − ϵ for all ϵ > 0 ⟺ τ(s − ϵ) ⩽ t for all ϵ > 0 ⟺ τ(s−) ⩽ t.


Fig. 19.1: An increasing right continuous function and its generalized right continuous inverse.

d) From c) we see that

    a(t) = sup{s ⩾ 0 : a(t) ⩾ s} = sup{s ⩾ 0 : τ(s−) ⩽ t} = inf{s ⩾ 0 : τ(s) > t}.

e) Assume that u ⩾ s and τ(u) = τ(s). Then a(τ(s)) = a(τ(u)) ⩾ u ⩾ s, and therefore a(τ(s)) ⩾ sup{u ⩾ s : τ(u) = τ(s)}. For the converse, we use d) and find

    a(τ(s)) = inf{r ⩾ 0 : τ(r) > τ(s)} ⩽ sup{u ⩾ s : τ(u) = τ(s)}.

f) Follows as e) since a and τ play symmetric roles.

g) Because of b) it is enough to show that a(τ(s)) ⩽ s. If τ(s) = 0, this is trivial. Otherwise we find ϵ > 0 such that 0 < ϵ < τ(s) and a(τ(s) − ϵ) ⩽ s by the definition of τ(s). Letting ϵ → 0, we obtain a(τ(s)) ⩽ s by the continuity of a.
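For an increasing right continuous function given by its values on a grid, the generalized inverse of Lemma 19.16 amounts to a single binary search. The sketch below is not from the book; it uses numpy.searchsorted, which returns the first grid index where a(t) exceeds the level s — exactly the infimum in the definition of τ(s). The example function is chosen so that a is flat on an interval, and one can see τ jumping over the flat piece.

```python
import numpy as np

def generalized_inverse(t_grid, a_values, s):
    """tau(s) = inf{t >= 0 : a(t) > s} for an increasing, right continuous a
    sampled on t_grid (a_values = a(t_grid)); returns +inf if a never exceeds s."""
    idx = np.searchsorted(a_values, s, side="right")   # first index with a(t) > s
    return t_grid[idx] if idx < len(t_grid) else np.inf

# example: slope 1, flat on [1, 2] -- the inverse skips the flat part
t_grid = np.linspace(0.0, 3.0, 3001)
a_values = np.minimum(t_grid, 1.0) + np.maximum(t_grid - 2.0, 0.0)

print(generalized_inverse(t_grid, a_values, 0.5))   # about 0.5
print(generalized_inverse(t_grid, a_values, 1.0))   # about 2.0
```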

The following theorem was discovered by Doeblin in 1940. Doeblin never published this result but deposited it in a pli cacheté with the Academy of Sciences in Paris. The pli was finally opened in the year 2000, see the historical notes by Bru and Yor [23] and in the Collected Works [53]. Around 25 years after Doeblin, the result was rediscovered in 1965 by Dambis and by Dubins and Schwarz in a more general context.

19.17 Theorem (Doeblin 1940; Dambis 1965; Dubins–Schwarz 1965). Let (M_t, F_t)_{t⩾0}, M_t ∈ L²(ℙ), be a continuous martingale with a right continuous filtration, i.e. F_t = F_{t+}, t ⩾ 0, and ⟨M⟩_∞ = lim_{t→∞} ⟨M⟩_t = ∞. Denote by τ_s = inf{t ⩾ 0 : ⟨M⟩_t > s} the generalized inverse of ⟨M⟩. Then (B_s, G_s)_{s⩾0} := (M_{τ_s}, F_{τ_s})_{s⩾0} is a Brownian motion, and

    M_t = B_{⟨M⟩_t}   for all t ⩾ 0.



Proof. Define B_s := M_{τ_s} and G_s := F_{τ_s}. Since τ_s is the first entrance time into (s, ∞) for the process ⟨M⟩, it is an F_{t+} stopping time, cf. Lemma 5.7, which is almost surely finite because of ⟨M⟩_∞ = ∞.

Let Ω* = {ω ∈ Ω : M_•(ω) and ⟨M⟩_•(ω) are constant on the same intervals}. From the definition of the quadratic variation, Theorem 17.1, one can show that ℙ(Ω*) = 1, see Corollary 17.7. In particular, s ↦ M_{τ_s}(ω) is almost surely continuous: if s is a continuity point of τ_s, this is obvious. Otherwise τ_{s−} < τ_s, but for ω ∈ Ω* and all τ_{s−}(ω) ⩽ r ⩽ τ_s(ω) we have M_{τ_{s−}}(ω) = M_r(ω) = M_{τ_s}(ω).

For all k ⩾ 1 and s ⩾ 0 we find with Doob's maximal inequality (A.14)

    𝔼(M²_{τ_s∧k}) ⩽ 𝔼(sup_{u⩽k} M²_u) ⩽ 4 𝔼 M²_k < ∞.

The optional stopping Theorem A.18 and Corollary A.20 show that (M_{τ_s∧k})_{s⩾0} is a martingale for the filtration (G_s)_{s⩾0} with G_s := F_{τ_s}. Thus,

    𝔼(M²_{τ_s∧k}) = 𝔼(⟨M⟩_{τ_s∧k}) ⩽ 𝔼(⟨M⟩_{τ_s}) = s,

showing that for every s ⩾ 0 the family (M_{τ_s∧k})_{k⩾1} is uniformly integrable. Because of the continuity of M we see

    M_{τ_s∧k} ⟶ M_{τ_s} =: B_s   in L¹ and a.e. as k → ∞.

In particular, (B_s, G_s)_{s⩾0} is a martingale. Fatou's lemma shows that 𝔼 B²_s ⩽ lim_{k→∞} 𝔼 M²_{τ_s∧k} ⩽ s, i.e. B_s is also in L²(ℙ). If we apply optional stopping to the martingale (M²_t − ⟨M⟩_t)_{t⩾0}, we find

    𝔼[(B_t − B_s)² | G_s] = 𝔼[(M_{τ_t} − M_{τ_s})² | F_{τ_s}] = 𝔼[⟨M⟩_{τ_t} − ⟨M⟩_{τ_s} | F_{τ_s}] = t − s.

Thus (B_s, G_s)_{s⩾0} is a continuous martingale whose quadratic variation process is s. Lévy's characterization of a Brownian motion, Theorem 19.5, applies, and we see that (B_s, G_s)_{s⩾0} is a BM¹. Finally, B_{⟨M⟩_t} = M_{τ_{⟨M⟩_t}} = M_t for all t ⩾ 0.
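The Dambis–Dubins–Schwarz time change can be illustrated by simulation: take X_t = ∫_0^t f(s) dB_s with a state-dependent integrand, compute ⟨X⟩_t = ∫_0^t f²(s) ds path by path, and read X at the inverse time τ_s. The sketch below is my own, not the author's (Euler discretization, NumPy, integrand f(s) = 1 + |sin B_s| chosen only so that ⟨X⟩_T > s on every path); it checks the most basic fingerprint of a Brownian motion, namely that the variance of the time-changed process is linear in s.

```python
import numpy as np

rng = np.random.default_rng(4)
n_paths, n_steps, T = 2_000, 2_000, 4.0
dt = T / n_steps

dB = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
B = np.hstack([np.zeros((n_paths, 1)), np.cumsum(dB, axis=1)])

f = 1.0 + np.abs(np.sin(B[:, :-1]))                                   # f >= 1, so <X>_T >= T
X = np.hstack([np.zeros((n_paths, 1)), np.cumsum(f * dB, axis=1)])    # martingale X
QV = np.hstack([np.zeros((n_paths, 1)), np.cumsum(f**2 * dt, axis=1)])  # <X>_t

def time_changed(X_path, qv_path, s):
    """B_s = X_{tau_s} with tau_s = inf{t : <X>_t > s}."""
    idx = np.searchsorted(qv_path, s, side="right")
    return X_path[min(idx, len(X_path) - 1)]

for s in (0.5, 1.0, 2.0):
    vals = np.array([time_changed(X[i], QV[i], s) for i in range(n_paths)])
    print(f"s = {s}:  Var(B_s) approx {vals.var():.3f}")   # should be close to s
```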

19.7 Burkholder–Davis–Gundy inequalities

The Burkholder–Davis–Gundy inequalities are estimates for the pth moment of a continuous martingale X in terms of the L^p norm of √⟨X⟩:

    c_p 𝔼[⟨X⟩_T^{p/2}] ⩽ 𝔼[sup_{t⩽T} |X_t|^p] ⩽ C_p 𝔼[⟨X⟩_T^{p/2}],   0 < p < ∞.   (19.17)


The constants can be made explicit and they depend only on p. In fact, (19.17) holds for continuous local martingales; we restrict ourselves to Brownian L² martingales, i.e. to martingales of the form

    X_t = ∫_0^t f(s) dB_s   and   ⟨X⟩_t = ∫_0^t |f(s)|² ds,

where (B_t)_{t⩾0} is a BM¹ and f ∈ L²_P(λ_T ⊗ ℙ) for all T > 0, see also Sections 19.4 and 19.5. We will first consider the case where p ⩾ 2.

19.18 Theorem (Burkholder 1966). Let (B_t)_{t⩾0} be a BM¹ and f ∈ L²_P(λ_T ⊗ ℙ) for all T > 0. Then, we have for all p ⩾ 2 and q = p/(p − 1)

    𝔼[sup_{t⩽T} |∫_0^t f(s) dB_s|^p] ⩽ C_p 𝔼[(∫_0^T |f(s)|² ds)^{p/2}],   p ⩾ 2,   (19.18)

    𝔼[sup_{t⩽T} |∫_0^t f(s) dB_s|^p] ⩾ c_p 𝔼[(∫_0^T |f(s)|² ds)^{p/2}],   p ⩾ 2,   (19.19)

where C_p = [q^p p(p − 1)/2]^{p/2} and c_p = 1/(2p)^{p/2}.

Proof. Set X_t := ∫_0^t f(s) dB_s and ⟨X⟩_t = ∫_0^t |f(s)|² ds, cf. Theorem 15.15. If we combine Doob's maximal inequality (A.14) and Itô's isometry (15.22), we get

    𝔼[⟨X⟩_T] = 𝔼[|X_T|²] ⩽ 𝔼[sup_{s⩽T} |X_s|²] ⩽ 4 sup_{s⩽T} 𝔼[|X_s|²] = 4 𝔼[⟨X⟩_T].

This shows that (19.18) and (19.19) hold for p = 2 with C_p = 4 and c_p = 1/4, respectively.

Assume that p > 2. By stopping with the stopping time τ_n := inf{t ⩾ 0 : |X_t|² + ⟨X⟩_t ⩾ n} (note that X² + ⟨X⟩ is a continuous process) we can, without loss of generality, assume that both X and ⟨X⟩ are bounded. Since x ↦ |x|^p, p > 2, is a C²-function, Itô's formula, Theorem 18.8, yields

    |X_t|^p = p ∫_0^t |X_s|^{p−1} sgn(X_s) dX_s + (1/2) p(p − 1) ∫_0^t |X_s|^{p−2} d⟨X⟩_s,

where we use that d⟨X⟩_s = |f(s)|² ds. Because of the boundedness of |X_s|^{p−1}, the first integral on the right-hand side is a martingale and we see with Doob's maximal inequality (A.14) and the Hölder inequality for p/(p − 2) and p/2

    𝔼[sup_{s⩽T} |X_s|^p] ⩽ q^p · (p(p − 1)/2) · 𝔼[∫_0^T |X_s|^{p−2} d⟨X⟩_s]
      ⩽ q^p · (p(p − 1)/2) · 𝔼[sup_{s⩽T} |X_s|^{p−2} ⟨X⟩_T]
      ⩽ q^p · (p(p − 1)/2) · (𝔼[sup_{s⩽T} |X_s|^p])^{1−2/p} (𝔼[⟨X⟩_T^{p/2}])^{2/p}.

If we divide by (𝔼[sup_{s⩽T} |X_s|^p])^{1−2/p}, we find

    (𝔼[sup_{s⩽T} |X_s|^p])^{2/p} ⩽ q^p · (p(p − 1)/2) · (𝔼[⟨X⟩_T^{p/2}])^{2/p},

and (19.18) follows with C_p = [q^p p(p − 1)/2]^{p/2}.

For the lower estimate (19.19) we set

    Y_t := ∫_0^t ⟨X⟩_s^{(p−2)/4} dX_s.

From Theorem 17.9 (or with Theorem 15.15, observing that dX_s = f(s) dB_s) we see that

    ⟨Y⟩_T = ∫_0^T ⟨X⟩_s^{(p−2)/2} d⟨X⟩_s = (2/p) ⟨X⟩_T^{p/2}   and   𝔼[⟨X⟩_T^{p/2}] = (p/2) 𝔼[⟨Y⟩_T] = (p/2) 𝔼[|Y_T|²].

Using Itô's formula (18.14) for the function f(x, y) = xy and the bivariate process (X_t, ⟨X⟩_t^{(p−2)/4}) we obtain

    X_T ⟨X⟩_T^{(p−2)/4} = ∫_0^T ⟨X⟩_s^{(p−2)/4} dX_s + ∫_0^T X_s d⟨X⟩_s^{(p−2)/4} = Y_T + ∫_0^T X_s d⟨X⟩_s^{(p−2)/4}.

If we rearrange this equality, we get |Y_T| ⩽ 2 sup_{s⩽T} |X_s| ⟨X⟩_T^{(p−2)/4}. Finally, using Hölder's inequality with p/2 and p/(p − 2),

    (2/p) 𝔼[⟨X⟩_T^{p/2}] = 𝔼[|Y_T|²] ⩽ 4 𝔼[sup_{s⩽T} |X_s|² ⟨X⟩_T^{(p−2)/2}] ⩽ 4 (𝔼[sup_{s⩽T} |X_s|^p])^{2/p} (𝔼[⟨X⟩_T^{p/2}])^{1−2/p}.

Dividing on both sides by (𝔼[⟨X⟩_T^{p/2}])^{1−2/p} gives (19.19) with c_p = (2p)^{−p/2}.


The inequalities for 0 < p < 2 can be proved with the following useful lemma.

19.19 Lemma. Let X and A be positive random variables and ρ ∈ (0, 1). If

    ℙ(X > x, A ⩽ x) ⩽ (1/x) 𝔼(A ∧ x)   for all x > 0,   (19.20)

then

    𝔼(X^ρ) ⩽ ((2 − ρ)/(1 − ρ)) 𝔼(A^ρ).

Proof. Using (19.20) and Tonelli's theorem we find

    𝔼(X^ρ) = ρ ∫_0^∞ ℙ(X > x) x^{ρ−1} dx
      ⩽ ρ ∫_0^∞ [ℙ(X > x, A ⩽ x) + ℙ(A > x)] x^{ρ−1} dx
      ⩽ ρ ∫_0^∞ [(1/x) 𝔼(A ∧ x) + ℙ(A > x)] x^{ρ−1} dx
      = 𝔼[∫_0^∞ (A ∧ x) ρ x^{ρ−2} dx] + 𝔼[A^ρ]
      = 𝔼[∫_0^A ρ x^{ρ−1} dx + ∫_A^∞ A ρ x^{ρ−2} dx] + 𝔼[A^ρ]
      = 𝔼[A^ρ − (ρ/(ρ − 1)) A^ρ] + 𝔼[A^ρ] = ((2 − ρ)/(1 − ρ)) 𝔼(A^ρ).

We are going to apply Lemma 19.19 with X = sup_{s⩽T} |X_s|² = sup_{s⩽T} |∫_0^s f(r) dB_r|² and A = ⟨X⟩_T = ∫_0^T |f(s)|² ds. Define τ := inf{t ⩾ 0 : ⟨X⟩_t ⩾ x}. Then

    P := ℙ(sup_{s⩽T} |X_s|² > x, ⟨X⟩_T ⩽ x) = ℙ(sup_{s⩽T} |X_s|² > x, τ ⩾ T) ⩽ ℙ(sup_{s⩽T∧τ} |X_s|² > x),

and with Doob's maximal inequality (A.13) we find

    P ⩽ (1/x) 𝔼(|X_{T∧τ}|²) = (1/x) 𝔼(⟨X⟩_{T∧τ}) = (1/x) 𝔼(⟨X⟩_T ∧ x).

For the first equality we use optional stopping and the fact that (X²_t − ⟨X⟩_t)_{t⩾0} is a martingale. Now we can apply Lemma 19.19 and get

    𝔼[sup_{s⩽T} |X_s|^{2ρ}] ⩽ ((2 − ρ)/(1 − ρ)) 𝔼[⟨X⟩_T^ρ],

which is the analogue of (19.18) for 0 < p < 2.

In a similar way we can apply Lemma 19.19 with X = ⟨X⟩_T = ∫_0^T |f(s)|² ds and A = sup_{s⩽T} |X_s|² = sup_{s⩽T} |∫_0^s f(r) dB_r|². Define σ := inf{t > 0 : |X_t|² ⩾ x}. Then

    P′ := ℙ(⟨X⟩_T > x, sup_{s⩽T} |X_s|² ⩽ x) ⩽ ℙ(⟨X⟩_T > x, σ ⩾ T) ⩽ ℙ(⟨X⟩_{T∧σ} > x).

Using the Markov inequality we get

    P′ ⩽ (1/x) 𝔼[⟨X⟩_{T∧σ}] ⩽ (1/x) 𝔼[sup_{s⩽T∧σ} |X_s|²] ⩽ (1/x) 𝔼[sup_{s⩽T} |X_s|² ∧ x],

and Lemma 19.19 gives

    𝔼[sup_{s⩽T} |X_s|^{2ρ}] ⩾ ((1 − ρ)/(2 − ρ)) 𝔼[⟨X⟩_T^ρ],

which is the analogue of (19.19) for 0 < p < 2. Combining this with Theorem 19.18 we have shown:

19.20 Theorem (Burkholder, Davis, Gundy 1972). Let (B_t)_{t⩾0} denote a one-dimensional Brownian motion and f ∈ L²_P(λ_T ⊗ ℙ) for all T > 0. Then, we have for 0 < p < ∞

    𝔼[(∫_0^T |f(s)|² ds)^{p/2}] ≍ 𝔼[sup_{t⩽T} |∫_0^t f(s) dB_s|^p],   (19.21)

where "≍" means a two-sided comparison, with finite comparison constants depending only on p ∈ (0, ∞).
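The two-sided comparison (19.21) is easy to see in a simulation: for a fixed integrand the ratio of the two sides stays bounded away from 0 and ∞ as T varies. The following sketch is not from the book; it assumes NumPy, an Euler discretization, the exponent p = 4 and the bounded predictable integrand f(s, ω) = 1/(1 + B_s²), all chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(5)
n_paths, n_steps, p = 20_000, 1_000, 4.0

for T in (0.5, 1.0, 2.0):
    dt = T / n_steps
    dB = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
    B = np.hstack([np.zeros((n_paths, 1)), np.cumsum(dB, axis=1)])
    f = 1.0 / (1.0 + B[:, :-1] ** 2)                      # bounded predictable integrand

    X = np.cumsum(f * dB, axis=1)                         # stochastic integral
    lhs = np.mean(np.max(np.abs(X), axis=1) ** p)         # E[ sup_t |X_t|^p ]
    rhs = np.mean(np.sum(f**2 * dt, axis=1) ** (p / 2))   # E[ <X>_T^(p/2) ]
    print(f"T = {T}:  E[sup|X|^p] / E[<X>_T^(p/2)] = {lhs / rhs:.2f}")
# the ratio varies with T but stays between fixed constants c_p and C_p
```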

19.8 Local times and Brownian excursions

We will continue our study of Brownian local times which we have begun in Section 18.7. The key to define the local time was Tanaka's formula (18.21),

    |B_t − a| = |a| + ∫_0^t sgn(B_s − a) dB_s + L^a_t.   (19.22)

Comparing this with Itô's formula (18.1) for a C²-function f, we observe that L^a_t plays the role of the second derivative term (1/2)∫_0^t f″(B_s) ds. In fact, f(x) = |x| is convex, so there is a measure-valued, generalized second derivative f″ = δ_0, and the formula (18.20) in Section 18.7 shows that we have "L^a_t = ∫_0^t δ_0(B_s − a) ds" in a limiting sense; we have to use inverted commas, since x ↦ δ_0(x) is not a function (unless you are a physicist).
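Both characterizations of the local time at a level a — as the limit (1/2ϵ)∫_0^t 𝟙_{(a−ϵ,a+ϵ)}(B_s) ds and via Tanaka's formula (19.22) — can be compared on a simulated path. The sketch below is my own illustration, not part of the book; it uses a single discretized path, the level a = 0, and a small but fixed ϵ, so the two numbers agree only up to discretization error.

```python
import numpy as np

rng = np.random.default_rng(6)
n_steps, T, eps = 200_000, 1.0, 0.01
dt = T / n_steps

dB = rng.normal(0.0, np.sqrt(dt), size=n_steps)
B = np.concatenate([[0.0], np.cumsum(dB)])

# occupation-density approximation of L^0_t
L_occ = np.sum(np.abs(B[:-1]) < eps) * dt / (2 * eps)

# Tanaka: L^0_t = |B_t| - int_0^t sgn(B_s) dB_s   (right continuous sgn)
sgn = np.where(B[:-1] >= 0, 1.0, -1.0)
L_tanaka = np.abs(B[-1]) - np.sum(sgn * dB)

print(f"occupation density: {L_occ:.4f}   Tanaka formula: {L_tanaka:.4f}")
```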


The following representation formula for convex functions prepares the ground to make things rigorous. Recall that a function U : (a, b) → ℝ, −∞ ⩽ a < b ⩽ ∞, is convex if

    U(tx + (1 − t)y) ⩽ tU(x) + (1 − t)U(y)   for all x, y ∈ (a, b) and t ∈ (0, 1).   (19.23)

19.21 Lemma. Let U : ℝ → ℝ be a convex function. Then the left and right derivatives U′_±(x) exist for all x ∈ ℝ, they are monotone increasing, finite, and left continuous (U′_−), resp. right continuous (U′_+). Moreover, U′_+(x) = U′_−(x) for Lebesgue a.e. x, and there is a unique locally finite measure μ on (ℝ, B(ℝ)) such that on every interval I = [−n, n] the function U is given by

    U(x) = a_n + b_n x + (1/2) ∫_{[−n,n)} |x − z| μ(dz),   x ∈ [−n, n].   (19.24)

The measure μ is U″, where the derivatives are taken weakly, in the sense of Schwartz distributions; the coefficients a_n, b_n depend on the values U(±n) and μ|_{[−n,n)}. Conversely, all convex functions U : ℝ → ℝ are on intervals given by (19.24), with a unique locally finite measure μ.

Proof. Let x* < x ⩽ y < y* and use the convexity inequality (19.23) for x ∈ (x*, y*] and y ∈ [x*, y*) to see that the difference quotients satisfy

    (U(x) − U(x*))/(x − x*) ⩽ (U(y*) − U(y))/(y* − y).

Taking x = y shows that the difference quotients are monotonically increasing, i.e. the following one-sided limits exist: letting x* ↑ x shows that U′_−(x*) exists and is bounded above by (U(y*) − U(y))/(y* − y) < ∞. In a similar fashion we see

    −∞ < U′_−(x*) ⩽ U′_+(x) ⩽ U′_−(y) ⩽ U′_+(y*) < ∞   for all x* < x < y < y*,

which proves that U′_±(x) exists and is finite for all x ∈ ℝ. Since x ↦ U′_±(x) is monotone, it has at most countably many discontinuities (see e.g. [232, Lemma 14.14] or [235, Fig. 16.1]); we conclude that U′_+(x) = U′_−(x) for Lebesgue almost all x ∈ ℝ. The left or right continuity follows directly from the fact that U′_−(x) = lim_{x*↑x} (U(x) − U(x*))/(x − x*), and the corresponding formula for U′_+(x).

In order to show the integral representation, fix I = [−n, n] and introduce an auxiliary function for all x, y ∈ [−n, n]

    G_n(x, y) := (xy − n²)/(2n) + (1/2)|x − y| = { (x − n)(y + n)/(2n),  if y ⩽ x,
                                                  (x + n)(y − n)/(2n),  if x ⩽ y.

Note that G_n ⩽ 0 is symmetric, continuous on I × I and G_n = 0 on the boundary of I × I. Fix y. If x ≠ y, x ↦ G_n(x, y) is linear affine and the derivative (∂/∂x)G_n(x, y) has a jump of height 1 if y = x. In particular, G_n is convex in each variable.

Since U′_−(x) is increasing and left-continuous, it induces a locally finite measure μ[a, b) := U′_−(b) − U′_−(a) < ∞. Using the fact that G_n vanishes on the boundary of I × I, integration by parts shows for all x ∈ [−n, n]

    ∫_{[−n,n)} ((xy − n²)/(2n) + (1/2)|x − y|) μ(dy) = ∫_{[−n,n)} G_n(x, y) μ(dy)
      = − ∫_{[−n,n)} (∂/∂y) G_n(x, y) U′_−(y) dy
      = − ((x − n)/(2n)) ∫_{[−n,x)} U′_−(y) dy − ((x + n)/(2n)) ∫_{[x,n)} U′_−(y) dy
      = − ((x − n)/(2n)) (U(x) − U(−n)) − ((x + n)/(2n)) (U(n) − U(x))
      = U(x) − x (U(n) − U(−n))/(2n) − (U(n) + U(−n))/2.

After some obvious rearrangements, this proves (19.24). The above calculation works for any ϕ ∈ C^∞_c(−n, n) and yields, without any convexity assumptions on ϕ,

    ϕ(x) = ∫_{[−n,n)} G_n(x, y) ϕ″(y) dy.

If U is given by (19.24), the convexity of x ↦ G_n(x, y) shows that U is itself convex. Using Fubini's theorem and the symmetry of G_n(x, y) finally gives

    ∫_{[−n,n)} U(x) ϕ″(x) dx = ∬_{[−n,n)×[−n,n)} G_n(x, y) ϕ″(x) dx μ(dy) = ∫_{[−n,n)} ϕ(y) μ(dy).

This identifies μ as the second derivative (in the sense of Schwartz distributions) of U, thus it is unique.

If we combine Tanaka's formula (19.22), the representation formula (19.24) for convex functions, and the stochastic Fubini theorem (Theorem 16.11), we obtain a far-reaching extension of Itô's formula.

19.22 Theorem (Itô–Meyer formula). Let (B_t)_{t⩾0} be a BM¹ with local time (L^a_t)_{t⩾0, a∈ℝ} and f = U − V the difference of two convex functions U, V : ℝ → ℝ such that U″ = μ and V″ = ν for locally finite measures μ, ν (cf. Lemma 19.21). For all t ⩾ 0,

    f(B_t) = f(B_0) + ∫_0^t f′_−(B_s) dB_s + (1/2) ∫_ℝ L^a_t (μ − ν)(da).   (19.25)

Proof. Since the formulae (19.24) and (19.25) are linear in f, we may assume that f is convex, i.e. f = U and V = 0. Fix n ∈ ℕ and write τ_n = inf{s ⩾ 0 : |B_s| ⩾ n} for the first exit time from the interval (−n, n). Using Lemma 19.21 we know that

    f(x) = a_n + b_n x + (1/2) ∫_{[−n,n)} |x − a| μ(da),   x ∈ [−n, n],

for suitable constants a_n, b_n ∈ ℝ and the second generalized derivative μ = f″. If t < τ_n(ω), we may plug in x = B_t(ω), and use Tanaka's formula (19.22):

    f(B_t) = a_n + b_n B_t + (1/2) ∫_{[−n,n)} |B_t − a| μ(da)
           = f(0) + b_n B_t + (1/2) ∫_{[−n,n)} ∫_0^t sgn(B_s − a) dB_s μ(da) + (1/2) ∫_{[−n,n)} L^a_t μ(da).

If we apply the stochastic Fubini theorem (Theorem 16.11) and note that (19.24) implies that f′_−(x) = b_n + (1/2) ∫_{[−n,n)} sgn(x − a) μ(da), we get for all t < τ_n

    f(B_t) = f(0) + ∫_0^t (b_n + (1/2) ∫_{[−n,n)} sgn(B_s − a) μ(da)) dB_s + (1/2) ∫_{[−n,n)} L^a_t μ(da)
           = f(0) + ∫_0^t f′_−(B_s) dB_s + (1/2) ∫_{[−n,n)} L^a_t μ(da).

Since B_0 = 0 and τ_n → ∞ as n → ∞, the assertion follows for all t ⩾ 0.

19.23 Corollary (occupation time formula). Let (B_t)_{t⩾0} be a BM¹ with local time L^a_t. Then

    ∫_0^t ϕ(B_s) ds = ∫_ℝ ϕ(a) L^a_t da   (19.26)

holds a.s. for all t > 0 and all positive measurable functions ϕ : ℝ → [0, ∞]. The exceptional ℙ-null set can be chosen uniformly for all t and ϕ.

Proof. Assume, for a moment, that ϕ is continuous, set μ(da) = ϕ(a) da and denote by f a convex function such that f″ = μ in the sense of Lemma 19.21; since μ has the density ϕ, we also have f″(x) = ϕ(x). Thus, we can use both the classical Itô (18.1) and the Itô–Meyer formulae (19.25) to express f(B_t). Comparing both formulae yields

    (1/2) ∫_ℝ L^a_t μ(da) = (1/2) ∫_ℝ L^a_t ϕ(a) da = (1/2) ∫_0^t f″(B_s) ds = (1/2) ∫_0^t ϕ(B_s) ds   ℙ-a.s.

This is (19.26) with an exceptional set N_{t,ϕ}. Let Φ = (ϕ_n)_{n⩾1} be a countable dense subset of C⁺(ℝ) (the positive parts of real polynomials with rational coefficients will do if we use locally uniform convergence), and set N = ⋃_{t∈ℚ⁺} ⋃_{ϕ∈Φ} N_{t,ϕ}. Being the union of countably many null sets, N is again a null set. We will show that (19.26) holds for all ω ∉ N. Observe that we can rewrite (19.26) using image measures:

    ∫_ℝ ϕ(x) λ ∘ (B_∙(ω))^{−1}(dx) = ∫_ℝ ϕ(a) L^a_t(ω) da.

Since Φ is dense in C⁺(ℝ) for locally uniform convergence, and since the measures λ ∘ (B_∙(ω))^{−1}(dx) and L^a_t(ω) da are locally finite, we get (19.26) for all ϕ ∈ C⁺(ℝ) and ω ∉ N. As C⁺(ℝ) determines measures uniquely [232, Theorem 17.12], we conclude that λ ∘ (B_∙(ω))^{−1}(dx) = L^x_t(ω) dx for all ω ∉ N, hence the claim.

Let us collect a few further consequences of the Itô–Meyer and Tanaka formulae.
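The occupation time formula (19.26) can also be checked numerically: approximate L^a_t on a grid of levels by the occupation-density quotient and compare ∑_a ϕ(a) L^a_t Δa with the time integral ∫_0^t ϕ(B_s) ds. The sketch below is not from the book; it uses a single discretized path and the test function ϕ(x) = x²e^{−x²}, chosen only for illustration, and the level grid is wide enough that the path almost surely stays inside it.

```python
import numpy as np

rng = np.random.default_rng(7)
n_steps, T = 200_000, 1.0
dt = T / n_steps
B = np.concatenate([[0.0], np.cumsum(rng.normal(0.0, np.sqrt(dt), size=n_steps))])

phi = lambda x: x**2 * np.exp(-x**2)

# left-hand side of (19.26): time integral along the path
lhs = np.sum(phi(B[:-1])) * dt

# right-hand side: occupation density L^a_t on a grid of levels a
a_grid, da = np.linspace(-3.0, 3.0, 241), 6.0 / 240
L = np.array([np.sum(np.abs(B[:-1] - a) < da / 2) * dt / da for a in a_grid])
rhs = np.sum(phi(a_grid) * L) * da

print(f"int phi(B_s) ds = {lhs:.5f}   int phi(a) L_t^a da = {rhs:.5f}")
```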

19.24 Corollary (moments). Let (B_t)_{t⩾0} be a BM¹ with local time (L^a_t)_{t⩾0, a∈ℝ}. Then

    𝔼[(sup_{x∈ℝ} L^x_t)^n] ⩽ C_n t^{n/2},   n ∈ ℕ, t > 0.   (19.27)

Proof. From Tanaka’s formula (19.22) and the estimate (A + C)n ⩽ 2n−1 (A n + C n ) we get 󵄨󵄨 󵄨󵄨n t 󵄨󵄨 󵄨󵄨 󵄨 󵄨 x n 󵄨 |L t | = 󵄨󵄨|B t − x| − |x| − ∫ sgn(B s − x) dB s 󵄨󵄨󵄨 󵄨󵄨 󵄨󵄨 󵄨󵄨 󵄨󵄨 0 󵄨󵄨 t 󵄨󵄨n 󵄨󵄨 󵄨󵄨 󵄨󵄨n 󵄨 n−1 󵄨󵄨 n−1 󵄨󵄨 ⩽ 2 󵄨󵄨|B t − x| − |x|󵄨󵄨 + 2 󵄨󵄨∫ sgn(B s − x) dB s 󵄨󵄨󵄨 . 󵄨󵄨 󵄨󵄨 󵄨󵄨 0 󵄨󵄨 ⩽|B t |

Take expectations, use Brownian scaling B t ∼ √tB1 and the Burkholder–Davis–Gundy inequality to get n/2

t

n 𝔼 [sup (L xt ) ] x∈ℝ

⩽ cn t

n/2

⩽ Cn t

n/2

[ 𝔼 [|B1 | ] + c n 𝔼 [(sup ∫ | sgn(B s − x)|2 ds) n

.

[

x∈ℝ

0

⩽1

] ] ]

19.25 Corollary (continuity of local time). Let (B t )t⩾0 be a BM1 with local time L at . There is a modification of L at such that (t, a) 󳨃→ L at is jointly continuous and a 󳨃→ L at is globally Hölder continuous of any order γ < 12 , uniformly for t from compact intervals.

Proof. We will show that supt⩽T |L at − L bt | ⩽ C T |b − a|γ holds a.s. for all a, b ∈ ℝ and T > 0. Since t 󳨃→ L bt (ω) is continuous for each b and (ℙ-almost) all ω – this follows immediately from Tanaka’s formula (19.22) – we infer from 󵄨󵄨 a 󵄨 󵄨 󵄨 󵄨 󵄨 󵄨 󵄨 󵄨 󵄨 󵄨󵄨L t − L bu 󵄨󵄨󵄨 ⩽ 󵄨󵄨󵄨L at − L bt 󵄨󵄨󵄨 + 󵄨󵄨󵄨L bt − L bu 󵄨󵄨󵄨 ⩽ sup 󵄨󵄨󵄨L at − L bt 󵄨󵄨󵄨 + 󵄨󵄨󵄨L bt − L bu 󵄨󵄨󵄨 , 󵄨 󵄨 󵄨 󵄨 󵄨 󵄨 t⩽T 󵄨 󵄨 󵄨 󵄨

the joint continuity of (t, a) 󳨃→ L at .

t, u ∈ [0, T],

19.8 Local times and Brownian excursions |

343

In order to show Hölder continuity, we use the Kolmogorov–Chentsov theorem (Theorem 10.1). Fix n ∈ ℕ, use Tanaka’s formula (19.22) and the elementary estimates (|A| + |B| + |C|)2n ⩽ 32n−1 (|A|2n + |B|2n + |C|2n ) and ||A| − |B|| ⩽ |A − B| to see 󵄨󵄨 󵄨󵄨2n t 󵄨󵄨 󵄨󵄨 a 󵄨󵄨2n 󵄨󵄨󵄨󵄨 b 󵄨󵄨L t − L t 󵄨󵄨 = 󵄨󵄨|B t − a| − |B t − b| + |b| − |a| − ∫ [ sgn(B s − a) − sgn(B s − b)] dB s 󵄨󵄨󵄨󵄨 󵄨 󵄨 󵄨󵄨 󵄨󵄨 󵄨󵄨 󵄨󵄨 0 󵄨󵄨 t 󵄨󵄨2n 󵄨󵄨󵄨 󵄨󵄨󵄨 ⩽ 2 ⋅ 32n−1 |b − a|2n + 32n−1 󵄨󵄨󵄨∫ [ sgn(B s − a) − sgn(B s − b)] dB s 󵄨󵄨󵄨 . 󵄨󵄨 󵄨󵄨 󵄨󵄨 0 󵄨󵄨

Without loss of generality we can assume that a < b. If we agree to use the right continuous version of x 󳨃→ sgn(x) = 𝟙[0,∞) (x) − 𝟙(−∞,0) (x), then a straightforward calculation shows that |sgn(B s − a) − sgn(B s − b)|2 = 4 ⋅ 𝟙[a,b) (B s ). Now we take expectations on both sides and use the Burkholder–Davis–Gundy inequality (19.18) to deduce n

T

󵄨 󵄨2n [ ] 𝔼 [sup 󵄨󵄨󵄨󵄨L at − L bt 󵄨󵄨󵄨󵄨 ] ⩽ c n |b − a|2n + c n 𝔼 [(∫ 𝟙[a,b) (B s ) ds) ] . t⩽T [ 0 ]

By the occupation density formula, 󵄨 󵄨2n 𝔼 [sup 󵄨󵄨󵄨󵄨L at − L bt 󵄨󵄨󵄨󵄨 ] t⩽T

n

T

⩽ (19.26)

=



(19.27)



2n

c n |b − a|

[ ] + c n 𝔼 [(∫ 𝟙[a,b) (B s ) ds) ] [

0

[

a

b

]

n

[ ] c n |b − a|2n + c n 𝔼 [(∫ L xT dx) ] ]

c n |b − a|2n + c n |b − a|n 𝔼 [ sup (L xT ) ] n

x∈[a,b]

C T,n |b − a|n

for all a, b ∈ ℝ such that |a − b| ⩽ 1. Therefore, the Kolmogorov–Chentsov theorem shows the existence of a Hölder continuous modification for any Hölder exponent (n−1)/2n, and the argument used in the proof of Corollary 10.2 proves Hölder continuity up to order γ < 1/2.

19.26 Corollary (quadratic variation of local time). Let (B t )t⩾0 be a BM1 with local time (L xt )t⩾0,x∈ℝ and let (Π n )n⩾1 be a sequence of partitions of the interval [a, b] ⊂ ℝ such that the mesh |Π n | → 0. Then the limit b



a i ,a i−1 ∈Π n

󵄨󵄨 a i 󵄨2 L1 (ℙ) 󵄨󵄨L t − L at i−1 󵄨󵄨󵄨 󳨀󳨀󳨀󳨀󳨀󳨀→ 4 ∫ L xt dx 󵄨 󵄨 |Π n |→0

exists uniformly for all t from compact intervals.

a

(19.28)

344 | 19 Applications of Itô’s formula Proof. Let Π = {a = a0 < a1 < ⋅ ⋅ ⋅ < a n = b} be any partition of [a, b]. As in the proof of the previous corollary, we can use Tanaka’s formula to see 2

t

󵄨󵄨 a i−1 󵄨2 󵄨󵄨L t − L at i 󵄨󵄨󵄨 = (|B t − a i−1 | − |B t − a i | − |a i−1 | + |a i | − 2 ∫ 𝟙[a i−1 ,a i ) (B s ) dB s ) . 󵄨 󵄨 0

t

To ease on notation, write I i (t) := 2 ∫0 𝟙[a i−1 ,a i ) (B s ) dB s . 2 󵄨 a t a 󵄨2 1o Let us first show that ∑i 󵄨󵄨󵄨󵄨L t i−1 − L t i 󵄨󵄨󵄨󵄨 and ∑i (2 ∫0 𝟙[a i−1 ,a i ) (B s ) dB s ) have the same L1 (ℙ)-limit. Using the triangle inequality gives 󵄨󵄨 󵄨 󵄨󵄨|B t − a i−1 | − |B t − a i | − |a i−1 | + |a i |󵄨󵄨󵄨 ⩽ 2(a i − a i−1 ),

and Itô’s isometry and Lemma 19.24 show

ai

t

2

[ 𝔼 |I i (t)|] ⩽

𝔼 [I i2 (t)]

(19.26) = 𝔼 [4 ∫ 𝟙[a i−1 ,a i ) (B s ) ds] = 4 𝔼 [ ∫ L xt dx] [ 0 ] [a i−1 ] (19.27)

⩽ 4(a i − a i−1 ) 𝔼 [sup L xt ] ⩽ x

c√t (a i − a i−1 ) .

Combining these estimates with the binomial formula (A + C)2 − C2 = A(A + 2C) yields 󵄨󵄨󵄨 a 󵄨󵄨 󵄨 󵄨 2 a 󵄨2 𝔼 󵄨󵄨󵄨󵄨󵄨󵄨󵄨L t i−1 − L t i 󵄨󵄨󵄨󵄨 − I i2 (t)󵄨󵄨󵄨 = 𝔼 󵄨󵄨󵄨󵄨(|B t − a i−1 | − |B t − a i | − |a i−1 | + |a i | − I i (t)) − I i2 (t)󵄨󵄨󵄨󵄨 󵄨 󵄨 ⩽ 2(a i − a i−1 )(a i − a i−1 − 2 𝔼 |I i (t)|) ⩽ c t (a i − a i−1 )3/2 .

Therefore, we have

∑(a i − a i−1 )3/2 ⩽ √|Π| ∑(a i − a i−1 ) = √|Π|(b − a) 󳨀󳨀󳨀󳨀󳨀→ 0, i

|Π|→0

i

󵄨 a a 󵄨2 so the sums ∑i 󵄨󵄨󵄨󵄨L t i−1 − L t i 󵄨󵄨󵄨󵄨 and ∑i I i2 have the same convergence behaviour in L1 (ℙ). 2

2o We will now show that ∑i (2 ∫0 𝟙[a i−1 ,a i ) (B s ) dB s ) converges to 4 ∫a L xt dx in L2 (ℙ) which will imply the desired result. Because of the occupation time formula we have t

2

t

b

b

∑ (∫ 𝟙[a i−1 ,a i ) (B s ) dB s ) − ∫ L xt dx i

0

a

t

2

ai

[ ] = ∑ [(∫ 𝟙[a i−1 ,a i ) (B s ) dB s ) − ∫ L xt dx] i

[

0

[

0

t

a i−1

2

t

]

[ ] = ∑ [(∫ 𝟙[a i−1 ,a i ) (B s ) dB s ) − ∫ 𝟙[a i−1 ,a i ) (B s ) ds] . i

0

]

19.8 Local times and Brownian excursions | 345 t

Using Itô’s formula for the Itô process ∫0 𝟙[a i−1 ,a i ) (B s ) dB s shows that the general term of the above sum is a martingale: 2

t

t

M i (t) := (∫ 𝟙[a i−1 ,a i ) (B s ) dB s ) − ∫ 𝟙[a i−1 ,a i ) (B s ) ds 0

0 t

Itô

s

= 2 ∫ (∫ 𝟙[a i−1 ,a i ) (B r ) dB r ) 𝟙[a i−1 ,a i ) (B s ) dB s 0

0

(note that the calculation at the end of this proof ensures M i (t)𝟙[a i−1 ,a i ) (B s ) ∈ L2T which is needed to see that the integral is a martingale). Therefore, 2

𝔼 [(∑ M i (t)) ] = ∑ ∑ 𝔼 [M i (t)M j (t)] = ∑ ∑ 𝔼 [⟨M i , M j ⟩t ] . i j i j [ i ]

The angle bracket can be easily calculated from the integral representation of M i (t) using Theorem 17.15 ⟨M i , M j ⟩t

(17.10)

t s

s

= ∫ ∫ 𝟙[a i−1 ,a i ) (B r ) dB r ∫ 𝟙[a j−1 ,a j ) (B u ) dB u 𝟙[a i−1 ,a i ) (B s )𝟙[a j−1 ,a j ) (B s ) ds. 0 0

0

Since the intervals [a i−1 , a i ) are mutually disjoint, the mixed terms in the sum disappear, and we are left with 2

𝔼 [(∑ M i (t)) ] = ∑ 𝔼 (M 2i (t)) . i [ i ]

In order to estimate the expectation inside the sum, we use first the Cauchy–Schwarz inequality (CSI), and then Tonelli and Burkholder’s inequality t

𝔼 (M 2i (t))

2

s

= 𝔼 ∫ (∫ 𝟙[a i−1 ,a i ) (B r ) dB r ) 𝟙[a i−1 ,a i ) (B s ) ds 0

CSI

0

t

s

4

t

2

t

[ ] ⩽ √ ∫ 𝔼 [(∫ 𝟙[a i−1 ,a i ) (B r ) dB r ) ] ds ⋅ √ 𝔼 [∫ 𝟙[a i−1 ,a i ) (B s ) ds] 0 ] [0 [ 0 ]

(19.21)

t

t

[ ] ⩽ c√ ∫ 𝔼 [(∫ 𝟙[a i−1 ,a i ) (B r ) dr) ] ds ⋅ √ 𝔼 [∫ 𝟙[a i−1 ,a i ) (B s ) ds]. s⩽t 0 [0 ] [ 0 ]

346 | 19 Applications of Itô’s formula

Fig. 19.2: Local time increases if B t = 0; there is no increase over an “excursion” interval (g t , d t ). This is – schematically – shown in the picture.

The occupation times formula now gives 2

ai

𝔼 (M 2i (t))



⩽ (19.27)



ai

] [ c√t ⋅ √ 𝔼 [( ∫ L xt dx) ] ⋅ √ 𝔼 [ ∫ L xt dx] [a i−1 ] ] [ a i−1 c√t ⋅ √ 𝔼 [sup (L xt ) ] (a i − a i−1 )2 ⋅ √ 𝔼 [sup L xt ] (a i − a i−1 ) 2

x

x

c t (a i − a i−1 )3/2 .

Arguing as at the end of Step 1o , we see 2

𝔼 [(∑ M i (t)) ] ⩽ c t ∑ (a i − a i−1 )3/2 ⩽ c t (b − a)√|Π| 󳨀󳨀󳨀󳨀󳨀→ 0, |Π|→0 i ] [ i finishing the proof. So far, the specific properties of Brownian motion did not really enter our discussion, and most of the properties above hold for more general processes, see e.g. [216, Chapter VI.1]. Let us return to the original definition of local time (Definition 18.12): t

L0t

1 = ℙ- lim ∫ 𝟙(−ϵ,ϵ) (B s ) ds. ϵ→0 2ϵ 0

This shows that t 󳨃→ L0t can only grow if B t = 0, and that L0t is constant if B t is away from zero, see Fig. 19.2. Since t 󳨃→ L0t (ω) is increasing, we write dL0t (ω) for the measure induced by t 󳨃→ L0t . 19.27 Lemma. Let (B t )t⩾0 be a BM1 and denote by L 0t its local time at zero. a) supp[dL0t (ω)] ⊂ Z(ω) = {t ⩾ 0 : B t (ω) = 0} for almost all ω. b) L0t ∼ √tL01 .

19.8 Local times and Brownian excursions |

347

c) ℙ(L0t > 0) = 1 for all t > 0.

Proof. a) If s0 ∉ Z(ω), the continuity of the paths of a Brownian motion ensures that 𝟙(−ϵ,ϵ) (B s (ω)) = 0 for all s ∈ (s0 −δ, s0 +δ), if δ = δ(ϵ, ω) is sufficiently small. Therefore, we have L0s0 +δ (ω) = L0s0 −δ (ω), i.e. s0 ∉ supp[dL t (ω)].

b) Using the scaling property B rt ∼ √tB r of Brownian motion and a change of variables, we see that t

1

1

0

0

0

t t 1 ∫ 𝟙(−ϵ,ϵ) (B s ) ds = ∫ 𝟙(−ϵ,ϵ) (B rt ) dr ∼ ∫ 𝟙(−ϵ,ϵ) (√tB r ) dr 2ϵ 2ϵ 2ϵ 1

√t = ϵ ∫ 𝟙(− ϵ , ϵ ) (B r ) dr. 2 √t √t √t 0

Letting ϵ → 0, hence ϵ/√t → 0, proves the claim.

c) From Tanaka’s formula (19.22) we see that 𝔼 L01 = 𝔼 |B1 | > 0, hence ℙ(L01 > 0) > 1. Because of the scaling property of local time, we also have B {L0t > 0} = {L01 > 0} = ⋂ {L01/n > 0} ∈ F1/N n⩾N

for all N ∈ ℕ.

B Since N ∈ ℕ is arbitrary, we see that {L t > 0} ∈ F0+ , and Blumenthal’s 0–1-law 0 (Corollary 6.23) shows that ℙ(L t > 0) = 1.

Recall from Sections 11.4 and 11.5 that the set of zeros of Brownian motion Z(ω) = B−1 ({0}, ω) = {t ∈ [0, ∞) : B t (ω) = 0}

is a closed, unbounded set which is dense in itself. We may write the complement as



Zc (ω) = ⋃ (g t (ω), d t (ω)) t>0

where g t = sup{s < t : B s = 0} is the last zero before time t, and d t = inf {s > t : B s = 0} is the first zero after time t; we say that the pair {g t ; d t } straddles t > 0.³ Since each interval (g t , d t ) ≠ 0 contains a rational number, the union appearing in the definition of Zc (ω) is, in fact, countable. Since d t − t is the first hitting time of zero of the shifted Brownian motion W s = B t+s started at W0 = B t , d t is a stopping time, whereas g t cannot be a stopping time. Note that B(g t ) = B(d t ) = 0 and B|(g t ,d t ) ≠ 0; therefore, (g t , d t ) is called an excursion interval for Brownian motion away from the level 0. 3 It is customary to use the letters g and d for the left (“gauche”) and right (“droite”) endpoints of the interval.

Ex. 4.9 Ex. 5.23

348 | 19 Applications of Itô’s formula We will need one more random set connected with Brownian motion. Denote by τ u (ω) := τ(u, ω) := inf {s > 0 : L0s (ω) > u} ,

u ⩾ 0,

(19.29)

the generalized inverse of the local time, cf. Lemma 19.16; τ u is an increasing, right continuous function whose jumps τ u − τ u− > 0 correspond to the intervals where t 󳨃→ L t is constant, taking the value u. We are interested in the range of τ∙ , R(ω) := {τ u (ω) : u ⩾ 0} and



Rc (ω) = ⋃ (τ u− (ω), τ u (ω)) . u>0

19.28 Theorem. Let (B t )t⩾0 be a one-dimensional Brownian motion with local time (at zero) L0t and the generalized inverse τ u = (L0 )−1 u . For almost all ω the sets i) Z(ω) – the set of all zeros of Brownian motion, ii) R(ω) – the closure of the range of u 󳨃→ τ u , iii) S(ω) – the support of the measure dL0t (ω), coincide. In particular, ℙ(L0∞ = ∞) = 1, and there is a one-to-one correspondence between excursion intervals (g t , d t ) ≠ 0 and the jumps (τ u− , τ u ) ≠ 0. Proof. 1o We have S = R. For every ω the following equivalences hold: ∞

t ∈ S(ω) ⇐⇒ ∀ϕ ∈ C+c [0, ∞), ϕ(t) > 0 : ∫ ϕ(s) dL0s (ω) > 0 0

⇐⇒ ∀ϵ > 0 : L0t+ϵ (ω) > L0t−ϵ (ω) ⇐⇒ t ∉ Rc (ω).

This proves that S(ω) = R(ω).

2o We have R ⊂ Z almost surely. Directly from the definition of L0t (ω) we can see that, a.s., L t+s (ω) > L s (ω) implies (t, t + s) ∩ Z(ω) ≠ 0. That is, (t, t + s) ∩ Z(ω) = 0 implies L t+s (ω) = L s (ω). Therefore, for each u > 0 there is some t = t(u) such that (τ u− (ω), τ u (ω)) ⊂ (g t (ω), d t (ω)); this proves that R ⊂ Z a.s. 3o We have Z ⊂ S. Since L0s (ω) > 0 a.s., see Lemma 19.27, we have τ0 (ω) = 0. Tanaka’s formula (19.22) shows for all s ⩾ 0 d t +s

L d t +s = |B d t +s | − ∫ sgn(B r ) dB r , 0

and the difference becomes

d t +s

s

L d t +s − L d t = |B d t +s | − |B d t | − ∫ sgn(B r ) dB r = |W s | − ∫ sgn(W r ) dW r . =0

dt

0

By the strong Markov property (Theorem 6.5), W s := B d t +s = B d t +s − B d t is again a Brownian motion. Therefore, the right-hand side of the last formula is the local time

19.8 Local times and Brownian excursions |

349

̂ 0s of W which proves that L d +s − L d > 0 a.s. and d t (ω) ∈ S(ω) for all ω ∈ N c where L t t t N t ⊂ Ω with ℙ(N t ) = 0; in particular, d q (ω) ∈ S(ω) for all q ∈ ℚ ∩ [0, ∞) and for all ω ∉ N := ⋃q∈ℚ+ N q which is again a null set. Let ω ∉ N and fix any z ∈ Z(ω). Since Z(ω) has Lebesgue measure zero, it cannot contain any open interval, and so there is in any neighbourhood of u some q ∈ ℚ such that q < u and q ∉ Z(ω). This means that d q (ω) ⩽ z and since we can let q ↑ z, we see that z ∈ S(ω) = S(ω), and S(ω) ⊂ Z(ω) follows.

Finally, since Z is almost surely unbounded, cf. Theorem 11.21.b), we conclude that L0∞ (ω) = ∞ a.s. From the equality Z(ω) = R(ω) and Step 2o we know that





Zc (ω) = ⋃ (g t (ω), d t (ω)) = ⋃ (τ u− (ω), τ u (ω)) = Rc (ω), t>0

u>0

establishing a one-to-one correspondence between the jump intervals (τ u− (ω), τ u (ω)) and the excursion intervals (g t (ω), d t (ω)); note, however, that different values of t can lead to the same excursion interval: any s ∈ (g t (ω), d t (ω)) satisfies (g s (ω), d s (ω)) = (g t (ω), d t (ω)).

We can now identify the processes appearing in Tanaka’s formula. The following theorem should be compared with Theorem 6.10 where we determined the laws of the running maximum M t = sups⩽t B s and minimum m t = sups⩽t B s of a Brownian motion.

19.29 Theorem (Lévy 1939, 1948). Let (B t )t⩾0 be a Brownian motion with local time L0t , running minimum m t , and running maximum M t . t a) The process β t := ∫0 sgn(B s ) dB s appearing in Tanaka’s formula is a one dimensional |B|

β

Brownian motion; its natural filtration is Ft = Ft . b) The local time satisfies L0t (ω) = sups⩽t (−β s (ω)) = − inf s⩽t β s (ω). c) The bivariate processes (M t − B t , M t )t⩾0 , (B t − m t , −m t )t⩾0 and (|B t |, L0t )t⩾0 have the same finite dimensional distributions. 2 d) The generalized inverse of L0t satisfies τ u ∼ u(2πx3 )−1/2 e−u /2x 𝟙(0,∞) (x) dx and 𝔼 exp (−ξτ u ) = exp (−u√2ξ ) for all ξ ⩾ 0. t

Proof. a) The stochastic integral β t := ∫0 sgn(B s ) dB s is a continuous martingale, and so is the compensated process 2

t

t

t

(∫ sgn(B s ) dB s ) − ⟨∫ sgn(B s ) dB s ⟩ = β2t − ∫ sgn(B s )2 ds = β2t − t, 0

0

0

=1

cf. Theorem 15.15.b). Therefore, we can apply Lévy’s martingale characterization of Brownian motion (Theorem 9.13 or 19.5) and see that (β t )t⩾0 is a Brownian motion. β From Tanaka’s formula (19.22) we get β t = |B t | − L0t which yields that Ft ⊂ |B|

0

0

|B|

|B|

σ(Ft , FtL ). The definition of L0t shows that FtL ⊂ Ft , hence Ft ⊂ Ft . β

Ex. 19.16

350 | 19 Applications of Itô’s formula Again from Tanaka’s formula we get |B t | = β t + L0t . With the help of Part b) we can |B|

express L0t as function of β s , s ⩽ t, hence Ft

b) Tanaka’s formula (19.22) tells us that

L0t

L0s

L0t = |B t | − β t

for the BM1

⊂ Ft . β

t

β t = ∫ sgn(B s ) dB s . 0

L0t

Thus, ⩾ ⩾ −β s for all s ∈ [0, t]. This proves ⩾ sups⩽t (−β s ). In order to see the converse inequality, let g t be the last zero of B before time t. Since L0t does not grow if B t ≠ 0, we see that L0g t = L0t , hence L0t = L0g t = |B g t | − β g t = −β g t ⩽ sups⩽t (−β s ). c) The result of Part b) shows for the bivariate processes (|B t |, L0t )t⩾0 = (β t − sup(−β s ), sup(−β s )) s⩽t

s⩽t

t⩽0

= (β t − inf β s , − inf β s ) s⩽t

s⩽t

t⩾0

.

Since β and B have the same finite dimensional distributions, we conclude from this (|B t |, L0t )t⩾0 ∼ (B t − m t , −m t )t⩾0 , and (B t − m t , −m t )t⩾0 ∼ (M t − B t , M t )t⩾0 follows from the fact that −B is again a Brownian motion.

d) The form of the density of τ u follows from (6.14) in Theorem 6.10 (where we use (b, t) 󴁄󴀼 (u, x)). The form of the Laplace transform is obtained by a direct calculation: using the differentiation theorem for parameter dependent integrals and a change of variables (xξ = u2 /2y) we get ∞

2 u d 𝔼 e−ξτ u = − ∫ e−ξx e−u /2x dx dξ √2πx

0

=−



2 u u u 𝔼 e−ξτ u . dy = − ∫ e−u /2y e−yξ √2ξ √2πy3 √2ξ

0

󵄨 Since ξ 󳨃→ 𝔼 e−ξτ u is continuous and 𝔼 e−ξτ u 󵄨󵄨󵄨ξ =0 = 1, the above differential equation has the unique solution 𝔼 e−ξτ u = exp (−u√2ξ ).

We want to study the excursions of a Brownian motion directly. For this we need a few definitions and preparations. 19.30 Definition. Let (Ω, A , ℙ) be a probability space with a BM1 (B t )t⩾0 and denote by L0t the local time at 0 and by τ u its inverse. The space of excursions consists of all continuous functions w ∈ C(o) which stay zero once they reach the first zero in (0, ∞): E := {t 󳨃→ w(t)𝟙[0,ζ(w)) (t) : w ∈ C(o) } ,

ζ(w) := inf {s > 0 : w(s) = 0} ∈ [0, ∞];

(19.30)

ζ(e) is the life time of the excursion e ∈ E. The space of excursions is equipped with the smallest σ-algebra E which makes all coordinate projections X t : w 󳨃→ w(t) measurable.

19.8 Local times and Brownian excursions | 351

A Brownian excursion is the map e u : Ω → E, ω 󳨃→ e u (ω, s) := B(τ(u−)+s)∧τ(u) (ω) where u > 0. We have e u ≡ 0 if, and only if, τ u = τ u− ; we also define e0 ≡ 0.

We have seen that Brownian excursions happen at the level stretches of local time. Thus, it is natural⁴ to “tag” the excursions e u with the level u of local time; if e u ≠ 0 is non-trivial, the inverse local time has a jump τ u− < τ u , and the jump height τ u − τ u− > 0 corresponds to the length of the level stretch or the life time ζ(e u ) of the excursion. Naively, we may think of the path t 󳨃→ B t (ω) as a concatenation of excursions. The trouble is, that the endpoints of the excursion intervals are the zeros of Brownian motion which are dense in itself, i.e. a “condensation” of excursions will a.s. happen. In order to make this rigorous, we need a bit more notation. 19.31 Definition. Let (E, E ) be the space of excursions and denote by e u (ω, ⋅) a Brownian excursion. We set for all v > 0 and U ∈ E such that 0 ∉ U N ω ((u, v] × U) := # {r ∈ (u, v] : e r (ω, ⋅) ∈ U} = ∑ 𝟙U (e r (ω, ⋅)) .

(19.31)

u ϵ},⁵ and use the one-to-one correspondence between the jumps of the inverse local time and the level stretches of local time. Thus, N ω ((u, v] × U ϵ ) = # {r ∈ (u, v] : τ r (ω) − τ r− (ω) > ϵ} .

(19.32)

Since u 󳨃→ τ u (ω) is for almost all ω right continuous, this quantity is for each ϵ > 0 a.s. finite. Since we are interested in the distributional properties of ω 󳨃→ N ω , we need to understand the structure of τ u better. The following lemma uses shift operators θ s : Ω → Ω which shift the path by s: B t (θ s ω) = B t+s (ω). We can also shift by a random time σ(ω), B t ∘ θ σ (ω) = B t+σ (ω) = B t+σ(ω) (ω). This formalism is useful if we want to indicate that some stopping time etc. depends on a time-shifted process.

19.32 Lemma. Let (B t )t⩾0 be a BM1 , L0t the local time at zero and τ u its generalized right continuous inverse. a) (L t )t⩾0 satisfies L τ u +s = L τ u + L s ∘ θ τ u = u + L s ∘ θ τ u . b) (τ u )u⩾0 satisfies τ u+v − τ u = τ v ∘ θ τ u . c) (τ u )u⩾0 equipped with the filtration Gu := Fτ u satisfies for all u, v, ξ ⩾ 0 𝔼 [exp (−ξ(τ u+v − τ u )) | Gu ] = 𝔼 [exp (−ξτ v )] = exp (−v√2ξ ) .

(19.33)

4 This simple but fundamental insight is due to Itô [116]. 5 You should check, that w 󳨃→ ζ(w) is indeed measurable, but this follows from the observation that ζ is the first hitting time of zero.

Ex. 4.7 Ex. 6.4

352 | 19 Applications of Itô’s formula In particular, (τ u )u⩾0 has independent and stationary increments.⁶

Proof. a) As in the third step of the proof of Theorem 19.28 we can use Tanaka’s formula (19.22) to get L0τ u +s



L0τ u

τ u +s

= |B τ u +s | − |B τ u | − ∫ sgn(B r ) dB r τu

s

= |B s ∘ θ τ u | − |B0 ∘ θ τ u | − ∫ sgn(B r ∘ θ τ u ) dB r ∘ θ τ u , 0

and the latter is L0s ∘ θ τ u – it is the local time for the Brownian motion W s = B s ∘ θ τ u ; observe that W0 = B τ u = 0. The equality L0τ u = u follows from Lemma 19.16.g). b) The equality in Part a) shows that L0τ u +s > u + v if, and only if, L0s ∘ θ τ u > v. From the definition of the generalized inverse (19.29) we see def

inf {x : L0x > u + v} = τ u+v ⩽ τ u + s

and

def

inf {x : L0x ∘ θ τ u > v} = τ v ∘ θ τ u ⩽ s.

Letting s ↓ τ v ∘ θ τ u yields τ u+v ⩽ τ u + τ v ∘ θ τ u . Conversely, Lemma 19.16.b) and the first part shows

τ u + s ⩽ τ(L0τ u + s) = τ(L0τ u + L0s ∘ θ τ u ).

Now we use s = τ v ∘ θ τ u and see τ u + τ v ∘ θ τ u = τ u+v as L0τ u = u and L0τ v ∘θ τ ∘ θ τ u = v. u

c) From Part b) we know that τ u+v − τ u = τ v ∘ θ τ u . The right-hand side depends on the shifted process B s ∘ θ τ u = B τ u +s = B τ u +s − B τ u which is, because of the strong Markov property (Theorem 6.5), again a Brownian motion. Since (B s ∘ θ τ u )s⩾0 is independent of Gu := Fτ u , we see that 𝔼 [exp (−ξ(τ u+v − τ u )) | Gu ] = 𝔼 [exp (−ξτ v ∘ θ τ u )]

(B s ∘θ τ u )s ∼(B s )s

=

𝔼 [exp (−ξτ v )] .

Thus, by Theorem 19.29.d), 𝔼 [exp (−ξ(τ u+v − τ u )) | Gu ] = exp (−v√2ξ ). This shows, in particular, that τ u+v − τ u ∼ τ v , i.e. the stationary increment property. In order to see the independent increments property, we can use (19.33) and mimic the argument used in Lemma 5.4. 19.33 Lemma. Let N u be a Poisson random variable with intensity u and (X k )k⩾1 iid exponential random variables with parameter 1. We have ℙ(X1 + ⋅ ⋅ ⋅ + X n ⩽ u) = ℙ(N u ⩾ n)

for all n ∈ ℕ, u > 0.

6 Such processes are often called Lévy processes. Increasing Lévy processes are called subordinators.

19.8 Local times and Brownian excursions | 353

Proof. Set ϕ0 (u) := 1, ϕ n (u) := ℙ(X1 + ⋅ ⋅ ⋅ + X n ⩽ u) and observe that, because of independence and X n ∼ e−x dx, u

u

ϕ n (u) = ∫ ℙ(X1 + ⋅ ⋅ ⋅ + X n−1 + x ⩽ u) e

Therefore,

0

−x

u

dx = ∫ ϕ n−1 (u − x) e−x dx, 0

ϕ n−1 (u) − ϕ n (u) = ∫ [ϕ n−2 (u − x) − ϕ n−1 (u − x)] e−x dx, 0

u

n ∈ ℕ.

n ⩾ 2,

and since ϕ0 (u) − ϕ1 (u) = 1 − ∫0 e−x dx = 1 − (1 − e−u ) = e−u , we get u

u

ϕ1 (u) − ϕ2 (u) = ∫ [ϕ0 (u − x) − ϕ1 (u − x)] e 0

−x

dx = ∫ e−(u−x) e−x dx = ue−u . 0

Repeating this process reveals that ϕ n−1 (u) − ϕ n (u) = u n−1 /(n − 1)! ⋅ e−u , i.e. ∞

ϕ n (u) = ∑

k=n

as claimed.

u k −u e = ℙ(N u ⩾ n), k!

19.34 Lemma. Let (X k )k⩾1 be iid exponential random variables with parameter 1, (H kϵ )k⩾1 iid random variables with law 12 √ϵx−3/2 𝟙(ϵ,∞) (x) dx, where ϵ > 0. We assume that the sequences (X k )k⩾1 and (H kϵ )k⩾1 are independent and set ∞

C ϵu := ∑ H kϵ 𝟙{X1 +⋅⋅⋅+X k ⩽λu} , k=1

λ = λϵ = √

2 . πϵ

For each u > 0, the random variables C ϵu converge in law to τ u as ϵ → 0.

Proof. In view of Lévy’s continuity theorem (e.g. [235, Theorem 30.3]), it is enough to show that limϵ→0 𝔼 [exp (−ξC ϵu )] = exp (−u√2ξ ) = 𝔼 [exp (−ξτ u )]. We have ∞

N λu

𝔼 [exp (−ξC ϵu )] = 𝔼 [exp (−ξ ∑ H kϵ 𝟙{X1 +⋅⋅⋅+X k ⩽λu} )] = 𝔼 [exp (−ξ ∑ H kϵ )] , k=1

k=1

where N λu is an independent Poisson random variable with parameter λu, cf. Lemma 19.33. We also use the “empty sum convention”, i.e. ∑0k=1 = 0. Since the random variables H kϵ are iid, ∞

n

𝔼 [exp (−ξC ϵu )] = ∑ 𝔼 [exp (−ξ ∑ H kϵ )] n=0

k=1

= [𝔼 (e−ξH1 )] ϵ

n

ϵ (λu)n −λu e = exp [−λu (1 − 𝔼 e−ξH1 )] . n!

354 | 19 Applications of Itô’s formula By construction, λ (1 − 𝔼 e−ξH1 ) = √ ϵ





√ϵ dx 2 1 dx 󳨀󳨀󳨀→ ∫ (1 − e−ξx ) ∫ (1 − e−ξx ) 3/2 . πϵ 2x3/2 ϵ→0 √2π x ϵ 0

The right-hand side has the value √2ξ . This follows easily using Fubini’s theorem: ∞

∫ (1 − e−ξx ) 0

∞ xξ



∞ ∞

dx dx dx = ∫ ∫ e−y dy 3/2 = ∫ ∫ 3/2 e−y dy = 2√ ξ ∫ y1/2−1 e−y dy. x3/2 x x 0 0

0

0 y/ξ

The last integral is Euler’s Gamma function: Γ(1/2) = √π.

Returning to the number of excursions of minimum length ϵ > 0, cf. (19.32), we see that the random variables N ω ((0, v] × U ϵ ) and # {u ⩽ v : C ϵu (ω) ≠ C ϵu− (ω)} ∼ N λv (ω) ∞ with λ = λ ϵ = √2/πϵ have the same law.⁷ If we recall that λ ϵ = ∫0 𝟙(ϵ,∞) (x) μ(dx) for the measure μ(dx) = (2π)−1/2 x−3/2 dx, then we see that for all u < v and ϵ < δ N ω ((u, v] × (U ϵ \ U δ ))

A bit more is true.

is a Poisson r.v. with intensity

Leb ⊗ μ[(u, v] × (ϵ, δ)].

19.35 Theorem (Itô 1969, 1970). Let (B t )t⩾0 be a BM1 , L0t its local time and τ u the inverse local time. Moreover, e u (ω, ⋅) are the Brownian excursions and N ω ((u, v] × U) the corresponding counting process, cf. Definitions 19.30 and 19.31. The family of Brownian excursions (e u (ω, ⋅), Fτ u )u⩾0 is a Poisson point process, i.e. i) (u, ω) 󳨃→ e u (ω, ⋅) is B(0, ∞) ⊗ A /E measurable. ii) The set D ω := {u ⩾ 0 : e u (ω, ⋅) ≢ 0} is a.s. countable. iii) The measure U 󳨃→ N ω ((0, v] × U) is for almost all (u, ω) σ-finite. iv) (Adaptedness) The random variable ω 󳨃→ N ω ((0, v] × U) is Fτ v measurable. v) (Stationary and independent increments) For all u, v ⩾ 0, U ∈ E and k ∈ ℕ0 : ℙ (N((u, u + v] × U) = k | Fτ u ) = ℙ (N((0, v] × U) = k). Proof. i) Since the σ-algebra E in the excursion space is defined through the coordinate projections X t : E → ℝ, it is enough to show that the set {(u, ω) : e u (ω, ⋅) ∈ X −1 t (A)} ∈ B(0, ∞) ⊗ A for all t ⩾ 0 and A ∈ B(ℝ). This, however, follows directly from the definitions e u (ω, ⋅) ∈ X −1 t (A) ⇐⇒ e u (ω, t) = X t (e u (ω, ⋅)) ∈ A,

and the measurability of (u, ω) 󳨃→ e u (ω, t) = B(τ(u−)+t)∧τ(u) (ω).

7 To understand this, notice that C ϵr → τ r in law as ϵ → 0. On the other hand, for each fixed ϵ > 0, the jump height of C ϵr exceeds ϵ, while N((u, v], U ϵ ) counts those r where τ r − τ r− > ϵ.

19.8 Local times and Brownian excursions | 355

ii) follows from the fact that e u (ω) ≢ 0 if, and only if, τ u− (ω) < τ u (ω), but this can only happen for countably many u. iii) follows from (19.32), if we take ϵ = 1/n and observe that E \ {0} = ⋃n⩾1 U1/n . iv) follows from (19.31) and the definition of e u (ω, s) = B(τ(u−)+s)∧τ(u) (ω).

v) if we shift B t 󴁄󴀼 W t := B t+τ u = B t+τ u − B τ u by τ u , then W t is again a Brownian motion – this is the strong Markov property, Theorem 6.5 –, and the values of the local times of B t and W t differ by u. Thus, e u+v (ω, ⋅) = e v (θ τ u ω, ⋅) as well as N ω ((u, u + v], U) = N θ τu ω ((0, v], U). Using again the strong Markov property gives ℙ0 (N∙ ((u, u + v], U) = k | Fτ u ) = ℙ0 (N θ τu (∙) ((0, v], U) = k | Fτ u ) = ℙB(τ u ) (N∙ ((0, v], U) = k) = ℙ0 (N∙ ((0, v], U) = k) .

Here we use that B(τ u ) = 0, and that the underlying probability measure ℙ = ℙ0 is induced by Brownian motion.

Up to now, we have constructed excursions from a Brownian motion. The next theorem shows that we can reconstruct (B t )t⩾0 from its excursions.

19.36 Corollary (Itô’s synthesis theorem). Let (e u )u⩾0 be the Poisson point process derived from a Brownian motion (B t )t⩾0 . Then τ v (ω) = ∑ ζ(e u (ω, ⋅)) and 0 0. Since a Poisson random variable is uniquely determined by its intensity, it is clear that the measure dv ⊗ n(de) uniquely determines (N((0, v] × U))v⩾0,U∈E0 , hence (e v )v⩾0 . The measure n is called the excursion measure or Itô measure. c) There is a stronger version of Itô’s synthesis theorem. Rather than using an excursion process derived from (B t )t⩾0 and then reassembling (B t )t⩾0 , we could start with a suitable measure n, construct a Poisson point process and then show that we get a Brownian motion using the recipe from Corollary 19.36. This is indeed true – but far from obvious, cf. [113, Chapter III.4.3] – and requires exact knowledge of the excursion measure n. We close this section with a few remarks on the excursion measure n, see Remark 19.37.b). Our analysis does not allow us to observe n directly, but we know its image measure μ = n ∘ ζ −1 under the life-span map ζ(e u ) = τ u − τ u− . In fact, μ(dx) = n(ζ ∈ dx) = (2π)−1/2 x−3/2 dx,

(19.34)

see the discussion preceding Theorem 19.35. Let us collect a few more results on n: 19.38 Theorem. Let n be the excursion measure of a Brownian motion (B t )t⩾0 as defined in Remark 19.37.b). a) For all x > 0 we have n {ζ > x} = (2/πx)−1/2 . b) For all x > 0 we have n {e : supt⩽ζ(e) |e(t)| > x} = 2n {e : supt⩽ζ(e) e(t) > x} = x−1 . c) For all t > 0 we have n t (dy) = n {e : e(t) ∈ dy} = |y|(2πt3 )−1/2 exp(−y2 /2t) dy. The measure n t (dy) is called the entrance law.

Proof. We have already proved Formula a) in the discussion before Theorem 19.35: U x = {ζ > x} and n(U x ) = μ(x, ∞) = λ x = (2/πx)−1/2 .

The first equality in b) follows from the symmetry of Brownian motion: there are equally many excursions above and below zero. In order to find the value of the measure, we set V x := {e ∈ E : supt |e(t)| > x}. Since N((0, v] × V x ) is a Poisson random variable with parameter vn(V x ) (cf. Remark 19.37.b)), the stopping time σ := inf {v > 0 : N((0, v] × V x ) > 0} is an exponential random variable with parameter 𝔼 σ = n(V x )−1 . Indeed: ℙ(σ > v) = ℙ (N((0, v] × V x ) > 0) = e−vn(V x ) .

Let τ = inf {t : |B t | = x} be the time when |B t | reaches the value x for the first time. Since this is its running maximum, we know that σ = L0τ . Now we can use Tanaka’s

19.8 Local times and Brownian excursions | 357

formula (19.22) to infer that |B t | − L0t is a martingale. By optional stopping, 𝔼 |B t∧τ | = 𝔼 L0t∧τ , ⩽x

and letting t → ∞, dominated convergence, resp., monotone convergence yield x = 𝔼 L0τ = 𝔼 σ = n(V x )−1 .

For Formula c) we give only a heuristic argument; the rather technical full proof can be found in [216, Chapter XII.4] or [222, Vol. 2, Chapter VI.50]. Recall that N((0, v] × U) counts the number of excursions of “shape” U ⊂ E up to local time level v. In our case, U = U t = {e ∈ E : e(t) ∈ A} with A ∈ B(ℝ \ {0}). Since U t = 0 if t > ζ(e), we have n(U) = n {e : e u (t) ∈ A, ζ(e) ⩾ t} =

n{e : e u (t) ∈ A, ζ(e) ⩾ t} n{ζ ⩾ t}. n{ζ ⩾ t}

2 From Part a) we know already the second factor: n{ζ ⩾ t} = √ πt . The first factor gives the proportion of excursions such that e(t) ∈ A. Since we have Brownian excursions, this proportion is the same as ℙ(B t ∈ A | g t = 0). From Theorem 11.25 and the symmetry of Brownian motion we know that for y ∈ ℝ and t > s > 0

ℙ(B t ∈ dy, g t ∈ ds) = e−y

2

/2(t−s)

This implies that

ℙ(B t ∈ dy | g t = s) =

|y| ds dy , 2π√ s(t − s)3

|y| 2π√ s(t−s)3

e−y

2

/2(t−s)

1 π√s(t−s)

hence ℙ(g t ∈ ds) =

dy

=

ds

π√ s(t − s)

.

2 |y| e−y /2(t−s) dy. 2(t − s)

2 gives the formula for n t (dy). Setting s = 0 and multiplying with n{ζ ⩾ t} = √ πt

Theorem 19.38 is far from describing n exactly, but it allows us to give an approximation formula for the local time. Denote by U ϵ,t = {e ∈ E : ζ(e) > ϵ, e(s) = 0 for some s < t} the excursions with length exceeding ϵ and ending before some time t.⁸ 19.39 Corollary (Lévy 1948). Let (B t )t⩾0 be a BM1 with local time L0t and N((0, v] × U) the number of excursions of “shape” U up to local time level v. Then ℙ (lim √ ϵ↓0

πϵ #U ϵ,t = L t for all t > 0) = 1. 2

8 Since ℙ(B t = 0) = 0, t > 0, a fixed time t > 0 is a.s. not a Brownian zero. Thus, U ϵ,t is also the number of excursions which end in [0, t].

358 | 19 Applications of Itô’s formula Proof (Itô–Mc Kean 1965; Maisonneuve 1980). Note that #U ϵ,t = N ((0, L t ) × {ζ > ϵ}), since all excursions in U ϵ,t must be finished before (real) time t, i.e. before local time level L t . Set ϵ k := 2π−1 k−2 and observe that n{ζ > ϵ k } = μ[ϵ k , ∞) = k. The sequence N ((0, v] × U ϵ k+1 \ U ϵ k ) = N ((0, v] × U ϵ k+1 ) − N ((0, v] × U ϵ k ) is a sequence of indepdendent Poisson random variables with parameter v(k + 1) − vk = v. Independence follows from the fact that for disjoint set U ∩ U 󸀠 = 0 the excursions in U and U 󸀠 come from different pieces of the Brownian path. Because of the strong Markov property (Theorem 6.5), these pieces are independent, and so are the counting variables. Now we can use the law of large numbers to get lim

k→∞

N((0, v] × U ϵ k ) πϵ k = lim √ N((0, v] × U ϵ k ) = v k 2 k→∞

a.s.

Denote the exceptional null set by M(v). Since N((0, v] × U ϵ ) is decreasing in ϵ and increasing in v, we can use for ϵ k+1 < ϵ ⩽ ϵ k and r < v < q where r, q ∈ ℚ+ , a sandwiching argument: √

πϵ k+1 πϵ πϵ k N ((0, r] × U ϵ k ) ⩽ √ N ((0, v) × U ϵ ) ⩽ √ N ((0, q] × U ϵ k+1 ) . 2 2 2

=√

ϵ k+1 ϵk

√ πϵ2k N ((0,r]×U ϵ k )

=√ ϵ

ϵk k+1

√ πϵ k+1 2 N((0,q]×U ϵ k+1 )

Note that this argument allows us to replace (0, v] by (0, v). Letting first k → ∞, and then r ↑ v and q ↓ v, we see that √ πϵ 2 N ((0, v) × U ϵ ) → v a.s. with an exceptional set M = ⋃r∈ℚ+ M(r) which does not depend on v. Finally, we can take v = L t (ω) to finish the argument.

19.40 Further reading. Further applications, e.g. to reflecting Brownian motion |B t |, excursions, local times, etc. are given in [113]. The monograph [150] proves Burkholder’s inequalities in ℝd with dimension-free constants. A concise introduction to Malliavin’s calculus and stochastic analysis can be found in [192] or [238]. Excursion theory for Brownian is nicely covered in [113] and [216]. The original account of Itô (which does not include Brownian motion) [116] can still be recommended. A treatment close to the original spirit of Lévy is [121] and [32]. Chung: Excursions in Brownian Motion. [32] [113] Ikeda, Watanabe: Stochastic Differential Equations and Diffusion Processes. [116] Itô: Poisson Point Processes and Their Application to Markov Processes. [121] Itô, McKean: Diffusion Processes and Their Sample Paths. [150] Krylov: Introduction to the Theory of Diffusion Processes. [192] Nualart, Nualart: Introduction to Malliavin Calculus. [216] Revuz, Yor: Continuous Martingales and Brownian Motion. [238] Shigekawa: Stochastic Analysis.

Problems

| 359

Problems 1.

State and prove the d-dimensional version of Lemma 19.1.

2.

Let (N t )t⩾0 be a Poisson process with intensity λ = 1 (see Problem 10.1 for the definition). Show that for FtN := σ(N r : r ⩽ t) both X t := N t − t and X 2t − t are martingales. Explain why this does not contradict Theorem 19.5.

3.

Let B t = (b t , β t ), t ⩾ 0, be a two-dimensional Brownian motion, i.e. (b t )t⩾0 , (β t )t⩾0 are independent one-dimensional Brownian motions. Show that for all σ1 , σ2 > 0 σ1 σ2 W t := bt + β t , t ⩾ 0, 2 2 2 √ σ1 + σ2 √ σ1 + σ22

is a one-dimensional Brownian motion.

4. Let (B(t))t⩾0 be a BM1 and T < ∞. Define β T = exp (ξB(T) − 12 ξ 2 T), ℚ := β T ℙ and W(t) := B(t)−ξt, t ∈ [0, T]. Verify by a direct calculation that for all 0 < t1 < ⋅ ⋅ ⋅ < t n and A1 , . . . , A n ∈ B(ℝ) 5.

ℚ(W(t1 ) ∈ A1 , . . . , W(t n ) ∈ A n ) = ℙ(B(t1 ) ∈ A1 , . . . , B(t n ) ∈ A n ).

Let (B t )t⩾0 be a BM1 , Φ(y) := ℙ(B1 ⩽ y), and set X t := B t + αt for some α ∈ ℝ. Use Girsanov’s theorem to show for all 0 ⩽ x ⩽ y, t > 0 ℙ(X t ⩽ x, sup X s ⩽ y) = Φ ( s⩽t

6.

7.

x − αt x − 2y − αt ) − e2αy Φ ( ). √t √t

Let X t be as in Problem 19.5 and set τ̂ b := inf{t ⩾ 0 : X t = b}. a) Use Problem 19.5 to find the probability density of τ̂ b if α, b > 0. b) Find ℙ(τ̂ b < ∞) for α < 0 and α ⩾ 0, respectively.

Let (B t )t⩾0 be a BM1 , denote by H2T the space used in Lemma 19.10 and pick ξ1 , . . . , ξ n ∈ ℝ and 0 = t0 < t1 < ⋅ ⋅ ⋅ < t n = T. Show that exp [i ∑nj=1 ξ j B t j ] is contained in H2T ⊕ iH2T .

8. Let (L, ⟨•, •⟩) be a Hilbert space and H some linear subspace. Show that H is a dense subset of L if, and only if, the following property holds: ∀x ∈ L : x ⊥ H 󳨐⇒ x = 0.

9.

Let (B t )t⩾0 be a BM1 , n ⩾ 1, 0 ⩽ t1 < ⋅ ⋅ ⋅ < t n = T and let Y an FT measurable, integrable random variable. Assume that 𝔼 [Y | σ(B t1 , . . . , B t n )] = 0 for any choice of n ⩾ 1 and 0 ⩽ t1 < . . . t n = T. Show that Y = 0 a.s.

10. Assume that (B t , Gt )t⩾0 and (M t , Ft )t⩾0 are independent martingales. Show that (M t )t⩽T and (B t )t⩽T are martingales for the enlarged filtration Ht := σ(Ft , Gt ). 11. Show the following addition to Lemma 19.16: τ(t−) = inf{s ⩾ 0 : a(s) ⩾ t} and

and τ(s) ⩾ t ⇐⇒ a(t−) ⩽ s.

a(t−) = inf{s ⩾ 0 : τ(s) ⩾ t}

360 | 19 Applications of Itô’s formula 12. Show that, in Theorem 19.17, the quadratic variation ⟨M⟩t is a Gt stopping time. Hint. Direct calculation, use Lemma 19.16.c) and A.15

13. Let f : [0, ∞) → ℝ be absolutely continuous. Show that ∫ f a−1 df = f a /a for all a > 0.

14. State and prove a d-dimensional version of the Burkholder–Davis–Gundy inequalities (19.21) if p ∈ [2, ∞). Hint. Use the fact that all norms in ℝd are equivalent

15. We have seen in Lemma 19.27.a) that supp[dL0t (ω)] ⊂ {t ⩾ 0 : B t (ω) = 0} for almost all ω. Show that supp[dL at (ω)] ⊂ {t ⩾ 0 : B t (ω) = a} for all ω from a set Ω0 with measure 1 such that Ω0 does not depend on a.

16. The proof of Theorem 19.29 uses, implicitly, the following beautiful result due to Skorokhod [239] which is to be proved: Lemma. Let b : [0, ∞) → ℝ be a continuous function such that b(0) ⩾ 0. There exist unique functions p, a : [0, ∞) → [0, ∞) such that i) b(t) = p(t) − a(t); ii) a(0) = 0, a(t) is continuous and increasing; iii) p(t) ⩾ 0 and supp [da(∙)] ⊂ {t : p(t) = 0}.

Hint. Set a(t) = 0 ∨ sups⩽t (−b(s)) to show existence. For uniqueness, assume that there is a further pair {a, p} and assume that p(t1 ) > p(t1 ) for some t1 , hence on some suitable maximal interval (t0 , t1 ]. Now use iii) to see a(t0 ) = a(t1 ). . . t

Remark. The third condition iii) can be equivalently expressed as a(t) = ∫0 𝟙{0} (p(s)) da(s). This is the so-called Skorokhod (integral) equation.

17. Show that in Lemma 19.32.a) the following stronger assertion holds: (L t )t⩾0 is an additive functional, i.e. L t+s = L t + L s ∘ θ t holds for all s, t ⩾ 0.

18. Use the strong Markov property to show the stationary and independent increments property in Lemma 19.32.c) directly, i.e. for all 0 = u0 < u1 < ⋅ ⋅ ⋅ < u n and A1 , . . . , A n ∈ B[0, ∞) we have n

ℙ (τ u i − τ u i−1 ∈ A i , i = 1, . . . , n) = ∏ ℙ (τ u i −u i−1 ∈ A i ) . i=1

19. Let (X t , Gt ) be an adapted, real-valued process with right continuous paths and finite left limits. Assume that ℙ(X t − X s ∈ A | Gs ) = ℙ(X t−s ∈ A) holds for all 0 ⩽ s < t < ∞ and A ∈ B(ℝ). Show that (X t )t⩾0 enjoys the independent and stationary increment properties (B1), (B2). Hint. Mimic Lemma 5.4

Remark. This proves that Theorem 19.35.v) entails that the counting measure has stationary independent increments.

20 Wiener Chaos and iterated Wiener–Itô integrals Let (B t )t⩾0 be a one-dimensional Brownian motion on the probability space (Ω, A , ℙ). B = σ(B , t ⩾ 0). We are interested We assume that A is generated by (B t )t⩾0 , i.e. A = F∞ t in the structure of the space L2 (ℙ) = L2 (Ω, A , ℙ) – in the present setting these are the square integrable functionals of Brownian motion – and the main result of this chapter is the following Wiener–Itô chaos decomposition: any F ∈ L2 (ℙ) can be written in the form t2 ∞ tn ∞

F = 𝔼 F + ∑ ∫ ∫⋅ ⋅ ⋅ ∫ f n (t1 , . . . , t n ) dB t1 . . . dB t n−1 dB t n n=1 0 0

(20.1)

0

for suitable deterministic integrands f n ∈ L2 ([0, ∞)n , λ n ); the sum appearing in (20.1) is an orthogonal sum in the space L2 (ℙ). You may want to compare this result with the martingale representation theorem from Section 19.4. The representation (20.1) is originally due to Wiener [266] (who used a different form of the summands, see also [267]), but we follow Itô’s elegant approach from [115]. Throughout this chapter we B , ℙ) and L 2 (ℝk ) = L 2 (ℝk , λ k ) where λ k is Lebesgue measure on use L2 (ℙ) = L2 (Ω, F∞ + + k k ℝ+ = [0, ∞) .

20.1 Multiple vs. iterated stochastic integrals

A key ingredient in the representation (20.1) are iterated stochastic integrals, where the upper limits in the interior integrals are decreasing, i.e. we are integrating over a simplex {0 ⩽ t1 ⩽ t2 ⩽ . . . ⩽ t n < ∞}. In contrast, Wiener looked at integrals over the full space ℝ+n . Already the definition of a multiple integral of the form b b

b

∫ ∫⋅ ⋅ ⋅ ∫ f(t1 , t2 , . . . , t n ) dB t1 . . . dB t n−1 dB t n a a

a

n integrals

is a problem. Wiener’s definition of the stochastic integral (cf. § 13.5) cannot deal with random integrands at all, whereas Itô’s definition will run into a measurability problem. 20.1 Example. Let n = 2 and f(s, t) = 𝟙[a,b)×[a,b) (s, t) where [a, b) ⊂ ℝ+ . Then b

b

b

∫ (∫ 1 dB s ) dB t = ∫(B b − B a ) dB t , a

a

a

and this integral does not exist as an Itô integral since B b − B a is not Fa measurable. Wiener’s solution (1938) was to approximate f(t1 , . . . , t n ) by step functions of the form ϕ(t1 , . . . , t n ) = ∑ a i1 ,...,i n 𝟙[u i1 ,v i1 )×⋅⋅⋅×[u in ,v in ) (t1 , . . . , t n ) i1 ,...,i n

https://doi.org/10.1515/9783110741278-020

362 | 20 Wiener Chaos and iterated Wiener–Itô integrals where Π = {0 ⩽ u1 < v1 ⩽ . . . ⩽ u m < v m < ∞},¹ and to define ∞



n

∫ ⋅ ⋅ ⋅ ∫ ϕ(t1 , . . . , t n ) dB t1 . . . dB t n := ∑ a i1 ,...,i n ∏(B v ik − B u ik ). 0

i1 ,...,i n

0

k=1

Similar to § 13.5, Wiener defines the general stochastic multiple integral via completion. Note, however, that with this definition b b

b b 2

∫ ∫ 1 dB s dB t = (B b − B a ) a a

󳨐⇒ 𝔼 ∫ ∫ 1 dB s dB t = 𝔼[(B b − B a )2 ] = b − a ≠ 0. a a

This means that the double integral is neither orthogonal to constants nor a martingale. We do not continue along this way, since Wiener’s definition is going in the wrong direction. Itô’s solution (1951) is also based on an approximation of f(t1 , . . . , t n ) ∈ L2 (ℝ+n ) by step functions, but there is a twist. 20.2 Definition. Let Π = {0 ⩽ u1 < v1 ⩽ . . . ⩽ u m < v m < ∞}. An off-diagonal simple function on ℝ+n is a function of the form ϕ(t1 , . . . , t n ) = ∑ a ∆i1 ,...,i n 𝟙[u i1 ,v i1 )×⋅⋅⋅×[u in ,v in ) (t1 , . . . , t n ) i1 ,...,i n

(20.2)

where a ∆i1 ,...,i n = 0 whenever i k = i j with k ≠ j. We denote the family of off-diagonal simple functions by S∆ = S∆ (ℝ+n ).

A moment’s thought shows that an off-diagonal simple function is zero on the diagonal in ℝ+n and on all diagonals of all lower-dimensional subspaces. Let us show that S∆ is dense in L2 (ℝ+n ). There are two issues: the diagonals are missing and the steps are defined only for sets from a generator of the Borel sets. 20.3 Lemma. S∆ (ℝ+n ) ⊂ L2 (ℝ+n , λ n ) is a dense linear subspace.

Proof. It is clear that S∆ is a vector space. For every δ > 0 we define a cut-off function χ δ (t1 , . . . , t n ) := 𝟙{mink,j |t k −t j |⩾2δ} (t1 , . . . , t n ) ⋅ χ[0,T) (t1 ) ⋅ . . . ⋅ χ[0,T) (t n ),

T = 1/δ,

such that for any f ∈ L2 (ℝ+n ) the function f δ := χ δ ⋅ f is zero outside [0, T)n and on a strip of width δ around all diagonals in all dimensions d = 2, 3, . . . , n. Since f δ → f pointwise (Lebesgue a.e.) as δ → 0 and since |f δ | ⩽ |f| ∈ L2 (ℝ+n ), we can use dominated convergence to see that L2 -limδ→0 f δ = f . 1 There is no harm to consider in each coordinate direction of ℝ+n the same partition. Otherwise, we take the joint refinement of all partitions.

20.1 Multiple vs. iterated stochastic integrals |

363

This shows that it is enough to approximate f ∈ L2 ([0, T)n ) which vanish on a strip around the diagonal. In fact, if we can show that the simple functions (which are not necessarily zero around the diagonals) are dense in L2 ([0, T)n ), we are done. This follows from the observation that we can split the (n-dimensional) “rectangles” intersecting the strip ∆2δ = {(t1 , . . . , t n ) : mink,j |t k − t j | ⩾ 2δ} into smaller rectangles of side length δ, and so we may approximate any f vanishing in ∆2δ by functions from S∆ vanishing on ∆ δ . In order to see that the functions ∑i1 ,...,i n a i1 ,...,i n 𝟙[u i1 ,v i1 )×[u in ,v in ) (t1 , . . . , t n ) are dense in L2 ([0, T)n ), we note that the basis sets of the steps of these functions make up the algebra G consisting of all finite unions of n-dimensional rectangles of the form [a1 , b1 ) × ⋅ ⋅ ⋅ × [a n , b n ) ⊂ [0, T)n . Since G generates the Borel sets B([0, T)n ), we can use a standard result from measure theory which says that the G -step functions are dense in L2 ([0, T)n ), see e.g. [235, Lemma 17.14].² 20.4 Definition (Itô 1951). Let ϕ ∈ S∆ (ℝ+n ) be given by (20.2). The iterated Itô integral is defined as n

I n (ϕ) := ∑ a ∆i1 ,...,i n ∏(B v ik − B u ik ). i1 ,...,i n

(20.3)

k=1

We leave it as an exercise to check that I n (ϕ) is well-defined, i.e. independent of the representation of ϕ ∈ S∆ . 20.5 Example. Let us return to Example 20.1. If Π = {0 = t0 < t1 < ⋅ ⋅ ⋅ < t m = T} and |Π| = maxi (t i − t i−1 ), then we may approximate 𝟙[0,T)×[0,T) by off-diagonal simple Π functions 𝟙[0,T)×[0,T) := ∑i=k̸ 𝟙[t i−1 ,t i ) 𝟙[t k−1 ,t k ) and get Π I2 (𝟙[0,T)×[0,T) ) = ∑ (B t i − B t i−1 )(B t k − B t k−1 ) i=k ̸

= ∑(B t i − B t i−1 )(B t k − B t k−1 ) − ∑(B t i − B t i−1 )2 i

i,k

2

2

= (B T − B0 ) − ∑(B t i − B t i−1 ) i

L2 (ℙ), §9.1

󳨀󳨀󳨀󳨀󳨀󳨀󳨀󳨀󳨀󳨀󳨀󳨀󳨀→ B2T − T. |Π|→0

If we can interchange L2 -limits and the I2 -integral, this would be a good candidate for I2 (𝟙[0,T)×[0,T) ). Just as for the “normal” stochastic integral I1 , this will be a consequence of Itô’s isometry which we will verify later. In particular, 𝔼 I2 (𝟙[0,T)×[0,T) ) = 0, i.e. I2 (𝟙[0,T)×[0,T) ) is orthogonal to all constants; moreover, if T = t ⩾ 0, it is a martingale. 2 The argument uses i) that the step functions with steps in B([0, T)n ) are dense in L2 ([0, T)n ) – this is due to the sombrero lemma and the definition of the space L2 – and ii) that we can approximate any Borel set in [0, T)n by elements from G up to measure ϵ > 0.

Ex. 20.1

364 | 20 Wiener Chaos and iterated Wiener–Itô integrals More importantly, I2 (𝟙[0,T)×[0,T) ) can be expressed as an iterated integral: ∞ t



t

2 ∫ ∫ 𝟙[0,T)×[0,T) (s, t) dB s dB t = 2 ∫ [∫ 𝟙[0,T)×[0,T) (s, t) dB s ] dB t 0 0 0 [0 ] T

= 2 ∫ (B t − B0 ) dB t 0

Itô

=

formula

B2T − T = I2 (𝟙[0,T)×[0,T) ).

In this calculation we tacitly use symmetry: 𝟙[0,T)×[0,T) (s, t) = 𝟙[0,T)×[0,T) (t, s). In order to see what is going on, it is useful to consider this calculation for a Lebesgue integral b b

Ex. 20.2

b

t

b

{ } ∫ ∫ f(s, t) ds dt = ∫ {∫ + ∫} f(s, t) ds dt a a a {a t }

Fubini

=

b t

b s

∫ ∫ f(s, t) ds dt + ∫ ∫ f(s, t) dt ds. a a

a a

Now swap the names of the variables in the second integral s 󴀸󴀼 t. If f is symmetric, b

b

b

t

i.e. if f(s, t) = f(t, s), then we get ∫a ∫a f(s, t) ds dt = 2 ∫a ∫a f(s, t) ds dt.

20.6 Definition. Let f : ℝ+n → ℝ. The symmetrization of f is the function ̂f (t1 , t2 , . . . , t n ) := 1 ∑ f (t σ(1) , . . . , t σ(n) ) , n! σ

where the sum extends over all permutations σ : {1, . . . , n} → {1, . . . , n}.

20.7 Remark. Let f ∈ L2 (ℝ+n ). Because of Tonelli’s theorem, the value of ‖f‖L2 does not depend on the order of the integrations, so 1 ‖̂f ‖L2 ⩽ ∑ ‖f‖L2 = ‖f‖L2 . n! σ

Note that “ 0 are defined as H n (x, t) := (−1)n t n e x

2

/2t

d n −x2 /2t e dx n

(

d0 := id) . dx0

(20.4)

For t = 1 we get the (classical) Hermite polynomials H n (x) := H n (x, 1).

It is not hard to see that x 󳨃→ H n (x, t) is indeed a polynomial of order n ⩾ 0. We can express the extended Hermite polynomials in terms of the classical Hermite polynomials as follows: H n (x, t) = t n/2 H n (xt−1/2 ) .

(20.5)

368 | 20 Wiener Chaos and iterated Wiener–Itô integrals This follows easily from the chain rule (−1)n t n e x

2

/2t

n d n −x2 /2t dy n 󵄨󵄨󵄨 n n x2 /2t d −y2 /2 e = (−1) t e e ⋅ ( ) 󵄨󵄨 dx n dy n dx 󵄨󵄨y=xt−1/2 󵄨󵄨 2 2 dn = (−1)n t n/2 e y /2 n e−y /2 󵄨󵄨󵄨󵄨 = t n/2 H n (xt−1/2 ). dy 󵄨y=xt−1/2

The following operators are useful when dealing with Hermite polynomials: ∂ϕ(x) =

d ϕ(x) and dx

δϕ(x) = xϕ(x) − ϕ󸀠 (x) = (x − 2

2

d )ϕ(x). dx

(20.6)

d A direct calculation yields δϕ(x) = −e x /2 dx [e−x /2 ϕ(x)] and shows that H n (x) = δ n 1 can be obtained by n applications of δ to the constant function x 󳨃→ 1. This makes it easy to calculate the first few Hermite polynomials:

H2 (x) = x2 − 1,

H0 (x) = 1,

H3 (x) = x3 − 3x,

H1 (x) = x,

H4 (x) = x4 − 6x2 + 3,

H5 (x) = x5 − 10x3 + 15x.

Note that we have normalized our polynomials such that the leading coefficient is 2 always 1. In the physics literature it is more common to use the measure e−x dx, rather 2 than (2πt)−1/2 e−x /2t dx, which results in different coefficients.

20.12 Lemma. The operators ∂ = δ∗ and δ∗ = ∂ are formal adjoints in the space 2 L2 (ν1 ) := L2 (ℝ, ν1 ), ν1 (dx) = (2π)−1/2 e−x /2 dx. Moreover, a) δH n (x) = H n+1 (x); b) ∂H n (x) = nH n−1 (x);

c) (δ + ∂)H n (x) = xH n (x);

d) δ∂H n (x) = nH n (x) and ∂δH n (x) = (n + 1)H n (x). Proof. We have for all f, g ∈ C∞ c (ℝ) ⟨∂f, g⟩L2 (ν1 ) = ∫ f 󸀠 (x)g(x)e−x ℝ

2

/2

2 2 2 dx parts d dx = ∫ f(x) (−1)e x /2 [e−x /2 g(x)] e−x /2 dx √2π √2π ℝ = δg(x)

= ⟨f, δg⟩L2 (ν1 ) .

By approximation, this relation extends to all absolutely continuous f, g ∈ L2 (ν1 ) such that ∂f, δg ∈ L2 (ν1 ). a) follows from H n+1 = δ n+1 1 = δ(δ n 1) = δH n . To see b) we use induction in n. For n = 1 the formula is obviously true. Assume that we already know ∂H n = nH n−1 ; then a)

∂H n+1 = ∂δH n

(20.6)

=

Ind’ass

=

∂(xH n − ∂H n ) = H n + x∂H n − ∂∂H n (20.6)

a)

H n + xnH n−1 − ∂nH n−1 = nδH n−1 + H n = (n + 1)H n .

Property c) follows from (δ + ∂) = x while d) is a combination of a) and b).

20.2 An interlude: Hermite polynomials |

369

Let us collect essential properties of the Hermite polynomials. We will use the double factorial (2k − 1)!! := 1 ⋅ 3 ⋅ 5 ⋅ ⋅ ⋅ (2k − 1) and (−1)!! := 1, and Kronecker’s delta δ mn = 𝟙{m=n} .

20.13 Theorem. Let H n (x, t) be the extended and H n (x) the classical Hermite polynomi2 als, and set ν t (dx) = (2πt)−1/2 e−x /2t dx. a) The family H n (x, t) is an orthogonal basis in the space L2 (ℝ, ν t ). Moreover, ∫ H n (x, t)H m (x, t) ν t (dx) = δ nm n! t n ; (orthogonality) ℝ

b) H n (x, t) = t n/2 H n (xt−1/2 );

c)

e xr−tr

2

/2



= ∑

n=0

rn

n!

⌊n/2⌋

d) H n (x, t) = ∑ ( k=0

⌊n/2⌋

xn = ∑ ( k=0

e) f)

(scaling)

(generating function)

H n (x, t);

n )(2k − 1)!! (−t)k x n−2k ; 2k

n )(2k − 1)!! t k H n−2k (x, t); 2k

(recurrence relation)

H n+1 (x, t) = xH n (x, t) − tnH n−1 (x, t);

∂ x H n (x, t) = nH n−1 (x, t);

g) (−t∂2x + x∂ x )H n (x, t) = nH n (x, t); 1 h) ∂ t H n (x, t) = − ∂2x H n (x, t); 2 m∧n m n i) H m (x, t)H n (x, t) = ∑ k!( )( )t k H m+n−2k (x, t). k k k=0

(differential equation)

(addition theorem)

Proof. We already know the scaling property b); therefore, it is enough to prove the assertions for the classical Hermite polynomials H n (x) = H n (x, 1).

The orthogonality relation in a) follows for m ⩽ n from ⟨H n , H m ⟩L2 (ν1 ) = ⟨δ n 1, δ m 1⟩L2 (ν1 )

=

20.12.d)

=

⟨δ n−m 1, ∂ m δ m 1⟩L2 (ν1 )

⟨δ n−m 1, m!⟩L2 (ν1 ) = δ nm m!.

Since the leading coefficient of the H n is 1, we see that the family (H n )n⩾0 is linearly independent. In order to see the completeness, it is enough to show that the monomials (x k )k⩾0 are dense in L2 (ν1 ). Assume that for some ϕ ∈ L2 (ν1 ) we have ⟨ϕ, x k ⟩L2 (ν1 ) = 0 for all k ⩾ 0. We multiply by (iξ)k /k! and sum over k ⩾ 0 to get ∞

∞ 2 2 dx dx (iξ)k (ixξ)k ⟨ϕ, x k ⟩L2 (ν1 ) = ∫ ∑ ϕ(x)e−x /2 = ∫ e ixξ ϕ(x)e−x /2 . k! k! √2π √2π k=0 k=0

0= ∑



2



2

This shows that the characteristic function of ϕ(x)e−x /2 is zero, hence ϕ(x)e−x /2 = 0, and ϕ(x) = 0 Lebesgue a.e. Thus, the orthogonal complement in L2 (ν1 ) of the closure

Ex. 20.4

370 | 20 Wiener Chaos and iterated Wiener–Itô integrals

Ex. 20.15

of the finite linear combinations span(x k , k ⩾ 0) = span(H n , n ⩾ 0) is {0}, hence span(H n , n ⩾ 0) is dense.

We derive c) using a differential equation. Set h(x, r) := ∑∞ n=0 ∞

rn n! H n (x)

and note that

∞ n ∞ n r n−1 r r e) H n (x) = ∑ H n+1 (x) = ∑ (xH n (x) − nH n−1 (x)) (n − 1)! n! n! n=1 n=0 n=0

∂ r h(x, r) = ∑



=x∑

n=0

∞ rn rn H n (x) − ∑ H n−1 (x) = (x − r)h(x, r). n! (n − 1)! n=1

Since h(x, 0) = 1, we can solve this differential equation and find h(x, r) = e xr−r

2

/2 .

The first identity in d) follows if we expand the generating function and change the order of summation by setting n = 2j + k: e xr−r

2

/2

∞ ∞

= ∑∑

k=0 j=0

∞ n n/2 j r k x k (−1)j r2j n! r n−2j (−1) x = ∑ ∑ k! n! j=0 (n − 2j)! 2j j! 2j j! n=0 ∞

= ∑

n=0

r n n/2 n n−2j (−1)j (2j)! . ∑ ( )x n! j=0 2j 2j j!

Since (2j)!/(2j j!) = (2j − 1)!!, the claimed formula follows if we compare the coefficients of r n in the above formula and the right-hand side of the generating fumction in Part c). For the second formula we make the Ansatz x n = ∑nk=0 c k H k (x). In order to determine the coefficient c m , we use the orthogonality relation for m ⩽ n n

⟨x n , H m ⟩L2 (ν1 ) = ∑ c k ⟨H k , H m ⟩L2 (ν1 ) = c m m!. k=0

On the other hand,

⟨x n , H m ⟩L2 (ν1 ) = ⟨x n , δ m 1⟩L2 (ν1 ) = ⟨∂ m x n , 1⟩L2 (ν1 ) =

n! ⟨x n−m , 1⟩L2 (ν1 ) . (n − m)!

Since ⟨x n−m , 1⟩L2 (ν1 ) is the (n − m)th moment of a standard normal random variable B1 , it is zero for odd n − m. Thus, set 2k = n − m and use § 2.3 to see that 𝔼 B2k 1 = (2k − 1)!!. Using m = n − 2k we get c n−2k =

e) follows from H n+1

20.12.a)

=

n! (2k − 1)!! n = ( )(2k − 1)!!. (n − 2k)! (2k)! 2k (20.6)

δH n = xH n − ∂H n

20.12.b)

=

xH n − nH n−1 .

f) is Lemma 20.12.b); g) follows from (20.6) and Lemma 20.12.d): −∂2 H n + x∂H n = δ∂H n = nH n , and h) is a straightforward calculation.

20.3 The Wiener–Itô theorem | 371

i) As in the proof of d) we make the Ansatz H m H n = ∑m∧n k=0 c m+n−2k H m+n−2k and assume that m ⩽ n. Using the orthogonality relation for Hermite polynomials, the duality of ∂ and δ, and the Leibniz rule for derivatives, show c m+n−2k (m + n − 2k)! = ⟨H m H n , H m+n−2k ⟩L2 (ν1 ) = ⟨∂ m+n−2k [H m H n ], 1⟩L2 (ν1 ) m+n−2k

=



i=0

(

=δ m+n−2k 1

m + n − 2k )⟨∂ i H n ∂ m+n−2k−i H m , 1⟩L2 (ν1 ) . i

Now we use that ∂ i H n = n!/(n − i)! H n−i and ∂ m+n−2k−i H m = m!/(2k + i − n)! H2k+i−n . Moreover, ⟨H n−i H2k+i−n , 1⟩L2 (ν1 ) = ⟨H n−i , H2k+i−n ⟩L2 (ν1 ) is zero, unless n − i = 2k + i − n, and in this case its value is (n − i)! = k! as i = n − k. Thus, we get c m+n−2k =

=

k! m + n − 2k n! m! ( ) (m + n − 2k)! n−k k! k!

n! m! n m k! = k! ( )( ), (n − k)!(m − k)! k! k! k k

and the asserted formula follows.

One can use the (extended) Hermite polynomials to expand f ∈ L2 (ν t ) in a series ∞

f(x) = ∑ a n n=0

H n (x, t) √n! t n

where

a n = ∫ f(x) ℝ

orthonormal

H n (x, t) ν t (dx). √n! t n

(20.7)

We close this interlude and record the first few extended Hermite polynomials H2 (x, t) = x2 − t,

H0 (x, t) = 1,

H3 (x, t) = x3 − 3tx,

H1 (x, t) = x,

20.3 The Wiener–Itô theorem

Ex. 5.6

H4 (x, t) = x4 − 6tx2 + 3t2 ,

H5 (x, t) = x5 − 10tx3 + 15t2 x.

In order to prove the formula (20.1) we need to know more about the iterated integrals I n (f). A bonus of the representation from Theorem 20.10.d) is the insight that I n (f 𝟙[0,t) ), t ⩾ 0, is an L2 martingale; this allows us to calculate the angle bracket of I n (g) and I m (f). We begin with a preparation (cf. also Theorem 17.15). 20.14 Lemma. Let (B t )t⩾0 be a BM1 and X, Y ∈ L2∞ such that M t := X ∙ B t = ∫0 X s dB s t

and N t := Y ∙ B t = ∫0 Y s dB s are L2 martingales. Then t

M t N t − ⟨M, N⟩t

with

t

⟨M, N⟩t = ⟨X ∙ B, Y ∙ B⟩t = ∫ X s Y s ds 0

(20.8)

372 | 20 Wiener Chaos and iterated Wiener–Itô integrals is a martingale. Moreover, t

t

(20.9)

⟨X ∙ B, Y ∙ B⟩t = ∫ X s d⟨B, Y ∙ B⟩s = ∫ X s Y s ds. 0

0

Proof. We have seen in Theorem 15.15 that M and N, hence M ± N, are L2 martingales. We will use the polarization identity 4ab = (a + b)2 − (a − b)2 for a = M, b = N and a = X, b = Y, and we define ⟨M, N⟩ := 41 (⟨M + N⟩ − ⟨M − N⟩). Note that 4M t N t − 4⟨M, N⟩t = [(M + N)2t − ⟨M + N⟩t ] − [(M − N)2t − ⟨M − N⟩t ]

is the difference of two martingales, hence a martingale. Moreover, def

4⟨M, N⟩t = ⟨M + N⟩t − ⟨M − N⟩t

§ 15.15

t

t

= ∫(X +

Y)2s

0

ds − ∫(X −

t

Y)2s

0

ds = 4 ∫ X s Y s ds. 0

Finally, if we take X t = 𝟙[0,t) , we see that M = B, and conclude that t

(20.8)

⟨B, N⟩t = ⟨M, N⟩t = ∫ Y s ds. 0

This, in turn, yields t

(20.8)

∫ X s d⟨B, N⟩s = ∫ X s Y s ds = ⟨X ∙ B, Y ∙ B⟩t 0

proving (20.9).

We can now show the orthogonality of iterated integrals of different length. 20.15 Theorem. Let f m ∈ L2 (ℝ+m , λ m ), g n ∈ L2 (ℝ+n , λ n ), and denote by δ mn = 𝟙{m=n} the Kronecker delta. Then 𝔼 [I m (f m )I n (g n )] = δ nm n! ⟨̂f n , ĝ n ⟩L2 = δ nm n! ∫ ̂f n (t)ĝ n (t) dt. ℝ+n

(20.10)

Proof. If m = n and f n = g n , the claim follows from Theorem 20.10.c). If m = n but f n ≠ g n , we use polarization to see that 4I n (f n )I n (g n ) = I n2 (f n + g n ) − I n2 (f n − g n ) and 4⟨̂f n , ĝ n ⟩L2 = ‖̂f n + ĝ n ‖2L2 − ‖̂f n − ĝ n ‖2L2 , reducing everything to the first case. Finally, let m ≠ n. Without loss we assume that m < n. Set H(t) := [0, t) × ℝ+n−1 . Since I m (f m 𝟙H(t) ) and I n (g n 𝟙H(t) ) are martingales, we know from Lemma 20.14 that 𝔼 [I n (g n )I m (f m )] = 𝔼 [⟨I n (g n ), I m (f m )⟩∞ ] .

20.3 The Wiener–Itô theorem | 373

Now we can use the formula from Theorem 20.10.d) and combine it with m applications of (20.9) ∙ tn

t n−m+1

⟨I n (g n ), I m (f m )⟩ = ⟨ ∫ ∫⋅ ⋅ ⋅ ∫ Z t n ,...,t n−m+1 (t n−m ) dB t n−m . . . dB t n−1 dB t n , m! ⋅ n! 0 0

0

m integrals

∙ sm

s2

∫ ∫ ⋅ ⋅ ⋅ ∫ ̂f m (s1 , . . . , s m ) dB s1 . . . dB s m−1 dB s m ⟩ 0 0

0

m integrals ∙ tn

t n−m+1

= ∫ ∫⋅ ⋅ ⋅ ∫ ̂f m (t n−m , . . . , t n )Z t n ,...,t n−m+1 (t n−m ) dt n−m . . . dt n−1 dt n , 0 0

0

t n−m

t2

0

0

where Z t n ,...,t n−m+1 (t n−m ) = ∫ ⋅ ⋅ ⋅ ∫ ĝ n (t1 , . . . , t n−m , t n−m+1 , . . . , t n ) dB t1 . . . dB t n−m−1 .

Since Z t n ,...,t n−m+1 (t) is a mean zero L2 martingale we see that 𝔼 [I n (g n )I m (f m )] = 0, finishing the proof. 20.16 Corollary. a) For every f ∈ L2 (ℝ+ , λ) the random variable I1 (f) is a mean zero Gaussian random variable with variance ‖f‖2L2 . b) Let (e k )k⩾1 be an orthonormal basis of L2 (ℝ+ , λ). Then (I1 (e1 ), . . . , I1 (e k )) is a k-dimensional standard normal Gaussian random variable.

Proof. a) If f = ϕ = ∑k c k 𝟙[u k ,v k ) ∈ S(ℝ+ ) with non-overlapping steps, then the claim follows from 𝔼 e iξI1 (ϕ) = 𝔼 e iξ ∑k c k (B(v k )−B(u k )) = ∏ 𝔼 e iξc k (B(v k )−B(u k )) k

= ∏e k

−(v k −u k )(ξc k )2 /2

=e

− ∑k ξ 2 c2k (v k −u k )/2

= e−ξ

2

‖ϕ‖22 /2 L

.

Approximating f ∈ L2 (ℝ+ ) by simple functions ϕ n ∈ S(ℝ+ ), we obtain the equality 2 2 𝔼 e iξI1 (f) = e−ξ ‖f‖L2 /2 , and the assertion follows.

b) Theorem 20.15 and the orthogonality of the random variables I1 (e k ) show that the covariance matrix of the vector (I1 (e1 ), . . . , I1 (e k )) is the k × k unit matrix. It is, therefore, enough to show that for any (ξ1 , . . . , ξ k ) ∈ ℝk the random variable ∑kj=1 ξ j I1 (e j ) = I1 (ξ1 e1 + ⋅ ⋅ ⋅ + ξ k e k ) is a one-dimensional Gaussian random variable. This, however, is clear from the first part. 20.17 Theorem. Let f ∈ L2 (ℝ+ , λ) and define the tensor product f ⊗n (t1 , . . . , t n ) = f(t1 ) ⋅ ⋅ ⋅ f(t n ) for any n ⩾ 1. Then f ⊗n ∈ L2 (ℝ+n , λ n ) and I n (f ⊗n ) = H n (I1 (f), ‖f‖2L2 ).

(20.11)

374 | 20 Wiener Chaos and iterated Wiener–Itô integrals Proof. We will often need to restrict f to the interval [0, t). To ease notation, we write f t (s) := f(s)𝟙[0,t) (s). We will prove (20.11) by induction. If n = 1, there is nothing to show. Assume that the claim holds for some n ⩾ 1. Since f ⊗(n+1) is symmetric, we have (using physicist’s notation, i.e. keeping the integrator next to the integral sign) I n+1 (f

⊗(n+1)



tn

t2

) = (n + 1)! ∫ dB t n f(t n ) ∫ dB t n−1 f(t n−1 )⋅ ⋅ ⋅ ∫ dB t1 f(t1 ) 0

0

0



= (n + 1) ∫ dB t f(t) I n (f t⊗n ) 0

=

ind’ass

=

⊗n 1 n! I n (f t n ) ∞

(n + 1) ∫ dB t f(t) H n (I1 (f t ), ‖f t ‖2L2 ). 0

We consider now the right-hand side of (20.11). Applying Itô’s formula to the function H n+1 (x, r) and the bivariate Itô process (I1 (f t ), ‖f t ‖2L2 ) yields dH n+1 (I1 (f t ), ‖f t ‖2L2 )

1 2 ∂ H n+1 (. . . ) d⟨I1 (f t )⟩ 2 x 1 = ∂ x H n+1 (. . . ) f(t) dB t + ∂ r H n+1 (. . . ) f(t)2 dt + ∂2x H n+1 (. . . ) f(t)2 dt. 2 = ∂ x H n+1 (. . . ) dI1 (f t ) + ∂ r H n+1 (. . . ) d‖f t ‖2L2 +

Here we use that d(I1 (f t ), ‖f t ‖2L2 ) = d (∫0 f(s) dB s , ∫0 f(s)2 ds) = (f(t) dB t , f(t)2 dt) and t

t

that all brackets involving F t = ‖f t ‖2L2 vanish, cf. Remark 18.9.a) and the end of the proof of Theorem 18.8. Now we use ∂ t H n+1 (x, t) = − 12 ∂2x H n+1 (x, t) from Theorem 20.13.h) to see that dH n+1 (I1 (f t ), ‖f t ‖2L2 ) = ∂ x H n+1 (I1 (f t ), ‖f t ‖2L2 ) f(t) dB t .

Since ∂ x H n+1 (x, t) = (n+1)H n (x, t), we finally get I n+1 (f ⊗(n+1) ) = H n+1 (I1 (f), ‖f‖2L2 ).

In order to see that the integrals I n (f), f ∈ L2 (ℝ+n ), n ⩾ 0, span the whole space L2 (ℙ), we need a further preparation. Recall that span(e k , k ⩾ 1) denotes the finite linear combinations of the family (e k )k⩾1 . 20.18 Lemma. Let (B t )t⩾0 be a BM1 and (e k )k⩾1 an orthonormal basis of L2 (ℝ+ , λ). def

B = σ(B , t ⩾ 0) = σ(I (e ), k ⩾ 1). a) F∞ t 1 k b) The family {exp(I1 (f)), f ∈ span(e k , k ⩾ 1)} is a total set in L2 (Ω, F∞ , ℙ) where F∞ B

B or its completion F , i.e. for any Z ∈ L 2 (F ) we have is either F∞ ∞ ∞

𝔼 [Z exp(I1 (f))] = 0 ∞

for all f ∈ span(e k , k ⩾ 1) 󳨐⇒ Z = 0.

B ⊃ σ(I (e ), k ⩾ 1) is clear. Proof. a) Since I1 (e k ) = ∫0 e k (s) dB s , the inclusion F∞ 1 k ∞ For the converse, we expand 𝟙[0,t) (s) = ∑k=1 c k (t)e k (s) from which we conclude that B B t = ∫ 𝟙[0,t) (s) dB s = ∑∞ k=1 c k (t)I 1 (e k ). Thus, F∞ ⊂ σ(I 1 (e k ), k ⩾ 1).

20.3 The Wiener–Itô theorem | 375

b) Since I1 (e k ) is a Gaussian random variable with mean 0 and variance ‖e k ‖2L2 = 1, the random variable exp(I1 (e k )) has all moments, cf. Corollary 2.2. Fix Z ∈ L2 (F∞ ) such that 𝔼 [Z exp(I1 (f))] = 0 holds for all f ∈ span(e k , k ⩾ 1). Define G0 := {0, Ω} and Gk := σ(I1 (e1 ), . . . , I1 (e k )), and observe that, by Part a), B . Using the tower property and the symmetry of conditional expectation, we G∞ = F∞ see for all ξ1 , . . . , ξ k ∈ ℝ and f = iξ1 e1 + ⋅ ⋅ ⋅ + iξ k e k ∈ L2 (ℝ+k ) that 0 = 𝔼 (Z exp[I1 (iξ1 e1 + ⋅ ⋅ ⋅ + iξ k e k )]) = 𝔼 (Z 𝔼 (exp[iξ1 I1 (e1 ) + ⋅ ⋅ ⋅ + iξ k I1 (e k )] | Gk ))

= 𝔼 (𝔼(Z | Gk ) exp[iξ1 I1 (e1 ) + ⋅ ⋅ ⋅ + iξ k I1 (e k )]) .

Moreover, 𝔼(Z | Gk ) = f k (I1 (e1 ), . . . , I1 (e k )) for a suitable measurable function f k . Set x = (x1 , . . . , x k ), ξ = (ξ1 , . . . , ξ k ), and observe that the above equality reads 2

∫ f k (x)e iξ ⋅x e−|x|

ℝk

/2

dx = 0.

2

Because of the uniqueness of the characteristic function, we see that f k (x)e−|x| and f k (x) = 0 Lebesgue a.e. Now we can use the martingale convergence theorem to conclude that

/2

=0

a.s.

Z = 𝔼(Z | G∞ ) = lim 𝔼(Z | Gk ) = lim f k (I1 (e1 ), . . . , I1 (e k )) = 0. k→∞

k→∞

=0

B , ℙ) 20.19 Theorem (Wiener 1938; Itô 1951). Let (B t )t⩾0 be a BM1 . Every F ∈ L2 (Ω, F∞ has a unique orthogonal expansion of the form ∞

F = 𝔼 F + ∑ I n (f n )

(20.12)

n=1

where f n ∈ L2 (ℝ+n , λ n ). The expansion (20.12) determines the functions ̂f n (but not necessarily the functions f n ) uniquely, and ∞

‖F‖2L2 = (𝔼 F)2 + ∑ n! ‖̂f n ‖2L2 .

(20.13)

n=1

Another way to state Theorem 20.19 is to write



B L2 (Ω, F∞ , ℙ) = ⨁ Kn

(20.14)

n=0

where K0 = ℝ and Kn = {I n (f) : f ∈ L2 (ℝ+n , λ n )}, n ⩾ 1, are orthogonal subspaces.

Proof of Theorem 20.19. Theorem 20.15 and the fact that 𝔼 I n (f n ) = 0 show that (20.12) is an orthogonal expansion. To finish the proof, we have to show that the system ⋃n⩾0 Kn consisting of all constants and all iterated integrals is dense.

376 | 20 Wiener Chaos and iterated Wiener–Itô integrals Let g ∈ L2 (ℝ+ ). We have seen in Theorem 20.17 that I n (g ⊗n ) = H n (I1 (g), ‖g‖2L2 ). Using the fact that we can expand the monomials x n in terms of Hermite polynomials, see Theorem 20.13.d), we get with suitable coefficients c k n

n

k=0

k=0

I1 (g)n = ∑ c k H k (I1 (g), ‖g‖2L2 ) = ∑ c k I k (g ⊗k ).

Thus, if Z ∈ L2 (ℙ) is orthogonal³ to all I n (f n ), f n ∈ L2 (ℝ+n ), n ⩾ 0, then Z is orthogonal to all I1 (g)n , n ⩾ 0, g ∈ L2 (ℝ+ ). Using the exponential series, this means that Z is orthogonal to all exp(I1 (g)), g ∈ L2 (ℝ+ ), hence Z = 0, cf. Lemma 20.18. This shows that ⋃n⩾0 Kn is dense in L2 (ℙ). In order to see the uniqueness of the functions ̂f n appearing in (20.12) – recall that I n (f n ) = I n (̂f n ) – we assume that ∞



̂ n ). F = 𝔼 F + ∑ I n (̂f n ) = 𝔼 G + ∑ I n (h n=1

n=1

Because of the uniqueness of the orthogonal expansion, we know that 𝔼 F = 𝔼 G and ̂ n ) = I n (h n ). Thus, I n (f n ) = I n (̂f n ) = I n (h ̂ n )|2 ) = n!‖̂f n − h ̂ n ‖22 0 = 𝔼 (|I n (̂f n ) − I n (h L

̂ n , i.e. the symmetrization of the coefficient f n is uniquely determand we see that ̂f n = h ined. Finally, (20.13) follows from the orthogonality of the terms under the sum and Theorem 20.10.c). It is not always easy to find a concrete Wiener–Itô decomposition of a given functional B , ℙ). There is a particularly simple situation: F ∈ L2 (Ω, F∞

20.20 Example. Assume that F = ϕ(I1 (f)) for some f ∈ L2 (ℝ+ ) with ‖f‖L2 = 1 and ϕ ∈ L2 (ν1 ) such that we know the Hermite decomposition of ϕ, i.e. ∞





ϕ(x) = ∑ c k H k (x) 󳨐⇒ F = ϕ(I1 (f)) = ∑ c k H k (I1 (f)) = c0 + ∑ c k I k (f ⊗k ). k=0

k=0

k=1

Similarly, if we happen to know the extended Hermite series, e.g. for the generating function (Theorem 20.13.c)), we get for any g ∈ L2 (ℝ+ ) e rI1 (g)−r

Ex. 20.7

2

‖g‖22 /2 L



= ∑

k=0

∞ k rk r H k (I1 (g), ‖g‖2L2 ) = 1 + ∑ I k (g ⊗k ). k! k! k=1

(20.15)

It is possible to generalize Example 20.20 to functionals of the form ϕ(I1 (f1 ), . . . , I1 (f n )). Then we need the Hermite orthogonal basis for L2 (ν⊗n 1 ) – with a little abstract functional analysis or direct calculations one can see that this is given by ∏Nk=1 H n k (x k ), 3 Recall that X, Y ∈ L2 (ℙ) are orthogonal, if 𝔼(XY) = 0.

20.4 Calculating Wiener–Itô decompositions | 377

N ⩾ 1, (n k )k⩾1 ⊂ ℕ0 – and the following corollary to Theorem 20.17. Recall that f ⊗ g(s, t) = f(s)g(t) is the tensor product of two functions.

20.21 Corollary. Let f1 , . . . , f k ∈ L2 (ℝ+ , λ) be orthogonal, n1 , . . . , n k ∈ ℕ, and set n := n1 + ⋅ ⋅ ⋅ + n k . Then ⊗n1

I n (f1

⊗n k

⊗ ⋅ ⋅ ⋅ ⊗ fk

k

) = ∏ H n i (I1 (f i ), ‖f i ‖2L2 ). i=1

Proof. If k = 1, this is Theorem 20.17. The following trick allows us to use Theorem 20.17 to prove the more general formula. From Example 20.20 we know k

exp [ ∑ ξ i I1 (f i ) − i=1

k 1 k 2 1 ∑ ξ i ‖f i ‖2L2 ] = ∏ exp [ξ i I1 (f i ) − ξ i2 ‖f i ‖2L2 ] 2 i=1 2 i=1 k

L2 (ℝ

On the other hand, f = is in again (20.15) from Example 20.20 to see k

exp [ ∑ ξ i I1 (f i ) − i=1

∑ki=1 ξ i f i



=∏ ∑

+ ),

i=1 n i =0

n

ξi i H n i (I1 (f i ), ‖f i ‖2L2 ). ni !

and we can use the linearity of I1 (⋅) and

∞ 1 1 1 k 2 I m (f ⊗m ). ∑ ξ i ‖f i ‖2L2 ] = exp [I1 (f) − ‖f‖2L2 ] = ∑ 2 i=1 2 m! m=0

Because of the structure of f we can evaluate the iterated integral with the multinomial formula, k

⊗m

I m (f ⊗m ) = I m (( ∑ ξ i f i ) i=1

)=



n1 +⋅⋅⋅+n k =m

m! ⊗n ⊗n ξ n1 ⋅ ⋅ ⋅ ξ n k I m (f1 1 ⊗ ⋅ ⋅ ⋅ ⊗ f k k ), n1 ! ⋅ ⋅ ⋅ n k !

and all that remains to be done is to compare the coefficients in these expansions.

20.4 Calculating Wiener–Itô decompositions The Wiener–Itô decomposition is just the starting point for many developments, for example a very elegant approach to Malliavin’s calculus via Gaussian spaces, see Nualart [191] and Nualart & Nualart [192]. Here we develop only the very first steps which will allow us to find Wiener–Itô decompositions. The idea is to define a variational derivative on the Wiener space C(o) (ℝ+ ). To do ∞ so, we interpret I1 (f) = ∫0 f(t) dB t = f ∙ B (we should write f ∙ B∞ , but we suppress t

the index ∞) as a linear functional of B and we perturb B t with G t := ∫0 g(s) ds, where g ∈ L2 (ℝ+ ). In fact, t 󳨃→ G t is in the Cameron–Martin space H1 from Chapter 13.1. The Gâteaux derivative D g f ∙ B := D g (f ∙ B) is given by D g f ∙ B = lim

ϵ→0





f ∙ (B + ϵG) − f ∙ B = f ∙ G = ∫ f(s) dG s = ∫ f(s)g(s) ds = ⟨f, g⟩L2 . ϵ 0

0

378 | 20 Wiener Chaos and iterated Wiener–Itô integrals Since g 󳨃→ D g f ∙ B is a linear functional in L2 (ℝ+ ), we can identify it with ⟨f, ⋅⟩L2 and write D t f ∙ B = f(t). This explains the idea behind the following definition.

n 2 20.22 Definition (Malliavin 1976/1978). Let ϕ ∈ C∞ b (ℝ , ℝ), f 1 , . . . , f n ∈ L (ℝ+ , λ), 1 and set F = ϕ(f1 ∙ B, . . . , f n ∙ B) where B = (B t )t⩾0 is a BM . This is a n-dimensional smooth cylindrical function. The Malliavin derivative DF of F is the following random variable taking values in L2 (ℝ+ , λ): n

t 󳨃→ D t F = ∑

i=1

∂ϕ (f1 ∙ B, . . . , f n ∙ B)f i (t). ∂x i

(20.16)

Note that f i (t) = D t f i ∙ B, i.e. the definition is made in such a way that D satisfies the chain rule. The Malliavin derivative is a linear but unbounded operator from the cylindrical functions (of any dimension) P ⊂ L2 (ℙ) into L2 (ℙ) ⊗ L2 (ℝ+ ) and, with some extra work, it is possible to extend it to a suitable closure of P ⊂ L2 (ℙ), see e.g. [192, Chapter 3.2]. 20.23 Example. The following Malliavin derivatives are easy to calculate. Let T ⩾ 0, m ⩾ 0, f ∈ L2 (ℝ+ , λ), then D t B T = D t (𝟙[0,T] ∙ B) = 𝟙[0,T] (t),

D t (f ∙ B) = f(t),

D t (f ∙ B)m = m(f ∙ B)m−1 f(t),

D t H m (f ∙ B) = (∂H m )(f ∙ B)f(t) = mH m−1 (f ∙ B)f(t).

(we interpret “0 ⋅ {anything undefined}” as “0”). The formula for the Hermite polynomials can be recast in terms of iterated integrals as D t I m (f ⊗m ) = mI m−1 (f ⊗(m−1) )f(t) = mI m−1 (f ⊗m (t, . . .)).

m-1 variables

Since DF ∈ L2 (ℙ) ⊗ L2 (ℝ+ ), we can iterate Gâteaux derivatives and define the k-fold Malliavin derivative of a cylindrical function F = ϕ(f1 ∙ B, . . . , f n ∙ B): D kt1 ,...,t k F :=

n



i1 ,...,i k =1

∂k ϕ (f1 ∙ B, . . . , f n ∙ B)f i1 (t1 ) ⋅ ⋅ ⋅ f i k (t k ), ∂x i1 . . . , ∂x i k

(20.17)

i.e. D kt1 ,...,t k F = D t1 . . . D t k F and D∙k F ∈ L2 (ℙ) ⊗ L2 (ℝ+ )⊗k . For example, we have in this way for f ∈ L2 (ℝ+ ) and m ⩾ k D kt1 ,...,t k I m (f ⊗m ) =

m! I m−k (f ⊗(m−k) )f ⊗k (t1 , . . . , t k ). (m − k)!

(20.18)

Let us quickly prove a lemma which allows us to switch from results for f1 ⊗ ⋅ ⋅ ⋅ ⊗ f n , f i ∈ L2 (ℝ+ ) to f ∈ L2 (ℝ+n ) by linearity and approximation.

20.24 Lemma. The linear span of the family Σ := {f1 ⊗ ⋅ ⋅ ⋅ ⊗ f n : f i ∈ L2 (ℝ+ , λ)} is dense in L2 (ℝ+n , λ n ). This remains true if we replace L2 (ℝ+ , λ) and L2 (ℝ+n , λ n ) by L2 (ℝ, ν1 ) and L2 (ℝn , ν⊗n 1 ).

20.4 Calculating Wiener–Itô decompositions | 379 2

Proof. Let g(x) := e−x /2 𝟙[0,∞) (x). The modified monomials g(x1 )x11 ⋅ ⋅ ⋅ g(x n )x nn , k1 , . . . , k n ⩾ 0, are contained in Σ, hence in L2 (ℝ+ ). Therefore, the claim follows if we can show that the modified monomials are a total family in L2 (ℝ+n ). Let u ∈ L2 (ℝ+n ) be such that ∫⋅⋅⋅∫

k

x11 ⋅ ⋅ ⋅ x nn ⋅ g(x1 ) ⋅ ⋅ ⋅ g(x n ) ⋅ u(x1 , . . . , x n ) dx1 . . . dx n = 0. k

k

ℝn

k

We multiply this expression with (iξ1 )k1 /(k1 !) ⋅ ⋅ ⋅ (iξ n )k n /(k n !), sum over k1 , . . . , k n ⩾ 0, and get with x = (x1 , . . . , x n ) and ξ = (ξ1 , . . . , ξ n ) 2

∫ e ix⋅ξ e−|x|

ℝn

/2

𝟙ℝ+n (x)u(x) dx = 0. 2

Thus, the characteristic function of u(x)e−|x| /2 𝟙ℝ+n (x) is zero. Because of the unique2 ness of the characteristic function, u(x)e−|x| /2 , hence u(x), is zero Lebesgue a.e. on n ℝ+ , i.e. Σ is total. The proof for L2 (ν1 ) and L2 (ν⊗n 1 ) is even easier, since we do not have to modify the monomials. From now on we will only sketch the arguments and omit some technical (mostly convergence-related) details. By approximation we extend (20.18) to f m ∈ L2 (ℝ+m ): D kt1 ,...,t k I m (f m ) =

m! I m−k (̂f m (t1 , . . . , t k , . . .)); (m − k)!

(20.19)

(m−k) variables

although I m (̂f m ) = I m (f m ), we have to use the symmetrization on the right-hand side of (20.19), in order to be able to take the first k variables as fixed variables. We want to calculate the coefficients of the expansion F = 𝔼 F + ∑n⩾1 I n (f n ). Applying (20.19) on both sides of this expansion and changing (a bit gentlemanlike) summation and differentiation, we get ∞

n! I n−k (̂f n (t1 , . . . , t k , . . .)). (n − k)! n=k+1

D kt1 ,...,t k F = k! ̂f k (t1 , . . . , t k ) + ∑

(n−k) variables

Now we take expectations, again swapping summation and expected values. Since I n−k (⋅) is a mean zero martingale, we get ̂f k (t1 , . . . , f k ) = 1 𝔼 D k t1 ,...,t k F. k!

Altogether we have “proved” the so-called Taylor–Stroock formula. B , ℙ) and assume that F has Malliavin 20.25 Theorem (Stroock 1987). Let F ∈ L2 (Ω, F∞ k derivatives of any order such that 𝔼 |D F| < ∞. Then the Wiener–Itô decomposition of F is given by the formula ∞



F = ∑ I n (f n ) = ∑ I n (̂f n ) n=0

n=0

1 where ̂f n (t1 , . . . , t n ) = 𝔼 (D nt1 ,...,t n F) . n!

380 | 20 Wiener Chaos and iterated Wiener–Itô integrals We close this chapter with an example that illustrates the use of Theorem 20.25.

Ex. 20.11

20.26 Example. We want to find the Wiener–Itô decomposition for the random variable F = exp[−B2T ] where T > 0 is fixed. Clearly, F ∈ L2 (ℙ). For the following calculations we will use a moment formula which we leave as an exercise: 2

−B T 𝔼 [B2n ]= T e

n (2n − 1)!! T ) . ( √2T + 1 2T + 1

(20.20)

The first term of the expansion is the expected value: 2

2

𝔼 F = 𝔼 e−B T = 𝔼 e−TB1 = 2

1 . √2T + 1

Observe that F = ϕ(I1 (𝟙[0,T)) ) for ϕ(x) = e−x and B T = I1 (𝟙[0,T) ). Thus, using Theorem 20.26, Dt F =

2 d −x2 󵄨󵄨󵄨 1 e 󵄨󵄨 𝔼[D t F] = 0. 𝟙 (t) = −2B T e−B T 𝟙[0,T) (t) 󳨐⇒ ̂f1 (t) = 󵄨x=B T [0,T) dx 1!

Similarly, we find

d2 −x2 󵄨󵄨󵄨 2 −B2T ⊗2 e 󵄨󵄨 𝟙⊗2 𝟙[0,T) (s, t) [0,T) (s, t) = 2(2B T − 1)e 2 󵄨 x=B dx T 2 1 1 𝟙⊗2 . 𝔼[2(2B2T − 1)e−B T ]𝟙⊗2 󳨐⇒ ̂f2 = [0,T) = − 2! (2T + 1)3/2 [0,T)

D2s,t F =

In the same way we get

2 1 𝔼[D3s,t,u F]𝟙⊗3 D3s,t,u F = 4(3B T − 2B3T )e−B T 󳨐⇒ ̂f3 = [0,T) = 0 3! 1 12 D4s,t,u,v F = 4(2 − 12B2T + 4B4T )e2 −B2T 󳨐⇒ ̂f4 = 𝟙⊗4 , 4! (2T + 1)5/2 [0,T)

which yields

exp [−B2T ] =

1 1 1 − I2 (𝟙⊗2 I4 (𝟙⊗4 [0,T) ) + [0,T) ) + 0 + . . . 1/2 3/2 (2T + 1) (2T + 1) 2(2T + 1)5/2

20.27 Further reading. The Wiener–Itô decomposition is the starting point for many further developments in the analysis in Wiener and abstract Wiener spaces, in particular the Malliavin calculus. Very readable introductions are the books by Nualart and Nualart [191; 192]. Still one of the best introductions to (abstract) Wiener space is Kuo’s lecture note [156]. An approach to Wiener chaos decompositions in the spirit of the present chapter for processes beyond Brownian motion, including many jump processes, can be found in [51]. Di Tella, Engelbert: The chaotic representation property. . . [51] [156] Kuo: Gaussian Measures in Banach Spaces. [191] Nualart: The Malliavin Calculus and Related Topics. [192] Nualart, Nualart: Introduction to Malliavin Calculus.

Problems

| 381

Problems 1.

Show that the definition of the double Itô integral for off-diagonal simple functions (Definition 20.4) is independent of the representation of the simple function.

2.

Repeat the calculation from the end of Example 20.5 for a general measure μ. What happens on the diagonal?

3.

Show that the definition of the iterated Itô integral for f ∈ L2 (ℝ2+ ) (Definition 20.9) is independent of the approximating sequence.

4. Show that the Hermite Polynomials are linearly independent. 5. 6.

7.

Let f, g ∈ L2 (ℝ+ ) and assume that f ⊥g, i.e. ∫ℝ f(t)g(t) dt = 0. Show that + I1 (f) ⊥⊥ I1 (g). Let (X, Y) be a mean zero Gaussian random variable. Show that 𝔼(X 2 Y) = 0. Use this observation to show that I 2 (f) − ‖f‖2L2 is and K1 := {I1 (g) : g ∈ L2 (ℝ+ )} are orthogonal. (∞)

(Symmetrized tensor basis) Pick any orthonormal basis (e k )k⩾1 of H. Let α ∈ ℕ0 , i.e. α = (α1 , . . . , α N , 0, 0, . . . ) where N ∈ ℕ and α1 , . . . , α N ∈ ℕ0 , and define ⊗α e α := ⨂k e k k , i.e. ∞

⊗α k

eα = ⨂ ek k=1

= e1 ⊗ ⋅ ⋅ ⋅ ⊗ e1 ⊗ ⋅ ⋅ ⋅ ⊗ e k ⊗ ⋅ ⋅ ⋅ ⊗ e k ⊗ . . . α1 factors

α k factors

e α we denote We agree that means “leave this factor out” and |α| = ∑∞ i=0 α i . By ̂ the symmetrization of e α . a) ⟨e α , e β ⟩L2 (ℝ+n ) = δ α,β if |α| = |β| = n. e⊗0

b) ‖̂e α ‖L2 (ℝ+n ) = (∏∞ 1 α k !) /n!

8. Let F, G be smooth, bounded cylindrical functions. Show that a) D t (FG) = GD t F + FD t G. 9.

m m m−k k b) D m G. t (FG) = ∑k=0 ( k )D t F ⋅ D t

Let f ∈ L2 (ℝ+m ), g ∈ L2 (ℝ+n ) and assume that the functions are symmetric: f = ̂f , g = ĝ . We define the contraction as (f ⊗r g)(t1 , . . . , t m−r , s1 , . . . , s n−r ) := ∫ ⋅ ⋅ ⋅ ∫

ℝ+r

f(t1 , . . . , t m−r , u1 , . . . , u r )g(s1 , . . . , s n−r , u1 , . . . , u r ) du1 . . . du r .

Show the following formula:

n∧m m n I m (f)I n (g) = ∑ r!( )( )I n+m−2r (f ⊗r g). r r r=0

382 | 20 Wiener Chaos and iterated Wiener–Itô integrals Hint. Consider first f = f1⊗m and g = g1⊗n , use Theorem 20.13.i) and the relation between Hermite polynomials and iterated integrals.

10. (Special case of Problem 9) Use Itô’s formula to show that for any two symmetric f, g ∈ L2 (ℝ2+ ) 2 2 2 I2 (f)I2 (g) = ∑ r!( )( )I4−2r (f ⊗r g). r r r=0 2

−B T ] = (2n−1)!! ( T ) . Here, (−1)!! := 1, 11. Prove (20.20), i.e. show that 𝔼 [B2n T e √2T+1 2T+1 (2n − 1)!! = 1 ⋅ 3 ⋅ . . . ⋅ (2n − 1), is the double factorial. n



12. Let F = exp [I1 (g) − 12 ∫0 g2 (s) ds] for some g ∈ L2 (ℝ+ ). Find all derivatives D k F as well as the Wiener–Itô chaos representation.

13. Find the Wiener–Itô chaos decomposition for the following random variables (T > 0 is fixed): a) exp(B T − T/2),

b) B5T ,

1

c) ∫(r3 B3t + 2tB2t ) dB t , 0

1

d) ∫ te B t dB t . 0

14. Find the Wiener–Itô chaos decomposition for the following random variables (T > 0 is fixed): a) sinh(B T ),

b) cosh(B T ).

15. Let D ⊂ L2 (μ) and write Σ := span(D). Show that Σ is dense if, and only if D is total, i.e. Σ = L2 (μ) ⇐⇒ [∀ϕ ∈ D : ⟨u, ϕ⟩L2 (μ) = 0 󳨐⇒ u = 0.]

21 Stochastic differential equations d The ordinary differential equation ds x s = b(s, x s ), x(0) = x0 , describes the position x t of a particle which moves with speed b(s, x) depending on time and on the current position. One possibility to take into account random effects, e.g. caused by measurement errors or hidden parameters, is to add a random perturbation which may depend on the current position. This leads to an equation of the form

X(t + ∆t) − X(t) = b(t, X(t))∆t + σ(t, X(t))(B(t + ∆t) − B(t)).

Letting ∆t → 0 we get the following equation for stochastic differentials, cf. Section 18.1, dX t = b(t, X t ) dt + σ(t, X t ) dB t .

(21.1)

Since (21.1) is a formal way of writing things, we need a precise definition of the solution of the stochastic differential equation (21.1).

21.1 Definition. Let (B_t, F_t)_{t⩾0} be a d-dimensional Brownian motion with admissible filtration (F_t)_{t⩾0}, and let b : [0, ∞) × ℝⁿ → ℝⁿ and σ : [0, ∞) × ℝⁿ → ℝ^{n×d} be measurable functions. A solution of the stochastic differential equation (SDE) (21.1) with initial condition X₀ = ξ is a progressively measurable stochastic process (X_t)_{t⩾0}, X_t = (X_t¹, ..., X_tⁿ) ∈ ℝⁿ, such that the following integral equation holds

    X_t = ξ + ∫₀ᵗ b(s, X_s) ds + ∫₀ᵗ σ(s, X_s) dB_s,   t ⩾ 0,    (21.2)

i.e. for all j = 1, ..., n,

    X_t^j = ξ^j + ∫₀ᵗ b_j(s, X_s) ds + ∑_{k=1}^d ∫₀ᵗ σ_{jk}(s, X_s) dB_s^k,   t ⩾ 0.    (21.3)

In Definition 21.1 we implicitly require that all (stochastic) integrals make sense. Remembering the definition of L²_{T,loc} from Chapter 16, a sufficient condition is that (s, x) ↦ b(s, x), (s, x) ↦ σ(s, x) and s ↦ X_s are continuous. The expectation 𝔼 and the probability measure ℙ are given by the driving Brownian motion; in particular ℙ(B₀ = 0) = 1. Throughout this chapter we use the Euclidean norms |b|² = ∑_{j=1}^n b_j², |x|² = ∑_{k=1}^d x_k² and |σ|² = ∑_{j=1}^n ∑_{k=1}^d σ_{jk}² for vectors b ∈ ℝⁿ, x ∈ ℝ^d and matrices σ ∈ ℝ^{n×d}, respectively. The Cauchy–Schwarz inequality shows that for these norms |σx| ⩽ |σ| ⋅ |x| holds.

21.1 The heuristics of SDEs

We have introduced the SDE (21.1) as a random perturbation of the ODE ẋ_t = b(t, x_t) by a multiplicative noise: σ(t, X_t) dB_t. The following heuristic reasoning explains why multiplicative noise is actually a natural choice for many applications. We learned this argument from Michael Röckner.

Denote by x_t the position of a particle at time t. After ∆t units of time, the particle is at x_{t+∆t}. We assume that the motion can be described by an equation of the form

    ∆x_t = x_{t+∆t} − x_t = f(t, x_t; ∆t)

where f is a function which depends on the initial position x_t and the time interval [t, t + ∆t]. Unknown influences on the movement, e.g. caused by hidden parameters or measurement errors, can be modelled by adding some random noise Γ_{t,∆t}. This leads to

    X_{t+∆t} − X_t = F(t, X_t; ∆t, Γ_{t,∆t})

where X_t is now a random process. Note that F(t, X_t; 0, Γ_{t,0}) = 0. In many applications it makes sense to assume that the random variable Γ_{t,∆t} is independent of X_t, that it has zero mean, and that it depends only on the increment ∆t. If we have no further information on the nature of the noise, we should strive for a random variable which is maximally uncertain. One possibility to measure uncertainty is entropy. We know from information theory that, among all probability densities p(x) with a fixed variance σ², the Gaussian density maximizes the entropy H(p) = −∫_{−∞}^{∞} p(x) log p(x) dx, cf. Ash [4, Theorem 8.3.3]. Therefore, we assume that Γ_{t,∆t} ∼ N(0, σ²(∆t)). By the same argument, the random variables Γ_{t,∆t} and Γ_{t+∆t,∆s} are independent and Γ_{t,∆t+∆s} ∼ Γ_{t,∆t} + Γ_{t+∆t,∆s}. Thus, σ²(∆t + ∆s) = σ²(∆t) + σ²(∆s), and if σ²(⋅) is continuous, we see that σ²(⋅) is a linear function. Without loss of generality we can assume that σ²(t) = t; therefore

    Γ_{t,∆t} ∼ ∆B_t := B_{t+∆t} − B_t

where (B_t)_{t⩾0} is a one-dimensional Brownian motion, and we can write

    X_{t+∆t} − X_t = F(t, X_t; ∆t, ∆B_t).

If F is sufficiently smooth, a Taylor expansion of F in the last two variables (∂_j denotes the partial derivative w.r.t. the jth variable) yields

    X_{t+∆t} − X_t = ∂₄F(t, X_t; 0, 0) ∆B_t + ∂₃F(t, X_t; 0, 0) ∆t
                     + ½ ∂₄²F(t, X_t; 0, 0)(∆B_t)² + ½ ∂₃²F(t, X_t; 0, 0)(∆t)²
                     + ∂₃∂₄F(t, X_t; 0, 0) ∆t ∆B_t + R(∆t, ∆B_t).

The remainder term R is given by

    R(∆t, ∆B_t) = ∑_{0⩽j,k⩽3, j+k=3} (3/(j! k!)) ∫₀¹ (1 − θ)² ∂₃^j ∂₄^k F(t, X_t; θ∆t, θ∆B_t) dθ ⋅ (∆t)^j (∆B_t)^k.

Since ∆B_t ∼ √∆t B₁, we get R = o(∆t) and ∆t ∆B_t = o((∆t)^{3/2}). Neglecting all terms of order o(∆t) and using (∆B_t)² ≈ ∆t, we get for small increments ∆t

    X_{t+∆t} − X_t = (∂₃F(t, X_t; 0, 0) + ½ ∂₄²F(t, X_t; 0, 0)) ∆t + ∂₄F(t, X_t; 0, 0) ∆B_t
                     =: b(t, X_t) ∆t + σ(t, X_t) ∆B_t.

Letting ∆t → 0, we end up with (21.1).
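The difference equation X(t + ∆t) − X(t) = b(t, X(t))∆t + σ(t, X(t))(B(t + ∆t) − B(t)) is exactly the Euler–Maruyama discretization of (21.1). The following sketch (plain Python with NumPy; the concrete coefficient functions, step size and seed are illustrative choices, not taken from the text) shows how one would simulate an approximate solution path on a grid.

```python
import numpy as np

def euler_maruyama(b, sigma, x0, T=1.0, n_steps=1000, rng=None):
    """One approximate path of dX_t = b(t,X_t) dt + sigma(t,X_t) dB_t on [0,T]
    via the Euler-Maruyama scheme X_{k+1} = X_k + b dt + sigma * dB."""
    rng = np.random.default_rng() if rng is None else rng
    dt = T / n_steps
    t = np.linspace(0.0, T, n_steps + 1)
    x = np.empty(n_steps + 1)
    x[0] = x0
    for k in range(n_steps):
        dB = rng.normal(0.0, np.sqrt(dt))          # B(t+dt) - B(t) ~ N(0, dt)
        x[k + 1] = x[k] + b(t[k], x[k]) * dt + sigma(t[k], x[k]) * dB
    return t, x

# illustrative coefficients (not from the text): a mean-reverting drift, constant noise
t, x = euler_maruyama(b=lambda t, x: -0.5 * x, sigma=lambda t, x: 1.0, x0=1.0)
```

This is only a simulation device; the precise meaning of a solution is the integral equation (21.2).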

21.2 Some examples

Before we study general existence and uniqueness results for the SDE (21.1), we will review a few examples where we can find the solution explicitly. Of course, we must make sure that all integrals in (21.2) exist, i.e.

    ∫₀ᵀ |σ(s, X_s)|² ds   and   ∫₀ᵀ |b(s, X_s)| ds

should have an expected value or be at least almost surely finite, cf. Chapter 16. We will assume this throughout this section. In order to keep things simple we discuss only the one-dimensional situation d = n = 1.

21.2 Example. Let (B_t)_{t⩾0} be a BM¹. We consider the SDE

    X_t − x = ∫₀ᵗ b(s) ds + ∫₀ᵗ σ(s) dB_s    (21.4)

with deterministic coefficients b, σ : [0, ∞) → ℝ. It is clear that X_t is a normal random variable and that (X_t)_{t⩾0} inherits the independent increments property from the driving Brownian motion (B_t)_{t⩾0}. Therefore, it is enough to calculate 𝔼 X_t and 𝕍 X_t in order to characterize the solution:

    𝔼(X_t − X₀) = ∫₀ᵗ b(s) ds + 𝔼[∫₀ᵗ σ(s) dB_s] = ∫₀ᵗ b(s) ds   (by 15.15.a)

and

    𝕍(X_t − X₀) = 𝔼[(X_t − X₀ − ∫₀ᵗ b(s) ds)²] = 𝔼[(∫₀ᵗ σ(s) dB_s)²] = ∫₀ᵗ σ²(s) ds   (by (15.22)).
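For deterministic coefficients the law of X_t is thus completely determined by these two formulas. A small Monte Carlo check (a sketch; the concrete choices b(s) = sin s and σ(s) = 1 + s are only for illustration) compares the empirical mean and variance of the left-endpoint Riemann/Itô sums with ∫₀ᵗ b(s) ds and ∫₀ᵗ σ²(s) ds.

```python
import numpy as np

rng = np.random.default_rng(1)
b = lambda s: np.sin(s)          # illustrative deterministic drift
sigma = lambda s: 1.0 + s        # illustrative deterministic volatility

T, n, paths = 1.0, 1000, 20000
dt = T / n
s = np.linspace(0.0, T, n, endpoint=False)          # left endpoints s_0, ..., s_{n-1}
dB = rng.normal(0.0, np.sqrt(dt), size=(paths, n))  # Brownian increments

# X_T - x = sum b(s_k) dt + sum sigma(s_k) dB_k  (Ito sums with left endpoints)
X = (b(s) * dt).sum() + (sigma(s) * dB).sum(axis=1)

print("mean:", X.mean(), "vs", (b(s) * dt).sum())          # ≈ ∫_0^T b(s) ds
print("var :", X.var(),  "vs", (sigma(s) ** 2 * dt).sum()) # ≈ ∫_0^T σ²(s) ds
```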

We will now use the time-dependent Itô formula to solve SDEs.

21.3 An Ansatz. Recall, cf. (18.18) and Theorem 18.11, that for f ∈ C^{1,2} (i.e. ∂_t f, ∂_x f, ∂_x²f exist and are continuous) we have

    df(t, B_t) = ∂_t f(t, B_t) dt + ∂_x f(t, B_t) dB_t + ½ ∂_x² f(t, B_t) dt.

Ex. 21.1

In order to solve the one-dimensional SDE (21.1)

    dX_t = b(t, X_t) dt + σ(t, X_t) dB_t,   X₀ = x,

we make the following Ansatz:

    X_t = f(t, B_t),

apply Itô's formula, and compare coefficients. This leads to the following system of (partial) differential equations

    b(t, f(t, x)) = ∂_t f(t, x) + ½ ∂_x² f(t, x),
    σ(t, f(t, x)) = ∂_x f(t, x).    (21.5)

Ex. 21.3

This will often yield necessary conditions for the solution (hence, uniqueness) and give a hint to the general form of the solution; do not forget to check that the result f(t, B_t) is indeed a solution by using Itô's formula.

21.4 Example (Geometric Brownian motion). A geometric Brownian motion is the solution of the following SDE

    X_t = X₀ + b ∫₀ᵗ X_s ds + σ ∫₀ᵗ X_s dB_s,   X₀ = x₀ ∈ ℝ,

where b ∈ ℝ and σ > 0 are constants. It is given by

    X_t = x₀ exp[(b − ½σ²) t + σB_t],   t ⩾ 0.    (21.6)
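A quick way to make (21.6) plausible is to compare it numerically with the Euler–Maruyama discretization of the SDE along one and the same Brownian path; the parameter values below are illustrative only, and this is of course no substitute for the verification with Itô's formula that follows.

```python
import numpy as np

rng = np.random.default_rng(2)
b, sigma, x0, T, n = 0.1, 0.4, 1.0, 1.0, 10_000
dt = T / n
dB = rng.normal(0.0, np.sqrt(dt), n)
B = np.concatenate(([0.0], np.cumsum(dB)))
t = np.linspace(0.0, T, n + 1)

exact = x0 * np.exp((b - 0.5 * sigma**2) * t + sigma * B)   # formula (21.6)

euler = np.empty(n + 1)                                     # Euler scheme on the same path
euler[0] = x0
for k in range(n):
    euler[k + 1] = euler[k] + b * euler[k] * dt + sigma * euler[k] * dB[k]

print(np.max(np.abs(exact - euler)))   # small for fine grids
```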

The easiest way to verify that (21.6) is indeed a solution to the SDE is to use Itô's formula and to check directly that (21.6) satisfies the SDE. This, however, does not guarantee uniqueness, and this method works only if we already know the solution. Using the Ansatz 21.3 we can show that the solution is indeed unique and of the form (21.6): from (21.5) we get

    b f(t, x) = ∂_t f(t, x) + ½ ∂_x² f(t, x),    (21.7)
    σ f(t, x) = ∂_x f(t, x).    (21.8)

Now (21.8) yields

    ∂_x² f(t, x) = σ ∂_x f(t, x) = σ² f(t, x),

and inserting this into (21.7) we get

    (b − ½σ²) f(t, x) = ∂_t f(t, x)   and   σ f(t, x) = ∂_x f(t, x).

The usual separation method f(t, x) = g(t)h(x) reveals

    g(t) = g(0) e^{(b − ½σ²)t}   and   h(x) = h(0) e^{σx},

and we get (21.6). The calculation clearly shows that the conditions are necessary and one still has to check that the result is indeed a solution. This is left to the reader.

The following example shows that the solution of an SDE need not be of the form f(t, B_t), i.e. the Ansatz 21.3 is not always suitable.

21.5 Example (Ornstein–Uhlenbeck process). An Ornstein–Uhlenbeck process is the solution of Langevin's SDE¹

    X_t = X₀ + b ∫₀ᵗ X_s ds + σ ∫₀ᵗ dB_s,   X₀ = x₀ ∈ ℝ,

where b, σ ∈ ℝ are constants. The solution is given by

    X_t = e^{bt} x₀ + σ e^{bt} ∫₀ᵗ e^{−bs} dB_s,   t ⩾ 0.    (21.9)

Example 21.2 shows that the Ornstein–Uhlenbeck process is a Gaussian process. Using the Ansatz 21.3 naively yields

    b f(t, x) = ∂_t f(t, x) + ½ ∂_x² f(t, x),    (21.10)
    σ = ∂_x f(t, x).    (21.11)

The second equation implies ∂_x² f = ∂_x ∂_t f = 0, and so bf = ∂_t f and σ = ∂_x f. This gives

    bσ = b ∂_x f = ∂_x ∂_t f = 0   (using (21.11) and (21.10))

or bσ = 0. Thus, b = 0 and f(t, x) = σx + const., and we get only the trivial solution. In order to avoid this mistake, we consider first the deterministic equation

    x_t = x₀ + b ∫₀ᵗ x_s ds

which is solved by x_t = x₀ e^{bt} or e^{−bt} x_t = x₀ = const. This indicates that we should use the Ansatz

    Y_t := e^{−bt} X_t = f(t, X_t)   with   f(t, x) = e^{−bt} x.

It is easy to calculate the partial derivatives

    ∂_t f(t, x) = −b f(t, x) = −bx e^{−bt},   ∂_x f(t, x) = e^{−bt},   ∂_x² f(t, x) = 0,

and we find, using the time-dependent Itô formula (18.18),

    Y_t − Y₀ = ∫₀ᵗ ( ∂_t f(s, X_s) + bX_s ∂_x f(s, X_s) + ½ σ² ∂_x² f(s, X_s) ) ds + ∫₀ᵗ σ ∂_x f(s, X_s) dB_s
             = ∫₀ᵗ σ e^{−bs} dB_s,

since the ds-integrand vanishes, and this gives (21.9). Problem 21.5 extends this example to random variables X₀ = ξ.

¹ This shows, incidentally, that an Ornstein–Uhlenbeck process is the continuous-time analogue of an autoregressive time series. This follows if we formally set dX_t = X_{t+dt} − X_t, dB_t = B_{t+dt} − B_t and dt = 1. Then

    X_{t+1} − X_t = bX_t + σ(B_{t+1} − B_t) ⇐⇒ X_{t+1} = (b + 1)X_t + Z_t,   Z_t ∼ N(0, σ²), iid.

This OU process should not be confused with the stationary OU process which we encountered in Problem 2.11.
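As with geometric Brownian motion, formula (21.9) is easy to check against a direct discretization of Langevin's equation. The sketch below (illustrative parameters) approximates the stochastic integral in (21.9) by a left-endpoint Riemann–Itô sum and compares it with the Euler scheme driven by the same increments.

```python
import numpy as np

rng = np.random.default_rng(3)
b, sigma, x0, T, n = -1.5, 0.7, 2.0, 1.0, 10_000
dt = T / n
dB = rng.normal(0.0, np.sqrt(dt), n)
t = np.linspace(0.0, T, n + 1)

# formula (21.9): X_t = e^{bt} x0 + sigma e^{bt} * ∫_0^t e^{-bs} dB_s
ito_sum = np.concatenate(([0.0], np.cumsum(np.exp(-b * t[:-1]) * dB)))
exact = np.exp(b * t) * x0 + sigma * np.exp(b * t) * ito_sum

# Euler-Maruyama for dX = b X dt + sigma dB on the same Brownian increments
euler = np.empty(n + 1)
euler[0] = x0
for k in range(n):
    euler[k + 1] = euler[k] + b * euler[k] * dt + sigma * dB[k]

print(np.max(np.abs(exact - euler)))   # small for fine grids
```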

21.3 The general linear SDE

We will now study linear SDEs in a more systematic way.

21.6 Example (Homogeneous linear SDEs). A homogeneous linear SDE is an SDE of the form

    dX_t = β(t)X_t dt + δ(t)X_t dB_t

with non-random coefficients β, δ : [0, ∞) → ℝ. Let us assume that the initial condition is positive: X₀ > 0. Since the solution has continuous sample paths, it is clear that X_t > 0 for sufficiently small values of t > 0. Therefore it is plausible to set Z_t := log X_t. By Itô's formula (18.13) we get, at least formally,

    dZ_t = (1/X_t) β(t)X_t dt + (1/X_t) δ(t)X_t dB_t − (1/(2X_t²)) δ²(t)X_t² dt
         = (β(t) − ½ δ²(t)) dt + δ(t) dB_t.

Thus,

    X_t = X₀ exp[∫₀ᵗ (β(s) − ½ δ²(s)) ds] exp[∫₀ᵗ δ(s) dB_s].

Now we can verify by a direct calculation that X_t is indeed a solution of the SDE – even if X₀ < 0. If β ≡ 0, we get the formula for the stochastic exponential, see Section 19.1.

21.7 Example (Linear SDEs). A linear SDE is an SDE of the form

    dX_t = (α(t) + β(t)X_t) dt + (γ(t) + δ(t)X_t) dB_t,

with non-random coefficients α, β, γ, δ : [0, ∞) → ℝ. Let X°_t be a random process such that 1/X°_t is the solution of the homogeneous equation (i.e. where α = γ = 0) for the initial value X°₀ = 1. From the explicit formula for 1/X°_t of Example 21.6 it is easily seen that

    dX°_t = (−β(t) + δ²(t))X°_t dt − δ(t)X°_t dB_t

or

    X°_t = exp[− ∫₀ᵗ (β(s) − ½ δ²(s)) ds − ∫₀ᵗ δ(s) dB_s].

Set Z_t := X°_t X_t; using Itô's formula (18.14) for the two-dimensional Itô process (X_t, X°_t), we see

    dZ_t = X_t dX°_t + X°_t dX_t − (γ(t) + δ(t)X_t)δ(t)X°_t dt,

and this becomes, if we insert the formulae for dX_t and dX°_t,

    dZ_t = (α(t) − γ(t)δ(t))X°_t dt + γ(t)X°_t dB_t.

Since we know X°_t from Example 21.6, we can thus get a formula for Z_t, and then for X_t = Z_t/X°_t:

    X_t = exp[∫₀ᵗ (β(s) − ½ δ²(s)) ds + ∫₀ᵗ δ(s) dB_s]
          × { X₀ + ∫₀ᵗ exp[− ∫₀ˢ (β(r) − ½ δ²(r)) dr − ∫₀ˢ δ(r) dB_r] γ(s) dB_s
                 + ∫₀ᵗ exp[− ∫₀ˢ (β(r) − ½ δ²(r)) dr − ∫₀ˢ δ(r) dB_r] (α(s) − γ(s)δ(s)) ds }.

Ex. 21.4


21.4 Transforming an SDE into a linear SDE

Ex. 18.7

We will now study some transformations of an SDE that help us to reduce a general SDE (21.1) to a linear SDE or an SDE with non-random coefficients.

Assume that (X_t)_{t⩾0} is a solution of the SDE (21.1), and denote by f : [0, ∞) × ℝ → ℝ a function such that
• x ↦ f(t, x) is monotone;
• the derivatives f_t(t, x), f_x(t, x) and f_{xx}(t, x) exist and are continuous;
• g(t, ⋅) := f^{−1}(t, ⋅) exists for each t, i.e. f(t, g(t, x)) = g(t, f(t, x)) = x.

We set Z_t := f(t, X_t) and use Itô's formula (18.13) to find the stochastic differential equation corresponding to the process (Z_t)_{t⩾0}:

    dZ_t = f_t(t, X_t) dt + f_x(t, X_t) dX_t + ½ f_{xx}(t, X_t)σ²(t, X_t) dt
         = [f_t(t, X_t) + f_x(t, X_t)b(t, X_t) + ½ f_{xx}(t, X_t)σ²(t, X_t)] dt + f_x(t, X_t)σ(t, X_t) dB_t
           =: b̃(t, Z_t) dt + σ̃(t, Z_t) dB_t,

where the new coefficients b̃(t, Z_t) and σ̃(t, Z_t) are given by

    b̃(t, Z_t) = f_t(t, X_t) + f_x(t, X_t)b(t, X_t) + ½ f_{xx}(t, X_t)σ²(t, X_t),
    σ̃(t, Z_t) = f_x(t, X_t)σ(t, X_t)

with X_t = g(t, Z_t). We will now try to choose f and g in such a way that the transformed SDE

    dZ_t = b̃(t, Z_t) dt + σ̃(t, Z_t) dB_t    (21.12)

has an explicit solution from which we obtain the solution X_t = g(t, Z_t) of (21.1). There are two obvious choices:

Variance transform/Lamperti transform. We want to achieve σ̃(t, z) ≡ 1. This means that

    f_x(t, x)σ(t, x) ≡ 1 ⇐⇒ f_x(t, x) = 1/σ(t, x) ⇐⇒ f(t, x) = ∫₀ˣ dy/σ(t, y) + c.

This can be done if σ(t, x) > 0 and if the derivatives σ_t and σ_x exist and are continuous.

Drift transform. We want to achieve b̃(t, z) ≡ 0. In this case f(t, x) has to satisfy the partial differential equation

    f_t(t, x) + b(t, x)f_x(t, x) + ½ σ²(t, x)f_{xx}(t, x) = 0.

If b(t, x) = b(x) and σ(t, x) = σ(x), then it is reasonable to look for a solution which also does not depend on t, i.e. f(t, x) = f(x). Thus, we have b_t = σ_t = f_t = 0, and the partial differential equation becomes an ordinary differential equation

    b(x)f′(x) + ½ σ²(x)f″(x) = 0

which has the following explicit solution

    f(x) = c₁ + c₂ ∫₀ˣ exp(− ∫₀ʸ (2b(v)/σ²(v)) dv) dy.

21.8 Lemma. Let (B_t)_{t⩾0} be a BM¹. The SDE dX_t = b(t, X_t) dt + σ(t, X_t) dB_t can be transformed into the form dZ_t = b(t) dt + σ(t) dB_t if, and only if, the coefficients satisfy the condition

    0 = ∂/∂x { σ(t, x) ( σ_t(t, x)/σ²(t, x) − ∂/∂x (b(t, x)/σ(t, x)) + ½ σ_{xx}(t, x) ) }.    (21.13)

Proof. If we use the transformation Z_t = f(t, X_t) from above, it is necessary that

    b(t) = f_t(t, x) + b(t, x)f_x(t, x) + ½ f_{xx}(t, x)σ²(t, x),    (21.14a)
    σ(t) = f_x(t, x)σ(t, x),    (21.14b)

with x = g(t, z) = f^{−1}(t, z). Now we get from (21.14b)

    f_x(t, x) = σ(t)/σ(t, x),

and therefore

    f_{xt}(t, x) = (σ_t(t)σ(t, x) − σ(t)σ_t(t, x))/σ²(t, x)   and   f_{xx}(t, x) = −σ(t)σ_x(t, x)/σ²(t, x).

Differentiate (21.14a) in x,

    0 = f_{tx}(t, x) + ∂/∂x ( b(t, x)f_x(t, x) + ½ σ²(t, x)f_{xx}(t, x) ),

and plug in the formulae for f_x, f_{tx} and f_{xx} to obtain

    σ_t(t)/σ(t) = σ(t, x) ( σ_t(t, x)/σ²(t, x) − ∂/∂x (b(t, x)/σ(t, x)) + ½ σ_{xx}(t, x) ).

If we differentiate this expression in x we finally get (21.13). Since we can, with suitable integration constants, perform all calculations in reversed order, it turns out that (21.13) is a necessary and sufficient condition to transform the SDE.
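Condition (21.13) can be checked mechanically. The following sketch uses SymPy (any computer algebra system would do; the package choice is an assumption, not part of the text) to verify it for the geometric Brownian motion coefficients b(t, x) = bx, σ(t, x) = σx of Example 21.4.

```python
import sympy as sp

t, x, b0, s0 = sp.symbols("t x b0 s0", positive=True)

# coefficients of geometric Brownian motion, cf. Example 21.4
b = b0 * x
sigma = s0 * x

inner = (sp.diff(sigma, t) / sigma**2
         - sp.diff(b / sigma, x)
         + sp.Rational(1, 2) * sp.diff(sigma, x, 2))
condition = sp.diff(sigma * inner, x)      # right-hand side of (21.13)

print(sp.simplify(condition))              # prints 0: the SDE can be transformed
```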


Ex. 21.6

21.9 Example. Assume that b(t, x) = b(x) and σ(t, x) = σ(x) and that condition (21.13) holds, i.e.

    0 = d/dx { σ(x) ( − d/dx (b(x)/σ(x)) + ½ σ″(x) ) }.

Then the transformation

    Z_t = f(t, X_t) = e^{ct} ∫₀^{X_t} dy/σ(y)

with the constant c = σ(x)[½ σ″(x) − d/dx (b(x)/σ(x))] converts dX_t = b(X_t) dt + σ(X_t) dB_t into dZ_t = e^{ct} dB_t. This equation can be solved as in Example 21.2.

21.10 Lemma. Let (B_t)_{t⩾0} be a BM¹. The SDE dX_t = b(X_t) dt + σ(X_t) dB_t with coefficients b, σ : ℝ → ℝ can be transformed into a linear SDE

    dZ_t = (α + βZ_t) dt + (γ + δZ_t) dB_t,   α, β, γ, δ ∈ ℝ,

if, and only if,

    d/dx ( [d/dx (κ′(x)σ(x))] / κ′(x) ) = 0   where   κ(x) = b(x)/σ(x) − ½ σ′(x).    (21.15)

Set d(x) = ∫₀ˣ dy/σ(y) and δ = −[d/dx (κ′(x)σ(x))]/κ′(x). The transformation Z_t = f(X_t) is given by

    f(x) = e^{δ d(x)},   if δ ≠ 0,
    f(x) = γ d(x),       if δ = 0.

Proof. Since the coefficients do not depend on t, it is clear that the transformation Z_t = f(X_t) should not depend on t. Using the general transformation scheme from above we see that the following relations are necessary:

    b(x)f′(x) + ½ σ²(x)f″(x) = α + βf(x),    (21.16a)
    σ(x)f′(x) = γ + δf(x),    (21.16b)

where x = g(z), z = f(x), i.e. g(z) = f^{−1}(z).

Case 1. If δ ≠ 0, we set d(x) = ∫₀ˣ dy/σ(y); then

    f(x) = c e^{δd(x)} − γ/δ

is a solution to the ordinary differential equation (21.16b). Inserting this into (21.16a) yields

    [ δb(x)/σ(x) + ½ σ²(x) ( δ²/σ²(x) − δσ′(x)/σ²(x) ) ] c e^{δd(x)} = βc e^{δd(x)} + α − βγ/δ

which becomes

    [ δb(x)/σ(x) + ½ δ² − (δ/2) σ′(x) − β ] e^{δd(x)} = (αδ − γβ)/(cδ).

Set κ(x) := b(x)/σ(x) − ½ σ′(x). Then this relation reads

    [ δκ(x) + ½ δ² − β ] e^{δd(x)} = (αδ − γβ)/(cδ).

Now we differentiate this equality in order to get rid of the constant c:

    δκ′(x)e^{δd(x)} + [ δκ(x) + ½ δ² − β ] e^{δd(x)} δ/σ(x) = 0

which is the same as

    δκ(x) + ½ δ² − β = −κ′(x)σ(x).

Differentiating this equality again yields (21.15); the formulae for δ and f(x) now follow by integration.

Case 2. If δ = 0 then (21.16b) becomes

    σ(x)f′(x) = γ   and so   f(x) = γ ∫₀ˣ dy/σ(y) + c.

Inserting this into (21.16a) shows

    κ(x)γ = βγd(x) + c′

and if we differentiate this we get σ(x)κ′(x) ≡ β, i.e. d/dx (σ(x)κ′(x)) = 0, which implies (21.15). The sufficiency of (21.15) is now easily verified by a (lengthy) direct calculation with the given transformation.

21.5 Existence and uniqueness of solutions

The main question, of course, is whether an SDE has a solution and, if so, whether the solution is unique. If σ ≡ 0, the SDE becomes an ordinary differential equation, and it is well known that existence and uniqueness results for ordinary differential equations usually require a Lipschitz condition for the coefficients. Therefore, it seems reasonable to require

    ∑_{j=1}^n |b_j(t, x) − b_j(t, y)|² + ∑_{j=1}^n ∑_{k=1}^d |σ_{jk}(t, x) − σ_{jk}(t, y)|² ⩽ L² |x − y|²    (21.17)

for all x, y ∈ ℝⁿ, t ∈ [0, T] and with a (global) Lipschitz constant L = L_T. If we pick y = 0, the global Lipschitz condition implies the following linear growth condition

    ∑_{j=1}^n |b_j(t, x)|² + ∑_{j=1}^n ∑_{k=1}^d |σ_{jk}(t, x)|² ⩽ M² (1 + |x|)²    (21.18)

Ex. 21.11

for all x ∈ ℝⁿ and with some constant M = M_T which can be estimated from below by M² ⩾ 2L² + 2 ∑_{j=1}^n sup_{t⩽T} |b_j(t, 0)|² + 2 ∑_{j=1}^n ∑_{k=1}^d sup_{t⩽T} |σ_{jk}(t, 0)|². We begin with a stability and uniqueness result.

21.11 Theorem (stability). Let (B_t, F_t)_{t⩾0} be a d-dimensional Brownian motion and assume that the coefficients b : [0, ∞) × ℝⁿ → ℝⁿ and σ : [0, ∞) × ℝⁿ → ℝ^{n×d} of the SDE (21.1) satisfy the Lipschitz condition (21.17) for all x, y ∈ ℝⁿ with the global Lipschitz constant L = L_T for T > 0. If (X_t)_{t⩾0} and (Y_t)_{t⩾0} are any two solutions of the SDE (21.1) with F₀ measurable initial conditions X₀ = ξ ∈ L²(ℙ) and Y₀ = η ∈ L²(ℙ), respectively, then

    𝔼 [sup_{t⩽T} |X_t − Y_t|²] ⩽ 3 e^{3L²(T+4)T} 𝔼 [|ξ − η|²].    (21.19)

The expectation 𝔼 is given by the Brownian motion started at B₀ = 0.

Proof. Fix r > 0 and define the stopping times τ Zr := inf {t ⩾ 0 : |Z t | ⩾ r}, Z = X, Y, and τ r := τ Xr ∧ τ Yr . Note that t

t

X t − Y t = (ξ − η) + ∫ (b(s, X s ) − b(s, Y s )) ds + ∫ (σ(s, X s ) − σ(s, Y s )) dB s . 0

0

The elementary estimate (a + b + c)2 ⩽ 3a2 + 3b2 + 3c2 yields 𝔼 [ sup |X t − Y t |2 ] = 𝔼 [sup |X t∧τ r − Y t∧τ r |2 ] t⩽T∧τ r

t⩽T

⩽ 3 𝔼 [|ξ − η|2 ]

󵄨󵄨 t∧τ r 󵄨󵄨2 󵄨 󵄨 + 3 𝔼 [sup 󵄨󵄨󵄨󵄨 ∫ (b(s, X s ) − b(s, Y s )) ds󵄨󵄨󵄨󵄨 ] 󵄨󵄨 t⩽T 󵄨󵄨 0 [ ]

󵄨󵄨 t∧τ r 󵄨󵄨2 󵄨 󵄨 [ + 3 𝔼 sup 󵄨󵄨󵄨󵄨 ∫ (σ(s, X s ) − σ(s, Y s )) dB s 󵄨󵄨󵄨󵄨 ] . 󵄨󵄨 t⩽T 󵄨󵄨 0 [ ]

We will now estimate the three terms appearing on the right-hand side separately, and bring them in a form where we can apply Gronwall’s inequality.


For the first term there is nothing to do. Using the Cauchy-Schwarz inequality and (21.17), we find for the middle term 󵄨󵄨 t∧τ r 󵄨󵄨󵄨2 󵄨 𝔼 [sup 󵄨󵄨󵄨󵄨 ∫ (b(s, X s ) − b(s, Y s )) ds󵄨󵄨󵄨󵄨 ] 󵄨󵄨 t⩽T 󵄨󵄨 0 [ ]

2

T∧τ r

󵄨 󵄨 ⩽ 𝔼 [( ∫ 󵄨󵄨󵄨b(s, X s ) − b(s, Y s )󵄨󵄨󵄨 ds) ] [ 0 ]

(21.20)

T∧τ r

󵄨 󵄨2 ⩽ T 𝔼 [ ∫ 󵄨󵄨󵄨b(s, X s ) − b(s, Y s )󵄨󵄨󵄨 ds] ] [ 0 T

2

⩽ L T ∫ 𝔼 [|X s∧τ r − Y s∧τ r |2 ] ds. 0

For the third term we use the maximal inequality for stochastic integrals, (15.23) in Theorem 15.15.d), and (21.17) and get 󵄨󵄨 t 󵄨󵄨2 󵄨 󵄨 [ 𝔼 sup 󵄨󵄨󵄨󵄨 ∫ (σ(s, X s ) − σ(s, Y s )) dB s 󵄨󵄨󵄨󵄨 ] 󵄨󵄨 t⩽T∧τ r 󵄨󵄨 0 [ ] T

⩽ 4 ∫ 𝔼 [|σ(s ∧ τ r , X s∧τ r ) − σ(s ∧ τ r , Y s∧τ r )|2 ] ds

(21.21)

0

T

⩽ 4L2 ∫ 𝔼 [|X s∧τ r − Y s∧τ r |2 ] ds. 0

This proves

T

𝔼 [ sup |X t − Y t |2 ] ⩽ 3 𝔼 [|ξ − η|2 ] + 3L2 (T + 4) ∫ 𝔼 [|X s∧τ r − Y s∧τ r |2 ] ds t⩽T∧τ r

0

T

2

2

⩽ 3 𝔼 [|ξ − η| ] + 3L (T + 4) ∫ 𝔼 [ sup |X r − Y r |2 ] ds. 0

r⩽s∧τ r

Now Gronwall’s inequality, Theorem A.42, applied to u(T) = 𝔼 [supt⩽T∧τ r |X t − Y t |2 ], and a(s) = 3 𝔼[|ξ − η|2 ], and b(s) = 3L2 (T + 4), yields 𝔼 [ sup |X t − Y t |2 ] ⩽ 3 e3L t⩽T∧τ r

2

(T+4)T

𝔼 [|ξ − η|2 ] .

Since τ r → ∞ as r → ∞, we can use Fatou’s lemma on the left-hand side and get 𝔼 [sup |X t − Y t |2 ] ⩽ lim 𝔼 [ sup |X t − Y t |2 ] ⩽ 3 e3L t⩽T

r→∞

t⩽T∧τ r

2

(T+4)T

𝔼 [|ξ − η|2 ] .
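The stability estimate (21.19) is easy to observe numerically: solutions started from nearby initial values stay close in the mean-square sense. The sketch below (illustrative globally Lipschitz coefficients with L = 1, Euler–Maruyama in place of the exact solutions, both driven by the same noise) estimates 𝔼[sup_{t⩽T} |X_t − Y_t|²] by Monte Carlo and compares it with the — rather crude — bound in (21.19).

```python
import numpy as np

rng = np.random.default_rng(4)
b = lambda t, x: np.sin(x)        # illustrative drift, Lipschitz constant 1
sigma = lambda t, x: np.cos(x)    # illustrative dispersion, Lipschitz constant 1

T, n, paths = 1.0, 500, 5000
dt = T / n
xi, eta = 1.0, 1.01               # nearby deterministic initial conditions

X = np.full(paths, xi)
Y = np.full(paths, eta)
sup_diff2 = (X - Y) ** 2
for k in range(n):
    dB = rng.normal(0.0, np.sqrt(dt), paths)   # same driving increments for X and Y
    t = k * dt
    X = X + b(t, X) * dt + sigma(t, X) * dB
    Y = Y + b(t, Y) * dt + sigma(t, Y) * dB
    sup_diff2 = np.maximum(sup_diff2, (X - Y) ** 2)

L = 1.0
print(sup_diff2.mean(), "vs bound", 3 * np.exp(3 * L**2 * (T + 4) * T) * (xi - eta) ** 2)
```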

21.12 Corollary (uniqueness). Let (B_t, F_t)_{t⩾0} be a BM^d and assume that the coefficients b : [0, ∞) × ℝⁿ → ℝⁿ and σ : [0, ∞) × ℝⁿ → ℝ^{n×d} of the SDE (21.1) satisfy the Lipschitz condition (21.17) for all x, y ∈ ℝⁿ with the global Lipschitz constant L = L_T for T > 0. Any two solutions (X_t)_{t⩾0} and (Y_t)_{t⩾0} of the SDE (21.1) with the same F₀ measurable initial condition X₀ = Y₀ = ξ ∈ L²(ℙ) are indistinguishable.

Proof. The estimate (21.19) shows that ℙ(∀t ∈ [0, n] : X_t = Y_t) = 1 for any n ⩾ 1. Therefore, ℙ(∀t ⩾ 0 : X_t = Y_t) = 1.

In order to show the existence of the solution of the SDE we can use the Picard iteration scheme from the theory of ordinary differential equations.

21.13 Theorem (existence). Let (B_t, F_t)_{t⩾0} be a BM^d and assume that the coefficients b : [0, ∞) × ℝⁿ → ℝⁿ and σ : [0, ∞) × ℝⁿ → ℝ^{n×d} of the SDE (21.1) satisfy the Lipschitz condition (21.17) for all x, y ∈ ℝⁿ with the global Lipschitz constant L = L_T for T > 0 and the linear growth condition (21.18) with the constant M = M_T for T > 0. For every F₀ measurable initial condition X₀ = ξ ∈ L²(ℙ) there exists a unique solution (X_t)_{t⩾0}; this solution satisfies

    𝔼 [sup_{s⩽T} |X_s|²] ⩽ κ_T 𝔼 [(1 + |ξ|)²]   for all T > 0.    (21.22)

Proof. The uniqueness of the solution follows from Corollary 21.12. To prove the existence we use Picard's iteration scheme. Set

    X₀(t) := ξ,   X_{n+1}(t) := ξ + ∫₀ᵗ σ(s, X_n(s)) dB_s + ∫₀ᵗ b(s, X_n(s)) ds.

1o Since (a + b)2 ⩽ 2a2 + 2b2 , we see that

󵄨󵄨 t 󵄨󵄨2 󵄨󵄨 t 󵄨󵄨2 󵄨 󵄨 󵄨 󵄨 |X n+1 (t) − ξ|2 ⩽ 2󵄨󵄨󵄨󵄨 ∫ b(s, X n (s)) ds󵄨󵄨󵄨󵄨 + 2󵄨󵄨󵄨󵄨 ∫ σ(s, X n (s)) dB s 󵄨󵄨󵄨󵄨 . 󵄨󵄨 󵄨󵄨 󵄨󵄨 󵄨󵄨 0 0

With the calculations (21.20) and (21.21) – cf. the proof of Theorem 21.11 – where we set X s = X n (s), τ r = ∞, and omit the Y-terms, and with the linear growth condition (21.18) instead of the Lipschitz condition (21.17), we get T

2

𝔼 [sup |X n+1 (t) − ξ| ] ⩽ 2M (T + 4) ∫ 𝔼 [(1 + |X n (s)|)2 ] ds t⩽T

2

0

⩽ 2M 2 (T + 4)T 𝔼 [sup(1 + |X n (s)|)2 ] . s⩽T


Starting with n = 0, we see recursively that all expectations are finite and that the processes X n+1 , n = 0, 1, . . . , are well-defined. 2o As in Step 1o we see

󵄨󵄨 t 󵄨󵄨2 󵄨 󵄨 |X n+1 (t) − X n (t)| ⩽ 2󵄨󵄨󵄨󵄨 ∫ [b(s, X n (s)) − b(s, X n−1 (s))] ds󵄨󵄨󵄨󵄨 󵄨󵄨 󵄨󵄨 0 2

󵄨󵄨 t 󵄨󵄨󵄨2 󵄨 + 2󵄨󵄨󵄨󵄨 ∫ [σ(s, X n (s)) − σ(s, X n−1 (s))] dB s 󵄨󵄨󵄨󵄨 . 󵄨󵄨 󵄨󵄨 0

Using again (21.20) and (21.21) with τ r = ∞, X = X n , and Y = X n−1 we get T

2

2

𝔼 [sup |X n+1 (t) − X n (t)| ] ⩽ 2L (T + 4) ∫ 𝔼 [|X n (s) − X n−1 (s)|2 ] ds. t⩽T

0

Set ϕ n (T) := 𝔼 [supt⩽T |X n+1 (t) − X n (t)|2 ] and c T = 2L2 (T + 4). Several iterations of this inequality yield ϕ n (T) ⩽ c nT ⩽ c nT

∫⋅ ⋅ ⋅ ∫

t1 ⩽...⩽t n−1 ⩽T

∫⋅ ⋅ ⋅ ∫

t1 ⩽...⩽t n−1 ⩽T

ϕ0 (t0 ) dt0 . . . dt n−1

dt0 . . . dt n−1 ϕ0 (T) = c nT

The estimate from 1o for n = 0 shows

Tn ϕ0 (T). n!

ϕ0 (T) = 𝔼 [sup |X1 (s) − ξ|2 ] ⩽ 2M 2 (T + 4)T 𝔼 [(1 + |ξ|)2 ] s⩽T

and this means that for some constant C T = C(L, M, T) ∞

2

1 2



∑ {𝔼 [sup |X n+1 (t) − X n (t)| ]} ⩽ ∑

n=0

t⩽T

3o For n ⩾ m we find

n=0

2

1 2



(c nT

1

2 Tn ϕ0 (T)) ⩽ C T √𝔼 [(1 + |ξ|)2 ]. n!

2

1 2

{𝔼 [sup |X n (s) − X m (s)| ]} ⩽ ∑ {𝔼 [sup |X j (s) − X j−1 (s)| ]} . s⩽T

j=m+1

s⩽T

Therefore, (X n )n⩾0 is a Cauchy sequence in L2 ; denote its limit by X. The above inequality shows that there is a subsequence m(k) such that lim sup |X(s) − X m(k) (s)|2 = 0

k→∞ s⩽T

a.s.

Since X m(k) (⋅) is continuous and adapted, we see that X(⋅) is also continuous and adapted. Moreover, 2

1 2



2

1 2

{𝔼 [sup |X(s) − ξ| ]} ⩽ ∑ {𝔼 [sup |X n+1 (t) − X n (t)| ]} ⩽ C T √𝔼 [(1 + |ξ|)2 ], s⩽T

t⩽T

n=0

and (21.22) follows with κ T = C2T + 1.

4o Recall that X n+1 (t) = ξ + ∫0 σ(s, X n (s)) dB s + ∫0 b(s, X n (s)) ds. Let n = m(k); because of the estimate (21.22) we can use the dominated convergence theorem to get t

t

t

t

X(t) = ξ + ∫ σ(s, X(s)) dB s + ∫ b(s, X(s)) ds, 0

0

which shows that X is indeed a solution of the SDE.
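Picard's iteration can also be carried out numerically: fix one Brownian path on a grid and iterate the map X ↦ ξ + ∫ b(s, X_s) ds + ∫ σ(s, X_s) dB_s, with the integrals replaced by left-endpoint Riemann/Itô sums. The sketch below (illustrative Lipschitz coefficients; everything else is an arbitrary choice) shows the successive iterates settling down quickly, in line with the factorial decay obtained in Step 2°.

```python
import numpy as np

rng = np.random.default_rng(5)
b = lambda t, x: -x                           # illustrative Lipschitz drift
sigma = lambda t, x: 0.5 + 0.1 * np.sin(x)    # illustrative Lipschitz dispersion

T, n, xi = 1.0, 2000, 1.0
dt = T / n
t = np.linspace(0.0, T, n + 1)
dB = rng.normal(0.0, np.sqrt(dt), n)          # one fixed Brownian path

X = np.full(n + 1, xi)                        # X_0(t) = xi
for it in range(8):                           # X_{n+1} = xi + ∫ b(s,X_n) ds + ∫ sigma(s,X_n) dB
    drift = np.concatenate(([0.0], np.cumsum(b(t[:-1], X[:-1]) * dt)))
    noise = np.concatenate(([0.0], np.cumsum(sigma(t[:-1], X[:-1]) * dB)))
    X_new = xi + drift + noise
    print(it, np.max(np.abs(X_new - X)))      # sup-distance between successive iterates
    X = X_new
```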

21.14 Remark. The proof of the existence of a solution of the SDE (21.1) shows, in particular, that the solution depends measurably on the initial condition. More precisely, if X tx denotes the unique solution of the SDE with initial condition x ∈ ℝn , then (s, x, ω) 󳨃→ X sx (ω), (s, x) ∈ [0, t] × ℝn is for every t ⩾ 0 measurable with respect to B[0, t] ⊗ B(ℝn ) ⊗ Ft . It is enough to show this for the Picard approximations X nx (t, ω), since the measurability is preserved under pointwise limits. For n = 0 we have X nx (t, ω) = x and there is nothing to show. Assume we know already that X nx (t, ω) is B[0, t] ⊗ B(ℝn ) ⊗ Ft measurable. Since t

t

x X n+1 (t) = x + ∫ b (s, X nx (s)) ds + ∫ σ (s, X nx (s)) dB s 0

0

we are done, if we can show that both integral terms have the required measurability. For the Lebesgue integral this is obvious from the measurability of the integrand and the continuous dependence on the upper integral limit t. To see the measurability of the stochastic integral, it is enough to assume d = n = 1 and that σ is a step function. Then t

m

∫ σ (s, X nx (s)) dB s = ∑ σ (s j−1 , X nx (s j−1 )) (B(s j ) − B(s j−1 )) 0

j=1

where 0 = s0 < s1 < ⋅ ⋅ ⋅ < s m = t and this shows the required measurability.

21.6 Further examples and counterexamples

Throughout this section we consider one-dimensional time-homogeneous SDEs of the form

    dX_t = b(X_t) dt + σ(X_t) dB_t,   X₀ = x ∈ ℝ,    (21.23)


where (B_t, F_t)_{t⩾0} is a BM¹ with an admissible complete filtration. If the coefficients b and σ satisfy the (local) Hölder and global growth condition, then the equation (21.23) has a unique solution. We will present a few examples that illustrate what may happen if these conditions do not hold. The first example is the classical example for an ODE without a solution; setting σ ≡ 0, we see that ODEs are particular SDEs.

21.15 Example. The one-dimensional ODE

    dx_t = − sgn(x_t) dt,   x₀ = 0,   sgn(x) = 𝟙_{(0,∞)}(x) − 𝟙_{(−∞,0]}(x),    (21.24)

does not have a solution.

Proof. Assume that x_t is a solution of (21.24) and that there are ξ ≠ 0 and t₀ > 0 such that x_{t₀} = ξ. Define

    τ := inf {t ⩾ 0 : x_t = ξ}   and   σ := sup {0 ⩽ t ⩽ τ : x_t = 0}.

Inserting these times into the ODE yields

    ξ = x_τ − x_σ = − ∫_σ^τ sgn(x_s) ds = −(τ − σ) sgn(ξ).

This leads to a contradiction. Thus, only x_t ≡ 0 qualifies as a solution, but this does not solve (21.24).

If we replace dt by dB_t in (21.24) we get a similar result:

21.16 Example (Tanaka 1963). The one-dimensional SDE

    dX_t = − sgn(X_t) dB_t,   X₀ = 0,   sgn(x) = 𝟙_{(0,∞)}(x) − 𝟙_{(−∞,0]}(x),    (21.25)

does not have a solution.

Proof. We may assume that Ft is the completion of the natural filtration FtB . Let (X t , FtX )t⩾0 be a solution to (21.25); we may assume that (FtX )t⩾0 is complete. By definition, FtX ⊂ Ft , and X is a continuous martingale. Its quadratic variation is given by ∙

t

⟨X⟩t = ⟨− ∫ sgn(X s ) dB s ⟩ = ∫ sgn2 (X s ) ds = t. 0

t

0

By Lévy’s characterization of Brownian motion, Theorem 9.13 or Theorem 19.5, we see that (X t , FtX )t⩾0 is a BM1 . Applying Tanaka’s formula (18.21) to X shows t

t

|X t | − L0t (X) = ∫ sgn(X s ) dX s = − ∫ sgn2 (X s ) dB s = −B t , 0

0

where L0t (X) is the local time (at 0) of the Brownian motion X. Since t

L0t (X)

1 = ℙ- lim ∫ 𝟙[0,ϵ) (|X s |) ds, ϵ→0 2ϵ 0

|X|

|X|

cf. (18.20), we see that Ft ⊂ Ft where Ft is the completed canonical filtration of the process (|X t |)t⩾0 . Since (X t , FtX )t⩾0 is a solution of (21.25), we also have FtX ⊂ Ft |X| which means that FtX ⊂ Ft . This is impossible, since X is a Brownian motion. In fact, one can even show that there are SDEs of the form dX t = σ(X t ) dB t without a solution, where σ is continuous, bounded and strictly positive, see Barlow [11].

The one-sided version of the SDE (21.25) admits, however, a solution.

21.17 Example. The one-dimensional SDE

    dX_t = 𝟙_{(0,∞)}(X_t) dB_t,   X₀ = x ∈ ℝ,    (21.26)

has a solution.

Proof. Consider the stopping time τ = inf{t ⩾ 0 : x + B_t ⩽ 0}, and set X_t := x + B_{t∧τ}. Obviously, (X_t)_{t⩾0} is continuous, F_t-adapted, and we have

    X_t − X₀ = ∫₀ᵗ 𝟙_{{τ>s}} dB_s = ∫₀ᵗ 𝟙_{(0,∞)}(x + B_{s∧τ}) dB_s = ∫₀ᵗ 𝟙_{(0,∞)}(X_s) dB_s.

This argument still holds if the initial condition X₀ is a random variable which is independent of (B_t)_{t⩾0}.
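Example 21.17 is easy to visualize: the solution is just Brownian motion started at x and frozen at the first time it reaches 0. A minimal simulation sketch (the stopping time is approximated on a grid, so the frozen level may be slightly below 0):

```python
import numpy as np

rng = np.random.default_rng(6)
x, T, n = 1.0, 4.0, 4000
dt = T / n
B = np.concatenate(([0.0], np.cumsum(rng.normal(0.0, np.sqrt(dt), n))))

hit = np.nonzero(x + B <= 0.0)[0]   # grid times with x + B_t <= 0
k_tau = hit[0] if hit.size else n   # index approximating the stopping time tau

X = x + B.copy()
X[k_tau:] = x + B[k_tau]            # freeze after tau: X_t = x + B_{t∧tau}
print(X.min())                      # the frozen path never goes (noticeably) below 0
```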

If we compare Example 21.15 with Example 21.16, we see that in the latter example the non-existence is a measurability issue: use W_t = − ∫₀ᵗ sgn(X_s) dX_s, where (X_t, F_t^X)_{t⩾0} is an arbitrary Brownian motion, instead of the given Brownian motion (B_t)_{t⩾0}. By Lévy's theorem, (W_t, F_t^X)_{t⩾0} is also a Brownian motion, and we have dW_t = − sgn(X_t) dX_t or dX_t = − sgn(X_t) dW_t. If we relax the Definition 21.1 of the solution to an SDE, we can take such situations into account.

21.18 Definition (Skorokhod 1965). A weak solution to the SDE (21.23) consists of some Brownian motion (W_t, F_t^W)_{t⩾0} and a progressively measurable process (X_t, F_t^W)_{t⩾0} such that (F_t^W)_{t⩾0} is a complete admissible filtration and

    X_t = x + ∫₀ᵗ b(X_s) ds + ∫₀ᵗ σ(X_s) dW_s.

(As in Definition 21.1, it is implicit that all (stochastic) integrals make sense.)


Every (strong) solution in the sense of Definition 21.1 is also a weak solution. We do not want to follow this line of discussion, but refer to Karatzas & Shreve [136, Chapter 5.3], Ikeda & Watanabe [113, Chapter IV.1–3] and Cherny & Engelbert [30] for properties of weak solutions. In the following examples solution always means a strong solution in the sense of Definition 21.1. 21.19 Example (Zvonkin 1974). The one-dimensional SDE dX t = −

1 𝟙ℝ\{0} (X t ) dt + dB t , 2X t

X0 = 0,

has no solution (and also no weak solution).

(21.27)

Proof. Assume that X t is a (weak) solution of (21.27). Let ϕ ∈ C∞ c (ℝ) be a function with 0 ⩽ ϕ ⩽ 1 and ϕ|(−1,1) = 1. Set ϕ ϵ (x) := ϕ(x/ϵ); then 󵄨 󵄨 󵄨󵄨 󵄨󵄨 x 󵄨󵄨 󵄨󵄨 x 󸀠 x 󵄨󵄨 󵄨󵄨 x2 󸀠󸀠 x 󵄨󵄨 󵄨󵄨 󵄨 󵄨 󵄨󵄨ϕ ϵ (x)󵄨󵄨󵄨 + 󵄨󵄨󵄨󵄨xϕ󸀠ϵ (x)󵄨󵄨󵄨󵄨 + 󵄨󵄨󵄨󵄨x2 ϕ󸀠󸀠 ϵ (x)󵄨󵄨󵄨 = 󵄨󵄨 ϕ ( ϵ )󵄨󵄨 + 󵄨󵄨󵄨 ϵ ϕ ( ϵ )󵄨󵄨󵄨 + 󵄨󵄨󵄨 ϵ2 ϕ ( ϵ )󵄨󵄨󵄨 󵄨 󵄨 󵄨 󵄨 󵄨 󵄨 ⩽ sup 󵄨󵄨󵄨ϕ(y)󵄨󵄨󵄨 + sup 󵄨󵄨󵄨󵄨yϕ󸀠 (y)󵄨󵄨󵄨󵄨 + sup 󵄨󵄨󵄨󵄨y2 ϕ󸀠󸀠 (y)󵄨󵄨󵄨󵄨 < ∞. y∈ℝ y∈ℝ y∈ℝ

By Itô’s formula – which also holds for weak solutions with the Brownian motion specified in the definition of the weak solution – we get X 2t ϕ ϵ (X t ) t

t

= ∫ [2X s ϕ ϵ (X s ) + X 2s ϕ󸀠ϵ (X s )] dX s + 0

t

1 ∫ [2ϕ ϵ (X s ) + 4X s ϕ󸀠ϵ (X s ) + X 2s ϕ󸀠󸀠 ϵ (X s )] ds 2 0

t

3 1 = ∫ [2X s ϕ ϵ (X s ) + X 2s ϕ󸀠ϵ (X s )] dB s + ∫[ϕ ϵ (X s )𝟙{0} (X s ) + X s ϕ󸀠ϵ (X s ) + X 2s ϕ󸀠󸀠 ϵ (X s )] ds. 2 2 0 0 = 𝟙{0} (X s )

Since all integrands are bounded, the stochastic integral is a martingale, and we can take expectations: t

𝔼 [X 2t ϕ ϵ (X t )]

t

3 1 ] = 𝔼 [∫ 𝟙{0} (X s ) ds] + 𝔼 [∫ ( X s ϕ󸀠ϵ (X s ) + X 2s ϕ󸀠󸀠 ϵ (X s )) ds . 2 2 ] [0 ] [0

Using dominated convergence, we can let ϵ → 0 to conclude that for all t > 0 t

t

0 = 𝔼 [∫ 𝟙{0} (X s ) ds] , [0

and so,

]

0

Applying Itô’s formula again yields t

∫ 𝟙{0} (X s ) ds = 0 a.s.

t

t

X 2t = ∫ 𝟙{0} (X s ) ds + ∫ 2X s dB s = ∫ 2X s dB s . 0

0

0

Thus, (X 2t )t⩾0 is a positive local martingale. As such, it is a supermartingale, cf. Proposition 19.3.a), and we find for all t > 0 𝔼 (X 2t ) ⩽ 𝔼 (X02 ) = 0,

hence,

X t = 0 a.s. t

Since t 󳨃→ X t is continuous, this contradicts the fact that ∫0 𝟙{0} (X s ) ds = 0.

21.20 Example (Zvonkin 1974). The one-dimensional SDE dX t = − sgn(X t ) dt + dB t ,

(21.28)

X0 = x ∈ ℝ,

has a unique solution. This is a non-trivial result. It is a special case of the fact that the SDE dX t = b(X t ) dt + dB t , X0 = x ∈ ℝ, b ∈ Bb (ℝ), (21.29) has a unique (strong) solution, cf. Zvonkin [281].

21.21 Example (Maruyama’s drift transformation; Maruyama 1954). Using Girsanov’s theorem it is straightforward to show that the SDE (21.29) from Example 21.20 admits a weak solution in the sense of Definition 21.18. Let (B t , Ft )t⩾0 be any BM1 with an admissible, complete filtration. We set X t := x + B t ,

t

t

0

0

1 M t := exp (∫ b(X s ) dB s − ∫ b2 (X s ) ds) , 2

ℚ|Ft := M t ⋅ ℙ|Ft .

By Girsanov’s theorem, Theorem 19.9, (W t , Ft )t⩽T is a Brownian motion where T > 0 t and W t := B t − ∫0 b(X s ) ds on the probability space (Ω, A , ℚ = ℚ|FT ). In particular, ((W t , Ft )t⩾0 , (X t , Ft )t⩾0 ) is a weak solution since t

dX t = dB t = dW t + d ∫ b(X s ) ds = dW t + b(X t ) dt 0

and

X0 = x.

If we replace the function b(x) by a functional β(t, ω) which depends on the whole path [0, t] ∋ s 󳨃→ X s (ω) up to time t, then there need not be a solution, cf. Tsirelson [258]. 21.22 Example. The one-dimensional SDE

dX t = 𝟙ℝ\{0} (X t ) dB t ,

has more than one solution.

X0 = 0,

Proof. Note that both X t ≡ 0 and X t = B t solve (21.30).

(21.30)


21.7 Solutions as Markov processes

Recall, cf. Remark 6.3.a), that a Markov process is an n-dimensional, adapted stochastic process (X_t, F_t)_{t⩾0} with (right) continuous paths such that

    𝔼 [u(X_t) | F_s] = 𝔼 [u(X_t) | X_s]   for all t ⩾ s ⩾ 0, u ∈ B_b(ℝⁿ).    (21.31)

21.23 Theorem. Let (B_t, F_t)_{t⩾0} be a d-dimensional Brownian motion and assume that the coefficients b(s, x) ∈ ℝⁿ and σ(s, x) ∈ ℝ^{n×d} of the SDE (21.1) satisfy the global Lipschitz condition (21.17). Then the unique solution of the SDE is a Markov process.

Proof. Consider for fixed s ⩾ 0 and x ∈ ℝⁿ the following SDE

    X_t = x + ∫_s^t b(r, X_r) dr + ∫_s^t σ(r, X_r) dB_r,   t ⩾ s.    (21.32)

It is not hard to see that the existence and uniqueness results of Section 21.5 also hold for this SDE, and we denote by (X ts,x )t⩾s its unique solution. On the other hand, we find for every F0 measurable initial condition ξ ∈ L2 (ℙ) that 0,ξ

Xt

t

t

0,ξ

0

and so 0,ξ Xt

0,ξ

= ξ + ∫ b(r, X r ) dr + ∫ σ(r, X r ) dB r ,

=

0,ξ Xs

0

t

+ ∫ b(r, s

0,ξ X r ) dr

t

t ⩾ 0,

0,ξ

+ ∫ σ(r, X r ) dB r ,

t ⩾ s.

s

Because of the uniqueness of the solution, we get 0,ξ

Xt 0,ξ

0,ξ

s,X s

= Xt

0,ξ

a.s. for all t ⩾ s.

Moreover, X t (ω) is of the form Φ(X s (ω), s, t, ω) and the functional Φ(x, s, t, ⋅) is measurable with respect to Gs := σ(B r − B s : r ⩾ s) since it solves (21.32). As (B t )t⩾0 has independent increments, we know that Fs ⊥⊥ Gs , cf. Lemma 2.14 or Theorem 6.1. Therefore, we get for all s ⩽ t and u ∈ Bb (ℝn ) 0,ξ 󵄨 𝔼 [u(X t ) 󵄨󵄨󵄨 Fs ]

=

(A.5)

=

Φ ⊥⊥ Fs

=

0,ξ 󵄨 𝔼 [u(Φ(X s , s, t, ⋅)) 󵄨󵄨󵄨 Fs ]

󵄨 𝔼 [u(Φ(x, s, t, ⋅))] 󵄨󵄨󵄨x=X0,ξ s

󵄨 𝔼 [u(X ts,x )] 󵄨󵄨󵄨x=X0,ξ . s

Using the tower property for conditional expectations we see now 0,ξ 󵄨 0,ξ 󵄨 0,ξ 𝔼 [u(X t ) 󵄨󵄨󵄨 Fs ] = 𝔼 [u(X t ) 󵄨󵄨󵄨 X s ].

For u = 𝟙_B, B ∈ B(ℝⁿ), the proof of Theorem 21.23 actually shows

    p(s, X_s^{0,ξ}; t, B) := ℙ(X_t^{0,ξ} ∈ B | F_s) = ℙ(X_t^{s,y} ∈ B)|_{y = X_s^{0,ξ}}.

The family of measures p(s, x; t, ⋅) is the transition function of the Markov process X. If p(s, x; t, ⋅) depends only on the difference t − s, we say that the Markov process is homogeneous. In this case we write p(t − s, x; ⋅).

21.24 Corollary. Let (B_t, F_t)_{t⩾0} be a BM^d and assume that the coefficients b(x) ∈ ℝⁿ and σ(x) ∈ ℝ^{n×d} of the SDE (21.1) are autonomous (i.e. they do not depend on time) and satisfy the global Lipschitz condition (21.17). Then the unique solution of the SDE is a (homogeneous) Markov process with the transition function

    p(t, x; B) = ℙ(X_t^x ∈ B),   t ⩾ 0, x ∈ ℝⁿ, B ∈ B(ℝⁿ).

Proof. As in the proof of Theorem 21.23 we see for all x ∈ ℝn and s, t ⩾ 0 s+t

s+t

s,x X s+t = x + ∫ b(X rs,x ) dr + ∫ σ(X rs,x ) dB r s

s

t

t

s,x s,x = x + ∫ b(X s+v ) dv + ∫ σ(X s+v ) dW v , 0

0

s,x where W v := B s+v − B s , v ⩾ 0, is again a Brownian motion. Thus, (X s+t )t⩾0 is the unique solution of the SDE t

t

Y t = x + ∫ b(Y v ) dv + ∫ σ(Y v ) dW v . 0

0

Since W and B have the same probability distribution, we conclude that s,x ℙ (X s+t ∈ B) = ℙ (X 0,x ∈ B) t

for all s ⩾ 0, B ∈ B(ℝn ).

The claim follows if we set p(t, x; B) := ℙ(X_t^x ∈ B) := ℙ(X_t^{0,x} ∈ B).
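For autonomous Lipschitz coefficients the transition function p(t, x; B) = ℙ(X_t^x ∈ B) can be estimated by Monte Carlo. The sketch below uses the Ornstein–Uhlenbeck coefficients of Example 21.5 as an illustration (all numerical values are arbitrary choices): there the law of X_t^x is the Gaussian N(x e^{bt}, σ²(e^{2bt} − 1)/(2b)) obtained from (21.9) and Example 21.2, which gives an exact value to compare with.

```python
import numpy as np
from math import erf, sqrt, exp

rng = np.random.default_rng(7)
b, sigma, x, t_final, n, paths = -1.0, 0.5, 0.3, 1.0, 500, 20000
dt = t_final / n

# Euler-Maruyama paths of dX = b X dt + sigma dB, X_0 = x
X = np.full(paths, x)
for k in range(n):
    X = X + b * X * dt + sigma * rng.normal(0.0, np.sqrt(dt), paths)

p_mc = np.mean(X >= 0.0)                      # Monte Carlo estimate of p(t, x; [0, inf))

# exact value from the Gaussian law of X_t^x, cf. Example 21.5
mean = x * exp(b * t_final)
var = sigma**2 * (exp(2 * b * t_final) - 1.0) / (2.0 * b)
p_exact = 0.5 * (1.0 + erf(mean / sqrt(2.0 * var)))

print(p_mc, "vs", p_exact)
```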

21.8 Localization procedures

We will show that the existence and uniqueness theorem for the solution of an SDE still holds under a local Lipschitz condition (21.17) and a global linear growth condition (21.18). We begin with a useful localization result for stochastic differential equations. 21.25 Lemma (localization). Let (B t , Ft )t⩾0 be a BMd . Consider the SDEs dX t = b j (t, X t ) dt + σ j (t, X t ) dB t , j

j

j

j = 1, 2,


with initial condition X01 = X02 = ξ ∈ L2 . Assume that ξ is F0 measurable and that the coefficients b j : [0, ∞) × ℝn → ℝn and σ j : [0, ∞) × ℝn → ℝn×d satisfy the Lipschitz condition (21.17) for all x, y ∈ ℝn with a global Lipschitz constant L. If for some T > 0 and R > 0 b1 |[0,T]×𝔹(0,R) = b2 |[0,T]×𝔹(0,R)

and

σ1 |[0,T]×𝔹(0,R) = σ2 |[0,T]×𝔹(0,R) ,

then we have for the stopping times τ j = inf {t ⩾ 0 : |X t | ⩾ R} ∧ T, j = 1, 2, ℙ(τ1 = τ2 ) = 1

j

and ℙ( sup |X 1s − X 2s | = 0) = 1.

(21.33)

s⩽τ1

Proof. Since the solutions of the SDEs are continuous processes, τ1 and τ2 are indeed stopping times, cf. Lemma 5.8. Observe that X 1t∧τ 1



X 2t∧τ 1

t∧τ1

= ∫ [b1 (s, 0

t∧τ1

X 1s )

− b2 (s,

X 2s )] ds

t∧τ1

+ ∫ [σ1 (s, X 1s ) − σ2 (s, X 2s )] dB s 0

= ∫ [ b1 (s, X 1s ) − b2 (s, X 1s ) +b2 (s, X 1s ) − b2 (s, X 2s )] ds 0

t∧τ1

=0

+ ∫ [ σ1 (s, X 1s ) − σ2 (s, X 1s ) +σ2 (s, X 1s ) − σ2 (s, X 2s )] dB s . 0

=0

Now we can use the elementary inequality (a + b)2 ⩽ 2a2 + 2b2 and (21.20), (21.21) from the proof of the stability Theorem 21.11, with X = X 1 and Y = X 2 to deduce T

󵄨2 󵄨 󵄨2 󵄨 𝔼 [sup 󵄨󵄨󵄨X 1t∧τ − X 2t∧τ 󵄨󵄨󵄨 ] ⩽ 2L2 (T + 4) ∫ 𝔼 [sup 󵄨󵄨󵄨X 1r∧τ − X 2r∧τ 󵄨󵄨󵄨 ] ds. 1 1 1 1 r⩽s t⩽T

0

The estimate (21.22) ensures that all expectations are finite. Gronwall’s inequality, Theorem A.42 with a(s) = 0 and b(s) = 2L2 (T + 4) yields 󵄨 󵄨2 𝔼 [sup 󵄨󵄨󵄨X 1r∧τ − X 2r∧τ 󵄨󵄨󵄨 ] = 0 for all s ⩽ T. 1 1 r⩽s

1 2 Therefore, X∙∧τ = X∙∧τ almost surely and, in particular, τ1 ⩽ τ2 . Swapping the roles 1

1

of X 1 and X 2 we find τ2 ⩽ τ1 a.s., and the proof is finished.

We will use Lemma 21.25 to prove local uniqueness and existence results.

21.26 Theorem. Let (B t , Ft )t⩾0 be a d-dimensional Brownian motion and assume that the coefficients b : [0, ∞) × ℝn → ℝn and σ : [0, ∞) × ℝn → ℝn×d of the SDE (21.1) satisfy the linear growth condition (21.18) for all x ∈ ℝn and the Lipschitz condition (21.17)

for all x, y ∈ K for every compact set K ⊂ ℝn with the local Lipschitz constants L T,K . For every F0 measurable initial condition X0 = ξ ∈ L2 (ℙ) there exists a unique solution (X t )t⩾0 , and this solution satisfies 𝔼 [sup |X s |2 ] ⩽ κ T 𝔼 [(1 + |ξ|)2 ] s⩽T

for all T > 0.

(21.34)

n Proof. Fix T > 0. For R > 0 there is some χ R ∈ C∞ c (ℝ ) with 𝟙𝔹(0,R) ⩽ χ R ⩽ 1. Set

b R (t, x) := b(t, x)χ R (x) and

By Theorem 21.13 the SDE

σ R (t, x) := σ(t, x)χ R (x).

dY t = b R (t, Y t ) dt + σ R (t, Y t ) dB t ,

Y0 = ξ,

has a unique solution X tR . Set

󵄨 󵄨 τ R := inf {s ⩾ 0 : 󵄨󵄨󵄨X sR 󵄨󵄨󵄨 ⩾ R} .

For S > R and all t ⩾ 0 we have

b R |[0,t]×𝔹(0,R) = b S |[0,t]×𝔹(0,R)

and

With Lemma 21.25 we conclude that

σ R |[0,t]×𝔹(0,R) = σ S |[0,t]×𝔹(0,R) .

󵄨 󵄨 󵄨 󵄨 ℙ( sup 󵄨󵄨󵄨X tR − X tS 󵄨󵄨󵄨 > 0) ⩽ ℙ(τ R < T) = ℙ( sup 󵄨󵄨󵄨X tR 󵄨󵄨󵄨 ⩾ R). t⩽T

t⩽T

Therefore it is enough to show that

lim ℙ (τ R < T) = 0.

R→∞

(21.35)

Once (21.35) is established, we see that X t := limR→∞ X tR exists uniformly for t ∈ [0, T] R and X t∧τ = X t∧τ . Therefore, (X t )t⩾0 is a continuous stochastic process. Moreover, on R

the set {τ R > 0},

R

t∧τ R

R X t∧τ R

= ξ + ∫ b R (s, 0

t∧τ R

t∧τ R

X sR ) ds

+ ∫ σ R (s, X sR ) dB s 0

t∧τ R

= ξ + ∫ b(s, X s ) ds + ∫ σ(s, X s ) dB s . 0

0

Letting R → ∞ shows that X t solves the SDE (21.1). Let us now show (21.35). Since ξ ∈ L2 (ℙ), we can use Theorem 21.13 to get 󵄨 󵄨2 𝔼 [ sup 󵄨󵄨󵄨X tR 󵄨󵄨󵄨 ] ⩽ κ T 𝔼 [(1 + |ξ|)2 ] < ∞, t⩽T


and with Markov’s inequality, κT 󵄨 󵄨 ℙ (τ R < T) = ℙ( sup 󵄨󵄨󵄨X sR 󵄨󵄨󵄨 ⩾ R) ⩽ 2 𝔼 [(1 + |ξ|)2 ] 󳨀󳨀󳨀󳨀󳨀→ 0. R→∞ R s⩽T

To see uniqueness, assume that X and Y are any two solutions. Since X R is unique, we conclude that X t∧τ

R ∧T R

= Y t∧τ

R ∧T R

R = X t∧τ

R ∧T R

if τ R := inf{s ⩾ 0 : |X s | ⩾ R} and T R := inf{s ⩾ 0 : |Y s | ⩾ R}. Thus, X t = Y t for all t ⩽ τ R ∧ T R . Since τ R , T R → ∞ as R → ∞, we see that X = Y.

21.9 Dependence on the initial values

Often it is important to know how the solution (X tx )t⩾0 of the SDE (21.1) depends on the starting point X0 = x. Typically, one is interested whether x 󳨃→ X tx is continuous or differentiable. A first weak continuity result follows already from the stability Theorem 21.11. 21.27 Corollary (Feller continuity). Let (B t , Ft )t⩾0 be a BMd . Assume that the coefficients b : [0, ∞) × ℝn → ℝn and σ : [0, ∞) × ℝn → ℝn×d of the SDE (21.1) satisfy the Lipschitz condition (21.17) for all x, y ∈ ℝn with the global Lipschitz constant L. Denote by (X tx )t⩾0 the solution of the SDE (21.1) with initial condition X0 ≡ x ∈ ℝn . Then x 󳨃→ 𝔼 [f(X tx )]

is continuous for all f ∈ Cb (ℝn ).

Proof. Assume that f ∈ Cc (ℝn ) is a continuous function with compact support. Since f is uniformly continuous, we find for any ϵ > 0 some δ > 0 such that |ξ − η| < δ implies |f(ξ) − f(η)| ⩽ ϵ. Now consider for x, y ∈ ℝn y 󵄨 󵄨 𝔼 [󵄨󵄨󵄨f(X tx ) − f(X t )󵄨󵄨󵄨]

y 󵄨 y 󵄨 󵄨 󵄨 = 𝔼 [󵄨󵄨󵄨f(X tx ) − f(X t )󵄨󵄨󵄨 𝟙{|X tx −X ty |⩾δ} ] + 𝔼 [󵄨󵄨󵄨f(X tx ) − f(X t )󵄨󵄨󵄨 𝟙{|X tx −X ty | 0,

(21.36)

s⩽T

𝔼 [sup |X sx − X s |p ] ⩽ c󸀠p,T |x − y|p

for all T > 0.

y

s⩽T

(21.37)

In particular, x 󳨃→ X tx (has a modification which) is continuous.

Proof. The proof of (21.36) is very similar to the proof of (21.37). Note that the global Lipschitz condition (21.17) implies the linear growth condition (21.18) which will be needed for (21.36). Therefore, we will prove only (21.37). We have t

t

X tx − X t = (x − y) + ∫(b(s, X sx ) − b(s, X s )) ds + ∫(σ(s, X sx ) − σ(s, X s )) dB s . y

y

0

y

0

Define stoppping times τ zR := inf {s ⩾ 0 : |X sz | ⩾ R}, z = x, y, and τ R := τ xR ∧ τ R . We follow the strategy used in the proof of Theorem 21.11. From Hölder’s inequality we see y

21.9 Dependence on the initial values | 409

that (a + b + c)p ⩽ 3p−1 (a p + b p + c p ). Thus, 𝔼 [ sup |X tx − X t |p ] ⩽ 3p−1 |x − y|p y

t⩽T∧τ R

󵄨󵄨p 󵄨󵄨 R 󵄨 󵄨 y + 3p−1 𝔼 [sup 󵄨󵄨󵄨󵄨 ∫ (b(s, X sx ) − b(s, X s )) ds󵄨󵄨󵄨󵄨 ] 󵄨󵄨 t⩽T 󵄨󵄨 0 ] [ +3

p−1

t∧τ

󵄨󵄨p 󵄨󵄨 t∧τ R 󵄨 󵄨 y [ 𝔼 sup 󵄨󵄨󵄨󵄨 ∫ (σ(s, X sx ) − σ(s, X s )) dB s 󵄨󵄨󵄨󵄨 ] . 󵄨󵄨 t⩽T 󵄨󵄨 0 ] [

Using the Hölder inequality and (21.17), we find for the middle term

T∧τ R p 󵄨󵄨 t∧τ R 󵄨󵄨p 󵄨 󵄨 y y 󵄨 󵄨 𝔼 [sup 󵄨󵄨󵄨󵄨 ∫ (b(s, X sx ) − b(s, X s )) ds󵄨󵄨󵄨󵄨 ] ⩽ 𝔼 [( ∫ 󵄨󵄨󵄨b(s, X sx ) − b(s, X s )󵄨󵄨󵄨 ds) ] 󵄨󵄨 t⩽T 󵄨󵄨 0 [ ] [ 0 ] T∧τ R

⩽ T

(21.17)

p−1

y 󵄨p 󵄨 𝔼 [ ∫ 󵄨󵄨󵄨b(s, X sx ) − b(s, X s )󵄨󵄨󵄨 ds] [ 0 ] T

⩽ L T p

y

x − X s∧τ R |p ] ds ∫ 𝔼 [|X s∧τ R

p−1

0

T

⩽ L p T p−1 ∫ 𝔼 [ sup |X rx − X r |p ] ds. y

r⩽s∧τ R

0

For the third term we combine Burkholder’s inequality (applied to the coordinate processes), the estimate (21.17) and Hölder’s inequality to get 󵄨󵄨 t∧τ R 󵄨󵄨󵄨p 󵄨 y [ 𝔼 sup 󵄨󵄨󵄨󵄨 ∫ (σ(s, X sx ) − σ(s, X s )) dB s 󵄨󵄨󵄨󵄨 ] 󵄨󵄨 t⩽T 󵄨󵄨 0 [ ] (19.18)



(21.17)



⩽ ⩽

T

x ) − σ(s ∧ τ R , X s∧τ R )|2 ds) C p 𝔼 [( ∫ |σ(s ∧ τ R , X s∧τ R [ 0 T

x C p L p 𝔼 [( ∫ |X s∧τ − X s∧τ R |2 ds) R [ 0

y

p/2

y

T

C p L p 𝔼 [( ∫ sup |X rx − X r |2 ds) r⩽s∧τ R [ 0 y

T

] ]

p/2

] ]

C p L p T p/2−1 ∫ 𝔼 [ sup |X rx − X r |p ] ds. y

0

r⩽s∧τ R

p/2

] ]

410 | 21 Stochastic differential equations This shows that T

𝔼 [ sup

t⩽T∧τ R

|X tx



y Y t |p ]

⩽ c|x − y| + c ∫ 𝔼 [ sup |X rx − Y r |p ] ds y

p

0

r⩽s∧τ R

with a constant c = c(T, p, L). Gronwall’s inequality, Theorem A.42, yields when applied y to u(T) = 𝔼[supt⩽T∧τ R |X tx − X t |p ], a(s) = c |x − y|p and b(s) = c, 𝔼 [ sup |X tx − X t |p ] ⩽ c e c T |x − y|p . y

t⩽T∧τ R

Note that τ R → ∞ as R → ∞. If we use Fatou’s lemma for the expression on the left-hand side, we see 𝔼 [sup |X tx − X t |p ] ⩽ lim 𝔼 [ sup |X tx − X t |p ] ⩽ c e c T |x − y|p . y

y

t⩽T

t⩽T∧τ R

R→∞

The assertion follows from the Kolmogorov–Chentsov theorem, Theorem 10.1, where we use α = p and β + n = p.

21.29 Remark. a) A weaker form of the boundedness estimate (21.36) from Theorem 21.28 holds for all p ∈ ℝ: 𝔼 [(1 + |X tx |)p ] ⩽ c p,T (1 + |x|2 )p/2

Let us sketch the argument for d = n = 1.

for all x ∈ ℝn , t ⩽ T, p ∈ ℝ.

(21.38)

Outline of the proof. Since (1 + |x|)p ⩽ (1 + 2p/2 )(1 + |x|2 )p/2 for all p ∈ ℝ, it is enough to prove (21.38) for 𝔼 [(1 + |X tx |2 )p/2 ]. Set f(x) = (1 + x2 )p/2 . Then f ∈ C2 (ℝ) and we find that f 󸀠 (x) = px(1 + x2 )p/2−1

f 󸀠󸀠 (x) = p(1 − x2 + px2 )(1 + x2 )p/2−2 ,

and

i.e. |(1 + x2 )1/2 f 󸀠 (x)| ⩽ C p f(x) and |(1 + x2 )f 󸀠󸀠 (x)| ⩽ C p f(x) for some absolute constant C p . Itô’s formula (18.1) yields t

f(X tx )

− f(x) = ∫ f 0

t

󸀠

(X rx )b(r,

X rx ) dr

+∫f 0

t

󸀠

(X rx )σ(r,

X rx ) dB r

1 + ∫ f 󸀠󸀠 (X rx )σ2 (r, X rx ) dr. 2 0

The second term on the right is a local martingale, cf. Theorem 16.8 and Remark 18.3. Using a localizing sequence (σ k )k⩾0 and the fact that b(r, x) and σ(r, x) satisfy the linear growth condition (21.18), we get for all k ⩾ 0 and t ⩽ T t∧σ k

x 𝔼 f(X t∧σ ) k

2󵄨 󵄨 󵄨 󵄨 ⩽ f(x) + 𝔼 [ ∫ (M(1 + |X rx |)󵄨󵄨󵄨f 󸀠 (X rx )󵄨󵄨󵄨 + M 2 (1 + |X rx |) 󵄨󵄨󵄨f 󸀠󸀠 (X rx )󵄨󵄨󵄨) dr] [ 0 ] t

⩽ f(x) + C(p, M) ∫ 𝔼 f(X rx ) dr. 0


x By Fatou’s lemma 𝔼 f(X tx ) ⩽ limk→∞ 𝔼 f(X t∧σ ), and we can apply Gronwall’s lemma, k Theorem A.42. This shows for all x and t ⩽ T

𝔼 f(X tx ) ⩽ e C(p,M)T f(x).

b) For the differentiability of the solution with respect to the initial conditions, we can again use the idea of the proof of Theorem 21.28. We consider the (directional) difference quotients x

j

x+he j

Xt − Xt h

,

h > 0,

e j = (0, . . . , 0, 1, 0, . . . ) ∈ ℝn ,

and show that, as h → 0, the partial derivatives ∂ j X tx , j = 1, . . . , n, exist and satisfy the formally differentiated SDE ℓth coordinate, ∈ℝ

∈ℝn×n ∀ℓ

∈ℝn×n

d 󵄨󵄨 󵄨󵄨 d∂ j X tx = D z b(t, z) 󵄨󵄨󵄨 x ∂ j X tx dt + ∑ D z σ∙,ℓ (t, z) 󵄨󵄨󵄨 x dBℓt ∂ j X tx . 󵄨z=X t 󵄨z=X t ℓ=1 ℓth column, ∈ℝn

∈ℝn

∈ℝn

As one would expect, we need now (global) Lipschitz and growth conditions also for the derivatives of the coefficients. We will not go into further details but refer to Kunita [152, Chapter 4.6] and [153, Chapter 3]. 21.30 Corollary. Let (B t , Ft )t⩾0 be a d-dimensional Brownian motion. Assume that the coefficients b : [0, ∞) × ℝn → ℝn and σ : [0, ∞) × ℝn → ℝn×d of the SDE (21.1) satisfy the Lipschitz condition (21.17) for all x, y ∈ ℝn with the global Lipschitz constant L. Denote by (X tx )t⩾0 the solution of the SDE (21.1) with initial condition X0 ≡ x ∈ ℝn . Then we have for all p ⩾ 2, s, t ∈ [0, T] and x, y ∈ ℝn 𝔼 [|X sx − X t |p ] ⩽ C p,T (|x − y|p + (1 + |x| + |y|)p |s − t|p/2 ) . y

(21.39)

In particular, (t, x) 󳨃→ X tx (has a modification which) is continuous.

Proof. Assume that s ⩽ t. Then the inequality (a + b)p ⩽ 2p−1 (a p + b p ) gives 𝔼 [|X sx − X t |p ] ⩽ 2p−1 𝔼 [|X sx − X s |p ] + 2p−1 𝔼 [|X s − X t |p ] . y

y

y

y

The first term can be estimated by (21.37) with c󸀠p,T |x − y|p . For the second term we note that y |X s



y X t |p

t 󵄨󵄨 t 󵄨󵄨p 󵄨󵄨 󵄨 y y 󵄨 = 󵄨󵄨 ∫ b(r, X r ) dr + ∫ σ(r, X r ) dB r 󵄨󵄨󵄨󵄨 . 󵄨󵄨 󵄨󵄨 s s

Now we argue as in Theorem 21.28: because of (a + b)p ⩽ 2p−1 (a p + b p ), it is enough to estimate the integrals separately. The linear growth condition (21.18) (which follows from the global Lipschitz condition) and similar arguments as in the proof of

412 | 21 Stochastic differential equations Theorem 21.28 yield 󵄨󵄨p 󵄨󵄨 t 󵄨 󵄨 y [ 𝔼 󵄨󵄨󵄨󵄨 ∫ b(r, X r ) dr󵄨󵄨󵄨󵄨 ] 󵄨󵄨 󵄨 ] [󵄨 s

t



(21.18)

|t − s|

p−1

y 𝔼 [∫ |b(r, X r )|p dr]

[s

t

]

⩽ |t − s|p−1 M p 𝔼 [∫(1 + |X r )|)p dr] [s ] ⩽

(21.36)



y

(21.40)

y

|t − s|p M p 𝔼 [sup(1 + |X r )|)p ] r⩽T

|t − s|p M p c p,T (1 + |y|)p .

For the second integral we find with Burkholder’s inequality (applied to the coordinate processes) t p/2 󵄨󵄨 t 󵄨󵄨p (19.18) 󵄨󵄨 󵄨󵄨 ] y y [ [ 𝔼 󵄨󵄨󵄨 ∫ σ(r, X r ) dB r 󵄨󵄨󵄨 ⩽ C p 𝔼 ( ∫ |σ(r, X r )|2 dr) ] 󵄨 󵄨󵄨 [󵄨 s ] [ s ]

⩽ |t − s|p/2 C p M p c p,T (1 + |y|)p

where the last estimate follows as (21.40) with |b| 󴁄󴀼 |σ|2 and p 󴁄󴀼 p/2. Thus,

p p p p/2 𝔼 [|X sx − X t |p ] ⩽ c󸀠󸀠 (1 + |y|)p ) . p,T (|x − y| + |t − s| (1 + |y|) + |t − s| y

Ex. 21.12

Since t − s ⩽ T, we have |t − s|p ⩽ T p/2 |t − s|p/2 . If t ⩽ s, the starting points x and y change their roles in the above calculation, and we see that (21.39) is just a symmetric (in x and y) formulation of the last estimate. The existence of a jointly continuous modification (t, x) 󳨃→ X tx follows from the Kolmogorov–Chentsov theorem, Theorem 10.1, for the index set ℝn × [0, ∞) ⊂ ℝn+1 and with α = p, β + (n + 1) = p/2.

21.31 Corollary. Let (B t , Ft )t⩾0 be a d-dimensional Brownian motion and assume that the coefficients b : [0, ∞) × ℝn → ℝn and σ : [0, ∞) × ℝn → ℝn×d of the SDE (21.1) satisfy the Lipschitz condition (21.17) for all x, y ∈ ℝn with the global Lipschitz constant L. Denote by (X tx )t⩾0 the solution of the SDE (21.1) with initial condition X0 ≡ x ∈ ℝn . Then lim |X tx (ω)| = ∞ almost surely for every t ⩾ 0. (21.41)

Proof. For x ∈

ℝn

|x|→∞

\ {0} we set x̂ := x/|x|2 ∈ ℝn , and define for every t ⩾ 0 {(1 + |X tx |)−1 Y t̂x = { 0 {

if x̂ ≠ 0, if x̂ = 0.

Observe that |x̂| → 0 if, and only if, |x| → ∞. If x̂ , ̂y ≠ 0 we find for all p ⩾ 2 ̂y 󵄨p y 󵄨p y 󵄨p 󵄨󵄨 ̂x 󵄨 ̂ 󵄨p 󵄨 ̂y 󵄨p 󵄨 󵄨 ̂ 󵄨p 󵄨 ̂y 󵄨p 󵄨 󵄨󵄨Y t − Y t 󵄨󵄨󵄨 = 󵄨󵄨󵄨Y tx 󵄨󵄨󵄨 ⋅ 󵄨󵄨󵄨Y t 󵄨󵄨󵄨 ⋅ 󵄨󵄨󵄨|X tx | − |X t |󵄨󵄨󵄨 ⩽ 󵄨󵄨󵄨Y tx 󵄨󵄨󵄨 ⋅ 󵄨󵄨󵄨Y t 󵄨󵄨󵄨 ⋅ 󵄨󵄨󵄨X tx − X t 󵄨󵄨󵄨 ,



and applying Hölder’s inequality two times we get ̂y ̂y 󵄨p y 1/2 1/4 1/4 󵄨 𝔼 [󵄨󵄨󵄨Y t̂x − Y t 󵄨󵄨󵄨 ] ⩽ ( 𝔼 [|Y t̂x |4p ]) ( 𝔼 [|Y t |4p ]) ( 𝔼 [|X tx − X t |2p ]) .

From Theorem 21.28 and Remark 21.29.a) we infer

̂y 󵄨p 󵄨 𝔼 [󵄨󵄨󵄨Y t̂x − Y t 󵄨󵄨󵄨 ] ⩽ c p,T (1 + |x|)−p (1 + |y|)−p |x − y|p ⩽ c p,T |x̂ − ̂y|p .

In the last estimate we use the inequality

󵄨󵄨 x |x − y| y 󵄨󵄨󵄨 󵄨 ⩽ 󵄨󵄨󵄨 2 − 2 󵄨󵄨󵄨 , (1 + |x|)(1 + |y|) 󵄨󵄨 |x| |y| 󵄨󵄨

x, y ∈ ℝn .

The case where x̂ = 0 is similar but easier as we use only (21.38) from Remark 21.29.a). The Kolmogorov–Chentsov theorem, Theorem 10.1, shows that x̂ ↦ Y_t^x̂ is continuous at x̂ = 0, and the claim follows.

21.32 Further reading. The classic book by [98] is still a very good treatise on the L² theory of strong solutions. Weak solutions and extensions of existence and uniqueness results are studied in [136] or [113]. For degenerate SDEs you should consult [77]. The dependence on the initial value and more advanced flow properties can be found in [152] for processes with continuous paths and, for jump processes, in [153]. The viewpoint from random dynamical systems is explained in [3].

[3] Arnold: Random Dynamical Systems.
[77] Engelbert, Cherny: Singular Stochastic Differential Equations.
[98] Gikhman, Skorokhod: Stochastic Differential Equations.
[113] Ikeda, Watanabe: Stochastic Differential Equations and Diffusion Processes.
[136] Karatzas, Shreve: Brownian Motion and Stochastic Calculus.
[152] Kunita: Stochastic Flows and Stochastic Differential Equations.
[153] Kunita: Stochastic differential equations based on Lévy processes and stochastic flows of diffeomorphisms.

Problems

1. Let X = (X_t)_{t⩾0} be the process from Example 21.2. Show that X is a Gaussian process with independent increments and find C(s, t) = 𝔼 X_s X_t, s, t ⩾ 0.
   Hint. Let 0 = t₀ < t₁ < ⋅⋅⋅ < t_n = t. Find the characteristic function of the random variables (X_{t₁} − X_{t₀}, ..., X_{t_n} − X_{t_{n−1}}) and (X_{t₁}, ..., X_{t_n}).

2. Let (B(t))_{t∈[0,1]} be a BM¹ and set t_k = k/2ⁿ for k = 0, 1, ..., 2ⁿ and ∆t = 2⁻ⁿ. Consider the difference equation

    ∆X_n(t_k) = −½ X_n(t_k)∆t + ∆B(t_k),   X_n(0) = A,

   where ∆X_n(t_k) = X_n(t_k + ∆t) − X_n(t_k) and ∆B(t_k) = B(t_k + ∆t) − B(t_k) for 0 ⩽ k ⩽ 2ⁿ − 1 and A ∼ N(0, 1) is independent of (B_t)_{t∈[0,1]}.

Ex. 21.13

   a) Find an explicit formula for X_n(t_k), determine the distribution of X_n(1/2) and the asymptotic distribution as n → ∞.

   b) Letting n → ∞ turns the difference equation into the SDE

       dX(t) = −½ X(t) dt + dB(t),   X(0) = A.

      Solve this SDE and determine the distribution of X(1/2) as well as the correlation function C(s, t) = 𝔼 X_s X_t, s, t ∈ [0, 1].

3. Let B_t = (b_t, β_t) be a BM². Solve the SDE

    X_t = x + b ∫₀ᵗ X_s ds + σ₁ ∫₀ᵗ X_s db_s + σ₂ ∫₀ᵗ X_s dβ_s

   with x, b ∈ ℝ and σ₁, σ₂ > 0.
   Hint. Rewrite the SDE using the result of Problem 19.3 and then apply Example 21.4.

4. Show that in Example 21.7

t

t

X ∘t = exp ( − ∫(β(s) − δ2 (s)/2) ds) exp ( − ∫ δ(s) dB s ), 0

0

and verify the expression given for dX ∘t . Moreover, show that

5.

Xt =

t

t

0

0

1 (X0 + ∫(α(s) − γ(s)δ(s))X ∘s ds + ∫ γ(s)X ∘s dB s ). X ∘t 1

Let (B t )t⩾0 be a BM . The solution of the SDE

dX t = −βX t dt + σdB t ,

X0 = ξ

with β > 0 and σ ∈ ℝ is an Ornstein–Uhlenbeck process. a) Find X t explicitly.

b) Determine the distribution of X t . Is (X t )t⩾0 a Gaussian process? Find the correlation function C(s, t) = 𝔼 X s X t , s, t ⩾ 0.

c) Determine the asymptotic distribution μ of X t as t → ∞.

d) Assume that the initial condition ξ is a random variable which is independent of (B t )t⩾0 , and assume that ξ ∼ μ with μ from above. Find the characteristic function of (X t1 , . . . , X t n ) for all 0 ⩽ t1 < t2 < ⋅ ⋅ ⋅ < t n < ∞. e) Show that the process (U t )t⩾0 , where U t :=

σ √2β

e−βt B(e2βt − 1), has the same

finite dimensional distributions as (X t )t⩾0 with initial condition X0 = 0. Compare this with the stationary OU process from Problem 2.11.


6.


Verify the claim made in Example 21.9 using Itô’s formula. Derive from the proof of Lemma 21.8 explicitly the form of the transformation and the coefficients in Example 21.9. Hint. Integrate the condition (21.13) and retrace the steps of the proof of Lemma 21.8 from the end to the beginning.

7.

Let (B t )t⩾0 be a BM1 . Find an SDE which has X t = tB t , t ⩾ 0, as unique solution.

8. Let (B t )t⩾0 be a BM1 . Find for each of the following processes an SDE which is solved by it: (a)

9.

Ut =

Bt ; 1+t

(b)

V t = sin B t ;

(c)

Let (B t )t⩾0 be a BM1 . Solve the following SDEs a) dX t = b dt + σX t dB t , X0 = x0 and b, σ ∈ ℝ;

Xt a cos B t ( )=( ), Yt b sin B t

a, b ≠ 0.

b) dX t = (m − X t ) dt + σ dB t , X0 = x0 and m, σ > 0.

10. Let (B t )t⩾0 be a BM1 . Use Lemma 21.10 to find the solution of the following SDE: dX t = (√1 + X 2t +

1 2

X t ) dt + √1 + X 2t dB t .

11. Show that the constant M in (21.18) can be chosen in the following way: n

n

d

M 2 ⩾ 2L2 + 2 ∑ sup |b j (t, 0)|2 + 2 ∑ ∑ sup |σ jk (t, 0)|2 j=1 t⩽T

j=1 k=1 t⩽T

12. The linear growth of the coefficients is essential for Corollary 21.31. a) Consider the case where d = n = 1, b(x) = −e x and σ(x) = 0. Find the solution of this deterministic ODE and compare your findings for x → +∞ with Corollary 21.31

b) Assume that |b(x)| + |σ(x)| ⩽ M for all x. Find a simpler proof of Corollary 21.31. 󵄨󵄨 t 󵄨󵄨 Hint. Find an estimate for ℙ (󵄨󵄨󵄨∫0 σ(X sx ) dB s 󵄨󵄨󵄨 > r). 󵄨 󵄨

c) Assume that |b(x)| + |σ(x)| grow, as |x| → ∞, like |x|1−ϵ for some ϵ > 0. Show that one can still find a (relatively) simple proof of Corollary 21.31.

13. Show for all x, y ∈ ℝn the inequality

󵄨 󵄨 |x − y|(1 + |x|)−1 (1 + |y|)−1 ⩽ 󵄨󵄨󵄨󵄨|x|−2 x − |y|−2 y󵄨󵄨󵄨󵄨 .

Hint. Use a direct calculation after squaring both sides.

14. Let (B t )t⩾0 be a BM1 and b(x), σ(x) autonomous and globally Lipschitz continuous coefficients. We have seen in Corollary 21.24 that the solution of the stochastic differential equation dX t = b(X t ) dt + σ(X t ) dB t is a Markov process. Recall that Cb (ℝ) and C∞ (ℝ) are the bounded continuous functions and the continuous functions vanishing at infinity.

a) Find the transition semigroup T t of this Markov process and show that T t maps Cb (ℝ) into itself.

b) Use Corollary 21.31 to show that T t maps C∞ (ℝ) into itself.

c) Find the infinitesimal generator A of T t on C2c (ℝ). Show that A has an extension onto C2b (ℝ). Hint. Use Itô’s formula.

d) Prove the following generalization of Dynkin’s formula: if b, σ are bounded and if τ is a stopping time with 𝔼x τ < ∞ and u ∈ C1,2 b , u = u(t, x), then τ

𝔼x [u(τ, X τ )] − u(0, x) = 𝔼x [∫ (∂ t u(s, X s ) + A x u(s, X s )) ds] . [0 ]

22 Stratonovich's stochastic calculus

Itô's integral does not obey the usual rules of differential calculus. This is sometimes disturbing, and a small variation in the definition of the stochastic integral can change this. We will, however, lose the property that the stochastic integral is a martingale. A good starting point is Problem 15.19 (see also Example 15.17) where you have shown that

    ∑_{t_{l−1}, t_l ∈ Π} B_{θ_l} (B_{t_l} − B_{t_{l−1}})  −−−→  ½ (B_T² + (2α − 1)T)   as |Π| → 0,

with a partition Π = {0 = t_0 < t_1 < ⋅⋅⋅ < t_N = T} and θ_l = t_{l−1} + α(t_l − t_{l−1}) for some α ∈ [0, 1]. This means that the character of the integral depends on the supporting point θ_l ∈ [t_{l−1}, t_l] where we evaluate the integrand. For α = 0 we get the left endpoint and Itô's integral; for α = ½ we get the midpoint of the interval and an integral that reproduces the usual calculus rules, but we lose the martingale property. This observation is due to Stratonovich, and we will now discuss the so-called Stratonovich integral and its relation to Itô's integral.
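This dependence on the supporting point is easy to observe numerically. The following sketch is ours and not part of the original text; the grid size N, the number of sample paths M and the seed are arbitrary illustrative choices. It simulates Brownian paths on a refined grid (so that the midpoints are actual sample points) and compares the Riemann-type sums for α = 0, ½, 1 with ½(B_T² + (2α − 1)T).

```python
import numpy as np

# Monte Carlo illustration of the limit
#   sum_l B_{theta_l} (B_{t_l} - B_{t_{l-1}})  ->  (B_T**2 + (2*alpha - 1)*T) / 2
# for alpha = 0 (Ito), 1/2 (Stratonovich) and 1.  Illustrative sketch only:
# N (partition size) and M (number of sample paths) are arbitrary choices.
rng = np.random.default_rng(seed=1)
T, N, M = 1.0, 1000, 2000
dt = T / (2 * N)                      # the refined grid also contains the midpoints

for alpha in (0.0, 0.5, 1.0):
    errors = np.empty(M)
    for m in range(M):
        B = np.concatenate(([0.0], np.cumsum(rng.normal(0.0, np.sqrt(dt), 2 * N))))
        left, mid, right = B[0:-1:2], B[1:-1:2], B[2::2]   # B_{t_{l-1}}, B at midpoint, B_{t_l}
        theta = {0.0: left, 0.5: mid, 1.0: right}[alpha]
        riemann_sum = np.sum(theta * (right - left))
        errors[m] = riemann_sum - 0.5 * (B[-1] ** 2 + (2 * alpha - 1) * T)
    print(f"alpha = {alpha}: mean error = {errors.mean():+.4f}, std = {errors.std():.4f}")
```

For each α the mean error is close to zero and the fluctuations shrink as the partition is refined, in line with the limit stated above.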

22.1 The Stratonovich integral

We begin with the following stronger version of Lemma 18.6.

22.1 Lemma. Let (B_t)_{t⩾0} be a one-dimensional Brownian motion, b, c, σ, τ ∈ L²_T, and let

    X_t := X_0 + ∫_0^t σ(s) dB_s + ∫_0^t b(s) ds,   Y_t := Y_0 + ∫_0^t τ(s) dB_s + ∫_0^t c(s) ds

be the corresponding Itô processes. Denote by Π = {0 = t_0 < t_1 < ⋅⋅⋅ < t_N = T} a generic partition of the interval [0, T]. Then

    ∑_{l=1}^N g(X_{t_{l−1}}) (B_{t_l} − B_{t_{l−1}})  −−−→  ∫_0^T g(X_s) dB_s   as |Π| → 0                            (22.1)

holds for all g ∈ C_b(ℝ). Moreover, for all g ∈ C_b(ℝ),

    ∑_{l=1}^N g(ξ_l) (Y_{t_l} − Y_{t_{l−1}})²  −−−→  ∫_0^T g(X_s) τ²(s) ds   as |Π| → 0                                (22.2)

    ∑_{l=1}^N g(ξ_l) (Y_{t_l} − Y_{t_{l−1}})(B_{t_l} − B_{t_{l−1}})  −−−→  ∫_0^T g(X_s) τ(s) ds   as |Π| → 0            (22.3)

where ξ_l(ω) = X_{t_{l−1}}(ω) + θ_l(ω)(X_{t_l}(ω) − X_{t_{l−1}}(ω)), 0 ⩽ θ_l ⩽ 1, l = 1, . . . , N, are intermediate values such that g(ξ_l) is measurable.



Proof. Note that Y ± B are again Itô processes with the coefficients τ ± 𝟙_{[0,T]} and c. Thus, (22.3) follows from (22.2) by polarization:

    4(Y_t − Y_s)(B_t − B_s) = ((Y + B)_t − (Y + B)_s)² − ((Y − B)_t − (Y − B)_s)²

and

    4τ(s) = (τ(s) + 𝟙_{[0,T]}(s))² − (τ(s) − 𝟙_{[0,T]}(s))²,   s ∈ [0, T].

For simple coefficients τ, c ∈ S_T, (22.2) is exactly (18.11) of Lemma 18.6. For general τ, c ∈ L²_T we pick some approximating sequences (τ_∆)_∆, (c_∆)_∆ ⊂ S_T, indexed by the partitions ∆ of the simple processes. We set Y_t^∆ := Y_0 + ∫_0^t τ_∆(s) dB_s + ∫_0^t c_∆(s) ds. For any fixed ∆ we may assume that ∆ ⊂ Π. Then

    ∑_{l=1}^N g(ξ_l)(Y_{t_l} − Y_{t_{l−1}})² − ∫_0^T g(X_s) τ²(s) ds
      = ∑_{l=1}^N g(ξ_l)(Y_{t_l} − Y_{t_{l−1}} − Y_{t_l}^∆ + Y_{t_{l−1}}^∆)²
      + ∑_{l=1}^N g(ξ_l)(Y_{t_l}^∆ − Y_{t_{l−1}}^∆)² − ∫_0^T g(X_s) τ_∆²(s) ds                                        (22.4)
      + 2 ∑_{l=1}^N g(ξ_l)(Y_{t_l} − Y_{t_{l−1}} − Y_{t_l}^∆ + Y_{t_{l−1}}^∆)(Y_{t_l}^∆ − Y_{t_{l−1}}^∆)
      + ∫_0^T g(X_s) τ_∆²(s) ds − ∫_0^T g(X_s) τ²(s) ds
      =: J_1(g) + [ J_2(g) − ∫_0^T g(X_s) τ_∆²(s) ds ] + J_3(g) + J_4(g).

Note that |J_k(g)| ⩽ ‖g‖_∞ J_k(1) for k = 1, 2. By definition, and with the elementary inequality (a + b)² ⩽ 2(a² + b²),

    J_1(1) = ∑_{l=1}^N ( ∫_{t_{l−1}}^{t_l} (τ(s) − τ_∆(s)) dB_s + ∫_{t_{l−1}}^{t_l} (c(s) − c_∆(s)) ds )²
           ⩽ 2 ∑_{l=1}^N [ ( ∫_{t_{l−1}}^{t_l} (τ(s) − τ_∆(s)) dB_s )² + ( ∫_{t_{l−1}}^{t_l} (c(s) − c_∆(s)) ds )² ].

Taking expectations, we get with Itô's isometry, the Cauchy–Schwarz inequality and |Π| := max_{1⩽l⩽N}(t_l − t_{l−1})

    𝔼 |J_1(1)| ⩽ 2 𝔼 ( ∫_0^T (τ(s) − τ_∆(s))² ds ) + 2|Π| 𝔼 ( ∫_0^T (c(s) − c_∆(s))² ds ).

Since τ_∆ and c_∆ approximate τ and c, respectively, we see that lim_{|Π|→0} 𝔼 |J_1(g)| = ϵ(∆) where lim_{|∆|→0} ϵ(∆) = 0. In particular, lim_{|∆|→0} lim_{|Π|→0} J_1(g) = 0 in probability. A similar, but simpler, calculation shows that

    𝔼 |J_2(1)| ⩽ 2 𝔼 ( ∫_0^T τ_∆²(s) ds ) + 2|Π| 𝔼 ( ∫_0^T c_∆²(s) ds ) ⩽ M < ∞.

Since |Π| ⩽ T and (τ_∆)_∆, (c_∆)_∆ approximate τ and c, respectively, the bound M can be chosen independently of Π and ∆. Moreover, for fixed ∆, we see from (18.11) that lim_{|Π|→0} J_2(g) = ∫_0^T g(X_s) τ_∆²(s) ds in probability. By the Cauchy–Schwarz inequality we get, uniformly in Π,

    𝔼 |J_3(g)| ⩽ 2 ‖g‖_∞ √(𝔼 |J_1(1)|) √(𝔼 |J_2(1)|) ⩽ 2 ‖g‖_∞ √(ϵ(∆)) √M  −−−→  0   as |∆| → 0.

Finally, lim_{|∆|→0} J_4(g) = 0 in L¹ and in probability by the construction of the sequence (τ_∆)_∆. If we combine the results for J_1(g), . . . , J_4(g) and let in (22.4) first |Π| → 0, and then |∆| → 0, we get (22.2).

22.2 Theorem. Let X_t := X_0 + ∫_0^t σ(s) dB_s + ∫_0^t b(s) ds be a one-dimensional Itô process with b, σ ∈ L²_T. Denote by Π = {0 = t_0 < t_1 < ⋅⋅⋅ < t_N = T} a generic partition of the interval [0, T]. Then the limit

    ∑_{l=1}^N g(½X_{t_{l−1}} + ½X_{t_l}) (B_{t_l} − B_{t_{l−1}})  −−−→  ∫_0^T g(X_s) dB_s + ½ ∫_0^T g′(X_s) σ(s) ds   in probability as |Π| → 0   (22.5)

exists for all g ∈ C¹_b(ℝ).

Proof. For s < t and some intermediate value ξ between X_s and ½(X_s + X_t), we find by Taylor's formula

    g(½X_s + ½X_t)(B_t − B_s) = g(X_s)(B_t − B_s) + ½ g′(ξ)(X_t − X_s)(B_t − B_s).

The integral form of the remainder term in Taylor's formula reveals that ω ↦ g′(ξ(ω)) is measurable, compare Step 1° in the proof of Theorem 18.1. Thus, the assertion follows from Lemma 22.1.

22.3 Definition (Stratonovich 1964, 1966). Let X_t := X_0 + ∫_0^t σ(s) dB_s + ∫_0^t b(s) ds be a one-dimensional Itô process with b, σ ∈ L²_T. Let Π = {0 = t_0 < t_1 < ⋅⋅⋅ < t_N = T} be a generic partition of the interval [0, T] and g : [0, T] × ℝ → ℝ a continuous function. If the limit

    ℙ-lim_{|Π|→0} ∑_{l=1}^N g(t_{l−1}, ½X_{t_{l−1}} + ½X_{t_l}) (B_{t_l ∧ t} − B_{t_{l−1} ∧ t}),   t ∈ [0, T],   (22.6)

exists in probability, it is called the Stratonovich or symmetrized stochastic integral.
Notation. ∫_0^t g(s, X_s) ∘dB_s; "∘" is often called Stratonovich's circle.

With the usual stopping techniques, see Chapter 16, it is not hard to see that the results of this section still hold for b, σ ∈ L²_{T,loc}.

22.4 Relation of Itô and Stratonovich integrals. Let X_t = X_0 + ∫_0^t σ(s) dB_s + ∫_0^t b(s) ds be an Itô process with σ, b ∈ L²_T.
a) Theorem 22.2 shows that for every g ∈ C¹_b(ℝ) which does not depend on t the Stratonovich integral exists and

    ∫_0^t g(X_s) ∘dB_s = ∫_0^t g(X_s) dB_s + ½ ∫_0^t g′(X_s) σ(s) ds.

With a bit more effort, we can show that the Stratonovich integral exists for time-dependent g : [0, T] × ℝ → ℝ such that g(⋅, x), g(t, ⋅) and ∂_x g(t, ⋅) are bounded continuous functions; in this case

    ∫_0^t g(s, X_s) ∘dB_s = ∫_0^t g(s, X_s) dB_s + ½ ∫_0^t ∂_x g(s, X_s) σ(s) ds.   (22.7)

For a proof we refer to Stratonovich [246] and [247, Chapter 2.1]. Using the stopping technique of Step 4° in the proof of Theorem 18.1, we can remove the boundedness assumption on g and its derivative. Thus, (22.7) remains true for g ∈ C^{0,1}([0, T] × ℝ).
b) The correction term which describes the relation between Itô's and Stratonovich's integral in (22.7) can be formally written as

    ∂_x g(t, X_t) σ(t) dt = dB_t ∙ dg(t, X_t)

or

    ∫_0^t ∂_x g(s, X_s) σ(s) ds = ⟨ ∫_0^• ∂_x g(s, X_s) dB_s , X ⟩_t ,

where ⟨I, X⟩ denotes the quadratic covariation, cf. (15.8) and Theorem 17.15.

22.5 Remark. a) The Stratonovich integral satisfies the classical chain rule. Indeed, if f ∈ C²(ℝ) and dX_t = σ(t) dB_t + b(t) dt is an Itô process, we get

    df(X_t) = f′(X_t) dX_t + ½ f″(X_t) (dX_t)²                              (by 18.13)
            = f′(X_t)σ(t) dB_t + f′(X_t)b(t) dt + ½ f″(X_t)σ²(t) dt
            = f′(X_t)σ(t) ∘dB_t + f′(X_t)b(t) dt                            (by 22.7)
            = f′(X_t) ∘dX_t,

if we set ∘dX_t := σ(t) ∘dB_t + b(t) dt. This is in line with the traditional calculus rules: if

    dx(t) = σ(t) dβ(t) + b(t) dt   or   ẋ(t) = σ(t)β̇(t) + b(t)

for some continuously differentiable function β(t) with β(0) = 0, then

    df(x(t)) = f′(x(t)) dx(t) = f′(x(t))σ(t) dβ(t) + f′(x(t))b(t) dt.

Substituting x(t) ↝ X_t and dβ(t) ↝ ∘dB_t shows that both expressions coincide formally.

b) For an applied scientist or an engineer it is natural to sample a Brownian motion at the points Π = {0 = t_0 < t_1 < ⋅⋅⋅ < t_n = T} and to think of Brownian motion as a piecewise linear interpolation (B_t^Π)_{t∈[0,T]} of the points (t_l, B_{t_l})_{t_l∈Π}. This is indeed a good approximation of a Brownian path which converges uniformly almost surely and in L²(ℙ) to (B_t)_{t∈[0,T]}, see e.g. the Lévy–Ciesielski construction of Brownian motion in Chapter 3.2 or Problem 3.3, but with stochastic integrals some caution is needed. Since t ↦ B_t^Π is for any partition piecewise differentiable, we get

    ∫_0^T f(B_t^Π(ω)) dB_t^Π(ω) = ∫_0^T f(B_t^Π(ω)) (dB_t^Π(ω)/dt) dt = F(B_t^Π(ω))|_0^T = F(B_T^Π(ω)),

where F is a primitive of f such that F(0) = 0. This shows that the limit

    lim_{|Π|→0} ∫_0^T f(B_t^Π) dB_t^Π = lim_{|Π|→0} F(B_T^Π) = F(B_T)

exists a.s. On the other hand, if F ∈ C²(ℝ), Itô's formula (18.1) yields

    F(B_T) = ∫_0^T f(B_t) dB_t + ½ ∫_0^T f′(B_t) dt = ∫_0^T f(B_t) ∘dB_t,   by (22.7),

which means that, in the limit, the integrals ∫_0^T f(B_t^Π(ω)) dB_t^Π(ω) lead to a Stratonovich integral and not to an Itô integral.
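The gap between the two limits is visible already for very simple integrands. The sketch below is ours (not from the original text); it uses the illustrative choice f(x) = 2x, F(x) = x², for which the integral along the piecewise linear interpolation equals F(B_T) = B_T², whereas the left-point (Itô) sums converge to B_T² − T.

```python
import numpy as np

# Sketch (illustrative choices): with f(x) = 2x, F(x) = x**2, the integral along the
# piecewise linear interpolation of a Brownian path equals F(B_T) = B_T**2, while the
# left-point (Ito) Riemann sums converge to B_T**2 - T.  Grid/sample sizes are arbitrary.
rng = np.random.default_rng(seed=2)
T, N, M = 1.0, 2000, 2000

gaps = np.empty(M)
for m in range(M):
    B = np.concatenate(([0.0], np.cumsum(rng.normal(0.0, np.sqrt(T / N), N))))
    ito_sum = np.sum(2.0 * B[:-1] * np.diff(B))   # left-point (Ito) Riemann sum of f(B) dB
    gaps[m] = ito_sum - B[-1] ** 2                # compare with F(B_T), the piecewise linear answer
print(f"mean gap = {gaps.mean():+.4f} (should be close to -T = {-T}), std = {gaps.std():.4f}")
```

The systematic gap of size −T is exactly the Itô–Stratonovich correction term from (22.7).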

22.2 Solving SDEs with Stratonovich's calculus

The observation from Remark 22.5.a) that the chain rule for Stratonovich integrals coincides (formally) with the classical chain rule can be used to find solutions to certain stochastic differential equations.

22.6 Some heuristic considerations. Let (B_t)_{t⩾0} be a BM¹ and consider the following SDE

    dX_t = σ(t, X_t) dB_t + b(t, X_t) dt,   X_0 = x_0.   (22.8)

We assume that it has a unique solution, cf. Chapter 21.5 for sufficient conditions.
1° Find the corresponding Stratonovich SDE: Using (22.7) we get

    dX_t = σ(t, X_t) ∘dB_t − ½ σ(t, X_t) ∂_x σ(t, X_t) dt + b(t, X_t) dt,   X_0 = x_0.   (22.9)

2° Find the corresponding ODE: Replace in (22.9) X_t by x(t) and ∘dB_t by β̇(t) dt, where β is assumed to be continuously differentiable and β(0) = 0. This gives

    ẋ(t) = σ(t, x(t)) β̇(t) − ½ σ(t, x(t)) ∂_x σ(t, x(t)) + b(t, x(t)),   x(0) = x_0.   (22.10)

3° Find the solution to the ODE: x(t) = u(β(t), ⊛) and re-substitute β(t) ↝ B_t and x(t) ↝ X_t.

Obviously, Step 3° is problematic since we need to make sure that the ODE has a unique solution, and we need to know the quantities, symbolically denoted by "⊛", which determine this solution. Before we give a general result in this direction, let us study a few examples.

22.7 Example. Let (B_t)_{t⩾0} be a BM¹ and assume that σ ∈ C²_b(ℝ) is either strictly positive or strictly negative. Then the SDE

    dX_t = σ(X_t) dB_t + ½ σ(X_t)σ′(X_t) dt,   X_0 = x_0,

has a unique solution given by X_t = u(B_t), where

    u(y) = g^{−1}(y)   and   g = ∫ dx/σ(x)   such that   g(x_0) = 0.

Proof. Without loss of generality we assume that σ > 0. Since σ, σ′ and σ″ are bounded and continuous, it is obvious that the coefficients of the SDE satisfy the usual Lipschitz and linear growth conditions from Chapter 21.5. Thus, there exists a unique solution. Rather than verifying directly that the expression given in the example is a solution, let us show how to derive it with the Stratonovich chain rule as in § 22.6; since we use exclusively the chain rule, this derivation already proves that the expression we obtain is indeed the solution. The corresponding Stratonovich SDE with initial condition X_0 = x_0 is

    dX_t = σ(X_t) ∘dB_t − ½ σ(X_t)σ′(X_t) dt + ½ σ(X_t)σ′(X_t) dt = σ(X_t) ∘dB_t,

and the corresponding ODE is

    ẋ(t) = σ(x(t)) β̇(t),   x(0) = x_0, β(0) = 0.

Since σ is continuously differentiable, we can use separation of variables to get the unique solution of the ODE:

    β(t) = ∫_0^t β̇(s) ds = ∫_{x_0}^{x(t)} dy/σ(y) = g(x(t)) − g(x_0).

Without loss we may assume that the primitive satisfies g(x_0) = 0. Since σ > 0, we can invert g and get x(t) = g^{−1}(β(t)), or X_t = u(B_t) with u = g^{−1}.
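To see Example 22.7 in action, one can take σ(x) = √(1 + x²); this concrete choice is ours and, strictly speaking, σ is not bounded, so it lies slightly outside the stated assumptions, but the same computation goes through. Then σ(x)σ′(x) = x, g(x) = arsinh x − arsinh x_0 and u(y) = sinh(y + arsinh x_0), so the SDE dX_t = √(1 + X_t²) dB_t + ½X_t dt should be solved by X_t = sinh(B_t + arsinh x_0). A minimal Euler–Maruyama sketch comparing the two pathwise (step size, horizon and x_0 are arbitrary choices):

```python
import numpy as np

# Euler-Maruyama for  dX = sqrt(1 + X**2) dB + 0.5 * X dt  versus the closed form
# X_t = sinh(B_t + arsinh(x0)) suggested by Example 22.7 with sigma(x) = sqrt(1 + x**2).
# (This sigma is an illustrative, unbounded choice; all numerical parameters are arbitrary.)
rng = np.random.default_rng(seed=3)
T, N, x0 = 1.0, 100_000, 0.5
dt = T / N

B, X = 0.0, x0
for _ in range(N):
    dB = rng.normal(0.0, np.sqrt(dt))
    X += np.sqrt(1.0 + X**2) * dB + 0.5 * X * dt   # Euler-Maruyama step
    B += dB

exact = np.sinh(B + np.arcsinh(x0))
print(f"Euler-Maruyama: {X:.4f},   closed form sinh(B_T + arsinh x0): {exact:.4f}")
```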

Example 22.7 works because the drift term of the SDE has a very special structure. The next example shows that we can introduce small perturbations.

22.8 Example. Let (B_t)_{t⩾0} be a BM¹, a : [0, ∞) → ℝ be integrable, and assume that σ ∈ C²_b(ℝ) is either strictly positive or strictly negative. Then the SDE

    dX_t = σ(X_t) dB_t + ( a(t)σ(X_t) + ½ σ(X_t)σ′(X_t) ) dt,   X_0 = x_0,

has a unique solution of the form X_t = u(B_t + A(t)), where

    u(y) = g^{−1}(y),   g = ∫ dx/σ(x),   A = ∫ a(s) ds   and   g(x_0) = A(0) = 0.

Proof. As in Example 22.7 we see that there is a unique solution and that the associated Stratonovich SDE and ODE are given by

    dX_t = a(t)σ(X_t) dt + σ(X_t) ∘dB_t,
    ẋ(t) = a(t)σ(x(t)) + σ(x(t)) β̇(t).

Again by separation of variables we get

    β(t) = ∫_0^t β̇(s) ds = ∫_{x_0}^{x(t)} dy/σ(y) − ∫_0^t a(s) ds = g(x(t)) − g(x_0) − A(t) + A(0).

Without loss we may assume that the primitives satisfy A(0) = 0 = g(x_0) and σ > 0. Therefore, we can invert g and get x(t) = g^{−1}(β(t) + A(t)), or X_t = u(B_t + A(t)) with u = g^{−1}.

The simple arguments in Examples 22.7 and 22.8 are due to the fact that the associated ODE can be explicitly solved by separation of variables, and that the solution of the SDE is, essentially, a function of the driving Brownian motion X_t = u(B_t) sampled at a single instance. The following theorem extends these examples; the price we have to pay is that u depends on B_t and on another "control" process Y_t, and that u is no longer explicit. On the other hand, it is good to know that the solution X_t is a (still rather simple) function of the driving noise.

22.9 Theorem (Doss 1978, Sussmann 1977). Let (B_t)_{t⩾0} be a BM¹, σ ∈ C²(ℝ) with σ′, σ″ ∈ C_b(ℝ), and b : ℝ → ℝ be globally Lipschitz continuous. Then the SDE

    dX_t = σ(X_t) dB_t + ( b(X_t) + ½ σ(X_t)σ′(X_t) ) dt,   X_0 = x_0,   (22.11)

has a unique solution given by

    X_t(ω) = u(B_t(ω), Y_t(ω)),   t ∈ [0, ∞), ω ∈ Ω,

where u : ℝ² → ℝ is a continuous function, and the control process (Y_t(ω))_{t⩾0} is, for each ω ∈ Ω, the solution of an ODE.

Proof. By assumption, the coefficients b and σ satisfy the usual (global) Lipschitz conditions needed for the existence and uniqueness of the solution to (22.11), see Chapter 21.5. Using (22.7) we rewrite the SDE (22.11) as a Stratonovich SDE:

    dX_t = σ(X_t) ∘dB_t + b(X_t) dt,   X_0 = x_0.

The corresponding ODE is

    ẋ(t) = σ(x(t)) β̇(t) + b(x(t)),   x(0) = x_0.

Since the coefficients are Lipschitz continuous, it has a unique solution. In order to find the solution, we make the Ansatz x(t) = u(β(t), y(t)), where y(t) and β(t) are differentiable with β(0) = 0 and u : ℝ² → ℝ is sufficiently regular. By the chain rule

    ẋ(t) = ∂_x u(β(t), y(t)) β̇(t) + ∂_y u(β(t), y(t)) ẏ(t),

and, comparing this with the previous formula for ẋ(t), we get with u = u(β(t), y(t))

    ∂_x u = σ(u)   and   ẏ(t) = b(u)/∂_y u.   (22.12)

From ∂_y ∂_x u = ∂_y σ(u) = σ′(u) ∂_y u we get

    ∂_y u(x, y) = ∂_y u(0, y) exp( ∫_0^x σ′(u(z, y)) dz ),   (22.13)

which means that ∂_y u(0, y) = 1, i.e. u(0, y) = y, is a very convenient choice of the initial condition in (22.12); it allows us to solve the ODE ∂_x u = σ(u) uniquely (σ is globally Lipschitz continuous). In order to solve the second ODE in (22.12) we need that

    f(x, y) := b(u(x, y))/∂_y u(x, y)

is, for fixed x, Lipschitz continuous and grows at most linearly. This we will check in Lemma 22.10 below. The correct initial condition is fixed by

    x_0 = u(β(0), y(0)) = u(0, y(0)) = y(0).

Thus, there is a unique solution y(t) of

    ẏ(t) = f(β(t), y(t)),   y(0) = x_0,

and this proves that x(t) = u(β(t), y(t)) solves the associated ODE. Finally, X_t(ω) = u(B_t(ω), Y_t(ω)) solves the Stratonovich SDE, hence (22.11); for fixed ω the control process is given by Ẏ_t(ω) = f(B_t(ω), Y_t(ω)) and Y_0(ω) = x_0.
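Before turning to Lemma 22.10, here is a small numerical illustration (ours, not part of the original text) of the Doss–Sussmann construction. We take the illustrative choices σ(x) = √(1 + x²) and b(x) = −x, so that (22.11) reads dX = √(1 + X²) dB − ½X dt; then u(x, y) = sinh(x + arsinh y) solves ∂_x u = σ(u), u(0, y) = y, and ∂_y u(x, y) = √(1 + u²)/√(1 + y²). Note that this σ is not bounded, so it is slightly outside the stated assumptions.

```python
import numpy as np

# Doss-Sussmann construction vs. direct Euler-Maruyama; illustrative choices (ours):
#   sigma(x) = sqrt(1 + x^2),  b(x) = -x,  so (22.11) reads dX = sqrt(1+X^2) dB - 0.5*X dt.
# u(x, y) = sinh(x + arsinh y) solves du/dx = sigma(u), u(0, y) = y, and the control ODE is
#   dY/dt = f(B_t, Y_t)  with  f(x, y) = b(u(x, y)) * sqrt(1 + y^2) / sqrt(1 + u(x, y)^2).
rng = np.random.default_rng(seed=4)
T, N, x0 = 1.0, 50_000, 0.5
dt = T / N

u = lambda x, y: np.sinh(x + np.arcsinh(y))
f = lambda x, y: -u(x, y) * np.sqrt(1.0 + y**2) / np.sqrt(1.0 + u(x, y)**2)

B, Y, X = 0.0, x0, x0
for _ in range(N):
    dB = rng.normal(0.0, np.sqrt(dt))
    X += np.sqrt(1.0 + X**2) * dB - 0.5 * X * dt   # Euler-Maruyama for the Ito SDE
    Y += f(B, Y) * dt                               # explicit Euler for the control ODE
    B += dB

print(f"Euler-Maruyama X_T = {X:.4f},   Doss-Sussmann u(B_T, Y_T) = {u(B, Y):.4f}")
```

Both values should agree up to the discretization error; the second representation uses the Brownian path only through B_t and the pathwise-defined control Y_t.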


22.10 Lemma. The function y ↦ f(x, y) := b(u(x, y))/∂_y u(x, y) appearing in the proof of Theorem 22.9 is, for fixed x, locally Lipschitz continuous and grows at most linearly.

Proof. Set c := ‖σ′‖_∞ + ‖σ″‖_∞. By assumption, c < ∞ and ∂_y u(0, y) = 1, so

    e^{−c|x|} ⩽ 1/∂_y u(x, y) = exp( −∫_0^x σ′(u(z, y)) dz ) ⩽ e^{c|x|},   x ∈ ℝ.

If L is the (global) Lipschitz constant of b, we get for x, y, y′ ∈ ℝ

    |b(u(x, y)) − b(u(x, y′))| ⩽ L |u(x, y) − u(x, y′)| ⩽ L e^{c|x|} |y − y′|.

Moreover, by the mean value theorem

    | 1/∂_y u(x, y) − 1/∂_y u(x, y′) | ⩽ ‖ ∂_y ( 1/∂_y u(x, ⋅) ) ‖_∞ |y − y′|.

From the explicit formula (22.13) for ∂_y u(x, y) we see

    | ∂_y ( 1/∂_y u(x, y) ) | = | ∫_0^x σ″(u(z, y)) ∂_y u(z, y) dz | / ∂_y u(x, y) ⩽ e^{c|x|} c ∫_0^{|x|} e^{cz} dz.

Finally, since |∂_y u(x, y)| ⩾ e^{−c|x|} > 0 for every x, we see that 1/∂_y u(x, y) is bounded as a function of y. Thus, f(x, y) is the product of two globally Lipschitz continuous functions, one of which is bounded. As such it is locally Lipschitz continuous and grows at most linearly.

22.11 Further reading. A good discussion of the Stratonovich integral and its applications can be found in [2]. The most comprehensive source for approximations of stochastic integrals and SDEs, including Stratonovich integrals and SDEs, is [113, Chapter VI.7, pp. 480–517]. The original papers by Wong and Zakai [271; 272; 273] are written very intuitively and still make for formidable reading.
[2] Arnold: Stochastic Differential Equations: Theory and Applications.
[113] Ikeda, Watanabe: Stochastic Differential Equations and Diffusion Processes.
[247] Stratonovich: Conditional Markov Processes and Their Application to the Theory of Optimal Control.
[271] Wong, Zakai: On the convergence of ordinary integrals to stochastic integrals.
[272] Wong, Zakai: On the relation between ordinary and stochastic differential equations.
[273] Wong, Zakai: Riemann–Stieltjes approximations of stochastic integrals.

Problems

1. Show that Remark 22.5.a) remains valid if we replace f(x) by f(s, x).

2. Show that Example 22.7 remains valid if we assume that σ(0) = 0 and σ(x) > 0 for x ≠ 0.

3. Let f, g : ℝ → ℝ be two globally Lipschitz continuous functions. If f is bounded, then h(x) := f(x)g(x) is locally Lipschitz continuous and at most linearly growing.

23 On diffusions

Diffusion is a physical phenomenon which describes the tendency of two (or more) substances, e.g. gases or liquids, to reach an equilibrium: particles of one type, say B, move or "diffuse" in(to) another substance, say S, leading to a more uniform distribution of the type B particles within S. Since the particles are physical objects, it is reasonable to assume that their trajectories are continuous (in fact, differentiable) and that the movement is influenced by various temporal and spatial (in-)homogeneities which depend on the nature of the substance S.
Diffusion phenomena are governed by Fick's law. Denote by p = p(t, x) the concentration of the B particles at a space-time point (t, x), and by J = J(t, x) the flow of particles. Then

    J = −D ∂p/∂x   and   ∂p/∂t = −∂J/∂x,

where D is called the diffusion constant which takes into account the geometry and the properties of the substances B and S. Einstein explained in 1905 the particle movement observed by Brown – under the assumption that the movement is temporally and spatially homogeneous – as a diffusion phenomenon:

    ∂p(t, x)/∂t = D ∂²p(t, x)/∂x²,   hence   p(t, x) = (1/√(4πDt)) e^{−x²/(4Dt)}.

The diffusion coefficient D = RT/(N · 6πkP) depends on the absolute temperature T, the universal gas constant R, Avogadro's number N, the friction coefficient k and the radius P of the particles (i.e. atoms). If the diffusion coefficient depends on time and space, D = D(t, x), Fick's law leads to a differential equation of the type

    ∂p(t, x)/∂t = D(t, x) ∂²p(t, x)/∂x² + ( ∂D(t, x)/∂x ) ( ∂p(t, x)/∂x ).
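That the Gaussian density really solves the constant-coefficient equation is quickly confirmed symbolically; the following short sketch (ours, using sympy) checks that p(t, x) = (4πDt)^{−1/2} e^{−x²/(4Dt)} satisfies ∂_t p = D ∂²_x p.

```python
import sympy as sp

# Symbolic check (illustrative sketch): the Gaussian kernel solves the diffusion equation
#   dp/dt = D * d^2 p / dx^2 .
t, x, D = sp.symbols("t x D", positive=True)
p = sp.exp(-x**2 / (4 * D * t)) / sp.sqrt(4 * sp.pi * D * t)

residual = sp.diff(p, t) - D * sp.diff(p, x, 2)
print(sp.simplify(residual))   # prints 0
```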

In a mathematical model of a diffusion we could either use a macroscopic approach and model the particle density p(t, x), or take a microscopic point of view, modelling the movement of the particles themselves and determining the particle density from it. Using probability theory we can describe the (random) position and trajectories of the particles by a stochastic process. In view of the preceding discussion it is reasonable to require that the stochastic process
 – has continuous trajectories t ↦ X_t(ω);
 – has a generator which is a second-order differential operator.
Nevertheless there is no standard definition of a diffusion process. Depending on the model, one will prefer one assumption over the other or add further requirements. Throughout this chapter we use the following definition.


23.1 Definition. A diffusion process is a Feller process (X_t)_{t⩾0} with values in ℝ^d, continuous trajectories and an infinitesimal generator (A, D(A)) such that C_c^∞(ℝ^d) ⊂ D(A) and, for u ∈ C_c^∞(ℝ^d),

    Au(x) = L(x, D)u(x) = ½ ∑_{i,j=1}^d a_{ij}(x) ∂²u(x)/∂x_i∂x_j + ∑_{j=1}^d b_j(x) ∂u(x)/∂x_j.   (23.1)

The symmetric, positive semidefinite matrix a(x) = (a_{ij}(x))_{i,j=1}^d ∈ ℝ^{d×d} is called the diffusion matrix, and b(x) = (b_1(x), . . . , b_d(x)) ∈ ℝ^d is called the drift vector. Since (X_t)_{t⩾0} is a Feller process, A maps C_c^∞(ℝ^d) into C_∞(ℝ^d); therefore a(⋅) and b(⋅) have to be continuous functions.

lim

h→0

1 sup sup ℙx (|X t − x| > δ) = 0, h t⩽h x∈E

(23.2)

then there is a modification of (X t )t⩾0 which has continuous trajectories in E.

The continuity of the trajectories implies that the generator is a local operator, cf. Theorem 7.38. If the state space E is compact, (an obvious modification of) Theorem 7.39 shows that the locality of the generator is equivalent to ∀δ > 0

1 sup ℙx (|X t − x| > δ) = 0. t→0 t x∈E

lim

(23.3)

Since h−1 P(h) ⩽ h−1 supt⩽h P(t) ⩽ supt⩽h t−1 P(t) we see that (23.2) and (23.3) are equivalent conditions. We might apply this to (X t )t⩾0 with paths in E = ℝd ∪ {∂}, but the trouble is that the continuity of t 󳨃→ X t in ℝd and ℝd ∪ {∂} are quite different notions: the latter allows for explosion in finite time as well as sample paths coming back from ∂, while continuity in ℝd means infinite life time: ℙx (∀t ⩾ 0 : X t ∈ ℝd ) = 1.

23.1 Kolmogorov’s theory |

429

23.1 Kolmogorov’s theory Kolmogorov’s seminal paper [146] Über die analytischen Methoden der Wahrscheinlichkeitsrechnung marks the beginning of the mathematically rigorous theory of stochastic processes in continuous time. In §§ 13, 14 of this paper Kolmogorov develops the foundations of the theory of diffusion processes, and we will now follow Kolmogorov’s ideas. 23.4 Theorem (Kolmogorov 1933). Let (X t )t⩾0 , X t = (X 1t , . . . , X td ), be a Feller process with values in ℝd such that for all δ > 0 and i, j = 1, . . . , d 1 lim sup ℙx (|X t − x| > δ) = 0, (23.4a) t→0 t x∈ℝd 1 j (23.4b) lim 𝔼x ((X t − x j ) 𝟙{|X t −x|⩽δ} ) = b j (x), t→0 t 1 j lim 𝔼x ((X ti − x i )(X t − x j ) 𝟙{|X t −x|⩽δ} ) = a ij (x). (23.4c) t→0 t Then (X t )t⩾0 is a diffusion process in the sense of Definition 23.1.

Proof. The Dynkin–Kinney criterion, see the discussion in Remark 23.2, shows that (23.4a) guarantees the continuity of the trajectories. d d is a differential We are going to show that C∞ c (ℝ ) ⊂ D(A) and that A|C∞ c (ℝ ) operator of the form (23.1). The proof is similar to Example 7.9. d Let u ∈ C∞ c (ℝ ). Since ∂ j ∂ k u is uniformly continuous, ∀ϵ > 0

∃δ > 0 : max sup |∂ i ∂ j u(x) − ∂ i ∂ j u(y)| ⩽ ϵ. 1⩽i,j⩽d |x−y|⩽δ

Fix ϵ > 0, choose δ > 0 as above and set T t u(x) := 𝔼x (u(X t )), Ω δ := {|X t − x| ⩽ δ}. Then T t u(x) − u(x) 1 x 1 = 𝔼 ([u(X t ) − u(x)] 𝟙Ω δ ) + 𝔼x ([u(X t ) − u(x)] 𝟙Ω cδ ) . t t t =J

=J󸀠

It is enough to show that limt→0 (T t u(x) − u(x))/t = Au(x) holds for every x ∈ ℝd , all d u ∈ C∞ c (ℝ ), and the operator A from (23.1), see Theorem 7.22. The second term can be estimated by

(23.4a) 1 x ℙ (|X t − x| > δ) 󳨀󳨀󳨀󳨀󳨀→ 0. t→0 t Now we consider J. By Taylor’s formula there is some θ ∈ (0, 1) such that for the intermediate point Θ = (1 − θ)x + θX t

|J󸀠 | ⩽ 2 ‖u‖∞ d

j

u(X t ) − u(x) = ∑ (X t − x j ) ∂ j u(x) + j=1

+

1 d j ∑ (X − x j )(X tk − x k ) ∂ j ∂ k u(x) 2 j,k=1 t

1 d j ∑ (X − x j )(X tk − x k ) (∂ j ∂ k u(Θ) − ∂ j ∂ k u(x)). 2 j,k=1 t

430 | 23 On diffusions Denote the terms on the right-hand side by J1 , J2 and J3 , respectively. Since ∂ j ∂ k u is uniformly continuous, the elementary inequality 2|ab| ⩽ a2 + b2 yields 1 x ϵ d 1 x j 𝔼 (|J3 | 𝟙Ω δ ) ⩽ ∑ 𝔼 [|(X ti − x i )(X t − x j )| 𝟙Ω δ ] t 2 i,j=1 t ⩽

Moreover,

d (23.4c) ϵd ϵd d 1 x j ∑ 𝔼 [(X t − x j )2 𝟙Ω δ ] 󳨀󳨀󳨀󳨀󳨀󳨀→ ∑ a jj (x). t→0 2 j=1 t 2 j=1

1 d 1 x 1 x j 𝔼 (J2 𝟙Ω δ ) = 𝔼 [(X t − x j )(X tk − x k )𝟙Ω δ ] ∂ j ∂ k u(x) ∑ t 2 j,k=1 t (23.4c)

󳨀󳨀󳨀󳨀󳨀→ t→0

and

1 d ∑ a ij (x) ∂ i ∂ j u(x), 2 i,j=1

d d (23.4b) 1 x 1 j 𝔼 (J1 𝟙Ω δ ) = ∑ 𝔼x [(X t − x j )𝟙Ω δ ] ∂ j u(x) 󳨀󳨀󳨀󳨀󳨀→ ∑ b j (x) ∂ j u(x). t→0 t t j=1 j=1

Since ϵ > 0 is arbitrary, the claim follows.

If (X t )t⩾0 has a transition density, i.e. if ℙx (X t ∈ A) = ∫A p(t, x, y) dy for all t > 0 and A ∈ B(ℝd ), then we can express the dynamics of the movement as a Cauchy problem. Depending on the point of view, we get Kolmogorov’s first and second differential equation, cf. [146, §§ 12–14]. 23.5 Proposition (backward equation. Kolmogorov 1931). Let (X t )t⩾0 denote a diffusion d is given by (23.1) with bounded drift and with generator (A, D(A)) such that A|C∞ c (ℝ ) diffusion coefficients b ∈ Cb (ℝd , ℝd ) and a ∈ Cb (ℝd , ℝd×d ). Assume that (X t )t⩾0 has a transition density p(t, x, y), t > 0, x, y ∈ ℝd , satisfying p,

p(t, ⋅, ⋅),

∂p ∂p ∂2 p , , ∈ C((0, ∞) × ℝd × ℝd ), ∂t ∂x j ∂x j ∂x k

∂p(t, ⋅, ⋅) ∂p(t, ⋅, ⋅) ∂2 p(t, ⋅, ⋅) , , ∈ C∞ (ℝd × ℝd ) ∂t ∂x j ∂x j ∂x k

for all t > 0. Then Kolmogorov’s first or backward equation holds:

∂p(t, x, y) = L(x, D x )p(t, x, y) for all t > 0, x, y ∈ ℝd . ∂t

(23.5)

Proof. Denote by L(x, D x ) the second-order differential operator given by (23.1). By d assumption, A = L(x, D) on C∞ c (ℝ ). Since p(t, x, y) is the transition density, we know that the transition semigroup is given by T t u(x) = ∫ p(t, x, y)u(y) dy.

23.1 Kolmogorov’s theory |

431

Using dominated convergence it is not hard to see that the conditions on p(t, x, y) d 2 d are such that T t maps C∞ c (ℝ ) to C∞ (ℝ ). Since the coefficients b and a are bounded, ‖Au‖∞ ⩽ κ‖u‖(2) , where we use the norm ‖u‖(2) = ‖u‖∞ + ∑j ‖∂ j u‖∞ + ∑j,k ‖∂ j ∂ k u‖∞ in C2∞ (ℝd ), and κ = κ(b, a) is some d constant. Since C∞ c (ℝ ) ⊂ D(A) and (A, D(A)) is closed, cf. Corollary 7.11, it follows 2 d that C∞ (ℝ ) ⊂ D(A) and A|C2∞ (ℝd ) = L(x, D)|C2∞ (ℝd ) . d Therefore, Lemma 7.10 shows that dt T t u = AT t u = L(⋅, D)T t u. Thus,

Ex. 23.2 Ex. 23.3

∂ ∫ p(t, x, y)u(y) dy = L(x, D x ) ∫ p(t, x, y)u(y) dy ∂t

= ∫ L(x, D x )p(t, x, y)u(y) dy,

d for all u ∈ C∞ c (ℝ ). The last equality is a consequence of the continuity and smoothness assumptions on p(t, x, y) which allow us to use the differentiation lemma for parameter dependent integrals, see e.g. [232, Theorem 12.5]. Applying it to the t-derivative on the left-hand side, we get

∫(

∂p(t, x, y) d − L(x, D x )p(t, x, y)) u(y) dy = 0 for all u ∈ C∞ c (ℝ ) ∂t

which implies (23.5).

If L(x, D x ) is a second-order differential operator of the form (23.1), then the formal adjoint operator is L∗ (y, D y )u(y) =

Ex. 23.4

d ∂2 ∂ 1 d (a ij (y)u(y)) − ∑ (b j (y)u(y)) ∑ 2 i,j=1 ∂y i ∂y j ∂y j j=1

provided that the coefficients a and b are sufficiently smooth.

23.6 Proposition (forward equation. Kolmogorov 1931). Let (X t )t⩾0 denote a diffusion d is given by (23.1) with drift and diffusion with generator (A, D(A)) such that A|C∞ c (ℝ ) coefficients b ∈ C1 (ℝd , ℝd ) and a ∈ C2 (ℝd , ℝd×d ). Assume that X t has a probability density p(t, x, y) satisfying p,

∂2 p ∂p ∂p , , ∈ C((0, ∞) × ℝd × ℝd ). ∂t ∂y j ∂y j ∂y k

Then Kolmogorov’s second or forward equation holds: ∂p(t, x, y) = L∗ (y, D y )p(t, x, y) ∂t

for all t > 0, x, y ∈ ℝd .

(23.6)

Proof. This proof is similar to the proof of Proposition 23.5, where we use Lemma 7.10 d d T t u = T t Au = T t L(⋅, D)u for u ∈ C∞ in the form dt c (ℝ ), and then integration by parts in the resulting integral equality. The fact that we do not have to interchange the integral and the operator L(x, D) as in the proof of Proposition 23.5, allows us to relax the boundedness assumptions on p, b and a, but we pay a price in the form of differentiability assumptions for the drift and diffusion coefficients.

Ex. 23.5

432 | 23 On diffusions If a and b do not depend on x, the transition probability p(t, x, y) depends only on the difference x − y, and the corresponding stochastic process is spatially homogeneous, essentially a Brownian motion with a drift: aB t + bt. Kolmogorov [146, § 16] mentions that in this case the forward equation (23.6) was discovered (but not rigorously proved) by Bachelier [6; 7]. Often it is called Fokker–Planck equation since it also appears in the context of statistical physics, cf. Fokker [86] and Planck [207]. In order to understand the names “backward” and ‘´forward” we have to introduce the concept of a fundamental solution of a parabolic PDE. 23.7 Definition. Let L(s, x, D) = ∑di,j=1 a ij (s, x)∂ i ∂ j + ∑dj=1 b j (s, x)∂ j + c(s, x) be a second-order differential operator on ℝd where (a ij (s, x)) ∈ ℝd×d is symmetric and positive semidefinite. A fundamental solution of the parabolic differential op∂ + L(s, x, D) on the set [0, T] × ℝd is a function Γ(s, x; t, y) defined for all erator ∂s (s, x), (t, y) ∈ [0, T] × ℝd with s < t such that for every f ∈ Cc (ℝd ) the function u(s, x; t, f) := ∫ Γ(s, x; t, y)f(y) dy

satisfies −

ℝd

∂u(s, x; t, f) = L(s, x, D)u(s, x; t, f) for all x ∈ ℝd , s < t ⩽ T, ∂s lim u(s, x; t, f) = f(x) for all x ∈ ℝd . s↑t

(23.7a) (23.7b)

Proposition 23.5 shows that Γ(s, x; t, y) := p(t − s, x, y), s < t, is the fundamental ∂ solution of the parabolic differential operator L(x, D) + ∂s , whereas Proposition 23.6 means that Γ(t, y; s, x) := p(s − t, y, x), t < s is the fundamental solution of the (formal) ∂ adjoint operator L∗ (y, D) − ∂t . The backward equation has the time derivative at the left end of the time interval, and we solve the PDE in a forward direction with a condition at the right end: ∂ Γ(s, x; t, y) = −L(s, x, D x )Γ(s, x; t, y) and ∂s

lim Γ(s, x; t, y) = δ x (dy).

∂ Γ(s, x; t, y) = L∗ (t, y, D y )Γ(s, x; t, y) ∂t

lim Γ(s, x; t, y) = δ y (dx).

s↑t

In the forward equation the time derivative takes place at the right end of the time interval, and we solve the PDE backwards with an initial condition: and

t↓s

If we write Γ s,t f(x) := u(s, x; t, f) := ∫ Γ(s, x; t, y)f(y) dy the backward, resp., forward equations take the following form: backward: forward:

∂ Γ s,t f(x) = −L(s, x, D x )Γ s,t f(x) ∂s ∂ Γ s,t f(x) = Γ s,t L(t, x, D x )f(x) ∂t

and and

Γ t,t f(x) = f(x),

Γ s,s f(x) = f(x).

23.1 Kolmogorov’s theory |

ℝ𝑑

ℝ𝑑 𝐶

𝑥 0

433

𝑡



𝐶

𝑥 0

Fig. 23.1: Derivation of the backward and forward Kolmogorov equations.

𝑡−ℎ

𝑡

In the time-homogeneous case we have Γ(s, x; t, y) = p(t−s, x, y), so that the difference between the two equations is not really obvious. At a technical level, the proofs of Propositions 23.5 and 23.6 reveal that all depends on the identity (7.10): in the backward d d case we use dt T t = AT t , in the forward case dt T t = T t A. Using probability theory we can give the following stochastic interpretation. Run the stochastic process in the time interval [0, t]. Using the Markov property we can perturb the movement at the beginning, i.e. on [0, h] and then let it run from time h to t, or we could run it up to time t − h, and then perturb it at the end in [t − h, t]. In the first or backward case, the Markov property gives for all C ∈ B(ℝd ) ℙx (X t ∈ C) − ℙx (X t−h ∈ C) = 𝔼x ℙX h (X t−h ∈ C) − ℙx (X t−h ∈ C) = (T h − id)ℙ(⋅) (X t−h ∈ C)(x).

If ℙx (X t ∈ C) = ∫C p(t, x, y) dy with a sufficiently smooth density, we get p(t, x, y) − p(t − h, x, y) T h − id = p(t − h, ⋅, y)(x) h h

∂ p(t, x, y) = A x p(t, x, y), i.e. the backward equation. ∂t In the second or forward case we get

which, as h → 0, yields

ℙx (X t ∈ C) − ℙx (X t−h ∈ C) = 𝔼x ℙX t−h (X h ∈ C) − ℙx (X t−h ∈ C) = 𝔼x ((T h − id)𝟙C (X t−h )).

Passing to the transition densities and dividing by h, we get

and h → 0 yields

∗ p(t, x, y) − p(t − h, x, y) T h − id = p(t − h, x, ⋅)(y) h h

∂ p(t, x, y) = A∗y p(t, x, y) (T t∗ , A∗ are the formally adjoint operators). ∂t

The obvious question is, of course, which conditions on b and a ensure that there is a d (unique) diffusion process such that the generator, restricted to C∞ c (ℝ ), is a second

434 | 23 On diffusions order differential operator (23.1). If we can show that L(x, D) generates a Feller semigroup or has a nice fundamental solution for the corresponding forward or backward equation for L(x, D), then we could use Kolmogorov’s construction, cf. Section 4.2 or Remark 7.6, to construct the corresponding process. One possibility is to use the Hille–Yosida theorem (Theorem 7.19 and Lemma 7.21); the trouble is then to verify condition c) from Theorem 7.19 which amounts to finding a solution to an elliptic partial differential equation of the type λu(x) − L(x, D)u(x) = f(x) for f ∈ C∞ (ℝd ). In a non-Hilbert space setting, this is not an easy task. Another possibility would be to construct a fundamental solution for the backward equation. Here is a standard result in this direction. For a proof we refer to Friedman [91, Chapter I.6]. 23.8 Theorem. Let L(x, D) = ∑di,j=1 a ij (x)∂ i ∂ j + ∑dj=1 b j (x)∂ j + c(x) be a differential operator on ℝd such that d

∑ a ij (x)ξ i ξ j ⩾ κ |ξ|2 for some constant κ and all x, ξ ∈ ℝd ,

i,j=1

a ij (⋅), b j (⋅), c(⋅) ∈ Cb (ℝd ) are locally Hölder continuous.

Then there exists a fundamental solution Γ(s, x; t, y) such that ∂2 Γ ∂Γ ∂Γ , and are continuous in (s, x; t, y); 󳶳 Γ, ∂s ∂x j ∂x j ∂x k

󳶳

󳶳

󳶳

2

Γ(s, x; t, y) ⩽ c(t − s)−d/2 e−C|x−y|

/(t−s)

,

2 ∂Γ(s, x; t, y) ⩽ c(t − s)−(d+1)/2 e−C|x−y| /(t−s) ; ∂x j ∂Γ(s, x; t, y) − = L(x, D x )Γ(s, x; t, y); ∂s

lim ∫ Γ(s, x; t, y)f(x) dx = f(y) for all f ∈ Cb (ℝd ). s↑t

23.2 Itô’s theory It is possible to extend Theorem 23.8 to increasing and degenerate coefficients, and we refer the interested reader to the exposition in Stroock and Varadhan [249, Chapter 3]. Here we will follow Itô’s idea to use stochastic differential equations (SDEs) and construct a diffusion directly. In Chapter 21 we have discussed the uniqueness, existence and regularity of SDEs of the form dX t = b(X t ) dt + σ(X t ) dB t ,

X0 = x,

(23.8)

23.2 Itô’s theory |

435

for a BMd (B t )t⩾0 and coefficients b : ℝd → ℝd , σ : ℝd → ℝd×d . The key ingredient was the following global Lipschitz condition d

d

j=1

j,k=1

∑ |b j (x) − b j (y)|2 + ∑ |σ jk (x) − σ jk (y)|2 ⩽ C2 |x − y|2

If we take y = 0, (23.9) entails the linear growth condition.

for all x, y ∈ ℝd .

(23.9)

23.9 Theorem. Let (B t )t⩾0 be a BMd and (X tx )t⩾0 be the solution of the SDE (23.8). Assume that the coefficients b and σ satisfy the global Lipschitz condition (23.9). Then (X tx )t⩾0 is a diffusion process in the sense of Definition 23.1 with the infinitesimal generator d ∂2 u(x) ∂u(x) 1 d + ∑ b j (x) , ∑ a ij (x) 2 i,j=1 ∂x i ∂x j j=1 ∂x j

L(x, D)u(x) =

where a(x) = σ(x)σ⊤ (x).

Proof. The uniqueness, existence, and Markov property of (X tx )t⩾0 follow from Corollary 21.12, Theorem 21.13, and Corollary 21.24, respectively. The (stochastic) integral is, as a function of its upper limit, continuous, cf. Theorem 15.15 or Theorem 16.8, and so t 󳨃→ X tx (ω) is continuous. From Corollary 21.31 we know that lim|x|→∞ |X tx | = ∞ for every t > 0. Therefore, we conclude with the dominated convergence theorem that x 󳨃→ T t u(x) := 𝔼 u(X tx ) is in C∞ (ℝd ) for all t ⩾ 0 and u ∈ C∞ (ℝd ).

The operators T t are clearly sub-Markovian and contractive. Let us show that (T t )t⩾0 is strongly continuous, hence a Feller semigroup. Take u ∈ C2c (ℝd ) and apply Itô’s formula, Theorem 18.8. Then t

d

u(X tx )

− u(x) = ∑

d

∫ σ jk (X sx )∂ j u(X sx ) dB ks

j,k=1 0

+

t

t

+ ∑ ∫ b j (X sx )∂ j u(X sx ) ds j=1 0

1 d ∑ ∫ a ij (X sx )∂ i ∂ j u(X sx ) ds. 2 i,j=1 0

Since the first term on the right is a martingale, its expectation vanishes, and we get |T t u(x) − u(x)| = | 𝔼 u(X tx ) − u(x)|

󵄨󵄨 t 󵄨󵄨 t 󵄨󵄨 󵄨󵄨 d 󵄨 󵄨 󵄨 1 d 󵄨 ⩽ ∑ 𝔼 󵄨󵄨󵄨󵄨 ∫ b j (X sx )∂ j u(X sx ) ds󵄨󵄨󵄨󵄨 + ∑ 𝔼 󵄨󵄨󵄨󵄨 ∫ a ij (X sx )∂ i ∂ j u(X sx ) ds󵄨󵄨󵄨󵄨 2 󵄨 󵄨 󵄨󵄨 󵄨 󵄨 󵄨 󵄨 i,j=1 j=1 0 0 d

d

j=1

i,j=1

⩽ t ∑ ‖b j ∂ j u‖∞ + t ∑ ‖a ij ∂ i ∂ j u‖∞ 󳨀󳨀󳨀→ 0 t→0

Ex. 21.14

436 | 23 On diffusions Approximate u ∈ C∞ (ℝd ) by a sequence (u n )n⩾0 ⊂ C2c (ℝd ). Then

‖T t u − u‖∞ ⩽ ‖T t (u − u n )‖∞ + ‖T t u n − u n ‖∞ + ‖u − u n ‖∞ ⩽ 2‖u − u n ‖∞ + ‖T t u n − u n ‖∞ .

Letting first t → 0, and then n → ∞, proves strong continuity on C∞ (ℝd ). We can now determine the generator of the semigroup. Because of Theorem 7.22 we need to calculate only the limit limt→0 (T t u(x) − u(x))/t for each x ∈ ℝd . Using Itô’s formula as before, we see that 𝔼 u(X tx ) − u(x) t

t

t

1 1 d 1 = ∑ 𝔼 ( ∫ b j (X sx )∂ j u(X sx ) ds) + ∑ 𝔼 ( ∫ a ij (X sx )∂ i ∂ j u(X sx ) ds) t 2 t j=1 i,j=1 d

0

0

󳨀󳨀󳨀→ ∑ b j (x)∂ j u(x) + t→0

1 ∑ a ij (x)∂ i ∂ j u(x). 2 i,j=1 d

d

j=1

For the limit in the last line we use the dominated convergence theorem and the fact t that t−1 ∫0 ϕ(s) ds → ϕ(0+) as t → 0.
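The limit just computed can also be explored numerically. The sketch below is ours (not from the text); it uses d = 1 with the illustrative choices b(x) = −x, σ(x) = 1 + ½ cos x and u(x) = e^{−x²}, and compares the Monte Carlo difference quotient (𝔼 u(X_t^x) − u(x))/t for a small t with L u(x) = ½σ²(x)u″(x) + b(x)u′(x).

```python
import numpy as np

# Sketch (illustrative choices): compare the Monte Carlo difference quotient
#   (E u(X_t^x) - u(x)) / t   for small t   with   L u(x) = 0.5*sigma(x)**2*u''(x) + b(x)*u'(x),
# where dX = b(X) dt + sigma(X) dB,  b(x) = -x,  sigma(x) = 1 + 0.5*cos(x),  u(x) = exp(-x^2).
rng = np.random.default_rng(seed=9)
b = lambda x: -x
sigma = lambda x: 1.0 + 0.5 * np.cos(x)
u = lambda x: np.exp(-x ** 2)
du = lambda x: -2.0 * x * np.exp(-x ** 2)
d2u = lambda x: (4.0 * x ** 2 - 2.0) * np.exp(-x ** 2)

x, t, n_steps, n_paths = 0.4, 0.02, 100, 400_000
dt = t / n_steps

X = np.full(n_paths, x)
for _ in range(n_steps):                       # Euler-Maruyama up to time t
    X += b(X) * dt + sigma(X) * rng.normal(0.0, np.sqrt(dt), n_paths)

quotient = (u(X).mean() - u(x)) / t
Lu = 0.5 * sigma(x) ** 2 * d2u(x) + b(x) * du(x)
print(f"(E u(X_t) - u(x))/t = {quotient:.4f},   L u(x) = {Lu:.4f}")
```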

Theorem 23.9 shows what we have to do, if we want to construct a diffusion with a given generator L(x, D) of the form (23.1): the drift coefficient b must be globally Lipschitz, and the positive definite diffusion matrix a = (a ij ) ∈ ℝd×d should be of the form a = σσ⊤ where σ = (σ jk ) ∈ ℝd×d is globally Lipschitz continuous. Let M ∈ ℝd×d and set ‖M‖2 := sup|ξ|=1 ⟨M ⊤ Mξ, ξ⟩. Then ‖M‖ is a submultiplicative matrix norm which is compatible with the Euclidean norm on ℝd , i.e. ‖MN‖ ⩽ ‖M‖ ⋅ ‖N‖ and |Mx| ⩽ ‖M‖ ⋅ |x| for all x ∈ ℝd . For a symmetric matrix M we write λmin (M) and λmax (M) for the smallest, and the largest eigenvalue, respectively; we have ‖M‖2 = λmax (MM ⊤ ) and, if M is semidefinite, ‖M‖ = λmax (M).¹ 23.10 Lemma (van Hemmen–Ando 1980). Assume that B, C ∈ ℝd×d are two symmetric, positive semidefinite matrices such that λmin (B) + λmin (C) > 0. Then their unique positive semidefinite square roots satisfy 󵄩 󵄩󵄩√ 󵄩󵄩 C − √B󵄩󵄩󵄩 ⩽ 󵄩 󵄩

‖C − B‖

1/2 1/2 max {λmin (B), λmin (C)}

.

Proof. A simple change of variables shows that ∞

1 2 dr = ∫ , √λ π r2 + λ 0

1 In fact, if M = (m jk ), then ‖M‖2 = ∑jk m2jk .



2 λ dr hence √ λ = ∫ 2 . π r +λ 0

23.2 Itô’s theory |

437

Since B, C are positive semidefinite, we can use this formula to determine the square roots, e.g. for B, ∞

√B = 2 ∫ B(s2 id +B)−1 ds. π 0

Set

R Bs

:=

(s2 id +B)−1

and observe that BR Bs − CR Cs = s2 R Bs (B − C)R Cs . Thus, ∞

√C − √B = 2 ∫ s2 R Cs (C − B)R Bs ds. π 0

Since the matrix norm is submultiplicative and compatible with the Euclidean norm, we find for all ξ ∈ ℝd with |ξ| = 1 using the Cauchy-Schwarz inequality Moreover,

󵄨 󵄨 󵄩 󵄩 󵄩 󵄩 󵄩 󵄩 ⟨R Cs (C − B)R Bs ξ, ξ ⟩ ⩽ 󵄨󵄨󵄨󵄨R Cs (C − B)R Bs ξ 󵄨󵄨󵄨󵄨 ⩽ 󵄩󵄩󵄩󵄩R Bs 󵄩󵄩󵄩󵄩 ⋅ 󵄩󵄩󵄩󵄩R Cs 󵄩󵄩󵄩󵄩 ⋅ 󵄩󵄩󵄩C − B󵄩󵄩󵄩. 1 󵄩󵄩 B 󵄩󵄩 󵄩󵄩R s 󵄩󵄩 = λmax (R Bs ) = . 󵄩 󵄩 s2 + λmin (B)

Since C and B play symmetric roles, we may, without loss of generality, assume that λmin (C) ⩾ λmin (B). Thus, ∞

󵄩󵄩√ 󵄩 2 󵄩 󵄩󵄩 󵄩 󵄩󵄩 C − √B󵄩󵄩󵄩 ⩽ ∫ 󵄩󵄩󵄩R Bs 󵄩󵄩󵄩 󵄩󵄩󵄩R Cs 󵄩󵄩󵄩 s2 ds ⋅ ‖C − B‖ 󵄩 󵄩 π 󵄩 󵄩󵄩 󵄩 =



0 ∞

1 2 s2 ds ⋅ ‖C − B‖ ∫ 2 π s + λmin (B) s2 + λmin (C) 0 ∞

2 1 ds ⋅ ‖C − B‖ ∫ 2 π s + λmin (C) 0

‖C − B‖ = . √ λmin (C)

23.11 Corollary. Let L(x, D) be the differential operator (23.1) and assume that the diffusion matrix a(x) is globally Lipschitz continuous (Lipschitz constant C) and uniformly positive definite, i.e. inf x λmin (a(x)) = κ > 0. Moreover, assume that the drift vector is globally Lipschitz continuous. Then there exists a diffusion process in the sense of d Definition 23.1 such that its generator (A, D(A)) coincides on the test functions C∞ c (ℝ ) with L(x, D).

Proof. Denote by σ(x) the unique positive definite square root of a(x). Using Lemma 23.10 we find ‖σ(x) − σ(y)‖ ⩽

1 C ‖a(x) − a(y)‖ ⩽ |x − y| κ κ

for all x, y ∈ ℝd .

On a finite dimensional vector space all norms are comparable, i.e. (23.9) holds for σ (possibly with a different Lipschitz constant). Therefore, we can apply Theorem 23.9 and obtain the diffusion process (X_t)_{t⩾0} as the unique solution to the SDE (23.8).
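The van Hemmen–Ando bound of Lemma 23.10, which drives this Lipschitz estimate for x ↦ √a(x), is easy to test numerically. The following sketch is ours; it draws random symmetric positive definite matrices and compares ‖√C − √B‖ with ‖C − B‖ / max{λ_min(B), λ_min(C)}^{1/2} in the spectral norm.

```python
import numpy as np

# Numerical check (illustrative sketch) of the van Hemmen-Ando bound from Lemma 23.10:
#   || sqrt(C) - sqrt(B) ||  <=  || C - B || / max(lambda_min(B), lambda_min(C))**0.5
# for random symmetric positive definite matrices, using the spectral norm.
rng = np.random.default_rng(seed=7)

def psd_sqrt(A):
    """Unique positive semidefinite square root via the spectral decomposition."""
    w, V = np.linalg.eigh(A)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T

d = 4
for _ in range(5):
    G, H = rng.normal(size=(d, d)), rng.normal(size=(d, d))
    B = G @ G.T + 0.1 * np.eye(d)          # symmetric, positive definite
    C = H @ H.T + 0.1 * np.eye(d)
    lhs = np.linalg.norm(psd_sqrt(C) - psd_sqrt(B), 2)
    lam = max(np.linalg.eigvalsh(B).min(), np.linalg.eigvalsh(C).min())
    rhs = np.linalg.norm(C - B, 2) / np.sqrt(lam)
    print(f"||sqrt(C) - sqrt(B)|| = {lhs:.3f}  <=  {rhs:.3f}")
```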

It is possible to replace the uniform positive definiteness of the matrix a(x) by the weaker assumption that a ij (x) ∈ C2b (ℝd , ℝ) for all i, j = 1, . . . , d, cf. [113, Proposition IV.6.2] or [249, Theorem 5.2.3]. The square root σ is unique only if we require that it is also positive semidefinite. Let us briefly discuss what happens if σσ⊤ = a = ρρ⊤ where ρ is some d × d matrix which need not be positive semidefinite. Let (σ, b, (B t )t⩾0 ) be as in Theorem 23.9 and denote by (X tx )t⩾0 the unique solution of the SDE (23.8). Assume that (W t )t⩾0 is a further Brownian motion which is independent of (B t )t⩾0 and that (Y tx )t⩾0 is the unique solution of the SDE dY t = b(Y t ) dt + ρ(Y t ) dW t

Y 0 = x ∈ ℝd .

The proof of Theorem 23.9 works literally for (Y tx )t⩾0 since we need only the existence and uniqueness of the solution of the SDE, and the equality ρρ⊤ = a. This means that (X tx )t⩾0 and (Y tx )t⩾0 are two Feller processes whose generators, (A, D(A)) and (C, D(C)), say, coincide on the set C2c (ℝd ). It is tempting to claim that the processes coincide, but for this we need to know that A|C2c = C|C2c implies (A, D(A)) = (C, D(C)). A sufficient criterion would be that C2c is an operator core for A and C, i.e. A and C are the closures of A|C2c and C|C2c , respectively. This is far from obvious. Using the martingale problem technique of Stroock and Varadhan, one can show that – under even weaker conditions than those of Corollary 23.11 – the diffusion process (X tx )t⩾0 is unique in the following sense: every process whose generator coincides on C2c (ℝd ) with L(x, D), has the same finite dimensional distributions as (X tx )t⩾0 . On the other hand, this is the best we can hope for. For example, we have a(x) = σ(x)σ⊤ (x) = ρ(x)ρ⊤ (x)

if

ρ(x) = σ(x)U

for any orthogonal matrix U ∈ ℝd×d . If U is not a function of x, we see that dY t = b(Y t ) dt + ρ(Y t )dB t = b(Y t ) dt + σ(Y t )UdB t

= b(Y t ) dt + σ(Y t )d(UB)t .

Of course, (UB t )t⩾0 is again a Brownian motion, but it is different from the original Brownian motion (B t )t⩾0 . This shows that the solution (Y tx )t⩾0 will, even in this simple case, not coincide with (X tx )t⩾0 in a pathwise sense. Nevertheless, (X tx )t⩾0 and (Y tx )t⩾0 have the same finite dimensional distributions. In other words: we cannot expect that there is a strong one-to-one relation between L and the stochastic process.

23.3 Drift transformations revisited |

439

23.3 Drift transformations revisited We will now continue our discussion on the relation between PDEs and Markov processes which we have started in Chapter 8. Since we can now use SDEs and Itô’s approach to diffusion processes, we can go beyond the examples discussed earlier. In order to illustrate the method, let us treat the simplest case. Consider the following Cauchy problem ∂ 1 u(t, x) = ∆ x u(t, x) + b(x) ⋅ ∇x u(t, x) ∂t 2 u(0, x) = f(x).

(23.10a) (23.10b)

(b ⋅ ∇x u = ∑dj=1 b j ∂ j u, ∂ j := ∂/∂x j ). We are interested in classical solutions, i.e. we assume that u(t, x) is continuous in (t, x), continuously differentiable in t, and twice continuously differentiable in x. We have seen in Section 23.2 that – under suitable conditions on the drift coefficient b(x) – there is a Markov process (X t )t⩾0 which is given by the following Itô SDE dX t = dB t + b(X t ) dt,

X0 = x,

(23.11)

driven by a d-dimensional Brownian motion (B t , Ft )t⩾0 . We assume that the filtration Ft is admissible and complete. The infinitesimal generator of the process (X t )t⩾0 is the operator L = 12 ∆ x + b(x) ⋅ ∇x which appears in (23.10a). A sufficient condition for the existence of a unique (strong) solution of the SDE (23.11) is that the drift coefficient b(x) is Lipschitz continuous and grows at most linearly, cf. Theorem 21.13. We have seen in Example 21.21 that we can relax the conditions on b(x), if we content ourselves with a weak solution (in the sense of Definition 21.18, roughly speaking the driving noise is now part of the solution and not given a priori). The key ingredient is Maruyama’s drift transformation which only requires that b(x) satisfies the conditions needed for the Girsanov transformation, Theorem 19.9; in fact, Zvonkin [281] showed that these solutions are actually strong solutions (d = 1 or b(x) is Dini-continuous if d > 1). The Maruyama–Girsanov transformation will play central role in Theorem 23.13. Since we cannot avoid (global) Lipschitz continuity in the proof of the existence of a solution, cf. Theorem 23.15, we impose straight away the standard growth and Lipschitz conditions on the drift coefficient. We begin with a moment estimate which is particularly simple for SDEs with additive noise, i.e. the coefficient in front of the stochastic differential dB t is constant. For the general case we refer to Theorem 21.28. 23.12 Lemma. Assume that (B t )t⩾0 is a BMd and b : ℝd → ℝd a measurable function which grows at most linearly, i.e. |b(x)| ⩽ c(1 + |x|) holds for all x ∈ ℝd . Any solution (X tx )t⩾0 of the SDE (23.11) has moments of arbitrary order such that 𝔼 [supt⩽T |X tx |p ] ⩽ C T,p (1 + |x|p )

for all x ∈ ℝd , T ⩾ 0, p ⩾ 1.

440 | 23 On diffusions Proof. Throughout the proof we use the shorthand Y t∗ := sups⩽t |Y s |. Since X t := X tx is a solution of the SDE (23.11), the linear growth condition shows X ∗T

󵄨󵄨 󵄨󵄨 t T 󵄨󵄨 󵄨󵄨 󵄨󵄨 󵄨 = sup 󵄨󵄨x + B t + ∫ b(X s ) ds󵄨󵄨󵄨 ⩽ |x| + B∗T + c ∫ (1 + X ∗s ) ds 󵄨󵄨 t⩽T 󵄨󵄨󵄨 󵄨 󵄨󵄨 0 0 T

= |x| +

B∗T

+ cT + c ∫ X ∗s ds. 0

Now we can apply Gronwall’s inequality, Theorem A.42, and get (1),∗

X ∗T ⩽ (cT + |x| + B∗T ) e cT ⩽ c T (1 + |x| + B T (k),∗

By the reflection principle, cf. Theorem 6.10, B T moments of all orders, we see that (1)

(k)

(d),∗

+ ⋅ ⋅ ⋅ + BT

).

∼ |B T |. Since Brownian motion has

𝔼 [(X ∗T )p ] ⩽ c T,p,d (1 + 𝔼 [|B T |p ] + |x|p ) ⩽ C T,p,d (1 + |x|p ),

finishing the proof.

Recall that a function g(x) is said to be polynomially bounded with exponent κ(g) if |g(x)| ⩽ C κ (1 + |x|κ(g) ) for all x and some constant C = C κ .

23.13 Theorem (representation and uniqueness). Let (B t , Ft )t⩾0 be a BMd with an admissible complete filtration and assume that b = (b1 , . . . , b d ) ∈ C(ℝd , ℝd ) is Lipschitz continuous and polynomially bounded with exponent κ(b) < 1, and f ∈ C(ℝd , ℝ). If the initial value problem (23.10) has a solution u(t, x) which is uniformly for t in compact sets [0, T] polynomially bounded in x, then the solution is unique. In this case, it is given by t

t

d 1 (j) u(t, x) = 𝔼 (f(B t ) exp [ ∑ ∫ b j (B s ) dB s − ∫ |b(B s )|2 ds]) . 2 0 ] [j=1 0 x

(23.12)

Proof. Denote by (X t )t⩾0 the unique strong solution to the SDE (23.11) and let u(t, x) be any polynomially bounded solution to the initial value problem (23.10). Applying Itô’s formula to u(T − t, x), t ∈ [0, T), x ∈ ℝd , and for the process (t, X t ), shows u(T − t, X t ) − u(T, X0 ) t

t

t

0

0

0

=

∂ 1 ∫ ∇x u(T − s, X s ) dX s − ∫ u(T − s, X s ) ds + ∫ ∆ x u(T − s, X s ) ds ∂t 2

=

∫ ∇x u(T − s, X s ) dB s

(23.10a)

t

0

23.3 Drift transformations revisited | 441 (j)

(we use the shorthand ∇x u dX s = ∑dj=1 ∂ j u dX s etc.). In this calculation we observe that dX s = dX s dX s = dB s + b(X s ) ds which implies d⟨X⟩s = ds. This shows that (u(T − t, X t ) − u(T, x), Ft )t∈[0,T) is a local martingale for each ℙx .

Let σ n be a suitable localizing sequence and set τ n := inf{s ⩾ 0 : |X s | > n} ∧ σ n . With the optional stopping theorem we see that dom. conv.

u(T, x) = 𝔼x u(T − t ∧ τ n , X t∧τ n ) 󳨀󳨀󳨀󳨀󳨀󳨀󳨀󳨀→ 𝔼x u(0, X T ) n→∞, t↑T

(23.10b)

=

𝔼x f(X T ).

In order to justify the use of dominated convergence, we note that, by assumption, |u(t, x)| ⩽ C(1 + |x|κ ) for some κ ⩾ 0 and a constant C = C κ,T , while X t has moments of all orders, see Lemma 23.12. Under the present assumptions, the representation formula u(t, x) = 𝔼x f(X t ) is a necessary consequence, and so it ensures the uniqueness of the solution. The formula (23.12) follows from u(t, x) = 𝔼x f(X t ) with a Girsanov transform. To prepare the ground, let us first check Novikov’s condition, cf. Theorem 19.4, for the function b. Using the polynomial boundedness of b and the independence of the coordinate processes of a Brownian motion we see that t

2 1 1 𝔼 (exp [ ∫ |b(B s )|2 ds]) ⩽ 𝔼0 (exp [ ct (1 + sup |B s |κ(b) ) ]) 2 2 s⩽t [ 0 ] 0

d 󸀠 1 (j) ⩽ e−c t/2 ∏ 𝔼0 (exp [ c󸀠 t sup |B s |2κ(b) ]) . 2 s⩽t j=1 (j)

(j)

(j)

By the reflection principle and scaling, sups⩽t |B s | ∼ |B t | ∼ √t |B1 |, and so 1 (j) (j) 𝔼0 (exp [ c󸀠 t sup |B s |2κ(b) ]) = 𝔼0 (exp [c󸀠󸀠 t κ(b) |B1 |2κ(b) ]) 2 s⩽t ∞

2 1 = ∫ exp (c󸀠󸀠 t κ(b) |x|2κ(b) ) e−x /2 dx. √2π

−∞

Since κ(b) < 1, the integral on the right-hand side is finite for all t ⩾ 0. We may now apply Girsanov’s transformation. As in Example 21.21 we define t

Y t := B t ,

t

1 M t := exp [∫ b(Y s ) dB s − ∫ |b(Y s )|2 ds] , 2 0 [0 ]

󵄨 󵄨 ℚx 󵄨󵄨󵄨Ft := M t ℙx 󵄨󵄨󵄨Ft .

t 󵄨 Fix T > 0 and set ℚxT := ℚx 󵄨󵄨󵄨FT . By Girsanov’s theorem, W t := B t − ∫0 b(Y s ) ds is a Brownian motion on the space (Ω, ℚxT ), and we see with the definitions from above

dY t = dB t = dW t + b(Y s ) ds,

Y0 = x,

with respect to ℚxT ,

442 | 23 On diffusions i.e. (Y t )t⩽T solves (23.11) weakly, cf. Definition 21.18 (note that the driving Brownian motion is now part of the solution). Since the law of the solution of (23.11) does not depend on the type of the solution (weak vs. strong), we have x x 𝔼ℙx [f(X T )] = 𝔼ℚ [f(Y T )] = 𝔼ℚ [f(B T )] = 𝔼ℙx [f(B T )M T ] .

Because of the definition of M T and the fact that Y t = B t , we get (23.12).
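Formula (23.12) lends itself to a Monte Carlo check: simulate the SDE dX_t = dB_t + b(X_t) dt directly and, independently, reweight plain Brownian paths with the Girsanov density. The sketch below is ours; the choices b(x) = tanh x (bounded, Lipschitz, so κ(b) = 0 < 1), f(x) = cos x and t = 1 are arbitrary, and the stochastic integral is approximated by left-point sums.

```python
import numpy as np

# Monte Carlo sketch (illustrative choices b(x)=tanh(x), f(x)=cos(x), t=1) comparing
#   E^x f(X_t)   for  dX = dB + b(X) dt           (direct Euler-Maruyama simulation)
# with the Girsanov-weighted Brownian expectation (23.12),
#   E^x [ f(B_t) exp( int_0^t b(B_s) dB_s - 1/2 int_0^t b(B_s)^2 ds ) ].
rng = np.random.default_rng(seed=8)
b, f = np.tanh, np.cos
x, t, N, M = 0.3, 1.0, 400, 100_000
dt = t / N

X = np.full(M, x)                      # Euler-Maruyama for the SDE
B = np.full(M, x)                      # Brownian motion started at x
stoch_int = np.zeros(M)                # int_0^t b(B_s) dB_s  (left-point sums)
quad_int = np.zeros(M)                 # int_0^t b(B_s)^2 ds
for _ in range(N):
    dB = rng.normal(0.0, np.sqrt(dt), M)
    X += dB + b(X) * dt
    stoch_int += b(B) * dB
    quad_int += b(B) ** 2 * dt
    B += dB

print(f"direct simulation:    {f(X).mean():.4f}")
print(f"Girsanov reweighting: {(f(B) * np.exp(stoch_int - 0.5 * quad_int)).mean():.4f}")
```

Both estimators target the same quantity u(t, x); their agreement (up to discretization and Monte Carlo error) reflects the equality of the laws of the weak and strong solutions used in the proof.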

23.14 Remark. Sometimes the differential version of the Girsanov transform, which we have used in the proof of Theorem 23.13, is more intuitive. Let us rewrite the argument in this form: ∂t u =

1 ∆ x u + b∇x u 2

associated SDE

󳨀󳨀󳨀󳨀󳨀󳨀󳨀󳨀󳨀󳨀󳨀→ Girsanov

󳨀󳨀󳨀󳨀󳨀󳨀󳨀󳨀󳨀󳨀󳨀󳨀󳨀󳨀󳨀󳨀󳨀󳨀→ Wt =

t B t − ∫0

is a BM

b(X s ) ds

under ℚxT

dX t = dB t + b(X t ) dt, dX t = dB t + b(X t ) dt,

= dW t + b(X t ) dt,

= dB t − b(X t ) dt + b(X t ) dt, = dB t .

[ℙx ]

[ℙx ]

[ℚxT ]

[ℚxT ] [ℚxT ]

The underlying probability measure is indicated in square brackets. This shows, in 󵄨 particular, that X t = B t under the transformed measure ℚxT = M T ℙx 󵄨󵄨󵄨FT .

We will now tend to the problem of the existence of a solution to (23.10). As we have seen in Theorem 23.13 (as well as in Chapter 8), it is quite easy to show uniqueness results and representation formulae with the help of probability theory. Existence theorems are of a different calibre and probability theory does not always give the best results.

23.15 Theorem (existence). Let (B t , Ft )t⩾0 be a BMd with an admissible complete filtration and assume that f ∈ C2 (ℝd , ℝ), b = (b1 , . . . , b d ) ∈ C2b (ℝd , ℝd ) such that f, ∂ j f, ∂ j ∂ k f are polynomially bounded with exponent κ(f) ⩾ 0; moreover ∂ j ∂ k b(x) is assumed to be globally Lipschitz continuous. Then the initial value problem (23.10) has a unique solution u(t, x) which is uniformly for t in compact sets [0, T] polynomially bounded in x. The solution is given by (23.12). Proof. In view of Theorem 23.13 it is enough to show the existence of a polynomially bounded solution. As we know from the analogous problems in Chapter 8, this amounts to showing the regularity of the expression (23.12). Define u(t, x) by (23.12).

1o Let us first prove that u(t, x) is jointly continuous and polynomially bounded. Since u(t, x) = 𝔼 f(X tx ) ⩽ C 𝔼(1 + |X tx |κ(f) ) where X tx is the solution to the SDE (23.11) with initial condition X0x = x, we can use Lemma 23.12 to see that u(t, x) is bounded by C T,κ(f) (1 + |x|κ(f) ) for all x ∈ ℝd and t ∈ [0, T]. Corollary 21.30 shows that (t, x) 󳨃→ X tx (has a modification which) is continuous. Since X tx has all moments, it is a routine exercise to show with the dominated convergence theorem that (t, x) 󳨃→ 𝔼 f(X tx ) is continuous.

23.3 Drift transformations revisited | 443

2o For the first and second derivatives of u(t, x) in x we argue in a similar way, only that we use now Remark 21.29.b) to see that x 󳨃→ X tx is twice continuously differentiable 󵄨p 󵄨 󵄨p 󵄨 and that 𝔼 [󵄨󵄨󵄨∂ j X tx 󵄨󵄨󵄨 ] and 𝔼 [󵄨󵄨󵄨∂ j ∂ k X tx 󵄨󵄨󵄨 ] are polynomially bounded in x (uniformly for all t ∈ [0, T]). Since f and its partial derivatives are polynomially bounded, we can use the differentiation theorem for parameter dependent integrals to see that d

x,(k)

∂ j u(t, x) = 𝔼 [∂ j f(X tx )] = 𝔼 [ ∑ (∂ k f)(X tx )∂ j X t k=1

].

A similar formula holds for the second derivatives ∂ j ∂ k u(t, x), i.e. x 󳨃→ u(t, x) is in C2 (ℝd ).

2o -bis. There is an alternative argument for Step 2o . Since the SDE (23.11) is driven by additive noise, the following observation enables us to avoid using Remark 21.29.b). We have y

X tx (ω) − X t (ω)

t

t y

y

= X0x (ω) + B t (ω) + ∫ b(X sx (ω)) ds − X0 (ω) − B t (ω) − ∫ b(X s (ω)) ds 0

0

t

y

= (x − y) + ∫ (b(X sx (ω)) − b(X s (ω))) ds, 0

and since the integral terms have a meaning for each ω, the difference X tx (ω) − X t (ω) is described by a non-random integral equation. As we are interested in the differentiability in the parameter x, x 󳨃→ X tx , we may replace the SDE (23.11) by the following non-random ODE: d ξ(t, x) = b(ξ(t, x)), dt

y

ξ(0, x) = x.

The dependence on the initial values of such ODEs is well known, see e.g. Coddington & Levinson [40, Theorem I.7.5]: x 󳨃→ ξ(t, x) is a Ck -function if b ∈ Ck ; moreover, the function (t, x) 󳨃→ ∂ αx ξ(t, x), α ∈ ℕ0d , is continuous and ∂ αx ξ(t, x) satisfies the ODE which we obtain by formally differentiating the original ODE. With a little extra work one can show that ∂ αx ξ(t, x) is polynomially bounded by |α| Cb (ℝd , ℝd )

c T,α (1 + if b ∈ such that the derivatives of order |α| are globally Lipschitz continuous. Using the chain rule we can now show, exactly as in Step 2o , that x 󳨃→ 𝔼 f(X tx ) is twice continuously differentiable and t 󳨃→ ∂ αx 𝔼 f(X tx ), |α| = 1, 2, is continuous. |x|κ(α) ),

Ex. 23.8

444 | 23 On diffusions 3o We will now show that t 󳨃→ u(t, x) is continuously differentiable. Since x 󳨃→ u(t, x) = 𝔼 f(X tx ) is a C2 function, we can use Itô’s formula to see for all 0 ⩽ t − h < t ⩽ T x u(t, x) − u(t − h, x) = 𝔼 [f(X tx )] − 𝔼 [f(X t−h )] t

t

1 = 𝔼 [ ∫ (∇f)(X sx ) dX sx ] + 𝔼 [ ∫ (∆f)(X sx ) ds] 2 [t−h ] [t−h ] t

t

1 = 𝔼 [ ∫ (∇f)(X sx ) dB s ] + 𝔼 [ ∫ ( (∆f)(X sx ) + b(X sx )(∇f)(X sx ))] ds 2 [t−h ] [t−h ] t

= 𝔼 [ ∫ (Lf)(X sx ) ds] . [t−h

]

t

For the last equality we have to check that 𝔼 ∫t−h (∇f)(X sx ) dB s = 0. This follows from the estimate |(∇f)(X sx )| ⩽ c(1 + |X sx |κ(f) ) and the fact that X sx has moments of any order, see Lemma 23.12. From this we conclude that t

𝔼 [∫ |(∇f)(X sx )|2 ds] < ∞, [0 ]

i.e. (∇f)(X sx ) ∈ L2T so that ∫0 (∇f)(X sx ) dB s is a martingale. Since s 󳨃→ (Lf)(X sx ) is continuous and bounded by c T (1 + |X sx |κ(f) ), we can use dominated convergence to get t

t

1 1 ∫ 𝔼 [(Lf)(X sx )] ds 󳨀󳨀󳨀→ 𝔼 [(Lf)(X tx )] . (u(t, x) − u(t − h, x)) = h h h↓0 t−h

Using a similar calculation for 0 ⩽ t < t + h ⩽ T shows that ∂ t u(t, x) exists (and is equal to 𝔼 [(Lf)(X tx )]).

4o Up to now we know that u(t, x) is polynomially bounded and of class C^{1,2}. We still have to show that u(t, x) satisfies the PDE (23.10a).² From now on we will freely switch between the Markov and the SDE notation, i.e. we use 𝔼^x f(X_t) = 𝔼 f(X_t^x). From the tower property (TP) and the Markov property (MP) we get for all 0 ⩽ t − h < t ⩽ T

 u(t, x) = 𝔼^x f(X_t) = 𝔼^x[ 𝔼^x( f(X_t) | X_h ) ] = 𝔼^x[ 𝔼^{X_h} f(X_{t−h}) ] = 𝔼^x u(t − h, X_h),  (23.13)

where the second equality uses (TP) and the third uses (MP).

² Denote by T_t f(x) = 𝔼 f(X_t^x) the semigroup of the Markov process. In Step 3o we have seen that ∂_t u(t, x) = 𝔼[(Lf)(X_t^x)] = T_t(Lf)(x), but we need ∂_t u(t, x) = L_x[𝔼 f(X_t^x)] = L(T_t f)(x). You should compare this with the discussion on forward and backward Kolmogorov equations in Chapter 23.1.


If we apply the Itô formula to x ↦ u(t − h, x) and the process X_h = X_h^x, we see³

 u(t − h, X_h) − u(t − h, x)
  = ∫_0^h (∇_x u)(t − h, X_s) · dX_s + ½ ∫_0^h (∆_x u)(t − h, X_s) ds
  = ∫_0^h (∇_x u)(t − h, X_s) · dB_s + ∫_0^h [ b(X_s) · (∇_x u)(t − h, X_s) + ½(∆_x u)(t − h, X_s) ] ds,

and the integrand of the last integral is exactly (L_x u)(t − h, X_s). The first integral on the right-hand side is a martingale with zero mean – this follows from the polynomial boundedness of ∇_x u(t − h, x) –, and so

 𝔼^x u(t − h, X_h) − u(t − h, x) = ∫_0^h 𝔼^x[ (L_x u)(t − h, X_s) ] ds.

We can now combine (23.13) and (23.3), divide by h, and use dominated convergence for the limit h ↓ 0:

 ( u(t, x) − u(t − h, x) )/h = (1/h) ∫_0^h 𝔼^x[ L_x u(t − h, X_s) ] ds → L_x u(t, x).

³ In the following calculations (L_x u)(t, X_s), etc., means that we apply first the operator L = L_x to the x-variable of u(t, x), and then we evaluate at (t, X_s).

23.4 First steps towards the martingale problem

Lévy's celebrated theorem on the characterization of Brownian motion with the help of martingales (Theorem 19.5) states that

 (X_t, F_t)_{t⩾0} is a BM¹ ⟺ (X_t, F_t)_{t⩾0} and (X_t² − t, F_t)_{t⩾0} are continuous martingales.  (23.14)

This theorem still holds in d dimensions, and it is enough to assume that X_t and X_t² − t are local martingales. These generalizations will be a consequence of the discussion below, see Corollary 23.20.
Let us briefly explore the role of the term t in the martingale (X_t² − t)_{t⩾0}. We know from the Doob–Meyer decomposition, cf. Theorem 17.1, that t is the compensator ⟨X⟩_t of the martingale (X_t)_{t⩾0}; as such, it is unique. If we read X_t² as f(X_t) for the C² function

Ex. 23.10

f(x) = x², then we can use Itô's formula to see that

 f(X_t) = f(X_0) + ∫_0^t f′(X_s) dX_s + ½ ∫_0^t f″(X_s) ds,

where the stochastic integral is a (local) martingale; this allows us to identify t with ∫_0^t ½∆f(X_s) ds, where ½∆ is the generator of X. This observation indicates that we should read the right-hand side of (23.14) as

 ( f(X_t) − ∫_0^t ½∆f(X_s) ds, F_t )_{t⩾0} is a (local) martingale for sufficiently many f.

As we know from Brownian motion, it will be enough to consider f(x) = ξ·x and f(x) = (ξ·x)², x, ξ ∈ ℝ^d. A word on notation: throughout this section we use ξ·x or simply ξx to denote the Euclidean scalar product ∑_{j=1}^d ξ_j x_j. The “bracket” ⟨X⟩ and ⟨X, Y⟩ is exclusively used for the compensator, see Chapter 17.3. It will also be convenient to identify a process (X_t)_{t⩾0} with the probability measure ℙ^{ν_0} which is induced on the path space Ω = C([0, ∞), ℝ^d) through the finite dimensional distributions of (X_t)_{t⩾0}, given that X_0 ∼ ν_0. Conversely, for a given ℙ^{ν_0}, X_t is the canonical process obtained by the coordinate projections X_t(w) := w(t), w ∈ C([0, ∞), ℝ^d). Finally, we write

 Lu(x) = ½ ∑_{j,k=1}^d a_{jk}(x) ∂_j ∂_k u(x) + ∑_{j=1}^d b_j(x) ∂_j u(x),  (23.15)

where b : ℝ^d → ℝ^d and a : ℝ^d → ℝ^{d×d} are locally bounded coefficients.

23.16 Definition (martingale problem. Stroock, Varadhan 1969). Let ν0 be a probability distribution on ℝd . A probability measure ℙν0 on C([0, ∞), ℝd ) is the solution to the d (L, C∞ c (ℝ ), ν 0 ) martingale problem, if i) X0 ∼ ν0 under ℙν0 , i.e. ℙν0 (X0 ∈ A) = ν0 (A) for all A ∈ B(ℝd ), t ϕ d ii) M t := ϕ(X t ) − ϕ(X0 ) − ∫0 Lϕ(X s ) ds is for all ϕ ∈ C∞ c (ℝ ) a martingale with respect to the natural filtration σ(X s , s ⩽ t). 23.17 Example. a) Let (B t )t⩾0 be a BMd (defined on the canonical Wiener space) and assume that (X tx )t⩾0 is the solution of the SDE dX t = b(X t ) dt + σ(X t ) dB t ,

X 0 = x ∈ ℝd ,

where b : ℝd → ℝd and σ : ℝd → ℝd×d are globally Lipschitz continuous. The probability measure ℙx induced by X x is a solution to the martingale problem for L given by (23.15), with ν0 = δ x and a = σσ⊤ .

Indeed: in Theorem 23.9 we have already established the existence and uniqueness of d the process X x as well as the form of the generator L. If ϕ ∈ C∞ c (ℝ ), we can apply Itô’s


formula to X t = X tx

t d

t

ϕ(X t ) − ϕ(X0 ) = ∫ ∑

0 j=1 t d

(j) ∂ j ϕ(X s ) dX s d

d 1 + ∫ ∑ ∂ j ∂ k ϕ(X s ) d⟨X (j) , X (k) ⟩s 2 j,k=1 0

t d

(k)

= ∫ ∑ ∂ j ϕ(X s ) ∑ σ jk (X s ) dB s + ∫ ∑ b j (X s )∂ j ϕ(X s ) ds 0 j=1

k=1

0 j=1

martingale, ∂ j ϕ(X s )σ jk (X s )∈L2T t

+∫ 0

d 1 d ∑ ∂ j ∂ k ϕ(X s ) ∑ σ pj (X s )σ qk (X s ) d⟨B(p) , B(q) ⟩s . 2 j,k=1 p,q=1

Since a jk = σσ⊤jk = ∑p,q σ pj σ qk δ pq , it follows that M t is a martingale. ϕ

=δ pq ds

b) Assume that L is of the form (23.15) with coefficients a(⋅), b(⋅) which are globally Lipschitz continuous, and a(⋅) is uniformly positive definite. We have seen in Corollary 23.11 that there is a diffusion process X which is the unique solution to the following SDE dX t = b(X t ) dt + √ a(X t ) dB t ,

X 0 = x ∈ ℝd ,

where √ a(x) is the unique (and again Lipschitz continuous) positive definite square root of a(x). This brings us back to the setting discussed in a), i.e. the martingale d problem (L, C∞ c (ℝ ), δ x ) has a solution. You may want to read the discussion following Corollary 23.11 or take a look at Lemma 23.21 to get a feeling what can go wrong if the matrix a(x) is not uniformly positive definite or even degenerate. The following extremely useful lemma is often stated as a corollary of the Itô formula for continuous semimartingales. Since our approach to stochastic integration does not cover this, we give a simple direct proof. 23.18 Lemma (integration by parts). Let (M t , Ft )t⩾0 be a real-valued, bounded martingale with continuous paths and (V t , Ft )t⩾0 an adapted, real-valued bounded process, such that V0 = 0 and t 󳨃→ V t is of bounded variation on any compact time interval. Then the process t

N t (ω) := M t (ω)V t (ω) − ∫ M s (ω) dV s (ω), 0

which is given by a Stieltjes integral for each ω, is a martingale.

Typically, we use Lemma 23.18 for stopped processes: for instance, if M t is a process with continuous paths, we can use the stopping times τ n := inf {s ⩾ 0 : |M s | ⩾ n} to τ ensure that the process M t n := M t∧τ n is bounded.

448 | 23 On diffusions Proof of Lemma 23.18. Because of the boundedness of M and V we do not have to worry about integrability. Since (for almost all ω) t 󳨃→ M t (ω) is continuous and t 󳨃→ V t (ω) is t of bounded variation on compact sets, the Riemann–Stieltjes integral ∫0 M s (ω) dV s (ω) exists and can be approximated by Riemann–Stieltjes sums. The approximation works for almost all ω and in L1 (ℙ) – just use dominated convergence and the fact that all processes are bounded. Fix s < t and F ∈ Fs . We will show that 𝔼 [(N t − N s )𝟙F ] = 0. For any partition Π = {s = t0 ⩽ t1 < ⋅ ⋅ ⋅ < t n = t} of [s, t] we see that a.s. and in L1 (ℙ) t

N t − N s = ∫(M t − M r ) dV r − (M t − M s )V s s

= lim Π

∑ (M t − M t j )(V t j − V t j−1 ) + (M t − M s )V s .

t j−1 ,t j ∈Π

By the tower property and a pull-out argument

Ft j m’ble

𝔼 [(M t − M t j )(V t j − V t j−1 )𝟙F ] = 𝔼 [𝔼 ((M t − M t j )(V t j − V t j−1 )𝟙F | Ft j )]

= 𝔼 [𝔼 (M t − M t j | Ft j )(V t j − V t j−1 )𝟙F ] = 0. =0, b/o martingale

A similar (even easier) argument shows 𝔼 [(M t − M s )V s 𝟙F ] = 0. Since we can swap the limit and the expectation when calculating 𝔼 [(N t − N s )𝟙F ], the claim follows.

The following lemma is the key lemma for the proof of the characterization of solutions to the martingale problem. 23.19 Lemma. Let (X t , Ft )t⩾0 be an adapted, ℝd -valued stochastic process with continuous paths such that X0 is a.s. bounded, and let L be the operator given by (23.15). The following assertions are equivalent (w.r.t. the filtration (Ft )t⩾0 ). t ϕ d a) M t = ϕ(X t ) − ϕ(X0 ) − ∫0 Lϕ(X s ) ds is for all ϕ ∈ C∞ c (ℝ ) a martingale. b) M t = f(X t ) − f(X0 ) − ∫0 Lf(X s ) ds is for all f ∈ C2 (ℝd ) a local martingale. t

f

c)

ξ

t

M t = ξ ⋅(X t − X0 − ∫0 b(X s ) ds) is for all ξ ∈ ℝd a local martingale whose quadratic t

variation is ⟨M ξ ⟩t = ∫0 ξ ⋅ a(X s )ξ ds. t

d) Et = exp [ζ ⋅ (X t − X0 − ∫0 b(X s ) ds) − martingale. ζ

t

1 t 2 ∫0

e) Et = exp [iξ ⋅ (X t − X0 − ∫0 b(X s ) ds) + martingale. iξ

ζ ⋅ a(X s )ζ ds] is for all ζ ∈ ℂd a local

1 t 2 ∫0

ξ ⋅ a(X s )ξ ds] is for all ξ ∈ ℝd a local

Proof. a)⇒b) It is well-known that we can approximate f ∈ C2 (ℝd ) with a sequence d (ϕ n )n⩾1 ⊂ C∞ c (ℝ ) using the norms d

d

‖u‖(2),K := sup |u(x)| + sup ∑ |∂ j u(x)| + sup ∑ |∂ j ∂ k u(x)| x∈K

x∈K j=1

x∈K j,k=1


for any compact set K ⊂ ℝd , see e.g. [232, Theorem 15.11, Exercise 15.13] or [110, Theorem 1.3.2]. d ∞ d We fix f ∈ C2 (ℝd ) and (ϕ n )n⩾0 ⊂ C∞ c (ℝ ) as above, K = 𝔹(0, r), pick χ r ∈ Cc (ℝ ) such that 𝟙𝔹(0,r) ⩽ χ r ⩽ 𝟙𝔹(0,2r) and set τ r := inf {s ⩾ 0 : |X s | ⩾ r/2}. This is a stopping time for Ft , cf. Lemma 5.8; since X t has continuous paths in ℝd , τ r → ∞ as r → ∞. By construction, ‖χ r f − χ r ϕ n ‖(2),ℝd 󳨀󳨀󳨀󳨀→ 0 n→∞

holds for every fixed r > 0, and since the coefficients of the operator L are locally bounded, we conclude that ‖χ r f − χ r ϕ n ‖∞ + ‖L(χ r f) − L(χ r ϕ n )‖∞ 󳨀󳨀󳨀󳨀→ 0. n→∞

L1 (ℙ)

This, in turn, shows that M t r n 󳨀󳨀󳨀󳨀→ M t r as n → ∞, so that M χ r f inherits the martind gale property from M χ r ϕ n – here we use that χ r ϕ n ∈ C∞ c (ℝ ) and Part a). χr f By optional stopping, (M t∧τ r , Ft )t⩾0 is again a martingale. Because of the definition of τ r , we have f = χ r f , Lf = L(χ r f) on 𝔹(0, r), and X t∧τ r ∈ 𝔹(0, r/2), so χ ϕ

χ f

r M t∧τ = M t∧τ r is a martingale 󳨐⇒ M t is a local martingale. r

χ f

f

f

b)⇒c) Setting f(x) = f ξ (x) = ξ ⋅ x for x, ξ ∈ ℝd we see that f ξ ∈ C2 (ℝd ) and Lf ξ (x) = ξ ⋅ b(x). Therefore, t

ξM t := ξX t − ξX0 − ξA t is a local martingale where ξA t := ∫ ξb(X s ) ds. 0

We still have to identify the compensator of ξM t . Using in b) the function g ξ (x) := (ξ ⋅ x)2 and observing that ∇g ξ (x) = 2(ξ ⋅ x)ξ and ∇∇⊤ g ξ (x) = ξξ ⊤ = (ξ j ξ k )j,k , shows that t

2

t

2

L t = (ξX t ) − (ξX0 ) − 2 ∫ ξX s dξA s − ∫ ξ ⋅ a(X s )ξ ds 0

=ξb(X s ) ds 0

is a local martingale. Because of the uniqueness of the compensator, it is enough to t show that (ξM t )2 − ∫0 ξ ⋅ a(X s )ξ ds − L t is a local martingale. Note that t

2

(ξM t ) − ∫ ξ ⋅ a(X s )ξ ds − L t 0

t

= (ξ(X − X0 − A)t )2 − (ξX t )2 + (ξX0 )2 + 2 ∫ ξX s dξA s 0

t

= (ξA t )2 − 2ξ(X − X0 )t ξA t + 2 ∫ ξ(X − X0 )s dξA s − 2ξ(X − X0 − A)t ξX0 0

450 | 23 On diffusions t

= −2ξ(X − X0 − A)t ξA t + 2 ∫ ξ(X − X0 − A)s dξA s − 2ξ(X − X0 − A)t ξX0 . 0

In the last equality we use the elementary identity x(t)2 − x(0)2 = 2 ∫0 x(s) dx(s) for Stieltjes integrals. Lemma 23.18 and a stopping argument show that the sum of the first two terms in the above expression is a local martingale. Since ξX0 is bounded and F0 measurable, 2ξ(X − X0 − A)t ξX0 is also a local martingale, and the proof is complete. t

c)⇒d)⇒e) Since the dependence ξ 󳨃→ M t is linear, it is clear that M t is still a local ξ

ζ

martingale if ζ ∈ ℂd . Note that = E(M ζ )t is a Doléans–Dade stochastic exponential, cf. (19.1) in Chapter 19.1. A straightforward application of Itô’s formula for continuous local martingales⁴ gives ζ Et

t ζ

ζ

ζ

ζ

Et − E0 = ∫ Es d[M s − 0

t

t

0

0

1 ζ 1 ζ ζ ζ ⟨M ⟩s ] + ∫ Es d⟨M ζ ⟩s = ∫ Es dM s , 2 2

and this shows that Eζ is a local martingale, i.e. d). Specializing ζ = iξ proves e).

e)⇒a) Define e ξ (x) := exp[iξ ⋅ x] and note that The process

L x e ξ (x) = (iξ ⋅ b(x) − ξ ⋅ a(x)ξ ) e ξ (x). t

t

1 V t := exp [iξX0 + ∫ iξ ⋅ b(X s ) ds − ∫ ξ ⋅ a(X s )ξ ds] 2 0 0 ] [

is a continuous process which is of bounded variation on compact time intervals. Thus, a stopping argument and Lemma 23.18 show that t

t

N t := Et V t − ∫ Es dV s = e ξ (X t ) − ∫ Le ξ (X s ) ds ξ





0

is a local martingale. Let ϕ ∈

d C∞ c (ℝ ).

0

Since the Fourier transform

̂ = (2π)−d ∫ ϕ(x)e−iξ⋅x dx Fϕ(ξ) = ϕ(ξ) ℝd

Ex. 23.11

4 If N is a continuous local martingale, then du(N t ) = u󸀠 (N t ) dN t + 12 u󸀠󸀠 (N t ) d⟨N⟩t . Since we have not proved this formula, let us indicate the idea of the proof and a workaround. Proper solution: mimic the proof of Theorem 18.7 – replacing X 󴁄󴀼 N and σ2 (x) ds 󴁄󴀼 d⟨N⟩t , stopping everything at a suitable localizing sequence τ n , and observing that ⟨N⟩ = ucp- limΠ ∑t j ,t j−1 ∈Π (N t j − N t j−1 )2 , cf. Exercise 17.3. Workaround: if you are happy with Brownian filtrations, i.e. (Ft )t⩾0 being admissible for a Brownian motion, then you could invoke Corollary 19.13 to see that the suitably stopped processes N τ n are Itô processes, i.e. one can use Itô’s formula from Theorem 18.7; a similar argument works if we assume that ⟨N⟩ is absolutely continuous, cf. Corollary 19.15.


is a bijection on the space S(ℝd ) of rapidly decreasing, smooth functions (see e.g. [232, d d Chapter 19]) and since C∞ c (ℝ ) ⊂ S(ℝ ), we can use the Fourier inversion formula ixξ ̂ ̂ ϕ(x) = ∫ ϕ(ξ)e dξ = ∫ ϕ(ξ)e ξ (x) dξ to show ℝd

t



ℝd

iξ (Et V t

ℝd

iξ ̂ dξ − ∫ Es dV s ) ϕ(ξ) 0

t

̂ ̂ = ∫ ϕ(ξ)e ξ (X t ) dξ − ∫ ∫ e −ξ (X s )L x e ξ (X s ) ϕ(ξ)e ξ (X s ) dξ ds 0 ℝd

ℝd

t

=iξ⋅b(X s )−ξ⋅a(X s )ξ

̂ ̂ = ∫ ϕ(ξ)e ξ (X t ) dξ − ∫ ∫ Lϕ(ξ)e ξ (X s ) dξ ds ℝd

0 ℝd

t

= ϕ(X t ) − ∫ Lϕ(X s ) ds =: M t . ϕ

0

Since N t is a continuous local martingale, we can introduce a sequence of stopping ξ

Ex. 23.12

times τ n = inf {s ⩾ 0 : |N t | ⩾ n} such that (N t∧τ n )t⩾0 is a bounded martingale. This allows us to use Fubini’s theorem to see for all s < t and F ∈ Fs that ξ

ξ

ξ ξ ξ ̂ ̂ ̂ 𝔼 [𝟙F ∫ ϕ(ξ)N t∧τ n dξ ] = ∫ 𝔼 [𝟙F N t∧τ n ] ϕ(ξ) dξ = ∫ 𝔼 [𝟙F N s∧τ n ] ϕ(ξ) dξ

ξ ̂ = 𝔼 [𝟙F ∫ ϕ(ξ)N s∧τ n dξ ] .

This proves that M t is a local martingale. Since a and b are locally bounded, hence ϕ

bounded on the compact set supp ϕ, we have |M t | ⩽ ‖ϕ‖∞ +t‖Lϕ‖∞ ⩽ C(1+t)‖ϕ‖(2),ℝd . ϕ

Therefore, we can use dominated convergence to infer M t∧τ n → M t in L1 (ℙ). As (τ n )n⩾1 ϕ

is a localizing sequence for the local martingale

inherits the martingale property from M t∧τ n . ϕ

ϕ

ϕ M t , this convergence ensures that

ϕ

Mt

An immediate consequence of Lemma 23.19 is Lévy’s characterization of ddimensional Brownian motion via martingales. You will recall the one-dimensional version from Theorem 9.13 or 19.5. 23.20 Corollary (Lévy 1948). Let (X t , Ft )t⩾0 be a stochastic process with values in ℝd and continuous sample paths. If for all j, k = 1, . . . , d, the processes (j)

(X t , Ft )t⩾0

(j) (k)

and (X t X t − δ jk t, Ft )t⩾0

are local martingales, then (X t , Ft )t⩾0 is a d-dimensional Brownian motion.

Ex. 23.13

452 | 23 On diffusions Proof. Fix ξ ∈ ℝd and write ξ = ∑dj=1 ξ (j) e j with the canonical ONB (e j )j=1,...,d of ℝd . Because of the assumption, d

d

j=1

j=1

(j)

ξ ⋅ X t = ∑ ξ (j) e j ⋅ X t = ∑ ξ j X t

is a local martingale, and so is (ξX t )2 − ⟨ξX⟩t . Since the compensator is unique and bilinear, we get d

d

j,k=1

j,k=1

⟨ξX⟩t = ⟨ξX, ξX⟩t = ∑ ξ (j) ξ (k) ⟨X (j) , X (k) ⟩t = ∑ ξ (j) ξ (k) δ jk t.

Therefore, we can use the implication c)⇒e) of Lemma 23.19 with b = 0 and a = (δ jk )jk , which shows that Et = exp [iξ(X t − X0 ) + iξ

1 2 t|ξ| ] 2

is a local martingale. Fix s < t and take a localizing sequence (τ n )n⩾0 . We have 𝔼 (Et∧τ n | Fs ) = Es∧τ n 󳨐⇒ 𝔼 (e iξ (X t∧τn −X s∧τn ) e(t∧τ n −s∧τ n )|ξ| iξ

2



2

/2

| Fs ) = 1.

Using the majorant e t|ξ| /2 , we can apply (conditional) dominated convergence and the continuity of t 󳨃→ X t to conclude 2

𝔼 (e iξ (X t −X s ) | Fs ) = e−(t−s)|ξ|

which identifies X as a BMd , cf. Lemma 5.4.

/2

The next lemma is a simple version of a singular value decomposition from linear algebra. 23.21 Lemma. Let a = σσ⊤ ∈ ℝd×d be a positive semidefinite matrix. There are matrices R, Q ∈ ℝd×d such that RaR⊤ + QQ⊤ = id,

σQ = 0 and

(id −σR)a(id −σR)⊤ = 0.

(23.16) (23.17)

If a = a(x) is a matrix-valued Borel measurable function, then R = R(x) and Q = Q(x) are again Borel measurable. Proof. Since a is positive semidefinite, there is an orthogonal matrix U ∈ O(d) such that UaU ⊤ = D2 = diag[λ1 , . . . , λ m , λ m+1 , . . . , λ d ].

Without loss of generality, we can assume that λ1 , . . . , λ m > 0 and λ m+1 = ⋅ ⋅ ⋅ = λ d = 0, 1/2 1/2 otherwise we make a change of basis. Using D = √D2 = diag[λ1 , . . . , λ d ] we can ⊤ ⊤ express the positive semidefinite root of a as √a = U DU, hence √aU = U ⊤ D.

23.4 First steps towards the martingale problem | 453 −1/2

Set Λ = diag[λ1 have

−1/2

, . . . , λm

, 0, . . . , 0] and P = diag[1, . . . , 1, 0, . . . , 0]. We m

(ΛU)a(ΛU)⊤ = Λ(UaU ⊤ )Λ⊤ = ΛD2 Λ⊤ = P.

It is now easy to check that the following matrices R := ΛU,

Q := (id −P),

satisfy (23.16) and (23.17). Indeed,

σ = √aU ⊤ = U ⊤ D

RaR⊤ + QQ⊤ = P + (id −P)2 = P + id −P = id σQ = U ⊤ D(id −P) = U ⊤ ΛP(id −P) = 0

and

(id −σR)a(id −σR)⊤ = (id −U ⊤ DΛ U)a(id −U ⊤ DΛU)⊤ =UU ⊤ ⊤

=P

= U (id −P) UaU ⊤ (id −P)⊤ U = 0, =D2

since (id −P)D = 0. Now assume that a = a(x) is Borel measurable. This implies that the eigenvalues depend measurably on x. Since we can construct Q and R using elementary matrix operations, whose action is continuous, these matrices will inherit the measurability of a. 23.22 Theorem. Let (X t , Ft )t⩾0 be an adapted, ℝd -valued stochastic process on the canonical space C([0, ∞), ℝd ), let ℙ be the probability measure induced by X, let ν0 be a probability measure on ℝd , and assume that L is given by (23.15) with locally bounded coefficients b : ℝd → ℝd , σ : ℝd → ℝd×d and a = σσ⊤ . The following assertions are equivalent. d a) X, i.e. ℙ, solves the (L, C∞ c (ℝ ), ν 0 )-martingale problem. b) X is a weak solution to the stochastic differential equation dX t = b(X t ) dt + σ(X t ) dB t ,

X0 ∼ ν0 ,

(23.18)

i.e. there is a (possible enlarged) probability space and a Brownian motion W such X satisfies (23.18) driven by the Brownian motion B = W.

Proof. b)⇒a) Assume that the pair (X t , W t ) solves the SDE (23.18). The calculation of Example 23.17.a) still holds for (X, W) and shows that f Mt

t

t

0

0

d 1 d := f(X t ) − f(X0 ) − ∑ ∫ a jk (X s )∂ j ∂ k f(X s ) ds − ∑ ∫ b j (X s )∂ j f(X s ) ds 2 j,k=1 j=1

454 | 23 On diffusions is for all f ∈ C2 (ℝd ) a local martingale. Therefore, Lemma 23.19 shows that (the canond ical realization of) X solves the (L, C∞ c (ℝ ), ν 0 )-martingale problem. a)⇒b) Lemma 23.19.c) shows that the quadratic variation of the continuous local t t ξ martingale M t = ξM t = ξX t − ξX0 − ∫0 ξb(X s ) ds is given by ⟨M ξ ⟩t = ∫0 a jk (X s ) ds; we t

use M t to denote the vector-valued local martingale X t − X0 − ∫0 b(X s ) ds. We have to construct a Brownian motion which will be the driver of the SDE (23.18). The idea is to follow Theorem 19.14 and Corollary 19.15, and use W = a(X)−1 ∙ M. There are two problems: i) Theorem 19.14 is one-dimensional, and ii) – as in Corollary 19.15, we have the problem that a(X)−1 may not exist. Problem i) can be addressed by using the d-dimensional version of Lévy’s martingale characterization of a Brownian motion (Corollary 23.20), and Problem ii) can be solved using the pseudo-inverse of the matrix a = σσ⊤ , cf. Lemma 23.21, and an supplementary Brownian motion which takes care of the subspace of ℝd , where a has no inverse. First, we enlarge the probability space of M ξ such that there is an independent, ̃ this can be done by the standard product cond-dimensional Brownian motion W; struction, see e.g. [235, Chapter 26]. In slight abuse of notation, we do not change the notation for the other processes. Using the notation from Lemma 23.21 we set t

t

̃s W t := ∫ R(X s ) dM s + ∫ Q(X s ) d W 0

0

and claim that this is a d-dimensional Brownian motion (on the enlarged probability space). In order to see this, we use Corollary 23.20. Clearly, W is a local martingale, since it is given by stochastic integrals driven by (local) martingales. The quadratic variation is ⟨W (j) , W (k) ⟩t d

d

p,q=1

p,q=1

̃ (p) , Q kq (X) ∙ W ̃ (q) ⟩t = ∑ ⟨R jp (X) ∙ M (p) , R kq (X) ∙ M (q) ⟩t + ∑ ⟨Q jp (X) ∙ W d

t

d

t

̃ (p) , W ̃ (q) ⟩s = ∑ ∫ R jp (X s )R kq (X s ) d⟨M (p) , M (q) ⟩s + ∑ ∫ Q jp (X s )Q kq (X s ) d⟨W p,q=1 0 d

t

p,q=1 0

d

t

= ∑ ∫ R jp (X s )a pq (X s )R kq (X s ) ds + ∑ ∫ Q jp (X s )δ pq Q kq (X s ) ds p,q=1 0

p,q=1 0

t

= ∫ (R(X s )a(X s )R⊤ (X s ) + Q(X s )Q⊤ (X s ))jk ds = δ jk t, 0

where we use (23.16) in the last line.

23.4 First steps towards the martingale problem | 455

We will now check that (X, W) satisfies the SDE. For all t > 0 we have t

∫ σ(X s ) dW s 0

def

t

t

̃s = ∫ σ(X s )R(X s ) dM s + ∫ σ(X s )Q(X s ) d W 0

0

(23.17)

t

= ∫ σ(X s )R(X s ) dM s 0

t

= M t − ∫ (id −σR) (X s ) dM s 0

(∗)

= Mt

t

def

= X t − X0 − ∫ b(X s ) ds. 0

The equality marked with (∗) follows from the fact that the quadratic variation uniquely determines the underlying (local) martingale: d

⟨∑



(p) ∫(id −σR)jp (X s ) dM s ,

p=1 0

t

d

d



(q)

∑ ∫(id −σR)kq (X s ) dM s ⟩

q=1 0

t

(23.17)

= ∫ ∑ (id −σR)jp (X s )a pq (X s )(id −σR)kq (X s ) ds = 0. 0 p,q=1

Let us briefly discuss uniqueness. Given any initial distribution ν0 , we say that the d martingale problem (L, C∞ c (ℝ ), ν 0 ) has a unique solution, if any two solutions to this martingale problem have the same finite dimensional distributions (hence, they induce the same measure ℙν0 on C([0, ∞), ℝd )). Similarly, a weak solution to the SDE (23.18) is unique, if the finite dimensional distributions of the processes X and X 󸀠 of any two weak solution pairs (X, W) and (X 󸀠 , W 󸀠 ) coincide. Note that Theorem 23.22 does not assume uniqueness. Consequently, existence and uniqueness for the two problems are equivalent, and we may record the following result. 23.23 Corollary. Let ν0 be any initial distribution on (ℝd , B(ℝd )). The following assertions are equivalent: d d a) The martingale problem (L, C∞ c (ℝ ), ν 0 ) in C([0, ∞), ℝ ) has a unique solution. b) The SDE (23.18) with initial law X0 ∼ ν0 has a unique weak solution. In fact, we can go one step further. If we take expectations in Lemma 23.19.a), we get t

𝔼 ϕ(X t ) = 𝔼 ϕ(X0 ) + 𝔼 [∫ Lϕ(X s ) ds] , [0 ]

d ϕ ∈ C∞ c (ℝ ).

(23.19)

456 | 23 On diffusions Let ν t ∼ X t denote the law of X t , i.e. the one-dimensional distribution of X under ℙν0 . Then we can rewrite (23.19) in the following form t

d ϕ ∈ C∞ c (ℝ ),

∫ ϕ(y) ν t (dy) = ∫ ϕ(y) ν0 (dy) + ∫ ∫ Lϕ(y) ν s (dy) ds,

ℝd

0

ℝd

(23.20)

which is the Kolmogorov forward equation for the family (ν t )t⩾0 .

23.24 Lemma. Let (X t , Ft )t⩾0 be a Markov process with infinitesimal generator L such d ∞ d that C∞ c (ℝ ) ⊂ D(L) and Lϕ is a bounded measurable function for every ϕ ∈ Cc (ℝ ). Condition a) in Lemma 23.19 is equivalent to the validity of the Kolmogorov forward equation (23.19) or (23.20). Proof. Let M t be the process defined in Lemma 23.19.a). If M t is a martingale, then ϕ

ϕ

𝔼x M t = 𝔼x M0 = 0, and we get (23.19). Conversely, assume that (23.19) holds for every ϕ

ϕ

x ∈ ℝd . Our assumptions guarantee that 𝔼x |M t | < ∞. In particular, we see that ϕ

𝔼X s M t = 0

a.s. for any s ⩾ 0.

ϕ

Using the Markov property we get

󵄨󵄨 󵄨 + 𝔼 (ϕ(X t ) − ϕ(X s ) − ∫ Lϕ(X r ) dr 󵄨󵄨󵄨󵄨 Fs ) 󵄨󵄨 s t

𝔼

x

ϕ (M t

| Fs ) = = =

ϕ Ms

x

t

ϕ Ms ϕ Ms

+𝔼 +𝔼

Xs

(ϕ(X t−s ) − ϕ(X0 ) − ∫ Lϕ(X r−s ) dr) s

Xs

ϕ (M t−s ) =0

This is Lemma 23.19.a), finishing the proof.

=

ϕ Ms .

The converse of Lemma 23.24 is also true: the existence and uniqueness of the martingale problem is equivalent to the unique solvability of the Kolmogorov forward equation (23.20). What is missing is the proof that the unique solution to the martingale problem is a Markov process, i.e. that the family (ν t )t⩾0 determines the finite dimensional distributions, hence ℙν0 . This is, however, beyond the scope of this introductory text, and we refer to Ethier & Kurtz [79, Theorem IV.4.2] or Stroock & Varadhan [249, Theorem 6.2.2] for a complete proof. 23.25 Further reading. The interplay of PDEs and SDEs is nicely described in [10] and [89]. A thorough discussion on analytic and probabilistic methods for parabolic PDEs can be found in Friedman [91] (analytic approach) and the first volume of [92] (probabilistic approach). The survey [12] gives brief introduction to the ideas of Malliavin’s calculus, see also the further reading recommendations in Chapter 4 and 19. The contributions of Stroock and Varadhan, in particular, the characterization of

Problems

| 457

stochastic processes via the martingale problem, opens up a whole new world. Good introductions are [79] and, with the focus on diffusion processes, [249]. A good overview on the interplay of the martingale problem, weak solutions to SDEs and Kolmogorov’s forward equation is given in the paper [155] by Kurtz. Bass: Diffusions and Elliptic Operators. [10] Bell: Stochastic differential equations and hypoelliptic operators. [12] Ethier, Kurtz: Markov Processes: Characterization and Convergence. [79] [89] Freidlin: Functional Integration and Partial Differential Equations. Friedman: Partial Differential Equations of Parabolic Type. [91] Friedman: Stochastic Differential Equations and Applications. [92] [155] Kurtz: Equivalence of stochastic equations and martingale problems. [249] Stroock, Varadhan: Multidimensional Diffusion Processes.

Problems 1.

2.

3.

Let (A, D(A)) be the generator of a diffusion process in the sense of Definition 23.1 and denote by a, b the diffusion and drift coefficients. Show that a ∈ C(ℝd , ℝd×d ) and b ∈ C(ℝd , ℝd ).

Show that under the assumptions of Proposition 23.5 we can interchange integ2 2 ration and differentiation: ∂x∂j ∂x k ∫ p(t, x, y)u(y) dy = ∫ ∂x∂j ∂x k p(t, x, y)u(y) dy and that the resulting function is in C∞ (ℝd ).

d d Let A : C∞ c (ℝ ) → C∞ (ℝ ) be a closed operator such that ‖Au‖∞ ⩽ C ‖u‖(2) ; here ‖⋅‖(2)

d ‖u‖(2) = ‖u‖∞ + ∑j ‖∂ j u‖∞ + ∑jk ‖∂ j ∂ k u‖∞ . Show that C2∞ (ℝd ) = C∞ c (ℝ ) D(A).



4. Let L = L(x, D x ) = ∑ij a ij (x)∂ i ∂ j + ∑j b j (x)∂ j + c(x) be a second order differential operator. Determine its formal adjoint L∗ (y, D y ), i.e. the linear operator satisfying d ⟨Lu, ϕ⟩L2 = ⟨u, L∗ ϕ⟩L2 for all u, ϕ ∈ C∞ c (ℝ ). What is the formal adjoint of the ∂ operator L(x, D) + ∂t ? 5.

Complete the proof of Proposition 23.6 (Kolmogorov’s forward equation).

6.

Let (B t )t⩾0 be a BM1 . Show that X t := (B t , ∫0 B s ds) is a diffusion process and determine its generator.

7.

t

(Carré-du-champ operator) Let (X t )t⩾0 be a diffusion process with the infinitesimal t generator L = L(x, D) = A|C∞ as in (23.1). Write M tu = u(X t ) − u(X0 ) − ∫0 Lu(X r ) dr. c d a) Show that M u = (M tu )t⩾0 is an L2 martingale for all u ∈ C∞ c (ℝ ).

b) Show that M f = (M t )t⩾0 is a local martingale for all f ∈ C2 (ℝd ). f

458 | 23 On diffusions d 2 c) Denote by ⟨M u , M ϕ ⟩t , u, ϕ ∈ C∞ c (ℝ ), the quadratic covariation of the L u ϕ martingales M and M , cf. (15.8) and Theorem 15.15.b). Show that t

⟨M , M ⟩t = ∫ (L(uϕ) − uLϕ − ϕLu)(X s ) ds u

ϕ

0

t

= ∫ ∇u(X s ) ⋅ a(X s )∇ϕ(X s ) ds. 0

(x ⋅ y denotes the Euclidean scalar product and ∇ = (∂1 , . . . , ∂ d )⊤ .)

Remark. The operator Γ(u, ϕ) := L(uϕ) − uLϕ − ϕLu is called carré-du-champ or mean-field operator.

8. Consider the ODE ∂ t ξ(t, x) = x + b(ξ(t, x)), ξ(0, x) = x, where b ∈ C2b (ℝd , ℝd ) such that |∂ αx b(x)|, |α| = 2, is Lipschitz continuous. Show that all ∂ αx ξ(t, x), |α| ⩽ 2, are polynomially bounded. Hint. Assume that the derivatives ∂ αx ξ(t, x) exist and solve the formally differentiated ODE.

9.

(BM with gradient drift) Let (B t )t⩾0 be a BMd and denote by (X tx )t⩾0 the strong solution of the SDE dX t = ∇c(X t ) dt + dB t ,

X 0 = x ∈ ℝd ,

where c ∈ C2c (ℝd , ℝ). The following problem makes a connection between X and Brownian motion killed with rate V(⋅). a) Show that for ϕ ∈ Cc (ℝd , ℝ) t

𝔼 ϕ(X tx ) = 𝔼 [exp (− ∫ V(B s ) ds) e c(B t )−c(x) ϕ(B t )] 0 [ ]

with the potential V(x) = 21 |∇c(x)|2 + 12 ∆c(x).

Hint. Use Girsanov’s theorem to show the formula for 𝔼 ϕ(X tx ). The formula for V follows

with an application of Itô’s formula to Z t := c(B t ).

b) Use the Feynman–Kac formula to rewrite the formula for T tX ϕ(x) := 𝔼 ϕ(X tx ) in the following way: T tX ϕ(x) = e−c(x) T tY (ϕ(⋅)e c(⋅) )(x)

for a suitable operator semigroup T tY . The corresponding process (Y t )t⩾0 is a Brownian motion killed at rate V. Remark. This is a particular case of Doob’s h-transform with h(x) = e−c(x) .

10. Let (M (1) , . . . , M (d) ) be a local martingale with M0 = 0 and denote its quadratic variation by ⟨M⟩ = (⟨M (j) , M (k) ⟩)j,k . Show that ⟨M⟩ = 0 implies that M = 0.

Problems

| 459

11. Let (N t , Ft )t⩾0 be a continuous, real-valued local martingale and u ∈ C2 (ℝ). Show the following Itô formula du(N t ) = u󸀠 (N t ) dN t + 21 u󸀠󸀠 (N t ) d⟨N⟩t .

Hint. Mimic the proof of Theorem 18.7, stopping everything at a suitable localizing sequence τ n , and observing that ⟨N⟩ = ucp- limΠ ∑t j ,t j−1 ∈Π (N t j − N t j−1 )2 , cf. Exercise 17.3.

12. Let (N t , Ft )t⩾0 be a continuous local martingale. Show that τ n := {s ⩾ 0 : |N t | ⩾ n} is a localizing sequence which makes N t∧τ n bounded. 13. Prove the converse of Corollary 23.20.

24 Simulation of Brownian motion by Björn Böttcher This chapter presents a concise introduction to simulation, in particular focusing on the normal distribution and Brownian motion.

24.1 Introduction Recall that for iid random variables X1 , . . . , X n with a common distribution F for any ω ∈ Ω the tuple (X1 (ω), . . . , X n (ω)) is called a (true) sample of F. In simulations the following notion of a sample is commonly used. 24.1 Definition. A tuple of values (x1 , . . . , x n ) which is indistinguishable from a true sample of F by statistical inference is called a sample of F. An element of such a tuple is called a sample value of F, and it is denoted by s

x j ∼ F.

Note that the statistical inference has to ensure both the distributional properties and the independence. A naive idea to verify the distributional properties would be, for instance, the use of a Kolmogorov–Smirnov test on the distribution. But this is not a good test for the given problem, since a true sample would fail such a test with confidence level α with probability α. Instead one should do at least repeated Kolmogorov–Smirnov tests and then test the p-values for uniformity. In fact, there is a wide range of possible tests, for example the NIST (National Institute of Standards and Technology)¹ has defined a set of tests which should be applied for reliable results. In theory, letting n tend to ∞ and allowing all possible statistical tests, only a true sample would be a sample. But in practice it is impossible to get a true sample of a continuous distribution, since a physical experiment as sample mechanism clearly yields measurement and discretization errors. Moreover for the generation of samples with the help of a computer we have only a discrete and finite set of possible values due to the internal number representation. Thus one considers n large, but not arbitrarily large. Nevertheless, in simulations one treats a sample as if it were a true sample, i.e. any distributional relation is translated to samples. For example, given a distribution function F and the uniform distribution U(0,1) we have X ∼ U(0,1) 󳨐⇒ F −1 (X) ∼ F

1 http://csrc.nist.gov/groups/ST/toolkit/rng/stats_tests.html https://doi.org/10.1515/9783110741278-024

24.1 Introduction | 461

ℙ(F −1 (X) ⩽ x) = ℙ(X ⩽ F(x)) = F(x) s s x ∼ U(0,1) 󳨐⇒ F −1 (x) ∼ F

since and thus

(24.1)

holds. This yields the following algorithm.

24.2 Algorithm (Inverse transform method). Let F be a distribution function. s a) Generate u ∼ U(0,1) b) Set x := F −1 (u) s Then x ∼ F.

By this algorithm a sample of U(0,1) can be transformed to a sample of any other distribution. Therefore the generation of samples from the uniform distribution is the basis for all simulations. For this various algorithms are known and they are implemented in most of the common programming languages. Readers interested in the details of the generation of samples of the uniform distribution are referred to [160; 161]. s From now on we will assume that we can generate x ∼ U(0,1) . Thus we can apply Algorithm 24.2 to get samples of a given distribution as the following examples illustrate. 24.3 Example (Scaled and shifted Uniform distribution U(a,b) ). For a < b the distribution function of U(a,b) is given by F(x) =

and thus

x−a b−a

s

for x ∈ [a, b] s

u ∼ U(0,1) 󳨐⇒ (b − a)u + a ∼ U(a,b) .

24.4 Example (Bernoulli distribution β p and U{0,1} , U{−1,1} ). Let p ∈ [0, 1]. Given a uniform random variable U ∼ U(0,1) , then 𝟙[0,p] (U) ∼ β p . In particular, we get s

s

u ∼ U(0,1) 󳨐⇒ 𝟙[0, 21 ] (u) ∼ U{0,1}

s

and 2 ⋅ 𝟙[0, 12 ] (u) − 1 ∼ U{−1,1} .

24.5 Example (Exponential distribution Expλ ). The distribution function of Expλ with parameter λ > 0 is F(x) = 1 − e−λx , x ⩾ 0, and thus Now using

1 F −1 (y) = − ln(1 − y). λ

U ∼ U(0,1) 󳨐⇒ 1 − U ∼ U(0,1)

yields with the inverse transform method

1 s s u ∼ U(0,1) 󳨐⇒ − ln(u) ∼ Expλ . λ

462 | 24 Simulation of Brownian motion Before focusing on the normal distribution, let us introduce two further algorithms which can be used to transform a sample of a distribution G to a sample of a distribution F. 24.6 Algorithm (Acceptance–rejection method). Let F and G be probability distribution functions with densities f and g, such that there is a constant c > 0 with f(x) ⩽ cg(x) for all x. s a) Generate y ∼ G s b) Generate u ∼ U(0,1) c) If ucg(y) ⩽ f(y), then x := y otherwise restart with 1. s Then x ∼ F.

Proof. If Y ∼ G and U ∼ U(0,1) are independent, then the algorithm generates a sample value of the distribution μ, given by ℙ(Y ∈ A, Ucg(Y) ⩽ f(Y)) 󵄨 = ∫ f(y) dy μ(A) = ℙ(Y ∈ A 󵄨󵄨󵄨 Ucg(Y) ⩽ f(Y)) = ℙ(Ucg(Y) ⩽ f(Y)) A

for A ∈ B(ℝ). This follows from the fact that g(x) = 0 implies f(x) = 0, and 1

ℙ(Y ∈ A, Ucg(Y) ⩽ f(Y)) = ∫ ∫ 𝟙{(v,z) : vcg(z)⩽f(z)} (u, y) g(y) dy du 0 A

f(y) cg(y)

=

=

=



∫ du g(y) dy

A\{g=0} 0



f(y) g(y) dy cg(y)

A\{g=0}

1 c



f(y) dy

A\{g=0}

1 = ∫ f(y) dy. c A

In particular, we see for A = ℝ

ℙ(Ucg(Y) ⩽ f(Y)) = ℙ(Y ∈ ℝ, Ucg(Y) ⩽ f(Y)) =

1 . c

24.7 Example (One-sided normal distribution). We can apply the acceptance–rejection method to the one-sided normal distribution f(x) = 2

1 2 1 e− 2 x 𝟙[0,∞) (x), √2π

24.1 Introduction | 463

and the Exp1 distribution g(x) = e−x 𝟙[0,∞) (x) (for sampling see Example 24.5) with the smallest possible c (such that f(x) ⩽ cg(x) for all x) given by c = sup x

f(x) 2 x2 2e = sup √ e− 2 +x 𝟙[0,∞) (x) = √ ≈ 1.315 . g(x) π π x

The name of Algorithm 24.6 is due to the fact that in the first step a sample is proposed and in the third step this is either accepted or rejected. In the case of rejection a new sample is proposed until acceptance. The algorithm can be visualized as throwing darts onto the set B = {(x, y) ∈ ℝ2 : 0 ⩽ y ⩽ cg(x)} uniformly. Then the x-coordinate of the dart is taken as a sample (accepted) if, and only if, the dart is within C = {(x, y) ∈ ℝ2 : 0 ⩽ y ⩽ f(x)}; note that C ⊂ B is satisfied by our assumptions. This also shows that, on average, the algorithm proposes λ2 (B)/λ2 (C) = c times a sample until acceptance. In contrast, the following algorithm does not produce unused samples. 24.8 Algorithm (Composition method). Let F be a distribution with density f of the form f(x) = ∫ g y (x) μ(dy)

where μ is a probability measure and g y is for each y a density. s

a) Generate y ∼ μ s b) Generate x ∼ g y s

Then x ∼ F.

Proof. Let Y ∼ μ and, conditioned on {Y = y}, let X ∼ g y , i.e. x

x

P(X ⩽ x) = ∫ ℙ(X ⩽ x | Y = y) μ(dy) = ∫ ∫ g y (z) dz μ(dy) = ∫ f(z)dz. −∞

−∞

Thus we have X ∼ F.

24.9 Example (Normal distribution). We can write the density of the standard normal distribution as f(x) = ∞

1 − 1 x2 1 2 − 1 x2 1 2 − 1 x2 e 2 = e 2 𝟙(−∞,0] (x) + e 2 𝟙(0,∞) (x) 2 √2π 2 √2π √2π = g−1 (x)

= g1 (x)

i.e. f(x) = ∫−∞ g y (x) μ(dy) with μ(dy) = 12 δ−1 (dy) + 12 δ1 (dy). Note that X ∼ g1 can be generated by Example 24.7 and that X ∼ g1 󳨐⇒ −X ∼ g−1

holds. Thus by Example 24.3 and the composition method we obtain the following algorithm.

464 | 24 Simulation of Brownian motion 24.10 Algorithm (Composition of one-sided normal samples). Let f denote the density of the one-sided normal distribution, see Example 24.7. s s a) Generate y ∼ U{−1,1} , z ∼ f b) Set x := yz s Then x ∼ N(0, 1). In the next section we will see further algorithms to generate samples of the normal distribution.

24.2 Normal distribution For the one-dimensional standard normal distribution and μ ∈ ℝ, σ > 0 one has X ∼ N(0, 1) 󳨐⇒ σX + μ ∼ N(μ, σ2 ).

Thus we have the following algorithm to transform samples of the standard normal distribution to samples of a general normal distribution. 24.11 Algorithm (Change of mean and variance). Let μ ∈ ℝ and σ > 0. s a) Generate y ∼ N(0, 1) b) Set x := σy + μ s Then x ∼ N(μ, σ2 ).

Based on the above algorithm we can focus in the following on the generation of samples of N(0, 1). The standard approach is the inverse transform method, Algorithm 24.2, with a numerical inversion of the distribution function of the standard normal distribution.

24.12 Algorithm (Numerical inversion). s a) Generate u ∼ U(0,1) b) Set x := F −1 (u) (where F is the N(0, 1)-distribution function) s Then x ∼ N(0, 1).

A different approach which generates two samples at once is the following: 24.13 Algorithm (Box–Muller-method). s a) Generate u1 , u2 ∼ U(0,1) b) Set r := √−2 ln(u1 ), v := 2πu2 c) Set x1 := r cos(v), x2 := r sin(v) s Then x j ∼ N(0, 1).

Proof. Let X1 , X2 ∼ N(0, 1) be independent normal random variables. Their joint

density is f X1 ,X2 (x1 , x2 ) =

1 2π

exp (−

x21 +x22 2 ).

Now we express the vector (X1 , X2 ) in

24.2 Normal distribution | 465

polar coordinates, i.e. (

X1 R cos V )=( ) X2 R sin V

(24.2)

and the probability density becomes, in polar coordinates, r > 0, v ∈ [0, 2π) cos v f R,V (r, v) = f X1 ,X2 (r cos v, r sin v) det ( sin v

1 2 r −r sin v . ) = e− 2 r 2π r cos v

Thus R and V are independent, V ∼ U[0,2π) and the distribution of R2 is 2

√x

x

1

2

ℙ(R ⩽ x) = ∫ e− 2 r r dr = ∫ 0

0

1 1 −1 s e 2 ds = 1 − e− 2 x , 2

i.e. R2 ∼ Exp1/2 . The second step of the algorithm is just an application of the inverse transform method, Examples 24.3 and 24.5, to generate samples of R and V, in the final step (24.2) is applied. A simple application of the central limit theorem can be used to get (approximately) standard normal distributed samples. 24.14 Algorithm (Central limit method). s a) Generate u j ∼ U(0,1) , j = 1, . . . , 12 12

b) Set x := ∑ u j − 6 j=1

s

Then x ∼ N(0, 1) approximately.

Proof. Let Uj ∼ U(0,1) , j = 1, 2, . . . , 12, be independent, then 𝔼(Uj ) = 𝕍 Uj = 𝔼 U j2 − (𝔼 U j )2 =

By the central limit theorem

∑12 j=1 Uj − 12 𝔼 U1 √12 𝕍 U1

12

1 2

and

x3 󵄨󵄨󵄨1 1 4 3 1 󵄨󵄨 − = − = . 󵄨 3 󵄨0 4 12 12 12

= ∑ Uj − 6 ∼ N(0, 1) (approximately). j=1

Clearly one could replace the constant 12 in the first step of the above algorithm by any other integer n, but then the corresponding sum in Step 2 would have to be rescaled 1 by (n/12)− 2 and this would require a further calculation. In the above setting it can be shown that the pointwise difference of the standard normal distribution function and the distribution function of (∑12 j=1 U j − 12 𝔼 U 1 ) /√12 𝕍 U 1 for iid U j ∼ U(0,1) is always less than 0.0023, see [220, Section 3.1]. Now we have seen several ways to generate samples of the normal distribution, i.e. we know how to simulate increments of Brownian motion.

466 | 24 Simulation of Brownian motion

24.3 Brownian motion A simulation of a continuous time process is always based on a finite sample, since we can generate only finitely many values in finite time. Brownian motion is usually simulated on a discrete time grid 0 = t0 < t1 < ⋅ ⋅ ⋅ < t n , and it is common practice for visualizations to interpolate the simulated values linearly. This is justified by the fact that Brownian Motion has continuous paths. Throughout this section (B t )t⩾0 denotes a one-dimensional standard Brownian motion. The simplest method to generate a Brownian path uses the property of stationary and independent increments. 24.15 Algorithm (Independent increments). Let 0 = t0 < t1 < ⋅ ⋅ ⋅ < t n be a time grid. Initialize b0 := 0 For j = 1 to n s a) Generate y ∼ N(0, t j − t j−1 ) b) Set b t j := b t j−1 + y s

Then (b0 , b t1 , . . . , b t n ) ∼ (B0 , B t1 , . . . , B t n ).

This method works particularly well if the step size δ = t j − t j−1 is constant for all j, in s

this case one has to generate only y ∼ N(0, δ) repeatedly. A drawback of the method is, that any refinement of the discretization yields a new simulation and, thus, a different path. To get a refinement of an already simulated path one can use the idea of Lévy’s original argument, see Section 3.4. 24.16 Algorithm (Interpolation. Lévy’s argument). Let 0 ⩽ s0 < s < s1 be fixed times and (b s0 , b s1 ) be a sample value of (B s0 , B s1 ). s (s −s)b s0 +(s−s0 )b s1 (s−s0 )(s1 −s) , s1 −s0 ) a) Generate b s ∼ N ( 1 s1 −s0 s

Then (b s0 , b s , b s1 ) ∼ (B s0 , B s , B s1 ) given that B s0 = b s0 and B s1 = b s1 . Proof. In Section 3.4 we have seen that in the above setting

󵄨 ℙ(B s ∈ ⋅ 󵄨󵄨󵄨 B s0 = b s0 , B s1 = b s1 ) = N (m s , σ2s )

with ms =

(s1 − s)b s0 + (s − s0 )b s1 s1 − s0

and

σ2s = s

(s − s0 )(s1 − s) . s1 − s0

Thus, for 0 = t0 < t1 < ⋅ ⋅ ⋅ < t n and (b0 , b t1 , . . . , b t n ) ∼ (B0 , B t1 , . . . , B t n ) we get s

(b0 , b t1 , . . . , b t j , b r , b t j+1 , . . . , b t n ) ∼ (B0 , B t1 , . . . , B t j , B r , B t j+1 , . . . , B t n )

for some r ∈ (t j , t j+1 ) by generating the intermediate point b r given (b t j , b t j+1 ). Repeated applications yield an arbitrary refinement of a given discrete simulation of a Brownian

24.4 Multivariate Brownian motion | 467

path. Taking only the dyadic refinement of the unit interval we get the Lévy–Ciesielski representation, cf. Section 3.2. 24.17 Algorithm (Lévy–Ciesielski). Let J ⩾ 1 be the order of refinement. Initialize b0 := 0 s Generate b1 ∼ N(0, 1) For j = 0 to J − 1 For l = 0 to 2j − 1 s a) Generate y ∼ N(0, 1) j 1 b) Set b(2l+1)/2j+1 = (b l/2j + b(l+1)/2j ) + 2−( 2 +1) y 2 s Then (b0 , b1/2J , b2/2J , . . . , b1 ) ∼ (B0 , B1/2J , B2/2J , . . . , B1 ).

Proof. As noted at the end of Section 3.4, the expansion of B t at dyadic times has only finitely many terms and

where Y ∼ N(0, 1).

B(2l+1)/2j+1 =

j 1 (B l/2j + B(l+1)/2j ) + 2−( 2 +1) Y 2

Analogous to the central limit method for the normal distribution we can also use Donsker’s invariance principle, Theorem 3.10, to get an approximation to a Brownian path. 24.18 Algorithm (Donsker’s invariance principle). Let n be the number of steps. Initialize b0 := 0 s a) Generate u j ∼ U{−1,1} , j = 1, . . . , n b) For k = 1 to n 1 Set b k/n = u k + b(k−1)/n √n s Then (b0 , b1/n , . . . , b2/n , . . . , b1 ) ∼ (B0 , B1/n , B2/n , . . . , B1 ) approximately. Proof. The result follows as in Section 3.5.

Clearly one could also replace the Bernoulli random walk in the above algorithm by a different symmetric random walk with finite variance.

24.4 Multivariate Brownian motion Since a d-dimensional Brownian motion has independent one-dimensional Brownian motions as components, cf. Corollary 2.9, we can use the algorithms from the last section to generate the components separately. In order to generate a Q-Brownian motion we need to simulate samples of a multivariate normal distribution with correlated components.

468 | 24 Simulation of Brownian motion 24.19 Algorithm (Two-dimensional Normal distribution). Let an expectation vector μ ∈ ℝ2 , variances σ21 , σ22 > 0 and correlation ρ ∈ [−1, 1] be given. s

a) Generate y1 , y2 , y3 ∼ N(0, 1) b) Set x1 := σ1 (√1 − |ρ| y1 + √|ρ| y3 ) + μ1

x2 := σ2 (√1 − |ρ| y2 + sgn(ρ)√|ρ| y3 ) + μ2 s

Then (x1 , x2 )⊤ ∼ N (μ, (

σ21 ρσ1 σ2

ρσ1 σ2 )) . σ22

Proof. Let Y1 , Y2 , Y3 ∼ N(0, 1) be independent random variables. Then σ1 √1 − |ρ| (

Y1 0 σ1 Y3 ) + σ2 √1 − |ρ| ( ) + √|ρ| ( ) 0 Y2 sgn(ρ)σ2 Y3

is a normal random variable since it is a sum of independent N(0, .) distributed random vectors. Observe that the last vector is bivariate normal with completely dependent components. Calculating the covariance matrix yields the statement. This algorithm shows nicely how the correlation of the components is introduced into the sample. In the next algorithm this is less obvious, but the algorithm works in any dimension. 24.20 Algorithm (Multivariate Normal distribution). Let a positive semidefinite covariance matrix Q ∈ ℝd×d and an expectation vector μ ∈ ℝd be given. Calculate the Cholesky decomposition ΣΣ⊤ = Q s a) Generate y1 , . . . , y d ∼ N(0, 1) and let y be the vector with components y1 , . . . , y d b) Set x := μ + Σy s Then x ∼ N(μ, Q). Proof. Compare with the discussion following Definition 2.11.

Now we can simulate Q-Brownian motion using the independence and stationarity of the increments. 24.21 Algorithm (Q-Brownian motion). Let 0 = t0 < t1 < ⋅ ⋅ ⋅ < t n be a time grid and Q ∈ ℝd×d be a positive semidefinite covariance matrix. Initialize b0k := 0, k = 1, . . . , d For j = 1 to n s a) Generate y ∼ N (0, (t j − t j−1 )Q) b) Set b t j := b t j−1 + y s Then (b0 , b t1 , . . . , b t n ) ∼ (B0 , B t1 , . . . , B t n ), where (B t )t⩾0 is a Q-Brownian motion.

24.5 Stochastic differential equations | 469

(2)

Bt

(1)

Bt

Fig. 24.1: Simulation of a twodimensional Q-Brownian motion.

24.22 Example. Figure 24.1 shows the path of a Q-Brownian motion for t ∈ [0, 10] 1 ρ with step width 0.0001 and correlation ρ = 0.8, i.e. Q = ( ). Note that it seems ρ 1 as if the correlation were visible, but – especially since it is just a single sample and the scaling of the axes is not given – one should be careful when drawing conclusions from samples.

24.5 Stochastic differential equations Based on Chapter 21 we consider in this section a process (X t )t⩾0 defined by the stochastic differential equation X0 = x0

and

dX t = b(t, X t ) dt + σ(t, X t ) dB t .

(24.3)

Corollary 21.12 and Theorem 21.13 provide conditions which ensure that (24.3) has a unique solution. We present two algorithms that allow us to simulate the solution. 24.23 Algorithm (Euler scheme). Assume that b and σ in (24.3) satisfy for all x, y ∈ ℝ, 0 ⩽ s < t ⩽ T and some K > 0 |b(t, x) − b(t, y)| + |σ(t, x) − σ(t, y)| ⩽ K |x − y|,

|b(t, x)| + |σ(t, x)| ⩽ K(1 + |x|),

|b(t, x) − b(s, x)| + |σ(t, x) − σ(s, x)| ⩽ K(1 + |x|)√t − s.

(24.4)

(24.5) (24.6)

Let n ⩾ 1, 0 = t0 < t1 < ⋅ ⋅ ⋅ < t n = T be a time grid, and x0 be the initial position. For j = 1 to n s a) Generate y ∼ N(0, t j − t j−1 ) b) Set x t j := x t j−1 + b(t j−1 , x t j−1 )(t j − t j−1 ) + σ(t j−1 , x t j−1 ) y s

Then (x0 , x t1 , . . . , x t n ) ∼ (X0 , X t1 , . . . , X t n ) approximately, where (X t )t⩾0 is the solution to (24.3).

470 | 24 Simulation of Brownian motion Note that the conditions (24.4) and (24.5) ensure the existence of a unique solution. The approximation error can be measured in various ways. For this denote, in the setting of the algorithm, the maximal step width by δ := maxj=1,...,n (t j − t j−1 ) and the sequence of random variables corresponding to step 2 by (X tδj )j=0,...,n . Given a Brownian motion (B t )t⩾0 , we can define X tδj , by X tδ0 := x0 , and for j = 1, . . . , n by

X tδj := X tδj−1 + b(t j−1 , X tδj−1 )(t j − t j−1 ) + σ(t j−1 , X tδj−1 )(B t j − B t j−1 ).

(24.7)

The process (X tδj )j=0,...,n is called the Euler scheme approximation of the solution (X t )t⩾0 to the equation (24.3). We say that an approximation scheme has strong order of convergence β if 𝔼 |X Tδ − X T | ⩽ Cδ β .

The following result shows that the algorithm provides an approximate solution to (24.3). 24.24 Theorem. In the setting of Algorithm 24.23 the strong order of convergence of the Euler scheme is 1/2. Moreover, the convergence is locally uniform, i.e. sup 𝔼 |X tδ − X t | ⩽ c T √ δ, t⩽T

where (X tδ )t⩽T is the piecewise constant extension of the Euler scheme from the discrete time set t0 , . . . , t n to [0, T] defined by X tδ := X tδn(t)

with n(t) := max {n : t n ⩽ t}.

In the same setting and with some additional technical conditions, cf. [5] and [145], one can show that the weak order of convergence of the Euler scheme is 1, i.e. 󵄨󵄨 󵄨 󵄨󵄨 𝔼 g(X Tδ ) − 𝔼 g(X T )󵄨󵄨󵄨 ⩽ c T δ

for any g ∈ C4 whose first four derivatives grow at most polynomially.

Proof of Theorem 24.24.. Let b, σ, T, n, be as in Algorithm 24.23. Furthermore let (X t )t⩾0 be the solution to the corresponding stochastic differential equation (24.3), (X tδj )j=0,...,n

be the Euler scheme defined by (24.7) and (X tδ )t⩽T be its piecewise constant extension onto [0, T]. Note that this allows us to rewrite X tδ as t n(t)

X tδ

= x0 + ∫ 0

t n(t)

b(t n(s) , X tδn(s) )

ds + ∫ σ(t n(s) , X tδn(s) ) dB s . 0

24.5 Stochastic differential equations |

471

Thus, we can calculate the error by Z(T) := sup 𝔼 [|X tδ − X t |2 ] ⩽ 𝔼 [sup |X tδ − X t |2 ] t⩽T

t⩽T

(24.8)

⩽ 3Z1 (T) + 3Z2 (T) + 3Z3 (T),

where Z1 , Z2 and Z3 will be given explicitly below. We begin with the first term and estimate it analogously to the proof of Theorem 21.11, see (21.20), (21.21), and apply the estimate (24.4). 󵄨󵄨 t n(t) 󵄨 [ Z1 (T) := 𝔼 sup 󵄨󵄨󵄨󵄨 ∫ (b(t n(s) , X tδn(s) ) − b(t n(s) , X s )) ds t⩽T 󵄨󵄨 0 [

󵄨󵄨2 󵄨 + ∫ (σ(t n(s) , X tδn(s) ) − σ(t n(s) , X s )) dB s 󵄨󵄨󵄨󵄨 ] 󵄨󵄨 0 ] t n(t)

T

󵄨 󵄨2 ⩽ 2T ∫ 𝔼 [󵄨󵄨󵄨b(t n(s) , X tδn(s) ) − b(t n(s) , X s )󵄨󵄨󵄨 ] ds 0

T

󵄨 󵄨2 + 8 ∫ 𝔼 [󵄨󵄨󵄨σ(t n(s) , X tδn(s) ) − σ(t n(s) , X s )󵄨󵄨󵄨 ] ds 0

T

2

2

⩽ (2TK + 8K ) ∫ 𝔼 [|X tδn(s) − X s |2 ] ds 0

T

⩽ C ∫ Z(s) ds. 0

The second term in (24.8) can be estimated in a similar way. t n(t) 󵄨󵄨 t n(t) 󵄨󵄨󵄨2 󵄨 Z2 (T) := 𝔼 [ sup 󵄨󵄨󵄨󵄨 ∫ (b(t n(s) , X s ) − b(s, X s )) ds + ∫ (σ(t n(s) , X s ) − σ(s, X s )) dB s 󵄨󵄨󵄨󵄨 ] 󵄨󵄨 t⩽T 󵄨󵄨 0 0 [ ] T

T

⩽ 2T ∫ 𝔼 [|b(t n(s) , X s ) − b(s, X s )|2 ] ds + 8 ∫ 𝔼 [|σ(t n(s) , X s ) − σ(s, X s )|2 ] ds 0

0

T

2

⩽ (2TK 2 + 8K 2 ) ∫ (1 + 𝔼 |X s |)2 (√s − t n(s) ) ds 0

⩽ Cδ,

where we use (24.6) for the second inequality, and |s − t n(s) | < δ and (21.22) for the last inequality. The terms Z1 and Z2 consider the integral only up to time t n(t) , thus the

472 | 24 Simulation of Brownian motion third term takes care of the remainder up to time t. It can be estimated using (24.5), |s − t n(s) | < δ and the estimate (21.22). t 󵄨󵄨 t 󵄨󵄨2 󵄨󵄨 󵄨 [ Z3 (T) := 𝔼 sup 󵄨󵄨󵄨 ∫ b(s, X s ) ds + ∫ σ(s, X s ) dB s 󵄨󵄨󵄨󵄨 ] 󵄨󵄨 t⩽T 󵄨󵄨 t t [ ] n(t)

n(t)

t

t

⩽ 2δ ∫ 𝔼 [|b(s, X s )| ] ds + 8 ∫ 𝔼 [|σ(s, X s )|2 ] ds ⩽ Cδ. 2

t n(t)

t n(t)

Note that all constants appearing in the estimates above may depend on T, but do not depend on δ. Thus we have shown T

Z(T) ⩽ C T δ + C T ∫ Z(s) ds, 0

where C T is independent of δ. By Gronwall’s lemma, Theorem A.42, we obtain that Z(T) ⩽ C T e TC T δ and, finally, sup 𝔼 [|X tδ − X t |] ⩽ √ Z(T) ⩽ c T √ δ. t⩽T

In the theory of ordinary differential equations one obtains higher order approximation schemes with the help of Taylor’s formula. Similarly for stochastic differential equations an application of a stochastic Taylor–Itô formula yields the following scheme which converges with strong order 1 to the solution of (24.3), for a proof see [145]. There it is also shown that the weak order is 1, thus it is the same as for the Euler scheme. 24.25 Algorithm (Milstein scheme). Assume that b and σ satisfy the conditions of Algorithm 24.23 and σ(t, ⋅) ∈ C2 . Let T > 0, n ⩾ 1, 0 = t0 < t1 < ⋅ ⋅ ⋅ < t n = T be a time grid and x0 be the starting position. For j = 1 to n s a) Generate y ∼ N(0, t j − t j−1 ) b) Set x t j := x t j−1 + b(t j−1 , x t j−1 )(t j − t j−1 ) + σ(t j−1 , x t j−1 ) y 1 + σ(t j−1 , x t j−1 )σ(0,1) (t j−1 , x t j−1 )(y2 − (t j − t j−1 )) 2 ∂ σ(t, x)) (where σ(0,1) (t, x) := ∂x s

Then (x0 , x t1 , . . . , x t n ) ∼ (X0 , X t1 , . . . , X t n ) approximately.

In the literature one finds a variety of further algorithms for the numerical evaluation of SDEs. Some of them have higher orders of convergence than the Euler or Milstein scheme, see for example [145] and the references given therein. Depending on the actual simulation task one might be more interested in the weak or strong order of convergence. The strong order is important for pathwise approximations whereas the weak order is important if one is interested only in distributional quantities.

24.6 Monte Carlo method |

473

We close this chapter by one of the most important applications of samples: the Monte Carlo method.

24.6 Monte Carlo method The Monte Carlo method is nothing but an application of the strong law of large numbers. Recall that for iid random variables (Y j )j⩾1 with finite expectation 1 n n→∞ ∑ Y j 󳨀󳨀󳨀󳨀→ 𝔼 Y1 n j=1

almost surely.

Thus, if we are interested in the expectation of g(X) where X is a random variable and g a measurable function, we can approximate 𝔼 g(X) by the following algorithm.

24.26 Algorithm (Monte Carlo method). Let X be a random variable with distribution F and g be a function such that 𝔼 |g(X)| < ∞. Let n ⩾ 1. s a) Generate x j ∼ F, j = 1, . . . , n 1 n b) Set y := ∑ g(x j ) n j=1

Then y is an approximation of 𝔼 g(X).

The approximation error is, by the central limit theorem and the theorem of Berry– Esseen, proportional to 1/√n. 24.27 Example. In applications one often defines a model by an SDE of the form (24.3) and one is interested in the value of 𝔼 g(X t ), where g is some function, and (X t )t⩾0 is the solution of the SDE. We know how to simulate the normal distributed increments of the underlying Brownian motion, e.g. by Algorithm 24.13. Hence we can simulate samples of X t by the Euler scheme, Algorithm 24.23. Doing this repeatedly corresponds to step 1 in the Monte Carlo method. Therefore, we get by step 2 an approximation of 𝔼 g(X t ). 24.28 Further reading. General monographs on simulation are [5] which focuses, in particular, on steady states, rare events, Gaussian processes and SDEs, and [223] which focuses in particular on Monte Carlo simulation, variance reduction and Markov chain Monte Carlo. More on the numerical treatment of SDEs can be found in [145]. A good starting point for the extension to Lévy processes and applications in finance is [42]. Asmussen, Glynn: Stochastic Simulation: Algorithms and Analysis. [5] Cont, Tankov: Financial modelling with jump processes. [42] [145] Kloeden, Platen: Numerical Solution of Stochastic Differential Equations. [223] Rubinstein, Kroese: Simulation and the Monte Carlo Method Wiley.

A Appendix

A.1 Kolmogorov's existence theorem

In this appendix we give a proof of Kolmogorov's theorem, Theorem 4.8, on the existence of stochastic processes with prescribed finite dimensional distributions. In addition to the notation introduced in Chapter 4, we use π^L_J : (ℝ^d)^L → (ℝ^d)^J for the projection onto the coordinates from J ⊂ L ⊂ I. Moreover, we write π_J := π^I_J and H for the family of all finite subsets of I. Finally,

    Z := {π_J^{-1}(B) : J ∈ H, B ∈ B((ℝ^d)^J)}

denotes the family of cylinder sets, and we call Z = π_J^{-1}(B) a J-cylinder with basis B. If J ∈ H and #J = j, we can identify (ℝ^d)^J with ℝ^{jd}. For the proof of Theorem 4.8 we need the following auxiliary result.

A.1 Lemma (regularity). Every probability measure μ on (ℝ^n, B(ℝ^n)) is outer regular,

    μ(B) = inf {μ(U) : U ⊃ B, U open},        (A.1)

and inner (compact) regular,

    μ(B) = sup {μ(F) : F ⊂ B, F closed} = sup {μ(K) : K ⊂ B, K compact}.        (A.2)

Proof. Consider the family

    Σ := {A ⊂ ℝ^n : ∀ϵ > 0 ∃U open, F closed, F ⊂ A ⊂ U, μ(U \ F) < ϵ}.

It is easy to see that Σ is a σ-algebra¹ which contains all closed sets. Indeed, if F ⊂ ℝ^n is closed, the sets

    U_n := F + 𝔹(0, 1/n) = {x + y : x ∈ F, y ∈ 𝔹(0, 1/n)}

are open and ⋂_{n⩾1} U_n = F. By the continuity of measures, lim_{n→∞} μ(U_n) = μ(F), which shows that F ⊂ U_n and μ(U_n \ F) < ϵ for sufficiently large values of n > N(ϵ). This proves F ∈ Σ, and we conclude that B(ℝ^n) = σ(closed subsets) ⊂ σ(Σ) = Σ.

In particular, we can construct for every B ∈ B(ℝ^n) sequences of open sets (U_n)_n and closed sets (F_n)_n such that F_n ⊂ B ⊂ U_n and lim_{n→∞} μ(U_n \ F_n) = 0. Therefore,

    μ(B \ F_n) + μ(U_n \ B) ⩽ 2 μ(U_n \ F_n) → 0  as n → ∞,

which shows that the infimum in (A.1) and the supremum in the first equality of (A.2) are attained. Finally, replace in the last step the closed sets F_n with the compact sets K_n := F_n ∩ 𝔹(0, R), R = R(n). Using the continuity of measures, we get lim_{R→∞} μ(F_n \ F_n ∩ 𝔹(0, R)) = 0, and for sufficiently large R = R(n) with R(n) → ∞,

    μ(B \ K_n) ⩽ μ(B \ F_n) + μ(F_n \ K_n) → 0  as n → ∞.

This proves the second equality in (A.2).

¹ Indeed: ∅ ∈ Σ is obvious. Pick A ∈ Σ. For every ϵ > 0 there are closed and open sets F_ϵ ⊂ A ⊂ U_ϵ such that μ(U_ϵ \ F_ϵ) < ϵ. Observe that U_ϵ^c ⊂ A^c ⊂ F_ϵ^c, that U_ϵ^c is closed, that F_ϵ^c is open and that F_ϵ^c \ U_ϵ^c = F_ϵ^c ∩ (U_ϵ^c)^c = F_ϵ^c ∩ U_ϵ = U_ϵ \ F_ϵ. Thus, μ(F_ϵ^c \ U_ϵ^c) = μ(U_ϵ \ F_ϵ) < ϵ, and we find that A^c ∈ Σ. Let A_n ∈ Σ, n ⩾ 1. For ϵ > 0 there exist sets F_n ⊂ A_n ⊂ U_n satisfying μ(U_n \ F_n) < ϵ/2^n. Therefore, the closed set Φ_N := F_1 ∪ ⋯ ∪ F_N ⊂ A_1 ∪ ⋯ ∪ A_N is contained in the open set U := ⋃_{j⩾1} U_j. Since

    ⋂_{N⩾1} (U \ Φ_N) = ⋃_{j⩾1} U_j \ ⋃_{k⩾1} F_k = ⋃_{j⩾1} (U_j \ ⋃_{k⩾1} F_k) ⊂ ⋃_{j⩾1} (U_j \ F_j),

we find by the continuity and the σ-subadditivity of the (finite) measure μ

    lim_{N→∞} μ(U \ Φ_N) = μ(⋃_{j⩾1} U_j \ ⋃_{k⩾1} F_k) ⩽ ∑_{j⩾1} μ(U_j \ F_j) ⩽ ∑_{j⩾1} ϵ/2^j = ϵ.

Thus, μ(U \ Φ_N) ⩽ 2ϵ for all sufficiently large N > N(ϵ). This proves ⋃_{n⩾1} A_n ∈ Σ.

A.2 Theorem (Theorem 4.8. Kolmogorov 1933). Let I ⊂ [0, ∞) and let (p_J)_{J∈H} be a projective family of probability measures p_J defined on ((ℝ^d)^J, B((ℝ^d)^J)), J ∈ H. Then there exists a unique probability measure μ on the space ((ℝ^d)^I, B^I) such that p_J = μ ∘ π_J^{-1}, where π_J is the canonical projection of (ℝ^d)^I onto (ℝ^d)^J. The measure μ is often called the projective limit of the family (p_J)_{J∈H}.

Proof. We use the family (p_J)_{J∈H} to define a measure relative to the cylinder sets Z,

    μ ∘ π_J^{-1}(B) := p_J(B)  for all B ∈ B^J.        (A.3)

Since Z is an algebra of sets, i.e. ∅ ∈ Z and Z is stable under finite unions, intersections and complements, we can use Carathéodory's extension theorem to extend μ onto B^I = σ(Z).

Let us, first of all, check that μ is well-defined and that it does not depend on the particular representation of the cylinder set Z. Assume that Z = π_J^{-1}(B) = π_K^{-1}(B′) for J, K ∈ H and B ∈ B^J, B′ ∈ B^K. We have to show that

    p_J(B) = p_K(B′).

Set L := J ∪ K ∈ H. Then

    Z = π_J^{-1}(B) = (π^L_J ∘ π_L)^{-1}(B) = π_L^{-1} ∘ (π^L_J)^{-1}(B),
    Z = π_K^{-1}(B′) = (π^L_K ∘ π_L)^{-1}(B′) = π_L^{-1} ∘ (π^L_K)^{-1}(B′).

Thus, (π^L_J)^{-1}(B) = (π^L_K)^{-1}(B′), and the projectivity (4.6) of the family (p_J)_{J∈H} gives

    p_J(B) = π^L_J(p_L)(B) = p_L((π^L_J)^{-1}(B)) = p_L((π^L_K)^{-1}(B′)) = p_K(B′).

Next we verify that μ is an additive set function: μ(∅) = 0 is obvious. Let Z, Z′ ∈ Z such that Z = π_J^{-1}(B) and Z′ = π_K^{-1}(B′). Setting L := J ∪ K we can find suitable basis sets A, A′ ∈ B^L such that

    Z = π_L^{-1}(A)  and  Z′ = π_L^{-1}(A′).

If Z ∩ Z′ = ∅ we see that A ∩ A′ = ∅, and μ inherits its additivity from p_L (all unions below are disjoint):

    μ(Z ∪ Z′) = μ(π_L^{-1}(A) ∪ π_L^{-1}(A′)) = μ(π_L^{-1}(A ∪ A′)) = p_L(A ∪ A′) = p_L(A) + p_L(A′) = ⋯ = μ(Z) + μ(Z′).

Finally, we prove that μ is σ-additive on Z. For this we show that μ is continuous at the empty set, i.e. for every decreasing sequence of cylinder sets (Z_n)_{n⩾1} ⊂ Z with ⋂_{n⩾1} Z_n = ∅ we have lim_{n→∞} μ(Z_n) = 0. Equivalently,

    Z_n ∈ Z, Z_n ⊃ Z_{n+1} ⊃ ⋯, μ(Z_n) ⩾ α > 0  ⇒  ⋂_{n⩾1} Z_n ≠ ∅.

Write for such a sequence of cylinder sets Z_n = π_{J_n}^{-1}(B_n) with suitable J_n ∈ H and B_n ∈ B^{J_n}. Since for every J ⊂ K any J-cylinder is also a K-cylinder, we may assume without loss of generality that J_n ⊂ J_{n+1}. Note that (ℝ^d)^{J_n} ≅ (ℝ^d)^{#J_n}; therefore we can use Lemma A.1 to see that p_{J_n} is inner compact regular. Thus, there exist compact sets K_n ⊂ B_n such that

    p_{J_n}(B_n \ K_n) ⩽ 2^{-n} α.

Write Z′_n := π_{J_n}^{-1}(K_n) and observe that

    μ(Z_n \ Z′_n) = μ(π_{J_n}^{-1}(B_n \ K_n)) = p_{J_n}(B_n \ K_n) ⩽ 2^{-n} α.


Although the Z_n are decreasing, this might not be the case for the Z′_n; therefore we consider Ẑ_n := Z′_1 ∩ ⋯ ∩ Z′_n ⊂ Z_n and find

    μ(Ẑ_n) = μ(Z_n) − μ(Z_n \ Ẑ_n) = μ(Z_n) − μ(⋃_{k=1}^n (Z_n \ Z′_k))
           ⩾ μ(Z_n) − ∑_{k=1}^n μ(Z_k \ Z′_k)        (since Z_n \ Z′_k ⊂ Z_k \ Z′_k)
           ⩾ α − ∑_{k=1}^n 2^{-k} α > 0.

We are done if we can show that ⋂_{n⩾1} Z′_n = ⋂_{n⩾1} Ẑ_n ≠ ∅. For this we construct some element z ∈ ⋂_{n⩾1} Ẑ_n.

1° For every m ⩾ 1 we pick some f_m ∈ Ẑ_m. Since the sequence (Ẑ_n)_{n⩾1} is decreasing, f_m ∈ Ẑ_n ⊂ Z′_n, hence π_{J_n}(f_m) ∈ K_n for all m ⩾ n and n ⩾ 1. Since projections are continuous, and since continuous maps preserve compactness,

    π_t(f_m) = π^{J_n}_t ∘ π_{J_n}(f_m) ∈ π^{J_n}_t(K_n)  for all m ⩾ n, n ⩾ 1 and t ∈ J_n,

and the sets π^{J_n}_t(K_n) are compact.

2° The index set H := ⋃_{n⩾1} J_n is countable; denote by H = (t_k)_{k⩾1} some enumeration. For each t_k ∈ H there is some n(k) ⩾ 1 such that t_k ∈ J_{n(k)}. Moreover, by 1°,

    (π_{t_k}(f_m))_{m⩾n(k)} ⊂ π^{J_{n(k)}}_{t_k}(K_{n(k)}).

Since the sets π^{J_{n(k)}}_{t_k}(K_{n(k)}) are compact, we can take repeatedly subsequences:

    (π_{t_1}(f_m))_{m⩾n(1)} ⊂ compact set  ⇒  ∃(f^1_m)_m ⊂ (f_m)_m : ∃ lim_{m→∞} π_{t_1}(f^1_m),
    (π_{t_2}(f^1_m))_{m⩾n(2)} ⊂ compact set  ⇒  ∃(f^2_m)_m ⊂ (f^1_m)_m : ∃ lim_{m→∞} π_{t_2}(f^2_m),
    ⋮
    (π_{t_k}(f^{k−1}_m))_{m⩾n(k)} ⊂ compact set  ⇒  ∃(f^k_m)_m ⊂ (f^{k−1}_m)_m : ∃ lim_{m→∞} π_{t_k}(f^k_m).

The diagonal sequence (f^m_m)_{m⩾1} satisfies

    ∃ lim_{m→∞} π_t(f^m_m) =: z_t  for each t ∈ H.

3° By construction, π_t(f_m) ∈ π^{J_n}_t(K_n) for all t ∈ J_n and m ⩾ n. Therefore,

    π_t(f^m_m) ∈ π^{J_n}_t(K_n)  for all m ⩾ n and t ∈ J_n.

Since K_n is compact, π^{J_n}_t(K_n) is compact, hence closed, and we see that

    z_t = lim_{m→∞} π_t(f^m_m) ∈ π^{J_n}_t(K_n)  for every t ∈ J_n.

Using the fact that n ⩾ 1 is arbitrary, we find that

    z(t) := z_t  if t ∈ H,    z(t) := ∗  if t ∈ I \ H

(∗ stands for any element from ℝ^d) defines an element z = (z(t))_{t∈I} ∈ (ℝ^d)^I with the property that for all n ⩾ 1

    π_{J_n}(z) ∈ K_n  if, and only if,  z ∈ π_{J_n}^{-1}(K_n) = Z′_n ⊃ Ẑ_n.

This proves z ∈ ⋂_{n⩾1} Z′_n = ⋂_{n⩾1} Ẑ_n, hence ⋂_{n⩾1} Z_n ≠ ∅.

A.2 A property of conditional expectations

We assume that you have some basic knowledge of conditioning and conditional expectations with respect to a σ-algebra. Several times we will use the following result which is not always contained in elementary expositions. Throughout this section let (Ω, A , ℙ) be a probability space.

A.3 Lemma. Let X : (Ω, 𝒜) → (D, 𝒟) and Y : (Ω, 𝒜) → (E, ℰ) be two random variables. Assume that 𝒳, 𝒴 ⊂ 𝒜 are σ-algebras such that X is 𝒳/𝒟 measurable, Y is 𝒴/ℰ measurable and 𝒳 ⊥⊥ 𝒴. Then

    𝔼[Φ(X, Y) | 𝒳] = [𝔼 Φ(x, Y)]|_{x=X} = 𝔼[Φ(X, Y) | X]        (A.4)

holds for all bounded 𝒟 × ℰ/B(ℝ) measurable functions Φ : D × E → ℝ. If Ψ : D × Ω → ℝ is bounded and 𝒟 ⊗ 𝒴/B(ℝ) measurable, then

    𝔼[Ψ(X(⋅), ⋅) | 𝒳] = [𝔼 Ψ(x, ⋅)]|_{x=X} = 𝔼[Ψ(X(⋅), ⋅) | X].        (A.5)

Proof. Assume first that Φ(x, y) is of the form ϕ(x)ψ(y). Then

    𝔼[ϕ(X)ψ(Y) | 𝒳] = ϕ(X) 𝔼[ψ(Y) | 𝒳] = ϕ(X) 𝔼[ψ(Y)] = 𝔼[ϕ(x)ψ(Y)]|_{x=X},

where we use that ϕ(X) is 𝒳 measurable and that 𝒴 ⊥⊥ 𝒳; note that ϕ(X)ψ(Y) = Φ(X, Y) and ϕ(x)ψ(Y) = Φ(x, Y).

Now fix some F ∈ 𝒳 and pick ϕ(x) = 𝟙_A(x) and ψ(y) = 𝟙_B(y), A ∈ 𝒟, B ∈ ℰ. Our argument shows that

    ∫_F 𝟙_{A×B}(X(ω), Y(ω)) ℙ(dω) = ∫_F 𝔼 𝟙_{A×B}(x, Y)|_{x=X(ω)} ℙ(dω),   F ∈ 𝒳.

Both sides of this equality (can be extended from 𝒟 × ℰ to) define measures on the product σ-algebra 𝒟 ⊗ ℰ. By linearity, this becomes

    ∫_F Φ(X(ω), Y(ω)) ℙ(dω) = ∫_F 𝔼 Φ(x, Y)|_{x=X(ω)} ℙ(dω),   F ∈ 𝒳,

for 𝒟 ⊗ ℰ measurable positive step functions Φ; using standard arguments from the theory of measure and integration, we get this equality for positive measurable functions and then for all bounded measurable functions. The latter is, however, equivalent to the first equality in (A.4). The second equality follows if we take conditional expectations 𝔼(⋯ | X) on both sides of the first equality and use the tower property of conditional expectation. The formulae (A.5) follow in a similar way.

A simple consequence of Lemma A.3 are the following formulae.

A.4 Corollary. Let X : (Ω, 𝒜) → (D, 𝒟) and Y : (Ω, 𝒜) → (E, ℰ) be two random variables. Assume that 𝒳, 𝒴 ⊂ 𝒜 are σ-algebras such that X is 𝒳/𝒟 measurable, Y is 𝒴/ℰ measurable and 𝒳 ⊥⊥ 𝒴. Then

    𝔼 Φ(X, Y) = ∫ 𝔼 Φ(x, Y) ℙ(X ∈ dx) = 𝔼[∫ Φ(x, Y) ℙ(X ∈ dx)]        (A.6)

holds for all bounded 𝒟 × ℰ/B(ℝ) measurable functions Φ : D × E → ℝ. If Ψ : D × Ω → ℝ is bounded and 𝒟 ⊗ 𝒴/B(ℝ) measurable, then

    𝔼 Ψ(X(⋅), ⋅) = ∫ 𝔼 Ψ(x, ⋅) ℙ(X ∈ dx) = 𝔼[∫ Ψ(x, ⋅) ℙ(X ∈ dx)].        (A.7)

A.3 From discrete to continuous time martingales

We assume that you have some basic knowledge of discrete time martingales, e.g. as in [232; 236]. The key to the continuous time setting is the following simple observation: if (X_t, F_t)_{t⩾0} is a martingale, then (X_{t_j}, F_{t_j})_{j⩾1} is a martingale for any sequence 0 ⩽ t_1 < t_2 < t_3 < ⋯.

We are mainly interested in estimates and convergence results for martingales. Our strategy is to transfer the corresponding results from the discrete time setting to continuous time. The key result is Doob's upcrossing estimate. Let (X_t, F_t)_{t⩾0} be a real-valued martingale, I ⊂ [0, ∞) be a finite index set with t_min = min I, t_max = max I, and a < b. Then

    U(I; [a, b]) = max {m : ∃ σ_1 < τ_1 < σ_2 < τ_2 < ⋯ < σ_m < τ_m, σ_j, τ_j ∈ I, X(σ_j) < a < b < X(τ_j)}

is the number of upcrossings of (X_s)_{s∈I} across the strip [a, b]. Figure A.1 shows two upcrossings.

Fig. A.1: Two upcrossings over the interval [a, b].

Since (X_s, F_s)_{s∈I} is a discrete time martingale, we know

    𝔼[U(I; [a, b])] ⩽ (1/(b − a)) 𝔼[(X(t_max) − a)⁺].        (A.8)

A.5 Lemma (upcrossing estimate). Let (X_t, F_t)_{t⩾0} be a real-valued submartingale and D ⊂ [0, ∞) a countable subset. Then

    𝔼[U(D ∩ [0, t]; [a, b])] ⩽ (1/(b − a)) 𝔼[(X(t) − a)⁺]        (A.9)

for all t > 0 and −∞ < a < b < ∞.

Proof. Let D_N ⊂ D be finite index sets such that D_N ↑ D. If t_max := max D_N ∩ [0, t], we get from (A.8) with I = D_N ∩ [0, t]

    𝔼[U(D_N ∩ [0, t]; [a, b])] ⩽ (1/(b − a)) 𝔼[(X(t_max) − a)⁺] ⩽ (1/(b − a)) 𝔼[(X(t) − a)⁺].

For the last estimate observe that (X_t − a)⁺ is a submartingale since x ↦ x⁺ is convex and increasing. Since sup_N U(D_N ∩ [0, t]; [a, b]) = U(D ∩ [0, t]; [a, b]), (A.9) follows from monotone convergence.

A.6 Theorem (martingale convergence theorem. Doob 1953). Let (X t , Ft )t⩾0 be a martingale in ℝd or a submartingale in ℝ with continuous sample paths. If supt⩾0 𝔼 |X t | < ∞, then the limit limt→∞ X t exists almost surely and defines a finite F∞ measurable random variable X∞ . If (X t , Ft )t⩾0 is a martingale of the form X t = 𝔼(Z | Ft ) for some Z ∈ L1 (ℙ), then we have X∞ = 𝔼(Z | F∞ ) and limt→∞ X t = X∞ almost surely and in L1 (ℙ).


Proof. Since a d-dimensional function converges if, and only if, all coordinate functions converge, we may assume that X_t is real valued. By Doob's upcrossing estimate,

    𝔼[U(ℚ⁺; [a, b])] ⩽ sup_{t⩾0} 𝔼[(X_t − a)⁺] / (b − a) < ∞,

so U(ℚ⁺; [a, b]) < ∞ almost surely for all rational a < b. Since the sample paths are continuous, this shows that X_∞ := lim_{t→∞} X_t exists almost surely, and Fatou's lemma gives 𝔼|X_∞| ⩽ sup_{t⩾0} 𝔼|X_t| < ∞, i.e. X_∞ is a finite F_∞ measurable random variable.

Assume now that X_t = 𝔼(Z | F_t) for some Z ∈ L¹(ℙ) and set Y := 𝔼(Z | F_∞). By the tower property, X_t = 𝔼(Y | F_t), hence |X_t| ⩽ 𝔼(|Y| | F_t). Since {|X_t| > R} ∈ F_t, we get for all R > 0 and K > 0

    ∫_{{|X_t|>R}} |X_t| dℙ ⩽ ∫_{{|X_t|>R}} 𝔼(|Y| | F_t) dℙ
      = ∫_{{|X_t|>R}} |Y| dℙ
      = ∫_{{|X_t|>R}∩{|Y|>K}} |Y| dℙ + ∫_{{|X_t|>R}∩{|Y|⩽K}} |Y| dℙ
      ⩽ ∫_{{|Y|>K}} |Y| dℙ + K ℙ(|X_t| > R)
      ⩽ ∫_{{|Y|>K}} |Y| dℙ + (K/R) 𝔼|X_t|
      ⩽ ∫_{{|Y|>K}} |Y| dℙ + (K/R) 𝔼|Y|
      →(R→∞) ∫_{{|Y|>K}} |Y| dℙ →(K→∞) 0,

using dominated convergence in the last step. This shows that the family (X_t)_{t⩾0} is uniformly integrable, so the almost sure convergence also holds in L¹(ℙ). Finally, for F ∈ F_s and t ⩾ s we have ∫_F X_t dℙ = ∫_F Y dℙ; letting t → ∞ gives ∫_F X_∞ dℙ = ∫_F Y dℙ for all F ∈ ⋃_{s⩾0} F_s, and since X_∞ and Y are F_∞ measurable, we conclude that X_∞ = Y = 𝔼(Z | F_∞) almost surely and lim_{t→∞} X_t = X_∞ in L¹(ℙ).

For backwards submartingales, i.e. for submartingales with a decreasing index set, the assumptions of the convergence theorems A.6 and A.7 are always satisfied.

A.8 Corollary (backwards convergence theorem). Let (X_t, F_t) be a submartingale and let t_1 ⩾ t_2 ⩾ t_3 ⩾ …, t_j ↓ t_∞, be a decreasing sequence. Then (X_{t_j}, F_{t_j})_{j⩾1} is a (backwards) submartingale which is uniformly integrable. In particular, the limit lim_{j→∞} X_{t_j} exists in L¹ and almost surely.


Proof. That (X_{t_j})_{j⩾1} is a (backwards directed) submartingale is obvious. Now we use that 𝔼 X_0 ⩽ 𝔼 X_t and that (X⁺_t)_{t⩾0} is a submartingale. Therefore, we get for t ⩽ t_1

    𝔼|X_t| = 2𝔼 X⁺_t − 𝔼 X_t ⩽ 2𝔼 X⁺_t − 𝔼 X_0 ⩽ 2𝔼 X⁺_{t_1} − 𝔼 X_0,

which shows that sup_{j⩾1} 𝔼|X_{t_j}| < ∞. Using Doob's upcrossing estimate we can argue as in the martingale convergence theorem, Theorem A.6, and conclude that lim_{j→∞} X_{t_j} = Z almost surely for some F_{t_∞} measurable random variable Z ∈ L¹(ℙ).

For a submartingale 𝔼 X_{t_j} is decreasing, i.e. lim_{j→∞} 𝔼 X_{t_j} = inf_{j⩾1} 𝔼 X_{t_j} exists (and is finite). Thus,

    ∀ϵ > 0 ∃N = N_ϵ ∀j ⩾ N : |𝔼 X_{t_j} − 𝔼 X_{t_N}| < ϵ.

This means that we get for all j ⩾ N, i.e. t_j ⩽ t_N,

    ∫_{{|X_{t_j}|⩾R}} |X_{t_j}| dℙ
      = 2∫_{{|X_{t_j}|⩾R}} X⁺_{t_j} dℙ − ∫_{{|X_{t_j}|⩾R}} X_{t_j} dℙ
      = 2∫_{{|X_{t_j}|⩾R}} X⁺_{t_j} dℙ − 𝔼 X_{t_j} + ∫_{{|X_{t_j}|<R}} X_{t_j} dℙ
      ⩽ 2∫_{{|X_{t_j}|⩾R}} X⁺_{t_N} dℙ − 𝔼 X_{t_j} + ∫_{{|X_{t_j}|<R}} X_{t_N} dℙ
      = 2∫_{{|X_{t_j}|⩾R}} X⁺_{t_N} dℙ + (𝔼 X_{t_N} − 𝔼 X_{t_j}) − ∫_{{|X_{t_j}|⩾R}} X_{t_N} dℙ
      ⩽ 3∫_{{|X_{t_j}|⩾R}} |X_{t_N}| dℙ + ϵ,

where the first inequality uses that X_t and X⁺_t are submartingales, t_j ⩽ t_N and {|X_{t_j}| ⩾ R}, {|X_{t_j}| < R} ∈ F_{t_j}. Since ℙ(|X_{t_j}| ⩾ R) ⩽ R^{−1} sup_{j⩾1} 𝔼|X_{t_j}| → 0 uniformly in j as R → ∞, and since X_{t_N} ∈ L¹(ℙ), the right-hand side is smaller than 2ϵ for all j ⩾ N and all sufficiently large R; for the finitely many indices j < N uniform integrability is obvious. Thus, (X_{t_j})_{j⩾1} is uniformly integrable, and the almost sure convergence also holds in L¹.

A.9 Theorem (maximal estimate. Doob 1953). Let (X_t, F_t)_{t⩾0} be a positive submartingale or a d-dimensional martingale, X_t ∈ L^p(ℙ) for some p ⩾ 1, and let D ⊂ [0, ∞) be a countable subset. Then

    r^p ℙ( sup_{s∈D∩[0,t]} |X(s)| ⩾ r ) ⩽ 𝔼[|X(t)|^p]  for all r > 0, t ⩾ 0.        (A.13)

Proof. Let D_N ⊂ D be finite sets such that D_N ↑ D. We use (A.12) with the index set I = D_N ∩ [0, t], t_max = max D_N ∩ [0, t] and observe that for the submartingale (|X_t|^p, F_t)_{t⩾0} the estimate 𝔼[|X(t_max)|^p] ⩽ 𝔼[|X(t)|^p] holds. Since

    { max_{s∈D_N∩[0,t]} |X(s)| ⩾ r } ↑ { sup_{s∈D∩[0,t]} |X(s)| ⩾ r },

we obtain (A.13) by using (A.12) and the continuity of measures.

We can finally deduce Doob's maximal inequality.

A.10 Theorem (L^p maximal inequality. Doob 1953). Let (X_t, F_t)_{t⩾0} be a positive submartingale or a d-dimensional martingale, X_t ∈ L^p(ℙ) for some p > 1, and D ⊂ [0, ∞) a countable subset. Then, for p^{-1} + q^{-1} = 1 and all t > 0,

    𝔼[ sup_{s∈D∩[0,t]} |X_s|^p ] ⩽ q^p sup_{s∈[0,t]} 𝔼[|X_s|^p] ⩽ q^p 𝔼[|X_t|^p].        (A.14)

If (X_t)_{t⩾0} is continuous, (A.14) holds for D = [0, ∞).


Proof. Let D_N ⊂ D be finite index sets with D_N ↑ D. The first estimate follows immediately from the discrete version of Doob's inequality,

    𝔼[ sup_{s∈D_N∩[0,t]} |X_s|^p ] ⩽ q^p sup_{s∈D_N∩[0,t]} 𝔼[|X_s|^p] ⩽ q^p sup_{s∈[0,t]} 𝔼[|X_s|^p],

and a monotone convergence argument. For the second estimate observe that we have 𝔼[|X_s|^p] ⩽ 𝔼[|X_t|^p] for all s ⩽ t. Finally, if t ↦ X_t is continuous and D a dense subset of [0, ∞), we know that

    sup_{s∈D∩[0,t]} |X_s|^p = sup_{s∈[0,t]} |X_s|^p.

A.4 Stopping times

Recall that a stochastic process (X_t)_{t⩾0} is said to be adapted to the filtration (F_t)_{t⩾0} if X_t is for each t ⩾ 0 an F_t measurable random variable.

A.11 Definition. Let (F_t)_{t⩾0} be a filtration. A random time τ : Ω → [0, ∞] is called an (F_t-)stopping time if

    {τ ⩽ t} ∈ F_t  for all t ⩾ 0.        (A.15)

Here are a few rules for stopping times which should be familiar from the discrete time setting.

A.12 Lemma. Let σ, τ, τ_j, j ⩾ 1, be stopping times with respect to a filtration (F_t)_{t⩾0}. Then
a) {τ < t} ∈ F_t;
b) τ + σ, τ ∧ σ, τ ∨ σ, sup_j τ_j are F_t stopping times;
c) inf_j τ_j, lim inf_j τ_j, lim sup_j τ_j are F_{t+} stopping times.

Proof. The proofs are similar to the discrete time setting. Let t ⩾ 0.

a) {τ < t} = ⋃_{k⩾1} {τ ⩽ t − 1/k} ∈ F_t, since {τ ⩽ t − 1/k} ∈ F_{t−1/k} ⊂ F_t for each k ⩾ 1.

b) We have for all t ⩾ 0

    {σ + τ > t} = {σ = 0, τ > t} ∪ {σ > t, τ = 0} ∪ {σ > 0, τ ⩾ t} ∪ {0 < τ < t, τ + σ > t}.

The first three sets belong to F_t because

    {σ = 0, τ > t} = {σ ⩽ 0} ∩ {τ ⩽ t}^c,   {σ > t, τ = 0} = {σ ⩽ t}^c ∩ {τ ⩽ 0},   {σ > 0, τ ⩾ t} = {σ ⩽ 0}^c ∩ {τ < t}^c,

and, by a), so does the fourth set:

    {0 < τ < t, τ + σ > t} = ⋃_{r∈(0,t)∩ℚ} ({r < τ < t} ∩ {σ > t − r}) = ⋃_{r∈(0,t)∩ℚ} ({τ ⩽ r}^c ∩ {τ < t} ∩ {σ ⩽ t − r}^c) ∈ F_t.

Thus, {σ + τ ⩽ t} = {σ + τ > t}^c ∈ F_t for all t ⩾ 0. Moreover,

    {σ ∨ τ ⩽ t} = {σ ⩽ t} ∩ {τ ⩽ t} ∈ F_t;
    {σ ∧ τ ⩽ t} = {σ ⩽ t} ∪ {τ ⩽ t} ∈ F_t;
    {sup_j τ_j ⩽ t} = ⋂_j {τ_j ⩽ t} ∈ F_t.

c) Note that {inf_j τ_j < t} = ⋃_j {τ_j < t} = ⋃_j ⋃_k {τ_j ⩽ t − 1/k} ∈ F_t; thus, for any m ⩾ 1,

    {inf_j τ_j ⩽ t} = ⋂_{n⩾m} {inf_j τ_j < t + 1/n} ∈ F_{t+1/m}  ⇒  {inf_j τ_j ⩽ t} ∈ ⋂_{m⩾1} F_{t+1/m} = F_{t+},

where we use that {inf_j τ_j < t + 1/n} ∈ F_{t+1/n} ⊂ F_{t+1/m} for n ⩾ m. Since lim inf_j and lim sup_j are combinations of inf and sup, the claim follows from the fact that inf_j τ_j and sup_j τ_j are F_{t+} stopping times.

A.13 Example. The infimum of F_t stopping times need not be an F_t stopping time. To see this, let τ be an F_{t+} stopping time (but not an F_t stopping time). Then τ_j := τ + 1/j are F_t stopping times:

    {τ_j ⩽ t} = {τ ⩽ t − 1/j} ∈ F_{(t−1/j)+} ⊂ F_t.

But τ = inf j τ j is, by construction, not an Ft stopping time.

We will now associate a σ-algebra with a stopping time.

A.14 Definition. Let τ be an F_t stopping time. Then

Fτ := {A ∈ F∞ := σ (⋃t⩾0 Ft ) : A ∩ {τ ⩽ t} ∈ Ft ∀t ⩾ 0} ,

Fτ+ := {A ∈ F∞ := σ (⋃t⩾0 Ft ) : A ∩ {τ ⩽ t} ∈ Ft+ ∀t ⩾ 0} .

It is not hard to see that F_τ and F_{τ+} are σ-algebras and that for τ ≡ t we have F_τ = F_t. Note that, despite the slightly misleading notation, F_τ is a family of sets which does not depend on ω.

A.15 Lemma. Let σ, τ, τ_j, j ⩾ 1, be F_t stopping times. Then
a) τ is F_τ measurable;
b) if σ ⩽ τ, then F_σ ⊂ F_τ;
c) F_σ ∩ F_τ = F_{σ∧τ};
d) if τ_1 ⩾ τ_2 ⩾ … ⩾ τ_n ↓ τ, then F_{τ+} = ⋂_{j⩾1} F_{τ_j+}. If (F_t)_{t⩾0} is right continuous, then F_τ = ⋂_{j⩾1} F_{τ_j}.

Proof. a) We have to show that {τ ⩽ s} ∈ F_τ for all s ⩾ 0. But this follows from

    {τ ⩽ s} ∩ {τ ⩽ t} = {τ ⩽ s ∧ t} ∈ F_{s∧t} ⊂ F_t  for all t ⩾ 0.

b) Let F ∈ F_σ. Then we have for all t ⩾ 0

    F ∩ {τ ⩽ t} = F ∩ {τ ⩽ t} ∩ {σ ⩽ τ} = (F ∩ {σ ⩽ t}) ∩ {τ ⩽ t} ∈ F_t,

since {σ ⩽ τ} = Ω, F ∩ {σ ⩽ t} ∈ F_t as F ∈ F_σ, and {τ ⩽ t} ∈ F_t. Thus, F ∈ F_τ and F_σ ⊂ F_τ.

c) Note that σ ∧ τ is again a stopping time, cf. Lemma A.12. Therefore, part b) shows that F_{σ∧τ} ⊂ F_σ and F_{σ∧τ} ⊂ F_τ, thus F_{σ∧τ} ⊂ F_σ ∩ F_τ. For the reverse inclusion fix some F ∈ F_σ ∩ F_τ and t ⩾ 0. We have

    F ∩ {σ ∧ τ ⩽ t} = (F ∩ {σ ⩽ t}) ∪ (F ∩ {τ ⩽ t}) ∈ F_t,

showing that F ∈ F_{σ∧τ}, hence F_σ ∩ F_τ ⊂ F_{σ∧τ}.

d) We know from Lemma A.12 that τ := inf_j τ_j is an F_{t+} stopping time. Applying b) to the filtration (F_{t+})_{t⩾0} we get

    τ_j ⩾ τ ∀j ⩾ 1  ⇒  ⋂_{j⩾1} F_{τ_j+} ⊃ F_{τ+}.

On the other hand, if F ∈ F_{τ_j+} for all j ⩾ 1, then

    F ∩ {τ < t} = F ∩ ⋃_{j⩾1} {τ_j < t} = ⋃_{j⩾1} (F ∩ {τ_j < t}) ∈ F_t,

where we use that

    F ∩ {τ_j < t} = ⋃_{k⩾1} (F ∩ {τ_j ⩽ t − 1/k}) ∈ F_t,

since F ∩ {τ_j ⩽ t − 1/k} ∈ F_{(t−1/k)+} ⊂ F_t. As in (5.9) we see now that F ∩ {τ ⩽ t} ∈ F_{t+}. If (F_t)_{t⩾0} is right continuous, we have F_t = F_{t+} and F_τ = F_{τ+}.

We close this section with a result which allows us to approximate stopping times from above by a decreasing sequence of stopping times with countably many values. This helps us to reduce many assertions to the discrete time setting.

A.16 Lemma. Let τ be an F_t stopping time. Then there exists a decreasing sequence of F_t stopping times τ_1 ⩾ τ_2 ⩾ … ⩾ τ_n ↓ τ = inf_j τ_j, and each τ_j takes only countably many values:

    τ_j(ω) = τ(ω)    if τ(ω) = ∞;
    τ_j(ω) = k2^{−j}  if (k − 1)2^{−j} ⩽ τ(ω) < k2^{−j}, k ⩾ 1.

Moreover, F ∩ {τ_j = k2^{−j}} ∈ F_{k/2^j} for all F ∈ F_{τ+}.

Proof. Fix t ⩾ 0 and j ⩾ 1 and pick k = k(t, j) such that (k − 1)2^{−j} ⩽ t < k2^{−j}. Then

    {τ_j ⩽ t} = {τ_j ⩽ (k − 1)2^{−j}} = {τ < (k − 1)2^{−j}} ∈ F_{(k−1)/2^j} ⊂ F_t,

i.e. the τ_j's are stopping times. From the definition it is obvious that τ_j ↓ τ: note that τ_j(ω) − τ(ω) ⩽ 2^{−j} if τ(ω) < ∞. Finally, if F ∈ F_{τ+}, we have

    F ∩ {τ_j = k2^{−j}} = F ∩ {(k − 1)2^{−j} ⩽ τ < k2^{−j}}
        = F ∩ {τ < k2^{−j}} ∩ {τ < (k − 1)2^{−j}}^c
        = ⋃_{n⩾1} (F ∩ {τ ⩽ k2^{−j} − 1/n}) ∩ {τ < (k − 1)2^{−j}}^c ∈ F_{k/2^j},

since F ∩ {τ ⩽ k2^{−j} − 1/n} ∈ F_{(k/2^j − 1/n)+} ⊂ F_{k/2^j} and {τ < (k − 1)2^{−j}}^c ∈ F_{k/2^j}.
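To illustrate the construction with a concrete number: if τ(ω) = 0.3, then τ_1(ω) = 1/2 (since 0 ⩽ 0.3 < 1/2), τ_2(ω) = 1/2 (since 1/4 ⩽ 0.3 < 2/4) and τ_3(ω) = 3/8 (since 2/8 ⩽ 0.3 < 3/8). In general, τ(ω) < τ_j(ω) ⩽ τ(ω) + 2^{−j} whenever τ(ω) < ∞, which is just a quantitative form of τ_j ↓ τ.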

A.5 Optional sampling

If we evaluate a continuous martingale at an increasing sequence of stopping times, we still have a martingale. The following technical lemma has exactly the same proof as Lemma 6.25 and its Corollary 6.26 – all we have to do is to replace in these proofs Brownian motion B by the continuous process X.

A.17 Lemma. Let (X_t, F_t)_{t⩾0} be a d-dimensional adapted stochastic process with continuous sample paths and let τ be an F_t stopping time. Then X(τ)𝟙_{{τ<∞}} is F_τ measurable.

A.18 Theorem (optional sampling. Doob). Let (X_t, F_t)_{t⩾0} be a submartingale with continuous sample paths and let σ ⩽ τ be F_t stopping times. Then, for all k ⩾ 0, X(σ ∧ k), X(τ ∧ k) ∈ L¹(ℙ) and

    X(σ ∧ k) ⩽ 𝔼[X(τ ∧ k) | F_{σ∧k}].        (A.16)

If, in addition, σ ⩽ τ < ∞ almost surely, X(σ), X(τ) ∈ L¹(ℙ) and lim_{k→∞} 𝔼[|X(k)| 𝟙_{{τ>k}}] = 0, then

    X(σ) ⩽ 𝔼[X(τ) | F_σ].        (A.17)

Proof. Using Lemma A.16 we can approximate σ ⩽ τ by discrete-valued stopping times σ_j ↓ σ and τ_j ↓ τ. The construction in Lemma A.16 shows that we still have σ_j ⩽ τ_j. Using the discrete version of Doob's optional sampling theorem with α = σ_j ∧ k and β = τ_j ∧ k reveals that X(σ_j ∧ k), X(τ_j ∧ k) ∈ L¹(ℙ) and

    X(σ_j ∧ k) ⩽ 𝔼[X(τ_j ∧ k) | F_{σ_j∧k}].

Since F_{σ∧k} ⊂ F_{σ_j∧k}, the tower property gives

    𝔼[X(σ_j ∧ k) | F_{σ∧k}] ⩽ 𝔼[X(τ_j ∧ k) | F_{σ∧k}].        (A.18)

By discrete optional sampling, (X(σ_j ∧ k), F_{σ_j∧k})_{j⩾1} and (X(τ_j ∧ k), F_{τ_j∧k})_{j⩾1} are backwards submartingales and as such, cf. Corollary A.8, uniformly integrable. Therefore,

    X(σ_j ∧ k) → X(σ ∧ k)  and  X(τ_j ∧ k) → X(τ ∧ k)  almost surely and in L¹ as j → ∞.

This shows that we have for all F ∈ F_{σ∧k}

    ∫_F X(σ ∧ k) dℙ = lim_{j→∞} ∫_F X(σ_j ∧ k) dℙ ⩽ lim_{j→∞} ∫_F X(τ_j ∧ k) dℙ = ∫_F X(τ ∧ k) dℙ,

where the inequality follows from (A.18); this proves (A.16).

Assume now that X(σ), X(τ) ∈ L¹(ℙ). If F ∈ F_σ, then

    F ∩ {σ ⩽ k} ∩ {σ ∧ k ⩽ t} = F ∩ {σ ⩽ k} ∩ {σ ⩽ t} = F ∩ {σ ⩽ k ∧ t} ∈ F_{k∧t} ⊂ F_t,

which means that F ∩ {σ ⩽ k} ∈ F_{σ∧k}. Since σ ⩽ τ < ∞, we have by assumption

    ∫_{F∩{σ>k}} |X(k)| dℙ ⩽ ∫_{{σ>k}} |X(k)| dℙ ⩽ ∫_{{τ>k}} |X(k)| dℙ → 0  as k → ∞.

Using (A.16) we get with dominated convergence

    ∫_F X(σ ∧ k) dℙ = ∫_{F∩{σ⩽k}} X(σ) dℙ + ∫_{F∩{σ>k}} X(k) dℙ
        ⩽ ∫_{F∩{σ⩽k}} X(τ ∧ k) dℙ + ∫_{F∩{σ>k}} X(k) dℙ        (by (A.16), since F ∩ {σ ⩽ k} ∈ F_{σ∧k})
        = ∫_{F∩{τ⩽k}} X(τ) dℙ + ∫_{F∩{σ⩽k}∩{τ>k}} X(k) dℙ + ∫_{F∩{σ>k}} X(k) dℙ.

As k → ∞, the left-hand side converges to ∫_F X(σ) dℙ, the first term on the right-hand side converges to ∫_F X(τ) dℙ, and the remaining two integrals tend to 0 since both domains of integration are contained in {τ > k}. Thus ∫_F X(σ) dℙ ⩽ ∫_F X(τ) dℙ for all F ∈ F_σ, which is exactly (A.17).

A.19 Corollary. Let (X_t, F_t)_{t⩾0}, X_t ∈ L¹(ℙ), be an adapted process with continuous sample paths. The process (X_t, F_t)_{t⩾0} is a submartingale if, and only if,

    𝔼 X_σ ⩽ 𝔼 X_τ  for all bounded F_t stopping times σ ⩽ τ.        (A.19)

Proof. If (X_t, F_t)_{t⩾0} is a submartingale, then (A.19) follows at once from Doob's optional stopping theorem. Conversely, assume that (A.19) holds true. Let s < t, pick any F ∈ F_s and set ρ := t𝟙_{F^c} + s𝟙_F. Obviously, ρ is a bounded stopping time, and if we use (A.19) with σ := ρ and τ := t, we see

    𝔼(X_s 𝟙_F) + 𝔼(X_t 𝟙_{F^c}) = 𝔼 X_ρ ⩽ 𝔼 X_t.

Therefore,

    𝔼(X_s 𝟙_F) ⩽ 𝔼(X_t 𝟙_F)  for all s < t and F ∈ F_s,

i.e. X_s ⩽ 𝔼(X_t | F_s).

A.20 Corollary. Let (X t , Ft )t⩾0 be a submartingale which is uniformly integrable and has continuous paths. If σ ⩽ τ are stopping times with ℙ(τ < ∞) = 1, then all assumptions of Theorem A.18 are satisfied.

Proof. Since (X_t)_{t⩾0} is uniformly integrable, we get sup_{t⩾0} 𝔼|X(t)| ⩽ c < ∞. Moreover, since X⁺_t is a submartingale, we have for all k ⩾ 0, using (A.16),

    𝔼|X(σ ∧ k)| = 2𝔼 X⁺(σ ∧ k) − 𝔼 X(σ ∧ k) ⩽ 2𝔼 X⁺(k) − 𝔼 X(0) ⩽ 2𝔼|X(k)| + 𝔼|X(0)| ⩽ 3c.

By Fatou's lemma we get

    𝔼|X(σ)| ⩽ lim inf_{k→∞} 𝔼|X(σ ∧ k)| ⩽ 3c,

and, similarly, 𝔼|X(τ)| ⩽ 3c. Finally,

    𝔼[|X(k)| 𝟙_{{τ>k}}] = 𝔼[|X(k)| (𝟙_{{τ>k}∩{|X(k)|>R}} + 𝟙_{{τ>k}∩{|X(k)|⩽R}})]
        ⩽ sup_n 𝔼[|X(n)| 𝟙_{{|X(n)|>R}}] + R ℙ(τ > k)
        →(k→∞) sup_n 𝔼[|X(n)| 𝟙_{{|X(n)|>R}}] →(R→∞) 0,

where the last limit uses the uniform integrability.

A.21 Remark. a) If (X_t, F_t)_{t⩾0} is a martingale, we have "=" in (A.16), (A.17) and (A.19).

b) Let (X_t, F_t)_{t⩾0} be a submartingale with continuous paths and ρ an F_t stopping time. Then (X^ρ_t, F_t), X^ρ_t := X_{ρ∧t}, is a submartingale. Indeed: X^ρ is obviously adapted and continuous. For any two bounded stopping times σ ⩽ τ it is clear that ρ ∧ σ ⩽ ρ ∧ τ are bounded stopping times, too. Therefore,

    𝔼 X^ρ_σ = 𝔼 X_{ρ∧σ} ⩽ 𝔼 X_{ρ∧τ} = 𝔼 X^ρ_τ

(the equalities hold by definition of X^ρ, the inequality by (A.19)), and Corollary A.19 shows that X^ρ is a submartingale for (F_t)_{t⩾0}.

c) Note that (X_t = X_{t+}, F_{t+}) is also a submartingale. If σ ⩽ τ are F_{t+} stopping times, then (A.16) and (A.17) remain valid if we use F_{(σ∧k)+}, F_{σ+}, etc.


A.6 Remarks on Feller processes

A Markov process (X_t, F_t)_{t⩾0} is said to be a Feller process if the transition semigroup (7.1) has the Feller property, i.e. P_t is a strongly continuous Markov semigroup on the space C_∞(ℝ^d) = {u ∈ C(ℝ^d) : lim_{|x|→∞} u(x) = 0}, cf. Remark 7.4. In Remark 7.6 we have seen how we can construct a Feller process from any given Feller semigroup. Throughout this section we assume that the filtration is right continuous, i.e. F_t = F_{t+} for all t ⩾ 0, and complete. Let us briefly discuss some properties of Feller processes.

A.22 Theorem. Let (X_t, F_t)_{t⩾0} be a Feller process. Then there exists a Feller process (X̃_t)_{t⩾0} whose sample paths are right continuous with finite left limits and which has the same finite dimensional distributions as (X_t)_{t⩾0}.

Proof. See Blumenthal and Getoor [18, Theorem I.9.4, p. 46] or Ethier and Kurtz [79, Theorem 4.2.7, p. 169].

From now on we will assume that every Feller process has right continuous paths.

A.23 Theorem. Every Feller process (X t , Ft )t⩾0 has the strong Markov property (6.7).

Proof. The argument is similar to the proof of Theorem 6.5. Let σ be a stopping time and approximate σ by the stopping times σ_j := (⌊2^j σ⌋ + 1)/2^j, cf. Lemma A.16. Let t > 0, x ∈ ℝ^d, u ∈ C_∞(ℝ^d) and F ∈ F_{σ+}. Since {σ < ∞} = {σ_j < ∞}, we find by dominated convergence and the right continuity of the sample paths

F∩ {σ 0, we have Hδα (K) > 0 for some δ > 0, and there is some γ > 0 such that for all δ-covers (Q j )j⩾1 of K by cubes we have ∑j⩾1 |Q j |α ⩾ γ > 0. Consider the dyadic cubes d

Qn := {⨉ [p j 2−n , (p j + 1)2−n ] : p j ∈ ℤ, j = 1, . . . , d} ,



Q := ⋃ Qn . n=0

j=1

Fix n ⩾ 1 and define some measure ν n on B(ℝd ) satisfying |Q|α , ν n (Q) = { 0,

for all Q ∈ Qn such that Q ∩ K ≠ 0, for all Q ∈ Qn such that Q ∩ K = 0.

|Q| For example, ν n (dx) = g n (x) dx with the density g n (x) = ∑Q∈Qn , Q∩K =0̸ Leb(Q) 𝟙Q (x). 󸀠 Looking at the next larger dyadic cubes Q ∈ Qn−1 , we define a measure ν n−1 on B(ℝd ) such that α

ν (Q), { { n ν n−1 (Q) = { |Q󸀠 |α { ν n (Q), { ν n (Q󸀠 )

for all Q ∈ Qn with Q ⊂ Q󸀠 and ν n (Q󸀠 ) ⩽ |Q󸀠 |α , for all Q ∈ Qn with Q ⊂ Q󸀠 and ν n (Q󸀠 ) > |Q󸀠 |α .

Similar to ν n we can represent ν n−1 by a density with respect to Lebesgue measure. Continuing all the way up, we get after n steps a measure ν0 on B(ℝd ). Since ν0 depends on the initially fixed n, we write ν(n) := ν0 . The thus defined measures (ν(n) )n⩾1 satisfy:

A.9 Frostman’s theorem: Hausdorff measure, capacity and energy |

499

a) ν(n) (Q) ⩽ |Q|α for all Q ∈ Q0 ∪ ⋅ ⋅ ⋅ ∪ Qn . b) For each x ∈ K there is some Q ∈ Q with x ∈ Q and ν(n) (Q) = |Q|α .

Consider for each x ∈ K the largest cube such that b) holds. This gives a finite covering (Q j )j⩾1 of K by non-overlapping dyadic cubes, and we see ν(n) (ℝd ) = ∑ ν(n) (Q j ) = ∑ |Q j |α ⩾ γ > 0. j⩾1

Define

μ(n) (B)

:=

ν(n) (B)/ν(n) (ℝd ). μ(n) (Q) ⩽

1 γ

j⩾1

Then

|Q|α

for all Q ∈ Q0 ∪ ⋅ ⋅ ⋅ ∪ Qn

and supp μ(n) ⊂ K + 𝔹(0, δ n ) where δ n is the diameter of a cube in Qn . Using the Helly–Bray theorem, we can find some weakly convergent subsequence (n j (μ ) )j⩾1 ⊂ (μ(n) )n⩾1 . Denoting the weak limit by μ we see that supp μ ⊂ K

and

μ(Q) ⩽

1 γ

for all Q ∈ Q.

|Q|α

Since we may cover any line segment of length h by at most 2 dyadic intervals of length 2−k with 12 2−k ⩽ h < 2−k , we see that any ball B ⊂ ℝd of diameter |B| = h is covered by at most 2d dyadic cubes Q ∈ Qk , whence α

α

μ(B) ⩽ 2d μ(Q) ⩽ 2d γ−1 |Q|α = 2d γ−1 (√ d2−k ) ⩽ 2d γ−1 (2√ d) |B|α .

Theorem A.39 actually holds for all analytic (i.e. Souslin) sets A ⊂ ℝd . For this one needs the fact that any analytic set with H α (A) > 0 contains a closed set F ⊂ A with 0 < H α (F) < ∞. This is a non-trivial result due to Davies [48].

A.40 Corollary (Frostman's theorem. Frostman 1935). Let K ⊂ ℝ^d be a compact set. Then

    α < dim K  ⇒  H^α(K) > 0  ⇒  ∃μ ∈ M⁺(K) ∀β < α : ∬_{K×K} μ(dx) μ(dy)/|x − y|^β < ∞.

Proof. By Frostman's lemma, Theorem A.39, there is a probability measure μ on K such that μ{x : |x − y| ⩽ ρ} ⩽ c ρ^α with an absolute constant c > 0. For all y ∈ K and all β < α we find

    ∫_K μ(dx)/|x − y|^β = ∫_0^∞ μ{x : |x − y|^{−β} ⩾ r} dr = ∫_0^∞ μ{x : |x − y| ⩽ r^{−1/β}} dr
        ⩽ ∫_0^1 dr + ∫_1^∞ μ{x : |x − y| ⩽ r^{−1/β}} dr
        ⩽ 1 + c ∫_1^∞ r^{−α/β} dr ⩽ c(μ, α, β) < ∞.

Since μ is a finite measure, we get I_β(μ) = ∬_{K×K} |x − y|^{−β} μ(dx) μ(dy) < ∞.

Similar to the Hausdorff dimension, Definition 11.5, one defines the capacitary dimension as the unique number where the capacity changes from strictly positive to zero:

    dim_C E := sup {α ⩾ 0 : Cap_α E > 0} = inf {α ⩾ 0 : Cap_α E = 0}.

The following result is an easy consequence of Frostman's theorem.

A.41 Corollary. Let K ⊂ ℝ^d be a compact set and 0 < β < α < d. Then

    H^α(K) > 0  ⇒  Cap_β(K) > 0  ⇒  H^β(K) > 0.

In particular, Hausdorff and capacitary dimension coincide: dim K = dimC K.

A.10 Gronwall’s lemma

Gronwall's lemma is a well-known result from ordinary differential equations. It is often used to get uniqueness of the solutions of certain ODEs.

A.42 Theorem (Gronwall's lemma). Let u, a, b : [0, ∞) → [0, ∞) be positive measurable functions satisfying the inequality

    u(t) ⩽ a(t) + ∫_0^t b(s)u(s) ds  for all t ⩾ 0.        (A.24)

In particular, it is assumed that s ↦ u(s)b(s) is integrable over any compact set. Then

    u(t) ⩽ a(t) + ∫_0^t a(s)b(s) exp(∫_s^t b(r) dr) ds  for all t ⩾ 0.        (A.25)

If a(t) = a and b(t) = b are constants, the estimate (A.25) simplifies and reads

    u(t) ⩽ a e^{bt}  for all t ⩾ 0.        (A.25′)

Proof. Set y(t) := ∫_0^t b(s)u(s) ds. Since this is the primitive of b(t)u(t), we can rewrite (A.24) for Lebesgue almost all t ⩾ 0 as

    y′(t) − b(t)y(t) ⩽ a(t)b(t).        (A.26)

If we set z(t) := y(t) exp(−∫_0^t b(s) ds) and substitute this into (A.26), we arrive at

    z′(t) ⩽ a(t)b(t) exp(−∫_0^t b(s) ds)

Lebesgue almost everywhere. On the other hand, z(0) = y(0) = 0, hence we get by integration

    z(t) ⩽ ∫_0^t a(s)b(s) exp(−∫_0^s b(r) dr) ds,

or

    y(t) ⩽ ∫_0^t a(s)b(s) exp(∫_s^t b(r) dr) ds

for all t > 0. This implies (A.25) since u(t) ⩽ a(t) + y(t).
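To illustrate the uniqueness application mentioned at the beginning of this section: suppose x and y both solve the ODE x′(t) = f(t, x(t)) with x(0) = y(0), where f is Lipschitz continuous in the second variable with constant L. Then u(t) := |x(t) − y(t)| satisfies

    u(t) ⩽ ∫_0^t |f(s, x(s)) − f(s, y(s))| ds ⩽ L ∫_0^t u(s) ds,

i.e. (A.24) holds with a ≡ 0 and b ≡ L, and (A.25′) gives u(t) ⩽ 0 · e^{Lt} = 0, i.e. x = y.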

A.11 Completeness of the Haar functions

Denote by (H n )n⩾0 the Haar functions introduced in Section 3.2. We show here that they are a complete orthonormal system in L2 ([0, 1], ds), hence an orthonormal basis. For more background on Hilbert spaces and orthonormal bases see [232, Ch. 26, 28].

A.43 Theorem. The Haar functions are an orthonormal basis of L²([0, 1], ds).

Proof. We show that any f ∈ L² with ⟨f, H_n⟩_{L²} = ∫_0^1 f(s)H_n(s) ds = 0 for all n ⩾ 0 is Lebesgue almost everywhere zero. Indeed, if ⟨f, H_n⟩_{L²} = 0 for all n ⩾ 0, we find by induction that

    ∫_{k/2^j}^{(k+1)/2^j} f(s) ds = 0  for all k = 0, 1, …, 2^j − 1, j ⩾ 0.        (A.27)

Let us quickly verify the induction step. Assume that (A.27) is true for some j; fix k ∈ {0, 1, …, 2^j − 1}. The interval I = [k/2^j, (k+1)/2^j) supports the Haar function H_{2^j+k}; on the left half of the interval, I_1 = [k/2^j, (2k+1)/2^{j+1}), it has the value 2^{j/2}, and on the right half, I_2 = [(2k+1)/2^{j+1}, (k+1)/2^j), it is −2^{j/2}. By assumption ⟨f, H_{2^j+k}⟩_{L²} = 0, thus

    ∫_{I_1} f(s) ds = ∫_{I_2} f(s) ds.

From the induction assumption (A.27) we know, however, that ∫_{I_1∪I_2} f(s) ds = 0, thus

    ∫_{I_1} f(s) ds = −∫_{I_2} f(s) ds.

This is only possible if ∫_{I_1} f(s) ds = ∫_{I_2} f(s) ds = 0. Since I_1, I_2 are generic intervals of the next refinement, the induction is complete.

Because of the relations (A.27), the primitive F(t) := ∫_0^t f(s) ds is zero for all dyadic points t = k2^{−j}, j ⩾ 0, k = 0, 1, …, 2^j; since the dyadic numbers are dense in [0, 1] and since F(t) is continuous, we see that F(t) ≡ 0 and, therefore, f ≡ 0 almost everywhere.
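The Haar functions are defined in Section 3.2; as a quick numerical plausibility check of the orthonormality, the following Python sketch uses only the description given in the proof above (the value ±2^{j/2} on the two halves of [k/2^j, (k+1)/2^j) for n = 2^j + k) together with the usual convention H_0 ≡ 1, which is an assumption here and not restated in this appendix.

```python
import numpy as np

def haar(n, s):
    """Haar function H_n on [0,1): H_0 = 1 (usual convention); for n = 2**j + k,
    0 <= k < 2**j, the value is +2**(j/2) on the left half and -2**(j/2) on the
    right half of the dyadic interval [k/2**j, (k+1)/2**j), and 0 elsewhere."""
    s = np.asarray(s, dtype=float)
    if n == 0:
        return np.ones_like(s)
    j = int(np.floor(np.log2(n)))
    k = n - 2**j
    left, mid, right = k / 2**j, (2*k + 1) / 2**(j+1), (k + 1) / 2**j
    return 2**(j/2) * (((s >= left) & (s < mid)).astype(float)
                       - ((s >= mid) & (s < right)).astype(float))

# check <H_m, H_n> = delta_{mn} for m, n < 16; the Riemann sum over the dyadic
# grid is exact because the integrands are piecewise constant on dyadic cells
N = 2**12
s = (np.arange(N) + 0.5) / N
G = np.array([[np.dot(haar(m, s), haar(n, s)) / N for n in range(16)]
              for m in range(16)])
print(np.allclose(G, np.eye(16)))   # True
```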

Bibliography [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13] [14] [15] [16] [17] [18] [19]

[20] [21] [22] [23] [24] [25]

Aikawa, H., Essén, M.: Potential Theory – Selected Topics. Springer, Lecture Notes in Mathematics 1633, Berlin 1996. Arnold, L.: Stochastic Differential Equations: Theory and Applications. Wiley-Interscience, New York 1974. Arnold, L.: Random Dynamical Systems. Springer, Berlin 2009. Ash, R.B.: Information Theory. Interscience Publishers, New York 1965. Reprinted by: Dover, New York 1990. Asmussen, S., Glynn, P.W.: Stochastic Simulation: Algorithms and Analysis. Springer, Berlin 2007. Bachelier, L.: Théorie de la spéculation. Ann. Sci. Ec. Norm. Super. 17 (1900) 21–86. English translations in [43] pp. 17–78 and [8]. Bachelier, L.: Calcul des Probabilités. Gauthier-Villars, Paris 1912. Reprinted by Éditions Jacques Gabay, Paris 1992. Bachelier, L., Davis, M. (ed.), Etheridge, A. (ed.): Louis Bachelier’s Theory of Speculation: The Origins of Modern Finance. Princeton University Press, Princeton (NJ) 2006. Bass, R.: Probabilistic Techniques in Analysis. Springer, New York 1995. Bass, R.: Diffusions and Elliptic Operators. Springer, New York 1998. Barlow, M.: One dimensional stochastic differential equations with no strong solution. J. London Math. Soc. 26 (1982) 335–347. Bell, D.R.: Stochastic differential equations and hypoelliptic operators. In: Rao, M.M. (ed.): Real and Stochastic Analysis. New Perspectives. Birkhäuser, Boston 2004, 9–42. Berg, C.: On the potential operators associated with a semigroup. Studia Math. 51 (1974) 111–113. Billingsley, P.: Convergence of Probability Measures. Wiley, New York 1968. 2nd ed. Wiley, New York 1999. Billingsley, P.: Probability and Measure. Wiley, New York 1995 (3rd ed). Blumenthal, R.M.: An extended Markov property. Trans. Am. Math. Soc. (1957) 82 52–72. Blumenthal, R.M., Getoor, R.K.: Some theorems on stable processes. Trans. Am. Math. Soc. 95 (1960) 263–273. Blumenthal, R.M., Getoor, R.K.: Markov Processes and Potential Theory. Academic Press, New York 1968. Reprinted by Dover, Mineola (NY) 2007. Böttcher, B., Schilling, R.L., Wang, J.: Lévy-Type Processes: Construction, Approximation and Sample Path Properties. Springer, Lecture Notes in Mathematics 2099 (Lévy Matters 3), Springer, Cham 2013. Breiman, L.: On the tail behavior of sums of independent random variables. Z. Wahrscheinlichkeitstheorie verw. Geb. 9 (1967) 20–25. Brown, R.: On the existence of active molecules in organic and inorganic bodies. The Philosophical Magazine and Annals of Philosophy (new Series) 4 (1828) 161–173. Brown, R.: Additional remarks on active molecules. The Philosophical Magazine and Annals of Philosophy (new Series) 6 (1828) 161–166. Bru, B., Yor, M.: Comments on the life and mathematical legacy of Wolfgang Doeblin. Finance and Stochastics 6 (2002) 3–47. Reprinted in [53]. Burckel, R.B.: An Introduction to Classical Complex Analysis, Vol. 1. Birkhäuser, Basel 1979. Burkholder, D.L.: Martingale Transforms. Ann. Math. Statist. (1966) 37 1494–1504. Reprinted in [27].



[26]

[27] [28] [29]

[30] [31] [32] [33] [34] [35] [36] [37] [38] [39] [40] [41] [42] [43] [44] [45] [46] [47] [48] [49] [50]

Burkholder, D.L., Davis, B.J., Gundy, R.F.: Integral inequalities for convex functions of operators on martingales. In: Le Cam, L. et al. (eds.): Proceedings of the 6th Berkeley Symposium on Mathematical Statistics and Probability, Vol. II. University of California Press, Berkely (CA) 1972, 223–240. Reprinted in [27]. Burkholder, D.L., Davis, B. (ed.), Song, R. (ed.): Selected Works of Donald L. Burkholder. Springer, New York 2011. Carleson, L.: Selected Problems on Exceptional Sets. Van Nostrand, Princeton (NJ) 1967. Chentsov, N.N.: Weak convergence of stochastic processes whose trajectories have no discontinuities of the second kind and the “heuristic” approach to the Kolmogorov-Smirnov tests, Theor. Probab. Appl. 1 (1956) 140–144. Cherny, A.S., Engelbert, H.-J.: Singular Stochastic Differential Equations. Springer, Lecture Notes Math. 1858, Berlin 2005. Chung, K.L.: On the maximum partial sums of sequences of independent random variables. Trans. Am. Math. Soc. 61 (1948) 205–233. Reprinted in [34]. Chung, K.L.: Excursions in Brownian Motion. Ark. Mat. 14 (1976) 155–177. Reprinted in [34]. Chung, K.L.: Lectures from Markov Processes to Brownian Motion. Springer, New York 1982. Enlarged and reprinted as [35]. Chung, K.L., AitSaliah, F. et al. (eds.): Selected Works of Kai Lai Chung. World Scientific, Singapore 2008. Chung, K.L., Walsh, J.B.: Markov Processes, Brownian Motion, and Time Symmetry. Springer, Berlin 2005. This is the 2nd ed. of [33]. Chung, K.L., Zhao, Z.: From Brownian Motion to Schrödinger’s Equation. Springer, Berlin 1995. Ciesielski, Z.: On Haar functions and on the Schauder basis of the space C⟨0,1⟩ . Bull. Acad. Polon. Sci. Sér. Math. Astronom. Phys. 7 (1959) 227–232. Ciesielski, Z.: Lectures on Brownian Motion, Heat Conduction and Potential Theory. Aarhus Matematisk Institut, Lecture Notes Ser. 3, Aarhus 1966. Ciesielski, Z., Kerkyacharian, G., Roynette, B.: Quelques espaces fonctionnels associés à des processus gaussienes. Studia Math. 107 (1993) 171–204. Coddington, E.A., Levinson, N.: Theory of Ordinary Differential Equations. McGraw-Hill Education, Chennai 2017 (42nd reprint). Cohen, L.: The history of noise. IEEE Signal Processing Magazine (November 2005) 20–45. Cont, R., Tankov, P.: Financial Modelling With Jump Processes. Chapman & Hall/CRC, Boca Raton 2004. Cootner, P. (ed.): The Random Character of Stock Market Prices. The MIT Press, Cambridge (MA) 1964. Csörgő, M., Révész, P.: Strong Approximations in Probability and Statistics. Academic Press, New York 1981. Dambis, K.E.: On the decomposition of continuous submartingales. Theory Probab. Appl. 10 (1965) 401–410. Dautray, R., Lions, J.-L.: Mathematical Analysis and Numerical Methods for Science and Technology. Volume 1: Physical Origins and Classical Methods. Springer, Berlin 1990. DaPrato, G., Zabczyk, J.: Stochastic Equations in Infinite Dimensions. Cambridge University Press, Cambridge 1992. Davies, R.O.: Subsets of finite measure in analytic sets. Indag. Math. 14 (1952) 488–489. Dembo, A., Zeitouni, O.: Large Deviations Techniques and Applications. Springer, New York 1993 (2nd ed). Deuschel, J.-D., Stroock, D.W.: Large Deviations. Academic Press 1989.


[51] [52]

[53] [54] [55] [56] [57] [58] [59] [60] [61] [62] [63] [64] [65] [66] [67] [68]

[69] [70] [71] [72]

[73] [74]


Di Tella, P., Engelbert, H.-J.: The chaotic representation property of compensated-covariation stable families of martingales. Annals of Probability 44 (2016) 3965–4005. Doeblin, W.: Sur l’équation de Kolmogoroff. C. R. Acad. Sci. Paris 331 (2000) 1059–1102. First printing of the pli cacheté deposited with the Académie des Sciences in 1940. Reprinted in [53]. Doeblin, W., Yor, M. (ed.), Bru, B. (ed.): Œvres Complètes – Collected Works. Springer, Cham 2020. Doléans-Dade, C.: On the existence and unicity of solutions of stochastic integral equations. Z. Wahrscheinlichkeitstheorie verw. Geb. 36 (1976) 93–101. Donsker, M.: An invariance principle for certain probability limit theorems. Mem. Am. Math. Soc. 6 (1951) 12 pp. (paper no. 4) Doob, J.L.: Stochastic Processes. Interscience Publishers, New York 1953. Reprinted by Wiley, New York 1990. Doob, J.L.: A probability approach to the heat equation. Trans. Am. Math. Soc. 80 (1955) 216–280. Doob, J.L.: Classical Potential Theory and Its Probabilistic Counterpart. Springer, Berlin 1984. Doss, H.: Liens entre équations différentielles stochastiques et ordinaires. Ann. Inst. H. Poincaré, Sér. B 13 (1977) 99–125. Dubins, L.E., Schwarz, G.: On continuous martingales. Proc. Natl. Acad. Sci. USA 53 (1965) 913–916. Dudley, R.M.: Sample functions of the Gaussian process. Ann. Probab. 1 (1973) 66–103. Reprinted in [63]. Dudley, R.M.: Uniform Central Limit Theorems. Cambridge University Press, Cambridge 1999. Dudley, R.M., Giné, E. et al. (eds.): Selected Works of R.M. Dudley. Springer, New York 2010. Dudley, R.M., Norvaiša, R.: An Introduction to p-variation and Young integrals. MaPhySto Lecture Notes 1, Aarhus 1998. Dudley, R.M., Norvaiša, R.: Differentiability of Six Operators on Nonsmooth Functions and p-Variation. Springer, Lecture Notes in Mathematics 1703, Berlin 1999. Durrett, R.: Brownian Motion and Martingales in Analysis. Wadsworth Advanced Books & Software, Belmont (CA) 1984. Durrett, R.: Probability: Theory and Examples. Cambridge University Press, Cambridge 2010 (4th ed). Dynkin, E.B.: Criteria of continuity and of absence of discontinuities of the second kind for trajectories of a Markov stochastic process. Izv. Akad. Nauk SSSR, Ser. Matem. 16 (6) (1952) 563–572. (In Russian) Dynkin, E.B.: Infinitesimal operators of Markov processes. Theor. Probab. Appl. 1 (1956) 34–54. Reprinted in [73]. Dynkin, E.B.: Die Grundlagen der Theorie der Markoffschen Prozesse. Springer, Berlin 1961. English translation: Theory of Markov Processes, Dover, Mineola (NY) 2006. Dynkin, E.B.: Markov Processes (2 vols.). Springer, Berlin 1965. Dynkin, E.B., Yushkevich, A.A.: Markov Processes. Theorems and Problems. Plenum Press, New York 1969. Russian Original, German Translation: Sätze und Aufgaben über Markoffsche Prozesse, Springer 1969. Dynkin, E.B., Yushkevich, A.A. et al. (eds.): Selected Papers of E.B. Dynkin with Commentary. American Mathematical Society, Providence (RI) 2000. Einstein, A.: Über die von der molekularkinetischen Theorie der Wärme geforderte Bewegung von in ruhenden Flüssigkeiten suspendierten Teilchen. Ann. Phys. 17 (1905) 549–560. Reprinted in [95], English translation in [75].


[75] [76] [77] [78] [79] [80] [81] [82] [83] [84] [85] [86] [87] [88] [89] [90] [91] [92] [93]

[94]

[95]

[96] [97] [98] [99]

Einstein, A., Fürth, R. (ed.): Investigations on the Theory of the Brownian movement. Dover, New York 1956. Eliot, G.: Middlemarch. William Blackwood and Sons, Edinburgh 1871. Engelbert, H.-J., Cherny, A.S.: Singular Stochastic Differential Equations. Springer, Lecture Notes in Mathematics 1858, Berlin 2004. Etemadi, N.: On some classical results in probability theory. Sankhya A 47 (1985) 215–221. Ethier, S.N., Kurtz, T.G.: Markov Processes. Characterization and Convergence. Wiley, New York 1986. Evans, L.C.: Partial Differential Equations. American Mathematical Society, Providence (RI) 2010 (2nd ed). Evans, L.C., Gariepy, R.F.: Measure Theory and Fine Properties of Functions. CRC Press, Boca Raton (FL) 1992. Falconer, K.: The Geometry of Fractal Sets. Cambridge University Press, Cambridge 1985. Falconer, K.: Fractal Geometry. Mathematical Foundations and Applications. Wiley, Chichester 2003 (2nd ed). Fernique, X.: Fonctions aléatoires Gaussiennes. Vecteurs aléatoires Gaussiens. Les Publications CRM, Montréal 1997. Feyel, D., de La Pradelle, A.: On fractional Brownian processes. Potential Anal. 10 (1999) 273–288. Fokker, A.D.: Die mittlere Energie rotierender elektrischer Dipole im Strahlungsfeld. Ann. d. Phys. 43 (1914) 810–820. Folland, G.B.: Fourier Analysis and Its Applications. Brooks/Cole, Belmont (CA) 1992. Freedman, D.: Brownian Motion and Diffusion. Holden-Day, San Francisco (CA) 1971. Reprinted by Springer, New York 1982. Freidlin, M.: Functional Integration and Partial Differential Equations. Princeton University Press, Ann. Math. Studies 109, Princeton (NJ) 1985. Freidlin, M.I., Wentzell, A.D.: Random Perturbations of Dynamical Systems. Springer, New York 1998 (2nd ed). Friedman, A.: Partial Differential Equations of Parabolic Type. Prentice-Hall, Englewood Cliffs (NJ) 1964. Reprinted by Dover, Mineola (NY) 2008. Friedman, A.: Stochastic Differential Equations and Applications (2 vols.). Academic Press, New York (NY) 1975/1976. Reprinted in one volume by Dover, Mineola (NY) 2006. Fristedt, B.: Sample functions of stochastic processes with stationary, independent increments. In: Ney, P., Port, S. (eds.): Advances in Probability and Related Topics Vol. 3. Marcel Dekker, New York 1974. Frostman, O.: Potentiel d’équilibre et capacité des ensembles avec quelques applications a la théorie des fonctions. Meddelanden från Lunds Universitets Matematiska Seminarium Band 3, C.W.K. Gleerup, Lund 1953. Fürth, R. (ed.), Einstein, A., Smoluchowski, M. von: Untersuchungen über die Theorie der Brownschen Bewegung – Abhandlung über die Brownsche Bewegung und verwandte Erscheinungen. Harri Deutsch, Ostwalds Klassiker Bd. 199 (reprint of the separate volumes 199 and 207), Frankfurt am Main 1997. English translation [75]. Fukushima, M., Osima, Y., Takeda, M.: Dirichlet Forms and Symmetric Markov Processes. De Gruyter, Berlin 2011 (2nd ed). Garsia, A.M., Rodemich, E., Rumsey, H.: A real variable lemma and the continuity of paths of some Gaussian processes. Indiana Univ. Math. J. 20 (1970) 565–578. Gikhman, I.I., Skorokhod, A.W.: Stochastic Differential Equations. Springer, Berlin 1972. Gilbarg, D., Trudinger, N.S.: Elliptic Partial Differential Equations of Second Order. Springer, Berlin 2001 (several reprints).



[100] Girsanov, I.V.: On transforming a certain class of stochastic processes by absolutely continuous substitution of measures. Theor. Probab. Appl. 5 (1960) 285–301. [101] Gradshteyn, I.S., Ryzhik, I.M.: Table of Integrals, Series, and Products. Academic Press, San Diego 1980 (4th corrected and enlarged ed). [102] Gunther (Günter), N.M.: La Théorie du Potentiel et ses Applications aux Problèmes Fondamentaux de la Physique Mathématique. Gauthier-Villars, Paris 1934. [103] Günter, N.M.: Die Potentialtheorie und ihre Anwendung auf Grundaufgaben der mathematischen Physik. B.G. Teubner, Leipzig 1957. Extended second edition of [102]. [104] Hausdorff, F.: Dimension und äußeres Maß. Math. Annalen 79 (1919) 157–179. Reprinted (with English commentary) in [105]. [105] Hausdorff, F., Chatterji, S.D. (ed.), Remmert, R. (ed.), Scharlau, W. (ed.): Felix Hausdorff – Gesammelte Werke. Band IV: Analysis, Algebra und Zahlentheorie. Springer, Berlin 2001. [106] Helms, L.L.: Introduction to Potential Theory. Wiley–Interscience, New York 1969. [107] Helms, L.L.: Potential Theory. Springer, London 2009. Extended version of [106]. [108] Hewitt, E., Stromberg, K.: Real and Abstract Analysis. Springer, New York 1965. [109] Hille, E.: Functional Analysis and Semi-groups. Am. Math. Society, New York 1948. (2nd ed. 1957 jointly with R.S. Phillips.) [110] Hörmander, L.: The Analysis of Linear Partial Differential Operators I. Springer, Berlin 1983. [111] Hunt, G.H.: Markoff processes and potentials I, II. Illinois J. Math. 1 (1957) 44–93 and 316– 369. [112] Ikeda, N., Nakao, S., Yamato, Y.: A class of approximations of Brownian motion. Publ. RIMS Kyoto Univ. 13 (1977) 285–300. [113] Ikeda, N., Watanabe, S.: Stochastic Differential Equations and Diffusion Processes. NorthHolland, Amsterdam 1989 (2nd ed). [114] Itô, K.: Differential equations determining a Markoff process (in Japanese). J. Pan-Japan Math. Coll. 1077 (1942). Translation reprinted in [123]. [115] Itô, K: Multiple Wiener integral. Journal of the Mathematical Society of Japan 3 (1951) 157–169. Reprinted in [123]. [116] Itô, K.: Poisson Point Processes and Their Application to Markov Processes. Mimeographed lecture notes, Kyoto University, 1969. Reprinted in the series “Springer Briefs in Probability and Mathematical Statistics”, Springer, Singapore 2015. [117] Itô, K.: Poisson point processes attached to Markov processes. In: L. LeCam et al. (eds.): Proceedings of the 6th Bergeley Symposium on Mathematical Statistics and Probability, Vol. 3. University of California Press, Berkeley 1970, 225–239. Reprinted in [123]. [118] Itô, K.: Semigroups in Probability Theory. In: Functional Analysis and Related Topics, 1991. Proceedings of the International Conference in Memory of Professor Kôsaku Yosida held at RIMS, Kyoto University, Japan, July 29 - Aug. 2, 1991. Springer, Lecture Notes in Mathematics 1540, Berlin 1993, 69–83. [119] Itô, K.: Stochastic Processes. Lectures Given at Aarhus University. Springer, Berlin 2004. Reprint (with commentaries) of Itô’s 1969 lectures, Aarhus Lecture Notes Series 16. [120] Itô, K.: Essentials of Stochastic Processes. Am. Math. Society, Providence (RI) 2006. [121] Itô, K., McKean, H.P.: Diffusion Processes and Their Sample Paths, Springer, Berlin 1965. [122] Itô, K., Nisio, M.: On the convergence of sums of independent Banach space valued random variables. Osaka J. Math. 5 (1968) 35–48. [123] Itô, K., Stroock, D.W. (ed.), Varadhan, S.R.S. (ed.): Kiyosi Itô. Selected Papers. Springer, New York 1987. 
[124] Jacod, J., Protter, P.: Probability Essentials. Springer, Berlin 2000. [125] John, F.: Partial Differential Equations. Springer, Berlin 1982 (4th ed).

508 | Bibliography [126] Kac, M.: On distributions of certain Wiener functionals. Trans. Am. Math. Soc. (1949) 65 1–13. Reprinted in [128]. [127] Kac, M.: On some connections between probability theory and differential and integral equations. In: Neyman, J. (ed.): Proceedings of the Second Berkely Symposium on Mathematical Statistics and Probability, August 1950. University of California Press, Berkeley 1951, 189–225. Reprinted in [128]. [128] Kac, M., Baclawski, K. (ed.), Donsker, M.D. (ed.): Mark Kac: Probability, Number Theory, and Statistical Physics. The MIT Press, Cambridge (MA) 1979. [129] Kahane, J.-P.: Sur l’irrégularité locale du mouvement brownien. C. R. Acad. Sci, Paris Sér. A 278 (1974) 331–333. Reprinted in [132]. [130] Kahane, J.-P.: Sur les zéros et les instants de ralentissement du mouvement brownien. C. R. Acad. Sci, Paris Sér. A 282 (1976) 431–433. Reprinted in [132]. [131] Kahane, J.-P: Some random series on functions. Cambridge University Press, Cambridge 1985 (2nd ed). [132] Kahane, J.-P., Baker, R.C. (ed.): Selected Works: Jean-Pierre Kahane. Kendrick Press, Heber City (UT) 2009. [133] Kakutani, S.: On Brownian motions in n-space. Proc. Imperial Acad. Japan 20 (1944) 648–652. Reprinted in [135]. [134] Kakutani, S.: Two-dimensional Brownian motion and harmonic functions. Proc. Imperial Acad. Japan 20 (1944) 706–714. Reprinted in [135]. [135] Kakutani, S., Kallman, R.R. (ed.): Shizuo Kakutani – Selected Papers (2 vols.). Birkhäuser, Boston 1986. [136] Karatzas, I., Shreve, S.E.: Brownian Motion and Stochastic Calculus. Springer, New York 1988. [137] Kaufman, R.: Une propriété métrique du mouvement brownien. C. R. Acad. Sci, Paris Sér. A 268 (1969) 727–728. [138] Kaufman, R.: Temps locaux and dimensions. C. R. Acad. Sci, Paris Sér. I Math. 300 (1985) 281–282. [139] Kaufman, R.: Dimensional properties of one-dimensional Brownian motion. Ann. Probab. 17 (1989) 189–193. [140] Kellogg, O.D.: Foundations of Modern Potential Theory. Springer, Berlin 1929. Reprinted by Dover, New York 1956. [141] Kestelman, H.: Modern Theories of Integration. Dover, New York 1960 (2nd ed). [142] Khintchine, A.: Asymptotische Gesetze der Wahrscheinlichkeitsrechnung. Springer, Ergeb. Math. Grenzgeb. Bd. 2, Heft 4, Berlin 1933. [143] Khintchine, A.: Sur la croissance locale des processus stochastiques homogènes à accroissements indépendants (Russian, French summary). Izvestija Akad. Nauk SSSR, Ser. Math. 3 (1939) 487–508. [144] Kinney, J.R.: Continuity criteria of sample functions of Markov processes. Trans. Am. Math. Soc. 74 (1953) 280–302. [145] Kloeden, P.E., Platen, E.: Numerical Solution of Stochastic Differential Equations. Springer, Berlin 1992. [146] Kolomogoroff, A.N., Über die analytischen Methoden in der Wahrscheinlichkeitsrechnung. Math. Ann. 104 (1931) 415–458. English translation in [148]. [147] Kolmogoroff, A.N., Grundbegriffe der Wahrscheinlichkeitsrechnung. Springer, Ergeb. Math. Grenzgeb. Bd. 2, Heft 3, Berlin 1933. [148] Kolmogorov, A.N., Shiryayev, A.N. (ed.): Selected Works of A.N. Kolmogorov (3 vols.). Kluwer, Dordrecht 1993. [149] Kowalski, E.: Bernstein polynomials and Brownian Motion. Am. Math. Monthly 113 (2006) 865–886.



[150] Krylov, N.V.: Introduction to the Theory of Diffusion Processes. American Mathematical Society, Providence (RI) 1995. [151] Krylov, N.V.: Introduction to the Theory of Random Processes. American Mathematical Society, Providence (RI) 2002. [152] Kunita, H.: Stochastic Flows and Stochastic Differential Equations. Cambridge University Press, Cambridge 1990. [153] Kunita, H.: Stochastic differential equations based on Lévy processes and stochastic flows of diffeomorphisms. In: Rao, M.M. (ed.): Real and Stochastic Analysis. New Perspectives. Birkhäuser, Boston 2004, 305–373. [154] Kunita, H., Watanabe, S.: On square integrable martingales. Nagoya Math. J. 30 (1967) 209– 245. [155] Kurtz, T.: Equivalence of stochastic equations and martingale problems. In: Crisan, D. (ed.): Stochastic Analysis 2010. Springer, Berlin 2011, 113–130. [156] Kuo, H.-H.: Gaussian Measures in Banach Spaces. Springer, Lecture Notes in Mathematics 463, Berlin 1975. [157] Lebedev, N.N.: Special Functions And Their Applications. Prentice-Hall, Englewood Cliffs (NJ) 1965. Several reprints by Dover, New York. [158] Lebesgue, H.: Sur des cas d’impossibilité du problème du Dirichlet ordinaire. C. R. Séances Soc. Math. France 41 (1913). Reprinted in [159, vol. 4]. [159] Lebesgue, H.: Œvres Scientifiques En Cinq Volumes (5 vols.). L’enseignement mathématique & Institut de Mathématiques Université de Genève, Geneva 1973. [160] L’Ecuyer, P.: Uniform random number generation. Ann. Oper. Res. 53 (1994) 77–120. [161] L’Ecuyer, P., Panneton, F.: Fast random number generators based on linear recurrences modulo 2: overview and comparison. In: Kuhl, M.E., Steiger, N.M., Armstrong, F.B., Joines, J.A. (eds.): Proceedings of the 37th Winter Simulation Conference, Orlando, FL, 2005. ACM 2005. IEEE, New York 2006, 110–119. [162] Lévy, P.: Théorie de l’Addition des Variables Aléatoires. Gauthier-Villars, Paris 1937. [163] Lévy, P.: Sur certains processus stochastiques homogènes. Compositio Math. 7 (1939) 283– 339. Reprinted in [167, vol. 4]. [164] Lévy, P.: Le mouvement brownien plan. Am. J. Math. 62 (1940) 487–550. Reprinted in [167, vol. 5]. [165] Lévy, P.: Trois théorèmes sur le mouvement brownien. L’intermédiaire des Recherches Mathématiques 9 (1947) 124–126. Reprinted in [167, vol. 5]. [166] Lévy, P.: Processus Stochastiques et Mouvement Brownien. Gauthier-Villars, Paris 1948. [167] Lévy, P., Dugué, D. (ed.): Œvres de Paul Lévy (6 vols.). Gauthier-Villars, Paris 1973–1980. [168] Lieberman, G.M.: Second Order Parabolic Differential Equations. World Scientific, Singapore 1996. [169] Lifshits, M.A.: Gaussian Random Functions. Kluwer, Dordrecht 1995. [170] Lifshits, M.A.: Lectures on Gaussian Processes. Springer, Heidelberg 2012. [171] Lumer, G., Phillips, R.S.: Dissipative operators in a Banach space. Pacific J. Math. 11 (1961) 679–698. [172] MacDonald, D.K.C.: Noise and Fluctuations. Wiley, New York 1962. Reprinted by Dover, Mineola (NY) 2006. [173] Maisonneuve, B.: Temps local et dénombrements d’excursions. Probability Theory and Related Fields 52 (1980) 109–113. [174] Malliavin, P.: Stochastic calculus of variation and hypoelliptic operators. In: Itô, K. (ed.): Proceedings Intnl. Symposium on SDEs, Kyoto 1976, Wiley, New York 1978, 195–263. [175] Malliavin, P.: Integration and Probability. Springer, Berlin 1995. [176] Malliavin, P.: Stochastic Analysis. Springer, Berlin 1997.


[177] Mandelbrot, B.B.: The Fractal Geometry of Nature. Freeman, New York 1977. [178] Marcus, M., Rosen, J.: Markov Processes, Gaussian Processes, and Local Times. Cambridge Universtiy Press, Cambridge 2006. [179] Maruyama, G.: On the transition probability functions of the Markov process. Natural Sci. Rep., Ochanomizu Univ. 5 (1954) 10–20. Reprinted in [180]. [180] Maruyama, G., Ikeda, N. (ed.), Tanaka, H. (ed.): Gisiro Maruyama. Selected Papers. Kaigai Publications, Tokyo 1988. [181] Mattila, P.: Geometry of Sets and Measures in Euclidean Spaces. Fractals and Rectifiability. Cambridge University Press, Cambridge 1995. [182] Mazo, R.M.: Brownian Motion. Fluctuations, Dynamics and Applications. Oxford University Press, Oxford 2002. [183] McKean, H.P.: Hausdorff–Besicovitch dimension of Brownian motion paths. Duke Math. J. 22 (1955) 229–234. [184] Meyer, P.A.: A decomposition theorem for supermartingales. Illinois J. Math. 6 (1962) 193– 205. [185] Meyer, P.A.: Decompositions of supermartingales. The uniqueness theorem. Illinois J. Math. 7 (1963) 1–17. [186] Mörters, P., Peres, Y.: Brownian Motion. Cambridge University Press, Cambridge 2010. [187] Motoo, M.: Proof of the law of iterated logarithm through diffusion equation. Ann. Inst. Math. Statist. 10 (1958) 21–28. [188] Nelson, E.: Dynamical Theories of Brownian Motion. Princeton University Press, Princeton (NJ) 1967. [189] Neveu, J.: Theorie des semi-groupes de Markov. University of California Publications in Statistics 2, No. 14 (1958) 319–394. [190] Novikov, A.A.: On an identity for stochastic integrals. Theory Probab. Appl. 16 (1972) 717–720. [191] Nualart, D.: The Malliavin Calculus and Related Topics. Springer, Berlin 2006. [192] Nualart, D., Nualart, E.: Introduction to Malliavin Calculus. Cambridge University Press, Cambridge 2018. [193] Oblój, J.: The Skorokhod embedding problem and its offspring. Probab. Surveys 1 (2004) 321–390. [194] Olver, F.W.J. et al. (eds.): NIST Handbook of Mathematical Functions. Cambridge University Press, Cambridge 2010. Free online access: http://dlmf.nist.gov/ [195] Orey, S.: Probabilistic methods in partial differential equations. In: Littman, W. (ed.): Studies in Partial Differential Equations. The Mathematical Association of America, 1982, 143–205. [196] Orey, S., Taylor, S.J.: How often on a Brownian path does the law of iterated logarithm fail? Proc. London Math. Soc. 28 (1974) 174–192. [197] Paley, R.E.A.C., Wiener, N., Zygmund, A.: Notes on random functions. Math. Z. 37 (1933) 647–668. [198] Paley, R.E.A.C., Wiener, N.: Fourier Transforms in the Complex Domain. American Mathematical Society, Providence (RI) 1934. [199] Partzsch, L.: Vorlesungen zum eindimensionalen Wienerschen Prozeß. Teubner, Texte zur Mathematik Bd. 66, Leipzig 1984. [200] Pazy, A.: Semigroups of Linear Operators and Applications to Partial Differential Equations. Springer, New York 1983. [201] Peetre, J.: Rectification à l’article “Une caractérisation abstraite des opérateurs différentiels”. Math. Scand. 8 (1960) 116–120. [202] Petrowsky, I.: Zur ersten Randwertaufgabe der Wärmeleitungsgleichung. Compositio Math. 1 (1935) 383–419. English translation in [203].



[203] Petrowsky, I.G., Oleinik, O.A. (ed.): I.G. Petrowsky. Selected Works. Part II: Differential Equations and Probability Theory. Gordon and Breach, Amsterdam 1996.
[204] Perrin, J.: Mouvement brownien et réalité moléculaire. Ann. Chimie Physique (8ème sér.) 18 (1909) 5–114.
[205] Perrin, J.: Les Atomes. Librairie Félix Alcan, Paris 1913.
[206] Pinsky, M.A.: Introduction to Fourier Analysis and Wavelets. Brooks/Cole, Pacific Grove (CA) 2002.
[207] Planck, M.: Über einen Satz der statistischen Dynamik und seine Erweiterung in der Quantentheorie. Sitzungsber. Königl. Preuß. Akad. Wiss. phys.-math. Kl. 10.5 (1917) 324–341.
[208] Poincaré, H.: Sur les équations aux dérivées partielles de la physique mathématique. Am. J. Math. 12 (1890) 211–294.
[209] Port, S., Stone, S.: Brownian Motion and Classical Potential Theory. Academic Press, New York 1978.
[210] Prevôt, C., Röckner, M.: A Concise Course on Stochastic Partial Differential Equations. Springer, Lecture Notes in Mathematics 1905, Berlin 2007.
[211] Protter, P.E.: Stochastic Integration and Differential Equations. Springer, Berlin 2004 (2nd ed).
[212] Rao, M.: Brownian Motion and Classical Potential Theory. Aarhus Universitet Matematisk Institut, Lecture Notes Series 47, Aarhus 1977.
[213] Rényi, A.: Probability Theory. North Holland, Amsterdam 1970. Reprinted by Dover, Mineola (NY) 2007. Translation of the German ed. 1962 and the Hungarian ed. 1965.
[214] Reuter, G.E.H.: Denumerable Markov processes and the associated contraction semigroups on l. Acta Math. 97 (1957) 1–46.
[215] Révész, P.: Random Walk in Random and Non-Random Environments. World Scientific, New Jersey 2005 (2nd ed).
[216] Revuz, D., Yor, M.: Continuous Martingales and Brownian Motion. Springer, Berlin 2005 (3rd ed).
[217] Riesz, F.: Sur les opérations fonctionnelles linéaires. C. R. Acad. Sci. Paris 149 (1909) 974–977. Reprinted in [218].
[218] Riesz, F., Császár, Á. (ed.): Frédéric Riesz – Œuvres complètes. Gauthier-Villars, Paris 1960.
[219] Riesz, F., Sz.-Nagy, B.: Leçons d’Analyse Fonctionnelle. Akadémiai Kiadó, Budapest 1952. English translation: Functional Analysis. Dover, Mineola (NY) 1990.
[220] Ripley, B.D.: Stochastic Simulation. Wiley, Series in Probab. Stat., Hoboken (NJ) 1987.
[221] Rogers, C.A.: Hausdorff Measures. Cambridge University Press, Cambridge 1970. Unaltered 2nd edition 1998 in the Cambridge Mathematical Library with a new introduction by K. Falconer.
[222] Rogers, L.C.G., Williams, D.: Diffusions, Markov Processes, and Martingales (2 vols.). Wiley, Chichester 1987 & 1994 (2nd ed). Reprinted by Cambridge University Press, Cambridge 2000.
[223] Rubinstein, R.Y., Kroese, D.P.: Simulation and the Monte Carlo Method. Wiley, Hoboken (NJ) 2008.
[224] Rudin, W.: Principles of Mathematical Analysis. McGraw-Hill, Auckland 1976 (3rd ed).
[225] Rudin, W.: Real and Complex Analysis. McGraw-Hill, New York 1986 (3rd ed).
[226] Rudin, W.: Functional Analysis. McGraw-Hill, New York 1991 (2nd ed).
[227] Sawyer, S.: The Skorokhod Representation. Rocky Mountain J. Math. 4 (1974) 579–596.
[228] Schilder, M.: Some asymptotic formulae for Wiener integrals. Trans. Am. Math. Soc. 125 (1966) 63–85.
[229] Slutsky, E.E.: Qualche proposizione relativa alla teoria delle funzioni aleatorie. Giorn. Ist. Ital. Attuari 8 (1937) 183–199.
[230] Schilling, R.L.: Sobolev embedding for stochastic processes. Expo. Math. 18 (2000) 239–242.


[231] Schilling, R.L.: An introduction to Lévy and Feller processes. In: Khoshnevisan, D., Schilling, R.L.: From Lévy-Type Processes to Parabolic SPDEs. Birkhäuser, Cham 2016.
[232] Schilling, R.L.: Measures, Integrals and Martingales. Cambridge University Press, Cambridge 2017 (2nd ed).
[233] Schilling, R.L.: Wahrscheinlichkeit. De Gruyter, Berlin 2017.
[234] Schilling, R.L.: Martingale und Prozesse. De Gruyter, Berlin 2018.
[235] Schilling, R.L.: Measure, Integral, Probability & Processes. Amazon/Kindle Self-Publishing, Dresden 2021. ISBN: 979-8-5991-0488-9.
[236] Schilling, R.L., Kühn, F.: Counterexamples in Measure and Integration. Cambridge University Press, Cambridge 2021.
[237] Schilling, R.L., Song, R., Vondraček, Z.: Bernstein Functions. Theory and Applications. De Gruyter, Berlin 2012 (2nd ed).
[238] Shigekawa, I.: Stochastic Analysis. American Mathematical Society, Providence (RI) 2004.
[239] Skorokhod, A.V.: Stochastic equations for diffusion processes in a bounded region. Theor. Probab. Appl. 6 (1961) 264–274.
[240] Skorokhod, A.V.: Studies In The Theory Of Random Processes. Addison-Wesley, Reading (MA) 1965.
[241] von Smoluchowski, M.: Zur kinetischen Theorie der Brownschen Molekularbewegung und der Suspensionen. Annalen der Physik (4. Folge) 21 (1906) 756–780. Reprinted in [95], English translation in [75].
[242] Spitzer, F.: Principles of Random Walk. Springer, New York 1976 (2nd ed).
[243] Stieltjes, T.J.: Recherches sur les fractions continues. Ann. Fac. Sci. Toulouse 8 (1894) 1–122 and 9 (1895) 1–47. Translated and reprinted in [244, vol. 2].
[244] Stieltjes, T.J., van Dijk, G. (ed.): Thomas Jan Stieltjes – Œuvres Complètes – Collected Papers (2 vols.). Springer, Berlin 1993.
[245] Strassen, V.: An invariance principle for the law of the iterated logarithm. Z. Wahrscheinlichkeitstheorie verw. Geb. 3 (1964) 211–246.
[246] Stratonovich, R.L.: A new representation for stochastic integrals and equations. SIAM J. Control 4 (1966) 362–371. Russian original: Bull. Moscow Univ. Math. Mech. Ser. 1 (1964) 3–12.
[247] Stratonovich, R.L.: Conditional Markov Processes and Their Application to the Theory of Optimal Control. American Elsevier Publishing Co., New York 1968.
[248] Stroock, D.W.: Homogeneous Chaos Revisited. In: Azéma, J. et al. (eds.): Séminaire de Probabilités XXI. Springer, Lecture Notes in Mathematics 1247, Berlin 1987, pp. 1–8.
[249] Stroock, D.W., Varadhan, S.R.S.: Multidimensional Diffusion Processes. Springer, Berlin 1979.
[250] Stroock, D.W., Varadhan, S.R.S.: Diffusion processes with continuous coefficients. I, II. Communications on Pure and Applied Mathematics 22 (1969), 345–400, 479–530.
[251] Sussmann, H.: On the gap between deterministic and stochastic ordinary differential equations. Ann. Probab. 6 (1978) 19–41.
[252] Tanaka, H.: Note on continuous additive functionals of the 1-dimensional Brownian path. Z. Wahrscheinlichkeitstheor. verw. Geb. 1 (1963) 251–257. Reprinted in [253].
[253] Tanaka, H., Maejima, M. (ed.), Shiga, T. (ed.): Stochastic Processes. Selected Papers of Hiroshi Tanaka. World Scientific, New Jersey 2002.
[254] Taylor, S.J.: The Hausdorff α-dimensional measure of Brownian paths in n-space. Proc. Cambridge Philos. Soc. 49 (1953) 31–39.
[255] Taylor, S.J.: The α-dimensional measure of the graph and set of zeros of a Brownian path. Proc. Cambridge Philos. Soc. 51 (1955) 265–274.
[256] Taylor, S.J.: Regularity of irregularities on a Brownian path. Ann. Inst. Fourier 24.2 (1974) 195–203.



[257] Taylor, S.J.: The measure theory of random fractals. Math. Proc. Cambridge Philos. Soc. 100 (1986) 383–406.
[258] Tsirelson, B.: An example of a stochastic differential equation with no strong solution. Theor. Probab. Appl. 20 (1975) 427–430.
[259] van Hemmen, J.L., Ando, T.: An inequality for trace ideals. Commun. Math. Phys. 76 (1980) 143–148.
[260] Varadhan, S.R.S.: Large Deviations and Applications. SIAM, CBMS-NSF Lecture Notes 46, Philadelphia (PA) 1984.
[261] Wald, A.: On cumulative sums of random variables. Ann. Math. Statist. 15 (1944) 283–296.
[262] Wentzell, A.D.: Theorie zufälliger Prozesse. Birkhäuser, Basel 1979. Russian original. English translation: Course in the Theory of Stochastic Processes, McGraw-Hill 1981.
[263] Wiener, N.: Differential-space. J. Math. & Phys. (1923) 58, 131–174. Reprinted in [269, vol. 1].
[264] Wiener, N.: Un problème de probabilités dénombrables. Bull. Soc. Math. France 11 (1924) 569–578. Reprinted in [269, vol. 1].
[265] Wiener, N.: Generalized harmonic analysis. Acta Math. 55 (1930) 117–258. Reprinted in [269, vol. 2].
[266] Wiener, N.: The homogeneous chaos. American Journal of Mathematics 55 (1938) 897–936. Reprinted in [269, vol. 1].
[267] Wiener, N.: Nonlinear Problems in Random Theory. The Technology Press of MIT and John Wiley, New York 1958.
[268] Wiener, N., Siegel, A., Rankin, B., Martin, W.T.: Differential Space, Quantum Systems, and Prediction. The MIT Press, Cambridge (MA) 1966.
[269] Wiener, N., Masani, P. (ed.): Collected Works With Commentaries (4 vols.). The MIT Press, Cambridge (MA) 1976–86.
[270] Williams, D.: Probability with Martingales. Cambridge University Press, Cambridge 1991.
[271] Wong, E., Zakai, M.: On the convergence of ordinary integrals to stochastic integrals. Ann. Math. Statist. 36 (1965) 1560–1564.
[272] Wong, E., Zakai, M.: On the relation between ordinary and stochastic differential equations. Int. J. Engng. Sci. 3 (1965) 213–229.
[273] Wong, E., Zakai, M.: Riemann–Stieltjes approximations of stochastic integrals. Z. Wahrscheinlichkeitstheor. verw. Geb. 12 (1969) 87–97.
[274] Xiao, Y.: Random fractals and Markov processes. In: Lapidus, M.L., van Frankenhuijsen, M. (eds.): Fractal Geometry and Applications: A Jubilee of Benoît Mandelbrot. Vol. 2. American Mathematical Society, Proc. Symp. Pure Math. 72.2, Providence (RI) 2004, 261–338.
[275] Yamasaki, Y.: Measures on Infinite Dimensional Spaces. World Scientific, Singapore 1985.
[276] Yosida, K.: On the differentiability and the representation of one-parameter semigroup of linear operators. J. Math. Soc. Japan 1 (1948) 244–253. Reprinted in [279].
[277] Yosida, K.: Positive pseudo-resolvents and potentials. Proc. Japan Acad. 41 (1965) 1–5. Reprinted in [279].
[278] Yosida, K.: On the potential operators associated with Brownian motions. J. Analyse Math. 23 (1970) 461–465. Reprinted in [279].
[279] Yosida, K., Itô, K. (ed.): Kôsaku Yosida Collected Papers. Springer, Tokyo 1992.
[280] Zaremba, S.: Sur le principe de Dirichlet. Acta Math. 34 (1911) 293–316.
[281] Zvonkin, A.K.: A transformation of the phase space of a process that removes the drift. Math. USSR Sbornik 22 (1974) 129–149.

Index The index should be used in conjunction with the Bibliography and the Index of Notation. Numbers following entries are page numbers which, if accompanied by (Pr. n), refer to Problem n on that page. Within the index we use the abbreviations “BM” for Brownian motion and “LIL” for law of iterated logarithm. absorbing point, 116 acceptance–rejection method, 462 action functional, 221, 228 adapted, 48 additive functional, 351, 360 (Pr. 17) additive noise, 439 – cancels out, 443 admissible filtration, 48, 62, 80, 243 (Pr. 1) arc-sine law – for largest zero, 79, 193 – for the time in (0, ∞), 137 Banach–Steinhaus theorem, 496 Blumenthal’s 0-1–law, 82, 143 Borel–Cantelli lemma, 204 bounded variation, 152, 217, 244, 283, 494 Box–Muller-method, 464 Brownian bridge, 18 (Pr. 9), 46 (Pr. 6) Brownian motion, 1, 4 – canonical model, 39, 41 – characteristic function, 7, 11–13 – characterization, 10–13, 50 – Chung’s LIL, 208 – Ciesielski’s construction, 23, 467 – conformal invariance, 144 – covariance, 8 – dimension of graph, 183 – dimension of image, 182, 186 – dimension of zero set, 192 – distribution of zeros, 193–195 – Donsker’s construction, 35, 237, 467 – with drift, 213, 323, 324, 402, 432, 439 – embedding of random variable, 232 – excursion, 195, 347, 350 – excursion measure, 354 – exit from annulus, 74 – exit from ball, 75 – exit from interval, 56, 74 – exponential moments, 8 – fast point, 201 – first passage time, 56, 57, 70 https://doi.org/10.1515/9783110741278-027

– is Gaussian process, 9, 10 – generator, 92–93, 101–103 – geometric, 386 – Hölder continuity, 154, 170, 175, 180 – independent components, 13 – interpolation, 28–30, 466 – invariance principle, 35, 237, 467 – Khintchine’s LIL, 199, 212 – killing, 458 (Pr. 9) – Kolmogorov’s construction, 36 – large deviations, 223, 224 – law of (min, max, B), 77, 207 – law of (|B|, L), (M − B, M), 349 – law of first passage time, 70 – law of large numbers, 19 (Pr. 27), 198 – law of running min/max, 70, 349 – level set, 189, 190 – Lévy’s construction, 28–30, 466 – Lévy’s modulus of continuity, 172 – LIL, 199, 208, 212, 225 – LIL fails, 201 – local modulus of continuity, 201 – local time, 312, 399 – Markov property, 15, 62, 63 – martingale characterization, 163, 321, 451 – maximal inequality, 50 – modulus of non-differentiability, 209 – moments, 8 – nowhere differentiable, 170 – nowhere monotone, 187 – P measurable, 83 – projective reflection, 16, 42 – quadratic variation, 153, 165 (Pr. 5) – quadratic variation = 0, 160 – quadratic variation, strong = ∞, 158 – record set, 192 – reflected at b, 73 – reflection, 14, 42 – reflection principle, 69–70, 77 – renewal, 14, 42 – resolvent kernel, 99


– return to zero, 75 – scaling, 15, 42 – shift, 47 (Pr. 7), 84 (Pr. 4), 351 – slow point, 201 – started at x, 4, 63 – Strassen’s LIL, 212, 225 – strong Markov property, 65, 67, 69, 71 – strong variation, 176 (Pr. 5) – tail estimate, 172, 198 – time inversion, 15, 42 – transition density, 9, 36 – unbounded, 54 – unique maximum, 188 – white noise, 264 – Wiener’s construction, 26 – with filtration, 48 – zero set, 192, 348 Brownian scaling, 15, 42 Burkholder inequalities, 335 Burkholder–Davis–Gundy inequalities, 334, 338

C(o) is not measurable, 42 Cameron–Martin – formula, 215, 218, 224 – space, 213, 219 canonical model, 39, 41 canonical process, 43, 90 carré-du-champ operator, 457 (Pr. 7) Cauchy problem, 128, 129 central limit method, 465 chaos decomposition, 375 Chapman–Kolmogorov equations, 65, 121 (Pr. 6) characteristic operator, 117 Chung’s LIL, 208 closed operator, 95 completion, 80 composition method, 463 conditional expectation – under change of measure, 324 cone condition, 143 – flat cone condition, 150 (Pr. 12) conformal map, 144 consistent probability measures, 43 control measure, 263 convex function (representation), 339 covariance matrix, 7 cylinder set, 40, 474

| 515

diffusion coefficient/matrix, 3, 297, 390, 427, 428, 430, 431, 437 diffusion process, 428, 429 – backward equation, 430 – construction by SDE, 437 – defined by SDE, 435 – forward equation, 431, 456 – generator, 428 Dirichlet problem, 139, 148 – in dimension 2, 144 dissipative operator, 103 Doléans–Dade exponential, 317, 319 dominated convergence (stoch. int.), 278 Donsker’s invariance principle, 35, 237, 467 Doob – decomposition, 244 – maximal inequality, 484 – optional stopping theorem, 488 Doob–Meyer decomposition, 283 Doss–Sussmann theorem, 423 drift coefficient/vector, 297, 323, 390, 402, 428, 430, 431, 437 drift transformation, 390, 402, 442 Duhamel principle, 129 Dynkin system, 492 Dynkin’s formula, 115 Dynkin–Kinney criterion, 428 energy, 180, 497 Euler scheme, 469, 472 – strong/weak order, 470 excursion, 347, 350 excursion measure, 354 exponential martingale, 49 Feller process, 91, 491 – starting at x, 90 – strong Markov property, 491 Feller property, 88 – strong Feller property, 88 Feller’s minimum principle, 107 Feynman–Kac formula, 131, 458 (Pr. 9) Fick’s law, 427 finite variation, see bounded variation first entrance time, see first hitting time first hitting time, 53 – expected value, 56, 116, 140 – of balls, 74, 75 – properties, 60 (Pr. 13)

516 | Index

– is stopping time, 53, 54 first passage time, 54, 56, 70 Fokker–Planck equation, 431, 432 formal adjoint operator, 431 Friedrichs mollifier, 121 (Pr. 8), 311, 315 (Pr. 13) Frostman’s theorem, 180, 498, 499 Fubini thm. (stoch. integral), 279 fundamental solution, 432 – of diffusion equation, 432 – existence, 434 Gaussian – density (n-dimensional), 9 – process, 8 – random variable, 7 – tail estimate, 172, 198 generalized inverse, 332–333 generator, 92 – of Brownian motion, 92–93, 101–103 – is closed operator, 95 – is dissipative, 103 – Dynkin’s generator, 117 – Feller’s minimum principle, 107 – is local operator, 118, 119 – no proper extension, 100 – pointwise definition, 107 – positive maximum principle, 106 – of the solution of an SDE, 435 – weak generator, 107 geometric BM, 386 Girsanov transform, 219, 325, 359 (Pr. 5), 402, 442 Gronwall’s lemma, 500 Haar functions, 22 – completeness, 23, 501 – Haar–Fourier series, 23 harmonic function, 140 – and martingales, 140 – maximum principle, 142 – mean value property, 140 Hausdorff dimension, 179 – of a Brownian graph, 183 – of a Brownian path, 182, 186 – of Brownian zeros, 192 – vs. capacitary dimension, 500 Hausdorff measure, 177 heat equation, 125, 127 Hermite polynomial, 367

heuristics – for the Cauchy problem, 129 – for Dirichlet’s problem, 139 – for Itô’s formula, 299 – for Kolmogorov’s equations, 433 – for an SDE, 383 – for Strassen’s LIL, 225 – for Stratonovich SDE, 421 Hille–Yosida theorem, 103 Hölder continuity, 154 – of a BM, 154, 170, 175, 180 holding point, 116 independent increments (B1), 4 infinitesimal generator, see generator instantaneous point, 116 inverse transform method, 461 Itô – differential see stoch. differential, – integral see stochastic integral, – isometry, 250, 251, 254, 267, 290 – process, 297, 298, 306 Itô’s formula – for BM1 , 297 – for BMd , 307 – convex function, 340 – for martingales, 459 (Pr. 11) – Itô–Meyer formula, 340 – for Itô process, 306 – and local time, 313, 399 – Tanaka’s formula, 313, 399 – time-dependent, 310 Itô’s synthesis theorem, 355 iterated Itô integral, 363, 366 – and Hermite polynomials, 373, 377 – multiplication formula, 381 (Pr. 9) Kaufman’s theorem, 186 Khintchine’s LIL, 199, 212 Kolmogorov – backward equation, 430 – consistency condition, 43 – construction of BM, 36, 44–45 – continuity criterion, 45, 167 – existence theorem, 475 – forward equation, 431, 456 Kolmogorov’s test – upper function, 201 Kolmogorov–Chentsov thm, 45, 167

Index

Lamperti transform, 390 Langevin SDE, 387 Laplace operator, 50, 59 (Pr. 8), 106, 125, 128, 139 large deviation – lower bound, 224 – upper bound, 223 Lebesgue’s spine, 145–146 level set, 189, 190 Lévy’s modulus of continuity, 172 LIL, 199, 201, 206, 208, 212, 225 linear growth condition, 394 Lipschitz condition, 393 local martingale, 274 – is martingale, 319 local operator, 118 – and continuous paths, 118, 119 – is differential operator, 118 – and diffusions, 428 local time, 312, 399 – approximation by excursion, 357 – continuity, 342 – inverse of, 351 – is additive functional, 351 – joint law with BM, 349 – moments, 342 – occupation time formula, 341 – quadratic variation, 343 – support, 348 localizing sequence, 273, 274, 459 (Pr. 12) Malliavin derivative, 378 Markov kernel, 240 Markov process, 64 – homogeneous Markov process, 64 – starting at x, 64 – strong Markov process, 69 Markov property, 15, 62, 63 – solution of SDE, 403, 404 Markov semigroup, 88 martingale, 48 – backwards convergence theorem, 482 – backwards martingale, 157, 158, 482 – bounded variation, 283 – characterization of BM, 163, 321, 451 – closure, 482 – constancy intervals, 288 – convergence theorem, 480 – Doob–Meyer decomposition, 283 – embedding into BM, 241, 333

| 517

– exit from interval, 74 – exponential, 49 – and generator, 51, 114, 124 – L2 martingale, 244, 249 – local martingale, 274 – maximal inequality, 484 2,c – norm in MT , 252, 271 (Pr. 4) – quadratic variation, 163, 244–245, 283 – upcrossing, 479 martingale problem, 446 – and SDEs, 453, 455 – equivalent conditions, 448 – existence, 453 – uniqueness, 455 martingale representation – as time-changed BM, 333 – if the BM given, 328, 329 – if the filtration is given, 330, 331 martingale transform, 245 – characterization by bracket, 248 Maruyama’s drift transformation, 402, 442 mass distribution principle, 179 maximum principle, 142 mean value property, 140 minimum principle, 107 modification, 25 modulus of continuity, 171–172 modulus of non-differentiability, 209 monotone class theorem, 492, 493 Monte Carlo simulation, 473 multiple stochastic integral, see iterated – – natural filtration, 48, 53 Novikov condition, 57, 320 occupation time formula, 341 optional stopping/sampling, 488 Ornstein–Uhlenbeck process, 18 (Pr. 11), 387 Paley–Wiener–Zygmund integral, 217–218, 272 (Pr. 17), 361 path space, 41 Poisson point process, 354–356 Poisson process, 176 (Pr. 1), 352, 354, 356, 359 (Pr. 2) positive maximum principle, 106 potential – linear potential, 111 – logarithmic potential, 111

518 | Index

– Newton potential, 111 potential operator, see resolvent, 108, 111, 113 progressive/progressively/P – measurable, 83, 272 (Pr. 22), 281 (Pr. 2) – σ-algebra, 260, 272 (Pr. 21) projective family, 43, 475 projective limit, 475 Q-Brownian motion, 13 quadratic covariation, 247, 293, 371, 457 (Pr. 7) quadratic variation, 254, 283, 297 – a.s. convergence, 155, 157 – of BM, 153, 165 (Pr. 5) – is zero, 160 – L2 -convergence, 153 – of martingale, 163 – of local time, 343 – strong q.v. = ∞, 158 random measure, 37 (Pr. 2) random orthogonal measure, 263 – Itô isometry, 267 – martingales, 268 – stochastic integral, 265, 267 random variable – embedded into BM, 232 random walk, 2–3, 34 – Chung’s LIL, 206 – embedded into BM, 234 – Khinchine’s LIL, 235–236 recurrence, 76 reflection principle, 69–70, 77 regular point, 142 – criterion for, 143, 150 (Pr. 12) – dimension 2, 144 resolvent, 87, 96 – resolvent equation, 97, 122 (Pr. 15) Riemann mapping theorem, 144 Riemann–Stieltjes integral, 218, 244, 494–496 Riesz representation theorem, 496 right continuous filtration, 81 sample path, 4 sample value, 460 Schauder functions, 22 – completeness, 23, 501 Schilder’s theorem, 222, 224 semigroup, 87 – conservative, 88, 92

– contraction, 88 – Feller, 88 – Markov, 88 – positive, 88 – strong Feller, 88 – strongly continuous, 88 – sub-Markov, 88 shift operator, 47 (Pr. 7), 84 (Pr. 4), 351 simple function (off-diagonal), 362 simple process, 249 – closure, 262, 290 simply connected, 144 simulation of BM, 466, 467 singular point, 142 – examples, 144–147 Skorokhod decomposition, 360 (Pr. 16) Skorokhod’s embedding theorem, 233 Slutsky’s theorem, 5 (Pr. 2), 5 (Pr. 3) state space, 4 – one-point compactification, 92, 428 stationary increments (B2), 4 stationary process, 19 (Pr. 23) stochastic differential, 299, 303 stochastic differential equation, 383 – continuity of solution, 408, 411 – counterexamples, 398–402 – dependence on initial value, 408, 411, 412 – deterministic coefficients, 385 – differentiability of solution, 411 – of diffusion process, 435 – drift transformation, 390, 402 – examples, 385–389, 398–402, 422–423 – existence of solution, 396 – Feller continuity, 407 – for given generator, 437 – heuristics, 383 – homogeneous linear SDE, 388 – Langevin equation, 387 – linear growth condition, 394 – linear SDE, 389 – Lipschitz condition, 393 – local existence and uniqueness, 405 – localization, 404 – Markov property of solutions, 403 – and martingale problem, 453, 455 – measurability of solution, 398 – moment estimate, 408, 410 – no solution, 398–402 – numerical solution, 469, 472

Index

– Picard iteration, 396 – solution map, 423 – solution of SDE, 383 – stability of solutions, 394 – Stratonovich SDE, 421 – transformation into linear SDE, 392 – transformation of SDE, 385, 390–391 – transition function of solutions, 404 – uniqueness of solutions, 396, 402 – variance transformation, 390 – weak solution, 400 stochastic dominated conv., 278 stochastic exponential, 317, 319 stochastic Fubini thm., 279 stochastic integral, 253, 265, 267 – characterization, 294 – dominated convergence, 278 – Fubini, 279 – Itô isometry, 254 – iterated, 363, 366 – localization, 254 – is martingale, 253 – maximal inequalities, 254 – mean-square cont. integrand, 259 – no pathwise meaning, 259–260 – Riemann approximation, 259, 276 – Stratonovich integral, 419 2,c stochastic integral (for MT ), 289 – Itô isometry, 290 – localization, 290 – is martingale, 289 – maximal inequalities, 290 stochastic integral (localized), 274, 291 – characterization, 294 – localization, 275 – is martingale, 275 – Riemann approximation, 276 stochastic integral (simple proc.), 249 – Itô isometry, 250, 251 – localization, 251 – is martingale, 251 stochastic process, 4 – equivalent, 25 – independence of two processes, 11 – indistinguishable, 24 – modification, 25 – sample path, 4 – simple, 249 – state space, 4


stopping time, 53, 485 – approximation from above, 487 Strassen’s LIL, 212 Stratonovich differential equation, 421 Stratonovich integral, 419 – vs. Itô integral, 420 strong Feller property, 88 strong Markov property, 65, 67, 69, 71, 491 – martingale proof, 66 strong order of convergence, 470 strong variation, 152, 244, 283, 493 sub-Markov semigroup, 88 – vs. Markov semigroup, 92 symmetrization (function), 364 Tanaka’s formula, 313, 399 Taylor–Stroock formula, 379 trajectory, see sample path transience, 76 transition kernel, 90 transition semigroup, see semigroup true sample, 460 ucp-convergence, 274 – and localization, 293 uniform convergence in probability, 274 uppcrossing estimate, 480 upper function, 201 usual conditions, 82 variation – see also bounded v., quadratic, v., strong v., – p-variation, 152 – variation sum, 152, 493 Wald’s identity, 55, 74 – exponential Wald identity, 57 weak order of convergence, 470 white noise, 264 – is not σ-additive, 264 Wiener chaos, 375 Wiener measure, 41, 213 Wiener process, see Brownian motion Wiener space, 41, 213 Wiener–Fourier series, 26 Wiener–Itô decomposition, 375 Zaremba – cone condition, 143 – deleted ball, 144 – needle, 147