Stochastic analysis : Itô and Malliavin calculus in tandem [Tra ed.] 110714051X, 978-1-107-14051-6, 9781316492888, 1316492885

Thanks to the driving forces of the Itô calculus and the Malliavin calculus, stochastic analysis has expanded into numer

376 110 1MB

English Pages 357 [358] Year 2016

Report DMCA / Copyright

DOWNLOAD FILE

Polecaj historie

Stochastic analysis : Itô and Malliavin calculus in tandem [Tra ed.]
 110714051X, 978-1-107-14051-6, 9781316492888, 1316492885

Table of contents :
Content: Preface
Frequently used notation
1. Fundamentals of continuous stochastic processes
2. Stochastic integrals and Ito's formula
3. Brownian motion and Laplacian
4. Stochastic differential equations
5. Malliavin calculus
6. Black-Scholes model
7. Semiclassical limit
Appendix
References
Subject index.

Citation preview

C A M B R I D G E S T U D I E S I N A DVA N C E D M AT H E M AT I C S 1 5 9 Editorial Board B . B O L L O B Á S , W. F U LTO N , A . K ATO K , F. K I RWA N , P. S A R NA K , B . S I M O N , B . TOTA RO

STOCHASTIC ANALYSIS Itô and Malliavin Calculus in Tandem Thanks to the driving forces of the Itô calculus and the Malliavin calculus, stochastic analysis has expanded into numerous fields including partial differential equations, physics, and mathematical finance. This book is a compact, graduate-level text that develops the two calculi in tandem, laying out a balanced toolbox for researchers and students in mathematics and mathematical finance. The book explores foundations and applications of the two calculi, including stochastic integrals and stochastic differential equations, and the distribution theory on Wiener space developed by the Japanese school of probability. Uniquely, the book then delves into the possibilities that arise by using the two flavors of calculus together. Taking a distinctive, path-space-oriented approach, this book crystalizes modern day stochastic analysis into a single volume. Hiroyuki Matsumoto is Professor of Mathematics at Aoyama Gakuin University. He graduated from Kyoto University in 1982 and received his Doctor of Science degree from Osaka University in 1989. His research focuses on stochastic analysis and its applications to spectral analysis of Schrödinger operations and Selberg’s trace formula, and he has published several books in Japanese, including Stochastic Calculus and Introduction to Probability and Statistics. He is a member of the Mathematical Society of Japan and an editor of the MSJ Memoirs. Setsuo Taniguchi is Professor of Mathematics at Kyushu University. He graduated from Osaka University in 1980 and received his Doctor of Science degree from Osaka University in 1989. His research interests include stochastic differential equations and Malliavin calculus. He has published several books in Japanese, including Introduction to Stochastic Analysis for Mathematical Finance and Stochastic Calculus. He is a member of the Mathematical Society of Japan and is an editor of the Kyushu Journal of Mathematics.

4:36:19, subject to the Cambridge Core terms of use,

CAMBRIDGE STUDIES IN ADVANCED MATHEMATICS Editorial Board: B. Bollobás, W. Fulton, A. Katok, F. Kirwan, P. Sarnak, B. Simon, B. Totaro All the titles listed below can be obtained from good booksellers or from Cambridge University Press. For a complete series listing visit: www.cambridge.org/mathematics. Already published 119 C. Perez-Garcia & W. H. Schikhof Locally convex spaces over non-Archimedean valued fields 120 P. K. Friz & N. B. Victoir Multidimensional stochastic processes as rough paths 121 T. Ceccherini-Silberstein, F. Scarabotti & F. Tolli Representation theory of the symmetric groups 122 S. Kalikow & R. McCutcheon An outline of ergodic theory 123 G. F. Lawler & V. Limic Random walk: A modern introduction 124 K. Lux & H. Pahlings Representations of groups 125 K. S. Kedlaya p-adic differential equations 126 R. Beals & R. Wong Special functions 127 E. de Faria & W. de Melo Mathematical aspects of quantum field theory 128 A. Terras Zeta functions of graphs 129 D. Goldfeld & J. Hundley Automorphic representations and L-functions for the general linear group, I 130 D. Goldfeld & J. Hundley Automorphic representations and L-functions for the general linear group, II 131 D. A. Craven The theory of fusion systems 132 J. Väänänen Models and games 133 G. Malle & D. Testerman Linear algebraic groups and finite groups of Lie type 134 P. Li Geometric analysis 135 F. Maggi Sets of finite perimeter and geometric variational problems 136 M. Brodmann & R. Y. Sharp Local cohomology (2nd Edition) 137 C. Muscalu & W. Schlag Classical and multilinear harmonic analysis, I 138 C. Muscalu & W. Schlag Classical and multilinear harmonic analysis, II 139 B. Helffer Spectral theory and its applications 140 R. Pemantle & M. C. Wilson Analytic combinatorics in several variables 141 B. Branner & N. Fagella Quasiconformal surgery in holomorphic dynamics 142 R. M. Dudley Uniform central limit theorems (2nd Edition) 143 T. Leinster Basic category theory 144 I. Arzhantsev, U. Derenthal, J. Hausen & A. Laface Cox rings 145 M. Viana Lectures on Lyapunov exponents 146 J.-H. Evertse & K. Gy˝ory Unit equations in Diophantine number theory 147 A. Prasad Representation theory 148 S. R. Garcia, J. Mashreghi & W. T. Ross Introduction to model spaces and their operators 149 C. Godsil & K. Meagher Erd˝os–Ko–Rado theorems: Algebraic approaches 150 P. Mattila Fourier analysis and Hausdorff dimension 151 M. Viana & K. Oliveira Foundations of ergodic theory 152 V. I. Paulsen & M. Raghupathi An introduction to the theory of reproducing kernel Hilbert spaces 153 R. Beals & R. Wong Special functions and orthogonal polynomials 154 V. Jurdjevic Optimal control and geometry: Integrable systems 155 G. Pisier Martingales in Banach spaces 156 C. T. C. Wall Differential topology 157 J. C. Robinson, J. L. Rodrigo & W. Sadowski The three-dimensional Navier–Stokes equations 158 D. Huybrechts Lectures on K3 surfaces 159 H. Matsumoto & S. Taniguchi Stochastic Analysis

4:36:19, subject to the Cambridge Core terms of use,

Stochastic Analysis Itô and Malliavin Calculus in Tandem H i r oy u k i M a t s u m o t o Aoyama Gakuin University, Japan

S e t s u o Ta n i g u c h i Kyushu University, Japan

Translated and adapted from the Japanese edition

4:36:19, subject to the Cambridge Core terms of use,

32 Avenue of the Americas, New York NY 10013 One Liberty Plaza, 20th Floor, New York, NY 10006, USA 477 Williamstown Road, Port Melbourne, VIC 3207, Australia 4843/24, 2nd Floor, Ansari Road, Daryaganj, Delhi – 110002, India 79 Anson Road, #06–04/06, Singapore 079906 Cambridge University Press is part of the University of Cambridge. It furthers the University’s mission by disseminating knowledge in the pursuit of education, learning and research at the highest international levels of excellence. www.cambridge.org Information on this title: www.cambridge.org/9781107140516 Translated and adapted from the Japanese edition: Kakuritsu Kaiseki, Baifukan, 2013 c Hiroyuki Matsumoto and Setsuo Taniguchi 2017  This publication is in copyright. Subject to statutory exception and to the provisions of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press. First published 2017 Printed in the United States of America by Sheridan Books, Inc. A catalog record for this publication is available from the British Library ISBN 978-1-107-14051-6 Hardback Cambridge University Press has no responsibility for the persistence or accuracy of URLs for external or third-party internet websites referred to in this publication, and does not guarantee that any content on such websites is, or will remain, accurate or appropriate.

4:36:19, subject to the Cambridge Core terms of use,

Contents

Preface Frequently Used Notation

page ix xii

1 Fundamentals of Continuous Stochastic Processes 1.1 Stochastic Processes 1.2 Wiener Space 1.3 Filtered Probability Space, Adapted Stochastic Process 1.4 Discrete Time Martingales 1.4.1 Conditional Expectation 1.4.2 Martingales, Doob Decomposition 1.4.3 Optional Stopping Theorem 1.4.4 Convergence Theorem 1.4.5 Optional Sampling Theorem 1.4.6 Doob’s Inequality 1.5 Continuous Time Martingale 1.5.1 Fundamentals 1.5.2 Examples on the Wiener Space 1.5.3 Optional Sampling Theorem, Doob’s Inequality, Convergence Theorem 1.5.4 Applications 1.5.5 Doob–Meyer Decomposition, Quadratic Variation Process 1.6 Adapted Brownian Motion 1.7 Cameron–Martin Theorem 1.8 Schilder’s Theorem 1.9 Analogy to Path Integrals

1 1 4 9 11 11 13 16 17 20 22 24 24 25

34 37 40 43 49

2 Stochastic Integrals and Itô’s Formula 2.1 Local Martingale

52 52

28 32

v 4:36:21, subject to the Cambridge Core terms of use,

vi

Contents

2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9

Stochastic Integrals Itô’s Formula Moment Inequalities for Martingales Martingale Characterization of Brownian Motion Martingales with respect to Brownian Motions Local Time, Itô–Tanaka Formula Reflecting Brownian Motion and Skorohod Equation Conformal Martingales

54 61 70 73 82 87 93 96

3 Brownian Motion and the Laplacian 3.1 Markov and Strong Markov Properties 3.2 Recurrence and Transience of Brownian Motions 3.3 Heat Equations 3.4 Non-Homogeneous Equation 3.5 The Feynman–Kac Formula 3.6 The Dirichlet Problem

102 102 108 111 112 117 125

4 Stochastic Differential Equations 4.1 Introduction: Diffusion Processes 4.2 Stochastic Differential Equations 4.3 Existence of Solutions 4.4 Pathwise Uniqueness 4.5 Martingale Problems 4.6 Exponential Martingales and Transformation of Drift 4.7 Solutions by Time Change 4.8 One-Dimensional Diffusion Process 4.9 Linear Stochastic Differential Equations 4.10 Stochastic Flows 4.11 Approximation Theorem

133 133 138 145 151 156 157 164 167 180 183 190

5 Malliavin Calculus 5.1 Sobolev Spaces and Differential Operators 5.2 Continuity of Operators 5.3 Characterization of Sobolev Spaces 5.4 Integration by Parts Formula 5.5 Application to Stochastic Differential Equations 5.6 Change of Variables Formula 5.7 Quadratic Forms 5.8 Examples of Quadratic Forms 5.8.1 Harmonic Oscillators 5.8.2 Lévy’s Stochastic Area

195 195 206 214 224 232 244 257 265 265 269

4:36:21, subject to the Cambridge Core terms of use,

Contents

5.9

vii

5.8.3 Sample Variance Abstract Wiener Spaces and Rough Paths

274 276

6 The Black–Scholes Model 6.1 The Black–Scholes Model 6.2 Arbitrage Opportunity, Equivalent Martingale Measures 6.3 Pricing Formula 6.4 Greeks

281 281 284 287 293

7 The Semiclassical Limit 7.1 Van Vleck’s Result and Quadratic Functionals 7.1.1 Soliton Solutions for the KdV Equation 7.1.2 Euler Polynomials 7.2 Asymptotic Distribution of Eigenvalues 7.3 Semiclassical Approximation of Eigenvalues 7.4 Selberg’s Trace Formula on the Upper Half Plane 7.5 Integral of Geometric Brownian Motion and Heat Kernel on H2

297 297 302 307 309 312 318 323

Appendix

329

Some Fundamentals

References Index

337 344

4:36:21, subject to the Cambridge Core terms of use,

4:36:21, subject to the Cambridge Core terms of use,

Preface

The aim of this book is to introduce stochastic analysis, keeping in mind the viewpoint of path space. The area covered by stochastic analysis is very wide, and we focus on the topics related to Brownian motions, especially the Itô calculus and the Malliavin calculus. As is widely known, a stochastic process is a mathematical model to describe a randomly developing phenomenon. Many continuous stochastic processes are driven by Brownian motions, while basic discontinuous ones are related to Poisson point processes. The Itô calculus, named after K. Itô who introduced the calculus in 1942, is typified by stochastic integrals, Itô’s formula, and stochastic differential equations. While Itô investigated those topics in terms of Brownian motions, they are now studied in the extended framework of martingales. One of the important applications of the calculus is a construction of diffusion processes through stochastic differential equations. The Malliavin calculus was introduced by P. Malliavin in the latter half of the 1970s and developed by many researchers. As he originally called it “a stochastic calculus of variation”, it is exactly a differential calculation on a path space. It opened a way to take a purely probabilistic approach to transition densities of diffusion processes, which are fundamental objects in theory and are applied to many fields in mathematics and physics. We made the book self-contained as much as possible. Several preliminary facts in analysis and probability theory are gathered in the Appendix. Moreover, a lot of examples are presented to help the reader to easily understand the assertions. This book is organized as follows. Chapter 1 starts with fundamental facts on stochastic processes. In particular, Brownian motions and martingales are introduced and basic properties associated with them are given. In the last three sections, investigations of path space type are made; the Cameron–Martin theorem, Schilder’s theorem and an analogy with path integrals are presented. ix 4:36:22, subject to the Cambridge Core terms of use, .001

x

Preface

Chapter 2 introduces stochastic integrals and Itô’s formula, an associated chain rule. Although Itô originally discussed them with respect to Brownian motions, we formulate them with respect to martingales in the recent manner due to J. L. Doob, H. Kunita and S. Watanabe. Moreover, several facts on continuous martingales are discussed: for example, representations of them by time changes and those via stochastic integrals with respect to Brownian motions. Chapter 3 presents several properties of Brownian motion. As direct applications of Itô’s formula, problems in the theory of partial differential equations, like heat equations and Dirichlet problems, are studied. Although the Laplacian is only dealt with in this chapter, after reading Chapters 4 and 5, the reader will be easily convinced that the results in this chapter can be extended to second order differential operators on Euclidean spaces and Laplace–Beltrami operators on Riemannian manifolds. Chapters 4 and 5 form the main portion of this book. Chapter 4 introduces stochastic differential equations and presents their properties and applications. Stochastic differential equations enable us to construct diffusion processes in a purely probabilistic manner. Namely, diffusion processes are realized as measures on a path space via solutions of stochastic differential equations. This is different from the analytical method by A. Kolmogorov, which uses the fundamental solution of the associated heat equation. It is also seen in the chapter that stochastic differential equations determine stochastic flows as ordinary differential equations. The flow property will be used in the next chapter. The Malliavin calculus is developed in Chapter 5. The distribution theory on the Wiener space, which was structured by the Japanese school led by S. Watanabe, S. Kusuoka, and I. Shigekawa, is introduced. Moreover, the integration by parts formula and the change of variable formula on the Wiener space are presented. In the last two sections, the latter formula is applied to computing Laplace transforms of quadratic Wiener functionals. Chapter 6 is a brief introduction to mathematical finance. In this chapter, we focus on the Black–Scholes model, the simplest model in mathematical finance. The existence and uniqueness of an equivalent martingale measure is shown and a pricing formula of European contingent claims is given. Moreover, as an application of the Malliavin calculus, we show ways to compute hedging portfolios and the Greeks, indices to measure sensitivity of prices with respect to parameters like initial price and volatilities. Stochastic analysis is the analysis on path spaces, and it is deeply related to Feynman path integrals. It was M. Kac who gained an insight into this close relationship and achieved a lot of results. His achievements exerted great influence on not only probability theory but also other fields of mathematics.

4:36:22, subject to the Cambridge Core terms of use, .001

Preface

xi

Chapter 7 is intended to present results corresponding to such close relationship. It starts with an introduction of a Wiener space analog of the representation of a propagator by action integrals of classical paths, which was due to the physicist Van Vleck playing an active role in the early period of quantum mechanics. Next, applications of stochastic analysis to studies of eigenvalues of Schrödinger operators and Selberg’s trace formula are presented. In these applications, probabilistic representations of a heat kernel with the aid of the Malliavin calculus provide a clear route to the results. This book is based on the Japanese one Kakuritsu Kaiseki published by Baifukan in 2013. In this book, a section discussing the close conjunction of the Malliavin calculus and the rough path theory and a chapter on mathematical finance are newly added. During the writing of the Japanese book and this one, we have received much benefit from several representative monographs and books on stochastic analysis; these include: ●







R. Durrett, Brownian Motion and Martingales in Analysis, Wadsworth, 1984; N. Ikeda and S. Watanabe, Stochastic Differential Equations and Diffusion Processes, 2nd edn., North Holland/Kodansha, 1989; I. Karatzas and S. E. Shreve, Brownian Motion and Stochastic Calculus, 2nd edn., Springer-Verlag, 1991; D. Revuz and M. Yor, Continuous Martingales and Brownian Motion, 3rd edn., Springer-Verlag, 1999.

As for the theory of martingales, we also gained some benefit from ●

H. Kunita, Estimation of Stochastic Processes (in Japanese), Sangyou Tosho, 1976.

This book did not come into existence without the Japanese one. We deeply thank Professor Nobuyuki Ikeda, our supervisor, who recommended our writing the Japanese book. In writing the Japanese version, we received kind assistance and help from several people, whom we gratefully acknowledge. Professor Yoichiro Takahashi was on the editorial board of the series where our book appeared and encouraged our writing. He and Professors Masanori Hino, Yu Hariya, and Koji Yano read through the draft and gave us stimulating comments and kind help. Hiroyuki Matsumoto Setsuo Taniguchi

4:36:22, subject to the Cambridge Core terms of use, .001

Frequently Used Notation

N ≡ {1, 2, 3, . . .}, Z ≡ {0, ±1, ±2, . . .}, Z+ ≡ {0, 1, 2, . . .} a ∧ b = min{a, b}, a ∨ b = max{a, b} a− = − min{a, 0} = max{−a, 0} a+ = max{a, 0}, [a] : the largest integer less than or equal to a ∈ R [s]n = 2−n [2n s] ∈ {k2−n }∞ k=0 (s  0, n ∈ N) sgn(x) = −1 (x  0), = 1 (x > 0) 1A : the indicator function of a set A σ(C ) : the σ-field generated by a family C of subsets σ(X1 , . . . , Xn ) : the σ-field generated by the random variables X1 , . . . , Xn B(S ) : the σ-field generated by open subsets of a topological space S C0 (S ) : the space of continuous functions on S with compact supports C n (Rd ) : the space of C n -class functions on Rd C0∞ (Rd ) : the space of C ∞ -functions on Rd with compact supports S (Rd ) : the space of rapidly decreasing C ∞ functions on Rd S  (Rd ) : the dual space of S (Rd ), the space of tempered distributions    Δ : the Laplacian on Rd , di=1 ∂x∂ i 2 |y−x|2

g(t, x, y) = (2πt)− 2 e− 2t (t > 0, x, y ∈ Rd ) : the Gauss kernel W = Wd = C([0, ∞) → Rd ) : the space of Rd -valued continuous functions W = W d = {w ∈ Wd ; w(0) = 0} : the Wiener space WT = WTd = C([0, T ] → Rd ) : the Wiener space on [0, T ] HT = HTd : the Cameron–Martin space of WTd

X : the quadratic variation process of a semimartingale X M 2 : the set of square-integrable martingales 2 : the set of square-integrable local martingales Mloc 2 Mc,loc : the set of square-integrable continuous local martingales d

xii 4:36:23, subject to the Cambridge Core terms of use,

1 Fundamentals of Continuous Stochastic Processes

In this chapter fundamentals of continuous stochastic processes are mentioned, taking into account their applications to stochastic analysis.

1.1 Stochastic Processes Let Ω be a set. Definition 1.1.1 A family F of subsets of Ω is said to be a σ-field if (i) Ω, ∅ ∈ F , (ii) if A ∈ F , then Ac := {ω ∈ Ω; ω  A} ∈ F ,  (iii) if Ai ∈ F (i = 1, 2, . . .), then ∞ i=1 Ai ∈ F . The pair (Ω, F ) is called a measurable space. Definition 1.1.2 Let (Ω, F ) be a measurable space. A set function P : F A → P(A)  0 is said to be a probability measure if (i) 0  P(A)  1 for all A ∈ F , (ii) P(Ω) = 1, (iii) for mutually disjoint subsets Ai ∈ F (i = 1, 2, . . .), ∞ 

∞  Ai = P(Ai ).

i=1

i=1

P

The triplet (Ω, F , P) is called a probability space. Throughout this book we denote by E or EP the expectation (integral) with respect to P. 1 4:36:24, subject to the Cambridge Core terms of use, .002

2

Fundamentals of Continuous Stochastic Processes

Given a family C of subsets of Ω, we denote the smallest σ-field including C by σ(C ) :

σ(C ) = G, G

where G runs over all σ-fields on Ω including C . If Ω is a topological space, B(Ω) denotes the smallest σ-field containing all open subsets of Ω and is called the Borel σ-field on Ω. Definition 1.1.3 Given a topological space E, a mapping X : Ω → E is said to be F /B(E)-measurable if X −1 (A) := {ω ∈ Ω; X(ω) ∈ A} ∈ F holds for any A ∈ B(E). Such an X is called an E-valued random variable. The probability measure P ◦ X −1 on E induced by X, (P ◦ X −1 )(A) = P(X −1 (A))

(A ∈ B(E)),

is called the probability distribution or probability law of X. The purpose of this book is to study several kinds of analysis based on continuous stochastic processes. Here we introduce path spaces which play a fundamental role in such studies. Definition 1.1.4 Let (E, dE ) be a complete separable metric space. (1) For T > 0, WT (E) stands for the set of E-valued continuous functions on [0, T ]. WT (E) is endowed with the topology of uniform convergence, or equivalently, with the distance function given by d(w1 , w2 ) = max{dE (w1 (t), w2 (t)); 0  t  T }, which makes WT (E) a complete separable metric space. (2) The set of E-valued continuous functions on [0, ∞) is denoted by W(E), and it is endowed with the topology of uniform convergence on compact sets, or equivalently, with the distance function given by d(w1 , w2 ) =

∞ n=1

2−n max (dE (w1 (t), w2 (t))) ∧ 1 , 0tn

by which W(E) is a complete separable metric space. The Borel σ-fields with respect to the respective topologies are denoted by B(WT (E)), B(W(E)).

4:36:24, subject to the Cambridge Core terms of use, .002

3

1.1 Stochastic Processes Proposition 1.1.5 Let CT (E) be the totality of subsets of WT (E) of the form {w ∈ WT (E); w(t1 ) ∈ A1 , w(t2 ) ∈ A2 , . . . , w(tn ) ∈ An }

(1.1.1)

with 0 < t1 < t2 < · · · < tn  T, A1 , A2 , . . . , An ∈ B(E). Then σ(CT (E)) = B(WT (E)). Proof For an open set G of E and for t > 0, {w ∈ WT (E); w(t) ∈ G} is an open set of WT (E). Hence σ(CT (E)) ⊂ B(WT (E)). To prove σ(CT (E)) ⊃ B(WT (E)), it suffices to show w; max dE (w(t), w0 (t))  δ ∈ σ(CT (E)) 0tT

for any w0 ∈ WT (E) and δ > 0. It is easily obtained from the following identity:

w; max dE (w(t), w0 (t))  δ = {w; dE (w(r), w0 (r))  δ}.  0tT

0rT, r∈Q

We call a set of the form (1.1.1) a Borel cylinder set. We define the Borel  cylinder sets of W(E) in the same way. Then, setting C (E) = T >0 CT (E), we have σ(C (E)) = B(W(E)). Definition 1.1.6 Let T be [0, T ] (T > 0) or [0, ∞). (1) A family X = {X(t)}t∈T of E-valued random variables defined on a probability space (Ω, F , P) is called an E-valued stochastic process. When T = [0, ∞), we write X = {X(t)}t0 . (2) A stochastic process X is said to be continuous if for almost all ω ∈ Ω an E-valued function X(ω) : T t → X(t)(ω) ∈ E on T is continuous. (3) For each w ∈ Ω, the function X(ω) is called a sample path. Definition 1.1.7 Let T be the same as above and X = {X(t)}t∈T and X  = {X  (t)}t∈T be stochastic processes defined on a probability space (Ω, F , P). If P(X(t) = X  (t)) = 1 for all t ∈ T, X  is called a modification of X. Throughout this book, x, y denotes the standard inner product of x = (x1 , x2 , . . . , xd ), y = (y1 , y2 , . . . , yd ) ∈ Rd and |x| denotes the norm of x:

x, y =

d i=1

xi yi ,

|x| =

d 12 (xi )2 . i=1

The next assertion is called Kolmogorov’s continuity theorem.

4:36:24, subject to the Cambridge Core terms of use, .002

4

Fundamentals of Continuous Stochastic Processes

Theorem 1.1.8 Let X = {X(t)}t0 be an Rd -valued stochastic process defined on a probability space (Ω, F , P), and assume that for any T > 0 there exist positive constants α, β, C such that E[|X(t) − X(s)|α ]  C(t − s)1+β

(0  s < t  T ).

Then, there exists a modification X  of X which satisfies   |X  (t) − X  (s)| P lim sup = 0 =1 h↓0 0s 0 we let ϕT (w) be the restriction of w ∈ W to [0, T ]. Then, there exists a probability measure μ on W whose image measure under ϕT is the Wiener measure on WT . We also call μ the Wiener measure and the probability space (W, B(W), μ) the d-dimensional Wiener space on [0, ∞). We show some basic properties of the Wiener measure. Theorem 1.2.8 (1) Define the transforms ψc (c > 0), φ s (s > 0) and ϕQ (Q is a d-dimensional orthogonal matrix) on W by 1 w(c2 t), φ s (w)(t) = w(s + t) − w(s), c Then μ is invariant under ψc , φ s , ϕQ . (2) Define the transform ΦT on WT by ψc (w)(t) =

ΦT (w)(t) = w(T − t) − w(T )

ϕQ (w)(t) = Qw(t).

(0  t  T ).

Then μT is invariant under ΦT . Proof To see the invariance under ψc , we have only to show

4:36:24, subject to the Cambridge Core terms of use, .002

9

1.3 Filtered Probability Space, Adapted Stochastic Process  1  1 1 μ w; w(c2 t1 ) ∈ A1 , w(c2 t2 ) ∈ A2 , . . . , w(c2 tn ) ∈ An c c c = μ({w; w(t1 ) ∈ A1 , w(t2 ) ∈ A2 , . . . , w(tn ) ∈ An })

for t1 < t2 < · · · < tn , A1 , A2 , . . . , An ∈ B(Rd ) by using the identity g(t, x, y) =  cd · g(c2 t, cx, cy) (c > 0). The other assertions are shown similarly. Remark 1.2.9 Set Ψ(w)(0) = 0, Ψ(w)(t) = tw( 1t ) (t > 0). Then Ψ(w) is continuous at t = 0 almost surely under the Wiener measure. Hence, we may regard Ψ as a transform of the Wiener space and can prove that the Wiener measure is invariant under Ψ in the same way as in Theorem 1.2.8. Definition 1.2.10 A d-dimensional continuous stochastic process {X(t)}t0 defined on a probability space (Ω, F , P) is called a d-dimensional Brownian motion (or Wiener process) starting from 0 if X(0) = 0 and its probability distribution on W is the Wiener measure.

1.3 Filtered Probability Space, Adapted Stochastic Process Let (W, B(W), μ) be the d-dimensional Wiener space on [0, ∞) and define a function θ(s) (s  0) on W by θ(s)(w) = w(s). Then the smallest σ-field which makes the behavior of each w ∈ W up to time t measurable is Bt0 := σ({θ(s)−1 (A); A ∈ B(Rd ), 0  s  t})

(1.3.1)

and it forms an increasing sequence {Bt0 }t0 of sub-σ-fields of B(W). Moreover, setting

Bt = Bu0 , u>t

 we see that {Bt }t0 is right-continuous, that is, Bt+0 := u>t Bu = Bt .3 The stochastic process {θ(t)}t0 defined on the Wiener space W is called the coordinate process. Definition 1.3.1 A quartet (Ω, F , P, {Ft }) of a probability space (Ω, F , P) and an increasing sequence {Ft }t0 of sub-σ-fields of F is called a filtered probability space. 3

{Bt0 } is not right-continuous. For example, for a function φ on [0, ∞), the random variable 0 lim supu↓t w(u)−w(t) φ(u−t) is not Bt -measurable, but it is Bt -measurable.

4:36:24, subject to the Cambridge Core terms of use, .002

10

Fundamentals of Continuous Stochastic Processes

{Ft } is called a filtration. If {Ft } is right-continuous and each Ft contains all P-null sets, {Ft } is said to satisfy the usual condition. Next we mention the measurability of stochastic processes. While the purpose of this book is to develop analysis of continuous stochastic processes, we need to consider stochastic processes with discontinuous paths. Intuitive understanding is enough for our purpose and we do not discuss in detail but refer to [45, 56, 86, 114] and so on. Let E be a complete separable metric space. Definition 1.3.2 Let X = {X(t)}t0 be an E-valued stochastic process defined on a filtered probability space (Ω, F , P, {Ft }). (1) X is called measurable if X(t)(ω) is B([0, ∞)) × F -measurable as a function of (t, ω). (2) X is {Ft }-adapted if the E-valued random variable X(t) is Ft -measurable for each t. (3) X is {Ft }-progressively measurable if the map [0, t] × Ω (s, ω) → X(s)(ω) ∈ E is B([0, t]) × Ft -measurable for each t. We often write X(s, ω) for the value X(s)(ω) of the sample process at s, regarding it as a function in two variables (s, ω). Proposition 1.3.3 If an E-valued {Ft }-adapted stochastic process X defined on a filtered probability space (Ω, F , P, {Ft }) has right-continuous paths, that is, if for any ω ∈ Ω the map t → X(t)(ω) is right-continuous, then X is {Ft }progressively measurable. Proof Fix t > 0 and put for n = 1, 2, . . . X (n) (0) = X(0), ∞  ( j + 1)t  X X (n) (s) = 1[ jt , ( j+1)t ) (s) n n n j=0

(s > 0).

Then, letting j = j(s) (s ∈ [0, t]) be the integer such that njt < s  ( j+1)t n , we see that both  ( j + 1)t  [0, t] × Ω (s, ω) → X (ω) and 1[ jt , ( j+1)t ) (s) n n n are B([0, t]) × Ft -measurable. Since limn→∞ X (n) (s)(ω) = X(s)(ω) for any (s, ω) by the right-continuity of {X(s)}, X(s)(ω) is B([0, t]) × Ft measurable.  We end this section with an explanation on stopping times.

4:36:24, subject to the Cambridge Core terms of use, .002

1.4 Discrete Time Martingales

11

Definition 1.3.4 A [0, ∞]-valued random variable τ defined on a filtered probability space (Ω, F , P, {Ft }) is called an {Ft }-stopping time if {ω; τ(ω)  t} ∈ Ft

(t  0).

Proposition 1.3.5 For a filtered probability space (Ω, F , P, {Ft }) satisfying the usual condition, τ is an {Ft }-stopping time if and only if {ω; τ(ω) < t} ∈ Ft holds for any t > 0. Theorem 1.3.6 For an E-valued {Ft }-adapted continuous stochastic process X = {X(t)}t0 defined on a filtered probability space (Ω, F , P, {Ft }) which satisfies the usual condition, let τA be the first hitting time to a Borel set A ∈ B(E), τA (ω) = inf{t > 0; X(t)(ω) ∈ A}, where we put τA (ω) = ∞ if X(t)(ω)  A for all t. Then τA is an {Ft }-stopping time. We omit proofs of Proposition 1.3.5 and Theorem 1.3.6. See, for example, [56, 86].

1.4 Discrete Time Martingales The notion of martingales was introduced by Ville and Lévy. The theory developed by Doob ([15]) plays a fundamental role in stochastic analysis. While we are mainly concerned with continuous time stochastic processes in this book, we mention in this section about discrete time martingales, the theory of which is a basis of that on continuous time martingales.

1.4.1 Conditional Expectation We recall the conditional expectation and its properties. Let (Ω, F , P) be a probability space and G be a sub-σ-field of F . For an integrable random variable X, there exists a unique, up to the difference on P-null sets, integrable  satisfying G -measurable random variable X  E[XY] = E[XY]  the conditional for any bounded G -measurable random variable Y. We call X expectation of X with respect to G and denote it by E[X|G ].

4:36:24, subject to the Cambridge Core terms of use, .002

12

Fundamentals of Continuous Stochastic Processes

Theorem 1.4.1 Let X, X  , X1 , X2 , . . . be integrable random variables defined on a probability space (Ω, F , P) and G be a sub-σ-field of F . (1) If X is G -measurable, then E[X|G ] = X, P-a.s. where P-a.s. means except for a set of P-probability zero. (2) E[E[X|G ]] = E[X]. (3) [linearity] For any a, b ∈ R E[aX + bX  |G ] = aE[X|G ] + bE[X  |G ],

P-a.s.

(4) [positivity] If X  0, P-a.s., then E[X|G ]  0, P-a.s. (5) If Y is G -measurable and XY ∈ L1 (P), then E[XY|G ] = YE[X|G ], P-a.s. (6) [tower property] If H is a sub-σ-field of G , then E[E[X|G ]|H ] = E[X|H ],

P-a.s.

(7) [Jensen’s inequality] If ϕ : R → R is convex and ϕ(X) ∈ L1 (P), then ϕ(E[X|G ])  E[ϕ(X)|G ],

P-a.s.

In particular, if X ∈ L p (P), p  1, then |E[X|G ]| p  E[|X| p |G ],

P-a.s.

(8) [Fatou’s lemma] If Xn  0 (n = 1, 2, . . .), then    E lim inf Xn G  lim inf E[Xn |G ], n→∞

n→∞

P-a.s.

(9) [monotone convergence theorem] If Xn (n = 1, 2, . . .) is monotone increasing in n and converges to a random variable X, P-a.s., then E[Xn |G ] also converges to E[X|G ] almost surely. (10) [Lebesgue’s convergence theorem] If Xn converges to X, P-a.s. and if there exists an integrable non-negative random variable Y such that |Xn |  Y, P-a.s. (n = 1, 2, . . .), then E[Xn |G ] converges to E[X|G ], P-a.s. (11) Let p  1. If Xn converges to X in L p , then E[Xn |G ] converges to E[Xn |G ] in L p . (12) If X is independent of G , then E[X|G ] = E[X], P-a.s. Proposition 1.4.2 Let X, Y be random variables defined on a probability space (Ω, F , P) and G be sub-σ-field of F . Assume that X is independent of G and Y is G -measurable. Then, setting ϕ(y) = E[g(X, y)] (y ∈ R) for a bounded Borel-measurable function g : R × R → R, we have  = E[g(X, Y)|G ], P-a.s. ϕ(Y) = E[g(X, y)|G ] y=Y We refer the reader to [15, 45, 114, 126] and so on for more about the conditional expectation.

4:36:24, subject to the Cambridge Core terms of use, .002

1.4 Discrete Time Martingales

13

1.4.2 Martingales, Doob Decomposition Let {Fn }∞ n=0 be an increasing sequence of sub-σ-fields of F . Definition 1.4.3 Let X = {Xn }∞ n=0 be an R-valued stochastic process and assume that it is {Fn }-adapted, that is, Xn is Fn -measurable for each n. Then, X is called an {Fn }-martingale or simply a martingale, if Xn is integrable for each n and if E[Xn+1 |Fn ] = Xn ,

P-a.s.

X is called a submartingale if the inequality E[Xn+1 |Fn ]  Xn , P-a.s. holds in place of the equality. X is called a supermartingale if the converse inequality holds. If X is a submartingale, E[Xm |Fn ]  Xn , P-a.s. for m > n. The following two propositions are easily proven. Proposition 1.4.4 If X is a submartingale, E[Xn ] is monotone increasing in n. If X is a martingale, E[Xn ] is a constant independent of n. Proposition 1.4.5 (1) For a martingale X and a convex function ϕ, {ϕ(Xn )}∞ n=0 is a submartingale if ϕ(Xn ) is integrable for every n. (2) For a submartingale X and a convex and monotone decreasing function ϕ, {ϕ(Xn )}∞ n=0 is a submartingale if ϕ(Xn ) is integrable for every n. Example 1.4.6 A simple random walk is a martingale. In fact, let {ξn }∞ n=1 be a sequence of independent identically distributed random variables such that P(ξn = ±1) = 12 , and put F0 = {∅, Ω} and Fn = σ{ξ1 , ξ2 , . . . , ξn }, the smallest σ-field which makes ξ1 , ξ2 , . . . , ξn measurable. Set S 0 = 0 and S n = ξ1 + ξ2 + · · · + ξn (n = 1, 2, . . .). Then {S n }∞ n=0 is an {Fn }-martingale. Moreover, set f1 = 1 and ⎧ ⎪ ⎪ ⎨1 (if ξn−1 = 1), (n = 2, 3, . . .), fn = ⎪ ⎪ ⎩2k (if ξn−1 = ξn−2 = · · · = ξn−k = −1, ξn−k−1 = 1), and define a stochastic process Y = {Yn }∞ n=0 by Y0 = 0

and Yn = f1 ξ1 + f2 ξ2 + · · · + fn ξn .

Then fn is Fn−1 -measurable and Y is a martingale. Imagine that we make a sequence of fair gambles and bet one dollar the first time. The martingale Y in Example 1.4.6 represents the gain (loss) at time n

4:36:24, subject to the Cambridge Core terms of use, .002

14

Fundamentals of Continuous Stochastic Processes

when a gambler bets one dollar after a win and twice the previous one after a loss. Such a betting strategy is called a “martingale”. Definition 1.4.7 A stochastic process f = { fn }∞ n=1 is called predictable if fn is Fn−1 -measurable for any n = 1, 2, . . . Setting f0 = 0, we often consider { fn }∞ n=0 . As in Example 1.4.6, we can construct a new martingale from a martingale and a predictable process. This gives an original form of the stochastic integrals. ∞ Proposition 1.4.8 Let X = {Xn }∞ n=0 be a martingale and f = { fn }n=1 be a predictable process. Then, if fn (Xn − Xn−1 ) is integrable for each n, the stochastic process Z = {Zn }∞ n=0 defined by

Z0 = 0

and

Zn =

n

fk (Xk − Xk−1 ) (n = 1, 2, . . .)

(1.4.1)

k=1

is an {Fn }-martingale. Z is called the martingale transform of X by f . Together with the following proposition, we leave the proofs to the reader. ∞ Proposition 1.4.9 Let X = {Xn }∞ n=0 be a submartingale and f = { fn }n=1 be a non-negative, bounded predictable process. Then, the stochastic process Z = {Zn }∞ n=0 defined by (1.4.1) is an {Fn }-submartingale.

Remark 1.4.10 Let {Fn } and S = {S n }∞ n=1 be as in Example 1.4.6. Then, any is given by a martingale transform of S . {Fn }-martingale X = {Xn }∞ n=1 To see this, fix n. Since Xn is Fn -measurable, there exists a function Φn : {−1, 1}n → R such that Xn = Φn (ξ1 , . . . , ξn ). Then, by Proposition 1.4.2, we have that  Xn − Xn−1 = Φn (ξ1 , . . . , ξn ) − E[Φn (x1 , . . . , xn−1 , ξn )] x =ξ ,...,x =ξ 1

1

n−1

n−1

 1  = ξn Φn (ξ1 , . . . , ξn−1 , 1) − Φn (ξ1 , . . . , ξn−1 , −1) , 2 where we have used  1 Φn (x1 , . . . , xn−1 , 1) + Φn (x1 , . . . , xn−1 , −1) . 2  1 Thus, setting fn = 2 Φn (x1 , . . . , xn−1 , 1) − Φn (x1 , . . . , xn−1 , −1) , we obtain the desired expression. E[Φn (x1 , . . . , xn−1 , ξn )] =

4:36:24, subject to the Cambridge Core terms of use, .002

15

1.4 Discrete Time Martingales

It should be remarked that, for every {Fn }-martingale X, each Xn is bounded as seen in the above paragraph. ∞ Let {Mn }∞ n=0 be an {Fn }-martingale and {An }n=0 be a predictable increasing process (A0  A1  A2  · · · ) with A0 = 0. Then, the stochastic process X = {Xn }∞ n=1 given by Xn = Mn + An is an {Fn }-submartingale. The next theorem shows that any submartingale has such a decomposition, which is called the Doob decomposition.

Theorem 1.4.11 Let X = {Xn }∞ n=0 be an {Fn }-adapted integrable stochastic process. Then there exists an {Fn }-martingale M = {Mn }∞ n=0 and a predictable with A = 0 such that X = M +A and the decomposition process A = {An }∞ 0 n n n n=0 is unique. In particular, if X is a submartingale, A is an increasing process. Proof At first we show the uniqueness. Let us assume that Xn has two decompositions Xn = Mn + An = Mn + An , where {Mn }, {Mn } are martingales and {An }, {An } are predictable. Then, since An − An = Mn − Mn , E[An − An |Fn−1 ] = An−1 − An−1 . On the other hand, since An − An is Fn−1 -measurable, An − An = An−1 − An−1 . Hence, An − An = A0 − A0 = 0 and An = An , Mn = Mn . Next we show the existence of the decomposition. Put M0 = X0 and Mn =

n

(Xk − E[Xk |Fk−1 ])

(n = 1, 2, . . .).

k=1

Then {Mn }∞ n=0 is an {Fn }-martingale. Moreover, since n−1 An := Xn − Mn = E[Xn |Fn−1 ] − (Xk − E[Xk |Fk−1 ])

(n = 1, 2, . . .)

k=1

is Fn−1 -measurable, Xn = Mn + An is the desired decomposition. If X is a submartingale, since An − An−1 = E[Xn |Fn−1 ] − Xn−1  0 {An } is increasing.

(n = 1, 2, . . .), 

Example 1.4.12 Let X = {Xn }∞ n=0 be the martingale in Example 1.4.6. Then 2 ∞ {Xn }n=0 is a submartingale and {Xn2 − n}∞ n=0 is a martingale.

4:36:24, subject to the Cambridge Core terms of use, .002

16

Fundamentals of Continuous Stochastic Processes

1.4.3 Optional Stopping Theorem We define stopping times also for stochastic processes with discrete time parameters. Definition 1.4.13 An N ∪ {0, ∞}-valued random variable τ defined on a probability space (Ω, F , P, {Fn }) is called an {Fn }-stopping time if {τ  n} ∈ Fn for any n. Remark 1.4.14 The condition in the definition above is equivalent to that {τ = n} ∈ Fn for any n. Example 1.4.15 Let X = {Xn }∞ n=0 be an R-valued, {Fn }-adapted stochastic process and G ∈ B(R). Set τG = min{n; Xn ∈ G}, the first hitting time to G, where τG = ∞ if Xn  G for any n. Then, τG is an {Fn }-stopping time. The next theorem shows that a submartingale stopped at a stopping time is again a submartingale. Theorem 1.4.16 (Optional stopping theorem) Let X = {Xn }∞ n=0 be a is also a submartingale. Then, for any stopping time τ, X τ = {Xn∧τ }∞ n=0 submartingale. Proof Set fn = 1{τn} . Then, since { fn = 0} = {τ < n} = {τ  n − 1} ∈ Fn−1 , { fn }∞ n=1 is a predictable process and we have Xn∧τ = X0 +

n

fk (Xk − Xk−1 ).

k=1

Hence, by Proposition 1.4.9, X τ is a submartingale.



We give a σ-field which represents information up to a stopping time. Proposition 1.4.17 If τ is an {Fn }-stopping time, a family Fτ of subsets in F given by Fτ = {A ∈ F ; A ∩ {τ  n} ∈ Fn , n = 0, 1, 2, . . .} is a σ-field and τ is Fτ -measurable. Proposition 1.4.18 Let σ and τ be {Fn }-stopping times. Then, (1) {τ < σ}, {τ = σ} and {τ  σ} belong to both Fσ and Fτ . (2) If τ  σ, then Fτ ⊂ Fσ . The proofs of the two propositions are easy and are omitted. 4:36:24, subject to the Cambridge Core terms of use, .002

17

1.4 Discrete Time Martingales

1.4.4 Convergence Theorem + We show that, if a submartingale X = {Xn }∞ n=0 satisfies supn E[Xn ] < ∞, then Xn converges P-a.s. as n → ∞. Hence, if X is a non-positive submartingale or a non-negative supermartingale, it converges almost surely. For this purpose we define the number of upcrossings and prove an inequality (Theorem 1.4.20) due to Doob. Let X = {Xn }∞ n=0 be an {Fn }-adapted stochastic process. We consider a bounded closed interval [a, b] (a < b) and define a sequence of stopping times defined by

σ1 = inf{n  0; Xn  a},

τ1 = inf{n  σ1 ; Xn  b},

σk = inf{n  τk−1 ; Xn  a},

τk = inf{n  σk ; Xn  b},

(k = 2, 3, . . .).

Here, if the set {· · · } is empty, inf{· · · } = ∞. We put β(a, b) = sup{k; τk < ∞} and call it the upcrossing number of X of [a, b]. By definition, if lim inf Xn < a < b < lim sup Xn , then β(a, b) = ∞, n→∞

n→∞

if β(a, b) = ∞, then lim inf Xn  a < b  lim sup Xn . n→∞

n→∞

Hence, a necessary and sufficient condition that Xn converges or diverges to ±∞ as n → ∞ is that β(a, b) < ∞ for any rational a and b.4 Proposition 1.4.19 Let X = {Xn }∞ n=0 be a submartingale. Set βN (a, b) = sup{k; τk  N} for N = 1, 2, . . . Then, (b − a)E[βN (a, b)]  E[(XN − a)+ ] − E[(X0 − a)+ ],

(1.4.2)

where x+ = max{x, 0} for x ∈ R. Proof At first we prove the proposition under the additional condition that Xn  0 and a = 0. Define { fi }∞ i=1 by ⎧ ∞ ⎪ ⎪ ⎨1 (i ∈ k=1 (σk , τk ] ) fi = ⎪ ⎪ ⎩0 (i ∈ ∞ (τk , σk+1 ] ). k=1 Then { fi }∞ i=1 is predictable. Moreover, setting Y0 = 0

and Yn =

n

fi (Xi − Xi−1 )

(n = 1, 2, . . .),

i=1 4

lim inf n→∞ Xn = a or lim supn→∞ Xn = b does not imply β(a, b) = ∞. 4:36:24, subject to the Cambridge Core terms of use, .002

18

Fundamentals of Continuous Stochastic Processes

we obtain E[YN ]  bE[βN (0, b)]

(1.4.3)

∞

since YN = k=1 (Xτk ∧N − Xσk ∧N ) and YN  bβN (0, b). On the other hand, since Xn − X0 − Yn =

n (1 − fi )(Xi − Xi−1 ), i=1

{Xn − X0 −Yn }∞ n=0

is a submartingale by Proposition 1.4.9. In particular, we have E[XN − X0 − YN ]  0 and, combining this with (1.4.3), we obtain bE[βN (0, b)]  E[XN ] − E[X0 ] and the assertion of the proposition. In the general case, noting that {(Xn − a)+ }∞ n=0 is a submartingale and its upcrossing number up to time N of [0, b − a] is equal to βN (a, b) and applying the above result, we obtain the assertion.  Considering the upper bound in N of the right hand side of (1.4.2) and applying the monotone convergence theorem to the left hand side, we obtain the following. Theorem 1.4.20 (b − a)E[β(a, b)]  supn E[(Xn − a)+ ] − E[(X0 − a)+ ]. From Theorem 1.4.20, we get the convergence theorem for submartingales. + Theorem 1.4.21 If a submartingale X = {Xn }∞ n=0 satisfies supn E[Xn ] < ∞, then Xn converges to an integrable random variable P-a.s. as n → ∞.

Proof Theorem 1.4.20 implies that β(a, b) < ∞ for any a < b almost surely. Hence Xn converges almost surely. Moreover, since |x| = 2x+ − x, E[|Xn |] = 2E[Xn+ ] − E[Xn ]  2 sup E[Xn+ ] − E[X0 ] < ∞. n

Denote the limit of Xn by X∞ . Then, by Fatou’s lemma,   E[|X∞ |] = E lim inf |Xn |  lim inf E[|Xn |] < ∞. n→∞

n→∞



Next we show that a uniformly integrable martingale (see Section A.3) converges in L1 and almost surely. For this purpose we introduce the notion of closability, which is important also in the optional sampling theorem given in the next section.

4:36:24, subject to the Cambridge Core terms of use, .002

1.4 Discrete Time Martingales

19

Definition 1.4.22 A martingale X = {Xn }∞ n=0 , for which there exists an F∞ :=  F -measurable and integrable random variable X∞ satisfying n n Xn = E[X∞ |Fn ],

P-a.s. (n = 0, 1, 2, . . .),

is called a closable martingale. When X is a submartingale, if there exists an F∞ -measurable and integrable random variable X∞ satisfying Xn  E[X∞ |Fn ],

P-a.s. (n = 0, 1, 2, . . .),

(1.4.4)

then X is called a closable submartingale. As is mentioned in Theorem 1.4.24 below, if X is a closable martingale, then Xn converges to X∞ as n → ∞. The Doob decomposition of a closable submartingale is given by the following. Proposition 1.4.23 A submartingale X = {Xn }∞ n=0 is closable if and only if X is a sum of a closable martingale and a non-positive submartingale. Proof Assume that X is a closable submartingale. Then, by the tower property, E[Xn − E[X∞ |Fn ]|Fn−1 ]  Xn−1 − E[X∞ |Fn−1 ]. Hence, X is a sum of the martingale {E[X∞ |Fn ]}∞ n=0 and the non-positive . submartingale {Xn − E[X∞ |Fn ]}∞ n=0 The converse may be easily shown.  Theorem 1.4.24 For a martingale X = {Xn }∞ n=0 , the following conditions are equivalent: (i) X is uniformly integrable, (ii) Xn converges in L1 as n → ∞, (iii) X is closable. Proof If X is uniformly integrable, supn E[|Xn |] < ∞ and supn E[Xn+ ] < ∞. Hence, by Theorem 1.4.21, Xn converges almost surely as n → ∞. Moreover, by Theorem A.3.7, Xn is also convergent in L1 . Next we assume that Xn converges in L1 and let X∞ be the limit. Then, by Theorem 1.4.1(11), E[X∞ |Fn ] = lim E[Xm |Fn ] m→∞

1

in L . Since the right hand side is equal to Xn , P-a.s., X is closable. 4:36:24, subject to the Cambridge Core terms of use, .002

20

Fundamentals of Continuous Stochastic Processes

Finally we assume that X is closable. Then |Xn |  E[|X∞ ||Fn ] and E[|Xn |1{|Xn |c} ]  E[|X∞ |1{|Xn |c} ].

(1.4.5)

Letting c → ∞, we have sup P(|Xn |  c)  n

1 1 sup E[|Xn |]  E[|X∞ |] → 0 c n c

and, combining this with (1.4.5), we obtain the uniform integrability of X by Proposition A.3.1.  Theorem 1.4.25 A submartingale X = {Xn }∞ n=0 is closable if and only if is uniformly integrable. {Xn+ }∞ n=0 Proof Assume that X is closable. Set Mn = E[X∞ |Fn ]. Then {Mn }∞ n=0 is a closable martingale and is uniformly integrable by Theorem 1.4.24. In particular, {Mn+ } is uniformly integrable. Since Xn  Mn and 0  Xn+  Mn+ , {Xn+ } is uniformly integrable. Conversely, assume that {Xn+ } is uniformly integrable. Then, since {Xn+ } is a non-negative submartingale (Theorem 1.4.5), Xn+ converges to an integrable random variable Y as n → ∞ by Theorem 1.4.21. Moreover, since Xn+ converges also in L1 (Theorem A.3.7), we obtain E[Y|Fn ] = lim E[Xm+ |Fn ]  Xn+ , m→∞

P-a.s.,

which shows that {Xn − E[Y|Fn ]}∞ n=0 is a non-positive submartingale. Since is closable, {Xn } is a closable submartingale by the martingale {E[Y|Fn ]}∞ n=0 Proposition 1.4.23. 

1.4.5 Optional Sampling Theorem The following optional sampling theorem is useful in various situations. Theorem 1.4.26 Let X = {Xn }∞ n=0 be a closable submartingale. Then, for any stopping time τ, Xτ is integrable and, for another stopping time σ, E[Xτ |Fσ ]  Xσ∧τ , P-a.s. In particular, if X is a closable martingale, then E[Xτ |Fσ ] = Xσ∧τ , P-a.s.

4:36:24, subject to the Cambridge Core terms of use, .002

1.4 Discrete Time Martingales

21

Proof Let X∞ be the F∞ -measurable random variable satisfying (1.4.4). We can show E[X∞ |Fσ ]  Xσ . In fact, for any A ∈ Fσ , we have E[X∞ 1A ] = =

∞ k=0 ∞

E[X∞ 1A∩{σ=k} ] 



E[Xk 1A∩{σ=k} ]

k=0

E[Xσ 1A∩{σ=k} ] = E[Xσ 1A ].

k=0

In particular, if X is a martingale, we have an equality. Since, X τ = {Xn∧τ }∞ n=0 is a submartingale by Theorem 1.4.16, the proof is completed once we have shown that X τ is closable. By Proposition 1.4.23, there exist a closable martingale {Mn }∞ n=0 and a non∞ positive submartingale {Nn }n=0 such that Xn = Mn + Nn . Since {Nn∧τ }∞ n=0 is closable by Theorem 1.4.21, it suffices to show the closability of {Mn∧τ }∞ n=0 . Denote the limit of Mn as n → ∞ by M∞ . Since E[M∞ |Fn∧τ ] = Mn∧τ as is shown above,    E[|Mn∧τ |1{|Mn∧τ |c} ] = E E[M∞ |Fn∧τ ]1{|Mn∧τ |c}  E[|M∞ |1{|Mn∧τ |c} ]. Moreover, letting c → ∞, we obtain sup P(|Mn∧τ |  c)  n

1 1 sup E[|Mn∧τ |]  E[|M∞ |] → 0. c n c

∞ Hence {Mn∧τ }∞ n=0 is uniformly integrable and {Mn∧τ }n=0 is closable by Theorem 1.4.24. 

Corollary 1.4.27 Let X = {Xn }∞ n=0 be a submartingale and σ, τ be stopping times satisfying P(σ  τ  N) = 1 for some N ∈ N. Then E[Xτ |Fσ ]  Xσ ,

a.s.

In particular, if X is a martingale, the equality holds. τ Proof X N = {Xn∧N }∞ n=0 is a closable submartingale. Hence, since XN = Xτ , Theorem 1.4.26 implies

E[Xτ |Fσ ]  Xτ∧σ = Xσ .



The next corollary is easily obtained from Theorem 1.4.26.

4:36:24, subject to the Cambridge Core terms of use, .002

22

Fundamentals of Continuous Stochastic Processes

Corollary 1.4.28 Let X = {Xn }∞ n=0 be an {Fn }-closable submartingale and σ0 , σ1 , σ2 , . . . be an increasing sequence of stopping times. Then, the stochastic process {Yk }∞ k=0 defined by Yk = Xσk is an {Fσk }-submartingale. Remark 1.4.29 If each stopping time is bounded, Corollary 1.4.28 is true also when X is not closable. Remark 1.4.30 If X is not closable, the optional sampling theorem does not hold in general. For example, consider the martingale (simple random walk) S = {S n }∞ n=0 given in Example 1.4.6. We have lim supn→∞ |S n | = ∞ almost surely. Hence, letting τa be the first hitting time to a ∈ N, we have P(τa < ∞) = 1 and E[S τa ] = a. However, if we apply the optional sampling theorem, we should have E[S τa ] = E[S 0 ] = 0 and a contradiction.

1.4.6 Doob’s Inequality We show Doob’s inequality for the maximum of submartingales and its application. Theorem 1.4.31 Let X = {Xn }∞ n=0 be a submartingale and set Yn = max{Xk ; 0  k  n}. Then, for any a ∈ R and N ∈ N, aP(YN  a)  E[XN 1{YN a} ]  E[XN+ ]. Proof Let Ak ∈ Fk (k = 0, 1, . . . , N) be a sequence of events given by A0 = {X0  a} and Ak = {Xi < a (i = 0, . . . , k − 1), Xk  a}

(k = 1, 2, . . .).

Then aP(YN  a) =

N

E[a1Ak ] 

k=0

N

E[Xk 1Ak ].

k=0

N Since {Xk }k=0 is a submartingale, E[Xk 1Ak ]  E[XN 1Ak ] and

aP(YN  a) 

N

E[XN 1Ak ] = E[XN 1{YN a} ].

k=0

The second inequality is obtained from E[XN 1{τa N} ]  E[XN+ 1{τa N} ]  E[XN+ ].



4:36:24, subject to the Cambridge Core terms of use, .002

23

1.4 Discrete Time Martingales

The next inequality was first shown by Kolmogorov for a sum of independent random variables. It is obtained from Theorem 1.4.31 if we note that {Xn2 } is a submartingale. Corollary 1.4.32 If {Xn }∞ n=0 is a square-integrable martingale, then for any a > 0 and N ∈ N   1 P max |Xk |  a  2 E[|XN |2 ]. 0kN a The next result enables us to extend the martingale convergence theorem to L p - and almost sure convergences. We leave such extensions to the reader. Theorem 1.4.33 Let p, q be positive numbers such that p−1 + q−1 = 1 and X = {Xn }∞ n=0 be a non-negative p-th integrable submartingale. (1) Set Yn = max{Xk ; 0  k  n}. Then E[Ynp ]  q p E[Xnp ]. (2) If

supn E[Xnp ]

(1.4.6)

< ∞, then supn Xn is also p-th integrable and   E sup Xnp  q p sup E[Xnp ]. n

n

Proof (1) For K > 0, set YnK = K ∧ Yn . If λ  K, then {YnK  λ} = {Yn  λ} and, therefore, P(YnK  λ) = P(Yn  λ) 

1 1 E[Xn 1{Yn λ} ] = E[Xn 1{YnK λ} ] λ λ

by Doob’s inequality (Theorem 1.4.31). Since P(YnK  λ) = 0 for λ > K,



∞ 1 E[(YnK ) p ] = pλ p−1 P(YnK  λ) dλ  pλ p−1 E[Xn 1{YnK λ} ] dλ λ 0 0  YnK p E[Xn (YnK ) p−1 ]. = E Xn pλ p−2 dλ = p − 1 0 Hence, by Hölder’s inequality, we obtain 1

1

E[(YnK ) p ]  qE[(Xn ) p ] p E[(YnK ) p ] q and, after a simple manipulation, E[(YnK ) p ]  q p E[(Xn ) p ]. Letting K → ∞ on the left hand side, we obtain the assertion by the monotone convergence theorem.

4:36:24, subject to the Cambridge Core terms of use, .002

24

Fundamentals of Continuous Stochastic Processes

(2) By the convergence theorem for submartingales (Theorem 1.4.21), both of lim Xn and sup Xn exist. Hence, taking the supremum in n on the right hand side of (1.4.6) and applying the monotone convergence theorem to its left hand side, we obtain the assertion. 

1.5 Continuous Time Martingale 1.5.1 Fundamentals Let (Ω, F , P, {Ft }) be a filtered probability space. Definition 1.5.1 (1) An R-valued stochastic process M = {M(t)}t0 defined on (Ω, F , P, {Ft }) is called an {Ft }-martingale if (i) {M(t)}t0 is {Ft }-adapted, (ii) for any t  0, E[|M(t)|] < ∞, (iii) for s  t, E[M(t)|F s ] = M(s),

(1.5.1)

P-a.s.

(2) If E[M(t)|F s ]  ()M(s), P-a.s. holds instead of (1.5.1), M is called an {Ft }-submartingale (supermartingale, respectively). Moreover, for p > 1, M is called an L p -(sub, super)martingale if E[|M(t)| p ] < ∞ (t  0). Proposition 1.5.2 (1) If M is an {Ft }-martingale and f is a convex continuous function such that E[| f (M(t))|] < ∞ for any t  0, { f (M(t))}t0 is an {Ft }submartingale.   (2) If an {Ft }-martingale M is continuous and satisfies E sup0tT |M(t)| < ∞ (T > 0), the stochastic process {Mη (t)}t0 defined by

Mη (t) = M(t)η(t) −

t

M(s)η (s) ds

0

for η ∈ C 1 ([0, ∞); R) is an {Ft }-martingale. Proof (1) For s < t, Jensen’s inequality implies E[ f (M(t))|F s ]  f (E[M(t)|F s ]) = f (M(s)),

P-a.s.

(2) Let s = t0 < t1 < · · · < tn = t be a partition of the interval [s, t]. Then, noting

4:36:24, subject to the Cambridge Core terms of use, .002

1.5 Continuous Time Martingale

M(t)η(t) − M(s)η(s) −

n−1

25

M(t j+1 )(η(t j+1 ) − η(t j ))

j=0

=

n−1

η(t j )(M(t j+1 ) − M(t j ))

j=0

and taking the conditional expectation with respect to F s of both sides, we obtain  n−1  E M(t)η(t) − M(s)η(s) − M(t j+1 )(η(t j+1 ) − η(t j ))F s = 0. j=0

Letting the width max1 jn |t j − t j−1 | of the partition tend to 0, we get

t   M(u)η (u) duF s = 0 E M(t)η(t) − M(s)η(s) − s



and the assertion.

Remark 1.5.3 Let {X(t)}t0 be a d-dimensional Brownian motion and f be a subharmonic function on Rd , that is, a function satisfying Δ f  0. If in addition f satisfies an adequate growth condition, we can show that { f (X(t))}t0 is a submartingale. It is an easy application of Itô’s formula given in the next chapter.

1.5.2 Examples on the Wiener Space We give typical examples of martingales defined on the Wiener space. Theorem 1.5.4 Let (W, B(W), μ) be the d-dimensional Wiener space, θ = {θ(t)}t0 be the coordinate process and {Bt0 } be the filtration given by (1.3.1). Then, the following stochastic processes are {Bt0 }-martingales: (1) {θi (t)}t0 (i = 1, 2, . . . , d), where θi (t) is the i-th component of θ(t); (2) {(θi (t))2 − t}t0 (i = 1, 2, . . . , d); 2 (3) {exp( λ, θ(t) − |λ|2 t )}t0 for λ ∈ Rd ; √ 2 (4) {exp(i λ, θ(t) + |λ|2 t )}t0 for λ ∈ Rd , where i = −1; ! t (5) { f (θ(t)) − 12 0 (Δ f )(θ(s)) ds}t0 for a rapidly decreasing function f , where d ∂ 2 Δ = i=1 ( ∂xi ) and f is said to be rapidly decreasing if, for any α1 , . . . , αd α1 +···+αd ∈ Z+ and β > 0, (1 + |x|)β ∂α∂1 x1 ···∂αd fxd converges to 0 as |x| → ∞. Remark 1.5.5 In (4), a C-valued stochastic process is called a martingale if its real and imaginary parts are both martingales.

4:36:24, subject to the Cambridge Core terms of use, .002

26

Fundamentals of Continuous Stochastic Processes

Proof By the definition of the Wiener measure,

ϕ(x)g(t − s, 0, x) dx E[ϕ(θ(t) − θ(s))|B s0 ] = Rd

for s  t and a continuous function ϕ with at most exponential growth, that is, there exist constants C1 , C2  0 such that |ϕ(x)|  C1 eC2 |x| (x ∈ Rd ). From this, the assertions (1)–(4) follow. For (5) note that there exists a rapidly decreasing function  f : Rd → C such that

  f (λ)ei λ,x dλ and (Δ f )(x) = − f (λ)|λ|2 ei λ,x dλ. f (x) = Rd

Rd

Then we obtain

1 t (Δ f )(θ(s)) ds M f (t) := f (θ(t)) − 2 0

 |λ|2 t i λ,θ(s)   e ds dλ. f (λ) ei λ,θ(t) + = 2 0 Rd

Now let {M(t)}t0 be the martingale in (4) and set η(t) = exp(− |λ|2 t ). Then, by Proposition 1.5.2, the stochastic process

|λ|2 t i λ,θ(s) e ds (t  0) ei λ,θ(t) + 2 0 2

is a martingale. Hence, by Fubini’s theorem, E[M f (t)|B s0 ] = M f (s), P-a.s.



Conversely, the martingales discussed in the above theorem characterize the Wiener measure. Theorem 1.5.6 Let X = {X(t)}t0 be an Rd -valued continuous {Ft }-adapted stochastic process starting from 0 defined on a filtered probability space (Ω, F , {Ft }, P). For λ ∈ Rd and f ∈ C0∞ (Rd ), set Lλ (t) = e λ,X(t) −

|λ|2 t 2

M f (t) = f (X(t)) −

, 1 2

Fλ (t) = ei λ,X(t) +

|λ|2 t 2

,

t

(Δ f )(X(s)) ds. 0

Then, if one of the following conditions holds, the probability law of X, that is, the probability measure on W induced by X, is the Wiener measure: (i) for any λ ∈ Rd , {Lλ (t)}t0 is an {Ft }-martingale, (ii) for any λ ∈ Rd , {Fλ (t)}t0 is an {Ft }-martingale, (iii) for any f ∈ C0∞ (Rd ), {M f (t)}t0 is an {Ft }-martingale. For the proof we prepare the following lemma. 4:36:24, subject to the Cambridge Core terms of use, .002

27

1.5 Continuous Time Martingale

Lemma 1.5.7 (1) Let f : Cn × Ω → Cd satisfy that, for each ω ∈ Ω, the mapping Cn ζ → f (ζ, ω) ∈ Cd is holomorphic5 and, for any R > 0, there exists a non-negative integrable random variable ΦR satisfying    ∂ f (ζ, ·)  Φ (|ζ|  R, i = 1, . . . , d). R  ∂ζ i  Then, the mapping Cd ζ → E[ f (ζ, ·)] ∈ C is holomorphic. (2) If a Cd -valued random variable X satisfies e p|X| ∈ L1 (P) for any p > 1, then Cd ζ → E[e ζ,X ] is holomorphic. Proof (1) We only show the case when n = 1. Write ζ = x + iy (x, y ∈ R). ∂ ∂ = −i ∂y for holomorphic functions, Since ∂ζ∂ = ∂x          ∂ f (ζ, ·) =  ∂ f (ζ, ·)  Φ ,  ∂ f (ζ, ·) =  ∂ f (ζ, ·)  Φ R R   ∂ζ    ∂ζ   ∂x  ∂y by the assumption if x2 + y2 < R2 . Hence, by the Lebesgue convergence theorem, E[ f (x + iy, ·)] is of C 1 -class in x, y and ∂f ∂ E[ f (x + iy, ·)] = E (x + iy, ·) , ∂x ∂x  ∂f ∂ E[ f (x + iy, ·)] = E (x + iy, ·) . ∂y ∂y Let

∂ ∂ζ

=

∂ ∂x

∂ + i ∂y . Then, since

∂ ∂ζ

f (ζ, ·) = 0,

∂ E[ f (ζ, ·)] = E f (ζ, ·) = 0. ∂ζ ∂ζ ∂

This means the assertion (1). (2) For |ζ| < R,

  ∂ and  e ζ,X   |X|e|ζ| |X|  e(R+1)|X| . ∂ζ

|e ζ,X |  e|ζ| |X|  eR|X|



Hence (1) implies the assertion. Proof of Theorem 1.5.6. Assume (i). Then, for s < t, A ∈ F s and λ ∈ Rd , E[e λ,X(t) −

|λ|2 t 2

|λ|2 s 2

1A ] = E[e λ,X(s) −

1A ].

(1.5.2)

By Lemma 1.5.7, both sides are extended to holomorphic functions in λ and (1.5.2) holds for all λ ∈ Cd by the uniqueness theorem. In particular, E[ei λ,X(t) + 5

|λ|2 t 2

1A ] = E[ei λ,X(s) +

|λ|2 s 2

1A ]

(λ ∈ Rd ).

A mapping Cd ζ → g(ζ) ∈ Cm is called holomorphic if each component g j (ζ) ( j = 1, . . . , m) of g(ζ) is holomorphic in each variable ζ i (i = 1, . . . , d). 4:36:24, subject to the Cambridge Core terms of use, .002

28

Fundamentals of Continuous Stochastic Processes

Thus (ii) is obtained. Next assume (iii). For ϕ(x) = exp(i λ, x ), there exists a sequence { fn } ⊂ ∞ C0 (Rd ) such that fn → ϕ, Δ fn → Δϕ and fn , Δ fn are uniformly bounded. For example, let jn be a function in C0∞ (Rd ) such that jn (x) = 1 if |x|  n and jn (x) = 0 if |x|  n+1. Then, fn = ϕ jn is a desired function. Since {M fn (t)}t0 is a martingale by the assumption, we obtain (ii) by letting n → ∞ and applying the bounded convergence theorem. Finally we show, from (ii), that the probability distribution of X is the Wiener measure. By the assumption,

2 i λ,X(t)−X(s) − |λ|2 (t−s) E[e |F s ] = e = ei λ,x g(t − s, 0, x) dx Rd

for any λ ∈ Rd . From this identity we obtain

E[ f (X(t) − X(s))|F s ] = f (x)g(t − s, 0, x) dx Rd

for any f ∈ C0 (Rd ) by a similar argument to the proof of Theorem 1.5.4(5). Hence X(t) − X(s) is independent of F s and obeys the d-dimensional normal distribution with mean 0 and covariance matrix (t − s)I, where I is the ddimensional unit matrix. Therefore, for any 0 = t0 < t1 < · · · < tn+1 , f0 , f1 , . . . , fn ∈ C0∞ (Rd ), n 

E

fi (X(ti+1 ) − X(ti ))

i=0

=

Rd

···

 n Rd i=0

fi (xi )

n 

g(ti+1 − ti , 0, xi ) dx1 · · · dxn ,

i=0

which shows that the distribution of X is the Wiener measure.



1.5.3 Optional Sampling Theorem, Doob’s Inequality, Convergence Theorem For continuous time martingales, the optional sampling theorem, Doob’s inequality, and convergence theorems are proven in the same way as in the discrete time case. We begin with the definition of closability. In this book, we always assume that a submartingale X = {X(t)}t0 is rightcontinuous, that is, the function [0, ∞) t → X(t, ω) ∈ R is right-continuous for all ω ∈ Ω. On a probability space (Ω, F , P, {Ft }) satisfying the usual condition, an {Ft }-submartingale {X(t)}t0 whose expectation E[X(t)] is rightcontinuous in t has a modification whose sample path is right-continuous and has left limits (for details, see [98, Chapter 2]).

4:36:24, subject to the Cambridge Core terms of use, .002

1.5 Continuous Time Martingale

29

Definition 1.5.8 Let X = {X(t)}t0 be a submartingale defined on (Ω, F ,  P, {Ft }). If there exists an F∞ := t0 Ft -measurable integrable random variable X∞ such that X(t)  E[X∞ |Ft ],

P-a.s.

for every t  0,

X is called closable. If X is a martingale and if there exists an F∞ -measurable integrable random variable X∞ such that X(t) = E[X∞ |Ft ],

P-a.s.

for every t  0,

X is called a closable martingale. After showing that each stopping time (see Definition 1.3.4) is approximated by a decreasing sequence of discrete stopping times, we prove the optional sampling theorem. Before it, we define a σ-field which represents the information up to a stopping time in the same way as the discrete time case (Proposition 1.4.17). Definition 1.5.9 For an {Ft }-stopping time τ, the σ-field Fτ is defined by Fτ = {A ∈ F ; A ∩ {τ  t} ∈ Ft for each t  0}. Lemma 1.5.10 For an {Ft }-stopping time τ, set ⎧ ⎪ ⎪ 0 (τ = 0) ⎪ ⎪ ⎪ ⎨ k+1 τn = ⎪ ( 2kn < τ  k+1 ⎪ 2n 2n , k = 0, 1, 2, . . .) ⎪ ⎪ ⎪ ⎩∞ (τ = ∞) for n = 1, 2, . . . . Then, for each n, τn is an {Ft }-stopping time. Proof For t  0, take k ∈ Z+ such that 2kn  t < k+1 2n . Then, k k {τn  t} = τn  n = τ  n ∈ F kn ⊂ Ft . 2 2 2



Theorem 1.5.11 Let X = {X(t)}t0 be a closable submartingale. Then, for any stopping time σ and τ, E[X(τ)|Fσ ]  X(τ ∧ σ),

P-a.s.

In particular, if X is a closable martingale, we have the equality.

4:36:24, subject to the Cambridge Core terms of use, .002

30

Fundamentals of Continuous Stochastic Processes

Proof We show the first half. Let τn and σn be as in the above lemma, By the optional sampling theorem for discrete time submartingale (Theorem 1.4.26), E[X(τn )|Fσn ]  X(τn ∧ σn ),

P-a.s.

and, hence, for any A ∈ Fσ ⊂ Fσn , E[X(τn )1A ]  E[X(τn ∧ σn )1A ].

(1.5.3)

By the closability of X and the Lévy–Doob downward theorem ([126, Chap∞ ter 14]), {X(τn )}∞ n=1 , {X(τn ∧ σn )}n=1 are uniformly integrable and converge to 1 X(τ), X(τ ∧ σ) in L , respectively. Therefore, letting n → ∞ in (1.5.3), we  obtain E[X(τ)1A ]  E[X(τ ∧ σ)1A ]. Corollary 1.5.12 Let X = {X(t)}t0 be an {Ft }-submartingale. Then, for any stopping time τ, X τ = {X(t ∧ τ)}t0 is also a submartingale. Proof For s < t, apply Theorem 1.5.11 by replacing τ by t ∧ τ and σ by s.  Then we obtain E[X(t ∧ τ)|F s ]  X(s ∧ τ). Doob’s inequality for a submartingale {X(t)}t0 follows from that for discrete time submartingales, since sup0st X(s) = limn→∞ max0kn X( ktn ). Theorem 1.5.13 Let X = {X(t)}t0 be a submartingale. Then, for any a, t > 0,    aP sup X(s) > a  E X(t) ; sup X(s) > a . 0st

0st

Theorem 1.5.14 Let p, q be positive numbers such that p−1 + q−1 = 1 and X = {X(t)}t0 be a non-negative submartingale with X(t) ∈ L p (t  0). Then sup0st X(s) ∈ L p and  E sup X(s) p  q p E[X(t) p ]. 0st

The following convergence theorem can also be proven in the same way as for discrete time submartingales. Theorem 1.5.15 (1) Let {X(t)}t0 be a submartingale defined on a probability space (Ω, F , P, {Ft }) which satisfies the usual condition. If supt E[(X(t))+ ] < ∞, then X(t) converges as t → ∞ almost surely.

4:36:24, subject to the Cambridge Core terms of use, .002

1.5 Continuous Time Martingale

31

(2) A non-negative supermartingale defined on a probability space satisfying the usual condition converges almost surely as t → ∞. The following theorem is used in various situations. Theorem 1.5.16 For an {Ft }-martingale {X(t)}t0 defined on a probability space (Ω, F , P, {Ft }) satisfying the usual condition, the following three conditions are equivalent: (i) M is closable, (ii) {M(t)}t0 is uniformly integrable, (iii) limt↑∞ M(t) exists in L1 . Moreover, if supt E[|M(t)| p ] < ∞ for some p > 1, this is the case and M(t) converges in L p as t → ∞. Remark 1.5.17 When p = 1, the condition supt E[|M(t)|] < ∞ is not sufficient for the last assertion of the theorem to hold. For example, let {B(t)}t0 be a Brownian motion starting from 0 and set M(t) = exp(B(t)− 2t ). Then E[M(t)] = 1, but M(t) → 0 (t → 0). Proof First we show (ii) from (i). Set Γα (t) = {|M(t)| > α} for α > 0. Then      E[|M(t)|1Γα (t) ] = E E M∞ Ft 1Γα (t)       E E |M∞ |Ft 1Γα (t) = E[|M∞ |1Γα (t) ]. By Chebyshev’s inequality, we obtain 1 E[|M∞ |], α and the uniform estimates in t for P(Γα (t)) and E[|M∞ |1Γα (t) ]. Next, assume (ii). Then, since supt E[(M(t))+ ] < ∞ by the assumption, M(t) converges as t → ∞ almost surely by Theorem 1.5.15. Hence, by the assumption again, M(t) converges in L1 . Finally we show (i) from (iii). Set limt→∞ M(t) = M(∞). Then       E  M(t) − E M(∞)|Ft  = E E M(t + s) − M(∞)|Ft      E E |M(t + s) − M(∞)| |Ft = E[|M(t + s) − M(∞)|] P(Γα (t)) 

for s > 0. Since the right hand side converges to 0 as s → ∞, M is closable. When supt E[|M(t)| p ] < ∞, Doob’s inequality implies supt |M(t)| ∈ L p .  Hence {|M(t)| p }t0 is uniformly integrable.

4:36:24, subject to the Cambridge Core terms of use, .002

32

Fundamentals of Continuous Stochastic Processes

1.5.4 Applications We give applications of the optional sampling theorem and Doob’s inequality. First we show, by using the optional sampling theorem, that non-trivial martingales do not have bounded variation. Theorem 1.5.18 If a continuous martingale M = {M(t)}0tT (T > 0) has bounded variation, then M is a constant. Proof We may assume M(0) = 0. For a partition Δ = {0 = t0 < t1 < · · · < tn = T } of [0, T ], set VΔM (T ) =

n

|M(ti ) − M(ti−1 )|

i=1

and denote the supremum with respect to the partition (total variation) by V M (T ) = supΔ VΔM (T ). By the assumption, we have P(V M (T ) < ∞) = 1. Similarly, let V M (s) be the total variation of M in [0, s] and set τK = inf{s  0; V M (s)  K} for K > 0. Then, τK is a stopping time and M K = {M K (t) = M(t ∧ τK )}t0 is also a martingale. Hence, n 

E[(M K (T ))2 ] = E



{M K (ti ) − M K (ti−1 )}2

i=1

 E max{|M K (ti ) − M K (ti−1 )|} i

n

|M K (ti ) − M K (ti−1 )|

i=1

   KE max{|M K (ti ) − M K (ti−1 )|} . i

Since |M K (t)|  K, the right hand side converges to 0 by the bounded convergence theorem as |Δ| = max |ti − ti−1 | → 0. Thus M K (T ) = 0, P-a.s. Since K > 0 is arbitrary, M(T ) = 0, P-a.s.  Next we give applications to Brownian motions. Theorem 1.5.19 Let (W, B(W), μ) be the d-dimensional Wiener space and θ = {θ(t)}t0 be the coordinate process. Then, for T > 0 and 0 < ε < 1, $% " # d ε max |θ(t)|2  e(1 − ε)− 2 . E exp 2T 0tT

4:36:24, subject to the Cambridge Core terms of use, .002

33

1.5 Continuous Time Martingale In particular, for any λ > 0,   d ελ2 μ max |θ(t)| > λ  e(1 − ε)− 2 e− 2T . 0tT

Proof The second assertion follows from the first one by Chebyshev’s inequality. Hence we only show the first one. For y ∈ Rd , consider the 2 martingale given by {exp( y, θ(t) − |y|2 t )}t0 . By Doob’s inequality, for p > 1,   p|y|2 T |y|2 t E max e p y,θ(t)  e 2 E max e p( y,θ(t) − 2 ) 0tT 0tT  p  p p2 |y|2 T  p p p|y|2 T |y|2 T E[e p( y,θ(T ) − 2 ) ] = e 2 . e 2 p−1 p−1 Replace y by py to have   p  p |y|2 T E max e y,θ(t)  e 2 . 0tT p−1 |y|2 s 2

!

e y,z g(s, 0, z) dz,

  ε  ε|θ(t)|2 E max e 2T = E max e θ(t),z g , 0, z dz 0tT 0tT Rd T  p p   p p ε |z|2 T d  e 2 g , 0, z dz = (1 − ε)− 2 . p−1 T p−1 Rd

Since e

=

Rd

Letting p → ∞, we obtain the assertion.



Theorem 1.5.20 Let (W, B(W), μ) be the one-dimensional Wiener space and θ = {θ(t)}t0 be the coordinate process, and set τ−a = inf{t > 0; θ(t) = −a},

τb = inf{t > 0; θ(t) = b},

τ = τ−a ∧ τb

for a, b > 0. Then, μ(τ = τ−a ) =

b a+b

and

E[τ] = ab.

Proof Since {θ(t ∧τ)}t0 is a martingale by the optional stopping time, E[θ(t ∧ τ)] = 0. Since −a  θ(t ∧ τ)  b, by the bounded convergence theorem, we obtain, by letting t → ∞, 0 = E[θ(τ)] = −aμ(τ = τ−a ) + bμ(τ = τb ). Combine this identity with μ(τ = τ−a ) + μ(τ = τb ) = 1. Then the first assertion follows.

4:36:24, subject to the Cambridge Core terms of use, .002

34

Fundamentals of Continuous Stochastic Processes

For the second identity, note that {θ(t ∧ τ)2 − (t ∧ τ)}t0 is a martingale. Then E[θ(t ∧ τ)2 ] = E[t ∧ τ]. By the bounded convergence theorem applied to the left hand side and the monotone convergence theorem to the right hand side, we obtain E[τ] = E[θ(τ)2 ] = (−a)2 μ(τ = τ−a ) + b2 μ(τ = τb ). In conjunction with the first identity, the assertion follows.



1.5.5 Doob–Meyer Decomposition, Quadratic Variation Process As Doob’s decomposition for discrete time submartingales (Theorem 1.4.11), a submartingale is decomposed into a sum of a martingale and an increasing process, which is called the Doob–Meyer decomposition. From this the quadratic variation processes of martingales are defined. Let (Ω, F , P, {Ft }) be a probability space which satisfies the usual condition. Denote by S the set of {Ft }-stopping times and by Sa (a > 0) the subset of S which consists of elements τ satisfying τ  a almost surely. Definition 1.5.21 An {Ft }-adapted stochastic process X is said to be of class (D) if the family of random variables X(τ)1{τ 0, X is said to be of class (DL). Definition 1.5.22 A stochastic process A = {A(t)}t0 is called an increasing process if (i) A is {Ft }-adapted, (ii) A0 = 0, and t → A(t) is right-continuous and non-decreasing, (iii) for all t > 0, E[A(t)] < ∞. Definition 1.5.23 An increasing process A = {A(t)}t0 is said to be natural if, for any bounded martingale M = {M(t)}t0 ,  t  t E M(s) dA(s) = E M(s−) dA(s) (t > 0), 0

0

where M(s−) = limt↑s M(t).

4:36:24, subject to the Cambridge Core terms of use, .002

1.5 Continuous Time Martingale

35

It is known that, if an increasing process A is continuous, then A is natural. Moreover, an increasing process is natural if and only if it is predictable (see Definition 2.2.1). We omit the details and refer the reader to [13, 45]. Proposition 1.5.24 All martingales and non-negative submartingales are of class (DL). Proof We show the result for martingales. A similar argument is applicable to submartingales. If M = {M(t)}t0 is a martingale, {|Mt |}t0 is a submartingale and, by the optional sampling theorem (Theorem 1.5.11), E[|M(σ)|1{|M(σ)|λ} ]  E[|M(a)|1{|M(σ)|λ} ] for σ ∈ Sa and λ > 0. Moreover, we have sup P(|M(σ)|  λ)  sup λ−1 E[|M(σ)|]  λ−1 E[|M(a)|],

σ∈Sa

σ∈Sa

which implies the uniform integrability lim sup E[|M(σ)|1{|M(σ)|λ} ] = 0.

λ→∞ σ∈Sa



The following is easily obtained. Corollary 1.5.25 Let M be a martingale and A be an increasing process. Then the stochastic process {X(t)}t0 given by X(t) = M(t) + A(t) is of class (DL). The following theorem gives the converse of this corollary. Theorem 1.5.26 (Doob–Meyer decomposition) For a submartingale X of class (DL), there exist a martingale M = {M(t)}t0 and an increasing process A = {A(t)}t0 such that X(t) = M(t) + A(t). We can find A from the natural ones and, in this case, the decomposition is unique. Proof We give only an outline of the proof. For details, see [45, 61, 95]. First we fix T > 0 and set Y(t) = X(t) − E[X(T )|Ft ]

(0  t  T ).

Since E[Y(t)|F s ] = E[X(t)|F s ] − E[E[X(T )|Ft ]|F s ]  X(s) − E[X(T )|F s ] = Y(s) for s < t, {Y(t)}0tT is a submartingale and Y(T ) = 0.

4:36:24, subject to the Cambridge Core terms of use, .002

36

Fundamentals of Continuous Stochastic Processes

−n and let Δn be a partition of [0, T ] given by 0 = t0(n) < Second, set t(n) j = jT 2

t1(n) < · · · < t2(n)n = T . Moreover, set A(n) (tk(n) ) =

k

(n) (n) {E[Y(t(n) j )|Ft j−1 ] − Y(t j−1 )}

(k = 1, 2, . . . , 2n ).

j=1

Then {A(n) (T )}∞ n=1 is uniformly integrable. Hence, there exist a subsequence and an integrable random variable A(T ) such that A(n ) (T ) converges {n }∞ =1 weakly to A(T ) in L1 . That is, for any bounded random variable U, lim E[A(n ) (T )U] = E[A(T )U].

→∞

Now define A = {A(t)}0tT by A(t) = Y(t) + E[A(T )|Ft ]. Then A is a natural increasing process and X(t) = E[X(T ) − A(T )|Ft ] + A(t), 

which is the desired decomposition.

Corollary 1.5.27 (1) Let M be a square-integrable {Ft }-martingale with M(0) = 0, P-a.s. Then, there exists a natural increasing process A = {A(t)}t0 such that {M(t)2 − A(t)}t0 is an {Ft }-martingale. (2) Let M, N be square-integrable {Ft }-martingales with M(0) = N(0) = 0, P-a.s. Then there exists a stochastic process A = {A(t)} which is represented as a difference of two natural increasing processes such that {M(t)N(t) − A(t)}t0 is an {Ft }-martingale. Proof Since {M(t)2 }t0 is a submartingale, we can directly apply the Doob– Meyer decomposition to obtain (1). (2) is obtained by polarization.  Definition 1.5.28 The process A in Corollary 1.5.27 (1) and (2) will be denoted by M = { M (t)}t0 and M, N = { M, N (t)}t0 , respectively. M is called the quadratic variation process of M.

M, N is linear in M and N, and M, M = M . Definition 1.5.29 The set of square-integrable {Ft }-martingales defined on (Ω, F , P, {Ft }) is denoted by M 2 ({Ft }) or M 2 : & ' M is an {Ft }-martingale and 2 M = M = {M(t)}t0 ; . satisfies E[M(t)2 ] < ∞ for any t  0.

4:36:24, subject to the Cambridge Core terms of use, .002

37

1.6 Adapted Brownian Motion

The set of continuous elements in M 2 is denoted by Mc2 , and the set of the restrictions to the time interval [0, T ] of elements in M 2 and Mc2 are denoted by M 2 (T ) and Mc2 (T ), respectively. Lemma 1.5.30 (1) Identify two elements in M 2 (T ) or Mc2 (T ) if one is a mod2

2

ification of the other. Then the set of equivalence classes M (T ) and M c (T ) are Hilbert spaces with the inner product

M, M  T := E[MT MT ]. Moreover, 2

2

M c (T ) is a closed subspace of M (T ). 2

(2) M and

2 Mc

are complete metric spaces with metric d(M, N) :=



2−n min{M − Nn , 1},

n=1

where Mn =



M, M n .

The lemma is a straightforward consequence of Doob’s inequality.

1.6 Adapted Brownian Motion Let (Ω, F , P, {Ft }) be a filtered probability space. Definition 1.6.1 Let T be [0, T ] (T > 0) or [0, ∞). A stochastic process B = {B(t)}t∈T defined on (Ω, F , P, {Ft }) is called a d-dimensional {Ft }-Brownian motion if the following two conditions are satisfied: (i) B is an {Ft }-adapted Rd -valued continuous stochastic process, (ii) for any 0  s  t, λ ∈ Rd , E[ei λ,B(t)−B(s) |F s ] = e−

|λ|2 (t−s) 2

.

Moreover, if P(B(0) = x) = 1 for x ∈ Rd , we call B a d-dimensional {Ft }Brownian motion starting from x. Let B0 = {B0 (t)}t0 be the continuous stochastic process given by B0 (t) = B(t)− B(0) for a d-dimensional {Ft }-Brownian motion B = {B(t)}t0 . The probability measure on the path space W induced by B0 is the Wiener measure μ. In particular, for any 0 = t0 < t1 < · · · < tn , B(t1 ) − B(t0 ), . . . , B(tn ) − B(tn−1 ) are independent and the distribution of B(ti )−B(ti−1 ) is the d-dimensional Gaussian distribution with mean 0 and covariance matrix (ti − ti−1 )I.

4:36:24, subject to the Cambridge Core terms of use, .002

38

Fundamentals of Continuous Stochastic Processes

Conversely, let {Bt0 } be a filtration on the d-dimensional Wiener space (W, B(W), μ) on [0, ∞) given by Bt0 = σ({θ(s), s  t})

(t  0),

where {θ(t)}t0 is the coordinate process on W (see (1.3.1)). Then {θ(t)}t0 is a {Bt0 }-Brownian motion starting from 0. We mention several properties of the paths of Brownian motions. Since the components of a general dimensional Brownian motion are one-dimensional Brownian motions and are independent, we only consider the one-dimensional case. Let {B(t)}t0 be a one-dimensional {Ft }-Brownian motion on [0, ∞) starting from 0 defined on (Ω, F , P, {Ft }). The next proposition is shown by an elementary computation. Proposition 1.6.2 Let s < t. Then, for n = 1, 2, . . . , E[|B(t) − B(s)|2n ] = (2n − 1)(2n − 3) · · · 3 · 1(t − s)n =

(2n)! (t − s)n . 2n · n!

In stochastic analysis, the following property on the variations of Brownian motion is fundamental. Theorem 1.6.3 (1) For T > 0 let Δn = {0 = t0n < · · · < tmn n = T } be a sequence of partitions of [0, T ] which satisfies Δn ⊂ Δn+1 (n = 1, 2, . . .) and |Δn | := max(tnj − tnj−1 ) → 0 (n → ∞). Then mn   |B(tnj ) − B(tnj−1 )| = ∞ = 1. P lim n→∞

j=1

(2) For a sequence {Δn } of partitions in (1), mn 

lim E

n→∞

|B(tnj ) − B(tnj−1 )|2 − T

2

= 0.

j=1

Proof While (1) can be shown from (2) by contradiction, we give a direct  n |B(tnj ) − B(tnj−1 )|. Then, since Δn ⊂ Δn+1 , {Vn }∞ proof. Set Vn = mj=1 n=1 is nondecreasing and has a limit V as n → ∞, admitting V = ∞. Hence, by the bounded convergence theorem, one has mn  e−|x| g(tnj − tnj−1 , 0, x) dx. E[e−V ] = lim E[e−Vn ] = lim n→∞

n→∞

j=1

R

4:36:24, subject to the Cambridge Core terms of use, .002

39

1.6 Adapted Brownian Motion By the inequality e−x  1 − x + x2 (x > 0), (

∞ 1 − x2 1  2s s x2  − x2 −|x| + e √ e 2s dx  2 1−x+ e 2s dx = 1 − √ 2 π 2 0 R 2πs 2πs 2

and −V

E[e ]  lim sup n→∞

mn

(

mn 

1−

j=1

2 n n 1 1 n n (t − t j−1 ) 2 + (t j − t j−1 ) . π j 2

Since j=1 (tnj − tnj−1 ) → ∞ (n → ∞), E[e−V ] = 0 and V = ∞, P-a.s. (2) Set Dn, j = |B(tnj ) − B(tnj−1 )|2 − (tnj − tnj−1 ) ( j = 1, 2, . . . , mn ). For each n, n is an independent sequence of random variables. Dn, j has mean 0 and {Dn, j }mj=1 variance 2(tnj − tnj−1 )2 by Proposition 1.6.2. Hence, 1 2

mn 

E

|B(tnj )



B(tnj−1 )|2

−T

2

=E

j=1

=

2 Dn, j

j=1 mn

E[(Dn, j )2 ] + 2

mn

mn

E[Dn,i Dn, j ]

i< j

j=1

=

mn 

2(tnj − tnj−1 )2  2|Δn |

j=1

mn

(tnj − tnj−1 ) → 0.



j=1

Theorem 1.6.3 shows that the paths of Brownian motions are not smooth. In fact, the following is well known. Theorem 1.6.4 Almost every path of Brownian motions is nowhere differentiable. On the continuity and uniform continuity of paths, the following is known. Theorem 1.6.5 (Khinchin’s law of iterated logarithm) Let B be a Brownian motion with B(0) = 0. Then, almost surely, B(t) =1 lim sup  t↓0 2t log log( 1t )

and

lim inf  t↓0

B(t) 2t log log( 1t )

= −1.

Theorem 1.6.6 (Lévy’s modulus of continuity) For any T > 0  |B(t) − B(s)| √  = T = 1. P lim sup sup  1 0s 0 there exists g ∈ G ∩ HT of C 2 -class such that J(g)  J(h) + η.

(1.8.3)

Since G is an open set, we can choose δ > 0 such that B(g, δ) := {w ∈ WT ; w − g < δ} ⊂ G, where w = maxt∈[0,T ] |w(t)|. Then we have μεT (G)  μεT (B(g, δ)) = μT (ε 2 (w − ε− 2 g) ∈ B(0, δ)). 1

1

By the Cameron–Martin theorem (Theorem 1.7.2),

  1 √ 1 μεT (G)  1B(0,δ) ( εw) exp − √ I (g)(w) − J(g) μT (dw) ε ε W

T   1 1  1B(0,δ) (w) exp − I (g)(w) − J(g) μεT (dw) ε ε WT

4:36:24, subject to the Cambridge Core terms of use, .002

45

1.8 Schilder’s Theorem

=

exp B(0,δ)

1 ε

T 0

 1 1

w(t), g¨ (t) dt − w(T ), g˙ (T ) − J(g) μεT (dw). ε ε

Since |w(t)| < δ (t ∈ [0, T ]) for any w ∈ B(0, δ), in conjunction with (1.8.3), this yields 1 1 1 log(μεT (G))  log(μεT (B(0, δ))) − δT ¨g − δ˙g − (J(h) + η). ε ε ε Hence, lim inf ε log μεT (G)  −δT ¨g − δ˙g − J(h) − η ε↓0

because με (B(0, δ)) → 1 as ε ↓ 0. Letting η, δ → 0, we arrive at (1.8.1). (2) First let δ > 0 and set  F (δ) = B(w, δ). w∈F

Using the piecewise linear approximation m (w) of w ∈ WT given in Example 1.2.7, we have μεT (F)  μεT ({w; m (w) ∈ F (δ) }) + μεT ({w;  m (w) − w  δ}). Set αδ = inf J(w)  0. w∈F (δ)

We shall complete the proof of the upper estimate (1.8.2) by showing the following three assertions. (I) For any δ > 0 lim sup ε log μεT ({w; m (w) ∈ F (δ) })  −αδ . ε↓0

(II) For m = 1, 2, . . . lim sup ε log μεT ({w;  m (w) − w  δ})  − ε↓0

2m−4 δ2 . T

(III) As δ ↓ 0, αδ → inf w∈F J(w). From (I) and (II), we obtain 2m−4 δ2 lim sup ε log μεT (F)  − min αδ , = −αδ T ε↓0 for large m. This, combined with (III), implies the upper estimate. We prove the three claims in order. By the definition of αδ , μεT ({w; m (w) ∈ F (δ) })  μεT ({w; J( m (w))  αδ }).

4:36:24, subject to the Cambridge Core terms of use, .002

46

Fundamentals of Continuous Stochastic Processes

Using the same notations as in Example 1.2.7, we have 1 (I (hij )(w))2 . 2 i=1 j=0 d 2m −1

J( m (w)) =

Since I (hij ) (i = 1, . . . , d, j = 0, . . . , 2m −1) are independent and obey the standard normal distribution under the Wiener measure, 2J( m (w)) is a χ2 random variable with d2m degrees of freedom. Therefore, μεT ({w; J( m (w))  αδ }) = μT ({w; εJ( m (w))  αδ })

∞ x 1 m−1 = xd2 −1 e− 2 dx. m−1 d2 m−1 Γ(d2 ) 2α δ 2 ε

By an elementary inequality



x x p−1 e− 2 dx =



x p+1 e− 2 x

dx x2

 ∞ dx  x  exp sup − + (p + 1) log x 2 x2 xa a   a 1  exp − + (p + 1) log a a 2

a

a



for a, p ∈ R with a  2(p + 1) (p > 0), we obtain (I) by setting a = In order to show (II), note that   μεT w;  m (w) − w  δ 

m 2 −1

j=0

 μεT w; 

2αδ ε .



jT 2m

max

t ( j+1)T 2m

| m (w)(t) − w(t)|  δ 

 2m μεT w; max | 0 (w)(t) − w(t)|  δ 0t 2Tm

and, if max0t 2Tm |w(t)|  2c , max | 0 (w)(t) − w(t)|  max | 0 (w)(t)| + max |w(t)| < c.

0t 2Tm

0t 2Tm

0t 2Tm

Then, by using Theorem 1.5.19, we obtain  μεT ({w;  m (w) − w  δ})  2m μT w; max |w(t)|  0t 2Tm

 2m δ2  d  e 2m+ 2 exp − . 16εT

δ  √ 2 ε

This implies the assertion (II).

4:36:24, subject to the Cambridge Core terms of use, .002

1.8 Schilder’s Theorem

47

Finally, in order to show (III), we note that δ → αδ is non-increasing and limδ→0 αδ exists, admitting that it is ∞. When the limit is ∞, αδ  inf J(w)

(1.8.4)

w∈F

implies (III) in the sense that both sides are ∞. Assume that limδ→0 αδ is finite. Then, α 1n  limδ→0 αδ (n = 1, 2, . . .) and we 1

can choose fn ∈ F ( n ) such that 1 α 1n  J( fn )  α 1n + . n Since { fn } is a bounded set in the Hilbert space HT , we may assume that { fn } converges weakly to f , taking a subsequence if necessary. Then J( f )  lim inf J( fn ) = lim αδ . n→∞

δ→0

In general, for h ∈ HT and s, t ∈ [0, T ],  t 2  ˙ du  |h(t) − h(s)|2 =  h(u) 

T

 ˙ 2 du |t − s| |h(u)|

0

s

and |h(t) − h(s)|  hHT |t − s|. Hence, { fn } is equi-continuous and has a subsequence { fn j } which is convergent in WT by the Ascoli–Arzelà theorem. Let f¯ be the limit. Then, since F is a closed set, f¯ ∈ F and ( f¯) = lim ( fn j ) = lim , fn j = , f = ( f ) j→∞

j→∞

for any ∈ WT∗ . Since WT∗ is dense in HT , f = f¯ and inf J(w)  J( f ) = lim αδ .

w∈F

δ→0



This, combined with (1.8.4), implies (III).

Schilder’s theorem is one of the most important results in the theory of large deviations. The next theorem, due to Donsker and Varadhan, the originators of the theory, shows that the Laplace method, which is fundamental in the theory of asymptotics of the integrals on finite dimensional spaces, also works on the Wiener space with suitable modifications. Theorem 1.8.2 Let F be a bounded continuous function on the Wiener space (WT , B(WT ), μT ). Then,   F(w) lim ε log e ε μεT (dw) = sup {F(w) − J(w)}. ε↓0

WT

w∈WT

4:36:24, subject to the Cambridge Core terms of use, .002

48

Fundamentals of Continuous Stochastic Processes

Proof Let En,k (k ∈ Z, n = 1, 2, . . .) be a sequence of the closed sets in WT defined by k + 1 k En,k = w ∈ WT ;  F(w)  . n n Then

F(w) ε

e WT

μεT (dw) 



e nε μεT (En,k ). k+1

k

Since F is bounded, the sum on the right hand side is a finite one. Hence, by Schilder’s theorem,    k + 1 F(w) lim sup ε log − inf J(w) e ε μεT (dw)  max w∈En,k k n ε↓0 WT  1  1  k  − J(w) +  max sup (F(w) − J(w)) +  max sup k k n n w∈En,k n w∈En,k 1 = sup (F(w) − J(w)) + . n w∈WT Since n is arbitrary, we obtain the following estimate from above:   F(w) e ε μεT (dw)  sup (F(w) − J(w)). lim sup ε log ε↓0

w∈WT

WT

In order to show the opposite inequality, fix δ > 0 and choose wδ ∈ WT such that F(wδ ) − J(wδ )  sup (F(w) − J(w)) − δ.

(1.8.5)

w∈WT

Since F is continuous, there exists an open set G of WT such that wδ ∈ G and F(w)  F(wδ ) − δ for all w ∈ G. We now apply Schilder’s theorem and get     F(w) F(w) lim inf ε log e ε μεT (dw)  lim sup ε log e ε μεT (dw) ε↓0

ε↓0

WT

G

 F(wδ ) − δ − inf J(w). w∈G

Hence, by (1.8.5),  lim inf ε log ε↓0

e

F(w) ε

WT

 μεT (dw)  F(wδ ) − δ − J(wδ )  sup {F(w) − J(w)} − 2δ. w∈WT

Letting δ → 0, we obtain  lim inf ε log ε↓0

WT

e

F(w) ε



μεT (dw)

 sup {F(w) − J(w)}.



w∈WT

4:36:24, subject to the Cambridge Core terms of use, .002

49

1.9 Analogy to Path Integrals

1.9 Analogy to Path Integrals In this section, by formal considerations, we show a relationship between the Wiener integral and the Feynman path integral. The one-dimensional Wiener measure on [0, T ] is a probability measure on the path space WT such that μT ((w(t1 ), . . . , w(tn )) ∈ A)

=

··· A

 2 i−1 ) exp − 12 ni=1 (xtii−x −ti−1 1 dx1 dx2 · · · dxn n √ (2π) 2 t1 (t2 − t1 ) · · · (tn − tn−1 )

(1.9.1)

for 0 = t0 < t1 < · · · < tn < T, A ⊂ Rn . We consider the limit as n → ∞. Setting xi = w(ti ) for w ∈ WT , we conclude n (xi − xi−1 )2 i=1

ti − ti−1

=

n  xi − xi−1 2

ti − ti−1

T dw(s) 2

(ti − ti−1 )

i=1



0

ds

1 12 ds = 11w11H , T

where w should be assumed to be in the Cameron–Martin subspace and wHT is the norm. If, as a limit of dx1 · · · dxn , there exists a translation invariant measure D(dw) on the path space R[0,T ] ,

 1 T  dw(s) 2  1 exp − ds D(dw) μ(C) = 2 0 ds C Z

1 − w2H e 2 D(dw) = (1.9.2) C Z for a subset C of WT as a limit of (1.9.1). Here Z is the fictitious normalizing constant so that the total mass of WT is 1. However, the path of the Brownian motion is nowhere differentiable almost surely and the expression on the exponent has no meaning. Moreover, there is no translation invariant measure on R[0,T ] like the Lebesgue measure. Nevertheless, the Wiener measure on the left hand side of (1.9.2) exists in the mathematical sense and it is an “identity” which shows in a heuristic way that the Wiener measure is a Gaussian measure and that the integral with respect to it is quite similar to the path integral. Furthermore, the Cameron–Martin theorem and the Schilder theorem may be deduced from (1.9.2) in a formal way. We formally deduce the Cameron–Martin formula in the following way. When h is fixed, the distribution μT,h of w + h under μT is characterized by the identity

4:36:24, subject to the Cambridge Core terms of use, .002

50

Fundamentals of Continuous Stochastic Processes

EμT,h [F] =



F(w) μT,h (dw) =

WT

F(w + h) μT (dw) WT

for any function F on WT . Hence, noting the translation invariance of the “measure” D(dw) and using (1.9.2), we obtain

 1 T  dw 2  1 F(w + h) exp − EμT,h [F] = ds D(dw) Z 2 0 ds

 1 T  d(w − h) 2  1 F(w) exp − = ds D(dw) Z 2 0 ds

 1 1 11 112 1 T  dw 2  1 1 = F(w) exp w, h HT − h H − ds D(dw) T Z 2 2 0 ds and, writing w, h HT = I (h)(w), EμT,h [F] = EμT [FeI (h)− 2 hHT ]. 1

2

This leads us to the following Cameron–Martin formula:  1 1 12  μT,h (dw) = exp I (h)(w) − 11h11H μT (dw). T 2 The Cameron–Martin theorem also suggests the integration by parts formula on the path space. We have shown in the proof of the Cameron–Martin theorem that, if F ∈ L p (WT ), p > 1, h ∈ HT , then EμT [F ◦ T h ] = E μT [FRh ],

(Rh (w) = eI (h)− 2 hHT ), 1

2

where T h : WT w → w + h. Replacing h with sh and differentiating in s, we obtain  d μT E [F( · + sh)] = EμT [FI (h)]. s=0 ds Writing the “derivative” of F in the direction of h by Dh F, we have the integration by parts formula EμT [Dh F] = EμT [FI (h)]. In fact, we formulate such an integration by parts formula in the context of the Malliavin calculus in Chapter 5. The formula will play a fundamental role. It should be noted that an integration by parts formula on a finite dimensional Euclidean space with respect to the Gaussian measure is easily obtained. For example, for a standard normal random variable G, we have

1 2 1 E[ϕ (G)] = ϕ (t) √ e− 2 t dt R 2π

1 − 1 t2 = ϕ(t)t √ e 2 dt = E[ϕ(G)G]. R 2π

4:36:24, subject to the Cambridge Core terms of use, .002

1.9 Analogy to Path Integrals

51

This implies the identity E[ϕ (G)] = E[ϕ(G)G] for any ϕ ∈ C01 (R). Next we consider the measure μεT in Schilder’s theorem (Theorem 1.8.1). We have

√ 1 − 12 w2H T D(dw) e μεT (C) = μT ({w; εw ∈ C}) = √ ε−1 C Z for a subset C of the Wiener space and, changing the variable as in the integrals on the Euclidean spaces,

1 − 1 dim(WT ) − 2ε1 w2H T D(dw). ε 2 e μεT (C) = C Z If we could apply the Laplace method to the integral on the right hand side, we might say that the main term of the asymptotic behavior of μεT (C) as ε → 0 1 12 1 12 !T 2 would be determined by inf w∈C { 12 11w11H }. Regarding 11w11H = 0 ( dw ds ) ds as T T the action integral of the path w, we see that the principle of least action works.

4:36:24, subject to the Cambridge Core terms of use, .002

2 Stochastic Integrals and Itô’s Formula

In this chapter the stochastic integral, which is fundamental in the theory of stochastic analysis, is defined and Itô’s formula, the associated chain rule, is shown. Moreover, their applications and some related topics are presented. In the rest of this book, we sometimes say simply “X = Y” for “X = Y almost surely” for random variables X, Y.

2.1 Local Martingale Itô originated stochastic integrals with respect to Brownian motion ([48, 49]). Doob noticed their close relation to martingales as soon as he learned stochastic integrals ([15, Chapter 6]). Nowadays the stochastic integrals are usually formulated in a sophisticated manner developed by Kunita and Watanabe [65] and, also in this book, we follow this way. To have a wide extent of applications, it is necessary to consider stochastic integrals on the space of local martingales. This section is devoted to the explanation of local martingales. In the rest of this book, we assume that (Ω, F , P, {Ft }) is a filtered probability space which satisfies the usual condition, if not otherwise defined. Definition 2.1.1 (1) A right-continuous {Ft }-adapted stochastic process M = {M(t)}t0 defined on (Ω, F , P, {Ft }) is called an {Ft }-local martingale if there exists a sequence {σn }∞ n=1 of {Ft }-stopping times satisfying * + P(σn  σn+1 ) = 1 and P lim σn = ∞ = 1 (2.1.1) n→∞

σn

such that each stochastic process M = {M(t ∧ σn )}t0 is a martingale. (2) If a sequence {σn }∞ n=1 of stopping times satisfying (2.1.1) can be chosen so that M σn ∈ M 2 , that is, E[M(t ∧ σn )2 ] < ∞ for any t  0, M is called a 52 4:36:26, subject to the Cambridge Core terms of use, .003

2.1 Local Martingale

53

locally square-integrable martingale. The space of locally square-integrable 2 2 and the set of continuous elements in Mloc is martingales is denoted by Mloc 2 denoted by Mc,loc . When the filtration under consideration is clear, the notation {Ft } is omitted. A martingale is a local martingale, but the converse is not true, which causes problems in various situations. The following is important in applications. Proposition 2.1.2 Any non-negative local martingale X = {X(t)}t0 is a supermartingale. Moreover, if E[X(t)] = E[X(0)] for any t  0, then X is a martingale. Proof Let {σn }∞ n=1 be an increasing sequence of stopping times such that {X(t ∧ σn )}t0 is a martingale. By Fatou’s lemma (Theorem 1.4.1(8)) for conditional expectations, for any s < t    E[X(t)|F s ] = E lim inf X(t ∧ σn )F s  lim inf E[X(t ∧ σn )|F s ] n→∞

n→∞

= lim inf X(s ∧ σn ) = X(s). n→∞

The second assertion follows from the fact that, if E[N(t)] = E[N(0)] holds for a supermartingale {N(t)}t0 , then it is a martingale.  Also, for a continuous locally square-integrable martingale M, the quadratic variation process is important. The following stochastic process M , which is denoted by the same notation as for martingales, is called the quadratic variation process of M. 2 . Theorem 2.1.3 Let M, N ∈ Mc,loc (1) There exists a unique continuous increasing process M = { M (t)}t0 2 such that {M(t)2 − M (t)}t0 ∈ Mc,loc . 2 (2) M belongs to Mc if and only if E[ M (t)] < ∞ for any t  0. Moreover, if M(0) = 0, then E[M(t)2 ] = E[ M (t)]. (3) There exists a unique continuous process M, N = { M, N (t)}t0 , which is the difference of two continuous increasing processes, such that {M(t)N(t) − 2 .

M, N (t)}t0 ∈ Mc,loc

Proof (1) By assumption there exists a sequence {σn }∞ n=1 of stopping times such that σn ↑ ∞ almost surely and M σn = {M(t ∧ σn )}t0 is a martingale. Set τn = inf{t; |M(t)|  n}. Then M (n) = M σn ∧τn is a bounded martingale. Hence, by

4:36:26, subject to the Cambridge Core terms of use, .003

54

Stochastic Integrals and Itô’s Formula

Corollary 1.5.27, there exists a unique natural continuous increasing process {A(n) (t)}t0 such that {M (n) (t)2 − A(n) (t)}t0 is a martingale. Since M (n+1) (t ∧ σn ∧ τn )2 − A(n+1) (t ∧ σn ∧ τn ) = M(t ∧ σn ∧ τn )2 − A(n+1) (t ∧ σn ∧ τn ) = M (n) (t)2 − A(n+1) (t ∧ σn ∧ τn ) and the left hand side defines a martingale, the uniqueness of the quadratic variation process implies A(n+1) (t ∧ σn ∧ τn ) = A(n) (t). Define the continuous increasing process { M (t)} by M (t) = limn→∞ A(n) (t). Then the stochastic process given by M (n) (t)2 − M (t ∧ σn ∧ τn ) = M(t ∧ σn ∧ τn )2 − A(n) (t) is a martingale. This means that {M(t)2 − M (t)}t0 is a local martingale. (2) Assume that M ∈ Mc2 . As was shown above, E[M (n) (t)2 − A(n) (t)] = 0 and   E[ M (t)] = E lim A(n) (t) n→∞

= lim E[A(n) (t)] = lim E[(M (n) (t))2 ] n→∞

n→∞

by the monotone convergence theorem. Since |M (n) (t)|  sup0st |M(s)|, by Doob’s inequality and the Lebesgue convergence theorem, the limit in the right hand side coincides with E[M(t)2 ]. Conversely, assume that E[ M (t)] < ∞ (t > 0). Then, since M (t)  A(n) (t), Fatou’s lemma implies   E[M(t)2 ] = E lim inf (M(t ∧ σn ∧ τn )2 ) n→∞

 lim inf E[A(n) (t)]  E[ M (t)]. n→∞

Hence M is square-integrable. (3) It suffices to set M, N (t) = 14 ( M + N (t) − M − N (t)).



2.2 Stochastic Integrals Let M = {M(t)}t0 be a continuous square-integrable martingale defined on a probability space (Ω, F , P, {Ft }). In this section stochastic integrals with respect to M and a continuous locally square-integrable martingale are described. Since M is not of bounded variation (Theorem 1.5.18), stochastic integrals are not defined as Lebesgue–Stieltjes integrals. We briefly recall the martingale transform in the discrete time case mentioned in Section 1.4. Let S = {S n }∞ n=0 be a martingale defined on a probability

4:36:26, subject to the Cambridge Core terms of use, .003

2.2 Stochastic Integrals

55

∞ space with filtration {Gn }∞ n=0 . If a bounded stochastic process H = {Hn }n=1 is ∞ predictable, the stochastic process X = {Xn }n=1 defined by

Xn =

n

Hk (S k − S k−1 )

k=1

is a {Gn }-martingale and X is called the martingale transform of H with respect to S . The predictability is a natural notion, for example, in mathematical finance. If we consider S as a stock price process and Hk as a share of the stock at time k, the stock at time k was bought at time k − 1 and Hk should be determined by the information up to time k − 1. Then, Xn represents the increment of the asset up to time n and is a martingale. Also, in the continuous time case, the property of predictability of the integrands of stochastic integrals is necessary. Definition 2.2.1 Let S be the smallest σ-field on [0, ∞) × Ω, under which all left-continuous {Ft }-adapted R-valued stochastic processes are measurable. A stochastic process X is called predictable if X is S -measurable, that is for any A ∈ B(R), X −1 (A) := {(t, ω); X(t, ω) ∈ A} ∈ S . For the smallest σ-field T , under which all right-continuous {Ft }-adapted stochastic processes are measurable, an T -measurable stochastic process is called well measurable. It is easily seen that a predictable stochastic process is {Ft }-adapted. Definition 2.2.2 The set of R-valued predictable stochastic processes Φ such that  T 12 Φ(s)2 d M (s) 0) [Φ]T := E 0

is denoted by L (M) or L . For Φ, Ψ ∈ L 2 , the distance [Φ−Ψ] is defined by 2

2

[Φ − Ψ] =

∞ 1 ([Φ − Ψ]n ∧ 1). 2n n=1

Remark 2.2.3 Two elements Φ, Φ of L 2 satisfying [Φ − Φ ] = 0 are identified. We begin with the stochastic integrals for stochastic processes like the step functions.

4:36:26, subject to the Cambridge Core terms of use, .003

56

Stochastic Integrals and Itô’s Formula

Definition 2.2.4 The set of measurable {Ft }-adapted stochastic processes Φ = {Φ(t)}t0 for which there exists an increasing sequence 0 = t0 < t1 < · · · < tn < · · · with tn → ∞ (n → ∞) and Ftn -measurable random variables ξn such that ⎧ ⎪ ⎪ ⎨Φ(0) = ξ0 (2.2.1) ⎪ ⎪ ⎩Φ(t) = ξn (tn < t  tn+1 , n = 0, 1, . . .) and sup

0nN, ω∈Ω

|ξn (ω)| < ∞

(N > 0)

is denoted by L 0 . Definition 2.2.5 The stochastic integral of Φ ∈ L 0 given by (2.2.1) with respect to M = {M(t)}t0 ∈ Mc2 is defined by

t

Φ(s) dM(s) =

0

n−1

Φ(tk )(M(tk+1 ) − M(tk )) + Φ(tn )(M(t) − M(tn ))

k=0

for tn  t  tn+1 and is also denoted by I M (Φ) = {I M (Φ)(t)}t0 or I(Φ). Remark 2.2.6 The sequence of the times {tn } which defines Φ ∈ L 0 is not unique, but the stochastic integral is independent of the choice of such sequence of times. Lemma 2.2.7 Let Φ, Ψ ∈ L 0 . (1) [linearity] For a, b ∈ R, aΦ + bΨ = {aΦ(t) + bΨ(t)}t0 ∈ L 0 and

t

t

t (aΦ(s) + bΨ(s)) dM(s) = a Φ(s) dM(s) + b Ψ(s) dM(s). 0

!t

0

0

(2) The stochastic process I(Φ) = { 0 Φ(s) dM(s)}t0 is continuous. (3) [isometry] For any t  0  t 2  t E Φ(s) dM(s) = E Φ(s)2 d M (s) . 0

0

(4) I(Φ) is a continuous square-integrable martingale and, for any T > 0, 2  t  T  Φ(s) dM(s)  4E Φ(s)2 d M (s) . (2.2.2) E sup 0tT

0

0

Proof The assertions (1) and (2) are direct conclusions of the definition of stochastic integrals.

4:36:26, subject to the Cambridge Core terms of use, .003

2.2 Stochastic Integrals

57

To prove (3), we may assume t = tN in (2.2.1). Then,

t N−1 Φ(s) dM(s) = Φ(tk )(M(tk+1 ) − M(tk )). 0

k=0

Since M is a martingale and Φ is adapted, we have   E Φ(tk )(M(tk+1 ) − M(tk )) · Φ(t )(M(t +1 ) − M(t ))   = E Φ(tk )(M(tk+1 ) − M(tk ))Φ(t )E[M(t +1 ) − M(t )|Ft ] = 0 for k < . Since E[(Mtk+1 − Mtk )2 |Ftk ] = E[ M tk+1 − M tk |Ftk ], we obtain N−1  t 2 E Φ(s) dM(s) = E[Φ(tk )2 (M(tk+1 ) − M(tk ))2 ] 0

=

k=0

N−1



t

E[Φ(tk )2 ( M (tk+1 ) − M (tk ))] = E

Φ(s)2 d M (s) .

0

k=0

Thus the assertion (3) follows. For a proof of (4), let s < t. We may assume s = tk < t = t in (2.2.1). Then,   −1  t  Φ(u) dM(u)F s = E Φ(t j )(M(t j+1 ) − M(t j ))Ftk E 0

=

j=0

k−1 j=0 s

Φ(t j )(M(t j+1 ) − M(t j )) +

j=k

=

−1   E Φ(t j )E[M(t j+1 ) − M(t j )|Ftk ]

Φ(u) dM(u)

0

by the tower property. Thus I(Φ) is an {Ft }-martingale. Its square-integrability has been shown in (3), and (2.2.2) is obtained by (3) and Doob’s inequality (Theorem 1.5.14).  Next, we define the stochastic integral of Φ ∈ L 2 with respect to a martingale. It may also be considered as an element of the space of square-integrable martingales. The next proposition is important. Proposition 2.2.8 The set L 0 is dense in L 2 with respect to the distance [ · ]. Proof For Φ ∈ L 2 and K, n ∈ N, define ΦK and ΦnK by ΦK (t, ω) = Φ(t, ω)1{|Φ(t)|K} (ω), ΦnK (t, ω) =

K2n k=−K2n

k 1 k K k+1 (t, ω). 2n { 2n Φ < 2n }

4:36:26, subject to the Cambridge Core terms of use, .003

58

Stochastic Integrals and Itô’s Formula

Then, ΦK ∈ L 2 , [ΦK − Φ] → 0 as K → ∞, { 2kn  ΦK < k+1 2n } ∈ S , and ΦnK → ΦK pointwise as n → ∞. 0 Let Φ be the set of Φ ∈ L 2 for which there exists K > 0 and {Φn }∞ n=1 ⊂ L such that |Φ(t, ω)|  K for any (t, ω) and [Φn − Φ] → 0. By virtue of the above observation, the proof is completed once we have shown that 1B ∈ Φ for any B ∈ S. To do this, set S  = {B; B is a subset of [0, ∞) × Ω such that 1B ∈ Φ}. S  is a Dynkin class (see Definition A.2.1). For left-continuous {Ft }-adapted stochastic processes Y1 , . . . , Yk and open  sets E1 , . . . , Ek in R, we show ki=1 Yi−1 (Ei ) ∈ S  . To do it, note 1ki=1 Y −1 (Ei ) (t, ω) =

k 

i

1Ei (Yi (t, ω)).

i=1

Let {φin } be an increasing sequence of non-negative bounded continuous functions on R which converges to 1Ei pointwise. Then, since k 

φin (Yi (t, ω)) ↑

i=1

k 

1Ei (Yi (t, ω))

i=1

and the left hand side is a left-continuous {Ft }-adapted stochastic process, 5k k −1 ) ∈ S . i=1 1Ei (Yi (t, ω)) is S -measurable. That is, i=1 Yi (E i The totality U of the subsets of [0, ∞)×Ω of the form ki=1 Yi−1 (Ei ) is closed under the finite number of intersections and σ[U ] = S by the definition. On the other hand, the smallest Dynkin class containing U is σ[U ] by the Dynkin class theorem (see, e.g., Theorem A.2.2 and [126, Chapter A1]). Hence we get S ⊂ S  since U ⊂ S  . This means that B ∈ S  for B ∈ S , that is,  1B ∈ Φ. We define the stochastic integral for Φ ∈ L 2 . By Proposition 2.2.8, let 0 {Φn }∞ n=1 ∈ L be a sequence such that [Φn −Φ] → 0 and consider the stochastic integral of Φn with respect to M ∈ Mc2 :

t I(Φn )(t) = Φn (s) dM(s). 0

Then, by Lemma 2.2.7 (3), E[(I(Φn )(T ) − I(Φm )(T ))2 ] = [Φn − Φm ]2T . 2 Thus {I(Φn )}∞ n=1 is a Cauchy sequence in the space M of square-integrable martingales. Since Mc2 is a closed subspace of M 2 , I(Φn ) converges to some

4:36:26, subject to the Cambridge Core terms of use, .003

2.2 Stochastic Integrals

59

element in Mc2 . It is clear that the limit X does not depend on the choice of the sequence {Φn }∞ n=1 and is determined only by Φ. Definition 2.2.9 X ∈ Mc2 determined by Φ as above is called the stochastic integral of Φ ∈!L 2 with respect to a martingale M ∈ Mc2 and also denoted by t I(Φ), I M (Φ) or 0 Φ(s) dM(s). Proposition 2.2.10 Let Φ, Ψ ∈ L 2 . (1) For a, b ∈ R and t > 0,

t

t

t (aΦ(s) + bΨ(s)) dM(s) = a Φ(s) dM(s) + b Ψ(s) dM(s). 0

(2) I(Φ) =

0

! t 0



Φ(u)dM(u)

t0

0

is a square-integrable martingale and

 2   t  t Φ(u) dM(u) F s = E Φ(u)2 d M (u)F s . E s

s

(3) For any s < t

t    t  t Φ(u) dM(u) Ψ(u) dM(u)F s = E Φ(u)Ψ(u) d M (u)F s . E s

s

s

For Φ, Ψ ∈ L 0 , we can prove Proposition 2.2.10 in the same way as Lemma 2.2.7. For general Φ ∈ L 2 , the assertions are shown by taking a sequence {Φn } ⊂ L 0 used to define I(Φ) and letting n → ∞. Stochastic integrals with respect to local martingales are given as limits of stochastic integrals with respect to martingales, which are stopped local martingales by stopping times. We also extend the space L 2 of integrands. 2 Definition 2.2.11 Let M ∈ Mc,loc . The set of R-valued predictable stochastic processes Φ = {Φ(t)}t0 such that

t Φ(s)2 d M (s) < ∞ (t > 0) 0 2 2 is denoted by Lloc (M) or Lloc . 2 2 For M ∈ Mc,loc and Φ ∈ Lloc (M), we define the stochastic integral of Φ with respect to M. First, consider a sequence {σn }∞ n=1 of {Ft }-stopping times !t 2 satisfying (2.1.1) and 0 Φ(s) d M (s)  n (t  σn ) and set

Mn = M σn ,

Φn (t) = Φ(t)1{σn t} .

4:36:26, subject to the Cambridge Core terms of use, .003

60

Stochastic Integrals and Itô’s Formula

Since the stochastic integral I Mn (Φn ) of Φn ∈ L 2 with respect to Mn ∈ M 2 satisfies I Mn (Φm )(t) = I Mn (Φn )(t ∧ σm )

(m < n),

2 there exists a unique I M (Φ) = {I M (Φ)(t)}t0 ∈ Mc,loc such that

I M (Φ)(t ∧ σn ) = I Mn (Φn )(t)

(t  0, n = 1, 2, . . .).

2 Definition 2.2.12 The stochastic process I M (Φ) defined above for M ∈ Mc,loc 2 and Φ ∈ Lloc (M) is called the stochastic integral of Φ with respect to M, !t which is also denoted by 0 Φ(s) dM(s).

It is easy to see that I M (Φ) does not depend on the choice of the sequence {σn }∞ n=1 of stopping times. Moreover, Proposition 2.2.10 extends to M, N ∈ 2 2 2 and Φ ∈ Lloc (M), Ψ ∈ Lloc (N). In particular, Mc,loc

t

I (Φ), N (t) = M

Φ(s) d M, N (s),

0

which characterizes the stochastic integrals as (locally) square-integrable martingales. 2 2 2 and Φ ∈ Lloc (M). If X ∈ Mc,loc satisfies Proposition 2.2.13 Let M ∈ Mc,loc

X, N (t) =

t

Φ(s) d M, N (s)

(t  0)

(2.2.3)

0 2 for any N ∈ Mc,loc , then X coincides with I M (Φ). 2 Proof For any N ∈ Mc,loc , I M (Φ) − X, N (t) = 0. In particular, letting N = M M  I (Φ) − X, we obtain E[ I (Φ) − X (t)] = 0. 2 2 2 and Φ ∈ Lloc (M) ∩ Lloc (N), then Proposition 2.2.14 (1) If M, N ∈ Mc,loc 2 Φ ∈ Lloc (M + N) and

t

t

t Φ(s) d(M + N)(s) = Φ(s) dM(s) + Φ(s) dN(s). 0

0

0

2 2 (2) If M ∈ Mc,loc and Φ, Ψ ∈ Lloc (M), then

t 0

(Φ + Ψ)(s) dM(s) = 0

t

t

Φ(s) dM(s) +

Ψ(s) dM(s).

0

4:36:26, subject to the Cambridge Core terms of use, .003

2.3 Itô’s Formula

61

2 2 2 and Φ ∈ Lloc (M). Set N = I M (Φ). Then, ΨΦ ∈ Lloc (M) for (3) Let M ∈ Mc,loc 2 Ψ ∈ Lloc (N) and

t

t (ΨΦ)(s) dM(s) = Ψ(s) dN(s). 0

0

We omit the proofs.

2.3 Itô’s Formula The main stochastic processes treated hereafter in this book are of the following form. Definition 2.3.1 Let X(0) be an F0 -measurable random variable, M be an 2 with M(0) = 0, and A = {A(t)}t0 be a continuous {Ft }element of Mc,loc adapted stochastic process such that A(0) = 0 and t → A(t) is of bounded variation on each finite interval. Then the continuous {Ft }-adapted stochastic process X = {X(t)}t0 defined by X(t) = X(0) + M(t) + A(t)

(2.3.1)

is called a continuous semimartingale. An RN -valued {Ft }-adapted continuous stochastic process X = {X(t) = (X 1 (t), . . . , X N (t))}t0 is an N-dimensional semimartingale if each component {X i (t)}t0 is a continuous semimartingale. The following stochastic process, called an Itô process, is a typical example of semimartingales. Definition 2.3.2 The set of R-valued predictable stochastic processes {b(t)}t0 !T 1 such that 0 |b(t)| dt < ∞ for any T > 0 is denoted by Lloc . Definition 2.3.3 Let B = {B(t) = (B1 (t), . . . , Bd (t))}t0 be a d-dimensional {Ft }-Brownian motion. The continuous and {Ft }-adapted stochastic process {X(t)}t0 given by X(t) = X(0) +

d α=1

0

t

α

aα (s) dB (s) +

t

b(s) ds

(t  0)

0

2 is called an R-valued Itô process, where {aα (t)}t0 ∈ Lloc (α = 1, . . . , d) and 1 {b(t)}t0 ∈ Lloc . An N-dimensional stochastic process whose components are R-valued Itô processes is called an RN -valued Itô process.

4:36:26, subject to the Cambridge Core terms of use, .003

62

Stochastic Integrals and Itô’s Formula

Itô’s formula is the chain rule for semimartingales and is a fundamental tool in the analysis of functions of sample paths of semimartingales. Theorem 2.3.4 (Itô’s formula) Let {X(t) = (X 1 (t), . . . , X N (t))}t0 be an Ndimensional semimartingale whose components are decomposed as X i (t) = X i (0) + M i (t) + Ai (t)

(i = 1, . . . , N)

and f ∈ C 2 (RN ). Then, { f (X(t))}t0 is also a semimartingale and, for any t > 0, we have N t N t ∂f ∂f i (X(s)) dM (s) + (X(s)) dAi (s) f (X(t)) = f (X(0)) + i i ∂x ∂x 0 0 i=1 i=1 N t 2 ∂ f 1 + (X(s)) d M i , M j (s). (2.3.2) 2 i, j=1 0 ∂xi ∂x j Itô’s formula is written in the following way for Itô processes. 2 1 and {bi (t)}t0 ∈ Lloc (α = 1, . . . , d, i = Theorem 2.3.5 Let {aiα (t)}t0 ∈ Lloc N 1, . . . , N), and {X(t)}t0 be the R -valued Itô process defined by

t d t i i i α X (t) = X (0) + aα (t) dBs + bi (s) ds (i = 1, 2, . . . , N). α=1

0

0

Then, for any f ∈ C 2 (RN ), f (X(t)) = f (X(0)) + N t

d N i=1 α=1

t 0

∂f (X(s))aiα (s) dBαs ∂xi

∂f (X(s))bi (s) ds i ∂x 0 i=1 d N 1 t ∂2 f (X(s))aiα (s)aαj (s) ds + 2 i, j=1 α=1 0 ∂xi ∂x j +

(t > 0).

Proof of Theorem 2.3.4 To avoid unnecessary complexity, we only show the case where d = 1 and N = 1. The general case can be shown in the same way. We write the semimartingale under consideration as X(t) = X(0) + M(t) + A(t). Denote the total variation of A on [0, t] by V A (t) and define a sequence {τn }∞ n=1 of stopping times by τn = 0 if |X(0)| > n and, if |X(0)|  n, τn = inf{t; max{|M(t)|, M (t), V A (t)} > n}. 4:36:26, subject to the Cambridge Core terms of use, .003

2.3 Itô’s Formula

63

If Itô’s formula is proven for {X(t ∧ τn )}t0 , we obtain it for {X(t)}t0 by letting n → ∞. Hence, we may assume that all of X(0), M, M , V A are bounded and that the support of f ∈ C 2 (R) is compact. Fix t > 0 and let Δ be the partition 0 = t0 < t1 < · · · < tn = t of [0, t]. By Taylor’s theorem, there exists a ξk between X(tk−1 ) and X(tk ) such that n { f (X(tk )) − f (X(tk−1 ))}

f (X(t)) − f (X(0)) =

k=1

=

n

f  (X(tk−1 ))(X(tk ) − X(tk−1 )) +

k=1

1  f (ξk )(X(tk ) − X(tk−1 ))2 . 2 k=1 n

The first term of the right hand side converges in probability to

t

t  f (X(s)) dM(s) + f  (X(s)) dA(s) 0

0

by the definition of the stochastic integral. We divide the second term into the sum of n 1  I1 = f (ξk )(M(tk ) − M(tk−1 ))2 , 2 k=1 I2 =

n

f  (ξk )(M(tk ) − M(tk−1 ))(A(tk ) − A(tk−1 )),

k=1

1  f (ξk )(A(tk ) − A(tk−1 ))2 . 2 k=1 n

I3 =

Write A as the difference of two increasing processes A+ , A− : A(t) = A+ (t) − A− (t). Setting J2 = max | f  (x)| · max |M(tk ) − M(tk−1 )| · (|A+ (t)| + |A− (t)|), x∈R

1kn



J3 = max | f (x)| · max |A(tk ) − A(tk−1 )| · (|A+ (t)| + |A− (t)|), x∈R

1kn

we see that |I j |  J j ( j = 2, 3) and that J2 , J3 → 0 as |Δ| = max1  k n {tk − tk−1 } → 0. Next set n 1  f (X(tk−1 ))(M(tk ) − M(tk−1 ))2 , I1 = 2 k=1 1 max ζkΔ (M(tk ) − M(tk−1 ))2 , 2 1kn k=1 n

J1 = where ζkΔ =

max

{| f  (ξ) − f  (X(tk ))|}.

X(tk )∧X(tk+1 )ξX(tk )∨X(tk+1 )

4:36:26, subject to the Cambridge Core terms of use, .003

64

Stochastic Integrals and Itô’s Formula

Then, |I1 − I1 |  J1 . Since E[J1 ] 

n 2 12  12  1  (M(tk ) − M(tk−1 ))2 E E max (ζkΔ )2 1kn 2 k=1

 by Schwarz’s inequality,   the boundedness and the uniform continuity of f Δ 2 yields E max1kn (ζk ) → 0 (|Δ| → 0). Therefore, if we use the following lemma, we obtain E[J1 ] → 0.

Lemma 2.3.6 If |M(s)|  C (s ∈ [0, t]), then n  2 (M(tk ) − M(tk−1 ))2  6C 4 . E k=1

We postpone a proof of the lemma and continue the proof of Itô’s formula. Since   n  1  2   f (ξk )(X(tk ) − X(tk−1 )) − I1   J1 + J2 + J3 , 2 k=1 we obtain

1  f (ξk )(X(tk ) − X(tk−1 ))2 − I1 → 0 2 k=1 n

as n → ∞ in probability. Moreover, set I1 =

1  f (X(tk−1 ))( M (tk ) − M (tk−1 )). 2 k=1 n

By the boundedness of f  and M , the bounded convergence theorem yields

  1 t     E I1 − f (X(s)) d M (s) → 0, |Δ| → 0. 2 0 Hence, if we show E[|I1 − I1 |2 ] → 0, we complete the proof of Itô’s formula. For this purpose we write E[|I1 − I1 |2 ] n   2 1   f (X(tk−1 )) (M(tk ) − M(tk−1 ))2 − ( M (tk ) − M (tk−1 )) = E 4 k=1 n    1 = E f  (X(tk−1 ))2 (M(tk ) − M(tk−1 ))2 − ( M (tk ) − M (tk−1 )) 2 4 k=1

4:36:26, subject to the Cambridge Core terms of use, .003

2.3 Itô’s Formula

65

  1   E f (X(tk−1 )) (M(tk ) − M(tk−1 ))2 − ( M (tk ) − M (tk−1 )) 2 k<

+

 × f  (X(t −1 ))E[(M(t ) − M(t −1 ))2 − ( M (t ) − M (t −1 ))|Ft −1 ] .

The second term is zero because    E (M(t ) − M(t −1 ))2 − ( M (t ) − M (t −1 ))Ft −1 = 0. Hence, setting K = 2 max | f  (x)|2 , we obtain, by the elementary inequality (a + b)2  2(a2 + b2 ), E[|I1 − I1 |2 ] n n   (M(tk ) − M(tk−1 ))4 + KE ( M (tk ) − M (tk−1 ))2  KE 

k=1

k=1

 KE max(M(tk ) − M(tk−1 ))2

n

(M(tk ) − M(tk−1 ))2

k=1

 + KE max( M (tk ) − M (tk−1 )) · M (t) n  2 12 4 21 2  K{E[max(M(tk ) − M(tk−1 )) ]} E (M(tk ) − M(tk−1 )) 

k=1

+ KE[max( M (tk ) − M (tk−1 )) · M (t)]. By Lemma 2.3.6 and the boundedness of M and M , applying the Lebesgue convergence theorem to the last term, we obtain E[|I1 − I1 |2 ] → 0. Now we have shown Itô’s formula (2.3.2) for each t > 0. Noting that both sides are continuous in t, we complete the proof.  Proof of Lemma 2.3.6 n

Write

(M(tk ) − M(tk−1 ))2

2

k=1

n n  n  = (M(tk ) − M(tk−1 ))4 + 2 (M(t ) − M(t −1 ))2 (M(tk ) − M(tk−1 ))2 k=1 =k+1

k=1

and note E[(M(v) − M(u))2 |Fu ] = E[M(v)2 − M(u)2 |Fu ]

(u < v).

Then, for the first term, n n   (M(tk ) − M(tk−1 ))4  E (2C)2 (M(tk ) − M(tk−1 ))2 E k=1

k=1

 4C 2 E[M(t)2 − M(0)2 ]  4C 4 .

4:36:26, subject to the Cambridge Core terms of use, .003

66

Stochastic Integrals and Itô’s Formula

For the second term, n  n   (M(t ) − M(t −1 ))2 (M(tk ) − M(tk−1 ))2 E k=1 =k+1 n 

=E

n 

E

k=1

 (M(t ) − M(t −1 ))2 Ft −1 (M(tk ) − M(tk−1 ))2

=k+1

n  (M(t)2 − M(tk )2 )(M(tk ) − M(tk−1 ))2 =E k=1

n 

E

M(t)2 (M(tk ) − M(tk−1 ))2

k=1

 C E[M(t)2 − M(0)2 ]  C 4 . 2

Combining the two estimates, we obtain the conclusion.



Example 2.3.7 For any n = 1, 2, . . ., set f (x) = xn and apply Itô’s formula. Then

t n(n − 1) t B(t)n = B(0)n + n B(s)n−1 dB(s) + B(s)n−2 ds. 2 0 0 The second term of the right hand side is a martingale with mean 0. Hence, assuming B(0) = 0 and letting n = 2m (m = 1, 2, . . .), we obtain

t 2m E[B(t) ] = m(2m − 1) E[B(s)2m−2 ] ds. 0

Since E[B(s) ] = s, this yields E[B(t) ] = 3t2 . In general, by induction, 2

4

E[B(t)2m ] =

(2m)! m t 2m m!

Example 2.3.8 For σ ∈ R, eσB(t)− 2 σ t = eσB(0) + σ 1

(m = 1, 2, . . . ).

2

t

eσB(s)− 2 σ s dB(s). 1

2

0

This is obtained by applying Itô’s formula to X(t) = σB(t)− 21 σ2 t and f (x) = e x . Example 2.3.9 Let B = {B(t)}t0 be a one-dimensional Brownian motion and φ : [0, ∞) → R be a function which is square-integrable on each finite interval. Then, setting

 t  1 t X(t) = exp φ(s) dB(s) − φ(s)2 ds , 2 0 0

4:36:26, subject to the Cambridge Core terms of use, .003

2.3 Itô’s Formula

we have

X(t) = 1 +

t

67

φ(s)X(s) dB(s).

0

Example 2.3.10 Let B = {B(t)}t0 be a one-dimensional Brownian motion with B(0) = 0. For γ ∈ R, the stochastic process X = {X(t)}t0 defined by

t eγs dB(s) X(t) = xe−γt + e−γt 0

is called an Ornstein–Uhlenbeck process. X satisfies

t X(t) = x + B(t) − γ X(s) ds. 0

Example 2.3.11 Let $\{B(t)=(B_1(t),B_2(t),B_3(t))\}_{t\ge0}$ be a three-dimensional Brownian motion with $B(0)\ne0$ and set $\tau_n=\inf\{t;\,|B(t)|=\frac1n\}$. If $|B(0)|>\frac1n$,
$$
\frac{1}{|B(t\wedge\tau_n)|} = \frac{1}{|B(0)|} - \sum_{j=1}^3 \int_0^{t\wedge\tau_n} \frac{B_j(s)}{|B(s)|^3}\,dB_j(s).
$$
Since $\tau_n\to\infty$ almost surely (see Remark 3.2.4), $\{|B(t)|^{-1}\}_{t\ge0}$ is a local martingale. $\{|B(t)|^{-1}\}_{t\ge0}$ is an example of a local martingale which is not a martingale, because $E[|B(t)|^{-1}]\to 0$ $(t\to\infty)$.

Definition 2.3.12 (Stochastic differential) (1) When a semimartingale $X$ is decomposed as $X(t)=X(0)+M_X(t)+A_X(t)$, we write $dX(t)=dM_X(t)+dA_X(t)$ and call it a representation of $X$ by stochastic differential.
(2) Let $Y$ be another semimartingale expressed as $dY(t)=dM_Y(t)+dA_Y(t)$. The stochastic differential of $\langle M_X,M_Y\rangle$ is denoted by $dX\cdot dY(t)$ and is called the product of the stochastic differentials $dX(t)$ and $dY(t)$.
(3) For a continuous $\{\mathcal{F}_t\}$-adapted stochastic process $\Phi=\{\Phi(t)\}_{t\ge0}$ which is bounded on each bounded interval, the stochastic integral $\Phi\cdot X$ of $\Phi$ with respect to $X$ is defined as a semimartingale with $(\Phi\cdot X)(0)=0$ whose stochastic differential is $\Phi\,dX := \Phi\,dM_X + \Phi\,dA_X$:
$$
(\Phi\cdot X)(t) = \int_0^t \Phi(s)\,dM_X(s) + \int_0^t \Phi(s)\,dA_X(s).
$$


Remark 2.3.13 (1) For a $d$-dimensional $\{\mathcal{F}_t\}$-Brownian motion $B=\{B(t)=(B^1(t),\ldots,B^d(t))\}_{t\ge0}$, we have
$$
dB^i\cdot dB^j = \delta_{ij}\,dt, \qquad dB^i\cdot dt = 0, \qquad dt\cdot dt = 0. \tag{2.3.3}
$$
Moreover, if $X=\{X(t)\}_{t\ge0}$ is an $\mathbb{R}$-valued Itô process given by
$$
X(t) = X(0) + \sum_{\alpha=1}^d \int_0^t a_\alpha(s)\,dB^\alpha(s) + \int_0^t b(s)\,ds \qquad (t\ge0),
$$
its stochastic differential is
$$
dX(t) = \sum_{\alpha=1}^d a_\alpha(t)\,dB^\alpha(t) + b(t)\,dt.
$$
If $\{Y(t)\}_{t\ge0}$ is another Itô process given by
$$
dY(t) = \sum_{\alpha=1}^d p_\alpha(t)\,dB^\alpha(t) + q(t)\,dt,
$$
(2.3.3) yields
$$
dX\cdot dY(t) = \sum_{\alpha=1}^d a_\alpha(t)p_\alpha(t)\,dt.
$$
(2) When $X,Y,Z$ are semimartingales, we have $(dX\cdot dY)\cdot dZ = 0$.

Using stochastic differentials makes Itô's formula simple.

Theorem 2.3.14 (Itô's formula) Let $\{X(t)=(X^1(t),\ldots,X^N(t))\}_{t\ge0}$ be an $N$-dimensional semimartingale. Then, for any $f\in C^2(\mathbb{R}^N)$,
$$
d(f(X))(t) = \sum_{i=1}^N \frac{\partial f}{\partial x^i}(X(t))\,dX^i(t) + \frac12\sum_{i,j=1}^N \frac{\partial^2 f}{\partial x^i\partial x^j}(X(t))\,dX^i\cdot dX^j(t).
$$
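The multiplication table (2.3.3) has a concrete discrete counterpart: along a partition, $\sum_k (X(t_k)-X(t_{k-1}))(Y(t_k)-Y(t_{k-1}))$ converges to $\int_0^t a(s)p(s)\,ds$ for Itô processes driven by the same Brownian motion. A minimal numerical sketch of our own (the constant coefficients $a,b,p,q$ are illustrative assumptions, not from the text), assuming NumPy:

import numpy as np

rng = np.random.default_rng(3)
t, n = 1.0, 100000
dt = t / n
a, b = 1.5, 0.3   # dX = a dB + b dt
p, q = -0.8, 1.0  # dY = p dB + q dt

dB = rng.normal(0.0, np.sqrt(dt), size=n)
dX = a * dB + b * dt
dY = p * dB + q * dt

# Discrete quadratic covariation: approximates a*p*t (here -1.2),
# the dt and dB*dt cross terms being negligible, as (2.3.3) predicts.
print((dX * dY).sum(), "vs", a * p * t)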

Remark 2.3.15 Let $\xi$ be a semimartingale whose martingale part is zero. Then, for any semimartingale $Y$, $d\xi\cdot dY = 0$. Hence, by a standard approximation by smooth functions, it is easily deduced from Theorem 2.3.14 that, for $f\in C^{1,2}(\mathbb{R}\times\mathbb{R}^N)$,
$$
d(f(\xi,X))(t) = \frac{\partial f}{\partial x^0}(\xi(t),X(t))\,d\xi(t) + \sum_{i=1}^N \frac{\partial f}{\partial x^i}(\xi(t),X(t))\,dX^i(t) + \frac12\sum_{i,j=1}^N \frac{\partial^2 f}{\partial x^i\partial x^j}(\xi(t),X(t))\,dX^i\cdot dX^j(t),
$$
where $x^0$ denotes the first variable of $f$.


Example 2.3.16 The stochastic process in Example 2.3.9 satisfies $dX(t)=\phi(t)X(t)\,dB(t)$.

Example 2.3.17 The Ornstein–Uhlenbeck process defined in Example 2.3.10 satisfies $dX(t)=dB(t)-\gamma X(t)\,dt$.

Next we define the stochastic integral due to Stratonovich. If we use this stochastic integral, Itô's formula takes the same form as the chain rule in the usual calculus. It is useful when we apply stochastic calculus to study problems in geometry.

Definition 2.3.18 Let $X$ and $Y$ be semimartingales. Then, $Y\circ dX$ denotes $Y\,dX + \frac12\,dX\cdot dY$. The corresponding semimartingale starting from $0$ is called the Stratonovich integral of $Y$ with respect to $X$ and is denoted by
$$
\int_0^t Y(s)\circ dX(s).
$$

Theorem 2.3.19 Let $X=\{X(t)\}_{t\ge0}$ be the same $N$-dimensional semimartingale as in Theorem 2.3.4. Then, for any $f\in C^3(\mathbb{R}^N)$,
$$
f(X(t)) = f(X(0)) + \sum_{i=1}^N \int_0^t \frac{\partial f}{\partial x^i}(X(s))\circ dX^i(s).
$$

Proof We simply write $f_i$, $f_{ij}$ and $f_{ijp}$ for $\frac{\partial f}{\partial x^i}$, $\frac{\partial^2 f}{\partial x^i\partial x^j}$ and $\frac{\partial^3 f}{\partial x^i\partial x^j\partial x^p}$, respectively. Then, by the definition,
$$
\sum_{i=1}^N f_i(X(t))\circ dX^i(t) = \sum_{i=1}^N f_i(X(t))\,dX^i(t) + \frac12\sum_{i=1}^N d(f_i(X))(t)\cdot dX^i(t)
$$
$$
= \sum_{i=1}^N f_i(X(t))\,dX^i(t) + \frac12\sum_{i,j=1}^N f_{ij}(X(t))\,dX^i\cdot dX^j(t) + \frac12\sum_{i,j,p=1}^N f_{ijp}(X(t))\,(dX^i\cdot dX^j)\cdot dX^p(t).
$$
The right hand side coincides with $d(f(X(t)))$ because the third term is $0$ by Remark 2.3.13 (2). □
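The difference between the two integrals is visible already for $\int_0^t B\,dB$: Itô (left-endpoint) sums converge to $\frac12 B(t)^2-\frac12 t$, while Stratonovich (midpoint) sums converge to $\frac12 B(t)^2$, in accordance with the ordinary chain rule. A small sketch of our own, assuming NumPy:

import numpy as np

rng = np.random.default_rng(4)
t, n = 1.0, 200000
dt = t / n
B = np.concatenate(([0.0], np.cumsum(rng.normal(0.0, np.sqrt(dt), size=n))))

ito = np.sum(B[:-1] * np.diff(B))                    # left-endpoint sums
strat = np.sum(0.5 * (B[:-1] + B[1:]) * np.diff(B))  # midpoint sums

print("Ito:", ito, "expected:", 0.5 * B[-1]**2 - 0.5 * t)
print("Stratonovich:", strat, "expected:", 0.5 * B[-1]**2)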


2.4 Moment Inequalities for Martingales

In the rest of this chapter, several results which are obtained by applying stochastic integrals and Itô's formula are shown. These will be frequently used in the following chapters.

Theorem 2.4.1 (Burkholder–Davis–Gundy inequality) For any $p>0$ there exist positive constants $c_p$ and $C_p$ such that
$$
c_p\,E[M^*(t)^{2p}] \le E[\langle M\rangle(t)^p] \le C_p\,E[M^*(t)^{2p}] \tag{2.4.1}
$$
for any continuous local martingale $M=\{M(t)\}_{t\ge0}$ with $M(0)=0$, where $M^*(t)=\max_{0\le s\le t}|M(s)|$.

Proof Following [29], we give a proof by stochastic analysis. See, e.g., [100], for another proof. We set
$$
\tau_n = \inf\{t;\,|M(t)|\ge n \text{ or } \langle M\rangle(t)\ge n\}, \qquad n=1,2,\ldots
$$
Then, $\tau_n\to\infty$ almost surely. If we show (2.4.1) for $M^{\tau_n}=\{M(t\wedge\tau_n)\}_{t\ge0}$ ($c_p$ and $C_p$ being independent of $n$), we see (2.4.1) for $M$ by letting $n\to\infty$. Hence we may assume that $\{M(t)\}_{t\ge0}$ and $\{\langle M\rangle(t)\}_{t\ge0}$ are bounded. We recall Doob's inequality (Theorem 1.5.14): for $q>1$,
$$
E[M^*(t)^q] \le \Big(\frac{q}{q-1}\Big)^q E[|M(t)|^q]. \tag{2.4.2}
$$
If $p=1$, since $E[\langle M\rangle(t)]=E[M(t)^2]$, Doob's inequality for $q=2$ implies (2.4.1) if we set $c_p=\frac14$ and $C_p=1$.
Next we consider the case when $p>1$. Since $f(x)=|x|^{2p}$ $(x\in\mathbb{R})$ is of $C^2$-class,
$$
|M(t)|^{2p} = \int_0^t 2p|M(s)|^{2p-1}\mathrm{sgn}(M(s))\,dM(s) + p(2p-1)\int_0^t |M(s)|^{2p-2}\,d\langle M\rangle(s),
$$
where $\mathrm{sgn}(x)$ is the function such that $\mathrm{sgn}(x)=1$ for $x>0$ and $\mathrm{sgn}(x)=-1$ for $x\le0$. By Hölder's inequality,
$$
E[|M(t)|^{2p}] = p(2p-1)E\Big[\int_0^t |M(s)|^{2p-2}\,d\langle M\rangle(s)\Big] \le p(2p-1)E[M^*(t)^{2p-2}\langle M\rangle(t)] \le p(2p-1)\{E[M^*(t)^{2p}]\}^{\frac{p-1}{p}}\{E[\langle M\rangle(t)^p]\}^{\frac1p}.
$$

{E[ M (t) p ]} p .

4:36:26, subject to the Cambridge Core terms of use, .003

2.4 Moment Inequalities for Martingales

71

Hence we get E[M ∗ (t)2p ] 

 2p 2p p−1 1 p(2p − 1){E[M ∗ (t)2p ]} p {E[ M (t) p ]} p 2p − 1

by Doob’s inequality. It is now easy to show the first inequality of (2.4.1). !t p−1 Set N(t) = 0 M (s) 2 dM(s). Then it is easy to see

t 1

M (s) p−1 d M (s) = M (t) p

N (t) = p 0 and E[ M (t) p ] = pE[ N (t)] = pE[N(t)2 ].

(2.4.3)

On the other hand, Itô’s formula yields

t

t p−1 p−1 p−1 M(t) M (t) 2 =

M (s) 2 dM(s) + M(s) d[ M (s) 2 ] 0 0

t p−1 = N(t) + M(s) d[ M (s) 2 ], 0

which implies |N(t)|  2M ∗ (t) M (t)

p−1 2

.

Hence, by (2.4.3), we get E[ M (t) p ]  4pE[M ∗ (t)2 M (t) p−1 ]  4p{E[M ∗ (t)2p ]} p {E[ M (t) p ]} 1

p−1 p

.

Thus, for p > 1, E[ M (t) p ]  (4p) p E[M ∗ (t)2p ]. Finally we show the case when 0 < p < 1. Note M −

t 1−p

M (s)− 2 dM(s). N(t) =

1−p 2

∈ L 2 (M) and set

0

!t 1−p Then E[ M (t) ] = pE[N(t) ] and M(t) = 0 M (s) 2 dN(s). Itô’s formula yields

t

t 1−p 1−p 1−p 2 2 N(t) M (t) =

M (s) dN(s) + N(s) d[ M (s) 2 ] 0 0

t 1−p = M(t) + N(s) d[ M (s) 2 ]. p

2

0 ∗

Hence |M(t)|  2N (t) M (t)

1−p 2

. Therefore, since

M ∗ (t)  2N ∗ (t) M (t)

1−p 2

,

4:36:26, subject to the Cambridge Core terms of use, .003

72

Stochastic Integrals and Itô’s Formula

we obtain E[M ∗ (t)2p ]  22p E[N ∗ (t)2p M (t) p(1−p) ]  22p {E[N ∗ (t)2 ]} p {E[ M (t) p ]}1−p . In conjunction with the result in the case when p = 1, we obtain E[M ∗ (t)2p ]  22p 4 p {E[N(t)2 ]} p {E[ M (t) p ]}1−p  16  p  16  p  {E[ M (t) p ]} p {E[ M (t) p ]}1−p = E[ M (t) p ]. p p Now we get the first inequality of (2.4.1). To show the second inequality, let α > 0 and write

M (t) p = { M (t) p (α + M ∗ (t))−2p(1−p) }(α + M ∗ (t))2p(1−p) . Then, E[ M (t) p ]  {E[ M (t)(α + M ∗ (t))−2(1−p) ]} p {E[(α + M ∗ (t))2p ]}1−p . (2.4.4) !  = t (α + M ∗ (s))−(1−p) dM(s). Then Set N(t) 0

t  (α + M ∗ (s))−2(1−p) d M (s)

N (t) = 0

 (α + M ∗ (t))−2(1−p) M (t). Itô’s formula yields ∗

M(t)(α + M (t))

−(1−p)

(2.4.5)

t

(α + M ∗ (s))−(1−p) dM(s)

t + M(s) d[(α + M ∗ (s))−(1−p) ] 0

t  = N(t) − (1 − p) M(s)(α + M ∗ (s)) p−2 dM ∗ (s).

=

0

0

Hence we get   M ∗ (t) p + (1 − p) |N(t)|

t

M ∗ (s) p−1 dM ∗ (s) =

0

1 ∗ p M (t) p

 ]  p E[M (t) ]. By (2.4.4) and (2.4.5), and E[N(t) 2

−2



2p

p  E[ M (t) p ]  {E[ N (t)]} {E[(α + M ∗ (t))2p ]}1−p 1  2p {E[M ∗ (t)2p ]} p {E[(α + M ∗ (t))2p ]}1−p . p



2.5 Martingale Characterization of Brownian Motion

First we show the martingale characterization of Brownian motion due to Lévy.

Theorem 2.5.1 Let $M=\{M(t)=(M^1(t),M^2(t),\ldots,M^N(t))\}_{t\ge0}$ be an $N$-dimensional continuous $\{\mathcal{F}_t\}$-local martingale with $M(0)=0$.
(1) If
$$
\langle M^i,M^j\rangle(t) = \delta_{ij}t \qquad (i,j=1,2,\ldots,N), \tag{2.5.1}
$$
then $M$ is an $N$-dimensional $\{\mathcal{F}_t\}$-Brownian motion.
(2) Suppose that
$$
\exp\Big(\mathrm{i}\sum_{i=1}^N \int_0^t f_i(s)\,dM^i(s) + \frac12\sum_{i=1}^N \int_0^t f_i(s)^2\,ds\Big)
$$
is a martingale for any $f_1,\ldots,f_N\in L^2([0,\infty))$. Then, $M$ is an $N$-dimensional $\{\mathcal{F}_t\}$-Brownian motion.

Proof (1) It suffices to show
$$
E[e^{\mathrm{i}\langle\xi,M(t)-M(s)\rangle}\,|\,\mathcal{F}_s] = e^{-\frac{|\xi|^2(t-s)}{2}} \tag{2.5.2}
$$
for any $\xi\in\mathbb{R}^N$ and $0\le s<t$. The assumption (2.5.1) implies that $M$ is a square-integrable martingale. Moreover, by Itô's formula, it yields
$$
e^{\mathrm{i}\langle\xi,M(t)\rangle} - e^{\mathrm{i}\langle\xi,M(s)\rangle} = \sum_{i=1}^N \int_s^t \mathrm{i}\xi^i e^{\mathrm{i}\langle\xi,M(u)\rangle}\,dM^i(u) - \frac12\sum_{i=1}^N (\xi^i)^2\int_s^t e^{\mathrm{i}\langle\xi,M(u)\rangle}\,du.
$$
Since the first term on the right hand side is a square-integrable martingale, we have
$$
E[e^{\mathrm{i}\langle\xi,M(t)-M(s)\rangle}1_A] - P(A) = -\frac{|\xi|^2}{2}\int_s^t E[e^{\mathrm{i}\langle\xi,M(u)-M(s)\rangle}1_A]\,du
$$
for any $A\in\mathcal{F}_s$. The unique solution of this integral equation is given by
$$
E[e^{\mathrm{i}\langle\xi,M(t)-M(s)\rangle}1_A] = P(A)\,e^{-\frac{|\xi|^2(t-s)}{2}}.
$$
This means (2.5.2).
(2) For $\xi=(\xi_1,\xi_2,\ldots,\xi_N)\in\mathbb{R}^N$ and $T>0$, set $f_i(s)=\xi_i 1_{[0,T]}(s)$. By the assumption, $\{e^{\mathrm{i}\sum_i \xi_i M^i(t)+\frac{|\xi|^2 t}{2}}\}_{0\le t\le T}$ is a martingale. This means (2.5.2). □
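Lévy's theorem can be probed numerically: the stochastic integral $M(t)=\int_0^t \mathrm{sgn}(B(s))\,dB(s)$ is a continuous martingale with $\langle M\rangle(t)=t$, hence a Brownian motion, even though it is not $B$ itself. A rough sketch of our own comparing the sample distribution of $M(1)$ with $N(0,1)$, assuming NumPy:

import numpy as np

rng = np.random.default_rng(6)
n_paths, n_steps, t = 20000, 1000, 1.0
dt = t / n_steps

dB = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
B = np.cumsum(dB, axis=1)
B_left = np.concatenate([np.zeros((n_paths, 1)), B[:, :-1]], axis=1)
sgn = np.where(B_left > 0, 1.0, -1.0)  # sgn(B) at left endpoints

M1 = (sgn * dB).sum(axis=1)            # M(1) = int_0^1 sgn(B) dB
print("mean:", M1.mean(), "var:", M1.var())  # approx 0 and 1, as for N(0,1)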


Corollary 2.5.2 Let $\{B(t)=(B^1(t),B^2(t),\ldots,B^N(t))\}_{t\ge0}$ be an $N$-dimensional $\{\mathcal{F}_t\}$-Brownian motion and $\sigma$ be an $\{\mathcal{F}_t\}$-stopping time which is finite almost surely. Set $B^*(t)=B(\sigma+t)$ and $\mathcal{F}^*_t=\mathcal{F}_{\sigma+t}$ $(t\ge0)$. Then, $B^*=\{B^*(t)\}_{t\ge0}$ is an $N$-dimensional $\{\mathcal{F}^*_t\}$-Brownian motion. In particular, $\widetilde B^*(t)=B(\sigma+t)-B(\sigma)$ is an $N$-dimensional Brownian motion which is independent of $\mathcal{F}^*_0=\mathcal{F}_\sigma$.

Proof By the optional sampling theorem, $M^i(t)=B^i(\sigma+t)-B^i(\sigma)$ and $M^i(t)M^j(t)-\delta_{ij}t$ are $\{\mathcal{F}^*_t\}$-local martingales. Hence, Theorem 2.5.1 implies the assertion because $\langle M^i,M^j\rangle(t)=\delta_{ij}t$. □

Next we show that continuous local martingales are given as time changed Brownian motions.

Theorem 2.5.3 (Dambis–Dubins–Schwarz) Let $M$ be a continuous local martingale such that $\lim_{t\to\infty}\langle M\rangle(t)=\infty$ almost surely and set
$$
\sigma(s) = \inf\{t>0;\,\langle M\rangle(t)>s\} \quad\text{and}\quad B(s) = M(\sigma(s)).
$$
Then, $B=\{B(s)\}_{s\ge0}$ is an $\{\mathcal{F}_{\sigma(s)}\}$-Brownian motion and $M(t)=B(\langle M\rangle(t))$.

Proof If we show that $B$ is a continuous local martingale satisfying $\langle B\rangle(s)=s$, we are done by Lévy's theorem (Theorem 2.5.1). Note that $\sigma(s)$ is an $\{\mathcal{F}_t\}$-stopping time which is finite almost surely, because $\{\sigma(s)<t\}=\{s<\langle M\rangle(t)\}$.
First we show that, for each fixed $t_1\ge0$,
$$
P(M(t)=M(t_1)\ \text{for all}\ t\in[t_1,\eta]) = 1, \tag{2.5.3}
$$
where $\eta=\inf\{t>t_1;\,\langle M\rangle(t)>\langle M\rangle(t_1)\}$. Set $N(t)=M((t_1+t)\wedge\eta)-M(t_1)$. $\{N(t)\}_{t\ge0}$ is a continuous $\{\mathcal{F}_{(t_1+t)\wedge\eta}\}$-local martingale satisfying $\langle N\rangle(t)=\langle M\rangle((t_1+t)\wedge\eta)-\langle M\rangle(t_1)$. By the definition of $\eta$, $\langle N\rangle(t)=0$ and hence $N(t)=0$ $(t\ge0)$ almost surely. Thus (2.5.3) holds. Due to the continuity of $M$ and $\langle M\rangle$, (2.5.3) implies
$$
P\big(\text{for all } t_1\le t_2,\ \text{if }\langle M\rangle(t_1)=\langle M\rangle(t_2),\ \text{then } M \text{ is constant on } [t_1,t_2]\big) = 1,
$$
which means the continuity of $B$.


Next we show that $\{B(s)\}_{s\ge0}$ is a locally square-integrable martingale satisfying $\langle B\rangle(s)=s$. Let $0\le s_1<s_2$ and set $\widetilde M(t)=M(t\wedge\sigma(s_2))$. $\{\widetilde M(t)\}_{t\ge0}$ is an $\{\mathcal{F}_t\}$-martingale and
$$
\langle\widetilde M\rangle(t) = \langle M\rangle(t\wedge\sigma(s_2)) \le \langle M\rangle(\sigma(s_2)) = s_2 \qquad (t\ge0).
$$
This means that $\{\widetilde M(t)\}_{t\ge0}$ and $\{\widetilde M(t)^2-\langle\widetilde M\rangle(t)\}_{t\ge0}$ are uniformly integrable martingales. Therefore, by the optional sampling theorem,
$$
E[B(s_2)-B(s_1)\,|\,\mathcal{F}_{\sigma(s_1)}] = E[\widetilde M(\sigma(s_2))-\widetilde M(\sigma(s_1))\,|\,\mathcal{F}_{\sigma(s_1)}] = 0
$$
and
$$
E[(B(s_2)-B(s_1))^2\,|\,\mathcal{F}_{\sigma(s_1)}] = E[\langle\widetilde M\rangle(\sigma(s_2))-\langle\widetilde M\rangle(\sigma(s_1))\,|\,\mathcal{F}_{\sigma(s_1)}] = s_2-s_1. \qquad\Box
$$

Theorem 2.5.3 was extended to multi-dimensional stochastic processes in [60].

Theorem 2.5.4 Let $M^i\in\mathcal{M}^2_{c,loc}$ $(i=1,2,\ldots,N)$ and suppose

$$
\langle M^i,M^j\rangle(t) = 0 \qquad (i\ne j)
$$
and
$$
\lim_{t\to\infty}\langle M^i\rangle(t) = \infty,\ P\text{-a.s.} \qquad (i=1,2,\ldots,N).
$$
Then, the stochastic process $\{B(s)=(B^1(s),B^2(s),\ldots,B^N(s))\}_{s\ge0}$ defined by $B^i(s)=M^i(\tau_i(s))$, where $\tau_i(s)=\inf\{t;\,\langle M^i\rangle(t)>s\}$, is an $N$-dimensional Brownian motion.

Proof We give a proof following [9]. Since each $B^i=\{B^i(s)\}_{s\ge0}$ is a one-dimensional Brownian motion by Theorem 2.5.3, it suffices to show the independence of $B^1,B^2,\ldots,B^N$. For this purpose, let $f_i\in L^2([0,\infty)\to\mathbb{R})$ and set
$$
\Phi_i(t) = \exp\Big(\mathrm{i}\int_0^t f_i(s)\,dB^i(s) + \frac12\int_0^t f_i(s)^2\,ds\Big).
$$
Then, it is easy to show by Itô's formula that $\{\Phi_i(t)\}_{t\ge0}$ is a bounded martingale and that $E[\Phi_i(t)]=1$, $E[\Phi_i(\infty)]=1$. In particular,
$$
E\big[e^{\mathrm{i}\int_0^\infty f_i(s)\,dB^i(s)}\big] = e^{-\frac12\int_0^\infty f_i(s)^2\,ds}. \tag{2.5.4}
$$

76

Stochastic Integrals and Itô’s Formula

On the other hand, set L(t) = N

L (t) =

t

N ! t i=1 0

fi ( M (u)) dM i (u). Then,

fi ( M (u))2 d M i (u) 

N

0

i=1



| fi (s)|2 ds

0

i=1

and {L(t)}t0 is a martingale by Theorem 2.1.3. By Itô’s formula again, the

L (t) stochastic process {eiL(t)+ 2 }t0 is a bounded martingale and E[eiL(t)+

L (t) 2

] = 1.

Hence, denoting the limit of L(t) and L (t) as t → ∞ by L(∞) and L (∞), we get E[eiL(∞)+

L (∞) 2

] = 1.

Since {τi (t) < s} = {t < M i (s)},

∞ i i i i 1(t1 ,t2 ] ( M (t)) dM (t) = B (t2 ) − B (t1 ) = 0



1(t1 ,t2 ] (s) dBi (s)

0

for any t1 < t2 . Let {gim }∞ m=1 be a sequence of step functions which are linear combinations of the indicator functions of intervals of the form (t1 , t2 ] such that

∞ lim | fi (s) − gim (s)|2 ds = 0. m→∞

0

Then, by the observation above,



gim ( M i (t)) dM i (t) = 0

Moreover, since



0



gim (s) dBi (s).

(2.5.5)

0

| fi ( M i (t)) − gim ( M i (t))|2 d M i (t)

∞ | fi (s) − gim (s)|2 ds → 0 (m → ∞), = 0

we obtain





fi ( M (t)) dM (t) = i

i

0

fi (s) dBi (s)

0

by letting m → ∞ in (2.5.5). Hence we have L(∞) =

N i=1

∞ 0

fi ( M i (t)) dM i (t) =

N i=1



fi (s) dBi (s)

0

4:36:26, subject to the Cambridge Core terms of use, .003

2.5 Martingale Characterization of Brownian Motion

77

and

L (∞) =

N



fi (s)2 ds.

0

i=1

Therefore, by (2.5.4), N N  N ! ∞  !∞ !   1 ∞ i 2 i E ei i=1 0 fi (s)dB (s) = e− 2 0 fi (s) ds = E ei 0 fi (s)dB (s) . i=1

i=1

This implies the independence of B , B , . . . , B . 1

2



N

Theorems 2.5.3 and 2.5.4 are extended by removing the assumptions on the divergence of the quadratic variation processes as t → ∞. For this we need to extend the probability spaces and to consider a Brownian motion independent of M. Theorem 2.5.5 Let M be a continuous local martingale defined on a probability space (Ω, F , P, {Ft }) and set ⎧ ⎪ ⎪ ⎨inf{u; M (u) > s}, s = u>0 Fσ(s)∧u . σ(s) = ⎪ F ⎪ ⎩∞ (if s  M (∞)), Moreover, let {B (t)}t0 be a Brownian motion starting from 0 defined on 6,  6t }) be the direct  F P, {F another probability space (Ω , F  , P , {Ft }) and (Ω, product probability space defined by 6 = F ⊗ F ,  6t = F t ⊗ F  .  = Ω × Ω , F P = P × P , F Ω t

(2.5.6)

6t }-Brownian motion {B(s)} s0 which is defined on the Then, there exists an {F product probability space and satisfies B(s) = M(σ(s))

(s ∈ [0, M (∞))),

that is, M(t) = B( M (t)) (t  0). Proof We follow [45]. By the optional sampling theorem (Theorem 1.5.11), if s  s , u  u , then E[M(σ(s ) ∧ u )|Fσ(s)∧u ] = M(σ(s) ∧ u) and E[(M(σ(s ) ∧ u ) − M(σ(s) ∧ u))2 |Fσ(s)∧u ] = E[ M (σ(s ) ∧ u ) − M (σ(s) ∧ u)|Fσ(s)∧u ].


In particular, for $A\in\mathcal{F}_{\sigma(s)\wedge u}$ $(u>0)$, $E[M(\sigma(s')\wedge u)1_A]=E[M(\sigma(s)\wedge u)1_A]$. By the observation above, $M(\sigma(s)\wedge u)$ converges in $L^2$ as $u\uparrow\infty$. Hence, denoting the limit by $\widehat B(s)$, we have $E[\widehat B(s')1_A]=E[\widehat B(s)1_A]$. Therefore,
$$
E[\widehat B(s')\,|\,\widehat{\mathcal{F}}_s] = \widehat B(s)
$$
and
$$
E[(\widehat B(s')-\widehat B(s))^2\,|\,\widehat{\mathcal{F}}_s] = E[(s'\wedge\langle M\rangle(\infty))-(s\wedge\langle M\rangle(\infty))\,|\,\widehat{\mathcal{F}}_s].
$$
Set
$$
B(s) = B'(s) - B'(s\wedge\langle M\rangle(\infty)) + \widehat B(s) \qquad (s\ge0).
$$
Then, $B=\{B(s)\}_{s\ge0}$ is an $\{\widetilde{\mathcal{F}}_t\}$-martingale and $\langle B\rangle(s)=s$. Hence, $B$ is an $\{\widetilde{\mathcal{F}}_t\}$-Brownian motion. □

The direct product probability space in Theorem 2.5.5 is called an extension of $(\Omega,\mathcal{F},P,\{\mathcal{F}_t\})$. Also, for the multi-dimensional case, the following can be proven. We omit the proof.

Theorem 2.5.6 Let $M^i=\{M^i(t)\}_{t\ge0}$ $(i=1,2,\ldots,d)$ be continuous local martingales defined on a probability space $(\Omega,\mathcal{F},P,\{\mathcal{F}_t\})$ and assume that
$$
\langle M^i,M^j\rangle(t) = 0 \qquad (i\ne j).
$$
Set
$$
\sigma^i(s) = \begin{cases} \inf\{u;\,\langle M^i\rangle(u)>s\} \\ \infty \quad (\text{if } s\ge\langle M^i\rangle(\infty)). \end{cases}
$$
Then, there exists an extension of $(\Omega,\mathcal{F},P,\{\mathcal{F}_t\})$ and a $d$-dimensional Brownian motion $\{(B^1(s),B^2(s),\ldots,B^d(s))\}_{s\ge0}$ defined on it such that
$$
B^i(s) = M^i(\sigma^i(s)) \qquad (s\in[0,\langle M^i\rangle(\infty))).
$$

Next we show that a $d$-dimensional local martingale can be expressed as a stochastic integral with respect to a Brownian motion.

Theorem 2.5.7 For $M^i=\{M^i(t)\}_{t\ge0}\in\mathcal{M}^2_{c,loc}$ $(i=1,2,\ldots,d)$ with $M^i(0)=0$ defined on a probability space $(\Omega,\mathcal{F},P,\{\mathcal{F}_t\})$, assume that there exist $\Phi_{ij}=\{\Phi_{ij}(s)\}_{s\ge0}\in\mathcal{L}^1_{loc}$ and $\Psi_{ij}=\{\Psi_{ij}(s)\}_{s\ge0}\in\mathcal{L}^2_{loc}$ $(i,j=1,2,\ldots,d)$ such that
$$
\langle M^i,M^j\rangle(t) = \int_0^t \Phi_{ij}(s)\,ds, \qquad \Phi_{ij}(s) = \sum_{p=1}^d \Psi_{ip}(s)\Psi_{jp}(s),
$$

2.5 Martingale Characterization of Brownian Motion

79

and det((Ψi j (s)))  0 for any s  0 almost surely. Then, there exists a d-dimensional {Ft }-Brownian motion {(B1 (t), B2 (t), . . . , Bd (t))}t0 satisfying d

M (t) = i

t

Ψi j (s) dB j (s).

0

j=1

Proof By the standard argument using stopping times, we may assume that M i ∈ Mc2 , Φi j ∈ L 1 and Ψi j ∈ L 2 . Denote by (Ψ−1 )i j (s) the (i, j)-component of the inverse of Ψ(s) = (Ψi j (s)) and set ⎧ −1 −1 ⎪ ⎪ ⎨(Ψ )i j (s) (if |(Ψ )i j (s)|  N (i, j = 1, 2, . . . , d)) (N) Θi j (s) = ⎪ ⎪ ⎩ 0 (otherwise) for N > 0. By the dominated convergence theorem,

0

t

2 d   (N) (N)  E Θip (s)Θ jq (s)Φ pq (s) − δi j  ds → 0

(N → ∞).

p,q=1

Set Bi(N) (t)

=

d j=1

t

0

j Θ(N) i j (s) dM (s).

Then, {Bi(N) (t)}t0 ∈ Mc2 and the quadratic variation process is given by

j (t) =

Bi(N) , B(N)

t

d

0 p,q=1

(N) Θ(N) ip (s)Θ jq (s)Φ pq (s) ds.

Hence, {Bi(N) (t)}t0 converges in Mc2 as N → ∞ and, for the limit {Bi (t)}t0 ,

Bi , B j (t) = δi j t. By Theorem 2.5.1, {(B1 (t), B2 (t), . . . , Bd (t))}t0 is a d-dimensional {Ft }-Brownian motion. Moreover, setting ⎧ −1 ⎪ ⎪ ⎨1 (if |(Ψ )i j (s)|  N (i, j = 1, 2, . . . , d)) IN (s) = ⎪ ⎪ ⎩0 (otherwise), by the definition of {Bi(N) (t)}t0 , we obtain

0

t

IN (s) dM i (s) =

d j=1

0

t

j Ψi j (s) dB(N) (s).

4:36:26, subject to the Cambridge Core terms of use, .003

80

Stochastic Integrals and Itô’s Formula

Both sides converge in Mc2 as N → ∞ and d t Ψi j (s) dB j (s). M i (t) = j=1



0

When Φ is degenerate, we need an extension of a probability space. 2 (i = 1, 2, . . . , d) with M i (0) = 0 defined Theorem 2.5.8 For M i ∈ Mc,loc on (Ω, F , P, {Ft }), assume that there exist d × d and d × r matrix-valued predictable processes Φ = (Φi j ) and Ψ = (Ψiα ) such that

t

t Φi j (s) ds < ∞, Ψ jα (s)2 ds < ∞ 0

0

(i, j = 1, 2, . . . , d, α = 1, 2, . . . , r) and

M i , M j (t) =

t

Φi j (s) ds,

Φi j (s) =

0

r

Ψiα (s)Ψ jα (s)

α=1

for every t > 0 almost surely. Then there exists an extension of (Ω, F , P, {Ft }) and an r-dimensional Brownian motion {(B1 (t), B2 (t), . . . , Br (t))}t0 defined on it satisfying r t Ψiα (s) dBα (s). M i (t) = α=1

0

Proof We give a constructive proof, following [114, Theorem 4.5.2]. Setting Ψi,r+1 = · · · = Ψi,d = 0 (i = 1, 2, . . . , d) if r < d and M d+1 = · · · = M r = 0 if r > d, we may assume r = d. Letting (W, B(W), μ) be the d-dimensional Wiener space on [0, ∞), we consider the direct product probability space 6,   F (Ω, P) = (Ω × W, F × B(W), P × μ).  in a natural way and shall not We extend functions on Ω and W to those on Ω explain it.  t0 , {α(t)}t0 Define d × d-matrix-valued stochastic processes {Π(t)}t0 , {Π(t)} and {β(t)}t0 by Π(t) = lim Φ(t)(εI + Φ(t))−1 , ε↓0

−1

α(t) = lim(εI + Φ(t)) Π(t), ε↓0

 = lim Ψ∗ (t)Ψ(t)(εI + Ψ∗ (t)Ψ(t))−1 , Π(t) ε↓0

β(t) = Ψ(t)∗ α(t).

The diagonalization of symmetric matrices yields the existence of the lim are the orthogonal projections onto the images of Φ(t) and its. Π(t) and Π(t)

4:36:26, subject to the Cambridge Core terms of use, .003

2.5 Martingale Characterization of Brownian Motion

81

Ψ∗ (t)Ψ(t), respectively. Moreover, by definition, each matrix-valued stochastic process is predictable. It holds that  Ψ(t)β(t) = Π(t) and β(t)Ψ(t) = Π(t).

(2.5.7)

We postpone the proof of (2.5.7) and continue the proof of the theorem. Define a d × (2d) matrix Σ(t) by  Σ(t) = (β(t), I − Π(t)) and set

t

B(t) = 0

Σ(s)

# $ dM(s) , dθ(s)

where {θ(t)}t0 is the coordinate process of W. Then, by (2.5.7) and the fact  is an orthogonal projection, that Π(t) # $

t  i j  Φ(s) 0 Σ(s) Σ(s)∗ ds = tI.

B , B (t) 1i, jd = 0 I 0 This means that {B(t)}t0 is a d-dimensional Brownian motion.  is the orthogonal projection onto the image of Ψ∗ (t)Ψ(t), Since Π(t) ∗   Ψ(t)(I − Π(t)) = 0. (I − Π(t))Ψ(t)

 That is, Ψ(t)(I − Π(t)) = 0. Hence,

# $ dM(t) Ψ(t)dB(t) = Ψ(t)Σ(t) dθ(t)  dθ(t) = Ψ(t)β(t) dM(t) + Ψ(t)(I − Π(t)) = Ψ(t)β(t) dM(t).

Moreover, since Π(t) is the orthogonal projection onto the image of Φ(t), (I − Π(t))Φ(t) = 0. Set

t

N(t) =

(I − Π(s)) dM(s).

0

Then, 



N i , N j (t)

1i, jd

t

=

(I − Π(s))Φ(s)(I − Π(s)) ds = 0.

0

Combining the above observation with (2.5.7), we obtain Ψ(t) dB(t) = Π(t) dM(t) = dM(t).



4:36:26, subject to the Cambridge Core terms of use, .003

82

Stochastic Integrals and Itô’s Formula

Proof of (2.5.7) The first identity is obtained from Ψ(t)β(t) = Φ(t)α(t) = lim Φ(t)(εI + Φ(t))−1 Π(t) = Π(t)2 = Π(t). ε↓0

To show the second identity, let λi be the non-zero eigenvalues of Ψ∗ (t)Ψ(t) and f i (i = 1, . . . , m) be the corresponding eigenvectors. Moreover, let { f 1 , . . . , f d } be the orthonormal basis of Rd obtained by extending f 1 , . . . , f m . Then, since Ψ(t) f i is an eigenvector of Φ(t) corresponding to the eigenvalue λi , d 

β(t)Ψ(t)

i=1

d m      ξi ξi f i = Ψ∗ (t)α(t) ξi Ψ(t) f i = Ψ∗ (t) Ψ(t) f i λ i=1 i=1 i

for any (ξ1 , . . . , ξd ) ∈ Rd . Hence we obtain d 

β(t)Ψ(t)

ξi f

i



i=1

=

m

ξi f i .

i=1

Thus β(t)Ψ(t) is the orthogonal projection onto the image of Ψ∗ (t)Ψ(t).



2.6 Martingales with respect to Brownian Motions We show that martingales with respect to the filtration generated by a Brownian motion are represented as the stochastic integrals with respect to it. This is given in Theorem 2.6.2 and called Itô’s representation theorem. Throughout this section we let B = {B(t) = (B1 (t), B2 (t), . . . , Bd (t))}t0 be a d-dimensional Brownian motion defined on a complete probability space (Ω, F , P) and {FtB } be the filtration obtained by the completion of the σ-field generated by sample paths: FtB = σ{B(s); s  t} ∨ N , where N is the totality of the P-null sets. B Lemma 2.6.1 Ft+0 = FtB for any t  0.

Proof It suffices to show that    ϕ(t) := E f1 (B(t1 )) f2 (B(t2 )) · · · fn (B(tn ))FtB is right-continuous in t for any 0  t1 < t2 < · · · < tn and f1 , f2 , . . . , fn ∈ C∞ (Rd ), where C∞ (Rd ) is the set of continuous functions on Rd which tend

4:36:26, subject to the Cambridge Core terms of use, .003

2.6 Martingales with respect to Brownian Motions

83

to 0 at infinity and is regarded as a Banach space with the topology of uniform convergence on compacta. If tk−1  t < tk , then ϕ(t) = f1 (B(t1 )) · · · fk−1 (B(tk−1 ))E[ fk (B(tk )) · · · fn (B(tn ))|FtB ]. Let g(t, x, y) be the transition density of a d-dimensional Brownian motion and set

(Pt f )(x) = g(t, x, y) f (y) dy ( f ∈ C∞ (Rd )), Rd

where C∞ (R ) is the space of continuous functions on Rd tending to 0 at infinity. Then, {Pt }t0 is a strongly continuous semigroup of linear operators on C∞ (Rd ). Moreover, define the functions Hm (t1 , t2 , . . . , tm ; f1 , f2 , . . . , fm ) ∈ C∞ (Rd ) inductively by H1 (t; f ) = Pt f and d

Hm (t1 , t2 , . . . , tm ; f1 , f2 , . . . , fm ) = Hm−1 (t1 , t2 , . . . , tm−1 ; f1 , f2 , . . . , fm−1 Ptm −tm−1 fm ) for m  2. Since E[ f1 (B(t1 )) f2 (B(t2 )) · · · fm (B(tm ))]

Hm (t1 , t2 , . . . , tm ; f1 , f2 , . . . , fm )(x)ν(dx), = Rd

ν being the probability distribution of B(0), ϕ is written as ϕ(t) =

k−1 

f j (B(t j )) · Hn−k+1 (tk − t, tk+1 − t, . . . , tn − t; fk , fk+1 , . . . , fn )(B(t)).

j=1

It is easy to see that the right hand side is a right-continuous function in t.



We denote by M 2 ({FtB }) the set of square-integrable {FtB }-martingales and by L 2 (B) the set of R-valued predictable stochastic processes Φ satisfying  T Φ(s)2 ds < ∞ (T > 0). E 0

The following theorem shows that every square integrable {FtB }-martingale is expressed as a stochastic integral with respect to the original Brownian motion. Theorem 2.6.2 For any M = {M(t)}t0 ∈ M 2 ({FtB }), there exist Φi ∈ L 2 (B) (i = 1, 2, . . . , d) such that d t M(t) = M(0) + Φi (s) dBi (s). (2.6.1) i=1

0

4:36:26, subject to the Cambridge Core terms of use, .003

84

Stochastic Integrals and Itô’s Formula

Remark 2.6.3 Combining a similar argument in the proof below with the localization argument via {FtB }-stopping times, we can show that for any 2 2 ({FtB }) there exist Φi ∈ Lloc (i = 1, 2, . . . , d) satisfying (2.6.1). M ∈ Mc,loc To avoid unnecessary complexity, we give a proof when d = 1. Moreover, it suffices to consider the case when the time interval is [0, T ] (T > 0) and M(0) = 0. For the proof, setting

t I 2 (B) = M ∈ M 2 ({FtB }) ; M(t) = Φ(s) dB(s), Φ ∈ L 2 (B) , 0

we use the following lemma. Lemma 2.6.4 For any M ∈ M 2 ({FtB }), there exist unique M1 ∈ I 2 (B) and M2 ∈ M 2 ({FtB }) such that M = M1 + M2 and M2 , N = 0 for all N ∈ I 2 (B). Proof At first we show the uniqueness. Let M = M1 + M2 = M1 + M2 be the desired decompositions. Then, since M2 − M2 = M1 − M1 ∈ I 2 (B), M2 (M2 − M2 ) and M2 (M2 − M2 ) are martingales. Hence, (M2 − M2 )2 is also a martingale. This means M2 − M2 = 0 and, therefore, M2 = M2 . Next let H be a closed subspace of the L2 -space on (Ω, FTB , P) defined by H = {M1 (T ); M ∈ I 2 (B)} and H ⊥ be the orthogonal complement of H . For M ∈ M 2 ({FtB }) let M(T ) = H + K

(H ∈ H , K ∈ H ⊥ ).

(2.6.2)

be the corresponding decomposition of M(T ). !T Then, there exists a Φ ∈ L 2 (B) such that H = 0 Φ(s) dB(s). By the rightcontinuity of {FtB }, denoting the right-continuous modification of E[K|FtB ] by {M2 (t)}0tT , we have by (2.6.2)

t M(t) = Φ(s) dB(s) + M2 (t). 0

Hence, it suffices to show that M2 , N (t) = 0 for any N ∈ I 2 (B), or that {M2 (t)N(t)}0tT is an {FtB }-martingale. To see this, we show E[M2 (σ)N(σ)] = 0 for any {FtB }-stopping time σ satisfying σ  T . This is shown in the following way. Since N ∈ I 2 (B), there exists a Ψ ∈ L 2 (B) !t such that N(t) = 0 Ψ(s) dB(s). Hence we have

t Ψ(s)1{sσ} dB(s) ∈ I 2 (B). N(t ∧ σ) = 0

4:36:26, subject to the Cambridge Core terms of use, .003

2.6 Martingales with respect to Brownian Motions

85

Since N(σ) ∈ H , the optional sampling theorem yields   E[M2 (σ)N(σ)] = E E[M2 (T )|FσB ]N(σ) = E[KN(σ)] = 0.



Lemma 2.6.5 If M ∈ M 2 ({FtB }) is bounded, M(0) = 0, and M, N = 0 for any N ∈ I 2 (B), then M = 0. Proof Let C be a constant satisfying P(|M(t)|  C, 0  t  T ) = 1 and set D=1+

1 M(T ). 2C

Then D  12 and E[D] = 1. Set  P(A) = E[D1A ] for A ∈ FTB . Then,  P is a B P by probability measure on (Ω, FT ). Denote the expectation with respect to   Then, since M, B = 0, E.    E[B(σ)] = E[DB(σ)] = E E[D|FσB ]B(σ) 1 = E[B(σ)] + E[M(σ)B(σ)] = 0 2C for any stopping time σ with σ  T . Hence, {B(t)}t0 is an {FtB }-martingale under  P. !t 2  − σ] = 0 and Similarly, since B(t)2 − t = 2 0 B(s) dB(s) ∈ I 2 (B), E[B(σ) P. {B(t)2 − t}t0 is also an {FtB }-martingale under  Hence, by Lévy’s theorem (Theorem 2.5.1), {B(t)} is an {FtB }-Brownian motion under  P. Since  P and P coincide on FTB , we obtain D = 1 and M(T ) = 0.  We are now in a position to give a proof of Theorem 2.6.2. Proof of Theorem 2.6.2 Following Lemma 2.6.4, write M(t) = M1 (t) + M2 (t). For any fixed K > 0, set τK = inf{t; |M2 (t)|  K} and M2K = {M2 (t ∧ τK )}0tT . If we have shown M2K , N = 0 for any N ∈ I 2 (B), we obtain M2K = 0 and, therefore, M2 = 0 by Lemma 2.6.5. This implies the assertion. We should show the martingale property of {M2K (t)N(t)}, that is, E[M2 (t ∧ τK )N(t)|F sB ] = M2 (s ∧ τK )N(s)

(2.6.3)

for s < t almost surely. Let A ∈ F sB . The martingale property of N implies   E E[M2 (t ∧ τK )N(t)|F sB ]1A∩{τK s} = E[M2 (t ∧ τK )N(t)1A∩{τK s} ]   = E M2 (s ∧ τK )E[N(t)|F sB ]1A∩{τK s} = E[M2 (s ∧ τK )N(s)1A∩{τK s} ].

4:36:26, subject to the Cambridge Core terms of use, .003

86

Stochastic Integrals and Itô’s Formula

On the other hand, write   E E[M2 (t ∧ τK )N(t)|F sB ]1A∩{τK >s} = E[M2 (t ∧ τK )N(t)1A∩{τK >s} ] = E[M2 (t ∧ τK )(N(t) − N(t ∧ τK ))1A∩{τK >s} ] + E[M2 (t ∧ τK )N(t ∧ τK )1A∩{τK >s} ]. Then, since A∩{τK > s} ∈ F s∧τK ⊂ Ft∧τK , the first term is 0. By Lemma 2.6.5, the stochastic process {M2 (t)N(t)}t0 is a martingale and hence E[M2 (t ∧ τK )N(t ∧ τK )1A∩{τK >s} ] = E[M2 (s ∧ τK )N(s ∧ τK )1A∩{τK >s} ] = E[M2 (s ∧ τK )N(s)1A∩{τK >s} ]. 

Thus we obtain (2.6.3).

The following important results are obtained from Itô’s representation theorem. Corollary 2.6.6 M 2 ({FtB }) = Mc2 ({FtB }), that is, every square-integrable {FtB }-martingale is continuous. Corollary 2.6.7 Let F be an FTB -measurable square-integrable random variable for T > 0. Then there exist predictable stochastic processes {Φi (t)}0tT !T (i = 1, 2, . . . , d) with E[ 0 Φi (s)2 ds] < ∞ such that F = E[F|F0B ] +

d i=1

T

Φi (t) dBi (t).

0

Proof Set M(t) = E[F|FtB ] − E[F|F0B ]. Since {M(t)}0tT ∈ M 2 ({FtB }), Theorem 2.6.2 implies the assertion.  Remark 2.6.8 (1) The Clark–Ocone formula (Theorem 5.3.5) shows that Φ is given as an expectation of the derivative of F in the sense of the Malliavin calculus. (2) Corollary 2.6.7 holds when T = ∞, that is, for a square-integrable random  variable which is measurable under t FtB = F∞B . (3) If B(0) is non-random, we can first prove Corollary 2.6.7 and, using it conversely, prove the representation theorem (Theorem 2.6.2) ([93]).

4:36:26, subject to the Cambridge Core terms of use, .003

2.7 Local Time, Itô–Tanaka Formula

87

2.7 Local Time, Itô–Tanaka Formula

Let $X=\{X(t)\}_{t\ge0}$ be a one-dimensional semimartingale, $M$ be its martingale part and $\langle M\rangle$ be the quadratic variation process of $M$. Then there exists a two-parameter family of random variables (a random field) $L=\{L(t,x)\}_{t\ge0,\,x\in\mathbb{R}}$, called the local time of $X$, such that

t

f (X(s)) d M (s) = f (x)L(t, x) dx R

0

for a function f on R which satisfies adequate conditions. If X is a Brownian motion and f (x) = 1A (x), A ∈ B(R), then the left hand side is equal to the total time when X has stayed in A up to time t. The Lebesgue measure of the set of times when a Brownian motion has stayed at a fixed point is 0.1 Lévy introduced the notion of local time in order to study the properties of this set in detail. The purpose of this section is to show the existence of the local times of semimartingales and to show an extension of Itô’s formula to convex functions, which are not necessarily of C 2 -class, by using local times. Let X = {X(t)}t0 be a continuous semimartingale defined on a probability space (Ω, F , P, {Ft }) which has a decomposition (2.3.1). Define the function sgn : R → {−1, 1} by ⎧ ⎪ ⎪ ⎨−1 (x  0) sgn(x) = ⎪ ⎪ ⎩1 (x > 0). Theorem 2.7.1 For each a ∈ R, there exists a continuous increasing process La = {L(t, a)}t0 such that

t sgn(X(s) − a) dX(s) + L(t, a). (2.7.1) |X(t) − a| − |X(0) − a| = 0

Moreover,

t

1{X(s)a} dL(s, a) = 0,

(2.7.2)

0

that is, La increases only at times s with X(s) = a. Remark 2.7.2 (1) (2.7.1) is called Tanaka’s formula. !t (2) Combining the trivial identity X(t) − X(0) = 0 dX(s) with (2.7.1) and setting x+ = max(x, 0), we have 1

For a Brownian motion B = {B(t)}t0 , let Z be the set of zeros of the mapping t → B(t) and !1 |Z | be its Lebesgue measure. Then E[|Z |] = 0 P(B(s) = 0) ds = 0.

4:36:26, subject to the Cambridge Core terms of use, .003

88

Stochastic Integrals and Itô’s Formula

(X(t) − a)+ − (X(0) − a)+ =

t 0

1 1{X(s)>a} dX(s) + L(t, a). 2

(3) The second derivative of φ(x) = |x − a| in the sense of distribution is 2δa (x), where δa is the Dirac measure concentrated at a. Hence, applying Itô’s formula to |X(t) − a| in a formal manner, we get

t L(t, a) = δa (X(s)) ds. 0

It is not easy to give a meaning to the right hand side, but it gives an intuitive understanding to the local time. Proof It suffices to show the case when a = 0. Moreover, we may assume that X, the martingale part M of X, the quadratic variation process M of M and the increasing process A are all bounded, because, once this is done, the general case can be proven by the localization argument via stopping times. To approximate f (x) = |x| by C 2 -class functions, let ϕ ∈ C ∞ (R) be a monotone increasing function such that ϕ(x) = −1 (x  0),

ϕ(x) = 1 (x  1)

and define a sequence of functions fn (n = 1, 2, . . .) by fn (x) = ϕ(nx)

fn (0) = 0,

(x ∈ R).

fn converges to f uniformly and fn does to the function sgn pointwise as n → ∞. Moreover, by Itô’s formula,

t fn (X(t)) − fn (X(0)) = fn (X(s)) dX(s) + Cn (t), (2.7.3) 0

where Cn (t) =

1 2

t

fn (X(s)) d M (s).

0

Since fn  0, {Cn (t)}t0 is increasing and, if m  n,

t 1{|X(s)|> m1 } (X(s)) dCn (s) = 0.

(2.7.4)

0

Fix T > 0. Then, 2  T   sgn(X(s)) − fn (X(s)) dM(s) E  0  T   =E sgn(X(s)) − fn (X(s)) 2 d M (s) 0

4:36:26, subject to the Cambridge Core terms of use, .003

2.7 Local Time, Itô–Tanaka Formula

89

converges to 0 by the bounded convergence theorem. Hence, by Doob’s inequality (Theorem 1.5.13),   T      sgn(X(s)) − fn (X(s)) dM(s) sup  0tT

0

2

converges to 0 in L and almost ! t On the other ! t surely if we take a subsequence. hand, it is easy to see that 0 fn (X(s)) dA(s) converges to 0 sgn(X(s)) dA(s) uniformly in t! as n → ∞. Hence, the first term on the right hand side of (2.7.3) t converges to 0 sgn(X(s)) dX(s) uniformly on [0, T ] almost surely. The uniform convergence of fn to f and (2.7.3) yield the uniform convergence of Cn (t). Denote the limit by La = {L(t, a)}t0 . Then La is continuous and increasing in t, and it satisfies (2.7.1). Moreover, letting n → ∞ in (2.7.4), we have

t 1{|X(s)|> m1 } dL(s, 0) = 0 0

and obtain (2.7.2) by letting m tend to ∞.



On the continuity of L(t, a) as a function of two variables, the following is known ([131]). Theorem 2.7.3 The local time {L(t, a)}t0, a∈R of a continuous semimartingale X has a modification which is continuous in t and right-continuous with the left limits in a. Proof As in the proof of Theorem 2.7.1, we may assume that X, M, M and A are bounded. Set

t

t sgn(X(s) − a) dM(s), ξ2 (t, a) = sgn(X(s) − a) dA(s), ξ1 (t, a) = 0

0

ξ3 (t, a) = |X(t) − a| − |X(0) − a|. Obviously ξ3 (t, a) is continuous in (t, a). For ξ2 (t, a), the continuity in t follows from that of A and the right-continuity in a follows from the left-continuity of the function sgn(x). For ξ1 (t, a), we shall show that, for any T > 0 and p > 0, there exists a constant K p such that  p E sup |ξ1 (t, a) − ξ1 (t, b)| p  K p |a − b| 2 . 0tT

Once this is done, applying Kolmogorov’s continuity theorem (Theorem A.5.1) to a family of random variables {L(·, a)}a∈R with values in the path space, we obtain the continuity of ξ1 (t, a) in two variables (t, a).

4:36:26, subject to the Cambridge Core terms of use, .003

90

Stochastic Integrals and Itô’s Formula

If a < b, the Burkholder–Davis–Gundy inequality (2.4.1) yields  p  t    p p  1(a,b] (X(s)) dM(s) E sup |ξ1 (t, a) − ξ1 (t, b)| = 2 E sup  0tT

0tT

0

  p 2 2 p  T  E  1(a,b] (X(s)) d M (s) . cp 0

Let g ∈ C 2 (R) satisfy 0  g  1,

supp[g ] ⊆ [−1, 2],

g (x) = 1 (0  x  1),

g (x) = g(x) = 0 (x  −1) and set

 x − a

ϕ(x) = g

b−a

.

Since 0  g  3, we have

t

t 2 1(a,b] (X(s)) d M (s)  (b − a) ϕ (X(s)) d M (s) 0 0 0

t 2 = (b − a) ϕ(X(t)) − ϕ(X(0)) − ϕ (X(s)) dX(s) 0   t  2   ϕ (X(s)) dM(s)  (b − a) |ϕ(X(t)) − ϕ(X(0))| +  0

t + |ϕ (X(s))| |dA(s)| 0

t  t   X(s) − a     |b − a| 3|X(t) − X(0)| + sup  g |dA(s)| , dM(s) + 3 b−a 0tT 0 0 where |dA(s)| denotes the integral with respect to the total variation of A. A repeated use of the Burkholder–Davis–Gundy inequality implies  t   p  2 X(s) − a    E sup  g dM(s) b − a 0tT 0  T   X(s) − a 2 p g  d M (s) 4  C 3 2p E[ M (T ) 4p ].  C pE p  b−a  0

By the boundedness of X, M and A, there exists a constant K p such that  p E sup |ξ1 (t, a) − ξ1 (t, b)| p  K p |b − a| 2 .



0tT

Remark 2.7.4 From the proof we see that, if A = 0, that is, if X is a continuous local martingale, then the local time has a modification which is continuous as

4:36:26, subject to the Cambridge Core terms of use, .003

2.7 Local Time, Itô–Tanaka Formula

91

a function of the two variables (t, a). This was first proven by Trotter ([119]) when X is a Brownian motion. Using the local time, we can extend Itô’s formula to convex functions or functions given as the difference of two convex functions. The extended formula is called the Itô–Tanaka formula. Let f be a convex function. Since f (z) − f (y) f (z) − f (x)  z−x z−y for x < y < z, f has a left-derivative D− f and a right-derivative D+ f : (D± f )(x) = lim

h→±0

f (x + h) − f (x) . h

Note that the left-(right-)derivative is a left-(right-)continuous monotone increasing function and satisfies (D+ f )(x)  (D− f )(y)  (D+ f )(y)

(x < y).

Denote by ν f the measure determined by the monotone increasing function D− f : ν f ([a, b)) = (D− f )(b) − (D− f )(a)

(a < b).

Theorem 2.7.5 For a convex function f ,

t 1 (D− f )(X(s)) dX(s) + L(t, a)ν f (da). f (X(t)) = f (X(0)) + 2 R 0 Proof We may assume that all of X, M, M and A are bounded and ν f has a compact support by the localization argument. Then, there exist constants α and β such that

1 |x − a|ν f (da), f (x) = αx + β + 2 R

1 (D− f )(x) = α + sgn(x − a)ν f (da) (2.7.5) 2 R (see [98, Appendix §3]). Hence, by Tanaka’s formula (2.7.1), we have

1 f (X(t)) = αX(t) + β + |X(t) − a|ν f (da) 2 R = αX(t) + β

t  1  sgn(X(s) − a) dX(s) + L(t, a) ν f (da) + |X(0) − a| + 2 R 0

4:36:26, subject to the Cambridge Core terms of use, .003

92

Stochastic Integrals and Itô’s Formula = α(X(t) − X(0)) + f (X(0))

 1  t sgn(X(s) − a) dX(s) + L(t, a) ν f (da). + 2 R 0

Using Fubini’s theorem for stochastic integrals (Theorem 2.7.6 below) and (2.7.5), we obtain

 1  t sgn(X(s) − a) dX(s) ν f (da) 2 R 0

 1 t = sgn(X(s) − a)ν f (da) dX(s) 2 0 R

t = (D− f )(X(s)) dX(s) − α(X(t) − X(0)). 0

Substituting this into the above identity, we arrive at the conclusion.



Theorem 2.7.6 Let ν be a σ-finite measure on (R, B(R)) and h be a continuous function on R with a compact support. Then,

 t  h(a) sgn(X(s) − a) dM(s) ν(da) 0 R

t   h(a) sgn(X(s) − a)ν(da) dM(s). (2.7.6) = 0

R

Proof Suppose that supp[h] ⊆ [c, d] and set ξk = c + 2−n (d − c)k (k = 0, 1, . . . , 2n ). Define a function Fn by Fn (x) =

n 2 −1

h(ξk ) sgn(x − ξk )μ([ξk , ξk+1 )).

k=0

Then, n 2 −1

k=0

h(ξk ) 0

t

sgn(X(s)−ξk ) dM(s)·μ([ξk , ξk+1 )) =

t

Fn (X(s)) dM(s). (2.7.7) 0

!t In the proof of Theorem 2.7.3, it was shown that I(t, a) := 0 sgn(X(s) − a) dM(s) has a modification which is continuous in (t, a). Hence, the left hand side of (2.7.7) converges to that of (2.7.6) almost surely. On the other hand, !d since Fn (x) converges to c h(a) sgn(x − a)ν(da) uniformly, the right hand side of (2.7.7) converges to that of (2.7.6) in L2 . From these, we obtain the assertion of the theorem. 

4:36:26, subject to the Cambridge Core terms of use, .003

2.8 Reflecting Brownian Motion and Skorohod Equation

93

Suppose that f is of C 2 -class. Then, comparing the development of f (X(t)) by Itô’s formula and setting ϕ = f  , we obtain

t

ϕ(X(s)) d M (s) = ϕ(a)L(t, a) da. (2.7.8) R

0

By the monotone class theorem (Theorem A.2.5), this result, called the occupation time formula, holds for any bounded Borel measurable function ϕ. Theorem 2.7.7 Formula (2.7.8) holds for any bounded Borel measurable function ϕ on R.

2.8 Reflecting Brownian Motion and Skorohod Equation

Let $B=\{B(t)\}_{t\ge0}$ be a one-dimensional Brownian motion starting from $x>0$ defined on a probability space $(\Omega,\mathcal{F},P)$ and define $B_+=\{B_+(t)\}_{t\ge0}$ by $B_+(t)=|B(t)|$. Then, for all $0=t_0<t_1<t_2<\cdots<t_n$ and $A_i\in\mathcal{B}(\mathbb{R}_+)$, we have
$$
P\big(B_+(t_1)\in A_1,\,B_+(t_2)\in A_2,\,\ldots,\,B_+(t_n)\in A_n\big)
$$

n  = dx1 dx2 · · · dxn g+ (t j − t j−1 , x j−1 , x j ), A1

A2

An

j=1

where x0 = x and (y+x)2  1  − (y−x)2 e 2t + e− 2t g+ (t, x, y) = √ 2πt

(x, y  0, t > 0).

The stochastic process B+ on [0, ∞) is called a reflecting Brownian motion. The aim of this section is to present some results related to a reflecting Brownian motion. Lemma 2.8.1 Let x  0 and φ be an R-valued continuous function on [0, ∞) with φ(0) = 0. Then, there exists a unique continuous function k : [0, ∞) → R which satisfies the following three conditions: (i) x(t) := x + φ(t) + k(t)  0 (t  0), (ii) k(0) ! t = 0 and k is increasing, (iii) 0 1{0} (x(s)) dk(s) = k(t), that is, k increases only at s with x(s) = 0. Proof First we show the uniqueness. Suppose that a continuous function  k also satisfies the conditions (i), (ii), (iii) and set  x(t) = x + φ(t) +  k(t). Assume x(t1 ) and set that there exists t1 such that x(t1 ) > 

4:36:26, subject to the Cambridge Core terms of use, .003

94

Stochastic Integrals and Itô’s Formula x(t)}. t2 = max{t < t1 ; x(t) = 

Then, for t ∈ (t2 , t1 ], x(t) >  x(t)  0. By the condition (iii), k is constant on (t2 , t1 ] and k(t2 ) = k(t1 ). Hence, 0 < x(t1 ) −  x(t1 ) = k(t1 ) −  k(t1 ) = k(t2 ) −  k(t1 )  k(t2 ) −  k(t2 ) = 0, which is a contradiction. Therefore, x(t)   x(t) (t  0), which means k   k.  The same argument shows k  k and we obtain k =  k. In order to show the existence, set  k(t) = max 0, max{−(x + φ(s))} . 0st

k is increasing and k(0) = 0. Moreover, x + φ(t) + k(t)  k(t) − max{−(x + φ(s))}  0. 0st

This means that k satisfies the condition (i). To show (iii), take arbitrary ε > 0 and let (t1 , t2 ) be an interval contained in the open set Oε = {s  0; x(s) > ε}. It suffices to show k(t1 ) = k(t2 ). For this purpose note −(x + φ(s)) = k(s) − x(s)  k(t2 ) − ε

(t1  s  t2 ).

Then, we have  k(t2 ) = max k(t1 ), max {−(x + φ(s))}  max{k(t1 ), k(t2 ) − ε} t1 st2

and obtain k(t2 ) = k(t1 ).



Theorem 2.8.2 Let x  0 and B = {B(t)}t0 be a one-dimensional Brownian motion with B(0) = 0. Assume that there exists a continuous stochastic process = { (t)}t0 satisfying the following three conditions: (i) X(t) := x + B(t) + (t)  0 (t  0), (ii) (0) ! t = 0 and is an increasing process, (iii) 0 1{0} (X(s)) d (s) = (t). Then, X = {X(t)}t0 is a reflecting Brownian motion on [0, ∞). The system of equations satisfying (i)–(iii) is called a Skorohod equation, named after Skorohod, who was the first to solve a stochastic differential equation with a reflecting boundary condition.

4:36:26, subject to the Cambridge Core terms of use, .003

2.8 Reflecting Brownian Motion and Skorohod Equation

95

Proof X and are uniquely determined from x and the Brownian motion B by Lemma 2.8.1. Hence it is sufficient to show that, letting W = {W(t)}t0 be a one-dimensional Brownian motion and setting X(t) = |W(t)|, there  = { B(t)}  t0 and a continuous increasing process exists a Brownian motion B  = { (t)}t0 which satisfy the conditions (i) and (ii). Let {L(t, a)}t0 be the local time of W at a and set L(t) = L(t, 0). By Tanaka’s formula,

t |W(t)| = sgn(W(s)) dW(s) + L(t). 0

Set  = B(t)

t

sgn(W(s)) dW(s) and  (t) = L(t).

0

  is a {FtW }-Brownian motion by Lévy’s theorem Then, B (t) = t and B (Theorem 2.5.1), where FtW = σ{W(s); s  t}. By Theorem 2.7.7,

t

t

ε 1(−ε,ε) (W(s)) ds = 1[0,ε) (X(s)) ds = L(t, a) da 0

−ε

0

and 1  (t) = lim ε↓0 2ε

t

1[0,ε) (X(s)) ds. 0

!t  Hence 0 1{0} (X(s)) d (s) =  (t). Thus the triplet {X, B, } satisfies the conditions of the theorem.  Theorem 2.8.3 Let B = {B(t)}t0 be a one-dimensional Brownian motion starting from 0 and set m(t) = min{B(s); s  t}. Moreover, let {L(t)}t0 be the local time at 0 of B. Then, the two-dimensional continuous stochastic processes {(B(t) − m(t), −m(t))}t0 and {(|B(t)|, L(t))}t0 have the same probability law. Proof By Tanaka’s formula and Theorem 2.5.1, the stochastic process {β(t)} defined by

t β(t) = |B(t)| − L(t) = sgn(B(s)) dB(s) 0

is a Brownian motion. Consider the decompositions |B(t)| = β(t) + L(t) and B(t) − m(t) = B(t) + (−m(t)) of Lemma 2.8.1(i) for {β(t)} and {B(t)}. {(|B(t)|, L(t))} and {(B(t)−m(t), −m(t))} are obtained from the Brownian motions {β(t)} and {B(t)}, respectively, through the same deterministic procedure. Hence their distributions coincide. 

4:36:26, subject to the Cambridge Core terms of use, .003

96

Stochastic Integrals and Itô’s Formula

Corollary 2.8.4 (1) Let M(t) = max{B(s); st}. Then, {(M(t)−B(t), M(t))}t0 and {(|B(t)|, L(t))}t0 have the same probability law. (2) limt→∞ L(t) = ∞ almost surely. Proof (1) Apply the theorem to {−B(t)}t0 . (2) The result follows from the fact that supt0 B(t) = ∞ almost surely.



Remark 2.8.5 (1) Let {L(t, a)} be the local time of B at a and set τa = inf{t > 0; B(t) = a}. Since {Bτa +t − a}t0 is a Brownian motion starting from 0, limt→∞ L(t, a) = ∞ almost surely. (2) In connection with Corollary 2.8.4, an important result is Pitman’s theorem: {2M(t) − B(t)}t0 has the same probability law as a three dimensional Bessel process {ρ(t)}t0 starting from 0 (Theorem 4.8.7). Moreover, set J(t) = inf st ρ(s). Then, {(2M(t) − B(t), M(t))}t0 has the same probability law as {(ρ(t), J(t))}t0 . For the details and related topics, see [45, 83, 98].

2.9 Conformal Martingales

When stochastic analysis is applied to complex analysis, one of the starting points is a complex-valued stochastic process whose real and imaginary parts are both local martingales. The purpose of this section is to present some fundamental properties of such stochastic processes. We set $\mathrm{i}=\sqrt{-1}$.
A complex-valued stochastic process $Z=\{Z(t)\}_{t\ge0}$ is a complex-valued continuous locally square-integrable martingale if both its real and imaginary parts are elements of $\mathcal{M}^2_{c,loc}$, that is, $Z$ is represented as $Z(t)=X(t)+\mathrm{i}Y(t)$ by $X=\{X(t)\}_{t\ge0}$, $Y=\{Y(t)\}_{t\ge0}\in\mathcal{M}^2_{c,loc}$. Such a stochastic process $Z$ is also denoted by $Z=X+\mathrm{i}Y$. The set of complex-valued continuous locally square-integrable martingales is denoted by $\mathcal{M}^2_{c,loc}(\mathbb{C})$.
For $Z_j=X_j+\mathrm{i}Y_j\in\mathcal{M}^2_{c,loc}(\mathbb{C})$ $(j=1,2)$ we define $\langle Z_1,Z_2\rangle=\{\langle Z_1,Z_2\rangle(t)\}_{t\ge0}$ by

Z1 , Z2 (t) = X1 , X2 (t) − Y1 , Y2 (t) + i{ X2 , Y1 (t) + X1 , Y2 (t)}. Note that , is complex bi-linear in two stochastic processes. For Z ∈ 2 2 Mc,loc (C), Z 2 − Z, Z = {Z(t)2 − Z, Z (t)}t0 ∈ Mc,loc (C). Moreover, defining 2 2 Z ∈ Mc,loc (C) by Z = X − iY for Z = X + iY ∈ Mc,loc (C), we have

Z, Z = X + Y .

(2.9.1)

4:36:26, subject to the Cambridge Core terms of use, .003

2.9 Conformal Martingales

97

2 (C) is called a conformal martingale if Definition 2.9.1 Z = X + iY ∈ Mc,loc

X = Y , X, Y = 0. Moreover, when P(Z(0) = a) = 1 for a ∈ C, it is called a conformal martingale starting from a. 2 Z = X + iY ∈ Mc,loc (C) is a conformal martingale if and only if Z, Z = 0. 2 Representing Z as

Z(t)2 = X(t)2 − X (t) − (Y(t)2 − Y (t)) + X (t) − Y (t) + 2i(X(t)Y(t) − X, Y (t)) + 2i X, Y (t), we easily see that Z is a conformal martingale if and only if Z 2 = {Z(t)2 }t0 ∈ 2 (C). Moreover, by the formula (2.9.1), if Z is a conformal martingale, Mc,loc then Z, Z = 2 X . Let {(B1 (t), B2 (t))}t0 be a two-dimensional Brownian motion. Then, β = {β(t) = B1 (t) + iB2 (t)}t0 is a conformal martingale. This is called a complex Brownian motion. Conformal martingales are not closed under summation. For example, let β = {β(t) = B1 (t) + iB2 (t)}t0 be a complex Brownian motion starting from a ∈ C. Then, β is a conformal martingale starting from a ∈ C. Since β+β = 2B1 and B1 (t) = t, β + β is not a conformal martingale. The sum of two conformal martingales Z and W is a conformal martingale if and only if Z, W = 0. The next theorem shows that conformal martingales are closed under the transforms defined by holomorphic functions and that they are time changed complex Brownian motions. 2 Theorem 2.9.2 (1) Z ∈ Mc,loc (C) is a conformal martingale if and only if 2 (C) for any holomorphic function f . f (Z) = { f (Z(t))}t0 ∈ Mc,loc 2 (C) is a conformal martingale if and only if f (Z) is a conformal (2) Z ∈ Mc,loc martingale for any holomorphic function f : C → C. Moreover,

t 8 7 | f  (Z(s))|2 d Z, Z (s), f (Z), f (Z) (t) = 0 

where f is the complex derivative of f . (3) If Z = X + iY = {Z(t)}t0 is a conformal martingale, then there exists a complex Brownian motion β such that Z(t) = β( X (t)). Proof (1) As was shown above, the sufficiency follows if we show Z 2 = 2 (C). This is checked by taking f (z) = z2 . {Z(t)2 }t0 ∈ Mc,loc 2 (C) is a conformal martingale. Conversely, assume that Z = X + iY ∈ Mc,loc Let f be a holomorphic function and write f (x, y) = ϕ(x, y) + iψ(x, y) with

4:36:26, subject to the Cambridge Core terms of use, .003

98

Stochastic Integrals and Itô’s Formula

R-valued functions ϕ and ψ, where z = x + iy. Since ϕ and ψ are harmonic functions, Itô’s formula yields ∂ϕ ∂ϕ (X(t), Y(t))dX(t) + (X(t), Y(t))dY(t), ∂x ∂y ∂ψ ∂ψ d(ψ(X(t), Y(t))) = (X(t), Y(t))dX(t) + (X(t), Y(t))dY(t). ∂x ∂y

d(ϕ(X(t), Y(t))) =

(2.9.2)

2 2 Since X, Y ∈ Mc,loc , f (Z) ∈ Mc,loc (C). (2) Although the first assertion immediately follows from (1) since the compositions of holomorphic functions are holomorphic, we give an alternative proof. Write Z = X + iY and f = ϕ + iψ as above. By (2.9.2) and the Cauchy– Riemann relation, we have 2  ∂ϕ 2  ∂ϕ (X(t), Y(t)) + (X(t), Y(t)) d X (t) d ϕ(X, Y) (t) = ∂x ∂y 2  ∂ψ 2  ∂ψ (X(t), Y(t)) + (X(t), Y(t)) d X (t) = ∂x ∂y

= d ψ(X, Y) (t) and d ϕ(X, Y), ψ(X, Y) (t) =

∂ϕ

∂ψ (X(t), Y(t)) ∂x ∂ψ ∂ϕ (X(t), Y(t)) (X(t), Y(t)) d X (t) + ∂y ∂y

∂x

(X(t), Y(t))

= 0. Hence, f (Z) = ϕ(X, Y) + iψ(X, Y) is a conformal martingale. Moreover, since  ∂ϕ 2  ∂ϕ 2  ∂ψ 2  ∂ψ 2 | f  |2 = + = + , ∂x ∂y ∂x ∂y we have

and

7 8 7 8 d ϕ(X, Y) (t) = d ψ(X, Y) (t) = | f  (Z(t))|2 d X (t) 8 8

f (Z), f (Z) = ϕ(X, Y) (t) + ψ(X, Y) (t),

Z, Z = 2 X .

The assertion follows from these identities. (3) By Theorem 2.5.6, there exists a two-dimensional Brownian motion {(B1 (t), B2 (t))}t0 such that X(t) = B1 ( X (t)),

Y(t) = B2 ( Y (t)).

4:36:26, subject to the Cambridge Core terms of use, .003

2.9 Conformal Martingales

99

Define a complex Brownian motion β by β = B1 + iB2 . Then, since X = Y , Z(t) = β( X (t)).  As seen in the next chapter, the probability that a two-dimensional Brownian motion hits a fixed point is 0. We can prove this property for conformal martingales by using the representation via complex Brownian motions. Theorem 2.9.3 Let Z = {Z(t)}t0 be a conformal martingale starting from a ∈ C. Then, for any b ∈ C, we have P(Z(t) ∈ C \ {b}, t ∈ (0, ∞)) = 1. Proof By Theorem 2.9.2(3), it suffices to show P(ζ(t) ∈ C \ {b}, t ∈ (0, ∞)) = 1 for a complex Brownian motion ζ = {ζ(t)}t0 with P(ζ(0) = a) = 1. While this immediately follows from the same property of two-dimensional Brownian motion, we give another proof via complex analysis in the case when a  b. Let {(B1 (t), B2 (t))}t0 be a two-dimensional Brownian motion starting from 0. Set β = {β(t) = B1 (t) + iB2 (t)}t0 and W(t) = (a − b)eβ(t) + b. Then it is easy to show   P W(t)  b, t ∈ (0, ∞) = 1.

(2.9.3)

Since W = {W(t)}t0 is a conformal martingale by Theorem 2.9.2 and β, β (t) = 2t, we have

t |b − a|2 e2B1 (s) ds.

W, W (t) = 2 !n

0

For each n > 0, 0 (max{B1 (s), 0})2 ds has the same probability law as !1 !1 n2 0 (max{B1 (s), 0})2 ds (Theorem 1.2.8). Since 0 (max{B1 (s), 0})2 ds > 0, P-a.s., by Fatou’s lemma and the bounded convergence theorem,

∞ −1  (max{B1 (s), 0})2 ds E 1+ 0

n  −1  lim inf E 1 + (max{B1 (s), 0})2 ds n→∞ 0

1  −1 = lim inf E 1 + n2 (max{B1 (s), 0})2 ds = 0. n→∞

0

4:36:26, subject to the Cambridge Core terms of use, .003

100

Stochastic Integrals and Itô’s Formula

Therefore,

 P



 (max{B1 (s), 0})2 ds = ∞ = 1.

0

Moreover, using the elementary inequality e2x  (max{x, 0})2 , we obtain

t e2B1 (s) ds = ∞, P-a.s. (2.9.4) lim W, W (t) = lim 2|b − a|2 t→∞

t→∞

0

Using Theorem 2.9.2 again, we see that there exists a complex Brownian motion ζ = {ζ(t)}t0 starting from a such that   t |b − a|2 e2B1 (s) ds . W(t) = ζ 0

Hence (2.9.3) and (2.9.4) imply the desired assertion.



2 (C). Definition 2.9.4 Let Z = X + iY ∈ Mc,loc 2 (1) Denote by Lloc (Z) the set of C-valued predictable processes Φ = ξ + iη = {Φ(t) = ξ(t) + iη(t)}t0 such that + * t |Φ(s)|2 d Z, Z (s) < ∞ = 1 (t  0). P 0 2 (Z) with respect to Z, (2) The complex stochastic integral of Φ = ξ + iη ∈ Lloc t 2 Φ(s) dZ(s) ∈ Mc,loc (C), I Z (Φ) = t0

0

is defined by

t 0

t

Φ(s) dZ(s) = 0

{ξ(s) dX(s) − η(s) dY(s)}

t +i {η(s) dX(s) + ξ(s) dY(s)}. 0

2 2 (X) ∩ Lloc (Y) by (2.9.1), the complex stochastic integral is Since ξ, η ∈ Lloc 2 2 2   well defined. For Z, Z ∈ Mc,loc (C), Φ ∈ Lloc (Z) and Ψ ∈ Lloc (Z),

t 8 7 Z   Φ(s)Ψ(s) d Z, Z (s), (2.9.5) I (Φ), I Z (Ψ) = 0

where, for increasing processes ϕ and ψ, d(ϕ + iψ) = dϕ!+ idψ. Therefore, if Z t is a conformal martingale, then this is also the case for { 0 Φ(s) dZ(s)}t0 . Let Z1 = {Z1 (t)}t0 , . . . , Zn = {Zn (t)}t0 be a conformal martingale. Itô’s formula for a holomorphic function f : Cn → C is given by

4:36:26, subject to the Cambridge Core terms of use, .003

2.9 Conformal Martingales f (Z1 (t), . . . , Zn (t)) = f (Z1 (0), . . . , Zn (0)) +

n 0

j=1

t

101

∂f (Z1 (s), . . . , Zn (s)) dZ j (s) ∂z j

t 2 1 ∂ f + (Z1 (s), . . . , Zn (s)) d Z j , Zk (s). 2 1 jkn 0 ∂z j ∂zk

(2.9.6)

In particular, if Z j , Zk = 0 ( j  k), then { f (Z1 (t), . . . , Zn (t))}t0 is a conformal martingale. By Itô’s formula, the logarithm and the p-th root of a conformal martingale are defined. Theorem 2.9.5 Let Z = {Z(t)}t0 be a conformal martingale starting from a ∈ C \ {0} and p ∈ N. Then, there exist conformal martingales W = {W(t)}t0 and W p = {W p (t)}t0 such that Z(t) = eW(t)

Z(t) = (W p (t)) p .

and

Proof It suffices to show the first identity. In fact, setting W p (t) = e obtain Z(t) = (W p (t)) p from it. By Theorem 2.9.3,

W(t) p

, we

P(Z(t)  0, t ∈ [0, ∞)) = 1. Combining this identity with the continuity of {Z(t)}t0 , we obtain 1 2 ∈ Lloc (Z). Φ(t) = Z(t) t0 Take α ∈ C such that a = eα and set

t

W(t) = α + 0

1 dZ(s). Z(s)

Then W = {W(t)}t0 is a conformal martingale.

Z, W = 0 by (2.9.5). Applying Itô’s formula (2.9.6) to f (z, w) = ze−w , we  see Z(t)e−W(t) = 1. An interesting application of conformal martingales is to show the little Picard theorem, which asserts that the range of an entire function on C is either C or C − a for some a ∈ C. For details, see [12].

4:36:26, subject to the Cambridge Core terms of use, .003

3 Brownian Motion and the Laplacian

We prove the Markov and the strong Markov properties of Brownian motion. We also mention its recurrence and transience. The transition density of Brownian motion is given by the fundamental solution to the heat equation for the Laplacian, and Brownian motions are closely related to various differential equations. In this chapter, we show that the solutions of the heat equation and the Dirichlet problem for the Laplacian are represented as expectations with respect to Brownian motions. We also introduce the Feynman–Kac formula, which is associated with the Laplacian and a scalar potential, and its applications.

3.1 Markov and Strong Markov Properties

Suppose that the behavior of a stochastic process $X=\{X(t)\}_{t\ge0}$ up to time $s$ is given. If the probability law of the behavior of $X$ after $s$ is determined only by the position $X(s)$ at $s$, $X$ is said to have the Markov property. Moreover, if $X$ has this property not only for fixed times but also for stopping times, $X$ is said to have the strong Markov property. The purpose of this section is to show that Brownian motions have both of these properties and to give applications.
Let $W$ be the path space given by $W=\{w:[0,\infty)\to\mathbb{R}^d;\ w\text{ is continuous}\}$. We denote the coordinate process by $B(t):W\ni w\mapsto w(t)\in\mathbb{R}^d$ $(t\ge0)$ and let $\mathcal{W}$ be the smallest σ-field under which the coordinate process is measurable. For each $x\in\mathbb{R}^d$, denote by $P_x$ the probability measure on $W$ under which the coordinate process $\{B(t)\}_{t\ge0}$ is a Brownian motion starting from $x$. Set $\mathcal{F}^0_t=\sigma\{B(u);\,u\le t\}$ and let $\{\mathcal{F}_t\}$ be the right-continuous filtration given by

3.1 Markov and Strong Markov Properties Ft =



103

F s0 .

s>t

In this chapter, these {B(t)}t0 and P x are dealt with. Define the shift θ s (s  0) on the path space W by (θ s w)(t) = w(s + t)

(t  0).

If F : W → R is a W -measurable function, F ◦ θ s is determined by the path of the Brownian motion after s. For example, if F is given by F(w) =

n 

fi (w(ti ))

(3.1.1)

i=1

for bounded measurable functions f1 , f2 , . . . , fn on Rd and 0 < t1 < t2 < · · · < tn , we have n  (F ◦ θ s )(w) = fi (w(s + ti )). i=1

Theorem 3.1.1 (Markov property) Let s  0 and F be a bounded W measurable function (random variable). Denote the expectation with respect to P x by E x . Then, for any x ∈ Rd , E x [F ◦ θ s |F s ] = EB(s) [F],

P x -a.s.,

(3.1.2)

where, on the right hand side, y = B(s) is substituted into the function ϕ(y) = Ey [F] on Rd . The measurability of the mapping y → Ey [F] can be shown by the monotone class theorem. The proof is left to the reader. Proof What should be shown is the identity E x [(F ◦ θ s )1A ] = E x [EB(s) [F]1A ]

(3.1.3)

for any A ∈ F s . First, suppose that F is defined by (3.1.1) and that A is given by A = {w; w(s1 ) ∈ A1 , w(s2 ) ∈ A2 , . . . , w(sm ) ∈ Am }

(3.1.4)

for 0 < h < t1 , 0 < s1 < s2 < · · · < sm  s + h and A1 , A2 , . . . , Am ∈ B(Rd ). |y−x|2 d Then, letting g(t, x, y) = (2πt)− 2 e− 2t be the transition density of a Brownian motion, we have

g(t1 , y, y1 ) f1 (y1 ) dy1 g(t2 − t1 , y1 , y2 ) f2 (y2 ) dy2 ϕ(y) = Ey [F] = Rd Rd

× ··· × g(tn − tn−1 , yn−1 , yn ) fn (yn ) dyn . Rd

4:36:28, subject to the Cambridge Core terms of use, .004

104

Brownian Motion and the Laplacian

Set

g(t1 − h, y, y1 ) f1 (y1 ) dy1 g(t2 − t1 , y1 , y2 ) f2 (y2 ) dy2 Rd Rd

× ··· × g(tn − tn−1 , yn−1 , yn ) fn (yn ) dyn .

 ϕ(y, h) =

Rd

Then,



E x [(F ◦ θ s )1A ] = g(s1 , x, x1 ) dx1 g(s2 − s1 , x1 , x2 ) dx2 A1 A2

× ··· × g(sm − sm−1 , xm−1 , xm ) dxm g(s + h − sm , xm , y) ϕ(y, h) dy Rd

Am

and ϕ(B(s + h), h)1A ]. E x [(F ◦ θ s )1A ] = E x [

(3.1.5)

0 Denote by G the totality of A ∈ F s+h which satisfies (3.1.5). Then, by the 0 and, therefore, for all Dynkin class theorem, (3.1.5) holds for all A ∈ F s+h A ∈ Fs . Let {yh } ⊂ Rd be a sequence with yh → y (h → 0). Then, by the Lebesgue convergence theorem,  ϕ(yh , h) → ϕ(y). Hence, by the bounded convergence theorem,

E x [(F ◦ θ s )1A ] = E x [ϕ(B(s))1A ] = E x [EB(s) [F]1A ] for any A ∈ F s and F of the form (3.1.1). Fix A ∈ F s and denote by H the set of functions F on W which satisfies (3.1.3). Letting A be the totality of the subsets of W given as {w; w(tk ) ∈ Ak (k = 1, 2, . . . , n)}, Ak ∈ B(Rd ), we have shown that the indicator functions of the sets in A belong to H . Therefore, by Theorem A.2.6, (3.1.3) holds for any σ(A )-measurable, that is, W -measurable bounded function F.  Theorem 3.1.1 shows that {B(s + t) − B(s)}t0 is a Brownian motion independent of F s for any s  0. Moreover, by Theorem 3.1.1, E x [F ◦ θ s |F s ] is F s0 -measurable. Hence, by the tower property of the conditional expectation, E x [F ◦ θ s |F s ] = E x [F ◦ θ s |F s0 ]. Put Φ=

m 

(3.1.6)

ϕ j (w(t j ))

j=1

for t1 < t2 < · · · < tm and bounded measurable functions ϕ1 , ϕ2 , . . . , ϕm on Rd . Then E x [Φ|F s ] = E x [Φ|F s0 ].

(3.1.7)

4:36:28, subject to the Cambridge Core terms of use, .004

3.1 Markov and Strong Markov Properties

105

In fact, if tk  s < tk+1 , decomposing Φ by Φ = I 1 · I2 ,

I1 =

k 

ϕ j (w(t j )), I2 =

j=1

m 

ϕ j (w(t j )),

k+1

we obtain (3.1.7) by (3.1.6) since I1 is F s0 -measurable. The monotone class theorem yields the following. Theorem 3.1.2 Let Φ be a bounded W -measurable random variable. Then, for any s  0 and x ∈ Rd , E x [Φ|F s ] = E x [Φ|F s0 ],

P x -a.s.

(3.1.8)

From Theorem 3.1.2, the following important result is deduced. Theorem 3.1.3 (Blumenthal 0-1 law) For any x ∈ Rd and A ∈ F0 , P x (A) is 0 or 1. Proof By Theorem 3.1.2, 1A = E x [1A |F0 ] = E x [1A |F00 ],

P x -a.s.

Since P x (B(0) = x) = 1 and P x (C) is 0 or 1 for C ∈ F00 , the conditional expectation in the third term coincides with the usual expectation and 1A =  P x (A). Therefore, P x (A) is 0 or 1. While several properties of Brownian motions are deduced from the Blumenthal 0-1 law, we only show the following. For more results, see, for example, [20, 56]. Theorem 3.1.4 Set τ+ = inf{t  0; B(t) > 0}. Then P0 (τ+ = 0) = 1. Proof Since P0 (τ+  t)  P0 (B(t) > 0) =

1 2

for t > 0,

P0 (τ+ = 0) = lim P0 (τ+  t)  t↓0

1 . 2

Hence, by the Blumenthal 0-1 law, P0 (τ+ = 0) = 1.



Also, for τ− = inf{t  0; B(t) < 0}, P0 (τ− = 0) = 1. Hence, by the continuity of the path of Brownian motions, the following holds. Corollary 3.1.5 Set τ0 = inf{t > 0; B(t) = 0}. Then P0 (τ0 = 0) = 1.

4:36:28, subject to the Cambridge Core terms of use, .004

106

Brownian Motion and the Laplacian

Next we show the strong Markov property of Brownian motions. Let σ be an {Ft }-stopping time. The information up to time σ is given by the σ-field Fσ = {A; A ∩ {σ  t} ∈ Ft for all t  0}. Theorem 3.1.6 (strong Markov property) Let F : [0, ∞)×W → F(s, w) ∈ R be bounded and B([0, ∞)) × W -measurable and σ be an {Ft }-stopping time. Then, for all x ∈ Rd ,  E x [Fσ ◦ θσ |Fσ ] = Ey [Ft ] y=B(σ),t=σ holds P x -a.s. on {σ < ∞}, where (Fσ ◦ θσ )(w) = F(σ(w), θσ(w) w) and Ft (w) = F(t, w). Proof For A ∈ Fσ , we shall show E x [(Fσ ◦ θσ )1A∩{σ 0). a+b (2) By the Markov property of Brownian motion, the probability in question (1) coincides with E(1) x [P B(s) (τy < ∞)], which is 1 by (1). (3) Take s = 1, 2, . . . in (2). Then, almost surely, there exists tn  n such that  B(tn ) = y for any n = 1, 2, . . . . P(1) 0 (τ−a < τb ) =

Set φ(x) = log |x| if d = 2 and φ(x) = |x|2−d if d  3. Then Δφ(x) =

d  ∂ 2 φ(x) = 0 ∂xi i=1

(x  0),

that is, φ is a harmonic function. Suppose r < |x| < R in the rest of this section. For a Brownian motion B = {B(t)}t0 starting from x, Itô’s formula yields d t ∂φ(x) φ(B(t)) = φ(B(0)) + (B(s)) dBi (s) (t < σr ∧ σR ), i ∂x 0 i=1 where σr = inf{t > 0; |B(t)| = r}. Since ∂φ(x) is bounded and continuous on ∂x j {y; r < |y| < R}, the optional stopping theorem implies φ(x) = E(d) x [φ(B(t ∧ σr ∧ σR ))]. Letting t → ∞ by the bounded convergence theorem, we obtain φ(x) = E(d) x [φ(B(σr ∧ σR ))] (d) = φ0 (r)P(d) x (σr < σR ) + φ0 (R)P x (σR < σr ), (d) where we have set φ0 (|x|) = φ(x). Since P(d) x (σr < σR ) + P x (σR < σr ) = 1,

P(d) x (σr < σR ) =

φ0 (R) − φ(x) . φ0 (R) − φ0 (r)

(3.2.1)

Theorem 3.2.2 Let G be an open subset of R2 . Then, for any x ∈ R2 , there exists an increasing sequence {tn }∞ n=1 satisfying B(tn ) ∈ G and tn → ∞ almost surely under P(2) x . Proof It suffices to show the case when G = {y ∈ R2 ; |y| < r}, r > 0. Letting 2 R → ∞ in (3.2.1), we have P(2) x (σr < ∞) = 1 for all x ∈ R . Hence a similar argument to the proof of Theorem 3.2.1 yields the conclusion.  2 Theorem 3.2.3 If d  2, then P(d) x (τ{0} = ∞) = 1 for all x ∈ R .

4:36:28, subject to the Cambridge Core terms of use, .004

110

Brownian Motion and the Laplacian

Proof It is sufficient to show the case when d = 2. While we showed the main part of the assertion in Theorem 2.9.3, we here give a proof by using (3.2.1). Let x  0. For R > 0, (2) P(2) x (σ0 < σR )  lim P x (σr < σR ) = 0. r→0

Hence, letting R → ∞, we obtain the assertion when x  0. Suppose that x = 0. By the Markov property of Brownian motion and the conclusion when x  0, (2) (2) P(2) 0 (there exists a t  ε such that B(t) = 0) = E0 [P B(ε) (σ0 < ∞)] = 0

for any ε > 0, since P(2) 0 (B(ε)  0) = 1. Letting ε → 0 we obtain the assertion.  + * Remark 3.2.4 If d  2 and x  0, then P(d) x limr↓0 σr = ∞ = 1. Theorem 3.2.5 Let d  3. (1) If r < |x|, then P(d) x (σr < ∞) =

 r d−2 . |x|

+ * (2) For all x ∈ Rd , P(d) x limt→∞ |B(t)| = ∞ = 1.

Proof (1) Let R → ∞ in (3.2.1). Then we immediately obtain the assertion. (2) By Theorem 3.2.1 (1), P(d) 0 (σR < ∞) = 1 for all R > 0. Denote by An the √ event that |B(t)| > n for all t  σn . Then, by the strong Markov property of Brownian motion and (3.2.1), we have  1 d−2 c (d) (d) √ P(d) . x (An ) = E x [P B(σn ) (σ n < ∞)] = √ n By the monotonicity of probability measure, * + (d) P(d) x lim sup An  lim sup P x (An ) = 1. n→∞

n→∞

Hence, almost surely, there exists an increasing sequence {ni } with ni → ∞ √ such that |B(t)| > ni for all t > σni . This means that limt→∞ |B(t)| = ∞ almost surely.  For a one-dimensional Brownian motion, the expectation of the exit time from an interval was computed in Theorem 1.5.20. Also, for the multidimensional case, we have the following. Theorem 3.2.6 Let d  2 and |x| < R. Then E(d) x [σR ] =

R2 −|x|2 d .

4:36:28, subject to the Cambridge Core terms of use, .004

3.3 Heat Equations

111

Proof Set M(t) = |B(t)|2 − dt. Then {M(t)}t0 is a martingale and, by the optional stopping theorem, 2 2 E(d) x [|B(σR ∧ t)| − d(σR ∧ t)] = |x|

and E(d) x [σR ∧ t] =

1 (d) (E [|B(σR ∧ t)|2 ] − |x|2 ). d x

Letting t → ∞, we obtain the assertion.



3.3 Heat Equations Consider the initial value problem for the heat equation on Rd . It is a problem to find a function u(t, x) on [0, ∞) × Rd which is of C 1 -class in t, is of C 2 -class in x (C 1,2 -class for short) and satisfies ∂u 1 (t, x) = Δu(t, x) ∂t 2 u(0, x) = f (x).

(t > 0, x ∈ Rd ),

(3.3.1) (3.3.2)

The transition density of a d-dimensional Brownian motion g(t, x, y) = (2πt)− 2 e− d

|y−x|2 2t

(t > 0, x, y ∈ Rd )

1 satisfies ∂g ∂t = 2 Δ x g, where Δ x is the Laplacian acting on functions in x. The next theorem can be proven by justifying the change of order of differentiation and integration. The function g(t, x, y) is called the fundamental solution of the heat equation (3.3.1) or the heat kernel on Rd .

Theorem 3.3.1 Let f : Rd → R be continuous and assume that 1 max{1, log | f (x)|} → 0 |x|2

(|x| → ∞).

Then, the function v(t, x) defined by

g(t, x, y) f (y) dy v(t, x) = Rd

(3.3.3)

is of C ∞ -class in (t, x) and solves the heat equation (3.3.1). Moreover, lim v(t, x) = f (x) t↓0

(x ∈ Rd ).

4:36:28, subject to the Cambridge Core terms of use, .004

112

Brownian Motion and the Laplacian

The function v(t, x) defined by (3.3.3) is represented as the expectation, v(t, x) = E x [ f (B(t))]. Hence, if the uniqueness of the solution is proven, this function is the unique solution for the heat equation. Theorem 3.3.2 If a bounded function u(t, x) (t > 0, x ∈ Rd ) is a solution to the initial value problem of the heat equation (3.3.1) and (3.3.2), then u(t, x) = v(t, x). Proof Fix t > 0 and set M(s) = u(t − s, B(s)). Since u satisfies (3.3.1), by Itô’s formula applied to functions of C 1,2 -class (see Remark 2.3.15), M(s) − u(t, B(0)) =

d i=1

s

0

∂u (t − r, B(r)) dBi (r). ∂xi

Hence, {M(s)} s0 is a bounded local martingale and, in fact, a martingale. Since M(s) converges to M(t) = f (B(t)) as s ↑ t, we obtain u(t, x) = M(0) = E x [M(t)] = v(t, x).



3.4 Non-Homogeneous Equation Given h : (0, ∞) × Rd → R and f : Rd → R, consider the problem of finding a function on [0, ∞) × Rd which is of C 1,2 -class and satisfies ∂u 1 = Δu(t, x) + h(t, x) ∂t 2 u(0, x) = f (x)

((t, x) ∈ (0, ∞) × Rd ),

(x ∈ Rd ).

(3.4.1) (3.4.2)

If u1 solves the equation ∂u1 1 = Δu1 , ∂t 2

u1 (0, x) = f (x),

which was considered in the previous section, and if u2 solves ∂u2 1 = Δu2 + h, ∂t 2

u2 (0, x) = 0,

then u = u1 + u2 is a solution of (3.4.1) and (3.4.2). Hence, we concentrate on the case where f = 0. By Itô’s formula, we obtain the following.

4:36:28, subject to the Cambridge Core terms of use, .004

3.4 Non-Homogeneous Equation

113

Proposition 3.4.1 If u satisfies (3.4.1), then the stochastic process M = {M(s)}0s 0, there exist positive constants α and C such that |h(t, x) − h(t, y)|  C|x − y|α (1) There exists

∂ v ∂xi ∂x j

(2) There exists

∂v ∂t

(|x|, |y|, t  N).

(3.4.6)

2

which is continuous in (t, x) and given by

t ∂2 v ∂2 (t, x) = g(s, x, y)h(t − s, y) dsdy. i j ∂xi ∂x j 0 Rd ∂x ∂x and it satisfies

t ∂ ∂v (t, x) = h(t, x) + g(t − r, x, y)h(r, y) drdy. d ∂t ∂t 0 R

Proof (1) Write vi and gi j for

∂v ∂xi

∂2 g , ∂xi ∂x j

respectively. Then we have

t 1 (vi (t, x + εe j ) − vi (t, x)) = ϕi j,ε (s, x) ds, (3.4.7) ε 0

where

ϕi j,ε (s, x) =

Rd

1 ε

ε

and

gi j (s, x + ξe j , y)h(t − s, y) dξdy.

0

ϕi j,ε (s, x) converges as ε → 0 by the bounded convergence theorem. Hence, once we show that there exists an integrable function φ which is independent of ε and satisfies |ϕi j,ε (s, x)|  φ(s), we obtain the assertion. Noting that

 i (y − xi )(y j − x j ) δi j  0= gi j (s, x, y) dy = − g(s, x, y) dy s s2 Rd Rd  (Bi (s) − xi )(B j (s) − x j ) δ ij − , = Ex s s2 we have 1 ϕi j,ε (s, x) = ε

ε

E x+ξe j 0

 (Bi (s) − xi )(B j (s) − x j − ξ) s2



δi j  s

× (h(t − s, B(s)) − h(t − s, x + ξe j )) dξ. To estimate ϕi j,ε (s, x), we apply Schwarz’s inequality to the expectation on the right hand side. First, by the scaling property of Brownian motion, observe that  (Bi (s) − xi )(B j (s) − x j − ξ) δ 2 ij E x+ξe j − = (1 + δi j )s−2 . s s2

4:36:28, subject to the Cambridge Core terms of use, .004

116

Brownian Motion and the Laplacian

Next, by the assumption, there exist positive constants C1 , C2 , and C3 such that   E x+ξe j |h(t − s, B(s)) − h(t − s, x + ξe j )|2    C1 E x+ξe j |B(s) − x − ξe j |2α 1{|B(s)−x−ξe j |N} + C2 P x+ξe j (|B(s) − x − ξe j |  N) = C1 E0 [|B(s)|2α 1{|B(s)|N} ] + C2 P0 (|B(s)|  N)  C3 sα + C2 P0 (|B(s)|  N) for |x|  N. Using the elementary inequality

∞ 2 ξ 1 η2 e− 2 dξ  e− 2 η η

(η > 0),

we obtain  N  P0 |Bi (s)|  √ d i=1 √

∞ 1 − η2 2d3 s − N2 = 2d e 2ds . e 2s dη  √ √ N πN √ 2πs d

P0 (|B(s)|  N) 

d

Thus there exists a constant C4 , independent of ε, such that α

N2

|ϕi j,ε (s, x)|  C4 (s−1+ 2 + s− 4 e− 4ds ) 3

for s > 0 and |x|  N. (2) The strategy of the proof is similar to that for (1). We start from

1  t+ε 1 (v(t + ε, x) − v(t, x)) = E x h(r, B(t + ε − r)) dr + Iε , ε ε t where

Iε = 0

t

(3.4.8)

 1  E x h(r, B(t + ε − r)) − h(r, B(t − r)) dr. ε

The first term of (3.4.8) converges to h(t, x) by the bounded convergence theorem. Rewrite Iε as

t 1 Iε = (g(t + ε − r, x, y) − g(t − r, x, y))h(r, y) dy dr d 0 R ε

t 1 ε ∂ g(t − r + ξ, x, y)h(r, y) dξdy. (3.4.9) = dr 0 Rd ε 0 ∂t

4:36:28, subject to the Cambridge Core terms of use, .004

3.5 The Feynman–Kac Formula

Since

Rd

117

 d |y − x|2 − g(t, x, y) dy 2t 2t2 Rd  |B(t) − x|2 − dt = 0, = Ex 2t2

∂ g(t, x, y) dy = ∂t

we change the order of the integrations to obtain

t

1 ε ∂ g(t − r + u, x, y)(h(r, y) − h(r, x)) dy dr du Iε = ε 0 0 Rd ∂t

t 1 ε  |B(t − r + u) − x|2 − d(t − r + u) = dr Ex ε 0 2(t − r + u)2 0 × (h(r, B(t − r + u)) − h(r, x)) du. By the local Hölder continuity of h, there exists an integrable function φ on [0, t), independent of ε, such that  1 ε  |B(t − r + u) − x|2 − d(t − r + u)  Ex ε 0 2(t − r + u)2

  × (h(r, B(t − r + u)) − h(r, x)) du  φ(r) (r ∈ [0, t)).

Then apply the Lebesgue convergence theorem to the right hand side of (3.4.9) to obtain the assertion. The details are left to the reader. 

3.5 The Feynman–Kac Formula Given V, f : Rd → R, consider the problem of finding a function u(t, x) of C 1,2 -class on [0, ∞) × Rd such that 1 ∂u (t, x) = Δu(t, x) − V(x)u(t, x) ∂t 2 u(0, x) = f (x).

((t, x) ∈ (0, ∞) × Rd ),

(3.5.1) (3.5.2)

A Feynman path integral presents a representation for a solution to the 1 Schrödinger equation 1i ∂ψ ∂t = − 2 Δψ + Vψ via a formal integral of paths. The representation for a solution to (3.5.1) and (3.5.2) via a Brownian motion is called the Feynman–Kac formula. The approach taken in this section is quite similar to those in the previous two sections. Let B = {B(s)} s0 be a d-dimensional Brownian motion. The following can be easily shown as an application of Itô’s formula.

4:36:28, subject to the Cambridge Core terms of use, .004

118

Brownian Motion and the Laplacian

Proposition 3.5.1 If u : (0, ∞) × Rd → R satisfies (3.5.1), then the stochastic process M = {M(s)}0s 0 satisfying

∞ √ | f (x + y)|e− 2αy dy < ∞ (3.5.6) −∞

for all x. Then, the function F defined by

t    ∞ F(x) = E x f (B(t)) exp −αt − V(B(s)) ds dt 0

(3.5.7)

0

is of piecewise C 2 -class and, at every continuum x of both f and V, 1  F (x) = (α + V(x))F(x) − f (x). 2

(3.5.8)

Remark 3.5.8 For α > 0 and x, y ∈ R, the following formula holds:

∞ √ |y−x|2 1 1 e−αt √ e− 2t dt = √ e−|y−x| 2α . 0 2πt 2α Hence, the condition (3.5.6) is equivalent to  ∞ e−αt | f (B(t))| dt < ∞ Ex

(x ∈ R).

0

4:36:28, subject to the Cambridge Core terms of use, .004

3.5 The Feynman–Kac Formula

121

Proof Define the resolvent operator Gα (α > 0) by  ∞ e−αt g(B(t)) dt , (Gα g)(x) = E x 0

which acts on piecewise continuous functions satisfying (3.5.6). By the preceding remark,



∞ |y−x|2 1 −αt e dt g(y) √ e− 2t dy (Gα g)(x) = 0 −∞ 2πt

∞ √ 1 e− 2α|y−x| g(y) dy = √ 2α −∞

1 √2αx ∞ − √2αy 1 − √2αx x √2αy e g(y) dy + √ e e g(y) dy, = √ e −∞ x 2α 2α which implies the continuity of (Gα g)(x). Moreover, since g is piecewise continuous, Gα g is differentiable in x and

x √

∞ √ √ √  − 2αx 2αy 2αx e g(y) dy + e e− 2αy g(y) dy. (Gα g) (x) = −e −∞

x





Hence (Gα g) is continuous. Similarly (Gα g) is also differentiable and, at the continuum x of g, (Gα g) (x) = 2α(Gα g)(x) − 2g(x).

(3.5.9)

Next we show Gα (V F) = Gα f − F. By (3.5.5),  ∞ !t * + e−αt 1 − e− 0 V(B(s))ds f (B(t)) dt (Gα f )(x) − F(x) = E x 0

t !  ∞ t −αt e f (B(t)) dt e− s V(B(r))dr V(B(s)) ds . = Ex 0

0

Since V is non-negative and  t ! t  !   = 1 − e− 0t V(B(s))ds < 1, − s V(B(r))dr e V(B(s)) ds   0

we have 



−αt

e

Ex 0

 t ! t   − s V(B(r))dr  | f (B(t))| dt  e V(B(s)) ds < ∞, 0

as was remarked in Remark 3.5.8. Hence, changing the order of integrations by Fubini’s theorem, we obtain

4:36:28, subject to the Cambridge Core terms of use, .004

122

Brownian Motion and the Laplacian

∞  ∞ !t V(B(s))ds f (B(t))e−αt− s V(B(r))dr dt (Gα f )(x) − F(x) = E x 0

∞s  ∞ !ρ = Ex e−αs V(B(s)) ds f (B(s + ρ))e−αρ− 0 V(B(s+r))dr dρ . 0

0

The Markov property of Brownian motion implies  ∞ e−αs V(B(s))F(B(s)) ds = Gα (V F)(x). (Gα f )(x) − F(x) = E x 0

Applying (3.5.9) with g = f − V F, we obtain F  (x) = (Gα ( f − V F)) (x) = 2α(Gα ( f − V F))(x) − 2( f − V F)(x) = 2αF(x) − 2 f (x) + 2V(x)F(x) 

if f and V are continuous at x.

Example 3.5.9 (Arcsine law) Let B = {B(t)}t0 be a Brownian motion and Γ+ (t) be the total time when B stays in (0, ∞) up to time t:

t Γ+ (t) = 1(0,∞) (B(s)) ds. 0

Then, for any positive α and β > 0,

∞   1 , e−αt E0 e−βΓ+ (t) dt = ) α(α + β) 0 (

α t ds α 2 P0 (Γ+ (t)  α) = = arcsin √ π t 0 π s(1 − s)

(0  α  t).

Proof Apply Kac’s formula for f = 1 and V(x) = β1(0,∞) (x). The equation (3.5.8) to solve is written as ⎧ ⎪ ⎪ (x < 0) ⎨2(αF(x) − 1)  F (x) = ⎪ ⎪ ⎩2(αF(x) + β − 1) (x  0). This equation has a unique solution of piecewise C 2 -class, which is given by ⎧√ √ √ α− α+β ⎪ 1 2αx ⎪ (x < 0) ⎪ ⎨ √α √α+β√ e √ + α F(x) = ⎪ ⎪ α+β− α − 2(α+β)x 1 ⎪ ⎩ (α+β) √α e + α+β (x  0), and we obtain

0



  e−αt E0 e−βΓ+ (t) dt = F(0) = )

1 α(α + β)

.

4:36:28, subject to the Cambridge Core terms of use, .004

3.5 The Feynman–Kac Formula

On the other hand, by the elementary formula (



∞ −ct e π 2 −s2 e ds = √ dt = √ c c 0 t 0

123

(c > 0),

we have

∞ −αt

1 ∞ e−(α+β)s e ds = ) √ √ dt π s t α(α + β) 0 0

∞ −(α+β)s ∞ −α(t−s)



t e e e−βs 1 ds dt = e−αt dt = ds. √ √ √ π 0 s t−s s 0 0 π s(t − s) 1

Hence, the uniqueness of Laplace transforms implies

t   e−βs 1 ds and P0 (Γ+ (t) ∈ ds) = √ ds.  E0 e−βΓ+ (t) = √ π s(t − s) π s(t − s) 0 The Kato class is a natural class of functions which gives the scalar potentials of Schrödinger operators. Its characterization by means of Brownian motion is known. Definition 3.5.10 Define a function φ on Rd by φ(z) = 1 if d = 1, φ(z) = − log |z| if d = 2 and φ(z) = |z|2−d if d  3. The Kato class Kd is the set of functions f satisfying

t

t E x [| f (B(s))|] ds = g(s, x, y)| f (y)| dsdy < ∞ (t > 0) (3.5.10) 0

0

and

Rd

lim sup α↓0 x∈Rd

|y−x| 0).

4:36:28, subject to the Cambridge Core terms of use, .004

124

Brownian Motion and the Laplacian

Theorem 3.5.11 A function f on Rd satisfying (3.5.10) is of the Kato class Kd if and only if  t | f (B(s))| ds = 0. (3.5.12) lim sup E x t↓0 x∈Rd

0

Proof We give a proof when d  3. The proof for the cases when d = 1 and 2 can be done in a similar way. For details, see [2]. First let f ∈ Kd . Note that |y − x|2 − ds ∂ g(s, x, y) = g(s, x, y). ∂s 2s2 Since

∂g ∂s

< 0 if |y − x|  α and s < g(s, x, y)  g(t, x, y)

α2 d ,

*

we have

0 0 such that

| f (y)| dy < ∞. C := sup x∈Rd

|y−x| 0; |B(t) − a|  r}. Then, by Itô’s formula, d t∧τr ∂u (B(s)) dBi (s). u(B(t ∧ τr )) − u(B(0)) = i ∂x i=1 0 ∂u Since ∂x i is bounded and continuous on B(a, r), the expectation under Pa of the right hand side is 0 and Ea [u(B(t ∧ τr ))] = u(a). Letting t ↑ ∞, by the bounded convergence theorem we obtain

u(a) = Ea [u(B(τr ))]. This shows the mean-value property of u because the distribution of B(τr )  under Pa is the uniform distribution on S (a, r). Corollary 3.6.4 If u is harmonic in D and there exists a ∈ D such that u(a) = sup u(x), x∈D

then u is a constant function. Proof Take r > 0 so that B(a, r) ⊂ D. Then, by Proposition 3.6.3,

u(y)μa,r (dy). u(a) = S (a,r)

This means u(y) = u(a) for any y ∈ S (a, r).



The converse of Proposition 3.6.3 holds. Proposition 3.6.5 If an R-valued function on D satisfies the mean-value property, it is of C ∞ -class and harmonic. We omit the proof. See, for example, [56, Chapter 4]. We show that the solution of the Dirichlet problem (3.6.1) and (3.6.2) is represented as an expectation with respect to Brownian motion. The following is easily shown by using Itô’s formula. Proposition 3.6.6 Let u ∈ C(D) be harmonic in D. If a ∈ D, then {u(B(t ∧ τ))}t0 is a local martingale under Pa . For a bounded function f : ∂D → R, define a function v on D by v(x) = E x [ f (B(τ))].

4:36:28, subject to the Cambridge Core terms of use, .004

128

Brownian Motion and the Laplacian

Proposition 3.6.7 Suppose that f is bounded. If u is a bounded solution to (3.6.1) and (3.6.2), then u = v. , Proof For n = 1, 2, . . ., set Dn = x ∈ D; inf y∈∂D |x − y| > n−1 and τn = inf{t; B(t)  Dn or |B(t)| > n}. Then, by Itô’s formula, u(B(t ∧ τn )) = u(B(0)) +

d i=1

t∧τn 0

∂u (B(s)) dBi (s) ∂xi

and u(a) = Ea [u(B(t ∧ τn ))]. Letting t → ∞ and n → ∞, by the bounded convergence theorem we obtain u(a) = Ea [u(B(τ))] = Ea [ f (B(τ))] = v(a).



Proposition 3.6.8 If f is bounded, then v is of C ∞ -class and harmonic. Proof For x ∈ D and δ > 0 so that B(x, δ) ⊂ D, set τδ = inf{t > 0; B(t)  B(x, δ)}. Then the distribution of B(τδ ) under P x is the uniform distribution on S (x, δ). Hence, by the strong Markov property of Brownian motion,

  v(y)μ x,δ (dy). v(x) = E x EB(τδ ) [ f (B(τ))] = E x [v(B(τδ ))] = S (x,δ)

This means that v has the mean-value property and Proposition 3.6.5 implies the assertion.  For v to satisfy the boundary condition (3.6.2), some condition on ∂D is necessary. Definition 3.6.9 Let D be an open set in Rd and τ be the same exit time as above. y ∈ ∂D is called a regular point, if Py (τ = 0) = 1. Since {τ = 0} ∈ F0 , its probability is 0 or 1 by the Blumenthal 0-1 law. Proposition 3.6.10 Assume that f is bounded and continuous and y ∈ ∂D is a regular point. Then, for any sequence {xn }∞ n=1 ⊂ D with xn → y ∈ ∂D (n → ∞), v(xn ) → f (y) (n → ∞). For the proof, we need a lemma. 4:36:28, subject to the Cambridge Core terms of use, .004

3.6 The Dirichlet Problem

129

Lemma 3.6.11 Fix t > 0. Then, the function D x → P x (τ  t) is lowersemicontinuous, that is, if xn → y ∈ D, lim inf P xn (τ  t)  Py (τ  t). n→∞

Proof For any ε > 0, by the Markov property P x (there exists an s ∈ (ε, t] such that B(s)  D)

g(ε, x, z)Pz (τ  t − ε) dz. = Rd

(3.6.3)

It is easy to see that the right hand side is continuous in x by the Lebesgue convergence theorem. Hence, the left hand side is also continuous in x. Since P xn (there exists an s ∈ (ε, t] such that B(s)  D)  P xn (τ  t), by the continuity observed above, Py (there exists an s ∈ (ε, t] such that B(s)  D)  lim inf P xn (τ  t). n→∞

We obtain the assertion by letting ε ↓ 0.



Proof of Proposition 3.6.10. Since f is bounded and continuous, it suffices to show that, for any sequence {xn }∞ n=1 ⊂ D with xn → y ∈ ∂D (n → ∞), lim P xn (B(τ) ∈ B(y, δ)) = 1

n→∞

(3.6.4)

holds for any δ > 0. For this purpose, fix arbitrary ε > 0 and take t > 0 such that  δ P0 max |B(s)| > < ε. 0st 2 Then, for xn ∈ B(y, 2δ ), we have

 δ P xn (B(τ) ∈ B(y, δ))  P xn τ  t, max |B(s) − xn |  0st 2  δ  P xn (τ  t) − P xn max |B(s) − xn | > 0st 2 > P xn (τ  t) − ε.

By Lemma 3.6.11 and the assumption, lim inf P xn (τ  t)  Py (τ  t) = 1. n→∞

Hence P xn (τ  t) → 1 and lim inf P xn (B(τ) ∈ B(y, δ))  1 − ε. n→∞

Since ε is arbitrary, we get (3.6.4).

 4:36:28, subject to the Cambridge Core terms of use, .004

130

Brownian Motion and the Laplacian

As a sufficient condition for a boundary point to be regular, the Poincaré cone condition is well known. For a ∈ Rd , θ ∈ (0, π) and b ∈ Rd , the set Va,b,θ = {x ∈ Rd ; x − a, b  |x − a| · |b| cos θ} is called a cone with vertex a, direction b and aperture θ. Proposition 3.6.12 y ∈ ∂D is a regular point if there exist a cone V with vertex y and a positive constant r such that V ∩ B(y, r) ⊂ Dc . Proof The scaling property (Theorem 1.2.8) of the probability law of Brownian motion implies that Py (B(t) ∈ V) is independent of t. We denote it by γ. If t is sufficiently small, we have  γ  Py B(t) ∈ V, max |B(s) − y| < r > . 0st 2 Hence, by the assumption, lim inf Py (B(t)  D)  t↓0

γ . 2

Noting that Py (τ  t)  Py (B(t)  D), we obtain Py (τ = 0) = lim inf Py (τ  t)  lim inf Py (B(t)  D)  t↓0

t↓0

Hence, by the Blumenthal 0-1 law, Py (τ = 0) = 1.

γ . 2 

As an example of non-regular points, “Lebesgue’s thorn” is well known (see, for example, [56, Section 4.2]). Moreover, the criterion by Wiener via the Newtonian capacity is widely known. For this, we refer the reader to [50, Section 7.14] and [94, Section 4.2]. What we have shown in this section is the following. Theorem 3.6.13 Let D be an open set in Rd . Assume that P x (τ < ∞) = 1 for any x ∈ D and each point on ∂D is regular. Then, for a bounded and continuous function f on ∂D, v(x) = E x [ f (B(τ))] is the unique solution to the Dirichlet problem (3.6.1) and (3.6.2). When there exists an x such that P x (τ < ∞) < 1, a solution to the Dirichlet problem is not unique. Proposition 3.6.14 Assume that each point on ∂D is regular and f is bounded and continuous. If there exists an a ∈ D such that Pa (τ < ∞) < 1, then the solution to the Dirichlet problem to (3.6.1) and (3.6.2) is not unique.

4:36:28, subject to the Cambridge Core terms of use, .004

3.6 The Dirichlet Problem

131

Proof Set h(x) = P x (τ = ∞) and let τr be the exit time from B(x, r) of a Brownian motion B. If B(x, r) ⊂ D, then the strong Markov property of Brownian motion implies h(x) = E x [PB(τr ) (τ = ∞)] = E x [h(B(τr ))]. Hence, h has the mean-value property and harmonic in D. Moreover, as was shown in the proof of Proposition 3.6.10, limD x→y P x (τ  t) = 1 for any t > 0 and Py (τ = ∞) = 0 for any y ∈ ∂D. h  0 by the assumption and h is a solution to the problem (3.6.1) and (3.6.2) for f ≡ 0. However, h is different from the obvious solution v ≡ 0.  The non-uniqueness of the solution is caused only by h ([94, Section 2]). Theorem 3.6.15 Assume that f is bounded and continuous. If u is a bounded solution to the Dirichlet problem (3.6.1) and (3.6.2), then there exists a constant C such that u(x) = E x [ f (B(τ))] + C · P x (τ = ∞). We omit the proof. Example 3.6.16 (Poisson integral formula on the upper half space) Let d  2 and D be the upper half space of Rd : D = {(x1 , . . . , xd−1 , xd ); xd > 0}. If f is a bounded and continuous function on ∂D, then the solution to the Dirichlet problem is given by

Γ( d ) xd f (y) dy. u(x) = 2d π 2 ∂D |y − x|d Proof Let B = {B(t)}t0 be a Brownian motion and set τ = inf{t > 0; B(t)  D}. Then, since τ = inf{t > 0; Bd (t) = 0}, we have P x (τ ∈ dt) = √

xd 2πt3

e−

(xd )2 2t

dt

by Corollary 3.1.8. Since {(B1 (t), . . . , Bd−1 (t))}t0 and {Bd (t)}t0 are independent, setting x = (x , xd ) and identifying ∂D with Rd−1 , we obtain



|ξ−x |2 1 xd − (xd )2 u(x) = E x [ f (B(τ))] = e− 2t √ e 2t f (ξ) dξdt d−1 Rd−1 0 (2πt) 2 2πt3

∞ |y−x|2 xd e− 2t f (y) dydt. = d d+2 ∂D 0 (2π) 2 t 2 Carrying out the integration in t yields the conclusion.



4:36:28, subject to the Cambridge Core terms of use, .004

132

Brownian Motion and the Laplacian

The Dirichlet problem inside a ball is solved by using the result in Example 3.6.16. For details, see [56, 94]. Example 3.6.17 (Poisson integral formula for a ball) Let d  2 and set D = B(0, r) = {x ∈ Rd ; |x| < r}. Then, for a bounded and continuous function f on ∂B(0, r), the unique solution to the Dirichlet problem is given by

1 d−2 2 2 u(x) = r (r − |x| ) f (y)μ0,r (dy) (x ∈ B(0, r)). d ∂B(0,r) |y − x|

4:36:28, subject to the Cambridge Core terms of use, .004

4 Stochastic Differential Equations

We showed in Section 3.1 that Brownian motions have the Markov and strong Markov properties. A continuous stochastic process with the strong Markov property is called a diffusion process and is one of the main objects in probability theory. Kolmogorov constructed diffusion processes with the help of the theory of partial differential equations. On the other hand, Itô originated the theory of stochastic differential equations in order to construct diffusion processes in a pathwise manner. The aim of this chapter is to develop the theory of stochastic differential equations.

4.1 Introduction: Diffusion Processes In this section we will briefly discuss Markov processes, strong Markov processes, and diffusion processes. We restrict ourselves to time-homogeneous cases. For details, see [25, 45, 100] and references therein. Let E be a locally compact separable metric space. Typical examples are Rd and Riemannian manifolds. When E is not compact, we add an extra point  (infinity) to E, and set E = E ∪ {}. A neighborhood of  is the complement of a compact set in E. E is called a one-point compactification of E. Let W(E) be the set of E -valued functions w : [0, ∞) t → w(t) ∈ E which satisfies the following condition: for each w, there exists a ζ(w) ∈ [0, ∞] such that (i) w(t) ∈ E for t ∈ [0, ζ(w)) and w is continuous on [0, ζ(w)), (ii) w(t) =  for t  ζ(w). The above ζ(w) is called the life time of w ∈ W(E). When Markov processes are studied, it is frequently assumed that the sample paths are right-continuous with left limits. The purpose of this book being 133 4:36:54, subject to the Cambridge Core terms of use, .005

134

Stochastic Differential Equations

to study diffusion processes, we have assumed that the sample paths are continuous up to the life time. As in the case of Wiener space (Section 1.2), a Borel cylinder set of W(E) is defined in the following way. For 0  t1 < t2 < · · · < tn , define the projection πt1 ,t2 ,...,tn by πt1 ,t2 ,...,tn : W(E) w → (w(t1 ), w(t2 ), . . . , w(tn )) ∈ (E )n . For a Borel set F of (E )n , the subset π−1 t1 ,t2 ,...,tn (F) of W(E) is called a Borel cylinder set. The σ-field on W(E) generated by the Borel cylinder sets is denoted by B(W(E)) and that generated by the Borel cylinder sets with tn  t is denoted by Bt (W(E)). Definition 4.1.1 A family of probability measures {P x } x∈E on W(E) satisfying the following conditions is called a (time-homogeneous) Markov family: (i) for all x ∈ E , P x (w(0) = x) = 1, (ii) for all A ∈ B(W(E)), E x → P x (A) is Borel measurable, (iii) for 0  s < t, A ∈ B s (W(E)), F ∈ B(E ) and x ∈ E ,

Pw (s) (w(t − s) ∈ F)P x (dw ). P x (A ∩ {w(t) ∈ F}) =

(4.1.1)

A

Definition 4.1.2 Let m be a probability measure on (E , B(E )). An E valued {Ft }-adapted stochastic process X = {X(t)}t0 defined on a filtered probability space (Ω, F , P, {Ft }) is called a continuous Markov process with initial distribution m if (i) X ∈ W(E), P-a.s., (ii) for any F ∈ B(E ), P(X(0) ∈ F) = m(F), (iii) for 0  s < t and F ∈ B(E ), P(X(t) ∈ F|F s ) = P(X(t) ∈ F|X(s)),

P-a.s.,

where P( · |X(s)) is the conditional probability given the σ-field σ(X(s)) generated by X(s). A Markov process corresponds to a Markov family if we set Ω = W(E) and define X(t, ω) = ω(t) for ω ∈ Ω.

4:36:54, subject to the Cambridge Core terms of use, .005

4.1 Introduction: Diffusion Processes

135

Definition 4.1.3 Let {P x } x∈E be a Markov family. For t  0, x ∈ E and F ∈ B(E ), the probability P x (w(t) ∈ F) is denoted by P(t, x, F), P(t, x, F) = P x (w(t) ∈ F). The family {P(t, x, F)} is called the transition probability of a Markov family {P x } x∈E . For a cylinder set of W(E), (4.1.1) implies P x (w(t1 ) ∈ F1 , w(t2 ) ∈ F2 , . . . , w(tn ) ∈ Fn )

P(t1 , x, dx1 ) P(t2 − t1 , x1 , dx2 ) · · · P(tn − tn−1 , xn−1 , dxn ). = F1

F2

Fn

For a Markov family {P x } x∈E , we define the filtration {Gt = Gt (W(E))} on the path space W(E) by Gt = We set G∞ =



Bt+ε (W(E))

Px

(t > 0).1

ε>0 x∈E

 t>0

Gt .

Definition 4.1.4 A Markov family {P x } x∈E on E is called a strong Markov family if, for any t  0, a {Gt }-stopping time τ, A ∈ Gτ and F ∈ B(E ),

P x (A ∩ {w(t + τ(w)) ∈ F}) = Pw (τ(w )) (w(t) ∈ F)P x (dw ) A

holds for all x ∈ E . Definition 4.1.5 Let X = {X(t)}t0 be an E -valued continuous Markov process with initial distribution m defined on a probability space (Ω, F , P, {Ft }). X is called a diffusion process on E if there exists a strong Markov family ! {P x } x∈E on E such that the probability law of X coincides with Pm (·) = P (·)m(dx), E x

P x (A)m(dx) (A ∈ B(W(E))). P(X ∈ A) = E 1

P

For a probability space (Ω, F , P), the completion of F by P is denoted by F : P

B

B

F = {A ⊂ Ω; there exist B and ∈ F such that B ⊂ A ⊂ and P(B) = P(B )}. P P P is naturally extended to the probability measure P on (Ω, F ) and (Ω, F , P) is a complete probability space.

4:36:54, subject to the Cambridge Core terms of use, .005

136

Stochastic Differential Equations

A diffusion process X = {X(t)}t0 on E defined on (Ω, F , P, {Ft }) is called conservative if the probability that the life time ζ(ω) = inf{t; X(t)(ω) = } is finite is 0. In this book we consider diffusion processes which correspond to second order differential operators. Let Cb (E ) be the set of real-valued bounded continuous functions on E and L : Cb (E ) → Cb (E ) be a linear operator with domain D(L). We shall mainly deal with the case when L is the Laplacian on Rd or, more generally, the second order differential operator given by L=

d d ∂2 ∂ 1 ij a (x) i j + bi (x) i , 2 i, j=1 ∂x ∂x ∂x i=1

(4.1.2)

where ai j and bi are real functions on Rd and the matrix (ai j ) is symmetric and non-negative definite. Definition 4.1.6 Let L be a linear operator with domain D(L) and {P x } x∈E be a family of probability measures on (W(E), B(W(E))) such that the function x → P x (A) is Borel measurable for any A ∈ B(W(E)). If (i) P x (w(0) = x) = 1 for all x ∈ E , (ii) the stochastic process {M f (t)}t0 ( f ∈ D(L)) defined by

t (L f )(w(s)) ds M f (t) = f (w(t)) − f (w(0)) − 0

is a {Bt (W(E))}-martingale under P x for all x ∈ E , then {P x } x∈E is called a family of diffusion measures generated by L. In the rest of this section, let E be Rd or its subset and L be the second order differential operator given by (4.1.2). The above construction of a family of diffusion measures generated by L is called a martingale problem by Stroock and Varadhan ([114]). For the martingale problem, see Section 4.5. One of the advantages of the martingale problem is that, if the family of diffusion measures {P x } x∈E is unique, then {P x } x∈E is a strong Markov family on E. Definition 4.1.7 If the martingale problem for the differential operator L on E of the form (4.1.2) has a unique solution {P x } x∈E , the stochastic process whose probability distribution is P x is called the diffusion process starting from x ∈ E generated by L. It is also simply called an L-diffusion or (a, b)-diffusion. L is called the generator.

4:36:54, subject to the Cambridge Core terms of use, .005

4.1 Introduction: Diffusion Processes

137

Assume that there exists a diffusion process generated by L. Then for any f ∈ D(L) we have lim t↓0

P x [ f (X(t))] − f (x) = (L f )(x). t

Moreover, letting f be a function which is equal to xi or xi x j on a bounded domain, we have 1 lim E x [X i (t) − xi ] = bi (x), t↓0 t 1 lim E x [(X i (t) − xi )(X j (t) − x j )] = ai j (x). t↓0 t Roughly speaking, b(x) = (bi (x)) represents the mean velocity of X and a(x) = (ai j (x)) represents the covariance in an infinitesimal sense. b is called a drift and a is called a diffusion matrix. In the case of Brownian motion, b is 0 and ai j = δi j . Example 4.1.8 Let E = Rd . Set D(L) = { f ∈ Cb (E ); f |Rd ∈ Cb2 (Rd )} and define L f for f ∈ D(L) by ⎧ 1 d ⎪ ⎪ ⎨ 2 (Δ f )(x) (x ∈ R ) (L f )(x) = ⎪ ⎪ ⎩0 (x = ). Then there exists a diffusion process on Rd generated by L. This is nothing but a d-dimensional Brownian motion. Changing domains of generators, we obtain diffusion processes on subsets. Example 4.1.9 (Absorbing Brownian motion) Let D be a bounded domain in Rd with the smooth boundary ∂D and D = D∪{} be its one-point compactification. Denote by D(L) the space of C 2 -functions on D satisfying f (x) → 0 as D x → y ∈ ∂D and define L by ⎧ 1 ⎪ ⎪ ⎨ 2 (Δ f )(x) (x ∈ D) (L f )(x) = ⎪ ⎪ ⎩ 0 (x = ). Then there exists a diffusion process on D generated by L. It is called an absorbing Brownian motion on D.

4:36:54, subject to the Cambridge Core terms of use, .005

138

Stochastic Differential Equations

As in the example above, we need some boundary conditions in order to consider diffusion processes on subsets. In this book we do not consider the boundaries except in the case where d = 1. For the diffusion processes on subsets with boundaries, see, for example, [45, 114]. Kolmogorov constructed diffusion processes by showing the existence of transition densities with the help of the theory of partial differential equations. Also, in the detailed studies on one-dimensional diffusion processes by Feller, Itô and McKean and others, a lot of important parts of observations rely on the results and arguments associated with differential equations. On the other hand, the theory of stochastic differential equations gives us a direct method to construct diffusion processes from Brownian motions. In the following sections, this theory will be stated in detail. For stochastic differential equations, the existence and uniqueness of solutions are equivalent to those of some Markovian family. As was mentioned above, Stroock and Varadhan formulated the latter problem as a martingale problem and solved it in a general framework, where deep results on partial differential equations have again played important roles.

4.2 Stochastic Differential Equations While a construction of diffusion processes is an important application of stochastic differential equations, there are many other areas where stochastic differential equations play key roles. In these applications, stochastic differential equations of more general form than the ones for diffusion processes appear. Hence we start with a general framework. Let Wd be the set of Rd -valued continuous functions on [0, ∞). Set ∞   2−m max |w1 (t) − w2 (t)| ∧ 1 (w1 , w2 ∈ Wd ). ρ(w1 , w2 ) = m=1

0tm

Then ρ is a distance function on Wd and (Wd , ρ) is a complete separable metric space. Let B(Wd ) be the topological σ-field on Wd and Bt (Wd ) be the subσ-field generated by w(s) (0  s  t). As in Chapter 1, denote the subset of Wd consisting of the elements with w(0) = 0, the d-dimensional Wiener space, by W d : W d = {w ∈ Wd ; w(0) = 0}. We define a class of functions which give coefficients of stochastic differential equations. Let Rd ⊗ Rr be the set of d × r real matrices.

4:36:54, subject to the Cambridge Core terms of use, .005

4.2 Stochastic Differential Equations

139

Definition 4.2.1 Denote by A d,r the set of Rd ⊗ Rr -valued functions α(t, w) on [0, ∞) × Wd such that Wd w → α(t, w) ∈ Rd ⊗ Rr is Bt (Wd )-measurable for each t  0. Let α ∈ A d,r and β ∈ A d,1 . For an r-dimensional Brownian motion B = {B(t)}t0 , we consider the stochastic differential equation dX i (t) =

r

αik (t, X) dBk (t) + βi (t, X) dt,

(4.2.1)

k=1

where α(t, w) = (αik (t, w))i=1,2,...,d, k=1,2,...,r , or in matrix notation2 dX(t) = α(t, X) dB(t) + β(t, X) dt.

(4.2.2)

First of all we give the definition of solutions. Definition 4.2.2 Let (Ω, F , P, {Ft }) be a filtered probability space. An Rd valued continuous stochastic process X = {X(t)}t0 is a solution of the stochastic differential equation (4.2.1) if (i) there exists an r-dimensional {Ft }-Brownian motion B = {B(t)}t0 with B(0) = 0, (ii) X is {Ft }-adapted, 2 1 , {βi (t, X)}t0 ∈ Lloc (i = 1, 2, . . . , d, k = 1, 2, . . . , r) 3 , (iii) {αik (t, X)}t0 ∈ Lloc (iv) almost surely, X (t) = X (0) + i

i

r k=1

t 0

αik (s, X) dBk (s)

+

t

bi (s, X) ds 0

(i = 1, 2, . . . , d).

(4.2.3)

X is called a solution driven by a Brownian motion B. We also say that the pair (X, B) is a solution of (4.2.1). The probability distribution of X(0) is called the initial distribution of X. The next Markovian type stochastic differential equation is one of the most important objects in this book. Definition 4.2.3 Let σ(t, x) and b(t, x) be Borel measurable functions on [0, ∞) × Rd with values in Rd ⊗ Rr and Rd , respectively. The stochastic differential equation of the form 2 3

Every element of Rd is thought of as a column vector. 2 and L 1 , see Definitions 2.2.11 and 2.3.2, respectively. For Lloc loc

4:36:54, subject to the Cambridge Core terms of use, .005

140

Stochastic Differential Equations dX(t) = σ(t, X(t)) dB(t) + b(t, X(t)) dt

(4.2.4)

is said to be of Markovian type.4 Moreover, if both σ(t, x) and b(t, x) depend only on x, the Markovian type equation is called time-homogeneous. If σ ≡ 0, then the Markovian type stochastic differential equation (4.2.4) is nothing but the ordinary differential equation dX(t) = b(t, X(t)). dt Hence we may regard a Markovian type stochastic differential equation as an ordinary differential equation perturbed by a noise generated by a Brownian motion.  The function b(t, ·) : Rd → Rd is identified with a vector field di=1 bi (t, x) ∂x∂ i on Rd , and σ(t, ·) : Rd → Rd ⊗ Rr is identified with a set (V1 , . . . , Vr ) of vector fields given by Vk =

d

σik (t, x)

i=1

∂ ∂xi

(k = 1, 2, . . . , r).

Under these identifications we write Equation (4.2.4) as dX(t) =

r

Vk (t, X(t)) dBk (t) + b(t, X(t)) dt.

k=1

It is important to consider solutions which diverge to infinity in finite time (said to explode). But, in order to avoid complexity, we do not discuss the explosion problem here, but we do so in Section 4.3, restricting ourselves to the Markovian case. We present examples of the stochastic differential equations whose solutions are explicitly given. All results are checked by applying Itô’s formula. Example 4.2.4 (Geometric Brownian motion) Let d = r = 1. For σ, ρ ∈ R, set   σ2   X(t) = x exp σB(t) + ρ − t. 2 Then X = {X(t)}t0 is a solution of the stochastic differential equation dX(t) = σX(t) dB(t) + ρX(t) dt satisfying X(0) = x. 4

(αik (t, X)) = (σik (t, X(t))) ∈ A d,r , (βi (t, X)) = (bi (t, X(t))) ∈ A d,1 .

4:36:54, subject to the Cambridge Core terms of use, .005

4.2 Stochastic Differential Equations

141

Example 4.2.5 (Ornstein–Uhlenbeck process) For α ∈ R and β > 0, set

t eβs dB(s). X(t) = X(0)e−βt + αe−βt 0

Then X = {X(t)}t0 satisfies dX(t) = α dB(t) − βX(t) dt. Assume that the initial distribution is Gaussian with mean 0 and variance Then, since X(0) and B are independent, {X(t)}t0 is a stationary Gaussian

α2 2β .

process with covariance cov[X(s), X(t)] =

α2 −β|t−s| . 2β e

Example 4.2.6 Let d = r = 2 and set

t s 2 1 1 x2 eB (s)− 2 dB1 (s), X (t) = x +

2

t

X 2 (t) = x2 eB (t)− 2 .

0

Then {(X (t), X (t))}t0 is a solution of 1

2

dX 1 (t) = X 2 (t) dB1 (t),

dX 2 (t) = X 2 (t) dB2 (t)

satisfying (X 1 (0), X 2 (0)) = (x1 , x2 ). $ # 0 γ2 . Define an R2 -valued Example 4.2.7 For γ > 0 and σ ∈ R, set A = 1 0 & # $' X(t) stochastic process Z(t) = by Y(t) t0 &# $ t # $ ' tA x −sA σ + dB(s) , e Z(t) = e 0 0 0  tn n where etA is the exponential of the matrix tA, etA = ∞ n=0 n! A . By Itô’s formula, # $ σ dB(t). (4.2.5) dZ(t) = AZ(t)dt + 0 ⎞ ⎛ ⎜⎜⎜ cosh(γt) γ sinh(γt)⎟⎟⎟ tA ⎟ , the definition of {Z(t)}t0 implies Since e = ⎝⎜ 1 cosh(γt) ⎠ γ sinh(γt)

t cosh(γ(t − s)) dB(s). X(t) = x cosh(γt) + σ 0

!t Then, it follows from the second line of (4.2.5) that Y(t) = 0 X(s) ds. Moreover, the first line of (4.2.5) yields that X = {X(t)}t0 obeys the stochastic differential equation

4:36:54, subject to the Cambridge Core terms of use, .005

142

Stochastic Differential Equations

dX(t) = γ2



t

 X(s)ds dt + σ dB(t),

X(0) = x.

(4.2.6)

0

Example 4.2.8 (Doss [16], Sussmann [117]) Let d = r = 1 and σ, b ∈ Cb2 (R). Then a solution of the time-homogeneous Markovian type stochastic differential equation dX(t) = σ(X(t)) dB(t) + b(X(t)) dt is constructed as follows. For y ∈ R, denote by ϕ(x, y) the solution of the ordinary differential equation dϕ = σ(ϕ), ϕ(0) = y. dx Then ϕ(x, y) is differentiable in y. Moreover, since ∂  ∂ϕ(x, y)  ∂ϕ(x, y) , = σ (ϕ(x, y)) ∂x ∂y ∂y we have

 ∂ϕ(x, y) = exp ∂y

Set 1  b(x) = b(x) − σ(x)σ (x) 2

and

x

 σ (ϕ(u, y)) du .

0

 ∂ϕ(x, y) −1 f (x, y) =  b(ϕ(x, y)) ∂y

and let Y(t) be the solution of the ordinary differential equation dY (t) = f (B(t), Y(t)). dt Then {X(t) = ϕ(B(t), Y(t))}t0 is a solution driven by B.5 Next we give the definitions of uniqueness of solutions. Definition 4.2.9 It is said that the uniqueness in law of solutions for the stochastic differential equation (4.2.1) holds if the probability laws of X and X  coincide for any solutions X and X  of (4.2.1) with the same initial distribution. Definition 4.2.10 It is said that the pathwise uniqueness of solutions for the stochastic differential equation (4.2.1) holds if X(t) = X  (t) for all t  0 almost surely for any solutions X and X  of (4.2.1) which are defined on the same filtered probability space, driven by the same Brownian motion, and satisfy X(0) = X  (0) almost surely. 5

Example 4.2.8 is one of the starting points of the representation of solutions of stochastic differential equations (Kunita [64],Yamato [129]).

4:36:54, subject to the Cambridge Core terms of use, .005

4.2 Stochastic Differential Equations

143

 d × W r )Definition 4.2.11 Φ(x, w) : Rd × W r → Wd is said to be B(R d measurable if, for any probability measure m on R , there exists a function  m : Rd × W r → Wd such that Φ m×μ

 m is B(Rd × W r ) (i) for the Wiener measure μ on W r , Φ  (ii) Φ(x, w) = Φm (x, w) for m-a.e. x.

-measurable,

Definition 4.2.12 A solution X of (4.2.1) is called a strong solution with a  d × W r )-measurable function F(x, w) Brownian motion B if there exists a B(R μ d such that w|[0,t] → F(x, w) ∈ Wt ≡ C([0, t] → Rd ) is Bt (W r ) -measurable for each x ∈ Rd and t  0 and X = F(X(0), B) holds almost surely. Definition 4.2.13 The stochastic differential equation (4.2.1) is said to have a unique strong solution if there exists a function F(x, w) : Rd × W r → Wd satisfying the conditions in Definition 4.2.12 such that (i) for an r-dimensional Brownian motion B = {B(t)}t0 and an F0 measurable random variable ξ defined on a filtered probability space (Ω, F , P, {Ft }), F(ξ, B) is a solution of (4.2.1) with X(0) = ξ almost surely, (ii) for any solution (X, B) of (4.2.1), X = F(X(0), B) almost surely. Following [45, 128], we show the fundamental theorem for the uniqueness of the solutions of stochastic differential equations. Theorem 4.2.14 Let α ∈ A d,r and β ∈ A d,1 . Then (4.2.1) has a unique strong solution if and only if, for any Borel probability measure m on Rd , there exists a solution with initial distribution m and the pathwise uniqueness holds. Proof Assume that (4.2.1) has a unique strong solution, that is, there exists a unique function F : Rd ×W r → Wd which satisfies the conditions (i) and (ii) in Definition 4.2.13. Take an r-dimensional {Ft }-Brownian motion B on a filtered probability space (Ω, F , P, {Ft }) and an F0 -measurable random variable ξ whose probability distribution is m. Then X = F(ξ, B) is a solution of (4.2.1) with initial distribution m. Moreover, if (X, B) and (X  , B ) are two solutions of (4.2.1) on a same probability space such that B(t) = B (t) (t  0) and X(0) = X  (0) almost surely, then we have X = F(X(0), B) = F(X  (0), B ) = X  almost surely. Hence the pathwise uniqueness holds for (4.2.1).

4:36:54, subject to the Cambridge Core terms of use, .005

144

Stochastic Differential Equations

We give a sketch of the proof of the converse. Suppose that, for any Borel probability measure m, there exists a solution of (4.2.1) with initial distribution m and the pathwise uniqueness holds. We consider the case when m = δ x for a fixed x ∈ Rd . Let (X, B) and (X  , B ) be solutions of (4.2.1) starting from x and denote by P x and Px their probability distributions on Wd × W r , respectively. Moreover, let π : Wd × W r (w1 , w2 ) → w2 ∈ W r be the projection. Then, the distribution of w2 under P x or Px is the Wiener measure μ on W r . Let Q x,w2 (dw1 ) be the regular conditional probability distribution of w1 under P x given w2 . Q x,w2 is a probability measure on Wd which satisfies (i) for each w2 ∈ W r , Q x,w2 (dw1 ) is a probability measure on (Wd , B(Wd )), μ (ii) for any A ∈ B(Wd ), w2 → Q x,w2 (A) is B(W r ) -measurable, (iii) for any A1 ∈ B(Wd ) and A2 ∈ B(W r ),

P x (A1 × A2 ) = Q x,w2 (A1 )μ(dw1 ). A2

Qx,w2 (dw1 ) d

We define in the same way. Set Ω = W × Wd × W r and define a probability measure Q x on Ω by Q x (dw1 dw2 dw3 ) = Q x,w3 (dw1 )Qx,w3 (dw2 )μ(dw3 ). Qx

Let B(Ω) be the topological σ-field on Ω and set F = B(Ω) . Denoting by Bt the direct product σ-field Bt (Wd ) ⊗ Bt (Wd ) ⊗ Bt (W r ) and by N the  totality of the Q x -null sets, we set Ft = ε>0 (Bt+ε ∨ N ). Then, the distributions of (w1 , w3 ) and (w2 , w3 ) under Q x are those of (X, B) and (X  , B ), respectively. Hence, both (w1 , w3 ) and (w2 , w3 ) are solutions of (4.2.1) defined on (Ω, F , P, {Ft }), and Q x (w1 = w2 ) = 1 by the assumption, which implies (Q x,w × Qx,w )(w1 = w2 ) = 1,

μ-a.s.

This means that, for almost all w ∈ W r , there exists a unique F x (w) ∈ Wd such that Q x,w and Qx,w are the Dirac measure concentrated on F x (w). F x (w) is the desired function on Rd × W r and we can prove that (4.2.1) has a unique strong solution. For details, see [45, p.163].  Corollary 4.2.15 If pathwise uniqueness holds for the stochastic differential equation (4.2.1), then uniqueness in law holds.

4:36:54, subject to the Cambridge Core terms of use, .005

4.3 Existence of Solutions

145

4.3 Existence of Solutions Let α ∈ A d,r and β ∈ A d,1 , and consider the stochastic differential equation dX(t) = α(t, X) dB(t) + β(t, X) dt.

(4.3.1)

Define a = (ai j ) : [0, ∞) × Wd → Rd ⊗ Rd by a = αα∗ , that is, r

ai j (t, w) =

αik (t, w)αkj (t, w),

k=1

and set (A f )(t, w) =

d d 1 ij ∂2 f ∂f a (t, w) i j (w(t)) + βi (t, w) i (w(t)) 2 i, j=1 ∂x ∂x ∂x i=1

for f ∈ Cb2 (Rd ). Let (X, B) be a solution of (4.3.1) defined on a filtered probability space (Ω, F , P, {Ft }) and set

t X M f (t) = f (X(t)) − f (X(0)) − (A f )(s, X) ds. 0

By Itô’s formula M Xf (t) =

r d i=1 k=1

0

t

αik (s, X)

∂f (X(s)) dBk (s) ∂xi

2 and {M Xf (t)}t0 ∈ Mc,loc . The converse also holds and, as seen below, this is equivalent to the existence of the solution for (4.3.1).

Theorem 4.3.1 The stochastic differential equation (4.3.1) has a solution if and only if there exists a d-dimensional continuous stochastic process X = 2 for any f ∈ Cb2 (Rd ). {X(t)}t0 such that {M Xf (t)}t0 ∈ Mc,loc Proof Assume the existence of X satisfying the condition. Let R > 0 and take fi ∈ Cb2 (R) such that fi (x) = xi for |x| < R. Set σR = inf{t > 0; |X(t)|  R},

t∧σR M (R),i (t) = X i (t ∧ σR ) − X i (0) − βi (s, X) ds, 0

and

M i (t) = X i (t) − X i (0) −

t

βi (s, X) ds.

0

4:36:54, subject to the Cambridge Core terms of use, .005

146

Stochastic Differential Equations

Then, for i = 1, 2, . . . , d, {M (R),i (t)}t0 ∈ Mc2 and it converges to {M i (t)}t0 as 2 . Moreover, let fi j ∈ Cb2 (R) satisfy fi j (x) = R → ∞. Hence {M i (t)}t0 ∈ Mc,loc xi x j for |x| < R and set M i j (t) = X i (t)X j (t) − X i (0)X j (0)

t   ij a (s, X) + βi (s, X)X j (s) + β j (s, X)X i (s) ds. − 0 2 Then, we get, in a similar way, {M i j (t)}t0 ∈ Mc,loc . By Itô’s formula

M (t)M (t) − i

t

j

ai j (s, X) ds 0 i

= M i j (t) − X (0)M j (t) − X j (0)M i (t)

t  s

t  s   i j β (u, X) du dM (s) − β j (u, X) du dM i (s) − 0

0

0

and

0

t

M i , M j (t) =

ai j (s, X) ds. 0

Hence, by Theorem 2.5.8, there exists an r-dimensional Brownian motion B such that r t i M (t) = αik (s, X) dBk (s) k=1

0



and (X, B) is a solution of (4.3.1).

Remark 4.3.2 The condition of the theorem is equivalent to the existence of the probability measure on (W, B(W)) under which

t (A f )(s, w) ds f (w(t)) − f (w(0)) − 0

is an {Ft }-locally square integrable martingale (see Section 2.1) for any f ∈ Cb2 (Rd ). In fact, the probability law of X in Theorem 4.3.1 plays the desired role. The next is a fundamental theorem on the existence of solutions. Theorem 4.3.3 Assume that α ∈ A d,r and β ∈ A d,1 are bounded and continuous in (t, w) ∈ [0, ∞) × Wd . Then, for any x0 ∈ Rd , there exists a solution (X (x0 ) , B) of (4.3.1) satisfying P(X (x0 ) (0) = x0 ) = 1.

4:36:54, subject to the Cambridge Core terms of use, .005

4.3 Existence of Solutions

147

Proof We give an outline of a proof. For details, see [45]. It suffices to show that there exists a stochastic process X = {X(t)}t0 defined on some filtered probability space (Ω, F , P, {Ft }) such that P(X(0) = x0 ) = 1 and, for any f ∈ Cb2 (Rd ),

M f (t) := f (X(t)) − f (X(0)) − 0

t

2 (A f )(s, X) ds ∈ Mc,loc .

Take an r-dimensional {Ft }t0 -Brownian motion B = {B(t)}t0 on a filtered probability space (Ω, F , P, {Ft }) and define a sequence of d-dimensional stochastic processes Xn = {Xn (t)}t0 (n = 1, 2, . . .) by Xn (0) = x0 for t = 0 and, for t with 2mn  t  m+1 2n ,   m  m  m , X , X B(t) − B + β t − , n,m n,m 2n 2n 2n 2n 2n   where {Xn,m (t)}t0 is given by Xn,m (t) = Xn t ∧ 2mn . Moreover, set φn (t) = m m m+1 d,r and βn ∈ A d,1 by αn (t, w) = α(φn (t), w) 2n ( 2n  t < 2n ) and define αn ∈ A and βn (t, w) = β(φn (t), w). Then, {Xn (t)}t0 is a unique solution of Xn (t) = Xn

m

m



dX(t) = αn (t, X) dB(t) + βn (t, X) dt,

X(0) = x0 .

By the boundedness and continuity of α and β, we can prove that there exists a subsequence of {Xn (t)}t0 which converges uniformly on compact sets almost surely. The limit is the desired solution of the stochastic differential equation.  Let P x be the probability law of the solution X of (4.3.1) with X(0) = x. Denote by P(Wd ) the set of probability measures on Wd endowed with the topology of weak convergence and by B(P(Wd )) its topological σ-field. Then, the mapping x → P x is B(Rd )/B(P(Wd ))-measurable ([45, p.171], [114, p.28]). For a probability measure m on Rd , define a probability measure P on (Wd , B(Wd )) by

P x (A)m(dx) (A ∈ B(Rd )). P(A) = Rd

Then, for any f ∈ Cb2 (Rd ), {M Xf (t)}t0 is a local martingale under P. Hence there exists a solution of (4.3.1) with initial distribution m. We devote the rest of this section to time-homogeneous Markovian stochastic differential equations, which are important ingredients in the future chapters. In what follows, we consider solutions which may explode, i.e., they may arrive at infinity in finite time.

4:36:54, subject to the Cambridge Core terms of use, .005

148

Stochastic Differential Equations

For measurable functions σ(x) = (σij (x)) : Rd → Rd ⊗Rr and b(x) = (bi (x)) : Rd → Rd , consider the stochastic differential equation dX(t) = σ(X(t)) dB(t) + b(X(t)) dt, or dX i (t) =

r

σik (X(t)) dBk (t) + bi (X(t)) dt

(4.3.2)

(i = 1, 2, . . . , d).

k=1

If σ and b are bounded and continuous, then there exists a solution by Theorem 4.3.3. If they are not bounded, an explosion occurs in general. Example 4.3.4 Let d = r = 1. X(t) = (1 − B(t))−1 (t < inf{s; B(s) = 1}) is a solution of dX(t) = X(t)2 dB(t) + X(t)3 dt,

X(0) = 1.

For this stochastic differential equation, the uniqueness of solution holds by Theorem 4.4.5 below. Let  Rd = Rd ∪ {} be the one-point compactification of Rd and consider the path space on  Rd given by & ' w is continuous. d d   . W = w : [0, ∞) → R ; If w(t) = , then w(t ) =  (t  t).  d ) be the σ-field generated by the Borel cylinder sets and ζ(w) be the Let B(W  d, life time of w ∈ W ζ(w) = inf{t; w(t) = }. Since we admit explosions, we again give the definition of solutions of stochastic differential equations.  d -valued random variable X = {X(t)}t0 defined on a Definition 4.3.5 A W filtered probability space (Ω, F , P, {Ft }) is called a solution of the stochastic differential equation (4.3.2) if (i) there exists an r-dimensional {Ft }-Brownian motion B = {B(t)}t0 with B(0) = 0, Rd is Ft -measurable, (ii) X is {Ft }-adapted, that is, for each t  0, X(t) ∈  (iii) almost surely, for t < ζ X = inf{t; X(t) = },

t r t i i i k σk (X(s)) dB (s)+ bi (X(s)) ds (i = 1, 2, . . . , d). X (t) = X (0)+ k=1

0

0

 d , we can define the uniqueness of Remark 4.3.6 Replacing Wd with W solutions and the pathwise uniqueness in the same way. 4:36:54, subject to the Cambridge Core terms of use, .005

4.3 Existence of Solutions

149

Remark 4.3.7 If ζ X < ∞, then limt↑ζ X X(t) =  almost surely ([45, p.174]). Theorem 4.3.8 If σ(x) = (σik (x)) and b(x) = (bi (x)) are continuous, then, for any x ∈ Rd , there exists a solution X of (4.3.2) such that X(0) = x.  d -valued random variable Proof It is sufficient to show that there exists a W X = {X(t)}t0 defined on some filtered probability space (Ω, F , P, {Ft }) such that P(X(0) = x0 ) = 1 and the stochastic process {M f (t)}t0 defined by

t∧τn (L f )(X(s)) ds M f (t) := f (X(t ∧ τn )) − f (X(0)) − 0

is an {Ft }-martingale for each f ∈ Cb2 (Rd ), where τn = inf{t; |X(t)|  n} (see Theorem 4.3.1). L is given by (L f )(x) =

d d 1 ij ∂2 f ∂f a (x) i j + bi (x) i (x), 2 i, j=1 ∂x ∂x ∂x i=1

where ai j (x) =

d

σik (x)σkj (x).

k=1

Let ρ ∈ C(R → R) be a function with values in (0, 1] such that all of ρ(x)ai j (x) and ρ(x)bi (x) are bounded. Set d

( L f )(x) = ρ(x)(L f )(x).  = {X(t)}  t0 By Theorem 4.3.3, there exists an Rd -valued stochastic process X 6 6 starting from x0 defined on a filtered probability space (Ω, F , P, {Ft }) such that

t  f (t) := f (X(t))  − f (X(0))   M − ( L f )(X(s)) ds 0

6t }-martingale for any f ∈ is an {F

Cb2 (Rd ).

t

Define {A(t)}t0 by

 ρ(X(s)) ds

A(t) =

0

and set e = lim A(t) ∈ (0, ∞]. Since ρ > 0, t → A(t) has an inverse function t→∞

6t }-stopping time and α(t) → ∞ α(t) (0  t < e). For fixed t > 0, α(t) is an {F as t → e. We now set ⎧ ⎪  ⎪ (t < e), ⎨X(α(t)) 6 Ft = Fα(t) and X(t) = ⎪ ⎪ ⎩ (t  e)  d -valued random variable X = {X(t)}t0 defined on the and show that the W probability space (Ω, F , P, {Ft }) is the desired stochastic process. 4:36:54, subject to the Cambridge Core terms of use, .005

150

Stochastic Differential Equations

  n}. By the optional stopping theorem { M  f (t∧ τn )}t0 is Set  τn = inf{t; |X(t)| 6t }-martingale and { M  f (α(t) ∧ τn )}t0 is an {Ft }-martingale. By definition, an {F  τn = α(t∧τn ), X(α(t)∧ τn ) = X(t∧τn ). α(t) <  τn is equivalent to ! tt < τn , and α(t)∧ −1  Moreover, since t = 0 (ρ(X(s))) dA(s) and

α(t)

α(t) = 0

1 dA(s) =  ρ(X(s))

t 0

1 ds, ρ(X(s))

we have

α(t∧τn )  f (α(t) ∧    M τn ) = f (X(α(t) ∧ τn )) − f (x0 ) − (ρL f )(X(s)) ds 0

t∧τn    ρ(X(α(u)))(L f )(X(α(u))) dα(u) = f (X(α(t ∧ τn ))) − f (x0 ) − 0

t∧τn (L f )(X(u)) du. = f (X(t ∧ τn )) − f (x0 ) − 0

Hence, {M f (t)}t0 is an {Ft }-martingale.



We show a sufficient condition for solutions not to explode. Theorem 4.3.9 Assume that σ(x) and b(x) are continuous and there exists a positive constant K such that σ(x)2 + |b(x)|2  K(1 + |x|2 )

(x ∈ Rd ),

where σ(x)2 =

r d (σik (x))2 ,

|b(x)|2 =

i=1 k=1

d

(bi (x))2 .

i=1



If E[|X(0)| p ] < ∞ for some p  2, then E sup0tT |X(t)| p < ∞ for any T > 0. In particular, no explosion occurs almost surely. Proof Set σn = inf{t; |X(t)|  n} and Xn (t) = X(t ∧ σn ). By the Burkholder– Davis–Gundy inequality (Theorem 2.4.1), there exists a constant C1 such that  p  r s   t 2p E sup  σik (Xn (u)) dBk (u)  C1 E |σi (Xn (u))|2 du 0st k=1

0

0

for every i = 1, 2, . . . , d. Moreover, by Hölder’s inequality,  t 2p  t p−2 E |σi (Xn (u))|2 du |σi (Xn (u))| p du t 2 E 0

0

4:36:54, subject to the Cambridge Core terms of use, .005

4.4 Pathwise Uniqueness

and

151

t  t  p    t p−1 i p b (X (u)) du |bi (Xn (u))| p du. n   0

0

By the assumption, there exists a constant Ci = Ci (K, p, T ), independent of n, such that

t   p E sup |Xn (s)|  C2 + C3 E sup |Xn (u)| p ds 0st

0us

0

for any t ∈ [0, T ]. Hence, by Gronwall’s inequality (Theorem A.1.1), there exist constants C4 and C5 such that  E sup |Xn (t)| p  C4 eC5 T . 0tT

We obtain the conclusion by letting n → ∞.



The explosion problem is studied in [84, Section 4.4] and [114, Chapter 10] from an analytic point of view. In the one-dimensional case, it is completely clarified and the result will be presented in Section 4.8. In that section, as an application of one-dimensional achievements, a sufficient condition for nonexplosion of general dimensional diffusions due to Khasminskii is also shown.

4.4 Pathwise Uniqueness The stochastic differential equations with Lipschitz continuous coefficients have nice properties, as in the case of ordinary differential equations. Definition 4.4.1 γ ∈ A 1,1 is called Lipschitz continuous if there exists a positive constant K such that |γ(t, w) − γ(t, w )|  K(w − w )∗ (t)

(t  0, w, w ∈ Wd ),

where w∗ is given by w∗ (t) = max |w(s)|. 0st

The constant K is called a Lipschitz coefficient. Theorem 4.4.2 Suppose that each component of α ∈ A d,r and β ∈ A d,1 is Lipschitz continuous and, for any T > 0, there exists a positive constant CT such that α(t, 0), |b(t, 0)|  CT

(0  t  T ).

4:36:54, subject to the Cambridge Core terms of use, .005

152

Stochastic Differential Equations

Then the stochastic differential equation (4.3.1) has a solution and the pathwise uniqueness holds. Proof It suffices to consider a solution with X(0) = x for a fixed x ∈ Rd . We construct it by Picard’s method of successive approximation. Let (W r , μ) be the r-dimensional Wiener space and let Ft be the σ-field  obtained by the completion of σ({w(s); s  t}) by μ. Set F∞ = t0 Ft . Then (W r , F∞ , μ, {Ft }) is a probability space which satisfies the usual condition. Denote by {B(t)}t0 the coordinate process. It is an r-dimensional {Ft }-Brownian motion under μ. Fix T > 0 and define a sequence of {Ft }-adapted stochastic processes {X (n) (t)}t0 (n = 0, 1, 2, . . .) by X (0) (t) = x, X

(n+1)

(t) = x +

t

t

α(s, X ) dB(s) + (n)

0

β(s, X (n) ) ds.

0

 We can easily show by induction that E sup0tT |X (n) (t)|2 < ∞ for each n, where E denotes the expectation with respect to μ.

To give an estimate for E sup0tT |X (n+1) (t) − X (n) (t)|2 , we show the

following. Lemma 4.4.3 Let X and Y be {Ft }-adapted stochastic processes satisfying   E sup |X(t)|2 < ∞, E sup |Y(t)|2 < ∞ 0tT

0tT

 and for any T > 0, and ξ and η be F0 -measurable random variables. Define X  Y by  =ξ+ X(t)  =η+ Y(t)



t

0 t 0

t

σ(s, X) dB(s) +

β(s, X) ds,

0 t

σ(s, Y) dB(s) +

β(s, Y) ds.

0

Then there exists a constant C, dependent only on T > 0 and the Lipschitz coefficients of σ and β, such that   t   − Y)  ∗ (t)2 ]  C E[|ξ − η|2 ] + E E[(X (X − Y)∗ (s)2 ds 0

for any t ∈ [0, T ]. 4:36:54, subject to the Cambridge Core terms of use, .005

4.4 Pathwise Uniqueness

153

Proof By definition  t    − Y(t)|   |ξ − η| +  |X(t) (σ(s, X) − σ(s, Y)) dB(s)   0  t  +  (β(s, X) − β(s, Y)) ds. 0

Applying Doob’s inequality to the second term of the right hand side, we obtain 2  s  E sup  (σ(u, X) − σ(u, Y)) dB(u) 0st

0

2  t  t  4 E  (σ(u, X) − σ(u, Y)) dB(u) = 4 E σ(u, X) − σ(u, Y)2 du . 0

0

Then the assumption implies 2  s  t    2   (σ(u, X) − σ(u, Y)) dB(u)  4K E (X − Y)∗ (u)2 du . E sup  0st

0

0

Moreover, the assumption and Schwarz’s inequality yield

s  s 2   E sup  (β(u, X) − β(u, Y)) du  t E sup (X − Y)∗ (u)2 du 0st

0

0st t



= tE

0

(X − Y)∗ (u)2 du .

0

Combining the above estimates, we obtain the conclusion.



Proof of Theorem 4.4.2 (continued) Set Δ(n) (t) = E[(X (n+1) − X (n) )∗ (t)2 ]. By the above lemma,

t Δ(n) (t)  C Δ(n−1) (s) ds. 0

Set A = Δ (T ). Then, by induction, (0)

Δ(n) (T )  A

CnT n . n!

Hence, Chebyshev’s inequality implies   1 μ sup |X (n+1) (t) − X (n) (t)| > n  4n E sup |X (n+1) (t) − X (n) (t)|2 2 0tT 0tT (4CT )n A . n! Thus ∞  1 μ sup |X (n+1) (t) − X (n) (t)| > n < ∞. 2 0tT n=1 4:36:54, subject to the Cambridge Core terms of use, .005

154

Stochastic Differential Equations

By the Borel–Cantelli lemma, for almost all w ∈ W r , there exists N(w) ∈ N such that 1 sup |X (n+1) (t, w) − X (n) (t, w)|  n 2 0tT for all n  N(w). This means that X (n) (t) = X (0) +

n

(X ( j) (t) − X ( j−1) (t))

j=1

converges uniformly as n → ∞ almost surely. Denote the limit by X. Then, by Schwarz’s inequality and the estimate for Δ(n) (T ), = $2 # ∞ (CT ) j (n) ∗ 2 E[(X − X ) (T ) ]  A →0 (n → ∞). j! j=n+1 On the other hand, setting

t

X(t) = x +

t

σ(s, X) dB(s) +

0

β(s, X) ds,

0

we obtain by the above lemma

T (n) ∗ 2 E[(X − X ) (T ) ]  C E[(X − X (n−1) )∗ (t)2 ] dt → 0

(n → ∞).

0

Hence, X = X and X satisfies the stochastic differential equation (4.3.1). To show the pathwise uniqueness, let X and X  be the solutions defined on the same probability space. By the lemma, there exists a constant CT such that

t  ∗ 2 E[(X − X  )∗ (s)2 ] ds (t ∈ [0, T ]). E[(X − X ) (t) ]  CT 0  ∗

This implies that E[(X − X ) (T ) ] = 0 and X(t) = X  (t) (0  t  T ) almost surely.  2

Remark 4.4.4 Suppose that α and β are locally Lipschitz continuous, that is, for any N > 0 there exists a constant KN such that α(t, w) − α(t, w ) + |β(t, w) − β(t, w )|  KN (w − w )∗ (t) for w, w ∈ Wd and t with w∗ (t), (w )∗ (t)  N and 0  t  N. If we assume that there is no explosion, then the pathwise uniqueness holds. For details, see for example [100, p.132]. A sufficient condition for no explosion is that α(t, w) + |β(t, w)|  KN (w)∗ (t)

(0  t  N).

4:36:54, subject to the Cambridge Core terms of use, .005

4.4 Pathwise Uniqueness

155

For Markovian stochastic differential equations with locally Lipschitz continuous coefficients, the pathwise uniqueness holds. We omit the proof and refer the reader to [45, p.178]. Theorem 4.4.5 Suppose that σ : R → Rd ⊗ Rr and b : Rd → Rd are locally Lipschitz continuous, that is, for any N > 0 there exists a positive constant KN such that σ(x) − σ(y) + |b(x) − b(y)|  KN |x − y| (|x|, |y|  N). Then the pathwise uniqueness of solutions holds for the stochastic differential equation dX(t) = σ(X(t)) dB(t) + b(X(t)) dt. Hence this stochastic differential equation has a unique strong solution. For the one-dimensional stochastic differential equation, deep and interesting results on pathwise uniqueness are known. In the remaining of this section, we list them without proofs. For details, see [69, 89, 128] and also [45, 100]. Theorem 4.4.6 (Yamada [128]) Let σ, b : R → R be bounded functions such that (i) there exists a strictly monotone increasing function ρ on [0, ∞) such that

ρ(u)−2 du = ∞ and |σ(x) − σ(y)|  ρ(|x − y|) (x, y ∈ R), ρ(0) = 0, 0+

(ii) there exists a monotone increasing concave function κ on [0, ∞) such that

κ(u)−1 du = ∞ and |b(x) − b(y)|  κ(|x − y|) (x, y ∈ R). κ(0) = 0, 0+

Then, for the stochastic differential equation dX(t) = σ(X(t)) dB(t) + b(X(t)) dt,

(4.4.1)

the pathwise uniqueness of solutions holds. If σ is 12 -Hölder continuous and b is Lipschitz continuous, then the conditions of the theorem are fulfilled. Theorem 4.4.7 (Nakao [89], Le Gall [69]) Let σ, b : R → R be bounded functions. Assume that there exist a positive constant ε > 0 and a bounded and monotone increasing function f such that σ(x)  ε

and |σ(x) − σ(y)|2  | f (x) − f (y)| (x, y ∈ R).

4:36:54, subject to the Cambridge Core terms of use, .005

156

Stochastic Differential Equations

Then, for the stochastic differential equation (4.4.1), the pathwise uniqueness of solutions holds. Under the assumptions of these theorems, the existence of solutions is seen by Theorem 4.3.8. Hence, by Corollary 4.2.15, the uniqueness in law also holds. The uniqueness in law for the Markovian stochastic differential equations is also discussed in the next section.

4.5 Martingale Problems Given ai j , bi : Rd → R (i, j = 1, 2, . . . , d), assume that (ai j ) is symmetric and non-negative definite. We consider the diffusion process corresponding to L=

d d ∂2 ∂ 1 ij a (x) i j + bi (x) i . 2 i, j=1 ∂x ∂x ∂x i=1

(4.5.1)

While some of the results mentioned below continue to hold in the case where the coefficients ai j and bi depend on paths, we do not mention them here. For more results, see [100, 114]. Definition 4.5.1 For y ∈ Rd , a probability measure Py on the path space (Wd , B(Wd )) is called a solution of the martingale problem for L starting from y if Py (w(0) = y) = 1 and

t M f (t) = f (w(t)) − f (w(0)) − (L f )(w(s)) ds 0

is a {Bt (W )}-martingale under P for any f ∈ C0∞ (Rd ). If there exists a unique solution starting from y for any y ∈ Rd , the martingale problem is called well posed. d

y

By Theorems 4.3.1 and 4.3.3, a solution of the stochastic differential equation (4.3.2) exists if and only if so does that for a martingale problem for L. Moreover, the following holds. Theorem 4.5.2 Suppose that there exists a bounded continuous function σ : Rd → R ⊗ Rd such that a(x) = σ(x)σ(x)∗ and b(x) is also bounded and continuous. Then there exists a solution Py of the martingale problem for L starting from any y ∈ Rd . For the uniqueness of solutions, the following conclusive result is known.

4:36:54, subject to the Cambridge Core terms of use, .005

4.6 Exponential Martingales and Transformation of Drift

157

Theorem 4.5.3 Suppose that the diffusion coefficient a and the drift b satisfy (i) a is continuous, (ii) a(x) is strictly positive definite for any x, (iii) there exists a constant K such that |ai j (x)|  K(1 + |x|2 ), |bi (x)|  K(1 + |x|) (i, j = 1, 2, . . . , d). Then the martingale problem for L is well posed. In particular, the uniqueness in law holds for the stochastic differential equation (4.3.2). We omit the proof. See [114, Theorem 7.2.1] and a survey in [100, p.170]. As was mentioned in Section 4.1, one of the advantages of martingale problems is that the strong Markov property of the corresponding diffusion processes is deduced from the well-posedness of them. Theorem 4.5.4 Let L be the second order elliptic differential operator defined by (4.5.1). Assume that the martingale problem for L is well posed. Then, the coordinate process on Wd is a strong Markov process under the unique solution Py starting from y. A proof of the theorem is carried out as follows. Let τ be a bounded {Ft }stopping time, where {Ft }t0 is the filtration described in Section 4.4. Define θτ : W d → W d by (θτ w)(t) = w(τ(w) + t). Moreover, let (p|Fτ )(w, A) : W d × F → [0, 1] be the regular conditional probability of Py given Fτ . Then, the probability law of θτ under (p|Fτ )(w, ·) is a solution of the martingale problem starting from X(τ, w). By the uniqueness, it coincides with PX(τ,w) . The details are omitted.

4.6 Exponential Martingales and Transformation of Drift For α ∈ A d,r and β ∈ A d,1 , consider the stochastic differential equation (4.2.1) or (4.3.1): dX(t) = α(t, X) dB(t) + β(t, X) dt. The aim of this section is to introduce transformation of drift, which guarantees that, if there exists a solution of this equation, then the equation dX(t) = α(t, X) dB(t) + (β(t, X) + α(t, X)γ(t, X)) dt

(4.6.1)

4:36:54, subject to the Cambridge Core terms of use, .005

158

Stochastic Differential Equations

also has a solution for suitable γ ∈ A r,1 .6 For this purpose it is important to study whether the local martingale {E M (t)}t0 defined by   1 (4.6.2) E M (t) = exp M(t) − M (t) , 2 M being a local martingale, is a martingale or not. It should also be noted that transformations of drifts relate changes of probability measures on Wiener spaces. 2 For an M = {M(t)}t0 ∈ Lc,loc with M(0) = 0 on (Ω, F , P, {Ft }), define {E M (t)}t0 by (4.6.2). By Itô’s formula, we have the following. Proposition 4.6.1 E M (t) = 1 + continuous local martingale.

!t 0

E M (s) dM(s) and E M = {E M (t)}t0 is a

Since E M (t) > 0, E M is a supermartingale (Proposition 2.1.2). Hence E M is a martingale if and only if E[E M (t)] = 1

(t  0).

If E M is a martingale, a probability measure P M on (Ω, FT ), which is absolutely continuous with P, is defined by P M (A) = E[E M (t)1A ]

(A ∈ Ft , 0  t  T )

for any T > 0. The following Girsanov’s theorem, which is often called the Cameron–Martin–Maruyama–Girsanov theorem, is essential to achieve the transformation of drift. Theorem 4.6.2 Let (Ω, F , P, {Ft }) be a filtered probability space and B = Bi (t) = Bi (t) − {(B1 (t), B2 (t), . . . , Br (t))}t0 be an {Ft }-Brownian motion. Set  i 1 2    Br (t))}t0 is an

B , M (t). If E M is a martingale, then B = {( B (t), B (t), . . . ,  {Ft }-Brownian motion on (Ω, F , P M , {Ft }). Proof By virtue of the Lévy theorem (Theorem 2.5.1), it suffices to show that { Bi (t)}t0 (i = 1, 2, . . . , r) is a local martingale on (Ω, F , P M , {Ft }). Since  Bi (t)E M (t) −  Bi (0)

t

t

t i i E M (s) dB (s) + B (s) dE M (s) −

Bi , M (s) dE M (s) = 0 6

0

0

This was first shown by Maruyama [77, 78].

4:36:54, subject to the Cambridge Core terms of use, .005

4.6 Exponential Martingales and Transformation of Drift

159

by Itô’s formula, { Bi (t)E M (t)}t0 is a continuous local martingale under P. Hence, there exists a sequence of stopping times {τn } with P(τn → ∞) = 1 such that, for any s < t and A ∈ F s , E[ Bi (t ∧ τn )E M (t ∧ τn )1A ] = E[ Bi (s ∧ τn )E M (s ∧ τn )1A ]. Since A ∩ {τn > s} ∈ F s∧τn , Doob’s optional sampling theorem implies Bi (t ∧ τn )1A∩{τn >s} ] = E[ Bi (t ∧ τn )E M (T )1A∩{τn >s} ] EPM [ = E[ Bi (t ∧ τn )E M (t ∧ τn )1A∩{τn >s} ] = E[ Bi (s ∧ τn )E M (s ∧ τn )1A∩{τn >s} ] = E[ Bi (s ∧ τn )E M (T )1A∩{τn >s} ] = EPM [ Bi (s ∧ τn )1A∩{τn >s} ] If τn  s, then t ∧ τn = τn and Bi (t ∧ τn )1A∩{τn s} ] = EPM [ Bi (s ∧ τn )1A∩{τn s} ]. EPM [ Summing up the above, we obtain EPM [ Bi (t ∧ τn )1A ] = EPM [ Bi (s ∧ τn )1A ], which means that each component of  B is a local martingale under P M .



Remark 4.6.3 The probability measures P and P M are absolutely continuous on (Ω, Ft ) for any t > 0. However, they are not absolutely continuous on  (Ω, F∞ ) in general, where F∞ = t0 Ft . To see this, let d = 1 and set M(t) = μB(t) for a non-zero constant μ. Then,  B(t) = B(t) − μt is a Brownian motion under P M and     B(t) = 0 = 1, P M lim t−1 B(t) = μ = P M lim t−1  t→∞

t→∞

*

+ but P limt→∞ t−1 B(t) = μ = 0. If dP M = Φ dP on F∞ , then E M (T ) = E[Φ|FT ] and {E M (t)}t0 is uniformly integrable. Therefore, P and P M are absolutely continuous on (Ω, F∞ ) if and only if {E M (t)}t0 is uniformly integrable. See Theorem 1.5.16 on the martingale convergence theorem. Consider a local martingale {M(t)}t0 given by M(t) =

r k=1

t

γk (s, X) dBk (s),

(4.6.3)

0

where γ ∈ A r,1 . The following is obtained from Theorem 4.6.2.

4:36:54, subject to the Cambridge Core terms of use, .005

160

Stochastic Differential Equations

Theorem 4.6.4 Define M by (4.6.3). Suppose that E M is a martingale. If (X, B) is a solution of the stochastic differential equation (4.3.1), then (X,  B) is a solution of (4.6.1) under P M . From the above observations, it is important to have sufficient conditions on M so that E M is a martingale. The following two criteria are well known. 2 . If Theorem 4.6.5 (Novikov [90, 91]) Let M ∈ Mc,loc   1 (t  0), E exp M (t) < ∞ 2

(4.6.4)

then E[E M (t)] = 1 holds and {E M (t)}t0 is a continuous martingale [91, 92]. 2 . If Theorem 4.6.6 (Kazamaki [59]) Let M ∈ Mc,loc   1 (t  0), E exp M(t) < ∞ 2

(4.6.5)

then E[E M (t)] = 1 holds and {E M (t)}t0 is a continuous martingale [60]. The Novikov condition (4.6.4) implies the Kazamaki condition (4.6.5). In fact, for any α > 0, Schwarz’s inequality implies E[eαM(t) ] = E[eαM(t)−α M (t) eα M (t) ]  1  1 2 2  E[e2αM(t)−2α M (t) ] 2 E[e2α M (t) ] 2 . 2

2

Since {exp(2αM(t) − 2α2 M (t))}t0 is a supermartingale, we have  1 2 E[eαM(t) ]  E[e2α M (t) ] 2 . Set α = 12 to see that (4.6.5) is weaker. Hence, it suffices to show Theorem 4.6.6. However, we give the proofs of both theorems, since they both stem from the following lemma, and have their own interests. Lemma 4.6.7 Let {B(t)}t0 be a one-dimensional Brownian motion starting from 0 and set σa = inf{t > 0; B(t) = t − a} for a > 0. Then, E[e−λσa ] = e−(



1+2λ−1)a

(λ > 0).

(4.6.6)

√ Proof Set u(t, x) = exp(−λt − ( 1 + 2λ − 1)x). Then ∂u ∂u 1 ∂2 u − + = 0. ∂t ∂x 2 ∂x2

4:36:54, subject to the Cambridge Core terms of use, .005

4.6 Exponential Martingales and Transformation of Drift

Hence, by Itô’s formula,

u(t, B(t) − t) = 1 + 0

t

161

∂u (s, B(s) − s) dB(s). ∂x

Then, by the optional sampling theorem, E[u(t ∧ σa , B(t ∧ σa ) − (t ∧ σa ))] = 1. Since B(t ∧ σa ) − (t ∧ σa )  −a, due to the bounded convergence theorem, we obtain 1 = E[u(σa , B(σa ) − σa )] = E[e−λσa −(



1+2λ−1)(−a)



].

Proofs of Theorems 4.6.5 and 4.6.6 By Theorem 2.5.5, there exists an exten6t }-Brownian motion B with 6,  6t }) of (Ω, F , P, {Ft }) and an {F  F sion (Ω, P, {F 6t }B(0) = 0 such that M(t) = B( M (t)). For each s  0, M (s) is an {F stopping time. Setting σa = inf{t > 0; B(t) = t − a} as in Lemma 4.6.7, we have aea − t − a2 e 2 2t dt P(σa ∈ dt) = √ 2πt3 and σa

E[e 2 ] = ea .   Set Y(t) = exp B(t ∧ σa ) − 12 (t ∧ σa ) . By the optional sampling theorem and Fatou’s lemma, {Y(t)}t0 is a closable and, hence, uniformly integrable 6t }-martingale. {F First we prove Theorem 4.6.5. By the uniform integrability of {Y(t)}t0 , the optional sampling theorem implies    1 E exp B(σ ∧ σa ) − (σ ∧ σa ) = 1 2 6t }-stopping time σ. In particular, setting σ = M (t), we have for any {F σa

E[e−a+ 2 1{σa  M (t)} ] + E[e M(t)− −a+ σ2a

M (t) 2

1{σa > M (t)} ] = 1.

M (t) 2

1{σa  M (t)} ]  e−a E[e ], by the assumption, the first term of Since E[e

M (t) the left hand side tends to 0 as a → ∞. Therefore we obtain E[e M(t)− 2 ] = 1 and the conclusion of Theorem 4.6.5. Next we prove Theorem 4.6.6. Since {Y(t)}t0 is uniformly integrable, by the optional sampling theorem, E[Y( M (t))] = 1. Rewrite as E[Y( M (t))1{σa > M (t)} ] + E[Y( M (t))1{σa  M (t)} ] = 1.

(4.6.7)

If σa > M (t), then Y( M (t)) = E M (t). Hence E[Y( M (t))1{σa > M (t)} ]  E[E M (t)].

(4.6.8)

4:36:54, subject to the Cambridge Core terms of use, .005

162

Stochastic Differential Equations

For the second term of (4.6.7), we have 1

1

E[Y( M (t))1{σa  M (t)} ]  E[e 4 B(σa ∧ M (t)) e 4 B(σa )− 2 σa ]  I12 I22 , 1

3

1

where I1 and I2 are given by I1 = E[e 2 B(σa ∧ M (t)) ] 1

and

3

I2 = E[e 2 B(σa )−σa ].

6t }-stopping time T a = inf{t  0; M (t)  To estimate I1 , we consider an {F σa }. Since M (t ∧ T a ) = M (t) ∧ σa and {M(t) = B( M (t))}t0 is an 6 M (t) }-martingale, by the optional sampling theorem, we have {F 6 M (t∧T ) ] = B( M (t ∧ T a )) = B( M (t) ∧ σa ). E[B( M (t))|F a Hence, by Jensen’s inequality,  1  6 M (t∧T ) ] I1 = E exp E[B( M (t))|F a 2    M(t)   1  E exp B( M (t)) = E exp . 2 2 On the other hand, since B(σa ) − σa = −a, σa

I2 = E[e 2 ]e− 2 = e− 2 . 3a

a

Hence, the second term of the left hand side of (4.6.7) tends to 0 as a → ∞. By (4.6.7) and (4.6.8), we obtain E[E M (t)]  1 for every t  0 and the conclusion.  Finally we present a necessary and sufficient condition for exponential local martingales to be martingales by an explosion problem of solutions of the corresponding stochastic differential equations. Let b and Vk : Rd → Rd (k = 1, 2, . . . , r) be continuous and identify them with vector fields on Rd . Suppose that the stochastic differential equation dX(t) =

r

Vk (X(t)) dBk (t) + b(X(t)) dt,

X(0) = x

k=1

has a unique solution X = {X(t)}t0 and denote its explosion time by ζ.  b = b − rk=1 fk Vk . Let fk (k = 1, 2, . . . , r) be Borel functions on Rd and set  Suppose also that the stochastic differential equation dY(t) =

r

Vk (Y(t)) dBk (t) +  b(Y(t)) dt,

Y(0) = x

k=1

4:36:54, subject to the Cambridge Core terms of use, .005

4.6 Exponential Martingales and Transformation of Drift

163

has a unique solution Y = {Y(t)}t0 and define a local martingale M = {M(t)}t0 by r t r   1 t fk (Y(s)) dBk (s) − fk (Y(s))2 ds . M(t) = exp 2 k=1 0 k=1 0 Then, we have the following ([84, Section 3.7]). Theorem 4.6.8 Assume that Y does not explode. Then, for any T > 0 and Borel subset A in WdT , P(X ∈ A, ζ > T ) = E[M(T )1{Y∈A} ]. X does not explode if and only if the identity E[M(T )] = 1 holds, which is equivalent to M being a martingale. Proof Let ϕn ∈ C ∞ (Rd ) be a function such that ⎧ ⎪ ⎪ ⎨1 (|x|  n + 1), ϕn (x) = ⎪ ⎪ ⎩0 (|x|  n + 2) and set Vk(n) = ϕn Vk ,

 b(n) = ϕn b.

b(n) = ϕn b,

Denote by X (n) = {X (n) (t)}t0 and Y (n) = {Y (n) (t)}t0 the solutions of the stochastic differential equations dX(t) = dY(t) =

r k=1 r

Vk(n) (X(t)) dBk (t) + b(n) (X(t)) dt,

X(0) = x,

Vk(n) (Y(t)) dBk (t) +  b(n) (Y(t)) dt,

Y(0) = x,

k=1

respectively. Set ζn = inf{t  0; |X (n) (t)|  n}

ζnY = inf{t  0; |Y (n) (t)|  n}.

and

By the assumption, limn→∞ ζn = ζ and limn→∞ ζnY = ∞ almost surely. Now define M (n) = {M (n) (t)}t0 by M (n) (t) = exp

r  k=1

t

ϕn+1 (Y (n) (s)) fk (Y (n) (s)) dBk (s)

0

1 2 k=1 r



t

 ϕn+1 (Y (n) (s))2 fk (Y (n) (s))2 ds .

0

4:36:54, subject to the Cambridge Core terms of use, .005

164

Stochastic Differential Equations

Then M (n) is a martingale and, by Theorem 4.6.4, P(X (n) ∈ A ) = P[M (n) (T )1{Y (n) ∈A } ] for any Borel subset A in WdT . Hence, setting τn (w) = inf{t  0; |w(t)|  n}

(w ∈ Wd )

and putting A = A ∩ {τn > T }, we obtain P(X ∈ A, ζn > T ) = E[M(T )1{Y∈A,ζnY >T } ]. Let n → ∞ to obtain the desired conclusion. The assertion (2) follows from (1).



4.7 Solutions by Time Change Time changes, whose importance was seen in the representation theorems for martingales, are also useful to solve stochastic differential equations. In fact, in the proof of Theorem 4.3.8, we have already used a similar argument as described below. A non-negative {Ft }-adapted stochastic process φ = {φ(t)}t0 defined on a probability space (Ω, F , P, {Ft }) is called a time change process with respect to {Ft } if (i) φ(0) = 0, (ii) φ is continuous and strictly increasing, (iii) limt↑∞ φ(t) = ∞. Denote the inverse function of t → φ(t, ω) by φ−1 . For arbitrarily fixed u  0, {φ−1 (u)  t} = {φ(t)  u} ∈ Ft for any t. Thus φ−1 (u) is an {Ft }-stopping time. Given an {Ft }-adapted stochastic process X = {X(t)}t0 , the stochastic process T φ X = {(T φ X)(t)}t0 defined by (T φ X)(t) = X(φ−1 (t)) is {Fφ−1 (t) }-adapted. T φ X is called a time change of X by the time change process φ. 62 the set of square 6t for the σ-field Fφ−1 (t) and denote by M Write simply F c,loc 6t }-local martingales. integrable continuous {F Doob’s optional sampling theorem implies the following. 2 62 . , then T φ M ∈ M Lemma 4.7.1 (1) If M ∈ Mc,loc c,loc 2 φ φ (2) If M1 , M2 ∈ Mc,loc , then T M1 , T M2 = T φ M1 , M2 .

First we discuss the time change in the framework of martingale problems. Let L be the second order elliptic differential operator on Rd defined by (4.5.1)

4:36:54, subject to the Cambridge Core terms of use, .005

4.7 Solutions by Time Change

165

and P x (x ∈ Rd ) be a solution of the martingale problem for L starting from x, using the notation in Section 4.5. Let ρ : Rd → (0, ∞) be a measurable function and set

t ϕ(t) = ρ(X(s)) ds. 0

Suppose that limt↑∞ ϕ(t) = ∞, P -almost surely, and set ψ = ϕ−1 . Then, ϕ = {ϕ(t)}t0 is a time change process with respect to the natural filtration {Ft }. Hence, by the optional sampling theorem,

ψ(t) f (t) ≡ f (X(ψ(t))) − f (x) − C (L f )(X(s)) ds y

0

 = X(ψ(t)), we Setting X(t) is an {Fψ(t) }-local martingale for any f ∈ have

t  − f (x) −  −1 (L f )(X(u))  f (t) = f (X(t)) ρ(X(u)) du. C C0∞ (Rd ).

0

 t0 is a solution of the martingale This means that the probability law of {X(t)} −1 problem for ρ L. We shall discuss time changes in detail in the case of d = 1. Consider the one-dimensional stochastic differential equation without drift dX(t) = α(t, X) dB(t),

X(0) = 0

(4.7.1)

for α ∈ A 1,1 . Solutions for stochastic differential equations with drifts can be constructed from the solution for this equation by applying the transformation of drifts. In the rest of this section, we assume that there exist positive constants C1 and C2 such that C1  α(t, w)  C2

(t  0, w ∈ W1 ).

For the methods of time change in a more general setting, see for example [56]. Theorem 4.7.2 (1) Let θ = {θ(t)}t0 be a one-dimensional Brownian motion starting from 0 defined on a probability space (Ω, F , P, {Ft }). Set ξ(t) = X(0)+ θ(t) and suppose that there exists a time change process φ such that

t φ(t) = α(φ(s), T φ ξ)−2 ds. (4.7.2) 0

Moreover, set X = T φ ξ,

6t = Fφ−1 (t) . F

4:36:54, subject to the Cambridge Core terms of use, .005

166

Stochastic Differential Equations

6t }-Brownian motion B = {B(t)}t0 satisfying (4.7.1). Then, there exists an {F (2) Conversely, let (X, B) be a solution of (4.7.1) defined on a probability space (Ω, F , P, {Ft }). Then, there exist a filtration {Gt }, a {Gt }-Brownian motion θ = {θ(t)}t0 starting from 0, and a time change process φ = {φ(t)}t0 with respect to {Gt } such that ξ = {ξ(t) = X(0) + θ(t)}t0 satisfies (4.7.2) and X = T φ ξ. (3) If there exists a unique time change process φ satisfying (4.7.2), X = T φ ξ is the unique solution of (4.7.1). 62 , M(t) = X(t) − X(0) and M (t) = Proof (1) We have M = T φ θ ∈ M c,loc !t φ−1 (t). Since t = 0 α(φ(s), X)2 dφ(s) by (4.7.2),

M (t) = φ−1 (t) =

φ−1 (t)

t

α2 (φ(s), X)2 dφ(s) =

0

α(s, X)2 ds.

(4.7.3)

0

Set

t

B(t) =

α(s, X)−1 dM(s).

0

62 and (4.7.3) implies Then, B = {B(t)}t0 ∈ M c,loc

t

B (t) = α(s, X)−2 d M (s) = t. 0

6t }-Brownian motion and Hence, B is an {F

t X(t) − X(0) = M(t) = α(s, X)dB(s). 0 2 (2) Set M(t) = X(t) − X(0). Then M ∈ Mc,loc and M (t) = Moreover, set

ψ(t) = M (t),

φ(t) = ψ−1 (t) and

!t 0

α(s, X)2 ds.

6t = Fψ−1 (t) . F

6t }. Hence, by Then φ = {φ(t)}t0 is a time change process with respect to {F the martingale representation theorem (Theorem 2.5.3), the stochastic process 6t }-Brownian motion. θ = {θ(t)}t0 given by θ(t) = M(φ(t)) is an {F !t Now set ξ(t) = X(0) + θ(t). Then, since T φ ξ = X and 0 α(s, X)−2 dψ(s) = t, we have

t

t

φ(t) α(s, X)−2 dψ(s) = α(φ(u), X)−2 du = α(φ(u), T φ ξ)−2 du. φ(t) = 0

0

0

This means that φ satisfies (4.7.2). The assertion of (3) follows from those of (1) and (2).



4:36:54, subject to the Cambridge Core terms of use, .005

4.8 One-Dimensional Diffusion Process

167

We give some examples. We use the same notations as in Theorem 4.7.2 and its proof. Example 4.7.3 Let a : R1 → [0, ∞) be a bounded measurable function and assume that there exists a positive constant C such that a(x)  C (x ∈ R1 ), and let θ be a one-dimensional Brownian motion starting from 0. Set ξ(t) = X(0) + θ(t) and

t a(ξ(s))−2 ds. φ(t) = 0

Then, the time change process satisfying (4.7.2) exists uniquely and X(t) = ξ(φ−1 (t)) satisfies the following time-homogeneous Markovian stochastic differential equation: dX(t) = a(X(t)) dB(t). Example 4.7.4 Let a : [0, ∞)×R1 → [0, ∞) be a bounded measurable function and assume that there exists a positive constant C such that a(t, x)  C ((t, x) ∈ [0, ∞) × R1 ). In order to solve the stochastic differential equation dX(t) = a(t, X(t)) dB(t) by the method of time change, we need a time change process φ = {φ(t)}t0 satisfying

t

t  −2 φ a φ(s), T ξ(φ(s)) ds = a(φ(s), ξ(s))−2 ds. φ(t) = 0

0

This problem of finding φ is equivalent to solving the ordinary differential equation dφ(t) = a(φ(t), ξ(t))−2 , φ(0) = 0 dt for each sample path {ξ(t)}t0 . Hence, under the assumption on the existence and uniqueness of solutions of this ordinary differential equation (for example, suppose that a(t, x) is Lipschitz continuous in t and the Lipschitz constant is independent of x), the unique solution of the above stochastic differential equation is given by X(t) = ξ(φ−1 (t)).

4.8 One-Dimensional Diffusion Process The purpose of this section is to study the explosion problem for the onedimensional diffusion processes which are determined by stochastic differential equations. As an application, we present a result on the explosion problem

4:36:54, subject to the Cambridge Core terms of use, .005

168

Stochastic Differential Equations

for multi-dimensional diffusions. For one-dimensional diffusion processes, the explosion problem is solved almost completely and the results are shown in detail in Itô and McKean [50]. See also [46, 56, 100]. Let −∞  < r  ∞ and I = ( , r) be an open interval. For continuous functions σ and b on I such that σ(x) > 0 (x ∈ I), consider the stochastic differential equation dX(t) = σ(X(t)) dB(t) + b(X(t)) dt,

X(0) = x ∈ I.

Assume that this equation has a unique solution X x = {X x (t)}tζ , ζ being its life time. Let an and bn (n = 1, 2, . . .) be sequences satisfying an ↓ and bn ↑ r, respectively, and set τn = inf{t > 0; X x (t) = an or bn }. Then, ζ = limn→∞ τn . Moreover, if ζ < ∞, then there exists limt↑ζ X x (t) and it is or r.7 Define X x (t) = lim s↑ζ X x (s) for t  ζ. Then X x is a random variable with values in ' &  I = w ∈ W([ , r]) ; w(t) ∈ ( , r) (0  t < ζ(w)) W , and w(t) = w(ζ(w)) (t  ζ(w)) where ζ(w) = inf{t > 0; w(t) = or r}. Denote by P x the probability law of X x  I . Then, {P x } x∈I is a diffusion measure on I with generator on W L= Fix c ∈ I and set

d 1 d2 σ(x)2 2 + b(x) . 2 dx dx

x

s(x) =

 exp −

c

y c

2b(z)  dz dy. σ(z)2

s is called a scale function. It is strictly increasing on I and satisfies Ls(x) = 0 Defining m (x) by m (x) = we have L= 7

(x ∈ I).

(4.8.1)

 x 2b(z)  2 exp dz , 2 σ(x)2 c σ(z) 1 d 1 d . m (x) dx s (x) dx

We do not need to consider one-point compactification in the one-dimensional case.

4:36:54, subject to the Cambridge Core terms of use, .005

4.8 One-Dimensional Diffusion Process

169

The measure m with density m (x) is called a speed measure:

b m (ξ) dξ. m([a, b]) = a

Proposition 4.8.1 Let a, b, x ∈ I satisfy a < x < b and set τa = inf{t > 0; X(t) = a},

τb = inf{t > 0; X(t) = b}.

Then P x (τb < τa ) =

s(x) − s(a) . s(b) − s(a)

Proof Set τ = τa ∧ τb . By Itô’s formula and (4.8.1),

t∧τ s (X(u))σ(X(u)) dB(u) s(X(t ∧ τ)) − s(x) = 0

and E x [s(X(t ∧ τ))] = s(x). Hence, by letting t ↑ ∞ and applying the bounded convergence theorem, we obtain s(x) = E[s(X(τ))] = s(a)P x (τa < τb ) + s(b)P x (τa > τb ). Combining this with P x (τa < τb ) + P x (τa > τb ) = 1, we obtain the conclusion.  Example 4.8.2 (Brownian motion with constant drift) Let {B(t)}t0 be a one-dimensional Brownian motion with B(0) = 0 and μ ∈ R \ {0}. Set X(t, x) = x + B(t) + μt. Then, {X(t, x)}t0 determines a diffusion process on R with generator L=

1 1 d 1 d d 1 d2 = + μ . 2 dx2 dx 2 e2μx dx e−2μx dx

A scale function is given by

x

s(x) =

e−2μx dx =

0

1 − e−2μx . 2μ

Hence, if a < x < b, we have P x (τb < τa ) =

e−2μa − e−2μx . e−2μa − e−2μb

The next is a fundamental theorem for the explosion of one-dimensional diffusion processes.

4:36:54, subject to the Cambridge Core terms of use, .005

170

Stochastic Differential Equations

Theorem 4.8.3 Set s( +) = lim x↓ s(x) and s(r−) = lim s(x), and let x ∈ I. x↑r

(1) If s( +) = −∞ and s(r−) = ∞, then     P x (ζ = ∞) = P x sup X(t) = r = P x inf X(t) = = 1. 0t 0; X(t)  (a, b)}. Then, by Itô’s formula, d(e−t u(X(t))) = e−t u (X(t))σ(X(t))dB(t) and {e−(t∧τ) u(X(t∧τ))}t0 is a positive martingale. Hence, by letting a ↓ , b ↑ r and applying Fatou’s lemma, we see that {e−(t∧ζ) u(X(t∧ζ))}t0 is a non-negative supermartingale. (1) By (4.8.4), lim x↓ u(x) = lim x↑r u(x) = ∞. Since t → e−(t∧ζ) u(X(t ∧ ζ)) is a non-negative supermartingale, it converges as t ↑ ∞ almost surely. By its boundedness, we get P x (ζ = ∞) = 1. (2) We only give a proof under the assumption ν(r−) < ∞. The other case can be proven in the same way. We may assume c < x, because c is chosen arbitrarily. By (4.8.4), we have u(r−) < ∞. Set τ = inf{t > 0; X(t) = c}. Then, since −(t∧τ ) u(X(t ∧ τ ))}t0 is a bounded martingale, {e 

u(x) = E x [e−(t∧τ ) u(X(t ∧ τ ))] 

= E x [e−ζ u(r−)1{limt↑ζ∧τ X(t)=r} ] + E x [e−τ u(c)1{limt↑ζ∧τ X(t)=c} ]. Now suppose that the first term of the right hand side is 0. Then, 

u(x) = u(c)E x [e−τ 1{limt↑ζ∧τ X(t)=c} ]  u(c) = 1, which contradicts (4.8.4). Hence, we have E x [e−ζ 1{limt↑ζ∧τ X(t)=r} ] > 0 and, therefore, P x (ζ < ∞) > 0. (3) We first show the “only if” part. To do this, suppose that P x (ζ < ∞) = 1 for any x ∈ I. The proof is divided into two parts according as ν(r−) is finite or not. First assume that ν(r−) < ∞. Then, if none of (i), (ii), and (iii) holds, then ν( +) = ∞ and s( +) > −∞. Since s(r−) < ∞ by Remark 4.8.6, we have   P x lim X(t) = > 0 t↑ζ

4:36:54, subject to the Cambridge Core terms of use, .005

4.8 One-Dimensional Diffusion Process

175

by Theorem 4.8.3. ν( +) = ∞ implies lim x↓ u(x) = ∞. Moreover, since {e−(t∧ζ) u(X(t ∧ ζ))}t0 is a non-negative supermartingale, {limt↑ζ X(t) = } ⊂ {ζ = ∞}, P x -a.s., which implies P x (ζ = ∞) > 0. This is a contradiction. Thus one of the conditions (i)–(iii) holds. Secondly assume that ν(r−) = ∞. If (iii) does not hold, then ν( +) = ∞ or s(r−) < ∞. When ν( +) = ∞, P x (ζ = ∞) = 1 by (1). This contradicts the assumption. When s(r−) < ∞, P x (limt↑ζ X(t) = r) > 0 by Theorem 4.8.3. Combining this with u(r−) = ∞, we have P x (ζ = ∞) > 0. This contradicts the assumption. Thus the condition (iii) holds. We now proceed to the proof of the “if” part. At first assume (i). Define a function G(x, y) by ⎧ (s(x) − s( +))(s(r−) − s(y)) ⎪ ⎪ ⎪ (x < y), ⎪ ⎪ ⎨ s(r−) − s( +) G(x, y) = ⎪ ⎪ (s(y) − s( +))(s(r−) − s(x)) ⎪ ⎪ ⎪ (y  x). ⎩ s(r−) − s( +) For a bounded continuous function f on I, set

v(x) = G(x, y) f (y) m (y)dy. I

Then, under the condition (i), v is a bounded function of C 2 -class on I and v( +) = v(r−) = 0,

Lv = − f.

(4.8.8)

In particular, setting f = 1 and defining

u1 (x) = G(x, y) m (y)dy, I

we obtain by Itô’s formula

u1 (X(t ∧ ζ)) − u1 (x) = 0

t∧ζ

u1 (X(s)) dB(s) − (t ∧ ζ)

and E x [t ∧ ζ] = u1 (x) − E x [u1 (X(t ∧ ζ))]. Letting t ↑ ∞, by Theorem 4.8.3(4) and (4.8.8), we get E x [ζ] = u1 (x). Hence P x (ζ < ∞) = 1. Next assume (ii). Set τn = inf{t > 0; X(t) = + n−1 } and σr = inf{t > 0; X(t) = r}. Then, ζ = limn→∞ (τn ∧ σr ) and P x (τn ∧ σr < ∞) = 1 for any x ∈ I by (i). Since s(r−) < ∞, limn→∞ P x (σr < τn ) = 1 by Theorem 4.8.3(2). Hence P x (σr < ∞) = 1. Since ζ  σr , we have P x (ζ < ∞) = 1. When (iii) is assumed, we obtain the conclusion by changing and r in the argument of the proof of (ii). 

4:36:54, subject to the Cambridge Core terms of use, .005

176

Stochastic Differential Equations

Example 4.8.10 Let C and δ be positive constants and consider a diffusion process R with generator LC,δ =

d 1 d + C|x|δ . 2 dx2 dx

We take a scale function given by

x  2C  s(x) = ξδ+1 dξ exp − δ+1 0

(x > 0).

For x < 0, set s(x) = −s(|x|). Define the speed measure by m (x) = 2(s (x))−1 . By L’Hospital’s rule, 1 m([0, x]) = . lim x→∞ x−δ m (x) 2C If δ > 1, then

ν(∞) =



m([0, ξ])s (ξ) dξ < ∞.

0

By symmetry, we also have ν(−∞) < ∞. Hence, in this case, the diffusion process corresponding to LC,δ explodes in finite time almost surely for any C > 0. If 0  δ  1, then we do not have explosion almost surely. As an application of Theorem 4.8.8, we show a sufficient condition for general dimensional diffusion processes not to explode or to explode almost surely. It is called Khasminskii’s condition. For d  2, let σ : Rd → Rd ⊗ Rd and b : Rd → Rd be continuous functions. Consider the Markovian stochastic differential equation dX(t) = σ(X(t)) dB(t) + b(X(t)) dt.

(4.8.9)

Denote by , the inner product in Rd and define functions a, a0 , b0 on Rd by a(x) = σ(x)σ(x)∗ , a0 (x) = x, a(x)x , d  1  ii a (x) + 2 b(x), x . b0 (x) = a0 (x) i=1 We assume that the diffusion matrix a(x) is non-degenerate for every x ∈ Rd . For r > 0, set a+ (r) = max a0 (x),

b+ (r) = max b0 (x),

a− (r) = min a0 (x),

b− (r) = min b0 (x)

|x|=r

|x|=r

|x|=r

|x|=r

4:36:54, subject to the Cambridge Core terms of use, .005

4.8 One-Dimensional Diffusion Process and define s± , m± : [0, ∞) → [0, ∞) by  r   s± (r) = exp − b± (ξ)ξ dξ , 1

m± (r) =

177

2 . a± (r)s± (r)

Theorem 4.8.11 Suppose that the stochastic differential equation (4.8.9) has a unique solution X = {X(t)}t0 starting from x ∈ Rd . Denote its probability law by! P x and let ζ! be the life time. ∞ r (1) If 1 s+ (r)r dr 1 m+ (ξ)ξ dξ = ∞, then P x (ζ = ∞) = 1 for every x ∈ Rd . !∞ !r (2) If 1 s− (r)r dr 1 m− (ξ)ξ dξ < ∞, then P x (ζ < ∞) = 1 for every x ∈ Rd . Proof We follow [84, Section 4.5]. See also [100, Section V.52] and [114, Chapter 10]. ) (1) For r  1, set r = 2ρ and

√2ρ

ξ * η2 + s+ (ξ)ξ dξ un−1 u0 (ρ) = 1, un (ρ) = m (η)η dη. 2 + 1 1 Then, the solution for the ordinary differential equation  ) ) 1 1 1 d du  (ρ) , u(ρ) = a+ ( 2ρ)(u (ρ) + b+ ( 2ρ)u (ρ)) = ) ) 2 m+ ( 2ρ) dρ s+ ( 2ρ) dρ 1 du  1  u ) = 1, =0 2 dρ 2  1 is given by u(ρ) = ∞ n=0 un (ρ) (see Lemma 4.8.9). We extend u for ρ < 2 so ∞ that u ∈ C (R). Under the assumption, we have ) 1 u(ρ)  u1 (ρ) → ∞ (ρ → ∞), u (ρ) > 0, u (ρ) + b+ ( 2ρ)u (ρ)  0 (ρ > ). 2 Let L be the generator of {X(t)}t0 : L=

d d ∂2 ∂ 1 ij a (x) i j + bi (x) i . 2 i, j=1 ∂x ∂x ∂x j=1

Set v(x) = u( |x|2 ) for x ∈ Rd . Then, 2

Lv(x) =

, * |x|2 + * |x|2 +1 a0 (x) u + b0 (x)u . 2 2 2

Hence −v(x) + Lv(x)  0

if |x|  1.

4:36:54, subject to the Cambridge Core terms of use, .005

178

Stochastic Differential Equations

By Itô’s formula and the time change argument, there exists a one-dimensional Brownian motion {β(t)}t0 and an increasing process {A(t)}t0 such that

t −t e−s (−v + Lv)(X(s))ds. e v(X(t)) − v(x) = β(A(t)) + 0

Let γ = max{t > 0 ; |X(t)| = 1} and γ = 0 if {. . .} = ∅. Take ω ∈ {ζ < ∞}. Then γ(ω) < ζ(ω) and |X(s, ω)| > 1 for any s ∈ (γ(ω), ζ(ω)), and limt→ζ(ω) v(X(t)) = ∞. Moreover, 1 = β(A(t, ω), ω) − β(A(γ(ω), ω), ω) 2

t e−s (−v + Lv)(X(s, ω)) ds +

e−t v(X(t, ω)) − e−γ(ω) u

γ(ω)

 β(A(t, ω), ω) − β(A(γ(ω), ω), ω) for any t ∈ (γ(ω), ζ(ω)). This implies lim inf t→ζ(ω) e−t v(X(t, ω)) < ∞, which contradicts limt→ζ(ω) v(X(t)) = ∞ shown above. Thus P x (ζ = ∞) = 1. (2) Under the assumption, the solution u of the ordinary differential equation  ) ) 1 1 d du  1 (ρ) , u(ρ) = a− ( 2ρ)(u (ρ) + b− ( 2ρ)u (ρ)) = ) ) 2 m− ( 2ρ) dρ s− ( 2ρ) dρ 1 du  1  u = 1, =0 2 dρ 2 satisfies  1 < u(ρ) < exp



1



s− (ξ)ξ dξ

1

ξ



m− (η)η dη

(ρ >

1 ) 2

and is bounded. 2 By the same argument as (1), ν(x) = u( |x|2 ) satisfies −ν+Lν  0 (|x|  1). Let T R = inf{t  0; |X(t)| = R}. Then, by Itô’s formula and the optional sampling theorem, we obtain for |x| > 1  n2  1  |x|2  + E x [e−T1 1{Tn T1 } ]u u . 2 2 2 Letting n → ∞ and setting u(∞) = limρ→∞ u(ρ), we have E x [e−Tn 1{Tn 1 − ε holds for all x ∈ Rd . Since ε > 0 is arbitrary,  P x (ζ < ∞) = 1. The fact used in the above proof of Theorem 4.8.11 is important and we give a proof. Theorem 4.8.12 Let K be a compact set containing X(0) = x and τK be the exit time from K of X. Then, under the same assumption as Theorem 4.8.11, E x [τK ] < ∞. Proof We may assume that K is a ball B(R) with center at the origin and radius R. Fix T > 0 and set τT = τB(R) ∧ T . Then, φ(x) = − exp(αx1 ) satisfies d d 1 i j ∂2 φ(x) i ∂φ(x) a (x) i j + b (x) 2 i, j=1 ∂x ∂x ∂xi i=1 1  = − α2 a11 (x) + αb1 (x) eαx1 2 and Lφ(x)  −1 (x ∈ B(R)) for sufficiently large α by the assumption. Then, by Itô’s formula, we get E x [φ(X(τT ))]  φ(x) − E x [τT ] and, letting T → ∞,  E x [τB(R) ] < ∞.

Lφ(x) =

The essence of Khasminskii’s theorem is a comparison theorem for onedimensional diffusion processes. In fact, in [45], the comparison theorem is proven first, and, as an application of it, Theorem 4.8.11 is shown. For another approach to the explosion problem, we refer the reader to [39].

4:36:54, subject to the Cambridge Core terms of use, .005

180

Stochastic Differential Equations

4.9 Linear Stochastic Differential Equations Consider an ordinary differential equation with constant coefficients: dx(t) = ax(t) + b. dt Since (e−at x(t)) = e−at (x (t) − ax(t)) = be−at , this equation can be solved as

t   at e−as ds . x(t) = e x(0) + b 0

For a one-dimensional stochastic differential equation with constant coefficients dX(t) = σ dB(t) + (aX(t) + b) dt, the situation is similar and it can be checked by Itô’s formula that the solution is given by

t

t   X(t) = eat X(0) + σ e−as dB(s) + b e−as ds . 0

0

In this section we generalize this observation to multi-dimensional cases. As applications, a higher-dimensional Ornstein–Uhlenbeck process and a Brownian bridge will be investigated. Let σ(t), A(t), and b(t) be locally bounded Borel measurable functions on [0, ∞) with values in Rd ⊗ Rr , Rd ⊗ Rd , and Rd , respectively. Let {B(t)}t0 be an r-dimensional {Ft }-Brownian motion defined on a filtered probability space (Ω, F , P, {Ft }) and X(0) be an F0 -measurable Rd -valued random variable. Consider the stochastic differential equation dX(t) = σ(t) dB(t) + (A(t)X(t) + b(t)) dt,

X(0) = X0 .

(4.9.1)

Denote the unique strong solution by X = {X(t)}t0 . Associated with (4.9.1), we consider the ordinary differential equation dx(t) = A(t)x(t) + b(t), x(0) = x0 (4.9.2) dt and denote the solution by x(t). Moreover, denote by Φ(t) the solution of the matrix equation dΦ(t) = A(t)Φ(t), Φ(0) = I, (4.9.3) dt where I is the d-dimensional unit matrix. For the function Ψ(t) given as the solution of the equation dΨ(t) = −Ψ(t)A(t), dt

Ψ(0) = I,

4:36:54, subject to the Cambridge Core terms of use, .005

4.9 Linear Stochastic Differential Equations

181

we have dtd (Ψ(t)Φ(t)) = 0 and Ψ(t)Φ(t) = I. Hence, Φ(t) is non-degenerate for any t  0 and

t   Φ−1 (s)b(s) ds . x(t) = Φ(t) x(0) + 0

Also, for the stochastic differential equation (4.9.1), the solution is explicitly expressed. Proposition 4.9.1 (1) Let Φ(t) be a solution of (4.9.3). Then the solution of the stochastic differential equation (4.9.1) is given by

t

t −1 Φ (s)σ(s) dB(s) + Φ−1 (s)b(s) ds . (4.9.4) X(t) = Φ(t) X0 + 0

0

(2) m(t) = E[X(t)] is a solution of (4.9.2) satisfying m(0) = E[X0 ]. (3) The covariance matrix R(s, t) = E[(X(s) − m(s))(X(t) − m(t))∗ ] is given by

s∧t  Φ−1 (u)σ(u)σ(u)∗ Φ−1 (u)∗ du Φ(t)∗ , R(s, t) = Φ(s) R(0, 0) + 0

where A∗ is the transpose matrix of A. Moreover, R(t) = R(t, t) satisfies dR(t) = A(t)R(t) + R(t)A(t)∗ + σ(t)σ(t)∗ . dt

(4.9.5)

Proof (1) is verified by Itô’s formula. (2) is obtained by taking the expectation of both sides of (4.9.1). By (1),  t ∗  s Φ−1 (u)σ(u) dB(u) Φ−1 (u)σ(u) dB(u) Φ(t)∗ R(s, t) = Φ(s)E 0

0 ∗

+ Φ(s)E[(X0 − m(0))(X0 − m(0)) ]Φ(t)∗  s∧t  = Φ(s) R(0, 0) + E Φ−1 (u)σ(u) dB(u) 0 ∗  s∧t Φ−1 (u)σ(u) dB(u) Φ(t)∗ . × 0

Computing the expectation of the right hand sides by using Itô’s isometry, we obtain the first assertion of (3). (4.9.5) follows from (4.9.3).  Remark 4.9.2 If the initial distribution is d-dimensional Gaussian, then that of X(t) (t > 0) is also Gaussian.

4:36:54, subject to the Cambridge Core terms of use, .005

182

Stochastic Differential Equations

Example 4.9.3 (Multi-dimensional Ornstein–Uhlenbeck process) When σ(t) and A(t) are constant matrices, say σ and A, respectively, Φ(t) = etA , and

t

t X(t) = etA X0 + e(t−s)A σ dB(s) + e(t−s)A b(s) ds, 0

0

the covariance matrix is given by tA∗

R(t) = e R(0)e tA

+

t



e(t−u)A σσ∗ e(t−u)A du.

0

Suppose that the real part of each eigenvalue of A is negative. Set

∞ ∗ R= euA σσ∗ euA du 0

and assume that R(0) = R. Then R(t) does not depend on t. In fact, since

∞  d uA ∗ uA∗  AR + RA∗ = du = −σσ∗ , e σσ e 0 du we have # t # t $ $ ∗ tA −uA ∗ −uA∗ tA∗ tA −uA ∗ −uA∗ e σσ e du e = −e e (AR + RA )e du etA e 0

0

#

t

= etA 0

$ d  −uA −uA∗  ∗ ∗ du etA = R − etA RetA e Re du

and R(t) = R (t  0). The covariance matrix is given by ⎧ (s−t)A ⎪ ⎪ R (0  t  s) ⎨e R(s, t) = ⎪ ⎪ ⎩Re(t−s)A∗ (0  s  t). Example 4.9.4 (Brownian bridge) Let r = d = 1, and fix x, y ∈ R and T > 0. The solution {X x,y (t)}0t 0 and p  2. Then, there exists a positive constant C such that p

E[|X(t, x) − X(s, y)| p ]  C(|x − y| p + |t − s| 2 ) for all x, y ∈ Rd and s, t ∈ [0, T ]. Moreover, {X(t, x)}t0,x∈Rd has a modification which is Hölder continuous in (t, x). Proof The second assertion is an immediate consequence of the first one and Kolmogorov’s continuity theorem (Theorem A.5.1). Thus we give a proof of only the first assertion. First we consider the case where x = y. Let s < t. By the elementary inequality |a1 + a2 + · · · + ar+1 | p  (r + 1) p−1 (|a1 | p + |a2 | p + · · · + |ar+1 | p ), we have E[|X(t, x) − X(s, x)| p ]  (r + 1) p−1

 p r  t E  Vk (X(u, x)) dBk (u) s

k=1

 p  t V0 (X(u, x)) du . + (r + 1) p−1 E  s

Moreover, by the Burkholder–Davis–Gundy inequality (Theorem 2.4.1) and the boundedness of the Vk s, there exist positive constants C1 and C2 such that8 r  t  2p 2 E[|X(t, x) − X(s, x)| ]  C1 E |Vk (X(u, x))| du + C1 (t − s) p p

s

k=1 p 2

 C2 ((t − s) + (t − s) p )

(0  s < t  T, x ∈ Rd ).

Hence there exists a constant C3 such that p

E[|X(t, x) − X(s, x)| p ]  C3 (t − s) 2

(s, t ∈ [0, T ], x ∈ Rd ).

Next consider the case where s = t. Since 8

Here and in the remainder of this section, constants C1 , C2 , . . . depend only on d, r, p, T, and the bounds and the Lipschitz constants of the Vk s.

4:36:54, subject to the Cambridge Core terms of use, .005

4.10 Stochastic Flows

185

r t   X(t, x) − X(t, y) = x − y + Vk (X(u, x)) − Vk (X(u, y)) dBk (u) k=1

0

t

+

  V0 (X(u, x)) − V0 (X(u, y)) du,

0

by the Burkholder–Davis–Gundy inequality, there exists a constant C4 such that E[|X(t, x) − X(t, y)| p ] r  t  2p p 2 E |Vk (X(u, x)) − Vk (X(u, y))| du  C4 |x − y| + C4 k=1

0

 p  t   (V0 (X(u, x)) − V0 (X(u, y))) du . + C4 E  0

By the Lipschitz continuity of the Vk s and Hölder’s inequality, there exists a constant C5 such that

t

E[|X(t, x) − X(t, y)| p ]  C4 |x − y| p + C5

E[|X(u, x) − X(u, y)| p ] du.

0

Hence, by Gronwall’s inequality (Theorem A.1.1), there exists a constant C6 such that E[|X(t, x) − X(t, y)| p ]  C6 |x − y| p

(t ∈ [0, T ], x, y ∈ Rd ).



Next we prove that the continuous mapping X(t, ·) is injective. For this purpose we give the following proposition. Proposition 4.10.2 For any T > 0 and p ∈ R, there exists a positive constant C such that E[|X(t, x) − X(t, y)| p ]  C|x − y| p

(4.10.2)

for every x, y ∈ Rd (x  y) and t ∈ [0, T ]. Proof For ε > 0 and x, y ∈ Rd with |x−y| > ε, set τε = inf{t; |X(t, x)−X(t, y)| < ∂f ∂2 f ε}. Let f (z) = |z| p , fi = ∂z i , fi j = ∂zi ∂z j . By Itô’s formula, |X(t, x) − X(t, y)| p − |x − y| p = I1 (t) + I2 (t) for t < τε , where

4:36:54, subject to the Cambridge Core terms of use, .005

186

I1 (t) =

Stochastic Differential Equations r d i=1 k=1

t

0

+

  fi (X(s, x) − X(s, y)) Vki (X(s, x)) − Vki (X(s, y)) dBk (s)

d i=1

I2 (t) =

d

r t

1 2 i, j=1 k=1

0

0

t

  fi (X(s, x) − X(s, y)) V0i (X(s, x)) − V0i (X(s, y)) ds,

  fi j (X(s, x) − X(s, y)) Vki (X(s, x)) − Vki (X(s, y))   × Vkj (X(s, x)) − Vkj (X(s, y)) ds.

Since fi (z) = p|z| p−2 zi , fi j (z) = p|z| p−2 δi j + p(p − 2)|z| p−4 zi z j , and the Vk s are Lipschitz continuous, there exist constants C1 and C2 such that

t |E[I1 (t ∧ τε )]|  C1 E[|X(s ∧ τε , x) − X(s ∧ τε , y)| p ] ds 0

and

t

|E[I2 (t ∧ τε )]|  C2

E[|X(s ∧ τε , x) − X(s ∧ τε , y)| p ] ds.

0

Hence, there exists a positive constant C3 such that E[|X(t ∧ τε , x) − X(t ∧ τε , y)| p ]

 |x − y| + C3 p

t

E[|X(s ∧ τε , x) − X(s ∧ τε , y)| p ] ds

0

and, by Gronwall’s inequality (Theorem A.1.1), there exists a positive constant C4 such that E[|X(t ∧ τε , x) − X(t ∧ τε , y)| p ]  C4 |x − y| p

(t ∈ [0, T ]).

Letting ε ↓ 0 and setting τ = inf{t; X(t, x) = X(t, y)}, we obtain E[|X(t ∧ τ, x) − X(t ∧ τ, y)| p ]  C4 |x − y| p

(t ∈ [0, T ]).

Substituting p = −1, we have P(τ < ∞) = 0. Thus (4.10.2) holds.



By Propositions 4.10.1 and 4.10.2, for t  0 and x, y ∈ Rd with x  y, X(t, x)  X(t, y) almost surely. However, the exceptional set {X(t, x)  X(t, y)} depends on (x, y). Hence these propositions are not enough to see the injectivity of X(t, ·). Proposition 4.10.3 Set D = {(x, x); x ∈ Rd }. Then, Z(t, x, y) = |X(t, x) − X(t, y)|−1 has a modification which is continuous on [0, ∞) × (Rd × Rd \ D).

4:36:54, subject to the Cambridge Core terms of use, .005

4.10 Stochastic Flows

187

Proof Let p > 2(2d + 1). Then, |Z(t, x, y) − Z(t , x , y )| p  2 p Z(t, x, y) p Z(t , x , y ) p   × |X(t, x) − X(t , x )| p + |X(t, y) − X(t , y )| p . By Hölder’s inequality, E[|Z(t, x, y) − Z(t , x , y )| p ]  2 p E[Z(t, x, y)4p ] 4 E[Z(t , x , y )4p ] 4 1 1  × E[|X(t, x) − X(t , x )|2p ] 2 + E[|X(t, y) − X(t , y )|2p ] 2 . 1

1

Hence, by Propositions 4.10.1 and 4.10.2, there exists a positive constant C1 such that E[|Z(t, x, y) − Z(t , x , y )| p ]  C1 |x − y|−p |x − y |−p p

× {|x − x | p + |y − y | p + |t − t | 2 }. For δ > 0, let Dδ = {(x, y) x, y ∈ Rd , |x − y| > δ}. The above inequality implies p

E[|Z(t, x, y) − Z(t , x , y )| p ]  C1 δ−2p {|x − x | p + |y − y | p + |t − t | 2 } for any t, t ∈ [0, T ] and (x, y), (x , y ) ∈ Dδ . By Kolmogorov’s continuity theorem (Theorem A.5.1), Z(t, x, y) has a modification which is continuous on [0, T ] × Dδ . Since T > 0 and δ > 0 are arbitrary, we obtain the conclusion.  We show the surjectivity of X(t, ·). Proposition 4.10.4 For T > 0 and ∈ R, there exists a positive constant C such that E[(1 + |X(t, x)|2 ) p ]  C(1 + |x|2 ) p

(x ∈ Rd , t ∈ [0, T ]).

Set f (z) = (1+|z|2 ) p and apply Itô’s formula to f (X(t, x)). Then we can show the proposition in the same way as Proposition 4.10.2. We omit the proof. Proposition 4.10.5 Let  Rd = Rd ∪ {} be a one-point compactification of Rd and set ⎧ −1 ⎪ ⎪ (x ∈ Rd ) ⎨(1 + |X(t, x)|) η(t, x) = ⎪ ⎪ 0 ⎩ (x = ). Then, η is continuous on [0, ∞) ×  Rd .

4:36:54, subject to the Cambridge Core terms of use, .005

188

Stochastic Differential Equations

Proof The continuity of η on [0, ∞) × Rd follows from Proposition 4.10.1. Let p > 2(2d + 1). By a similar argument to that in the proof of Proposition 4.10.3, we can show that there exists a positive constant C such that p

E[|η(t, x) − η(s, y)| p ]  C(1 + |x|)−p (1 + |y|)−p (|x − y| p + |t − s| 2 ) for all x, y ∈ Rd and s, t ∈ [0, T ]. For x = (x1 , x2 , . . . , xd ), set * x1 x2 xd + , 2,..., 2 . x−1 = 2 |x| |x| |x| Since

|x − y| = |x−1 − y−1 |, |x| |y|

we have p

E[|η(t, x) − η(s, y)| p ]  C(|x−1 − y−1 | p + |t − s| 2 ). Moreover, set

⎧ −1 ⎪ ⎪ ⎨η(t, x ) (x  0)  η(t, x) = ⎪ ⎪ ⎩ 0 (x = 0).

Then, since (x−1 )−1 = x (x ∈ Rd \ {0}), p

E[| η(t, x) −  η(s, y)| p ]  C(|x − y| p + |t − s| 2 ) and E[| η(t, x)| p ]  C|x| p . Hence, by Kolmogorov’s continuity theorem (Theorem A.5.1),  η is continuous on [0, ∞)×Rd . This implies that η(t, x) is continuous on [0, ∞)×({|x| > R}∪{}) for any R > 0.  Proposition 4.10.6 The mapping X(t, ·) is surjective for any t > 0 almost surely.  x)}t0 (x ∈  Proof Define a stochastic process {X(t, Rd ) on  Rd by ⎧ ⎪ ⎪X(t, x) (x ∈ Rd )  x) = ⎨ X(t, ⎪ ⎪ ⎩ (x = ).  x) is continuous on [0, ∞) ×  By Proposition 4.10.5, X(t, Rd and, for any t  0,  ·) is homotopic with the identity mapping on  the mapping X(t, Rd .9 On the d d  ·) other hand, since  R and a d-dimensional sphere S are homeomorphic, X(t, d induces a continuous and injective mapping on S . It is well known in the 9

Mappings f and g from a topological space M onto M are said to be homotopic if there exists a continuous mapping φ(t, x), (t, x) ∈ [0, 1] × M such that φ(0, x) = f (x) and φ(1, x) = g(x).

4:36:54, subject to the Cambridge Core terms of use, .005

4.10 Stochastic Flows

189

theory of topology that the identity mapping on S d is not homotopic with the constant mapping.  ·) is not surjective. Then, X(t,  ·) is homotopic with a Suppose now that X(t, d d  continuous mapping from R into R and, therefore, with a constant mapping.  ·) is homotopic with the identity mapping on This contradicts the fact that X(t,   ·) and, by definition, X(t, ·) are surjective.  Rd . Hence, the mapping X(t, Theorem 4.10.7 Let {X(t, x)}t0 be a unique strong solution of the stochastic differential equation (4.10.1). Then, the mapping defined by Rd x → X(t, x) ∈ Rd is a homeomorphism on Rd for all t  0 almost surely. Proof The assertion follows by summing up Propositions 4.10.1, 4.10.3, and 4.10.6.  Finally, we discuss the differentiability of the mapping X(t, ·). Theorem 4.10.8 For m  1, assume that V0 , V1 , . . . , Vr are bounded and of C m -class with bounded derivatives up to m-th order. Then, (1) the mapping X(t, ·) is of C m−1 -class; i )i, j=1,2,...,d satisfies (2) the Jacobian matrix Y(t, x) = ( ∂X∂x(t,x) j r t Vk (X(s, x))Y(s, x) dBk (s), Y(t, x) = I + k=0

0

∂V i (x)

where I is an identity mapping on Rd , Vk (x) = ( ∂xk j )i, j=1,2,...,d and B0 (s) = s; (3) Y(t, x) is non-degenerate for all t > 0 and x ∈ Rd almost surely; (4) the mapping x → X(t, x) is a diffeomorphism of C m−1 -class for any t almost surely. j−1

9:;< Proof Let ε ∈ (0, 1] and e j = (0, . . . , 0, 1, 0, . . . , 0) ∈ Rd ( j = 1, 2, . . . , d). Set Y εj (t, x) =

1 (X(t, x + εe j ) − X(t, x)). ε

Then we have Y εj (t,

r t  1 x) = e j + Vk (X(s, x + εe j )) − Vk (X(s, x)) dBk (s) ε k=0 0 r t 1   = ej + Vk X(s, x) + u(X(s, x + εe j ) − X(s, x)) du k=0

0

0

× Y εj (s, x) dBk (s).

(4.10.3)

4:36:54, subject to the Cambridge Core terms of use, .005

190

Stochastic Differential Equations

Hence, {(X(t, x + εe j ), X(t, x), Y εj (t, x))}t0 is a solution of a stochastic differential equation with Lipschitz continuous coefficients on R3d . By Proposition 4.10.1, it is continuous in (t, x, ε) ∈ [0, T ] × Rd × ([−1, 1] \ {0}) and extended to a continuous mapping on [0, T ] × Rd × [−1, 1]. This means that exists almost surely and is continuous in (t, x). limε→0 Y εj (t, x) = ∂X(t,x) ∂x j (2) is obtained from (4.10.3) by letting ε → 0. To show (3), let Z = {Z(t, x)}t0 be a solution of a matrix-valued stochastic differential equation r t r t  k Z(t, x) = I − Z(s, x)Vk (X(s, x)) dB (s) + Z(s, x)(Vk (X(s, x)))2 ds. k=0

0

k=1

0

By Itô’s formula, we obtain d(Z(t, x)Y(t, x)) = 0 and Z(t, x)Y(t, x) = I for any t  0. Hence, Y(t, x) is non-degenerate. By a similar argument,10 we can show that {(X(t, x), Y(t, x))}t0 is differen2 i (t,x) and it is continuous. Moreover, by tiable with respect to x, there exists ∂∂xXj ∂x m−1 induction, we can prove that X(t, x) is of C -class in x and obtain (1). (4) follows from the assertions (1)–(3).  Also, from a solution of the Stratonovich type stochastic differential equation r Vk (X(t)) ◦ dBk (t) + V0 (X(t)) dt, dX(t) = k=1

a stochastic flow of diffeomorphisms is defined. If the coefficients are of C m class, then the mapping X(t, ·) is a diffeomorphism of C m−2 -class.

4.11 Approximation Theorem Let V0 , V1 , . . . , Vr : Rd → Rd be bounded and C ∞ functions with bounded derivatives of all orders. Consider the Stratonovich type stochastic differential equation ⎧ r  ⎪ ⎪ ⎪ Vk (X(t)) ◦ dBk (t) + V0 (X(t)) dt, ⎪ ⎨dX(t) = k=1 (4.11.1) ⎪ ⎪ ⎪ ⎪ ⎩X(0) = x. Denote the unique strong solution by {X(t, x)}t0 . Consider a sequence of curves which converges to a Brownian motion B = {(B1 (t), B2 (t), . . . , Br (t))}t0 ; for example, curves obtained by piecewise linear approximations or by approximations via mollifiers. A natural question 10

While we need to show that X(t, x) and Y(t, x) have moments of any order, we omit the proof. See [62, 111] for details. 4:36:54, subject to the Cambridge Core terms of use, .005

4.11 Approximation Theorem

191

is whether solutions of ordinary differential equations driven by these curves converge to those of stochastic ones. This problem is important in other fields than probability theory, for example, in numerical analysis for solutions of stochastic differential equations. In this section we consider a piecewise linear approximation of Brownian motion and discuss the corresponding approximation for solutions of stochastic differential equations, originated by Wong and Zakai [127]. For other approximations, see [45, 62, 79]. For n = 1, 2, . . . and m = 1, 2, . . ., set m + 1 m Δm,n = B − B 2n 2n and define a piecewise linear approximation {Bn (t)}t0 of a Brownian motion B by m  m m + 1 m Bn (t) = B n + 2n t − n Δm,n  t  . 2 2 2n 2n Let {Xn (t, x)}t0 be the solution of the random ordinary differential equation ⎧ r ⎪ d ⎪ ⎪ ⎪ Vk (Xn (t)) B˙ kn (t) + V0 (Xn (t)) ⎪ ⎨ dt Xn (t) = ⎪ k=1 ⎪ ⎪ ⎪ ⎪ ⎩Xn (0) = x, where B˙ n (t) = dtd Bn (t) for t  2mn (m = 1, 2, . . .) and B˙ n (t) = 0 for t = 1, 2, . . .). The following is the main result in this section.

m 2n

(m =

Theorem 4.11.1 For any T > 0 and p  2,  lim sup E sup |Xn (t, x) − X(t, x)| p = 0. n→∞ x∈Rd

0tT

Proof For simplicity write B0 (t) = t, and denote X(t, x) and Xn (t, x) by X(t) and Xn (t), respectively. Let Vk = (Vk1 , Vk2 , . . . , Vkd ) and define the function Vk [V ] = ((Vk [V ])i ) : Rd → Rd by (Vk [V ])i =

d j=1

For

m 2n

t

Xn (t) = Xn

Vkj

∂V i . ∂x j

m+1 2n ,

we have by Taylor’s theorem

m

r  m    m  k Vk Xn n Δm,n + 2n t − n 2 k=0 2 r   m    1 m 2 + 22n t − n Vk [V ] Xn n Δkm,n Δ m,n + Rm n (t), 2 2 k, =1 2

2n

4:36:54, subject to the Cambridge Core terms of use, .005

192

Stochastic Differential Equations

where Rm n (t) =

r

t

m2−n

k=0

ds

r

+ 23n

s m2−n t

m2−n

k, ,p=1

Vk [V0 ](Xn (u)) du × 2n Δkn,m

ds



s m2−n

du

u m2−n

dv Vk [V [V p ]](Xn (v))

p . × Δkm,n Δ m,n Δm,n

In particular, we have r m + 1 m   m    m  k 6 Xn V + V = X + X Δ Xn n Δ0n,m n k n 0 m,n n 2n 2n 2 2 k=1 r    m 1 + Vk [Vk ] Xn n {(Δkm,n )2 − 2−n } 2 k=1 2   m  m + 1 + Vk [V ] Xn n Δkm,n Δ m,n + Rm , n 2 2n k 60 is given by where V

60 = V0 + 1 Vk [Vk ]. V 2 k=1 r

n

Hence, setting [s]n = [22ns] ∈ { 2mn }∞ m=0 , [u] being the largest integer less than or equal to u, we have for t > 0

[t]n r [t]n 60 (Xn ([s]n )) ds V Xn (t) = x + Vk (Xn ([s]n )) dBk (s) + k=1

0

0 [2 t]−1 n

+ In1 (t) + In2 (t) +

m=0

Rm n

m + 1 2n

+ Xn (t) − Xn ([t]n ),

where   m   1 Vk [Vk ] Xn n (Δkm,n )2 − 2−n , 2 m=1 k=1 2 [2n t] r

In1 (t) =

[2n t]   m  1 = Vk [V ] Xn n Δkm,n Δ m,n . 2 m=1 k 2  It is easy to show that sup x E sup0tT |Xn (t) − Xn ([t]n )| p → 0 and

In2 (t)

n  [2 t]   m + 1  p  → 0 Rm sup E sup  n 2n  x

0tT m=1

4:36:54, subject to the Cambridge Core terms of use, .005

4.11 Approximation Theorem

193

as n → ∞. By Itô’s formula, we have

(m+1)2−n   m  1 k 2 Bk (s) − Bk n dBk (s), (Δm,n ) − n = 2 2 2 m2−n and hence In1 (t) =

r k=1

[t]n

Vk [Vk ](Xn ([s]n ))(Bk (s) − Bk ([s]n )) dBk (s).

0

Thus, by the continuity of paths of Brownian motion, we obtain  lim sup E sup |In1 (t)| p = 0. n→∞ x∈Rd

If k  , then Δkm,n Δ m,n

=

0tT

(m+1)2−n  m2−n

+

Bk (s) − Bk

(m+1)2−n  m2−n



 m  2n

B (s) − B

dB (s)

 m  dBk (s). 2n

Hence we obtain sup x E sup0tT |In2 (t)| p → 0 in a similar way to In1 (t). From the above observations, there exists an In (t) with  lim sup E sup |In (t)| p = 0 n→∞ x∈Rd

such that X(t) − Xn (t) =

r k=1

[t]n

0tT

 Vk (X(s)) − Vk (Xn (s)) dBk (s)



0

+

[t]n

 0 (Xn (s))ds + In (t). V0 (X(s)) − V

0

Hence, in a similar way to the proof of Proposition 4.10.1, we obtain the conclusion by Doob’s inequality (Theorem 1.5.13), the Burkholder–Davis–Gundy inequality (Theorem 2.4.1), and Gronwall’s inequality. The details are left to the reader.  The problem of characterizing the subset of path space which consists of the paths of the solutions of a stochastic differential equation of the form (4.11.1) is called a support problem, and is closely related to approximation of solutions of stochastic differential equations. Let P x be the probability law of the solution X = {X(t)}t0 of (4.11.1). Let C 1 be the subspace of the Wiener space W r = {w ∈ Wr ; w(0) = 0} consisting

4:36:54, subject to the Cambridge Core terms of use, .005

194

Stochastic Differential Equations

of paths of C 1 -class. For φ ∈ C 1 , let ξ(x, φ) be the solution of the ordinary differential equation ⎧ r  ⎪ ⎪ ⎪ Vk (ξ(t))φ˙ k (t) dt + V0 (ξ(t)) dt, ⎪ ⎨dξ(t) = k=1 ⎪ ⎪ ⎪ ⎪ ⎩ξ(0) = x. Then the following is known ([115]). Theorem 4.11.2 For any x ∈ Rd , the topological support of P x , that is, the smallest closed subset of Wd whose P x -measure is 1, coincides with the closure of {ξ(x, φ); φ ∈ C 1 }. Moreover, if the rank of the Lie algebra generated by V1 , V2 , . . . , Vr is equal to d at every point in Rd , then the closure of {ξ(x, φ); φ ∈ C 1 } is W xd = {w ∈ Wd ; w(0) = x} ([63]) and the topological support of P x coincides with Wdx . For details, see the references cited at the beginning of this section and [45, 114].

4:36:54, subject to the Cambridge Core terms of use, .005

5 Malliavin Calculus

In 1976, Malliavin ([74, 75]) proposed a new calculus on Wiener spaces and achieved purely probabilistic proofs of results related to diffusion processes, which, before him, were shown based on outcomes in other mathematical fields like partial differential equations. For example, he proved the existence and smoothness of the transition densities of diffusion processes in a purely probabilistic manner. This method has been developed into a theoretical system, which is nowadays called the Malliavin calculus [43, 73, 104, 122]. It plays an important role in stochastic analysis together with the Itô calculus, consisting of stochastic integrals, stochastic differential equations, and so on. The aim of this chapter is to introduce the Malliavin calculus. We will use the fundamental terminologies and notions in functional analysis without detailed explanation. For these, consult [1, 19, 58].

5.1 Sobolev Spaces and Differential Operators Throughout this chapter, let T > 0, d ∈ N and (WT , B(WT ), μT ) be the d-dimensional Wiener space on [0, T ] (Definition 1.2.2). For t ∈ [0, T ], define θ(t) : WT → Rd by θ(t)(w) = w(t) (w ∈ WT ). Moreover, let HT be the Cameron–Martin subspace of WT . Then, identifying HT with its dual space HT∗ in a natural way, we obtain the relation WT∗ ⊂ HT∗ = HT ⊂ WT (see Section 1.2). Let E be a real separable Hilbert space and L p (μT ; E) be the space of E-valued p-th integrable functions with respect to μT on WT . L p (μT ; R) is simply written as L p (μT ). Denote the norm in L p (μT ; E) by  ·  p or  ·  p,E when emphasizing E is necessary. Let P be the set of functions φ : WT → R of the form φ = f ( 1 , . . . , n ) for 1 , . . . , n ∈ WT∗ and a polynomial f : Rn → R, that is, φ(w) = f ( 1 (w), . . . , n (w))

(w ∈ WT ).

195 4:36:52, subject to the Cambridge Core terms of use, .006

196

Malliavin Calculus

Set P(E) =

m ,

φ j e j ; φ j ∈ P, e j ∈ E, j = 1, . . . , m, m ∈ N .

j=1

For φ = f ( 1 , . . . , n ) ∈ P, define ∇φ ∈ P(HT ) by ∇φ = Moreover, for φ =

m j=1

n ∂f ( , . . . , n ) i . i 1 ∂x i=1

φ j e j ∈ P(E), define ∇φ ∈ P(HT ⊗ E) by ∇φ =

m

∇φ j ⊗ e j ,

j=1

where, for real separable Hilbert spaces E1 and E2 , E1 ⊗ E2 denotes the Hilbert space of Hilbert–Schmidt operators A : E1 → E2 and, for e(1) ∈ E1 and e(2) ∈ E2 , e(1) ⊗ e(2) denotes the Hilbert–Schmidt operator such that E1 e → e(1) , e E1 e(2) ∈ E2 . The Hilbert space E1 ⊗ E2 has an inner product given by

A, B E1 ⊗E2 =

∞ (1)

Ae(1) n , Ben E2

(A, B ∈ E1 ⊗ E2 ),

n=1 ∞ where {e(1) n }n=1 is an orthonormal basis of E 1 . It should be noted that the above definition of ∇φ does not depend on the expression of φ ∈ P, because d  (w ∈ WT , h ∈ HT ). (5.1.1)

∇φ(w), h HT =  φ(w + ξh) dξ ξ=0

Example 5.1.1 Let d = 1. For t ∈ [0, T ], the coordinate function θ(t) : WT → R satisfies

T ˙ ds

∇θ(t), h HT = h(t) = 1[0,t] (s)h(s) (h ∈ HT ). 0

Hence, defining [0,t] ∈ HT by ˙[0,t] (s) = 1[0,t] (s) (s ∈ [0, t]), we have ∇θ(t) = [0,t] . Lemma 5.1.2 Let !p > 1. For F ∈ L p (μT ; E), ∈ WT∗ , φ ∈ P and e ∈ E, the mapping R ξ → W F(· + ξ ), φe E dμT is differentiable and T

 d 

F(· + ξ ), φe dμ =

F, e E ∂ φ dμT , (5.1.2)  E T dξ ξ=0 WT WT where ∂ φ(w) = (w)φ(w) − ∇φ(w), HT .

4:36:52, subject to the Cambridge Core terms of use, .006

5.1 Sobolev Spaces and Differential Operators

197

In particular, the mapping ∇ : L p (μT ; E) ⊃ P(E) φ → ∇φ ∈ L p (μT ; HT ⊗ E) is closable, that is, ∇ is extended to a unique closed operator whose domain is a dense subset of L p (μT ; E). Proof By the Cameron–Martin theorem (Theorem 1.7.2),

F(w + ξ ), φ(w)e E μT (dw) WT

=

* ξ2 1 12 +

F(w), e E φ(w − ξ ) exp ξ (w) − 11 11H μT (dw). T 2 WT

It is easy to see that the right hand side is differentiable in ξ, and, hence, so is the left hand side. Differentiating both sides in ξ = 0, we obtain (5.1.2) by (5.1.1). By (5.1.2), we have, for any ψ ∈ P(E), e ∈ E, ∈ WT∗ and φ ∈ P,

∇ψ, ⊗ e HT ⊗E φ dμT =

ψ, e E ∂ φ dμT . WT

WT

p Hence, if {ψn }∞ n=1 ⊂ P(E) and G ∈ L (μT ; HT ⊗ E) satisfy ψn  p → 0 and ∇ψn − G p → 0 (n → ∞), then

G, ⊗ e HT ⊗E φ dμT = 0. WT

Since this holds for any e, , and φ, we obtain that G = 0, μT -a.s. and ∇ is closable.  Remark 5.1.3 (1) By (5.1.1) and (5.1.2),

∇F, φ ⊗ e HT ⊗E dμT = WT

F, (∂ φ) e E dμT

(5.1.3)

WT

for any F ∈ P(E). Denote the dual operator of ∇ by ∇∗ . Then, the left hand ! side is equal to W F, ∇∗ (φ ⊗ e) E dμT . Thus T

∇∗ (φ ⊗ e) = (∂ φ) e

(5.1.4)

since F is arbitrary. Identity (5.1.3) is a prototype of the integration by parts formula on the Wiener space presented in Section 5.4. (2) Set E = R and φ = 1 in (5.1.4). Then (∇∗ )(w) = (w),

μT -a.s.,

(5.1.5)

where ∈ WT∗ is regarded as an HT -valued constant function on the left hand side and as a random variable : WT → R on the right hand side.

4:36:52, subject to the Cambridge Core terms of use, .006

198

Malliavin Calculus

Moreover, if  HT = 1 and F = ϕ( ), then (5.1.3) corresponds to the following elementary identity for a standard normal random variable X,

x2 x2 1 1 ϕ (x) √ e− 2 dx = ϕ(x)x √ e− 2 dx = E[ϕ(X)X]. E[ϕ (X)] = R R 2π 2π On account of the closability of ∇, we introduce Sobolev spaces over the Wiener space. Definition 5.1.4 Let p  1 and k ∈ N. For φ ∈ P(E), set φ(k,p) =

k

∇ j φ p

j=0

and denote the completion of P(E) with respect to  · (k,p) by Dk,p (E). Simply write Dk,p for Dk,p (R). Denote by the same ∇ the extension of ∇ : P(E) → P(HT ⊗ E) to Dk,p (E) and by ∇∗ the adjoint operator of the closed operator ∇ : L p (μT ; E) → L p (μT ; HT ⊗ E). 



If k  k and p  p , then Dk ,p (E) ⊂ Dk,p (E). By definition, ∇ is defined consistently on each Dk,p (E). Moreover, by (5.1.4), for F ∈ P(HT ⊗ E) of the form m φ j j ⊗ e j F= j=1

with φ j ∈ P, j ∈

WT∗

and e j ∈ E ( j = 1, . . . , m), we have ∇∗ F =

m

(∂ j φ j )e j .

j=1

Hence, ∇∗ is also defined consistently on each L p (μT ; E). Because of this consistency, we may use the simple notations ∇ and ∇∗ without referring to the dependency on k and p. Example 5.1.5 Let ∈ WT∗ . Set f (x) = x (x ∈ R) and write (w) = f ( (w)) (w ∈ WT ). Then, by definition, the derivative of : WT w → (w) ∈ R is given by (∇ )(w) = ,

μT -a.s. w ∈ WT .

Combining this identity with (5.1.5), we obtain ∇(∇∗ )(w) = ,

μT -a.s. w ∈ WT .

(5.1.6)

We now show that this identity is extended to HT .

4:36:52, subject to the Cambridge Core terms of use, .006

5.1 Sobolev Spaces and Differential Operators

199

Let h ∈ HT and take n ∈ WT∗ (n = 1, 2, . . .) so that  n − hHT → 0. Setting * + 1p x2 1 Ap = |x| p √ e− 2 dx , R 2π we have

WT

1 1p  n (w) − m (w) p μT (dw) = A pp 11 n − m 11H . T

By (5.1.5), this implies lim ∇∗ n − ∇∗ m  p = 0.

n,m→∞

(5.1.7)

Since ∇∗ is a closed operator, h belongs to the domain of ∇∗ as a constant HT -valued function and lim ∇∗ n − ∇∗ h p = 0.

n→∞

Combining this with (5.1.5), we obtain ∇∗ h = I (h),

(5.1.8)

where I (h) is the Wiener integral of h˙ (see Section 1.7). On the other hand, (5.1.6) implies lim ∇(∇∗ n ) − ∇(∇∗ m ) p = lim  n − m HT = 0.

n,m→∞

n,m→∞



By (5.1.7) and the closedness of ∇, ∇ h belongs to the domain of ∇ and ∇(∇∗ h) = h.

(5.1.9)

In order to develop the theory of distributions on Wiener spaces, we need to consider the Sobolev spaces Dk,p (E) for k ∈ R. For this extension, we introduce the Wiener chaos decomposition of L2 (μT ), which plays an important role in several areas of stochastic analysis. Define the Hermite polynomials {Hn }∞ n=0 by Hn (x) =

(−1)n 1 x2 dn − 1 x2 e2 (e 2 ) n! dxn

(x ∈ R).

We have e− 2 (x−y) = 1

2

∞ ∞ 1 dn − 1 x2 n − 12 x2 2 (e )(−y) = e Hn (x)yn n n! dx n=0 n=0

and the generating function for the Hermite polynomials is ∞

1 2

Hn (x)yn = e xy− 2 y .

(5.1.10)

n=0

4:36:52, subject to the Cambridge Core terms of use, .006

200

Malliavin Calculus

2 {Hn }∞ n=0 forms an orthogonal basis of the L -space on R with respect to the standard normal distribution and

1 2 1 1 Hi (x)H j (x) √ e− 2 x dx = δi j . j! R 2π

This identity is shown by inserting (5.1.10) into the left hand side of

1 2 1 2 1 2 1 e sx− 2 s etx− 2 t √ e− 2 x dx = e st R 2π and comparing the coefficients of si t j . By using the Hermite polynomials, we construct an orthonormal basis of L2 (μT ) in the following way. Let A be the set of sequences of non-negative integers with a finite number of non-zero elements: ∞ , A = α = {α j }∞j=1 ; α j ∈ Z+ , αj < ∞ . j=1

For α ∈ A , define |α| and α! by |α| =



αj

and

α! =



α j !.

j:α j 0

j=1

Fix an orthonormal basis {hn }∞ n=1 of the Cameron–Martin subspace HT and define a family Hα (α ∈ A ) of functions on H by Hα (w) =

∞ 

Hα j (I (h j )(w)),

j=1

where I (h) is the Wiener integral of h ∈ HT . √ Theorem 5.1.6 { α!Hα , α ∈ A } forms an orthonormal basis of L2 (μT ). Moreover, L2 (μT ) admits the orthogonal decomposition L2 (μT ) =

∞ >

Hn ,

(5.1.11)

n=0

where Hn (n = 0, 1, 2, . . .) is the closed subspace of L2 (μT ) spanned by {Hα ; |α| = n}. Hn does not depend on the choice of the orthonormal basis of HT . Proof √I (h j ) is a standard normal random variable. Hence, the orthonormality √ of { n!Hn } with respect to the standard Gaussian measure implies that of { α!Hα }.

4:36:52, subject to the Cambridge Core terms of use, .006

5.1 Sobolev Spaces and Differential Operators

201

?∞ Hn is dense. Let X ∈ L2 (μT ) and suppose that n=0? ! We next show that ∞ XY dμT = 0 for any Y ∈ n=0 Hn . This implies that X is orthogonal to all WT polynomials of I (h j ) and that

n   X exp i a j I (h j ) dμT = 0 WT

j=1

for all n ∈ N and a j ∈ R. Hence,

X f (I (h1 ), I (h2 ), . . . , I (hn )) dμT = 0 WT

C0∞ (Rn )

(n ∈ N), which means X = 0 because, by the Itô– for any f ∈  Nisio theorem (Theorem 1.2.5), j I (h j )h j converges almost surely and the distribution of the limit is μT . If hn − hHT → 0, then I (hn ) → I (h) in L2 (μT ). Hence, Hn does not  depend on the choice of the orthonormal basis of HT . Definition 5.1.7 The orthogonal decomposition (5.1.11) of L2 (μT ) is called the Wiener chaos decomposition and an element in Hn is called an n-th Wiener chaos. Let Jn : L2 (μT ) → L2 (μT ) be the orthogonal projection onto Hn . Extend Jn to P(E) so that m (Jn F j )e j Jn F = j=1

m

for F = j=1 F j e j (F j ∈ P, e j ∈ E ( j = 1, . . . , m)). If G ∈ P, then there exist an N ∈ N and cα ∈ R (|α|  N) such that G= cα Hα , 5∞

|α|N

∞ where Hα is given by Hα = j=1 Hα j ( j ) with an orthonormal basis { j } j=1 such that G is expressed as G = g( 1 , . . . , M ) for some M ∈ N and a polynomial g : R M → R. Hence JnG ∈ P and JnG = 0 if n > N.

Definition 5.1.8 Let r ∈ R and p > 1. Define (I − L)r : P(E) → P(E) by ∞ (1 + n)r Jn (I − L) = r

(5.1.12)

n=0

and set r

Fr,p = (I − L) 2 F p .

4:36:52, subject to the Cambridge Core terms of use, .006

202

Malliavin Calculus

Denote the completion of P(E) with respect to  · r,p by Dr,p (E) and write Dr,p for Dr,p (R). The infinite sum on the right hand side of (5.1.12) is a finite one for G ∈ P(E). D0,p (E) is equal to L p (μT ; E). It is known as Meyer’s equivalence that, for k ∈ Z+ , Definitions 5.1.4 and 5.1.8 are consistent, that is, they define the same space Dk,p (E). Theorem 5.1.9 ([104, Theorem 4.4]) For any k ∈ Z+ and p > 1, there exist ak,p and Ak,p > 0 such that ak,p ∇k F p  Fk,p  Ak,p

k

∇ j F p

(F ∈ P(E)).

j=0

The family of Sobolev spaces Dr,p (E) (r ∈ R, p > 1) has the following consistency. Theorem 5.1.10 (1) For r, r ∈ R and p, p > 1 with r  r , p  p , the   inclusion mapping Dr ,p (E) ⊂ Dr,p (E) is a continuous embedding. (2) Let (Dr,p (E))∗ be the dual space of Dr,p (E). Under the identification of (L p (μT ; E))∗ and Lq (μT ; E), where 1p + 1q = 1, D−r,q (E) = (Dr,p (E))∗ . For the proof, we prepare some lemmas. Define L and T t : P(E) → P(E) (t > 0) by LG =



(−n)JnG

and T t G =

n=0



e−nt JnG

(G ∈ P(E)),

(5.1.13)

n=0

respectively. Since Jn is the orthogonal projection onto Hn , we have ∞ 1 11 12 12 1 12 1T t F 112 = e−2nt 11 Jn F 112  11F 112

(F ∈ P(E)).

n=0

Since P(E) is dense in L2 (μT ; E), T t is extended to a contraction operator on L2 (μT ; E), which is also denoted by T t . Moreover, T t (T s F) = T t+s F

(F ∈ L2 (μT ; E))

by definition and {T t }t0 defines a contraction semigroup on L2 (μT ; E) satisfying d T t F = LT t F (F ∈ P(E)). (5.1.14) dt

4:36:52, subject to the Cambridge Core terms of use, .006

5.1 Sobolev Spaces and Differential Operators

203

{T t }t0 and L are called the Ornstein–Uhlenbeck semigroup and the Ornstein–Uhlenbeck operator, respectively. Moreover, by definition, for Hα ∈ H|α| (α ∈ A ), we have (I − L)r Hα = (1 + |α|)r Hα ,

LHα = −|α| Hα ,

T t Hα = e−|α|t Hα .

Lemma 5.1.11 Let p > 1, F ∈ P(E) and G ∈ L p (μT ; E). (1) For any t  0 and w ∈ WT ,

√   F e−t w + 1 − e−2t w μT (dw ). T t F(w) =

(5.1.15)

(5.1.16)

WT

(2) T t F p  F p holds. In particular, T t : L p (μT ; E) ⊃ P(E) F → T t F ∈ P(E) ⊂ L p (μT ; E) is extended to a bounded linear operator. (3) limt→0 T t G − G p = 0 holds. Proof (1) It suffices to show the case where E = R. Let F ∈ P. Then, there exist an N ∈ N, a polynomial f : RN → R, and an orthonormal system 1 , . . . , N ∈ WT∗ of HT such that F = f ( 1 , . . . , N ). Set (w) = ( 1 (w), . . . , N (w)) for w ∈ WT . Then, denoting the right hand side of (5.1.16) by S t F(w), we have

S t F(w) =

RN

f (e−t (w) + y)gN (1 − e−2t , y) dy,

where gN (s, y) = (2πs)− 2 exp(− |y|2s ). Since

gN (s, z − y)gN (t, y) dy = gN (s + t, z) and 2

N

RN

setting  f (x) =

Rn

1 ∂gN = ΔgN , ∂s 2

=  F f ( 1 , . . . , N ),

f (e−t x + y)gN (1 − e−2t , y) dy,

 we have S t F = F, S s (S t F)(w) = S s+t F(w),

(5.1.17)

and ∂f d  j (w) j ((w)). t=0 S t F(w) = Δ f ((w)) − dt ∂x j=1 N

(5.1.18)

Extend { j }Nj=1 to an orthonormal basis { j }∞j=1 of HT and set Hα =

∞ 

Hα j ( j )

(α ∈ A ).

j=1

4:36:52, subject to the Cambridge Core terms of use, .006

204

Malliavin Calculus

Since Hn (x) − xHn (x) = −nHn (x), (5.1.18) yields d   S t Hα (w) = −|α|Hα (w). dt t=0 Combining this with (5.1.17), we have d S t Hα (w) = −|α|S t Hα (w). dt Since S 0 Hα (w) = Hα (w), by this ordinary differential equation and (5.1.15), we obtain S t Hα (w) = e−|α|t Hα (w) = T t Hα (w). Thus (5.1.16) holds for Hα . Since F ∈ P is written as a linear combination of Hα s, (5.1.16) is satisfied. (2) For the same F = f ( 1 , . . . , N ) as above, we have by Hölder’s inequality √ 11 1p 1T t F 11 p  |F(e−t w + 1 − e−2t w )| p μT (dw)μT (dw ) W W

T T = | f (x + y)| p gN (e−2t , x)gN (1 − e−2t , y) dxdy RN RN

1 1p = | f (z)| p gN (1, z) dz = 11F 11 p . RN

Hence, T t F p  F p . (3) Let K ∈ P(E). By (5.1.16), limt→0 T t K(w) − K(w)E = 0 (w ∈ WT ). Since T t K2p  K2p by (2), {T t K}t∈[0,T ] is uniformly integrable (Theorem A.3.4). Hence limt→0 T t K − K p = 0. Using (2) again, we obtain T t G − G p  T t K − K p + 2G − K p . Since P(E) is dense in L p (μT ; E), this inequality implies the conclusion.



Lemma 5.1.12 Let r, r ∈ R and p, p > 1. (1) If r  r and p  p , then Fr,p  Fr ,p (F ∈ P(E)). (2) If Fn ∈ P(E) satisfies lim Fn r,p = 0,

n→∞

lim Fn − Fm r ,p = 0,

n,m→∞

then limn→∞ Fn r ,p = 0. Proof (1) Let s  0. Then, by Definition 5.1.8 and (5.1.13),

∞ 1 (I − L)−s F = t s−1 e−t T t F dt Γ(s) 0

4:36:52, subject to the Cambridge Core terms of use, .006

5.1 Sobolev Spaces and Differential Operators

205

for F ∈ P(E). Since (I − L)−s F p  F p by Lemma 5.1.11, we obtain Fr,p = (I − L)−

r −r 2

r

r

(I − L) 2 F p  (I − L) 2 F p

r

 (I − L) 2 F p = Fr ,p . r

(2) Set Gn = (I−L) 2 Fn . Since Fn −Fm r ,p → 0, {Gn }∞ n=1 is a Cauchy sequence   in L p (μT ; E). Hence, limn→∞ Gn − G p = 0 holds for some G ∈ L p (μT ; E). Since Fn r,p → 0, lim (I − L)

n→∞

r−r 2

Gn  p = 0.

Hence, we have, for any K ∈ P(E),

G, K E dμT = lim

Gn , K E dμT n→∞ W WT

T r−r r −r = lim

(I − L) 2 Gn , (I − L) 2 K E dμT = 0 n→∞

WT

and G = 0. Therefore, Fn r ,p = Gn  p → 0.



Proof of Theorem 5.1.10 (1) The assertion follows from Lemma 5.1.12. p (2) For p > 1, q = p−1 and G ∈ P(E), we have G−r,q = (I − L)− 2 Gq , r = sup

(I − L)− 2 G, F E dμT ; F ∈ P(E), F p  1 WT , r = sup

G, (I − L)− 2 F E dμT ; F ∈ P(E), F p  1 WT , = sup

G, K E dμT ; K ∈ P(E), Kr,p  1 . r

WT



This implies the assertion (2). Definition 5.1.13 (1) Define

Dr,p (E), Dr,∞− (E) =

D∞,p (E) =

p∈(1,∞)

D

r,1+



(E) =

Dr,p (E),

r∈R

D (E), r,p



r∈R, p∈(1,∞)

−∞,p

D

(E) =

 r∈R

p∈(1,∞)

D∞,∞− (E) =



Dr,p (E),

D−∞,1+ (E) =

Dr,p (E), 

Dr,p (E).

r∈R, p∈(1,∞)

(2) An element Φ ∈ D−∞,1+ (E) is called a generalized Wiener functional.

4:36:52, subject to the Cambridge Core terms of use, .006

206

Malliavin Calculus

D∞,∞− (E) is a Fréchet space and D−∞,1+ (E) is its dual space. The value −∞,1+ (E) = (D∞,∞− (E))∗ at F ∈ D∞,∞− (E) is denoted by !Φ(F) of Φ ∈ D

F, Φ E dμT or E[ F, Φ E ] : W T

Φ(F) =

F, Φ E dμT = E[ Φ, F E ]. WT

! When E = R, we simply write the above as W FΦ dμT or E[FΦ]. Moreover, T ! if F = 1, it is also written as W Φ dμT or E[Φ]. These notations come from T q p 1 the ! fact that, if F ∈ L (μT ; E) and Φ ∈ L (μT ; E), then F, Φ E ∈ L (μT ) and

F, Φ E dμT is a usual integral. W T

5.2 Continuity of Operators The aim of this section is to prove the continuity of the operators ∇, ∇∗ , and T t and to present their applications. Theorem 5.2.1 (1) For any r ∈ R and p > 1, ∇ : P(E) → P(HT ⊗ E) is extended to a unique linear operator ∇ : D−∞,1+ (E) → D−∞,1+ (HT ⊗ E) whose restriction ∇ : Dr+1,p (E) → Dr,p (HT ⊗ E) is continuous. (2) For any r ∈ R and p > 1, the adjoint operator ∇∗ of ∇ is extended to a ∗ unique linear operator ∇ : D−∞,1+ (HT ⊗ E) → D−∞,1+ (E) whose restriction ∗ ∇ : Dr+1,p (HT ⊗ E) → Dr,p (E) is continuous. (3) For any t > 0 and p > 1, T t (L p (μT ; E)) ⊂ D∞,p (E). In particular, if a measurable function F : WT → E is bounded, then T t F ∈ D∞,∞− (E). ∗

In the following, the extensions ∇ and ∇ of ∇ and ∇∗ will also be denoted by ∇ and ∇∗ . For a proof of the theorem, we prepare a lemma. For φ = {φn }∞ n=0 ⊂ R, define the mapping Mφ : P(E) → P(E) by Mφ F =



φn Jn F

(F ∈ P(E)).

(5.2.1)

n=0 + ∞ Lemma 5.2.2 For φ = {φn }∞ n=0 ⊂ R, set φ = {φn+1 }n=0 . Then, for any F ∈ P(E),

∇Mφ F = Mφ+ ∇F. In particular, ∇(Jn F) = Jn−1 (∇F), n = 1, 2, . . .

4:36:52, subject to the Cambridge Core terms of use, .006

5.2 Continuity of Operators

207

Proof We may assume that E = R and F is a function of the form F = Hα = 5∞ ∞ j=1 Hα j ( j ), { j } j=1 being an orthonormal basis of HT . Then, since ∇Mφ Hα = φ|α| ∇Hα and Hn = Hn−1 , we have ∇Hα =

j:α j >0

*5

Since Hα j −1 ( j )

*

Hα j −1 ( j )

+ Hαi ( i ) j .

i j

+

i j

Hαi ( i ) ∈ H|α|−1 , we obtain Mφ+ ∇Hα = φ|α| ∇Hα = ∇Mφ Hα .



Lemma 5.2.3 (Hypercontractivity of {T t }) Let p > 1 and t  0, and set q(t) = e2t (p − 1) + 1. Then, for any F ∈ L p (μT ), T t Fq(t)  F p .

(5.2.2)

The proof is omitted. See [104, Theorem 2.11]. Lemma 5.2.4 For any p > 1 and n ∈ Z+ , there exists a constant b p,n > 0 such that Jn F p  b p,n F p

(5.2.3)

for any F ∈ P. In particular, Jn defines a bounded operator on L p (μT ). Proof For p > 1, define c(p) by ⎧ 1 ⎪ ⎪ ⎨(p − 1) 2 c(p) = ⎪ ⎪ ⎩(p − 1)− 12

(p  2) (1 < p < 2).

First we assume p  2. Let t  0 so that e2t = p − 1. Then, since T t F1+e2t  F2 by Lemma 5.2.3, we have T t F p  F2 . Moreover, since Jn is an orthogonal projection on L2 (μT ), T t Jn F p  Jn F2  F2  F p . Hence, from the identity T t Jn F = e−nt Jn F = c(p)−n Jn F, taking b p,n = c(p)n , we obtain (5.2.3). p p > 2 and c( p−1 ) = c(p), we Second, we assume 1 < p < 2. Then, since p−1 obtain from the above consideration n p  c(p) F p . Jn F p−1 p−1

4:36:52, subject to the Cambridge Core terms of use, .006

208

Malliavin Calculus

Due to the duality, Jn∗ F p  c(p)n F p . Since Jn is an orthogonal projection on L2 (μT ), we have Jn∗ F = Jn F and obtain  (5.2.3), by setting b p,n = c(p)n again. Lemma 5.2.5 For any p > 1 and n ∈ Z+ , there exists a constant Cn,p > 0 such that T t (I − J0 − · · · − Jn−1 )F p  Cn,p e−nt F p

(5.2.4)

for any t > 0 and F ∈ L (μT ). p

Proof If p = 2, then, by the definition of T t , ∞ 11 1 1 12 12 12 1T t (I − J0 − · · · − Jn−1 )F 112 = e−2kt 11 Jk F 112  e−2nt 11F 112 k=n

and (5.2.4) holds. Assume that p > 2. Set p = e2t0 + 1 for t0 > 0. For t > t0 , by Lemma 5.2.3 and the above observation, T t (I − J0 − · · · − Jn−1 )F p  T t−t0 (I − J0 − · · · − Jn−1 )F2  e−n(t−t0 ) F2  ent0 e−nt F p . For t  t0 , by Lemmas 5.1.11 and 5.2.4, T t (I − J0 − · · · − Jn−1 )F p  (I − J0 − · · · − Jn−1 )F p 

n−1 *

+

b p,k F p  e

nt0

k=0

n−1 *

+ b p,k e−nt F p .

k=0

Hence, we have (5.2.4) also for p > 2. If p ∈ (1, 2), we can prove the conclusion by the duality between L p (μT ) and p  L p−1 (μT ) in the same way as in the proof of Lemma 5.2.4. Lemma 5.2.6 Let δ > 0 and ψ : (−δ, δ) → R be real analytic. Suppose that, −α − α1 for α ∈ (0, 1], φ = {φn }∞ n=0 satisfies φn = ψ(n ) for n  δ . Then, for each p > 1, there exists a constant C p such that Mφ F p  C p F p Proof Fix n ∈ N so that Mφ(1) =

1 nα

(F ∈ P).

(5.2.5)

< δ and set

n−1 k=0

φk Jk

and

Mφ(2) =



φk Jk .

k=n

4:36:52, subject to the Cambridge Core terms of use, .006

5.2 Continuity of Operators

209

Since Mφ(1) is a bounded operator on L p (μT ) by Lemma 5.2.4, it suffices to prove the following inequality:   sup Mφ(2) F p ; F ∈ P, F p  1 < ∞. (5.2.6) First we show (5.2.6) when α = 1. Define the operator R by

∞ R= T t (I − J0 − · · · − Jn−1 ) dt. 0

Then, we have



R jF =

···

0



T t1 +t2 +···+t j (I − J0 − · · · − Jn−1 )F dt1 · · · dt j .

0

By Lemma 5.2.5, 1 F p . nj

R j F p  Cn,p

(5.2.7)

Moreover, by the definition of R, RJk F =

1 Jk F, k

1 Jk F kj

R j Jk F =

(k  n).

Combining this with the series expansion of ψ, ψ(x) =



(x ∈ (−δ, δ)),

ajxj

j=0

we obtain φk Jk F = ψ(k−1 )Jk F =



1 Jk F = a j R j Jk F. j k j=0 ∞

aj

j=0

Hence Mφ(2) F =



a j R j F.

j=0

From this identity and (5.2.7), we obtain Mφ(2) F p  Cn,p

∞ j=0

*1+j

|a j |

n

F p

and (5.2.6). Second, we show (5.2.6) when α < 1. For t  0, let νt be a probability measure on [0, ∞) such that

∞ α e−λs νt (ds) = e−λ t (λ > 0). 0

4:36:52, subject to the Cambridge Core terms of use, .006

210

Malliavin Calculus

Set

Qt =



T s νt (ds)

0

and define



Q=

Qt (I − J0 − · · · − Jn−1 ) dt.

0

By Lemma 5.2.5, Q j F p  Cn,p

* 1 +j F p . nα

Moreover, by definition, Q j Jk F =

* 1 +j Jk F kα

for k  n. From these observations we obtain (5.2.6) by a similar argument to the case where α = 1.  By the following lemma, the assertions of Lemmas 5.2.4, 5.2.5, and 5.2.6 also hold if we replace P and L p (μT ) by P(E) and L p (μT ; E). Lemma 5.2.7 Let K be a real separable Hilbert space and 1 < p  q < ∞. Suppose that a linear operator A : P → P(K) is extended to a continuous operator L p (μT ) → Lq (μT ; K). Define A(G e) = (AG) ⊗ e (G ∈ P, e ∈ E) and extend A to P(E). Then, A : P(E) → P(K ⊗ E) is extended to a continuous linear operator L p (μT ; E) → Lq (μT ; K ⊗ E). Proof We use the following Khinchin’s inequality (see [112]): Let {rn }∞ n=1 be a Bernoulli sequence on a probability space (Ω, F , P), that is, r1 , r2 , . . . are independent and satisfy P(ri = 1) = P(ri = −1) = 12 (i = 1, 2, . . .). Then, for any p > 1, N ∈ N, e1 , . . . , eN ∈ E, N N 1 1 +1 N 11 p + 1 * 1 1 1 1 * P 111 p 11e 112 2  B *EP 11 r e 11 p + p ,  E 1 rm em 11 1 m E p m m1 E E Bp m=1 m=1 m=1 1 ). where B p = (p − 1) ∨ ( p−1 N For F ∈ P(E), take an orthonormal basis {en }∞ n=1 F n en n=1 of E so that F = for some N ∈ N and Fn ∈ P (n = 1, . . . , N). Denoting by LA the operator norm of A : L p (μT ) → Lq (μT ; K), by Khinchin’s inequality, we obtain

11 11q 1AF 1q,K⊗E =

N 1 q * 11AF (w)1112 + 2 μ (dw) n T K

WT n=1

4:36:52, subject to the Cambridge Core terms of use, .006

5.2 Continuity of Operators

 Bqq WT

N 11q  11 EP 11 rn AFn (w)11 μT (dw) K

* q

 Bqq LA EP  Bqq LqA Bqp =

211



*

n=1

N p  + q  rn Fn (w) μT (dw) p

WT n=1 N *

+ qp + 2p Fn (w)2 μT (dw)

WT n=1 1 1 q p q 1 1q Bq LA B p 1F 1 p,E .

Hence, A : P(E) → P(K ⊗ E) is extended continuously.



Proof of Theorem 5.2.1. (1) Let r ∈ R and p > 1. Define φ = {φn }∞ n=0 by * n + 2r * 1 + 2r φ0 = 0, φn = = (n  1). 1+n 1 + 1n By Lemma 5.2.6, there exists a constant C p such that Mφ F p  C p F p for any F ∈ P(E). By Lemma 5.2.2, we have r

r

∇Mφ (I − L) 2 F = (I − L) 2 ∇F. Moreover, by Theorem 5.1.9, there exists a constant C p such that ∇F p  C p (I − L) 2 F p . 1

Summing up the above observations, we obtain r

r

(I − L) 2 ∇F p = ∇Mφ (I − L) 2 F p  C p (I − L) 2 Mφ (I − L) 2 F p = C p Mφ (I − L) 1

 C pC p (I − L)

r

r+1 2

r+1 2

F p

F p = C pC p Fr+1,p .

Hence, ∇ : Dr+1,p (E) → Dr,p (HT ⊗ E) is continuous. (2) The assertion follows from (1) and the duality. ∗ ∗ (3) Let { n }∞ n=1 ⊂ WT be an orthonormal basis of HT . By (5.1.5), ∇ n (w) = n (w), μT -a.s. w ∈ WT . Hence, by Lemma 5.1.11,

√ 7 8

∇T t F(w), n HT = e−t (∇F)(e−t w + 1 − e−2t w ), n HT μT (dw ) WT

√ 7 8 e−t = √ ∇[F(e−t w + 1 − e−2t · )](w ), n HT μT (dw ) 1 − e−2t WT

√ e−t = √ F(e−t w + 1 − e−2t w ) n (w )μT (dw ). 1 − e−2t WT

4:36:52, subject to the Cambridge Core terms of use, .006

212

Malliavin Calculus

Since { n }∞ n=1 ⊂ H1 is an orthonormal basis of H1 ,

∞ √ +2 11 12 e−2t * 1∇T t F(w)11H = F(e−t w + 1 − e−2t w ) n (w )μT (dw ) −2t T 1−e WT n=1 =

√ 12 e−2t 11 1 J1 [F(e−t w + 1 − e−2t ·)]112 . −2t 1−e

Set Ap = Then, we have

* R

* R

+ 1p x2 1 √ |x| p e− 2 dx . 2π

+ 1p √ y2 1 √ |y| p e− 2t dy = A p t. 2πt

In particular, 11 112since G ∈ H1 is a Gaussian random variable with mean 0 and variance 1G12 , G p = A p G2 . Combining this with Lemma 5.2.4, we obtain

1 11∇T F 111 p dμ t T HT WT √ * + p 11 1p e−t 1 J1 [F(e−t w + 1 − e−2t ·)]112 μT (dw) = √ WT 1 − e−2t

1 −t * + p e 11 J [F(e−t w + √1 − e−2t ·)]111 p μ (dw) = A−p √ 1 p p T WT 1 − e−2t

√ * +p 11 −t 1p e−t p 1F(e w + 1 − e−2t ·)11 p μT (dw)  A−p p b p,1 √ W 1 − e−2t

T −t * + * + p 11 11 p p e e−t p p 1F 1 p , = A−p T t |F| p dμT = A−p p b p,1 √ p b p,1 √ WT 1 − e−2t 1 − e−2t where the last identity follows from

G(T t K) dμT = (T t G)K dμT WT

(G, K ∈ P)

and

T t 1 = 1.

WT

Hence, by Lemma 5.2.7, ∇T t : P(E) → P(HT ⊗ E) is extended to a continuous linear operator from L p (μT ; E) into L p (μT ; HT ⊗ E). Therefore, T t (L p (μT ; E)) ⊂ D1,p (E). Repeating the above arguments inductively, we obtain the assertion.



We end this section by showing the fundamental properties of ∇ and ∇∗ . Theorem 5.2.8 Let p, q, r > 1 be such that separable Hilbert spaces.

1 r

=

1 p

+

1 q

and E, E1 , E2 be real

4:36:52, subject to the Cambridge Core terms of use, .006

5.2 Continuity of Operators

213

(1) Let F ∈ D1,p (E1 ), G1 ∈ D1,q (E2 ), G2 ∈ D1,q (HT ⊗ E2 ) and K ∈ D1,p . Then, F ⊗ G1 ∈ D1,r (E1 ⊗ E2 ), KG2 ∈ D1,r (HT ⊗ E2 ) and ∇(F ⊗ G1 ) = F ⊗ ∇G1 + ∇F ⊗ G1 , ∗



∇ (KG2 ) = K∇ G2 − ∇K, G2 HT ,

(5.2.8) (5.2.9)

where E1 ⊗ HT ⊗ E2 is identified with HT ⊗ E1 ⊗ E2 . (2) Let k ∈ Z+ . Both of the following mappings are bounded and bilinear: Dk,p (E1 ) × Dk,q (E2 ) (F, G) → F ⊗ G ∈ Dk,r (E1 ⊗ E2 ), Dk,p (E) × Dk,q (E) (F, G) → F, G E ∈ Dk,r . In particular, if F, G ∈ D∞,∞− , then FG ∈ D∞,∞− . Proof (1) Let F ∈ P(E1 ) and G1 ∈ P(E2 ). By (5.1.1), the E1 ⊗ E2 -valued random variable ∇(F ⊗ G1 ), h HT is obtained by

∇(F ⊗ G1 )(w), h HT =

d   (F ⊗ G1 )(w + ξh) dξ ξ=0

(w ∈ WT , h ∈ HT ).

Hence, ∇(F ⊗ G1 ) = F ⊗ ∇G1 + ∇F ⊗ G1 . By the continuity of ∇, (5.2.8) holds for any F ∈ D1,p (E1 ) and G1 ∈ D1,q (E2 ). Next, let G2 ∈ P(HT ⊗ E2 ), K ∈ P and ψ ∈ P(E2 ). By (5.2.8), we have

KG2 , ∇ψ HT ⊗E2 dμT =

G2 , ∇(Kψ) − ∇K ⊗ ψ HT ⊗E2 dμT WT WT

=

K∇∗G2 − ∇K, G2 HT , ψ E2 dμT . WT

By the continuity of ∇ and ∇∗ , (5.2.9) holds for any G2 ∈ D1,q (HT ⊗ E2 ) and K ∈ D1,p . (2) The assertion is trivial by (1) and the definition of the inner product.  Proposition 5.2.9 For G ∈ D1,2 (E), if ∇G = 0, then there exists an e ∈ E such that G = e, μT -a.s. Proof We may assume E = R. Let { j }∞j=1 ⊂ WT∗ be an orthonormal basis of 5 HT . As in Theorem 5.1.6, we set Hα = ∞j=0 Hα j ( j ) for α = (α1 , α2 , . . .) ∈ A . Since Hn (x) = xHn (x) − (n + 1)Hn+1 (x) and ∇∗ j = j , by Theorem 5.2.8, ∇∗ (Hα j ) = (α j + 1)Hα+δ j ,

(5.2.10)

where δ j = (δ ji )i∈N .

4:36:52, subject to the Cambridge Core terms of use, .006

214

Malliavin Calculus

For α ∈ A with |α|  0, fix j ∈ N satisfying α j  0. Since Hα = ∇∗ (α−1 j Hα−δ j j ) by (5.2.10), we have

GHα dμT =

∇G, α−1 j Hα−δ j j HT dμT = 0. WT

WT



Hence, by Theorem 5.1.6, G is a constant.

5.3 Characterization of Sobolev Spaces The aim of this section is to present explicit criteria for generalized Wiener functionals to belong to Dr,p (E), by using the continuity of ∇ and ∇∗ . The following characterization of Dr,p (E) holds as in the theory of distributions on finite dimensional spaces. Theorem 5.3.1 Let r ∈ R, k ∈ Z+ and p > 1. (1) Φ ∈ D−∞,1+ (E) belongs to Dr,p (E) if and only if , sup

Φ, F E dμT ; F ∈ P(E), F−r,q  1 < ∞, WT p where q = p−1 . (2) F ∈ L p (μT ; E) belongs L p (μT ; HT⊗k ⊗ E) such that

to Dk,p (E) if and only if there exists an Fk ∈

F, (∇∗ )k G E dμT = WT

WT

Fk , G HT⊗k ⊗E dμT

for any G ∈ P(HT⊗k ⊗ E), where HT⊗k = HT ⊗ · · · ⊗ HT . Moreover, in this case, ; 0 so that   ∂ f  | f (x)| + ni=1  ∂x i (x) sup 2r. Choose Fim ∈ P so that Fim − Fi 1,p = 0 (m → ∞). Then, for G ∈ P(HT ),

f (F1 , . . . , Fn )∇∗G dμT = lim f (F1m , . . . , Fnm )∇∗G dμT m→∞

WT

WT

@ n A ∂f m m m = lim (F , . . . , F )∇F , G dμT n i 1 HT m→∞ W ∂xi T i=1

@ n A ∂f = (F1 , . . . , Fn )∇Fi , G dμT . i HT WT i=1 ∂x Since p is arbitrary, Theorem 5.3.1 implies f (F1 , . . . , Fn ) ∈ D1,∞− and (5.3.1).  Repeating this argument, we obtain f (F1 , . . . , Fn ) ∈ D∞,∞− . The operator ∇∗ is a generalization of stochastic integrals.1 Theorem 5.3.3 (1) Let N ⊂ B(WT ) be the totality of sets of zero μT -outer   measure and Ft = σ N ∪ σ({θ(u), u  t}) . Let {u(t) = (u1 (t), . . . , ud (t))}t∈[0,T ] be an Rd -valued {Ft }-predictable stochastic process such that

* T + |u(t)|2 dt dμT < ∞. 0

WT

Define Φu : WT → HT by

Φu (w)(t) =

t

u(s)(w) ds

(t ∈ [0, T ]).

0

Then, Φu ∈ L2 (μT ; HT ) and ∇∗ Φu =

d α=1

T

uα (t) dθα (t).

(5.3.2)

0

∞ (Rd ). Then, for any t ∈ [0, T ] and α = 1, . . . , d, (2) Let f ∈ Cexp

t f (θ(s)) dθα (s) ∈ D∞,∞− . 0 1

∇∗

coincides with the Skorohod integral, which is a generalization of stochastic integrals ([28, 92]).

4:36:52, subject to the Cambridge Core terms of use, .006

218

Malliavin Calculus

Remark 5.3.4 In the above assumption on u, μT is extended to !FT naturally, T and so is the measurability. Even so, we may think of Φu and 0 uα (t)dθα (t) as B(WT )-measurable functions. To see this, let Ft0 = σ({θ(s); s  t}). Notice that every Ft -measurable F possesses an Ft0 -measurable modifica Hence every ν = {ν(t)}t∈[0,T ] ∈ L 0 ({Ft }), the L 0 -space with respect tion F. B t∈[0,T ] ∈ L 0 ({Ft0 }) such that ν = {ν(t)} to {Ft } (Definition 2.2.4), admits  ν(t) (0  t  T )) = 1. Therefore, by Proposition 2.2.8, there exists μT (ν(t) =  un = {un (t) = (u1n (t), . . . , udn (t))}t∈[0,T ] ∈ L 0 ({Ft0 }) such that

 T  |un (t) − u(t)|2 dt dμT → 0 (n → ∞). WT

0

Then, defining Φu (w)α (t) =

t

lim sup uαn (s) ds n→∞

0

and

T

uα (t) dθα (t) = lim sup

n→∞

0

T

uαn (t) dθα (t) (α = 1, . . . , d),

0

we obtain the desired B(WT )-modifications.   d Proof (1) Take a sequence {un (t) = (u1n (t), . . . , udn (t))}t∈[0,T ] ∞ n=1 of R -valued α 0 0 stochastic processes with {un (t)}t∈[0,T ] ∈ L ({Ft }) (α = 1, . . . , d) (see Definition 2.2.4) such that

* T + lim |un (t) − u(t)|2 dt dμT = 0. n→∞

WT

0

there exist an increasing sequence 0 = t0n < By the definition of L t1n < · · · < tkn < · · · < tmn n = T and bounded, Ftkn -measurable Rd -valued random d 1 , . . . , ξn,k ) such that variables ξn,k = (ξn,k 0

α uαn (t) = ξn,k

({Ft0 }),

n (tkn < t  tk+1 , k = 0, . . . , mn − 1, α = 1, . . . , d).

Since Ft0 is generated by θ(s) (s  t), we may assume that, taking a subsek,n n α ∞ d jk,n ) quence if necessary, there exist 0 < sk,n 1 < · · · < s jk,n  tk and φn,k ∈ C b (R such that k,n α ξn,k = φαn,k (θ(sk,n 1 ), . . . , θ(s jk,n )).

(5.3.3)

α−1

9:;< For α = 1, . . . , d, let eα = (0, . . . , 0, 1, 0, . . . , 0) ∈ Rd . For 0  s < t  T , α ∈ HT by define (s,t] α ˙(s,t] (v) = 1(s,t] (v)eα

(v ∈ [0, T ]),

4:36:52, subject to the Cambridge Core terms of use, .006

5.3 Characterization of Sobolev Spaces

219

α , h HT = hα (t) − hα (s) (h ∈ HT ). Then, that is, (s,t]

Φun =

m d n −1 k=0 α=1

α α ξn,k (tn ,tn ] . k k+1

Hence, by (5.2.9), ∇∗ Φun =

m d n −1



k=0 α=1

α ∇∗ (tαn ,tn ξn,k

k k+1 ]

 α − ∇ξn,k , (tαn ,tn ] HT . k k+1

By the expression (5.3.3) and Corollary 5.3.2, α , (tαn ,tn ] HT = 0

∇ξn,k

(k = 0, . . . , mn − 1).

k k+1

By (5.1.5), {un (t)}t∈[0,T ] satisfies (5.3.2). In particular, for F ∈ P, we have

d *

Φun , ∇F HT dμT =

WT

T

+ uαn (t) dθα (t) F dμT .

T

+ uα (t) dθα (t) F dμT ,

0

WT α=0

Letting n → ∞, we arrive at

d *

Φu , ∇F HT dμT =

WT

WT α=0

0

which gives (5.3.2). (2) Define* {u(s)} s∈[0,T ] by+ uα (s) = f (θ(s))1[0,t] (s) and uβ (s) = 0 (β  α).  Since exp max0sT |θ(s)| ∈ p∈(1,∞) L p (μT ), Corollary 5.3.2 implies Φu ∈  D∞,∞− (HT ). (1) and the continuity of ∇∗ yields the conclusion. By Theorem 5.3.3, we obtain an explicit formula for the integrand in Itô’s representation theorem (Theorem 2.6.2) for martingales, as will be seen below. The result is called the Clark–Ocone formula, which, for example, plays an important role in the theory of mathematical finance to obtain the hedging strategy for derivatives. Theorem 5.3.5 Let Ft be as in Theorem 5.3.3. For F ∈ D1,2 , set f α (t, w) = 9:;< 9:;< ˙ ˙ E[( (∇F) (w))α (t)|Ft ], where ( (∇F) (w))α (t) is the α-th component of the value at time t of the derivative of (∇F)(w) ∈ HT and E[ · |Ft ] is the conditional expectation with respect to the natural extension of μT to Ft . Then, F = E[F] +

d α=1

T

f α (t) dθα (t).

(5.3.4)

0

4:36:52, subject to the Cambridge Core terms of use, .006

220

Malliavin Calculus

Proof By Itô’s representation theorem (Theorem 2.6.2), there exists some {gα (t)}t∈[0,T ] ∈ L 2 (α = 1, . . . , d) such that d T F = E[F] + gα (t) dθα (t). α=1

0

What is to be shown is gα (t) = f α (t) (α = 1, . . . , d). Let {uα (t)}t∈[0,T ] be as in Theorem 5.3.3. Since stochastic integrals are isometries (Proposition 2.2.10), Theorem 5.3.3 implies

d * T + uα (t)gα (t) dt dμT WT α=1

=

0 d *

WT α=1

T

d +* u (t) dθ (t) α

α

0

α=1

T

+ gα (t) dθα (t) dμT

0

(∇∗ Φu )(F − E[F]) dμT .

=

(5.3.5)

WT

By the definitions of dual operators and the inner product, the last term is rewritten as

d * T 9:;< + ˙

Φu , ∇F HT dμT = uα (t)( (∇F) )α (t) dt dμT . WT

WT α=1

0

α

Moreover, since {u (t)} is {Ft }-adapted, it is equal to

d * T + uα (t) f α (t) dt dμT WT α=1

0

by Fubini’s theorem. Comparing this with (5.3.5), we obtain gα (t) = f α (t) (α =  1, . . . , d) since {u(t)}t∈[0,T ] is arbitrary. Next, we show that the Lipschitz continuity of a Wiener functional implies its differentiability.   with Theorem 5.3.6 Suppose that, for F ∈ p∈(1,∞) L p (μT ), there exists F  = F, μT -a.s., and a constant C such that F  + h) − F(w)|  |F(w  ChHT

(5.3.6)

for any w ∈ WT and h ∈ HT . Then, F ∈ D1,∞− and ∇FHT  C, μT -a.s. 1 1−2 Proof Let ∈ WT∗ (  0). Define π : WT → WT by π (w) = w − 11 11H (w) T and decompose WT into an orthogonal sum WT = π (WT ) ⊕ R = {w + ξ ; w ∈ π (WT ), ξ ∈ R}.

4:36:52, subject to the Cambridge Core terms of use, .006

5.3 Characterization of Sobolev Spaces

221

Then, by the Itô–Nisio theorem (Theorem 1.2.5), we have ξ2

μT = (μT ◦

π−1 )

− 2 1 2  ⊗  1 1 e HT dξ. 2 1 1 2π1 1 HT

  + ξ ) is absolutely continuous by the Let w ∈ π (WT ). Since R ξ → F(w assumption, the set   + ξ )   + (ξ + ε) ) − F(w F(w does not converge as ε → 0 ε has Lebesgue measure 0. Hence, setting ,

ξ ∈ R;

,  + ε ) − F(w)  F(w A( ) = w ∈ WT ; lim exists , ε→0 ε we have μT (A( )) = 1 by Fubini’s theorem. Set G(w, ) = 1A( ) (w) lim

ε→0

 + ε ) − F(w)  F(w ε

(w ∈ WT ).

Then, by the assumption, |G(w, )|  C HT . for any w ∈ WT and ∈ WT∗ . ∗ Let { k }∞ k=1 ⊂ WT be an orthonormal basis of HT and set K =

n ,

q j j ; q j ∈ Q ( j = 1, . . . , n), n ∈ N .

j=1

n

For = j=1 q j j ∈ K and φ ∈ P, we have by Lemma 5.1.2



 F(· + ε ) − F(·)  φ dμT φ dμT = F∂ G(·, )φ dμT = lim ε→0 W ε WT WT T

n n  F∂ j φ dμT = = qj q jG(·, j )φ dμT (5.3.7) j=1

WT

WT j=1

and, for any ∈ K , G(·, ) =



, j HT G(·, j ),

μT -a.s.

j=1

Hence, setting ∞ ∞

, B= w∈ A( j ) ; G(w, ) =

, j HT G(w, j ), ∈ K , j=1

j=1

4:36:52, subject to the Cambridge Core terms of use, .006

222

Malliavin Calculus

we have μT (B) = 1. If w ∈ B, then ∞    , j H G(w, j ) = |G(w, )|  C H T T

( ∈ K ).

j=1

Hence, letting N ∈ N and taking kn ∈ K with N 11 11 G(w, j ) j 11 lim 11kn −

n→∞

j=1

HT

=0

and kn , j HT = 0 ( j  N + 1),

we obtain N

N G(w, j ) = lim

kn , j HT G(w, j ) 2

n→∞

j=1

j=1

 lim sup Ckn HT = C n→∞

N ,

- 12 G(w, j )2 .

j=1

Letting N → ∞, we obtain ∞

G(w, j )2  C 2 < ∞.

(5.3.8)

j=1

If we set G(w) = 1B (w)



G(w, j ) j ,

j=1

then G(w)HT  C (w ∈ WT ) by (5.3.8). Moreover, by (5.3.7),

F∇∗ (φ ) dμT =

G, φ HT dμT WT

WT

for ∈ K and φ ∈ P. Since K is dense in HT ,

∗ F∇ K dμT =

G, K HT dμT WT

(K ∈ P(HT )).

WT

Therefore, by Theorem 5.3.1, F ∈ D1,∞− and ∇F = G.



Corollary 5.3.7 The norm θ = max0tT |θ(t)| belongs to D1,∞− .   √ Proof Since w + h − w  T hHT (h ∈ HT ), Theorem 5.3.6 implies the assertion. 

4:36:52, subject to the Cambridge Core terms of use, .006

5.3 Characterization of Sobolev Spaces

223

Using the following proposition, we can prove that, when d = 1, the derivative of the norm θ is given by 9:;< ˙ (∇θ) = sgn(θ(τ))1[0,τ] ,

μT -a.s.,

(5.3.9)

where τ(w) = inf{t ∈ [0, T ]; |w(t)| = w}. Proposition 5.3.8 (1) Let F ∈ D1,p . Then, F + = max{F, 0} ∈ D1,p and ∇F + = 1(0,∞) (F)∇F,

μT -a.s.

(2) Let F1 , . . . , Fn ∈ D1,p . Then, max1in Fi ∈ D1,p and ∇ max Fi = 1in

n

1Ai ∇Fi ,

μT -a.s.,

i=1

where Ai = {w; F j (w)  Fi (w) ( j < i), F j (w) < Fi (w) ( j > i)}. (3) Let d = 1. Then, max0sT θ(s) ∈ D1,∞− and 9:;< ˙ (∇ max θ(s)) = 1[0,σ] , 0sT

μT -a.s.,

(5.3.10)

where σ(w) = inf{t ∈ [0, T ]; w(t) = max0sT w(s)}. (4) (5.3.9) holds. Proof (1) Take ϕ(x) ∈ C ∞ (R) so that ϕ(x) ! x = 1 (x  1) and ϕ(x) = 0 (x  0). Set ϕn (x) = ϕ(nx) and define ψn (x) = 0 ϕn (y) dy. By the same arguments as in the proof of Corollary 5.3.2, we can show ψn (F) ∈ D1,p . Letting n → ∞, we obtain the conclusion. (2) If n = 2, then the assertion follows from (1) because max{F1 , F2 } = (F1 − F2 )+ + F2 . By induction we obtain the assertion for general n. (3) From (2) we have max0k2n θ( 2kn ) ∈ D1,∞− and 9:;< ˙ 2n * * k ++ ∇ maxn θ n = 1Ank 1[0, kn ] , 2 0k2 2 k=0

(5.3.11)

where Ank = {θ( 2jn )  θ( 2kn ) ( j < k), θ( 2jn ) < θ( 2kn ) ( j > k)}. Since μT (θ(σ) > θ(t), t  σ) = 1 (see [56, p.102]), letting , n → ∞ in (5.3.11) yields (5.3.10). (4) Since θ = max max0sT θ(s), max0sT (−θ(s)) , (1) and (3) yield the conclusion. 

4:36:52, subject to the Cambridge Core terms of use, .006

224

Malliavin Calculus

5.4 Integration by Parts Formula In this section we show an integration by parts formula and, by applying it, we introduce the composition of distributions on RN and Wiener functionals. Definition 5.4.1 F = (F 1 , . . . , F N ) ∈ D∞,∞− (RN ) is called non-degenerate if

+ +−1 * * ∈ L∞− (μT ) = L p (μT ). (5.4.1) det ∇F i , ∇F j HT i, j=1,...,N

p∈(1,∞)

Example 5.4.2 For 1 , . . . , N ∈ WT∗ , suppose that    det i , j HT i, j=1,...,N  0 and set F = ( 1 , . . . , N ). Then, F ∈ P(RN ) and ∇F i (w) = i . Hence, F is non-degenerate. In particular, for t > 0, N = d and i (w) = wi (t) (i = 1, . . . , d, w ∈ WT ), we have    det i , j HT i, j=1,...,N = td > 0. Hence, F = θ(t) is non-degenerate. ∞ Example 5.4.3 Let {hn }∞ n=1 be an orthonormal basis of HT and {a j } j=1 ⊂ R ∞ 2 satisfy j=1 a j < ∞. Set

Fn =

n

a j {(∇∗ h j )2 − 1}.

j=1 ∗

Since {∇ h j } is a sequence of independent identically distributed normal Gaus12 1  sian random variables, 11Fn − Fm 112 = 2 nj=m+1 a2j for n > m. Hence, as the limit of Fn in L2 (μT ), a random variable F=



a j {(∇∗ h j )2 − 1}

j=1

is defined. We have

n n  , -− 12 x2 1 2 eλFn dμT = eλa j (x −1) √ e− 2 dx = (1 − 2λa j )e2λa j WT 2π j=1 R j=1 for any λ ∈ R with a|λ| < 12 , where a = sup j∈N |a j |. Since log(1 − x) + x = 5 2 − x2 + o(x2 ) as x → 0, the infinite product ∞j=1 (1 − 2λa j )e2λa j converges and is not 0. Since e|y|  ey + e−y , applying Fatou’s lemma to the sequences

4:36:52, subject to the Cambridge Core terms of use, .006

5.4 Integration by Parts Formula

225

! ! |λ||F| dμT < ∞. { W e±λFn dμT }∞ n=1 (subsequences if necessary), we obtain WT e T ∞− In particular, F ∈ L (μT ). Set ∞ F = 2 a j (∇∗ h j )h j . Then, since ∇Fn = 2

j=1

n j=1



a j (∇ h j )h j , we obtain by (5.1.9) and Corollary 5.3.2,

∞ ∞ 11 11 12 11 1∇Fn − F  112 = 114 a2j (∇∗ h j )2 11 = 4 a2j → 0 1

j=n+1

Hence, for any G ∈ P(HT ),

F∇∗G dμT = lim Fn ∇∗G dμT n→∞ W WT T

= lim

∇Fn , G HT dμT = n→∞

(n → ∞).

j=n+1

WT

F  , G HT dμT .

WT

 1 12  Moreover, for λ ∈ R with a2 |λ| < 12 , the integrability of exp |λ| 11F  11H is T shown by a similar argument to that in the preceding paragraph, and F  ∈ L∞− (μT ; HT ). Thus F ∈ D1,∞− and ∇F = F  .  Furthermore, since ∇2 Fn = 2 nj=1 a j h j ⊗ h j and ∇3 Fn = 0, Theorem 5.2.1 implies F ∈ D∞,∞− ,

∇F = 2



a j (∇∗ h j )h j ,

j=1

∇2 F = 2



a jh j ⊗ h j,

∇k F = 0

(k  3).

j=1

Finally, we present a sufficient condition for F to be non-degenerate. Suppose that a j  0 for infinitely many js and set { j ; a j  0} = { j(1) < j(2) < · · · }. Putting mn = min{a2j(k) ; k = 1, . . . , n}, we have mn > 0 and n ∞ 11 112 1∇F 1H = a2j(k) (∇∗ h j(k) )2  mn (∇∗ h j(k) )2 T

k=1

(n ∈ N).

k=1

Since ∇∗ h j(k) (k ∈ N) form a sequence of independent identically distributed +− 1 * n ∗ 2 2 ∈ L p (μT ) for n > p. normal Gaussian random variables, k=1 (∇ h j(k) ) 11 11−1  Therefore, 1∇F 1H ∈ p∈(1,∞) L p (μT ) and F is non-degenerate. T

Next we introduce an integration by parts formula associated with nondegenerate Wiener functionals. For this purpose we note the following.

4:36:52, subject to the Cambridge Core terms of use, .006

226

Malliavin Calculus

Lemma 5.4.4 For G ∈ D∞,∞− , assume that G  0, μT -a.s., and G1 ∈ L∞− (μT ). Then, G1 ∈ D∞,∞− . In particular, if F ∈ D∞,∞− (RN ) is non-degenerate, then * +−1

∇F i , ∇F j HT ∈ D∞,∞− (RN ⊗ RN ). i, j=1,...,N

Proof By Corollary 5.3.2, (G + ε)−1 ∈ D∞,∞− for any ε > 0. Moreover, we have n * 1 + φk (G) , ∇n = G+ε (G + ε)k+1 k=0 where φk (G) is a polynomial determined by the tensor products of ∇G, . . . ,  ∇nG. Let ε → 0 to see G1 ∈ D∞,∞− . Theorem 5.4.5 Suppose that F ∈ D∞,∞− (RN ) is non-degenerate and set * +−1   γ = γi j i, j=1,...,N = ∇F i , ∇F j HT . i, j=1,...,N

∞,∞−

Define the linear mapping ξi1 ...in : D ξi [G] =

N

 ∇∗ γi jG∇F j ),

∞,∞−

→D

(i1 , . . . , in ∈ {1, . . . , N}) by

ξi1 ...in [G] = ξin [ξi1 ...in−1 [G]].

j=1

Then, for any p > 1, , sup |ξi1 ...in [G]| dμT ; G ∈ D∞,∞− , Gn,p  1 < ∞.

(5.4.2)

WT n Moreover, for f ∈ C (RN ) and G ∈ D∞,∞− ,

∂n f (F) G dμT = f (F) ξi1 ...in [G] dμT . i1 in WT ∂x · · · ∂x WT

(5.4.3)

Proof (5.4.2) follows from Theorem 5.2.8 and Corollary 5.4.4. We only prove 1 (5.4.3). Let f ∈ C (RN ). Since ∇( f (F)) =

N ∂f (F)∇F i , i ∂x i=1

we have @ A ∂f j (F) = ∇( f (F)), γ ∇F . i j HT ∂xi j=1 N

This implies

WT

∂f (F) G dμT = ∂xi

By induction we obtain (5.4.3).

f (F) ξi [G] dμT . WT

 4:36:52, subject to the Cambridge Core terms of use, .006

5.4 Integration by Parts Formula

227

As an application of the integration by parts formula, we show that a composition of a distribution on RN and a non-degenerate Wiener functional is realized as a generalized Wiener functional. By using this result, we present representations as expectations for probability densities and conditional expectations. Let S (RN ) be the space of rapidly decreasing functions on RN and S  (RN ) be the space of tempered distributions on RN . For k ∈ Z, denote by S2k (RN ) the completion of S (RN ) by the norm     f 2k = sup  I + |x|2 − 12 Δ k f (x), x∈RN

N

where Δ = i=1 ( ∂x∂ i )2 . Then, S2k (RN ) ⊃ S2k+2 (RN ) and S0 (RN ) is the space of continuous functions on RN satisfying lim|x|→∞ | f (x)| = 0. Moreover, S (RN ) =



S2k (RN ) and

S  (RN ) =

k=1

∞ 

S−2k (RN ).

k=1

Theorem 5.4.6 Let p > 1 and k ∈ Z+ , and suppose that F ∈ D∞,∞− (RN ) is non-degenerate. Then, there exists a constant C such that  f (F)−2k,p  C f −2k for any f ∈ S (RN ). Proof Define η : D∞,∞− → D∞,∞− by 1 ξii [G]. 2 i=1 N

η[G] = G + |F|2G −

By Theorem 5.4.5,

 2 1 k  I + |x| − 2 Δ f (F)G dμT = WT

f (F)ηk [G] dμT .

WT

This implies

f (F)G dμT = WT

WT



−k  f (F)ηk [G] dμT

I + |x|2 − 12 Δ

for any G ∈ D∞,∞− . As we have shown in the proof of Theorem 5.1.10, we have , f (F)G dμT ; G ∈ D∞,∞− , G2k,q  1 ,  f (F)−2k,p = sup WT

where q =

p p−1 .

Combining this with (5.4.2), we obtain the conclusion.



4:36:52, subject to the Cambridge Core terms of use, .006

228

Malliavin Calculus

Corollary 5.4.7 If F ∈ D∞,∞− (RN ) is non-degenerate, then, for any p > 1 and k ∈ Z+ , the mapping S (RN ) f → f (F) ∈ Dk,p is extended to a continuous linear mapping ΦF : S−2k → D−2k,p . Definition 5.4.8 Suppose that F ∈ D∞,∞− (RN ) is non-degenerate. For u ∈ S−2k (RN ), the generalized Wiener functional ΦF (u) ∈ D−2k,p in Corollary 5.4.7 is denoted by u(F) and called the pull-back of u by F. By Corollary 5.4.7 we obtain the following. Corollary 5.4.9 Assume that F ∈ D∞,∞− (RN ) is non-degenerate. Let p > 1 and U ⊂ Rn be an open set. (1) Let m ∈ Z+ . If the mapping U z → uz ∈ S−2k is of C m -class, then so is the mapping U z → uz (F) ∈ D−2k,p . (2) Assume ! that U z → uz ∈ S−2k is continuous and admits the Bochner integral U uz dz. Then, z → uz (F) is Bochner integrable as a D−2k,p -valued function and

+ * uz dz (F) = uz (F) dz. U

U

Remark 5.4.10 In the above, for a Banach space E, the derivative of an Evalued function ψ : U → E at z ∈ U is, by definition, a continuous linear mapping ψ (z) : Rn → E such that  1ε {ψ(z + εξ) − ψ(z)} − [ψ (z)](ξ)E → 0 (ε → 0) for any ξ ∈ Rn . The higher order derivatives are defined inductively. For the Bochner integral, see [133]. Using a composition of a non-degenerate functional and a distribution, we have the following expression of the probability density via a generalized Wiener functional. Let δ x be the Dirac measure on RN concentrated at x ∈ RN . Theorem 5.4.11 Suppose that F ∈ D∞,∞− (RN ) is non-degenerate. (1) Let pF (x) be the value of δ x (F) ∈ D−∞,1+ at 1 ∈ D∞,∞− ; pF (x) = E[δ x (F)] = [δ x (F)](1). Then, pF is of C ∞ -class and the probability density of F:

pF (x) dx (A ∈ B(RN )). μT (F ∈ A) = A

(2) Let G ∈ D∞,∞− and set pG|F (x) = E[δ x (F)G]. Then,

f (F)G dμT = f (x)pG|F (x) dx WT

RN

4:36:52, subject to the Cambridge Core terms of use, .006

5.4 Integration by Parts Formula

229

for any f ∈ S (RN ). In particular, pG|F (x) = pF (x)E[G|F = x] holds for almost all x ∈ RN with pF (x) > 0. Proof For k ∈ Z+ , the mapping RN x → δ x ∈ S−2([ N2 ]+1+k) is of C 2k -class (see [45, Lemma V-9.1]). Hence, by Corollary 5.4.9, both pF and pG|F are of C ∞ -class. ! For f ∈ S (RN ), the integral RN f (x)δ x dx of the S−2([ N2 ]+1+k) -valued function x → f (x)δ x coincides with f . By Corollary 5.4.9 again,

f (x)δ x (F) dx = f (F). RN

Hence, by Corollary 5.4.9 and the commutativity between Bochner integrals and linear continuous operators, we obtain, for G ∈ D∞,∞− ,

f (F)G dμT = f (x)E[δ x (F)G] dx. (5.4.4) RN

WT

Setting G = 1 in (5.4.4), we see that pF is the probability density of F. More! over, since the left hand side of (5.4.4) is equal to RN f (x)E[G|F = x]pF (x) dx, the identity pG|F (x) = pF (x)E[G|F = x] holds for almost all x ∈ RN with pF (x) > 0.



Positive distributions on R are realized by measures ([36]). Similar facts holds for generalized Wiener functionals. n

Definition 5.4.12 Φ ∈ D−∞,1+ is said to be positive (Φ  0 in notation) if

FΦ dμT  0 WT

holds for any non-negative F ∈ D∞,∞− . If Φ ∈ L p (μT ), then the condition in the above definition is equivalent to Φ  0, μT -a.s. Set   F Cb∞ = F ; F = f ( 1 , . . . , n ), f ∈ Cb∞ (Rn ), 1 , . . . , n ∈ WT ∗ , n ∈ N . ! Since F Cb∞ is dense in Dr,p , Φ is positive if and only if W FΦ dμT  0 for T any non-negative F ∈ F Cb∞ . Proposition 5.4.13 If F ∈ D∞,∞− (RN ) is non-degenerate, δ x (F) ∈ D−∞,1+ is positive.

4:36:52, subject to the Cambridge Core terms of use, .006

230

Malliavin Calculus

Proof The assertion follows from the identity

Gδ x (F) dμT = E[G|F = x]pF (x)

(G ∈ D∞,∞− ).



WT

Lemma 5.4.14 Let Φ ∈ D−∞,1+ be positive. Then, Φ = 0 if and only if ! Φ dμT = 0. W T

! Proof Obviously W Φ dμT = 0 if Φ = 0. We show the converse. Suppose T that F ∈ F Cb∞ is non-negative and set M = supw∈WT F(w). Since M − F ∈ F Cb∞ is non-negative, we have

(M − F)Φ dμT = − FΦ dμT  0. 0 WT

Hence

! WT

WT

FΦ dμT = 0. Since F is arbitrary, we obtain Φ = 0.



Theorem 5.4.15 For any positive Φ ∈ D−∞,1+ , there exists a finite measure νΦ on WT such that

FΦ dμT = F dνΦ (5.4.5) WT

WT

for any F ∈ F Cb∞ . Remark 5.4.16 If p > 1 and Φ ∈ L p (μT ), then dνΦ = Φ dμT . Proof !Let Φ ∈ D−r,p , Φ  0 (r ∈ R, p > 1). By Lemma 5.4.14, we may assume W Φ dμT = 1. Set T

D=

,k n ; n ∈ Z , k ∈ Z , k  2 T . + + 2n

For t j ∈ D ( j = 1, . . . , n) with 0  t1 < · · · < tn , define ut1 ...tn : S ((Rd )n ) → R by

  f θ(t1 ), . . . , θ(tn ) Φ dμT , ut1 ...tn ( f ) = WT

where {θ(t)}t∈[0,T ] is the coordinate process. Then ut1 ...tn is a positive distribution. Hence, there exists a probability measure νt1 ...tn on (Rd )n such that

f (θ(t1 ), . . . , θ(tn ))Φ dμT = f (x1 , . . . , xn )νt1 ...tn (dx1 · · · dxn ) WT

(Rd )n

4:36:52, subject to the Cambridge Core terms of use, .006

5.4 Integration by Parts Formula

231

for any f ∈ S ((Rd )n ). Since {νt1 ...tn ; t1 , . . . , tn ∈ D, n ∈ N} is consistent, by Kolmogorov’s extension theorem (see, e.g., [56, 114]), there exists a probability measure νΦ on (Rd )D such that   νΦ (X(t1 ), . . . , X(tn )) ∈ A = νt1 ...tn (A).

(5.4.6)

for any A ∈ B((Rd )n ) (t1 < · · · < tn ∈ D), where X(t) : (Rd )D → Rd is given by X(t, φ) = φ(t) (φ ∈ (Rd )D ). By Lemma 5.2.6, there exists a constant C such that Gr,q  CGq ?4 for any G ∈ n=0 Hn , where q = t, s ∈ D, we have

4 |X(t) − X(s)| dνΦ =

p p−1 .

Since |θ(t) − θ(s)|4 ∈

?4 n=0

Hn for any

|θ(t) − θ(s)|4 Φ dμT * 1 − |x|2 + 1q  CΦ−r,p |x|4q e 2 dx |t − s|2 . d d 2 R (2π)

(Rd )D

WT

Hence, by Kolmogorov’s continuity theorem (Theorem A.5.1), {X(t)}t∈D is extended to a stochastic process {X(t)}t∈[0,T ] , which is continuous almost surely with respect to νΦ . Therefore, νΦ is regarded as a probability measure on WT . Let f ∈ Cb∞ ((Rd )n ). By (5.4.6), we have for t1 < · · · < tn ∈ D

f (θ(t1 ), . . . , θ(tn ))Φ dμT = f (θ(t1 ), . . . , θ(tn )) dνΦ . WT

WT

Since D is dense in [0, T ], this identity continues to hold for any t1 < · · · < tn ∈ [0, T ]. We have now proved the conclusion because the elements of the form  f (θ(t1 ), . . . , θ(tn )) form a dense subset in D∞,∞− . Example 5.4.17 Let η1 , . . . , ηn ∈ WT∗ form an orthonormal system in HT . Then, by Example 5.4.2, η = (η1 , . . . , ηn ) ∈ D∞,∞− (Rn ) is non-degenerate and δ x (η) is positive by Proposition 5.4.13. Take ϕ ∈ C0∞ (Rn ) so that ϕ(y) = 1 for |y|  1 and set ϕm (y) = mn ϕ( y−x m ). By Theorem 5.4.11 we have

1 − |x|2 e 2 = δ x (η) dμT = ϕm (η)δ x (η) dμT √ WT WT 2πn

= ϕm (η) dνδx (η) → νδx (η) ({η = x}) (m → ∞). WT

4:36:52, subject to the Cambridge Core terms of use, .006

232

Malliavin Calculus

Hence

⎧ 1 − |x|2 ⎪ ⎪ ⎪ 2 ⎪ n e ⎨ 2 (2π) νδx (η) ({η = y}) = ⎪ ⎪ ⎪ ⎪ ⎩ 0

(y = x), (y  x).

Thus νδx (η) is a measure concentrated on the “hyperplane” {w ; η(w) = x} on the Wiener space. Since μT (η = x) = 0, νδx (η) is singular with respect to μT .

5.5 Application to Stochastic Differential Equations We present applications of the Malliavin calculus to stochastic differential equations. Throughout this section, let V0 , V1 , . . . , Vd : RN → RN be C ∞ functions on RN with bounded derivatives of all orders. In this section, we think of {θ(t)}t∈[0,T ] as an {Ft }-Brownian motion as described in Theorem 5.3.3. However, as mentioned in the remark after the theorem, all random variables are B(WT )-measurable. Denote by {X(t, x)}t∈[0,T ] the unique strong solution of the stochastic differential equation dX(t) =

d

Vα (X(t)) dθα (t) + V0 (X(t)) dt,

X(0) = x

(5.5.1)

α=1

(Theorem 4.4.5). By Theorem 4.10.8, X(t, ·) is of C ∞ -class and the Jacobian i )i, j=1,...,N satisfies the stochastic differential equation matrix Y(t, x) = ( ∂X∂x(t,x) j dY(t, x) =

d α=1

Vα (X(t, x))Y(t, x) dθα (t) + V0 (X(t, x))Y(t, x) dt,

Y(0, x) = I,

(5.5.2)

 ∂V i  where Vα (x) = ∂xαj (x) i, j=1,...,N (α = 0, 1, . . . , d). Moreover, Y(t, x) is nondegenerate and the inverse matrix Z(t, x) = Y(t, x)−1 satisfies the stochastic differential equation dZ(t, x) = −

d α=1

+

Z(t, x)Vα (X(t, x)) dθα (t) − Z(t, x)V0 (X(t, x)) dt

d α=1

Z(t, x)(Vα (X(t, x)))2 dt.

(5.5.3)

4:36:52, subject to the Cambridge Core terms of use, .006

5.5 Application to Stochastic Differential Equations

From these observations, we have, in particular,

sup sup {|Y(t, x)| p + |Z(t, x)| p } dμT < ∞ x∈RN

233

(5.5.4)

WT 0tT

*

for any p > 1, where |A| =

N i, j=1

a2i j

+ 12

for a matrix A = (ai j )i, j=1,...,N .

Theorem 5.5.1 Let t ∈ [0, T ]. Then, X(t, x) ∈ D∞,∞− (RN ) and (∇X i (t, x))α (u) =

N

Y ij (t, x)

t∧u 0

j,k=1

Zkj (v, x)Vαk (X(v, x)) dv (α = 1, . . . , d), (5.5.5)

where, for h ∈ HT , hα (u) is the value of the α-th component hα : [0, T ] → R of h at time u ∈ [0, T ]. Proof For n ∈ N, set [s]n =

[2n s] 2n

and define {Xn (s)} s∈[0,T ] by

Xn (0) = x, d

Xn (s) = Xn ([s]n ) +

Vα (Xn ([s]n )){θα (s) − θα ([s]n )}

α=1

+ V0 (Xn ([s]n )){s − [s]n }. By definition, Xn (t) ∈ D∞,∞− (RN ). Moreover, using the expression dXn (s) =

d

Vα (Xn ([s]n )) dθα (s) + V0 (Xn ([s]n )) ds,

(5.5.6)

α=1

we observe

sup |Xn (s) − X(s, x)|2 dμT → 0

(n → ∞).

(5.5.7)

WT 0sT

To see this, set Rn (s) =

d α=1

s

{Vα (Xn (u)) − Vα (Xn ([u]n ))} dθα (u)

0

+

s

{V0 (Xn (u)) − V0 (Xn ([u]n ))} du.

0

4:36:52, subject to the Cambridge Core terms of use, .006

234

Malliavin Calculus

In the same way as for Theorem 4.3.9, we obtain from (5.5.6)

sup sup |Xn (s)| p dμT < ∞ n∈N

WT 0sT

for any p > 1. Hence, by the definition of Xn (s), there exists a constant C1 such that

|Xn (u) − Xn ([u]n )|2 dμT  C1 2−n . WT

By this estimate, the Lipschitz continuity of Vα and the Burkholder–Davis– Gundy inequality (Theorem 2.4.1), there exists a constant C2 such that

sup |Rn (s)|2 dμT  C2 2−n (n = 1, 2, . . .). WT 0sT

Moreover, since X(s) − Xn (s) =

n

s

{Vα (X(u)) − Vα (Xn (u))} dθα (u)

0

α=1

s

+

{V0 (X(u)) − V0 (Xn (u))} du + Rn (s),

0

we see, by using the Burkholder–Davis–Gundy inequality again, that there exist constants C3 and C4 such that

sup |Xn (u) − X(u)|2 dμT WT 0us

s * +  C3 2−n + C4 sup |Xn (u) − X(u)|2 dμT dv (n = 1, 2, . . .). 0

WT 0uv

Hence, by Gronwall’s inequality, we obtain (5.5.7). Let h ∈ HT and set Jn,h (s) = ∇Xn (s), h HT

(s ∈ [0, T ]).

By the definition of Xn (s), dJn,h (s) =

d α=1

Vα (Xn ([s]n ))Jn,h ([s]n ) dθα (s) + V0 (Xn ([s]n ))Jn,h ([s]n ) ds

+

d

Vα (Xn ([s]n ))h˙ α (s) ds.

α=1

Let an RN -valued stochastic process {Jh (s)} s∈[0,T ] be the solution of

4:36:52, subject to the Cambridge Core terms of use, .006

5.5 Application to Stochastic Differential Equations

dJh (s) =

d

235

Vα (X(s, x))Jh (s) dθα (s) + V0 (X(s, x))Jh (s) ds

α=1

d

+

Vα (X(s, x))h˙ α (s) ds

(5.5.8)

α=1

satisfying Jh (0) = 0 and set Rn,h (s) =

d s  α=1

 Vα (Xn ([u]n ))Jn,h ([u]n ) − Vα (X(u, x))Jn,h (u) dθα (u)

0

+

s

0

+

 V0 (Xn ([u]n ))Jn,h ([u]n ) − V0 (X(u, x))Jn,h (u) du



d s 

 Vα (Xn ([u]n )) − Vα (X(u, x)) h˙ α (u) du.

0

α=1

Rewriting as Vα (Xn ([u]n ))Jn,h ([u]n ) − Vα (X(u, x))Jn,h (u) = {Vα (Xn ([u]n )) − Vα (X(u, x))}Jn,h ([u]n ) + Vα (X(u, x)){Jn,h ([u]n ) − Jn,h (u)} and using the estimate

sup |Jn,h (s)| p dμT < ∞

sup n∈N

WT 0sT

for any p > 1, we obtain

sup |Rn,h (s)|2 dμT = 0.

lim

n→∞

WT 0sT

Hence, by the expression Jn,h (s) − Jh (s) =

d α=1

s

0

Vα (X(u, x)){Jn,h (u) − Jh (u)} dθα (u) s

+ 0

V0 (X(u, x)){Jn,h (u) − Jh (u)} du + Rn,h (s),

a similar argument to that in (5.5.7) yields

|Jn,h (t) − Jh (t)|2 dμT → 0

(n → ∞).

(5.5.9)

WT

4:36:52, subject to the Cambridge Core terms of use, .006

236

Malliavin Calculus

Define an HT ⊗ RN -valued random variable F(t) = (F 1 (t), . . . , F N (t)) ∈ (HT ⊗ RN ) by D

t d N Y ij (t, x) Zkj (v, x)Vαk (X(v, x)) g˙ α (v) dv (g ∈ HT )

F i (t), g HT = 0,∞−

0

j,k=1 α=1

for i = 1, . . . , N. Then, by (5.5.8), we have

F(t), h HT = Jh (t).

(5.5.10)

Let φ ∈ P, h ∈ HT , i = 1, . . . , N. Then, by (5.5.7),

X i (t, x)∇∗ (φ · h) dμT = lim Xni (t)∇∗ (φ · h) dμT n→∞ W WT T

i i = lim

∇Xn (t), φ · h HT dμT = lim Jn,h (t)φ dμT . n→∞

n→∞

WT

WT

By (5.5.9) and (5.5.10), the right hand side coincides with

i Jh (t)φ dμT =

F i (t), φ · h HT dμT , WT

WT

and hence we obtain from Theorem 5.3.1, X(t, x) ∈ D1,∞− (RN ) and

∇X(t, x) = F(t).

Using the result ∇X(t, x) = F(t) and repeating a similar argument to the above, we can show X(t, x) ∈ D∞,∞− (RN ). We omit the details and refer to [104].  On the non-degeneracy of X(t, x), we have the following. d  ij  j i Theorem 5.5.2 Set ai j (y) = a (x) i, j=1,...,N α=1 Vα (y)Vα (y). If a(x) = is positive definite at the starting point x of {X(t, x)}t∈[0,T ] , then X(t, x) is non-degenerate for any t ∈ (0, T ]. We give a lemma for the proof. Lemma 5.5.3 Let {uα (t)}t∈[0,T ] (α = 0, 1, . . . , d) be {Ft }-predictable and bounded (see Theorem 5.3.3 for Ft ) and set M = sup{|uα (t, w)|; t ∈ [0, T ], w ∈ WT , α = 0, 1, . . . , d}. Define a stochastic process {ξ(t)}t∈[0,T ] by

t d t ξ(t) = x + uα (s) dθα (s) + u0 (s) ds α=1

0

0

4:36:52, subject to the Cambridge Core terms of use, .006

5.5 Application to Stochastic Differential Equations

237

and, for ε > 0, set σε = inf{t  0; |ξ(t) − x| > ε}. Then,

μT (σε  t)  2

holds for t
sup  ⊂ max |β(s)| > . 2 2 0sdM 2 t 0st α=1 0

Applying Corollary 3.1.8, we obtain the conclusion.



Let t ∈ (0, T ] and set

t Z(s, x)a(X(s, x))Z(s, x)∗ ds, A(t, x) =

Proof of Theorem 5.5.2

0 ∗

where Z is the transposed matrix of Z. By Theorem 5.5.1, * +

∇X i (t, x), ∇X j (t, x) HT = Y(t, x)A(t, x)Y(t, x)∗ . i, j=1,...,N

2

Strictly speaking, we need to extend the probability space. We suppose here that the probability space is already extended and we do not write it explicitly. For details, see Theorem 2.5.5.

4:36:52, subject to the Cambridge Core terms of use, .006

238

Since

Malliavin Calculus

1 det Y(t,x)

= det Z(t, x), it suffices to show

1 ∈ L p (μT ) det A(t) p∈(1,∞)

(5.5.11)

because of (5.5.4). Fix a sufficiently small ε > 0. By the positivity of a(x), there exists a δ > 0 such that a(y)  εI

(y ∈ B(x, δ) = {y; |y − x| < δ}).

For η > 0, define stopping times τη and ση by τη = inf{s > 0; |X(t, x) − x| > η} and ση = inf{s > 0; |Z(s, x) − I| > η}. By the definitions, we have

t∧τδ ∧σ 1 9ε 4 A(t)  ε (t ∧ τδ ∧ σ 1 )I. Z(s, x)Z(s, x)∗ ds  4 16 0 In particular, det A(t) 

* 9ε 16

+N (t ∧ τδ ∧ σ 41 ) .

Applying Lemma 5.5.3 to {|X(s ∧ τ1 , x) − x|2 } s∈[0,T ] and {|Z(s ∧ σ1 , x)|2 } s∈[0,T ] ,   we obtain (t ∧ τδ ∧ σ 41 )−1 ∈ p∈(1,∞) L p (μT ) and (5.5.11). As will be mentioned below, the non-degeneracy in Theorem 5.5.2 holds under weaker conditions. Denote the space of RN -valued C ∞ functions on RN by C ∞ (RN ; RN ) and identify each element U of C ∞ (RN ; RN ) with the differN U i (x) ∂x∂ i . For U, V ∈ C ∞ (RN ; RN ), let [U, V] be the Lie ential operator i=1 bracket of U and V : [U, V] = U ◦ V − V ◦ U. By the identification mentioned above, [U, V] ∈ C ∞ (RN ; RN ). Theorem 5.5.4 Let x ∈ RN and L (x) be the subspace of RN spanned by Vα (x), [Vk1 , [Vk2 , . . . , [Vkn , Vα ] . . .]](x) (α = 1, . . . , d, k j = 0, 1, . . . , d, j = 1, . . . , n, n  1). If dim L (x) = N, then X(t, x) is non-degenerate for any t ∈ [0, T ]. As Theorem 5.5.2, this theorem is proven by showing the integrability of The condition in the theorem is called Hörmander’s condition. For details, see [104]. 1 det A(t) .

4:36:52, subject to the Cambridge Core terms of use, .006

5.5 Application to Stochastic Differential Equations

239

Example 5.5.5 Let d = 1 and N = 2. Define the vector fields V1 and V2 on R2 by # $ # $ 1 0 and V0 (x) = . V1 (x) = 0 x A diffusion process on R2 defined by the solution of the stochastic differential equation dX 1 (t) = dθ1 (t),

dX 2 (t) = X 1 (t) dt,

X(0, x) = (x1 , x2 )

is called the Kolmogorov diffusion. By Theorem 5.5.1, X(t, x) ∈ D∞,∞− (R2 ). # $ 0 , dim L (x) = 2 and X(t, x) is non-degenerate Moreover, since [V0 , V1 ] = −1 by Theorem 5.5.4. Hence, E[δy (X(t, x))] gives the transition density p(t, x, y) of the diffusion process {X(t, x)}t0 . The above results can be seen in a more straightforward manner. In fact, the solution of this stochastic differential equation is explicitly given by ⎞ ⎛ ⎟⎟⎟ ⎜⎜⎜ x1 + θ1 (t) ! X(t, x) = ⎝⎜ 2 t 1 ⎠⎟ . 1 x + x t + 0 θ (s) ds This immediately implies X(t, x) ∈ D∞,∞− (R2 ). Moreover, since ⎛ 2⎞ ⎜⎜⎜ t t2 ⎟⎟⎟   i j

∇X (t, x), ∇X (t, x) HT i, j=1,2 = ⎝⎜ t2 t3 ⎠⎟ , 2

3

the non-degeneracy of X(t, x) follows. Furthermore, p(t, x, y) admits an explicit expression. In fact, the distribution ⎛ ⎞ $ # ⎜⎜⎜ t t2 ⎟⎟⎟ x1 2 and covariance matrix ⎜⎜⎜⎜⎝ 2 3 ⎟⎟⎟⎟⎠. of X(t, x) is Gaussian with mean 2 t t x + x1 t 2

Hence

2



p(t, x, y) =

 3 exp −2t−1 (y1 − x1 )2 + 6t−2 (y1 − x1 )(y2 − x2 − tx1 ) 2 πt  − 6t−3 (y2 − x2 − tx1 )2 ,

where x = (x1 , x2 ) and y = (y1 , y2 ). 2 The generator 12 ∂(x∂1 )2 + x2 ∂x∂2 is called the Kolmogorov operator and it is referred to as a typical degenerate and hypoelliptic operator in the original paper by Hörmander [37]. See [43] and [113] for recent related studies.

4:36:52, subject to the Cambridge Core terms of use, .006

240

Malliavin Calculus

Example 5.5.6 Let d = 2 and N = 3. Define the vector fields V0 , V1 , and V2 on R3 by V0 = 0, ⎞ ⎛ ⎛ ⎞ ⎜⎜⎜ 1 ⎟⎟⎟ ⎜⎜⎜ 0 ⎟⎟⎟ ⎟ ⎜ ⎜ ⎟ V1 (x) = ⎜⎜⎜⎜⎜ 0 ⎟⎟⎟⎟⎟ , and V2 (x) = ⎜⎜⎜⎜⎜ 1 ⎟⎟⎟⎟⎟ (x = (x1 , x2 , x3 ) ∈ R3 ). ⎝ x2 ⎠ ⎝ x1 ⎠ −2 2 The solution X(t, x) of the corresponding stochastic differential equation dX 1 (t) = dθ1 (t), dX 2 (t) = dθ2 (t), 1 1 dX 3 (t) = X 1 (t) dθ2 (t) − X 2 (t) dθ1 (t) 2 2 belongs to D∞,∞− (R3 ). Moreover, since

⎛ ⎞ ⎜⎜⎜0⎟⎟⎟ ⎜ ⎟ [V1 , V2 ](x) = ⎜⎜⎜⎜⎜0⎟⎟⎟⎟⎟ , ⎝ ⎠ 1

X(t, x) is non-degenerate. X(t, x) is explicitly written as (α = 1, 2), X α (t, x) = xα + θα (t) 1 X 3 (t, x) = x3 + {x1 θ2 (t) − x2 θ1 (t)} 2

t 1, t 1 θ (s) dθ2 (s) − θ2 (s) dθ1 (s) . + 2 0 0 The stochastic process s(t) =

1, 2

t

θ1 (s) dθ2 (s) −

0

t

θ2 (s) dθ1 (s)

-

0 3

which appears in the expression for X (t, x) is called Lévy’s stochastic area and plays an important role in various fields related to stochastic analysis. The explicit form of the characteristic function of s(T ) is well known (Theorem 5.8.4) and is called Lévy’s formula. Next we apply the Malliavin calculus to Schrödinger operators on Rd . First we consider Brownian motions, that is the case where N = d and X(t, x) = x + θ(t) (x ∈ Rd ). We presented a probabilistic representation for the corresponding heat equations in Chapter 3. We here consider Schrödinger operators with magnetic fields and give representations for the fundamental solutions by using the results in the previous section. ∞ (Rd ) and assume that Let V, Θ1 , . . . , Θd ∈ Cexp inf V(x) > −∞.

x∈Rd

(5.5.12)

4:36:52, subject to the Cambridge Core terms of use, .006

5.5 Application to Stochastic Differential Equations

241

The differential operator H given by +2 1 * ∂ + i Θα + V α 2 α=1 ∂x d

H=−

is called a Schrödinger operator with vector potential Θ = (Θ1 , . . . , Θd ) and scalar potential V. The fundamental solution for the heat equation ∂u ∞ = −Hu, u(0, ·) = f ∈ Cexp (Rd ) ∂t associated with H is a function p(t, x, y) such that

f (y)p(t, x, y) dy u(t, x) =

(5.5.13)

Rd

is a solution of (5.5.13). We construct the ! t fundamental solution by applying the Malliavin calculus. It is easy to see 0 V(x + θ(s)) ds ∈ D∞,∞− and, from the assumption (5.5.12), we have * t + exp − V(x + θ(s)) ds ∈ D∞,∞− (t ∈ [0, T ], x ∈ Rd ). 0

Set L(t, x; Θ) =

d α=1

t

Θα (x + θ(s)) ◦ dθα (s).

0

∞,∞−

. Hence, by Corollary 5.3.2, By Theorem 5.3.3, L(t, x; Θ) ∈ D

t * + e(t, x) = exp i L(t, x; Θ) − V(x + θ(s)) ds ∈ D∞,∞− . 0

Theorem 5.5.7 The function p(t, x, y) (t > 0, x, y ∈ Rd ) defined by

e(t, x)δy (x + θ(t)) dμT p(t, x, y) = E[e(t, x)δy (x + θ(t))] = WT

is the fundamental solution for the heat equation (5.5.13) associated with the Schrödinger operator H. ∞ ∞ (Rd ). Then H f ∈ Cexp (Rd ). Setting Proof Let f ∈ Cexp

f (x + θ(t))e(t, x) dμT , v(t, x; f ) = WT ∞ we can prove, by Lebesgue’s convergence theorem, that v(t, · ; f ) ∈ Cexp (Rd ). By Itô’s formula,

t v(t, x; f ) = f (x) + v(s, x; −H f ) ds. 0

4:36:52, subject to the Cambridge Core terms of use, .006

242

Malliavin Calculus

Hence, we obtain ∂v(t, x; f ) = v(t, x; −H f ) ∂t

(5.5.14)

and, by the Markov property of Brownian motions, v(t, x; f ) = v(s, x; v(t − s, · ; f ))

(s  t).

Differentiate both sides with respect to s. Then, since the mapping f → v(s, x; f ) is linear, by (5.5.14), we obtain 0 = v(s, x; −Hv(t − s, · ; f )) − v(s, x; v(t − s, · ; −H f )). Setting s = 0, we see −Hv(t, x; f ) = v(t, x; −H f ). Hence, by (5.5.14), ∂v(t, x; f ) = −Hv(t, x; f ). ∂t

(5.5.15)

By Theorem 5.4.11,

v(t, x; f ) =

Rd

f (y)p(t, x, y) dy

(5.5.16)

* + for any f ∈ S (Rd ). For a = (a1 , . . . , ad ) ∈ Rd , set g a (x) = cosh dα=1 aα xα . Moreover, take φn ∈ C0∞ (Rd ) such that φn (x) = 1 for |x|  n and φn (x) = 0 ∞ (Rd ), by the monotone for |x| > n + 1, and set g a,n = g a φn . Since g a ∈ Cexp convergence theorem and (5.5.16),

g a (y)p(t, x, y) dy = lim g a,n (y)p(t, x, y) dy n→∞

Rd

Rd

= lim v(t, x; g a,n ) = v(t, x; g a ) < ∞. n→∞

∞ Cexp (Rd ),

If f ∈ there exists an a = (a1 , . . . , ad ) ∈ Rd such that | f |  g a . Hence, by Lebesgue’s convergence theorem and (5.5.16), we obtain v(t, x; f ) = lim v(t, x; f φn ) n→∞

= lim ( f φn )(y)p(t, x, y) dy = n→∞

Rd

Rd

f (y)p(t, x, y) dy.

Combining this with (5.5.15), we see that p(t, x, y) is the fundamental solution for the heat equation (5.5.13). 

4:36:52, subject to the Cambridge Core terms of use, .006

5.5 Application to Stochastic Differential Equations

243

Remark 5.5.8 By Corollary 5.4.9, p ∈ C ∞ ((0, T ] × Rd × Rd ). Moreover, by Theorem 5.4.11(2), we have p(t, x, y) = E[e(t, x)|x + θ(t) = y] ×

1 (2πt)

d 2

e−

|x−y|2 2t

.

The expression in Theorem 5.5.7 above is essentially a conditional expectation. The above result is naturally extended to solutions of general stochastic differential equations. Let {X(t, x)}t∈[0,∞) be the solution of the stochastic dif∞ (RN ) ferential equation (5.5.1). Assume that the functions V, Θ1 , . . . , Θd ∈ C  by satisfy (5.5.12). Define the Schrödinger operator H N N * N + ∂f ∂2 f f = 1 H ai j i j + ai j Θ j V0i + i 2 i, j=1 ∂x ∂x ∂xi i=1 j=1 N N N , *1 1 ij ∂Θi i + + i ai j j + V0 Θi − V − a Θi Θ j f, 2 i, j=1 ∂x 2 i, j=1 i=1

where ai j =

N

α=1

Vαi Vαj . Set

t N t + * i  Θi (X(s, x)) ◦ dX (s, x) − V(X(s, x)) ds . e(t, x) = exp i i=1

0

0

Then,  e(t, x) ∈ D∞,∞− and the following holds as in the case of Brownian motions. Theorem 5.5.9 Suppose that Hörmander’s condition holds at every x ∈ RN . ! Then the function q(t, x, y) = W  e(t, x)δy (X(t, x)) dμT is the fundamental T solution of the heat equation ∂u  = Hu, ∂t

∞ u(0, ·) = f ∈ C (RN )

 That is, the function u(t, x) = associated with the Schrödinger operator H. ! f (y)q(t, x, y) dy is the solution of this heat equation. RN Proof For β = (β1 , . . . , βN ) ∈ Z+N , let ∂β be the differential operator * ∂ +β1 * ∂ +βN ··· . ∂β = 1 ∂xN ∂x Since the mapping

|∂β X(t, x)| p dμT

x → WT

4:36:52, subject to the Cambridge Core terms of use, .006

244

Malliavin Calculus

is at most of polynomial growth for any p > 1, a repetition of the arguments in the proof of Theorem 5.5.7 yields the conclusion.  Another application of the Malliavin calculus to a study of Greeks in mathematical finance will be discussed in the next chapter.

5.6 Change of Variables Formula The integration by parts formula and the change of variables formula are fundamental in calculus. We have discussed the integration by parts formula on Wiener spaces and its applications. In this section we investigate a change of variables formula on a Wiener space. Let E be a real separable Hilbert space. For A ∈ E ⊗2 , we define the regularized determinant det2 (I + A) of I + A so that det2 (I + A) = det(I + A)e−tr A if A is of trace class, where I is the identity mapping of E. For details, see [19, XI.9] and [107, Chapter 9]. With the eigenvalues {λ j }∞j=1 of A, repeated according to multiplicity, the regularized determinant is written as det2 (I + A) =

∞ 

(1 + λ j )e−λ j .

(5.6.1)

j=1

The following change of variables formula holds on WT . Theorem 5.6.1 Let F ∈ D∞,∞− (HT ). Suppose that there exists a q > 12 such that  −∇∗ F+q∇F2 ⊗2 1+ H T ∈ L (μT ) = L p (μT ). (5.6.2) e p∈(1,∞)

Then, for any f ∈ Cb (WT ),

    ∗ 1 2 f ι + F det2 I + ∇F e−∇ F− 2 FHT dμT = WT

f dμT ,

(5.6.3)

WT

where ι(w) = w (w ∈ WT ). The left hand side is well defined because * 1 1 12 +   det2 (I + A)  exp 11A11E ⊗2 . 2

(5.6.4)

4:36:52, subject to the Cambridge Core terms of use, .006

5.6 Change of Variables Formula

245

This estimate is obtained by combining (5.6.1) with the inequality 2  2 (1 + x)e−x   e x (x ∈ R). Remark 5.6.2 (1) By (5.6.3), if det2 (I + ∇F)  0, μT -a.s., the measure on WT with density det2 (I + ∇F)e−∇



F− 12 F2H

T

with respect to μT is a probability measure and the distribution of the WT valued function ι + F under this probability measure is the Wiener measure. In this case, (5.6.3) also holds for any bounded measurable f : WT → R. (2) Suppose that G : WT → HT is continuous. If there exists an F ∈ D∞,∞− (HT ) such that the conditions of Theorem 5.6.1 are fulfilled and (ι + G) ◦ (ι + F) = ι, then

1 ∗ 2 f (ι + G) dμT = f det2 (I + ∇F)e−∇ F− 2 FHT dμT WT

WT

for any f ∈ Cb (WT ), that is, the distribution of ι + G under μT coincides with the probability measure  μT given by

1 ∗ 2  det2 (I + ∇F)e−∇ F− 2 FHT dμT (A ∈ B(WT )). μT (A) = A

(3) The Cameron–Martin theorem (Theorem 1.7.2) is a special case of this theorem. In fact, if F is an HT -valued constant function, say F = h (h ∈ HT ), then ∇F = 0 and FHT = hHT . Moreover, by Example 5.1.5, ∇∗ F = I (h). Hence, by Theorem 5.6.1, we have

1 2 f (w + h)e−I (h)− 2 hHT μT (dw) = f dμT WT

WT

for any f ∈ Cb (WT ).3 (4) Girsanov’s theorem (Theorem 4.6.2) is also derived from Theorem 5.6.1. To show this, let {u(t) = (u1 (t), . . . , ud (t))}t∈[0,T ] be an {Ft }-predictable and bounded Rd -valued stochastic process. As in Theorem 5.3.3, we define Φu : ˙ u (t) = u(t) (t ∈ [0, T ]). Assume that Φu ∈ D∞,∞− (HT ). Since, WT → HT by Φ by Theorem 5.3.3, d T uα (t) dθα (t), ∇∗ Φu = α=1 3

0

Research on the change of variables formula on WT started from a series of studies by Cameron and Martin in the 1940s, including this Cameron–Martin theorem.

4:36:52, subject to the Cambridge Core terms of use, .006

246

Malliavin Calculus

we can rewrite Girsanov’s theorem as

−∇∗ Φu − 12 Φu 2H T f (ι + Φu )e dμT = WT

f dμT .

(5.6.5)

WT

We take Rd -valued stochastic processes {un (t) = (u1n (t), . . . , udn (t))}t∈[0,T ] with components {uαn (t)}t∈[0,T ] ∈ L 0 (α = 1, . . . , d) (see Definition 2.2.4) such that

* T + lim |un (t) − u(t)|2 dt dμT = 0 and M := sup |un (t, w)| < ∞. n→∞

WT

n∈N,w∈WT

0

Moreover, we may assume that each uαn (t) is written as α , uαn (t) = ξn,k

n tkn < t  tk+1 , k = 0, 1, . . . , mn − 1

(α = 1, . . . , d),

α are given by where the random variables ξn,k k,n α ξn,k = φαn,k (θ(sk,n 1 ), . . . , θ(s jk,n ))

for a monotone increasing sequence 0 = t0n < t1n < · · · < tkn < · · · < tmn n = T n α ∞ d jk,n < · · · < sk,n and 0 < sk,n jk,n  tk and φn,k ∈ C b ((R ) ) (see the proof of 1 Theorem 5.3.3). Then, we have Φun =

m d n −1 k=0 α=1

α α ξn,k (tn ,tn ] , k k+1

α−1

9:;< α α where eα = (0, . . . , 0, 1, 0, . . . , 0) ∈ Rd and (s,t] ∈ HT is defined by ˙(s,t] (v) = 1(s,t] (v)eα (v ∈ [0, T ]). By Corollary 5.3.2, ∇Φun =

jk,n m n −1

d ∂φα n,k

β k=0 i=1 α,β=1 ∂xi

β k,n (θ(sk,n 1 ), . . . , θ(s jk,n ))

(0,sk,n i ]

⊗ (tαn ,tn ] , k k+1

where the coordinate of (Rd ) jk,n is (x11 , . . . , x1d , . . . , x1jk,n , . . . , xdjk,n ). Hence, there exist an N ∈ N, an orthonormal system g1 , . . . , gN of HT , and random variables ai j (i, j = 1, . . . , N) such that ∇Φun = ai j gi ⊗ g j . 1i< jN

Since all the eigenvalues of upper triangular matrices are zero, det2 (I +∇Φun ) = 1. Hence, by Theorem 5.6.1, (5.6.5) holds for u = un . !t  d ! t  α 2 Since e−2 α=1 0 un,α (s)dθ (s)−2 0 |un (s)| ds t∈[0,T ] is a martingale,

1 ∗ 2 2 e2{−∇ Φn − 2 Φun HT } dμT  e M T WT

4:36:52, subject to the Cambridge Core terms of use, .006

5.6 Change of Variables Formula ∗

247

and e−∇ Φn − 2 Φun HT (n ∈ N) is uniformly integrable. Hence, setting u = un in (5.6.5) and letting n → ∞, we obtain (5.6.5) for a bounded stochastic process {u(t)}t∈[0,T ] . (5) The change of variables formula via the regularized determinant det2 and the derivative ∇ on Wiener spaces as in the theorem was first studied by Kusuoka [66] and his result was applied to the degree theorem on Wiener spaces in [30]. The proof of the theorem below is based on the arguments in Üstünel and Zakai [120]. 1

2

For a proof of Theorem 5.6.1, we show a change of variables formula for the integrals with respect to Gaussian measures on Rn . Denote the inner product in Rn by ·, · and the norm on Rn by ·, where the norm was denoted by |·| in the previous chapters. This change is to make notation analogous to that for HT . ∂f ∂f The gradient operator on Rn is also denoted by ∇, that is, ∇ f = ( ∂x 1 , . . . , ∂xn ). Let νn be the probability measure on Rn defined by νn (dx) = (2π)− 2 e− n

x2 2

dx

(5.6.6)

and ∇∗ be the formal adjoint operator of ∇ with respect to νn , ∇∗ F(x) = x, F(x) − tr(∇F(x))

(F ∈ C ∞ (Rn ; Rn )),

where, for F = (F 1 , . . . , F n ) ∈ C ∞ (Rn ; Rn ), ∇F =

* ∂F i + ∂x j

i, j=1,...,n

.

Moreover, the space of Rk -valued C ∞ functions F on Rn such that ∇ j F ∈ j L∞− (νn ; Rkn ) for any j ∈ Z+ is denoted by D ∞,∞− (Rn ; Rk ). When k = 1, we simply write D ∞,∞− (Rn ). For F ∈ D ∞,∞− (Rn ; Rn ), set ΛF = det2 (I + ∇F)e−∇



F− 12 F2

= det(I + ∇F)e− F,· − 2 F . 1

2

ai j be For a matrix A ∈ Rn ⊗ Rn , let A∼ be its cofactor matrix, that is, letting    a ji i, j=1,...,n . The product of A and A∼ is the (i, j)-cofactor of A, A∼ =  A (A∼ ) = det A × I. By this identity we can define ΛF (I + ∇F)−1 (x) ∈ Rn ⊗ Rn by ΛF (I + ∇F)−1 (x) = e− x,F(x) − 2 F(x) (I + ∇F(x))∼ 1

2

(5.6.7)

regardless of the regularity of the matrix I + ∇F(x) ∈ Rn ⊗ Rn .

4:36:52, subject to the Cambridge Core terms of use, .006

248

Malliavin Calculus

Lemma 5.6.3 For x, v ∈ Rn ,   ∇∗ ΛF (I + ∇F)−1 v (x) = ΛF (x) v, x + F(x) . Proof Let v = (v1 , . . . , vn ) ∈ Rn . By (5.6.7) and the definition of ∇∗ , n 1   ∂ 2 (I + ∇F)∼i j v j . ∇∗ ΛF (I + ∇F)−1 v = ΛF v, · + F − e− F,· − 2 F i ∂x i, j=1

Hence it suffices to show n ∂ (I + ∇F)∼i j v j = 0. i ∂x i, j=1

(5.6.8)

For x ∈ Rn and ζ ∈ C, set f (x, ζ) =

n ∂ (I + ζ∇F)∼i j v j . i ∂x i, j=1

Suppose that x ∈ Rn and ζ ∈ C satisfy det(I + ζ∇F(x))  0. Since det(I + ζ∇F(·))  0 in a neighborhood of x,   (I + ζ∇F)∼i j = det(I + ζ∇F) (I + ζ∇F)−1 i j .   Since ∂a∂pq det A = det A (A−1 )qp for A = ai j i, j=1,...,n , a straightforward computation yields n ∂ (I + ζ∇F)∼i j = 0. i ∂x i=1 Hence, if det(I + ζ∇F(x))  0, then f (x, ζ) = 0. For each x ∈ Rn there are at most n ζs such that det(I + ζ∇F(x)) = 0. Therefore we obtain f (x, ζ) ≡ 0 and (5.6.8).  Lemma 5.6.4 Let F ∈ D ∞,∞− (Rn ; Rn ). Suppose that there exist γ > 0 and q > 12 such that e−∇



F+q∇F2

∈ L1+γ (νn ).

Then, for any v ∈ Rn , ΛF (I + ∇F)−1 v ∈ L1+γ (νn ),

ΛF · + F, v ∈

(5.6.9)

L p (νn ).

p∈(1,1+γ) ∞,∞−

(R ), Moreover, for any G ∈ D

∇G, ΛF (I + ∇F)−1 v dνn = GΛF · + F, v dνn . Rn

n

Rn

4:36:52, subject to the Cambridge Core terms of use, .006

5.6 Change of Variables Formula

249

Proof By Lemma 5.6.3, it suffices to show the integrability of the first two Wiener functionals. First we show, for A ∈ Rn ⊗ Rn , *1 + 11  1 1det2 (I + A) (I + A)−1 − I 11  exp (A + 1)2 . (5.6.10) 2 Since ζ → det2 (I + A + ζB) (B ∈ Rn×n ) is holomorphic and   d  −1  det2 (I + A + ζB) = det2 (I + A)tr {(I + A) − I}B , dζ ζ=0 by Cauchy’s integral formula and (5.6.4), we obtain

2π     1 det2 (I + A + eis B)  −1 det2 (I + A)tr {(I + A) − I}B  =  ds 2π 0 eis

2π *1 + *1 + 1  exp A + eis B2 ds  exp (A + B)2 . 2π 0 2 2 Hence, since T  = sup{|tr(T B)| | B  1}, we have (5.6.10). Second, by using (5.6.4), (5.6.10), and an elementary inequality 12 (a + 1)2  q (a > 0), we obtain qa2 + 2q−1 ΛF (I + ∇F)−1 v  2e−∇



q F− 12 F2 +q∇F2 + 2q−1

q −∇∗ F+q∇F2 + 2q−1

 2e

v

v.

This implies the first assertion. Since · + F, v ∈ L∞− (νn ), the second assertion follows from (5.6.4) and the assumption.  Lemma 5.6.5 If F ∈ D ∞,∞− (Rn ; Rn ) satisfies (5.6.9), then

  f x + F(x) ΛF (x)νn (dx) = ΛF dνn f dνn Rn

Rn

Rn

(5.6.11)

for any f ∈ Cb (Rn ). Proof For v ∈ Rn and λ ∈ R, set fλ (x) = exp(i λ x, v ). Since   (k ∈ Rn ),

∇ fλ (· + F) , k = i λ v, (I + ∇F)k fλ (· + F) setting k = ΛF (I + ∇F)−1 v, we have  

∇ fλ (· + F) , ΛF (I + ∇F)−1 v = i λv2 fλ (· + F)ΛF . Combining this identity with Lemma 5.6.3, we obtain

4:36:52, subject to the Cambridge Core terms of use, .006

250

Malliavin Calculus 1 d i dλ

R

n

  fλ x + F(x) ΛF (x)νn (dx)

  fλ x + F(x) ΛF (x) x + F(x), v νn (dx)

  = i λv2 fλ x + F(x) ΛF (x)νn (dx). =

Rn

Rn

Solving this ordinary differential equation, we arrive at

− 12 λ2 v2 fλ (x + F(x))ΛF (x)νn (dx) = e ΛF (x)νn (dx). Rn

Rn

!

− 12 λ2 v2

= Rn fλ dνn , we obtain (5.6.11). Since e For general f ∈ Cb (Rn ), approximating it by elements in S (Rn ) and expressing elements of S (Rn ) in terms of Fourier transforms, we obtain the assertion from the identity above.  Lemma 5.6.6 Suppose that F ∈ D ∞,∞− (Rn ; Rn ) satisfies (5.6.9). Then,

1     ∗ 2 f x + F(x) det2 I + ∇F(x) e−∇ F(x)− 2 F(x) νn (dx) = f dνn (5.6.12) Rn

Rn

for any f ∈ Cb (R ). n

Proof Let t ∈ [0, 1]. Since at  1 + a (a  0), ∗

e−∇ (tF)+q∇(tF)  et(−∇





 1 + e−∇ F+q∇F . ! Hence ! mapping t!→ Rn ΛtF dνn is continuous. ! tF also satisfies (5.6.9) and the If Rn ΛtF dνn ∈ Z (t ∈ [0, 1]), then Rn ΛtF dν ! n = Rn Λ0 dνn = 1 and we obtain (5.6.12) by Lemma 5.6.5. Hence we show Rn ΛF dνn ∈ Z. Let {Ek }∞ k=1 be a sequence of disjoint Borel sets such that 2

∞ 

F+q∇F2 )

2

Ek = {x ∈ Rn ; det(I + ∇F(x))  0}

k=1

and, in a neighborhood of each Ek , the mapping x → T (x) = x + F(x) is a diffeomorphism. By the change of variables formula with respect to the Lebesgue measure, we have

f (T (x))|ΛF (x)|νn (dx) Ek

n 1 2 f (T (x))| det(∇T (x))|(2π)− 2 e− 2 T (x) dx = E

k f (x)νn (dx) (k = 1, 2, . . . , f ∈ Cb (Rn )). (5.6.13) = T (Ek )

4:36:52, subject to the Cambridge Core terms of use, .006

5.6 Change of Variables Formula

251

Since ΛF (x) = 0 if det(I + ∇F(x)) = 0, this implies

∞ f (T )|ΛF | dνn = f (T )|ΛF | dνn = Rn

{det(I+∇F)0}

k=1

f dνn T (Ek )

for any f ∈ Cb (Rn ). Setting f = 1, we obtain

* ∞ + 1T (Ek ) dνn < ∞.

(5.6.14)

Rn k=1

Denote by sk ∈ {±1} the signature of det(I + ∇F) on Ek . By Lemma 5.6.5 and (5.6.13), we have



∞   ΛF dνn f dνn = f · + F ΛF dνn = sk f dνn Rn

Rn

Rn

k=1

∞

T (Ek )

∞

for any f ∈ Cb (Rn ). The sum k=1 sk 1T (Ek ) is dominated by k=1 1T (Ek ) and, by (5.6.14), converges absolutely νn -a.e. Hence we have

* ∞ + ΛF dνn f dνn = sk 1T (Ek ) (x) f (x) νn (dx) Rn

Rn

Rn k=1

for any f ∈ Cb (Rn ) and

In particular,

! Rn

Rn

ΛF dνn =



sk 1T (Ek ) ,

νn -a.e.

k=1

ΛF dνn ∈ Z.



We extend the identity (5.6.12) on Rn to that on WT . For this purpose we ∗ prepare some notation. Let { i }∞ i=1 ⊂ WT be an orthonormal basis of HT . For each n ∈ N, let Gn be the σ-field generated by the random variables 1 , . . . , n : Gn = σ( 1 , . . . , n ). Define the projection πn : WT → WT∗ ⊂ HT ⊂ WT by πn w =

n

j (w) j

(w ∈ WT ).

j=1

Moreover, for j ∈ N, define π⊗n j : HT⊗ j → HT⊗ j by π⊗n j (h1 ⊗ · · · ⊗ h j ) = πn h1 ⊗ · · · ⊗ πn h j . Denote by En the conditional expectation given Gn , En (F) = E[F|Gn ], and extend it to the HT⊗ j -valued random variable G ∈ L2 (μT ; HT⊗ j ) by En (G) =

∞ i=1

En ( G, ψi H ⊗ j )ψi ,

(5.6.15)

T

4:36:52, subject to the Cambridge Core terms of use, .006

252

Malliavin Calculus

⊗j where {ψi }∞ i=1 is an orthonormal basis of HT . The above infinite sum con⊗j 2 verges in HT almost surely and in the L -sense. Specifically, since {En (G)}∞ n=1 is a discrete time martingale, by the monotone convergence theorem, Doob’s inequality, and Jensen’s inequality, we have

2 sup(En ( G, ψi H ⊗ j )) dμT = lim max(En ( G, ψi H ⊗ j ))2 dμT T T m→∞ W nm WT n∈N T

 4 lim sup (Em ( G, ψi H ⊗ j ))2 dμT T m→∞ WT

 4 lim sup Em ( G, ψi 2H ⊗ j ) dμT T m→∞ WT

=4

G, ψi 2H ⊗ j dμT (i ∈ N). T

WT

Hence we obtain

∞ *

+2 +

*

sup En ( G, ψi H ⊗ j ) T

WT i=1 n∈N

dμT  4 WT

G2H ⊗ j dμT .

(5.6.16)

T

The almost sure and L2 -convergence of the right hand side of (5.6.15) follows from this estimate. Lemma 5.6.7 (1) En (G) is an HT⊗ j -valued random variable, unique up to μT null sets, such that



G, G H ⊗ j dμT =

En (G), G H ⊗ j dμT T

WT

T

WT

for any Gn -measurable G ∈ L2 (μT ; HT⊗ j ). In particular, En (G) is independent of the choice of the orthonormal basis {ψi }∞ i=1 . (2) For any G, K ∈ L2 (μT ; HT⊗ j ),

En (G), K H ⊗ j dμT =

En (G), En (K) H ⊗ j dμT T T WT WT

=

G, En (K) H ⊗ j dμT . (5.6.17) T

WT

Moreover, for G ∈ L p (μT ; HT⊗ j ), p  1, En (G) p ⊗ j  En (G p ⊗ j ). HT

HT

(5.6.18)

(3) The following convergence holds:

lim En (πnG) − G2H ⊗ j dμT = 0. n→∞

WT

T

4:36:52, subject to the Cambridge Core terms of use, .006

5.6 Change of Variables Formula

253

(4) Let G ∈ P. Then, for μT -a.s. w ∈ WT ,

  G πn w + (1 − πn )w μT (dw ). En (G)(w) =

(5.6.19)

WT

Proof (1) Let G ∈ L2 (μT ; HT⊗ j ) be Gn -measurable. By the expansion with respect to {ψi }∞ i=1 ,

En (G), G H ⊗ j = T



En ( G, ψi H ⊗ j ) G , ψi H ⊗ j . T

i=1

T

By the identities ∞



G, ψi 2H ⊗ j = G2H ⊗ j

G , ψi 2H ⊗ j = G 2H ⊗ j , T

i=1

T

T

i=1

T

and (5.6.16), we can apply Lebesgue’s convergence theorem to obtain the desired equality

∞  

En (G), G H ⊗ j dμT = En G, ψi H ⊗ j G , ψi H ⊗ j dμT T

WT

=

i=1 WT ∞ i=1

=

WT

WT

T

T

G, ψi H ⊗ j G , ψi H ⊗ j dμT T

T

G, G H ⊗ j dμT . T

because G , ψi H ⊗ j is Gn -measurable. T The uniqueness is shown in the same way as the usual conditional expectation. (2) Set G = K and G = En (G) in (1) to obtain the first identity of (5.6.17). The second identity is obtained by changing G and K in the first one. Next we show (5.6.18). It suffices to prove it in the case when p = 1 because the general case is obtained by Jensen’s inequality (En (X)) p  En (X p ) for conditional expectations. Let g ∈ HT⊗ j . For any Gn -measurable φ ∈ L2 (μT ), set G = φ · g in the identity described in (1). Then we have

En (G), g H ⊗ j φ dμT =

G, g H ⊗ j φ dμT . WT

T

T

WT

Hence

En (G), g H ⊗ j = En ( G, g H ⊗ j ). T

T

4:36:52, subject to the Cambridge Core terms of use, .006

254

Malliavin Calculus

In particular, since | G, g H ⊗ j |  GH ⊗ j gH ⊗ j , we obtain T

T

T

| En (G), g H ⊗ j |  En (GH ⊗ j )gH ⊗ j . T

T

(5.6.20)

T

Since the space HT⊗ j is separable, there exists a countable sequence {gi }∞ i=1 with gi H ⊗ j  1 such that T

ξH ⊗ j = sup | ξ, gi H ⊗ j | T

T

i∈N

(ξ ∈ HT⊗ j ).

Combining this with (5.6.20), we obtain (5.6.18) when p = 1. (3) By the linearity of En ,

1 11E (π G) − G1112 dμ n n T HT⊗ j WT

1

1 112 1 11E (G) − G1112 dμ . 1En (πnG − G)1H ⊗ j dμT + 2 2 ⊗j n T H T

WT

Using (5.6.18) for p = 2, we obtain

1

11E (π G − G)1112 dμ  ⊗j n n T H T

WT

WT

T

WT

11 12 1πnG − G11H ⊗ j dμT → 0 T

(n → ∞).

By (5.6.16), the martingale convergence theorem (Theorem 1.4.21) and the dominated convergence theorem, we have

1

∞   11E (G) − G1112 dμ = En ( G, ψi H ⊗ j ) − G, ψi H ⊗ j 2 dμT ⊗j n T H T

WT

WT i=1

−→ 0

T

T

(n → ∞).

(4) Let G ∈ P be of the form   G(w) = f η1 (w), . . . , ηm (w)

(w ∈ WT )

with a polynomial f : Rm → R and η1 , . . . , ηm ∈ WT∗ . By the embedding WT∗ ⊂ HT∗ = HT ⊂ WT , we have ηi ( j ) = ηi , j HT = j (ηi ). Hence, for any w ∈ WT , ηi (πn w) = (πn ηi )(w),

ηi ((I − πn )w) = ((I − πn )ηi )(w).

(5.6.21)

m Since πn ηi , (I − πn )η j HT = 0, {ηi ◦ πn }m i=1 and {ηi ◦ (I − πn )}i=1 are independent.  Define a polynomial f by

   f (x1 , . . . , xm ) = f x1 + η1 ((I − πn )w ), . . . , xm + ηm ((I − πn )w ) μT (dw ). WT

4:36:52, subject to the Cambridge Core terms of use, .006

5.6 Change of Variables Formula

255

Then, since {ηi ◦ πn }m i=1 is Gn -measurable, by Proposition 1.4.2, we obtain     En (G) = E f η1 ◦ πn + η1 ◦ (I − πn ), . . . , ηm ◦ πn + ηm ◦ (I − πn ) Gn

  G(πn · +(I − πn )w )μT (dw ). =  f η1 ◦ πn , . . . , ηm ◦ πn =



WT

Lemma 5.6.8 Let F ∈ D∞,∞− (HT⊗ j ) and F1 ∈ D∞,∞− (HT ). Then En (π⊗n j F) ∈ D∞,∞− (HT⊗ j ) and       ∇ En (π⊗n j F) = π⊗n j+1 En (∇F) = En π⊗n j+1 (∇F) , (5.6.22)  ∗ ∗ (5.6.23) ∇ En (πn F1 ) = En (∇ F1 ), 11 112  112 11 1∇(En (πn F1 ))1H ⊗2  En 1∇F1 1H ⊗2 . (5.6.24) T

T

Proof First let F ∈ P(HT⊗ j ). We prove En (π⊗n j F) belongs to P(HT⊗ j ) and (5.6.22) holds. To do this, it suffices to show it in the case where F = Ge for the same G ∈ P as in the proof of Lemma 5.6.7 (4) and e ∈ HT⊗ j . We use the same notation as in the proof of Lemma 5.6.7 (4). f (η1 ◦ πn , . . . , ηm ◦ πn ) ∈ P, we have En (π⊗n j F) = Then, since En (G) =  ⊗j ⊗j En (G)πn e ∈ P(HT ). Hence, by Lemma 5.6.7 (4), ∇(En (π⊗n j F)) = ∇(En (G))π⊗n j e m ∂ f (η ◦ πn , . . . , ηm ◦ πn )(ηi ◦ πn ) ⊗ π⊗n j e. = i 1 ∂x i=1 On the other hand, since π⊗n j+1 (∇F) = again Lemma 5.6.7 (4), we obtain En (π⊗n j+1 (∇F)) =

m

∂f i=1 ∂xi (η1 , . . . , ηm )(πn ηi )

⊗ π⊗n j e, using

m ∂ f (η ◦ πn , . . . , ηm ◦ πn )(πn ηi ) ⊗ π⊗n j e. i 1 ∂x i=1

By (5.6.21), (5.6.22) holds for F ∈ P(HT⊗ j ). Second, let F ∈ D∞,∞− (HT⊗ j ). For k ∈ N, p > 1, take Fm ∈ P(HT⊗ j ) so that limm→∞ Fm − F(k,p) = 0 (see Definition 5.1.4). Then, applying (5.6.22) to Fm and using (5.6.18), we obtain for  k 11   1  1∇ E (π⊗ j F ) − E π⊗ j+ (∇ F) 11 n

m

n

n

n

p

1   1   11En π⊗n j+ (∇ Fm ) − En π⊗n j+ (∇ F) 11 p 1 1  11∇ F − ∇ F 11 −→ 0 (m → ∞). m

Hence, we have En (π⊗n j F) F ∈ D∞,∞− (HT⊗ j ).

p

∈D

k,p

(HT⊗ j ) and (5.6.22). Since k

and p are arbitrary,

4:36:52, subject to the Cambridge Core terms of use, .006

256

Malliavin Calculus

Third, we show (5.6.23). Let K ∈ P. By the symmetry of En mentioned in (5.6.17), the symmetry of πn in HT , the commutativity of En and πn , and (5.6.22) for j = 0, we have

K∇∗ (En (πn F1 )) dμT =

∇K, En (πn F1 ) HT dμT WT WT

=

πn (En (∇K)), F1 HT dμT =

En (πn (∇K)), F1 HT dμT WT WT

=

∇(En K), F1 HT dμT = KEn (∇∗ F1 ) dμT . WT

WT

Thus we obtain (5.6.23). Finally, we show (5.6.24). By (5.6.22) and (5.6.18), we obtain 11 12 1 12 1∇(E (π F ))11 = 11E (π⊗2 (∇F ))11 n

n

1

HT⊗2

n

1

n

HT⊗2

112  112  1 11  En 11π⊗2 n (∇F 1 )1H ⊗2  En 1∇F 1 1H ⊗2 . T

T

Proof of Theorem 5.6.1 By (5.6.2), there exist γ > 0 and q > −∇

e



F+q∇F2 ⊗ H 2 T

1 2



such that

∈ L1+γ (μT ).

Let En and πn be as above and set Fn = En (πn F). By (5.6.23), (5.6.24), and Jensen’s inequality for conditional expectations, we have

(1+γ)(−∇∗ Fn +q∇Fn 2 ⊗2 ) En ((1+γ){−∇∗ F+q∇F2 ⊗2 }) H H T dμ T e  e dμT T WT WT

* (1+γ){−∇∗ F+q∇F2 ⊗2 } + H T  En e dμT WT

(1+γ){−∇∗ F+q∇F2 ⊗2 } H T dμ . = e (5.6.25) T WT

Identify πn (WT ) with Rn in a natural way. By Lemma 5.6.8, applying the Sobolev embedding theorem ([1]), we may regard Fn ∈ D ∞,∞− (Rn ; Rn ). By (5.6.25), Fn satisfies (5.6.9). Thus, by Lemma 5.6.6, (5.6.3) holds for F = Fn . By the martingale convergence theorem (Theorem 1.4.21), Lemma 5.6.7 (3), (5.6.22), and (5.6.23), we may suppose that Fn , ∇Fn , and ∇∗ Fn converges almost surely to F, ∇F, and ∇∗ F, respectively, taking a subsequence if necessary. By (5.6.4), we have |det2 (I + ∇Fn )|e−∇



Fn − 12 Fn 2H

T

−∇∗ Fn + 12 ∇Fn 2 ⊗2

e

H

T

.

By (5.6.25), ,

* 1 1 12 +f (ι + Fn )det2 (I + ∇Fn ) exp −∇∗ Fn − 11Fn 11H T n∈N 2

4:36:52, subject to the Cambridge Core terms of use, .006

5.7 Quadratic Forms

257

is uniformly integrable (Theorem A.3.4). Hence, letting n → ∞ in (5.6.3) for  F = Fn , we obtain the desired identity (5.6.3) for F.

5.7 Quadratic Forms As in other fields of analysis, quadratic forms on the Wiener space play fundamental roles in stochastic analysis. In this section we show a general theory on quadratic Wiener functionals and, in the next section, we present concrete examples. Definition 5.7.1 Regarding a symmetric A ∈ HT⊗2 as an HT⊗2 -valued constant function on WT , set QA = (∇∗ )2 A,

LA = ∇∗ A.

QA is called a quadratic form associated with A. By Theorem 5.2.1, QA ∈ D∞,∞− and LA ∈ D∞,∞− (HT ). Since the Hilbert–Schmidt operators are compact operators, by the spectral decomposition for compact operators ([58]), A is diagonalized as A=



an hn ⊗ hn ,

(5.7.1)

n=1

where {hn }∞ an orthonormal basis of HT and {an }∞ n=1 is n=1 is a sequence of real ∞ 2 numbers with n=1 an < ∞. By using this decomposition, QA and LA are represented as infinite sums. Lemma 5.7.2 (1) The following convergence in the L2 -sense holds: QA =



  an (∇∗ hn )2 − 1 and

n=1

LA =



an (∇∗ hn )hn .

n=1

(2) Set Aop = supn |an |. For λ ∈ R with |λ| Aop < 12 , eλQA ∈ L1+ (μT ). (3) For λ ∈ R with |λ| Aop < 12 ,

1 eλQA dμT = {det2 (I − 2λA)}− 2 . WT

The aim ! of this section is to extend the assertion (3) to general integrals of the form W eλQA f dμT ( f ∈ Cb (WT )) (Theorem 5.7.6), applying the change of T variables formula on WT as shown in the previous section.

4:36:52, subject to the Cambridge Core terms of use, .006

258

Malliavin Calculus

Remark 5.7.3 Since {∇∗ hn }∞ n=1 is a sequence of independent standard Gaussian random variables, by the Itô–Nisio theorem (Theorem 1.2.5), we have θ=



(∇∗ hn )hn .

n=1

Combining this with (5.7.1), we have a formal expression for Lemma 5.7.2(1): QA = θ, Aθ HT − tr(A). While this expression is “∞ − ∞” in general because HT  WT and A is not necessarily of trace class, it suggests the origin of the name of quadratic forms associated with A.

Proof of Lemma 5.7.2 (1) First we show (∇∗ )2 (h ⊗ g) = (∇∗ h)(∇∗ g) − h, g HT

(h, g ∈ HT ).

(5.7.2)

For this purpose, let E be a separable Hilbert space, G ∈ D∞,∞− (HT ⊗ E) and e ∈ E. We have G, e E ∈ D∞,∞− (HT ). Since

∇∗G, e E φ dμT =

G, ∇(φ · e) HT ⊗E dμT WT WT

7 =

G, (∇φ) ⊗ e HT ⊗E dμT =

G, e E , ∇φ HT dμT WT

WT

for any φ ∈ P,

∇∗G, e E = ∇∗ ( G, e E ). Using this identity with E = HT , we obtain

∇∗ (h ⊗ g), hn HT = (∇∗ h) g, hn HT

(n = 1, 2, . . .).

Hence we have ∇∗ (h ⊗ g) = (∇∗ h)g.

(5.7.3)

Since ∇(∇∗ h) = h (Example 5.1.5), by Theorem 5.2.8, we obtain (5.7.2).  Setting Km = m n=1 an hn ⊗ hn , by (5.7.2) and (5.7.3), we have ∇∗ Km =

m

an (∇∗ hn )hn

n=1

and

(∇∗ )2 Km =

m

an {(∇∗ hn )2 − 1}.

n=1 ∗

By the continuity of ∇ and a similar argument to that in Example 5.4.3, we obtain the conclusion. 4:36:52, subject to the Cambridge Core terms of use, .006

5.7 Quadratic Forms

259

(2) It suffices to show eλQA ∈ L1 (μT ) for λ ∈ R with |λ| Aop < 12 . By (1), there exists a subsequence {mn }∞ n=1 such that Fn =

mn

ak {(∇∗ hk )2 − 1} → QA ,

μT -a.s.

k=1

Since {∇∗ hn } is a sequence of independent standard Gaussian random variables by (5.1.8), we have

eλQA dμT  lim inf eλFn dμT n→∞

WT

= lim inf n→∞

WT

mn 

(1 − 2λak )− 2 e−λak = det2 (1 − 2λA)− 2 < ∞. 1

1

k=1

(3) By the proof of (2), {eλFn }∞ n=1 is uniformly integrable. Hence, in the same way as in (2), we have

λQA e dμT = lim eλFn dμT n→∞

WT

= lim

n→∞

WT

mn 

(1 − 2λak )− 2 e−λak = det2 (1 − 2λA)− 2 . 1

1



k=1

By Lemma 5.7.2 (1), ∇3 QA = 0. The converse is also true. Proposition 5.7.4 Let F ∈ D∞,∞− . If ∇3 F = 0, then there exist a symmetric operator A ∈ HT⊗2 , h ∈ HT , and c ∈ R such that 1 F = c + ∇∗ h + QA . 2 Moreover,

c=

(5.7.4)

F dμT

WT

and

h=

∇F dμT .

(5.7.5)

WT

Proof By Proposition 5.2.9, there exists an A ∈ HT⊗2 such that ∇2 F = A. In particular, A is symmetric. Set F1 = F − 12 QA . Then, by Lemma 5.7.2, ∇2 F1 = 0. By Proposition 5.2.9 again, there exists an h ∈ HT such that ∇F1 = h. Next set F2 = F1 − ∇∗ h. By Example 5.1.5, ∇F2 = 0. Hence, there exists a c ∈ R such that F2 = c. From these observations, we obtain (5.7.4). By Lemma 5.7.2 and Example 5.1.5, we have

QA dμT = 0, ∇QA dμT = 0, ∇∗ h dμT = 0. WT

WT

Hence, (5.7.5) follows from (5.7.4).

WT

 4:36:52, subject to the Cambridge Core terms of use, .006

260

Malliavin Calculus

Remark 5.7.5 From the expression (5.7.4), we can prove that the distribution of F is infinitely divisible ([101]) and can compute the corresponding Lévy measure. For details, see [82]. Develop a symmetric operator A ∈ HT⊗2 as (5.7.1). For λ ∈ R with 2|λ| Aop < 1, set ∞ 1 snA,λ = (1 − 2λan )− 2 − 1 and S A,λ = snA,λ hn ⊗ hn . n=1

Since |snA,λ |  )

|2λan | 1 − 2|λ| Aop

,

(5.7.6)

S A,λ is a symmetric Hilbert–Schmidt operator. If |λ| is sufficiently small, for 3 , then S A,λ op < 12 . example, if |λ| Aop < 16 In connection with quadratic forms, the following change of variables formula holds. 3 and f ∈ Cb (WT ), Theorem 5.7.6 For λ ∈ R with |λ| Aop < 16

1 eλQA f dμT = {det2 (I − 2λA)}− 2 f (ι + LS A,λ ) dμT . WT

(5.7.7)

WT

For a proof, we give a lemma. Lemma 5.7.7 (1) For each h ∈ HT , there exists an HT -invariant Xh ∈ B(WT ) with μT (Xh ) = 1 such that (∇∗ h)(w + g) = (∇∗ h)(w) + h, g HT

(5.7.8)

for any w ∈ Xh and g ∈ HT , where the HT -invariance of Xh means that Xh +g = Xh for any g ∈ HT . (2) For any symmetric operator A ∈ HT⊗2 , there exists an HT -invariant XA ∈ B(WT ) with μ(XA ) = 1 such that QA (w + g) = QA (w) + 2 LA (w), g HT + Ag, g HT

(5.7.9)

for any w ∈ XA and g ∈ HT . ∇∗ h, LA and QA are defined up to null sets and the assertions of the lemma include the problem of the choice of modifications. We give an answer in the proof below. In order to expand QA (w + F(w)), which appears in the change of variables formula on WT , for each w, we need to consider the HT -invariant sets as in the lemma.

4:36:52, subject to the Cambridge Core terms of use, .006

5.7 Quadratic Forms

261

∗ Proof (1) Let { n }∞ n=1 ⊂ WT be an orthonormal basis of HT . Set

 hn =

n

h, k HT k . k=1

WT∗ ,



By (5.1.4), if n ∈ then ∇ n = n , μT -a.s. Hence, we can take the hn so that modification of ∇∗ ∇∗ hn . hn =  Moreover, we may assume that  hn → ∇∗ h, μT -a.s., choosing a subsequence if necessary. Set , Xh = w ∈ WT ; lim | hn (w) −  hm (w)| = 0 . n,m→∞

hn − hHT = 0. Moreover, since  hn → Xh is HT -invariant because limn→∞  ∗ ∇ h, μT -a.s., μT (Xh ) = 1. Define a modification of ∇∗ h by ⎧ ⎪ ⎪ hn (w) (w ∈ Xh ) ⎨limn→∞  ∗ (∇ h)(w) = ⎪ ⎪ ⎩ 0 (w  Xh ). Then, since  hn (w) +  hn , g HT hn (w + g) = 

(w ∈ WT , g ∈ HT ),

we obtain (5.7.8) by letting n → ∞. (2) Develop A as in (5.7.1) with an orthonormal basis {hn }∞ of HT . For each n=1 ∞ ∗ hn , define Xhn and ∇ hn by (1). Let XA be the set of w ∈ n=1 Xhn such that ∞ 2 ∗ 2  ∗ 2 and ∞ By (5.7.8), XA n=1 an (∇ hn ) (w) < ∞ n=1 an {(∇ hn ) (w) − 1} converges. ∞ ∗ 2 2 ∗ 2 is HT -invariant. Since n=1 an {(∇ hn ) −1} converges in L2 and ∞ n=1 an (∇ hn ) is integrable, by the assertion (1), μT (XA ) = 1. Let w ∈ XA and g ∈ HT . Recalling Lemma 5.7.2, set ⎧ ∞ ∗ 2 ⎪ ⎪ ⎨ n=1 an {(∇ hn ) (w) − 1} (w ∈ XA ), QA (w) = ⎪ ⎪ ⎩ 0 (w  XA ), ⎧ ∞ ∗ ⎪ ⎪ ⎨ n=1 an (∇ hn )(w)hn (w ∈ XA ), LA (w) = ⎪ ⎪ ⎩ 0 (w  XA ). Then, we have (5.7.9).



Proof of Theorem 5.7.6 For simplicity of notation, write S = S A,λ and sn = snA,λ , and set F = LS . Then, ∇F = S and ∇∗ F = QS .

4:36:52, subject to the Cambridge Core terms of use, .006

262

Malliavin Calculus

If |λ| Aop < −∇



3 16 ,

F+∇F2 ⊗2 H T

then S op
0

1 2 1 2 2 2 ∗ e p∇ gn dμT = e 2 p gn HT  e 2 p gHT . WT ∗

∗ Hence, by Lemma 5.7.2(2), {eλQA +∇ gn }∞ n=1 is uniformly integrable. Since ∇ gn ∗ 2 converges to ∇ g in L , we obtain (5.7.13) by letting n → ∞ in (5.7.14). 

Corollary 5.7.9 Let η1 , . . . , ηn ∈ WT∗ be an orthonormal system in HT . Define  π : WT → WT by π(w) = ni=1 ηi (w)ηi , and A0 and A1 : HT → HT by A0 = (I − π)A(I − π) and A1 = πAπ, respectively. Write δ0 (η)dμT for the measure νδ0 (η) in Theorem 5.4.15. Then, for λ ∈ R with |λ| Aop < 12 and g ∈ HT ,

∗ eλQA +∇ g δ0 (η)dμT WT

=

1 − 12 −λtr(A1 ) 12 (I−2λA0 )−1 (I−π)g,(I−π)g HT e e . n {det2 (I − 2λA0 )} (2π) 2

Proof Write w = (I − π)(w) + ηi (w) = 0,

n i=1

(5.7.15)

ηi (w)ηi . Then, by Example 5.4.17,

δ0 (η)dμT -a.e. w ∈ WT ,

ηi ◦(I−π) = 0

(i = 1, . . . , n). (5.7.16)

∞ Hence, for F = f ( 1 , . . . , m ) with 1 , . . . , m ∈ WT∗ and f ∈ C (Rm ), we have

F = F ◦ (I − π),

δ0 (η)dμT -a.e.

(5.7.17)

On the other hand, since I − π and η are independent under μT ,

(F ◦ (I − π))ϕ(η) dμT = (F ◦ (I − π)) dμT × ϕ(η) dμT WT WT WT

1 − 1 |x|2 2 = (F ◦ (I − π)) dμT × ϕ(x) dx n e (2π) 2 WT Rn n for any ϕ ∈ S (Rn ). Hence, by taking {ϕk }∞ k=1 ∈ S (R ) converging to δ0 and letting k → ∞, by Corollary 5.4.7 and (5.7.16), we obtain

1 Fδ0 (η)dμT = (F ◦ (I − π)) dμT . (5.7.18) n (2π) 2 WT WT

4:36:52, subject to the Cambridge Core terms of use, .006

264

Malliavin Calculus

Extend η1 , . . . , ηn to an orthonormal basis {ηi }∞ i=1 of HT . Set ci = g, ηi HT , ai j = ηi , Aη j HT , gN =

N

ci ηi

AN =

and

i=1

N

ai j ηi ⊗ η j

(N  n).

i, j=1

Then, we have QAN = Q(I−π)AN (I−π) − tr(πAN π),

δ0 (η)dμT -a.e.,

Q(I−π)AN (I−π) ◦ (I − π) = Q(I−π)AN (I−π) ,

(5.7.19)

μT -a.s.

(5.7.20)

In fact, by (5.7.2), Q AN =

N

ai j {ηi η j − δi j }

and



Q(I−π)AN (I−π) =

ai j {ηi η j − δi j }.

n − 4T 2 . Next we show (5.8.2). As above, it suffices to show (5.8.2) for λ ∈ R with π2 √t (t ∈ [0, T ]) and π, A0 , A1 as in |λ| < 4T 2 . Define η ∈ HT by η(t) = T Corollary 5.7.9. A0 is given by



T + 1 T* T (A˙0 h)(t) = h(s) ds − h(u) du ds T 0 t s 2

for h ∈ HT with πh = 0 or h(T ) = 0. By a similar argument to the above, A0 is developed as √ ∞ * nπt + T2 2T sin k ⊗ k , k (t) = A0 = . n n n nπ T n2 π2 n=1 Since δ0 (θ(T )) = √1T δ0 (η) and tr(A) = tr(A0 ) + tr(A1 ), by Corollary 5.7.9, we obtain

1 1 1 e− 2 λhT δ0 (θ(T ))dμT = e− 2 λQA δ0 (θ(T ))dμT e− 2 λtr(A) WT

WT

4:36:52, subject to the Cambridge Core terms of use, .006

268

Malliavin Calculus

= √ = √

1

{det2 (I + λA0 )}− 2 e− 2 λ{tr(A)−tr(A1 )} 1

2πT ∞ 1 ,* 2πT

1+

n=1

1

λT 2 +2 -− 12 . n2 π2

By the identity sinh x = x

∞ * 

1+

n=1

x2 + , n2 π2

(5.8.5) 

we arrive at (5.8.2).

By using Theorem 5.8.1, we show the explicit formula for the heat kernel of the Schrödinger operator investigated in Theorem 5.5.7 when d = 1, Θ = 0, and V(x) = x2 . Theorem 5.8.2 Fix λ > 0. Then, for x, y ∈ R and T > 0,

* λ2 T + exp − (x + θ(t))2 dt δy (x + θ(T )) dμT (5.8.6) 2 0 WT ( * λ  + 1 λT = √ exp − coth(λT ) x2 − 2xy sech(λT ) + y2 . 2 2πT sinh(λT ) Proof Let φ : [0, T ] → R be the unique solution for the ordinary differential equation4 φ − λ2 φ = 0,

φ(0) = x, φ(T ) = y.

Define h ∈ HT by h(t) = φ(t) − x. Since

T

  φ (t) dθ(t) = θ(T )φ (T ) − 0

T

(5.8.7)

φ (t)θ(t) dt

0

and φ satisfies (5.8.7), applying the Cameron–Martin theorem (Theorem 1.7.2), we obtain

* λ2 T + exp − (x + θ(t))2 dt δy (x + θ(T )) dμT 2 0 WT

* λ2 T + 1 ∗ 2 = exp − (x + θ(t) + h(t))2 dt e−∇ h− 2 hHT 2 0 WT × δy (x + θ(T ) + h(T )) dμT

* 1 T + 1 2  = exp − e− 2 λ hT δ0 (θ(T )) dμT . λ2 φ(t)2 + (φ (t))2 dt 2 0 WT 4

This equation is the Lagrange equation corresponding to the action integral S T (φ) = !T ˙ L(φ(t), φ(t)) dt associated with the Lagrangian L(p, q) = 12 {p2 + q2 }. See Section 7.1. 0

4:36:52, subject to the Cambridge Core terms of use, .006

5.8 Examples of Quadratic Forms

By (5.8.7), we have

T

269



 λ2 φ(t)2 + (φ (t))2 dt = yφ (T ) − xφ (0).

0

Then, plugging in the explicit form of φ, φ(t) =

y − eλT x −λt y − e−λT x λt e − λT e λT −λT e −e e − e−λT

(t ∈ [0, T ]) 

and using Theorem 5.8.1, we obtain (5.8.6).

Remark 5.8.3 We have derived (5.8.6) by applying (5.8.2) in Theorem 5.8.1. The identity (5.8.6) can be shown in a direct and functional analytical way  d 2 λ2 2 + 2 x (λ > 0) on R. associated with the Schrödinger operator Hλ = − 12 dx The method is as follows. Realize Hλ as a self-adjoint operator on L2 (R), the Hilbert space of squareintegrable functions with respect to the Lebesgue measure. The spectrum of Hλ consists only of the eigenvalues {λ(n + 12 )}∞ n=0 with multiplicity one and the corresponding normalized eigenfunction φn is given by φn (x) =

√ * λ + 14 − 1 λx2 √ n! e 2 Hn ( 2λ x), π

where Hn (x) is a Hermite polynomial. Since p(t, x, y) admits the eigenfunction expansion ∞ 1 e−λ(n+ 2 )t φn (x)φn (y), p(t, x, y) = n=0

a well-known formula for the Hermite polynomials ∞

* 1 1 + 1 n!Hn (x)Hn (y)tn = √ exp − (t2 x2 − 2txy + t2 y2 ) 2 2 1−t 1 − t2 n=0

yields (5.8.6). For this identity, see [67].

5.8.2 Lévy’s Stochastic Area Let WT be the two-dimensional Wiener space and consider Lévy’s stochastic area s(T ) (Example 5.5.6). Theorem 5.8.4 For λ ∈ R with |λ| < Tπ ,

eλs(T ) dμT = WT

1 , cos( 21 λT )

(5.8.8)

4:36:52, subject to the Cambridge Core terms of use, .006

270

Malliavin Calculus

eλs(T ) δ0 (θ(T ))dμT =

WT

1 2 λT . 2πT sin( 12 λT ) 1

(5.8.9)

Proof First we show s(T ) ∈ D∞,∞− and the expression s(T ) =

1 QA , 2

(5.8.10)

where A : HT → HT is given by + * 1 ˙ (t ∈ [0, T ], h ∈ HT ) (Ah)(t) = J h(t) − h(T ) 2 # $ 0 −1 and J = . For n ∈ N, define s(n) (T ) ∈ P by 1 0 * i +A 1 @ * i + * i + 1 + T − θ T 2. Jθ T , θ R 2 i=0 n n n n−1

s(n) (T ) =

By (5.1.1), we have for h ∈ HT

∇s(n) (T ), h HT =

n−1 * i +A 1 @ * i + * i + 1 + T −θ T 2 Jh T , θ R 2 i=0 n n n

* i +A 1 @ * i + * i + 1 + T − h T 2. Jθ T , h R 2 i=0 n n n n−1

+

Since θ(0) = h(0) = 0, an algebraic manipulation yields

∇s(n) (T ), h HT =

n−1 * i − 1 +A 1 @ * i + * i + 1 + T −h T 2 Jθ T , h R 2 i=1 n n n * n − 1 +A 1@ T 2. − Jθ(T ), h R 2 n

Using (5.1.1) again, we obtain for h, g ∈ HT

[∇2 s(n) (T )](g), h HT =

n−1 * i − 1 +A 1 @ * i + * i + 1 + T −h T 2 Jg T , h R 2 i=1 n n n * n − 1 +A 1@ T 2. − Jg(T ), h R 2 n

Letting n → ∞, we see that s(T ) ∈ D∞,∞− ,

T ˙ R2 dt − 1 Jθ(T ), h(T ) R2

Jθ(t), h(t)

∇s(T ), h HT = 2 0

4:36:52, subject to the Cambridge Core terms of use, .006

5.8 Examples of Quadratic Forms

271

and

T

˙ R2 dt − 1 Jg(T ), h(T ) R2

Jg(t), h(t) 2 0

T@ * + A 1 ˙ = dt (h, g ∈ HT ). J g(t) − g(T ) , h(t) R2 2 0

[∇ s(T )](g), h HT = 2

From these observations we obtain

∇2 s(T ) = A, ∇s(T ) dμT = 0, WT

s(T ) dμT = 0. WT

By Proposition 5.7.4, (5.8.10) holds. Second, we compute the eigenvalues and eigenfunctions of A. The equation Ah = λh is equivalent to ˙ λh¨ = J h,

h(0) = 0,

1 ˙ λh(0) = − Jh(T ). 2

Solving this ordinary differential equation, we see that the following functions T hn and  hn are the eigenfunctions corresponding to the eigenvalue λn = (2n+1)π : √ $ # T cos( (2n+1)πt )−1 T hn (t) = sin( (2n+1)πt ) (2n + 1)π T

and  hn = Jhn

(n ∈ Z).

Moreover, {hn ,  hn }n∈Z is an orthonormal basis of HT . Hence, we have the expansion T {hn ⊗ hn +  A= hn ⊗  hn }. (2n + 1)π n∈Z T is two and Aop = Tπ . The multiplicity of each eigenvalue (2n+1)π π If |λ| < T , then, by Corollary 5.7.8,

eλs(T ) dμT = {det2 (I − λA)}− 2 1

WT

=

,* n∈Z

+ λT -−1 ,* λT λ2 T 2 +-−1 = . 1− e (2n+1)π (2n + 1)π (2n + 1)2 π2 n=0 ∞

1−

By the identity cos x =

∞ * 

1−

n=0

+ 4x2 , 2 2 (2n + 1) π

we obtain (5.8.8).

4:36:52, subject to the Cambridge Core terms of use, .006

272

Malliavin Calculus

Next we show (5.8.9). Let e1 = (1, 0) and e2 = (0, 1) ∈ R2 . Define ηi ∈ HT by ηi (t) = √tT ei (i = 1, 2). Moreover, define π, A0 , and A1 as in Corollary 5.7.9. For h ∈ HT with πh = 0 or h(T ) = 0, we have

where h = T −1

!T 0

(A˙0 h)(t) = J(h(t) − h), h(s) ds. Hence, in the same way as above,

A0 =



T {kn ⊗ kn +  kn ⊗  kn }, 2nπ n∈Z\{0}

where kn (t) =

√ # $ T cos( 2nπt T )−1 sin( 2nπt 2nπ T )

and  kn = Jkn .

Furthermore, since 2 2

ηi , Aηi HT = tr(A1 ) = i=1

T 0

i=1

t−T

ei , Jei R2 dt = 0 T

and δ0 (θ(T )) = T1 δ0 (η), by (5.7.9), we obtain

eλs(T ) δ0 (θ(T ))dμT = WT

1 1 {det2 (I − λA0 )}− 2 2πT ∞ 1 ,* λ2 T 2 +-−1 = . 1− 2 2 2πT n=1 4n π

e 2 λQA δ0 (θ(T ))dμT = 1

WT

1 ,  * λT +-−1 = 1− 2πT n∈Z\{0} 2nπ

By the identity sin x = x

∞ * 

1−

n=1

x2 + n2 π2 

we obtain (5.8.9).

As Theorem 5.8.2, Theorem 5.8.4 is applicable to compute the heat kernel. 2

1

Theorem 5.8.5 Let Θ(x) = (− x2 , x2 ) (x = (x1 , x2 ) ∈ R2 ). Define L(t, x; Θ) as in Theorem 5.5.7. Then, for λ ∈ R,

ei λL(T,x;Θ) δy (x + θ(T )) dμT WT

=

*1 + *i λ + λ λ 2 2 −

Jx, y coth λT |y − x| exp . R 2 4 2 4π sinh( 21 λT )

4:36:52, subject to the Cambridge Core terms of use, .006

5.8 Examples of Quadratic Forms

273

Proof Let x, y ∈ R2 . If we show

eαL(T,x;Θ) δy (x + θ(T )) dμT WT

=

*1 + *α + α α 2 2 −

Jx, y cot αT |y − x| exp R 2 4 2 4π sin( 12 αT )

(5.8.11)

for sufficiently small α ∈ R, we obtain the conclusion by analytic continuation. Let φ : [0, T ] → R2 be the solution of the ordinary differential equation φ¨ − αJ φ˙ = 0,

φ(0) = x, φ(T ) = y

(5.8.12)

and define h ∈ HT by h(t) = φ(t) − x. Since

1 T L(t, x; Θ) =

J(x + θ(t)), dθ(t) R2 2 0 and

L(t, x; Θ)(· + h) = s(T ) + 0

T

1

Jθ(t), φ (t) R2 dt − Jθ(T ), φ(T ) R2 2

1 T +

Jφ(t), φ (t) R2 dt, 2 0

by the Cameron–Martin theorem, we obtain

eαL(T,x;Θ) δy (x + θ(T )) dμT WT

= exp

(5.8.13)

*1 T  + eαs(T ) δ0 (θ(T )) dμT .

αJφ(t), φ (t) R2 − |φ (t)|2 dt 2 0 WT

By integration by parts on [0, T ] and (5.8.12),

T  

αJφ(t), φ (t) R2 − |φ (t)|2 dt = φ (0), x R2 − φ (T ), y R2 . 0

The solution of (5.8.12) is explicitly given by φ(t) = x + where

1 J(I − eαtJ )γ α

# α cos( 21 αT ) γ= 1 2 sin( 21 αT ) − sin( 2 αT )

(t ∈ [0, T ]), $ sin( 21 αT ) (y − x). cos( 12 αT )

Hence φ (0) = γ

and φ (T ) = eαT J γ = αJ(y − x) + γ.

4:36:52, subject to the Cambridge Core terms of use, .006

274

Malliavin Calculus

Moreover, we have

φ (0), x R2 − φ (T ), y R2 = α Jx, y R2 + γ, x − y R2 = α Jx, y R2 −

α cos( 21 αT ) 2 sin( 12 αT )

|x − y|2 . 

Plugging this and (5.8.9) into (5.8.13), we obtain (5.8.11).

Remark 5.8.6 Lévy [70] showed the results in this section by using the Fourier expansion of Brownian motion. Moreover, several proofs are known (see [4]).

5.8.3 Sample Variance Let WT be the one-dimensional Wiener space and set

T   (w ∈ WT ), w(t) − w 2 dt vT (w) = where w =

1 T

!T 0

0

w(t) dt.

Theorem 5.8.7 For λ ∈ R with λ > − Tπ 2 , √

* λ T + 12 − 12 λvT e dμT = , √ sinh( λ T ) WT √

1 2 λT − 12 λvT e δ0 (θ(T )) dμT = . √ sinh( 21 λ T ) WT 2

(5.8.14) (5.8.15)

Proof First we show (5.8.14). Define A : HT → HT by

T ˙ (h(s) − h) ds (t ∈ [0, T ], h ∈ HT ). (Ah)(t) = t

In the same way as in Theorem 5.8.1, we can show for h, g ∈ HT

T

∇vT , h HT = 2 (θ(t) − θ)(h(t) − h) dt 0

and



T

[∇vT ](g), h HT = 2

(g(t) − g)(h(t) − h) dt = 2 0

T * T + ˙ dt. =2 (g(s) − g) ds h(t) 0

T

(g(t) − g)h(t) dt

0

t

4:36:52, subject to the Cambridge Core terms of use, .006

5.8 Examples of Quadratic Forms

From these identities, we obtain

∇2 vT = 2A, ∇vT dμT = 0,

275

WT

vT dμT = WT

T2 . 6

Hence, by Proposition 5.7.4, we obtain vT = QA + The equation Ah = λh is equivalent to ... ˙ λ h = −h, h(0) = 0,

T2 . 6 ˙ ˙ ) = 0. h(0) = h(T

Solving this equation, we obtain the expansion ∞ * T +2 A= hn ⊗ hn nπ n=1 √ , * + where hn (t) = nπ2T cos nπt − 1 . In particular, we have Aop = T |λ|
− Tπ 2 . 2 Second, we show (5.8.15). It suffices to show it when |λ| < Tπ 2 . Define η ∈ HT by η(t) = √tT (t ∈ [0, T ]). Define π, A0 , and A1 as in Corollary 5.7.9. For h ∈ HT with πh = 0 or h(T ) = 0, we have



T + 1 T* T ˙ (A0 h)(t) = (h(s) − h) ds − (h(u) − h) du ds. T 0 t s From this, by a similar argument to the above, we obtain ∞ * T +2 {kn ⊗ kn +  kn ⊗  kn }, A0 = 2nπ n=1 where the eigenfunctions are given by √ √ * 2nπt + 2T 2T , * 2nπt +  sin kn (t) = , kn (t) = cos −1 . 2nπ T 2nπ T Hence, since δ0 (θ(T )) = √1T δ(η) and tr(A) = tr(A0 )+tr(A1 ), by Corollary 5.7.9, we have

∞ * , 1 λT 2 +2 -− 12 e− 2 λvT δ0 (θ(T )) dμT = . 1+ (2nπ)2 WT n=1 By (5.8.5), we obtain (5.8.15) for λ ∈ R with |λ|
0, define Fˆ ε : Wˆ → R by Fˆ ε (w, w ) = F(w) + εeξ(w ) ((w, w ) ∈ Wˆ ),   ξ  where ξ(w ) = w (1). Then, ∇ˆ Fˆ ε = ∇F + εe ∇ ξ, where ∇ stands for the gradient operator on W11 . In particular, ∇ˆ Fˆ ε 2Hˆ = ∇F2H + ε2 e2ξ

and

∇ˆ ∗ ∇ˆ Fˆ ε = ∇∗ ∇F + ε(ξ − 1)eξ ,

where we have used Theorem 5.2.8 to see the second identity. Thus, Fˆ ε is of class D∞,∞− and non-degenerate. By Theorem 5.4.11, the integration by parts formula and Theorem 5.2.1,

f (x)E[δ x (Fˆ ε )∇ˆ Fˆ ε 2Hˆ ]dx E[ f (Fˆ ε )∇ˆ Fˆ ε 2Hˆ ] =

R = f (x)E[1[x,∞) (Fˆ ε )∇ˆ ∗ ∇ˆ Fˆ ε ]dx ( f ∈ Cb (W )). R

Letting ε → 0, we arrive at E[ f (F)∇F2H

]=

R

f (x)p(x)dx.

This implies the first assertion. The second assertion is an immediate consequence of the first one.  It was shown by Bouleau and Hirsch [8] that the second assertion continues to hold for Rn -valued Wiener functionals. See also [92]. Proposition 5.9.3 Let F = (F 1 , . . . , F n ) : W → Rn be of class D1,p for some p > 1. Suppose det[( ∇F i , ∇F j )i, j=1,...,n ]  0 ν-a.s. Then, the distribution of F on Rn is absolutely continuous with respect to the Lebesgue measure. An application of the Malliavin calculus on abstract Wiener spaces is the one to stochastic differential equations extended by the theory of rough paths. The theory of rough paths was initiated by Lyons in the 1990s, and developed widely to produce several monographs [26, 27, 71, 72]. In the remainder of this section, we shall give a glance at the theory of rough paths, following [26].

4:36:52, subject to the Cambridge Core terms of use, .006

278

Malliavin Calculus

For a while, we work in the deterministic setting. Let V be a Banach space. A rough path X = (X, X) is a pair of continuous functions X : [0, T ] → V and X : [0, T ]2 → V ⊗ V, satisfying X(s, t) − X(s, u) − X(u, t) = X(s, u) ⊗ X(u, t),

where X(s, t) = X(t) − X(s).

For 31 < α  12 , Cα = Cα ([0, T ], V) denotes the space of rough paths X = (X, X) such that |X(s, t)| < ∞, α st∈[0,T ] |t − s|

|X(s, t)| < ∞. 2α |t st∈[0,T ] − s|

Xα = sup

X2α = sup

Moreover, Cαg stands for the space of rough paths X = (X, X) ∈ Cα such that Sym(X(s, t)) = 12 X(s, t) ⊗ 12 X(s, t), where Sym is the symmetrizing operator. ¯ W ¯ being a Banach space, is said For X ∈ C α ([0, T ], V), Y ∈ C α ([0, T ], W),  α ¯ where L(V, W) ¯ to be controlled by X if there exists Y ∈ C ([0, T ], L(V, W)), Y ¯ such that R (s, t) = is the space of continuous linear mappings of V to W, Y(s, t) − Y  (s)X(s, t) satisfies RY 2α < ∞. The space of such pairs (Y, Y  ) is ¯ denoted by D2α X ([0, T ], W). 1 Let α > 3 . If X = (X, X) ∈ Cα ([0, T ], V) and (Y, Y  ) ∈ D2α X ([0, T ], L(V, W)),  then, for every s < t  T , lim|P|→0 [u,v]∈P (Y(u)X(u, v) + Y  (u)X(u, v)) exists, where P is a partition of [s, t]. The ! tlimit is called the integration of Y against the rough path X, and denoted by s Y(r)dX(r). Using integrations against rough paths, a differential equation driven by a rough path, say a rough differential equation, can be defined; the rough differential equation dY = f (Y)dX,

Y0 = ξ

means the integral equation

t

Y(t) = ξ +

f (Y(s))dX(s). 0

We now proceed to the stochastic setting. First we shall see that a rough differential equation extends a stochastic differential equation. To do this, let B = {B(t)}t0 be a d-dimensional standard Brownian motion. Set

t BItô = B(s, r) ⊗ dB(r) ∈ Rd ⊗ Rd . s

Then BItô = (B, BItô ) ∈ Cα ([0, T ], Rd ) a.s. for any α ∈ ( 31 , 12 ) and T > 0. Similarly, if we set

t B(s, r) ⊗ ◦dB(r) ∈ Rd ⊗ Rd , BStrat = s

4:36:52, subject to the Cambridge Core terms of use, .006

5.9 Abstract Wiener Spaces and Rough Paths

279

then BStrat = (B, BStrat ) ∈ Cαg ([0, T ], Rd ) a.s. for any α ∈ ( 13 , 12 ) and T > 0. It may be worthwhile to notice that, if s = 0, then the anti-symmetric parts of BItô and BStrat coincide with Lévy’s stochastic area.  If (Y(ω), Y  (ω)) ∈ D2α B(ω) for a.a. ω, and Y, Y are both predictable, then

T 0



T

YdBItô =

Y(t)dB(t) and 0

T

YdBStrat =

0

T

Y(t) ◦ dB(t).

0

Moreover, for f ∈ Cb3 (Re , L(Rd , Re )), Lipschitz continuous f0 : Re → Re , and ξ ∈ Re , (i) for a.a. ω, there is a unique solution (Y(ω), f (Y(ω))) ∈ D2α B(ω) to the rough differential equation dY(t, ω) = f0 (Y(t, ω))dt + f (Y(t, ω))dBItô (t, ω),

Y(0, ω) = ξ,

and (ii) Y = {Y(t, ω)}t≥0 is a strong solution to the Itô stochastic differential equation dY(t) = f0 (Y(t))dt + f (Y(t))dB(t),

Y(0) = ξ.

A similar assertion holds with “Strat” instead of “Itô”. We now investigate rough paths arising from Gaussian processes, for which the Malliavin calculus on abstract Wiener spaces works. Let X be a continuous, centered Gaussian process with values in Rd . In what follows, we work on the abstract Wiener space given in Example 5.9.1 (3). The rectangle increment of the covariance is defined by # R

$ s, t = E[X(s, t) ⊗ X(s , t )]. s , t 

Its ρ-variation on a rectangle I × I  , where I and I  are both rectangles in Rd , is given by 1  # s, t $ρ $ ρ   R    , = sup s ,t

#

Rρ,I×I 

P⊂I, [s,t]∈P P ⊂I  [s ,t ]∈P

where P (resp. P ) is a partition of I (resp. I  ). 1 ), and {X(t)}0tT be a d-dimensional, continuous, Let ρ ∈ [1, 32 ), α ∈ ( 31 , 2ρ centered Gaussian process with independent components such that sup 0s j,

where Π[s,t] is the set of partitions of [s, t]. Then X = (X, X) ∈ Cαg . For this X and V1 , . . . , Vd ∈ Cb∞ (Re , Re ), let {Y(t)}0tT be the solution to the rough differential equation dY = V(Y)dX,

Y(0) = y0 ∈ Re ,

where V = (V1 , . . . , Vd ). As an application of Proposition 5.9.3, we have the following. Theorem 5.9.4 Assume that  !t (1) For f ∈ C α ([0, t], Rd ), f = 0 if and only if dj=1 0 f j dh j = 0 for all h ∈ H . (2) For a.a. ω, X(ω) is truly rough, at least in a right neighborhood of 0, that is, there is a dense subset A of a right neighborhood of 0 such that for any s ∈ A, | v, X(s, t) | = ∞, for any v ∈ Rd \ {0}. lim sup |t − s|2α t↓s  Moreover, suppose that Lie(V1 , . . . , Vd )y = Re . Then, for any t > 0, the distri0 bution of Y(t) is absolutely continuous with respect to the Lebesgue measure on Re .

4:36:52, subject to the Cambridge Core terms of use, .006

6 The Black–Scholes Model

Throughout this chapter, we fix T > 0 and a 1-dimensional Brownian motion {B(t)}0tT on a complete probability space (Ω, F , P). As in Section 2.6, let FtB = σ{B(s); s  t} ∨ N , N being the totality of P-null sets. We assume F = FTB . In what follows, we omit the prefix “{FtB }-”, and say simply “predictable”, “martingale” and so on. Moreover, E stands for the expectation with respect to P, and the expectation with respect to another probability measure p denotes the space of Q will be denoted by EQ . Finally, for p = 1, 2, Lloc !T p predictable processes { f (t)}0tT with 0 | f (t)| dt < ∞, P-a.s.

6.1 The Black–Scholes Model Let r, μ, σ > 0, and define the two-dimensional stochastic process X = {X(t) = (ρ(t), S (t))}0tT by ⎧ ⎪ ⎪ ρ(0) = 1, ⎨dρ(t) = rρ(t)dt, (6.1.1) ⎪ ⎪ ⎩dS (t) = μS (t)dt + σS (t)dB(t), S (0) = s0 > 0. Then ρ(t) = ert

  σ2   and S (t) = s0 exp σB(t) + μ − t. 2

In particular, ρ(t), S (t) > 0. ρ(t) represents the price of a safe security, like a bond, at time t. r is an interest rate and ρ(t) represents the amount of continuous compounding at time t. S (t) represents the price of a risky security, like a stock. As (6.1.1) indicates, it is an amount of continuous compounding perturbed by the noise driven by a Brownian motion. The parameter σ corresponds to how much S (t) fluctuates, and is called volatility. The stochastic process X = {X(t) = (ρ(t), S (t))}0tT is 281 4:36:54, subject to the Cambridge Core terms of use, .007

282

The Black–Scholes Model

called the Black–Scholes model. We restrict ourselves to the Black–Scholes model, while general market models can be investigated via stochastic differential equations with more general coefficients than (6.1.1). In what follows, we call {S (t)}0tT the stock price process. Definition 6.1.1 (1) A portfolio is a two-dimensional predictable process ϕ = {ϕ(t) = (ϕ0 (t), ϕ1 (t))}0tT . The totality of portfolios is denoted by P. (2) Given ϕ = {ϕ(t) = (ϕ0 (t), ϕ1 (t))}0tT ∈ P, set V(t; ϕ) = ϕ(t), X(t) = ϕ0 (t)ρ(t) + ϕ1 (t)S (t). The process {V(t; ϕ)}0tT is called the value process of ϕ. 1 , (3) ϕ ∈ P is said to be self-financing in the market X if {ϕ0 (t)}0tT ∈ Lloc 1 2 {ϕ (t)}0tT ∈ Lloc , and

t

t V(t; ϕ) = V0 (ϕ) + ϕ0 (s)dρ(s) + ϕ1 (s)dS (s) 0 0

t

t (rϕ0 (s)ρ(s) + μϕ1 (s)S (s))ds + σϕ1 (s)S (s)dB(s). (6.1.2) = V0 (ϕ) + 0

0

The totality of self-financing portfolios is denoted by Psf . Remark 6.1.2 (1) By the continuity of {ρ(t)}0tT and {S (t)}0tT , 1 {rϕ0 (t)ρ(t) + μϕ1 (t)S (t)}0tT ∈ Lloc

2 and {σϕ1 (t)S (t)}0tT ∈ Lloc

1 2 if {ϕ0 (t)}0tT ∈ Lloc and {ϕ1 (t)}0tT ∈ Lloc . Hence the integrals appearing in (6.1.2) are well defined. (2) By discretizing time, it can be seen that being self-financing means no inflow and outflow of money. In fact, suppose a next trade after time t occurs at t + Δ. In this case, the portfolio ϕ(t + Δ) is the trading strategy taken at time t. If there is no inflow or outflow of money, then the wealth V(t; ϕ) of the trader at time t must coincide with the amount of investment: ϕ(t), X(t) =

ϕ(t + Δ), X(t) . Hence

V(t + Δ; ϕ) − V(t; ϕ) = ϕ(t + Δ), X(t + Δ) − X(t) . Letting Δ → 0, we obtain dV(t; ϕ) = ϕ(t), dX(t) = ϕ0 (t)dρ(t) + ϕ1 (t)dS (t). Thus (6.1.2) holds. (3) Since B(0) = 0, F0B = {∅, Ω} ∪ N . Hence every F0B -measurable function is a constant function. In particular, ϕ0 (0), ϕ1 (0), and V(0; ϕ) are all constants.

4:36:54, subject to the Cambridge Core terms of use, .007

6.1 The Black–Scholes Model

283

Example 6.1.3 A constant portfolio, that is, a portfolio such that ϕ(t) = ϕ(0) (0  t  T ) is self-financing. Definition 6.1.4 Set 1 = e−rt , ξ(t) = ρ(t)

S (t) = ξ(t)S (t),

X(t) = ξ(t)X(t) = (1, S (t)).

{S (t)}0tT is called a stock price process discounted by numeraire {ρ(t)}0tT . Since dξ(t) = −rξ(t)dt, by Itô’s formula, dS (t) = S (t)dξ(t) + ξ(t)dS (t) = (μ − r)S (t)dt + σS (t)dB(t).

(6.1.3)

Being “self-financing” is common in both markets X and X as follows. Lemma 6.1.5 Let Psf be the totality of portfolios which are self-financing in the market X. Then Psf = Psf . In particular, for ϕ ∈ Psf , V(t; ϕ) = ξ(t)V(t; ϕ) satisfies

t V(t; ϕ) = V 0 (ϕ) + ϕ1 (s)dS (s). (6.1.4) 0

Proof Let ϕ ∈ Psf . By Itô’s formula and (6.1.3), dV(t; ϕ) = ξ(t)dV(t; ϕ) + V(t; ϕ)dξ(t) = ξ(t){(rϕ0 (t)ρ(t) + μϕ1 (t)S (t))dt + σϕ1 (t)S (t)dB(t)} − rξ(t){ϕ0 (t)ρ(t) + ϕ1 (t)S (t)}dt = ξ(t)S (t)ϕ1 (t){(μ − r)dt + σdB(t)} = ϕ1 (t)dS (t). Thus ϕ ∈ Psf . Conversely, suppose ϕ ∈ Psf . Since V(t; ϕ) = ρ(t)V(t; ϕ), by Itô’s formula, (6.1.4), and (6.1.3), dV(t; ϕ) = ρ(t)dV(t; ϕ) + V(t; ϕ)dρ(t) = ρ(t)ϕ1 (t){(μ − r)S (t)dt + σS (t)dB(t)} + rρ(t){ϕ0 (t) + ϕ1 (t)S (t)}dt = rϕ0 (t)ρ(t)dt + ϕ1 (t)S (t){μdt + σdB(t)} = ϕ0 (t)dρ(t) + ϕ1 (t)dS (t). 

Hence ϕ ∈ Psf . We close this section with a remark on Psf . 2 , define Lemma 6.1.6 Given a ∈ R and {ϕ1 (t)}0tT ∈ Lloc

t ϕ0 (t) = a + ϕ1 (s)dS (s) − ϕ1 (t)S (t). 0

4:36:54, subject to the Cambridge Core terms of use, .007

284

The Black–Scholes Model

Then ϕ = (ϕ0 , ϕ1 ) ∈ Psf and V(0; ϕ) = a. Proof By Lemma 6.1.5, ϕ = (ϕ0 , ϕ1 ) ∈ P is self-financing if and only if

t ϕ0 (t) + ϕ1 (t)S (t) = V(0; ϕ) + ϕ1 (s)dS (s) (0  t  T ). 0

Solving this in ϕ0 (t), we obtain the desired assertion.



Example 6.1.7 Suppose ϕ1 (t) = b (0  t  T ) for some b ∈ R. If we apply the lemma to this ϕ1 , then ϕ0 (t) = a − bs0 (0  t  T ). Thus, we arrive at a constant portfolio.

6.2 Arbitrage Opportunity, Equivalent Martingale Measures Definition 6.2.1 (1) ϕ ∈ Psf is said to be admissible if P(V(t; ϕ)  C for any t ∈ [0, T ]) = 1 for some C ∈ R. The totality of admissible portfolios is denoted by Padm . (2) A portfolio ϕ ∈ Padm is called an arbitrage opportunity if V(0; ϕ) = 0, V(t; ϕ)  0, P-a.s. for every t ∈ [0, T ], and P(V(t; ϕ) > 0) > 0. Denote by Parb the totality of arbitrage opportunities. (3) An equivalent martingale measure is a probability measure Q on (Ω, F ) satisfying (i) Q is equivalent to P, that is, A ∈ F is Q-null if and only if it is P-null. (ii) Under Q, {S (t)}0tT is a local martingale. The totality of equivalent martingale measures is denoted by EMM. Remark 6.2.2 (1) ϕ ∈ Padm is also called a tame portfolio. The condition V(t; ϕ)  C means that an investor has to keep the debt within manageable limits. (2) The trading strategy ϕ ∈ Parb enables an investor to start with no asset, invest without running into debt, and end up with a profit with positive probability. (3) There exists ϕ ∈ Psf \ Padm such that V(0; ϕ) = 0 and V(T ; ϕ) > 0 P-a.s. To see this, we assume r = μ = 0. Let

t 1 dB(s), Y(t) = √ T−s 0

4:36:54, subject to the Cambridge Core terms of use, .007

6.2 Arbitrage Opportunity, Equivalent Martingale Measures

285

a > 0, and τa = inf{t  0 ; Y(t)  a}. By Theorem 2.5.3, there exists a Brownian motion Bˆ such that   t 1 ˆ ds (6.2.1) Y(t) = B 0 T −s Then Theorem 3.2.1 implies P(τa < T ) = 1. Set ϕ1 (t) = σS (t)1√T −t 1[0,τa ) (t). On account of Lemma 6.1.6, there exists a 1 {ϕ0 (t)}0tT ∈ Lloc such that ϕ = (ϕ0 , ϕ1 ) ∈ Psf and V(0; ϕ) = 0. Since r = 0,

t σS (s)ϕ1 (s)dB(s) = Yt∧τa . (6.2.2) V(t; ϕ) = 0

In conjunction with (6.2.1) and Theorem 1.5.20, this yields P( inf V(t; ϕ)  C) < 1 0tT

for any C ∈ R. Thus ϕ  Padm . Moreover, by (6.2.2) and the definition of τa , V(T ; ϕ) = a > 0. Theorem 6.2.3 Set α = 2

r−μ σ

α  P(A) = E[eαB(t)− 2 T ; A]

and define (A ∈ F )

and

 B(t) = B(t) − αt

(0  t  T ).

(1)  P is a probability measure on (Ω, F ). Moreover, { B(t)}0tT is an (FtB ) Brownian motion under P. (2) EMM = { P}. P, {V(t; ϕ)}0tT is a local martingale. (3) Let ϕ ∈ Psf . Under  (4) Parb = ∅. Proof (1) While the assertion is a direct application of the Girsanov theorem (Theorem 4.6.2), we give an elementary proof. Let 0  s1 < · · · < sn  s < t, f ∈ Cb (Rn ) and λ ∈ R. Then, by the independence of B(t) − B(s) and F sB , EP [ f ( B(s1 ), . . . ,  B(sn )) exp(iλ( B(t) −  B(s)))]    1 B(sn )) exp αB(s) − α2 s = E f ( B(s1 ), . . . ,  2   α2 × exp (iλ + α)(B(t) − B(s)) − iλα(t − s) − (t − s) 2 − 12 λ2 (t−s)   = EP [ f ( B(s1 ), . . . , B(sn ))]e . B )-Brownian motion under  P, where This implies that { B(t)}0tT is an (F t  B  = σ{    B(s); s  T } ∨ { P-null sets}. Since B(t) = B(t) − αt and P and P are F t B B = F B . Thus {  B(t)} is an (F )-Brownian motion under P. equivalent, F 0tT t t t

4:36:54, subject to the Cambridge Core terms of use, .007

286

The Black–Scholes Model

(2) By (6.1.3), we have dS (t) = σS (t)d B(t).

(6.2.3)

P. Thus  P ∈ EMM. Then, by (1), {S (t)}0tT is a local martingale under  Next let Q ∈ EMM. The proof of the second assertion is completed once we have shown Q =  P. Since Q is equivalent to P, there exists a Z ∈ L1 (P) such that Z > 0, P-a.s. and Q(A) = E[Z; A]

(A ∈ F ).

(6.2.4)

Define Z(t) = E[Z|FtB ] (0  t  T ). By Corollary 2.6.6 and a standard stopping time argument, {Z(t)}0tT is a continuous martingale. Set τ = inf{t  T | Z(t) = 0}, where τ = ∞ if {· · · } = ∅. Since {τ  T } ∈ B , by the optional sampling theorem (Theorem 1.5.11), we have Fτ∧T 0 = E[Z(τ); τ  T ] = E[Z(τ ∧ T ); τ  T ] = E[Z(T ); τ  T ]. This implies that P(τ  T ) = 0, for Z(T ) = Z > 0, P-a.s. Hence inf Z(t) > 0, P-a.s.

(6.2.5)

0tT

2 As remarked after Theorem 2.6.2, there is an { f (t)}0tT ∈ Lloc satisfying

t Z(t) = 1 + f (s)dB(s) (0  t  T ). 0

If we set g(t) =

f (t) Z(t) ,

2 then by (6.2.5), {g(t)}0tT ∈ Lloc . Furthermore,

t g(s)Z(s)dB(s) (0  t  T ). Z(t) = 1 +

(6.2.6)

0

Set τn = inf{t  0 | S (t)  n} (n = 1, 2, . . . ), where τn = ∞ if {· · · } = ∅. Since τn {S (t)}0tT is a local martingale under Q, {S (t)}0tT is a bounded martingale under Q. Hence for s < t and A ∈ F sB , τn

τn

EQ [S (t); A] = EQ [S (s); A]. From this and the identity τn

τn

τn

EQ [S (u); C] = E[S (u)Z; C] = E[S (u)Z(u); C]

(0  u  T, C ∈ FuB ),

τn

it follows that {S (t)Z(t)}0tT is a martingale under P. By Itô’s formula, (6.1.3), and (6.2.6), we obtain

t

t S (t)Z(t) =s + S (s)g(s)Z(s)dB(s) + σS (s)Z(s)dB(s) 0 0

t S (s)Z(s){(μ − r) + σg(s)}ds. + 0

4:36:54, subject to the Cambridge Core terms of use, .007

6.3 Pricing Formula

287

τn

Since {S (t)Z(t)}0tT is a martingale under P, this yields

t S (s)Z(s){(μ − r) + σg(s)}ds = 0 (0  t  T ). 0

Hence g(s) = α ( s  T ). Substitute this into (6.2.6) to see

t Z(t) = 1 + αZ(t)dB(t) (0  t  T ). 0 2

α2

αB(t)− α2 t

(t  T ), and Z = Z(T ) = eαB(T )− 2 T . In conjunction with Thus Z(t) = e the definition (6.2.4) of Q, we have Q =  P. P, so is {V(t; ϕ)}0tT by (3) Since {S (t)}0tT is a local martingale under  (6.1.4). (4) It suffices to show that if ϕ ∈ Padm satisfies V0 (ϕ) = 0 and V(t; ϕ)  0 (0  t  T ) P-a.s., then ϕ  Parb . Let ϕ be as above, and put σn = inf{t  0 | V(t; ϕ)  n} (n = 1, 2, . . . ), where P. σn = ∞ if {· · · } = ∅. By (3), {V(t ∧ σn ; ϕ)}0tT is a martingale under  By the equivalence of  P and P, V(0; ϕ) = 0 and V(T ; ϕ)  0,  P-a.s. and there exists a C ∈ R such that  P(V(t; ϕ)  C (0  t  T )) = 1. Applying Fatou’s lemma, we obtain 0  EP [V(T ; ϕ)]  lim inf EP [V(T ∧ τn ; ϕ)] = EP [V(0; ϕ)] = 0. n→∞

This implies V(T ; ϕ) = 0,  P-a.s. and hence also P-a.s. Hence ϕ  Parb .



Remark 6.2.4 By (6.2.3),

Thus, {S (t)}0tT

 σ2  t (0  t  T ). S (t) = exp σ B(t) − 2 is a martingale under  P.

6.3 Pricing Formula In this section, we shall give a formula to give the price of a claim by using the equivalent martingale measure  P defined in Theorem 6.2.3. Consider a derivative of the stock price process S = {S (t)}0tT , which is paid off at maturity T . It is an FTS -measurable function. Since  1 σ2  B(t) = log S (t) − μ − t , σ 2 the FTS -measurability coincides with F -measurability. Thus, such a claim is an F -measurable function.

4:36:54, subject to the Cambridge Core terms of use, .007

288

The Black–Scholes Model

Definition 6.3.1 (1) CE denotes the totality of F -measurable functions F with P(F  C) = 1 for some C ∈ R. An element of CE is called a European contingent claim. (2) F ∈ CE is said to be replicable if there exists a ϕ ∈ Padm such that V(T ; ϕ) = F. In this case, we say ϕ replicates (hedges) F. A natural question is “which F ∈ CE can be replicated?”, and the following is an answer. Theorem 6.3.2 Given F ∈ L1 ( P)∩CE , there exists a ϕ ∈ Padm which replicates F and satisfies V(0; ϕ) = EP [ξ(T )F]. Remark 6.3.3 The theorem asserts that EP [ξ(T )F] is the price of F at time 0 since there is no arbitrage opportunity. In fact, suppose we buy F at the price x < EP [ξ(T )F]. Then we take the strategy −ϕ, where ϕ is the portfolio described in the theorem. By this investment, we still have EP [ξ(T )F]− x at the beginning, and invest it into ρ by a constant portfolio. Receiving the payoff F, we end up with (EP [ξ(T )F] − x)erT > 0 remaining. Thus an arbitrage opportunity occurs. Next if we sell F at the price y > EP [ξ(T )F], then taking the strategy ϕ and investing y−EP [ξ(T )F] into ρ, we earn (y−EP [ξ(T )F])erT > 0 at maturity. This is also an arbitrage opportunity. Thus, trades at a price different from EP [ξ(T )F] cause an arbitrage opportunity. B , the σ-field constructed Proof As was seen in the proof of Theorem 6.2.3, F t B and  P instead of B and P, coincides with FtB . Moreover, by as FtB with  2 with  P instead of P the equivalence of P and  P, the space defined as Lloc 2 coincides with Lloc itself. Due to these observations, as an application of Itô’s representation theorem (Theorem 2.6.2) to { B(t)}0tT , there is an { f (t)}0tT ∈ 2 Lloc such that

t f (s)d B(s) (0  t  T ). EP [ξ(T )F|FtB ] = EP [ξ(T )F] + 

0

2 by ϕ1 (t) = Define {ϕ1 (t)}0tT ∈ Lloc

f (t) . By σS (t) 0 1

virtue of Lemma 6.1.6, using

this {ϕ (t)}0tT , we construct a ϕ = (ϕ (t), ϕ (t)) ∈ Psf such that V(0; ϕ) = EP [ξ(T )F]. Then, by (6.1.4) and (6.2.3),

t V(t; ϕ) = EP [ξ(T )F] + σϕ1 (s)S (s)d B(s) = EP [ξ(T )F|FtB ] (0  t  T ). 1

0

Since ξ(T )F is bounded from below, ϕ ∈ Padm . Moreover, substitute t = T to  see V(T ; ϕ) = ξ(T )F, which means ϕ replicates F.

4:36:54, subject to the Cambridge Core terms of use, .007

6.3 Pricing Formula

289

As was seen in the above proof, finding a replicating portfolio reduces to finding a process { f (t)}0tT satisfying

T ξ(T )F = EP [ξ(T )F] + f (t)d B(t). (6.3.1) 0

By the Clark–Ocone formula (Theorem 5.3.5), if F is in D1,2 , then the desired process is given by 9:;< ˙ f (t) = ξ(T )EP [ (∇F) (t)|FtB ]. If F is of the form g(S (T )), then a more precise expression is available. Proposition 6.3.4 Let g ∈ C 1 (R) be of polynomial growth order, and F = g(S (T )). Set

f (t) = σξ(T )S (t)

 (x + σ (T − t))2  1 2 xg (S (t)e x ) exp − dx. √ 2σ(T − t) 2πσ(T − t) 2

R

Then { f (t)}0tT satisfies (6.3.1). In particular, let

 (x + σ2 (T − t))2  1 2 xg (S (t)e x ) exp − dx, 2σ(T − t) 2πσ(T − t) R

t ϕ1 (s)dS (s) − ϕ1 (t)S (t). ϕ0 (t) = EP [ξ(T )F] + ϕ1 (t) = ξ(T )



0

Then ϕ = (ϕ , ϕ ) replicates F. 0

1



Proof Since S (T ) = eσB(T )−

σ2 2

T

, by Corollary 5.3.2,

9:;< ˙ (∇F) (t) = σg (S (T ))S (T ). σ   S (t, T ), where  S (t, T ) = eσ(B(T )−B(t))− 2 (T −t) , and Rewriting as S (T ) = S (t) using the independence of  B(T ) −  B(t) and FtB , we obtain  S (t, T )) S (t, T )]y=S (t) . EP [g (S (T ))S (T )|FtB ] = S (t)EP [g (y 2

2 Since σ( B(T ) −  B(t)) − σ2 (T − t) obeys the normal distribution with mean 2 − σ2 (T − t) and variance σ2 (T − t), this implies the desired expression of f (t). The second assertion has been seen in the proof of Theorem 6.3.2. 

We now proceed to determining the price of a European contingent claim F ∈ CE . For this purpose, we introduce two candidates of price as follows. Let

4:36:54, subject to the Cambridge Core terms of use, .007

290

The Black–Scholes Model PB (F) = {ϕ ∈ Padm | V(T ; ϕ) + F  0}, PS (F) = {ψ ∈ Padm | V(T ; ψ) − F  0}, πB (F) = sup{y | V(0; ϕ) = −y for some ϕ ∈ PB }, πS (F) = inf{z | V(0; ψ) = z for some ψ ∈ PS }.

Remark 6.3.5 Suppose a buyer buys F at the price y at time 0, hence starts with initial value −y, and then invests by a portfolio ϕ. What he/she expects is that the sum V(T ; ϕ) + F, the total assets at maturity, is non-negative, otherwise he/she suffers a loss. Hence πB (F) is the supremum of prices which a buyer accepts. Contrarily, suppose a seller sells F at the price z at time 0. By a portfolio ψ with initial value z, he/she earns V(T ; ψ). After paying F, what remains is V(T ; ψ)−F, which he/she hopes to be non-negative. Thus πS (F) is the infimum of prices which a seller sets. Theorem 6.3.6 Let F ∈ CE . (1) πB (F)  πS (F). P) in addition, then πB (F) = πS (F) = EP [ξ(T )F]. (2) If F ∈ L1 ( Proof (1) Let φ ∈ Padm . By Theorem 6.2.3(3), {V(t; φ)}0tT is a local martingale under  P. Since  P and P are equivalent,  P(V(t; φ)  C (0  t  T )) = 1 for some C ∈ R. By a similar argument as in the proof of Theorem 6.2.3(4), we have EP [V(t; φ)]  EP [V(0; φ)].

(6.3.2)

Since V(0; φ) is a constant, in conjunction with the boundedness from below, P). this implies V(t; φ) ∈ L1 ( Take ϕ ∈ PB and ψ ∈ PS . Set V(0; ϕ) = −y and V(0; ψ) = z. Then ϕ + ψ ∈ Padm and V(t; ϕ) + V(t; ψ) = {V(t; ϕ) + F} + {V(t; ψ) − F}  0. By (6.3.2), 0  EP [V(t; ϕ) + V(t)(ψ)]  −y + z. Hence y  z. Taking the supremum over y and the infimum over z, we obtain πB (F)  πS (F). P), by Theorem 6.3.2, there exists a ϕ ∈ Padm , which repli(2) Since F ∈ L1 ( cates F and satisfies V(0; ϕ) = EP [ξ(T )F]. This ϕ belongs to PS (F), because V(T ; ϕ) − F = 0. Hence πS (F)  EP [ξ(T )F].

(6.3.3)

Next set Fn = F ∧ n (n = 1, 2, . . . ). By Theorem 6.3.2, for each n, take a ϕn ∈ Padm replicating −Fn and satisfying V(0; ϕn ) = −EP [ξ(T )Fn ].

4:36:54, subject to the Cambridge Core terms of use, .007

6.3 Pricing Formula

291

Since Fn  F, V(T ; ϕn ) + F  V(T ; ϕn ) + Fn = 0. Thus ϕn ∈ PB . Hence πB (F)  EP [ξ(T )Fn ]. By the boundedness from below of F and the monotone convergence theorem, letting n → ∞, we obtain πB (F)  EP [ξ(T )F]. In conjunction with the assertion (1) and (6.3.3), this implies the desired identity.  In Theorem 6.3.6 (2), the identity means the expectation is an amount accepted by both buyer and seller. Hence the price of F ∈ CE ∩ L1 ( P) is given by π(F) = EP [ξ(T )F]. Theorem 6.3.7 Let f ∈ C(R) be bounded from below and at most of P), and polynomial growth. Then F = f (S (T )) ∈ CE ∩ L1 (

x2 σ2 1 π(F) = e−rT f (s0 e(r− 2 )T e x ) √ (6.3.4) e− 2σ2 T dx. R 2σ2 T It should be noted that in the pricing formula (6.3.4), only r and σ are involved, and μ is not. The stock price process {S (t)}0tT reflects on prices via only σ, which indicates how risky the market is. 2 Proof By (6.2.3), S (t) = s0 exp(σ B(t) − σ2 T ). Hence  σ2  T . B(T ) − S (T ) = s0 erT exp σ 2

P, σ B(T ) obeys the Since { B(t)}0tT is a Brownian motion starting at 0 under  2 P) and normal distribution with mean 0 and variance σ T . Thus F ∈ L1 (    σ2  π(F) = EP [ξ(T ) f (S (T ))] = e−rT EP f s0 erT exp σ T B(T ) − 2

x2 σ2 1 = e−rT f (s0 erT e x− 2 T ) √ e− 2σ2 T dx. 2 R 2σ T This implies (6.3.4).  A European call option is a contingent claim whose payoff at maturity is C = (S (T ) − K)+ . The holder of this claim is allowed to buy a unit of stock at the exercise price K at maturity. The payoff is computed as follows. If the price of the stock is more than K, then the holder exercises the option; buys a

4:36:54, subject to the Cambridge Core terms of use, .007

292

The Black–Scholes Model

stock at the price K, immediately sells it at the market price S (T ), and earns a profit S (T ) − K. If K  S (T ), the holder never exercises the option, and makes no profit. Thus the payoff is (S (T ) − K)+ . There is an exact expression of the price of the European call option, called the Black–Scholes formula. Proposition 6.3.8 The price of the European call option is given by π(C) = s0 Φ(d+ ) − Ke−rT Φ(d− ), where

Φ(x) =

and d± =

x −∞

(6.3.5)

y2 1 √ e− 2 dy (x ∈ R) 2π

1  * s0 +  σ2   + r± T (the double signs correspond). √ log K 2 σ T

Proof By Theorem 6.3.7,

x2   (r− σ2 )T x 1 rT e π(C) = s0 e 2 e − K + √ e− 2σ2 T dx R 2σ2 T

∞ 2 σ2 1 − x2 x 2σ T dx e = s0 e(r− 2 )T e √ 2 log( sK )−(r− σ2 )T 2σ2 T 0

x2 1 −K e− 2σ2 T dx. √ 2 log( sK )−(r− σ2 )T 2σ2 T 0 √ By the change √ of variables x = σ T y in both terms in the last equation, and then z = y − σ T in the first term, the first term turns into s0 erT Φ(d+ ) and the !∞ x2  second one becomes KΦ(d− ), because a √12π e− 2 dx = Φ(−a). A European put option allows the holder to sell at maturity a unit of stock at the exercise price K. Its payoff is P = (K − S (T ))+ . By the same method as above, its price is computed as π(P) = Ke−rT Φ(−d− ) − s0 Φ(−d+ ).

(6.3.6)

From this and (6.3.5), the put-call parity follows: π(P) − π(C) = Ke−rT − s0 . This identity is also derived by taking advantage of the equivalent martingale measure  P. Indeed, as was seen in Remark 6.2.4, {S (T )}0tT is a martingale  under P. Since (K − S (T ))+ − (S (T ) − K)+ = K − S (T ), π(P) − π(C) = E[e−rT K − S (T )] = Ke−rT − s0 .

4:36:54, subject to the Cambridge Core terms of use, .007

6.4 Greeks

293

To apply the Black–Scholes model to a real market, it is indispensable to estimate the volatility σ. Thinking of the price of the European √ call option as a function of σ, we write π(C; σ) for π(C). Since d+ = d− + σ T and √ K Φ (d− + σ T ) = e−rT Φ (d− ), s0 it follows from Proposition 6.3.8 that √ √ d π(C; σ) = s0 Φ (d− + σ T ){d− + T } − Ke−rT Φ (d− )d− dσ √ √ = s0 Φ (d− + σ T ) T > 0. Thus, the mapping σ → π(C; σ) is strictly increasing, and hence for each γ  0, there exists a unique solution to the identity π(C; σ) = γ. The solution σγ is called an implied volatility. Hence, substituting the real price of the European call option into γ, we estimate the volatility σ.

6.4 Greeks Partial derivatives of π(F) are called Greeks. They indicate the sensitivity of π(F) to changes of parameters. The Greeks Delta, Gamma, Vega, and Rho are 2 ∂ π(F), and ∂r∂ π(F), respectively. defined to be ∂s∂0 π(F), ∂∂2 s0 π(F), ∂σ By (6.3.4), the Greeks are directly computed. Moreover, by the integration σ2 by parts on R, they can be represented as weighted integrals of f (s0 er− 2 )T e x ). This can be done with the help of the Malliavin calculus. Such an application of the Malliavin calculus to Greeks was first described in [24]. See also [76]. We shall see this for the Delta. Proposition 6.4.1 Let f ∈ C(R) be bounded from below. Suppose f and its derivative are at most of polynomial growth. Set F = f (S (T )). Then,   B(T ) ∂ π(F) = e−rT EP f (S (T )) . (6.4.1) ∂s0 s0 σT Proof By Theorem 6.3.7, π(F) is differentiable in s0 . Moreover, due to the same theorem, we may assume that f is of C 1 -class. Let (WT , B(WT ), μT ) be the one-dimensional Wiener space on [0, T ], and {ϕ(t)}0tT be the coordinate process on it. By Theorem 6.2.3,

σ2 π(F) = e−rT f (s0 e(r− 2 )T eσϕ(T ) )dμT . WT

4:36:54, subject to the Cambridge Core terms of use, .007

294

The Black–Scholes Model

Then

∂ π(F) = e−rT ∂s0

σ2 2

)T σϕ(T )

σ2 2

)T σϕ(T )

f  (s0 e(r−

e

)e(r−

σ2 2

)T σϕ(T )

e

dμT .

WT

By Corollary 5.3.2, ∇( f (s0 e(r−

σ2 2

)T σϕ(T )

e

)) = f  (s0 e(r−

e

)s0 e(r−

σ2 2

)T σϕ(T )

e

σ [0,T ] ,

where [0,T ] ∈ WT∗ is defined by [0,T ] (w) = w(T ) (w ∈ WT ). Since  [0,T ] 2 = T , by Remark 5.1.3(2),

σ2 ∂ e−rT π(F) =

∇( f (s0 e(r− 2 )T eσϕ(T ) )), [0,T ] HT dμT ∂s0 s0 σT WT

σ2 e−rT = f (s0 e(r− 2 )T eσϕ(T ) )ϕ(T )dμT . s0 σT WT 

This means (6.4.1) holds.

The other Greeks possess similar expectation expressions. The proofs are carried out in exactly the same manner as above via the Malliavin calculus. In general mathematical models in finance, stock prices are realized as solutions of stochastic differential equations with general coefficients. For such general models, Greeks are defined similarly. For example, the Delta is defined as partial derivatives of the price of European contingent claims with respect to the initial value of risky assets. Thus it relates to partial derivatives of Wiener integrals with respect to the initial conditions of the stochastic differential equations. To be precise, let {X(t, x)}t∈[0,T ] be the solution of the stochastic differential equation (5.5.1) and f be a function on RN . The Delta corresponds to the derivative of the expectation of f (X(t, x)) with respect to the initial value x = X(0, x),

∂ f (X(t, x)) dμT , ∂xi WT where (WT , B(WT ), μT ) is the d-dimensional Wiener space. We present an expression for the derivative by using the Malliavin calculus. In the following we assume that X(t, x) is non-degenerate for all t ∈ (0, T ] and x ∈ RN . For example, the reader may assume Hörmander’s condition at every x ∈ RN . Let Y(t, x) = (Y ij (t, x))1i, jN be the Jacobian matrix of the mapping x → X(t, x) as before and Z(t, x) = (Z ij (t, x))1i, jN be its inverse matrix. Define Gi (t, x) ∈ D∞,∞− (HT ) by N 9:;< ˙ Z ij (s, x)Vαj (X(s, x))1[0,t] (s) Gi (t, x)α (s) =

(0  s  T )

j=1 4:36:54, subject to the Cambridge Core terms of use, .007

6.4 Greeks

295

and A(t, x) = (Ai j (t, x))1i, jN by Ai j (t, x) = Gi (t, x), G j (t, x) HT . As mentioned after Theorem 5.5.4, (det A(t, x))−1 ∈ L∞− (μT ) and det A(t, x)  0, μT -a.s. Denote the inverse matrix of A(t, x) by B(t, x): B(t, x) = (Bi j (t, x))1i, jN = A(t, x)−1 . 1 (RN ), Theorem 6.4.2 For any f ∈ C

∂ f (X(t, x)) dμT = f (X(t, x))Φi (t, x) dμT , ∂xi WT WT

(6.4.2)

where Φi is given by Φi (t, x) =

N

Bik (t, x)

d N j=1 α=1

k=1

t

0

Z kj (s, x)Vαj (X(s, x)) dθα (s)

N

+2

Bik (t, x) ∇Gk , Gm ⊗ G j HT⊗2 Bm j (t, x).

j,k,m=1

Proof By Theorem 4.10.8,

N ∂ ∂f f (X(t, x)) dμ = (X(t, x))Yij (t, x) dμT . T j ∂xi WT ∂x W T j=1

(6.4.3)

By Theorem 5.5.1, ∇X k (t, x) =

N

Yik (t, x)Gi (t, x),

i=1

 

∇X i (t, x), ∇X j (t, x) HT 1i, jN = Y(t, x)A(t, x)Y(t, x)∗ ,   ∗

∇X i (t, x), ∇X j (t, x) HT −1 1i, jN = Z (t, x)B(t, x)Z(t, x). Then Theorem 5.4.5 yields N ∂f (X(t, x))Yij (t, x) dμT j ∂x j=1 WT

=

f (X(t, x)) WT

N

 ∇∗ Bik (t, x)Gk (t, x)) dμT .

(6.4.4)

Z kj (s, x)Vαj (X(s, x)) dθα (s).

(6.4.5)

k=1

By Theorem 5.3.3, ∇∗Gk (t, x) =

d N j=1 α=1

t 0

4:36:54, subject to the Cambridge Core terms of use, .007

296

The Black–Scholes Model

Moreover, applying ∇ to both sides of the identity N

Bik (t, x) Gk (t, x), G j (t, x) HT = δi j ,

k=1

we obtain N

∇Bik (t, x) Gk (t, x), G j (t, x) HT

k=1

+

N

Bik (t, x){ ∇Gk (t, x), G j (t, x) HT + Gk (t, x), ∇G j (t, x) HT } = 0.

k=1

Since B(t, x) is a symmetric matrix, we have ∇Bi j (t, x) = −2

N

Bik (t, x) ∇Gk (t, x), Gm (t, x) HT Bm j (t, x).

k,m=1

Combining this identity with (6.4.3)–(6.4.5) and Theorem 5.2.8, we obtain (6.4.2).  Remark 6.4.3 By Remark 5.5.8, the mapping

f (X(t, x)) dμT x → WT

is smooth for any Borel measurable function f with at most polynomial growth. We can show that the assertion of the theorem holds for such a function f by approximating f by smooth functions. In fact, let p(t, x, y) be the transition density of X(t, x). Then p(t, x, y) is smooth in (t, x, y) ∈ (0, ∞) × RN × RN and we have

∂ ∂ f (X(t, x)) dμ = p(t, x, y) f (y) dy. T i ∂xi WT RN ∂x The right hand side of (6.4.2) is an analytical expression for the right hand side of this identity. Those who are interested in the mathematical finance may proceed to books by Duffie [18], Elliott and Kopp [21], Karatzas and Shreve [57], Musiela and Rutkowski [88], and Shreve [105].

4:36:54, subject to the Cambridge Core terms of use, .007

7 The Semiclassical Limit

Kac (see [52, 54, 55]) found a similarity between the Feynman path integral and the Wiener integral. The Feynman–Kac formula is a typical example. He also applied the Wiener integral to analysis of differential operators. Among these his contribution to the “problem of drums” is well known. In the first half of this chapter, we introduce Van Vleck’s formula [121], which asserts that, if the Hamiltonian of the classical mechanics is given by a quadratic polynomial, the corresponding propagator (the fundamental solution for the Schrödinger operator) is represented in terms of action integrals, and we show its analogue to heat equations. As an application, a probabilistic representation of soliton solutions for the KdV equation and other related topics will be discussed. In the second half, by investigating the semiclassical approximations for the eigenvalues of Schrödinger operators and Selberg’s trace formula on the upper half plane, we present applications of stochastic analysis to differential operators and see the correspondence to classical mechanics.

7.1 Van Vleck’s Result and Quadratic Functionals In chapter 5, the explicit forms of the heat kernels of the harmonic oscillators and the Schrödinger operators with constant magnetic fields were derived. For these Schrödinger operators on Rd , whose Hamiltonians of the corresponding classical mechanics are given by quadratic polynomials, Van Vleck ([121]) showed that the propagators are given by means of action integrals of classical mechanics. For simplicity we study on R2 . Let k and be non-negative constants and K and J be the 2 × 2 constant matrices given by # 2 k K= 0

0 2

$

#

and

0 J= 1

$ −1 , 0

297 4:36:57, subject to the Cambridge Core terms of use, .008

298

The Semiclassical Limit

respectively. For γ ∈ R, define V : R2 → [0, ∞) and Θ : R2 $ # 1 Θ1 (x) = γJx = V(x) = K x, x and Θ(x) = Θ2 (x) 2

→ R2 by # $ 1 −x2 γ 1 , x 2

respectively, where x = (x1 , x2 ) ∈ R2 and x, y is the inner product of x, y ∈ R2 . Let H = H(V,Θ) be the Schrödinger operator defined by 2 1  1 ∂ + Θ (x) + V(x). H= α 2 α=1 i ∂xα 2

The Hamiltonian of the corresponding classical mechanics is 1 α (p + Θα (q))2 + V(q) 2 α=1 2

H(p, q) =

(p, q ∈ R2 ).

The classical path (p(s), q(s)) (s  0) satisfies the Hamilton equation q˙ α (s) =

∂H (p(s), q(s)), ∂pα

p˙ α (s) = −

∂H (p(s), q(s)) ∂qα

(α = 1, 2).

The corresponding Lagrangian is given by 1 12 1 1 ((q˙ ) + (q˙ 2 )2 ) + γ(q1 q˙ 2 − q2 q˙ 1 ) − Kq, q 2 2 2 and the classical path also satisfies the Lagrange equation  ∂L d  ∂L (q(s), q(s)) ˙ − α (q(s), q(s)) ˙ =0 (α = 1, 2). ds ∂q˙ α ∂q L(q, q) ˙ =

Moreover, for each path φ = {φ(s)}0st , the integral

t ˙ S (φ) = L(φ(s), φ(s)) ds 0

is called the action integral of φ. For classical mechanics, see for example Arnold [3]. For fixed x, y ∈ R2 and t > 0, the classical path φcl = {φcl (s)}0st with φ(0) = x and φ(t) = y is uniquely determined. Denote the action integral of φcl by S cl (t, x, y). Let q(t, x, y) be the propagator of the Schrödinger equation 1 ∂u = Hu. i ∂t Van Vleck showed that it is given by  12 1   ∂2 S cl (t, x, y)  q(t, x, y) = eiS cl (t,x,y) . det 2π ∂xα ∂yβ α,β=1,2

4:36:57, subject to the Cambridge Core terms of use, .008

7.1 Van Vleck’s Result and Quadratic Functionals

299

In order to achieve an analogous expression for the heat equation, we define a formal Lagrangian  L by 1 1 i  L(x, x˙) = | x˙|2 − γ Jx, x˙ + K x, x . 2 2 2 φcl (s)}0st the solution of the Lagrange equation Denote by  φcl = {  ∂ L d  ∂ L ˙ ˙ (φ(s), φ(s)) − α (φ(s), φ(s)) =0 α ds ∂x ∂x

(α = 1, 2)

satisfying φ(0) = x and φ(t) = y. Define its action integral by

t  L( φcl (s),  Scl (t, x, y) = φ˙cl (s)) ds. 0

Note that  φcl is a path in C2 and Scl (t, x, y) ∈ C. Let p(t, x, y) be the fundamental solution of the heat equation ∂u = −Hu. ∂t Let (W, B(W), μ) be the two-dimensional Wiener space and set

1 t h(t)(w) = {(kw1 (s))2 + ( w2 (s))2 } ds, 2 0

1 t 1 s(t)(w) = (w (s) dw2 (s) − w2 (s) dw1 (s)) (w ∈ W). 2 0 Then we have (see Theorem 5.5.9)

eiγs(t)(wx )−h(t)(wx ) δy (w x (t)) dμ, p(t, x, y) = W

where w x = {w x (s)} s0 is given by w x (s) = x + w(s). The following is known ([40, 41]). Theorem 7.1.1 For all x, y ∈ R2 and t > 0, 2 1   ∂2 Scl (t, x, y)   e−S cl (t,x,y) . det 2π ∂xα ∂yβ α,β=1,2 1

p(t, x, y) =

(7.1.1)

(7.1.1) holds for γ ∈ C and for a more general symmetric matrix K, which is not non-negative. However, in general, the corresponding (formal) classical path has conjugate points and (7.1.1) holds before conjugacy. See the above cited references together with the proofs.

4:36:57, subject to the Cambridge Core terms of use, .008

300

The Semiclassical Limit

Example 7.1.2 (stochastic area) For γ ∈ R \ {0}, let H be the Schrödinger operator with a constant magnetic field γ 2 2 1  ∂ γ 1 2 1 ∂ x x . − i − + i H=− 2 ∂x1 2 2 ∂x2 2 As was shown in Theorem 5.8.5, the fundamental solution of the heat equation is given by

eiγs(t)(wx ) δy (w x (t))μ(dw) p(t, x, y) = W  γt    γ iγ Jx,y γ 2 coth . (7.1.2) exp − =e 2 |y − x| 4 2 4π sinh( γt2 ) To derive (7.1.2) from (7.1.1), we consider the formal Lagrangian 1 2 iγ 1 2  ˙ − (q q˙ − q2 q˙ 1 ). L(q, q) ˙ = |q| 2 2 The Lagrange equation is written as q¨ 1 (s) = −i γq˙ 2 (s),

q¨ 2 (s) = i γq˙ 1 (s)

φcl (0) = x,  φcl (t) = y is given by and the classical path { φcl (s)}0st with   γt  1 i (y2 − x2 )   φ1cl (s) = x1 + coth (y1 − x1 ) + sinh(γs) 2 2 2  γt    y1 − x1 i + coth (y2 − x2 ) (cosh(γs) − 1), − 2 2 2  γt   i (y1 − x1 ) 1   + coth φ2cl (s) = x2 + − (y2 − x2 ) sinh(γs) 2 2 2  γt  i y2 − x2  + coth (y1 − x1 ) − (cosh(γs) − 1). 2 2 2 !t L( φcl (s),  φ˙ cl (s)) ds, we obtain (7.1.2). Plugging these representations into 0  Computing the action integrals of the classical paths in the same way as above, we can show the following explicit representations in general. The Lagrange equation in the general setting is q¨ 1 + i γq˙ 2 − k2 q1 = 0

and q¨ 2 − i γq˙ 1 − 2 q2 = 0.

Set     m1 = (k + )2 + γ2 , m2 = (k − )2 + γ2 , m1 + m2 −m1 + m2 , s2 = . s1 = 2 2

4:36:57, subject to the Cambridge Core terms of use, .008

7.1 Van Vleck’s Result and Quadratic Functionals

301

Then the classical paths  φcl (s) are written as a linear combination of e s1 s s2 s and e . Solving the Lagrange equation with the conditions  φcl (0) = x and  φcl (t) = y and substituting the solution in the action integral

t  L( φcl (s),  φ˙cl (s)) ds, 0

we obtain the following explicit expression for the right hand side of (7.1.1). Corollary 7.1.3 ([80]) For x = (x1 , x2 ), y = (y1 , y2 ) ∈ R2 and t > 0, 1 1  k m21 m22  2 exp(−Scl (t, x, y)), p(t, x, y) = 2π K(t) where K(t) and Scl (t, x, y) are given by K(t) = 2k γ2 (cosh(s1 t) cosh(s2 t) − 1) − {γ2 (k2 + 2 ) + (k − )2 } sinh(s1 t) sinh(s2 t), m1 m2 m1 m2 Scl (t, x, y) = α1 (t){(x1 )2 + (y1 )2 } + β1 (t)x1 y1 K(t) K(t) m1 m2 m1 m2 α2 (t){(x2 )2 + (y2 )2 } + β2 (t)x2 y2 + K(t) K(t) i γ(k2 − 2 ) η(t)(x1 x2 − y1 y2 ) + 2K(t) i k m1 m2 γ − {cosh(s1 t) − cosh(s2 t)}(x1 y2 − x2 y1 ) K(t) with α1 (t) = s1 (s22 − k2 ) cosh(s1 t) sinh(s2 t) − s2 (s21 − k2 ) sinh(s1 t) cosh(s2 t), α2 (t) = s1 (s22 − 2 ) cosh(s1 t) sinh(s2 t) − s2 (s21 − 2 ) sinh(s1 t) cosh(s2 t), β1 (t) = s2 (s21 − k2 ) sinh(s1 t) − s1 (s22 − k2 ) sinh(s2 t), β2 (t) = s2 (s21 − 2 ) sinh(s1 t) − s1 (s22 − 2 ) sinh(s2 t), η(t) = 2k {cosh(s1 t) cosh(s2 t) − 1} + (m21 − 2k ) sinh(s1 t) sinh(s2 t). 1

Corollary 7.1.4 For γ, k ∈ R, set m1 = (γ2 + 4k2 ) 2 . Then,

  k2 t exp i γs(t) − |θ(s)|2 ds δy (θ(t)) dμ 2 0 W m1 t m1 t  1  1 2 2 2 exp − |y| = (y ∈ R2 , t > 0), 2πt sinh( m21 t ) 2t tanh( m21 t ) where {θ(t)}t0 is the coordinate process of W.

4:36:57, subject to the Cambridge Core terms of use, .008

302

The Semiclassical Limit

Proof Set k = and x = 0 in Corollary 7.1.3. Then, m2 = |γ|. By the identities *a+ cosh a cosh b − sinh a sinh b = cosh(a − b) and cosh a − 1 = 2 sinh2 , 2 we have * m1 t + K(t) = 4γ2 k2 sinh2 . 2 Hence, ( m21 )2 k2 m21 m22 = . K(t) sinh2 ( m21 t ) Moreover, since s1 (s22 − k2 ) = s2 (s21 − k2 ) = −k2 |γ|, by the identity sinh a cosh b − cosh a sinh b = sinh(a − b), we obtain α1 (t) = α2 (t) = |γ|k2 sinh(m1 t). Thus m t

1 m1 m2 1 2 |y|2 . α1 (t)|y|2 = Scl (t, 0, y) = 2K(t) 2t tanh( m21 t )



By Corollary 7.1.3, we obtain the conclusion.

7.1.1 Soliton Solutions for the KdV Equation In this subsection, applying Corollary 7.1.4, we give an expectation representation of soliton solutions for the Korteweg–de Vries (KdV) equation, which is one of the fundamental non-linear partial differential equations. For p ∈ R, let {ξ p (t)}t0 be a two-dimensional Ornstein–Uhlenbeck process obtained as the unique strong solution of the stochastic differential equation dξ p (t) = dθ(t) + pξ p (t) dt,

ξ p (0) = 0.

Each component of {ξ p (t) = (ξ p,1 (t), ξ p,2 (t))} satisfies dξ p,i (t) = dθi (t) + pξ p,i (t) dt,

ξ p,i (0) = 0

(i = 1, 2).

We assume t > 0 in the following. 1

Theorem 7.1.5 Let γ, C ∈ R and y ∈ R2 and set m1 = (γ2 + 4p2 ) 2 . Then, $ # t

iγ exp

Jξ p (s), dξ p (s) − C|ξ p (t)|2 δy (ξ p (t)) dμ 2 0 W $ # & ' m1 t m1 t * 1 1 p+ 2 2 2 = exp − + C− |y| − pt . (7.1.3) 2πt sinh( m21 t ) 2t tanh( m21 t ) 2 4:36:57, subject to the Cambridge Core terms of use, .008

7.1 Van Vleck’s Result and Quadratic Functionals

303

Moreover, if C  2p , then

#

iγ exp 2 W

t

$

Jξ (s), dξ (s) − C|ξ (t)| dμ p

p

p

2

0

m1 e−pt . m1 cosh( m21 t ) + (4C − 2p) sinh( m21 t )

=

(7.1.4)

We give a remark on the left hand side of (7.1.3). Since

t e−ps dθi (s) (i = 1, 2), ξ p,i (t) = e pt 0

the distribution of ξ (t) is two-dimensional Gaussian with mean 0 ∈ R2 and 2pt covariance matrix e 2p−1 I, where I is the 2 × 2 unit matrix. Therefore, if p

C 0, m1 t + C − 2t tanh( 2 ) 2

integrating (7.1.3) in y over R2 , we obtain

4:36:57, subject to the Cambridge Core terms of use, .008

7.1 Van Vleck’s Result and Quadratic Functionals #

exp W

iγ 2

t

305

$

Jξ p (s), dξ p (s) − C|ξ p (t)|2 dμ

0

=

& # m1 1 2 2 2t sinh( m21 t )

$'−1 * p+ + C− e−pt . 2

m1 t 2 tanh( m21 t )



Thus (7.1.4) holds.

The KdV equation is the non-linear partial differential equation on R2 given by 3 ∂u 1 ∂3 u ∂u (x, t) = u(x, t) (x, t) + (x, t) ∂t 2 ∂x 4 ∂x3

((x, t) ∈ R2 ),

which shows the movement of waves on shallow water surfaces. The solution u(x, t) expresses the height of the wave at time t and position x ∈ R. There has been much research on the solutions of the KdV equation. Among them, the existence of interesting solutions, called the soliton solutions, is well known. We briefly discuss this soliton solution. For details, see [87]. Let S = {{η j , m j } j=1,...,n ; 0 < m1 , . . . , mn , η1 < · · · < ηn , n = 1, 2, . . .}. s = {η j , m j } j=1,...,n ∈ S is called the scattering data of length n. For s = {η j , m j } j=1,...,n ∈ S , the function us defined by * d +2 us (x) = −2 log det(I + Gs (x)) dx

(x ∈ R)

is called the reflectionless potential with scattering data s, where Gs (x) is an n × n matrix given by  √m m e−(ηi +η j )x  i j . Gs (x) = ηi + η j i, j=1,...,n For s = {η j , m j } j=1,...,n ∈ S , the function v(x, t) given by v(x, t) = −us(t) (x) is a solution of the KdV equation, where s(t) = {η j , m j exp(−2η3j t)} j=1,...,n . Such a v is called an n-soliton solution. Next, applying Theorem 7.1.5, we show probabilistic representations for the reflectionless potential and 1-soliton solution by means of the Ornstein– Uhlenbeck process. Proposition 7.1.6 Under the same framework as Theorem 7.1.5, let C ∈ [ 2p , 2p + m41 ) and set

4:36:57, subject to the Cambridge Core terms of use, .008

306

The Semiclassical Limit  i γ x   Q(x) = log exp

Jξ p (s), dξ p (s) − C|ξ p (x)|2 dμ 2 0 W

(x  0).

d 2 Then, on [0, ∞), the function q = 2( dx ) Q coincides with the reflectionless potential with scattering data m m (m − 4C + 2p) 1 1 1 , . 2 m1 + 4C − 2p

Remark 7.1.7 (1) Since reflectionless potentials are analytic on R, they are uniquely determined by the value on [0, ∞). Hence, the probabilistic representation above determines the reflectionless potential completely. (2) In the proposition above, if p  0, then we can take C = 0. In particular, then the function # x $ $ * d +2 # iγ log exp

Jξ p (s), dξ p (s) dμ 2 dx 2 0 W is a reflectionless potential. Proof Define δ  0 by m1 tanh δ = 4C − 2p. By the identity cosh(t + s) = cosh t cosh s + sinh t sinh s, we have

m x m x 1 1 + (4C − 2p) sinh m1 cosh 2 2 m x  m x  1 1 + tanh δ sinh = m1 cosh 2 2 m1 x m1 e( 2 )+δ (1 + e−2δ e−m1 x ). = cosh δ 2 Combining this with Theorem 7.1.5, we obtain * d +2 m1 x m1 − +δ q(x) = −2 log m1 − px − log dx 2 cosh δ 2   − log 1 + e−2δ e−m1 x * d +2   log 1 + e−2δ e−m1 x . = −2 dx

Hence, q is the reflectionless potential with scattering data { m21 , m1 e−2δ }.



Corollary 7.1.8 For p ∈ R and t  0, set # # x iγ exp

Jξ p (y), dξ p (y) V p (x, t) = log 2 0 W & $ $  m3 t ' p m1 1 p 2 + tanh − |ξ (x)| dμ 2 4 8 4:36:57, subject to the Cambridge Core terms of use, .008

7.1 Van Vleck’s Result and Quadratic Functionals

and

307

 ∂ 2 V p (x, t). v p (x, t) = 2 ∂x

Then, v p is a 1-soliton solution of the KdV equation with the initial condition 1

u(x, 0) = −

(γ2 + 4p2 ) 2 .  1  2 cosh2 (γ2 + 4p2 ) 2 x

Proof By Proposition 7.1.6, v p (·, t) is a reflectionless potential with scattering m1 3 data { m21 , m1 e−2( 2 ) t }. Hence, v p is a 1-soliton solution of the KdV equation.  We can present probabilistic representations for the reflectionless potentials and n-soliton solutions with scattering data of length n. In such a study, it is important to have an explicit representation for Wiener integrals corresponding to the function V p (t, x) by using a solution for the Riccati equation. For details, see [44, 118].

7.1.2 Euler Polynomials A sequence of polynomials {pn (x)}∞ n=0 satisfying pn = npn−1

(n = 1, 2, . . .)

is called an Appell sequence. pn (x) = xn (n = 0, 1, . . .) is a typical example. The sequence {n!Hn (x)}∞ n=0 , Hn being the Hermite polynomial, also satisfies this relationship. In this subsection, we are concerned with the Euler polynomial En defined by ∞ 2eζ x ζn = . (7.1.5) E (x) n eζ + 1 n=0 n! The sequence {En }∞ n=0 is also an Appell sequence. The Euler number en is defined by en = 2n En ( 21 )

(n = 0, 1, . . .).

For the Euler polynomials, the Euler numbers, and Lévy’s stochastic area, the following relationship holds. Theorem 7.1.9 For x ∈ R,



n # $ k n i n (1 − 2x)k En (x) = i s(1)n−k dμ, s(1) + |θ(1)|2 dμ × k 4 W W k=0 (7.1.6) 4:36:57, subject to the Cambridge Core terms of use, .008

308

The Semiclassical Limit ⎧ ⎪ ⎪ ⎨ 0 s(1) dμ = ⎪ ⎪ ⎩(−1) n2 2−n en W

(if n is odd),

n

(7.1.7)

(if n is even).

For a proof, we give a lemma. Lemma 7.1.10 For a, β ∈ R with |a|  12 , −aβ < 1, &

 β 1  β '−1 1 iβ{s(1)+i a2 |θ(1)|2 } + a e2 + − a e− 2 e dμ = . 2 2 W aβ

Proof Since −aβ < 12 , e− 2 |θ(1)| ∈ D∞,∞− . Moreover, both the real and imaginary parts of eiβs(1) belong to D∞,∞− . Hence, by Theorem 5.4.11,

  a 2 iβ{s(1)+i a2 |θ(1)|2 } e dμ = eiβ{s(1)+i 2 |θ(1)| } δy (θ(1)) dμ dy 2 W

R W   aβ 2 = e− 2 |y| eiβs(1) δy (θ(1)) dμ dy. 2

R2

W

By Corollary 7.1.4 (or Theorem 5.8.5), the right hand side coincides with #

β β β  β  |y|2 $ 1 2 2 dy. cosh exp − + 2a sinh 2π sinh( β ) R2 2 2 2 sinh( β ) 2

Since

2

 β 1  β β β 1 + a e2 + − a e− 2 > 0, cosh + 2a sinh = 2 2 2 2

computing the Gaussian integral yields the conclusion.



Proof of Theorem 7.1.9 First we show (7.1.6). Rewrite (7.1.5) as ∞

ζ n e− 2 ζ(1−2x) . En (x) = n! cosh( 2ζ ) n=0 1

(7.1.8)

Set a = 12 , β = ζ(1 − 2x) and let ζ(x − 12 ) < 1. Then, applying Lemma 7.1.10, we have

  i − 12 ζ(1−2x) e = exp i ζ(1 − 2x) s(1) + |θ(1)|2 dμ. 4 W Plugging this and the identity

W

ei ζs(1) dμ =

1 cosh( 2ζ )

,

which is obtained in Theorem 5.8.4, into (7.1.8), we obtain

4:36:57, subject to the Cambridge Core terms of use, .008

7.2 Asymptotic Distribution of Eigenvalues

309

∞   En (x) n i 2 ζ = exp i ζ(1 − 2x) s(1) + |θ(1)| dμ × ei ζs(1) dμ. n! 4 W W n=0 Considering the series expansion in ζ of the right hand side and comparing the coefficients of ζ n (n ∈ Z+ ), we obtain (7.1.6). Second, we show (7.1.7). Since  θ = (θ2 , θ1 ) is also a Brownian motion under μ, s(1) is identical in law with −s(1). In particular,

n s(1) dμ = (−s(1))n dμ. W

Hence, if n is odd, then

W

s(1)n dμ = 0. W

When n is even, let x =

1 2



in (7.1.6).

7.2 Asymptotic Distribution of Eigenvalues Let V be a real-valued function on Rd satisfying (V.1) V is continuous and bounded from below. (For simplicity, we assume that V  0.) Then, the Schrödinger operator 1 H =− Δ+V 2 is an essentially self-adjoint operator on C0∞ (Rd ). Denote also by H the corresponding self-adjoint operator on L2 (Rd ) (see [58, 133]). By virtue of detailed study on semigroups generated by Schrödinger operators, a part of which is presented in Chapter 3, it is known that H generates a semigroup {e−tH }t0 on L2 (Rd ) and the heat kernel p(t, x, y) is continuous in (t, x, y) under additional assumptions, which will be described below. We utilize these fundamental facts in functional analysis. See, for example, [108]. We work with these semigroups and the heat kernel in this section. Under the assumption that V(x) → ∞ as |x| → ∞ and the spectrum of H consists only of eigenvalues with finite multiplicity (discrete spectrum) {λn }∞ n=1 , we shall investigate the asymptotic behavior of N(λ), the number of λn s with λn < λ, as λ → ∞. If the Tauberian theorem (Theorem A.6.4) is applicable, it suffices to show the asymptotic behavior of the trace tr(e−tH ) of the semigroup {e−tH }t0 as t ↓ 0. We show this asymptotic behavior as t ↓ 0 via analysis based on the Wiener integral (the Feynman–Kac formula).

4:36:57, subject to the Cambridge Core terms of use, .008

310

The Semiclassical Limit

This problem is an analogue of Weyl’s theorem on the Laplacian in bounded domains ([124]). As was mentioned at the beginning of this chapter, Kac [54] pointed out that a Brownian motion is useful in solving such questions. See also [50, Section 7.6]. The result in this section was first shown by Ray [96]. In the rest of this section we assume the following conditions on V. (V.2) There exist an α > 0 and a slowly increasing function F (see Section A.6) such that vol({x ∈ Rd ; V(x) < λ}) = 1, lim λ→∞ λα F(λ) where vol is the Lebesgue measure on the corresponding Euclidean space. (V.3) There exists a δ > 0 such that   vol x ∈ Rd ; max|y| 0, we have tr(e−tH )  (2πt)− 2 d

Next, set

t

eV (t) =

Rd

e−tV(x) dx.

(7.2.1)

V(x + B(s)) ds,

0

and rewrite as p(t, x, x) = (2πt)

− d2

 E 0



 e−u 1{eV (t)u} du  B(t) = 0 .

Let M(t) = max0st |B(s)|. For δ > 0,

∞ − d2 e−u E[1{eV (t)u} 1{M(t) 0.  We show Iλ2 (t) → 0 by using (7.3.1). Let x ∈ [0, 1)d \ ( m =1 O ,δ ), that is, suppose that |x − a | > δ for every = 1, . . . , m. Set hε = 1{max0uε2 t |θ(u)| 2δ } and, using the conditional expectation, rewrite pλ (t, x, x) as (2) pλ (t, x, x) = p(1) λ (t, x) + pλ (t, x),   ! ε2 t − 14 0 V(x+θ(u))du  2 − d2 ε  θ(ε2 t) = 0 , (t, x) = (2πε t) E h e p(1) ε λ   ! ε2 t (2) − 14 0 V(x+θ(u))du  2 − d2 ε  θ(ε2 t) = 0 . pλ (t, x) = (2πε t) E (1 − hε )e

If max0uε2 t |θ(u)|  2δ , then |x + θ(u) − a | > 2δ for any u ∈ [0, ε2 t] and there exists a positive constant C = C(δ) such that V(x + θ(u))  C. Hence, we have 2 − 2 − ε2 p(1) λ (t, x)  (2πε t) e d

Ct

(t > 0).

Next, denote by μ(A | θ(ε t) = 0) the conditional probability of A ∈ B(W) given θ(ε2 t) = 0. By Corollary 3.1.8,    δ μ max |θ(u)| >  θ(ε2 t) = 0 2 0uε2 t    δ2 δ   d · μ max |θ1 (u)| > √  θ(ε2 t) = 0 = de− 2ε2 dt . 0uε2 t 2 d 2

Hence, by the non-negativity of V, we obtain δ2

2 − 2 − 2ε2 dt p(2) λ (t, x)  d(2πε t) e d

(t > 0).  Thus, pλ (t, x, x) → 0 uniformly in x ∈ [0, 1)d \ ( O ,δ ) and Iλ2 (t) → 0 as λ → ∞ (i.e., ε → 0).

4:36:57, subject to the Cambridge Core terms of use, .008

316

The Semiclassical Limit

We now turn to the main term Iλ1 (t). Fix = 1, . . . , m. By (7.3.2), we have

!t 1 pλ (t, x, x) dx = dx e− ε2 0 V(a +x+εθ(s))ds ε−n δ0 (θ(t)) dμ O ,δ |x| 0, we divide this into two terms,

!t 1 dy e− ε2 0 V(a +ε(y+θ(s)))ds δ0 (θ(t)) dμ J1 (N) = |y|< εδ ,|y|N

W

and

J2 (N) =

|y|< εδ ,|y|>N

e− ε2 1

dy

!t 0

V(a +ε(y+θ(s)))ds

δ0 (θ(t)) dμ.

W

For η > 0, set  hη = 1{max0ut |εθ(u)|N

and

J2(2) (N) =

|y|< εδ ,|y|>N

  !t 1 d (2πt)− 2 E (1 −  hη )e− ε2 0 V(a +ε(y+θ(s)))ds  θ(t) = 0 dy.

By Corollary 3.1.8, η2

E[1 −  hη |θ(t) = 0]  d(2πt)− 2 e− 2ε2 dt . d

Then, by the non-negativity of V,  d δ  η2 J2(2) (N)  d(2πt)− 2 vol |y| < e− 2ε2 dt . ε

(7.3.4)

Hence, J2(2) (N) → 0 as ε → 0. By the condition (V), there exists a constant C1 such that V(a + ξ)  C1 |ξ|2 if |ξ| is sufficiently small. Hence, there exists a constant C, depending only on η, δ, and the Hessian (Vi j (a ))i, j=1,2,...,d , such that

  !t (1) − d2 −C 0 |y+θ(s)|2 ds   J2 (N)  (2π) E hη e  θ(t) = 0 dy



|y|< εδ ,|y|>N

|y|>N

  !t d 2 (2πt)− 2 E e−C 0 |y+θ(s)| ds  θ(t) = 0 dy

4:36:57, subject to the Cambridge Core terms of use, .008

7.3 Semiclassical Approximation of Eigenvalues

317

for sufficiently small η > 0. Hence, by Theorem 5.8.2,

  d2 √ √ 2Ct 2 2C (1) sup J2 (N)  e− 2C tanh( 2 )y dy √ ε>0 |y|>N π sinh( 2Ct)  1 2 (N → ∞). = O e−N N Combining this with (7.3.4), we obtain 1  2 lim sup J2 (N) = O e−N . N ε↓0 Set h (t) =

t d 1 Vi j (a ) (yi + θi (s))(y j + θ j (s)) ds 2 i, j=1 0

and decompose J1 (N) as J1 (N) = J1(1) (N) + J1(2) (N) + J1(3) (N), where

J1(1) (N) =

J1(2) (N) =

d

|y|< εδ ∧N

=

(2πt)− 2 d

|y|< εδ ∧N

J1(3) (N)

(2πt)− 2 E[ hη e−h (t) |θ(t) = 0] dy,

  1 !t  ×E hη e− ε2 0 V(a +ε(y+θ(s)))ds − e−h (t)  θ(t) = 0 dy, (2πt)− 2 d

|y|< εδ ∧N



× E (1 −  hη )e− ε2 1

!t 0

  θ(t) = 0 dy.

V(a +ε(y+θ(s)))ds 

For J1(3) (N), we can show in the same way as above  d η2 d δ J1(3) (N)  (2πt)− 2 e− ε2 . ε If max0st |εθ(s)| < η and |εy| < δ, then there exists a constant C2 , depending only on η and δ, such that e−ε

−2

!t 0

V(a +ε(y+θ(s)))ds

− e−h  C2 ε.

Hence, we obtain J1(2) (N)  C2 ε(2πt)− 2 vol(|y| < N). d

Thus lim supε↓0 J1(i) (N) = 0 (i = 2, 3).

4:36:57, subject to the Cambridge Core terms of use, .008

318

The Semiclassical Limit

Let r (t, x, y) (t > 0, x, y ∈ Rd ) be the heat kernel for L . Then, by Lebesgue’s convergence theorem, we have

d (2πt)− 2 E[e−h |θ(t) = 0] dy lim J1(1) (N) = ε→0 |y| 0). γa = 0 a−1 Using the Brownian motion Z on H2 and following the argument in [85], we show Selberg’s trace formula ([103]) for the Laplace–Beltrami operator on M. To present the formula, we need the explicit form of the value p(t, z, z) (t > 0, z ∈ H2 ) on the diagonal for the transition density of Z or the heat kernel of L. We prove the following explicit form of p(t, z, z ) in the next section.

4:36:57, subject to the Cambridge Core terms of use, .008

320

The Semiclassical Limit

Theorem 7.4.1 The transition density $p(t,z,z')$ $(t>0,\ z,z'\in\mathbb H^2)$ of $Z$ with respect to the volume element $dv(z) = y^{-2}\,dxdy$ on $\mathbb H^2$ is given by
$$p(t,z,z') = \frac{\sqrt2\,e^{-\frac t8}}{(2\pi t)^{\frac32}}\int_r^\infty \frac{b\,e^{-\frac{b^2}{2t}}}{(\cosh b-\cosh r)^{\frac12}}\,db, \qquad (7.4.3)$$
where $r = d(z,z')$.

Identify the fundamental group $\pi_1(M)$ of $M$ with a discrete subgroup of $PSL_2(\mathbb R)$ and denote it by $\Gamma$. Let $\Delta_M$ be the restriction of $\Delta$ to the space of smooth functions $f$ which are periodic with respect to the action of $\Gamma$, that is, $f(\gamma z) = f(z)$ for any $\gamma\in\Gamma$, $z\in\mathbb H^2$. Moreover, let $D$ be the fundamental domain of $\Gamma$, a connected subset of $\mathbb H^2$ satisfying
$$\bigcup_{\gamma\in\Gamma}\gamma D = \mathbb H^2 \quad\text{and}\quad \gamma D\cap\gamma' D = \emptyset\ \ (\gamma\ne\gamma').$$
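Formula (7.4.3) lends itself to a numerical sanity check: Brownian motion on $\mathbb H^2$ is conservative, so integrating $p(t,z,z')$ over $\mathbb H^2$ must return 1. The sketch below is ours; it uses the standard fact that in geodesic polar coordinates the volume element is $dv = \sinh r\,dr\,d\varphi$, and the truncations are heuristic.

```python
import numpy as np
from scipy.integrate import quad

def p(t, r):
    """Heat kernel on H^2 as a function of the distance r, formula (7.4.3)."""
    pref = np.sqrt(2.0) * np.exp(-t / 8) / (2 * np.pi * t) ** 1.5
    f = lambda b: b * np.exp(-b * b / (2 * t)) / np.sqrt(np.cosh(b) - np.cosh(r))
    # split at r+1 so QAGS resolves the integrable (b-r)^{-1/2} endpoint singularity
    return pref * (quad(f, r, r + 1.0)[0] + quad(f, r + 1.0, r + 60.0)[0])

t = 2.0
total = quad(lambda r: p(t, r) * 2 * np.pi * np.sinh(r), 0.0, 40.0, limit=200)[0]
print(total)   # close to 1: stochastic completeness of H^2
```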

It is known that $\Delta_M$ is essentially self-adjoint and extends to a unique self-adjoint operator on $L^2(\mathbb H^2, dv)$, denoted again by $\Delta_M$. $\Delta_M$ may be regarded as a self-adjoint operator on $L^2(D,dv)$ and its spectrum consists only of eigenvalues $\{-\lambda_k\}_{k=1}^\infty$ with finite multiplicities. For $\gamma\in\Gamma$, set
$$\ell(\gamma) = \inf_{z\in\mathbb H^2} d(z,\gamma z).$$
$\ell(\gamma)$ is the length of the closed geodesic on $M$ associated with $\gamma$ and, if $\gamma$ is conjugate to $\gamma_a$ $(a>0)$, satisfies
$$\ell(\gamma) = d(z, a^2 z) = \log a^2$$
for $z$ on the imaginary axis. Moreover, for $n\in\mathbb N$, $\ell(\gamma^n) = n\,\ell(\gamma)$. We call $\gamma\in\Gamma$ primitive if it is not represented as a positive power of another element of $\Gamma$. Let $\Gamma_0$ be a set of primitive elements in $\Gamma$ whose elements are not conjugate to each other.

Theorem 7.4.2 For all $t>0$,
$$\mathrm{tr}\bigl(e^{\frac t2\Delta_M}\bigr) = \sum_{k=1}^\infty e^{-\frac{\lambda_k t}{2}} = \mathrm{vol}(M)\,\frac{e^{-\frac t8}}{(2\pi t)^{\frac32}}\int_0^\infty \frac{b\,e^{-\frac{b^2}{2t}}}{\sinh(\frac b2)}\,db + \sum_{\gamma\in\Gamma_0}\sum_{n=1}^\infty \frac{\ell(\gamma)}{2\sqrt{2\pi t}\,\sinh\bigl(\frac{n\ell(\gamma)}2\bigr)}\,e^{-\frac t8}\,e^{-\frac{n^2\ell(\gamma)^2}{2t}}.$$


Proof We give a sketch of the proof. Let $\delta^{(2)}_z$ be the Dirac measure concentrated at $z\in\mathbb H^2$ with respect to the Lebesgue measure. Then the heat kernel $p(t,z,z')$ for $L$ is written as
$$p(t,z,z') = (y')^2\int_W \delta^{(2)}_{z'}(Z(t))\,d\mu.$$
Set $(\widetilde\theta^1(s),\widetilde\theta^2(s)) = (\theta^1(s),\theta^2(s)-\frac s2)$. By the Cameron–Martin theorem (Theorem 1.7.2, Theorem 4.6.2), $\widetilde\theta = \{(\widetilde\theta^1(s),\widetilde\theta^2(s))\}_{0\le s\le t}$ is a two-dimensional Brownian motion under the probability measure $\widetilde\mu$ on $(W,\mathcal B(W))$ given by
$$d\widetilde\mu\big|_{\mathcal B_t} = e^{\frac12\theta^2(t)-\frac t8}\,d\mu\big|_{\mathcal B_t}.$$
Since $\delta_z(\alpha w) = \alpha^{-2}\delta_{\frac z\alpha}(w)$ $(\alpha>0)$, we have
$$p(t,z,z') = (y')^2\int_W e^{-\frac12\widetilde\theta^2(t)-\frac t8}\,\delta^{(2)}_{z'}\bigl(x+y\beta(t),\,y\,e^{\widetilde\theta^2(t)}\bigr)\,d\widetilde\mu = e^{-\frac t8}\Bigl(\frac{y'}{y}\Bigr)^{\frac32}\int_W \delta^{(2)}_{(\frac{x'-x}{y},\frac{y'}{y})}\bigl(\beta(t),\,e^{\widetilde\theta^2(t)}\bigr)\,d\widetilde\mu,$$
where $\beta(t)$ is given by
$$\beta(t) = \int_0^t e^{\widetilde\theta^2(s)}\,d\widetilde\theta^1(s).$$
The conditional probability distribution of $\beta(t)$ given $\widetilde\theta^2 = \{\widetilde\theta^2(s)\}_{0\le s\le t}$ is the Gaussian distribution with mean $0$ and variance
$$\widetilde A(t) = \int_0^t e^{2\widetilde\theta^2(s)}\,ds.$$
Taking advantage of the conditional expectation,¹ we obtain
$$p(t,z,z') = e^{-\frac t8}\Bigl(\frac{y'}{y}\Bigr)^{\frac32}\int_W d\widetilde\mu\int_{\mathbb R}\frac{1}{\sqrt{2\pi\widetilde A(t)}}\,e^{-\frac{\xi^2}{2\widetilde A(t)}}\,\delta^{(2)}_{(\frac{x'-x}{y},\frac{y'}{y})}\bigl(\xi,\,e^{\widetilde\theta^2(t)}\bigr)\,d\xi$$
$$= e^{-\frac t8}\Bigl(\frac{y'}{y}\Bigr)^{\frac32}\int_W \frac{1}{\sqrt{2\pi\widetilde A(t)}}\,e^{-\frac{(x'-x)^2}{2y^2\widetilde A(t)}}\,\delta_{\frac{y'}{y}}\bigl(e^{\widetilde\theta^2(t)}\bigr)\,d\widetilde\mu$$
$$= e^{-\frac t8}\Bigl(\frac{y'}{y}\Bigr)^{\frac12}\int_W \frac{1}{\sqrt{2\pi\widetilde A(t)}}\,e^{-\frac{(x'-x)^2}{2y^2\widetilde A(t)}}\,\delta_{\log(\frac{y'}{y})}\bigl(\widetilde\theta^2(t)\bigr)\,d\widetilde\mu,\qquad (7.4.4)$$

¹ While we carry out a formal computation on the Dirac measure, we can justify the argument by considering the expectation $E[f_1(X(t))f_2(Y(t))]$ for bounded continuous functions $f_1$ and $f_2$.


where $\delta_a$ is the Dirac measure concentrated at $a\in\mathbb R$ and, in the last equality, we have used $\delta_a(e^x) = a^{-1}\delta_{\log a}(x)$ $(a>0)$.

The explicit form of the probability density of the distribution of $(\widetilde A(t),\widetilde\theta^2(t))$ for $t>0$ is known ([132]), which, in conjunction with (7.4.4), leads us to Theorem 7.4.1. In the next section, we first study the distribution, and then we shall prove Theorem 7.4.1.

We continue the sketch of the proof of Theorem 7.4.2. For the details of the following argument, see [85]. For $\sigma\in\Gamma$, $\sigma\ne I$, there exist $n\in\mathbb N$, $\kappa\in\Gamma$, and $\gamma\in\Gamma_0$ such that $\sigma = \kappa^{-1}\gamma^n\kappa$. $\gamma$, $n$ and the coset $[\kappa]\in\Gamma/\Gamma_\gamma$ containing $\kappa$ are uniquely determined by $\sigma$, where $\Gamma_\gamma$ is the centralizer of $\gamma$. Hence, we obtain
$$\mathrm{tr}\bigl(e^{\frac t2\Delta_M}\bigr) = \int_D p(t,z,z)\,dv(z) + \sum_{\sigma\ne I}\int_D p(t,z,\sigma z)\,dv(z) = \mathrm{vol}(M)\,p(t,z,z) + \sum_{\gamma\in\Gamma_0}\sum_{n=1}^\infty\sum_{[\kappa]\in\Gamma/\Gamma_\gamma} I(\gamma,n,[\kappa]),$$
where $I(\gamma,n,[\kappa])$ is given by
$$I(\gamma,n,[\kappa]) = \int_D p(t,z,\kappa^{-1}\gamma^n\kappa z)\,dv(z).$$
Recall now that $p(t,z,z)$ does not depend on $z$ and, by Theorem 7.4.1, is given by
$$p(t,z,z) = \frac{e^{-\frac t8}}{(2\pi t)^{\frac32}}\int_0^\infty \frac{b\,e^{-\frac{b^2}{2t}}}{\sinh(\frac b2)}\,db.$$
For $\gamma\in\Gamma_0$, take $a = a(\gamma) > 1$ and $\tau\in PSL_2(\mathbb R)$ so that $\gamma = \tau^{-1}\gamma_a\tau$ and set $D_\gamma = \bigcup_{[\kappa]\in\Gamma/\Gamma_\gamma}\kappa D$. $D_\gamma$ is a fundamental domain of $\Gamma_\gamma$ and
$$\bigcup_{m=-\infty}^\infty \gamma_a^m\,\tau D_\gamma = \tau\bigcup_{m=-\infty}^\infty \gamma^m D_\gamma = \mathbb H^2.$$
Thus, $\tau D_\gamma$ is a fundamental domain of the cyclic group $\{(\gamma_a)^m\}_{m=-\infty}^\infty$ and we may assume that $\tau D_\gamma = \{(x,y)\in\mathbb H^2;\ 1<y\le a^2\}$. Moreover, as was mentioned above, $\ell(\gamma) = \ell(\gamma_a) = \log a^2$. Combining the arguments above, we obtain, for fixed $\gamma\in\Gamma_0$ and $n\in\mathbb N$,
$$\sum_{[\kappa]\in\Gamma/\Gamma_\gamma} I(\gamma,n,[\kappa]) = \int_{D_\gamma} p(t,z,\gamma^n z)\,dv(z).$$


Moreover, since $\tau\in PSL_2(\mathbb R)$ is an isometry of $\mathbb H^2$ and
$$p(t,z,\gamma^n z) = p(t,z,\tau^{-1}\gamma_a^n\tau z) = p(t,\tau z,\gamma_a^n\tau z)$$
holds, we have
$$\sum_{[\kappa]\in\Gamma/\Gamma_\gamma} I(\gamma,n,[\kappa]) = \int_{\tau D_\gamma} p(t,z,\gamma_a^n z)\,dv(z) = \int_1^{a^2}\frac{dy}{y^2}\int_{\mathbb R} p(t,z,a^{2n}z)\,dx.$$
Using the expression (7.4.4) of the heat kernel, we obtain²
$$\int_1^{a^2}\frac{dy}{y^2}\int_{\mathbb R} p(t,z,a^{2n}z)\,dx = e^{-\frac t8}a^n\int_1^{a^2}\frac{dy}{y^2}\int_{\mathbb R}dx\int_W \frac{1}{\sqrt{2\pi\widetilde A(t)}}\,e^{-\frac{(a^{2n}-1)^2x^2}{2y^2\widetilde A(t)}}\,\delta_{\log(a^{2n})}\bigl(\widetilde\theta^2(t)\bigr)\,d\widetilde\mu$$
$$= e^{-\frac t8}a^n\int_1^{a^2}\frac{dy}{y^2}\,\frac{y}{a^{2n}-1}\int_W \delta_{\log(a^{2n})}\bigl(\widetilde\theta^2(t)\bigr)\,d\widetilde\mu = \frac{\ell(\gamma)}{2\sqrt{2\pi t}\,\sinh\bigl(\frac{n\ell(\gamma)}{2}\bigr)}\,e^{-\frac t8}\,e^{-\frac{n^2\ell(\gamma)^2}{2t}},$$
where, to see the last identity, we have used
$$\int_W \delta_x\bigl(\widetilde\theta^2(t)\bigr)\,d\widetilde\mu = \frac{1}{\sqrt{2\pi t}}\,e^{-\frac{x^2}{2t}}. \qquad\square$$

² It may seem that we apply Fubini's theorem only formally, since the δ-function is used in the expression of the heat kernel. In fact, we can apply it because $\{\widetilde\theta^2(t)\}_{t\ge0}$ is a one-dimensional Brownian motion under $\widetilde\mu$ and the generalized Wiener functionals determined by δ-functions are measures on the Wiener space (Theorem 5.4.15).
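The key ingredient of the computation above is the pair $(\beta(t),\widetilde A(t))$: given $\widetilde\theta^2$, $\beta(t)$ is centered Gaussian with variance $\widetilde A(t)$, so in particular $E[\beta(t)^2] = E[\widetilde A(t)] = (e^{2t}-1)/2$. A quick Euler simulation, entirely our own illustration, confirms this:

```python
import numpy as np

rng = np.random.default_rng(1)
t, n, paths = 1.0, 500, 100000
dt = t / n
th2 = np.zeros(paths)      # tilde-theta^2, a standard one-dimensional BM
beta = np.zeros(paths)     # beta(t) = int_0^t e^{th2(s)} d th1(s)
A = np.zeros(paths)        # tilde-A(t) = int_0^t e^{2 th2(s)} ds
for _ in range(n):
    d1 = rng.normal(0.0, np.sqrt(dt), paths)
    d2 = rng.normal(0.0, np.sqrt(dt), paths)
    beta += np.exp(th2) * d1        # Ito (left-point) increments
    A += np.exp(2 * th2) * dt
    th2 += d2
print((beta**2).mean(), A.mean(), (np.exp(2 * t) - 1) / 2)   # all three agree
```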

A probabilistic and simple proof as above extends to Selberg's trace formula for the Maass Laplacian (see [33]), a generalization of $\Delta$ given by
$$y^2\Bigl(\frac{\partial}{\partial x} + i\,\frac ky\Bigr)^2 + y^2\frac{\partial^2}{\partial y^2}\qquad(k\in\mathbb R).$$
For details, see [42].

7.5 Integral of Geometric Brownian Motion and Heat Kernel on H2 To prove Selberg’s trace formula, evaluating the diagonal of the heat kernel on H2 was indispensable. In this section we prove Theorem 7.4.1 which gives 2

It may seem that we formally apply Fubini’s theorem since the δ-function is used for the expression of the heat kernel. In fact, we can apply it because { θ2 (t)}t0 is a one-dimensional Brownian motion under  μ and the generalized Wiener functionals determined by δ-functions are measures on the Wiener space (Theorem 5.4.15).

4:36:57, subject to the Cambridge Core terms of use, .008

324

The Semiclassical Limit

the explicit form of the heat kernel. We need some formulae for the modified Bessel functions, Legendre functions and so on. For these, see, for example, [31, 67]. Let {B(t)}t0 be a one-dimensional Brownian motion with B(0) = 0 defined on a probability space (Ω, F , P) and A(t) be a functional given by

t e2B(s) ds. A(t) = 0

In order to derive the explicit form of the heat kernel from the expression (7.4.4), it is necessary to know the explicit form of the probability density of the distribution of $(A(t),B(t))$ for fixed $t>0$. This was done by Yor [130, 132].

For $\mu\in\mathbb R$, the function
$$I_\mu(z) = \sum_{n=0}^\infty \frac{1}{n!\,\Gamma(\mu+n+1)}\Bigl(\frac z2\Bigr)^{2n+\mu}\qquad(z\in\mathbb C\setminus(-\infty,0])$$
is called the modified Bessel function of the first kind of order $\mu$. When $z = v > 0$, the function $[0,\infty)\ni\lambda\mapsto I_{\sqrt{2\lambda}}(v)$ is completely monotone. Set
$$\Theta(v,t) = \frac{v}{(2\pi^3t)^{\frac12}}\int_0^\infty e^{\frac{\pi^2-\xi^2}{2t}}\,e^{-v\cosh\xi}\,\sinh(\xi)\,\sin\Bigl(\frac{\pi\xi}{t}\Bigr)\,d\xi.\qquad (7.5.1)$$
Then it is known that ([130, 132])
$$I_{\sqrt{2\lambda}}(v) = \int_0^\infty e^{-\lambda t}\,\Theta(v,t)\,dt.\qquad (7.5.2)$$

The function $K_\mu$ $(\mu\in\mathbb R)$ defined by
$$K_\mu(z) = \begin{cases} \dfrac{\pi}{2}\,\dfrac{I_{-\mu}(z)-I_\mu(z)}{\sin\mu\pi} & (\mu\notin\mathbb Z),\\[2mm] \dfrac{(-1)^n}{2}\Bigl(\dfrac{\partial I_{-\mu}(z)}{\partial\mu}-\dfrac{\partial I_\mu(z)}{\partial\mu}\Bigr)\Big|_{\mu=n} & (\mu=n\in\mathbb Z)\end{cases}$$
is called the modified Bessel function of the second kind, or the Macdonald function, of order $\mu$. For $\mu\ge0$, $I_\mu$ and $K_\mu$ are monotone increasing and decreasing functions on $(0,\infty)$, respectively. They satisfy the modified Bessel equation
$$\frac{d^2u}{dz^2} + \frac1z\,\frac{du}{dz} - \Bigl(1+\frac{\mu^2}{z^2}\Bigr)u = 0.$$

Theorem 7.5.1 For $t>0$, the distribution of $(A(t),B(t))$ has a smooth density $a_t(u,b)$ at $(u,b)$ $(u>0,\ b\in\mathbb R)$ given by
$$a_t(u,b) = \frac1u\exp\Bigl(-\frac{1+e^{2b}}{2u}\Bigr)\,\Theta\Bigl(\frac{e^b}{u},t\Bigr).\qquad (7.5.3)$$
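Before turning to the proof, (7.5.3) can be cross-checked numerically: integrating $a_t(u,b)$ in $u$ must return the $N(0,t)$ density of $B(t)$. At $t=1$ the oscillatory integral (7.5.1) is numerically tame, so nested quadrature suffices. This is a sketch of ours with illustrative parameters and heuristic truncations:

```python
import numpy as np
from scipy.integrate import quad

def Theta(v, t):
    """The function (7.5.1)."""
    f = lambda x: np.exp((np.pi**2 - x*x)/(2*t) - v*np.cosh(x)) \
                  * np.sinh(x) * np.sin(np.pi * x / t)
    return v / np.sqrt(2 * np.pi**3 * t) * quad(f, 0.0, 30.0, limit=300)[0]

def a(t, u, b):
    """Joint density (7.5.3) of (A(t), B(t))."""
    return np.exp(-(1 + np.exp(2*b)) / (2*u)) * Theta(np.exp(b)/u, t) / u

t, b = 1.0, 0.5
marginal = quad(lambda u: a(t, u, b), 1e-4, 200.0, limit=400)[0]
print(marginal, np.exp(-b*b/(2*t)) / np.sqrt(2*np.pi*t))   # ~ N(0,t) density at b
```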


Proof $\{(A(t),B(t))\}_{t\ge0}$ is a diffusion process on $\mathbb R^2$ generated by
$$\frac12\frac{\partial^2}{\partial y^2} + e^{2y}\frac{\partial}{\partial x}.$$
Since
$$\Bigl[\frac{\partial}{\partial y},\,e^{2y}\frac{\partial}{\partial x}\Bigr] = 2e^{2y}\frac{\partial}{\partial x},$$
it satisfies Hörmander's condition. Therefore, by Theorems 5.4.11 and 5.5.4, $(A(t),B(t))$ has a density of $C^\infty$-class. We denote it by $a_t(u,b)$ and show that it has the expression (7.5.3). For this purpose, for $\lambda>0$, set
$$q_\lambda(t,x,y) = E\bigl[e^{-\frac12\lambda^2e^{2x}A(t)}\,\big|\,x+B(t)=y\bigr]\,\frac{1}{\sqrt{2\pi t}}\,e^{-\frac{(y-x)^2}{2t}} = \int_0^\infty e^{-\frac12\lambda^2e^{2x}u}\,a_t(u,y-x)\,du.$$
$q_\lambda(t,x,y)$ is the heat kernel (with respect to the Lebesgue measure) for the Schrödinger operator on $\mathbb R$:
$$H_\lambda = -\frac12\frac{d^2}{dx^2} + \frac12\lambda^2e^{2x}.$$
That is, the function $u$ given by
$$u(t,x) = \int_{\mathbb R} q_\lambda(t,x,y)\,f(y)\,dy$$
for $f\in L^2(\mathbb R)$ satisfies the heat equation
$$\frac{\partial u}{\partial t} = -H_\lambda u,\qquad \lim_{t\downarrow0}u(t,x) = f(x).$$
The Laplace transform of the heat kernel in $t$,
$$G_\lambda(x,y;\alpha) := \int_0^\infty e^{-\alpha t}\,q_\lambda(t,x,y)\,dt\qquad(\alpha>0),$$
is called the Green function with respect to the Lebesgue measure.

Now consider the equation $-H_\lambda u = \alpha u$. Then, there exist solutions $u_1(x;\alpha)$ and $u_2(x;\alpha)$, which are monotone increasing and decreasing in $x$, respectively, such that
$$\lim_{x\to-\infty}u_1(x;\alpha) = 0\quad\text{and}\quad \lim_{x\to\infty}u_2(x;\alpha) = 0.$$
Such solutions are unique up to multiplicative constants. It is known in the general theory of one-dimensional diffusion processes (see, for example, [50, 56]) that the function $\widetilde G_\lambda(x,y;\alpha) = \frac{1}{W(u_1,u_2)}u_1(x;\alpha)u_2(y;\alpha)$ $(x\le y)$ is the Green function with respect to the speed measure $m(dy) = 2dy$ (Section 4.8), where $W(u_1,u_2)$ is the Wronskian given by $W(u_1,u_2) = u_1'(x)u_2(x)-u_1(x)u_2'(x)$. Hence,
$$G_\lambda(x,y;\alpha) = \frac{2}{W(u_1,u_2)}\,u_1(x;\alpha)\,u_2(y;\alpha).$$
We can take
$$u_1(x;\alpha) = I_{\sqrt{2\alpha}}(\lambda e^x)\quad\text{and}\quad u_2(x;\alpha) = K_{\sqrt{2\alpha}}(\lambda e^x).$$
By the formula $I_\mu'(z)K_\mu(z)-I_\mu(z)K_\mu'(z) = \frac1z$, we have
$$G_\lambda(x,y;\alpha) = 2\,I_{\sqrt{2\alpha}}(\lambda e^x)\,K_{\sqrt{2\alpha}}(\lambda e^y)\qquad(x\le y,\ \alpha>0).$$
Moreover, by using the formula
$$I_\mu(x)K_\mu(y) = \frac12\int_0^\infty e^{-\frac t2-\frac{x^2+y^2}{2t}}\,I_\mu\Bigl(\frac{xy}{t}\Bigr)\,\frac{dt}{t}\qquad(x\le y),$$
we obtain
$$G_\lambda(x,y;\alpha) = \int_0^\infty e^{-\frac t2-\frac{\lambda^2(e^{2x}+e^{2y})}{2t}}\,I_{\sqrt{2\alpha}}\Bigl(\frac{\lambda^2e^{x+y}}{t}\Bigr)\,\frac{dt}{t},$$
that is,
$$\int_0^\infty e^{-\alpha t}\,dt\int_0^\infty e^{-\frac12\lambda^2e^{2x}u}\,a_t(u,y-x)\,du = \int_0^\infty e^{-\frac t2-\frac{\lambda^2(e^{2x}+e^{2y})}{2t}}\,I_{\sqrt{2\alpha}}\Bigl(\frac{\lambda^2e^{x+y}}{t}\Bigr)\,\frac{dt}{t}.$$
On the other hand, by the characterization (7.5.2) of the function $\Theta(v,t)$, Fubini's theorem yields
$$\int_0^\infty e^{-\alpha t}\,dt\int_0^\infty e^{-\frac12\lambda^2e^{2x}u}\,e^{-\frac{1+e^{2(y-x)}}{2u}}\,\Theta\Bigl(\frac{e^{y-x}}{u},t\Bigr)\,\frac{du}{u} = \int_0^\infty e^{-\frac12\lambda^2e^{2x}u}\,e^{-\frac{1+e^{2(y-x)}}{2u}}\,I_{\sqrt{2\alpha}}\Bigl(\frac{e^{y-x}}{u}\Bigr)\,\frac{du}{u}$$
$$= \int_0^\infty e^{-\frac t2-\frac{\lambda^2(e^{2x}+e^{2y})}{2t}}\,I_{\sqrt{2\alpha}}\Bigl(\frac{\lambda^2e^{x+y}}{t}\Bigr)\,\frac{dt}{t},$$
where, for the second identity, we have used the change of variables given by $\lambda^2e^{2x}u = t$. Hence, by a repeated use of the uniqueness of the Laplace transforms, we obtain (7.5.3). $\square$
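The two Bessel identities used above, the Wronskian $I_\mu'(z)K_\mu(z)-I_\mu(z)K_\mu'(z) = 1/z$ and the product formula, are easy to confirm with scipy. The sketch below is ours; orders and arguments are arbitrary, and the exponentially scaled function ive is used to avoid overflow at small $t$:

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import iv, ivp, ive, kv, kvp

mu, z = np.sqrt(2 * 0.9), 1.3
print(ivp(mu, z) * kv(mu, z) - iv(mu, z) * kvp(mu, z), 1 / z)   # Wronskian = 1/z

# product formula: I_mu(x) K_mu(y) = (1/2) int_0^oo e^{-t/2-(x^2+y^2)/2t} I_mu(xy/t) dt/t
x, y = 0.8, 1.7
f = lambda t: np.exp(-t/2 - (x - y)**2 / (2*t)) * ive(mu, x*y/t) / t
val = quad(f, 1e-12, 60.0, limit=200)[0]     # ive(mu,w) = iv(mu,w) e^{-w}
print(0.5 * val, iv(mu, x) * kv(mu, y))
```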


Plugging (7.5.3) into (7.4.4), we obtain the following ([32]).

Proposition 7.5.2 Set $r = d(z,z')$. Then
$$p(t,z,z') = \frac{e^{-\frac t8}}{4\pi^{\frac32}t^{\frac12}}\int_0^\infty e^{\frac{\pi^2-\xi^2}{2t}}\,\frac{\sinh\xi\,\sin(\frac{\pi\xi}{t})}{(\cosh r+\cosh\xi)^{\frac32}}\,d\xi.\qquad (7.5.4)$$

Proof By (7.4.4) and (7.5.3),
$$p(t,z,z') = e^{-\frac t8}\Bigl(\frac{y'}{y}\Bigr)^{\frac12}\int_0^\infty \frac{1}{\sqrt{2\pi}\,u^{\frac32}}\,e^{-\frac{(x-x')^2+y^2+(y')^2}{2y^2u}}\,\Theta\Bigl(\frac{y'}{yu},t\Bigr)\,du,$$
and, by the distance formula (7.4.2),
$$= e^{-\frac t8}\Bigl(\frac{y'}{y}\Bigr)^{\frac12}\int_0^\infty \frac{1}{\sqrt{2\pi}\,u^{\frac32}}\,e^{-\frac{y'}{yu}\cosh r}\,\Theta\Bigl(\frac{y'}{yu},t\Bigr)\,du = \frac{e^{-\frac t8}}{\sqrt{2\pi}}\int_0^\infty e^{-v\cosh r}\,v^{-\frac12}\,\Theta(v,t)\,dv,\qquad (7.5.5)$$
the last equality by the change of variables $v = \frac{y'}{yu}$. On account of the expression (7.5.1) of $\Theta(v,t)$ and Fubini's theorem, we obtain
$$p(t,z,z') = \frac{e^{-\frac t8}}{\sqrt{2\pi}}\,\frac{1}{(2\pi^3t)^{\frac12}}\int_0^\infty e^{\frac{\pi^2-\xi^2}{2t}}\,\sinh\xi\,\sin\Bigl(\frac{\pi\xi}{t}\Bigr)\,d\xi\times\int_0^\infty e^{-v(\cosh r+\cosh\xi)}\,v^{\frac12}\,dv$$
$$= \frac{e^{-\frac t8}}{2\pi^2t^{\frac12}}\,\Gamma\Bigl(\frac32\Bigr)\int_0^\infty e^{\frac{\pi^2-\xi^2}{2t}}\,\frac{\sinh\xi\,\sin(\frac{\pi\xi}{t})}{(\cosh r+\cosh\xi)^{\frac32}}\,d\xi.$$
Since $\Gamma(\frac32) = \frac{\sqrt\pi}{2}$, we arrive at (7.5.4). $\square$

Proof of Theorem 7.4.1 We give a proof following [81]. See [11] for another approach. For $\alpha,\beta,\gamma>0$, let $F(\alpha,\beta,\gamma;z)$ be the Gauss hypergeometric function given by
$$F(\alpha,\beta,\gamma;z) = \frac{\Gamma(\gamma)}{\Gamma(\alpha)\Gamma(\beta)}\sum_{n=0}^\infty \frac{\Gamma(\alpha+n)\Gamma(\beta+n)}{\Gamma(\gamma+n)}\,\frac{z^n}{n!}.$$
Moreover, for $\mu\in\mathbb R\setminus\{-1,-2,\dots\}$ and $z\in\mathbb C\setminus(-\infty,1]$, let $Q_\mu$ be the Legendre function of the second kind defined by
$$Q_\mu(z) = \frac{\sqrt\pi\,\Gamma(\mu+1)}{\Gamma(\mu+\frac32)\,(2z)^{\mu+1}}\,F\Bigl(\frac{\mu+1}{2},\,\frac\mu2+1,\,\mu+\frac32;\,\frac{1}{z^2}\Bigr).$$
$Q_\mu$ is a solution of the Legendre differential equation
$$(1-z^2)\frac{d^2u}{dz^2} - 2z\frac{du}{dz} + \mu(\mu+1)u = 0.$$
In what follows, we use the integral representations of $Q_\mu$ $(\mu>-1)$ (see, for example, [67, pp. 174, 201])
$$Q_\mu(\cosh a) = \sqrt{\frac\pi2}\int_0^\infty e^{-(\cosh a)u}\,I_{\mu+\frac12}(u)\,\frac{du}{\sqrt u}\qquad (7.5.6)$$
$$= \frac{1}{\sqrt2}\int_a^\infty \frac{e^{-(\mu+\frac12)\rho}}{(\cosh\rho-\cosh a)^{\frac12}}\,d\rho,\qquad (7.5.7)$$
where $a>0$. Set
$$G_1(t) = \int_0^\infty e^{-v\cosh r}\,v^{-\frac12}\,\Theta(v,t)\,dv$$
and
$$G_2(t) = \frac{\sqrt2}{2\pi t^{\frac32}}\int_r^\infty \frac{\rho\,e^{-\frac{\rho^2}{2t}}}{(\cosh\rho-\cosh r)^{\frac12}}\,d\rho.$$
By (7.5.5), it suffices to show $G_1(t) = G_2(t)$ $(t>0)$ in order to obtain (7.4.3). For this purpose, we show the coincidence of the Laplace transforms. For $G_1(t)$, by (7.5.2) and (7.5.6),
$$\int_0^\infty e^{-\lambda t}\,G_1(t)\,dt = \int_0^\infty e^{-v\cosh r}\,v^{-\frac12}\,I_{\sqrt{2\lambda}}(v)\,dv = \sqrt{\frac2\pi}\,Q_{\sqrt{2\lambda}-\frac12}(\cosh r)$$
for $\lambda>0$. On the other hand, since
$$\int_0^\infty e^{-\lambda t-\frac{\rho^2}{2t}}\,t^{-\frac32}\,dt = \frac{\sqrt{2\pi}}{\rho}\,e^{-\sqrt{2\lambda}\,\rho}\qquad(\lambda,\rho>0),$$
(7.5.7) implies
$$\int_0^\infty e^{-\lambda t}\,G_2(t)\,dt = \frac{1}{\sqrt\pi}\int_r^\infty \frac{e^{-\sqrt{2\lambda}\,\rho}}{(\cosh\rho-\cosh r)^{\frac12}}\,d\rho = \sqrt{\frac2\pi}\,Q_{\sqrt{2\lambda}-\frac12}(\cosh r).$$
Hence, we see the coincidence of the Laplace transforms of $G_1$ and $G_2$. $\square$
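With the proof complete, the two integral representations (7.4.3) and (7.5.4) of $p(t,z,z')$ can be compared directly by quadrature. The script below is our own sketch; $t$ and $r$ are arbitrary and the truncations are heuristic:

```python
import numpy as np
from scipy.integrate import quad

t, r = 2.0, 1.0

# (7.4.3)
f1 = lambda b: b * np.exp(-b*b/(2*t)) / np.sqrt(np.cosh(b) - np.cosh(r))
v1 = quad(f1, r, r + 1.0)[0] + quad(f1, r + 1.0, r + 40.0)[0]
p1 = np.sqrt(2.0) * np.exp(-t/8) / (2*np.pi*t)**1.5 * v1

# (7.5.4)
f2 = lambda x: np.exp((np.pi**2 - x*x)/(2*t)) * np.sinh(x) * np.sin(np.pi*x/t) \
               / (np.cosh(r) + np.cosh(x))**1.5
p2 = np.exp(-t/8) / (4 * np.pi**1.5 * np.sqrt(t)) * quad(f2, 0.0, 60.0, limit=300)[0]
print(p1, p2)   # the two representations agree
```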


Appendix: Some Fundamentals

A.1 Gronwall's Inequality

Theorem A.1.1 Let $f(t)$ and $g(t)$ be non-negative continuous functions on $[0,T]$ and let $\alpha$ be a positive number. If
$$f(t) \le g(t) + \alpha\int_0^t f(s)\,ds\qquad(0\le t\le T),$$
then
$$f(t) \le g(t) + \alpha\int_0^t g(s)\,e^{\alpha(t-s)}\,ds\qquad(0\le t\le T).$$
In particular, if $f(t)\le\alpha\int_0^t f(s)\,ds$ $(0\le t\le T)$, then $f(t)\equiv0$.
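In the equality case $f(t) = g(t)+\alpha\int_0^t f(s)\,ds$, the conclusion of Theorem A.1.1 is attained. A small fixed-point computation (our own illustration, with a sample $g$) makes this visible:

```python
import numpy as np

alpha, T, n = 2.0, 1.0, 2000
s = np.linspace(0.0, T, n + 1)
ds = T / n
g = 1.0 + s                                # a sample non-negative continuous g
f = g.copy()
for _ in range(200):                       # Picard iteration: f = g + alpha*int f
    If = np.concatenate([[0.0], np.cumsum(0.5 * (f[1:] + f[:-1]) * ds)])
    f = g + alpha * If
h = g * np.exp(-alpha * s)                 # bound g + alpha*int g e^{alpha(t-s)} ds
Ih = np.concatenate([[0.0], np.cumsum(0.5 * (h[1:] + h[:-1]) * ds)])
bound = g + alpha * np.exp(alpha * s) * Ih
print(f[-1], bound[-1])                    # equality case: the bound is attained
```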

A.2 Dynkin Class Theorem, Monotone Class Theorem

Definition A.2.1 (1) A family $\mathcal A$ of subsets of a set $\Omega$ is called a multiplicative class if the following conditions are fulfilled:
(i) $\Omega\in\mathcal A$;
(ii) if $A,B\in\mathcal A$, then $A\cap B\in\mathcal A$.
(2) A family $\mathcal G$ of subsets of a set $\Omega$ is called a Dynkin class if the following conditions are fulfilled:
(i) $\Omega\in\mathcal G$;
(ii) if $A,B\in\mathcal G$, $A\subset B$, then $B\setminus A\equiv B\cap A^c\in\mathcal G$;
(iii) if $A_n\in\mathcal G$, $A_n\subset A_{n+1}$ $(n=1,2,\dots)$, then $\bigcup_{n=1}^\infty A_n\in\mathcal G$.

Theorem A.2.2 (Dynkin class theorem) (1) Let $\mathcal A$ be a multiplicative class and $\mathcal G$ be a Dynkin class. If $\mathcal A\subset\mathcal G$, then $\sigma(\mathcal A)\subset\mathcal G$.
(2) The smallest Dynkin class containing a multiplicative class $\mathcal A$ is $\sigma(\mathcal A)$.

Theorem A.2.3 Let $\mathcal A$ be a multiplicative class and $\mu$ and $\widetilde\mu$ be measures on $\sigma(\mathcal A)$. If $\mu$ and $\widetilde\mu$ coincide on $\mathcal A$, then $\mu = \widetilde\mu$.

Definition A.2.4 A family $\mathcal G$ of subsets of a set $\Omega$ is called a monotone class if the following conditions are fulfilled:
(i) if $A_n\in\mathcal G$, $A_n\subset A_{n+1}$, then $\bigcup_n A_n\in\mathcal G$;
(ii) if $A_n\in\mathcal G$, $A_n\supset A_{n+1}$, then $\bigcap_n A_n\in\mathcal G$.

Theorem A.2.5 (Monotone class theorem) (1) If a monotone class $\mathcal G$ is a finitely additive class, then $\mathcal G$ is a σ-field.
(2) Let $\mathcal A$ be a finitely additive class and $\mathcal G$ be a monotone class. If $\mathcal A\subset\mathcal G$, then $\sigma(\mathcal A)\subset\mathcal G$.

The next theorem for functions (random variables) is useful in applications.

Theorem A.2.6 Let $\mathcal A$ be a multiplicative class on a set $\Omega$ and $\mathcal H$ be a family of real functions on $\Omega$ such that:
(i) for all $A\in\mathcal A$, $\mathbf 1_A\in\mathcal H$;
(ii) $1\in\mathcal H$;
(iii) $\mathcal H$ is a vector space;
(iv) if $\{f_n\}_{n=1}^\infty\subset\mathcal H$ is a sequence of bounded functions such that $f_n\ge0$ and $f_n\le f_{n+1}$, and $f = \lim_{n\to\infty}f_n$ is also a bounded function, then $f\in\mathcal H$.
Then $\mathcal H$ contains all $\sigma(\mathcal A)$-measurable real bounded functions.

For proofs, see [5, 6, 47, 114].

A.3 Uniform Integrability

The next elementary proposition is used to prove the uniform integrability of martingales.

Proposition A.3.1 An integrable random variable $X$ defined on a probability space $(\Omega,\mathcal F,P)$ satisfies the following:
(1) $\lim_{\lambda\to\infty}E[|X|\mathbf 1_{\{|X|\ge\lambda\}}] = 0$.
(2) For any $\varepsilon>0$ there exists a $\delta>0$ such that, if $P(A)<\delta$ $(A\in\mathcal F)$, then $E[|X|\mathbf 1_A]<\varepsilon$.


The uniform integrability of a family of random variables is defined in the following way.

Definition A.3.2 Let $I$ be an index set and $\{X_\iota\}_{\iota\in I}$ be a family of random variables defined on $(\Omega,\mathcal F,P)$. $\{X_\iota\}_{\iota\in I}$ is said to be uniformly integrable if
$$\lim_{\lambda\to\infty}\sup_{\iota\in I}E\bigl[|X_\iota|\mathbf 1_{\{|X_\iota|>\lambda\}}\bigr] = 0.$$

Proposition A.3.3 If $\{X_\iota\}_{\iota\in I}$ is uniformly integrable, then $\sup_{\iota\in I}E[|X_\iota|]<\infty$.

Theorem A.3.4 A family $\{X_\iota\}_{\iota\in I}$ of integrable random variables is uniformly integrable if and only if there exists a measurable function $\psi$ such that
$$\lim_{x\to\infty}\frac{\psi(x)}{x} = \infty\quad\text{and}\quad \sup_{\iota\in I}E[\psi(|X_\iota|)]<\infty.$$
In particular, if $\sup_{\iota\in I}E[|X_\iota|^p]<\infty$ for some $p>1$, then $\{X_\iota\}_{\iota\in I}$ is uniformly integrable.

Theorem A.3.5 For a uniformly integrable sequence $\{X_n\}_{n=1}^\infty$ of random variables,
$$E\bigl[\liminf_{n\to\infty}X_n\bigr] \le \liminf_{n\to\infty}E[X_n] \le \limsup_{n\to\infty}E[X_n] \le E\bigl[\limsup_{n\to\infty}X_n\bigr].$$
Moreover, if the limit $\lim_{n\to\infty}X_n = X$ exists almost surely, then $X\in L^1$ and
$$\lim_{n\to\infty}E[|X_n-X|] = 0.$$

Theorem A.3.6 For a uniformly integrable sequence $\{X_n\}_{n=1}^\infty$ of random variables, $X_n$ converges to a random variable $X$ in probability if and only if it does in $L^1$.

Theorem A.3.7 Suppose that a sequence $\{X_n\}_{n=1}^\infty$ of integrable random variables converges to $X$ almost surely. Then the following conditions are equivalent:
(i) $\{X_n\}_{n=1}^\infty$ is uniformly integrable;
(ii) $X_n$ converges to $X$ in $L^1$;
(iii) $X\in L^1$ and $E[|X_n|]$ converges to $E[|X|]$.

For details on the topics in this section, see [126].


A.4 Conditional Probability and Regular Conditional Probability

Let $(\Omega,\mathcal F,P)$ be a probability space and $\mathcal G$ be a sub-σ-field of $\mathcal F$. For an integrable random variable $X$, there exists a unique, up to null sets, $\mathcal G$-measurable integrable random variable $Y$ such that $E[X\mathbf 1_A] = E[Y\mathbf 1_A]$ for all $A\in\mathcal G$. This random variable $Y(\omega)$ is called the conditional expectation of $X$ given $\mathcal G$ and is denoted by $E[X|\mathcal G](\omega)$. In particular, when $X = \mathbf 1_B$ for $B\in\mathcal F$, $E[\mathbf 1_B|\mathcal G](\omega)$ is called the conditional probability of $B$ given $\mathcal G$ and is denoted by $P(B|\mathcal G)(\omega)$.

Remark A.4.1 (1) If $X$ is $\mathcal G$-measurable, then $E[X|\mathcal G] = X$, $P$-a.s.
(2) If $\mathcal G$ is trivial, that is, if $\mathcal G = \{\emptyset,\Omega\}$, then $E[X|\mathcal G] = E[X]$, $P$-a.s.

The conditional probability $P(B|\mathcal G)(\omega)$ is well defined as a random variable (a function in $\omega$) for each fixed $B\in\mathcal F$. Since it is defined only up to null sets, it is in general impossible to regard it as a function of $B$ for a fixed $\omega$. This inconvenience is overcome by the regular conditional probability, which was used in the proof of the fundamental theorem on the existence of solutions of stochastic differential equations and in that of the strong Markov property of solutions of martingale problems.

Definition A.4.2 If a mapping $p(\cdot,\cdot|\mathcal G):\Omega\times\mathcal F\ni(\omega,A)\mapsto p(\omega,A|\mathcal G)\in[0,1]$ satisfies the conditions:
(i) for any $\omega\in\Omega$, $\mathcal F\ni A\mapsto p(\omega,A|\mathcal G)$ is a probability measure on $(\Omega,\mathcal F)$;
(ii) for any $A\in\mathcal F$, $\Omega\ni\omega\mapsto p(\omega,A|\mathcal G)$ is $\mathcal G$-measurable;
(iii) for any $A\in\mathcal F$ and $B\in\mathcal G$,
$$P(A\cap B) = \int_B p(\omega,A|\mathcal G)\,P(d\omega),$$
then it is called a regular conditional probability given the condition $\mathcal G$.

The condition (iii) can be written as
(iii)′ for any $X\in L^1(\Omega)$ and $B\in\mathcal G$,
$$E[X\mathbf 1_B] = E\Bigl[\int_\Omega X(\omega')\,p(\cdot,d\omega'|\mathcal G)\,\mathbf 1_B\Bigr].$$
In other words,
$$\int_\Omega X(\omega')\,p(\omega,d\omega'|\mathcal G) = E[X|\mathcal G](\omega),\quad P\text{-a.s.}$$


An example of a probability space which does not admit a regular conditional probability can be found in [99].

Definition A.4.3 Two measurable spaces $(\Omega,\mathcal F)$ and $(\Omega',\mathcal F')$ are called Borel isomorphic if there exists an $\mathcal F/\mathcal F'$-measurable and one-to-one mapping $f$ from $\Omega$ onto $\Omega'$ whose inverse $f^{-1}$ is $\mathcal F'/\mathcal F$-measurable. If a measurable space $(\Omega,\mathcal F)$ is Borel isomorphic to a Borel subset of $\mathbb R^1$, it is called a standard measurable space. Moreover, if $P$ is a probability measure on $(\Omega,\mathcal F)$, then $(\Omega,\mathcal F,P)$ is called a standard probability space.

It is known that a standard measurable space is Borel isomorphic to one of $\{1,2,\dots,n\}$ $(n=1,2,\dots)$, $\mathbb N$ and $[0,1]$. It is also known that a complete separable metric space endowed with the topological σ-field is a standard measurable space.

Theorem A.4.4 A standard probability space $(\Omega,\mathcal F,P)$ admits the regular conditional probability given a sub-σ-field $\mathcal G$ of $\mathcal F$.

We refer the reader to [17, 45, 47] for details on standard probability spaces and regular conditional probabilities.

Next let $(S,\mathcal B)$ be a measurable space and $\xi$ be an $S$-valued random variable defined on $\Omega$. For $X\in L^1(\Omega)$ and $C\in\mathcal B$, define $R$ by $R(C) = E[X\mathbf 1_C(\xi)]$. $R$ is a set function on $\mathcal B$. Since $R$ is absolutely continuous with respect to the probability law $\nu$ of $\xi$, by the Radon–Nikodym theorem there exists a measurable function $Z$ on $S$ such that
$$R(C) = \int_C Z(s)\,\nu(ds).$$
The function $Z$ is denoted by $E[X|\xi=s]$ $(s\in S)$ and called the conditional expectation of $X$ given $\xi = s$. By definition we have
$$E[X\mathbf 1_C(\xi)] = \int_C E[X|\xi=s]\,P(\xi\in ds)\qquad(C\in\mathcal B).$$

Definition A.4.5 Let $(\Omega,\mathcal F)$ be a measurable space. $\mathcal F$ is said to be countably generated if it possesses a countable subset $\mathcal F_0$ of $\mathcal F$ such that any probability measures $P_1$ and $P_2$ coincide provided that they are equal on $\mathcal F_0$. If $(\Omega,\mathcal F)$ is a standard measurable space, then $\mathcal F$ is countably generated.


Theorem A.4.6 Let $(\Omega,\mathcal F,P)$ be a standard probability space, $\mathcal G$ be a sub-σ-field of $\mathcal F$, $p(\omega,d\omega'|\mathcal G)$ be the regular conditional probability given $\mathcal G$ and $\mathcal H$ be a countably generated sub-σ-field of $\mathcal G$. Then there exists a $P$-null set $N\in\mathcal G$ such that, if $\omega\notin N$, then $p(\omega,A|\mathcal G) = \mathbf 1_A(\omega)$ for all $A\in\mathcal H$.

Corollary A.4.7 Under the same framework as in the theorem, let $\xi$ be an $S$-valued random variable, $(S,\mathcal B)$ being a measurable space. If $\mathcal B$ is countably generated and $\{x\}\in\mathcal B$ for all $x\in S$, then
$$p\bigl(\omega,\{\omega';\ \xi(\omega')=\xi(\omega)\}\,\big|\,\mathcal G\bigr) = 1\quad\text{almost surely}.$$
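For a concrete instance of $E[X|\xi=s]$ (our own example): if $(\xi,X)$ is bivariate Gaussian with correlation $\rho$, then $E[X|\xi=s] = \rho s$, and the defining identity $E[X\mathbf 1_C(\xi)] = \int_C E[X|\xi=s]\,P(\xi\in ds)$ can be checked by simulation with binning:

```python
import numpy as np

rng = np.random.default_rng(3)
rho, m = 0.6, 2_000_000
xi = rng.normal(size=m)
X = rho * xi + np.sqrt(1 - rho**2) * rng.normal(size=m)  # E[X | xi = s] = rho s
C = (0.5 < xi) & (xi < 1.5)                              # a test set C
lhs = np.mean(X * C)                                     # E[X 1_C(xi)]
edges = np.linspace(0.5, 1.5, 51)
idx = np.digitize(xi, edges)
# Riemann sum of E[X | xi = s] P(xi in ds) over C via binned sample means
rhs = sum(X[idx == k].mean() * np.mean(idx == k) for k in range(1, 51))
print(lhs, rhs)                                          # approximate the same value
```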

A.5 Kolmogorov Continuity Theorem

Let $D$ be a domain in $\mathbb R^d$ and $B$ be a Banach space. A family $\{X(x)\}_{x\in D}$ of $B$-valued random variables is called a random field. The stochastic flow defined by the solution of a stochastic differential equation is a typical example. Kolmogorov's continuity theorem gives a sufficient condition for a random field to have a continuous modification. It is also called the Kolmogorov–Centsov theorem or the Kolmogorov–Totoki theorem.

Theorem A.5.1 Let $X = \{X(x)\}_{x\in D}$ be a random field. Suppose that there exist positive constants $\alpha_1,\alpha_2,\dots,\alpha_d$, $\gamma$, $C$ with $\sum_{i=1}^d\alpha_i^{-1}<1$ such that
$$E\bigl[\|X(x)-X(y)\|^\gamma\bigr] \le C\prod_{i=1}^d |x_i-y_i|^{\alpha_i}\qquad(x,y\in D),$$
where $\|\cdot\|$ is the norm of $B$. Then $X$ has a continuous modification $\{\widetilde X(x)\}_{x\in D}$. Moreover, for any cube $E$ contained in $D$, there exists a positive random variable $K$ with $E[K^\gamma]<\infty$ such that
$$\|\widetilde X(x)-\widetilde X(y)\| \le K\prod_{i=1}^d |x_i-y_i|^{\beta_i}\qquad(x,y\in E),$$
where $\beta_i = \dfrac{\alpha_i(\alpha_0-d)}{\alpha_0\gamma}$ and $\alpha_0 = d\Bigl(\sum_{i=1}^d\alpha_i^{-1}\Bigr)^{-1}$.

For a proof, see Kunita [62].
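Brownian motion on $D = (0,1)$ illustrates the theorem (our own example): $E[|B_x-B_y|^4] = 3|x-y|^2$, that is, $\gamma = 4$, $\alpha_1 = 2$, $d = 1$, so $\alpha_1^{-1}<1$, $\alpha_0 = 2$ and $\beta_1 = 2(2-1)/(2\cdot4) = 1/4$. The moment hypothesis itself is easy to confirm by simulation:

```python
import numpy as np

rng = np.random.default_rng(4)
paths, n = 20000, 512
B = np.cumsum(rng.normal(0.0, np.sqrt(1.0/n), size=(paths, n)), axis=1)
for lag in [4, 16, 64]:
    h = lag / n
    m4 = np.mean((B[:, lag:] - B[:, :-lag]) ** 4)
    print(h, m4, 3 * h * h)      # the fourth moment scales like 3 h^2
```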

A.6 Laplace Transforms and the Tauberian Theorem

For a non-negative measure $F(d\xi)$ on $[0,\infty)$, the function $L$ defined by
$$L(\lambda) \equiv L_F(\lambda) = \int_0^\infty e^{-\lambda\xi}\,F(d\xi)$$
is called the Laplace transform of $F$. For example, if $F(d\xi) = \gamma\xi^{\gamma-1}\,d\xi$ for $\gamma>0$, then
$$L_F(\lambda) = \gamma\int_0^\infty e^{-\lambda\xi}\,\xi^{\gamma-1}\,d\xi = \frac{\Gamma(\gamma+1)}{\lambda^\gamma},\qquad\lambda>0.$$
We mention some properties of Laplace transforms and the Tauberian theorem, which was used in Chapter 7. For details, see [23, 125].

Theorem A.6.1 (Uniqueness) Let $F$ and $G$ be non-negative measures on $[0,\infty)$ and assume that there exists an $a>0$ such that $L_F(\lambda) = L_G(\lambda) < \infty$ $(\lambda>a)$. Then $F = G$.

Theorem A.6.2 (Continuity) Let $F_n$ $(n\in\mathbb N)$ be a sequence of non-negative measures on $[0,\infty)$ which have Laplace transforms $L_{F_n}(\lambda)$ for $\lambda>a$ and assume that $L_{F_n}(\lambda)$ converges to some function $L(\lambda)$ for all $\lambda>a$ as $n\to\infty$. Then $L(\lambda)$ $(\lambda>a)$ is the Laplace transform of some measure $F$ on $[0,\infty)$ and, for every bounded continuous interval¹ $I = (\alpha,\beta)$ of $F$, $F_n(I)\to F(I)$ as $n\to\infty$.

The Tauberian theorem is formulated by means of regularly varying functions.

Definition A.6.3 A function $\ell(\xi)$ on $[0,\infty)$ is called slowly varying as $\xi\to\infty$ if
$$\lim_{\xi\to\infty}\frac{\ell(\kappa\xi)}{\ell(\xi)} = 1$$
for any $\kappa>0$. A function of the form $k(\xi) = \xi^\gamma\ell(\xi)$ $(\gamma>0)$, with $\ell$ slowly varying, is called regularly varying.

For $c,\delta>0$, the function $\ell(\xi) = c(\log(1+\xi))^\delta$ is slowly varying as $\xi\to\infty$.

Theorem A.6.4 Let $\gamma$ be a positive number and $\ell(\xi)$ be a slowly varying function as $\xi\to\infty$. Then, for a right-continuous non-decreasing function $F$, $F$ has the asymptotic behavior
$$F(\xi) = \xi^\gamma\ell(\xi)(1+o(1))\qquad(\xi\to\infty)$$
if and only if the Laplace transform $L(\lambda)$ of the corresponding Stieltjes measure $F(d\xi)$ has the asymptotic behavior
$$L(\lambda) = \Gamma(\gamma+1)\,\lambda^{-\gamma}\,\ell\Bigl(\frac1\lambda\Bigr)(1+o(1))\qquad(\lambda\downarrow0).$$

¹ $I = (\alpha,\beta)$ is said to be a continuous interval of a measure $F$ if $F(\{\alpha\}) = F(\{\beta\}) = 0$.

That the asymptotic behavior of the Laplace transform $L$ of $F$ follows from that of $F$ is called the Abelian theorem, and the converse assertion is the Tauberian theorem. If we define slow variation near $0$, the assertion of Theorem A.6.4 holds with $\xi\to\infty$ and $\lambda\downarrow0$ replaced by $\xi\downarrow0$ and $\lambda\to\infty$, respectively.
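A numerical illustration of Theorem A.6.4 (our own example): take $F(\xi) = \xi^\gamma\ell(\xi)$ with $\gamma = 1/2$ and $\ell(\xi) = \log(1+\xi)$; after integration by parts, $L(\lambda) = \lambda\int_0^\infty e^{-\lambda\xi}F(\xi)\,d\xi$, and the ratio against $\Gamma(\gamma+1)\lambda^{-\gamma}\ell(1/\lambda)$ tends to 1 as $\lambda\downarrow0$:

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma as Gamma

g = 0.5
ell = lambda x: np.log(1.0 + x)                  # slowly varying
def L(lam):                                      # L(lam) after substituting xi = u/lam
    f = lambda u: np.exp(-u) * (u / lam)**g * ell(u / lam)
    return quad(f, 0.0, 60.0, limit=200)[0]
for lam in [1e-1, 1e-2, 1e-3, 1e-4]:
    print(lam, L(lam) / (Gamma(g + 1) * lam**(-g) * ell(1.0 / lam)))  # ratio -> 1
```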


References

[1] R. Adams and J. Fournier, Sobolev Spaces, 2nd edn., Academic Press, 2003. [2] M. Aizenman and B. Simon, Brownian motion and Harnack’s inequality for Schrödinger operators, Comm. Pure Appl. Math., 35 (1982), 209–273. [3] V. I. Arnold, Mathematical Methods of Classical Mechanics, 2nd edn., SpringerVerlag, 1989. [4] J. Avron, I. Herbst, and B. Simon, Schrödinger operators with magnetic fields, I. general interactions, Duke Math. J., 45 (1978), 847–883. [5] P. Billingsley, Probability and Measure, 3rd edn., John Wiley & Sons, 1995. [6] R. M. Blumenthal and R. K. Getoor, Markov Processes and Potential Theory, Academic Press, 1968. [7] V. Bogachev, Gaussian Measures, Amer. Math. Soc., 1998. [8] N. Bouleau and F. Hirsch, Dirichlet Forms and Analysis on Wiener Space, Walter de Gruyter, 1991. [9] C. Cocozza and M. Yor, Démonstration d’un théorème de Knight à l’aide de martingales exponentielles, Séminaire de Probabilités, XIV, eds. J. Azama and M. Yor, Lecture Notes in Math., 784, 496–499, Springer-Verlag, 1980. [10] H. L. Cycon, R. G. Froese, W. Kirsch, and B. Simon, Schrödinger Operators, with Application to Quantum Mechanics and Global Geometry, SpringerVerlag, 1987. [11] E. B. Davies, Heat Kernels and Spectral Theory, Cambridge University Press, 1989. [12] B. Davis, Picard’s theorem and Brownian motion, Trans. Amer. Math. Soc., 213 (1975), 353–362. [13] C. Dellacherie, Capacités et Processus Stochastiques, Springer-Verlag, 1971. [14] D. Deuschel and D. Stroock, Large Deviations, Academic Press, 1989. [15] J. L. Doob, Stochastic Processes, John Wiley & Sons, 1953. [16] H. Doss, Liens entre équations différentielles stochastiques et ordinaires, Ann. Inst. H. Poincaré Sect. B (N.S.), 13 (1977), 99–125. [17] R. M. Dudley, Real Analysis and Probability, 2nd edn., Cambridge University Press, 2002. [18] D. Duffie, Dynamic Asset Pricing Theory, 2nd edn., Princeton University Press, 1996. [19] N. Dunford and J. Schwartz, Linear Operators, II, Interscience, 1963.


[20] R. Durrett, Brownian Motion and Martingales in Analysis, Wadsworth, 1984. [21] R. Elliot and P. Kopp, Mathematics of Financial Markets, Springer-Verlag, 1999. [22] K. D. Elworthy, Stochastic Differential Equations on Manifolds, Cambridge University Press, 1982. [23] W. Feller, An Introduction to Probability Theory and Its Applications, Vol. II, John Wiley & Sons, 1966. [24] E. Fournié, J.-M. Lasry, J. Lebuchoux, P.-L. Lions, and N. Touzi, Applications of Malliavin calculus to Monte Carlo methods in finance, Finance Stoch. 3 (1999), 391–412. [25] M. Fukushima, Y. Oshima, and M. Takeda, Dirichlet Forms and Symmetric Markov Processes, 2nd edn., Walter de Gruyter, 2010. [26] P. Friz and M. Hairer, A Course on Rough Paths, Springer-Verlag, 2014. [27] P. Friz and N. Victoir, Multidimensional Stochastic Processes as Rough Paths, Cambridge University Press, 2010. [28] B. Gaveau and P. Trauber, L’intégrale stochastique comme opérateur de divergence dans l’space fonctionnel, J. Func. Anal., 46 (1982), 230–238. [29] R. K. Getoor and M. J. Sharpe, Conformal martingales, Invent. Math., 16 (1972), 271–308. [30] E. Getzler, Degree theory for Wiener maps, J. Func. Anal., 68 (1986), 388–403. [31] I. S. Gradshteyn and I. M. Ryzhik, Tables of Integrals, Series, and Products, 7th edn., Academic Press, 2007. [32] J.-C. Gruet, Semi-groupe du mouvement Brownien hyperbolique, Stochastics Stochastic Rep., 56 (1996), 53–61. [33] D. Hejhal, The Selberg Trace Formula for PSL(2, R), Vol.1, Vol.2, Lecture Notes in Math., 548, 1001, Springer-Verlag, 1976, 1983. [34] B. Helffer and J. Sjöstrand, Multiple wells in the semiclassical limit, I, Comm. PDE, 9 (1984), 337–408. [35] B. Helffer and J. Sjöstrand, Puits multiples en limite semi-classique, II. Interaction moléculaire. Symétries. Perturbation., Ann. Inst. H. Poincaré Phys. Théor., 42 (1985), 127–212. [36] L. Hörmander, The Analysis of Linear Partial Differential Operators, I, Distribution Theory and Fourier Analysis, 2nd edn., Springer-Verlag, 1990. [37] L. Hörmander, Hypoelliptic second order differential equations, Acta Math., 119 (1967), 147–171. [38] E. P. Hsu, Stochastic Analysis on Manifolds, Amer. Math. Soc., 2002. [39] K. Ichihara, Explosion problem for symmetric diffusion processes, Trans. Amer. Math. Soc., 298 (1986), 515–536. [40] N. Ikeda, S. Kusuoka, and S. Manabe, Lévy’s stochastic area formula and related problems, in Stochastic Analysis, eds. M. Cranston and M. Pinsky, 281–305, Proc. Sympos. Pure Math., 57, Amer. Math. Soc., 1995. [41] N. Ikeda and S. Manabe, Van Vleck–Pauli formula for Wiener integrals and Jacobi fields, in Itô’s Stochastic Calculus and Probability Theory, eds. N. Ikeda, S. Watanabe, M. Fukushima, and H. Kunita, 141–156, Springer-Verlag, 1996. [42] N. Ikeda and H. Matsumoto, Brownian motion on the hyperbolic plane and Selberg trace formula, J. Funct. Anal., 163 (1999), 63–110.


[43] N. Ikeda and H. Matsumoto, The Kolmogorov operator and classical mechanics, Séminaire de Probabilités XLVII, eds. C. Donati-Martin, A. Lejay, and A. Rouault, Lecture Notes in Math., 2137, 497–504, Springer-Verlag, 2015. [44] N. Ikeda and S. Taniguchi, Quadratic Wiener functionals, Kalman-Bucy filters, and the KdV equation, in Stochastic Analysis and Related Topics in Kyoto, in honor of Kiyosi Itô, eds., H. Kunita, S. Watanabe, and Y. Takahashi, Adv. Studies Pure Math. 41, 167–187, Math. Soc. Japan, Tokyo, 2004. [45] N. Ikeda and S. Watanabe, Stochastic Differential Equations and Diffusion Processes, 2nd edn., North Holland/Kodansha, 1989. [46] K. Itô, Essentials of Stochastic Processes (translated by Y. Ito), Amer Math. Soc., 2006. (Originally published in Japanese from Iwanami Shoten, 1957, 2006) [47] K. Itô, Introduction to Probability Theory, Cambridge University Press, 1984. (Originally published in Japanese from Iwanami Shoten, 1978) [48] K. Itô, Differential equations determining Markov processes, Zenkoku Shij¯o S¯ugaku Danwakai, 244 (1942), 1352–1400, (in Japanese). English translation in Kiyosi Itô, Selected Papers, eds. D. Stroock and S. R. S. Varadhan, Springer-Verlag, 1987. [49] K. Itô, On stochastic differential equations, Mem. Amer. Math. Soc., 4 (1951). [50] K. Itô and H. P. McKean, Jr., Diffusion Processes and Their Sample Paths, Springer-Verlag, 1974. [51] K. Itô and M. Nisio, On the convergence of sums of independent Banach space valued random variables, Osaka J. Math., 5 (1968), 35–48. [52] M. Kac, Integration in Function Spaces and Some of Its Applications, Fermi Lectures, Accademia Nazionale dei Lincei, Scuola Normale Superiore, 1980. [53] M. Kac, On distributions of certain Wiener functionals, Trans. Amer. Math. Soc., 65 (1949), 1–13. [54] M. Kac, On some connections between probability theory and differential and integral equations, Proceedings of 2nd Berkeley Symp. on Math. Stat. and Probability, 189–215, University of California Press, 1951. [55] M. Kac, Can one hear the shape of a drum?, Amer. Math. Monthly, 73 (1966), 1–23. [56] I. Karatzas and S. E. Shreve, Brownian Motion and Stochastic Calculus, 2nd edn., Springer-Verlag, 1991. [57] I. Karatzas and S. Shreve, Methods of Mathematical Finance, Springer-Verlag, 1998. [58] T. Kato, Perturbation Theory for Linear Operators, 2nd edn., Springer-Verlag, 1995. [59] N. Kazamaki, The equivalence of two conditions on weighted norm inequalities for martingales, Proc. Intern. Symp. SDE Kyoto 1976 (ed. K. Itô), 141–152, Kinokuniya, 1978. [60] F. B. Knight, A reduction of continuous square-integrable martingales to Brownian motion, in Martingales, ed. H. Dinges, Lecture Notes in Math., 190, 19–31, Springer-Verlag, 1971. [61] H. Kunita, Estimation of Stochastic Processes (in Japanese), Sangyou Tosho, 1976.


[62] H. Kunita, Stochastic Flows and Stochastic Differential Equations, Cambridge University Press, 1990. [63] H. Kunita, Supports of diffusion processes and controllability problems, Proc. Intern. Symp. SDE Kyoto 1976 (ed. K. Itô), 163–185, Kinokuniya, 1978. [64] H. Kunita, On the decomposition of solutions of stochastic differential equations, in Stochastic Integrals, ed. D. Williams, Lecture Notes in Math., 851, 213–255, Springer-Verlag, 1981. [65] H. Kunita and S. Watanabe, On square integrable martingales, Nagoya Math. J., 30 (1967), 209–245. [66] S. Kusuoka, The nonlinear transformation of Gaussian measure on Banach space and its absolute continuity, J. Fac. Sci. Tokyo Univ., Sect. 1.A., 29 (1982), 567– 590. [67] N. N. Lebedev, Special Functions and their Applications, translated by R. R. Silverman, Dover, 1972. [68] M. Ledoux, Isoperimetry and Gaussian analysis, in Lectures on Probability Theory and Statistics, Ecole d’Eté de Probabilités de Saint-Flour XXIV – 1994, ed. P. Bernard, Lecture Notes in Math., 1648, 165–294, Springer-Verlag, 1996. [69] J.-F. LeGall, Applications du temps local aux equations différentielle stochastiques unidimensionalles, Séminaire de Probabilités XVII, edn. J. Azema and M. Yor, Lecture Notes in Math., 986, 15–31, Springer-Verlag, 1983. [70] P. Lévy, Wiener’s random function, and other Laplacian random functions, Proceedings of 2nd Berkeley Symp. on Math. Stat. and Probability, 171–186, University of California Press, 1951. [71] T. Lyons, M. Caruana, and T. Lévy, Differential equations driven by rough paths, École d’Été de Probabilités de Saint-Flour XXXIV - 2004, Lecture Notes in Math., 1908, Springer, 2007. [72] T. Lyons and Z. Qian, System Control and Rough Paths, Oxford University Press, 2002. [73] P. Malliavin, Stochastic Analysis, Springer-Verlag, 1997. [74] P. Malliavin, Stochastic calculus of variation and hypoelliptic operators, Proc. Intern. Symp. SDE Kyoto 1976 ed. K. Itô, 195–263, Kinokuniya, 1978. [75] P. Malliavin, C k -hypoellipticity with degeneracy, in Stochastic Analysis, eds. A. Friedman and M. Pinsky, 199–214, 327–340, Academic Press, 1978. [76] P. Malliavin and A. Thalmaier, Stochastic Calculus of Variations in Mathematical Finance, Springer-Verlag, 2006. [77] G. Maruyama, Selected Papers, eds. N. Ikeda and H. Tanaka, Kaigai Publications, 1988. [78] G. Maruyama, On the transition probability functions of the Markov process, Nat. Sci. Rep. Ochanomizu Univ., 5 (1954), 10–20. [79] G. Maruyama, Continuous Markov processes and stochastic equations, Rend. Circ. Mate. Palermo., 4 (1955), 48–90. [80] H. Matsumoto, Semiclassical asymptotics of eigenvalues for Schrödinger operators with magnetic fields, J. Funct. Anal., 129 (1995), 168–190. [81] H. Matsumoto, L. Nguyen, and M. Yor, Subordinators related to the exponential functionals of Brownian bridges and explicit formulae for the semigroups of hyperbolic Brownian motions, in Stochastic Processes and Related Topics, eds. R. Buckdahn, E. Engelbert and M. Yor, 213–235, Gordon and Breach, 2001.


[82] H. Matsumoto and S. Taniguchi, Wiener functionals of second order and their Lévy measures, Elect. Jour. Probab., 7, No.14 (2002), 1–30. [83] H. Matsumoto and M. Yor, A version of Pitman’s 2M–X theorem for geometric Brownian motions, C.R. Acad. Sc. Paris Série I, 328 (1999), 1067–1074. [84] H. P. McKean, Jr., Stochastic Integrals, Academic Press, 1969. [85] H. P. McKean, Jr., Selberg’s trace formula as applied to a compact Riemannian surface, Comm. Pure Appl. Math., 101 (1972), 225–246. [86] P. A. Meyer, Probabilités et Potentiel, Hermann, 1966. [87] T. Miwa, E. Date, and M. Jimbo, Solitons: Differential Equations, Symmetries and Infinite Dimensional Algebras (translated by M. Reid), Cambridge University Press, 2000. (Originally published in Japanese from Iwanami Shoten, 1993) [88] M. Musiela and M. Rutkowski, Martingale Methods in Financial Modeling, Springer-Verlag, 2003. [89] S. Nakao, On the pathwise uniqueness of solutions of one-dimensional stochastic differential equations, Osaka J. Math., 9 (1972), 513–518. [90] A. A. Novikov, On an identity for stochastic integrals, Theory Prob. Appl., 17 (1972), 717–720. [91] A. A. Novikov, On moment inequalities and identities for stochastic integrals, Proc. Second Japan–USSR Symp. Prob. Theor., eds. G. Maruyama and J. V. Prokhorov, Lecture Notes in Math., 330, 333–339, Springer-Verlag, 1973. [92] D. Nualart, The Malliavin Calculus and Related Topics, 2nd edn., Springer-Verlag, 2006. [93] B. Øksendal, Stochastic Differential Equations, an Introduction with Applications, 6th edn., Springer-Verlag, 2003. [94] S. Port and C. Stone, Brownian Motion and Classical Potential Theory, Academic Press, 1978. [95] K. M. Rao, On the decomposition theorem of Meyer, Math. Scand., 24 (1969), 66–78. [96] D. Ray, On spectra of second order differential operators, Trans. Amer. Math. Soc., 77 (1954), 299–321. [97] L. Richardson, Measure and Integration: a Concise Introduction to Real Analysis, John Wiley & Sons, 2009. [98] D. Revuz and M. Yor, Continuous Martingales and Brownian Motion, 3rd edn., Springer-Verlag, 1999. [99] L. C. G. Rogers and D. Williams, Diffusions, Markov Processes, and Martingales, Vol.1, Foundations, 2nd edn., John Wiley & Sons, New York, 1994. [100] L. C. G. Rogers and D. Williams, Diffusions, Markov Processes, and Martingales, Vol.2, Itô Calculus, 2nd edn., John Wiley & Sons, New York, 1994. [101] K. Sato, Lévy Processes and Infinitely Divisible Distributions, Cambridge University Press, 1999. [102] M. Schilder, Some asymptotic formulae for Wiener integrals, Trans. Amer. Math. Soc., 125 (1966), 63–85. [103] A. Selberg, Harmonic analysis and discontinuous groups in weakly symmetric Riemannian spaces with applications to Dirichlet series, J. Indian Math. Soc., 20 (1956), 47–87. [104] I. Shigekawa, Stochastic Analysis, Amer. Math. Soc., 2004. (Originally published in Japanese from Iwanami Shoten, 1998)


[105] S. Shreve, Stochastic Calculus for Finance, I, II, Springer-Verlag, 2004. [106] B. Simon, Functional Integration and Quantum Physics, Academic Press, 1979. [107] B. Simon, Trace Ideals and Their Applications, 2nd edn., Amer. Math. Soc., 2005. [108] B. Simon, Schrödinger semigroups, Bull. Amer. Math. Soc., 7 (1982), 447–526. [109] B. Simon, Semiclassical analysis of low lying eigenvalues I, Non-degenerate minima: Asymptotic expansions, Ann. Inst. Henri-Poincaré, Sect. A, 38 (1983), 295–307. [110] B. Simon, Semiclassical analysis of low lying eigenvalues II, Tunneling, Ann. Math., 120 (1984), 89–118. [111] D. W. Stroock, Lectures on Topics in Stochastic Differential Equations, Tata Insitute of Fundamental Research, 1982. [112] D. W. Stroock, Probability Theory: an Analytic View, 2nd edn., Cambridge University Press, 2010. [113] D. W. Stroock, An exercise in Malliavin calculus, J. Math. Soc. Japan, 67 (2015), 1785–1799. [114] D. W. Stroock and S. R. S. Varadhan, Multidimensional Diffusion Processes, Springer-Verlag, 1979. [115] D. W. Stroock and S. R. S. Varadhan, On the support of diffusion processes with applications to the strong maximum principle, Proc. Sixth Berkeley Symp. Math. Statist. Prob. III., 361–368, University of California Press, 1972. [116] H. Sugita, Positive generalized Wiener functions and potential theory over abstract Wiener spaces, Osaka J. Math., 25 (1988), 665–696. [117] H. J. Sussmann, On the gap between deterministic and stochastic ordinary differential equations, Ann. Probab., 6 (1978), 19–41. [118] S. Taniguchi, Brownian sheet and reflectionless potentials, Stoch. Pro. Appl., 116 (2006), 293–309. [119] H. Trotter, A property of Brownian motion paths, Illinois J. Math., 2 (1958), 425–433. [120] A.S. Üstünel and M. Zakai, Transformation of Measure on Wiener Space, Springer-Verlag, 2000. [121] J. H. Van Vleck, The correspondence principle in the statistical interpretation of quantum mechanics, Proc. Nat. Acad. Sci. U.S.A., 14 (1928), 178–188. [122] S. Watanabe, Analysis of Wiener functionals (Malliavin calculus) and its applications to heat kernels, Ann. Probab., 15 (1987), 1–39. [123] S. Watanabe, Generalized Wiener functionals and their applications, Probability theory and mathematical statistics, Proceedings of the Fifth Japan–USSR Symposium, Kyoto, 1986, eds. S. Watanabe and Y. V. Prokhorov, 541–548, Lecture Notes in Math., 1299, Springer-Verlag, Berlin, 1988. [124] H. Weyl, Das asymptotische Verteilungsgesetz der Eigenschwingungen eines beliebig gestalteten elastischen Körpers, Rend. Cir. Mat. Palermo, 39 (1915), 1–50. [125] D. V. Widder, The Laplace Transform, Princeton University Press, 1941. [126] D. Williams, Probability with Martingales, Cambridge University Press, 1991. [127] E. Wong and M. Zakai, On the relation between ordinary and stochastic differential equations, Intern. J. Engng. Sci., 3 (1965), 213–229.


[128] T. Yamada and S. Watanabe, On the uniqueness of solutions of stochastic differential equations, J. Math. Kyoto Univ., 11 (1971), 155–167. [129] Y. Yamato, Stochastic differential equations and nilpotent Lie algebras, Z. Wahr. verw. Geb., 47 (1979), 213–229. [130] M. Yor, Exponential Functionals of Brownian Motion and Related Processes, Springer-Verlag, 2001. [131] M. Yor, Sur la continuité des temps locaux associés à certaines semimartingales, Astérisque 52–53 (1978), 23–35. [132] M. Yor, On some exponential functionals of Brownian motion, Adv. Appl. Prob., 24 (1992), 509–531. (Also in [130]) [133] K. Yosida, Functional Analysis, 6th edn., Springer-Verlag, 1980.


Index

Abelian theorem, 336 absorbing Brownian motion, 137 abstract Wiener space, 276 action integral, 298 adapted process, 10 Appell sequence, 307 arbitrage opportunity, 284 arcsine law, 122 Bessel process, 172 Black–Scholes model, 282 Black–Scholes formula, 292 Blumenthal 0-1 law, 105 Borel cylinder set, 3, 134 Borel isomorphic, 333 Borel σ-field, 2 Brownian bridge, 182 Brownian motion, 9 Brownian motion with constant drift, 169 Burkholder–Davis–Gundy inequality, 70 Cameron–Martin subspace, 6, 8 Cameron–Martin theorem, 40 Chapman–Kolmogorov equation, 5 Clark–Ocone formula, 219 class (D), 34 class (DL), 34 classical path, 298 closable martingale, 19, 29 closable submartingale, 19, 29 complex Brownian motion, 97 conditional expectation, 11, 332 conditional probability, 332 cone, 130 conformal martingale, 97

conservative process, 136 continuous semimartingale, 61 coordinate process, 9 countably generated, 333 Dambis–Dubins–Schwarz theorem, 74 Delta, 293 diffusion matrix, 137 diffusion measure, 136 diffusion process, 135, 136 Dirac measure, 228 Dirichlet problem, 126 Doob decomposition, 15 Doob’s inequality, 22, 30 Doob–Meyer decomposition, 35 drift, 137 Dynkin class, 329 Dynkin class theorem, 329 entrance boundary, 171 equivalent martingale measure, 284 Euler number, 307 Euler polynomial, 307 European call option, 291 European contingent claim, 288 European put option, 292 exercise price, 291 exit boundary, 171 explosion, 140 extension (probability space), 78 Feynman–Kac formula, 117 filtered probability space, 9 filtration, 10 first hitting time, 11


{Ft }-Brownian motion, 37 fundamental solution, 111, 240 Gauss kernel, 5 generalized Wiener functional, 205 generator, 136 geometric Brownian motion, 140 Girsanov’s theorem, 158 Greeks, 293 Gronwall’s inequality, 329 Haar function, 7 Hamiltonian, 298 harmonic function, 109 harmonic oscillator, 265 heat equation, 111 heat kernel, 5, 111 Hermite polynomial, 199 Hörmander’s condition, 238 hypercontractivity, 207 hypergeometric function, 327 implied volatility, 293 increasing process, 34 index (Bessel process), 172 integration by parts formula, 225 Itô process, 61 Itô’s formula, 62 Itô’s representation theorem, 82 Itô–Tanaka formula, 91 Jensen’s inequality, 12 Kac’s formula, 120 Kato class, 123 Kazamaki condition, 160 KdV equation, 305 Khasminskii’s condition, 176 Khinchin’s law of iterated logarithm, 39 Koebe’s theorem, 126 Kolmogorov diffusion, 239 Kolmogorov operator, 239 Kolmogorov’s continuity theorem, 3, 334 Laplace transform, 335 large deviation, 47 Legendre function, 327 Lévy’s modulus of continuity, 39 life time, 133 linear fractional transformation, 319 local martingale, 52 local time, 87


locally square-integrable martingale, 53 Malliavin calculus, 195 Markov family, 134 Markov process, 134 Markov property, 103 Markovian type, 140 martingale, 13, 24 martingale problem, 156 martingale transform, 14 mean-value property, 126 measurable (stochastic process), 10 measurable space, 1 Meyer’s equivalence, 202 modification, 3 modified Bessel function, 324 monotone class, 330 monotone class theorem, 330 multiplicative class, 329 natural boundary, 171 natural process, 34 non-degenerate (Wiener functional), 224 Novikov condition, 160 occupation time formula, 93 optional sampling theorem, 20 optional stopping theorem, 16 Ornstein–Uhlenbeck operator, 203 Ornstein–Uhlenbeck process, 67, 141, 182 Ornstein–Uhlenbeck semigroup, 203 pathwise uniqueness, 142 piecewise C m -class, 120 piecewise continuous function, 120 pinned Brownian motion, 182 Pitman’s theorem, 96 Poincaré cone condition, 130 Poincaré metric, 318 Poisson integral formula, 132 portfolio, 282 predictable process, 14, 55 price, 291 probability distribution, 2 probability law, 2 probability measure, 1 probability space, 1 product (for stochastic differential), 67 progressively measurable process, 10 pull-back, 228 put-call parity, 292


quadratic form (on Wiener spaces), 257 quadratic variation process, 36, 53 random field, 334 random variable, 2 random walk, 5 recurrent, 108 reflecting Brownian motion, 93 reflectionless potential, 305 regular boundary, 171 regular conditional probability, 332 regular point, 128 regularly varying (function), 335 replicable, 288 sample path, 3 scale function, 168 scattering data, 305 Schauder function, 7 Schilder’s theorem, 43 Selberg’s trace formula, 319 self-financing, 282 semigroup, 123 shift, 103 σ-field, 1 Skorohod equation, 93 Skorohod integral, 217 slowly varying function, 335 Sobolev space (over Wiener space), 198 soliton solution, 305 special linear group, 319 speed measure, 169 standard measurable space, 333 standard probability space, 333 stochastic area, 240, 269, 300 stochastic differential, 67 stochastic differential equation, 138 stochastic flow, 183 stochastic integral, 56, 59, 60 stochastic process, 3


stock price process, 282 stopping time, 11, 16 Stratonovich integral, 69 strong Markov family, 135 strong Markov property, 106 strong solution, 143 submartingale, 13, 24 successive approximation, 152 supermartingale, 13, 24 support problem, 193 Tanaka’s formula, 87 Tauberian theorem, 336 time change, 164 time change process, 164 time-homogeneous equation, 140 transformation of drift, 157 transient, 108 transition probability, 135 uniformly integrable family of random variables, 331 uniqueness in law, 142 upcrossing number, 17 upper half plane, 318 usual condition, 10 value process, 282 Van Vleck, 297 volatility, 281 well measurable process, 55 well posed martingale problem, 156 Wiener chaos, 201 Wiener functional, 5 Wiener integral, 40 Wiener measure, 4 Wiener process, 9 Wiener space, 5
