Applied Stochastic Control of Jump Diffusions [3 ed.] 978-3-030-02781-0

The main purpose of the book is to give a rigorous introduction to the most important and useful solution methods of var


English Pages 439 Year 2019

Universitext

Bernt Øksendal Agnès Sulem

Applied Stochastic Control of Jump Diffusions Third Edition


Series Editors
Sheldon Axler, San Francisco State University, San Francisco, CA, USA
Carles Casacuberta, Universitat de Barcelona, Barcelona, Spain
Angus MacIntyre, Queen Mary University of London, London, UK
Kenneth Ribet, University of California, Berkeley, CA, USA
Claude Sabbah, École Polytechnique, CNRS, Université Paris-Saclay, Palaiseau, France
Endre Süli, University of Oxford, Oxford, UK
Wojbor A. Woyczyński, Case Western Reserve University, Cleveland, OH, USA

Universitext is a series of textbooks that presents material from a wide variety of mathematical disciplines at master’s level and beyond. The books, often well class-tested by their author, may have an informal, personal even experimental approach to their subject matter. Some of the most successful and established books in the series have evolved through several editions, always following the evolution of teaching curricula, to very polished texts. Thus as research topics trickle down into graduate-level teaching, first textbooks written for new, cutting-edge courses may make their way into Universitext.

More information about this series at http://www.springer.com/series/223


Bernt Øksendal Department of Mathematics University of Oslo Oslo, Norway

Agnès Sulem Inria Research Center of Paris Paris, France

ISSN 0172-5939
ISSN 2191-6675 (electronic)
Universitext
ISBN 978-3-030-02779-7
ISBN 978-3-030-02781-0 (eBook)
https://doi.org/10.1007/978-3-030-02781-0
Library of Congress Control Number: 2018965883
Mathematics Subject Classification (2010): 47J20, 49J40, 60G40, 65M06, 65M12, 91A23, 91B28, 91GXX, 93E20
1st and 2nd edition: © Springer-Verlag Berlin Heidelberg 2005, 2007
3rd edition: © Springer Nature Switzerland AG 2019

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

To my family
Eva, Elise, Anders, and Karina
Bernt Øksendal

A tous ceux qui m’accompagnent
Agnès Sulem

Preface to the Third Edition

In this third edition, we have expanded and updated the second edition and included more recent developments within stochastic control and its applications. Specifically, we have replaced Section 1.5 on application to finance by a more comprehensive presentation of financial markets modeled by jump diffusions (the new Chap. 2). We have added a new chapter on backward stochastic differential equations, convex risk measures, and recursive utilities (Chap. 4). Moreover, we have expanded the optimal stopping chapter (was Chap. 2, now Chap. 3) and the stochastic control chapter (was Chap. 3, now Chap. 5) and added a new chapter on stochastic differential games (Chap. 6). In addition, we have corrected errors and updated and improved the presentation throughout the book.

We wish to thank Nacira Agram, Kristina Dahl, Olfa Draouil, Michele Giordano, Huigi Guan, Gambarra Matteo, Sandun Perara, Andriy Pylypenko, and Yngve Willassen for their helpful comments. Last but not least, we are very grateful to Martine Verneuille for her valuable help with the typing through many years.

Oslo, Norway
Paris, France
2018

Bernt Øksendal
Agnès Sulem

Preface to the Second Edition

In this second edition, we have added a chapter on optimal control of random jump fields (solutions of stochastic partial differential equations) and partial information control (Chap. 13). We have also added a section on optimal stopping with delayed information (Sect. 3.3). It has always been our intention to give a contemporary presentation of applied stochastic control, and we hope that the addition of these recent developments will contribute in this direction. We have also made a number of corrections and other improvements, many of them based on helpful comments from our readers. In particular, we would like to thank Andreas Kyprianou for his valuable communications. We are also grateful to (in alphabetical order) Knut Aase, Jean-Philippe Chancelier, Inga Eide, Emil Framnes, Arne-Christian Lund, Jose-Luis Menaldi, Tamás K. Papp, Atle Seierstad, and Jens Arne Sukkestad for pointing out errors and suggesting improvements. Our special thanks go to Martine Verneuille for her skillful typing.

Oslo and Paris
November 2006

Bernt Øksendal
Agnès Sulem

Preface to the First Edition

Jump diffusions are solutions of stochastic differential equations driven by Lévy processes. Since a Lévy process η(t) can be written as a linear combination of t, a Brownian motion B(t) and a pure jump process, jump diffusions represent a natural and useful generalization of Itô diffusions. They have received a lot of attention in the last years because of their many applications, particularly in economics.

There exist today several excellent monographs on Lévy processes. However, very few of them, if any, discuss the optimal control, optimal stopping, and impulse control of the corresponding jump diffusions, which is the subject of this book. Moreover, our presentation differs from these books in that it emphasizes the applied aspect of the theory. Therefore, we focus mostly on useful verification theorems, and we illustrate the use of the theory by giving examples and exercises throughout the text. Detailed solutions of some of the exercises are given at the end of the book. The exercises to which a solution is provided are marked with an asterisk (*).

It is our hope that this book will fill a gap in the literature and that it will be a useful text for students, researchers, and practitioners in stochastic analysis and its many applications. Although most of our results are motivated by examples in economics and finance, the results are general and can be applied in a wide variety of situations. To emphasize this, we have also included examples in biology and physics/engineering.

This book is partially based on courses given at the Norwegian School of Economics and Business Administration (NHH) in Bergen, Norway, during the Spring semesters 2000 and 2002, at INSEA in Rabat, Morocco in September 2000, at Odense University in August 2001, and at ENSAE in Paris in February 2002.

Acknowledgements

We are grateful to many people who in various ways have contributed to these lecture notes. In particular, we thank Knut Aase, Fred Espen Benth, Jean-Philippe Chancelier, Rama Cont, Hans Marius Eikseth, Nils Christian Framstad, Jørgen Haug, Monique Jeanblanc, Kenneth Karlsen, Arne-Christian Lund, Thilo Meyer-Brandis, Cloud Makasu, Sure Mataramvura, Peter Tankov, and Jan Ubøe for their valuable help. We also thank Francesca Biagini for useful comments and suggestions to the text and her detailed solutions of some of the exercises. We are grateful to Dina Haraldsson and Martine Verneuille for proficient typing and Eivind Brodal for his kind assistance. We acknowledge with gratitude the support by the French–Norwegian cooperation project Stochastic Control and Applications, Aur 99-050.

Oslo and Paris
August 2004

Bernt Øksendal
Agnès Sulem

Contents

1 Stochastic Calculus with Lévy Processes
  1.1 Basic Definitions and Results on Lévy Processes
  1.2 The Itô Formula and Related Results
  1.3 Lévy Stochastic Differential Equations
  1.4 The Girsanov Theorem and Applications
  1.5 Exercises

2 Financial Markets Modeled by Jump Diffusions
  2.1 Market Definitions and Arbitrage
  2.2 Hedging and Completeness
  2.3 Option Pricing
    2.3.1 European Options
    2.3.2 American Options
  2.4 Exercises

3 Optimal Stopping of Jump Diffusions
  3.1 A General Formulation and a Verification Theorem
  3.2 Applications and Examples
  3.3 Optimal Stopping with Delayed Information
  3.4 Exercises

4 Backward Stochastic Differential Equations and Risk Measures
  4.1 Examples
  4.2 General BSDEs with Jumps
  4.3 Linear BSDEs
  4.4 Comparison Theorems
  4.5 Convex Risk Measures and BSDEs
    4.5.1 Dynamic Risk Measures
    4.5.2 A Dual Representation of Convex Risk Measures
  4.6 Exercises

5 Stochastic Control of Jump Diffusions
  5.1 The Dynamic Programming Approach
  5.2 Stochastic Maximum Principles for Partial Information Control
    5.2.1 A Sufficient Maximum Principle
    5.2.2 A Necessary Maximum Principle
    5.2.3 The Relation Between the Maximum Principle and Dynamic Programming
    5.2.4 Utility Maximization
    5.2.5 Mean-Variance Portfolio Selection
  5.3 The Maximum Principle with Infinite Horizon
    5.3.1 A Sufficient Maximum Principle
    5.3.2 A Necessary Maximum Principle
  5.4 Optimal Control of FBSDEs by Means of Stochastic HJB Equations
    5.4.1 Optimal Control of FBSDEs
    5.4.2 Applications in Mathematical Finance
  5.5 Optimal Control of Stochastic Delay Equations
    5.5.1 A Sufficient Maximum Principle
    5.5.2 A Necessary Maximum Principle
    5.5.3 Time-Advanced BSDEs with Jumps
    5.5.4 Example: Optimal Consumption from a Cash Flow with Delay
  5.6 Exercises

6 Stochastic Differential Games
  6.1 Stochastic Differential (Markov) Games, HJB–Isaacs Equations
    6.1.1 Entropic Risk Minimization Example
  6.2 Stochastic Maximum Principles
    6.2.1 General (Non-zero) Stochastic Differential Games
    6.2.2 The Zero-Sum Game Case
    6.2.3 Proofs of the Stochastic Maximum Principles
    6.2.4 Risk Minimization by FBSDE Games
  6.3 Mean-Field Games
    6.3.1 Two Motivating Examples
    6.3.2 General Mean-Field Non-zero Sum Games
    6.3.3 A Sufficient Maximum Principle
    6.3.4 A Necessary Maximum Principle
    6.3.5 Application to Model Uncertainty Control
    6.3.6 The Zero-Sum Game Case
    6.3.7 The Single Player Case
  6.4 Exercises

7 Combined Optimal Stopping and Stochastic Control of Jump Diffusions
  7.1 Introduction
  7.2 A General Mathematical Formulation
  7.3 Applications
  7.4 Exercises

8 Singular Control for Jump Diffusions
  8.1 An Illustrating Example
  8.2 A General Formulation
  8.3 Application to Portfolio Optimization with Transaction Costs
  8.4 Exercises

9 Impulse Control of Jump Diffusions
  9.1 A General Formulation and a Verification Theorem
  9.2 Examples
  9.3 Exercises

10 Approximating Impulse Control by Iterated Optimal Stopping
  10.1 Iterative Scheme
  10.2 Examples
  10.3 Exercises

11 Combined Stochastic Control and Impulse Control of Jump Diffusions
  11.1 A Verification Theorem
  11.2 Examples
  11.3 Iterative Methods
  11.4 Exercises

12 Viscosity Solutions
  12.1 Viscosity Solutions of Variational Inequalities
    12.1.1 Uniqueness
  12.2 The Value Function is Not Always $C^1$
  12.3 Viscosity Solutions of HJBQVI
  12.4 Numerical Analysis of HJBQVI
    12.4.1 Finite Difference Approximation
    12.4.2 A Policy Iteration Algorithm for HJBQVI
  12.5 Exercises

13 Optimal Control of Stochastic Partial Differential Equations and Partial (Noisy) Observation Control
  13.1 A Motivating Example
  13.2 The Maximum Principle
    13.2.1 Return to Example 13.1
  13.3 A Necessary Maximum Principle
  13.4 Controls Which do not Depend on x
  13.5 Application to Partial (Noisy) Observation Optimal Control
    13.5.1 Optimal Portfolio with Noisy Observations
  13.6 Exercises

14 Solutions of Selected Exercises
  14.1 Exercises of Chap. 1
  14.2 Exercises of Chap. 2
  14.3 Exercises of Chap. 3
  14.4 Exercises of Chap. 4
  14.5 Exercises of Chap. 5
  14.6 Exercises of Chap. 6
  14.7 Exercises of Chap. 7
  14.8 Exercises of Chap. 8
  14.9 Exercises of Chap. 9
  14.10 Exercises of Chap. 10
  14.11 Exercises of Chap. 11
  14.12 Exercises of Chap. 12
  14.13 Exercises of Chap. 13

References
Notation and Symbols
Index

Chapter 1

Stochastic Calculus with Lévy Processes

1.1 Basic Definitions and Results on Lévy Processes

In this chapter we present the basic concepts and results needed for the applied calculus of jump diffusions. Since there are several excellent books which give a detailed account of this basic theory, we will just briefly review it here and refer the reader to these books for more information.

Definition 1.1 Let $(\Omega, \mathcal{F}, \mathbb{F} = \{\mathcal{F}_t\}_{t \ge 0}, P)$ be a filtered probability space. An $\mathcal{F}_t$-adapted process $\{\eta(t)\}_{t \ge 0} = \{\eta_t\}_{t \ge 0} \subset \mathbb{R}$ with $\eta_0 = 0$ a.s. is called a Lévy process if $\eta_t$ is continuous in probability and has stationary and independent increments.

Theorem 1.2 Let $\{\eta_t\}$ be a Lévy process. Then $\eta_t$ has a càdlàg version (right continuous with left limits) which is also a Lévy process.

Proof See, e.g., [P, S].

In view of this result we will from now on assume that the Lévy processes we work with are càdlàg. The jump of $\eta_t$ at $t \ge 0$ is defined by

\[ \Delta\eta_t = \eta_t - \eta_{t^-}. \tag{1.1.1} \]

Let $\mathcal{B}_0$ be the family of Borel sets $U \subset \mathbb{R}$ whose closure $\bar{U}$ does not contain 0. For $U \in \mathcal{B}_0$ we define

\[ N(t, U) = N(t, U, \omega) = \sum_{s:\, 0 < s \le t} \mathcal{X}_U(\Delta\eta_s). \tag{1.1.2} \]

[…] $T_1(\omega) = \inf\{t > 0;\ \Delta\eta_t \in U\}$. We claim that $T_1(\omega) > 0$ a.s. To prove this note that by right continuity of paths we have

\[ \lim_{t \to 0^+} \eta(t) = \eta(0) = 0 \quad \text{a.s.} \]

Therefore, for all $\varepsilon > 0$ there exists $t(\varepsilon) > 0$ such that $|\eta(t)| < \varepsilon$ for all $t < t(\varepsilon)$. This implies that $\eta(t) \notin U$ for all $t < t(\varepsilon)$, if $\varepsilon < \operatorname{dist}(0, U)$. Next define inductively

\[ T_{n+1}(\omega) = \inf\{t > T_n(\omega);\ \Delta\eta_t \in U\}. \]

Then by the above argument $T_{n+1} > T_n$ a.s. We claim that $T_n \to \infty$ as $n \to \infty$, a.s. Assume not. Then $T_n \to T < \infty$. But then $\lim_{s \to T^-} \eta(s)$ cannot exist, contradicting the existence of left limits of the paths.

It is well known that Brownian motion $\{B(t)\}_{t \ge 0}$ has stationary and independent increments. Thus $B(t)$ is a Lévy process. Another important example is the following.

Example 1.4 (The Poisson Process) The Poisson process $\pi(t)$ of intensity $\lambda > 0$ is a Lévy process taking values in $\mathbb{N} \cup \{0\}$ and such that

\[ P[\pi(t) = n] = \frac{(\lambda t)^n}{n!} e^{-\lambda t}; \quad n = 0, 1, 2, \ldots \]
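The distribution in Example 1.4 can be checked numerically. The sketch below is only an illustration, not part of the original text: it assumes the standard construction of a Poisson process from i.i.d. exponential waiting times with rate λ, and the function names (`poisson_count`, `poisson_pmf`, `empirical_pmf`) are hypothetical. It compares the empirical frequencies of π(t) = n with the formula above.

```python
import math
import random


def poisson_count(lam, t, rng):
    """Number of arrivals in [0, t] when waiting times are i.i.d. Exp(lam).

    Assumption: this is the standard waiting-time construction of a
    Poisson process of intensity lam; it is not taken from the text.
    """
    count = 0
    clock = rng.expovariate(lam)
    while clock <= t:
        count += 1
        clock += rng.expovariate(lam)
    return count


def poisson_pmf(lam, t, n):
    """P[pi(t) = n] = (lam * t)^n / n! * exp(-lam * t), as in Example 1.4."""
    return (lam * t) ** n / math.factorial(n) * math.exp(-lam * t)


def empirical_pmf(lam, t, n_max, n_paths=20000, seed=0):
    """Empirical frequencies of pi(t) = 0, 1, ..., n_max over n_paths samples."""
    rng = random.Random(seed)
    counts = [0] * (n_max + 1)
    for _ in range(n_paths):
        k = poisson_count(lam, t, rng)
        if k <= n_max:
            counts[k] += 1
    return [c / n_paths for c in counts]


if __name__ == "__main__":
    lam, t = 2.0, 1.5
    for n, freq in enumerate(empirical_pmf(lam, t, 6)):
        print(n, round(freq, 4), round(poisson_pmf(lam, t, n), 4))
```

Running the script prints, for each n, the simulated frequency next to the theoretical probability; the two columns should agree up to Monte Carlo error.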

Theorem 1.5 ([P, Theorem 1.35])
1. The set function $U \to N(t, U, \omega)$ defines a σ-finite measure on $\mathcal{B}_0$ for each fixed $t, \omega$. The differential form of this measure is written $N(t, dz)$.
2. The set function $[a, b) \times U \to N(b, U, \omega) - N(a, U, \omega)$; $[a, b) \subset [0, \infty)$, $U \in \mathcal{B}_0$ defines a σ-finite measure for each fixed $\omega$. The differential form of this measure is written $N(dt, dz)$.
3. The set function
\[ \nu(U) = E[N(1, U)], \tag{1.1.3} \]
where $E = E_P$ denotes expectation with respect to $P$, also defines a σ-finite measure on $\mathcal{B}_0$, called the Lévy measure of $\{\eta_t\}$.
4. Fix $U \in \mathcal{B}_0$. Then the process $\pi_U(t) := \pi_U(t, \omega) := N(t, U, \omega)$ is a Poisson process of intensity $\lambda = \nu(U)$.

Example 1.6 (The Compound Poisson Process) Let $X(n)$, $n \in \mathbb{N}$ be a sequence of i.i.d. random variables taking values in $\mathbb{R}$ with common distribution $\mu_{X(1)} = \mu_X$ and let $\pi(t)$ be a Poisson process of intensity $\lambda$, independent of all the $X(n)$'s. The compound Poisson process $Y(t)$ is defined by

\[ Y(t) = X(1) + \cdots + X(\pi(t)), \quad t \ge 0. \tag{1.1.4} \]
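To make the definitions ν(U) = E[N(1, U)] from Theorem 1.5 and Y(t) from (1.1.4) concrete, the following sketch estimates ν(U) for a compound Poisson process by counting the jumps of size in U up to time 1. It is a hedged illustration only: the helper names are hypothetical, the jump law (standard normal) and the set U = [1, ∞) are assumptions chosen for the example, and the reference value λ·μ_X(U) used for comparison is the standard formula for the Lévy measure of a compound Poisson process.

```python
import math
import random


def compound_poisson_jumps(lam, jump_sampler, horizon, rng):
    """Jump times and sizes of a compound Poisson process on [0, horizon].

    Arrival times come from i.i.d. Exp(lam) waiting times; jump sizes are
    i.i.d. draws from jump_sampler, independent of the arrival times.
    """
    jumps = []
    clock = rng.expovariate(lam)
    while clock <= horizon:
        jumps.append((clock, jump_sampler(rng)))
        clock += rng.expovariate(lam)
    return jumps


def estimate_levy_measure(lam, jump_sampler, in_U, n_paths=20000, seed=1):
    """Monte Carlo estimate of nu(U) = E[N(1, U)]: mean number of jumps in U up to time 1."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_paths):
        path = compound_poisson_jumps(lam, jump_sampler, 1.0, rng)
        total += sum(1 for _, size in path if in_U(size))
    return total / n_paths


def standard_normal_tail(a):
    """P[X >= a] for a standard normal random variable X."""
    return 0.5 * math.erfc(a / math.sqrt(2.0))


if __name__ == "__main__":
    lam = 3.0  # assumed intensity, for illustration only
    estimate = estimate_levy_measure(
        lam, lambda rng: rng.gauss(0.0, 1.0), lambda z: z >= 1.0
    )
    # Compare with lam * mu_X(U) for U = [1, infinity).
    print(estimate, lam * standard_normal_tail(1.0))
```

The closure of U = [1, ∞) does not contain 0, so U belongs to the family B0 on which N(t, U) is defined.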

An increment of the compound Poisson process $Y$ is given by

\[ Y(s) - Y(t) = \sum_{k=\pi(t)+1}^{\pi(s)} X(k), \quad s > t. \]

This is independent of $X(1), \ldots, X(\pi(t))$, and its distribution depends only on the difference $(s - t)$ and on the distribution of $X(1)$. Thus $Y(t)$ is a Lévy process. To find the Lévy measure $\nu$ of $Y(t)$ note that if $U \in \mathcal{B}_0$ then

\[ \nu(U) = E[N(1, U)] = E\Big[ \sum_{s;\, 0 < s \le 1} \mathcal{X}_U(\Delta Y(s)) \Big] \]

[…] > ε) < ε).

Let $L_{ucp}$ denote the space of adapted càglàd processes (left continuous with right limits), equipped with the ucp topology. If $H(t)$ is a step function of the form

\[ H(t) = H_0 \mathcal{X}_{\{0\}}(t) + \sum_i H_i \mathcal{X}_{(T_i, T_{i+1}]}(t), \]

where $H_i \in \mathcal{F}_{T_i}$ and $0 = T_0 \le T_1 \le \cdots \le T_{n+1} < \infty$ are $\mathbb{F}$-stopping times and $X$ is càdlàg, we define

\[ J_X H(t) := \int_0^t H_s \, dX_s := H_0 X_0 + \sum_i H_i (X_{T_{i+1} \wedge t} - X_{T_i \wedge t}), \quad t \ge 0. \]

Theorem 1.13 ([P]) Let $X$ be a semimartingale. Then the mapping $J_X$ can be extended to a continuous linear map $J_X : L_{ucp} \to D_{ucp}$.

This construction allows us to define stochastic integrals of the form

\[ \int_0^t H(s) \, d\eta_s \]

for all $H \in L_{ucp}$. (See also Remark 1.18.) In view of the decomposition (1.1.6) this integral can be split into integrals with respect to $ds$, $dB(s)$, $\tilde{N}(ds, dz)$, and $N(ds, dz)$. This makes it natural to consider the more general stochastic integrals of the form

\[ X(t) = X(0) + \int_0^t \alpha(s, \omega) \, ds + \int_0^t \beta(s, \omega) \, dB(s) + \int_0^t \int_{\mathbb{R}} \gamma(s, z, \omega) \, \bar{N}(ds, dz), \tag{1.1.11} \]

where the integrands are $\mathbb{F}$-predictable and satisfy the growth condition

\[ \int_0^t \Big[ |\alpha(s)| + \beta^2(s) + \int_{\mathbb{R}} \gamma^2(s, z) \, \nu(dz) \Big] ds < \infty \quad \text{a.s. for all } t > 0. \]

Here, for simplicity, we have put

\[ \bar{N}(ds, dz) = \begin{cases} N(ds, dz) - \nu(dz) \, ds & \text{if } |z| < R \\ N(ds, dz) & \text{if } |z| \ge R, \end{cases} \]

with $R$ as in Theorem 1.7. As is customary we will use the following shorthand differential notation for processes $X(t)$ satisfying (1.1.11):

\[ dX(t) = \alpha(t) \, dt + \beta(t) \, dB(t) + \int_{\mathbb{R}} \gamma(t, z) \, \bar{N}(dt, dz). \tag{1.1.12} \]

We call such processes Itô–Lévy processes. Recall that a semimartingale $M(t)$ is called a local martingale up to time $T$ (with respect to $P$) if there exists an increasing sequence of $\mathcal{F}_t$-stopping times $\tau_n$ such that $\lim_{n \to \infty} \tau_n = T$ a.s. and $M(t \wedge \tau_n)$ is a martingale with respect to $P$ for all $n$. Note that

1. If
\[ E\Big[ \int_0^T \int_{\mathbb{R}} \gamma^2(t, z) \, \nu(dz) \, dt \Big] < \infty, \tag{1.1.13} \]
then the process
\[ M(t) := \int_0^t \int_{\mathbb{R}} \gamma(s, z) \, \tilde{N}(ds, dz), \quad 0 \le t \le T \]
is a martingale.
2. If
\[ \int_0^T \int_{\mathbb{R}} \gamma^2(t, z) \, \nu(dz) \, dt < \infty \quad \text{a.s.}, \tag{1.1.14} \]
then $M(t)$ is a local martingale, $0 \le t \le T$.
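The Itô–Lévy dynamics and the martingale property of the compensated jump integral M(t) can be explored with a small simulation. The sketch below is an illustration under explicit assumptions that are not in the text: the coefficients are deterministic functions of time, the Lévy measure is finite (ν = λ·μ for a jump-size law μ with finite mean), the jump integral is taken against the compensated measure Ñ(dt, dz) = N(dt, dz) − ν(dz)dt, and time is discretized by a plain Euler scheme. All function and parameter names are hypothetical.

```python
import math
import random


def simulate_ito_levy(x0, alpha, beta, gamma, gamma_mean, lam, jump_sampler,
                      T, n_steps, rng):
    """Euler sketch of dX = alpha dt + beta dB + int gamma(t, z) N~(dt, dz); returns X(T).

    Assumptions (illustrative only): finite jump intensity lam, jump sizes
    drawn by jump_sampler, and gamma_mean(t) = E[gamma(t, Z)], so that
    lam * gamma_mean(t) dt is the compensator of the jump part.
    """
    dt = T / n_steps
    x = x0
    next_jump = rng.expovariate(lam)
    for i in range(n_steps):
        t = i * dt
        # Drift and Brownian increment over (t, t + dt].
        x += alpha(t) * dt + beta(t) * rng.gauss(0.0, math.sqrt(dt))
        # Subtract the compensator of the jump part.
        x -= lam * gamma_mean(t) * dt
        # Add every jump whose arrival time falls in (t, t + dt].
        while next_jump <= t + dt:
            x += gamma(next_jump, jump_sampler(rng))
            next_jump += rng.expovariate(lam)
    return x


def mean_terminal_value(n_paths, seed, **model):
    """Monte Carlo mean of X(T) over n_paths independent Euler paths."""
    rng = random.Random(seed)
    total = sum(simulate_ito_levy(rng=rng, **model) for _ in range(n_paths))
    return total / n_paths


if __name__ == "__main__":
    model = dict(
        x0=1.0,
        alpha=lambda t: 0.0,            # no drift: X should be a martingale
        beta=lambda t: 0.2,
        gamma=lambda t, z: z,
        gamma_mean=lambda t: 0.1,       # mean of Uniform(-0.1, 0.3)
        lam=2.0,
        jump_sampler=lambda rng: rng.uniform(-0.1, 0.3),
        T=1.0,
        n_steps=50,
    )
    print(mean_terminal_value(4000, seed=2, **model))
```

With zero drift the sample mean of X(T) should stay close to X(0), in line with note 1 above; with a constant drift a the mean shifts by a·T.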

1.2 The Itô Formula and Related Results

We now come to the important Itô formula for Itô–Lévy processes: If $X(t)$ is given by (1.1.12) and $f : \mathbb{R}^2 \to \mathbb{R}$ is a $C^2$ function, is the process $Y(t) := f(t, X(t))$ again an Itô–Lévy process and if so, how do we represent it in the form (1.1.12)? If we argue heuristically and use our knowledge of the classical Itô formula it is easy to guess what the answer is: Let $X^{(c)}(t)$ be the continuous part of $X(t)$, i.e., $X^{(c)}(t)$ is obtained by removing the jumps from $X(t)$. Then an increment in $Y(t)$ stems from an increment in $X^{(c)}(t)$ plus the jumps (coming from $N(\cdot, \cdot)$). Hence in view of the classical Itô formula we would guess that

\[
\begin{aligned}
dY(t) ={}& \frac{\partial f}{\partial t}(t, X(t)) \, dt + \frac{\partial f}{\partial x}(t, X(t)) \, dX^{(c)}(t) + \frac{1}{2} \frac{\partial^2 f}{\partial x^2}(t, X(t)) \cdot \beta^2(t) \, dt \\
&+ \int_{\mathbb{R}} \{ f(t, X(t^-) + \gamma(t, z)) - f(t, X(t^-)) \} \, N(dt, dz).
\end{aligned}
\]

It can be proved that our guess is correct. Since $dX^{(c)}(t) = \big(\alpha(t) - \int_{|z| < R} \cdots$ […]

\[
\begin{aligned}
\cdots ={}& \langle y_0(\cdot), \phi \rangle + \int_0^t \langle A_u y(s, \cdot), \phi \rangle \, ds \\
&+ \int_0^t \langle z(s, \cdot), \phi \rangle \, dB(s) + \int_0^t \int_{\mathbb{R}} \langle k(s, \cdot, \zeta), \phi \rangle \, \tilde{N}(ds, d\zeta),
\end{aligned}
\tag{5.4.6}
\]

for some $y_0(\cdot) \in V$, where $\langle \cdot, \cdot \rangle$ denotes the dual pairing between the space $V$ and its dual $V^*$, and $V := W_0^{1,2}(\mathbb{R})$ is the Sobolev space of order one with zero boundary condition at infinity. Note that with this framework the Itô calculus can be applied to (5.4.5). See [P, PR]. By the Itô–Ventzell formula (see [ØZ] and the references therein),

\[
\begin{aligned}
dY(t) ={}& A_u(y(\cdot), z(\cdot), k(\cdot))(t, X(t)) \, dt + z(t, X(t)) \, dB(t) \\
&+ \int_{\mathbb{R}} k(t, X(t), \zeta) \, \tilde{N}(dt, d\zeta) + y'(t, X(t)) [\alpha(t) \, dt + \beta(t) \, dB(t)] \\
&+ \frac{1}{2} y''(t, X(t)) \beta^2(t) \, dt \\
&+ \int_{\mathbb{R}} \{ y(t, X(t) + \gamma(t, \zeta)) - y(t, X(t)) - y'(t, X(t)) \gamma(t, \zeta) \} \, \nu(d\zeta) \, dt \\
&+ \int_{\mathbb{R}} \{ y(t, X(t) + \gamma(t, \zeta)) - y(t, X(t)) \} \, \tilde{N}(dt, d\zeta) + z'(t, X(t)) \beta(t) \, dt \\
&+ \int_{\mathbb{R}} \{ k(t, X(t) + \gamma(t, \zeta), \zeta) - k(t, X(t), \zeta) \} \, \nu(d\zeta) \, dt \\
&+ \int_{\mathbb{R}} k(t, X(t^-) + \gamma(t, \zeta), \zeta) \, \tilde{N}(dt, d\zeta),
\end{aligned}
\tag{5.4.7}
\]

where $y'(t, x) = \frac{\partial y}{\partial x}(t, x)$ etc. and using the shorthand notation $\alpha(t) = \alpha(t, X(t), Y(t), Z(t), K(t, \cdot), u(t))$ etc. Rearranging the terms we see that

\[
\begin{aligned}
dY(t) ={}& \Big[ A_u(y(\cdot), z(\cdot), k(\cdot))(t, X(t)) + y'(t, X(t)) \alpha(t) + \frac{1}{2} y''(t, X(t)) \beta^2(t) \\
&\quad + \int_{\mathbb{R}} \{ y(t, X(t) + \gamma(t, \zeta)) - y(t, X(t)) - y'(t, X(t)) \gamma(t, \zeta) \} \, \nu(d\zeta) \\
&\quad + z'(t, X(t)) \beta(t) + \int_{\mathbb{R}} \{ k(t, X(t) + \gamma(t, \zeta), \zeta) - k(t, X(t), \zeta) \} \, \nu(d\zeta) \Big] dt \\
&+ [ z(t, X(t)) + y'(t, X(t)) \beta(t) ] \, dB(t) \\
&+ \int_{\mathbb{R}} \{ y(t, X(t) + \gamma(t, \zeta)) - y(t, X(t)) + k(t, X(t) + \gamma(t, \zeta), \zeta) \} \, \tilde{N}(dt, d\zeta).
\end{aligned}
\tag{5.4.8}
\]

Comparing (5.4.8) with the original Eq. (5.4.2) for $Y$ we deduce the following result.

Theorem 5.13 Suppose that $(y(t, x), z(t, x), k(t, x, \cdot))$ satisfies the BSPDE

\[ dy(t, x) = -A_u(t, x) \, dt + z(t, x) \, dB(t) + \int_{\mathbb{R}} k(t, x, \zeta) \, \tilde{N}(dt, d\zeta); \quad y(T, x) = h(x), \tag{5.4.9} \]

where

\[
\begin{aligned}
A_u(t, x) ={}& A_u(y(\cdot), z(\cdot), k(\cdot))(t, x) \\
:={}& g\big(t, x, y(t, x), z(t, x) + y'(t, x) \beta(t), \\
&\quad y(t, x + \gamma(t, \cdot)) - y(t, x) + k(t, x + \gamma(t, \cdot), \cdot), u(t, x)\big) \\
&+ y'(t, x) \alpha(t) + \frac{1}{2} y''(t, x) \beta^2(t) + z'(t, x) \beta(t) \\
&+ \int_{\mathbb{R}} \{ y(t, x + \gamma(t, \zeta)) - y(t, x) - y'(t, x) \gamma(t, \zeta) \} \, \nu(d\zeta) \\
&+ \int_{\mathbb{R}} \{ k(t, x + \gamma(t, \zeta), \zeta) - k(t, x, \zeta) \} \, \nu(d\zeta).
\end{aligned}
\tag{5.4.10}
\]

Then $(Y(t), Z(t), K(t, \zeta))$, given by

\[ Y(t) := y(t, X(t)), \tag{5.4.11} \]
\[ Z(t) := z(t, X(t)) + y'(t, X(t)) \beta(t), \tag{5.4.12} \]
\[ K(t, \zeta) := y(t, X(t) + \gamma(t, \zeta)) - y(t, X(t)) + k(t, X(t) + \gamma(t, \zeta), \zeta), \tag{5.4.13} \]

is a solution of the FBSDE system (5.4.1)–(5.4.2).

Definition 5.14 We say that the BSPDE (5.4.9) satisfies the comparison principle with respect to $u$ if for all $u_1, u_2 \in \mathcal{A}$ and all $\mathcal{F}_T$-measurable random variables $h_1, h_2$ with corresponding solutions $(y_i, z_i, k_i)$, $i = 1, 2$, of (5.4.9) such that

\[ A_{u_1}(t, x) \le A_{u_2}(t, x) \quad \text{for all } (t, x) \in [0, T] \times \mathbb{R} \]

and

\[ h_1(x) \le h_2(x) \quad \text{for all } x \in \mathbb{R}, \]

we have

\[ y_1(t, x) \le y_2(t, x) \quad \text{for all } (t, x) \in [0, T] \times \mathbb{R}. \]

Finding sufficient conditions for the validity of comparison principles for BSPDEs with jumps is still an open question in this setting. For related results see [ØSZ2]. However, in the Brownian case, sufficient conditions for the validity of comparison principles for BSPDEs of the type (5.4.10) are given in Theorem 7.1 in [MYZ]. Using this result we get

Theorem 5.15 Assume that the following hold:
• $N = K = 0$, i.e. there are no jumps
• The coefficients $\alpha$, $\beta$, and $g$ are $\mathbb{F}$-progressively measurable for each fixed $(x, y, z)$ and $h$ is $\mathcal{F}_T$-measurable for each fixed $x$
• $\alpha$, $\beta$, $g$, $h$ are uniformly Lipschitz-continuous in $(x, y, z)$
• $\alpha$ and $\beta$ are bounded and $E\big[ \int_0^T g^2(t, 0, 0, 0) \, dt + h^2(0) \big] < \infty$
• $\alpha$ and $\beta$ do not depend on $z$.
Then the comparison principle holds for the BSPDE (5.4.9).

From the above we deduce the following result, which may be regarded as a stochastic HJB equation for optimal control of possibly non-Markovian FBSDEs.

Theorem 5.16 (Stochastic HJB equation) Suppose the comparison principle holds for the BSPDE (5.4.9). Moreover, suppose that for all $t, x, \omega$ there exists a maximizer $u = \hat{u}(t, x) = \hat{u}(y, y', y'', z, z', k)(t, x, \omega)$ of the function $u \to A_u(t, x)$. Suppose the system (5.4.9) with $u = \hat{u}$ has a unique solution $(\hat{y}(t, x), \hat{z}(t, x), \hat{k}(t, x, \cdot))$ and that $\hat{u}(t, X(t)) \in \mathcal{A}$. Then $\hat{u}(t, X(t))$ is an optimal control for the problem (5.4.3), with optimal value

\[ \sup_{u \in \mathcal{A}} Y^u(0) = Y^{\hat{u}}(0) = \hat{y}(0, x). \tag{5.4.14} \]

Note that in this general non-Markovian setting the classical value function from dynamic programming is replaced by the solution $\hat{y}(t, x)$ of the BSPDE (5.4.9) for $u = \hat{u}$.

5.4.2 Applications in Mathematical Finance

We now illustrate Theorem 5.16 by looking at some examples.

Example 5.17 (Utility Maximization) First we consider the Merton problem, the solution of which is well known in the Markovian case with deterministic coefficients. Here we consider the general non-Markovian case, when the coefficients are stochastic processes. Consider a financial market consisting of a risk free investment, with unit price $S_0(t) := 1$; $t \in [0, T]$ and a risky investment, with unit price

\[ dS_1(t) = S_1(t) [b(t) \, dt + \sigma(t) \, dB(t)]; \quad t \in [0, T]. \tag{5.4.15} \]

Here $b(t) = b(t, \omega)$ and $\sigma(t) = \sigma(t, \omega) > 0$ are given adapted processes. Let $u(t, X(t))$ be a portfolio, representing the amount invested in the risky asset at time $t$. If $u$ is self-financing, then the corresponding wealth $X(t)$ at time $t$ is given by the stochastic differential equation

\[ dX(t) = dX_x^u(t) = u(t, X(t)) [b(t) \, dt + \sigma(t) \, dB(t)], \quad t \in [0, T]; \quad X(0) = x > 0. \tag{5.4.16} \]

Let $(Y(t), Z(t)) = (Y_x^u(t), Z_x^u(t))$ be the solution of the BSDE

\[ dY(t) = Z(t) \, dB(t), \quad t \in [0, T]; \quad Y(T) = U(X(T)), \tag{5.4.17} \]

where $U(X) = U(X, \omega)$ is a given utility function, possibly random. Then

\[ Y_x^u(0) = E[U(X_x^u(T))]. \]

Therefore, the classical portfolio optimization problem of Merton is to find $\hat{u} \in \mathcal{A}$ such that

\[ \sup_{u \in \mathcal{A}} Y_x^u(0) = Y_x^{\hat{u}}(0). \tag{5.4.18} \]

In the following we assume that sup Yxu (0) < ∞.

u∈A

(5.4.19)

In this general non-Markovian setting with stochastic coefficients b(t) = b(t, ω) and σ(t) = σ(t, ω) > 0, an explicit expression for the optimal portfolio $\hat{u}$ is not known. Let us apply the theory from the previous sections to study this problem. In this case we get, from (5.4.10),
\[ A_u(t, x) = y'(t, x)\,u\,b(t) + \frac{1}{2}\,y''(t, x)\,u^2 \sigma^2(t) + z'(t, x)\,u\,\sigma(t), \tag{5.4.20} \]
where primes denote derivatives with respect to x. This expression is maximal when
\[ u = \hat{u}(t, x) = -\frac{y'(t, x)b(t) + z'(t, x)\sigma(t)}{y''(t, x)\sigma^2(t)}. \tag{5.4.21} \]
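The maximization leading to (5.4.21) can be checked numerically at a frozen point (t, x). The following is a minimal sketch, not part of the original text: the names `hamiltonian_term`, `u_hat`, `maximal_value` and all numerical values are illustrative assumptions, and the arguments `y1`, `y2`, `z1` stand for y', y'', z'. A maximum requires y'' < 0.

```python
def hamiltonian_term(u, y1, y2, z1, b, sigma):
    """A_u from (5.4.20) at a fixed (t, x); y1, y2, z1 stand for y', y'', z'."""
    return y1 * u * b + 0.5 * y2 * u**2 * sigma**2 + z1 * u * sigma


def u_hat(y1, y2, z1, b, sigma):
    """Candidate maximizer (5.4.21); it is a maximum only if y2 < 0."""
    return -(y1 * b + z1 * sigma) / (y2 * sigma**2)


def maximal_value(y1, y2, z1, b, sigma):
    """Closed-form maximal value -(y'b + z'sigma)^2 / (2 y'' sigma^2)."""
    return -((y1 * b + z1 * sigma) ** 2) / (2.0 * y2 * sigma**2)


# Illustrative numbers only (assumed, not taken from the text); note y2 < 0.
params = dict(y1=1.3, y2=-0.7, z1=0.2, b=0.05, sigma=0.25)
u_star = u_hat(**params)

# Brute-force search on a grid centred at u_star, as an independent check.
grid = [u_star + 0.01 * k for k in range(-200, 201)]
u_grid = max(grid, key=lambda u: hamiltonian_term(u, **params))
```

The grid search returns the same point as the formula, and the value of the quadratic there agrees with the closed-form maximum.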

Substituting this into $A_{\hat{u}}(t, x)$ we obtain
\[ A_{\hat{u}}(t, x) = -\frac{(y'(t, x)b(t) + z'(t, x)\sigma(t))^2}{2y''(t, x)\sigma^2(t)}. \tag{5.4.22} \]

Hence the BSPDE for y(t, x), z(t, x) gets the form
\[ \begin{cases} dy(t, x) = \dfrac{(y'(t, x)b(t) + z'(t, x)\sigma(t))^2}{2y''(t, x)\sigma^2(t)}\,dt + z(t, x)\,dB(t) ; & t \in [0, T] \\ y(T, x) = U(x). \end{cases} \tag{5.4.23} \]
We have proved:

Proposition 5.18 Suppose there exists a solution (y(t, x), z(t, x)) of the BSPDE (5.4.23) with $y''(t, x) < 0$. Suppose that $\hat{u}$ defined in (5.4.21) is admissible. Then $\hat{u}$ is optimal for problem (5.4.18) and
\[ y(0, x) = \sup_{u \in \mathcal{A}} Y_x^u(0) = Y_x^{\hat{u}}(0). \tag{5.4.24} \]

Note that if b, σ and U are deterministic, we can choose z(t, x) = 0 in (5.4.23) and this leads to the following (deterministic) PDE for y(t, x):
\[ \frac{\partial y}{\partial t}(t, x) - \frac{y'(t, x)^2 b^2(t)}{2y''(t, x)\sigma^2(t)} = 0 ; \; t \in [0, T] ; \quad y(T, x) = U(x). \tag{5.4.25} \]
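As an illustration of (5.4.25), one can test a candidate solution by finite differences. The sketch below assumes constant b and σ and the power utility U(x) = x^γ/γ with γ ∈ (0, 1); these choices, the candidate y(t, x) = exp(c(T − t)) x^γ/γ with c = γb²/(2(1 − γ)σ²), and all function names are assumptions made for the example, not taken from the text.

```python
import math


def merton_candidate(gamma, b, sigma, T):
    """Candidate y(t, x) = exp(c (T - t)) x^gamma / gamma for power utility."""
    c = gamma * b**2 / (2.0 * (1.0 - gamma) * sigma**2)

    def y(t, x):
        return math.exp(c * (T - t)) * x**gamma / gamma

    return y


def derivatives(y, t, x, h=1e-3):
    """Central finite differences for dy/dt, y' and y''."""
    y_t = (y(t + h, x) - y(t - h, x)) / (2.0 * h)
    y_x = (y(t, x + h) - y(t, x - h)) / (2.0 * h)
    y_xx = (y(t, x + h) - 2.0 * y(t, x) + y(t, x - h)) / h**2
    return y_t, y_x, y_xx


def pde_residual(y, t, x, b, sigma):
    """Left-hand side of (5.4.25); should be close to zero for a solution."""
    y_t, y_x, y_xx = derivatives(y, t, x)
    return y_t - y_x**2 * b**2 / (2.0 * y_xx * sigma**2)


def feedback_portfolio(y, t, x, b, sigma):
    """Formula (5.4.21) with z' = 0: amount held in the risky asset."""
    _, y_x, y_xx = derivatives(y, t, x)
    return -(y_x * b) / (y_xx * sigma**2)


# Illustrative parameters (assumed).
gamma, b, sigma, T = 0.5, 0.08, 0.2, 1.0
y = merton_candidate(gamma, b, sigma, T)
residual = pde_residual(y, 0.3, 2.0, b, sigma)
u_amount = feedback_portfolio(y, 0.3, 2.0, b, sigma)
```

For this candidate the residual vanishes up to discretization error, and the feedback amount agrees with the familiar Merton rule b x / ((1 − γ)σ²).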

Equation (5.4.25) is the classical Merton PDE for the value function, usually obtained by dynamic programming and the HJB equation. Hence we may regard (5.4.21)–(5.4.23) as a generalization of the Merton equation (5.4.25) to the non-Markovian case with stochastic b(t), σ(t) and U(x). The Markovian case corresponds to the special case when z(t, x) = 0 in the BSPDE (5.4.23). Therefore $\hat{y}(s, x)$ is a stochastic generalization of the value function
\[ \varphi(s, x) := \sup_{u \in \mathcal{A}} E[U(X^u_{s,x}(T))], \tag{5.4.26} \]
where
\[ dX^u_{s,x}(t) = u(t)[b(t)\,dt + \sigma(t)\,dB(t)] ; \; t \ge s ; \quad X^u_{s,x}(s) = x. \tag{5.4.27} \]
Let us compare with the use of the classical HJB:
\[ \begin{cases} \dfrac{\partial \varphi}{\partial s}(s, x) + \max_v \Big\{ \dfrac{1}{2} v^2 \sigma_0^2(s)\varphi''(s, x) + v\,b_0(s)\varphi'(s, x) \Big\} = 0 ; & s < T \\ \varphi(T, x) = U(x). \end{cases} \tag{5.4.28} \]
The maximum is attained at
\[ v = \hat{u}(s, x) = -\frac{b_0(s)\varphi'(s, x)}{\varphi''(s, x)\sigma_0^2(s)}. \tag{5.4.29} \]
Substituted into (5.4.28) this gives the HJB equation
\[ \frac{\partial \varphi}{\partial s}(s, x) - \frac{\varphi'(s, x)^2 b_0^2(s)}{2\varphi''(s, x)\sigma_0^2(s)} = 0, \tag{5.4.30} \]

which is identical to (5.4.25).

Example 5.19 (Risk Minimizing Portfolios) Now suppose $X(t) = X_x^u(t)$ is as in (5.4.16), while $(Y(t), Z(t)) = (Y_x^u(t), Z_x^u(t))$ is given by the BSDE
\[ dY(t) = -\Big(-\frac{1}{2} Z^2(t)\Big)dt + Z(t)\,dB(t) ; \quad Y(T) = X(T). \tag{5.4.31} \]

Note that the driver $g(z) := -\frac{1}{2} z^2$ is concave. We want to minimize the risk of the terminal financial standing X(T), denoted by ρ(X(T)). If we interpret the risk in the sense of the convex risk measure defined in terms of the BSDE (5.4.31) we have ρ(X(T)) = −Y(0). The risk minimization problem is to find $\hat{u} \in \mathcal{A}$ such that
\[ \inf_{u \in \mathcal{A}} \big(-Y_x^u(0)\big) = -Y_x^{\hat{u}}(0), \tag{5.4.32} \]
where $Y_x^u(t)$ is given by (5.4.31). By changing sign we can consider the supremum problem instead. In this case we get
\[ A_u(t, x) = -\frac{1}{2}\big(z(t, x) + y'(t, x)u\sigma(t)\big)^2 + y'(t, x)u\,b(t) + \frac{1}{2} y''(t, x)u^2 \sigma^2(t) + z'(t, x)u\,\sigma(t), \tag{5.4.33} \]
which is maximal when $u = \hat{u}(t, x)$ satisfies
\[ \hat{u}(t, x) = -\frac{z(t, x)y'(t, x)\sigma(t) - y'(t, x)b(t) - z'(t, x)\sigma(t)}{\big((y'(t, x))^2 - y''(t, x)\big)\sigma^2(t)}. \tag{5.4.34} \]


This gives
\[ A_{\hat{u}}(t, x) = -\frac{1}{2}\hat{z}^2(t, x) + \frac{\big(\hat{z}(t, x)\hat{y}'(t, x)\sigma(t) - \hat{y}'(t, x)b(t) - \hat{z}'(t, x)\sigma(t)\big)^2}{2\big((\hat{y}'(t, x))^2 - \hat{y}''(t, x)\big)\sigma^2(t)}, \tag{5.4.35} \]
and hence $(\hat{y}(t, x), \hat{z}(t, x))$ solves the BSPDE
\[ d\hat{y}(t, x) = -A_{\hat{u}}(t, x)\,dt + \hat{z}(t, x)\,dB(t), \; 0 \le t \le T ; \quad \hat{y}(T, x) = x. \tag{5.4.36} \]
We have proved:

Proposition 5.20 Suppose there exists a solution $(\hat{y}(t, x), \hat{z}(t, x))$ of the BSPDE (5.4.36). Suppose $\hat{u}$ defined by (5.4.34) belongs to $\mathcal{A}$. Then $\hat{u}$ is optimal for the risk minimizing problem (5.4.32), and the minimal risk is
\[ \inf_{u \in \mathcal{A}} \big(-Y_x^u(0)\big) = -Y_x^{\hat{u}}(0) = -\hat{y}(0, x). \tag{5.4.37} \]

Next we look at the special case when b(t) and σ(t) are deterministic. Let us try to choose $\hat{z}(t, x) = 0$ in (5.4.36). Then this BSPDE reduces to the PDE
\[ \begin{cases} \dfrac{\partial \hat{y}}{\partial t}(t, x) = -\dfrac{(\hat{y}'(t, x)b(t))^2}{2\big((\hat{y}'(t, x))^2 - \hat{y}''(t, x)\big)\sigma^2(t)} ; & 0 \le t \le T \\ \hat{y}(T, x) = x. \end{cases} \tag{5.4.38} \]
We try a solution of the form
\[ \hat{y}(t, x) = x + a(t), \tag{5.4.39} \]
where a(t) is deterministic. Substituted into (5.4.38) this gives
\[ a'(t) = -\frac{1}{2}\Big(\frac{b(t)}{\sigma(t)}\Big)^2 , \; 0 \le t \le T ; \quad a(T) = 0, \tag{5.4.40} \]
which gives
\[ a(t) = \int_t^T \frac{1}{2}\Big(\frac{b(s)}{\sigma(s)}\Big)^2 ds ; \quad 0 \le t \le T. \]

With this choice of a(t), (5.4.38) is satisfied and the minimal risk is
\[ \rho_{\min}(X(T)) = -Y^{(\hat{u})}(0) = -\hat{y}(0, x) = -x - \int_0^T \frac{1}{2}\Big(\frac{b(s)}{\sigma(s)}\Big)^2 ds. \tag{5.4.41} \]
Hence by (5.4.34) the optimal (risk minimizing) portfolio is
\[ \hat{u}(t, X(t)) = \frac{b(t)}{\sigma^2(t)}. \tag{5.4.42} \]
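The formulas (5.4.41)–(5.4.42) can be cross-checked in a simple special case. The sketch below rests on two assumptions that are not part of the text: b and σ are constants, and only constant portfolios u are compared. It also uses the standard fact that the BSDE (5.4.31) has the solution Y(0) = −ln E[exp(−X(T))], so that ρ(X(T)) = ln E[exp(−X(T))]. For a constant u the terminal wealth X(T) = x + u(bT + σB(T)) is Gaussian, which gives the closed form used in `entropic_risk_constant_portfolio` (a hypothetical helper name).

```python
def entropic_risk_constant_portfolio(u, x, b, sigma, T):
    """rho = ln E[exp(-X(T))] for X(T) = x + u (b T + sigma B(T)), B(T) ~ N(0, T).

    Uses the Gaussian moment generating function; valid for constant u, b, sigma.
    """
    return -x - u * b * T + 0.5 * u**2 * sigma**2 * T


# Illustrative parameters (assumed).
x, b, sigma, T = 1.0, 0.06, 0.3, 2.0

u_formula = b / sigma**2                        # (5.4.42)
rho_formula = -x - 0.5 * (b / sigma) ** 2 * T   # (5.4.41) with constant coefficients

# Independent check: minimize the risk over a grid of constant portfolios.
grid = [0.001 * k for k in range(0, 3001)]
u_grid = min(grid, key=lambda u: entropic_risk_constant_portfolio(u, x, b, sigma, T))
rho_grid = entropic_risk_constant_portfolio(u_grid, x, b, sigma, T)
```

Within the grid resolution, the minimizing constant portfolio coincides with b/σ² and the minimal risk with −x − ½(b/σ)²T.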

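Remark 5.21 Note that (5.4.41) can be interpreted by means of entropy as follows: Recall that in general the entropy of a measure Q with respect to the measure P is defined by
\[ H(Q \mid P) := E\Big[\frac{dQ}{dP} \ln \frac{dQ}{dP}\Big]. \]
Define
\[ \Gamma(t) = \exp\Big(-\int_0^t \frac{b(s)}{\sigma(s)}\,dB(s) - \frac{1}{2}\int_0^t \Big(\frac{b(s)}{\sigma(s)}\Big)^2 ds\Big). \tag{5.4.43} \]
By the Itô formula we have
\[ \begin{aligned} d(\Gamma(t) \ln \Gamma(t)) = {} & \Gamma(t)\Big[-\frac{b(t)}{\sigma(t)}\,dB(t) - \frac{1}{2}\Big(\frac{b(t)}{\sigma(t)}\Big)^2 dt\Big] \\ & + (\ln \Gamma(t))\,\Gamma(t)\Big(-\frac{b(t)}{\sigma(t)}\Big)dB(t) + \Gamma(t)\Big(-\frac{b(t)}{\sigma(t)}\Big)\Big(-\frac{b(t)}{\sigma(t)}\Big)dt. \end{aligned} \]
Hence, if we define the measure $Q_\Gamma(\omega)$ by
\[ dQ_\Gamma(\omega) := \Gamma(T)\,dP(\omega) \tag{5.4.44} \]
we get
\[ E\Big[\frac{dQ_\Gamma}{dP} \ln \frac{dQ_\Gamma}{dP}\Big] = E[\Gamma(T) \ln \Gamma(T)] = E\Big[\int_0^T \frac{1}{2}\,\Gamma(t)\Big(\frac{b(t)}{\sigma(t)}\Big)^2 dt\Big] = \frac{1}{2}\int_0^T \Big(\frac{b(t)}{\sigma(t)}\Big)^2 dt, \]
which proves that (5.4.41) can be written
\[ \rho_{\min}(X(T)) = -x - H(Q_\Gamma \mid P). \tag{5.4.45} \]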

Note that $Q_\Gamma$ is the unique equivalent martingale measure for the market (5.4.15), when deterministic coefficients are considered. Thus we have proved that if the coefficients b(t) and σ(t) in (5.4.15) are deterministic and if the portfolio $\hat{u}(t, X(t)) := b(t)/\sigma^2(t)$ is admissible, then $\hat{u}$ is a risk minimizing portfolio for the problem (5.4.32) and the minimal risk is equal to minus the initial wealth x minus the entropy of the equivalent martingale measure.
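The entropy identity in Remark 5.21 can be verified numerically when b/σ is a constant θ (an assumption made only for this illustration). Then Γ(T) = exp(−θB(T) − ½θ²T) with B(T) ~ N(0, T), and E[Γ(T) ln Γ(T)] is a one-dimensional Gaussian integral, which the sketch below approximates with the trapezoid rule. The function name and parameters are hypothetical.

```python
import math


def relative_entropy_numeric(theta, T, n=20001, width=12.0):
    """Trapezoid approximation of E[Gamma(T) ln Gamma(T)] for constant theta = b/sigma.

    Gamma(T) = exp(-theta * B_T - 0.5 * theta^2 * T) with B_T ~ N(0, T).
    """
    s = math.sqrt(T)
    lo, hi = -width * s, width * s
    h = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        x = lo + i * h
        log_gamma = -theta * x - 0.5 * theta**2 * T
        density = math.exp(-x * x / (2.0 * T)) / math.sqrt(2.0 * math.pi * T)
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * math.exp(log_gamma) * log_gamma * density
    return total * h


theta, T = 0.2, 2.0
entropy_numeric = relative_entropy_numeric(theta, T)
entropy_formula = 0.5 * theta**2 * T  # (1/2) * int_0^T (b/sigma)^2 dt
```

The two numbers agree to high accuracy, consistent with H(Q_Γ | P) = ½∫₀ᵀ (b/σ)² dt.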


5.5 Optimal Control of Stochastic Delay Equations

It is a remarkable feature of the stochastic maximum principle that it often also applies, with appropriate modifications, to optimal control problems of non-standard stochastic systems. We will see examples of such situations in the next chapter on mean-field games and in Chap. 13 on optimal control of stochastic partial differential equations. In this section we present another example, namely the optimal control of systems with delay. More precisely, as in [ØSZ3] we consider a controlled stochastic delay equation of the form
\[ \begin{aligned} dX(t) = {} & b(t, X(t), Y(t), A(t), u(t), \omega)\,dt + \sigma(t, X(t), Y(t), A(t), u(t), \omega)\,dB(t) \\ & + \int_{\mathbb{R}} \theta(t, X(t), Y(t), A(t), u(t), z, \omega)\,\tilde{N}(dt, dz) ; \quad t \in [0, T] \end{aligned} \tag{5.5.1} \]
\[ X(t) = x_0(t) ; \quad t \in [-\delta, 0], \tag{5.5.2} \]
where
\[ Y(t) = X(t - \delta), \quad A(t) = \int_{t-\delta}^{t} e^{-\rho(t-r)} X(r)\,dr, \tag{5.5.3} \]

and δ > 0, ρ ≥ 0 and T > 0 are given constants. Here
\[ b : [0, T] \times \mathbb{R} \times \mathbb{R} \times \mathbb{R} \times \mathcal{U} \times \Omega \to \mathbb{R}, \quad \sigma : [0, T] \times \mathbb{R} \times \mathbb{R} \times \mathbb{R} \times \mathcal{U} \times \Omega \to \mathbb{R} \]
and
\[ \theta : [0, T] \times \mathbb{R} \times \mathbb{R} \times \mathbb{R} \times \mathcal{U} \times \mathbb{R}_0 \times \Omega \to \mathbb{R} \]
are given functions such that, for all t, b(t, x, y, a, u, ·), σ(t, x, y, a, u, ·) and θ(t, x, y, a, u, z, ·) are $\mathcal{F}_t$-measurable for all x ∈ R, y ∈ R, a ∈ R, $u \in \mathcal{U}$ and $z \in \mathbb{R}_0 := \mathbb{R} \setminus \{0\}$. The function $x_0(t)$ is assumed to be continuous and deterministic.

Let $\mathcal{E}_t \subseteq \mathcal{F}_t$ ; t ∈ [0, T] be a given subfiltration of $\{\mathcal{F}_t\}_{t \in [0,T]}$, representing the information available to the controller who decides the value of u(t) at time t. For example, we could have $\mathcal{E}_t = \mathcal{F}_{(t-c)^+}$ for some given c > 0. Let $\mathcal{U} \subset \mathbb{R}$ be a given set of admissible control values u(t) ; t ∈ [0, T] and let $\mathcal{A}_{\mathcal{E}}$ be a given family of admissible control processes u(·), included in the set of càdlàg, $\mathcal{E}$-predictable and $\mathcal{U}$-valued processes u(t) ; t ∈ [0, T] such that (5.5.1)–(5.5.2) has a unique solution $X(\cdot) \in L^2(\lambda \times P)$, where λ denotes the Lebesgue measure on [0, T].

The performance functional is assumed to have the form
\[ J(u) = E\Big[\int_0^T f(t, X(t), Y(t), A(t), u(t), \omega)\,dt + g(X(T), \omega)\Big] ; \quad u \in \mathcal{A}_{\mathcal{E}}, \tag{5.5.4} \]
where $f = f(t, x, y, a, u, \omega) : [0, T] \times \mathbb{R} \times \mathbb{R} \times \mathbb{R} \times \mathcal{U} \times \Omega \to \mathbb{R}$ and $g = g(x, \omega) : \mathbb{R} \times \Omega \to \mathbb{R}$ are given $C^1$ functions w.r.t. (x, y, a, u) such that
\[ E\Big[\int_0^T \Big\{ |f(t, X(t), Y(t), A(t), u(t))| + \Big|\frac{\partial f}{\partial x_i}(t, X(t), Y(t), A(t), u(t))\Big|^2 \Big\} dt + |g(X(T))| + |g'(X(T))|^2\Big] < \infty \]
for $x_i = x, y, a$ and u.

Here, and in the following, we suppress the ω, for notational simplicity. The problem we consider in this section is the following: Find $\Phi(x_0)$ and $u^* \in \mathcal{A}_{\mathcal{E}}$ such that
\[ \Phi(x_0) := \sup_{u \in \mathcal{A}_{\mathcal{E}}} J(u) = J(u^*). \tag{5.5.5} \]
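To get a feeling for the state dynamics (5.5.1)–(5.5.3), one can simulate a path with an explicit Euler scheme. The following is a minimal sketch under stated assumptions, not the authors' method: the jump term is omitted (θ = 0), the control is taken to be already substituted into the coefficient functions `b` and `sigma`, the time step divides δ, and A(t) is approximated by the trapezoid rule. The function name `simulate_delay_sde` and all parameter values are hypothetical.

```python
import math
import random


def simulate_delay_sde(b, sigma, x0, delta, rho, T, n_per_delta, rng):
    """Explicit Euler scheme for dX = b(t, X, Y, A) dt + sigma(t, X, Y, A) dB.

    Y(t) = X(t - delta) and A(t) = int_{t-delta}^t exp(-rho (t - r)) X(r) dr.
    Jumps are omitted. Returns (dt, path), where path[i] approximates
    X(-delta + i * dt), so path[n_per_delta] corresponds to t = 0.
    """
    dt = delta / n_per_delta
    n_steps = int(round(T / dt))
    m = n_per_delta
    path = [x0(-delta + i * dt) for i in range(m + 1)]  # initial segment on [-delta, 0]
    for i in range(n_steps):
        k = m + i            # index of the current time t = i * dt
        t = i * dt
        y = path[k - m]      # delayed state X(t - delta)
        a = 0.0              # trapezoid rule for the weighted average A(t)
        for j in range(m):
            left = math.exp(-rho * (m - j) * dt) * path[k - m + j]
            right = math.exp(-rho * (m - j - 1) * dt) * path[k - m + j + 1]
            a += 0.5 * (left + right) * dt
        dB = rng.gauss(0.0, math.sqrt(dt))
        x = path[k]
        path.append(x + b(t, x, y, a) * dt + sigma(t, x, y, a) * dB)
    return dt, path


# Example run with illustrative coefficients: drift driven by the delayed state.
dt, path = simulate_delay_sde(
    b=lambda t, x, y, a: 0.05 * y,
    sigma=lambda t, x, y, a: 0.2 * y,
    x0=lambda t: 1.0,
    delta=0.5, rho=0.1, T=2.0, n_per_delta=50,
    rng=random.Random(0),
)
```

A useful sanity check is the noise-free equation dX(t) = −X(t − δ)dt with x₀ ≡ 1 and δ = 1, whose solution on [0, 1] is X(t) = 1 − t, so that X(1) = 0.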

Any control $u^* \in \mathcal{A}_{\mathcal{E}}$ satisfying (5.5.5) is called an optimal control. Variants of this problem have been studied in several papers. Stochastic control of delay systems is a challenging research area, because delay systems have, in general, an infinite-dimensional nature. Hence, the natural general approach to them is infinite-dimensional. For this kind of approach in the context of control problems we refer to [C-M, Fe, GoMa, GoMaSa] in the stochastic Brownian case. Nevertheless, in some cases systems with delay can be reduced to finite-dimensional systems, in the sense that the information we need from their dynamics can be represented by a finite-dimensional variable evolving in terms of itself. In such a context, the crucial point is to understand when this finite-dimensional reduction of the problem is possible and/or to find conditions ensuring it. There are some papers dealing with this subject in the stochastic Brownian case: We refer to [KoSh, EØS, La1, LR, ØS8]. The paper [Dav] represents an extension of [ØS8] to the case when the equation is driven by a Lévy noise. We also mention the paper [EH], where certain control problems of stochastic functional differential equations are studied by means of the Girsanov transformation. This approach, however, does not work if there is a delay in the noise components.

Our approach in this section is different from all the above. Note that the presence of the terms Y(t) and A(t) in (5.5.1) makes the problem non-Markovian and we cannot use a (finite-dimensional) dynamic programming approach. However, we will show that it is possible to obtain a maximum principle for the problem. To this end, we define the following appropriately modified Hamiltonian
\[ H : [0, T] \times \mathbb{R} \times \mathbb{R} \times \mathbb{R} \times \mathcal{U} \times \mathbb{R} \times \mathbb{R} \times \mathcal{R} \times \Omega \to \mathbb{R} \]
by
\[ \begin{aligned} H(t, x, y, a, u, p, q, r(\cdot), \omega) & = H(t, x, y, a, u, p, q, r(\cdot)) \\ & = f(t, x, y, a, u) + b(t, x, y, a, u)p + \sigma(t, x, y, a, u)q + \int_{\mathbb{R}_0} \theta(t, x, y, a, u, z)r(z)\,\nu(dz); \end{aligned} \tag{5.5.6} \]
where $\mathcal{R}$ is the set of functions $r : \mathbb{R}_0 \to \mathbb{R}$ such that the last term in (5.5.6) converges. We assume that b, σ and θ are $C^1$ functions with respect to (x, y, a, u) and that
\[ E\Big[\int_0^T \Big\{ \Big|\frac{\partial b}{\partial x_i}(t, X(t), Y(t), A(t), u(t))\Big|^2 + \Big|\frac{\partial \sigma}{\partial x_i}(t, X(t), Y(t), A(t), u(t))\Big|^2 + \int_{\mathbb{R}_0} \Big|\frac{\partial \theta}{\partial x_i}(t, X(t), Y(t), A(t), u(t), z)\Big|^2 \nu(dz) \Big\} dt\Big] < \infty \tag{5.5.7} \]
for $x_i = x, y, a$ and u. Associated to H we define the adjoint processes p(t), q(t), r(t, z); t ∈ [0, T], $z \in \mathbb{R}_0$, by the following backward stochastic differential equation (BSDE):
\[ \begin{cases} dp(t) = E[\mu(t) \mid \mathcal{F}_t]\,dt + q(t)\,dB(t) + \displaystyle\int_{\mathbb{R}_0} r(t, z)\,\tilde{N}(dt, dz) ; & t \in [0, T] \\ p(T) = g'(X(T)), \end{cases} \tag{5.5.8} \]
where
\[ \begin{aligned} \mu(t) = {} & -\frac{\partial H}{\partial x}(t, X(t), Y(t), A(t), u(t), p(t), q(t), r(t, \cdot)) \\ & - \frac{\partial H}{\partial y}(t + \delta, X(t + \delta), Y(t + \delta), A(t + \delta), u(t + \delta), p(t + \delta), q(t + \delta), r(t + \delta, \cdot))\,\chi_{[0, T-\delta]}(t) \\ & - e^{\rho t}\Big(\int_t^{t+\delta} \frac{\partial H}{\partial a}(s, X(s), Y(s), A(s), u(s), p(s), q(s), r(s, \cdot))\,e^{-\rho s}\chi_{[0, T]}(s)\,ds\Big). \end{aligned} \tag{5.5.9} \]

Note that this BSDE is anticipative, or time-advanced, in the sense that the driver μ(t) contains future values of X(s), u(s), p(s), q(s), r(s, ·) ; s ≤ t + δ. In the case when there are no jumps and no integral term in (5.5.9), anticipative BSDEs (ABSDEs for short) have been studied by [PY], who prove the existence and uniqueness of solutions of such equations under certain conditions. They also relate a class of linear ABSDEs to a class of linear stochastic delay control problems with no delay in the noise coefficients. Thus, in this section we extend this relation to general nonlinear control problems and general nonlinear ABSDEs by means of the maximum principle, and throughout the discussion we also include the possibility of delays in all the noise coefficients, as well as the possibility of jumps.

5.5.1 A Sufficient Maximum Principle

We now establish a maximum principle of sufficient type, i.e. we show that, under some assumptions, maximizing the Hamiltonian leads to an optimal control.

Theorem 5.22 (Sufficient maximum principle [ØSZ3]) Let $\hat{u} \in \mathcal{A}_{\mathcal{E}}$ with corresponding state processes $\hat{X}(t), \hat{Y}(t), \hat{A}(t)$ and adjoint processes $\hat{p}(t), \hat{q}(t), \hat{r}(t, z)$, assumed to satisfy the ABSDE (5.5.8)–(5.5.9). Suppose the following hold:
(i) The functions $x \mapsto g(x)$ and
\[ (x, y, a, u) \mapsto H(t, x, y, a, u, \hat{p}(t), \hat{q}(t), \hat{r}(t, \cdot)) \tag{5.5.10} \]
are concave, for each t ∈ [0, T], a.s.
(ii)
\[ \max_{v \in \mathcal{U}} E\big[H(t, \hat{X}(t), \hat{X}(t - \delta), \hat{A}(t), v, \hat{p}(t), \hat{q}(t), \hat{r}(t, \cdot)) \mid \mathcal{E}_t\big] = E\big[H(t, \hat{X}(t), \hat{X}(t - \delta), \hat{A}(t), \hat{u}(t), \hat{p}(t), \hat{q}(t), \hat{r}(t, \cdot)) \mid \mathcal{E}_t\big] \tag{5.5.11} \]
for all t ∈ [0, T], a.s.
Then $\hat{u}(t)$ is an optimal control for the problem (5.5.5).

Proof As in the proof of Theorem 5.4, by considering an increasing sequence of stopping times converging to T we may assume that all the local martingale terms in the following are martingales, and hence have expectation 0. Choose $u \in \mathcal{A}_{\mathcal{E}}$ and consider
\[ J(u) - J(\hat{u}) = I_1 + I_2, \tag{5.5.12} \]
where
\[ I_1 = E\Big[\int_0^T \{ f(t, X(t), Y(t), A(t), u(t)) - f(t, \hat{X}(t), \hat{Y}(t), \hat{A}(t), \hat{u}(t)) \}\,dt\Big], \tag{5.5.13} \]
\[ I_2 = E[g(X(T)) - g(\hat{X}(T))]. \tag{5.5.14} \]


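By the definition of H and concavity of H we have
\[ \begin{aligned} I_1 = {} & E\Big[\int_0^T \Big\{ H(t, X(t), Y(t), A(t), u(t), \hat{p}(t), \hat{q}(t), \hat{r}(t, \cdot)) - H(t, \hat{X}(t), \hat{Y}(t), \hat{A}(t), \hat{u}(t), \hat{p}(t), \hat{q}(t), \hat{r}(t, \cdot)) \\ & \quad - (b(t, X(t), Y(t), A(t), u(t)) - b(t, \hat{X}(t), \hat{Y}(t), \hat{A}(t), \hat{u}(t)))\,\hat{p}(t) \\ & \quad - (\sigma(t, X(t), Y(t), A(t), u(t)) - \sigma(t, \hat{X}(t), \hat{Y}(t), \hat{A}(t), \hat{u}(t)))\,\hat{q}(t) \\ & \quad - \int_{\mathbb{R}} (\theta(t, X(t), Y(t), A(t), u(t), z) - \theta(t, \hat{X}(t), \hat{Y}(t), \hat{A}(t), \hat{u}(t), z))\,\hat{r}(t, z)\,\nu(dz) \Big\} dt\Big] \\ \le {} & E\Big[\int_0^T \Big\{ \frac{\partial \hat{H}}{\partial x}(t)(X(t) - \hat{X}(t)) + \frac{\partial \hat{H}}{\partial y}(t)(Y(t) - \hat{Y}(t)) + \frac{\partial \hat{H}}{\partial a}(t)(A(t) - \hat{A}(t)) + \frac{\partial \hat{H}}{\partial u}(t)(u(t) - \hat{u}(t)) \\ & \quad - (b(t) - \hat{b}(t))\,\hat{p}(t) - (\sigma(t) - \hat{\sigma}(t))\,\hat{q}(t) - \int_{\mathbb{R}} (\theta(t, z) - \hat{\theta}(t, z))\,\hat{r}(t, z)\,\nu(dz) \Big\} dt\Big], \end{aligned} \tag{5.5.15} \]
where we have used the abbreviated notation
\[ \frac{\partial \hat{H}}{\partial x}(t) = \frac{\partial H}{\partial x}(t, \hat{X}(t), \hat{Y}(t), \hat{A}(t), \hat{u}(t), \hat{p}(t), \hat{q}(t), \hat{r}(t, \cdot)), \]
\[ b(t) = b(t, X(t), Y(t), A(t), u(t)), \quad \hat{b}(t) = b(t, \hat{X}(t), \hat{Y}(t), \hat{A}(t), \hat{u}(t)) \]
etc. Since g is concave we have
\[ \begin{aligned} I_2 & \le E[g'(\hat{X}(T))(X(T) - \hat{X}(T))] = E[\hat{p}(T)(X(T) - \hat{X}(T))] \\ & = E\Big[\int_0^T \hat{p}(t)(dX(t) - d\hat{X}(t)) + \int_0^T (X(t) - \hat{X}(t))\,d\hat{p}(t) \\ & \qquad + \int_0^T (\sigma(t) - \hat{\sigma}(t))\,\hat{q}(t)\,dt + \int_0^T \int_{\mathbb{R}} (\theta(t, z) - \hat{\theta}(t, z))\,\hat{r}(t, z)\,\nu(dz)\,dt\Big] \\ & = E\Big[\int_0^T (b(t) - \hat{b}(t))\,\hat{p}(t)\,dt + \int_0^T (X(t) - \hat{X}(t))\,\mu(t)\,dt \\ & \qquad + \int_0^T (\sigma(t) - \hat{\sigma}(t))\,\hat{q}(t)\,dt + \int_0^T \int_{\mathbb{R}} (\theta(t, z) - \hat{\theta}(t, z))\,\hat{r}(t, z)\,\nu(dz)\,dt\Big]. \end{aligned} \tag{5.5.16} \]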


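Combining (5.5.12)–(5.5.16) we get, using that $X(t) = \hat{X}(t) = x_0(t)$ for all t ∈ [−δ, 0],
\[ \begin{aligned} J(u) - J(\hat{u}) & \le E\Big[\int_0^T \Big\{ \frac{\partial \hat{H}}{\partial x}(t)(X(t) - \hat{X}(t)) + \frac{\partial \hat{H}}{\partial y}(t)(Y(t) - \hat{Y}(t)) + \frac{\partial \hat{H}}{\partial a}(t)(A(t) - \hat{A}(t)) \\ & \qquad\qquad + \frac{\partial \hat{H}}{\partial u}(t)(u(t) - \hat{u}(t)) + \mu(t)(X(t) - \hat{X}(t)) \Big\} dt\Big] \\ & = E\Big[\int_\delta^{T+\delta} \Big\{ \frac{\partial \hat{H}}{\partial x}(t - \delta) + \frac{\partial \hat{H}}{\partial y}(t)\chi_{[0,T]}(t) + \mu(t - \delta) \Big\}(Y(t) - \hat{Y}(t))\,dt \\ & \qquad + \int_0^T \frac{\partial \hat{H}}{\partial a}(t)(A(t) - \hat{A}(t))\,dt + \int_0^T \frac{\partial \hat{H}}{\partial u}(t)(u(t) - \hat{u}(t))\,dt\Big]. \end{aligned} \tag{5.5.17} \]
Using integration by parts and substituting r = t − δ, we get
\[ \begin{aligned} \int_0^T \frac{\partial \hat{H}}{\partial a}(s)(A(s) - \hat{A}(s))\,ds & = \int_0^T \frac{\partial \hat{H}}{\partial a}(s) \int_{s-\delta}^{s} e^{-\rho(s-r)}(X(r) - \hat{X}(r))\,dr\,ds \\ & = \int_0^T \Big(\int_r^{r+\delta} \frac{\partial \hat{H}}{\partial a}(s)\,e^{-\rho s}\chi_{[0,T]}(s)\,ds\Big) e^{\rho r}(X(r) - \hat{X}(r))\,dr \\ & = \int_\delta^{T+\delta} \Big(\int_{t-\delta}^{t} \frac{\partial \hat{H}}{\partial a}(s)\,e^{-\rho s}\chi_{[0,T]}(s)\,ds\Big) e^{\rho(t-\delta)}(X(t - \delta) - \hat{X}(t - \delta))\,dt. \end{aligned} \tag{5.5.18} \]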
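The interchange of the order of integration in (5.5.18) can be checked numerically. The sketch below evaluates the first and second double integrals of (5.5.18) independently by nested trapezoid rules. The function `phi` plays the role of ∂Ĥ/∂a and `D` the role of X − X̂ (which must vanish on [−δ, 0]); both test functions, the helper names and the parameter values are assumptions chosen for illustration only.

```python
import math


def trapezoid(f, a, b, n):
    """Composite trapezoid rule for int_a^b f, returning 0 for an empty interval."""
    if b <= a:
        return 0.0
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


def both_sides(phi, D, T, delta, rho, n=400):
    """Return the two iterated integrals that (5.5.18) asserts to be equal."""

    def inner_left(s):
        # int_{s-delta}^{s} exp(-rho (s - r)) D(r) dr
        return trapezoid(lambda r: math.exp(-rho * (s - r)) * D(r), s - delta, s, n)

    left = trapezoid(lambda s: phi(s) * inner_left(s), 0.0, T, n)

    def inner_right(r):
        # int_r^{r+delta} phi(s) exp(-rho s) chi_[0,T](s) ds
        return trapezoid(lambda s: phi(s) * math.exp(-rho * s), r, min(r + delta, T), n)

    right = trapezoid(lambda r: inner_right(r) * math.exp(rho * r) * D(r), 0.0, T, n)
    return left, right


left, right = both_sides(
    phi=lambda s: 1.0 + s,        # stands in for dH/da along the optimal path
    D=lambda r: max(r, 0.0),      # stands in for X - X_hat, zero on [-delta, 0]
    T=2.0, delta=0.5, rho=0.3,
)
```

Up to quadrature error the two values coincide, which is the content of the second equality in (5.5.18). The third equality is the change of variable r = t − δ.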

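Combining this with (5.5.17) and using (5.5.9) we obtain
\[ \begin{aligned} J(u) - J(\hat{u}) & \le E\Big[\int_\delta^{T+\delta} \Big\{ \frac{\partial \hat{H}}{\partial x}(t - \delta) + \frac{\partial \hat{H}}{\partial y}(t)\chi_{[0,T]}(t) + \Big(\int_{t-\delta}^{t} \frac{\partial \hat{H}}{\partial a}(s)\,e^{-\rho s}\chi_{[0,T]}(s)\,ds\Big) e^{\rho(t-\delta)} + \mu(t - \delta) \Big\}(Y(t) - \hat{Y}(t))\,dt \\ & \qquad + \int_0^T \frac{\partial \hat{H}}{\partial u}(t)(u(t) - \hat{u}(t))\,dt\Big] \\ & = E\Big[\int_0^T \frac{\partial \hat{H}}{\partial u}(t)(u(t) - \hat{u}(t))\,dt\Big] \\ & = E\Big[\int_0^T E\Big[\frac{\partial \hat{H}}{\partial u}(t)(u(t) - \hat{u}(t)) \,\Big|\, \mathcal{E}_t\Big] dt\Big] \\ & = E\Big[\int_0^T E\Big[\frac{\partial \hat{H}}{\partial u}(t) \,\Big|\, \mathcal{E}_t\Big](u(t) - \hat{u}(t))\,dt\Big] \le 0. \end{aligned} \]
The last inequality holds because for each t ∈ [0, T], $v = \hat{u}(t)$ maximizes $E[H(t, \hat{X}(t), \hat{Y}(t), \hat{A}(t), v, \hat{p}(t), \hat{q}(t), \hat{r}(t, \cdot)) \mid \mathcal{E}_t]$. This proves that $\hat{u}$ is an optimal control. □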

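5.5.2 A Necessary Maximum Principle

A drawback with the sufficient maximum principle in Sect. 5.5.1 is the condition of concavity, which does not always hold in applications. In this section we will prove a result going in the other direction. More precisely, we will prove the equivalence between being a directional critical point for J(u) and a critical point for the conditional Hamiltonian. To this end, we need to make the following assumptions:

A 1 For all $u \in \mathcal{A}_{\mathcal{E}}$ and all bounded $\beta \in \mathcal{A}_{\mathcal{E}}$ there exists an ε > 0 such that $u + s\beta \in \mathcal{A}_{\mathcal{E}}$ for all s ∈ (−ε, ε).

A 2 For all $t_0 \in [0, T]$ and all bounded $\mathcal{E}_{t_0}$-measurable random variables α the control process β(t) defined by
\[ \beta(t) = \alpha\,\chi_{[t_0, T]}(t) ; \quad t \in [0, T] \tag{5.5.19} \]
belongs to $\mathcal{A}_{\mathcal{E}}$.

For all bounded $\beta \in \mathcal{A}_{\mathcal{E}}$ the derivative process
\[ \xi(t) := \frac{d}{ds} X^{u+s\beta}(t)\Big|_{s=0} \tag{5.5.20} \]
exists and belongs to $L^2(\lambda \times P)$. It follows from (5.5.1) that
\[ \begin{aligned} d\xi(t) = {} & \Big\{ \frac{\partial b}{\partial x}(t)\xi(t) + \frac{\partial b}{\partial y}(t)\xi(t - \delta) + \frac{\partial b}{\partial a}(t) \int_{t-\delta}^{t} e^{-\rho(t-r)}\xi(r)\,dr + \frac{\partial b}{\partial u}(t)\beta(t) \Big\} dt \\ & + \Big\{ \frac{\partial \sigma}{\partial x}(t)\xi(t) + \frac{\partial \sigma}{\partial y}(t)\xi(t - \delta) + \frac{\partial \sigma}{\partial a}(t) \int_{t-\delta}^{t} e^{-\rho(t-r)}\xi(r)\,dr + \frac{\partial \sigma}{\partial u}(t)\beta(t) \Big\} dB(t) \\ & + \int_{\mathbb{R}_0} \Big\{ \frac{\partial \theta}{\partial x}(t, z)\xi(t) + \frac{\partial \theta}{\partial y}(t, z)\xi(t - \delta) + \frac{\partial \theta}{\partial a}(t, z) \int_{t-\delta}^{t} e^{-\rho(t-r)}\xi(r)\,dr + \frac{\partial \theta}{\partial u}(t, z)\beta(t) \Big\} \tilde{N}(dt, dz), \end{aligned} \tag{5.5.21} \]
where for simplicity of notation we have put
\[ \frac{\partial b}{\partial x}(t) = \frac{\partial b}{\partial x}(t, X(t), X(t - \delta), A(t), u(t)) \quad \text{etc.} \]
and we have used that
\[ \frac{d}{ds} Y^{u+s\beta}(t)\Big|_{s=0} = \frac{d}{ds} X^{u+s\beta}(t - \delta)\Big|_{s=0} = \xi(t - \delta) \tag{5.5.22} \]
and
\[ \frac{d}{ds} A^{u+s\beta}(t)\Big|_{s=0} = \frac{d}{ds}\Big(\int_{t-\delta}^{t} e^{-\rho(t-r)} X^{u+s\beta}(r)\,dr\Big)\Big|_{s=0} = \int_{t-\delta}^{t} e^{-\rho(t-r)} \frac{d}{ds} X^{u+s\beta}(r)\Big|_{s=0} dr = \int_{t-\delta}^{t} e^{-\rho(t-r)}\xi(r)\,dr. \tag{5.5.23} \]
Note that
\[ \xi(t) = 0 \quad \text{for } t \in [-\delta, 0]. \tag{5.5.24} \]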

Theorem 5.23 (Necessary maximum principle [ØSZ3]) Suppose $\hat{u} \in \mathcal{A}_{\mathcal{E}}$ with corresponding solutions $\hat{X}(t)$ of (5.5.1)–(5.5.2) and $\hat{p}(t), \hat{q}(t), \hat{r}(t, z)$ of (5.5.8)–(5.5.9) and corresponding derivative process $\hat{\xi}(t)$ given by (5.5.20). Then the following are equivalent:
(i) $\dfrac{d}{ds} J(\hat{u} + s\beta)\Big|_{s=0} = 0$ for all bounded $\beta \in \mathcal{A}_{\mathcal{E}}$.
(ii) $E\Big[\dfrac{\partial H}{\partial u}(t, \hat{X}(t), \hat{Y}(t), \hat{A}(t), u, \hat{p}(t), \hat{q}(t), \hat{r}(t, \cdot)) \,\Big|\, \mathcal{E}_t\Big]_{u = \hat{u}(t)} = 0$ a.s. for all t ∈ [0, T].

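Proof For simplicity of notation we write $\hat{u} = u$, $\hat{X} = X$, $\hat{p} = p$, $\hat{q} = q$ and $\hat{r} = r$ in the following. Suppose (i) holds. Then
\[ \begin{aligned} 0 & = \frac{d}{ds} J(u + s\beta)\Big|_{s=0} \\ & = \frac{d}{ds} E\Big[\int_0^T f(t, X^{u+s\beta}(t), Y^{u+s\beta}(t), A^{u+s\beta}(t), u(t) + s\beta(t))\,dt + g(X^{u+s\beta}(T))\Big]\Big|_{s=0} \\ & = E\Big[\int_0^T \Big\{ \frac{\partial f}{\partial x}(t)\xi(t) + \frac{\partial f}{\partial y}(t)\xi(t - \delta) + \frac{\partial f}{\partial a}(t) \int_{t-\delta}^{t} e^{-\rho(t-r)}\xi(r)\,dr + \frac{\partial f}{\partial u}(t)\beta(t) \Big\} dt + g'(X(T))\xi(T)\Big] \\ & = E\Big[\int_0^T \Big\{ \frac{\partial H}{\partial x}(t) - \frac{\partial b}{\partial x}(t)p(t) - \frac{\partial \sigma}{\partial x}(t)q(t) - \int_{\mathbb{R}} \frac{\partial \theta}{\partial x}(t, z)r(t, z)\,\nu(dz) \Big\}\xi(t)\,dt \\ & \qquad + \int_0^T \Big\{ \frac{\partial H}{\partial y}(t) - \frac{\partial b}{\partial y}(t)p(t) - \frac{\partial \sigma}{\partial y}(t)q(t) - \int_{\mathbb{R}} \frac{\partial \theta}{\partial y}(t, z)r(t, z)\,\nu(dz) \Big\}\xi(t - \delta)\,dt \\ & \qquad + \int_0^T \Big\{ \frac{\partial H}{\partial a}(t) - \frac{\partial b}{\partial a}(t)p(t) - \frac{\partial \sigma}{\partial a}(t)q(t) - \int_{\mathbb{R}} \frac{\partial \theta}{\partial a}(t, z)r(t, z)\,\nu(dz) \Big\}\Big(\int_{t-\delta}^{t} e^{-\rho(t-r)}\xi(r)\,dr\Big) dt \\ & \qquad + \int_0^T \frac{\partial f}{\partial u}(t)\beta(t)\,dt + g'(X(T))\xi(T)\Big]. \end{aligned} \tag{5.5.25} \]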

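By (5.5.21),
\[ \begin{aligned} E[g'(X(T))\xi(T)] & = E[p(T)\xi(T)] \\ & = E\Big[\int_0^T p(t)\,d\xi(t) + \int_0^T \xi(t)\,dp(t) \\ & \qquad + \int_0^T q(t)\Big\{ \frac{\partial \sigma}{\partial x}(t)\xi(t) + \frac{\partial \sigma}{\partial y}(t)\xi(t - \delta) + \frac{\partial \sigma}{\partial a}(t) \int_{t-\delta}^{t} e^{-\rho(t-r)}\xi(r)\,dr + \frac{\partial \sigma}{\partial u}(t)\beta(t) \Big\} dt \\ & \qquad + \int_0^T \int_{\mathbb{R}} r(t, z)\Big\{ \frac{\partial \theta}{\partial x}(t, z)\xi(t) + \frac{\partial \theta}{\partial y}(t, z)\xi(t - \delta) + \frac{\partial \theta}{\partial a}(t, z) \int_{t-\delta}^{t} e^{-\rho(t-r)}\xi(r)\,dr + \frac{\partial \theta}{\partial u}(t, z)\beta(t) \Big\} \nu(dz)\,dt\Big] \\ & = E\Big[\int_0^T p(t)\Big\{ \frac{\partial b}{\partial x}(t)\xi(t) + \frac{\partial b}{\partial y}(t)\xi(t - \delta) + \frac{\partial b}{\partial a}(t) \int_{t-\delta}^{t} e^{-\rho(t-r)}\xi(r)\,dr + \frac{\partial b}{\partial u}(t)\beta(t) \Big\} dt + \int_0^T \xi(t)\mu(t)\,dt \\ & \qquad + \int_0^T q(t)\Big\{ \frac{\partial \sigma}{\partial x}(t)\xi(t) + \frac{\partial \sigma}{\partial y}(t)\xi(t - \delta) + \frac{\partial \sigma}{\partial a}(t) \int_{t-\delta}^{t} e^{-\rho(t-r)}\xi(r)\,dr + \frac{\partial \sigma}{\partial u}(t)\beta(t) \Big\} dt \\ & \qquad + \int_0^T \int_{\mathbb{R}} r(t, z)\Big\{ \frac{\partial \theta}{\partial x}(t, z)\xi(t) + \frac{\partial \theta}{\partial y}(t, z)\xi(t - \delta) + \frac{\partial \theta}{\partial a}(t, z) \int_{t-\delta}^{t} e^{-\rho(t-r)}\xi(r)\,dr + \frac{\partial \theta}{\partial u}(t, z)\beta(t) \Big\} \nu(dz)\,dt\Big]. \end{aligned} \tag{5.5.26} \]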

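Combining (5.5.25) and (5.5.26) we get
\[ \begin{aligned} 0 & = E\Big[\int_0^T \xi(t)\Big\{ \frac{\partial H}{\partial x}(t) + \mu(t) \Big\} dt + \int_0^T \xi(t - \delta)\frac{\partial H}{\partial y}(t)\,dt \\ & \qquad + \int_0^T \Big(\int_{t-\delta}^{t} e^{-\rho(t-r)}\xi(r)\,dr\Big)\frac{\partial H}{\partial a}(t)\,dt + \int_0^T \frac{\partial H}{\partial u}(t)\beta(t)\,dt\Big] \\ & = E\Big[\int_0^T \xi(t)\Big\{ \frac{\partial H}{\partial x}(t) - \frac{\partial H}{\partial x}(t) - \frac{\partial H}{\partial y}(t + \delta)\chi_{[0,T-\delta]}(t) - e^{\rho t}\Big(\int_t^{t+\delta} \frac{\partial H}{\partial a}(s)\,e^{-\rho s}\chi_{[0,T]}(s)\,ds\Big) \Big\} dt \\ & \qquad + \int_0^T \xi(t - \delta)\frac{\partial H}{\partial y}(t)\,dt + \int_0^T \Big(\int_{s-\delta}^{s} e^{-\rho(s-t)}\xi(t)\,dt\Big)\frac{\partial H}{\partial a}(s)\,ds + \int_0^T \frac{\partial H}{\partial u}(t)\beta(t)\,dt\Big] \\ & = E\Big[\int_0^T \xi(t)\Big\{ -\frac{\partial H}{\partial y}(t + \delta)\chi_{[0,T-\delta]}(t) - e^{\rho t}\Big(\int_t^{t+\delta} \frac{\partial H}{\partial a}(s)\,e^{-\rho s}\chi_{[0,T]}(s)\,ds\Big) \Big\} dt + \int_0^T \xi(t - \delta)\frac{\partial H}{\partial y}(t)\,dt \\ & \qquad + \int_0^T e^{\rho t}\Big(\int_t^{t+\delta} \frac{\partial H}{\partial a}(s)\,e^{-\rho s}\chi_{[0,T]}(s)\,ds\Big)\xi(t)\,dt + \int_0^T \frac{\partial H}{\partial u}(t)\beta(t)\,dt\Big] \\ & = E\Big[\int_0^T \frac{\partial H}{\partial u}(t)\beta(t)\,dt\Big], \end{aligned} \tag{5.5.27} \]
where we have again used integration by parts. If we apply (5.5.27) to $\beta(t) = \alpha(\omega)\chi_{[s,T]}(t)$, where α(ω) is bounded and $\mathcal{E}_{t_0}$-measurable, $s \ge t_0$, we get
\[ E\Big[\int_s^T \frac{\partial H}{\partial u}(t)\,dt\;\alpha\Big] = 0. \]
Differentiating with respect to s we obtain
\[ E\Big[\frac{\partial H}{\partial u}(s)\,\alpha\Big] = 0. \]
Since this holds for all $s \ge t_0$ and all α we conclude that
\[ E\Big[\frac{\partial H}{\partial u}(t_0) \,\Big|\, \mathcal{E}_{t_0}\Big] = 0. \]
This shows that (i) ⇒ (ii). Conversely, since every bounded $\beta \in \mathcal{A}_{\mathcal{E}}$ can be approximated by linear combinations of controls β of the form (5.5.19), we can prove that (ii) ⇒ (i) by reversing the above argument. □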

5.5.3 Time-Advanced BSDEs with Jumps

We now study time-advanced backward stochastic differential equations driven both by Brownian motion B(t) and compensated Poisson random measures $\tilde{N}(dt, dz)$.

Framework

Given a positive constant δ, denote by D([0, δ], R) the space of all càdlàg paths from [0, δ] into R. For a path $X(\cdot) : \mathbb{R}_+ \to \mathbb{R}$, $X_t$ will denote the function defined by $X_t(s) = X(t + s)$ for s ∈ [0, δ]. Put $\mathcal{H} = L^2(\nu)$. Consider the $L^2$ spaces $V_1 := L^2([0, \delta], ds)$ and $V_2 := L^2([0, \delta] \to \mathcal{H}, ds)$. Let
\[ F : \mathbb{R}_+ \times \mathbb{R} \times \mathbb{R} \times V_1 \times \mathbb{R} \times \mathbb{R} \times V_1 \times \mathcal{H} \times \mathcal{H} \times V_2 \times \Omega \to \mathbb{R} \]
be a predictable function. Introduce the following Lipschitz condition: There exists a constant C such that
\[ \begin{aligned} & |F(t, p_1, p_2, p, q_1, q_2, q, r_1, r_2, r, \omega) - F(t, \bar{p}_1, \bar{p}_2, \bar{p}, \bar{q}_1, \bar{q}_2, \bar{q}, \bar{r}_1, \bar{r}_2, \bar{r}, \omega)| \\ & \quad \le C\big(|p_1 - \bar{p}_1| + |p_2 - \bar{p}_2| + |p - \bar{p}|_{V_1} + |q_1 - \bar{q}_1| + |q_2 - \bar{q}_2| + |q - \bar{q}|_{V_1} + |r_1 - \bar{r}_1|_{\mathcal{H}} + |r_2 - \bar{r}_2|_{\mathcal{H}} + |r - \bar{r}|_{V_2}\big). \end{aligned} \tag{5.5.28} \]

First Existence and Uniqueness Theorem (Special Case)

We first consider the following time-advanced backward stochastic differential equation in the unknown F-predictable processes (p(t), q(t), r(t, z)):
\[ \begin{aligned} dp(t) = {} & E\big[F\big(t, p(t), p(t + \delta)\chi_{[0,T-\delta]}(t), p_t\chi_{[0,T-\delta]}(t), q(t), q(t + \delta)\chi_{[0,T-\delta]}(t), q_t\chi_{[0,T-\delta]}(t), \\ & \qquad r(t), r(t + \delta)\chi_{[0,T-\delta]}(t), r_t\chi_{[0,T-\delta]}(t)\big) \,\big|\, \mathcal{F}_t\big]\,dt + q(t)\,dB(t) + \int_{\mathbb{R}} r(t, z)\,\tilde{N}(dt, dz) ; \quad t \in [0, T] \end{aligned} \tag{5.5.29} \]
\[ p(T) = G, \tag{5.5.30} \]
where G is a given $\mathcal{F}_T$-measurable random variable. Note that the time-advanced BSDE (5.5.8)–(5.5.9) for the adjoint processes of the Hamiltonian is of this form. For this type of time-advanced BSDEs we have the following result:


Theorem 5.24 ([ØSZ3]) Assume that $E[G^2] < \infty$ and that condition (5.5.28) is satisfied. Then the BSDE (5.5.29)–(5.5.30) has a unique solution (p(t), q(t), r(t, z)) such that
\[ E\Big[\int_0^T \Big\{ p^2(t) + q^2(t) + \int_{\mathbb{R}} r^2(t, z)\,\nu(dz) \Big\} dt\Big] < \infty. \tag{5.5.31} \]
Moreover, the solution can be found by inductively solving a sequence of BSDEs backwards as follows:

Step 0: In the interval [T − δ, T] we let p(t), q(t) and r(t, z) be defined as the solution of the classical BSDE
\[ dp(t) = F(t, p(t), 0, 0, q(t), 0, 0, r(t, z), 0, 0)\,dt + q(t)\,dB(t) + \int_{\mathbb{R}} r(t, z)\,\tilde{N}(dt, dz) ; \quad t \in [T - \delta, T], \tag{5.5.32} \]
\[ p(T) = G. \tag{5.5.33} \]

Step k ; k ≥ 1: If the values of (p(t), q(t), r(t, z)) have been found for t ∈ [T − kδ, T − (k − 1)δ], then if t ∈ [T − (k + 1)δ, T − kδ] the values of p(t + δ), $p_t$, q(t + δ), $q_t$, r(t + δ, z) and $r_t$ are known and hence the BSDE
\[ dp(t) = E[F(t, p(t), p(t + \delta), p_t, q(t), q(t + \delta), q_t, r(t), r(t + \delta), r_t) \mid \mathcal{F}_t]\,dt + q(t)\,dB(t) + \int_{\mathbb{R}} r(t, z)\,\tilde{N}(dt, dz) ; \quad t \in [T - (k + 1)\delta, T - k\delta] \tag{5.5.34} \]
\[ p(T - k\delta) = \text{the value found in Step } k - 1 \tag{5.5.35} \]
has a unique solution in [T − (k + 1)δ, T − kδ]. We proceed like this until k is such that T − (k + 1)δ ≤ 0 < T − kδ and then we solve the corresponding BSDE on the interval [0, T − kδ].

Proof The proof follows directly from the above inductive procedure. The estimate (5.5.31) is a consequence of known estimates for classical BSDEs (Pardoux–Peng). □

Second Existence and Uniqueness Theorem (General Case)

Next, consider the following backward stochastic differential equation in the unknown F-predictable processes (p(t), q(t), r(t, z)):
\[ dp(t) = E[F(t, p(t), p(t + \delta), p_t, q(t), q(t + \delta), q_t, r(t), r(t + \delta), r_t) \mid \mathcal{F}_t]\,dt + q(t)\,dB_t + \int_{\mathbb{R}} r(t, z)\,\tilde{N}(dt, dz) ; \quad t \in [0, T], \tag{5.5.36} \]
\[ p(t) = G(t), \quad t \in [T, T + \delta], \tag{5.5.37} \]

where G is a given $\mathcal{F}_T$-measurable stochastic process.

Theorem 5.25 ([ØSZ3]) Assume $E[\sup_{T \le t \le T+\delta} |G(t)|^2] < \infty$ and that the condition (5.5.28) is satisfied. Then the backward stochastic differential equation (5.5.36) admits a unique solution (p(t), q(t), r(t, z)) such that
\[ E\Big[\int_0^T \Big\{ p^2(t) + q^2(t) + \int_{\mathbb{R}} r^2(t, z)\,\nu(dz) \Big\} dt\Big] < \infty. \]

Proof Step 1. Assume F is independent of $p_1$, $p_2$ and p. Set $q^0(t) := 0$, $r^0(t, z) = 0$. For n ≥ 1, define $(p^n(t), q^n(t), r^n(t, z))$ to be the unique solution to the following backward stochastic differential equation:
\[ \begin{aligned} dp^n(t) = {} & E[F(t, q^{n-1}(t), q^{n-1}(t + \delta), q_t^{n-1}, r^{n-1}(t, \cdot), r^{n-1}(t + \delta, \cdot), r_t^{n-1}(\cdot)) \mid \mathcal{F}_t]\,dt \\ & + q^n(t)\,dB_t + \int_{\mathbb{R}} r^n(t, z)\,\tilde{N}(dt, dz), \quad t \in [0, T] \\ p^n(t) = {} & G(t), \quad t \in [T, T + \delta]. \end{aligned} \tag{5.5.38} \]
As pointed out in Chap. 4 the existence and uniqueness of the solution of the above BSDE is well known. We then show that $(p^n(t), q^n(t), r^n(t, z))$ forms a Cauchy sequence.

Step 2. General case. Let $p^0(t) = 0$. For n ≥ 1, define $(p^n(t), q^n(t), r^n(t, z))$ to be the unique solution to the following BSDE:
\[ \begin{aligned} dp^n(t) = {} & E[F(t, p^{n-1}(t), p^{n-1}(t + \delta), p_t^{n-1}, q^n(t), q^n(t + \delta), q_t^n, r^n(t, \cdot), r^n(t + \delta, \cdot), r_t^n(\cdot)) \mid \mathcal{F}_t]\,dt \\ & + q^n(t)\,dB_t + \int_{\mathbb{R}} r^n(t, z)\,\tilde{N}(dt, dz), \\ p^n(t) = {} & G(t) ; \quad t \in [T, T + \delta]. \end{aligned} \tag{5.5.39} \]
The existence of $(p^n(t), q^n(t), r^n(t, z))$ is proved in Step 1. By similar arguments as above, we deduce after some computations that
\[ E\Big[\int_0^T |p^{n+1}(s) - p^n(s)|^2\,ds\Big] \le \frac{e^{C_N T} T^n}{n!}. \]
By this inequality and a similar argument as in Step 1, it can be shown that $(p^n(t), q^n(t), r^n(t, z))$ converges to some limit (p(t), q(t), r(t, z)), which is the unique solution of equation (5.5.36). □

Theorem 5.26 ([ØSZ3]) Assume $E[\sup_{T \le t \le T+\delta} |G(t)|^{2\alpha}] < \infty$ for some α > 1 and that the following condition holds:


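\[ \begin{aligned} & |F(t, p_1, p_2, p, q_1, q_2, q, r_1, r_2, r) - F(t, \bar{p}_1, \bar{p}_2, \bar{p}, \bar{q}_1, \bar{q}_2, \bar{q}, \bar{r}_1, \bar{r}_2, \bar{r})| \\ & \quad \le C\big(|p_1 - \bar{p}_1| + |p_2 - \bar{p}_2| + \sup_{0 \le s \le \delta} |p(s) - \bar{p}(s)| + |q_1 - \bar{q}_1| + |q_2 - \bar{q}_2| + |q - \bar{q}|_{V_1} + |r_1 - \bar{r}_1|_{\mathcal{H}} + |r_2 - \bar{r}_2|_{\mathcal{H}} + |r - \bar{r}|_{V_2}\big). \end{aligned} \tag{5.5.40} \]
Then the BSDE (5.5.36) admits a unique solution (p(t), q(t), r(t, z)) such that
\[ E\Big[\sup_{0 \le t \le T} |p(t)|^{2\alpha} + \int_0^T \Big\{ q^2(t) + \int_{\mathbb{R}} r^2(t, z)\,\nu(dz) \Big\} dt\Big] < \infty. \]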

Proof Step 1. Assume F is independent of $p_1$, $p_2$ and p. In this case the condition above reduces to assumption (5.5.28). By Step 1 in the proof of Theorem 5.25, there is a unique solution (p(t), q(t), r(t, z)) to Eq. (5.5.36).

Step 2. General case. Let $p^0(t) = 0$. For n ≥ 1, define $(p^n(t), q^n(t), r^n(t, z))$ to be the unique solution to the following BSDE:
\[ \begin{aligned} dp^n(t) = {} & E[F(t, p^{n-1}(t), p^{n-1}(t + \delta), p_t^{n-1}, q^n(t), q^n(t + \delta), q_t^n, r^n(t, \cdot), r^n(t + \delta, \cdot), r_t^n(\cdot)) \mid \mathcal{F}_t]\,dt \\ & + q^n(t)\,dB_t + \int_{\mathbb{R}} r^n(t, z)\,\tilde{N}(dt, dz), \\ p^n(t) = {} & G(t), \quad t \in [T, T + \delta]. \end{aligned} \tag{5.5.41} \]
By Step 1, $(p^n(t), q^n(t), r^n(t, z))$ exists. We then proceed to show that $(p^n(t), q^n(t), r^n(t, z))$ forms a Cauchy sequence. After some computations we end up with the estimate
\[ E\Big[\int_0^T \sup_{t \le s \le T} |p^{n+1}(s) - p^n(s)|\,ds\Big] \le \frac{e^{C_N T} T^n}{n!}. \]
Using this inequality and a similar argument as in Step 1, we can show that $(p^n(t), q^n(t), r^n(t, z))$ converges to some limit (p(t), q(t), r(t, z)), which is the unique solution of equation (5.5.36). □

Finally we give a result when the driver F is independent of q and r.

Theorem 5.27 ([ØSZ3]) Assume that $E[\sup_{T \le t \le T+\delta} |G(t)|^2] < \infty$ and that F satisfies
\[ |F(t, y_1, y_2, p) - F(t, \bar{y}_1, \bar{y}_2, \bar{p})| \le C\big(|y_1 - \bar{y}_1| + |y_2 - \bar{y}_2| + \sup_{0 \le s \le \delta} |p(s) - \bar{p}(s)|\big). \tag{5.5.42} \]
Then the backward stochastic differential equation (5.5.36) admits a unique solution.


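5.5.4 Example: Optimal Consumption from a Cash Flow with Delay

Let α(t), β(t) and γ(t, z) be given bounded adapted processes, α(t) deterministic. Assume that $\int_{\mathbb{R}} \gamma^2(t, z)\,\nu(dz) < \infty$ for t ∈ [0, T]. Consider a cash flow $X_0(t)$ with the dynamics
\[ dX_0(t) = X_0(t - \delta)\Big[\alpha(t)\,dt + \beta(t)\,dB(t) + \int_{\mathbb{R}} \gamma(t, z)\,\tilde{N}(dt, dz)\Big] ; \quad t \in [0, T] \tag{5.5.43} \]
and such that
\[ X_0(t) = x_0(t) > 0 ; \quad t \in [-\delta, 0], \tag{5.5.44} \]
where $x_0(t)$ is a given bounded deterministic function. Suppose that at time t ∈ [0, T] we consume at the rate c(t) ≥ 0, a càdlàg adapted process. Then the dynamics of the corresponding net cash flow $X(t) = X^c(t)$ is
\[ \begin{aligned} dX(t) & = [X(t - \delta)\alpha(t) - c(t)]\,dt + X(t - \delta)\beta(t)\,dB(t) + X(t - \delta) \int_{\mathbb{R}} \gamma(t, z)\,\tilde{N}(dt, dz) ; \quad t \in [0, T] \\ X(t) & = x_0(t) ; \quad t \in [-\delta, 0]. \end{aligned} \tag{5.5.45} \]
Let $U_1(t, c, \omega) : [0, T] \times \mathbb{R}_+ \times \Omega \to \mathbb{R}$ be a given stochastic utility function satisfying the following conditions:
\[ \begin{aligned} & t \mapsto U_1(t, c, \omega) \text{ is } \mathcal{F}_t\text{-adapted for each } c \ge 0, \\ & c \mapsto U_1(t, c, \omega) \text{ is } C^1, \quad \frac{\partial U_1}{\partial c}(t, c, \omega) > 0, \\ & c \mapsto \frac{\partial U_1}{\partial c}(t, c, \omega) \text{ is strictly decreasing}, \\ & \lim_{c \to \infty} \frac{\partial U_1}{\partial c}(t, c, \omega) = 0 \text{ for all } (t, \omega) \in [0, T] \times \Omega. \end{aligned} \tag{5.5.46} \]
Put $v_0(t, \omega) = \dfrac{\partial U_1}{\partial c}(t, 0, \omega)$ and define
\[ I(t, v, \omega) = \begin{cases} 0 & \text{if } v \ge v_0(t, \omega), \\ \Big(\dfrac{\partial U_1}{\partial c}(t, \cdot, \omega)\Big)^{-1}(v) & \text{if } 0 \le v < v_0(t, \omega). \end{cases} \tag{5.5.47} \]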


Suppose we want to find a consumption rate $\hat{c}(t)$ such that
\[ J(\hat{c}) = \sup\{J(c) ; c \in \mathcal{A}\}, \tag{5.5.48} \]
where
\[ J(c) = E\Big[\int_0^T U_1(t, c(t), \omega)\,dt + kX(T)\Big]. \]
Here k > 0 is constant and $\mathcal{A}$ is the family of all càdlàg, $\mathcal{F}_t$-adapted processes c(t) ≥ 0 such that E[|X(T)|] < ∞. In this case the Hamiltonian given by (5.5.6) gets the form
\[ H(t, x, y, a, c, p, q, r(\cdot), \omega) = U_1(t, c, \omega) + (\alpha(t)y - c)p + y\beta(t)q + y \int_{\mathbb{R}} \gamma(t, z)r(z)\,\nu(dz). \tag{5.5.49} \]
Maximizing H with respect to c gives the following first-order condition for an optimal $\hat{c}(t)$:
\[ \frac{\partial U_1}{\partial c}(t, \hat{c}(t), \omega) = p(t). \tag{5.5.50} \]
The time-advanced BSDE for p(t), q(t), r(t, z) is, by (5.5.8)–(5.5.9),
\[ \begin{aligned} dp(t) = {} & -E\Big[\Big\{ \alpha(t)p(t + \delta) + \beta(t)q(t + \delta) + \int_{\mathbb{R}} \gamma(t, z)r(t + \delta, z)\,\nu(dz) \Big\}\chi_{[0,T-\delta]}(t) \,\Big|\, \mathcal{F}_t\Big]\,dt \\ & + q(t)\,dB(t) + \int_{\mathbb{R}} r(t, z)\,\tilde{N}(dt, dz) ; \quad t \in [0, T] \end{aligned} \tag{5.5.51} \]
\[ p(T) = k. \tag{5.5.52} \]
Since k is deterministic, we can choose q = r = 0 and (5.5.51)–(5.5.52) becomes
\[ dp(t) = -\alpha(t)p(t + \delta)\chi_{[0,T-\delta]}(t)\,dt ; \quad t < T, \tag{5.5.53} \]
\[ p(t) = k \quad \text{for } t \in [T - \delta, T + \delta]. \tag{5.5.54} \]

To solve this we introduce h(t) := p(T − t) ; t ∈ [−δ, T]. Then
\[ dh(t) = -dp(T - t) = \alpha(T - t)p(T - t + \delta)\,dt = \alpha(T - t)p(T - (t - \delta))\,dt = \alpha(T - t)h(t - \delta)\,dt \quad \text{for } t \in [0, T], \tag{5.5.55} \]
and
\[ h(t) = p(T - t) = k \quad \text{for } t \in [-\delta, 0]. \tag{5.5.56} \]
This determines h(t) inductively on each interval [jδ, (j + 1)δ] ; j = 0, 1, 2, . . . , as follows: If h(s) is known on [(j − 1)δ, jδ], then for t ∈ [jδ, (j + 1)δ] we have
\[ h(t) = h(j\delta) + \int_{j\delta}^{t} h'(s)\,ds = h(j\delta) + \int_{j\delta}^{t} \alpha(T - s)h(s - \delta)\,ds ; \quad t \in [j\delta, (j + 1)\delta]. \tag{5.5.57} \]
We have proved:

Proposition 5.28 (Optimal consumption rate in a stochastic system with delay) [ØSZ3] The optimal consumption rate $\hat{c}_\delta(t)$ for the problem (5.5.45), (5.5.48) is given by
\[ \hat{c}_\delta(t) = I(t, h_\delta(T - t), \omega), \tag{5.5.58} \]
where $h_\delta(\cdot) = h(\cdot)$ is determined by (5.5.56)–(5.5.57).

Remark 5.29 Assume that α(t) = α > 0 for all t ∈ [0, T]. Then we see by induction on (5.5.57) that
\[ 0 \le \delta_1 < \delta_2 \;\Rightarrow\; h_{\delta_1}(t) \ge h_{\delta_2}(t) \quad \text{for all } t \in (0, T], \]
with strict inequality for $t > \delta_1$ (on $(0, \delta_1]$ both functions equal k(1 + αt) when $\delta_1 > 0$), and hence, perhaps surprisingly,
\[ 0 \le \delta_1 < \delta_2 \;\Rightarrow\; \hat{c}_{\delta_1}(t) \le \hat{c}_{\delta_2}(t) \quad \text{for all } t \in [0, T). \]
Thus the optimal consumption rate does not decrease, and in general increases, if the delay increases. The explanation for this may be that the delay postpones the negative effect on the growth of the cash flow caused by the consumption.
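The recursion (5.5.56)–(5.5.57) and the monotonicity in Remark 5.29 can be illustrated numerically. The following is a minimal sketch under assumptions that are not part of the text: α is constant, the delay equation h'(t) = αh(t − δ) is discretized by an explicit Euler scheme whose step divides δ, and the utility is U₁(c) = c^γ/γ with γ ∈ (0, 1), for which v₀ = ∞ and I(v) = v^{1/(γ−1)}. The names `solve_h` and `consumption` are hypothetical.

```python
def solve_h(alpha, k, delta, T, n_steps):
    """Explicit Euler scheme for h'(t) = alpha * h(t - delta), with h = k on [-delta, 0].

    Returns the list [h(0), h(dt), ..., h(T)] with dt = T / n_steps.
    """
    dt = T / n_steps
    lag = int(round(delta / dt))  # number of grid steps covered by the delay
    h = [k]
    for i in range(n_steps):
        delayed = h[i - lag] if i - lag >= 0 else k  # value of h(t - delta)
        h.append(h[i] + alpha * delayed * dt)
    return h


def consumption(v, gamma):
    """I(v) for U1(c) = c^gamma / gamma: the inverse of c -> c^(gamma - 1)."""
    return v ** (1.0 / (gamma - 1.0))


# Illustrative parameters (assumed).
alpha, k, T, n_steps, gamma = 0.1, 1.0, 2.0, 2000, 0.5
h_no_delay = solve_h(alpha, k, 0.0, T, n_steps)
h_short = solve_h(alpha, k, 0.5, T, n_steps)
h_long = solve_h(alpha, k, 1.0, T, n_steps)

# Optimal consumption at time t = 0 is I(h(T)) by (5.5.58).
c0_short = consumption(h_short[-1], gamma)
c0_long = consumption(h_long[-1], gamma)
```

In this example h(T) decreases as the delay grows, so the time-0 consumption rate is larger for the longer delay, in line with Remark 5.29. On the first interval [0, δ] the scheme reproduces h(t) = k(1 + αt).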

5.6 Exercises

Exercise* 5.1 Suppose the wealth $X(t) = X^{(u)}(t)$ of a person with consumption rate $u(t) \ge 0$ satisfies the following Lévy type mean reverting Ornstein–Uhlenbeck SDE
\[
dX(t) = (\mu - \rho X(t) - u(t))\,dt + \sigma\,dB(t) + \theta\int_{\mathbb{R}} z\,\tilde{N}(dt, dz), \quad t > 0, \qquad X(0) = x > 0.
\]
Fix $T > 0$ and define
\[
J^{(u)}(s, x) = E^{s,x}\Big[\int_0^{T-s} e^{-\delta(s+t)}\frac{u^{\gamma}(t)}{\gamma}\,dt + \lambda X(T - s)\Big].
\]

Use dynamic programming to find the value function $\Phi(s, x)$ and the optimal consumption rate (control) $u^*(t)$ such that
\[
\Phi(s, x) = \sup_{u(\cdot)} J^{(u)}(s, x) = J^{(u^*)}(s, x).
\]

In the above $\mu, \rho, \sigma, \theta, T, \delta > 0$, $\gamma \in (0, 1)$, and $\lambda > 0$ are constants.

Exercise* 5.2 Solve the problem of Exercise 5.1 by using the stochastic maximum principle.

Exercise* 5.3 Define
\[
dX^{(u)}(t) = dX(t) = \begin{bmatrix} dX_1(t) \\ dX_2(t) \end{bmatrix} = \begin{bmatrix} u(t, \omega)\int_{\mathbb{R}} z\,\tilde{N}(dt, dz) \\ \int_{\mathbb{R}} z^2\,\tilde{N}(dt, dz) \end{bmatrix} \in \mathbb{R}^2
\]
and, for fixed $T > 0$ (deterministic),
\[
J(u) = E\big[-(X_1(T) - X_2(T))^2\big].
\]
Use the stochastic maximum principle to find $u^*$ such that
\[
J(u^*) = \sup_u J(u).
\]
Interpretation. Put $F(\omega) = \int_{\mathbb{R}} z^2\,\tilde{N}(T, dz)$. We may regard $F$ as a given $T$-claim in the normalized market with the two investment possibilities bond and stock, whose prices are

(bond) $dS_0(t) = 0$, $S_0(0) = 1$,
(stock) $dS_1(t) = \int_{\mathbb{R}} z\,\tilde{N}(dt, dz)$, a Lévy martingale.

Then $-J(u)$ is the variance of the difference between $F = X_2(T)$ and the wealth $X_1(T)$ generated by a self-financing portfolio $u(t, \omega)$. See [BDLØP] for more information on minimal variance hedging in markets driven by Lévy martingales.

Exercise* 5.4 Solve the stochastic control problem
\[
\Phi_1(s, x) = \inf_{u(t) \in \mathbb{R}} E^{s,x}\Big[\int_0^{\infty} e^{-\rho(s+t)}\big(X^2(t) + \theta u^2(t)\big)\,dt\Big],
\]
where
\[
dX(t) = u(t)\,dt + \sigma\,dB(t) + \int_{\mathbb{R}} z\,\tilde{N}(dt, dz), \qquad X(0) = x,
\]

where $\rho > 0$, $\theta > 0$, and $\sigma > 0$ are constants. The interpretation of this problem is that we want to push the process $X(t)$ as close as possible to 0 by using a minimum of energy, its rate being measured by $\theta u^2(t)$.
[Hint: Try $\varphi(s, x) = e^{-\rho s}(ax^2 + b)$ for some constants $a, b$.]

Exercise* 5.5 (The Stochastic Linear Regulator Problem) Solve the stochastic control problem
\[
\Phi_0(x) = \inf_u E^{x}\Big[\int_0^{T}\big(X^2(t) + \theta u(t)^2\big)\,dt + \lambda X^2(T)\Big],
\]
where
\[
dX(t) = u(t)\,dt + \sigma\,dB(t) + \int_{\mathbb{R}} z\,\tilde{N}(dt, dz), \qquad X(0) = x
\]

and $T > 0$ is a constant.
(a) By using dynamic programming (Theorem 5.1).
(b) By using the stochastic maximum principle (Theorem 5.4).

Exercise* 5.6 Solve the stochastic control problem
\[
\Phi(s, x) = \sup_{c(t) \ge 0} E^{s,x}\Big[\int_0^{\tau_0} e^{-\delta(s+t)}\ln c(t)\,dt\Big],
\]
where the supremum is taken over all $\mathcal{F}_t$-adapted processes $c(t) \ge 0$ and $\tau_0 = \inf\{t > 0;\ X(t) \le 0\}$, where
\[
dX(t) = X(t^-)\Big[\mu\,dt + \sigma\,dB(t) + \theta\int_{\mathbb{R}} z\,\tilde{N}(dt, dz)\Big] - c(t)\,dt, \qquad X(0) = x > 0,
\]

where $\delta > 0$, $\mu$, $\sigma$, and $\theta$ are constants, and $\theta z > -1$ for a.a. $z$ w.r.t. $\nu$. We may interpret $c(t)$ as the consumption rate, $X(t)$ as the corresponding wealth, and $\tau_0$ as the bankruptcy time. Thus $\Phi$ represents the maximal expected total discounted logarithmic utility of the consumption up to bankruptcy time.
[Hint: Try $\varphi(s, x) = e^{-\delta s}(a\ln x + b)$ as a candidate for $\Phi(s, x)$, where $a$ and $b$ are suitable constants.]

Exercise 5.7 Use the stochastic maximum principle (Theorem 5.4) to solve the problem
\[
\sup_{c(t) \ge 0} E\Big[\int_0^{T} e^{-\delta t}\ln c(t)\,dt + \lambda e^{-\delta T}\ln X(T)\Big],
\]
where
\[
dX(t) = X(t^-)\Big[\mu\,dt + \sigma\,dB(t) + \theta\int_{\mathbb{R}} z\,\tilde{N}(dt, dz)\Big] - c(t)\,dt, \qquad X(0) > 0.
\]

Here $\delta > 0$, $\lambda > 0$, $\mu$, $\sigma$, and $\theta$ are constants, and $\theta z > -1$ for a.a. $z$ $(\nu)$ (see Exercise 5.6 for an interpretation of this problem).
[Hint: Try $p(t) = ae^{-\mu t}X^{-1}(t)$ and $c(t) = e^{-\delta t}p^{-1}(t) = e^{(\mu-\delta)t}X(t)/a$, for some constant $a > 0$.]

Exercise 5.8 (Optimal Portfolio Problem) Suppose a financial market has two investment possibilities: a risk free asset with unit price $S_0(t) = 1$ for all $t$, and a risky asset, with unit price $S(t)$ given by the SDE
\[
\begin{cases}
dS(t) = S(t^-)\Big[\alpha(t)\,dt + \beta(t)\,dB(t) + \int_{\mathbb{R}} \gamma(t, \zeta)\,\tilde{N}(dt, d\zeta)\Big]; \quad t \in [0, T] \\
S(0) > 0,
\end{cases} \tag{5.6.1}
\]
where $\alpha(t)$, $\beta(t)$, $\gamma(t, \zeta) > -1$ are given bounded deterministic functions. Suppose we introduce a portfolio $\pi(t)$ in this market, representing the fraction of the total wealth $X(t)$ invested in the risky asset at time $t$. Then the corresponding equation for the wealth process $X(t) = X^{\pi}(t)$ will be
\[
\begin{cases}
dX(t) = \pi(t)X(t^-)\Big[\alpha(t)\,dt + \beta(t)\,dB(t) + \int_{\mathbb{R}} \gamma(t, \zeta)\,\tilde{N}(dt, d\zeta)\Big]; \quad t \in [0, T] \\
X(0) = x_0 > 0.
\end{cases} \tag{5.6.2}
\]
Let $U : (0, \infty) \to [-\infty, \infty)$ be a given utility function, assumed to be $C^1$ and concave. We want to find the portfolio $\pi^*(t)$ which maximizes the expected utility of the terminal wealth, i.e. such that
\[
\sup_{\pi} J_1(\pi) = J_1(\pi^*), \tag{5.6.3}
\]
where $J_1(\pi)$ is the performance functional defined by
\[
J_1(\pi) = E[U(X^{\pi}(T))]. \tag{5.6.4}
\]

1. Use dynamic programming and the HJB equation to solve this problem in the following cases (discuss possible admissible conditions for the control):
(a) (Logarithmic utility) $N = 0$ and $U(x) = \ln(x)$. Hint: Try a value function of the form $\varphi(s, x) = h(s) + \ln(x)$, for some function $h$.
(b) (Power utility) $N = 0$ and $U(x) = \frac{1}{\rho}x^{\rho}$, where $\rho \in (-\infty, 1)$ is a constant. Hint: Try a value function of the form $\varphi(s, x) = k(s) \times \frac{1}{\rho}x^{\rho}$, for some function $k$.
(c) (Exponential utility) $N = 0$ and $U(x) = \exp(-\alpha x)$, where $\alpha > 0$ is a constant.
(d) (Jump diffusion case) $N \ne 0$ and $U(x) = \ln(x)$.
2. Use now the maximum principle to solve the optimal portfolio problem for the various cases given above. Hint to case (d): In the adjoint BSDE for $(p, q, r)$, try to put $q(t) = p(t)\theta_0(t)$ and $r(t, \zeta) = p(t^-)\theta_1(t, \zeta)$ for some processes $\theta_0, \theta_1$.

Exercise 5.9 (Optimal Consumption Problem) Suppose we are given a cash flow $X^0(t)$ of the form
\[
\begin{cases}
dX^0(t) = X^0(t^-)\Big[\alpha(t)\,dt + \beta(t)\,dB(t) + \int_{\mathbb{R}} \gamma(t, \zeta)\,\tilde{N}(dt, d\zeta)\Big]; \quad t \in [0, T] \\
X^0(0) = x_0 > 0,
\end{cases} \tag{5.6.5}
\]
with $\alpha(t)$, $\beta(t)$, $\gamma(t, \zeta)$ as in Exercise 5.8. If we introduce a relative consumption rate $c(t)$ from this wealth process, the corresponding wealth $X(t) = X^{c}(t)$ gets the dynamics
\[
\begin{cases}
dX(t) = X(t^-)\Big[(\alpha(t) - c(t))\,dt + \beta(t)\,dB(t) + \int_{\mathbb{R}} \gamma(t, \zeta)\,\tilde{N}(dt, d\zeta)\Big]; \quad t \in [0, T] \\
X(0) = x_0 > 0.
\end{cases}
\]
We want to find the relative consumption rate $c^*(t)$ which maximizes the total expected utility from consumption up to time $T$, i.e. we want to find $c^*(t)$ such that
\[
\sup_{c} J_2(c) = J_2(c^*), \tag{5.6.6}
\]
where $J_2(c)$ is the performance functional defined by
\[
J_2(c) = E\Big[\int_0^{T} U(c(t)X^{c}(t))\,dt + \lambda U(X^{c}(T))\Big], \tag{5.6.7}
\]

where $\lambda > 0$ is a given constant.
1. Use dynamic programming and the HJB equation to solve this problem in the following cases:
(a) (Logarithmic utility) $N = 0$ and $U(x) = \ln(x)$. Hint: Try a value function of the form $\varphi(s, x) = f(s)\ln(x) + g(s)$, for some functions $f, g$.
(b) (Power utility) $N = 0$ and $U(x) = \frac{1}{\rho}x^{\rho}$, where $\rho \in (-\infty, 1)$.
(c) (Exponential utility) $N = 0$ and $U(x) = \exp(-\alpha x)$, where $\alpha > 0$.
(d) (Jump diffusion case) $N \ne 0$ and $U(x) = \ln(x)$.
2. Now use the maximum principle to solve the optimal consumption problem in the various cases given above. Hint to case (a): In the adjoint BSDE for $(p, q)$ try to put $q(t) = p(t)\theta_0(t)$ for some process $\theta_0$.

Exercise 5.10 (Infinite Horizon Consumption Problem) We consider an infinite horizon problem as in Example 5.9. Solve this problem with the performance functional (5.3.15) replaced by
\[
J(u) = E\Big[\int_0^{\infty} e^{-\rho t}\ln u(t)\,dt\Big]
\]
and with $\sigma(t, x, u) = \sigma_0(t)x$.

Chapter 6

Stochastic Differential Games

© Springer Nature Switzerland AG 2019 B. Øksendal and A. Sulem, Applied Stochastic Control of Jump Diffusions, Universitext, https://doi.org/10.1007/978-3-030-02781-0_6

6.1 Stochastic Differential (Markov) Games, HJB–Isaacs Equations

In this section we present the dynamic programming approach to stochastic differential games. We only present the case for zero sum games. For the extension to non-zero sum games, we refer to [MØ]. Suppose the state $Y(t) = Y_y^{(u)}(t) \in \mathbb{R}^k$ is described by a controlled Markovian SDE of the form
\[
\begin{cases}
dY(t) = b(Y(t), u(t))\,dt + \sigma(Y(t), u(t))\,dB(t) + \int_{\mathbb{R}^l} \gamma(Y(t), u(t), \zeta)\,\tilde{N}(dt, d\zeta); \quad t \ge 0 \\
Y(0) = y \in \mathbb{R}^k.
\end{cases} \tag{6.1.1}
\]
Here $u = (u_1, u_2)$ is a control process with values in $V_1 \times V_2 \subset \mathbb{R}^{n_1} \times \mathbb{R}^{n_2}$. The control $u_i$ is chosen by player number $i$, and $\mathcal{A}_i$ is the set of admissible control processes for player number $i$; $i = 1, 2$. We put $\mathcal{A} = \mathcal{A}_1 \times \mathcal{A}_2$. Let $f : \mathbb{R}^k \times V_1 \times V_2 \to \mathbb{R}$ be a given profit rate and let $g : \mathbb{R}^k \to \mathbb{R}$ be a given bequest function. We assume that
\[
E^{y}\Big[\int_0^{\tau_S} |f(Y(t), u(t))|\,dt + |g(Y(\tau_S))|\Big] < \infty
\]
for all $y$, where $E^{y}[\varphi(Y(t))]$ means $E[\varphi(Y_y^{(u)}(t))]$. Here
\[
\tau_S := \inf\{t > 0;\ Y_y^{(u)}(t) \notin S\}, \tag{6.1.2}
\]

with $S \subset \mathbb{R}^k$ being a given solvency region. We may interpret $\tau_S$ as the bankruptcy time. Let $\mathcal{A}_1, \mathcal{A}_2$ be given families of admissible control processes $u_1, u_2$ for player number 1 and 2, respectively.

Problem 6.1 (Zero-sum stochastic differential games) Find $\Phi(y)$ and $u_i^* \in \mathcal{A}_i$; $i = 1, 2$ such that
\[
\Phi(y) = \sup_{u_1 \in \mathcal{A}_1}\Big(\inf_{u_2 \in \mathcal{A}_2} J^{u_1,u_2}(y)\Big) = J^{u_1^*,u_2^*}(y), \tag{6.1.3}
\]
where
\[
J^{u}(y) = J^{u_1,u_2}(y) = E^{y}\Big[\int_0^{\tau_S} f(Y(s), u(s))\,ds + g(Y(\tau_S))\chi_{\{\tau_S < \infty\}}\Big].
\]
[…] 0 ; |Y(t)| ≥ N} ; N = 1, 2, . . .

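Hence, by (i) we get, with $u_i(s) = u_i(Y(s))$,
\[
E^{y}\big[\varphi(Y^{\hat{u}_1,u_2}(\tau_S^{(N)}))\big] \ge \varphi(y) - E^{y}\Big[\int_0^{\tau_S^{(N)}} f(Y^{\hat{u}_1,u_2}(s), \hat{u}_1(s), u_2(s))\,ds\Big].
\]
Therefore,
\[
\varphi(y) \le E^{y}\Big[\int_0^{\tau_S^{(N)}} f(Y^{\hat{u}_1,u_2}(s), \hat{u}_1(s), u_2(s))\,ds + \varphi(Y^{\hat{u}_1,u_2}(\tau_S^{(N)}))\Big] \to J^{\hat{u}_1,u_2}(y) \quad \text{as } N \to \infty. \tag{6.1.8}
\]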

Here we have used condition (v) and the fact that Y (·) is quasi-left continuous (i.e. left continuous at increasing stopping times; see [JS], Propositions I.2.26 and I.3.27). Since u 2 ∈ A2 was arbitrary, we deduce from (6.1.8) that

\[
\varphi(y) \le \inf_{u_2 \in \mathcal{A}_2} J^{\hat{u}_1,u_2}(y). \tag{6.1.9}
\]
It follows that
\[
\varphi(y) \le \sup_{u_1 \in \mathcal{A}_1}\Big(\inf_{u_2 \in \mathcal{A}_2} J^{u_1,u_2}(y)\Big) = \Phi(y). \tag{6.1.10}
\]

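Similarly, applying (ii) we get
\[
\varphi(y) \ge J^{u_1,\hat{u}_2}(y) \quad \text{for all } u_1 \in \mathcal{A}_1 \tag{6.1.11}
\]
and therefore
\[
\varphi(y) \ge \sup_{u_1 \in \mathcal{A}_1} J^{u_1,\hat{u}_2}(y) \ge \inf_{u_2 \in \mathcal{A}_2}\Big(\sup_{u_1 \in \mathcal{A}_1} J^{u_1,u_2}(y)\Big). \tag{6.1.12}
\]
In the same way, applying (iii) we get
\[
\varphi(y) = J^{\hat{u}_1,\hat{u}_2}(y). \tag{6.1.13}
\]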

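By combining (6.1.9)–(6.1.13) we obtain
\[
\begin{aligned}
\varphi(y) = \inf_{u_2 \in \mathcal{A}_2} J^{\hat{u}_1,u_2}(y) &\le \sup_{u_1 \in \mathcal{A}_1}\Big(\inf_{u_2 \in \mathcal{A}_2} J^{u_1,u_2}(y)\Big) = \Phi(y) \\
&\le \inf_{u_2 \in \mathcal{A}_2}\Big(\sup_{u_1 \in \mathcal{A}_1} J^{u_1,u_2}(y)\Big) \le \sup_{u_1 \in \mathcal{A}_1} J^{u_1,\hat{u}_2}(y) \le \varphi(y).
\end{aligned}
\]
We conclude that $\varphi(y) = \Phi(y)$, that (6.1.6) holds and that $u^* := (\hat{u}_1, \hat{u}_2)$ is optimal.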
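The chain of inequalities used in this proof rests on the general fact that sup inf never exceeds inf sup, with equality when a saddle point exists. As a finite-dimensional illustration only, the sketch below replaces the control sets by the rows and columns of a payoff matrix; the matrices and the function names `maximin` and `minimax` are hypothetical and not taken from the text.

```python
def maximin(payoff):
    """sup over rows (player 1) of the inf over columns (player 2)."""
    return max(min(row) for row in payoff)

def minimax(payoff):
    """inf over columns (player 2) of the sup over rows (player 1)."""
    n_cols = len(payoff[0])
    return min(max(row[j] for row in payoff) for j in range(n_cols))

# Entry (1, 1) = 2 is the smallest in its row and the largest in its column,
# so it is a saddle point and the two values coincide.
with_saddle = [[3, 1, 4],
               [5, 2, 6],
               [0, -1, 7]]

# Matching pennies: no saddle point in pure strategies, so the gap is strict.
without_saddle = [[1, -1],
                  [-1, 1]]

print(maximin(with_saddle), minimax(with_saddle))
print(maximin(without_saddle), minimax(without_saddle))
```

In the first matrix both values equal the saddle value, mirroring the conclusion $\varphi(y) = \Phi(y)$; in the second the inequality is strict, which is why the existence of the pair $(\hat{u}_1, \hat{u}_2)$ matters.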

6.1.1 Entropic Risk Minimization Example

We now return to the problem of risk minimization in a financial market, but now the risk $\rho$ is represented by (4.5.5). If $X_\beta(t)$ is the wealth process corresponding to a portfolio $\beta$, then the risk minimization problem becomes
\[
\inf_{\beta \in \mathcal{A}}\Big(\sup_{Q \in \mathcal{P}}\{E_Q[-X_\beta(T)] - \alpha(Q)\}\Big). \tag{6.1.14}
\]
This is a min-max problem, which in our setting is an example of a stochastic differential game:

Suppose the financial market is as in (4.1.1), but with $r = 0$ and with jumps added, i.e.
\[
\begin{cases}
S_0(t) = 1 \quad \text{for all } t \\
dS_1(t) = S_1(t^-)\Big[\mu(t)\,dt + \sigma(t)\,dB(t) + \int_{\mathbb{R}} \gamma(t, \zeta)\,\tilde{N}(dt, d\zeta)\Big]; \quad t \ge 0 \\
S_1(0) > 0.
\end{cases} \tag{6.1.15}
\]
If $\beta(t)$ is a self-financing portfolio representing the number of units of the risky asset (with unit price $S_1(t)$) held at time $t$, the corresponding wealth process $X(t) = X_\beta(t)$ is given by
\[
dX(t) = \beta(t)\,dS_1(t) = w(t)\Big[\mu(t)\,dt + \sigma(t)\,dB(t) + \int_{\mathbb{R}} \gamma(t, \zeta)\,\tilde{N}(dt, d\zeta)\Big] = dX_w(t), \tag{6.1.16}
\]
where $w(t) := \beta(t)S_1(t^-)$ is the amount held in the risky asset at time $t$. If we use the representation (4.5.5) of the risk measure $\rho = \rho_e$ corresponding to the family $\mathcal{P}$ of measures $Q$ given in (4.5.6)–(4.5.8) and the entropic penalty $\alpha_e$ given by (4.5.9), the risk minimizing portfolio problem (6.1.14) can be written
\[
\inf_{w \in \mathcal{W}}\Big(\sup_{\theta \in \Theta} E\Big[-\frac{dQ}{dP}X_w(T) - \frac{dQ}{dP}\log\frac{dQ}{dP}\Big]\Big), \tag{6.1.17}
\]

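where $\mathcal{W}$ is the family of admissible portfolios $w$. To put this problem into the setting above, we represent $Q$ by $\frac{dQ}{dP} = M^{\theta}(T)$ and we put
\[
dY(t) = dY^{\theta,w}(t) = \begin{bmatrix} dt \\ dX_w(t) \\ dM^{\theta}(t) \end{bmatrix}
= \begin{bmatrix} 1 \\ w(t)\mu(t) \\ 0 \end{bmatrix}dt
+ \begin{bmatrix} 0 \\ w(t)\sigma(t) \\ M^{\theta}(t)\theta_0(t) \end{bmatrix}dB(t)
+ \int_{\mathbb{R}} \begin{bmatrix} 0 \\ w(t)\gamma(t, \zeta) \\ M^{\theta}(t^-)\theta_1(t, \zeta) \end{bmatrix}\tilde{N}(dt, d\zeta), \tag{6.1.18}
\]
with initial value
\[
Y^{\theta,w}(0) = y = \begin{bmatrix} s \\ x \\ m \end{bmatrix}; \quad s \in [0, T],\ x > 0,\ m > 0. \tag{6.1.19}
\]
In this case the solvency region is $S = [0, T] \times \mathbb{R}_+ \times \mathbb{R}_+$ and the performance functional is
\[
J^{\theta,w}(s, x, m) = E^{s,x,m}\big[-M^{\theta}(T)X_w(T) - M^{\theta}(T)\log M^{\theta}(T)\big]. \tag{6.1.20}
\]
Assume from now on that
\[
\mu(t),\ \sigma(t) \text{ and } \gamma(t, \zeta) \text{ are deterministic.} \tag{6.1.21}
\]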

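Then $Y^{\theta,w}(t)$ becomes a controlled jump diffusion, and the risk minimization problem (6.1.14) is the following special case of Problem 6.1:

Problem 6.2 (Entropic Risk Minimization) Find $w^* \in \mathcal{W}$, $\theta^* \in \Theta$ and $\Phi(y)$ such that
\[
\Phi(y) = \inf_{w \in \mathcal{W}}\Big(\sup_{\theta \in \Theta} J^{\theta,w}(y)\Big) = J^{\theta^*,w^*}(y); \quad y \in S. \tag{6.1.22}
\]
By (6.1.18) we see that the generator $A^{\theta,w}$ is given by (see (6.1.5))
\[
\begin{aligned}
A^{\theta,w}\varphi(s, x, m) = {} & \frac{\partial\varphi}{\partial s}(s, x, m) + w\mu(s)\frac{\partial\varphi}{\partial x}(s, x, m) + \frac{1}{2}w^2\sigma^2(s)\frac{\partial^2\varphi}{\partial x^2}(s, x, m) \\
& + \frac{1}{2}m^2\theta_0^2\frac{\partial^2\varphi}{\partial m^2}(s, x, m) + w\theta_0 m\sigma(s)\frac{\partial^2\varphi}{\partial x\,\partial m}(s, x, m) \\
& + \int_{\mathbb{R}}\Big\{\varphi(s, x + w\gamma(s, \zeta), m + m\theta_1(\zeta)) - \varphi(s, x, m) \\
& \qquad - \frac{\partial\varphi}{\partial x}(s, x, m)w\gamma(s, \zeta) - \frac{\partial\varphi}{\partial m}(s, x, m)m\theta_1(\zeta)\Big\}\nu(d\zeta).
\end{aligned} \tag{6.1.23}
\]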

Comparing with the general formulation in (6.1.4), we see that in this case $f = 0$ and $g(y) = g(x, m) = -mx - m\log(m)$. Therefore, according to Theorem 6.1, we should try to find a function $\varphi(s, x, m) \in C^2(S) \cap C(\mathbb{R}^3)$ and control values $\theta = \hat{\theta}(y)$, $w = \hat{w}(y)$ such that
\[
\inf_{w \in \mathbb{R}}\Big(\sup_{\theta \in \mathbb{R}^2} A^{\theta,w}\varphi(y)\Big) = A^{\hat{\theta},\hat{w}}\varphi(y); \quad y \in S \tag{6.1.24}
\]
and
\[
\lim_{t \to T^-}\varphi(s, x, m) = -xm - m\log(m). \tag{6.1.25}
\]
Let us try a function of the form
\[
\varphi(s, x, m) = -xm - m\log(m) + \kappa(s)m, \tag{6.1.26}
\]
where $\kappa$ is a deterministic function, $\kappa(T) = 0$. Then by (6.1.23)

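\[
\begin{aligned}
A^{\theta,w}\varphi(s, x, m) = {} & \kappa'(s)m + w\mu(s)(-m) + \frac{1}{2}m^2\theta_0^2\Big(-\frac{1}{m}\Big) + w\theta_0 m\sigma(s)(-1) \\
& + \int_{\mathbb{R}}\big\{-(x + w\gamma(s, \zeta))(m + m\theta_1(\zeta)) + xm - (m + m\theta_1(\zeta))\log(m + m\theta_1(\zeta)) \\
& \qquad + m\log m + \kappa(s)(m + m\theta_1(\zeta)) - \kappa(s)m + mw\gamma(s, \zeta) \\
& \qquad - m\theta_1(\zeta)(-x - 1 - \log m + \kappa(s))\big\}\nu(d\zeta) \\
= {} & m\Big[\kappa'(s) - w\mu(s) - \frac{1}{2}\theta_0^2 - w\theta_0\sigma(s) + \int_{\mathbb{R}}\theta_1(\zeta)\{1 - \log(1 + \theta_1(\zeta)) - w\gamma(s, \zeta)\}\nu(d\zeta)\Big].
\end{aligned} \tag{6.1.27}
\]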

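Maximizing $A^{\theta,w}\varphi(y)$ with respect to $\theta = (\theta_0, \theta_1)$ and minimizing with respect to $w$ gives the following first-order equations
\[
\hat{\theta}_0(s) + \hat{w}(s)\sigma(s) = 0, \tag{6.1.28}
\]
\[
1 - \log(1 + \hat{\theta}_1(s, \zeta)) - \hat{w}(s)\gamma(s, \zeta) - \frac{\hat{\theta}_1(s, \zeta)}{1 + \hat{\theta}_1(s, \zeta)} = 0, \tag{6.1.29}
\]
\[
\mu(s) + \hat{\theta}_0(s)\sigma(s) - \int_{\mathbb{R}}\hat{\theta}_1(s, \zeta)\gamma(s, \zeta)\,\nu(d\zeta) = 0. \tag{6.1.30}
\]
These are 3 equations in the 3 unknown candidates $\hat{\theta}_0$, $\hat{\theta}_1$ and $\hat{w}$ for the optimal control for the game in Problem 6.2. To get an explicit solution, let us now assume that
\[
N = 0 \text{ and } \gamma = \theta_1 = 0. \tag{6.1.31}
\]
Then (6.1.28)–(6.1.30) gives
\[
\hat{\theta}_0(s) = -\frac{\mu(s)}{\sigma(s)}, \qquad \hat{w}(s) = \frac{\mu(s)}{\sigma^2(s)}. \tag{6.1.32}
\]
Substituted into (6.1.27) we get by (iii) of Theorem 6.1
\[
A^{\hat{\theta},\hat{w}}\varphi(s, x, m) = m\Big[\kappa'(s) - \frac{1}{2}\Big(\frac{\mu(s)}{\sigma(s)}\Big)^2\Big] = 0.
\]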

Combining this with (iv) of Theorem 6.1 we obtain
\[
\kappa(s) = -\int_s^{T}\frac{1}{2}\Big(\frac{\mu(t)}{\sigma(t)}\Big)^2 dt. \tag{6.1.33}
\]
Now all the conditions of Theorem 6.1 are satisfied, and we get:

Theorem 6.2 (Entropic Risk Minimization) Assume that (6.1.21) and (6.1.31) hold. Then the solution of Problem 6.2 is
\[
\Phi(s, x, m) = -xm - m\log m - m\int_s^{T}\frac{1}{2}\Big(\frac{\mu(t)}{\sigma(t)}\Big)^2 dt \tag{6.1.34}
\]
and the optimal controls are
\[
\hat{\theta}_0(s) = -\frac{\mu(s)}{\sigma(s)} \quad \text{and} \quad \hat{w}(s) = \frac{\mu(s)}{\sigma^2(s)}; \quad s \in [0, T]. \tag{6.1.35}
\]
In particular, choosing the initial values $s = 0$ and $m = 1$ we get
\[
\Phi(0, x, 1) = -x - \int_0^{T}\frac{1}{2}\Big(\frac{\mu(t)}{\sigma(t)}\Big)^2 dt. \tag{6.1.36}
\]

This agrees with what we obtained in Proposition 5.20 using the stochastic HJB equation for optimal control of FBSDEs. See also Theorem 6.9. Remark 6.3 Theorem 6.2 could also have been obtained by using a stochastic HJB approach to the optimal control of forward-backward SDEs. This was presented in Sect. 5.4.
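For constant coefficients the formulas of Theorem 6.2 can be checked by direct arithmetic. The sketch below is illustrative only: it assumes constant $\mu$ and $\sigma$, no jumps as in (6.1.31), and uses hypothetical values $\mu = 0.1$, $\sigma = 0.2$, $T = 1$. The function names are not from the text.

```python
import math

def optimal_controls(mu, sigma):
    """Candidate controls (6.1.32) for constant coefficients."""
    theta0_hat = -mu / sigma
    w_hat = mu / sigma**2
    return theta0_hat, w_hat

def control_part(theta0, w, mu, sigma):
    """Control-dependent part of the bracket in (6.1.27) when gamma = theta_1 = 0:
    -w*mu - theta0^2/2 - w*theta0*sigma."""
    return -w * mu - 0.5 * theta0**2 - w * theta0 * sigma

def sup_over_theta0(w, mu, sigma):
    """Maximum of control_part over theta0, attained at theta0 = -w*sigma."""
    return control_part(-w * sigma, w, mu, sigma)

def value(s, x, m, mu, sigma, T):
    """phi(s, x, m) = -x*m - m*log(m) + kappa(s)*m with kappa from (6.1.33)."""
    kappa = -0.5 * (mu / sigma) ** 2 * (T - s)
    return -x * m - m * math.log(m) + kappa * m

mu, sigma, T = 0.1, 0.2, 1.0
theta0_hat, w_hat = optimal_controls(mu, sigma)
print(theta0_hat, w_hat)
print(control_part(theta0_hat, w_hat, mu, sigma))
print(value(0.0, 10.0, 1.0, mu, sigma, T))
```

At the candidate pair the control-dependent part equals $-\frac{1}{2}(\mu/\sigma)^2$, which is what forces $\kappa'(s) = \frac{1}{2}(\mu/\sigma)^2$. The helper `sup_over_theta0` makes the inf-sup structure explicit: after maximizing over $\theta_0$, the remaining function of $w$ is a convex quadratic minimized at $\hat{w} = \mu/\sigma^2$.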

6.2 Stochastic Maximum Principles

We first recall some basic concepts and results from Banach space theory. Let $V$ be an open subset of a Banach space $X$ with norm $\|\cdot\|$ and let $F : V \to \mathbb{R}$.

(i) We say that $F$ has a directional derivative (or Gâteaux derivative) at $x \in V$ in the direction $y \in X$ if
\[
D_y F(x) := \lim_{\varepsilon \to 0}\frac{1}{\varepsilon}\big(F(x + \varepsilon y) - F(x)\big)
\]
exists.
(ii) We say that $F$ is Fréchet differentiable at $x \in V$ if there exists a linear map $L : X \to \mathbb{R}$ such that
\[
\lim_{\substack{h \to 0 \\ h \in X}}\frac{1}{\|h\|}\,|F(x + h) - F(x) - L(h)| = 0.
\]
In this case we call $L$ the gradient (or Fréchet derivative) of $F$ at $x$ and we write $L = \nabla_x F$.
(iii) If $F$ is Fréchet differentiable, then $F$ has a directional derivative in all directions $y \in X$ and $D_y F(x) = \nabla_x F(y)$.
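Property (iii) can be checked numerically in the simplest Banach space, $X = \mathbb{R}^2$, where the Fréchet derivative acts as the dot product with the usual gradient. The sketch below is only an illustration under that assumption; the test function `F`, its hand-computed gradient `grad_F`, and the step size are hypothetical choices.

```python
import math

def F(x):
    """A smooth test function on R^2 (illustrative choice)."""
    return x[0] ** 2 + math.sin(x[1])

def grad_F(x):
    """Gradient of F computed by hand: (2*x0, cos(x1))."""
    return (2.0 * x[0], math.cos(x[1]))

def directional_derivative(func, x, y, eps=1e-6):
    """Difference quotient (func(x + eps*y) - func(x)) / eps from definition (i)."""
    shifted = tuple(xi + eps * yi for xi, yi in zip(x, y))
    return (func(shifted) - func(x)) / eps

x = (1.0, 0.5)
y = (2.0, -1.0)
numeric = directional_derivative(F, x, y)
exact = sum(g * yi for g, yi in zip(grad_F(x), y))  # the linear map applied to y
print(numeric, exact)
```

The difference quotient agrees with the gradient applied to $y$ up to an error of order `eps`, which is the finite-dimensional content of $D_y F(x) = \nabla_x F(y)$.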

6.2.1 General (Non-zero) Stochastic Differential Games

In this section, we formulate and prove a sufficient and a necessary maximum principle for general stochastic differential games (not necessarily zero-sum games) of forward-backward SDEs. Consider a controlled forward SDE of the form
\[
\begin{aligned}
dX(t) = dX^{u}(t) = {} & b(t, X(t), u(t), \omega)\,dt + \sigma(t, X(t), u(t), \omega)\,dB(t) \\
& + \int_{\mathbb{R}} \gamma(t, X(t^-), u(t), \zeta, \omega)\,\tilde{N}(dt, d\zeta); \quad 0 \le t \le T; \qquad X(0) = x \in \mathbb{R}.
\end{aligned} \tag{6.2.1}
\]
Here $u = (u_1, u_2)$, where $u_i(t)$ is the control of player $i$; $i = 1, 2$. We assume that we are given two subfiltrations
\[
\mathcal{E}_t^{(i)} \subseteq \mathcal{F}_t; \quad t \in [0, T], \tag{6.2.2}
\]
representing the information available to player $i$ at time $t$; $i = 1, 2$. We let $\mathcal{A}_i$ denote a given set of admissible control processes for player $i$, contained in the set of $\mathcal{E}_t^{(i)}$-predictable processes; $i = 1, 2$, with values in $A_i \subset \mathbb{R}^d$, $d \ge 1$. Define $\mathbb{U} := A_1 \times A_2$ and $\mathcal{U} := \mathcal{A}_1 \times \mathcal{A}_2$. We assume that $b(\cdot, x, u, \omega)$, $\sigma(\cdot, x, u, \omega)$, $\gamma(\cdot, x, u, \zeta, \omega)$ are given predictable processes for each $x$ in $\mathbb{R}$, $u$ in $\mathbb{U}$, $\zeta$ in $\mathbb{R}_0 := \mathbb{R}\setminus\{0\}$ such that (6.2.1) has a unique solution for each $u$ in $\mathcal{U}$. We consider the associated controlled backward SDEs (i.e. BSDEs) in the unknowns $(Y_i^{u}(t), Z_i^{u}(t), K_i^{u}(t, \zeta)) = (Y_i(t), Z_i(t), K_i(t, \zeta))$ of the form
\[
\begin{cases}
dY_i(t) = -g_i(t, X(t), Y_i(t), Z_i(t), K_i(t, \cdot), u(t), \omega)\,dt + Z_i(t)\,dB(t) + \int_{\mathbb{R}} K_i(t, \zeta)\,\tilde{N}(dt, d\zeta); \quad 0 \le t \le T \\
Y_i(T) = h_i(X(T), \omega); \quad i = 1, 2.
\end{cases} \tag{6.2.3}
\]

Here $g_i(\cdot, x, y, z, k, u, \omega)$ are given predictable processes for each $x$ in $\mathbb{R}$, $y$ in $\mathbb{R}$, $z$ in $\mathbb{R}$, $k$ in $\mathcal{R}$, $u$ in $\mathbb{U}$, and $h_i(x, \omega)$ is $\mathcal{F}_T$-measurable for each given $x$ in $\mathbb{R}$, such that the BSDEs (6.2.3) have unique solutions for each $u$ in $\mathcal{U}$. Let $f_i(t, x, u) : [0, T] \times \mathbb{R} \times \mathbb{U} \to \mathbb{R}$, $\varphi_i(x) : \mathbb{R} \to \mathbb{R}$ and $\psi_i(x) : \mathbb{R} \to \mathbb{R}$ be given profit rates, bequest functions and “risk evaluations” respectively, of player $i$; $i = 1, 2$. Define for $i = 1, 2$
\[
J_i(u) := E\Big[\int_0^{T} f_i(t, X^{u}(t), u(t), \omega)\,dt + \varphi_i(X^{u}(T), \omega) + \psi_i(Y_i^{u}(0))\Big], \tag{6.2.4}
\]
provided the integrals and expectations exist. We call $J_i(u)$ the performance functional of player $i$; $i = 1, 2$. We assume that $b, \sigma, \gamma, g_i, h_i, f_i, \varphi_i$ and $\psi_i$ are $C^1$ with respect to $x, y, z, u$ and that
\[
\psi_i'(x) \ge 0 \quad \text{for all } x;\ i = 1, 2. \tag{6.2.5}
\]

A Nash equilibrium for the FBSDE game (6.2.1)–(6.2.4) is a pair $(\hat{u}_1, \hat{u}_2) \in \mathcal{A}_1 \times \mathcal{A}_2$ such that
\[
J_1(u_1, \hat{u}_2) \le J_1(\hat{u}_1, \hat{u}_2) \quad \text{for all } u_1 \in \mathcal{A}_1 \tag{6.2.6}
\]
and
\[
J_2(\hat{u}_1, u_2) \le J_2(\hat{u}_1, \hat{u}_2) \quad \text{for all } u_2 \in \mathcal{A}_2. \tag{6.2.7}
\]
Heuristically, this means that player $i$ has no incentive to deviate from the control $\hat{u}_i$, as long as player $j$ ($j \ne i$) does not deviate from $\hat{u}_j$; $i = 1, 2$. Therefore a Nash equilibrium is in some cases a likely outcome of a game. Suppose there exists a Nash equilibrium $(\hat{u}_1, \hat{u}_2)$. We now present a method to find it, based on the maximum principle for stochastic control. Define the Hamiltonians
\[
H_i(t, x, y, z, k, u_1, u_2, \lambda, p, q, r) : [0, T] \times \mathbb{R}^3 \times \mathcal{R} \times \mathbb{U} \times \mathbb{R}^3 \times \mathcal{R} \to \mathbb{R}
\]
of this game by
\[
\begin{aligned}
H_i(t, x, y, z, k, u_1, u_2, \lambda, p, q, r) := {} & f_i(t, x, u_1, u_2) + \lambda g_i(t, x, y, z, k, u_1, u_2) + p\,b(t, x, u_1, u_2) \\
& + q\,\sigma(t, x, u_1, u_2) + \int_{\mathbb{R}} r(\zeta)\gamma(t, x, u_1, u_2, \zeta)\,\nu(d\zeta); \quad i = 1, 2,
\end{aligned} \tag{6.2.8}
\]
where $\mathcal{R}$ is the set of functions from $\mathbb{R}_0$ into $\mathbb{R}$ such that the integral in (6.2.8) converges. We assume that $H_i$ is Fréchet differentiable ($C^1$) in the variables $x, y, z, k, u_i$; $i = 1, 2$, and that $\nabla_k H_i(t, \zeta)$ as a random measure is absolutely continuous with respect to $\nu$; $i = 1, 2$. We also assume that $H_i$ and its derivatives with respect to $u_1$ and $u_2$ are integrable with respect to $P$; $i = 1, 2$. In the following, we use the shorthand notation
\[
\frac{\partial H_i}{\partial y}(t) = \frac{\partial H_i}{\partial y}(t, X(t), Y_i(t), Z_i(t), K_i(t, \cdot), u_1(t), u_2(t), \lambda_i(t), p_i(t), q_i(t), r_i(t, \cdot))
\]
and similarly for the other partial derivatives of $H_i$. To these Hamiltonians we associate a system of FBSDEs in the adjoint processes $\lambda_i(t)$, $p_i(t)$, $q_i(t)$ and $r_i(t, \zeta)$ as follows:

(i) Forward SDE in $\lambda_i(t)$:
\[
\begin{cases}
\begin{aligned}
d\lambda_i(t) &= \frac{\partial H_i}{\partial y}(t)\,dt + \frac{\partial H_i}{\partial z}(t)\,dB(t) + \int_{\mathbb{R}}\frac{d}{d\nu}\nabla_k H_i(t, \zeta)\,\tilde{N}(dt, d\zeta) \\
&= \lambda_i(t)\Big[\frac{\partial g_i}{\partial y}(t)\,dt + \frac{\partial g_i}{\partial z}(t)\,dB(t) + \int_{\mathbb{R}}\frac{d}{d\nu}\nabla_k g_i(t, \zeta)\,\tilde{N}(dt, d\zeta)\Big]; \quad 0 \le t \le T
\end{aligned} \\
\lambda_i(0) = \psi_i'(Y_i(0)) = \dfrac{d\psi_i}{dy}(Y_i(0)) \ge 0,
\end{cases} \tag{6.2.9}
\]
where $\frac{d}{d\nu}\nabla_k g_i(t, \zeta)$ is the Radon–Nikodym derivative of $\nabla_k g_i(t, \zeta)$ with respect to $\nu(\zeta)$.

(ii) Backward SDE in $p_i(t), q_i(t), r_i(t, \zeta)$:
\[
\begin{cases}
dp_i(t) = -\dfrac{\partial H_i}{\partial x}(t)\,dt + q_i(t)\,dB(t) + \int_{\mathbb{R}} r_i(t, \zeta)\,\tilde{N}(dt, d\zeta); \quad 0 \le t \le T \\
p_i(T) = \varphi_i'(X(T)) + h_i'(X(T))\lambda_i(T).
\end{cases} \tag{6.2.10}
\]
In the following we assume that
\[
\frac{d}{d\nu}\nabla_k g_i(t, \zeta) > -1 \quad \text{a.s. for all } t, \zeta \text{ and } i = 1, 2. \tag{6.2.11}
\]
This ensures that
\[
\lambda_i(t) \ge 0 \quad \text{for all } t;\ i = 1, 2. \tag{6.2.12}
\]

Theorem 6.4 (Sufficient Maximum Principle for FBSDE Games) Let $(\hat{u}_1, \hat{u}_2) \in \mathcal{A}_1 \times \mathcal{A}_2$ with corresponding solutions $\hat{X}(t)$, $\hat{Y}_i(t)$, $\hat{Z}_i(t)$, $\hat{K}_i(t, \zeta)$, $\hat{\lambda}_i(t)$, $\hat{p}_i(t)$, $\hat{q}_i(t)$, $\hat{r}_i(t, \zeta)$ of Eqs. (6.2.1), (6.2.3), (6.2.9) and (6.2.10) for $i = 1, 2$. Suppose that the following hold:

• (Concavity I) The functions $x \to h_i(x)$, $x \to \varphi_i(x)$, $x \to \psi_i(x)$ are concave, $i = 1, 2$.
• (The conditional maximum principle)
\[
\begin{aligned}
& \operatorname*{ess\,sup}_{v \in A_1}\big\{E[H_1(t, \hat{X}(t), \hat{Y}_1(t), \hat{Z}_1(t), \hat{K}_1(t, \cdot), v, \hat{u}_2(t), \hat{\lambda}_1(t), \hat{p}_1(t), \hat{q}_1(t), \hat{r}_1(t, \cdot)) \mid \mathcal{E}_t^{(1)}]\big\} \\
& \quad = E[H_1(t, \hat{X}(t), \hat{Y}_1(t), \hat{Z}_1(t), \hat{K}_1(t, \cdot), \hat{u}_1(t), \hat{u}_2(t), \hat{\lambda}_1(t), \hat{p}_1(t), \hat{q}_1(t), \hat{r}_1(t, \cdot)) \mid \mathcal{E}_t^{(1)}]
\end{aligned} \tag{6.2.13–6.2.14}
\]
and similarly
\[
\begin{aligned}
& \operatorname*{ess\,sup}_{v \in A_2}\big\{E[H_2(t, \hat{X}(t), \hat{Y}_2(t), \hat{Z}_2(t), \hat{K}_2(t, \cdot), \hat{u}_1(t), v, \hat{\lambda}_2(t), \hat{p}_2(t), \hat{q}_2(t), \hat{r}_2(t, \cdot)) \mid \mathcal{E}_t^{(2)}]\big\} \\
& \quad = E[H_2(t, \hat{X}(t), \hat{Y}_2(t), \hat{Z}_2(t), \hat{K}_2(t, \cdot), \hat{u}_1(t), \hat{u}_2(t), \hat{\lambda}_2(t), \hat{p}_2(t), \hat{q}_2(t), \hat{r}_2(t, \cdot)) \mid \mathcal{E}_t^{(2)}].
\end{aligned} \tag{6.2.15}
\]
• (Concavity II) (The Arrow conditions). The functions
\[
\hat{h}_1(x, y, z, k) := \operatorname*{ess\,sup}_{v_1 \in A_1} E[H_1(t, x, y, z, k, v_1, \hat{u}_2(t), \hat{\lambda}_1(t), \hat{p}_1(t), \hat{q}_1(t), \hat{r}_1(t, \cdot)) \mid \mathcal{E}_t^{(1)}]
\]
and
\[
\hat{h}_2(x, y, z, k) := \operatorname*{ess\,sup}_{v_2 \in A_2} E[H_2(t, x, y, z, k, \hat{u}_1(t), v_2, \hat{\lambda}_2(t), \hat{p}_2(t), \hat{q}_2(t), \hat{r}_2(t, \cdot)) \mid \mathcal{E}_t^{(2)}]
\]
are concave for all $t$, a.s.

Then $\hat{u}(t) = (\hat{u}_1(t), \hat{u}_2(t))$ is a Nash equilibrium for (6.2.1)–(6.2.4).

Above and in the proof in Sect. 6.2.3, we have used the following shorthand notation: If $i = 1$, then $X(t) = X^{(u_1,\hat{u}_2)}(t)$ and $Y_1(t) = Y_1^{(u_1,\hat{u}_2)}(t)$ are the processes corresponding to the control $u(t) = (u_1(t), \hat{u}_2(t))$, while $\hat{X}(t) = X^{\hat{u}}(t)$ and $\hat{Y}_1(t) = Y_1^{\hat{u}}(t)$ are those corresponding to the control $\hat{u}(t) = (\hat{u}_1(t), \hat{u}_2(t))$. An analogous notation is used for $i = 2$. Moreover, we put
\[
\frac{\partial\hat{H}_i}{\partial x}(t) = \frac{\partial H_i}{\partial x}(t, \hat{X}(t), \hat{Y}_i(t), \hat{Z}_i(t), \hat{K}_i(t, \cdot), \hat{u}(t), \hat{\lambda}_i(t), \hat{p}_i(t), \hat{q}_i(t), \hat{r}_i(t, \cdot))
\]
and similarly with $\dfrac{\partial\hat{H}_i}{\partial z}(t)$ and $\nabla_k\hat{H}_i(t, \zeta)$, $i = 1, 2$.

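Proof See Sect. 6.2.3.

It is also of interest to prove a version of the maximum principle which does not require the concavity conditions. One such version is the following necessary maximum principle (Theorem 6.5), which makes the following assumptions:

• For all $t_0 \in [0, T]$ and all bounded $\mathcal{E}_{t_0}^{(i)}$-measurable random variables $\alpha_i(\omega)$, the control
\[
\beta_i(t) := \chi_{(t_0,T)}(t)\alpha_i(\omega) \quad \text{belongs to } \mathcal{A}_i;\ i = 1, 2. \tag{6.2.16}
\]
• For all $u_i, \beta_i \in \mathcal{A}_i$ with $\beta_i$ bounded there exists $\delta_i > 0$ such that the control
\[
\tilde{u}_i(t) := u_i(t) + s\beta_i(t); \quad t \in [0, T] \quad \text{belongs to } \mathcal{A}_i \text{ for all } s \in (-\delta_i, \delta_i);\ i = 1, 2. \tag{6.2.17}
\]
• The following derivative processes exist and belong to $L^2([0, T] \times \Omega)$:
\[
\begin{aligned}
x_1(t) &= \frac{d}{ds}X^{(u_1+s\beta_1,u_2)}(t)\Big|_{s=0}; & y_1(t) &= \frac{d}{ds}Y_1^{(u_1+s\beta_1,u_2)}(t)\Big|_{s=0}; \\
z_1(t) &= \frac{d}{ds}Z_1^{(u_1+s\beta_1,u_2)}(t)\Big|_{s=0}; & k_1(t, \zeta) &= \frac{d}{ds}K_1^{(u_1+s\beta_1,u_2)}(t, \zeta)\Big|_{s=0}
\end{aligned} \tag{6.2.18}
\]
and, similarly, $x_2(t) = \frac{d}{ds}X^{(u_1,u_2+s\beta_2)}(t)\big|_{s=0}$ etc.

Note that since $X^{u}(0) = x$ for all $u$ we have $x_i(0) = 0$ for $i = 1, 2$. In the following we write $\frac{\partial b}{\partial x}(t)$ for $\frac{\partial b}{\partial x}(t, X(t), u(t))$ etc. By (6.2.1) and (6.2.3) we have, using the estimates in [AØ12],
\[
\begin{aligned}
dx_1(t) = {} & \Big[\frac{\partial b}{\partial x}(t)x_1(t) + \frac{\partial b}{\partial u_1}(t)\beta_1(t)\Big]dt + \Big[\frac{\partial\sigma}{\partial x}(t)x_1(t) + \frac{\partial\sigma}{\partial u_1}(t)\beta_1(t)\Big]dB(t) \\
& + \int_{\mathbb{R}}\Big[\frac{\partial\gamma}{\partial x}(t, \zeta)x_1(t) + \frac{\partial\gamma}{\partial u_1}(t, \zeta)\beta_1(t)\Big]\tilde{N}(dt, d\zeta),
\end{aligned} \tag{6.2.19}
\]
\[
\begin{cases}
\begin{aligned}
dy_1(t) = {} & -\Big[\frac{\partial g_1}{\partial x}(t)x_1(t) + \frac{\partial g_1}{\partial y}(t)y_1(t) + \frac{\partial g_1}{\partial z}(t)z_1(t) + \int_{\mathbb{R}}\nabla_k g_1(t, \zeta)k_1(t, \zeta)\,\nu(d\zeta) + \frac{\partial g_1}{\partial u_1}(t)\beta_1(t)\Big]dt \\
& + z_1(t)\,dB(t) + \int_{\mathbb{R}} k_1(t, \zeta)\,\tilde{N}(dt, d\zeta); \quad 0 \le t \le T,
\end{aligned} \\
y_1(T) = h_1'(X^{(u_1,u_2)}(T))x_1(T),
\end{cases} \tag{6.2.20}
\]
and similarly for $dx_2(t)$, $dy_2(t)$. We are now ready to state a necessary maximum principle, which is an extension of Theorem 3.1 in [AØ12] and Theorem 2.2 in [ØS3]. See also Theorem 3.2 in [ØS7]. In the sequel, $\frac{\partial H}{\partial v}$ means $\nabla_v H$.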

Theorem 6.5 (Necessary Maximum Principle) Suppose $u \in \mathcal{A}$ with corresponding solutions $X(t)$, $Y_i(t)$, $Z_i(t)$, $K_i(t, \zeta)$, $\lambda_i(t)$, $p_i(t)$, $q_i(t)$, $r_i(t, \zeta)$ of Eqs. (6.2.1), (6.2.3), (6.2.9) and (6.2.10). Suppose (6.2.16)–(6.2.18) hold. Then the following are equivalent:

(i)
\[
\frac{d}{ds}J_1(u_1 + s\beta_1, u_2)\Big|_{s=0} = \frac{d}{ds}J_2(u_1, u_2 + s\beta_2)\Big|_{s=0} = 0
\]
for all bounded $\beta_1 \in \mathcal{A}_1$, $\beta_2 \in \mathcal{A}_2$.

(ii)
\[
\begin{aligned}
& E\Big[\frac{\partial}{\partial v_1}H_1(t, X(t), Y_1(t), Z_1(t), K_1(t, \cdot), v_1, u_2(t), \lambda_1(t), p_1(t), q_1(t), r_1(t, \cdot)) \,\Big|\, \mathcal{E}_t^{(1)}\Big]_{v_1 = u_1(t)} \\
& \quad = E\Big[\frac{\partial}{\partial v_2}H_2(t, X(t), Y_2(t), Z_2(t), K_2(t, \cdot), u_1(t), v_2, \lambda_2(t), p_2(t), q_2(t), r_2(t, \cdot)) \,\Big|\, \mathcal{E}_t^{(2)}\Big]_{v_2 = u_2(t)} = 0.
\end{aligned}
\]
Proof See Sect. 6.2.3.

6.2.2 The Zero-Sum Game Case

In the zero-sum case we have
\[
J_1(u_1, u_2) + J_2(u_1, u_2) = 0. \tag{6.2.21}
\]
Then the Nash equilibrium $(\hat{u}_1, \hat{u}_2) \in \mathcal{A}_1 \times \mathcal{A}_2$ satisfying (6.2.6)–(6.2.7) becomes a saddle point for $J(u_1, u_2) := J_1(u_1, u_2)$. To see this, note that (6.2.6)–(6.2.7) imply that
\[
J_1(u_1, \hat{u}_2) \le J_1(\hat{u}_1, \hat{u}_2) = -J_2(\hat{u}_1, \hat{u}_2) \le -J_2(\hat{u}_1, u_2)
\]
and hence
\[
J(u_1, \hat{u}_2) \le J(\hat{u}_1, \hat{u}_2) \le J(\hat{u}_1, u_2) \quad \text{for all } u_1, u_2.
\]
From this we deduce that
\[
\inf_{u_2 \in \mathcal{A}_2}\sup_{u_1 \in \mathcal{A}_1} J(u_1, u_2) \le \sup_{u_1 \in \mathcal{A}_1} J(u_1, \hat{u}_2) \le J(\hat{u}_1, \hat{u}_2) \le \inf_{u_2 \in \mathcal{A}_2} J(\hat{u}_1, u_2) \le \sup_{u_1 \in \mathcal{A}_1}\inf_{u_2 \in \mathcal{A}_2} J(u_1, u_2). \tag{6.2.22}
\]
Since we always have $\inf\sup \ge \sup\inf$, we conclude that
\[
\inf_{u_2 \in \mathcal{A}_2}\sup_{u_1 \in \mathcal{A}_1} J(u_1, u_2) = \sup_{u_1 \in \mathcal{A}_1} J(u_1, \hat{u}_2) = J(\hat{u}_1, \hat{u}_2) = \inf_{u_2 \in \mathcal{A}_2} J(\hat{u}_1, u_2) = \sup_{u_1 \in \mathcal{A}_1}\inf_{u_2 \in \mathcal{A}_2} J(u_1, u_2), \tag{6.2.23}
\]

i.e. $(\hat{u}_1, \hat{u}_2) \in \mathcal{A}_1 \times \mathcal{A}_2$ is a saddle point for $J(u_1, u_2)$. In this case, only one Hamiltonian is needed, and only one set of adjoint equations. Indeed, let $g_1 = g_2 =: g$, $h_1 = h_2 =: h$, $f_1 = -f_2 =: f$, $\varphi_1 = -\varphi_2 =: \varphi$, and $\psi_1 = -\psi_2 =: \psi$. Then
\[
\begin{aligned}
H_1(t, x, y, z, k, u_1, u_2, \lambda, p, q, r) = {} & f(t, x, u_1, u_2) + \lambda g(t, x, y, z, k, u_1, u_2) + p\,b(t, x, u_1, u_2) \\
& + q\,\sigma(t, x, u_1, u_2) + \int_{\mathbb{R}} r(\zeta)\gamma(t, x, u_1, u_2, \zeta)\,\nu(d\zeta)
\end{aligned} \tag{6.2.24}
\]
and
\[
\begin{aligned}
H_2(t, x, y, z, k, u_1, u_2, \lambda, p, q, r) = {} & -f(t, x, u_1, u_2) + \lambda g(t, x, y, z, k, u_1, u_2) + p\,b(t, x, u_1, u_2) \\
& + q\,\sigma(t, x, u_1, u_2) + \int_{\mathbb{R}} r(\zeta)\gamma(t, x, u_1, u_2, \zeta)\,\nu(d\zeta).
\end{aligned} \tag{6.2.25}
\]
For $u = (u_1, u_2) \in \mathcal{A}_1 \times \mathcal{A}_2$, let $X^{u}(t)$ be defined by (6.2.1), and $(Y^{u}(t), Z^{u}(t), K^{u}(t, \zeta))$ be defined by (6.2.3) with $g_i = g$, and $h_i = h$. The adjoint processes $\lambda_i$, $i = 1, 2$, satisfy
\[
\begin{cases}
\begin{aligned}
d\lambda_i(t) = \lambda_i(t)\Big[ & \frac{\partial g}{\partial y}(t, X(t), Y(t), Z(t), K(t, \cdot), u_1(t), u_2(t))\,dt \\
& + \frac{\partial g}{\partial z}(t, X(t), Y(t), Z(t), K(t, \cdot), u_1(t), u_2(t))\,dB(t) \\
& + \int_{\mathbb{R}}\frac{d\nabla_k g(t, \zeta)}{d\nu(\zeta)}\,\tilde{N}(dt, d\zeta)\Big]; \quad 0 \le t \le T
\end{aligned} \\
\lambda_i(0) = \psi_i'(Y(0)) \ge 0.
\end{cases} \tag{6.2.26}
\]
Recall that by (6.2.11) we have $\lambda_i(t) \ge 0$ for all $t, \zeta$; $i = 1, 2$. It follows that
\[
\lambda_2(t) = -\lambda_1(t); \quad t \in [0, T]. \tag{6.2.27}
\]

The adjoint processes for $p_i$, $q_i$ and $r_i$; $i = 1, 2$ become by (6.2.10)
\[
\begin{cases}
\begin{aligned}
dp_1(t) = {} & -\Big[\frac{\partial f}{\partial x}(t) + \lambda_1(t)\frac{\partial g}{\partial x}(t) + p_1(t)\frac{\partial b}{\partial x}(t) + q_1(t)\frac{\partial\sigma}{\partial x}(t) + \int_{\mathbb{R}} r_1(t, \zeta)\frac{\partial\gamma}{\partial x}(t, \zeta)\,\nu(d\zeta)\Big]dt \\
& + q_1(t)\,dB(t) + \int_{\mathbb{R}} r_1(t, \zeta)\,\tilde{N}(dt, d\zeta); \quad 0 \le t \le T
\end{aligned} \\
p_1(T) = \varphi_1'(X(T)) + h_1'(X(T))\lambda_1(T)
\end{cases} \tag{6.2.28}
\]
and, by (6.2.10) and (6.2.27),
\[
\begin{cases}
\begin{aligned}
dp_2(t) = {} & -\Big[-\frac{\partial f}{\partial x}(t) - \lambda_1(t)\frac{\partial g}{\partial x}(t) + p_2(t)\frac{\partial b}{\partial x}(t) + q_2(t)\frac{\partial\sigma}{\partial x}(t) + \int_{\mathbb{R}} r_2(t, \zeta)\frac{\partial\gamma}{\partial x}(t, \zeta)\,\nu(d\zeta)\Big]dt \\
& + q_2(t)\,dB(t) + \int_{\mathbb{R}} r_2(t, \zeta)\,\tilde{N}(dt, d\zeta); \quad 0 \le t \le T
\end{aligned} \\
p_2(T) = -\varphi_1'(X(T)) - h_1'(X(T))\lambda_1(T).
\end{cases}
\]
Thus, we see that
\[
(p_2(t), q_2(t), r_2(t)) = -(p_1(t), q_1(t), r_1(t)); \quad t \in [0, T]. \tag{6.2.29}
\]
Consequently,
\[
\begin{aligned}
& -H_2(t, X(t), Y(t), Z(t), K(t, \cdot), u_1(t), u_2(t), \lambda_2(t), p_2(t), q_2(t), r_2(t, \cdot)) \\
& \quad = H_1(t, X(t), Y(t), Z(t), K(t, \cdot), u_1(t), u_2(t), \lambda_1(t), p_1(t), q_1(t), r_1(t, \cdot)).
\end{aligned}
\]
We thus conclude that in the zero-sum game case, we only need one Hamiltonian and one quadruple of controlled adjoint processes. In the following, we set:
\[
J(u_1, u_2) := E\Big[\int_0^{T} f(t, X^{u}(t), u(t))\,dt + \varphi(X^{u}(T)) + \psi(Y^{u}(0))\Big], \tag{6.2.30}
\]
$H := H_1$ as defined in (6.2.24), and $(\lambda(t), p(t), q(t), r(t, \cdot)) := (\lambda_1(t), p_1(t), q_1(t), r_1(t, \cdot))$ as defined in (6.2.26), (6.2.28). We can now state the necessary and the sufficient maximum principles for the zero-sum game:

Theorem 6.6 (Necessary Maximum Principle for Zero-Sum Games) Assume the conditions of Theorem 6.5 hold. Then the following are equivalent:

(i)
\[
\frac{d}{ds}J(u_1 + s\beta_1, u_2)\Big|_{s=0} = \frac{d}{ds}J(u_1, u_2 + s\beta_2)\Big|_{s=0} = 0 \tag{6.2.31}
\]

for all bounded $\beta_1 \in \mathcal{A}_1$, $\beta_2 \in \mathcal{A}_2$.

(ii)
\[
\begin{aligned}
& E\Big[\frac{\partial}{\partial v_1}H(t, X(t), Y(t), Z(t), K(t, \cdot), v_1, u_2(t), \lambda(t), p(t), q(t), r(t, \cdot)) \,\Big|\, \mathcal{E}_t^{(1)}\Big]_{v_1 = u_1(t)} \\
& \quad = E\Big[\frac{\partial}{\partial v_2}H(t, X(t), Y(t), Z(t), K(t, \cdot), u_1(t), v_2, \lambda(t), p(t), q(t), r(t, \cdot)) \,\Big|\, \mathcal{E}_t^{(2)}\Big]_{v_2 = u_2(t)} = 0.
\end{aligned} \tag{6.2.32}
\]
Proof The proof is similar to that of Theorem 6.5 and is omitted.

Corollary 6.7 Let $u = (u_1, u_2) \in \mathcal{A}_1 \times \mathcal{A}_2$ be a Nash equilibrium (saddle point) for the zero-sum game in Theorem 6.6. Then (6.2.32) holds.

Proof This follows from Theorem 6.6 by noting that if $u = (u_1, u_2)$ is a Nash equilibrium, then (6.2.31) holds by (6.2.23).

Similarly we get

Theorem 6.8 (Sufficient Maximum Principle for Zero-Sum Forward-Backward Games) Let $(\hat{u}_1, \hat{u}_2) \in \mathcal{A}_1 \times \mathcal{A}_2$, with corresponding solutions $\hat{X}(t)$, $\hat{Y}(t)$, $\hat{Z}(t)$, $\hat{K}(t)$, $\hat{\lambda}(t)$, $\hat{p}(t)$, $\hat{q}(t)$, $\hat{r}(t, \zeta)$. Suppose the following hold:

• The functions $x \to \varphi(x)$ and $x \to \psi(x)$ are affine, and the function $x \to h(x)$ is concave.
• (The conditional maximum principle)
\[
\begin{aligned}
& \operatorname*{ess\,sup}_{v_1 \in A_1} E[H(t, \hat{X}(t), \hat{Y}(t), \hat{Z}(t), \hat{K}(t, \cdot), v_1, \hat{u}_2(t), \hat{\lambda}(t), \hat{p}(t), \hat{q}(t), \hat{r}(t, \cdot)) \mid \mathcal{E}_t^{(1)}] \\
& \quad = E[H(t, \hat{X}(t), \hat{Y}(t), \hat{Z}(t), \hat{K}(t, \cdot), \hat{u}_1(t), \hat{u}_2(t), \hat{\lambda}(t), \hat{p}(t), \hat{q}(t), \hat{r}(t, \cdot)) \mid \mathcal{E}_t^{(1)}]
\end{aligned} \tag{6.2.33}
\]
and
\[
\begin{aligned}
& \operatorname*{ess\,inf}_{v_2 \in A_2} E[H(t, \hat{X}(t), \hat{Y}(t), \hat{Z}(t), \hat{K}(t, \cdot), \hat{u}_1(t), v_2, \hat{\lambda}(t), \hat{p}(t), \hat{q}(t), \hat{r}(t, \cdot)) \mid \mathcal{E}_t^{(2)}] \\
& \quad = E[H(t, \hat{X}(t), \hat{Y}(t), \hat{Z}(t), \hat{K}(t, \cdot), \hat{u}_1(t), \hat{u}_2(t), \hat{\lambda}(t), \hat{p}(t), \hat{q}(t), \hat{r}(t, \cdot)) \mid \mathcal{E}_t^{(2)}].
\end{aligned} \tag{6.2.34}
\]
• (The Arrow conditions) The function
\[
\hat{h}(x, y, z, k) := \operatorname*{ess\,sup}_{v_1 \in A_1} E[H(t, x, y, z, k, v_1, \hat{u}_2(t), \hat{\lambda}(t), \hat{p}(t), \hat{q}(t), \hat{r}(t, \cdot)) \mid \mathcal{E}_t^{(1)}]
\]
is concave, and the function
\[
\check{h}(x, y, z, k) := \operatorname*{ess\,inf}_{v_2 \in A_2} E[H(t, x, y, z, k, \hat{u}_1(t), v_2, \hat{\lambda}(t), \hat{p}(t), \hat{q}(t), \hat{r}(t, \cdot)) \mid \mathcal{E}_t^{(2)}]
\]
is convex, for all $t \in [0, T]$, a.s.

Then $\hat{u}(t) = (\hat{u}_1(t), \hat{u}_2(t))$ is a saddle point for $J(u_1, u_2)$.

Proof The proof is similar to that of Theorem 6.4 and is omitted.

6.2.3 Proofs of the Stochastic Maximum Principles

Proof of Theorem 6.4 (Sufficient maximum principle) By introducing a sequence of stopping times $\tau_n$ as in the proof of Theorem 5.4, we see that we may assume that all the local martingales appearing in the following computations are martingales and therefore have expectation 0. We first prove that $J_1(u_1, \hat u_2) \le J_1(\hat u_1, \hat u_2)$ for all $u_1 \in \mathcal{A}_1$. To this end, fix $u_1 \in \mathcal{A}_1$ and consider
$$
\Delta := J_1(u_1, \hat u_2) - J_1(\hat u_1, \hat u_2) = I_1 + I_2 + I_3, \tag{6.2.35}
$$
where
$$
I_1 = E\Big[\int_0^T \{ f_1(t, X(t), u(t)) - f_1(t, \hat X(t), \hat u(t)) \}\,dt\Big], \tag{6.2.36}
$$
$$
I_2 = E[\varphi_1(X(T)) - \varphi_1(\hat X(T))], \tag{6.2.37}
$$
$$
I_3 = E[\psi_1(Y_1(0)) - \psi_1(\hat Y_1(0))]. \tag{6.2.38}
$$
By (6.2.8) we have
$$
\begin{aligned}
I_1 = E\Big[\int_0^T \Big\{ & H_1(t) - \hat H_1(t) - \hat\lambda_1(t)(g_1(t) - \hat g_1(t)) - \hat p_1(t)(b(t) - \hat b(t)) \\
& - \hat q_1(t)(\sigma(t) - \hat\sigma(t)) - \int_{\mathbb{R}} \hat r_1(t,\zeta)(\gamma(t,\zeta) - \hat\gamma(t,\zeta))\,\nu(d\zeta) \Big\}\,dt\Big].
\end{aligned}
\tag{6.2.39}
$$

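By concavity of $\varphi_1$, (6.2.10) and the Itô formula,
$$
\begin{aligned}
I_2 &\le E[\varphi_1'(\hat X(T))(X(T) - \hat X(T))] \\
&= E[\hat p_1(T)(X(T) - \hat X(T))] - E[\hat\lambda_1(T) h_1'(\hat X(T))(X(T) - \hat X(T))] \\
&= E\Big[\int_0^T \hat p_1(t^-)(dX(t) - d\hat X(t)) + \int_0^T (X(t^-) - \hat X(t^-))\,d\hat p_1(t) \\
&\qquad + \int_0^T \hat q_1(t)(\sigma(t) - \hat\sigma(t))\,dt + \int_0^T\!\!\int_{\mathbb{R}} \hat r_1(t,\zeta)(\gamma(t,\zeta) - \hat\gamma(t,\zeta))\,\nu(d\zeta)\,dt\Big] \\
&\qquad - E[\hat\lambda_1(T) h_1'(\hat X(T))(X(T) - \hat X(T))] \\
&= E\Big[\int_0^T \hat p_1(t)(b(t) - \hat b(t))\,dt + \int_0^T (X(t) - \hat X(t))\Big(-\frac{\partial \hat H_1}{\partial x}(t)\Big)\,dt \\
&\qquad + \int_0^T \hat q_1(t)(\sigma(t) - \hat\sigma(t))\,dt + \int_0^T\!\!\int_{\mathbb{R}} \hat r_1(t,\zeta)(\gamma(t,\zeta) - \hat\gamma(t,\zeta))\,\nu(d\zeta)\,dt\Big] \\
&\qquad - E[\hat\lambda_1(T) h_1'(\hat X(T))(X(T) - \hat X(T))].
\end{aligned}
\tag{6.2.40}
$$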

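By concavity of $\psi_1$, (6.2.5), (6.2.9), and concavity of $h_1$:
$$
\begin{aligned}
I_3 &= E[\psi_1(Y_1(0)) - \psi_1(\hat Y_1(0))] \le E[\psi_1'(\hat Y_1(0))(Y_1(0) - \hat Y_1(0))] = E[\hat\lambda_1(0)(Y_1(0) - \hat Y_1(0))] \\
&= E[(Y_1(T) - \hat Y_1(T))\hat\lambda_1(T)] - E\Big[\int_0^T (Y_1(t^-) - \hat Y_1(t^-))\,d\hat\lambda_1(t) + \int_0^T \hat\lambda_1(t^-)(dY_1(t) - d\hat Y_1(t)) \\
&\qquad + \int_0^T \frac{\partial \hat H_1}{\partial z}(t)(Z_1(t) - \hat Z_1(t))\,dt + \int_0^T\!\!\int_{\mathbb{R}} \nabla_k \hat H_1(t,\zeta)(K_1(t,\zeta) - \hat K_1(t,\zeta))\,\nu(d\zeta)\,dt\Big] \\
&= E[(h_1(X(T)) - h_1(\hat X(T)))\hat\lambda_1(T)] - E\Big[\int_0^T \frac{\partial \hat H_1}{\partial y}(t)(Y_1(t) - \hat Y_1(t))\,dt + \int_0^T \hat\lambda_1(t)(-g_1(t) + \hat g_1(t))\,dt \\
&\qquad + \int_0^T \frac{\partial \hat H_1}{\partial z}(t)(Z_1(t) - \hat Z_1(t))\,dt + \int_0^T\!\!\int_{\mathbb{R}} \nabla_k \hat H_1(t,\zeta)(K_1(t,\zeta) - \hat K_1(t,\zeta))\,\nu(d\zeta)\,dt\Big] \\
&\le E[\hat\lambda_1(T) h_1'(\hat X(T))(X(T) - \hat X(T))] - E\Big[\int_0^T \frac{\partial \hat H_1}{\partial y}(t)(Y_1(t) - \hat Y_1(t))\,dt + \int_0^T \hat\lambda_1(t)(-g_1(t) + \hat g_1(t))\,dt \\
&\qquad + \int_0^T \frac{\partial \hat H_1}{\partial z}(t)(Z_1(t) - \hat Z_1(t))\,dt + \int_0^T\!\!\int_{\mathbb{R}} \nabla_k \hat H_1(t,\zeta)(K_1(t,\zeta) - \hat K_1(t,\zeta))\,\nu(d\zeta)\,dt\Big],
\end{aligned}
\tag{6.2.41}
$$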

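where we have used $\hat\lambda_1(T) \ge 0$ (see (6.2.11)). Adding (6.2.39)–(6.2.41) we get
$$
\begin{aligned}
\Delta = I_1 + I_2 + I_3 \le E\Big[\int_0^T \Big\{ & H_1(t) - \hat H_1(t) - \frac{\partial \hat H_1}{\partial x}(t)(X(t) - \hat X(t)) - \frac{\partial \hat H_1}{\partial y}(t)(Y_1(t) - \hat Y_1(t)) \\
& - \frac{\partial \hat H_1}{\partial z}(t)(Z_1(t) - \hat Z_1(t)) - \int_{\mathbb{R}} \nabla_k \hat H_1(t,\zeta)(K_1(t,\zeta) - \hat K_1(t,\zeta))\,\nu(d\zeta) \Big\}\,dt\Big].
\end{aligned}
\tag{6.2.42}
$$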

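Since $\hat h_1(x, y, z, k)$ is concave, it follows by a standard separating hyperplane argument (see e.g. [SeSy], Chap. 5, Sect. 23) that there exists a supergradient $a = (a_0, a_1, a_2, a_3(\cdot)) \in \mathbb{R}^3 \times \mathcal{R}$ for $\hat h_1(x, y, z, k)$ at $x = \hat X(t)$, $y = \hat Y_1(t)$, $z = \hat Z_1(t^-)$ and $k = \hat K_1(t^-, \cdot)$ such that if we define
$$
\begin{aligned}
\varphi_1(x, y, z, k) := {} & \hat h_1(x, y, z, k) - \hat h_1(\hat X(t^-), \hat Y_1(t^-), \hat Z_1(t^-), \hat K_1(t, \cdot)) \\
& - \Big[ a_0(x - \hat X(t)) + a_1(y - \hat Y_1(t)) + a_2(z - \hat Z_1(t)) + \int_{\mathbb{R}} a_3(\zeta)(k(\zeta) - \hat K(t,\zeta))\,\nu(d\zeta) \Big]
\end{aligned}
$$
then $\varphi_1(x, y, z, k) \le 0$ for all $x, y, z, k$. On the other hand we clearly have $\varphi_1(\hat X(t), \hat Y(t), \hat Z(t), \hat K_1(t, \cdot)) = 0$. It follows that
$$
\begin{aligned}
\frac{\partial \hat H_1}{\partial x}(t) &= \frac{\partial \hat h_1}{\partial x}(\hat X(t), \hat Y_1(t), \hat Z_1(t), \hat K_1(t,\cdot)) = a_0, \\
\frac{\partial \hat H_1}{\partial y}(t) &= \frac{\partial \hat h_1}{\partial y}(\hat X(t), \hat Y_1(t), \hat Z_1(t), \hat K_1(t,\cdot)) = a_1, \\
\frac{\partial \hat H_1}{\partial z}(t) &= \frac{\partial \hat h_1}{\partial z}(\hat X(t), \hat Y_1(t), \hat Z_1(t), \hat K_1(t,\cdot)) = a_2, \\
\nabla_k \hat H_1(t, \zeta) &= \nabla_k \hat h_1(\hat X(t), \hat Y_1(t), \hat Z_1(t), \hat K_1(t,\cdot)) = a_3.
\end{aligned}
$$
Combining this with (6.2.42) we get
$$
\begin{aligned}
\Delta \le {} & \hat h_1(X(t), Y_1(t), Z_1(t), K_1(t,\cdot)) - \hat h_1(\hat X(t), \hat Y_1(t), \hat Z_1(t), \hat K_1(t,\cdot)) \\
& - \frac{\partial \hat h_1}{\partial x}(\hat X(t), \hat Y_1(t), \hat Z_1(t), \hat K_1(t,\cdot))(X(t) - \hat X(t)) \\
& - \frac{\partial \hat h_1}{\partial y}(\hat X(t), \hat Y_1(t), \hat Z_1(t), \hat K_1(t,\cdot))(Y_1(t) - \hat Y_1(t)) \\
& - \frac{\partial \hat h_1}{\partial z}(\hat X(t), \hat Y_1(t), \hat Z_1(t), \hat K_1(t,\cdot))(Z_1(t) - \hat Z_1(t)) \\
& - \int_{\mathbb{R}} \nabla_k \hat h_1(\hat X(t), \hat Y_1(t), \hat Z_1(t), \hat K_1(t,\cdot))(K_1(t,\zeta) - \hat K_1(t,\zeta))\,\nu(d\zeta) \\
\le {} & 0,
\end{aligned}
$$
since $\hat h_1$ is concave. Hence $J_1(u_1, \hat u_2) \le J_1(\hat u_1, \hat u_2)$ for all $u_1 \in \mathcal{A}_1$. The inequality $J_2(\hat u_1, u_2) \le J_2(\hat u_1, \hat u_2)$ for all $u_2 \in \mathcal{A}_2$ is proved similarly. This completes the proof of Theorem 6.4.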



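Proof of Theorem 6.5 (Necessary maximum principle) By introducing a sequence of stopping times $\tau_n$ as in the proof of Theorem 5.5, we see that we may assume that all the local martingales appearing in the following computations are martingales and therefore have expectation 0. Consider
$$
\begin{aligned}
D_1 := {} & \frac{d}{ds} J_1(u_1 + s\beta_1, u_2)\Big|_{s=0} \\
= {} & E\Big[\int_0^T \Big\{ \frac{\partial f_1}{\partial x}(t) x_1(t) + \frac{\partial f_1}{\partial u_1}(t)\beta_1(t) \Big\}\,dt + \varphi_1'(X^{(u_1,u_2)}(T)) x_1(T) + \psi_1'(Y_1(0)) y_1(0)\Big].
\end{aligned}
\tag{6.2.43}
$$
By (6.2.10) and the Itô formula,
$$
\begin{aligned}
E[\varphi_1'(X^{(u_1,u_2)}(T)) x_1(T)] &= E[p_1(T) x_1(T)] - E[h_1'(X^{(u_1,u_2)}(T)) x_1(T) \lambda_1(T)] \\
&= E\Big[\int_0^T p_1(t^-)\,dx_1(t) + \int_0^T x_1(t^-)\,dp_1(t) + \int_0^T q_1(t)\Big\{ \frac{\partial \sigma}{\partial x}(t) x_1(t) + \frac{\partial \sigma}{\partial u_1}(t)\beta_1(t) \Big\}\,dt \\
&\qquad + \int_0^T\!\!\int_{\mathbb{R}} r_1(t,\zeta)\Big\{ \frac{\partial \gamma}{\partial x}(t,\zeta) x_1(t) + \frac{\partial \gamma}{\partial u_1}(t,\zeta)\beta_1(t) \Big\}\,\nu(d\zeta)\,dt\Big] \\
&\qquad - E[h_1'(X^{(u_1,u_2)}(T)) x_1(T) \lambda_1(T)] \\
&= E\Big[\int_0^T \Big\{ p_1(t)\Big( \frac{\partial b}{\partial x}(t) x_1(t) + \frac{\partial b}{\partial u_1}(t)\beta_1(t) \Big) + x_1(t)\Big( -\frac{\partial H_1}{\partial x}(t) \Big) \\
&\qquad + q_1(t)\Big( \frac{\partial \sigma}{\partial x}(t) x_1(t) + \frac{\partial \sigma}{\partial u_1}(t)\beta_1(t) \Big) \\
&\qquad + \int_{\mathbb{R}} r_1(t,\zeta)\Big( \frac{\partial \gamma}{\partial x}(t,\zeta) x_1(t) + \frac{\partial \gamma}{\partial u_1}(t,\zeta)\beta_1(t) \Big)\,\nu(d\zeta) \Big\}\,dt\Big] \\
&\qquad - E[h_1'(X^{(u_1,u_2)}(T)) x_1(T) \lambda_1(T)].
\end{aligned}
\tag{6.2.44}
$$
By (6.2.9) and the Itô formula,
$$
\begin{aligned}
E[\psi_1'(Y_1(0)) y_1(0)] &= E[\lambda_1(0) y_1(0)] \\
&= E[\lambda_1(T) y_1(T)] - E\Big[\int_0^T \lambda_1(t^-)\,dy_1(t) + \int_0^T y_1(t^-)\,d\lambda_1(t) \\
&\qquad + \int_0^T \frac{\partial H_1}{\partial z}(t) z_1(t)\,dt + \int_0^T\!\!\int_{\mathbb{R}} \nabla_k H_1(t,\zeta) k_1(t,\zeta)\,\nu(d\zeta)\,dt\Big] \\
&= E[\lambda_1(T) h_1'(X^{(u_1,u_2)}(T)) x_1(T)] \\
&\qquad - E\Big[\int_0^T \Big\{ \lambda_1(t)\Big( -\frac{\partial g_1}{\partial x}(t) x_1(t) - \frac{\partial g_1}{\partial y}(t) y_1(t) - \frac{\partial g_1}{\partial z}(t) z_1(t) \\
&\qquad\qquad - \frac{\partial g_1}{\partial u_1}(t)\beta_1(t) - \int_{\mathbb{R}} \nabla_k g_1(t,\zeta) k_1(t,\zeta)\,\nu(d\zeta) \Big) \\
&\qquad\qquad + \frac{\partial H_1}{\partial y}(t) y_1(t) + \frac{\partial H_1}{\partial z}(t) z_1(t) + \int_{\mathbb{R}} \nabla_k H_1(t,\zeta) k_1(t,\zeta)\,\nu(d\zeta) \Big\}\,dt\Big].
\end{aligned}
\tag{6.2.45}
$$
Adding (6.2.44) and (6.2.45) we get, by (6.2.43),
$$
\begin{aligned}
D_1 &= E\Big[\int_0^T \Big\{ \Big( \frac{\partial f_1}{\partial x}(t) + p_1(t)\frac{\partial b}{\partial x}(t) + q_1(t)\frac{\partial \sigma}{\partial x}(t) + \int_{\mathbb{R}} r_1(t,\zeta)\frac{\partial \gamma}{\partial x}(t,\zeta)\,\nu(d\zeta) - \frac{\partial H_1}{\partial x}(t) + \lambda_1(t)\frac{\partial g_1}{\partial x}(t) \Big) x_1(t) \\
&\qquad + \Big( -\frac{\partial H_1}{\partial y}(t) + \lambda_1(t)\frac{\partial g_1}{\partial y}(t) \Big) y_1(t) + \Big( -\frac{\partial H_1}{\partial z}(t) + \lambda_1(t)\frac{\partial g_1}{\partial z}(t) \Big) z_1(t) \\
&\qquad + \int_{\mathbb{R}} [-\nabla_k H_1(t,\zeta) + \lambda_1(t)\nabla_k g_1(t,\zeta)]\, k_1(t,\zeta)\,\nu(d\zeta) \\
&\qquad + \Big( \frac{\partial f_1}{\partial u_1}(t) + p_1(t)\frac{\partial b}{\partial u_1}(t) + q_1(t)\frac{\partial \sigma}{\partial u_1}(t) + \int_{\mathbb{R}} r_1(t,\zeta)\frac{\partial \gamma}{\partial u_1}(t,\zeta)\,\nu(d\zeta) + \lambda_1(t)\frac{\partial g_1}{\partial u_1}(t) \Big)\beta_1(t) \Big\}\,dt\Big] \\
&= E\Big[\int_0^T \frac{\partial H_1}{\partial u_1}(t)\beta_1(t)\,dt\Big] \\
&= E\Big[\int_0^T E\Big[ \frac{\partial H_1}{\partial u_1}(t)\beta_1(t) \,\Big|\, \mathcal{E}_t^{(1)} \Big]\,dt\Big].
\end{aligned}
\tag{6.2.46}
$$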

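If $D_1 = 0$ for all bounded $\beta_1 \in \mathcal{A}_1$, then this holds in particular for $\beta_1$ of the form in (6.2.16), i.e. $\beta_1(t) = \chi_{(t_0, T]}(t)\alpha_1(\omega)$, where $\alpha_1(\omega)$ is bounded and $\mathcal{E}_{t_0}^{(1)}$-measurable. Hence
$$
E\Big[\int_{t_0}^T E\Big[ \frac{\partial H_1}{\partial u_1}(t) \,\Big|\, \mathcal{E}_t^{(1)} \Big]\alpha_1\,dt\Big] = 0.
$$
Differentiating with respect to $t_0$ we get
$$
E\Big[ \frac{\partial H_1}{\partial u_1}(t_0)\alpha_1 \Big] = 0 \quad \text{for a.a. } t_0.
$$
Since this holds for all bounded $\mathcal{E}_{t_0}^{(1)}$-measurable random variables $\alpha_1$ we conclude that
$$
E\Big[ \frac{\partial H_1}{\partial u_1}(t) \,\Big|\, \mathcal{E}_t^{(1)} \Big] = 0 \quad \text{for a.a. } t \in [0, T].
$$
A similar argument gives that
$$
E\Big[ \frac{\partial H_2}{\partial u_2}(t) \,\Big|\, \mathcal{E}_t^{(2)} \Big] = 0
$$
provided that
$$
D_2 := \frac{d}{ds} J_2(u_1, u_2 + s\beta_2)\Big|_{s=0} = 0 \quad \text{for all bounded } \beta_2 \in \mathcal{A}_2.
$$
This shows that (i) ⇒ (ii). The argument above can be reversed, to give that (ii) ⇒ (i). We omit the details.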

6.2.4 Risk Minimization by FBSDE Games

We illustrate the general results above by again studying the risk minimization problem. The starting point is the same as in Sect. 5.2.4, with a wealth process $X_\pi(t)$ associated to a (self-financing) portfolio $\pi(t)$ given by
$$
\begin{cases}
dX_\pi(t) = \pi(t) X_\pi(t^-)[b_0(t)\,dt + \sigma_0(t)\,dB(t)]; & t \ge 0 \\
X_\pi(0) = x_0 > 0.
\end{cases}
\tag{6.2.47}
$$
This time we want to minimize the risk $\rho(X_\pi(T))$ of the terminal value $X_\pi(T)$, defined by
$$
\rho(X_\pi(T)) = -Y_\pi(0), \tag{6.2.48}
$$
where
$$
\begin{cases}
dY_\pi(t) = -g(Z(t))\,dt + Z(t)\,dB(t); & t \in [0, T] \\
Y_\pi(T) = X_\pi(T),
\end{cases}
\tag{6.2.49}
$$
for some given concave function $g$. Thus we want to find $\hat\pi \in \mathcal{A}$ and $\rho(X_{\hat\pi}(T)) := -Y_{\hat\pi}(0)$ such that
$$
\inf_{\pi \in \mathcal{A}} \{-Y_\pi(0)\} = -Y_{\hat\pi}(0). \tag{6.2.50}
$$

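In this case the Hamiltonian becomes
$$
H(t, x, y, z, k, \pi, \lambda, p, q, r) = \pi x b_0(t) p + \pi x \sigma_0(t) q + \lambda g(z). \tag{6.2.51}
$$
The adjoint equations (see (6.2.9)–(6.2.10)) are
$$
\begin{cases}
dp(t) = -\{\pi(t) b_0(t) p(t) + \pi(t)\sigma_0(t) q(t)\}\,dt + q(t)\,dB(t) \\
p(T) = \lambda(T)
\end{cases}
\tag{6.2.52}
$$
and
$$
\begin{cases}
d\lambda(t) = \lambda(t) g'(Z(t))\,dB(t) \\
\lambda(0) = 1,
\end{cases}
\tag{6.2.53}
$$
i.e.
$$
\lambda(t) = \exp\Big( \int_0^t g'(Z(s))\,dB(s) - \frac{1}{2}\int_0^t (g'(Z(s)))^2\,ds \Big). \tag{6.2.54}
$$
If $\hat\pi$ is optimal, then
$$
b_0(t)\hat p(t) + \sigma_0(t)\hat q(t) = 0. \tag{6.2.55}
$$
This gives
$$
\begin{cases}
d\hat p(t) = \hat q(t)\,dB(t) = -\dfrac{b_0(t)}{\sigma_0(t)}\hat p(t)\,dB(t); & 0 \le t \le T \\
\hat p(T) = \hat\lambda(T).
\end{cases}
\tag{6.2.56}
$$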

Comparing with (6.2.53) we see that the solution $(\hat p, \hat q)$ of the BSDE (6.2.56) is
$$
\hat p(t) = \hat\lambda(t), \qquad \hat q(t) = \hat\lambda(t) g'(\hat Z(t)). \tag{6.2.57}
$$
Hence by (6.2.55)
$$
g'(\hat Z(t)) = -\frac{b_0(t)}{\sigma_0(t)}. \tag{6.2.58}
$$
If, for example,
$$
g(z) = -\frac{1}{2} z^2 \tag{6.2.59}
$$
then (6.2.58) gives
$$
\hat Z(t) = \frac{b_0(t)}{\sigma_0(t)}.
$$
Substituted into (6.2.47) this gives, using (6.2.48) (with $\Gamma(t)$ as in (5.2.35)),
$$
\begin{aligned}
\hat X(T) = \hat Y(T) &= \hat Y(0) + \int_0^T \frac{1}{2}\Big( \frac{b_0(s)}{\sigma_0(s)} \Big)^2 ds + \int_0^T \frac{b_0(s)}{\sigma_0(s)}\,dB(s) \\
&= \hat Y(0) - \ln \Gamma(T).
\end{aligned}
\tag{6.2.60}
$$
We take expectation w.r.t. the martingale measure $Q$ defined by
$$
dQ(\omega) = \Gamma(T)\,dP(\omega) \tag{6.2.61}
$$
and get
$$
-\hat Y(0) = -x - E_Q[\ln \Gamma(T)] = -x - E\Big[ \frac{dQ}{dP} \ln \frac{dQ}{dP} \Big]. \tag{6.2.62}
$$
Note that
$$
H(Q \mid P) := E\Big[ \frac{dQ}{dP} \ln \frac{dQ}{dP} \Big]
$$
is the entropy of $Q$ with respect to $P$. Now that the optimal value $\hat Y(0)$ has been found, we can use (6.2.60) to find the corresponding optimal terminal wealth $\hat X(T)$, and from there the optimal portfolio as we did in Sect. 5.2.4. We have proved:

Theorem 6.9 Suppose (6.2.59) holds. Then the minimal risk −Yπˆ (0) = −Yˆ (0) of problem (6.2.50) is given by (6.2.62), where dQ = Γ (T )d P is the unique equivalent martingale measure for the market of Sect. 5.2.4.
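As a numerical illustration of (6.2.62), the following minimal sketch evaluates the entropy term under the simplifying assumption that $b_0$ and $\sigma_0$ are constants, and that $\Gamma(T) = \exp(-\theta B(T) - \tfrac{1}{2}\theta^2 T)$ with $\theta = b_0/\sigma_0$ (this form is assumed here because it is consistent with (6.2.60); the definition (5.2.35) itself is not restated in this section). In that case $H(Q \mid P) = \tfrac{1}{2}\theta^2 T$. The function names are hypothetical.

```python
import math


def relative_entropy_quadrature(theta: float, T: float,
                                n: int = 4801, zmax: float = 12.0) -> float:
    """Approximate E_P[Gamma * ln(Gamma)] by the trapezoid rule.

    Assumes Gamma = exp(-theta * B_T - theta^2 * T / 2) with B_T = sqrt(T) * z
    and z standard normal (constant coefficients, an illustrative assumption).
    """
    h = 2.0 * zmax / (n - 1)
    total = 0.0
    for i in range(n):
        z = -zmax + i * h
        log_gamma = -theta * math.sqrt(T) * z - 0.5 * theta ** 2 * T
        density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        weight = 0.5 if i in (0, n - 1) else 1.0  # trapezoid end-point weights
        total += weight * math.exp(log_gamma) * log_gamma * density
    return h * total


def minimal_risk(x: float, b0: float, sigma0: float, T: float) -> float:
    """Closed form of (6.2.62) for constant coefficients: -x - theta^2 * T / 2."""
    theta = b0 / sigma0
    return -x - 0.5 * theta ** 2 * T
```

For instance, `minimal_risk(1.0, 0.1, 0.2, 1.0)` uses $\theta = 0.5$, so the entropy correction to the initial wealth is $0.125$.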


6.3 Mean-Field Games

Mean-field SDEs are stochastic differential equations where the coefficients depend not only on the current value of the solution but also on its probability law. The motivation for such equations, also called McKean–Vlasov equations, goes back to M. Kac and H.P. McKean, with more recent fundamental contributions from J.M. Lasry and P.L. Lions. We refer to [CD1] and [CD2] and the references therein for more information about the background for mean-field SDEs. Mean-field SDEs are not Markovian, and classical dynamic programming and HJB equations cannot be used to study optimal control problems for such equations. However, it is possible to apply a suitably modified maximum principle approach. The first paper on this was [AD]. Subsequently a number of extensions have appeared, e.g. to optimal control of coupled systems of mean-field FBSDEs, singular mean-field control, mean-field stochastic partial differential equations, mean-field games, mean-field systems with memory and mean-field BSDEs. We refer to e.g. [CD1, CD2, HØS, AØ1, AØ2, DØS, AHØ] and the references therein. Our presentation is mainly based on [HØS].

6.3.1 Two Motivating Examples

(a) Optimal Harvesting from a Mean-Field System

Suppose we model the density $X^0(t) = X_t^0$ of an unharvested population at time $t$ by an equation of the form
$$
\begin{cases}
dX_t^0 = E[X_t^0]\, a(t)\,dt + X_{t^-}^0 \Big[ \sigma(t)\,dB(t) + \displaystyle\int_{\mathbb{R}} \gamma(t,\zeta)\,\tilde N(dt, d\zeta) \Big]; & t \in [0, T] \\
X_{0^-}^0 = x > 0.
\end{cases}
\tag{6.3.1}
$$
We assume that $a(t)$, $\sigma(t)$, and $\gamma(t,\zeta)$ are given predictable processes. We may regard (6.3.1) as a limit as $n \to \infty$ of a large population interacting system of the form
$$
dx^{i,n}(t) = \Big[ \frac{1}{n}\sum_{j=1}^n x^{j,n}(t) \Big] a(t)\,dt + x^{i,n}(t)\Big[ \sigma(t)\,dB^i(t) + \int_{\mathbb{R}} \gamma(t,\zeta)\,\tilde N(dt, d\zeta) \Big]. \tag{6.3.2}
$$
Here we have divided the lake into a grid of size $n$, and $x^{i,n}(t)$ represents the population density in grid box $i$. Thus the mean-field term $E[X(t)]$ represents an approximation to the weighted average $\frac{1}{n}\sum_{j=1}^n x^{j,n}(t)$ for large $n$.

Now suppose we introduce harvesting of the population. The size of the harvested population $X(t) = X^u(t)$ at time $t$ can then be modeled by a mean-field control stochastic differential equation of the form
$$
\begin{cases}
dX(t) = E[X(t)]\, a(t)\,dt + X(t^-)\Big[ \sigma(t)\,dB(t) + \displaystyle\int_{\mathbb{R}} \gamma(t,\zeta)\,\tilde N(dt, d\zeta) \Big] - X(t) u(t)\,dt; & t \in [0, T] \\
X(0) = x > 0,
\end{cases}
\tag{6.3.3}
$$
where $u \ge 0$ is a predictable process, representing the relative harvesting rate. The performance functional is assumed to be of the form
$$
J(u) = E\Big[ \int_0^T \Big\{ h_0 X(t) u(t) - \frac{1}{2}\rho X^2(t) u^2(t) \Big\}\,dt + K X(T) \Big], \tag{6.3.4}
$$
where $h_0 > 0$ is a fixed given unit price and $K = K(\omega) > 0$ is a given salvage price, assumed to be $\mathcal{F}_T$-measurable. The term $\frac{1}{2}\rho u^2(t)$ represents a quadratic cost rate for the harvesting effort $u(t)$ and $\rho > 0$ is a given constant. The problem is to find $u^*$ such that
$$
J(u^*) = \sup_u J(u). \tag{6.3.5}
$$
This is an example of a mean-field control problem. We will return to this problem in Sect. 6.22.
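The passage from the interacting system (6.3.2) to the mean-field equation (6.3.1) can be explored numerically. The sketch below is an Euler scheme under simplifying assumptions that are not part of the text: constant coefficients $a$ and $\sigma$, no jump term ($N = 0$), and independent Brownian motions drawn from Python's standard generator. Under these assumptions $m(t) = E[X(t)]$ solves $m' = a\,m$, so the empirical average of the particles should be close to $x e^{aT}$ for large $n$. All function names are hypothetical.

```python
import math
import random


def simulate_particles(n: int, x0: float, a: float, sigma: float,
                       T: float, steps: int, seed: int = 0) -> list:
    """Euler scheme for dx^i = (average of all x^j) * a dt + x^i * sigma dB^i.

    Jumps are omitted and the coefficients are constant (illustrative assumptions).
    """
    rng = random.Random(seed)
    dt = T / steps
    x = [x0] * n
    for _ in range(steps):
        mean = sum(x) / n  # empirical stand-in for the mean-field term E[X(t)]
        x = [xi + mean * a * dt + xi * sigma * rng.gauss(0.0, math.sqrt(dt))
             for xi in x]
    return x


def mean_field_expectation(x0: float, a: float, T: float) -> float:
    """E[X(T)] for (6.3.1) with constant a: the solution of m' = a * m, m(0) = x0."""
    return x0 * math.exp(a * T)
```

Comparing `sum(particles) / len(particles)` with `mean_field_expectation(x0, a, T)` for increasing `n` gives a rough feeling for the quality of the mean-field approximation.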

(b) Optimal Irreversible Investments Under Model Uncertainty

Let $V(t) = V_0 + \int_0^t u(s)\,ds$ denote the production capacity of a production plant and let $D(t)$ denote the demand rate at time $t$. At any time $t$ the production capacity can be increased by the rate $u(t)$ at the price $\lambda_0(t, D(t))$ per capacity unit, where $u(t) \ge 0$ is a predictable process (our control). The total expected net profit of the production is assumed to be
$$
J(u, \theta) = E_{Q_\theta}\Big[ \int_0^T a(t, E[\varphi(D(t))]) \min[D(t), V(t)]\,dt + g(D(T)) - \int_0^T \lambda_0(t, D(t)) u(t)\,dt \Big], \tag{6.3.6}
$$
where $g(D(T))$ is some salvage value of the closed-down production plant, $\varphi$ is a given real function and $a(t)$ is the unit sales price of the production. Here $\{Q_\theta\}_{\theta \in \Theta}$ is a family of probability measures representing the model uncertainty. We let $\mathcal{A}_{\mathbb{E}}$ denote the set of right-continuous, non-decreasing $\mathcal{E}_t$-adapted processes $u(t) \ge 0$, where $\mathbb{E} = \{\mathcal{E}_t\}_{t \ge 0}$ with $\mathcal{E}_t \subseteq \mathcal{F}_t$ is a given subfiltration. Thus $\mathcal{E}_t$ represents the information available to the investor at time $t$. We assume that the demand $D(t)$ is given by a jump diffusion of the form
$$
\begin{cases}
dD(t) = D(t^-)\Big[ \alpha(t,\omega)\,dt + \beta(t,\omega)\,dB(t) + \displaystyle\int_{\mathbb{R}} \eta(t,\omega,\zeta)\,\tilde N(dt, d\zeta) \Big], & 0 \le t \le T \\
D(0) > 0,
\end{cases}
\tag{6.3.7}
$$
where $\eta(t,\omega,\zeta) \ge -1$ for all $t, \omega, \zeta$. We want to maximize the expected total net profit under the worst possible scenario, i.e. we want to find $(u^*, \theta^*) \in \mathcal{A}_{\mathbb{E}} \times \Theta$ such that
$$
\sup_{u \in \mathcal{A}_{\mathbb{E}}} \inf_{\theta \in \Theta} J(u, \theta) = \inf_{\theta \in \Theta} \sup_{u \in \mathcal{A}_{\mathbb{E}}} J(u, \theta) = J(u^*, \theta^*). \tag{6.3.8}
$$
This is an example of a (partial information) mean-field control game of a jump diffusion. Note that the system is non-Markovian, both because of the mean-field term and the partial information constraint.

6.3.2 General Mean-Field Non-zero Sum Games

In the following, we study general mean-field control games and prove maximum principles for the problem of finding the following types of equilibria:

(i) Nash equilibria for non-zero sum games.
(ii) Saddle points (as in (6.3.8)) for zero sum games.

Recall (see the beginning of Sect. 6.2) that if $F : L^2(P) \to \mathbb{R}$ is Fréchet differentiable, then its Fréchet derivative at $X \in L^2(P)$, denoted by $\nabla_X F$, is a bounded linear functional on $L^2(P)$. Hence, by the Riesz representation theorem, we can identify $\nabla_X F$ with a random variable $\tilde\nabla_X F$ in $L^2(P)$, and the action of $\nabla_X F$ on $Y \in L^2(P)$ is given by $\langle \nabla_X F, Y \rangle = E[\tilde\nabla_X F\, Y]$. In the following we do not distinguish between $\nabla_X F$ and $\tilde\nabla_X F$. For example, if $F(X) = E[\varphi(X)]$; $X \in L^2(P)$, where $\varphi$ is a real $C^1$-function such that $\varphi(X) \in L^2(P)$ and $\frac{\partial\varphi}{\partial x}(X) \in L^2(P)$, then $\nabla_X F = \frac{\partial\varphi}{\partial x}(X)$ and
$$
\nabla_X F(Y) = \Big\langle \frac{\partial\varphi}{\partial x}(X), Y \Big\rangle = E\Big[ \frac{\partial\varphi}{\partial x}(X)\, Y \Big] \quad \text{for } Y \in L^2(P).
$$
Put $u = (u_1, u_2)$, where $u_i$ represents the control of player $i$; $i = 1, 2$. Consider the system with state process $X(t) = X^u(t)$ of the form
$$
\begin{aligned}
dX(t) = {} & b(t, X(t), Y(t), u(t), \omega)\,dt + \sigma(t, X(t), Y(t), u(t), \omega)\,dB(t) \\
& + \int_{\mathbb{R}} \gamma(t, X(t), Y(t), u(t), \zeta, \omega)\,\tilde N(dt, d\zeta),
\end{aligned}
\tag{6.3.9}
$$
where
$$
Y(t) = F(X(t, \cdot)), \tag{6.3.10}
$$
and $F$ is a Fréchet differentiable operator on $L^2(P)$. The performance functional for player $i$ is assumed to be of the form
$$
J_i(u) = E\Big[ \int_0^T f_i(t, X(t), Y(t), u(t), \omega)\,dt + g_i(X(T), Y(T), \omega) \Big]; \quad i = 1, 2. \tag{6.3.11}
$$
We may interpret the functions $f_i$ as profit rates and $g_i$ as bequest or salvage value functions. We want to find a Nash equilibrium for this game, i.e. find $u_1^* \in \mathcal{A}^{(1)}$ and $u_2^* \in \mathcal{A}^{(2)}$ such that
$$
\sup_{u_1 \in \mathcal{A}^{(1)}} J_1(u_1, u_2^*) = J_1(u_1^*, u_2^*) \tag{6.3.12}
$$
and
$$
\sup_{u_2 \in \mathcal{A}^{(2)}} J_2(u_1^*, u_2) = J_2(u_1^*, u_2^*). \tag{6.3.13}
$$
Here $\mathcal{A}^{(i)}$ is a given family of $\mathbb{E}^{(i)}$-predictable processes such that the corresponding state equation has a unique solution $X$ such that $\omega \to X(t, \omega) \in L^2(P)$ for all $t$. We let $\mathbb{A}^{(i)}$ denote the set of possible values of $u_i(t)$; $t \in [0, T]$ when $u_i \in \mathcal{A}^{(i)}$; $i = 1, 2$.
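The identification of the Fréchet derivative of $F(X) = E[\varphi(X)]$ with the random variable $\frac{\partial\varphi}{\partial x}(X)$, recalled earlier in this subsection, can be checked numerically on a finite probability space. The following minimal sketch compares the formula $\langle \nabla_X F, Y \rangle = E[\varphi'(X) Y]$ with a central finite difference of $s \mapsto F(X + sY)$. The finite sample space, the helper names and the step size are illustrative assumptions. For the mean-field term $F(X) = E[X]$ used in Sect. 6.3.1 the same computation gives $\nabla_X F = 1$.

```python
from typing import Callable, Sequence


def F(X: Sequence[float], probs: Sequence[float],
      phi: Callable[[float], float]) -> float:
    """F(X) = E[phi(X)] on a finite probability space with weights probs."""
    return sum(p * phi(x) for p, x in zip(probs, X))


def gradient_action(X: Sequence[float], Y: Sequence[float],
                    probs: Sequence[float],
                    dphi: Callable[[float], float]) -> float:
    """Action of the Frechet derivative: <grad_X F, Y> = E[phi'(X) * Y]."""
    return sum(p * dphi(x) * y for p, x, y in zip(probs, X, Y))


def directional_derivative(X: Sequence[float], Y: Sequence[float],
                           probs: Sequence[float],
                           phi: Callable[[float], float],
                           s: float = 1e-5) -> float:
    """Central finite difference of s -> F(X + s * Y) at s = 0."""
    plus = [x + s * y for x, y in zip(X, Y)]
    minus = [x - s * y for x, y in zip(X, Y)]
    return (F(plus, probs, phi) - F(minus, probs, phi)) / (2.0 * s)
```

With `phi = lambda x: x ** 3` and `dphi = lambda x: 3 * x ** 2`, the two quantities agree up to the discretization error of the finite difference.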

6.3.3 A Sufficient Maximum Principle

In this section we prove a sufficient maximum principle for the mean-field control games described above. We first consider the general non-zero sum game. To this end, define two Hamiltonians $H_i$; $i = 1, 2$, as follows:
$$
\begin{aligned}
H_i(t, x, y, u_1, u_2, p_i, q_i, r_i) = {} & f_i(t, x, y, u) + b(t, x, y, u) p_i + \sigma(t, x, y, u) q_i \\
& + \int_{\mathbb{R}} \gamma(t, x, y, u, \zeta) r_i(\zeta)\,\nu(d\zeta).
\end{aligned}
\tag{6.3.14}
$$
We assume that for $i = 1, 2$, $H = H_i$ is Fréchet differentiable ($C^1$) in the variables $x, y, z, k, \xi, u$ and that the Fréchet derivative $\nabla_k H$ of $H$ with respect to $k \in \mathcal{R}$ as a random measure is absolutely continuous with respect to $\nu$, with Radon–Nikodym derivative $\frac{d\nabla_k H}{d\nu}$. Thus, if $\langle \nabla_k H, h \rangle$ denotes the action of the linear operator $\nabla_k H$ on the function $h \in \mathcal{R}$ we have
$$
\langle \nabla_k H, h \rangle = \int_{\mathbb{R}} h(\zeta)\,d\nabla_k H(\zeta) = \int_{\mathbb{R}} h(\zeta)\frac{d\nabla_k H(\zeta)}{d\nu(\zeta)}\,d\nu(\zeta). \tag{6.3.15}
$$
In the rest of the book we will assume that
$$
\frac{d\nabla_k \hat H_i}{d\nu}(t, \cdot) < \infty \quad \text{for all } t \in [0, T],\ i = 1, 2, \tag{6.3.16}
$$

where here, and in the following, we are using the abbreviated notation
$$
H_i(t) = H_i(t, X(t), Y(t), u(t), p_i(t), q_i(t), r_i(t, \cdot))
$$
and similarly for
$$
\hat H_i(t) = H_i(t, \hat X(t), \hat Y(t), \hat u(t), \hat p_i(t), \hat q_i(t), \hat r_i(t, \cdot))
$$
and related processes. The BSDE for the adjoint processes $p_i, q_i, r_i$ is
$$
\begin{cases}
dp_i(t) = -\dfrac{\partial H_i}{\partial x}(t, X(t), Y(t), u(t), p_i(t), q_i(t), r_i(t))\,dt \\
\qquad\qquad - \dfrac{\partial H_i}{\partial y}(t, X(t), Y(t), u(t), p_i(t), q_i(t), r_i(t))\,\nabla_{X(t)} F\,dt \\
\qquad\qquad + q_i(t)\,dB(t) + \displaystyle\int_{\mathbb{R}} r_i(t,\zeta)\,\tilde N(dt, d\zeta) \\
p_i(T) = \dfrac{\partial g_i}{\partial x}(X(T), Y(T)) + \dfrac{\partial g_i}{\partial y}(X(T), Y(T))\,\nabla_{X(T)} F; \quad i = 1, 2.
\end{cases}
\tag{6.3.17}
$$

Theorem 6.10 (Sufficient maximum principle) Let $\hat u_1 \in \mathcal{A}^{(1)}$, $\hat u_2 \in \mathcal{A}^{(2)}$ with corresponding solutions $\hat X, \hat p_i, \hat q_i, \hat r_i$ of (6.3.9) and (6.3.17). Assume the following:

• The maps
$$
X, u_1 \to H_1(t, X, F(X), u_1, \hat u_2(t), \hat p_1(t), \hat q_1(t), \hat r_1(t)), \tag{6.3.18}
$$
$$
X, u_2 \to H_2(t, X, F(X), \hat u_1(t), u_2, \hat p_2(t), \hat q_2(t), \hat r_2(t)), \tag{6.3.19}
$$
and
$$
X \to g_i(X, F(X)) \tag{6.3.20}
$$

are concave for all $t$; $i = 1, 2$.

• (The conditional maximum properties)
$$
\begin{aligned}
&\operatorname*{ess\,sup}_{u_1} E[H_1(t, \hat X(t), \hat Y(t), u_1, \hat u_2(t), \hat p_1(t), \hat q_1(t), \hat r_1(t,\cdot)) \mid \mathcal{E}_t^{(1)}] \\
&\quad = E[H_1(t, \hat X(t), \hat Y(t), \hat u_1(t), \hat u_2(t), \hat p_1(t), \hat q_1(t), \hat r_1(t,\cdot)) \mid \mathcal{E}_t^{(1)}]
\end{aligned}
\tag{6.3.21}
$$
and
$$
\begin{aligned}
&\operatorname*{ess\,sup}_{u_2} E[H_2(t, \hat X(t), \hat Y(t), \hat u_1(t), u_2, \hat p_2(t), \hat q_2(t), \hat r_2(t,\cdot)) \mid \mathcal{E}_t^{(2)}] \\
&\quad = E[H_2(t, \hat X(t), \hat Y(t), \hat u_1(t), \hat u_2(t), \hat p_2(t), \hat q_2(t), \hat r_2(t,\cdot)) \mid \mathcal{E}_t^{(2)}].
\end{aligned}
\tag{6.3.22}
$$
Then $(\hat u_1, \hat u_2)$ is a Nash equilibrium, in the sense that (6.3.12) and (6.3.13) hold with $u_i^* := \hat u_i$; $i = 1, 2$.

Proof By introducing an increasing sequence of stopping times converging to $T$, we see that we may assume that all local martingales appearing in the proof below are martingales, and hence have expectation 0. See the proof of Theorem 5.4. We first study the stochastic control problem (6.3.12): For simplicity of notation, in the following we put $X(t) = X^{u_1, \hat u_2}(t)$, $Y(t) = Y^{u_1, \hat u_2}(t)$ and $\hat X(t) = X^{\hat u_1, \hat u_2}(t)$, $\hat Y(t) = Y^{\hat u_1, \hat u_2}(t)$. Consider
$$
J_1(u_1, \hat u_2) - J_1(\hat u_1, \hat u_2) = I_1 + I_2,
$$
where
$$
I_1 = E\Big[ \int_0^T \{ f_1(t, X(t), Y(t), u_1(t), \hat u_2(t)) - f_1(t, \hat X(t), \hat Y(t), \hat u_1(t), \hat u_2(t)) \}\,dt \Big],
$$
$$
I_2 = E[g_1(X(T), Y(T)) - g_1(\hat X(T), \hat Y(T))].
$$
By the definition of $H_1$ we have
$$
\begin{aligned}
I_1 = E\Big[ \int_0^T \Big\{ & H_1(t, X_t, Y_t, u_1(t), \hat u_2(t), \hat p_1(t), \hat q_1(t), \hat r_1(t)) - H_1(t, \hat X_t, \hat Y_t, \hat u(t), \hat p_1(t), \hat q_1(t), \hat r_1(t)) \\
& - (b - \hat b)\hat p_1 - (\sigma - \hat\sigma)\hat q_1 - \int_{\mathbb{R}} (\gamma - \hat\gamma)\hat r_1(\zeta)\,\nu(d\zeta) \Big\}\,dt \Big].
\end{aligned}
\tag{6.3.23}
$$
By concavity of $g_1$ and the Itô formula we have

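$$
\begin{aligned}
I_2 &\le E\Big[ \frac{\partial g_1}{\partial x}(\hat X(T), \hat Y(T))(X(T) - \hat X(T)) + \frac{\partial g_1}{\partial y}(\hat X(T), \hat Y(T))\,\nabla_{\hat X(T)} F\,(X(T) - \hat X(T)) \Big] \\
&= \langle \hat p_1(T), X(T) - \hat X(T) \rangle \\
&= E\Big[ \int_0^T \hat p_1(t)\,d\tilde X(t) + \int_0^T \tilde X(t)\,d\hat p_1(t) + \int_0^T \hat q_1(t)\tilde\sigma(t)\,dt + \int_0^T\!\!\int_{\mathbb{R}} \hat r_1(t,\zeta)\tilde\gamma(t,\zeta)\,\nu(d\zeta)\,dt \Big],
\end{aligned}
\tag{6.3.24}
$$
where we have put
$$
\tilde X(t) = X(t) - \hat X(t), \qquad \tilde\sigma(t) = \sigma(t) - \hat\sigma(t). \tag{6.3.25}
$$
Note that
$$
E\Big[ \int_0^T \hat p_1(t)\,d\tilde X(t) \Big] = E\Big[ \int_0^T \hat p_1(t)(b - \hat b)\,dt \Big] \tag{6.3.26}
$$
and that
$$
\begin{aligned}
E\Big[ \int_0^T \tilde X(t)\,d\hat p_1(t) \Big] = E\Big[ \int_0^T \tilde X(t)\Big\{ & -\frac{\partial H_1}{\partial x}(t, \hat X(t), \hat Y(t), \hat u(t), \hat p_1(t), \hat q_1(t), \hat r_1(t)) \\
& - \frac{\partial H_1}{\partial y}(t, \hat X(t), \hat Y(t), \hat u(t), \hat p_1(t), \hat q_1(t), \hat r_1(t))\,\nabla_{\hat X(t)} F \Big\}\,dt \Big].
\end{aligned}
\tag{6.3.27}
$$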

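Combining the above we get
$$
\begin{aligned}
J_1(u_1, \hat u_2) - J_1(\hat u_1, \hat u_2) &\le E\Big[ \int_0^T \Big\{ H_1(t) - \hat H_1(t) - \frac{\partial \hat H_1}{\partial x}(X - \hat X) - \frac{\partial \hat H_1}{\partial y}\nabla_{\hat X(t)} F\,(X - \hat X) \Big\}\,dt \Big] \\
&= E\Big[ \int_0^T E\Big[ H_1(t) - \hat H_1(t) - \frac{\partial \hat H_1}{\partial x}(X - \hat X) - \frac{\partial \hat H_1}{\partial y}\nabla_{\hat X(t)} F\,(X - \hat X) \,\Big|\, \mathcal{E}_t^{(1)} \Big]\,dt \Big],
\end{aligned}
\tag{6.3.28}
$$
where $\hat H_1(t)$ means $H_1$ evaluated at $(t, \hat X(t), \hat Y(t), \hat u(t), \hat p_1(t), \hat q_1(t), \hat r_1(t))$, and $H_1(t)$ is $H_1$ evaluated at $(t, X(t), Y(t), u_1(t), \hat u_2(t), \hat p_1(t), \hat q_1(t), \hat r_1(t))$. Note that by concavity of $H_1$ we have
$$
\begin{aligned}
& H_1(t, X, F(X), u_1, \hat u_2, \hat p_1, \hat q_1, \hat r_1) - H_1(t, \hat X, F(\hat X), \hat u_1, \hat u_2, \hat p_1, \hat q_1, \hat r_1) \\
&\quad \le \frac{\partial \hat H_1}{\partial x}(\hat X)(X - \hat X) + \frac{\partial \hat H_1}{\partial y}(\hat X)\,\nabla_{\hat X} F\,(X - \hat X) + \frac{\partial \hat H_1}{\partial u_1}(\hat u)(u_1 - \hat u_1).
\end{aligned}
\tag{6.3.29}
$$
Therefore, to obtain that $J_1(u_1, \hat u_2) - J_1(\hat u_1, \hat u_2) \le 0$, it suffices that
$$
E\Big[ \frac{\partial \hat H_1}{\partial u_1}(\hat u) \,\Big|\, \mathcal{E}_t^{(1)} \Big](u_1 - \hat u_1) \le 0 \tag{6.3.30}
$$
for all $u_1$. The inequality (6.3.30) holds by our assumption (6.3.21). The difference $J_2(\hat u_1, u_2) - J_2(\hat u_1, \hat u_2)$ is handled similarly.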

6.3.4 A Necessary Maximum Principle

In Theorem 6.10 we proved a verification theorem, stating that if a given control $\hat u$ satisfies certain conditions, then it is indeed optimal for the mean-field control game. We now establish a partial converse, implying that if a control $\hat u$ is optimal for the mean-field control game, then it is a conditional saddle point for the Hamiltonian. To achieve this, we start with the setup of Theorem 6.5 as follows: Assume that (6.2.16) and (6.2.17) hold, and assume the analogue of (6.2.18)–(6.2.20), namely that the following derivative processes exist and belong to $L^2([0, T] \times \Omega)$:
$$
x_1(t) = \frac{d}{ds} X^{(u_1 + s\beta_1, u_2)}(t)\Big|_{s=0}, \qquad x_2(t) = \frac{d}{ds} X^{(u_1, u_2 + s\beta_2)}(t)\Big|_{s=0},
$$
and that they satisfy the SDEs
$$
\begin{aligned}
dx_i = {} & \Big[ \frac{\partial b}{\partial x}(t) x_i(t) + \frac{\partial b}{\partial y}(t)\langle \nabla_{X(t)} F, x_i(t) \rangle + \frac{\partial b}{\partial u_i}(t)\beta_i(t) \Big]\,dt \\
& + \Big[ \frac{\partial \sigma}{\partial x}(t) x_i(t) + \frac{\partial \sigma}{\partial y}(t)\langle \nabla_{X(t)} F, x_i(t) \rangle + \frac{\partial \sigma}{\partial u_i}(t)\beta_i(t) \Big]\,dB(t) \\
& + \int_{\mathbb{R}} \Big[ \frac{\partial \gamma}{\partial x}(t) x_i(t) + \frac{\partial \gamma}{\partial y}(t)\langle \nabla_{X(t)} F, x_i(t) \rangle + \frac{\partial \gamma}{\partial u_i}(t)\beta_i(t) \Big]\,\tilde N(dt, d\zeta)
\end{aligned}
$$
for all $u_i, \beta_i$ as in (6.2.17); $i = 1, 2$. Note that, by the chain rule, if $h \in C^1(\mathbb{R})$,
$$
\frac{d}{ds} h\big( Y^{(u_1 + s\beta_1, u_2)}(t) \big)\Big|_{s=0} = \frac{d}{ds} h\big( F\big( X^{(u_1 + s\beta_1, u_2)}(t) \big) \big)\Big|_{s=0} = h'(Y^u(t))\langle \nabla_{X^u(t)} F, x_1(t) \rangle
$$
and similarly
$$
\frac{d}{ds} h\big( Y^{(u_1, u_2 + s\beta_2)}(t) \big)\Big|_{s=0} = h'(Y^u(t))\langle \nabla_{X^u(t)} F, x_2(t) \rangle.
$$
Under these assumptions we can prove the following:

Theorem 6.11 (Necessary maximum principle for mean-field games) Suppose $\hat u_1 \in \mathcal{A}^{(1)}$ and $\hat u_2 \in \mathcal{A}^{(2)}$ constitute a Nash equilibrium for the game, i.e. they satisfy (6.3.12) and (6.3.13). Then
$$
E\Big[ \frac{\partial H_1}{\partial u_1}(t, \hat X(t), \hat Y(t), u_1, \hat u_2(t), \hat p_1(t), \hat q_1(t), \hat r_1(t,\cdot))\Big|_{u_1 = \hat u_1(t)} \,\Big|\, \mathcal{E}_t^{(1)} \Big] = 0 \tag{6.3.31}
$$
and
$$
E\Big[ \frac{\partial H_2}{\partial u_2}(t, \hat X(t), \hat Y(t), \hat u_1(t), u_2, \hat p_2(t), \hat q_2(t), \hat r_2(t,\cdot))\Big|_{u_2 = \hat u_2(t)} \,\Big|\, \mathcal{E}_t^{(2)} \Big] = 0. \tag{6.3.32}
$$

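Proof Let $u_i, \beta_i$ be as in (6.2.17); $i = 1, 2$. First, consider
$$
\begin{aligned}
D_1^{\beta_1} := {} & \frac{d}{ds} J_1(u_1 + s\beta_1, u_2)\Big|_{s=0} \\
= {} & E\Big[ \int_0^T \Big\{ \frac{\partial f_1}{\partial x}(t) x_1(t) + \frac{\partial f_1}{\partial y}(t)\nabla_{X(t)} F\, x_1(t) + \frac{\partial f_1}{\partial u_1}(t)\beta_1(t) \Big\}\,dt \\
& + \frac{\partial g_1}{\partial x}(X^u(T), Y^u(T)) x_1(T) + \frac{\partial g_1}{\partial y}(X^u(T), Y^u(T))\nabla_{X(T)} F\, x_1(T) \Big].
\end{aligned}
$$
By (6.3.17) we have
$$
\begin{aligned}
& E\Big[ \frac{\partial g_1}{\partial x}(X(T), Y(T)) x_1(T) + \frac{\partial g_1}{\partial y}(X(T), Y(T))\nabla_{X(T)} F\, x_1(T) \Big] = E[p_1(T) x_1(T)] \\
&\quad = E\Big[ \int_0^T p_1(t^-)\,dx_1(t) + \int_0^T x_1(t^-)\,dp_1(t) \\
&\qquad + \int_0^T q_1(t)\Big\{ \frac{\partial \sigma}{\partial x}(t) x_1(t) + \frac{\partial \sigma}{\partial y}(t)\nabla_{X(t)} F\, x_1(t) + \frac{\partial \sigma}{\partial u_1}(t)\beta_1(t) \Big\}\,dt \\
&\qquad + \int_0^T\!\!\int_{\mathbb{R}} r_1(t,\zeta)\Big\{ \frac{\partial \gamma}{\partial x}(t) x_1(t) + \frac{\partial \gamma}{\partial y}(t)\nabla_{X(t)} F\, x_1(t) + \frac{\partial \gamma}{\partial u_1}(t)\beta_1(t) \Big\}\,\nu(d\zeta)\,dt \Big].
\end{aligned}
$$
Hence, by (6.3.14) and (6.3.17),
$$
\begin{aligned}
D_1^{\beta_1} = {} & E\Big[ \int_0^T \Big\{ \frac{\partial f_1}{\partial x}(t) + \frac{\partial f_1}{\partial y}(t)\nabla_{X(t)} F + p_1(t)\Big( \frac{\partial b}{\partial x}(t) + \frac{\partial b}{\partial y}(t)\nabla_{X(t)} F \Big) \\
&\qquad + q_1(t)\Big( \frac{\partial \sigma}{\partial x}(t) + \frac{\partial \sigma}{\partial y}(t)\nabla_{X(t)} F \Big) + \int_{\mathbb{R}} r_1(t,\zeta)\Big( \frac{\partial \gamma}{\partial x}(t,\zeta) + \frac{\partial \gamma}{\partial y}(t,\zeta)\nabla_{X(t)} F \Big)\,\nu(d\zeta) \\
&\qquad - \frac{\partial H_1}{\partial x}(t) - \frac{\partial H_1}{\partial y}(t)\nabla_{X(t)} F \Big\} x_1(t)\,dt \Big] \\
& + E\Big[ \int_0^T \Big\{ \frac{\partial f_1}{\partial u_1}(t) + p_1(t)\frac{\partial b}{\partial u_1}(t) + q_1(t)\frac{\partial \sigma}{\partial u_1}(t) + \int_{\mathbb{R}} r_1(t,\zeta)\frac{\partial \gamma}{\partial u_1}(t,\zeta)\,\nu(d\zeta) \Big\}\beta_1(t)\,dt \Big] \\
= {} & E\Big[ \int_0^T \frac{\partial H_1}{\partial u_1}(t)\beta_1(t)\,dt \Big] = E\Big[ \int_0^T E\Big[ \frac{\partial H_1}{\partial u_1}(t)\beta_1(t) \,\Big|\, \mathcal{E}_t^{(1)} \Big]\,dt \Big] \\
= {} & E\Big[ \int_0^T \beta_1(t) E\Big[ \frac{\partial H_1}{\partial u_1}(t) \,\Big|\, \mathcal{E}_t^{(1)} \Big]\,dt \Big].
\end{aligned}
$$
Thus we see that if $D_1^{\beta_1} = 0$ for all bounded $\mathcal{E}_t^{(1)}$-adapted $\beta_1(t)$ we obtain that
$$
E\Big[ \frac{\partial H_1}{\partial u_1}(t) \,\Big|\, \mathcal{E}_t^{(1)} \Big] = 0 \quad \text{for all } t \in [0, T].
$$
The proof for $D_2^{\beta_2} := \frac{d}{ds} J_2(u_1, u_2 + s\beta_2)\big|_{s=0}$ is similar.

Remark 6.12 Note that the argument in the above proof can be reversed, to give the following result of independent interest:

Corollary 6.13 Under the assumptions of the previous theorem the following are equivalent, for each $i = 1, 2$:

(i) $D_i^{\beta_i} = 0$ for all bounded $\beta_i \in \mathcal{A}_i$,
(ii) $E\big[ \frac{\partial H_i}{\partial u_i}(t) \mid \mathcal{E}_t^{(i)} \big] = 0$ for all $t \in [0, T]$.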

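6.3.5 Application to Model Uncertainty Control

General Theory

We represent model uncertainty by a family of probability measures $Q = Q_\theta$ equivalent to $P$, with the Radon–Nikodym derivative on $\mathcal{F}_t$ given by
$$
\frac{d(Q \mid \mathcal{F}_t)}{d(P \mid \mathcal{F}_t)} = G^\theta(t), \tag{6.3.33}
$$
where, for $0 \le t \le T$, $G^\theta(t)$ is a martingale of the form
$$
dG^\theta(t) = G^\theta(t^-)\Big[ \theta_0(t)\,dB(t) + \int_{\mathbb{R}} \theta_1(t,\zeta)\,\tilde N(dt, d\zeta) \Big]; \quad G^\theta(0) = 1. \tag{6.3.34}
$$
Here $\theta = (\theta_0, \theta_1)$ may be regarded as a scenario control. Let $\mathcal{A}_1 := \mathcal{A}_{\mathbb{G}}$ denote a given family of admissible singular controls $\xi$ and let $\mathcal{A}_2 := \Theta$ denote a given set of admissible scenario controls $\theta$ such that
$$
E\Big[ \int_0^T \Big\{ |\theta_0^2(t)| + \int_{\mathbb{R}} \theta_1^2(t,\zeta)\,\nu(d\zeta) \Big\}\,dt \Big] < \infty \tag{6.3.35}
$$
and $\theta_1(t,\zeta) \ge -1 + \epsilon$ for some $\epsilon > 0$. Now assume that $X_1(t) = X^u(t)$ is a controlled mean-field Itô–Lévy process of the form
$$
\begin{aligned}
dX_1(t) = {} & b_1(t, X_1(t), Y_1(t), \omega)\,dt + \sigma_1(t, X_1(t), Y_1(t), \omega)\,dB(t) \\
& + \int_{\mathbb{R}} \gamma_1(t, X_1(t), Y_1(t), \zeta, \omega)\,\tilde N(dt, d\zeta),
\end{aligned}
\tag{6.3.36}
$$
where
$$
Y_1(t) = F(X_1(t, \cdot)), \tag{6.3.37}
$$
and $F$ is a Fréchet differentiable operator on $L^2(P)$. As before, let $\mathbb{E}^{(1)} = \{\mathcal{E}_t^{(1)}\}_{0 \le t \le T}$ and $\mathbb{E}^{(2)} = \{\mathcal{E}_t^{(2)}\}_{0 \le t \le T}$ be given subfiltrations of $\mathbb{F} = \{\mathcal{F}_t\}_{0 \le t \le T}$, representing the information available to the controllers at time $t$. It is required that $\xi \in \mathcal{A}_1$ be $\mathbb{E}^{(1)}$-predictable, and $\theta \in \mathcal{A}_2$ be $\mathbb{E}^{(2)}$-predictable. We set $w = (u, \theta)$ and consider the stochastic differential game to find $(\hat u, \hat\theta) \in \mathcal{A}_1 \times \mathcal{A}_2$ such that
$$
\sup_{u \in \mathcal{A}_1} \inf_{\theta \in \mathcal{A}_2} E_{Q_\theta}[j(u, \theta)] = E_{Q_{\hat\theta}}[j(\hat u, \hat\theta)] = \inf_{\theta \in \mathcal{A}_2} \sup_{u \in \mathcal{A}_1} E_{Q_\theta}[j(u, \theta)], \tag{6.3.38}
$$
where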

$$
j(u, \theta) = \int_0^T \{ f_1(t, X(t), Y(t), u(t), \omega) + \lambda(\theta(t)) \}\,dt + g_1(X(T), Y(T), \omega). \tag{6.3.39}
$$
The term $E_{Q_\theta}[\int_0^T \lambda(\theta(t))\,dt]$ can be seen as a penalty term, penalizing the difference between $Q_\theta$ and the original probability measure $P$. Note that since $G^\theta(t)$ is a martingale we have
$$
E_{Q_\theta}[j(u, \theta)] = E\Big[ G^\theta(T) g_1(X(T), Y(T)) + \int_0^T G^\theta(t)\{ f_1(t, X(t), Y(t), u(t)) + \lambda(\theta(t)) \}\,dt \Big]. \tag{6.3.40}
$$
We see that this is a mean-field stochastic differential game of the type discussed in Sect. 6.3.2, with a two-dimensional state space
$$
X(t) := (X_1(t), X_2(t)) := (X^u(t), G^\theta(t)) \tag{6.3.41}
$$
and with
$$
\begin{aligned}
f(t, X(t), Y(t), u, \theta) &:= G^\theta(t)\{ f_1(t, X_1(t), Y_1(t), u(t)) + \lambda(\theta(t)) \} \\
&= X_2(t)\{ f_1(t, X_1(t), Y_1(t), u(t)) + \lambda(\theta(t)) \}
\end{aligned}
\tag{6.3.42}
$$
and
$$
g(X(T), Y(T)) := G^\theta(T) g_1(X_1(T), Y_1(T)) = X_2(T) g_1(X_1(T), Y_1(T)). \tag{6.3.43}
$$
Accordingly, using the result from Sect. 6.3.2, we get the following Hamiltonian for the game (6.3.38):
$$
\begin{aligned}
H(t, x_1, x_2, y_1, u, \theta, p, q, r) = {} & x_2\{ f_1(t, x_1, y_1, u) + \lambda(\theta) \} + b_1(t, x_1, y_1) p_1 + \sigma_1(t, x_1, y_1) q_1 + x_2\theta_0 q_2 \\
& + \int_{\mathbb{R}} \gamma_1(t, x_1, y_1, \zeta) r_1(\zeta)\,\nu(d\zeta) + x_2 \int_{\mathbb{R}} \theta_1(\zeta) r_2(\zeta)\,\nu(d\zeta)
\end{aligned}
\tag{6.3.44}
$$
and the corresponding mean-field BSDEs for the adjoint processes become

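$$
\begin{cases}
dp_1(t) = -\dfrac{\partial H}{\partial x_1}(t, X(t), Y(t), w(t), p(t), q(t), r(t))\,dt \\
\qquad\qquad - \dfrac{\partial H}{\partial y_1}(t, X(t), Y(t), w(t), p(t), q(t), r(t))\,\nabla_{X_1(t)} F\,dt \\
\qquad\qquad + q_1(t)\,dB(t) + \displaystyle\int_{\mathbb{R}} r_1(t,\zeta)\,\tilde N(dt, d\zeta) \\
p_1(T) = X_2(T)\dfrac{\partial g_1}{\partial x_1}(X_1(T), Y_1(T)) + E\Big[ \dfrac{\partial g_1}{\partial y_1}(X_1(T), Y_1(T)) \Big]\nabla_{X_1(T)} F
\end{cases}
\tag{6.3.45}
$$
and
$$
\begin{cases}
dp_2(t) = -\Big\{ f_1(t, X_1(t), Y_1(t), u(t)) + \lambda(\theta(t)) + \theta_0(t) q_2(t) + \displaystyle\int_{\mathbb{R}} \theta_1(t,\zeta) r_2(t,\zeta)\,\nu(d\zeta) \Big\}\,dt \\
\qquad\qquad + q_2(t)\,dB(t) + \displaystyle\int_{\mathbb{R}} r_2(t,\zeta)\,\tilde N(dt, d\zeta) \\
p_2(T) = g_1(X_1(T), Y_1(T)).
\end{cases}
$$
Minimizing the Hamiltonian with respect to $\theta = (\theta_0, \theta_1)$ gives the following first-order conditions:
$$
\frac{\partial \lambda}{\partial \theta_0}(t) = -E[q_2(t) \mid \mathcal{E}_t^{(2)}] \tag{6.3.46}
$$
and
$$
\nabla_{\theta_1}\lambda(\theta)(t)(\cdot) = -E\Big[ \int_{\mathbb{R}} r_2(t,\zeta)(\cdot)\,\nu(d\zeta) \,\Big|\, \mathcal{E}_t^{(2)} \Big] \tag{6.3.47}
$$
(as linear operators).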
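The rewriting (6.3.40) of the $Q_\theta$-expectation as a $P$-expectation weighted by $G^\theta$ rests only on the martingale property of $G^\theta$ and on the adaptedness of the running cost. The sketch below checks a discrete-time analogue by enumerating all paths of a symmetric binomial tree. The tree, the constant scenario parameter `theta`, and the functions `running` and `terminal` are illustrative assumptions and do not come from the text.

```python
from itertools import product
from typing import Callable, Tuple


def both_sides_of_change_of_measure(
        theta: float, N: int,
        running: Callable[[int, Tuple[int, ...]], float],
        terminal: Callable[[Tuple[int, ...]], float]) -> Tuple[float, float]:
    """Discrete analogue of (6.3.40) on a binomial tree with N steps.

    G_k = prod_{i<=k} (1 + theta * xi_i) with xi_i = +-1 equally likely under P,
    so G is a P-martingale with G_0 = 1. Returns
    (E_Q[sum_k running_k + terminal], E[sum_k G_k * running_k + G_N * terminal]).
    """
    lhs = 0.0
    rhs = 0.0
    prob = 0.5 ** N
    for path in product((-1, 1), repeat=N):
        G = [1.0]
        for xi in path:
            G.append(G[-1] * (1.0 + theta * xi))
        # running(k, history) may only depend on the first k steps (adaptedness)
        costs = [running(k, path[:k]) for k in range(N)]
        lhs += prob * G[N] * (sum(costs) + terminal(path))
        rhs += prob * (sum(G[k] * costs[k] for k in range(N)) + G[N] * terminal(path))
    return lhs, rhs
```

The two returned numbers coincide because $E[G_N \phi_k] = E[G_k \phi_k]$ whenever $\phi_k$ depends only on the first $k$ steps, which is the discrete counterpart of the argument behind (6.3.40).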

A Special Case

For simplicity, consider the special case with
$$
\mathcal{E}_t^{(i)} = \mathcal{F}_t, \qquad N = 0. \tag{6.3.48}
$$
Then, writing $X_1(t) = X^u(t)$, $Y_1(t) = Y^u(t)$ and $X_2(t) = G^\theta(t)$ and $f_1 = f$, $g_1 = g$, $b_1 = b$, $\sigma_1 = \sigma$, and $\theta_0 = \theta$, the controlled system gets the form
$$
dX^u(t) = b(t, X^u(t), Y^u(t))\,dt + \sigma(t, X^u(t), Y^u(t))\,dB(t); \quad X^u(0) = x \tag{6.3.49}
$$
$$
dG^\theta(t) = G^\theta(t)\theta(t)\,dB(t); \quad G^\theta(0) = 1. \tag{6.3.50}
$$
The performance functional becomes
$$
E_{Q_\theta}[j(u, \theta)] = E\Big[ G^\theta(T) g(X^u(T), Y^u(T)) + \int_0^T G^\theta(t)\{ f(t, X^u(t), Y^u(t), u(t)) + \lambda(\theta(t)) \}\,dt \Big], \tag{6.3.51}
$$

the Hamiltonian becomes
$$
H(t, x, g, y, u, \theta, p, q) = g\{ f(t, x, y, u) + \lambda(\theta) \} + b(t, x, y) p_1 + \sigma(t, x, y) q_1 + g\theta q_2 \tag{6.3.52}
$$
and the corresponding mean-field BSDEs for the adjoint processes become
$$
\begin{cases}
dp_1(t) = -\dfrac{\partial H}{\partial x}(t, X^u(t), Y^u(t), u(t), p(t), q(t))\,dt \\
\qquad\qquad - \dfrac{\partial H}{\partial y}(t, X^u(t), Y^u(t), u(t), p(t), q(t))\,\nabla_{X^u(t)} F\,dt + q_1(t)\,dB(t) \\
p_1(T) = G^\theta(T)\dfrac{\partial g}{\partial x}(X^u(T), Y^u(T)) + E\Big[ \dfrac{\partial g}{\partial y}(X^u(T), Y^u(T)) \Big]\nabla_{X^u(T)} F
\end{cases}
\tag{6.3.53}
$$
and
$$
\begin{cases}
dp_2(t) = -\{ f(t, X^u(t), Y^u(t), u) + \lambda(\theta(t)) + \theta(t) q_2(t) \}\,dt + q_2(t)\,dB(t) \\
p_2(T) = g(X^u(T), Y^u(T)).
\end{cases}
\tag{6.3.54}
$$
Then the first-order condition for a minimum of the Hamiltonian with respect to $\theta$ reduces to
$$
\lambda'(\theta(t)) = -q_2(t). \tag{6.3.55}
$$
In general it is difficult to solve such a coupled system of forward-backward mean-field SDEs. However, in some cases a possible solution procedure can be described, as in the next example.

Example 6.14 (Optimal harvesting under uncertainty) Now we consider a model uncertainty version of the optimal harvesting problem in Sect. 6.3.1(a). For simplicity we put $K = 1$ and $N = 0$. Thus we have the following mean-field forward system, with $X(t) = X^u(t)$,
$$
\begin{cases}
dX(t) = E[X(t)]\, a(t)\,dt + X(t)\sigma(t)\,dB(t) - X(t) u(t)\,dt \\
X(0) = x > 0
\end{cases}
\tag{6.3.56}
$$
and
$$
dG^\theta(t) = G^\theta(t)\theta(t)\,dB(t); \quad G^\theta(0) = 1, \tag{6.3.57}
$$

with performance functional
$$
J(u, \theta) = E\Big[ \int_0^T G^\theta(t)\Big\{ h_0 X(t) u(t) - \frac{1}{2}\rho X^2(t) u^2(t) + \lambda(\theta(t)) \Big\}\,dt + G^\theta(T) X(T) \Big]. \tag{6.3.58}
$$
Hence the Hamiltonian is
$$
H(t, x, g, y, u, \theta, p, q) = g\Big\{ h_0 x u - \frac{1}{2}\rho x^2 u^2 - \lambda(\theta) \Big\} + y a(t) p_1 + x\sigma(t) q_1 + g\theta q_2. \tag{6.3.59}
$$
Minimizing the Hamiltonian with respect to $\theta$ gives the first-order equation
$$
\lambda'(\theta(t)) = -q_2(t). \tag{6.3.60}
$$
The corresponding reflected backward system is
$$
\begin{cases}
dp_1(t) = -[g h_0 u - \rho x u^2 + a(t) p_1(t) + \sigma(t) q_1(t)]\,dt + q_1(t)\,dB(t) \\
p_1(T) = G^\theta(T)
\end{cases}
\tag{6.3.61}
$$
and
$$
\begin{cases}
dp_2(t) = -\Big[ h_0 X(t) u(t) - \dfrac{1}{2}\rho X^2(t) u^2(t) - \lambda(\theta(t)) + \theta(t) q_2(t) \Big]\,dt + q_2(t)\,dB(t) \\
p_2(T) = X(T).
\end{cases}
\tag{6.3.62}
$$
Then we get the following result:

Proposition 6.15 Suppose there exists a solution $\hat X(t) := X^{\hat u}(t)$, $\hat G(t) := G^{\hat\theta}(t)$, $\hat p_1(t), \hat q_1(t), \hat p_2(t), \hat q_2(t), \hat u(t), \hat\theta(t)$ of the coupled system of mean-field forward-backward stochastic differential equations consisting of the forward equations (6.3.56), (6.3.57) and the backward equations (6.3.61), (6.3.62), and satisfying the constraint (6.3.60). Then $\hat u(t)$ is the optimal harvesting strategy and $\hat\theta(t)$ is the optimal scenario parameter for the model uncertainty harvesting problem (6.3.38).

Example 6.16 (Optimal insider consumption under model uncertainty) Suppose we have a cash flow with consumption, modeled by the process $X(t) = X^{c,\mu}(t)$ defined by:

$$\begin{cases} dX(t) = (\alpha(t) + \mu(t) - c(t))X(t)\,dt + \beta(t)X(t)\,dB(t) + \int_{\mathbb{R}} \gamma(t,\zeta)X(t)\,\tilde N(dt,d\zeta) \\ X(0) = x > 0. \end{cases}$$

Here α(t), β(t), γ(t, ζ) are given coefficients, while c(t) > 0 is the relative consumption rate chosen by the consumer (player number 1) and μ(t) is a perturbation of the drift term, representing the model uncertainty chosen by the environment (player number 2). Define the performance functional by

$$J(c,\mu) = E\Big[\int_0^T \big\{\log(c(t)X(t)) + \tfrac{1}{2}\mu^2(t)\big\}\,dt + \theta\log X(T)\Big], \tag{6.3.63}$$

where θ = θ(ω) > 0 is a given $\mathcal{F}_T$-measurable random variable, and $\tfrac{1}{2}\mu^2(t)$ represents a penalty rate, penalizing μ for being away from 0. We assume that c and μ are F-adapted. We want to find $c^* \in \mathcal{A}_1$ and $\mu^* \in \mathcal{A}_2$ such that

$$\sup_{c\in\mathcal{A}_1}\inf_{\mu\in\mathcal{A}_2} J(c,\mu) = J(c^*,\mu^*). \tag{6.3.64}$$

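The Hamiltonian for this problem is

$$H(t,x,y,c,\mu,p,q,r) = \log(cx) + \tfrac{1}{2}\mu^2 + (\alpha(t) + \mu - c)xp + \beta(t)xq + x\int_{\mathbb{R}}\gamma(t,\zeta)r(\zeta)\,d\nu(\zeta) \tag{6.3.65}$$

and the BSDE for the adjoint processes p, q, r is

$$\begin{cases} dp(t) = -\Big[\dfrac{1}{X(t)} + (\alpha(t) + \mu(t) - c(t))p(t) + \beta(t)q(t) + \int_{\mathbb{R}}\gamma(t,\zeta)r(t,\zeta)\,d\nu(\zeta)\Big]dt \\ \qquad\qquad + q(t)\,dB(t) + \int_{\mathbb{R}} r(t,\zeta)\,\tilde N(dt,d\zeta); \quad 0 \le t \le T \\ p(T) = \dfrac{\theta}{X(T)}. \end{cases} \tag{6.3.66}$$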

Define

$$h(t) = p(t)X(t). \tag{6.3.67}$$

Then by the Itô formula we get

$$\begin{aligned} dh(t) ={}& X(t)\Big[-\frac{1}{X(t)} - (\alpha(t)+\mu(t)-c(t))p(t) - \beta(t)q(t) - \int_{\mathbb{R}}\gamma(t,\zeta)r(t,\zeta)\,d\nu(\zeta)\Big]dt \\ &+ p(t)(\alpha(t)+\mu(t)-c(t))X(t)\,dt + p(t)\beta(t)X(t)\,dB(t) + X(t)q(t)\,dB(t) + q(t)\beta(t)X(t)\,dt \\ &+ \int_{\mathbb{R}}\big[(X(t)+\gamma(t,\zeta)X(t))(p(t)+r(t,\zeta)) - p(t)X(t) - p(t)\gamma(t,\zeta)X(t) - X(t)r(t,\zeta)\big]\,d\nu(\zeta)\,dt \\ &+ \int_{\mathbb{R}}\big[(X(t)+\gamma(t,\zeta)X(t))(p(t)+r(t,\zeta)) - p(t)X(t)\big]\,\tilde N(dt,d\zeta) \\ ={}& dF(t) + h(t)\beta(t)\,dB(t) + h(t)\int_{\mathbb{R}}\gamma(t,\zeta)\,\tilde N(dt,d\zeta), \end{aligned} \tag{6.3.68}$$

where

$$dF(t) = -dt + X(t)q(t)\,dB(t) + X(t)\int_{\mathbb{R}} r(t,\zeta)(1+\gamma(t,\zeta))\,\tilde N(dt,d\zeta). \tag{6.3.69}$$

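To simplify this, we define the process k(t) by the equation

$$dk(t) = k(t)\Big[b(t)\,dB(t) + \int_{\mathbb{R}}\lambda(t,\zeta)\,\tilde N(dt,d\zeta)\Big] \tag{6.3.70}$$

for suitable processes b, λ (to be determined). Then again by the Itô formula we get

$$\begin{aligned} d(h(t)k(t)) ={}& h(t)k(t)\Big[b(t)\,dB(t) + \int_{\mathbb{R}}\lambda(t,\zeta)\,\tilde N(dt,d\zeta)\Big] \\ &+ k(t)\Big[dF(t) + h(t)\beta(t)\,dB(t) + h(t)\int_{\mathbb{R}}\gamma(t,\zeta)\,\tilde N(dt,d\zeta)\Big] \\ &+ (h(t)\beta(t) + X(t)q(t))k(t)b(t)\,dt \\ &+ \int_{\mathbb{R}}\big\{h(t)\gamma(t,\zeta) + X(t)r(t,\zeta)(1+\gamma(t,\zeta))\big\}k(t)\lambda(t,\zeta)\,\tilde N(dt,d\zeta) \\ &+ \int_{\mathbb{R}}\big\{h(t)\gamma(t,\zeta) + X(t)r(t,\zeta)(1+\gamma(t,\zeta))\big\}k(t)\lambda(t,\zeta)\,d\nu(\zeta)\,dt. \end{aligned} \tag{6.3.71}$$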

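Define

$$u(t) := h(t)k(t). \tag{6.3.72}$$

Then the equation above can be written

$$\begin{aligned} du(t) ={}& u(t)\Big[\int_{\mathbb{R}}\gamma(t,\zeta)\lambda(t,\zeta)\,d\nu(\zeta)\,dt + \{\beta(t)+b(t)\}\,dB(t) + \beta(t)b(t)\,dt \\ &\qquad + \int_{\mathbb{R}}\{\lambda(t,\zeta) + \gamma(t,\zeta) + \lambda(t,\zeta)\gamma(t,\zeta)\}\,\tilde N(dt,d\zeta)\Big] \\ &+ k(t)\Big[dF(t) + X(t)q(t)b(t)\,dt + \int_{\mathbb{R}} X(t)r(t,\zeta)\lambda(t,\zeta)(1+\gamma(t,\zeta))\,d\nu(\zeta)\,dt \\ &\qquad + \int_{\mathbb{R}} X(t)r(t,\zeta)\lambda(t,\zeta)(1+\gamma(t,\zeta))\,\tilde N(dt,d\zeta)\Big]. \end{aligned} \tag{6.3.73}$$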

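Choose

$$b(t) := -\beta(t), \qquad \lambda(t,\zeta) := -\frac{\gamma(t,\zeta)}{1+\gamma(t,\zeta)}. \tag{6.3.74}$$

Then from (6.3.70) we get

$$\begin{aligned} k(t) = \exp\Big(&-\int_0^t \beta(s)\,dB(s) - \frac{1}{2}\int_0^t \beta^2(s)\,ds - \int_0^t\!\!\int_{\mathbb{R}} \ln(1+\gamma(s,\zeta))\,\tilde N(ds,d\zeta) \\ &+ \int_0^t\!\!\int_{\mathbb{R}} \Big\{\frac{\gamma(s,\zeta)}{1+\gamma(s,\zeta)} - \ln(1+\gamma(s,\zeta))\Big\}\,\nu(d\zeta)\,ds\Big), \end{aligned} \tag{6.3.75}$$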

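and (6.3.73) reduces to

$$du(t) = f(t)\,dt + k(t)X(t)q(t)\,dB(t) + \int_{\mathbb{R}}\big\{X(t)r(t,\zeta)(1+\gamma(t,\zeta))[k(t) + k(t)\lambda(t,\zeta)]\big\}\,\tilde N(dt,d\zeta), \tag{6.3.76}$$

where

$$\begin{aligned} f(t) ={}& -k(t) + u(t)\Big[\int_{\mathbb{R}}\gamma(t,\zeta)\lambda(t,\zeta)\,d\nu(\zeta) + \beta(t)b(t)\Big] \\ &+ k(t)X(t)q(t)b(t) + k(t)\int_{\mathbb{R}} X(t)r(t,\zeta)\lambda(t,\zeta)(1+\gamma(t,\zeta))\,d\nu(\zeta). \end{aligned} \tag{6.3.77}$$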

Now define

$$v(t) := k(t)X(t)q(t), \qquad w(t,\zeta) := k(t)X(t)r(t,\zeta). \tag{6.3.78}$$

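Then from (6.3.71) and (6.3.74) we get the following BSDE in the unknowns u, v, w:

$$\begin{cases} du(t) = \Big[-k(t) - u(t)\Big[\displaystyle\int_{\mathbb{R}}\frac{\gamma^2(t,\zeta)}{1+\gamma(t,\zeta)}\,d\nu(\zeta) + \beta^2(t)\Big] - \beta(t)v(t) - \displaystyle\int_{\mathbb{R}}\gamma(t,\zeta)w(t,\zeta)\,d\nu(\zeta)\Big]dt \\ \qquad\qquad + v(t)\,dB(t) + \displaystyle\int_{\mathbb{R}} w(t,\zeta)\,\tilde N(dt,d\zeta); \quad 0 \le t \le T \\ u(T) = \theta k(T). \end{cases} \tag{6.3.79}$$

This is a linear BSDE which has a unique solution u(t) = p(t)X(t)k(t), v(t), w(t, ζ), where u(t) is given explicitly by (see Theorem 4.8)

$$u(t) = \frac{1}{\Gamma(t)}E\Big[\Gamma(T)\theta k(T) + \int_t^T \Gamma(s)k(s)\,ds \,\Big|\, \mathcal{F}_t\Big], \tag{6.3.80}$$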

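where

$$\begin{aligned} \Gamma(t) = \exp\Big(&\int_0^t \beta(s)\,dB(s) + \frac{1}{2}\int_0^t \beta^2(s)\,ds + \int_0^t\!\!\int_{\mathbb{R}} \ln(1+\gamma(s,\zeta))\,\tilde N(ds,d\zeta) \\ &+ \int_0^t\!\!\int_{\mathbb{R}} \Big\{\ln(1+\gamma(s,\zeta)) - \frac{\gamma(s,\zeta)}{1+\gamma(s,\zeta)}\Big\}\,\nu(d\zeta)\,ds\Big). \end{aligned} \tag{6.3.81}$$

Combining (6.3.75) and (6.3.81) we see that

$$\Gamma(t)k(t) = 1. \tag{6.3.82}$$

Substituted into (6.3.80) this gives

$$u(t) = k(t)\big(E[\theta \mid \mathcal{F}_t] + T - t\big). \tag{6.3.83}$$

In particular, we get

$$p(t)X(t) = \frac{u(t)}{k(t)} = E[\theta \mid \mathcal{F}_t] + T - t. \tag{6.3.84}$$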
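The step from (6.3.80) to (6.3.83) can be spelled out as a short worked computation. It uses only (6.3.80) and the identity (6.3.82), which gives both $1/\Gamma(t) = k(t)$ and $\Gamma(s)k(s) = 1$ for every s:

```latex
\begin{aligned}
u(t) &= \frac{1}{\Gamma(t)}\,
        E\Big[\Gamma(T)k(T)\,\theta + \int_t^T \Gamma(s)k(s)\,ds \,\Big|\, \mathcal{F}_t\Big] \\
     &= k(t)\,E\Big[\theta + \int_t^T 1\,ds \,\Big|\, \mathcal{F}_t\Big] \\
     &= k(t)\big(E[\theta \mid \mathcal{F}_t] + T - t\big).
\end{aligned}
```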

Maximizing H with respect to c gives the first-order equation

$$\frac{1}{c(t)} - X(t)p(t) = 0, \tag{6.3.85}$$

i.e., by (6.3.84),

$$c(t) = \hat c(t) = \frac{1}{X(t)p(t)} = \frac{1}{E[\theta \mid \mathcal{F}_t] + T - t}. \tag{6.3.86}$$

Minimizing H with respect to μ gives the first-order equation

$$\mu(t) + X(t)p(t) = 0, \tag{6.3.87}$$

i.e., by (6.3.84),

$$\mu(t) = \hat\mu(t) = -X(t)p(t) = -\big(E[\theta \mid \mathcal{F}_t] + T - t\big). \tag{6.3.88}$$

We can now verify that $\hat c, \hat\mu$ satisfies all the conditions of the sufficient maximum principle, and hence we conclude the following:

Proposition 6.17 (Optimal Consumption for an Insider under Model Uncertainty) The solution $(c^*, \mu^*)$ of the stochastic differential game (6.3.64) is given by

$$c^*(t) = \frac{1}{E[\theta \mid \mathcal{F}_t] + T - t}; \quad 0 \le t \le T \tag{6.3.89}$$

and

$$\mu^*(t) = -\big(E[\theta \mid \mathcal{F}_t] + T - t\big); \quad 0 \le t \le T. \tag{6.3.90}$$

An interesting, and perhaps surprising, consequence of this result is the following:

Corollary 6.18 Suppose θ is a deterministic constant. Then $c^*(t)$ and $\mu^*(t)$ are also deterministic. In fact, we have

$$c^*(t) = \frac{1}{\theta + T - t}; \quad 0 \le t \le T \tag{6.3.91}$$

and

$$\mu^*(t) = -\big(\theta + T - t\big); \quad 0 \le t \le T. \tag{6.3.92}$$

Remark 6.19 Note that this last result states that if θ is deterministic, then the two players do not need any information about the system to find the optimal respective controls.
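As a small numerical illustration of Corollary 6.18, the following Python sketch evaluates (6.3.91) and (6.3.92) for a deterministic θ. The function name `optimal_controls` and the sample parameter values are our own choices for illustration, not part of the text.

```python
def optimal_controls(theta, T, t):
    """Evaluate (c*(t), mu*(t)) from (6.3.91)-(6.3.92).

    Assumes theta is a deterministic constant > 0 and 0 <= t <= T.
    """
    if theta <= 0:
        raise ValueError("theta must be positive")
    if not 0 <= t <= T:
        raise ValueError("t must lie in [0, T]")
    horizon = theta + T - t          # equals E[theta | F_t] + T - t here
    return 1.0 / horizon, -horizon   # (c*, mu*)


if __name__ == "__main__":
    # Illustrative values only: theta = 2, T = 1.
    for t in (0.0, 0.5, 1.0):
        c_star, mu_star = optimal_controls(theta=2.0, T=1.0, t=t)
        print(f"t={t:.1f}  c*={c_star:.4f}  mu*={mu_star:.4f}")
```

Neither control depends on the state X(t), which is the point of Remark 6.19. Note also that $c^*(t)\mu^*(t) = -1$ for all t, as (6.3.85) and (6.3.87) require.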

6.3.6 The Zero-Sum Game Case

In the zero-sum case we have

$$J_1(u_1,u_2) + J_2(u_1,u_2) = 0. \tag{6.3.93}$$

Then the Nash equilibrium $(\hat u_1, \hat u_2) \in \mathcal{A}_1 \times \mathcal{A}_2$ satisfying (6.3.12)–(6.3.13) becomes a saddle point for

$$J(u_1,u_2) := J_1(u_1,u_2). \tag{6.3.94}$$

To see this, note that (6.3.12)–(6.3.13) imply that

$$J_1(u_1,\hat u_2) \le J_1(\hat u_1,\hat u_2) = -J_2(\hat u_1,\hat u_2) \le -J_2(\hat u_1,u_2)$$

and hence

$$J(u_1,\hat u_2) \le J(\hat u_1,\hat u_2) \le J(\hat u_1,u_2) \quad \text{for all } u_1, u_2.$$

From this we deduce that

$$\inf_{u_2\in\mathcal{A}_2}\sup_{u_1\in\mathcal{A}_1} J(u_1,u_2) \le \sup_{u_1\in\mathcal{A}_1} J(u_1,\hat u_2) \le J(\hat u_1,\hat u_2) \le \inf_{u_2\in\mathcal{A}_2} J(\hat u_1,u_2) \le \sup_{u_1\in\mathcal{A}_1}\inf_{u_2\in\mathcal{A}_2} J(u_1,u_2). \tag{6.3.95}$$

Since we always have inf sup ≥ sup inf, we conclude that

$$\inf_{u_2\in\mathcal{A}_2}\sup_{u_1\in\mathcal{A}_1} J(u_1,u_2) = \sup_{u_1\in\mathcal{A}_1} J(u_1,\hat u_2) = J(\hat u_1,\hat u_2) = \inf_{u_2\in\mathcal{A}_2} J(\hat u_1,u_2) = \sup_{u_1\in\mathcal{A}_1}\inf_{u_2\in\mathcal{A}_2} J(u_1,u_2), \tag{6.3.96}$$
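The chain of inequalities above can be checked by hand in a finite toy setting. The Python sketch below is an illustration only: it replaces the control sets by finite index sets, so that J becomes a payoff matrix with rows indexed by $u_1$ (the maximizer) and columns by $u_2$ (the minimizer). The function names and the two sample matrices are hypothetical.

```python
def upper_value(J):
    """inf over u2 (columns) of sup over u1 (rows) of J[u1][u2]."""
    n_cols = len(J[0])
    return min(max(row[j] for row in J) for j in range(n_cols))


def lower_value(J):
    """sup over u1 (rows) of inf over u2 (columns) of J[u1][u2]."""
    return max(min(row) for row in J)


def is_saddle(J, i, j):
    """True if J[u1][j] <= J[i][j] <= J[i][u2] for all u1, u2."""
    column_max = max(row[j] for row in J)
    row_min = min(J[i])
    return J[i][j] == column_max and J[i][j] == row_min


if __name__ == "__main__":
    with_saddle = [[3, 5], [2, 1]]       # saddle point at (0, 0)
    without_saddle = [[1, -1], [-1, 1]]  # matching-pennies type game
    for J in (with_saddle, without_saddle):
        print(lower_value(J), upper_value(J))
```

When a saddle point exists the two values coincide, as in (6.3.96). Without one, only inf sup ≥ sup inf is guaranteed.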

i.e. $(\hat u_1, \hat u_2) \in \mathcal{A}_1 \times \mathcal{A}_2$ is a saddle point for $J(u_1,u_2)$. Thus we want to find $(u_1^*, u_2^*) \in \mathcal{A}_1 \times \mathcal{A}_2$ such that

$$\sup_{u_1\in\mathcal{A}_1}\Big(\inf_{u_2\in\mathcal{A}_2} J(u_1,u_2)\Big) = \inf_{u_2\in\mathcal{A}_2}\Big(\sup_{u_1\in\mathcal{A}_1} J(u_1,u_2)\Big) = J(u_1^*,u_2^*), \tag{6.3.97}$$

where

$$J(u_1,u_2) = E\Big[\int_0^T f(t,X(t),Y(t),u(t),\omega)\,dt + g(X(T),Y(T),\omega)\Big]. \tag{6.3.98}$$

In this case we see that only one Hamiltonian H is needed, namely

$$H(t,x,y,u_1,u_2,p,q,r) = f(t,x,y,u) + b(t,x,y,u)p + \sigma(t,x,y,u)q + \int_{\mathbb{R}}\gamma(t,x,y,u,\zeta)r(\zeta)\,\nu(d\zeta), \tag{6.3.99}$$

and we have put $g_i = g$; i = 1, 2 and $f_1 = f = -f_2$. Moreover, there is only one triple (p, q, r) of adjoint processes, given by the BSDE

$$\begin{cases} dp(t) = -\dfrac{\partial H}{\partial x}(t,X(t),Y(t),u(t),p(t),q(t),r(t))\,dt \\ \qquad\qquad - \dfrac{\partial H}{\partial y}(t,X(t),Y(t),u(t),p(t),q(t),r(t))\,\nabla_{X(t)}F\,dt \\ \qquad\qquad + q(t)\,dB(t) + \displaystyle\int_{\mathbb{R}} r(t,\zeta)\,\tilde N(dt,d\zeta) \\ p(T) = \dfrac{\partial g}{\partial x}(X(T),Y(T)) + E\Big[\dfrac{\partial g}{\partial y}(X(T),Y(T))\Big]\nabla_{X(T)}F. \end{cases} \tag{6.3.100}$$

We can now state the corresponding sufficient maximum principle for the zero-sum game:

Theorem 6.20 (Sufficient maximum principle for zero-sum mean-field games) Let $(\hat u_1, \hat u_2) \in \mathcal{A}_1 \times \mathcal{A}_2$, with corresponding solutions $\hat X(t), \hat Y(t), \hat p(t), \hat q(t), \hat r(t,\zeta)$. Suppose the following hold:

• The function

$$X, u_1 \mapsto H(t,X,F(X),u_1,\hat u_2(t),\hat p(t),\hat q(t),\hat r(t)) \tag{6.3.101}$$

is concave for all t, the function

$$X, u_2 \mapsto H(t,X,F(X),\hat u_1(t),u_2,\hat p(t),\hat q(t),\hat r(t)) \tag{6.3.102}$$

is convex for all t, and the function

$$X \mapsto g(X,F(X)) \tag{6.3.103}$$

is affine.

• (The conditional maximum property)

$$\operatorname*{ess\,sup}_{v_1\in\mathcal{A}_1} E[H(t,\hat X(t),\hat Y(t),v_1,\hat u_2(t),\hat p(t),\hat q(t),\hat r(t,\cdot)) \mid \mathcal{E}_t^{(1)}] = E[H(t,\hat X(t),\hat Y(t),\hat u_1(t),\hat u_2(t),\hat p(t),\hat q(t),\hat r(t,\cdot)) \mid \mathcal{E}_t^{(1)}] \tag{6.3.104}$$

and

$$\operatorname*{ess\,inf}_{v_2\in\mathcal{A}_2} E[H(t,\hat X(t),\hat Y(t),\hat u_1(t),v_2,\hat p(t),\hat q(t),\hat r(t,\cdot)) \mid \mathcal{E}_t^{(2)}] = E[H(t,\hat X(t),\hat Y(t),\hat u_1(t),\hat u_2(t),\hat p(t),\hat q(t),\hat r(t,\cdot)) \mid \mathcal{E}_t^{(2)}]. \tag{6.3.105}$$

Then $\hat u(t) = (\hat u_1(t), \hat u_2(t))$ is a saddle point for $J(u_1,u_2)$.

Proof The proof is similar to (and simpler than) the proof of Theorem 6.10 and is omitted.


6.3.7 The Single Player Case

If there is only one player, then $u = u_1$ and the problem (6.3.12) reduces to the problem of maximizing the performance functional

$$J(u) = E\Big[\int_0^T f(t,X(t),Y(t),u(t),\omega)\,dt + g(X(T),Y(T),\omega)\Big]. \tag{6.3.106}$$

The corresponding Hamiltonian reduces to

$$H(t,x,y,u,p,q,r) = f(t,x,y,u) + b(t,x,y,u)p + \sigma(t,x,y,u)q + \int_{\mathbb{R}}\gamma(t,x,y,u,\zeta)r(\zeta)\,\nu(d\zeta) \tag{6.3.107}$$

and the associated mean-field BSDE for the adjoint processes becomes

$$\begin{cases} dp(t) = -\dfrac{\partial H}{\partial x}(t,X(t),Y(t),u(t),p(t),q(t),r(t))\,dt \\ \qquad\qquad - \dfrac{\partial H}{\partial y}(t,X(t),Y(t),u(t),p(t),q(t),r(t))\,\nabla_{X(t)}F\,dt \\ \qquad\qquad + q(t)\,dB(t) + \displaystyle\int_{\mathbb{R}} r(t,\zeta)\,\tilde N(dt,d\zeta) \\ p(T) = \dfrac{\partial g}{\partial x}(X(T),Y(T)) + E\Big[\dfrac{\partial g}{\partial y}(X(T),Y(T))\Big]\nabla_{X(T)}F. \end{cases} \tag{6.3.108}$$

The corresponding sufficient maximum principle can therefore be stated as follows:

Theorem 6.21 (Sufficient maximum principle for mean-field control) Let $\hat u \in \mathcal{A}$, with corresponding solutions $\hat X(t), \hat Y(t), \hat p(t), \hat q(t), \hat r(t,\zeta)$. Suppose the following hold:

• The function

$$X, u \mapsto H(t,X,F(X),u,\hat p(t),\hat q(t),\hat r(t)) \tag{6.3.109}$$

is concave for all t.

• (The conditional maximum property)

$$\operatorname*{ess\,sup}_{v\in\mathcal{A}} E[H(t,\hat X(t),\hat Y(t),v,\hat p(t),\hat q(t),\hat r(t,\cdot)) \mid \mathcal{E}_t] = E[H(t,\hat X(t),\hat Y(t),\hat u(t),\hat p(t),\hat q(t),\hat r(t,\cdot)) \mid \mathcal{E}_t]. \tag{6.3.110}$$

Then $\hat u(t)$ is an optimal control for J(u).


Proof The proof is similar to the proof of Theorem 6.10 and is omitted.

Similarly, we obtain a single player version of the necessary maximum principle by reducing Theorem 6.11 in the same way.

Example 6.22 (Return to the optimal harvesting problem) To illustrate our result, we apply it to the optimal harvesting problem (6.3.3), (6.3.4) in Sect. 6.3.1(a). For simplicity we assume that N = 0. Then the Hamiltonian (6.3.107) gets the form

$$H(t,x,y,u,p,q) = h_0 xu - \tfrac{1}{2}\rho x^2u^2 + [ya(t) - xu]p + x\sigma(t)q. \tag{6.3.111}$$

The corresponding BSDE (6.3.108) reduces to

$$\begin{cases} dp(t) = -[h_0u(t) - \rho x(t)u^2(t) + (a(t) - u(t))p(t) + \sigma(t)q(t)]\,dt + q(t)\,dB(t) \\ p(T) = K. \end{cases} \tag{6.3.112}$$

The first-order equation for a maximum of the Hamiltonian as a function of u is

$$\frac{\partial H}{\partial u} = h_0x(t) - \rho x^2(t)u(t) - x(t)p(t) = 0,$$

i.e.

$$u(t) = \hat u(t) = \frac{h_0 - p(t)}{\rho x(t)}. \tag{6.3.113}$$

If we substitute this into (6.3.112), the terms $h_0u - \rho xu^2 - up = u(h_0 - \rho xu - p)$ vanish by (6.3.113), and we get the BSDE

$$\begin{cases} dp(t) = -[a(t)p(t) + \sigma(t)q(t)]\,dt + q(t)\,dB(t) \\ p(T) = K. \end{cases} \tag{6.3.114}$$

This is a linear BSDE with solution (see Theorem 4.8)

$$p(t) = \frac{1}{\Gamma(t)}E[K\Gamma(T) \mid \mathcal{F}_t]; \quad 0 \le t \le T,$$

where

$$d\Gamma(t) = \Gamma(t)[a(t)\,dt + \sigma(t)\,dB(t)]; \quad \Gamma(0) = 1.$$

If, for example, a(t) and K are deterministic, then we can choose q(t) = 0 in (6.3.114) and get

$$p(t) = K\exp\Big(\int_t^T a(s)\,ds\Big).$$

Therefore, if a ≥ 0 and

$$h_0 \ge K\exp\Big(\int_0^T a(s)\,ds\Big),$$

then we get $\hat u(t) \ge 0$. Moreover, if

$$x \ge \int_0^T \frac{h_0 - p(s)}{\rho}\exp\Big(-\int_0^s a(r)\,dr\Big)\,ds,$$

then we can see that the population $\hat X(t) = X^{\hat u}(t)$ corresponding to the control $\hat u$ is non-negative. We conclude that the relative harvesting rate $\hat u$ given by (6.3.113) is admissible and optimal under the above assumptions. We leave the details to the reader (Exercise 6.4).
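To make the deterministic case of Example 6.22 concrete, here is a minimal Python sketch. It assumes a constant growth rate a and a constant K ≥ 0 (both our simplifying assumptions), so that $p(t) = K e^{a(T-t)}$, and it evaluates the total harvesting flow $\hat u(t)X(t) = (h_0 - p(t))/\rho$ implied by (6.3.113). All function names and parameter values are hypothetical.

```python
import math


def adjoint_p(t, T, K, a):
    """p(t) = K * exp(a * (T - t)) for constant a and deterministic K."""
    return K * math.exp(a * (T - t))


def harvest_flow(t, T, K, a, h0, rho):
    """u_hat(t) * X(t) = (h0 - p(t)) / rho, from (6.3.113)."""
    return (h0 - adjoint_p(t, T, K, a)) / rho


def nonnegativity_condition(T, K, a, h0):
    """Sufficient condition from the text: a >= 0 and h0 >= K * exp(a * T)."""
    return a >= 0 and h0 >= K * math.exp(a * T)


if __name__ == "__main__":
    params = dict(T=1.0, K=1.0, a=0.1, h0=2.0, rho=0.5)
    print(nonnegativity_condition(params["T"], params["K"],
                                  params["a"], params["h0"]))
    for t in (0.0, 0.5, 1.0):
        print(t, harvest_flow(t, **params))
```

Since p(t) is largest at t = 0 when a ≥ 0 and K ≥ 0, the condition on $h_0$ guarantees a non-negative harvesting flow on all of [0, T].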

6.4 Exercises

Exercise* 6.1 (Optimal portfolio under model uncertainty) Suppose we have a financial market of the form

$$dS_0(t) = S_0(t)r(t)\,dt; \quad S_0(0) = 1,$$

$$dS_1(t) = S_1(t^-)\Big[\alpha(t)\,dt + \beta(t)\,dB(t) + \int_{\mathbb{R}}\gamma(t,\zeta)\,\tilde N(dt,d\zeta)\Big]; \quad S_1(0) > 0,$$

where r, β and γ > −1 are given deterministic functions. Assume that the mean rate of return of the stock, α(t), is not given a priori, but represents a model uncertainty chosen by the environment. The wealth process $V^{(\alpha,\pi)}(t)$ corresponding to α and to a self-financing portfolio π is given by

$$dV^{(\alpha,\pi)}(t) = V^{(\alpha,\pi)}(t^-)\Big[\{(1-\pi(t))r(t) + \pi(t)\alpha(t)\}\,dt + \pi(t)\beta(t)\,dB(t) + \pi(t)\int_{\mathbb{R}}\gamma(t,\zeta)\,\tilde N(dt,d\zeta)\Big]; \quad s \le t \le T,$$

$$V^{(\alpha,\pi)}(0) = x > 0.$$

If U is a given strictly increasing, $C^2$ utility function on [0, ∞), then the expected utility of the terminal wealth is

$$J^{(\alpha,\pi)}(s,x) = E\big[U\big(V^{(\alpha,\pi)}(T)\big)\big].$$

The trader is convinced that “the world is against her” and therefore she wants to maximize $J^{(\alpha,\pi)}(s,x)$ in the worst possible case regarding α (“worst case scenario”).

This leads to the zero-sum game to find $(\alpha^*, \pi^*)$ and Φ(s, x) (the value function) such that

$$\Phi(s,x) = \sup_{\pi\in\mathcal{A}_2}\Big(\inf_{\alpha\in\mathcal{A}_1} J^{(\alpha,\pi)}(s,x)\Big) = \inf_{\alpha\in\mathcal{A}_1}\Big(\sup_{\pi\in\mathcal{A}_2} J^{(\alpha,\pi)}(s,x)\Big),$$

where $\mathcal{A}_1$ and $\mathcal{A}_2$ denote the usual sets of admissible α and π, respectively. To solve this problem, we first put it into the framework of Theorem 4.16 by introducing $Y = (Y_0, Y_1)$, where

$$dY_0(t) = dt; \quad Y_0(0) = y_0 = s \in \mathbb{R}$$

and

$$dY_1(t) = dV^{(\alpha,\pi)}(t) = Y_1(t^-)\Big[\{(1-\pi(t))r(t) + \pi(t)\alpha(t)\}\,dt + \pi(t)\beta(t)\,dB(t) + \pi(t)\int_{\mathbb{R}}\gamma(t,\zeta)\,\tilde N(dt,d\zeta)\Big],$$

$$Y_1(0) = y_1 = x > 0.$$

(i) Write down the generator $A^{\alpha,\pi}$ of Y.

(ii) Write down the HJBI equation for the value function Φ.

(iii) To find the value function Φ, we guess that it has the form

$$\hat\varphi(s,x) = U\Big(x\exp\Big(\int_s^T r(t)\,dt\Big)\Big); \quad (s,x) \in (-\infty,T)\times(0,\infty).$$

Compute $A^{\alpha,\pi}\hat\varphi(s,x)$.

(iv) Find a saddle point $(\hat\alpha, \hat\pi)$ for the function $(\alpha,\pi) \mapsto A^{\alpha,\pi}\hat\varphi(s,x)$ by requiring that

$$\frac{\partial^2}{\partial\pi\,\partial\alpha}A^{\alpha,\pi}\hat\varphi(s,x)\Big|_{\alpha=\hat\alpha,\,\pi=\hat\pi} = 0.$$

(v) Verify that $(\hat\alpha, \hat\pi)$ found in (iv) satisfies all the conditions of Theorem 4.16, and conclude that $(\hat\alpha, \hat\pi) = (\alpha^*, \pi^*)$ is indeed a Nash equilibrium and that

$$\Phi(s,x) = \hat\varphi(s,x) = U\Big(x\exp\Big(\int_s^T r(t)\,dt\Big)\Big); \quad (s,x) \in (-\infty,T]\times(0,\infty).$$

Exercise* 6.2 Consider the following jump extension of Example 5.17, with the FBSDE in $(X^\pi, Y^\pi)$ given by

$$\begin{cases} dX^\pi(t) = \pi(t)X^\pi(t^-)\Big[b(t)\,dt + \sigma(t)\,dB(t) + \displaystyle\int_{\mathbb{R}}\gamma(t,\zeta)\,\tilde N(dt,d\zeta)\Big] \\ X^\pi(0) = x > 0 \end{cases}$$

and

$$\begin{cases} dY^\pi(t) = Z^\pi(t)\,dB(t) + \displaystyle\int_{\mathbb{R}} K^\pi(t,\zeta)\,\tilde N(dt,d\zeta) \\ Y^\pi(T) = \ln X^\pi(T). \end{cases}$$

Here π denotes the fraction of the wealth $X^\pi(t)$ invested in the risky asset. Assuming that the conditions of Theorem 5.16 hold, use this theorem to find $\pi^* \in \mathcal{A}$ and $Y^{\pi^*}(0)$ such that

$$\sup_{\pi\in\mathcal{A}} E[\ln(X^\pi(T))] = E[\ln(X^{\pi^*}(T))],$$

i.e.

$$\sup_{\pi\in\mathcal{A}} Y^\pi(0) = Y^{\pi^*}(0) = y(0,x).$$

We assume that the coefficients b(t), σ(t) and γ(t, ζ) > −1 are deterministic and that $X^\pi(t) > 0$ for all t ∈ [0, T] and all $\pi \in \mathcal{A}$.

Exercise* 6.3 (Optimal insider portfolio under model uncertainty) Consider a financial market with two investment possibilities:

(i) A risk free investment possibility with unit price $S_0(t) = 1$ for all t ∈ [0, T]

(ii) A risky investment, where the unit price S(t) is modeled by the SDE

$$dS(t) = S(t)[(\alpha(t) + \mu(t))\,dt + \beta(t)\,dB(t)]; \quad S(0) > 0. \tag{6.4.1}$$

Here α(t), β(t) are given F-adapted coefficients, while μ(t) is a perturbation of the drift term, representing the model uncertainty chosen by the environment (player number 2). Suppose the wealth process $X(t) = X^{\pi,\mu}(t)$ associated to an insider portfolio π(t) (representing the fraction of the wealth invested in the risky asset) is given by:

$$\begin{cases} dX(t) = \pi(t)X(t)[(\alpha(t) + \mu(t))\,dt + \beta(t)\,dB(t)] \\ X(0) = x > 0. \end{cases} \tag{6.4.2}$$

Define the performance functional by

$$J(\pi,\mu) = E\Big[\int_0^T \tfrac{1}{2}\mu^2(t)\,dt + \theta\log X(T)\Big], \tag{6.4.3}$$

where θ > 0 is a given (deterministic) constant, and $\tfrac{1}{2}\mu^2(t)$ represents a penalty rate, penalizing μ for being away from 0. We assume that π and μ are F-adapted. Let $\mathcal{A}_1, \mathcal{A}_2$ be given families of admissible controls for π and μ, respectively. For π, μ to be admissible we require the usual integrability and adaptedness conditions. Moreover, we require that the corresponding $X^{\pi,\mu}(t) > 0$ for all t. We want to find $\pi^* \in \mathcal{A}_1$ and $\mu^* \in \mathcal{A}_2$ such that

$$\sup_{\pi\in\mathcal{A}_1}\inf_{\mu\in\mathcal{A}_2} J(\pi,\mu) = \inf_{\mu\in\mathcal{A}_2}\sup_{\pi\in\mathcal{A}_1} J(\pi,\mu) = J(\pi^*,\mu^*). \tag{6.4.4}$$

(i) Write down the Hamiltonian H(t, x, π, μ, p, q) for this problem.

(ii) Write down the BSDE for the adjoint processes p, q.

(iii) Show that the first-order equation for maximizing H with respect to π is

$$X(t)[(\alpha(t) + \mu(t))p(t) + \beta(t)q(t)] = 0. \tag{6.4.5}$$

(iv) Combine this with (ii) above to obtain that

$$\begin{cases} dp(t) = -\dfrac{\alpha(t)+\mu(t)}{\beta(t)}\,p(t)\,dB(t) \\ p(T) = \dfrac{\theta}{X(T)}. \end{cases} \tag{6.4.6}$$

(v) Define

$$h(t) = p(t)X(t). \tag{6.4.7}$$

Use the Itô formula to show that

$$\begin{cases} dh(t) = \Big(\pi(t)\beta(t) - \dfrac{\alpha(t)+\mu(t)}{\beta(t)}\Big)h(t)\,dB(t) \\ h(T) = p(T)X(T) = \theta. \end{cases} \tag{6.4.8}$$

(vi) Explain why this BSDE has the unique solution

$$h(t) = \theta; \quad 0 \le t \le T \tag{6.4.9}$$

and deduce the following expression for our candidate $\hat\pi(t)$ for the optimal portfolio

$$\hat\pi(t) = \frac{\alpha(t) + \hat\mu(t)}{\beta^2(t)}, \tag{6.4.10}$$

where $\hat\mu(t)$ is the corresponding candidate for the optimal perturbation.

(vii) Write the first-order equation for the optimal $\hat\mu$ obtained by minimizing H with respect to μ.

(viii) Combine this with the result in (vi) to conclude the following:

Proposition 6.23 (Optimal Portfolio under Model Uncertainty) The saddle point $(\pi^*(t), \mu^*(t))$ of the stochastic differential game (6.4.2) is given by

$$\pi^*(t) = \frac{\alpha(t)}{\beta^2(t) + \theta}, \tag{6.4.11}$$

and

$$\mu^*(t) = -\frac{\alpha(t)\theta}{\beta^2(t) + \theta}. \tag{6.4.12}$$
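A quick consistency check on Proposition 6.23 can be coded as follows. The Python sketch evaluates (6.4.11) and (6.4.12) at a fixed time for given numbers α, β, θ and computes two residuals: the one for (6.4.10), and the one for the relation μ* = −θπ*, which follows directly from the two formulas. Reading that relation as the first-order condition of step (vii) with h(t) = θ is our assumption, not a statement from the text. The function names are hypothetical.

```python
def saddle_point(alpha, beta, theta):
    """Evaluate (pi*, mu*) from (6.4.11)-(6.4.12) at a fixed time."""
    denom = beta ** 2 + theta
    if denom == 0:
        raise ValueError("beta**2 + theta must be nonzero")
    return alpha / denom, -alpha * theta / denom


def residuals(alpha, beta, theta):
    """Residuals of (6.4.10) and of the relation mu* = -theta * pi*."""
    pi_star, mu_star = saddle_point(alpha, beta, theta)
    res_pi = pi_star - (alpha + mu_star) / beta ** 2   # (6.4.10)
    res_mu = mu_star + theta * pi_star                 # assumed form of (vii)
    return res_pi, res_mu


if __name__ == "__main__":
    # Illustrative numbers only.
    print(saddle_point(alpha=0.1, beta=0.2, theta=2.0))
    print(residuals(alpha=0.1, beta=0.2, theta=2.0))
```

Both residuals vanish up to rounding, so the two formulas of the proposition are mutually consistent with (6.4.10).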

Exercise* 6.4 (Optimal Mean-Field Control) Complete the details of Example 6.22.

Chapter 7

Combined Optimal Stopping and Stochastic Control of Jump Diffusions

© Springer Nature Switzerland AG 2019 B. Øksendal and A. Sulem, Applied Stochastic Control of Jump Diffusions, Universitext, https://doi.org/10.1007/978-3-030-02781-0_7

7.1 Introduction

In this chapter we discuss combined optimal stopping and stochastic control problems and their associated Hamilton–Jacobi–Bellman (HJB) variational inequalities. This is a subject which deserves to be better known because of its many applications. A thorough treatment of such problems (but without the associated HJB variational inequalities) can be found in Krylov [K]. This chapter may also serve as a brief review of the theory of optimal stopping and its variational inequalities on one hand, and the theory of stochastic control and its HJB equations on the other. An introduction to these topics separately can be found in [Ø1]. As an illustration of how combined optimal stopping and stochastic control problems may appear in economics, let us consider the following example, which is an extension of Exercise 3.2.

Example 7.1 (An Optimal Resource Extraction Control and Stopping Problem) Suppose the price $P_t = P(t)$ of one unit of a resource (e.g., gas or oil) at time t is varying like a geometric Lévy process, i.e.,

$$dP(t) = P(t^-)\Big[\alpha\,dt + \beta\,dB(t) + \gamma\int_{\mathbb{R}} z\,\bar N(dt,dz)\Big], \quad P_0 = p \ge 0, \tag{7.1.1}$$

where α, β ≠ 0, γ are constants and γz ≥ 0 a.s. ν. Let $Q_t$ denote the amount of remaining resources at time t. If we extract the resources at the “intensity” $u_t = u_t(\omega) \in [0, m]$ at time t, then the dynamics of $Q_t$ is

$$dQ_t = -u_tQ_t\,dt, \quad Q_0 = q \ge 0. \tag{7.1.2}$$

(m is a constant giving the maximal intensity.) We assume as before that our control $u_t(\omega)$ is adapted to the filtration $\mathbb{F} = \{\mathcal{F}_t\}_{t\ge 0}$. If the running cost is given by $K_0 + K_1u_t$ (with $K_0, K_1 \ge 0$ constants) as long as the field is open and if we decide to stop the extraction for good at time τ(ω) ≥ 0, let us assume that the expected total discounted profit is

$$J^{(u,\tau)}(s,p,q) = E^{(p,q)}\Big[\int_0^\tau e^{-\rho(s+t)}\big(u_t(P_tQ_t - K_1) - K_0\big)\,dt + e^{-\rho(s+\tau)}(\theta P_\tau Q_\tau - a)\Big], \tag{7.1.3}$$

where ρ > 0 is the discounting exponent and θ > 0, a ≥ 0 are constants. Thus $e^{-\rho(s+t)}\big(u_t(P_tQ_t - K_1) - K_0\big)$ gives the discounted net profit rate when the field is in operation, while $e^{-\rho(s+\tau)}(\theta P_\tau Q_\tau - a)$ gives the discounted net value of the remaining resources at time τ. (We may interpret a ≥ 0 as a transaction cost.) We assume that the closing time τ is a stopping time with respect to the filtration $\{\mathcal{F}_t\}_{t\ge 0}$, i.e., that $\{\omega; \tau(\omega) \le t\} \in \mathcal{F}_t$ for all t. Thus both the extraction intensity $u_t$ and the decision whether to close before or at time t must be based on the information $\mathcal{F}_t$ only, not on any future information. The problem is to find the value function Φ(s, p, q) and the optimal control $u_t^* \in [0, m]$ and the optimal stopping time $\tau^*$ such that

$$\Phi(s,p,q) = \sup_{u_t,\tau} J^{(u,\tau)}(s,p,q) = J^{(u^*,\tau^*)}(s,p,q). \tag{7.1.4}$$

This problem is an example of a combined optimal stopping and stochastic control problem. It is a modification of a problem discussed in [BØ1, DZ]. We will return to this and other examples after presenting a general theory for problems of this type.

7.2 A General Mathematical Formulation

Consider a controlled stochastic system of the same type as in Chap. 5, where the state $Y^{(u)}(t) = Y(t) \in \mathbb{R}^k$ at time t is given by

$$dY(t) = b(Y(t),u(t))\,dt + \sigma(Y(t),u(t))\,dB(t) + \int_{\mathbb{R}^k}\gamma(Y(t^-),u(t^-),z)\,\bar N(dt,dz), \quad Y(0) = y \in \mathbb{R}^k. \tag{7.2.1}$$

Here $b : \mathbb{R}^k \times U \to \mathbb{R}^k$, $\sigma : \mathbb{R}^k \times U \to \mathbb{R}^{k\times m}$, and $\gamma : \mathbb{R}^k \times U \times \mathbb{R}^k \to \mathbb{R}^{k\times\ell}$ are given continuous functions and u(t) = u(t, ω) is our control, assumed to be $\mathcal{F}_t$-adapted and with values in a given closed, convex set $U \subset \mathbb{R}^p$. Associated to a control u = u(t, ω) and an F-stopping time τ = τ(ω) belonging to a given set T of admissible stopping times, we assume there is a performance


criterion of the form J (u,τ ) (y) = E y



τ

 f (Y (t), u(t))dt + g(Y (τ ))χ{τ 0; Y (u) (t) ∈ − The family {g − (Y (u) (τ )); τ ∈ T } is uniformlyP y -integrable for all y ∈ S, where g − (y) = max(0, −g(y)). (7.2.4) We interpret g(Y (τ (ω))) as 0 if τ (ω) = ∞. Here, and in the following, E y denotes expectation with respect to P when Y (0) = y and S ⊂ Rk is a fixed Borel set such that S ⊂ S 0. We can think of S as the “universe” or “solvency set” of our system, in the sense that we are only interested in the system up to time T , which may be interpreted as the time of bankruptcy. We now consider the following combined optimal stopping and control problem. Let T be the set of IF t -stopping times τ ≤ τS . Find Φ(y) and u ∗ ∈ U, τ ∗ ∈ T such that ∗ ∗ (7.2.5) Φ(y) = sup{J (u,τ ) (y); u ∈ U, τ ∈ T } = J (u ,τ ) (y). We will prove a verification theorem for this problem. The theorem can be regarded as a combination of the variational inequalities for optimal stopping (Theorem 3.2) and the HJB equation for stochastic control (Theorem 5.1). We say that the control u is Markov or Markovian if it has the form u(t) = u 0 (Y (t)) for some function u 0 : S¯ → U . If this is the case we usually do not distinguish notationally between u and u 0 and write (with abuse of notation) u(t) = u(Y (t)). If u ∈ U is Markovian then Y (u) (t) is a Markov process whose generator coincides on C02 (Rk ) with the differential operator L = L u defined for y ∈ Rk by

$$\begin{aligned} L^u\psi(y) ={}& \sum_{i=1}^k b_i(y,u(y))\frac{\partial\psi}{\partial y_i} + \frac{1}{2}\sum_{i,j=1}^k (\sigma\sigma^T)_{ij}(y,u(y))\frac{\partial^2\psi}{\partial y_i\,\partial y_j} \\ &+ \sum_{j=1}^{\ell}\int_{\mathbb{R}}\big\{\psi(y + \gamma^{(j)}(y,u(y),z_j)) - \psi(y) - \nabla\psi(y)\cdot\gamma^{(j)}(y,u(y),z_j)\big\}\,\nu_j(dz_j) \end{aligned} \tag{7.2.6}$$

for all functions $\psi : \mathbb{R}^k \to \mathbb{R}$ which are twice differentiable at y. Typically the value function Φ will be $C^2$ outside the boundary ∂D of the continuation region D (see (ii) below) and it will satisfy an HJB equation in D and an HJB inequality outside $\bar D$. Across ∂D the function Φ will not be $C^2$, but it will usually be $C^1$, and this feature is often referred to as the “high contact” – or “smooth fit” – principle. This is the background for the verification theorem given below (Theorem 7.2). Note, however, that there are cases when Φ is not even $C^1$ at ∂D. To handle such cases one can use a verification theorem based on the viscosity solution concept. See Chap. 12 and in particular Sect. 12.2.

Theorem 7.2 (HJB-Variational Inequalities for Optimal Stopping and Control)

(a) Suppose we can find a function $\varphi : \bar S \to \mathbb{R}$ such that

(i) $\varphi \in C^1(S^0) \cap C(\bar S)$

(ii) $\varphi \ge g$ on $S^0$

Define D = {y ∈ S; φ(y) > g(y)} (the continuation region).

Suppose $Y^{(u)}(t)$ spends 0 time on ∂D a.s., i.e.,

(iii) $E^y\Big[\int_0^{\tau_S}\chi_{\partial D}(Y^{(u)}(t))\,dt\Big] = 0$ for all y ∈ S, u ∈ U

and suppose that

(iv) ∂D is a Lipschitz surface

(v) $\varphi \in C^2(S^0 \setminus \partial D)$ and the second-order derivatives of φ are locally bounded near ∂D

(vi) $L^v\varphi(y) + f(y,v) \le 0$ on $S^0 \setminus \partial D$ for all v ∈ U

(vii) φ(y) = g(y) for all y ∉ S.

(viii) $E^y\Big[|\varphi(Y^{(u)}(\tau))| + \int_0^{\tau_S}|L^u\varphi(Y^{(u)}(t))|\,dt\Big] < \infty$ for all u ∈ U, τ ∈ T.

Then φ(y) ≥ Φ(y) for all y ∈ S.

(b) Suppose, in addition to (i)–(viii) above, that

(ix) for each y ∈ D there exists $\hat u(y) \in U$ such that $L^{\hat u(y)}\varphi(y) + f(y,\hat u(y)) = 0$,

(x) $\tau_D := \inf\{t > 0;\ Y^{(\hat u)}(t) \notin D\} < \infty$ a.s. for all y ∈ S,

and

(xi) the family $\{\varphi(Y^{(\hat u)}(\tau));\ \tau \in T\}$ is uniformly integrable with respect to $P^y$ for all y ∈ D.

Suppose $\hat u \in U$. Then φ(y) = Φ(y) for all y ∈ S. Moreover, $u^* := \hat u$ and $\tau^* := \tau_D$ are optimal control and stopping times, respectively.

Proof The proof is a synthesis of the proofs of Theorems 3.2 and 5.1. For completeness we give some details:

(a) By Theorem 3.1 we may assume that $\varphi \in C^2(S^0) \cap C(\bar S)$. Choose u ∈ U and put $Y(t) = Y^{(u)}(t)$. Let $\tau \le \tau_S$ be a stopping time. Then by Dynkin’s formula (Theorem 1.24) we have, for m = 1, 2, . . . ,

$$E^y[\varphi(Y(\tau\wedge m))] = \varphi(y) + E^y\Big[\int_0^{\tau\wedge m} L^u\varphi(Y(t))\,dt\Big]. \tag{7.2.7}$$

 τ ∧m  ϕ(y) = lim E y −L u ϕ(Y (t))dt + ϕ(Y (τ ∧ m)) m→∞ 0  τ  y −L u ϕ(Y (t))dt + g(Y (τ ))χ{τ θ w − a for all w > w0 ,

(7.3.6) (7.3.7)

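$$-\rho F(w) + (\alpha - v)wF'(w) + \tfrac{1}{2}\beta^2w^2F''(w) + \int_{\mathbb{R}}\{F(w + \gamma zw) - F(w) - F'(w)\gamma zw\}\,\nu(dz) + v(w - K_1) - K_0 \le 0 \quad \text{for all } w < w_0,\ v \in [0,m], \tag{7.3.8}$$

$$\sup_{v\in[0,m]}\Big[-\rho F(w) + (\alpha - v)wF'(w) + \tfrac{1}{2}\beta^2w^2F''(w) + \int_{\mathbb{R}}\{F(w + \gamma zw) - F(w) - F'(w)\gamma zw\}\,\nu(dz) + v(w - K_1) - K_0\Big] = 0 \quad \text{for all } w > w_0. \tag{7.3.9}$$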

From (7.3.9) and (xi) of Theorem 7.2 we get the following candidate $\hat u$ for the optimal control:

$$v = \hat u(w) = \operatorname*{Argmax}_{v\in[0,m]}\big\{v\big(w(1 - F'(w)) - K_1\big)\big\} = \begin{cases} m & \text{if } F'(w) < 1 - (K_1/w) \\ 0 & \text{if } F'(w) > 1 - (K_1/w). \end{cases} \tag{7.3.10}$$

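Let $F_m(w)$ be the solution of (7.3.9) with v = m, i.e., the solution of

$$-\rho F_m(w) + (\alpha - m)wF_m'(w) + \tfrac{1}{2}\beta^2w^2F_m''(w) + \int_{\mathbb{R}}\{F_m(w + \gamma zw) - F_m(w) - F_m'(w)\gamma zw\}\,\nu(dz) = K_0 + mK_1 - mw. \tag{7.3.11}$$

A solution of (7.3.11) is

$$F_m(w) = C_1w^{\lambda_1} + C_2w^{\lambda_2} + \frac{mw}{\rho + m - \alpha} - \frac{K_0 + mK_1}{\rho}, \tag{7.3.12}$$

where $C_1, C_2$ are constants and $\lambda_1 > 0$, $\lambda_2 < 0$ are roots of the equation

$$h(\lambda) = 0 \tag{7.3.13}$$

with

$$h(\lambda) = -\rho + (\alpha - m)\lambda + \tfrac{1}{2}\beta^2\lambda(\lambda - 1) + \int_{\mathbb{R}}\{(1 + \gamma z)^\lambda - 1 - \lambda\gamma z\}\,\nu(dz). \tag{7.3.14}$$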

7.3 Applications


(Note that h(0) = −ρ < 0 and lim|λ|→∞ h(λ) = ∞.) The solution will depend on the relation between the parameters involved and we will not give a complete discussion, but only consider some special cases. Case 1 Let us assume that α ≤ ρ,

K 1 = a = 0,

It is easy to see that

and

0 1 ⇐⇒ ρ + m > α.

(7.3.16)

Let us try (guess) that

$$C_1 = 0 \tag{7.3.17}$$

and that the continuation region D = {(s, p, q); pq > $w_0$} is such that (see (7.3.10))

$$F_m'(w) < 1 \quad \text{for all } w > w_0. \tag{7.3.18}$$

The intuitive motivation for trying this is the belief that it is optimal to use the maximal extraction intensity m all the time until closure, at least if θ is small enough. These guesses lead to the following candidate for the value function F(w):

$$F(w) = \begin{cases} \theta w & \text{if } 0 \le w \le w_0 \\ F_m(w) = C_2w^{\lambda_2} + \dfrac{mw}{\rho + m - \alpha} - \dfrac{K_0}{\rho} & \text{if } w > w_0. \end{cases} \tag{7.3.19}$$

We now use continuity and differentiability at $w = w_0$ to determine $w_0$ and $C_2$:

$$\text{(Continuity)} \quad C_2w_0^{\lambda_2} + \frac{mw_0}{\rho + m - \alpha} - \frac{K_0}{\rho} = \theta w_0, \tag{7.3.20}$$

$$\text{(Differentiability)} \quad C_2\lambda_2w_0^{\lambda_2 - 1} + \frac{m}{\rho + m - \alpha} = \theta. \tag{7.3.21}$$

Easy calculations show that the unique solution of (7.3.20) and (7.3.21) is

$$w_0 = \frac{(-\lambda_2)K_0(\rho + m - \alpha)}{(1 - \lambda_2)\rho[m - \theta(\rho + m - \alpha)]} \quad (> 0 \text{ by } (7.3.15)) \tag{7.3.22}$$

and

$$C_2 = \frac{[m - \theta(\rho + m - \alpha)]w_0^{1 - \lambda_2}}{(-\lambda_2)(\rho + m - \alpha)} \quad (> 0 \text{ by } (7.3.15)). \tag{7.3.23}$$
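The smooth-fit computation can be checked numerically. The Python sketch below treats only the special case without jumps (ν = 0, our simplifying assumption), where h(λ) in (7.3.14) is a quadratic and $\lambda_2$ is its negative root. It then evaluates (7.3.22) and (7.3.23) and lets one confirm (7.3.20) and (7.3.21). Function names and parameter values are hypothetical; the code requires m − θ(ρ + m − α) > 0 and ρ + m − α > 0, which is what makes $w_0$ and $C_2$ positive in the formulas above.

```python
import math


def negative_root(alpha, beta, rho, m):
    """Negative root of h(lam) = -rho + (alpha-m)*lam + 0.5*beta^2*lam*(lam-1),
    i.e. (7.3.14) with nu = 0 (no jumps; an assumption of this sketch)."""
    a2 = 0.5 * beta ** 2
    a1 = alpha - m - a2
    return (-a1 - math.sqrt(a1 ** 2 + 4.0 * a2 * rho)) / (2.0 * a2)


def smooth_fit(alpha, beta, rho, m, theta, K0):
    """Return (lam2, w0, C2) from (7.3.22)-(7.3.23)."""
    d = rho + m - alpha
    gap = m - theta * d
    if d <= 0 or gap <= 0:
        raise ValueError("need rho + m - alpha > 0 and m > theta*(rho + m - alpha)")
    lam2 = negative_root(alpha, beta, rho, m)
    w0 = (-lam2) * K0 * d / ((1.0 - lam2) * rho * gap)
    C2 = gap * w0 ** (1.0 - lam2) / ((-lam2) * d)
    return lam2, w0, C2


def F_m(w, lam2, C2, alpha, rho, m, K0):
    """Candidate (7.3.19) on w > w0 (case K1 = 0, C1 = 0)."""
    return C2 * w ** lam2 + m * w / (rho + m - alpha) - K0 / rho


def dF_m(w, lam2, C2, alpha, rho, m):
    """Derivative of F_m with respect to w."""
    return C2 * lam2 * w ** (lam2 - 1.0) + m / (rho + m - alpha)


if __name__ == "__main__":
    alpha, beta, rho, m, theta, K0 = 0.03, 0.2, 0.05, 1.0, 0.5, 1.0
    lam2, w0, C2 = smooth_fit(alpha, beta, rho, m, theta, K0)
    print(lam2, w0, C2)
    print(F_m(w0, lam2, C2, alpha, rho, m, K0) - theta * w0)   # continuity gap
    print(dF_m(w0, lam2, C2, alpha, rho, m) - theta)           # derivative gap
```

Both printed gaps should be zero up to rounding, which is exactly the content of (7.3.20) and (7.3.21).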

It remains to verify that with these values of $w_0$ and $C_2$ the set D = {(s, p, q); pq > $w_0$} and the function F(w) given by (7.3.19) satisfy (7.3.6)–(7.3.9), as well as all the other conditions of Theorem 7.2.

To verify (7.3.6) we have to check that (7.3.18) holds, i.e., that

$$F_m'(w) = C_2\lambda_2w^{\lambda_2 - 1} + \frac{m}{\rho + m - \alpha} < 1 \quad \text{for all } w > w_0.$$

Since $\lambda_2 < 0$ and we have assumed α ≤ ρ (in (7.3.15)) this is clear. So (7.3.9) holds. If we substitute F(w) = θw in (7.3.8) we get

$$-\rho\theta w + (\alpha - m)w\theta + mw - K_0 = w[m - \theta(\rho + m - \alpha)] - K_0.$$

We know that this is 0 for $w = w_0$ by (7.3.20) and (7.3.21). Hence it is less than 0 for $w < w_0$. So (7.3.8) holds. Condition (7.3.6) holds by definition of D and F. Finally, since $F(w_0) = \theta w_0$, $F'(w_0) = \theta$, and

$$F''(w) = F_m''(w) = C_2\lambda_2(\lambda_2 - 1)w^{\lambda_2 - 2} > 0$$

we must have F(w) > θw for $w > w_0$. Hence (7.3.7) holds. Similarly one can verify all the other conditions of Theorem 7.2. We have proved:

Theorem 7.5 Suppose (7.3.15) holds. Then the optimal strategy $(u^*, \tau^*)$ for problems (7.1.3) and (7.1.4) is

$$u^* = m, \quad \tau^* = \inf\{t > 0;\ P_tQ_t \le w_0\}, \tag{7.3.24}$$

where $w_0$ is given by (7.3.22). The corresponding value function is

$$\Phi(s,p,q) = e^{-\rho s}F(p\cdot q),$$

where F is given by (7.3.19) with $\lambda_2 < 0$ as in (7.3.11) and $C_2 > 0$ as in (7.3.23).

For other values of the parameters it might be optimal not to produce at all but just wait for the best closing/sellout time. For example, we mention without proof the following cases (see Exercise 7.2):

Case 2 Assume that

$$\theta = 1 \text{ and } \rho \le \alpha. \tag{7.3.25}$$

Then $u^* = 0$ and Φ = ∞.

Case 3 Assume that

$$\theta = 1,\ \rho > \alpha, \text{ and } K_0 < \rho a < K_0 + \rho K_1. \tag{7.3.26}$$

Then $u^* = 0$ and $\tau^* = \inf\{t > 0;\ P_tQ_t \ge w_1\}$, for some $w_1 > 0$.

7.4 Exercises

Exercise* 7.1 (a) Solve the following stochastic control problem

$$\Phi(s,x) = \sup_{u(t)\ge 0} J^{(u)}(s,x) = J^{(u^*)}(s,x),$$

where

$$J^{(u)}(s,x) = E^x\Big[\int_0^{\tau_S} e^{-\delta(s+t)}\frac{u^\gamma(t)}{\gamma}\,dt\Big].$$

Here $\tau_S = \tau_S(\omega) = \inf\{t > 0;\ X(t) \le 0\}$ (the time of bankruptcy) and

$$dX(t) = (\mu X(t) - u(t))\,dt + \sigma X(t)\,dB(t) + \theta X(t^-)\int_{\mathbb{R}} z\,\bar N(dt,dz), \quad X_0 = x > 0$$

with γ ∈ (0, 1), δ > 0, μ, σ ≠ 0, θ constants, θz > −1 a.s. ν. The interpretation of this is the following. X(t) represents the total wealth at time t, u(t) = u(t, ω) ≥ 0 represents the chosen consumption rate (the control). We want to find the consumption rate $u^*(t)$ which maximizes the expected total discounted utility of the consumption up to the time of bankruptcy, $\tau_S$.

[Hint: Try a value function of the form $\varphi(s,x) = Ke^{-\delta s}x^\gamma$ for a suitable value of the constant K.]

(b) Consider the following combined stochastic control and optimal stopping problem

$$\Phi(s,x) = \sup_{u,\tau} J^{(u,\tau)}(s,x) = J^{(u^*,\tau^*)}(s,x),$$

where

$$J^{(u,\tau)}(s,x) = E^x\Big[\int_0^\tau e^{-\delta(s+t)}\frac{u^\gamma(t)}{\gamma}\,dt + \lambda e^{-\delta(s+\tau)}X^\gamma(\tau)\Big]$$

with X(t) as in (a), λ > 0 a given constant.


Now the supremum is taken over all F-adapted controls u(t) ≥ 0 and all Fstopping times τ ≤ τS . Let K be the constant found in (a). Show that: 1. If λ ≥ K then it is optimal to stop immediately. 2. If λ < K then it is never optimal to stop. Exercise 7.2 (a) Verify the statements in Cases 2 and 3 at the end of Sect. 7.3. (b) What happens in the cases Case 4: θ = 1, ρ > α, and ρa ≤ K 0 ? Case 5: θ = 1, ρ > α, and K 0 + ρ K 1 ≤ ρa? Exercise 7.3 (A Stochastic Linear Regulator Problem with Optimal Stopping) Consider the stochastic linear regulator problem in Exercise 5.5, with the additional option of stopping, i.e., solve the problem   Φ(s, x) = inf J (u,τ ) (s, x); u ∈ U, τ ∈ T , where J



(u,τ )

(s, x) = E

τ

s,x

e

−ρ(s+t)





X (t) + θ u (t) dt + λe 2

2

−ρ(s+τ )

 X (τ )χ{τ 0, θ > 0, λ > 0 constants), where the state process is  Y (t) =

   s+t s ; t ≥ 0, Y (0) = = y ∈ R2 X (t) x

with dX (t) = dX (u) (t) = u(t)dt + σ dB(t) + (a) Explain why

 R

(dt, dz), zN

X (0) = x.

Φ(s, x) ≤ λe−ρs x 2 for all s, x.

(b) Prove that Φ(s, x) ≤ e

−ρs





 x +σ + 2

2

z ν(dz) 2

R

for all s, x.

[Hint: What happens if we choose τ = ∞?] (c) Show that if 0 0.

Here r > 0, α and β > 0 are given constants, and we assume that z > −1 a.s. ν. Assume that at any time t the investor can choose an adapted and càdlàg consumption rate process c(t) = c(t, ω) ≥ 0, taken from the bank account. Moreover, the investor can at any time transfer money from one investment to another with a transaction cost which is proportional to the size of the transaction. Let $X_1(t)$ and $X_2(t)$ denote the amounts of money invested in the bank and in the stocks, respectively. Then the evolution equations for $X_1(t)$ and $X_2(t)$ are

$$dX_1(t) = dX_1^{x_1,c,\xi}(t) = (rX_1(t) - c(t))\,dt - (1+\lambda)\,d\xi_1(t) + (1-\mu)\,d\xi_2(t), \quad X_1(0^-) = x_1 \in \mathbb{R},$$

© Springer Nature Switzerland AG 2019 B. Øksendal and A. Sulem, Applied Stochastic Control of Jump Diffusions, Universitext, https://doi.org/10.1007/978-3-030-02781-0_8


8 Singular Control for Jump Diffusions

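\[ dX_2(t) = dX_2^{x_2,c,\xi}(t) = X_2(t^-)\Big[\alpha\,dt + \beta\,dB(t) + \int_{\mathbb R} z\,\tilde N(dt,dz)\Big] + d\xi_1(t) - d\xi_2(t), \qquad X_2(0^-) = x_2 \in \mathbb R. \]

Fig. 8.1 The solvency region S (the region of the $(x_1,x_2)$-plane bounded by the lines $\partial_1 S$: $x_1 + (1+\lambda)x_2 = 0$ and $\partial_2 S$: $x_1 + (1-\mu)x_2 = 0$)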

Here $\xi = (\xi_1, \xi_2)$, where $\xi_1(t)$ and $\xi_2(t)$ represent cumulative purchase and sale, respectively, of stocks up to time t. The constants λ ≥ 0, μ ∈ [0, 1] represent the constants of proportionality of the transaction costs. Define the solvency region $\mathcal S$ to be the set of states $(x_1, x_2)$ such that the net wealth is nonnegative, i.e., (see Fig. 8.1)

\[ \mathcal S = \{(x_1,x_2) \in \mathbb R^2;\ x_1 + (1+\lambda)x_2 \ge 0 \ \text{and}\ x_1 + (1-\mu)x_2 \ge 0\}. \tag{8.1.1} \]

We define the set $\mathcal A$ of admissible controls as the set of adapted consumption–investment policies (c, ξ) such that c = c(t) ≥ 0 is càdlàg and $\xi = (\xi_1, \xi_2)$ where each $\xi_i(t)$, i = 1, 2, is right-continuous, nondecreasing, $\xi(0^-) = 0$. Define

\[ \tau_{\mathcal S} = \inf\{t > 0;\ (X_1(t), X_2(t)) \notin \mathcal S\}. \]

8.1 An Illustrating Example

The performance criterion is defined by

\[ J^{c,\xi}(s,x_1,x_2) = E^{s,x_1,x_2}\Big[\int_0^{\tau_{\mathcal S}} e^{-\delta(s+t)}\,\frac{c^{\gamma}(t)}{\gamma}\,dt\Big], \tag{8.1.2} \]

where δ > 0, γ ∈ (0, 1) are constants. We seek $(c^*, \xi^*) \in \mathcal A$ and $\Phi(s,x_1,x_2)$ such that

\[ \Phi(s,x_1,x_2) = \sup_{(c,\xi)\in\mathcal A} J^{c,\xi}(s,x_1,x_2) = J^{c^*,\xi^*}(s,x_1,x_2). \tag{8.1.3} \]

This is an example of a singular stochastic control problem. It is called singular because the investment control measure dξ(t) is allowed to be singular with respect to Lebesgue measure dt. In fact, as we shall see, the optimal control measure dξ ∗ (t) turns out to be singular. We now give a general theory of singular control of jump diffusions and return to the above example afterward.

8.2 A General Formulation

Let $\kappa = [\kappa_{ij}] : \mathbb R^k \to \mathbb R^{k\times p}$ and $\theta = [\theta_i] : \mathbb R^k \to \mathbb R^p$ be given continuous functions. Suppose the state $Y(t) = Y^{u,\xi}(t) \in \mathbb R^k$ is described by the equation

\[ dY(t) = b(Y(t),u(t))\,dt + \sigma(Y(t),u(t))\,dB(t) + \int_{\mathbb R} \gamma(Y(t^-),u(t^-),z)\,\tilde N(dt,dz) + \kappa(Y(t^-))\,d\xi(t), \qquad Y(0^-) = y \in \mathbb R^k. \tag{8.2.1} \]

Here ξ(t) ∈ R p is an adapted càdlàg finite variation process with increasing components and ξ(0− ) = 0. Since dξ(t) may be singular with respect to Lebesgue measure dt, we call ξ our singular control or our intervention control. The process u(t) is an adapted càdlàg process with values in a given set U (u(t)dt is our absolutely continuous control). Suppose we are given a performance functional J u,ξ (y) of the form  τS f (Y (t), u(t))dt + g(Y (τS )) · X{τS 0; Y u,ξ (t) ∈ / S} ≤ ∞ is the time of bankruptcy, where S ⊂ Rk is a given solvency set, assumed to satisfy S ⊂ S 0 . Let A be a given family of admissible controls (u, ξ), contained in the set of (u, ξ) such that a unique strong solution Y (t) of (8.2.1) exists and  E

τS

y 0



| f (Y (t), u(t))|dt + |g(Y (τS ))| · X{τS 0 or Z 1 (t) > inf 0≤s≤t Z (s). In either case the process ξ1 (t) = sup0≤s≤t Z 1− (s) is constant in an interval [t, t + ) for some  > 0, by right continuity of Z (t). Therefore dξ(t) = 0 if Y (t) ∈ D, as claimed. Hence (Y (t), ξ(t)) defined by (8.2.16) solves the Skorohod equation (8.2.8)–(8.2.10) in the case when D and κ are given by (8.2.11) and (8.2.12). The proof in the general case follows by mapping neighborhoods of x ∈ ∂ D into neighborhoods of the boundary point 0 of a half-space and applying the argument above.
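The one-dimensional construction used in the argument above, $\xi_1(t) = \sup_{0\le s\le t} Z_1^-(s)$ and $Y = Z + \xi$, can be sketched numerically. The following is a minimal illustration on a discrete path, assuming that D is the half-line {y > 0} and that the reflection pushes in the positive direction; the driving path (a simple random walk) and all function names are hypothetical and not taken from the text.

```python
import random


def skorohod_reflect(z):
    """Reflect the discrete path z at 0 from below.

    Returns (y, xi) with y[i] = z[i] + xi[i] >= 0, where xi[i] is the
    running maximum of the negative part of z, so xi is nondecreasing
    and increases only at indices where y[i] == 0.
    """
    y, xi = [], []
    running = 0.0
    for value in z:
        running = max(running, -value)  # sup of the negative part so far
        xi.append(running)
        y.append(value + running)
    return y, xi


def random_walk(n_steps, step=0.1, start=0.5, seed=0):
    """Hypothetical driving path: a simple symmetric random walk."""
    rng = random.Random(seed)
    path = [start]
    for _ in range(n_steps):
        path.append(path[-1] + rng.choice((-step, step)))
    return path


z_path = random_walk(1000)
y_path, xi_path = skorohod_reflect(z_path)
```

In this sketch the control `xi_path` stays constant while `y_path` is strictly positive, which is the discrete analogue of dξ(t) = 0 when Y(t) ∈ D.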

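8.3 Application to Portfolio Optimization with Transaction Costs

We now apply this theorem to Example 8.1. In this case our state process is

\[ dY(t) = \begin{bmatrix} dt \\ dX_1(t) \\ dX_2(t) \end{bmatrix} = \begin{bmatrix} 1 \\ rX_1(t) - c(t) \\ \alpha X_2(t) \end{bmatrix} dt + \begin{bmatrix} 0 \\ 0 \\ \beta X_2(t) \end{bmatrix} dB(t) + \begin{bmatrix} 0 \\ 0 \\ X_2(t^-)\int_{\mathbb R} z\,\tilde N(dt,dz) \end{bmatrix} + \begin{bmatrix} 0 & 0 \\ -(1+\lambda) & (1-\mu) \\ 1 & -1 \end{bmatrix} \begin{bmatrix} d\xi_1(t) \\ d\xi_2(t) \end{bmatrix}. \]

The generator of Y(t) when there are no interventions is

\[ A^c\phi(y) = \frac{\partial\phi}{\partial s} + (rx_1 - c)\frac{\partial\phi}{\partial x_1} + \alpha x_2\frac{\partial\phi}{\partial x_2} + \frac12\beta^2x_2^2\frac{\partial^2\phi}{\partial x_2^2} + \int_{\mathbb R}\Big\{\phi(s,x_1,x_2+x_2z) - \phi(s,x_1,x_2) - x_2z\frac{\partial\phi}{\partial x_2}(s,x_1,x_2)\Big\}\nu(dz). \]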


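Or, if $\phi(s,x_1,x_2) = e^{-\delta s}\psi(x_1,x_2)$ we have $A^c\phi(s,x_1,x_2) = e^{-\delta s}A_0^c\psi(x_1,x_2)$, where

\[ A_0^c\psi(x_1,x_2) = -\delta\psi + (rx_1 - c)\frac{\partial\psi}{\partial x_1} + \alpha x_2\frac{\partial\psi}{\partial x_2} + \frac12\beta^2x_2^2\frac{\partial^2\psi}{\partial x_2^2} + \int_{\mathbb R}\Big\{\psi(x_1,x_2+x_2z) - \psi(x_1,x_2) - x_2z\frac{\partial\psi}{\partial x_2}(x_1,x_2)\Big\}\nu(dz). \]

Here

\[ f(y,u) = f(s,x_1,x_2,c) = e^{-\delta s}\frac{c^{\gamma}}{\gamma}, \qquad \theta = g = 0, \qquad u(t) = c(t). \]

Condition (ii) of Theorem 8.2 gets the form

\[ -(1+\lambda)\frac{\partial\psi}{\partial x_1} + \frac{\partial\psi}{\partial x_2} \le 0 \quad\text{and}\quad (1-\mu)\frac{\partial\psi}{\partial x_1} - \frac{\partial\psi}{\partial x_2} \le 0. \]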

The nonintervention region D in (8.2.5) therefore becomes

∂ψ ∂ψ ∂ψ ∂ψ D = (s, x1 , x2 ) ∈ S; −(1 + λ) + < 0 and (1 − μ) − γα − β 2 γ(1 − γ) − γ ν + 2



∞ −1

{(1 + z)γ − 1}ν(dz),

where $\|\nu\| = \nu((-1,\infty)) < \infty$. Then $\Phi \in C^2(\mathcal S^0) \cap C(\mathcal S)$ and

\[ c^*(x_1,x_2) = \Big(\frac{\partial\Phi}{\partial x_1}\Big)^{1/(\gamma-1)} \]

is optimal and there exist $\hat\theta_1, \hat\theta_2 \in [0, \pi/2]$ with $\hat\theta_1 < \hat\theta_2$ such that

\[ D = \{re^{i\theta};\ \hat\theta_1 < \theta < \hat\theta_2\} \qquad (i = \sqrt{-1}) \]

is the non-intervention region and the optimal intervention strategy (portfolio) $\xi^*(t)$ is the local time of the process $(X_1(t), X_2(t))$ reflected back into D in the direction parallel to $\partial_1 S$ at $\theta = \hat\theta_1$ and in the direction parallel to $\partial_2 S$ at $\theta = \hat\theta_2$. See Fig. 8.2. For proofs and more details we refer to [FØS2].

Remark 8.6 For other applications of singular control theory of jump diffusions see [Ma].

Fig. 8.2 The optimal portfolio $\xi^*(t)$ (the non-intervention region D is the cone $\hat\theta_1 < \theta < \hat\theta_2$ inside the solvency region between $\partial_1 S$: $x_1 + (1+\lambda)x_2 = 0$ and $\partial_2 S$: $x_1 + (1-\mu)x_2 = 0$; on the ray $\theta = \hat\theta_1$ we have $-(1+\lambda)\frac{\partial\psi}{\partial x_1} + \frac{\partial\psi}{\partial x_2} = 0$ and on the ray $\theta = \hat\theta_2$ we have $(1-\mu)\frac{\partial\psi}{\partial x_1} - \frac{\partial\psi}{\partial x_2} = 0$; the legend gives the purchase amount and the sale amount m at a transaction and the resulting new values of $x_1$ and $x_2$)

8.4 Exercises

Exercise* 8.1 (Optimal Dividend Policy Under Proportional Transaction Costs) Suppose the cash flow $X(t) = X^{(\xi)}(t)$ of a firm at time t is given by (with α, σ, β, λ > 0 constants)

\[ dX(t) = \alpha\,dt + \sigma\,dB(t) + \beta\int_{\mathbb R} z\,\tilde N(dt,dz) - (1+\lambda)\,d\xi(t), \qquad X(0^-) = x > 0, \]

where ξ(t) is an increasing, adapted càdlàg process representing the total dividend taken out up to time t (our control process). We assume that βz ≤ 0 for a.a. z (ν). Let

\[ \tau_{\mathcal S} = \inf\{t > 0;\ X^{(\xi)}(t) \le 0\} \]

be the time of bankruptcy of the firm and let

\[ J^{(\xi)}(s,x) = E^{s,x}\Big[\int_0^{\tau_{\mathcal S}} e^{-\rho(s+t)}\,d\xi(t)\Big], \qquad \rho > 0 \text{ constant}, \]

be the expected total discounted amount taken out up to bankruptcy time. Find Φ(s, x) and a dividend policy $\xi^*$ such that

\[ \Phi(s,x) = \sup_{\xi} J^{(\xi)}(s,x) = J^{(\xi^*)}(s,x). \]

Exercise* 8.2 Let $\Phi(s,x_1,x_2)$ be the value function of the optimal consumption problem (8.1.2) with proportional transaction costs and let $\Phi_0(s,x_1,x_2) = Ke^{-\delta s}(x_1+x_2)^{\gamma}$ be the corresponding value function when there are no transaction costs, i.e., μ = λ = 0 (Example 5.2). Use Theorem 8.2a to prove that

\[ \Phi(s,x_1,x_2) \le Ke^{-\delta s}(x_1+x_2)^{\gamma}. \]

Exercise 8.3 (Optimal Harvesting) Suppose the size X(t) at time t of a certain fish population is modeled by a geometric Lévy process, i.e.,

\[ dX(t) = dX^{(\xi)}(t) = X(t^-)\Big[\mu\,dt + \int_{\mathbb R} z\,\tilde N(dt,dz)\Big] - d\xi(t), \quad t > 0, \qquad X(0^-) = x > 0, \]

where μ > 0 is a constant, z > −1 a.s. ν(dz) and ξ(t) is an increasing adapted process giving the amount harvested from the population from time 0 up to time t. We assume that ξ(t) is right-continuous. Consider the optimal harvesting problem

\[ \Phi(s,x) = \sup_{\xi} J^{(\xi)}(s,x), \]

where

\[ J^{(\xi)}(s,x) = E^{s,x}\Big[\int_0^{\tau_{\mathcal S}} \theta e^{-\rho(s+t)}\,d\xi(t)\Big], \]

with θ > 0, ρ > 0 constants and $\tau_{\mathcal S} = \inf\{t > 0;\ X^{(\xi)}(t) \le 0\}$ (extinction time). If we interpret θ as the price per unit harvested, then $J^{(\xi)}(s,x)$ represents the expected total discounted value of the harvested amount up to extinction time.

(a) Write down the integrovariational inequalities (i), (ii), (vi), and (ix) of Theorem 8.2 in this case, with the state process

\[ Y(t) = \begin{bmatrix} s+t \\ X(t) \end{bmatrix},\ t \ge 0, \qquad Y(0) = y = \begin{bmatrix} s \\ x \end{bmatrix} \in \mathbb R_+ \times \mathbb R_+. \]

(b) Suppose μ ≤ ρ. Show that in this case it is optimal to harvest all the population immediately, i.e., it is optimal to choose the harvesting strategy $\hat\xi$ defined by

\[ \hat\xi(t) = x \quad \text{for all } t \ge 0 \]

(sometimes called the “take the money and run” strategy). This gives the value function $\Phi(s,x) = \theta e^{-\rho s}x$.

(c) Suppose μ > ρ. Show that in this case Φ(s, x) = ∞.

Chapter 9

Impulse Control of Jump Diffusions

9.1 A General Formulation and a Verification Theorem

Suppose that, if there are no interventions, the state $Y(t) \in \mathbb R^k$ of the system we consider is a jump diffusion of the form

\[ dY(t) = b(Y(t))\,dt + \sigma(Y(t))\,dB(t) + \int_{\mathbb R} \gamma(Y(t^-),z)\,\tilde N(dt,dz), \qquad Y(0) = y \in \mathbb R^k, \tag{9.1.1} \]

where $b : \mathbb R^k \to \mathbb R^k$, $\sigma : \mathbb R^k \to \mathbb R^{k\times m}$, and $\gamma : \mathbb R^k \times \mathbb R^{\ell} \to \mathbb R^{k\times\ell}$ are given functions satisfying the conditions for the existence and uniqueness of a solution Y(t) (see Theorem 1.19). The generator A of Y(t) is

\[ A\phi(y) = \sum_{i=1}^{k} b_i(y)\frac{\partial\phi}{\partial y_i} + \frac12\sum_{i,j=1}^{k}(\sigma\sigma^T)_{ij}(y)\frac{\partial^2\phi}{\partial y_i\partial y_j} + \sum_{j=1}^{\ell}\int_{\mathbb R}\big\{\phi(y+\gamma^{(j)}(y,z_j)) - \phi(y) - \nabla\phi(y)\cdot\gamma^{(j)}(y,z_j)\big\}\nu_j(dz_j). \]

Now suppose that at any time t and any state y we are free to intervene and give the system an impulse $\zeta \in \mathcal Z \subset \mathbb R^p$, where $\mathcal Z$ is a given set (the set of admissible impulse values). Suppose the result of giving the impulse ζ when the state is y is that the state jumps immediately from $y = Y(t^-)$ to $Y(t) = \Gamma(y,\zeta) \in \mathbb R^k$, where $\Gamma : \mathbb R^k \times \mathcal Z \to \mathbb R^k$ is a given function. An impulse control for this system is a double (possibly finite) sequence

\[ v = (\tau_1,\tau_2,\ldots,\tau_j,\ldots;\ \zeta_1,\zeta_2,\ldots,\zeta_j,\ldots)_{j\le M}, \qquad M \le \infty, \]

© Springer Nature Switzerland AG 2019 B. Øksendal and A. Sulem, Applied Stochastic Control of Jump Diffusions, Universitext, https://doi.org/10.1007/978-3-030-02781-0_9


where $0 \le \tau_1 \le \tau_2 \le \cdots$ are $\mathcal F_t$-stopping times (the intervention times) and $\zeta_1, \zeta_2, \ldots$ are the corresponding impulses at these times. We assume that $\zeta_j$ is $\mathcal F_{\tau_j}$-measurable for all j. If $v = (\tau_1,\tau_2,\ldots;\zeta_1,\zeta_2,\ldots)$ is an impulse control, the corresponding state process $Y^{(v)}(t)$ is defined by

\[ Y^{(v)}(0^-) = y \quad\text{and}\quad Y^{(v)}(t) = Y(t), \quad 0 < t \le \tau_1, \tag{9.1.2} \]

\[ Y^{(v)}(\tau_j) = \Gamma(\check Y^{(v)}(\tau_j^-),\zeta_j), \quad j = 1, 2, \ldots \tag{9.1.3} \]

(If $\tau_1 = 0$ we put $Y^{(v)}(\tau_1) = \Gamma(Y^{(v)}(0^-),\zeta_1) = \Gamma(y,\zeta_1)$.)

\[ dY^{(v)}(t) = b(Y^{(v)}(t))\,dt + \sigma(Y^{(v)}(t))\,dB(t) + \int_{\mathbb R}\gamma(Y^{(v)}(t),z)\,\tilde N(dt,dz) \quad \text{for } \tau_j < t < \tau_{j+1}\wedge\tau^*. \tag{9.1.4} \]

If $\tau_{j+1} = \tau_j$, then $Y^{(v)}(t)$ jumps from $\check Y^{(v)}(\tau_j^-)$ to $\Gamma(\Gamma(\check Y^{(v)}(\tau_j^-),\zeta_j),\zeta_{j+1})$, where

\[ \check Y^{(v)}(\tau_j^-) = Y^{(v)}(\tau_j^-) + \Delta_N Y(\tau_j), \tag{9.1.5} \]

$\Delta_N Y^{(v)}(t)$ is as in (8.2.2) the jump of $Y^{(v)}$ stemming from the jump of the random measure N(t, ·) only and

\[ \tau^* = \tau^*(\omega) = \lim_{R\to\infty}\big(\inf\{t > 0;\ |Y^{(v)}(t)| \ge R\}\big) \le \infty \tag{9.1.6} \]

is the explosion time of $Y^{(v)}(t)$. Note that here we must distinguish between the (possible) jump of $Y^{(v)}(\tau_j)$ stemming from the random measure N, denoted by $\Delta_N Y^{(v)}(\tau_j)$, and the jump caused by the intervention v, given by

\[ \Delta_v Y^{(v)}(\tau_j) := \Gamma(\check Y^{(v)}(\tau_j^-),\zeta_j) - \check Y^{(v)}(\tau_j^-). \tag{9.1.7} \]

Let $\mathcal S \subset \mathbb R^k$ be a fixed open set (the solvency region). Define

\[ \tau_{\mathcal S} = \inf\{t \in (0,\tau^*);\ Y^{(v)}(t) \notin \mathcal S\}. \tag{9.1.8} \]

Suppose we are given a continuous profit function $f : \mathcal S \to \mathbb R$ and a continuous bequest function $g : \mathbb R^k \to \mathbb R$. Moreover, suppose the profit/utility of making an intervention with impulse $\zeta \in \mathcal Z$ when the state is y is K(y, ζ), where $K : \mathcal S \times \mathcal Z \to \mathbb R$ is a given continuous function. We assume we are given a set $\mathcal V$ of admissible impulse controls which is included in the set of $v = (\tau_1,\tau_2,\ldots;\zeta_1,\zeta_2,\ldots)$ such that a unique solution $Y^{(v)}$ of (9.1.2)–(9.1.4) exists and

\[ \tau^* = \infty \ \text{a.s.} \tag{9.1.9} \]

and (if M = ∞)

\[ \lim_{j\to\infty}\tau_j = \tau_{\mathcal S}. \tag{9.1.10} \]

We also assume that

\[ E^y\Big[\int_0^{\tau_{\mathcal S}} f(Y^{(v)}(t))\,dt\Big] < \infty \quad \text{for all } y \in \mathbb R^k,\ v \in \mathcal V, \tag{9.1.11} \]

 E g − (Y (v) (τS ))X{τS τˆ j ; Y (vˆ j ) (t) ∈ j+1 (vˆ j ) if τˆ j+1 < τS , where Y is the result of applying vˆ j := (τˆ1 , . . . , τˆ j ; ζˆ1 , . . . , ζˆ j ) to Y . Suppose ˆ (τ )); τ ∈ T } is uniformly integrable. (xii) vˆ ∈ V and {φ(Y (v) Then φ(y) = Φ(y) and vˆ is an optimal impulse control.

(9.1.17)

Sketch of Proof (a) By Theorem 3.1 and (iii)–(v), we may assume that $\phi \in C^2(\mathcal S) \cap C(\bar{\mathcal S})$. Choose $v = (\tau_1,\tau_2,\ldots;\zeta_1,\zeta_2,\ldots) \in \mathcal V$ and set $\tau_0 = 0$. By another approximation argument we may assume that we can apply the Dynkin formula to the stopping times $\tau_j$. Then for j = 0, 1, 2, ..., with $Y = Y^{(v)}$,

\[ E^y[\phi(Y(\tau_j))] - E^y[\phi(\check Y(\tau_{j+1}^-))] = -E^y\Big[\int_{\tau_j}^{\tau_{j+1}} A\phi(Y(t))\,dt\Big], \tag{9.1.18} \]

where $\check Y(\tau_{j+1}^-) = Y(\tau_{j+1}^-) + \Delta_N Y(\tau_{j+1})$, as before. Summing this from j = 0 to j = m we get

\[ \phi(y) + \sum_{j=1}^{m} E^y[\phi(Y(\tau_j)) - \phi(\check Y(\tau_j^-))] - E^y[\phi(\check Y(\tau_{m+1}^-))] = -E^y\Big[\int_0^{\tau_{m+1}} A\phi(Y(t))\,dt\Big] \ge E^y\Big[\int_0^{\tau_{m+1}} f(Y(t))\,dt\Big]. \tag{9.1.19} \]

Now

\[ \phi(Y(\tau_j)) = \phi(\Gamma(\check Y(\tau_j^-),\zeta_j)) \le M\phi(\check Y(\tau_j^-)) - K(\check Y(\tau_j^-),\zeta_j) \quad \text{if } \tau_j < \tau_{\mathcal S} \text{ by (9.1.15)} \]

and

\[ \phi(Y(\tau_j)) = \phi(\check Y(\tau_j^-)) \quad \text{if } \tau_j = \tau_{\mathcal S} \text{ by (vii)}. \]

Therefore

\[ M\phi(\check Y(\tau_j^-)) - \phi(\check Y(\tau_j^-)) \ge \phi(Y(\tau_j)) - \phi(\check Y(\tau_j^-)) + K(\check Y(\tau_j^-),\zeta_j) \]

and φ(y) +

m 

ˇ − E y [{Mφ(Yˇ (τ − j )) − φ(Y (τ j ))} · X{τ j 0, θ ≥ 0 are constants and we assume that z ≤ 0 a.s. ν. Suppose that at any time t we are free to take out an amount ζ > 0 from X (t) by applying the transaction cost k(ζ) = c + λζ, (9.2.2) where c > 0, λ ≥ 0 are constants. The constant c is called the fixed part and the quantity λζ is called the proportional part, respectively, of the transaction cost. The resulting cash flow X (v) (t) is given by X (v) (t) = X (t) if 0 ≤ t < τ1 ,

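(9.2.3)

\[ X^{(v)}(\tau_j) = \Gamma(X^{(v)}(\tau_j^-) + \Delta_N X(\tau_j),\zeta_j) = \check X^{(v)}(\tau_j^-) - (1+\lambda)\zeta_j - c, \tag{9.2.4} \]

and

\[ dX^{(v)}(t) = \mu\,dt + \sigma\,dB(t) + \theta\int_{\mathbb R} z\,\tilde N(dt,dz) \quad \text{if } \tau_j \le t < \tau_{j+1}. \tag{9.2.5} \]

Put

\[ \tau_{\mathcal S} = \inf\{t > 0;\ X^{(v)}(t) \le 0\} \quad \text{(time of bankruptcy)} \tag{9.2.6} \]

and

\[ J^{(v)}(s,x) = E^{s,x}\Big[\sum_{\tau_j \le \tau_{\mathcal S}} e^{-\rho(s+\tau_j)}\zeta_j\Big], \tag{9.2.7} \]

where ρ > 0 is constant (the discounting exponent). We seek Φ(s, x) and $v^* = (\tau_1^*,\tau_2^*,\ldots;\zeta_1^*,\zeta_2^*,\ldots) \in \mathcal V$ such that

\[ \Phi(s,x) = \sup_{v\in\mathcal V} J^{(v)}(s,x) = J^{(v^*)}(s,x), \tag{9.2.8} \]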

9.2 Examples

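where $\mathcal V$ is the set of impulse controls s.t. $X^{(v)}(t) \ge 0$ for all $t \le \tau_{\mathcal S}$. This is a problem of the type (9.1.14), with

\[ Y^{(v)}(t) = \begin{bmatrix} s+t \\ X^{(v)}(t) \end{bmatrix},\ t \ge 0, \qquad Y^{(v)}(0^-) = \begin{bmatrix} s \\ x \end{bmatrix} = y, \]

\[ \Gamma(y,\zeta) = \Gamma(s,x,\zeta) = \begin{bmatrix} s \\ x - c - (1+\lambda)\zeta \end{bmatrix}, \qquad K(y,\zeta) = K(s,x,\zeta) = e^{-\rho s}\zeta, \qquad f = g = 0, \]

and $\mathcal S = \{(s,x);\ x > 0\}$. As a candidate for the value function Φ we try

\[ \phi(s,x) = e^{-\rho s}\psi(x). \tag{9.2.9} \]

Then

\[ M\psi(x) = \sup\Big\{\psi(x - c - (1+\lambda)\zeta) + \zeta;\ 0 < \zeta < \frac{x-c}{1+\lambda}\Big\}. \]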

We now guess that the continuation region has the form D = {(s, x); 0 < x < x ∗ } for some x ∗ > 0.

(9.2.10)

Then (x) of Theorem 9.2 gives

\[ -\rho\psi(x) + \mu\psi'(x) + \frac12\sigma^2\psi''(x) + \int_{\mathbb R}\{\psi(x+\theta z) - \psi(x) - \psi'(x)\theta z\}\nu(dz) = 0. \]

To solve this equation we try a function of the form $\psi(x) = e^{rx}$ for some constant $r \in \mathbb R$. Then r must solve the equation

\[ h(r) := -\rho + \mu r + \frac12\sigma^2r^2 + \int_{\mathbb R}\{e^{r\theta z} - 1 - r\theta z\}\nu(dz) = 0. \tag{9.2.11} \]

Since h(0) = −ρ < 0 and $\lim_{|r|\to\infty} h(r) = \infty$, we see that there exist two solutions $r_1, r_2$ of h(r) = 0 such that $r_2 < 0 < r_1$. Moreover, since $e^{r\theta z} - 1 - r\theta z \ge 0$ for all r, z we have $|r_2| > r_1$.
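The two roots of h(r) = 0 can be located numerically. The sketch below is only an illustration under stated assumptions: the Lévy measure ν is taken to be a finite sum of point masses (the `atoms` argument), the parameter values are hypothetical, and plain bisection is used because h(0) < 0 while h is positive far enough to the left and right.

```python
import math


def make_h(rho, mu, sigma, theta, atoms):
    """h(r) from (9.2.11) for a hypothetical finite Levy measure.

    atoms is a list of (z, weight) pairs, i.e. nu = sum of weight * delta_z.
    """
    def h(r):
        jump = sum(w * (math.exp(r * theta * z) - 1.0 - r * theta * z)
                   for z, w in atoms)
        return -rho + mu * r + 0.5 * sigma ** 2 * r ** 2 + jump
    return h


def bisect(f, lo, hi, tol=1e-12):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def roots_of_h(h):
    """Return (r1, r2) with r2 < 0 < r1, widening each bracket until h > 0."""
    right = 1.0
    while h(right) <= 0:
        right *= 2.0
    left = -1.0
    while h(left) <= 0:
        left *= 2.0
    return bisect(h, 0.0, right), bisect(h, left, 0.0)


# Hypothetical parameters; the jump size z = -0.5 satisfies z <= 0 as assumed.
h = make_h(rho=0.1, mu=0.05, sigma=0.3, theta=1.0, atoms=[(-0.5, 1.0)])
r1, r2 = roots_of_h(h)
print(r1, r2)
```

For these particular parameter values the computed roots satisfy $|r_2| > r_1$, in line with the statement above.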


With such a choice of $r_1, r_2$ we try

\[ \psi(x) = A_1e^{r_1x} + A_2e^{r_2x}, \qquad A_i \text{ constants}. \]

Since ψ(0) = 0 we have $A_1 + A_2 = 0$ so we write $A_1 = A = -A_2 > 0$ and

\[ \psi(x) = A\big(e^{r_1x} - e^{r_2x}\big), \qquad 0 < x < x^*. \]

Define

\[ \psi_0(x) = A\big(e^{r_1x} - e^{r_2x}\big) \quad \text{for all } x > 0. \tag{9.2.12} \]

To study Mψ we first consider

\[ g(\zeta) := \psi_0(x - c - (1+\lambda)\zeta) + \zeta, \qquad \zeta > 0. \]

The first-order condition for a maximum point $\hat\zeta = \hat\zeta(x)$ for g(ζ) is that

\[ \psi_0'(x - c - (1+\lambda)\hat\zeta) = \frac{1}{1+\lambda}. \]

Now $\psi_0'(x) > 0$ for all x and $\psi_0''(x) < 0$ iff

\[ x < \tilde x := \frac{2(\ln|r_2| - \ln r_1)}{r_1 - r_2}. \]
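The threshold $\tilde x$ can be checked directly from the definition (9.2.12). The following short derivation is a sketch that uses only A > 0 and $r_2 < 0 < r_1$:

```latex
\begin{align*}
\psi_0'(x)  &= A\bigl(r_1 e^{r_1 x} - r_2 e^{r_2 x}\bigr) > 0
  \quad\text{since } A > 0,\ r_1 > 0 > r_2,\\
\psi_0''(x) &= A\bigl(r_1^2 e^{r_1 x} - r_2^2 e^{r_2 x}\bigr) < 0
  \iff e^{(r_1 - r_2)x} < \frac{r_2^2}{r_1^2}
  \iff x < \frac{2(\ln|r_2| - \ln r_1)}{r_1 - r_2} = \tilde x .
\end{align*}
```

Since $|r_2| > r_1$, the threshold $\tilde x$ is strictly positive, so $\psi_0'$ decreases on $(0, \tilde x)$ and increases afterwards.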

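Therefore the equation $\psi_0'(x) = 1/(1+\lambda)$ has exactly two solutions $x = \underline{x}$ and $x = \bar x$ where

\[ 0 < \underline{x} < \tilde x < \bar x \]

(provided that $\psi_0'(\tilde x) < 1/(1+\lambda) < \psi_0'(0)$). See Fig. 9.1. Choose

\[ x^* = \bar x \quad\text{and put}\quad \hat x = \underline{x}. \]

If we require that $\psi(x) = M\psi_0(x)$ for $x \ge x^*$ we get $\psi(x) = \psi_0(\hat x) + \hat\zeta(x)$ for $x \ge x^*$, where

\[ x - c - (1+\lambda)\hat\zeta(x) = \hat x, \tag{9.2.13} \]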

i.e.,

\[ \hat\zeta(x) = \frac{x - \hat x - c}{1+\lambda} \quad \text{for } x \ge x^*. \tag{9.2.14} \]

Fig. 9.1 The function $\psi_0'(x)$

Hence we propose that ψ has the form

\[ \psi(x) = \begin{cases} \psi_0(x) = A(e^{r_1x} - e^{r_2x}), & 0 < x < x^*, \\[2pt] \psi_0(\hat x) + \dfrac{x - \hat x - c}{1+\lambda}, & x \ge x^*. \end{cases} \tag{9.2.15} \]

Now choose A such that ψ is continuous at $x = x^*$. This gives

\[ A = (1+\lambda)^{-1}\big[e^{r_1x^*} - e^{r_2x^*} - e^{r_1\hat x} + e^{r_2\hat x}\big]^{-1}(x^* - \hat x - c). \tag{9.2.16} \]

By our choice of $x^*$ we then have that ψ is also differentiable at $x = x^*$. We can now check that, with these values of $x^*$, $\hat x$, and A, our choice of $\phi(s,x) = e^{-\rho s}\psi(x)$ satisfies all the requirements of Theorem 9.2, provided that some conditions on the parameters are satisfied. We leave this verification to the reader. Thus the solution of the impulse control problem (9.2.8) can be described as follows. As long as $X(t) < x^*$ we do nothing. If X(t) reaches the value $x^*$ then immediately we make an intervention to bring X(t) down to the value $\hat x$. See Fig. 9.2.
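The threshold policy just described (do nothing below $x^*$, and at $x^*$ pay out so that the cash flow restarts at $\hat x$) can be illustrated by a simple simulation. This is a rough sketch under several assumptions that are not in the text: an Euler time discretisation, no jump part (ν = 0), and hypothetical values for $x^*$ and $\hat x$ rather than the ones solving (9.2.13)–(9.2.16).

```python
import math
import random


def simulate_threshold_policy(x0, x_star, x_hat, c, lam, mu, sigma, rho,
                              horizon=50.0, dt=0.01, seed=0):
    """Simulate dX = mu dt + sigma dB under the (x_star, x_hat) policy.

    Whenever X >= x_star the amount zeta = (X - x_hat - c) / (1 + lam) is
    taken out, so that X - c - (1 + lam) * zeta = x_hat. Returns the
    discounted sum of the zeta's and the list of (time, zeta) pairs.
    """
    rng = random.Random(seed)
    x = x0
    value = 0.0
    impulses = []
    n_steps = int(round(horizon / dt))
    for step in range(1, n_steps + 1):
        x += mu * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        t = step * dt
        if x <= 0.0:  # bankruptcy: stop
            break
        if x >= x_star:  # intervene and restart from x_hat
            zeta = (x - x_hat - c) / (1.0 + lam)
            impulses.append((t, zeta))
            value += math.exp(-rho * t) * zeta
            x = x_hat
    return value, impulses


# Hypothetical parameter values, for illustration only.
value, impulses = simulate_threshold_policy(
    x0=0.5, x_star=2.0, x_hat=1.0, c=0.2, lam=0.1,
    mu=1.0, sigma=0.3, rho=0.1, horizon=20.0)
print(value, len(impulses))
```

Averaging `value` over many seeds would give a Monte Carlo estimate of $J^{(v)}(0, x_0)$ for this particular policy; the discretisation overshoots $x^*$ slightly, so each ζ is marginally larger than $(x^* - \hat x - c)/(1+\lambda)$.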

Example 9.5 As another illustration of how to apply Theorem 9.2 we consider the following example, which is a jump diffusion version of the example in [Ø2] studied in connection with questions involving vanishing fixed costs. Variations of this problem have been studied by many authors (see, e.g., [HST, J-P, MuØ, ØS, ØUZ, V]). One possible economic interpretation is that the given process represents the exchange rate of a given currency and the impulses represent the interventions taken in order to keep the exchange rate in a given “target zone.” See, e.g., [J-P, MØ].

Fig. 9.2 The optimal impulse control of Example 9.4 (sample path of X(t): each time X(t) reaches $x^*$, at $t = \tau_1^*, \tau_2^*, \ldots$, an impulse $\zeta_1^*, \zeta_2^*, \ldots$ brings it down to $\hat x$)

Suppose that without interventions the system has the form

\[ Y(t) = \begin{bmatrix} s+t \\ X(t) \end{bmatrix} \in \mathbb R^2, \qquad Y(0) = y = (s,x), \tag{9.2.17} \]

where $X(t) = x + B(t) + \int_0^t\int_{\mathbb R} z\,\tilde N(ds,dz)$ and B(0) = 0. We assume that z ≤ 0 a.s. ν. Suppose that we are only allowed to give the system impulses ζ with values in $\mathcal Z := (0,\infty)$ and that if we apply an impulse control $v = (\tau_1,\tau_2,\ldots;\zeta_1,\zeta_2,\ldots)$ to Y(t) it gets the form

\[ Y^{(v)}(t) = \begin{bmatrix} s+t \\ X(t) - \sum_{\tau_k\le t}\zeta_k \end{bmatrix} = \begin{bmatrix} s+t \\ X^{(v)}(t) \end{bmatrix}. \tag{9.2.18} \]

Suppose that the cost rate f(t, ξ) if $X^{(v)}(t) = \xi$ at time t is given by

\[ f(t,\xi) = e^{-\rho t}\xi^2, \tag{9.2.19} \]

where ρ > 0 is constant. In an effort to reduce the cost one can apply the impulse control v in order to reduce the value of $X^{(v)}(t)$. However, suppose the cost of an intervention of size ζ > 0 at time t is

\[ K(t,\xi,\zeta) = K(\zeta) = c + \lambda\zeta, \tag{9.2.20} \]

where c > 0 and λ ≥ 0 are constants. Then the expected total discounted cost associated to a given impulse control is

\[ J^{(v)}(s,x) = E^{x}\Big[\int_0^{\infty} e^{-\rho(s+t)}(X^{(v)}(t))^2\,dt + \sum_{k=1}^{N} e^{-\rho(s+\tau_k)}(c + \lambda\zeta_k)\Big]. \tag{9.2.21} \]

We seek Φ(s, x) and $v^* = (\tau_1^*,\tau_2^*,\ldots;\zeta_1^*,\zeta_2^*,\ldots)$ such that

\[ \Phi(s,x) = \inf_{v} J^{(v)}(s,x) = J^{(v^*)}(s,x). \tag{9.2.22} \]

This is an impulse control problem of the type described above, except that it is a minimum problem rather than a maximum problem. Theorem 9.2 still applies, with the corresponding changes. Note that it is not optimal to move X(t) downward if X(t) is already below 0. Hence we may restrict ourselves to consider impulse controls $v = (\tau_1,\tau_2,\ldots;\zeta_1,\zeta_2,\ldots)$ such that

\[ \sum_{j=1}^{k}\zeta_j \le X(\tau_k) \quad \text{for all } k. \tag{9.2.23} \]

We let V denote the set of such impulse controls. We guess that the optimal strategy is to wait until the level of X (t) reaches an (unknown) value x ∗ > 0. At this time, τ1 , we intervene and give X (t) an impulse ζ1 , which brings it down to a lower value xˆ > 0. Then we do nothing until the next time, τ2 , that X (t) reaches the level x ∗ , etc. This suggests that the continuation region D in Theorem 9.2 has the form D = {(s, x), x < x ∗ }.

(9.2.24)

See Fig. 9.3. Let us try a value function ϕ of the form

\[ \varphi(s,x) = e^{-\rho s}\psi(x), \tag{9.2.25} \]

where ψ remains to be determined. Condition (x) of Theorem 9.2 gives that for $x < x^*$ we should have

\[ A\varphi + f = e^{-\rho s}\Big[-\rho\psi(x) + \frac12\psi''(x) + \int_{\mathbb R}\{\psi(x+z) - \psi(x) - z\psi'(x)\}\nu(dz)\Big] + e^{-\rho s}x^2 = 0. \]

So for $x < x^*$ we let ψ be a solution h(x) of the equation

\[ \int_{\mathbb R}\{h(x+z) - h(x) - zh'(x)\}\nu(dz) + \frac12h''(x) - \rho h(x) + x^2 = 0. \tag{9.2.26} \]

Fig. 9.3 The optimal impulse control of Example 9.5 (sample path of X(t): each time X(t) reaches $x^*$, at $t = \tau_1^*, \tau_2^*, \ldots$, an impulse $\zeta_1^*, \zeta_2^*, \ldots$ brings it down to $\hat x$)

We see that any function h(x) of the form

\[ h(x) = C_1e^{r_1x} + C_2e^{r_2x} + \frac1\rho x^2 + \frac{1 + \int_{\mathbb R}z^2\nu(dz)}{\rho^2}, \tag{9.2.27} \]

where $C_1, C_2$ are arbitrary constants, is a solution of (9.2.26), provided that $r_1 > 0$, $r_2 < 0$ are roots of the equation

\[ K(r) := \int_{\mathbb R}\{e^{rz} - 1 - rz\}\nu(dz) + \frac12r^2 - \rho = 0. \]

Note that if we make no interventions at all, then the cost is

\[ J^{(v)}(s,x) = e^{-\rho s}E^{x}\Big[\int_0^{\infty}e^{-\rho t}(X(t))^2\,dt\Big] = e^{-\rho s}\int_0^{\infty}e^{-\rho t}(x^2 + tb)\,dt = e^{-\rho s}\Big(\frac1\rho x^2 + \frac{b}{\rho^2}\Big), \tag{9.2.28} \]

where $b = 1 + \int_{\mathbb R}z^2\nu(dz)$. Hence we must have

\[ 0 \le \psi(x) \le \frac1\rho x^2 + \frac{b}{\rho^2} \quad \text{for all } x. \tag{9.2.29} \]
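The closed form in (9.2.28) rests on $E^x[X^2(t)] = x^2 + tb$ and on the elementary integrals $\int_0^\infty e^{-\rho t}dt = 1/\rho$ and $\int_0^\infty te^{-\rho t}dt = 1/\rho^2$. As a sanity check, the sketch below compares the closed form with a trapezoidal approximation of the time integral; the truncation horizon, the grid size, and the parameter values are arbitrary choices made for illustration.

```python
import math


def closed_form_cost(x, rho, b, s=0.0):
    """e^{-rho s} (x^2 / rho + b / rho^2), the right-hand side of (9.2.28)."""
    return math.exp(-rho * s) * (x * x / rho + b / rho ** 2)


def numerical_cost(x, rho, b, s=0.0, horizon=200.0, n=200_000):
    """Trapezoidal approximation of e^{-rho s} * int_0^T e^{-rho t}(x^2 + t b) dt."""
    dt = horizon / n
    total = 0.0
    for i in range(n + 1):
        t = i * dt
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * math.exp(-rho * t) * (x * x + t * b)
    return math.exp(-rho * s) * total * dt


# Hypothetical values: b = 1 + int z^2 nu(dz) = 1.5, rho = 0.5, x = 1.
print(closed_form_cost(1.0, 0.5, 1.5), numerical_cost(1.0, 0.5, 1.5))
```

The two numbers agree to several decimal places, which is consistent with the upper bound used in (9.2.29).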


Comparing this with (9.2.27) we see that we must have $C_2 = 0$. Hence $C_1 \le 0$. So we put

\[ \psi(x) = \psi_0(x) := \frac1\rho x^2 + \frac{b}{\rho^2} - ae^{r_1x} \quad \text{for } x \le x^*, \tag{9.2.30} \]

where $a = -C_1$ remains to be determined. We guess that a > 0. To determine a we first find ψ for $x > x^*$ and then require ψ to be $C^1$ at $x = x^*$. By (ii) and (9.2.24) we know that for $x > x^*$ we have

\[ \psi(x) = M\psi(x) := \inf\{\psi(x-\zeta) + c + \lambda\zeta;\ \zeta > 0\}. \tag{9.2.31} \]

The first-order condition for a minimum $\hat\zeta = \hat\zeta(x)$ of the function

\[ G(\zeta) := \psi(x-\zeta) + c + \lambda\zeta, \qquad \zeta > 0, \]

is

\[ \psi'(x - \hat\zeta) = \lambda. \]

Suppose there is a unique point $\hat x \in (0, x^*)$ such that

\[ \psi'(\hat x) = \lambda. \tag{9.2.32} \]

Then

\[ \hat x = x - \hat\zeta(x), \quad\text{i.e.,}\quad \hat\zeta(x) = x - \hat x, \]

and from (9.2.31) we deduce that

\[ \psi(x) = \psi_0(\hat x) + c + \lambda(x - \hat x) \quad \text{for } x \ge x^*. \]

In particular,

\[ \psi'(x^*) = \lambda \tag{9.2.33} \]

and

\[ \psi(x^*) = \psi_0(\hat x) + c + \lambda(x^* - \hat x). \tag{9.2.34} \]

To summarize we put

\[ \psi(x) = \begin{cases} \dfrac1\rho x^2 + \dfrac{b}{\rho^2} - ae^{r_1x}, & \text{for } x \le x^*, \\[4pt] \psi_0(\hat x) + c + \lambda(x - \hat x), & \text{for } x > x^*, \end{cases} \tag{9.2.35} \]

where $\hat x$, $x^*$, and a are determined by (9.2.32)–(9.2.34), i.e.,

\[ ar_1e^{r_1\hat x} = \frac2\rho\hat x - \lambda \quad (\text{i.e., } \psi'(\hat x) = \lambda), \tag{9.2.36} \]

\[ ar_1e^{r_1x^*} = \frac2\rho x^* - \lambda \quad (\text{i.e., } \psi'(x^*) = \lambda), \tag{9.2.37} \]

\[ ae^{r_1x^*} - ae^{r_1\hat x} = \frac1\rho\big((x^*)^2 - \hat x^2\big) - c - \lambda(x^* - \hat x). \tag{9.2.38} \]

One can now prove (see [Ø2, Theorem 2.5]):

For each c > 0 there exist $a = a^*(c) > 0$, $\hat x = \hat x(c) > 0$, and $x^* = x^*(c) > \hat x$ such that (9.2.36)–(9.2.38) hold. With this choice of a, $\hat x$, and $x^*$, the function $\varphi(s,x) = e^{-\rho s}\psi(x)$ with ψ given by (9.2.35) coincides with the value function Φ defined in (9.2.22). Moreover, the optimal impulse control $v^* = (\tau_1^*,\tau_2^*,\ldots;\zeta_1^*,\zeta_2^*,\ldots)$ is to do nothing while $X(t) < x^*$, then move X(t) from $x^*$ down to $\hat x$ (i.e., apply $\zeta_1^* = x^* - \hat x$) at the first time $\tau_1^*$ when X(t) reaches a value $\ge x^*$, then wait until the next time, $\tau_2^*$, X(t) again reaches the value $x^*$, etc.

Remark 9.6 In [Ø2] this result is used to study how the value function $\Phi(s,x) = \Phi_c(s,x)$ depends on the fixed part c > 0 of the intervention cost. It is proved that the function $c \mapsto \Phi_c(s,x)$ is continuous but not differentiable at c = 0. In fact, we have

\[ \frac{\partial}{\partial c}\Phi_c(s,x) \to \infty \quad \text{as } c \to 0^+. \]

Subsequently this high c-sensitivity of the value function for c close to 0 was proved for other processes as well. See [ØUZ].

Remark 9.7 For applications of impulse control theory in inventory control see, e.g., [S1, S2] and the references therein.

9.3 Exercises

Exercise* 9.1 Solve the impulse control problem

\[ \Phi(s,x) = \inf_{v} J^{(v)}(s,x) = J^{(v^*)}(s,x), \]

where

\[ J^{(v)}(s,x) = E^{x}\Big[\int_0^{\infty} e^{-\rho(s+t)}(X^{(v)}(t))^2\,dt + \sum_{k=1}^{N} e^{-\rho(s+\tau_k)}(c + \lambda|\zeta_k|)\Big]. \]

The inf is taken over all impulse controls $v = (\tau_1,\tau_2,\ldots;\zeta_1,\zeta_2,\ldots)$ with $\zeta_i \in \mathbb R$ and the corresponding process $X^{(v)}(t)$ is given by

\[ dX^{(v)}(t) = dB(t) + \int_{\mathbb R}\theta(X^{(v)}(t),z)\,\tilde N(dt,dz), \qquad \tau_k < t < \tau_{k+1}, \]

\[ X^{(v)}(\tau_{k+1}) = \check X^{(v)}(\tau_{k+1}^-) + \zeta_{k+1}, \qquad k = 0, 1, 2, \ldots, \]

where B(0) = 0, $x \in \mathbb R$, and we assume that there exists ξ > 0 such that

\[ \theta(x,z) = 0 \quad \text{for a.a. } z \text{ if } |x| < \xi, \tag{9.3.1} \]

\[ \theta(x,z) \ge 0 \text{ if } x \ge \xi \quad\text{and}\quad \theta(x,z) \le 0 \text{ if } x \le -\xi \quad \text{for a.a. } z. \tag{9.3.2} \]

Exercise* 9.2 (Optimal Stream of Dividends with Transaction Costs from a Geometric Lévy Process) For $v = (\tau_1,\tau_2,\ldots;\zeta_1,\zeta_2,\ldots)$ with $\zeta_i \in \mathbb R_+$ define $X^{(v)}(t)$ by

\[ dX^{(v)}(t) = \mu X^{(v)}(t)\,dt + \sigma X^{(v)}(t)\,dB(t) + \theta X^{(v)}(t^-)\int_{\mathbb R} z\,\tilde N(dt,dz), \qquad \tau_i \le t \le \tau_{i+1}, \]

\[ X^{(v)}(\tau_{i+1}) = \check X^{(v)}(\tau_{i+1}^-) - (1+\lambda)\zeta_{i+1} - c, \qquad i = 0, 1, 2, \ldots, \]

\[ X^{(v)}(0^-) = x > 0, \]

where μ, σ ≠ 0, θ, λ ≥ 0, and c > 0 are constants (see (9.1.5)), −1 ≤ θz ≤ 0 a.s. ν. Find Φ and $v^*$ such that

\[ \Phi(s,x) = \sup_{v} J^{(v)}(s,x) = J^{(v^*)}(s,x). \]

Here

\[ J^{(v)}(s,x) = E^{x}\Big[\sum_{\tau_k < \tau_{\mathcal S}} e^{-\rho(s+\tau_k)}\zeta_k\Big] \qquad (\rho > 0 \text{ constant}) \]

and $\tau_{\mathcal S} = \inf\{t > 0;\ X^{(v)}(t) \le 0\}$ is the time of bankruptcy. (See also Exercise 10.2.)

Exercise* 9.3 (Optimal Forest Management (Inspired by Y. Willassen [W])) Suppose the biomass of a forest at time t is given by

\[ X(t) = x + \mu t + \sigma B(t) + \theta\int_{\mathbb R} z\,\tilde N(t,dz), \]

where μ > 0, σ > 0, θ > 0 are constants and we assume that z ≤ 0 a.s. ν. At times $0 \le \tau_1 < \tau_2 < \cdots$ we decide to cut down the forest and replant it, with the cost

\[ c + \lambda\check X(\tau_k^-), \quad\text{with}\quad \check X(\tau_k^-) = X(\tau_k^-) + \Delta_N X(\tau_k), \]

where c > 0 and λ ∈ [0, 1) are constants and $\Delta_N X(t)$ is the (possible) jump in X at t coming from the jump in N(t, ·) only, not from the intervention. Find the sequence of stopping times $v = (\tau_1,\tau_2,\ldots)$ which maximizes the expected total discounted net profit $J^{(v)}(s,x)$ given by

\[ J^{(v)}(s,x) = E^{x}\Big[\sum_{k=1}^{\infty} e^{-\rho(s+\tau_k)}\big(\check X(\tau_k^-) - c - \lambda\check X(\tau_k^-)\big)\Big], \]

where ρ > 0 is a given discounting exponent.

Chapter 10

Approximating Impulse Control by Iterated Optimal Stopping

10.1 Iterative Scheme

In general it is not possible to reduce impulse control to optimal stopping, because the choice of the first intervention time $\tau_1$ and the first impulse $\zeta_1$ will influence the next and so on. However, if we allow only (up to) a fixed finite number n of interventions, then the corresponding impulse control problem can be solved by solving iteratively n optimal stopping problems. Moreover, if we restrict the number of interventions to (at most) n in a given impulse control problem, then the value function of this restricted problem will converge to the value function of the original problem as n → ∞. Thus it is possible to reduce a given impulse control problem to a sequence of iterated optimal stopping problems. This is useful both for theoretical purposes and numerical applications. We now make this more precise. Using the notation of Chap. 9 consider the impulse control problem

\[ \Phi(y) = \sup\{J^{(v)}(y);\ v \in \mathcal V\} = J^{(v^*)}(y), \qquad y \in \mathcal S, \tag{10.1.1} \]

/ S}, where, with τS = τS(v) = inf{t > 0; Y (v) (t) ∈ J (v) (y) =E



τS

y 0

f (Y

(v)

(t))dt + g(Y

(v)

(τS ))χ{τS 0 is a root of the equation  K (r ) :=

$\int_{\mathbb R}\{e^{rz} - 1 - rz\}\nu(dz) + \frac12r^2 - \rho = 0.$

See Fig. 10.2. If we require $\psi_1$ to be continuous and differentiable at $x = x_1^*$, we get the following two equations to determine $x_1^*$ and $a_1$:

\[ \frac1\rho(x_1^*)^2 + \frac{b}{\rho^2} - a_1\exp(rx_1^*) = \frac{b}{\rho^2} + c, \tag{10.2.9} \]

\[ \frac2\rho x_1^* - a_1r\exp(rx_1^*) = 0. \tag{10.2.10} \]

Fig. 10.2 The graph of $\psi_1(x)$ (with the points $\hat x_1$ and $x_1^*$ marked on the x-axis)

Combining these two equations we get

\[ x_1^* = \frac1r\Big(1 + \sqrt{1 + c\rho r^2}\Big) \tag{10.2.11} \]

and

\[ a_1 = \frac{2x_1^*}{\rho r}\exp(-rx_1^*). \tag{10.2.12} \]
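Formulas (10.2.11) and (10.2.12) are easy to check numerically: substituting $a_1\exp(rx_1^*) = 2x_1^*/(\rho r)$ from (10.2.10) into (10.2.9) gives the quadratic $x^2 - 2x/r - c\rho = 0$, whose larger root is (10.2.11). The sketch below evaluates both formulas and the residuals of the two smooth-fit equations. The parameter values are hypothetical, and the root r is computed under the assumption ν = 0 (no jumps), in which case K(r) = 0 reduces to $r = \sqrt{2\rho}$.

```python
import math


def first_threshold(c, rho, r):
    """Return (x1_star, a1) from (10.2.11)-(10.2.12)."""
    x1_star = (1.0 + math.sqrt(1.0 + c * rho * r * r)) / r
    a1 = 2.0 * x1_star / (rho * r) * math.exp(-r * x1_star)
    return x1_star, a1


def smooth_fit_residuals(x, a, c, rho, r):
    """Residuals of (10.2.9) and (10.2.10); the b / rho^2 terms cancel."""
    value_gap = x * x / rho - a * math.exp(r * x) - c
    slope_gap = 2.0 * x / rho - a * r * math.exp(r * x)
    return value_gap, slope_gap


rho = 0.1
r = math.sqrt(2.0 * rho)  # positive root of K(r) = 0 when nu = 0 (assumption)
x1_star, a1 = first_threshold(c=1.0, rho=rho, r=r)
print(x1_star, a1, smooth_fit_residuals(x1_star, a1, 1.0, rho, r))
```

Both residuals vanish up to rounding error, so the pair $(x_1^*, a_1)$ does satisfy continuity and differentiability of $\psi_1$ at $x_1^*$.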

If we choose these values of $x_1^*$ and $a_1$, we can now verify that, under some additional assumptions on ν, the candidate $\psi_1$ given by (10.2.8) satisfies all the conditions of the verification theorem and we conclude that $\psi_1 = \Psi_1$. We now repeat the procedure to find $\Psi_2$. First note that

\[ M\Psi_1(x) = \inf\{\Psi_1(x-\zeta) + c;\ \zeta > 0\} = \Psi_1(x\wedge\hat x_1) + c = \frac1\rho(x\wedge\hat x_1)^2 + \frac{b}{\rho^2} + c - a_1\exp(r(x\wedge\hat x_1)), \]

where $\hat x_1 \in \operatorname{Argmin}\Psi_1(x)$. Hence

\[ \Psi_2(x) = \inf_{\tau\ge0} E^{x}\Big[\int_0^{\tau} e^{-\rho t}X^2(t)\,dt + e^{-\rho\tau}\Big(\frac1\rho(X(\tau)\wedge\hat x_1)^2 + \frac{b}{\rho^2} + c - a_1\exp(r(X(\tau)\wedge\hat x_1))\Big)\Big]. \tag{10.2.13} \]

The same procedure as above leads us to the candidate

\[ \psi_2(x) = \begin{cases} \dfrac1\rho x^2 + \dfrac{b}{\rho^2} - a_2\exp(rx), & x < x_2^*, \\[4pt] \dfrac1\rho\hat x_1^2 + \dfrac{b}{\rho^2} + c - a_1\exp(r\hat x_1), & x \ge x_2^*, \end{cases} \tag{10.2.14} \]

where $x_2^*$ and $a_2$ are given by

\[ x_2^* = \frac1r\Big(1 + \sqrt{1 + r^2\big(\hat x_1^2 + \rho(c - a_1\exp(r\hat x_1))\big)}\Big), \tag{10.2.15} \]

\[ a_2 = \frac{2x_2^*}{\rho r}\exp(-rx_2^*). \tag{10.2.16} \]

If we choose these values of $x_2^*$ and $a_2 > 0$ we can verify that the candidate $\psi_2$ given by (10.2.14) coincides with $\Psi_2$. Note that $x_2^* < x_1^*$ and $a_2 > a_1$. Continuing this inductively we get a sequence of functions $\psi_n$ of the form

\[ \psi_n(x) = \begin{cases} \dfrac1\rho x^2 + \dfrac{b}{\rho^2} - a_n\exp(rx), & x < x_n^*, \\[4pt] \dfrac1\rho\hat x_{n-1}^2 + \dfrac{b}{\rho^2} + c - a_{n-1}\exp(r\hat x_{n-1}), & x \ge x_n^*, \end{cases} \tag{10.2.17} \]

where $x_n^*$ and $a_n$ are given by

\[ x_n^* = \frac1r\Big(1 + \sqrt{1 + r^2\big(\hat x_{n-1}^2 + \rho(c - a_{n-1}\exp(r\hat x_{n-1}))\big)}\Big) \tag{10.2.18} \]

with $\hat x_{n-1} \in \operatorname{Argmin}\Psi_{n-1}(x)$, and

\[ a_n = \frac{2x_n^*}{\rho r}\exp(-rx_n^*). \tag{10.2.19} \]

We find that $x_n^* < x_{n-1}^*$ and $a_n > a_{n-1}$. Using the verification theorem we conclude that $\psi_n = \Psi_n$. In the limiting case when there is no bound on the number of interventions the corresponding value function will be

\[ \Psi(x) = \begin{cases} \dfrac1\rho x^2 + \dfrac{b}{\rho^2} - a\exp(rx), & x < x^*, \\[4pt] \dfrac1\rho\hat x^2 + \dfrac{b}{\rho^2} + c - a\exp(r\hat x), & x \ge x^*, \end{cases} \tag{10.2.20} \]

where $x^*$ and a solve the coupled system of equations

\[ x^* = \frac1r\Big(1 + \sqrt{1 + r^2\big(\hat x^2 + \rho(c - a\exp(r\hat x))\big)}\Big), \tag{10.2.21} \]

\[ a = \frac{2x^*}{\rho r}\exp(-rx^*), \tag{10.2.22} \]

where $\hat x \in \operatorname{Argmin}\Psi(x)$. It is easy to see that this system has a unique solution $x^* > 0$, a > 0, and that $x^* < x_n^*$ and $a > a_n$ for all n. The situation in the case n = 3 is shown in Fig. 10.3. Note that with only three interventions allowed the optimal strategy is first to wait until the first time $\tau_1^*$ when $X(t) \ge x_3^*$, then move X(t) down to $\hat x_2$, next wait until the first time $\tau_2^* > \tau_1^*$ when $X(t) \ge x_2^*$, then move X(t) down to $\hat x_1$ and finally wait until the first time $t = \tau_3^* > \tau_2^*$ when $X(t) \ge x_1^*$ before making the last intervention, which is moving X(t) down to $\hat x_0 = 0$.

Fig. 10.3 The optimal impulse control for $\Psi_3$ (Example 10.9) (sample path of X(t) with the thresholds $x_1^* > x_2^* > x_3^* > x^*$, the restart levels $\hat x_2$, $\hat x_1$, $\hat x_0 = 0$, the continuation regions $D_1$, $D_2$, $D_3$, D, and the intervention times $\tau_1^*, \tau_2^*, \tau_3^*$)
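The recursion (10.2.18)–(10.2.19) can be iterated numerically. The sketch below is an illustration under assumptions that go beyond the text: r is treated as a given input, $\hat x_0 = 0$ and $a_0 = 0$ are used to start the recursion (so that the first step reproduces (10.2.11)), and $\hat x_n$ is identified with the stationary point of $\psi_n$ in (0, 1/r), i.e., the smaller root of $2x/\rho = a_nr\exp(rx)$, which is found by bisection.

```python
import math


def bisect(f, lo, hi, tol=1e-13):
    """Bisection for a root of f on [lo, hi]; f(lo), f(hi) must differ in sign."""
    f_lo = f(lo)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def iterate_thresholds(c, rho, r, n_max):
    """Return lists (x_star, a, x_hat) for n = 1..n_max from (10.2.18)-(10.2.19)."""
    x_stars, a_vals, x_hats = [], [], []
    x_hat_prev, a_prev = 0.0, 0.0  # psi_0(x) = x^2/rho + b/rho^2 is minimal at 0
    for _ in range(n_max):
        rhs = x_hat_prev ** 2 + rho * (c - a_prev * math.exp(r * x_hat_prev))
        x_star = (1.0 + math.sqrt(1.0 + r * r * rhs)) / r
        a = 2.0 * x_star / (rho * r) * math.exp(-r * x_star)

        def slope(x, a=a):
            # derivative of x^2/rho + b/rho^2 - a exp(r x)
            return 2.0 * x / rho - a * r * math.exp(r * x)

        x_hat = bisect(slope, 0.0, 1.0 / r)  # assumed location of the argmin
        x_stars.append(x_star)
        a_vals.append(a)
        x_hats.append(x_hat)
        x_hat_prev, a_prev = x_hat, a
    return x_stars, a_vals, x_hats


# Hypothetical parameters; r = sqrt(2 rho) corresponds to nu = 0 (no jumps).
x_stars, a_vals, x_hats = iterate_thresholds(
    c=1.0, rho=0.1, r=math.sqrt(0.2), n_max=4)
print(x_stars, a_vals, x_hats)
```

For these values the computed thresholds decrease and the coefficients increase with n, matching the ordering $x_n^* < x_{n-1}^*$ and $a_n > a_{n-1}$ stated above.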


10.3 Exercises

Exercise* 10.1 (Optimal Forest Management Revisited) Using the notation of Exercise 9.3 let

\[ \Phi(x) = \sup\{J^{(v)}(x);\ v = (\tau_1,\tau_2,\ldots)\} \]

be the value function when there are no restrictions on the number of interventions. For n = 1, 2, ... let

\[ \Phi_n(x) = \sup\{J^{(v)}(x);\ v = (\tau_1,\tau_2,\ldots,\tau_k);\ k \le n\} \]

be the value function when up to n interventions are allowed. Use Theorem 10.2 to find $\Phi_1$ and $\Phi_2$.

Exercise* 10.2 (Optimal Stream of Dividends with Transaction Costs from a Geometric Lévy Process) This is an addition to Exercise 9.2. Suppose that we at times $0 \le \tau_1 \le \tau_2 \le \cdots$ decide to take out dividends of sizes $\zeta_1, \zeta_2, \ldots \in (0,\infty)$ from an economic quantity growing like a geometric Lévy process. If we let $X^{(v)}(t)$ denote the size at time t of this quantity when the dividend policy $v = (\tau_1,\tau_2,\ldots;\zeta_1,\zeta_2,\ldots)$ is applied, we assume that $X^{(v)}(t)$ is described by (see (9.2.4))

\[ dX^{(v)}(t) = \mu X^{(v)}(t)\,dt + \sigma X^{(v)}(t)\,dB(t) + \theta X^{(v)}(t^-)\int_{\mathbb R} z\,\tilde N(dt,dz), \qquad \tau_i \le t < \tau_{i+1}, \]

\[ X^{(v)}(\tau_{i+1}) = \check X^{(v)}(\tau_{i+1}^-) - (1+\lambda)\zeta_{i+1} - c, \qquad i = 0, 1, 2, \ldots \quad \text{(see (9.1.5))}, \]

where μ, σ ≠ 0, θ, λ ≥ 0, and c > 0 are constants, −1 ≤ θz ≤ 0 a.s. (ν). Let

\[ \Phi(x) = \sup\{J^{(v)}(x);\ v = (\tau_1,\tau_2,\ldots;\zeta_1,\zeta_2,\ldots)\} \]

and

\[ \Phi_n(x) = \sup\{J^{(v)}(x);\ v = (\tau_1,\tau_2,\ldots,\tau_k;\zeta_1,\zeta_2,\ldots,\zeta_k);\ k \le n\} \]

be the value function with no restrictions on the number of interventions and with at most n interventions, respectively, where

\[ J^{(v)}(x) = E^{x}\Big[\sum_{\tau_k < \tau_{\mathcal S}} e^{-\rho\tau_k}\zeta_k\Big] \qquad (\rho > 0 \text{ constant}) \]

and $\tau_{\mathcal S} = \inf\{t > 0;\ X^{(v)}(t) \le 0\}$ is the time of bankruptcy. Show that

\[ \Phi(x) = \Phi_n(x) = \Phi_1(x) \quad \text{for all } n. \]

Thus in this case we achieve the optimal result with just one intervention.

Chapter 11

Combined Stochastic Control and Impulse Control of Jump Diffusions

© Springer Nature Switzerland AG 2019 B. Øksendal and A. Sulem, Applied Stochastic Control of Jump Diffusions, Universitext, https://doi.org/10.1007/978-3-030-02781-0_11

11.1 A Verification Theorem

Consider the general situation in Chap. 9, except that now we assume that we, in addition, are free at any state $y \in \mathbb{R}^k$ to choose a Markov control $u(y) \in U$, where $U$ is a given closed convex set in $\mathbb{R}^p$. Let $\mathcal{U}$ be a given family of such Markov controls $u$. If, as before, $v = (\tau_1, \tau_2, \ldots; \zeta_1, \zeta_2, \ldots) \in \mathcal{V}$ denotes a given impulse control we call $w := (u, v)$ a combined control. If $w = (u, v)$ is applied, we assume that the corresponding state $Y(t) = Y^{(w)}(t)$ at time $t$ is given by (see (5.1.1) and (9.1.2)–(9.1.5))

\[ Y(0^-) = y, \]

\[ dY(t) = b(Y(t), u(t))\,dt + \sigma(Y(t), u(t))\,dB(t) + \int_{\mathbb{R}^\ell} \gamma(Y(t^-), u(t^-), z)\,\tilde{N}(dt, dz), \quad \tau_j < t < \tau_{j+1}, \tag{11.1.1} \]

\[ Y(\tau_{j+1}) = \Gamma(\check{Y}(\tau_{j+1}^-), \zeta_{j+1}), \quad j = 0, 1, 2, \ldots \tag{11.1.2} \]

where $u(t) = u(Y(t))$ and $b : \mathbb{R}^k \times U \to \mathbb{R}^k$, $\sigma : \mathbb{R}^k \times U \to \mathbb{R}^{k \times m}$, and $\gamma : \mathbb{R}^k \times U \times \mathbb{R}^\ell \to \mathbb{R}^{k \times \ell}$ are given continuous functions, $\tau_0 = 0$. As before we let our "universe" $S$ be a fixed Borel set in $\mathbb{R}^k$ such that $S \subset \overline{S^0}$ and we define

\[ \tau^* = \lim_{R \to \infty} \inf\{ t > 0;\ |Y^{(w)}(t)| \ge R \} \le \infty \tag{11.1.3} \]

and

\[ \tau_S = \inf\{ t \in (0, \tau^*(\omega));\ Y^{(w)}(t, \omega) \notin S \}. \tag{11.1.4} \]

If $Y^{(w)}(t, \omega) \in S$ for all $t < \tau^*$ we put $\tau_S = \tau^*$. We assume that we are given a set $\mathcal{W} = \mathcal{U} \times \mathcal{V}$ of admissible combined controls $w = (u, v)$ which is included in the set of combined controls $w = (u, v)$ such that a unique strong solution $Y^{(w)}(t)$ of (11.1.1) and (11.1.2) exists and

\[ \tau^* = \infty \quad \text{and} \quad \lim_{j \to \infty} \tau_j = \tau_S \quad \text{a.s. } Q^y \text{ for all } y \in \mathbb{R}^k. \tag{11.1.5} \]

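Define the performance or total expected profit/utility of $w = (u, v) \in \mathcal{W}$, $v = (\tau_1, \tau_2, \ldots; \zeta_1, \zeta_2, \ldots)$, by

\[ J^{(w)}(y) = E^y\Big[ \int_0^{\tau_S} f(Y^{(w)}(t), u(t))\,dt + g(Y^{(w)}(\tau_S))\,\chi_{\{\tau_S < \infty\}} + \cdots \Big] \]

[…]

\[ D = \{ y \in S;\ \phi(y) > \mathcal{M}\phi(y) \} \quad \text{(the continuation region).} \tag{11.1.10} \]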

(11.1.10)

Suppose that $Y^{(w)}(t)$ spends 0 time on $\partial D$ a.s., i.e.,

(iii) $E^y\big[ \int_0^{\tau_S} \chi_{\partial D}(Y^{(w)}(t))\,dt \big] = 0$ for all $y \in S$, $w \in \mathcal{W}$

and suppose that

(iv) $\partial D$ is a Lipschitz surface,
(v) $\phi \in C^2(S^0 \setminus \partial D)$ and the second-order derivatives of $\phi$ are locally bounded near $\partial D$,
(vi) $L^\alpha \phi(y) + f(y, \alpha) \le 0$ for all $\alpha \in U$, $y \in S^0 \setminus \partial D$,
(vii) $\phi(y) = g(y)$ for all $y \notin S$,
(viii) the family $\{ \phi^-(Y^{(w)}(\tau));\ \tau \in \mathcal{T} \}$ is uniformly $Q^y$-integrable for all $y \in S$, $w \in \mathcal{W}$,
(ix) $E^y\big[ |\phi(Y(\tau))| + \int_0^{\tau_S} |A\phi(Y(t))|\,dt \big] < \infty$ for all $\tau \in \mathcal{T}$, $w \in \mathcal{W}$, $y \in S$.

Then $\phi(y) \ge \Phi(y)$ for all $y \in S$.

(b) Suppose in addition that

(x) there exists a function $\hat{u} : S \to \mathbb{R}$ such that
\[ L^{\hat{u}(y)} \phi(y) + f(y, \hat{u}(y)) = 0 \quad \text{for all } y \in D \]
and
(xi) $\bar{\zeta}(y) \in \operatorname{argmax}\{ \phi(\Gamma(y, \cdot)) + K(y, \cdot) \}$ exists for all $y \in S$ and $\bar{\zeta}(\cdot)$ is a Borel measurable selection.

Define an impulse control $\hat{v} = (\hat{\tau}_1, \hat{\tau}_2, \ldots; \hat{\zeta}_1, \hat{\zeta}_2, \ldots)$ as follows. Put $\hat{\tau}_0 = 0$ and inductively

\[ \hat{\tau}_{k+1} = \inf\{ t > \hat{\tau}_k;\ Y^{(k)}(t) \notin D \}, \qquad \hat{\zeta}_{k+1} = \bar{\zeta}(Y^{(k)}(\hat{\tau}_{k+1}^-)), \quad k = 0, 1, \ldots, \]

where $Y^{(k)}(t)$ is the result of applying the combined control $\hat{w}_k := (\hat{u}, (\hat{\tau}_1, \ldots, \hat{\tau}_k; \hat{\zeta}_1, \ldots, \hat{\zeta}_k))$. Put $\hat{w} = (\hat{u}, \hat{v})$. Suppose $\hat{w} \in \mathcal{W}$ and that

(xii) $\{ \phi(Y^{(\hat{w})}(\tau));\ \tau \in \mathcal{T},\ \tau \le \tau_D \}$ is $Q^y$-uniformly integrable for all $y \in S$.

Then $\phi(y) = \Phi(y)$ for all $y \in S$ and $\hat{w} \in \mathcal{W}$ is an optimal combined control.

Proof The proof is similar to the proof of Theorem 9.2 and is omitted.



11.2 Examples

Example 11.2 (Optimal Combined Control of the Exchange Rate) This example is a simplification of a model studied in [MuØ]. Suppose a government has two means of influencing the foreign exchange rate of its own currency:

1. At all times the government can choose the domestic interest rate $r$.
2. At times selected by the government it can intervene in the foreign exchange market by buying or selling large amounts of foreign currency.

Let $r(t)$ denote the interest rate chosen and let $\tau_1, \tau_2, \ldots$ be the (stopping) times when it is decided to intervene, with corresponding amounts $\zeta_1, \zeta_2, \ldots$ If $\zeta > 0$ the government buys foreign currency, if $\zeta < 0$ it sells. Let $v = (\tau_1, \tau_2, \ldots; \zeta_1, \zeta_2, \ldots)$ be the corresponding impulse control. If the combined control $w = (r, v)$ is applied, we assume that the corresponding exchange rate $X(t)$ (measured in the number of domestic monetary units it takes to buy one average foreign monetary unit) is given by

\[ X(t) = x - \int_0^t F(r(s) - \bar{r}(s))\,ds + \sigma B(t) + \sum_{j:\,\tau_j \le t} \gamma(\zeta_j), \quad t \ge 0, \tag{11.2.1} \]

where $\sigma > 0$ is a constant, $\bar{r}(s)$ is the (average) foreign interest rate, and $F : \mathbb{R} \to \mathbb{R}$ and $\gamma : \mathbb{R} \to \mathbb{R}$ are known functions which give the effects on the exchange rate by the interest rate differential $r(s) - \bar{r}(s)$ and the amount $\zeta_j$, respectively. The total expected cost of applying the combined control $w = (r, v)$ is assumed to be of the form

\[ J^{(w)}(s, x) = E^x\Big[ \int_s^T e^{-\rho t} \{ M(X(t) - m) + N(r(t) - \bar{r}(t)) \}\,dt + \sum_{j;\,\tau_j \le T} L(\zeta_j) e^{-\rho \tau_j} \Big], \tag{11.2.2} \]

where $M(X(t) - m)$ and $N(r(t) - \bar{r}(t))$ give the costs incurred by the difference $X(t) - m$ between $X(t)$ and an optimal value $m$, and by the difference $r(t) - \bar{r}(t)$ between the domestic and the average foreign interest rate $\bar{r}(t)$, respectively. The cost of buying/selling the amount $\zeta_j$ is $L(\zeta_j)$ and $\rho > 0$ is a constant discounting exponent. The problem is to find $\Phi(s, x)$ and $w^* = (r^*, v^*)$ such that

\[ \Phi(s, x) = \inf_w J^{(w)}(s, x) = J^{(w^*)}(s, x). \tag{11.2.3} \]
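To make the roles of the terms in (11.2.1) and (11.2.2) concrete, the following is a minimal Euler-type sketch that evaluates one sample of the discounted cost for a given policy, started at s = 0. It is an illustration under stated assumptions, not the method of [MuØ]: the function name `simulate_cost`, the policy callbacks, the time discretization, and the use of a constant foreign rate `r_bar` are all hypothetical choices, and the functions F, γ, M, N, L are passed in by the caller.

```python
import math
import random


def simulate_cost(x0, m, r_bar, rate_policy, impulse_policy,
                  F, gamma, M, N, L, sigma, rho, T, n_steps, rng):
    """One Euler sample of the discounted cost in (11.2.2), started at s = 0."""
    dt = T / n_steps
    x, cost = x0, 0.0
    for i in range(n_steps):
        t = i * dt
        # Impulse part: a return value of 0.0 means "do not intervene now".
        zeta = impulse_policy(t, x)
        if zeta != 0.0:
            cost += math.exp(-rho * t) * L(zeta)
            x += gamma(zeta)
        # Continuous part: running cost and controlled drift, as in (11.2.1).
        r = rate_policy(t, x)
        cost += math.exp(-rho * t) * (M(x - m) + N(r - r_bar)) * dt
        x += -F(r - r_bar) * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
    return cost
```

Averaging `simulate_cost` over many independent samples gives a Monte Carlo estimate of the cost of one fixed policy; it does not by itself find the minimizing control in (11.2.3).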

Since this is a minimum problem, the corresponding HJBQVIs of Theorem 11.1 are changed to minima also and they get the form

\[ \min\Big( \inf_{r \in \mathbb{R}} \Big\{ e^{-\rho s}\big( M(x - m) + N(r - \bar{r}(s)) \big) + \frac{\partial \phi}{\partial s} - F(r - \bar{r}(s)) \frac{\partial \phi}{\partial x} + \frac{1}{2}\sigma^2 \frac{\partial^2 \phi}{\partial x^2} \Big\},\ \mathcal{M}\phi(s, x) - \phi(s, x) \Big) = 0, \tag{11.2.4} \]

where

\[ \mathcal{M}\phi(s, x) = \inf_{\zeta \in \mathbb{R}} \{ \phi(s, x + \gamma(\zeta)) + e^{-\rho s} L(\zeta) \}. \tag{11.2.5} \]

In general this is difficult to solve for $\phi$, even for simple choices of the functions $M$, $N$, and $F$. A detailed discussion of a special case can be found in [MuØ].

Example 11.3 (Optimal Consumption and Portfolio with Both Fixed and Proportional Transaction Costs (1)) This application is studied in [ØS]. Suppose there are two investment possibilities, say a bank account and a stock. Let $X_1(t)$ and $X_2(t)$ denote the amount of money invested in these two assets, respectively, at time $t$. In the absence of consumption and transactions suppose that

\[ dX_1(t) = r X_1(t)\,dt \tag{11.2.6} \]

and

\[ dX_2(t) = X_2(t^-) \Big[ \mu\,dt + \sigma\,dB(t) + \theta \int_{\mathbb{R}} z\,\tilde{N}(dt, dz) \Big], \tag{11.2.7} \]

where $r$, $\mu$, $\theta$, and $\sigma \ne 0$ are constants and

\[ \mu > r > 0, \quad \theta z \ge -1 \ \text{a.s. } \nu. \tag{11.2.8} \]

Suppose that at any time $t$ the investor is free to choose a consumption rate $u(t) \ge 0$. This consumption is automatically drawn from the bank account holding with no extra costs. In addition the investor may at any time transfer money from the bank to the stock and conversely. Suppose that such a transaction of size $\zeta$ incurs a transaction cost given by

\[ c + \lambda|\zeta|, \tag{11.2.9} \]

where $c > 0$ and $\lambda \in [0, 1)$ are constants. (If $\zeta > 0$ we buy stocks and if $\zeta < 0$ we sell stocks.) Thus the control of the investor consists of a combination of a stochastic control $u(t)$ and an impulse control $v = (\tau_1, \tau_2, \ldots; \zeta_1, \zeta_2, \ldots)$, where $\tau_1, \tau_2, \ldots$ are the chosen transaction times and $\zeta_1, \zeta_2, \ldots$ are the corresponding transaction amounts. If such a combined control $w = (u, v)$ is applied, the corresponding system $(X_1(t), X_2(t)) = (X_1^{(w)}(t), X_2^{(w)}(t))$ gets the form

\[ dX_1(t) = (r X_1(t) - u(t))\,dt, \quad \tau_i < t < \tau_{i+1}, \tag{11.2.10} \]

\[ dX_2(t) = X_2(t^-) \Big[ \mu\,dt + \sigma\,dB(t) + \theta \int_{\mathbb{R}} z\,\tilde{N}(dt, dz) \Big], \quad \tau_i < t < \tau_{i+1}, \tag{11.2.11} \]

\[ X_1(\tau_{i+1}) = X_1(\tau_{i+1}^-) - \zeta_{i+1} - c - \lambda|\zeta_{i+1}|, \tag{11.2.12} \]

\[ X_2(\tau_{i+1}) = \check{X}_2(\tau_{i+1}^-) + \zeta_{i+1}. \tag{11.2.13} \]

If we do not allow any negative amounts held in the bank account or in the stock, the solvency region $S$ is given by

\[ S = [0, \infty) \times [0, \infty). \tag{11.2.14} \]

We call $w = (u, v)$ admissible if $X^{(w)}(t) := (X_1^{(w)}(t), X_2^{(w)}(t))$ exists for all $t$. The set of admissible controls is denoted by $\mathcal{W}$. Let

\[ \tau_S = \inf\{ t > 0;\ X^{(w)}(t) \notin S \} \]

be the bankruptcy time. The investor's objective is to maximize

\[ J^{(w)}(y) = E^y\Big[ \int_0^{\tau_S} e^{-\rho(s + t)} \frac{u^\gamma(t)}{\gamma}\,dt \Big], \tag{11.2.15} \]

where $\rho > 0$ and $\gamma \in (0, 1)$ are constants and $E^y$ with $y = (s, x_1, x_2)$ denotes the expectation when $X_1(0^-) = x_1 \ge 0$ and $X_2(0^-) = x_2 \ge 0$. Thus we seek the value function $\Phi(y)$ and an optimal control $w^* = (u^*, v^*) \in \mathcal{W}$ such that

\[ \Phi(y) = \sup_{w \in \mathcal{W}} J^{(w)}(y) = J^{(w^*)}(y). \tag{11.2.16} \]

This problem may be regarded as a generalization of optimal consumption and portfolio problems studied by Merton [M] and Davis and Norman [DN] (see also Shreve and Soner [SS]). Merton [M] considers the case with no jumps and no transaction costs (c = λ = θ = 0). The problem then reduces to an ordinary stochastic control problem and it is optimal to keep the positions (X1(t), X2(t)) on the line y = (π∗/(1 − π∗))x in the (x, y)-plane at all times (the Merton line), where π∗ = (μ − r)/((1 − γ)σ²) (see Example 5.2).

Fig. 11.1 The no-transaction cone (no fixed cost: c = 0), θ = 0. Axes: x1 (horizontal), x2 (vertical). The cone is bounded by the lines Γ1 (buy) and Γ2 (sell) and contains the Merton line x2 = (π∗/(1 − π∗)) x1.

Davis and Norman [DN] and Shreve and Soner [SS] consider the case when the cost is proportional ($\lambda > 0$), with no fixed component ($c = 0$) and no jumps ($\theta = 0$). In this case the problem can be formulated as a singular stochastic control problem and under some conditions it is proved that there exists a no-transaction cone $NT$ bounded by two straight lines $\Gamma_1$ and $\Gamma_2$ such that it is optimal to make no transactions if $(X_1(t), X_2(t)) \in NT$ and make transactions corresponding to local time at $\partial(NT)$, resulting in reflections back to $NT$ every time $(X_1(t), X_2(t)) \in \partial(NT)$. See Fig. 11.1. These results have subsequently been extended to jump diffusion markets by [FØS2] (see Example 8.1).

In the general combined control case numerical results indicate (see [CØS]) that the optimal control $w^* = (u^*, v^*)$ has the following form. There exist two pairs of lines, $(\Gamma_1, \hat{\Gamma}_1)$ and $(\Gamma_2, \hat{\Gamma}_2)$ from the origin such that the following is optimal. Make no transactions (only consume at the rate $u^*(t)$) while $(X_1(t), X_2(t))$ belongs to the region $D$ bounded by the outer curves $\Gamma_1$, $\Gamma_2$, and if $(X_1(t), X_2(t))$ hits $\partial D = \Gamma_1 \cup \Gamma_2$ then sell or buy so as to bring $(X_1(t), X_2(t))$ to the curve $\hat{\Gamma}_1$ or $\hat{\Gamma}_2$. See Fig. 11.2.

Note that if we sell stocks ($\zeta < 0$) then $(X_1(t), X_2(t)) = (x_1, x_2)$ moves to a point $(x_1', x_2') = (x_1'(\zeta), x_2'(\zeta))$ on the line

\[ x_1' + (1 - \lambda)x_2' = x_1 + (1 - \lambda)x_2 - c, \tag{11.2.17} \]

i.e., $(x_1', x_2')$ lies on the straight line through $(x_1 - c, x_2)$ with slope $-1/(1 - \lambda)$. Similarly, if we buy stocks ($\zeta > 0$) then the new position $(x_1', x_2') = (x_1'(\zeta), x_2'(\zeta))$ is on the line

\[ x_1' + (1 + \lambda)x_2' = x_1 + (1 + \lambda)x_2 - c, \tag{11.2.18} \]

i.e., $(x_1', x_2')$ lies on the straight line through $(x_1 - c, x_2)$ with slope $-1/(1 + \lambda)$.

Fig. 11.2 The no-transaction region D (c > 0). Axes: x1 (horizontal), x2 (vertical). D is bounded by Γ1 (buy) and Γ2 (sell); the curves Γ̂1 and Γ̂2 lie inside D.

If there are no interventions then the process

\[ Y(t) = \begin{bmatrix} s + t \\ X_1(t) \\ X_2(t) \end{bmatrix} \tag{11.2.19} \]

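has the generator

\[ L^u \phi(s, x_1, x_2) = \frac{\partial \phi}{\partial s} + (r x_1 - u)\frac{\partial \phi}{\partial x_1} + \mu x_2 \frac{\partial \phi}{\partial x_2} + \frac{1}{2}\sigma^2 x_2^2 \frac{\partial^2 \phi}{\partial x_2^2} + \int_{\mathbb{R}} \Big\{ \phi(s, x_1, x_2 + x_2\theta z) - \phi(s, x_1, x_2) - x_2 \theta z \frac{\partial \phi}{\partial x_2}(s, x_1, x_2) \Big\}\,\nu(dz). \tag{11.2.20} \]

Therefore, if we put $\phi(s, x_1, x_2) = e^{-\rho s}\psi(x_1, x_2)$ the corresponding HJBQVI is

\[ \max\Big( \sup_{u \ge 0} \Big\{ \frac{u^\gamma}{\gamma} - \rho\psi(x_1, x_2) + (r x_1 - u)\frac{\partial \psi}{\partial x_1} + \mu x_2 \frac{\partial \psi}{\partial x_2} + \frac{1}{2}\sigma^2 x_2^2 \frac{\partial^2 \psi}{\partial x_2^2} + \int_{\mathbb{R}} \Big\{ \psi(x_1, x_2 + x_2\theta z) - \psi(x_1, x_2) - x_2\theta z \frac{\partial \psi}{\partial x_2}(x_1, x_2) \Big\}\,\nu(dz) \Big\},\ \mathcal{M}\psi(x_1, x_2) - \psi(x_1, x_2) \Big) = 0 \quad \text{for all } (x_1, x_2) \in S, \tag{11.2.21} \]

where (see (11.2.17) and (11.2.18))

\[ \mathcal{M}\psi(x_1, x_2) = \sup\{ \psi(x_1'(\zeta), x_2'(\zeta));\ \zeta \in \mathbb{R} \setminus \{0\},\ (x_1'(\zeta), x_2'(\zeta)) \in S \}. \tag{11.2.22} \]

See Example 12.12 for a further discussion of this.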
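The transaction map (11.2.12)–(11.2.13) and the intervention operator (11.2.22) can be sketched as follows. This is an illustrative approximation only: the supremum over ζ is replaced by a maximum over a finite candidate list `zetas` supplied by the caller, and the names `transact` and `intervention_operator` are hypothetical.

```python
import math
from typing import Callable, Iterable, Tuple


def transact(x1: float, x2: float, zeta: float, c: float, lam: float) -> Tuple[float, float]:
    """Post-transaction position from (11.2.12)-(11.2.13); zeta > 0 buys, zeta < 0 sells."""
    return x1 - zeta - c - lam * abs(zeta), x2 + zeta


def intervention_operator(psi: Callable[[float, float], float],
                          x1: float, x2: float, c: float, lam: float,
                          zetas: Iterable[float]) -> float:
    """Grid approximation of (11.2.22): best psi over feasible post-transaction positions."""
    best = -math.inf  # returned unchanged if no candidate keeps the position in S
    for zeta in zetas:
        if zeta == 0.0:
            continue
        y1, y2 = transact(x1, x2, zeta, c, lam)
        if y1 >= 0.0 and y2 >= 0.0:  # solvency region S = [0, inf) x [0, inf)
            best = max(best, psi(y1, y2))
    return best
```

A quick check of `transact` against (11.2.17) and (11.2.18): after a sale the quantity x1 + (1 − λ)x2 drops by exactly c, and after a purchase the quantity x1 + (1 + λ)x2 drops by exactly c.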

11.3 Iterative Methods

In Chap. 10 we saw that an impulse control problem can be regarded as a limit of iterated optimal stopping. A similar result holds for combined control problems. More precisely, a combined stochastic control and impulse control problem can be regarded as a limit of iterated combined stochastic control and optimal stopping problems. We now describe this in more detail. The presentation is similar to the approach in Chap. 10.

For $n = 1, 2, \ldots$ let $\mathcal{W}_n$ denote the set of all admissible combined controls $w = (u, v) \in \mathcal{W}$ with $v \in \mathcal{V}_n$, where $\mathcal{V}_n$ is the set of impulse controls $v = (\tau_1, \ldots, \tau_k; \zeta_1, \zeta_2, \ldots, \zeta_k)$ with at most $n$ interventions (i.e., $k \le n$). Then

\[ \mathcal{W}_n \subseteq \mathcal{W}_{n+1} \subseteq \mathcal{W} \quad \text{for all } n. \tag{11.3.1} \]

Define, with $J^{(w)}(y)$ as in (11.1.6),

\[ \Phi_n(y) = \sup\big\{ J^{(w)}(y);\ w \in \mathcal{W}_n \big\}, \quad n = 1, 2, \ldots \]

Then $\Phi_n(y) \le \Phi_{n+1}(y) \le \Phi(y)$ because $\mathcal{W}_n \subseteq \mathcal{W}_{n+1} \subseteq \mathcal{W}$. Moreover, we have:

Lemma 11.4 Suppose $g \ge 0$. Then

\[ \lim_{n \to \infty} \Phi_n(y) = \Phi(y) \quad \text{for all } y \in S. \tag{11.3.2} \]

Proof The proof is similar to the proof of Lemma 10.1 and is omitted.

 

The iterative procedure is the following. Let $Y(t) = Y^{(u,0)}(t)$ be the process in (11.1.1) obtained by using the control $u$ and no interventions. Define

\[ \phi_0(y) = \sup_{u \in \mathcal{U}} E^y\Big[ \int_0^{\tau_S} f(Y(t), u(t))\,dt + g(Y(\tau_S))\,\chi_{\{\tau_S < \infty\}} \Big] \tag{11.3.3} \]

[…]

\[ \ldots\ g(y)\}. \tag{12.1.9} \]

Then if $y_0 \notin D$ we have $\Phi(y_0) = g(y_0)$ and hence (12.1.6) holds trivially. Next, assume $y_0 \in D$. Then by the dynamic programming principle (Lemma 10.3b) we have

\[ \Phi(y_0) = E^{y_0}\Big[ \int_0^{\tau} f(Y(t))\,dt + \Phi(Y(\tau)) \Big] \tag{12.1.10} \]

for all bounded stopping times $\tau \le \tau_D = \inf\{ t > 0;\ Y(t) \notin D \}$. Choose $h_m \in C_0^2(\mathbb{R}^k)$ such that $h_m \to h$ and $L h_m \to L h$ pointwise dominatedly on $\mathbb{R}^k$. Then by combining (12.1.10) with the Dynkin formula we get

\begin{align*}
\Phi(y_0) &= E^{y_0}\Big[ \int_0^{\tau} f(Y(t))\,dt + \Phi(Y(\tau)) \Big] \\
&\le E^{y_0}\Big[ \int_0^{\tau} f(Y(t))\,dt + h(Y(\tau)) \Big] \\
&= \lim_{m \to \infty} E^{y_0}\Big[ \int_0^{\tau} f(Y(t))\,dt + h_m(Y(\tau)) \Big] \\
&= h(y_0) + \lim_{m \to \infty} E^{y_0}\Big[ \int_0^{\tau} (L h_m(Y(t)) + f(Y(t)))\,dt \Big].
\end{align*}

Hence

\[ \lim_{m \to \infty} E^{y_0}\Big[ \int_0^{\tau} (L h_m(Y(t)) + f(Y(t)))\,dt \Big] \ge 0. \]


12 Viscosity Solutions

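In particular, if we choose $\tau = \beta_j := \inf\{ t > 0;\ |Y(t) - y_0| \ge 1/j \} \wedge 1/j \wedge \tau_D$, we get

\[ E^{y_0}\Big[ \int_0^{\beta_j} (L h(Y(t)) + f(Y(t)))\,dt \Big] = \lim_{m \to \infty} E^{y_0}\Big[ \int_0^{\beta_j} (L h_m(Y(t)) + f(Y(t)))\,dt \Big] \ge 0. \tag{12.1.11} \]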

If we divide (12.1.11) by $E^{y_0}[\beta_j]$ and let $j \to \infty$ we get, by right continuity,

\[ L h(y_0) + f(y_0) \ge 0. \]

Hence (12.1.6) holds and we have proved that $\Phi$ is a viscosity subsolution.

Finally we show that $\Phi$ is a viscosity supersolution. So we assume that $h \in C^2(\mathbb{R}^k)$ and $y_0 \in S$ are such that $h \le \Phi$ on $S$ and $h(y_0) = \Phi(y_0)$. Then by the dynamic programming principle (Lemma 10.3a) we have

\[ \Phi(y_0) \ge E^{y_0}\Big[ \int_0^{\tau} f(Y(t))\,dt + \Phi(Y(\tau)) \Big] \tag{12.1.12} \]

for all stopping times $\tau \le \tau_{S^0}$. Hence, by the Dynkin formula, with $h_m$ and $\beta_j$ as above and $\tau = \beta_j$,

\begin{align*}
\Phi(y_0) &\ge E^{y_0}\Big[ \int_0^{\beta_j} f(Y(t))\,dt + \Phi(Y(\beta_j)) \Big] \\
&\ge \lim_{m \to \infty} E^{y_0}\Big[ \int_0^{\beta_j} f(Y(t))\,dt + h_m(Y(\beta_j)) \Big] \\
&= h(y_0) + \lim_{m \to \infty} E^{y_0}\Big[ \int_0^{\beta_j} (L h_m(Y(t)) + f(Y(t)))\,dt \Big] \\
&= h(y_0) + E^{y_0}\Big[ \int_0^{\beta_j} (L h(Y(t)) + f(Y(t)))\,dt \Big] \quad \text{for all } j.
\end{align*}

Hence

\[ E^{y_0}\Big[ \int_0^{\beta_j} (L h(Y(t)) + f(Y(t)))\,dt \Big] \le 0 \]

and by dividing by $E^{y_0}[\beta_j]$ and letting $j \to \infty$ we get, by right continuity,

\[ L h(y_0) + f(y_0) \le 0. \]

Hence (12.1.7) holds and we have proved that $\Phi$ is also a viscosity supersolution.

12.1 Viscosity Solutions of Variational Inequalities


12.1.1 Uniqueness

One important application of the viscosity solution concept is that it can be used as a verification method. In order to verify that a given function $\phi$ is indeed the value function $\Phi$ it suffices to verify that the function is a viscosity solution of the corresponding variational inequality. For this method to work, however, it is necessary that we know that $\Phi$ is the unique viscosity solution. Therefore the question of uniqueness is crucial. In general we need not have uniqueness. The following simple example illustrates this.

Example 12.3 Let $Y(t) = B(t) \in \mathbb{R}$ and choose $f = 0$, $S = \mathbb{R}$, and

\[ g(y) = \frac{y^2}{1 + y^2}, \quad y \in \mathbb{R}. \tag{12.1.13} \]

Then the value function $\Phi$ of the optimal stopping problem

\[ \Phi(y) = \sup_{\tau \in \mathcal{T}} E^y[g(B(\tau))] \tag{12.1.14} \]

is easily seen to be $\Phi(y) \equiv 1$. The corresponding VI is

\[ \max\Big( \frac{1}{2}\phi''(y),\ g(y) - \phi(y) \Big) = 0 \tag{12.1.15} \]

and this equation is trivially satisfied by all constant functions $\phi(y) \equiv a$ for any $a \ge 1$.

Theorem 12.4 (Uniqueness) Suppose that

\[ \tau_{S^0} < \infty \quad \text{a.s. } P^y \text{ for all } y \in S^0. \tag{12.1.16} \]

Let $\phi \in C(\bar{S})$ be a viscosity solution of (12.1.4) and (12.1.5) with the property that

\[ \text{the family } \{ \phi(Y(\tau));\ \tau \text{ stopping time},\ \tau \le \tau_{S^0} \} \text{ is } P^y\text{-uniformly integrable, for all } y \in S^0. \tag{12.1.17} \]

Then $\phi(y) = \Phi(y)$ for all $y \in \bar{S}$.

Proof We refer the reader to [ØR] for the proof in the case where there are no jumps.
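The non-uniqueness in Example 12.3 can be checked numerically. The sketch below evaluates the left-hand side of (12.1.15) for a constant candidate φ ≡ a, using that φ'' = 0 for constants; since constants are smooth, the classical and the viscosity sense coincide here. The function names are hypothetical.

```python
def g(y: float) -> float:
    """Reward function (12.1.13)."""
    return y * y / (1.0 + y * y)


def vi_lhs_for_constant(a: float, y: float) -> float:
    """Left-hand side of (12.1.15) for the constant function phi = a (so phi'' = 0)."""
    second_derivative = 0.0
    return max(0.5 * second_derivative, g(y) - a)


grid = [k / 4.0 for k in range(-40, 41)]
# Every constant a >= 1 gives 0 on the whole grid, so the VI has many solutions.
residuals = {a: max(abs(vi_lhs_for_constant(a, y)) for y in grid) for a in (1.0, 1.5, 3.0)}
```

For a < 1 the second entry of the maximum becomes positive wherever g(y) > a, so such constants are not solutions.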

12.2 The Value Function is Not Always $C^1$

Example 12.5 We now give an example of an optimal stopping problem where the value function $\Phi$ is not $C^1$ everywhere. In this case Theorem 6.10 cannot be used to find $\Phi$. However, we can use Theorem 12.4. The example is taken from [ØR]: Define

\[ k(x) = \begin{cases} 1 & \text{for } x \le 0 \\ 1 - cx & \text{for } 0 < x < a \\ 1 - ca & \text{for } x \ge a \end{cases} \tag{12.2.1} \]

where $c$ and $a$ are constants to be specified more closely later. Consider the optimal stopping problem

\[ \Phi(s, x) = \sup_{\tau \in \mathcal{T}} E^{(s,x)}\big[ e^{-\rho(s + \tau)} k(B(\tau)) \big], \tag{12.2.2} \]

where $B(t)$ is one-dimensional Brownian motion, $B(0) = x \in \mathbb{R} = S$, and $\rho > 0$ is a constant. The corresponding variational inequality is (see (12.1.4))

\[ \max\Big( \frac{\partial \phi}{\partial s} + \frac{1}{2}\frac{\partial^2 \phi}{\partial x^2},\ e^{-\rho s} k(x) - \phi(s, x) \Big) = 0. \tag{12.2.3} \]

If we try a solution of the form

\[ \phi(s, x) = e^{-\rho s}\psi(x) \tag{12.2.4} \]

for some function $\psi$, then (12.2.3) becomes

\[ \max\Big( -\rho\psi(x) + \frac{1}{2}\psi''(x),\ k(x) - \psi(x) \Big) = 0. \tag{12.2.5} \]

Let us guess that the continuation region $D$ has the form

\[ D = \{ (s, x);\ 0 < x < x_1 \} \tag{12.2.6} \]

for some $x_1 > a$. Then (12.2.5) can be split into the three equations

\[ \begin{aligned} -\rho\psi(x) + \tfrac{1}{2}\psi''(x) &= 0; & 0 < x < x_1, \\ \psi(x) &= 1; & x \le 0, \\ \psi(x) &= 1 - ca; & x \ge x_1. \end{aligned} \tag{12.2.7} \]

The general solution of (12.2.7) is

\[ \psi(x) = C_1 e^{\sqrt{2\rho}\,x} + C_2 e^{-\sqrt{2\rho}\,x}, \quad 0 < x < x_1, \]

where $C_1$, $C_2$ are arbitrary constants. If we require $\psi$ to be continuous at $x = 0$ and at $x = x_1$ we get the two equations

\[ C_1 + C_2 = 1, \tag{12.2.8} \]

\[ C_1 e^{\sqrt{2\rho}\,x_1} + C_2 e^{-\sqrt{2\rho}\,x_1} = 1 - ca \tag{12.2.9} \]

in the three unknowns $C_1$, $C_2$, and $x_1$. If we also guess that $\psi$ will be $C^1$ at $x = x_1$ we get the third equation

\[ C_1 \sqrt{2\rho}\,e^{\sqrt{2\rho}\,x_1} - C_2 \sqrt{2\rho}\,e^{-\sqrt{2\rho}\,x_1} = 0. \tag{12.2.10} \]

If we assume that $ca < 1$ and

\[ \sqrt{2\rho} < \frac{1}{a} \ln\Big( \frac{1 - ca}{1 - \sqrt{ca(2 - ca)}} \Big) \tag{12.2.11} \]

then the three equations (12.2.8), (12.2.9), and (12.2.10) have the unique solution

\[ C_1 = \frac{1}{2}\Big[ 1 - \sqrt{ca(2 - ca)} \Big] > 0, \quad C_2 = 1 - C_1 > 0 \tag{12.2.12} \]

and

\[ x_1 = \frac{1}{\sqrt{2\rho}} \ln\Big( \frac{1 - ca}{2 C_1} \Big) > a. \tag{12.2.13} \]

With these values of $C_1$, $C_2$, and $x_1$ we put

\[ \psi(x) = \begin{cases} 1 & \text{if } x \le 0 \\ C_1 e^{\sqrt{2\rho}\,x} + C_2 e^{-\sqrt{2\rho}\,x} & \text{if } 0 < x < x_1 \\ 1 - ca & \text{if } x_1 \le x. \end{cases} \tag{12.2.14} \]

See Fig. 12.1. We claim that $\psi$ is a viscosity solution of (12.2.5).

(i) First we verify that $\psi$ is a viscosity subsolution: let $h \in C^2(\mathbb{R})$, $h \ge \psi$, and $h(x_0) = \psi(x_0)$. Then if $x_0 \le 0$ or $x_0 \ge x_1$ we have $k(x_0) - \psi(x_0) = 0$. And if $0 < x_0 < x_1$ then $h - \psi$ is $C^2$ at $x = x_0$ and has a local minimum at $x_0$ so $h''(x_0) - \psi''(x_0) \ge 0$. Therefore

\[ -\rho h(x_0) + \frac{1}{2}h''(x_0) \ge -\rho\psi(x_0) + \frac{1}{2}\psi''(x_0) = 0. \]

Fig. 12.1 The function ψ. Horizontal axis: x, with the points 0, a, x1 marked; vertical axis with the levels 1 and 1 − ca marked. The graph of ψ is shown together with the line 1 − cx.

This proves that

\[ \max\Big( -\rho h(x_0) + \frac{1}{2}h''(x_0),\ k(x_0) - \psi(x_0) \Big) \ge 0, \]

so $\psi$ is a viscosity subsolution of (12.2.5).

(ii) Second, we prove that $\psi$ is a viscosity supersolution. So let $h \in C^2(\mathbb{R})$, $h \le \psi$, and $h(x_0) = \psi(x_0)$. Note that we always have $k(x_0) - \psi(x_0) \le 0$ so in order to prove that

\[ \max\Big( -\rho h(x_0) + \frac{1}{2}h''(x_0),\ k(x_0) - \psi(x_0) \Big) \le 0 \]

it suffices to prove that

\[ -\rho h(x_0) + \frac{1}{2}h''(x_0) \le 0. \]

At any point $x_0$ where $\psi$ is $C^2$ this follows in the same way as in (i) above. So it remains only to consider the two cases $x_0 = 0$ and $x_0 = x_1$. If $x_0 = 0$ then no such $h$ exists, so the conclusion trivially holds. If $x_0 = x_1$ then the function $h - \psi$ has a local maximum at $x = x_0$ and it is $C^2$ to the left of $x_0$ so

\[ \lim_{x \to x_0^-} \big( h''(x) - \psi''(x) \big) \le 0, \]

i.e., $h''(x_0) - \psi''(x_0^-) \le 0$. This gives

\[ -\rho h(x_0) + \frac{1}{2}h''(x_0) \le -\rho\psi(x_0) + \frac{1}{2}\psi''(x_0^-) = 0, \]

and the proof is complete.

We have proved the following: Suppose (12.2.11) holds. Then the value function $\Phi(s, x)$ of problem (12.2.2) is given by $\Phi(s, x) = e^{-\rho s}\psi(x)$ with $\psi$ as in (12.2.14), $C_1$, $C_2$, and $x_1$ as in (12.2.12) and (12.2.13). Note in particular that $\psi(x)$ is not $C^1$ at $x = 0$.
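The constants in (12.2.12) and (12.2.13) are easy to check numerically. The sketch below computes them, verifies continuity at x1 and the smooth fit (12.2.10), and exhibits the kink of ψ at x = 0. The function name `smooth_fit` and the sample values ρ = 0.125, c = 0.5, a = 1 are hypothetical choices that satisfy ca < 1 and (12.2.11); the sketch additionally assumes c, a > 0.

```python
import math


def smooth_fit(rho: float, c: float, a: float):
    """Return (C1, C2, x1) from (12.2.12)-(12.2.13); raise if (12.2.11) fails."""
    ca = c * a
    if not 0.0 < ca < 1.0:
        raise ValueError("this sketch assumes 0 < c*a < 1")
    kappa = math.sqrt(2.0 * rho)
    root = math.sqrt(ca * (2.0 - ca))
    if not kappa < math.log((1.0 - ca) / (1.0 - root)) / a:
        raise ValueError("condition (12.2.11) fails")
    c1 = 0.5 * (1.0 - root)
    c2 = 1.0 - c1
    x1 = math.log((1.0 - ca) / (2.0 * c1)) / kappa
    return c1, c2, x1


rho, c, a = 0.125, 0.5, 1.0
kappa = math.sqrt(2.0 * rho)
c1, c2, x1 = smooth_fit(rho, c, a)

value_at_x1 = c1 * math.exp(kappa * x1) + c2 * math.exp(-kappa * x1)            # (12.2.9)
slope_at_x1 = kappa * (c1 * math.exp(kappa * x1) - c2 * math.exp(-kappa * x1))  # (12.2.10)
right_slope_at_0 = kappa * (c1 - c2)  # the left slope at 0 is 0 because psi = 1 for x <= 0
```

The nonzero value of `right_slope_at_0` is the numerical counterpart of the remark that ψ is not C¹ at x = 0.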

12.3 Viscosity Solutions of HJBQVI

We now turn to the general combined stochastic control and impulse control problem from Chap. 11. Thus the state $Y(t) = Y^{(w)}(t)$ is

\[ \begin{aligned} &Y(0^-) = y \in \mathbb{R}^k, \\ &dY(t) = b(Y(t), u(t))\,dt + \sigma(Y(t), u(t))\,dB(t) + \int_{\mathbb{R}^\ell} \gamma(Y(t^-), u(t^-), z)\,\tilde{N}(dt, dz), \quad \tau_i < t < \tau_{i+1}, \\ &Y(\tau_{i+1}) = \Gamma(\check{Y}(\tau_{i+1}^-), \zeta_{i+1}), \quad i = 0, 1, 2, \ldots, \end{aligned} \tag{12.3.1} \]

where $w = (u, v) \in \mathcal{W}$, $u \in \mathcal{U}$, $v = (\tau_1, \tau_2, \ldots; \zeta_1, \zeta_2, \ldots) \in \mathcal{V}$. The performance is given by

\[ J^{(w)}(y) = E^y\Big[ \int_0^{\tau_S} f(Y(t), u(t))\,dt + g(Y(\tau_S))\,\chi_{\{\tau_S < \infty\}} + \cdots \Big] \]

[…] $0;\ Y^{(w)}(t) \in$ […] for all $y \in \partial S$ and all $w \in \mathcal{W}$. (12.3.5)

These conditions (12.3.4) and (12.3.5) exclude cases where $\Phi$ also satisfies certain HJBQVIs on $\partial S$ (see, e.g., [ØS]), but it is often easy to see how to extend the results to such situations. We need to make the following two assumptions on the set of admissible controls $\mathcal{W}$:

(1) If $w = (u, v)$, with $v = (\tau_1, \tau_2, \ldots; \zeta_1, \zeta_2, \ldots)$, belongs to $\mathcal{W}$ and $\hat{\zeta}$ is any point in $Z$, then $\hat{w} := (u, \hat{v})$ belongs to $\mathcal{W}$ also, when $\hat{v} := (0, \tau_1, \tau_2, \ldots; \hat{\zeta}, \zeta_1, \zeta_2, \ldots)$.
(2) If $\alpha$ is any constant in $U$, then the combined control $w := (\alpha, 0)$ (no interventions, just the constant control $\alpha$) belongs to $\mathcal{W}$.

Theorem 11.1 associates $\Phi$ to the HJBQVI

\[ \max\Big( \sup_{\alpha \in U} \{ L^\alpha \Phi(y) + f(y, \alpha) \},\ \mathcal{M}\Phi(y) - \Phi(y) \Big) = 0, \quad y \in S, \tag{12.3.6} \]

with boundary values

\[ \Phi(y) = g(y), \quad y \in \partial S, \tag{12.3.7} \]

where

\[ L^\alpha \Phi(y) = \sum_{i=1}^{k} b_i(y, \alpha)\frac{\partial \Phi}{\partial y_i} + \frac{1}{2}\sum_{i,j=1}^{k} (\sigma\sigma^T)_{ij}(y, \alpha)\frac{\partial^2 \Phi}{\partial y_i \partial y_j} + \sum_{j=1}^{\ell} \int_{\mathbb{R}} \Big\{ \Phi\big(y + \gamma^{(j)}(y, \alpha, z_j)\big) - \Phi(y) - \nabla\Phi(y) \cdot \gamma^{(j)}(y, \alpha, z_j) \Big\}\,\nu_j(dz_j) \tag{12.3.8} \]

and

\[ \mathcal{M}\Phi(y) = \sup\{ \Phi(\Gamma(y, \zeta)) + K(y, \zeta);\ \zeta \in Z,\ \Gamma(y, \zeta) \in S \}. \tag{12.3.9} \]

Unfortunately, as we have seen already for optimal stopping problems, the value function $\Phi$ need not be $C^1$ everywhere, and in general it need not even be continuous. So (12.3.6) is not well defined if we interpret the equation in the usual sense. However, it turns out that if we interpret (12.3.6) in the weak sense of viscosity then $\Phi$ does indeed solve the equation. In fact, under some assumptions $\Phi$ is the unique viscosity solution of (12.3.6) and (12.3.7) (see Theorem 12.11). This result is an important supplement to Theorem 11.1.

We now define the concept of viscosity solutions of general HJBQVIs of type (12.3.6) and (12.3.7).

Definition 12.6 Let $\varphi \in C(\bar{S})$.

(i) We say that $\varphi$ is a viscosity subsolution of

\[ \max\Big( \sup_{\alpha \in U} \{ L^\alpha \varphi(y) + f(y, \alpha) \},\ \mathcal{M}\varphi(y) - \varphi(y) \Big) = 0, \quad y \in S, \tag{12.3.10} \]

\[ \varphi(y) = g(y), \quad y \in \partial S \tag{12.3.11} \]

if (12.3.11) holds and for every $h \in C^2(\mathbb{R}^k)$ and every $y_0 \in S$ such that $h \ge \varphi$ on $S$ and $h(y_0) = \varphi(y_0)$ we have

\[ \max\Big( \sup_{\alpha \in U} \{ L^\alpha h(y_0) + f(y_0, \alpha) \},\ \mathcal{M}\varphi(y_0) - \varphi(y_0) \Big) \ge 0. \tag{12.3.12} \]

(ii) We say that $\varphi$ is a viscosity supersolution of (12.3.10) and (12.3.11) if (12.3.11) holds and for every $h \in C^2(\mathbb{R}^k)$ and every $y_0 \in S$ such that $h \le \varphi$ on $S$ and $h(y_0) = \varphi(y_0)$ we have

\[ \max\Big( \sup_{\alpha \in U} \{ L^\alpha h(y_0) + f(y_0, \alpha) \},\ \mathcal{M}\varphi(y_0) - \varphi(y_0) \Big) \le 0. \tag{12.3.13} \]

(iii) We say that $\varphi$ is a viscosity solution of (12.3.10) and (12.3.11) if $\varphi$ is both a viscosity subsolution and a viscosity supersolution of (12.3.10) and (12.3.11).

Lemma 12.7 Let $\Phi$ be as in (12.3.3). Then $\Phi(y) \ge \mathcal{M}\Phi(y)$ for all $y \in S$.

Proof Suppose there exists $y \in S$ with $\Phi(y) < \mathcal{M}\Phi(y)$, i.e.,

\[ \Phi(y) < \sup_{\zeta \in Z} \{ \Phi(\Gamma(y, \zeta)) + K(y, \zeta) \}. \]

Then there exist $\varepsilon > 0$ and $\hat{\zeta} \in Z$ such that, with $\hat{y} = \Gamma(y, \hat{\zeta})$,

\[ \Phi(y) < \Phi(\hat{y}) + K(y, \hat{\zeta}) - 2\varepsilon. \]

Let $w = (u, v)$, with $v = (\tau_1, \tau_2, \ldots; \zeta_1, \zeta_2, \ldots)$, be $\varepsilon$-optimal for $\Phi$ at $\hat{y}$, in the sense that

\[ J^{(w)}(\hat{y}) > \Phi(\hat{y}) - \varepsilon. \]

Define $\hat{w} := (u, \hat{v})$, where $\hat{v} = (0, \tau_1, \tau_2, \ldots; \hat{\zeta}, \zeta_1, \zeta_2, \ldots)$. Then, with $\tau_0 = 0$ and $\zeta_0 = \hat{\zeta}$,

\[ \Phi(y) \ge J^{(\hat{w})}(y) = E^y\Big[ \int_0^{\tau_S} f(Y(t), u(t))\,dt + g(Y(\tau_S))\,\chi_{\{\tau_S < \infty\}} + \cdots \Big] \]

[…] $> 0$ and let $w = (u, v) \in \mathcal{W}$, with $v = (\tau_1, \tau_2, \ldots; \zeta_1, \zeta_2, \ldots) \in \mathcal{V}$, be an $\varepsilon$-optimal portfolio, i.e.,

\[ \Phi(y_0) < J^{(w)}(y_0) + \varepsilon. \]

Since $\tau_1$ is a stopping time we know that $\{\omega;\ \tau_1(\omega) = 0\}$ is $\mathcal{F}_0$-measurable and hence either

\[ \tau_1(\omega) = 0 \ \text{a.s.} \quad \text{or} \quad \tau_1(\omega) > 0 \ \text{a.s.} \tag{12.3.16} \]

If $\tau_1 = 0$ a.s. then $Y^{(w)}$ makes an immediate jump from $y_0$ to the point $y' = \Gamma(y_0, \zeta_1) \in S$ and hence

\[ \Phi(y_0) - \varepsilon \le J^{(w')}(y') + K(y_0, \zeta_1) \le \Phi(y') + K(y_0, \zeta_1) \le \mathcal{M}\Phi(y_0), \]

where $w' = (\tau_2, \tau_3, \ldots; \zeta_2, \zeta_3, \ldots)$. This is a contradiction if $\varepsilon < \Phi(y_0) - \mathcal{M}\Phi(y_0)$. This proves that (12.3.15) implies that it is impossible to have $\tau_1 = 0$ a.s. So by (12.3.16), we can now assume that $\tau_1 > 0$ a.s. Choose $\rho > 0$ and define

\[ \tau := \tau_1 \wedge \rho. \]

By the dynamic programming principle (see Lemma 10.3) we have: for each $\varepsilon > 0$, there exists a control $u = u_\rho$ such that

\[ \Phi(y_0) \le E^{y_0}\Big[ \int_0^{\tau} f(Y(t), u(t))\,dt + \Phi(\check{Y}(\tau^-)) \Big] + \rho^2, \tag{12.3.17} \]

where, as before, $\check{Y}(\tau^-) = Y(\tau^-) + \Delta_N Y(\tau)$. Choose $h_m \in C_0^2(\mathbb{R}^k)$ such that $h_m \to h$ and $L^u h_m \to L^u h$ pointwise dominatedly as $m \to \infty$. Then by (12.3.17) and the Dynkin formula we have, using that $\Phi \le h$,

\begin{align*}
\Phi(y_0) &\le E^{y_0}\Big[ \int_0^{\tau} f(Y(t), u(t))\,dt + h(\check{Y}(\tau^-)) \Big] + \varepsilon \\
&\le \liminf_{m \to \infty} E^{y_0}\Big[ \int_0^{\tau} f(Y(t), u(t))\,dt + h_m(\check{Y}(\tau^-)) \Big] + \varepsilon \\
&= h(y_0) + \liminf_{m \to \infty} E^{y_0}\Big[ \int_0^{\tau} (L^u h_m(Y(t)) + f(Y(t), u(t)))\,dt \Big] + \varepsilon \\
&= h(y_0) + E^{y_0}\Big[ \int_0^{\tau} (L^u h(Y(t)) + f(Y(t), u(t)))\,dt \Big] + \varepsilon. \tag{12.3.18}
\end{align*}

Using that $h(y_0) = \Phi(y_0)$, we obtain

\[ E^{y_0}\Big[ \int_0^{\tau} \{ L^u h(Y(t)) + f(Y(t), u(t)) \}\,dt \Big] \ge -\rho^2. \]

Dividing by $E^{y_0}[\tau]$ and letting $\rho \to 0$ we get

\[ L^{\alpha_0} h(y_0) + f(y_0, \alpha_0) \ge 0, \]

where

\[ \alpha_0 = \limsup_{\rho \to 0} \lim_{s \to 0^+} u(s). \]

Since $\varepsilon$ is arbitrary, this proves (12.3.14) and hence that $\Phi$ is a viscosity subsolution.

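(b) Next we prove that $\Phi$ is a viscosity supersolution. So we choose $h \in C^2(\mathbb{R}^k)$ and $y_0 \in S$ such that $h \le \Phi$ on $S$ and $h(y_0) = \Phi(y_0)$. We must prove that

\[ \max\Big( \sup_{\alpha \in U} \{ L^\alpha h(y_0) + f(y_0, \alpha) \},\ \mathcal{M}\Phi(y_0) - \Phi(y_0) \Big) \le 0. \tag{12.3.19} \]

Since $\Phi \ge \mathcal{M}\Phi$ always (Lemma 12.7) it suffices to prove that

\[ L^\alpha h(y_0) + f(y_0, \alpha) \le 0 \quad \text{for all } \alpha \in U. \]

To this end, fix $\alpha \in U$ and let $w_\alpha = (\alpha, 0)$, i.e., $w_\alpha$ is the combined control $(u_\alpha, v_\alpha) \in \mathcal{W}$ where $u_\alpha = \alpha$ (constant) and $v_\alpha = 0$ (no interventions). Then by the dynamic programming principle and the Dynkin formula we have, with $Y(t) = Y^{(w_\alpha)}(t)$, $\tau = \tau_S \wedge \rho$, and $h_m$ as in (a),

\begin{align*}
\Phi(y_0) &\ge E^{y_0}\Big[ \int_0^{\tau} f(Y(s), \alpha)\,ds + \Phi(\check{Y}(\tau^-)) \Big] \\
&\ge E^{y_0}\Big[ \int_0^{\tau} f(Y(s), \alpha)\,ds + h(\check{Y}(\tau^-)) \Big] \\
&= h(y_0) + \lim_{m \to \infty} E^{y_0}\Big[ \int_0^{\tau} \{ L^\alpha h_m(Y(t)) + f(Y(t), \alpha) \}\,dt \Big] \\
&= h(y_0) + E^{y_0}\Big[ \int_0^{\tau} \{ L^\alpha h(Y(t)) + f(Y(t), \alpha) \}\,dt \Big].
\end{align*}

Hence

\[ E^{y_0}\Big[ \int_0^{\tau} \{ L^\alpha h(Y(t)) + f(Y(t), \alpha) \}\,dt \Big] \le 0. \]

Dividing by $E^{y_0}[\tau]$ and letting $\rho \to 0$ we get (12.3.19). This completes the proof of Theorem 12.8.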



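Next we turn to the question of uniqueness of viscosity solutions of (12.3.10) and (12.3.11). Many types of uniqueness results can be found in the literature. See the references at the end of this section. Here we give a proof in the case when the process $Y(t)$ has no jumps, i.e., when $N(\cdot, \cdot) = \nu(\cdot) = 0$. The method we use is a generalization of the method in [ØS, Theorem 3.8]. First we introduce some convenient notation: Define $\Lambda : \mathbb{R}^{k \times k} \times \mathbb{R}^k \times \mathbb{R}^S \times \mathbb{R}^k \to \mathbb{R}$ by

\[ \Lambda(R, r, \varphi, y) := \sup_{\alpha \in U} \Big\{ \sum_{i=1}^{k} b_i(y, \alpha) r_i + \frac{1}{2}\sum_{i,j=1}^{k} (\sigma\sigma^T)_{ij}(y, \alpha) R_{ij} + \sum_{j=1}^{\ell} \int_{\mathbb{R}} \Big\{ \varphi(y + \gamma^{(j)}(y, \alpha, z_j)) - \varphi(y) - r \cdot \gamma^{(j)}(y, \alpha, z_j) \Big\}\,\nu_j(dz_j) + f(y, \alpha) \Big\} \tag{12.3.20} \]

for $R = [R_{ij}] \in \mathbb{R}^{k \times k}$, $r = (r_1, \ldots, r_k) \in \mathbb{R}^k$, $\varphi : S \to \mathbb{R}$, $y \in \mathbb{R}^k$, and define $F : \mathbb{R}^{k \times k} \times \mathbb{R}^k \times \mathbb{R}^S \times \mathbb{R}^k \to \mathbb{R}$ by

\[ F(R, r, \varphi, y) = \max\{ \Lambda(R, r, \varphi, y),\ \mathcal{M}\varphi(y) - \varphi(y) \}. \tag{12.3.21} \]

Note that if $\varphi \in C^2(\mathbb{R}^k)$ then

\[ \Lambda(D^2\varphi, D\varphi, \varphi, y) = \sup_{\alpha \in U} \{ L^\alpha \varphi(y) + f(y, \alpha) \}, \]

where

\[ D^2\varphi(y) = \Big[ \frac{\partial^2 \varphi}{\partial y_i \partial y_j}(y) \Big] \quad \text{and} \quad D\varphi(y) = \Big[ \frac{\partial \varphi}{\partial y_i}(y) \Big]. \]

We recall the concepts of "superjets" $J_S^{2,+}$, $J_S^{2,-}$ and $\bar{J}_S^{2,+}$, $\bar{J}_S^{2,-}$ (see [CIL, Sect. 2]):

\[ J_S^{2,+}\varphi(y) := \Big\{ (R, r) \in \mathbb{R}^{k \times k} \times \mathbb{R}^k;\ \limsup_{\eta \to y,\ \eta \in S} \Big[ \varphi(\eta) - \varphi(y) - (\eta - y)^T r - \frac{1}{2}(\eta - y)^T R (\eta - y) \Big] \cdot |\eta - y|^{-2} \le 0 \Big\}, \]

\[ \bar{J}_S^{2,+}\varphi(y) := \Big\{ (R, r) \in \mathbb{R}^{k \times k} \times \mathbb{R}^k;\ \text{for all } n \text{ there exists } (R^{(n)}, r^{(n)}, y^{(n)}) \in \mathbb{R}^{k \times k} \times \mathbb{R}^k \times S \text{ such that } (R^{(n)}, r^{(n)}) \in J_S^{2,+}\varphi(y^{(n)}) \text{ and } (R^{(n)}, r^{(n)}, \varphi(y^{(n)}), y^{(n)}) \to (R, r, \varphi(y), y) \text{ as } n \to \infty \Big\} \]

and

\[ J_S^{2,-}\varphi = -J_S^{2,+}(-\varphi), \qquad \bar{J}_S^{2,-}\varphi = -\bar{J}_S^{2,+}(-\varphi). \]

In terms of these superjets one can give an equivalent definition of viscosity solutions as follows.

Theorem 12.9 ([CIL, Sect. 2]) (i) A function $\varphi \in C(S)$ is a viscosity subsolution of (12.3.10) and (12.3.11) if and only if (12.3.11) holds and

\[ \max(\Lambda(R, r, \varphi, y),\ \mathcal{M}\varphi(y) - \varphi(y)) \ge 0 \quad \text{for all } (R, r) \in \bar{J}_S^{2,+}\varphi(y),\ y \in S. \]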

(ii) A function $\varphi \in C(S)$ is a viscosity supersolution of (12.3.10) and (12.3.11) if and only if (12.3.11) holds and

\[ \max(\Lambda(R, r, \varphi, y),\ \mathcal{M}\varphi(y) - \varphi(y)) \le 0 \quad \text{for all } (R, r) \in \bar{J}_S^{2,-}\varphi(y),\ y \in S. \]

We are now ready for the second main theorem of this section.

Theorem 12.10 (Comparison Theorem) Assume that

\[ N(\cdot, \cdot) = 0. \tag{12.3.22} \]

Suppose that there exists a positive function $\beta \in C^2(\bar{S})$ which satisfies the strict quasivariational inequality

\[ \max\Big( \sup_{\alpha \in U} \{ L^\alpha \beta(y) \},\ \sup_{\zeta \in Z} \beta(\Gamma(y, \zeta)) - \beta(y) \Big) \le -\delta(y) < 0, \quad y \in S, \tag{12.3.23} \]

where $\delta(y) > 0$ is bounded away from 0 on compact subsets of $S$. Let $u$ be a viscosity subsolution and $v$ be a viscosity supersolution of (12.3.10) and (12.3.11) and suppose that

\[ \lim_{|y| \to \infty} \Big\{ \frac{u^+(y)}{\beta(y)} + \frac{v^-(y)}{\beta(y)} \Big\} = 0. \tag{12.3.24} \]

Then $u(y) \le v(y)$ for all $y \in S$.

Proof (Sketch) We argue by contradiction. Suppose that

\[ \sup_{y \in S} \{ u(y) - v(y) \} > 0. \]

Then by (12.3.24) there exists $\epsilon > 0$ such that if we put

\[ v_\epsilon(y) := v(y) + \epsilon\beta(y), \quad y \in S \]

then

\[ M := \sup_{y \in S} \{ u(y) - v_\epsilon(y) \} > 0. \]

For $n = 1, 2, \ldots$ and $(x, y) \in S \times S$ define

\[ H_n(x, y) := u(x) - v(y) - \frac{n}{2}|x - y|^2 - \frac{\epsilon}{2}(\beta(x) + \beta(y)) \]

and

\[ M_n := \sup_{(x, y) \in S \times S} H_n(x, y). \]

Then by (12.3.24) we have $0 < M_n < \infty$ for all $n$, and there exists $(x^{(n)}, y^{(n)}) \in S \times S$ such that

\[ M_n = H_n(x^{(n)}, y^{(n)}). \]

Then by Lemma 3.1 in [CIL] the following holds:

\[ \lim_{n \to \infty} n|x^{(n)} - y^{(n)}|^2 = 0 \]

and

\[ \lim_{n \to \infty} M_n = u(\hat{y}) - v_\epsilon(\hat{y}) = \sup_{y \in S} \{ u(y) - v_\epsilon(y) \} = M, \]

for any limit point $\hat{y}$ of $\{ y^{(n)} \}_{n=1}^{\infty}$. Since $v$ is a supersolution of (12.3.10), (12.3.11), and (12.3.23) holds, we see that $v_\epsilon$ is a strict supersolution of (12.3.10), in the sense that $\varphi = v_\epsilon$ satisfies (12.3.13) in the following strict form:

\[ \max\Big( \sup_{\alpha \in U} \{ L^\alpha h(y_0) + f(y_0, \alpha) \},\ \mathcal{M}v_\epsilon(y_0) - v_\epsilon(y_0) \Big) \le -\delta(y_0), \]

with $\delta(\cdot)$ as in (12.3.23). By [CIL, Theorem 3.2], there exist $k \times k$ matrices $P^{(n)}$, $Q^{(n)}$ such that, if we put

\[ p^{(n)} = q^{(n)} = n(x^{(n)} - y^{(n)}) \]

then

\[ (P^{(n)}, p^{(n)}) \in \bar{J}^{2,+}u(x^{(n)}) \quad \text{and} \quad (Q^{(n)}, q^{(n)}) \in \bar{J}^{2,-}v_\epsilon(y^{(n)}) \]

and

\[ \begin{bmatrix} P^{(n)} & 0 \\ 0 & -Q^{(n)} \end{bmatrix} \le 3n \begin{bmatrix} I & -I \\ -I & I \end{bmatrix}, \tag{12.3.25} \]

in the sense that

\[ \xi^T P^{(n)} \xi - \eta^T Q^{(n)} \eta \le 3n|\xi - \eta|^2 \quad \text{for all } \xi, \eta \in \mathbb{R}^k. \]

Since $u$ is a subsolution we have, by Theorem 12.9,

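\[ \max\big( \Lambda(P^{(n)}, p^{(n)}, u, x^{(n)}),\ \mathcal{M}u(x^{(n)}) - u(x^{(n)}) \big) \ge 0 \tag{12.3.26} \]

and since $v_\epsilon$ is a supersolution we have

\[ \max\big( \Lambda(Q^{(n)}, q^{(n)}, v_\epsilon, y^{(n)}),\ \mathcal{M}v_\epsilon(y^{(n)}) - v_\epsilon(y^{(n)}) \big) \le 0. \tag{12.3.27} \]

By (12.3.25) we get

\begin{align*}
&\Lambda(P^{(n)}, p^{(n)}, u, x^{(n)}) - \Lambda(Q^{(n)}, q^{(n)}, v_\epsilon, y^{(n)}) \\
&\quad \le \sup_{\alpha \in U} \Big\{ \sum_{i=1}^{k} \big( b_i(x^{(n)}, \alpha) - b_i(y^{(n)}, \alpha) \big)\big( p_i^{(n)} - q_i^{(n)} \big) + \frac{1}{2}\sum_{i,j=1}^{k} \big( (\sigma\sigma^T)_{ij}(x^{(n)}, \alpha) - (\sigma\sigma^T)_{ij}(y^{(n)}, \alpha) \big)\big( P_{ij}^{(n)} - Q_{ij}^{(n)} \big) \Big\} \\
&\quad \le 0.
\end{align*}

Therefore, by (12.3.27),

\[ \Lambda(P^{(n)}, p^{(n)}, u, x^{(n)}) \le \Lambda(Q^{(n)}, q^{(n)}, v_\epsilon, y^{(n)}) \le 0 \]

and hence, by (12.3.26),

\[ \mathcal{M}u(x^{(n)}) - u(x^{(n)}) \ge 0. \tag{12.3.28} \]

On the other hand, since $v_\epsilon$ is a strict supersolution we have

\[ \mathcal{M}v_\epsilon(y^{(n)}) - v_\epsilon(y^{(n)}) < -\delta \quad \text{for all } n, \tag{12.3.29} \]

for some constant $\delta > 0$. Combining the above we get

\[ M_n < u(x^{(n)}) - v_\epsilon(y^{(n)}) < \mathcal{M}u(x^{(n)}) - \mathcal{M}v_\epsilon(y^{(n)}) - \delta \]

and hence

\begin{align*}
M = \lim_{n \to \infty} M_n &\le \lim_{n \to \infty} \big( \mathcal{M}u(x^{(n)}) - \mathcal{M}v_\epsilon(y^{(n)}) - \delta \big) \\
&\le \mathcal{M}u(\hat{y}) - \mathcal{M}v_\epsilon(\hat{y}) - \delta \\
&= \sup_{\zeta \in Z} \{ u(\Gamma(\hat{y}, \zeta)) + K(\hat{y}, \zeta) \} - \sup_{\zeta \in Z} \{ v_\epsilon(\Gamma(\hat{y}, \zeta)) + K(\hat{y}, \zeta) \} - \delta \\
&\le \sup_{\zeta \in Z} \{ u(\Gamma(\hat{y}, \zeta)) - v_\epsilon(\Gamma(\hat{y}, \zeta)) \} - \delta \le M - \delta.
\end{align*}

This contradiction proves Theorem 12.10.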




Theorem 12.11 (Uniqueness of Viscosity Solutions) Suppose that the process $Y(t)$ has no jumps, i.e., $N(\cdot,\cdot) = 0$, and let $\beta \in C^2(\bar S)$ be as in Theorem 12.10. Then there is at most one viscosity solution $\varphi$ of (12.3.10) and (12.3.11) with the property that
\[ \lim_{|y|\to\infty}\frac{|\varphi(y)|}{\beta(y)} = 0. \tag{12.3.30} \]

Proof Let $\varphi_1, \varphi_2$ be two viscosity solutions satisfying (12.3.30). If we apply Theorem 12.10 to $u = \varphi_1$ and $v = \varphi_2$ we get $\varphi_1 \le \varphi_2$. If we apply Theorem 12.10 to $u = \varphi_2$ and $v = \varphi_1$ we get $\varphi_2 \le \varphi_1$. Hence $\varphi_1 = \varphi_2$.



Example 12.12 (Optimal Consumption and Portfolio with Both Fixed and Proportional Transaction Costs (2)) Let us return to Example 11.3. In this case (12.3.10) takes the form (11.2.21) and (11.2.22) in $S^0$. For simplicity we assume Dirichlet boundary conditions, e.g., $\psi = 0$, on $\partial S$. Fix $\gamma' \in (\gamma, 1)$ such that (see (5.1.9))
\[ \rho > \gamma'\Big[r + \frac{(\mu - r)^2}{2\sigma^2(1-\gamma')}\Big] \]
and define
\[ \beta(x_1,x_2) = (x_1 + x_2)^{\gamma'}. \tag{12.3.31} \]
Then with $\mathcal{M}$ as in (11.2.22) we have
\[ (\mathcal{M}\beta - \beta)(x_1,x_2) \le (x_1+x_2)^{\gamma'}\Big[\Big(1 - \frac{c}{x_1+x_2}\Big)^{\gamma'} - 1\Big] < 0. \tag{12.3.32} \]
Moreover, with
\[
\begin{aligned}
L^{u}\psi(x_1,x_2) := {}& -\rho\psi(x_1,x_2) + (r x_1 - u)\frac{\partial\psi}{\partial x_1}(x_1,x_2) + \mu x_2\frac{\partial\psi}{\partial x_2}(x_1,x_2) \\
& + \frac12\sigma^2 x_2^2\frac{\partial^2\psi}{\partial x_2^2}(x_1,x_2), \qquad \psi \in C^2(\mathbb{R}^2),
\end{aligned} \tag{12.3.33}
\]
we get
\[ \max_{u\ge 0} L^{u}\beta(x_1,x_2) < 0, \tag{12.3.34} \]
and in both (12.3.32) and (12.3.34) the strict inequality is uniform on compact subsets of $S^0$. The proofs of these inequalities are left as an exercise (Exercise 12.3). We conclude that the function $\beta$ in (12.3.31) satisfies the conditions (12.3.23) of Theorem 12.10. Thus by Theorem 12.11 we have in this example uniqueness of viscosity solutions $\varphi$ satisfying the growth condition
\[ \lim_{|(x_1,x_2)|\to\infty}(x_1+x_2)^{-\gamma'}|\varphi(x_1,x_2)| = 0. \tag{12.3.35} \]

For other results regarding uniqueness of viscosity solutions of equations associated to impulse control, stochastic control and optimal stopping for jump diffusions, we refer to [Am, AKL, BKR2, CIL, Is1, Is2, Ish, MS, AT, Ph1, JK1, FS, BCa, BCe] and the references therein.

12.4 Numerical Analysis of HJBQVI

In this section we give some insight into the numerical solution of HJBQVI. We refer, e.g., to [LST] for details on the finite difference approximations and the description of the algorithms to solve dynamic programming equations. Here we focus on the main problem that arises in the case of quasivariational inequalities, namely the presence of a nonexpansive operator due to the intervention operator.

12.4.1 Finite Difference Approximation

We want to solve the following HJBQVI numerically:
\[ \max\Big(\sup_{\alpha\in U}\{L^{\alpha}\Phi(x) + f(x,\alpha)\},\ \mathcal{M}\Phi(x) - \Phi(x)\Big) = 0, \quad x \in S \tag{12.4.1} \]
with boundary values
\[ \Phi(x) = g(x), \quad x \in \partial S, \tag{12.4.2} \]
where
\[ L^{\alpha}\Phi(x) = -\rho\Phi + \sum_{i=1}^{k} b_i(x,\alpha)\frac{\partial\Phi}{\partial x_i} + \frac12\sum_{i,j=1}^{k} a_{ij}(x,\alpha)\frac{\partial^2\Phi}{\partial x_i\partial x_j} \tag{12.4.3} \]
and
\[ \mathcal{M}\Phi(x) = \sup\{\Phi(\Gamma(x,\zeta)) + K(x,\zeta);\ \zeta \in Z,\ \Gamma(x,\zeta) \in S\}. \tag{12.4.4} \]

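We have denoted here $a_{ij} := (\sigma\sigma^T)_{ij}$. We shall also write $K^{\zeta}(x)$ for $K(x,\zeta)$. We assume that $S$ is bounded; otherwise a change of variable or a localization procedure has to be performed in order to reduce to a bounded domain. Moreover we assume for simplicity that $S$ is a box, i.e., a cartesian product of bounded intervals in $\mathbb{R}^k$. We can also handle Neumann type boundary conditions without additional difficulty.

We discretize (12.4.1) by using a finite difference approximation. Let $\delta_i$ denote the finite difference step in each coordinate direction and set $\delta = (\delta_1, \ldots, \delta_k)$. Denote by $e_i$ the unit vector in the $i$th coordinate direction, and consider the grid $S_\delta = S \cap \prod_{i=1}^{k}(\delta_i\mathbb{Z})$. Set $\partial S_\delta = \partial S \cap \prod_{i=1}^{k}(\delta_i\mathbb{Z})$. We use the following approximations:
\[ \frac{\partial\Phi}{\partial x_i}(x) \sim \frac{\Phi(x+\delta_i e_i) - \Phi(x-\delta_i e_i)}{2\delta_i} \equiv \partial_i^{\delta_i}\Phi(x) \tag{12.4.5} \]
or (see (12.4.16))
\[ \frac{\partial\Phi}{\partial x_i}(x) \sim \begin{cases} \dfrac{\Phi(x+\delta_i e_i) - \Phi(x)}{\delta_i} \equiv \partial_i^{\delta_i+}\Phi(x) & \text{if } b_i(x) \ge 0, \\[2ex] \dfrac{\Phi(x) - \Phi(x-\delta_i e_i)}{\delta_i} \equiv \partial_i^{\delta_i-}\Phi(x) & \text{if } b_i(x) \le 0. \end{cases} \tag{12.4.6} \]
\[ \frac{\partial^2\Phi}{\partial x_i^2}(x) \sim \frac{\Phi(x+\delta_i e_i) - 2\Phi(x) + \Phi(x-\delta_i e_i)}{\delta_i^2} \equiv \partial_{ii}^{\delta_i}\Phi(x). \tag{12.4.7} \]
If $a_{ij}(x) \ge 0$, $i \ne j$, then
\[
\begin{aligned}
\frac{\partial^2\Phi}{\partial x_i\partial x_j}(x) \sim {}& \frac{2\Phi(x) + \Phi(x+\delta_i e_i+\delta_j e_j) + \Phi(x-\delta_i e_i-\delta_j e_j)}{2\delta_i\delta_j} \\
& - \frac{\Phi(x+\delta_i e_i) + \Phi(x-\delta_i e_i) + \Phi(x+\delta_j e_j) + \Phi(x-\delta_j e_j)}{2\delta_i\delta_j} \equiv \partial_{ij}^{\delta_i\delta_j+}\Phi(x).
\end{aligned} \tag{12.4.8}
\]
If $a_{ij}(x) < 0$, $i \ne j$, then
\[
\begin{aligned}
\frac{\partial^2\Phi}{\partial x_i\partial x_j}(x) \sim {}& -\frac{2\Phi(x) + \Phi(x+\delta_i e_i-\delta_j e_j) + \Phi(x-\delta_i e_i+\delta_j e_j)}{2\delta_i\delta_j} \\
& + \frac{\Phi(x+\delta_i e_i) + \Phi(x-\delta_i e_i) + \Phi(x+\delta_j e_j) + \Phi(x-\delta_j e_j)}{2\delta_i\delta_j} \equiv \partial_{ij}^{\delta_i\delta_j-}\Phi(x).
\end{aligned} \tag{12.4.9}
\]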
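The stencils (12.4.5) and (12.4.7)–(12.4.9) can be checked numerically on a smooth test function. The following is only a minimal sketch: the function names, the test function and the step sizes are illustrative assumptions, not part of the text.

```python
import math

def shifted(x, moves, d):
    """Return x displaced by moves[i] * d[i] in each listed coordinate i."""
    y = list(x)
    for i, s in moves.items():
        y[i] += s * d[i]
    return y

def d1_centered(phi, x, i, d):
    """Centered first difference, as in (12.4.5)."""
    return (phi(shifted(x, {i: 1}, d)) - phi(shifted(x, {i: -1}, d))) / (2 * d[i])

def d2_pure(phi, x, i, d):
    """Second difference in direction i, as in (12.4.7)."""
    return (phi(shifted(x, {i: 1}, d)) - 2 * phi(x)
            + phi(shifted(x, {i: -1}, d))) / d[i] ** 2

def d2_mixed(phi, x, i, j, d, sign):
    """Mixed difference: (12.4.8) for sign = +1, (12.4.9) for sign = -1."""
    axis = sum(phi(shifted(x, {c: s}, d)) for c in (i, j) for s in (1, -1))
    diag = (2 * phi(x) + phi(shifted(x, {i: 1, j: sign}, d))
            + phi(shifted(x, {i: -1, j: -sign}, d)))
    return sign * (diag - axis) / (2 * d[i] * d[j])

# Hypothetical smooth test function with known derivatives.
phi = lambda x: math.sin(x[0]) * math.exp(0.5 * x[1])
point, steps = [0.3, 0.7], [1e-3, 2e-3]
errors = {
    "first": abs(d1_centered(phi, point, 0, steps) - math.cos(0.3) * math.exp(0.35)),
    "second": abs(d2_pure(phi, point, 0, steps) + math.sin(0.3) * math.exp(0.35)),
    "mixed+": abs(d2_mixed(phi, point, 0, 1, steps, +1) - 0.5 * math.cos(0.3) * math.exp(0.35)),
    "mixed-": abs(d2_mixed(phi, point, 0, 1, steps, -1) - 0.5 * math.cos(0.3) * math.exp(0.35)),
}
```

Both mixed stencils approximate the same derivative; the sign of $a_{ij}$ only selects which diagonal neighbours are used, so that the resulting off-diagonal weights stay nonnegative.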

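These approximations can be justified by Taylor expansions when the function $\Phi$ is smooth. Using approximations (12.4.5), (12.4.7)–(12.4.9), we obtain the following approximation problem:
\[
\begin{cases}
\max\Big(\sup_{\alpha\in U}\{L_\delta^{\alpha}\Phi_\delta(x) + f(x,\alpha)\},\ \mathcal{M}_\delta\Phi_\delta(x) - \Phi_\delta(x)\Big) = 0 & \text{for all } x \in S_\delta, \\
\Phi_\delta(x) = g(x) & \text{for all } x \in \partial S_\delta,
\end{cases} \tag{12.4.10}
\]
where
\[
\begin{aligned}
L_\delta^{\alpha}\Phi(x) = {}& \Phi(x)\Big\{\sum_{i=1}^{k}\Big[\frac{-a_{ii}(x,\alpha)}{\delta_i^2} + \sum_{j\ne i}\frac{|a_{ij}(x,\alpha)|}{2\delta_i\delta_j}\Big] - \rho\Big\} \\
& + \frac12\sum_{i,\,\kappa=\pm1}\Phi(x+\kappa\delta_i e_i)\Big\{\frac{a_{ii}(x,\alpha)}{\delta_i^2} - \sum_{j,\,j\ne i}\frac{|a_{ij}(x,\alpha)|}{\delta_i\delta_j} + \kappa\frac{b_i(x,\alpha)}{\delta_i}\Big\} \\
& + \frac12\sum_{i\ne j,\,\kappa=\pm1,\,\lambda=\pm1}\Phi(x+\kappa e_i\delta_i+\lambda e_j\delta_j)\frac{a_{ij}^{[\kappa\lambda]}(x,\alpha)}{\delta_i\delta_j}
\end{aligned} \tag{12.4.11}
\]
and
\[ \mathcal{M}_\delta\Phi_\delta(x) = \sup\{\Phi_\delta(\Gamma(x,\zeta)) + K(x,\zeta);\ \zeta \in Z_\delta(x)\} \tag{12.4.12} \]
with
\[ Z_\delta(x) = \{\zeta \in Z,\ \Gamma(x,\zeta) \in S_\delta\}. \tag{12.4.13} \]
We have used here the notation
\[ a_{ij}^{[\kappa\lambda]}(x,\alpha) = \begin{cases} a_{ij}^{+}(x,\alpha) \equiv \max(0, a_{ij}(x,\alpha)) & \text{if } \kappa\lambda = 1, \\ a_{ij}^{-}(x,\alpha) \equiv -\min(0, a_{ij}(x,\alpha)) & \text{if } \kappa\lambda = -1. \end{cases} \]
In (12.4.10), $\Phi_\delta$ denotes an approximation of $\Phi$ at the grid points. This approximation is consistent and stable if the following condition holds (see [LST] for a proof):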

\[ |b_i(x,\alpha)| \le \frac{a_{ii}(x,\alpha)}{\delta_i} - \sum_{j\ne i}\frac{|a_{ij}(x,\alpha)|}{\delta_j} \quad\text{for all } \alpha \text{ in } U,\ x \text{ in } S_\delta,\ i = 1,\ldots,k. \tag{12.4.14} \]
In this case $\Phi_\delta$ converges to the viscosity solution of (12.4.1) when the step $\delta$ goes to 0. This can be proved by using techniques introduced by Barles and Souganidis [BaS], provided a comparison theorem holds for viscosity sub- and supersolutions of the continuous-time problem. If (12.4.14) does not hold but only the following weaker condition
\[ 0 \le \frac{a_{ii}(x,\alpha)}{\delta_i} - \sum_{j\ne i}\frac{|a_{ij}(x,\alpha)|}{\delta_j} \quad\text{for all } \alpha \text{ in } U,\ x \text{ in } S_\delta,\ i = 1,\ldots,k \tag{12.4.15} \]

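is satisfied, then it can be shown that we can also obtain a stable approximation (but of lower order) by using the one-sided approximations (12.4.6) for the gradient instead of the centered difference (12.4.5). Instead of (12.4.11), the operator $L_\delta^{\alpha}$ is then equal to
\[
\begin{aligned}
L_\delta^{\alpha}\Phi(x) = {}& \Phi(x)\Big\{\sum_{i=1}^{k}\Big[\frac{-a_{ii}(x,\alpha)}{\delta_i^2} + \sum_{j\ne i}\frac{|a_{ij}(x,\alpha)|}{2\delta_i\delta_j} - \frac{|b_i(x,\alpha)|}{\delta_i}\Big] - \rho\Big\} \\
& + \frac12\sum_{i,\,\kappa=\pm1}\Phi(x+\kappa\delta_i e_i)\Big\{\frac{a_{ii}(x,\alpha)}{\delta_i^2} - \sum_{j,\,j\ne i}\frac{|a_{ij}(x,\alpha)|}{\delta_i\delta_j} + \frac{b_i^{[\kappa]}(x,\alpha)}{\delta_i}\Big\} \\
& + \frac12\sum_{i\ne j,\,\kappa=\pm1,\,\lambda=\pm1}\Phi(x+\kappa e_i\delta_i+\lambda e_j\delta_j)\frac{a_{ij}^{[\kappa\lambda]}(x,\alpha)}{\delta_i\delta_j}.
\end{aligned} \tag{12.4.16}
\]
By replacing the values of the function $\Phi_\delta$ by their known values on the boundary, we obtain the following equation in $S_\delta$:
\[ \max\Big(\sup_{\alpha\in U}\{\bar L_\delta^{\alpha}\Phi_\delta(x) + f_\delta(x,\alpha)\},\ \mathcal{M}_\delta\Phi_\delta(x) - \Phi_\delta(x)\Big) = 0, \quad x \in S_\delta, \tag{12.4.17} \]
where $\bar L_\delta^{\alpha}$ is a square $N_\delta \times N_\delta$ matrix, obtained by removing the first and last column from $L_\delta^{\alpha}$, $N_\delta = \mathrm{Card}(S_\delta)$, i.e., the number of points of the grid, and $f_\delta(x,\alpha)$ (which will also be denoted by $f_\delta^{\alpha}(x)$) takes into account the boundary values.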

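12.4.2 A Policy Iteration Algorithm for HJBQVI

When the stability conditions (12.4.14) or (12.4.15) hold, the matrix $\bar L_\delta^{\alpha}$ is diagonally dominant, i.e.,
\[ (\bar L_\delta^{\alpha})_{ij} \ge 0 \ \text{ for } i \ne j \quad\text{and}\quad \sum_{j=1}^{N_\delta}(\bar L_\delta^{\alpha})_{ij} \le -\rho < 0 \ \text{ for all } i = 1,\ldots,N_\delta. \]
Now let $h$ be a positive number such that
\[ h \le \min_i \frac{1}{|(\bar L_\delta^{\alpha})_{ii} + \rho|} \tag{12.4.18} \]
and let $I_\delta$ denote the $N_\delta \times N_\delta$ identity matrix. It is easy to check that the matrix
\[ P_\delta^{\alpha} := I_\delta + h(\bar L_\delta^{\alpha} + \rho I_\delta) \]
is sub-Markovian, i.e., $(P_\delta^{\alpha})_{ij} \ge 0$ for all $i, j$ and $\sum_{j=1}^{N_\delta}(P_\delta^{\alpha})_{ij} \le 1$ for all $i$. Consequently (12.4.17) can be rewritten as
\[ \max\Big(\sup_{\alpha\in U}\Big\{\frac1h\big[P_\delta^{\alpha}\Phi_\delta(x) - (1+\rho h)\Phi_\delta(x)\big] + f_\delta^{\alpha}(x)\Big\},\ \mathcal{M}_\delta\Phi_\delta(x) - \Phi_\delta(x)\Big) = 0, \tag{12.4.19} \]
which is equivalent to
\[ \Phi_\delta(x) = \max\Big(\sup_{\alpha\in U}\mathcal{L}_\delta^{\alpha}\Phi_\delta(x),\ \sup_{\zeta\in Z_\delta(x)} B^{\zeta}\Phi_\delta(x)\Big), \tag{12.4.20} \]
where
\[ \mathcal{L}_\delta^{\alpha}\Phi(x) := \frac{P_\delta^{\alpha}\Phi(x) + h f_\delta^{\alpha}(x)}{1+\rho h}, \tag{12.4.21} \]
\[ B^{\zeta}\Phi(x) := \Phi(\Gamma(x,\zeta)) + K^{\zeta}(x). \tag{12.4.22} \]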
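As an illustration of the passage from (12.4.17) to (12.4.21), the following sketch treats the one-dimensional case $k = 1$ with constant coefficients, zero Dirichlet data on $(0,1)$, and no intervention. The coefficient values, the source term and the function names are assumptions made only for this example.

```python
def build_generator(n, rho, a, b):
    """Matrix of the centered scheme (12.4.11) for k = 1 on the n interior
    points of (0, 1); boundary columns are dropped (zero Dirichlet data)."""
    d = 1.0 / (n + 1)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        L[i][i] = -a / d ** 2 - rho
        if i + 1 < n:
            L[i][i + 1] = 0.5 * (a / d ** 2 + b / d)
        if i >= 1:
            L[i][i - 1] = 0.5 * (a / d ** 2 - b / d)
    return L, d

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

n, rho, a, b = 10, 1.0, 1.0, 0.5           # hypothetical data
L, d = build_generator(n, rho, a, b)
stable = abs(b) <= a / d                    # condition (12.4.14) for k = 1

h = min(1.0 / abs(L[i][i] + rho) for i in range(n))          # (12.4.18)
P = [[(1.0 if i == j else 0.0) + h * (L[i][j] + (rho if i == j else 0.0))
      for j in range(n)] for i in range(n)]

f = [1.0 + (i + 1) * d for i in range(n)]   # hypothetical source term
phi = [0.0] * n
for _ in range(4000):                       # iterate the contraction (12.4.21)
    phi = [(pv + h * fi) / (1.0 + rho * h) for pv, fi in zip(matvec(P, phi), f)]

# The fixed point of (12.4.21) should solve L phi + f = 0.
residual = max(abs(lv + fi) for lv, fi in zip(matvec(L, phi), f))
```

With $h$ chosen at the upper bound in (12.4.18), the diagonal of $P_\delta^{\alpha}$ vanishes here and each interior row sums to one, so the iteration behaves like a discounted random walk.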

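Let $\mathcal{P}(S_\delta)$ denote the set of all subsets of $S_\delta$ and, for $(T,\alpha,\zeta)$ in $\mathcal{P}(S_\delta) \times U \times Z_\delta$, denote by $O_{T,\alpha,\zeta}$ the operator
\[ O_{T,\alpha,\zeta}v(x) := \begin{cases} \mathcal{L}_\delta^{\alpha}v(x) & \text{if } x \in S_\delta\setminus T, \\ B^{\zeta}v(x) & \text{if } x \in T. \end{cases} \]
Problem (12.4.20) is equivalent to the fixed point problem
\[ \Phi_\delta(x) = \sup_{T\in\mathcal{P}(S_\delta),\,\alpha\in U,\,\zeta\in Z_\delta} O_{T,\alpha,\zeta}\Phi_\delta(x). \tag{12.4.23} \]
We define $\mathcal{T}_{ad}$ as
\[ \mathcal{T}_{ad} := \mathcal{P}(S_\delta)\setminus S_\delta \]
and restrict ourselves to the following problem:
\[ \Phi_\delta(x) = \sup_{T\in\mathcal{T}_{ad},\,\alpha\in U,\,\zeta\in Z_\delta} O_{T,\alpha,\zeta}\Phi_\delta(x) =: O\Phi_\delta(x). \tag{12.4.24} \]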

In other words, it is not admissible to make interventions at all points of $S_\delta$ (i.e., the continuation region is never the empty set). We can always assume that we order the points of the grid in such a way that it is not admissible to intervene at $x_1 \in S_\delta$. The operator $\mathcal{L}_\delta^{\alpha}$ is contractive (because $\|P_\delta^{\alpha}\|_\infty \le 1$ and $\rho h > 0$) and satisfies the discrete maximum principle, i.e.,
\[ \mathcal{L}_\delta^{\alpha}v_1 - \mathcal{L}_\delta^{\alpha}v_2 \le v_1 - v_2 \ \Rightarrow\ v_1 - v_2 \ge 0. \tag{12.4.25} \]
(If $v$ is a function from $S_\delta$ into $\mathbb{R}$, $v \ge 0$ means $v(x) \ge 0$ for all $x \in S_\delta$.) The operator $B^{\zeta}$ is nonexpansive and we need some additional hypothesis in order to be able to use a policy iteration algorithm for computing a solution of (12.4.21). We assume:

There exists an integer function $\sigma : \{1, 2, \ldots, N_\delta\} \times Z_\delta \to \{1, 2, \ldots, N_\delta\}$ such that for all $\zeta \in Z_\delta$ and all $i = 1, \ldots, N_\delta$
\[ \Gamma(x_i,\zeta) = x_{\sigma(i,\zeta)} \quad\text{with } \sigma(i,\zeta) < i. \tag{12.4.26} \]

The operator $B^{\zeta}$ defined in (12.4.22) can be rewritten as $B^{\zeta}v = \mathbf{B}^{\zeta}v + K^{\zeta}$, where $(\mathbf{B}^{\zeta}, \zeta \in Z_\delta)$ is a family of $N_\delta \times N_\delta$ Markovian matrices (except for the first row) defined by: $\mathbf{B}^{\zeta}_{i,j} = 1$ if $j = \sigma(i,\zeta)$ and $i \ne 1$, and 0 elsewhere.

Let $\zeta(\cdot)$ be a feedback Markovian control from $S_\delta$ into $Z_\delta$, and define the function $\bar\sigma$ on $S_\delta$ by $\bar\sigma(x) := \sigma(x, \zeta(x))$. Condition (12.4.26) implies that the $p$th composition of $\bar\sigma$ starting in $T \in \mathcal{T}_{ad}$ will end up in $S_\delta\setminus T$ after a finite number of iterations.

We can now consider the following Howard or policy iteration algorithm to solve problem (12.4.20) in the finite set $S_\delta$. It consists of constructing two sequences of feedback Markovian policies $\{(T_k, \alpha_k, \zeta_k), k \in \mathbb{N}\}$ and functions $\{v_k, k \in \mathbb{N}\}$ as follows. Let $v_0$ be a given initial function in $S_\delta$. For $k \ge 0$ we do the following iterations:

• (step $2k$) Given $v_k$, compute a feedback Markovian admissible policy $(T_{k+1}, \alpha_{k+1}, \zeta_{k+1})$ such that
\[ (T_{k+1}, \alpha_{k+1}, \zeta_{k+1}) \in \operatorname*{Argmax}_{T,\alpha,\zeta}\{O_{T,\alpha,\zeta}v_k\}. \tag{12.4.27} \]
In other words
\[
\begin{aligned}
& \alpha_{k+1}(x) \in \operatorname*{Argmax}_{\alpha\in U}\mathcal{L}_\delta^{\alpha}v_k(x) \quad\text{for all } x \text{ in } S_\delta; \\
& \zeta_{k+1}(x) \in \operatorname*{Argmax}_{\zeta\in Z_\delta} B_\delta^{\zeta}v_k(x) \quad\text{for all } x \text{ in } S_\delta; \\
& T_{k+1} = \{x \in S_\delta,\ \mathcal{L}_\delta^{\alpha_{k+1}(x)}v_k(x) > B_\delta^{\zeta_{k+1}(x)}v_k(x)\}.
\end{aligned}
\]
• (step $2k+1$) Compute $v_{k+1}$ as the solution of
\[ v_{k+1} = O_{T_{k+1},\alpha_{k+1},\zeta_{k+1}}v_{k+1}. \tag{12.4.28} \]

Set $k \leftarrow k+1$ and go to step $2k$.

It can be proved that if (12.4.15), (12.4.18), and (12.4.26) hold, then the sequence $\{v_k\}$ converges to the solution $\Phi_\delta$ of (12.4.20) and the sequence $\{(T_k, \alpha_k, \zeta_k)\}$ converges to the optimal feedback Markovian strategy. See [CMS] for a proof and [BT] for similar problems. For more information on the Howard algorithm, we refer to [Pu, LST]. For complements on numerical methods for HJB equations we refer, e.g., to [KD, LST].

Example 12.13 (Optimal Consumption and Portfolio with Both Fixed and Proportional Transaction Costs (3)) We go back to Example 12.12. We want to solve (11.2.21) numerically. We assume now that $S = (0,l) \times (0,l)$ with $l > 0$, and that the following boundary conditions hold:
\[ \psi(0,x_2) = \psi(x_1,0) = 0, \qquad \frac{\partial\psi}{\partial x_1}(l,x_2) = \frac{\partial\psi}{\partial x_2}(x_1,l) = 0 \quad\text{for all } (x_1,x_2) \text{ in } (0,l)\times(0,l). \]
Moreover we assume that the consumption is bounded by $u_{\max} > 0$ so that $U = [0, u_{\max}]$. Let $\delta > 0$ be a positive step and let $S_\delta = \{(i\delta, j\delta),\ i, j \in \{1, \ldots, N\}\}$ be the finite difference grid (we suppose that $N = l/\delta$ is an integer). We denote by $\psi_\delta$ the approximation of $\psi$ on the grid. We approximate the operator $L^{u}$ defined in (12.3.33) by the following finite difference operator on $S_\delta$:
\[ L_\delta^{u}\psi := -\rho\psi + r x_1\partial_1^{\delta+}\psi + \mu x_2\partial_2^{\delta+}\psi - u\,\partial_1^{\delta-}\psi + \frac12\sigma^2 x_2^2\,\partial_{22}^{\delta}\psi \]
and set the following boundary values:
\[ \psi_\delta(0,x_2) = \psi_\delta(x_1,0) = 0, \qquad \psi_\delta(l-\delta,x_2) = \psi_\delta(l,x_2), \qquad \psi_\delta(x_1,l-\delta) = \psi_\delta(x_1,l). \]
We then obtain a stable approximation. Take now
\[ h \le \Big(\frac{r x_1}{\delta} + \frac{\mu x_2}{\delta} + \Big(\frac{\sigma x_2}{\delta}\Big)^2 + \frac{u_{\max}}{\delta}\Big)^{-1}. \]
We obtain a problem of the form (12.4.20). In order to be able to apply the Howard algorithm described above, it remains to check that (12.4.26) holds. This is indeed the case since a finite number of transactions brings the state to the continuation region. The details are left as an exercise. This problem is solved in [CØS] by using another numerical method based on the iterative methods of Chap. 10.
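The Howard iteration of Sect. 12.4.2 can be sketched on a small toy problem. Everything concrete below is an assumption made for illustration: a grid of eight ordered points, a two-point control set, a random walk as sub-Markovian matrix, downward jumps as interventions (so that (12.4.26) holds), and hypothetical costs. The intervention set collects the points where intervening gives the larger value, which is how $O_{T,\alpha,\zeta}$ uses $T$, and the evaluation step (12.4.28) is solved by repeated sweeps instead of a direct linear solver.

```python
N = 8                       # grid points x_0 < ... < x_{N-1} (0-based indices)
H, RHO_H = 0.1, 0.1         # hypothetical time step h and product rho * h
CONTROLS = (0, 1)           # hypothetical finite control set U
C_FIX, C_PROP = 2.0, 0.5    # hypothetical fixed and proportional costs

def p_row(i, a):
    """Row i of P^a: a random walk on the grid, pushed downwards by a."""
    row = [0.0] * N
    row[max(i - 1, 0)] += 0.5 + 0.3 * a
    row[min(i + 1, N - 1)] += 0.5 - 0.3 * a
    return row

def f(i, a):
    """Hypothetical running reward: penalise large states and control effort."""
    return -float(i * i) - 2.0 * a

def cont(v, i, a):
    """Continuation operator (12.4.21) at x_i for control a."""
    return (sum(p * w for p, w in zip(p_row(i, a), v)) + H * f(i, a)) / (1 + RHO_H)

def interv(v, i, z):
    """Intervention operator (12.4.22): jump from x_i to x_{i-z}, so sigma < i."""
    return v[i - z] - C_FIX - C_PROP * z

def improve(v):
    """Step 2k: greedy policy with respect to v; no intervention at x_0."""
    alpha = [max(CONTROLS, key=lambda a: cont(v, i, a)) for i in range(N)]
    zeta = [0] + [max(range(1, i + 1), key=lambda z: interv(v, i, z))
                  for i in range(1, N)]
    T = {i for i in range(1, N) if interv(v, i, zeta[i]) > cont(v, i, alpha[i])}
    return T, alpha, zeta

def evaluate(T, alpha, zeta, v, sweeps=400):
    """Step 2k+1: solve v = O_{T,alpha,zeta} v by in-place sweeps."""
    v = list(v)
    for _ in range(sweeps):
        for i in range(N):  # ascending order, so v[i - z] is already updated
            v[i] = interv(v, i, zeta[i]) if i in T else cont(v, i, alpha[i])
    return v

def howard(max_iter=50, tol=1e-10):
    v = [0.0] * N
    for _ in range(max_iter):
        T, alpha, zeta = improve(v)
        v_new = evaluate(T, alpha, zeta, v)
        change = max(abs(x - y) for x, y in zip(v_new, v))
        v = v_new
        if change < tol:
            break
    return v, T, alpha, zeta

value, T_opt, alpha_opt, zeta_opt = howard()
```

Because every jump strictly lowers the index, a chain of interventions always ends in the continuation region, and each sweep contracts the error by the factor $1/(1+\rho h)$.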

12.5 Exercises

Exercise* 12.1 Let $K > 0$ be a constant and define
\[ G(x) = \begin{cases} K|x| & \text{for } -\frac1K \le x \le \frac1K, \\ 1 & \text{for } |x| > \frac1K. \end{cases} \]
Solve the optimal stopping problem
\[ \Phi(s,x) = \sup_{\tau\ge0} E^{x}\big[e^{-\rho(s+\tau)}G(B(\tau))\big], \]
where $B(t)$ is a one-dimensional Brownian motion starting at $x \in \mathbb{R}$. Distinguish between the two cases

(a) $K \le \sqrt{2\rho}/z$, where $z > 0$ is the unique positive solution of the equation
\[ \operatorname{tgh}(z) = \frac1z \quad\text{and}\quad \operatorname{tgh}(z) = \frac{e^{z} - e^{-z}}{e^{z} + e^{-z}}. \]
(b) $K > \sqrt{2\rho}/z$.

Exercise* 12.2 Assume that the state $X(t) = X^{(w)}(t)$ at time $t$ obtained by using a combined control $w = (u, v)$, where $u = u(t,\omega) \in \mathbb{R}$ and $v = (\tau_1, \tau_2, \ldots; \zeta_1, \zeta_2, \ldots)$ with $\zeta_i \in \mathbb{R}$, is given by
\[
\begin{aligned}
& dX(t) = u(t)dt + dB(t) + \int_{\mathbb{R}} z\,\tilde N(dt,dz), \quad \tau_i \le t < \tau_{i+1}, \\
& X(\tau_{i+1}) = X(\tau_{i+1}^{-}) + \Delta_N X(\tau_{i+1}) + \zeta_{i+1}, \qquad X(0^{-}) = x \in \mathbb{R}.
\end{aligned}
\]
Assume that the cost of applying such a control is
\[ J^{(w)}(s,x) = E^{x}\Big[\int_0^{\infty} e^{-\rho(s+t)}\big(X^{(w)}(t)^2 + \theta u(t)^2\big)dt + c\sum_i e^{-\rho(s+\tau_i)}\Big], \]
where $\rho$, $\theta$, and $c$ are positive constants. Consider the problem to find $\Phi(s,x)$ and $w^* = (u^*, v^*)$ such that
\[ \Phi(s,x) = \inf_{w} J^{(w)}(s,x) = J^{(w^*)}(s,x). \tag{12.5.1} \]
Let
\[ \Phi_1(s,x) = \inf_{u} J^{(u,0)}(s,x) \]
be the value function if we do not allow any impulse control (i.e., $v = 0$) and let
\[ \Phi_2(s,x) = \inf_{v} J^{(0,v)}(s,x) \]
be the value function if $u$ is fixed equal to 0, and only impulse controls are allowed. (See Exercises 5.4 and 9.1, respectively.) Prove that for $i = 1, 2$, there exists $(s,x) \in \mathbb{R} \times \mathbb{R}$ such that
\[ \Phi(s,x) < \Phi_i(s,x). \]
In other words, no matter how the positive parameter values $\rho$, $\theta$, and $c$ are chosen it is never optimal for the problem (12.5.1) to choose $u = 0$ or $v = 0$ (compare with Exercise 11.2). [Hint: Use Theorem 12.8].

Exercise 12.3 Prove the inequalities (12.3.32) and (12.3.34) and verify that the inequalities hold uniformly on compact subsets of $S^0$.

Chapter 13
Optimal Control of Stochastic Partial Differential Equations and Partial (Noisy) Observation Control

13.1 A Motivating Example

Example 13.1 Suppose the density $Y(t,x)$ of a fish population at time $t \in [0,T]$ and at the point $x \in D \subset \mathbb{R}^n$ (where $D$ is a given open set) is modeled by a stochastic partial differential equation (SPDE for short) of the form
\[
\begin{aligned}
dY(t,x) = {}& \Big[\frac12\Delta Y(t,x) + \alpha Y(t,x) - u(t,x)\Big]dt \\
& + \beta Y(t,x)dB(t) + Y(t^{-},x)\int_{\mathbb{R}}\zeta\,\tilde N(dt,d\zeta); \quad (t,x) \in (0,T)\times D,
\end{aligned} \tag{13.1.1}
\]
where we assume that $\zeta \ge -1+\varepsilon$ a.s. $\nu(d\zeta)$ for some constant $\varepsilon > 0$. The boundary conditions are:
\[ Y(0,x) = \xi(x); \quad x \in D \tag{13.1.2} \]
\[ Y(t,x) = \eta(t,x); \quad (t,x) \in [0,T)\times\partial D. \tag{13.1.3} \]
Here $dY(t,x) = d_tY(t,x)$ is the differential with respect to $t$, and
\[ \Delta = \Delta_x = \sum_{i=1}^{n}\frac{\partial^2}{\partial x_i^2} \]
is the Laplacian operator acting on the variable $x$. We assume that $\alpha$ and $\beta$ are constants and $\xi(x)$ and $\eta(t,x)$ are given deterministic functions. The process $u(t,x) \ge 0$ is our control, representing the harvesting rate density at $(t,x)$. Equation (13.1.1) is an example of a reaction-diffusion equation. With $u = 0$ and without the $\Delta$-term, the equation reduces to a geometric Lévy equation describing the growth with respect to $t$. The $\Delta$-term models the diffusion in space of the population.

Let $\mathcal{A}$ be a family of admissible controls, contained in the set of all $\mathbb{F}$-adapted processes $u(t,x,\omega)$ such that (13.1.1)–(13.1.3) has a unique solution $Y(t,x)$.

© Springer Nature Switzerland AG 2019 B. Øksendal and A. Sulem, Applied Stochastic Control of Jump Diffusions, Universitext, https://doi.org/10.1007/978-3-030-02781-0_13

Suppose the total expected utility from the harvesting rate $u(\cdot)$ and the corresponding terminal density $Y(T,x)$ is given by
\[ J(u) = E\Big[\int_0^{T}\!\!\int_D \frac{u^{\gamma}(t,x)}{\gamma}\,dx\,dt + \rho\int_D Y(T,x)\,dx\Big], \tag{13.1.4} \]
where $\gamma \in (0,1)$ and $\rho > 0$ are constants. We want to find $u^* \in \mathcal{A}$ such that
\[ \sup_{u\in\mathcal{A}} J(u) = J(u^*). \tag{13.1.5} \]
Such a control $u^*$ is called an optimal control. This is an example of a stochastic control problem for random jump fields, i.e., random fields which are solutions of stochastic partial differential equations driven by Brownian motions and Poisson random measures.

How do we solve problem (13.1.5)? It is possible to use a dynamic programming approach and formulate an infinite-dimensional HJB equation for the value function (see [Mort]) but this HJB equation is difficult to use. Therefore, we will instead formulate a maximum principle for such problems (Theorem 13.2). This principle can be used to solve such stochastic control problems in some cases. We will illustrate this by solving (13.1.5) by this method. Our presentation is an extension of the presentation given in the previous edition of the book. It is based on the recent paper [DØ4]. See also [Ø1, ØPZ].

13.2 The Maximum Principle

We first give a general formulation of the stochastic control problem we consider. Suppose that the state $Y(t,x) = Y^{(u)}(t,x)$ at $(t,x)$ is described by a stochastic partial differential equation of the form
\[
\begin{aligned}
dY(t,x) = {}& \big[L_{u(t,x)}Y(t,x) + b(t,x,Y(t,x),u(t,x))\big]dt + \sigma(t,x,Y(t,x),u(t,x))dB(t) \\
& + \int_{\mathbb{R}^k}\theta(t,x,Y(t,x),u(t,x),\zeta)\tilde N(dt,d\zeta); \quad (t,x) \in (0,T)\times D
\end{aligned} \tag{13.2.1}
\]
with boundary conditions
\[ Y(0,x) = \xi(x); \quad x \in D \tag{13.2.2} \]
\[ Y(t,x) = \eta(t,x); \quad (t,x) \in [0,T]\times\partial D. \tag{13.2.3} \]
The operator $L_u$ is a linear integro-differential operator acting on $x$, with parameter $u$, and the expression $L_{u(t,x)}Y(t,x)$ means $L_uY(t,x)|_{u=u(t,x)}$. The Eq. (13.2.1) for $Y$ is interpreted in the weak (variational) sense, i.e., $Y(t,\cdot)$ satisfies the equation
\[
\begin{aligned}
(Y(t,\cdot),\phi)_{L^2(D)} = {}& (a,\phi)_{L^2(D)} + \int_0^{t}(Y(s,\cdot),L_u^{*}\phi)_{L^2(D)}ds \\
& + \int_0^{t}(b(s,Y(s,\cdot),\cdot),\phi)_{L^2(D)}ds + \int_0^{t}(\sigma(s,Y(s,\cdot),\cdot),\phi)_{L^2(D)}dB(s) \\
& + \int_0^{t}\!\!\int_{\mathbb{R}^k}(\gamma(s,Y(s,\cdot),\zeta,\cdot),\phi)_{L^2(D)}\tilde N(ds,d\zeta),
\end{aligned} \tag{13.2.4}
\]
for all smooth functions $\phi$ with compact support in $D$. Here
\[ (\psi,\phi)_{L^2(D)} = \int_D\psi(x)\phi(x)dx \tag{13.2.5} \]
is the $L^2$ inner product on $D$ and $L_u^{*}$ is the adjoint of the operator $L_u$, in the sense that
\[ (L_u\psi,\phi)_{L^2(D)} = (\psi,L_u^{*}\phi)_{L^2(D)} \tag{13.2.6} \]

for all smooth $L^2$ functions $\psi, \phi$ with compact support in $D$. It is known that the Itô formula can be applied to such SPDEs. See [Pa1, PrR].

The coefficients $b : [0,T]\times D\times\mathbb{R}^n\times V \to \mathbb{R}$, $\sigma : [0,T]\times D\times\mathbb{R}^n\times V \to \mathbb{R}$ and $\theta : [0,T]\times D\times\mathbb{R}^n\times V\times\mathbb{R}^k \to \mathbb{R}$ are given functions and $V \subseteq \mathbb{R}$ is a given closed set of admissible control values. Let $f : [0,T]\times D\times\mathbb{R}^n\times V \to \mathbb{R}$ and $g : D\times\mathbb{R}^n \to \mathbb{R}$ be a given profit rate function and bequest rate function, respectively. Let $\mathcal{A}$ be a given family of admissible controls, contained in the set of $\mathcal{F}_t$-adapted right-continuous stochastic processes $u(t,x) \in U$ such that (13.2.1)–(13.2.3) has a unique solution $Y(t,x)$ and such that
\[ E\Big[\int_0^{T}\!\!\int_D\big|f(t,x,Y(t,x),u(t,x))\big|\,dx\,dt + \int_D\big|g(x,Y(T,x))\big|\,dx\Big] < \infty. \tag{13.2.7} \]
If $u \in \mathcal{A}$ we define its performance functional $J(u)$ by
\[ J(u) = E\Big[\int_0^{T}\!\!\int_D f(t,x,Y(t,x),u(t,x))\,dx\,dt + \int_D g(x,Y(T,x))\,dx\Big]. \tag{13.2.8} \]
Problem 13.1 Find $u^* \in \mathcal{A}$ such that
\[ \sup_{u\in\mathcal{A}} J(u) = J(u^*). \tag{13.2.9} \]

Such a process $u^*$ is called an optimal control (if it exists). The number
\[ J^* := \sup_{u\in\mathcal{A}} J(u) \tag{13.2.10} \]
is called the value of this problem.

We now state the maximum principle for this problem. Define the Hamiltonian $H : [0,T]\times D\times\mathbb{R}^n\times\mathcal{D}\times V\times\mathbb{R}\times\mathbb{R}\times\mathcal{R}\times\Omega \to \mathbb{R}$ by
\[
\begin{aligned}
H(t,x,y,\varphi,u,p,q,r) &= H(t,x,y,\varphi,u,p,q,r,\omega) \\
&= f(t,x,y,u) + [L_u(\varphi) + b(t,x,y,u)]p + \sigma(t,x,y,u)q + \int_{\mathbb{R}^k}\gamma(t,x,y,u,\zeta)r(\zeta)\nu(d\zeta).
\end{aligned} \tag{13.2.11}
\]
Here $\mathcal{D}$ denotes the domain of definition for the operator $L_u$, while $\mathcal{R}$ denotes the set of all functions $r(\cdot) : \mathbb{R} \to \mathbb{R}$ such that the last integral above converges. We assume that $\mathcal{D}$ is a Banach space. The quantities $p, q, r(\cdot)$ are called the adjoint variables. We define the adjoint processes $p(t,x), q(t,x), r(t,x,\zeta)$ as the solution of the backward stochastic partial differential equation (BSPDE)
\[
\begin{cases}
dp(t,x) = -\Big[L_{u(t,x)}^{*}p(t,x) + \dfrac{\partial H}{\partial y}(t,x)\Big]dt + q(t,x)dB(t) \\[1ex]
\qquad\qquad + \displaystyle\int_{\mathbb{R}} r(t,x,\zeta)\tilde N(dt,d\zeta); \quad (t,x) \in (0,T)\times D, \\[1ex]
p(T,x) = \dfrac{\partial g}{\partial y}(x,Y(T,x)); \quad x \in D, \\[1ex]
p(t,x) = 0; \quad (t,x) \in [0,T]\times\partial D,
\end{cases} \tag{13.2.12}
\]
where
\[ \frac{\partial H}{\partial y}(t,x) = \frac{\partial H}{\partial y}(t,x,y,Y(t,\cdot),u(t,x),p(t,x),q(t,x),r(t,x,\cdot))\Big|_{y=Y(t,x)}. \tag{13.2.13} \]
Note that for fixed $t, u, p, q, r$ we can regard
\[ \varphi \mapsto h(\varphi)(x) := H(t,x,\varphi(x),\varphi,u,p,q,r) \tag{13.2.14} \]
as a map from $\mathcal{D}$ into $\mathbb{R}$. The Fréchet derivative at $\varphi$ of this map is the linear operator $\nabla_\varphi h$ on $\mathcal{D}$ given by
\[ \langle\nabla_\varphi h,\psi\rangle = \nabla_\varphi h(\psi) = L_u(\psi)(x)p + \psi(x)\frac{\partial H}{\partial y}(t,x,y,\varphi,u,p,q,r)\Big|_{y=\varphi(x)}; \quad \psi \in \mathcal{D}. \tag{13.2.15} \]
For simplicity of notation, if there is no risk of confusion, we will denote $h$ by $H$ from now on.

The following result is an extension to control-dependent integro-differential operators taken from [DØ4]. For earlier, related results see [Ø1, ØPZ].

Theorem 13.2 (A Sufficient Maximum Principle for Random Jump Fields) [DØ4] Let $\hat u \in \mathcal{A}$ with associated solution $\hat Y(t,x), \hat p(t,x), \hat q(t,x), \hat r(t,x,\zeta)$ of (13.2.1) and (13.2.12). Assume that the following hold:
\[ y \mapsto g(x,y) \text{ is concave for all } x, \tag{13.2.16} \]
\[ (\varphi,u) \mapsto H(t,x,\varphi(x),\varphi,u,\hat p(t,x),\hat q(t,x),\hat r(t,x,\zeta)) \text{ is concave for all } t,x,\zeta, \tag{13.2.17} \]
\[ \sup_{w\in U} H\big(t,x,\hat Y(t,x),\hat Y(t,\cdot)(x),w,\hat p(t,x),\hat q(t,x),\hat r(t,x,\zeta)\big) \tag{13.2.18} \]
\[ \quad = H\big(t,x,\hat Y(t,x),\hat Y(t,\cdot)(x),\hat u(t,x),\hat p(t,x),\hat q(t,x),\hat r(t,x,\zeta)\big) \text{ for all } t,x,\zeta. \tag{13.2.19} \]
Then $\hat u(\cdot,\cdot)$ is an optimal control for Problem 13.1.

Proof By considering an increasing sequence of stopping times $\tau_n$ converging to $T$, we may assume that all local martingales appearing in the computations below are martingales and hence have expectation 0. See [ØS2]. We omit the details.

Choose arbitrary $u(\cdot,\cdot) \in \mathcal{A}$, and let the corresponding solution of (13.2.1) and (13.2.12) be $Y(t,x)$ and $p(t,x), q(t,x), r(t,x,\zeta)$, respectively. For simplicity of notation we write $f = f(t,x,Y(t,x),u(t,x))$, $\hat f = f(t,x,\hat Y(t,x),\hat u(t,x))$ and similarly with $b, \hat b, \sigma, \hat\sigma$ and so on. Moreover put
\[ \hat H(t,x) = H(t,x,\hat Y(t,x),\hat Y(t,\cdot)(x),\hat u(t,x),\hat p(t,x),\hat q(t,x),\hat r(t,x,\cdot)) \tag{13.2.20} \]
and
\[ H(t,x) = H(t,x,Y(t,x),Y(t,\cdot)(x),u(t,x),\hat p(t,x),\hat q(t,x),\hat r(t,x,\cdot)). \tag{13.2.21} \]
In the following we write $\tilde f = f - \hat f$, $\tilde b = b - \hat b$, $\tilde Y = Y - \hat Y$. Consider
\[ J(u) - J(\hat u) = I_1 + I_2, \]
where
\[ I_1 = E\Big[\int_0^{T}\!\!\int_D\{f(t,x) - \hat f(t,x)\}dx\,dt\Big], \qquad I_2 = E\Big[\int_D\{g(x) - \hat g(x)\}dx\Big]. \tag{13.2.22} \]

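By the definition of $H$ we have
\[
\begin{aligned}
I_1 = E\Big[\int_0^{T}\!\!\int_D\Big\{& H(t,x) - \hat H(t,x) - \hat p(t,x)\big[L_uY(t,x) - L_{\hat u}\hat Y(t,x) + \tilde b(t,x)\big] \\
& - \hat q(t,x)\tilde\sigma(t,x) - \int_{\mathbb{R}}\hat r(t,x,\zeta)\tilde\gamma(t,x,\zeta)\nu(d\zeta)\Big\}dx\,dt\Big].
\end{aligned} \tag{13.2.23}
\]
Since $g$ is concave with respect to $y$ we have
\[ g(x,Y(T,x)) - g(x,\hat Y(T,x)) \le \frac{\partial g}{\partial y}(x,\hat Y(T,x))(Y(T,x) - \hat Y(T,x)), \tag{13.2.24} \]
and hence
\[
\begin{aligned}
I_2 &\le E\Big[\int_D\frac{\partial g}{\partial y}(x,\hat Y(T,x))\tilde Y(T,x)dx\Big] = E\Big[\int_D\hat p(T,x)\tilde Y(T,x)dx\Big] \\
&= E\Big[\int_D\Big(\int_0^{T}\tilde Y(t,x)d\hat p(t,x) + \int_0^{T}\hat p(t,x)d\tilde Y(t,x) + \int_0^{T}d[\hat p,\tilde Y]_t\Big)dx\Big] \\
&= E\Big[\int_D\int_0^{T}\Big\{\hat p(t,x)\big[L_uY(t,x) - L_{\hat u}\hat Y(t,x) + \tilde b(t,x)\big] - \tilde Y(t,x)\Big[L_{\hat u}^{*}\hat p(t,x) + \frac{\partial\hat H}{\partial y}(t,x)\Big] \\
&\qquad\qquad + \tilde\sigma(t,x)\hat q(t,x) + \int_{\mathbb{R}}\tilde\gamma(t,x,\zeta)\hat r(t,x,\zeta)\nu(d\zeta)\Big\}dt\,dx\Big],
\end{aligned} \tag{13.2.25}
\]
where
\[ \frac{\partial\hat H}{\partial y}(t,x) = \frac{\partial H}{\partial y}(t,x,y,\hat Y(t,\cdot)(x),\hat u(t,x),\hat p(t,x),\hat q(t,x),\hat r(t,x,\cdot))\Big|_{y=\hat Y(t,x)}. \tag{13.2.26} \]
By a slight extension of (13.2.6) we get
\[ \int_D\tilde Y(t,x)L_{\hat u}^{*}\hat p(t,x)dx = \int_D\hat p(t,x)L_{\hat u}\tilde Y(t,x)dx. \tag{13.2.27} \]
Therefore, adding (13.2.23)–(13.2.25) and using (13.2.27) we get
\[ J(u) - J(\hat u) \le E\Big[\int_D\int_0^{T}\Big\{H(t,x) - \hat H(t,x) - \Big[\hat p(t,x)L_{\hat u}(\tilde Y)(t,x) + \tilde Y(t,x)\frac{\partial\hat H}{\partial y}(t,x)\Big]\Big\}dt\,dx\Big]. \tag{13.2.28} \]
Hence
\[ J(u) - J(\hat u) \le E\Big[\int_D\int_0^{T}\big\{H(t,x) - \hat H(t,x) - \nabla_{\hat Y}\hat H(\tilde Y)(t,x)\big\}dt\,dx\Big], \tag{13.2.29} \]
where
\[ \nabla_{\hat Y}\hat H(\tilde Y) = \nabla_\varphi\hat H(\tilde Y)\big|_{\varphi=\hat Y}. \tag{13.2.30} \]
By the concavity assumption of $H$ in $(\varphi,u)$ we have
\[ H(t,x) - \hat H(t,x) \le \nabla_{\hat Y}\hat H(Y - \hat Y)(t,x) + \frac{\partial\hat H}{\partial u}(t,x)(u(t,x) - \hat u(t,x)), \tag{13.2.31} \]
and the maximum condition implies that
\[ \frac{\partial\hat H}{\partial u}(t,x)(u(t,x) - \hat u(t,x)) \le 0. \tag{13.2.32} \]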

Hence by (13.2.29) we get $J(u) \le J(\hat u)$. Since $u \in \mathcal{A}$ was arbitrary, this shows that $\hat u$ is optimal.

In many cases the Hamiltonian
\[ h(\varphi,u) := H(t,x,\varphi(x),\varphi,u,\hat p(t,x),\hat q(t,x),\hat r(t,x,\cdot)) \]
is not concave in both variables $\varphi, u$. In such cases it might be useful to replace the concavity condition in $(\varphi,u)$ (see (13.2.18)) by a weaker condition, called the Arrow condition, which is the following:

The Arrow condition. For each fixed $t, x$ the function
\[ \hat h(\varphi) := \max_{v\in V} H(t,x,\varphi(x),\varphi,v,\hat p(t,x),\hat q(t,x),\hat r(t,x,\cdot)) \tag{13.2.33} \]
exists and is a concave function of $\varphi$.

We then get the following extension of Theorem 13.2:

Corollary 13.3 (Strengthened Maximum Principle) Let $\hat u(t,x), \hat Y(t,x), \hat p(t,x), \hat q(t,x)$ and $\hat r(t,x,\cdot)$ be as in Theorem 13.2. Assume that $g(x,y)$ is concave in $y$ for each $x$ and that the Arrow condition (13.2.33) and the maximum condition (13.2.19) hold. Then $\hat u(t,x)$ is an optimal control for the stochastic control problem (13.2.9).

Proof We proceed as in the proof of Theorem 13.2, up to and including (13.2.30). Then to obtain
\[ H - \hat H - \nabla_{\hat Y}\hat H(\tilde Y) \le 0 \tag{13.2.34} \]
we note that
\[ H - \hat H - \nabla_{\hat Y}\hat H(\tilde Y) = h(\varphi,u) - h(\hat\varphi,\hat u) - \langle\nabla_{\hat\varphi}h,\tilde Y\rangle. \tag{13.2.35} \]
This is $\le 0$ by the same argument as in the deterministic case. See [SeSy], Theorem 5, p. 107–108. For completeness we give the details: Note that by the maximum condition (13.2.19) we have
\[ h(\hat\varphi,\hat u) = \hat h(\hat\varphi). \tag{13.2.36} \]
Moreover, by definition of $\hat h$,
\[ h(\varphi,u) \le \hat h(\varphi) \quad\text{for all } \varphi, u. \tag{13.2.37} \]
Therefore, subtracting (13.2.36) from (13.2.37) gives
\[ h(\varphi,u) - h(\hat\varphi,\hat u) \le \hat h(\varphi) - \hat h(\hat\varphi) \quad\text{for all } \varphi, u. \tag{13.2.38} \]
Accordingly, to prove (13.2.34) it suffices to prove that (see (13.2.35))
\[ \hat h(\varphi) - \hat h(\hat\varphi) - \langle\nabla_{\hat\varphi}h,\tilde Y\rangle \le 0. \tag{13.2.39} \]
To this end, note that since the function $\hat h(\varphi)$ is concave it follows by a standard separating hyperplane argument (see e.g., [SeSy], Chap. 5, Sect. 23) that there exists a supergradient $a$ for $\hat h(\varphi)$ at $\varphi = \hat\varphi$, i.e.,
\[ \hat h(\varphi) - \hat h(\hat\varphi) \le \langle a,\varphi - \hat\varphi\rangle \quad\text{for all } \varphi. \tag{13.2.40} \]
Define
\[ F(\varphi) = h(\varphi,\hat u) - h(\hat\varphi,\hat u) - \langle a,\varphi - \hat\varphi\rangle. \]
Then by (13.2.38) and (13.2.40) we have
\[ F(\varphi) \le 0 \quad\text{for all } \varphi. \]
Moreover, by definition of $F$ we have $F(\hat\varphi) = 0$. Hence $F$ attains its maximum at $\hat\varphi$, so that $\nabla_{\hat\varphi}F = 0$, i.e., $\nabla_{\hat\varphi}h = a$. Combining this with (13.2.40), we obtain (13.2.39) and the proof is complete.

13.2.1 Return to Example 13.1

As an illustration of the maximum principle let us apply it to solve the problem in Example 13.1. In this case the Hamiltonian is
\[ H(t,x,y,u,p,q,r) = \frac{u^{\gamma}}{\gamma} + (\alpha y - u)p + \beta yq + y\int_{\mathbb{R}} r(\zeta)\zeta\nu(d\zeta), \tag{13.2.41} \]
which is clearly concave in $(y,u)$. The adjoint equation is
\[
\begin{aligned}
dp(t,x) = {}& -\Big[\frac12\Delta p(t,x) + \alpha p(t,x) + \beta q(t,x) + \int_{\mathbb{R}} r(t,x,\zeta)\zeta\nu(d\zeta)\Big]dt \\
& + q(t,x)dB(t) + \int_{\mathbb{R}} r(t,x,\zeta)\tilde N(dt,d\zeta); \quad (t,x) \in (0,T)\times D,
\end{aligned} \tag{13.2.42}
\]
\[ p(T,x) = \rho; \quad x \in D, \tag{13.2.43} \]
\[ p(t,x) = 0; \quad (t,x) \in (0,T)\times\partial D. \tag{13.2.44} \]
Since the coefficients $\alpha, \beta$ and the boundary value $\rho$ are all deterministic, we see that we can choose $q(t,x) = r(t,x,z) = 0$ and solve the resulting deterministic equation
\[ \frac{\partial}{\partial t}p(t,x) = -\frac12\Delta p(t,x) - \alpha p(t,x); \quad (t,x) \in (0,T)\times D \tag{13.2.45} \]
(together with (13.2.43)–(13.2.44)) for a deterministic solution $p(t,x)$. This is a classical boundary value problem, and it is well known that the solution can be expressed as follows (Fig. 13.1):
\[ p(t,x) = \rho e^{\alpha(T-t)}P[W^{x}(s) \in D \text{ for all } s \in [t,T]], \tag{13.2.46} \]
where $W^{x}(\cdot)$ denotes an auxiliary $n$-dimensional Brownian motion starting at $x \in \mathbb{R}^n$ with probability law $P$. (See e.g., [KS, Chap. 4], or [Ø1, Chap. 9]). The function
\[ u \mapsto H(t,x,y,u,p,q,r) = \frac{u^{\gamma}}{\gamma} + (\alpha y - u)p + \beta yq + y\int_{\mathbb{R}} r(\zeta)\zeta\nu(d\zeta) \]
is maximal when
\[ u = \hat u(t,x) = (p(t,x))^{1/(\gamma-1)}, \tag{13.2.47} \]
where $p(t,x)$ is given by (13.2.46).
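For a concrete feel for (13.2.46)–(13.2.47), the following sketch takes the hypothetical one-dimensional domain $D = (0,1)$ and reads the probability in (13.2.46) as the probability that a Brownian motion started at $x$ stays in $D$ for a time span of length $T - t$. That survival probability has a classical sine-series representation, which is used here instead of simulation. The parameter values and function names are assumptions for illustration only.

```python
import math

def survival(s, x, terms=200):
    """P[x + W(r) stays in (0, 1) for 0 <= r <= s], from the sine-series
    solution of q_s = q_xx / 2 with q = 0 on the boundary and q(0, .) = 1."""
    return sum(4.0 / (k * math.pi) * math.sin(k * math.pi * x)
               * math.exp(-0.5 * (k * math.pi) ** 2 * s)
               for k in range(1, 2 * terms, 2))

def adjoint(t, x, T=1.0, rho=1.0, alpha=0.2):
    """p(t, x) as in (13.2.46) for D = (0, 1)."""
    return rho * math.exp(alpha * (T - t)) * survival(T - t, x)

def harvest(t, x, gamma=0.5):
    """Candidate optimal harvesting rate (13.2.47)."""
    return adjoint(t, x) ** (1.0 / (gamma - 1.0))

# Finite-difference check of (13.2.45): p_t + p_xx / 2 + alpha * p = 0.
t0, x0, dt, dx, alpha = 0.3, 0.4, 1e-4, 1e-3, 0.2
p_t = (adjoint(t0 + dt, x0) - adjoint(t0 - dt, x0)) / (2 * dt)
p_xx = (adjoint(t0, x0 + dx) - 2 * adjoint(t0, x0) + adjoint(t0, x0 - dx)) / dx ** 2
pde_residual = p_t + 0.5 * p_xx + alpha * adjoint(t0, x0)

# First-order condition behind (13.2.47): d/du (u^gamma / gamma - u p) = 0.
u_hat = harvest(t0, x0)
foc_residual = u_hat ** (0.5 - 1.0) - adjoint(t0, x0)
```

The first-order condition $u^{\gamma-1} - p = 0$ is all that is needed here, since the $u$-dependent part of (13.2.41), $u^{\gamma}/\gamma - up$, is concave in $u$ for $\gamma \in (0,1)$.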

Fig. 13.1 Interpretation of the function p(t, x)

With this choice of $\hat u(t,x)$ we see that all the conditions of Theorem 13.2 are satisfied and we conclude that $\hat u(t,x)$ is an optimal harvesting rate for Example 13.1.

Example 13.4 ([ØPZ]) The solution $\hat u(t,x)$ of Example 13.1 is a bit degenerate, in the sense that it is deterministic and hence independent of the history of the population density $Y(t,x)$. The mathematical reason for this is the deterministic parameters of the adjoint equation, including the terminal condition $p(T,x) = \rho$. Therefore, let us consider a more general situation, where the performance functional $J(u)$ of (13.1.4) is replaced by
\[ J(u) = E\Big[\int_0^{T}\!\!\int_{\mathbb{R}}\frac{u^{\gamma}(t,x)}{\gamma}\,dx\,dt + \int_{\mathbb{R}} g(x,Y(T,x))\,dx\Big], \tag{13.2.48} \]
where $g : \mathbb{R}^2 \to \mathbb{R}$ is a given $C^1$-function. The Hamiltonian remains the same as in Example 13.1, and hence the candidate $\hat u(t,x)$ for the optimal control has the same form as in (13.2.47), i.e.,
\[ \hat u(t,x) = (p(t,x))^{\frac{1}{\gamma-1}}. \tag{13.2.49} \]
The difference is that now we have to work harder to find $p(t,x)$. The backward SPDE is now
\[
\begin{aligned}
dp(t,x) = {}& -\Big[\alpha p(t,x) + \beta q(t,x) + \int_{\mathbb{R}} r(t,x,\zeta)\zeta\nu(d\zeta) + \frac12\Delta p(t,x)\Big]dt \\
& + q(t,x)dB(t) + \int_{\mathbb{R}} r(t,x,\zeta)\tilde N(dt,d\zeta); \quad (t,x) \in [0,T]\times\mathbb{R},
\end{aligned} \tag{13.2.50}
\]

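\[ p(T,x) = F(x); \quad x \in \mathbb{R}, \tag{13.2.51} \]
\[ \lim_{|x|\to\infty} p(t,x) = 0; \quad t \in [0,T], \tag{13.2.52} \]
where
\[ F(x) = F(x,\omega) = \frac{\partial g}{\partial y}(x,Y(T,x)); \quad x \in \mathbb{R}. \tag{13.2.53} \]
To solve this equation we proceed as follows: Put
\[ \tilde p(t,x) = e^{\alpha t}p(t,x). \tag{13.2.54} \]
This transforms (13.2.50)–(13.2.52) into
\[
\begin{aligned}
d\tilde p(t,x) = {}& -\beta e^{\alpha t}q(t,x)dt - \frac12\Delta\tilde p(t,x)dt - e^{\alpha t}\int_{\mathbb{R}} r(t,x,\zeta)\zeta\nu(d\zeta)dt \\
& + e^{\alpha t}q(t,x)dB(t) + e^{\alpha t}\int_{\mathbb{R}} r(t,x,\zeta)\tilde N(dt,d\zeta); \quad t < T,
\end{aligned} \tag{13.2.55}
\]
\[ \tilde p(T,x) = e^{\alpha T}F(x), \tag{13.2.56} \]
\[ \lim_{|x|\to\infty}\tilde p(t,x) = 0. \tag{13.2.57} \]
Define the probability measure $P_0$ on $\mathcal{F}_T$ by $dP_0(\omega) = Z(T)dP(\omega)$, where
\[ Z(t) = \exp\Big(\beta B(t) - \frac12\beta^2t + \int_0^{t}\!\!\int_{\mathbb{R}}\ln(1+\zeta)\tilde N(ds,d\zeta) + \int_0^{t}\!\!\int_{\mathbb{R}}\{\ln(1+\zeta) - \zeta\}\nu(d\zeta)ds\Big); \quad 0 \le t \le T. \tag{13.2.58} \]
Then by the Girsanov theorem (use Theorem 1.33 with $u = -\beta$ and $\theta(t,\zeta) = -\zeta$) the process
\[ B_0(t) := B(t) - \beta t \]
is a Brownian motion with respect to $P_0$ and the random measure
\[ \tilde N_0(dt,d\zeta) := \tilde N(dt,d\zeta) - \zeta\nu(d\zeta)dt \]
is a compensated Poisson random measure with respect to $P_0$.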

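In terms of $dB_0(t)$ and $\tilde N_0(dt,d\zeta)$ (13.2.55) gets the form
\[ d\tilde p(t,x) = -\frac12\Delta\tilde p(t,x)dt + e^{\alpha t}q(t,x)dB_0(t) + e^{\alpha t}\int_{\mathbb{R}} r(t,x,\zeta)\tilde N_0(dt,d\zeta). \tag{13.2.59} \]
Suppose
\[ E_0\Big[\int_{\mathbb{R}} F^2(x)dx\Big] < \infty, \tag{13.2.60} \]
where $E_0$ denotes the expectation with respect to $P_0$. Then by the Itô representation theorem (see e.g., [I]), there exists for a.a. $x \in \mathbb{R}$ a unique pair of adapted processes $(\varphi(t,x), \psi(t,x,\zeta))$ such that
\[ E_0\Big[\int_0^{T}\varphi^2(t,x)dt + \int_0^{T}\!\!\int_{\mathbb{R}}\psi^2(t,x,\zeta)\nu(d\zeta)dt\Big] < \infty \]
and
\[ e^{\alpha T}F(x) = h(x) + \int_0^{T}\varphi(t,x)dB_0(t) + \int_0^{T}\!\!\int_{\mathbb{R}}\psi(t,x,\zeta)\tilde N_0(dt,d\zeta), \tag{13.2.61} \]
where
\[ h(x) = E_0\big[e^{\alpha T}F(x)\big]. \]
Let $Q_t$ be the heat operator, defined by
\[ (Q_tf)(x) = (2\pi t)^{-1/2}\int_{\mathbb{R}} f(y)\exp\Big(-\frac{(x-y)^2}{2t}\Big)dy; \quad f \in \mathcal{D}, \]
where $\mathcal{D}$ is the set of functions $f : \mathbb{R} \to \mathbb{R}$ such that the integral exists. Define
\[
\begin{aligned}
\hat p(t,x) &:= Q_{T-t}\Big(\int_0^{t}\varphi(s,\cdot)dB_0(s) + \int_0^{t}\!\!\int_{\mathbb{R}}\psi(s,\cdot,\zeta)\tilde N_0(ds,d\zeta) + h(\cdot)\Big)(x) \\
&= \int_0^{t}Q_{T-t}\varphi(s,\cdot)(x)dB_0(s) + \int_0^{t}\!\!\int_{\mathbb{R}}Q_{T-t}\psi(s,\cdot,\zeta)(x)\tilde N_0(ds,d\zeta) + Q_{T-t}h(x).
\end{aligned} \tag{13.2.62}
\]
Then, since
\[ \frac{d}{dt}Q_tf = \frac12\Delta(Q_tf), \]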

13.2 The Maximum Principle

325

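we see that

$$\begin{aligned}
d\hat p(t,x) &= Q_{T-t}\varphi(t,\cdot)(x)\,dB_0(t) + \Big[\int_0^t -\tfrac12\Delta\big(Q_{T-t}\varphi(s,\cdot)\big)(x)\,dB_0(s)\Big]dt\\
&\quad + \int_{\mathbb{R}}Q_{T-t}\psi(t,\cdot,\zeta)(x)\,\tilde N_0(dt,d\zeta) + \Big[\int_0^t\!\!\int_{\mathbb{R}} -\tfrac12\Delta\big(Q_{T-t}\psi(s,\cdot,\zeta)\big)(x)\,\tilde N_0(ds,d\zeta)\Big]dt\\
&\quad - \tfrac12\Delta\big(Q_{T-t}h(x)\big)\,dt\\
&= -\tfrac12\Delta\Big[\int_0^t Q_{T-t}\varphi(s,\cdot)(x)\,dB_0(s) + \int_0^t\!\!\int_{\mathbb{R}}Q_{T-t}\psi(s,\cdot,\zeta)(x)\,\tilde N_0(ds,d\zeta) + Q_{T-t}h(x)\Big]dt\\
&\quad + Q_{T-t}\varphi(t,\cdot)(x)\,dB_0(t) + \int_{\mathbb{R}}Q_{T-t}\psi(t,\cdot,\zeta)(x)\,\tilde N_0(dt,d\zeta)\\
&= -\tfrac12\Delta\hat p(t,x)\,dt + Q_{T-t}\varphi(t,\cdot)(x)\,dB_0(t) + \int_{\mathbb{R}}Q_{T-t}\psi(t,\cdot,\zeta)(x)\,\tilde N_0(dt,d\zeta).
\end{aligned}\tag{13.2.63}$$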
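The computation in (13.2.63) rests on the fact that $(Q_tf)(x)$ solves the heat equation $\partial_t(Q_tf) = \tfrac12\Delta(Q_tf)$. The following Python sketch checks this numerically for one test function. The choice $f(y) = e^{-y^2/2}$, the evaluation point, the quadrature grid and the helper names `heat_operator` and `closed_form` are illustrative assumptions, not part of the text.

```python
import math

def heat_operator(f, t, x, half_width=12.0, n=4000):
    """Approximate (Q_t f)(x) by the trapezoidal rule on [x - half_width, x + half_width]."""
    a = x - half_width
    h = 2.0 * half_width / n
    total = 0.0
    for i in range(n + 1):
        y = a + i * h
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * f(y) * math.exp(-(x - y) ** 2 / (2.0 * t))
    return total * h / math.sqrt(2.0 * math.pi * t)

def f(y):
    # Assumed test function (an unnormalised Gaussian).
    return math.exp(-0.5 * y * y)

def closed_form(t, x):
    # For this particular f, Q_t f is again Gaussian with variance 1 + t.
    return math.exp(-x * x / (2.0 * (1.0 + t))) / math.sqrt(1.0 + t)

t, x, eps = 0.7, 0.4, 1e-3
q_value = heat_operator(f, t, x)

# Central finite differences for d/dt and (1/2) d^2/dx^2.
dq_dt = (heat_operator(f, t + eps, x) - heat_operator(f, t - eps, x)) / (2.0 * eps)
half_laplacian = 0.5 * (
    heat_operator(f, t, x + eps) - 2.0 * q_value + heat_operator(f, t, x - eps)
) / eps ** 2

print(q_value, closed_form(t, x))
print(dq_dt, half_laplacian)
```

The two printed pairs should agree up to discretisation error, which is what makes the drift term in (13.2.63) collapse to $-\tfrac12\Delta\hat p\,dt$ once $Q_{T-t}$ is differentiated in $t$.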

Comparing with (13.2.59) we see that the triple $(\tilde p,q,r)$ given by

$$\tilde p(t,x) := \hat p(t,x), \tag{13.2.64}$$

$$q(t,x) := e^{-\alpha t}Q_{T-t}\varphi(t,\cdot)(x), \tag{13.2.65}$$

$$r(t,x,\zeta) := e^{-\alpha t}Q_{T-t}\psi(t,\cdot,\zeta)(x) \tag{13.2.66}$$

solves the backward SPDE (13.2.59), and hence it solves (13.2.55), together with the terminal values (13.2.56) and (13.2.57). We have proved:

Theorem 13.5 Assume that (13.2.60) holds. Then the optimal control of the problem (13.1.1)–(13.1.3), (13.1.5), with performance functional $J(u)$ as in (13.2.48), satisfies

$$\hat u(t,x) = (p(t,x))^{1/(\gamma-1)},$$

where $p(t,x) = e^{-\alpha t}\hat p(t,x)$, with $\hat p(t,x)$ defined by (13.2.62) and (13.2.61).

13.3 A Necessary Maximum Principle

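We proceed to establish a corresponding necessary maximum principle. For this, we do not need concavity conditions, but instead we need the following assumptions about the set of admissible control processes:

• For all $t_0\in[0,T]$ and all bounded $\mathcal{F}_{t_0}$-measurable random variables $\alpha(x,\omega)$, the control $\theta(t,x,\omega) := 1_{[t_0,T]}(t)\alpha(x,\omega)$ belongs to $\mathcal{A}$.

• For all $u,\beta_0\in\mathcal{A}$ with $\beta_0(t,x)\le K<\infty$ for all $t,x$ define

$$\delta(t,x) = \frac{1}{2K}\,\mathrm{dist}(u(t,x),\partial V)\wedge 1 > 0 \tag{13.3.1}$$

and put

$$\beta(t,x) = \delta(t,x)\beta_0(t,x). \tag{13.3.2}$$

Then the control $\tilde u(t,x) = u(t,x) + a\beta(t,x)$; $t\in[0,T]$ belongs to $\mathcal{A}$ for all $a\in(-1,1)$.

• For all $\beta$ as in (13.3.2) the derivative process

$$\eta(t,x) := \frac{d}{da}Y^{u+a\beta}(t,x)\Big|_{a=0} \tag{13.3.3}$$

exists, and belongs to $L^2(\lambda\times P)$ and

$$\begin{cases}
d\eta(t,x) = \Big[\dfrac{dL}{du}(Y)(t,x)\beta(t,x) + L_u\eta(t,x) + \dfrac{\partial b}{\partial y}(t,x)\eta(t,x) + \dfrac{\partial b}{\partial u}(t,x)\beta(t,x)\Big]dt\\[1ex]
\qquad\qquad + \Big[\dfrac{\partial\sigma}{\partial y}(t,x)\eta(t,x) + \dfrac{\partial\sigma}{\partial u}(t,x)\beta(t,x)\Big]dB(t)\\[1ex]
\qquad\qquad + \displaystyle\int_{\mathbb{R}}\Big[\dfrac{\partial\gamma}{\partial y}(t,x,\zeta)\eta(t,x) + \dfrac{\partial\gamma}{\partial u}(t,x,\zeta)\beta(t,x)\Big]\tilde N(dt,d\zeta);\quad (t,x)\in[0,T]\times D,\\[1ex]
\eta(0,x) = \dfrac{d}{da}Y^{u+a\beta}(0,x)\Big|_{a=0} = 0,\\[1ex]
\eta(t,x) = 0;\quad (t,x)\in[0,T]\times\partial D.
\end{cases}\tag{13.3.4}$$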


Theorem 13.6 (Necessary maximum principle) [DØ4] Let $\hat u\in\mathcal{A}$. Then the following are equivalent:

1. $\dfrac{d}{da}J(\hat u + a\beta)\Big|_{a=0} = 0$ for all bounded $\beta\in\mathcal{A}$ of the form (13.3.2).

2. $\dfrac{\partial H}{\partial u}(t,x)\Big|_{u=\hat u} = 0$ for all $(t,x)\in[0,T]\times D$.

Proof For simplicity of notation we write $u$ instead of $\hat u$ in the following. By considering an increasing sequence of stopping times $\tau_n$ converging to $T$, we may assume that all local martingales appearing in the computations below are martingales and have expectation 0. See [ØS2]. We omit the details. We can write

$$\frac{d}{da}J(u+a\beta)\Big|_{a=0} = I_1 + I_2,$$

where

$$I_1 = \frac{d}{da}E\Big[\int_0^T\!\!\int_D f\big(t,x,Y^{u+a\beta}(t,x),u(t,x)+a\beta(t,x)\big)\,dx\,dt\Big]\Big|_{a=0}$$

and

$$I_2 = \frac{d}{da}E\Big[\int_D g\big(x,Y^{u+a\beta}(T,x)\big)\,dx\Big]\Big|_{a=0}.$$

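By our assumptions on $f$ and $g$ and by (13.3.3) we have

$$I_1 = E\Big[\int_D\int_0^T\Big\{\frac{\partial f}{\partial y}(t,x)\eta(t,x) + \frac{\partial f}{\partial u}(t,x)\beta(t,x)\Big\}\,dt\,dx\Big], \tag{13.3.5}$$

$$I_2 = E\Big[\int_D\frac{\partial g}{\partial y}(x,Y(T,x))\eta(T,x)\,dx\Big] = E\Big[\int_D p(T,x)\eta(T,x)\,dx\Big]. \tag{13.3.6}$$

By the Itô formula

$$\begin{aligned}
I_2 &= E\Big[\int_D p(T,x)\eta(T,x)\,dx\Big]\\
&= E\Big[\int_D\int_0^T p(t,x)\,d\eta(t,x)\,dx + \int_D\int_0^T\eta(t,x)\,dp(t,x)\,dx + \int_D\int_0^T d[\eta,p](t,x)\,dx\Big]\\
&= E\Big[\int_D\int_0^T p(t,x)\Big\{\frac{dL}{du}(Y)(t,x)\beta(t,x) + L_u\eta(t,x) + \frac{\partial b}{\partial y}(t,x)\eta(t,x) + \frac{\partial b}{\partial u}(t,x)\beta(t,x)\Big\}\,dt\,dx\\
&\qquad + \int_D\int_0^T p(t,x)\Big\{\frac{\partial\sigma}{\partial y}(t,x)\eta(t,x) + \frac{\partial\sigma}{\partial u}(t,x)\beta(t,x)\Big\}\,dB(t)\,dx\\
&\qquad + \int_D\int_0^T\!\!\int_{\mathbb{R}}p(t,x)\Big\{\frac{\partial\gamma}{\partial y}(t,x,\zeta)\eta(t,x) + \frac{\partial\gamma}{\partial u}(t,x,\zeta)\beta(t,x)\Big\}\,\tilde N(dt,d\zeta)\,dx\\
&\qquad - \int_D\int_0^T\eta(t,x)\Big\{L_u^*p(t,x) + \frac{\partial H}{\partial y}(t,x)\Big\}\,dt\,dx\\
&\qquad + \int_D\int_0^T\eta(t,x)q(t,x)\,dB(t)\,dx + \int_D\int_0^T\!\!\int_{\mathbb{R}}\eta(t,x)r(t,x,\zeta)\,\tilde N(dt,d\zeta)\,dx\\
&\qquad + \int_D\int_0^T q(t,x)\Big\{\frac{\partial\sigma}{\partial y}(t,x)\eta(t,x) + \frac{\partial\sigma}{\partial u}(t,x)\beta(t,x)\Big\}\,dt\,dx\\
&\qquad + \int_D\int_0^T\!\!\int_{\mathbb{R}}\Big\{\frac{\partial\gamma}{\partial y}(t,x,\zeta)\eta(t,x) + \frac{\partial\gamma}{\partial u}(t,x,\zeta)\beta(t,x)\Big\}r(t,x,\zeta)\,\nu(d\zeta)\,dt\,dx\Big]\\
&= E\Big[\int_D\Big\{\int_0^T p(t,x)\Big\{\frac{dL}{du}(Y)(t,x)\beta(t,x) + L_u\eta(t,x)\Big\}\,dt\\
&\qquad + \int_0^T\eta(t,x)\Big\{p(t,x)\frac{\partial b}{\partial y}(t,x) + q(t,x)\frac{\partial\sigma}{\partial y}(t,x) - L_u^*p(t,x) - \frac{\partial H}{\partial y}(t,x) + \int_{\mathbb{R}}\frac{\partial\gamma}{\partial y}(t,x,\zeta)r(t,x,\zeta)\,\nu(d\zeta)\Big\}\,dt\\
&\qquad + \int_0^T\beta(t,x)\Big\{p(t,x)\frac{\partial b}{\partial u}(t,x) + q(t,x)\frac{\partial\sigma}{\partial u}(t,x) + \int_{\mathbb{R}}\frac{\partial\gamma}{\partial u}(t,x,\zeta)r(t,x,\zeta)\,\nu(d\zeta)\Big\}\,dt\Big\}\,dx\Big]\\
&= E\Big[\int_D\Big\{\int_0^T -\eta(t,x)\frac{\partial f}{\partial y}(t,x)\,dt + \int_0^T\Big\{\frac{\partial H}{\partial u}(t,x) - \frac{\partial f}{\partial u}(t,x)\Big\}\beta(t,x)\,dt\Big\}\,dx\Big]\\
&= -I_1 + E\Big[\int_D\int_0^T\frac{\partial H}{\partial u}(t,x)\beta(t,x)\,dt\,dx\Big].
\end{aligned}\tag{13.3.7}$$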

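Adding (13.3.5) and (13.3.7) we get

$$\frac{d}{da}J(u+a\beta)\Big|_{a=0} = I_1 + I_2 = E\Big[\int_D\int_0^T\frac{\partial H}{\partial u}(t,x)\beta(t,x)\,dt\,dx\Big].$$

We conclude that

$$\frac{d}{da}J(u+a\beta)\Big|_{a=0} = 0$$

if and only if

$$E\Big[\int_D\int_0^T\frac{\partial H}{\partial u}(t,x)\beta(t,x)\,dt\,dx\Big] = 0, \tag{13.3.8}$$

for all bounded $\beta\in\mathcal{A}$ of the form (13.3.2).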


In particular, applying this to $\beta(t,x) = \theta(t,x)$, with $\theta$ as in the first assumption above, we get that this is again equivalent to

$$\frac{\partial H}{\partial u}(t,x) = 0\quad\text{for all }(t,x)\in[0,T]\times D. \tag{13.3.9}$$

□

13.4 Controls Which do not Depend on x

In some applications (see e.g. Sect. 13.5) it is important to consider only controls $u(t,x) = u(t)$ which do not depend on the space variable $x$. Thus we let the set $\mathcal{A}_0$ of admissible controls be defined by

$$\mathcal{A}_0 = \{u\in\mathcal{A};\ u(t,x) = u(t)\text{ does not depend on }x\}. \tag{13.4.1}$$

With the performance functional $J(u)$ as in Problem 13.1, the problem is now the following:

Problem 13.2 Find $u_0^*\in\mathcal{A}_0$ such that

$$\sup_{u\in\mathcal{A}_0}J(u) = J(u_0^*). \tag{13.4.2}$$

It turns out that one can formulate an analog of Theorem 13.2 for this case. The main difference between this case and Theorem 13.2 is the integration with respect to $dx$ in the maximum condition 3 below.

Theorem 13.7 (Sufficient maximum principle for controls which do not depend on x) [DØ4] Suppose $\hat u\in\mathcal{A}_0$ with corresponding solutions $\hat Y(t,x)$ of (13.2.1) and $\hat p(t,x)$, $\hat q(t,x)$, $\hat r(t,x,\zeta)$ of (13.2.12) respectively. Assume that the following hold:

1. $y\mapsto g(x,y)$ is concave for all $x$.

2. $(\varphi,u)\mapsto H(t,x,\varphi(x),\varphi,u,\hat p(t,x),\hat q(t,x),\hat r(t,x,\cdot))$ is concave for all $t,x$.

3. $\displaystyle\sup_{w\in U}\int_D H\big(t,x,\hat Y(t,x),\hat Y(t,\cdot),w,\hat p(t,x),\hat q(t,x),\hat r(t,x,\cdot)\big)\,dx = \int_D H\big(t,x,\hat Y(t,x),\hat Y(t,\cdot),\hat u(t),\hat p(t,x),\hat q(t,x),\hat r(t,x,\cdot)\big)\,dx$ for all $t$.

Then $\hat u(t)$ is an optimal control for Problem 13.2.

Proof We proceed as in the proof of Theorem 13.2. Let $u\in\mathcal{A}_0$ with corresponding solution $Y(t,x)$ of (13.2.1). With $\hat u\in\mathcal{A}_0$, consider

$$J(u) - J(\hat u) = E\Big[\int_0^T\!\!\int_D\{f-\hat f\}\,dx\,dt + \int_D\{g-\hat g\}\,dx\Big], \tag{13.4.3}$$


where $\hat f = f(t,x,\hat Y(t,x),\hat u(t))$, $f = f(t,x,Y(t,x),u(t))$, $\hat g = g(x,\hat Y(T,x))$ and $g = g(x,Y(T,x))$. Using a similar shorthand notation for $\hat b$, $b$, $\hat\sigma$, $\sigma$ and $\hat\gamma$, $\gamma$, and setting

$$\hat H = H(t,x,\hat Y(t,x),\hat u(t),\hat p(t,x),\hat q(t,x),\hat r(t,x,\cdot)) \tag{13.4.4}$$

and

$$H = H(t,x,Y(t,x),u(t),\hat p(t,x),\hat q(t,x),\hat r(t,x,\cdot)), \tag{13.4.5}$$

we see that (13.4.3) can be written

$$J(u) - J(\hat u) = I_1 + I_2, \tag{13.4.6}$$

where

$$I_1 = E\Big[\int_0^T\Big(\int_D\{f(t,x)-\hat f(t,x)\}\,dx\Big)dt\Big],\qquad I_2 = E\Big[\int_D\{g(x)-\hat g(x)\}\,dx\Big]. \tag{13.4.7}$$

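By the definition of $H$ we have

$$\begin{aligned}
I_1 = E\Big[\int_0^T\!\!\int_D\Big\{&H(t,x) - \hat H(t,x) - \hat p(t,x)\big(L_uY(t,x) - L_{\hat u}\hat Y(t,x) + \tilde b(t,x)\big)\\
&- \hat q(t,x)\tilde\sigma(t,x) - \int_{\mathbb{R}}\hat r(t,x,\zeta)\tilde\gamma(t,x,\zeta)\,\nu(d\zeta)\Big\}\,dx\,dt\Big].
\end{aligned}\tag{13.4.8}$$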

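Since $g$ is concave with respect to $y$ we have

$$g(x,Y(T,x)) - g(x,\hat Y(T,x)) \le \frac{\partial g}{\partial y}(x,\hat Y(T,x))\big(Y(T,x) - \hat Y(T,x)\big). \tag{13.4.9}$$

Therefore, as in the proof of Theorem 13.2,

$$\begin{aligned}
I_2 \le E\Big[\int_D\int_0^T\Big\{&\hat p(t,x)\big[L_uY(t,x) - L_{\hat u}\hat Y(t,x) + \tilde b(t,x)\big] - \tilde Y(t,x)\Big[L_{\hat u}^*\hat p(t,x) + \frac{\partial\hat H}{\partial y}(t,x)\Big]\\
&+ \tilde\sigma(t,x)\hat q(t,x) + \int_{\mathbb{R}}\tilde\gamma(t,x,\zeta)\hat r(t,x,\zeta)\,\nu(d\zeta)\Big\}\,dt\,dx\Big],
\end{aligned}\tag{13.4.10}$$

where, as before,

$$\frac{\partial\hat H}{\partial y}(t,x) = \frac{\partial H}{\partial y}\big(t,x,y,\hat Y(t,\cdot)(x),\hat u(t),\hat p(t,x),\hat q(t,x),\hat r(t,x,\cdot)\big)\Big|_{y=\hat Y(t,x)}. \tag{13.4.11}$$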


Adding (13.4.8)–(13.4.10) we get, as in Eq. (13.2.29),

$$J(u) - J(\hat u) \le E\Big[\int_0^T\Big(\int_D\{H(t,x) - \hat H(t,x) - \nabla_{\hat Y}\hat H(\tilde Y)(t,x)\}\,dx\Big)dt\Big]. \tag{13.4.12}$$

By the concavity assumption of $H$ in $(\varphi,u)$ we have

$$H(t,x) - \hat H(t,x) \le \nabla_{\hat Y}\hat H(Y-\hat Y)(t,x) + \frac{\partial\hat H}{\partial u}(t,x)\big(u(t)-\hat u(t)\big), \tag{13.4.13}$$

and the maximum condition implies that

$$\int_D\frac{\partial\hat H}{\partial u}(t,x)\big(u(t)-\hat u(t)\big)\,dx \le 0. \tag{13.4.14}$$

Hence

$$\int_D\{H(t,x) - \hat H(t,x) - \nabla_{\hat Y}\hat H(Y-\hat Y)(t,x)\}\,dx \le 0, \tag{13.4.15}$$

and therefore we conclude by (13.4.12) that $J(u)\le J(\hat u)$. Since $u\in\mathcal{A}_0$ was arbitrary, this shows that $\hat u$ is optimal. □

We proceed as in Theorem 13.6 to establish a corresponding necessary maximum principle for controls which do not depend on $x$. As in Sect. 13.3 we assume the following:

• For all $t_0\in[0,T]$ and all bounded $\mathcal{H}_{t_0}$-measurable random variables $\alpha(\omega)$, the control $\theta(t,\omega) := 1_{[t_0,T]}(t)\alpha(\omega)$ belongs to $\mathcal{A}_0$.

• For all $u,\beta_0\in\mathcal{A}_0$ with $\beta_0(t)\le K<\infty$ for all $t$ define

$$\delta(t) = \frac{1}{2K}\,\mathrm{dist}(u(t),\partial V)\wedge 1 > 0 \tag{13.4.16}$$

and put

$$\beta(t) = \delta(t)\beta_0(t). \tag{13.4.17}$$

Then the control $\tilde u(t) = u(t) + a\beta(t)$; $t\in[0,T]$ belongs to $\mathcal{A}_0$ for all $a\in(-1,1)$.

• For all $\beta$ as in (13.4.17) the derivative process

$$\eta(t,x) := \frac{d}{da}Y^{u+a\beta}(t,x)\Big|_{a=0}$$

exists, and belongs to $L^2(\lambda\times P)$ and

$$\begin{cases}
d\eta(t,x) = \Big[\dfrac{dL}{du}(Y)(t,x)\beta(t) + L_u\eta(t,x) + \dfrac{\partial b}{\partial y}(t,x)\eta(t,x) + \dfrac{\partial b}{\partial u}(t,x)\beta(t)\Big]dt\\[1ex]
\qquad\qquad + \Big[\dfrac{\partial\sigma}{\partial y}(t,x)\eta(t,x) + \dfrac{\partial\sigma}{\partial u}(t,x)\beta(t)\Big]dB(t)\\[1ex]
\qquad\qquad + \displaystyle\int_{\mathbb{R}}\Big[\dfrac{\partial\gamma}{\partial y}(t,x,\zeta)\eta(t,x) + \dfrac{\partial\gamma}{\partial u}(t,x,\zeta)\beta(t)\Big]\tilde N(dt,d\zeta);\quad (t,x)\in[0,T]\times D,\\[1ex]
\eta(0,x) = \dfrac{d}{da}Y^{u+a\beta}(0,x)\Big|_{a=0} = 0;\quad x\in D,\\[1ex]
\eta(t,x) = 0;\quad (t,x)\in[0,T]\times\partial D.
\end{cases}\tag{13.4.18}$$

Then we have the following result:

Theorem 13.8 ([DØ4]) (Necessary maximum principle for controls not depending on x) Let $\hat u\in\mathcal{A}_0$. Then the following are equivalent:

1. $\dfrac{d}{da}J(\hat u + a\beta)\Big|_{a=0} = 0$ for all bounded $\beta\in\mathcal{A}_0$ of the form (13.4.17).

2. $\displaystyle\int_D\frac{\partial H}{\partial u}(t,x)\,dx\,\Big|_{u=\hat u(t)} = 0$ for all $t\in[0,T]$.

Proof The proof is analogous to the proof of Theorem 13.6 and is omitted. □



13.5 Application to Partial (Noisy) Observation Optimal Control

For simplicity we consider only the one-dimensional case in the following. Suppose the signal process $X(t) = X^{(u)}(t)$ and its corresponding observation process $G(t)$ are given respectively by the following system of stochastic differential equations:

• (Signal process)

$$\begin{aligned}
dX(t) &= \alpha(X(t),G(t),u(t))\,dt + \beta(X(t),G(t),u(t))\,dv(t)\\
&\quad + \int_{\mathbb{R}}\gamma(X(t),G(t),u(t),\zeta)\,\tilde N(dt,d\zeta);\quad t\in[0,T],
\end{aligned}\tag{13.5.1}$$

where $X(0)$ has density $F(\cdot)$, i.e. $E[\phi(X(0))] = \int_{\mathbb{R}}\phi(x)F(x)\,dx$; $\phi\in C_0(\mathbb{R})$. As before, $T>0$ is a fixed constant.

• (Observation process)

$$dG(t) = h(X(t))\,dt + dw(t);\quad t\in[0,T],\qquad G(0) = 0. \tag{13.5.2}$$

Here $\alpha:\mathbb{R}\times\mathbb{R}\times U\to\mathbb{R}$, $\beta:\mathbb{R}\times\mathbb{R}\times U\to\mathbb{R}$, $\gamma:\mathbb{R}\times\mathbb{R}\times U\times\mathbb{R}\to\mathbb{R}$ and $h:\mathbb{R}\to\mathbb{R}$ are given deterministic functions, with $h$ bounded. The processes $v(t) = v(t,\omega)$ and $w(t) = w(t,\omega)$ are independent Brownian motions, and $\tilde N(dt,d\zeta)$ is a compensated Poisson random measure, independent of both $v$ and $w$. We let $\mathbb{F}^v := \{\mathcal{F}_t^v\}_{0\le t\le T}$ and $\mathbb{F}^w := \{\mathcal{F}_t^w\}_{0\le t\le T}$ denote the filtrations generated by $(v,\tilde N)$ and $w$, respectively. The process $u(t) = u(t,\omega)$ is our control process, assumed to have values in a given closed set $U\subset\mathbb{R}$. We require that $u(t)$ be adapted to the filtration

$$\mathbb{G} := \{\mathcal{G}_t\}_{0\le t\le T}, \tag{13.5.3}$$

where $\mathcal{G}_t$ is the sigma-algebra generated by the observations $G(s)$, $s\le t$. We call $u(t)$ admissible if, in addition, (13.5.1) and (13.5.2) have a unique strong solution $(X(t),G(t))$ such that

$$E\Big[\int_0^T|\ell(X(t),u(t))|\,dt + |k(X(T))|\Big] < \infty, \tag{13.5.4}$$

where $\ell:\mathbb{R}\times U\to\mathbb{R}$ and $k:\mathbb{R}\to\mathbb{R}$ are given functions, called the profit rate and the bequest function, respectively. The set of all admissible controls is denoted by $\mathcal{A}_{\mathbb{G}}$. For $u\in\mathcal{A}_{\mathbb{G}}$ we define the performance functional

$$J(u) = E\Big[\int_0^T\ell(X(t),u(t))\,dt + k(X(T))\Big]. \tag{13.5.5}$$

We consider the following problem:

Problem 13.3 (The noisy observation stochastic control problem) Find $u^*\in\mathcal{A}_{\mathbb{G}}$ such that

$$\sup_{u\in\mathcal{A}_{\mathbb{G}}}J(u) = J(u^*). \tag{13.5.6}$$

We now proceed to show that this partial observation SDE insider control problem can be transformed into a full observation SPDE control problem of the type discussed in the previous sections. To this end, define the probability measure $Q$ by

$$dQ(\omega) = M_t(\omega)\,dP(\omega)\quad\text{on }\mathcal{F}_t^v\vee\mathcal{F}_t^w, \tag{13.5.7}$$

where

$$M_t(\omega) = \exp\Big(-\int_0^t h(X(s))\,dw(s) - \frac12\int_0^t h^2(X(s))\,ds\Big). \tag{13.5.8}$$


It follows by the Girsanov theorem that the observation process $G(t)$ defined by (13.5.2) is a Brownian motion with respect to $Q$. Moreover, we have

$$dP(\omega) = K_t(\omega)\,dQ(\omega), \tag{13.5.9}$$

where

$$\begin{aligned}
K_t = M_t^{-1} &= \exp\Big(\int_0^t h(X(s))\,dw(s) + \frac12\int_0^t h^2(X(s))\,ds\Big)\\
&= \exp\Big(\int_0^t h(X(s))\,dG(s) - \frac12\int_0^t h^2(X(s))\,ds\Big).
\end{aligned}\tag{13.5.10}$$
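The two expressions for $K_t$ in (13.5.10) differ only by the substitution $dG = h(X)\,dt + dw$ from (13.5.2). A minimal Python sketch, assuming an Euler discretisation with step `dt`, the bounded observation function `h = tanh` and an arbitrary simulated signal path (all of these are hypothetical choices made for illustration), shows that the discrete sums agree term by term:

```python
import math
import random

def likelihood_ratios(h, x_path, dw, dt):
    """Return K_t computed from the dw-form and from the dG-form of (13.5.10),
    using left-point (Euler) sums in place of the stochastic integrals."""
    log_k_w = 0.0
    log_k_g = 0.0
    for x, dw_i in zip(x_path, dw):
        hx = h(x)
        dg_i = hx * dt + dw_i                      # observation increment, cf. (13.5.2)
        log_k_w += hx * dw_i + 0.5 * hx * hx * dt  # first line of (13.5.10)
        log_k_g += hx * dg_i - 0.5 * hx * hx * dt  # second line of (13.5.10)
    return math.exp(log_k_w), math.exp(log_k_g)

rng = random.Random(0)
n_steps, dt = 1000, 1e-3
dw = [rng.gauss(0.0, math.sqrt(dt)) for _ in range(n_steps)]

# Hypothetical signal path: a driftless random walk started at 0.5.
x_path = [0.5]
for _ in range(n_steps - 1):
    x_path.append(x_path[-1] + rng.gauss(0.0, math.sqrt(dt)))

k_from_w, k_from_g = likelihood_ratios(math.tanh, x_path, dw, dt)
m_t = 1.0 / k_from_w   # M_t = K_t^{-1}, cf. (13.5.8)
print(k_from_w, k_from_g, m_t)
```

The agreement is exact up to rounding because the identity is purely algebraic; no property of the signal dynamics is used.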

For $\varphi\in C_0^2(\mathbb{R})$ and fixed $g\in\mathbb{R}$, $c\in U$ define the integro-differential operator $A = A_{g,c}$ by

$$\begin{aligned}
A_{g,c}\varphi(x) &= \alpha(x,g,c)\frac{\partial\varphi}{\partial x}(x) + \frac12\beta^2(x,g,c)\frac{\partial^2\varphi}{\partial x^2}(x)\\
&\quad + \int_{\mathbb{R}}\{\varphi(x+\gamma(x,g,c,\zeta)) - \varphi(x) - \nabla\varphi(x)\gamma(x,g,c,\zeta)\}\,\nu(d\zeta),
\end{aligned}\tag{13.5.11}$$

and let $A^*$ be the adjoint of $A$, in the sense that

$$(A\varphi,\psi)_{L^2(\mathbb{R})} = (\varphi,A^*\psi)_{L^2(\mathbb{R})} \tag{13.5.12}$$

for all $\varphi,\psi\in C_0^2(\mathbb{R})$. Suppose that there exists a stochastic process $y(t,x)$ such that

$$E_Q[\varphi(X(t))K_t\mid\mathcal{G}_t] = \int_{\mathbb{R}}\varphi(x)y(t,x)\,dx \tag{13.5.13}$$

for all bounded measurable functions $\varphi$. Then $y(t,x)$ is called the unnormalized conditional density of $X(t)$ given the observation filtration $\mathcal{G}_t$. Note that by the Bayes rule we have

$$E[\varphi(X(t))\mid\mathcal{G}_t] = \frac{E_Q[\varphi(X(t))K_t\mid\mathcal{G}_t]}{E_Q[K_t\mid\mathcal{G}_t]}. \tag{13.5.14}$$

It is known that under certain conditions the process $y(t,x)$ exists and satisfies the following integro-SPDE, called the Duncan–Mortensen–Zakai equation:

$$\begin{cases}
dy(t,x) = A^*_{G(t),u(t)}y(t,x)\,dt + h(x)y(t,x)\,dG(t);\quad t\ge 0,\\
y(0,x) = F(x).
\end{cases}\tag{13.5.15}$$


See for example Theorem 7.17 in [BC]. If (13.5.13) holds, we get

$$\begin{aligned}
J(u) &= E\Big[\int_0^T\ell(X(t),u(t))\,dt + k(X(T))\Big]\\
&= E_Q\Big[\int_0^T\ell(X(t),u(t))K_t\,dt + k(X(T))K_T\Big]\\
&= E_Q\Big[\int_0^T E_Q[\ell(X(t),u(t))K_t\mid\mathcal{G}_t]\,dt + E_Q[k(X(T))K_T\mid\mathcal{G}_T]\Big]\\
&= E_Q\Big[\int_0^T E_Q[\ell(X(t),v)K_t\mid\mathcal{G}_t]_{v=u(t)}\,dt + E_Q[k(X(T))K_T\mid\mathcal{G}_T]\Big]\\
&= E_Q\Big[\int_0^T\!\!\int_{\mathbb{R}}\ell(x,u(t))y(t,x)\,dx\,dt + \int_{\mathbb{R}}k(x)y(T,x)\,dx\Big] =: J_Q(u).
\end{aligned}\tag{13.5.16}$$

This transforms the insider partial observation SDE control problem (13.5.6) into a full observation SPDE control problem of the type we have discussed in the previous sections. We summarize what we have proved as follows:

Theorem 13.9 (From partial observation SDE control to full observation SPDE control) Assume that (13.5.13) and (13.5.15) hold. Then the solution $u^*(t)$ of the partial observation SDE control problem (13.5.6) coincides with the solution $u^*$ of the following (full observation) SPDE control problem:

Problem 13.4 Find $u^*\in\mathcal{A}_{\mathbb{G}}$ such that

$$\sup_{u\in\mathcal{A}_{\mathbb{G}}}J_Q(u) = J_Q(u^*), \tag{13.5.17}$$

where

$$J_Q(u) = E_Q\Big[\int_0^T\!\!\int_{\mathbb{R}}\ell(x,u(t))y(t,x)\,dx\,dt + \int_{\mathbb{R}}k(x)y(T,x)\,dx\Big], \tag{13.5.18}$$

and $y(t,x)$ solves the SPDE

$$\begin{cases}
dy(t,x) = A^*_{G(t),u(t)}y(t,x)\,dt + h(x)y(t,x)\,dG(t);\quad t\ge 0,\\
y(0,x) = F(x).
\end{cases}\tag{13.5.19}$$


13.5.1 Optimal Portfolio with Noisy Observations

We now study an example illustrating an application of Theorem 13.9. Suppose the signal process $X(t) = X^{\pi}(t)$ is given by

$$\begin{cases}
dX(t) = \pi(t)[\alpha_0(t)\,dt + \beta_0(t)\,dv(t)];\quad 0\le t\le T,\\
X(0)\text{ is a random variable independent of }\{v(t)\}_{t\ge 0}\text{ and with density }F.
\end{cases}\tag{13.5.20}$$

Here $\pi(t)$ is the control, representing the portfolio in terms of the amount invested in the risky asset at time $t$, when the risky asset unit price $S(t)$ is given by

$$\begin{cases}
dS(t) = S(t)[\alpha_0(t)\,dt + \beta_0(t)\,dv(t)];\quad 0\le t\le T,\\
S(0) > 0,
\end{cases}\tag{13.5.21}$$

and the safe investment unit price is $S_0(t) = 1$ for all $t$. The process $X(t)$ then represents the corresponding value of the investment at time $t$. For $\pi$ to be admissible we require that $X(t) > 0$ for all $t$. Suppose the observations $G(t)$ of $X(t)$ at time $t$ are not exact, but subject to uncertainty or noise, so that the observation process is given by

$$\begin{cases}
dG(t) = X(t)\,dt + dw(t);\quad 0\le t\le T,\\
G(0) = 0.
\end{cases}\tag{13.5.22}$$

Here, as above, the processes $v$ and $w$ are independent Brownian motions. Let $U:[0,\infty)\to[-\infty,\infty)$ be a given $C^1$ (concave) utility function. The performance functional is assumed to be

$$J(\pi) = E[U(X^{\pi}(T))]. \tag{13.5.23}$$

By Theorem 13.9, the problem to maximize $J(\pi)$ over all $\pi\in\mathcal{A}_{\mathbb{G}}$ is equivalent to the following problem:

Problem 13.5 Find $\hat\pi\in\mathcal{A}_{\mathbb{G}}$ such that

$$\sup_{\pi\in\mathcal{A}_{\mathbb{G}}}J_Q(\pi) = J_Q(\hat\pi), \tag{13.5.24}$$

where

$$J_Q(\pi) = E_Q\Big[\int_{\mathbb{R}_+}U(x)y(T,x)\,dx\Big], \tag{13.5.25}$$

and $y(t,x) = y^{\pi}(t,x)$ is the solution of the SPDE

$$\begin{cases}
dy(t,x) = (A^*_{\pi(t)}y)(t,x)\,dt + x\,y(t,x)\,dG(t);\quad 0\le t\le T,\\
y(0,x) = F(x),
\end{cases}\tag{13.5.26}$$

where

$$(A^*_{\pi(t)}y)(t,x) = -\pi(t)\alpha_0(t)y'(t,x) + \frac12\pi^2(t)\beta_0^2(t)y''(t,x),$$

with $y'(t,x) = \dfrac{\partial y(t,x)}{\partial x}$, $y''(t,x) = \dfrac{\partial^2y(t,x)}{\partial x^2}$.

Define the space

$$H^1(\mathbb{R}_+) = \Big\{y\in L^2(\mathbb{R}_+),\ \frac{\partial y}{\partial x}\in L^2(\mathbb{R}_+)\Big\}. \tag{13.5.27}$$

The $H^1$-norm is given by

$$\|y(t)\|^2_{H^1(\mathbb{R}_+)} = \|y(t)\|^2_{L^2(\mathbb{R}_+)} + \|y'(t)\|^2_{L^2(\mathbb{R}_+)}. \tag{13.5.28}$$

We have

$$H^1(\mathbb{R}_+)\subset L^2(\mathbb{R}_+)\subset H^{-1}(\mathbb{R}_+). \tag{13.5.29}$$

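We verify the coercivity condition of the operator $-A^*_{\pi(t)}$:

$$\begin{aligned}
2\langle -A^*_{\pi(t)}y,y\rangle &= 2\pi(t)\alpha_0(t)\langle y'(t,x),y(t,x)\rangle - \pi^2(t)\beta_0^2(t)\langle y''(t,x),y(t,x)\rangle\\
&= 2\pi(t)\alpha_0(t)\int_{\mathbb{R}_+}y'(t,x)y(t,x)\,dx - \pi^2(t)\beta_0^2(t)\int_{\mathbb{R}_+}y''(t,x)y(t,x)\,dx\\
&= \pi(t)\alpha_0(t)\big[y^2(t,x)\big]_{\partial\mathbb{R}_+} - \pi^2(t)\beta_0^2(t)\big[y(t,x)y'(t,x)\big]_{\partial\mathbb{R}_+}\\
&\quad + \pi^2(t)\beta_0^2(t)\int_{\mathbb{R}_+}(y'(t,x))^2\,dx.
\end{aligned}\tag{13.5.30}$$

Suppose that $y(t,x) = 0$ for $x = 0$. Then we get

$$2\langle -A^*_{\pi(t)}y,y\rangle = \pi^2(t)\beta_0^2(t)\|y'(t)\|^2_{L^2(\mathbb{R}_+)}. \tag{13.5.31}$$

Let

$$H_0^1(\mathbb{R}_+) = \{y\in H^1,\ y = 0\text{ on }\partial\mathbb{R}_+\}. \tag{13.5.32}$$

We have that $|y(t)|_{1,\mathbb{R}_+} = \|y'(t)\|_{L^2(\mathbb{R}_+)}$ is a norm in $H_0^1(\mathbb{R}_+)$, which is equivalent to the $H^1(\mathbb{R}_+)$ norm; i.e. there exist $a,b>0$ such that

$$a\|y(t)\|_{1,\mathbb{R}_+}\le|y(t)|_{1,\mathbb{R}_+} = \|y'(t)\|_{L^2(\mathbb{R}_+)}\le b\|y(t)\|_{1,\mathbb{R}_+}. \tag{13.5.33}$$

We conclude that the following coercivity condition is satisfied:

$$2\langle -A^*_{\pi(t)}y,y\rangle\ge a^2\pi^2(t)\beta_0^2(t)\|y(t)\|^2_{1,\mathbb{R}_+}. \tag{13.5.34}$$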


Using Theorems 1.1 and 2.1 in Pardoux [Pa1], we obtain that (13.5.26) has a unique solution $y(t,x)\in L^2(\Omega,C(0,T,L^2(\mathbb{R}_+)))$. Moreover, the first and second partial derivatives with respect to $x$, denoted by $y'(t,x)$ and $y''(t,x)$ respectively, exist and belong to $L^2(\mathbb{R})$. The problem (13.5.24) is of the type discussed in Sect. 13.2 and we now apply the methods developed there to study it. The Hamiltonian given in (13.2.11) now takes the form

$$H(t,x,y,\varphi,\pi,p,q) = (A^*_{\pi}\varphi)p + xyq, \tag{13.5.35}$$

and the adjoint BSDE (13.2.12) becomes

$$\begin{cases}
dp(t,x) = -[A_{\pi(t)}p(t,x) + xq(t,x)]\,dt + q(t,x)\,dG(t);\quad 0\le t\le T,\\
p(T,x) = U(x),
\end{cases}\tag{13.5.36}$$

where $\mathcal{G}_t$ is the sigma-algebra generated by $\{G(s)\}_{s\le t}$, for $0\le t\le T$, and

$$A_{\pi(t)}p(t,x) = \pi(t)\alpha_0(t)p'(t,x) + \frac12\pi^2(t)\beta_0^2(t)p''(t,x). \tag{13.5.37}$$

By [ØPZ, ZRW], this backward SPDE (BSPDE for short) admits a unique solution which belongs to $L^2(\mathbb{R}_+)$. The map

$$\pi\mapsto\int_{\mathbb{R}_+}H(t,x,y(t,x),y(t,\cdot),\pi,p(t,x),q(t,x))\,dx$$

is maximal when

$$\int_{\mathbb{R}_+}\{-\alpha_0(t)y'(t,x) + \pi\beta_0^2(t)y''(t,x)\}p(t,x)\,dx = 0, \tag{13.5.38}$$

i.e. when $\pi = \hat\pi(t)$, given by

$$\hat\pi(t) = \frac{\alpha_0(t)\int_{\mathbb{R}_+}y'(t,x)p(t,x)\,dx}{\beta_0^2(t)\int_{\mathbb{R}_+}y''(t,x)p(t,x)\,dx}. \tag{13.5.39}$$

Using integration by parts and (13.5.13)–(13.5.14) we can rewrite this as follows:

$$\hat\pi(t) = -\frac{\alpha_0(t)\int_{\mathbb{R}_+}y(t,x)p'(t,x)\,dx}{\beta_0^2(t)\int_{\mathbb{R}_+}y(t,x)p''(t,x)\,dx} = -\frac{\alpha_0(t)E[p'(t,X(t))\mid\mathcal{G}_t]}{\beta_0^2(t)E[p''(t,X(t))\mid\mathcal{G}_t]}. \tag{13.5.40}$$
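The passage from (13.5.39) to (13.5.40) uses two integrations by parts, $\int y'p\,dx = -\int yp'\,dx$ and $\int y''p\,dx = \int yp''\,dx$, which require the boundary terms to vanish. The Python sketch below compares both forms of $\hat\pi$ on a toy example. The density $y(x) = xe^{-x}$, the function $p(x) = \ln(1+x)$ (chosen so that $y(0) = p(0) = 0$), the constants `alpha0` and `beta0`, and the truncation of $\mathbb{R}_+$ to $[0,60]$ are all assumptions made for illustration; they are not solutions of (13.5.26) or (13.5.36).

```python
import math

def trapezoid(fn, a, b, n):
    """Composite trapezoidal rule for fn on [a, b] with n subintervals."""
    h = (b - a) / n
    total = 0.5 * (fn(a) + fn(b))
    for i in range(1, n):
        total += fn(a + i * h)
    return total * h

# Assumed unnormalised density y and its first two derivatives.
def y(x):   return x * math.exp(-x)
def dy(x):  return (1.0 - x) * math.exp(-x)
def d2y(x): return (x - 2.0) * math.exp(-x)

# Assumed concave function p with p(0) = 0, and its derivatives.
def p(x):   return math.log1p(x)
def dp(x):  return 1.0 / (1.0 + x)
def d2p(x): return -1.0 / (1.0 + x) ** 2

alpha0, beta0 = 0.05, 0.2   # hypothetical constant coefficients
upper, n = 60.0, 60000      # truncated integration range and grid size

def integral(fn):
    return trapezoid(fn, 0.0, upper, n)

# Form (13.5.39): derivatives fall on the density y.
pi_density = (alpha0 / beta0 ** 2) * (
    integral(lambda x: dy(x) * p(x)) / integral(lambda x: d2y(x) * p(x))
)
# Form (13.5.40): derivatives fall on p; note the minus sign.
pi_filter = -(alpha0 / beta0 ** 2) * (
    integral(lambda x: y(x) * dp(x)) / integral(lambda x: y(x) * d2p(x))
)
print(pi_density, pi_filter)
```

Since the normalising constant $\int y\,dx$ cancels in the ratio, the second form can be read directly as a ratio of conditional expectations, as in (13.5.40).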


We summarise what we have proved as follows:

Theorem 13.10 ([DØ4]) Assume that the conditions of Theorem 13.9 hold. The optimal portfolio $\hat\pi(t)\in\mathcal{A}$ for the noisy observation portfolio problem (13.5.24) is given in feedback form by

$$\hat\pi(t) = -\frac{\alpha_0(t)E[p'(t,X(t))\mid\mathcal{G}_t]}{\beta_0^2(t)E[p''(t,X(t))\mid\mathcal{G}_t]}, \tag{13.5.41}$$

in the sense that it is expressed in terms of the corresponding solution $p(t,x) = p^{\pi}(t,x)$ of the BSPDE

$$\begin{cases}
dp(t,x) = -[A_{\hat\pi(t)}p(t,x) + xq(t,x)]\,dt + q(t,x)\,dG(t);\quad 0\le t\le T,\\
p(T,x) = U(x),
\end{cases}\tag{13.5.42}$$

with $E[p''(t,X(t))\mid\mathcal{G}_t]\ne 0$ for all $t$.

Remark 13.11 It is interesting to compare the above result with the well-known case when there is no noise in the observations. Then the optimal portfolio $\pi^*$ is given in feedback form by

$$\pi^*(t) = \frac{\alpha_0(t)}{\beta_0^2(t)}X(t).$$

Thus, comparing with (13.5.41), we see that the expression

$$-\frac{E[p'(t,X(t))\mid\mathcal{G}_t]}{E[p''(t,X(t))\mid\mathcal{G}_t]}$$

reduces to $X(t)$ in this case.

13.6 Exercises

Exercise 13.1 Transform the following partial observation SDE control problems into complete observation SPDE control problems:

(a) (The partially observed linear-quadratic control problem)

(signal): $dX(t) = (\alpha X(t) + u(t))\,dt + \sigma\,dv(t)$; $t>0$, $X(0)$ has density $F(x)$,

(observations): $dG(t) = h(X(t))\,dt + dw(t)$; $t>0$, $G(0) = 0$,

(performance): $J(u) = E\Big[\displaystyle\int_0^T\{X^2(t) + \theta u^2(t)\}\,dt\Big]$, $\quad J^* = \inf_{u\in\mathcal{A}_{\mathbb{G}}}J(u)$.

Here $\alpha$, $\sigma\ne 0$, $\theta>0$ are constants and $F$ and $h$ are given functions.

(b) (The partially observed optimal portfolio problem)

(signal): $dX(t) = X(t)[\alpha u(t)\,dt + \beta u(t)\,dv(t)]$; $t>0$, $X(0)$ has density $F(x)$,

(observations): $dG(t) = X(t)\,dt + dw(t)$; $t>0$, $G(0) = 0$,

(performance): $J(u) = E[X^{\gamma}(T)]$, $\quad J^* = \sup_{u\in\mathcal{A}_{\mathbb{G}}}J(u)$.

Here $\alpha>0$, $\beta>0$ and $\gamma\in(0,1)$ are constants. We may interpret $u(t)$ as the portfolio representing the fraction of the total wealth $X(t)$ invested in the risky asset with price dynamics

$$dS_1(t) = S_1(t)[\alpha\,dt + \beta\,dv(t)];\quad t>0,\qquad S_1(0)>0.$$

The remaining fraction $1-u(t)$ is then invested in the other investment alternative, being a risk free asset with price $S_0(t) = 1$ for all $t$.

Exercise 13.2 (Terminal Conditions) Let $Y(t,x) = Y^{(u)}(t,x)$ be as in (13.1.1)–(13.1.3) and define

$$J(u) = E\Big[\int_0^T\Big(\int_D\ln u(t,x)\,dx\Big)dt\Big];\quad u\in\tilde{\mathcal{A}}, \tag{13.6.1}$$

where $\tilde{\mathcal{A}}$ is the set of controls in $\mathcal{A}$ (see (13.1.3)) such that the terminal constraint

$$E\Big[\int_D Y^{(u)}(T,x)\,dx\Big]\ge 0 \tag{13.6.2}$$

holds. We consider the constrained stochastic control problem to find $\tilde J\in\mathbb{R}$ and $\tilde u\in\tilde{\mathcal{A}}$ such that

$$\tilde J = \sup_{u\in\tilde{\mathcal{A}}}J(u) = J(\tilde u). \tag{13.6.3}$$

To solve this problem we use the Lagrange multiplier method as follows:

(a) Fix $\lambda>0$ and solve the unconstrained control problem to find $J_\lambda^*\in\mathbb{R}$ and $u_\lambda^*\in\mathcal{A}$ such that

$$J_\lambda^* = \sup_{u\in\mathcal{A}}J_\lambda(u) = J_\lambda(u_\lambda^*), \tag{13.6.4}$$

where

$$J_\lambda(u) = E\Big[\int_0^T\Big(\int_D\ln u(t,x)\,dx\Big)dt + \int_D\lambda Y(T,x)\,dx\Big]. \tag{13.6.5}$$

[Hint: Use the method in Example 13.1.]

(b) Suppose there exists $\hat\lambda>0$ such that

$$E\Big[\int_D Y^{(u^*_{\hat\lambda})}(T,x)\,dx\Big] = 0, \tag{13.6.6}$$

where $u^*_{\hat\lambda}$ is the corresponding solution of the unconstrained problem (13.6.4)–(13.6.5) with $\lambda = \hat\lambda$. Show that then in fact $\tilde u := u^*_{\hat\lambda}\in\tilde{\mathcal{A}}$ solves the constrained problem (13.6.1)–(13.6.3) and hence $\tilde J = J^*_{\hat\lambda}$.

(c) Use the above to solve the constrained stochastic control problem (13.6.3).

Exercise 13.3 (Controls Which do not Depend on x) Consider Example 13.1 again, but this time we only allow controls $u(t,x) = u(t)$ which do not depend on $x$. Use Theorem 13.7 to find the optimal control $u^*(t)$ in this case.

Chapter 14

Solutions of Selected Exercises

14.1 Exercises of Chap. 1 Exercise 1.1 Choose f ∈ C 2 (R) and put Y (t) = f (X (t)). Then by the Itô formula dY (t) = f  (X (t))[α dt + σ dB(t)] + 21 σ 2 f  (X (t))dt    + f (X (t − ) + γ(z)) − f (X (t − )) − γ(z) f  (X (t − )) ν(dz)dt |z| 0.

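Then

$$\begin{aligned}
A_0\psi(x) &= -\rho e^{\lambda x} + \tfrac12\lambda^2e^{\lambda x} + \int_{\mathbb{R}}\big\{e^{\lambda(x+\gamma z)} - e^{\lambda x} - \lambda e^{\lambda x}\cdot\gamma z\big\}\,\nu(dz)\\
&= e^{\lambda x}\Big(-\rho + \tfrac12\lambda^2 + \int_{\mathbb{R}}\big\{e^{\lambda\gamma z} - 1 - \lambda\gamma z\big\}\,\nu(dz)\Big).
\end{aligned}$$

Put

$$h(\lambda) := -\rho + \tfrac12\lambda^2 + \int_{\mathbb{R}}\big\{e^{\lambda\gamma z} - 1 - \lambda\gamma z\big\}\,\nu(dz).$$

Note that $h(0) = -\rho<0$. Therefore, since $e^{\lambda\gamma z} - 1 - \lambda\gamma z\ge 0$ for all $z\in\mathbb{R}$, we see that

$$\lim_{\lambda\to\infty}h(\lambda) = \infty.$$

So the equation $h(\lambda) = 0$ has at least one solution $\lambda_1>0$. Define

$$\psi(x) = \begin{cases}x-a & \text{for }x\ge x^*,\\ Ce^{\lambda_1x} & \text{for }x<x^*,\end{cases}\tag{14.3.1}$$

where $C>0$, $x^*>0$ are two constants to be determined. If we require $\psi$ to be continuous at $x = x^*$ we get the equation

$$Ce^{\lambda_1x^*} = x^* - a. \tag{14.3.2}$$

If we require $\psi$ to be differentiable at $x = x^*$ we get the additional equation

$$\lambda_1Ce^{\lambda_1x^*} = 1. \tag{14.3.3}$$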


Dividing (14.3.2) by (14.3.3) we get

$$x^* = a + \frac{1}{\lambda_1},\qquad C = \frac{1}{\lambda_1}e^{-(\lambda_1a+1)}. \tag{14.3.4}$$
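As a sanity check of (14.3.2)–(14.3.4), the Python sketch below solves $h(\lambda) = 0$ by bisection and verifies the continuity and smooth-fit conditions. It assumes, purely for illustration, a Lévy measure concentrated at a single jump size, $\nu = \kappa\delta_{z_0}$, together with hypothetical parameter values; the function names are not from the text.

```python
import math

# Hypothetical parameters; nu = kappa * (point mass at z0) is an assumption.
rho, gamma, kappa, z0, a = 0.1, 0.5, 2.0, 0.3, 1.0

def h(lam):
    """h(lambda) = -rho + lambda^2/2 + integral of (e^{lam*gamma*z} - 1 - lam*gamma*z) nu(dz)."""
    jump = kappa * (math.exp(lam * gamma * z0) - 1.0 - lam * gamma * z0)
    return -rho + 0.5 * lam ** 2 + jump

def positive_root(fn, hi=1.0, tol=1e-12):
    """Bisection on (0, hi); hi is doubled until fn(hi) > 0 (fn(0) < 0 is assumed)."""
    lo = 0.0
    while fn(hi) <= 0.0:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if fn(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

lam1 = positive_root(h)
x_star = a + 1.0 / lam1                       # (14.3.4)
C = math.exp(-(lam1 * a + 1.0)) / lam1        # (14.3.4)

continuity_gap = C * math.exp(lam1 * x_star) - (x_star - a)   # (14.3.2)
smooth_fit_gap = lam1 * C * math.exp(lam1 * x_star) - 1.0     # (14.3.3)
print(lam1, x_star, C, continuity_gap, smooth_fit_gap)
```

Both gaps vanish up to rounding for any $\lambda_1>0$, which confirms that (14.3.4) is exactly the solution of the pair (14.3.2)–(14.3.3).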

We now propose that the function $\phi(s,x) := e^{-\rho s}\psi(x)$ with $\psi(x)$ given by (14.3.1)–(14.3.3) satisfies all the requirements of Theorem 3.2 (possibly under some assumptions) and hence that $\phi(s,x) = \Phi(s,x)$ and that

$$\tau^* := \inf\{t>0;\ X(t)\ge x^*\}$$

is an optimal stopping time. We proceed to check if the conditions (i)–(xi) of Theorem 3.2 hold. Many of these conditions are satisfied trivially or by construction of $\phi$. We only discuss the remaining ones:

(ii) We know that $\phi = g$ for $x>x^*$, by construction. For $x<x^*$ we must check that $Ce^{\lambda_1x}\ge x-a$. To this end, put

$$k(x) = Ce^{\lambda_1x} - x + a;\quad x\le x^*.$$

Then $k(x^*) = k'(x^*) = 0$ and $k''(x) = \lambda_1^2Ce^{\lambda_1x}>0$ for $x\le x^*$. Therefore $k'(x)<0$ for $x<x^*$ and hence $k(x)>0$ for $x<x^*$. Hence (ii) holds.

(vi) We know that $A\phi + f = A\phi = 0$ for $x<x^*$, by construction. For $x>x^*$ we have Aφ(s, x) = e−ρs A0 (x − a) = e−ρs (−ρ(x − a) +



< e−ρs (−ρ(x ∗ − a) +

x+γz x ∗ < ∞ a.s.

(14.3.8)

Some conditions are needed on $\sigma$, $\gamma$, and $\nu$ for (14.3.8) to hold. For example, it suffices that

$$\lim_{t\to\infty}X(t) = \lim_{t\to\infty}\Big\{\sigma B(t) + \gamma\int_0^t\!\!\int_{\mathbb{R}}z\,\tilde N(ds,dz)\Big\} = \infty\quad\text{a.s.} \tag{14.3.9}$$


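(xi) For (xi) to hold it suffices that

$$\sup_{\tau\in\mathcal{T}}E^x\big[e^{-2\rho\tau}X^2(\tau)\big] < \infty. \tag{14.3.10}$$

Again it suffices to assume that (14.3.7) holds.

Conclusion Assume that (14.3.5), (14.3.6), and (14.3.8) hold. Then the value function is $\Phi(s,x) = e^{-\rho s}\psi(x)$, where $\psi(x)$ is given by (14.3.1) and (14.3.4). An optimal stopping time is $\tau^* = \inf\{t>0;\ X(t)\ge x^*\}$.

Exercise 3.2 Define

$$dY(t) = \begin{bmatrix}dt\\ dP(t)\\ dQ(t)\end{bmatrix} = \begin{bmatrix}1\\ \alpha P(t)\\ -\lambda Q(t)\end{bmatrix}dt + \begin{bmatrix}0\\ \beta P(t)\\ 0\end{bmatrix}dB(t) + \begin{bmatrix}0\\ \gamma P(t^-)\displaystyle\int_{\mathbb{R}}z\,\tilde N(dt,dz)\\ 0\end{bmatrix}.$$

Then the generator $A$ of $Y(t)$ is

$$\begin{aligned}
A\phi(y) = A\phi(s,p,q) &= \frac{\partial\phi}{\partial s} + \alpha p\frac{\partial\phi}{\partial p} - \lambda q\frac{\partial\phi}{\partial q} + \tfrac12\beta^2p^2\frac{\partial^2\phi}{\partial p^2}\\
&\quad + \int_{\mathbb{R}}\Big\{\phi(s,p+\gamma pz,q) - \phi(s,p,q) - \frac{\partial\phi}{\partial p}(s,p,q)\gamma zp\Big\}\,\nu(dz).
\end{aligned}$$

If we try

$$\phi(s,p,q) = e^{-\rho s}\psi(w)\quad\text{with }w = p\cdot q,$$

then

$$A\phi(s,p,q) = e^{-\rho s}A_0\psi(w),$$

where

$$A_0\psi(w) = -\rho\psi(w) + (\alpha-\lambda)w\psi'(w) + \tfrac12\beta^2w^2\psi''(w) + \int_{\mathbb{R}}\big\{\psi((1+\gamma z)w) - \psi(w) - \gamma wz\psi'(w)\big\}\,\nu(dz).$$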


Consider the set $U$ defined in Proposition 3.3:

$$\begin{aligned}
U &= \{y;\ Ag(y) + f(y)>0\} = \{(s,p,q);\ A_0(\theta w) + \lambda w - K>0\}\\
&= \{(s,p,q);\ [\theta(\alpha-\rho-\lambda)+\lambda]w - K>0\}\\
&= \begin{cases}\Big\{(s,p,q):\ w>\dfrac{K}{\theta(\alpha-\rho-\lambda)+\lambda}\Big\} & \text{if }\theta(\alpha-\rho-\lambda)+\lambda>0,\\[1ex] \emptyset & \text{if }\theta(\alpha-\rho-\lambda)+\lambda\le 0.\end{cases}
\end{aligned}$$

By Proposition 3.4 we therefore get:

Case 1: Assume $\lambda\le\theta(\lambda+\rho-\alpha)$. Then $\tau^* = 0$ is optimal and $\Phi(y) = g(y) = e^{-\rho s}p\cdot q$ for all $y$.

Case 2: Assume $\theta(\lambda+\rho-\alpha)<\lambda$. Then

$$U = \Big\{(s,w);\ w>\frac{K}{\lambda-\theta(\lambda+\rho-\alpha)}\Big\}\subset D.$$

In view of this it is natural to guess that the continuation region $D$ has the form

$$D = \{(s,w);\ 0<w^*<w\}$$

for some constant $w^*$; $0<w^*<\dfrac{K}{\lambda-\theta(\lambda+\rho-\alpha)}$. In $D$ we try to solve the equation

$$A_0\psi(w) + f(w) = 0.$$

The homogeneous equation $A_0\psi_0(w) = 0$ has a solution $\psi_0(w) = w^r$ if and only if

$$h(r) := -\rho + (\alpha-\lambda)r + \tfrac12\beta^2r(r-1) + \int_{\mathbb{R}}\big\{(1+\gamma z)^r - 1 - r\gamma z\big\}\,\nu(dz) = 0.$$

Since $h(0) = -\rho<0$ and $\lim_{|r|\to\infty}h(r) = \infty$, we see that the equation $h(r) = 0$ has two solutions $r_1,r_2$ such that

$$r_2<0<r_1.$$

Let $r$ be a solution of this equation. To find a particular solution $\psi_1(w)$ of the nonhomogeneous equation

$$A_0\psi_1(w) + \lambda w - K = 0$$


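we try $\psi_1(w) = aw+b$ and find

$$a = \frac{\lambda}{\lambda+\rho-\alpha},\qquad b = -\frac{K}{\rho}.$$

This gives that for all constants $C$ the function

$$\psi(w) = Cw^r + \frac{\lambda}{\lambda+\rho-\alpha}w - \frac{K}{\rho}$$

is a solution of $A_0\psi(w) + \lambda w - K = 0$. Therefore we try to put

$$\psi(w) = \begin{cases}\theta w; & 0<w\le w^*,\\[0.5ex] Cw^r + \dfrac{\lambda}{\lambda+\rho-\alpha}w - \dfrac{K}{\rho}; & w\ge w^*,\end{cases}\tag{14.3.11}$$

where $w^*>0$ and $C$ remain to be determined. Continuity and differentiability at $w = w^*$ give

$$\theta w^* = C(w^*)^r + \frac{\lambda}{\lambda+\rho-\alpha}w^* - \frac{K}{\rho}, \tag{14.3.12}$$

$$\theta = Cr(w^*)^{r-1} + \frac{\lambda}{\lambda+\rho-\alpha}. \tag{14.3.13}$$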

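Combining (14.3.12) and (14.3.13) we get

$$w^* = \frac{(-r)K(\lambda+\rho-\alpha)}{(1-r)\rho(\lambda-\theta(\lambda+\rho-\alpha))} \tag{14.3.14}$$

and

$$C = \frac{\lambda-\theta(\lambda+\rho-\alpha)}{(-r)(\lambda+\rho-\alpha)}\cdot(w^*)^{1-r}. \tag{14.3.15}$$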
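The smooth-fit system (14.3.12)–(14.3.13) can be checked numerically. The Python sketch below finds the negative root $r_2$ of $h(r) = 0$ by bisection, evaluates $w^*$ from (14.3.14), solves (14.3.13) directly for $C$, and confirms that (14.3.12) then holds as well. The single-atom Lévy measure $\nu = \kappa\delta_{z_0}$ and all parameter values are assumptions, chosen so that $\theta(\lambda+\rho-\alpha)<\lambda$ and $\lambda+\rho-\alpha>0$.

```python
import math

# Hypothetical parameters; nu = kappa * (point mass at z0) is an assumption.
alpha, lam, rho, beta = 0.05, 0.10, 0.10, 0.30
theta, K = 0.5, 1.0
gamma, kappa, z0 = 0.5, 1.0, 0.2

def h(r):
    """Characteristic function h(r) whose roots give homogeneous solutions w^r."""
    jump = kappa * ((1.0 + gamma * z0) ** r - 1.0 - r * gamma * z0)
    return -rho + (alpha - lam) * r + 0.5 * beta ** 2 * r * (r - 1.0) + jump

def negative_root(fn, tol=1e-13):
    """Bisection on (lo, 0): fn(0) < 0 is assumed, lo is doubled until fn(lo) > 0."""
    lo, hi = -1.0, 0.0
    while fn(lo) <= 0.0:
        lo *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if fn(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

r = negative_root(h)
a = lam / (lam + rho - alpha)   # coefficient of w in the particular solution

# Threshold from (14.3.14).
w_star = (-r) * K * (lam + rho - alpha) / (
    (1.0 - r) * rho * (lam - theta * (lam + rho - alpha))
)
# Constant C obtained by solving the smooth-fit equation (14.3.13) for C.
C = (theta - a) / (r * w_star ** (r - 1.0))

value_gap = C * w_star ** r + a * w_star - K / rho - theta * w_star   # (14.3.12)
slope_gap = C * r * w_star ** (r - 1.0) + a - theta                   # (14.3.13)
print(r, w_star, C, value_gap, slope_gap)
```

With these parameter choices both $w^*$ and $C$ come out positive, and the value-matching gap vanishes up to rounding, so the pair $(w^*, C)$ indeed solves both equations simultaneously.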

Since we need to have $w^*>0$ we are led to the following condition:

Case 2a: $\theta(\lambda+\rho-\alpha)<\lambda$ and $\lambda+\rho-\alpha>0$.

Then we choose $r = r_2<0$, and with the corresponding values (14.3.14), (14.3.15) of $w^*$ and $C$ the function $\phi(s,p,q) = e^{-\rho s}\psi(p\cdot q)$, with $\psi$ given by (14.3.11), is the value function of the problem. The optimal stopping time $\tau^*$ is

$$\tau^* = \inf\{t>0;\ P(t)\cdot Q(t)\le w^*\}, \tag{14.3.16}$$

provided that all the other conditions of Theorem 3.2 are satisfied. For condition (vi) to hold it suffices that

$$w^*(\lambda-\theta(\lambda+\rho-\alpha)) - K + \int_{\gamma zw>w^*}\Big\{C(w+\gamma zw)^r + \frac{\lambda}{\lambda+\rho-\alpha}(w+\gamma zw) - \frac{K}{\rho} - \theta(w+\gamma zw)\Big\}\,\nu(dz)\le 0. \tag{14.3.17}$$

See also Remark 14.1.

Case 2b: $\theta(\lambda+\rho-\alpha)<\lambda$ and $\lambda+\rho-\alpha\le 0$, i.e., $\alpha\ge\lambda+\rho$.

In this case we have $\Phi^*(y) = \infty$. To see this note that since

$$P(t) = p + \int_0^t\alpha P(s)\,ds + \int_0^t\beta P(s)\,dB(s) + \int_0^t\!\!\int_{\mathbb{R}}\gamma P(s^-)z\,\tilde N(ds,dz),$$

we have

$$E[P(t)] = p + \int_0^t\alpha E[P(s)]\,ds,$$

which gives

$$E[P(t)] = pe^{\alpha t}.$$

Therefore

$$E[e^{-\rho t}P(t)Q(t)] = E[q\,e^{-\rho t}e^{-\lambda t}P(t)] = pq\exp\{(\alpha-\lambda-\rho)t\}.$$


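Hence

$$\lim_{T\to\infty}E\Big[\int_0^Te^{-\rho t}P(t)Q(t)\,dt\Big] = \lim_{T\to\infty}pq\int_0^T\exp\{(\alpha-\lambda-\rho)t\}\,dt = \infty$$

if and only if $\alpha\ge\lambda+\rho$.

Remark 14.1 (On condition (viii) of Theorem 3.2) Consider $\phi(Y(t)) = e^{-\rho t}\psi(P(t)Q(t))$, where

$$P(t) = p\exp\Big\{\Big(\alpha - \tfrac12\beta^2 - \gamma\int_{\mathbb{R}}z\,\nu(dz)\Big)t + \int_0^t\!\!\int_{\mathbb{R}}\ln(1+\gamma z)\,N(ds,dz) + \beta B(t)\Big\}$$

and $Q(t) = q\exp(-\lambda t)$. We have

$$P(t)Q(t) = pq\exp\Big\{\Big(\alpha-\lambda-\tfrac12\beta^2-\gamma\int_{\mathbb{R}}z\,\nu(dz)\Big)t + \int_0^t\!\!\int_{\mathbb{R}}\ln(1+\gamma z)\,N(ds,dz) + \beta B(t)\Big\}$$

and

$$e^{-\rho t}P(t)Q(t) = pq\exp\Big\{\Big(\alpha-\lambda-\rho-\tfrac12\beta^2-\gamma\int_{\mathbb{R}}z\,\nu(dz)\Big)t + \int_0^t\!\!\int_{\mathbb{R}}\ln(1+\gamma z)\,N(ds,dz) + \beta B(t)\Big\}.$$

Hence

$$\begin{aligned}
E[(e^{-\rho t}P(t)Q(t))^2] &= (pq)^2E\Big[\exp\Big\{\Big(2\alpha-2\lambda-2\rho-\beta^2-2\gamma\int_{\mathbb{R}}z\,\nu(dz)\Big)t\\
&\qquad\qquad + 2\int_0^t\!\!\int_{\mathbb{R}}\ln(1+\gamma z)\,N(ds,dz) + 2\beta B(t)\Big\}\Big]\\
&= (pq)^2\exp\Big\{\Big(2\alpha-2\lambda-2\rho-\beta^2-2\gamma\int_{\mathbb{R}}z\,\nu(dz)\Big)t + 2\beta^2t\Big\}\\
&\qquad\cdot E\Big[\exp\Big(2\int_0^t\!\!\int_{\mathbb{R}}\ln(1+\gamma z)\,N(ds,dz)\Big)\Big].
\end{aligned}$$

Using Exercise 1.6 we get

$$E[(e^{-\rho t}P(t)Q(t))^2] = p^2q^2\exp\Big\{\Big(2\alpha-2\lambda-2\rho+\beta^2-2\gamma\int_{\mathbb{R}}z\,\nu(dz) + \int_{\mathbb{R}}\big\{(1+\gamma z)^2-1-2\ln(1+\gamma z)\big\}\,\nu(dz)\Big)t\Big\}.$$

So condition (viii) of Theorem 3.2 holds if

$$2\alpha-2\lambda-2\rho+\beta^2+\int_{\mathbb{R}}\big\{\gamma^2z^2-2\ln(1+\gamma z)\big\}\,\nu(dz)<0.$$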

Exercise 3.3 In this case we have

$$g(s,x) = e^{-\rho s}|x|$$

and

$$dX(t) = dB(t) + \int_{\mathbb{R}}z\,\tilde N(dt,dz).$$

We look for a solution of the form $\phi(s,x) = e^{-\rho s}\psi(x)$. The continuation region is given by

$$D = \{(s,x)\in\mathbb{R}\times\mathbb{R}:\ \phi(s,x)>g(s,x)\} = \{(s,x)\in\mathbb{R}\times\mathbb{R}:\ \psi(x)>|x|\}.$$


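Because of the symmetry we assume that $D$ is of the form

$$D = \{(s,x)\in\mathbb{R}\times\mathbb{R};\ -x^*<x<x^*\},$$

where $x^*>0$. It is trivial that $D$ is a Lipschitz surface and $X(t)$ spends 0 time on $\partial D$. We must have

$$A\phi\equiv 0\quad\text{on }D, \tag{14.3.18}$$

where the generator $A$ is given by

$$A\phi = \frac{\partial\phi}{\partial s} + \frac12\frac{\partial^2\phi}{\partial x^2} + \int_{\mathbb{R}}\Big\{\phi(s,x+z) - \phi(s,x) - \frac{\partial\phi}{\partial x}(s,x)z\Big\}\,\nu(dz).$$

Hence (14.3.18) becomes

$$-\rho\psi(x) + \tfrac12\psi''(x) + \int_{\mathbb{R}}\big\{\psi(x+z) - \psi(x) - z\psi'(x)\big\}\,\nu(dz) = 0. \tag{14.3.19}$$

For $|x|<\xi$ this equation becomes, by (3.4.1),

$$-\rho\psi(x) + \tfrac12\psi''(x) = 0,$$

which has the general solution

$$\psi(x) = C_1e^{\sqrt{2\rho}\,x} + C_2e^{-\sqrt{2\rho}\,x},$$

where $C_1,C_2$ are arbitrary constants. Let $\lambda = \sqrt{2\rho}$ and $-\lambda$ be the two roots of the equation

$$F(\lambda) := -\rho + \tfrac12\lambda^2 = 0.$$

Because of the symmetry we guess that

$$\psi(x) = \frac{C}{2}\big(e^{\lambda x} + e^{-\lambda x}\big) = C\cosh(\lambda x);\quad x\in D = \{(s,x);\ |x|<x^*\}$$

for some constants $C>0$ and $x^*\in(0,\xi)$. Therefore

$$\psi(x) = \begin{cases}C\cosh(\lambda x) & \text{for }|x|<x^*,\\ |x| & \text{for }|x|\ge x^*.\end{cases}$$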


[Fig. 14.1 The value of x∗: the curves tgh(λx) and 1/(λx), with their intersection marked at λx∗]

In order to find $x^*$ and $C$, we impose the continuity and $C^1$-conditions on $\psi(x)$ at $x = x^*$:

• Continuity: $|x^*| = C\cosh(\lambda x^*)$

• $C^1$: $1 = C\lambda\sinh(\lambda x^*)$

It follows that

$$C = \frac{x^*}{\cosh(\lambda x^*)} \tag{14.3.20}$$

and $x^*$ is the solution of

$$\operatorname{tgh}(\lambda x^*) = \frac{1}{\lambda x^*}. \tag{14.3.21}$$
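Equation (14.3.21) has no closed-form solution, but it is easy to solve numerically. The Python sketch below, with a hypothetical discount rate `rho`, finds $u = \lambda x^*$ as the root of $u\tanh(u) = 1$ by bisection and then checks the continuity and $C^1$ conditions with $C$ taken from (14.3.20).

```python
import math

rho = 0.1                       # hypothetical discount rate
lam = math.sqrt(2.0 * rho)      # lambda = sqrt(2 * rho)

def g(u):
    # g(u) = 0  <=>  tanh(u) = 1/u, i.e. equation (14.3.21) with u = lambda * x*.
    return u * math.tanh(u) - 1.0

# g is increasing on (0, infinity) with g(0.5) < 0 < g(2.0), so bisection applies.
lo, hi = 0.5, 2.0
while hi - lo > 1e-13:
    mid = 0.5 * (lo + hi)
    if g(mid) < 0.0:
        lo = mid
    else:
        hi = mid

u_star = 0.5 * (lo + hi)
x_star = u_star / lam
C = x_star / math.cosh(lam * x_star)                           # (14.3.20)

continuity_gap = C * math.cosh(lam * x_star) - x_star          # continuity at x*
smooth_fit_gap = C * lam * math.sinh(lam * x_star) - 1.0       # C^1 condition at x*
print(u_star, x_star, C, continuity_gap, smooth_fit_gap)
```

Note that the root $u = \lambda x^*$ does not depend on $\rho$; only the rescaling $x^* = u/\lambda$ does.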

Figure 14.1 illustrates that there exists a unique solution of (14.3.21). Finally we have to verify that the conditions of Theorem 3.2 hold. We check some:

(ii) $\psi(x)\ge|x|$ for $(s,x)\in D$. Define $h(x) = C\cosh(\lambda x) - x$; $x>0$. Then $h(x^*) = h'(x^*) = 0$ and $h''(x) = C\lambda^2\cosh(\lambda x)>0$ for all $x$. Hence $h(x)>0$ for $0<x<x^*$, so (ii) holds. See Fig. 14.2.

(vi) $A\psi\le 0$ outside $\bar D$. This holds since, by (3.4.2) and (3.4.3),

$$A\psi(x) = -\rho|x| + \int_{\mathbb{R}}\{|x+z| - x - z\}\,\nu(dz)\le 0\quad\text{for all }x>x^*.$$

[Fig. 14.2 The function ψ: the curves |x| and C cosh(λx), with −x∗ and x∗ marked on the x-axis]

Since all the conditions of Theorem 3.2 are satisfied, we conclude that $\phi(s,y) = e^{-\rho s}\psi(y)$ is the optimal value function and that $\tau^* = \inf\{t>0;\ |X(t)| = x^*\}$ is an optimal stopping time.

Exercise 3.4 From Example 3.5 we know that in the no delay case ($\delta = 0$) the solution of the problem (3.4.8) is the following (under some additional assumptions on the Lévy measure $\nu$):

$$\Phi_0(s,x) = e^{-\rho s}\Psi_0(x), \tag{14.3.22}$$

where

$$\Psi_0(x) = \begin{cases}x-q; & x\ge x_0^*,\\ C_0x^{\lambda}; & 0<x<x_0^*.\end{cases}\tag{14.3.23}$$

Here $\lambda>1$ is uniquely determined by the equation

$$-\rho + \mu\lambda + \tfrac12\sigma^2\lambda(\lambda-1) + \int_{\mathbb{R}}\big\{(1+z)^{\lambda} - 1 - \lambda z\big\}\,\nu(dz) = 0, \tag{14.3.24}$$

and $x_0^*$ and $C_0$ are given by

$$x_0^* = \frac{\lambda q}{\lambda-1}, \tag{14.3.25}$$

$$C_0 = \frac{1}{\lambda}(x_0^*)^{1-\lambda}. \tag{14.3.26}$$

14.3 Exercises of Chap. 3

365

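The corresponding optimal stopping time $\tau^* \in \mathcal{T}_0$ is
\[ \tau^* = \inf\{t > 0;\ X(t) \ge x_0^*\}. \tag{14.3.27} \]
Thus it is optimal to sell at the first time the price $X(t)$ equals or exceeds the value $x_0^*$. To find the solution in the delay case ($\delta > 0$) we note that we have $f = 0$ and $g(y) = g(s,x) = e^{-\rho s}(x - q)$. Hence, by (3.4.5),
\[ \begin{aligned} \tilde g_\delta(y) &= E^y[g(Y(\delta))] = E^{s,x}\big[e^{-\rho(s+\delta)}(X(\delta) - q)\big] \\ &= e^{-\rho(s+\delta)}\big(E^x[X(\delta)] - q\big) = e^{-\rho(s+\delta)}\big(x e^{\mu\delta} - q\big) \\ &= e^{-\rho s + \delta(\mu-\rho)}\big(x - q e^{-\mu\delta}\big) = K e^{-\rho s}(x - \tilde q), \end{aligned} \tag{14.3.28} \]
where
\[ K = e^{\delta(\mu-\rho)} \quad \text{and} \quad \tilde q = q e^{-\mu\delta}. \tag{14.3.29} \]
Thus $\tilde g_\delta$ has the same form as $g$, so we can apply the results (14.3.22)–(14.3.27) to find $\tilde\Phi(y)$ and the corresponding optimal $\tilde\tau^*$:
\[ \tilde\Phi(y) = \tilde\Phi(s,x) = e^{-\rho s}\tilde\Psi(x), \tag{14.3.30} \]
where
\[ \tilde\Psi(x) = \begin{cases} K(x - \tilde q); & x \ge \tilde x^* \\ \tilde C x^{\lambda}; & 0 < x < \tilde x^* \end{cases} \tag{14.3.31} \]
with $\lambda$ as in (14.3.24). Here $\tilde x^*$ and $\tilde C$ are given by
\[ \tilde x^* = \frac{\lambda\tilde q}{\lambda - 1}, \tag{14.3.32} \]
\[ \tilde C = \frac{K}{\lambda}(\tilde x^*)^{1-\lambda}, \tag{14.3.33} \]
where the factor $K$ in (14.3.33) comes from the smooth fit condition $\tilde C\lambda(\tilde x^*)^{\lambda-1} = K$ at $x = \tilde x^*$. The corresponding optimal stopping time for problem (3.3.7) and (3.3.6), respectively, is
\[ \tilde\tau^* = \inf\{t > 0;\ X(t) \ge \tilde x^*\}, \tag{14.3.34} \]
\[ \alpha^* = \tilde\tau^* + \delta. \tag{14.3.35} \]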

Using Theorem 3.11 we conclude the following:

Conclusion The value function $\Phi_\delta(y)$ for the delayed optimal stopping problem (3.4.7) is given by $\Phi_\delta(y) = \tilde\Phi(y)$, where $\tilde\Phi$ is as in (14.3.30)–(14.3.33). The corresponding optimal stopping time $\alpha^* \in \mathcal{T}_\delta$ is
\[ \alpha^* = \inf\{t > 0;\ X(t) \ge \tilde x^*\} + \delta. \]

Remark 14.2 Assume for example that $\mu > 0$. Then comparing (14.3.32) with the non-delayed case (14.3.25) we see that $\tilde q < q$ and hence $\tilde x^* < x_0^*$. Thus, in terms of the delayed effect of the stopping time formulation (see 3.3.4), it is optimal to stop at the first time $t = \tilde\tau^*$ when $X(t) \ge \tilde x^*$. This is sooner than in the non-delayed case, because of the anticipation that during the delay time interval $[\tilde\tau^*, \tilde\tau^* + \delta]$ $X(t)$ is likely to increase (since $\mu > 0$). See Fig. 14.3.

Fig. 14.3 The optimal stopping times for Exercise 3.4 ($\mu > 0$): a path of $X(t)$ with the thresholds $x_0^* = \lambda q/(\lambda-1)$ ($\delta = 0$ case) and $\tilde x^* = \lambda q e^{-\mu\delta}/(\lambda-1)$ (delay case), and the times $\tilde\tau^*$, $\tau_0^*$ and $\alpha^* = \tilde\tau^* + \delta$
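The comparison in Remark 14.2 can be illustrated numerically. The sketch below is not part of the original solution: the function name `thresholds` and the parameter values are hypothetical, and the exponent $\lambda$ is supplied directly instead of being solved from (14.3.24).

```python
import math


def thresholds(lam, q, mu, rho, delta):
    """Return (x0_star, x_tilde_star, K, q_tilde) from (14.3.25), (14.3.32) and (14.3.29).

    lam is assumed to be the root lam > 1 of (14.3.24); it is an input here.
    """
    if lam <= 1.0:
        raise ValueError("lam must exceed 1")
    x0_star = lam * q / (lam - 1.0)            # non-delayed threshold (14.3.25)
    q_tilde = q * math.exp(-mu * delta)        # adjusted cost (14.3.29)
    K = math.exp(delta * (mu - rho))           # scaling factor (14.3.29)
    x_tilde_star = lam * q_tilde / (lam - 1.0) # delayed threshold (14.3.32)
    return x0_star, x_tilde_star, K, q_tilde


# Hypothetical parameters with mu > 0: the delayed threshold is the lower one.
x0_star, x_tilde_star, K, q_tilde = thresholds(lam=2.0, q=1.0, mu=0.05, rho=0.1, delta=1.0)
print(x0_star, x_tilde_star)
```

With $\mu > 0$ the ratio $\tilde x^*/x_0^* = e^{-\mu\delta}$ is below 1, which is exactly the statement that stopping is triggered sooner in the delay case.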

14.4 Exercises of Chap. 4

Exercise 4.3 (a) The measure $Q_\theta$ is an equivalent local martingale measure by Theorem 1.31. Since $\theta_0$ and $\theta_1$ are bounded, $Q_\theta$ is an EMM.

(b) By the Bayes rule for conditional expectation we have
\[ \begin{aligned} Y(t) &= \frac{E\big[\exp\big(-\int_t^T r(s)ds\big) F\, G_\theta(T) \mid \mathcal{F}_t\big]}{G_\theta(t)} = \frac{E\big[\exp\big(-\int_0^T r(s)ds\big)\exp\big(\int_0^t r(s)ds\big) G_\theta(T) F \mid \mathcal{F}_t\big]}{G_\theta(t)} \\ &= \frac{E\big[\exp\big(-\int_0^T r(s)ds\big) G_\theta(T) F \mid \mathcal{F}_t\big]}{\exp\big(-\int_0^t r(s)ds\big) G_\theta(t)} = \frac{E[\Gamma(T) F \mid \mathcal{F}_t]}{\Gamma(t)}, \end{aligned} \]
where
\[ \Gamma(t) = \exp\Big(-\int_0^t r(s)ds\Big) G_\theta(t); \quad 0 \le t \le T, \]
i.e.
\[ d\Gamma(t) = \Gamma(t^-)\Big[-r(t)dt + \theta_0(t)dB(t) + \int_{\mathbb{R}}\theta_1(t,\zeta)\tilde N(dt,d\zeta)\Big], \quad \Gamma(0) = 1. \]
Comparing with (4.3.2)–(4.3.3) in Theorem 4.8 we see that $Y(t)$ is the solution of the linear BSDE
\[ \begin{cases} dY(t) = -\Big[-r(t)Y(t) + \theta_0(t)Z(t) + \displaystyle\int_{\mathbb{R}}\theta_1(t,\zeta)K(t,\zeta)\nu(d\zeta)\Big]dt + Z(t)dB(t) + \displaystyle\int_{\mathbb{R}}K(t,\zeta)\tilde N(dt,d\zeta); & 0 \le t \le T \\ Y(T) = F, \end{cases} \]
as claimed.

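14.5 Exercises of Chap. 5

Exercise 5.1 Put
\[ Y(t) = \begin{pmatrix} s + t \\ X(t) \end{pmatrix}. \]
Note: If we put $S = \{(s,x);\ s < T\}$ then
\[ \tau_S = \inf\{t > 0;\ Y^{s,x}(t) \notin S\} = T - s. \]
The generator of $Y(t)$ is
\[ A^u\phi(y) = A^u\phi(s,x) = \frac{\partial\phi}{\partial s} + (\mu - \rho x - u)\frac{\partial\phi}{\partial x} + \frac12\sigma^2\frac{\partial^2\phi}{\partial x^2} + \int_{\mathbb{R}}\Big\{\phi(s,x+\theta z) - \phi(s,x) - \frac{\partial\phi}{\partial x}\cdot\theta z\Big\}\nu(dz). \]
So the conditions of Theorem 5.1 get the form
(i) $A^u\phi(s,x) + e^{-\delta s}\dfrac{u^\gamma}{\gamma} \le 0$ for all $u \ge 0$, $s < T$,
(ii) $\lim_{s\to T^-}\phi(s,x) = \lambda x$,
(iv) $\{\phi^-(Y(\tau))\}_{\tau\le\tau_S}$ is uniformly integrable,
(v) $A^{\hat u}\phi(s,x) + e^{-\delta s}\dfrac{\hat u^\gamma}{\gamma} = 0$ for $s < T$,
in addition to requirements (iii) and (vi).

We try a function $\phi$ of the form
\[ \phi(s,x) = h(s) + k(s)x \]
for suitable functions $h(s)$, $k(s)$. Then the conditions above get the form
(i)' $h'(s) + k'(s)x + (\mu - \rho x - u)k(s) + e^{-\delta s}\dfrac{u^\gamma}{\gamma} + \displaystyle\int_{\mathbb{R}}\{h(s) + k(s)(x+\theta z) - h(s) - k(s)x - k(s)\theta z\}\nu(dz) \le 0$, i.e.
\[ e^{-\delta s}\frac{u^\gamma}{\gamma} + h'(s) + k'(s)x + (\mu - \rho x - u)k(s) \le 0 \quad \text{for all } s < T,\ u \ge 0, \]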

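(ii)' $h(T) = 0$, $k(T) = \lambda$,
(iv)' $\{h(\tau) + k(\tau)X(\tau)\}_{\tau\le\tau_S}$ is uniformly integrable,
(v)' $h'(s) + k'(s)x + (\mu - \rho x - \hat u)k(s) + e^{-\delta s}\dfrac{\hat u^\gamma}{\gamma} = 0$.

From (i)' and (v)' we get
\[ -k(s) + e^{-\delta s}\hat u^{\gamma-1} = 0 \quad\text{or}\quad \hat u = \hat u(s) = \big(e^{\delta s}k(s)\big)^{\frac{1}{\gamma-1}}. \]
Combined with (v)' this gives
(1) $k'(s) - \rho k(s) = 0$ so $k(s) = \lambda e^{\rho(s-T)}$,
(2) $h'(s) = (\hat u(s) - \mu)k(s) - e^{-\delta s}\dfrac{\hat u^\gamma(s)}{\gamma}$, $h(T) = 0$.

Note that
\[ \begin{aligned} h'(s) &= \big(e^{\delta s}k(s)\big)^{\frac{1}{\gamma-1}}k(s) - \mu k(s) - e^{-\delta s}\frac{1}{\gamma}\big(e^{\delta s}k(s)\big)^{\frac{\gamma}{\gamma-1}} \\ &= e^{\frac{\delta s}{\gamma-1}}k(s)^{\frac{\gamma}{\gamma-1}} - \mu k(s) - e^{-\delta s\left(1-\frac{\gamma}{\gamma-1}\right)}\cdot\frac{1}{\gamma}\cdot k(s)^{\frac{\gamma}{\gamma-1}} \\ &= e^{\frac{\delta s}{\gamma-1}}k(s)^{\frac{\gamma}{\gamma-1}}\Big(1 - \frac{1}{\gamma}\Big) - \mu k(s) < 0. \end{aligned} \]
Hence, since $h(T) = 0$, we have $h(s) > 0$ for $s < T$. Therefore
\[ \phi(s,x) = h(s) + k(s)x \ge 0. \]
Clearly $\phi$ satisfies (i), (ii), (iv) and (v). It remains to check (vi), i.e., that $\{h(\tau) + k(\tau)X(\tau)\}_{\tau\le T}$ is uniformly integrable, and to check (iii). For these properties to hold some conditions on $\nu$ must be imposed. We omit the details. We conclude that if these conditions hold then
\[ \hat u(s) = \lambda^{\frac{1}{\gamma-1}}\exp\Big(\frac{(\delta+\rho)s - \rho T}{\gamma-1}\Big); \quad s \le T \tag{14.5.1} \]
is the optimal control.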
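As a quick sanity check of (14.5.1), the following Python sketch evaluates $k(s)$ and $\hat u(s)$ and confirms numerically that the first-order condition $-k(s) + e^{-\delta s}\hat u^{\gamma-1} = 0$ holds. The function names and parameter values are hypothetical and chosen only for illustration.

```python
import math


def k(s, lam, rho, T):
    """Coefficient k(s) = lam * exp(rho * (s - T)) from step (1)."""
    return lam * math.exp(rho * (s - T))


def u_hat(s, lam, rho, delta, gamma, T):
    """Candidate optimal control (14.5.1)."""
    return lam ** (1.0 / (gamma - 1.0)) * math.exp(((delta + rho) * s - rho * T) / (gamma - 1.0))


def first_order_residual(s, lam, rho, delta, gamma, T):
    """Left-hand side of -k(s) + exp(-delta * s) * u_hat(s)**(gamma - 1) = 0."""
    u = u_hat(s, lam, rho, delta, gamma, T)
    return -k(s, lam, rho, T) + math.exp(-delta * s) * u ** (gamma - 1.0)


# Hypothetical parameter values with 0 < gamma < 1.
params = dict(lam=2.0, rho=0.1, delta=0.05, gamma=0.5, T=1.0)
for s in (0.0, 0.3, 1.0):
    print(s, u_hat(s, **params), first_order_residual(s, **params))
```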

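Exercise 5.2 Define
\[ J(u) = E\Big[\int_0^{T_0} e^{-\delta t}\frac{u^\gamma(t)}{\gamma}dt + \lambda X(T_0)\Big], \]
where
\[ dX(t) = (\mu - \rho X(t) - u(t))dt + \sigma dB(t) + \theta\int_{\mathbb{R}} z\tilde N(dt,dz); \quad 0 \le t \le T_0. \]
The Hamiltonian is
\[ H(t,x,u,p,q,r) = e^{-\delta t}\frac{u^\gamma}{\gamma} + (\mu - \rho x - u)p + \sigma q + \int_{\mathbb{R}}\theta z\, r(t,z)\nu(dz). \]
The adjoint equation is
\[ \begin{cases} d\hat p(t) = \rho\hat p(t)dt + \sigma\hat q(t)dB(t) + \displaystyle\int_{\mathbb{R}}\hat r(t,z)\tilde N(dt,dz); & t < T_0 \\ \hat p(T_0) = \lambda. \end{cases} \]
Since $\lambda$ and $\rho$ are deterministic, we guess that $\hat q = \hat r = 0$ and this gives
\[ \hat p(t) = \lambda e^{\rho(t-T_0)}. \]
Hence
\[ H(t,\hat X(t),u,\hat p(t),\hat q(t),\hat r(t)) = e^{-\delta t}\frac{u^\gamma}{\gamma} + (\mu - \rho\hat X(t) - u)\hat p(t), \]
which is maximal when
\[ u = \hat u(t) = \big(e^{\delta t}\hat p(t)\big)^{\frac{1}{\gamma-1}} = \lambda^{\frac{1}{\gamma-1}}\exp\Big(\frac{(\delta+\rho)t - \rho T_0}{\gamma-1}\Big). \tag{14.5.2} \]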

Exercise 5.3 In this case we have
\[ dX(t) = \begin{pmatrix} \int_{\mathbb{R}}\gamma_1(t,X(t^-),u(t^-),z)\tilde N(dt,dz) \\ \int_{\mathbb{R}}\gamma_2(t,X(t^-),u(t^-),z)\tilde N(dt,dz) \end{pmatrix} = \begin{pmatrix} \int_{\mathbb{R}}u(t^-,\omega)z\,\tilde N(dt,dz) \\ \int_{\mathbb{R}}z^2\,\tilde N(dt,dz) \end{pmatrix} \]
so the Hamiltonian is
\[ H(t,x,u,p,q,r) = \int_{\mathbb{R}}\big\{u z\, r_1(t,z) + z^2 r_2(t,z)\big\}\nu(dz) \]
and the adjoint equations are ($g(x_1,x_2) = -(x_1-x_2)^2$)
\[ \begin{cases} dp_1(t) = \displaystyle\int_{\mathbb{R}}r_1(t^-,z)\tilde N(dt,dz); & t < T \\ p_1(T) = -2(X_1(T) - X_2(T)) \end{cases} \qquad \begin{cases} dp_2(t) = \displaystyle\int_{\mathbb{R}}r_2(t^-,z)\tilde N(dt,dz) \\ p_2(T) = 2(X_1(T) - X_2(T)). \end{cases} \]
Now $X_1(T) - X_2(T) = \displaystyle\int_0^T\!\!\int_{\mathbb{R}}\big\{u(t^-) - z\big\}z\,\tilde N(dt,dz)$. So if $\hat u$ is a given candidate for an optimal control we get
\[ \hat r_1(t,z) = -2(\hat u(t) - z)z, \qquad \hat r_2(t,z) = 2(\hat u(t) - z)z. \]
This gives
\[ \begin{aligned} H(t,x,u,\hat p,\hat q,\hat r) &= \int_{\mathbb{R}}\big\{u z(-2(\hat u(t) - z)z) + z^2\, 2(\hat u(t) - z)z\big\}\nu(dz) \\ &= -2u\int_{\mathbb{R}}\big\{\hat u(t)z^2 - z^3\big\}\nu(dz) + 2\int_{\mathbb{R}}\big\{\hat u(t)z^3 - z^4\big\}\nu(dz). \end{aligned} \]
This is a linear expression in $u$, so we guess that the coefficient of $u$ is 0, i.e., that
\[ \hat u(t) = \frac{\int_{\mathbb{R}}z^3\nu(dz)}{\int_{\mathbb{R}}z^2\nu(dz)} \quad \text{for all } (t,\omega) \in [0,T]\times\Omega. \tag{14.5.3} \]
With this choice of $\hat u(t)$ all the conditions of the stochastic maximum principle are satisfied and we conclude that $\hat u$ is optimal.
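To make (14.5.3) concrete, the following sketch evaluates $\hat u$ and the residual quantity $T\int_{\mathbb{R}}(z^2 - \hat u z)^2\nu(dz)$ for a Lévy measure with finitely many atoms. The discrete measure, the function names and the sample values are assumptions made only for this illustration.

```python
def u_hat(atoms):
    """Constant control (14.5.3) for a discrete measure given as (z, weight) pairs."""
    m2 = sum(w * z ** 2 for z, w in atoms)
    m3 = sum(w * z ** 3 for z, w in atoms)
    return m3 / m2


def residual_risk(atoms, T):
    """T * integral of (z^2 - u_hat * z)^2 against the discrete measure."""
    u = u_hat(atoms)
    return T * sum(w * (z ** 2 - u * z) ** 2 for z, w in atoms)


# One atom: the residual vanishes (hypothetical jump size 0.5, intensity 2).
print(u_hat([(0.5, 2.0)]), residual_risk([(0.5, 2.0)], T=1.0))

# Two symmetric atoms: u_hat = 0 and the residual is strictly positive.
print(u_hat([(1.0, 1.0), (-1.0, 1.0)]), residual_risk([(1.0, 1.0), (-1.0, 1.0)], T=1.0))
```

The one-atom case gives zero residual and the two-atom case does not, in line with the statement that the residual vanishes only when $\nu$ is supported on a single point.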

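Note that this implies that
\[ \begin{aligned} \inf_u E\Big[\Big(F - \int_0^T u(t)dS_1(t)\Big)^2\Big] &= E\Big[\Big(\int_0^T\!\!\int_{\mathbb{R}}\big\{z^2 - \hat u(t)z\big\}\tilde N(dt,dz)\Big)^2\Big] \\ &= \int_0^T\!\!\int_{\mathbb{R}}E\big[(z^2 - \hat u(t)z)^2\big]\nu(dz)dt \\ &= T\int_{\mathbb{R}}\Big(z^2 - \frac{\int_{\mathbb{R}}z^3\nu(dz)}{\int_{\mathbb{R}}z^2\nu(dz)}\,z\Big)^2\nu(dz). \end{aligned} \]
We see that this is 0 if and only if
\[ \int_{\mathbb{R}}z^3\nu(dz) = z\int_{\mathbb{R}}z^2\nu(dz) \quad \text{for a.a. } z\ (\nu), \tag{14.5.4} \]
i.e. iff $\nu$ is supported on one point $\{z_0\}$. Only then is the market complete! See [BDLØP] for more information.

Exercise 5.4 We try to find $a$, $b$ such that the function
\[ \varphi(s,x) = e^{-\rho s}\psi(x) := e^{-\rho s}(ax^2 + b) \]
satisfies the conditions of (the minimum version of) Theorem 5.1. In this case the generator is
\[ A^v\varphi(s,x) = e^{-\rho s}A_0^v\psi(x), \]
where
\[ A_0^v\psi(x) = -\rho\psi(x) + v\psi'(x) + \frac12\sigma^2\psi''(x) + \int_{\mathbb{R}}\big\{\psi(x+z) - \psi(x) - z\psi'(x)\big\}\nu(dz). \]
Hence condition (i) of Theorem 5.1 becomes
\[ \begin{aligned} A_0^v\psi(x) + x^2 + \theta v^2 &= -\rho(ax^2 + b) + v\,2ax + \frac12\sigma^2\,2a + a\int_{\mathbb{R}}z^2\nu(dz) + x^2 + \theta v^2 \\ &= \theta v^2 + 2axv + x^2(1 - \rho a) + a\Big(\sigma^2 + \int_{\mathbb{R}}z^2\nu(dz)\Big) - \rho b =: h(v). \end{aligned} \]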

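The function $h$ is minimal when
\[ v = u^*(x) = -\frac{ax}{\theta}. \tag{14.5.5} \]
With this value of $v$ condition (v) becomes
\[ x^2\Big(1 - \rho a - \frac{a^2}{\theta}\Big) + a\Big(\sigma^2 + \int_{\mathbb{R}}z^2\nu(dz)\Big) - \rho b = 0. \]
Hence we choose $a > 0$ and $b$ such that
\[ a^2 + \rho\theta a - \theta = 0 \tag{14.5.6} \]
and
\[ b = \frac{a}{\rho}\Big(\sigma^2 + \int_{\mathbb{R}}z^2\nu(dz)\Big). \tag{14.5.7} \]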
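The coefficients in (14.5.6) and (14.5.7) are easy to compute. The following sketch takes the positive root of the quadratic and treats the second moment $\int_{\mathbb{R}}z^2\nu(dz)$ as a given number `m2`; the function names and sample parameters are hypothetical.

```python
import math


def value_coefficients(rho, theta, sigma, m2):
    """Positive root a of a^2 + rho*theta*a - theta = 0 and b from (14.5.7).

    m2 stands for the second moment of the Levy measure (an input, assumed finite).
    """
    a = 0.5 * (-rho * theta + math.sqrt((rho * theta) ** 2 + 4.0 * theta))
    b = a * (sigma ** 2 + m2) / rho
    return a, b


def feedback_control(x, a, theta):
    """Candidate optimal control (14.5.5)."""
    return -a * x / theta


a, b = value_coefficients(rho=0.1, theta=2.0, sigma=0.3, m2=0.05)
print(a, b, feedback_control(1.0, a, 2.0))
```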

With these values of $a$ and $b$ we can easily check that $\varphi(s,x) := e^{-\rho s}(ax^2 + b)$ satisfies all the conditions of Theorem 5.1. The corresponding optimal control is given by (14.5.5).

Exercise 5.5 (b) The Hamiltonian for this problem is
\[ H(t,x,u,p,q,r) = x^2 + \theta u^2 + up + \sigma q + \int_{\mathbb{R}}z\,r(t,z)\nu(dz). \]
The adjoint equation is
\[ \begin{cases} dp(t) = -2X(t)dt + q(t)dB(t) + \displaystyle\int_{\mathbb{R}}r(t^-,z)\tilde N(dt,dz); & t < T \\ p(T) = 2\lambda X(T). \end{cases} \tag{14.5.8} \]
By imposing the first and second-order conditions, we see that $H(t,x,u,p,q,r)$ is minimal for
\[ u = u(t) = \hat u(t) = -\frac{1}{2\theta}p(t). \tag{14.5.9} \]
In order to find a solution of (14.5.8), we consider $p(t) = h(t)X(t)$, where $h : \mathbb{R} \to \mathbb{R}$ is a deterministic function such that

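$h(T) = 2\lambda$. Note that $u(t) = -\dfrac{h(t)X(t)}{2\theta}$ and
\[ dX(t) = -\frac{h(t)X(t)}{2\theta}dt + \sigma dB(t) + \int_{\mathbb{R}}z\tilde N(dt,dz); \quad X(0) = x. \]
Moreover, (14.5.8) turns into
\[ dp(t) = h(t)dX(t) + X(t)h'(t)dt = X(t)\Big[-\frac{h(t)^2}{2\theta} + h'(t)\Big]dt + h(t)\sigma dB(t) + h(t)\int_{\mathbb{R}}z\tilde N(dt,dz). \]
Hence $h(t)$ is the solution of
\[ \begin{cases} h'(t) = \dfrac{h(t)^2}{2\theta} - 2; & t < T \\ h(T) = 2\lambda. \end{cases} \tag{14.5.10} \]
The general solution of (14.5.10) is
\[ h(t) = 2\sqrt{\theta}\,\frac{1 + \beta e^{\frac{2t}{\sqrt\theta}}}{1 - \beta e^{\frac{2t}{\sqrt\theta}}} \quad\text{with}\quad \beta = \frac{\lambda - \sqrt\theta}{\lambda + \sqrt\theta}\,e^{-\frac{2T}{\sqrt\theta}}. \tag{14.5.11} \]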
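The closed form (14.5.11) can be checked numerically against the Riccati equation (14.5.10). The sketch below uses a central finite difference for $h'(t)$; the function names, step size and parameter values are hypothetical choices for illustration.

```python
import math


def h(t, lam, theta, T):
    """Closed-form solution (14.5.11) of the Riccati equation (14.5.10)."""
    root = math.sqrt(theta)
    beta = (lam - root) / (lam + root) * math.exp(-2.0 * T / root)
    w = beta * math.exp(2.0 * t / root)
    return 2.0 * root * (1.0 + w) / (1.0 - w)


def riccati_residual(t, lam, theta, T, eps=1e-6):
    """h'(t) - (h(t)^2 / (2 theta) - 2), with h' from a central difference."""
    dh = (h(t + eps, lam, theta, T) - h(t - eps, lam, theta, T)) / (2.0 * eps)
    return dh - (h(t, lam, theta, T) ** 2 / (2.0 * theta) - 2.0)


lam, theta, T = 0.5, 1.0, 1.0  # hypothetical parameters
print(h(T, lam, theta, T))     # terminal value, should equal 2 * lam
print(riccati_residual(0.4, lam, theta, T))
```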

By using the stochastic maximum principle, we can conclude that
\[ u^*(t) = -\frac{h(t)}{2\theta}X(t) \]
is the optimal control, $p(t) = h(t)X(t)$ and $q(t) = \sigma h(t)$, $r(t^-,z) = h(t)z$, where $h(t)$ is given by (14.5.11).

Exercise 5.6 If we try a function of the form
\[ \varphi(s,x) = e^{-\delta s}\psi(x) \]
then equations (i) and (v) for Theorem 5.1 combine to give the equation
\[ \sup_{c\ge0}\Big\{\ln c - \delta\psi(x) + (\mu x - c)\psi'(x) + \frac12\sigma^2x^2\psi''(x) + \int_{\mathbb{R}}\big\{\psi(x + x\theta z) - \psi(x) - x\theta z\psi'(x)\big\}\nu(dz)\Big\} = 0. \]
The function
\[ h(c) := \ln c - c\psi'(x); \quad c > 0 \]
is maximal when
\[ c = \hat c(x) = \frac{1}{\psi'(x)}. \]
If we set $\psi(x) = a\ln x + b$ where $a$, $b$ are constants, $a > 0$, then this gives
\[ \hat c(x) = \frac{x}{a}, \]
and hence the above equation becomes
\[ \ln x - \ln a - \delta(a\ln x + b) + \mu x\cdot\frac{a}{x} - 1 + \frac12\sigma^2x^2\Big(-\frac{a}{x^2}\Big) + a\int_{\mathbb{R}}\Big\{\ln(x + x\theta z) - \ln x - x\theta z\cdot\frac{1}{x}\Big\}\nu(dz) = 0 \]
or
\[ (1 - \delta a)\ln x - \ln a - \delta b + \mu a - 1 - \frac12\sigma^2a + a\int_{\mathbb{R}}\{\ln(1+\theta z) - \theta z\}\nu(dz) = 0, \quad \text{for all } x > 0. \]
This is possible if and only if
\[ a = \frac{1}{\delta} \]
and
\[ b = \delta^{-2}\Big[\delta\ln\delta - \delta + \mu - \frac12\sigma^2 + \int_{\mathbb{R}}\{\ln(1+\theta z) - \theta z\}\nu(dz)\Big]. \]

One can now verify that if $\delta > \mu$ then with these values of $a$ and $b$ the function $\varphi(s,x) = e^{-\delta s}(a\ln x + b)$ satisfies all the conditions of Theorem 5.1. We conclude that
\[ \Phi(s,x) = e^{-\delta s}(a\ln x + b) \]
and that
\[ c^*(x) = \hat c = \frac{x}{a} \]
(in feedback form) is an optimal consumption rate.

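14.6 Exercises of Chap. 6

Exercise 6.1 (i)
\[ A^{\alpha,\pi}\varphi(s,x) = \frac{\partial\varphi}{\partial s} + x[(1-\pi)r(s) + \pi\alpha]\frac{\partial\varphi}{\partial x} + \frac12\beta^2\pi^2x^2\frac{\partial^2\varphi}{\partial x^2} + \int_{\mathbb{R}}\Big\{\varphi(s,x + \pi x\gamma(s,\zeta)) - \varphi(s,x) - \pi x\gamma(s,\zeta)\frac{\partial\varphi}{\partial x}(s,x)\Big\}\nu(d\zeta). \]
(ii) The HJBI equation is
\[ \begin{cases} \inf_{\alpha\in\mathcal{A}_1}\Big(\sup_{\pi\in\mathcal{A}_2}A^{\alpha,\pi}\Phi(s,x)\Big) = 0; & s < T \\ \Phi(T,x) = U(x). \end{cases} \]
(iii) With
\[ \hat\varphi(s,x) = U\Big(x\exp\Big(\int_s^T r(t)dt\Big)\Big) \]
we get
\[ \begin{aligned} A^{\alpha,\pi}\hat\varphi(s,x) &= -U'(x\psi(s))\,x\psi(s)r(s) + x[(1-\pi)r(s) + \pi\alpha]U'(x\psi(s))\psi(s) + \frac12\beta^2\pi^2x^2U''(x\psi(s))\psi^2(s) \\ &\quad + \int_{\mathbb{R}}\big\{U((x + \pi x\gamma(s,\zeta))\psi(s)) - U(x\psi(s)) - U'(x\psi(s))\psi(s)\pi x\gamma(s,\zeta)\big\}\nu(d\zeta), \end{aligned} \]
where
\[ \psi(s) = \exp\Big(\int_s^T r(t)dt\Big). \]
(iv) The unique saddle point is found to be
\[ \hat\pi(s,x) = 0, \quad \hat\alpha(s,x) = r(s); \quad (s,x) \in [-\infty,T]\times(0,\infty). \]
The interpretation of this result is the following: the worst scenario for the trader is to have $\alpha(s) = r(s)$, because then the underlying probability measure $P$ is a martingale measure for the market, and in that case it is impossible for the trader to make any profit. She might as well choose not to trade, i.e. put $\pi = \hat\pi = 0$.

Exercise 6.2 From (5.4.10) we get
\[ \begin{aligned} A_\pi(t,x) &= y'(t,x)\pi xb(t) + \frac12y''(t,x)x^2\pi^2\sigma^2(t) + z'(t,x)\pi x\sigma(t) \\ &\quad + \int_{\mathbb{R}}\{y(t,x + x\pi\gamma(t,\zeta)) - y(t,x) - y'(t,x)x\pi\gamma(t,\zeta)\}\nu(d\zeta) \\ &\quad + \int_{\mathbb{R}}\{k(t,x + x\pi\gamma(t,\zeta),\zeta) - k(t,x,\zeta)\}\nu(d\zeta). \end{aligned} \tag{14.6.1} \]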

This is maximal when $\pi = \hat\pi(t)$ satisfies the equation
\[ x y'(t,x)b(t) + y''(t,x)\sigma^2(t)\hat\pi(t)x^2 + z'(t,x)\sigma(t)x \tag{14.6.2} \]
\[ {} + \int_{\mathbb{R}}\{y'(t,x + x\hat\pi(t)\gamma(t,\zeta)) - y'(t,x) + k'(t,x + x\hat\pi(t)\gamma(t,\zeta),\zeta)\}\,x\gamma(t,\zeta)\nu(d\zeta) = 0, \tag{14.6.3} \]
where $(y,z,k)$ solves the partial BSDE
\[ \begin{cases} dy(t,x) = -A_{\hat\pi}(t,x)dt + z(t,x)dB(t) + \displaystyle\int_{\mathbb{R}}k(t,x,\zeta)\tilde N(dt,d\zeta) \\ y(T,x) = \ln X_{\hat\pi}(T). \end{cases} \tag{14.6.4} \]
Since the coefficients $b(t)$, $\sigma(t)$ and $\gamma(t,\zeta)$ are deterministic, we guess that $\hat\pi(t)$ is deterministic also, and hence that we can in (14.6.4) choose $z(t,x) = k(t,x,\zeta) = 0$. Then let us try a solution of (14.6.4) of the form

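\[ y(t,x) = f(t) + \ln x \tag{14.6.5} \]
for some $C^1$-function $f$ (to be determined). Substituted into (14.6.1) this gives
\[ A_\pi(t,x) = \pi(t)b(t) - \frac12\pi^2(t)\sigma^2(t) + \int_{\mathbb{R}}\{\ln(1 + \pi(t)\gamma(t,\zeta)) - \pi(t)\gamma(t,\zeta)\}\nu(d\zeta) = A_\pi(t), \tag{14.6.6–14.6.7} \]
and, since $y'(t,x) = 1/x$ and $y''(t,x) = -1/x^2$, (14.6.2) becomes
\[ b(t) - \sigma^2(t)\hat\pi(t) - \int_{\mathbb{R}}\frac{\hat\pi(t)\gamma^2(t,\zeta)}{1 + \hat\pi(t)\gamma(t,\zeta)}\nu(d\zeta) = 0. \tag{14.6.8} \]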
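Equation (14.6.8) determines $\hat\pi(t)$ only implicitly. The sketch below solves it by bisection for constant coefficients and a Lévy measure with finitely many atoms; the discrete measure, the function name `optimal_fraction`, the bracket and the parameter values are all assumptions made for illustration. The residual used here is $b - \sigma^2\pi - \int \pi\gamma^2/(1+\pi\gamma)\,\nu(d\zeta)$, i.e. the derivative of (14.6.6) with respect to $\pi$, which is decreasing in $\pi$ as long as $1 + \pi\gamma > 0$ for every atom.

```python
def optimal_fraction(b, sigma, atoms, lo, hi, tol=1e-12):
    """Solve the first-order condition for a constant portfolio pi by bisection.

    atoms: list of (gamma, weight) pairs describing a discrete Levy measure (assumed).
    [lo, hi]: bracket on which 1 + pi * gamma > 0 for every atom and the
    residual changes sign from positive to negative.
    """
    def residual(pi):
        jump = sum(w * pi * g * g / (1.0 + pi * g) for g, w in atoms)
        return b - sigma ** 2 * pi - jump

    if residual(lo) < 0.0 or residual(hi) > 0.0:
        raise ValueError("bracket does not contain a root")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Without jumps the solution reduces to b / sigma^2.
print(optimal_fraction(0.05, 0.2, [], 0.0, 5.0))
# Two hypothetical jump sizes of +/-10% with unit intensity reduce the position.
print(optimal_fraction(0.05, 0.2, [(0.1, 1.0), (-0.1, 1.0)], 0.0, 1.0))
```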

With $\hat\pi$ as in (14.6.8) the BSDE (14.6.4) reduces to
\[ \begin{cases} f'(t) = -A_{\hat\pi}(t); & t \in [0,T] \\ f(T) = 0, \end{cases} \]
which gives
\[ f(t) = \int_t^T A_{\hat\pi}(s)ds; \quad t \in [0,T]. \]
Therefore, the maximal expected logarithmic utility is
\[ Y_{\hat\pi}(0) = y(0,x) = \int_0^T A_{\hat\pi}(s)ds + \ln x. \]

Exercise 6.3 The Hamiltonian for this problem is
\[ H(t,x,\pi,\mu,p,q) = \frac12\mu^2 + \pi x(\alpha(t) + \mu)p + \pi x\beta(t)q \tag{14.6.9} \]
and the BSDE for the adjoint processes $p$, $q$ is
\[ \begin{cases} dp(t) = -[\pi(t)\{(\alpha(t) + \mu(t))p(t) + \beta(t)q(t)\}]dt + q(t)dB(t); & 0 \le t \le T \\ p(T) = \dfrac{\theta}{X(T)}. \end{cases} \tag{14.6.10} \]

Maximizing $H$ with respect to $\pi$ gives the first-order equation
\[ X(t)[(\alpha(t) + \mu(t))p(t) + \beta(t)q(t)] = 0. \tag{14.6.11} \]
Since $X(t) > 0$ and $\beta(t) \ne 0$, we deduce that
\[ (\alpha(t) + \mu(t))p(t) + \beta(t)q(t) = 0 \tag{14.6.12} \]
and
\[ q(t) = -\frac{\alpha(t) + \mu(t)}{\beta(t)}p(t). \tag{14.6.13} \]
Hence (14.6.10) reduces to
\[ \begin{cases} dp(t) = -\dfrac{\alpha(t) + \mu(t)}{\beta(t)}p(t)dB(t) \\ p(T) = \dfrac{\theta}{X(T)}. \end{cases} \tag{14.6.14} \]
Define
\[ h(t) = p(t)X(t). \tag{14.6.15} \]
Then by the Itô formula we get
\[ \begin{cases} dh(t) = \Big[\pi(t)\beta(t) - \dfrac{\alpha(t) + \mu(t)}{\beta(t)}\Big]h(t)dB(t) \\ h(T) = p(T)X(T) = \theta. \end{cases} \tag{14.6.16} \]
This BSDE has the solution
\[ h(t) = \theta; \quad 0 \le t \le T. \tag{14.6.17} \]
We deduce that
\[ \Big(\pi(t)\beta(t) - \frac{\alpha(t) + \mu(t)}{\beta(t)}\Big)h(t) = 0, \tag{14.6.18} \]
from which we get the following expression for our candidate $\hat\pi(t)$ for the optimal portfolio
\[ \hat\pi(t) = \frac{\alpha(t) + \hat\mu(t)}{\beta^2(t)}, \tag{14.6.19} \]
where $\hat\mu(t)$ is the corresponding candidate for the optimal perturbation.

Minimizing $H$ with respect to $\mu$ gives the following first-order equation for the optimal $\hat\mu(t)$:
\[ \hat\mu(t) + \hat\pi(t)\hat X(t)\hat p(t) = 0, \tag{14.6.20} \]
i.e.,
\[ \hat\mu(t) = -\hat\pi(t)\hat X(t)\hat p(t) = -\theta\hat\pi(t). \tag{14.6.21} \]
We can now verify that $(\hat\pi,\hat\mu)$ satisfies all the conditions of the sufficient maximum principle, and hence we conclude the following:

Proposition 14.3 (Optimal portfolio under model uncertainty) The saddle point $(\pi^*(t),\mu^*(t))$ of the stochastic differential game (6.4.2) is given by
\[ \pi^*(t) = \frac{\alpha(t)}{\beta^2(t) + \theta}, \tag{14.6.22} \]
and
\[ \mu^*(t) = -\frac{\alpha(t)\theta}{\beta^2(t) + \theta}. \tag{14.6.23} \]
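The saddle point of Proposition 14.3 follows by solving (14.6.19) and (14.6.21) simultaneously. A minimal Python sketch, with a hypothetical function name and sample values frozen at one time $t$, confirms that (14.6.22) and (14.6.23) satisfy both relations.

```python
def saddle_point(alpha, beta, theta):
    """Return (pi_star, mu_star) from (14.6.22) and (14.6.23) at a fixed time t."""
    denom = beta ** 2 + theta
    return alpha / denom, -alpha * theta / denom


alpha, beta, theta = 0.08, 0.2, 0.5  # hypothetical values of alpha(t), beta(t), theta
pi_star, mu_star = saddle_point(alpha, beta, theta)

# Consistency with (14.6.19): pi = (alpha + mu) / beta^2
print(pi_star, (alpha + mu_star) / beta ** 2)
# Consistency with (14.6.21): mu = -theta * pi
print(mu_star, -theta * pi_star)
```

Letting $\theta \to 0$ recovers $\pi^* = \alpha/\beta^2$ with $\mu^* = 0$, and a larger $\theta$ shrinks the position.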

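14.7 Exercises of Chap. 7

Exercise 7.1 (a) The HJB equation, i.e., (vi) and (ix) of Theorem 7.2, for this problem gets the form
\[ 0 = \sup_{u\ge0}\Big\{e^{-\delta s}\frac{u^\gamma}{\gamma} + \frac{\partial\phi}{\partial s} + (\mu x - u)\frac{\partial\phi}{\partial x} + \frac{\sigma^2x^2}{2}\frac{\partial^2\phi}{\partial x^2} + \int_{\mathbb{R}}\Big\{\phi(s,x + \theta xz) - \phi(s,x) - \theta xz\frac{\partial\phi}{\partial x}(s,x)\Big\}d\nu(z)\Big\} \tag{14.7.1} \]
for $x > 0$. We impose the first-order conditions to find the supremum, which is obtained for
\[ u = u^*(s,x) = \Big(e^{\delta s}\frac{\partial\phi}{\partial x}\Big)^{1/(\gamma-1)}. \tag{14.7.2} \]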

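We guess that $\phi(s,x) = Ke^{-\delta s}x^\gamma$ with $K > 0$ to be determined. Then
\[ u^*(s,x) = (K\gamma)^{1/(\gamma-1)}x \tag{14.7.3} \]
and (14.7.1) turns into
\[ \frac{1}{\gamma}(K\gamma)^{\gamma/(\gamma-1)} - K\delta + \big(\mu - (K\gamma)^{1/(\gamma-1)}\big)K\gamma + \frac12\sigma^2K\gamma(\gamma-1) + K\int_{\mathbb{R}}\{(1+\theta z)^\gamma - 1 - \gamma\theta z\}\nu(dz) = 0 \]
or
\[ \frac{1}{\gamma}\gamma^{\gamma/(\gamma-1)}K^{1/(\gamma-1)} - \delta + \mu\gamma - \gamma^{\gamma/(\gamma-1)}K^{1/(\gamma-1)} + \frac12\sigma^2\gamma(\gamma-1) + \int_{\mathbb{R}}\{(1+\theta z)^\gamma - 1 - \gamma\theta z\}\nu(dz) = 0. \]
Hence
\[ K = \frac{1}{\gamma}\Big[\frac{1}{1-\gamma}\Big(\delta - \mu\gamma + \frac{\sigma^2}{2}\gamma(1-\gamma) - \int_{\mathbb{R}}\{(1+\theta z)^\gamma - 1 - \gamma\theta z\}\nu(dz)\Big)\Big]^{\gamma-1} \tag{14.7.4} \]
provided that
\[ \delta - \mu\gamma + \frac{\sigma^2}{2}\gamma(1-\gamma) - \int_{\mathbb{R}}\{(1+\theta z)^\gamma - 1 - \gamma\theta z\}\nu(dz) > 0. \]
With this choice of $K$ the conditions of Theorem 7.2 are satisfied and we can conclude that $\phi = \Phi$ is the value function.

(b) (i) First assume $\lambda \ge K$. Choose $\phi(s,x) = \lambda e^{-\delta s}x^\gamma$. By the same computations as in (a), condition (vi) of Theorem 7.2 gets the form
\[ \lambda \ge \frac{1}{\gamma}\Big[\frac{1}{1-\gamma}\Big(\delta - \mu\gamma + \frac12\sigma^2\gamma(1-\gamma) - \int_{\mathbb{R}}\{(1+\theta z)^\gamma - 1 - \gamma\theta z\}\nu(dz)\Big)\Big]^{\gamma-1}. \tag{14.7.5} \]
Since $\lambda \ge K$, the inequality (14.7.5) holds by (14.7.4).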

By Theorem 7.2(a), it follows that
\[ \phi(s,x) = \lambda e^{-\delta s}x^\gamma \ge \Phi(s,x) \]
where $\Phi$ is the value function for our problem. On the other hand, $\phi(s,x)$ is obtained by the (admissible) control of stopping immediately ($\tau = 0$). Hence we also have $\phi(s,x) \le \Phi(s,x)$. We conclude that
\[ \Phi(s,x) = \lambda e^{-\delta s}x^\gamma \]
in this case and $\tau^* = 0$ is optimal. Note that $D = \emptyset$.

(ii) Assume now $\lambda < K$. Choose $\phi(s,x) = Ke^{-\delta s}x^\gamma$. Then for all $(s,x) \in \mathbb{R}\times(0,\infty)$ we have $\phi(s,x) > \lambda e^{-\delta s}x^\gamma$. Hence we have $D = \mathbb{R}\times(0,\infty)$ and by Theorem 7.2(a) we conclude that
\[ \Phi(s,x) \le Ke^{-\delta s}x^\gamma. \]
On the other hand, we have seen in (a) above that if we apply the control
\[ u^*(s,x) = (K\gamma)^{1/(\gamma-1)}x \]
and never stop, then we achieve the performance $J^{(u^*)}(s,x) = Ke^{-\delta s}x^\gamma$. Hence
\[ \Phi(s,x) = Ke^{-\delta s}x^\gamma \]
and it is optimal never to stop ($\tau^* = \infty$).
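The dichotomy in part (b) depends only on how $\lambda$ compares with the constant $K$ of (14.7.4). The sketch below computes $K$, checks it against the reduced HJB equation obtained from the ansatz $\phi = Ke^{-\delta s}x^\gamma$, and returns the resulting stopping rule. The function names are hypothetical, and the jump integral $\int_{\mathbb{R}}\{(1+\theta z)^\gamma - 1 - \gamma\theta z\}\nu(dz)$ is passed in as a precomputed number `jump_term` (zero in the example).

```python
def value_constant(gamma, delta, mu, sigma, jump_term=0.0):
    """Constant K from (14.7.4); jump_term is the jump integral (assumed given)."""
    big = delta - mu * gamma + 0.5 * sigma ** 2 * gamma * (1.0 - gamma) - jump_term
    if big <= 0.0:
        raise ValueError("the positivity condition below (14.7.4) fails")
    return (big / (1.0 - gamma)) ** (gamma - 1.0) / gamma


def hjb_residual(K, gamma, delta, mu, sigma, jump_term=0.0):
    """Left-hand side of the equation that (14.7.1) turns into under the ansatz."""
    kg = K * gamma
    return (kg ** (gamma / (gamma - 1.0)) / gamma - K * delta
            + (mu - kg ** (1.0 / (gamma - 1.0))) * kg
            + 0.5 * sigma ** 2 * kg * (gamma - 1.0) + K * jump_term)


def stopping_rule(lam, K):
    """Case (i): lam >= K, stop at once. Case (ii): lam < K, never stop."""
    return "stop immediately" if lam >= K else "never stop"


K = value_constant(gamma=0.5, delta=0.1, mu=0.05, sigma=0.2)  # hypothetical values
print(K, hjb_residual(K, 0.5, 0.1, 0.05, 0.2), stopping_rule(3.0, K))
```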

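14.8 Exercises of Chap. 8

Exercise 8.1 In this case we put
\[ dY(t) = \begin{pmatrix} dt \\ dX(t) \end{pmatrix} = \begin{pmatrix} 1 \\ \alpha \end{pmatrix}dt + \begin{pmatrix} 0 \\ \sigma \end{pmatrix}dB(t) + \begin{pmatrix} 0 \\ \beta\int_{\mathbb{R}}z\tilde N(dt,dz) \end{pmatrix} + \begin{pmatrix} 0 \\ -(1+\lambda) \end{pmatrix}d\xi(t). \]
The generator if $\xi = 0$ is
\[ A\phi = \frac{\partial\phi}{\partial s} + \alpha\frac{\partial\phi}{\partial x} + \frac12\sigma^2\frac{\partial^2\phi}{\partial x^2} + \int_{\mathbb{R}}\Big\{\phi(s,x + \beta z) - \phi(s,x) - \beta z\frac{\partial\phi}{\partial x}(s,x)\Big\}\nu(dz). \]
The non-intervention region $D$ is described by (see (8.2.5))
\[ D = \Big\{(s,x);\ \sum_{i=1}^{k}\kappa_{ij}(y)\frac{\partial\phi}{\partial y_i} + \theta_j < 0 \text{ for all } j = 1,\ldots,p\Big\} = \Big\{(s,x);\ -(1+\lambda)\frac{\partial\phi}{\partial x}(s,x) + e^{-\rho s} < 0\Big\}. \]
If we guess that $D$ has the form $D = \{(s,x);\ 0 < x < x^*\}$ for some $x^* > 0$ then by Theorem 8.2 we should have
\[ A\phi(s,x) = 0 \quad \text{for } 0 < x < x^*. \]
We try a solution $\phi$ of the form
\[ \phi(s,x) = e^{-\rho s}\psi(x) \]
and get
\[ A_0\psi(x) := -\rho\psi(x) + \alpha\psi'(x) + \tfrac12\sigma^2\psi''(x) + \int_{\mathbb{R}}\{\psi(x + \beta z) - \psi(x) - \beta z\psi'(x)\}\nu(dz) = 0. \]
We now choose $\psi(x) = e^{rx}$ for some constant $r \in \mathbb{R}$ and get the equation
\[ h(r) := -\rho + \alpha r + \tfrac12\sigma^2r^2 + \int_{\mathbb{R}}\big\{e^{r\beta z} - 1 - r\beta z\big\}\nu(dz) = 0. \]
Since $h(0) < 0$ and $\lim_{r\to\infty}h(r) = \lim_{r\to-\infty}h(r) = \infty$, we see that the equation $h(r) = 0$ has two solutions $r_1$, $r_2$ such that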

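\[ r_2 < 0 < r_1. \]
Outside $D$ we require that $-(1+\lambda)\psi'(x) + 1 = 0$ or
\[ \psi(x) = \frac{x}{1+\lambda} + C_3, \quad C_3 \text{ constant}. \]
Hence we put
\[ \psi(x) = \begin{cases} C_1e^{r_1x} + C_2e^{r_2x}; & 0 < x < x^* \\ \dfrac{x}{1+\lambda} + C_3; & x^* \le x \end{cases} \tag{14.8.1} \]
where $C_1$, $C_2$ are constants. To determine $C_1$, $C_2$, $C_3$ and $x^*$ we have the four equations:
\[ \psi(0) = 0 \ \Rightarrow\ C_1 + C_2 = 0. \tag{14.8.2} \]
Put $C_2 = -C_1$.
\[ \psi \text{ continuous at } x = x^* \ \Rightarrow\ C_1\big(e^{r_1x^*} - e^{r_2x^*}\big) = \frac{x^*}{1+\lambda} + C_3 \tag{14.8.3} \]
\[ \psi \in C^1 \text{ at } x = x^* \ \Rightarrow\ C_1\big(r_1e^{r_1x^*} - r_2e^{r_2x^*}\big) = \frac{1}{1+\lambda} \tag{14.8.4} \]
\[ \psi \in C^2 \text{ at } x = x^* \ \Rightarrow\ C_1\big(r_1^2e^{r_1x^*} - r_2^2e^{r_2x^*}\big) = 0. \tag{14.8.5} \]
From (14.8.4) and (14.8.5) we deduce that
\[ x^* = \frac{2(\ln|r_2| - \ln r_1)}{r_1 - r_2}. \tag{14.8.6} \]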
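For orientation, the constants in (14.8.1)–(14.8.6) can be computed explicitly in the special case without jumps ($\nu = 0$), where $h(r) = -\rho + \alpha r + \frac12\sigma^2r^2$ is a quadratic. This restriction, the function name `dividend_barrier` and the parameter values are assumptions made for the sketch; with jumps, $r_1$ and $r_2$ would have to be found numerically from $h(r) = 0$.

```python
import math


def dividend_barrier(alpha, sigma, rho, lam):
    """Roots r1, r2 of h(r) = 0 for nu = 0, barrier x* from (14.8.6), and C1, C3."""
    disc = math.sqrt(alpha ** 2 + 2.0 * sigma ** 2 * rho)
    r1 = (-alpha + disc) / sigma ** 2   # positive root
    r2 = (-alpha - disc) / sigma ** 2   # negative root
    x_star = 2.0 * (math.log(abs(r2)) - math.log(r1)) / (r1 - r2)
    # C1 from the C^1 condition (14.8.4), then C3 from continuity (14.8.3).
    c1 = 1.0 / ((1.0 + lam) * (r1 * math.exp(r1 * x_star) - r2 * math.exp(r2 * x_star)))
    c3 = c1 * (math.exp(r1 * x_star) - math.exp(r2 * x_star)) - x_star / (1.0 + lam)
    return r1, r2, x_star, c1, c3


# Hypothetical parameters; alpha > 0 gives |r2| > r1 and hence x* > 0.
r1, r2, x_star, c1, c3 = dividend_barrier(alpha=0.1, sigma=0.3, rho=0.05, lam=0.1)
print(r1, r2, x_star, c1, c3)
```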

Then by (14.8.4) we get the value for $C_1$, and hence the value of $C_3$ by (14.8.3). With these values of $C_1$, $C_2$, $C_3$ and $x^*$ we must verify that $\phi(s,x) = e^{-\rho s}\psi(x)$ satisfies all the requirements of Theorem 8.2:

(i) We have constructed $\phi$ such that $A\phi + f = 0$ in $D$. Outside $D$, i.e., for $x \ge x^*$, we have
\[ e^{\rho s}(A\phi(s,x) + f(s,x)) = A_0\psi(x) = -\rho\Big(\frac{x}{1+\lambda} + C_3\Big) + \alpha\cdot\frac{1}{1+\lambda} + \int_{x+\beta z<x^*}C_1\big(e^{r_1(x+\beta z)} - e^{r_2(x+\beta z)}\big)\cdots \]
[…] $F'(x) > F'(x^*) = 0$ for $x < x^*$. Hence $F(x) < 0$ for $0 < x < x^*$.

The conditions (iii), (iv), and (v) are left to the reader to verify.

(vi) This holds by construction of $\phi$.

(vii)–(x) These conditions claim the existence of an increasing process $\hat\xi$ such that $Y^{\hat\xi}(t)$ stays in $\bar D$ for all times $t$, $\hat\xi(t)$ is strictly increasing only when $Y(t) \notin D$, and if $Y(t) \notin \bar D$ then $\hat\xi(t)$ brings $Y(t)$ down to a point on $\partial D$. Such a singular control is called a local time at $\partial D$ of the process $Y(t)$ reflected downwards at $\partial D$. The existence and uniqueness of such a local time is proved in [CEM].

(xi) This is left to the reader.

We conclude that the optimal dividend policy $\xi^*(t)$ is to take out exactly the amount of money needed to keep $X(t)$ on or below the value $x^*$. If $X(t) < x^*$ we take out nothing. If $X(t) > x^*$ we take out $X(t) - x^*$.

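Exercise 8.2 It suffices to prove that the function
\[ \Phi_0(s,x_1,x_2) := Ke^{-\delta s}(x_1 + x_2)^\gamma \]
satisfies conditions (i)–(iv) of Theorem 8.2. In this case we have (see Sect. 8.3)
\[ \begin{aligned} A^{(v)}\Phi_0(y) = A^{(c)}\Phi_0(y) &= \frac{\partial\Phi_0}{\partial s} + (rx_1 - c)\frac{\partial\Phi_0}{\partial x_1} + \alpha x_2\frac{\partial\Phi_0}{\partial x_2} + \frac12\beta^2x_2^2\frac{\partial^2\Phi_0}{\partial x_2^2} \\ &\quad + \int_{\mathbb{R}}\Big\{\Phi_0(s,x_1,x_2 + x_2z) - \Phi_0(s,x_1,x_2) - x_2z\frac{\partial\Phi_0}{\partial x_2}(s,x_1,x_2)\Big\}\nu(dz) \end{aligned} \]
and $f(s,x_1,x_2,c) = e^{-\delta s}\dfrac{c^\gamma}{\gamma}$, so condition (i) becomes

(i)' $A^{(c)}\Phi_0(s,x_1,x_2) + e^{-\delta s}\dfrac{c^\gamma}{\gamma} \le 0$ for all $c \ge 0$.

This holds because we know by Example 5.2 that (see (5.1.23))
\[ \sup_{c\ge0}\Big\{A^{(c)}\Phi(s,x_1,x_2) + e^{-\delta s}\frac{c^\gamma}{\gamma}\Big\} = 0. \]
Since in this case $\theta = 0$ and
\[ \kappa = \begin{pmatrix} -(1+\lambda) & 1-\mu \\ 1 & -1 \end{pmatrix} \]
we see that condition (ii) of Theorem 8.2 becomes

(ii)' $-(1+\lambda)\dfrac{\partial\Phi_0}{\partial x_1} + \dfrac{\partial\Phi_0}{\partial x_2} \le 0$
(ii)'' $(1-\mu)\dfrac{\partial\Phi_0}{\partial x_1} - \dfrac{\partial\Phi_0}{\partial x_2} \le 0$.

Since
\[ \frac{\partial\Phi_0}{\partial x_1} = \frac{\partial\Phi_0}{\partial x_2} = Ke^{-\delta s}\gamma(x_1 + x_2)^{\gamma-1} \]
we see that (ii)' and (ii)'' hold trivially. We leave the verification of conditions (iii)–(v) to the reader.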

14.9 Exercises of Chap. 9

Exercise 9.1 By using the same notation as in Chap. 6, we have here
\[ Y^{(v)}(t) = \begin{pmatrix} s + t \\ X^{(v)}(t) \end{pmatrix};\ t \ge 0; \qquad Y^{(v)}(0^-) = \begin{pmatrix} s \\ x \end{pmatrix} = y \in \mathbb{R}^2 \]
\[ \Gamma(y,\zeta) = \Gamma(s,x,\zeta) = \begin{pmatrix} s \\ x + \zeta \end{pmatrix}; \quad (s,x,\zeta) \in \mathbb{R}^3 \]
\[ K(y,\zeta) = K(s,x,\zeta) = e^{-\rho s}(c + \lambda|\zeta|) \]
\[ f(y) = f(s,x) = e^{-\rho s}x^2, \quad g(y) = 0. \]
By symmetry we expect the continuation region to be of the form
\[ D = \{(s,x) : -\bar x < x < \bar x\} \]
for some $\bar x > 0$, to be determined. As soon as $X(t)$ reaches the unknown value $\bar x$ or $-\bar x$, there is an intervention and $X(t)$ is brought down (or up) to a certain value $\hat x$ (or $-\hat x$) where $-\bar x < -\hat x < 0 < \hat x < \bar x$. We determine $\bar x$ and $\hat x$ in the following computations (Fig. 14.4). We guess that the value function is of the form

Fig. 14.4 The optimal strategy of Exercise 9.1 (levels $\bar x$, $\hat x$, $0$, $-\hat x$, $-\bar x$)

\[ \phi(s,x) = e^{-\rho s}\psi(x). \]
In the continuation region $D$, we have by Theorem 9.2 (x)
\[ A\phi + f = 0, \tag{14.9.1} \]
where $A$ is the generator of $Y$, i.e.,
\[ A\phi(s,x) = \frac{\partial\phi}{\partial s} + \frac12\frac{\partial^2\phi}{\partial x^2} + \int_{\mathbb{R}}\Big\{\phi(s,x + \theta(x,z)) - \phi(s,x) - \theta(x,z)\frac{\partial\phi}{\partial x}(s,x)\Big\}\nu(dz). \]
In this case, if $|x| < \xi$, (14.9.1) becomes, by (9.3.1),
\[ A_0\psi(x) + f(x) := -\rho\psi(x) + \frac12\psi''(x) + x^2 = 0. \]
We try a solution of the form
\[ \psi(x) = C\cosh(\gamma x) + \frac{1}{\rho}x^2 + \frac{1}{\rho^2} \]
where $C$ is a constant (to be determined) and $\gamma = \sqrt{2\rho}$ is the positive solution of the equation
\[ F(\gamma) := -\rho + \frac12\gamma^2 = 0. \]
We try to set $C = -a$ where $a > 0$. Define
\[ \psi_0(x) := \frac{1}{\rho}x^2 + \frac{1}{\rho^2} - a\cosh(\gamma x) \]
and put $\psi(x) = \psi_0(x)$; $x \in D$. We recall that
\[ D = \{(s,x) : \phi(s,x) < M\phi(s,x)\} = \{x : \psi(x) < M\psi(x)\} \]
and the intervention operator is in this case
\[ M\psi(x) = \inf\{\psi(x + \zeta) + c + \lambda|\zeta|;\ \zeta \in \mathbb{R}\}. \]

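The first-order condition for a minimum $\hat\zeta = \hat\zeta(x)$ of the function
\[ G(\zeta) = \begin{cases} \psi(x + \zeta) + c + \lambda\zeta; & \zeta > 0 \\ \psi(x + \zeta) + c - \lambda\zeta; & \zeta < 0 \end{cases} \]
is
(i) $\zeta > 0$: $\psi'(x + \zeta) + \lambda = 0 \Rightarrow \psi'(x + \zeta) = -\lambda$
(ii) $\zeta < 0$: $\psi'(x + \zeta) - \lambda = 0 \Rightarrow \psi'(x + \zeta) = \lambda$.
Hence we look for points $\hat x$, $\bar x$ such that $-\bar x < -\hat x < 0 < \hat x < \bar x$ and, since $\hat x$ is reached from $\bar x$ with $\zeta < 0$ and $-\hat x$ is reached from $-\bar x$ with $\zeta > 0$,
\[ \psi'(\hat x) = \lambda, \qquad \psi'(-\hat x) = -\lambda. \tag{14.9.2} \]
Note that since $\hat x < \bar x$, $\psi'(\hat x) = \psi_0'(\hat x)$. Arguing as in Example 9.5, we put
\[ \psi(x) = \begin{cases} \psi_0(x); & -\bar x \le x \le \bar x \\ \psi_0(\hat x) + c + \lambda(x - \hat x); & x > \bar x \\ \psi_0(-\hat x) + c - \lambda(x + \hat x); & x < -\bar x. \end{cases} \tag{14.9.3} \]
We have to show that there exist $0 < \hat x < \bar x$ and a value of $a$ such that $\phi(s,x) := e^{-\rho s}\psi(x)$ satisfies all the requirements of (the minimum version of) Theorem 9.2. By symmetry we may assume $x > 0$ and $\zeta > 0$ in the following. Continuity at $x = \bar x$ gives the equation
\[ \psi_0(\hat x) + c + \lambda(\bar x - \hat x) = \psi_0(\bar x). \]
Differentiability at $x = \bar x$ gives the equation
\[ \lambda = \psi_0'(\bar x). \]
Substituting for $\psi_0$ these equations give
\[ \frac{\hat x^2}{\rho} - a\cosh(\gamma\hat x) - \lambda\hat x + c = \frac{\bar x^2}{\rho} - a\cosh(\gamma\bar x) - \lambda\bar x \tag{14.9.4} \]
and
\[ \lambda = \frac{2\bar x}{\rho} - a\gamma\sinh(\gamma\bar x). \tag{14.9.5} \]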

In addition we have required
\[ \lambda = \psi_0'(\hat x) = \frac{2\hat x}{\rho} - a\gamma\sinh(\gamma\hat x). \tag{14.9.6} \]
As in Example 9.5 one can prove that for each $c > 0$ there exist $a = a^*(c) > 0$, $\hat x = \hat x(c) > 0$ and $\bar x = \bar x(c) > \hat x$ such that (14.9.4)–(14.9.6) hold. With these values of $a$, $\hat x$ and $\bar x$ it remains to verify that the conditions of Theorem 9.2 hold. We check some of them:

(ii) $\psi \le M\psi = \inf\{\psi(x - \zeta) + c + \lambda\zeta;\ \zeta > 0\}$. First suppose $x \ge \bar x$. If $x - \zeta \ge \bar x$ then
\[ \psi(x - \zeta) + c + \lambda\zeta = \psi_0(\hat x) + c + \lambda(x - \zeta - \hat x) + c + \lambda\zeta = c + \psi(x) > \psi(x). \]
If $0 < x - \zeta < \bar x$ then
\[ \psi(x - \zeta) + c + \lambda\zeta = \psi_0(x - \zeta) + c + \lambda\zeta, \]
which is minimal when
\[ -\psi_0'(x - \zeta) + \lambda = 0 \]
i.e., when
\[ \zeta = \hat\zeta = x - \hat x. \]
This is the minimum point because $\psi_0''(\hat x) > 0$. See Fig. 14.5. This shows that
\[ M\psi(x) = \psi(x - \hat\zeta) + c + \lambda\hat\zeta = \psi(\hat x) + c + \lambda(x - \hat x) = \psi(x) \quad \text{for } x > \bar x. \]
Next suppose $0 < x < \bar x$. Then
\[ M\psi(x) = \psi_0(\hat x) + c + \lambda(x - \hat x) > \psi(x) \]
if and only if
\[ \psi(x) - \lambda x < \psi(\hat x) - \lambda\hat x + c. \]

Fig. 14.5 The function $\psi(x)$ for $x > 0$ (with $\psi'(\hat x) = \lambda$ and $\psi'(\bar x) = \lambda$ marked at $\hat x$ and $\bar x$)

Now the minimum of
\[ H(x) := \psi(x) - \lambda x \quad \text{for } 0 < x < \bar x \]
is attained when
\[ \psi'(x) = \lambda \]
i.e. when $x = \hat x$. Therefore
\[ \psi(x) - \lambda x \le \psi(\hat x) - \lambda\hat x < \psi(\hat x) - \lambda\hat x + c. \]
This shows that $M\psi(x) > \psi(x)$ for all $0 < x < \bar x$. Combined with the above we can conclude that
\[ M\psi(x) \ge \psi(x) \quad \text{for all } x > 0, \]
which proves (ii). Moreover,
\[ M\psi(x) > \psi(x) \quad \text{if and only if } 0 < x < \bar x. \]
Hence

\[ D \cap (0,\infty) = (0,\bar x). \]
Finally we verify

(vi) $A\phi + f \ge 0$ for $x > \bar x$. For $x > \bar x$, we have, if $\bar x \le \xi$ (using (9.3.2)),
\[ A_0\psi(x) + f(x) = -\rho(\psi_0(\hat x) + c + \lambda(x - \hat x)) + x^2. \]
This is nonnegative for all $x > \bar x$ iff it is nonnegative for $x = \bar x$, i.e., iff
\[ -\rho\psi_0(\bar x) + \bar x^2 \ge 0. \tag{14.9.7} \]
By construction of $\psi_0$ we know that, for $x < \bar x$,
\[ -\rho\psi_0(x) + \frac12\psi_0''(x) + \int_{\mathbb{R}}\{\psi_0(x + z) - \psi_0(x) - z\psi_0'(x)\}\nu(dz) + x^2 = 0. \]
Therefore (14.9.7) holds iff
\[ \frac12\psi_0''(\bar x) + \int_{\mathbb{R}}\{\psi_0(\bar x + z) - \psi_0(\bar x) - z\psi_0'(\bar x)\}\nu(dz) \le 0. \]
For this it suffices that
\[ \int_{\mathbb{R}}z^2\nu(dz) \le -\frac{\rho}{2}\psi_0''(\bar x). \tag{14.9.8} \]

Conclusion Suppose $\bar x \le \xi$ and that (14.9.8) holds. Then $\Phi(s,x) = e^{-\rho s}\psi(x)$, with $\psi(x)$ given by (14.9.3) and $a$, $\hat x$, $\bar x$ given by (14.9.4)–(14.9.6). The optimal impulse control is to do nothing while $|X(t)| < \bar x$, then move $X(t)$ down to $\hat x$ (respectively, up to $-\hat x$) as soon as $X(t)$ reaches a value $\ge \bar x$ (respectively, a value $\le -\bar x$).

Exercise 9.2 Here we put
\[ Y^{(v)}(t) = \begin{pmatrix} s + t \\ X^{(v)}(t) \end{pmatrix}, \qquad Y^{(v)}(0^-) = \begin{pmatrix} s \\ x \end{pmatrix} = y \]
\[ \Gamma(y,\zeta) = x - c - (1+\lambda)\zeta \]
\[ K(y,\zeta) = e^{-\rho s}\zeta \]
\[ f \equiv g \equiv 0 \]
\[ S = \{(s,x) : x > 0\}. \]

We guess that the value function $\phi$ is of the form
\[ \phi(s,x) = e^{-\rho s}\psi(x) \]
and consider the intervention operator
\[ M\psi(x) = \sup\Big\{\psi(x - c - (1+\lambda)\zeta) + \zeta;\ 0 \le \zeta \le \frac{x - c}{1+\lambda}\Big\}. \tag{14.9.9} \]
Note that the condition on $\zeta$ is due to the fact that the impulse must be positive and $x - c - (1+\lambda)\zeta$ must belong to $S$. We distinguish between two cases:

(1) $\mu > \rho$. In this case, suppose we wait until time $t_1$ and then take out
\[ \zeta_1 = \frac{X(t_1) - c}{1+\lambda}. \]
The corresponding value is
\[ J^{(v_1)}(s,x) = E^x\Big[\frac{e^{-\rho(t_1+s)}}{1+\lambda}(X(t_1) - c)\Big] = \frac{1}{1+\lambda}\big(xe^{-\rho s}e^{(\mu-\rho)t_1} - ce^{-\rho(s+t_1)}\big) \to \infty \quad \text{as } t_1 \to \infty. \]
Therefore we obtain $\Phi(s,x) = +\infty$ in this case.

(2) $\mu < \rho$. We look for a solution by using the results of Theorem 9.2. In this case condition (x) becomes
\[ A_0\psi(x) := -\rho\psi(x) + \mu x\psi'(x) + \tfrac12\sigma^2x^2\psi''(x) + \int_{\mathbb{R}}\{\psi(x + \theta xz) - \psi(x) - \theta xz\psi'(x)\}\nu(dz) = 0 \quad \text{in } D. \tag{14.9.10} \]
We try a solution of the form
\[ \psi(x) = C_1x^{\gamma_1} + C_2x^{\gamma_2}, \]
where $\gamma_1 > 1$, $\gamma_2 < 0$ are the solutions of the equation
\[ F(\gamma) := -\rho + \mu\gamma + \frac12\sigma^2\gamma(\gamma-1) + \int_{\mathbb{R}}\{(1+\theta z)^\gamma - 1 - \theta z\gamma\}\nu(dz) = 0. \]
We guess that the continuation region is of the form

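\[ D = \{(s,x) : 0 < x < \bar x\} \]
for some $\bar x > 0$ (to be determined). We see that $C_2 = 0$, because otherwise $\lim_{x\to0}|\psi(x)| = \infty$.

We guess that in this case it is optimal to wait till $X(t)$ reaches or exceeds a value $\bar x > c$ and then take out as much as possible, i.e., reduce $X(t)$ to 0. Taking the transaction costs into account this means that we should take out
\[ \hat\zeta(x) = \frac{x - c}{1+\lambda} \quad \text{for } x \ge \bar x. \]
We therefore propose that $\psi(x)$ has the form
\[ \psi(x) = \begin{cases} C_1x^{\gamma_1} & \text{for } 0 < x < \bar x \\ \dfrac{x - c}{1+\lambda} & \text{for } x \ge \bar x. \end{cases} \]
Continuity and differentiability of $\psi(x)$ at $x = \bar x$ give the equations
\[ C_1\bar x^{\gamma_1} = \frac{\bar x - c}{1+\lambda} \quad\text{and}\quad C_1\gamma_1\bar x^{\gamma_1-1} = \frac{1}{1+\lambda}. \]
Combining these we get
\[ \bar x = \frac{\gamma_1c}{\gamma_1 - 1} \quad\text{and}\quad C_1 = \frac{\bar x - c}{1+\lambda}\,\bar x^{-\gamma_1}. \]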
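A small numerical sketch may help to see how $\bar x$ and $C_1$ depend on the parameters. It is restricted to the case without jumps ($\nu = 0$), where $F(\gamma) = 0$ is a quadratic with positive root $\gamma_1$; this restriction, the function name `harvest_threshold` and the sample values are assumptions made for the illustration.

```python
import math


def harvest_threshold(mu, sigma, rho, c, lam):
    """gamma_1 > 1 (root of F for nu = 0, mu < rho), threshold x_bar and constant C1."""
    a = 0.5 * sigma ** 2
    bq = mu - a
    # Positive root of a * g^2 + (mu - a) * g - rho = 0.
    gamma1 = (-bq + math.sqrt(bq ** 2 + 4.0 * a * rho)) / (2.0 * a)
    x_bar = gamma1 * c / (gamma1 - 1.0)
    c1 = (x_bar - c) / (1.0 + lam) * x_bar ** (-gamma1)
    return gamma1, x_bar, c1


# Hypothetical parameters with mu < rho.
gamma1, x_bar, c1 = harvest_threshold(mu=0.03, sigma=0.3, rho=0.08, c=1.0, lam=0.1)
print(gamma1, x_bar, c1)
```

As $\gamma_1 \downarrow 1$ the threshold $\bar x = \gamma_1c/(\gamma_1-1)$ grows without bound, while a larger fixed cost $c$ scales $\bar x$ proportionally.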

With these values of $\bar x$ and $C_1$, we have to verify that $\psi$ satisfies all the requirements of Theorem 9.2. We check some of them:

(ii) $\psi \ge M\psi$ on $S$. Here $M\psi = \sup\Big\{\psi(x - c - (1+\lambda)\zeta) + \zeta;\ 0 \le \zeta \le \dfrac{x - c}{1+\lambda}\Big\}$. If $x - c - (1+\lambda)\zeta \ge \bar x$, then
\[ \psi(x - c - (1+\lambda)\zeta) + \zeta = \frac{x - 2c}{1+\lambda} < \frac{x - c}{1+\lambda} = \psi(x) \]
and if $x - c - (1+\lambda)\zeta < \bar x$ then
\[ h(\zeta) := \psi(x - c - (1+\lambda)\zeta) + \zeta = C_1(x - c - (1+\lambda)\zeta)^{\gamma_1} + \zeta. \]

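Since
\[ h'\Big(\frac{x - c}{1+\lambda}\Big) = 1 \quad\text{and}\quad h''(\zeta) > 0 \]
we see that the maximum value of $h(\zeta)$; $0 \le \zeta \le \dfrac{x - c}{1+\lambda}$, is attained at $\zeta = \hat\zeta(x) = \dfrac{x - c}{1+\lambda}$. Therefore
\[ M\psi(x) = \max\Big\{\frac{x - 2c}{1+\lambda}, \frac{x - c}{1+\lambda}\Big\} = \frac{x - c}{1+\lambda} \quad \text{for all } x > c. \]
Hence $M\psi(x) = \psi(x)$ for $x \ge \bar x$. For $0 < x < \bar x$ consider
\[ k(x) := C_1x^{\gamma_1} - \frac{x - c}{1+\lambda}. \]
Since
\[ k(\bar x) = k'(\bar x) = 0 \quad\text{and}\quad k''(x) > 0 \text{ for all } x, \]
we conclude that $k(x) > 0$ for $0 < x < \bar x$. Hence $\psi(x) > M\psi(x)$ for $0 < x < \bar x$.

(vi) $A_0\psi(x) \le 0$ for $x \in S\setminus\bar D$, i.e., for $x > \bar x$. For $x > \bar x$, we have
\[ A_0\psi(x) = -\rho\frac{x - c}{1+\lambda} + \mu x\cdot\frac{1}{1+\lambda} + \int_{x+\theta xz<\bar x}\Big\{C_1(x + \theta xz)^{\gamma_1} - \frac{x + \theta xz - c}{1+\lambda}\Big\}\nu(dz) \]
[…] $A_0\psi(x) \le 0$ for all $x > \bar x$
\[ \begin{aligned} &\Leftrightarrow (\mu - \rho)x + (\rho + \|\nu\|)c \le 0 \text{ for all } x > \bar x \\ &\Leftrightarrow (\mu - \rho)\bar x + (\rho + \|\nu\|)c \le 0 \\ &\Leftrightarrow \bar x \ge \frac{(\rho + \|\nu\|)c}{\rho - \mu} \\ &\Leftrightarrow \frac{\gamma_1c}{\gamma_1 - 1} \ge \frac{(\rho + \|\nu\|)c}{\rho - \mu} \\ &\Leftrightarrow \gamma_1 \le \frac{\rho + \|\nu\|}{\mu + \|\nu\|}. \end{aligned} \]
Since
\[ F\Big(\frac{\rho}{\mu}\Big) \ge -\rho + \mu\frac{\rho}{\mu} + \frac12\sigma^2\frac{\rho}{\mu}\Big(\frac{\rho}{\mu} - 1\Big) > 0 \]
and $F(\gamma_1) = 0$, $\gamma_1 > 1$ we conclude that $\gamma_1 < \dfrac{\rho}{\mu}$ and hence (vi) holds if $\|\nu\|$ is small enough.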

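Exercise 9.3 Here $f = g = 0$, $\Gamma(y,\zeta) = (s,0)$, $K(y,\zeta) = -c + (1-\lambda)x$ and $S = \mathbb{R}^2$; $y = (s,x)$. If there are no interventions, the process $Y(t)$ defined by
\[ dY(t) = \begin{pmatrix} dt \\ dX(t) \end{pmatrix} = \begin{pmatrix} 1 \\ \mu \end{pmatrix}dt + \begin{pmatrix} 0 \\ \sigma \end{pmatrix}dB(t) + \begin{pmatrix} 0 \\ \int_{\mathbb{R}}\theta z\tilde N(dt,dz) \end{pmatrix} \]
has the generator
\[ A\phi(y) = \frac{\partial\phi}{\partial s} + \mu\frac{\partial\phi}{\partial x} + \tfrac12\sigma^2\frac{\partial^2\phi}{\partial x^2} + \int_{\mathbb{R}}\Big\{\phi(s,x + \theta z) - \phi(s,x) - \theta z\frac{\partial\phi}{\partial x}(s,x)\Big\}\nu(dz); \quad y = (s,x). \]
The intervention operator $M$ is given by
\[ M\phi(y) = \sup\{\phi(\Gamma(y,\zeta)) + K(y,\zeta);\ \zeta \in \mathcal{Z} \text{ and } \Gamma(y,\zeta) \in S\} = \phi(s,0) + (1-\lambda)x - c. \]
If we try
\[ \phi(s,x) = e^{-\rho s}\psi(x), \]
we get that
\[ A\phi(s,x) = e^{-\rho s}A_0\psi(x), \]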

where
\[ A_0\psi(x) = -\rho\psi + \mu\psi'(x) + \tfrac12\sigma^2\psi''(x) + \int_{\mathbb{R}}\{\psi(x + \theta z) - \psi(x) - \theta z\psi'(x)\}\nu(dz) \]
and
\[ M\phi(s,x) = e^{-\rho s}M_0\psi(x), \]
where
\[ M_0\psi(x) = \psi(0) + (1-\lambda)x - c. \]
We guess that the continuation region $D$ has the form
\[ D = \{(s,x);\ x < x^*\} \]
for some $x^* > 0$ to be determined. To find a solution $\psi_0$ of $A_0\psi_0 + f = A_0\psi_0 = 0$, we try
\[ \psi_0(x) = e^{rx} \quad (r \text{ constant}) \]
and get
\[ A_0\psi_0(x) = -\rho e^{rx} + \mu re^{rx} + \tfrac12\sigma^2r^2e^{rx} + \int_{\mathbb{R}}\big\{e^{r(x+\theta z)} - e^{rx} - r\theta ze^{rx}\big\}\nu(dz) = e^{rx}h(r) = 0, \]
where
\[ h(r) = -\rho + \mu r + \tfrac12\sigma^2r^2 + \int_{\mathbb{R}}\big\{e^{r\theta z} - 1 - r\theta z\big\}\nu(dz). \]
Choose $r_1 > 0$ such that $h(r_1) = 0$ (see the solution of Exercise 3.1). Then we define
\[ \psi(x) = \begin{cases} Me^{r_1x}; & x < x^* \\ \psi(0) + (1-\lambda)x - c = M + (1-\lambda)x - c; & x \ge x^* \end{cases} \tag{14.9.11} \]

Fig. 14.6 The function $F$ (graph of $F(x)$ for $x \le x^*$)

for some constant $M = \psi(0) > 0$. If we require continuity and differentiability at $x = x^*$ we get the equations
\[ Me^{r_1x^*} = M + (1-\lambda)x^* - c \tag{14.9.12} \]
and
\[ Mr_1e^{r_1x^*} = 1 - \lambda. \tag{14.9.13} \]
This gives the following equations for $x^*$ and $M$:
\[ k(x^*) := e^{-r_1x^*} + r_1x^* - 1 - \frac{r_1c}{1-\lambda} = 0, \qquad M = \frac{1-\lambda}{r_1}e^{-r_1x^*} > 0. \tag{14.9.14} \]
Since $k(0) = -\dfrac{r_1c}{1-\lambda} < 0$ and $\lim_{x\to\infty}k(x) = \infty$, we see that there exists $x^* > 0$ s.t. $k(x^*) = 0$. We must verify that with these values of $x^*$ and $M$ the conditions of Theorem 9.2 are satisfied. We consider some of them:

(ii) $\psi(x) \ge M_0\psi(x)$. For $x \ge x^*$ we have $\psi(x) = M_0\psi(x) = M + (1-\lambda)x - c$. For $x < x^*$ we have $\psi(x) = Me^{r_1x}$ and $M_0\psi(x) = M + (1-\lambda)x - c$. Define
\[ F(x) = Me^{r_1x} - (M + (1-\lambda)x - c); \quad x \le x^*. \]
See Fig. 14.6. We have $F(x^*) = F'(x^*) = 0$ and $F''(x) = Mr_1^2e^{r_1x} > 0$. Hence $F'(x) < 0$ and so $F(x) > 0$ for $x < x^*$. Therefore $\psi(x) \ge M_0\psi(x)$ for all $x$.
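The threshold $x^*$ in (14.9.14) is defined only implicitly by $k(x^*) = 0$. The following sketch finds it by bisection and then evaluates $M$. The function name `cutting_threshold` and the parameter values are hypothetical, and $r_1$ is treated as a given positive root of $h(r_1) = 0$ rather than being computed from $\nu$.

```python
import math


def cutting_threshold(r1, c, lam, tol=1e-12):
    """Solve k(x) = exp(-r1 x) + r1 x - 1 - r1 c / (1 - lam) = 0 for x > 0, then get M."""
    target = r1 * c / (1.0 - lam)

    # k is increasing for x > 0 since k'(x) = r1 * (1 - exp(-r1 x)) > 0, and k(0) < 0.
    def k(x):
        return math.exp(-r1 * x) + r1 * x - 1.0 - target

    lo, hi = 0.0, 1.0
    while k(hi) < 0.0:      # expand the bracket until the sign changes
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if k(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    x_star = 0.5 * (lo + hi)
    M = (1.0 - lam) / r1 * math.exp(-r1 * x_star)
    return x_star, M


x_star, M = cutting_threshold(r1=0.5, c=0.1, lam=0.1)  # hypothetical inputs
print(x_star, M)
```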

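(vi) $A_0\psi \le 0$ for $x > x^*$: For $x > x^*$ we have
$$A_0\psi(x) = -\rho[M + (1-\lambda)x - c] + \mu(1-\lambda) + \int_{x+\theta z < x^*}\big\{M e^{r_1(x+\theta z)} - (M + (1-\lambda)(x+\theta z) - c)\big\}\,\nu(dz).$$
$$\begin{aligned}
A_0\psi(x) \le 0 \text{ for all } x > x^* &\iff x \ge \frac{c-M}{1-\lambda} + \frac{\mu}{\rho} + \frac{c\nu}{\rho(1-\lambda)} \text{ for all } x > x^* \\
&\iff x^* \ge \frac{c-M}{1-\lambda} + \frac{\mu}{\rho} + \frac{c\nu}{\rho(1-\lambda)} \\
&\iff x^* \ge \frac{c}{1-\lambda} - \frac{1}{r_1}e^{-r_1 x^*} + \frac{\mu}{\rho} + \frac{c\nu}{\rho(1-\lambda)} \\
&\iff e^{-r_1 x^*} + r_1 x^* - \frac{c}{1-\lambda} \ge \frac{\mu}{\rho} + \frac{c\nu}{\rho(1-\lambda)} \\
&\iff 1 \ge \frac{\mu}{\rho} + \frac{c\nu}{\rho(1-\lambda)} \\
&\iff \mu + \frac{c\nu}{1-\lambda} \le \rho.
\end{aligned}$$
So we need to assume that $\mu + \frac{c\nu}{1-\lambda} \le \rho$ for (vi) to hold.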

Conclusion Let
$$\phi(s,x) = e^{-\rho s}\psi(x)$$
where $\psi$ is given by (14.9.11). Assume that
$$\mu + \frac{c\nu}{1-\lambda} \le \rho.$$
Then
$$\phi(s,x) = \sup_v J^{(v)}(s,x)$$
and the optimal strategy is to cut the forest every time the biomass reaches the value $x^*$ (see Fig. 14.7).

Fig. 14.7 The optimal forest management of Exercise 9.3

14.10 Exercises of Chap. 10

Exercise 10.1 As in Exercise 9.3, we have $f = g = 0$, $\Gamma(y,\zeta) = (s,0)$; $y = (s,x)$, $K(y,\zeta) = (1-\lambda)x - c$, $\mathcal{S} = [0,\infty)\times\mathbb{R}$. If there is no intervention, then $\phi_0 \equiv 0$ and $\mathcal{M}\phi_0 = \sup\{(1-\lambda)\zeta - c;\ \zeta = x\} = (1-\lambda)x - c$. Hence
$$\phi_1(y) = \sup_{\tau \le \tau_{\mathcal{S}}} E^y[\mathcal{M}\phi_0(Y(\tau))] = \sup_{\tau \le \tau_{\mathcal{S}}} E^y\big[e^{-\rho(s+\tau)}((1-\lambda)X(\tau) - c)\big]. \tag{14.10.1}$$
This is an optimal stopping problem that can be solved by exploiting the three basic variational inequalities. We assume that the continuation region $D_1 = \{\phi_1 > \mathcal{M}\phi_0\}$ is of the form $D_1 = \{(s,x);\ x < x_1\}$ for some $x_1 > 0$ and that the value function has the form $\phi_1(s,x) = e^{-\rho s}\psi_1(x)$ for some function $\psi_1$. On $D_1$, $\psi_1$ is the solution of

$$-\rho\psi_1(x) + \mu\psi_1'(x) + \tfrac{1}{2}\sigma^2\psi_1''(x) + \int_{\mathbb{R}}\big\{\psi_1(x+\theta z) - \psi_1(x) - \theta z\psi_1'(x)\big\}\,\nu(dz) = 0. \tag{14.10.2}$$
A solution of (14.10.2) is
$$\psi_1(x) = A e^{\gamma_1 x} + B e^{\gamma_2 x}$$
where $\gamma_2 < 0$ and $\gamma_1 > 1$, $A$ and $B$ arbitrary constants to be determined. We choose $B = 0$ and put $A_1 = A > 0$. We get
$$\psi_1(x) = \begin{cases} A_1 e^{\gamma_1 x} & x < x_1 \\ (1-\lambda)x - c & x \ge x_1. \end{cases}$$

We impose the continuity and differentiability conditions of $\psi_1$ at $x = x_1$.

(i) Continuity: $A_1 e^{\gamma_1 x_1} = (1-\lambda)x_1 - c$.

(ii) Differentiability: $A_1\gamma_1 e^{\gamma_1 x_1} = 1-\lambda$.

We get $A_1 = \frac{1-\lambda}{\gamma_1}e^{-\gamma_1 x_1}$ and $x_1 = \frac{1}{\gamma_1} + \frac{c}{1-\lambda}$.

As a second step, we evaluate
$$\phi_2(y) = \sup_{\tau} E^y[\mathcal{M}\phi_1(Y(\tau))].$$
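The first step can be reproduced in a few lines. The sketch below uses hypothetical values of $\gamma_1$, $\lambda$ and $c$ (in the exercise, $\gamma_1$ would come from the characteristic equation, which is not solved here). The helper `smooth_fit` returns the threshold and coefficient obtained from conditions (i) and (ii); calling it again with the cost $c$ replaced by $c - A_1$ anticipates the second iteration, whose reward has the same linear form shifted by $\psi_1(0) = A_1$.

```python
import math

def smooth_fit(gamma1, lam, cost):
    """Threshold x and coefficient A such that A*exp(gamma1*x) is pasted
    C^1 onto the linear reward (1 - lam)*x - cost at x."""
    x = 1.0 / gamma1 + cost / (1.0 - lam)
    A = (1.0 - lam) / gamma1 * math.exp(-gamma1 * x)
    return x, A

# Hypothetical values, for illustration only.
gamma1, lam, c = 1.5, 0.1, 0.5

x1, A1 = smooth_fit(gamma1, lam, c)          # one intervention allowed
x2, A2 = smooth_fit(gamma1, lam, c - A1)     # two interventions allowed

continuity_gap = A1 * math.exp(gamma1 * x1) - ((1.0 - lam) * x1 - c)
slope_gap = A1 * gamma1 * math.exp(gamma1 * x1) - (1.0 - lam)
print(x1, A1, x2, A2)
```

With these inputs the second threshold is lower and the second coefficient larger, in line with the ordering $x_2 < x_1$, $A_2 > A_1$.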

We suppose $\phi_2(s,x) = e^{-\rho s}\psi_2(x)$ and consider
$$\mathcal{M}\psi_1(x) = \sup\{\psi_1(0) + (1-\lambda)\zeta - c;\ \zeta \le x\} = \psi_1(0) + (1-\lambda)x - c = (1-\lambda)x + A_1 - c.$$
Hence
$$\phi_2(y) = \sup_{\tau \le \tau_{\mathcal{S}}} E^y\big[e^{-\rho(s+\tau)}((1-\lambda)X(\tau) - (c - A_1))\big]. \tag{14.10.3}$$
By the same argument as before, we get $\phi_2(s,x) = e^{-\rho s}\psi_2(x)$, where
$$\psi_2(x) = \begin{cases} A_2 e^{\gamma_1 x} & x < x_2 \\ (1-\lambda)x + A_1 - c & x \ge x_2 \end{cases}$$
where $x_2 = \frac{1}{\gamma_1} + \frac{c - A_1}{1-\lambda}$ and $A_2 = \frac{1-\lambda}{\gamma_1}e^{-\gamma_1 x_2}$. Note that $x_2 < x_1$ and $A_2 > A_1$. Since $\mathcal{M}\phi_0$ and $\mathcal{M}\phi_1$ have linear growth, the conditions of Theorem 10.2 are satisfied. Hence $\phi_1$ and $\phi_2$ are the solutions for our impulse control problems when respectively one intervention and two interventions are allowed. The impulses are given by $\zeta_1 = \zeta_2 = x$ and $\tau_1 = \inf\{t : X(t) \ge x_2\}$ and $\tau_2 = \inf\{t > \tau_1 : X(t) \ge x_1\}$.

Exercise 10.2 Here we have (see the notation of Chap. 9)
$$f = g \equiv 0, \qquad K(x,\zeta) = \zeta, \qquad \Gamma(x,\zeta) = x - (1+\lambda)\zeta - c, \qquad \mathcal{S} = \{(s,x);\ x > 0\}.$$
We put $y = (s,x)$ and suppose $\phi_0(s,x) = e^{-\rho s}\psi_0(x)$. Since $f = g = 0$ we have $\phi_0(y) = 0$ and
$$\mathcal{M}\psi_0(y) = \sup\Big\{\zeta : 0 \le \zeta \le \frac{x-c}{1+\lambda}\Big\} = \frac{(x-c)^+}{1+\lambda}.$$
As a second step, we consider

$$\phi_1(s,x) = \sup_{\tau \le \tau_{\mathcal{S}}} E^x[\mathcal{M}\phi_0(X(\tau))] = \sup_{\tau \le \tau_{\mathcal{S}}} E^x\Big[e^{-\rho(\tau+s)}\,\frac{(X(\tau+s) - c)^+}{1+\lambda}\Big]. \tag{14.10.4}$$
We distinguish between three cases.

(a) $\mu > \rho$. Then
$$\phi_1(s,x) \ge \frac{x e^{(\mu-\rho)(t+s)} - c e^{-\rho(t+s)}}{1+\lambda}.$$
Hence if $t \to +\infty$
$$\phi_1(s,x) \to +\infty.$$
We obtain $\mathcal{M}\phi_1(s,x) = +\infty$ and clearly $\phi_n = +\infty$ for all $n$. In this case, the optimal stopping time does not exist.

(b) $\mu < \rho$. In this case we try to put $\phi_1(s,x) = e^{-\rho s}\psi_1(x)$ and solve the optimal stopping problem (14.10.4) by using Theorem 3.2. We guess that the continuation region is of the form $D = \{0 < x < x_1^*\}$ and solve
$$L\psi_1(x) := -\rho\psi_1(x) + \mu x\psi_1'(x) + \frac{\sigma^2 x^2}{2}\psi_1''(x) + \int_{\mathbb{R}}\big\{\psi_1(x+\theta x z) - \psi_1(x) - \theta x z\psi_1'(x)\big\}\,\nu(dz) = 0. \tag{14.10.5}$$

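A solution of (14.10.5) is
$$\psi_1(x) = c_1 x^{\gamma_1} + c_2 x^{\gamma_2}$$
where $\gamma_2 < 0$ and $\gamma_1 > 1$ are solutions of the equation
$$k(\gamma) := -\rho + \mu\gamma + \tfrac{1}{2}\sigma^2\gamma(\gamma-1) + \int_{\mathbb{R}}\big\{(1+\theta z)^{\gamma} - 1 - \gamma\theta z\big\}\,\nu(dz) = 0$$
and $c_1, c_2$ are arbitrary constants. Since $\gamma_2 < 0$, we put $c_2 = 0$. We obtain
$$\psi_1(x) = \begin{cases} c_1 x^{\gamma_1} & 0 < x < x_1^* \\ \dfrac{x-c}{1+\lambda} & x \ge x_1^*. \end{cases}$$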
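As a rough illustration of how the exponent $\gamma_1 > 1$ might be located in practice, the sketch below again assumes a purely atomic Lévy measure (the list `atoms`, with $1 + \theta z > 0$ for every atom) and hypothetical parameters satisfying $\mu < \rho$. Because $k(1) = \mu - \rho < 0$ and $k$ is convex in $\gamma$, bisection started at $\gamma = 1$ finds the root.

```python
import math

def k(gamma, rho, mu, sigma, theta, atoms):
    """k(gamma) for nu given as point masses: atoms = [(z, mass), ...]."""
    jump = sum(m * ((1.0 + theta * z) ** gamma - 1.0 - gamma * theta * z)
               for z, m in atoms)
    return -rho + mu * gamma + 0.5 * sigma**2 * gamma * (gamma - 1.0) + jump

# Hypothetical values with mu < rho, i.e. case (b).
params = dict(rho=0.1, mu=0.05, sigma=0.3, theta=1.0,
              atoms=[(-0.2, 1.0), (0.3, 0.5)])

lo, hi = 1.0, 2.0
while k(hi, **params) <= 0.0:    # expand until k changes sign
    hi *= 2.0
while hi - lo > 1e-13:
    mid = 0.5 * (lo + hi)
    if k(mid, **params) > 0.0:
        hi = mid
    else:
        lo = mid
gamma1 = 0.5 * (lo + hi)
print("gamma1 =", gamma1)
```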

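By imposing the condition of continuity and differentiability, we can compute $c_1$ and $x_1^*$. The result is:

1. $x_1^* = \dfrac{\gamma_1 c}{\gamma_1 - 1}$.
2. $c_1 = \dfrac{1}{\gamma_1(1+\lambda)}\Big(\dfrac{\gamma_1 c}{\gamma_1 - 1}\Big)^{1-\gamma_1}$.

Note that $\gamma_1 > 1$ and $x_1^* > c$. We check some of the conditions of Theorem 3.2:

(ii) $\psi_1(x) \ge \mathcal{M}\psi_0(x)$ for all $x$: We know that $\psi_1(x) = \mathcal{M}\psi_0(x)$ for $x > x_1^*$. Consider $h_1(x) := \psi_1(x) - \mathcal{M}\psi_0(x)$. We have $h_1(x_1^*) = 0$, $h_1'(x_1^*) = 0$, $h_1''(x_1^*) = c_1\gamma_1(\gamma_1 - 1)(x_1^*)^{\gamma_1 - 2} > 0$. Hence $x_1^*$ is a minimum for $h_1$ and $\psi_1(x) \ge \mathcal{M}\psi_0(x)$ for every $0 < x < x_1^*$.

(vi) $L\psi_1 \le 0$ for all $x > 0$: Clearly $L\psi_1 = 0$ for $0 < x < x_1^*$. If $x > x_1^*$ then
$$L\psi_1(x) = \frac{1}{1+\lambda}\big((\mu-\rho)x + c\rho\big) + \int_{x+\theta x z < x_1^*}\Big\{c_1(x+\theta x z)^{\gamma_1} - \frac{x + \theta x z - c}{1+\lambda}\Big\}\,\nu(dz) \le 0$$

[…]

$$\hat{\tau}_D = \hat{\tau}_D(\omega) = \inf\Big\{t > 0;\ |B(t)| \ge \frac{x^*(t)}{\sigma}\Big\} \wedge T. \tag{14.11.8}$$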

The high contact condition for determination of the curve x ∗ (t) is then 

' c (  ∂w ∂w ∂w , (s, x ∗ (s)) = exp − 2 (s, 0), 0 . ∂s ∂x 2c ∂s

The suggested shape of x ∗ (t) is shown in Fig. 14.8.

(14.11.9)

Fig. 14.8 The suggested optimal combined control of Exercise 11.2

14.12 Exercises of Chap. 12

Exercise 12.1 Because of the symmetry of $h$, we assume that the continuation region is of the form
$$D = \{(s,x) : -x^* < x < x^*\}$$
with $x^* > 0$. We assume that the value function $\phi(s,x) = e^{-\rho s}\psi(x)$. On $D$, $\phi$ is the solution of
$$L\phi(s,x) = 0, \tag{14.12.1}$$
where $L = \frac{\partial}{\partial s} + \frac{1}{2}\frac{\partial^2}{\partial x^2}$. We obtain
$$L_0\psi(x) := -\rho\psi(x) + \tfrac{1}{2}\psi''(x) = 0. \tag{14.12.2}$$
The general solution of (14.12.2) is
$$\psi(x) = c_1 e^{\sqrt{2\rho}\,x} + c_2 e^{-\sqrt{2\rho}\,x}.$$
We must have $\psi(x) = \psi(-x)$, hence $c_1 = c_2$. We put $c_1 = \frac{1}{2}c$ and assume $c > 0$. We impose continuity and differentiability conditions at $x = x^*$ (Fig. 14.9):

(i) Continuity at $x = x^*$:
$$\tfrac{1}{2}c\big(e^{\sqrt{2\rho}\,x^*} + e^{-\sqrt{2\rho}\,x^*}\big) = K x^*.$$

(ii) Differentiability at $x = x^*$:
$$\tfrac{1}{2}c\sqrt{2\rho}\,\big(e^{x^*\sqrt{2\rho}} - e^{-\sqrt{2\rho}\,x^*}\big) = K.$$

Fig. 14.9 The function G(x)

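Then $x^*$ is the solution of
$$x^*\tanh\big(\sqrt{2\rho}\,x^*\big) = \frac{1}{\sqrt{2\rho}}$$
and
$$c = \frac{K x^*}{\cosh\big(\sqrt{2\rho}\,x^*\big)}.$$
We must check if $x^* < \frac{1}{K}$.

Case 1: We consider first the case when $x^* < \frac{1}{K}$. In this case
$$\psi(x) = \begin{cases} 1 & |x| \ge \frac{1}{K} \\ K|x| & x^* < |x| < \frac{1}{K} \\ c\cosh\big(x\sqrt{2\rho}\big) & |x| < x^*. \end{cases}$$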
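Dividing the continuity condition (i) by the differentiability condition (ii) eliminates $c$ and leaves a single equation in $z = \sqrt{2\rho}\,x^*$, namely $z\tanh z = 1$. The sketch below solves it by bisection and recovers $x^*$ and $c$; the values of $\rho$ and $K$ are hypothetical, and the auxiliary variable `z_star` is introduced here only for the computation.

```python
import math

def solve_z(tol=1e-14):
    """Positive root of z*tanh(z) = 1 (the left side is increasing for z > 0)."""
    lo, hi = 1.0, 2.0            # 1*tanh(1) < 1 < 2*tanh(2)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid * math.tanh(mid) > 1.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

rho, K = 0.5, 0.5                # hypothetical values
root = math.sqrt(2.0 * rho)
z_star = solve_z()
x_star = z_star / root
c = K * x_star / math.cosh(z_star)

# Residuals of the pasting conditions (i) and (ii).
continuity_gap = 0.5 * c * (math.exp(root * x_star) + math.exp(-root * x_star)) - K * x_star
slope_gap = 0.5 * c * root * (math.exp(root * x_star) - math.exp(-root * x_star)) - K
in_case_1 = x_star < 1.0 / K
print(z_star, x_star, c, in_case_1)
```

Since $z^*$ does not depend on the data, the check $x^* < 1/K$ reduces to comparing $K$ with $\sqrt{2\rho}/z^*$.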

Fig. 14.10 The function ψ(x) in Case 1 for x ≥ 0

Since $\psi$ is not $C^2$ at $x = x^*$ we prove that $\psi$ is a viscosity solution for our optimal stopping problem.

(i) We first prove that $\psi$ is a viscosity subsolution. Let $u$ belong to $C^2(\mathbb{R})$ and $u(x) \ge \psi(x)$ for all $x \in \mathbb{R}$ and let $y_0 \in \mathbb{R}$ be such that $u(y_0) = \psi(y_0)$. Then $\psi$ is a viscosity subsolution if and only if
$$\max\big(L_0 u(y_0),\ G(y_0) - u(y_0)\big) \ge 0 \text{ for all such } u, y_0. \tag{14.12.3}$$
We need to check (14.12.3) only for $y_0 = x^*$. We have $u(x^*) = \psi(x^*) = G(x^*)$ i.e., $(G - u)(x^*) = 0$. Hence
$$\max\big(L_0 u(x^*),\ G(x^*) - u(x^*)\big) \ge (G - u)(x^*) = 0.$$

(ii) We prove that $\psi$ is a viscosity supersolution. Let $v$ belong to $C^2(\mathbb{R})$ and $v(x) \le \psi(x)$ for every $x \in \mathbb{R}$ and let $y_0 \in \mathbb{R}$ be such that $v(y_0) = \psi(y_0)$. Then $\psi$ is a viscosity supersolution if and only if
$$\max\big(L_0 v(y_0),\ G(y_0) - v(y_0)\big) \le 0 \text{ for all such } v, y_0.$$
We check it only for $x = x^*$. Then $G(x^*) = \psi(x^*) = v(x^*)$. Since $v \le \psi$, $x = x^*$ is a maximum point for $H := v - \psi$. We have $H(x^*) = 0$, $H'(x^*) = 0$, $H''(x^*) = v''(x^*) - \psi''(x^{*-}) \le 0$. Hence $L_0 v(x^*) \le L_0\psi(x^{*-}) \le 0$. Therefore $\psi$ is a viscosity supersolution. Since $\psi$ is both a viscosity supersolution and subsolution, $\psi$ is a viscosity solution.

Case 2: We consider now the case when $K \ge \dfrac{\sqrt{2\rho}}{z^*}$. In this case, the continuation region is given by
$$D = \Big\{-\frac{1}{K} < x < \frac{1}{K}\Big\}$$

Fig. 14.11 The function ψ(x) in Case 2

[…] $> 0$ this is impossible.

Next we prove that $\Phi(x) < \Phi_2(x)$ where $\Phi_2(x)$ is the solution of Exercise 9.1. It has the form $\Phi_2(s,x) = e^{-\rho s}\psi_2(x)$, with
$$\psi_2(x) = \begin{cases} \psi_0(x); & |x| \le \bar{x} \\ \psi_0(\hat{x}) + c; & |x| > \bar{x} \end{cases}$$
where

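$$\psi_0(x) = \frac{1}{\rho}x^2 + \frac{b}{\rho^2} - a\cosh(\gamma x)$$
is a solution of
$$-\rho\psi_0(x) + \tfrac{1}{2}\psi_0''(x) + \int_{\mathbb{R}}\big\{\psi_0(x+z) - \psi_0(x) - z\psi_0'(x)\big\}\,\nu(dz) + x^2 = 0.$$
Since clearly $\Phi(s,x) \le \Phi_2(s,x)$, it suffices to prove that $\Phi_2(s,x)$ does not satisfy (14.12.5) in viscosity sense. In particular, (14.12.5) implies that
$$L^u\Phi_2(s,x) + e^{-\rho s}(x^2 + \theta u^2) \ge 0 \text{ for all } u \in \mathbb{R}.$$
For $|x| < \bar{x}$ this reads
$$u\psi_0'(x) + \theta u^2 \ge 0 \text{ for all } u \in \mathbb{R},\ |x| < \bar{x}. \tag{14.12.6}$$
The function
$$h(u) := \Big(\frac{2x}{\rho} - a\gamma\sinh(\gamma x)\Big)u + \theta u^2; \quad u \in \mathbb{R}$$
is minimal when
$$u = \hat{u} = \frac{1}{2\theta}\Big(a\gamma\sinh(\gamma x) - \frac{2x}{\rho}\Big)$$
with corresponding minimum value
$$h(\hat{u}) = -\frac{1}{4\theta}\Big(a\gamma\sinh(\gamma x) - \frac{2x}{\rho}\Big)^2,$$
which is strictly negative whenever $\psi_0'(x) \ne 0$. Hence (14.12.6) cannot possibly hold and we conclude that $\Phi_2$ cannot be a viscosity solution of (14.12.5). Hence $\Phi \ne \Phi_2$ and hence $\Phi(x) < \Phi_2(x)$ for some $x$.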
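The minimisation of the quadratic $h$ is elementary, but a quick numerical check makes the sign argument concrete. The sketch below picks hypothetical values for $\rho$, $a$, $\gamma$, $\theta$ and a point $x$, evaluates the closed-form minimiser and minimum, and compares them against a brute-force minimum over a grid of controls.

```python
import math

# Hypothetical values, for illustration only.
rho, a, gamma, theta, x = 1.0, 0.3, 1.2, 0.5, 0.4

# p is the coefficient of u in h, i.e. psi_0'(x) = 2x/rho - a*gamma*sinh(gamma*x).
p = 2.0 * x / rho - a * gamma * math.sinh(gamma * x)

def h(u):
    return p * u + theta * u * u

u_hat = (a * gamma * math.sinh(gamma * x) - 2.0 * x / rho) / (2.0 * theta)
h_min = -(a * gamma * math.sinh(gamma * x) - 2.0 * x / rho) ** 2 / (4.0 * theta)

# Brute-force minimum over u in [-5, 5] with step 0.001.
grid_min = min(h(-5.0 + 0.001 * i) for i in range(10001))
print(u_hat, h(u_hat), h_min, grid_min)
```

Whenever `p` is nonzero the minimum is negative, so the inequality required for all $u$ fails at $u = \hat{u}$.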

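14.13 Exercises of Chap. 13

Exercise 13.6 (a) Here $A\phi(x) = (\alpha x + u)\phi'(x) + \frac{1}{2}\sigma^2\phi''(x)$ and hence
$$A^*\psi(x) = -\frac{d}{dx}\big((\alpha x + u)\psi(x)\big) + \frac{1}{2}\sigma^2\psi''(x) = -\alpha\psi(x) - (\alpha x + u)\psi'(x) + \frac{1}{2}\sigma^2\psi''(x). \tag{14.13.1}$$
Therefore the corresponding complete observation controlled SPDE is
$$\begin{cases} dy(t,x) = \Big[-\alpha y(t,x) - (\alpha x + u(t))\dfrac{\partial y}{\partial x}(t,x) + \dfrac{1}{2}\sigma^2\dfrac{\partial^2 y}{\partial x^2}(t,x)\Big]dt + h(x)y(t,x)\,d\zeta(t); & t > 0 \\ y(0,x) = F(x) \end{cases} \tag{14.13.2}$$
with performance criterion
$$J(u) = E_Q\Big[\int_0^T\!\!\int_{\mathbb{R}} (x^2 + u^2(t))\,y(t,x)\,dx\,dt\Big]. \tag{14.13.3}$$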

(b) Here $A\phi(x) = \alpha x u\phi'(x) + \frac{1}{2}\beta^2 x^2 u^2\phi''(x)$ and hence
$$A^*\psi(x) = -\frac{d}{dx}\big(\alpha u x\psi(x)\big) + \frac{1}{2}\frac{d^2}{dx^2}\big(\beta^2 u^2 x^2\psi(x)\big) = (\beta^2 u^2 - \alpha u)\psi(x) + (2\beta^2 u^2 x - \alpha u x)\psi'(x) + \frac{1}{2}\beta^2 u^2 x^2\psi''(x). \tag{14.13.4}$$
Hence the corresponding controlled complete observation SPDE is
$$\begin{cases} dy(t,x) = \Big[(\beta^2 u^2(t) - \alpha u(t))y(t,x) + (2\beta^2 u^2(t)x - \alpha u(t)x)\dfrac{\partial}{\partial x}y(t,x) + \dfrac{1}{2}\beta^2 u^2(t)x^2\dfrac{\partial^2}{\partial x^2}y(t,x)\Big]dt + h(x)y(t,x)\,d\zeta(t) \\ y(0,x) = F(x), \end{cases} \tag{14.13.5}$$
with performance criterion
$$J(u) = E_Q\Big[\int_{\mathbb{R}} x^{\gamma}y(T,x)\,dx\Big]. \tag{14.13.6}$$
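The expansion of the adjoint operator in part (b) is easy to get wrong by a factor, so a numerical cross-check can be useful. The sketch below assumes a fixed control value `u`, hypothetical constants `alpha` and `beta`, and the test function $\psi(x) = e^{-x^2}$ (all chosen for illustration). It compares the divergence form $-\frac{d}{dx}(\alpha u x\psi) + \frac{1}{2}\frac{d^2}{dx^2}(\beta^2u^2x^2\psi)$, evaluated with central differences, against the expanded form in (14.13.4).

```python
import math

alpha, beta, u = 0.7, 0.4, 1.5      # hypothetical constants; u is held fixed

def psi(x):
    return math.exp(-x * x)

def dpsi(x):
    return -2.0 * x * math.exp(-x * x)

def d2psi(x):
    return (4.0 * x * x - 2.0) * math.exp(-x * x)

def adjoint_by_differences(x, step=1e-3):
    """-d/dx(alpha*u*x*psi) + 0.5*d^2/dx^2(beta^2*u^2*x^2*psi) via central differences."""
    g1 = lambda s: alpha * u * s * psi(s)
    g2 = lambda s: beta**2 * u**2 * s**2 * psi(s)
    first = (g1(x + step) - g1(x - step)) / (2.0 * step)
    second = (g2(x + step) - 2.0 * g2(x) + g2(x - step)) / (step * step)
    return -first + 0.5 * second

def adjoint_expanded(x):
    """Right-hand side of (14.13.4)."""
    b2u2 = beta**2 * u**2
    return ((b2u2 - alpha * u) * psi(x)
            + (2.0 * b2u2 * x - alpha * u * x) * dpsi(x)
            + 0.5 * b2u2 * x * x * d2psi(x))

points = [0.2 * i for i in range(1, 11)]
max_gap = max(abs(adjoint_by_differences(x) - adjoint_expanded(x)) for x in points)
print("largest discrepancy:", max_gap)
```

The remaining discrepancy is of the order of the finite-difference truncation error.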

Theory Relat. Fields 98, 339–359 (1994). J. Ma, H. Yin and J. Zhang: On non-Markovian forward-backward SDEs and backward stochastic PDEs. Stoch. Proc. and their Appl. 122 (2012), 3980–4004. J. Ma and J. Yong: Forward-Backward Stochastic Differential Equations and their Applications. Springer LNM 1702 (1999). B. Øksendal: Optimal control of stochastic partial differential equations. Stochastic Analysis and Applications 23 (2005), 165–179. B. Øksendal, F. Proske and T. Zhang: Backward stochastic partial differential equations with jumps and application to optimal control of random jump fields. Stochastics 77 (2005), 381–399. B. Øksendal: Stochastic Differential Equations. 6th Edition. Springer, Berlin Heidelberg New York 2013. B. Øksendal: Stochastic control problems where small intervention costs have big effects. Appl. Math. Opt. 40 (1999), 355–375. B. Øksendal: An Introduction to Malliavin Calculus with Applications to Economics. NHH Lecture Notes 1996. B. Øksendal: Optimal stopping with delayed information. Stoch. Dynam. 5 (2005), 271–280. B. Øksendal: Optimal control of stochastic partial differential equations. Stoch. Anal. Appl. 23 (2005), 165–179. Y.Y. Okur: An extension of the Clark–Ocone formula under change of measure for Lévy processes. Manuscript 2008. PhD Thesis, University of Oslo 2009. B. Øksendal and K. Reikvam: Viscosity solutions of optimal stopping problems. Stoch. Stoch. Rep. 62 (1998), 285–301. B. Øksendal and A. Sulem: Applied Stochastic Control of Jump Diffusions. Second Edition. Springer 2007

References [ØS2]

[ØSU]

[ØS] [ØS8]

[ØS3]

[ØS4] [ØS5]

[ØS6] [ØS7] [ØUZ] [ØSZ]

[ØSZ1]

[ØSZ2]

[ØSZ3]

[ØZ] [Pa1] [Pa2] [Pa3]

[PP] [Pe] [PeS]

425 B. Øksendal and A. Sulem: Risk minimization in financial markets modeled by Itô–Lévy processes. Afrika Matematika (2014), https://doi.org/10.1007/s13370-01402489-9. B. Øksendal, L. Sandal and J. Ubøe: Stochastic Stackelberg equilibria and applications to time-dependent investor models. J. Economic Dynamics and Control 2013, https:// doi.org/10.1016/j.jedc2013.02.010. B. Øksendal and A. Sulem: Optimal consumption and portfolio with both fixed and proportional transaction costs. SIAM J. Control Opt. 40 (2002), 1765–1790. B. Øksendal and A. Sulem: A maximum principle for optimal control of stochastic systems with delay, with applications to finance : In J.M. Menaldi, E. Rofman and A. Sulem (editors): Optimal Control and Partial Differential Equations – Innovations and Applications. IOS Press, Amsterdam, 2000. B. Øksendal and A. Sulem: Maximum principles for optimal control of forwardbackward stochastic differential equations with jumps. SIAM J. Control Optimization 48(5), 2845–2976 (2009). B. Øksendal and A. Sulem: Singular control and optimal stopping with partial information of jump diffusions. SIAM J. Control Optimization 50 (2012), 2254–2287. B. Øksendal and A. Sulem: Forward-backward stochastic differential games and stochastic control under model uncertainty. J. Optim. Theory Appl. Volume 161, issue 1(2014), 22–55. B. Øksendal and A. Sulem: Portfolio optimization under model uncertainty and BSDE games. Quantitative Finance 11(11) (2011), 1665–1674. B. Øksendal and A. Sulem: Risk minimization in financial markets modeled by Itô– Lévy processes. Afrika Mathematika, 2015, 26:939–979. B. Øksendal, J. Ubøe and T. Zhang: Nonrobustness of some impulse control problems with respect to intervention costs. Stoch. Anal. Appl. 20 (2002), 999–1026. B. Øksendal, A. Sulem, and T. Zhang: Optimal control of stochastic delay equations and time-advanced backward stochastic differential equations. Adv. Appl. Prob., 43(2) 572–596, 2011. B. Øksendal, A. Sulem and T. 
Zhang: A stochastic HJB equation for optimal control of forward-backward SDEs. In A. Veraart, M. Podolskij, R. Stelzer and S. Thorbjonsen (editors): The Fascination of Probability, Statistics and their Applications. In honor of Ole E. Barndorff-Nielsen, Springer (2016), pp. 435–446. B. Øksendal, A. Sulem and T. Zhang: A comparison theorem for backward SPDEs with jumps In Z.-Q. Chen, N. Jacob, M. Takeda and T. Uemura (editors): Festschrift Masatoshi Fukushima, World Scientific 2015, pp. 479–487. B. Øksendal, A. Sulem and T. Zhang: Optimal control of stochastic delay equations and time-advanced backward stochastic differential equations, Adv. Appl. Prob. 43, 572–596 (2011). B. Øksendal and T. Zhang: The Itô–Ventzell formula and forward stochastic differential equations driven by Poisson random measures. Osaka J. Math. 44 (2007), 207–230. E. Pardoux: Stochastic partial differential equations and filtering of diffusion processes. Stochastics 3 (1979), 127–167. E. Pardoux: Filtrage non linéaire et équations aux dérivées partielles stochastiques associées. Ecole d’Eté de Probabilités de Saint-Flour 1989. E. Pardoux: BSDEs, weak convergence and homogenizations of semilinear PDEs. In F.H. Clark and R.J. Stern, editors, Nonlinear Analysis, Differential Equations and Control, pages 503–549. Kluwer Academic, Dordrecht, 1999. E. Pardoux and S. Peng: Adapted solutions of backward stochastic differential equations. System and Control Letters 14, 55–61 (1990). S. Peng: Stochastic Hamilton–Jacobi–Bellman equations. SIAM J. Control Optim. 30, 284–304 (1992). S. Peng and Y. Shi: Infinite horizon forward-backward stochastic differential equations. Stoch. Proc. and Their Appl., 85:75–92, 2000.

426 [PesZab]

[PY] [Ph1] [Ph2] [P1] [P2] [P3] [PR] [PBGM] [P] [PK] [Pu]

[PrR] [PS]

[Q] [QS] [R] [Roy] [Rø] [RV] [RV1] [RV2] [S] [ST]

[Sh] [Sh1] [Sc]

References S. Peszat and J. Zabczyk: Stochastic Partial Differential Equations with Lévy Noise, Encyclopedia of Mathematics and its Applications, Vol. 113, Cambridge University Press, Cambridge (UK), 2008. S. Peng and Z. Yang: Anticipated backward stochastic differential equations, The Annals of Probability 37,3, pp. 877–902, 2009. H. Pham: Optimal stopping of controlled jump diffusion processes: a viscosity solution approach. J. Math. Syst. Estimation Control 8 (1998), 1–27. H. Pham: Continuous-time Stochastic Control and Optimization with Financial Applications. Springer 2009. R.S. Pindyck: Irreversible investments, capacity choice, and the value of the firm. American Economic Review 78 (1988), 969–985. R.S. Pindyck: Irreversibility and the explanation of investment behaviour. In D. Lund and B. Øksendal (editors): Stochastic Models and Option Values. North-Holland 1991. R.S. Pindyck: Irreversibility, uncertainty and investment. J. Economic Literature 29 (1991), 1110–1148. E. Platen and R. Rebolledo: Principles for modelling financial markets. J. Appl. Probab. 33 (1996), 601–630. L.S. Pontryagin, V.G. Boltyanskii, R.V. Gamkrelidze and E.F. Mishchenko: The Mathematical Theory of Optimal Processes. Wiley, New York 1962. P. Protter: Stochastic Integration and Differential Equations. 2nd Edition. Springer, Berlin Heidelberg New York 2003. I. Pikovsky and I. Karatzas: Anticipative portfolio optimization. Adv. Appl. Probab. 28 (1996), 1095–1122. M.L. Puterman: Markov Decision Processes: Discrete Stochastics Dynamic Programming. Probability and Mathematical Statistics: applied probability and statistics section. Wiley 1994. C.I. Prévôt and M. Roeckner: A concise course on stochastic partial differential equations. Lecture Notes in Mathematics 1905, Springer 2007. P. Protter and K. Shimbo: No arbitrage and general semimartingales, Markov processes and related topics. In Festschrift for Thomas G. Kurz, IMS Lecture Notes – Monograph Series 4(2008), 267–283. M.-C. 
Quenez: Backward Stochastic Differential Equations, Encyclopedia of Quantitative Finance, 2010, 134–145. M.-C. Quenez and A. Sulem: BSDEs with jumps, optimization and applications to dynamic risk measures. Stoch. Proc. and their Appl. 123 (2013), 3328–3357 R.T. Rockafellar: Convex Analysis. Princeton University Press 1997. M. Royer: Backward stochastic differential equations with jumps and related non-linear expectations. Stochastic Processes and Their Applications, 116:1358–1376, 2006. E. Røse: Optimal control for mean-field SDEs with jumps and delay. Manuscript, University of Oslo August 2013. F. Russo and P. Vallois: Forward, backward and symmetric stochastic integration. Probab. Theor. Rel. Fields 93 (1993), 403–421. F. Russo and P. Vallois: The generalized covariation process and Itô formula. Stoch. Proc. Appl., 59(4):81–104, 1995. F. Russo and P. Vallois: Stochastic calculus with respect to continuous finite quadratic variation processes. Stoch. Stoch. Rep., 70(4):1–40, 2000. K. Sato: Lévy Processes and Infinitely Divisible Distributions. Cambridge University Press 1999. C. Schwab and R.A. Todor: Convergence rates for sparse chaos approximations of elliptic problems with stochastic coefficients. IMA J. Numer. Anal. 27(2) (2007), 232– 261. A. Shiryaev: Optimal Stopping Rules. Springer, Berlin Heidelberg New York 1978. A. Shiryaev: Essentials of Stochastic Finance. World Scientific 1999. W. Schoutens: Lévy Processes in Finance. Wiley, New York 2003.

References [SeSy] [SS] [S1] [S2] [Si]

[TL] [V] [W] [X] [YZ] [Yi] [Z] [Z1] [ZRW]

427 A. Seierstad and K. Sydsæter: Optimal Control Theory with Economic Applications. North-Holland, Amsterdam 1987. S.E. Shreve and H.M. Soner: Optimal investment and consumption with transaction costs. Ann. Appl. Probab. 4 (1994), 609–692. A. Sulem: A solvable one-dimensional model of a diffusion inventory system. Math. Oper. Res. 11 (1986), 125–133. A. Sulem: Explicit solution of a two-dimensional deterministic inventory problem. Math. Oper. Res. 11 (1986), 134–146. R. Situ: On solutions of backward stochastic differential equations with jumps and with non-Lipschitzian coefficients in Hilbert spaces and stochastic control. Statistics and Probability Letters, 60:279–288, 2002. S.H. Tang and X. Li: Necessary conditions for optimal control of stochastic systems with random jumps, SIAM J. Control Optim. 32 (1994), 1447–1475. H. Varner: Some Impulse Control Problems with Applications to Economics. Canadian Scientist Thesis, University of Oslo, 1997. Y. Willassen: The stochastic rotation problem: a generalization of Faustmann’s formula to stochastic forest growth. J. Econ. Dynam. Control 22 (1998), 573–596. J. Xiong: An introduction to Stochastic Filtering Theory. Oxford University Press 2008. J. Yong and X.Y. Zhou: Stochastic Controls. Springer, Berlin Heidelberg New York 1999. J. Yin: On solutions of a class of infinite horizon FBSDEs. Statistics and Probability Letters, 78:2412–2419, 2008. L. Zhang: The relaxed stochastic maximum principle in the mean field singular controls. arXiv: 1202.4129v5 [MATH.OC] 1 NOV 2012. T. Zhang: White noise driven SPDEs with reflection: strong Feller properties and Harnack inequalities. Potential Analysis 33 2 , 137–151 (2010). Q. Zhou, Y. Ren and W. Wu: On solutions to backward stochastic partial differential equations for Lévy processes. Journal of Computational and Applied Mathematics 235 (2011), 5411–5421.

Notation and Symbols

$\mathbb{R}^n$    n-dimensional Euclidean space
$\mathbb{R}^+$    the nonnegative real numbers
$\mathbb{R}^{n \times m}$    the $n \times m$ matrices (real entries)
$\mathbb{Z}$    the integers
$\mathbb{N}$    the natural numbers
$\mathcal{B}_0$    the family of Borel sets $U \subset \mathbb{R}$ whose closure $\bar{U}$ does not contain 0
$\mathbb{R}^n \simeq \mathbb{R}^{n \times 1}$    i.e., vectors in $\mathbb{R}^n$ are regarded as $n \times 1$ matrices
$I_n$    the $n \times n$ identity matrix
$A^T$    the transpose of the matrix $A$
$\mathcal{P}(\mathbb{R}^k)$    the set of functions $f : \mathbb{R}^k \to \mathbb{R}$ of at most polynomial growth, i.e., there exist constants $C$, $m$ such that $|f(y)| \le C(1 + |y|^m)$ for all $y \in \mathbb{R}^k$
$C(U, V)$    the continuous functions from $U$ into $V$
$C(U)$    the same as $C(U, \mathbb{R})$
$C_0(U)$    the functions in $C(U)$ with compact support
$C^k = C^k(U)$    the functions in $C(U, \mathbb{R})$ with continuous derivatives up to order $k$
$C_0^k = C_0^k(U)$    the functions in $C^k(U)$ with compact support in $U$
$C^{k+\alpha}$    the functions in $C^k$ whose $k$th derivatives are Lipschitz continuous with exponent $\alpha$
$C^{1,2}(\mathbb{R} \times \mathbb{R}^n)$    the functions $f(t, x) : \mathbb{R} \times \mathbb{R}^n \to \mathbb{R}$ which are $C^1$ w.r.t. $t \in \mathbb{R}$ and $C^2$ w.r.t. $x \in \mathbb{R}^n$
$C_b(U)$    the bounded continuous functions on $U$
$|x|^2 = x^2$    $\sum_{i=1}^{n} x_i^2$ if $x = (x_1, \ldots, x_n)$
$x \cdot y$    the dot product $\sum_{i=1}^{n} x_i y_i$ if $x = (x_1, \ldots, x_n)$, $y = (y_1, \ldots, y_n)$
$x^+$    $\max(x, 0)$ if $x \in \mathbb{R}$
$x^-$    $\max(-x, 0)$ if $x \in \mathbb{R}$
$\operatorname{sign} x$    $1$ if $x \ge 0$, $-1$ if $x < 0$


$\sinh(x)$    hyperbolic sine of $x$, $= \frac{e^{x} - e^{-x}}{2}$
$\cosh(x)$    hyperbolic cosine of $x$, $= \frac{e^{x} + e^{-x}}{2}$
$\operatorname{tgh}(x)$    $\frac{\sinh(x)}{\cosh(x)}$
$s \wedge t$    the minimum of $s$ and $t$ ($= \min(s, t)$)
$s \vee t$    the maximum of $s$ and $t$ ($= \max(s, t)$)
$\delta_x$    the unit point mass at $x$
$\operatorname{argmax}_{u \in U} f(u)$    $\{u^* \in U;\ f(u^*) \ge f(u),\ \forall u \in U\}$
$:=$    equal to by definition
$\underline{\lim}$, $\overline{\lim}$    the same as $\liminf$, $\limsup$
$\operatorname{supp} f$    the support of the function $f$
$\nabla f$    the same as $Df = \left[\frac{\partial f}{\partial x_i}\right]_{i=1}^{n}$
$\nabla_x F$    the Fréchet derivative of $F$ at $x$ (see the beginning of Section 6.2)
$\partial G$    the boundary of the set $G$
$\bar{G}$    the closure of the set $G$
$G^0$    the interior of the set $G$
$\chi_G$    the indicator function of the set $G$; $\chi_G(x) = 1$ if $x \in G$, $\chi_G(x) = 0$ if $x \notin G$
$(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, P)$    filtered probability space
$\Delta \eta_t$    the jump of $\eta_t$ defined by $\Delta \eta_t = \eta_t - \eta_{t^-}$
$P$    the probability law of $\eta_t$
$N(t, U)$    see (1.1.2)
$\nu(U)$    $E[N(1, U)]$, see (1.1.3)
$\|\nu\|$    the norm (total mass) of the measure $\nu$, i.e., $\nu(\mathbb{R})$
$\tilde{N}(dt, dz)$    see (1.1.7)
$B(t)$    Brownian motion
$P \ll Q$    the measure $P$ is absolutely continuous w.r.t. the measure $Q$
$P \sim Q$    $P$ is equivalent to $Q$, i.e., $P \ll Q$ and $Q \ll P$
$E_Q$    the expectation w.r.t. the measure $Q$
$E$    the expectation w.r.t. a measure which is clear from the context (usually $P$)
$E[Y] = E_\mu[Y] = \int Y \, d\mu$    the expectation of the random variable $Y$ w.r.t. the measure $\mu$
$[X, Y]$    quadratic covariation of $X$ and $Y$, see Definition 1.28
$\mathcal{T}$    set of all stopping times $\le \tau_S$, see (3.1.1)
$\tau_G$    the first exit time from the set $G$ of a process $X_t$: $\tau_G = \inf\{t > 0;\ X_t \notin G\}$
$\Delta_N Y(t)$    the jump of $Y$ caused by the jump of $N$, see (8.2.2)
$\check{Y}(t^-)$    $Y(t^-) + \Delta_N Y(t)$ (see (9.1.5))
$\Delta_\xi Y(t)$    the jump of $Y$ caused by the singular control $\xi$
$\Delta_\xi \phi$    see (8.2.3)

$\xi^c(t)$    continuous part of $\xi(t)$, i.e., the process obtained by removing the jumps of $\xi(t)$
$\pi/K$    the restriction of the measure $\pi$ to the set $K$
$A = A_Y$    the generator of jump diffusion $Y$
$\mathcal{M}$    intervention operator, see Definition 9.1
VI    variational inequality
QVI    quasivariational inequality
HJB    Hamilton–Jacobi–Bellman equation
HJBVI    Hamilton–Jacobi–Bellman variational inequality
HJBQVI    Hamilton–Jacobi–Bellman quasivariational inequality
SDE    stochastic differential equation
càdlàg    right continuous with left limits
càglàd    left continuous with right limits
i.i.d.    independent identically distributed
iff    if and only if
a.a., a.e., a.s.    almost all, almost everywhere, almost surely
w.r.t.    with respect to
s.t.    such that

Index

A
Adjoint equation, 109
Adjoint process, 102, 167, 202, 316
Admissible, 28, 48, 50, 93, 96, 112, 228, 315
Admissible combined control, 273
Admissible control, 226, 313
Admissible impulse controls, 240
American option, 43
American put option, 43
Anticipative, 136
Approximation theorem, 56
Arbitrage, 28, 47, 48
Arrow condition, 103, 168, 173, 319

B
Backward stochastic differential equation (BSDE), 75
Bankruptcy time, 55, 93
Bequest rate, 240, 315
Burgers equation, 407

C
Càdlàg, 1
Càglàd, 5
Cash flow with delay, 148
Claim (or a T-claim), 33
Combined control, 273
Combined impulse linear regulator problem, 282
Combined optimal stopping and stochastic control, 211, 213
Combined stochastic control and impulse control problem, 274, 296
Comparison principle, 127
Comparison theorem, 82, 300
Compensated Poisson random measure, 4
Complete, 34
Compound Poisson process, 3
Conditional maximum principle, 103, 167, 173
Conditional maximum property, 187, 203
Continuation region, 57, 214, 242
Control process, 93
Controlled jump diffusion, 93
Controls which do not depend on x, 329, 341
Convex risk measure, 77, 85

D
Delay, 134
Delayed effect, 66
Delayed information, 65
Delayed optimal stopping, 71, 73
Delayed stopping times, 65
Diagonally dominant matrix, 307
Discrete maximum principle, 309
Dual representation, 88
Duncan–Mortensen–Zakai equation, 334
Dynamic programming, 93, 108
Dynamic programming principle, 258, 259, 287, 288, 297, 298
Dynamic risk measure, 87
Dynkin formula, 12, 13

E
Entropic risk minimization, 160, 162, 164
Entropy, 133
Equivalent local martingale measure, 29
Equivalent measure, 14
European option, 40
Exchange rate, 276
Exponential utility, 155

F
FBSDE game, 167, 179
Finite difference approximation, 304
First exit time from a ball, 24
First fundamental theorem of asset pricing, 30, 351
Fixed and proportional transaction costs, 277, 310
Fixed transaction cost, 244
Forward-backward stochastic differential equation (FBSDE), 124

G
General mean-field non-zero sum game, 184
General (non-zero) stochastic differential game, 165
Geometric Lévy martingales, 23
Geometric Lévy process, 8, 24
Girsanov theorem, 14, 16, 18, 21
Graph of the geometric Lévy process, 24

H
Hamiltonian, 102, 109, 119, 135, 166, 185, 193, 316
Hamilton–Jacobi–Bellman (HJB) theorem, 94
Hamilton–Jacobi–Bellman–Isaacs (HJBI) equation, 158
Heat operator, 324
Hida–Malliavin derivative, 53
High contact conditions, 61, 65, 214, 407
HJBQVI verification theorem, 274
HJB-variational inequality, 214
Howard algorithm, 309

I
Impulse control, 239, 241, 273
Impulses, 240
Infinite-dimensional HJB equation, 314
Infinite horizon, 155
Infinite horizon necessary maximum principle, 122
Infinite horizon sufficient maximum principle, 117
Integration by parts, 25
Integro-variational inequality, 57, 229
Integrodifferential operator, 56
Intensity, 3
Intervention control, 227
Intervention operator, 241, 257
Intervention time, 240
Iterated optimal stopping problems, 255, 281
Iterative method, 281
Iterative procedure, 257
Itô formula, 7, 9
Itô–Lévy isometry, 10
Itô–Lévy processes, 6
Itô representation theorem, 324

J
Jump diffusion, 12
Jump (of a Lévy process), 1

L
Laplacian operator, 313
Lévy decomposition, 4
Lévy diffusion, 11
Lévy–Khintchine formula, 4
Lévy martingale, 4
Lévy measure, 2
Lévy process, 1
Lévy stochastic differential equations, 11
Lévy type Black–Scholes market, 95
Lipschitz surface, 56
Local time, 231
Logarithmic utility, 155
Lower hedging price, 40

M
Malliavin calculus, 52, 82
Malliavin differentiable, 52
Markov control, 94
Markovian, 213
Maximum principle, 314, 316, 317
Mean-field game, 182, 190
Mean-reverting Lévy–Ornstein–Uhlenbeck process, 23, 24
Mean-variance portfolio selection, 111, 112
Merton line, 97
Model uncertainty, 183, 192, 196, 206, 208, 210, 380

N
Nash equilibrium, 166, 184
Necessary maximum principle, 105, 106, 140, 141, 326, 327
Noisy observation, 336
Non-intervention region, 229, 235
Nonlinear optimal stopping problem, 264
Normalized stock price, 47
No-transaction cone, 279

O
Observation process, 332, 339, 340
Optimal consumption, 148, 154
Optimal consumption and portfolio, 95, 277, 303, 310
Optimal control, 94, 314
Optimal dividend policy, 235
Optimal forest management, 253, 271
Optimal harvesting, 182, 195, 205, 236
Optimal impulse control, 242
Optimal insider consumption, 196
Optimal insider portfolio, 208
Optimal intervention strategy, 235
Optimal irreversible investment, 183
Optimal portfolio, 52, 153, 206, 210, 336, 380
Optimal resource extraction, 72, 211
Optimal stopping, 46, 55
Optimal stopping time, 56, 66, 71
Optimal stream of dividends, 244, 253, 271
Optimal time to sell, 59

P
Partial information control, 101
Partially observed linear-quadratic control problem, 339
Partially observed optimal portfolio problem, 340
Partial (noisy) observation control, 313
Partial (noisy) observation optimal control, 332
Performance, 56, 93, 96, 101, 183, 185, 213, 227, 274, 315, 339
Poisson process, 2
Poisson random measure, 1
Policy iteration algorithm, 307, 309
Polynomial growth, 258
Portfolio, 28, 47, 112
Power utility, 155
Predictable processes, 11
Profit rate, 240, 315
Proportional transaction cost, 225, 235, 244, 303

Q
Quadratic covariation, 15, 26
Quasi-integrovariational inequality, 241
Quasivariational inequality, 300

R
Random jump field, 314
Reaction-diffusion equation, 313
Recursive utility, 76, 80
Reflected process, 231
Replicable (or hedgable or attainable), 33
Replicating portfolio, 75, 81
Resource extraction, 69
Risk free asset, 27
Risk measure, 75
Risk minimization, 179
Risk minimizing portfolio, 131, 132
Risky asset, 27

S
Saddle point, 170, 171, 184, 380
Second fundamental theorem of asset pricing, 21
Self-financing, 28, 48, 112
Shift operator, 68
Signal, 339, 340
Signal process, 332
Single player case, 204
Singular control, 225, 227, 229
Skorohod problem, 231
Smooth fit principle, 214
Smooth pasting, 65
Snell envelope, 43
Solvency set, 55, 93, 226, 227
Stochastic control, 93, 94
Stochastic delay equation, 134
Stochastic differential game, 157, 160, 380
Stochastic HJB, 124, 128
Stochastic linear regulator problem, 152, 222
Stochastic maximum principle, 101
Stochastic partial differential equation, 313
Stopping time, 212
Strengthened maximum principle, 319
Subfiltration, 101
Sufficient maximum principle, 103, 137
Supergradient, 320
Superjets, 299

T
Time-advanced BSDE, 136, 144
Time homogeneous, 12
Transaction cost, 233, 244, 253, 271
Transversality condition, 117

U
Uniqueness of viscosity solutions, 289, 303
Unnormalized conditional density, 334
Upper hedging price, 41
Utility maximization, 109, 129

V
Value, 316
Value function, 56, 94, 296
Variational inequality, 286
Verification theorem, 94, 214
Viscosity solution, 285, 287, 295
Viscosity solution of HJBQVI, 293, 307
Viscosity subsolution, 286, 291, 295, 300, 410, 411
Viscosity supersolution, 286, 292, 295, 300, 410, 411

W
Weak (variational), 315
Wealth process, 47, 96, 112

Z
Zero-sum forward-backward game, 172
Zero-sum game, 158, 170